Abstract This specification defines a mechanism by which user agents may verify that a fetched resource has been delivered without unexpected manipulation.

1. Introduction This section is non-normative. Sites and applications on the web are rarely composed of resources from only a single origin. For example, authors pull scripts and styles from a wide variety of services and content delivery networks, and must trust that the delivered representation is, in fact, what they expected to load. If an attacker can trick a user into downloading content from a hostile server (via DNS poisoning, or other such means), the author has no recourse. Likewise, an attacker who can replace the file on the Content Delivery Network (CDN) server has the ability to inject arbitrary content. Delivering resources over a secure channel mitigates some of this risk: with TLS, HSTS, and pinned public keys, a user agent can be fairly certain that it is indeed speaking with the server it believes it’s talking to. These mechanisms, however, authenticate only the server, not the content. An attacker (or administrator) with access to the server can manipulate content with impunity. Ideally, authors would not only be able to pin the keys of a server, but also pin the content, ensuring that an exact representation of a resource, and only that representation, loads and executes. This document specifies such a validation scheme, extending two HTML elements with an integrity attribute that contains a cryptographic hash of the representation of the resource the author expects to load. For instance, an author may wish to load some framework from a shared server rather than hosting it on their own origin. Specifying that the expected SHA-384 hash of https://example.com/example-framework.js is Li9vy3DqF8tnTXuiaAJuML3ky+er10rcgNR/VqsVpcw+ThHmYcwiB1pbOxEbzJr7 means that the user agent can verify that the data it loads from that URL matches that expected hash before executing the JavaScript it contains. This integrity verification significantly reduces the risk that an attacker can substitute malicious content. This example can be communicated to a user agent by adding the hash to a script element, like so: Example 1 < script src = "https://example.com/example-framework.js" integrity = "sha384-Li9vy3DqF8tnTXuiaAJuML3ky+er10rcgNR/VqsVpcw+ThHmYcwiB1pbOxEbzJr7" crossorigin = "anonymous" > </ script > Scripts, of course, are not the only response type which would benefit from integrity validation. The scheme specified here also applies to link and future versions of this specification are likely to expand this coverage. 1.1 Goals Compromise of a third-party service should not automatically mean compromise of every site which includes its scripts. Content authors will have a mechanism by which they can specify expectations for content they load, meaning for example that they could load a specific script, and not any script that happens to have a particular URL. The verification mechanism should have error-reporting functionality which would inform the author that an invalid response was received. 1.2 Use Cases/Examples 1.2.1 Resource Integrity An author wishes to use a content delivery network to improve performance for globally-distributed users. It is important, however, to ensure that the CDN’s servers deliver only the code the author expects them to deliver. To mitigate the risk that a CDN compromise (or unexpectedly malicious behavior) would change that site in unfortunate ways, the following integrity metadata is added to the link element included on the page: Example 2 < link rel = "stylesheet" href = "https://site53.example.net/style.css" integrity = "sha384-+/M6kredJcxdsqkczBUjMLvqyHb1K/JThDXWsBVxMEeZHEaMKEOEct339VItX1zB" crossorigin = "anonymous" >

An author wants to include JavaScript provided by a third-party analytics service. To ensure that only the code that has been carefully reviewed is executed, the author generates integrity metadata for the script, and adds it to the script element: Example 3 < script src = "https://analytics-r-us.example.com/v1.0/include.js" integrity = "sha384-MBO5IDfYaE6c6Aao94oZrIOiC6CGiSN2n4QUbHNPhzk5Xhm0djZLQqTpL0HzTUxk" crossorigin = "anonymous" > </ script >

A user agent wishes to ensure that JavaScript code running in high-privilege HTML contexts (for example, a browser’s New Tab page) aren’t manipulated before display. Integrity metadata mitigates the risk that altered JavaScript will run in these pages’ high-privilege contexts.

2. Conformance As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative. The key words MAY, MUST, and SHOULD are to be interpreted as described in [ RFC2119 ]. Conformance requirements phrased as algorithms or specific steps can be implemented in any manner, so long as the end result is equivalent. In particular, the algorithms defined in this specification are intended to be easy to understand and are not intended to be performant. Implementers are encouraged to optimize. 2.1 Key Concepts and Terminology This section defines several terms used throughout the document. The term digest refers to the base64-encoded result of executing a cryptographic hash function on an arbitrary block of data. The term origin is defined in the Origin specification. [ RFC6454 ] The representation data and content encoding of a resource are defined by RFC7231, section 3. [ RFC7231 ] A base64 encoding is defined in RFC 4648, section 4. [ RFC4648 ] The SHA-256 , SHA-384 , and SHA-512 are part of the SHA-2 set of cryptographic hash functions defined by the NIST in “FIPS PUB 180-4: Secure Hash Standard (SHS)”. 2.2 Grammatical Concepts The Augmented Backus-Naur Form (ABNF) notation used in this document is specified in RFC5234. [ ABNF ] Appendix B.1 of [ ABNF ] defines VCHAR (printing characters). WSP (white space) characters are defined in Section 2.4.1 Common parser idioms of the HTML 5 specification as White_Space characters .

4. Proxies Optimizing proxies and other intermediate servers which modify the responses MUST ensure that the digest associated with those responses stays in sync with the new content. One option is to ensure that the integrity metadata associated with resources is updated. Another would be simply to deliver only the canonical version of resources for which a page author has requested integrity verification. To help inform intermediate servers, those serving the resources SHOULD send along with the resource a Cache-Control header with a value of no-transform .

5. Security Considerations This section is non-normative. 5.1 Non-secure contexts remain non-secure Integrity metadata delivered by a context that is not a Secure Context, such as an HTTP page, only protects an origin against a compromise of the server where an external resources is hosted. Network attackers can alter the digest in-flight (or remove it entirely, or do absolutely anything else to the document), just as they could alter the response the hash is meant to validate. Thus, it is recommended that authors deliver integrity metadata only to a Secure Context. See also securing the web. 5.2 Hash collision attacks Digests are only as strong as the hash function used to generate them. It is recommended that user agents refuse to support known-weak hashing functions and limit supported algorithms to those known to be collision resistant. Examples of hashing functions that are not recommended include MD5 and SHA-1. At the time of writing, SHA-384 is a good baseline. Moreover, it is recommended that user agents re-evaluate their supported hash functions on a regular basis and deprecate support for those functions shown to be insecure. Over time, hash functions may be shown to be much weaker than expected and, in some cases, broken, so it is important that user agents stay aware of these developments. 5.3 Cross-origin data leakage This specification requires the CORS settings attribute to be present on integrity-protected cross-origin requests. If that requirement were omitted, attackers could violate the same-origin policy and determine whether a cross-origin resource has certain content. Attackers would attempt to load the resource with a known digest, and watch for load failures. If the load fails, the attacker could surmise that the response didn’t match the hash and thereby gain some insight into its contents. This might reveal, for example, whether or not a user is logged into a particular service. Moreover, attackers could brute-force specific values in an otherwise static resource. Consider a JSON response that looks like this: Example 8 { 'status' : 'authenticated' , 'username' : 'admin' } An attacker could precompute hashes for the response with a variety of common usernames, and specify those hashes while repeatedly attempting to load the document. A successful load would confirm that the attacker has correctly guessed the username.

6. Acknowledgements Much of the content here is inspired heavily by Gervase Markham’s Link Fingerprints concept, as well as WHATWG’s Link Hashes. A special thanks to Mike West of Google, Inc. for his invaluable contributions to the initial version of this spec. Additionally, Brad Hill, Anne van Kesteren, Jonathan Kingston, Mark Nottingham, Dan Veditz, Eduardo Vela, Tanvi Vyas, and Michal Zalewski provided invaluable feedback.