Please note there may be errata for this document.
The English version of this specification is the only normative version. Non-normative translations may also be available.
Copyright © 2014-2016 W3C® (MIT, ERCIM, Keio, Beihang). W3C liability, trademark and document use rules apply.
This specification defines a mechanism by which user agents may verify that a fetched resource has been delivered without unexpected manipulation.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
A list of changes to this document may be found at https://github.com/w3c/webappsec-subresource-integrity.
This document was published by the Web Application Security Working Group as a Recommendation. If you wish to make comments regarding this document, please send them to public-webappsec@w3.org (subscribe, archives) with [SRI] at the start of your email's subject. All comments are welcome.
Please see the Working Group's implementation report.
This document has been reviewed by W3C Members, by software developers, and by other W3C groups and interested parties, and is endorsed by the Director as a W3C Recommendation. It is a stable document and may be used as reference material or cited from another document. W3C's role in making the Recommendation is to draw attention to the specification and to promote its widespread deployment. This enhances the functionality and interoperability of the Web.
W3C expects the functionality specified in this Recommendation will not be affected by changes to Fetch. The Working Group will continue to track the Fetch specification and document issues that impact this specification.
This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
This document is governed by the 1 September 2015 W3C Process Document.
This section is non-normative.
Sites and applications on the web are rarely composed of resources from only a single origin. For example, authors pull scripts and styles from a wide variety of services and content delivery networks, and must trust that the delivered representation is, in fact, what they expected to load. If an attacker can trick a user into downloading content from a hostile server (via DNS poisoning, or other such means), the author has no recourse. Likewise, an attacker who can replace the file on the Content Delivery Network (CDN) server has the ability to inject arbitrary content.
Delivering resources over a secure channel mitigates some of this risk: with TLS, HSTS, and pinned public keys, a user agent can be fairly certain that it is indeed speaking with the server it believes it’s talking to. These mechanisms, however, authenticate only the server, not the content. An attacker (or administrator) with access to the server can manipulate content with impunity. Ideally, authors would not only be able to pin the keys of a server, but also pin the content, ensuring that an exact representation of a resource, and only that representation, loads and executes.
This document specifies such a validation scheme, extending two HTML elements with an integrity attribute that contains a cryptographic hash of the representation of the resource the author expects to load. For instance, an author may wish to load some framework from a shared server rather than hosting it on their own origin. Specifying that the expected SHA-384 hash of https://example.com/example-framework.js is Li9vy3DqF8tnTXuiaAJuML3ky+er10rcgNR/VqsVpcw+ThHmYcwiB1pbOxEbzJr7 means that the user agent can verify that the data it loads from that URL matches that expected hash before executing the JavaScript it contains. This integrity verification significantly reduces the risk that an attacker can substitute malicious content.
This example can be communicated to a user agent by adding the hash to a script element, like so:
<script src="https://example.com/example-framework.js" integrity="sha384-Li9vy3DqF8tnTXuiaAJuML3ky+er10rcgNR/VqsVpcw+ThHmYcwiB1pbOxEbzJr7" crossorigin="anonymous"></script>
Scripts, of course, are not the only response type which would benefit from integrity validation. The scheme specified here also applies to link, and future versions of this specification are likely to expand this coverage.
Compromise of a third-party service should not automatically mean compromise of every site which includes its scripts. Content authors will have a mechanism by which they can specify expectations for content they load, meaning for example that they could load a specific script, and not any script that happens to have a particular URL.
The verification mechanism should have error-reporting functionality which would inform the author that an invalid response was received.
An author wishes to use a content delivery network to improve performance for globally-distributed users. It is important, however, to ensure that the CDN’s servers deliver only the code the author expects them to deliver. To mitigate the risk that a CDN compromise (or unexpectedly malicious behavior) would change that site in unfortunate ways, the following integrity metadata is added to the link element included on the page:
<link rel="stylesheet" href="https://site53.example.net/style.css" integrity="sha384-+/M6kredJcxdsqkczBUjMLvqyHb1K/JThDXWsBVxMEeZHEaMKEOEct339VItX1zB" crossorigin="anonymous">
An author wants to include JavaScript provided by a third-party analytics service. To ensure that only the code that has been carefully reviewed is executed, the author generates integrity metadata for the script, and adds it to the script element:
<script src="https://analytics-r-us.example.com/v1.0/include.js" integrity="sha384-MBO5IDfYaE6c6Aao94oZrIOiC6CGiSN2n4QUbHNPhzk5Xhm0djZLQqTpL0HzTUxk" crossorigin="anonymous"></script>
A user agent wishes to ensure that JavaScript code running in high-privilege HTML contexts (for example, a browser’s New Tab page) isn’t manipulated before display. Integrity metadata mitigates the risk that altered JavaScript will run in these pages’ high-privilege contexts.
As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.
The key words MAY, MUST, and SHOULD are to be interpreted as described in [RFC2119].
Conformance requirements phrased as algorithms or specific steps can be implemented in any manner, so long as the end result is equivalent. In particular, the algorithms defined in this specification are intended to be easy to understand and are not intended to be performant. Implementers are encouraged to optimize.
This section defines several terms used throughout the document.
The term digest refers to the base64-encoded result of executing a cryptographic hash function on an arbitrary block of data.
The term origin is defined in the Origin specification. [RFC6454]
The representation data and content encoding of a resource are defined by RFC7231, section 3. [RFC7231]
A base64 encoding is defined in RFC 4648, section 4. [RFC4648]
SHA-256, SHA-384, and SHA-512 are part of the SHA-2 set of cryptographic hash functions defined by the NIST in “FIPS PUB 180-4: Secure Hash Standard (SHS)”.
The Augmented Backus-Naur Form (ABNF) notation used in this document is specified in RFC5234. [ABNF]
Appendix B.1 of [ABNF] defines VCHAR (printing characters).
WSP (white space) characters are defined in Section 2.4.1 Common parser idioms of the HTML 5 specification as White_Space characters.
The integrity verification mechanism specified here boils down to the process of generating a sufficiently strong cryptographic digest for a resource, and transmitting that digest to a user agent so that it may be used to verify the response.
To verify the integrity of a response, a user agent requires integrity metadata as part of the request. This metadata consists of the following pieces of information:
a cryptographic hash function
a digest
options
The hash function and digest MUST be provided in order to validate a response’s integrity.
At the moment, no options are defined. However, future versions of the spec may define options, such as MIME types [MIMETYPE].
This metadata MUST be encoded in the same format as the hash-source (without the single quotes) in section 4.2 of the Content Security Policy Level 2 specification.
For example, given a script resource containing only the string alert('Hello, world.');, an author might choose SHA-384 as a hash function. H8BRh8j48O9oYatfu5AZzq6A9RINhZO5H16dQZngK7T62em8MUt1FLm52t+eX6xO is the base64-encoded digest that results. This can be encoded as follows:
sha384-H8BRh8j48O9oYatfu5AZzq6A9RINhZO5H16dQZngK7T62em8MUt1FLm52t+eX6xO
Digests may be generated using any number of utilities. OpenSSL, for example, is quite commonly available. The example in this section is the result of the following command line:
echo -n "alert('Hello, world.');" | openssl dgst -sha384 -binary | openssl base64 -A
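The same digest format can also be produced without OpenSSL. The following is a minimal sketch using Python's standard hashlib and base64 modules; the function name integrity_value is hypothetical and is not defined by this specification.

```python
import base64
import hashlib


def integrity_value(data: bytes, algorithm: str = "sha384") -> str:
    """Return '<algorithm>-<base64 digest>' for the given bytes.

    integrity_value is an illustrative helper, not part of any specification.
    """
    # Hash the raw bytes, then base64-encode the binary digest
    # (the same two stages as the openssl pipeline).
    digest = hashlib.new(algorithm, data).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


if __name__ == "__main__":
    # The input must be byte-for-byte identical to the served resource,
    # with no trailing newline (hence echo -n in the command line).
    print(integrity_value(b"alert('Hello, world.');"))
```

The printed value can be compared against the hash expression quoted in this section. Any difference in the input bytes, such as a trailing newline, yields a completely different digest.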
Conformant user agents MUST support the SHA-256, SHA-384 and SHA-512 cryptographic hash functions for use as part of a request’s integrity metadata and MAY support additional hash functions.
User agents SHOULD refuse to support known-weak hashing functions like MD5 or SHA-1 and SHOULD restrict supported hashing functions to those known to be collision-resistant. Additionally, user agents SHOULD re-evaluate their supported hash functions on a regular basis and deprecate support for those functions that have become insecure. See Hash collision attacks.
Multiple sets of integrity metadata may be associated with a single resource in order to provide agility in the face of future cryptographic discoveries. For example, the resource described in the previous section may be described by either of the following hash expressions:
sha384-dOTZf16X8p34q2/kYyEFm0jh89uTjikhnzjeLeF0FHsEaYKb1A1cv+Lyv4Hk8vHd
sha512-Q2bFTOhEALkN8hOms2FKTDLy7eugP2zFZ1T8LCvX42Fp3WoNr3bjZSAHeOsHrbV1Fu9/A0EzCinRE7Af1ofPrw==
Authors may choose to specify both, for example:
<script src="hello_world.js" integrity="sha384-dOTZf16X8p34q2/kYyEFm0jh89uTjikhnzjeLeF0FHsEaYKb1A1cv+Lyv4Hk8vHd sha512-Q2bFTOhEALkN8hOms2FKTDLy7eugP2zFZ1T8LCvX42Fp3WoNr3bjZSAHeOsHrbV1Fu9/A0EzCinRE7Af1ofPrw==" crossorigin="anonymous"></script>
In this case, the user agent will choose the strongest hash function in the list, and use that metadata to validate the response (as described below in the “parse metadata” and “get the strongest metadata from set” algorithms).
When a hash function is determined to be insecure, user agents SHOULD deprecate and eventually remove support for integrity validation using the insecure hash function. User agents MAY check the validity of responses using a digest based on a deprecated function.
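An author-side build step might emit several hash expressions at once. The sketch below assumes a hypothetical helper, script_tag, that hashes a script body with SHA-384 and SHA-512 and renders a script element carrying both expressions separated by a space; it illustrates the attribute format only.

```python
import base64
import hashlib
from html import escape


def script_tag(src: str, body: bytes, algorithms=("sha384", "sha512")) -> str:
    """Render a script element whose integrity attribute lists one hash
    expression per algorithm (script_tag is a hypothetical helper)."""
    tokens = [
        f"{name}-{base64.b64encode(hashlib.new(name, body).digest()).decode('ascii')}"
        for name in algorithms
    ]
    # Hash expressions are separated by white space inside the attribute.
    integrity = " ".join(tokens)
    return (
        f'<script src="{escape(src, quote=True)}" '
        f'integrity="{escape(integrity, quote=True)}" '
        'crossorigin="anonymous"></script>'
    )


if __name__ == "__main__":
    print(script_tag("hello_world.js", b"alert('Hello, world.');"))
```

Listing a weaker and a stronger function together lets newer user agents validate against the stronger one.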
To allow authors to switch to stronger hash functions without being held back by older user agents, validation using unsupported hash functions acts like no integrity value was provided (see the “Does response match metadataList” algorithm below). Authors are encouraged to use strong hash functions, and to begin migrating to stronger hash functions as they become available.
User agents must provide a mechanism for determining the relative priority of two hash functions and return the empty string if the priority is equal. That is, if a user agent implemented a function like getPrioritizedHashFunction(a, b) it would return the hash function the user agent considers the most collision-resistant. For example, getPrioritizedHashFunction('sha256', 'sha512') would return 'sha512' and getPrioritizedHashFunction('sha256', 'sha256') would return the empty string.
The getPrioritizedHashFunction function is an internal implementation detail. It is not an API that implementors provide to web applications. It is used in this document only to simplify the algorithm description.
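As an illustration only, such a priority function could be sketched as follows. The ranking table is an assumption (a longer SHA-2 digest is treated as stronger); each user agent decides its own ordering.

```python
# Assumed ranking, weakest to strongest. This ordering is illustrative;
# the specification leaves the actual priorities to the user agent.
_PRIORITY = {"sha256": 1, "sha384": 2, "sha512": 3}


def get_prioritized_hash_function(a: str, b: str) -> str:
    """Return the higher-priority hash function name, or '' on a tie."""
    rank_a, rank_b = _PRIORITY[a], _PRIORITY[b]
    if rank_a == rank_b:
        return ""
    return a if rank_a > rank_b else b


if __name__ == "__main__":
    print(get_prioritized_hash_function("sha256", "sha512"))
    print(repr(get_prioritized_hash_function("sha256", "sha256")))
```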
In order to mitigate an attacker’s ability to read data cross-origin by brute-forcing values via integrity checks, responses are only eligible for such checks if they are same-origin or are the result of explicit access granted to the loading origin via Cross Origin Resource Sharing [CORS].
As noted in RFC6454, section 4, some user agents use globally unique identifiers for each file URI. This means that resources accessed over a file scheme URL are unlikely to be eligible for integrity checks.
Being in a Secure Context (e.g., a document delivered over HTTPS) is not necessary for the use of integrity validation. Because resource integrity is only an application level security tool, and it does not change the security state of the user agent, a Secure Context is unnecessary. However, if integrity is used in something other than a Secure Context (e.g., a document delivered over HTTP), authors should be aware that the integrity provides no security guarantees at all. For this reason, authors should only deliver integrity metadata in a Secure Context. See Non-secure contexts remain non-secure for more discussion.
The following algorithm details these restrictions:
If the response’s type is basic, cors or default, return true.
Otherwise, return false.
The response types are defined by the Fetch specification [FETCH] and refer to the following:
basic is a same-origin response, and thus the requestor has full access to read the body.
cors is a valid response to a cross-origin, CORS-enabled request, and thus again the requestor has full access to read the body.
default is a valid response that is generated by a Service Worker as a response to the request, so its body, too, is fully readable by the requestor.
This algorithm accepts a string, and returns either no metadata, or a set of valid hash expressions whose hash functions are understood by the user agent.
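Let empty be true.
For each token in the metadata, set empty to false.
Return no metadata if empty is true, otherwise return result.
If the result of getPrioritizedHashFunction(currentAlgorithm, newAlgorithm) is the empty string, add item to result. If the result is newAlgorithm, set strongest to item, set result to the empty set, and add item to result.
If the parsed metadata is no metadata, return true.
If the response is not eligible for integrity validation, return false.
If the parsed metadata is the empty set, return true.
If the digest computed over the response matches the expected value of any item in the strongest metadata, return true.
Otherwise, return false.
This algorithm allows the user agent to accept multiple, valid strong hash functions. For example, a developer might write a script element such as: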
<script src="https://example.com/example-framework.js" integrity="sha384-Li9vy3DqF8tnTXuiaAJuML3ky+er10rcgNR/VqsVpcw+ThHmYcwiB1pbOxEbzJr7 sha384-+/M6kredJcxdsqkczBUjMLvqyHb1K/JThDXWsBVxMEeZHEaMKEOEct339VItX1zB" crossorigin="anonymous"></script>
which would allow the user agent to accept two different content payloads, one of which matches the first SHA384 hash value and the other matches the second SHA384 hash value.
User agents may allow users to modify the result of this algorithm via user preferences, bookmarklets, third-party additions to the user agent, and other such mechanisms. For example, redirects generated by an extension like HTTPS Everywhere could load and execute correctly, even if the HTTPS version of a resource differs from the HTTP version.
This algorithm returns false if the response is not eligible for integrity validation since Subresource Integrity requires CORS, and it is a logical error to attempt to use it without CORS. Additionally, user agents SHOULD report a warning message to the developer console to explain this failure.
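The behaviour described in this section can be sketched end to end: parse the metadata, keep only the strongest hash function, and compare digests. This is an illustrative reading, not the normative algorithm. The function names, the SUPPORTED priority table, the use of None for "no metadata", and the eligible flag (standing in for the response type check) are all assumptions.

```python
import base64
import hashlib

# Assumed set of supported functions with an illustrative priority order.
SUPPORTED = {"sha256": 1, "sha384": 2, "sha512": 3}
NO_METADATA = None  # stands in for the "no metadata" result


def parse_metadata(metadata: str):
    """Return NO_METADATA for an empty string, else the supported (alg, value) pairs."""
    result = []
    empty = True
    for token in metadata.split():
        empty = False
        expression = token.split("?", 1)[0]  # options are ignored in this sketch
        algorithm, separator, value = expression.partition("-")
        if not separator or not value:
            continue  # not a well-formed hash expression
        if algorithm in SUPPORTED:
            result.append((algorithm, value))
    return NO_METADATA if empty else result


def strongest_metadata(items):
    """Keep only the items that use the highest-priority hash function."""
    result = []
    for algorithm, value in items:
        if not result or SUPPORTED[algorithm] > SUPPORTED[result[0][0]]:
            result = [(algorithm, value)]
        elif SUPPORTED[algorithm] == SUPPORTED[result[0][0]]:
            result.append((algorithm, value))
    return result


def response_matches(body: bytes, eligible: bool, metadata_list: str) -> bool:
    """Decide whether a response body satisfies the given integrity metadata."""
    parsed = parse_metadata(metadata_list)
    if parsed is NO_METADATA:
        return True  # nothing was requested
    if not eligible:
        return False  # e.g. a cross-origin response fetched without CORS
    if not parsed:
        return True  # only unsupported functions: acts as if no value was given
    for algorithm, expected in strongest_metadata(parsed):
        actual = base64.b64encode(hashlib.new(algorithm, body).digest()).decode("ascii")
        if actual == expected:
            return True
    return False
```

Note that a matching digest for a weaker function does not help once a stronger function is listed: only the strongest set is consulted.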
A variety of HTML elements result in requests for resources that are to be embedded into the document, or executed in its context. To support integrity metadata for some of these elements, a new integrity attribute is added to the list of content attributes for the link and script elements.
A corresponding integrity IDL attribute which reflects the value of each element’s integrity content attribute is added to the HTMLLinkElement and HTMLScriptElement interfaces.
A future revision of this specification is likely to include integrity support for all possible subresources, i.e., a, audio, embed, iframe, img, link, object, script, source, track, and video elements.
integrity attribute
The integrity attribute represents integrity metadata for an element. The value of the attribute MUST be either the empty string, or at least one valid metadata as described by the following ABNF grammar:
integrity-metadata = *WSP hash-with-options *( 1*WSP hash-with-options ) *WSP / *WSP
hash-with-options  = hash-expression *("?" option-expression)
option-expression  = *VCHAR
hash-algo          = <hash-algo production from [Content Security Policy Level 2, section 4.2]>
base64-value       = <base64-value production from [Content Security Policy Level 2, section 4.2]>
hash-expression    = hash-algo "-" base64-value
The integrity IDL attribute must reflect the integrity content attribute.
option-expressions are associated on a per hash-expression basis and are applied only to the hash-expression that immediately precedes them.
In order for user agents to remain fully forwards compatible with future options, the user agent MUST ignore all unrecognized option-expressions.
Note that while the option-expression has been reserved in the syntax, no options have been defined. It is likely that a future version of the spec will define a more specific syntax for options, so it is defined here as broadly as possible.
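To make the grammar concrete, the following sketch splits a single hash-with-options token into its hash-expression and any trailing option-expressions. The regular expression approximates the hash-algo and base64-value productions of Content Security Policy Level 2, and the function name split_hash_with_options is hypothetical. Because option-expression is defined as *VCHAR, treating every "?" as a separator is a simplifying assumption.

```python
import re

# Approximation of: hash-algo "-" base64-value (per CSP Level 2, section 4.2).
HASH_EXPRESSION = re.compile(r"(sha256|sha384|sha512)-([A-Za-z0-9+/]+={0,2})")


def split_hash_with_options(token: str):
    """Split 'alg-digest?opt?opt' into its parts, or return None if malformed."""
    expression, *options = token.split("?")
    match = HASH_EXPRESSION.fullmatch(expression)
    if match is None:
        return None
    # No options are defined yet, so callers are expected to ignore
    # anything they do not recognize in the "options" list.
    return {
        "algorithm": match.group(1),
        "digest": match.group(2),
        "options": options,
    }


if __name__ == "__main__":
    print(split_hash_with_options("sha384-AAAA?foo=bar"))
```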
partial interface HTMLLinkElement {
  attribute DOMString integrity;
};
integrity of type DOMString
partial interface HTMLScriptElement {
  attribute DOMString integrity;
};
integrity of type DOMString
The user agent will refuse to render or execute responses that fail an integrity check, instead returning a network error as defined in Fetch [FETCH].
On a failed integrity check, an error event is fired. Developers wishing to provide a canonical fallback resource (e.g., a resource not served from a CDN, perhaps from a secondary, trusted, but slower source) can catch this error event and provide an appropriate handler to replace the failed resource with a different one.
link element for stylesheets
Whenever a user agent attempts to obtain a resource pointed to by a link element that has a rel attribute with the keyword of stylesheet, modify step 4 to read:
Do a potentially CORS-enabled fetch of the resulting absolute URL, with the mode being the current state of the element’s crossorigin content attribute, the origin being the origin of the link element’s Document, the default origin behavior set to taint, and the integrity metadata of the request set to the value of the element’s integrity attribute.
script element
Replace step 14.1 of HTML5’s “prepare a script” algorithm with:
Let src be the value of the element’s src attribute and the request’s associated integrity metadata be the value of the element’s integrity attribute.
Optimizing proxies and other intermediate servers which modify the responses MUST ensure that the digest associated with those responses stays in sync with the new content. One option is to ensure that the integrity metadata associated with resources is updated. Another would be simply to deliver only the canonical version of resources for which a page author has requested integrity verification.
To help inform intermediate servers, those serving the resources SHOULD send along with the resource a Cache-Control header with a value of no-transform.
This section is non-normative.
Integrity metadata delivered by a context that is not a Secure Context, such as an HTTP page, only protects an origin against a compromise of the server where an external resource is hosted. Network attackers can alter the digest in-flight (or remove it entirely, or do absolutely anything else to the document), just as they could alter the response the hash is meant to validate. Thus, it is recommended that authors deliver integrity metadata only to a Secure Context. See also securing the web.
Digests are only as strong as the hash function used to generate them. It is recommended that user agents refuse to support known-weak hashing functions and limit supported algorithms to those known to be collision resistant. Examples of hashing functions that are not recommended include MD5 and SHA-1. At the time of writing, SHA-384 is a good baseline.
Moreover, it is recommended that user agents re-evaluate their supported hash functions on a regular basis and deprecate support for those functions shown to be insecure. Over time, hash functions may be shown to be much weaker than expected and, in some cases, broken, so it is important that user agents stay aware of these developments.
This specification requires the CORS settings attribute to be present on integrity-protected cross-origin requests. If that requirement were omitted, attackers could violate the same-origin policy and determine whether a cross-origin resource has certain content.
Attackers would attempt to load the resource with a known digest, and watch for load failures. If the load fails, the attacker could surmise that the response didn’t match the hash and thereby gain some insight into its contents. This might reveal, for example, whether or not a user is logged into a particular service.
Moreover, attackers could brute-force specific values in an otherwise static resource. Consider a JSON response that looks like this:
{'status':'authenticated','username':'admin'}
An attacker could precompute hashes for the response with a variety of common usernames, and specify those hashes while repeatedly attempting to load the document. A successful load would confirm that the attacker has correctly guessed the username.
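The attack can be simulated to show why the CORS requirement matters. The sketch below models a hypothetical user agent that lacks the restriction as an oracle revealing only whether a load with a given hash expression succeeds. The names sri, make_oracle, and brute_force are illustrative assumptions, as is the candidate username list.

```python
import base64
import hashlib


def sri(data: bytes) -> str:
    """Return a sha384 hash expression for the given bytes."""
    return "sha384-" + base64.b64encode(hashlib.sha384(data).digest()).decode("ascii")


def make_oracle(secret_body: bytes):
    """Simulate a user agent WITHOUT the CORS restriction: the attacker
    never sees the body, only whether the integrity-checked load succeeds."""
    actual = sri(secret_body)
    return lambda integrity: integrity == actual


def brute_force(oracle, usernames):
    """Try one precomputed digest per candidate username."""
    for name in usernames:
        body = f"{{'status':'authenticated','username':'{name}'}}".encode("utf-8")
        if oracle(sri(body)):
            return name  # a successful load confirms the guess
    return None


if __name__ == "__main__":
    oracle = make_oracle(b"{'status':'authenticated','username':'admin'}")
    print(brute_force(oracle, ["root", "guest", "admin"]))
```

Each attempted load leaks a single bit, which is enough once the rest of the response is predictable.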
Much of the content here is inspired heavily by Gervase Markham’s Link Fingerprints concept, as well as WHATWG’s Link Hashes.
A special thanks to Mike West of Google, Inc. for his invaluable contributions to the initial version of this spec. Additionally, Brad Hill, Anne van Kesteren, Jonathan Kingston, Mark Nottingham, Dan Veditz, Eduardo Vela, Tanvi Vyas, and Michal Zalewski provided invaluable feedback.