HTML Sanitizer API

Limited availability

This feature is not Baseline because it does not work in some of the most widely-used browsers.

Experimental: This is an experimental technology.
Check the Browser compatibility table carefully before using this in production.

The HTML Sanitizer API allows developers to take strings of HTML and filter out unwanted elements, attributes, and other HTML entities when they are inserted into the DOM or a shadow DOM.

Concepts and usage

Web applications often need to work with untrusted HTML on the client side, for example, as part of a client-side templating solution, when rendering user generated content, or if including data in a frame from another site.

Injecting untrusted HTML can make a site vulnerable to various types of attacks. In particular, cross-site scripting (XSS) attacks work by injecting untrusted HTML into the DOM that then executes JavaScript in the context of the current origin, allowing malicious code to run as though it were served from the site's origin. These attacks can be mitigated by removing unsafe HTML elements and attributes before they are injected into the DOM.

The HTML Sanitizer API provides a number of methods for removing unwanted HTML entities from HTML input before it is injected into the DOM. These come in XSS-safe versions that enforce removal of all unsafe elements and attributes, and potentially unsafe versions that give developers full control over the HTML entities that are allowed.

Sanitization methods

The HTML Sanitizer API provides XSS-safe and XSS-unsafe methods for injecting HTML strings into an Element or a ShadowRoot, and for parsing HTML into a Document.

Safe methods: Element.setHTML(), ShadowRoot.setHTML(), and Document.parseHTML().
Unsafe methods: Element.setHTMLUnsafe(), ShadowRoot.setHTMLUnsafe(), and Document.parseHTMLUnsafe().

All the methods take the HTML to be injected and an optional sanitizer configuration as arguments. The configuration defines the HTML entities that will be filtered out of the input before it is injected. The Element methods are context aware, and will additionally drop any elements that the HTML specification does not allow in the target element.

The safe methods always remove XSS-unsafe elements and attributes. If no sanitizer is passed as a parameter they will use the default sanitizer configuration, which allows all elements and attributes except those that are known to be unsafe, such as <script> elements and onclick event handlers. If a custom sanitizer is used, it is implicitly updated to remove any elements and attributes that are not XSS-safe (note that the passed sanitizer is not modified, and might still allow unsafe entities if used with an unsafe method).

The safe methods should be used instead of Element.innerHTML, Element.outerHTML, or ShadowRoot.innerHTML for injecting untrusted HTML content. For example, in most cases you can use Element.setHTML() with the default sanitizer as a drop-in replacement for Element.innerHTML. The same methods can also be used for injecting trusted HTML strings that do not need to contain any XSS-unsafe elements.
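Because the API has limited availability, code that swaps Element.innerHTML for Element.setHTML() may want to check for the method first. The following is a minimal sketch of that idea, not a recommended pattern from the specification: the helper name `injectUntrustedHTML` is hypothetical, and falling back to `textContent` (which shows the markup as literal text) is an assumption about what a site might prefer when no safe method exists.

```javascript
// Inject untrusted HTML with the XSS-safe method when the browser provides it.
// `injectUntrustedHTML` is a hypothetical helper name, not part of the API.
function injectUntrustedHTML(element, untrustedString) {
  if (typeof element.setHTML === "function") {
    // No sanitizer argument: the default sanitizer configuration is used.
    element.setHTML(untrustedString);
    return "sanitized";
  }
  // Assumed fallback: display the markup as literal text instead of parsing it.
  element.textContent = untrustedString;
  return "text";
}

// Browser usage (skipped where there is no DOM, for example in Node.js).
if (typeof document !== "undefined") {
  const target = document.getElementById("target");
  if (target) {
    injectUntrustedHTML(target, "abc <script>alert(1)<" + "/script> def");
  }
}
```

The return value only reports which path was taken, so callers can log or adapt when the safe method is missing.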

The XSS-unsafe methods will use whatever sanitizer configuration is passed as an argument. If no sanitizer is passed, then all HTML elements and attributes allowed by the context will be injected. This is similar to using Element.innerHTML except that the method will parse shadow roots, drop elements that aren't appropriate in the context, and allow some other input that is not allowed when using the property.

The unsafe methods should only be used with untrusted HTML that needs to contain some XSS-unsafe elements or attributes. This is still unsafe, but allows you to reduce the risk by restricting unsafe entities to the minimal set. For example, if you wanted to inject untrusted HTML but for some reason needed the input to keep its onblur handlers, you could do so more safely by amending the default sanitizer and using an unsafe method as shown:

const sanitizer = new Sanitizer(); // Default sanitizer
sanitizer.allowAttribute("onblur"); // Allow onblur
someElement.setHTMLUnsafe(untrustedString, { sanitizer });

Sanitizer configuration

A sanitizer configuration defines what HTML entities will be allowed, replaced, or removed when the sanitizer is used, including elements, attributes, data-* attributes, and comments.

There are two very closely related sanitizer configuration interfaces, either of which can be passed to all the sanitization methods.

SanitizerConfig is a dictionary object that defines arrays for the allowed/disallowed elements and attributes, and boolean properties that indicate whether comments and data attributes will be allowed or omitted, and so on.
Only a subset of possible configuration options may be specified in a particular configuration in order to reduce redundancy and ambiguity. The allowed subset is summarized in the Allow and remove configurations section below, and described in detail in Valid configuration.
Sanitizer is essentially a wrapper around a SanitizerConfig that provides methods to ergonomically modify the configuration and ensure that it remains valid.
For example, you can use a method to add an allowed element, and it will also remove the element from the replaceWithChildrenElements array (if present). The interface also provides methods to return a copy of the underlying SanitizerConfig and also to update the sanitizer so that it is XSS-safe. It may provide normalizations of the sanitizer configuration used to construct it, making it easier to understand and reuse.

While you can use either interface in any of the sanitizing methods, Sanitizer is likely to be more efficient to share and reuse than SanitizerConfig.

Allow and remove configurations

You can build up a configuration in two ways:

As an allow configuration: specifying the set of elements and/or attributes that you will allow in the output.
As a remove configuration: specifying the set that must not be present in the output.

These sets are specified as arrays in the configuration object fields: elements and attributes, and removeElements and removeAttributes. You may not specify both allow and remove arrays for elements or attributes in the same configuration, but other combinations of fields are allowed. The following table shows the permitted combinations.

Element arrays	Attribute arrays	Valid?
`elements`	-	✔️
`elements`	`attributes`	✔️
`elements`	`removeAttributes`	✔️
`removeElements`	-	✔️
`removeElements`	`attributes`	✔️
`removeElements`	`removeAttributes`	✔️
-	`attributes`	✔️
-	`removeAttributes`	✔️
`elements` + `removeElements`	(anything)	❌
(anything)	`attributes` + `removeAttributes`	❌
-	-	✔️
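
The rule behind the table can be sketched as a small check. This is only an illustration of the array-combination rule: `hasValidArrayCombination` is a hypothetical helper, not part of the API, and it does not cover the other validity rules for a configuration.

```javascript
// Hypothetical helper: checks only the array-combination rule from the table.
// It does not implement the full validity rules for a SanitizerConfig.
function hasValidArrayCombination(config) {
  const has = (key) => Array.isArray(config[key]);
  // Allow and remove arrays must not be mixed for elements...
  const elementsConflict = has("elements") && has("removeElements");
  // ...or for attributes.
  const attributesConflict = has("attributes") && has("removeAttributes");
  return !elementsConflict && !attributesConflict;
}

console.log(hasValidArrayCombination({ elements: ["p"], removeAttributes: ["onclick"] })); // true
console.log(hasValidArrayCombination({ elements: ["p"], removeElements: ["div"] })); // false
console.log(hasValidArrayCombination({})); // true
```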

An allow configuration can optionally specify whether per-element attributes should be allowed and/or removed in its elements array. The allowed configuration for these local attributes depends on whether or not the global attributes or removeAttributes array is defined. The valid configuration section outlines the restrictions.

In general an "allow configuration" is safer for both the elements and attributes, because you list the elements and/or attributes that you want and know are safe, rather than all the items that are dangerous or might potentially be considered dangerous in future. If you specify an empty configuration object then an empty allow configuration is used.

Allow configurations

With "allow configurations" you specify the elements and attributes you wish to allow (or replace with child elements); all other elements/attributes in the input will be dropped. This makes it easy to understand what elements will be allowed in the DOM when the HTML is parsed. They are useful when you know exactly what HTML entities you want to be able to inject in a particular context.

Allow configurations are created by defining a Sanitizer that wraps a SanitizerConfig that includes the elements and/or attributes arrays (and not the removeElements or removeAttributes arrays).

For example, the following configuration is created by passing a SanitizerConfig that allows <p> and <div> elements, and cite and onclick attributes on any allowed element. It will also replace <b> elements with their child nodes.

const sanitizer = new Sanitizer({
  elements: ["p", "div"],
  replaceWithChildrenElements: ["b"],
  attributes: ["cite", "onclick"],
});

The same configuration can also be created using Sanitizer methods. Note that in the following code the Sanitizer() constructor takes an empty object, which results in a Sanitizer where the underlying configuration includes both elements and attributes arrays; in other words, an "allow configuration".

// Create empty sanitizer
const sanitizer = new Sanitizer({});

// Use Sanitizer methods to update the properties.
sanitizer.allowElement("p");
sanitizer.allowElement("div");
sanitizer.replaceElementWithChildren("b");
sanitizer.allowAttribute("cite");
sanitizer.allowAttribute("onclick");

Remove configurations

In "remove configurations" you specify the HTML elements and attributes that you want to remove: any other elements and attributes are permitted by the sanitizer (but may be blocked if you use a safe sanitizer method, or if the element is not allowed in the context).

Remove configurations are created using a SanitizerConfig that includes the removeElements and/or removeAttributes arrays (and not the elements or attributes arrays).

For example, the following Sanitizer configuration would remove the same elements that were allowed in the previous code:

const sanitizer = new Sanitizer({
  removeElements: ["p", "div"],
  removeAttributes: ["cite", "onclick"],
  replaceWithChildrenElements: ["b"],
});

The configuration can also be created using Sanitizer methods. To make this a "remove configuration" we have to declare the removeElements or removeAttributes array when constructing the object (if only one array is specified the other will be defined as part of normalization).

const sanitizer = new Sanitizer({
  removeElements: [],
});
sanitizer.removeElement("p");
sanitizer.removeElement("div");
sanitizer.replaceElementWithChildren("b");
sanitizer.removeAttribute("cite");
sanitizer.removeAttribute("onclick");

Adding and removing from `Sanitizer` configurations

Sanitizer is recommended when you're using a configuration object that you might want to reuse or modify. Whether the sanitizer has an allow or remove configuration depends on the SanitizerConfig passed when the object is created. For example, if you pass a configuration object that has the elements or attributes array (or an empty object) the sanitizer will have an allow configuration.

In the examples above we created an allow configuration and then called allowElement(), allowAttribute(), and replaceElementWithChildren() to allow additional elements and attributes, and similarly we created a remove configuration and called removeElement() and removeAttribute() to specify additional elements and attributes to remove.

You can also call the allow methods on a remove configuration, and the remove methods on an allow configuration, but they behave differently. When you call the allow methods on an allow sanitizer the specified elements and attributes are added to the underlying elements and attributes arrays. However, if you call those methods on a remove sanitizer there is no elements or attributes array; instead the specified element or attribute is removed from the corresponding removeElements or removeAttributes array, if present. This works because allowing an element in an allow sanitizer is the same as "not removing" an element in a remove sanitizer.

You can call all the Sanitizer methods on either an allow or remove sanitizer, and the method will make whatever changes it is able to that result in a valid configuration. For example, if you add an element the method will either add it to elements or remove it from removeElements if present, depending on the type of sanitizer, and also remove the same element from the replaceWithChildrenElements array, if present.

Some operations that are possible for an allow configuration are not possible for a remove configuration. For example, per-element attributes are defined in the elements array, which is not present in a remove sanitizer.

The methods return true or false to indicate whether or not they modified the underlying configuration. So if you call allowElement() on an allow configuration and the specified element is not present, it will be added to the elements array and the method will return true. But if the element is already present then the method would return false. Note that if you call the same method to set a per-element attribute, this will return false if called on a remove sanitizer, because the change cannot be made.
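
The return values described above can be illustrated with a deliberately simplified model. The `ModelSanitizer` class below is hypothetical and is not the browser's Sanitizer: it stores element names as plain strings, ignores attributes and per-element settings, and implements only an `allowElement()` look-alike so the true/false behavior can be followed step by step.

```javascript
// Simplified model of the behavior described above. It is NOT the browser's
// Sanitizer: element names are plain strings and attributes are left out.
class ModelSanitizer {
  constructor(config = {}) {
    // A remove configuration has a removeElements array; anything else
    // (including an empty object) is treated as an allow configuration.
    this.isRemove = Array.isArray(config.removeElements);
    this.list = new Set(this.isRemove ? config.removeElements : config.elements ?? []);
    this.replace = new Set(config.replaceWithChildrenElements ?? []);
  }

  // Returns true only if the modelled configuration was changed.
  allowElement(name) {
    // An allowed element must no longer be replaced by its children.
    const wasReplaced = this.replace.delete(name);
    if (this.isRemove) {
      // Allowing in a remove configuration means "stop removing".
      const wasRemoved = this.list.delete(name);
      return wasRemoved || wasReplaced;
    }
    const wasMissing = !this.list.has(name);
    this.list.add(name);
    return wasMissing || wasReplaced;
  }
}

const allow = new ModelSanitizer({ elements: ["p"], replaceWithChildrenElements: ["b"] });
console.log(allow.allowElement("div")); // true: added to the allow list
console.log(allow.allowElement("div")); // false: already allowed
console.log(allow.allowElement("b")); // true: added, and no longer replaced

const remove = new ModelSanitizer({ removeElements: ["p"] });
console.log(remove.allowElement("p")); // true: taken off the remove list
console.log(remove.allowElement("div")); // false: was never on the remove list
```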

Sanitization and Trusted Types

The Trusted Types API provides mechanisms to ensure that inputs are passed through a user-specified transformation function before being passed to an API that might execute that input. This transformation function is most commonly used to sanitize the input but it doesn't have to: the purpose of the API is primarily to make it easy for developers to audit sanitization code, not to define how or if sanitization is done.

The safe HTML sanitization methods don't use trusted types. Because they always filter all XSS-unsafe entities before input HTML is injected, there is no need to sanitize the input string, or audit the methods.

However, the unsafe HTML sanitization methods may inject untrusted HTML, depending on the sanitizer, and so will work with trusted types. The methods can take either a string or a TrustedHTML object as input. If a sanitizer is also supplied, the transformation function will be run first, and then the sanitizer.

Note that the behavior of the transformation function in this case will depend on the website policy (which might be to reject all use of the unsafe methods).
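
One way to picture that ordering is to call a policy explicitly and pass its result to an unsafe method. This is a sketch under assumptions rather than a prescribed pattern: `setUnsafeViaPolicy` is a hypothetical helper, the policy name in the usage comment is made up, and the example transformation function only records its input for auditing, which a real site policy may well replace with something stricter.

```javascript
// Hypothetical helper: run a Trusted Types policy, then an unsafe method.
function setUnsafeViaPolicy(element, untrustedString, policy, sanitizer) {
  // 1. The policy's transformation function produces a TrustedHTML value.
  const trusted = policy.createHTML(untrustedString);
  // 2. The sanitizer is applied when the unsafe method injects the result.
  element.setHTMLUnsafe(trusted, { sanitizer });
}

// Browser usage sketch (assumes Trusted Types and the Sanitizer API exist,
// and that the site's policy permits creating a policy with this name):
// const auditLog = [];
// const policy = trustedTypes.createPolicy("audit-html", {
//   createHTML: (input) => { auditLog.push(input); return input; },
// });
// setUnsafeViaPolicy(someElement, untrustedString, policy, new Sanitizer());
```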

Third party sanitization libraries

Prior to the Sanitizer API, developers typically filtered input strings using third-party libraries such as DOMPurify, perhaps called from transformation functions in trusted types.

These should not be necessary when using the safe HTML sanitization methods as the API is integrated with the browser, and is more aware of the parsing context and what code is allowed to execute than external parser libraries can be.

They may be useful with the unsafe HTML methods and trusted types, depending on website trusted type policies.

Interfaces

Sanitizer (Experimental): A reusable sanitizer configuration object that defines what elements and attributes should be allowed/removed when sanitizing untrusted strings of HTML. This is used in the methods that insert strings of HTML into the DOM or Document.
SanitizerConfig: A dictionary that defines a sanitizer configuration. This can be used in the same places as Sanitizer but is likely to be less efficient to use and reuse.

Extensions to other interfaces

XSS-safe methods

Element.setHTML(): Parse a string of HTML into a subtree of nodes, dropping any elements that are invalid in the context of the element. Then drop any elements and attributes that are not allowed by the sanitizer configuration, and any that are considered XSS-unsafe (even if allowed by the configuration). The subtree is then inserted into the DOM as a subtree of the element.
ShadowRoot.setHTML(): Parse a string of HTML into a subtree of nodes. Then drop any elements and attributes that are not allowed by the sanitizer configuration, and any that are considered XSS-unsafe (even if allowed by the configuration). The subtree is then inserted as a subtree of the ShadowRoot.
Document.parseHTML(): Parse a string of HTML into a subtree of nodes. Then drop any elements and attributes that are not allowed by the sanitizer configuration, and any that are considered XSS-unsafe (even if allowed by the configuration). The subtree is then set as the root of the Document.

XSS-unsafe methods

Element.setHTMLUnsafe(): Parse a string of HTML into a subtree of nodes, dropping any elements that are invalid in the context of the element. Then drop any elements and attributes that are not allowed by the sanitizer: if no sanitizer is specified, allow all elements. The subtree is then inserted into the DOM as a subtree of the element.
ShadowRoot.setHTMLUnsafe(): Parse a string of HTML into a subtree of nodes. Then drop any elements and attributes that are not allowed by the sanitizer: if no sanitizer is specified, allow all elements. The subtree is then inserted as a subtree of the ShadowRoot.
Document.parseHTMLUnsafe(): Parse a string of HTML into a subtree of nodes. Then drop any elements and attributes that are not allowed by the sanitizer: if no sanitizer is specified, allow all elements. The subtree is then set as the root of the Document.

Examples

The following examples show how to use the sanitizer API using the default sanitizer (at time of writing configuration operations are not yet supported).

Using `Element.setHTML()` with the default sanitizer

In most cases calling Element.setHTML() without passing a sanitizer can be used as a drop-in replacement for Element.innerHTML. The code below demonstrates how the method is used to sanitize the HTML input before it is injected into an element with an id of target.

const untrustedString = "abc <script>alert(1)<" + "/script> def"; // Untrusted HTML (perhaps from user input)
const someElement = document.getElementById("target");

// someElement.innerHTML = untrustedString;
someElement.setHTML(untrustedString);

console.log(someElement.innerHTML); // Only the text "abc" and "def" remains; the <script> element is removed

The <script> element is not allowed by the default sanitizer, or by the setHTML() method, so the alert() is removed.

Note that using Element.setHTMLUnsafe() with the default sanitizer will sanitize the same HTML entities. The main difference is that if you use this method with Trusted Types it may still be audited:

const defaultSanitizer = new Sanitizer(); // Default sanitizer
someElement.setHTMLUnsafe(untrustedString, { sanitizer: defaultSanitizer });

Using an allow sanitizer configuration

This code shows how you might use Element.setHTMLUnsafe() with an allow sanitizer that allows only <p>, <b>, and <div> elements. All other elements in the input string would be removed.

const sanitizer = new Sanitizer({ elements: ["p", "b", "div"] });
someElement.setHTMLUnsafe(untrustedString, { sanitizer });

Note that in this case you should normally use setHTML(). You should only use Element.setHTMLUnsafe() if you need to allow XSS-unsafe elements or attributes.
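
Following that advice, the same allow list can be used with the safe method. The sketch below assumes a hypothetical helper named `setHTMLWithAllowList` and passes a SanitizerConfig dictionary directly as the sanitizer; it returns false when Element.setHTML() is unavailable so the caller can decide what to do.

```javascript
// Hypothetical helper: inject with the XSS-safe method and an allow list.
function setHTMLWithAllowList(element, untrustedString, allowedElements) {
  if (typeof element.setHTML !== "function") {
    return false; // Safe method unavailable; the caller decides what to do.
  }
  // A SanitizerConfig dictionary is passed directly as the sanitizer.
  element.setHTML(untrustedString, { sanitizer: { elements: allowedElements } });
  return true;
}

// Browser usage (skipped where there is no DOM, for example in Node.js).
if (typeof document !== "undefined") {
  const target = document.getElementById("target");
  if (target) {
    setHTMLWithAllowList(target, "<p>Hello <b>world</b></p><i>not in the list</i>", ["p", "b", "div"]);
  }
}
```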