DOMParser: parseFromString() method
Baseline Widely available
This feature is well established and works across many devices and browser versions. It’s been available across browsers since July 2015.
Warning: This method parses its input as HTML, writing the result into the DOM. APIs like this are known as injection sinks, and are potentially a vector for cross-site scripting (XSS) attacks, if the input originally came from an attacker.
You can mitigate this risk by always passing TrustedHTML objects instead of strings and enforcing trusted types. See Security considerations for more information.
The parseFromString() method of the DOMParser interface parses an input containing either HTML or XML, returning a Document with the type given in the contentType property.
Note: The Document.parseHTMLUnsafe() static method provides an ergonomic alternative for parsing HTML markup into a Document.
Syntax
```js
parseFromString(input, mimeType)
```
Parameters
input
A TrustedHTML or string instance defining HTML to be parsed. The markup must contain either an HTML, XML, XHTML, or SVG document.
mimeType
A string that specifies whether the XML parser or the HTML parser is used to parse the string.
Allowed values are:
- text/html
- text/xml
- application/xml
- application/xhtml+xml
- image/svg+xml
Return value
A Document with contentType matching the given mimeType.
Note: The browser may actually return an HTMLDocument or XMLDocument object. These derive from Document and add no attributes: they are essentially equivalent.
Exceptions
TypeError
This is thrown when:
- mimeType is passed a value that is not one of the allowed values.
- input is passed a string value when Trusted Types are enforced by a CSP and no default policy is defined.
Description
The parseFromString() method parses an input containing either HTML or XML, returning a Document with the contentType matching the mimeType. This Document contains a complete in-memory DOM that is separate from the main document in the associated page.
If the mimeType is text/html the input is parsed as HTML and <script> elements are marked as non-executable, events are not fired, and event handlers aren't called to run inline scripts. While the document can download resources specified in <iframe> and <img> elements, it is essentially inert. This is useful because you can parse HTML inputs that include declarative shadow roots, and perform operations on the document without affecting the visible page. For example, you can use this to sanitize the input tree, and inject parts of the input into the visible DOM when needed.
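As a rough illustration of working on the inert document, the sketch below parses an HTML string and reads link targets from it without touching the visible page. The helper name collectLinkTargets and the sample markup are hypothetical and not part of the API, and real code would normally pass a TrustedHTML object rather than a plain string.

```js
// Hypothetical helper: parse an HTML string into a separate, inert
// document and return the href attribute of every link it contains.
function collectLinkTargets(parser, html) {
  const doc = parser.parseFromString(html, "text/html");
  // querySelectorAll() runs against the in-memory document only;
  // the visible page is neither queried nor modified.
  return Array.from(doc.querySelectorAll("a[href]"), (link) =>
    link.getAttribute("href"),
  );
}

// Browser-only usage; skipped where DOMParser is not defined.
if (typeof DOMParser !== "undefined") {
  const html = '<p><a href="/docs">Docs</a> and <a href="/blog">Blog</a></p>';
  console.log(collectLinkTargets(new DOMParser(), html)); // ["/docs", "/blog"]
}
```

Passing the parser in as an argument is only a convenience for this sketch: it lets a single DOMParser instance be reused across calls.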
For the other allowed values (text/xml, application/xml, application/xhtml+xml, and image/svg+xml) the input is parsed as XML. This is useful if you want to import XML files, validate their structure, and extract data. If the input doesn't represent well-formed XML, the returned document will contain a <parsererror> node describing the nature of the parsing error.
Disallowed mimeType values cause a TypeError to be thrown.
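One way to avoid the TypeError is to check the MIME type before calling the method. The following is a minimal sketch under that assumption: the names ALLOWED_MIME_TYPES and parseIfSupported are hypothetical, and the set simply mirrors the allowed values listed under Syntax.

```js
// The five MIME types that parseFromString() accepts.
const ALLOWED_MIME_TYPES = new Set([
  "text/html",
  "text/xml",
  "application/xml",
  "application/xhtml+xml",
  "image/svg+xml",
]);

// Hypothetical helper: returns the parsed Document, or null when the
// MIME type is not one of the allowed values.
function parseIfSupported(parser, input, mimeType) {
  if (!ALLOWED_MIME_TYPES.has(mimeType)) {
    return null;
  }
  return parser.parseFromString(input, mimeType);
}

// Browser-only usage; skipped where DOMParser is not defined.
if (typeof DOMParser !== "undefined") {
  const parser = new DOMParser();
  console.log(parseIfSupported(parser, "<a/>", "application/json")); // null

  try {
    // Calling the method directly with a disallowed value throws.
    parser.parseFromString("<a/>", "application/json");
  } catch (err) {
    console.log(err instanceof TypeError); // true
  }
}
```

Whether to return null or let the exception propagate is a design choice for the calling code; the method itself always throws for a disallowed value.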
Security considerations
This method parses its input into a separate in-memory DOM, disabling any <script> elements and stopping event handlers from running. While the returned document is effectively inert, event handlers and scripts in its DOM will be able to run if they are inserted into the visible DOM. The method is therefore a possible vector for cross-site scripting (XSS) attacks, where potentially unsafe input is first parsed into a Document without being sanitized, and then injected into the visible/active DOM where code is able to run.
You should mitigate this risk by always passing TrustedHTML objects instead of strings, and enforcing trusted types using the require-trusted-types-for CSP directive. This ensures that the input is passed through a transformation function, which has the chance to sanitize the input to remove potentially dangerous markup (such as <script> elements and event handler attributes), before it is injected.
Using TrustedHTML makes it possible to audit and check that sanitization code is effective in just a few places, rather than scattered across all your injection sinks. You should not need to pass a sanitizer to the method when using TrustedHTML.
Note that even if you sanitize the input of elements and attributes that can execute code, you still need to be careful when taking any user input. For example, your page might use data in an XML document to fetch files that it then executes.
Examples
Parsing an input using Trusted Types
In this example we'll safely parse a potentially harmful HTML input and then inject it into the DOM of the visible page.
To mitigate the risk of XSS, we'll create a TrustedHTML object from the string containing the HTML. Trusted types are not yet supported on all browsers, so first we define the trusted types tinyfill. This acts as a transparent replacement for the trusted types JavaScript API:
```js
if (typeof trustedTypes === "undefined")
  trustedTypes = { createPolicy: (n, rules) => rules };
```
Next we create a TrustedTypePolicy that defines a createHTML() for transforming an input string into TrustedHTML instances. Commonly, implementations of createHTML() use a library such as DOMPurify to sanitize the input, as shown below:
```js
const policy = trustedTypes.createPolicy("my-policy", {
  createHTML: (input) => DOMPurify.sanitize(input),
});
```
Then we use this policy object to create a TrustedHTML object from the potentially unsafe input string and parse it into a Document. Note that the resulting Document will represent a complete HTML document with a root <html>, <head> and <body>, even though the input does not have these elements:
```js
// The potentially malicious string
const untrustedString = "<p>I might be XSS</p><img src='x' onerror='alert(1)'>";

// Create a TrustedHTML instance using the policy
const trustedHTML = policy.createHTML(untrustedString);

// Parse the TrustedHTML (which contains a trusted string)
const parser = new DOMParser();
const safeDocument = parser.parseFromString(trustedHTML, "text/html");
```
The safeDocument now contains a DOM that is sanitized of harmful elements according to our policy. Below we use Element.replaceWith() to replace the body of the visible DOM with the body of our document: scripts in the new body will run, as will code when event handlers are triggered.
```js
document.body.replaceWith(safeDocument.body);
```
Parsing XML, SVG, and HTML
The code below shows how you use the method to parse each of the content types. While you should use trusted types in real code, here they are omitted for brevity.
```js
const parser = new DOMParser();

const xmlString = "<warning>Beware of the tiger</warning>";
const doc1 = parser.parseFromString(xmlString, "application/xml");
console.log(doc1.contentType); // "application/xml"

const svgString = '<circle cx="50" cy="50" r="50"/>';
const doc2 = parser.parseFromString(svgString, "image/svg+xml");
console.log(doc2.contentType); // "image/svg+xml"

const htmlString = "<strong>Beware of the leopard</strong>";
const doc3 = parser.parseFromString(htmlString, "text/html");
console.log(doc3.contentType); // "text/html"

console.log(doc1.documentElement.textContent);
// "Beware of the tiger"

console.log(doc2.firstChild.tagName);
// "circle"

console.log(doc3.body.firstChild.textContent);
// "Beware of the leopard"
```
Note that the application/xml and image/svg+xml MIME types above are functionally identical: the latter does not include any SVG-specific parsing rules.
Error handling
When using the XML parser with a string that doesn't represent well-formed XML, the XMLDocument returned by parseFromString() will contain a <parsererror> node describing the nature of the parsing error.
```js
const parser = new DOMParser();

const xmlString = "<warning>Beware of the missing closing tag";
const doc = parser.parseFromString(xmlString, "application/xml");
const errorNode = doc.querySelector("parsererror");
if (errorNode) {
  // parsing failed
} else {
  // parsing succeeded
}
```
Additionally, the parsing error may be reported to the browser's JavaScript console.
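If calling code would rather handle malformed XML as an exception, the <parsererror> check can be wrapped in a small helper. This is only a sketch: the name parseXmlOrThrow and the choice of SyntaxError are assumptions made for illustration, and the wording of the error text differs between browsers.

```js
// Hypothetical helper: parse XML and turn a <parsererror> node into
// a thrown exception instead of a silently returned error document.
function parseXmlOrThrow(parser, xmlString) {
  const doc = parser.parseFromString(xmlString, "application/xml");
  const errorNode = doc.querySelector("parsererror");
  if (errorNode) {
    // The node's text describes the problem; its format is browser-specific.
    throw new SyntaxError(`XML parsing failed: ${errorNode.textContent}`);
  }
  return doc;
}

// Browser-only usage; skipped where DOMParser is not defined.
if (typeof DOMParser !== "undefined") {
  try {
    parseXmlOrThrow(new DOMParser(), "<warning>Beware of the missing closing tag");
  } catch (err) {
    console.error(err.message);
  }
}
```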
Specifications
| Specification |
|---|
| HTML # dom-domparser-parsefromstring-dev |
See also
- XMLSerializer
- JSON.parse(): counterpart for JSON documents.