XML external entity expansion
ID: py/xxe
Kind: path-problem
Security severity: 9.1
Severity: error
Precision: high
Tags:
   - security
   - external/cwe/cwe-611
   - external/cwe/cwe-827
Query suites:
   - python-code-scanning.qls
   - python-security-extended.qls
   - python-security-and-quality.qls
Parsing untrusted XML files with a weakly configured XML parser may lead to an XML External Entity (XXE) attack. This type of attack uses external entity references to access arbitrary files on a system, carry out denial-of-service (DoS) attacks, or perform server-side request forgery (SSRF). Even when the result of parsing is not returned to the user, DoS attacks are still possible and out-of-band data retrieval techniques may allow attackers to steal sensitive data.
Recommendation
The easiest way to prevent XXE attacks is to disable external entity handling when parsing untrusted data. How this is done depends on the library being used. Note that some libraries, such as recent versions of the XML modules in the Python 3 standard library, do not expand external entities by default, so unless you have explicitly enabled external entity expansion, no further action needs to be taken.
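As a rough illustration of the standard-library behavior mentioned above, the following sketch feeds a document that declares an external entity to xml.etree.ElementTree. The helper name is_rejected and the file path in the payload are made up for this example. On Python 3 the parser is expected to raise ParseError for the entity reference instead of reading the file.

```python
import xml.etree.ElementTree as ET

# Hypothetical payload: declares an external entity that points at a local
# file. The path is illustrative only and does not need to exist.
XXE_PAYLOAD = (
    '<?xml version="1.0"?>'
    '<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///nonexistent/secret.txt">]>'
    "<foo>&xxe;</foo>"
)


def is_rejected(xml_src: str) -> bool:
    """Return True if ElementTree refuses to parse xml_src."""
    try:
        ET.fromstring(xml_src)
    except ET.ParseError:
        return True
    return False


if __name__ == "__main__":
    # The external entity reference is not expanded; parsing fails instead.
    print(is_rejected(XXE_PAYLOAD))
    # Predefined entities such as &gt; are unaffected.
    print(is_rejected("<foo>a &gt; b</foo>"))
```

This only covers external entities. It does not, by itself, say anything about other XML attacks such as deeply nested internal entities.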
We recommend using the defusedxml PyPI package, which has been created to prevent XML attacks (both XXE and XML bombs).
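A minimal sketch of what using defusedxml might look like, assuming the package is installed from PyPI. The wrapper name parse_untrusted and the sample payload are hypothetical. The sketch relies on defusedxml.ElementTree.fromstring rejecting entity declarations with its default settings.

```python
import defusedxml.ElementTree as safe_et
from defusedxml import DefusedXmlException

# Hypothetical payload used only for illustration.
XXE_PAYLOAD = (
    '<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///nonexistent/secret.txt">]>'
    "<foo>&xxe;</foo>"
)


def parse_untrusted(xml_src):
    """Parse xml_src and return the root element, or None if it is rejected.

    Only defusedxml's own security exceptions are handled here; malformed
    XML still raises the usual ParseError.
    """
    try:
        return safe_et.fromstring(xml_src)
    except DefusedXmlException:
        # Raised when the document uses a forbidden construct, such as an
        # entity declaration or an external reference.
        return None


if __name__ == "__main__":
    print(parse_untrusted(XXE_PAYLOAD))
    print(parse_untrusted("<foo>ok</foo>").text)
```

Returning None is just one possible policy. A web application would more likely translate the exception into a 400 response.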
Example
The following example uses the lxml XML parser to parse a string xml_src. That string is from an untrusted source, so this code is vulnerable to an XXE attack, since the default parser from lxml.etree allows local external entities to be resolved.
    from flask import Flask, request
    import lxml.etree

    app = Flask(__name__)

    @app.post("/upload")
    def upload():
        xml_src = request.get_data()
        doc = lxml.etree.fromstring(xml_src)
        return lxml.etree.tostring(doc)
To guard against XXE attacks with the lxml library, you should create a parser with resolve_entities set to False. This means that no entity expansion is undertaken, although standard predefined entities such as &gt;, for writing > inside the text of an XML element, are still allowed.
    from flask import Flask, request
    import lxml.etree

    app = Flask(__name__)

    @app.post("/upload")
    def upload():
        xml_src = request.get_data()
        parser = lxml.etree.XMLParser(resolve_entities=False)
        doc = lxml.etree.fromstring(xml_src, parser=parser)
        return lxml.etree.tostring(doc)
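One way to check the hardened configuration outside of Flask is sketched below. It writes a marker string to a temporary file, references that file through an external entity, and confirms that the marker does not show up in the re-serialized document. The names parse_hardened and demo and the marker text are invented for this sketch, and lxml is assumed to be installed.

```python
import pathlib
import tempfile

import lxml.etree


def parse_hardened(xml_src: bytes):
    """Parse xml_src with entity resolution disabled and return the root."""
    parser = lxml.etree.XMLParser(resolve_entities=False)
    return lxml.etree.fromstring(xml_src, parser=parser)


def demo():
    """Return (serialized XXE document, text of a predefined-entity document)."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical "sensitive" file that an attacker would try to read.
        secret = pathlib.Path(tmp) / "secret.txt"
        secret.write_text("TOP-SECRET-MARKER")
        payload = (
            f'<!DOCTYPE foo [<!ENTITY xxe SYSTEM "{secret.as_uri()}">]>'
            "<foo>&xxe;</foo>"
        ).encode()
        serialized = lxml.etree.tostring(parse_hardened(payload))
    # Predefined entities are still decoded as usual.
    text = parse_hardened(b"<foo>a &gt; b</foo>").text
    return serialized, text


if __name__ == "__main__":
    serialized, text = demo()
    print(b"TOP-SECRET-MARKER" in serialized)
    print(text)
```

The sketch deliberately tests only the hardened parser, since the behavior of the default parser may differ between lxml versions.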
References
Timothy Morgan: XML Schema, DTD, and Entity Attacks.
Timur Yunusov, Alexey Osipov: XML Out-Of-Band Data Retrieval.
Python 3 standard library: XML Vulnerabilities.
Python 2 standard library: XML Vulnerabilities.
PortSwigger: XML external entity (XXE) injection.
Common Weakness Enumeration: CWE-611.
Common Weakness Enumeration: CWE-827.