Commit 516a6d4

hartwork and gpshead authored

[3.10] gh-115398: Expose Expat >=2.6.0 reparse deferral API (CVE-2023-52425) (GH-115623) (GH-116270)

Allow controlling Expat >=2.6.0 reparse deferral (CVE-2023-52425) by adding five new methods:

- `xml.etree.ElementTree.XMLParser.flush`
- `xml.etree.ElementTree.XMLPullParser.flush`
- `xml.parsers.expat.xmlparser.GetReparseDeferralEnabled`
- `xml.parsers.expat.xmlparser.SetReparseDeferralEnabled`
- `xml.sax.expatreader.ExpatParser.flush`

Based on the "flush" idea from #115138 (comment).

Includes code suggested-by: Snild Dolkow <snild@sony.com>
and by core dev Serhiy Storchaka.

Co-authored-by: Gregory P. Smith <greg@krypto.org>

1 parent b612ec6, commit 516a6d4

14 files changed: +435 -19 lines changed

‎Doc/library/pyexpat.rst

Lines changed: 36 additions & 0 deletions
@@ -196,6 +196,42 @@ XMLParser Objects
    :exc:`ExpatError` to be raised with the :attr:`code` attribute set to
    ``errors.codes[errors.XML_ERROR_CANT_CHANGE_FEATURE_ONCE_PARSING]``.

+.. method:: xmlparser.SetReparseDeferralEnabled(enabled)
+
+   .. warning::
+
+      Calling ``SetReparseDeferralEnabled(False)`` has security implications,
+      as detailed below; please make sure to understand these consequences
+      prior to using the ``SetReparseDeferralEnabled`` method.
+
+   Expat 2.6.0 introduced a security mechanism called "reparse deferral"
+   where instead of causing denial of service through quadratic runtime
+   from reparsing large tokens, reparsing of unfinished tokens is now delayed
+   by default until a sufficient amount of input is reached.
+   Due to this delay, registered handlers may, depending on the sizing of
+   input chunks pushed to Expat, no longer be called right after pushing new
+   input to the parser. Where immediate feedback and taking over responsibility
+   of protecting against denial of service from large tokens are both wanted,
+   calling ``SetReparseDeferralEnabled(False)`` disables reparse deferral
+   for the current Expat parser instance, temporarily or altogether.
+   Calling ``SetReparseDeferralEnabled(True)`` allows re-enabling reparse
+   deferral.
+
+   Note that :meth:`SetReparseDeferralEnabled` has been backported to some
+   prior releases of CPython as a security fix. Check for availability of
+   :meth:`SetReparseDeferralEnabled` using :func:`hasattr` if used in code
+   running across a variety of Python versions.
+
+   .. versionadded:: 3.10.14
+
+.. method:: xmlparser.GetReparseDeferralEnabled()
+
+   Returns whether reparse deferral is currently enabled for the given
+   Expat parser instance.
+
+   .. versionadded:: 3.10.14
+
+
 :class:`xmlparser` objects have the following attributes:

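The documentation above recommends probing for the new method with `hasattr`. A minimal sketch of that pattern follows; the function name `start_tags_seen_before_final` and the sample input are illustrative assumptions, not part of the commit. What is reported before the final call depends on the Expat version linked into the interpreter and on whether the backport is present, so no output is promised here.

```python
import xml.parsers.expat as expat


def start_tags_seen_before_final(chunks, disable_deferral):
    """Feed chunks non-finally and report which start tags fired so far.

    Hypothetical helper for illustration only.
    """
    started = []
    parser = expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: started.append(name)

    # The method is absent on Python releases without the backport,
    # so probe for it instead of comparing version numbers.
    if disable_deferral and hasattr(parser, "SetReparseDeferralEnabled"):
        # Security trade-off: the caller now owns protection against
        # quadratic reparsing of very large tokens.
        parser.SetReparseDeferralEnabled(False)

    for chunk in chunks:
        parser.Parse(chunk, False)
    seen_early = list(started)

    parser.Parse(b"", True)  # final call: all remaining input is parsed
    return seen_early, started


early, final = start_tags_seen_before_final([b"<doc", b"/>"], disable_deferral=True)
print("before final call:", early, "after final call:", final)
```

Disabling deferral for the whole lifetime of a parser is the bluntest option; the `flush` methods added elsewhere in this commit disable it only for a single reparse.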

‎Doc/library/xml.etree.elementtree.rst

Lines changed: 39 additions & 0 deletions
@@ -165,6 +165,11 @@ data but would still like to have incremental parsing capabilities, take a look
 at :func:`iterparse`. It can be useful when you're reading a large XML document
 and don't want to hold it wholly in memory.

+Where *immediate* feedback through events is wanted, calling method
+:meth:`XMLPullParser.flush` can help reduce delay;
+please make sure to study the related security notes.
+
+
 Finding interesting elements
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^

@@ -1370,6 +1375,24 @@ XMLParser Objects

       Feeds data to the parser. *data* is encoded data.

+
+   .. method:: flush()
+
+      Triggers parsing of any previously fed unparsed data, which can be
+      used to ensure more immediate feedback, in particular with Expat >=2.6.0.
+      The implementation of :meth:`flush` temporarily disables reparse deferral
+      with Expat (if currently enabled) and triggers a reparse.
+      Disabling reparse deferral has security consequences; please see
+      :meth:`xml.parsers.expat.xmlparser.SetReparseDeferralEnabled` for details.
+
+      Note that :meth:`flush` has been backported to some prior releases of
+      CPython as a security fix. Check for availability of :meth:`flush`
+      using :func:`hasattr` if used in code running across a variety of Python
+      versions.
+
+      .. versionadded:: 3.10.14
+
+
    :meth:`XMLParser.feed` calls *target*\'s ``start(tag, attrs_dict)`` method
    for each opening tag, its ``end(tag)`` method for each closing tag, and data
    is processed by method ``data(data)``. For further supported callback
@@ -1431,6 +1454,22 @@ XMLPullParser Objects

       Feed the given bytes data to the parser.

+   .. method:: flush()
+
+      Triggers parsing of any previously fed unparsed data, which can be
+      used to ensure more immediate feedback, in particular with Expat >=2.6.0.
+      The implementation of :meth:`flush` temporarily disables reparse deferral
+      with Expat (if currently enabled) and triggers a reparse.
+      Disabling reparse deferral has security consequences; please see
+      :meth:`xml.parsers.expat.xmlparser.SetReparseDeferralEnabled` for details.
+
+      Note that :meth:`flush` has been backported to some prior releases of
+      CPython as a security fix. Check for availability of :meth:`flush`
+      using :func:`hasattr` if used in code running across a variety of Python
+      versions.
+
+      .. versionadded:: 3.10.14
+
    .. method:: close()

       Signal the parser that the data stream is terminated. Unlike
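As a rough illustration of the documented `XMLPullParser.flush` usage, the sketch below feeds small chunks and flushes after each one when the method exists. The helper name `feed_and_collect` and the sample document are assumptions made for this example; on interpreters without the backport the helper simply skips the flush, so events may arrive later but the final result is the same.

```python
import xml.etree.ElementTree as ET


def feed_and_collect(parser, chunk):
    """Feed one chunk, flush if supported, and return the events produced.

    Hypothetical helper for illustration only.
    """
    parser.feed(chunk)
    flush = getattr(parser, "flush", None)  # absent without the backport
    if flush is not None:
        flush()
    return [(event, elem.tag) for event, elem in parser.read_events()]


parser = ET.XMLPullParser(events=("start", "end"))
events = []
for chunk in ("<doc", ">", "</doc>"):
    events.extend(feed_and_collect(parser, chunk))

parser.close()
# Pick up anything that only became available once the stream was closed.
events.extend((event, elem.tag) for event, elem in parser.read_events())
print(events)
```

Flushing after every chunk gives the lowest latency but, as the security note above says, it also gives up the protection that reparse deferral provides, so it is best reserved for trusted or size-limited input.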

‎Include/pyexpat.h

Lines changed: 3 additions & 1 deletion
@@ -48,8 +48,10 @@ struct PyExpat_CAPI
     enum XML_Status (*SetEncoding)(XML_Parser parser, const XML_Char *encoding);
     int (*DefaultUnknownEncodingHandler)(
         void *encodingHandlerData, const XML_Char *name, XML_Encoding *info);
-    /* might be none for expat < 2.1.0 */
+    /* might be NULL for expat < 2.1.0 */
     int (*SetHashSalt)(XML_Parser parser, unsigned long hash_salt);
+    /* might be NULL for expat < 2.6.0 */
+    XML_Bool (*SetReparseDeferralEnabled)(XML_Parser parser, XML_Bool enabled);
     /* always add new stuff to the end! */
 };

‎Lib/test/test_pyexpat.py

Lines changed: 54 additions & 0 deletions
@@ -730,5 +730,59 @@ def resolve_entity(context, base, system_id, public_id):
         self.assertEqual(handler_call_args, [("bar", "baz")])


+class ReparseDeferralTest(unittest.TestCase):
+    def test_getter_setter_round_trip(self):
+        parser = expat.ParserCreate()
+        enabled = (expat.version_info >= (2, 6, 0))
+
+        self.assertIs(parser.GetReparseDeferralEnabled(), enabled)
+        parser.SetReparseDeferralEnabled(False)
+        self.assertIs(parser.GetReparseDeferralEnabled(), False)
+        parser.SetReparseDeferralEnabled(True)
+        self.assertIs(parser.GetReparseDeferralEnabled(), enabled)
+
+    def test_reparse_deferral_enabled(self):
+        if expat.version_info < (2, 6, 0):
+            self.skipTest(f'Expat {expat.version_info} does not '
+                          'support reparse deferral')
+
+        started = []
+
+        def start_element(name, _):
+            started.append(name)
+
+        parser = expat.ParserCreate()
+        parser.StartElementHandler = start_element
+        self.assertTrue(parser.GetReparseDeferralEnabled())
+
+        for chunk in (b'<doc', b'/>'):
+            parser.Parse(chunk, False)
+
+        # The key test: Have handlers already fired? Expecting: no.
+        self.assertEqual(started, [])
+
+        parser.Parse(b'', True)
+
+        self.assertEqual(started, ['doc'])
+
+    def test_reparse_deferral_disabled(self):
+        started = []
+
+        def start_element(name, _):
+            started.append(name)
+
+        parser = expat.ParserCreate()
+        parser.StartElementHandler = start_element
+        if expat.version_info >= (2, 6, 0):
+            parser.SetReparseDeferralEnabled(False)
+        self.assertFalse(parser.GetReparseDeferralEnabled())
+
+        for chunk in (b'<doc', b'/>'):
+            parser.Parse(chunk, False)
+
+        # The key test: Have handlers already fired? Expecting: yes.
+        self.assertEqual(started, ['doc'])
+
+
 if __name__ == "__main__":
     unittest.main()

‎Lib/test/test_sax.py

Lines changed: 51 additions & 0 deletions
@@ -19,6 +19,7 @@
 from io import BytesIO, StringIO
 import codecs
 import os.path
+import pyexpat
 import shutil
 import sys
 from urllib.error import URLError
@@ -1214,6 +1215,56 @@ def test_expat_incremental_reset(self):

         self.assertEqual(result.getvalue(), start + b"<doc>text</doc>")

+    def test_flush_reparse_deferral_enabled(self):
+        if pyexpat.version_info < (2, 6, 0):
+            self.skipTest(f'Expat {pyexpat.version_info} does not support reparse deferral')
+
+        result = BytesIO()
+        xmlgen = XMLGenerator(result)
+        parser = create_parser()
+        parser.setContentHandler(xmlgen)
+
+        for chunk in ("<doc", ">"):
+            parser.feed(chunk)
+
+        self.assertEqual(result.getvalue(), start)  # i.e. no elements started
+        self.assertTrue(parser._parser.GetReparseDeferralEnabled())
+
+        parser.flush()
+
+        self.assertTrue(parser._parser.GetReparseDeferralEnabled())
+        self.assertEqual(result.getvalue(), start + b"<doc>")
+
+        parser.feed("</doc>")
+        parser.close()
+
+        self.assertEqual(result.getvalue(), start + b"<doc></doc>")
+
+    def test_flush_reparse_deferral_disabled(self):
+        result = BytesIO()
+        xmlgen = XMLGenerator(result)
+        parser = create_parser()
+        parser.setContentHandler(xmlgen)
+
+        for chunk in ("<doc", ">"):
+            parser.feed(chunk)
+
+        if pyexpat.version_info >= (2, 6, 0):
+            parser._parser.SetReparseDeferralEnabled(False)
+
+        self.assertEqual(result.getvalue(), start)  # i.e. no elements started
+        self.assertFalse(parser._parser.GetReparseDeferralEnabled())
+
+        parser.flush()
+
+        self.assertFalse(parser._parser.GetReparseDeferralEnabled())
+        self.assertEqual(result.getvalue(), start + b"<doc>")
+
+        parser.feed("</doc>")
+        parser.close()
+
+        self.assertEqual(result.getvalue(), start + b"<doc></doc>")
+
     # ===== Locator support

     def test_expat_locator_noinfo(self):
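The SAX tests above exercise `ExpatParser.flush` through internal test helpers. A self-contained sketch of the same idea with the public `xml.sax` API might look like the following; the `TagRecorder` handler class is an assumption introduced for this example, and the code assumes `xml.sax.make_parser()` returns the default Expat-based reader. The `flush` call is guarded because the method only exists on releases that carry this change.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class TagRecorder(ContentHandler):
    """Hypothetical handler that records the names of started elements."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def startElement(self, name, attrs):
        self.tags.append(name)


recorder = TagRecorder()
reader = xml.sax.make_parser()  # assumed to be the Expat-based reader
reader.setContentHandler(recorder)

for chunk in ("<doc", ">"):
    reader.feed(chunk)

# flush() is only present where this change (or a backport) is available.
if hasattr(reader, "flush"):
    reader.flush()
tags_after_flush = list(recorder.tags)

reader.feed("</doc>")
reader.close()
print(tags_after_flush, recorder.tags)
```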

‎Lib/test/test_xml_etree.py

Lines changed: 63 additions & 16 deletions
@@ -121,10 +121,6 @@
 </foo>
 """

-fails_with_expat_2_6_0 = (unittest.expectedFailure
-                        if pyexpat.version_info >= (2, 6, 0) else
-                        lambda test: test)
-
 def checkwarnings(*filters, quiet=False):
     def decorator(test):
         def newtest(*args, **kwargs):
@@ -1378,12 +1374,14 @@ def test_attlist_default(self):

 class XMLPullParserTest(unittest.TestCase):

-    def _feed(self, parser, data, chunk_size=None):
+    def _feed(self, parser, data, chunk_size=None, flush=False):
         if chunk_size is None:
             parser.feed(data)
         else:
             for i in range(0, len(data), chunk_size):
                 parser.feed(data[i:i+chunk_size])
+                if flush:
+                    parser.flush()

     def assert_events(self, parser, expected, max_events=None):
         self.assertEqual(
@@ -1401,34 +1399,32 @@ def assert_event_tags(self, parser, expected, max_events=None):
         self.assertEqual([(action, elem.tag) for action, elem in events],
                          expected)

-    def test_simple_xml(self, chunk_size=None):
+    def test_simple_xml(self, chunk_size=None, flush=False):
         parser = ET.XMLPullParser()
         self.assert_event_tags(parser, [])
-        self._feed(parser, "<!-- comment -->\n", chunk_size)
+        self._feed(parser, "<!-- comment -->\n", chunk_size, flush)
         self.assert_event_tags(parser, [])
         self._feed(parser,
                    "<root>\n <element key='value'>text</element",
-                   chunk_size)
+                   chunk_size, flush)
         self.assert_event_tags(parser, [])
-        self._feed(parser, ">\n", chunk_size)
+        self._feed(parser, ">\n", chunk_size, flush)
         self.assert_event_tags(parser, [('end', 'element')])
-        self._feed(parser, "<element>text</element>tail\n", chunk_size)
-        self._feed(parser, "<empty-element/>\n", chunk_size)
+        self._feed(parser, "<element>text</element>tail\n", chunk_size, flush)
+        self._feed(parser, "<empty-element/>\n", chunk_size, flush)
         self.assert_event_tags(parser, [
             ('end', 'element'),
             ('end', 'empty-element'),
             ])
-        self._feed(parser, "</root>\n", chunk_size)
+        self._feed(parser, "</root>\n", chunk_size, flush)
         self.assert_event_tags(parser, [('end', 'root')])
         self.assertIsNone(parser.close())

-    @fails_with_expat_2_6_0
     def test_simple_xml_chunk_1(self):
-        self.test_simple_xml(chunk_size=1)
+        self.test_simple_xml(chunk_size=1, flush=True)

-    @fails_with_expat_2_6_0
     def test_simple_xml_chunk_5(self):
-        self.test_simple_xml(chunk_size=5)
+        self.test_simple_xml(chunk_size=5, flush=True)

     def test_simple_xml_chunk_22(self):
         self.test_simple_xml(chunk_size=22)
@@ -1627,6 +1623,57 @@ def test_unknown_event(self):
         with self.assertRaises(ValueError):
             ET.XMLPullParser(events=('start', 'end', 'bogus'))

+    def test_flush_reparse_deferral_enabled(self):
+        if pyexpat.version_info < (2, 6, 0):
+            self.skipTest(f'Expat {pyexpat.version_info} does not '
+                          'support reparse deferral')
+
+        parser = ET.XMLPullParser(events=('start', 'end'))
+
+        for chunk in ("<doc", ">"):
+            parser.feed(chunk)
+
+        self.assert_event_tags(parser, [])  # i.e. no elements started
+        if ET is pyET:
+            self.assertTrue(parser._parser._parser.GetReparseDeferralEnabled())
+
+        parser.flush()
+
+        self.assert_event_tags(parser, [('start', 'doc')])
+        if ET is pyET:
+            self.assertTrue(parser._parser._parser.GetReparseDeferralEnabled())
+
+        parser.feed("</doc>")
+        parser.close()
+
+        self.assert_event_tags(parser, [('end', 'doc')])
+
+    def test_flush_reparse_deferral_disabled(self):
+        parser = ET.XMLPullParser(events=('start', 'end'))
+
+        for chunk in ("<doc", ">"):
+            parser.feed(chunk)
+
+        if pyexpat.version_info >= (2, 6, 0):
+            if not ET is pyET:
+                self.skipTest(f'XMLParser.(Get|Set)ReparseDeferralEnabled '
+                              'methods not available in C')
+            parser._parser._parser.SetReparseDeferralEnabled(False)
+
+        self.assert_event_tags(parser, [])  # i.e. no elements started
+        if ET is pyET:
+            self.assertFalse(parser._parser._parser.GetReparseDeferralEnabled())
+
+        parser.flush()
+
+        self.assert_event_tags(parser, [('start', 'doc')])
+        if ET is pyET:
+            self.assertFalse(parser._parser._parser.GetReparseDeferralEnabled())
+
+        parser.feed("</doc>")
+        parser.close()
+
+        self.assert_event_tags(parser, [('end', 'doc')])

 #
 # xinclude tests (samples from appendix C of the xinclude specification)

‎Lib/xml/etree/ElementTree.py

Lines changed: 14 additions & 0 deletions
@@ -1325,6 +1325,11 @@ def read_events(self):
             else:
                 yield event

+    def flush(self):
+        if self._parser is None:
+            raise ValueError("flush() called after end of stream")
+        self._parser.flush()
+

 def XML(text, parser=None):
     """Parse XML document from string constant.
@@ -1731,6 +1736,15 @@ def close(self):
             del self.parser, self._parser
             del self.target, self._target

+    def flush(self):
+        was_enabled = self.parser.GetReparseDeferralEnabled()
+        try:
+            self.parser.SetReparseDeferralEnabled(False)
+            self.parser.Parse(b"", False)
+        except self._error as v:
+            self._raiseerror(v)
+        finally:
+            self.parser.SetReparseDeferralEnabled(was_enabled)

 # --------------------------------------------------------------------
 # C14N 2.0
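The `XMLParser.flush` implementation above follows a save, disable, reparse, restore pattern. For code that works directly with a `pyexpat` parser, the same pattern could be sketched as a context manager. The name `reparse_deferral_disabled` is a hypothetical helper introduced here, not something the commit adds, and the `hasattr` guard is an assumption for interpreters without the backport.

```python
from contextlib import contextmanager
import xml.parsers.expat as expat


@contextmanager
def reparse_deferral_disabled(parser):
    """Hypothetical helper: disable reparse deferral inside the block only."""
    if not hasattr(parser, "GetReparseDeferralEnabled"):
        # No backport on this interpreter: nothing to toggle.
        yield parser
        return
    was_enabled = parser.GetReparseDeferralEnabled()
    parser.SetReparseDeferralEnabled(False)
    try:
        yield parser
    finally:
        # Restore the previous setting even if parsing raised.
        parser.SetReparseDeferralEnabled(was_enabled)


def deferral_state(parser):
    """Return the current setting, or None where the API is missing."""
    getter = getattr(parser, "GetReparseDeferralEnabled", None)
    return getter() if getter is not None else None


started = []
parser = expat.ParserCreate()
parser.StartElementHandler = lambda name, attrs: started.append(name)
state_before = deferral_state(parser)

for chunk in (b"<doc", b">"):
    parser.Parse(chunk, False)

with reparse_deferral_disabled(parser):
    parser.Parse(b"", False)  # empty, non-final feed triggers the reparse
started_after_flush = list(started)
state_after = deferral_state(parser)

parser.Parse(b"</doc>", True)
print(started_after_flush, state_before, state_after)
```

Restoring the setting in `finally` mirrors the commit's own code and keeps the window in which deferral is off as short as possible.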
