Commit 2007624

hartwork and gpshead authored

[3.9] gh-115398: Expose Expat >=2.6.0 reparse deferral API (CVE-2023-52425) (GH-115623) (GH-116272)

Allow controlling Expat >=2.6.0 reparse deferral (CVE-2023-52425) by adding five new methods:

- `xml.etree.ElementTree.XMLParser.flush`
- `xml.etree.ElementTree.XMLPullParser.flush`
- `xml.parsers.expat.xmlparser.GetReparseDeferralEnabled`
- `xml.parsers.expat.xmlparser.SetReparseDeferralEnabled`
- `xml.sax.expatreader.ExpatParser.flush`

Based on the "flush" idea from #115138 (comment).

Includes code suggested-by: Snild Dolkow <snild@sony.com>
and by core dev Serhiy Storchaka.

Co-authored-by: Gregory P. Smith <greg@krypto.org>

1 parent 468ba95, commit 2007624

14 files changed: +435 -20 lines changed

‎Doc/library/pyexpat.rst

Lines changed: 36 additions & 0 deletions
@@ -196,6 +196,42 @@ XMLParser Objects
    :exc:`ExpatError` to be raised with the :attr:`code` attribute set to
    ``errors.codes[errors.XML_ERROR_CANT_CHANGE_FEATURE_ONCE_PARSING]``.
 
+.. method:: xmlparser.SetReparseDeferralEnabled(enabled)
+
+   .. warning::
+
+      Calling ``SetReparseDeferralEnabled(False)`` has security implications,
+      as detailed below; please make sure to understand these consequences
+      prior to using the ``SetReparseDeferralEnabled`` method.
+
+   Expat 2.6.0 introduced a security mechanism called "reparse deferral"
+   where instead of causing denial of service through quadratic runtime
+   from reparsing large tokens, reparsing of unfinished tokens is now delayed
+   by default until a sufficient amount of input is reached.
+   Due to this delay, registered handlers may, depending on the sizing of
+   input chunks pushed to Expat, no longer be called right after pushing new
+   input to the parser. Where immediate feedback and taking over responsibility
+   for protecting against denial of service from large tokens are both wanted,
+   calling ``SetReparseDeferralEnabled(False)`` disables reparse deferral
+   for the current Expat parser instance, temporarily or altogether.
+   Calling ``SetReparseDeferralEnabled(True)`` allows re-enabling reparse
+   deferral.
+
+   Note that :meth:`SetReparseDeferralEnabled` has been backported to some
+   prior releases of CPython as a security fix. Check for availability of
+   :meth:`SetReparseDeferralEnabled` using :func:`hasattr` if used in code
+   running across a variety of Python versions.
+
+   .. versionadded:: 3.9.19
+
+.. method:: xmlparser.GetReparseDeferralEnabled()
+
+   Returns whether reparse deferral is currently enabled for the given
+   Expat parser instance.
+
+   .. versionadded:: 3.9.19
+
+
 :class:`xmlparser` objects have the following attributes:
 
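
The documentation added above recommends checking for `SetReparseDeferralEnabled` with `hasattr`. A minimal sketch of that pattern follows; the helper name `make_parser` and its `defer` parameter are illustrative assumptions and not part of the commit. Passing `defer=False` means the caller takes over protection against very large tokens, as the warning above explains.

```python
import xml.parsers.expat as expat


def make_parser(defer=True):
    """Create a pyexpat parser, setting reparse deferral where supported.

    Hypothetical helper: the name and the `defer` parameter are
    illustrative only.
    """
    parser = expat.ParserCreate()
    # The method was backported as a security fix, so it may be missing
    # on older patch releases; feature-detect instead of version-checking.
    if hasattr(parser, "SetReparseDeferralEnabled"):
        parser.SetReparseDeferralEnabled(defer)
    return parser


started = []
parser = make_parser(defer=False)
parser.StartElementHandler = lambda name, attrs: started.append(name)

# Push a single element split across two chunks.
for chunk in (b"<doc", b"/>"):
    parser.Parse(chunk, False)

print(started)
```

With deferral disabled, the start handler is expected to have fired once the second chunk completes the tag, which mirrors `test_reparse_deferral_disabled` in this commit.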

‎Doc/library/xml.etree.elementtree.rst

Lines changed: 39 additions & 0 deletions
@@ -165,6 +165,11 @@ data but would still like to have incremental parsing capabilities, take a look
 at :func:`iterparse`. It can be useful when you're reading a large XML document
 and don't want to hold it wholly in memory.
 
+Where *immediate* feedback through events is wanted, calling method
+:meth:`XMLPullParser.flush` can help reduce delay;
+please make sure to study the related security notes.
+
+
 Finding interesting elements
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^

@@ -1352,6 +1357,24 @@ XMLParser Objects
 
       Feeds data to the parser. *data* is encoded data.
 
+
+   .. method:: flush()
+
+      Triggers parsing of any previously fed unparsed data, which can be
+      used to ensure more immediate feedback, in particular with Expat >=2.6.0.
+      The implementation of :meth:`flush` temporarily disables reparse deferral
+      with Expat (if currently enabled) and triggers a reparse.
+      Disabling reparse deferral has security consequences; please see
+      :meth:`xml.parsers.expat.xmlparser.SetReparseDeferralEnabled` for details.
+
+      Note that :meth:`flush` has been backported to some prior releases of
+      CPython as a security fix. Check for availability of :meth:`flush`
+      using :func:`hasattr` if used in code running across a variety of Python
+      versions.
+
+      .. versionadded:: 3.9.19
+
+
    :meth:`XMLParser.feed` calls *target*\'s ``start(tag, attrs_dict)`` method
    for each opening tag, its ``end(tag)`` method for each closing tag, and data
    is processed by method ``data(data)``. For further supported callback
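
To illustrate the `XMLParser.flush()` entry documented in this hunk, here is a small sketch that feeds a start tag in two pieces to a parser with a custom target and then flushes. The `TagRecorder` class is a hypothetical target written for this example; only `feed`, `flush`, `close`, and the `start`/`end`/`data`/`close` target protocol come from the library.

```python
import xml.etree.ElementTree as ET


class TagRecorder:
    """Hypothetical parser target that records the tags it sees opened."""

    def __init__(self):
        self.started = []

    def start(self, tag, attrs):
        self.started.append(tag)

    def end(self, tag):
        pass

    def data(self, data):
        pass

    def close(self):
        return self.started


recorder = TagRecorder()
parser = ET.XMLParser(target=recorder)

# The start tag arrives in two small chunks.
parser.feed("<doc")
parser.feed(">")

# flush() may be absent on releases without the backport.
if hasattr(parser, "flush"):
    parser.flush()

seen_after_flush = list(recorder.started)

parser.feed("</doc>")
parser.close()
print(seen_after_flush)
```

Calling `flush()` after every tiny chunk gives up the protection that reparse deferral provides, so it is best reserved for points where the caller really needs the pending events.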
@@ -1413,6 +1436,22 @@ XMLPullParser Objects
 
       Feed the given bytes data to the parser.
 
+   .. method:: flush()
+
+      Triggers parsing of any previously fed unparsed data, which can be
+      used to ensure more immediate feedback, in particular with Expat >=2.6.0.
+      The implementation of :meth:`flush` temporarily disables reparse deferral
+      with Expat (if currently enabled) and triggers a reparse.
+      Disabling reparse deferral has security consequences; please see
+      :meth:`xml.parsers.expat.xmlparser.SetReparseDeferralEnabled` for details.
+
+      Note that :meth:`flush` has been backported to some prior releases of
+      CPython as a security fix. Check for availability of :meth:`flush`
+      using :func:`hasattr` if used in code running across a variety of Python
+      versions.
+
+      .. versionadded:: 3.9.19
+
    .. method:: close()
 
       Signal the parser that the data stream is terminated. Unlike
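
The `XMLPullParser.flush()` entry can be exercised the same way. The sketch below follows the shape of the `test_flush_reparse_deferral_enabled` test in this commit, with an added `hasattr` guard for releases that lack the backport; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

parser = ET.XMLPullParser(events=("start", "end"))

# Feed an opening tag split across two chunks.
for chunk in ("<doc", ">"):
    parser.feed(chunk)

# With Expat >= 2.6.0 the "start" event may still be deferred here;
# flush() asks the parser to process what it has buffered.
if hasattr(parser, "flush"):
    parser.flush()

events = [(action, elem.tag) for action, elem in parser.read_events()]

parser.feed("</doc>")
parser.close()
remaining = [(action, elem.tag) for action, elem in parser.read_events()]
print(events, remaining)
```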

‎Include/pyexpat.h

Lines changed: 3 additions & 1 deletion
@@ -48,8 +48,10 @@ struct PyExpat_CAPI
     enum XML_Status (*SetEncoding)(XML_Parser parser, const XML_Char *encoding);
     int (*DefaultUnknownEncodingHandler)(
         void *encodingHandlerData, const XML_Char *name, XML_Encoding *info);
-    /* might be none for expat < 2.1.0 */
+    /* might be NULL for expat < 2.1.0 */
     int (*SetHashSalt)(XML_Parser parser, unsigned long hash_salt);
+    /* might be NULL for expat < 2.6.0 */
+    XML_Bool (*SetReparseDeferralEnabled)(XML_Parser parser, XML_Bool enabled);
     /* always add new stuff to the end! */
 };

‎Lib/test/test_pyexpat.py

Lines changed: 54 additions & 0 deletions
@@ -730,5 +730,59 @@ def resolve_entity(context, base, system_id, public_id):
         self.assertEqual(handler_call_args, [("bar", "baz")])
 
 
+class ReparseDeferralTest(unittest.TestCase):
+    def test_getter_setter_round_trip(self):
+        parser = expat.ParserCreate()
+        enabled = (expat.version_info >= (2, 6, 0))
+
+        self.assertIs(parser.GetReparseDeferralEnabled(), enabled)
+        parser.SetReparseDeferralEnabled(False)
+        self.assertIs(parser.GetReparseDeferralEnabled(), False)
+        parser.SetReparseDeferralEnabled(True)
+        self.assertIs(parser.GetReparseDeferralEnabled(), enabled)
+
+    def test_reparse_deferral_enabled(self):
+        if expat.version_info < (2, 6, 0):
+            self.skipTest(f'Expat {expat.version_info} does not '
+                          'support reparse deferral')
+
+        started = []
+
+        def start_element(name, _):
+            started.append(name)
+
+        parser = expat.ParserCreate()
+        parser.StartElementHandler = start_element
+        self.assertTrue(parser.GetReparseDeferralEnabled())
+
+        for chunk in (b'<doc', b'/>'):
+            parser.Parse(chunk, False)
+
+        # The key test: Have handlers already fired? Expecting: no.
+        self.assertEqual(started, [])
+
+        parser.Parse(b'', True)
+
+        self.assertEqual(started, ['doc'])
+
+    def test_reparse_deferral_disabled(self):
+        started = []
+
+        def start_element(name, _):
+            started.append(name)
+
+        parser = expat.ParserCreate()
+        parser.StartElementHandler = start_element
+        if expat.version_info >= (2, 6, 0):
+            parser.SetReparseDeferralEnabled(False)
+        self.assertFalse(parser.GetReparseDeferralEnabled())
+
+        for chunk in (b'<doc', b'/>'):
+            parser.Parse(chunk, False)
+
+        # The key test: Have handlers already fired? Expecting: yes.
+        self.assertEqual(started, ['doc'])
+
+
 if __name__ == "__main__":
     unittest.main()

‎Lib/test/test_sax.py

Lines changed: 51 additions & 0 deletions
@@ -18,6 +18,7 @@
 from io import BytesIO, StringIO
 import codecs
 import os.path
+import pyexpat
 import shutil
 from urllib.error import URLError
 import urllib.request
@@ -1210,6 +1211,56 @@ def test_expat_incremental_reset(self):
 
         self.assertEqual(result.getvalue(), start + b"<doc>text</doc>")
 
+    def test_flush_reparse_deferral_enabled(self):
+        if pyexpat.version_info < (2, 6, 0):
+            self.skipTest(f'Expat {pyexpat.version_info} does not support reparse deferral')
+
+        result = BytesIO()
+        xmlgen = XMLGenerator(result)
+        parser = create_parser()
+        parser.setContentHandler(xmlgen)
+
+        for chunk in ("<doc", ">"):
+            parser.feed(chunk)
+
+        self.assertEqual(result.getvalue(), start)  # i.e. no elements started
+        self.assertTrue(parser._parser.GetReparseDeferralEnabled())
+
+        parser.flush()
+
+        self.assertTrue(parser._parser.GetReparseDeferralEnabled())
+        self.assertEqual(result.getvalue(), start + b"<doc>")
+
+        parser.feed("</doc>")
+        parser.close()
+
+        self.assertEqual(result.getvalue(), start + b"<doc></doc>")
+
+    def test_flush_reparse_deferral_disabled(self):
+        result = BytesIO()
+        xmlgen = XMLGenerator(result)
+        parser = create_parser()
+        parser.setContentHandler(xmlgen)
+
+        for chunk in ("<doc", ">"):
+            parser.feed(chunk)
+
+        if pyexpat.version_info >= (2, 6, 0):
+            parser._parser.SetReparseDeferralEnabled(False)
+
+        self.assertEqual(result.getvalue(), start)  # i.e. no elements started
+        self.assertFalse(parser._parser.GetReparseDeferralEnabled())
+
+        parser.flush()
+
+        self.assertFalse(parser._parser.GetReparseDeferralEnabled())
+        self.assertEqual(result.getvalue(), start + b"<doc>")
+
+        parser.feed("</doc>")
+        parser.close()
+
+        self.assertEqual(result.getvalue(), start + b"<doc></doc>")
+
     # ===== Locator support
 
     def test_expat_locator_noinfo(self):
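
The SAX tests above drive `xml.sax.expatreader.ExpatParser.flush` through the test suite's `create_parser()` helper. Outside the test suite, roughly the same flow can be sketched with the public `xml.sax.make_parser()`, assuming it returns the Expat-based reader (the default in CPython). The `hasattr` guard is an addition for releases without the backport.

```python
from io import BytesIO
from xml.sax import make_parser
from xml.sax.saxutils import XMLGenerator

result = BytesIO()
parser = make_parser()  # assumed to be the Expat-based incremental reader
parser.setContentHandler(XMLGenerator(result))

# Feed an opening tag in two pieces.
for chunk in ("<doc", ">"):
    parser.feed(chunk)

# Ask for buffered input to be parsed now, where flush() is available.
if hasattr(parser, "flush"):
    parser.flush()

after_flush = result.getvalue()

parser.feed("</doc>")
parser.close()
final = result.getvalue()
print(after_flush, final)
```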

‎Lib/test/test_xml_etree.py

Lines changed: 63 additions & 17 deletions
@@ -104,11 +104,6 @@
 """
 
 
-fails_with_expat_2_6_0 = (unittest.expectedFailure
-                          if pyexpat.version_info >= (2, 6, 0) else
-                          lambda test: test)
-
-
 def checkwarnings(*filters, quiet=False):
     def decorator(test):
         def newtest(*args, **kwargs):
@@ -1375,12 +1370,14 @@ def test_tree_write_attribute_order(self):
 
 class XMLPullParserTest(unittest.TestCase):
 
-    def _feed(self, parser, data, chunk_size=None):
+    def _feed(self, parser, data, chunk_size=None, flush=False):
         if chunk_size is None:
             parser.feed(data)
         else:
             for i in range(0, len(data), chunk_size):
                 parser.feed(data[i:i+chunk_size])
+                if flush:
+                    parser.flush()
 
     def assert_events(self, parser, expected, max_events=None):
         self.assertEqual(
@@ -1398,34 +1395,32 @@ def assert_event_tags(self, parser, expected, max_events=None):
         self.assertEqual([(action, elem.tag) for action, elem in events],
                          expected)
 
-    def test_simple_xml(self, chunk_size=None):
+    def test_simple_xml(self, chunk_size=None, flush=False):
         parser = ET.XMLPullParser()
         self.assert_event_tags(parser, [])
-        self._feed(parser, "<!-- comment -->\n", chunk_size)
+        self._feed(parser, "<!-- comment -->\n", chunk_size, flush)
         self.assert_event_tags(parser, [])
         self._feed(parser,
                    "<root>\n <element key='value'>text</element",
-                   chunk_size)
+                   chunk_size, flush)
         self.assert_event_tags(parser, [])
-        self._feed(parser, ">\n", chunk_size)
+        self._feed(parser, ">\n", chunk_size, flush)
         self.assert_event_tags(parser, [('end', 'element')])
-        self._feed(parser, "<element>text</element>tail\n", chunk_size)
-        self._feed(parser, "<empty-element/>\n", chunk_size)
+        self._feed(parser, "<element>text</element>tail\n", chunk_size, flush)
+        self._feed(parser, "<empty-element/>\n", chunk_size, flush)
         self.assert_event_tags(parser, [
             ('end', 'element'),
             ('end', 'empty-element'),
             ])
-        self._feed(parser, "</root>\n", chunk_size)
+        self._feed(parser, "</root>\n", chunk_size, flush)
         self.assert_event_tags(parser, [('end', 'root')])
         self.assertIsNone(parser.close())
 
-    @fails_with_expat_2_6_0
     def test_simple_xml_chunk_1(self):
-        self.test_simple_xml(chunk_size=1)
+        self.test_simple_xml(chunk_size=1, flush=True)
 
-    @fails_with_expat_2_6_0
     def test_simple_xml_chunk_5(self):
-        self.test_simple_xml(chunk_size=5)
+        self.test_simple_xml(chunk_size=5, flush=True)
 
     def test_simple_xml_chunk_22(self):
         self.test_simple_xml(chunk_size=22)
@@ -1624,6 +1619,57 @@ def test_unknown_event(self):
         with self.assertRaises(ValueError):
             ET.XMLPullParser(events=('start', 'end', 'bogus'))
 
+    def test_flush_reparse_deferral_enabled(self):
+        if pyexpat.version_info < (2, 6, 0):
+            self.skipTest(f'Expat {pyexpat.version_info} does not '
+                          'support reparse deferral')
+
+        parser = ET.XMLPullParser(events=('start', 'end'))
+
+        for chunk in ("<doc", ">"):
+            parser.feed(chunk)
+
+        self.assert_event_tags(parser, [])  # i.e. no elements started
+        if ET is pyET:
+            self.assertTrue(parser._parser._parser.GetReparseDeferralEnabled())
+
+        parser.flush()
+
+        self.assert_event_tags(parser, [('start', 'doc')])
+        if ET is pyET:
+            self.assertTrue(parser._parser._parser.GetReparseDeferralEnabled())
+
+        parser.feed("</doc>")
+        parser.close()
+
+        self.assert_event_tags(parser, [('end', 'doc')])
+
+    def test_flush_reparse_deferral_disabled(self):
+        parser = ET.XMLPullParser(events=('start', 'end'))
+
+        for chunk in ("<doc", ">"):
+            parser.feed(chunk)
+
+        if pyexpat.version_info >= (2, 6, 0):
+            if not ET is pyET:
+                self.skipTest(f'XMLParser.(Get|Set)ReparseDeferralEnabled '
+                              'methods not available in C')
+            parser._parser._parser.SetReparseDeferralEnabled(False)
+
+        self.assert_event_tags(parser, [])  # i.e. no elements started
+        if ET is pyET:
+            self.assertFalse(parser._parser._parser.GetReparseDeferralEnabled())
+
+        parser.flush()
+
+        self.assert_event_tags(parser, [('start', 'doc')])
+        if ET is pyET:
+            self.assertFalse(parser._parser._parser.GetReparseDeferralEnabled())
+
+        parser.feed("</doc>")
+        parser.close()
+
+        self.assert_event_tags(parser, [('end', 'doc')])
 
 #
 # xinclude tests (samples from appendix C of the xinclude specification)

‎Lib/xml/etree/ElementTree.py

Lines changed: 14 additions & 0 deletions
@@ -1325,6 +1325,11 @@ def read_events(self):
             else:
                 yield event
 
+    def flush(self):
+        if self._parser is None:
+            raise ValueError("flush() called after end of stream")
+        self._parser.flush()
+
 
 def XML(text, parser=None):
     """Parse XML document from string constant.
@@ -1733,6 +1738,15 @@ def close(self):
             del self.parser, self._parser
             del self.target, self._target
 
+    def flush(self):
+        was_enabled = self.parser.GetReparseDeferralEnabled()
+        try:
+            self.parser.SetReparseDeferralEnabled(False)
+            self.parser.Parse(b"", False)
+        except self._error as v:
+            self._raiseerror(v)
+        finally:
+            self.parser.SetReparseDeferralEnabled(was_enabled)
 
 # --------------------------------------------------------------------
 # C14N 2.0
