Commit 854f645

hartwork and gpshead authored

[3.8] gh-115398: Expose Expat >=2.6.0 reparse deferral API (CVE-2023-52425) (GH-115623) (GH-116275)

Allow controlling Expat >=2.6.0 reparse deferral (CVE-2023-52425) by adding five new methods:

- `xml.etree.ElementTree.XMLParser.flush`
- `xml.etree.ElementTree.XMLPullParser.flush`
- `xml.parsers.expat.xmlparser.GetReparseDeferralEnabled`
- `xml.parsers.expat.xmlparser.SetReparseDeferralEnabled`
- `xml.sax.expatreader.ExpatParser.flush`

Based on the "flush" idea from #115138 (comment).

Includes code suggested-by: Snild Dolkow <snild@sony.com>
and by core dev Serhiy Storchaka.

Co-authored-by: Gregory P. Smith <greg@krypto.org>

1 parent 4d58a1d commit 854f645
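
Because these methods were backported as a security fix, code that runs on several Python versions may want to probe for them first. A minimal sketch of such a probe follows; the helper name `reparse_deferral_api_availability` is hypothetical and not part of CPython.

```python
import xml.etree.ElementTree as ET
import xml.parsers.expat as expat
import xml.sax.expatreader as expatreader


def reparse_deferral_api_availability():
    """Report which of the five methods named in the commit message exist.

    Hypothetical helper for illustration only; not part of CPython.
    """
    # xmlparser objects are created through ParserCreate(), so probe an instance.
    expat_parser = expat.ParserCreate()
    return {
        "xml.etree.ElementTree.XMLParser.flush":
            hasattr(ET.XMLParser, "flush"),
        "xml.etree.ElementTree.XMLPullParser.flush":
            hasattr(ET.XMLPullParser, "flush"),
        "xml.parsers.expat.xmlparser.GetReparseDeferralEnabled":
            hasattr(expat_parser, "GetReparseDeferralEnabled"),
        "xml.parsers.expat.xmlparser.SetReparseDeferralEnabled":
            hasattr(expat_parser, "SetReparseDeferralEnabled"),
        "xml.sax.expatreader.ExpatParser.flush":
            hasattr(expatreader.ExpatParser, "flush"),
    }


for name, available in reparse_deferral_api_availability().items():
    print(f"{name}: {'available' if available else 'missing'}")
```

The result depends on the interpreter in use, so no particular output is implied here.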

14 files changed, +435 -20 lines changed

‎Doc/library/pyexpat.rst

Lines changed: 36 additions & 0 deletions
@@ -196,6 +196,42 @@ XMLParser Objects
    :exc:`ExpatError` to be raised with the :attr:`code` attribute set to
    ``errors.codes[errors.XML_ERROR_CANT_CHANGE_FEATURE_ONCE_PARSING]``.

+.. method:: xmlparser.SetReparseDeferralEnabled(enabled)
+
+   .. warning::
+
+      Calling ``SetReparseDeferralEnabled(False)`` has security implications,
+      as detailed below; please make sure to understand these consequences
+      prior to using the ``SetReparseDeferralEnabled`` method.
+
+   Expat 2.6.0 introduced a security mechanism called "reparse deferral"
+   where instead of causing denial of service through quadratic runtime
+   from reparsing large tokens, reparsing of unfinished tokens is now delayed
+   by default until a sufficient amount of input is reached.
+   Due to this delay, registered handlers may, depending on the sizing of
+   input chunks pushed to Expat, no longer be called right after pushing new
+   input to the parser. Where immediate feedback and taking over responsibility
+   of protecting against denial of service from large tokens are both wanted,
+   calling ``SetReparseDeferralEnabled(False)`` disables reparse deferral
+   for the current Expat parser instance, temporarily or altogether.
+   Calling ``SetReparseDeferralEnabled(True)`` allows re-enabling reparse
+   deferral.
+
+   Note that :meth:`SetReparseDeferralEnabled` has been backported to some
+   prior releases of CPython as a security fix. Check for availability of
+   :meth:`SetReparseDeferralEnabled` using :func:`hasattr` if used in code
+   running across a variety of Python versions.
+
+   .. versionadded:: 3.8.19
+
+.. method:: xmlparser.GetReparseDeferralEnabled()
+
+   Returns whether reparse deferral is currently enabled for the given
+   Expat parser instance.
+
+   .. versionadded:: 3.8.19
+
+
 :class:`xmlparser` objects have the following attributes:


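
The documentation hunk above recommends a `hasattr` check before calling `SetReparseDeferralEnabled`. Here is a minimal sketch of that pattern; the function name `parse_chunks_immediately` is hypothetical. Disabling reparse deferral hands responsibility for protecting against large-token denial of service back to the caller.

```python
import xml.parsers.expat as expat


def parse_chunks_immediately(chunks):
    """Feed byte chunks to Expat with reparse deferral disabled where possible.

    Hypothetical helper for illustration. Returns the start tags seen after
    each chunk, plus the final list of start tags.
    """
    started = []
    parser = expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: started.append(name)

    # The method may be missing on Python releases without the backport.
    if hasattr(parser, "SetReparseDeferralEnabled"):
        parser.SetReparseDeferralEnabled(False)

    seen_after_each_chunk = []
    for chunk in chunks:
        parser.Parse(chunk, False)
        seen_after_each_chunk.append(list(started))

    parser.Parse(b"", True)  # signal end of input
    return seen_after_each_chunk, started


snapshots, started = parse_chunks_immediately((b"<doc", b"/>"))
print(snapshots, started)
```

On an interpreter linked against Expat >=2.6.0 but lacking the setter, the per-chunk snapshots may lag behind; the final list is unaffected.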

‎Doc/library/xml.etree.elementtree.rst

Lines changed: 39 additions & 0 deletions
@@ -163,6 +163,11 @@ data but would still like to have incremental parsing capabilities, take a look
 at :func:`iterparse`. It can be useful when you're reading a large XML document
 and don't want to hold it wholly in memory.

+Where *immediate* feedback through events is wanted, calling method
+:meth:`XMLPullParser.flush` can help reduce delay;
+please make sure to study the related security notes.
+
+
 Finding interesting elements
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^

@@ -1352,6 +1357,24 @@ XMLParser Objects

       Feeds data to the parser. *data* is encoded data.

+
+   .. method:: flush()
+
+      Triggers parsing of any previously fed unparsed data, which can be
+      used to ensure more immediate feedback, in particular with Expat >=2.6.0.
+      The implementation of :meth:`flush` temporarily disables reparse deferral
+      with Expat (if currently enabled) and triggers a reparse.
+      Disabling reparse deferral has security consequences; please see
+      :meth:`xml.parsers.expat.xmlparser.SetReparseDeferralEnabled` for details.
+
+      Note that :meth:`flush` has been backported to some prior releases of
+      CPython as a security fix. Check for availability of :meth:`flush`
+      using :func:`hasattr` if used in code running across a variety of Python
+      versions.
+
+      .. versionadded:: 3.8.19
+
+
    :meth:`XMLParser.feed` calls *target*\'s ``start(tag, attrs_dict)`` method
    for each opening tag, its ``end(tag)`` method for each closing tag, and data
    is processed by method ``data(data)``. For further supported callback
@@ -1413,6 +1436,22 @@ XMLPullParser Objects

       Feed the given bytes data to the parser.

+   .. method:: flush()
+
+      Triggers parsing of any previously fed unparsed data, which can be
+      used to ensure more immediate feedback, in particular with Expat >=2.6.0.
+      The implementation of :meth:`flush` temporarily disables reparse deferral
+      with Expat (if currently enabled) and triggers a reparse.
+      Disabling reparse deferral has security consequences; please see
+      :meth:`xml.parsers.expat.xmlparser.SetReparseDeferralEnabled` for details.
+
+      Note that :meth:`flush` has been backported to some prior releases of
+      CPython as a security fix. Check for availability of :meth:`flush`
+      using :func:`hasattr` if used in code running across a variety of Python
+      versions.
+
+      .. versionadded:: 3.8.19
+
    .. method:: close()

       Signal the parser that the data stream is terminated. Unlike
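
As a rough illustration of the `XMLPullParser.flush` entry documented above, the sketch below feeds small chunks and flushes after each one when the method exists. The helper name `pull_events` is hypothetical. Note that `flush` temporarily disables reparse deferral, so the security notes for `SetReparseDeferralEnabled` apply.

```python
import xml.etree.ElementTree as ET


def pull_events(chunks):
    """Feed chunks to an XMLPullParser, flushing after each where supported.

    Hypothetical helper for illustration. Returns the (event, tag) pairs read
    after each chunk, plus any pairs that only surfaced after close().
    """
    parser = ET.XMLPullParser(events=("start", "end"))
    per_chunk = []
    for chunk in chunks:
        parser.feed(chunk)
        # flush() may be missing on releases without the backport.
        if hasattr(parser, "flush"):
            parser.flush()
        per_chunk.append([(event, elem.tag) for event, elem in parser.read_events()])
    parser.close()
    leftover = [(event, elem.tag) for event, elem in parser.read_events()]
    return per_chunk, leftover


per_chunk, leftover = pull_events(["<doc", ">", "</doc>"])
print(per_chunk, leftover)
```

Without `flush`, and with Expat >=2.6.0, some events may only show up in `leftover`. The complete sequence of events is the same either way.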

‎Include/pyexpat.h

Lines changed: 3 additions & 1 deletion
@@ -48,8 +48,10 @@ struct PyExpat_CAPI
     enum XML_Status (*SetEncoding)(XML_Parser parser, const XML_Char *encoding);
     int (*DefaultUnknownEncodingHandler)(
         void *encodingHandlerData, const XML_Char *name, XML_Encoding *info);
-    /* might be none for expat < 2.1.0 */
+    /* might be NULL for expat < 2.1.0 */
     int (*SetHashSalt)(XML_Parser parser, unsigned long hash_salt);
+    /* might be NULL for expat < 2.6.0 */
+    XML_Bool (*SetReparseDeferralEnabled)(XML_Parser parser, XML_Bool enabled);
     /* always add new stuff to the end! */
 };


‎Lib/test/test_pyexpat.py

Lines changed: 54 additions & 0 deletions
@@ -729,5 +729,59 @@ def resolve_entity(context, base, system_id, public_id):
         self.assertEqual(handler_call_args, [("bar", "baz")])


+class ReparseDeferralTest(unittest.TestCase):
+    def test_getter_setter_round_trip(self):
+        parser = expat.ParserCreate()
+        enabled = (expat.version_info >= (2, 6, 0))
+
+        self.assertIs(parser.GetReparseDeferralEnabled(), enabled)
+        parser.SetReparseDeferralEnabled(False)
+        self.assertIs(parser.GetReparseDeferralEnabled(), False)
+        parser.SetReparseDeferralEnabled(True)
+        self.assertIs(parser.GetReparseDeferralEnabled(), enabled)
+
+    def test_reparse_deferral_enabled(self):
+        if expat.version_info < (2, 6, 0):
+            self.skipTest(f'Expat {expat.version_info} does not '
+                          'support reparse deferral')
+
+        started = []
+
+        def start_element(name, _):
+            started.append(name)
+
+        parser = expat.ParserCreate()
+        parser.StartElementHandler = start_element
+        self.assertTrue(parser.GetReparseDeferralEnabled())
+
+        for chunk in (b'<doc', b'/>'):
+            parser.Parse(chunk, False)
+
+        # The key test: Have handlers already fired?  Expecting: no.
+        self.assertEqual(started, [])
+
+        parser.Parse(b'', True)
+
+        self.assertEqual(started, ['doc'])
+
+    def test_reparse_deferral_disabled(self):
+        started = []
+
+        def start_element(name, _):
+            started.append(name)
+
+        parser = expat.ParserCreate()
+        parser.StartElementHandler = start_element
+        if expat.version_info >= (2, 6, 0):
+            parser.SetReparseDeferralEnabled(False)
+        self.assertFalse(parser.GetReparseDeferralEnabled())
+
+        for chunk in (b'<doc', b'/>'):
+            parser.Parse(chunk, False)
+
+        # The key test: Have handlers already fired?  Expecting: yes.
+        self.assertEqual(started, ['doc'])
+
+
 if __name__ == "__main__":
     unittest.main()

‎Lib/test/test_sax.py

Lines changed: 51 additions & 0 deletions
@@ -18,6 +18,7 @@
 from io import BytesIO, StringIO
 import codecs
 import os.path
+import pyexpat
 import shutil
 from urllib.error import URLError
 from test import support
@@ -1206,6 +1207,56 @@ def test_expat_incremental_reset(self):

         self.assertEqual(result.getvalue(), start + b"<doc>text</doc>")

+    def test_flush_reparse_deferral_enabled(self):
+        if pyexpat.version_info < (2, 6, 0):
+            self.skipTest(f'Expat {pyexpat.version_info} does not support reparse deferral')
+
+        result = BytesIO()
+        xmlgen = XMLGenerator(result)
+        parser = create_parser()
+        parser.setContentHandler(xmlgen)
+
+        for chunk in ("<doc", ">"):
+            parser.feed(chunk)
+
+        self.assertEqual(result.getvalue(), start)  # i.e. no elements started
+        self.assertTrue(parser._parser.GetReparseDeferralEnabled())
+
+        parser.flush()
+
+        self.assertTrue(parser._parser.GetReparseDeferralEnabled())
+        self.assertEqual(result.getvalue(), start + b"<doc>")
+
+        parser.feed("</doc>")
+        parser.close()
+
+        self.assertEqual(result.getvalue(), start + b"<doc></doc>")
+
+    def test_flush_reparse_deferral_disabled(self):
+        result = BytesIO()
+        xmlgen = XMLGenerator(result)
+        parser = create_parser()
+        parser.setContentHandler(xmlgen)
+
+        for chunk in ("<doc", ">"):
+            parser.feed(chunk)
+
+        if pyexpat.version_info >= (2, 6, 0):
+            parser._parser.SetReparseDeferralEnabled(False)
+
+        self.assertEqual(result.getvalue(), start)  # i.e. no elements started
+        self.assertFalse(parser._parser.GetReparseDeferralEnabled())
+
+        parser.flush()
+
+        self.assertFalse(parser._parser.GetReparseDeferralEnabled())
+        self.assertEqual(result.getvalue(), start + b"<doc>")
+
+        parser.feed("</doc>")
+        parser.close()
+
+        self.assertEqual(result.getvalue(), start + b"<doc></doc>")
+
     # ===== Locator support

     def test_expat_locator_noinfo(self):
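
Outside the test suite, the SAX `flush` method exercised above could be used roughly as follows. This is a sketch under assumptions: `TagRecorder` and `sax_feed_with_flush` are hypothetical names, and `xml.sax.make_parser()` is assumed to return the default Expat-based incremental parser (the tests use their own `create_parser()` helper instead).

```python
import xml.sax
from xml.sax.handler import ContentHandler


class TagRecorder(ContentHandler):
    """Hypothetical content handler that records start tag names."""

    def __init__(self):
        super().__init__()
        self.started = []

    def startElement(self, name, attrs):
        self.started.append(name)


def sax_feed_with_flush(chunks):
    """Feed text chunks incrementally, flushing after each where supported."""
    handler = TagRecorder()
    # Assumption: the default parser is xml.sax.expatreader.ExpatParser.
    parser = xml.sax.make_parser()
    parser.setContentHandler(handler)
    for chunk in chunks:
        parser.feed(chunk)
        # flush() may be missing on releases without the backport.
        if hasattr(parser, "flush"):
            parser.flush()
    parser.close()
    return handler.started


print(sax_feed_with_flush(["<doc", ">", "</doc>"]))
```

The `hasattr` guard keeps the loop usable on interpreters that predate the backport, at the cost of possibly delayed handler calls there.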

‎Lib/test/test_xml_etree.py

Lines changed: 63 additions & 17 deletions
@@ -105,11 +105,6 @@
 """


-fails_with_expat_2_6_0 = (unittest.expectedFailure
-                          if pyexpat.version_info >= (2, 6, 0) else
-                          lambda test: test)
-
-
 def checkwarnings(*filters, quiet=False):
     def decorator(test):
         def newtest(*args, **kwargs):
@@ -1250,12 +1245,14 @@ def test_tree_write_attribute_order(self):

 class XMLPullParserTest(unittest.TestCase):

-    def _feed(self, parser, data, chunk_size=None):
+    def _feed(self, parser, data, chunk_size=None, flush=False):
         if chunk_size is None:
             parser.feed(data)
         else:
             for i in range(0, len(data), chunk_size):
                 parser.feed(data[i:i+chunk_size])
+        if flush:
+            parser.flush()

     def assert_events(self, parser, expected, max_events=None):
         self.assertEqual(
@@ -1273,34 +1270,32 @@ def assert_event_tags(self, parser, expected, max_events=None):
         self.assertEqual([(action, elem.tag) for action, elem in events],
                          expected)

-    def test_simple_xml(self, chunk_size=None):
+    def test_simple_xml(self, chunk_size=None, flush=False):
         parser = ET.XMLPullParser()
         self.assert_event_tags(parser, [])
-        self._feed(parser, "<!-- comment -->\n", chunk_size)
+        self._feed(parser, "<!-- comment -->\n", chunk_size, flush)
         self.assert_event_tags(parser, [])
         self._feed(parser,
                    "<root>\n <element key='value'>text</element",
-                   chunk_size)
+                   chunk_size, flush)
         self.assert_event_tags(parser, [])
-        self._feed(parser, ">\n", chunk_size)
+        self._feed(parser, ">\n", chunk_size, flush)
         self.assert_event_tags(parser, [('end', 'element')])
-        self._feed(parser, "<element>text</element>tail\n", chunk_size)
-        self._feed(parser, "<empty-element/>\n", chunk_size)
+        self._feed(parser, "<element>text</element>tail\n", chunk_size, flush)
+        self._feed(parser, "<empty-element/>\n", chunk_size, flush)
         self.assert_event_tags(parser, [
             ('end', 'element'),
             ('end', 'empty-element'),
             ])
-        self._feed(parser, "</root>\n", chunk_size)
+        self._feed(parser, "</root>\n", chunk_size, flush)
         self.assert_event_tags(parser, [('end', 'root')])
         self.assertIsNone(parser.close())

-    @fails_with_expat_2_6_0
     def test_simple_xml_chunk_1(self):
-        self.test_simple_xml(chunk_size=1)
+        self.test_simple_xml(chunk_size=1, flush=True)

-    @fails_with_expat_2_6_0
     def test_simple_xml_chunk_5(self):
-        self.test_simple_xml(chunk_size=5)
+        self.test_simple_xml(chunk_size=5, flush=True)

     def test_simple_xml_chunk_22(self):
         self.test_simple_xml(chunk_size=22)
@@ -1499,6 +1494,57 @@ def test_unknown_event(self):
         with self.assertRaises(ValueError):
             ET.XMLPullParser(events=('start', 'end', 'bogus'))

+    def test_flush_reparse_deferral_enabled(self):
+        if pyexpat.version_info < (2, 6, 0):
+            self.skipTest(f'Expat {pyexpat.version_info} does not '
+                          'support reparse deferral')
+
+        parser = ET.XMLPullParser(events=('start', 'end'))
+
+        for chunk in ("<doc", ">"):
+            parser.feed(chunk)
+
+        self.assert_event_tags(parser, [])  # i.e. no elements started
+        if ET is pyET:
+            self.assertTrue(parser._parser._parser.GetReparseDeferralEnabled())
+
+        parser.flush()
+
+        self.assert_event_tags(parser, [('start', 'doc')])
+        if ET is pyET:
+            self.assertTrue(parser._parser._parser.GetReparseDeferralEnabled())
+
+        parser.feed("</doc>")
+        parser.close()
+
+        self.assert_event_tags(parser, [('end', 'doc')])
+
+    def test_flush_reparse_deferral_disabled(self):
+        parser = ET.XMLPullParser(events=('start', 'end'))
+
+        for chunk in ("<doc", ">"):
+            parser.feed(chunk)
+
+        if pyexpat.version_info >= (2, 6, 0):
+            if not ET is pyET:
+                self.skipTest(f'XMLParser.(Get|Set)ReparseDeferralEnabled '
+                              'methods not available in C')
+            parser._parser._parser.SetReparseDeferralEnabled(False)
+
+        self.assert_event_tags(parser, [])  # i.e. no elements started
+        if ET is pyET:
+            self.assertFalse(parser._parser._parser.GetReparseDeferralEnabled())
+
+        parser.flush()
+
+        self.assert_event_tags(parser, [('start', 'doc')])
+        if ET is pyET:
+            self.assertFalse(parser._parser._parser.GetReparseDeferralEnabled())
+
+        parser.feed("</doc>")
+        parser.close()
+
+        self.assert_event_tags(parser, [('end', 'doc')])

 #
 # xinclude tests (samples from appendix C of the xinclude specification)

‎Lib/xml/etree/ElementTree.py

Lines changed: 14 additions & 0 deletions
@@ -1303,6 +1303,11 @@ def read_events(self):
             else:
                 yield event

+    def flush(self):
+        if self._parser is None:
+            raise ValueError("flush() called after end of stream")
+        self._parser.flush()
+

 def XML(text, parser=None):
     """Parse XML document from string constant.
@@ -1711,6 +1716,15 @@ def close(self):
             del self.parser, self._parser
             del self.target, self._target

+    def flush(self):
+        was_enabled = self.parser.GetReparseDeferralEnabled()
+        try:
+            self.parser.SetReparseDeferralEnabled(False)
+            self.parser.Parse(b"", False)
+        except self._error as v:
+            self._raiseerror(v)
+        finally:
+            self.parser.SetReparseDeferralEnabled(was_enabled)

 # --------------------------------------------------------------------
 # C14N 2.0
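
`XMLParser.flush` saves the current deferral setting, disables deferral, and restores the saved value in a `finally` clause. The same pattern can be sketched as a context manager around a raw Expat parser. The name `reparse_deferral_disabled` is hypothetical and not part of CPython, and the `getattr` fallback is an assumption for interpreters without the backport.

```python
from contextlib import contextmanager
import xml.parsers.expat as expat


@contextmanager
def reparse_deferral_disabled(parser):
    """Temporarily disable reparse deferral on an Expat parser (illustrative).

    Hypothetical helper; it does nothing on interpreters that lack the
    getter/setter pair introduced by this commit.
    """
    getter = getattr(parser, "GetReparseDeferralEnabled", None)
    setter = getattr(parser, "SetReparseDeferralEnabled", None)
    if getter is None or setter is None:
        yield parser
        return
    was_enabled = getter()
    setter(False)
    try:
        yield parser
    finally:
        # Restore the previous setting even if parsing raised an error.
        setter(was_enabled)


parser = expat.ParserCreate()
started = []
parser.StartElementHandler = lambda name, attrs: started.append(name)
with reparse_deferral_disabled(parser):
    parser.Parse(b"<doc", False)
    parser.Parse(b"/>", False)
parser.Parse(b"", True)
print(started)
```

Restoring in `finally` keeps the protective default active for input parsed outside the `with` block.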
