
This issue trackerhas been migrated toGitHub, and is currentlyread-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.
Created on2009-08-10 16:23 byrealpolitik, last changed2022-04-11 14:56 byadmin. This issue is nowclosed.
| Files | ||||
|---|---|---|---|---|
| File name | Uploaded | Description | Edit | |
| expat-error.py | realpolitik,2009-08-10 16:23 | |||
| pyexpat.patch | ocean-city,2009-10-09 14:11 | review | ||
| pyexpat-2.patch | amaury.forgeotdarc,2009-10-09 15:29 | review | ||
| issue6676_3x.patch | ned.deily,2014-02-27 07:17 | 3x version | review | |
| issue6676_27.patch | ned.deily,2014-02-27 07:18 | 27 version | ||
| Messages (21) | |||
|---|---|---|---|
| msg91452 -(view) | Author: Matthew (realpolitik) | Date: 2009-08-10 16:23 | |
I'm using the Expat python interface to parse multiple XML files in anapplication and have found that it throws a "Memory Error" exception ifmultiple calls are made to xmlparser.ParseFile(file) on the samexmlparser object. This occurs even with a vanilla xmlparser objectcreated with xml.parsers.expat.ParserCreate().Python Version: 2.6.2Operating System: Ubuntu | |||
| msg91455 -(view) | Author: Matthew (realpolitik) | Date: 2009-08-10 16:55 | |
This also occurs with Python 2.5.1 on OS X | |||
| msg93777 -(view) | Author: Andy Balaam (andybalaam)* | Date: 2009-10-09 08:23 | |
I am also seeing this with Python 2.5.2 on Ubuntu. | |||
| msg93779 -(view) | Author: Andy Balaam (andybalaam)* | Date: 2009-10-09 08:26 | |
Just in case it wasn't obvious - the workaround is to create a newparser (with xml.parsers.expat.ParserCreate()) for every XML file youwant to parse. | |||
| msg93782 -(view) | Author: Hirokazu Yamamoto (ocean-city)*![]() | Date: 2009-10-09 09:45 | |
I'm not familiar with expat, but we can see what is happening moreclearly with attached adhok patch.Traceback (most recent call last): File "expat-error.py", line 14, in <module> p.ParseFile(file)xml.parsers.expat.ExpatError: parsing finished: line 2, column 482It seems ParseFile() doesn't support second call. I'm not sure this isintended behavior or not. | |||
| msg93785 -(view) | Author: Amaury Forgeot d'Arc (amaury.forgeotdarc)*![]() | Date: 2009-10-09 12:04 | |
The patch is good; a test would be appreciated.The difference now is that in case of true low-memory conditions,ExpatError("no memory") is raised instead of MemoryError.This is acceptable IMO.> It seems ParseFile() doesn't support second callThis is correct; the C expat library has a function XML_ParserReset()which could be called before calling ParseFile() again, but pyexpat doesnot expose it yet (seeissue1208730). | |||
| msg93790 -(view) | Author: Hirokazu Yamamoto (ocean-city)*![]() | Date: 2009-10-09 13:00 | |
Well, I tried to write test like this.1. Check if xml.parsers.expat.error is raised.2. Compare *code* attribute of error object withxml.parsers.expat.errors.XML_ERROR_FINISHEDBut I noticed XML_ERROR_FINISHED is not integer but string. (!)According tohttp://docs.python.org/library/pyexpat.html#expaterror-objects> ExpatError.code>> Expat’s internal error number for the specific error. This will> match one of the constants defined in the errors object from> this module.Is this document bug or implementation bug? Personally, I think string'parsing finished' as error constant might be useless... | |||
| msg93791 -(view) | Author: Amaury Forgeot d'Arc (amaury.forgeotdarc)*![]() | Date: 2009-10-09 13:12 | |
Looks like an implementation bug to me; far too late to change it, though.In your test, you could use pyexpat.ErrorString(e.code) == pyexpat.errors.XML_ERROR_FINISHEDAnd the docs could mention this trick. | |||
| msg93794 -(view) | Author: Hirokazu Yamamoto (ocean-city)*![]() | Date: 2009-10-09 14:11 | |
Here is the patch. I'm not confident with my English comment though. | |||
| msg93802 -(view) | Author: Amaury Forgeot d'Arc (amaury.forgeotdarc)*![]() | Date: 2009-10-09 15:29 | |
Do you know the new "context manager" feature of assertRaises? it makesit easier to check for exceptions.I join a new patch that uses it. | |||
| msg93803 -(view) | Author: Hirokazu Yamamoto (ocean-city)*![]() | Date: 2009-10-09 15:44 | |
I knew existence of that new feature, but didn't know how to use it. | |||
| msg93804 -(view) | Author: Hirokazu Yamamoto (ocean-city)*![]() | Date: 2009-10-09 16:13 | |
Hmm, looks useful. I think your patch is good. Only one problem is thatwe cannot use this new feature in python2.6. If we use my patch in thatbranch, I think there is no problem. | |||
| msg98753 -(view) | Author: Will Grainger (willgrainger) | Date: 2010-02-02 18:20 | |
I don't think this is a python specific problem. I have just seen the same error when working with the expat library from C, and the causeis using the same parser to read multiple files. | |||
| msg142950 -(view) | Author: David H. Gutteridge (dhgutteridge) | Date: 2011-08-25 00:57 | |
The documentation should definitely be updated to clarify that a parser instance is not reusable with more than one file. I had a look at the equivalent documentation for Perl and TCL, and Perl's implementation explicitly does not allow attempts to reuse the parser instance (which is clearly noted in the documentation), and TCL's implementation (or one of them, anyway) offers a reset call that explicitly resets the parser in preparation for another file to be submitted. | |||
| msg143242 -(view) | Author: Ned Deily (ned.deily)*![]() | Date: 2011-08-30 23:25 | |
I agree that, at a minimum, the documentation should be updated to include a warning about not reusing a parser instance. Whether it's worth trying to plug all the holes in the expat library is another issue (see, for instance,issue12829). David, would you be willing to propose a wording for a documentation change? | |||
| msg143243 -(view) | Author: Ned Deily (ned.deily)*![]() | Date: 2011-08-30 23:34 | |
Also, noteissue1208730 proposes a feature to expose a binding for XML_ParserReset and has the start of a patch. | |||
| msg143295 -(view) | Author: David H. Gutteridge (dhgutteridge) | Date: 2011-09-01 04:56 | |
Ned: My proposed wording is: "Note that only one document can be parsed by a given instance; it is not possible to reuse an instance to parse multiple files." To provide more detail, one could also add something like: "The isfinal argument of the Parse() method is intended to allow the parsing of a single file in fragments, not the submission of multiple files." | |||
| msg211673 -(view) | Author: David H. Gutteridge (dhgutteridge) | Date: 2014-02-19 23:42 | |
Updating to reflect the Python 3.4 documentation is now also relevant to this discussion. Perhaps someone could commit a change something like my suggestion inmsg143295? | |||
| msg212338 -(view) | Author: Ned Deily (ned.deily)*![]() | Date: 2014-02-27 07:17 | |
Thanks for the reminder, David. Here are patches for 3.x and 2.7 that include updated versions of the proposed pyexpat.c and test_pyexpat.py patches along with a doc update along the lines suggested by David. | |||
| msg215005 -(view) | Author: Roundup Robot (python-dev)![]() | Date: 2014-03-27 23:44 | |
New changeset74faca1ac59c by Ned Deily in branch '2.7':Issue#6676: Ensure a meaningful exception is raised when attemptinghttp://hg.python.org/cpython/rev/74faca1ac59cNew changeset9e3fc66ee0b8 by Ned Deily in branch '3.4':Issue#6676: Ensure a meaningful exception is raised when attemptinghttp://hg.python.org/cpython/rev/9e3fc66ee0b8New changesetee0034434e65 by Ned Deily in branch 'default':Issue#6676: merge from 3.4http://hg.python.org/cpython/rev/ee0034434e65 | |||
| msg215011 -(view) | Author: Ned Deily (ned.deily)*![]() | Date: 2014-03-28 00:52 | |
Applied for release in 3.5.0, 3.4.1 and 2.7.7. Thanks, everyone! | |||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:56:51 | admin | set | github: 50925 |
| 2014-03-28 00:52:37 | ned.deily | set | status: open -> closed versions: + Python 3.5, - Python 3.3 messages: +msg215011 resolution: fixed stage: patch review -> resolved |
| 2014-03-27 23:44:36 | python-dev | set | nosy: +python-dev messages: +msg215005 |
| 2014-02-27 07:18:14 | ned.deily | set | files: +issue6676_27.patch |
| 2014-02-27 07:17:49 | ned.deily | set | files: +issue6676_3x.patch stage: patch review messages: +msg212338 versions: - Python 3.2 |
| 2014-02-19 23:42:36 | dhgutteridge | set | messages: +msg211673 versions: + Python 3.4 |
| 2011-09-01 04:56:04 | dhgutteridge | set | messages: +msg143295 |
| 2011-08-30 23:34:09 | ned.deily | set | messages: +msg143243 |
| 2011-08-30 23:25:50 | ned.deily | set | nosy: +ned.deily messages: +msg143242 versions: + Python 3.2, Python 3.3, - Python 2.6 |
| 2011-08-30 23:20:26 | ned.deily | link | issue12829 superseder |
| 2011-08-25 00:57:12 | dhgutteridge | set | nosy: +dhgutteridge messages: +msg142950 |
| 2010-02-02 18:20:47 | willgrainger | set | nosy: +willgrainger messages: +msg98753 |
| 2009-10-09 16:13:10 | ocean-city | set | messages: +msg93804 |
| 2009-10-09 15:44:53 | ocean-city | set | messages: +msg93803 |
| 2009-10-09 15:29:04 | amaury.forgeotdarc | set | files: +pyexpat-2.patch messages: +msg93802 |
| 2009-10-09 14:11:34 | ocean-city | set | files: -pyexpat_addhok.patch |
| 2009-10-09 14:11:17 | ocean-city | set | files: +pyexpat.patch messages: +msg93794 |
| 2009-10-09 13:12:12 | amaury.forgeotdarc | set | messages: +msg93791 |
| 2009-10-09 13:00:47 | ocean-city | set | messages: +msg93790 |
| 2009-10-09 12:04:56 | amaury.forgeotdarc | set | nosy: +amaury.forgeotdarc messages: +msg93785 |
| 2009-10-09 09:45:54 | ocean-city | set | files: +pyexpat_addhok.patch versions: + Python 2.7 nosy: +ocean-city messages: +msg93782 keywords: +patch |
| 2009-10-09 08:26:50 | andybalaam | set | messages: +msg93779 |
| 2009-10-09 08:23:45 | andybalaam | set | nosy: +andybalaam messages: +msg93777 |
| 2009-08-10 16:55:44 | realpolitik | set | messages: +msg91455 |
| 2009-08-10 16:23:24 | realpolitik | create | |