Uh oh!
There was an error while loading.Please reload this page.
- Notifications
You must be signed in to change notification settings - Fork32k
gh-106287: do not write objects after an unmarshalling error#132715
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
base:main
Are you sure you want to change the base?
Conversation
…ling errorWriting out an object may involve a slot lookup, which is not safe to do withan exception raised. In debug mode an assertion failure will occur if thishappens. To avoid assertion failures, and possibly more subtle problems theassertion is there to prevent, skip attempts to write an object after an error.Note the unmarshal writing code won't raise an exception itself, however theset writing code calls back into the top-level unmarshal function (seepythongh-78903), which will check for failures and raises an exception. Once thathappens it is not safe to continue writing further objects.Note also that some data may still be written even after an error occurred andan exception has been raised: code objects write objects and longs, and thelatter will be written unconditionally, even if the former fails. Thisshouldn't cause a problem as it is documented that "garbage data will also bewritten to the file" in such cases, and the top-level function will handle andreport the error appropriately.Add a test case to cover the various different object types which could triggersuch failures.
Most changes to Pythonrequire a NEWS entry. Add one using theblurb_it web app or theblurb command-line tool. If this change has little impact on Python users, wait for a maintainer to apply the |
Python/marshal.c Outdated
@@ -750,6 +753,10 @@ PyMarshal_WriteLongToFile(long x, FILE *fp, int version) | |||
w_flush(&wf); | |||
} | |||
/* TODO: Contrary to its documentation, this function does NOT set an error on |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
It does.PySys_Audit
orw_init_refs
may set an error on failure. The docs say that exceptions are due to marshalling itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
True, those functions set errors as appropriate, apologies for being unclear. Various other error conditions are correctly reported, too. However unmarshallable objects like in this issue are not. AIUI they should raise aValueError: unmarshallable object
(asPyMarshal_WriteObjectToString
does), but the error is silently ignored byPyMarshal_WriteObjectToFile
.
The problem is the errors are recorded inWFILE::error
(e.g. see the end ofw_complex_object
) and it is only in_PyMarshal_WriteObjectToString
that is converted into an exception. This doesn't happen inPyMarshal_WriteObjectToFile
.
This can be reproduced withpython -c "import _testcapi; _testcapi.pymarshal_write_object_to_file(int, 'test.data', 5)"
(with assertions enabled). It asserts there are no exceptions raised after calling the function, but completes successfully. If you replaceint
with{int}
it will abort as expected.
I can reword or just drop the comment, whatever you prefer. Maybe something like, "TODO: this function silently ignores some failures due to unmarshallable objects"? I'd be happy to just fix it, for that matter, but I wasn't sure whether it would be considered worth the churn.
Now I look at that code again, I note the paths that currentlydo raise exceptions while writing objects are going to have them overwritten by that error handling code anyway. That's a shame, since they give specifics like "bad marshal data (unnormalized long data)" instead of a generic "unmarshallable object" message. Perhaps we shouldn't raise an exception if there's already one raised?
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Thanks for the review! I've just dropped the comment for now: itprobably isn't worth worrying about, and if it is, well, I'll just fix it instead 😉 |
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
Uh oh!
There was an error while loading.Please reload this page.
Writing out an object may involve a slot lookup, which is not safe to do with an exception raised. In debug mode an assertion failure will occur if this happens. To avoid assertion failures, and possibly more subtle problems the assertion is there to prevent, skip attempts to write an object after an error.
Note the unmarshal writing code won't raise an exception itself, however the set writing code calls back into the top-level unmarshal function (seegh-78903), which will check for failures and raises an exception. Once that happens it is not safe to continue writing further objects.
Note also that some data may still be written even after an error occurred and an exception has been raised: code objects write objects and longs, and the latter will be written unconditionally, even if the former fails. This shouldn't cause a problem as it is documented that "garbage data will also be written to the file" in such cases, and the top-level function will handle and report the error appropriately.
Add a test case to cover the various different object types which could trigger such failures.
Note that, contrary to its documentation, the
PyMarshal_WriteObjectToFile
function doesnot set an error on failure. Fortunately it is not used internally, except in test code, where it is only used with input that should not fail. I noticed this while checking the code and added a comment, but fixing it should be done separately. If we can be bothered, that is: Guido seemed to notice the same thing and mentioned it in a comment 25 years ago, and no-one seems to have noticed since.