Uh oh!
There was an error while loading.Please reload this page.
- Notifications
You must be signed in to change notification settings - Fork32k
gh-72680: Fix false positives when using zipfile.is_zipfile()#134250
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Uh oh!
There was an error while loading.Please reload this page.
Conversation
The zipfile.is_zipfile function would only search for the EndOfZipfilesection header. This failed to correctly identify non-zipfiles thatcontained this header. Now the zipfile.is_zipfile function verifiesthe first central directory entry.Changes:* Extended zipfile.is_zipfile to verify zipfile catalog* Added tests to validate failure of binary non-zipfiles
Uh oh!
There was an error while loading.Please reload this page.
bedevere-bot commentedMay 19, 2025
🤖 New build scheduled with the buildbot fleet by@gpshead for commit4bfec3e 🤖 Results will be shown at: https://buildbot.python.org/all/#/grid?branch=refs%2Fpull%2F134250%2Fmerge If you want to schedule another build, you need to add the🔨 test-with-buildbots label again. |
Improve Zip file validation in :func:`zipfile.is_zipfile`. | ||
Before this change :func:`zipfile.is_zipfile` only checked the End Central Directory | ||
signature. If the signature could be found in the last 64k of the file, | ||
success! This produced false positives on any file with ``'PK\x05\x06'`` in the | ||
last 64k of the file - including PDFs and PNGs. | ||
This is now corrected by actually validating the Central Directory location | ||
and size based on the information provided by the End of Central Directory | ||
along with verifying the Central Directory signature of the first entry. | ||
This should be sufficient for the vast number of Zip files with fewer | ||
false positives. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
This is a very beefy NEWS, could it be condensed. We generally keep them quite brief, details go in issues or docs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I suggest to move most of the NEWS entry into the commit message, and keep the NEWS entry short.
1298511
intopython:mainUh oh!
There was an error while loading.Please reload this page.
…ythonGH-134250)bpo-28494: Improve zipfile.is_zipfile reliabilityThe zipfile.is_zipfile function would only search for the EndOfZipfilesection header. This failed to correctly identify non-zipfiles thatcontained this header. Now the zipfile.is_zipfile function verifiesthe first central directory entry.Changes:* Extended zipfile.is_zipfile to verify zipfile catalog* Added tests to validate failure of binary non-zipfiles* Reuse 'concat' handling for is_zipfile(cherry picked from commit1298511)Co-authored-by: Tim Hatch <timhatch@netflix.com>Co-authored-by: John Jolly <john.jolly@gmail.com>
GH-134401 is a backport of this pull request to the3.14 branch. |
…H-134250) (#134401)gh-72680: Fix false positives when using zipfile.is_zipfile() (GH-134250)bpo-28494: Improve zipfile.is_zipfile reliabilityThe zipfile.is_zipfile function would only search for the EndOfZipfilesection header. This failed to correctly identify non-zipfiles thatcontained this header. Now the zipfile.is_zipfile function verifiesthe first central directory entry.Changes:* Extended zipfile.is_zipfile to verify zipfile catalog* Added tests to validate failure of binary non-zipfiles* Reuse 'concat' handling for is_zipfile(cherry picked from commit1298511)Co-authored-by: Tim Hatch <timhatch@netflix.com>Co-authored-by: John Jolly <john.jolly@gmail.com>
Uh oh!
There was an error while loading.Please reload this page.
Rebased#5053 and fixed the impl to pass tests. Original PR and description below by@jjolly
Fix zipfile validation issue by ... providing more validation!
Originally, zipfile.is_zipfile() only checked the End Central Directory
signature. If the signature could be found in the last 64k of the file,
success! This produced false positives on any file with 'PK\x05\x06' in the
last 64k of the file - including PDFs and PNGs.
This is now corrected by actually validating the Central Directory location
and size based on the information provided by the End Central Directory
along with verifying the Central Directory signature of the first entry.
This should be sufficient for the vast number of zipfiles, but more could be
done to absolutely validate the zipfile content - such as validating all
local file headers and Central Directory entries.