Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

gh-117779: Fix reading duplicated entries in zipfile by name#129254

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Conversation

serhiy-storchaka
Copy link
Member

@serhiy-storchakaserhiy-storchaka commentedJan 24, 2025
edited by bedevere-appbot
Loading

@serhiy-storchaka
Copy link
MemberAuthor

I wonder how far should we backport this change.#110016 was a security fix and was backported to 3.8.

@obfusk
Copy link
Contributor

obfusk commentedJan 25, 2025
edited
Loading

I didn't expect such archives to be found in the real world.
I cannot imagine why anybody would create such archive.

I didn't expect it either. And I suspect the creation of those duplicate entries to be a bug, but they do clearly exist in the wild.

Now the last of the duplicated entries can be read, so you will not get an error when read by name. You will still get errors when reading other duplicated entries, this is necessary to prevent ZIP bombing.

I'm slightly unsure about this fix though. For one, it breaks the workaround we added todiffoscope (use the first entry ininfolist() if there are duplicates). This fix is simple and handles opening the file by name, yes, but for other use cases the problem is simply moved around. I have plenty of code that iterates overinfolist() and passes theZipInfo toread(), which would still fail even though there is no actual overlap indicating a ZIP bomb.

IMO it would be admittedly more complex but also more consistent -- and avoid any errors processing these ZIP files -- to explicitly handle entries with duplicate offsets: letting all of them have the same._end_offset, that being the start of the actual next (i.e. not having the same.header_offset) entry instead of still having the entries with duplicate offsets pointing to each other in a different order. This should still prevent actual ZIP bombs as it ensures there is no actual overlap whilst treating entries with the same.header_offset consistently instead of having some that can and others that cannot be opened depending on ordering.

(edited for clarity)

@obfusk
Copy link
Contributor

cc@h01ger@lamby

lamby reacted with thumbs up emoji

@serhiy-storchaka
Copy link
MemberAuthor

Sorry for the delay in responding. I experimented with different solutions and wrote new tests.

To me, this is a ZIP bomb. Even if it does not allow to fill the disk space by decompressing, it still spend CPU time for uncompressing the same data. It may consume a lot of memory if the decompressed entries are store in a list instead of a dict.

But since ZIP archives with duplicated entries encountered in practice, created not by malicious, but just bad software, perhaps we could lower the severity level from error to warning. Warnings can be converted to exceptions, if you want to make your program safer.

@gpshead, what is your opinion on this?

gpshead reacted with thumbs up emoji

@gpshead
Copy link
Member

I like the warning idea.

@gpshead
Copy link
Member

@jaraco - any thoughts on this one? in re-enables processing of one form of zip bomb, just with a warning instead of error. because there are alsomaybe legit situations where this zip structure occurs. It mostly undoes one of the protections added in#110016 for#109858.

@obfusk
Copy link
Contributor

A warning seems reasonable, yeah. But I dislike the "except the last one" edge case where it still breaks, surely that can be avoided?

@obfusk
Copy link
Contributor

IMO a warning indeed makes perfect sense but I think entries with the same.header_offset should all be treated consistently; AFAIK there is zero documentation for the fact that iterating over entries in different ways produces different results and that end user code needs to be aware, not just that e.g. duplicate file names may exist, but of how these entries with identical offsets are being handled internally (and I consider the fact that now matters a regression still).

@serhiy-storchaka
Copy link
MemberAuthor

Warnings or exceptions are emitted for some abnormal ZIP files from ancient times, but it is not possible to detect all anomalies, and even if it is possible, it is not always practical. For example, in a "quote-overlap" zip bomb, the most nested file can be read without error.

It is not a large issue if one of entries with the same offset can be read without error or warning. For consistency with how non-overlapped files with the same name are handled, it should be the last entry.

@serhiy-storchakaserhiy-storchaka merged commit0f04f24 intopython:mainApr 8, 2025
39 checks passed
@miss-islington-app
Copy link

Thanks@serhiy-storchaka for the PR 🌮🎉.. I'm working now to backport this PR to: 3.12, 3.13.
🐍🍒⛏🤖

miss-islington pushed a commit to miss-islington/cpython that referenced this pull requestApr 8, 2025
…ythonGH-129254)(cherry picked from commit0f04f24)Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
miss-islington pushed a commit to miss-islington/cpython that referenced this pull requestApr 8, 2025
…ythonGH-129254)(cherry picked from commit0f04f24)Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
@bedevere-app
Copy link

GH-132263 is a backport of this pull request to the3.13 branch.

@bedevere-appbedevere-appbot removed the needs backport to 3.13bugs and security fixes labelApr 8, 2025
@bedevere-app
Copy link

GH-132264 is a backport of this pull request to the3.12 branch.

@bedevere-appbedevere-appbot removed the needs backport to 3.12only security fixes labelApr 8, 2025
serhiy-storchaka added a commit that referenced this pull requestApr 8, 2025
…H-129254) (GH-132264)(cherry picked from commit0f04f24)Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
serhiy-storchaka added a commit that referenced this pull requestApr 8, 2025
…H-129254) (GH-132263)(cherry picked from commit0f04f24)Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Reviewers

@gpsheadgpsheadAwaiting requested review from gpshead

Assignees
No one assigned
Labels
None yet
Projects
None yet
Milestone
No milestone
Development

Successfully merging this pull request may close these issues.

3 participants
@serhiy-storchaka@obfusk@gpshead

[8]ページ先頭

©2009-2025 Movatter.jp