Bug report
description
Using gzip.compress() with mtime=0 in 3.8 <= cpython <= 3.10, the OS byte, i.e. the 10th byte in the GZIP header, is set to 255 "unknown" (also see e.g. #83302):
Line 599 in dc0adb4
| return struct.pack("<BBBBLBB", 0x1f, 0x8b, 8, 0, int(mtime), xfl, 255) |
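To make the byte position concrete, the fixed 10-byte header packed by that line can be unpacked again with the same format string. This is only a minimal sketch: the helper `parse_gzip_header` and the field names are made up for illustration (loosely following RFC 1952) and are not part of the gzip module.

```python
import gzip
import struct
from typing import NamedTuple


class GzipHeader(NamedTuple):
    # Illustrative field names, not an API of the gzip module.
    id1: int
    id2: int
    method: int
    flags: int
    mtime: int
    xfl: int
    os: int


def parse_gzip_header(blob: bytes) -> GzipHeader:
    """Unpack the fixed 10-byte gzip header (hypothetical helper)."""
    return GzipHeader(*struct.unpack("<BBBBLBB", blob[:10]))


header = parse_gzip_header(gzip.compress(b"", mtime=0))
# header.os is the 10th byte (index 9): 255 means "unknown", 3 means "Unix".
print(header)
```

The value printed for `os` depends on the Python version and platform, which is exactly the inconsistency reported here.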
However, in cpython 3.11 and 3.12, the OS byte is suddenly set to a "known" value, e.g. 3 ("Unix") on Ubuntu.
This is not mentioned in the changelog for Python 3.11.
This may lead to problems in the context of reproducible builds. In our case, hash checking fails after decompressing and re-compressing a gzipped archive.
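A possible workaround for reproducible builds is to go through gzip.GzipFile instead of gzip.compress(). This is a sketch under an assumption: GzipFile writes its own header with OS byte 255, which is what the versions tested here appear to do. The helper name `compress_reproducible` is hypothetical.

```python
import gzip
import io


def compress_reproducible(data: bytes, compresslevel: int = 9) -> bytes:
    """Hypothetical helper: gzip-compress with mtime=0 via GzipFile."""
    buf = io.BytesIO()
    # GzipFile writes the gzip header itself, so the zlib shortcut taken by
    # gzip.compress(mtime=0) in 3.11 and 3.12 is not involved.
    with gzip.GzipFile(
        fileobj=buf, mode="wb", compresslevel=compresslevel, mtime=0
    ) as f:
        f.write(data)
    return buf.getvalue()


out = compress_reproducible(b"hello")
print("OS byte:", out[9])
```

This only pins the header. The compressed payload can still differ between zlib builds, so it does not guarantee byte-identical archives on every system.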
how to reproduce
Here's an example, where byte 10 is \xff in python 3.10 and \x03 in python 3.11:
~ $ python
Python 3.10.12 (main, Jun 11 2023, 05:26:28) [GCC 11.4.0] on linux
>>> import gzip
>>> gzip.compress(b'', mtime=0)
b'\x1f\x8b\x08\x00\x00\x00\x00\x00\x02\xff\x03\x00\x00\x00\x00\x00\x00\x00\x00\x00'
~ $ pyenv shell 3.11
~ $ python
Python 3.11.6 (main, Nov 23 2023, 17:30:16) [GCC 11.4.0] on linux
>>> import gzip
>>> gzip.compress(b'', mtime=0)
b'\x1f\x8b\x08\x00\x00\x00\x00\x00\x02\x03\x03\x00\x00\x00\x00\x00\x00\x00\x00\x00'
cause
I guess this is caused by python 3.11 delegating the gzip.compress() call to zlib if mtime=0, as mentioned in the docs:
Changed in version 3.11: Speed is improved by compressing all data at once instead of in a streamed fashion. Calls with mtime set to 0 are delegated to zlib.compress() for better speed.
and source:
Lines 609 to 612 in 89ddea4
| if mtime == 0: | |
|     # Use zlib as it creates the header with 0 mtime by default. | |
|     # This is faster and with less overhead. | |
|     return zlib.compress(data, level=compresslevel, wbits=31) |
Apparently zlib does set the OS byte.
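This can be checked without the gzip module at all. The sketch below asks zlib for a gzip container (wbits=31) and prints the header bytes in question. It uses zlib.compressobj() rather than zlib.compress(..., wbits=31) so that it also runs on versions before 3.11, and the helper name `zlib_gzip` is made up for illustration.

```python
import zlib


def zlib_gzip(data: bytes, level: int = 9) -> bytes:
    """Compress into a gzip container using zlib only (hypothetical helper)."""
    # wbits=31 selects the gzip wrapper; zlib then writes the header itself.
    co = zlib.compressobj(level, zlib.DEFLATED, 31)
    return co.compress(data) + co.flush()


out = zlib_gzip(b"")
print("mtime bytes:", out[4:8])
print("OS byte:", out[9])
```

The OS byte printed here comes from the zlib build in use, so it can differ between platforms.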
CPython versions tested on:
3.8, 3.9, 3.10, 3.11, 3.12
Operating systems tested on:
Linux, macOS, Windows
Linked PRs
- gh-112346: Bugfix: Remove faster codepath from gzip.compress as it introduces behavioral inconsistencies #114116
- gh-112346: Document the OS byte in gzip.compress output change in 3.11 #120480
- gh-112346: Always set OS byte to 255, simpler gzip.compress function. #120486
- [3.13] gh-112346: Always set OS byte to 255, simpler gzip.compress function. (GH-120486) #120563
- [3.13] gh-112346: Document the OS byte in gzip.compress output change in 3.11 (GH-120480) #120612
- [3.12] gh-112346: Document the OS byte in gzip.compress output change in 3.11 (GH-120480) #120613
- [3.11] gh-112346: Document the OS byte in gzip.compress output change in 3.11 (GH-120480) #120614