Description
Bug report
test_bz2 concatenates a bunch of Python test files to get 128 KiB of test data:
Lines 69 to 80 in 1d091a3:

```python
# Some tests need more than one block of uncompressed data. Since one block
# is at least 100,000 bytes, we gather some data dynamically and compress it.
# Note that this assumes that compression works correctly, so we cannot
# simply use the bigger test data for all tests.
test_size = 0
BIG_TEXT = bytearray(128*1024)
for fname in glob.glob(os.path.join(glob.escape(os.path.dirname(__file__)), '*.py')):
    with open(fname, 'rb') as fh:
        test_size += fh.readinto(memoryview(BIG_TEXT)[test_size:])
    if test_size > 128*1024:
        break
BIG_DATA = bz2.compress(BIG_TEXT, compresslevel=1)
```
The exact contents depend on the order of results returned by `glob.glob()`, which is arbitrary but typically consistent on a single machine. Some orderings of the globbed files lead to test failures.
Below is mostly Claude's summary, which seems right to me:
The `testDecompressorChunksMaxsize` test feeds `BIG_DATA[:len(BIG_DATA)-64]` to `BZ2Decompressor.decompress` with `max_length=100` and asserts that `needs_input` is False. This assumes the truncated data contains at least one complete bz2 block, so the decompressor can produce output. But bz2 is a block compressor: it cannot produce any output until an entire compressed block is available.
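For illustration (this is not the test itself), here is a minimal sketch of the behavior the test relies on. The payload size of 200,000 bytes is my choice so that `compresslevel=1` (100,000-byte blocks) produces at least two blocks, leaving the first block complete after truncation:

```python
import bz2
import os

# Two full blocks: 200,000 bytes exceed the 100,000-byte block size
# used at compresslevel=1.
payload = os.urandom(200_000)
data = bz2.compress(payload, compresslevel=1)

dec = bz2.BZ2Decompressor()
# Drop the last 64 bytes, as the test does. The FIRST block is still
# complete, so the decompressor can emit max_length bytes of output
# and buffers the rest; needs_input stays False.
out = dec.decompress(data[:len(data) - 64], max_length=100)

print(len(out), dec.needs_input)   # 100 False
```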
With certain file orderings, the first bz2 block's compressed data extends into the last 64 bytes of `BIG_DATA`. The truncation then leaves an incomplete block: the decompressor consumes all input, returns 0 bytes, and correctly sets `needs_input=True`.
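The failure mode can be reproduced directly with a stream whose only block is incomplete. The 50,000-byte payload here is my choice so that everything fits in a single 100,000-byte block at `compresslevel=1`:

```python
import bz2
import os

# A single block: 50,000 bytes fit in one 100,000-byte block at level 1.
payload = os.urandom(50_000)
data = bz2.compress(payload, compresslevel=1)

dec = bz2.BZ2Decompressor()
# Truncating the stream leaves that one block incomplete, so no output
# can be produced at all: the decompressor consumes everything and
# asks for more input.
out = dec.decompress(data[:len(data) - 64], max_length=100)

print(len(out), dec.needs_input)   # 0 True
```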