Next TAR header offset recomputation is wrong for GNU sparse 1.0 file combined with 'size' PAX header key #136602

Labels: stdlib (Python modules in the Lib dir), type-bug (An unexpected behavior, bug, or error)
@mxmlnkn

Bug report

Bug description:

For a more detailed description, please see #136601.

I found a bug that causes TAR file parsing to end prematurely for very large sparse files. The computed offset of the next TAR header is off by one 512 B block.

The problem is the recomputation of the next TAR header offset when the PAX header contains a size key that overrides the overflowed (> 8 GiB) TAR size field:

cpython/Lib/tarfile.py, lines 1562 to 1569 in 47b01da:

    if "size" in pax_headers:
        # If the extended header replaces the size field,
        # we need to recalculate the offset where the next
        # header starts.
        offset = next.offset_data
        if next.isreg() or next.type not in SUPPORTED_TYPES:
            offset += next._block(next.size)
        tarfile.offset = offset

The problem is that next.offset_data is used for this recomputation even though next.offset_data gets overwritten in _proc_gnusparse_10:

    next.offset_data = tarfile.fileobj.tell()

This leads to the next TAR header offset being off by the number of blocks it takes to store the sparse map that _proc_gnusparse_10 reads from the start of the data area.
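To make the arithmetic concrete, the sketch below plugs in the values printed by the reproducer in this report (offset_data 2048, size 9653191172). It assumes the sparse map occupies exactly one 512-byte block, which matches the one-block error described here. `block` is a local stand-in for `TarInfo._block`, and the other names are illustrative, not taken from tarfile.py.

```python
BLOCKSIZE = 512

# Largest value that fits in the 11 octal digits of the ustar size field.
USTAR_MAX_SIZE = 8**11 - 1


def block(count):
    """Round count up to a multiple of BLOCKSIZE (stand-in for TarInfo._block)."""
    blocks, remainder = divmod(count, BLOCKSIZE)
    if remainder:
        blocks += 1
    return blocks * BLOCKSIZE


# Values printed by the reproducer for the member "sparse".
size = 9653191172             # taken from the PAX "size" key
offset_data_after_map = 2048  # offset_data after _proc_gnusparse_10 moved it

# Assumption: the sparse map fills exactly one block.
map_bytes = 1 * BLOCKSIZE
offset_data_before_map = offset_data_after_map - map_bytes

# The size exceeds the ustar limit, which is why the PAX "size" key is present.
print(size > USTAR_MAX_SIZE)

# What the recomputation does today versus where the next header should be.
buggy_next = offset_data_after_map + block(size)
expected_next = offset_data_before_map + block(size)
print(buggy_next - expected_next)
```

Under these assumptions the two offsets differ by exactly the bytes consumed by the sparse map.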

But maybe I am wrong and have overlooked something. I can say that this fixes it for my test case:

    diff --git a/Lib/tarfile.py b/Lib/tarfile.py
    index 068aa13ed7..7f3e62f5a2 100644
    --- a/Lib/tarfile.py
    +++ b/Lib/tarfile.py
    @@ -1565,7 +1565,7 @@ def _proc_pax(self, tarfile):
                     # header starts.
                     offset = next.offset_data
                     if next.isreg() or next.type not in SUPPORTED_TYPES:
    -                    offset += next._block(next.size)
    +                    offset += next._block(next.size) - BLOCKSIZE
                     tarfile.offset = offset
     
             return next
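The patch subtracts a fixed BLOCKSIZE, while the error was described as the number of blocks needed for the sparse map. The following sketch compares the two under stated assumptions: the PAX size value covers the sparse map plus the stored data, and the map may span more than one block. `data_start` and `map_blocks` are hypothetical parameters introduced for illustration, and this is not the fix adopted in CPython.

```python
BLOCKSIZE = 512


def block(count):
    """Round count up to a multiple of BLOCKSIZE (stand-in for TarInfo._block)."""
    blocks, remainder = divmod(count, BLOCKSIZE)
    if remainder:
        blocks += 1
    return blocks * BLOCKSIZE


def next_header_offsets(data_start, map_blocks, size):
    """Return (expected, fixed_subtraction) offsets of the next header.

    data_start: hypothetical offset where the member's data area begins,
                before the sparse map is read.
    map_blocks: hypothetical number of blocks occupied by the sparse map.
    size:       PAX "size" value, assumed to cover map plus stored data.
    """
    # offset_data as left behind after the sparse map has been consumed.
    offset_data = data_start + map_blocks * BLOCKSIZE
    expected = data_start + block(size)
    fixed_subtraction = offset_data + block(size) - BLOCKSIZE
    return expected, fixed_subtraction


# One-block map: the fixed subtraction lands on the expected offset.
print(next_header_offsets(1536, 1, 9653191172))
# Three-block map: the fixed subtraction would still overshoot.
print(next_header_offsets(1536, 3, 9653191172))
```

If these assumptions hold, remembering the data offset from before the sparse map is read would cover maps of any length, whereas the one-block subtraction only covers the tested case.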

Minimal reproducer (tested on EXT4 with GNU tar 1.35):

    echo bar > foo
    echo bar > sparse
    fallocate -l 9G sparse
    echo bar >> sparse
    fallocate --punch-hole -o 1G -l 10M sparse
    tar --numeric-owner --format=pax --sparse-version=1.0 -cSf sparse.tar sparse foo
    ls -la sparse.tar
    # -rw-rw-r-- 1 user user 9663682560 Jul 13 14:14 sparse.tar
    tar tvlf sparse.tar
    # -rw-rw-r-- 1000/1000 9663676420 2025-07-13 14:13 sparse
    # -rw-rw-r-- 1000/1000          4 2025-07-13 14:11 foo
    python3 -c 'import sys, tarfile; [print(tarInfo.sparse, tarInfo.offset, tarInfo.offset_data, tarInfo.size, tarInfo.name) for tarInfo in tarfile.open(sys.argv[1])]' sparse.tar
    # [(0, 1073741824), (1084227584, 8579448836), (9663676420, 0)] 0 2048 9653191172 sparse
    #  -> foo is missing!
    cat sparse.tar | xz -9 | zstd -19 | base64
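The Python one-liner in the reproducer can be expanded into a small helper for readability. This is a sketch: `describe_members` is a hypothetical name, and it only collects the same five TarInfo attributes that the one-liner prints.

```python
import tarfile


def describe_members(path):
    """Return one tuple per archive member with the fields the reproducer prints."""
    with tarfile.open(path) as archive:
        return [
            (member.sparse, member.offset, member.offset_data, member.size, member.name)
            for member in archive
        ]


# Usage against the archive from the reproducer:
# for fields in describe_members("sparse.tar"):
#     print(*fields)
```

With the bug present, the returned list for sparse.tar would be expected to contain only the entry for sparse, since iteration stops before foo is reached.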

Reproducer sparse-file-larger-than-8GiB-followed-by-normal-file.tar.xz.zst file as base64:

    cat <<EOF | base64 -d | zstd -d > sparse-file-larger-than-8GiB-followed-by-normal-file.tar.xz
    KLUv/QRojBIA1CP9N3pYWgAABObWtEYCACEBHAAAABDPWMz//5wCcV0AFwvGh5JaO6ePxyUOuA/zXtE/5U/vyT1WUwqPhMr1HTeZeJyWILwrrtDwH0eKx6KKGcU7D2aYidf/9bCtFMcWp8KxDA1FLF58w9bO4J+eDKd9QfIZFPCutpNB91dMk9bSVazx9pUcWEWn2r0SWsv1BtSYmVDmdKaMdGC/Epx8bcRAnm5Joy2Tgi3O7VouoCAqha+1YYNOQyyB4sG+tDbfLGdW6fyZMztJ/lRFQwtlFpDLHGFpia92kkke+2a/mwMvPc58aiT5X56QuH2mw1OhsrBKnbYYnT89BJjyAh2GTOeDbtZ/lLDGwhvxkXlnCm/M8QiqfUGfqAjnBeikNY2nodSBFo8YQh+636fk9xfuTQ3kKQ8qEWa613HftzHJ/X/ha1bKD91T/SPTCgd/rhyvFtn8FBBiUS7UayidinQBNmGebczIaRsKUQKoffUTC9EbCrRXDQjQMjfDyo7N/eDIxD7jBImHDv8Qk/hxeFn4C83/lShGD6n8fN77mjAuVsCPhfODgcBlxCVT+PWRNjEFpbDub8FwTUcM0ZERqq1gHbrOsScYXFmG6WZSWL7pdqxZ5OVbBQj5x9qt/PtSK3TNHlsgQvndUz34KWQJO4DLKmzftTvwxL0uX6oPPktmQpAT+5I61gCf/xABKwDsc1On/b6ufDEan7eNMW5wnqcjX+woy4XRlZiKfiqR8id19xnABphNmP3Yr9WQD1EPP7IEADz8NCsncIBOR5aC/hM+FaZUAAAAAQD9/778uQYCRAAAAAEA/f85AAJEAAAAAQD9/zkAAkQAAAABAP3/OQACRAAAAAEA/f85AAJEAAAAAQD9/zkAAkQAAAABAP3/OQACRAAAAAEA/f85AAJEAAAAAQD9/zkAAh0GAIQLkODIaNUNHQG+Ib0xB5201x9u5Typk+S1zSY18D/tc2o+BXKM/RM9v6MTQoFntxwNm0So6CELgft8dinBPFJBg583tJn+q69PwBnThZQjYTzvNhv0fkxX4GjmSgwnOrb7GU5pc2qtcrNCcHrPaNkQicmkdyzESbMAA8S2zfCiJIzpnN25EroA08/3fFWQ44JfrakeAIiPdXLNPTRNAAGY3VWAsID7IwAAdKr/2BQXOzADAAAAAARZWgIAG0DNWVsOgERj+N4=
    EOF

CPython versions tested on:

CPython main branch

Operating systems tested on:

Linux
