tarfile indeterminate TarInfo.size when PAX headers contain size and GNU.sparse.realsize keys at the same time #136601

Labels
stdlib (Python modules in the Lib dir), type-bug (An unexpected behavior, bug, or error)
@mxmlnkn


Bug report

Bug description:

Hello,

I am currently debugging this issue.

I have noticed that the bug can be reproduced when the problematic file is truncated to 9 GiB, but it does not happen when it is truncated to 8 GiB.
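
A possible explanation for the 8 GiB threshold, stated as an assumption and not verified against this particular archive: the classic ustar header stores the size as 11 octal digits, so a stored size of 8 GiB or more cannot be represented there and has to be carried in a PAX `size` record instead. The helper name `needs_pax_size` below is hypothetical and only illustrates the arithmetic.

```python
# Assumption: the ustar size field holds 11 octal digits, so the largest
# representable size is 8**11 - 1 bytes (one byte short of 8 GiB).
USTAR_SIZE_DIGITS = 11
USTAR_MAX_SIZE = 8 ** USTAR_SIZE_DIGITS - 1

GIB = 1024 ** 3


def needs_pax_size(stored_size):
    """Return True if stored_size does not fit into the octal size field.

    Hypothetical helper for illustration; it is not part of tarfile.
    """
    return stored_size > USTAR_MAX_SIZE


# The stored size reported in the failing case exceeds the limit ...
print(needs_pax_size(9602318848))
# ... while anything below 8 GiB still fits into the classic header.
print(needs_pax_size(8 * GIB - 1))
```

If this assumption holds, it would be consistent with the `size` key appearing only in the PAX headers of the 9 GiB case.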

The problem seems to be that the next member offset is computed incorrectly. It seems to point 512 B after the correct TAR header, which, in this case, points into the data for the extended attributes such as `30 mtime=1752348[...]`.

One of the differences seems to be this code part, which is not hit for the working case:

cpython/Lib/tarfile.py

Lines 1562 to 1569 in 47b01da

    if "size" in pax_headers:
        # If the extended header replaces the size field,
        # we need to recalculate the offset where the next
        # header starts.
        offset = next.offset_data
        if next.isreg() or next.type not in SUPPORTED_TYPES:
            offset += next._block(next.size)
        tarfile.offset = offset
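
To make the dependency on `size` concrete, here is a minimal standalone sketch of that offset calculation. It assumes that `_block` rounds a byte count up to the next multiple of 512; the names `round_up_to_block` and `next_header_offset` and the `offset_data` value are hypothetical and chosen only for illustration.

```python
BLOCKSIZE = 512  # tar archives are organized in 512-byte blocks


def round_up_to_block(count):
    """Round count up to the next multiple of BLOCKSIZE."""
    blocks, remainder = divmod(count, BLOCKSIZE)
    if remainder:
        blocks += 1
    return blocks * BLOCKSIZE


def next_header_offset(offset_data, size):
    """Offset of the next header, given where the member data starts."""
    return offset_data + round_up_to_block(size)


offset_data = 1536        # hypothetical start of the member data
stored_size = 9602318848  # value of the PAX "size" key in the failing case
real_size = 9663676416    # value of "GNU.sparse.realsize" in the failing case

# Whichever value ends up in TarInfo.size decides where the next header
# is expected, so the two candidates lead to different offsets.
delta = (next_header_offset(offset_data, real_size)
         - next_header_offset(offset_data, stored_size))
print(delta)
```

This sketch only shows that the two candidate sizes yield different next-header offsets; it does not attempt to explain the 512 B discrepancy described above.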

While looking into the line above, i.e., into `_apply_pax_info`, I noticed that there is no definite order for applying the size even though it can appear multiple times!

cpython/Lib/tarfile.py

Lines 1615 to 1634 in 47b01da

    def _apply_pax_info(self, pax_headers, encoding, errors):
        """Replace fields with supplemental information from a previous
        pax extended or global header.
        """
        for keyword, value in pax_headers.items():
            if keyword == "GNU.sparse.name":
                setattr(self, "path", value)
            elif keyword == "GNU.sparse.size":
                setattr(self, "size", int(value))
            elif keyword == "GNU.sparse.realsize":
                setattr(self, "size", int(value))
            elif keyword in PAX_FIELDS:
                if keyword in PAX_NUMBER_FIELDS:
                    try:
                        value = PAX_NUMBER_FIELDS[keyword](value)
                    except ValueError:
                        value = 0
                if keyword == "path":
                    value = value.rstrip("/")
                setattr(self, keyword, value)

In the non-working case, the PAX headers look like this:

    {'GNU.sparse.major': '1',
     'GNU.sparse.minor': '0',
     'GNU.sparse.name': 'userdata',
     'GNU.sparse.realsize': '9663676416',
     'atime': '1752349406.975921575',
     'ctime': '1752349534.57652562',
     'mtime': '1752349534.57652562',
     'size': '9602318848'}

I.e., the `size` member first gets set to the value of `GNU.sparse.realsize` and then to the value of `size`. The debug output looks like this:

    [_apply_pax_info] SET SIZE to: 9663676416 from key: GNU.sparse.realsize
    [_apply_pax_info] SET SIZE to: 9602318848 from key: size
    [_apply_pax_info] SET key to: 1752349534.5765257 from key: mtime

Is it specified that the order of the PAX headers must always be this way? If not, one might just as well encounter them like this:

    {'atime': '1752349406.975921575',
     'ctime': '1752349534.57652562',
     'mtime': '1752349534.57652562',
     'size': '9602318848',
     'GNU.sparse.major': '1',
     'GNU.sparse.minor': '0',
     'GNU.sparse.name': 'userdata',
     'GNU.sparse.realsize': '9663676416'}

and either one of these orders would be a bug.
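
The order dependence can be reproduced in isolation with a reduced sketch of the loop in `_apply_pax_info`. The function `final_size` is a hypothetical stand-in that handles only the size-related keys; it is not the tarfile implementation.

```python
def final_size(pax_headers):
    """Mimic only the size-related branches of the loop: the last
    size-like key in iteration order wins.

    Hypothetical reduction for illustration; returns None if no
    size-like key is present.
    """
    size = None
    for keyword, value in pax_headers.items():
        if keyword in ("GNU.sparse.size", "GNU.sparse.realsize", "size"):
            size = int(value)
    return size


# Same two keys and values, inserted in different orders.
realsize_first = {"GNU.sparse.realsize": "9663676416", "size": "9602318848"}
size_first = {"size": "9602318848", "GNU.sparse.realsize": "9663676416"}

print(final_size(realsize_first))  # the value of "size" is applied last
print(final_size(size_first))      # the value of "GNU.sparse.realsize" is applied last
```

Because Python dictionaries iterate in insertion order, the result depends on the order in which the records were parsed from the extended header.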

The working case does not have this ambiguity:

    {'GNU.sparse.major': '1',
     'GNU.sparse.minor': '0',
     'GNU.sparse.name': 'userdata',
     'GNU.sparse.realsize': '8589934592',
     'atime': '1752349538.445543898',
     'ctime': '1752351104.53673501',
     'mtime': '1752351104.53673501'}

The debug output looks like this:

    [_apply_pax_info] SET SIZE to: 8589934592 from key: GNU.sparse.realsize
    [_apply_pax_info] SET key to: 1752351104.536735 from key: mtime

I.e., even if there is no ordering problem, the `TarInfo.size` member already has different semantics in the two cases: in one it contains `GNU.sparse.realsize`, and in the other it contains `[PAXHeader.]size`.

CPython versions tested on:

CPython main branch

Operating systems tested on:

Linux
