Uh oh!
There was an error while loading.Please reload this page.
- Notifications
You must be signed in to change notification settings - Fork32k
gh-130819: Updatetarfile.py#_create_gnu_long_header
to align with GNU Tar#130820
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
base:main
Are you sure you want to change the base?
Conversation
ghost commentedMar 4, 2025 • edited by ghost
Loading Uh oh!
There was an error while loading.Please reload this page.
edited by ghost
Uh oh!
There was an error while loading.Please reload this page.
Most changes to Pythonrequire a NEWS entry. Add one using theblurb_it web app or theblurb command-line tool. If this change has little impact on Python users, wait for a maintainer to apply the |
ce2e670
to24b90cf
CompareUh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Misc/NEWS.d/next/Library/2025-03-04-03-14-44.gh-issue-130819.Dphgb6.rst OutdatedShow resolvedHide resolved
Uh oh!
There was an error while loading.Please reload this page.
24b90cf
to283b34e
Compare
picnixz left a comment• edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Can you motivate the choice for this? namely is there a real benefit between having an explicit user+mode rather than letting the "defaults"? And more importantly, can you cite the relevant manpage / specs where we can find this?
Note: whether this is accpeted or not, this should be treated as a feature request and not a bug IMO. As such, a What's New entry will need to be created, unless the motivation behind this change is not sufficient (in which case we would close the issue as "not planned")
Lib/tarfile.py Outdated
info["mode"] = 0o100644 | ||
info["uname"] = "root" | ||
info["gname"] = "root" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Where in the specs are these decided?
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
Oh btw, please reply on the issue instead of the PR (I'll repost my comment above) |
80b8591
to5282dd6
CompareI have made the requested changes; please review again |
Thanks for making the requested changes! @picnixz: please review the changes made to this pull request. |
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Lib/tarfile.py Outdated
_unames = {} # Cached mappings of uid=0 -> uname | ||
_gnames = {} # Cached mappings of gid=0 -> gname |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I prefer that we keep per-instance caches instead of per-class caches, even for 0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Hello, sorry but there seems no availableTarFile
object for_create_gnu_long_header
to cache the querying result:
- a
TarFile
instance does have cache members ofself._unames: Dict[uid, uname]
- however, across the calling stack of
TarInfo.tobuf() -> TarInfo.create_gnu_header() -> TarInfo._create_gnu_long_header()
, there's noTarFile
argument.
If we add aTarFile
intoTarInfo.tobuf(...)
, then this PR may break existing subclasses ofTarInfo
. Is it indeed necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Um, to make this cache safe for the free-threading version of CPython, I've replaced the_unames = {}
with_name_uid0 = None
(and_gnames = {}
with_name_gid0 = None
).
Do you developers have any suggestions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
@picnixz Any idea about where to put the cache object?
47861a1
toca20885
Compareca20885
to02bbde5
CompareI have made the requested changes; please review again. |
Thanks for making the requested changes! @picnixz: please review the changes made to this pull request. |
@@ -1708,6 +1708,15 @@ sysconfig | |||
(Contributed by Xuehai Pan in :gh:`131799`.) | |||
tarfile |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
This will need to be moved in whatsnew/3.15.rst now
_name_uid0 = None # Cached uname of uid=0 | ||
_name_gid0 = None # Cached gname of gid=0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
What I meant before is: why using a class variable? the issue is that once we deduceuid=0
, we're stuck with it for theentire Python process.
EDIT: I didn't see you comment, my bad. Then we need to think of another solution because storing them inTarFile
feels wrong. What we can do is to add a private attribute in TarInfo and populate it from TarFile. When writing, if the attribute is not set, we populate it eagerly (and thus subclasses of TarInfowill be slower but they won't be broken). Or instead, we can even just dump them with the legacy way (namely without aligning with GNU Tar). Only default TarFile and TarInfo objects will be having this new feature.
More generally, we should be able to set cached contextual information on TarInfo objects coming from a TarFile.
Uh oh!
There was an error while loading.Please reload this page.
The latest
tarfile
may still generate a file slightly different with the one made by GNU Tar, whenever a path name is longer than 100 bytes. So this PR tries to avoid the difference.More details are in#130819 .