tarfile: Make it possible to extract nested tarfiles in memory #1032

Open: woju wants to merge 1 commit into micropython:master from woju:tarfile-nested

@woju

Make it possible to extract nested archives, which are used in e.g. Mender artifacts. See commit message for details and a (simplified) example.

FileSection.skip() (see below the diff) uses 2-argument readinto, so
attempting to recursively extract archives throws an error. This commit
adds optional second argument to fix this problem. After this commit, it
is possible to extract nested archives in roughly this fashion:

    with open(path, 'rb') as file:
        tar_outer = tarfile.TarFile(fileobj=file)
        for ti_outer in tar_outer:
            tar_inner = tarfile.TarFile(
                fileobj=tar_outer.extractfile(ti_outer))
            for ti_inner in tar_inner:
                ...

Nested archives are used in some embedded contexts, for example Mender
artifacts.

Signed-off-by: Wojciech Porczyk <wojciech.porczyk@connectpoint.pl>

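For illustration only, the nested layout can be reproduced on CPython entirely with in-memory buffers. This sketch uses CPython's `tarfile.open`, `TarInfo`, `addfile` and `extractfile`, not the micropython-lib `TarFile(fileobj=...)` constructor from the commit message. The helper `make_tar` and the member names are hypothetical.

```python
import io
import tarfile


def make_tar(members):
    """Build an uncompressed tar archive in memory from {name: bytes}."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in members.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


# Hypothetical layout: an outer archive holding one inner archive.
inner_bytes = make_tar({"hello.txt": b"hello"})
outer_bytes = make_tar({"inner.tar": inner_bytes})

found = {}
with tarfile.open(fileobj=io.BytesIO(outer_bytes), mode="r:") as tar_outer:
    for ti_outer in tar_outer:
        # extractfile() returns a file-like view of the member's data,
        # so the inner archive is never written to disk.
        inner_file = tar_outer.extractfile(ti_outer)
        with tarfile.open(fileobj=inner_file, mode="r:") as tar_inner:
            for ti_inner in tar_inner:
                found[ti_inner.name] = tar_inner.extractfile(ti_inner).read()

print(found)
```

The same loop shape is what the commit message aims for on MicroPython; only the constructor call differs.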
@dpgeorge
Member

Thanks for the patch. I can see why it's needed.

But, this is not CPython compatible, and we strive to retain compatibility where possible.

Now, `file.readinto(buf, size)` is also not CPython compatible, and that's really where the trouble begins. Supporting the second argument there means all file-like objects passed into `TarFile` must support this 2-arg `readinto` form.

So I suggest to fix it by changing how `skip` works, so it doesn't use the 2-arg form, eg:

```diff
--- a/python-stdlib/tarfile/tarfile/__init__.py
+++ b/python-stdlib/tarfile/tarfile/__init__.py
@@ -55,9 +55,12 @@ class FileSection:
         if sz:
             buf = bytearray(16)
             while sz:
-                s = min(sz, 16)
-                self.f.readinto(buf, s)
-                sz -= s
+                if sz >= 16:
+                    self.f.readinto(buf)
+                    sz -= 16
+                else:
+                    self.f.read(sz)
+                    sz = 0
 
 
 class TarInfo:
```
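As a rough standalone sketch of the loop in the diff above, the function below takes the stream as a parameter instead of using `self.f`, so it can be tried on CPython with `io.BytesIO`, whose `readinto` only accepts one argument. The free function `skip` and the test data are hypothetical. Like the diff, it assumes each `readinto(buf)` call fills the whole 16-byte buffer.

```python
import io


def skip(f, sz):
    """Discard sz bytes from f using only 1-arg readinto() and read()."""
    buf = bytearray(16)
    while sz:
        if sz >= 16:
            # Full chunk: reuse the 16-byte scratch buffer.
            # Assumes readinto() fills the buffer, as the diff does.
            f.readinto(buf)
            sz -= 16
        else:
            # Tail shorter than the buffer: one small read() finishes it.
            f.read(sz)
            sz = 0


stream = io.BytesIO(bytes(range(100)))
skip(stream, 37)  # two full 16-byte chunks, then a 5-byte tail
next_byte = stream.read(1)
```

Only the final partial chunk allocates a new bytes object, so the memory saving from the scratch buffer is mostly kept.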

@dpgeorge
Member

For reference, this used to work, but commit 2ca1527 optimised `skip` to not use too much memory.

@woju
Author

woju commented Aug 1, 2025

> [...] Supporting the second argument there means all file-like objects passed into `TarFile` must support this 2-arg `readinto` form.

I thought this was already the case, because as it is, it doesn't work without 2-argument `readinto` at all, even the outer tarfile can't be extracted.

> So I suggest to fix it by changing how `skip` works, so it doesn't use the 2-arg form, eg:

Sure, wilco. I'll also change the title of the PR.

@dpgeorge
Member

> I thought this was already the case, because as it is, it doesn't work without 2-argument `readinto` at all, even the outer tarfile can't be extracted.

Yes, you're right, all fileobjs that it uses must support 2-arg `readinto`. For the most part that's OK, all C-based MicroPython streams will implement that. But Python-based files/streams won't.

For example, if there's a tar file on the host PC and you use `mpremote mount .` and then at the REPL try to use `tarfile.TarFile` to open the tar file on the mounted host PC, it'll fail with the same problem as this PR is addressing:

```
  File "tarfile/__init__.py", line 129, in __next__
  File "tarfile/__init__.py", line 106, in next
  File "tarfile/__init__.py", line 59, in skip
TypeError: function takes 2 positional arguments but 3 were given
```

So, instead of trying to add the 2-arg form to all streams/files, better to fix it once here so that it doesn't use the 2-arg form.
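The failure mode can be sketched with a minimal Python-level stream. `PyStream` here is a hypothetical stand-in, not the file class that `mpremote mount` actually provides. It only assumes that a stream written in Python defines the standard 1-argument `readinto(buf)`. On CPython the `TypeError` wording differs from the MicroPython message quoted above.

```python
class PyStream:
    """Hypothetical Python-level stream with only the 1-arg readinto()."""

    def __init__(self, data):
        self._data = data
        self._pos = 0

    def readinto(self, buf):
        # Copy as many bytes as fit into buf and report how many were read.
        chunk = self._data[self._pos:self._pos + len(buf)]
        buf[:len(chunk)] = chunk
        self._pos += len(chunk)
        return len(chunk)


stream = PyStream(b"0123456789")
buf = bytearray(4)

try:
    stream.readinto(buf, 2)  # 2-arg form, as the current skip() calls it
    two_arg_ok = True
except TypeError:
    two_arg_ok = False

n = stream.readinto(buf)  # the 1-arg form works on any such stream
```

A `skip` that sticks to `readinto(buf)` and `read(n)` works with both kinds of stream without any change on the stream side.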

@woju
Author

woju commented Aug 1, 2025

I agree, yes, this sounds like the correct fix. I'll do that the week after; next week I'm on vacation.

dpgeorge reacted with thumbs up emoji
