gh-129726: Break gzip.GzipFile reference loop #130055


Merged

vstinner merged 3 commits into python:main from cmaloney:gh-129796-gzipclose on Feb 28, 2025

Conversation

@cmaloney (Contributor) commented Feb 12, 2025 (edited by bedevere-app bot):

A reference loop was resulting in the fileobj held by the GzipFile being closed before the GzipFile.

The issue started with gh-89550 in 3.12, but was hidden in most cases until 3.13 when gh-62948 made it more visible. I think this needs backporting to 3.12.
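For illustration, a minimal sketch of the failing pattern (illustrative names, not the exact test added in Lib/test/test_gzip.py): a GzipFile wrapping a BytesIO is dropped without an explicit close(); because the internal write buffer held a strong reference back to the GzipFile, the two objects formed a reference cycle, and when the garbage collector cleared the cycle it could finalize the BytesIO first, so the flush inside GzipFile.close() (run from __del__) hit a closed file.

    import gc
    import gzip
    import io

    def leaky_gzip():
        fileobj = io.BytesIO()
        gzf = gzip.GzipFile(fileobj=fileobj, mode="wb")
        gzf.write(b"hello")
        # no gzf.close(): gzf and fileobj go out of scope together and are
        # only reclaimed when the cyclic garbage collector runs

    leaky_gzip()
    gc.collect()
    # On affected versions this can emit:
    #   Exception ignored while finalizing file <gzip on 0x...>
    #   ValueError: I/O operation on closed file.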

Sample of test failing on main:

    $ ./python -Wd -E -m test -uall -M8g test_gzip
    Using random seed: 195107833
    0:00:00 load avg: 0.54 Run 1 test sequentially in a single process
    0:00:00 load avg: 0.54 [1/1] test_gzip
    test test_gzip failed -- Traceback (most recent call last):
      File "/<workdir>/python/cpython/Lib/test/test_gzip.py", line 872, in test_refloop_unraisable
        self.assertIsNone(cm.unraisable)
        ~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^
    AssertionError: UnraisableHookArgs(exc_type=<class 'ValueError'>, exc_value=ValueError('I/O operation on closed file.'), exc_traceback=<traceback object at 0x76afdad089b0>, err_msg='Exception ignored while finalizing file <gzip on 0x76afdad08910>', object=None) is not None
    test_gzip failed (1 failure)
    == Tests result: FAILURE ==
    1 test failed:
        test_gzip
    Total duration: 735 ms
    Total tests: run=73 failures=1
    Total test files: run=1/1 failed=1
    Result: FAILURE

Loop found by @danifus, bug found and explored by @xrmx.

A reference loop was resulting in the `fileobj` held by the `GzipFile` being closed before the `GzipFile`. The issue started with pythongh-89550 in 3.12, but was hidden in most cases until 3.13 when pythongh-62948 made it more visible.

@@ -10,6 +10,7 @@
 import builtins
 import io
 import _compression
+import weakref
@cmaloney (Contributor, Author):

The ordering of imports here is painful to me, but I resisted bringing them in line with PEP 8 because this fix needs backporting.

Member:

Please reformat these imports for PEP 8 :-) I will handle merge conflicts on backports if needed.

@xrmx (Contributor) left a comment:

Tested with Python 3.13 and with Python 3.12 -X dev against their respective gzip modules patched with this diff, and I cannot reproduce the issue anymore with my snippet. Thanks!

@danifus (Contributor):

Nice work. +1 on a backport as I think this can result in data loss if there's data waiting to be flushed from the buffer or compressor.

Should we add some docs saying that the GzipFile object should be closed explicitly instead of relying on finalizers to clean it up so as to not rely on GC heuristics?

@cmaloney (Contributor, Author):

@danifus 👍. It sounds like the reported production code did do a .close() (#129726 (comment)), which feels to me like another issue that could be resolved: close to me should mean all data is flushed and the GzipFile will not try to write again (and its write methods should error). Not sure what this last little bit of data is / why it was still around after .close() and tried getting written on __del__.
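A small standalone sketch of the expected behavior, not code from this PR: once close() has run, GzipFile.write() refuses further writes.

    import gzip
    import io

    gzf = gzip.GzipFile(fileobj=io.BytesIO(), mode="wb")
    gzf.write(b"data")
    gzf.close()              # flushes buffered data and writes the gzip trailer
    try:
        gzf.write(b"more")
    except ValueError as exc:
        print(exc)           # writing to a closed GzipFile raises ValueError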

@danifus (Contributor):

Hmmm, I had missed that production example and it baffles me :p I don't see how more data could be written after .close() has been called once.

We have this check at the start of .close():

cpython/Lib/gzip.py, lines 353 to 355 (at 05e89c3):

    fileobj = self.fileobj
    if fileobj is None or self._buffer.closed:
        return

and this in the finally block at the end of that .close():

cpython/Lib/gzip.py, lines 365 to 366 (at 05e89c3):

    finally:
        self.fileobj = None

Seems pretty watertight.

@xrmx it's possible you saw this issue in your code in the logger.error("dropping flushed data due to transport failure back-off") branch of the following if statement (the one that doesn't close the GzipFile). Let me know if that's not the case and I'll keep investigating.

    if not self.state.should_try():
        logger.error("dropping flushed data due to transport failure back-off")
    else:
        fileobj = buffer.fileobj  # get a reference to the fileobj before closing the gzip file
        buffer.close()
        data = fileobj.getbuffer()
        data = gzip.compress(buffer.getvalue(), compresslevel=self._compress_level)

@xrmx (Contributor) commented Feb 14, 2025 (edited):

@xrmx it's possible you saw this issue in your code in the logger.error("dropping flushed data due to transport failure back-off") branch of the following if statement (the one that doesn't close the GzipFile). Let me know if that's not the case and I'll keep investigating.

    if not self.state.should_try():
        logger.error("dropping flushed data due to transport failure back-off")
    else:
        fileobj = buffer.fileobj  # get a reference to the fileobj before closing the gzip file
        buffer.close()
        data = fileobj.getbuffer()
        data = gzip.compress(buffer.getvalue(), compresslevel=self._compress_level)

I think I've only seen it from buffer.close(). Still, with the patched 3.13 (so +2 on line numbers) I have seen (at least) another occurrence of the same stacktrace:

    Traceback (most recent call last):
      File "/usr/lib/python3.13/gzip.py", line 361, in close
        fileobj.write(self.compress.flush())
    ValueError: I/O operation on closed file.

It comes from test_send_remote_failover_sync_non_transport_exception_error, which on its own prints these:

    Exception ignored in: <_io.BytesIO object at 0x7f54eae0a6b0>
    BufferError: Existing exports of data: object cannot be re-sized
    Exception ignored in: <_io.BytesIO object at 0x7f54eada7420>
    BufferError: Existing exports of data: object cannot be re-sized

And when run with another test it sometimes prints the traceback. I also see these BufferErrors when handle_transport_fail is called. But I'll need some time to understand what's going on and provide another reproducer.

@danifus (Contributor):

That stacktrace could appear in 2 ways:

  • calling .close() directly
  • .close() being called in the finalizer (__del__()) as the object is garbage collected.

Both would give you that traceback location, but only the occurrence in the finalizer, as the object goes out of scope, will also give you the Exception ignored in: <gzip on 0x7fa4fd99c550> text (exceptions are ignored in finalizers).
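A standalone sketch (not gzip-specific) of why the finalizer path only shows up as an "Exception ignored in ..." message: CPython routes exceptions raised inside __del__ through sys.unraisablehook instead of propagating them.

    class Noisy:
        def __del__(self):
            raise ValueError("I/O operation on closed file.")

    n = Noisy()
    del n   # prints "Exception ignored in: ..." plus the ValueError, instead of raising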

BufferError: Existing exports of data: object cannot be re-sized is kind of related to this. You can't write to or close a BytesIO object if the buffer returned from BytesIO.getbuffer() is still held somewhere.

If you claim the buffer with BytesIO.getbuffer() before closing the GzipFile, then when the GzipFile is closed (or destroyed / finalized as it goes out of scope), data will attempt to be flushed to the BytesIO object in .close() and you will get the BufferError: Existing exports of data: object cannot be re-sized exception. Try doing something like:

    gzf = GzipFile(fileobj=io.BytesIO())
    fileobj = gzf.fileobj
    gzf.close()
    buf = fileobj.getbuffer()  # Grab the buffer after the GzipFile obj has been closed
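For contrast, a small sketch of the failing order (mode and data are illustrative): taking the buffer before close() leaves a live export, so the flush inside close() cannot resize the BytesIO.

    import gzip
    import io

    gzf = gzip.GzipFile(fileobj=io.BytesIO(), mode="wb")
    gzf.write(b"data")
    buf = gzf.fileobj.getbuffer()   # export taken while data is still unflushed
    try:
        gzf.close()                 # close() must write into the BytesIO, which must grow
    except BufferError as exc:
        print(exc)                  # Existing exports of data: object cannot be re-sized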

As for the original issue, are you able to see if code like this:

    if not self.state.should_try():
        buffer.close()  # close buffer in this branch of the if statement too
        logger.error("dropping flushed data due to transport failure back-off")
    else:
        ...

resolves your issue without this weak reference fix applied to your python?

@xrmx (Contributor):

That stacktrace could appear in 2 ways:

  • calling `.close()` directly
  • `.close()` being called in the finalizer (`__del__()`) as the object is garbage collected.

Both would give you that traceback location, but only the occurrence in the finalizer, as the object goes out of scope, will also give you the Exception ignored in: <gzip on 0x7fa4fd99c550> text (exceptions are ignored in finalizers).

BufferError: Existing exports of data: object cannot be re-sized is kind of related to this. You can't write to or close a BytesIO object if the buffer returned from BytesIO.getbuffer() is still held somewhere.

If you claim the buffer with BytesIO.getbuffer() before closing the GzipFile, then when the GzipFile is closed (or destroyed / finalized as it goes out of scope), data will attempt to be flushed to the BytesIO object in .close() and you will get the BufferError: Existing exports of data: object cannot be re-sized exception. Try doing something like:

    gzf = GzipFile(fileobj=io.BytesIO())
    fileobj = gzf.fileobj
    gzf.close()
    buf = fileobj.getbuffer()  # Grab the buffer after the GzipFile obj has been closed

Thanks for the explanation. Adding the explicit buffer.close() on the other path does not remove the Exception ignored in: <gzip on 0x7f551ebd2c80> I'm still seeing with the patch applied. I need to dig more.

As for the original issue, are you able to see if code like this:

    if not self.state.should_try():
        buffer.close()  # close buffer in this branch of the if statement too
        logger.error("dropping flushed data due to transport failure back-off")
    else:
        ...

resolves your issue without this weak reference fix applied to your python?

Nope, adding buffer.close() does not remove all the gzip stacktraces I was seeing before.

@xrmx (Contributor):

Oh, the elephant in the room is that this code is running as a daemon thread, so it may well be that I got these when it gets killed at exit.

@xrmx (Contributor) commented Feb 14, 2025 (edited):

Oh, the elephant in the room is that this code is running as a daemon thread, so it may well be that I got these when it gets killed at exit.

That's not the problem. After poking a bit at the code, the following quote from @danifus started to make sense to me:

If you claim the buffer with BytesIO.getbuffer() before closing the GzipFile, when the GzipFile is closed (or destroyed / finalized as it goes out of scope), data will attempt to be flushed to the BytesIO object in .close() and you will get the BufferError: Existing exports of data: object cannot be re-sized exception.

Adding a data.release() after closing the gzip appears to have fixed the other warnings. Now I still have a single occurrence of the original stacktrace, but it's near a test that now raises ValueError: operation forbidden on released memoryview object because of the memoryview release, so maybe it's poking at internals too.

BTW, the BytesIO doc talks about a view that you should release, but it doesn't say how and it doesn't link to the memoryview doc.
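The view returned by BytesIO.getbuffer() is a memoryview, so, as a small sketch, releasing it explicitly looks roughly like this:

    import io

    bio = io.BytesIO(b"payload")
    view = bio.getbuffer()    # a memoryview over the BytesIO's internal buffer
    # ... use the view ...
    view.release()            # drop the export so the BytesIO can be resized or closed again
    bio.write(b" more")       # would raise BufferError if the view were still alive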

Thanks a lot @cmaloney and @danifus for the help!


@danifus (Contributor):

Glad to hear most of the exceptions are gone.

As for the last one, I think you'll just want to del the instance of the view you got from .getbuffer(). I think your current issue is that there are 2 references to the same memoryview, and your .release() in one location is impacting the other.

Instead, just do something like:

    bio = BytesIO()
    buf = bio.getbuffer()
    del buf  # remove this instance and let the GC release it if no other references exist.

@xrmx (Contributor) commented Feb 17, 2025 (edited):

Glad to hear most of the exceptions are gone.

As for the last one, I think you'll just want to del the instance of the view you got from .getbuffer(). I think your current issue is that there are 2 references to the same memoryview, and your .release() in one location is impacting the other.

Instead, just do something like:

    bio = BytesIO()
    buf = bio.getbuffer()
    del buf  # remove this instance and let the GC release it if no other references exist.

The last one is from a test mocking the function that gets the view as an argument and trying to read it after it has already been released. If I change the code as you suggested, running this test I get EOFError: Compressed file ended before the end-of-stream marker was reached from /usr/lib/python3.13/gzip.py:616.

@danifus (Contributor):

Ah ok, that does sound a bit more involved. You're right, the .release() method is the right way to go.

As far as this issue is concerned, I think the .close() method doesn't have any issues, and we should add some documentation saying that GzipFile should be closed explicitly to avoid a potential race condition during garbage collection (mitigated with this patch, but it may be subject to resurrection if the GC algorithm is changed or a different Python implementation uses a different GC). What do you guys think?
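A minimal sketch of the explicit-close pattern such documentation would recommend, using the usual context-manager idiom:

    import gzip
    import io

    buf = io.BytesIO()
    with gzip.GzipFile(fileobj=buf, mode="wb") as gzf:
        gzf.write(b"payload")       # flushed and closed deterministically on exit
    compressed = buf.getvalue()     # safe to read back once the GzipFile is closed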

@cmaloney (Contributor, Author):

I think it's still worthwhile to fix, since CPython shouldn't error if a GzipFile doesn't explicitly have .close() called. That's likely if an exception is thrown and code isn't using with / a context manager.

Some documentation improvements would also be good, I think. .close() is definitely important for getting correct behavior.

@xrmx (Contributor):

As far as this issue is concerned, I think the .close() method doesn't have any issues, and we should add some documentation saying that GzipFile should be closed explicitly to avoid a potential race condition during garbage collection (mitigated with this patch, but it may be subject to resurrection if the GC algorithm is changed or a different Python implementation uses a different GC). What do you guys think?

FYI, without the gzip patch and with the explicit cleanup in my code I still see a ton of ValueError: I/O operation on closed file. warnings, and it's not trivial to track them down 😓

@cmaloney (Contributor, Author) commented Feb 26, 2025 (edited):

@vstinner I think this is ready for review. (Not sure why the issue-number bot didn't link it back to the original issue.) This is the exception-on-close issue that became visible with your change to log exceptions if close() fails (GH-12786, which was part of https://bugs.python.org/issue18748 / gh-62948).

Lib/gzip.py (Outdated)

@@ -226,7 +227,8 @@ def __init__(self, filename=None, mode=None,
                          0)
         self._write_mtime = mtime
         self._buffer_size = _WRITE_BUFFER_SIZE
-        self._buffer = io.BufferedWriter(_WriteBufferStream(self),
+        write_wrap = _WriteBufferStream(weakref.proxy(self))
Member:

Would it be possible to modify _WriteBufferStream to use a simpler weakref.ref() to store gzip_file instead? This object is private so we can change its API.

Something like:

    class _WriteBufferStream(io.RawIOBase):
        """Minimal object to pass WriteBuffer flushes into GzipFile"""
        def __init__(self, gzip_file):
            self.gzip_file = weakref.ref(gzip_file)

        def write(self, data):
            gzip_file = self.gzip_file()
            if gzip_file is None:
                raise RuntimeError("lost gzip_file")
            return gzip_file._write_raw(data)
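For context, a standalone sketch (unrelated to gzip itself) of the difference between weakref.ref and weakref.proxy that this suggestion trades on: a ref must be called to get the referent and returns None once it is gone, while a proxy behaves like the object and raises ReferenceError once it is dead.

    import weakref

    class Target:
        pass

    t = Target()
    r = weakref.ref(t)     # call r() to get the referent, or None once it is collected
    p = weakref.proxy(t)   # acts like the object itself while it is alive

    assert r() is t
    del t                  # in CPython the instance is collected immediately here
    assert r() is None     # the ref makes the "is it still alive?" check explicit
    # repr(p)              # the proxy would raise ReferenceError at this point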

@cmaloney (Contributor, Author):

Updated to your suggestion here. I spent a bit of time trying to remove _WriteBufferStream and just use weakref.proxy, but arranging for GzipFile.write to have a different write function from the underlying self._buffer (a BufferedWriter, which needs to use self._write_raw) led me in a lot of circles. Of the things I tried, just keeping _WriteBufferStream was the most understandable.

@vstinner (Member) left a comment:

LGTM

@vstinner merged commit 7f39137 into python:main on Feb 28, 2025

43 checks passed

@vstinner added the "needs backport to 3.13" (bugs and security fixes) label on Feb 28, 2025
@miss-islington-app:

Thanks @cmaloney for the PR, and @vstinner for merging it 🌮🎉. I'm working now to backport this PR to: 3.13.
🐍🍒⛏🤖


miss-islington pushed a commit to miss-islington/cpython that referenced this pull request on Feb 28, 2025:

A reference loop was resulting in the `fileobj` held by the `GzipFile` being closed before the `GzipFile`. The issue started with pythongh-89550 in 3.12, but was hidden in most cases until 3.13 when pythongh-62948 made it more visible.
(cherry picked from commit 7f39137)
Co-authored-by: Cody Maloney <cmaloney@users.noreply.github.com>
@bedevere-app:

GH-130669 is a backport of this pull request to the 3.13 branch.

@bedevere-app bot removed the "needs backport to 3.13" (bugs and security fixes) label on Feb 28, 2025
@vstinner added the "needs backport to 3.12" (only security fixes) label on Feb 28, 2025
@miss-islington-app:

Thanks @cmaloney for the PR, and @vstinner for merging it 🌮🎉. I'm working now to backport this PR to: 3.12.
🐍🍒⛏🤖 I'm not a witch! I'm not a witch!

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request on Feb 28, 2025:

A reference loop was resulting in the `fileobj` held by the `GzipFile` being closed before the `GzipFile`. The issue started with pythongh-89550 in 3.12, but was hidden in most cases until 3.13 when pythongh-62948 made it more visible.
(cherry picked from commit 7f39137)
Co-authored-by: Cody Maloney <cmaloney@users.noreply.github.com>
@bedevere-app:

GH-130670 is a backport of this pull request to the 3.12 branch.

@bedevere-app bot removed the "needs backport to 3.12" (only security fixes) label on Feb 28, 2025
@vstinner (Member):

Loop found by @danifus, bug found and explored by @xrmx.

PR merged, thank you all! I backported the fix to 3.12 and 3.13.


vstinner pushed a commit that referenced this pull request on Feb 28, 2025:

…130670) gh-129726: Break `gzip.GzipFile` reference loop (GH-130055)
A reference loop was resulting in the `fileobj` held by the `GzipFile` being closed before the `GzipFile`. The issue started with gh-89550 in 3.12, but was hidden in most cases until 3.13 when gh-62948 made it more visible.
(cherry picked from commit 7f39137)
Co-authored-by: Cody Maloney <cmaloney@users.noreply.github.com>

vstinner pushed a commit that referenced this pull request on Feb 28, 2025:

…130669) gh-129726: Break `gzip.GzipFile` reference loop (GH-130055)
A reference loop was resulting in the `fileobj` held by the `GzipFile` being closed before the `GzipFile`. The issue started with gh-89550 in 3.12, but was hidden in most cases until 3.13 when gh-62948 made it more visible.
(cherry picked from commit 7f39137)
Co-authored-by: Cody Maloney <cmaloney@users.noreply.github.com>
@cmaloney deleted the gh-129796-gzipclose branch on March 7, 2025 05:26
seehwan pushed a commit to seehwan/cpython that referenced this pull request on Apr 16, 2025:

A reference loop was resulting in the `fileobj` held by the `GzipFile` being closed before the `GzipFile`. The issue started with pythongh-89550 in 3.12, but was hidden in most cases until 3.13 when pythongh-62948 made it more visible.
Reviewers

@xrmx approved these changes
@vstinner approved these changes

4 participants: @cmaloney, @danifus, @xrmx, @vstinner
