Uh oh!
There was an error while loading.Please reload this page.
- Notifications
You must be signed in to change notification settings - Fork33.7k
gh-113190: Reenable non-debug interned string cleanup#113601
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
gh-113190: Reenable non-debug interned string cleanup#113601
Uh oh!
There was an error while loading.Please reload this page.
Conversation
Eclips4 commentedDec 31, 2023 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
I added |
eduardo-elizondo commentedDec 31, 2023 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
Thanks!! Will ping you once all the tests are passing to move this into safe-to-merge mode 🙂 |
eduardo-elizondo commentedDec 31, 2023 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
@Eclips4 tests are passing! Could you help me by adding the "skip news" label (since it's probably not required) and removing the "DO-NOT-MERGE" label? Thanks!! |
eduardo-elizondo commentedJan 4, 2024
cc@nascheme to take a look when you have a chance! |
nascheme commentedJan 4, 2024
It would be nice to do a better cleanup but I'm concerned about how safe this actually is. Since we don't have an accurate refcnt for these strings, we are assuming that no one is actually still using them. It seems possible that some extensions or programs embedding Python could hang on to interned strings after Maybe I'm being too paranoid? I don't know how we could actually fix this though. Keeping hold of and using memory allocated by the Python runtime after My gut feeling is we should proceed cautiously, even though we would like to fix the leak. Maybe we could enable this for alphas and then turn it off again for release? Maybe we need to wait until we get |
eduardo-elizondo commentedJan 5, 2024 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
@nascheme I agree with what you mentioned above and I was paranoid of that scenario too, hence the disabling of the code. Purely going off from the docs, I don't think there's anywhere where Python provides an explicit expectation of what the behavior is of memory that survives a call to Py_Finalize. Furthermore, in the C-API it is specified that Perhaps we should be more explicit about this in the docs (maybe in the "Bugs and Caveats" section of Py_Finalize)? i.e python can't guarantee correctness for any memory allocated during a python initialization and used in any subsequent initializations. Thoughts? Regardless, I completely agree that we should still be careful about putting this out there. Stating with alphas sounds like a good idea to get some early signal. Let me know what's the best way to only enable this only for alphas. |
eduardo-elizondo commentedJan 5, 2024 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
I think we are already there? Here's a quick test program that I ran: |
vstinner commentedJan 8, 2024 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
It's mostly done for one year:https://mail.python.org/archives/list/python-dev@python.org/thread/E4C6TDNVDPDNNP73HTGHN5W42LGAE22F/ "Mostly" means: some remaining stdlib C extensions still implement their own static types which are not cleared by
This change is a backward incompatible. I suggest to just document well the change in What's New in Python 3.13:
In short, a program must not keep references to any Python object between In general, I suggest to reset all pointers to Python objects at each Py_Initialize() call. Or said differently: don't storeany pointer to Python objects after Py_Finalize() :-) Examples of exceptions:
|
eduardo-elizondo commentedJan 8, 2024 • edited by vstinner
Loading Uh oh!
There was an error while loading.Please reload this page.
edited by vstinner
Uh oh!
There was an error while loading.Please reload this page.
For sure, I can update the PR with the details with what you mentioned here!
Given this guarantee, I think it's safe to assume that if we have any Python object leak it's a user induced leak, right? i.e. the runtime itself will not cause incorrect behavior. I will include some wording around this so that users can isolate the surface are when looking for issues (i.e look at the third-party extensions not at the runtime).
That's my take too - though it's not specified in the docs! Any thoughts of also updating the Py_Finalize/Py_Initialize documentation as part of this PR to include stricter language like this? |
vstinner commentedJan 8, 2024
I looked again at the code. It has a complicated history:
If there are applications relying on interned strings to survive between multiple Py_Initialize()/Py_Finalize() cycles, they should already be affected by Python 3.10 and 3.11. By the way, Python 3.11 added _PyTypes_FiniTypes() to "clear" static types in Py_Finalize(). |
bedevere-bot commentedAug 12, 2024
Thanks@eduardo-elizondo for the PR, and@encukou for merging it 🌮🎉.. I'm working now to backport this PR to: 3.12, 3.13. |
Sorry,@eduardo-elizondo and@encukou, I could not cleanly backport this to |
Sorry,@eduardo-elizondo and@encukou, I could not cleanly backport this to |
eduardo-elizondo commentedAug 16, 2024
Thanks@encukou !! |
neonene commentedOct 14, 2024
Is there any chance this will be backported to 3.13.1? |
…ythonGH-113601)(cherry picked from commit3203a74)Co-authored-by: Eddie Elizondo <eelizondo@meta.com>
GH-130581 is a backport of this pull request to the3.13 branch. |
…ythonGH-113601)(cherry picked from commit3203a74)Co-authored-by: Eddie Elizondo <eelizondo@meta.com>
GH-130582 is a backport of this pull request to the3.12 branch. |
hugovk commentedFeb 26, 2025
The conflict preventing Miss Islington from backporting#113601 was the addition to I've made the backport PRs. Do we need any other docs in them? |
encukou commentedFeb 26, 2025
Sorry, I missed the notification! See this paragraph in the added docs:
In other words, it's possible that some extensions will start crashing after I don't think we should backport to bugfix branches, unfortunately. |
hugovk commentedFeb 26, 2025
Okay, I'll close the backports. |
…) in sanitizer runsUnder ASan/LSan with Python 3.12 on Ubuntu 22.04/24.04, the cleanup testsreport direct leaks rooted in CPython’s unicode interning path (frames viaPyUnicode_New) during module import in our embedded init.This is a known CPython issue (python/cpython#113190). Interned-unicodecleanup was re-enabled in 3.13 (python/cpython#113601) but not backportedto 3.12. To keep CI signal meaningful while we still test against 3.12,enable a narrow LSan suppression for PyUnicode_New only, and only on 3.12.Notes:- detect_leaks=1 remains enabled; the suppression is limited to PyUnicode_New and won’t mask leaks in PythonQt.- Remove once CI moves off Python 3.12.Refs:python/cpython#113190,python/cpython#113601
…) in sanitizer runsUnder ASan/LSan with Python 3.12 on Ubuntu 22.04/24.04, the cleanup testsreport direct leaks rooted in CPython’s unicode interning path (frames viaPyUnicode_New) during module import in our embedded init.This is a known CPython issue (python/cpython#113190). Interned-unicodecleanup was re-enabled in 3.13 (python/cpython#113601) but not backportedto 3.12. To keep CI signal meaningful while we still test against 3.12,enable a narrow LSan suppression for PyUnicode_New only, and only on 3.12.Notes:- detect_leaks=1 remains enabled; the suppression is limited to PyUnicode_New and won’t mask leaks in PythonQt.- Remove once CI moves off Python 3.12.Refs:python/cpython#113190,python/cpython#113601
Uh oh!
There was an error while loading.Please reload this page.
InGH-19474 we introduced immortalized interned strings. However, at the time, we didn't have a strict guarantee that a runtime shutdown (i.e
Py_Finalize()) would completely cleanup all the allocated PyObjects (i.e a simple example of./python -X showrefcount -c 'import itertools'would show leaks).Today, these leaks have been largely fixed and there have been stricter guarantees into how obmalloc and the interpreter state works after a runtime shutdown. Given this, it should be safe to re-enable the interned string cleanup and revert the structseq test that had issues with this leaks before. This test should show the correctness of this code now.