Uh oh!
There was an error while loading.Please reload this page.
- Notifications
You must be signed in to change notification settings - Fork32.1k
gh-121954: Don't mutate tuples in _PyCode_New#122180
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
base:main
Are you sure you want to change the base?
Conversation
Py_ssize_t size = PyTuple_GET_SIZE(orig_tuple); | ||
new_tuple = PyTuple_New(size); | ||
if (!new_tuple) { | ||
Py_DECREF(new_item); | ||
return -1; | ||
} | ||
for (Py_ssize_t i = size; --i >= 0; ) { | ||
_PyTuple_ITEMS(new_tuple)[i] = Py_NewRef( | ||
_PyTuple_ITEMS(orig_tuple)[i]); | ||
} | ||
*new_tuple_p = new_tuple; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Copying the tuple should be a rare event, mostly for creating code from user data.
Should it use some kind of faster-cpython stat counter?
(Symtable change looks fine to me, would rather have someone from faster-cpython review the rest, once the PR is ready.) |
encukou commentedJul 27, 2024 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
Thanks! (I'm aware of some of the problems in the PR; since it's not a 3.13 regression I'll get back to this after 3.13rc1 is out.) |
I'm blocked on not understanding some of the internals for free-threaded builds. The free-threaded build tries to “intern” all code constants, but it doesn't -- see for example this, adapted from
Which constants aremeant to be interned here? Worse, when a container (i.e. tuple) is interned, it's latercleared in the @colesbury, do you know? |
colesbury commentedAug 5, 2024 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
All types of code constants (frozenset, slices, strings, numbers) are interned, but there are some code paths that skip interning and immortalization:
I think
We are deallocating the immortalized constants near the end of the interpreter shutdown process. Your comment is about The constants form an acyclic graph so we have to be careful about the order in which we free them. The emptying of containers breaks up this graph, which allows us to call
|
I'll take a closer look at the free-threading test failures |
@encukou, sorry I think the reported leaks were all pre-existing. They seem to be fixed by: When I merge that PR into yours, all the tests pass. Would you please review that PR when you get a chance? |
Thank you! That makes things much clearer.
Those are constants used by the compiler and marshal. But, users can put anything in For example: deff():print(())code=f.__code__code=code.replace(co_consts=(None, []))exec(code) Would it make sense to limit this interning to the marshallable immutable types only?
The frozenset code seems to rely on the order of elements, so it might intern two distinct but equal sets. Would that be a problem? (If so I'll try to find an example.)
The danger I see is that the values interned here are assumed to be “owned” by this machinery, but it seems they could be shared with other code, and perhaps needed in some later stage of interpreter shutdown. |
I'm concerned that allowing non-immortal or mutable types in constants will cause problems downstream in the interpreter. My preference would be to more thoroughly validate the replacement
This is generally our requirement for APIs, but I still thinks it's a good goal -- the more footguns we remove, the better -- but I think we'd be better off if we get there by more thorough validation, rather than expanding scope/support.
I think the behavior matches logic in the bytecode compiler, which also de-dupes constants. We only de-dupe two sets if they contain the same elements and their iteration orders match. The iteration order requirement is not strictly necessary, but it simplifies the comparison logic.
This happens very late in interpreter shutdown for that reason. The thread state and interpreter state are already cleared and static types have been finalized. |
Sounds reasonable. I filed#122850, to not block this PR.
Sure, but people could reasonably expect a “copy-and-patch” operation on
I see. I'll add a comment to make it less likely for someone to add/rearrange the finalization phases. (Leaving now; I'll update the PR next week.) |
bedevere-bot commentedAug 13, 2024
Today, I wasn't able to figure out the refleak free-threaded buildbots found in |
@encukou let me know if you'd like me to take a look at the remaining refleak |
Don't spend hours on it, but if there's a quick trick or tool you know of, or if can point to where you'd look next, I'd appreciate it. |
This hides the refleak: diff --git a/Lib/test/test__interpreters.py b/Lib/test/test__interpreters.pyindex f493a92e0dd..9ffec59c8bf 100644--- a/Lib/test/test__interpreters.py+++ b/Lib/test/test__interpreters.py@@ -26,7 +26,7 @@ def _captured_script(script): indented = script.replace('\n', '\n ') wrapped = dedent(f""" import contextlib- with open({w}, 'w', encoding="utf-8") as spipe:+ with open({w}, 'w', -1, "utf-8") as spipe: with contextlib.redirect_stdout(spipe): {indented} """) |
Uh oh!
There was an error while loading.Please reload this page.
Change
_PyCode_New
(intern_strings
&intern_constants
) to return new tuplesrather than mutating their arguments.
If possible, return the argument rather than create a copy.
In order to make copying less common, intern strings destined for
_PyCode_New
earlier -- in:_PyCode_ConstantKey
, used in the compiler to de-duplicate constants;symtable_implicit_arg
, which generates the names.0
,.1
etc..0
, is baked in as a_Py_STR
This means all string constants produced by the compiler are interned;
previously it was only names and identifier-like strings.
(We alreadyhash all code constants in order to de-duplicate them,
and we treat deleting code objects as rare event. Also, constants were
interned in the free-threaded build.)
Remove a test that verified a string constant isnot interned.
Merge
validate_and_copy_tuple
, the routine for cleaning up user inputfor
code.__new__
, into the internalintern_names
.Note that this changes the exception type for non-string names
in
PyCode_NewWithPosOnlyArgs
and such fromSystemError
toTypeError
;IMO that doesn't need documenting.
Rename
intern_strings
tointern_names
to further distinguish it fromintern_constants
. Unlike constants, names are immortalized.