Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

gh-109052: Use the base opcode when comparing code objects#109107

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged
corona10 merged 3 commits intopython:mainfromgaogaotiantian:fix-systrace-replace
Sep 9, 2023

Conversation

@gaogaotiantian
Copy link
Member

@gaogaotiantiangaogaotiantian commentedSep 7, 2023
edited by gvanrossum
Loading

After instrumentation, the opcode of the code object would becomeINSTRUMENTED_LINE orINSTRUMENTED_INSTRUCTION, we should make sure to get the actual opcode when we compare them.

This also belongs togh-107265.

Nagi-ovo and nanxiaoshu reacted with hooray emojiyunline reacted with rocket emoji
_Py_CODEUNITco_instr=_PyCode_CODE(co)[i];
_Py_CODEUNITcp_instr=_PyCode_CODE(cp)[i];
uint8_tco_code=co_instr.op.code;
uint8_tco_code=_Py_GetBaseOpcode(co,i);
Copy link
Member

@corona10corona10Sep 8, 2023
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I am not sure that calling this API at this moment is correct.
Because it will call _PyOpcode_Deopt twice in the end.

See L1812, what will happen

Copy link
Member

@corona10corona10Sep 8, 2023
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Or move co_code or cp_code = _PyOpcode_Deopt[cocode or cp_code]; only if cp_code or co_code is ENTER_EXECUTOR at L1805 or L1814

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

_PyOpcode_Deopt[] is idempotent, so it is harmless.

I think we should change this code to be exactly the same as whatdeopt_code() does, since that's what we're after -- we want to compareco.co_code tocp.co_code without creating those objects.

I also think that_Py_GetBaseOpcode() should mapENTER_EXECUTOR so we wouldn't have to worry about it here at all.

Copy link
Member

@corona10corona10Sep 9, 2023
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

FYI, If we decide to fix_Py_GetBaseOpcode(), we can update#107265 too
cc@brandtbucher

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

And maybe we can close#107265 in one short.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

IIRC there are also cases where we absolutely want the actual opcode, so in those cases we still need to handle ENTER_EXECUTOR explicitly. But yeah, fixing this in_Py_GetBaseOpcode() might reduce the special cases elsewhere. In a new PR, please. :-)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

IMPORTANT! Do not attempt to "fix"_Py_GetBaseOpcode() by addingENTER_EXECUTOR handling. This will leave theoparg incorrect. That's why that bullet ingh-107265 usesstrikethrough.

corona10 reacted with thumbs up emoji
uint8_tco_code=_Py_GetBaseOpcode(co,i);
uint8_tco_arg=co_instr.op.arg;
uint8_tcp_code=cp_instr.op.code;
uint8_tcp_code=_Py_GetBaseOpcode(cp,i);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

ditto

Copy link
Member

@corona10corona10 left a comment
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Looks better?

diff --git a/Objects/codeobject.c b/Objects/codeobject.cindex 70a0c2ebd6..4180d216a3 100644--- a/Objects/codeobject.c+++ b/Objects/codeobject.c@@ -1788,20 +1788,24 @@ code_richcompare(PyObject *self, PyObject *other, int op)         if (co_code == ENTER_EXECUTOR) {             const int exec_index = co_arg;             _PyExecutorObject *exec = co->co_executors->executors[exec_index];-            co_code = exec->vm_data.opcode;+            co_code = _PyOpcode_Deopt[exec->vm_data.opcode];             co_arg = exec->vm_data.oparg;         }+        else {+            co_code = _Py_GetBaseOpcode(co, i);+        }         assert(co_code != ENTER_EXECUTOR);-        co_code = _PyOpcode_Deopt[co_code];         if (cp_code == ENTER_EXECUTOR) {             const int exec_index = cp_arg;             _PyExecutorObject *exec = cp->co_executors->executors[exec_index];-            cp_code = exec->vm_data.opcode;+            cp_code = _PyOpcode_Deopt[exec->vm_data.opcode];             cp_arg = exec->vm_data.oparg;         }+        else {+            cp_code = _Py_GetBaseOpcode(cp, i);+        }         assert(cp_code != ENTER_EXECUTOR);-        cp_code = _PyOpcode_Deopt[cp_code];         if (co_code != cp_code || co_arg != cp_arg) {             goto unequal;

@bedevere-bot
Copy link

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phraseI have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

And if you don't make the requested changes, you will be put in the comfy chair!

@gaogaotiantian
Copy link
MemberAuthor

I don't think the current code is "wrong" functionality wise, asdeopt is safe for base instructions - you can do multiple deopts and it'll just be the same base opcode. However, it did deopt twice for the opcode, so I moved the deopt into theENTER_EXECUTOR check. Notice that if the opted opcode isENTER_EXECUTOR, it would still beENTER_EXECUTOR after deopted.

I'm not sure ifENTER_EXECUTOR will be instrumented by instruction instrumentation (I would guess so), if it will, then we have to deopt the instruction beforeENTER_EXECUTOR check, or it won't recognize it.

Copy link
Member

@vstinnervstinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I dislike this approach, iterating on each bytecode at a specific index, deoptimize it, and then look for the next bytecode. The problem is that there are cases where i++ doesn't give you a bytecode but a cache :-( You have to take care of ENTER_EXECUTOR. The exact implementation of bytecode changes often these days, and it's hard to keep all functions consuming bytecode to remain correct.

Would it be possible instead to write a function which creates a copy of the deoptimized code in one shot? Something like PyCode_GetOriginalBytecode() which would create a bytes object.

@vstinner
Copy link
Member

The problem is that there are cases where i++ doesn't give you a bytecode but a cache :-(

I'm thinking at buggh-107082 which has been fixed by commit233b878. cc@gvanrossum

@gaogaotiantian
Copy link
MemberAuthor

I dislike this approach, iterating on each bytecode at a specific index, deoptimize it, and then look for the next bytecode. The problem is that there are cases where i++ doesn't give you a bytecode but a cache :-( You have to take care of ENTER_EXECUTOR. The exact implementation of bytecode changes often these days, and it's hard to keep all functions consuming bytecode to remain correct.

Would it be possible instead to write a function which creates a copy of the deoptimized code in one shot? Something like PyCode_GetOriginalBytecode() which would create a bytes object.

I'm not saying this is the best way to do it (it was there already to compare the code objects), but the logic here is correct (at least for now).i is specifically taken care of to avoid getting cache instructions in the loop.

A function likePyCode_GetOriginalBytecode() would just put this logic into another abstracted function right? We still need to keep track of everything when instructions changed. It just brings the problem to another person/piece of code.

@vstinner
Copy link
Member

A function like PyCode_GetOriginalBytecode() would just put this logic into another abstracted function right? We still need to keep track of everything when instructions changed. It just brings the problem to another person/piece of code.

Correct, but it's easier to implement when you iterate on a single code object, and this function can be reused in other places, and it can be moved closer to functions which modify bytecode.

@gaogaotiantian
Copy link
MemberAuthor

Also I believe this piece of code was originally written to avoid allocating a new piece of memory. A new function returning bytes would contradict that - not saying that's the wrong way to go, just to mention.

@vstinner
Copy link
Member

Also I believe this piece of code was originally written to avoid allocating a new piece of memory. A new function returning bytes would contradict that - not saying that's the wrong way to go, just to mention.

Honestly, I don't think that code1==code2 operation matters for Python performance. It's usually used in the compiler/parser, not "at runtime". It shouldn't be part of "hot code".

@vstinner
Copy link
Member

By the way, for me, it's surprising that _Py_GetBaseOpcode() can still return ENTER_EXECUTOR and require to get the executor (opcode, oparg). Each caller has to take care of that. Would it be possible to have a function which returns theoriginal base case, so don't return ENTER_EXECUTOR?

Maybe an "iterator-like" API which also gives an offset to the next "original" instruction (the next instruction which, one decoded, will give the original bytecode)?

@gaogaotiantian
Copy link
MemberAuthor

By the way, for me, it's surprising that _Py_GetBaseOpcode() can still return ENTER_EXECUTOR and require to get the executor (opcode, oparg). Each caller has to take care of that. Would it be possible to have a function which returns theoriginal base case, so don't return ENTER_EXECUTOR?

Maybe an "iterator-like" API which also gives an offset to the next "original" instruction (the next instruction which, one decoded, will give the original bytecode)?

I'm just saying, this code is explicitly converted to the current status from what you wanted inbd2e47c ._PyCode_GetCode is the utility function you described and it was used before the commit :)

Do you think we should revert that change? I would guess there might be reasons behind it.

@corona10
Copy link
Member

corona10 commentedSep 8, 2023
edited
Loading

Honestly, I don't think that code1==code2 operation matters for Python performance. It's usually used in the compiler/parser, not "at runtime". It shouldn't be part of "hot code"

I don't think that copying the original object is not beneficial for the scenario as you commented.
Every time we have to optimize a code object, do we have to copy the original object? I think that comparing the code object is more rare case than optimizing itself.

Copy link
Member

@corona10corona10 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Fixing itself looks good to me.
For overall policy, it would be a different issue.
I will leave the rest of review to@gvanrossum

@vstinner
Copy link
Member

I prefer to abstraint myself from reviewing this PR :-) I didn't follow recent developments about optimization, so I don't have a strong opinion. I just shared my opinion and feedback on the recent code changes and issues that I saw ;-)

@vstinner
Copy link
Member

I confirm that this change fix issue#109052 crash.

Without this change, Python does crash with the following command:

$ ./python -m test test_sys_settrace -R 3:3(...)...Fatal Python error: Segmentation faultCurrent thread 0x00007f9dce557740 (most recent call first):  File "/home/vstinner/python/main/Lib/test/test_sys_settrace.py", line 1912 in trace  File "/home/vstinner/python/main/Lib/test/test_sys_settrace.py", line 2162 in test_jump_backwards_into_while_block  File "/home/vstinner/python/main/Lib/test/test_sys_settrace.py", line 1975 in run_test  File "/home/vstinner/python/main/Lib/test/test_sys_settrace.py", line 2008 in test(...)

With this change, the test does pass as expected, and Python doesn't crash.

$ ./python -m test test_sys_settrace -R 3:3(...)Result: SUCCESS

Copy link
Member

@brandtbucherbrandtbucher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I think this is a good fix for the problem at hand. Zooming out, I agree that we should put all of this "deopting" logic in one place (we're pretty close with_Py_GetBaseOpcode) and actually use it everywhere.

I'll merge in a couple of hours if nobody has any other concerns.

constintexec_index=co_arg;
_PyExecutorObject*exec=co->co_executors->executors[exec_index];
co_code=exec->vm_data.opcode;
co_code=_PyOpcode_Deopt[exec->vm_data.opcode];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Nice catch. I thinkdeopt_code should be doing this as well (it can't happen yet because these are allJUMP_BACKWARDs, which don't have specializations, but it could later when we start sticking executors in arbitrary locations).

I agree with@gvanrossum though that_Py_GetBaseOpcode should just be the one place where all of this logic is kept.@gaogaotiantian, would you be interested in teaching_Py_GetBaseOpcode aboutENTER_EXECUTOR and cleaning up all of the places where we're using that and_PyOpcode_Deopt to do the right thing? I can do it if not.

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Yes, I can work on that.

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

So@gvanrossum mentioned we should NOT handleENTER_EXECUTOR in_Py_GetBaseOpcode because ofoparg. For this matter, it there any work to do?_Py_GetBaseOpcode already handles instrumentation and_PyOpcode_Deopt.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

See the list of bullets ingh-107265.

@corona10
Copy link
Member

I'll merge in a couple of hours if nobody has any other concerns.

Let's just merge!

@bedevere-app
Copy link

GH-112329 is a backport of this pull request to the3.12 branch.

gaogaotiantian added a commit to gaogaotiantian/cpython that referenced this pull requestNov 23, 2023
Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment

Reviewers

@vstinnervstinnervstinner left review comments

@corona10corona10corona10 approved these changes

@brandtbucherbrandtbucherbrandtbucher approved these changes

@markshannonmarkshannonAwaiting requested review from markshannonmarkshannon is a code owner

@gvanrossumgvanrossumAwaiting requested review from gvanrossum

Assignees

No one assigned

Labels

None yet

Projects

None yet

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

6 participants

@gaogaotiantian@bedevere-bot@vstinner@corona10@gvanrossum@brandtbucher

[8]ページ先頭

©2009-2025 Movatter.jp