Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

gh-125038: Crash after genexpr.gi_frame.f_locals manipulations is fixed#125178

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged

Conversation

@efimov-mikhail
Copy link
Member

@efimov-mikhailefimov-mikhail commentedOct 9, 2024
edited
Loading

Some iterator checks are added for _FOR_ITER, _FOR_ITER_TIER_TWO and INSTRUMENTED_FOR_ITER bytecode implementations. TypeError is raised in case of tp_iternext == NULL. Tests on generator modifying through gi_frame.f_locals are added, both to genexpr generators and function generators.

I've added some test cases to emphasize current code behavior and save it explicitly in tests.
IMHO, there is still a room to improvement.

Previous PR becomes incorrect (#125051) after my typo in git commands.
I'm sorry about that.

@JelleZijlstra, Could you please review this again?
Moreover, label "needs backport to 3.13" should be added once again.

Eclips4 reacted with heart emoji
…is fixedSome iterator checks are added for _FOR_ITER,_FOR_ITER_TIER_TWO and INSTRUMENTED_FOR_ITER bytecode implementations.TypeError is raised in case of tp_iternext == NULL.Tests on generator modifying through gi_frame.f_locals are added, bothto genexpr generators and function generators.
…e-125038.ffSLCz.rstCo-authored-by: Kirill Podoprigora <kirill.bast9@mail.ru>
if (iternext==NULL) {
_PyErr_Format(tstate,PyExc_TypeError,
"'for' requires an object with "
"__iter__ method, got %.100s",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

This error is incorrect as it's looking for__next__, not__iter__. I think we can use the same error asbuiltins.next():

        PyErr_Format(PyExc_TypeError,             "'%.200s' object is not an iterator",            Py_TYPE(it)->tp_name);

Eclips4 and efimov-mikhail reacted with thumbs up emoji
@markshannon
Copy link
Member

I think this is wrong approach. Rather than modifyingFOR_ITER, etc to handle an incorrect VM state, we should generate code that cannot get into that state, in the bytecode compiler.

Currently, the bytecode compiler assumes that local variables in generator expressions cannot be modified in
This is no longer true, but it remains true that the evaluation stack cannot be modified.

The bytecode for(x for x in ...) starts with:

      0 RETURN_GENERATOR      2 POP_TOP      4 RESUME                   0      6 LOAD_FAST                0 (.0)>>    8 FOR_ITER                 6 (to 24)

Compare that with the code forfor x in ....:

      0 RESUME                   0      2 LOAD_FAST                0 (...)      4 GET_ITER>>    6 FOR_ITER                 4 (to 18)      10 STORE_FAST               1 (x)

Note the additionalGET_ITER as the bytecode compiler doesn't know what... is.

I think adding theGET_ITER before theFOR_ITER would gain the robustness required in this case, without complicating the iteration code.

@efimov-mikhail
Copy link
MemberAuthor

So, basically, we want to provide equal results for such code fragments:

  1. Iterable in f_locals['.0']
g = (x for x in range(10))g.gi_frame.f_locals['.0'] = range(20)
  1. Iterator in f_locals['.0']
g = (x for x in range(10))g.gi_frame.f_locals['.0'] = iter(range(20))

Have I got your idea correct on high level?

IMHO, this will change be convenient.

@markshannon
Copy link
Member

Have I got your idea correct on high level?

Yes, I think those should provide the same results.

efimov-mikhail reacted with thumbs up emoji

@markshannon
Copy link
Member

We also want to avoid emittingGET_ITER twice. It is currently emitted in the scope surrounding the generator expression.
We should move it into the generator expression.

In other words, this function:

deff(s):return (xforxins)

currently generates the following bytecode:

  1           RESUME                   0  2           LOAD_CONST               1 (<code object <genexpr> at 0x7fc1c815e6a0, file "<python-input-0>", line 2>)              MAKE_FUNCTION              LOAD_FAST_TEMP           0 (s)              GET_ITER              CALL                     0              RETURN_VALUEDisassembly of <code object <genexpr> at 0x7fc1c815e6a0, file "<python-input-0>", line 2>:   2           RETURN_GENERATOR               POP_TOP       L1:     RESUME                   0               LOAD_FAST                0 (.0)       L2:     FOR_ITER                 6 (to L3)               ...

I think it should generate:

  1           RESUME                   0  2           LOAD_CONST               1 (<code object <genexpr> at 0x7fc1c815e6a0, file "<python-input-0>", line 2>)              MAKE_FUNCTION              LOAD_FAST_TEMP           0 (s)              CALL                     0              RETURN_VALUEDisassembly of <code object <genexpr> at 0x7fc1c815e6a0, file "<python-input-0>", line 2>:   2           RETURN_GENERATOR               POP_TOP       L1:     RESUME                   0               LOAD_FAST                0 (.0)               GET_ITER       L2:     FOR_ITER                 6 (to L3)               ...

This is a behavior change, but I think a correct one. Iff(2) is called, we get aTypeError, but this seems wrong. The error should occur once the generator expression is iterated over.

efimov-mikhail reacted with thumbs up emoji

@efimov-mikhail
Copy link
MemberAuthor

We also want to avoid emittingGET_ITER twice. It is currently emitted in the scope surrounding the generator expression. We should move it into the generator expression.

Agree.

This is a behavior change, but I think a correct one. Iff(2) is called, we get aTypeError, but this seems wrong. The error should occur once the generator expression is iterated over.

Could you please give some additional information about this behavior change?
Some code snippets, which results will be changed.

@JelleZijlstra
Copy link
Member

Could you please give some additional information about this behavior change? Some code snippets, which results will be changed.

Currently, this happens:

>>> (xfor xin1)Traceback (most recent call last):  File "<python-input-0>", line 1, in <module>    (x for x in 1)                ^TypeError: 'int' object is not iterable

With Mark's proposed change, this error would be thrown only when the generator is iterated over.

I think either behavior is fine and I agree with Mark that the proposed change makes for a more elegant implementation, but it is a change in core language behavior that I could see impacting some users (e.g., if you put a try-except around the construction of the genexp), so I wouldn't feel comfortable making that change in a bugfix release.

@markshannon what do you think of applying this PR's approach in the 3.13 branch, and your idea on main only?

@efimov-mikhail
Copy link
MemberAuthor

With Mark's proposed change, this error would be thrown only when the generator is iterated over.

Understood. Thx for the clarification.

applying this PR's approach in the 3.13 branch, and "GET_ITER changes" on main only?

It looks like a good idea for me, since crash through f_locals modification is already fixed.
And future improvements for GET_ITER bytecode moving could be provided separately.
But it's up to you to decide, of course.

There will be no problem in backports in the case of two separate PRs?

@markshannon
Copy link
Member

My proposed fix should have no performance impact (it does no extra work), so I'd definitely prefer that.

If you think that is too intrusive for 3.13, we could leave theGET_ITER in the enclosing scope and add a newCHECK_ITER instruction to be inserted before theFOR_ITER.
CHECK_ITER would raise if the value TOS is not an iterator, otherwise do nothing.

JelleZijlstra and efimov-mikhail reacted with thumbs up emoji

@efimov-mikhail
Copy link
MemberAuthor

I like the idea of new CHECK_ITER bytecode.

What do you think about this plan:

  1. Add CHECK_ITER bytecode definition, put it in a proper place in generator expressions code, add correct reaction on it in the interpreter in this PR.
  2. Merge this PR to main, provide a backport to 3.13.
  3. Create a new PR.
  4. Move GET_ITER bytecode from the enclosing function to the place of CHECK_ITER bytecode, remove CHECK_ITER, merge PR to main, do not provide any backports.

And what about Python 3.12 and below?
Should we provide any bug fix to those branches?

@markshannon
Copy link
Member

That sounds like a good plan.

If the fix is simple enough, then we can backport to 3.12.
I don't think it is worth worrying about 3.11 and earlier.

efimov-mikhail reacted with thumbs up emoji

@efimov-mikhailefimov-mikhail marked this pull request as draftOctober 15, 2024 08:55
@efimov-mikhail
Copy link
MemberAuthor

As@JelleZijlstra points out,GET_ITER is idempotent, so we can dropCHECK_ITER and useGET_ITER instead.

For 3.14, we can then drop theGET_ITER in the caller.

It seems like this solution will be best tradeoff.
I will provide such changes to this PR.

JelleZijlstra and vasily-v-ryabov reacted with thumbs up emoji

@ghost
Copy link

ghost commentedOct 19, 2024
edited by ghost
Loading

All commit authors signed the Contributor License Agreement.
CLA signed

@markshannon
Copy link
Member

This might introduce a small slowdown in 3.13, but it isn't in the loop, so shouldn't be an issue.

I'm not worried about that though, as we can remove the otherGET_ITER and thus the overhead in 3.14.

efimov-mikhail reacted with thumbs up emoji

Copy link
Member

@markshannonmarkshannon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Looks good me once you've agreed on a better name for ModifyTest.

@efimov-mikhail
Copy link
MemberAuthor

It's okay for me to change name from "ModifyTest" to something.
Maybe "ModifyBaseIteratorTest"?
Or "ModifyUnderlyingIterableTest"?

@JelleZijlstra
Copy link
Member

Either of those names works for me.

efimov-mikhail reacted with thumbs up emoji

Copy link
Member

@sobolevnsobolevn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Thank you!

efimov-mikhail reacted with thumbs up emoji
@efimov-mikhail
Copy link
MemberAuthor

Looks good me once you've agreed on a better name for ModifyTest.

It seems we have agreed. Anything else, or it could be merged?

@JelleZijlstraJelleZijlstra merged commit079875e intopython:mainOct 22, 2024
@miss-islington-app
Copy link

Thanks@efimov-mikhail for the PR, and@JelleZijlstra for merging it 🌮🎉.. I'm working now to backport this PR to: 3.13.
🐍🍒⛏🤖

@miss-islington-app
Copy link

Sorry,@efimov-mikhail and@JelleZijlstra, I could not cleanly backport this to3.13 due to a conflict.
Please backport usingcherry_picker on command line.

cherry_picker 079875e39589eb0628b5883f7ffa387e7476ec06 3.13

JelleZijlstra pushed a commit to JelleZijlstra/cpython that referenced this pull requestOct 22, 2024
…ipulations (pythonGH-125178)(cherry picked from commit079875e)Co-authored-by: Mikhail Efimov <efimov.mikhail@gmail.com>
@bedevere-app
Copy link

GH-125846 is a backport of this pull request to the3.13 branch.

@bedevere-appbedevere-appbot removed the needs backport to 3.13bugs and security fixes labelOct 22, 2024
@efimov-mikhailefimov-mikhail deleted the crash_fix_generators_f_locals branchOctober 23, 2024 13:42
JelleZijlstra added a commit that referenced this pull requestOct 23, 2024
…ions (GH-125178) (#125846)(cherry picked from commit079875e)Co-authored-by: Mikhail Efimov <efimov.mikhail@gmail.com>
ebonnal pushed a commit to ebonnal/cpython that referenced this pull requestJan 12, 2025
Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment

Reviewers

@markshannonmarkshannonmarkshannon approved these changes

@Eclips4Eclips4Eclips4 left review comments

@JelleZijlstraJelleZijlstraJelleZijlstra approved these changes

@sobolevnsobolevnsobolevn approved these changes

@ericsnowcurrentlyericsnowcurrentlyAwaiting requested review from ericsnowcurrently

@iritkatrieliritkatrielAwaiting requested review from iritkatrieliritkatriel is a code owner

+1 more reviewer

@hpagvn87hpagvn87hpagvn87 approved these changes

Reviewers whose approvals may not affect merge requirements

Assignees

@JelleZijlstraJelleZijlstra

Labels

None yet

Projects

None yet

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

6 participants

@efimov-mikhail@markshannon@JelleZijlstra@sobolevn@Eclips4@hpagvn87

[8]ページ先頭

©2009-2025 Movatter.jp