GH-132554: Specialize `GET_ITER` and `FOR_ITER` for `range` #135063


Draft
markshannon wants to merge 14 commits into python:main from faster-cpython:specialize-for-iter-range

markshannon (Member) commented Jun 3, 2025 (edited)

Extends the idea of "virtual iterators" to ranges as well. Most ranges have a step of one. For these ranges we can treat them much like a C for loop, using tagged integers for the current value and the limit.

The stack during iteration now looks like this:

| Original iterable | 2nd on stack   | Top of stack     |
|-------------------|----------------|------------------|
| range (step=1)    | limit (tagged) | current (tagged) |
| list or tuple     | iterable       | index (tagged)   |
| other             | iterator       | NULL             |
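
The table above can be illustrated with a small pure-Python model. This is only a sketch of the idea, not the C implementation in the PR: `Tagged`, `get_iter`, and `run_loop` are hypothetical names, `None` stands in for the C-level `NULL`, and the size limits of real tagged integers are ignored.

```python
from dataclasses import dataclass

NULL = None  # stand-in for the C-level NULL stack entry (assumption of this model)


@dataclass(frozen=True)
class Tagged:
    """Hypothetical stand-in for a tagged (unboxed) integer on the stack."""
    value: int


def get_iter(iterable):
    """Return the (2nd-on-stack, top-of-stack) pair from the table."""
    if type(iterable) is range and iterable.step == 1:
        return Tagged(iterable.stop), Tagged(iterable.start)  # limit, current
    if type(iterable) in (list, tuple):
        return iterable, Tagged(0)                            # iterable, index
    return iter(iterable), NULL                               # iterator, NULL


def run_loop(iterable):
    """Drive a for loop over the two stack slots and collect the values."""
    second, top = get_iter(iterable)
    out = []
    while True:
        if top is NULL:
            # Generic case: call the iterator's next().
            try:
                out.append(next(second))
            except StopIteration:
                break
        elif isinstance(second, Tagged):
            # range with step 1: compare current against limit, like a C for loop.
            if top.value >= second.value:
                break
            out.append(top.value)
            top = Tagged(top.value + 1)
        else:
            # Exact list or tuple: index directly into the sequence.
            if top.value >= len(second):
                break
            out.append(second[top.value])
            top = Tagged(top.value + 1)
    return out


print(run_loop(range(2, 5)))
print(run_loop(("a", "b")))
print(run_loop(range(0, 10, 2)))  # step != 1 falls back to the generic case
```

In this model no iterator object is created for the first two rows; the loop state lives entirely in the two stack slots.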

`GET_ITER` is specialized for the above cases plus any iterable with `Py_TYPE(self)->tp_iter == PyObject_SelfIter`, which avoids the call to `PyObject_GetIter`, simply pushing `NULL` instead.

Also fixes stats for `FOR_ITER` and `GET_ITER`.
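
As a rough Python-level illustration of which objects fall into the `PyObject_SelfIter` case: these are objects that are already their own iterator. The helper below, `returns_self_from_iter`, is a hypothetical name and only approximates the check; the PR inspects the C-level `tp_iter` slot, whereas a Python class can return `self` from `__iter__` without using that slot.

```python
import itertools


def returns_self_from_iter(obj) -> bool:
    """Approximate the self-iterator case: iter(obj) hands back obj itself."""
    return iter(obj) is obj


def gen():
    yield 1


print(returns_self_from_iter(gen()))                       # generator object
print(returns_self_from_iter(itertools.repeat(None, 3)))   # itertools.repeat
print(returns_self_from_iter(iter([1, 2, 3])))             # list iterator
print(returns_self_from_iter([1, 2, 3]))                   # plain list
```

For the first three objects, calling `iter()` does no useful work, which is the call the specialization skips.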

markshannon changed the title from "Specialize GET_ITER and FOR_ITER for range" to "GH-132554: Specialize GET_ITER and FOR_ITER for range" Jun 3, 2025

    }
    else {
        PyObject *iter_o = PyStackRef_AsPyObjectBorrow(iter);
        next = _PyForIter_NextWithIndex(iter_o, null_or_index);

Contributor

`_PyForIter_NextWithIndex` handles exact lists and exact tuples.

In this PR we have the code to handle `range` iteration by pushing the index and limit to the stack. Could we simplify `_PyForIter_NextWithIndex` to only deal with lists, and for tuples push the index and length of the tuple to the stack (i.e. use the same approach as for `range`)?

(If this is possible, maybe in a follow-up PR.)

Member (Author)

Are you proposing pushing a third value to the stack during iteration?
I doubt that would be worth it.

Contributor

Yes. It might indeed not be worth it, but I will check once this PR has settled.

eendebakpt (Contributor) commented:

Micro benchmarks look good:

for loop length 1: Mean +- std dev: [mainx] 152 ns +- 1 ns -> [prx] 124 ns +- 1 ns: 1.23x faster
repeat loop length 1: Mean +- std dev: [mainx] 234 ns +- 1 ns -> [prx] 227 ns +- 2 ns: 1.03x faster
for loop length 2: Mean +- std dev: [mainx] 169 ns +- 1 ns -> [prx] 142 ns +- 3 ns: 1.19x faster
repeat loop length 2: Mean +- std dev: [mainx] 252 ns +- 2 ns -> [prx] 245 ns +- 4 ns: 1.03x faster
for loop length 8: Mean +- std dev: [mainx] 291 ns +- 4 ns -> [prx] 256 ns +- 4 ns: 1.14x faster
repeat loop length 8: Mean +- std dev: [mainx] 367 ns +- 6 ns -> [prx] 359 ns +- 3 ns: 1.02x faster
for loop length 10000: Mean +- std dev: [mainx] 341 us +- 6 us -> [prx] 330 us +- 3 us: 1.03x faster
repeat loop length 10000: Mean +- std dev: [mainx] 254 us +- 3 us -> [prx] 258 us +- 2 us: 1.02x slower

Geometric mean: 1.08x faster

Script
import pyperf

runner = pyperf.Runner()

loop = """
import itertools

def g(n):
    x=0
    for ii in range(n):
        x += 1

def repeat(n):
    x = 0
    r = itertools.repeat(None, n)
    for ii in r:
        x += 1
"""

for s in [1, 2, 8, 10_000]:
    time = runner.timeit(name=f"for loop length {s}", stmt=f"g({s})", setup=loop)
    time = runner.timeit(name=f"repeat loop length {s}", stmt=f"repeat({s})", setup=loop)

The `repeat loop length x` benchmark is a control: it does not involve a for loop, but is slightly faster because `pyperf` itself includes a for loop in the timing. That effect is relatively small though.
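
The reported geometric mean can be cross-checked from the rounded means in the results. This is only an approximate reconstruction: it assumes the summary is the geometric mean of the per-benchmark `main / pr` ratios, and `pyperf` works from unrounded data, so the last digits may differ.

```python
from statistics import geometric_mean

# (main, pr) mean timings copied from the results; units cancel in each ratio.
timings = [
    (152, 124),  # for loop length 1 (ns)
    (234, 227),  # repeat loop length 1 (ns)
    (169, 142),  # for loop length 2 (ns)
    (252, 245),  # repeat loop length 2 (ns)
    (291, 256),  # for loop length 8 (ns)
    (367, 359),  # repeat loop length 8 (ns)
    (341, 330),  # for loop length 10000 (us)
    (254, 258),  # repeat loop length 10000 (us), slower on the PR branch
]

# A ratio above 1 means the PR branch is faster than main.
speedups = [main / pr for main, pr in timings]
overall = geometric_mean(speedups)

print(f"Geometric mean: {overall:.2f}x faster")
```

The single slower benchmark enters as a ratio below 1, so it pulls the mean down rather than being ignored.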

Reviewers

eendebakpt left review comments
ericsnowcurrently (code owner): review will be requested when the pull request is marked ready for review
Fidget-Spinner (code owner): review will be requested when the pull request is marked ready for review
