gh-91432: Specialize FOR_ITER #91713

Merged: markshannon merged 44 commits into python:main from sweeneyde:special_for_iter2 on Jun 21, 2022
44 commits
3a7f8df  initial attempt (sweeneyde, Apr 17, 2022)
3b5ce1e  bump magic (sweeneyde, Apr 17, 2022)
b21b5f4  Merge remote-tracking branch 'upstream/main' into special_for_iter2 (sweeneyde, Apr 18, 2022)
37269cf  NOTRACE_DISPATCH_SAME_OPARG (sweeneyde, Apr 18, 2022)
ea0a7ee  Update mark_stacks (sweeneyde, Apr 18, 2022)
0751228  comment out assertions (sweeneyde, Apr 19, 2022)
dc80fda  Merge branch 'main' into special_for_iter2 (sweeneyde, Apr 19, 2022)
e429410  Make FOR_ITER_RANGE mutate the local (sweeneyde, Apr 19, 2022)
1cb2de7  Merge remote-tracking branch 'upstream/main' into special_for_iter2 (sweeneyde, Apr 19, 2022)
73fa01c  Fix overflow (sweeneyde, Apr 19, 2022)
4213582  fix test (sweeneyde, Apr 19, 2022)
bf58358  fix dis (sweeneyde, Apr 19, 2022)
bd7575e  Add more tests, add more casts (sweeneyde, Apr 19, 2022)
db8754b  Merge branch 'main' into special_for_iter2 (sweeneyde, Apr 20, 2022)
2050c4e  merge with main (sweeneyde, Apr 24, 2022)
824e966  Fix stats, take out some PREDICT (sweeneyde, Apr 24, 2022)
6b16772  📜🤖 Added by blurb_it. (blurb-it[bot], Apr 24, 2022)
c2a75a5  merge with main (sweeneyde, Apr 28, 2022)
6696384  remove PREDEICTED(STORE_FAST) (sweeneyde, Apr 28, 2022)
d297091  assert no tracing (sweeneyde, Apr 28, 2022)
22635a6  Merge branch 'special_for_iter2' of https://github.com/sweeneyde/cpyt… (sweeneyde, Apr 28, 2022)
f360d65  remove unnecessary cast (sweeneyde, Apr 28, 2022)
81e0500  merge with main (sweeneyde, May 3, 2022)
c2eab68  regen (sweeneyde, May 3, 2022)
25689c3  merge and bump magic (sweeneyde, May 4, 2022)
b5df047  merge with main (sweeneyde, May 8, 2022)
2b1c170  Merge remote-tracking branch 'upstream/main' into special_for_iter2 (sweeneyde, May 10, 2022)
91d280c  Fix test_dis (sweeneyde, May 10, 2022)
eba5e60  merge with main (sweeneyde, May 12, 2022)
5b153d7  Merge branch 'main' into special_for_iter2 (sweeneyde, May 12, 2022)
76f9a74  merge with main (sweeneyde, May 20, 2022)
0565a68  Merge branch 'special_for_iter2' of https://github.com/sweeneyde/cpyt… (sweeneyde, May 20, 2022)
9ba5b79  merge with main (sweeneyde, Jun 9, 2022)
6cdf0e4  Add comment about re-using the old PyLongObject (sweeneyde, Jun 9, 2022)
03dde6a  Merge branch 'main' of https://github.com/python/cpython into special… (sweeneyde, Jun 10, 2022)
f1e2d39  update test_dis.py (sweeneyde, Jun 10, 2022)
463c3b9  Merge remote-tracking branch 'upstream/main' into special_for_iter2 (sweeneyde, Jun 14, 2022)
ed29777  use the new exponential backoff (sweeneyde, Jun 14, 2022)
188b357  merge with main (sweeneyde, Jun 18, 2022)
36d0999  revert using sdigits, add _PyLong_AssignValue (sweeneyde, Jun 19, 2022)
ad2e969  revert test_sys sizeof check (sweeneyde, Jun 19, 2022)
d55868f  revert test_range changes (sweeneyde, Jun 19, 2022)
2d6ee26  add comment and use Py_ssize_t (sweeneyde, Jun 20, 2022)
db21da1  Add comment: only positive (sweeneyde, Jun 20, 2022)
Make FOR_ITER_RANGE mutate the local
sweeneyde committed Apr 19, 2022
commit e429410ba16db8b385975f9616896c97f02bce89
Include/internal/pycore_range.h (8 changes: 4 additions & 4 deletions)
@@ -10,10 +10,10 @@ extern "C" {

 typedef struct {
     PyObject_HEAD
-    long index;
-    long start;
-    long step;
-    long len;
+    sdigit index;
+    sdigit start;
+    sdigit step;
+    sdigit len;
 } _PyRangeIterObject;

 #ifdef __cplusplus
Objects/rangeobject.c (25 changes: 9 additions & 16 deletions)
@@ -770,7 +770,7 @@ rangeiter_next(_PyRangeIterObject *r)
         /* cast to unsigned to avoid possible signed overflow
            in intermediate calculations. */
         return PyLong_FromLong((long)(r->start +
-                                      (unsigned long)(r->index++) * r->step));
+                                      (digit)(r->index++) * r->step));
     return NULL;
 }
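
The comment in this hunk says the cast to an unsigned type avoids signed overflow in intermediate calculations. Signed overflow is undefined behaviour in C, while unsigned arithmetic wraps around. A minimal standalone sketch of that idea follows. `nth_value` is a hypothetical helper, not CPython code. It assumes the caller has already checked that the final result fits in a `long`, and that the platform converts `unsigned long` back to `long` in the usual two's-complement way.

```c
#include <assert.h>
#include <limits.h>   /* LONG_MAX, used in the usage note below */

/* Hypothetical helper, not part of CPython: compute start + index * step
   with the intermediate product carried out in unsigned arithmetic, which
   wraps modulo ULONG_MAX + 1 instead of invoking undefined behaviour.
   Assumption: the mathematically exact result fits in a long, so the
   wrapped value converts back to the correct result. */
static long
nth_value(long start, long index, long step)
{
    assert(index >= 0);
    unsigned long product = (unsigned long)index * (unsigned long)step;
    return (long)((unsigned long)start + product);
}
```

For example, `nth_value(LONG_MAX, 2, -LONG_MAX)` has an intermediate product that does not fit in a `long`, yet the final value `-LONG_MAX` does, so the unsigned computation still yields it.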

@@ -911,9 +911,9 @@ fast_range_iter(long start, long stop, long step, long len)
     _PyRangeIterObject *it = PyObject_New(_PyRangeIterObject, &PyRangeIter_Type);
     if (it == NULL)
         return NULL;
-    it->start = start;
-    it->step = step;
-    it->len = len;
+    it->start = Py_SAFE_DOWNCAST(start, long, sdigit);
Review comment (Member):

Would it be cleaner to change the signature of fast_range_iter to take sdigits, rather than longs?

sweeneyde (Member, Author), Apr 24, 2022 (edited):

I'm not so sure -- downcasts need to happen somewhere, and fast_range_iter is called in two places. The PyLong_AsLong calls could theoretically be replaced with Py_SIZE checks and ob_digit[0] accesses, but I think the existing code is fine using the public API.
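
The point that downcasts need to happen somewhere can be illustrated with a small standalone sketch: check the range first, then narrow. The names `sketch_sdigit`, `SKETCH_MASK`, `fits_in_one_digit`, and `narrow_to_sdigit` are hypothetical stand-ins, and the 30-bit digit width is an assumption that matches typical 64-bit CPython builds rather than something stated in this diff.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: 30-bit digits. These names are stand-ins,
   not CPython's own definitions of sdigit or PyLong_MASK. */
typedef int32_t sketch_sdigit;
#define SKETCH_SHIFT 30
#define SKETCH_MASK  ((long)((1UL << SKETCH_SHIFT) - 1))

/* Return 1 if the magnitude of v is at most SKETCH_MASK, i.e. v fits in
   a single digit, and 0 otherwise. */
static int
fits_in_one_digit(long v)
{
    return v >= -SKETCH_MASK && v <= SKETCH_MASK;
}

/* Narrow only after the range check. The assert plays the role that
   Py_SAFE_DOWNCAST plays in debug builds: it catches a lossy cast. */
static sketch_sdigit
narrow_to_sdigit(long v)
{
    assert(fits_in_one_digit(v));
    return (sketch_sdigit)v;
}
```

In this arrangement the caller decides what to do when the check fails (the diff falls back to the long_range path), and the narrowing function only ever sees values that are known to fit.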

+    it->step = Py_SAFE_DOWNCAST(step, long, sdigit);
+    it->len = Py_SAFE_DOWNCAST(len, long, sdigit);
     it->index = 0;
     return (PyObject *)it;
 }
@@ -1097,20 +1097,13 @@ range_iter(PyObject *seq)
         goto long_range;
     }
     ulen = get_len_of_range(lstart, lstop, lstep);
-    if (ulen > (unsigned long)LONG_MAX) {
+    if (ulen > PyLong_MASK ||
+        lstart > PyLong_MASK || lstart < -(long)PyLong_MASK ||
+        lstop > PyLong_MASK || lstop < -(long)PyLong_MASK ||
+        lstep > PyLong_MASK || lstep < -(long)PyLong_MASK)
+    {
         goto long_range;
     }
-    /* check for potential overflow of lstart + ulen * lstep */
-    if (ulen) {
-        if (lstep > 0) {
-            if (lstop > LONG_MAX - (lstep - 1))
-                goto long_range;
-        }
-        else {
-            if (lstop < LONG_MIN + (-1 - lstep))
-                goto long_range;
-        }
-    }
     return fast_range_iter(lstart, lstop, lstep, (long)ulen);

   long_range:
Python/ceval.c (41 changes: 33 additions & 8 deletions)
@@ -4399,17 +4399,42 @@ _PyEval_EvalFrameDefault(PyThreadState *tstate, _PyInterpreterFrame *frame, int
             _PyRangeIterObject *r = (_PyRangeIterObject *)TOP();
             DEOPT_IF(Py_TYPE(r) != &PyRangeIter_Type, FOR_ITER);
             STAT_INC(FOR_ITER, hit);
-            if (r->index < r->len) {
-                PyObject *res = PyLong_FromLong(
-                    (long)(r->start + (unsigned long)(r->index++) * r->step));
-                if (res == NULL) {
-                    goto error;
+            _Py_CODEUNIT next = next_instr[INLINE_CACHE_ENTRIES_FOR_ITER];
+            assert(_PyOpcode_Deopt[_Py_OPCODE(next)] == STORE_FAST);
+            PyObject **local_ptr = &GETLOCAL(_Py_OPARG(next));
+            PyObject *local = *local_ptr;
+            if (r->index >= r->len) {
+                goto iterator_exhausted_no_error;
+            }
+            JUMPBY(INLINE_CACHE_ENTRIES_FOR_ITER + 1);
+            sdigit value = r->start + (digit)(r->index++) * r->step;
+            if (value < _PY_NSMALLPOSINTS && value >= -_PY_NSMALLNEGINTS) {
Review comment (Member):

I think there are too many branches here.
Ideally a specialized instruction should have a few DEOPT_IF tests, then the rest of the code should be (mostly) branchless.

This also includes far too much knowledge of the internals of the int object, IMO.

Would performance be measurably worse if start, end and stop were Py_ssize_t and this code were simplified to PyLong_FromSsize_t?
This feels like an elaborate workaround for the oddities of our int implementation.
We should be fixing that, not working around it.

Review comment (Member, Author):

I think there may be a middle ground here: add a function like this to longobject.c:

int _PyLong_AssignValueLong(PyLongObject **location, long value)
{
    if value in small:
        set to small, decref old long
    elif refcnt(*location) == 1 and the right size:
        mutate in place
    else:
        Py_SETREF(*location, PyLong_FromLong(value));
}

Sort of similar in spirit to PyUnicode_Append().

That way, the PyLongObject implementation details stay out of the eval loop, but we still get the benefit of avoiding the allocation, which I believe is the main source of the speedup.
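
The core of that proposal, reusing an object in place when the slot holds the only reference to it, can be modelled without any CPython internals. The sketch below uses a toy reference-counted integer. `ToyInt`, `toy_new`, `toy_decref`, and `toy_assign_value` are hypothetical names, and the small-int branch of the pseudocode is deliberately left out.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for a heap-allocated integer object; not PyLongObject. */
typedef struct {
    long refcnt;
    long value;
} ToyInt;

/* Allocate a new object holding one reference, or return NULL. */
static ToyInt *
toy_new(long value)
{
    ToyInt *obj = malloc(sizeof(ToyInt));
    if (obj != NULL) {
        obj->refcnt = 1;
        obj->value = value;
    }
    return obj;
}

/* Drop one reference and free the object when none remain. */
static void
toy_decref(ToyInt *obj)
{
    if (obj != NULL && --obj->refcnt == 0) {
        free(obj);
    }
}

/* Store `value` in *location. If the slot holds the only reference to
   its object, overwrite that object in place and skip the allocation.
   Otherwise allocate a fresh object and release the old reference.
   Returns 0 on success, -1 if allocation fails (slot left unchanged). */
static int
toy_assign_value(ToyInt **location, long value)
{
    ToyInt *old = *location;
    if (old != NULL && old->refcnt == 1) {
        old->value = value;   /* no other owner can observe the change */
        return 0;
    }
    ToyInt *fresh = toy_new(value);
    if (fresh == NULL) {
        return -1;
    }
    *location = fresh;
    toy_decref(old);
    return 0;
}
```

The reference-count test is what makes the mutation safe: once a second owner exists, the old object must keep its value, so the function falls back to allocating.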

Review comment (Member, Author):

This could theoretically be re-used for enumerate() as well.

And even if the implementation of integers changes (e.g. new freelists, no more small ints), the operation still makes sense.

+                *local_ptr = Py_NewRef(&_PyLong_SMALL_INTS[_PY_NSMALLNEGINTS+value]);
+                Py_XDECREF(local);
+                NOTRACE_DISPATCH();
+            }
+            if (local && PyLong_CheckExact(local) && Py_REFCNT(local) == 1) {
+                if (value > 0) {
+                    assert(value <= PyLong_MASK);
+                    ((PyLongObject *)local)->ob_digit[0] = value;
+                    Py_SET_SIZE(local, 1);
+                }
+                else {
+                    assert(value >= -(sdigit)PyLong_MASK);
+                    ((PyLongObject *)local)->ob_digit[0] = -(sdigit)value;
+                    Py_SET_SIZE(local, -1);
                 }
-                PUSH(res);
-                JUMPBY(INLINE_CACHE_ENTRIES_FOR_ITER);
                 NOTRACE_DISPATCH();
             }
-            goto iterator_exhausted_no_error;
+            PyObject *res = PyLong_FromLong(value);
+            if (res == NULL) {
+                // undo the JUMPBY
+                next_instr -= INLINE_CACHE_ENTRIES_FOR_ITER + 1;
+                goto error;
+            }
+            *local_ptr = res;
+            Py_XDECREF(local);
+            NOTRACE_DISPATCH();
         }

         TARGET(BEFORE_ASYNC_WITH) {
Python/specialize.c (6 changes: 4 additions & 2 deletions)
@@ -2064,16 +2064,18 @@ _Py_Specialize_ForIter(PyObject *iter, _Py_CODEUNIT *instr)
     assert(_PyOpcode_Caches[FOR_ITER] == INLINE_CACHE_ENTRIES_FOR_ITER);
     _PyForIterCache *cache = (_PyForIterCache *)(instr + 1);
     PyTypeObject *tp = Py_TYPE(iter);
+    _Py_CODEUNIT next = instr[1+INLINE_CACHE_ENTRIES_FOR_ITER];
+    int next_op = _PyOpcode_Deopt[_Py_OPCODE(next)];
     if (tp == &PyListIter_Type) {
         _Py_SET_OPCODE(*instr, FOR_ITER_LIST);
         goto success;
     }
-    else if (tp == &PyRangeIter_Type) {
+    else if (tp == &PyRangeIter_Type && next_op == STORE_FAST) {
         _Py_SET_OPCODE(*instr, FOR_ITER_RANGE);
         goto success;
     }
     else {
-        SPECIALIZATION_FAIL(JUMP_BACKWARD,
+        SPECIALIZATION_FAIL(FOR_ITER,
                             _PySpecialization_ClassifyIterator(iter));
         goto failure;
     }
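
The new guard in this hunk specializes for range iterators only when the instruction after the inline cache is STORE_FAST, because the specialized FOR_ITER_RANGE in ceval.c writes the loop variable itself and jumps past that store. A toy model of the decision is sketched below. The opcode values, `CACHE_ENTRIES`, the `ITER_*` kinds, and `specialize_for_iter` are all hypothetical and stand in for the real bytecode layout.

```c
#include <assert.h>

/* Hypothetical opcodes, iterator kinds, and cache size for this sketch. */
enum { OP_FOR_ITER, OP_FOR_ITER_LIST, OP_FOR_ITER_RANGE,
       OP_STORE_FAST, OP_STORE_NAME, OP_CACHE };
enum { ITER_LIST, ITER_RANGE, ITER_OTHER };
#define CACHE_ENTRIES 1

/* Rewrite instr[0] to a specialized opcode when the guard holds.
   instr[1 .. CACHE_ENTRIES] are inline cache slots, so the instruction
   that consumes the loop value sits at instr[1 + CACHE_ENTRIES].
   Returns 1 if the instruction was specialized, 0 otherwise. */
static int
specialize_for_iter(int *instr, int iter_kind)
{
    assert(instr[0] == OP_FOR_ITER);
    int next_op = instr[1 + CACHE_ENTRIES];
    if (iter_kind == ITER_LIST) {
        instr[0] = OP_FOR_ITER_LIST;
        return 1;
    }
    if (iter_kind == ITER_RANGE && next_op == OP_STORE_FAST) {
        instr[0] = OP_FOR_ITER_RANGE;
        return 1;
    }
    return 0;   /* leave the generic instruction in place */
}
```

A range loop whose target is not a fast local, for instance one that stores to a global name, keeps the generic instruction in this model, which mirrors the `next_op == STORE_FAST` condition above.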