ENH: Use array indexing preparation routines for flatiter objects #28590
- assert_raises(ValueError, ia, x.flat, s, np.zeros(9, dtype=float))
- assert_raises(ValueError, ia, x.flat, s, np.zeros(11, dtype=float))
+ assert_raises(IndexError, ia, x.flat, s, np.zeros(9, dtype=float))
+ assert_raises(IndexError, ia, x.flat, s, np.zeros(11, dtype=float))
While this is certainly more consistent and I'd even call it a bugfix, it is a behavior change and someone might have code relying on the old behavior. Needs a release note at least. You also need another release note for the new features.
Agreed. Added a release note that lists all the most important changes.
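To illustrate why the exception type matters for code that catches it, here is a minimal sketch modelled on the test in the diff. The helper `raised_by` is hypothetical and not part of the PR. Only the plain `ndarray` result should be stable across versions; the `x.flat` result depends on whether the installed NumPy includes this change.

```python
import numpy as np

# Same shapes as the regression test: a 1-element target and a 10-element mask.
s = np.ones(10, dtype=float)
x = np.array((15,), dtype=float)


def ia(target, mask_source, values):
    # Boolean-mask assignment; the mask does not match the target's size.
    target[mask_source > 0] = values


def raised_by(target, values):
    # Hypothetical helper: name of the exception raised, or None.
    try:
        ia(target, s, values)
    except (IndexError, ValueError) as exc:
        return type(exc).__name__
    return None


plain_result = raised_by(x, np.zeros(9, dtype=float))
flat_result = raised_by(x.flat, np.zeros(9, dtype=float))
print("ndarray:", plain_result)
print("flatiter:", flat_result)  # ValueError before this change, IndexError after
```

Code that must run on both sides of the change could catch `(IndexError, ValueError)` together, as the helper does.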
This is a big refactor, so I think we'll need at least two experienced developers to go over the C code changes, and that might take a while. I'll try to do a pass focusing on the correctness of the C code soon. On a first, high-level pass this looks like mostly simplification and cleanup. I think you should also try running the indexing benchmarks to see if there are any significant regressions in existing benchmarks. I think it would also be nice to get new entries in the
lysnikolaou commented Mar 28, 2025 • edited
These are the results of running the (old & new) benchmarks:
It looks like having special cases for tuple, ellipses etc. (instead of going through
Probably
I added a couple of special cases for an empty tuple and boolean indexes. This fixes the two worst performance regressions. I feel that the rest are acceptable, since this goes through a much more complex code path to make sure that everything is set up correctly.
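For a rough local comparison of the different `flatiter` index kinds, a micro-benchmark along these lines could be used. This is only a sketch: the case names and array sizes are made up for illustration, and it uses `timeit` rather than the asv suite that NumPy's benchmarks actually run under.

```python
import timeit

import numpy as np

a = np.arange(10_000).reshape(100, 100)
mask = a.ravel() % 2 == 0                 # boolean mask matching a.size
idx = np.array([1, 5, 9], dtype=np.intp)  # small fancy index

# Hypothetical cases, one per index kind that flatiter supports.
cases = {
    "int": lambda: a.flat[3],
    "slice": lambda: a.flat[1:50],
    "fancy": lambda: a.flat[idx],
    "bool_mask": lambda: a.flat[mask],
}

# Best of three runs of 1000 calls each, in seconds per 1000 calls.
timings = {name: min(timeit.repeat(fn, number=1000, repeat=3))
           for name, fn in cases.items()}

for name, seconds in timings.items():
    print(f"{name:>10}: {seconds / 1000 * 1e6:.2f} us per call")
```

Running the same script against a build of `main` and a build of the PR branch gives a quick sanity check before a full asv run.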
I added the 2.3 milestone to make sure we don't drop reviewing this before cutting the release.
@ngoldbaum I am about to push this off to 2.4 unless you want to put it in very soon.
I spoke with Lysandros and he said it's OK to push this off. We'll coordinate on getting this reviewed soon.
Sorry for not looking at it much. Overall looks nice; I need to do a pass to check for refcount issues, etc.
I am slightly worried that some of the bad bool cases should maybe have a `FutureWarning` (or just go to an error for a bit?!), to enforce correct behavior.
Overall, I am happy that this seems to have integrated well; the diff is a bit unwieldy, but it can't be helped.
``arr.flat[[True, True]]`` and ``arr.flat[[1.0, 1.0]]`` were incorrectly
treated as ``arr.flat[[1, 1]]``. They now raise an ``IndexError`` (unless
``arr.flat[[True, True]]`` is a valid boolean index)
I think I am OK with this, but it is technically too fast a change.
It could make sense to emit a `FutureWarning` if the input isn't already an array; the way to avoid it would be to make sure the input is an array.
(I would also be happy to just go with a hard error and a warning that it will work in the future, to not bother keeping the old stuff working, heh)
Basically it seems extremely niche, but has the potential to modify code results.
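For reference, this sketch shows how a plain 1-D `ndarray` treats the two list indexes from the release note, plus one way a caller could keep the old "treat as `[1, 1]`" meaning by converting explicitly. The helper `error_name` is hypothetical, and the explicit `astype(np.intp)` conversion is a suggestion for illustration, not something the PR prescribes.

```python
import numpy as np

a = np.arange(9).reshape(3, 3)
flat_copy = a.ravel()


def error_name(func):
    # Hypothetical helper: name of the exception raised by func(), or None.
    try:
        func()
    except Exception as exc:
        return type(exc).__name__
    return None


# On a 1-D ndarray both lists are rejected: the boolean list has the wrong
# length for a mask, and float arrays are not valid indices.
bool_list_error = error_name(lambda: flat_copy[[True, True]])
float_list_error = error_name(lambda: flat_copy[[1.0, 1.0]])

# An explicit integer array is unambiguous and selects element 1 twice.
explicit = a.flat[np.asarray([True, True]).astype(np.intp)]
print(bool_list_error, float_list_error, explicit)
```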
* Fixed crash when assigning to an empty index tuple:
  ``arr.flat[()] = 0`` previously crashed the Python interpreter. It now
  correctly assigns to the entire array.
I might prefer if this just errors; it seems weird to allow a 0-D index for something that is known to be exactly 1-D.
(In an ideal world, I might prefer if NumPy forced you to add `...` for incomplete indices, but it is just too much of a change.)
But I don't feel strongly about it.
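As a point of comparison, this is what an empty-tuple index does on an ordinary 1-D `ndarray`. The sketch deliberately avoids `arr.flat[()] = 0`, since the release note says that assignment crashed the interpreter before this change.

```python
import numpy as np

r = np.arange(9)

# An empty tuple indexes zero dimensions, so the 1-D array comes back whole.
whole = r[()]
print(whole.shape)

# Assigning through the empty tuple therefore writes every element.
r[()] = 0
print(r)
```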
return obj;
if (PyTuple_Check(ind) && PyTuple_GET_SIZE(ind) == 0) {
    Py_INCREF(self->ao);
    return (PyObject *)self->ao;
I would suggest moving the check into the general path to not diverge here, but it's OK here also.
(I.e. I think the index info will tell us about indexing 0 dims with 0 indices.)
}
Nothing about this branch is correct, as testing `a.flat[]` against `a.ravel()[]` will tell you.
I could imagine just deprecating it, because it effectively indexes zero dimensions, similar to `a.flat[()]`; there seems little reason to do so?
if (new == NULL) {
    goto fail;
if (index_type == HAS_FANCY) {
    ret = iter_subscript_int(self, (PyArrayObject *) indices[0].object, &cast_info);
It must be an integer array here, I think. But I don't think it is guaranteed to be an `intp` array. A bit scary that no test seems to index with a differently sized integer, though?
- Py_INCREF(type);
- arrval = (PyArrayObject *)PyArray_FromAny(val, type, 0, 0,
+ Py_INCREF(dtype);
+ PyArrayObject *arrval = (PyArrayObject *)PyArray_FromAny(val, dtype, 0, 0,
Would be good to pass the correct maxdims here, IIRC (fixes corner cases around object arrays, even if the choice of how that behaves is a matter of taste).
}
/* Check for Integer or Slice */
if (PyLong_Check(ind) || PySlice_Check(ind)) {
    start = parse_index_entry(ind, &step_size, &n_steps,
Seems like `parse_index_entry` should be deleted, but is not yet.
indices_2d = np.array([[1, 2], [3, 4]])
assert_array_equal(a.flat[indices_2d], indices_2d)
assert_array_equal(a.flat[[True, 1]], a.flat[[1, 1]])
As mentioned above, it would be good to have a basic test here (or its own test) for e.g. `int16` dtype inputs.
(And yes, you can force-cast to integer.)
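A test along the lines suggested here might look like the following sketch. The list of dtypes is an arbitrary selection for illustration, and the check simply treats `a.ravel()[idx]` as the reference result; it is not the test that was added to the PR.

```python
import numpy as np

a = np.arange(9).reshape(3, 3)
expected = np.array([1, 2])

# Hypothetical selection of integer dtypes that may differ from intp.
dtypes = [np.int8, np.int16, np.int32, np.int64, np.uint8, np.uint16]

results = {}
for dt in dtypes:
    idx = np.array([1, 2], dtype=dt)
    ref = a.ravel()[idx]  # ndarray indexing is the reference behavior
    got = a.flat[idx]     # flatiter indexing should agree with it
    results[np.dtype(dt).name] = (np.array_equal(ref, expected),
                                  np.array_equal(got, ref))

for name, (ref_ok, flat_matches) in results.items():
    print(name, ref_ok, flat_matches)
```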
def test_flatiter_indexing_boolean(self):
    a = np.arange(9).reshape((3, 3))
    a.flat[True] = 10
    assert_array_equal(a, np.array([[10, 1, 2], [3, 4, 5], [6, 7, 8]]))
Technically wrong. `a.flat[True]` should effectively return `a.reshape(1, a.size)`. So if anything it would assign it to everything.
In practice, I may be tempted to just deprecate it, since it seems somewhat useless?
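The reference behavior described here can be checked on a 1-D `ndarray`, where a scalar boolean index adds a leading axis rather than selecting an element. This sketch uses `a.ravel()` only and makes no claim about what `a.flat[True]` does in any particular NumPy version.

```python
import numpy as np

a = np.arange(9).reshape((3, 3))
r = a.ravel()  # 1-D view of the contiguous array

true_shape = r[True].shape    # new leading axis of length 1
false_shape = r[False].shape  # new leading axis of length 0
print(true_shape, false_shape)

r[True] = 10  # writes every element, not just the first
print(a)
```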
 * @param allow_boolean whether to allow the boolean special case
 *
 * @returns the index_type or -1 on failure and fills the number of indices.
 */
Maybe we should adjust that slightly and leave it on `prepare_index_noarray`? It's obvious to look for docs there if you look at `prepare_index`, but not vice-versa?
(but just nitpicking/suggestion.)
`prepare_index` in `iter_subscript` and `iter_ass_subscript`. This fixes various cases that were broken before:
- `arr.flat[[True, True]]`
- `arr.flat[[1.0, 1.0]]`
- `arr.flat[()] = 0`

`flatiter` indexing operations

Closes #28314.