Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

ENH: Use array indexing preparation routines for flatiter objects#28590

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Open
lysnikolaou wants to merge10 commits intonumpy:main
base:main
Choose a base branch
Loading
fromlysnikolaou:use-prepare-index-flatiter

Conversation

lysnikolaou
Copy link
Member

  • Useprepare_index initer_subscript anditer_ass_subscript. This fixes various cases that were broken before:
    • arr.flat[[True, True]]
    • arr.flat[[1.0, 1.0]]
    • arr.flat[()] = 0
  • Add more extensive tests forflatiter indexing operations

Closes#28314.

ngoldbaum reacted with rocket emoji
@lysnikolaoulysnikolaou changed the title[ENH] Use array indexing preparation routines for flatiter objectsENH: Use array indexing preparation routines for flatiter objectsMar 26, 2025
@lysnikolaoulysnikolaouforce-pushed theuse-prepare-index-flatiter branch from198df6b to75aaed0CompareMarch 26, 2025 10:23
@lysnikolaoulysnikolaouforce-pushed theuse-prepare-index-flatiter branch from75aaed0 to9f2d51fCompareMarch 26, 2025 10:30
assert_raises(ValueError, ia, x.flat, s, np.zeros(9, dtype=float))
assert_raises(ValueError, ia, x.flat, s, np.zeros(11, dtype=float))
assert_raises(IndexError, ia, x.flat, s, np.zeros(9, dtype=float))
assert_raises(IndexError, ia, x.flat, s, np.zeros(11, dtype=float))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

While this is certainly more consistent and I'd even call it a bugfix, it is a behavior change and someone might have code relying on the old behavior. Needs a release note at least. You also need another release note for the new features.

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Agreed. Added a release note that lists all the most important changes.

@lysnikolaoulysnikolaouforce-pushed theuse-prepare-index-flatiter branch from8f2b322 to33109ddCompareMarch 28, 2025 13:52
@ngoldbaum
Copy link
Member

This is a big refactor, so I think we'll need at least two experienced developers to go over the C code changes, so that might take a while. I'll try to do a pass focusing on the correctness of the C code soon. On a first, high-level pass this looks like mostly simplification and cleanup.

I think you should also try running the indexing benchmarks to see if there are any significant regressions in existing benchmarks. I thinkbench_indexing already captures several workflows that go through the changed low-level C code path.

It would also be nice to get new entries in theFlatIterIndexing benchmark for newly added functionality.

lysnikolaou reacted with thumbs up emoji

@ngoldbaumngoldbaum self-assigned thisMar 28, 2025
@lysnikolaou
Copy link
MemberAuthor

lysnikolaou commentedMar 28, 2025
edited
Loading

These are the results of running the (old & new) benchmarks:

| Change   | Before [93898621] <main>   | After [cfcdabf0] <use-prepare-index-flatiter>   |     Ratio | Benchmark (Parameter)                                       ||----------|----------------------------|-------------------------------------------------|-----------|-------------------------------------------------------------|| +        | 87.0±0.4ns                 | 43.0±0.2ms                                      | 494276    | bench_indexing.FlatIterIndexing.time_flat_empty_tuple_index || +        | 115±3ns                    | 479±8ns                                         |      4.15 | bench_indexing.FlatIterIndexing.time_flat_bool_index_0d     || +        | 39.4±0.3ms                 | 42.8±0.3ms                                      |      1.09 | bench_indexing.FlatIterIndexing.time_flat_ellipsis_index    || +        | 3.95±0.04ms                | 4.29±0.07ms                                     |      1.09 | bench_indexing.FlatIterIndexing.time_flat_slice_index       |

It looks like having special cases for tuple, ellipses etc. (instead of going throughprepare_index) did have an impact on performance. Should we try and keep those special cases in?

@ngoldbaum
Copy link
Member

Should we try and keep those special cases in?

Probably

@lysnikolaou
Copy link
MemberAuthor

Should we try and keep those special cases in?

Probably

I added a couple of special cases for an empty tuple and boolean indexes. This fixes the two worst performance regressions. I feel that the rest are acceptable, since this goes through a much more complex code path to make sure that everything is set up correctly.

@ngoldbaumngoldbaum added this to the2.3.0 release milestoneApr 23, 2025
@ngoldbaum
Copy link
Member

I added the 2.3 milestone to make sure we don't drop reviewing this before cutting the release.

lysnikolaou reacted with thumbs up emoji

@charris
Copy link
Member

I added the 2.3 milestone

@ngoldbaum I am about to push this off to 2.4 unless you want to put it in very soon.

@ngoldbaum
Copy link
Member

I spoke with Lysandros and he said it's OK to push this off. We'll coordinate on getting this reviewed soon.

@sebergseberg added the 56 - Needs Release Note.Needs an entry in doc/release/upcoming_changes labelJun 11, 2025
@sebergseberg self-requested a reviewJune 11, 2025 17:49
Copy link
Member

@sebergseberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Sorry for not looking at it much. Overall looks nice, I need to do a pass to see for refcount issues, etc.

I am slightly worried that some of the bad bool cases should maybe have aFutureWarning (or just go to an error for a bit?!), to enforce correct behavior.

Overall, I am happy that this seemed to have worked well to integrate, the diff is a bit unwieldy, but it can't be helped.


``arr.flat[[True, True]]`` and ``arr.flat[[1.0, 1.0]]`` were incorrectly
treated as ``arr.flat[[1, 1]]``. They now raise an `IndexError`` (unless
``arr.flat[[True, Truee]]`` is a valid boolean index)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I think I am OK with this, but it is technically a too fast change.

It could make sense to put aFutureWarningif the input isn't already an array, the way to avoid it, would be to make sure the input is an array.
(I would also be happy to just go with a hard error and a warning it will work in the future, to not bother keeping the old stuff working, heh)

Basically it seems extremely niche, but has the potentially to modify code results.

* Fixed crash when assigning to an empty index tuple:

``arr.flat[()] = 0`` previously crashed the Python interpreter. It now
correctly assigns to the entire array.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I might prefer if this just errors, it seems weird to do this for something that is known to be exactly 1-D to allow a 0-D index.
(In an ideal world, I might prefer if NumPy forced you to add... for incomplete indices, but it is just too much of a change.)

But I don't feel strongly about it.

return obj;
if (PyTuple_Check(ind) && PyTuple_GET_SIZE(ind) == 0) {
Py_INCREF(self->ao);
return (PyObject *)self->ao;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I would suggest moving the check into the general path to not diverge here, but it's OK here also.
(I.e. I think the index info will tell us about indexing 0 dims with 0 indices.)

}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Nothing about this branch is correct as testinga.flat[] againsta.ravel()[] will tell you.

I could imagine just deprecating it, because it effectively indexes zero dimensions, similar toa.flat[()] there seems little reason to do so?

if (new == NULL) {
goto fail;
if (index_type == HAS_FANCY) {
ret = iter_subscript_int(self, (PyArrayObject *) indices[0].object, &cast_info);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

It must be an integer array here, I think. But I don't think it is guaranteed to be anintp array. A bit scary that no test seems to index with a differently sized integer, though?

Py_INCREF(type);
arrval = (PyArrayObject *)PyArray_FromAny(val, type, 0, 0,
Py_INCREF(dtype);
PyArrayObject *arrval = (PyArrayObject *)PyArray_FromAny(val, dtype, 0, 0,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Would be good to pass the correct maxdims here, IIRC (fixes corner cases around object arrays, even if the choice of how that behaves is a matter of taste).

}

/* Check for Integer or Slice */
if (PyLong_Check(ind) || PySlice_Check(ind)) {
start = parse_index_entry(ind, &step_size, &n_steps,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Seems likeparse_index_entry should be deleted, but is not yet.

indices_2d = np.array([[1, 2], [3, 4]])
assert_array_equal(a.flat[indices_2d], indices_2d)

assert_array_equal(a.flat[[True, 1]], a.flat[[1, 1]])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

As mentioned above, would be good to have a basic test here (or it's own test) for e.g.int16 dtype inputs.

(And yes, you can force-cast to integer.)

def test_flatiter_indexing_boolean(self):
a = np.arange(9).reshape((3, 3))
a.flat[True] = 10
assert_array_equal(a, np.array([[10, 1, 2], [3, 4, 5], [6, 7, 8]]))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Technically wrong.a.flat[True] should return thea.reshape(1, a.size) effectively. So if anything it would assign it to everything.

In practice, I may be tempted to just deprecate it, since it seems somewhat useless?

* @param allow_boolean whether to allow the boolean special case
*
* @returns the index_type or -1 on failure and fills the number of indices.
*/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Maybe we should adjust that slightly and leave it onprepare_index_noarray? It's obvious to look for docs there if you look atprepare_index, but not vice-versa?
(but just nitpicking/suggestion.)

Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Reviewers

@sebergsebergseberg left review comments

@ngoldbaumngoldbaumngoldbaum left review comments

Assignees

@ngoldbaumngoldbaum

Labels
56 - Needs Release Note.Needs an entry in doc/release/upcoming_changes
Projects
None yet
Milestone
2.4.0 release
Development

Successfully merging this pull request may close these issues.

BUG: Fix flatiter indexing
4 participants
@lysnikolaou@ngoldbaum@charris@seberg

[8]ページ先頭

©2009-2025 Movatter.jp