Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

MNT: disable the coercion cache for the nogil build#26294

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged
ngoldbaum merged 3 commits intonumpy:mainfromngoldbaum:disable-coercion-cache
Apr 19, 2024

Conversation

ngoldbaum
Copy link
Member

This disables the coercion cache on the nogil build, but I'd like to find a benchmark that shows the benefit of this cache more than what's already inbench_array_coercion.py andbench_core.py.

I don't see any significant performance impact before or after this change, although note that I can't actually run asv in comparison mode yet in the nogil build.

When I disable this cache for the single-threaded build (e.g. setCOERCION_CACHE_CACHE_SIZE to 0 in both branches of the ifdef) where Ican run asv in comparison mode, I get the following results:

| Change   | Before [3c09f166] <main>   | After [c981c067] <disable-coercion-cache>   |   Ratio | Benchmark (Parameter)                                               ||----------|----------------------------|---------------------------------------------|---------|---------------------------------------------------------------------|| +        | 219±0.6ns                  | 233±6ns                                     |    1.06 | bench_array_coercion.ArrayCoercionSmall.time_array_no_copy([1])     || +        | 196±0.7ns                  | 208±1ns                                     |    1.06 | bench_array_coercion.ArrayCoercionSmall.time_ascontiguousarray([1]) || +        | 249±3ns                    | 262±6ns                                     |    1.05 | bench_array_coercion.ArrayCoercionSmall.time_asarray_dtype([1])     || Change   | Before [3c09f166] <main>   | After [c981c067] <disable-coercion-cache>   |   Ratio | Benchmark (Parameter)                                                            ||----------|----------------------------|---------------------------------------------|---------|----------------------------------------------------------------------------------|| +        | 687±7ns                    | 732±2ns                                     |    1.07 | bench_core.StatsMethods.time_sum('bool_', 100)                                   || -        | 10.9±2μs                   | 10.4±0.01μs                                 |    0.95 | bench_core.CountNonzero.time_count_nonzero_axis(3, 10000, <class 'numpy.int64'>) || -        | 22.7±0.1μs                 | 21.6±0.4μs                                  |    0.95 | bench_core.StatsMethods.time_var('bool_', 10000)                                 || -        | 640±20ns                   | 602±10ns                                    |    0.94 | bench_core.StatsMethods.time_prod('uint64', 100)                                 || -        | 665±40ns                   | 610±4ns                                     |    0.92 | bench_core.StatsMethods.time_max('int64', 100)                                   || -        | 201±2ns                    | 183±0.9ns                                   |    0.91 | bench_core.Core.time_array_l1                                                    || -        | 7.15±2ms                   | 6.34±0.3ms                                  |    0.89 | bench_core.CountNonzero.time_count_nonzero_axis(3, 1000000, <class 'str'>)       |

So it does have a small, ~5% benefit for a few of the array coercion tests, but a somewhat confusing mix of performance improvements and performance hits for reduction operations, and two other 10% performance hits for creating an array from a single-element list and a benchmark forcount_nonzero withaxis set.

Maybe this is all just noise?

Copy link
Member

@sebergseberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Maybe this is all just noise?

The reduction changes must be noise. Thearray([1]) is exactly the type of operation that would benefit, so that must be real (also if it was noise you would expect other small array benchmarks to also fluctuate stronger).

If you would havearray([[1]]) or so (i.e. deeper nested and maybe even containing further arrays likearray([[small_arr], [small_arr]])) the benefit should be bigger.
I might have timed the benefit of the discovery step on its own also when adding it (so the benefit in percent is bigger compared to full coercion).


Anyway, we can definitely just disable it in the nogil build I think. It may be a nice boost for some code, but overall will have less impact than dropping the small array cache.

@ngoldbaumngoldbaum added the 39 - free-threadingPRs and issues related to support for free-threading CPython (a.k.a. no-GIL, PEP 703) labelApr 17, 2024
@ngoldbaum
Copy link
MemberAuthor

I tried testing scikit-learn yesterday and found that parallel forest training led to a similar seg fault and I was able to reduce it down to the test I just added. I decided it's worth having atest_multithreading.py test module specifically for tests of numpy in a multithreaded context. I suspect we'll have more of these and be able to refactor to reduce a lot of boilerplate by just putting them all in the same file.

@ngoldbaum
Copy link
MemberAuthor

Pulling this in, if someone has an issue with the new test file I’ll do a followup.

rgommers reacted with thumbs up emoji

@ngoldbaumngoldbaum merged commitda95f8e intonumpy:mainApr 19, 2024
@rgommersrgommers added this to the2.1.0 release milestoneApr 22, 2024
Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Reviewers

@sebergsebergseberg approved these changes

Assignees
No one assigned
Labels
03 - Maintenance39 - free-threadingPRs and issues related to support for free-threading CPython (a.k.a. no-GIL, PEP 703)
Projects
None yet
Milestone
2.1.0 release
Development

Successfully merging this pull request may close these issues.

3 participants
@ngoldbaum@seberg@rgommers

[8]ページ先頭

©2009-2025 Movatter.jp