ENH: Standalone benchmark script for the inner loops of ufunc #15987


Open
seiko2plus wants to merge 3 commits into numpy:main from seiko2plus:new_ufunc_benchmark

@seiko2plus
Member

seiko2plus commented Apr 15, 2020 (edited)

ENH: A standalone benchmark script for the inner loops of ufunc

This script measures only the performance of the inner loops of ufunc.
The idea behind it is to remove umath object calls from the equation,
in order to reduce noise and provide stable ratios.

@eric-wieser
Member

Can we reuse our existing benchmark machinery here?

@seiko2plus
Member, Author

seiko2plus commented Apr 15, 2020 (edited)

@eric-wieser, I tried to use ASV, but the results weren't stable enough; see this patch and patch2 from #13516. The idea behind this patch is to benchmark only the inner loops of ufunc in order to reduce noise as much as possible. ASV is also rather slow.

EDIT: I moved the two mentioned patches to separate pull requests, #15992 and #15990.

@eric-wieser
Member

It would be nice if we could at least hook into ASV for things like benchmark result comparisons and storage, rather than building our own version of those too. It might be worth starting a conversation with @pv about the best way to do that.

@seberg
Member

seberg commented Apr 15, 2020 (edited)

@seiko2plus you are repeating the function run multiple times here within your run function. Might that be enough to stabilize the results a bit in asv?

EDIT: This got lost: "You are doing a few other things here that you are not doing in the asv version."

For example, if you just define the run function in C (and monkeypatch it into Benchmark), and make it do a couple of C-level calls (to offset the ~200ns or so overhead), that might be enough to get a stable result as well?
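As a rough illustration of the repetition idea, the sketch below times a batch of calls per sample and keeps the best sample, so that fixed per-call overhead is amortized. It is pure Python using `timeit.repeat`, not the C-level `run` function suggested above, and the helper name `time_per_call`, the array size, and the `number`/`samples` defaults are all assumptions made for the example.

```python
import timeit

import numpy as np


def time_per_call(func, number=1000, samples=7):
    """Best observed time per call, in seconds (hypothetical helper).

    Each sample runs `func` `number` times back to back, so the timer
    overhead is paid once per sample rather than once per call.
    """
    totals = timeit.repeat(func, number=number, repeat=samples)
    # The minimum is the sample least disturbed by other system activity.
    return min(totals) / number


a = np.ones(1000, dtype=np.float32)
out = np.empty_like(a)

# Note: this still goes through the ufunc object call, not the bare inner loop.
best = time_per_call(lambda: np.add(a, a, out=out))
print(f"{best * 1e9:.0f} ns per call")
```

Taking the minimum rather than the mean is a common choice for microbenchmarks; whether it is enough to stabilize ASV ratios is exactly the open question in this thread.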

@seiko2plus
Member, Author

@seberg, ASV already collects multiple samples for each benchmark, but the results are still not stable enough, even on an idle CPU.

This script is not a replacement for the current ASV implementation. Its main purpose is to detect any performance changes in the inner loops of ufunc, removing the functionality of umath and multiarray from the equation in order to reduce noise as much as possible. It also provides more test cases, such as multiple strides and sizes, and better control over the testing process.

    For example, if you just define the run function in C (and monkeypatch it into Benchmark), and make it do a couple of C-level calls (to offset the ~200ns or so overhead), that might be enough to get a stable result as well?

The problem is that ASV doesn't provide a way to specify the elapsed time manually.
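To make "multiple strides, sizes" concrete, here is a minimal sketch of how such test cases could be generated in Python. It is not the script from this PR: the helper names `strided_operand` and `stride_cases`, the chosen sizes and strides, and the use of `np.sqrt` are assumptions for illustration, and the call still goes through the ufunc object rather than the bare inner loop.

```python
import itertools

import numpy as np


def strided_operand(size, stride, dtype=np.float32):
    """Return a 1-D view with `size` elements spaced `stride` items apart."""
    base = np.ones(size * stride, dtype=dtype)
    return base[::stride]


def stride_cases(sizes=(64, 1024), strides=(1, 2, 4)):
    """Yield every combination of size, input stride and output stride."""
    for size, s_in, s_out in itertools.product(sizes, strides, strides):
        src = strided_operand(size, s_in)
        dst = strided_operand(size, s_out)
        yield size, s_in, s_out, src, dst


for size, s_in, s_out, src, dst in stride_cases():
    # Contiguous and non-contiguous operands typically take different loop paths.
    np.sqrt(src, out=dst)
```

Each case can then be timed separately, which keeps a regression in one stride combination from being averaged away by the others.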

@seiko2plus force-pushed the new_ufunc_benchmark branch 2 times, most recently from a58ab33 to 5f4bbde, April 15, 2020 19:41
@seiko2plus
Member, Author

seiko2plus commented Apr 15, 2020 (edited)

    EDIT: This got lost: "You are doing a few other things here that you are not doing in the asv version."

@seberg, I moved the mentioned patches from #13516 into separate pull requests, #15992 and #15990. I also modified the number of repeats and samples to be equal to the default settings of this script,
but the ASV ratios are still not stable enough.

@seiko2plus changed the title from "ENH: Benchmark script for the inner loops of universal functions." to "ENH: A standalone benchmark script for the inner loops of ufunc", Apr 15, 2020
@seiko2plus marked this pull request as draft, April 17, 2020 02:25
@r-devulap
Member

One thing that could be causing noise is turbo mode. In case you haven't already done so, I would recommend disabling it for benchmarking purposes (set /sys/devices/system/cpu/intel_pstate/no_turbo to 1). Maybe that will help? I haven't seen much variability while benchmarking ufuncs with asv.

@seiko2plus
Member, Author

@r-devulap, Before I run any benchmarks, I usually do:

  • isolate logical cores from scheduling through the Linux kernel options isolcpus and rcu_nocbs
  • reduce scheduling-clock ticks through nohz_full for the isolated cores
  • use the option --cpu-affinity that comes with this script, or what ASV provides, for the isolated cores
  • use the scaling governor performance via /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
  • disable turbo boost via /sys/devices/system/cpu/intel_pstate/no_turbo
  • make sure that the ASLR (address space layout randomization) state is 'full randomization' by
    writing 2 to /proc/sys/kernel/randomize_va_space

Lately, I discovered a Python module called pyperf, which provides a tool to tune the system with the above tips and many more via the command pyperf system tune.

However, it seems I need idle hardware to get almost stable ratios from ASV, not just some isolated logical cores, since any system calls that interrupt the thread while the benchmark samples are being collected will eliminate the benefits of isolating the logical cores via isolcpus and rcu_nocbs.

One of the things I don't like about ASV is that it uses a separate process for each collected sample,
which makes it too slow.
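The checklist above can be verified before a run with a small read-only script. The following is a sketch under assumptions: the function names `read_setting` and `check_tuning` and the `root` parameter (a hook for testing against a fake directory tree) are invented for this example, while the file paths and kernel option names are the ones listed above. It only reads settings and never changes them.

```python
from pathlib import Path


def read_setting(path):
    """Return the stripped contents of a sysfs/procfs file, or None if unreadable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def check_tuning(root="/"):
    """Collect the settings from the checklist; `root` is only a testing hook."""
    root = Path(root)
    cmdline = read_setting(root / "proc/cmdline") or ""
    # Kernel options look like "isolcpus=2,3"; keep only the option names.
    option_names = {token.split("=", 1)[0] for token in cmdline.split()}
    governors = {
        read_setting(p)
        for p in root.glob("sys/devices/system/cpu/cpu*/cpufreq/scaling_governor")
    }
    return {
        "kernel_options": {
            name: name in option_names
            for name in ("isolcpus", "rcu_nocbs", "nohz_full")
        },
        "governors": sorted(g for g in governors if g is not None),
        "no_turbo": read_setting(root / "sys/devices/system/cpu/intel_pstate/no_turbo"),
        "randomize_va_space": read_setting(root / "proc/sys/kernel/randomize_va_space"),
    }


if __name__ == "__main__":
    for key, value in check_tuning().items():
        print(f"{key}: {value}")
```

On a machine without the intel_pstate driver, or on a non-Linux system, the corresponding entries simply come back as None or empty.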

@seiko2plus marked this pull request as ready for review, April 18, 2020 17:13
@seiko2plus force-pushed the new_ufunc_benchmark branch 2 times, most recently from 8408248 to f17305e, April 19, 2020 13:06
@charris changed the title from "ENH: A standalone benchmark script for the inner loops of ufunc" to "ENH: Standalone benchmark script for the inner loops of ufunc", Apr 19, 2020
@seiko2plus force-pushed the new_ufunc_benchmark branch 3 times, most recently from dbce6f3 to e62c951, April 23, 2020 03:08
@mattip
Member

ping @pv. Is there something here that we all are missing?

    This script measures only the performance of the inner loops of ufunc. The idea behind it is to remove umath object calls from the equation, in order to reduce noise and provide stable ratios.

@hameerabbasi
Contributor

hameerabbasi commented Nov 9, 2020 (edited)

I ran this PR on a live environment without a desktop (Ubuntu Server), using the method in the PR description. The noise was around 3% and this PR had a performance impact of ±5%, so not too much of a difference.

Whoops, had the wrong tab open. This comment was meant for #16247; copy-pasting it there.

Base automatically changed from master to main, March 4, 2021 02:04
@charris
Member

close/reopen

Reviewers: @eric-wieser left review comments

7 participants: @seiko2plus, @eric-wieser, @seberg, @r-devulap, @mattip, @hameerabbasi, @charris