add benchmarks using pytest-benchmark and codspeed #3562
Conversation
@zarr-developers/steering-council I don't have permission to register this repo with codspeed. I submitted a request to register it; could someone approve it?
done
does anyone have opinions about benchmarks? feel free to suggest something concrete. Otherwise, I think we should take this as-is and handle later benchmarks (like partial shard reads / writes) in a subsequent PR.
codspeed-hqbot commented Oct 30, 2025 (edited)
.github/workflows/codspeed.yml (outdated)

```yaml
- uses: CodSpeedHQ/action@v4
  with:
    mode: instrumentation
    run: hatch run test.py3.11-1.26-minimal:run-benchmark
```
can we test the latest instead? seems more appropriate...
The latest version of python? What's the reasoning? I'd rather update this file when we drop a supported version vs when a new version of python comes out.
Because we'd want to catch a perf regression from upstream changes too? I'm suggesting the latest versions of released libraries: py=3.13, np=2.2.
we don't have an upper bound on numpy versions, so I don't think this particular workflow will help us catch regressions from upstream changes -- we would need to update this workflow every time a new version of numpy is released. IMO that's something we should do in a separate benchmark workflow. This workflow here will run on every PR, and in that case the oldest version of numpy we support seems better.
we also don't have to use a pre-baked hatch environment here; we could define a dependency set specific to benchmarking. but my feeling is that benchmarking against older versions of stuff gives us a better measure of what users will actually experience.
indexing please. that'll exercise the codec pipeline too. a peakmem metric would be good to track also, if possible.
I don't think codspeed or pytest-benchmark do memory profiling. we would need https://pytest-memray.readthedocs.io/en/latest/ or something equivalent for that. and an indexing benchmark sounds like a great idea, but I don't think I have the bandwidth for it in this PR right now.
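For whoever picks up the indexing benchmark later, one possible shape for it is sketched below; the use of `arr.oindex`, the array size, and the selection sizes are assumptions rather than an agreed design.

```python
# Hypothetical orthogonal-indexing benchmark sketch; not part of this PR.
import numpy as np
import zarr
from zarr.storage import MemoryStore  # assumed in-memory store (zarr-python 3.x)


def test_orthogonal_integer_indexing(benchmark) -> None:
    arr = zarr.create_array(
        store=MemoryStore(), shape=(4096, 4096), chunks=(512, 512), dtype="float32"
    )
    arr[:] = 0.0
    rng = np.random.default_rng(0)
    rows = np.sort(rng.choice(arr.shape[0], size=1024, replace=False))
    cols = np.sort(rng.choice(arr.shape[1], size=1024, replace=False))
    # An outer (orthogonal) selection touches many chunks, so most of the time
    # should be spent in the codec pipeline rather than in indexing bookkeeping.
    benchmark(lambda: arr.oindex[rows, cols])
```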
…into chore/benchmarks
…into chore/benchmarks
I added a benchmark that clearly reveals the performance improvement of #3561.
I added some slice-based benchmarks based on the examples from #3524, and I updated the contributing docs with a section about the benchmarks. assuming we can resolve the discussion about which python / numpy version to use in the CI job, I think this is ready.
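As a rough illustration of the shape such slice benchmarks can take, here is a hedged sketch; the specific selections, array size, and the `zarr.create_array` / `MemoryStore` calls are assumptions, not the code from #3524 or from this PR.

```python
# Hypothetical slice-read benchmark sketch; selections and sizes are assumptions.
import numpy as np
import pytest
import zarr
from zarr.storage import MemoryStore

SELECTIONS = {
    "single-chunk": np.s_[:256, :256],
    "row-of-chunks": np.s_[:256, :],
    "strided": np.s_[::2, ::2],
}


@pytest.mark.parametrize("selection", list(SELECTIONS.values()), ids=list(SELECTIONS.keys()))
def test_read_slice(benchmark, selection) -> None:
    arr = zarr.create_array(
        store=MemoryStore(), shape=(2048, 2048), chunks=(256, 256), dtype="uint8"
    )
    arr[:] = 1
    # Each parametrized case reads a differently shaped selection, so the
    # timings show how the slicing pattern interacts with the chunk grid.
    benchmark(lambda: arr[selection])
```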
new problem: the codspeed CI benchmarks are way too slow! the benchmark suite runs in 90s locally, and it's taking over 40m to run in CI. Help would be appreciated in speeding this up.
owing to the large number of syscalls in our benchmark code, codspeed recommended using the walltime instrument instead of their virtual CPU instrument. But to turn on the walltime benchmarks, we would need to run our benchmarking code on codspeed's servers, which is a security risk. Given that codspeed is not turning out to be particularly simple, I am inclined to defer the codspeed CI setup to later work. But if someone can help get the test runtime down, and / or we are OK running our benchmarks on codspeed's servers, then maybe we can get that sorted in this PR.
Co-authored-by: Max Jones <14077947+maxrjones@users.noreply.github.com>
looks like the walltime instrument is working! I think this is g2g
Since #3554 was an unpopular direction, I'm going instead with codspeed + pytest-benchmark. Opening as a draft because I haven't looked into how codspeed works at all, but I'd like people to weigh in on whether these initial benchmarks make sense. Naturally we can add more specific ones later, but I figured just some bulk array read / write workloads would be a good start.
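For readers unfamiliar with pytest-benchmark, a minimal sketch of what a bulk read / write benchmark of this kind could look like is shown below. The `zarr.create_array` and `zarr.storage.MemoryStore` calls, the array shape, and the chunk size are illustrative assumptions, not the benchmarks actually added in this PR.

```python
# Illustrative sketch only -- not the benchmark code added in this PR.
import numpy as np
import zarr
from zarr.storage import MemoryStore  # assumed in-memory store (zarr-python 3.x)


def make_array() -> zarr.Array:
    # Small in-memory array; the real benchmarks may use other shapes and stores.
    return zarr.create_array(
        store=MemoryStore(), shape=(1024, 1024), chunks=(256, 256), dtype="uint8"
    )


def test_bulk_write(benchmark) -> None:
    arr = make_array()
    data = np.ones(arr.shape, dtype=arr.dtype)

    def write() -> None:
        arr[...] = data

    # pytest-benchmark's `benchmark` fixture calls the function repeatedly
    # and records timing statistics (min, mean, stddev, ...).
    benchmark(write)


def test_bulk_read(benchmark) -> None:
    arr = make_array()
    arr[:] = 1
    benchmark(lambda: arr[...])
```

Locally, `pytest --benchmark-only` prints a timing table for tests like these; CodSpeed's pytest plugin is meant to reuse the same `benchmark` fixture when the CI workflow runs them.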