zarr-developers/zarr-python

Zarr-Python: roadmap after 3.1 #3250

jhamman started this conversation in General

jhamman
Jul 15, 2025
Maintainer

I gave this presentation at SciPy last week on the progress Zarr-Python has made over the last year. tldr; we've come a long way!

I also shared a potential feature list that I think could form the beginning of the roadmap beyond version 3.1 (which landed today!).

Targets for Zarr 3.2 and beyond

- Performance tuning (sharding, codec-pipeline, async+multi-threading)
- Additional array-types (e.g. sparse) and data-types (e.g. ml-dtypes)
- New Extensions
  - Variable length chunk grids
  - Sparse arrays
- GPU support
  - On-GPU (de)compression
  - Other hardware (Apple, MLX)

Curious to get the input from others on what else we're looking to work on next.

A nice outcome from this discussion would be an update to the Zarr-Python Roadmap, which is now nicely out of date: https://zarr.readthedocs.io/en/stable/developers/roadmap.html

cc @zarr-developers/python-core-devs, @zarr-developers/python-emeritus

Replies: 4 comments 4 replies

dstansby
Jul 15, 2025

Adding read/write permissions to arrays & groups would be a good one to add, which I have as a work in progress.

d-v-b
Jul 30, 2025
Maintainer

- A lazy array indexing API. I actually think this will unlock some performance improvements in our array IO.
- Remove StorePath, give all of its methods to the Store classes.
- Do something other than return None for missing keys in the store APIs (e.g., use a Result type, assuming there are no performance penalties).
- Define an awaitable Future object, and have all of our user-facing async APIs return instances of this Future object that wrap the awaitable they currently return. Future objects would also have a sync or result method that just calls sync on the underlying awaitable.
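
For illustration only, the Result idea for missing keys could look roughly like the sketch below. ToyStore, Found, and Missing are hypothetical names made up for this example and are not part of the Zarr-Python Store API; the point is just that a miss becomes an explicit value instead of None.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar, Union

T = TypeVar("T")


@dataclass(frozen=True)
class Found(Generic[T]):
    """The key was present; `value` holds the stored payload."""
    value: T


@dataclass(frozen=True)
class Missing:
    """The key was absent; `key` records which one was requested."""
    key: str


# A Result is either a hit or an explicit miss (never a bare None).
Result = Union[Found[T], Missing]


class ToyStore:
    """Hypothetical dict-backed store, NOT the real Zarr Store interface."""

    def __init__(self) -> None:
        self._data = {}

    def set(self, key: str, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: str) -> "Result[bytes]":
        if key in self._data:
            return Found(self._data[key])
        return Missing(key)


store = ToyStore()
store.set("zarr.json", b"{}")
hit = store.get("zarr.json")   # Found(value=b"{}")
miss = store.get("c/0/0")      # Missing(key="c/0/0")
```

Callers would then branch with isinstance (or match) on Found and Missing. Whether the extra object allocation costs anything measurable is exactly the performance question raised in the bullet above.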

normanrz Jul 30, 2025
Maintainer

> a lazy array indexing API.
> I actually think this will unlock some performance improvements in our array IO.

Is that the same as array "views"? I would be a fan of that

> remove StorePath, give all of its methods to the Store classes

What is the rationale for that? I think it is quite useful to have a pointer to a file (or even byte range) in a store.

d-v-b Jul 30, 2025
Maintainer

> Is that the same as array "views"? I would be a fan of that

Yeah, we could think of it like "views" but I think the more basic analogy is with the semantics of slicing generic collections. When you slice into a tuple, you get another tuple, not a numpy array. Zarr should follow the same principle. In concrete terms this would require modelling a zarr array as supported by a collection of (stored object, index) tuples. Indexing the zarr array would create a new array supported by a strict subset of the original collection of (stored object, index) tuples. This would also allow indexing to be reversible, i.e. we could concatenate zarr arrays, or build zarr arrays from stored objects on different storage backends.
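
A minimal 1-D sketch of that model, under loud assumptions: Part and LazyArray are hypothetical names, plain dicts of lists stand in for storage backends, and only contiguous slices are handled. It is not how Zarr-Python indexes today; it only shows slicing returning another lazy array backed by a subset of (stored object, index) pairs, and concatenation across two backends.

```python
from dataclasses import dataclass


@dataclass
class Part:
    """One (stored object, index) pair: a chunk key plus a range inside it."""
    store: dict   # hypothetical stand-in for a storage backend
    key: str
    start: int
    stop: int

    def __len__(self) -> int:
        return self.stop - self.start


class LazyArray:
    """Hypothetical lazy array: holds references to stored data, reads on demand."""

    def __init__(self, parts):
        self.parts = tuple(parts)

    def __len__(self) -> int:
        return sum(len(p) for p in self.parts)

    def __getitem__(self, sel: slice) -> "LazyArray":
        start, stop, step = sel.indices(len(self))
        if step != 1:
            raise NotImplementedError("sketch supports contiguous slices only")
        out, offset = [], 0
        for p in self.parts:
            # Intersect the requested range with this part's extent.
            lo = max(start, offset)
            hi = min(stop, offset + len(p))
            if lo < hi:
                out.append(Part(p.store, p.key,
                                p.start + lo - offset, p.start + hi - offset))
            offset += len(p)
        return LazyArray(out)   # no data has been read yet

    def __add__(self, other: "LazyArray") -> "LazyArray":
        # Concatenation just joins the two collections of parts.
        return LazyArray(self.parts + other.parts)

    def read(self) -> list:
        data = []
        for p in self.parts:
            data.extend(p.store[p.key][p.start:p.stop])
        return data


store_a = {"c/0": [0, 1, 2, 3], "c/1": [4, 5, 6, 7]}
store_b = {"c/0": [100, 101]}   # a second, separate "backend"

arr = LazyArray([Part(store_a, "c/0", 0, 4), Part(store_a, "c/1", 0, 4)])
view = arr[2:6]                                       # still a LazyArray
combined = view + LazyArray([Part(store_b, "c/0", 0, 2)])
```

Here view.read() touches only the two elements it needs from each chunk, and combined draws on both stores.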

> What is the rationale for that? I think it is quite useful to have a pointer to a file (or even byte range) in a store.

If it's useful to have this, then it should be part of the Store API. StorePath is literally just a store, a string, and a set of convenience methods that use the store and the string. We can get the exact same functionality by putting all of this logic on the store classes themselves, with the benefit of removing a lot of unnecessary code.
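
As a rough illustration of that argument, the toy classes below (hypothetical names, synchronous for brevity, and not the actual Zarr-Python Store or StorePath implementations) show a wrapper that only combines a store and a string, next to the same lookup done with methods on the store itself.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToyStore:
    """Hypothetical in-memory store; method names are illustrative only."""
    data: dict = field(default_factory=dict)

    def get(self, key: str) -> Optional[bytes]:
        return self.data.get(key)

    def set(self, key: str, value: bytes) -> None:
        self.data[key] = value

    def join(self, prefix: str, name: str) -> str:
        # Path logic living on the store rather than on a wrapper.
        return f"{prefix.rstrip('/')}/{name}" if prefix else name


@dataclass
class ToyStorePath:
    """The wrapper pattern: a store, a string, and methods that combine them."""
    store: ToyStore
    path: str = ""

    def __truediv__(self, name: str) -> "ToyStorePath":
        return ToyStorePath(self.store, self.store.join(self.path, name))

    def get(self) -> Optional[bytes]:
        return self.store.get(self.path)


store = ToyStore()
store.set("group/array/zarr.json", b"{}")

# Same lookup, with and without the wrapper object.
via_wrapper = (ToyStorePath(store) / "group" / "array" / "zarr.json").get()
via_store = store.get(store.join(store.join("group", "array"), "zarr.json"))
```

The wrapper reads more fluently at the call site, which is roughly the usefulness raised in the question; the store-only form has one class fewer to maintain.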

normanrz Jul 30, 2025
Maintainer

> Is that the same as array "views"? I would be a fan of that
> Yeah, we could think of it like "views" but I think the more basic analogy is with the semantics of slicing generic collections. When you slice into a tuple, you get another tuple, not a numpy array. Zarr should follow the same principle. In concrete terms this would require modelling a zarr array as supported by a collection of (stored object, index) tuples. Indexing the zarr array would create a new array supported by a strict subset of the original collection of (stored object, index) tuples. This would also allow indexing to be reversible, i.e. we could concatenate zarr arrays, or build zarr arrays from stored objects on different storage backends.

I think that would be great. It would need careful API design to become usable and not too confusing. For my cases, an iterator that returns read/writable chunk or shard views would suffice.

d-v-b Jul 30, 2025
Maintainer

Yes, my plan here is to start with low-level stuff like https://github.com/d-v-b/zarr-python/blob/0b9916443d555d9e762f5501314383dc828c26bf/src/zarr/core/array.py#L5253-L5286 and start working that into our indexing routines.

martindurant
Jul 30, 2025
Maintainer

> Define an awaitable Future object

This has been done before in dask-distributed (inspired by concurrent.futures). fsspec decided not to follow the model, although it was discussed - of course we have sync() in common. I wonder if there is scope to come up with a spinoff project (e.g., https://docs.rs/futures/latest/futures/ !) for the general public good.
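
For readers unfamiliar with the pattern, here is a minimal sketch of what such a wrapper could look like. The Future class and fetch function are hypothetical, and asyncio.run stands in for a library-level sync() helper; this is not the dask-distributed, fsspec, or Zarr-Python implementation.

```python
import asyncio
from typing import Any, Awaitable, Generator, Generic, TypeVar

T = TypeVar("T")


class Future(Generic[T]):
    """Hypothetical wrapper: awaitable in async code, blocking via .result()."""

    def __init__(self, awaitable: Awaitable[T]) -> None:
        self._awaitable = awaitable

    def __await__(self) -> Generator[Any, None, T]:
        # Delegate to the wrapped awaitable so `await future` just works.
        return self._awaitable.__await__()

    def result(self) -> T:
        # Assumption: asyncio.run replaces a real sync() helper here. It only
        # works when no event loop is already running in the current thread.
        async def _wait() -> T:
            return await self._awaitable

        return asyncio.run(_wait())


async def _fetch(key: str) -> bytes:
    await asyncio.sleep(0)   # pretend to do IO
    return key.encode()


def fetch(key: str) -> Future[bytes]:
    """User-facing API: returns a Future instead of a bare coroutine."""
    return Future(_fetch(key))


# Synchronous callers block on the result...
blocking = fetch("a").result()


# ...while async callers await the same object.
async def main() -> bytes:
    return await fetch("b")


awaited = asyncio.run(main())
```

Each Future in this sketch can be consumed once, since it wraps a single coroutine object; a real design would also have to decide what happens when result() is called from inside a running event loop.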

martindurant
Jul 30, 2025
Maintainer

> remove StorePath, give all of its methods to the Store classes

This is not really user-facing, so whatever makes the most sense internally. If you can remove code and complexity, it's probably worth it.
