Draft: jigsaw codecs: allow codecs to specify the buffers they work on #3529
base:main
Conversation
It just dawned on me that we can potentially split up the sparse codec (which is an array-to-bytes codec) into two parts: an array-to-array codec that extracts the metadata and component arrays of the sparse array and creates a specialized "multi-array buffer" for sparse arrays, and a generalized array-to-bytes codec that takes the "multi-array buffer" and packs it into bytes. This means that the metadata we extract has to live in the array-to-array codec's configuration. Should we then want a similar procedure for a different array type (e.g. masked arrays or geoarrow-encoded geometry arrays), we can just create a specialized pair of array-to-array codec and "multi-array buffer" type, and reuse the "multi-array to bytes" codec.
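To make the proposed split concrete, here is a rough sketch of the two halves. It is not code from this PR: `split_coo`, `pack_arrays`, and `unpack_arrays` are hypothetical names, the COO layout is just one possible sparse format, and the byte layout (length-prefixed JSON header followed by raw array bytes) is an assumption chosen for illustration.

```python
import json
import math

import numpy as np


def split_coo(dense):
    """Array-to-array half (hypothetical): extract component arrays and metadata.

    The returned metadata is what would have to live in the
    array-to-array codec's configuration.
    """
    mask = dense != 0
    coords = np.argwhere(mask).astype(np.int64)  # shape (nnz, ndim), C order
    data = dense[mask]                           # same C order as coords
    metadata = {"format": "coo", "shape": list(dense.shape)}
    return {"coords": coords, "data": data}, metadata


def pack_arrays(arrays):
    """Generic "multi-array to bytes" half: knows nothing about sparsity."""
    header = {
        name: {"dtype": a.dtype.str, "shape": list(a.shape)}
        for name, a in arrays.items()
    }
    header_bytes = json.dumps(header).encode("utf-8")
    payload = b"".join(np.ascontiguousarray(a).tobytes() for a in arrays.values())
    # 8-byte little-endian header length, then the header, then the raw arrays.
    return len(header_bytes).to_bytes(8, "little") + header_bytes + payload


def unpack_arrays(blob):
    """Inverse of pack_arrays: rebuild the named component arrays."""
    header_len = int.from_bytes(blob[:8], "little")
    header = json.loads(blob[8 : 8 + header_len])
    offset = 8 + header_len
    arrays = {}
    for name, spec in header.items():
        dtype = np.dtype(spec["dtype"])
        nbytes = math.prod(spec["shape"]) * dtype.itemsize
        chunk = blob[offset : offset + nbytes]
        arrays[name] = np.frombuffer(chunk, dtype=dtype).reshape(spec["shape"])
        offset += nbytes
    return arrays


dense = np.array([[0, 3, 0], [4, 0, 5]], dtype=np.int32)
components, metadata = split_coo(dense)
restored = unpack_arrays(pack_arrays(components))
```

Because `pack_arrays` only sees a mapping of named arrays, a masked-array or geometry codec could hand it a different set of components without changing the packer.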
While working on the sparse codec (and after discussing with @d-v-b and @jhamman at the zarr summit), I noticed that it should be possible to have codecs declare their input and output buffer types. The codec pipeline can then verify that the codecs form a chain of buffer types (kind of like jigsaw puzzle pieces), and infer the codec pipeline's buffer prototype as the input of the first array-to-array codec and the output of the last bytes-to-bytes codec.
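A minimal sketch of the chain check, to illustrate the idea rather than the actual implementation in this PR. `CodecSpec`, `infer_pipeline_buffers`, and the string buffer kinds are all hypothetical; the real code may well represent buffer types as classes.

```python
from dataclasses import dataclass

# Hypothetical buffer kinds, used only for this illustration.
SPARSE_ARRAY = "sparse-array"
MULTI_ARRAY = "multi-array"
BYTES = "bytes"


@dataclass(frozen=True)
class CodecSpec:
    name: str
    input_buffer: str   # buffer type the codec consumes when encoding
    output_buffer: str  # buffer type the codec produces when encoding


def infer_pipeline_buffers(codecs):
    """Check that adjacent codecs fit together, then return (input, output)."""
    if not codecs:
        raise ValueError("pipeline needs at least one codec")
    for left, right in zip(codecs, codecs[1:]):
        if left.output_buffer != right.input_buffer:
            raise TypeError(
                f"{left.name} outputs {left.output_buffer!r} but "
                f"{right.name} expects {right.input_buffer!r}"
            )
    # The pipeline as a whole takes the first codec's input
    # and yields the last codec's output.
    return codecs[0].input_buffer, codecs[-1].output_buffer


pipeline = [
    CodecSpec("sparse-split", SPARSE_ARRAY, MULTI_ARRAY),
    CodecSpec("multi-array-to-bytes", MULTI_ARRAY, BYTES),
    CodecSpec("compress", BYTES, BYTES),
]
print(infer_pipeline_buffers(pipeline))
```

Swapping the last two entries would raise a `TypeError`, since the compressor's `bytes` output does not match the packer's `multi-array` input.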
TODO:
- docs/user-guide/*.md
- changes/