Difference Between Codecs and Compressors #3494

jmdelahanty started this conversation in General

I'm trying to understand the difference between the codecs and compressors arguments in v3, and when I should use each in my code. Is there a general rule of thumb I can reference for when to specify a list of compressors vs. a list of codecs?


At the low level, filters, serializer, and compressors are all formally codecs in the V3 spec. The terminology of filters / serializer / compressors is used in the Zarr-Python API in order to:

  • Provide continuity with V2
  • Be more explicit about the roles that these different codecs play in the codec pipeline

Specifically, filters are "array-to-array" codecs:

    def filters(self) -> tuple[Numcodec, ...] | tuple[ArrayArrayCodec, ...]:
        """
        Filters that are applied to each chunk of the array, in order, before serializing that
        chunk to bytes.
        """

serializer (there can only be one) is an array-to-bytes codec

    def serializer(self) -> ArrayBytesCodec | None:
        """
        Array-to-bytes codec to use for serializing the chunks into bytes.
        """

and compressors are bytes-to-bytes codecs:

    def compressors(self) -> tuple[Numcodec, ...] | tuple[BytesBytesCodec, ...]:
        """
        Compressors that are applied to each chunk of the array. Compressors are applied in order, and after any
        filters are applied (if any are specified) and the data is serialized into bytes.
        """

If I specify each of these, they all end up in the codecs metadata for the array, e.g.

    import zarr

    a = zarr.create_array(
        shape=10,
        dtype='f8',
        filters=[zarr.codecs.Delta()],
        serializer=zarr.codecs.BytesCodec(),
        compressors=[zarr.codecs.ZstdCodec()],
        store=zarr.storage.MemoryStore(),
    )
    print(a.metadata.codecs)
    # -> (Delta(codec_name='numcodecs.delta', codec_config={}), BytesCodec(endian=<Endian.little: 'little'>), ZstdCodec(level=0, checksum=False))
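
To make the three roles concrete, here is a minimal standard-library sketch of the same ordering: an array-to-array step, then a single array-to-bytes step, then a bytes-to-bytes step, all undone in reverse on decode. It does not use Zarr's codec classes. The function names (`delta_encode`, `to_bytes`, `encode_chunk`, and so on) are hypothetical, with `struct` standing in for the serializer and `zlib` standing in for the compressor.

```python
import struct
import zlib

# Hypothetical stand-ins for the three codec roles; this is not Zarr's implementation.


def delta_encode(values):
    """Array-to-array 'filter': keep the first value, then store differences."""
    return values[:1] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas):
    """Inverse of delta_encode: a running sum of the stored differences."""
    out = []
    for d in deltas:
        out.append(d if not out else out[-1] + d)
    return out


def to_bytes(values):
    """Array-to-bytes 'serializer': little-endian float64, one per value."""
    return struct.pack(f"<{len(values)}d", *values)


def from_bytes(raw):
    """Inverse of to_bytes: unpack 8-byte little-endian floats."""
    return list(struct.unpack(f"<{len(raw) // 8}d", raw))


def encode_chunk(values):
    # Order matters: filter first, then the one serializer, then the compressor.
    return zlib.compress(to_bytes(delta_encode(values)))


def decode_chunk(blob):
    # Decoding undoes each step in reverse order.
    return delta_decode(from_bytes(zlib.decompress(blob)))


chunk = [1.0, 2.0, 4.0, 7.0, 11.0]
blob = encode_chunk(chunk)
assert decode_chunk(blob) == chunk
```

Only the serializer changes the type of the data (values in, bytes out), which is one way to see why there can be exactly one of it while filters and compressors can each be chained.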

Hope this helps!

@jmdelahanty

Thank you so much, this does help!

So is it best practice to specify each piece separately, like in your final example? Or is the way I've been setting things up below sufficient? I'm trying to learn best practices for setting things up with zarr, so I'm very grateful for your time!

    # from my imports...
    from zarr.codecs import BloscCodec

    # later in the code...
    # Compressor
    compressor = BloscCodec(typesize=1, cname='lz4', clevel=1, shuffle="bitshuffle")

    # Create output arrays
    roi_images = crop_group.create_array(
        'roi_images',
        shape=(total_detections, *roi_sz),
        chunks=(min(chunk_size * 4, total_detections), roi_sz[0], roi_sz[1]),
        dtype='uint8',
        overwrite=True,
        compressors=compressor
    )

@rabernat

What you're doing is fine. There is no filter by default and the default serializer is fine.
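
As a rough illustration of why passing only a compressor is enough, the sketch below models the idea of filling in the roles that were left unspecified. `resolve_pipeline` and the placeholder strings are hypothetical, chosen to mirror the reply above (no filters, one default serializer); this is not Zarr-Python's actual defaults logic.

```python
def resolve_pipeline(filters=None, serializer=None, compressors=None):
    """Hypothetical helper: fill unspecified codec roles with defaults."""

    def as_tuple(value):
        # Accept nothing, a single codec, or a list/tuple of codecs.
        if value is None:
            return ()
        if isinstance(value, (list, tuple)):
            return tuple(value)
        return (value,)

    return {
        "filters": as_tuple(filters),          # no filters unless requested
        "serializer": serializer or "bytes",   # always exactly one serializer
        "compressors": as_tuple(compressors),
    }


# Strings stand in for codec objects in this toy model.
pipeline = resolve_pipeline(compressors="blosc-lz4")
print(pipeline)
# -> {'filters': (), 'serializer': 'bytes', 'compressors': ('blosc-lz4',)}
```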

@jmdelahanty

Thank you for your help! Zarr is awesome.

Category
General
Labels
None yet
2 participants
@jmdelahanty@rabernat
