I'm having difficulties with the default [...] Here's an example:

```python
import zarr
from zarr.storage import LocalStore

# Create an array with one full chunk of shape (3,4) and one partial chunk (1,4)
shape = (4, 4)
chunks = (3, 4)
new_dtype = "uint8"
overwrite = True
zarr_format = 3
store = LocalStore(root=".vscode/zarr-data/example.zarr", read_only=False)
arr = zarr.create_array(
    store,
    name="0",
    shape=shape,
    chunks=chunks,
    dtype=new_dtype,
    zarr_format=3,
    compressors=None,
    filters=None,
    overwrite=overwrite,
)
arr[:] = 42
```

Inspect the size of the two chunks on disk:

```bash
ls -l .vscode/zarr-data/example.zarr/0/c/0/0 | awk '{print $5}'  # 12 (expected)
ls -l .vscode/zarr-data/example.zarr/0/c/1/0 | awk '{print $5}'  # 12 (I would expect 4)
```
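To make the numbers above concrete, here is a minimal sketch of the arithmetic for a regular chunk grid with no compression. The helper name `chunk_report` is hypothetical and not part of zarr; it only assumes that every chunk is stored at the full chunk size, which is what the `ls` output shows.

```python
import itertools
import math


def chunk_report(shape, chunks, itemsize):
    """Map each chunk index to its stored size and the size of its in-bounds data.

    Hypothetical helper for illustration, assuming an uncompressed regular grid
    where every chunk occupies prod(chunks) * itemsize bytes.
    """
    grid = tuple(math.ceil(s / c) for s, c in zip(shape, chunks))
    stored = math.prod(chunks) * itemsize
    report = {}
    for idx in itertools.product(*(range(g) for g in grid)):
        # Part of the chunk that actually overlaps the array bounds.
        valid_shape = tuple(min(c, s - i * c) for i, c, s in zip(idx, chunks, shape))
        report[idx] = {
            "stored_bytes": stored,
            "valid_bytes": math.prod(valid_shape) * itemsize,
        }
    return report


# Same geometry as the example: shape (4, 4), chunks (3, 4), uint8 (1 byte).
report = chunk_report((4, 4), (3, 4), 1)
```

Chunk `(1, 0)` holds only 4 bytes of in-bounds data, yet it is stored as 12 bytes, which matches the observation in the question.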
Is this interpretation correct? If so, is it intentional?
For the default chunk grid, yes to both questions.
This can still work for virtualization, but only if byte ranges are addressable in the virtualization scheme and the byte range for all the boundary chunks has been calculated.
Interesting, thanks for the link and explanation.
Hmm, I think that alone is not sufficient. For example, I have the addressable byte ranges for boundary chunks but encounter a reshape error in the ArrayBytes codec because the buffer length is less than a full chunk's buffer length. IIUC one option is to add a BytesBytes (e.g., [...]). An alternative would be to propose a chunk grid extension, but that risks limiting interoperability.
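The failure mode described here can be reproduced without zarr. The sketch below uses the standard library's `memoryview.cast` as a stand-in for the reshape step inside an array-to-bytes codec; it is not zarr's actual code path, and `decode_chunk` is a hypothetical name.

```python
def decode_chunk(buf: bytes, chunk_shape):
    """View raw uint8 bytes as an N-D chunk (stand-in for the codec's reshape)."""
    return memoryview(buf).cast("B", shape=list(chunk_shape))


full = bytes([42]) * 12     # a complete (3, 4) uint8 chunk
partial = bytes([42]) * 4   # only the in-bounds (1, 4) region of a boundary chunk

rows = decode_chunk(full, (3, 4)).tolist()

try:
    decode_chunk(partial, (3, 4))
    short_buffer_error = None
except TypeError as exc:
    # The buffer is shorter than prod(chunk_shape) * itemsize, so the cast fails.
    short_buffer_error = str(exc)
```

A byte range that covers only the in-bounds data yields a buffer that is too short for the full chunk shape, so the decode step rejects it.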
I guess we could also define a custom [...]
Ah, and I was wrong about the byte addressing thing: on the encoding side, I think partial chunks are padded to full size before the codec pipeline runs? If so, there's no way a byte range can be helpful, because the entire padded chunk will be compressed.
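A rough sketch of that reasoning, assuming the padding behaviour guessed at above. It uses `zlib` from the standard library as a stand-in compressor and a fill value of 0; both are assumptions for illustration, and `pad_to_full_chunk` is a hypothetical helper rather than anything in zarr.

```python
import zlib

CHUNK_ROWS, ROW_LEN = 3, 4   # chunk shape (3, 4), uint8
FILL = 0                     # assumed fill value


def pad_to_full_chunk(valid: bytes) -> bytes:
    """Pad in-bounds bytes to the full chunk length.

    Appending at the end is only correct here because the chunk is row-major
    and the boundary lies along the first axis.
    """
    full_len = CHUNK_ROWS * ROW_LEN
    return valid + bytes([FILL]) * (full_len - len(valid))


valid = bytes([42]) * ROW_LEN          # the single in-bounds row
padded = pad_to_full_chunk(valid)      # what would enter the codec pipeline
encoded = zlib.compress(padded)        # compressed bytes cover data and padding together
decoded = zlib.decompress(encoded)     # the whole chunk has to be decoded first
recovered = decoded[: len(valid)]      # then sliced back to the in-bounds region
```

Once the padded chunk has been compressed, the in-bounds data no longer corresponds to a fixed byte range of the stored object, so a reader has to decode the full chunk and slice afterwards.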
Note that this is the same issue as #3035. Here's the explanation for this behavior that I provided there.
For the default chunk grid, every chunk is identical in terms of how it is stored. There is nothing special about the final chunk.
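One way to see why the final chunk is not special: on a regular grid, locating an element is pure integer arithmetic that never consults the array shape. The sketch below is illustrative only; `locate` is a hypothetical helper, and the `c/<i>/<j>` key layout is taken from the paths shown in the question.

```python
def locate(index, chunks):
    """Return the chunk key and within-chunk offset for an array index.

    Hypothetical helper for illustration. The array shape is not needed,
    which is why boundary chunks are laid out like every other chunk.
    """
    chunk_idx = tuple(i // c for i, c in zip(index, chunks))
    offset = tuple(i % c for i, c in zip(index, chunks))
    key = "c/" + "/".join(str(i) for i in chunk_idx)
    return key, offset


first = locate((0, 0), (3, 4))   # element in the full chunk
last = locate((3, 2), (3, 4))    # element in the boundary chunk
```

Both lookups follow the same rule; the boundary chunk differs only in how much of its content lies inside the array bounds.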