Uh oh!
There was an error while loading.Please reload this page.
- Notifications
You must be signed in to change notification settings - Fork366
Add properties to get chunk and shard slices#3573
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
base:main
Are you sure you want to change the base?
Conversation
e3bee0b tof5ea95aCompare| returnawaitself.store_path.store.getsize_prefix(self.store_path.path) | ||
| @property | ||
| defchunk_slices(self)->Generator[tuple[slice, ...]]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
nitpicking the name: i find "slice" to be kind of ambiguous between a verb and a noun, and it also locks us in to returning slice objects, so what if we use the word"region" instead?
and also I think it's helpful if the name of this routine makes it clear that it's an iterator. So what if we call ititer_chunk_regions, i.e. the name of the routine it wraps 😜
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
re. name, I think "slices" is unambiguously a noun because it's plural? I looked at NumPy (https://numpy.org/doc/stable/user/basics.indexing.html#slicing-and-striding), and they use the terms "index", "selection tuple", or "slicing tuple". If we try and stay consistent with NumPy, how about "chunk_indices"?
I like "regions", but to me it's ambiguous whether that means the index, or the array data at that index. If we settle on it that could be fixed by documentation and consistent use though.
re. iterator, do you mean generator? A list/string etc. are also iterators, but using ayield makes this into more specifically a generator.
So perhapsgenerate_chunk_indices? I find that a bit clunky though,
forchunk_indexinarr.chunk_indices:
is much nicer than
forchunk_indexinarr.generate_chunk_indices:
(oriter_chunk_indices)
which is why I prefer simplychunk_indices. I don't knkow if there's prior art in other libraries for naming generators?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
generatorsare iterators, and my thinking was that putting "iter" in the name conveys that users should expect toiterate over the value returned by calling this method.
when I think of iterating over indices, I think of iterating over tuples of coordinates, e.g.,(0, 0, 0),(0, 0, 1), .... which is what_iter_chunk_coords /_iter_shard_coords do right now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I like "regions", but to me it's ambiguous whether that means the index, or the array data at that index. If we settle on it that could be fixed by documentation and consistent use though.
I worry that this ambiguity will hold for any name we pick :)
Fixes#2454.