Indexing¶
tensorstore.TensorStore (and objects of other tensorstore.Indexable types) support a common set of indexing operations for read/write access to individual positions and subsets of positions. In addition to full support for NumPy-style basic and advanced indexing, dimension expressions provide additional indexing capabilities integrated with TensorStore’s support for labeled/named dimensions and non-zero origins.
Note
In TensorStore, all indexing operations result in a (read/write) view of the original object, represented as a new object of the same type with a different tensorstore.IndexDomain. Indexing operations never implicitly perform I/O or copy data. This differs from NumPy indexing, where basic indexing results in a view of the original data, but advanced indexing always results in a copy.
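For comparison, the NumPy behavior described in this note can be checked directly. The following is a minimal sketch that uses only NumPy (it does not involve TensorStore); the variable names are illustrative.

```python
import numpy as np

base = np.arange(6)

# Basic indexing (a slice) returns a view that shares memory with `base`.
basic = base[1:4]
# Advanced indexing (an integer index array) returns an independent copy.
advanced = base[[1, 2, 3]]

basic_is_view = np.shares_memory(base, basic)
advanced_is_view = np.shares_memory(base, advanced)

# Writing through the view modifies `base`; writing to the copy does not.
basic[0] = 100
advanced[1] = -1
print(basic_is_view, advanced_is_view, base.tolist())
```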
Index transforms¶
Indexing operations are composed into a normalized representation via the tensorstore.IndexTransform class, which represents an index transform from an input space to an output space. The examples below may include the index transform representation.
NumPy-style indexing¶
NumPy-style indexing is performed using the syntax obj[expr], where obj is any tensorstore.Indexable object and the indexing expression expr is one of:
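an integer;
a slice object start:stop:step;
tensorstore.newaxis (equal to None);
Ellipsis (...);
an array_like of integer values;
an array_like of bool values;
a tuple of any of the above.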
This form of indexing always operates on a prefix of the dimensions, consuming dimensions from the existing domain and adding dimensions to the resultant domain in order; if the indexing expression consumes fewer than obj.rank dimensions, the remaining dimensions are retained unchanged as if indexed by :.
Integer indexing¶
Indexing with an integer selects a single position within the corresponding dimension:

>>> a = ts.array([[0, 1, 2], [3, 4, 5]], dtype=ts.int32)
>>> a[1]
TensorStore({
  'array': [3, 4, 5],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {'input_exclusive_max': [3], 'input_inclusive_min': [0]},
})
>>> a[1, 2]
TensorStore({
  'array': 5,
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {'input_rank': 0},
})

Each integer index consumes a single dimension from the original domain and adds no dimensions to the result domain.
Because TensorStore supports index domains defined over negative indices, negative values have no special meaning; they simply refer to negative positions:

>>> a = await ts.open({
...     "dtype": "int32",
...     "driver": "array",
...     "array": [1, 2, 3],
...     "transform": {
...         "input_shape": [3],
...         "input_inclusive_min": [-10],
...         "output": [{
...             "input_dimension": 0,
...             "offset": 10
...         }],
...     },
... })
>>> a[-10]
TensorStore({
  'array': 1,
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {'input_rank': 0},
})

Warning
This differs from the behavior of the built-in sequence types and numpy.ndarray, where a negative index specifies a position relative to the end (upper bound).
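The contrast with built-in sequences can be seen with a plain Python list. The sketch below makes no TensorStore calls; the helper `from_upper_bound` is hypothetical and only models the explicit arithmetic a caller needs when negative indices carry no special meaning.

```python
values = [1, 2, 3]

# Built-in sequences interpret a negative index relative to the end.
last = values[-1]


def from_upper_bound(exclusive_max: int, n: int) -> int:
    """Hypothetical helper: the position `n` steps below an exclusive upper bound."""
    return exclusive_max - n


# The example domain above has origin -10 and shape 3, i.e. [-10, -7),
# so its last valid position is -8 rather than -1.
last_position = from_upper_bound(exclusive_max=-7, n=1)
print(last, last_position)
```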
Specifying an index outside the explicit bounds of a dimension results in an immediate error:

>>> a = ts.array([0, 1, 2, 3], dtype=ts.int32)
>>> a[4]
Traceback (most recent call last):
    ...
IndexError: OUT_OF_RANGE: Checking bounds of constant output index map for dimension 0: Index 4 is outside valid range [0, 4)...

Specifying an index outside the implicit bounds of a dimension is permitted:

>>> a = ts.IndexTransform(input_shape=[4], implicit_lower_bounds=[True])
>>> a[-1]
Rank 0 -> 1 index space transform:
  Input domain:
  Output index maps:
    out[0] = -1
>>> a[4]
Traceback (most recent call last):
    ...
IndexError: OUT_OF_RANGE: Checking bounds of constant output index map for dimension 0: Index 4 is outside valid range (-inf, 4)...

While implicit bounds do not constrain indexing operations, the bounds will still be checked by any subsequent read or write operation, which will fail if any index is actually out of bounds.
Note
In addition to the int type, integer indices may be specified using any object that supports the __index__ protocol (PEP 357), including NumPy integer scalar types.
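As an illustration of the __index__ protocol itself, the sketch below uses only the Python standard library. The `Position` class is hypothetical; it stands in for any integer-like type, such as a NumPy integer scalar.

```python
import operator


class Position:
    """Hypothetical integer-like wrapper that implements the __index__ protocol."""

    def __init__(self, value: int):
        self._value = value

    def __index__(self) -> int:
        return self._value


values = [10, 20, 30]

# Built-in sequences call __index__ to convert the object to an integer index.
picked = values[Position(2)]

# operator.index performs the same conversion explicitly.
as_int = operator.index(Position(2))
print(picked, as_int)
```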
Interval indexing¶
Indexing with a slice object start:stop:step selects an interval or strided interval within the corresponding dimension:

>>> a = ts.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=ts.int32)
>>> a[1:5]
TensorStore({
  'array': [1, 2, 3, 4],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {
    'input_exclusive_max': [5],
    'input_inclusive_min': [1],
    'output': [{'input_dimension': 0, 'offset': -1}],
  },
})

As for the built-in sequence types, the start value is inclusive while the stop value is exclusive.
Each of start, stop, and step may be an integer, None, or omitted (equivalent to specifying None). Specifying None for start or stop retains the existing lower or upper bound, respectively, for the dimension. Specifying None for step is equivalent to specifying 1.
When the step is 1, the domain of the resulting sliced dimension is not translated to have an origin of zero; instead, it has an origin equal to the start position of the interval (or the existing origin if the start position is unspecified):

>>> a = ts.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=ts.int32)
>>> a[1:5][2]
TensorStore({
  'array': 2,
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {'input_rank': 0},
})

If the step is not 1, the origin of the resulting sliced dimension is equal to the start position divided by the step value, rounded towards zero:

>>> a = ts.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=ts.int32)
>>> a[3:8:2]
TensorStore({
  'array': [3, 5, 7],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {
    'input_exclusive_max': [4],
    'input_inclusive_min': [1],
    'output': [{'input_dimension': 0, 'offset': -1}],
  },
})
>>> a[7:3:-2]
TensorStore({
  'array': [7, 5],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {
    'input_exclusive_max': [-1],
    'input_inclusive_min': [-3],
    'output': [{'input_dimension': 0, 'offset': 3}],
  },
})

It is an error to specify an interval outside the explicit bounds of a dimension:

>>> a = ts.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=ts.int32)
>>> a[3:12]
Traceback (most recent call last):
    ...
IndexError: OUT_OF_RANGE: Computing interval slice for dimension 0: Slice interval [3, 12) is not contained within domain [0, 10)...

Warning
This behavior differs from that of the built-in sequence types and numpy.ndarray, where any out-of-bounds indices within the interval are silently skipped.
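The clamping behavior of built-in sequences mentioned in this warning can be observed with a plain list. The sketch below uses only built-in Python; `checked_interval` is a hypothetical helper that models a strict bounds check in the spirit of the error shown above, not TensorStore's actual implementation.

```python
values = list(range(10))

# Built-in sequences silently clamp an out-of-range slice to the valid range.
clamped = values[3:12]

# slice.indices() exposes the clamped (start, stop, step) for a given length.
start, stop, step = slice(3, 12).indices(len(values))


def checked_interval(start: int, stop: int, lower: int, upper: int):
    """Hypothetical strict check: reject the interval instead of clamping it."""
    if not (lower <= start and stop <= upper):
        raise IndexError(
            f"Slice interval [{start}, {stop}) is not contained "
            f"within domain [{lower}, {upper})")
    return start, stop


try:
    checked_interval(3, 12, 0, 10)
    rejected = False
except IndexError:
    rejected = True
print(clamped, (start, stop, step), rejected)
```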
Specifying an interval outside the implicit bounds of a dimension is permitted:

>>> a = ts.IndexTransform(input_shape=[4], implicit_lower_bounds=[True])
>>> a[-1:2]
Rank 1 -> 1 index space transform:
  Input domain:
    0: [-1, 2)
  Output index maps:
    out[0] = 0 + 1 * in[0]

If a non-None value is specified for start or stop, the lower or upper bound, respectively, of the resultant dimension will be marked explicit. If None is specified for start or stop, the lower or upper bound, respectively, of the resultant dimension will be marked explicit if the corresponding original bound is marked explicit.
As with integer indexing, negative start or stop values have no special meaning, and simply indicate negative positions.
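The origin rule for strided intervals stated earlier in this section can be modeled in a few lines of plain Python. This is a sketch under stated assumptions, not TensorStore's implementation: the function names are hypothetical, start and stop are assumed to be given explicitly, and the number of selected positions is assumed to match Python's `range`. It reproduces the domains shown for a[3:8:2] and a[7:3:-2] above.

```python
def trunc_div(a: int, b: int) -> int:
    """Integer division rounded towards zero (Python's // rounds towards -inf)."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b > 0) else -q


def strided_domain(start: int, stop: int, step: int) -> tuple:
    """Return (inclusive_min, exclusive_max) of the resulting sliced dimension."""
    origin = trunc_div(start, step)
    size = len(range(start, stop, step))  # assumed number of selected positions
    return origin, origin + size


forward = strided_domain(3, 8, 2)
backward = strided_domain(7, 3, -2)
print(forward, backward)
```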
Any of the start, stop, or step values may be specified as a sequence of integer or None values (e.g. a list, tuple or 1-d numpy.ndarray), rather than a single integer:

>>> a = ts.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]],
...              dtype=ts.int32)
>>> a[(1, 1):(3, 4)]
TensorStore({
  'array': [[6, 7, 8], [10, 11, 12]],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {
    'input_exclusive_max': [3, 4],
    'input_inclusive_min': [1, 1],
    'output': [
      {'input_dimension': 0, 'offset': -1},
      {'input_dimension': 1, 'offset': -1},
    ],
  },
})

This is equivalent to specifying a sequence of slice objects:

>>> a = ts.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]],
...              dtype=ts.int32)
>>> a[1:3, 1:4]
TensorStore({
  'array': [[6, 7, 8], [10, 11, 12]],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {
    'input_exclusive_max': [3, 4],
    'input_inclusive_min': [1, 1],
    'output': [
      {'input_dimension': 0, 'offset': -1},
      {'input_dimension': 1, 'offset': -1},
    ],
  },
})

It is an error to specify a slice with sequences of unequal lengths, but a sequence may be combined with a scalar value:

>>> a = ts.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]],
...              dtype=ts.int32)
>>> a[1:(3, 4)]
TensorStore({
  'array': [[6, 7, 8], [10, 11, 12]],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {
    'input_exclusive_max': [3, 4],
    'input_inclusive_min': [1, 1],
    'output': [
      {'input_dimension': 0, 'offset': -1},
      {'input_dimension': 1, 'offset': -1},
    ],
  },
})

Adding singleton dimensions¶
Specifying a value of tensorstore.newaxis (equal to None) adds a new inert/singleton dimension with implicit bounds \([0, 1)\):

>>> a = ts.IndexTransform(input_rank=2)
>>> a[ts.newaxis]
Rank 3 -> 2 index space transform:
  Input domain:
    0: [0*, 1*)
    1: (-inf*, +inf*)
    2: (-inf*, +inf*)
  Output index maps:
    out[0] = 0 + 1 * in[1]
    out[1] = 0 + 1 * in[2]

This indexing term consumes no dimensions from the original domain and adds a single dimension after any dimensions added by prior indexing operations:

>>> a = ts.IndexTransform(input_rank=2)
>>> a[:, ts.newaxis, ts.newaxis]
Rank 4 -> 2 index space transform:
  Input domain:
    0: (-inf*, +inf*)
    1: [0*, 1*)
    2: [0*, 1*)
    3: (-inf*, +inf*)
  Output index maps:
    out[0] = 0 + 1 * in[0]
    out[1] = 0 + 1 * in[3]

Because the added dimension has implicit bounds, it may be given arbitrary bounds by a subsequent interval indexing term:

>>> a = ts.IndexTransform(input_rank=2)
>>> a[ts.newaxis][3:10]
Rank 3 -> 2 index space transform:
  Input domain:
    0: [3, 10)
    1: (-inf*, +inf*)
    2: (-inf*, +inf*)
  Output index maps:
    out[0] = 0 + 1 * in[1]
    out[1] = 0 + 1 * in[2]

Ellipsis¶
Specifying the special Ellipsis value (...) is equivalent to specifying as many full slices : as needed to consume the remaining dimensions of the original domain not consumed by other indexing terms:

>>> a = ts.array([[[1, 2, 3], [4, 5, 6]]], dtype=ts.int32)
>>> a[..., 1]
TensorStore({
  'array': [2, 5],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {
    'input_exclusive_max': [1, 2],
    'input_inclusive_min': [0, 0],
    'output': [{'input_dimension': 1}],
  },
})

At most one Ellipsis may be specified within a single NumPy-style indexing expression:

>>> a = ts.array([[[1, 2, 3], [4, 5, 6]]], dtype=ts.int32)
>>> a[..., 1, ...]
Traceback (most recent call last):
    ...
IndexError: An index can only have a single ellipsis (`...`)...

As a complete indexing expression, Ellipsis has no effect and is equivalent to the empty tuple (), but can still be useful for the purpose of an assignment:

>>> a = ts.array([0, 1, 2, 3], dtype=ts.int32)
>>> a[...] = 7
>>> a
TensorStore({
  'array': [7, 7, 7, 7],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {'input_exclusive_max': [4], 'input_inclusive_min': [0]},
})

Integer array indexing¶
Specifying an array_like index array of integer values selects the coordinates of the dimension given by the elements of the array:

>>> a = ts.array([5, 4, 3, 2], dtype=ts.int32)
>>> a[[0, 3, 3]]
TensorStore({
  'array': [5, 2, 2],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {'input_exclusive_max': [3], 'input_inclusive_min': [0]},
})
>>> a[[[0, 1], [2, 3]]]
TensorStore({
  'array': [[5, 4], [3, 2]],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {'input_exclusive_max': [2, 2], 'input_inclusive_min': [0, 0]},
})

This indexing term consumes a single dimension from the original domain, and when the full indexing expression involves just a single array indexing term, adds the dimensions of the index array to the result domain.
As with integer and interval indexing, and unlike NumPy, negative values in an index array have no special meaning, and simply indicate negative positions.
When a single indexing expression includes multiple index arrays, vectorized array indexing semantics apply by default: the shapes of the index arrays must all be broadcast-compatible, and the dimensions of the single broadcasted domain are added to the result domain:

>>> a = ts.array([[1, 2], [3, 4], [5, 6]], dtype=ts.int32)
>>> a[[0, 1, 2], [0, 1, 0]]
TensorStore({
  'array': [1, 4, 5],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {'input_exclusive_max': [3], 'input_inclusive_min': [0]},
})
>>> a[[[0, 1], [2, 2]], [[0, 1], [1, 0]]]
TensorStore({
  'array': [[1, 4], [6, 5]],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {'input_exclusive_max': [2, 2], 'input_inclusive_min': [0, 0]},
})
>>> a[[[0, 1], [2, 2]], [0, 1]]
TensorStore({
  'array': [[1, 4], [5, 6]],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {'input_exclusive_max': [2, 2], 'input_inclusive_min': [0, 0]},
})

If all of the index arrays are applied to consecutive dimensions without any interleaved slice, Ellipsis, or tensorstore.newaxis terms (interleaved integer index terms are permitted), then by default legacy NumPy semantics are used: the dimensions of the broadcasted array domain are added inline to the result domain after any dimensions added by prior indexing terms in the indexing expression:

>>> a = ts.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]], dtype=ts.int32)
>>> a[:, [1, 0], [1, 1]]
TensorStore({
  'array': [[4, 2], [8, 6]],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {'input_exclusive_max': [2, 2], 'input_inclusive_min': [0, 0]},
})

If there are any interleaved slice, Ellipsis, or tensorstore.newaxis terms, then instead the dimensions of the broadcasted array domain are added as the first dimensions of the result domain:

>>> a = ts.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]], dtype=ts.int32)
>>> a[:, [1, 0], ts.newaxis, [1, 1]]
TensorStore({
  'array': [[4, 8], [2, 6]],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {
    'input_exclusive_max': [2, 2, [1]],
    'input_inclusive_min': [0, 0, [0]],
    'output': [{'input_dimension': 0}, {'input_dimension': 1}],
  },
})

To ensure that the added array domain dimensions are added as the first dimensions of the result domain regardless of whether there are any interleaved slice, Ellipsis, or tensorstore.newaxis terms, use the vindex indexing method.
To instead perform outer array indexing, where each index array is applied orthogonally, use the oindex indexing method.
Note
The legacy NumPy indexing behavior, whereby array domain dimensions are added either inline or as the first dimensions depending on whether the index arrays are applied to consecutive dimensions, is the default behavior for compatibility with NumPy but may be confusing. It is recommended to instead use either the vindex or oindex indexing method for less confusing behavior when using multiple index arrays.
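The legacy behavior described in this note originates in NumPy and can be reproduced there. The sketch below uses only NumPy, with the same data and index arrays as the TensorStore examples above; variable names are illustrative.

```python
import numpy as np

a = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])

# Index arrays on consecutive dimensions: the broadcast dimension is added
# inline, after the dimension produced by the leading `:`.
inline = a[:, [1, 0], [1, 1]]

# Index arrays separated by a new axis: the broadcast dimension comes first,
# followed by the `:` dimension and the new singleton dimension.
first = a[:, [1, 0], np.newaxis, [1, 1]]

print(inline.tolist())
print(first.shape, first[..., 0].tolist())
```

The two results contain the same elements but with the broadcast dimension in a different position, which is why the position of an otherwise harmless new-axis term changes the layout of the result.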
Boolean array indexing¶
Specifying an array_like of bool values is equivalent to specifying a sequence of integer index arrays containing the coordinates of True values (in C order), e.g. as obtained from numpy.nonzero.
Specifying a 1-d bool array is equivalent to a single index array of the non-zero coordinates:

>>> a = ts.array([0, 1, 2, 3, 4], dtype=ts.int32)
>>> a[[True, False, True, True]]
TensorStore({
  'array': [0, 2, 3],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {'input_exclusive_max': [3], 'input_inclusive_min': [0]},
})
>>> # equivalent, using index array
>>> a[[0, 2, 3]]
TensorStore({
  'array': [0, 2, 3],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {'input_exclusive_max': [3], 'input_inclusive_min': [0]},
})

More generally, specifying an n-dimensional bool array is equivalent to specifying n index arrays, where the ith index array specifies the ith coordinate of the True values:

>>> a = ts.array([[0, 1, 2], [3, 4, 5]], dtype=ts.int32)
>>> a[[[True, False, False], [True, True, False]]]
TensorStore({
  'array': [0, 3, 4],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {'input_exclusive_max': [3], 'input_inclusive_min': [0]},
})
>>> # equivalent, using index arrays
>>> a[[0, 1, 1], [0, 0, 1]]
TensorStore({
  'array': [0, 3, 4],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {'input_exclusive_max': [3], 'input_inclusive_min': [0]},
})

This indexing term consumes n dimensions from the original domain, where n is the rank of the bool array.
It is perfectly valid to mix boolean array indexing with other forms of indexing, including integer array indexing, with exactly the same result as if the boolean array were replaced by the equivalent sequence of integer index arrays:

>>> a = ts.array([[0, 1, 2], [3, 4, 5], [7, 8, 9]], dtype=ts.int32)
>>> a[[True, False, True], [2, 1]]
TensorStore({
  'array': [2, 8],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {'input_exclusive_max': [2], 'input_inclusive_min': [0]},
})
>>> # equivalent, using index array
>>> a[[0, 2], [2, 1]]
TensorStore({
  'array': [2, 8],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {'input_exclusive_max': [2], 'input_inclusive_min': [0]},
})

Warning
Mixing boolean and integer index arrays in the default vectorized indexing mode, while supported for compatibility with NumPy, is likely to be confusing. In most cases of mixed boolean and integer array indexing, outer indexing mode provides more useful behavior.
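The equivalence between a boolean array and the index arrays returned by numpy.nonzero, on which this section relies, can be checked in NumPy itself. The sketch below uses only NumPy and mirrors the 2-d example above; variable names are illustrative.

```python
import numpy as np

a = np.array([[0, 1, 2], [3, 4, 5]])
mask = np.array([[True, False, False], [True, True, False]])

# One integer index array per mask dimension, listing the coordinates of the
# True values in C order.
rows, cols = np.nonzero(mask)

by_mask = a[mask]
by_index_arrays = a[rows, cols]
print(rows.tolist(), cols.tolist(), by_mask.tolist())
```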
The scalar values True and False are treated as zero-rank boolean arrays. Zero-rank boolean arrays are supported, but there is no equivalent integer index array representation. If there are no other integer or boolean arrays, specifying a zero-rank boolean array is equivalent to specifying tensorstore.newaxis, except that the added dimension has explicit rather than implicit bounds, and in the case of a False array the added dimension has the empty bounds of \([0, 0)\):

>>> a = ts.IndexTransform(input_rank=2)
>>> a[:, True]
Rank 3 -> 2 index space transform:
  Input domain:
    0: (-inf*, +inf*)
    1: [0, 1)
    2: (-inf*, +inf*)
  Output index maps:
    out[0] = 0 + 1 * in[0]
    out[1] = 0 + 1 * in[2]
>>> a[:, False]
Rank 3 -> 2 index space transform:
  Input domain:
    0: (-inf*, +inf*)
    1: [0, 0)
    2: (-inf*, +inf*)
  Output index maps:
    out[0] = 0 + 1 * in[0]
    out[1] = 0 + 1 * in[2]

If there are other integer or boolean arrays, specifying a zero-rank boolean array has no effect except that:
the other index array shapes must be broadcast-compatible with the shape [0] in the case of a False zero-rank array, meaning they are all empty arrays (in the case of a True zero-rank array, the other index array shapes must be broadcast-compatible with the shape [1], which is always satisfied);
in legacy NumPy indexing mode, if it is separated from another integer or boolean array term by a slice, Ellipsis, or tensorstore.newaxis, it causes the dimensions of the broadcast array domain to be added as the first dimensions of the result domain:

>>> a = ts.IndexTransform(input_rank=2)
>>> # Index array dimension added to result domain inline
>>> a[:, True, [0, 1]]
Rank 2 -> 2 index space transform:
  Input domain:
    0: (-inf*, +inf*)
    1: [0, 2)
  Output index maps:
    out[0] = 0 + 1 * in[0]
    out[1] = 0 + 1 * bounded((-inf, +inf), array(in)), where array = {{0, 1}}
>>> a[:, False, []]
Rank 2 -> 2 index space transform:
  Input domain:
    0: (-inf*, +inf*)
    1: [0, 0)
  Output index maps:
    out[0] = 0 + 1 * in[0]
    out[1] = 0
>>> # Index array dimensions added as first dimension of result domain
>>> a[True, :, [0, 1]]
Rank 2 -> 2 index space transform:
  Input domain:
    0: [0, 2)
    1: (-inf*, +inf*)
  Output index maps:
    out[0] = 0 + 1 * in[1]
    out[1] = 0 + 1 * bounded((-inf, +inf), array(in)), where array = {{0}, {1}}
>>> a[False, :, []]
Rank 2 -> 2 index space transform:
  Input domain:
    0: [0, 0)
    1: (-inf*, +inf*)
  Output index maps:
    out[0] = 0 + 1 * in[1]
    out[1] = 0

Note
Zero-rank boolean arrays are supported for consistency and for compatibility with NumPy, but are rarely useful.
Differences compared to NumPy indexing¶
TensorStore indexing has near-perfect compatibility with NumPy, but there are a few differences to be aware of:
Negative indices have no special meaning in TensorStore, and simply refer to negative positions. TensorStore does not support an equivalent shortcut syntax to specify a position n relative to the upper bound of a dimension; instead, it must be specified explicitly, e.g. x[x.domain[0].exclusive_max - n].
In TensorStore, out-of-bounds intervals specified by a slice result in an error. In NumPy, out-of-bounds indices specified by a slice are silently truncated.
In TensorStore, indexing a dimension with a slice (with step of 1 or None) restricts the domain of that dimension but does not translate its origin such that the new lower bound is 0. In contrast, NumPy does not support non-zero origins and therefore slice operations always result in the lower bound being translated to 0 in NumPy.

>>> x = ts.array(np.arange(10, dtype=np.int64))
>>> y = x[2:]
>>> y[:4]  # still excludes the first two elements
TensorStore({
  'array': [2, 3],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int64',
  'transform': {
    'input_exclusive_max': [4],
    'input_inclusive_min': [2],
    'output': [{'input_dimension': 0, 'offset': -2}],
  },
})

To obtain the behavior of NumPy, the dimensions can be explicitly translated to have an origin of 0:

>>> z = y[ts.d[:].translate_to[0]]
>>> z[:4]  # relative to the new origin
TensorStore({
  'array': [2, 3, 4, 5],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int64',
  'transform': {'input_exclusive_max': [4], 'input_inclusive_min': [0]},
})

To specify a sequence of indexing terms when using the syntax obj[expr] in TensorStore, expr must be a tuple. In NumPy, for compatibility with its predecessor library Numeric, if expr is a list or other non-numpy.ndarray sequence type containing at least one slice, Ellipsis, or None value, it is interpreted the same as a tuple (this behavior is deprecated in NumPy since version 1.15.0). TensorStore, in contrast, will attempt to convert any non-tuple sequence to an integer or boolean array, which results in an error if the sequence contains a slice, Ellipsis, or None value.
Vectorized indexing mode (vindex)¶
The expression obj.vindex[expr], where obj is any tensorstore.Indexable object and expr is a valid NumPy-style indexing expression, has a similar effect to obj[expr] except that if expr specifies any array indexing terms, the broadcasted array dimensions are unconditionally added as the first dimensions of the result domain:

>>> a = ts.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]], dtype=ts.int32)
>>> a.vindex[:, [1, 0], [1, 1]]
TensorStore({
  'array': [[4, 8], [2, 6]],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {'input_exclusive_max': [2, 2], 'input_inclusive_min': [0, 0]},
})

This avoids the potentially-confusing behavior of the default legacy NumPy semantics, under which the broadcasted array dimensions are added inline to the result domain if none of the array indexing terms are separated by a slice, Ellipsis, or tensorstore.newaxis term.
Note
If expr does not include any array indexing terms, obj.vindex[expr] is exactly equivalent to obj[expr].
This indexing method is similar to the behavior of:
the proposed vindex in NumPy Enhancement Proposal 21.
Outer indexing mode (oindex)¶
The expression obj.oindex[expr], where obj is any tensorstore.Indexable object and expr is a valid NumPy-style indexing expression, performs outer/orthogonal indexing. The effect is similar to obj[expr], but differs in that any integer or boolean array indexing terms are applied orthogonally:

>>> a = ts.array([[0, 1, 2], [3, 4, 5]], dtype=ts.int32)
>>> a.oindex[[0, 0, 1], [1, 2]]
TensorStore({
  'array': [[1, 2], [1, 2], [4, 5]],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {'input_exclusive_max': [3, 2], 'input_inclusive_min': [0, 0]},
})
>>> # equivalent, using boolean array
>>> a.oindex[[0, 0, 1], [False, True, True]]
TensorStore({
  'array': [[1, 2], [1, 2], [4, 5]],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {'input_exclusive_max': [3, 2], 'input_inclusive_min': [0, 0]},
})

Unlike in the default or the vindex indexing modes, the index array shapes need not be broadcast-compatible; instead, the dimensions of each index array (or the 1-d index array equivalent of a boolean array) are added to the result domain immediately after any dimensions added by the previous indexing terms:

>>> a = ts.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]], dtype=ts.int32)
>>> a.oindex[[1, 0], :, [0, 0, 1]]
TensorStore({
  'array': [[[5, 5, 6], [7, 7, 8]], [[1, 1, 2], [3, 3, 4]]],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {
    'input_exclusive_max': [2, 2, 3],
    'input_inclusive_min': [0, 0, 0],
  },
})

Each boolean array indexing term adds a single dimension to the result domain:

>>> a = ts.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]], dtype=ts.int32)
>>> a.oindex[[[True, False], [False, True]], [1, 0]]
TensorStore({
  'array': [[2, 1], [8, 7]],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {'input_exclusive_max': [2, 2], 'input_inclusive_min': [0, 0]},
})

Note
If expr does not include any array indexing terms, obj.oindex[expr] is exactly equivalent to obj[expr].
This indexing method is similar to the behavior of:
the proposed oindex in NumPy Enhancement Proposal 21.
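For readers coming from NumPy, outer indexing with 1-d index arrays roughly corresponds to what numpy.ix_ provides today. The sketch below uses only NumPy, with the same data and index arrays as the first oindex example above; it illustrates the semantics and makes no claim about how TensorStore implements them.

```python
import numpy as np

a = np.array([[0, 1, 2], [3, 4, 5]])

# np.ix_ reshapes the 1-d index arrays so that they broadcast to an outer
# product: every row index is combined with every column index.
outer = a[np.ix_([0, 0, 1], [1, 2])]

# Default (vectorized) indexing instead pairs up the index arrays element by
# element, so both arrays must be broadcast-compatible.
paired = a[[0, 0, 1], [1, 2, 2]]

print(outer.tolist(), paired.tolist())
```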
Dimension expressions¶
Dimension expressions provide an alternative indexing mechanism to NumPy-style indexing that is more powerful and expressive and supports dimension labels (but can be more verbose):
The usual syntax for applying a dimension expression is: obj[ts.d[sel] op1 ... opN], where obj is any tensorstore.Indexable object, sel specifies the initial dimension selection and op1 ... opN specifies a chain of one or more operations supported by tensorstore.DimExpression (the ... in op1 ... opN is not a literal Python Ellipsis (...), but simply denotes a sequence of operation invocations).
The tensorstore.DimExpression object itself, constructed using the syntax ts.d[sel] op1 ... opN, is simply a lightweight, immutable representation of the sequence of operations and their arguments, and performs only minimal validation upon construction; full validation is deferred until it is actually applied to a tensorstore.Indexable object, using the syntax obj[ts.d[sel] op1 ... opN].

>>> a = ts.array([[[0, 1], [2, 3], [4, 5]], [[6, 7], [8, 9], [10, 11]]],
...              dtype=ts.int32)
>>> # Label the dimensions "x", "y", "z"
>>> a = a[ts.d[:].label["x", "y", "z"]]
>>> a
TensorStore({
  'array': [[[0, 1], [2, 3], [4, 5]], [[6, 7], [8, 9], [10, 11]]],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {
    'input_exclusive_max': [2, 3, 2],
    'input_inclusive_min': [0, 0, 0],
    'input_labels': ['x', 'y', 'z'],
  },
})
>>> # Select the y=1, x=0 slice
>>> a[ts.d["y", "x"][1, 0]]
TensorStore({
  'array': [2, 3],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {
    'input_exclusive_max': [2],
    'input_inclusive_min': [0],
    'input_labels': ['z'],
  },
})

Operations¶
Dimension expressions provide the following advanced operations:
label¶
Sets (or changes) the labels of the selected dimensions.

>>> a = ts.array([[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]],
...              dtype=ts.int32)
>>> a = a[ts.d[:].label["x", "y"]]
>>> a
TensorStore({
  'array': [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {
    'input_exclusive_max': [3, 4],
    'input_inclusive_min': [0, 0],
    'input_labels': ['x', 'y'],
  },
})
>>> # Select the x=1 slice
>>> a[ts.d["x"][1]]
TensorStore({
  'array': [4, 5, 6, 7],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {
    'input_exclusive_max': [4],
    'input_inclusive_min': [0],
    'input_labels': ['y'],
  },
})

This operation can also be applied directly to tensorstore.Indexable types, in which case it applies to all dimensions:

>>> ts.array([[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]],
...          dtype=ts.int32).label['x', 'y']
TensorStore({
  'array': [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {
    'input_exclusive_max': [3, 4],
    'input_inclusive_min': [0, 0],
    'input_labels': ['x', 'y'],
  },
})

diagonal¶
Extracts the diagonal of the selected dimensions.

>>> a = ts.array([[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]],
...              dtype=ts.int32)
>>> a[ts.d[:].diagonal]
TensorStore({
  'array': [0, 5, 10],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {'input_exclusive_max': [3], 'input_inclusive_min': [0]},
})

translate_to¶
Translates the domains of the selected input dimensions to the specified origins without affecting the output range.

>>> a = ts.array([[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]],
...              dtype=ts.int32)
>>> a.origin
(0, 0)
>>> a[ts.d[:].translate_to[1]].origin
(1, 1)
>>> a[ts.d[:].translate_to[1, 2]].origin
(1, 2)

This operation can also be applied directly to tensorstore.Indexable types, in which case it applies to all dimensions:

>>> ts.array([[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]],
...          dtype=ts.int32).translate_to[1]
TensorStore({
  'array': [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {
    'input_exclusive_max': [4, 5],
    'input_inclusive_min': [1, 1],
    'output': [
      {'input_dimension': 0, 'offset': -1},
      {'input_dimension': 1, 'offset': -1},
    ],
  },
})

translate_by¶
Translates (shifts) the domains of the selected input dimensions by the specified offsets, without affecting the output range.

>>> a = ts.array([[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]],
...              dtype=ts.int32)
>>> a[ts.d[:].translate_by[-1, 1]].origin
(-1, 1)

This operation can also be applied directly to tensorstore.Indexable types, in which case it applies to all dimensions:

>>> ts.array([[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]],
...          dtype=ts.int32).translate_by[-1, 1]
TensorStore({
  'array': [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {
    'input_exclusive_max': [2, 5],
    'input_inclusive_min': [-1, 1],
    'output': [
      {'input_dimension': 0, 'offset': 1},
      {'input_dimension': 1, 'offset': -1},
    ],
  },
})

translate_backward_by¶
Translates (shifts) the domains of the selected input dimensions backward by the specified offsets, without affecting the output range.

>>> a = ts.array([[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]],
...              dtype=ts.int32)
>>> a[ts.d[:].translate_backward_by[-1, 1]].origin
(1, -1)

This operation can also be applied directly to tensorstore.Indexable types, in which case it applies to all dimensions:

>>> ts.array([[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]],
...          dtype=ts.int32).translate_backward_by[-1, 1]
TensorStore({
  'array': [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {
    'input_exclusive_max': [4, 3],
    'input_inclusive_min': [1, -1],
    'output': [
      {'input_dimension': 0, 'offset': -1},
      {'input_dimension': 1, 'offset': 1},
    ],
  },
})

stride¶
Strides the domains of the selected input dimensions by the specified amounts.

>>> a = ts.array([[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]],
...              dtype=ts.int32)
>>> a[ts.d[1].stride[2]]
TensorStore({
  'array': [[0, 2], [4, 6], [8, 10]],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {'input_exclusive_max': [3, 2], 'input_inclusive_min': [0, 0]},
})

transpose¶
Transposes the selected dimensions to the specified target indices.

>>> a = ts.array([[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]],
...              dtype=ts.int32)
>>> a = a[ts.d[:].label["x", "y"]]
>>> a[ts.d[1].transpose[0]]
TensorStore({
  'array': [[0, 4, 8], [1, 5, 9], [2, 6, 10], [3, 7, 11]],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {
    'input_exclusive_max': [4, 3],
    'input_inclusive_min': [0, 0],
    'input_labels': ['y', 'x'],
  },
})
>>> a[ts.d[:].transpose[::-1]]
TensorStore({
  'array': [[0, 4, 8], [1, 5, 9], [2, 6, 10], [3, 7, 11]],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {
    'input_exclusive_max': [4, 3],
    'input_inclusive_min': [0, 0],
    'input_labels': ['y', 'x'],
  },
})

mark_bounds_implicit¶
Changes the lower and/or upper bounds of the selected dimensions to beimplicit or explicit.
>>>s=awaitts.open({...'driver':'zarr',...'kvstore':'memory://'...},...shape=[100,200],...dtype=ts.uint32,...create=True)>>>s.domain{ [0, 100*), [0, 200*) }>>>awaits.resize(exclusive_max=[200,300])>>>(awaits.resolve()).domain{ [0, 200*), [0, 300*) }>>>(awaits[ts.d[0].mark_bounds_implicit[False]].resolve()).domain{ [0, 100), [0, 300*) }>>>s_subregion=s[20:30,40:50]>>>s_subregion.domain{ [20, 30), [40, 50) }>>>(await...s_subregion[ts.d[0].mark_bounds_implicit[:True]].resolve()).domain{ [20, 200*), [40, 50) }>>>t=ts.IndexTransform(input_rank=3)>>>t=t[ts.d[0,2].mark_bounds_implicit[False]]>>>tRank 3 -> 3 index space transform: Input domain: 0: (-inf, +inf) 1: (-inf*, +inf*) 2: (-inf, +inf) Output index maps: out[0] = 0 + 1 * in[0] out[1] = 0 + 1 * in[1] out[2] = 0 + 1 * in[2]>>>t=t[ts.d[0,1].mark_bounds_implicit[:True]]>>>tRank 3 -> 3 index space transform: Input domain: 0: (-inf, +inf*) 1: (-inf*, +inf*) 2: (-inf, +inf) Output index maps: out[0] = 0 + 1 * in[0] out[1] = 0 + 1 * in[1] out[2] = 0 + 1 * in[2]>>>t=t[ts.d[1,2].mark_bounds_implicit[True:False]]>>>tRank 3 -> 3 index space transform: Input domain: 0: (-inf, +inf*) 1: (-inf*, +inf) 2: (-inf*, +inf) Output index maps: out[0] = 0 + 1 * in[0] out[1] = 0 + 1 * in[1] out[2] = 0 + 1 * in[2]This operation can also be applied directly totensorstore.Indexable types, inwhich case it applies to all dimensions:
>>> s = await ts.open({
...     'driver': 'zarr',
...     'kvstore': 'memory://'
... },
...                   shape=[100, 200],
...                   dtype=ts.uint32,
...                   create=True)
>>> s.domain
{ [0, 100*), [0, 200*) }
>>> s.mark_bounds_implicit[False].domain
{ [0, 100), [0, 200) }

oindex¶
Applies a NumPy-style indexing operation with outer indexing semantics.
>>> a = ts.array([[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]],
...              dtype=ts.int32)
>>> a[ts.d[:].oindex[(2, 2), (0, 1, 3)]]
TensorStore({
  'array': [[8, 9, 11], [8, 9, 11]],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {'input_exclusive_max': [2, 3], 'input_inclusive_min': [0, 0]},
})

vindex¶
Applies a NumPy-style indexing operation with vectorized indexing semantics.
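To make the difference between outer and vectorized semantics concrete, the following pure-Python sketch models both modes on a nested list with two one-dimensional index arrays. The helper names `outer_index` and `vectorized_index` are hypothetical and not part of TensorStore, and the vectorized model assumes equal-length index arrays rather than full broadcasting.

```python
def outer_index(a, rows, cols):
    """Outer semantics: every row index is combined with every column index."""
    return [[a[r][c] for c in cols] for r in rows]


def vectorized_index(a, rows, cols):
    """Vectorized semantics: the index arrays are paired position by position."""
    if len(rows) != len(cols):
        raise ValueError("this simplified model requires equal-length index arrays")
    return [a[r][c] for r, c in zip(rows, cols)]


a = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]

outer = outer_index(a, [2, 2], [0, 1, 3])
vectorized = vectorized_index(a, [1, 0, 2], [0, 1, 3])
print(outer)       # [[8, 9, 11], [8, 9, 11]]
print(vectorized)  # [4, 1, 11]
```

Outer indexing yields one result dimension per index array (here 2 x 3), whereas vectorized indexing yields a single dimension shared by all index arrays.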
>>> a = ts.array([[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]],
...              dtype=ts.int32)
>>> a[ts.d[:].vindex[(1, 0, 2), (0, 1, 3)]]
TensorStore({
  'array': [4, 1, 11],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {'input_exclusive_max': [3], 'input_inclusive_min': [0]},
})

Composed examples¶
Composing dimension expressions enables constructing more complex indexing operations than are easily done with native syntax.
>>> a = ts.array([[[0, 1], [2, 3], [4, 5]], [[6, 7], [8, 9], [10, 11]]],
...              dtype=ts.int32)[ts.d[:].label["x", "y", "z"]]
>>> # Transpose "x" and "z"
>>> a[ts.d["x", "z"].transpose[2, 0]]
TensorStore({
  'array': [[[0, 6], [2, 8], [4, 10]], [[1, 7], [3, 9], [5, 11]]],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {
    'input_exclusive_max': [2, 3, 2],
    'input_inclusive_min': [0, 0, 0],
    'input_labels': ['z', 'y', 'x'],
  },
})
>>> # Select the x=d, y=d diagonal, and transpose "d" to end
>>> a[ts.d["x", "y"].diagonal.label["d"].transpose[-1]]
TensorStore({
  'array': [[0, 8], [1, 9]],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {
    'input_exclusive_max': [2, 2],
    'input_inclusive_min': [0, 0],
    'input_labels': ['z', 'd'],
  },
})
>>> # Slice z=0, apply outer indexing to "x" and "y", label as "a", "b"
>>> a[ts.d["z", "x", "y"].oindex[0, [0, 1], [2, 1]].label["a", "b"]]
TensorStore({
  'array': [[4, 2], [10, 8]],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {
    'input_exclusive_max': [2, 2],
    'input_inclusive_min': [0, 0],
    'input_labels': ['a', 'b'],
  },
})

Dimension selections¶
A dimension selection is specified using the syntax ts.d[sel], where sel is one of:
an integer, specifying an existing or new dimension by index (as with built-in sequence types, negative numbers specify a dimension index relative to the end);
a non-empty str, specifying an existing dimension by label;
a slice object, start:stop:step, where start, stop, and step are either integers or None, specifying a range of existing or new dimensions by index (as for built-in sequence types, negative numbers specify a dimension index relative to the end);
any sequence (including a tuple, list, or another tensorstore.DimSelection object) of any of the above.
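Because sequences may nest, a selection is effectively a flat list of integer, string, and slice terms. The following is a minimal pure-Python sketch of that flattening; `flatten_selection` is a hypothetical helper used only for illustration and does not reflect how TensorStore implements it.

```python
def flatten_selection(sel):
    """Flatten a nested selection into a list of int, str, and slice terms."""
    if isinstance(sel, str):
        if not sel:
            raise ValueError("a label must be a non-empty string")
        return [sel]
    if isinstance(sel, (int, slice)):
        return [sel]
    flat = []
    for item in sel:  # any other sequence is flattened recursively
        flat.extend(flatten_selection(item))
    return flat


nested = flatten_selection([[0, 1], [2]])
mixed = flatten_selection((slice(0, 1), 2, "x"))
print(nested)  # [0, 1, 2]
print(mixed)   # [slice(0, 1, None), 2, 'x']
```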
The result is a tensorstore.DimSelection object, which is simply a lightweight, immutable container representing the flattened sequence of int, str, or slice objects:
>>> ts.d[0, 1, 2]
d[0,1,2]
>>> ts.d[0:1, 2, "x"]
d[0:1,2,'x']
>>> ts.d[[0, 1], [2]]
d[0,1,2]
>>> ts.d[[0, 1], ts.d[2, 3]]
d[0,1,2,3]

A str label always identifies an existing dimension, and is only compatible with operations/terms that expect an existing dimension:
>>> a = ts.IndexTransform(input_labels=['x'])
>>> a[ts.d["x"][2:3]]
Rank 1 -> 1 index space transform:
  Input domain:
    0: [2, 3) "x"
  Output index maps:
    out[0] = 0 + 1 * in[0]

An integer may identify either an existing or new dimension depending on whether it is used with a tensorstore.newaxis term:
>>> a = ts.IndexTransform(input_labels=['x', 'y'])
>>> # `1` refers to existing dimension "y"
>>> a[ts.d[1][2:3]]
Rank 2 -> 2 index space transform:
  Input domain:
    0: (-inf*, +inf*) "x"
    1: [2, 3) "y"
  Output index maps:
    out[0] = 0 + 1 * in[0]
    out[1] = 0 + 1 * in[1]
>>> # `1` refers to new singleton dimension
>>> a[ts.d[1][ts.newaxis]]
Rank 3 -> 2 index space transform:
  Input domain:
    0: (-inf*, +inf*) "x"
    1: [0*, 1*)
    2: (-inf*, +inf*) "y"
  Output index maps:
    out[0] = 0 + 1 * in[0]
    out[1] = 0 + 1 * in[2]

A negative dimension index -i is equivalent to n - i, where n is the sum of the rank of the original domain plus the number of tensorstore.newaxis terms:
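As a quick check of this arithmetic, the rule can be sketched in pure Python. The helper `resolve_dim_index` and its parameter names are hypothetical and only mirror the formula stated above.

```python
def resolve_dim_index(index, original_rank, num_newaxis=0):
    """Resolve a possibly negative dimension index.

    n is the rank of the original domain plus the number of newaxis terms;
    a negative index -i resolves to n - i.
    """
    n = original_rank + num_newaxis
    resolved = index + n if index < 0 else index
    if not 0 <= resolved < n:
        raise IndexError(f"dimension index {index} is out of range for rank {n}")
    return resolved


# Rank-2 domain, no newaxis terms: -1 refers to dimension 1.
existing = resolve_dim_index(-1, original_rank=2)
# Rank-2 domain plus one newaxis term: -1 refers to dimension 2.
added = resolve_dim_index(-1, original_rank=2, num_newaxis=1)
print(existing, added)  # 1 2
```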
>>> a = ts.IndexTransform(input_labels=['x', 'y'])
>>> # `-1` is equivalent to 1, refers to existing dimension "y"
>>> a[ts.d[-1][2:3]]
Rank 2 -> 2 index space transform:
  Input domain:
    0: (-inf*, +inf*) "x"
    1: [2, 3) "y"
  Output index maps:
    out[0] = 0 + 1 * in[0]
    out[1] = 0 + 1 * in[1]
>>> # `-1` is equivalent to 2, refers to new singleton dimension
>>> a[ts.d[-1][ts.newaxis]]
Rank 3 -> 2 index space transform:
  Input domain:
    0: (-inf*, +inf*) "x"
    1: (-inf*, +inf*) "y"
    2: [0*, 1*)
  Output index maps:
    out[0] = 0 + 1 * in[0]
    out[1] = 0 + 1 * in[1]

Likewise, a slice may identify either existing or new dimensions:
>>> a = ts.IndexTransform(input_labels=['x', 'y', 'z'])
>>> # `:2` refers to existing dimensions "x", "y"
>>> a[ts.d[:2][1:2, 3:4]]
Rank 3 -> 3 index space transform:
  Input domain:
    0: [1, 2) "x"
    1: [3, 4) "y"
    2: (-inf*, +inf*) "z"
  Output index maps:
    out[0] = 0 + 1 * in[0]
    out[1] = 0 + 1 * in[1]
    out[2] = 0 + 1 * in[2]
>>> # `:2` refers to two new singleton dimensions
>>> a[ts.d[:2][ts.newaxis, ts.newaxis]]
Rank 5 -> 3 index space transform:
  Input domain:
    0: [0*, 1*)
    1: [0*, 1*)
    2: (-inf*, +inf*) "x"
    3: (-inf*, +inf*) "y"
    4: (-inf*, +inf*) "z"
  Output index maps:
    out[0] = 0 + 1 * in[2]
    out[1] = 0 + 1 * in[3]
    out[2] = 0 + 1 * in[4]

If a tensorstore.newaxis term is mixed with a term that consumes an existing dimension, any dimension indices specified in the dimension selection (either directly or via slice objects) are with respect to an intermediate domain with any new singleton dimensions inserted but no existing dimensions consumed:
>>> a = ts.IndexTransform(input_labels=['x', 'y'])
>>> # `1` refers to new singleton dimension, `2` refers to "y"
>>> # intermediate domain is: {0: "x", 1: "", 2: "y"}
>>> a[ts.d[1, 2][ts.newaxis, 0]]
Rank 2 -> 2 index space transform:
  Input domain:
    0: (-inf*, +inf*) "x"
    1: [0*, 1*)
  Output index maps:
    out[0] = 0 + 1 * in[0]
    out[1] = 0

Dimension expression construction¶
A tensorstore.DimExpression that applies a given operation to an initial dimension selection dexpr = ts.d[sel] is constructed using:
subscript syntax dexpr[iexpr] (for NumPy-style indexing);
attribute syntax dexpr.diagonal for operations that take no arguments; or
attribute subscript syntax dexpr.label[arg].
The same syntax may also be used to chain additional operations onto an existing tensorstore.DimExpression:
>>> a = ts.IndexTransform(input_rank=0)
>>> a[ts.d[0][ts.newaxis][1:10].label['z']]
Rank 1 -> 0 index space transform:
  Input domain:
    0: [1, 10) "z"
  Output index maps:

When a tensorstore.DimExpression dexpr is applied to a tensorstore.Indexable object obj, using the syntax obj[dexpr], the following steps occur:
The initial dimension selection specified in dexpr is resolved based on the domain of obj and the first operation of dexpr.
The first operation specified in dexpr is applied to obj using the resolved initial dimension selection. This results in a new tensorstore.Indexable object of the same type as obj and a new dimension selection consisting of the dimensions retained from the prior dimension selection or added by the operation.
Each subsequent operation is applied, in order, to the new tensorstore.Indexable object and new dimension selection produced by the prior operation.
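The steps above can be sketched as a small pure-Python model. This is a toy under stated assumptions, not TensorStore's implementation: a domain is represented as a list of dicts, the helpers `resolve_selection` and `apply_expression` are hypothetical, and the two modeled operations (stand-ins for `label` and `translate_by`) both leave the dimension selection unchanged. The model also ignores the way the first operation influences resolution when tensorstore.newaxis terms are present.

```python
def resolve_selection(domain, sel):
    """Step 1: resolve labels and negative indices to dimension indices."""
    labels = [dim["label"] for dim in domain]
    resolved = []
    for term in sel:
        if isinstance(term, str):
            resolved.append(labels.index(term))
        else:
            index = term + len(domain) if term < 0 else term
            if not 0 <= index < len(domain):
                raise IndexError(f"dimension index {term} is out of range")
            resolved.append(index)
    return resolved


def label(*names):
    """Toy operation: relabel the selected dimensions."""
    def op(domain, dims):
        new_domain = [dict(dim) for dim in domain]
        for dim, name in zip(dims, names):
            new_domain[dim]["label"] = name
        return new_domain, dims  # the selection is retained unchanged
    return op


def translate_by(*offsets):
    """Toy operation: shift the bounds of the selected dimensions."""
    def op(domain, dims):
        new_domain = [dict(dim) for dim in domain]
        for dim, offset in zip(dims, offsets):
            new_domain[dim]["min"] += offset
            new_domain[dim]["max"] += offset
        return new_domain, dims
    return op


def apply_expression(domain, sel, ops):
    """Steps 2 and 3: apply each operation in order, threading the selection."""
    dims = resolve_selection(domain, sel)
    for op in ops:
        domain, dims = op(domain, dims)
    return domain


domain = [
    {"label": "x", "min": 0, "max": 10},
    {"label": "y", "min": 0, "max": 20},
]
# Select "y" and dimension 0, shift them by 5 and -2, then relabel as "b", "a".
result = apply_expression(domain, ["y", 0], [translate_by(5, -2), label("b", "a")])
print(result)
```

Each operation receives the object and selection produced by the previous one, which is why a later operation can refer to dimensions by their position in the selection rather than by their original labels.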
NumPy-style dimension expression indexing¶
The syntax dexpr[iexpr], dexpr.vindex[iexpr], and dexpr.oindex[iexpr] chains a NumPy-style indexing operation to an existing tensorstore.DimSelection or tensorstore.DimExpression.
The behavior is similar to that of regular NumPy-style indexing applied directly to a tensorstore.Indexable object, with the following differences:
The terms of the indexing expression iexpr consume dimensions in order from the dimension selection rather than starting from the first dimension of the domain, and unless an Ellipsis (...) term is specified, iexpr must include a sufficient number of indexing terms to consume the entire dimension selection.
tensorstore.newaxis terms are only permitted in the first operation of a dimension expression, since in subsequent operations all dimensions of the dimension selection necessarily refer to existing dimensions. Additionally, the dimension selection must specify the index of the new dimension for each tensorstore.newaxis term.
If iexpr is a scalar indexing expression that consists of a:
  single integer,
  slice start:stop:step where start, stop, and step are integers or None, or
  tensorstore.newaxis term,
it may be used with a dimension selection of more than one dimension, in which case iexpr is implicitly duplicated to match the number of dimensions in the dimension selection:
>>> a = ts.IndexTransform(input_labels=["x", "y"])
>>> # add singleton dimension to beginning and end
>>> a[ts.d[0, -1][ts.newaxis]]
Rank 4 -> 2 index space transform:
  Input domain:
    0: [0*, 1*)
    1: (-inf*, +inf*) "x"
    2: (-inf*, +inf*) "y"
    3: [0*, 1*)
  Output index maps:
    out[0] = 0 + 1 * in[1]
    out[1] = 0 + 1 * in[2]
>>> # slice out square region
>>> a[ts.d[:][0:10]]
Rank 2 -> 2 index space transform:
  Input domain:
    0: [0, 10) "x"
    1: [0, 10) "y"
  Output index maps:
    out[0] = 0 + 1 * in[0]
    out[1] = 0 + 1 * in[1]
When using the default indexing mode, i.e. dexpr[iexpr], if more than one array indexing term is specified (even if they are consecutive), the array dimensions are always added as the first dimensions of the result domain (as if dexpr.vindex[iexpr] were specified).
When using outer indexing mode, i.e. dexpr.oindex[iexpr], zero-rank boolean arrays are not permitted.
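The way indexing terms are matched against the dimension selection, including the implicit duplication of a scalar expression, can be modeled roughly as follows. This is a sketch under assumptions: `normalize_terms` is a hypothetical helper, it handles only integer and slice terms (array and boolean-array terms are ignored), and it assumes that an Ellipsis stands for as many full slices as needed, as in NumPy.

```python
def normalize_terms(iexpr, num_selected_dims):
    """Expand `iexpr` into one indexing term per selected dimension."""
    if not isinstance(iexpr, tuple):
        # Scalar expression: duplicate it for every selected dimension.
        return (iexpr,) * num_selected_dims
    ellipsis_positions = [i for i, term in enumerate(iexpr) if term is Ellipsis]
    if len(ellipsis_positions) > 1:
        raise ValueError("at most one Ellipsis term is allowed")
    if ellipsis_positions:
        pos = ellipsis_positions[0]
        missing = num_selected_dims - (len(iexpr) - 1)
        if missing < 0:
            raise ValueError("too many indexing terms for the selection")
        # Assumption: Ellipsis expands to full slices for the remaining dims.
        return iexpr[:pos] + (slice(None),) * missing + iexpr[pos + 1:]
    if len(iexpr) != num_selected_dims:
        raise ValueError("indexing terms must consume the entire selection")
    return iexpr


# Models ts.d[:][0:10] on a rank-2 domain: the slice applies to both dimensions.
square = normalize_terms(slice(0, 10), 2)
# Models a selection of three dimensions indexed with (1, ...).
padded = normalize_terms((1, Ellipsis), 3)
print(square)
print(padded)
```

Without an Ellipsis, a tuple with fewer terms than selected dimensions is rejected, which reflects the requirement that the expression consume the entire dimension selection.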