MaskedTensor Sparsity#
Before working on this tutorial, please make sure to review our MaskedTensor Overview tutorial <https://pytorch.org/tutorials/prototype/maskedtensor_overview.html>.
Introduction#
Sparsity has been an area of rapid growth and importance within PyTorch; if any sparsity terms are confusing below, please refer to the sparsity tutorial for additional details.
Sparse storage formats have proven powerful in a variety of ways. As a primer, the first use case most practitioners think about is when the majority of elements are equal to zero (a high degree of sparsity), but even in cases of lower sparsity, certain formats (e.g. BSR) can take advantage of substructures within a matrix.
Note
At the moment, MaskedTensor supports COO and CSR tensors, with plans to support additional formats (such as BSR and CSC) in the future. If you have any requests for additional formats, please file a feature request here!
Principles#
When creating a MaskedTensor with sparse tensors, there are a few principles that must be observed:
- data and mask must have the same storage format, whether that's torch.strided, torch.sparse_coo, or torch.sparse_csr
- data and mask must have the same size, indicated by size()
Sparse COO tensors#
In accordance with Principle #1, a sparse COO MaskedTensor is created by passing in two sparse COO tensors, which can be initialized by any of its constructors, for example torch.sparse_coo_tensor().
As a recap of sparse COO tensors, the COO format stands for "coordinate format", where the specified elements are stored as tuples of their indices and the corresponding values. That is, the following are provided:
- indices: array of size (ndim, nse) and dtype torch.int64
- values: array of size (nse,) with any integer or floating point dtype
where ndim is the dimensionality of the tensor and nse is the number of specified elements.
For both sparse COO and CSR tensors, you can construct a MaskedTensor by doing either:
- masked_tensor(sparse_tensor_data, sparse_tensor_mask)
- dense_masked_tensor.to_sparse_coo() or dense_masked_tensor.to_sparse_csr()
The second method is easier to illustrate, so we've shown that below, but for more on the first and the nuances behind the approach, please read the Sparse COO Appendix.
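The original runnable cell isn't preserved in this extraction, so here is a minimal sketch of the second approach, assuming the illustrative values below: build a dense MaskedTensor (as in the overview tutorial) and convert it with MaskedTensor.to_sparse_coo().

```python
# Disable prototype warnings and such
import warnings
warnings.filterwarnings(action="ignore", category=UserWarning)

import torch
from torch.masked import masked_tensor

# A dense MaskedTensor with only two unmasked elements (illustrative values)
values = torch.tensor([[0., 0., 3.], [4., 0., 5.]])
mask = torch.tensor([[False, False, True], [False, False, True]])

mt = masked_tensor(values, mask)
sparse_coo_mt = mt.to_sparse_coo()

print(sparse_coo_mt)
print(sparse_coo_mt.get_data())  # the underlying data is now a sparse COO tensor
```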
Sparse CSR tensors#
Similarly, MaskedTensor also supports the CSR (Compressed Sparse Row) sparse tensor format. Instead of storing the tuples of the indices like sparse COO tensors, sparse CSR tensors aim to decrease the memory requirements by storing compressed row indices. In particular, a CSR sparse tensor consists of three 1-D tensors:
- crow_indices: array of compressed row indices with size (size[0] + 1,). This array indicates which row a given entry in values lives in. The last element is the number of specified elements, while crow_indices[i+1] - crow_indices[i] indicates the number of specified elements in row i.
- col_indices: array of size (nnz,). Indicates the column indices for each value.
- values: array of size (nnz,). Contains the values of the CSR tensor.
Of note, both sparse COO and CSR tensors are in a beta state.
By way of example:
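The runnable example isn't preserved here; the following sketch, reusing the same assumed values as above, converts a dense MaskedTensor to the CSR layout:

```python
import torch
from torch.masked import masked_tensor

values = torch.tensor([[0., 0., 3.], [4., 0., 5.]])
mask = torch.tensor([[False, False, True], [False, False, True]])

mt_csr = masked_tensor(values, mask).to_sparse_csr()
print(mt_csr)

# The underlying data tensor now stores crow_indices, col_indices, and values
print(mt_csr.get_data().crow_indices())
```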
Supported Operations#
Unary#
All unary operators are supported, e.g.:
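For instance (a sketch with assumed inputs; any unary operator would do), torch.sin() is applied only to the unmasked elements:

```python
import torch
from torch.masked import masked_tensor

data = torch.tensor([[0., 0., 3.], [4., 0., 5.]]).to_sparse()
mask = torch.tensor([[False, False, True], [False, False, True]]).to_sparse()

mt = masked_tensor(data, mask)
print(torch.sin(mt))  # computed elementwise over the unmasked values only
```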
Binary#
Binary operators are also supported, but the input masks from the two masked tensors must match. For more information on why this decision was made, please refer to our MaskedTensor: Advanced Semantics tutorial.
Please find an example below:
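Here is a sketch with assumed values: two sparse COO MaskedTensors that share the same mask can be combined freely:

```python
import torch
from torch.masked import masked_tensor

i = [[0, 1, 1],
     [2, 0, 2]]

data1 = torch.sparse_coo_tensor(i, [1., 2., 3.], (2, 3))
data2 = torch.sparse_coo_tensor(i, [4., 5., 6.], (2, 3))
mask = torch.sparse_coo_tensor(i, [True, False, True], (2, 3))

# Both operands share the same mask, so binary ops are well defined
mt1 = masked_tensor(data1, mask)
mt2 = masked_tensor(data2, mask)

print(mt1 + mt2)
print(torch.mul(mt1, mt2))
```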
Reductions#
Finally, reductions are supported:
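A sketch with assumed values, assuming sum and amax behave as in the dense case; masked-out entries are simply ignored by the reduction:

```python
import torch
from torch.masked import masked_tensor

i = [[0, 1, 1],
     [2, 0, 2]]
data = torch.sparse_coo_tensor(i, [1., 2., 3.], (2, 3))
mask = torch.sparse_coo_tensor(i, [True, False, True], (2, 3))

mt = masked_tensor(data, mask)
print(mt.sum())   # 4., since the masked-out 2. is ignored
print(mt.amax())  # 3.
```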
MaskedTensor Helper Methods#
For convenience, MaskedTensor has a number of methods to help convert between the different layouts and identify the current layout:
Setup:
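The original setup cell isn't preserved; a plausible stand-in creates a small dense MaskedTensor to convert below:

```python
import torch
from torch.masked import masked_tensor

v = torch.tensor([[3, 0, 0],
                  [0, 4, 5]])
m = torch.tensor([[True, False, False],
                  [False, True, True]])
mt = masked_tensor(v, m)
```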
- MaskedTensor.to_sparse_coo() / MaskedTensor.to_sparse_csr() / MaskedTensor.to_dense() to help convert between the different layouts.
- MaskedTensor.is_sparse() – this will check if the MaskedTensor's layout matches any of the supported sparse layouts (currently COO and CSR).
- MaskedTensor.is_sparse_coo()
- MaskedTensor.is_sparse_csr()
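Putting the helpers together with mt from the setup above (a sketch; the commented results follow from the layouts involved):

```python
mt_coo = mt.to_sparse_coo()
mt_csr = mt.to_sparse_csr()

print(mt.is_sparse())          # False -- the layout is still torch.strided
print(mt_coo.is_sparse_coo())  # True
print(mt_csr.is_sparse_csr())  # True
print(mt_coo.to_dense().is_sparse())  # False again after converting back
```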
Appendix#
Sparse COO Construction#
Recall in our original example, we created a MaskedTensor and then converted it to a sparse COO MaskedTensor with MaskedTensor.to_sparse_coo().
Alternatively, we can also construct a sparse COO MaskedTensor directly by passing in two sparse COO tensors:
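A sketch of that construction, using Tensor.to_sparse() on the assumed dense values from earlier to produce the two sparse COO inputs:

```python
import torch
from torch.masked import masked_tensor

values = torch.tensor([[0, 0, 3], [4, 0, 5]])
mask = torch.tensor([[False, False, True], [False, False, True]])

# Per Principle #1, data and mask share the sparse COO storage format
mt = masked_tensor(values.to_sparse(), mask.to_sparse())
print(mt)
```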
Instead of using torch.Tensor.to_sparse(), we can also create the sparse COO tensors directly, which brings us to a warning:
Warning
When using a function like MaskedTensor.to_sparse_coo() (analogous to Tensor.to_sparse()), if the user does not specify the indices, as in the example above, then the 0 values will be "unspecified" by default.
Below, we explicitly specify the 0’s:
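A sketch of both constructions side by side (the indices and values are assumptions consistent with the discussion that follows): mt leaves the 0's unspecified via Tensor.to_sparse(), while mt2 specifies every element explicitly through torch.sparse_coo_tensor():

```python
import torch
from torch.masked import masked_tensor

values = torch.tensor([[0, 0, 3], [4, 0, 5]])
mask = torch.tensor([[False, False, True], [False, False, True]])

# mt: the 0's are left unspecified
mt = masked_tensor(values.to_sparse(), mask.to_sparse())

# mt2: every element, including the 0's, is explicitly specified
i = [[0, 0, 0, 1, 1, 1],
     [0, 1, 2, 0, 1, 2]]
v = [0, 0, 3, 4, 0, 5]
m = [False, False, True, False, False, True]

mt2 = masked_tensor(
    torch.sparse_coo_tensor(i, v, (2, 3)),
    torch.sparse_coo_tensor(i, m, (2, 3)),
)

print(mt)
print(mt2)
```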
Note that mt and mt2 look identical on the surface, and in the vast majority of operations, will yield the same result. But this brings us to a detail of the implementation:
data and mask – only for sparse MaskedTensors – can have a different number of elements (nnz()) at creation, but the indices of mask must then be a subset of the indices of data. In this case, data will assume the shape of mask by data = data.sparse_mask(mask); in other words, any of the elements in data that are not True in mask (that is, not specified) will be thrown away.
Therefore, under the hood, the data looks slightly different; mt2 has the "4" value masked out and mt is completely without it. Their underlying data has different shapes, which would make operations like mt + mt2 invalid.
Sparse CSR Construction#
We can also construct a sparse CSR MaskedTensor using sparse CSR tensors, and like the example above, this results in a similar treatment under the hood.
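A sketch with assumed CSR components; both data and mask are built with torch.sparse_csr_tensor(), and, as with COO, masked-out entries such as the 2. and 3. below remain in the underlying data:

```python
import torch
from torch.masked import masked_tensor

crow_indices = torch.tensor([0, 2, 4])
col_indices = torch.tensor([0, 1, 0, 1])
values = torch.tensor([1., 2., 3., 4.])
mask_values = torch.tensor([True, False, False, True])

csr_data = torch.sparse_csr_tensor(crow_indices, col_indices, values, size=(2, 2))
csr_mask = torch.sparse_csr_tensor(crow_indices, col_indices, mask_values, size=(2, 2))

mt = masked_tensor(csr_data, csr_mask)
print(mt)
```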
Conclusion#
In this tutorial, we have introduced how to use MaskedTensor with sparse COO and CSR formats and discussed some of the subtleties under the hood in case users decide to access the underlying data structures directly. Sparse storage formats and masked semantics indeed have strong synergies, so much so that they are sometimes used as proxies for each other (as we will see in the next tutorial). In the future, we certainly plan to continue investing in and developing this direction.
Further Reading#
To continue learning more, you can find our Efficiently writing "sparse" semantics for Adagrad with MaskedTensor tutorial to see an example of how MaskedTensor can simplify existing workflows with native masking semantics.