If most of the elements of a matrix are zero, it is called a sparse matrix. The two major benefits of using a sparse matrix representation instead of a plain dense matrix are:
- Storage: There are fewer non-zero elements than zeros, so less memory is needed when only the non-zero elements are stored.
- Computing time: Computing time can be saved by designing a data structure that traverses only the non-zero elements.
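The storage benefit can be checked directly. The following is a minimal sketch, assuming a hypothetical 1000 x 1000 matrix with ones on the main diagonal only; it compares the bytes used by the dense array with the bytes used by the three arrays a CSR matrix keeps internally. The exact sparse byte count depends on the index dtype SciPy picks, so no figure is quoted here.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Hypothetical example data: 1000 x 1000 matrix, non-zero only on the diagonal.
dense = np.zeros((1000, 1000), dtype=np.float64)
np.fill_diagonal(dense, 1.0)

sparse = csr_matrix(dense)

# A CSR matrix keeps three arrays: values, column indices and row pointers.
dense_bytes = dense.nbytes
sparse_bytes = sparse.data.nbytes + sparse.indices.nbytes + sparse.indptr.nbytes

print("non-zero elements:", sparse.nnz)
print("dense bytes:", dense_bytes)
print("sparse bytes:", sparse_bytes)
```

The saving shrinks as the share of non-zero elements grows, so a sparse format pays off only when the matrix really is mostly zeros.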
Sparse matrices are widely used in applied machine learning, for example in data encodings that map categories to counts, and across entire subfields such as natural language processing (NLP).
Example:
0 0 3 0 4
0 0 5 7 0
0 0 0 0 0
0 2 6 0 0
Representing a sparse matrix as a 2D array wastes a lot of memory, because in most cases the zeros carry no useful information. So, instead of storing zeros alongside the non-zero elements, we store only the non-zero elements. Each one is stored as a triple: (row, column, value).
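As an illustration of the triple representation, the sketch below extracts the (row, column, value) triples from the example matrix using plain NumPy. The variable names are chosen here for illustration only.

```python
import numpy as np

# The example matrix from above, stored as a dense 2D array.
dense = np.array([
    [0, 0, 3, 0, 4],
    [0, 0, 5, 7, 0],
    [0, 0, 0, 0, 0],
    [0, 2, 6, 0, 0],
])

# np.nonzero returns the row and column indices of every non-zero element.
rows, cols = np.nonzero(dense)
values = dense[rows, cols]

# Pair them up as (row, column, value) triples.
triples = [(int(r), int(c), int(v)) for r, c, v in zip(rows, cols, values)]
for triple in triples:
    print(triple)
```

Only six triples are needed to describe a matrix of twenty elements.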
Create a Sparse Matrix with SciPy
The scipy.sparse module in Python provides efficient ways to store and work with large sparse matrices using various formats, each optimized for specific operations. Let's explore the different formats this module offers:
| Format | Best For | Description |
|---|---|---|
| csr_matrix | Fast row slicing, math operations | Compressed Sparse Row, good for arithmetic and row access. |
| csc_matrix | Fast column slicing | Compressed Sparse Column, efficient for column-based operations. |
| coo_matrix | Easy matrix building | Coordinate format using (row, col, value) triples. |
| lil_matrix | Incremental row-wise construction | List of Lists, modify rows easily before converting. |
| dia_matrix | Matrices whose non-zeros lie on a few diagonals | Stores only the diagonals, saves space. |
| dok_matrix | Fast item assignment | Dictionary of Keys, ideal for random updates. |
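Because each format suits a different task, a common pattern is to build a matrix in one format and convert it before doing arithmetic. The sketch below assumes a small hypothetical matrix built with lil_matrix and uses the tocsr() and tocsc() conversion methods.

```python
from scipy.sparse import lil_matrix

# Build incrementally in LIL format (hypothetical values for illustration).
builder = lil_matrix((4, 5))
builder[0, 2] = 3
builder[3, 1] = 2

# Convert once construction is finished.
csr = builder.tocsr()  # row-oriented, suited to arithmetic and row slicing
csc = builder.tocsc()  # column-oriented, suited to column slicing

doubled = csr * 2      # scalar multiplication keeps the result sparse

print(csr.format, csc.format)
print(doubled.toarray())
```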
Examples
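Example 1: csr_matrix (Compressed Sparse Row)
```python
import numpy as np
from scipy.sparse import csr_matrix

d = np.array([3, 4, 5, 7, 2, 6])  # data: the non-zero values
r = np.array([0, 0, 1, 1, 3, 3])  # rows: row index of each value
c = np.array([2, 4, 2, 3, 1, 2])  # cols: column index of each value

csr = csr_matrix((d, (r, c)), shape=(4, 5))
print(csr.toarray())
```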
Output
[[0 0 3 0 4]
[0 0 5 7 0]
[0 0 0 0 0]
[0 2 6 0 0]]
Explanation: Creates a Compressed Sparse Row (CSR) matrix from the non-zero values d and their row (r) and column (c) indices. csr_matrix stores only the non-zero elements, and toarray() converts it back to a full 2D NumPy array.
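To see what "compressed" means here, the sketch below rebuilds the same matrix and prints the three arrays a CSR matrix holds: data, indices and indptr. The loop at the end is only an illustration of how these arrays can be read row by row.

```python
import numpy as np
from scipy.sparse import csr_matrix

d = np.array([3, 4, 5, 7, 2, 6])
r = np.array([0, 0, 1, 1, 3, 3])
c = np.array([2, 4, 2, 3, 1, 2])
csr = csr_matrix((d, (r, c)), shape=(4, 5))

print(csr.data)     # non-zero values, stored row by row
print(csr.indices)  # column index of each stored value
print(csr.indptr)   # row i occupies data[indptr[i]:indptr[i + 1]]

# Walk the rows using indptr; an empty row has equal start and end.
for i in range(csr.shape[0]):
    start, end = csr.indptr[i], csr.indptr[i + 1]
    print(i, csr.indices[start:end], csr.data[start:end])
```

The row pointer array has one more entry than the number of rows, which is why reading a whole row is cheap in this format.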
Example 2: csc_matrix (Compressed Sparse Column)
```python
import numpy as np
from scipy.sparse import csc_matrix

d = np.array([3, 4, 5, 7, 2, 6])  # data: the non-zero values
r = np.array([0, 0, 1, 1, 3, 3])  # rows: row index of each value
c = np.array([2, 4, 2, 3, 1, 2])  # cols: column index of each value

csc = csc_matrix((d, (r, c)), shape=(4, 5))
print(csc.toarray())
```
Output
[[0 0 3 0 4]
[0 0 5 7 0]
[0 0 0 0 0]
[0 2 6 0 0]]
Explanation: Creates a Compressed Sparse Column (CSC) matrix from the non-zero values d and their row (r) and column (c) indices. csc_matrix stores the data column-wise, and toarray() converts it back to a full 2D NumPy array.
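The column-wise layout is what makes column slicing efficient. As a short sketch, the code below rebuilds the same CSC matrix, pulls out a single column with ordinary slicing, and prints the column pointer array.

```python
import numpy as np
from scipy.sparse import csc_matrix

d = np.array([3, 4, 5, 7, 2, 6])
r = np.array([0, 0, 1, 1, 3, 3])
c = np.array([2, 4, 2, 3, 1, 2])
csc = csc_matrix((d, (r, c)), shape=(4, 5))

# Slicing a column returns a 4 x 1 sparse matrix.
col2 = csc[:, 2]
print(col2.toarray().ravel())

# Column j occupies data[indptr[j]:indptr[j + 1]].
print(csc.indptr)
```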
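Example 3: coo_matrix (Coordinate Format)
```python
import numpy as np
from scipy.sparse import coo_matrix

d = np.array([3, 4, 5, 7, 2, 6])  # data: the non-zero values
r = np.array([0, 0, 1, 1, 3, 3])  # rows: row index of each value
c = np.array([2, 4, 2, 3, 1, 2])  # cols: column index of each value

coo = coo_matrix((d, (r, c)), shape=(4, 5))
print(coo.toarray())
```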
Output
[[0 0 3 0 4]
[0 0 5 7 0]
[0 0 0 0 0]
[0 2 6 0 0]]
Explanation: Creates a COO matrix from the non-zero values and their row and column positions. This format is suitable for quick construction, and toarray() returns the full matrix.
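One property worth knowing when building matrices in COO format is that the same position may appear more than once, and the duplicate entries are summed when the matrix is converted to CSR or CSC. The sketch below uses hypothetical values chosen to show this.

```python
import numpy as np
from scipy.sparse import coo_matrix

vals = np.array([1, 2, 5])
rows = np.array([0, 0, 1])
cols = np.array([0, 0, 2])  # position (0, 0) appears twice

coo = coo_matrix((vals, (rows, cols)), shape=(2, 3))
print(coo.nnz)  # counts stored entries, duplicates included

# Converting to CSR sums the two entries at (0, 0).
csr = coo.tocsr()
print(csr.nnz)
print(csr.toarray())
```

This makes COO convenient for accumulating contributions, for example counts gathered in several passes over the data.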
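Example 4: lil_matrix (List of Lists)
```python
from scipy.sparse import lil_matrix

# Start with an empty 4 x 5 matrix and assign the non-zero values one by one.
lil = lil_matrix((4, 5))
lil[0, 2] = 3
lil[0, 4] = 4
lil[1, 2] = 5
lil[1, 3] = 7
lil[3, 1] = 2
lil[3, 2] = 6

print(lil.toarray())
```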
Output
[[0. 0. 3. 0. 4.]
[0. 0. 5. 7. 0.]
[0. 0. 0. 0. 0.]
[0. 2. 6. 0. 0.]]
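Explanation: Creates a List of Lists (LIL) matrix by assigning non-zero values directly to specified row and column positions. lil_matrix allows efficient row-wise insertion. The values print as floats because no dtype was given and lil_matrix defaults to a floating-point type.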
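Example 5: dok_matrix (Dictionary of Keys)
```python
from scipy.sparse import dok_matrix

# Start with an empty 4 x 5 matrix; each assignment stores one (row, col) key.
dok = dok_matrix((4, 5))
dok[0, 2] = 3
dok[0, 4] = 4
dok[1, 2] = 5
dok[1, 3] = 7
dok[3, 1] = 2
dok[3, 2] = 6

print(dok.toarray())
```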
Output
[[0. 0. 3. 0. 4.]
[0. 0. 5. 7. 0.]
[0. 0. 0. 0. 0.]
[0. 2. 6. 0. 0.]]
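Example 6: dia_matrix (Diagonal Matrix)
```python
import numpy as np
from scipy.sparse import dia_matrix

# Each row of d holds one diagonal; offsets says which diagonal it is
# (0 = main diagonal, -1 = first diagonal below it).
d = np.array([[3, 0, 0, 0, 0],
              [0, 5, 0, 0, 0]])
offsets = np.array([0, -1])

# Element d[k, j] is placed in column j, at row j - offsets[k].
dia = dia_matrix((d, offsets), shape=(4, 5))
print(dia.toarray())
```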
Output
[[3 0 0 0 0]
[0 0 0 0 0]
[0 5 0 0 0]
[0 0 0 0 0]]
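For banded matrices it is often simpler to use the scipy.sparse.diags helper than to lay out the data array by hand. The following sketch builds a hypothetical 4 x 4 tridiagonal matrix; the values are chosen only for illustration, and format="dia" is passed explicitly so the result is stored in DIA format.

```python
from scipy.sparse import diags

# One scalar per diagonal; with an explicit shape, each scalar is
# repeated along its diagonal.
tri = diags([1, -2, 1], offsets=[-1, 0, 1], shape=(4, 4), format="dia")

print(tri.format)
print(tri.toarray())
```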