SparseVector
- class pyspark.mllib.linalg.SparseVector(size, *args)
A simple sparse vector class for passing data to MLlib. Users may alternatively pass SciPy’s scipy.sparse data types.
Methods
asML(): Convert this vector to the new mllib-local representation.
dot(other): Dot product with a SparseVector or a 1- or 2-dimensional NumPy array.
norm(p): Calculates the norm of a SparseVector.
numNonzeros(): Number of nonzero elements.
parse(s): Parse a string representation back into a SparseVector.
squared_distance(other): Squared distance from a SparseVector or a 1-dimensional NumPy array.
toArray(): Returns a copy of this SparseVector as a 1-dimensional NumPy array.
Attributes
size: Size of the vector.
indices: A list of indices corresponding to active entries.
values: A list of values corresponding to active entries.
Methods Documentation
- asML()
Convert this vector to the new mllib-local representation. This does NOT copy the data; it copies references.
New in version 2.0.0.
- dot(other)
Dot product with a SparseVector or a 1- or 2-dimensional NumPy array.
Examples
>>> a = SparseVector(4, [1, 3], [3.0, 4.0])
>>> a.dot(a)
25.0
>>> a.dot(array.array('d', [1., 2., 3., 4.]))
22.0
>>> b = SparseVector(4, [2], [1.0])
>>> a.dot(b)
0.0
>>> a.dot(np.array([[1, 1], [2, 2], [3, 3], [4, 4]]))
array([ 22., 22.])
>>> a.dot([1., 2., 3.])
Traceback (most recent call last):
    ...
AssertionError: dimension mismatch
>>> a.dot(np.array([1., 2.]))
Traceback (most recent call last):
    ...
AssertionError: dimension mismatch
>>> a.dot(DenseVector([1., 2.]))
Traceback (most recent call last):
    ...
AssertionError: dimension mismatch
>>> a.dot(np.zeros((3, 2)))
Traceback (most recent call last):
    ...
AssertionError: dimension mismatch
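To make the arithmetic behind these results easier to follow, here is a minimal plain-Python sketch of a sparse dot product. It is not the library's implementation: the function names `sparse_dot_dense` and `sparse_dot_sparse` are hypothetical, and the sketch assumes the index lists are sorted in ascending order.

```python
def sparse_dot_dense(size, indices, values, dense):
    """Dot product of a sparse vector (size, indices, values) with a dense sequence.

    Hypothetical helper for illustration only.
    """
    # Mirror the documented behaviour: the lengths must agree.
    assert len(dense) == size, "dimension mismatch"
    # Only active entries contribute; every other position multiplies by zero.
    return sum(v * dense[i] for i, v in zip(indices, values))


def sparse_dot_sparse(size_a, idx_a, val_a, size_b, idx_b, val_b):
    """Dot product of two sparse vectors (assumes ascending index lists)."""
    assert size_a == size_b, "dimension mismatch"
    total, i, j = 0.0, 0, 0
    # Walk both index lists in step; only shared indices contribute.
    while i < len(idx_a) and j < len(idx_b):
        if idx_a[i] == idx_b[j]:
            total += val_a[i] * val_b[j]
            i += 1
            j += 1
        elif idx_a[i] < idx_b[j]:
            i += 1
        else:
            j += 1
    return total
```

With the vectors from the examples above, `sparse_dot_dense(4, [1, 3], [3.0, 4.0], [1., 2., 3., 4.])` evaluates 3.0 × 2.0 + 4.0 × 4.0, and the sparse-sparse case against `b` finds no shared index, so nothing is added to the total.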
- norm(p)
Calculates the norm of a SparseVector.
Examples
>>> a = SparseVector(4, [0, 1], [3., -4.])
>>> a.norm(1)
7.0
>>> a.norm(2)
5.0
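Because inactive entries are zero, a p-norm depends only on the stored values. The sketch below illustrates that idea in plain Python; `sparse_norm` is a hypothetical name, it assumes p ≥ 1 (or infinity), and it is not meant to describe how the library computes the result.

```python
import math


def sparse_norm(values, p):
    """p-norm computed from the active values only (hypothetical helper).

    Assumes p >= 1 or p == math.inf.
    """
    if p == math.inf:
        # Infinity norm: the largest absolute value, or 0.0 if nothing is stored.
        return max((abs(v) for v in values), default=0.0)
    # General case: (sum |v|^p)^(1/p); zeros would add nothing to the sum.
    return sum(abs(v) ** p for v in values) ** (1.0 / p)
```

For the values `[3., -4.]` used above, p = 1 sums the absolute values and p = 2 takes the square root of 9 + 16.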
- static parse(s)
Parse string representation back into the SparseVector.
Examples
>>> SparseVector.parse(' (4, [0,1 ],[ 4.0,5.0] )')
SparseVector(4, {0: 4.0, 1: 5.0})
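The string form in this example is `(size, [indices], [values])` with optional whitespace. As a rough illustration of what parsing that shape involves, here is a plain-Python sketch; `parse_sparse` is a hypothetical function that returns a tuple rather than a SparseVector, and its error handling is an assumption, not the library's behaviour.

```python
def parse_sparse(s):
    """Parse '(size, [i0, i1, ...], [v0, v1, ...])' into (size, indices, values).

    Hypothetical helper for illustration only.
    """
    s = s.strip()
    if not (s.startswith("(") and s.endswith(")")):
        raise ValueError("expected a string wrapped in parentheses")
    body = s[1:-1]
    # Everything before the first comma is the size.
    size_text, rest = body.split(",", 1)
    size = int(size_text)
    # The two bracketed groups hold the indices and the values, in that order.
    first_open, first_close = rest.index("["), rest.index("]")
    second_open = rest.index("[", first_close)
    second_close = rest.index("]", second_open)
    idx_text = rest[first_open + 1:first_close]
    val_text = rest[second_open + 1:second_close]
    # int() and float() tolerate surrounding whitespace; skip empty tokens.
    indices = [int(t) for t in idx_text.split(",") if t.strip()]
    values = [float(t) for t in val_text.split(",") if t.strip()]
    if len(indices) != len(values):
        raise ValueError("indices and values must have the same length")
    return size, indices, values
```

Applied to the string from the example, the sketch yields the size 4, the indices 0 and 1, and the values 4.0 and 5.0, which correspond to the `{0: 4.0, 1: 5.0}` mapping shown in the output.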
- squared_distance(other)
Squared distance from a SparseVector or 1-dimensional NumPy array.
Examples
>>> a = SparseVector(4, [1, 3], [3.0, 4.0])
>>> a.squared_distance(a)
0.0
>>> a.squared_distance(array.array('d', [1., 2., 3., 4.]))
11.0
>>> a.squared_distance(np.array([1., 2., 3., 4.]))
11.0
>>> b = SparseVector(4, [2], [1.0])
>>> a.squared_distance(b)
26.0
>>> b.squared_distance(a)
26.0
>>> b.squared_distance([1., 2.])
Traceback (most recent call last):
    ...
AssertionError: dimension mismatch
>>> b.squared_distance(SparseVector(3, [1,], [1.0,]))
Traceback (most recent call last):
    ...
AssertionError: dimension mismatch
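The squared distance is the sum of squared element-wise differences, with every inactive entry treated as zero. The following plain-Python sketch spells that out; both function names are hypothetical and the code is an illustration of the formula, not the library's implementation.

```python
def squared_distance_to_dense(size, indices, values, dense):
    """Squared distance between a sparse vector and a dense sequence (hypothetical)."""
    assert len(dense) == size, "dimension mismatch"
    active = dict(zip(indices, values))
    # Positions without an active entry are treated as 0.0.
    return sum((active.get(i, 0.0) - x) ** 2 for i, x in enumerate(dense))


def squared_distance_to_sparse(size_a, idx_a, val_a, size_b, idx_b, val_b):
    """Squared distance between two sparse vectors (hypothetical)."""
    assert size_a == size_b, "dimension mismatch"
    a = dict(zip(idx_a, val_a))
    b = dict(zip(idx_b, val_b))
    # Only indices active in at least one vector can produce a nonzero difference.
    return sum((a.get(i, 0.0) - b.get(i, 0.0)) ** 2 for i in set(a) | set(b))
```

For `a` and the dense vector `[1., 2., 3., 4.]` the differences are -1, 1, -3 and 0, giving 1 + 1 + 9 + 0. For `a` and `b` the active indices do not overlap, so the result is 9 + 1 + 16 regardless of argument order.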
Attributes Documentation
- size
Size of the vector.
- indices
A list of indices corresponding to active entries.
- values
A list of values corresponding to active entries.
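The three attributes together determine the full vector. As a small plain-Python illustration of how they relate to a dense layout, the sketch below expands `size`, `indices`, and `values` into a list and counts nonzero values. The helper names `to_dense` and `count_nonzeros` are hypothetical, and counting by value (rather than by number of stored entries) is an assumption about how explicitly stored zeros should be treated.

```python
def to_dense(size, indices, values):
    """Expand (size, indices, values) into a dense list of floats (hypothetical)."""
    dense = [0.0] * size
    # Each active entry overwrites the default zero at its index.
    for i, v in zip(indices, values):
        dense[i] = v
    return dense


def count_nonzeros(values):
    """Count stored values that are not zero (hypothetical).

    An active entry may hold an explicit 0.0, so this counts by value.
    """
    return sum(1 for v in values if v != 0)
```

For a vector with size 4, indices `[1, 3]`, and values `[3.0, 4.0]`, the dense layout has 3.0 at position 1, 4.0 at position 3, and zeros elsewhere.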