spectral_embedding#
- sklearn.manifold.spectral_embedding(adjacency,*,n_components=8,eigen_solver=None,random_state=None,eigen_tol='auto',norm_laplacian=True,drop_first=True)[source]#
Project the sample on the first eigenvectors of the graph Laplacian.
The adjacency matrix is used to compute a normalized graph Laplacianwhose spectrum (especially the eigenvectors associated to thesmallest eigenvalues) has an interpretation in terms of minimalnumber of cuts necessary to split the graph into comparably sizedcomponents.
This embedding can also ‘work’ even if the
adjacencyvariable isnot strictly the adjacency matrix of a graph but more generallyan affinity or similarity matrix between samples (for instance theheat kernel of a euclidean distance matrix or a k-NN matrix).However care must taken to always make the affinity matrix symmetricso that the eigenvector decomposition works as expected.
Note : Laplacian Eigenmaps is the actual algorithm implemented here.
Read more in theUser Guide.
- Parameters:
- adjacency{array-like, sparse graph} of shape (n_samples, n_samples)
The adjacency matrix of the graph to embed.
- n_componentsint, default=8
The dimension of the projection subspace.
- eigen_solver{‘arpack’, ‘lobpcg’, ‘amg’}, default=None
The eigenvalue decomposition strategy to use. AMG requires pyamgto be installed. It can be faster on very large, sparse problems,but may also lead to instabilities. If None, then
'arpack'isused.- random_stateint, RandomState instance or None, default=None
A pseudo random number generator used for the initializationof the lobpcg eigen vectors decomposition when
eigen_solver=='amg', and for the K-Means initialization. Use an int to makethe results deterministic across calls (SeeGlossary).Note
When using
eigen_solver=='amg',it is necessary to also fix the global numpy seed withnp.random.seed(int)to get deterministic results. Seepyamg/pyamg#139 for furtherinformation.- eigen_tolfloat, default=”auto”
Stopping criterion for eigendecomposition of the Laplacian matrix.If
eigen_tol="auto"then the passed tolerance will depend on theeigen_solver:If
eigen_solver="arpack", theneigen_tol=0.0;If
eigen_solver="lobpcg"oreigen_solver="amg", theneigen_tol=Nonewhich configures the underlyinglobpcgsolver toautomatically resolve the value according to their heuristics. See,scipy.sparse.linalg.lobpcgfor details.
Note that when using
eigen_solver="amg"values oftol<1e-5may leadto convergence issues and should be avoided.Added in version 1.2:Added ‘auto’ option.
- norm_laplacianbool, default=True
If True, then compute symmetric normalized Laplacian.
- drop_firstbool, default=True
Whether to drop the first eigenvector. For spectral embedding, thisshould be True as the first eigenvector should be constant vector forconnected graph, but for spectral clustering, this should be kept asFalse to retain the first eigenvector.
- Returns:
- embeddingndarray of shape (n_samples, n_components)
The reduced samples.
Notes
Spectral Embedding (Laplacian Eigenmaps) is most useful when the graphhas one connected component. If there graph has many components, the firstfew eigenvectors will simply uncover the connected components of the graph.
References
Examples
>>>fromsklearn.datasetsimportload_digits>>>fromsklearn.neighborsimportkneighbors_graph>>>fromsklearn.manifoldimportspectral_embedding>>>X,_=load_digits(return_X_y=True)>>>X=X[:100]>>>affinity_matrix=kneighbors_graph(...X,n_neighbors=int(X.shape[0]/10),include_self=True...)>>># make the matrix symmetric>>>affinity_matrix=0.5*(affinity_matrix+affinity_matrix.T)>>>embedding=spectral_embedding(affinity_matrix,n_components=2,random_state=42)>>>embedding.shape(100, 2)
