Rcpp Machine Learning: Fast robust NMF, divisive clustering, and more
License
GPL-2.0, GPL-3.0 licenses found
RcppML is an R package for fast non-negative matrix factorization and divisive clustering using large sparse matrices. For the single-cell analysis version of functionality in RcppML, check out zdebruine/singlet.
Check out the RcppML pkgdown site!
RcppML NMF is:
- The fastest NMF implementation in any language for sparse and dense matrices
- More interpretable than other implementations due to diagonal scaling
- Easy to regularize with an L1 penalty
Install from CRAN or the development version from GitHub:

```r
install.packages('RcppML')                    # install CRAN version
devtools::install_github("zdebruine/RcppML")  # compile dev version
```

NOTE: RcppML is being actively developed. Please check that your `packageVersion("RcppML")` is current before raising issues.
Check out the CRAN manual.
Once installed and loaded, RcppML C++ headers defining classes can be used in C++ files for any R package using `#include <RcppML.hpp>`.
Sparse matrix factorization by alternating least squares:
- Non-negativity constraints
- L1 regularization
- Diagonal scaling
- Rank-1 and Rank-2 specializations (~2x faster than irlba SVD equivalents)
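To make the diagonal scaling idea concrete, here is a minimal base-R sketch. It does not call RcppML; the matrices `W` and `H` are random stand-ins for an unscaled factorization, and normalizing each factor to sum to 1 is an assumption made for illustration. The point is only that the scale of each factor can be moved into a diagonal `D` without changing the product.

```r
# Illustrative only: random non-negative factors, not RcppML output
set.seed(1)
k <- 3
W <- matrix(runif(20 * k), nrow = 20)  # 20 features x k factors
H <- matrix(runif(k * 8), nrow = k)    # k factors x 8 samples

# Scale of each factor in W (column sums) and in H (row sums)
w_sums <- colSums(W)
h_sums <- rowSums(H)

# Normalize so every column of W and every row of H sums to 1
W_scaled <- sweep(W, 2, w_sums, "/")
H_scaled <- sweep(H, 1, h_sums, "/")

# The removed scale goes into the diagonal D
d <- w_sums * h_sums

# W_scaled %*% D %*% H_scaled reproduces the original product W %*% H
reconstruction <- W_scaled %*% diag(d) %*% H_scaled
all.equal(reconstruction, W %*% H)
```

With the scale isolated in `d`, factors can be compared or ordered by their weight, and `W_scaled` and `H_scaled` share a common scale across factors.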
Read (and cite) our bioRxiv manuscript on NMF for single-cell experiments.
The `nmf` function runs matrix factorization by alternating least squares in the form `A = WDH`. The `project` function updates `w` or `h` given the other, while the `mse` function calculates mean squared error of the factor model.
```r
library(RcppML)
A <- Matrix::rsparsematrix(1000, 100, 0.1)  # sparse Matrix::dgCMatrix
model <- RcppML::nmf(A, k = 10)
h0 <- predict(model, A)
evaluate(model, A)  # calculate mean squared error
```

Divisive clustering by rank-2 spectral bipartitioning.
- 2nd SVD vector is linearly related to the difference between factors in rank-2 matrix factorization.
- Rank-2 matrix factorization (optional non-negativity constraints) for spectral bipartitioning, ~2x faster than irlba SVD
- Sensitive distance-based stopping criteria similar to Newman-Girvan modularity, but orders of magnitude faster
- Stopping criteria based on minimum number of samples
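The bipartitioning idea in the list above can be sketched in base R. This is an illustration only: it uses `svd` from base R rather than RcppML's rank-2 factorization, the toy matrix is made up, and splitting on the sign of the second right singular vector is a simplifying assumption, not necessarily the rule the package applies.

```r
# Toy matrix: 20 features x 10 samples with two obvious sample groups
A <- rbind(
  cbind(matrix(5, 10, 5), matrix(1, 10, 5)),  # features high in samples 1-5
  cbind(matrix(1, 10, 5), matrix(5, 10, 5))   # features high in samples 6-10
)

# Only the first two right singular vectors are needed
s <- svd(A, nu = 0, nv = 2)
v2 <- s$v[, 2]

# Assign each sample (column) to a side by the sign of the second vector.
# The overall sign of a singular vector is arbitrary, so the labels 1 and 2
# may swap, but the grouping of samples does not change.
cluster <- ifelse(v2 > 0, 1L, 2L)
split(seq_len(ncol(A)), cluster)
```

Applying the same split recursively to each resulting group, until a stopping criterion is met, gives a divisive clustering.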
The `dclust` function runs divisive clustering by recursive spectral bipartitioning, while the `bipartition` function exposes the rank-2 NMF specialization and returns statistics of the bipartition.

```r
library(RcppML)
A <- Matrix::rsparsematrix(1000, 1000, 0.1)  # sparse Matrix::dgCMatrix
clusters <- dclust(A, min_dist = 0.001, min_samples = 5)
cluster0 <- bipartition(A)
```