The ‘RcppML’ package provides high-performance machine learning algorithms using Rcpp with a focus on matrix factorization.
Install the latest development version of RcppML from GitHub:
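A minimal installation sketch using `devtools` (the `zdebruine/RcppML` repository path is assumed here):

```R
# install devtools first if it is not already available
# install.packages("devtools")
devtools::install_github("zdebruine/RcppML")
```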
RcppML contains extremely fast NNLS solvers. Use the `nnls` function to solve systems of equations subject to non-negativity constraints.
The `RcppML::solve` function solves the equation \eqn{ax = b} for \eqn{x}, where \eqn{a} is a symmetric positive definite matrix of dimensions \eqn{m \times m} and \eqn{b} is a vector of length \eqn{m} or a matrix of dimensions \eqn{m \times n}.
```R
# construct a system of equations
X <- matrix(rnorm(2000), 100, 20)
btrue <- runif(20)
y <- X %*% btrue + rnorm(100)
a <- crossprod(X)
b <- crossprod(X, y)

# solve the system of equations
x <- RcppML::nnls(a, b)

# use only coordinate descent
x <- RcppML::nnls(a, b, fast_nnls = FALSE, cd_maxit = 1000, cd_tol = 1e-8)
```

`RcppML::solve` implements a new and fastest-in-class algorithm for non-negative least squares.
Set `cd_maxit = 0` to use only the FAST (forward active set tuning) solver.

Project dense linear factor models onto real-valued sparse matrices (or any matrix coercible to `Matrix::dgCMatrix`) using `RcppML::project`.
`RcppML::project` solves the equation \eqn{A = wh} for \eqn{h}.
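A short sketch of a projection, assuming the two-argument form `RcppML::project(A, w)` where `w` has one row per row of `A`; the dimensions below are illustrative:

```R
library(Matrix)

# simulate a sparse matrix and a dense linear factor model
A <- rsparsematrix(1000, 100, 0.1)
w <- matrix(runif(1000 * 10), 1000, 10)

# project the factor model onto A to solve for h
h <- RcppML::project(A, w)
```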
`RcppML::nmf` finds a non-negative matrix factorization by alternating least squares (alternating projections of linear models \eqn{w} and \eqn{h}).
There are several ways in which the NMF algorithm differs from other currently available methods.
The following example runs rank-10 NMF on a random 1000 x 1000 matrix that is 90% sparse:
```R
A <- rsparsematrix(1000, 1000, 0.1)
model <- RcppML::nmf(A, 10, verbose = F)

w <- model$w
d <- model$d
h <- model$h
model_tolerance <- tail(model$tol, 1)
```

Tolerance is simply a measure of the average correlation between \eqn{w_{i-1}} and \eqn{w_i} and between \eqn{h_{i-1}} and \eqn{h_i} for a given iteration \eqn{i}.
For symmetric factorizations (when \eqn{A} is symmetric), tolerance becomes a measure of the correlation between \eqn{w_i} and \eqn{h_i}, and diagonalization is automatically performed to enforce symmetry:
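For example, a minimal sketch of a symmetric factorization, constructing a symmetric sparse input from the matrix `A` above with `crossprod` (the explicit coercion to `dgCMatrix` is an assumption about the expected input class):

```R
# build a symmetric sparse matrix and factorize it
A_sym <- as(crossprod(A), "dgCMatrix")
model_sym <- RcppML::nmf(A_sym, 10, verbose = F)
```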
Mean squared error of a factorization can be calculated for a given model using the `RcppML::mse` function:
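A usage sketch, assuming `RcppML::mse` takes the input matrix followed by the `w`, `d`, and `h` components of the model (here the symmetric model from above):

```R
# mean squared error of the symmetric factorization above
error <- RcppML::mse(A_sym, model_sym$w, model_sym$d, model_sym$h)
```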