Fast truncated singular value decompositions

bwlewis/irlba

Implicitly-restarted Lanczos methods for fast truncated singular value decomposition of sparse and dense matrices (also referred to as partial SVD). IRLBA stands for Augmented, Implicitly Restarted Lanczos Bidiagonalization Algorithm. The package provides the following functions (see help on each for details and examples).

  • irlba() partial SVD function
  • ssvd() l1-penalized matrix decomposition for sparse PCA (based on Shen and Huang's algorithm)
  • prcomp_irlba() principal components function similar to the prcomp function in the stats package, for computing the first few principal components of large matrices
  • svdr() alternate partial SVD function based on randomized SVD (see also the rsvd package by N. Benjamin Erichson for an alternative implementation)
  • partial_eigen() a very limited partial eigenvalue decomposition for symmetric matrices (see the RSpectra package for more comprehensive truncated eigenvalue decomposition)
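As a rough illustration of the first and third functions listed above, the following sketch compares irlba() and prcomp_irlba() with their base R counterparts on a small random matrix. The matrix size, seed, and variable names are assumptions chosen for illustration only; the help pages remain the authoritative reference for the arguments.

```r
library(irlba)
set.seed(1)
A <- matrix(rnorm(2000), 50, 40)   # hypothetical 50 x 40 test matrix

# Partial SVD: estimate only the 3 largest singular triplets
S <- irlba(A, nv = 3)

# Reference values: leading singular values from the full SVD in base R
full <- svd(A)$d[1:3]
err_svd <- max(abs(S$d - full))

# First 3 principal components, compared with stats::prcomp
# (both center the columns by default and do not rescale them)
P <- prcomp_irlba(A, n = 3)
Q <- prcomp(A)
err_pca <- max(abs(P$sdev - Q$sdev[1:3]))
```

On a matrix this small the full svd() is of course cheap; the partial methods are intended for problems where only a few singular values of a much larger matrix are needed.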

The help page for each function includes extensive documentation and examples. Also see the package vignette, vignette("irlba", package="irlba").

An overview web page is here: https://bwlewis.github.io/irlba/.

New in 2.3.2

  • Fixed a regression in prcomp_irlba() discovered by Xiaojie Qiu, see #25, and other related problems reported in #32.
  • Added rchk testing to pre-CRAN submission tests.
  • Fixed a sign bug in ssvd() found by Alex Poliakov.

What's new in Version 2.3.1?

  • Fixed an irlba() bug associated with centering (PCA), see #21.
  • Fixed irlba() scaling to conform to scale, see #22.
  • Improved prcomp_irlba() from a suggestion by N. Benjamin Erichson, see #23.
  • Significantly changed/improved svdr() convergence criterion.
  • Added a version of Shen and Huang's Sparse PCA/SVD L1-penalized matrix decomposition (ssvd()).
  • Fixed valgrind errors.
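The centering and scaling items above can be checked with a short sketch like the one below. It assumes the center and scale arguments of irlba() take one value per column, subtracted from and divided into each column respectively; the test matrix and variable names are hypothetical.

```r
library(irlba)
set.seed(1)
A <- matrix(rnorm(2000), 50, 40)   # hypothetical test matrix

# Implicit centering: column means are subtracted without forming
# the centered matrix explicitly
ctr <- colMeans(A)
d_implicit <- irlba(A, nv = 3, center = ctr)$d
d_explicit <- svd(sweep(A, 2, ctr))$d[1:3]

# Scaling: each column is divided by the supplied value,
# matching what base R's scale() does
sc <- apply(A, 2, sd)
d_scaled <- irlba(A, nv = 3, scale = sc)$d
d_ref <- svd(scale(A, center = FALSE, scale = sc))$d[1:3]
```

In both cases the implicit and explicit results should agree to within the solver tolerance.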

Deprecated features

I will remove partial_eigen() in a future version. As its documentation states, users are better off using the RSpectra package for eigenvalue computations (although not generally for singular value computations).
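For readers moving away from partial_eigen(), a minimal sketch of the RSpectra route might look like the following. It assumes the RSpectra package is installed and uses its eigs_sym() function; the test matrix is hypothetical, and the base R eigen() call is included only as a reference.

```r
set.seed(1)
X <- matrix(rnorm(400), 20)
G <- crossprod(X)    # symmetric positive semidefinite 20 x 20 matrix

# Reference: leading eigenvalues from the full decomposition in base R
ref <- eigen(G, symmetric = TRUE)$values[1:3]

# Truncated decomposition with RSpectra (skipped if it is not installed)
if (requireNamespace("RSpectra", quietly = TRUE)) {
  e <- RSpectra::eigs_sym(G, k = 3)   # 3 largest-magnitude eigenvalues
  print(max(abs(e$values - ref)))
}
```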

The mult argument is deprecated and will be removed in a future version. We now recommend simply defining a custom class with a custom multiplication operator. The example below illustrates the old and new approaches.

    library(irlba)
    set.seed(1)
    A <- matrix(rnorm(100), 10)

    # ------------------ old way ----------------------------------------------
    # A custom matrix multiplication function that scales the columns of A
    # (cf the scale option). This function scales the columns of A to unit norm.
    col_scale <- sqrt(apply(A, 2, crossprod))
    mult <- function(x, y)
            {
              # check if x is a vector
              if (is.vector(x))
              {
                return((x %*% y) / col_scale)
              }
              # else x is the matrix
              x %*% (y / col_scale)
            }
    irlba(A, 3, mult=mult)$d
    ## [1] 1.820227 1.622988 1.067185

    # Compare with:
    irlba(A, 3, scale=col_scale)$d
    ## [1] 1.820227 1.622988 1.067185

    # Compare with:
    svd(sweep(A, 2, col_scale, FUN=`/`))$d[1:3]
    ## [1] 1.820227 1.622988 1.067185

    # ------------------ new way ----------------------------------------------
    setClass("scaled_matrix", contains="matrix", slots=c(scale="numeric"))
    setMethod("%*%", signature(x="scaled_matrix", y="numeric"), function(x ,y) x@.Data %*% (y / x@scale))
    setMethod("%*%", signature(x="numeric", y="scaled_matrix"), function(x ,y) (x %*% y@.Data) / y@scale)
    a <- new("scaled_matrix", A, scale=col_scale)
    irlba(a, 3)$d
    ## [1] 1.820227 1.622988 1.067185

We have learned that using R's existing S4 system is simpler, easier, and more flexible than using custom arguments with idiosyncratic syntax and behavior. We've even used the new approach to implement distributed parallel matrix products for very large problems with amazingly little code.

Wishlist / help wanted...

  • More Matrix classes supported in the fast code path
  • Help improving the solver for singular values in tricky cases (basically, for ill-conditioned problems and especially for the smallest singular values); in general this may require a combination of more careful convergence criteria and use of harmonic Ritz values; Dmitriy Selivanov has proposed alternative convergence criteria in #29, for example.

References

  • Baglama, James, and Lothar Reichel. "Augmented implicitly restarted Lanczos bidiagonalization methods." SIAM Journal on Scientific Computing 27.1 (2005): 19-42.
  • Halko, Nathan, Per-Gunnar Martinsson, and Joel A. Tropp. "Finding structure with randomness: Stochastic algorithms for constructing approximate matrix decompositions." (2009).
  • Shen, Haipeng, and Jianhua Z. Huang. "Sparse principal component analysis via regularized low rank matrix approximation." Journal of multivariate analysis 99.6 (2008): 1015-1034.
  • Witten, Daniela M., Robert Tibshirani, and Trevor Hastie. "A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis." Biostatistics 10.3 (2009): 515-534.
