Non-negative Matrix Factorization (NMF and NMTF)

Koki Tsuyuzaki

Laboratory for Bioinformatics Research, RIKEN Center for Biosystems Dynamics Research
k.t.the-answer@hotmail.co.jp

Itoshi Nikaido

Laboratory for Bioinformatics Research, RIKEN Center for Biosystems Dynamics Research

2024-05-13

Introduction

In this vignette we consider approximating a non-negative matrix as a product of multiple non-negative low-rank matrices (a.k.a. factor matrices).

Test data is available from toyModel.

library("nnTensor")
## Warning: no DISPLAY variable so Tk is not available
X <- toyModel("NMF")

You will see that there are five blocks in the data matrix as follows.

image(X, main = "Original Data")

NMF

Here, we consider the approximation of the non-negative data matrix \(X\) (\(N \times M\)) as the matrix product of \(U\) (\(N \times J\)) and \(V\) (\(M \times J\)):

\[X \approx U V' \ \mathrm{s.t.}\ U \geq 0, V \geq 0\]

This is known as non-negative matrix factorization (NMF (Lee and Seung 1999; Cichocki 2009)), and the multiplicative update (MU) rule is often used to achieve this factorization.
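To give intuition for how the MU rule works, it can be sketched in plain R as follows. This is an illustrative re-implementation for the Frobenius-norm objective, not the nnTensor code path; the matrix sizes, rank, and iteration count are arbitrary.

```r
## Sketch of the multiplicative update (MU) rule for Frobenius NMF,
## X ~ U V'. Each factor is rescaled elementwise by a non-negative
## ratio, so non-negativity is preserved without any projection step.
set.seed(1)
N <- 50; M <- 40; J <- 3
X <- matrix(runif(N * M), nrow = N)   # non-negative data
U <- matrix(runif(N * J), nrow = N)   # random non-negative initial values
V <- matrix(runif(M * J), nrow = M)
eps <- 1e-10                          # guard against division by zero
err0 <- norm(X - U %*% t(V), "F")     # initial reconstruction error
for (iter in seq_len(200)) {
  U <- U * (X %*% V) / (U %*% crossprod(V) + eps)        # U <- U * XV / (U V'V)
  V <- V * crossprod(X, U) / (V %*% crossprod(U) + eps)  # V <- V * X'U / (V U'U)
}
rec_error <- norm(X - U %*% t(V), "F")
```

Because every update multiplies the current factor by a non-negative ratio, \(U\) and \(V\) stay non-negative throughout, and the Frobenius reconstruction error is non-increasing across iterations.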

Basic Usage

In NMF, the rank parameter \(J\) (\(\leq \min(N, M)\)) needs to be set in advance. Other settings such as the number of MU iterations (num.iter) or the factorization algorithm (algorithm) are also available. For the details of the arguments of NMF, see ?NMF. After the calculation, various objects are returned by NMF.

set.seed(123456)
out_NMF <- NMF(X, J = 5)
str(out_NMF, 2)
## List of 10
##  $ U            : num [1:100, 1:5] 46.4 47.2 47.4 47.7 46.3 ...
##  $ V            : num [1:300, 1:5] 1.97 2.01 2 2.03 2.01 ...
##  $ J            : num 5
##  $ RecError     : Named num [1:101] 1.00e-09 8.35e+03 7.65e+03 7.42e+03 7.21e+03 ...
##   ..- attr(*, "names")= chr [1:101] "offset" "1" "2" "3" ...
##  $ TrainRecError: Named num [1:101] 1.00e-09 8.35e+03 7.65e+03 7.42e+03 7.21e+03 ...
##   ..- attr(*, "names")= chr [1:101] "offset" "1" "2" "3" ...
##  $ TestRecError : Named num [1:101] 1e-09 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 ...
##   ..- attr(*, "names")= chr [1:101] "offset" "1" "2" "3" ...
##  $ RelChange    : Named num [1:101] 1.00e-09 5.43e-01 9.12e-02 3.12e-02 2.95e-02 ...
##   ..- attr(*, "names")= chr [1:101] "offset" "1" "2" "3" ...
##  $ Trial        : NULL
##  $ Runtime      : NULL
##  $ RankMethod   : NULL

The reconstruction error (RecError) and the relative change (RelChange, the amount of change from the reconstruction error in the previous step) can be used to diagnose whether the calculation has converged.

layout(t(1:2))
plot(log10(out_NMF$RecError[-1]), type = "b", main = "Reconstruction Error")
plot(log10(out_NMF$RelChange[-1]), type = "b", main = "Relative Change")

The product of \(U\) and \(V\) shows that the original data can be well recovered by NMF.

recX <- out_NMF$U %*% t(out_NMF$V)
layout(t(1:2))
image(X, main = "Original Data")
image(recX, main = "Reconstructed Data (NMF)")

Rank Estimation of NMF

NMF requires the rank parameter \(J\) in advance. However, setting the optimal value without prior knowledge or external measures is generally a difficult problem. In nnTensor, \(14\) evaluation scores (Brunet 2004; Han 2007; Frigyesi 2008; Park 2019; Shao 2017; Fogel 2013; Kim 2003; Hutchins 2008; Hoyer 2004) for estimating the rank parameter have been implemented in NMF. If multiple rank parameters are set, the evaluation scores are calculated over that range, and the optimal value can be estimated from large or small values and rapidly changing slopes. For the details, see (Brunet 2004; Han 2007; Frigyesi 2008; Park 2019; Shao 2017; Fogel 2013; Kim 2003; Hutchins 2008; Hoyer 2004).
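To give intuition for such elbow-style rank selection, one of the simplest scores, the residual sum of squares (RSS), can be scanned over candidate ranks in plain R. This is an illustrative sketch using MU updates directly, not how nnTensor computes its scores.

```r
## Illustrative rank scan: fit NMF by MU updates at several candidate
## ranks and record the residual sum of squares (RSS). A flattening of
## the RSS curve suggests the data are well explained at that rank.
set.seed(1)
X <- matrix(runif(100 * 80), nrow = 100)
rss <- sapply(1:6, function(J) {
  U <- matrix(runif(nrow(X) * J), nrow = nrow(X))
  V <- matrix(runif(ncol(X) * J), nrow = ncol(X))
  for (i in seq_len(100)) {
    U <- U * (X %*% V) / (U %*% crossprod(V) + 1e-10)
    V <- V * crossprod(X, U) / (V %*% crossprod(U) + 1e-10)
  }
  sum((X - U %*% t(V))^2)  # residual sum of squares at rank J
})
```

Plotting rss against the candidate ranks and looking for the elbow is the same visual procedure that the evaluation-score plots below support for all 14 scores at once.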

Note that here we run with a small num.iter to demonstrate the rank estimation with minimum computation time. When users try it on their own data, this option should be removed.

set.seed(123456)
out_NMF2 <- NMF(X, J = 1:10, num.iter = 1)
## Each rank, multiple NMF runs are performed
## Each rank estimation method
## (progress bars omitted; for each rank, the following scores are computed:
## ccc, dispersion, rss, evar, residuals, sparseness.basis, sparseness.coef,
## sparseness2.basis, sparseness2.coef, norm.info.gain.basis,
## norm.info.gain.coef, singular, volume, condition)
str(out_NMF2, 2)
## List of 10
##  $ U            : NULL
##  $ V            : NULL
##  $ J            : int [1:10] 1 2 3 4 5 6 7 8 9 10
##  $ RecError     : NULL
##  $ TrainRecError: NULL
##  $ TestRecError : NULL
##  $ RelChange    : NULL
##  $ Trial        :List of 10
##   ..$ Rank1 :List of 14
##   ..$ Rank2 :List of 14
##   ..$ Rank3 :List of 14
##   ..$ Rank4 :List of 14
##   ..$ Rank5 :List of 14
##   ..$ Rank6 :List of 14
##   ..$ Rank7 :List of 14
##   ..$ Rank8 :List of 14
##   ..$ Rank9 :List of 14
##   ..$ Rank10:List of 14
##  $ Runtime      : num 30
##  $ RankMethod   : chr "all"
##  - attr(*, "class")= chr "NMF"

Scores for the data matrix and random matrices are plotted at once. The red and green lines correspond to the original data matrix and a randomly permuted version of it, respectively.

plot(out_NMF2)

NMTF

As a factorization form different from NMF, non-negative matrix tri-factorization (NMTF (Copar 2019; Long 2005; Ding 2006)), which decomposes a non-negative matrix into three factor matrices, can be considered. NMTF approximates the non-negative data matrix \(X\) (\(N \times M\)) as the matrix product of \(U\) (\(N \times J1\)), \(S\) (\(J1 \times J2\)), and \(V\) (\(M \times J2\)) as follows.

\[X \approx U S V' \ \mathrm{s.t.}\ U \geq 0, S \geq 0, V \geq 0\]

As in NMF, these factor matrices are also optimized by the MU rule (Copar 2019; Long 2005; Ding 2006). Note that \(S\) is not necessarily a diagonal matrix and often contains non-zero off-diagonal elements.
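A minimal sketch of such MU updates for the tri-factorization can be written in plain R. This is an unconstrained Frobenius-norm variant for exposition only, not the nnTensor implementation; sizes, ranks, and iteration count are arbitrary.

```r
## Sketch of MU updates for NMTF, X ~ U S V' (unconstrained Frobenius
## variant; illustrative only). As in NMF, each factor is rescaled by
## a non-negative elementwise ratio, so non-negativity is preserved.
set.seed(1)
N <- 40; M <- 30; J1 <- 3; J2 <- 4
X <- matrix(runif(N * M), nrow = N)
U <- matrix(runif(N * J1), nrow = N)
S <- matrix(runif(J1 * J2), nrow = J1)
V <- matrix(runif(M * J2), nrow = M)
eps <- 1e-10                                  # guard against division by zero
err0 <- norm(X - U %*% S %*% t(V), "F")       # initial reconstruction error
for (iter in seq_len(200)) {
  U <- U * (X %*% V %*% t(S)) / (U %*% S %*% crossprod(V) %*% t(S) + eps)
  S <- S * (crossprod(U, X) %*% V) / (crossprod(U) %*% S %*% crossprod(V) + eps)
  V <- V * crossprod(X, U %*% S) / (V %*% crossprod(U %*% S) + eps)
}
rec_error <- norm(X - U %*% S %*% t(V), "F")
```

Note that in this sketch nothing forces \(S\) toward a diagonal form, which matches the remark above that \(S\) generally has non-zero off-diagonal elements.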

Basic Usage

Unlike NMF, NMTF sets two rank parameters \(J1\) (\(\leq N\)) and \(J2\) (\(\leq M\)) for \(U\) and \(V\), respectively. These values are separately set in advance. For example, here \(4\) and \(5\) are set as follows.

set.seed(123456)
out_NMTF <- NMTF(X, rank = c(4, 5))
str(out_NMTF, 2)
## List of 8
##  $ U            : num [1:100, 1:4] 2.66e-18 8.52e-18 1.77e-18 8.38e-19 1.05e-18 ...
##  $ S            : num [1:4, 1:5] 0.7598 0.04967 0.62901 0.00506 1.5182 ...
##  $ V            : num [1:300, 1:5] 1.67e-21 1.93e-20 1.34e-20 1.99e-20 1.20e-20 ...
##  $ rank         : num [1:2] 4 5
##  $ RecError     : Named num [1:101] 1.00e-09 7.99e+03 7.41e+03 7.37e+03 7.37e+03 ...
##   ..- attr(*, "names")= chr [1:101] "offset" "1" "2" "3" ...
##  $ TrainRecError: Named num [1:101] 1.00e-09 7.99e+03 7.41e+03 7.37e+03 7.37e+03 ...
##   ..- attr(*, "names")= chr [1:101] "offset" "1" "2" "3" ...
##  $ TestRecError : Named num [1:101] 1e-09 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 ...
##   ..- attr(*, "names")= chr [1:101] "offset" "1" "2" "3" ...
##  $ RelChange    : Named num [1:101] 1.00e-09 5.91e-01 7.87e-02 5.25e-03 6.02e-04 ...
##   ..- attr(*, "names")= chr [1:101] "offset" "1" "2" "3" ...

As with NMF, the values of the reconstruction error and relative change indicate that the optimization has converged.

layout(t(1:2))
plot(log10(out_NMTF$RecError[-1]), type = "b", main = "Reconstruction Error")
plot(log10(out_NMTF$RelChange[-1]), type = "b", main = "Relative Change")

The reconstructed matrix (\(USV'\)) shows that the features of the data matrix are well captured by NMTF.

recX2 <- out_NMTF$U %*% out_NMTF$S %*% t(out_NMTF$V)
layout(t(1:2))
image(X, main = "Original Data")
image(recX2, main = "Reconstructed Data (NMTF)")

Session Information

## R version 4.3.0 (2023-04-21)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 22.04.2 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: Etc/UTC
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] nnTensor_1.3.0
## 
## loaded via a namespace (and not attached):
##  [1] viridis_0.6.3      sass_0.4.6         utf8_1.2.3         generics_0.1.3    
##  [5] tcltk_4.3.0        digest_0.6.31      magrittr_2.0.3     evaluate_0.21     
##  [9] grid_4.3.0         RColorBrewer_1.1-3 fastmap_1.1.1      maps_3.4.1        
## [13] jsonlite_1.8.5     misc3d_0.9-1       gridExtra_2.3      fansi_1.0.4       
## [17] spam_2.9-1         viridisLite_0.4.2  scales_1.2.1       jquerylib_0.1.4   
## [21] cli_3.6.1          rlang_1.1.1        munsell_0.5.0      withr_2.5.0       
## [25] cachem_1.0.8       yaml_2.3.7         tools_4.3.0        dplyr_1.1.4       
## [29] colorspace_2.1-0   ggplot2_3.4.2      vctrs_0.6.5        R6_2.5.1          
## [33] lifecycle_1.0.3    plot3D_1.4         MASS_7.3-60        pkgconfig_2.0.3   
## [37] rTensor_1.4.8      pillar_1.9.0       bslib_0.5.0        gtable_0.3.3      
## [41] glue_1.6.2         Rcpp_1.0.10        fields_14.1        xfun_0.39         
## [45] tibble_3.2.1       tidyselect_1.2.1   highr_0.10         knitr_1.43        
## [49] farver_2.1.1       htmltools_0.5.5    rmarkdown_2.22     labeling_0.4.2    
## [53] tagcloud_0.6       dotCall64_1.0-2    compiler_4.3.0

References

Brunet, J.-P. et al. 2004. “Metagenes and Molecular Pattern Discovery Using Matrix Factorization.” PNAS 101(12): 4164–69.
Cichocki, A. et al. 2009. Nonnegative Matrix and Tensor Factorizations. Wiley.
Copar, A. et al. 2019. “Fast Optimization of Non-Negative Matrix Tri-Factorization.” PLOS ONE 14(6): e0217994.
Ding, C. et al. 2006. “Orthogonal Nonnegative Matrix Tri-Factorizations for Clustering.” SIGKDD'06, 126–35.
Fogel, P. 2013. “Permuted NMF: A Simple Algorithm Intended to Minimize the Volume of the Score Matrix.” arXiv.
Frigyesi, A. et al. 2008. “Non-Negative Matrix Factorization for the Analysis of Complex Gene Expression Data: Identification of Clinically Relevant Tumor Subtypes.” Cancer Informatics.
Han, X. 2007. “Cancer Molecular Pattern Discovery by Subspace Consensus Kernel Classification.” CSB 2007 6: 55–65.
Hoyer, P. O. 2004. “Non-Negative Matrix Factorization with Sparseness Constraints.” JMLR 5: 1457–69.
Hutchins, L. N. et al. 2008. “Position-Dependent Motif Characterization Using Non-Negative Matrix Factorization.” Bioinformatics 24(23): 2684–90.
Kim, P. M. et al. 2003. “Subsystem Identification Through Dimensionality Reduction of Large-Scale Gene Expression Data.” Genome Research 13(7): 1706–18.
Lee, D., and H. Seung. 1999. “Learning the Parts of Objects by Non-Negative Matrix Factorization.” Nature 401: 788–91.
Long, B. et al. 2005. “Co-Clustering by Block Value Decomposition.” SIGKDD'05, 635–40.
Park, H. et al. 2019. “Lecture 3: Nonnegative Matrix Factorization: Algorithms and Applications.” SIAM Gene Golub Summer School.
Shao, C. et al. 2017. “Robust Classification of Single-Cell Transcriptome Data by Nonnegative Matrix Factorization.” Bioinformatics 33(2): 235–42.
