Type: Package
Title: Eigenvectors from Eigenvalues Sparse Principal Component Analysis (EESPCA)
Version: 0.8.0
Maintainer: H. Robert Frost <rob.frost@dartmouth.edu>
Description: Contains logic for computing sparse principal components via the EESPCA method, which is based on an approximation of the eigenvector/eigenvalue identity. Includes logic to support execution of the TPower and rifle sparse PCA methods, as well as logic to estimate the sparsity parameters used by EESPCA, TPower and rifle via cross-validation to minimize the out-of-sample reconstruction error. H. Robert Frost (2021) <doi:10.1080/10618600.2021.1987254>.
Depends: R (≥ 3.6.0), rifle (≥ 1.0.0), MASS, PMA
License: GPL-2 | GPL-3 [expanded from: GPL (≥ 2)]
Copyright: Dartmouth College
Encoding: UTF-8
NeedsCompilation: no
Packaged: 2025-07-21 17:34:17 UTC; d37329b
Author: H. Robert Frost [aut, cre]
Repository: CRAN
Date/Publication: 2025-07-21 17:50:08 UTC

Eigenvectors from Eigenvalues Sparse Principal Component Analysis (EESPCA)

Description

Implementation of Eigenvectors from Eigenvalues Sparse Principal Component Analysis (EESPCA).

Details

Package: EESPCA
Type: Package
Version: 0.7.0
Date: 2021
License: GPL-2

Note

This work was supported by the National Institutes of Health grants K01LM012426, R21CA253408, P20GM130454, and P30CA023108.

Author(s)

H. Robert Frost

References

Frost, H. R. (2021). Eigenvectors from Eigenvalues Sparse Principal Component Analysis (EESPCA). <doi:10.1080/10618600.2021.1987254>.

Approximates the normed squared eigenvector loadings

Description

Approximates the normed squared eigenvector loadings using a simplified version of the formula associating normed squared eigenvector loadings with the eigenvalues of the full matrix and sub-matrices.

Usage

    computeApproxNormSquaredEigenvector(cov.X, v1, lambda1, max.iter=5,
        lambda.diff.threshold=1e-6, trace=FALSE)

Arguments

cov.X

Covariance matrix.

v1

Principal eigenvector of cov.X, i.e., the loadings of the first PC.

lambda1

Largest eigenvalue of cov.X.

max.iter

Maximum number of iterations for the power iteration method when computing sub-matrix eigenvalues. See the description of powerIteration.

lambda.diff.threshold

Threshold for exiting the power iteration calculation. See the description of powerIteration.

trace

True if debugging messages should be displayed during execution.

Value

Vector of approximate normed squared eigenvector loadings.

See Also

eespca, powerIteration

Examples

    set.seed(1)
    # Simulate 10x5 MVN data matrix
    X=matrix(rnorm(50), nrow=10)
    # Estimate covariance matrix
    cov.X = cov(X)
    # Compute eigenvectors/values
    eigen.out = eigen(cov.X)
    v1 = eigen.out$vectors[,1]
    lambda1 = eigen.out$values[1]
    # Print true squared loadings
    v1^2
    # Compute approximate normed squared eigenvector loadings
    computeApproxNormSquaredEigenvector(cov.X=cov.X, v1=v1,
        lambda1=lambda1)
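
For intuition, the exact eigenvector-eigenvalue identity that EESPCA approximates can be checked numerically on the same simulated data. The snippet below is an illustrative sketch in base R, not a function exported by the package.

    # Illustrative sketch (not a package function): for a symmetric matrix,
    # the squared loading v1[j]^2 equals a ratio of products of eigenvalue gaps
    # between the full matrix and the sub-matrix with row/column j removed.
    p = ncol(cov.X)
    lambda.all = eigen(cov.X)$values
    v1.sq.exact = sapply(1:p, function(j) {
        sub.lambda = eigen(cov.X[-j, -j])$values
        prod(lambda1 - sub.lambda) / prod(lambda1 - lambda.all[-1])
    })
    # Compare with the true squared loadings and the approximation above
    cbind(v1^2, v1.sq.exact)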

Calculates the residual matrix from the reduced rank reconstruction

Description

Utility function for computing the residual matrix formed by subtracting from X a reduced rank approximation of X generated from the top k principal components contained in the matrix V.

Usage

    computeResidualMatrix(X, V, center=TRUE)

Arguments

X

An n-by-p data matrix whose top k principal components are contained in the p-by-k matrix V.

V

A p-by-k matrix containing the loadings for the top k principal components of X.

center

If true (the default), X will be mean-centered before the residual matrix is computed. If the PCs in V were computed via SVD on a mean-centered matrix or via eigen-decomposition of the sample covariance matrix, this should be set to true.

Value

Residual matrix.

Examples

    set.seed(1)
    # Simulate 10x5 MVN data matrix
    X=matrix(rnorm(50), nrow=10)
    # Perform PCA
    prcomp.out = prcomp(X)
    # Get rank 2 residual matrix
    computeResidualMatrix(X=X, V=prcomp.out$rotation[,1:2])
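
As a sanity check, the same residual can be formed by hand from the prcomp output. This is an illustrative sketch that assumes the residual is computed on the mean-centered data, consistent with center=TRUE.

    # Hand-computed rank-2 residual on the centered data (illustrative check)
    X.centered = scale(X, center=TRUE, scale=FALSE)
    V2 = prcomp.out$rotation[,1:2]
    manual.residual = X.centered - X.centered %*% V2 %*% t(V2)
    # Maximum absolute difference from the package result (should be near zero)
    max(abs(manual.residual - computeResidualMatrix(X=X, V=V2)))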

Eigenvectors from Eigenvalues Sparse Principal Component Analysis (EESPCA)

Description

Computes the first sparse principal component of the specified data matrix using the Eigenvectors from Eigenvalues Sparse Principal Component Analysis (EESPCA) method.

Usage

    eespca(X, max.iter=20, sparse.threshold, lambda.diff.threshold=1e-6,
        compute.sparse.lambda=FALSE, sub.mat.max.iter=5, trace=FALSE)

Arguments

X

An n-by-p data matrix for which the first sparse PC will be computed.

max.iter

Maximum number of iterations for the power iteration method. See powerIteration.

sparse.threshold

Threshold on loadings used to induce sparsity. Loadings below this value are set to 0. If not specified, defaults to 1/sqrt(p).

lambda.diff.threshold

Threshold for exiting the power iteration calculation. If the absolute relative difference in lambda is less than this threshold between subsequent iterations, the power iteration method is terminated. See powerIteration.

compute.sparse.lambda

If true, the sparse loadings will be used to compute the sparse eigenvalue.

sub.mat.max.iter

Maximum iterations for computation of sub-matrix eigenvalues using the power iteration method. To maximize performance, set to 1. Uses the same lambda.diff.threshold.

trace

True if debugging messages should be displayed during execution.

Value

A list with the following elements:

References

See Also

eespcaForK, computeApproxNormSquaredEigenvector, powerIteration

Examples

    set.seed(1)
    # Simulate 10x5 MVN data matrix
    X=matrix(rnorm(50), nrow=10)
    # Compute first sparse PC loadings using default threshold
    eespca(X=X)

Cross-validation for Eigenvectors from Eigenvalues Sparse Principal Component Analysis (EESPCA)

Description

Performs cross-validation of EESPCA to determine the optimal sparsity threshold. Selection is based on the minimization of reconstruction error. Based on the cross-validation approach of Witten et al., as implemented by the SPC.cv method in the PMA package.

Usage

    eespcaCV(X, max.iter=20, sparse.threshold.values, nfolds=5,
        lambda.diff.threshold=1e-6, compute.sparse.lambda=FALSE,
        sub.mat.max.iter=5, trace=FALSE)

Arguments

X

See description for eespca.

max.iter

See description for eespca.

sparse.threshold.values

Vector of threshold values to evaluate via cross-validation. See description for eespca for details.

nfolds

Number of cross-validation folds.

lambda.diff.threshold

See description for eespca.

compute.sparse.lambda

See description for eespca.

sub.mat.max.iter

See description for eespca.

trace

See description for eespca.

Value

A list with the following elements:

References

See Also

eespca, SPC.cv

Examples

    set.seed(1)
    # Simulate 10x5 MVN data matrix
    X=matrix(rnorm(50), nrow=10)
    # Generate range of threshold values to evaluate
    default.threshold = 1/sqrt(5)
    threshold.values = seq(from=.5*default.threshold, to=1.5*default.threshold, length.out=10)
    # Use 5-fold cross-validation to estimate optimal sparsity threshold
    eespcaCV(X=X, sparse.threshold.values=threshold.values)

Multi-PC version of Eigenvectors from Eigenvalues Sparse Principal Component Analysis (EESPCA)

Description

Computes multiple sparse principal components of the specified data matrix via sequential application of the Eigenvectors from Eigenvalues Sparse Principal Component Analysis (EESPCA) algorithm. After computing the first sparse PC via the eespca function, subsequent sparse PCs are computed by repeatedly applying eespca to the residual matrix formed by subtracting the reconstruction of X from the original X. Multiple sparse PCs are not guaranteed to be orthogonal.

Note that the accuracy of the sparse approximation declines substantially for PCs with very small variances. To avoid this issue, k should not be set higher than the number of statistically significant PCs according to a Tracy-Widom test. A conceptual sketch of the deflation step follows.
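
The deflation step can be illustrated with standard PCA functions. The sketch below is purely conceptual: it uses prcomp rather than eespca for each per-component fit and assumes computeResidualMatrix is used to deflate the data between steps; it is not the eespcaForK implementation.

    # Conceptual illustration of sequential deflation (not the eespcaForK code)
    set.seed(1)
    X = matrix(rnorm(50), nrow=10)
    # First PC loadings of X
    v1 = prcomp(X)$rotation[, 1, drop=FALSE]
    # Deflate X by removing the rank-1 reconstruction
    X.resid = computeResidualMatrix(X=X, V=v1)
    # "Second" PC obtained from the residual matrix
    v2 = prcomp(X.resid)$rotation[, 1, drop=FALSE]
    cbind(v1, v2)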

Usage

    eespcaForK(X, k=2, max.iter=20, sparse.threshold, lambda.diff.threshold=1e-6,
        compute.sparse.lambda=FALSE, sub.mat.max.iter=5, trace=FALSE)

Arguments

X

An n-by-p data matrix for which the first k sparse PCs will be computed.

k

The number of sparse PCs to compute. The specified k must be 2 or greater (for k=1, use the eespca method). A check is made that k is not greater than the maximum theoretical rank of X but, for performance reasons, a check is NOT made that k is less than or equal to the actual rank of X.

max.iter

See description for eespca.

sparse.threshold

See description for eespca.

lambda.diff.threshold

See description for eespca.

compute.sparse.lambda

See description for eespca.

sub.mat.max.iter

See description for eespca.

trace

See description for eespca.

Value

A list with the following elements:

References

See Also

eespca

Examples

    set.seed(1)
    # Simulate 10x5 MVN data matrix
    X=matrix(rnorm(50), nrow=10)
    # Get first two sparse PCs
    eespcaForK(X=X, sparse.threshold=1/sqrt(5), k=2)

Power iteration method for calculating principal eigenvector and eigenvalue.

Description

Computes the principal eigenvector and eigenvalue of the specified matrix using the power iteration method. Includes support for truncating the estimated eigenvector on each iteration to retain just the k eigenvector loadings with the largest absolute values, with all other values set to 0, i.e., the TPower method of Yuan & Zhang.
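
For intuition, a minimal base R sketch of (truncated) power iteration is shown below. It illustrates the general algorithm only; it is not the package's powerIteration implementation, and the helper name truncatedPowerSketch is hypothetical.

    # Minimal sketch of (truncated) power iteration; illustration only
    truncatedPowerSketch = function(A, k=NULL, max.iter=10, tol=1e-6) {
        p = ncol(A)
        v = rep(1/sqrt(p), p)                   # unit vector with equal values
        lambda.old = Inf
        for (i in 1:max.iter) {
            v = as.vector(A %*% v)
            if (!is.null(k)) {                  # TPower-style truncation
                keep = order(abs(v), decreasing=TRUE)[1:k]
                v[-keep] = 0
            }
            v = v / sqrt(sum(v^2))
            lambda = as.vector(t(v) %*% A %*% v)
            if (abs(lambda - lambda.old) / abs(lambda) < tol) break
            lambda.old = lambda
        }
        list(vector=v, value=lambda)
    }
    # An untruncated run should closely match eigen() on a sample covariance matrix
    set.seed(1)
    cov.X = cov(matrix(rnorm(50), nrow=10))
    truncatedPowerSketch(cov.X)$value
    eigen(cov.X)$values[1]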

Usage

    powerIteration(X, k, v1.init, max.iter=10, lambda.diff.threshold=1e-6, trace=FALSE)

Arguments

X

Matrix for which the largest eigenvector and eigenvalue will be computed.

k

If specified, the estimated eigenvector is truncated on each iteration to retain only the k loadings with the largest absolute values; all other loadings are set to 0. Must be an integer between 1 and ncol(X).

v1.init

If specified, the power iteration calculation will be initialized using this vector; otherwise, the calculation will be initialized using a unit vector with equal values.

max.iter

Maximum number of iterations for power iteration method.

lambda.diff.threshold

Threshold for exiting the power iteration calculation. If the absolute relative difference in the computed eigenvalue is less than this threshold between subsequent iterations, the power iteration method is terminated.

trace

True if debugging messages should be displayed during execution.

Value

A list with the following elements:

References

See Also

eespca

Examples

    set.seed(1)
    # Simulate 10x5 MVN data matrix
    X=matrix(rnorm(50), nrow=10)
    # Compute sample covariance matrix
    cov.X = cov(X)
    # Use power iteration to get first PC loadings using default initial vector
    powerIteration(X=cov.X)

Calculates the reduced rank reconstruction

Description

Utility function for computing the reduced rank reconstruction of X using the PC loadings in V.

Usage

    reconstruct(X, V, center=TRUE)

Arguments

X

An n-by-p data matrix whose top k principal components are contained in the p-by-k matrix V.

V

A p-by-k matrix containing the loadings for the top k principal components of X.

center

If true (the default), X will be mean-centered before the reconstruction is computed. If the PCs in V were computed via SVD on a mean-centered matrix or via eigen-decomposition of the sample covariance matrix, this should be set to true.

Value

Reduced rank reconstruction of X.

Examples

    set.seed(1)
    # Simulate 10x5 MVN data matrix
    X=matrix(rnorm(50), nrow=10)
    # Perform PCA
    prcomp.out = prcomp(X)
    # Get rank 2 reconstruction
    reconstruct(X, prcomp.out$rotation[,1:2])

Calculates the reduced rank reconstruction error

Description

Utility function for computing the squared Frobenius norm of the residual matrix formed by subtracting from X a reduced rank approximation of X generated from the top k principal components contained in the matrix V.

Usage

    reconstructionError(X, V, center=TRUE)

Arguments

X

An n-by-p data matrix whose top k principal components are contained in the p-by-k matrix V.

V

A p-by-k matrix containing the loadings for the top k principal components of X.

center

If true (the default), X will be mean-centered before the reconstruction error is computed. If the PCs in V were computed via SVD on a mean-centered matrix or via eigen-decomposition of the sample covariance matrix, this should be set to true.

Value

The squared Frobenius norm of the residual matrix.

Examples

    set.seed(1)
    # Simulate 10x5 MVN data matrix
    X=matrix(rnorm(50), nrow=10)
    # Perform PCA
    prcomp.out = prcomp(X)
    # Get rank 2 reconstruction error, which will be the minimum since the first 2 PCs are used
    reconstructionError(X, prcomp.out$rotation[,1:2])
    # Use all PCs to get approximately 0 reconstruction error
    reconstructionError(X, prcomp.out$rotation)
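
Because the returned value is the squared Frobenius norm of the residual, it should agree with summing the squared entries of the matrix returned by computeResidualMatrix (assuming both functions use the same centering); a quick illustrative check:

    # Illustrative check: squared Frobenius norm of the rank-2 residual matrix
    sum(computeResidualMatrix(X=X, V=prcomp.out$rotation[,1:2])^2)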

Computes the initial eigenvector for the rifle method of Tan et al.

Description

Computes the initial eigenvector for the rifle method of Tan et al. (as implemented by the rifle method in the rifle R package) using the initial.convex method from the rifle package with lambda=sqrt(log(p)/n) and K=1.

Usage

    rifleInit(X)

Arguments

X

n-by-p data matrix to be evaluated via PCA.

Value

Initial eigenvector to use with rifle method.

References

See Also

riflePCACV, rifle, initial.convex

Examples

    set.seed(1)
    # Simulate 10x5 MVN data matrix
    X=matrix(rnorm(50), nrow=10)
    # Compute initial eigenvector to use with rifle method
    v1.init = rifleInit(X)
    # Use with rifle method to get first PC loadings with 2 non-zero elements
    rifle(A=cov(X), B=diag(5), init=v1.init, k=2)

Sparsity parameter selection via cross-validation for rifle method of Tan et al.

Description

Sparsity parameter selection for PCA-based rifle (as implemented by the rifle method in the rifle package) using the cross-validation approach of Witten et al., as implemented by the SPC.cv method in the PMA package.

Usage

    riflePCACV(X, k.values, nfolds=5)

Arguments

X

n-by-p data matrix being evaluated via PCA.

k.values

Set of truncation parameter values to evaluate via cross-validation. Values must be between 1 and p.

nfolds

Number of folds for cross-validation.

Value

k value that generated the smallest cross-validation error.

References

See Also

rifleInit, rifle, SPC.cv

Examples

    set.seed(1)
    # Simulate 10x5 MVN data matrix
    X=matrix(rnorm(50), nrow=10)
    # Generate range of k values to evaluate
    k.values = 1:5
    # Use 5-fold cross-validation to estimate optimal k value
    riflePCACV(X=X, k.values=k.values)

Implementation of the Yuan and Zhang TPower method.

Description

Implements the TPower method of Yuan and Zhang. Specifically, it computes the sparse principal eigenvector using the power iteration method, where the estimated eigenvector is truncated on each iteration to retain just the k eigenvector loadings with the largest absolute values, with all other values set to 0.

Usage

    tpower(X, k, v1.init, max.iter=10, lambda.diff.threshold=1e-6, trace=FALSE)

Arguments

X

Matrix for which the largest eigenvector and eigenvalue will be computed.

k

Must be an integer between 1 and ncol(X). The estimated eigenvector is truncated on each iteration to retain only the k loadings with the largest absolute values; all other loadings are set to 0.

v1.init

If specified, the power iteration calculation will be initialized using this vector; otherwise, the calculation will be initialized using a unit vector with equal values.

max.iter

Maximum number of iterations for power iteration method.

lambda.diff.threshold

Threshold for exiting the power iteration calculation. If the absolute relative difference in computed eigenvalues is less than this threshold between subsequent iterations, the power iteration method is terminated.

trace

True if debugging messages should be displayed during execution.

Value

The estimated sparse principal eigenvector.

References

See Also

powerIteration, tpowerPCACV

Examples

    set.seed(1)
    # Simulate 10x5 MVN data matrix
    X=matrix(rnorm(50), nrow=10)
    # Compute first sparse PC loadings with 2 non-zero elements
    tpower(X=cov(X), k=2)

Sparsity parameter selection for the Yuan and Zhang TPower method using cross-validation.

Description

Sparsity parameter selection for PCA-based TPower using the cross-validation approach of Witten et al., as implemented by the SPC.cv method in the PMA package.

Usage

    tpowerPCACV(X, k.values, nfolds=5)

Arguments

X

n-by-p data matrix being evaluated via PCA.

k.values

Set of truncation parameter values to evaluate via cross-validation. Values must be between 1 and p.

nfolds

Number of folds for cross-validation.

Value

k value that generated the smallest cross-validation error.

References

See Also

tpower, SPC.cv

Examples

    set.seed(1)
    # Simulate 10x5 MVN data matrix
    X=matrix(rnorm(50), nrow=10)
    # Generate range of k values to evaluate
    k.values = 1:5
    # Use 5-fold cross-validation to estimate optimal k value
    tpowerPCACV(X=X, k.values=k.values)
