Type:Package
Title:High-Dimensional Covariate-Augmented Overdispersed Poisson Factor Model
Version:1.3
Date:2025-03-27
Author:Wei Liu [aut, cre], Qingzhi Zhong [aut]
Maintainer:Wei Liu <liuweideng@gmail.com>
Description:A covariate-augmented overdispersed Poisson factor model is proposed to jointly perform a high-dimensional Poisson factor analysis and estimate a large coefficient matrix for overdispersed count data. For more details, see Liu et al. (2024) <doi:10.1093/biomtc/ujae031>.
License:GPL-3
Depends:irlba, R (≥ 3.5.0)
Imports:MASS, stats, Rcpp (≥ 1.0.10)
URL:https://github.com/feiyoung/COAP
BugReports:https://github.com/feiyoung/COAP/issues
Suggests:knitr, rmarkdown
LinkingTo:Rcpp, RcppArmadillo
VignetteBuilder:knitr
Encoding:UTF-8
RoxygenNote:7.1.2
NeedsCompilation:yes
Packaged:2025-03-27 09:49:55 UTC; Liuxianju
Repository:CRAN
Date/Publication:2025-03-27 11:30:02 UTC

Fit the COAP model

Description

Fit the covariate-augmented overdispersed Poisson factor model

Usage

RR_COAP(
  X_count,
  multiFac = rep(1, nrow(X_count)),
  Z = matrix(1, nrow(X_count), 1),
  rank_use = 5,
  q = 15,
  epsELBO = 1e-05,
  maxIter = 30,
  verbose = TRUE,
  joint_opt_beta = FALSE,
  fast_svd = TRUE
)

Arguments

X_count

a count matrix, the observed count matrix.

multiFac

an optional vector, the normalization factor for each unit; defaults to a vector of ones.

Z

an optional matrix, the covariate matrix; defaults to a column vector of ones if there are no additional covariates.

rank_use

an optional integer specifying the rank of the regression coefficient matrix; defaults to 5.

q

an optional integer specifying the number of factors; defaults to 15.

epsELBO

an optional positive value, the tolerance on the relative change of the evidence lower bound (ELBO) value; defaults to '1e-5'.

maxIter

the maximum iteration of the VEM algorithm. The default is 30.

verbose

a logical value, whether to output information at each iteration.

joint_opt_beta

a logical value, whether to use the joint optimization method to update bbeta. The default is FALSE, which means the separate optimization method is used.

fast_svd

a logical value, whether to use the fast SVD algorithm in the update of bbeta; the default is TRUE.
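The rank_use and fast_svd arguments both concern a low-rank coefficient matrix. As a rough illustration of what a rank constraint means, the sketch below truncates a matrix to rank r with base R's svd(). The helper truncate_rank and the toy matrix are hypothetical and are not part of COAP; this is not a description of how the package updates bbeta.

```r
# Sketch only: best rank-r approximation of a matrix via SVD.
# `truncate_rank` is a hypothetical helper, not a COAP function.
truncate_rank <- function(M, r) {
  s <- svd(M, nu = r, nv = r)  # keep only the r leading singular vectors
  s$u %*% diag(s$d[seq_len(r)], nrow = r) %*% t(s$v)
}

set.seed(1)
M  <- matrix(rnorm(20 * 50), 20, 50)  # toy full-rank matrix
M3 <- truncate_rank(M, r = 3)         # same shape, rank reduced to 3
print(dim(M3))
print(qr(M3)$rank)
```

Base svd() computes the full decomposition. A partial SVD routine, such as the one in the irlba package listed under Depends, computes only the leading singular triplets, which is the usual motivation for a "fast" option on large matrices.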

Details

None

Value

Returns a list with the following components: (1) H, the predicted factor matrix; (2) B, the estimated loading matrix; (3) bbeta, the estimated low-rank large coefficient matrix; (4) invLambda, the inverse of the estimated error variances; (5) H0, the factor matrix; (6) ELBO, the ELBO value when the algorithm stops; (7) ELBO_seq, the sequence of ELBO values.
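The ELBO_seq component can be used to check whether the relative change in the ELBO has fallen below a tolerance such as epsELBO. The sketch below uses a made-up ELBO trace in place of a real fit, and the formula for the relative change is an assumption: the exact stopping rule used inside RR_COAP is not spelled out in this manual.

```r
# Hypothetical ELBO trace standing in for fitlist$ELBO_seq.
ELBO_seq <- c(-5000, -4200, -4050, -4049.99, -4049.98)

# Relative change between consecutive iterations (assumed definition).
rel_change <- abs(diff(ELBO_seq)) / abs(ELBO_seq[-length(ELBO_seq)])

eps <- 1e-05                                # same value as the epsELBO default
converged <- tail(rel_change, 1) < eps      # did the last step fall below eps?
print(rel_change)
print(converged)
```

If the final relative change is still above the tolerance when maxIter is reached, increasing maxIter is the natural next step.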

References

Liu, W. and Q. Zhong (2024). High-dimensional covariate-augmented overdispersed Poisson factor model. arXiv preprint arXiv:2402.15071.

See Also

None

Examples

n <- 300; p <- 100
d <- 20; q <- 6; r <- 3
datlist <- gendata_simu(n=n, p=p, d=20, q=q, rank0=r)
str(datlist)
fitlist <- RR_COAP(X_count=datlist$X, Z = datlist$Z, q=6, rank_use=3)
str(fitlist)

Generate simulated data

Description

Generate simulated data from covariate-augmented Poisson factor models

Usage

gendata_simu(
  seed = 1,
  n = 300,
  p = 50,
  d = 20,
  q = 6,
  rank0 = 3,
  rho = c(1.5, 1),
  sigma2_eps = 0.1,
  seed.beta = 1
)

Arguments

seed

a positive integer, the random seed for reproducibility of the data generation process.

n

a positive integer specifying the sample size.

p

a positive integer specifying the dimension of the count variables.

d

a positive integer specifying the dimension of the covariate matrix.

q

a positive integer specifying the number of factors.

rank0

a positive integer specifying the rank of the coefficient matrix.

rho

a numeric vector with length 2 and positive elements, specify the signal strength of regression coefficient and loading matrix, respectively.

sigma2_eps

a positive real, the variance of overdispersion error.

seed.beta

a positive integer, the random seed used to fix the regression coefficient matrix beta, for reproducibility of the data generation process.

Details

None

Value

Returns a list with the following components: (1) X, the high-dimensional count matrix; (2) Z, the high-dimensional covariate matrix; (3) bbeta0, the low-rank large coefficient matrix; (4) B0, the loading matrix; (5) H0, the factor matrix; (6) rank, the true rank of bbeta0; (7) q, the true number of factors.
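To make the roles of these components concrete, here is a minimal base-R sketch of drawing counts from a covariate-augmented overdispersed Poisson factor model. It assumes a log-linear mean with a covariate term, a factor term, and Gaussian overdispersion noise. The scaling constants, matrix orientations, and variable names are illustrative assumptions and do not reproduce what gendata_simu does internally.

```r
# Sketch only: simulate counts from an assumed log-linear Poisson factor model.
set.seed(1)
n <- 50; p <- 20; d <- 4; q <- 2; r <- 2

Z <- cbind(1, matrix(rnorm(n * (d - 1)), n, d - 1))  # n x d covariates, first column is an intercept
# Rank-r coefficient matrix built as a product of two thin matrices (assumed p x d).
bbeta <- matrix(rnorm(p * r), p, r) %*% matrix(rnorm(r * d), r, d) / 4
B <- matrix(rnorm(p * q), p, q) / 2                  # p x q loading matrix
H <- matrix(rnorm(n * q), n, q)                      # n x q factor matrix
eps <- matrix(rnorm(n * p, sd = sqrt(0.1)), n, p)    # overdispersion error with variance 0.1

log_mu <- Z %*% t(bbeta) + H %*% t(B) + eps          # n x p log-mean
X <- matrix(rpois(n * p, lambda = exp(log_mu)), n, p)
str(X)
```

The eps term is what makes the counts overdispersed relative to a plain Poisson model. Its variance here matches the default value of sigma2_eps.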

References

None

See Also

RR_COAP

Examples

n <- 300; p <- 100
d <- 20; q <- 6; r <- 3
datlist <- gendata_simu(n=n, p=p, d=20, q=q, rank0=r)
str(datlist)

Select the parameters in COAP models

Description

Select the number of factors and the rank of coefficient matrix in the covariate-augmented overdispersed Poisson factor model

Usage

selectParams(
  X_count,
  Z,
  multiFac = rep(1, nrow(X_count)),
  q_max = 15,
  r_max = 24,
  threshold = c(0.1, 0.01),
  verbose = TRUE,
  ...
)

Arguments

X_count

a count matrix, the observed count matrix.

Z

a matrix, the covariate matrix; use a column vector of ones if there are no additional covariates.

multiFac

an optional vector, the normalization factor for each unit; defaults to a vector of ones.

q_max

an optional integer specifying the upper bound for the number of factors; defaults to 15.

r_max

an optional integer specifying the upper bound for the rank of the regression coefficient matrix; defaults to 24.

threshold

an optional positive vector of length 2 specifying the thresholds used to filter the singular values of beta and B, respectively.

verbose

a logical value, whether to output information at each iteration.

...

other arguments passed to the function RR_COAP.

Details

The thresholds filter out singular values that carry little signal, which helps identify the underlying model structure.
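The idea of filtering singular values can be sketched in base R as follows. The criterion used here, counting singular values whose ratio to the largest one exceeds the threshold, is an assumption made for illustration. The manual does not state the exact rule selectParams applies, and count_above is a hypothetical helper.

```r
# Sketch only: count singular values that exceed a relative threshold.
# `count_above` is a hypothetical helper, not a COAP function.
count_above <- function(M, thr) {
  sv <- svd(M)$d
  sum(sv / sv[1] > thr)   # compare each singular value with the largest
}

set.seed(1)
# Rank-3 signal with known singular values 10, 6, 3, plus small noise.
U <- qr.Q(qr(matrix(rnorm(30 * 3), 30, 3)))
V <- qr.Q(qr(matrix(rnorm(12 * 3), 12, 3)))
signal <- U %*% diag(c(10, 6, 3)) %*% t(V)
noisy  <- signal + matrix(rnorm(30 * 12, sd = 1e-3), 30, 12)

print(count_above(noisy, thr = 0.1))
```

Under this reading, a larger threshold discards more of the weak singular values and so yields a smaller estimated rank or number of factors.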

Value

Returns a named vector with names 'hr' and 'hq': the estimated rank of the coefficient matrix and the estimated number of factors, respectively.

References

None

See Also

RR_COAP

Examples

n <- 300; p <- 100
d <- 20; q <- 6; r <- 3
datlist <- gendata_simu(seed=30, n=n, p=p, d=20, q=q, rank0=r)
str(datlist)
set.seed(1)
para_vec <- selectParams(X_count=datlist$X, Z = datlist$Z)
print(para_vec)
