
Type:

Package

Title:

Tools Developed for Structured Sufficient Dimension Reduction (sSDR)

Version:

1.2.0

Date:

2016-03-26

Author:

Yang Liu <zjubioly@gmail.com>, Francesca Chiaromonte, Bing Li

Maintainer:

Yang Liu <zjubioly@gmail.com>

Description:

Performs structured OLS (sOLS) and structured SIR (sSIR).

License:

GPL-2 | GPL-3 [expanded from: GPL (≥ 2)]

LazyData:

TRUE

Depends:

R (≥ 3.0.0), MASS, Matrix

NeedsCompilation:

Packaged:

2016-03-26 18:07:48 UTC; yangliu

Repository:

CRAN

Date/Publication:

2016-03-26 22:02:24

Center a vector

Description

Center a vector

Usage

center(v)

Arguments

v

A vector.

Details

This function centers any vector and returns a vector with mean zero.

Value

A vector with mean zero.

Examples

data <- gen.data(n=100)
y.centered <- center(data$y)
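
As a rough illustration of what centering does, the following base R sketch subtracts the sample mean from a vector. The helper name center_sketch is hypothetical and not part of the package; the package's own center may differ in details such as the handling of missing values.

```r
# Hypothetical helper: subtract the sample mean so the result has mean zero.
center_sketch <- function(v) {
  v - mean(v)
}

y <- c(2, 4, 9)
y_centered <- center_sketch(y)
mean(y_centered)  # zero up to floating-point error
```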

Covariance matrix

Description

Covariance matrix

Usage

cov.x(X)

Arguments

X

An n x p matrix of n observations and p predictors.

Details

This function returns the p x p covariance matrix of any n x p matrix.

Value

A p x p covariance matrix.

Examples

data <- gen.data(n=100)
x.cov <- cov.x(data$X)
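
The computation behind a p x p covariance matrix can be sketched in base R as follows. The helper cov_sketch is hypothetical, and the n - 1 denominator is an assumption chosen to match stats::cov; the documentation above does not say which denominator cov.x uses.

```r
# Hypothetical helper: sample covariance of the columns of an n x p matrix.
cov_sketch <- function(X) {
  X <- as.matrix(X)
  Xc <- sweep(X, 2, colMeans(X))   # subtract each column mean
  crossprod(Xc) / (nrow(X) - 1)    # assumed denominator: n - 1
}

set.seed(1)
X <- matrix(rnorm(50 * 4), 50, 4)  # simulated stand-in for a covariate matrix
S <- cov_sketch(X)
dim(S)
```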

Subspace distance

Description

Subspace distance

Usage

disvm(v1, v2)

Arguments

v1

A matrix whose columns are p-dimensional vectors.

v2

A matrix whose columns are p-dimensional vectors.

Details

This function computes the distance between two subspaces using the formulation in Li, Zha, and Chiaromonte (2005): the Frobenius norm of the difference between the two orthogonal projection matrices defined by v1 and v2.

Value

A scalar representing the distance between the two spaces spanned by v1 and v2, respectively.

References

Li, B., Zha, H., and Chiaromonte, F. (2005). Contour regression: a general approach to dimension reduction. Annals of Statistics, 33(4):1580-1616.

Examples

v1 <- c(1, 0, 0)
v2 <- c(0, 1, 0)
disvm(v1, v1)
disvm(v1, v2)
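
The distance described in the Details can be sketched in base R as the Frobenius norm of the difference between two orthogonal projection matrices. The helpers proj and subspace_dist are hypothetical names, the sketch assumes each input has full column rank, and whether disvm applies any additional scaling is not stated in this documentation.

```r
# Hypothetical helper: orthogonal projection onto the column space of v.
# Assumes v has full column rank so that crossprod(v) is invertible.
proj <- function(v) {
  v <- as.matrix(v)
  v %*% solve(crossprod(v)) %*% t(v)
}

# Frobenius norm of the difference between the two projection matrices.
subspace_dist <- function(v1, v2) {
  norm(proj(v1) - proj(v2), type = "F")
}

v1 <- c(1, 0, 0)
v2 <- c(0, 1, 0)
d_same <- subspace_dist(v1, v1)  # identical subspaces
d_orth <- subspace_dist(v1, v2)  # orthogonal one-dimensional subspaces
```

Because the measure depends only on the spanned subspaces, rescaling a vector (for example, 2 * v1) leaves the distance unchanged in this sketch.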

Groupwise OLS (gOLS)

Description

Groupwise OLS (gOLS)

Usage

gOLS(X, Y, groups, dims)

Arguments

X

A covariate matrix of n observations and p predictors.

Y

A univariate response.

groups

A vector with the number of predictors in each group.

dims

A vector with the dimension (at most 1) for each predictor group.

Details

This function estimates directions for each predictor group using gOLS. Predictors need to be organized in groups within the "X" matrix, in the same order as saved in "groups". Only continuous covariates are allowed in the "X" matrix; categorical covariates can be handled outside of gOLS, e.g. by structured OLS.

Value

gOLS returns a list containing at least the following components: "b_est", the estimated directions for each group with its own dimension using gOLS AFTER normalization; "B", the estimated directions for each group using gOLS BEFORE normalization.

References

Liu, Y., Chiaromonte, F., and Li, B. (2015). Structured Ordinary Least Squares: a sufficient dimension reduction approach for regressions with partitioned predictors and heterogeneous units. Submitted.

Examples

data <- gen.data(n=1000, binary=FALSE) # generate data
dim(data$X) # covariate matrix of 1000 observations and 15 predictors
dim(data$y) # univariate response
groups <- c(5, 10) # two predictor groups and their numbers of predictors
dims <- c(1,1) # dimension of each predictor group
est_gOLS <- gOLS(data$X,data$y,groups,dims)
names(est_gOLS)
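
The group layout described in the Details can be illustrated with base R alone. The sketch below shows one way to recover the column indices of each predictor group from a "groups" vector, assuming that groups occupy consecutive columns of X in the stated order. The variable names group_id and col_idx are hypothetical, and this is not the package's internal code.

```r
groups <- c(5, 10)  # sizes of the two predictor groups, as in the example above

# Label each of the sum(groups) columns with its group number, then split
# the column positions by that label.
group_id <- rep(seq_along(groups), times = groups)
col_idx <- split(seq_len(sum(groups)), group_id)

col_idx[[1]]  # column positions of the first group
col_idx[[2]]  # column positions of the second group
```

With such an index list, a block like X[, col_idx[[2]]] would select the predictors of the second group under the same assumption.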

Groupwise OLS (gOLS) BIC criterion to estimate dimensions with eigen-decomposition

Description

Groupwise OLS (gOLS) BIC criterion to estimate dimensions with eigen-decomposition

Usage

gOLS.comp.d(X, y, groups)

Arguments

X

A covariate matrix of n observations and p predictors.

y

A univariate response.

groups

A vector with the number of predictors in each group.

Details

This function estimates the dimension for each predictor group using eigen-decomposition. Predictors need to be organized in groups within the "X" matrix, in the same order as saved in "groups". Only continuous covariates are allowed in the "X" matrix; categorical covariates can be handled outside of gOLS, e.g. by structured OLS.

Value

gOLS.comp.d returns a list containing at least the following components: "d", the estimated dimension (at most 1) for each predictor group; "crit", the BIC criterion from each iteration.

References

Liu, Y., Chiaromonte, F., and Li, B. (2015). Structured Ordinary Least Squares: a sufficient dimension reduction approach for regressions with partitioned predictors and heterogeneous units. Submitted.

Examples

data <- gen.data(n=1000, binary=FALSE) # generate data
dim(data$X) # covariate matrix of 1000 observations and 15 predictors
dim(data$y) # univariate response
groups <- c(5, 10) # two predictor groups and their numbers of predictors
dim_gOLS <- gOLS.comp.d(data$X,data$y,groups)
names(dim_gOLS)

Groupwise SIR (gSIR) for binary response

Description

Groupwise SIR (gSIR) for binary response

Usage

gSIR(X, Y, groups, dims)

Arguments

X

A covariate matrix of n observations and p predictors.

Y

A binary response.

groups

A vector with the number of predictors in each group.

dims

A vector with the dimension (at most 1) for each predictor group.

Details

This function estimates directions for each predictor group using gSIR. Predictors need to be organized in groups within the "X" matrix, in the same order as saved in "groups". Only continuous covariates are allowed in the "X" matrix; categorical covariates can be handled outside of gSIR, e.g. by structured SIR.

Value

gSIR returns a list containing at least the following components: "b_est", the estimated directions for each group with its own dimension using gSIR AFTER normalization; "B", the estimated directions for each group using gSIR BEFORE normalization.

References

Guo, Z., Li, L., Lu, W., and Li, B. (2014). Groupwise dimension reduction via envelope method. Journal of the American Statistical Association, accepted.

Examples

data <- gen.data(n=1000, binary=TRUE) # generate data
dim(data$X) # covariate matrix of 1000 observations and 15 predictors
length(data$y) # binary response
groups <- c(5, 10) # two predictor groups and their numbers of predictors
dims <- c(1,1) # dimension of each predictor group
est_gSIR <- gSIR(data$X,data$y,groups,dims)
names(est_gSIR)

Groupwise SIR (gSIR) BIC criterion to estimate dimensions with eigen-decomposition (binary response)

Description

Groupwise SIR (gSIR) BIC criterion to estimate dimensions with eigen-decomposition (binary response)

Usage

gSIR.comp.d(X, y, groups)

Arguments

X

A covariate matrix of n observations and p predictors.

y

A binary response.

groups

A vector with the number of predictors in each group.

Details

Value

gSIR.comp.d returns a list containing at least the following components: "d", the estimated dimension (at most 1) for each predictor group; "crit", the BIC criterion from each iteration.

References

Liu, Y. (2015). Approaches to reduce and integrate data in structured and high-dimensional regression problems in Genomics. Ph.D. Dissertation, The Pennsylvania State University, University Park, Department of Statistics.

Examples

data <- gen.data(n=1000, binary=TRUE) # generate data
dim(data$X) # covariate matrix of 1000 observations and 15 predictors
length(data$y) # univariate response
groups <- c(5, 10) # two predictor groups and their numbers of predictors
dim_gSIR <- gSIR.comp.d(data$X,data$y,groups)
names(dim_gSIR)

Simulate data

Description

Simulate data

Usage

gen.data(n, rho = 0.5, theta = 1, binary = FALSE)

Arguments

n

Sample size.

rho

Pairwise correlation between covariates.

theta

Standard deviation of the random error.

binary

If TRUE, generate binary responses; otherwise, by default, create continuous responses.

Details

This function simulates data as presented in Liu (2015).

Value

gen.data returns a list containing at least the following components: "X", a covariate matrix of n observations and p predictors; "y", a univariate response; "b.true", the actual coefficients for each predictor group.

References

Examples

data <- gen.data(n=100)
names(data)

Power of a matrix

Description

Power of a matrix

Usage

matpower(X, alpha)

Arguments

X

A p x p square matrix.

alpha

A scalar determining the order of the power.

Details

This function calculates the power of a square matrix.

Value

A p x p square matrix.

Examples

data <- gen.data(n=100)
cov.squared <- matpower(cov.x(data$X), 2)
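
One common way to raise a matrix to a real power is through its eigen-decomposition, which the base R sketch below illustrates. The helper matpower_sketch is hypothetical, and the sketch assumes a symmetric input (such as a covariance matrix) with positive eigenvalues whenever alpha is negative or fractional; the documentation above does not state how matpower itself is implemented.

```r
# Hypothetical helper: A^alpha = V diag(lambda^alpha) V' for symmetric A.
matpower_sketch <- function(A, alpha) {
  A <- (A + t(A)) / 2                      # guard against rounding asymmetry
  e <- eigen(A, symmetric = TRUE)
  e$vectors %*%
    diag(e$values^alpha, nrow = length(e$values)) %*%
    t(e$vectors)
}

S <- matrix(c(2, 1, 1, 2), 2, 2)           # small symmetric positive definite matrix
S_squared  <- matpower_sketch(S, 2)        # should agree with S %*% S
S_inv_sqrt <- matpower_sketch(S, -0.5)     # inverse square root of S
```

The inverse square root is the power typically needed when standardizing predictors to identity covariance.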

Normalize a vector

Description

Normalize a vector

Usage

norm1(v)

Arguments

v

A vector.

Details

This function normalizes any non-zero vector and returns a vector with norm equal to 1.

Value

A vector with norm 1.

Examples

data <- gen.data(n=100)
y.norm1 <- norm1(data$y)
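
A minimal base R sketch of normalization follows. It assumes the Euclidean norm, which the documentation above does not state explicitly, and the helper name normalize_sketch is hypothetical.

```r
# Hypothetical helper: scale a non-zero vector to unit Euclidean length.
normalize_sketch <- function(v) {
  len <- sqrt(sum(v^2))
  if (len == 0) stop("cannot normalize the zero vector")
  v / len
}

u <- normalize_sketch(c(3, 4))
sqrt(sum(u^2))  # length of the normalized vector
```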

Gram-Schmidt orthonormalization

Description

Gram-Schmidt orthonormalization

Usage

orthnormal(X)

Arguments

X

An n x p matrix of n observations and p predictors.

Details

This function orthonormalizes any n x p matrix.

Value

An n x p matrix of n observations and p predictors.

Examples

data <- gen.data(n=100)
x.orth <- orthnormal(data$X)
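
The Gram-Schmidt procedure can be sketched in base R as below. This is an illustrative version (the modified Gram-Schmidt variant) that assumes the columns of the input are linearly independent; the helper name gram_schmidt is hypothetical and the package's orthnormal may differ in ordering or numerical details.

```r
# Hypothetical helper: orthonormalize the columns of X, left to right.
gram_schmidt <- function(X) {
  X <- as.matrix(X)
  Q <- matrix(0, nrow(X), ncol(X))
  for (j in seq_len(ncol(X))) {
    v <- X[, j]
    # Remove the components along the columns already orthonormalized.
    for (k in seq_len(j - 1)) {
      v <- v - sum(Q[, k] * v) * Q[, k]
    }
    Q[, j] <- v / sqrt(sum(v^2))  # scale to unit length
  }
  Q
}

X <- matrix(c(1, 1, 0,
              1, 0, 1,
              0, 1, 1), 3, 3)
Q <- gram_schmidt(X)
crossprod(Q)  # close to the 3 x 3 identity matrix
```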

Structured OLS (sOLS) outer level BIC criterion to estimate dimension with eigen-decomposition

Description

Structured OLS (sOLS) outer level BIC criterion to estimate dimension with eigen-decomposition

Usage

sOLS.comp.d(X, sizes)

Arguments

X

A matrix containing directions estimated from all subpopulations.

sizes

A vector with the sample sizes of all subpopulations.

Details

This function estimates the dimension across the subpopulations using eigen-decomposition. The order of the subpopulations in the "sizes" vector should match the one in the "X" matrix. The function also returns the linearly independent directions among all subpopulations.

Value

sOLS.comp.d returns a list containing at least the following components: "d", the dimension estimated across subpopulations; "u", the "d" linearly independent directions among the columns of the matrix X.

References

Liu, Y., Chiaromonte, F., and Li, B. (2015). Structured Ordinary Least Squares: a sufficient dimension reduction approach for regressions with partitioned predictors and heterogeneous units. Submitted.

Examples

v1 <- c(1, 1, 0, 0)
v2 <- c(0, 1, 1, 0)
v3 <- c(0, 0, 1, 1)
v4 <- c(1, 1, 1, 1)
m1 <- cbind(v1, v2)
sizes1 <- c(100, 200)
sOLS.comp.d(m1, sizes1)
m2 <- cbind(v1, v2, v3)
sizes2 <- c(100, 200, 500)
sOLS.comp.d(m2, sizes2)
m3 <- cbind(v1, v3, v4)
sizes3 <- c(100, 500, 1000)
sOLS.comp.d(m3, sizes3)
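
To see why the example matrices differ in the number of linearly independent directions, the base R sketch below counts non-negligible singular values. This is a plain numerical rank check, not the size-weighted BIC criterion that sOLS.comp.d uses; the helper name numeric_rank and the tolerance 1e-8 are assumptions made for illustration.

```r
# Hypothetical helper: numerical rank from the singular values of M.
numeric_rank <- function(M, tol = 1e-8) {
  sv <- svd(M)$d
  sum(sv > tol * max(sv))
}

v1 <- c(1, 1, 0, 0)
v2 <- c(0, 1, 1, 0)
v3 <- c(0, 0, 1, 1)
v4 <- c(1, 1, 1, 1)

rank_m2 <- numeric_rank(cbind(v1, v2, v3))  # three independent columns
rank_m3 <- numeric_rank(cbind(v1, v3, v4))  # v4 equals v1 + v3
```

In the third example matrix, v4 is the sum of v1 and v3, so only two directions are linearly independent.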

Matrix standardization

Description

Matrix standardization

Usage

standmat(x)

Arguments

x

An n x p matrix of n observations and p predictors.

Details

This function standardizes a matrix, treating each row as a random vector in an iid sample. It returns an n x p matrix with column means equal to zero and an identity covariance matrix.

Value

An n x p matrix of n observations and p predictors.

Examples

data <- gen.data(n=100)
x.std <- standmat(data$X)
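
The standardization described in the Details can be sketched in base R by centering the columns and multiplying by the inverse square root of the sample covariance. The helper name standardize_sketch is hypothetical, the n - 1 denominator (via stats::cov) is an assumption, and the sketch requires a nonsingular sample covariance matrix.

```r
# Hypothetical helper: center columns, then whiten with cov^(-1/2).
standardize_sketch <- function(x) {
  x <- as.matrix(x)
  xc <- sweep(x, 2, colMeans(x))             # column means become zero
  e <- eigen(cov(xc), symmetric = TRUE)
  inv_sqrt <- e$vectors %*%
    diag(1 / sqrt(e$values), nrow = ncol(x)) %*%
    t(e$vectors)                             # symmetric inverse square root
  xc %*% inv_sqrt
}

set.seed(1)
mix <- matrix(c(1, 0.5, 0,
                0, 1, 0.5,
                0, 0, 1), 3, 3)
x <- matrix(rnorm(200 * 3), 200, 3) %*% mix  # simulated correlated columns
z <- standardize_sketch(x)
round(cov(z), 6)                             # close to the identity matrix
```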