An R package providing multiple Imputation of covariance matrices in order to perform factor analysis.

License: MIT (see LICENSE.md)

Teebusch/mifa

Lifecycle: maturing

mifa is an R package that implements multiple imputation of covariance matrices so that factor analysis can be performed on incomplete data. It works as follows:

  1. Impute missing values multiple times using Multivariate Imputation by Chained Equations (MICE) from the mice package.

  2. Combine the covariance matrices of the imputed data sets into a single covariance matrix using Rubin’s rules. [1]

  3. Use the combined covariance matrix for exploratory factor analysis.
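
The three steps above can be sketched by hand with mice and psych. This is only an illustration of the idea, not mifa's internal code: the number of imputations (5), the seed, and the variable names `items`, `covs`, and `cov_pooled` are assumptions made for the example. For a point estimate, Rubin's rules amount to averaging the per-imputation estimates, which is what the sketch does.

```r
library(mice)
library(psych)

# Step 1: impute the 25 personality items several times.
# m = 5 and seed = 1 are arbitrary choices for this sketch.
items <- bfi[, 1:25]
imp <- mice(items, m = 5, printFlag = FALSE, seed = 1)

# Step 2: compute one covariance matrix per completed data set,
# then pool them by taking the element-wise mean.
covs <- lapply(seq_len(imp$m), function(i) cov(mice::complete(imp, i)))
cov_pooled <- Reduce(`+`, covs) / length(covs)

# Step 3: exploratory factor analysis on the pooled covariance matrix.
fit_manual <- fa(cov_pooled, n.obs = nrow(items), nfactors = 5)
```

In practice mifa() wraps these steps and additionally reports confidence intervals, so the manual version is mainly useful for seeing what happens under the hood.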

mifa also provides two types of confidence intervals for the variance explained by different numbers of principal components: Fieller confidence intervals (parametric) for larger samples [2] and bootstrapped confidence intervals (nonparametric) for smaller samples. [3]
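
To make "variance explained by n principal components" concrete, here is a minimal base-R sketch. It assumes the proportion is computed from the eigenvalues of a covariance matrix; `prop_var_explained()` is a hypothetical helper written for this example and is not part of mifa, and the toy matrix is made up.

```r
# Hypothetical helper (not part of mifa): cumulative share of the total
# variance captured by the first n principal components.
prop_var_explained <- function(cov_mat, n_pc) {
  # Eigenvalues of a covariance matrix are the variances of its
  # principal components, returned in decreasing order.
  ev <- eigen(cov_mat, symmetric = TRUE, only.values = TRUE)$values
  cumsum(ev)[n_pc] / sum(ev)
}

# Toy covariance matrix whose eigenvalues are 4, 3, 2, and 1.
toy_cov <- diag(c(4, 3, 2, 1))
prop_var_explained(toy_cov, 1:3)
#> [1] 0.4 0.7 0.9
```

The confidence intervals then quantify how uncertain these proportions are, given sampling variability and the imputation of missing values.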

For more information about the method, see:

Nassiri, V., Lovik, A., Molenberghs, G., Verbeke, G. (2018). On using multiple imputation for exploratory factor analysis of incomplete data. Behavior Research Methods 50, 501–517. doi:10.3758/s13428-017-1013-4

Note: The paper was accompanied by an implementation in R, and this package emerged from it. The authors appear to have abandoned that repository, but it is still available online.

Installation

Install from CRAN with:

install.packages("mifa")

Or install the development version from GitHub with:

# install.packages("devtools")
devtools::install_github("teebusch/mifa")

Usage

Example Data

For this example we use the bfi data set from the psych package. It contains 2,800 subjects’ answers to 25 personality self-report items and 3 demographic variables (gender, education, and age). Each of the 25 personality items is meant to tap into one of the “Big 5” personality factors, as indicated by the item names (for example, A1 to A5 for Agreeableness): Openness, Conscientiousness, Agreeableness, Extraversion, and Neuroticism. Most items have missing responses. Instead of dropping the incomplete cases from the analysis, we will use mifa to impute them, and then perform a factor analysis on the imputed covariance matrix.
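
Before imputing, it can help to look at how much data is actually missing. The following is a small sketch using only base R on the bfi data; the variable name `n_missing` is an assumption made for the example.

```r
library(psych)

# Number of missing responses in each column of bfi.
n_missing <- colSums(is.na(bfi))

# Show only the columns that have at least one missing value.
n_missing[n_missing > 0]

# Share of respondents with at least one missing value, i.e. the rows
# that would be lost if incomplete cases were simply dropped.
mean(!complete.cases(bfi))
```

If the last number is large, dropping incomplete cases discards a substantial part of the sample, which is the situation mifa is meant to address.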

Imputing the Covariance Matrix

First, we use mifa() to impute the covariance matrix and get an idea of how many factors we should use. We use the cov_vars argument to tell mifa to use gender, education, and age for the imputations, but exclude them from the covariance matrix:

library(mifa)
library(psych)

mi <- mifa(
  data     = bfi,
  cov_vars = -c(gender, education, age),
  n_pc     = 2:8,
  ci       = "fieller",
  print    = FALSE
)

mi
#> Imputed covariance matrix of 25 variables
#>
#> Variable:   A1 A2 A3 A4 A5 C1 C2 C3 C4 C5 E1 E2 E3 E4 E5 N1 N2 N3 N4 N5 O1 O2 O3 O4 O5
#> N Imputed:  16 27 26 19 16 21 24 20 26 16 23 16 25  9 21 22 21 11 36 29 22  0 28 14 20
#>
#> Number of MICE imputations: 5
#> Additional variables used for imputations:
#> gender education age
#>
#> Cumulative proportion of variance explained by n principal components:
#>
#> n  prop  Fieller CI
#> 2  0.33  [0.32, 0.34]
#> 3  0.41  [0.40, 0.42]
#> 4  0.48  [0.47, 0.49]
#> 5  0.54  [0.53, 0.55]
#> 6  0.59  [0.58, 0.59]
#> 7  0.62  [0.61, 0.63]
#> 8  0.66  [0.65, 0.66]

Factor Analysis

It looks like the first 5 principal components explain more than half of the variance in the responses, so we perform a factor analysis with 5 factors, using the fa() function from the psych package. We can get the imputed covariance matrix of our data from mi$cov_combined. From there on, it’s business as usual.

fit <- fa(mi$cov_combined, n.obs = nrow(bfi), nfactors = 5)

The factor diagram shows that the five factors correspond nicely to the 5 types of questions:

fa.diagram(fit)

We can add the factor scores to the original data, in order to explore group differences. Because we need complete data to calculate factor scores, we first impute a single data set with mice:

data_imp <- mice::complete(mice::mice(bfi, 1, print = FALSE))

fct_scores <- data.frame(factor.scores(data_imp[, 1:25], fit)$scores)

data_imp <- data.frame(
  Gender        = factor(data_imp$gender),
  Extraversion  = fct_scores$MR1,
  Neuroticism   = fct_scores$MR2,
  Conscientious = fct_scores$MR3,
  Openness      = fct_scores$MR4,
  Agreeableness = fct_scores$MR5
)
levels(data_imp$Gender) <- c("Male", "Female")

Then we can visualize the group differences:

library(ggplot2)
library(tidyr)

data_imp2 <- tidyr::pivot_longer(data_imp, cols = -Gender, names_to = "factor")

ggplot(data_imp2) +
  geom_density(aes(value, linetype = Gender)) +
  facet_wrap(~ factor, nrow = 2) +
  theme(legend.position = "inside", legend.position.inside = c(.9, .1))

Further Reading

Footnotes

  1. Rubin, D. B. (2004). Multiple imputation for nonresponse in surveys. John Wiley & Sons.

  2. Fieller, E. C. (1954). Some problems in interval estimation. Journal of the Royal Statistical Society. Series B (Methodological), 175-185.

  3. Shao, J. & Sitter, R. R. (1996). Bootstrap for imputed survey data. Journal of the American Statistical Association, 91(435), 1278-1288. doi:10.1080/01621459.1996.10476997
