| Type: | Package |
| Title: | Cognitive Diagnosis Modeling |
| Version: | 8.3-14 |
| Date: | 2025-07-13 14:03:01 |
| Maintainer: | Alexander Robitzsch <robitzsch@ipn.uni-kiel.de> |
| Description: | Functions for cognitive diagnosis modeling and multidimensional item response modeling for dichotomous and polytomous item responses. This package enables the estimation of the DINA and DINO model (Junker & Sijtsma, 2001, <doi:10.1177/01466210122032064>), the multiple group (polytomous) GDINA model (de la Torre, 2011, <doi:10.1007/s11336-011-9207-7>), the multiple choice DINA model (de la Torre, 2009, <doi:10.1177/0146621608320523>), the general diagnostic model (GDM; von Davier, 2008, <doi:10.1348/000711007X193957>), the structured latent class model (SLCA; Formann, 1992, <doi:10.1080/01621459.1992.10475229>) and regularized latent class analysis (Chen, Li, Liu, & Ying, 2017, <doi:10.1007/s11336-016-9545-6>). See George, Robitzsch, Kiefer, Gross, and Uenlue (2017) <doi:10.18637/jss.v074.i02> or Robitzsch and George (2019, <doi:10.1007/978-3-030-05584-4_26>) for further details on estimation and the package structure. For tutorials on how to use the CDM package see George and Robitzsch (2015, <doi:10.20982/tqmp.11.3.p189>) as well as Ravand and Robitzsch (2015). |
| Depends: | R (≥ 3.1), mvtnorm |
| Imports: | graphics, grDevices, methods, polycor, Rcpp, stats, utils |
| Suggests: | BIFIEsurvey, lattice, MASS, miceadds, ROI, sfsmisc |
| LinkingTo: | Rcpp, RcppArmadillo |
| Enhances: | dina, GDINA, mirt, rrum |
| LazyLoad: | yes |
| LazyData: | yes |
| URL: | https://github.com/alexanderrobitzsch/CDM,https://sites.google.com/view/alexander-robitzsch/software |
| License: | GPL-2 |GPL-3 [expanded from: GPL (≥ 2)] |
| BugReports: | https://github.com/alexanderrobitzsch/CDM/issues?state=open |
| NeedsCompilation: | yes |
| Packaged: | 2025-07-13 12:08:38 UTC; sunpn563 |
| Author: | Alexander Robitzsch [aut, cre], Thomas Kiefer [aut], Ann Cathrice George [aut], Ali Uenlue [aut] |
| Repository: | CRAN |
| Date/Publication: | 2025-07-13 13:00:07 UTC |
Cognitive Diagnosis Modeling
Description
Functions for cognitive diagnosis modeling and multidimensional item response modeling for dichotomous and polytomous item responses. This package enables the estimation of the DINA and DINO model (Junker & Sijtsma, 2001, <doi:10.1177/01466210122032064>), the multiple group (polytomous) GDINA model (de la Torre, 2011, <doi:10.1007/s11336-011-9207-7>), the multiple choice DINA model (de la Torre, 2009, <doi:10.1177/0146621608320523>), the general diagnostic model (GDM; von Davier, 2008, <doi:10.1348/000711007X193957>), the structured latent class model (SLCA; Formann, 1992, <doi:10.1080/01621459.1992.10475229>) and regularized latent class analysis (Chen, Li, Liu, & Ying, 2017, <doi:10.1007/s11336-016-9545-6>). See George, Robitzsch, Kiefer, Gross, and Uenlue (2017) <doi:10.18637/jss.v074.i02> or Robitzsch and George (2019, <doi:10.1007/978-3-030-05584-4_26>) for further details on estimation and the package structure. For tutorials on how to use the CDM package see George and Robitzsch (2015, <doi:10.20982/tqmp.11.3.p189>) as well as Ravand and Robitzsch (2015).
Details
Cognitive diagnosis models (CDMs) are restricted latent class models.They represent model-based classification approaches, which aim atassigning respondents to different attribute profile groups. The latentclasses correspond to the possible attribute profiles, and theconditional item parameters model atypical response behavior in the senseof slipping and guessing errors. The core CDMs in particular differ inthe utilized condensation rule, conjunctive / non-compensatory versusdisjunctive / compensatory, where in the model structure these twotypes of response error parameters enter and what restrictions areimposed on them. The confirmatory character of CDMs is apparent in theQ-matrix, which can be seen as an operationalization of the latentconcepts of an underlying theory. The Q-matrix allows incorporatingqualitative prior knowledge and typically has as its rows the items andas the columns the attributes, with entries 1 or 0, depending on whetheran attribute is measured by an item or not, respectively.
CDMs as compared to common psychometric models (e.g., IRT) containcategorical instead of continuous latent variables. The results ofanalyses using CDMs differ from the results obtained under continuouslatent variable models. CDMs estimate in a direct manner theprobabilistic attribute profile of a respondent, that is, themultivariate vector of the conditional probabilities for possessing theindividual attributes, given her / his response pattern. Based on theseprobabilities, simplified deterministic attribute profiles can bederived, showing whether an individual attribute is essentially possessedor not by a respondent. As compared to alternative two-stepdiscretization approaches, which estimate continuous scores and discretizethe continua based on cut scores, with CDMs the classification error cangenerally be reduced.
The packageCDM implements parameter estimation procedures for theDINA and DINO model (e.g.,de la Torre &Douglas, 2004; Junker & Sijtsma, 2001; Templin &Henson, 2006; the generalized DINA model for dichotomous attributes(GDINA, de la Torre, 2011) and for polytomous attributes(pGDINA, Chen & de la Torre, 2013);the general diagnostic model (GDM, von Davier, 2008) and its extensionto the multidimensional latent class IRT model (Bartolucci, 2007),the structure latent class model (Formann, 1992),and tools for analyzing data under the models.These and related concepts are explained in detail in thebook about diagnostic measurement and CDMs byRupp, Templin and Henson (2010), and in such survey articles asDiBello, Roussos and Stout (2007) andRupp and Templin (2008).
The packageCDM is implemented based on the S3 system. It comeswith a namespace and consists of several external functions (functions thepackage exports).The package contains a utility method for the simulation of artificial data basedon a CDM model (sim.din). It also contains seven internal functions(functions not exported by the package): this areplot,print, andsummary methods for objects of the classdin (plot.din,print.din,summary.din), aprint method forobjects of the classsummary.din (print.summary.din),and three functions for checking the input format and computing intermediateinformation. The features of the packageCDM areillustrated with an accompanying real dataset and Q-matrix(fraction.subtraction.data andfraction.subtraction.qmatrix)and artificial examples (Data-sim).
See George et al. (2016) and Robitzsch and George (2019) for an overview and some computational detailsof theCDM package.
Author(s)
NA
Maintainer: Alexander Robitzsch <robitzsch@ipn.uni-kiel.de>
References
Bartolucci, F. (2007). A class of multidimensional IRT models for testingunidimensionality and clustering items.Psychometrika, 72, 141-157.
Chen, J., & de la Torre, J. (2013).A general cognitive diagnosis model for expert-defined polytomous attributes.Applied Psychological Measurement, 37, 419-437.
Chen, Y., Li, X., Liu, J., & Ying, Z. (2017).Regularized latent class analysis with application in cognitive diagnosis.Psychometrika, 82, 660-692.
de la Torre, J., & Douglas, J. (2004). Higher-order latent trait modelsfor cognitive diagnosis.Psychometrika, 69, 333–353.
de la Torre, J. (2009). A cognitive diagnosis model for cognitively basedmultiple-choice options.Applied Psychological Measurement,33, 163-183.
de la Torre, J. (2011). The generalized DINA model framework.Psychometrika, 76, 179–199.
DiBello, L. V., Roussos, L. A., & Stout, W. F. (2007). Review ofcognitively diagnostic assessment and a summary of psychometric models.In C. R. Rao and S. Sinharay (Eds.),Handbook of Statistics,Vol. 26 (pp. 979–1030). Amsterdam: Elsevier.
Formann, A. K. (1992). Linear logistic latent class analysis for polytomous data.Journal of the American Statistical Association, 87, 476-486.
George, A. C., & Robitzsch, A. (2015) Cognitive diagnosis models in R: A didactic.The Quantitative Methods for Psychology, 11, 189-205.doi:10.20982/tqmp.11.3.p189
George, A. C., Robitzsch, A., Kiefer, T., Gross, J., & Uenlue, A. (2016).The R package CDM for cognitive diagnosis models.Journal of Statistical Software, 74(2), 1-24.
Junker, B. W., & Sijtsma, K. (2001). Cognitive assessment models with fewassumptions, and connections with nonparametric item response theory.Applied Psychological Measurement, 25, 258–272.
Ravand, H., & Robitzsch, A.(2015). Cognitive diagnostic modeling using R.Practical Assessment, Research & Evaluation, 20(11).Available online: http://pareonline.net/getvn.asp?v=20&n=11
Robitzsch, A., & George, A. C. (2019). The R package CDM.In M. von Davier & Y.-S. Lee (Eds.).Handbook of diagnosticclassification models (pp. 549-572). Cham: Springer.doi:10.1007/978-3-030-05584-4_26
Rupp, A. A., & Templin, J. (2008). Unique characteristics ofdiagnostic classification models: A comprehensive review of the currentstate-of-the-art.Measurement: Interdisciplinary Research andPerspectives, 6, 219–262.
Rupp, A. A., Templin, J., & Henson, R. A. (2010).DiagnosticMeasurement: Theory, Methods, and Applications. New York: The GuilfordPress.
Templin, J., & Henson, R. (2006). Measurement ofpsychological disorders using cognitive diagnosismodels.Psychological Methods, 11, 287–305.
von Davier, M. (2008). A general diagnostic model applied tolanguage testing data.British Journalof Mathematical and Statistical Psychology, 61, 287-307.
See Also
See theGDINA package for comprehensive functions for theGDINA model.
See also theACTCD andNPCD packages for nonparametric cognitivediagnostic models.
See thedina package for estimating the DINA model with a Gibbs sampler.
Examples
#### **********************************## ** CDM 2.5-16 (2013-11-29) **## ** Cognitive Diagnostic Models **## **********************************##Utility Functions inCDM
Description
Utility functions inCDM.
Usage
## requireNamespace with package message for needed installationCDM_require_namespace(pkg)## attach internal function in a packagecdm_attach_internal_function(pack, fun)## print function in summarycdm_print_summary_data_frame(obji, from=NULL, to=NULL, digits=3, rownames_null=FALSE)## print summary callcdm_print_summary_call(object, call_name="call")## print computation timecdm_print_summary_computation_time(object, time_name="time", time_start="s1", time_end="s2")## string vector of matrix entriescdm_matrixstring( matr, string )## mvtnorm::rmvnorm with vector conversion for n=1CDM_rmvnorm(n, mean=NULL, sigma, ...)## fit univariate and multivariate normal distributioncdm_fit_normal(x, w)## fit unidimensional factor analysis by unweighted least squarescdm_fa1(Sigma, method=1, maxit=50, conv=1E-5)## another rbind.fill implementationCDM_rbind_fill( x, y )## fills a vector row-wise into a matrixcdm_matrix2( x, nrow )## fills a vector column-wise into a matrixcdm_matrix1( x, ncol )## SCAD thresholding operatorcdm_penalty_threshold_scad(beta, lambda, a=3.7)## lasso thresholding operatorcdm_penalty_threshold_lasso(val, eta )## ridge thresholding operatorcdm_penalty_threshold_ridge(beta, lambda)## elastic net threshold operatorcdm_penalty_threshold_elnet( beta, lambda, alpha )## SCAD-L2 thresholding operatorcdm_penalty_threshold_scadL2(beta, lambda, alpha, a=3.7)## truncated L1 penalty thresholding operatorcdm_penalty_threshold_tlp( beta, tau, lambda )## MCP thresholding operatorcdm_penalty_threshold_mcp(beta, lambda, a=3.7)## general thresholding operator for regularizationcdm_parameter_regularization(x, regular_type, regular_lam, regular_alpha=NULL, regular_tau=NULL )## values of penalty functioncdm_penalty_values(x, regular_type, regular_lam, regular_tau=NULL, regular_alpha=NULL)## thresholding operators regularizationcdm_parameter_regularization(x, regular_type, regular_lam, regular_alpha=NULL, regular_tau=NULL)## utility functions for P-EM accelerationcdm_pem_inits(parmlist)cdm_pem_inits_assign_parmlist(pem_pars, envir)cdm_pem_acceleration( iter, pem_parameter_index, pem_parameter_sequence, pem_pars, PEM_itermax, parmlist, ll_fct, ll_args, deviance.history=NULL )cdm_pem_acceleration_assign_output_parameters(res_ll_fct, vars, envir, update)## approximation of absolute value function and its derivativeabs_approx(x, eps=1e-05)abs_approx_D1(x, eps=1e-05)## information criteriacdm_calc_information_criteria(ic)cdm_print_summary_information_criteria(object, digits_crit=0, digits_penalty=2)## string pastingcat_paste(...)Arguments
pkg | AnR package |
pack | AnR package |
fun | AnR function |
obji | Object |
from | Integer |
to | Integer |
digits | Number of digits used for printing |
rownames_null | Logical |
call_name | Character |
time_name | Character |
time_start | Character |
time_end | Character |
matr | Matrix |
string | String |
object | Object |
n | Integer |
mean | Mean vector or matrix if separate means for cases are provided. In this case, |
sigma | Covariance matrix |
... | More arguments to be passed (or a list of arguments) |
x | Matrix or vector |
y | Matrix or vector |
w | Vector of sampling weights |
nrow | Integer |
ncol | Integer |
Sigma | Covariance matrix |
method | Method |
maxit | Maximum number of iterations |
conv | Convergence criterion |
beta | Numeric |
lambda | Regularization parameter |
alpha | Regularization parameter |
a | Parameter |
tau | Regularization parameter |
val | Numeric |
eta | Regularization parameter |
regular_type | Type of regularization |
regular_lam | Regularization parameter |
regular_tau | Regularization parameter |
regular_alpha | Regularization parameter |
parmlist | List containing parameters |
pem_pars | Vector containing parameter names |
envir | Environment |
update | Logical |
iter | Iteration number |
pem_parameter_index | List with parameter indices |
pem_parameter_sequence | List with updated parameter sequence |
PEM_itermax | Maximum number of iterations for PEM |
ll_fct | Name of log-likelihood function |
ll_args | Arguments of log-likelihood function |
deviance.history | Deviance history, a data frame. |
res_ll_fct | Result of maximized log-likelihood function |
vars | Vector containing parameter names |
eps | Numeric |
ic | List |
digits_crit | Integer |
digits_penalty | Integer |
Artificial Data: DINA and DINO
Description
Artificial data: dichotomously coded fictitious answers of 400 respondentsto 9 items assuming 3 underlying attributes.
Usage
data(sim.dina) data(sim.dino) data(sim.qmatrix)Format
Thesim.dina andsim.dino data sets include dichotomousanswers ofN=400 respondents toJ=9 items, thus they are400 \times 9 data matrices. For both data setsK=3attributes are assumed to underlie the process of responding, storedinsim.qmatrix.
Thesim.dina data set is simulated according to the DINA condensationrule, whereas thesim.dino data set is simulated according to theDINO condensation rule. The slipping errors for the items 1 to 9 in bothdata sets are0.20, 0.20, 0.20, 0.20, 0.00, 0.50, 0.50, 0.10, 0.03and the guessing errors are0.10, 0.125, 0.15, 0.175, 0.2, 0.225, 0.25, 0.275, 0.3. The attributes are assumed to be mastered with expectedprobabilities of-0.4, 0.2, 0.6, respectively. The correlation ofthe attributes is0.3 for attributes 1 and 2,0.4 forattributes 1 and 3 and0.1 for attributes 2 and 3.
Example Index
Datasetsim.dina
anova (Examples 1, 2),cdi.kli (Example 1),din (Examples 2, 4, 5),gdina (Example 1),itemfit.sx2 (Example 2),modelfit.cor.din (Example 1)
Datasetsim.dino
cdm.est.class.accuracy (Example 1),din (Example 3),gdina (Examples 2, 3, 4),
References
Rupp, A. A., Templin, J. L., & Henson, R. A. (2010)DiagnosticMeasurement: Theory, Methods, and Applications. New York: The GuilfordPress.
Information Criteria
Description
Computes several information criteria for objects which do havethelogLik (stats) S3 method(e.g.din,gdina,gdm, ...) .
Usage
IRT.IC(object)Arguments
object | Objects which do have the |
Value
A vector with deviance and several information criteria.
See Also
See alsoanova.din for model comparisons.A general method is defined inIRT.compareModels.
Examples
############################################################################## EXAMPLE 1: DINA example information criteria#############################################################################data(sim.dina, package="CDM")data(sim.qmatrix, package="CDM")#*** Model 1: DINA modelmod1 <- CDM::din( sim.dina, q.matrix=sim.qmatrix )summary(mod1)IRT.IC(mod1)Root Mean Square Deviation (RMSD) Item Fit Statistic
Description
Computed the item fit statistics root mean square deviation (RMSD),mean absolute deviation (MAD) and mean deviation (MD).See Oliveri and von Davier (2011) for details.
The RMSD statistics was denoted as the RMSEA statistic in olderpublications, seeitemfit.rmsea.
If multiple groups are defined in the model object, a weighted item fitstatistic (WRMSD; Yamamoto, Khorramdel, & von Davier, 2013;von Davier, Weeks, Chen, Allen & van der Velden, 2013) isadditionally computed.
Usage
IRT.RMSD(object)## S3 method for class 'IRT.RMSD'summary(object, file=NULL, digits=3, ...)## core computation functionIRT_RMSD_calc_rmsd( n.ik, pi.k, probs, eps=1E-30 )Arguments
object | Object for which the methods |
n.ik | Expected counts |
pi.k | Probabilities trait distribution |
probs | Item response probabilities |
eps | Numerical constant avoiding division by zero |
digits | Number of digits used for rounding |
file | Optional file name for a file in which |
... | Optional parameters to be passed. |
Details
The RMSD and MD statistics are in operational use in PISA studiessince PISA 2015. These fit statistics can also be used for investigatinguniform and nonuniform differential item functioning.
Value
List with entries
RMSD | Item-wise and group-wise RMSD statistic |
RMSD_bc | Item-wise and group-wise RMSD statistic with analyticalbias correction |
MAD | Item-wise and group-wise MAD statistic |
MD | Item-wise and group-wise MD statistic |
chisquare_stat | Item-wise and group-wise |
... | Further values |
References
Oliveri, M. E., & von Davier, M. (2011).Investigation of model fit and score scale comparability ininternational assessments.Psychological Test and Assessment Modeling, 53, 315-333.
von Davier, M., Weeks, J., Chen, H., Allen, J., & van der Velden, R. (2013).Creating simple and complex derived variables and validation of backgroundquestionnaire data.In OECD (Eds.).Technical Report of the Survey of Adults Skills (PIAAC)(Ch. 20). Paris: OECD.
Yamamoto, K., Khorramdel, L., & von Davier, M. (2013).Scaling PIAAC cognitive data.In OECD (Eds.).Technical Report of the Survey of Adults Skills (PIAAC)(Ch. 17). Paris: OECD.
See Also
Examples
## Not run: ############################################################################## EXAMPLE 1: data.read | 1PL model in TAM#############################################################################data(data.read, package="sirt")dat <- data.read#*** Model 1: 1PL modelmod1 <- TAM::tam.mml( resp=dat )summary(mod1)# item fit statisticsimod1 <- CDM::IRT.RMSD(mod1)summary(imod1)############################################################################## EXAMPLE 2: data.math| RMSD and MD statistic for assessing DIF#############################################################################data(data.math, package="sirt")dat <- data.math$dataitems <- grep("M[A-Z]", colnames(dat), value=TRUE )#-- fit multiple group Rasch modelmod <- TAM::tam.mml( dat[,items], group=dat$female )summary(mod)#-- fit statisticsrmod <- CDM::IRT.RMSD(mod)summary(rmod)############################################################################## EXAMPLE 3: RMSD statistic DINA model#############################################################################data(sim.dina)data(sim.qmatrix)dat <- sim.dinaQ <- sim.qmatrix#-- fit DINA modelmod1 <- CDM::gdina( dat, q.matrix=Q, rule="DINA" )summary(mod1)#-- compute RMSD fit statisticrmod1 <- CDM::IRT.RMSD(mod1)summary(rmod1)## End(Not run)Helper Function for Conducting Likelihood Ratio Tests
Description
This is a helper function for conducting likelihood ratio testsand can be generally used for objects for which thelogLik method is defined.
Usage
IRT.anova(object, ...)Arguments
object | Object for which the |
... | A further object to be passed |
See Also
See alsoIRT.compareModels for model comparisonsof several models.
See also asanova.din.
Individual Classification for Fitted Models
Description
Computes individual classifications based on a fitted model.
Usage
IRT.classify(object, type="MLE")Arguments
object | Fitted model for which methods |
type | Type of classification: |
Value
List with entries
class_theta | Individual classification |
class_index | Class index of individual classification |
class_maxval | Maximum value corresponding to individual classification |
See Also
SeeIRT.factor.scores for similar functionality.
Examples
## Not run: ############################################################################## EXAMPLE 1: Individual classification data.ecpe#############################################################################data(data.ecpe, package="CDM")dat <- data.ecpe$dat[,-1]Q <- data.ecpe$q.matrix#** estimate GDINA modelmod <- CDM::gdina(dat, q.matrix=Q)summary(mod)#** classify individualscmod <- CDM::IRT.classify(mod)str(cmod)## End(Not run)Comparisons of Several Models
Description
Performs model comparisons based on information criteria and likelihoodratio test. This function allows all objects for which thelogLik (stats) S3 method is defined.The output ofIRT.modelfit can also be used asinput for this function.
Usage
IRT.compareModels(object, ...)## S3 method for class 'IRT.compareModels'summary(object, extended=TRUE, ...)Arguments
object | Object |
extended | Optional logical indicating whether all or or onlya subset of fit statistics should be printed. |
... | Further objects to be passed. |
Value
A list with following entries
IC | Data frame with information criteria |
LRtest | Data frame with all (useful) pairwiselikelihood ratio tests |
See Also
The function is based onIRT.IC.
For comparing two models seeanova.din.
For computing absolute model fit seeIRT.modelfit.
Examples
## Not run: ############################################################################## EXAMPLE 1: Model comparison sim.dina dataset#############################################################################data(sim.dina, package="CDM")data(sim.qmatrix, package="CDM")dat <- sim.dinaq.matrix <- sim.qmatrix#*** Model 0: DINA model with equal guessing and slipping parametersmod0 <- CDM::din( dat, q.matrix, guess.equal=TRUE, slip.equal=TRUE )summary(mod0)#*** Model 1: DINA modelmod1 <- CDM::din( dat, q.matrix )summary(mod1)#*** Model 2: DINO modelmod2 <- CDM::din( dat, q.matrix, rule="DINO")summary(mod2)#*** Model 3: Additive GDINA modelmod3 <- CDM::gdina( dat, q.matrix, rule="ACDM")summary(mod3)#*** Model 4: GDINA modelmod4 <- CDM::gdina( dat, q.matrix, rule="GDINA")summary(mod4)# model comparisonsres <- CDM::IRT.compareModels( mod0, mod1, mod2, mod3, mod4 )res ## > res ## $IC ## Model loglike Deviance Npars Nobs AIC BIC AIC3 AICc CAIC ## 1 mod0 -2176.482 4352.963 9 400 4370.963 4406.886 4379.963 4371.425 4415.886 ## 2 mod1 -2042.378 4084.756 25 400 4134.756 4234.543 4159.756 4138.232 4259.543 ## 3 mod2 -2086.805 4173.610 25 400 4223.610 4323.396 4248.610 4227.086 4348.396 ## 4 mod3 -2048.233 4096.466 32 400 4160.466 4288.193 4192.466 4166.221 4320.193 ## 5 mod4 -2026.633 4053.266 41 400 4135.266 4298.917 4176.266 4144.887 4339.917 ### -> The DINA model (mod1) performed best in terms of AIC. ## $LRtest ## Model1 Model2 Chi2 df p ## 1 mod0 mod1 268.20713 16 0.000000e+00 ## 2 mod0 mod2 179.35362 16 0.000000e+00 ## 3 mod0 mod3 256.49745 23 0.000000e+00 ## 4 mod0 mod4 299.69671 32 0.000000e+00 ## 5 mod1 mod3 -11.70967 7 1.000000e+00 ## 6 mod1 mod4 31.48959 16 1.164415e-02 ## 7 mod2 mod3 77.14383 7 5.262457e-14 ## 8 mod2 mod4 120.34309 16 0.000000e+00 ## 9 mod3 mod4 43.19926 9 1.981445e-06 ### -> The GDINA model (mod4) was superior to the other models in terms# of the likelihood ratio test.# get an overview with summarysummary(res)summary(res,extended=FALSE)#*******************# applying model comparison for objects of class IRT.modelfit# compute model fit statisticsfmod0 <- CDM::IRT.modelfit(mod0)fmod1 <- CDM::IRT.modelfit(mod1)fmod4 <- CDM::IRT.modelfit(mod4)# model comparisonres <- CDM::IRT.compareModels( fmod0, fmod1, fmod4 )res ## $IC ## Model loglike Deviance Npars Nobs AIC BIC AIC3 ## mod0 mod0 -2176.482 4352.963 9 400 4370.963 4406.886 4379.963 ## mod1 mod1 -2042.378 4084.756 25 400 4134.756 4234.543 4159.756 ## mod4 mod4 -2026.633 4053.266 41 400 4135.266 4298.917 4176.266 ## AICc CAIC maxX2 p_maxX2 MADcor SRMSR ## mod0 4371.425 4415.886 118.172707 0.0000000 0.09172287 0.10941300 ## mod1 4138.232 4259.543 8.728248 0.1127943 0.03025354 0.03979948 ## mod4 4144.887 4339.917 2.397241 1.0000000 0.02284029 0.02989669 ## X100.MADRESIDCOV MADQ3 MADaQ3 ## mod0 1.9749936 0.08840892 0.08353917 ## mod1 0.6713952 0.06184332 0.05923058 ## mod4 0.5148707 0.07477337 0.07145600 ## ## $LRtest ## Model1 Model2 Chi2 df p ## 1 mod0 mod1 268.20713 16 0.00000000 ## 2 mod0 mod4 299.69671 32 0.00000000 ## 3 mod1 mod4 31.48959 16 0.01164415## End(Not run)S3 Method for Extracting Used Item Response Dataset
Description
This S3 method extracts the used dataset with item responses.
Usage
IRT.data(object, ...)## S3 method for class 'din'IRT.data(object, ...)## S3 method for class 'gdina'IRT.data(object, ...)## S3 method for class 'gdm'IRT.data(object, ...)## S3 method for class 'mcdina'IRT.data(object, ...)## S3 method for class 'reglca'IRT.data(object, ...)## S3 method for class 'slca'IRT.data(object, ...)Arguments
object | |
... | More arguments to be passed. |
Value
A matrix (or data frame) with item responses and group identifier andweights vector as attributes.
Examples
## Not run: ############################################################################## EXAMPLE 1: Several models for sim.dina data#############################################################################data(sim.dina, package="CDM")data(sim.qmatrix, package="CDM")dat <- sim.dinaq.matrix <- sim.qmatrix#--- Model 1: GDINA modelmod1 <- CDM::gdina( data=dat, q.matrix=q.matrix)summary(mod1)dmod1 <- CDM::IRT.data(mod1)str(dmod1)#--- Model 2: DINA modelmod2 <- CDM::din( data=dat, q.matrix=q.matrix)summary(mod2)dmod2 <- CDM::IRT.data(mod2)#--- Model 3: Rasch model with gdm functionmod3 <- CDM::gdm( data=dat, irtmodel="1PL", theta.k=seq(-4,4,length=11), centered.latent=TRUE )summary(mod3)dmod3 <- CDM::IRT.data(mod3)#--- Model 4: Latent class model with two classesdat <- sim.dinaI <- ncol(dat)# define design matricesTP <- 2 # two classes# The idea is that latent classes refer to two different "dimensions".# Items load on latent class indicators 1 and 2, see below.Xdes <- array(0, dim=c(I,2,2,2*I) )items <- colnames(dat)dimnames(Xdes)[[4]] <- c(paste0( colnames(dat), "Class", 1), paste0( colnames(dat), "Class", 2) ) # items, categories, classes, parameters# probabilities for correct solutionfor (ii in 1:I){ Xdes[ ii, 2, 1, ii ] <- 1 # probabilities class 1 Xdes[ ii, 2, 2, ii+I ] <- 1 # probabilities class 2}# estimate modelmod4 <- CDM::slca( dat, Xdes=Xdes)summary(mod4)dmod4 <- CDM::IRT.data(mod4)## End(Not run)S3 Method for Extracting Expected Counts
Description
This S3 method extracts expected counts from model output.
Usage
IRT.expectedCounts(object, ...)## S3 method for class 'din'IRT.expectedCounts(object, ...)## S3 method for class 'gdina'IRT.expectedCounts(object, ...)## S3 method for class 'gdm'IRT.expectedCounts(object, ...)## S3 method for class 'mcdina'IRT.expectedCounts(object, ...)## S3 method for class 'slca'IRT.expectedCounts(object, ...)## S3 method for class 'reglca'IRT.expectedCounts(object, ...)Arguments
object | |
... | More arguments to be passed. |
Value
An array with expected counts. The dimensions are items,categories, latent classes and groups.
Examples
## Not run: ############################################################################## EXAMPLE 1: Expected counts gdm function#############################################################################data(data.fraction1, package="CDM")dat <- data.fraction1$datatheta.k <- seq( -6, 6, len=11 ) # discretized ability#--- Model 1: Rasch modelmod1 <- CDM::gdm( dat, irtmodel="1PL", theta.k=theta.k, skillspace="normal", centered.latent=TRUE )emod1 <- CDM::IRT.expectedCounts(mod1)str(emod1)############################################################################## EXAMPLE 2: Expected counts gdina function#############################################################################data(sim.dina, package="CDM")data(sim.qmatrix, package="CDM")#--- Model 1: estimation of the GDINA modelmod1 <- CDM::gdina( data=sim.dina, q.matrix=sim.qmatrix)summary(mod1)emod1 <- CDM::IRT.expectedCounts(mod1)str(emod1)#--- Model 2: GDINA model with two groupsmod2 <- CDM::gdina( data=CDM::sim.dina, q.matrix=CDM::sim.qmatrix, group=rep(1:2, each=200) )summary(mod2)emod2 <- CDM::IRT.expectedCounts( mod2 )str(emod2)## End(Not run)S3 Methods for Extracting Factor Scores (Person Classifications)
Description
This S3 method extracts factor scores or skill classifications.
Usage
IRT.factor.scores(object, ...)## S3 method for class 'din'IRT.factor.scores(object, type="MLE", ...)## S3 method for class 'gdina'IRT.factor.scores(object, type="MLE", ...)## S3 method for class 'mcdina'IRT.factor.scores(object, type="MLE", ...)## S3 method for class 'gdm'IRT.factor.scores(object, type="EAP", ...)## S3 method for class 'slca'IRT.factor.scores(object, type="MLE", ...)Arguments
object | |
type | Type of estimated factor score. This can be |
... | More arguments to be passed. |
Value
A matrix or a vector with classified scores.
See Also
For extracting the individual likelihood or the individual posterior seeIRT.likelihood orIRT.posterior.
Examples
############################################################################## EXAMPLE 1: Extracting factor scores in the DINA model#############################################################################data(sim.dina, package="CDM")data(sim.qmatrix, package="CDM")# estimate DINA modelmod1 <- CDM::din( sim.dina, q.matrix=sim.qmatrix)summary(mod1)# MLEfsc1a <- CDM::IRT.factor.scores(mod1)# MAPfsc1b <- CDM::IRT.factor.scores(mod1, type="MAP")# EAPfsc1c <- CDM::IRT.factor.scores(mod1, type="EAP")# compare classification for skill 1stats::xtabs( ~ fsc1a[,1] + fsc1b[,1] )graphics::boxplot( fsc1c[,1] ~ fsc1a[,1] )S3 Method for Computing Observed and Expected Frequencies of Univariate andBivariate Marginals
Description
This S3 method computes observed and expected frequencies for univariate andbivariate distributions.
Usage
IRT.frequencies(object, ...)IRT_frequencies_default(data, post, probs, weights=NULL)IRT_frequencies_wrapper(object, ...)## S3 method for class 'din'IRT.frequencies(object, ...)## S3 method for class 'gdina'IRT.frequencies(object, ...)## S3 method for class 'mcdina'IRT.frequencies(object, ...)## S3 method for class 'gdm'IRT.frequencies(object, ...)## S3 method for class 'slca'IRT.frequencies(object, ...)Arguments
object | |
... | More arguments to be passed. |
data | Item response data as extracted by |
post | Individual posterior distribution as extracted by |
probs | Individual posterior distribution as extracted by |
weights | Optional vector of weights as included as the attribute |
Value
List with following entries
uni_obs | Univariate observed distribution |
uni_exp | Univariate expected distribution |
M_obs | Univariate observed means |
M_exp | Univariate expected means |
SD_obs | Univariate observed standard deviations |
SD_exp | Univariate expected standard deviations |
biv_obs | Bivariate observed frequencies |
biv_exp | Bivariate expected frequencies |
biv_N | Bivariate sample size |
cov_obs | Observed covariances |
cov_cor | Expected covariances |
cor_obs | Observed correlations |
cor_exp | Expected correlations |
chisq | Chi square statistic of local independence |
Examples
## Not run: ############################################################################## EXAMPLE 1: Usage IRT.frequencies#############################################################################data(sim.dina, package="CDM")data(sim.qmatrix, package="CDM")# estimate GDINA modelmod1 <- CDM::gdina( data=sim.dina, q.matrix=sim.qmatrix)summary(mod1)# direct usage of IRT.frequenciesfres1 <- CDM::IRT.frequencies(mod1)# use of the default function with input datadata <- CDM::IRT.data(object)post <- CDM::IRT.posterior(object)probs <- CDM::IRT.irfprob(object)fres2 <- CDM::IRT_frequencies_default(data=data, post=post, probs=probs)## End(Not run)S3 Methods for Extracting Item Response Functions
Description
This S3 method extracts item response functions evaluatedat a grid of abilities (skills). Item response functions canbe plotted using theIRT.irfprobPlot function.
Usage
IRT.irfprob(object, ...)## S3 method for class 'din'IRT.irfprob(object, ...)## S3 method for class 'gdina'IRT.irfprob(object, ...)## S3 method for class 'gdm'IRT.irfprob(object, ...)## S3 method for class 'mcdina'IRT.irfprob(object, ...)## S3 method for class 'reglca'IRT.irfprob(object, ...)## S3 method for class 'slca'IRT.irfprob(object, ...)Arguments
object | |
... | More arguments to be passed. |
Value
An array with item response probabilities (items\timescategories\times skill classes [\times group]) and attributes
theta | Uni- or multidimensional skill space (theta grid initem response models). |
prob.theta | Probability distribution of |
skillspace | Design matrix and estimated parameters forskill space distribution (only for |
G | Number of groups |
See Also
Plot functions for item response curves:IRT.irfprobPlot.
For extracting the individual likelihood or posterior seeIRT.likelihood orIRT.posterior.
Examples
## Not run: ############################################################################## EXAMPLE 1: Extracting item response functions mcdina model#############################################################################data(data.cdm02, package="CDM")dat <- data.cdm02$dataq.matrix <- data.cdm02$q.matrix#-- estimate modelmod1 <- CDM::mcdina( dat, q.matrix=q.matrix)#-- extract item response functionsprmod1 <- CDM::IRT.irfprob(mod1)str(prmod1)## End(Not run)Plot Item Response Functions
Description
This function plots item response functions for fitteditem response models for which theIRT.irfprobmethod is defined.
Usage
IRT.irfprobPlot( object, items=NULL, min.theta=-4, max.theta=4, cumul=FALSE, smooth=TRUE, ask=TRUE, n.theta=40, package="lattice",... )Arguments
object | Fitted item response model for which the |
items | Vector of indices of selected items. |
min.theta | Minimum theta to be displayed. |
max.theta | Maximum theta to be displayed. |
cumul | Optional logical indicating whether cumulateditem response functions |
smooth | Optional logical indicating whether item responsefunctions should be smoothed for plotting. |
ask | Logical for asking for a new plot. |
n.theta | Number of theta points if |
package | String indicating which package should be used for plottingthe item response curves. Options are |
... | More arguments to be passed for the plot inlattice. |
Examples
## Not run: ############################################################################## EXAMPLE 1: Plot item response functions from a unidimensional model#############################################################################data(data.Students, package="CDM")dat <- data.Studentsresp <- dat[, paste0("sc",1:4) ]resp[ paste(resp[,1])==3,1] <- 2psych::describe(resp)#--- Model 1: PCM in CDM::gdmtheta.k <- seq( -5, 5, len=21 )mod1 <- CDM::gdm( dat=resp, irtmodel="1PL", theta.k=theta.k, skillspace="normal", centered.latent=TRUE)summary(mod1)# plotIRT.irfprobPlot( mod1 )# plot in graphics package (which comes with R base version)IRT.irfprobPlot( mod1, package="graphics")# plot first and third item and do not smooth discretized item response# functions in IRT.irfprobIRT.irfprobPlot( mod1, items=c(1,3), smooth=FALSE )# cumulated IRFIRT.irfprobPlot( mod1, cumul=TRUE )############################################################################## EXAMPLE 2: Fitted mutidimensional model with gdm#############################################################################dat <- CDM::data.fraction2$dataQmatrix <- CDM::data.fraction2$q.matrix3# Model 1: 3-dimensional Rasch Model (normal distribution)theta.k <- seq( -4, 4, len=11 ) # discretized abilitymod1 <- CDM::gdm( dat, irtmodel="1PL", theta.k=theta.k, Qmatrix=Qmatrix, centered.latent=TRUE, maxiter=10 )summary(mod1)# unsmoothed curvesIRT.irfprobPlot(mod1, smooth=FALSE)# smoothed curvesIRT.irfprobPlot(mod1)## End(Not run)S3 Methods for Computing Item Fit
Description
This S3 method computes some selected item fit statistic.
Usage
IRT.itemfit(object, ...)## S3 method for class 'din'IRT.itemfit(object, method="RMSEA", ...)## S3 method for class 'gdina'IRT.itemfit(object, method="RMSEA", ...)## S3 method for class 'gdm'IRT.itemfit(object, method="RMSEA", ...)## S3 method for class 'reglca'IRT.itemfit(object, method="RMSEA", ...)## S3 method for class 'slca'IRT.itemfit(object, method="RMSEA", ...)Arguments
object | |
method | Method for computing item fit statistic. Until now,only |
... | More arguments to be passed. |
Value
Vector or data frame with item fit statistics.
See Also
For extracting the individual likelihood or posterior seeIRT.likelihood orIRT.posterior.
Examples
## Not run: ############################################################################## EXAMPLE 1: DINA model item fit#############################################################################data(sim.dina, package="CDM")data(sim.qmatrix, package="CDM")# estimate modelmod1 <- CDM::din( sim.dina, q.matrix=sim.qmatrix)# compute item fitIRT.itemfit( mod1 )## End(Not run)Jackknifing an Item Response Model
Description
This function performs a Jackknife procedure for estimatingstandard errors for an item response model. The replicationdesign must be defined byIRT.repDesign.Model fit is also assessed via Jackknife.
Statistical inference for derived parameters is performedbyIRT.derivedParameters with a fitted object ofclassIRT.jackknife and a list with defining formulas.
Usage
IRT.jackknife(object,repDesign, ... )IRT.derivedParameters(jkobject, derived.parameters )## S3 method for class 'gdina'IRT.jackknife(object, repDesign, ...)## S3 method for class 'IRT.jackknife'coef(object, bias.corr=FALSE, ...)## S3 method for class 'IRT.jackknife'vcov(object, ...)Arguments
object | Objects for which S3 method |
repDesign | Replication design generated by |
jkobject | Object of class |
derived.parameters | List with defined derived parameters(see Example 2, Model 2). |
bias.corr | Optional logical indicating whether a bias correctionshould be employed. |
... | Further arguments to be passed. |
Value
List with following entries
jpartable | Parameter table with Jackknife estimates |
parsM | Matrix with replicated statistics |
vcov | Variance covariance matrix of parameters |
Examples
## Not run: library(BIFIEsurvey)############################################################################## EXAMPLE 1: Multiple group DINA model with TIMSS data | Cluster sample#############################################################################data(data.timss11.G4.AUT.part, package="CDM")dat <- data.timss11.G4.AUT.part$dataq.matrix <- data.timss11.G4.AUT.part$q.matrix2# extract itemsitems <- paste(q.matrix$item)# generate replicate designrdes <- CDM::IRT.repDesign( data=dat, wgt="TOTWGT", jktype="JK_TIMSS", jkzone="JKCZONE", jkrep="JKCREP" )#--- Model 1: fit multiple group GDINA modelmod1 <- CDM::gdina( dat[,items], q.matrix=q.matrix[,-1], weights=dat$TOTWGT, group=dat$female +1 )# jackknife Model 1jmod1 <- CDM::IRT.jackknife( object=mod1, repDesign=rdes )summary(jmod1)coef(jmod1)vcov(jmod1)############################################################################## EXAMPLE 2: DINA model | Simple random sampling#############################################################################data(sim.dina, package="CDM")data(sim.qmatrix, package="CDM")dat <- sim.dinaq.matrix <- sim.qmatrix# generate replicate design with 50 jackknife zones (50 random groups)rdes <- CDM::IRT.repDesign( data=dat, jktype="JK_RANDOM", ngr=50 )#--- Model 1: DINA modelmod1 <- CDM::gdina( dat, q.matrix=q.matrix, rule="DINA")summary(mod1)# jackknife DINA modeljmod1 <- CDM::IRT.jackknife( object=mod1, repDesign=rdes )summary(jmod1)#--- Model 2: DINO modelmod2 <- CDM::gdina( dat, q.matrix=q.matrix, rule="DINO")summary(mod2)# jackknife DINA modeljmod2 <- CDM::IRT.jackknife( object=mod2, repDesign=rdes )summary(jmod2)IRT.compareModels( mod1, mod2 )# statistical inference for derived parametersderived.parameters <- list( "skill1"=~ 0 + I(prob_skillV1_lev1_group1), "skilldiff12"=~ 0 + I( prob_skillV2_lev1_group1 - prob_skillV1_lev1_group1 ), "skilldiff13"=~ 0 + I( prob_skillV3_lev1_group1 - prob_skillV1_lev1_group1 ) )jmod2a <- CDM::IRT.derivedParameters( jmod2, derived.parameters=derived.parameters )summary(jmod2a)coef(jmod2a)## End(Not run)S3 Methods for Extracting of the Individual Likelihood and the Individual Posterior
Description
Functions for extracting the individual likelihood andindividual posterior distribution.
Usage
IRT.likelihood(object, ...)IRT.posterior(object, ...)## S3 method for class 'din'IRT.likelihood(object, ...)## S3 method for class 'din'IRT.posterior(object, ...)## S3 method for class 'gdina'IRT.likelihood(object, ...)## S3 method for class 'gdina'IRT.posterior(object, ...)## S3 method for class 'gdm'IRT.likelihood(object, ...)## S3 method for class 'gdm'IRT.posterior(object, ...)## S3 method for class 'mcdina'IRT.likelihood(object, ...)## S3 method for class 'mcdina'IRT.posterior(object, ...)## S3 method for class 'reglca'IRT.likelihood(object, ...)## S3 method for class 'reglca'IRT.posterior(object, ...)## S3 method for class 'slca'IRT.likelihood(object, ...)## S3 method for class 'slca'IRT.posterior(object, ...)Arguments
object | |
... | More arguments to be passed. |
Value
For both functionsIRT.likelihood andIRT.posterior,it is a matrix with attributes
theta | Uni- or multidimensional skill space (theta grid initem response models). |
prob.theta | Probability distribution of |
skillspace | Design matrix and estimated parameters forskill space distribution (only for |
G | Number of groups |
See Also
GDINA::indlogLik,GDINA::indlogPost
Examples
############################################################################## EXAMPLE 1: Extracting likelihood and posterior from a DINA model#############################################################################data(sim.dina, package="CDM")data(sim.qmatrix, package="CDM")#*** estimate modelmod1 <- CDM::din( sim.dina, q.matrix=sim.qmatrix, rule="DINA")#*** extract likelihoodlikemod1 <- CDM::IRT.likelihood(mod1)str(likemod1)# extract thetaattr(likemod1, "theta" )#*** extract posteriorpomod1 <- CDM::IRT.posterior( mod1 )str(pomod1)S3 Method for Computation of Marginal Posterior Distribution
Description
Computes marginal posterior distributions for fitted models in theCDM package.
Usage
IRT.marginal_posterior(object, dim, remove_zeroprobs=TRUE, ...)## S3 method for class 'din'IRT.marginal_posterior(object, dim, remove_zeroprobs=TRUE, ...)## S3 method for class 'gdina'IRT.marginal_posterior(object, dim, remove_zeroprobs=TRUE, ...)## S3 method for class 'mcdina'IRT.marginal_posterior(object, dim, remove_zeroprobs=TRUE, ...)Arguments
object | |
dim | Numeric or character vector indicating dimensions of posterior distributionwhich should be marginalized |
remove_zeroprobs | Logical indicating whether classes with zero probabilities shouldbe removed |
... | Further arguments to be passed |
Value
List with entries
marg_post | Marginal posterior distribution |
map | MAP estimate (individual classification) |
theta | Skill classes |
See Also
Examples
## Not run: ############################################################################## EXAMPLE 1: Dataset with three hierarchical skills############################################################################## simulated data with hierarchical skills:# skill A with 4 levels, skill B with 2 levels and skill C with 3 levelsdata(data.cdm10, package="CDM"")dat <- data.cdm10$dataQ <- data.cdm10$q.matrixprint(Q)# define hierarchical skill structureB <- "A1 > A2 > A3 C1 > C2"skill_space <- CDM::skillspace.hierarchy(B=B, skill.names=colnames(Q))zeroprob.skillclasses <- skill_space$zeroprob.skillclasses# estimate DINA modelmod1 <- CDM::gdina( dat, q.matrix=Q, zeroprob.skillclasses=zeroprob.skillclasses, rule="DINA")summary(mod1)# classification for skill Ares <- CDM::IRT.marginal_posterior(object=mod1, dim=c("A1","A2","A3") )table(res$map)# classification for skill Bres <- CDM::IRT.marginal_posterior(object=mod1, dim=c("B") )table(res$map)# classification for skill Cres <- CDM::IRT.marginal_posterior(object=mod1, dim=c("C1","C2") )table(res$map)## End(Not run)S3 Methods for Assessing Model Fit
Description
This S3 method assesses global (absolute) model fit usingthe methods described inmodelfit.cor.din.
Usage
IRT.modelfit(object, ...)## S3 method for class 'din'IRT.modelfit(object, ...)## S3 method for class 'gdina'IRT.modelfit(object, ...)## S3 method for class 'gdm'IRT.modelfit(object, ...)## S3 method for class 'IRT.modelfit.din'summary(object, ...)## S3 method for class 'IRT.modelfit.gdina'summary(object, ...)## S3 method for class 'IRT.modelfit.gdm'summary(object, ...)Arguments
object | |
... | More arguments to be passed. |
Value
See output ofmodelfit.cor.din.
See Also
For extracting the individual likelihood or posterior seeIRT.likelihood orIRT.posterior.
The model fit of objects of classgdm can be obtainedby using theTAM::tam.modelfit.IRT function in theTAM package.
Examples
## Not run: ############################################################################## EXAMPLE 1: Absolute model fit#############################################################################data(sim.dina, package="CDM")data(sim.qmatrix, package="CDM")#*** Model 1: DINA model for DINA simulated datamod1 <- CDM::din( sim.dina, q.matrix=sim.qmatrix, rule="DINA" )fmod1 <- CDM::IRT.modelfit( mod1 )summary(fmod1) ## Test of Global Model Fit ## type value p ## 1 max(X2) 8.728 0.113 ## 2 abs(fcor) 0.143 0.080 ## ## Fit Statistics ## est ## MADcor 0.030 ## SRMSR 0.040 ## 100*MADRESIDCOV 0.671 ## MADQ3 0.062 ## MADaQ3 0.059#*** Model 2: GDINA modelmod2 <- CDM::gdina( sim.dina, q.matrix=sim.qmatrix, rule="GDINA" )fmod2 <- CDM::IRT.modelfit( mod2 )summary(fmod2) ## Test of Global Model Fit ## type value p ## 1 max(X2) 2.397 1 ## 2 abs(fcor) 0.078 1 ## ## Fit Statistics ## est ## MADcor 0.023 ## SRMSR 0.030 ## 100*MADRESIDCOV 0.515 ## MADQ3 0.075 ## MADaQ3 0.071## End(Not run)S3 Method for Extracting a Parameter Table
Description
S3 method which extracts a parameter table.
Usage
IRT.parameterTable(object, ...)Arguments
object | Object of model classes |
... | More arguments to be passed. |
Value
A parameter table
Generation of a Replicate Design forIRT.jackknife
Description
This function generates a Jackknife replicate design which isnecessary to use theIRT.jackknife function. The functionis a wrapper toBIFIE.data.jack in theBIFIEsurvey package.
Usage
IRT.repDesign(data, wgt=NULL, jktype="JK_TIMSS", jkzone=NULL, jkrep=NULL, jkfac=NULL, fayfac=1, wgtrep="W_FSTR", ngr=100, Nboot=200, seed=.Random.seed)Arguments
data | Dataset which must contain weights and item responses |
wgt | Vector with sample weights |
jktype | Type of jackknife procedure for creating the BIFIE.data object. |
jkzone | Variable name for jackknife zones.If |
jkrep | Variable name containing Jackknife replicates |
jkfac | Factor for multiplying jackknife replicate weights.If |
fayfac | Fay factor. For Jackknife, the default is 1. For a Bootstrap with |
wgtrep | Already available replicate design |
ngr | Number of groups |
Nboot | Number of bootstrap samples |
seed | Random seed |
Value
A list with following entries
wgt | Vector with weights |
wgtrep | Matrix containing the replicate design |
fayfac | Fay factor needed for Jackknife calculations |
See Also
SeeIRT.jackknife for further examples.
See theBIFIE.data.jack function in theBIFIEsurvey package.
Examples
## Not run: # load the BIFIEsurvey packagelibrary(BIFIEsurvey)############################################################################## EXAMPLE 1: Design with Jackknife replicate weights in TIMSS#############################################################################data(data.timss11.G4.AUT, package="CDM")dat <- CDM::data.timss11.G4.AUT$data# generate designrdes <- CDM::IRT.repDesign( data=dat, wgt="TOTWGT", jktype="JK_TIMSS", jkzone="JKCZONE", jkrep="JKCREP" )str(rdes)############################################################################## EXAMPLE 2: Bootstrap resampling#############################################################################data(sim.qmatrix, package="CDM")q.matrix <- CDM::sim.qmatrix# simulate data according to the DINA modeldat <- CDM::sim.din(N=2000, q.matrix=q.matrix )$dat# bootstrap with 300 random samplesrdes <- CDM::IRT.repDesign( data=dat, jktype="BOOT", Nboot=300 )## End(Not run)Wald Test for a Linear Hypothesis
Description
Computes a Wald Test for a parameter\bold{\theta}with respect to a linear hypothesis \bold{R} \bold{\theta}=\bold{c}.
Usage
WaldTest( delta, vcov, R, nobs, cvec=NULL, eps=1E-10 )Arguments
delta | Estimated parameter |
vcov | Estimated covariance matrix |
R | Hypothesis matrix |
nobs | Number of observations |
cvec | Hypothesis vector |
eps | Numerical value is added as ridge parameter ofthe covariance matrix |
Value
A vector containing the\chi^2 statistic (X2),degrees of freedom (df),p value (p) and RMSEA statistic (RMSEA).
Likelihood Ratio Test for Model Comparisons
Description
This function compares two models estimated withdin,gdinaorgdm using a likelihood ratio test.
Usage
## S3 method for class 'din'anova(object,...)## S3 method for class 'gdina'anova(object,...)## S3 method for class 'gdm'anova(object,...)## S3 method for class 'mcdina'anova(object,...)## S3 method for class 'reglca'anova(object,...)## S3 method for class 'slca'anova(object,...)Arguments
object | Two objects of class |
... | Further arguments to be passed |
Note
This function is based onIRT.anova.
See Also
Examples
############################################################################## EXAMPLE 1: anova with din objects############################################################################## Model 1d1 <- CDM::din(sim.dina, q.matr=sim.qmatrix )# Model 2 with equal guessing and slipping parametersd2 <- CDM::din(sim.dina, q.matr=sim.qmatrix, guess.equal=TRUE, slip.equal=TRUE)# model comparisonanova(d1,d2) ## Model loglike Deviance Npars AIC BIC Chisq df p ## 2 d2 -2176.482 4352.963 9 4370.963 4406.886 268.2071 16 0 ## 1 d1 -2042.378 4084.756 25 4134.756 4234.543 NA NA NA## Not run: ############################################################################## EXAMPLE 2: anova with gdina objects############################################################################## Model 3: GDINA modeld3 <- CDM::gdina( sim.dina, q.matr=sim.qmatrix )# Model 4: DINA modeld4 <- CDM::gdina( sim.dina, q.matr=sim.qmatrix, rule="DINA")# model comparisonanova(d3,d4) ## Model loglike Deviance Npars AIC BIC Chisq df p ## 2 d4 -2042.293 4084.586 25 4134.586 4234.373 31.31995 16 0.01224 ## 1 d3 -2026.633 4053.267 41 4135.266 4298.917 NA NA NA## End(Not run)Cognitive Diagnostic Indices based on Kullback-Leibler Information
Description
This function computes several cognitive diagnostic indices groundedon the Kullback-Leibler information (Rupp, Henson& Templin, 2009, Ch. 13) at the test, item, attribute and item-attribute level.See Henson and Douglas (2005) and Henson, Roussos, Douglas and He (2008)for more details.
Usage
cdi.kli(object)## S3 method for class 'cdi.kli'summary(object, digits=2, ...)Arguments
object | Object of class |
digits | Number of digits for rounding |
... | Further arguments to be passed |
Value
A list with following entries
test_disc | Test discrimination which is the sum of all globalitem discrimination indices |
attr_disc | Attribute discriminations |
glob_item_disc | Global item discriminations (Cognitive diagnosticindex) |
attr_item_disc | Attribute-specific item discrimination |
KLI | Array with Kullback-Leibler informations of all items (first dimension)and skill classes (in the second and third dimension) |
skillclasses | Matrix containing all used skill classes in the model |
hdist | Matrix containing Hamming distance between skill classes |
pjk | Used probabilities |
q.matrix | Used Q-matrix |
summary | Data frame with test- and item-specificdiscrimination statistics |
References
Henson, R., DiBello, L., & Stout, B. (2018). A generalized approach to defining itemdiscrimination for DCMs.Measurement: Interdisciplinary Research and Perspectives, 16(1), 18-29.http://dx.doi.org/10.1080/15366367.2018.1436855
Henson, R., & Douglas, J. (2005). Test construction for cognitive diagnosis.Applied Psychological Measurement, 29, 262-277.http://dx.doi.org/10.1177/0146621604272623
Henson, R., Roussos, L., Douglas, J., & He, X. (2008).Cognitive diagnostic attribute-level discrimination indices.Applied Psychological Measurement, 32, 275-288.http://dx.doi.org/10.1177/0146621607302478
Rupp, A. A., Templin, J., & Henson, R. A. (2010).DiagnosticMeasurement: Theory, Methods, and Applications. New York: The GuilfordPress.
See Also
Seediscrim.index for computing discrimination indices at theprobability metric.
See Henson, DiBello and Stout (2018) for an overview of different discriminationindices.
Examples
############################################################################## EXAMPLE 1: Examples based on CDM::sim.dina#############################################################################data(sim.dina, package="CDM")data(sim.qmatrix, package="CDM")mod <- CDM::din( sim.dina, q.matrix=sim.qmatrix )summary(mod) ## Item parameters ## item guess slip IDI rmsea ## Item1 Item1 0.086 0.210 0.704 0.014 ## Item2 Item2 0.109 0.239 0.652 0.034 ## Item3 Item3 0.129 0.185 0.686 0.028 ## Item4 Item4 0.226 0.218 0.556 0.019 ## Item5 Item5 0.059 0.000 0.941 0.002 ## Item6 Item6 0.248 0.500 0.252 0.036 ## Item7 Item7 0.243 0.489 0.268 0.041 ## Item8 Item8 0.278 0.125 0.597 0.109 ## Item9 Item9 0.317 0.027 0.656 0.065cmod <- CDM::cdi.kli( mod )# attribute discrimination indicesround( cmod$attr_disc, 3 ) ## V1 V2 V3 ## 1.966 2.506 11.169# look at global item discrimination indicesround( cmod$glob_item_disc, 3 ) ## > round( cmod$glob_item_disc, 3 ) ## Item1 Item2 Item3 Item4 Item5 Item6 Item7 Item8 Item9 ## 0.594 0.486 0.533 0.465 5.913 0.093 0.040 0.397 0.656# correlation of IDI and global item discriminationstats::cor( cmod$glob_item_disc, mod$IDI ) ## [1] 0.6927274# attribute-specific item indicesround( cmod$attr_item_disc, 3 ) ## V1 V2 V3 ## Item1 0.648 0.648 0.000 ## Item2 0.000 0.530 0.530 ## Item3 0.581 0.000 0.581 ## Item4 0.697 0.000 0.000 ## Item5 0.000 0.000 8.870 ## Item6 0.000 0.140 0.000 ## Item7 0.040 0.040 0.040 ## Item8 0.000 0.433 0.433 ## Item9 0.000 0.715 0.715## Note that attributes with a zero entry for an item## do not differ from zero for the attribute specific item indexClassification Reliability in a CDM
Description
This function computes the classification accuracy andconsistency originally proposed by Cui, Gierl and Chang (2012;see also Wang et al., 2015).The function computes both statistics by estimators of Johnson and Sinharay (2018;see also Sinharay & Johnson, 2019) and simulation based estimation.
Usage
cdm.est.class.accuracy(cdmobj, n.sims=0, version=2)Arguments
cdmobj | Object of class |
n.sims | Number of simulated persons. If |
version | Correct classification reliability statistics can be obtainedusing the default |
Details
The item parameters and the probability distribution oflatent classes is used as the basis of the simulation.Accuracy and consistency is estimated for both MLE and MAPclassification estimators. In addition, classification accuracy measuresare available for the separate classification of all skills.
Value
A data frame for MLE, MAP and MAP (Skill 1, ..., SkillK)classification reliability for the whole latent class pattern andmarginal skill classification with following columns:
Pa_est | Classification accuracy (Cui et al., 2012) usingthe estimator of Johnson and Sinharay, 2018 |
Pa_sim | Classification accuracy based on simulated data(only for |
Pc | Classification consistency (Cui et al., 2012) usingthe estimator of Johnson and Sinharay, 2018 |
Pc_sim | Classification consistency based on simulated data(only for |
References
Cui, Y., Gierl, M. J., & Chang, H.-H. (2012).Estimating classification consistency and accuracy for cognitivediagnostic assessment.Journal of Educational Measurement, 49, 19-38.doi:10.1111/j.1745-3984.2011.00158.x
Johnson, M. S., & Sinharay, S. (2018). Measures of agreement to assess attribute-levelclassification accuracy and consistency for cognitive diagnostic assessments.Journal of Educational Measurement, 45(4), 635-664.doi:10.1111/jedm.12196
Sinharay, S., & Johnson, M. S. (2019). Measures of agreement:Reliability, classification accuracy, and classification consistency.In M. von Davier & Y.-S. Lee (Eds.).Handbook of diagnosticclassification models (pp. 359-377). Cham: Springer.doi:10.1007/978-3-030-05584-4_17
Wang, W., Song, L., Chen, P., Meng, Y., & Ding, S. (2015). Attribute-level andpattern-level classification consistency and accuracy indices for cognitive diagnosticassessment.Journal of Educational Measurement, 52(4), 457-476.doi:10.1111/jedm.12096
Examples
## Not run: ############################################################################## EXAMPLE 1: DINO data example#############################################################################data(sim.dino, package="CDM")data(sim.qmatrix, package="CDM")#***# Model 1: estimate DINO model with dinmod1 <- CDM::din( sim.dino, q.matrix=sim.qmatrix, rule="DINO")# estimate classification reliabilitycdm.est.class.accuracy( mod1, n.sims=5000)#***# Model 2: estimate DINO model with gdinamod2 <- CDM::gdina( sim.dino, q.matrix=sim.qmatrix, rule="DINO")# estimate classification reliabilitycdm.est.class.accuracy( mod2 )m1 <- mod1$coef[, c("guess", "slip" ) ]m2 <- mod2$coefm2 <- cbind( m1, m2[ seq(1,18,2), "est" ], 1 - m2[ seq(1,18,2), "est" ] - m2[ seq(2,18,2), "est" ] )colnames(m2) <- c("g.M1", "s.M1", "g.M2", "s.M2" ) ## > round( m2, 3 ) ## g.M1 s.M1 g.M2 s.M2 ## Item1 0.109 0.192 0.109 0.191 ## Item2 0.073 0.234 0.072 0.234 ## Item3 0.139 0.238 0.146 0.238 ## Item4 0.124 0.065 0.124 0.009 ## Item5 0.125 0.035 0.125 0.037 ## Item6 0.214 0.523 0.214 0.529 ## Item7 0.193 0.514 0.192 0.514 ## Item8 0.246 0.100 0.246 0.100 ## Item9 0.201 0.032 0.195 0.032# Note that s (the slipping parameter) substantially differs for Item4# for DINO estimation in 'din' and 'gdina'## End(Not run)Extract Estimated Item Parameters and Skill Class DistributionParameters
Description
Extracts the estimated parameters from eitherdin,gdina,gdina orgdm objects.
Usage
## S3 method for class 'din'coef(object, ...)## S3 method for class 'gdina'coef(object, ...)## S3 method for class 'mcdina'coef(object, ...)## S3 method for class 'gdm'coef(object, ...)## S3 method for class 'slca'coef(object, ...)Arguments
object | An object inheriting from either class |
... | Additional arguments to be passed. |
Value
A vector, a matrix or a data frame of the estimated parameters for the fitted model.
See Also
Examples
data(sim.dina, package="CDM")data(sim.qmatrix, package="CDM")# DINA modeld1 <- CDM::din( sim.dina, q.matrix=sim.qmatrix)coef(d1)## Not run: # GDINA modeld2 <- CDM::gdina( sim.dina, q.matrix=sim.qmatrix)coef(d2)# GDM modeltheta.k <- seq(-4,4,len=11)d3 <- CDM::gdm( sim.dina, irtmodel="2PL", theta.k=theta.k, Qmatrix=as.matrix(sim.qmatrix), centered.latent=TRUE)coef(d3)## End(Not run)Dataset Student Questionnaire
Description
This dataset contains item responses of students ata scale of cultural activities (act),mathematics self concept (sc) andmathematics joyment (mj).
Usage
data(data.Students)Format
A data frame with 2400 observations on the following 15 variables.
urbanUrbanization level: 1=town, 0=otherwise
femaleA dummy variable for female student
act1Visit a museum (0=never, 1=once or twice a year,2=more than twice a year)
act2Visit a theater or classical concert (0,1,2)
act3Visit a rock or pop concert (0,1,2)
act4Visit a cinema (0,1,2)
act5Visit a public library (0,1,2)
sc1Item 1 self concept "I am usually good at math."(0=do not agree at all,1=rather do not agree, 2=rather agree, 3=completely agree)
sc2Item 2 self concept: "Mathematics is harder for me than many of myclassmates." (0,1,2,3) (reversed)
sc3Item 3 self concept: "I am just not good at math."(0,1,2,3) (reversed)
sc4Item 4 self concept: "I'm learning fast in math." (0,1,2,3)
mj1Item 1 mathematics joyment:"I would like more math at school." (0,1,2,3)
mj2Item 2 mathematics joyment:"I like to learn mathematics." (0,1,2,3)
mj3Item 3 mathematics joyment:"Math is boring." (0,1,2,3) (reversed)
mj4Item 4 mathematics joyment: "I like math." (0,1,2,3)
Source
Subsample of students from an Austrian surveyof 8th grade students.
Examples
## Not run: library(psych)data(data.Students, package="CDM")psych::describe(data.Students) ## var n mean sd median trimmed mad min max range skew kurtosis se ## urban 1 2400 0.31 0.46 0.0 0.27 0.00 0 1 1 0.81 -1.34 0.01 ## female 2 2400 0.51 0.50 1.0 0.51 0.00 0 1 1 -0.03 -2.00 0.01 ## act1 3 2248 0.65 0.73 0.5 0.56 0.74 0 2 2 0.64 -0.88 0.02 ## act2 4 2230 0.47 0.69 0.0 0.34 0.00 0 2 2 1.13 -0.06 0.01 ## act3 5 2218 0.33 0.60 0.0 0.21 0.00 0 2 2 1.62 1.48 0.01 ## act4 6 2342 1.35 0.76 2.0 1.44 0.00 0 2 2 -0.69 -0.96 0.02 ## act5 7 2223 0.52 0.74 0.0 0.40 0.00 0 2 2 1.05 -0.41 0.02 ## sc1 8 2352 0.96 0.80 1.0 0.91 1.48 0 3 3 0.45 -0.39 0.02 ## sc2 9 2347 0.90 0.88 1.0 0.81 1.48 0 3 3 0.66 -0.41 0.02 ## sc3 10 2335 0.86 0.96 1.0 0.73 1.48 0 3 3 0.84 -0.35 0.02 ## sc4 11 2337 1.29 0.90 1.0 1.24 1.48 0 3 3 0.24 -0.71 0.02 ## mj1 12 2351 2.26 0.82 2.0 2.37 1.48 0 3 3 -0.94 0.28 0.02 ## mj2 13 2345 1.89 0.91 2.0 1.95 1.48 0 3 3 -0.35 -0.80 0.02 ## mj3 14 2334 1.47 1.02 1.0 1.47 1.48 0 3 3 0.10 -1.11 0.02 ## mj4 15 2346 1.59 0.99 2.0 1.62 1.48 0 3 3 -0.03 -1.06 0.02## End(Not run)Several Datasets for theCDM Package
Description
Several datasets for theCDM package
Usage
data(data.cdm01)data(data.cdm02)data(data.cdm03)data(data.cdm04)data(data.cdm05)data(data.cdm06)data(data.cdm07)data(data.cdm08)data(data.cdm09)data(data.cdm10)Format
Dataset
data.cdm01This dataset is a multiple choice dataset and used in the
mcdinafunction. The format is:List of 3$ data :'data.frame':..$ I1 : int [1:5003] 3 3 4 1 1 1 1 1 1 1 .....$ I2 : int [1:5003] 1 1 3 1 1 2 1 1 2 1 .....$ I3 : int [1:5003] 4 3 2 3 2 2 2 2 1 2 .....$ I4 : int [1:5003] 3 3 3 2 2 2 2 3 3 1 .....$ I5 : int [1:5003] 2 2 2 3 1 1 2 3 2 1 .....$ I6 : int [1:5003] 3 1 1 1 1 2 1 1 1 1 .....$ I7 : int [1:5003] 1 1 2 2 1 3 1 1 1 3 .....$ I8 : int [1:5003] 1 1 1 1 1 2 1 4 3 3 .....$ I9 : int [1:5003] 3 2 1 1 1 1 3 3 1 3 .....$ I10: int [1:5003] 2 1 2 1 1 2 2 2 2 1 .....$ I11: int [1:5003] 2 2 2 2 1 2 1 2 1 1 .....$ I12: int [1:5003] 1 2 1 1 2 1 1 1 1 2 .....$ I13: int [1:5003] 2 1 1 1 2 1 2 2 1 1 .....$ I14: int [1:5003] 1 1 1 1 1 2 1 1 2 1 .....$ I15: int [1:5003] 1 2 1 1 1 1 1 1 1 1 .....$ I16: int [1:5003] 1 2 2 1 2 2 2 1 1 1 .....$ I17: int [1:5003] 1 1 1 1 1 1 1 1 1 1 ...$ group : int [1:5003] 1 1 1 1 1 1 1 1 1 1 ...$ q.matrix:'data.frame':..$ item : int [1:52] 1 1 1 1 2 2 2 2 3 3 .....$ categ: int [1:52] 1 2 3 4 1 2 3 4 1 2 .....$ A1 : int [1:52] 0 1 0 1 0 1 1 1 0 0 .....$ A2 : int [1:52] 0 0 1 1 0 0 0 1 0 0 .....$ A3 : int [1:52] 0 0 0 0 0 0 0 0 0 0 ...Dataset
data.cdm02Multiple choice dataset with a Q-matrix designed for polytomousattributes.
List of 2$ data :'data.frame':..$ I1 : int [1:3000] 3 3 4 1 1 1 1 1 1 1 .....$ I2 : int [1:3000] 1 1 3 1 1 2 1 1 2 1 .....$ I3 : int [1:3000] 4 3 2 3 2 2 2 2 1 2 ...[...]..$ B17: num [1:3000] 1 1 1 1 1 1 1 1 1 1 .....$ B18: num [1:3000] 1 1 1 1 2 2 2 2 2 2 ...$ q.matrix:'data.frame':..$ item : int [1:100] 1 1 1 1 2 2 2 2 3 3 .....$ categ: int [1:100] 1 2 3 4 1 2 3 4 1 2 .....$ A1 : num [1:100] 0 1 0 1 0 1 1 1 0 0 .....$ A2 : num [1:100] 0 0 1 1 0 0 0 1 0 0 .....$ A3 : num [1:100] 0 0 0 0 0 0 0 0 0 0 .....$ B1 : num [1:100] 0 0 0 0 0 0 0 0 0 0 ...Dataset
data.cdm03:This is a resimulated dataset from Chiu, Koehn and Wu (2016) wherethe data generating model is a reduced RUM model. See Example 1.
List of 2$ data : num [1:725, 1:16] 0 1 1 1 1 1 1 1 1 1 .....- attr(*, "dimnames")=List of 2.. ..$ : NULL.. ..$ : chr [1:16] "I01" "I02" "I03" "I04" ...$ qmatrix:'data.frame': 16 obs. of 6 variables:..$ item: Factor w/ 16 levels "I01","I02","I03",..: 1 2 3 4 5 6 7 8 9 10 .....$ A1 : int [1:16] 1 0 0 0 0 0 0 0 1 1 .....$ A2 : int [1:16] 0 1 0 0 1 1 0 0 0 0 .....$ A3 : int [1:16] 0 0 1 1 1 1 0 0 0 0 .....$ A4 : int [1:16] 0 0 0 0 0 0 1 1 1 1 .....$ A5 : int [1:16] 0 0 0 0 0 0 0 0 0 0 ...Dataset
data.cdm04:Simulated dataset for the sequential DINA model(as described in Ma & de la Torre, 2016).The dataset contains 1000 persons and 12 items which measure 2 skills.
List of 3$ data : num [1:1000, 1:12] 0 0 0 1 1 0 0 0 0 0 .....- attr(*, "dimnames")=List of 2.. ..$ : NULL.. ..$ : chr [1:12] "I1" "I2" "I3" "I4" ...$ q.matrix1:'data.frame': 18 obs. of 4 variables:..$ Item: chr [1:18] "I1" "I2" "I3" "I4" .....$ Cat : int [1:18] 1 1 1 1 1 1 1 2 1 2 .....$ A1 : int [1:18] 1 1 1 0 0 0 1 1 1 1 .....$ A2 : int [1:18] 0 0 0 1 1 1 0 0 0 0 ...$ q.matrix2:'data.frame': 18 obs. of 4 variables:..$ Item: chr [1:18] "I1" "I2" "I3" "I4" .....$ Cat : int [1:18] 1 1 1 1 1 1 1 2 1 2 .....$ A1 : num [1:18] 1 1 1 0 0 0 1 1 1 1 .....$ A2 : num [1:18] 0 0 0 1 1 1 0 0 0 0 ...Dataset
data.cdm05:Example dataset used in Philipp, Strobl, de la Torre and Zeileis (2018).This dataset is a sub-dataset of the
probabilitydataset inthepks package (Heller & Wickelmaier, 2013).List of 3$ data :'data.frame': 504 obs. of 12 variables:..$ b101: num [1:504] 1 1 1 1 1 1 1 1 1 1 .....$ b102: num [1:504] 1 1 1 1 1 1 1 1 1 1 .....$ b103: num [1:504] 1 1 1 1 1 1 1 1 1 1 .....$ b104: num [1:504] 1 1 1 1 0 1 0 0 0 1 .....$ b105: num [1:504] 1 0 1 1 1 1 0 1 1 1 .....$ b106: num [1:504] 1 1 1 1 1 1 1 1 1 1 .....$ b107: num [1:504] 1 1 1 1 1 1 1 1 1 1 .....$ b108: num [1:504] 1 1 1 1 1 1 0 1 1 1 .....$ b109: num [1:504] 1 1 0 1 1 0 0 1 1 0 .....$ b110: num [1:504] 0 0 0 1 0 0 0 0 0 1 .....$ b111: num [1:504] 0 1 0 0 0 1 0 0 0 0 .....$ b112: num [1:504] 1 1 0 1 0 1 0 1 0 0 ...$ q.matrix:'data.frame': 12 obs. of 4 variables:..$ pb: num [1:12] 1 0 0 0 1 1 1 1 1 0 .....$ cp: num [1:12] 0 1 0 0 1 1 0 0 0 1 .....$ un: num [1:12] 0 0 1 0 0 0 1 1 0 0 .....$ id: num [1:12] 0 0 0 1 0 0 0 0 1 1 ...$ skills : Named chr [1:4] "how to calculate the classic probability "..- attr(*, "names")=chr [1:4] "pb" "cp" "un" "id"Dataset
data.cdm06:Resimulated example dataset from Chen and Chen (2017).
List of 3$ data :'data.frame': 2733 obs. of 15 variables:..$ I01: num [1:2733] 1 0 0 1 0 0 0 1 1 1 .....$ I02: num [1:2733] 1 0 0 1 1 0 1 0 0 1 .....$ I03: num [1:2733] 0 0 0 1 1 0 1 0 1 0 .....$ I04: num [1:2733] 1 1 0 0 0 0 1 1 1 0 .....$ I05: num [1:2733] 1 0 1 1 0 1 1 1 1 1 .....$ I06: num [1:2733] 0 0 0 1 1 0 0 0 1 1 .....$ I07: num [1:2733] 1 1 1 0 0 1 1 0 1 1 .....$ I08: num [1:2733] 0 0 0 0 0 0 0 0 1 1 .....$ I09: num [1:2733] 1 0 0 1 1 1 0 1 0 1 .....$ I10: num [1:2733] 0 0 0 1 0 1 1 0 1 1 .....$ I11: num [1:2733] 0 1 0 1 1 1 1 0 1 1 .....$ I12: num [1:2733] 0 1 0 1 0 0 0 1 1 1 .....$ I13: num [1:2733] 0 0 1 1 0 1 0 0 0 1 .....$ I14: num [1:2733] 0 0 0 1 1 0 1 1 0 0 .....$ I15: num [1:2733] 0 0 0 1 0 0 1 0 1 1 ...$ q.matrix:'data.frame': 15 obs. of 5 variables:..$ RI: num [1:15] 1 1 1 0 1 1 1 1 0 0 .....$ JS: num [1:15] 1 0 0 1 0 0 0 0 0 1 .....$ GI: num [1:15] 0 1 0 1 0 0 1 1 1 1 .....$ II: num [1:15] 0 1 1 0 1 0 1 0 0 0 .....$ MI: num [1:15] 0 0 1 0 0 0 0 0 1 0 ...$ skills : chr [1:5, 1:2] "Retrieving explicit information " .....- attr(*, "dimnames")=List of 2.. ..$ : chr [1:5] "RI" "JS" "GI" "II" ..... ..$ : chr [1:2] "skill" "description"Dataset
data.cdm07:This is a resimulated dataset from the social anxiety disorder dataconcerning social phobia which involve 13 dichotomous questions(Fang, Liu & Ling, 2017). The simulation was based on a latent classmodel with five classes. The dataset was also used in Chen, Li, Liuand Ying (2017).
$ data : num [1:863, 1:13] 1 0 1 1 1 1 1 1 1 1 .....- attr(*, "dimnames")=List of 2.. ..$ : NULL.. ..$ : chr [1:13] "I1" "I2" "I3" "I4" ...$ q.matrix: num [1:13, 1:3] 1 1 1 1 0 0 0 0 0 0 .....- attr(*, "dimnames")=List of 2.. ..$ : chr [1:13] "I1" "I2" "I3" "I4" ..... ..$ : chr [1:3] "A1" "A2" "A3"$ items : atomic [1:13] 1 speaking in front of other people? .....- attr(*, "stem")=chr "Have you ever had a strong fear or avoidance of ..."Dataset
data.cdm08:This is a simulated dataset involving four skills and three misconceptionsfor the model for simultaneously identifying skills andmisconceptions (SISM; Kuo, Chen & de la Torre, 2018). The Q-matrix followsthe specification in their simulation study.
List of 2$ data :'data.frame': 1300 obs. of 20 variables:..$ I01: num [1:1300] 1 0 0 1 1 1 1 1 1 1 .....$ I02: num [1:1300] 0 0 0 0 1 1 1 1 1 1 .....$ I03: num [1:1300] 0 0 0 0 1 1 1 1 1 1 .....$ I04: num [1:1300] 1 1 0 1 0 1 1 0 1 1 .....$ I05: num [1:1300] 1 1 1 0 1 1 0 1 1 1 .....[...]..$ I18: num [1:1300] 0 1 0 0 0 0 0 0 0 1 .....$ I19: num [1:1300] 1 1 0 0 0 0 0 1 1 1 .....$ I20: num [1:1300] 1 1 0 0 0 1 0 1 0 1 ...$ q.matrix:'data.frame': 20 obs. of 7 variables:..$ S1: num [1:20] 1 0 0 0 0 0 0 1 0 0 .....$ S2: num [1:20] 0 1 0 0 0 0 0 0 1 0 .....$ S3: num [1:20] 0 0 1 0 0 0 0 0 0 1 .....$ S4: num [1:20] 0 0 0 1 0 0 0 0 0 0 .....$ B1: num [1:20] 0 0 0 0 1 0 0 1 1 0 .....$ B2: num [1:20] 0 0 0 0 0 1 0 0 0 0 .....$ B3: num [1:20] 0 0 0 0 0 0 1 0 0 1 ...Dataset
data.cdm09:This is a simulated dataset involving polytomous skills which is adaptedfrom the empirical example (proportional reasoning data)of Chen and de la Torre (2013).List of 2$ data : num [1:500, 1:15] 1 0 1 1 0 1 1 1 1 1 .....- attr(*, "dimnames")=List of 2.. ..$ : NULL.. ..$ : chr [1:15] "I1" "I2" "I3" "I4" ...$ q.matrix:'data.frame': 15 obs. of 4 variables:..$ A1: int [1:15] 0 0 0 0 2 0 0 2 1 1 .....$ A2: int [1:15] 1 0 2 0 0 1 2 0 1 1 .....$ A3: int [1:15] 0 0 0 1 0 0 0 0 0 0 .....$ A4: int [1:15] 0 1 1 0 0 0 0 0 0 0 ...Dataset
data.cdm10:This is a simulated dataset involving a hierarchical skill structure.Skill A has four levels, skill B possesses two levels and skill C has three levels.List of 2$ data : num [1:1500, 1:15] 1 1 0 0 0 1 1 0 0 1 .....- attr(*, "dimnames")=List of 2.. ..$ : NULL.. ..$ : chr [1:15] "I1" "I2" "I3" "I4" ...$ q.matrix: num [1:15, 1:6] 1 1 1 1 1 1 0 0 0 0 .....- attr(*, "dimnames")=List of 2.. ..$ : chr [1:15] "I1" "I2" "I3" "I4" ..... ..$ : chr [1:6] "A1" "A2" "A3" "B1" ...
References
Chen, H., & Chen, J. (2017). Cognitive diagnostic research on chinesestudents' English listening skills and implications on skill training.English Language Teaching, 10(12), 107-115.http://dx.doi.org/10.5539/elt.v10n12p107
Chen, J., & de la Torre, J. (2013). A general cognitive diagnosis model forexpert-defined polytomous attributes.Applied Psychological Measurement, 37,419-437.http://dx.doi.org/10.1177/0146621613479818
Chen, Y., Li, X., Liu, J., & Ying, Z. (2017).Regularized latent class analysis with application in cognitive diagnosis.Psychometrika, 82, 660-692.http://dx.doi.org/10.1007/s11336-016-9545-6
Chiu, C.-Y., Koehn, H.-F., & Wu, H.-M. (2016).Fitting the reduced RUM with Mplus: A tutorial.International Journal of Testing, 16(4), 331-351.http://dx.doi.org/10.1080/15305058.2016.1148038
Fang, G., Liu, J., & Ying, Z. (2017). On the identifiability ofdiagnostic classification models.arXiv, 1706.01240.https://arxiv.org/abs/1706.01240
Heller, J. and Wickelmaier, F. (2013). Minimum discrepancy estimation inprobabilistic knowledge structures.Electronic Notes in Discrete Mathematics, 42, 49-56.
http://dx.doi.org/10.1016/j.endm.2013.05.145
Kuo, B.-C., Chen, C.-H., & de la Torre, J. (2018).A cognitive diagnosis model for identifying coexisting skills and misconceptions.Applied Psychological Measurement, 42(3), 179-191.http://dx.doi.org/10.1177/0146621617722791
Ma, W., & de la Torre, J. (2016).A sequential cognitive diagnosis model for polytomous responses.British Journal of Mathematical and Statistical Psychology, 69(3), 253-275.
https://doi.org/10.1111/bmsp.12070
Philipp, M., Strobl, C., de la Torre, J., & Zeileis, A. (2018).On the estimation of standard errors in cognitive diagnosis models.Journal of Educational and Behavioral Statistics, 43(1), 88-115.http://dx.doi.org/10.3102/1076998617719728
Examples
## Not run: ############################################################################## EXAMPLE 1: Reduced RUM model, Chiu et al. (2016)#############################################################################data(data.cdm03, package="CDM")dat <- data.cdm03$dataqmatrix <- data.cdm03$qmatrix#*** Model 1: Reduced RUMmod1 <- CDM::gdina( dat, q.matrix=qmatrix[,-1], rule="RRUM" )summary(mod1)#*** Model 2: Additive model with identity link functionmod2 <- CDM::gdina( dat, q.matrix=qmatrix[,-1], rule="ACDM" )summary(mod2)#*** Model 3: Additive model with logit link functionmod3 <- CDM::gdina( dat, q.matrix=qmatrix[,-1], rule="ACDM", linkfct="logit")summary(mod3)############################################################################## EXAMPLE 2: GDINA model - probability dataset from the pks package#############################################################################data(data.cdm05, package="CDM")dat <- data.cdm05$dataQ <- data.cdm05$q.matrix#* estimate modelmod1 <- CDM::gdina( dat, q.matrix=Q )summary(mod1)## End(Not run)Dataset from Book 'Diagnostic Measurement' of Rupp, Templin andHenson (2010)
Description
Dataset from Chapter 9 of the book 'Diagnostic Measurement'(Rupp, Templin & Henson, 2010).
Usage
data(data.dcm)Format
The format of the data is a list containing the dichotomous itemresponse datadata (10000 persons at 7 items)and the Q-matrixq.matrix (7 items and 3 skills):
List of 2 $ data :'data.frame': ..$ id: int [1:10000] 1 2 3 4 5 6 7 8 9 10 ... ..$ D1: num [1:10000] 0 0 0 0 1 0 1 0 0 1 ... ..$ D2: num [1:10000] 0 0 0 0 0 1 1 1 0 1 ... ..$ D3: num [1:10000] 1 0 1 0 1 1 0 0 0 1 ... ..$ D4: num [1:10000] 0 0 1 0 0 1 1 1 0 0 ... ..$ D5: num [1:10000] 1 0 0 0 1 1 1 0 1 0 ... ..$ D6: num [1:10000] 0 0 0 0 1 1 1 0 0 1 ... ..$ D7: num [1:10000] 0 0 0 0 0 1 1 0 1 1 ... $ q.matrix: num [1:7, 1:3] 1 0 0 1 1 0 1 0 1 0 ... ..- attr(*, "dimnames")=List of 2 .. ..$ : chr [1:7] "D1" "D2" "D3" "D4" ... .. ..$ : chr [1:3] "skill1" "skill2" "skill3"
Source
For supplementary material of the Rupp, Templin and Henson book (2010)seehttp://dcm.coe.uga.edu/.
The dataset was downloaded fromhttp://dcm.coe.uga.edu/supplemental/chapter9.html.
References
Rupp, A. A., Templin, J., & Henson, R. A. (2010).DiagnosticMeasurement: Theory, Methods, and Applications. New York: The GuilfordPress.
Examples
## Not run: data(data.dcm, package="CDM")dat <- data.dcm$data[,-1]Q <- data.dcm$q.matrix#*****************************************************# Model 1: DINA model#*****************************************************mod1 <- CDM::din( dat, q.matrix=Q )summary(mod1)#--------# Model 1m: estimate model in mirt packagelibrary(mirt)library(sirt) #** define theta grid of skills # use the function skillspace.hierarchy just for conveniencehier <- "skill1 > skill2"skillspace <- CDM::skillspace.hierarchy( hier, skill.names=colnames(Q) )Theta <- as.matrix(skillspace$skillspace.complete) #** create mirt modelmirtmodel <- mirt::mirt.model(" skill1=1 skill2=2 skill3=3 (skill1*skill2)=4 (skill1*skill3)=5 (skill2*skill3)=6 (skill1*skill2*skill3)=7 " ) #** mirt parameter tablemod.pars <- mirt::mirt( dat, mirtmodel, pars="values") # use starting values of .20 for guessing parameterind <- which( mod.pars$name=="d" )mod.pars[ind,"value"] <- stats::qlogis(.20) # guessing parameter on the logit metric # use starting values of .80 for anti-slipping parameterind <- which( ( mod.pars$name %in% paste0("a",1:20 ) ) & (mod.pars$est) )mod.pars[ind,"value"] <- stats::qlogis(.80) - stats::qlogis(.20)mod.pars #** prior for the skill space distributionI <- ncol(dat)lca_prior <- function(Theta,Etable){ TP <- nrow(Theta) if ( is.null(Etable) ){ prior <- rep( 1/TP, TP ) } if ( ! is.null(Etable) ){ prior <- ( rowSums(Etable[, seq(1,2*I,2)]) + rowSums(Etable[,seq(2,2*I,2)]) )/I } prior <- prior / sum(prior) return(prior) } #** estimate model in mirtmod1m <- mirt::mirt(dat, mirtmodel, pars=mod.pars, verbose=TRUE, technical=list( customTheta=Theta, customPriorFun=lca_prior) ) # The number of estimated parameters is incorrect because mirt does not correctly count # estimated parameters from the user customized prior distribution.mod1m@nest <- as.integer(sum(mod.pars$est) + nrow(Theta) - 1) # extract log-likelihoodmod1m@logLik # compute AIC and BIC( AIC <- -2*mod1m@logLik+2*mod1m@nest )( BIC <- -2*mod1m@logLik+log(mod1m@Data$N)*mod1m@nest ) #** extract item parameterscmod1m <- sirt::mirt.wrapper.coef(mod1m)$coef# compare estimated guessing and slipping parametersdfr <- data.frame( "din.guess"=mod1$guess$est, "mirt.guess"=plogis(cmod1m$d), "din.slip"=mod1$slip$est, "mirt.slip"=1-plogis( rowSums( cmod1m[, c("d", paste0("a",1:7) ) ] ) ) )round(t(dfr),3) ## [,1] [,2] [,3] [,4] [,5] [,6] [,7] ## din.guess 0.217 0.193 0.189 0.135 0.143 0.135 0.162 ## mirt.guess 0.226 0.189 0.184 0.132 0.142 0.132 0.158 ## din.slip 0.338 0.331 0.334 0.220 0.222 0.211 0.042 ## mirt.slip 0.339 0.333 0.336 0.223 0.225 0.214 0.044# compare estimated skill class distributiondfr <- data.frame("din"=mod1$attribute.patt$class.prob, "mirt"=mod1m@Prior[[1]] )round(t(dfr),3) ## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] ## din 0.113 0.083 0.094 0.092 0.064 0.059 0.065 0.429 ## mirt 0.116 0.074 0.095 0.064 0.095 0.058 0.066 0.433#** extract estimated classificationsfsc1m <- sirt::mirt.wrapper.fscores( mod1m )#- estimated reliabilitiesfsc1m$EAP.rel ## skill1 skill2 skill3 ## 0.5479942 0.5362595 0.5357961#- estimated classfications: EAPs, MLEs and MAPshead( round(fsc1m$person,3) ) ## case M EAP.skill1 SE.EAP.skill1 EAP.skill2 SE.EAP.skill2 EAP.skill3 SE.EAP.skill3 ## 1 1 0.286 0.508 0.500 0.067 0.251 0.820 0.384 ## 2 2 0.000 0.162 0.369 0.191 0.393 0.190 0.392 ## 3 3 0.286 0.200 0.400 0.211 0.408 0.607 0.489 ## 4 4 0.000 0.162 0.369 0.191 0.393 0.190 0.392 ## 5 5 0.571 0.802 0.398 0.267 0.443 0.928 0.258 ## 6 6 0.857 0.998 0.045 1.000 0.019 1.000 0.020 ## MLE.skill1 MLE.skill2 MLE.skill3 MAP.skill1 MAP.skill2 MAP.skill3 ## 1 1 0 1 1 0 1 ## 2 0 0 0 0 0 0 ## 3 0 0 1 0 0 1 ## 4 0 0 0 0 0 0 ## 5 1 0 1 1 0 1 ## 6 1 1 1 1 1 1#** estimate model fit in mirt( fit1m <- mirt::M2( mod1m ) )#*****************************************************# Model 2: DINO model#*****************************************************mod2 <- CDM::din( dat, q.matrix=Q, rule="DINO")summary(mod2)#*****************************************************# Model 3: log-linear model (LCDM): this model is the GDINA model with the# logit link function#*****************************************************mod3 <- CDM::gdina( dat, q.matrix=Q, link="logit")summary(mod3)#*****************************************************# Model 4: GDINA model with identity link function#*****************************************************mod4 <- CDM::gdina( dat, q.matrix=Q )summary(mod4)#*****************************************************# Model 5: GDINA additive model identity link function#*****************************************************mod5 <- CDM::gdina( dat, q.matrix=Q, rule="ACDM")summary(mod5)#*****************************************************# Model 6: GDINA additive model logit link function#*****************************************************mod6 <- CDM::gdina( dat, q.matrix=Q, link="logit", rule="ACDM")summary(mod6)#--------# Model 6m: GDINA additive model in mirt package# use data specifications from Model 1m) #** create mirt modelmirtmodel <- mirt::mirt.model(" skill1=1,4,5,7 skill2=2,4,6,7 skill3=3,5,6,7 " ) #** mirt parameter tablemod.pars <- mirt::mirt( dat, mirtmodel, pars="values") #** estimate model in mirt # Theta and lca_prior as defined as in Model 1mmod6m <- mirt::mirt(dat, mirtmodel, pars=mod.pars, verbose=TRUE, technical=list( customTheta=Theta, customPriorFun=lca_prior) )mod6m@nest <- as.integer(sum(mod.pars$est) + nrow(Theta) - 1) # extract log-likelihoodmod6m@logLik # compute AIC and BIC( AIC <- -2*mod6m@logLik+2*mod6m@nest )( BIC <- -2*mod6m@logLik+log(mod6m@Data$N)*mod6m@nest ) #** skill distribution mod6m@Prior[[1]] #** extract item parameterscmod6m <- mirt.wrapper.coef(mod6m)$coefprint(cmod6m,digits=4) ## item a1 a2 a3 d g u ## 1 D1 1.882 0.000 0.000 -0.9330 0 1 ## 2 D2 0.000 2.049 0.000 -1.0430 0 1 ## 3 D3 0.000 0.000 2.028 -0.9915 0 1 ## 4 D4 2.697 2.525 0.000 -2.9925 0 1 ## 5 D5 2.524 0.000 2.478 -2.7863 0 1 ## 6 D6 0.000 2.818 2.791 -3.1324 0 1 ## 7 D7 3.113 2.918 2.785 -4.2794 0 1#*****************************************************# Model 7: Reduced RUM model#*****************************************************mod7 <- CDM::gdina( dat, q.matrix=Q, rule="RRUM")summary(mod7)#*****************************************************# Model 8: latent class model with 3 classes and 4 sets of starting values#*****************************************************#-- Model 8a: randomLCA packagelibrary(randomLCA)mod8a <- randomLCA::randomLCA( dat, nclass=3, verbose=TRUE, notrials=4)#-- Model8b: rasch.mirtlc function in sirt packagelibrary(sirt)mod8b <- sirt::rasch.mirtlc( dat, Nclasses=3, nstarts=4 )summary(mod8a)summary(mod8b)## End(Not run)DTMR Fraction Data (Bradshaw et al., 2014)
Description
This is a simulated dataset of the DTMR fraction data describedin Bradshaw, Izsak, Templin and Jacobson (2014).
Usage
data(data.dtmr)Format
The format is:
List of 5 $ data : num [1:5000, 1:27] 0 0 0 0 0 1 0 0 1 1 ... ..- attr(*, "dimnames")=List of 2 .. ..$ : NULL .. ..$ : chr [1:27] "M1" "M2" "M3" "M4" ... $ q.matrix :'data.frame': 27 obs. of 4 variables: ..$ RU : int [1:27] 1 0 0 1 1 0 1 0 0 0 ... ..$ PI : int [1:27] 0 0 1 0 0 1 0 0 0 0 ... ..$ APP: int [1:27] 0 1 0 0 0 0 0 1 1 1 ... ..$ MC : int [1:27] 0 0 0 0 0 0 0 0 0 0 ... $ skill.distribution:'data.frame': 16 obs. of 5 variables: ..$ RU : int [1:16] 0 0 0 0 0 0 0 0 1 1 ... ..$ PI : int [1:16] 0 0 0 0 1 1 1 1 0 0 ... ..$ APP : int [1:16] 0 0 1 1 0 0 1 1 0 0 ... ..$ MC : int [1:16] 0 1 0 1 0 1 0 1 0 1 ... ..$ freq: int [1:16] 1064 350 280 406 196 126 238 770 14 28 ... $ itempars :'data.frame': 27 obs. of 7 variables: ..$ item : chr [1:27] "M1" "M2" "M3" "M4" ... ..$ lam0 : num [1:27] -1.12 0.59 -2.07 -1.19 -1.67 -3.81 -0.73 -0.62 -0.09 0.28 ... ..$ RU : num [1:27] 2.24 0 0 0.65 1.52 0 1.2 0 0 0 ... ..$ PI : num [1:27] 0 0 1.7 0 0 2.08 0 0 0 0 ... ..$ APP : num [1:27] 0 1.27 0 0 0 0 0 4.25 2.16 0.87 ... ..$ MC : num [1:27] 0 0 0 0 0 0 0 0 0 0 ... ..$ RU.PI: num [1:27] 0 0 0 0 0 0 0 0 0 0 ... $ sim_data :function (N, skill.distribution, itempars) ..- attr(*, "srcref")='srcref' int [1:8] 1 13 20 1 13 1 1 20 .. ..- attr(*, "srcfile")=Classes 'srcfilecopy', 'srcfile' <environment: 0x00000000298a8ed0>
The attribute definition are as follows
RU: Referent units
PI: Partitioning and iterating attribute
APP: Appropriateness attribute
MC: Multiplicative Comparison attribute
Source
Simulated dataset according to Bradshaw et al. (2014).
References
Bradshaw, L., Izsak, A., Templin, J., & Jacobson, E. (2014).Diagnosing teachers' understandings of rational numbers: Building amultidimensional test within the diagnostic classification framework.Educational Measurement: Issues and Practice, 33, 2-14.
Examples
## Not run: ############################################################################## EXAMPLE 1: Model comparisons data.dtmr#############################################################################data(data.dtmr, package="CDM")data <- data.dtmr$dataq.matrix <- data.dtmr$q.matrixI <- ncol(data)#*** Model 1: LCDM# define item wise rulesrule <- rep( "ACDM", I )names(rule) <- colnames(data)rule[ c("M14","M17") ] <- "GDINA2"# estimate modelmod1 <- CDM::gdina( data, q.matrix, linkfct="logit", rule=rule)summary(mod1)#*** Model 2: DINA modelmod2 <- CDM::gdina( data, q.matrix, rule="DINA" )summary(mod2)#*** Model 3: RRUM modelmod3 <- CDM::gdina( data, q.matrix, rule="RRUM" )summary(mod3)#--- model comparisons# LCDM vs. DINAanova(mod1,mod2) ## Model loglike Deviance Npars AIC BIC Chisq df p ## 2 Model 2 -76570.89 153141.8 69 153279.8 153729.5 1726.645 10 0 ## 1 Model 1 -75707.57 151415.1 79 151573.1 152088.0 NA NA NA# LCDM vs. RRUManova(mod1,mod3) ## Model loglike Deviance Npars AIC BIC Chisq df p ## 2 Model 2 -75746.13 151492.3 77 151646.3 152148.1 77.10994 2 0 ## 1 Model 1 -75707.57 151415.1 79 151573.1 152088.0 NA NA NA#--- model fitsummary( CDM::modelfit.cor.din( mod1 ) ) ## Test of Global Model Fit ## type value p ## 1 max(X2) 7.74382 1.00000 ## 2 abs(fcor) 0.04056 0.72707 ## ## Fit Statistics ## est ## MADcor 0.00959 ## SRMSR 0.01217 ## MX2 0.75696 ## 100*MADRESIDCOV 0.20283 ## MADQ3 0.02220############################################################################## EXAMPLE 2: Simulating data of structure data.dtmr#############################################################################data(data.dtmr, package="CDM")# draw sample of N=200set.seed(87)data.dtmr$sim_data(N=200, skill.distribution=data.dtmr$skill.distribution, itempars=data.dtmr$itempars)## End(Not run)Dataset ECPE
Description
ECPE dataset from the Templin and Hoffman (2013) tutorial ofspecifying cognitive diagnostic models in Mplus.
Usage
data(data.ecpe)Format
The format of the data is a list containing the dichotomous itemresponse datadata (2922 persons at 28 items)and the Q-matrixq.matrix (28 items and 3 skills):
List of 2 $ data :'data.frame': ..$ id : int [1:2922] 1 2 3 4 5 6 7 8 9 10 ... ..$ E1 : int [1:2922] 1 1 1 1 1 1 1 0 1 1 ... ..$ E2 : int [1:2922] 1 1 1 1 1 1 1 1 1 1 ... ..$ E3 : int [1:2922] 1 1 1 1 1 1 1 1 1 1 ... ..$ E4 : int [1:2922] 0 1 1 1 1 1 1 1 1 1 ... [...] ..$ E27: int [1:2922] 1 1 1 1 1 1 1 0 1 1 ... ..$ E28: int [1:2922] 1 1 1 1 1 1 1 1 1 1 ... $ q.matrix:'data.frame': ..$ skill1: int [1:28] 1 0 1 0 0 0 1 0 0 1 ... ..$ skill2: int [1:28] 1 1 0 0 0 0 0 1 0 0 ... ..$ skill3: int [1:28] 0 0 1 1 1 1 1 0 1 0 ...
The skills are
skill1: Morphosyntactic rules
skill2: Cohesive rules
skill3: Lexical rules.
Details
The dataset has been used in Templin and Hoffman (2013), andTemplin and Bradshaw (2014).
Source
The dataset was downloaded fromhttp://psych.unl.edu/jtemplin/teaching/dcm/dcm12ncme/.
References
Templin, J., & Bradshaw, L. (2014). Hierarchical diagnostic classificationmodels: A family of models for estimating and testing attributehierarchies.Psychometrika, 79, 317-339.
Templin, J., & Hoffman, L. (2013).Obtaining diagnostic classification model estimates using Mplus.Educational Measurement: Issues and Practice, 32, 37-50.
See Also
GDINA::ecpe
Examples
## Not run: data(data.ecpe, package="CDM")dat <- data.ecpe$data[,-1]Q <- data.ecpe$q.matrix#*** Model 1: LCDM modelmod1 <- CDM::gdina( dat, q.matrix=Q, link="logit")summary(mod1)#*** Model 2: DINA modelmod2 <- CDM::gdina( dat, q.matrix=Q, rule="DINA")summary(mod2)# Model comparison using likelihood ratio testanova(mod1,mod2) ## Model loglike Deviance Npars AIC BIC Chisq df p ## 2 Model 2 -42841.61 85683.23 63 85809.23 86185.97 206.0359 18 0 ## 1 Model 1 -42738.60 85477.19 81 85639.19 86123.57 NA NA NA#*** Model 3: Hierarchical LCDM (HLCDM) | Templin and Bradshaw (2014)# Testing a linear hierarchyhier <- "skill3 > skill2 > skill1"skill.names <- colnames(Q)# define skill space with hierarchyskillspace <- CDM::skillspace.hierarchy( hier, skill.names=skill.names )skillspace$skillspace.reduced ## skill1 skill2 skill3 ## A000 0 0 0 ## A001 0 0 1 ## A011 0 1 1 ## A111 1 1 1zeroprob.skillclasses <- skillspace$zeroprob.skillclasses# define user-defined parameters in LCDM: hierarchical LCDM (HLCDM)Mj.user <- mod1$Mj# select items with require two attributesitems <- which( rowSums(Q) > 1 )# modify design matrix for item parametersfor (ii in items){ m1 <- Mj.user[[ii]] Mj.user[[ii]][[1]] <- (m1[[1]])[,-2] Mj.user[[ii]][[2]] <- (m1[[2]])[-2]}# estimate model# note that avoid.zeroprobs is set to TRUE to avoid algorithmic instabilitiesmod3 <- CDM::gdina( dat, q.matrix=Q, link="logit", zeroprob.skillclasses=zeroprob.skillclasses, Mj=Mj.user, avoid.zeroprobs=TRUE )summary(mod3)#*****************************************#** estimate further models#*** Model 4: RRUM modelmod4 <- CDM::gdina( dat, q.matrix=Q, rule="RRUM")summary(mod4)# compare some modelsIRT.compareModels(mod1, mod2, mod3, mod4 )#*** Model 5a: GDINA model with identity linkmod5a <- CDM::gdina( dat, q.matrix=Q, link="identity")summary(mod5a)#*** Model 5b: GDINA model with logit linkmod5b <- CDM::gdina( dat, q.matrix=Q, link="logit")summary(mod5b)#*** Model 5c: GDINA model with log linkmod5c <- CDM::gdina( dat, q.matrix=Q, link="log")summary(mod5c)# compare modelsIRT.compareModels(mod5a, mod5b, mod5c)## End(Not run)Fraction Subtraction Dataset with Different Subsets of Data and DifferentQ-Matrices
Description
Contains different sub-datasets of the fraction subtraction data of Tatsuokawith different Q-matrix specifications.
Usage
data(data.fraction1)data(data.fraction2)data(data.fraction3)data(data.fraction4)data(data.fraction5)Format
The dataset
data.fraction1is the fraction subtraction data set with536 students and 15 items. The Q-matrix was defined in de la Torre (2009).This dataset is a list with the dataset (data) andthe Q-matrix as entries.The format is:
List of 2$ data :'data.frame':..$ T01: int [1:536] 0 1 1 1 0 0 0 0 0 0 .....$ T02: int [1:536] 1 1 1 1 1 0 0 1 0 0 .....$ T03: int [1:536] 0 1 1 1 1 1 0 0 0 0 .....$ T04: int [1:536] 1 1 1 0 0 0 0 0 0 0 .....$ T05: int [1:536] 0 1 0 0 0 1 1 0 1 1 .....$ T06: int [1:536] 1 1 0 1 0 0 0 0 0 0 .....$ T07: int [1:536] 1 1 0 1 0 0 0 0 0 0 .....$ T08: int [1:536] 1 1 0 1 1 0 0 0 1 1 .....$ T09: int [1:536] 1 1 1 1 0 1 0 0 1 0 .....$ T10: int [1:536] 1 1 1 0 0 0 0 0 0 0 .....$ T11: int [1:536] 1 1 1 1 0 0 0 0 0 0 .....$ T12: int [1:536] 0 1 0 0 0 0 0 0 0 0 .....$ T13: int [1:536] 1 1 0 1 0 0 0 0 0 0 .....$ T14: int [1:536] 1 1 0 0 0 0 0 0 0 0 .....$ T15: int [1:536] 1 1 0 1 0 0 0 0 0 0 ...$ q.matrix: int [1:15, 1:5] 1 1 1 1 0 1 1 1 1 1 .....- attr(*, "dimnames")=List of 2.. ..$ : chr [1:15] "T01" "T02" "T03" "T04" ..... ..$ : chr [1:5] "QT1" "QT2" "QT3" "QT4" ...The dataset
data.fraction2is the fraction subtraction data setwith 536 students and 11 items. For this data set, severalQmatrices areavailable. The data is a list. The first entrydatacontains the data frame. The entryq.matrix1containsthe Q-matrix of Henson, Templin and Willse (2009).The third entryq.matrix2is an alternativeQ-matrix of de la Torre (2009). The fourth entry is amodified Q-matrix ofq.matrix1.The format is:
$ data :'data.frame':..$ H01: int [1:536] 1 1 1 1 1 0 0 1 0 0 .....$ H02: int [1:536] 1 1 1 0 0 0 0 0 0 0 .....$ H03: int [1:536] 0 1 0 0 0 1 1 0 1 1 .....$ H04: int [1:536] 1 1 0 1 0 0 0 0 0 0 .....$ H05: int [1:536] 1 1 0 1 0 0 0 0 0 0 .....$ H06: int [1:536] 1 1 0 1 1 0 0 0 1 1 .....$ H08: int [1:536] 1 1 1 0 0 0 0 0 0 0 .....$ H09: int [1:536] 1 1 1 1 0 0 0 0 0 0 .....$ H10: int [1:536] 0 1 0 0 0 0 0 0 0 0 .....$ H11: int [1:536] 1 1 0 1 0 0 0 0 0 0 .....$ H13: int [1:536] 1 1 0 1 0 0 0 0 0 0 ...$ q.matrix1: int [1:11, 1:3] 1 1 1 1 1 1 1 1 1 1 .....- attr(*, "dimnames")=List of 2.. ..$ : chr [1:11] "H01" "H02" "H03" "H04" ..... ..$ : chr [1:3] "QH1" "QH2" "QH3"$ q.matrix2: int [1:11, 1:5] 1 1 0 1 1 1 1 1 1 1 .....- attr(*, "dimnames")=List of 2.. ..$ : chr [1:11] "H01" "H02" "H03" "H04" ..... ..$ : chr [1:5] "QT1" "QT2" "QT3" "QT4" ...$ q.matrix3: num [1:11, 1:3] 0 0 0 1 0 0 0 0 1 1 .....- attr(*, "dimnames")=List of 2.. ..$ : chr [1:11] "H01" "H02" "H03" "H04" ..... ..$ : chr [1:3] "Dim1" "Dim2" "Dim3"The dataset
data.fraction3contains 12 items and wasused in de la Torre (2011).List of 2$ data :'data.frame': 536 obs. of 12 variables:..$ B01: int [1:536] 0 1 1 1 0 0 0 0 0 0 .....$ B02: int [1:536] 1 1 1 1 1 0 0 1 0 0 .....$ B03: int [1:536] 0 1 1 1 1 1 0 0 0 0 .....$ B04: int [1:536] 0 1 0 0 0 1 1 0 1 1 .....$ B05: int [1:536] 1 1 0 1 0 0 0 0 0 0 .....$ B06: int [1:536] 1 1 0 1 0 0 0 0 0 0 .....$ B07: int [1:536] 1 1 0 1 1 0 0 0 1 1 .....$ B08: int [1:536] 1 1 1 1 0 1 0 0 1 0 .....$ B09: int [1:536] 1 1 1 1 0 0 0 0 0 0 .....$ B10: int [1:536] 0 1 0 0 0 0 0 0 0 0 .....$ B11: int [1:536] 1 1 0 1 0 0 0 0 0 0 .....$ B12: int [1:536] 1 1 0 1 0 0 0 0 0 0 ...$ q.matrix:'data.frame': 12 obs. of 5 variables:..$ item: Factor w/ 13 levels "","B01","B02",..: 2 3 4 5 6 7 8 9 10 11 .....$ QA1 : int [1:12] 1 1 1 1 1 1 1 1 1 1 .....$ QA2 : int [1:12] 0 1 0 0 1 1 1 0 0 0 .....$ QA3 : int [1:12] 0 1 0 1 1 1 0 1 1 1 .....$ QA4 : int [1:12] 0 1 0 0 1 1 0 0 0 1 ...The dataset
data.fraction4contains 17 items and wasused in de la Torre and Douglas (2004) and Chen, Liu, Xu and Ying (2015).List of 2$ data :'data.frame': 536 obs. of 17 variables:..$ A01: int [1:536] 0 0 0 1 0 0 0 0 0 0 .....$ A02: int [1:536] 0 1 1 1 0 0 0 0 0 0 .....$ A03: int [1:536] 0 1 1 1 0 0 0 0 0 0 .....$ A04: int [1:536] 1 1 1 1 1 0 0 1 0 0 .....$ A05: int [1:536] 1 1 0 1 1 0 0 0 1 1 .....$ A06: int [1:536] 1 1 1 1 0 1 0 0 1 0 .....$ A07: int [1:536] 1 1 1 1 0 0 0 0 0 0 .....$ A08: int [1:536] 0 0 0 1 0 0 0 0 0 1 .....$ A09: int [1:536] 1 1 1 0 0 0 0 0 0 0 .....$ A10: int [1:536] 1 1 1 0 0 0 0 0 0 0 .....$ A11: int [1:536] 1 1 0 1 0 0 0 0 0 0 .....$ A12: int [1:536] 1 1 0 1 0 0 0 0 0 0 .....$ A13: int [1:536] 0 1 0 0 0 0 0 0 0 0 .....$ A14: int [1:536] 1 1 0 1 0 0 0 0 0 0 .....$ A15: int [1:536] 1 1 0 0 0 0 0 0 0 0 .....$ A16: int [1:536] 1 1 0 1 0 0 0 0 0 0 .....$ A17: int [1:536] 0 1 0 0 0 0 0 0 0 0 ...$ q.matrix:'data.frame': 17 obs. of 9 variables:..$ item: Factor w/ 18 levels "","A01","A02",..: 2 3 4 5 6 7 8 9 10 11 .....$ QA1 : int [1:17] 0 0 0 0 0 0 0 0 1 0 .....$ QA2 : int [1:17] 0 0 0 1 0 1 1 1 1 1 .....$ QA3 : int [1:17] 0 0 0 1 0 0 0 0 0 0 .....$ QA4 : int [1:17] 1 1 1 0 0 0 0 1 0 0 .....$ QA5 : int [1:17] 0 0 0 1 0 0 1 0 0 1 .....$ QA6 : int [1:17] 1 0 0 0 0 0 1 0 0 0 .....$ QA7 : int [1:17] 1 1 1 1 1 1 1 1 1 1 .....$ QA8 : int [1:17] 0 0 0 0 1 0 0 1 0 0 ...The dataset
data.fraction5contains 15 items and wasused as an example for the multiple strategy DINA model inde la Torre and Douglas (2008) and Hou and de la Torre (2014).The two Q-matrices for coding the multiple strategies are containedin one matrixq.matrixby joining the columns of both matrices.List of 2$ data :'data.frame': 536 obs. of 15 variables:..$ T01: int [1:536] 0 1 1 1 0 0 0 0 0 0 .....$ T02: int [1:536] 1 1 1 1 1 0 0 1 0 0 .....$ T03: int [1:536] 0 1 1 1 1 1 0 0 0 0 .....$ T04: int [1:536] 1 1 1 0 0 0 0 0 0 0 .....$ T05: int [1:536] 0 1 0 0 0 1 1 0 1 1 .....$ T06: int [1:536] 1 1 0 1 0 0 0 0 0 0 .....$ T07: int [1:536] 1 1 0 1 0 0 0 0 0 0 .....$ T08: int [1:536] 1 1 0 1 1 0 0 0 1 1 .....$ T09: int [1:536] 1 1 1 1 0 1 0 0 1 0 .....$ T10: int [1:536] 1 1 1 0 0 0 0 0 0 0 .....$ T11: int [1:536] 1 1 1 1 0 0 0 0 0 0 .....$ T12: int [1:536] 0 1 0 0 0 0 0 0 0 0 .....$ T13: int [1:536] 1 1 0 1 0 0 0 0 0 0 .....$ T14: int [1:536] 1 1 0 0 0 0 0 0 0 0 .....$ T15: int [1:536] 1 1 0 1 0 0 0 0 0 0 ...$ q.matrix:'data.frame': 15 obs. of 15 variables:..$ item: Factor w/ 16 levels "","T01","T02",..: 2 3 4 5 6 7 8 9 10 11 .....$ SA1 : int [1:15] 0 1 1 1 0 1 1 1 1 1 .....$ SA2 : int [1:15] 0 1 0 1 0 1 1 1 0 0 .....$ SA3 : int [1:15] 0 1 0 1 1 1 1 0 1 1 .....$ SA4 : int [1:15] 0 1 0 1 0 1 1 0 0 1 .....$ SA5 : int [1:15] 0 0 0 1 0 0 0 0 0 1 .....$ SA6 : int [1:15] 0 0 0 0 0 0 0 0 0 0 .....$ SA7 : int [1:15] 0 0 0 0 0 0 0 0 0 0 .....$ SB1 : int [1:15] 0 1 1 1 0 1 1 1 1 1 .....$ SB2 : int [1:15] 0 0 0 0 1 1 1 1 0 1 .....$ SB3 : int [1:15] 0 0 0 0 0 0 0 0 0 0 .....$ SB4 : int [1:15] 0 0 0 0 0 0 0 0 0 0 .....$ SB5 : int [1:15] 0 0 0 1 1 0 0 0 0 1 .....$ SB6 : int [1:15] 0 1 0 1 1 1 1 0 1 0 .....$ SB7 : int [1:15] 0 0 0 0 1 0 0 0 0 0 ...
Source
Seefraction.subtraction.data for more informationabout the data source.
References
Chen, Y., Liu, J., Xu, G. and Ying, Z. (2015).Statistical analysis of Q-matrix based diagnostic classification models.Journal of the American Statistical Association, 110(510),850-866.
de la Torre, J. (2009). DINA model parameter estimation:A didactic.Journal of Educational and BehavioralStatistics, 34, 115-130.
de la Torre, J. (2011). The generalized DINA model framework.Psychometrika, 76, 179-199.
de la Torre, J., & Douglas, J. A. (2004).Higher-order latent trait models for cognitive diagnosis.Psychometrika, 69, 333-353.
de la Torre, J., & Douglas, J. A. (2008).Model evaluation and multiple strategies in cognitive diagnosis:An analysis of fraction subtraction data.Psychometrika, 73, 595-624.
Henson, R. A., Templin, J. T., & Willse, J. T. (2009).Defining a family of cognitive diagnosis models usinglog-linear models with latent variables.Psychometrika, 74, 191-210.
Huo, Y., & de la Torre, J. (2014). Estimating a cognitive diagnostic model formultiple strategies via the EM algorithm.Applied Psychological Measurement, 38, 464-485.
See Also
GDINA::frac20
Datasetdata.hr (Ravand et al., 2013)
Description
Datasetdata.hr used for illustrating some functionalitiesof theCDM package (Ravand, Barati, & Widhiarso, 2013).
Usage
data(data.hr)Format
The format of the dataset is:
List of 2 $ data : num [1:1550, 1:35] 1 0 1 1 1 0 1 1 1 0 ... $ q.matrix:'data.frame': ..$ Skill1: int [1:35] 0 0 0 0 0 0 1 0 0 0 ... ..$ Skill2: int [1:35] 0 0 0 0 1 0 0 0 0 0 ... ..$ Skill3: int [1:35] 0 1 1 1 1 0 0 1 0 0 ... ..$ Skill4: int [1:35] 1 0 0 0 0 0 0 0 1 1 ... ..$ Skill5: int [1:35] 0 0 0 0 0 1 0 0 1 1 ...
Source
Simulated data according to Ravand et al. (2013).
References
Ravand, H., Barati, H., & Widhiarso, W. (2013). Exploring diagnostic capacityof a high stakes reading comprehension test: A pedagogical demonstration.Iranian Journal of Language Testing, 3(1), 1-27.
Examples
## Not run: data(data.hr, package="CDM")dat <- data.hr$dataQ <- data.hr$q.matrix#*************# Model 1: DINA modelmod1 <- CDM::din( dat, q.matrix=Q )summary(mod1) # summary# plot resultsplot(mod1)# inspect coefficientscoef(mod1)# posterior distributionposterior <- mod1$posteriorround( posterior[ 1:5, ], 4 ) # first 5 entries# estimate class probabilitiesmod1$attribute.patt# individual classificationsmod1$pattern[1:5,] # first 5 entries#*************# Model 2: GDINA modelmod2 <- CDM::gdina( dat, q.matrix=Q)summary(mod2)#*************# Model 3: Reduced RUM modelmod3 <- CDM::gdina( dat, q.matrix=Q, rule="RRUM" )summary(mod3)#--------# model comparisons# DINA vs GDINAanova( mod1, mod2 ) ## Model loglike Deviance Npars AIC BIC Chisq df p ## 1 Model 1 -31391.27 62782.54 101 62984.54 63524.49 195.9099 20 0 ## 2 Model 2 -31293.32 62586.63 121 62828.63 63475.50 NA NA NA# RRUM vs. GDINAanova( mod2, mod3 ) ## Model loglike Deviance Npars AIC BIC Chisq df p ## 2 Model 2 -31356.22 62712.43 105 62922.43 63483.76 125.7924 16 0 ## 1 Model 1 -31293.32 62586.64 121 62828.64 63475.50 NA NA NA# DINA vs. RRUManova(mod1,mod3) ## Model loglike Deviance Npars AIC BIC Chisq df p ## 1 Model 1 -31391.27 62782.54 101 62984.54 63524.49 70.11246 4 0 ## 2 Model 2 -31356.22 62712.43 105 62922.43 63483.76 NA NA NA#-------# model fit# DINAfmod1 <- CDM::modelfit.cor.din( mod1, jkunits=0)summary(fmod1) ## Test of Global Model Fit ## type value p ## 1 max(X2) 16.35495 0.03125 ## 2 abs(fcor) 0.10341 0.01416 ## ## Fit Statistics ## est ## MADcor 0.01911 ## SRMSR 0.02445 ## MX2 0.93157 ## 100*MADRESIDCOV 0.39100 ## MADQ3 0.02373# GDINAfmod2 <- CDM::modelfit.cor.din( mod2, jkunits=0)summary(fmod2) ## Test of Global Model Fit ## type value p ## 1 max(X2) 7.73670 1 ## 2 abs(fcor) 0.07215 1 ## ## Fit Statistics ## est ## MADcor 0.01830 ## SRMSR 0.02300 ## MX2 0.82584 ## 100*MADRESIDCOV 0.37390 ## MADQ3 0.02383# RRUMfmod3 <- CDM::modelfit.cor.din( mod3, jkunits=0)summary(fmod3) ## Test of Global Model Fit ## type value p ## 1 max(X2) 15.49369 0.04925 ## 2 abs(fcor) 0.10076 0.02201 ## ## Fit Statistics ## est ## MADcor 0.01868 ## SRMSR 0.02374 ## MX2 0.87999 ## 100*MADRESIDCOV 0.38409 ## MADQ3 0.02416## End(Not run)Dataset Jang (2009)
Description
Simulated dataset according to the Jang (2005) L2 reading comprehensionstudy.
Usage
data(data.jang)Format
The format is:
List of 2 $ data : num [1:1500, 1:37] 1 1 1 1 1 1 1 1 1 1 ... ..- attr(*, "dimnames")=List of 2 .. ..$ : NULL .. ..$ : chr [1:37] "I1" "I2" "I3" "I4" ... $ q.matrix:'data.frame': ..$ CDV: int [1:37] 1 0 0 1 0 0 0 0 0 0 ... ..$ CIV: int [1:37] 0 0 1 0 0 0 1 0 1 1 ... ..$ SSL: int [1:37] 1 1 1 1 0 0 0 0 0 0 ... ..$ TEI: int [1:37] 0 0 0 0 0 0 0 1 0 0 ... ..$ TIM: int [1:37] 0 0 0 1 1 1 0 0 0 0 ... ..$ INF: int [1:37] 0 1 0 0 0 0 1 0 0 0 ... ..$ NEG: int [1:37] 0 0 0 0 1 0 1 0 0 0 ... ..$ SUM: int [1:37] 0 0 0 0 1 0 0 0 0 0 ... ..$ MCF: int [1:37] 0 0 0 0 0 0 0 0 0 0 ...
Source
Simulated dataset.
References
Jang, E. E. (2009). Cognitive diagnostic assessment of L2 reading comprehensionability: Validity arguments for Fusion Model application to LanguEdge assessment.Language Testing, 26, 31-73.
Examples
## Not run: data(data.jang, package="CDM")data <- data.jang$dataq.matrix <- data.jang$q.matrix#*** Model 1: Reduced RUM modelmod1 <- CDM::gdina( data, q.matrix, rule="RRUM", conv.crit=.001, increment.factor=1.025 )summary(mod1)#*** Model 2: Additive model (identity link)mod2 <- CDM::gdina( data, q.matrix, rule="ACDM", conv.crit=.001, linkfct="identity" )summary(mod2)#*** Model 3: DINA modelmod3 <- CDM::gdina( data, q.matrix, rule="DINA", conv.crit=.001 )summary(mod3)anova(mod1,mod2) ## Model loglike Deviance Npars AIC BIC Chisq df p ## 1 Model 1 -30315.03 60630.06 153 60936.06 61748.98 88.29627 0 0 ## 2 Model 2 -30270.88 60541.76 153 60847.76 61660.68 NA NA NAanova(mod1,mod3) ## Model loglike Deviance Npars AIC BIC Chisq df p ## 2 Model 2 -30373.99 60747.97 129 61005.97 61691.38 117.9128 24 0 ## 1 Model 1 -30315.03 60630.06 153 60936.06 61748.98 NA NA NA# RRUMsummary( CDM::modelfit.cor.din( mod1, jkunits=0) ) ## type value p ## 1 max(X2) 11.79073 0.39645 ## 2 abs(fcor) 0.09541 0.07422 ## est ## MADcor 0.01834 ## SRMSR 0.02300 ## MX2 0.86718 ## 100*MADRESIDCOV 0.38690 ## MADQ3 0.02413# additive model (identity)summary( CDM::modelfit.cor.din( mod2, jkunits=0) ) ## type value p ## 1 max(X2) 9.78958 1.00000 ## 2 abs(fcor) 0.08770 0.22993 ## est ## MADcor 0.01721 ## SRMSR 0.02158 ## MX2 0.69163 ## 100*MADRESIDCOV 0.36343 ## MADQ3 0.02423# DINA modelsummary( CDM::modelfit.cor.din( mod3, jkunits=0) ) ## type value p ## 1 max(X2) 13.48449 0.16020 ## 2 abs(fcor) 0.10651 0.01256 ## est ## MADcor 0.01999 ## SRMSR 0.02495 ## MX2 0.92820 ## 100*MADRESIDCOV 0.42226 ## MADQ3 0.02258## End(Not run)MELAB Data (Li, 2011)
Description
This is a simulated dataset according to the MELAB readingstudy (Li, 2011; Li & Suen, 2013). Li (2011) investigated the Fusionmodel (RUM model) for calibrating this dataset. The dataset in this packageis simulated assuming the reduced RUM model (RRUM).
Usage
data(data.melab)Format
The format of the dataset is:
List of 3 $ data : num [1:2019, 1:20] 0 1 0 1 1 0 0 0 1 1 ... ..- attr(*, "dimnames")=List of 2 .. ..$ : NULL .. ..$ : chr [1:20] "I1" "I2" "I3" "I4" ... $ q.matrix :'data.frame': ..$ skill1: int [1:20] 1 1 0 0 1 1 0 1 0 1 ... ..$ skill2: int [1:20] 0 0 0 0 0 0 0 0 0 0 ... ..$ skill3: int [1:20] 0 0 0 1 0 1 1 0 1 0 ... ..$ skill4: int [1:20] 1 0 1 0 1 0 0 1 0 1 ... $ skill.labels:'data.frame': ..$ skill : Factor w/ 4 levels "skill1","skill2",..: 1 2 3 4 ..$ skill.label: Factor w/ 4 levels "connecting and synthesizing",..: 4 3 2 1
Source
Simulated data according to Li (2011).
References
Li, H. (2011). A cognitive diagnostic analysis of the MELAB reading test.Spaan Fellow, 9, 17-46.
Li, H., & Suen, H. K. (2013). Constructing and validating a Q-matrix forcognitive diagnostic analyses of a reading test.Educational Assessment, 18, 1-25.
Examples
## Not run: data(data.melab, package="CDM")data <- data.melab$dataq.matrix <- data.melab$q.matrix#*** Model 1: Reduced RUM modelmod1 <- CDM::gdina( data, q.matrix, rule="RRUM" )summary(mod1)#*** Model 2: GDINA modelmod2 <- CDM::gdina( data, q.matrix, rule="GDINA" )summary(mod2)#*** Model 3: DINA modelmod3 <- CDM::gdina( data, q.matrix, rule="DINA" )summary(mod3)#*** Model 4: 2PL modelmod4 <- CDM::gdm( data, theta.k=seq(-6,6,len=21), center )summary(mod4)#----# Model comparisons#*** RRUM vs. GDINAanova(mod1,mod2) ## Model loglike Deviance Npars AIC BIC Chisq df p ## 1 Model 1 -20252.74 40505.48 69 40643.48 41030.60 30.88801 18 0.02966 ## 2 Model 2 -20237.30 40474.59 87 40648.59 41136.69 NA NA NA ## -> GDINA is not superior to RRUM (according to AIC and BIC)#*** DINA vs. RRUManova(mod1,mod3) ## Model loglike Deviance Npars AIC BIC Chisq df p ## 2 Model 2 -20332.52 40665.04 55 40775.04 41083.61 159.5566 14 0 ## 1 Model 1 -20252.74 40505.48 69 40643.48 41030.60 NA NA NA ## -> RRUM fits the data significantly better than the DINA model#*** RRUM vs. 2PL (use only AIC and BIC for comparison)anova(mod1,mod4) ## Model loglike Deviance Npars AIC BIC Chisq df p ## 2 Model 2 -20390.19 40780.38 43 40866.38 41107.62 274.8962 26 0 ## 1 Model 1 -20252.74 40505.48 69 40643.48 41030.60 NA NA NA ## -> RRUM fits the data better than 2PL#----# Model fit statistics# RRUMfmod1 <- CDM::modelfit.cor.din( mod1, jkunits=0)summary(fmod1) ## Test of Global Model Fit ## type value p ## 1 max(X2) 10.10408 0.28109 ## 2 abs(fcor) 0.06726 0.24023 ## ## Fit Statistics ## est ## MADcor 0.01708 ## SRMSR 0.02158 ## MX2 0.96590 ## 100*MADRESIDCOV 0.27269 ## MADQ3 0.02781 ## -> not a significant misfit of the RRUM model# GDINAfmod2 <- CDM::modelfit.cor.din( mod2, jkunits=0)summary(fmod2) ## Test of Global Model Fit ## type value p ## 1 max(X2) 10.40294 0.23905 ## 2 abs(fcor) 0.06817 0.20964 ## ## Fit Statistics ## est ## MADcor 0.01703 ## SRMSR 0.02151 ## MX2 0.94468 ## 100*MADRESIDCOV 0.27105 ## MADQ3 0.02713## End(Not run)Large-Scale Dataset with Multiple Groups
Description
Large-scale dataset with multiple groups, survey weights and 11 polytomousitems.
Usage
data(data.mg)Format
A data frame with 38243 observations on the following 14 variables.
idstudStudent identifier
groupGroup identifier
weightSurvey weight
I1Item 1
I2Item 2
I3Item 3
I4Item 4
I5Item 5
I6Item 6
I7Item 7
I8Item 8
I9Item 9
I10Item 10
I11Item 11
Source
Subsample of a large-scale dataset of 11 survey questions.
Examples
## Not run: library(psych)data(dat.mg, package="CDM")psych::describe( data.mg ) ## > psych::describe(data.mg) ## var n mean sd median trimmed mad min max ## idstud 1 38243 1039653.91 19309.80 1037899.00 1039927.73 30240.59 1007168.00 1069949.00 ## group 2 38243 8.06 4.07 7.00 8.06 5.93 2.00 14.00 ## weight 3 38243 28.76 19.25 31.88 27.92 19.12 0.79 191.89 ## I1 4 37665 0.88 0.32 1.00 0.98 0.00 0.00 1.00 ## I2 5 37639 0.93 0.25 1.00 1.00 0.00 0.00 1.00 ## I3 6 37473 0.76 0.43 1.00 0.83 0.00 0.00 1.00 ## I4 7 37687 1.88 0.39 2.00 2.00 0.00 0.00 2.00 ## I5 8 37638 1.36 0.75 2.00 1.44 0.00 0.00 2.00 ## I6 9 37587 1.05 0.82 1.00 1.06 1.48 0.00 2.00 ## I7 10 37576 1.55 0.85 2.00 1.57 1.48 0.00 3.00 ## I8 11 37044 0.45 0.50 0.00 0.44 0.00 0.00 1.00 ## I9 12 37249 0.48 0.50 0.00 0.47 0.00 0.00 1.00 ## I10 13 37318 0.63 0.48 1.00 0.66 0.00 0.00 1.00 ## I11 14 37412 1.35 0.80 1.00 1.35 1.48 0.00 3.00## End(Not run)Dataset for Polytomous GDINA Model
Description
Dataset for the estimation of the polytomous GDINA model.
Usage
data(data.pgdina)Format
The dataset is a list with the item response data and the Q-matrix.The format is:
List of 2 $ dat : num [1:1000, 1:30] 1 1 1 1 1 0 1 1 1 1 ... ..- attr(*, "dimnames")=List of 2 .. ..$ : NULL .. ..$ : chr [1:30] "I1" "I2" "I3" "I4" ... $ q.matrix: num [1:30, 1:5] 1 0 0 0 0 1 0 0 0 2 ...
Details
The dataset was simulated by the followingR code:
set.seed(89)# define Q-matrixQmatrix <- matrix(c(1,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0, 1,1,2,0,0,0,0,1,2,0,0,0,0,1,2,0,0,0,0,1,1,2,0,0,0,1,2,2,0,1,0,2, 1,0,0,1,1,0,2,2,0,0,2,1,0,1,0,0,2,2,1,2,0,0,0,0,0,2,0,0,0,0,0,2, 0,0,0,0,0,2,0,0,0,0,0,1,2,0,2,0,0,0,2,0,2,0,0,0,2,0,1,2,0,0,2,0, 0,2,0,0,1,1,0,0,1,1,0,1,1,1,0,1,1,1,0,0,0,1,0,1,1,1,0,1,0,1), nrow=30, ncol=5, byrow=TRUE )# define covariance matrix between attributesSigma <- matrix(c(1,.6,.6,.3,.3,.6,1,.6,.3,.3,.6,.6,1, .3,.3,.3,.3,.3,1,.8,.3,.3,.3,.8,1), 5,5, byrow=TRUE )# define thresholds for attributesq1 <- c( -.5, .9 ) # attributes 1,...,4q2 <- c(0) # attribute 5# number of personsN <- 1000# simulate latent attributesalpha1 <- mvrnorm(n=N, mu=rep(0,5), Sigma=Sigma)alpha <- 0*alpha1for (aa in 1:4){ alpha[ alpha1[,aa] > q1[1], aa ] <- 1 alpha[ alpha1[,aa] > q1[2], aa ] <- 2 }aa <- 5 ; alpha[ alpha1[,aa] > q2[1], aa ] <- 1# define item parametersguess <- c(.07,.01,.34,.07,.11,.23,.27,.07,.08,.34,.19,.19,.25,.04,.34, .03,.29,.05,.01,.17,.15,.35,.19,.16,.08,.18,.19,.07,.17,.34)slip <- c(0,.11,.14,.09,.03,.09,.03,.1,.14,.07,.06,.19,.09,.19,.07,.08, .16,.18,.16,.02,.11,.12,.16,.14,.18,.01,.18,.14,.05,.18)# simulate item responsesI <- 30 # number of itemsdat <- latresp <- matrix( 0, N, I, byrow=TRUE)for (ii in 1:I){# ii <- 2 # latent response matrix latresp[,ii] <- 1*( rowMeans( alpha >=matrix( Qmatrix[ ii, ], nrow=N, ncol=5, byrow=TRUE ) )==1 ) # response probability prob <- ifelse( latresp[,ii]==1, 1-slip[ii], guess[ii] ) # simulate item responses dat[,ii] <- 1 * ( runif(N ) < prob ) }colnames(dat) <- paste0("I",1:I)
References
Chen, J., & de la Torre, J. (2013).A general cognitive diagnosis model for expert-defined polytomous attributes.Applied Psychological Measurement, 37, 419-437.
PISA 2000 Reading Study (Chen & de la Torre, 2014)
Description
This is a sub-dataset of the PISA 2000 of German studentsincluding 26 items of the reading test. The 26 items was analyzed inChen and de la Torre (2014) and a subset of 20 items wasanalyzed in Chen and Chen (2016).
Usage
data(data.pisa00R.ct)data(data.pisa00R.cc)Format
The format of the dataset
data.pisa00R.ct(Chen & de la Torre, 2014) is:List of 3$ data :'data.frame': 1095 obs. of 111 variables:.. [list output truncated]$ q.matrix: num [1:26, 1:8] 0 1 0 0 0 1 0 0 0 1 .....- attr(*, "dimnames")=List of 2$ skills : chr [1:8] "Locating information" ...The format of the dataset
data.pisa00R.cc(Q-matrix in Chen and Chen, 2016)List of 2$ q.matrix:'data.frame': 20 obs. of 5 variables:..$ A1: num [1:20] 1 1 0 0 1 1 1 0 0 0 .....$ A2: num [1:20] 0 0 0 1 0 1 1 1 1 1 .....$ A3: num [1:20] 1 1 0 1 1 0 1 0 1 0 .....$ A4: num [1:20] 0 1 1 1 0 0 0 0 0 0 .....$ A5: num [1:20] 0 0 1 0 0 0 0 1 0 1 ...$ skills : Named chr [1:5] "Identifying Explicit Information" .....- attr(*, "names")=chr [1:5] "A1" "A2" "A3" "A4" ...
References
Chen, H., & Chen, J. (2016). Exploring reading comprehension skillrelationships through the G-DINA model.Educational Psychology, 36(6), 1049-1064.
Chen, J., & de la Torre, J. (2014). A procedure for diagnostically modelingextant large-scale assessment data: the case of the programme for internationalstudent assessment in reading.Psychology, 5(18), 1967-1978.
Examples
############################################################################## EXAMPLE 1: PISA items from Chen and de la Torre (2014)# dichotomize item responses#############################################################################data(data.pisa00R.ct, package="CDM")dat <- data.pisa00R.ct$dataQ <- data.pisa00R.ct$q.matrixresp <- dat[, rownames(Q)]#** extract item-wise maximummaxK <- apply( resp, 2, max, na.rm=TRUE )#** dichotomize response dataresp1 <- respfor (ii in seq(1,ncol(resp)) ){ resp1[,ii] <- 1 * ( resp[,ii]==maxK[ii] )}Dataset SDA6 (Jurich & Bradshaw, 2014)
Description
This is a simulated dataset of the SDA6 study according to informationsgiven in Jurich and Bradshaw (2014).
Usage
data(data.sda6)Format
The datasets contains 17 items observed at 1710 students.
The format is:
List of 2 $ data : num [1:1710, 1:17] 0 1 0 1 0 0 0 0 1 0 ... ..- attr(*, "dimnames")=List of 2 .. ..$ : NULL .. ..$ : chr [1:17] "MCM01" "MCM03" "MCM13" "MCM17" ... $ q.matrix:'data.frame': ..$ CM: int [1:17] 1 1 1 1 0 0 0 0 0 0 ... ..$ II: int [1:17] 0 0 0 0 1 1 1 1 0 0 ... ..$ PP: int [1:17] 0 0 0 0 0 0 0 0 1 1 ... ..$ DG: int [1:17] 0 0 0 0 0 0 0 0 0 0 ...
The meaning of the skills is
CM – Critique Methods
II – Identify Improvements
PP – Protect Participants
DG – Discern Generalizability
Source
Simulated data
References
Jurich, D. P., & Bradshaw, L. P. (2014). An illustration of diagnosticclassification modeling in student learning outcomes assessment.International Journal of Testing, 14, 49-72.
Examples
## Not run: data(data.sda6, package="CDM")data <- data.sda6$dataq.matrix <- data.sda6$q.matrix#*** Model 1a: LCDM with gdinamod1a <- CDM::gdina( data, q.matrix, rule="ACDM", linkfct="logit", reduced.skillspace=FALSE )summary(mod1a)#*** Model 1b: estimate LCDM with gdmmod1b <- CDM::gdm( data, q.matrix=q.matrix, theta.k=c(0,1) )summary(mod1b)#*** Model 2: LCDM with hierarchy II > CMB <- "II > CM"ss2 <- CDM::skillspace.hierarchy(B=B, skill.names=colnames(q.matrix ) )mod2 <- CDM::gdina( data, q.matrix, rule="ACDM", linkfct="logit", skillclasses=ss2$skillspace.reduced, reduced.skillspace=FALSE )summary(mod2)#*** Model 3: LCDM with hierarchy II > CM and DG > CMB <- "II > CM DG > CM"ss2 <- CDM::skillspace.hierarchy(B=B, skill.names=colnames(q.matrix ) )mod3 <- CDM::gdina( data, q.matrix, rule="ACDM", linkfct="logit", skillclasses=ss2$skillspace.reduced, reduced.skillspace=FALSE )summary(mod3)# model comparisonsanova(mod1a,mod2)anova(mod1a,mod3)# model fitsummary( CDM::modelfit.cor.din(mod1a))summary( CDM::modelfit.cor.din(mod2) )summary( CDM::modelfit.cor.din(mod3) )## End(Not run)TIMSS 2003 Mathematics 8th Grade (Su et al., 2013)
Description
This is a dataset with a subset of 23 Mathematics items from TIMSS 2003 itemsused in Su, Choi, Lee, Choi and McAninch (2013).
Usage
data(data.timss03.G8.su)Format
The data contains scored item responses (data),the Q-matrix (q.matrix) and further item informations (iteminfo).
The format is
List of 3 $ data :'data.frame': ..$ idstud : num [1:757] 1e+07 1e+07 1e+07 1e+07 1e+07 ... ..$ idbook : num [1:757] 1 1 1 1 1 1 1 1 1 1 ... ..$ M012001 : num [1:757] 0 1 0 0 1 0 1 0 0 0 ... ..$ M012002 : num [1:757] 1 1 0 1 0 0 1 1 1 1 ... ..$ M012004 : num [1:757] 0 1 1 1 1 0 1 1 0 0 ... [...] ..$ M022234B: num [1:757] 0 0 0 0 0 0 0 0 0 0 ... ..$ M022251 : num [1:757] 0 0 0 0 0 0 0 0 0 0 ... ..$ M032570 : num [1:757] 1 1 0 1 0 0 1 1 1 1 ... ..$ M032643 : num [1:757] 1 0 0 0 0 0 1 1 0 0 ... $ q.matrix: int [1:23, 1:13] 1 0 0 0 0 0 1 0 0 0 ... ..- attr(*, "dimnames")=List of 2 .. ..$ : chr [1:23] "M012001" "M012002" "M012004" "M012016" ... .. ..$ : chr [1:13] "S1" "S2" "S3" "S4" ... $ iteminfo: chr [1:23, 1:9] "M012001" "M012002" "M012004" "M012016" ... ..- attr(*, "dimnames")=List of 2 .. ..$ : NULL .. ..$ : chr [1:9] "item" "ItemType" "reporting_category" "content" ...
For a detailed description of skillsS1,S2, ...,S15see Su et al. (2013, Table 2).
Source
Subset of US 8th graders (Booklet 1) in the TIMSS 2003 mathematics study
References
Skaggs, G., Wilkins, J. L. M., & Hein, S. F. (2016).Grain size and parameter recovery with TIMSS and the general diagnostic model.International Journal of Testing, 16(4), 310-330.
Su, Y.-L., Choi, K. M., Lee, W.-C., Choi, T., & McAninch, M. (2013).Hierarchical cognitive diagnostic analysis for TIMSS 2003 mathematics.CASMA Research Report 35. Center for Advanced Studies inMeasurement and Assessment (CASMA), University of Iowa.
See Also
The TIMSS 2003 dataset for 8th graders (with a larger number of items)was also analyzed in Skaggs, Wilkins and Hein (2016).
Examples
## Not run: ############################################################################## EXAMPLE 1: Data Su et al. (2013)#############################################################################data(data.timss03.G8.su, package="CDM")data <- data.timss03.G8.su$data[,-c(1,2)]q.matrix <- data.timss03.G8.su$q.matrix#*** Model 1: DINA model with complete skill space of 2^13=8192 skill classesmod1 <- CDM::din( data, q.matrix )#*** Model 2: Skill space approximation with 3000 skill classes instead of# 2^13=8192 classes as in Model 1ss2 <- CDM::skillspace.approximation( L=3000, K=ncol(q.matrix) )mod2 <- CDM::din( data, q.matrix, skillclasses=ss2 )#*** Model 3: DINA model with a hierarchical skill space# see Su et al. (2013): Fig. 6B <- "S1 > S2 > S7 > S8 S15 > S9 S3 > S9 S13 > S4 > S9 S14 > S5 > S6 > S11"# Note that S10 and S12 are not included in the dataset contained in this packageskill.names <- colnames(q.matrix)ss3 <- CDM::skillspace.hierarchy(B=B, skill.names=skill.names)# The reduced skill space "only" contains 325 skill classesmod3 <- CDM::din( data, q.matrix, skillclasses=ss3$skillspace.reduced )## End(Not run)TIMSS 2007 Mathematics 4th Grade (Lee et al., 2011)
Description
TIMSS 2007 (Grade 4) dataset with 25 mathematics (dichotomized) items usedin Lee, Park and Taylan (2011), Park and Lee (2014) and Park, Xing andLee (2018). The dataset includes a sample of 698 Austrian students.
Usage
data(data.timss07.G4.lee)data(data.timss07.G4.py)data(data.timss07.G4.Qdomains)Format
The dataset
data.timss07.G4.leeis alist containing dichotomous item responses (data;information on booklet and gender included),the Q-matrix (q.matrix) and descriptionsof the skills (skillinfo) used in Lee et al. (2011).The format is:
List of 3$ data :'data.frame':..$ idstud : int [1:698] 10110 10111 20105 20106 30203 30204 40106 40107 60111 60112 .....$ idbook : int [1:698] 4 5 4 5 4 5 4 5 4 5 .....$ girl : int [1:698] 0 0 1 1 0 1 0 1 1 1 .....$ M041052 : num [1:698] 1 NA 1 NA 0 NA 1 NA 1 NA .....$ M041056 : num [1:698] 1 NA 0 NA 0 NA 0 NA 1 NA .....$ M041069 : num [1:698] 0 NA 0 NA 0 NA 0 NA 1 NA .....$ M041076 : num [1:698] 1 NA 0 NA 1 NA 1 NA 0 NA .....$ M041281 : num [1:698] 1 NA 0 NA 1 NA 1 NA 0 NA .....$ M041164 : num [1:698] 1 NA 1 NA 0 NA 1 NA 1 NA .....$ M041146 : num [1:698] 0 NA 0 NA 1 NA 1 NA 0 NA .....$ M041152 : num [1:698] 1 NA 1 NA 1 NA 0 NA 1 NA .....$ M041258A: num [1:698] 0 NA 1 NA 1 NA 0 NA 1 NA .....$ M041258B: num [1:698] 1 NA 0 NA 1 NA 0 NA 1 NA .....$ M041131 : num [1:698] 0 NA 0 NA 1 NA 1 NA 1 NA .....$ M041275 : num [1:698] 1 NA 0 NA 0 NA 1 NA 1 NA .....$ M041186 : num [1:698] 1 NA 0 NA 1 NA 1 NA 0 NA .....$ M041336 : num [1:698] 1 NA 1 NA 0 NA 1 NA 0 NA .....$ M031303 : num [1:698] 1 1 0 1 0 1 1 1 0 0 .....$ M031309 : num [1:698] 1 0 1 1 1 1 1 1 0 0 .....$ M031245 : num [1:698] 0 0 0 0 0 0 0 0 0 0 .....$ M031242A: num [1:698] 1 1 0 1 1 1 1 1 0 0 .....$ M031242B: num [1:698] 0 1 0 1 1 1 1 1 1 0 .....$ M031242C: num [1:698] 1 1 0 1 1 1 1 1 1 0 .....$ M031247 : num [1:698] 0 0 0 0 0 0 0 0 0 0 .....$ M031219 : num [1:698] 1 1 1 0 1 1 1 1 1 0 .....$ M031173 : num [1:698] 1 1 0 0 0 1 1 1 1 0 .....$ M031085 : num [1:698] 1 0 0 1 1 1 0 0 0 1 .....$ M031172 : num [1:698] 1 0 0 1 1 1 1 1 1 0 ...$ q.matrix : int [1:25, 1:15] 1 0 0 0 0 0 0 1 0 0 .....- attr(*, "dimnames")=List of 2.. ..$ : chr [1:25] "M041052" "M041056" "M041069" "M041076" ..... ..$ : chr [1:15] "NWN01" "NWN02" "NWN03" "NWN04" ...$ skillinfo:'data.frame':..$ skillindex : int [1:15] 1 2 3 4 5 6 7 8 9 10 .....$ skill : Factor w/ 15 levels "DOR15","DRI13",..: 12 13 14 15 8 9 10 11 4 6 .....$ content : Factor w/ 3 levels "D","G","N": 3 3 3 3 3 3 3 3 2 2 .....$ content_label : Factor w/ 3 levels "Data Display",..: 3 3 3 3 3 3 3 3 2 2 .....$ subcontent : Factor w/ 9 levels "FD","LA","LM",..: 9 9 9 9 1 1 4 6 2 8 .....$ subcontent_label: Factor w/ 9 levels "Fractions and Decimals",..: 9 9 9 9 1 1 4 6 2 8 ...The dataset
data.timss07.G4.pyuses the same items asdata.timss07.G4.leebut employs a simplified Q-matrix with 7 skills.This Q-matrix was used in Park and Lee (2014) and Park et al. (2018).List of 3$ q.matrix:'data.frame': 25 obs. of 7 variables:..$ N1: num [1:25] 1 0 1 1 1 0 0 1 0 0 .....$ N2: num [1:25] 0 1 1 1 0 0 0 0 0 0 .....$ N3: num [1:25] 0 0 0 0 1 0 0 0 0 0 .....$ G4: num [1:25] 0 0 0 0 0 0 1 0 0 1 .....$ G5: num [1:25] 0 0 0 0 0 1 1 1 1 1 .....$ G6: num [1:25] 0 0 0 0 0 1 1 0 0 0 .....$ D7: num [1:25] 0 0 0 0 0 0 0 0 0 0 ...$ domains : Named chr [1:3] "Number" "Geometric Shapes and Measures" "Data Display"..- attr(*, "names")=chr [1:3] "N" "G" "D"$ skills : Named chr [1:7] "Whole Numbers" .....- attr(*, "names")=chr [1:7] "N1" "N2" "N3" "G4" ...The Q-matrix
data.timss07.G4.Qdomainsis a simplificationofdata.timss07.G4.py$q.matrixto 3 domains and involves asimple structure of skills.num [1:25, 1:3] 1 1 1 1 1 0 0 1 0 0 ...- attr(*, "dimnames")=List of 2..$ : chr [1:25] "M041052" "M041056" "M041069" "M041076" .....$ : chr [1:3] "N" "G" "D"
Source
TIMSS 2007 study, 4th Grade, Austrian sample on booklets 4 and 5
References
Lee, Y. S., Park, Y. S., & Taylan, D. (2011).A cognitive diagnostic modeling of attribute mastery in Massachusetts,Minnesota, and the US national sample using the TIMSS 2007.International Journal of Testing, 11, 144-177.
Park, Y. S., & Lee, Y. S. (2014). An extension of the DINA model usingcovariates: Examining factors affecting response probability and latentclassification.Applied Psychological Measurement, 38(5), 376-390.
Park, Y. S., Xing, K., & Lee, Y. S. (2018). Explanatory cognitivediagnostic models: Incorporating latent and observed predictors.Applied Psychological Measurement, 42(5), 376-392.
Yamaguchi, K., & Okada, K. (2018). Comparison among cognitive diagnostic modelsfor the TIMSS 2007 fourth grade mathematics assessment.PloS ONE, 13(2), e0188691.
See Also
A comparison of several countries based on the 25 items is conducted inYamaguchi and Okada (2018).
Examples
## Not run: ############################################################################## EXAMPLE 1: DINA model Lee et al. (2011) - 15 skills#############################################################################data(data.timss07.G4.lee, package="CDM")dat <- data.timss07.G4.lee$dataq.matrix <- data.timss07.G4.lee$q.matrix# extract itemsitems <- grep( "M0", colnames(dat), value=TRUE )#*** Model 1: estimate DINA modelmod1 <- CDM::din( dat[,items], q.matrix )summary(mod1)############################################################################## EXAMPLE 2: DINA models Park and Lee (2014) - 7 skills and 3 skills#############################################################################data(data.timss07.G4.lee, package="CDM")data(data.timss07.G4.py, package="CDM")data(data.timss07.G4.Qdomains, package="CDM")dat <- data.timss07.G4.lee$dataq.matrix <- data.timss07.G4.py$q.matrixitems <- rownames(q.matrix)#*** Model 1: estimate DINA modelmod1 <- CDM::din( dat[,items], q.matrix )summary(mod1)#*** Model 2: estimate DINA model with Q-matrix defined by domainsQ <- data.timss07.G4.Qdomainsmod2 <- CDM::din( dat[,items], q.matrix=Q )summary(mod2)## End(Not run)TIMSS 2011 Mathematics 4th Grade Austrian Students
Description
This is the TIMSS 2011 dataset of 4668 Austrian fourth-graders.See George and Robitzsch (2014, 2015, 2018) for publications using theTIMSS 2011 dataset for cognitive diagnosis modeling. The dataset hasalso been analyzed by Sedat and Arican (2015).
Usage
data(data.timss11.G4.AUT)data(data.timss11.G4.AUT.part)data(data.timss11.G4.sa)Format
The format of the dataset
data.timss11.G4.AUTis:List of 4$ data :'data.frame':..$ uidschool: int [1:4668] 10040001 10040001 10040001 10040001 10040001 10040001 10040001 10040001 10040001 10040001 .....$ uidstud : num [1:4668] 1e+13 1e+13 1e+13 1e+13 1e+13 .....$ IDCNTRY : int [1:4668] 40 40 40 40 40 40 40 40 40 40 .....$ IDBOOK : int [1:4668] 10 12 13 14 1 2 3 4 5 6 .....$ IDSCHOOL : int [1:4668] 1 1 1 1 1 1 1 1 1 1 .....$ IDCLASS : int [1:4668] 102 102 102 102 102 102 102 102 102 102 .....$ IDSTUD : int [1:4668] 10201 10203 10204 10205 10206 10207 10208 10209 10210 10211 .....$ TOTWGT : num [1:4668] 17.5 17.5 17.5 17.5 17.5 .....$ HOUWGT : num [1:4668] 1.04 1.04 1.04 1.04 1.04 .....$ SENWGT : num [1:4668] 0.111 0.111 0.111 0.111 0.111 .....$ SCHWGT : num [1:4668] 11.6 11.6 11.6 11.6 11.6 .....$ STOTWGTU : num [1:4668] 524 524 524 524 524 .....$ WGTADJ1 : int [1:4668] 1 1 1 1 1 1 1 1 1 1 .....$ WGTFAC1 : num [1:4668] 11.6 11.6 11.6 11.6 11.6 .....$ JKCREP : int [1:4668] 1 1 1 1 1 1 1 1 1 1 .....$ JKCZONE : int [1:4668] 1 1 1 1 1 1 1 1 1 1 .....$ female : int [1:4668] 1 0 1 1 1 1 1 1 0 0 .....$ M031346A : int [1:4668] NA NA NA 1 1 NA NA NA NA NA .....$ M031346B : int [1:4668] NA NA NA 0 0 NA NA NA NA NA .....$ M031346C : int [1:4668] NA NA NA 1 1 NA NA NA NA NA .....$ M031379 : int [1:4668] NA NA NA 0 0 NA NA NA NA NA .....$ M031380 : int [1:4668] NA NA NA 0 0 NA NA NA NA NA .....$ M031313 : int [1:4668] NA NA NA 1 1 NA NA NA NA NA ..... [list output truncated]$ q.matrix1:'data.frame':..$ item : Factor w/ 174 levels "M031004","M031009",..: 29 30 31 32 33 25 8 5 17 163 .....$ Co_DA: int [1:174] 0 0 0 0 0 0 0 0 0 0 .....$ Co_DK: int [1:174] 0 0 0 0 0 0 0 0 0 0 .....$ Co_DR: int [1:174] 0 0 0 0 0 0 0 0 0 0 .....$ Co_GA: int [1:174] 0 0 0 0 0 0 0 0 0 0 .....$ Co_GK: int [1:174] 0 0 0 0 0 0 1 1 0 0 .....$ Co_GR: int [1:174] 0 0 0 0 0 0 0 0 0 0 .....$ Co_NA: int [1:174] 1 0 0 0 0 1 0 0 0 1 .....$ Co_NK: int [1:174] 0 0 0 0 0 0 0 0 0 0 .....$ Co_NR: int [1:174] 0 1 1 1 1 0 0 0 1 0 ...$ q.matrix2:'data.frame':..$ item : Factor w/ 174 levels "M031004","M031009",..: 29 30 31 32 33 25 8 5 17 163 .....$ CONT_D: int [1:174] 0 0 0 0 0 0 0 0 0 0 .....$ CONT_G: int [1:174] 0 0 0 0 0 0 1 1 0 0 .....$ CONT_N: int [1:174] 1 1 1 1 1 1 0 0 1 1 ...$ q.matrix3:'data.frame': 174 obs. of 4 variables:..$ item : Factor w/ 174 levels "M031004","M031009",..: 29 30 31 32 33 25 8 5 17 163 .....$ COGN_A: int [1:174] 1 0 0 0 0 1 0 0 0 1 .....$ COGN_K: int [1:174] 0 0 0 0 0 0 1 1 0 0 .....$ COGN_R: int [1:174] 0 1 1 1 1 0 0 0 1 0 ...The dataset
data.timss11.G4.AUT.partis a part ofdata.timss11.G4.AUTand contains only the firstthree booklets (with N=1010 students). The format isList of 4$ data :'data.frame': 1010 obs. of 109 variables:..$ uidschool: int [1:1010] 10040001 10040001 10040001 10040001 .....$ uidstud : num [1:1010] 1e+13 1e+13 1e+13 1e+13 1e+13 .....$ IDCNTRY : int [1:1010] 40 40 40 40 40 40 40 40 40 40 .....$ IDBOOK : int [1:1010] 1 2 3 1 2 1 2 3 1 2 .....$ IDSCHOOL : int [1:1010] 1 1 1 1 1 2 2 2 3 3 .....$ IDCLASS : int [1:1010] 102 102 102 102 102 .....$ IDSTUD : int [1:1010] 10206 10207 10208 10220 .....$ TOTWGT : num [1:1010] 17.5 17.5 17.5 17.5 17.5 .....$ HOUWGT : num [1:1010] 1.04 1.04 1.04 1.04 1.04 .....$ SENWGT : num [1:1010] 0.111 0.111 0.111 0.111 0.111 .....$ SCHWGT : num [1:1010] 11.6 11.6 11.6 11.6 11.6 .....$ STOTWGTU : num [1:1010] 524 524 524 524 524 .....$ WGTADJ1 : int [1:1010] 1 1 1 1 1 1 1 1 1 1 .....$ WGTFAC1 : num [1:1010] 11.6 11.6 11.6 11.6 11.6 .....$ JKCREP : int [1:1010] 1 1 1 1 1 0 0 0 0 0 .....$ JKCZONE : int [1:1010] 1 1 1 1 1 1 1 1 2 2 .....$ female : int [1:1010] 1 1 1 1 0 1 1 1 1 1 .....$ M031346A : int [1:1010] 1 NA NA 1 NA 1 NA NA 1 NA .....$ M031346B : int [1:1010] 0 NA NA 1 NA 0 NA NA 0 NA .....$ M031346C : int [1:1010] 1 NA NA 0 NA 0 NA NA 0 NA .....$ M031379 : int [1:1010] 0 NA NA 0 NA 0 NA NA 1 NA .....$ M031380 : int [1:1010] 0 NA NA 0 NA 0 NA NA 0 NA .....$ M031313 : int [1:1010] 1 NA NA 0 NA 1 NA NA 0 NA .....$ M031083 : int [1:1010] 1 NA NA 1 NA 1 NA NA 1 NA .....$ M031071 : int [1:1010] 0 NA NA 0 NA 1 NA NA 0 NA .....$ M031185 : int [1:1010] 0 NA NA 1 NA 0 NA NA 0 NA .....$ M051305 : int [1:1010] 1 1 NA 1 0 0 0 NA 0 1 .....$ M051091 : int [1:1010] 1 1 NA 1 1 1 1 NA 1 0 ..... [list output truncated]$ q.matrix1:'data.frame': 47 obs. of 10 variables:..$ item : Factor w/ 174 levels "M031004","M031009",..: 29 30 31 32 33 25 8 5 17 163 .....$ Co_DA: int [1:47] 0 0 0 0 0 0 0 0 0 0 .....$ Co_DK: int [1:47] 0 0 0 0 0 0 0 0 0 0 .....$ Co_DR: int [1:47] 0 0 0 0 0 0 0 0 0 0 .....$ Co_GA: int [1:47] 0 0 0 0 0 0 0 0 0 0 .....$ Co_GK: int [1:47] 0 0 0 0 0 0 1 1 0 0 .....$ Co_GR: int [1:47] 0 0 0 0 0 0 0 0 0 0 .....$ Co_NA: int [1:47] 1 0 0 0 0 1 0 0 0 1 .....$ Co_NK: int [1:47] 0 0 0 0 0 0 0 0 0 0 .....$ Co_NR: int [1:47] 0 1 1 1 1 0 0 0 1 0 ...$ q.matrix2:'data.frame': 47 obs. of 4 variables:..$ item : Factor w/ 174 levels "M031004","M031009",..: 29 30 31 32 33 25 8 5 17 163 .....$ CONT_D: int [1:47] 0 0 0 0 0 0 0 0 0 0 .....$ CONT_G: int [1:47] 0 0 0 0 0 0 1 1 0 0 .....$ CONT_N: int [1:47] 1 1 1 1 1 1 0 0 1 1 ...$ q.matrix3:'data.frame': 47 obs. of 4 variables:..$ item : Factor w/ 174 levels "M031004","M031009",..: 29 30 31 32 33 25 8 5 17 163 .....$ COGN_A: int [1:47] 1 0 0 0 0 1 0 0 0 1 .....$ COGN_K: int [1:47] 0 0 0 0 0 0 1 1 0 0 .....$ COGN_R: int [1:47] 0 1 1 1 1 0 0 0 1 0 ...The dataset
data.timss11.G4.sacontains the Q-matrixused in Sedat and Arican (2015).List of 2$ q.matrix:'data.frame': 31 obs. of 13 variables:..$ N1 : num [1:31] 1 0 0 1 1 0 0 0 0 0 .....$ N2 : num [1:31] 1 1 0 0 1 0 0 0 0 0 .....$ N3 : num [1:31] 0 0 0 0 1 0 0 0 0 0 .....$ A4 : num [1:31] 0 0 1 0 0 1 1 1 0 0 .....$ A5 : num [1:31] 0 0 0 0 0 1 0 1 0 0 .....$ A6 : num [1:31] 0 0 0 0 0 0 0 0 0 0 .....$ A7 : num [1:31] 0 0 1 0 0 0 0 0 0 0 .....$ G8 : num [1:31] 0 0 0 0 0 0 0 0 1 1 .....$ G9 : num [1:31] 0 0 0 0 0 0 0 0 1 1 .....$ G10: num [1:31] 0 0 0 0 0 0 0 0 1 1 .....$ G11: num [1:31] 0 0 0 0 0 1 0 0 0 0 .....$ D12: num [1:31] 0 0 0 0 0 0 0 0 0 0 .....$ D13: num [1:31] 0 0 0 0 0 0 0 0 0 0 ...$ skills : Named chr [1:13] "Possesses understanding of" __truncated__ .....- attr(*, "names")=chr [1:13] "N1" "N2" "N3" "A4" ...
References
George, A. C., & Robitzsch, A. (2014). Multiple group cognitive diagnosis models,with an emphasis on differential item functioning.Psychological Test and Assessment Modeling, 56(4), 405-432.
George, A. C., & Robitzsch, A. (2015) Cognitive diagnosis models in R: A didactic.The Quantitative Methods for Psychology, 11, 189-205.
George, A. C., & Robitzsch, A. (2018). Focusing on interactions between contentand cognition: A new perspective on gender differences in mathematicalsub-competencies.Applied Measurement in Education, 31(1), 79-97.
Sedat, S. E. N., & Arican, M. (2015). A diagnostic comparison of Turkish andKorean students' Mathematics performances on the TIMSS 2011 assessment.Journal of Measurement and Evaluation in Education and Psychology, 6(2),238-253.
Variance Matrix of a Nonlinear Estimator Using the Delta Method
Description
Computes the variance of a nonlinear parameter using thedelta method.
Usage
deltaMethod(derived.pars, est, Sigma, h=1e-05)Arguments
derived.pars | Vector of derived parameters written inR formula framework(see Examples). |
est | Vector of parameter estimates |
Sigma | Covariance matrix of parameters |
h | Numerical differentiation parameter |
Value
coef | Vector of nonlinear parameters |
vcov | Covariance matrix of nonlinear parameters |
se | Vector of standard errors |
A | First derivative of nonlinear transformation |
univarTest | Data frame containing univariate summary ofnonlinear parameters |
WaldTest | Multivariate parameter test for nonlinear parameter |
See Also
Seecar::deltaMethod ormsm::deltamethod.
Examples
############################################################################## EXAMPLE 1: Nonlinear parameter##############################################################################-- parameter estimateest <- c( 510.67, 102.57)names(est) <- c("mu", "sigma")#-- covariance matrixSigma <- matrix( c(5.83, 0.45, 0.45, 3.21 ), nrow=2, ncol=2 )colnames(Sigma) <- rownames(Sigma) <- names(est)#-- define derived nonlinear parametersderived.pars <- list( "d"=~ I( ( mu - 508 ) / sigma ), "dsig"=~ I( sigma / 100 - 1) )#*** apply delta methodres <- CDM::deltaMethod( derived.pars, est, Sigma )resParameter Estimation for Mixed DINA/DINO Model
Description
din provides parameter estimation for cognitivediagnosis models of the types “DINA”, “DINO” and “mixed DINAand DINO”.
Usage
din(data, q.matrix, skillclasses=NULL, conv.crit=0.001, dev.crit=10^(-5), maxit=500, constraint.guess=NULL, constraint.slip=NULL, guess.init=rep(0.2, ncol(data)), slip.init=guess.init, guess.equal=FALSE, slip.equal=FALSE, zeroprob.skillclasses=NULL, weights=rep(1, nrow(data)), rule="DINA", wgt.overrelax=0, wgtest.overrelax=FALSE, param.history=FALSE, seed=0, progress=TRUE, guess.min=0, slip.min=0, guess.max=1, slip.max=1)## S3 method for class 'din'print(x, ...)Arguments
data | A required |
q.matrix | A required binary |
skillclasses | An optional matrix for determining the skill space.The argument can be used if a user wants less than |
conv.crit | A numeric which defines the termination criterionof iterations in the parameterestimation process. Iteration ends if the maximal change in parameterestimates is below this value. |
dev.crit | A numeric value which defines the termination criterionof iterations in relative change in deviance. |
maxit | An integer which defines the maximum numberof iterations in the estimation process. |
constraint.guess | An optional matrix of fixed guessingparameters. The first column of this matrix indicates the numbers of theitems whose guessing parameters are fixed and the secondcolumn the values the guessing parameters are fixed to. |
constraint.slip | An optional matrix of fixed slippingparameters. The first column of this matrix indicates the numbers of theitems whose slipping parameters are fixed and the second columnthe values the slipping parameters are fixed to. |
guess.init | An optional initial vector of guessing parameters.Guessing parameters are bounded between 0 and 1. |
slip.init | An optional initial vector of slipping parameters.Slipping parameters are bounded between 0 and 1. |
guess.equal | An optional logical indicating if all guessing parametersare equal to each other. Default is |
slip.equal | An optional logical indicating if all slipping parametersare equal to each other. Default is |
zeroprob.skillclasses | An optional vector of integers which indicateswhich skill classes should have zero probability. Default is |
weights | An optional vector of weights for the response pattern.Non-integer weights allow for different sampling schemes. |
rule | An optional character string or vector of character stringsspecifying the model rule that is used. The character strings must beof |
wgt.overrelax | A parameter which is relevant when an overrelaxationalgorithm is used |
wgtest.overrelax | A logical which indicates if the overrelexationparameter being estimated during iterations |
param.history | A logical which indicates if the parameter history duringiterations should be saved. The default is |
seed | Simulation seed for initial parameters. A value of zero correspondsto deterministic starting values, an integer value different fromzero to random initial values with |
progress | An optional logical indicating whether the functionshould print the progress of iteration in the estimation process. |
guess.min | Minimum value of guessing parameters to be estimated. |
slip.min | Minimum value of slipping parameters to be estimated. |
guess.max | Maximum value of guessing parameters to be estimated. |
slip.max | Maximum value of slipping parameters to be estimated. |
x | Object of class |
... | Further arguments to be passed |
Details
In the CDM DINA (deterministic-input, noisy-and-gate; de la Torre &Douglas, 2004) and DINO (deterministic-input, noisy-or-gate; Templin &Henson, 2006) models endorsement probabilities are modeledbased on guessing and slipping parameters, given the different skillclasses. The probability of respondentn (or corresponding respondents classn)for solving itemj is calculated as a function of therespondent's latent response\eta_{nj}and the guessing and slipping ratesg_j ands_j for itemj conditional on the respondent's skill class\alpha_n:
P(X_{nj}=1 | \alpha_n)=g_j^{(1- \eta_{nj})}(1 - s_j) ^{\eta_{nj}}.
The respondent's latent response (class)\eta_{nj} is a binary number,0 or 1, where 1 indicates presence of all (rule="DINO")or at least one (rule="DINO") required skill(s) foritemj, respectively.
DINA and DINO parameter estimation is performed bymaximization of the marginal likelihood of the data. Thea priori distribution of the skill vectors is a uniform distribution.The implementation follows the EM algorithm by de la Torre (2009).
The functiondin returns an object of the classdin (see ‘Value’), for whichplot,print, andsummary methods are provided;plot.din,print.din, andsummary.din, respectively.
Value
coef | Estimated model parameters. Note that only freelyestimated parameters are included. |
item | A data frame giving for each item condensation rule, theestimated guessing and slipping parameters and their standard errors.All entries are rounded to 3 digits. |
guess | A data frame giving the estimated guessing parametersand their standard errors for each item. |
slip | A data frame giving the estimated slipping parametersand their standard errors for each item. |
IDI | A matrix giving the item discriminationindex (IDI; Lee, de la Torre & Park, 2012) for each item
where a high IDI corresponds to good test itemswhich have both low guessing and slipping rates. Note thata negative IDI indicates violation of the monotonicity condition |
itemfit.rmsea | The RMSEA item fit index (see |
mean.rmsea | Mean of RMSEA item fit indexes. |
loglike | A numeric giving the value of the maximizedlog likelihood. |
AIC | A numeric giving the AIC value of the model. |
BIC | A numeric giving the BIC value of the model. |
Npars | Number of estimated parameters |
posterior | A matrix given the posterior skill distributionfor all respondents. The nth row of the matrix gives the probabilities forrespondent n to possess any of the |
like | A matrix giving the values of the maximized likelihoodfor all respondents. |
data | The input matrix of binary response data. |
q.matrix | The input matrix of the required attributes. |
pattern | A matrix giving the skill classes leading to highest endorsementprobability for the respective response pattern ( |
attribute.patt | A data frame giving the estimated occurrenceprobabilities of the skill classes and the expected frequency of theattribute classes given the model. |
skill.patt | A matrix given the population prevalences of theskills. |
subj.pattern | A vector of strings indicating the item responsepattern for each subject. |
attribute.patt.splitted | A dataframe giving the skill classof the respondents. |
display | A character giving the model specified under |
item.patt.split | A matrix giving the splitted response pattern. |
item.patt.freq | A numeric vector given the frequencies of the responsepattern in |
seed | Used simulation seed for initial parameters |
partable | Parameter table which is used for |
vcov.derived | Design matrix for extended set of parameters in |
converged | Logical indicating whether convergence was achieved. |
control | Optimization parameters used in estimation |
Note
The calculation of standard errors using sampling weights whichrepresent multistage sampling schemes is not correct. Please usereplication methods (like Jackknife) instead.
References
de la Torre, J. (2009). DINA model parameter estimation:A didactic.Journal of Educational and BehavioralStatistics, 34, 115–130.
de la Torre, J., & Douglas, J. (2004). Higher-order latent trait modelsfor cognitive diagnosis.Psychometrika, 69, 333–353.
Lee, Y.-S., de la Torre, J., & Park, Y. S. (2012).Relationships between cognitive diagnosis, CTT, and IRT indices:An empirical investigation.Asia Pacific Educational Research, 13, 333-345.
Rupp, A. A., Templin, J., & Henson, R. A. (2010).DiagnosticMeasurement: Theory, Methods, and Applications. New York: The GuilfordPress.
Templin, J., & Henson, R. (2006). Measurement ofpsychological disorders using cognitive diagnosismodels.Psychological Methods, 11, 287–305.
See Also
plot.din, the S3 method for plotting objects ofthe classdin;print.din, the S3 methodfor printing objects of the classdin;summary.din, the S3 method for summarizing objectsof the classdin, which creates objects of the classsummary.din;din, the main function forDINA and DINO parameter estimation, which creates objects of the classdin.
See thegdina function for the estimation ofthe generalized DINA (GDINA) model.
For assessment of model fit seemodelfit.cor.din andanova.din.
Seeitemfit.sx2 for item fit statistics.
Seediscrim.index for computing discrimination indices.
See alsoCDM-package for generalinformation about this package.
See theNPCD::JMLE function in theNPCD package forjoint maximum likelihood estimationof the DINA, DINO and NIDA model.
See thedina::DINA_Gibbs function in thedinapackage for MCMC based estimation of the DINA model.
Examples
############################################################################## EXAMPLE 1: Examples based on dataset fractions.subtraction.data############################################################################### dataset fractions.subtraction.data and corresponding Q-Matrixhead(fraction.subtraction.data)fraction.subtraction.qmatrix## Misspecification in parameter specification for method CDM::din()## leads to warnings and terminates estimation procedure. E.g.,# See Q-Matrix specificationfractions.dina.warning1 <- CDM::din(data=fraction.subtraction.data, q.matrix=t(fraction.subtraction.qmatrix))# See guess.init specificationfractions.dina.warning2 <- CDM::din(data=fraction.subtraction.data, q.matrix=fraction.subtraction.qmatrix, guess.init=rep(1.2, ncol(fraction.subtraction.data)))# See rule specificationfractions.dina.warning3 <- CDM::din(data=fraction.subtraction.data, q.matrix=fraction.subtraction.qmatrix, rule=c(rep("DINA", 10), rep("DINO", 9)))## Parameter estimation of DINA model# rule="DINA" is defaultfractions.dina <- CDM::din(data=fraction.subtraction.data, q.matrix=fraction.subtraction.qmatrix, rule="DINA")attributes(fractions.dina)str(fractions.dina)## For instance assessing the guessing parameters through## assignmentfractions.dina$guess## corresponding summaries, including IDI,## most frequent skill classes and information## criteria AIC and BICsummary(fractions.dina)## In particular, assessing detailed summary through assignmentdetailed.summary.fs <- summary(fractions.dina)str(detailed.summary.fs)## Item discrimination index of item 8 is too low. This is also## visualized in the first plotplot(fractions.dina)## The reason therefore is a high guessing parameterround(fractions.dina$guess[,1], 2)## Estimate DINA model with different random initial parameters using seed=1345fractions.dina1 <- CDM::din(data=fraction.subtraction.data, q.matrix=fraction.subtraction.qmatrix, rule="DINA", seed=1345)## Fix the guessing parameters of items 5, 8 and 9 equal to .20# define a constraint.guess matrixconstraint.guess <- matrix(c(5,8,9, rep(0.2, 3)), ncol=2)fractions.dina.fixed <- CDM::din(data=fraction.subtraction.data, q.matrix=fraction.subtraction.qmatrix, constraint.guess=constraint.guess)## The second plot shows the expected (MAP) and observed skill## probabilities. The third plot visualizes the skill class## occurrence probabilities; Only the 'top.n.skill.classes' most frequent## skill classes are labeled; it is obvious that the skill class '11111111'## (all skills are mastered) is the most probable in this population.## The fourth plot shows the skill probabilities conditional on response## patterns; in this population the skills 3 and 6 seem to be## mastered easier than the others. The fourth plot shows the## skill probabilities conditional on a specified response## pattern; it is shown whether a skill is mastered (above## .5+'uncertainty') unclassifiable (within the boundaries) or## not mastered (below .5-'uncertainty'). In this case, the## 527th respondent was chosen; if no response pattern is## specified, the plot will not be shown (of course)pattern <- paste(fraction.subtraction.data[527, ], collapse="")plot(fractions.dina, pattern=pattern, display.nr=4)#uncertainty=0.1, top.n.skill.classes=6 are defaultplot(fractions.dina.fixed, uncertainty=0.1, top.n.skill.classes=6, pattern=pattern)## Not run: ############################################################################## EXAMPLE 2: Examples based on dataset sim.dina############################################################################## DINA Modeld1 <- CDM::din(sim.dina, q.matr=sim.qmatrix, rule="DINA", conv.crit=0.01, maxit=500, progress=TRUE)summary(d1)# DINA model with hierarchical skill classes (Hierarchical DINA model)# 1st step: estimate an initial full model to look at the indexing# of skill classesd0 <- CDM::din(sim.dina, q.matr=sim.qmatrix, maxit=1)d0$attribute.patt.splitted# [,1] [,2] [,3]# [1,] 0 0 0# [2,] 1 0 0# [3,] 0 1 0# [4,] 0 0 1# [5,] 1 1 0# [6,] 1 0 1# [7,] 0 1 1# [8,] 1 1 1## In this example, following hierarchical skill classes are only allowed:# 000, 001, 011, 111# We define therefore a vector of indices for skill classes with# zero probabilities (see entries in the rows of the matrix# d0$attribute.patt.splitted above)zeroprob.skillclasses <- c(2,3,5,6) # classes 100, 010, 110, 101# estimate the hierarchical DINA modeld1a <- CDM::din(sim.dina, q.matr=sim.qmatrix, zeroprob.skillclasses=zeroprob.skillclasses )summary(d1a)# Mixed DINA and DINO Modeld1b <- CDM::din(sim.dina, q.matr=sim.qmatrix, rule= c(rep("DINA", 7), rep("DINO", 2)), conv.crit=0.01, maxit=500, progress=FALSE)summary(d1b)# DINO Modeld2 <- CDM::din(sim.dina, q.matr=sim.qmatrix, rule="DINO", conv.crit=0.01, maxit=500, progress=FALSE)summary(d2)# Comparison of DINA and DINO estimateslapply(list("guessing"=rbind("DINA"=d1$guess[,1], "DINO"=d2$guess[,1]), "slipping"=rbind("DINA"= d1$slip[,1], "DINO"=d2$slip[,1])), round, 2)# Comparison of the information criteriac("DINA"=d1$AIC, "MIXED"=d1b$AIC, "DINO"=d2$AIC)# following estimates:d1$coef # guessing and slipping parameterd1$guess # guessing parameterd1$slip # slipping parameterd1$skill.patt # probabilities for skillsd1$attribute.patt # skill classes with probabilitiesd1$subj.pattern # pattern per subject# posterior probabilities for every response patternd1$posterior# Equal guessing parametersd2a <- CDM::din( data=sim.dina, q.matrix=sim.qmatrix, guess.equal=TRUE, slip.equal=FALSE )d2a$coef# Equal guessing and slipping parametersd2b <- CDM::din( data=sim.dina, q.matrix=sim.qmatrix, guess.equal=TRUE, slip.equal=TRUE )d2b$coef############################################################################## EXAMPLE 3: Examples based on dataset sim.dino############################################################################## DINO Estimationd3 <- CDM::din(sim.dino, q.matr=sim.qmatrix, rule="DINO", conv.crit=0.005, progress=FALSE)# Mixed DINA and DINO Modeld3b <- CDM::din(sim.dino, q.matr=sim.qmatrix, rule=c(rep("DINA", 4), rep("DINO", 5)), conv.crit=0.001, progress=FALSE)# DINA Estimationd4 <- CDM::din(sim.dino, q.matr=sim.qmatrix, rule="DINA", conv.crit=0.005, progress=FALSE)# Comparison of DINA and DINO estimateslapply(list("guessing"=rbind("DINO"=d3$guess[,1], "DINA"=d4$guess[,1]), "slipping"=rbind("DINO"=d3$slip[,1], "DINA"=d4$slip[,1])), round, 2)# Comparison of the information criteriac("DINO"=d3$AIC, "MIXED"=d3b$AIC, "DINA"=d4$AIC)############################################################################## EXAMPLE 4: Example estimation with weights based on dataset sim.dina############################################################################## Here, a weighted maximum likelihood estimation is used# This could be useful for survey data.# i.e. first 200 persons have weight 2, the other have weight 1(weights <- c(rep(2, 200), rep(1, 200)))d5 <- CDM::din(sim.dina, sim.qmatrix, rule="DINA", conv.crit= 0.005, weights=weights, progress=FALSE)# Comparison of the information criteriac("DINA"=d1$AIC, "WEIGHTS"=d5$AIC)############################################################################## EXAMPLE 5: Example estimation within a balanced incomplete## block (BIB) design generated on dataset sim.dina############################################################################## generate BIB data# The next example shows that the din function works for# (relatively arbitrary) missing value pattern# Here, a missing by design is generated in the dataset dinadat.bibsim.dina.bib <- sim.dinasim.dina.bib[1:100, 1:3] <- NAsim.dina.bib[101:300, 4:8] <- NAsim.dina.bib[301:400, c(1,2,9)] <- NAd6 <- CDM::din(sim.dina.bib, sim.qmatrix, rule="DINA", conv.crit=0.0005, weights=weights, maxit=200)d7 <- CDM::din(sim.dina.bib, sim.qmatrix, rule="DINO", conv.crit=0.005, weights=weights)# Comparison of DINA and DINO estimateslapply(list("guessing"=rbind("DINA"=d6$guess[,1], "DINO"=d7$guess[,1]), "slipping"=rbind("DINA"= d6$slip[,1], "DINO"=d7$slip[,1])), round, 2)############################################################################## EXAMPLE 6: DINA model with attribute hierarchy#############################################################################set.seed(987)# assumed skill distribution: P(000)=P(100)=P(110)=P(111)=.245 and# "deviant pattern": P(010)=.02K <- 3 # number of skills# define alphaalpha <- scan() 0 0 0 1 0 0 1 1 0 1 1 1 0 1 0alpha <- matrix( alpha, length(alpha)/K, K, byrow=TRUE )alpha <- alpha[ c( rep(1:4,each=245), rep(5,20) ), ]# define Q-matrixq.matrix <- scan() 1 0 0 1 0 0 1 0 0 0 1 0 0 1 0 0 1 0 0 0 1 0 1 0 0 0 1 1 1 0 1 0 1 0 1 1q.matrix <- matrix( q.matrix, nrow=length(q.matrix)/K, ncol=K, byrow=TRUE )# simulate DINA datadat <- CDM::sim.din( alpha=alpha, q.matrix=q.matrix )$dat#*** Model 1: estimate DINA model | no skill space restrictionmod1 <- CDM::din( dat, q.matrix )#*** Model 2: DINA model | hierarchy A2 > A3B <- "A2 > A3"skill.names <- paste0("A",1:3)skillspace <- CDM::skillspace.hierarchy( B, skill.names )$skillspace.reducedmod2 <- CDM::din( dat, q.matrix, skillclasses=skillspace )#*** Model 3: DINA model | linear hierarchy A1 > A2 > A3# This is a misspecied model because due to P(010)=.02 the relation A1>A2# does not hold.B <- "A1 > A2 A2 > A3"skill.names <- paste0("A",1:3)skillspace <- CDM::skillspace.hierarchy( B, skill.names )$skillspace.reducedmod3 <- CDM::din( dat, q.matrix, skillclasses=skillspace )#*** Model 4: 2PL model in gdmmod4 <- CDM::gdm( dat, theta.k=seq(-5,5,len=21), decrease.increments=TRUE, skillspace="normal" )summary(mod4)anova(mod1,mod2) ## Model loglike Deviance Npars AIC BIC Chisq df p ## 2 Model 2 -7052.460 14104.92 29 14162.92 14305.24 0.9174 2 0.63211 ## 1 Model 1 -7052.001 14104.00 31 14166.00 14318.14 NA NA NAanova(mod2,mod3) ## Model loglike Deviance Npars AIC BIC Chisq df p ## 2 Model 2 -7059.058 14118.12 27 14172.12 14304.63 13.19618 2 0.00136 ## 1 Model 1 -7052.460 14104.92 29 14162.92 14305.24 NA NA NAanova(mod2,mod4) ## Model loglike Deviance Npars AIC BIC Chisq df p ## 2 Model 2 -7220.05 14440.10 24 14488.10 14605.89 335.1805 5 0 ## 1 Model 1 -7052.46 14104.92 29 14162.92 14305.24 NA NA NA# compare fit statisticssummary( CDM::modelfit.cor.din( mod2 ) )summary( CDM::modelfit.cor.din( mod4 ) )############################################################################## EXAMPLE 7: Fitting the basic local independence model (BLIM) with din#############################################################################library(pks)data(DoignonFalmagne7, package="pks") ## str(DoignonFalmagne7) ## $ K : int [1:9, 1:5] 0 1 0 1 1 1 1 1 1 0 ... ## ..- attr(*, "dimnames")=List of 2 ## .. ..$ : chr [1:9] "00000" "10000" "01000" "11000" ... ## .. ..$ : chr [1:5] "a" "b" "c" "d" ... ## $ N.R: Named int [1:32] 80 92 89 3 2 1 89 16 18 10 ... ## ..- attr(*, "names")=chr [1:32] "00000" "10000" "01000" "00100" ...# The idea is to fit the local independence model with the din function.# This can be accomplished by specifying a DINO model with# prespecified skill classes.# extract datasetdat <- as.numeric( unlist( sapply( names(DoignonFalmagne7$N.R), FUN=function( ll){ strsplit( ll, split="") } ) ) )dat <- matrix( dat, ncol=5, byrow=TRUE )colnames(dat) <- colnames(DoignonFalmagne7$K)rownames(dat) <- names(DoignonFalmagne7$N.R)# sample weightsweights <- DoignonFalmagne7$N.R# define Q-matrixq.matrix <- t(DoignonFalmagne7$K)v1 <- colnames(q.matrix) <- paste0("S", colnames(q.matrix))q.matrix <- q.matrix[, - 1] # remove S00000# define skill classesSC <- ncol(q.matrix)skillclasses <- matrix( 0, nrow=SC+1, ncol=SC)colnames(skillclasses) <- colnames(q.matrix)rownames(skillclasses) <- v1skillclasses[ cbind( 2:(SC+1), 1:SC ) ] <- 1# estimate BLIM with din functionmod1 <- CDM::din(data=dat, q.matrix=q.matrix, skillclasses=skillclasses, rule="DINO", weights=weights )summary(mod1) ## Item parameters ## item guess slip IDI rmsea ## a a 0.158 0.162 0.680 0.011 ## b b 0.145 0.159 0.696 0.009 ## c c 0.008 0.181 0.811 0.001 ## d d 0.012 0.129 0.859 0.001 ## e e 0.025 0.146 0.828 0.007# estimate basic local independence model with pks packagemod2 <- pks::blim(K, N.R, method="ML") # maximum likelihood estimation by EM algorithmmod2 ## Error and guessing parameters ## beta eta ## a 0.164871 0.103065 ## b 0.163113 0.095074 ## c 0.188839 0.000004 ## d 0.079835 0.000003 ## e 0.088648 0.019910## End(Not run)Deterministic Classification and Joint Maximum Likelihood Estimationof the Mixed DINA/DINO Model
Description
This function allows the estimation of the mixed DINA/DINO model byjoint maximum likelihood and a deterministic classification basedon ideal latent responses.
Usage
din.deterministic(dat, q.matrix, rule="DINA", method="JML", conv=0.001, maxiter=300, increment.factor=1.05, progress=TRUE)Arguments
dat | Data frame of dichotomous item responses |
q.matrix | Q-matrix with binary entries (see |
rule | The condensation rule (see |
method | Estimation method. The default is joint maximum likelihood estimation( |
conv | Convergence criterion for guessing and slipping parameters |
maxiter | Maximum number of iterations |
increment.factor | A numeric value of at least one which could help to improve convergencebehavior and decreases parameter increments in every iteration. This option isdisabled by setting this argument to 1. |
progress | An optional logical indicating whether the functionshould print the progress of iteration in the estimation process. |
Value
A list with following entries
attr.est | Estimated attribute patterns |
criterion | Criterion of the classification function.For joint maximum likelihood it is the deviance. |
guess | Estimated guessing parameters |
slip | Estimated slipping parameters |
prederror | Average individual prediction error |
q.matrix | Used Q-matrix |
dat | Used data frame |
References
Chiu, C. Y., & Douglas, J. (2013). A nonparametric approach tocognitive diagnosis by proximity to ideal response patterns.Journal of Classification, 30, 225-250.
See Also
For estimating the mixed DINA/DINO model using marginal maximumlikelihood estimation seedin.
See also theNPCD::JMLE function in theNPCD package forjoint maximum likelihood estimation of the DINA or the DINO model.
Examples
############################################################################## EXAMPLE 1: 13 items and 3 attributes#############################################################################set.seed(679)N <- 3000# specify true Q-matrixq.matrix <- matrix( 0, 13, 3 )q.matrix[1:3,1] <- 1q.matrix[4:6,2] <- 1q.matrix[7:9,3] <- 1q.matrix[10,] <- c(1,1,0)q.matrix[11,] <- c(1,0,1)q.matrix[12,] <- c(0,1,1)q.matrix[13,] <- c(1,1,1)q.matrix <- rbind( q.matrix, q.matrix )colnames(q.matrix) <- paste0("Attr",1:ncol(q.matrix))# simulate data according to the DINA modeldat <- CDM::sim.din( N=N, q.matrix)$dat# Joint maximum likelihood estimation (the default: method="JML")res1 <- CDM::din.deterministic( dat, q.matrix )# Adaptive estimation of guessing and slipping parametersres <- CDM::din.deterministic( dat, q.matrix, method="adaptive" )# Classification using Hamming distanceres <- CDM::din.deterministic( dat, q.matrix, method="hamming" )# Classification using weighted Hamming distanceres <- CDM::din.deterministic( dat, q.matrix, method="weighted.hamming" )## Not run: #********* load NPCD library for JML estimationlibrary(NPCD)# DINA modelres <- NPCD::JMLE( Y=dat[1:100,], Q=q.matrix, model="DINA" )as.data.frame(res$par.est ) # item parametersres$alpha.est # skill classifications# RRUM modelres <- NPCD::JMLE( Y=dat[1:100,], Q=q.matrix, model="RRUM" )as.data.frame(res$par.est )## End(Not run)Calculation of Equivalent Skill Classes in the DINA/DINO Model
Description
This function computes indistinguishable skill classes for the DINA andDINO model (Gross & George, 2014; Zhang, DeCarlo & Ying, 2013).
Usage
din.equivalent.class(q.matrix, rule="DINA")Arguments
q.matrix | The Q-matrix (see |
rule | The condensation rule. If it is a string, then the rule appliesto all items. If it is a vector, then for each item |
Value
A list with following entries:
latent.responseM | Matrix of latent responses |
latent.response | Latent responses represented as a string |
S | Matrix containing all skill classes |
gini | Gini coefficient of the frequency distribution ofidentifiable skill classes which result in the same latent response |
skillclasses | Data frame with skill class ( |
References
Gross, J. & George, A. C. (2014). On prerequisite relations betweenattributes in noncompensatory diagnostic classification.Methodology, 10(3), 100-107.
Zhang, S. S., DeCarlo, L. T., & Ying, Z. (2013).Non-identifiability, equivalence classes, and attribute-specific classificationin Q-matrix based cognitive diagnosis models.arXiv preprint,arXiv:1303.0426.
Examples
############################################################################## EXAMPLE 1: Equivalency classes for DINA model for fraction subtraction data##############################################################################-- DINA modelsdata(data.fraction2, package="CDM")# first Q-matrixQ1 <- data.fraction2$q.matrix1m1 <- CDM::din.equivalent.class( q.matrix=Q1, rule="DINA" ) ## 8 Skill classes | 5 distinguishable skill classes | Gini coefficient=0.3# second Q-matrixQ1 <- data.fraction2$q.matrix2m1 <- CDM::din.equivalent.class( q.matrix=Q1, rule="DINA" ) ## 32 Skill classes | 9 distinguishable skill classes | Gini coefficient=0.5# third Q-matrixQ1 <- data.fraction2$q.matrix3m1 <- CDM::din.equivalent.class( q.matrix=Q1, rule="DINA" ) ## 8 Skill classes | 8 distinguishable skill classes | Gini coefficient=0# original fraction subtraction datam1 <- CDM::din.equivalent.class( q.matrix=CDM::fraction.subtraction.qmatrix, rule="DINA") ## 256 Skill classes | 58 distinguishable skill classes | Gini coefficient=0.659Q-Matrix Validation (Q-Matrix Modification) for Mixed DINA/DINO Model
Description
Q-matrix entries can be modified by the Q-matrix validation methodof de la Torre (2008). After estimating a mixed DINA/DINO modelusing thedin function, item parameters and the itemdiscrimination parametersIDI_j are recalculated. Q-matrix rowsare determined by maximizing the estimated item discrimination indexIDI_j=1-s_j -g_j.
Usage
din.validate.qmatrix(object, IDI_diff=.02, print=TRUE)Arguments
object | Object of class |
IDI_diff | Minimum difference in IDI values for choosing a new Q-matrix vector |
print | An optional logical indicating whether the functionshould print the progress of iteration in the estimation process. |
Value
A list with following entries:
coef.modified | Estimated parameters by applying Q-matrixmodifications |
coef.modified.short | A shortened matrix of |
q.matrix.prop | The proposed Q-matrix by Q-matrix validation. |
References
Chiu, C. Y. (2013). Statistical refinement of the Q-matrix in cognitivediagnosis.Applied Psychological Measurement, 37, 598-618.
de la Torre, J. (2008). An empirically based method of Q-matrixvalidation for the DINA model: Development and applications.Journal of Educational Measurement, 45, 343-362.
See Also
The mixed DINA/DINO model can be estimated withdin.
See Chiu (2013) for an alternative estimation approach based onresidual sum of squares which is implementedNPCD::Qrefine function in theNPCD package.
See theGDINA::Qval function in theGDINA package for extended functionality.
Examples
############################################################################## EXAMPLE 1: Detection of a mis-specified Q-matrix#############################################################################set.seed(679)# specify true Q-matrixq.matrix <- matrix( 0, 12, 3 )q.matrix[1:3,1] <- 1q.matrix[4:6,2] <- 1q.matrix[7:9,3] <- 1q.matrix[10,] <- c(1,1,0)q.matrix[11,] <- c(1,0,1)q.matrix[12,] <- c(0,1,1)# simulate datadat <- CDM::sim.din( N=4000, q.matrix)$dat# incorrectly modify Q-matrix rows 1 and 10Q1 <- q.matrixQ1[1,] <- c(1,1,0)Q1[10,] <- c(1,0,0)# estimate DINA modelmod <- CDM::din( dat, q.matr=Q1, rule="DINA")# apply Q-matrix validationres <- CDM::din.validate.qmatrix( mod ) ## item itemindex Skill1 Skill2 Skill3 guess slip IDI qmatrix.orig IDI.orig delta.IDI max.IDI ## I001 1 1 0 0 0.309 0.251 0.440 0 0.431 0.009 0.440 ## I010 10 1 1 0 0.235 0.329 0.437 0 0.320 0.117 0.437 ## I010 10 1 1 1 0.296 0.301 0.403 0 0.320 0.083 0.437 ## ## Proposed Q-matrix: ## ## Skill1 Skill2 Skill3 ## Item1 1 0 0 ## Item2 1 0 0 ## Item3 1 0 0 ## Item4 0 1 0 ## Item5 0 1 0 ## Item6 0 1 0 ## Item7 0 0 1 ## Item8 0 0 1 ## Item9 0 0 1 ## Item10 1 1 0 ## Item11 1 0 1 ## Item12 0 1 1## Not run: #*****************# Q-matrix estimation ('Qrefine') in the NPCD package# See Chiu (2013, APM).#*****************library(NPCD)Qrefine.out <- NPCD::Qrefine( dat, Q1, gate="AND", max.ite=50)print(Qrefine.out) ## The modified Q-matrix ## Attribute 1 Attribute 2 Attribute 3 ## Item 1 1 0 0 ## Item 2 1 0 0 ## Item 3 1 0 0 ## Item 4 0 1 0 ## Item 5 0 1 0 ## Item 6 0 1 0 ## Item 7 0 0 1 ## Item 8 0 0 1 ## Item 9 0 0 1 ## Item 10 1 1 0 ## Item 11 1 0 1 ## Item 12 0 1 1 ## ## The modified entries ## Item Attribute ## [1,] 1 2 ## [2,] 10 2plot(Qrefine.out)## End(Not run)Identifiability Conditions of the DINA Model
Description
Check necessary and sufficient identifiability conditions of the DINA modelaccording Gu and Xu (xxxx) for a given Q-matrix.
Usage
din_identifiability(q.matrix)## S3 method for class 'din_identifiability'summary(object, ...)Arguments
q.matrix | Q-matrix |
object | Object of class |
... | Further arguments to be passed |
Value
List with values
dina_identified | Logical indicating whether the DINA model is identified |
index_single | Condition 1: vector of logicals indicating whether skillsare measured by at least one item with a single loading |
is_three_items | Condition 2: vector of logicals indicating whether skillsare measured by at least three items |
submat_distinct | Condition 3: logical indicating whether all columnsof the submatrix |
References
Gu, Y., & Xu, G. (2018). The sufficient and necessary condition for the identifiabilityand estimability of the DINA model.Psychometrika, xx(xx), xxx-xxx.https://doi.org/10.1007/s11336-018-9619-8
See Also
Seedin.equivalent.class for equivalent (i.e., non-distinguishable)skill classes in the DINA model.
Examples
############################################################################## EXAMPLE 1: Some examples of Gu and Xu (2019)##############################################################################* Matrix 1 in Equation (5) of Gu & Xu (2019)Q1 <- diag(3)Q2 <- matrix( scan(text="1 1 0 1 0 1 1 1 1 1 1 1"), ncol=3, byrow=TRUE)Q <- rbind(Q1, Q2)res <- CDM::din_identifiability(q.matrix=Q)summary(res)# remove two itemsres <- CDM::din_identifiability(q.matrix=Q[-c(2,5),])summary(res)#* Matrix 1 in Equation (6) of Gu & Xu (2019)Q1 <- diag(3)Q2 <- matrix( c(1,1,1), nrow=4, ncol=3, byrow=TRUE)Q <- rbind(Q1, Q2)res <- CDM::din_identifiability(q.matrix=Q)summary(res)Discrimination Indices at Item-Attribute, Item and Test Level
Description
Computes discrimination indices at the probability metric(de la Torre, 2008; Henson, DiBello & Stout, 2018).
Usage
discrim.index(object, ...)## S3 method for class 'din'discrim.index(object, ...)## S3 method for class 'gdina'discrim.index(object, ...)## S3 method for class 'mcdina'discrim.index(object, ...)## S3 method for class 'discrim.index'summary(object, file=NULL, digits=3, ...)Arguments
object | |
file | Optional file name for a file in which the summaryoutput should be sunk |
digits | Number of digits for rounding |
... | Further arguments to be passed |
Details
If itemj possessesH_j categories, the item-attributespecific discrimination for attributekaccording to Henson et al. (2018) is defined as
DI_{jk}=\frac{1}{2} \max_{ \bm{\alpha} }\left( \sum_{h=1}^{H_j} | P(X_j=h| \bm{\alpha} ) -P(X_j=h| \bm{\alpha}^{(-k)} ) |\right )
where\bm{\alpha}^{(-k)} and\bm{\alpha} differ onlyin attributek. The indexDI_{jk} can be found as thevaluediscrim_item_attribute. The test-level discrimination indexis defined as
\overline{DI}=\frac{1}{J} \sum_{j=1}^J \max_k DI_{jk}
and can be foundindiscrim_test.
According to de la Torre (2008) and de la Torre, Rossi and van der Ark (2018),the item discrimination index (IDI) is defined as
IDI_j=\max_{ \bm{\alpha}_1,\bm{\alpha}_2, h} | P(X_j=h| \bm{\alpha}_1 ) - P(X_j=h| \bm{\alpha}_2 ) |
and can be found asidi in the values list.
Value
A list with following entries
discrim_item_attribute | Discrimination indices |
idi | Item discrimination index |
discrim_test | Discrimination index at test level |
References
de la Torre, J. (2008). An empirically based method of Q-matrix validationfor the DINA model: Development and applications.Journal of Educational Measurement, 45, 343-362.
http://dx.doi.org/10.1111/j.1745-3984.2008.00069.x
de la Torre, J., van der Ark, L. A., & Rossi, G. (2018).Analysis of clinical data from a cognitive diagnosis modeling framework.Measurement and Evaluation in Counseling and Development, 51(4),281-296.https://doi.org/10.1080/07481756.2017.1327286
Henson, R., DiBello, L., & Stout, B. (2018). A generalized approach to defining itemdiscrimination for DCMs.Measurement: Interdisciplinary Research and Perspectives, 16(1), 18-29.
http://dx.doi.org/10.1080/15366367.2018.1436855
See Also
Seecdi.kli for discrimination indices based on theKullback-Leibler information.
For a fitted modelmod in theGDINA package, discrimination indices can beextracted by the methodextract(mod,"discrim")(GDINA::extract).
Examples
## Not run: ############################################################################## EXAMPLE 1: DINA and GDINA model#############################################################################data(sim.dina, package="CDM")data(sim.qmatrix, package="CDM")#-- fit GDINA and DINA modelmod1 <- CDM::gdina( sim.dina, q.matrix=sim.qmatrix )mod2 <- CDM::din( sim.dina, q.matrix=sim.qmatrix )#-- compute discrimination indicesdimod1 <- CDM::discrim.index(mod1)dimod2 <- CDM::discrim.index(mod2)summary(dimod1)summary(dimod2)## End(Not run)Test-specific and Item-specific Entropy for Latent Class Models
Description
Computes test-specific and item-specific entropy as test-diagnosticcriteria of cognitive diagnostic models (Asparouhov & Muthen, 2014).
Usage
entropy.lca(object)## S3 method for class 'entropy.lca'summary(object, digits=2, ...)Arguments
object | Object of class |
digits | Number of digits to round |
... | Further arguments to be passed |
Value
A list with the data frameentropy as an entry.
References
Asparouhov, T. & Muthen, B. (2014).Variable-specific entropycontribution. Technical Appendix. http://www.statmodel.com/7_3_papers.shtml
See Also
Seecdi.kli for test diagnostic indices based on theKullback-Leibler information andcdm.est.class.accuracyfor calculating the classification accuracy.
Examples
############################################################################## EXAMPLE 1: Entropy for DINA model#############################################################################data(sim.dina, package="CDM")data(sim.qmatrix, package="CDM")# fit DINA Modelmod1 <- CDM::din( sim.dina, q.matrix=sim.qmatrix, rule="DINA")summary(mod1)# compute entropy for test and itemsemod1 <- CDM::entropy.lca( mod1 )summary(emod1)## Not run: ############################################################################## EXAMPLE 2: Entropy for polytomous GDINA model#############################################################################data(data.pgdina, package="CDM")dat <- data.pgdina$datq.matrix <- data.pgdina$q.matrix# pGDINA model with "DINA rule"mod1 <- CDM::gdina( dat, q.matrix=q.matrix, rule="DINA")summary(mod1)# compute entropyemod1 <- CDM::entropy.lca( mod1 )summary(emod1)############################################################################## EXAMPLE 3: Entropy for MCDINA model#############################################################################data(data.cdm02, package="CDM")dat <- data.cdm02$dataq.matrix <- data.cdm02$q.matrix# estimate model with polytomous atributemod1 <- CDM::mcdina( dat, q.matrix=q.matrix )summary(mod1)# computre entropyemod1 <- CDM::entropy.lca( mod1 )summary(emod1)## End(Not run)Determination of a Statistically Equivalent DINA Model
Description
This function determines a statistically equivalent DINA modelgiven a Q-matrix using the method of von Davier (2014).Thereby, the dimension of the skill space is expanded, but in thereparameterized version, the Q-matrix has a simple structureor the IRT model is no longer be conjuctive (like in DINA) dueto a redefinition of the skill space.
Usage
equivalent.dina(q.matrix, reparameterization="B")Arguments
q.matrix | The Q-matrix (see |
reparameterization | The used reparameterization (see von Davier, 2014). |
Value
A list with following entries
q.matrix | Original Q-matrix |
q.matrix.ast | Reparameterized Q-matrix |
alpha | Original skill space |
alpha.ast | Reparameterized skill space |
References
von Davier, M. (2014). The DINA model as a constrained generaldiagnostic model: Two variants of a model equivalency.British Journal of Mathematical and Statistical Psychology, 67, 49-71.
Examples
############################################################################## EXAMPLE 1: Toy example############################################################################## define a Q-matrixQ <- matrix( c( 1,0,0, 0,1,0, 0,0,1, 1,0,1, 1,1,1 ), byrow=TRUE, ncol=3 )Q <- Q[ rep(1:(nrow(Q)),each=2), ]# equivalent DINA model (using the default reparameterization B)res1 <- CDM::equivalent.dina( q.matrix=Q )res1# equivalent DINA model (reparametrization A)res2 <- CDM::equivalent.dina( q.matrix=Q, reparameterization="A")res2## Not run: ############################################################################## EXAMPLE 2: Estimation with two equivalent DINA models############################################################################## simulate dataset.seed(789)D <- ncol(Q)mean.alpha <- c( -.5, .5, 0 )r1 <- .5Sigma.alpha <- matrix( r1, D, D ) + diag(1-r1,D)dat1 <- CDM::sim.din( N=2000, q.matrix=Q, mean=mean.alpha, Sigma=Sigma.alpha )# estimate DINA modelmod1 <- CDM::din( dat1$dat, q.matrix=Q )# estimate equivalent DINA modelmod2 <- CDM::din( dat1$dat, q.matrix=res1$q.matrix.ast, skillclasses=res1$alpha.ast)# restricted skill space must be defined by using the argument 'skillclasses'# compare model summariessummary(mod2)summary(mod1)# compare estimated item parameterscbind( mod2$coef, mod1$coef )# compare estimated skill class probabilitiesround( cbind( mod2$attribute.patt, mod1$attribute.patt ), 4 )############################################################################## EXAMPLE 3: Examples from von Davier (2014)############################################################################## define Q-matrixQ <- matrix( 0, nrow=8, ncol=3 )Q[2, ] <- c(1,0,0)Q[3, ] <- c(0,1,0)Q[4, ] <- c(1,1,0)Q[5, ] <- c(0,0,1)# Q[6, ] <- c(1,0,1)Q[6, ] <- c(0,0,1)Q[7, ] <- c(0,1,1)Q[8, ] <- c(1,1,1)#- parametrization Ares1 <- CDM::equivalent.dina(q.matrix=Q, reparameterization="A")res1#- parametrization Bres2 <- CDM::equivalent.dina(q.matrix=Q, reparameterization="B")res2## End(Not run)Evaluation of Likelihood
Description
The functioneval_likelihood evaluates the likelihood given itemresponses and item response probabilities.
The functionprep_data_long_format stores the matrix ofitem responses in a long format omitted all missing responses.
Usage
eval_likelihood(data, irfprob, prior=NULL, normalization=FALSE, N=NULL)prep_data_long_format(data)Arguments
data | Dataset containing item responses in wide format or long format(generated by |
irfprob | Array containing item responses probabilities, formatsee |
prior | Optional prior (matrix or vector) |
normalization | Logical indicating whether posterior should be normalized |
N | Number of persons (optional) |
Value
Numeric matrix
Examples
## Not run: ############################################################################## EXAMPLE 1: Likelihood data.ecpe#############################################################################data(data.ecpe, package="CDM")dat <- data.ecpe$dat[,-1]Q <- data.ecpe$q.matrix#*** store data matrix in long formatdata_long <- CDM::prep_data_long_format(data)str(data_long)#** estimate GDINA modelmod <- CDM::gdina(dat, q.matrix=Q)summary(mod)#** extract data, item response functions and priordata <- CDM::IRT.data(mod)irfprob <- CDM::IRT.irfprob(mod)prob_theta <- attr( irfprob, "prob.theta")#** compute likelihoodlmod <- CDM::eval_likelihood(data=data, irfprob=irfprob)max( abs( lmod - CDM::IRT.likelihood(mod) ))#** compute posteriorpmod <- CDM::eval_likelihood(data=data, irfprob=irfprob, prior=prob.theta, normalization=TRUE)max( abs( pmod - CDM::IRT.posterior(mod) ))## End(Not run)Fraction Subtraction Data
Description
Tatsuoka's (1984) fraction subtraction data set is comprised ofresponses toJ=20 fraction subtraction test items fromN=536middle school students.
Usage
data(fraction.subtraction.data)Format
Thefraction.subtraction.data data frame consists of 536rows and 20 columns, representing the responses of theN=536students to each of theJ=20 test items. Each row in the data setcorresponds to the responses of a particular student. Thereby a "1"denotes that a correct response was recorded, while "0" denotes anincorrect response. The other way round, each column correspondsto all responses to a particular item.
Details
The items used for the fraction subtraction test originally appearedin Tatsuoka (1984) and are published in Tatsuoka (2002). Theycan also be found in DeCarlo (2011). All test items are based on 8attributes (e.g. convert a whole number to a fraction, separate a wholenumber from a fraction or simplify before subtracting). The completelist of skills can be found infraction.subtraction.qmatrix.
Source
The Royal Statistical Society Datasets Website, Series C,Applied Statistics, Data analytic methods for latent partiallyordered classification models:
URL:http://www.blackwellpublishing.com/rss/Volumes/Cv51p2_read2.htm
References
DeCarlo, L. T. (2011). On the analysis of fraction subtraction data:The DINA Model, classification, latent class sizes, and the Q-Matrix.Applied Psychological Measurement, 35, 8–26.
Tatsuoka, C. (2002). Data analytic methods for latent partially ordered classificationmodels.Journal of the Royal Statistical Society, Series C, Applied Statistics,51, 337–350.
Tatsuoka, K. (1984).Analysis of errors in fraction addition and subtractionproblems. Final Report for NIE-G-81-0002, University of Illinois, Urbana-Champaign.
See Also
fraction.subtraction.qmatrix for the corresponding Q-matrix.
Fraction Subtraction Q-Matrix
Description
The Q-Matrix corresponding to Tatsuoka (1984) fraction subtraction data set.
Usage
data(fraction.subtraction.qmatrix)Format
Thefraction.subtraction.qmatrix data frame consists ofJ=20rows andK=8 columns, specifying the attributes that are believed to beinvolved in solving the items. Each row in the data frame represents an itemand the entries in the row indicate whether an attribute is needed to masterthe item (denoted by a "1") or not (denoted by a "0"). The attributes for thefraction subtraction data set are the following:
alpha1convert a whole number to a fraction,
alpha2separate a whole number from a fraction,
alpha3simplify before subtracting,
alpha4find a common denominator,
alpha5borrow from whole number part,
alpha6column borrow to subtract the second numerator from the first,
alpha7subtract numerators,
alpha8reduce answers to simplest form.
Details
This Q-matrix can be found in DeCarlo (2011). It is the same used byde la Torre and Douglas (2004).
Source
DeCarlo, L. T. (2011). On the analysis of fraction subtraction data:The DINA Model, classification, latent class sizes, and the Q-Matrix.Applied Psychological Measurement,35, 8–26.
References
de la Torre, J. and Douglas, J. (2004). Higher-order latent trait modelsfor cognitive diagnosis.Psychometrika, 69, 333–353.
Tatsuoka, C. (2002). Data analytic methods for latent partially ordered classificationmodels.Journal of the Royal Statistical Society, Series C, Applied Statistics,51, 337–350.
Tatsuoka, K. (1984)Analysis of errors in fraction addition and subtractionproblems. Final Report for NIE-G-81-0002, University of Illinois, Urbana-Champaign.
Generalized Distance Discriminating Method
Description
Performs the generalized distance discriminating method(GDD; Sun, Xin, Zhang, & de la Torre, 2013) fordichotomous data which is a method for classifying students intoskill profiles based on a preliminary unidimensional calibration.
Usage
gdd(data, q.matrix, theta, b, a, skillclasses=NULL)Arguments
data | Data frame with |
q.matrix | The Q-matrix |
theta | Estimated person ability |
b | Estimated item intercept from a 2PL model (see Details) |
a | Estimated item slope from a 2PL model (see Details) |
skillclasses | Optional matrix of skill classes used for estimation |
Details
Note that the parameters in the arguments follow the item response model
logit P( X_{nj}=1 | \theta_n )=b_j + a_j \theta_n
which is employed in thegdm function.
Value
A list with following entries
skillclass.est | Estimated skill class |
distmatrix | Distances for every person and every skill class |
skillspace | Used skill space for estimation |
theta | Used person parameter estimate |
References
Sun, J., Xin, T., Zhang, S., & de la Torre, J. (2013).A polytomous extension of the generalized distance discriminating method.Applied Psychological Measurement, 37, 503-521.
Examples
############################################################################## EXAMPLE 1: GDD for sim.dina#############################################################################data(sim.dina, package="CDM")data(sim.qmatrix, package="CDM")data <- sim.dinaq.matrix <- sim.qmatrix# estimate 1PL (use irtmodel="2PL" for 2PL estimation)mod <- CDM::gdm( data, irtmodel="1PL", theta.k=seq(-6,6,len=21), decrease.increments=TRUE, conv=.001, globconv=.001)# extract item parameters in parametrization b + a*thetab <- mod$b[,1]a <- mod$a[,,1]# extract person parameter estimatetheta <- mod$person$EAP.F1# generalized distance discriminating methodres <- CDM::gdd( data, q.matrix, theta=theta, b=b, a=a )Estimating the Generalized DINA (GDINA) Model
Description
This function implements the generalized DINA model for dichotomousattributes (GDINA; de la Torre, 2011) and polytomous attributes(pGDINA; Chen & de la Torre, 2013, 2018).In addition, multiple group estimationis also possible using thegdina function. This function alsoallows for the estimation of a higher order GDINA model(de la Torre & Douglas, 2004).Polytomous item responses are treated by specifying a sequentialGDINA model (Ma & de la Torre, 2016; Tutz, 1997).The simulataneous modeling of skills and misconceptions (bugs) can bealso estimated within the GDINA framework (see Kuo, Chen & de la Torre, 2018;see argumentrule).
The estimation can also be conducted by posing monotonocityconstraints (Hong, Chang, & Tsai, 2016) using the argumentmono.constr.Moreover, regularization methods SCAD, lasso, ridge, SCAD-L2 andtruncatedL_1 penalty (TLP) for item parameterscan be employed (Xu & Shang, 2018).
Normally distributed priors can be specified for item parameters(item intercepts and item slopes). Note that (for convenience) theprior specification holds simultaneously for all items.
Usage
gdina(data, q.matrix, skillclasses=NULL, conv.crit=0.0001, dev.crit=.1, maxit=1000, linkfct="identity", Mj=NULL, group=NULL, invariance=TRUE,method=NULL, delta.init=NULL, delta.fixed=NULL, delta.designmatrix=NULL, delta.basispar.lower=NULL, delta.basispar.upper=NULL, delta.basispar.init=NULL, zeroprob.skillclasses=NULL, attr.prob.init=NULL, attr.prob.fixed=NULL, reduced.skillspace=NULL, reduced.skillspace.method=2, HOGDINA=-1, Z.skillspace=NULL, weights=rep(1, nrow(data)), rule="GDINA", bugs=NULL, regular_lam=0, regular_type="none", regular_alpha=NA, regular_tau=NA, regular_weights=NULL, mono.constr=FALSE, prior_intercepts=NULL, prior_slopes=NULL, progress=TRUE, progress.item=FALSE, mstep_iter=10, mstep_conv=1E-4, increment.factor=1.01, fac.oldxsi=0, max.increment=.3, avoid.zeroprobs=FALSE, seed=0, save.devmin=TRUE, calc.se=TRUE, se_version=1, PEM=TRUE, PEM_itermax=maxit, cd=FALSE, cd_steps=1, mono_maxiter=10, freq_weights=FALSE, optimizer="CDM", ...)## S3 method for class 'gdina'summary(object, digits=4, file=NULL, ...)## S3 method for class 'gdina'plot(x, ask=FALSE, ...)## S3 method for class 'gdina'print(x, ...)Arguments
data | A required |
q.matrix | A required integer |
skillclasses | An optional matrix for determining the skill space.The argument can be used if a user wants less than |
conv.crit | Convergence criterion for maximum absolute change in item parameters |
dev.crit | Convergence criterion for maximum absolute change in deviance |
maxit | Maximum number of iterations |
linkfct | A string which indicates the link function for the GDINA model.Options are |
Mj | A list of design matrices and labels for each item.The definition of |
group | A vector of group identifiers for multiple groupestimation. Default is |
invariance | Logical indicating whether invariance of item parametersis assumed for multiple group models. If a subset of items shouldbe treated as noninvariant, then |
method | Estimation method for item parameters (see)(de la Torre, 2011). The default |
delta.init | List with initial |
delta.fixed | List with fixed |
delta.designmatrix | A design matrix for restrictions on delta. See Example 4. |
delta.basispar.lower | Lower bounds for delta basis parameters. |
delta.basispar.upper | Upper bounds for delta basis parameters. |
delta.basispar.init | An optional vector of starting values for the basis parameters of delta.This argument only applies when using a designmatrix for delta,i.e. |
zeroprob.skillclasses | An optional vector of integers which indicates which skillclasses should have zero probability. Default is NULL(no skill classes with zero probability). |
attr.prob.init | Initial probabilities of skill distribution. |
attr.prob.fixed | Vector or matrix with fixed probabilities of skill distribution. |
reduced.skillspace | A logical which indicates if the latent class skill space dimensionshould be reduced (see Xu & von Davier, 2008). The default is |
reduced.skillspace.method | Computation method for skill space reductionin case of |
HOGDINA | Values of -1, 0 or 1 indicating if a higher order GDINAmodel (see Details) should be estimated.The default value of -1 corresponds to the case that no higher orderfactor is assumed to exist. A value of 0 corresponds to independentattributes. A value of 1 assumes the existence of a higher orderfactor. |
Z.skillspace | A user specified design matrix for the skill space reductionas described in Xu and von Davier (2008). See in the Examples section forapplications. See Example 6. |
weights | An optional vector of sample weights. |
rule | A string or a vector of itemwise condensation rules. Allowed entries are |
bugs | Character vector indicating which columns in the Q-matrixrefer to bugs (misconceptions). This is only available if some |
regular_lam | Regularization parameter |
regular_type | Type of regularization. Can be |
regular_alpha | Regularization parameter |
regular_tau | Regularization parameter |
regular_weights | Optional list of item parameter weights used forpenalties in regularized estimation (see Example 13) |
mono.constr | Logical indicating whether monotonicity constraintsshould be fulfilled in estimation (implemented by the increasing penalty method; seeNash, 2014, p. 156). |
prior_intercepts | Vector with mean and standard deviation for priorof random intercepts (applies to all items) |
prior_slopes | Vector with mean and standard deviation for priorof random slopes (applies to all items and all parameters) |
progress | An optional logical indicating whether the functionshould print the progress of iteration in the estimation process. |
progress.item | An optional logical indicating whether item wise progress shouldbe displayed |
mstep_iter | Number of iterations in M-step if |
mstep_conv | Convergence criterion in M-step if |
increment.factor | A factor larger than 1 (say 1.1) to controlmaximum increments in item parameters. This parametercan be used in case of nonconvergence. |
fac.oldxsi | A convergence acceleration factor between 0 and 1 whichdefines the weight of previously estimated values incurrent parameter updates. |
max.increment | Maximum size of change in increments in M stepsof EM algorithm when |
avoid.zeroprobs | An optional logical indicating whether for estimatingitem parameters probabilities occur. Especially ifnot a skill classes are used, it is recommended to switchthe argument to |
seed | Simulation seed for initial parameters. A value of zero correspondsto deterministic starting values, an integer value different fromzero to random initial values with |
save.devmin | An optional logical indicating whether intermediateestimates should be saved corresponding to minimal deviance.Setting the argument to |
calc.se | Optional logical indicating whether standarderrors should be calculated. |
se_version | Integer for calculation method of standard errors. |
PEM | Logical indicating whether the P-EM acceleration should beapplied (Berlinet & Roland, 2012). |
PEM_itermax | Number of iterations in which the P-EM method should beapplied. |
cd | Logical indicating whether coordinate descent algorithm should be used. |
cd_steps | Number of steps for each parameter in coordinate descent algorithm |
mono_maxiter | Maximum number of iterations for fulfilling themonotonicity constraint |
freq_weights | Logical indicating whether frequency weights shouldbe used. Default is |
optimizer | String indicating which optimizer should be used inM-step estimation in case of |
object | A required object of class |
digits | Number of digits after decimal separator to display. |
file | Optional file name for a file in which |
x | A required object of class |
ask | A logical indicating whether every separate item shouldbe displayed in |
... | Optional parameters to be passed to or from othermethods will be ignored. |
Details
The estimation is based on an EM algorithm as described in de la Torre (2011).Item parameters are contained in thedelta vector which is a list wherethejth entry corresponds to item parameters of thejth item.
The following description refers to the case of dichotomous attributes.For using polytomous attributes see Chen and de la Torre (2013) andExample 7 for a definition of the Q-matrix. In this case,Q_{ik}=lmeans that theith item requires the mastery (at least) of levell of attributek.
Assume that two skills\alpha_1 and\alpha_2 are required formastering itemj. Then the GDINA model can be written as
g [ P( X_{nj}=1 | \alpha_n ) ]=\delta_{j0} + \delta_{j1} \alpha_{n1} + \delta_{j2} \alpha_{n2} + \delta_{j12} \alpha_{n1} \alpha_{n2}
which is a two-way GDINA-model (therule="GDINA2" specification) with alink functiong (which can be the identity, logit or logarithmic link).If the specificationACDM is chosen, then\delta_{j12}=0.The DINA model (rule="DINA") assumes \delta_{j1}=\delta_{j2}=0.
For the reduced RUM model (rule="RRUM"), the item response model is
P(X_{nj}=1 | \alpha_n )=\pi_i^\ast \cdot r_{i1}^{1-\alpha_{i1} } \cdot r_{i2}^{1-\alpha_{i2} }
From this equation, it is obvious, thatthis model is equivalent to an additive model (rule="ACDM") witha logarithmic link function (linkfct="log").
If a reduced skillspace (reduced.skillspace=TRUE) is employed, then thelogarithm of probability distribution of the attributes is modeled as alog-linear model:
\log P[ ( \alpha_{n1}, \alpha_{n2}, \ldots, \alpha_{nK} ) ] =\gamma_0 + \sum_k \gamma_k \alpha_{nk} + \sum_{k < l} \gamma_{kl} \alpha_{nk} \alpha_{nl}
If a higher order DINA model is assumed (HOGDINA=1), then a higher orderfactor\theta_n for the attributes is assumed:
P( \alpha_{nk}=1 | \theta_n )=\Phi ( a_k \theta_n + b_k )
ForHOGDINA=0, all attributes\alpha_{nk} are assumed to beindependent of each other:
P[ ( \alpha_{n1}, \alpha_{n2}, \ldots, \alpha_{nK} ) ] =\prod_k P( \alpha_{nk} )
Note that the noncompensatory reduced RUM (NC-RRUM) accordingto Rupp and Templin (2008) is the GDINA model with the argumentsrule="ACDM" andlinkfct="log". NC-RRUM can also beobtained by choosingrule="RRUM".
The compensatory RUM (C-RRUM) can be obtained by using the argumentsrule="ACDM" andlinkfct="logit".
The cognitive diagnosis model for identifyingskills and misconceptions (SISM; Kuo, Chen & de la Torre, 2018) can beestimated withrule="SISM" (see Example 12).
Thegdina function internally parameterizes the GDINA model as
g [ P( X_{nj}=1 | \alpha_n ) ]=\bm{M}_j ( \alpha _n ) \bm{\delta}_j
with item-specific design matrices\bm{M}_j (\alpha _n ) and item parameters\bm{\delta}_j. Only those attributes are modelled which correspondto non-zero entries in the Q-matrix. Because the Q-matrix (inq.matrix)and the design matrices (inM_j; see Example 3) can bespecified by the user, severalcognitive diagnosis models can be estimated. Therefore, some additional extensionsof the DINA model can also be estimated using thegdina function.These models include the DINA model with multiple strategies(Huo & de la Torre, 2014)
Value
An object of classgdina with following entries
coef | Data frame of item parameters |
delta | List with basis item parameters |
se.delta | Standard errors of basis item parameters |
probitem | Data frame with model implied conditional item probabilities |
itemfit.rmsea | The RMSEA item fit index (see |
mean.rmsea | Mean of RMSEA item fit indexes. |
loglike | Log-likelihood |
deviance | Deviance |
G | Number of groups |
N | Sample size |
AIC | AIC |
BIC | BIC |
CAIC | CAIC |
Npars | Total number of parameters |
Nipar | Number of item parameters |
Nskillpar | Number of parameters for skill class distribution |
Nskillclasses | Number of skill classes |
varmat.delta | Covariance matrix of |
posterior | Individual posterior distribution |
like | Individual likelihood |
data | Original data |
q.matrix | Used Q-matrix |
pattern | Individual patterns, individual MLE and MAP classificationsand their corresponding probabilities |
attribute.patt | Probabilities of skill classes |
skill.patt | Marginal skill probabilities |
subj.pattern | Individual subject pattern |
attribute.patt.splitted | Splitted attribute pattern |
pjk | Array of item response probabilities |
Mj | Design matrix |
Aj | Design matrix |
rule | Used condensation rules |
linkfct | Used link function |
delta.designmatrix | Designmatrix for item parameters |
reduced.skillspace | A logical if skillspace reduction was performed |
Z.skillspace | Design matrix for skillspace reduction |
beta | Parameters |
covbeta | Standard errors of |
iter | Number of iterations |
rrum.params | Parameters in the parametrization of the reduced RUM modelif |
group.stat | Group statistics (sample sizes, group labels) |
HOGDINA | The used value of |
mono.constr | Monotonicity constraint |
regularization | Logical indicating whether regularization is used |
regular_lam | Regularization parameter |
numb_bound_mono | Number of items with parameters at boundary ofmonotonicity constraints |
numb_regular_pars | Number of regularized item parameters |
delta_regularized | List indicating which item parametersare regularized |
cd_algorithm | Logical indicating whether coordinate descent algorithm isused |
cd_steps | Number of steps for each parameter in coordinate descent algorithm |
seed | Used simulation seed |
a.attr | Attribute parameters |
b.attr | Attribute parameters |
attr.rf | Attribute response functions. This matrix contains all |
converged | Logical indicating whether convergence was achieved. |
control | Optimization parameters used in estimation |
partable | Parameter table for |
polychor | Group-wise matrices with polychoric correlations |
sequential | Logical indicating whether a sequential GDINA modelis applied for polytomous item responses |
... | Further values |
Note
The functiondin does not allow for multiple group estimation.Use thisgdina function instead and choose the appropriaterule="DINA"as an argument.
Standard error calculation in analyses which use sample weights ordesignmatrix for delta parameters (delta.designmatrix!=NULL) is not yetcorrectly implemented. Please use replication methods instead.
References
Berlinet, A. F., & Roland, C. (2012).Acceleration of the EM algorithm: P-EM versus epsilon algorithm.Computational Statistics & Data Analysis, 56(12), 4122-4137.
Chen, J., & de la Torre, J. (2013).A general cognitive diagnosis model for expert-defined polytomous attributes.Applied Psychological Measurement, 37, 419-437.
Chen, J., & de la Torre, J. (2018). Introducing the general polytomous diagnosismodeling framework.Frontiers in Psychology | Quantitative Psychology and Measurement, 9(1474).
de la Torre, J., & Douglas, J. A. (2004). Higher-order latent trait modelsfor cognitive diagnosis.Psychometrika, 69, 333-353.
de la Torre, J. (2011). The generalized DINA model framework.Psychometrika, 76, 179-199.
Hong, C. Y., Chang, Y. W., & Tsai, R. C. (2016). Estimation of generalized DINAmodel with order restrictions.Journal of Classification, 33(3), 460-484.
Huo, Y., de la Torre, J. (2014). Estimating a cognitive diagnostic model formultiple strategies via the EM algorithm.Applied Psychological Measurement, 38, 464-485.
Kuo, B.-C., Chen, C.-H., & de la Torre, J. (2018).A cognitive diagnosis model for identifying coexisting skills and misconceptions.Applied Psychological Measurement, 42(3), 179-191.
Ma, W., & de la Torre, J. (2016).A sequential cognitive diagnosis model for polytomous responses.British Journal of Mathematical and Statistical Psychology, 69(3),253-275.
Nash, J. C. (2014).Nonlinear parameter optimization usingR tools.West Sussex: Wiley.
Rupp, A. A., & Templin, J. (2008). Unique characteristics ofdiagnostic classification models: A comprehensive review of the currentstate-of-the-art.Measurement: Interdisciplinary Research andPerspectives, 6, 219-262.
Shen, X., Pan, W., & Zhu, Y. (2012). Likelihood-based selection and sharpparameter estimation.Journal of the American Statistical Association, 107, 223-232.
Tutz, G. (1997). Sequential models for ordered responses.In W. van der Linden & R. K. Hambleton.Handbook of modern item response theory (pp. 139-152).New York: Springer.
Xu, G., & Shang, Z. (2018). Identifying latent structures inrestricted latent class models.Journal of the American Statistical Association, 523, 1284-1295.
Xu, X., & von Davier, M. (2008).Fitting the structured general diagnosticmodel to NAEP data. ETS Research Report ETS RR-08-27. Princeton, ETS.
Zeng, L., & Xie, J. (2014). Group variable selection viaSCAD-L_2.Statistics, 48, 49-66.
Zhang, C.-H. (2010). Nearly unbiased variable selection under minimax concavepenalty.Annals of Statistics, 38, 894-942.
See Also
See also thedin function (for DINA and DINO estimation).
For assessment of model fit seemodelfit.cor.din andanova.gdina.
Seeitemfit.sx2 for item fit statistics.
Seesim.gdina for simulating the GDINA model.
Seegdina.wald for a Wald test for testing the DINA and ACDMrules at the item-level.
Seegdina.dif for assessing differential itemfunctioning.
Seediscrim.index for computing discrimination indices.
See theGDINA::GDINA function in theGDINA package for similar functionality.
Examples
############################################################################## EXAMPLE 1: Simulated DINA data | different condensation rules#############################################################################data(sim.dina, package="CDM")data(sim.qmatrix, package="CDM")dat <- sim.dinaQ <- sim.qmatrix#***# Model 1: estimation of the GDINA model (identity link)mod1 <- CDM::gdina( data=dat, q.matrix=Q)summary(mod1)plot(mod1) # apply plot function## Not run: # Model 1a: estimate model with different simulation seedmod1a <- CDM::gdina( data=dat, q.matrix=Q, seed=9089)summary(mod1a)# Model 1b: estimate model with some fixed delta parametersdelta.fixed <- as.list( rep(NA,9) ) # List for parameters of 9 itemsdelta.fixed[[2]] <- c( 0, .15, .15, .45 )delta.fixed[[6]] <- c( .25, .25 )mod1b <- CDM::gdina( data=dat, q.matrix=Q, delta.fixed=delta.fixed)summary(mod1b)# Model 1c: fix all delta parameters to previously fitted modelmod1c <- CDM::gdina( data=dat, q.matrix=Q, delta.fixed=mod1$delta)summary(mod1c)# Model 1d: estimate GDINA model with GDINA packagemod1d <- GDINA::GDINA( dat=dat, Q=Q, model="GDINA" )summary(mod1d)# extract item parametersGDINA::itemparm(mod1d)GDINA::itemparm(mod1d, what="delta")# compare likelihoodlogLik(mod1)logLik(mod1d)#***# Model 2: estimation of the DINA model with gdina functionmod2 <- CDM::gdina( data=dat, q.matrix=Q, rule="DINA")summary(mod2)plot(mod2)#***# Model 2b: compare results with din functionmod2b <- CDM::din( data=dat, q.matrix=Q, rule="DINA")summary(mod2b)# Model 2: estimation of the DINO model with gdina functionmod3 <- CDM::gdina( data=dat, q.matrix=Q, rule="DINO")summary(mod3)#***# Model 4: DINA model with logit linkmod4 <- CDM::gdina( data=dat, q.matrix=Q, rule="DINA", linkfct="logit" )summary(mod4)#***# Model 5: DINA model log linkmod5 <- CDM::gdina( data=dat, q.matrix=Q, rule="DINA", linkfct="log")summary(mod5)#***# Model 6: RRUM modelmod6 <- CDM::gdina( data=dat, q.matrix=Q, rule="RRUM")summary(mod6)#***# Model 7: Higher order GDINA modelmod7 <- CDM::gdina( data=dat, q.matrix=Q, HOGDINA=1)summary(mod7)#***# Model 8: GDINA model with independent attributesmod8 <- CDM::gdina( data=dat, q.matrix=Q, HOGDINA=0)summary(mod8)#***# Model 9: Estimating the GDINA model with monotonicity constraintsmod9 <- CDM::gdina( data=dat, q.matrix=Q, rule="GDINA", mono.constr=TRUE, linkfct="logit")summary(mod9)#***# Model 10: Estimating the ACDM model with SCAD penalty and regularization# parameter of .05mod10 <- CDM::gdina( data=dat, q.matrix=Q, rule="ACDM", linkfct="logit", regular_type="scad", regular_lam=.05 )summary(mod10)#***# Model 11: Estimation of GDINA model with prior distributions# N(0,10^2) prior for item interceptsprior_intercepts <- c(0,10)# N(0,1^2) prior for item slopesprior_slopes <- c(0,1)# estimate modelmod11 <- CDM::gdina( data=dat, q.matrix=Q, rule="GDINA", prior_intercepts=prior_intercepts, prior_slopes=prior_slopes)summary(mod11)############################################################################## EXAMPLE 2: Simulated DINO data# additive cognitive diagnosis model with different link functions#############################################################################data(sim.dino, package="CDM")data(sim.matrix, package="CDM")dat <- sim.dinoQ <- sim.qmatrix#***# Model 1: additive cognitive diagnosis model (ACDM; identity link)mod1 <- CDM::gdina( data=dat, q.matrix=Q, rule="ACDM")summary(mod1)#***# Model 2: ACDM logit linkmod2 <- CDM::gdina( data=dat, q.matrix=Q, rule="ACDM", linkfct="logit")summary(mod2)#***# Model 3: ACDM log linkmod3 <- CDM::gdina( data=dat, q.matrix=Q, rule="ACDM", linkfct="log")summary(mod3)#***# Model 4: Different condensation rules per itemI <- 9 # number of itemsrule <- rep( "GDINA", I )rule[1] <- "DINO" # 1st item: DINO modelrule[7] <- "GDINA2" # 7th item: GDINA model with first- and second-order interactionsrule[8] <- "ACDM" # 8ht item: additive CDMrule[9] <- "DINA" # 9th item: DINA modelmod4 <- CDM::gdina( data=dat, q.matrix=Q, rule=rule )summary(mod4)############################################################################## EXAMPLE 3: Model with user-specified design matrices#############################################################################data(sim.dino, package="CDM")data(sim.qmatrix, package="CDM")dat <- sim.dinoQ <- sim.qmatrix# do a preliminary analysis and modify obtained design matricesmod0 <- CDM::gdina( data=dat, q.matrix=Q, maxit=1)# extract default design matricesMj <- mod0$MjMj.user <- Mj # these user defined design matrices are modified.#~~~ For the second item, the following model should hold# X1 ~ V2 + V2*V3mj <- Mj[[2]][[1]]mj.lab <- Mj[[2]][[2]]mj <- mj[,-3]mj.lab <- mj.lab[-3]Mj.user[[2]] <- list( mj, mj.lab )# [[1]]# [,1] [,2] [,3]# [1,] 1 0 0# [2,] 1 1 0# [3,] 1 0 0# [4,] 1 1 1# [[2]]# [1] "0" "1" "1-2"#~~~ For the eight item an equality constraint should hold# X8 ~ a*V2 + a*V3 + V2*V3mj <- Mj[[8]][[1]]mj.lab <- Mj[[8]][[2]]mj[,2] <- mj[,2] + mj[,3]mj <- mj[,-3]mj.lab <- c("0", "1=2", "1-2" )Mj.user[[8]] <- list( mj, mj.lab )Mj.user[[8]] ## [[1]] ## [,1] [,2] [,3] ## [1,] 1 0 0 ## [2,] 1 1 0 ## [3,] 1 1 0 ## [4,] 1 2 1 ## ## [[2]] ## [1] "0" "1=2" "1-2"mod <- CDM::gdina( data=dat, q.matrix=Q, Mj=Mj.user, maxit=200 )summary(mod)############################################################################## EXAMPLE 4: Design matrix for delta parameters#############################################################################data(sim.dino, package="CDM")data(sim.qmatrix, package="CDM")#~~~ estimate an initial modelmod0 <- CDM::gdina( data=dat, q.matrix=Q, rule="ACDM", maxit=1)# extract coefficientsc0 <- mod0$coefI <- 9 # number of itemsdelta.designmatrix <- matrix( 0, nrow=nrow(c0), ncol=nrow(c0) )diag( delta.designmatrix) <- 1# set intercept of item 1 and item 3 equal to each otherdelta.designmatrix[ 7, 1 ] <- 1 ; delta.designmatrix[,7] <- 0# set loading of V1 of item1 and item 3 equaldelta.designmatrix[ 8, 2 ] <- 1 ; delta.designmatrix[,8] <- 0delta.designmatrix <- delta.designmatrix[, -c(7:8) ] # exclude original parameters with indices 7 and 8#***# Model 1: ACDM with designmatrixmod1 <- CDM::gdina( data=dat, q.matrix=Q, rule="ACDM", delta.designmatrix=delta.designmatrix )summary(mod1)#***# Model 2: Same model, but with logit link instead of identity link functionmod2 <- CDM::gdina( data=dat, q.matrix=Q, rule="ACDM", delta.designmatrix=delta.designmatrix, linkfct="logit")summary(mod2)############################################################################## EXAMPLE 5: Multiple group estimation############################################################################## simulate dataset.seed(9279)N1 <- 200 ; N2 <- 100 # group sizesI <- 10 # number of itemsq.matrix <- matrix(0,I,2) # create Q-matrixq.matrix[1:7,1] <- 1 ; q.matrix[ 5:10,2] <- 1# simulate first groupdat1 <- CDM::sim.din(N1, q.matrix=q.matrix, mean=c(0,0) )$dat# simulate second groupdat2 <- CDM::sim.din(N2, q.matrix=q.matrix, mean=c(-.3, -.7) )$dat# merge datadat <- rbind( dat1, dat2 )# group indicatorgroup <- c( rep(1,N1), rep(2,N2) )# estimate GDINA model with multiple groups assuming invariant item parametersmod1 <- CDM::gdina( data=dat, q.matrix=q.matrix, group=group)summary(mod1)# estimate DINA model with multiple groups assuming invariant item parametersmod2 <- CDM::gdina( data=dat, q.matrix=q.matrix, group=group, rule="DINA")summary(mod2)# estimate GDINA model with noninvariant item parametersmod3 <- CDM::gdina( data=dat, q.matrix=q.matrix, group=group, invariance=FALSE)summary(mod3)# estimate GDINA model with some invariant item parameters (I001, I006, I008)mod4 <- CDM::gdina( data=dat, q.matrix=q.matrix, group=group, invariance=c("I001", "I006","I008") )#--- model comparisonIRT.compareModels(mod1,mod2,mod3,mod4)# estimate GDINA model with non-invariant item parameters except for the# items I001, I006, I008mod5 <- CDM::gdina( data=dat, q.matrix=q.matrix, group=group, invariance=setdiff( colnames(dat), c("I001", "I006","I008") ) )############################################################################## EXAMPLE 6: User specified reduced skill space############################################################################## Some correlations between attributes should be set to zero.q.matrix <- expand.grid( c(0,1), c(0,1), c(0,1), c(0,1) )colnames(q.matrix) <- colnames( paste("Attr", 1:4,sep=""))q.matrix <- q.matrix[ -1, ]Sigma <- matrix( .5, nrow=4, ncol=4 )diag(Sigma) <- 1Sigma[3,2] <- Sigma[2,3] <- 0 # set correlation of attribute A2 and A3 to zerodat <- CDM::sim.din( N=1000, q.matrix=q.matrix, Sigma=Sigma)$dat#~~~ Step 1: initial estimationmod1a <- CDM::gdina( data=dat, q.matrix=q.matrix, maxit=1, rule="DINA")# estimate also "full" modelmod1 <- CDM::gdina( data=dat, q.matrix=q.matrix, rule="DINA")#~~~ Step 2: modify designmatrix for reduced skillspaceZ.skillspace <- data.frame( mod1a$Z.skillspace )# set correlations of A2/A4 and A3/A4 to zerovars <- c("A2_A3","A2_A4")for (vv in vars){ Z.skillspace[,vv] <- NULL }#~~~ Step 3: estimate model with reduced skillspacemod2 <- CDM::gdina( data=dat, q.matrix=q.matrix, Z.skillspace=Z.skillspace, rule="DINA")#~~~ eliminate all covariancesZ.skillspace <- data.frame( mod1$Z.skillspace )colnames(Z.skillspace)Z.skillspace <- Z.skillspace[, -grep( "_", colnames(Z.skillspace),fixed=TRUE)]colnames(Z.skillspace)mod3 <- CDM::gdina( data=dat, q.matrix=q.matrix, Z.skillspace=Z.skillspace, rule="DINA")summary(mod1)summary(mod2)summary(mod3)############################################################################## EXAMPLE 7: Polytomous GDINA model (Chen & de la Torre, 2013)#############################################################################data(data.pgdina, package="CDM")dat <- data.pgdina$datq.matrix <- data.pgdina$q.matrix# pGDINA model with "DINA rule"mod1 <- CDM::gdina( dat, q.matrix=q.matrix, rule="DINA")summary(mod1)# no reduced skill spacemod1a <- CDM::gdina( dat, q.matrix=q.matrix, rule="DINA",reduced.skillspace=FALSE)summary(mod1)# pGDINA model with "GDINA rule"mod2 <- CDM::gdina( dat, q.matrix=q.matrix, rule="GDINA")summary(mod2)############################################################################## EXAMPLE 8: Fraction subtraction data: DINA and HO-DINA model#############################################################################data(fraction.subtraction.data, package="CDM")data(fraction.subtraction.qmatrix, package="CDM")dat <- fraction.subtraction.dataQ <- fraction.subtraction.qmatrix# Model 1: DINA modelmod1 <- CDM::gdina( dat, q.matrix=Q, rule="DINA")summary(mod1)# Model 2: HO-DINA modelmod2 <- CDM::gdina( dat, q.matrix=Q, HOGDINA=1, rule="DINA")summary(mod2)############################################################################## EXAMPLE 9: Skill space approximation data.jang#############################################################################data(data.jang, package="CDM")data <- data.jang$dataq.matrix <- data.jang$q.matrix#*** Model 1: Reduced RUM modelmod1 <- CDM::gdina( data, q.matrix, rule="RRUM", conv.crit=.001, maxit=500 )#*** Model 2: Reduced RUM model with skill space approximation# use 300 instead of 2^9=512 skill classesskillspace <- CDM::skillspace.approximation( L=300, K=ncol(q.matrix) )mod2 <- CDM::gdina( data, q.matrix, rule="RRUM", conv.crit=.001, skillclasses=skillspace ) ## > logLik(mod1) ## 'log Lik.' -30318.08 (df=153) ## > logLik(mod2) ## 'log Lik.' -30326.52 (df=153)############################################################################## EXAMPLE 10: CDM with a linear hierarchy############################################################################## This model is equivalent to a unidimensional IRT model with an ordered# ordinal latent trait and is actually a probabilistic Guttman model.set.seed(789)# define 3 competency levelsalpha <- scan() 0 0 0 1 0 0 1 1 0 1 1 1# define skill class distributionK <- 3skillspace <- alpha <- matrix( alpha, K + 1, K, byrow=TRUE )alpha <- alpha[ rep( 1:4, c(300,300,200,200) ), ]# P(000)=P(100)=.3, P(110)=P(111)=.2# define Q-matrixQ <- scan() 1 0 0 1 1 0 1 1 1Q <- matrix( Q, nrow=K, ncol=K, byrow=TRUE )Q <- Q[ rep(1:K, each=4 ), ]colnames(skillspace) <- colnames(Q) <- paste0("A",1:K)I <- nrow(Q)# define guessing and slipping parametersguess <- stats::runif( I, 0, .3 )slip <- stats::runif( I, 0, .2 )# simulate datadat <- CDM::sim.din( q.matrix=Q, alpha=alpha, slip=slip, guess=guess )$dat#*** Model 1: DINA model with linear hierarchymod1 <- CDM::din( dat, q.matrix=Q, rule="DINA", skillclasses=skillspace )summary(mod1)#*** Model 2: pGDINA model with 3 levels# The multidimensional CDM with a linear hierarchy is a unidimensional# polytomous GDINA model.Q2 <- matrix( rowSums(Q), nrow=I, ncol=1 )mod2 <- CDM::gdina( dat, q.matrix=Q2, rule="DINA" )summary(mod2)#*** Model 3: estimate probabilistic Guttman model in sirt# Proctor, C. H. (1970). A probabilistic formulation and statistical# analysis for Guttman scaling. Psychometrika, 35, 73-78.library(sirt)mod3 <- sirt::prob.guttman( dat, itemlevel=Q2[,1] )summary(mod3)# -> The three models result in nearly equivalent fit.############################################################################## EXAMPLE 11: Sequential GDINA model (Ma & de la Torre, 2016)#############################################################################data(data.cdm04, package="CDM")#** attach datasetdat <- data.cdm04$data # polytomous item responsesq.matrix1 <- data.cdm04$q.matrix1q.matrix2 <- data.cdm04$q.matrix2#-- DINA model with first Q-matrixmod1 <- CDM::gdina( dat, q.matrix=q.matrix1, rule="DINA")summary(mod1)#-- DINA model with second Q-matrixmod2 <- CDM::gdina( dat, q.matrix=q.matrix2, rule="DINA")#-- GDINA modelmod3 <- CDM::gdina( dat, q.matrix=q.matrix2, rule="GDINA")#** model comparisonIRT.compareModels(mod1,mod2,mod3)############################################################################## EXAMPLE 12: Simulataneous modeling of skills and misconceptions (Kuo et al., 2018)#############################################################################data(data.cdm08, package="CDM")dat <- data.cdm08$dataq.matrix <- data.cdm08$q.matrix#*** estimate modelmod <- CDM::gdina( dat0, q.matrix, rule="SISM", bugs=colnames(q.matrix)[5:7] )summary(mod)############################################################################## EXAMPLE 13: Regularized estimation in GDINA model data.dtmr#############################################################################data(data.dtmr, package="CDM")dat <- data.dtmr$dataq.matrix <- data.dtmr$q.matrix#***** LASSO regularization with lambda parameter of .02mod1 <- CDM::gdina(dat, q.matrix=q.matrix, rule="GDINA", regular_lam=.02, regular_type="lasso")summary(mod1)mod$delta_regularized#***** using starting values from previuos estimationdelta.init <- mod1$deltaattr.prob.init <- mod1$attr.probmod2 <- CDM::gdina(dat, q.matrix=q.matrix, rule="GDINA", regular_lam=.02, regular_type="lasso", delta.init=delta.init, attr.prob.init=attr.prob.init)summary(mod2)#***** final estimation fixing regularized estimates to zero and estimate all other#***** item parameters unregularizedregular_weights <- mod2$delta_regularizeddelta.init <- mod2$deltaattr.prob.init <- mod2$attr.probmod3 <- CDM::gdina(dat, q.matrix=q.matrix, rule="GDINA", regular_lam=1E5, regular_type="lasso", delta.init=delta.init, attr.prob.init=attr.prob.init, regular_weights=regular_weights)summary(mod3)## End(Not run)Differential Item Functioning in the GDINA Model
Description
This function assesses item-wise differential item functioningin the GDINA model by using the Wald test (de la Torre, 2011;Hou, de la Torre & Nandakumar, 2014).It is necessary that a multiple group GDINA model is previouslyfitted.
Usage
gdina.dif(object)## S3 method for class 'gdina.dif'summary(object, ...)Arguments
object | Object of class |
... | Further arguments to be passed |
Details
The p values are also calculated by a Holm adjustmentfor multiple comparisons (seep.holm inoutputdifstats).
In the case of two groups, an effect size of differential item functioning(labeled asUA (unsigned area) indifstats value) is defined asthe weighted absolute difference of item response functions. The DIF measurefor itemj is defined as
UA_j=\sum_l w( \alpha_l ) | P( X_j=1 | \alpha_l, G=1 ) - P( X_j=1 | \alpha_l, G=2 ) |
wherew( \alpha_l )=[ P( \alpha_l | G=1 ) + P( \alpha_l | G=2 ) ]/2.
Value
A list with following entries
difstats | Data frame containing results of item-wise Wald tests |
coef | Data frame containing all (group-wise) item parameters |
delta_all | List of |
varmat_all | List of covariance matrices of all |
prob.exp.group | List with groups and items containing expected latent classsizes and expected probabilities for each group and each item.Based on this information, effect sizes of differential itemfunctioning can be calculated. |
References
de la Torre, J. (2011). The generalized DINA model framework.Psychometrika, 76, 179-199.
Hou, L., de la Torre, J., & Nandakumar, R. (2014).Differential item functioning assessment in cognitivediagnostic modeling: Application of the Wald test toinvestigate DIF in the DINA model.Journal of Educational Measurement, 51, 98-125.
See Also
See theGDINA::dif function in theGDINA package for similar functionality.
Examples
## Not run: ############################################################################## EXAMPLE 1: DIF for DINA simulated data############################################################################## simulate some dataset.seed(976)N <- 2000 # number of persons in a groupI <- 9 # number of itemsq.matrix <- matrix( 0, 9,2 )q.matrix[1:3,1] <- 1q.matrix[4:6,2] <- 1q.matrix[7:9,c(1,2)] <- 1# simulate first groupguess <- rep( .2, I )slip <- rep(.1, I)dat1 <- CDM::sim.din( N=N, q.matrix=q.matrix, guess=guess, slip=slip, mean=c(0,0) )$dat# simulate second group with some DIF items (items 1, 7 and 8)guess[ c(1,7)] <- c(.3, .35 )slip[8] <- .25dat2 <- CDM::sim.din( N=N, q.matrix=q.matrix, guess=guess, slip=slip, mean=c(0.4,.25) )$datgroup <- rep(1:2, each=N )dat <- rbind( dat1, dat2 )#*** estimate multiple group GDINA modelmod1 <- CDM::gdina( dat, q.matrix=q.matrix, rule="DINA", group=group )summary(mod1)#*** assess differential item functioningdmod1 <- CDM::gdina.dif( mod1)summary(dmod1) ## item X2 df p p.holm UA ## 1 I001 10.1711 2 0.0062 0.0495 0.0428 ## 2 I002 1.9933 2 0.3691 1.0000 0.0276 ## 3 I003 0.0313 2 0.9845 1.0000 0.0040 ## 4 I004 0.0290 2 0.9856 1.0000 0.0044 ## 5 I005 2.3230 2 0.3130 1.0000 0.0142 ## 6 I006 1.8330 2 0.3999 1.0000 0.0159 ## 7 I007 40.6851 2 0.0000 0.0000 0.1184 ## 8 I008 6.7912 2 0.0335 0.2346 0.0710 ## 9 I009 1.1538 2 0.5616 1.0000 0.0180## End(Not run)Wald Statistic for Item Fit of the DINA and ACDM Rule for GDINA Model
Description
This function tests with a Wald test for the GDINA model whether a DINA or a ACDMcondensation rule leads to a sufficient item fit comparedto the saturated GDINA rule (de la Torre & Lee, 2013). The Wald testis accompanied by the RMSEA fit and weighted and unweighteddistance measures (wgtdist,uwgtdist), see Details(compare Ma, Iaconangelo, & de la Torre, 2016).
Usage
gdina.wald(object)## S3 method for class 'gdina.wald'summary(object, digits=3, vars=c("X2", "p", "sig", "RMSEA", "wgtdist"), ...)Arguments
object | A fitted |
digits | Number of digits after decimal usedfor rounding. |
vars | Vector including variables which shouldbe displayed in |
... | Further arguments to be passed |
Details
LetP_j( \alpha _l) the estimated item response function for theGDINA model and\hat{P}_j( \alpha _l) the item responsemodel for the approximated model (DINA, DINO or ACDM).The unweighted distanceuwgtdist as a measure of misfit is defined as
uwgtdist=\frac{1}{2^K} \sum_l ( P_j( \alpha _l) - \hat{P}_j( \alpha _l) )^2
The weighted distancewgtdist measures the discrepancywith respected to the probabilitiesw_l=P( \alpha_l) of estimatedskill classes
wgtdist=\sum_l w_l (P_j( \alpha _l) - \hat{P}_j( \alpha _l) )^2
Value
stats | Data frame with Wald statistic for every item,corresponding p values and a RMSEA fit statistic |
References
de la Torre, J., & Lee, Y. S. (2013). Evaluating the Wald test foritem-level comparison of saturated and reduced models in cognitive diagnosis.Journal of Educational Measurement, 50, 355-373.
Ma, W., Iaconangelo, C., & de la Torre, J. (2016). Model similarity,model selection, and attribute classification.Applied Psychological Measurement, 40(3), 200-217.
See Also
See theGDINA::modelcomp function in theGDINA package for similar functionality.
Examples
## Not run: ############################################################################## EXAMPLE 1: Wald test for DINA simulated data sim.dina#############################################################################data(sim.dina, package="CDM")data(sim.qmatrix, package="CDM")# Model 1: estimate GDINA modelmod1 <- CDM::gdina( sim.dina, q.matrix=sim.qmatrix, rule="GDINA")summary(mod1)# perform Wald testres1 <- CDM::gdina.wald( mod1 )summary(res1)# -> results show that all but one item fit according to the DINA rule# select some outputsummary(res1, vars=c("wgtdist", "p") )## End(Not run)General Diagnostic Model
Description
This function estimates the general diagnostic model(von Davier, 2008; Xu & von Davier, 2008) which handlesmultidimensional item response models with ordered discreteor continuous latent variables for polytomous itemresponses.
Usage
gdm( data, theta.k, irtmodel="2PL", group=NULL, weights=rep(1, nrow(data)), Qmatrix=NULL, thetaDes=NULL, skillspace="loglinear", b.constraint=NULL, a.constraint=NULL, mean.constraint=NULL, Sigma.constraint=NULL, delta.designmatrix=NULL, standardized.latent=FALSE, centered.latent=FALSE, centerintercepts=FALSE, centerslopes=FALSE, maxiter=1000, conv=1e-5, globconv=1e-5, msteps=4, convM=.0005, decrease.increments=FALSE, use.freqpatt=FALSE, progress=TRUE, PEM=FALSE, PEM_itermax=maxiter, ...)## S3 method for class 'gdm'summary(object, file=NULL, ...)## S3 method for class 'gdm'print(x, ...)## S3 method for class 'gdm'plot(x, perstype="EAP", group=1, barwidth=.1, histcol=1, cexcor=3, pchpers=16, cexpers=.7, ... )Arguments
data | An |
theta.k | In the one-dimensional case it must be a vector.For multidimensional models it has to be a listof skill vectors if the theta grid differs betweendimensions. If not, a vector input can be supplied.If an estimated skillspace ( |
irtmodel | The default |
group | An optional vector of group identifiers formultiple group estimation.For |
weights | An optional vector of sample weights |
Qmatrix | An optional array of dimension |
thetaDes | A design matrix for specifying nonlinear item responsefunctions (see Example 1, Models 4 and 5) |
skillspace | The parametric assumption of the skillspace.If |
b.constraint | In this optional matrix with |
a.constraint | In this optional matrix with |
mean.constraint | A |
Sigma.constraint | A |
delta.designmatrix | The design matrix of |
standardized.latent | A logical indicating whether in a uni- or multidimensionalmodel all latent variables of the first group should be normally distributedand standardized. The default is |
centered.latent | A logical indicating whether in a uni- or multidimensionalmodel all latent variables of the first group should be normallydistributed and do have zero means? The default is |
centerintercepts | A logical indicating whether intercepts should be centered to have amean of 0 for all dimensions. This argument does not (yet) work properlyfor varying numbers of item categories. |
centerslopes | A logical indicating whether item slopes should be centered to havea mean of 1 for all dimensions. This argument only works for |
maxiter | Maximum number of iterations |
conv | Convergence criterion for item parameters anddistribution parameters |
globconv | Global deviance convergence criterion |
msteps | Maximum number of M steps in estimating |
convM | Convergence criterion in M step |
decrease.increments | Should in the M step the incrementsof |
use.freqpatt | A logical indicating whether frequencies of unique item response patternsshould be used. In case of large data set |
progress | An optional logical indicating whether the function should print theprogress of iteration in the estimation process. |
PEM | Logical indicating whether the P-EM acceleration should beapplied (Berlinet & Roland, 2012). |
PEM_itermax | Number of iterations in which the P-EM method should beapplied. |
object | A required object of class |
file | Optional file name for a file in which |
x | A required object of class |
perstype | Person parameter estimate type. Can be either |
barwidth | Bar width in |
histcol | Color of histogram bars in |
cexcor | Font size for print of correlation in |
pchpers | Point type for scatter plot of personparameters in |
cexpers | Point size for scatter plot of personparameters in |
... | Optional parameters to be passed to or from othermethods will be ignored. |
Details
Caseirtmodel="1PL":
Equal item slopes of 1 are assumed in this model. Therefore,it corresponds to a generalized multidimensional Rasch model.
logit P( X_{nj}=k | \theta_n )=b_{j0} + \sum_d q_{jdk} \theta_{nd}
The Q-matrix entriesq_{jdk} are pre-specified by the user.
Caseirtmodel="2PL":
For each item and each dimension, different item slopesa_{jd}are estimated:
logit P( X_{nj}=k | \theta_n )=b_{j0} + \sum_d a_{jd} q_{jdk} \theta_{nd}
Caseirtmodel="2PLcat":
For each item, each dimension and each category,different item slopesa_{jdk}are estimated:
logit P( X_{nj}=k | \theta_n )=b_{j0} + \sum_d a_{jdk} q_{jdk} \theta_{nd}
Note that this model can be generalized to include terms ofany transformationt_h of the\theta_n vector (e.g. quadratic terms,step functions or interaction) such that the model can be formulated as
logit P( X_{nj}=k | \theta_n )=b_{j0} + \sum_h a_{jhk} q_{jhk} t_h( \theta_{n} )
In general, the number of functionst_1, ..., t_H will belarger than the\theta dimension ofD.
The estimation follows an EM algorithm as described in von Davier andYamamoto (2004) and von Davier (2008).
In case ofskillspace="est", the\bold{\theta} vectors(the grid of the theta distribution) are estimated (Bartolucci, 2007;Bacci, Bartolucci & Gnaldi, 2012). This model is called a multidimensionallatent class item response model.
Value
An object of classgdm. The list contains thefollowing entries:
item | Data frame with item parameters |
person | Data frame with person parameters: |
EAP.rel | Reliability of the EAP |
deviance | Deviance |
ic | Information criteria, number of estimated parameters |
b | Item intercepts |
se.b | Standard error of item intercepts |
a | Item slopes |
se.a | Standard error of item slopes |
itemfit.rmsea | The RMSEA item fit index (see |
mean.rmsea | Mean of RMSEA item fit indexes. |
Qmatrix | Used Q-matrix |
pi.k | Trait distribution |
mean.trait | Means of trait distribution |
sd.trait | Standard deviations of trait distribution |
skewness.trait | Skewnesses of trait distribution |
correlation.trait | List of correlation matrices of trait distributioncorresponding to each group |
pjk | Item response probabilities evaluated at grid |
n.ik | An array of expected counts |
G | Number of groups |
D | Number of dimension of |
I | Number of items |
N | Number of persons |
delta | Parameter estimates for skillspace representation |
covdelta | Covariance matrix of parameter estimates forskillspace representation |
data | Original data frame |
group.stat | Group statistics (sample sizes, group labels) |
p.xi.aj | Individual likelihood |
posterior | Individual posterior distribution |
skill.levels | Number of skill levels per dimension |
K.item | Maximal category per item |
theta.k | Used theta design or estimated theta trait distributionin case of |
thetaDes | Used theta design for item responses |
se.theta.k | Estimated standard errors of |
time | Info about computation time |
skillspace | Used skillspace parametrization |
iter | Number of iterations |
converged | Logical indicating whether convergence was achieved. |
object | Object of class |
x | Object of class |
perstype | Person paramter estimate type. Can be either |
group | Group which should be used for |
barwidth | Bar width in |
histcol | Color of histogram bars in |
cexcor | Font size for print of correlation in |
pchpers | Point type for scatter plot of personparameters in |
cexpers | Point size for scatter plot of personparameters in |
... | Optional parameters to be passed to or from othermethods will be ignored. |
References
Bacci, S., Bartolucci, F., & Gnaldi, M. (2012).A class of multidimensional latent class IRT models for ordinalpolytomous item responses.arXiv preprint,arXiv:1201.4667.
Bartolucci, F. (2007). A class of multidimensional IRT models for testingunidimensionality and clustering items.Psychometrika, 72, 141-157.
Berlinet, A. F., & Roland, C. (2012).Acceleration of the EM algorithm: P-EM versus epsilon algorithm.Computational Statistics & Data Analysis,56(12), 4122-4137.
von Davier, M. (2008). A general diagnostic model applied tolanguage testing data.British Journalof Mathematical and Statistical Psychology, 61, 287-307.
von Davier, M., & Yamamoto, K. (2004). Partially observed mixtures of IRT models:An extension of the generalized partial-credit model.Applied Psychological Measurement, 28, 389-406.
Xu, X., & von Davier, M. (2008).Fitting the structured general diagnosticmodel to NAEP data. ETS Research Report ETS RR-08-27. Princeton, ETS.
See Also
Cognitive diagnostic models for dichotomous data can be estimatedwithdin (DINA or DINO model) orgdina(GDINA model, which contains many CDMs as special cases).
For assessment of model fit seemodelfit.cor.din andanova.gdm.
Seeitemfit.sx2 for item fit statistics.
For the estimation of the multidimensionallatent class item response model see theMultiLCIRT packageandsirt package (functionsirt::rasch.mirtlc).
Examples
############################################################################## EXAMPLE 1: Fraction Dataset 1# Unidimensional Models for dichotomous data#############################################################################data(data.fraction1, package="CDM")dat <- data.fraction1$datatheta.k <- seq( -6, 6, len=15 ) # discretized ability#***# Model 1: Rasch model (normal distribution)mod1 <- CDM::gdm( dat, irtmodel="1PL", theta.k=theta.k, skillspace="normal", centered.latent=TRUE)summary(mod1)plot(mod1)#***# Model 2: Rasch model (log-linear smoothing)# set the item difficulty of the 8th item to zerob.constraint <- matrix( c(8,1,0), 1, 3 )mod2 <- CDM::gdm( dat, irtmodel="1PL", theta.k=theta.k, skillspace="loglinear", b.constraint=b.constraint )summary(mod2)#***# Model 3: 2PL modelmod3 <- CDM::gdm( dat, irtmodel="2PL", theta.k=theta.k, skillspace="normal", standardized.latent=TRUE )summary(mod3)## Not run: #***# Model 4: include quadratic term in item response function# using the argument decrease.increments=TRUE leads to a more# stable estimatethetaDes <- cbind( theta.k, theta.k^2 )colnames(thetaDes) <- c( "F1", "F1q" )mod4 <- CDM::gdm( dat, irtmodel="2PL", theta.k=theta.k, thetaDes=thetaDes, skillspace="normal", standardized.latent=TRUE, decrease.increments=TRUE)summary(mod4)#***# Model 5: step function for ICC# two different probabilities theta < 0 and theta > 0thetaDes <- matrix( 1*(theta.k>0), ncol=1 )colnames(thetaDes) <- c( "Fgrm1" )mod5 <- CDM::gdm( dat, irtmodel="2PL", theta.k=theta.k, thetaDes=thetaDes, skillspace="normal" )summary(mod5)#***# Model 6: DINA model with din functionmod6 <- CDM::din( dat, q.matrix=matrix( 1, nrow=ncol(dat),ncol=1 ) )summary(mod6)#***# Model 7: Estimating a version of the DINA model with gdmtheta.k <- c(-.5,.5)mod7 <- CDM::gdm( dat, irtmodel="2PL", theta.k=theta.k, skillspace="loglinear" )summary(mod7)############################################################################## EXAMPLE 2: Cultural Activities - data.Students# Unidimensional Models for polytomous data#############################################################################data(data.Students, package="CDM")dat <- data.Studentsdat <- dat[, grep( "act", colnames(dat) ) ]theta.k <- seq( -4, 4, len=11 ) # discretized ability#***# Model 1: Partial Credit Model (PCM)mod1 <- CDM::gdm( dat, irtmodel="1PL", theta.k=theta.k, skillspace="normal", centered.latent=TRUE)summary(mod1)plot(mod1)#***# Model 1b: PCM using frequency patternsmod1b <- CDM::gdm( dat, irtmodel="1PL", theta.k=theta.k, skillspace="normal", centered.latent=TRUE, use.freqpatt=TRUE)summary(mod1b)#***# Model 2: PCM with two groupsmod2 <- CDM::gdm( dat, irtmodel="1PL", theta.k=theta.k, group=CDM::data.Students$urban + 1, skillspace="normal", centered.latent=TRUE)summary(mod2)#***# Model 3: PCM with loglinear smoothingb.constraint <- matrix( c(1,2,0), ncol=3 )mod3 <- CDM::gdm( dat, irtmodel="1PL", theta.k=theta.k, skillspace="loglinear", b.constraint=b.constraint )summary(mod3)#***# Model 4: Model with pre-specified item weights in Q-matrixQmatrix <- array( 1, dim=c(5,1,2) )Qmatrix[,1,2] <- 2 # default is score 2 for category 2# now change the scoring of category 2:Qmatrix[c(2,4),1,1] <- .74Qmatrix[c(2,4),1,2] <- 2.3# for items 2 and 4 the score for category 1 is .74 and for category 2 it is 2.3mod4 <- CDM::gdm( dat, irtmodel="1PL", theta.k=theta.k, Qmatrix=Qmatrix, skillspace="normal", centered.latent=TRUE)summary(mod4)#***# Model 5: Generalized partial credit modelmod5 <- CDM::gdm( dat, irtmodel="2PL", theta.k=theta.k, skillspace="normal", standardized.latent=TRUE )summary(mod5)#***# Model 6: Item-category slope estimationmod6 <- CDM::gdm( dat, irtmodel="2PLcat", theta.k=theta.k, skillspace="normal", standardized.latent=TRUE, decrease.increments=TRUE)summary(mod6)#***# Models 7: items with different number of categoriesdat0 <- datdat0[ paste(dat0[,1])==2, 1 ] <- 1 # 1st item has only two categoriesdat0[ paste(dat0[,3])==2, 3 ] <- 1 # 3rd item has only two categories# Model 7a: PCMmod7a <- CDM::gdm( dat0, irtmodel="1PL", theta.k=theta.k, centered.latent=TRUE )summary(mod7a)# Model 7b: Item category slopesmod7b <- CDM::gdm( dat0, irtmodel="2PLcat", theta.k=theta.k, standardized.latent=TRUE, decrease.increments=TRUE )summary(mod7b)############################################################################## EXAMPLE 3: Fraction Dataset 2# Multidimensional Models for dichotomous data#############################################################################data(data.fraction2, package="CDM")dat <- data.fraction2$dataQmatrix <- data.fraction2$q.matrix3#***# Model 1: One-dimensional Rasch modeltheta.k <- seq( -4, 4, len=11 ) # discretized abilitymod1 <- CDM::gdm( dat, irtmodel="1PL", theta.k=theta.k, centered.latent=TRUE)summary(mod1)plot(mod1)#***# Model 2: One-dimensional 2PL modelmod2 <- CDM::gdm( dat, irtmodel="2PL", theta.k=theta.k, standardized.latent=TRUE)summary(mod2)plot(mod2)#***# Model 3: 3-dimensional Rasch Model (normal distribution)mod3 <- CDM::gdm( dat, irtmodel="1PL", theta.k=theta.k, Qmatrix=Qmatrix, centered.latent=TRUE, globconv=5*1E-3, conv=1E-4 )summary(mod3)#***# Model 4: 3-dimensional Rasch model (loglinear smoothing)# set some item parameters of items 4,1 and 2 to zerob.constraint <- cbind( c(4,1,2), 1, 0 )mod4 <- CDM::gdm( dat, irtmodel="1PL", theta.k=theta.k, Qmatrix=Qmatrix, b.constraint=b.constraint, skillspace="loglinear" )summary(mod4)#***# Model 5: define a different theta grid for each dimensiontheta.k <- list( "Dim1"=seq( -5, 5, len=11 ), "Dim2"=seq(-5,5,len=8), "Dim3"=seq( -3,3,len=6) )mod5 <- CDM::gdm( dat, irtmodel="1PL", theta.k=theta.k, Qmatrix=Qmatrix, b.constraint=b.constraint, skillspace="loglinear")summary(mod5)#***# Model 6: multdimensional 2PL model (normal distribution)theta.k <- seq( -5, 5, len=13 )a.constraint <- cbind( c(8,1,3), 1:3, 1, 1 ) # fix some slopes to 1mod6 <- CDM::gdm( dat, irtmodel="2PL", theta.k=theta.k, Qmatrix=Qmatrix, centered.latent=TRUE, a.constraint=a.constraint, decrease.increments=TRUE, skillspace="normal")summary(mod6)#***# Model 7: multdimensional 2PL model (loglinear distribution)a.constraint <- cbind( c(8,1,3), 1:3, 1, 1 )b.constraint <- cbind( c(8,1,3), 1, 0 )mod7 <- CDM::gdm( dat, irtmodel="2PL", theta.k=theta.k, Qmatrix=Qmatrix, b.constraint=b.constraint, a.constraint=a.constraint, decrease.increments=FALSE, skillspace="loglinear")summary(mod7)############################################################################## EXAMPLE 4: Unidimensional latent class 1PL IRT model############################################################################## simulate dataset.seed(754)I <- 20 # number of itemsN <- 2000 # number of personstheta <- c( -2, 0, 1, 2 )theta <- rep( theta, c(N/4,N/4, 3*N/8, N/8) )b <- seq(-2,2,len=I)library(sirt) # use function sim.raschtype from sirt packagedat <- sirt::sim.raschtype( theta=theta, b=b )theta.k <- seq(-1, 1, len=4) # initial vector of theta# estimate modelmod1 <- CDM::gdm( dat, theta.k=theta.k, skillspace="est", irtmodel="1PL", centerintercepts=TRUE, maxiter=200)summary(mod1) ## Estimated Skill Distribution ## F1 pi.k ## 1 -1.988 0.24813 ## 2 -0.055 0.23313 ## 3 0.940 0.40059 ## 4 2.000 0.11816############################################################################## EXAMPLE 5: Multidimensional latent class IRT model############################################################################## We simulate a two-dimensional IRT model in which theta vectors# are observed at a fixed discrete grid (see below).# simulate dataset.seed(754)I <- 13 # number of itemsN <- 2400 # number of persons# simulate Dimension 1 at 4 discrete theta pointstheta <- c( -2, 0, 1, 2 )theta <- rep( theta, c(N/4,N/4, 3*N/8, N/8) )b <- seq(-2,2,len=I)library(sirt) # use simulation function from sirt packagedat1 <- sirt::sim.raschtype( theta=theta, b=b )# simulate Dimension 2 at 4 discrete theta pointstheta <- c( -3, 0, 1.5, 2 )theta <- rep( theta, c(N/4,N/4, 3*N/8, N/8) )dat2 <- sirt::sim.raschtype( theta=theta, b=b )colnames(dat2) <- gsub( "I", "U", colnames(dat2))dat <- cbind( dat1, dat2 )# define Q-matrixQmatrix <- matrix(0,2*I,2)Qmatrix[ cbind( 1:(2*I), rep(1:2, each=I) ) ] <- 1theta.k <- seq(-1, 1, len=4) # initial matrixtheta.k <- cbind( theta.k, theta.k )colnames(theta.k) <- c("Dim1","Dim2")# estimate modelmod2 <- CDM::gdm( dat, theta.k=theta.k, skillspace="est", irtmodel="1PL", Qmatrix=Qmatrix, centerintercepts=TRUE)summary(mod2) ## Estimated Skill Distribution ## theta.k.Dim1 theta.k.Dim2 pi.k ## 1 -2.022 -3.035 0.25010 ## 2 0.016 0.053 0.24794 ## 3 0.956 1.525 0.36401 ## 4 1.958 1.919 0.13795############################################################################## EXAMPLE 6: Large-scale dataset data.mg#############################################################################data(data.mg, package="CDM")dat <- data.mg[, paste0("I", 1:11 ) ]theta.k <- seq(-6,6,len=21)#***# Model 1: Generalized partial credit model with multiple groupsmod1 <- CDM::gdm( dat, irtmodel="2PL", theta.k=theta.k, group=CDM::data.mg$group, skillspace="normal", standardized.latent=TRUE)summary(mod1)## End(Not run)Ideal Response Pattern
Description
This function computes the ideal response pattern which is the latentitem response\eta_{lj}=\prod_{k=1}^K \alpha_{lk} for a personwith skill profilel at itemj.
Usage
ideal.response.pattern(q.matrix, skillspace=NULL, rule="DINA")Arguments
q.matrix | The Q-matrix |
skillspace | An optional skill space matrix. If it is not provided, then all skillclasses are used for creating an ideal response pattern. |
rule | Chosen condensation rule for the CDM. Can be |
Value
A list with following entries
idealresp | A matrix with ideal response patterns |
skillspace | Used skill space |
Examples
############################################################################## EXAMPLE 1: Ideal response pattern sim.qmatrix#############################################################################data(sim.qmatrix, package="CDM")q.matrix <- sim.qmatrix#- ideal response pattern for DINA modelCDM::ideal.response.pattern(q.matrix)#- ideal response pattern for DINO modelCDM::ideal.response.pattern( q.matrix, rule="DINO" )# compute ideal responses for a reduced skill spaceskillspace <- matrix( c( 0,1,0, 1,1,0 ), 2,3, byrow=TRUE )CDM::ideal.response.pattern( q.matrix, skillspace=skillspace)Create Dataset with Group-Specific Items
Description
Creates a dataset with group-specific items which can be used for multiplegroup comparisons.
Usage
item_by_group(dat, group, invariant=NULL, rm.empty=TRUE)Arguments
dat | Dataset with item responses |
group | Vector of group identifiers |
invariant | Optional vector of variables which shouldnot be made group-specific, i.e. which should be treatedas invariant across groups. |
rm.empty | Logical indicating whether empty columns should be removed |
Value
Extended dataset with item responses
Examples
## Not run: ############################################################################## EXAMPLE 1: Create dataset with group-specific item responses#############################################################################data(data.mg, package="CDM")dat <- data.mg#-- create dataset with group-specific item responsesdat0 <- CDM::item_by_group( dat=dat[,paste0("I",1:5)], group=dat$group )#-- summary statisticssummary(dat0)colnames(dat0)#-- set some items to invariantinvariant_items <- c("I1","I4")dat1 <- CDM::item_by_group( dat=dat[,paste0("I",1:5)], group=dat$group, invariant=invariant_items)colnames(dat1)## End(Not run)RMSEA Item Fit
Description
This function estimates a chi squared based measure of item fitin cognitive diagnosis models similar to the RMSEA itemfitimplemented in mdltm (von Davier, 2005;cited in Kunina-Habenicht, Rupp & Wilhelm, 2009).
The RMSEA statistic is also called as the RMSD statistic, seeIRT.RMSD.
Usage
itemfit.rmsea(n.ik, pi.k, probs, itemnames=NULL)Arguments
n.ik | An array of four dimensions: Classes x items x categories x groups |
pi.k | An array of two dimensions: Classes x groups |
probs | An array of three dimensions: Classes x items x categories |
itemnames | An optional vector of item names. Default is |
Details
For itemj, the RMSEA itemfit in this function is calculatedas follows:
RMSEA_j=\sqrt{ \sum_k \sum_c \pi ( \bold{\theta}_c) \left( P_j ( \bold{\theta}_c ) -\frac{n_{jkc}}{N_{jc}} \right)^2 }
wherec denotes the class of the skill vector\bold{\theta},k is the item category,\pi ( \bold{\theta}_c) is the estimated class probabilityof\bold{\theta}_c,P_j is the estimated item response function,n_{jkc} is the expected number of students withskill\bold{\theta}_c onitemj in categoryk andN_{jc} is the expected number of students withskill\bold{\theta}_c onitemj.
Value
A list with two entries:
rmsea | Vector of RMSEA item statistics |
rmsea.groups | Matrix of group-wise RMSEA item statistics |
References
Kunina-Habenicht, O., Rupp, A. A., & Wilhelm, O. (2009).A practical illustration of multidimensional diagnostic skills profiling:Comparing results from confirmatory factor analysis and diagnosticclassification models.Studies in Educational Evaluation, 35, 64–70.
von Davier, M. (2005).A general diagnostic model applied to languagetesting data. ETS Research Report RR-05-16. ETS, Princeton, NJ: ETS.
See Also
This function is used indin,gdina andgdm.
S-X2 Item Fit Statistic for Dichotomous Data
Description
Computes the S-X2 item fit statistic (Orlando & Thissen; 2000, 2003)for dichotomous data. Note that completely observed data isnecessary for applying this function.
Usage
itemfit.sx2(object, Eik_min=1, progress=TRUE)## S3 method for class 'itemfit.sx2'summary(object, ...)## S3 method for class 'itemfit.sx2'plot(x, ask=TRUE, ...)Arguments
object | Object of class |
x | Object of class |
Eik_min | The minimum expected cell size for merging score groups. |
progress | An optional logical indicating whether progress should be displayed. |
ask | An optional logical indicating whether every item should beseparately displayed. |
... | Further arguments to be passed |
Details
The S-X2 item fit statistic compares observed and expected proportionsO_{jk} andE_{jk} for itemj andeach score groupk and forms a chi-square distributed statistic
S-X_j^2=\sum_{k=1}^{J-1} N_k \frac{ ( O_{jk} - E_{jk} )^2 } { E_{jk} ( 1 - E_{jk} ) }
The degrees of freedom areJ-1-P_j whereP_j denotesthe number of estimated item parameters.
Value
A list with following entries
itemfit.stat | Data frame containing item fit statistics |
itemtable | Data frame with expected and observed proportionsfor each score group and each item. Beside the ordinary p value,an adjusted p value obtained by correction due to multiple testingis provided ( |
Note
This function does not work properly for multiple groups.
Author(s)
Alexander Robitzsch
References
Li, Y., & Rupp, A. A. (2011). Performance of the S-X2 statistic forfull-information bifactor models.Educational and Psychological Measurement, 71, 986-1005.
Orlando, M., & Thissen, D. (2000). Likelihood-based item-fit indices fordichotomous item response theory models.Applied Psychological Measurement, 24, 50-64.
Orlando, M., & Thissen, D. (2003). Further investigation of the performance ofS-X2: An item fit index for use with dichotomous item response theory models.Applied Psychological Measurement, 27, 289-298.
Zhang, B., & Stone, C. A. (2008). Evaluating item fit for multidimensionalitem response models.Educational and Psychological Measurement,68, 181-196.
Examples
## Not run: ############################################################################## EXAMPLE 1: Items with unequal item slopes############################################################################## simulate dataset.seed(9871)I <- 11b <- seq( -1.5, 1.5, length=I)a <- rep(1,I)a[4] <- .4N <- 1000library(sirt)dat <- sirt::sim.raschtype( theta=stats::rnorm(N), b=b, fixed.a=a)#*** 1PL model estimated with gdmmod1 <- CDM::gdm( dat, theta.k=seq(-6,6,len=21), irtmodel="1PL" )summary(mod1)# estimate item fit statisticfitmod1 <- CDM::itemfit.sx2(mod1)summary(fitmod1) ## item itemindex S-X2 df p S-X2_df RMSEA Nscgr Npars p.holm ## 1 I0001 1 4.173 9 0.900 0.464 0.000 10 1 1.000 ## 2 I0002 2 12.365 9 0.193 1.374 0.019 10 1 1.000 ## 3 I0003 3 6.158 9 0.724 0.684 0.000 10 1 1.000 ## 4 I0004 4 37.759 9 0.000 4.195 0.057 10 1 0.000 ## 5 I0005 5 12.307 9 0.197 1.367 0.019 10 1 1.000 ## 6 I0006 6 19.358 9 0.022 2.151 0.034 10 1 0.223 ## 7 I0007 7 14.610 9 0.102 1.623 0.025 10 1 0.818 ## 8 I0008 8 15.568 9 0.076 1.730 0.027 10 1 0.688 ## 9 I0009 9 8.471 9 0.487 0.941 0.000 10 1 1.000 ## 10 I0010 10 8.330 9 0.501 0.926 0.000 10 1 1.000 ## 11 I0011 11 12.351 9 0.194 1.372 0.019 10 1 1.000 ## ## -- Average Item Fit Statistics -- ## S-X2=13.768 | S-X2_df=1.53# -> 4th item does not fit to the 1PL model# plot item fitplot(fitmod1)#*** 2PL model estimated with gdmmod2 <- CDM::gdm( dat, theta.k=seq(-6,6,len=21), irtmodel="2PL", maxiter=100 )summary(mod2)# estimate item fit statisticfitmod2 <- CDM::itemfit.sx2(mod2)summary(fitmod2) ## item itemindex S-X2 df p S-X2_df RMSEA Nscgr Npars p.holm ## 1 I0001 1 4.083 8 0.850 0.510 0.000 10 2 1.000 ## 2 I0002 2 13.580 8 0.093 1.697 0.026 10 2 0.747 ## 3 I0003 3 6.236 8 0.621 0.780 0.000 10 2 1.000 ## 4 I0004 4 6.049 8 0.642 0.756 0.000 10 2 1.000 ## 5 I0005 5 12.792 8 0.119 1.599 0.024 10 2 0.834 ## 6 I0006 6 14.397 8 0.072 1.800 0.028 10 2 0.648 ## 7 I0007 7 15.046 8 0.058 1.881 0.030 10 2 0.639 ## [...] ## ## -- Average Item Fit Statistics -- ## S-X2=10.22 | S-X2_df=1.277#*** 1PL model estimation in smirt (sirt package)Qmatrix <- matrix(1, nrow=I, ncol=1 )mod1a <- sirt::smirt( dat, Qmatrix=Qmatrix )summary(mod1a)# item fit statisticfitmod1a <- CDM::itemfit.sx2(mod1a)summary(fitmod1a)#*** 2PL model estimation in smirt (sirt package)mod2a <- sirt::smirt( dat, Qmatrix=Qmatrix, est.a="2PL")summary(mod2a)# item fit statisticfitmod2a <- CDM::itemfit.sx2(mod2a)summary(fitmod2a)#*** 1PL model estimated with rasch.mml2 (in sirt)mod1b <- sirt::rasch.mml2(dat)summary(mod1b)# estimate item fit statisticfitmod1b <- CDM::itemfit.sx2(mod1b)summary(fitmod1b)#*** 1PL estimated in TAMlibrary(TAM)mod1c <- TAM::tam.mml( resp=dat )summary(mod1c)# item fitsummary( CDM::itemfit.sx2( mod1c) )# conversion to mirt objectlibrary(sirt)library(mirt)cmod1c <- sirt::tam2mirt( mod1c )# item fit in mirtmirt::itemfit( cmod1c$mirt )#*** 2PL estimated in TAMmod2c <- TAM::tam.mml.2pl( resp=dat )summary(mod2c)# item fitsummary( CDM::itemfit.sx2( mod2c) )# conversion to mirt object and item fit in mirtcmod2c <- sirt::tam2mirt( mod2c )mirt::itemfit( cmod2c$mirt )# estimation in mirtmod1d <- mirt::mirt( dat, 1, itemtype="Rasch" )mirt::itemfit( mod1d ) # compute item fit############################################################################## EXAMPLE 2: Item fit statistics sim.dina dataset#############################################################################data(sim.dina, package="CDM")data(sim.qmatrix, package="CDM")#*** Model 1: DINA model (correctly specified model)mod1 <- CDM::din( data=sim.dina, q.matrix=sim.qmatrix )summary(mod1)# item fit statisticsummary( CDM::itemfit.sx2( mod1 ) ) ## -- Average Item Fit Statistics -- ## S-X2=7.397 | S-X2_df=1.233#*** Model 2: Mixed DINA/DINO model#*** 1th item is misspecified according to DINO ruleI <- ncol(CDM::sim.dina)rule <- rep("DINA", I )rule[1] <- "DINO"mod2 <- CDM::din( data=CDM::sim.dina, q.matrix=CDM::sim.qmatrix, rule=rule)summary(mod2)# item fit statisticsummary( CDM::itemfit.sx2( mod2 ) ) ## -- Average Item Fit Statistics -- ## S-X2=9.925 | S-X2_df=1.654#*** Model 3: Additive GDINA modelmod3 <- CDM::gdina( data=CDM::sim.dina, q.matrix=CDM::sim.qmatrix, rule="ACDM")summary(mod3)# item fit statisticsummary( CDM::itemfit.sx2( mod3 ) ) ## -- Average Item Fit Statistics -- ## S-X2=8.416 | S-X2_df=1.678## End(Not run)Extract Log-Likelihood
Description
Extracts the log-likelihood from eitherdin,gdina,mcdina,slca orgdm objects.
Usage
## S3 method for class 'din'logLik(object, ...)## S3 method for class 'gdina'logLik(object, ...)## S3 method for class 'mcdina'logLik(object, ...)## S3 method for class 'gdm'logLik(object, ...)## S3 method for class 'slca'logLik(object, ...)## S3 method for class 'reglca'logLik(object, ...)Arguments
object | An object inheriting from either class |
... | Additional arguments |
See Also
din,gdina,gdm,mcdina,slca,reglca
Examples
data(sim.dina, package="CDM")data(sim.qmatrix, package="CDM")# logLik method | DINA modeld1 <- CDM::din( sim.dina, q.matrix=sim.qmatrix, rule="DINA")summary(d1)lld1 <- logLik(d1) ## > lld1 ## 'log Lik.' -2042.378 (df=25) ## > attr(lld1,"df") ## [1] 25 ## > attr(lld1,"nobs") ## [1] 400nobs(lld1)# AIC and BICAIC(lld1)BIC(lld1)Multiple Choice DINA Model
Description
The functionmcdina implements the multiple choice DINA model(de la Torre, 2009; see also Ozaki, 2015; Chen & Zhou, 2017)for multiple groups. Note that the dataset must containinteger values1,\ldots, K_j for each item. The multiple choiceDINA model assumes that each item category possesses different diagnostic capacity.Using this modeling approach, different distractors of amultiple choice item can be of different diagnostic value. The Q-matrix can alsocontain integer values which allows the definition of polytomous attributes.
Usage
mcdina(dat, q.matrix, group=NULL, itempars="gr", weights=NULL, skillclasses=NULL, zeroprob.skillclasses=NULL, reduced.skillspace=TRUE, conv.crit=1e-04, dev.crit=0.1, maxit=1000, progress=TRUE)## S3 method for class 'mcdina'summary(object, digits=4, file=NULL, ...)## S3 method for class 'mcdina'print(x, ...)Arguments
dat | A required |
q.matrix | A required matrix specifying which item category is intended to measure which skill.The Q-matrix has |
group | An optional vector of group identifiers for multiple group estimation. |
itempars | A character or a character vector of length |
weights | An optional vector of sample weights. |
skillclasses | An optional matrix for determining the skill space. The argument can be usedif a user wants less than the prespecified number of |
zeroprob.skillclasses | An optional vector of integers which indicates which skill classes should havezero probability. Default is |
reduced.skillspace | An optional logical indicating whether theskill space should be reduced to cover only bivariate associationsamong skills (see Xu & von Davier, 2008). |
conv.crit | Convergence criterion for change in item parameter values |
dev.crit | Convergence criterion for change in deviance values |
maxit | Maximum number of iterations. |
progress | An optional logical indicating whether the function should print theprogress of iteration in the estimation process. |
object | Object of class |
digits | Number of digits to display in |
file | Optional file name for a file in which |
x | Object of class |
... | Further arguments to be passed. |
Details
The multiple choice DINA model defines for each item categoryjc thenecessary skills to master this attribute. Therefore, the vector of skills\bold{\alpha} is transformed into item-specific latent responses\eta_{j} which are functions of\bold{\alpha} and Q-matrix entriesq_{jc}(just like in the DINA model). If there areK_j item categories for itemj,then there exist at mostK_j values of the latent response\eta_j.
The multiple choice DINA model estimates the item response function as
P( X_{nj}=k | \eta_{nj}=l )=p_{jkl}
with the constraint\sum_k p_{jkl}=1.
Value
A list with following entries
item | Data frame with item parameters |
posterior | Individual posterior distribution |
likelihood | Individual likelihood |
ic | List with information criteria |
q.matrix | Used Q-matrix |
pik | Array of item-category probabilities |
delta | Array of item parameters |
se.delta | Array of standard errors of item parameters |
itemstat | Data frame containing item definitions |
n.ik | Array of expected counts |
deviance | Deviance |
attribute.patt | Probabilities of latent classes |
attribute.patt.splitted | Splitted attribute pattern |
skill.patt | Marginal skill probabilities |
MLE.class | Classified skills for each student (MLE) |
MAP.class | Classified skills for each student (MAP) |
EAP.class | Classified skills for each student (EAP) |
dat | Used dataset |
skillclasses | Used skill classes |
group | Used group identifiers |
lc | Data frame containing definitions of each item category |
lr | Data frame containing the relation of each latent class and each item category |
iter | Number of iterations |
itempars | Used specification of item parameter estimation type |
converged | Logical indicating whether convergence was achieved. |
Note
Ifdat andq.matrix correspond to the 'ordinary format' which is usedingdina, then the functionmcdina will detect it and convert itinto the necessary format (see Example 2).
References
Chen, J., & Zhou, H. (2017) Test designs and modeling under the generalnominal diagnosis model framework.PLoS ONE 12(6), e0180016.
de la Torre, J. (2009). A cognitive diagnosis model for cognitively basedmultiple-choice options.Applied Psychological Measurement, 33, 163-183.
Ozaki, K. (2015). DINA models for multiple-choice items with few parameters:Considering incorrect answers.Applied Psychological Measurement, 39(6), 431-447.
Xu, X., & von Davier, M. (2008).Fitting the structured general diagnosticmodel to NAEP data. ETS Research Report ETS RR-08-27. Princeton, ETS.
See Also
Seedin for estimating the DINA/DINO model andgdinafor estimating the GDINA model.
Examples
############################################################################## EXAMPLE 1: Multiple choice DINA model for data.cdm01 dataset#############################################################################data(data.cdm01, package="CDM")dat <- data.cdm01$datagroup <- data.cdm01$groupq.matrix <- data.cdm01$q.matrix#*** Model 1: Single group modelmod1 <- CDM::mcdina( dat=dat, q.matrix=q.matrix )summary(mod1)#*** Model 2: Multiple group model with group-invariant item parametersmod2 <- CDM::mcdina( dat=dat, q.matrix=q.matrix, group=group, itempars="jo")summary(mod2)## Not run: #*** Model 3: Multiple group model with group-specific item parametersmod3 <- CDM::mcdina( dat=dat, q.matrix=q.matrix, group=group, itempars="gr")summary(mod3)#*** Model 4: Multiple group model with some group-specific item parametersitempars <- rep("jo", ncol(dat))itempars[ c( 2, 7, 9) ] <- "gr" # set items 2,7 and 9 group specificmod4 <- CDM::mcdina( dat=dat, q.matrix=q.matrix, group=group, itempars=itempars)summary(mod4)#*** Model 5: Reduced skill space# define skill classesskillclasses <- scan(nlines=1) # read only one line 0 0 0 1 0 0 0 1 0 0 0 1 1 1 0 1 1 1skillclasses <- matrix( skillclasses, ncol=3, byrow=TRUE )mod5 <- CDM::mcdina( dat, q.matrix=q.matrix, group=group0, skillclasses=skillclasses )summary(mod5)#*** Model 6: Reduced skill space with setting zero probabilities# for some latent classes# set probabilities of classes P101 P011 (6th and 7th class) to zerozeroprob.skillclasses <- c(6,7)mod6 <- CDM::mcdina( dat, q.matrix, group=group, zeroprob.skillclasses=zeroprob.skillclasses )summary(mod6)############################################################################## EXAMPLE 2: Using the mcdina function for estimating the DINA model#############################################################################data(sim.dina, package="CDM")data(sim.qmatrix, package="CDM")# estimate the DINA modelmod <- CDM::mcdina( sim.dina, q.matrix=sim.qmatrix )summary(mod)############################################################################## EXAMPLE 3: MCDINA model with polytomous attributes#############################################################################data(data.cdm02, package="CDM")dat <- data.cdm02$dataq.matrix <- data.cdm02$q.matrix# estimate model with polytomous attribute B1mod1 <- CDM::mcdina( dat, q.matrix=q.matrix )summary(mod1)## End(Not run)Assessing Model Fit and Local Dependence by Comparing Observed and ExpectedItem Pair Correlations
Description
This function computes several measures of absolute model fit and localdependence indices for dichotomous item responses which arebased on comparing observed and expected frequencies of item pairs(Chen, de la Torre & Zhang, 2013; see Details).
Usage
modelfit.cor(data, posterior, probs)modelfit.cor2(data, posterior, probs)modelfit.cor.din( dinobj, jkunits=0 )## S3 method for class 'modelfit.cor.din'summary(object, ...)Arguments
data | An |
posterior | A matrix containing the posterior distribution (e.g. obtained asan output of the |
probs | An array of dimension [items,categories,attribute classes]containing probabilities |
dinobj | An object of class |
object | An object of class |
jkunits | Number of Jackknife units. The default is to use 0 units(no use of jackknifing). If jackknife estimation should beemployed, use (say) at least 20 jackknife units.The input |
... | Further arguments to be passed |
Details
The fit statistics are based on predictions of the pairwise table(X_i, X_j) of item responses. The\chi^2 statisticX2 foritem pairsi andj is defined as
\chi^2_{ij}=\sum_{k=0}^1 \sum_{l=0}^1 \frac{ (n_{ij,kl}-e_{ij,kl}) ^2 }{ e_{ij,kl} }
wheren_{ij,kl} is the absolute frequency of\{ X_{i}=k,X_j=l\}ande_{ij,kl} is the expected frequency using the estimated model.Note that for calculatinge_{ij,kl}, individual posterior distributionsare evaluated. The\chi^2_{ij} statistic is chi-square distributed with onedegree of freedom and can be used for testing whether itemsi andj are locally dependent. To control for multiple comparisons,p-value adjustments according to the Holm and FDR method are conducted(seestats::p.adjust).
The residual covarianceRESIDCOV of item pairs(i,j) is calculatedas
RESIDCOV_{ij}= \frac{ n_{ij,11} n_{ij,00} - n_{ij,10} n_{ij,01} }{n^2 } - \frac{ e_{ij,11} e_{ij,00} - e_{ij,10} e_{ij,01} }{n^2 }
whereMRESIDCOV is the average of allRESIDCOV statisticsand is the total sample size.
The statisticMADcor denotes the average absolute deviation betweenobserved correlationsr_{ij} and model predicted correlations\hat{r}_{ij} of item pairs(i,j):
MADcor=\frac{1}{ J(J-1)/2 } \sum_{i < j} | r_{ij} - \hat{r}_{ij} |
The SRMSR (standardized root mean square root of squared residuals,Maydeu-Olivares, 2013) is also based on comparing these correlations
SRMSR=\sqrt{ \frac{1}{ J(J-1)/2 } \sum_{i < j} ( r_{ij} - \hat{r}_{ij} )^2 }
For calculatingMADQ3 andMADaQ3,residuals\varepsilon_{ni}=X_{ni} - e_{ni} ofobserved and expected responses for respondentsn and itemsi areconstructed. Then, the average of the absolute values of pairwise correlationsof these residuals is computed forMADQ3. ForMADaQ3, the averageof the centered pairwise values (i.e. by subtracting the average Q3 statistic)is calculated.
The difference of Fisher transformed correlations (Chen et al., 2013) is alsocomputed and used for assessing statistical inference.
For every of the fit statisticsMADcor,MADacor,SRMSR,MX2,100*MADRESIDCOV andMADQ3 it holds that smaller values(values near to zero) indicate better fit.
Standard errors and confidence intervals of fit statistics are obtainedby Jackknife estimation.
Value
A list with following entries
modelfit.stat | Model fit statistics:
|
modelfit.test | Test of global absolute model fit using teststatistics of all item pairs. The statistic |
itempairs | Fit of itempairs which can be used for inspection of localdependence. The |
Note
The function does not handle sample weights properly.
The functionmodelfit.cor2 has the same functionality asmodelfit.cor but it is much faster because it is based onRcpp code.
References
Chen, J., de la Torre, J., & Zhang, Z. (2013).Relative and absolute fit evaluation in cognitive diagnosis modeling.Journal of Educational Measurement, 50, 123-140.
Chen, W., & Thissen, D. (1997). Local dependence indexes for item pairsusing item response theory.Journal of Educational and Behavioral Statistics,22, 265-289.
DiBello, L. V., Roussos, L. A., & Stout, W. F. (2007). Review ofcognitively diagnostic assessment and a summary of psychometric models.In C. R. Rao and S. Sinharay (Eds.),Handbook of Statistics,Vol. 26 (pp. 979–1030). Amsterdam: Elsevier.
Maydeu-Olivares, A. (2013). Goodness-of-fit assessment of item responsetheory models (with discussion).Measurement: Interdisciplinary Research and Perspectives,11, 71-137.
Maydeu-Olivares, A., & Joe, H. (2014). Assessing approximate fit in categoricaldata analysis.Multivariate Behavioral Research, 49, 305-328.
McDonald, R. P., & Mok, M. M.-C. (1995). Goodness of fit in item response models.Multivariate Behavioral Research, 30, 23-40.
Yen, W. M. (1984). Effects of local item dependence on the fit and equatingperformance of the three-parameter logistic model.Applied Psychological Measurement, 8, 125-145.
Examples
## Not run: ############################################################################## EXAMPLE 1: Model fit for sim.dina#############################################################################data(sim.dina, package="CDM")data(sim.qmatrix, package="CDM")dat <- sim.dinaq.matrix <- sim.qmatrix#*** Model 1: DINA model for DINA simulated datamod1 <- CDM::din(dat, q.matrix=q.matrix, rule="DINA" )fmod1 <- CDM::modelfit.cor.din(mod1, jkunits=10)summary(fmod1) ## Test of Global Model Fit ## type value p ## 1 max(X2) 8.728 0.113 ## 2 abs(fcor) 0.143 0.080 ## ## Fit Statistics ## est jkunits jk_est jk_se est_low est_upp ## MADcor 0.030 10 0.020 0.005 0.010 0.030 ## SRMSR 0.040 10 0.023 0.006 0.011 0.035 ## 100*MADRESIDCOV 0.671 10 0.445 0.125 0.200 0.690 ## MADQ3 0.062 10 0.037 0.008 0.021 0.052 ## MADaQ3 0.059 10 0.034 0.008 0.019 0.050# look at first five item pairs with highest degree of local dependenceitempairs <- fmod1$itempairsitempairs <- itempairs[ order( itempairs$X2, decreasing=TRUE ), ]itempairs[ 1:5, c("item1","item2", "X2", "X2_p", "X2_p.holm", "Q3") ] ## item1 item2 X2 X2_p X2_p.holm Q3 ## 29 Item5 Item8 8.728248 0.003133174 0.1127943 -0.26616414 ## 32 Item6 Item8 2.644912 0.103881881 1.0000000 0.04873154 ## 21 Item3 Item9 2.195011 0.138458201 1.0000000 0.05948456 ## 10 Item2 Item4 1.449106 0.228671389 1.0000000 -0.08036216 ## 30 Item5 Item9 1.393583 0.237800911 1.0000000 -0.01934420#*** Model 2: DINO model for DINA simulated datamod2 <- CDM::din(dat, q.matrix=q.matrix, rule="DINO" )fmod2 <- CDM::modelfit.cor.din(mod2, jkunits=10 ) # 10 jackknife unitssummary(fmod2) ## Test of Global Model Fit ## type value p ## 1 max(X2) 13.139 0.010 ## 2 abs(fcor) 0.199 0.001 ## ## Fit Statistics ## est jkunits jk_est jk_se est_low est_upp ## MADcor 0.056 10 0.041 0.007 0.026 0.055 ## SRMSR 0.072 10 0.045 0.019 0.007 0.083 ## 100*MADRESIDCOV 1.225 10 0.878 0.183 0.519 1.236 ## MADQ3 0.073 10 0.055 0.012 0.031 0.080 ## MADaQ3 0.073 10 0.066 0.012 0.042 0.089#*** Model 3: estimate DINA model with gdina functionmod3 <- CDM::gdina( dat, q.matrix=q.matrix, rule="DINA" )fmod3 <- CDM::modelfit.cor.din( mod3, jkunits=0 ) # no Jackknife estimationsummary(fmod3) ## Test of Global Model Fit ## type value p ## 1 max(X2) 8.756 0.111 ## 2 abs(fcor) 0.143 0.078 ## ## Fit Statistics ## est ## MADcor 0.030 ## SRMSR 0.040 ## MX2 0.719 ## 100*MADRESIDCOV 0.668 ## MADQ3 0.062 ## MADaQ3 0.059############################################################################## EXAMPLE 2: Simulated Example DINA model#############################################################################set.seed(9765)# specify Q-matrixQ <- matrix( c(1,0, 0,1, 1,1 ), nrow=3, ncol=2, byrow=TRUE )q.matrix <- Q[ rep(1:3,4), ]I <- nrow(q.matrix)# simulate dataguess <- stats::runif(I, 0, .3 )slip <- stats::runif( I, 0, .4 )N <- 150 # number of personsdat <- CDM::sim.din( N=N, q.matrix=q.matrix, slip=slip, guess=guess )$dat#*** estmate DINA modelmod1 <- CDM::din( dat, q.matrix=q.matrix, rule="DINA" )fmod1 <- CDM::modelfit.cor.din(mod1, jkunits=10)summary(fmod1) ## Test of Global Model Fit ## type value p ## 1 max(X2) 10.697 0.071 ## 2 abs(fcor) 0.277 0.026 ## ## Fit Statistics ## est jkunits jk_est jk_se est_low est_upp ## MADcor 0.052 10 0.026 0.010 0.006 0.045 ## SRMSR 0.074 10 0.048 0.013 0.022 0.074 ## 100*MADRESIDCOV 1.259 10 0.646 0.213 0.228 1.063 ## MADQ3 0.080 10 0.047 0.010 0.027 0.068 ## MADaQ3 0.079 10 0.046 0.010 0.027 0.065## End(Not run)Numerical Computation of the Hessian Matrix
Description
Computes numerically the Hessian matrix of a given function forall coordinates (numerical_Hessian), for a selecteddirection (numerical_Hessian_partial) or the gradientof a multivariate function (numerical_gradient).
Usage
numerical_Hessian(par, FUN, h=1e-05, gradient=FALSE, hessian=TRUE, diag_only=FALSE, ...)numerical_Hessian_partial(par, FUN, h=1e-05, coordinate=1, ... )numerical_gradient(par, FUN, h=1E-5, ...)Arguments
par | Parameter vector |
FUN | Specified function with argument vector |
h | Numerical differentiation parameter. Can be also a vector.The increment in the numerical approximation of the derivative isdefined as |
gradient | Logical indicating whether the gradient should be calculated. |
hessian | Logical indicating whether the Hessianmatrix should be calculated. |
diag_only | Logical indicating whether only the diagonal of thehessian should be computed. |
... | Further arguments to be passed to |
coordinate | Coordinate index for partial derivative |
Value
Gradient vector or Hessian matrix or a list of both elements
See Also
See thenumDeriv package and themirt::numerical_derivfunction from themirt package.
Examples
############################################################################## EXAMPLE 1: Toy example for Hessian matrix############################################################################## define functionf <- function(x){ 3*x[1]^3 - 4*x[2]^2 - 5*x[1]*x[2] + 10 * x[1] * x[3]^2 + 6*x[2]*sqrt(x[3])}# define point for evaluating partial derivativespar <- c(3,8,4)#--- compute gradientCDM::numerical_Hessian( par=par, FUN=f, gradient=TRUE, hessian=FALSE)## Not run: mirt::numerical_deriv(par=par, f=f, gradient=TRUE)#--- compute Hessian matrixCDM::numerical_Hessian( par=par, FUN=f )mirt::numerical_deriv(par=par, f=f, gradient=FALSE)numerical_Hessian( par=par, FUN=f, h=1E-4 )#--- compute gradient and Hessian matrixCDM::numerical_Hessian( par=par, FUN=f, gradient=TRUE, hessian=TRUE)## End(Not run)Opens and Closes asink Connection
Description
Opens and closes asink connection.
Usage
osink(file, suffix, append=FALSE)csink(file)Arguments
file | File name. No |
suffix | Suffix which should be put next to the file name |
append | Optional logical indicating whether console output shouldbe appended to an already existing file. See argument |
See Also
Examples
## The function 'osink' is currently defined asfunction (file, suffix){ if (!is.null(file)) { base::sink(paste0(file, suffix), split=TRUE) } }## The function 'csink' is currently defined asfunction (file){ if (!is.null(file)) { base::sink() } }Appropriateness Statistic for Person Fit Assessment
Description
This function computes the person fit appropriateness statistics(Levine & Drasgow, 1988) as proposed for cognitive diagnosticmodels by Liu, Douglas and Henson (2009). The appropriateness statisticassesses spuriously high scorers (attr.type=1) andspuriously low scorers (attr.type=0).
Usage
personfit.appropriateness(data, probs, skillclassprobs, h=0.001, eps=1e-10, maxiter=30, conv=1e-05, max.increment=0.1, progress=TRUE)## S3 method for class 'personfit.appropriateness'summary(object, digits=3, ...)## S3 method for class 'personfit.appropriateness'plot(x, cexpch=.65, ...)Arguments
data | Data frame of dichotomous item responses |
probs | Probabilities evaluated at skill space (abilities |
skillclassprobs | Probabilities of skill classes |
h | Numerical differentiation parameter |
eps | Constant which is added to probabilities avoiding zero probability |
maxiter | Maximum number of iterations |
conv | Convergence criterion |
max.increment | Maximum increment in iteration |
progress | Optional logical indicating whether iteration progress shouldbe displayed. |
object | Object of class |
digits | Number of digits for rounding |
x | Object of class |
cexpch | Point size in plot |
... | Further arguments to be passed |
Value
List with following entries
summary | Summaries of person fit statistic |
personfit.appr.type1 | Statistic for spuriously high scorers( |
personfit.appr.type0 | Statistic for spuriously low scorers( |
References
Levine, M. V., & Drasgow, F. (1988). Optimal appropriateness measurement.Psychometrika, 53, 161-176.
Liu, Y., Douglas, J. A., & Henson, R. A. (2009). Testing person fit in cognitivediagnosis.Applied Psychological Measurement, 33(8), 579-598.
Examples
############################################################################## EXAMPLE 1: DINA model data.ecpe#############################################################################data(data.ecpe, package="CDM")# fit DINA modelmod1 <- CDM::din( CDM::data.ecpe$data[,-1], q.matrix=CDM::data.ecpe$q.matrix )summary(mod1)# person fit appropriateness statisticdata <- mod1$dataprobs <- mod1$pjkskillclassprobs <- mod1$attribute.patt[,1]res <- CDM::personfit.appropriateness( data, probs, skillclassprobs, maxiter=8) # only few iterationssummary(res)plot(res)## Not run: ############################################################################## EXAMPLE 2: Person fit 2PL model#############################################################################data(data.read, package="sirt")dat <- data.readI <- ncol(dat)# fit 2PL modelmod1 <- sirt::rasch.mml2( dat, est.a=1:I)# person fit statisticdata <- mod1$datprobs0 <- t(mod1$pjk)probs <- array( 0, dim=c( I, 2, dim(probs0)[2] ) )probs[,2,] <- probs0probs[,1,] <- 1 - probs0skillclassprobs <- mod1$trait.distr$pi.kres <- CDM::personfit.appropriateness( data, probs, skillclassprobs )summary(res)plot(res)## End(Not run)Plot Method for Objects of Class din
Description
S3 method to plot objects of the classdin.
Usage
## S3 method for class 'din'plot(x, items=c(1:ncol(x$data)), pattern="", uncertainty=0.1, top.n.skill.classes=6, pdf.file="", hide.obs=FALSE, display.nr=1:4, ask=TRUE, ...)Arguments
x | A required object of class |
items | An index vector giving the items to be visualized in the firstplot, see ‘Details’. The default is |
pattern | An optional character or a numeric vector specifying a response patternof an respondent, whose attributes are analyzed in a separategraphic. It is required to choose a pattern from the empiricaldata set (see Example). |
uncertainty | A numeric between 0 and 0.5 giving theuncertainty bounds for deriving the observed skill occurrence probabilitiesin plot 2 and the simplified deterministic attribute profiles in plot 4. |
top.n.skill.classes | A numeric, specifying the number of skill classes,starting with the most frequent, to be labeled in plot 3.Default value is 6. |
pdf.file | An optional character string. If specified the graphicsobtained from the function |
hide.obs | An optional logical value. If set to |
display.nr | An optional numeric or numeric vector. If specified, only the plots in |
ask | An optional logical indicating whether a request for a user inputis necessary before the next figure is drawn. |
... | Optional graphical parameters to be passed to or from othermethods will be ignored. |
Details
Theplot method graphs the results obtained from a CDM analysis.Four graphics to analyze the fitted model are produced, respectively.
The first graphic depicts the parameter estimates their diagnostic accuracyfor each of chosen the items initems. Parameter estimates aresplitted in guessing and slipping errors for each item. Seedinfor further information.
The second graphic shows the estimated occurrence probabilities of the attributesunderlying the items.
The third graphic illustrates the distribution of the skill class occurrenceprobabilities. Thetop.n.skill.classes most frequent skill classes are labeled.
The forth plot is a parallel coordinate plot of the individual skill profiles.Each line represents an individual skill profile. For each of these skill profileson the vertical lines the individual probabilities of mastering the correspondingattributes are drawn.
If inpattern an empirical response pattern is specified, the fifth plotshows the individual skill profile of an examinee having this response pattern.For each attribute, having a mastering probability below0.5 - uncertaintythe examinee is classified as non-master of the corresponding attribute. Formastering probabilities higher than0.5 + uncertainty the examinee isclassified as master of the corresponding attribute.
Value
If the argumentx is of required type,and if the optional argumentsitems,uncertainty,top.n.skill.classes andpdf.file are specified as required, theplot.din produces several graphics to analyze a CDM model.
See Also
print.din, the S3 method for printing objects ofthe classdin;summary.din, the S3method for summarizing objects of the classdin, whichcreates objects of the classsummary.din;print.summary.din, the S3 method for printingobjects of the classsummary.din;din,the main function for DINA and DINO parameter estimation, whichcreates objects of the classdin. See alsoCDM-packagefor general information about this package.
Examples
#### (1) examples based on dataset fractions.subtraction.data##data(fraction.subtraction.data)data(fraction.subtraction.qmatrix)## Fix the guessing parameters of items 5, 8 and 9 equal to .20# define a constraint.guess matrixconstraint.guess <- matrix(c(5,8,9, rep(0.2, 3)), ncol=2)fractions.dina.fixed <- CDM::din(data=fraction.subtraction.data, q.matrix=fraction.subtraction.qmatrix, constraint.guess=constraint.guess)## The second plot shows the expected (MAP) and observed skill## probabilities. The third plot visualizes the skill class## occurrence probabilities; Only the 'top.n.skill.classes' most frequent## skill classes are labeled; it is obvious that the skill class '11111111'## (all skills are mastered) is the most probable in this population.## The fourth plot shows the skill probabilities conditional on response## patterns; in this population the skills 3 and 6 seem to be## mastered easier than the others. The fifth plot shows the## skill probabilities conditional on a specified response## pattern; it is shown whether a skill is mastered (above## .5+'uncertainty') unclassifiable (within the boundaries) or## not mastered (below .5-'uncertainty'). In this case, the## 527th respondent was chosen; if no response pattern is## specified, the plot will not be shown (of course)pattern <- paste(fraction.subtraction.data[527, ], collapse="")plot(fractions.dina.fixed, pattern=pattern, display.nr=4)# It is also possible to input a vector of item responsesplot(fractions.dina.fixed, pattern=fraction.subtraction.data[527, ],display.nr=4)#uncertainty=0.1, top.n.skill.classes=6 are defaultplot(fractions.dina.fixed, uncertainty=0.1, top.n.skill.classes=6, pattern=pattern)S3 Methods for Plotting Item Probabilities
Description
This S3 method plots item probabilities for non-masters and masters of an item.
Usage
plot_item_mastery(object, pch=c(16,17), lty=c(1,2), ...)## S3 method for class 'din'plot_item_mastery(object, pch=c(16,17), lty=c(1,2), ...)## S3 method for class 'gdina'plot_item_mastery(object, pch=c(16,17), lty=c(1,2), ...)Arguments
object | |
pch | Point symbols for both groups |
lty | Line symbols for both groups |
... | More arguments to be passed. |
Value
Plot
See Also
Plot functions for item response curves:IRT.irfprobPlot.
Examples
## Not run: ############################################################################## EXAMPLE 1: Plot item mastery#############################################################################data(sim.dina)data(sim.qmatrix)#* estimate DINA Modelmod1 <- CDM::din(sim.dina, q.matrix=sim.qmatrix, rule="DINA")#* estimate GDINA modelmod2 <- CDM::gdina(sim.dina, q.matrix=sim.qmatrix)#* plotsplot_item_mastery(mod1)plot_item_mastery(mod2)## End(Not run)Expected Values and Predicted Probabilities from Item Response Response Models
Description
This function computes expected values for each person and eachitem based on the individual posterior distribution. The outputof this function can be the basis of creating item andperson fit statistics.
Usage
IRT.predict(object, dat, group=1)## S3 method for class 'din'predict(object, group=1, ...)## S3 method for class 'gdina'predict(object, group=1, ...)## S3 method for class 'mcdina'predict(object, group=1, ...)## S3 method for class 'gdm'predict(object, group=1, ...)## S3 method for class 'slca'predict(object, group=1, ...)Arguments
object | Object for the S3 methods |
dat | Dataset with item responses |
group | Group index for use |
... | Further arguments to be passed. |
Value
A list with following entries
expected | Array with expected values (persons |
probs.categ | Array with expected probabilities foreach category (persons |
variance | Array with variance in predicted values for eachperson and each item. |
residuals | Array with residuals for each person and each item |
stand.resid | Array with standardized residuals for eachperson and each item |
Examples
## Not run: ############################################################################## EXAMPLE 1: Fitted Rasch model in TAM package##############################################################################--- Model 1: Rasch modellibrary(TAM)mod1 <- TAM::tam.mml(resp=TAM::sim.rasch)# apply IRT.predict functionprmod1 <- CDM::IRT.predict(mod1, mod1$resp )str(prmod1)## End(Not run)############################################################################## EXAMPLE 2: Predict function for din############################################################################## DINA Modelmod1 <- CDM::din( CDM::sim.dina, q.matr=CDM::sim.qmatrix, rule="DINA" )summary(mod1)# apply predict methodprmod1 <- CDM::IRT.predict( mod1, sim.dina )str(prmod1)Print Method for Objects of Class summary.din
Description
S3 method to print objects of the classsummary.din.
Usage
## S3 method for class 'summary.din'print(x, ...)Arguments
x | A required object of class |
... | Optional parameters to be passed to or from othermethods will be ignored. |
Details
Theprint method prints the summary information about objectsof the classdin computed bysummary.din,which are the item discriminations indices, the most frequentskill classes and the model information criteria AIC and BIC.Specific summary information details such asindividual items with their discrimination index can be accessed throughassignment (see ‘Examples’).
Value
If the argumentx is of required type,print.summary.din prints the summaryinformation in ‘Details’, and invisibly returnsx.
See Also
plot.din, the S3 method for plotting objects ofthe classdin;print.din, the S3 methodfor printing objects of the classdin;summary.din, the S3 method for summarizing objectsof the classdin, which creates objects of the classsummary.din;din, the main function forDINA and DINO parameter estimation, which creates objects of the classdin. See alsoCDM-package for generalinformation about this package.
Examples
#### (1) examples based on dataset fractions.subtraction.data#### In particular, accessing detailed summary through assignmentmod <- CDM::din(data=CDM::fraction.subtraction.data, q.matrix=CDM::fraction.subtraction.qmatrix, rule="DINA")smod <- summary(mod)str(smod)Regularized Latent Class Analysis
Description
Estimates the regularized latent class model for dichotomousresponses based on regularization methods(Chen, Liu, Xu, & Ying, 2015; Chen, Li, Liu, & Ying, 2017).The SCAD and MCP penalty functions are available.
Usage
reglca(dat, nclasses, weights=NULL, group=NULL, regular_type="scad", regular_lam=0, sd_noise_init=1, item_probs_init=NULL, class_probs_init=NULL, random_starts=1, random_iter=20, conv=1e-05, h=1e-04, mstep_iter=10, maxit=1000, verbose=TRUE, prob_min=.0001)## S3 method for class 'reglca'summary(object, digits=4, file=NULL, ...)Arguments
dat | Matrix with dichotomous item responses. |
nclasses | Number of classes |
weights | Optional vector of sampling weights |
group | Optional vector for grouping variable |
regular_type | Regularization type. Can be |
regular_lam | Regularization parameter |
sd_noise_init | Standard deviation for amount of noise in generating random starting values |
item_probs_init | Optional matrix of initial item response probabilities |
class_probs_init | Optional vector of class probabilities |
random_starts | Number of random starts |
random_iter | Number of initial iterations for random starts |
conv | Convergence criterion |
h | Numerical differentiation parameter |
mstep_iter | Number of iterations in the M-step |
maxit | Maximum number of iterations |
verbose | Logical indicating whether convergence progress should be displayed |
prob_min | Lower bound for probabilities in estimation |
object | A required object of class |
digits | Number of digits after decimal separator to display. |
file | Optional file name for a file in which |
... | Further arguments to be passed. |
Details
The regularized latent class model for dichotomous item responses assumesClatent classes. The item response probabilitiesP(X_i=1|c)=p_{ic} are estimatedin such a way such that the number of differentp_{ic} values per item isminimized. This approach eases interpretability and enables to recover thestructure of a true (but unknown) cognitive diagnostic model.
Value
A list containing following elements (selection):
item_probs | Item response probabilities |
class_probs | Latent class probabilities |
p.aj.xi | Individual posterior |
p.xi.aj | Individual likelihood |
loglike | Log-likelihood value |
Npars | Number of estimated parameters |
Nskillpar | Number of skill class parameters |
G | Number of groups |
n.ik | Expected counts |
Nipar | Number of item parameters |
n_reg | Number of regularized parameters |
n_reg_item | Number of regularized parameters per item |
item | Data frame with item parameters |
pjk | Item response probabilities (in an array) |
N | Number of persons |
I | Number of items |
References
Chen, Y., Liu, J., Xu, G., & Ying, Z. (2015). Statistical analysis of Q-matrixbased diagnostic classification models.Journal of the American Statistical Association, 110, 850-866.
Chen, Y., Li, X., Liu, J., & Ying, Z. (2017).Regularized latent class analysis with application in cognitive diagnosis.Psychometrika, 82, 660-692.
See Also
See also thegdina andslca functionsfor regularized estimation.
Examples
## Not run: ############################################################################## EXAMPLE 1: Estimating a regularized LCA for DINA data##############################################################################---- simulate dataI <- 12 # number of items# define Q-matrixq.matrix <- matrix(0,I,2)q.matrix[ 1:(I/3), 1 ] <- 1q.matrix[ I/3 + 1:(I/3), 2 ] <- 1q.matrix[ 2*I/3 + 1:(I/3), c(1,2) ] <- 1N <- 1000 # number of personsguess <- rep(seq(.1,.3,length=I/3), 3)slip <- .1rho <- 0.3 # skill correlationset.seed(987)dat <- CDM::sim.din( N=N, q.matrix=q.matrix, guess=guess, slip=slip, mean=0*c( .2, -.2 ), Sigma=matrix( c( 1, rho,rho,1), 2, 2 ) )dat <- dat$dat#--- Model 1: Four latent classes without regularizationmod1 <- CDM::reglca(dat=dat, nclasses=4, regular_lam=0, random_starts=3, random_iter=10, conv=1E-4)summary(mod1)#--- Model 2: Four latent classes with regularization and lambda=.08mod2 <- CDM::reglca(dat=dat, nclasses=4, regular_lam=0.08, regular_type="scad", random_starts=3, random_iter=10, conv=1E-4)summary(mod2)#--- Model 3: Four latent classes with regularization and lambda=.05 with warm start# "warm start" -> use initial parameters from fitted model with higher lambda valueitem_probs_init <- mod2$item_probsclass_probs_init <- mod2$class_probsmod3 <- CDM::reglca(dat=dat, nclasses=4, regular_lam=0.05, regular_type="scad", item_probs_init=item_probs_init, class_probs_init=class_probs_init, random_starts=3, random_iter=10, conv=1E-4)## End(Not run)Constructing a Dataset with Sequential Pseudo Items for OrderedItem Responses
Description
This function constructs dichotomous pseudo items from polytomous ordereditems (Tutz, 1997). Using this method, developed test models for dichotomousdata can be applied for polytomous item responses after transforming theminto dichotomous data. See Details for the construction.
Ma and de la Torre (2016) proposed a sequential GDINA model.Interestingly, the proposed model can be fitted with thegdina function in thisCDM package while item responseshas to be transformed with thesequential.items function forobtaining dichotomous pseudoitems. The Q-matrix for the sequential model of Ma andde la Torre (2016) can be used in the GDINA model for thedichotomous pseudoitems. This approach is implemented for automaticuse ingdina.
Usage
sequential.items(data)Arguments
data | A data frame with item responses |
Details
Assume that itemj possessesK \geq 3 categories. We label thesecategories ask=0,1,\ldots,K-1. The original item responsesX_{nj}for personn at itemj is then transformed intoK-1 pseudoitemsY_{j1}, \ldots, Y_{j,K-1}.
The first pseudo item responseY_{nj1} is defined as 1 iffX_{nj} \geq 1. The second item responsesY_{nj2} is 1 iffX_{nj} \geq 2, it is 0 iffX_{nj}=1 and it is missing(NA in the dataset) iffX_{nj}=0. The construction proceedsin the same manner for other categories (see Tutz, 1997). The pseudo items can berecognized as 'hurdles' a participant has to master to get a score ofkfor the original item.
The pseudo items are treated as conditionally independent which implies thatIRT models or CDMs which assume local independence can be employed for estimation.
For deriving item response probabilities of the original items from responseprobabilities of the pseudo items see Tutz (1997, p. 141ff.).
Value
A list with following entries
dat.expand | A data frame with dichotomous pseudo items |
iteminfo | A data frame containing some item information |
maxK | Vector with maximum number of categories per item |
References
Ma, W., & de la Torre, J. (2016).A sequential cognitive diagnosis model for polytomous responses.British Journal of Mathematical and Statistical Psychology, 69(3),253-275.
Tutz, G. (1997). Sequential models for ordered responses.In W. van der Linden & R. K. Hambleton.Handbook of modern item response theory (pp. 139-152).New York: Springer.
Examples
############################################################################## EXAMPLE 1: Constructing sequential pseudo items for data.mg#############################################################################data(data.mg, package="CDM")dat <- data.mgitems <- colnames(dat)[ which( substring( colnames(dat),1,1)=="I" ) ]## [1] "I1" "I2" "I3" "I4" "I5" "I6" "I7" "I8" "I9" "I10" "I11"data <- dat[,items]# construct sequential dichotomous pseudo itemsres <- CDM::sequential.items(data)# item information tableres$iteminfo ## item itemindex category pseudoitem ## 1 I1 1 1 I1 ## 2 I2 2 1 I2 ## 3 I3 3 1 I3 ## 4 I4 4 1 I4_Cat1 ## 5 I4 4 2 I4_Cat2 ## 6 I5 5 1 I5_Cat1 ## 7 I5 5 2 I5_Cat2 ## [...]# extract dataset with pseudo itemsdat.expand <- res$dat.expandcolnames(dat.expand) ## [1] "I1" "I2" "I3" "I4_Cat1" "I4_Cat2" "I5_Cat1" ## [7] "I5_Cat2" "I6_Cat1" "I6_Cat2" "I7_Cat1" "I7_Cat2" "I7_Cat3" ## [13] "I8" "I9" "I10" "I11_Cat1" "I11_Cat2" "I11_Cat3"# compare original items and pseudoitems#**** Item I1stats::xtabs( ~ paste(data$I1) + paste(dat.expand$I1) ) ## paste(dat.expand$I1) ## paste(data$I1) 0 1 NA ## 0 4339 0 0 ## 1 0 33326 0 ## NA 0 0 578#**** Item I7stats::xtabs( ~ paste(data$I7) + paste(dat.expand$I7_Cat1) ) ## paste(dat.expand$I7_Cat1) ## paste(data$I7) 0 1 NA ## 0 3825 0 0 ## 1 0 14241 0 ## 2 0 14341 0 ## 3 0 5169 0 ## NA 0 0 667stats::xtabs( ~ paste(data$I7) + paste(dat.expand$I7_Cat2) ) ## paste(dat.expand$I7_Cat2) ## paste(data$I7) 0 1 NA ## 0 0 0 3825 ## 1 14241 0 0 ## 2 0 14341 0 ## 3 0 5169 0 ## NA 0 0 667stats::xtabs( ~ paste(data$I7) + paste(dat.expand$I7_Cat3) ) ## paste(dat.expand$I7_Cat3) ## paste(data$I7) 0 1 NA ## 0 0 0 3825 ## 1 0 0 14241 ## 2 14341 0 0 ## 3 0 5169 0 ## NA 0 0 667## Not run: #*** Model 1: Rasch model for sequentially created pseudo itemsmod <- CDM::gdm( dat.expand, irtmodel="1PL", theta.k=seq(-5,5,len=21), skillspace="normal", decrease.increments=TRUE)## End(Not run)Data Simulation Tool for DINA, DINO and mixed DINA and DINO Data
Description
sim.din can be used to simulate dichotomous response data according to a CDMmodel. The model type DINA or DINO can be specified item wise. The number of items,the sample size, and two parameters for each item,the slipping and guessing parameters, can be set explicitly.
Usage
sim.din(N=0, q.matrix, guess=rep(0.2, nrow(q.matrix)), slip=guess, mean=rep(0, ncol(q.matrix)), Sigma=diag(ncol(q.matrix)), rule="DINA", alpha=NULL)Arguments
N | A numeric value specifying the number |
q.matrix | A required binary |
guess | An optional vector of guessing parameters. Default is0.2 for each item. |
slip | An optional vector of slipping parameters. Default is0.2 for each item. |
mean | A numeric vector of length |
Sigma | A matrix of dimension |
rule | An optional character string or vector of character stringsspecifying the model rule that is used. The character strings must beof |
alpha | A matrix of attribute patterns which can be given as an inputinstead of underlying latent variables. If |
Value
A list with following entries
dat | A matrix of simulated dichotomous response dataaccording to the specified CDM model. |
alpha | Simulated attributes |
References
Rupp, A. A., Templin, J. L., & Henson, R. A. (2010).DiagnosticMeasurement: Theory, Methods, and Applications. New York: The GuilfordPress.
See Also
Data-sim for artificial date set simulated with the help of thismethod;plot.din, the S3 method for plotting objects ofthe classdin;summary.din, the S3method for summarizing objects of the classdin, whichcreates objects of the classsummary.din;print.summary.din, the S3 method for printingobjects of the classsummary.din;din,the main function for DINA and DINO parameter estimation,which creates objects of the classdin. See alsoCDM-package for general information about this package.
Seesim_model for a general simulation function.
Examples
############################################################################### EXAMPLE 1: simulate DINA/DINO data according to a tetrachoric correlation############################################################################## define Q-matrix for 4 items and 2 attributesq.matrix <- matrix(c(1,0,0,1,1,1,1,1), ncol=2, nrow=4)# Slipping parametersslip <- c(0.2,0.3,0.4,0.3)# Guessing parametersguess <- c(0,0.1,0.05,0.2)set.seed(1567) # fix random numbersdat1 <- CDM::sim.din(N=200, q.matrix, slip=slip, guess=guess, # Possession of the attributes with high probability mean=c(0.5,0.2), # Possession of the attributes is weakly correlated Sigma=matrix(c(1,0.2,0.2,1), ncol=2), rule="DINA")$dathead(dat1)set.seed(15367) # fix random numbersres <- CDM::sim.din(N=200, q.matrix, slip=slip, guess=guess, mean=c(0.5,0.2), Sigma=matrix(c(1,0.2,0.2,1), ncol=2), rule="DINO")# extract simulated datadat2 <- res$dat# extract attribute patternshead( res$alpha ) ## [,1] [,2] ## [1,] 1 1 ## [2,] 1 1 ## [3,] 1 1 ## [4,] 1 1 ## [5,] 1 1 ## [6,] 1 0# simulate data based on given attributes# -> 5 persons with 2 attributes -> see the Q-matrix abovealpha <- matrix( c(1,0,1,0,1,1,0,1,1,1), nrow=5,ncol=2, byrow=TRUE )CDM::sim.din( q.matrix=q.matrix, alpha=alpha )## Not run: ############################################################################## EXAMPLE 2: Simulation based on attribute vectors#############################################################################set.seed(76)# define Q-matrixQmatrix <- matrix(c(1,0,1,0,1,0,0,1,0,1,0,1,1,1,1,1), 8, 2, byrow=TRUE)colnames(Qmatrix) <- c("Attr1","Attr2")# define skill patternsalpha.patt <- matrix(c(0,0,1,0,0,1,1,1), 4,2,byrow=TRUE )AP <- nrow(alpha.patt)# define pattern probabilitiesalpha.prob <- c( .20, .40, .10, .30 )# simulate alpha latent responsesN <- 1000 # number of personsind <- sample( x=1:AP, size=N, replace=TRUE, prob=alpha.prob)alpha <- alpha.patt[ ind, ] # (true) latent responses# define guessing and slipping parametersguess <- c(.26,.3,.07,.23,.24,.34,.05,.1)slip <- c(.05,.16,.19,.03,.03,.19,.15,.05)# simulation of the DINA modeldat <- CDM::sim.din(N=0, q.matrix=Qmatrix, guess=guess, slip=slip, alpha=alpha)$dat# estimate modelres <- CDM::din( dat, q.matrix=Qmatrix )# extract maximum likelihood estimates for individual classificationsest <- paste( res$pattern$mle.est )# calculate classification accuracymean( est==apply( alpha, 1, FUN=function(ll){ paste0(ll[1],ll[2] ) } ) ) ## [1] 0.935############################################################################## EXAMPLE 3: Simulation based on already estimated DINA model for data.ecpe#############################################################################dat <- CDM::data.ecpe$dataq.matrix <- CDM::data.ecpe$q.matrix#***# (1) estimate DINA modelmod <- CDM::din( data=dat[,-1], q.matrix=q.matrix, rule="DINA")#***# (2) simulate data according to DINA modelset.seed(977)# number of subjects to be simulatedn <- 3000# simulate attribute patternsprobs <- mod$attribute.patt$class.prob # probabilitiespatt <- mod$attribute.patt.splitted # response patternsalpha <- patt[ sample( 1:(length(probs) ), n, prob=probs, replace=TRUE), ]# simulate data using estimated item parametersres <- CDM::sim.din(N=n, q.matrix=q.matrix, guess=mod$guess$est, slip=mod$slip$est, rule="DINA", alpha=alpha)# extract datadat <- res$dat## End(Not run)Simulation of the GDINA model
Description
The functionsim.gdina.prepare creates necessary design matricesMj,Aj andnecc.attr. In most cases, only the listof item parametersdelta must be modified by the user whenapplying the simulation functionsim.gdina. The distribution of latentclasses\alpha is represented by an underlying multivariate normal distribution\alpha^\ast for which a mean vectorthresh.alpha and acovariance matrixcov.alpha must be specified.Alternatively, a matrix of skill classesalphacan be given as an input.
Note that this version ofsim.gdina only works for dichotomous attributes.
Usage
sim.gdina(n, q.matrix, delta, link="identity", thresh.alpha=NULL, cov.alpha=NULL, alpha=NULL, Mj, Aj, necc.attr)sim.gdina.prepare( q.matrix )Arguments
n | Number of persons |
q.matrix | Q-matrix (see |
delta | List with |
link | Link function. Choices are |
thresh.alpha | Vector of thresholds (means) of |
cov.alpha | Covariance matrix of |
alpha | Matrix of skill classes if they should not be simulated |
Mj | Design matrix, see |
Aj | Design matrix, see |
necc.attr | List with |
Value
The output ofsim.gdina is a list with following entries:
data | Simulated item responses |
alpha | Data frame with simulated attributes |
q.matrix | Used Q-matrix |
delta | Used delta item parameters |
Aj | Design matrices |
Mj | Design matrices |
link | Used link function |
The functionsim.gdina.prepare possesses the following values as outputin a list:delta,necc.attr,Aj andMj.
References
de la Torre, J. (2011). The generalized DINA model framework.Psychometrika, 76, 179–199.
See Also
For estimating the GDINA model seegdina.
See theGDINA::simGDINA function in theGDINA package for similar functionality.
Seesim_model for a general simulation function.
Examples
############################################################################## EXAMPLE 1: Simulating the GDINA model#############################################################################n <- 50 # number of persons# define Q-matrixq.matrix <- matrix( c(1,1,0, 0,1,1, 1,0,1, 1,0,0, 0,0,1, 0,1,0, 1,1,1, 0,1,1, 0,1,1), ncol=3, byrow=TRUE)# thresholds for attributes alpha^\astthresh.alpha <- c( .65, 0, -.30 )# covariance matrix for alpha^\astcov.alpha <- matrix(1,3,3)cov.alpha[1,2] <- cov.alpha[2,1] <- .4cov.alpha[1,3] <- cov.alpha[3,1] <- .6cov.alpha[3,2] <- cov.alpha[2,3] <- .8# prepare design matrix by applying sim.gdina.prepare functionrp <- CDM::sim.gdina.prepare( q.matrix )delta <- rp$deltanecc.attr <- rp$necc.attrAj <- rp$AjMj <- rp$Mj# define delta parameters# intercept - main effects - second order interactions - ...str(delta) #=> modify the delta parameter list which contains only zeroes as default## List of 9## $ : num [1:4] 0 0 0 0## $ : num [1:4] 0 0 0 0## $ : num [1:4] 0 0 0 0## $ : num [1:2] 0 0## $ : num [1:2] 0 0## $ : num [1:2] 0 0## $ : num [1:8] 0 0 0 0 0 0 0 0## $ : num [1:4] 0 0 0 0## $ : num [1:4] 0 0 0 0delta[[1]] <- c( .2, .1, .15, .4 )delta[[2]] <- c( .2, .3, .3, -.2 )delta[[3]] <- c( .2, .2, .2, 0 )delta[[4]] <- c( .15, .6 )delta[[5]] <- c( .1, .7 )delta[[6]] <- c( .25, .65 )delta[[7]] <- c( .25, .1, .1, .1, 0, 0, 0, .25 )delta[[8]] <- c( .2, 0, .3, -.1 )delta[[9]] <- c( .2, .2, 0, .3 )#******************************************# Now, the "real simulation" startssim.res <- CDM::sim.gdina( n=n, q.matrix=q.matrix, delta=delta, link="identity", thresh.alpha=thresh.alpha, cov.alpha=cov.alpha, Mj=Mj, Aj=Aj, necc.attr=necc.attr)# sim.res$data # simulated data# sim.res$alpha # simulated alpha## Not run: ############################################################################## EXAMPLE 2: Simulation based on already estimated GDINA model for data.ecpe#############################################################################data(data.ecpe)dat <- data.ecpe$dataq.matrix <- data.ecpe$q.matrix#***# (1) estimate GDINA modelmod <- CDM::gdina( data=dat[,-1], q.matrix=q.matrix )#***# (2) simulate data according to GDINA modelset.seed(977)# prepare design matrix by applying sim.gdina.prepare functionrp <- CDM::sim.gdina.prepare( q.matrix )necc.attr <- rp$necc.attr# number of subjects to be simulatedn <- 3000# simulate attribute patternsprobs <- mod$attribute.patt$class.prob # probabilitiespatt <- mod$attribute.patt.splitted # response patternsalpha <- patt[ sample( 1:(length(probs) ), n, prob=probs, replace=TRUE), ]# simulate data using estimated item parameterssim.res <- CDM::sim.gdina( n=n, q.matrix=q.matrix, delta=mod$delta, link="identity", alpha=alpha, Mj=mod$Mj, Aj=mod$Aj, necc.attr=rp$necc.attr)# extract datadat <- sim.res$data############################################################################## EXAMPLE 3: Simulation based on already estimated RRUM model for data.ecpe#############################################################################dat <- CDM::data.ecpe$dataq.matrix <- CDM::data.ecpe$q.matrix#***# (1) estimate reduced RUM modelmod <- CDM::gdina( data=dat[,-1], q.matrix=q.matrix, rule="RRUM" )summary(mod)#***# (2) simulate data according to RRUM modelset.seed(977)# prepare design matrix by applying sim.gdina.prepare functionrp <- CDM::sim.gdina.prepare( q.matrix )necc.attr <- rp$necc.attr# number of subjects to be simulatedn <- 5000# simulate attribute patternsprobs <- mod$attribute.patt$class.prob # probabilitiespatt <- mod$attribute.patt.splitted # response patternsalpha <- patt[ sample( 1:(length(probs) ), n, prob=probs, replace=TRUE), ]# simulate data using estimated item parameterssim.res <- CDM::sim.gdina( n=n, q.matrix=q.matrix, delta=mod$delta, link=mod$link, alpha=alpha, Mj=mod$Mj, Aj=mod$Aj, necc.attr=rp$necc.attr)# extract datadat <- sim.res$data## End(Not run)Simulate an Item Response Model
Description
Simulates an item response model given a fitted object or input of item responseprobabilities and skill class probabilities.
Usage
sim_model(object=NULL, irfprob=NULL, theta_index=NULL, prob.theta=NULL, data=NULL, N_sim=NULL )Arguments
object | Fitted object for which the methods |
irfprob | Array of item response function values (items |
theta_index | Skill class index for sampling |
prob.theta | Skill class probabilities |
data | Original dataset, only relevant for simulating item response patternwith missing values |
N_sim | Number of subjects to be simulated |
Value
List containing elements
dat | Simulated item responses |
theta | Simulated skill classes |
theta_index | Corresponding indices to |
Examples
## Not run: ############################################################################## EXAMPLE 1: GDINA model simulation#############################################################################data(sim.dina, package="CDM")data(sim.qmatrix, package="CDM")dat <- sim.dinaQ <- sim.qmatrix# fit DINA modelmod <- CDM::gdina( dat, q.matrix=Q, rule="DINA")summary(mod)#** simulate new item responses (N equals observed sample size)dat1 <- CDM::sim_model(mod)#*** simulate item responses for N=2000 subjectsdat2 <- CDM::sim_model(mod, N_sim=2000)str(dat2)#*** simulate item responses based on input item response probabilities#*** and theta_indexirfprob <- CDM::IRT.irfprob(mod)prob.theta <- attr(irfprob, "prob.theta")TP <- length(prob.theta)theta_index <- sample(1:TP, size=1000, prob=prob.theta, replace=TRUE )#-- simulatedat3 <- CDM::sim_model(irfprob=irfprob, theta_index=theta_index)str(dat3)## End(Not run)Tetrachoric or Polychoric Correlations between Attributes
Description
This function takes the results ofdin orgdina andcomputes tetrachoric or polychoric correlations between attributes (see e.g.Templin & Henson, 2006).
Usage
# tetrachoric correlationsskill.cor(object)# polychoric correlationsskill.polychor(object, colindex=1)Arguments
object | Object of class |
colindex | Index which can used for group-wise calculationof polychoric correlations |
Value
A list with following entries:
conttable.skills | Bivariate contingency table of all skill pairs |
cor.skills | Tetrachoric correlation matrix for skilldistribution |
References
Templin, J., & Henson, R. (2006). Measurement of psychological disordersusing cognitive diagnosis models.Psychological Methods, 11, 287-305.
Examples
data(sim.dino, package="CDM")data(sim.qmatrix, package="CDM")# estimate modeld4 <- CDM::din( sim.dino, q.matrix=sim.qmatrix)# compute tetrachoric correlationsCDM::skill.cor(d4) ## estimated tetrachoric correlations ## $cor.skills ## V1 V2 V3 ## V1 1.0000000 0.2567718 0.2552958 ## V2 0.2567718 1.0000000 0.9842188 ## V3 0.2552958 0.9842188 1.0000000Skill Space Approximation
Description
This function approximates the skill space withK skills toapproximate a (typically high-dimensional) skill space of2^K classes byL classes(L < 2^K). The large number of latent classes arerepresented by underlying continuous latent variables for thedichotomous skills (see George & Robitzsch, 2014, for more details).
Usage
skillspace.approximation(L, K, nmax=5000)Arguments
L | Number of skill classes used for approximation |
K | Number of skills |
nmax | Number of quasi-randomly generated skill classes using the |
Value
A matrix containing skill classes in rows
Note
This function uses thesfsmisc::QUnif function from thesfsmiscpackage.
References
George, A. C., & Robitzsch, A. (2014). Multiple group cognitive diagnosis models,with an emphasis on differential item functioning.Psychological Test and Assessment Modeling, 56(4), 405-432.
See Also
See alsogdina (Example 9).
Examples
############################################################################## EXAMPLE 1: Approximate a skill space of K=8 eight skills by 20 classes##############################################################################=> 2^8=256 latent classes if all latent classes would be usedCDM::skillspace.approximation( L=20, K=8 ) ## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] ## P00000000 0 0 0 0 0 0 0 0 ## P00000001 0 0 0 0 0 0 0 1 ## P00001011 0 0 0 0 1 0 1 1 ## P00010011 0 0 0 1 0 0 1 1 ## P00101001 0 0 1 0 1 0 0 1 ## [...] ## P11011110 1 1 0 1 1 1 1 0 ## P11100110 1 1 1 0 0 1 1 0 ## P11111111 1 1 1 1 1 1 1 1Creation of a Hierarchical Skill Space
Description
The functionskillspace.hierarchy defines a reduced skill spacefor hierarchies in skills (see e.g. Leighton, Gierl, & Hunka, 2004).The functionskillspace.full defines a full skill spacefor dichotomous skills.
Usage
skillspace.hierarchy(B, skill.names)skillspace.full(skill.names)Arguments
B | A matrix or a string containing restrictions of the hierarchy.If Alternatively, a string can be also conveniently used for defining ahierarchy (see Examples). |
skill.names | Vector of names in skills |
Details
The reduced skill space output can be used as an argument indinorgdina to directly test for a hierarchy in attributes.
Value
A list with following entries
R | Reachability matrix |
skillspace.reduced | Reduced skill space fulfilling the specifiedhierarchy |
skillspace.complete | Complete skill space |
zeroprob.skillclasses | Indices of skill patterns in |
References
Leighton, J. P., Gierl, M. J., & Hunka, S. M. (2004).The attribute hierarchy method for cognitive assessment:A variation on Tatsuoka's rule space approach.Journal of Educational Measurement, 41, 205-237.
See Also
Seedin (Example 6) for an application ofskillspace.hierarchy for model comparisons.
See theGDINA::att.structure function in theGDINA package for similar functionality.
Examples
############################################################################## EXAMPLE 1: Toy example with 3 skills#############################################################################K <- 3 # number of skillsskill.names <- paste0("A", 1:K ) # names of skills# create a zero matrix for hierarchy definitionB0 <- 0*diag(K)rownames(B0) <- colnames(B0) <- skill.names#*** Model 1: A1 > A2 > A3B <- B0B[1,2] <- 1 # A1 > A2B[2,3] <- 1 # A2 > A3sp1 <- CDM::skillspace.hierarchy( B=B, skill.names=skill.names )sp1$skillspace.reduced ## A1 A2 A3 ## 1 0 0 0 ## 2 1 0 0 ## 4 1 1 0 ## 8 1 1 1#*** Model 2: A1 > A2 and A1 > A3B <- B0B[1,2] <- 1 # A1 > A2B[1,3] <- 1 # A1 > A3sp2 <- CDM::skillspace.hierarchy( B=B, skill.names=skill.names )sp2$skillspace.reduced ## A1 A2 A3 ## 1 0 0 0 ## 2 1 0 0 ## 4 1 1 0 ## 6 1 0 1 ## 8 1 1 1#*** Model 3: A1 > A3, A2 is not included in a hierarchical wayB <- B0B[1,3] <- 1 # A1 > A3sp3 <- CDM::skillspace.hierarchy( B=B, skill.names=skill.names )sp3$skillspace.reduced ## A1 A2 A3 ## 1 0 0 0 ## 2 1 0 0 ## 3 0 1 0 ## 4 1 1 0 ## 6 1 0 1 ## 8 1 1 1#~~~ Hierarchy specification using strings#*** Model 1: A1 > A2 > A3B <- "A1 > A2 A2 > A3"sp1 <- CDM::skillspace.hierarchy( B=B, skill.names=skill.names )sp1$skillspace.reduced# Model 1 can be also written in one line for BB <- "A1 > A2 > A3"sp1b <- CDM::skillspace.hierarchy( B=B, skill.names=skill.names )sp1b$skillspace.reduced#*** Model 2: A1 > A2 and A1 > A3B <- "A1 > A2 A1 > A3"sp2 <- CDM::skillspace.hierarchy( B=B, skill.names=skill.names )sp2$skillspace.reduced#*** Model 3: A1 > A3B <- "A1 > A3"sp3 <- CDM::skillspace.hierarchy( B=B, skill.names=skill.names )sp3$skillspace.reduced## Not run: ############################################################################## EXAMPLE 2: Examples from Leighton et al. (2004): Fig. 1 (p. 210)#############################################################################skill.names <- paste0("A",1:6) # 6 skills#*** Model 1: Linear hierarchy (A)B <- "A1 > A2 > A3 > A4 > A5 > A6"sp1 <- CDM::skillspace.hierarchy( B=B, skill.names=skill.names )sp1$skillspace.reduced#*** Model 2: Convergent hierarchy (B)B <- "A1 > A2 > A3 A2 > A4 A3 > A5 > A6 A4 > A5 > A6"sp2 <- CDM::skillspace.hierarchy( B=B, skill.names=skill.names )sp2$skillspace.reduced#*** Model 3: Divergent hierarchy (C)B <- "A1 > A2 > A3 A1 > A4 > A5 A1 > A4 > A6"sp3 <- CDM::skillspace.hierarchy( B=B, skill.names=skill.names )sp3$skillspace.reduced#*** Model 4: Unstructured hierarchy (D)B <- "A1 > A2 \n A1 > A3 \n A1 > A4 \n A1 > A5 \n A1 > A6"# This specification of B is equivalent to writing separate lines:# B <- "A1 > A2# A1 > A3# A1 > A4# A1 > A5# A1 > A6"sp4 <- CDM::skillspace.hierarchy( B=B, skill.names=skill.names )sp4$skillspace.reduced## End(Not run)Structured Latent Class Analysis (SLCA)
Description
This function implements a structured latent class model forpolytomous item responses (Formann, 1985, 1992). Lasso estimation for theitem parameters is included (Chen, Liu, Xu & Ying, 2015;Chen, Li, Liu & Ying, 2017; Sun, Chen, Liu, Ying & Xin, 2016).
Usage
slca(data, group=NULL, weights=rep(1, nrow(data)), Xdes, Xlambda.init=NULL, Xlambda.fixed=NULL, Xlambda.constr.V=NULL, Xlambda.constr.c=NULL, delta.designmatrix=NULL, delta.init=NULL, delta.fixed=NULL, delta.linkfct="log", Xlambda_positive=NULL, regular_type="lasso", regular_lam=0, regular_w=NULL, regular_n=nrow(data), maxiter=1000, conv=1e-5, globconv=1e-5, msteps=10, convM=5e-04, decrease.increments=FALSE, oldfac=0, dampening_factor=1.01, seed=NULL, progress=TRUE, PEM=TRUE, PEM_itermax=maxiter, ...)## S3 method for class 'slca'summary(object, file=NULL, ...)## S3 method for class 'slca'print(x, ...)## S3 method for class 'slca'plot(x, group=1, ... )Arguments
data | Matrix of polytomous item responses |
group | Optional vector of group identifiers. For |
weights | Optional vector of sample weights |
Xdes | Design matrix for |
Xlambda.init | Initial |
Xlambda.fixed | Fixed |
Xlambda.constr.V | A design matrix for linear restrictions of theform |
Xlambda.constr.c | A vector for the linear restriction |
delta.designmatrix | Design matrix for delta parameters |
delta.init | Initial |
delta.fixed | Fixed |
delta.linkfct | Link function for skill space reduction.This can be the log-linear link ( |
Xlambda_positive | Optional vector of logical indicating whichelements of |
regular_type | Regularization method which can be |
regular_lam | Numeric. Regularization parameter |
regular_w | Vector for weighting the regularization penalty |
regular_n | Vector of regularization factor. This will betypically the sample size. |
maxiter | Maximum number of iterations |
conv | Convergence criterion for item parameters anddistribution parameters |
globconv | Global deviance convergence criterion |
msteps | Maximum number of M steps in estimating |
convM | Convergence criterion in M step |
decrease.increments | Should in the M step the incrementsof |
oldfac | Factor |
dampening_factor | Factor larger than one defining the specified decrease indecrements in iterations. |
seed | Simulation seed for initial parameters. The defaultof |
progress | An optional logical indicating whether the functionshould print the progress of iteration in the estimation process. |
PEM | Logical indicating whether the P-EM acceleration should beapplied (Berlinet & Roland, 2012). |
PEM_itermax | Number of iterations in which the P-EM method should beapplied. |
object | A required object of class |
file | Optional file name for a file in which |
x | A required object of class |
... | Optional parameters to be passed to or from othermethods will be ignored. |
Details
The structured latent class model allows for general constraints of itemsi in categoriesh and classesj. The item response model is
P( X_{i}=h | j )=\frac{ \exp( x_{ihj} ) }{ \sum_l \exp( x_{ilj} ) }
with linear constraints on the class specific probabilities
x_{ihj}=\sum_v q_{ihjv} \lambda_{xv}
Linear restrictions on the\lambda_x parameter can be specified bya matrix equationV_x \lambda_x=c_x (seeXlambda.constr.V andXlambda.constr.c; Neuhaus, 1996).
The latent class distribution can be smoothed by a log-linearlink function (Xu & von Davier, 2008) or a logistic link function(Formann, 1992). For classjin groupg employing a link functionh, it holds that
h [ P( j| g) ] \propto \sum_w r_{jw} \delta_{gw}
where group-specific distributions are allowed. The valuesr_{jw} are specified in the design matrixdelta.designmatrix.
This model contains classical uni- and multidimensional latent trait models,latent class analysis, located latent class analysis, cognitive diagnosticmodels, the general diagnostic model and mixture item response models asspecial cases (see Formann & Kohlmann, 1998; Formann, 2007).
The function also allows for regularization of\lambda_{xv} parametersusing the lasso approach (Sun et al., 2016).More formally, the penalty function can be written as
pen( \bold{\lambda}_x )=p_\lambda \sum_v n_v w_v | \lambda_{xv} |
wherep_\lambda can be specified withregular_lam,w_v can be specified withregular_w, andn_v can be specified withregular_n.
Value
An object of classslca. The list contains thefollowing entries:
item | Data frame with conditional item probabilities |
deviance | Deviance |
ic | Information criteria, number of estimated parameters |
Xlambda | Estimated |
se.Xlambda | Standard error of |
pi.k | Trait distribution |
pjk | Item response probabilities evaluated for all classes |
n.ik | An array of expected counts |
G | Number of groups |
I | Number of items |
N | Number of persons |
delta | Parameter estimates for skillspace representation |
covdelta | Covariance matrix of parameter estimates forskillspace representation |
MLE.class | Classified skills for each student (MLE) |
MAP.class | Classified skills for each student (MAP) |
data | Original data frame |
group.stat | Group statistics (sample sizes, group labels) |
p.xi.aj | Individual likelihood |
posterior | Individual posterior distribution |
K.item | Maximal category per item |
time | Info about computation time |
skillspace | Used skillspace parametrization |
iter | Number of iterations |
seed.used | Used simulation seed |
Xlambda.init | Used initial lambda parameters |
delta.init | Used initial delta parameters |
converged | Logical indicating whether convergence was achieved. |
Note
If some items have differing number of categories, appropriateclass probabilities in non-existing categories per items can bepractically set to zero by loading an item for all skill classeson a fixed\lambda_x parameter of a small number, e.g.-999.
The implementation of the model builds on pieces work of Anton Formann.Seehttp://www.antonformann.at/ for more information.
References
Berlinet, A. F., & Roland, C. (2012).Acceleration of the EM algorithm: P-EM versus epsilon algorithm.Computational Statistics & Data Analysis, 56(12), 4122-4137.
Chen, Y., Liu, J., Xu, G., & Ying, Z. (2015).Statistical analysis of Q-matrix based diagnostic classification models.Journal of the American Statistical Association, 110, 850-866.
Chen, Y., Li, X., Liu, J., & Ying, Z. (2017). Regularized latent class analysiswith application in cognitive diagnosis.Psychometrika,82, 660-692.
Formann, A. K. (1985). Constrained latent class models: Theory and applications.British Journal of Mathematical and Statistical Psychology,38, 87-111.
Formann, A. K. (1992). Linear logistic latent class analysis for polytomous data.Journal of the American Statistical Association, 87, 476-486.
Formann, A. K. (2007). (Almost) Equivalence between conditional and mixture maximumlikelihood estimates for some models of the Rasch type. In M. von Davier & C. H. Carstensen(Eds.),Multivariate and mixture distribution Rasch models (pp. 177-189).New York: Springer.
Formann, A. K., & Kohlmann, T. (1998). Structural latent class models.Sociological Methods & Research, 26, 530-565.
Neuhaus, W. (1996). Optimal estimation underlinear constraints.Astin Bulletin, 26, 233-245.
Sun, J., Chen, Y., Liu, J., Ying, Z., & Xin, T. (2016).Latent variable selection for multidimensional item response theory modelsviaL_1 regularization.Psychometrika, 81(4), 921-939.
Xu, X., & von Davier, M. (2008).Fitting the structured general diagnosticmodel to NAEP data. ETS Research Report ETS RR-08-27. Princeton, ETS.
See Also
For latent trait models with continuous latent variables see themirt orTAM packages. For a discrete trait distribution seetheMultiLCIRT package.
For latent class models see thepoLCA,covLCA orrandomLCApackage.
For mixture Rasch or mixture IRT models see thepsychomix ormRm package.
Examples
############################################################################## EXAMPLE 1: data.Students | (Generalized) Partial Credit Model#############################################################################data(data.Students, package="CDM")dat <- data.Students[, c("mj1","mj2","mj3","mj4","sc1", "sc2") ]# define discretized abilitytheta.k <- seq( -6, 6, len=21 )#*** Model 1: Partial credit model# define design matrix for lambdaI <- ncol(dat)maxK <- 4TP <- length(theta.k)NXlam <- I*(maxK-1) + 1 # number of estimated parameters # last parameter is joint slope parameterXdes <- array( 0, dim=c(I, maxK, TP, NXlam ) )# Item1Cat1, ..., Item1Cat3, Item2Cat1, ...,dimnames(Xdes)[[1]] <- colnames(dat)dimnames(Xdes)[[2]] <- paste0("Cat", 1:(maxK) )dimnames(Xdes)[[3]] <- paste0("Class", 1:TP )v2 <- unlist( sapply( 1:I, FUN=function(ii){ # ii paste0( paste0( colnames(dat)[ii], "_b" ), "Cat", 1:(maxK-1) ) }, simplify=FALSE) )dimnames(Xdes)[[4]] <- c( v2, "a" )# define theta design and item discriminationsfor (ii in 1:I){ for (hh in 1:(maxK-1) ){ Xdes[ii, hh + 1,, NXlam ] <- hh * theta.k }}# item interceptsfor (ii in 1:I){ for (hh in 1:(maxK-1) ){ # ii <- 1 # Item # hh <- 1 # category Xdes[ii,hh+1,, ( ii - 1)*(maxK-1) + hh] <- 1 }}#****# skill space designmatrixTP <- length(theta.k)w1 <- stats::dnorm(theta.k)w1 <- w1 / sum(w1)delta.designmatrix <- matrix( 1, nrow=TP, ncol=1 )delta.designmatrix[,1] <- log(w1)# initial lambda parametersXlambda.init <- c( stats::rnorm( dim(Xdes)[[4]] - 1 ), 1 )# fixed delta parameterdelta.fixed <- cbind( 1, 1,1 )# estimate modelmod1 <- CDM::slca( dat, Xdes=Xdes, delta.designmatrix=delta.designmatrix, Xlambda.init=Xlambda.init, delta.fixed=delta.fixed )summary(mod1)plot(mod1, cex.names=.7 )## Not run: #*** Model 2: Partial credit model with some parameter constraints# fixed lambda parametersXlambda.fixed <- cbind( c(1,19), c(3.2,1.52 ) )# 1st parameter=3.2# 19th parameter=1.52 (joint item slope)mod2 <- CDM::slca( dat, Xdes=Xdes, delta.designmatrix=delta.designmatrix, delta.init=delta.init, Xlambda.init=Xlambda.init, delta.fixed=delta.fixed, Xlambda.fixed=Xlambda.fixed, maxiter=70 )#*** Model 3: Partial credit model with non-normal distributionXlambda.fixed <- cbind( c(1,19), c(3.2,1) ) # fix item slope to onedelta.designmatrix <- cbind( 1, theta.k, theta.k^2, theta.k^3 )mod3 <- CDM::slca( dat, Xdes=Xdes, delta.designmatrix=delta.designmatrix, Xlambda.fixed=Xlambda.fixed, maxiter=200 )summary(mod3)# non-normal distribution with convergence regularizing factor oldfacmod3a <- CDM::slca( dat, Xdes=Xdes, delta.designmatrix=delta.designmatrix, Xlambda.fixed=Xlambda.fixed, maxiter=500, oldfac=.95 )summary(mod3a)#*** Model 4: Generalized Partial Credit Model# estimate generalized partial credit model without restrictions on trait# distribution and item parameters to ensure better convergence behavior# Note that two parameters are not identifiable and information criteria# have to be adapted.#---# define design matrix for lambdaI <- ncol(dat)maxK <- 4TP <- length(theta.k)NXlam <- I*(maxK-1) + I # number of estimated parametersXdes <- array( 0, dim=c(I, maxK, TP, NXlam ) )# Item1Cat1, ..., Item1Cat3, Item2Cat1, ...,dimnames(Xdes)[[1]] <- colnames(dat)dimnames(Xdes)[[2]] <- paste0("Cat", 1:(maxK) )dimnames(Xdes)[[3]] <- paste0("Class", 1:TP )v2 <- unlist( sapply( 1:I, FUN=function(ii){ # ii paste0( paste0( colnames(dat)[ii], "_b" ), "Cat", 1:(maxK-1) ) }, simplify=FALSE) )dimnames(Xdes)[[4]] <- c( v2, paste0( colnames(dat),"_a") )dimnames(Xdes)# define theta design and item discriminationsfor (ii in 1:I){ for (hh in 1:(maxK-1) ){ Xdes[ii, hh + 1,, I*(maxK-1) + ii ] <- hh * theta.k }}# item interceptsfor (ii in 1:I){ for (hh in 1:(maxK-1) ){ Xdes[ii,hh+1,, ( ii - 1)*(maxK-1) + hh] <- 1 }}#****# skill space designmatrixdelta.designmatrix <- cbind( 1, theta.k,theta.k^2 )# initial lambda parameters from partial credit modelXlambda.init <- mod1$XlambdaXlambda.init <- c( mod1$Xlambda[ - length(Xlambda.init) ], rep( Xlambda.init[ length(Xlambda.init) ],I) )# estimate modelmod4 <- CDM::slca( dat, Xdes=Xdes, Xlambda.init=Xlambda.init, delta.designmatrix=delta.designmatrix, decrease.increments=TRUE, maxiter=300 )############################################################################## EXAMPLE 2: Latent class model with two classes#############################################################################set.seed(9876)I <- 7 # number of items# simulate response probabilitiesa1 <- stats::runif(I, 0, .4 )a2 <- stats::runif(I, .6, 1 )N <- 1000 # sample size# simulate data in two classes of proportions .3 and .7N1 <- round(.3*N)dat1 <- 1 * ( matrix(a1,N1,I,byrow=TRUE) > matrix( stats::runif( N1 * I), N1, I ) )N2 <- round(.7*N)dat2 <- 1 * ( matrix(a2,N2,I,byrow=TRUE) > matrix( stats::runif( N2 * I), N2, I ) )dat <- rbind( dat1, dat2 )colnames(dat) <- paste0("I", 1:I)# define design matricesTP <- 2 # two classes# The idea is that latent classes refer to two different "dimensions".# Items load on latent class indicators 1 and 2, see below.Xdes <- array(0, dim=c(I,2,2,2*I) )items <- colnames(dat)dimnames(Xdes)[[4]] <- c(paste0( colnames(dat), "Class", 1), paste0( colnames(dat), "Class", 2) ) # items, categories, classes, parameters# probabilities for correct solutionfor (ii in 1:I){ Xdes[ ii, 2, 1, ii ] <- 1 # probabilities class 1 Xdes[ ii, 2, 2, ii+I ] <- 1 # probabilities class 2}# estimate modelmod1 <- CDM::slca( dat, Xdes=Xdes )summary(mod1)############################################################################## EXAMPLE 3: Mixed Rasch model with two classes#############################################################################set.seed(987)library(sirt)# simulate two latent classes of Rasch populationsI <- 15 # 6 itemsb1 <- seq( -1.5, 1.5, len=I) # difficulties latent class 1b2 <- b1 # difficulties latent class 2b2[ c(4,7, 9, 11, 12, 13) ] <- c(1, -.5, -.5, .33, .33, -.66 )N <- 3000 # number of personswgt <- .25 # class probability for class 1# class 1dat1 <- sirt::sim.raschtype( stats::rnorm( wgt*N ), b1 )# class 2dat2 <- sirt::sim.raschtype( stats::rnorm( (1-wgt)*N, mean=1, sd=1.7), b2 )dat <- rbind( dat1, dat2 )# theta gridtheta.k <- seq( -5, 5, len=9 )TP <- length(theta.k)#*** Model 1: Rasch model with normal distributionmaxK <- 2NXlam <- I +1Xdes <- array( 0, dim=c(I, maxK, TP, NXlam ) )dimnames(Xdes)[[1]] <- colnames(dat)dimnames(Xdes)[[2]] <- paste0("Cat", 1:(maxK) )dimnames(Xdes)[[4]] <- c( paste0( "b_", colnames(dat)[1:I] ), "a" )# define item difficultiesfor (ii in 1:I){ Xdes[ii, 2,, ii ] <- -1}# theta designfor (tt in 1:TP){ Xdes[1:I, 2, tt, I + 1] <- theta.k[tt]}# skill space definitiondelta.designmatrix <- cbind( 1, theta.k^2 )delta.fixed <- NULLXlambda.init <- c( stats::runif( I, -.8, .8 ), 1 )Xlambda.fixed <- cbind( I+1, 1 )# estimate modelmod1 <- CDM::slca( dat, Xdes=Xdes, delta.designmatrix=delta.designmatrix, delta.fixed=delta.fixed, Xlambda.fixed=Xlambda.fixed, Xlambda.init=Xlambda.init, decrease.increments=TRUE, maxiter=200 )summary(mod1)#*** Model 1b: Constraint the sum of item difficulties to zero# change skill space definitiondelta.designmatrix <- cbind( 1, theta.k, theta.k^2 )delta.fixed <- NULL# constrain sum of difficulties Xlambda parameters to zeroXlambda.constr.V <- matrix( 1, nrow=I+1, ncol=1 )Xlambda.constr.V[I+1,1] <- 0Xlambda.constr.c <- c(0)# estimate modelmod1b <- CDM::slca( dat, Xdes=Xdes, delta.designmatrix=delta.designmatrix, Xlambda.fixed=Xlambda.fixed, Xlambda.constr.V=Xlambda.constr.V, Xlambda.constr.c=Xlambda.constr.c )summary(mod1b)#*** Model 2: Mixed Rasch model with two latent classesNXlam <- 2*I +2Xdes <- array( 0, dim=c(I, maxK, 2*TP, NXlam ) )dimnames(Xdes)[[1]] <- colnames(dat)dimnames(Xdes)[[2]] <- paste0("Cat", 1:(maxK) )dimnames(Xdes)[[4]] <- c( paste0( "bClass1_", colnames(dat)[1:I] ), paste0( "bClass2_", colnames(dat)[1:I] ), "aClass1", "aClass2" )# define item difficultiesfor (ii in 1:I){ Xdes[ii, 2, 1:TP, ii ] <- -1 # first class Xdes[ii, 2, TP + 1:TP, I+ii ] <- -1 # second class}# theta designfor (tt in 1:TP){ Xdes[1:I, 2, tt, 2*I+1 ] <- theta.k[tt] Xdes[1:I, 2, TP+tt, 2*I+2 ] <- theta.k[tt]}# skill space definitiondelta.designmatrix <- matrix( 0, nrow=2*TP, ncol=4 )delta.designmatrix[1:TP,1] <- 1delta.designmatrix[1:TP,2] <- theta.k^2delta.designmatrix[TP + 1:TP,3] <- 1delta.designmatrix[TP+ 1:TP,4] <- theta.k^2b1 <- stats::qnorm( colMeans(dat) )Xlambda.init <- c( stats::runif( 2*I, -1.8, 1.8 ), 1,1 )Xlambda.fixed <- cbind( c(2*I+1, 2*I+2), 1 )# estimate modelmod2 <- CDM::slca( dat, Xdes=Xdes, delta.designmatrix=delta.designmatrix, Xlambda.fixed=Xlambda.fixed, decrease.increments=TRUE, Xlambda.init=Xlambda.init, maxiter=1000 )summary(mod2)summary(mod1)# latent class proportionsstats::aggregate( mod2$pi.k, list( rep(1:2, each=TP)), sum )#*** Model 2b: Different parametrization with sum constraint on item difficulties# skill space definitiondelta.designmatrix <- matrix( 0, nrow=2*TP, ncol=6 )delta.designmatrix[1:TP,1] <- 1delta.designmatrix[1:TP,2] <- theta.kdelta.designmatrix[1:TP,3] <- theta.k^2delta.designmatrix[TP+ 1:TP,4] <- 1delta.designmatrix[TP+ 1:TP,5] <- theta.kdelta.designmatrix[TP+ 1:TP,6] <- theta.k^2Xlambda.fixed <- cbind( c(2*I+1,2*I+2), c(1,1) )b1 <- stats::qnorm( colMeans( dat ) )Xlambda.init <- c( b1, b1 + stats::runif(I, -1, 1 ), 1, 1 )# constraints on item difficultiesXlambda.constr.V <- matrix( 0, nrow=NXlam, ncol=2)Xlambda.constr.V[1:I, 1 ] <- 1Xlambda.constr.V[I + 1:I, 2 ] <- 1Xlambda.constr.c <- c(0,0)# estimate modelmod2b <- CDM::slca( dat, Xdes=Xdes, delta.designmatrix=delta.designmatrix, Xlambda.fixed=Xlambda.fixed, Xlambda.init=Xlambda.init, Xlambda.constr.V=Xlambda.constr.V, Xlambda.constr.c=Xlambda.constr.c, decrease.increments=TRUE, maxiter=1000 )summary(mod2b)stats::aggregate( mod2b$pi.k, list( rep(1:2, each=TP)), sum )#*** Model 2c: Estimation with mRm packagelibrary(mRm)mod2c <- mRm::mrm(data.matrix=dat, cl=2)plot(mod2c)print(mod2c)#*** Model 2d: Estimation with psychomix packagelibrary(psychomix)mod2d <- psychomix::raschmix(data=dat, k=2, verbose=TRUE )summary(mod2d)plot(mod2d)############################################################################## EXAMPLE 4: Located latent class model, Rasch model#############################################################################set.seed(487)library(sirt)I <- 15 # I itemsb1 <- seq( -2, 2, len=I) # item difficultiesN <- 4000 # number of persons# simulate 4 theta classestheta0 <- c( -2.5, -1, 0.3, 1.3 ) # skill classesprobs0 <- c( .1, .4, .2, .3 )TP <- length(theta0)theta <- theta0[ rep(1:TP, round(probs0*N) ) ]dat <- sirt::sim.raschtype( theta, b1 )#*** Model 1: Located latent class model with 4 classesmaxK <- 2NXlam <- I + TPXdes <- array( 0, dim=c(I, maxK, TP, NXlam ) )dimnames(Xdes)[[1]] <- colnames(dat)dimnames(Xdes)[[2]] <- paste0("Cat", 1:(maxK) )dimnames(Xdes)[[3]] <- paste0("Class", 1:TP )dimnames(Xdes)[[4]] <- c( paste0( "b_", colnames(dat)[1:I] ), paste0("theta", 1:TP) )# define item difficultiesfor (ii in 1:I){ Xdes[ii, 2,, ii ] <- -1}# theta designfor (tt in 1:TP){ Xdes[1:I, 2, tt, I + tt] <- 1}# skill space definitiondelta.designmatrix <- diag(TP)Xlambda.init <- c( - stats::qnorm( colMeans(dat) ), seq(-2,1,len=TP) )# constraint on item difficultiesXlambda.constr.V <- matrix( 0, nrow=NXlam, ncol=1)Xlambda.constr.V[1:I,1] <- 1Xlambda.constr.c <- c(0)delta.init <- matrix( c(1,1,1,1), TP, 1 )# estimate modelmod1 <- CDM::slca( dat, Xdes=Xdes, delta.designmatrix=delta.designmatrix, delta.init=delta.init, Xlambda.init=Xlambda.init, Xlambda.constr.V=Xlambda.constr.V, Xlambda.constr.c=Xlambda.constr.c, decrease.increments=TRUE, maxiter=400 )summary(mod1)# compare estimated and simulated theta class locationscbind( mod1$Xlambda[ - c(1:I) ], theta0 )# compare estimated and simulated latent class proportionscbind( mod1$pi.k, probs0 )############################################################################## EXAMPLE 5: DINA model with two skills#############################################################################set.seed(487)N <- 3000 # number of persons# define Q-matrixI <- 9 # 9 itemsNS <- 2 # 2 skillsTP <- 4 # number of skill classesQ <- scan( nlines=3) 1 0 1 0 1 0 0 1 0 1 0 1 1 1 1 1 1 1Q <- matrix(Q, I, ncol=NS,byrow=TRUE)# define skill distributionalpha0 <- matrix( c(0,0,1,0,0,1,1,1), nrow=4,ncol=2,byrow=TRUE)prob0 <- c( .2, .4, .1, .3 )alpha <- alpha0[ rep( 1:TP, prob0*N),]# define guessing and slipping parametersguess <- round( stats::runif(I, 0, .4 ), 2 )slip <- round( stats::runif(I, 0, .3 ), 2 )# simulate data according to the DINA modeldat <- CDM::sim.din( q.matrix=Q, alpha=alpha, slip=slip, guess=guess )$dat# define Xlambda design matrixmaxK <- 2NXlam <- 2*IXdes <- array( 0, dim=c(I, maxK, TP, NXlam ) )dimnames(Xdes)[[1]] <- colnames(dat)dimnames(Xdes)[[2]] <- paste0("Cat", 1:(maxK) )dimnames(Xdes)[[3]] <- c("S00","S10","S01","S11")dimnames(Xdes)[[4]] <- c( paste0("guess",1:I ), paste0( "antislip", 1:I ) )dimnames(Xdes)# define item difficultiesfor (ii in 1:I){ # define latent responses latresp <- 1*( alpha0 %*% Q[ii,]==sum(Q[ii,]) )[,1] # model slipping parameters Xdes[ii, 2, latresp==1, I+ii ] <- 1 # guessing parameters Xdes[ii, 2, latresp==0, ii ] <- 1}Xdes[1,2,,]Xdes[7,2,,]# skill space definitiondelta.designmatrix <- diag(TP)Xlambda.init <- c( rep( stats::qlogis( .2 ), I ), rep( stats::qlogis( .8 ), I ) )# estimate DINA model with slca functionmod1 <- CDM::slca( dat, Xdes=Xdes, delta.designmatrix=delta.designmatrix, Xlambda.init=Xlambda.init, decrease.increments=TRUE, maxiter=400 )summary(mod1)# compare estimated and simulated latent class proportionscbind( mod1$pi.k, probs0 )# compare estimated and simulated guessing parameterscbind( mod1$pjk[1,,2], guess )# compare estimated and simulated slipping parameterscbind( 1 - mod1$pjk[4,,2], slip )############################################################################## EXAMPLE 6: Investigating differential item functioning in Rasch models# with regularization##############################################################################---- simulate dataset.seed(987)N <- 1000 # number of persons in a groupI <- 20 # number of items#* population parameters of two groupsmu1 <- 0mu2 <- .6sd1 <- 1.4sd2 <- 1# item difficultiesb <- seq( -1.1, 1.1, len=I )# define some DIF effectsdif <- rep(0,I)dif[ c(3,6,9,12)] <- c( .6, -1, .75, -.35 )print(dif)#* simulate datasetsdat1 <- sirt::sim.raschtype( rnorm(N, mean=mu1, sd=sd1), b=b - dif /2 )colnames(dat1) <- paste0("I", 1:I, "_G1")dat2 <- sirt::sim.raschtype( rnorm(N, mean=mu2, sd=sd2), b=b + dif /2 )colnames(dat2) <- paste0("I", 1:I, "_G2")dat <- CDM::CDM_rbind_fill( dat1, dat2 )dat <- data.frame( "group"=rep(1:2, each=N), dat )#-- nodes for distributiontheta.k <- seq(-4, 4, len=11)# define design matrix for lambdanitems <- ncol(dat) - 1maxK <- 2TP <- length(theta.k)NXlam <- 2*I + 1Xdes <- array( 0, dim=c( nitems, maxK, TP, NXlam ) )dimnames(Xdes)[[1]] <- colnames(dat)[-1]dimnames(Xdes)[[2]] <- paste0("Cat", 0:(maxK-1) )dimnames(Xdes)[[3]] <- paste0("Theta", 1:TP )dimnames(Xdes)[[4]] <- c( paste0("b", 1:I ), paste0("dif", 1:I ), "const" )# define theta designfor (ii in 1:nitems){ Xdes[ii,2,,NXlam ] <- theta.k}# item intercepts and DIF effectsfor (ii in 1:I){ Xdes[c(ii,ii+I),2,, ii ] <- -1 Xdes[ii,2,,ii+I] <- - 1/2 Xdes[ii+I,2,,ii+I] <- 1/2}#--- skill space designmatrixTP <- length(theta.k)w1 <- stats::dnorm(theta.k)w1 <- w1 / sum(w1)delta.designmatrix <- matrix( 1, nrow=TP, ncol=2 )delta.designmatrix[,2] <- log(w1)# fixed lambda parametersXlambda.fixed <- cbind(NXlam, 1 )# initial Xlambda parametersdif_sim <- 0*stats::rnorm(I, sd=.2)Xlambda.init <- c( - stats::qnorm( colMeans(dat1) ), dif_sim, 1 )# delta.fixeddelta.fixed <- cbind( 1, 1, 0 )# regularization parameterregular_lam <- .2# weighting vector: regularize only DIF effectsregular_w <- c( rep(0,I), rep(1,I), 0 )#--- estimation model with scad penaltymod1 <- CDM::slca( dat[,-1], group=dat$group, Xdes=Xdes, delta.designmatrix=delta.designmatrix, regular_type="scad", Xlambda.init=Xlambda.init, delta.fixed=delta.fixed, Xlambda.fixed=Xlambda.fixed, regular_lam=regular_lam, regular_w=regular_w )# compare true and estimated DIF effectscbind( "true"=dif, "estimated"=round(coef(mod1)[seq(I+1,2*I)],2) )summary(mod1)## End(Not run)Summary Method for Objects of Class din
Description
S3 method to summarize objects of the classdin.
Usage
## S3 method for class 'din'summary(object, top.n.skill.classes=6, overwrite=FALSE, ...)Arguments
object | A required object of class |
top.n.skill.classes | A numeric, specifying the number of skillclasses, starting with the most frequent, to be returned.Default value is 6. |
overwrite | An optional boolean, specifying wether or notthe method is supposed to overwrite an existing |
... | Optional parameters to be passed to or from othermethods will be ignored. |
Details
The functionsummary.din returns an object of the classsummary.din (see ‘Value’), for which aprint method,print.summary.din, isprovided. Specific summary information details such asindividual item parameters and their discrimination indicescan be accessed through assignment (see ‘Examples’).
Value
If the argumentobject is of required type,summary.din returns a named list, of the classsummary.din, consisting of the following seven components:
CALL | A character specifying the model rule, the number ofitems and the number of attributes underlying the items. |
IDI | A matrix giving the item discriminationindex (IDI; Lee, de la Torre & Park, 2012) for each item
where a high IDI corresponds to favorable test itemswhich have both low guessing and slipping rates. |
SKILL.CLASSES | A vector giving the |
AIC | A numeric giving the AIC of the specified model |
BIC | A numeric giving the BIC of the specified model |
log.file | A character giving the path and file of a specifiedlog file. |
din.object | The object of class |
References
Lee, Y.-S., de la Torre, J., & Park, Y. S. (2012). Relationships betweencognitive diagnosis, CTT, and IRT indices: An empirical investigation.Asia Pacific Educational Research, 13, 333-345.
Rupp, A. A., Templin, J. L., & Henson, R. A. (2010)DiagnosticMeasurement: Theory, Methods, and Applications. New York: The GuilfordPress.
See Also
plot.din, the S3 method for plotting objects ofthe classdin;print.din, the S3 methodfor printing objects of the classdin;summary.din, the S3 method for summarizing objectsof the classdin, which creates objects of the classsummary.din;din, the main function forDINA and DINO parameter estimation, which creates objects of the classdin. See alsoCDM-package for generalinformation about this package.
Examples
#### (1) examples based on dataset fractions.subtraction.data#### Parameter estimation of DINA model# rule="DINA" is defaultfractions.dina <- CDM::din(data=CDM::fraction.subtraction.data, q.matrix=CDM::fraction.subtraction.qmatrix, rule="DINA")## corresponding summaries, including diagnostic accuracies,## most frequent skill classes and information## criteria AIC and BICsummary(fractions.dina)## In particular, accessing detailed summary through assignmentdetailed.summary.fs <- summary(fractions.dina)str(detailed.summary.fs)Printssummary andsink Output in a File
Description
Printssummary andsink output in a File
Usage
summary_sink( object, file, append=FALSE, ...)Arguments
object | Object for which a |
file | File name |
append | Optional logical indicating whether console output shouldbe appended to an already existing file. See argument |
... | Further arguments passed to |
See Also
Examples
## Not run: ############################################################################## EXAMPLE 1: summary_sink example for lm function##############################################################################--- simulate some dataset.seed(997)N <- 200x <- stats::rnorm( N )y <- .4 * x + stats::rnorm(N, sd=.5 )#--- fit a linear model and sink summary into a filemod1 <- stats::lm( y ~ x )CDM::summary_sink(mod1, file="my_model")#--- fit a second model and append it to filemod2 <- stats::lm( y ~ x + I(x^2) )CDM::summary_sink(mod2, file="my_model", append=TRUE )## End(Not run)Asymptotic Covariance Matrix, Standard Errors and Confidence Intervals
Description
Computes the asymptotic covariance matrix fordin objects. The covariance matrix is computed using theempirical cross-product approach (see Paek & Cai, 2014).
In addition, an S3 methodIRT.se is defined which producesan extended output includingvcov andconfint.
Usage
## S3 method for class 'din'vcov(object, extended=FALSE, infomat=FALSE,ind.item.skillprobs=TRUE, ind.item=FALSE, diagcov=FALSE, h=.001,...)## S3 method for class 'din'confint(object, parm, level=.95, extended=FALSE, ind.item.skillprobs=TRUE, ind.item=FALSE, diagcov=FALSE, h=.001, ... )IRT.se(object, ...)## S3 method for class 'din'IRT.se( object, extended=FALSE, parm=NULL, level=.95, infomat=FALSE, ind.item.skillprobs=TRUE, ind.item=FALSE, diagcov=FALSE, h=.001, ... )Arguments
object | An object inheriting from class |
extended | An optional logical indicating whether the covariancematrix should be calculated for an extended set of parameters(estimated and derived parameters). |
infomat | An optional logical indicating whether theinformation matrix instead of the covariance matrix should bethe output. |
ind.item.skillprobs | Optional logical indicating whether the covariancebetween item parameters and skill class probabilities are assumedto be zero. |
ind.item | Optional logical indicating whether covariances ofitem parameters between different items are zero. |
diagcov | Optional logical indicating whether all covariancesbetween estimated parameters are set to zero. |
h | Parameter used for numerical differentiation for computingthe derivative of the log-likelihood function. |
parm | Vector of parameters. If it is missing, then for all estimatedparameters a confidence interval is calculated. |
level | Confidence level |
... | Additional arguments to be passed. |
Value
coef: A vector of parameters.
vcov: A covariance matrix. The corresponding coefficients can be extractedas the attributecoef from this object.
IRT.se: A data frame containing coefficients, standard errorsand confidence intervals for all parameters.
References
Paek, I., & Cai, L. (2014). A comparison of item parameter standard errorestimation procedures for unidimensional and multidimensional item responsetheory modeling.Educational and Psychological Measurement, 74(1),58-76.
See Also
Examples
## Not run: ############################################################################## EXAMPLE 1: DINA model sim.dina#############################################################################data(sim.dina, package="CDM")data(sim.qmatrix, package="CDM")dat <- sim.dinaq.matrix <- sim.qmatrix#****** Model 1: DINA Modelmod1 <- CDM::din( dat, q.matrix=q.matrix, rule="DINA")# look into parameter table of the modelmod1$partable# covariance matrixcovmat1 <- vcov(mod1 )# extract coefficientscoef(mod1)# extract standard errorssqrt( diag( covmat1))# compute confidence intervalsconfint( mod1, level=.90 )# output table with standard errorsIRT.se( mod1, extended=TRUE )#****** Model 2: Constrained DINA Model# fix some slipping parametersconstraint.slip <- cbind( c(2,3,5), c(.15,.20,.25) )# set some skill class probabilities to zerozeroprob.skillclasses <- c(2,4)# estimate modelmod2 <- CDM::din( dat, q.matrix=q.matrix, guess.equal=TRUE, constraint.slip=constraint.slip, zeroprob.skillclasses=zeroprob.skillclasses)# parameter tablemod2$partable# freely estimated coefficientscoef(mod2)# covariance matrix (estimated parameters)vmod2a <- vcov(mod2)sqrt( diag( vmod2a)) # standard errorscolnames( vmod2a )names( attr( vmod2a, "coef") ) # extract coefficients# covariance matrix (more parameters, extended=TRUE)vmod2b <- vcov(mod2, extended=TRUE)sqrt( diag( vmod2b))attr( vmod2b, "coef")# attach standard errors to parameter tablepartable2 <- mod2$partablepartable2 <- partable2[ ! duplicated( partable2$parnames ), ]partable2 <- data.frame( partable2, "se"=sqrt( diag( vmod2b)) )partable2# confidence interval for parameter "skill1" which is not in the model# cannot be calculated!confint(mod2, parm=c( "skill1", "all_guess" ) )# confidence interval for only some parametersconfint(mod2, parm=paste0("prob_skill", 1:3 ) )# compute only information matrixinfomod2 <- vcov(mod2, infomat=TRUE)## End(Not run)