Type: Package
Title: Easily Carry Out Latent Profile Analysis (LPA) Using Open-Source or Commercial Software
Version: 1.1.0
Maintainer: Joshua M Rosenberg <jmrosenberg@utk.edu>
Description: An interface to the 'mclust' package to easily carry out latent profile analysis ("LPA"). Provides functionality to estimate commonly-specified models. Follows a tidy approach, in that output is in the form of a data frame that can subsequently be computed on. Also has functions to interface to the commercial 'Mplus' software via the 'MplusAutomation' package.
License: MIT + file LICENSE
URL: https://data-edu.github.io/tidyLPA/
BugReports: https://github.com/data-edu/tidyLPA/issues
Depends: R (≥ 2.10)
Imports: dplyr, ggplot2, gtable, grid, mclust, methods, mix, MplusAutomation, tibble
Suggests: knitr, lme4, missForest, parallel, pillar, rmarkdown, testthat
VignetteBuilder: knitr
Encoding: UTF-8
LazyData: true
RoxygenNote: 7.1.2
NeedsCompilation:
Packaged: 2021-11-17 00:10:32 UTC; joshuarosenberg
Author: Joshua M Rosenberg [aut, cre], Caspar van Lissa [aut], Jennifer A Schmidt [ctb], Patrick N Beymer [ctb], Daniel Anderson [ctb], Matthew J. Schell [ctb]
Repository: CRAN
Date/Publication: 2021-11-17 11:40:02 UTC

Pipe

Description

tidyLPA suggests using the pipe operator, %>%, from the magrittr package (imported here from the dplyr package).

Arguments

lhs,rhs

An object and a function to apply to it

Examples

# Instead of
subset(iris, select = c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width"))
# you can write
iris %>%
  subset(select = c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width"))

Select best model using analytic hierarchy process

Description

Integrates information from several fit indices, and selects the best model.

Usage

AHP(
  fitindices,
  relative_importance = c(AIC = 0.2323, AWE = 0.1129, BIC = 0.2525, CLC = 0.0922,
    KIC = 0.3101)
)

Arguments

fitindices

A matrix or data.frame of fit indices, with colnames corresponding to the indices named in relative_importance.

relative_importance

A named numeric vector. Names should correspond to columns in fitindices, and values represent the relative weight assigned to the corresponding fit index. The default value corresponds to the fit indices and weights assigned by Akogul and Erisoglu. To assign uniform weights (i.e., each index is weighted equally), assign an equal value to all.

Details

Many fit indices are available for model selection. Following the procedure developed by Akogul and Erisoglu (2017), this function integrates information from several fit indices, and selects the best model, using Saaty's (1990) Analytic Hierarchy Process (AHP). Conceptually, the process consists of the following steps:

1. For each fit index, calculate the amount of support provided for each model, relative to the other models.
2. From these comparisons, obtain a "priority vector" of the amount of support for each model.
3. Compute a weighted average of the priority vectors for all fit indices, with weights based on a simulation study examining each fit index's ability to recover the correct number of clusters (Akogul & Erisoglu, 2017).
4. Select the model with the highest weighted average priority.
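
The final two steps (weighted averaging and selection) can be illustrated with a minimal base R sketch. The priority vectors below are invented numbers used purely for illustration, not output of AHP(), and the derivation of priority vectors from pairwise comparisons is not reproduced here. Only the default relative_importance weights are taken from the Usage section.

```r
# Default weights from the Usage section (they sum to 1).
relative_importance <- c(AIC = 0.2323, AWE = 0.1129, BIC = 0.2525,
                         CLC = 0.0922, KIC = 0.3101)

# Hypothetical priority vectors: one column per fit index, one row per model.
# Each column sums to 1 and expresses the relative support for each model.
priorities <- matrix(
  c(0.2, 0.3, 0.5,   # AIC
    0.3, 0.4, 0.3,   # AWE
    0.2, 0.5, 0.3,   # BIC
    0.3, 0.3, 0.4,   # CLC
    0.1, 0.6, 0.3),  # KIC
  nrow = 3,
  dimnames = list(c("model_1", "model_2", "model_3"),
                  names(relative_importance))
)

# Weighted average of the priority vectors across fit indices.
weighted_priority <- drop(priorities %*% relative_importance)

# Select the model with the highest weighted average priority.
best_model <- names(which.max(weighted_priority))
best_model
```

Because the weights sum to one and every priority vector sums to one, the weighted priorities also sum to one, so they can be read as shares of overall support.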

Value

Numeric.

Author(s)

Caspar J. van Lissa

Examples

iris[,1:4] %>%
  estimate_profiles(1:4) %>%
  get_fit() %>%
  AHP()

Convert Mplus output to object of class 'tidyLPA'

Description

Takes a list of Mplus output files of class modelList, containing only mixture models with a single categorical latent variable, and converts it to an object of class tidyLPA.

Usage

as.tidyLPA(modelList)

Arguments

modelList

A list of class modelList, as generated by readModels.

Value

A list of class tidyLPA.

Author(s)

Caspar J. van Lissa

Examples

## Not run:
library(MplusAutomation)
createMixtures(classes = 1:4, filename_stem = "cars",
               model_overall = "wt ON drat;",
               model_class_specific = "wt;  qsec;",
               rdata = mtcars,
               usevariables = c("wt", "qsec", "drat"),
               OUTPUT = "standardized")
runModels(replaceOutfile = "modifiedDate")
cars_results <- readModels(filefilter = "cars")
results_tidyLPA <- as.tidyLPA(cars_results)
results_tidyLPA
plot(results_tidyLPA)
plot_profiles(results_tidyLPA) # Throws error; missing column 'Classes'
## End(Not run)

Lo-Mendell-Rubin likelihood ratio test

Description

Implements the ad-hoc adjusted likelihood ratio test (LRT) described in Formula 15 of Lo, Mendell, & Rubin (2001), or LMR LRT.

Usage

calc_lrt(n, null_ll, null_param, null_classes, alt_ll, alt_param, alt_classes)

Arguments

n

Integer. Sample size

null_ll

Numeric. Log-likelihood of the null model.

null_param

Integer. Number of parameters of the null model.

null_classes

Integer. Number of classes of the null model.

alt_ll

Numeric. Log-likelihood of the alternative model.

alt_param

Integer. Number of parameters of the alternative model.

alt_classes

Integer. Number of classes of the alternative model.

Value

A numeric vector containing the likelihood ratio LR, the ad-hoc corrected LMR, degrees of freedom, and the LMR p-value.

References

Lo Y, Mendell NR, Rubin DB. Testing the number of components in a normal mixture. Biometrika. 2001;88(3):767–778. doi:10.1093/biomet/88.3.767

Examples

calc_lrt(150L, -741.02, 8, 1, -488.91, 13, 2)
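
For orientation, the unadjusted likelihood ratio statistic for the inputs in this example can be computed by hand. This sketch deliberately does not reproduce the ad-hoc LMR adjustment. Taking the difference in parameter counts as the degrees of freedom, and referring the statistic to a chi-square distribution, are assumptions made here for illustration only.

```r
# Inputs copied from the calc_lrt() example.
null_ll <- -741.02
null_param <- 8
alt_ll <- -488.91
alt_param <- 13

# Unadjusted likelihood ratio: twice the gain in log-likelihood.
lr <- 2 * (alt_ll - null_ll)

# Assumed degrees of freedom: difference in the number of parameters.
df <- alt_param - null_param

# Naive chi-square p-value (not the LMR p-value returned by calc_lrt()).
p_naive <- pchisq(lr, df = df, lower.tail = FALSE)

c(LR = lr, df = df, p = p_naive)
```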

Compare latent profile models

Description

Takes an object of class 'tidyLPA', containing multiple latent profile models with different numbers of classes or model specifications, and helps select the optimal number of classes and model specification.

Usage

compare_solutions(x, statistics = "BIC")

Arguments

x

An object of class 'tidyLPA'.

statistics

Character vector. Which statistics to examine for determining the optimal model. Defaults to 'BIC'.

Value

An object of class 'bestLPA' and 'list', containing a tibble of fits 'fits', a named vector 'best', indicating which model fit best according to each fit index, a numeric vector 'AHP' indicating the best model according to the AHP, an object 'plot' of class 'ggplot', and a numeric vector 'statistics' corresponding to the argument of the same name.

Author(s)

Caspar J. van Lissa

Examples

iris_subset <- sample(nrow(iris), 20) # so examples execute quickly
results <- iris[iris_subset, ] %>% # use only the sampled rows
  subset(select = c("Sepal.Length", "Sepal.Width",
    "Petal.Length", "Petal.Width")) %>%
  estimate_profiles(1:3) %>%
  compare_solutions()
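
The idea of choosing a model by a single fit index can be sketched without fitting anything. The table below contains invented BIC values (hypothetical, not results obtained from iris), and the sketch assumes the usual convention that a lower BIC indicates a better model.

```r
# Hypothetical fit table: one row per estimated model.
fits <- data.frame(
  Model   = c(1, 1, 1),
  Classes = c(1, 2, 3),
  BIC     = c(1800.5, 1250.2, 1190.7)
)

# Lower BIC is taken to indicate the better model.
best <- fits[which.min(fits$BIC), ]
best
```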

Simulated MAC data

Description

This simulated dataset, based on Curry et al., 2019, contains data on moral relevance and judgment across the seven domains of the Morality As Cooperation scale.

Usage

data(curry_mac)

Format

A data.frame with 1392 rows and 42 variables.

Details

sex	`factor`	Self-identified sex of participants, Male, Female, or Transgendered.
age_years	`numeric`	Participants' age in years.
KinshipR	`numeric`	Mean score of moral relevance, kinship subscale.
MutualismR	`numeric`	Mean score of moral relevance, mutualism subscale.
ExchangeR	`numeric`	Mean score of moral relevance, exchange subscale.
HawkR	`numeric`	Mean score of moral relevance, hawk subscale.
DoveR	`numeric`	Mean score of moral relevance, dove subscale.
DivisionR	`numeric`	Mean score of moral relevance, division subscale.
PossessionR	`numeric`	Mean score of moral relevance, possession subscale.
KinshipJ	`numeric`	Mean score of moral judgment, kinship subscale.
MutualismJ	`numeric`	Mean score of moral judgment, mutualism subscale.
ExchangeJ	`numeric`	Mean score of moral judgment, exchange subscale.
HawkJ	`numeric`	Mean score of moral judgment, hawk subscale.
DoveJ	`numeric`	Mean score of moral judgment, dove subscale.
DivisionJ	`numeric`	Mean score of moral judgment, division subscale.
PossessionJ	`numeric`	Mean score of moral judgment, possession subscale.

References

Curry, O. S., Jones Chesters, M., & Van Lissa, C. J. (2019). Mapping morality with a compass: Testing the theory of ‘morality-as-cooperation’ with a new questionnaire. Journal of Research in Personality, 78, 106–124. doi:10.1016/j.jrp.2018.10.008

Simulated empathy data

Description

This simulated dataset, based on Van Lissa et al., 2014, contains six annual assessments of adolescents' mean scores on the empathic concern and perspective taking subscales of the Interpersonal Reactivity Index (Davis, 1983). The first measurement wave occurred when adolescents were, on average, 13 years old, and the last one when they were 18 years old.

Usage

data(empathy)

Format

A data frame with 467 rows and 13 variables.

Details

ec1	`numeric`	Mean score of empathic concern in wave 1
ec2	`numeric`	Mean score of empathic concern in wave 2
ec3	`numeric`	Mean score of empathic concern in wave 3
ec4	`numeric`	Mean score of empathic concern in wave 4
ec5	`numeric`	Mean score of empathic concern in wave 5
ec6	`numeric`	Mean score of empathic concern in wave 6
pt1	`numeric`	Mean score of perspective taking in wave 1
pt2	`numeric`	Mean score of perspective taking in wave 2
pt3	`numeric`	Mean score of perspective taking in wave 3
pt4	`numeric`	Mean score of perspective taking in wave 4
pt5	`numeric`	Mean score of perspective taking in wave 5
pt6	`numeric`	Mean score of perspective taking in wave 6
sex	`factor`	Adolescent sex; M = male, F = female.

References

Van Lissa, C. J., Hawk, S. T., Branje, S. J., Koot, H. M., Van Lier, P. A., & Meeus, W. H. (2014). Divergence Between Adolescent and Parental Perceptions of Conflict in Relationship to Adolescent Empathy Development. Journal of Youth and Adolescence, 1–14. doi:10.1007/s10964-014-0152-5

Estimate latent profiles

Description

Estimates latent profiles (finite mixture models) using the open source package mclust, or the commercial program Mplus (using the R-interface of MplusAutomation).

Usage

estimate_profiles(
  df,
  n_profiles,
  models = NULL,
  variances = "equal",
  covariances = "zero",
  package = "mclust",
  select_vars = NULL,
  ...
)

Arguments

df

data.frame of numeric data; continuous indicators are required for mixture modeling.

n_profiles

Integer vector of the number of profiles (or mixture components) to be estimated.

models

Integer vector. Set to NULL by default, and models are constructed from the variances and covariances arguments. See Details for the six models available in tidyLPA.

variances

Character vector. Specifies which variance components to estimate. Defaults to "equal" (constrain variances across profiles); the other option is "varying" (estimate variances freely across profiles). Each element of this vector refers to one of the models you wish to run.

covariances

Character vector. Specifies which covariance components to estimate. Defaults to "zero" (do not estimate covariances; this corresponds to an assumption of conditional independence of the indicators); other options are "equal" (estimate covariances between items, constrained across profiles), and "varying" (free covariances across profiles).

package

Character. Which package to use; 'mclust' or 'MplusAutomation' (requires Mplus to be installed). Default: 'mclust'.

select_vars

Character. Optional vector of variable names in df, to be used for model estimation. Defaults to NULL, which means all variables in df are used.

...

Additional arguments are passed to the estimating function; i.e., Mclust, or mplusModeler.

Details

Six models are currently available in tidyLPA, corresponding to the most common requirements. These are:

1. Equal variances and covariances fixed to 0
2. Varying variances and covariances fixed to 0
3. Equal variances and equal covariances
4. Varying variances and equal covariances (not able to be fit w/ mclust)
5. Equal variances and varying covariances (not able to be fit w/ mclust)
6. Varying variances and varying covariances

Two interfaces are available to estimate these models; specify their numbers in the models argument (e.g., models = 1, or models = c(1, 2, 3)), or specify the variances/covariances to be estimated (e.g., variances = c("equal", "varying"), covariances = c("zero", "equal")). Note that when mclust is used, models = c(1, 2, 3, 6) are the only models available.
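
The correspondence between the two interfaces can be written down as a small lookup table. The sketch below only restates the six specifications listed in this section; the helper model_number() is hypothetical and is not part of tidyLPA.

```r
# The six model specifications from the Details section, as a lookup table.
lpa_models <- data.frame(
  model       = 1:6,
  variances   = c("equal", "varying", "equal", "varying", "equal", "varying"),
  covariances = c("zero", "zero", "equal", "equal", "varying", "varying"),
  mclust      = c(TRUE, TRUE, TRUE, FALSE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of tidyLPA): translate a variances/covariances
# pair into the corresponding model number.
model_number <- function(variances, covariances) {
  hit <- lpa_models$variances == variances &
    lpa_models$covariances == covariances
  if (!any(hit)) stop("No such model specification.")
  lpa_models$model[hit]
}

model_number("equal", "zero")       # model 1
model_number("varying", "varying")  # model 6

# Models that can be estimated with mclust, per the note in this section.
lpa_models$model[lpa_models$mclust]
```

Keeping the mapping in a table rather than in nested conditionals makes it easy to check which specifications are unavailable with mclust before starting an analysis.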

Value

A list of class 'tidyLPA'.

Examples

iris_sample <- iris[c(1:4, 51:54, 101:104), ] # to make example run more quickly

# Example 1:
iris_sample %>%
  subset(select = c("Sepal.Length", "Sepal.Width",
    "Petal.Length")) %>%
  estimate_profiles(3)

# Example 2:
iris %>%
  subset(select = c("Sepal.Length", "Sepal.Width",
    "Petal.Length")) %>%
  estimate_profiles(n_profiles = 1:4, models = 1:3)

# Example 3:
iris_sample %>%
  subset(select = c("Sepal.Length", "Sepal.Width",
    "Petal.Length")) %>%
  estimate_profiles(n_profiles = 1:4, variances = c("equal", "varying"),
                    covariances = c("zero", "zero"))

Estimate latent profiles using mclust

Description

Estimates latent profiles (finite mixture models) using the open source package mclust.

Usage

estimate_profiles_mclust(df, n_profiles, model_numbers, select_vars, ...)

Arguments

df

data.frame with two or more columns with continuous variables

n_profiles

Numeric vector. The number of profiles (or mixture components) to be estimated. Each number in the vector corresponds to an analysis with that many mixture components.

model_numbers

Numeric vector. Numbers of the models to be estimated. See estimate_profiles for a description of the models available in tidyLPA.

select_vars

Character. Optional vector of variable names in df, to be used for model estimation. Defaults to NULL, which means all variables in df are used.

...

Parameters passed directly to Mclust. See the documentation of Mclust.

Value

An object of class 'tidyLPA' and 'list'

Author(s)

Caspar J. van Lissa

Estimate latent profiles using Mplus

Description

Estimates latent profiles (finite mixture models) using the commercial program Mplus, through the R-interface of MplusAutomation.

Usage

estimate_profiles_mplus2(
  df,
  n_profiles,
  model_numbers,
  select_vars,
  ...,
  keepfiles = FALSE
)

Arguments

df

data.frame with two or more columns with continuous variables

n_profiles

Numeric vector. The number of profiles (or mixture components) to be estimated. Each number in the vector corresponds to an analysis with that many mixture components.

model_numbers

Numeric vector. Numbers of the models to be estimated. See estimate_profiles for a description of the models available in tidyLPA.

select_vars

Character. Optional vector of variable names in df, to be used for model estimation. Defaults to NULL, which means all variables in df are used.

...

Parameters passed directly to mplusModeler. See the documentation of mplusModeler.

keepfiles

Logical. Whether to retain the files created by mplusModeler (e.g., for future reference, or to manually edit them).

Value

An object of class 'tidyLPA' and 'list'

Author(s)

Caspar J. van Lissa

Get data from objects generated by tidyLPA

Description

Get data from objects generated by tidyLPA.

Usage

get_data(x, ...)

## S3 method for class 'tidyLPA'
get_data(x, ...)

## S3 method for class 'tidyProfile'
get_data(x, ...)

Arguments

x

An object generated by tidyLPA.

...

Further arguments to be passed to or from other methods. They are ignored in this function.

Value

If one model is fit, the data is returned in wide format as a tibble. If more than one model is fit, the data is returned in long form. See the examples.

Methods (by class)

tidyLPA: Get data for a latent profile analysis with multiple numbers of classes and models, of class 'tidyLPA'.
tidyProfile: Get data for a single latent profile analysis object, of class 'tidyProfile'.

Author(s)

Caspar J. van Lissa

Examples

## Not run:
if(interactive()){
 library(dplyr)
 # the data is returned in wide form
 results <- iris %>%
   select(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width) %>%
   estimate_profiles(3)
 get_data(results)

 # note that if more than one model is fit, the data is returned in long form
 results1 <- iris %>%
   select(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width) %>%
   estimate_profiles(c(3, 4))
 get_data(results1)
}
## End(Not run)

Get estimates from objects generated by tidyLPA

Description

Get estimates from objects generated by tidyLPA.

Usage

get_estimates(x, ...)

## S3 method for class 'tidyLPA'
get_estimates(x, ...)

## S3 method for class 'tidyProfile'
get_estimates(x, ...)

Arguments

x

An object generated by tidyLPA.

...

Further arguments to be passed to or from other methods. They are ignored in this function.

Value

A tibble.

Methods (by class)

tidyLPA: Get estimates for a latent profile analysis with multiple numbers of classes and models, of class 'tidyLPA'.
tidyProfile: Get estimates for a single latent profile analysis object, of class 'tidyProfile'.

Author(s)

Caspar J. van Lissa

Examples

## Not run:
if(interactive()){
 library(dplyr) # provides select()
 results <- iris %>%
   select(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width) %>%
   estimate_profiles(3)
 get_estimates(results)
 get_estimates(results[[1]])
}
## End(Not run)

Get fit indices from objects generated by tidyLPA

Description

Get fit indices from objects generated by tidyLPA.

Usage

get_fit(x, ...)

## S3 method for class 'tidyLPA'
get_fit(x, ...)

## S3 method for class 'tidyProfile'
get_fit(x, ...)

Arguments

x

An object generated by tidyLPA.

...

Further arguments to be passed to or from other methods. They are ignored in this function.

Value

A tibble. Learn more at https://data-edu.github.io/tidyLPA/articles/Introduction_to_tidyLPA.html#getting-fit-statistics

Methods (by class)

tidyLPA: Get fit indices for a latent profile analysis with multiple numbers of classes and models, of class 'tidyLPA'.
tidyProfile: Get fit indices for a single latent profile analysis object, of class 'tidyProfile'.

Author(s)

Caspar J. van Lissa

Examples

## Not run:
if(interactive()){
 library(dplyr) # provides select()
 results <- iris %>%
   select(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width) %>%
   estimate_profiles(3)
 get_fit(results)
 get_fit(results[[1]])
}
## End(Not run)

Simulated identity data

Description

This simulated dataset, based on Crocetti et al., 2013, contains five annual assessments of adolescents' mean scores on the commitment, exploration (in depth), and reconsideration subscales of the Utrecht-Management of Identity Commitments Scale (Crocetti et al., 2008). The scores reported here reflect the educational identity subscales of this instrument. The first measurement wave occurred when adolescents were, on average, 14 years old, and the last one when they were 18 years old.

Usage

data(id_edu)

Format

A data frame with 443 rows and 16 variables.

Details

com1	`numeric`	Mean score of educational commitment in wave 1
exp1	`numeric`	Mean score of educational exploration in wave 1
rec1	`numeric`	Mean score of educational reconsideration in wave 1
com2	`numeric`	Mean score of educational commitment in wave 2
exp2	`numeric`	Mean score of educational exploration in wave 2
rec2	`numeric`	Mean score of educational reconsideration in wave 2
com3	`numeric`	Mean score of educational commitment in wave 3
exp3	`numeric`	Mean score of educational exploration in wave 3
rec3	`numeric`	Mean score of educational reconsideration in wave 3
com4	`numeric`	Mean score of educational commitment in wave 4
exp4	`numeric`	Mean score of educational exploration in wave 4
rec4	`numeric`	Mean score of educational reconsideration in wave 4
com5	`numeric`	Mean score of educational commitment in wave 5
exp5	`numeric`	Mean score of educational exploration in wave 5
rec5	`numeric`	Mean score of educational reconsideration in wave 5
sex	`factor`	Adolescent sex; M = male, F = female.

References

Crocetti, E., Klimstra, T. A., Hale, W. W., Koot, H. M., & Meeus, W. (2013). Impact of early adolescent externalizing problem behaviors on identity development in middle to late adolescence: A prospective 7-year longitudinal study. Journal of Youth and Adolescence, 42(11), 1745-1758. doi:10.1007/s10964-013-9924-6

Student questionnaire data with four variables from the 2015 PISA for students in the United States

Description

Student questionnaire data with four variables from the 2015 PISA for students in the United States

Usage

pisaUSA15

Format

Data frame with columns

broad_interest: composite measure of students' self reported broad interest
enjoyment: composite measure of students' self reported enjoyment
instrumental_mot: composite measure of students' self reported instrumental motivation
self_efficacy: composite measure of students' self reported self efficacy

...

Source

http://www.oecd.org/pisa/data/

Create correlation plots for a mixture model

Description

Creates a faceted plot of two-dimensional correlation plots and unidimensional density plots for an object of class 'tidyProfile'.

Usage

plot_bivariate(
  x,
  variables = NULL,
  sd = TRUE,
  cors = TRUE,
  rawdata = TRUE,
  bw = FALSE,
  alpha_range = c(0, 0.1),
  return_list = FALSE
)

Arguments

x

tidyProfile object to plot. A tidyProfile is one element of a tidyLPA analysis.

variables

Which variables to plot. If NULL, plots all variables that are present in all models.

sd

Logical. Whether to show the estimated standard deviations as lines emanating from the cluster centroid.

cors

Logical. Whether to show the estimated correlation (standardized covariance) as ellipses surrounding the cluster centroid.

rawdata

Logical. Whether to plot raw data, weighted by posterior class probability.

bw

Logical. Whether to make a black and white plot (for print) or a color plot. Defaults to FALSE, because these density plots are hard to read in black and white.

alpha_range

Numeric vector (0-1). Sets the transparency of geom_density and geom_point.

return_list

Logical. Whether to return a list of ggplot objects, or just the final plot. Defaults to FALSE.

Value

An object of class 'ggplot'.

Author(s)

Caspar J. van Lissa

Examples

# Example 1
iris_sample <- iris[c(1:10, 51:60, 101:110), ] # to make example run more quickly
## Not run:
iris_sample %>%
 subset(select = c("Sepal.Length", "Sepal.Width")) %>%
 estimate_profiles(n_profiles = 2, models = 1) %>%
 plot_bivariate()
## End(Not run)

# Example 2
## Not run:
mtcars %>%
  subset(select = c("wt", "qsec", "drat")) %>%
  poms() %>%
  estimate_profiles(3) %>%
  plot_bivariate()
## End(Not run)

Create density plots for mixture models

Description

Creates a faceted plot of density plots for an object of class 'tidyLPA'. For each variable, a Total density plot will be shown, along with separate density plots for each latent class, where cases are weighted by the posterior probability of being assigned to that class.
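
Weighting cases by posterior probability can be sketched with base R's density(), which accepts a weights argument. The posterior probabilities below are invented for illustration (a hypothetical logistic function of the variable, not estimates from a fitted model), and this is not necessarily how plot_density() computes its curves.

```r
x <- iris$Petal.Length

# Hypothetical posterior probabilities of belonging to one latent class.
posterior <- plogis(2 * (x - 3))

# density() expects weights that sum to 1.
w <- posterior / sum(posterior)

# Fix the bandwidth explicitly so both curves use the same smoothing.
bw <- bw.nrd0(x)
total_density <- density(x, bw = bw)
class_density <- density(x, weights = w, bw = bw)

# Cases with high posterior probability dominate the weighted summary.
weighted.mean(x, w) > mean(x)
```

Cases with a posterior probability near zero contribute almost nothing to the class-specific curve, which is what lets one observation appear in several class densities in proportion to its classification uncertainty.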

Usage

plot_density(
  x,
  variables = NULL,
  bw = FALSE,
  conditional = FALSE,
  alpha = 0.2,
  facet_labels = NULL
)

Arguments

x

Object to plot.

variables

Which variables to plot. If NULL, plots all variables that are present in all models.

bw

Logical. Whether to make a black and white plot (for print) or a color plot. Defaults to FALSE, because these density plots are hard to read in black and white.

conditional

Logical. Whether to show a conditional density plot (surface area is divided amongst the latent classes), or a classic density plot (surface area of the total density plot is equal to one, and is subdivided amongst the classes).

alpha

Numeric (0-1). Only used when bw and conditional are FALSE. Sets the transparency of geom_density, so that classes with a small number of cases remain visible.

facet_labels

Named character vector, the names of which should correspond to the facet labels one wishes to rename, and the values of which provide new names for these facets. For example, to rename variables, in the example with the 'iris' data below, one could specify: facet_labels = c("Pet_leng" = "Petal length").

Value

An object of class 'ggplot'.

Author(s)

Caspar J. van Lissa

Examples

## Not run:
results <- iris %>%
  subset(select = c("Sepal.Length", "Sepal.Width",
    "Petal.Length", "Petal.Width")) %>%
  estimate_profiles(1:3)
## End(Not run)
## Not run:
plot_density(results, variables = "Petal.Length")
## End(Not run)
## Not run:
plot_density(results, bw = TRUE)
## End(Not run)
## Not run:
plot_density(results, bw = FALSE, conditional = TRUE)
## End(Not run)
## Not run:
plot_density(results[[2]], variables = "Petal.Length")
## End(Not run)

Create latent profile plots

Description

Creates a profile plot according to best practices, focusing on the visualization of classification uncertainty by showing:

Bars reflecting a confidence interval for the class centroids
Boxes reflecting the standard deviations within each class; a box spanning +/- 1 SD encompasses approximately 68% of the observations in a normal distribution
Raw data, whose transparency is weighted by the posterior class probability, such that each datapoint is most clearly visible for the class it is most likely to be a member of.
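
The share of a normal distribution covered by a band of plus or minus one standard deviation, and the multiplier behind a 95% interval, can be checked directly in base R. This is a generic property of the normal distribution, not code taken from plot_profiles().

```r
# Proportion of a normal distribution within +/- 1 SD of the mean.
within_1sd <- pnorm(1) - pnorm(-1)
round(within_1sd, 3)

# Number of standard errors spanned on each side by a 95% interval.
z_95 <- qnorm(1 - (1 - 0.95) / 2)
round(z_95, 2)
```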

Usage

plot_profiles(
  x,
  variables = NULL,
  ci = 0.95,
  sd = TRUE,
  add_line = TRUE,
  rawdata = TRUE,
  bw = FALSE,
  alpha_range = c(0, 0.1),
  ...
)

## Default S3 method:
plot_profiles(
  x,
  variables = NULL,
  ci = 0.95,
  sd = TRUE,
  add_line = FALSE,
  rawdata = TRUE,
  bw = FALSE,
  alpha_range = c(0, 0.1),
  ...
)

Arguments

x

An object containing the results of a mixture model analysis.

variables

A character vector with the names of the variables to be plotted (optional).

ci

Numeric. What confidence interval should the errorbars span? Defaults to a 95% confidence interval. Set to NULL to remove errorbars.

sd

Logical. Whether to display a box encompassing +/- 1 SD. Defaults to TRUE.

add_line

Logical. Whether to display a line, connecting cluster centroids belonging to the same latent class. Defaults to TRUE. Note that the additional information conveyed by such a line is limited.

rawdata

Should raw data be plotted in the background? Setting this to TRUE might result in long plotting times.

bw

Logical. Should the plot be black and white (for print), or color?

alpha_range

The minimum and maximum values of alpha (transparency) for the raw data. Minimum should be 0; lower maximum values of alpha can help reduce overplotting.

...

Arguments passed to and from other functions.

Value

An object of class 'ggplot'.

Author(s)

Caspar J. van Lissa

Examples

# Example 1
iris_sample <- iris[c(1:10, 51:60, 101:110), ] # to make example run more quickly
iris_sample %>%
 subset(select = c("Sepal.Length", "Sepal.Width")) %>%
 estimate_profiles(n_profiles = 1:2, models = 1:2) %>%
 plot_profiles()

# Example 2
mtcars %>%
  subset(select = c("wt", "qsec", "drat")) %>%
  poms() %>%
  estimate_profiles(1:4) %>%
  plot_profiles(add_line = FALSE)

Apply POMS-coding to data

Description

Takes in a data.frame, and applies POMS (proportion of maximum)-coding to the numeric columns.

Usage

poms(data)

Arguments

data

A data.frame.

Value

A data.frame.

Author(s)

Caspar J. van Lissa

Examples

data <- data.frame(a = c(1, 2, 2, 4, 1, 6),
                   b = c(6, 6, 3, 5, 3, 4),
                   c = c("a", "b", "b", "t", "f", "g"))
poms(data)
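
For orientation, POMS coding is commonly defined as rescaling each numeric variable to the 0-1 range via (x - min) / (max - min). The helper below is a hypothetical base R sketch of that definition, not the package's own implementation; it leaves non-numeric columns untouched.

```r
# Hypothetical helper illustrating the usual POMS definition.
poms_sketch <- function(data) {
  numeric_cols <- vapply(data, is.numeric, logical(1))
  data[numeric_cols] <- lapply(data[numeric_cols], function(x) {
    (x - min(x, na.rm = TRUE)) / (max(x, na.rm = TRUE) - min(x, na.rm = TRUE))
  })
  data
}

example_data <- data.frame(a = c(1, 2, 2, 4, 1, 6),
                           b = c(6, 6, 3, 5, 3, 4),
                           c = c("a", "b", "b", "t", "f", "g"))
rescaled <- poms_sketch(example_data)
range(rescaled$a)
```

Putting indicators on a common 0-1 scale keeps variables measured in very different units, such as the mtcars columns used in the plotting examples, comparable within one plot.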

Print tidyLPA

Description

S3 method 'print' for class 'tidyLPA'.

Usage

## S3 method for class 'tidyLPA'
print(
  x,
  stats = c("AIC", "BIC", "Entropy", "prob_min", "prob_max", "n_min", "n_max",
    "BLRT_p"),
  digits = 2,
  na.print = "",
  ...
)

Arguments

x

An object of class 'tidyLPA'.

stats

Character vector. Statistics to be printed. Default: c("AIC", "BIC", "Entropy", "prob_min", "prob_max", "n_min", "n_max", "BLRT_p").

digits

Minimal number of significant digits, see print.default.

na.print

A character string which is used to indicate NA values in printed output, or NULL. See print.default.

...

Further arguments to be passed to or from other methods. They are ignored in this function.

Author(s)

Caspar J. van Lissa

Examples

## Not run:
if(interactive()){
 library(dplyr) # provides select()
 iris %>%
   select(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width) %>%
   estimate_profiles(3)
}
## End(Not run)

Print tidyProfile

Description

S3 method 'print' for class 'tidyProfile'.

Usage

## S3 method for class 'tidyProfile'
print(x, digits = 2, na.print = "", ...)

Arguments

x

An object of class 'tidyProfile'.

digits

Minimal number of significant digits, see print.default.

na.print

A character string which is used to indicate NA values in printed output, or NULL. See print.default.

...

Further arguments to be passed to or from other methods. They are ignored in this function.

Author(s)

Caspar J. van Lissa

Examples

## Not run:
if(interactive()){
 library(dplyr) # provides select()
 tmp <- iris %>%
   select(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width) %>%
   estimate_profiles(3)
 tmp[[2]]
}
## End(Not run)

Apply single imputation to data

Description

This function accommodates several methods for single imputation of data. Currently, the following methods are defined:

"imputeData": Applies the mclust native imputation function imputeData.
"missForest": Applies non-parametric, random-forest based data imputation using missForest. Random forests can accommodate any complex interactions and non-linear relations in the data. My simulation studies indicate that this method is preferable to mclust's imputeData (see examples).

Usage

single_imputation(x, method = "imputeData")

Arguments

x

A data.frame or matrix.

method

Character. Imputation method to apply. Default: 'imputeData'.
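
To make the term concrete: single imputation replaces every missing value with one filled-in value, yielding one complete dataset. The sketch below uses column-mean imputation as a deliberately simple stand-in; it is neither of the two methods this function offers, and the helper name impute_mean() is hypothetical.

```r
# Hypothetical helper: replace missing values by the column mean.
impute_mean <- function(x) {
  x <- as.data.frame(x)
  for (col in names(x)) {
    if (is.numeric(x[[col]])) {
      missing <- is.na(x[[col]])
      x[[col]][missing] <- mean(x[[col]], na.rm = TRUE)
    }
  }
  x
}

incomplete <- data.frame(a = c(1, NA, 3), b = c(NA, 4, 6))
completed <- impute_mean(incomplete)
completed
```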

Value

A data.frame

Author(s)

Caspar J. van Lissa

Examples

## Not run:
library(ggplot2)
library(missForest)
library(mclust)

dm <- 2
k <- 3
n <- 100
V <- 4

# Example of one simulation
class <- sample.int(k, n, replace = TRUE)
dat <- matrix(rnorm(n*V, mean = (rep(class, each = V)-1)*dm), nrow  = n,
              ncol = V, byrow = TRUE)
results <- estimate_profiles(data.frame(dat), 1:5)
plot_profiles(results)
compare_solutions(results)

# Simulation for parametric data (i.e., all assumptions of latent profile
# analysis met)
simulation <- replicate(100, {
    class <- sample.int(k, n, replace = TRUE)
    dat <- matrix(rnorm(n*V, mean = (rep(class, each = V)-1)*dm), nrow  = n,
                  ncol = V, byrow = TRUE)
    d <- prodNA(dat)
    d_mf <- missForest(d)$ximp
    m_mf <- Mclust(d_mf, G = 3, "EEI")
    d_im <- imputeData(d, verbose = FALSE)
    m_im <- Mclust(d_im, G = 3, "EEI")
    class_tabl_mf <- sort(prop.table(table(class, m_mf$classification)),
                          decreasing = TRUE)[1:3]
    class_tabl_im <- sort(prop.table(table(class, m_im$classification)),
                          decreasing = TRUE)[1:3]
    c(sum(class_tabl_mf), sum(class_tabl_im))
})
# Performance on average
rowMeans(simulation)
# Performance SD (row 1 = missForest, row 2 = imputeData)
apply(simulation, 1, sd)
# Plot shows slight advantage for missForest
plotdat <- data.frame(accuracy = as.vector(simulation),
                      model = rep(c("mf", "im"), ncol(simulation)))
ggplot(plotdat, aes(x = accuracy, colour = model))+geom_density()

# Simulation for real data (i.e., unknown whether assumptions are met)
simulation <- replicate(100, {
    d <- prodNA(iris[,1:4])
    d_mf <- missForest(d)$ximp
    m_mf <- Mclust(d_mf, G = 3, "EEI")
    d_im <- imputeData(d, verbose = FALSE)
    m_im <- Mclust(d_im, G = 3, "EEI")
    class_tabl_mf <- sort(prop.table(table(iris$Species,
                          m_mf$classification)), decreasing = TRUE)[1:3]
    class_tabl_im <- sort(prop.table(table(iris$Species,
                          m_im$classification)), decreasing = TRUE)[1:3]
    c(sum(class_tabl_mf), sum(class_tabl_im))
})
# Performance on average
rowMeans(simulation)
# Performance SD (row 1 = missForest, row 2 = imputeData)
apply(simulation, 1, sd)
# Plot shows slight advantage for missForest
plotdat <- data.frame(accuracy = as.vector(simulation),
                      model = rep(c("mf", "im"), ncol(simulation)))
ggplot(plotdat, aes(x = accuracy, colour = model))+geom_density()
## End(Not run)

tidyLPA: Functionality to carry out Latent Profile Analysis in R

Description

Latent Profile Analysis (LPA) is a statistical modeling approach for estimating distinct profiles, or groups, of variables. In the social sciences and in educational research, these profiles could represent, for example, how different youth experience dimensions of being engaged (i.e., cognitively, behaviorally, and affectively) at the same time.

Details

tidyLPA provides the functionality to carry out LPA in R. In particular, tidyLPA provides functionality to specify different models that determine whether and how different parameters (i.e., means, variances, and covariances) are estimated and to specify (and compare solutions for) the number of profiles to estimate.