Type: Package
Title: High-Level Functions for Tabulating, Charting and Reporting Survey Data
Version: 3.2.0
Date: 2025-11-10
Description: Craft polished tables and plots in Markdown reports. Simply choose whether to treat your data as counts or metrics, and the package will automatically generate well-designed default tables and plots for you. Boiled down to the basics, with labeling features and simple interactive reports. All functions are 'tidyverse' compatible.
URL: https://github.com/strohne/volker, https://strohne.github.io/volker/
BugReports: https://github.com/strohne/volker/issues
License: MIT + file LICENSE
Encoding: UTF-8
RoxygenNote: 7.3.2
LazyData: true
Imports: stats, utils, rlang, lifecycle, tibble, dplyr, tidyr, tidyselect, ggplot2 (≥ 2.2.1), scales, base64enc, purrr, magrittr, skimr, broom, knitr, kableExtra, rmarkdown, psych, car, effectsize, heplots
Depends: R (≥ 4.2)
Suggests: tidyverse, remotes, usethis, testthat (≥ 3.0.0), vdiffr
VignetteBuilder: knitr
Config/testthat/edition: 3
NeedsCompilation: no
Packaged: 2025-10-10 22:45:05 UTC; Jakob
Author: Jakob Jünger (ORCID iD) [aut, cre, cph], Henrieke Kotthoff [aut, ctb], Chantal Gärtner (ORCID iD) [ctb]
Maintainer: Jakob Jünger <jakob.juenger@uni-muenster.de>
Repository: CRAN
Date/Publication: 2025-10-10 23:10:02 UTC

volker: High-Level Functions for Tabulating, Charting and Reporting Survey Data

Description

Craft polished tables and plots in Markdown reports. Simply choose whether to treat your data as counts or metrics, and the package will automatically generate well-designed default tables and plots for you. Boiled down to the basics, with labeling features and simple interactive reports. All functions are 'tidyverse' compatible.

Author(s)

Maintainer: Jakob Jünger jakob.juenger@uni-muenster.de (ORCID) [copyright holder]

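Authors:

Henrieke Kotthoff [contributor]

Other contributors:

Chantal Gärtner (ORCID) [contributor]

See Also

Useful links:

https://github.com/strohne/volker
https://strohne.github.io/volker/
Report bugs at https://github.com/strohne/volker/issues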


Pipe operator

Description

See magrittr::%>% for details.

Usage

lhs %>% rhs

Arguments

lhs

A value or the magrittr placeholder.

rhs

A function call using the magrittr semantics.

Value

The result of calling rhs(lhs).


Add an object to the report list

Description

Add an object to the report list

Usage

.add_to_vlkr_rprt(obj, chunks, tab = NULL)

Arguments

obj

A new chunk (volker table, volker plot or character value).

chunks

The current report list.

tab

A tabsheet name or NULL.

Value

A volker report object.


Calculate classification performance indicators such as precision and recall.

Description

Calculate classification performance indicators such as precision and recall.

Usage

.agree_classification(x, y, ids = NULL, category = NULL)

Arguments

x

Vector with values.

y

Vector with sources; the first value is taken as the ground truth source. To change the ground truth, use a factor and set the first factor level to the appropriate value.

ids

Vector of Case IDs.

category

If no category is provided, macro statistics are returned (along with the number of categories in the output). Provide a category to get the statistics for this category only. If values are boolean (TRUE / FALSE) and no category is provided, the category is always assumed to be "TRUE".

Value

A list with classification performance indicators.
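The indicators named above can be illustrated with a minimal base R sketch. The helper classification_scores() and its arguments are hypothetical and not part of volker; the sketch assumes a single focus category and two aligned vectors, which may differ from how the package handles sources and case IDs internally.

```r
# Hypothetical helper (not volker's implementation): accuracy, precision,
# recall and F1 for one focus category, with `truth` as the ground truth.
classification_scores <- function(truth, predicted, category = TRUE) {
  tp <- sum(predicted == category & truth == category)  # true positives
  fp <- sum(predicted == category & truth != category)  # false positives
  fn <- sum(predicted != category & truth == category)  # false negatives

  precision <- tp / (tp + fp)
  recall <- tp / (tp + fn)

  list(
    accuracy = mean(predicted == truth),
    precision = precision,
    recall = recall,
    f1 = 2 * precision * recall / (precision + recall)
  )
}

truth     <- c(TRUE, TRUE, FALSE, FALSE, TRUE)
predicted <- c(TRUE, FALSE, FALSE, TRUE, TRUE)
scores <- classification_scores(truth, predicted)
```

Macro statistics would, under the same assumptions, average these values over all categories.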


Generate confusion matrix

Description

Generate confusion matrix

Usage

.agree_confusion(data, cols, coders, ids = NULL, category = NULL, labels = TRUE)

Arguments

data

A tibble.

cols

The columns holding codings.

coders

The column holding coders.

ids

The column holding case identifiers.

category

Focus category or NULL.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

Value

A tibble representing a confusion matrix.


Calculate agreement coefficients for multiple items

Description

Calculate agreement coefficients for multiple items

Usage

.agree_items(data, cols, coders, ids = NULL, category = NULL, method = "reliability", labels = TRUE)

Arguments

data

A tibble.

cols

The columns holding codings.

coders

The column holding coders.

ids

The column holding case identifiers.

category

If no category is provided, macro statistics are returned (along with the number of categories in the output). Provide a category to get the statistics for this category only. If values are boolean (TRUE / FALSE) and no category is provided, the category is always assumed to be "TRUE".

method

The output metrics, one of reliability (reliability scores) or classification (accuracy, precision, recall, f1).

labels

If TRUE (default) extracts labels from the attributes, see codebook.

Value

A tibble with agreement coefficients.


Calculate reliability scores

Description

Calculate reliability scores

Usage

.agree_reliability(x, y, ids = NULL)

Arguments

x

Vector with codings.

y

Vector with coders.

ids

Vector with case IDs.

Value

A list with reliability coefficients.
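Which coefficients the list contains is not stated here. As an illustration only, the following base R sketch computes two common ones for two coders, percent agreement and Cohen's kappa, from long-format vectors shaped like x, y and ids. The data and variable names are made up, and the choice of coefficients is an assumption rather than a description of volker's output.

```r
# Illustrative long-format input: one row per coding (made-up data).
codings <- c("a", "b", "a", "a", "a", "b", "b", "a")
coders  <- c("c1", "c1", "c1", "c1", "c2", "c2", "c2", "c2")
ids     <- c(1, 2, 3, 4, 1, 2, 3, 4)

# Align the two coders by case ID.
r1 <- codings[coders == "c1"][order(ids[coders == "c1"])]
r2 <- codings[coders == "c2"][order(ids[coders == "c2"])]

# Observed agreement: share of cases with identical codings.
agreement <- mean(r1 == r2)

# Expected chance agreement from the marginal distributions of both coders.
lv <- union(r1, r2)
p1 <- as.numeric(table(factor(r1, levels = lv))) / length(r1)
p2 <- as.numeric(table(factor(r2, levels = lv))) / length(r2)
expected <- sum(p1 * p2)

# Cohen's kappa corrects the observed agreement for chance agreement.
cohen_kappa <- (agreement - expected) / (1 - expected)
```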


Insert a name-value-pair into an object attribute

Description

Insert a name-value-pair into an object attribute

Usage

.attr_insert(obj, key, name, value)

Arguments

obj

The object.

key

The attribute key.

name

The name of a list item within the attribute.

value

The value of the list item.

Value

The object with new attributes.


Set an attribute value on selected columns of a data frame

Description

Set an attribute value on selected columns of a data frame

Usage

.attr_setcolumn(x, cols, attr_name, attr_value)

Arguments

x

A data frame containing the columns to modify.

cols

A tidyselect expression specifying which columns to modify (e.g., c(var1, var2) or starts_with("score")).

attr_name

A character string giving the name of the attribute to set.

attr_value

The value to assign to the attribute for all selected columns.

Value

The data frame with the modified column attributes.
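The effect can be pictured with a small base R sketch. It selects columns by a name pattern instead of a tidyselect expression, and the attribute name "limits" is only an example chosen for illustration.

```r
df <- data.frame(score_a = 1:3, score_b = 4:6, other = 7:9)

# Select the columns to modify (base R stand-in for starts_with("score")).
cols <- grep("^score", names(df), value = TRUE)

# Set the same attribute value on every selected column.
for (col in cols) {
  attr(df[[col]], "limits") <- c(1, 5)
}
```

Columns outside the selection keep their attributes unchanged.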


Transfer attributes from one to another object

Description

Transfer attributes from one to another object

Usage

.attr_transfer(to, from, keys)

Arguments

to

The target object.

from

The source object.

keys

A character vector of attribute keys

Value

The target object with the updated attributes.


Calculate correlation and cooccurrence coefficients and test whether they are different from zero

Description

This function calculates coefficients for all pairwise items by calling get_correlation() on each combination of the items in the cols and cross parameters.

Usage

.effect_correlations(data, cols, cross, method = "pearson", category = NULL, test = TRUE, adjust = "fdr", labels = TRUE)

Arguments

data

A tibble.

cols

The columns holding metric values.

cross

The columns holding metric values to correlate.

method

The output metrics: pearson = Pearson's r, spearman = Spearman's rho, cramer = Cramer's V, npmi = Normalized Pointwise Mutual Information. The reported R squared value is simply the square of Spearman's rho or Pearson's r, respectively.

category

Calculating NPMI for multiple items requires a focus category. By default, for logical column types, only TRUE values are counted. For other column types, the first category is counted. Accepts both character and numeric values to override default counting behavior.

test

Boolean, whether to perform significance tests (default = TRUE).

adjust

Performing multiple significance tests inflates the alpha error. Thus, p values need to be adjusted according to the number of tests. Set a method supported by stats::p.adjust, e.g. "fdr" (the default) or "bonferroni". Disable adjustment with FALSE.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

Value

A tibble with correlation results.
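The general pattern (one test per combination of cols and cross, followed by a p-value adjustment) can be sketched in base R. This is not the package's code: it uses stats::cor.test directly rather than get_correlation(), the built-in mtcars data, and made-up column names in the result.

```r
data <- mtcars
cols <- c("mpg", "hp")
cross <- c("wt", "qsec")

# All combinations of the cols and cross items.
combos <- expand.grid(item1 = cols, item2 = cross, stringsAsFactors = FALSE)

# One Pearson correlation test per combination.
tests <- Map(
  function(a, b) stats::cor.test(data[[a]], data[[b]], method = "pearson"),
  combos$item1, combos$item2
)

combos$r <- vapply(tests, function(t) unname(t$estimate), numeric(1), USE.NAMES = FALSE)
combos$p <- vapply(tests, function(t) t$p.value, numeric(1), USE.NAMES = FALSE)

# Adjust for multiple testing; "fdr" is the Benjamini-Hochberg method.
combos$p_adj <- stats::p.adjust(combos$p, method = "fdr")
```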


Create a factor vector and preserve all attributes

Description

Create a factor vector and preserve all attributes

Usage

.factor_with_attr(x, levels = NULL)

Arguments

x

The source value, usually a character vector

levels

The new levels

Value

A factor vector with the new levels


Get the maximum density value in a density plot

Description

Useful for placing geoms in the center of density plots

Usage

.get_density_mode(data, col)

Arguments

data

A tibble.

col

A tidyselect column.

Value

The maximum density value.
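A minimal sketch of the idea, assuming a kernel density estimate from stats::density with default settings (the package may use different settings):

```r
x <- c(1, 2, 2, 3, 3, 3, 4, 4, 5)

# Kernel density estimate with default bandwidth.
d <- stats::density(x)

# Height of the density curve at its peak, and where the peak is located.
max_density <- max(d$y)
mode_x <- d$x[which.max(d$y)]
```

Half of max_density would be a natural vertical position for a geom placed in the center of the density area.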


Get plot size and resolution for the current output format from the config

Description

Get plot size and resolution for the current output format from the config

Usage

.get_fig_settings()

Value

A list with figure settings


Guess plot limits from values

Description

Guess plot limits from values

Usage

.get_plot_limits(values)

Arguments

values

Numeric values vector

Value

Either a vector c(0,1) or c(-1,1) or the range of the values, whichever covers the values better.
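One plausible reading of this rule, written as a hypothetical helper guess_limits() that is not the package's implementation: use the unit interval if all values fit into it, the symmetric interval if they fit into that, and the observed range otherwise.

```r
# Hypothetical helper: pick c(0, 1), c(-1, 1) or the observed range.
guess_limits <- function(values) {
  rng <- range(values, na.rm = TRUE)
  if (rng[1] >= 0 && rng[2] <= 1) {
    c(0, 1)       # e.g. shares or squared coefficients
  } else if (rng[1] >= -1 && rng[2] <= 1) {
    c(-1, 1)      # e.g. correlation coefficients
  } else {
    rng           # anything else: use the data range
  }
}

limits_share <- guess_limits(c(0.2, 0.8))
limits_cor <- guess_limits(c(-0.5, 0.3))
limits_raw <- guess_limits(c(1, 5))
```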


Calculate IQR

Description

Calculate IQR

Usage

.iqr(x)

Arguments

x

A numeric vector

Value

The IQR


Knit volker plots

Description

Automatically calculates the plot height from chunk options and volker options.

Usage

.knit_plot(pl)

Arguments

pl

A ggplot object with vlkr_options. The vlkr_options are added by .to_vlkr_plot() and provide information about the number of vertical items (rows) and the maximum.

Details

Presumptions:

Value

Character string containing a html image tag, including the base64 encoded image.


Prepare markdown content for table rendering

Description

Prepare markdown content for table rendering

Usage

.knit_prepare(x, wrap = FALSE)

Arguments

x

Markdown text.

wrap

Wrap text after the given number of characters.

Value

Markdown text with line breaks and escaped special characters.


Compact table printing with shortened names and values

Description

Truncates long column names and long character values, for more readable console output.

Usage

.knit_shorten(df)

Arguments

df

A data frame or tibble.

Details

The default column name length is 30 and the cell values length is 40. Override with options(vlkr.trunc.columns=20) and options(vlkr.trunc.cells=20).

Value

A data frame with shortened column names and cell content.
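The option names below come from the Details section; everything else is an assumption for illustration. The helper shorten() is hypothetical, and marking the cut with an ellipsis is a guess at the output style, not a statement about the package.

```r
# Hypothetical helper: cut strings to at most n characters, marking the cut.
shorten <- function(x, n) {
  ifelse(nchar(x) > n, paste0(substr(x, 1, n - 3), "..."), x)
}

# Read the thresholds from the options, falling back to the documented defaults.
max_cols <- getOption("vlkr.trunc.columns", 30)
max_cells <- getOption("vlkr.trunc.cells", 40)

df <- data.frame(x = c("short", strrep("y", 60)), stringsAsFactors = FALSE)
names(df) <- strrep("n", 45)

# Shorten the column names, then all character columns.
names(df) <- shorten(names(df), max_cols)
df[] <- lapply(df, function(col) {
  if (is.character(col)) shorten(col, max_cells) else col
})
```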


Knit volker tables

Description

Numbers are rounded by three mechanisms:

Usage

.knit_table(df, ...)

Arguments

df

Data frame.

Value

Formatted table produced by kable.


Calculate outliers

Description

Calculate outliers

Usage

.outliers(x, k = 1.5)

Arguments

x

A numeric vector.

Value

A list of outliers.
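The parameter k = 1.5 suggests the usual boxplot convention, in which values more than k times the interquartile range beyond the quartiles count as outliers. The following base R sketch works through that convention; whether volker uses exactly these quantile and whisker definitions is an assumption.

```r
x <- c(2, 3, 4, 5, 6, 7, 8, 30)
k <- 1.5

# Quartiles and interquartile range (default quantile type of stats::quantile).
q <- stats::quantile(x, c(0.25, 0.75), names = FALSE)
iqr <- q[2] - q[1]

# Fences: the most extreme positions a whisker may reach.
lower_fence <- q[1] - k * iqr
upper_fence <- q[2] + k * iqr

# Whiskers end at the most extreme data points inside the fences.
whisker_lower <- min(x[x >= lower_fence])
whisker_upper <- max(x[x <= upper_fence])

# Everything beyond the fences is an outlier.
outliers <- x[x < lower_fence | x > upper_fence]
```

The same quantities relate to the helpers .iqr(), .whisker_lower() and .whisker_upper() documented elsewhere in this manual.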


Helper: mean of pairwise agreements (nominal)

Description

Helper: mean of pairwise agreements (nominal)

Usage

.pair_agreement(vals)

Arguments

vals

A matrix with two columns for coders and ratings as values

Value

The share of agreement.


Helper: mean of pairwise disagreements (nominal)

Description

Helper: mean of pairwise disagreements (nominal)

Usage

.pair_disagreement(vals)

Arguments

vals

A matrix with two columns for coders and ratings as values

Value

The share of disagreement.


Helper function: plot grouped bar chart

Description

Helper function: plot grouped bar chart

Usage

.plot_bars(data, category = NULL, ci = FALSE, scale = NULL, limits = NULL, numbers = NULL, orientation = "horizontal", base = NULL, title = NULL)

Arguments

data

Data frame with the columns item, value, p, n and optionally w. If w is provided, the column width is generated according to the w value, resulting in a mosaic plot.

category

Optionally, a category to focus. All rows not matching the category will be filtered out.

ci

Whether to plot error bars for 95% confidence intervals. Provide the columns ci.low and ci.high in data.

scale

Direction of the scale: 0 = no direction for categories, -1 = descending or 1 = ascending values.

numbers

Set to something that evaluates to TRUE and add the .values column to the data frame to output values on the bars.

orientation

Whether to show bars (horizontal) or columns (vertical)

base

The plot base as character or NULL.

title

The plot title as character or NULL.

Value

A ggplot object.


Helper function: plot cor and regression outputs

Description

Helper function: plot cor and regression outputs

Usage

.plot_cor(data, ci = TRUE, base = NULL, limits = NULL, title = NULL, label = NULL)

Arguments

data

Data frame with the columns item and value. To plot error bars, add the columns low and high and set the ci parameter to TRUE.

ci

Whether to plot confidence intervals. Provide the columns low and high in data.

base

The plot base as character or NULL.

limits

The scale limits.

title

The plot title as character or NULL.

label

The y axis label.

Value

A ggplot object.


Helper function: Heatmap

Description

Helper function: Heatmap

Usage

.plot_heatmap(data, values_col, numbers_col = NULL, base = NULL, title = TRUE, labels = TRUE)

Arguments

data

A tibble with item combinations in the first two columns. Titles are added to the axes only if the item values are equal in both columns.

values_col

Name of the column containing correlation values, a character value.

numbers_col

Name of the column containing values to plot on the tiles or NULL to hide numbers.

base

Character value; the baseline note, including the number of cases.

title

If TRUE (default) shows a plot title derived from the column labels. Disable the title with FALSE or provide a custom title as character value.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

Value

A ggplot object


Helper function: plot grouped line chart

Description

Helper function: plot grouped line chart

Usage

.plot_lines(data, scale = NULL, base = NULL, limits = NULL, title = NULL)

Arguments

data

Dataframe with the columns item, value, and .cross

scale

Passed to the label scale function.

base

The plot base as character or NULL.

limits

The scale limits.

title

The plot title as character or NULL.

Value

A ggplot object.


Helper function: scree plot

Description

Helper function: scree plot

Usage

.plot_scree(data, k = NULL, lab_x = NULL, lab_y = NULL)

Arguments

data

Data frame with the factor or cluster number in the first column and the metric in the second.

k

Provide one of the values in the first column to color points up to this value.

lab_x

Label of the x axis

lab_y

Label of the y axis

Value

A vlkr_plot object


Helper function: plot grouped line chart by summarising values

Description

Helper function: plot grouped line chart by summarising values

Usage

.plot_summary(data, ci = FALSE, scale = NULL, base = NULL, box = FALSE, limits = NULL, title = NULL)

Arguments

data

Dataframe with the columns item, value.

ci

Whether to plot confidence intervals of the means.

scale

Passed to the label scale function.

base

The plot base as character or NULL.

box

Whether to add boxplots.

title

The plot title as character or NULL.

Value

A ggplot object.


Generate table and plot for agreement analysis

Description

Generate table and plot for agreement analysis

Usage

.report_agr(data, cols, coders, ids, method = "reliability", clean = TRUE, ...)

Arguments

data

A data frame.

cols

Columns with codings. A tidy column selection, e.g. a single column (without quotes) or multiple columns selected by methods such as starts_with().

coders

The coders or classification methods.

ids

Column with case IDs.

method

One of "reliability" or "classification".

Value

A list containing a table and a plot volker report chunk.


Generate a cluster table and plot

Description

Generate a cluster table and plot

Usage

.report_cls(data, cols, cross, metric = FALSE, ..., k = 2, effect = FALSE, title = TRUE)

Arguments

data

A data frame.

cols

A tidy column selection, e.g. a single column (without quotes) or multiple columns selected by methods such as starts_with().

cross

Not yet implemented. Optional, a grouping column (without quotes).

metric

Not yet implemented. When crossing variables, the cross column parameter can contain categorical or metric values. By default, the cross column selection is treated as categorical data. Set metric to TRUE to treat it as metric and calculate correlations.

k

Number of clusters to calculate.

effect

Not yet implemented. Whether to report statistical tests and effect sizes.

title

Add a plot title (default = TRUE).

Value

A list containing a table and a plot volker report chunk.


Generate a factor table and plot

Description

Generate a factor table and plot

Usage

.report_fct(data, cols, cross, metric = FALSE, ..., k = 2, effect = FALSE, title = TRUE)

Arguments

data

A data frame.

cols

A tidy column selection, e.g. a single column (without quotes) or multiple columns selected by methods such as starts_with().

cross

Not yet implemented. Optional, a grouping column (without quotes).

metric

Not yet implemented. When crossing variables, the cross column parameter can contain categorical or metric values. By default, the cross column selection is treated as categorical data. Set metric to TRUE to treat it as metric and calculate correlations.

k

Number of factors to calculate.

effect

Not yet implemented. Whether to report statistical tests and effect sizes.

title

Add a plot title (default = TRUE).

Value

A list containing a table and a plot volker report chunk.


Generate an index table and plot

Description

Generate an index table and plot

Usage

.report_idx(data, cols, cross, metric = FALSE, ..., effect = FALSE, title = TRUE)

Arguments

data

A data frame.

cols

A tidy column selection, e.g. a single column (without quotes) or multiple columns selected by methods such as starts_with().

cross

Optional, a grouping column (without quotes).

metric

When crossing variables, the cross column parameter can contain categorical or metric values. By default, the cross column selection is treated as categorical data. Set metric to TRUE to treat it as metric and calculate correlations.

effect

Whether to report statistical tests and effect sizes.

title

Add a plot title (default = TRUE).

Value

A list containing a table and a plot volker report chunk.


Generate a model table and plot

Description

Generate a model table and plot

Usage

.report_mdl(data, cols, categorical = NULL, metric = NULL, ..., title = TRUE)

Arguments

data

A data frame.

cols

A single column (without quotes).

categorical

A tidy column selection holding independent categorical variables.

metric

A tidy column selection holding independent metric variables.

title

Add a plot title (default = TRUE).

Value

A list containing a table and a plot volker report chunk.


Split a metric column into categories based on the median

Description

Split a metric column into categories based on the median

Usage

.tab_split(data, col, labels = TRUE)

Arguments

data

A data frame containing the column to be split.

col

The column to split.

labels

Logical; if TRUE (default), use custom labels for the split categories based on the column title. If FALSE, use the column name directly.

Value

A data frame with the specified column converted into categorical labels based on its median value. The split threshold (median) is stored as an attribute of the column.
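A median split can be sketched in base R as follows. The labels "low" and "high", the attribute name "split", and the rule that values equal to the median fall into the lower group are all assumptions made for this illustration.

```r
values <- c(18, 25, 31, 40, 52, 67)

# The split threshold is the median of the metric values.
threshold <- stats::median(values)

# Values above the threshold become "high", all others "low".
groups <- factor(
  ifelse(values > threshold, "high", "low"),
  levels = c("low", "high")
)

# Keep the threshold with the new column (attribute name is hypothetical).
attr(groups, "split") <- threshold
```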


Add vlkr_df class - that means, the data frame has been prepared

Description

Add vlkr_df class - that means, the data frame has been prepared

Usage

.to_vlkr_df(data, digits = NULL)

Arguments

data

A tibble.

Value

A tibble of class vlkr_df.


Add vlkr_list class

Description

Used to collect multiple tables in a list, e.g. from regression outputs.

Usage

.to_vlkr_list(data, baseline = TRUE)

Arguments

data

A list.

baseline

Whether to get the baseline.

Value

A volker list.


Add the volker class and options

Description

Add the volker class and options

Usage

.to_vlkr_plot(pl, rows = NULL, maxlab = NULL, baseline = TRUE, theme_options = TRUE)

Arguments

pl

A ggplot object.

rows

The number of items on the vertical axis. Will be automatically determined when NULL. For stacked bar charts, don't forget to set the group parameter, otherwise it won't work.

maxlab

The character length of the longest label to be plotted on the vertical axis. Will be automatically determined when NULL.

baseline

Whether to print a message about removed values.

theme_options

Enable or disable axis titles and text by providing a list with any of the elements axis.text.x, axis.text.y, axis.title.x, axis.title.y set to TRUE or FALSE. By default, titles (= scale labels) are disabled and text (= the tick labels) is enabled. Enable or disable the legend title by setting the list element legend.title to TRUE or FALSE. Legend titles are disabled by default.

Value

A ggplot object with vlkr_plt class.


Add the vlkr_rprt class to an object

Description

Adding the class makes sure the appropriate printing function is applied in markdown reports.

Usage

.to_vlkr_rprt(chunks)

Arguments

chunks

A list of character strings.

Value

A volker report object: List of character strings with the vlkr_rprt class containing the parts of the report.


Add vlkr_tbl class

Description

Additionally, removes the skim_df class if present.

Usage

.to_vlkr_tab(data, digits = NULL, caption = NULL, baseline = NULL)

Arguments

data

A tibble.

digits

Set the plot digits. If NULL (default), no digits are set.

caption

The caption printed above the table.

baseline

A base line printed below the table.

Value

A volker tibble.


Calculate lower whisker in a boxplot

Description

Calculate lower whisker in a boxplot

Usage

.whisker_lower(x, k = 1.5)

Arguments

x

A numeric vector.

Value

The lower whisker value.


Calculate upper whisker in a boxplot

Description

Calculate upper whisker in a boxplot

Usage

.whisker_upper(x, k = 1.5)

Arguments

x

A numeric vector.

Value

The upper whisker value.


Resolution settings for plots

Description

Override with options(vlkr.fig.settings=list(html = list(dpi = 192, scale = 2, width = 910, pxperline = 15))). Add a key for each output format when knitting a document. You can override the width by setting vlkr.fig.width in the chunk options.

Usage

VLKR_FIG_SETTINGS

Format

An object of class list of length 2.


Fill colors

Description

Override with options(vlkr.discrete.fill=list(c("purple"))).

Usage

VLKR_FILLDISCRETE

Format

An object of class list of length 3.


Gradient colors

Description

Override with options(vlkr.gradient.fill=list(c("white","black"))).

Usage

VLKR_FILLGRADIENT

Format

An object of class character of length 5.


Polarized colors

Description

Polarized colors

Usage

VLKR_FILLPOLARIZED

Format

An object of class character of length 5.


Maximum number of distinct values to determine whether a column selection contains only categorical values

Description

Override with options(vlkr.max.categories=10).

Usage

VLKR_MAX_CATEGORIES

Format

An object of class numeric of length 1.


Levels to remove from factors

Description

Override with options(vlkr.na.levels=c("Not answered")).

Usage

VLKR_NA_LEVELS

Format

An object of class character of length 4.


Numbers to remove from vectors

Description

Override with options(vlkr.na.numbers=c(-2,-9)).

Usage

VLKR_NA_NUMBERS

Format

An object of class numeric of length 3.


Whether to remove all missings

Description

Override with options(vlkr.na.omit=FALSE) to use all cases if possible.

Usage

VLKR_NA_OMIT

Format

An object of class logical of length 1.


Output thresholds

Description

Output thresholds

Usage

VLKR_NORMAL_DIGITS

Format

An object of class numeric of length 1.


Wrapping threshold

Description

Override with options(vlkr.wrap.labels=20).
Override with options(vlkr.wrap.legend=10).
Override with options(vlkr.wrap.scale=10).
Override with options(vlkr.angle.value=30).
Override with options(vlkr.angle.threshold=10).
Override with options(vlkr.trunc.columns=20).
Override with options(vlkr.trunc.cells=20).

Usage

VLKR_PLOT_LABELWRAP

Format

An object of class numeric of length 1.


Alpha values

Description

Alpha values

Usage

VLKR_POINT_ALPHA

Format

An object of class numeric of length 1.


Shapes

Description

Shapes

Usage

VLKR_POINT_MEAN_SHAPE

Format

An object of class numeric of length 1.


Sizes

Description

Sizes

Usage

VLKR_POINT_SIZE

Format

An object of class numeric of length 1.


Word wrap separators

Description

Word wrap separators

Usage

VLKR_WRAP_SEPARATOR

Format

An object of class character of length 1.


Add cluster number to a data frame

Description

Clustering is performed using stats::kmeans.

[Experimental]

Usage

add_clusters(data, cols, newcol = NULL, k = 2, method = "kmeans", clean = TRUE)

Arguments

data

A data frame.

cols

A tidy selection of item columns.

newcol

Name of the new cluster column as a character vector. Set to NULL (default) to automatically build a name from the common column prefix, prefixed with "cls_".

k

Number of clusters to calculate. Set to NULL to output a scree plot for up to 10 clusters and automatically choose the number of clusters based on the elbow criterion. The within-sums of squares for the scree plot are calculated by stats::kmeans.

method

The method as character value. Currently, only kmeans is supported. All items are scaled before performing the cluster analysis using base::scale.

clean

Prepare data by data_clean.

Value

The input tibble with an additional column containing cluster values as a factor. The new column is prefixed with "cls_". The new column contains the fit result in the attribute stats.kmeans.fit. The names of the items used for clustering are stored in the attribute stats.kmeans.items. The clustering diagnostics (Within-Cluster and Between-Cluster Sum of Squares) are stored in the attribute stats.kmeans.wss.

Examples

library(volker)
ds <- volker::chatgpt
volker::add_clusters(ds, starts_with("cg_adoption"), k = 3)
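To make the underlying steps concrete, the following base R sketch scales simulated items, runs stats::kmeans, and collects the within-cluster sums of squares that a scree plot would show. The data, the column name cls and the use of nstart = 10 are assumptions for illustration, not the package's internals.

```r
set.seed(1)

# Simulated items with two well-separated groups (made-up data).
items <- data.frame(
  a = c(rnorm(20, mean = 1), rnorm(20, mean = 5)),
  b = c(rnorm(20, mean = 1), rnorm(20, mean = 5))
)

# Scale all items before clustering, as described for the method parameter.
scaled <- scale(items)

# Fit k-means with k = 2 and store the cluster number as a factor.
fit <- stats::kmeans(scaled, centers = 2)
items$cls <- factor(fit$cluster)

# Total within-cluster sum of squares for k = 1 to 5 (scree plot values).
wss <- vapply(
  1:5,
  function(k) stats::kmeans(scaled, centers = k, nstart = 10)$tot.withinss,
  numeric(1)
)
```

Under the elbow criterion, one would look for the k after which wss stops dropping sharply.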

Add PCA columns along with summary statistics (KMO and Bartlett test) to a data frame

Description

PCA is performed with psych::pca using varimax rotation. Bartlett's test for sphericity is calculated with psych::cortest.bartlett. The Kaiser-Meyer-Olkin (KMO) measure is computed using psych::KMO.

[Experimental]

Usage

add_factors(data, cols, newcols = NULL, k = 2, method = "pca", clean = TRUE)

Arguments

data

A data frame.

cols

A tidy selection of item columns.

newcols

Names of the factor columns as a character vector. Must be the same length as k or NULL. Set to NULL (default) to automatically build a name from the common column prefix, prefixed with "fct_", postfixed with the factor number.

k

Number of factors to calculate. Set to NULL to calculate eigenvalues for all components up to the number of items and automatically choose k. Eigenvalues and the decision on k are calculated by psych::fa.parallel.

method

The method as character value. Currently, only pca is supported.

clean

Prepare data by data_clean.

Value

The input tibble with additional columns containing factor values. The new columns are prefixed with "fct_". The first new column contains the fit result in the attribute psych.pca.fit. The names of the items used for factor analysis are stored in the attribute psych.pca.items. The summary diagnostics (Bartlett test and KMO) are stored in the attribute psych.kmo.bartlett.

Examples

library(volker)
ds <- volker::chatgpt
volker::add_factors(ds, starts_with("cg_adoption"))

Calculate the mean value of multiple items

Description

[Experimental]

Usage

add_index(data, cols, newcol = NULL, cols.reverse, clean = TRUE)

Arguments

data

A data frame.

cols

A tidy selection of item columns.

newcol

Name of the index as a character value. Set to NULL (default) to automatically build a name from the common column prefix, prefixed with "idx_".

cols.reverse

A tidy selection of columns with reversed codings.

clean

Prepare data by data_clean.

Value

The input tibble with an additional column that contains the index values. The column contains the result of the alpha calculation in the attribute named "psych.alpha".

Examples

ds <- volker::chatgpt
volker::add_index(ds, starts_with("cg_adoption"))
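The idea of a mean index with reversed items can be sketched in base R. The item names and the 1 to 5 scale are assumptions; how the package determines the scale range for reversing is not stated in this manual.

```r
# Made-up items on a 1 to 5 scale; q3_rev is coded in the opposite direction.
items <- data.frame(
  q1 = c(1, 5, 3),
  q2 = c(2, 4, 3),
  q3_rev = c(5, 1, 3)
)

scale_min <- 1
scale_max <- 5

# Reverse the coding so that all items point in the same direction.
items$q3_rev <- scale_max + scale_min - items$q3_rev

# The index is the row-wise mean of all items.
items$idx <- rowMeans(items[, c("q1", "q2", "q3_rev")], na.rm = TRUE)
```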

Add a column with predicted values from a regression model

Description

The regression output comes from stats::lm. The effect sizes are calculated by heplots::etasq. The variance inflation is calculated by car::vif.

[Experimental]

Usage

add_model(data, col, categorical, metric, interactions = NULL, labels = TRUE, clean = TRUE, ...)

Arguments

data

A tibble.

col

The target column holding metric values.

categorical

A tidy column selection holding categorical variables.

metric

A tidy column selection holding metric variables.

interactions

A vector of interaction effects to calculate. Each interaction effect should be provided as multiplication of the variables. The interaction effect can be provided as character value (e.g. c("sd_gender * adopter")) or as unquoted column names (e.g. c(sd_gender * adopter)).

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from effect_metrics.

Value

The input tibble with one additional column. The new column name is derived from the target column, prefixed with "prd_". The new column will have an attribute "lm.fit" with the fit model.

Examples

library(volker)
data <- dplyr::filter(volker::chatgpt, sd_gender != "diverse")
data <- data |>
  add_model(use_work, categorical = c(sd_gender, adopter), metric = sd_age)
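What the returned column holds can be pictured with plain stats::lm on the built-in mtcars data. The model formula is made up; the "prd_" prefix and the "lm.fit" attribute follow the Value section above, but the code is a sketch and not the package's implementation.

```r
data <- mtcars

# Fit a linear model with one categorical and one metric predictor.
fit <- stats::lm(mpg ~ factor(cyl) + wt, data = data)

# Store the predicted values in a new column prefixed with "prd_".
data$prd_mpg <- stats::predict(fit, newdata = data)

# Attach the fitted model to the new column for later inspection.
attr(data$prd_mpg, "lm.fit") <- fit
```

The model can then be retrieved with attr(data$prd_mpg, "lm.fit"), for example to print its summary.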

Adjust p-values from multiple tests and optionally annotate significance stars

Description

Adjust p-values from multiple tests and optionally annotate significance stars

Usage

adjust_p(df, col, method = "fdr", digits = 3, stars = TRUE)

Arguments

df

A data frame or tibble containing the column to adjust.

col

A tidyselect expression specifying the p-value column to adjust.

method

Character string specifying the p-value adjustment method. See p.adjust for available methods. Disable adjustment with FALSE.

digits

Integer; number of decimal places for rounding.

stars

Logical or character; if TRUE, add a "stars" column with significance symbols (e.g., "***", "**", "*") based on the adjusted p-values. If set to a character value, it determines the new column name.

Value

A modified data frame with the adjusted p-values and, if requested, an additional column of significance stars.
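As an illustration of adjusting and annotating p-values, the following base R sketch applies stats::p.adjust and then derives stars. The thresholds 0.001, 0.01 and 0.05 are the conventional ones and are assumed here; the column names are made up.

```r
results <- data.frame(
  item = c("a", "b", "c", "d"),
  p = c(0.0004, 0.012, 0.03, 0.4)
)

# Adjust for multiple testing ("fdr" = Benjamini-Hochberg) and round.
results$p_adj <- round(stats::p.adjust(results$p, method = "fdr"), 3)

# Map adjusted p-values to stars using conventional thresholds (assumed).
results$stars <- as.character(cut(
  results$p_adj,
  breaks = c(-Inf, 0.001, 0.01, 0.05, Inf),
  labels = c("***", "**", "*", ""),
  right = FALSE
))
```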


Agreement for multiple items

Description

Two types of comparing categories are provided:

Usage

agree_tab(data, cols, coders, ids = NULL, category = NULL, method = "reliability", labels = TRUE, clean = TRUE, ...)

Arguments

data

A tibble containing item measures, coders and case IDs.

cols

A tidy selection of item variables (e.g. starts_with...) with ratings.

coders

The column holding coders or methods to compare.

ids

The column with case IDs.

category

For classification performance indicators, if no category is provided, macro statistics are returned (along with the number of categories in the output). Provide a category to get the statistics for this category only. If values are boolean (TRUE / FALSE) and no category is provided, the category is always assumed to be "TRUE".

method

The output metrics, one of reliability or classification. You can abbreviate it, e.g. reli or class.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from report_counts.

Details

Value

A volker tibble with one row for each item. The item name is returned in the first column. For the reliability method, the following columns are returned:

For the classification method, the following columns are returned:

Examples

library(dplyr)
library(volker)

data <- volker::chatgpt

# Prepare example data.
# First, recode "x" to TRUE/FALSE for the first coder's sample.
data_coder1 <- data |>
  mutate(across(starts_with("cg_act_"), ~ ifelse(is.na(.), FALSE, TRUE))) %>%
  mutate(coder = "coder one")

# Second, recode using a dictionary approach for the second coder's sample.
data_coder2 <- data |>
  mutate(across(starts_with("cg_act_"), ~ ifelse(is.na(.), FALSE, TRUE))) %>%
  mutate(cg_act_write = grepl("write|text|translate", tolower(cg_activities))) %>%
  mutate(coder = "coder two")

data_coded <- bind_rows(
  data_coder1,
  data_coder2
)

# Reliability coefficients are strictly only appropriate for manual codings
agree_tab(data_coded, cg_act_write, coder, case, method = "reli")

# Better use classification performance indicators to compare the
# dictionary approach with human coding
agree_tab(data_coded, cg_act_write, coder, case, method = "class")

Get configured na numbers

Description

Retrieves values either from the option or from the constant.

Usage

cfg_get_na_numbers(default = VLKR_NA_NUMBERS)

Arguments

default

The default na numbers, if not explicitly provided by na.numbers or the options.

Value

A vector with numbers that should be treated as NAs.


ChatGPT Adoption Dataset CG-GE-APR23

Description

A small random subset of data from a survey about ChatGPT adoption. The survey was conducted in April 2023 within the population of German Internet users.

Usage

chatgpt

Format

chatgpt

A data frame with 101 rows and 22 columns:

case

A running case number

sd_

Columns starting with sd contain sociodemographics of the respondents.

adopter

Adoption groups, inspired by Rogers' innovator typology.

use_

Columns starting with use contain data about ChatGPT usage in different contexts.

cg_adoption_

A scale consisting of items about advantages, fears, and social aspects. The scales match theoretical constructs inspired by Rogers' diffusion model and Davis' Technology Acceptance Model.

cg_activities

Text answers to the question of what the respondents do with ChatGPT.

cg_act_write

Manual content analysis of cg_activities: Do the activities involve generating text, code or other artifacts?

cg_act_test

Manual content analysis of cg_activities: Do the activities involve testing, experimenting or playing around?

cg_act_search

Manual content analysis of cg_activities: Do the activities involve searching for information, advice or inspiration?

Details

Call codebook(volker::chatgpt) to see the items and answer options.

Source

Communication Department of the University of Münster (gehrau@uni-muenster.de and jakob.juenger@uni-muenster.de).


Check whether a column exists and stop if not

Description

Check whether a column exists and stop if not

Usage

check_has_column(data, cols, msg = NULL)

Arguments

data

A data frame.

cols

A tidyselection of columns.

msg

A custom error message if the check fails.

Value

boolean Whether the column exists.


Check whether a column selection is categorical

Description

Check whether a column selection is categorical

Usage

check_is_categorical(data, cols, msg = NULL)

Arguments

data

A data frame.

cols

A tidyselection of columns.

msg

A custom error message if the check fails.

Value

boolean Whether the columns are categorical


Check whether the object is a dataframe

Description

Check whether the object is a dataframe

Usage

check_is_dataframe(obj, msg = NULL, stopit = TRUE)

Arguments

obj

The object to test.

msg

Optional, a custom error message.

stopit

Whether to stop execution with an error message.

Value

boolean Whether the object is a data.frame object.


Check whether a column selection is numeric

Description

Check whether a column selection is numeric

Usage

check_is_numeric(data, cols, msg = NULL)

Arguments

data

A data frame.

cols

A tidyselection of columns.

msg

A custom error message if the check fails.

Value

boolean Whether the columns are numeric.


Check whether a parameter value is from a valid set

Description

Check whether a parameter value is from a valid set

Usage

check_is_param(
  value,
  allowed,
  allownull = FALSE,
  allowmultiple = FALSE,
  expand = FALSE,
  stopit = TRUE,
  msg = NULL
)

Arguments

value

A character value.

allowed

Allowed values.

allownull

Whether to allow NULL values.

allowmultiple

Whether to allow multiple values.

stopit

Whether to stop execution if the value is invalid.

msg

A custom error message if the check fails.

Value

logical Whether the value is valid.


Check whether there is no overlap between two column sets

Description

Check whether there is no overlap between two column sets

Usage

check_nonequal_columns(data, cols, cross, msg = NULL)

Arguments

data

A data frame.

cols

A tidyselection of columns.

cross

A tidyselection of columns.

msg

A custom error message if the check fails.

Value

boolean Whether the two column sets are different.


Get plot for clustering result

Description

K-means clustering is performed using add_clusters.

[Experimental]

Usage

cluster_plot(
  data,
  cols,
  newcol = NULL,
  k = NULL,
  method = NULL,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble.

cols

A tidy selection of item columns or a single column with cluster values as a factor. If the column already contains a cluster result from add_clusters, it is used, and other parameters are ignored. If no cluster result exists, it is calculated with add_clusters.

newcol

Name of the new cluster column as a character vector. Set to NULL (default) to automatically build a name from the common column prefix, prefixed with "cls_".

k

Number of clusters to calculate. Set to NULL to output a scree plot for up to 10 clusters and automatically choose the number of clusters based on the elbow criterion. The within-sums of squares for the scree plot are calculated by stats::kmeans.

method

The method as character value. Currently, only kmeans is supported. All items are scaled before performing the cluster analysis using base::scale.

labels

If TRUE (default), extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from plot_metrics.

Value

A ggplot object.

Examples

library(volker)
data <- volker::chatgpt

cluster_plot(data, starts_with("cg_adoption"), k = 2)
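The scaling and scree-plot steps described for the k and method arguments can be sketched in base R. This is an illustration only: it uses the built-in iris data instead of survey items, the nstart value is an assumption, and it does not show how add_clusters picks the elbow.

```r
# Built-in example data, used only for illustration
items <- scale(iris[, 1:4])          # standardize all items with base::scale

set.seed(1)                          # kmeans uses random starting centers

# Total within-cluster sum of squares for k = 1..10 (the scree plot values)
wss <- sapply(1:10, function(k) {
  stats::kmeans(items, centers = k, nstart = 10)$tot.withinss
})

# Fit the final solution for a chosen number of clusters
fit <- stats::kmeans(items, centers = 2, nstart = 10)
table(fit$cluster)
```

Plotting wss against 1:10 gives the scree plot; the elbow is the point after which additional clusters reduce the within-sums of squares only slightly.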

Get tables for clustering result

Description

K-means clustering is performed using add_clusters.

[Experimental]

Usage

cluster_tab(
  data,
  cols,
  newcol = NULL,
  k = NULL,
  method = "kmeans",
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble.

cols

A tidy selection of item columns or a single column with cluster values as a factor. If the column already contains a cluster result from add_clusters, it is used, and other parameters are ignored. If no cluster result exists, it is calculated with add_clusters.

newcol

Name of the new cluster column as a character vector. Set to NULL (default) to automatically build a name from the common column prefix, prefixed with "cls_".

k

Number of clusters to calculate. Set to NULL to output a scree plot for up to 10 clusters and automatically choose the number of clusters based on the elbow criterion. The within-sums of squares for the scree plot are calculated by stats::kmeans.

method

The method as character value. Currently, only kmeans is supported. All items are scaled before performing the cluster analysis using base::scale.

labels

If TRUE (default), extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from tab_metrics.

Value

A volker list with three volker tabs: cluster centers, cluster counts, and clustering diagnostics.

Examples

library(volker)
data <- volker::chatgpt

cluster_tab(data, starts_with("cg_adoption"), k = 2)

Get variable and value labels from a data set

Description

Variable labels are extracted from their comment or label attribute. Variable values are extracted from factor levels, the labels attribute, numeric or boolean attributes.

Usage

codebook(data, cols, values = TRUE)

Arguments

data

A tibble.

cols

A tidy variable selection to filter specific columns.

values

Whether to output values (TRUE) or only items (FALSE)

Details

[Experimental]

Value

A tibble with the columns:

Examples

volker::codebook(volker::chatgpt)
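The attributes mentioned in the description can be inspected with base R. The following sketch uses a hypothetical data frame and only the comment attribute and factor levels; it does not reproduce the columns returned by codebook.

```r
# Hypothetical data frame with one metric and one categorical column
df <- data.frame(
  sd_age    = c(23, 45, 31),
  sd_gender = factor(c("female", "male", "female"))
)

# Attach variable labels as comment attributes
comment(df$sd_age)    <- "Age in years"
comment(df$sd_gender) <- "Gender"

# Variable labels: read the comment attribute of each column
item_labels <- sapply(df, function(x) attr(x, "comment"))

# Value labels of a factor column: its levels
value_labels <- levels(df$sd_gender)
```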

Convert numeric values to string

Description

Convert numeric values to string

Usage

data_cat(data, cols)

Arguments

data

A data frame containing the items to be converted.

cols

A tidy selection of columns to convert.

Value

A data frame with the converted values


Prepare dataframe for the analysis

Description

Prepares the data according to the selected cleaning plan, for example by recoding residual values to NA.

Usage

data_clean(data, plan = "default", ...)

Arguments

data

Data frame.

plan

The cleaning plan. Currently, only "default" is supported. See data_clean_default.

...

Other parameters passed to the appropriate cleaning function.

Details

The tibble remembers whether it was already cleaned, and the cleaning plan is only applied once, in the first call.

Value

Cleaned data frame with vlkr_df class.

Examples

ds <- volker::chatgpt
ds <- data_clean(ds)

Prepare data originating from SoSci Survey or SPSS

Description

Preparation steps:

Usage

data_clean_default(data, remove.na.levels = TRUE, remove.na.numbers = TRUE)

Arguments

data

Data frame

remove.na.levels

Remove residual values from factor columns. Either a character vector with residual values or TRUE to use defaults in VLKR_NA_LEVELS. You can also define or disable residual levels by setting the global option vlkr.na.levels (e.g. options(vlkr.na.levels=c("Not answered")) or to disable options(vlkr.na.levels=FALSE)).

remove.na.numbers

Remove residual values from numeric columns. Either a numeric vector with residual values or TRUE to use defaults in VLKR_NA_NUMBERS. You can also define or disable residual values by setting the global option vlkr.na.numbers (e.g. options(vlkr.na.numbers=c(-2,-9)) or to disable options(vlkr.na.numbers=FALSE)).

Details

The tibble remembers whether it was already prepared, and the operations are only performed once, in the first call.

Value

Data frame with vlkr_df class (the class is used to prevent double preparation).

Examples

ds <- volker::chatgpt
ds <- data_clean_default(ds)
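The global options named in the arguments can be set, inspected and restored with base R. A small sketch, using the option names and example values from the argument descriptions; the pattern of saving and restoring the old values is a suggestion, not something the package requires.

```r
# Set the options and remember the previous values for later
old <- options(
  vlkr.na.levels  = c("Not answered"),  # residual factor levels
  vlkr.na.numbers = c(-2, -9)           # residual numeric codes
)

current <- getOption("vlkr.na.numbers")

# Disable the removal of residual numbers entirely
options(vlkr.na.numbers = FALSE)
disabled <- getOption("vlkr.na.numbers")

# Restore the previous settings
options(old)
```

Restoring the old values keeps a changed setting from leaking into later, unrelated analyses in the same session.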

Convert values to numeric values

Description

Convert values to numeric values

Usage

data_num(data, cols)

Arguments

data

A data frame containing the items to be converted.

cols

A tidy selection of columns to convert.

Value

A data frame with the converted values


One-hot encode selected columns

Description

One-hot encode selected columns

Usage

data_onehot(data, ...)

Arguments

data

A data frame or tibble.

...

Tidyselect expressions specifying columns to one-hot encode

Value

Data frame with one hot encoded data
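For readers unfamiliar with one-hot encoding, the following base R sketch shows the general technique with stats::model.matrix on a hypothetical data frame. It is not the implementation of data_onehot, and the resulting column names are those produced by model.matrix, not necessarily those of the package.

```r
# Hypothetical data with one categorical column
df <- data.frame(
  id     = 1:4,
  colour = factor(c("red", "green", "red", "blue"))
)

# One indicator column per factor level; "- 1" drops the intercept so that
# no level is absorbed into a reference category
onehot <- stats::model.matrix(~ colour - 1, data = df)

encoded <- cbind(df["id"], as.data.frame(onehot))
```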


Prepare data for calculation

Description

Clean data, check column selection, remove cases with missing values

Usage

data_prepare(
  data,
  cols,
  cross,
  cols.categorical,
  cols.numeric,
  cols.reverse,
  clean = TRUE
)

Arguments

data

Data frame to be prepared.

cols

The first column selection.

cross

The second column selection.

cols.categorical

A tidy selection of columns to be checked for categorical values.

cols.numeric

A tidy selection of columns to be converted to numeric values.

cols.reverse

A tidy selection of columns with reversed codings.

clean

Whether to clean data using data_clean.

Value

Prepared data frame.

Examples

data <- volker::chatgpt
data_prepare(data, sd_age, sd_gender)

Reverse item values

Description

Reverse item values

Usage

data_rev(data, cols)

Arguments

data

A data frame containing the items to be reversed.

cols

A tidy selection of columns to reverse. For example, if you want to calculate an index of the two items "I feel bad about this" and "I like it", both coded with 1=not at all to 5=fully agree, you need to reverse one of them to make the codings compatible.

Value

A data frame with the specified items reversed.
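The reversal described for the cols argument can be illustrated in base R. This is a sketch with hypothetical answers; it assumes the scale bounds 1 and 5 are known in advance and does not show how data_rev determines them.

```r
# Hypothetical answers on a scale from 1 (not at all) to 5 (fully agree)
feel_bad <- c(1, 2, 5, 4, 3)
like_it  <- c(5, 4, 1, 2, 3)

# Reverse a scale by mirroring it around its midpoint: min + max - x
scale_min <- 1
scale_max <- 5
feel_bad_rev <- scale_min + scale_max - feel_bad

# After reversing, both items point in the same direction
index <- rowMeans(cbind(feel_bad_rev, like_it))
```

Without the reversal, the two items would cancel each other out, and the index would be 3 for every respondent in this example.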


Remove missings and output a message

Description

Remove missings and output a message

Usage

data_rm_missings(data, cols, force = FALSE)

Arguments

data

Data frame.

cols

A tidy column selection.

force

By default, cases with missings are only removed when the vlkr.na.omit option is TRUE. Set force to TRUE to always remove such cases.

Value

Data frame.


Remove NA levels

Description

Remove NA levels

Usage

data_rm_na_levels(data, na.levels = TRUE, default = VLKR_NA_LEVELS)

Arguments

data

Data frame

na.levels

Residual values to remove from factor columns. Either a character vector with residual values or TRUE to use defaults in VLKR_NA_LEVELS. You can define default residual levels by setting the global option vlkr.na.levels (e.g. options(vlkr.na.levels=c("Not answered"))).

default

The default na levels, if not explicitly provided by na.levels or the options.

Value

Data frame


Remove NA numbers

Description

Remove NA numbers

Usage

data_rm_na_numbers(
  data,
  na.numbers = TRUE,
  check.labels = TRUE,
  default = VLKR_NA_NUMBERS
)

Arguments

data

Data frame

na.numbers

Either a numeric vector with residual values or TRUE to use defaults in VLKR_NA_NUMBERS. You can also define residual values by setting the global option vlkr.na.numbers (e.g. options(vlkr.na.numbers=c(-9))).

check.labels

Whether to only remove NA numbers that are listed in the attributes of a column.

default

The default na numbers, if not explicitly provided by na.numbers or the options.

Value

Data frame


Remove negatives and output a warning

Description

Remove negatives and output a warning

Usage

data_rm_negatives(data, cols)

Arguments

data

Data frame

cols

A tidy column selection

Value

Data frame


Remove zero values, drop missings and output a message

Description

Remove zero values, drop missings and output a message

Usage

data_rm_zeros(data, cols)

Arguments

data

Data frame.

cols

A tidy column selection.

Value

Data frame.


Round and format selected numeric columns

Description

Round and format specified numeric columns in a data frame to a fixed number of decimal places.

Usage

data_round(data, cols, digits)

Arguments

data

A data frame or tibble.

cols

A tidyselect expression specifying which columns to round (e.g., c(var1, var2) or starts_with("score")).

digits

Integer; number of decimal places to round.

Details

For each selected numeric column:

Value

The input data frame, with the specified numeric columns rounded and formatted as character vectors.
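Rounding to a fixed number of decimals and returning character vectors can be sketched in base R as follows. The data frame and column names are hypothetical, and the use of formatC is an assumption for illustration, not necessarily what data_round does internally.

```r
# Hypothetical data frame with two numeric columns
df <- data.frame(score_a = c(3.14159, 2), score_b = c(10.5, 0.126))

digits   <- 2
num_cols <- c("score_a", "score_b")

# Round first, then format with a fixed number of decimals (yields character)
df[num_cols] <- lapply(df[num_cols], function(x) {
  formatC(round(x, digits), format = "f", digits = digits)
})
```

Because the result is character, trailing zeros such as in "2.00" are kept, which is useful for aligned table output but means the columns can no longer be used for calculations.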


Cook's distance plot

Description

Cook's distance plot

Usage

diagnostics_cooksd(fit)

Arguments

fit

The lm fit object

Value

A ggplot object

Examples

library(volker)

data <- dplyr::filter(volker::chatgpt, sd_gender != "diverse")
data <- add_model(data, use_work, metric = sd_age)

fit <- attr(data$prd_use_work, "lm.fit")
diagnostics_cooksd(fit)

Normal Q-Q

Description

Normal Q-Q

Usage

diagnostics_qq(fit)

Arguments

fit

The lm fit object

Value

A ggplot object

Examples

library(volker)

data <- dplyr::filter(volker::chatgpt, sd_gender != "diverse")
data <- add_model(data, use_work, metric = sd_age)

fit <- attr(data$prd_use_work, "lm.fit")
diagnostics_qq(fit)

Residuals vs Fitted plot

Description

Residuals vs Fitted plot

Usage

diagnostics_resid_fitted(fit)

Arguments

fit

The lm fit object

Value

A ggplot object

Examples

library(volker)

data <- dplyr::filter(volker::chatgpt, sd_gender != "diverse")
data <- add_model(data, use_work, metric = sd_age)

fit <- attr(data$prd_use_work, "lm.fit")
diagnostics_resid_fitted(fit)

Scale-Location (Spread-Location)

Description

Scale-Location (Spread-Location)

Usage

diagnostics_scale_location(fit)

Arguments

fit

The lm fit object

Value

A ggplot object

Examples

library(volker)

data <- dplyr::filter(volker::chatgpt, sd_gender != "diverse")
data <- add_model(data, use_work, metric = sd_age)

fit <- attr(data$prd_use_work, "lm.fit")
diagnostics_scale_location(fit)

Output effect sizes and test statistics for count data

Description

The type of effect size depends on the number of selected columns:

Cross tabulations:

By default, if you provide two column selections, the second column is treated as categorical. Setting the metric parameter to TRUE will call the appropriate functions for correlation analysis:

[Experimental]

Usage

effect_counts(data, cols, cross = NULL, metric = FALSE, clean = TRUE, ...)

Arguments

data

A data frame.

cols

A tidy column selection, e.g. a single column (without quotes) or multiple columns selected by methods such as starts_with().

cross

Optional, a grouping column. The column name without quotes.

metric

When crossing variables, the cross column parameter can contain categorical or metric values. By default, the cross column selection is treated as categorical data. Set metric to TRUE to treat it as metric and calculate correlations.

clean

Prepare data by data_clean.

...

Other parameters passed to the appropriate effect function.

Value

A volker tibble.

Examples

library(volker)
data <- volker::chatgpt

effect_counts(data, sd_gender, adopter)

Test homogeneity of category shares for multiple items

Description

Performs a goodness-of-fit test and calculates the Gini coefficient for each item. The goodness-of-fit test is calculated using stats::chisq.test.

Usage

effect_counts_items(
  data,
  cols,
  adjust = "fdr",
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble containing item measures.

cols

Tidyselect item variables (e.g. starts_with...).

adjust

Performing multiple significance tests inflates the alpha error. Thus, p values need to be adjusted according to the number of tests. Set a method supported by stats::p.adjust, e.g. "fdr" (the default) or "bonferroni". Disable adjustment with FALSE.

labels

If TRUE (default), extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from effect_counts.

Value

A volker tibble with the following statistical measures:

Examples

library(volker)
data <- volker::chatgpt

effect_counts_items(data, starts_with("cg_adoption_adv"))
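The combination of a goodness-of-fit test per item and a p value adjustment can be sketched with the two stats functions named above. The counts are hypothetical, and the sketch leaves out the Gini coefficient and the formatting of the volker output.

```r
# Hypothetical answer counts for three items with four categories each
counts <- list(
  item_a = c(25, 25, 25, 25),
  item_b = c(40, 30, 20, 10),
  item_c = c(70, 10, 10, 10)
)

# Goodness-of-fit test against equal category shares (the chisq.test default)
p_raw <- sapply(counts, function(x) stats::chisq.test(x)$p.value)

# Adjust for multiple testing; "fdr" is an alias for Benjamini-Hochberg
p_fdr  <- stats::p.adjust(p_raw, method = "fdr")
p_bonf <- stats::p.adjust(p_raw, method = "bonferroni")
```

The Bonferroni adjustment is never less conservative than the fdr adjustment, so p_bonf is always at least as large as p_fdr.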

Correlate the values in multiple items with one metric column and output effect sizes and tests

Description

Not yet implemented. The future will come.

Usage

effect_counts_items_cor(data, cols, cross, clean = TRUE, ...)

Arguments

data

A tibble containing item measures.

cols

Tidyselect item variables (e.g. starts_with...).

cross

The metric column.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from effect_counts.

Value

A volker tibble.


Correlate the values in multiple items with multiple metric columns and output effect sizes and tests

Description

Not yet implemented. The future will come.

Usage

effect_counts_items_cor_items(data, cols, cross, clean = TRUE, ...)

Arguments

data

A tibble containing item measures.

cols

Tidyselect item variables (e.g. starts_with...).

cross

The metric target columns.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from effect_counts.

Value

A volker tibble.


Effect size and test for comparing multiple variables by a grouping variable

Description

Effect size and test for comparing multiple variables by a grouping variable

Usage

effect_counts_items_grouped(
  data,
  cols,
  cross,
  method = "cramer",
  adjust = "fdr",
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble containing item measures and grouping variable.

cols

Tidyselect item variables (e.g. starts_with...).

cross

The column holding groups to compare.

method

The output metrics, currently only cramer is supported.

adjust

Performing multiple significance tests inflates the alpha error. Thus, p values need to be adjusted according to the number of tests. Set a method supported by stats::p.adjust, e.g. "fdr" (the default) or "bonferroni". Disable adjustment with FALSE.

labels

If TRUE (default), extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from effect_counts.

Value

A volker tibble.

Examples

library(volker)
data <- volker::chatgpt

effect_counts_items_grouped(
  data, starts_with("cg_adoption_adv"), sd_gender
)

Effect size and test for comparing multiple variables by multiple grouping variables

Description

Effect size and test for comparing multiple variables by multiple grouping variables

Usage

effect_counts_items_grouped_items(
  data,
  cols,
  cross,
  method = "cramer",
  adjust = "fdr",
  category = NULL,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble containing item measures and grouping variable.

cols

Tidyselect item variables (e.g. starts_with...).

cross

The columns holding groups to compare.

method

The output metrics: cramer = Cramer's V, pmi = Pointwise Mutual Information, npmi = Normalized PMI.

adjust

Performing multiple significance tests inflates the alpha error. Thus, p values need to be adjusted according to the number of tests. Set a method supported by stats::p.adjust, e.g. "fdr" (the default) or "bonferroni". Disable adjustment with FALSE.

labels

If TRUE (default), extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from effect_counts.

Value

A volker tibble.

Examples

library(volker)
data <- volker::chatgpt

effect_counts(
  data,
  starts_with("cg_adoption_adv"),
  starts_with("use_")
)

Test homogeneity of category shares

Description

Performs a goodness-of-fit test and calculates the Gini coefficient. The goodness-of-fit test is calculated using stats::chisq.test.

Usage

effect_counts_one(data, col, clean = TRUE, ...)

Arguments

data

A tibble.

col

The column holding factor values.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from effect_counts.

Value

A volker tibble with the following statistical measures:

Examples

library(volker)
data <- volker::chatgpt

data |>
  dplyr::filter(sd_gender != "diverse") |>
  effect_counts_one(sd_gender)

Output test statistics and effect size from a logistic regression of one metric predictor

Description

Not yet implemented. The future will come.

Usage

effect_counts_one_cor(data, col, cross, clean = TRUE, labels = TRUE, ...)

Arguments

data

A tibble.

col

The column holding factor values.

cross

The column holding metric values.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from effect_counts.

Value

A volker tibble.


Output test statistics and effect size for contingency tables

Description

Chi squared is calculated using stats::chisq.test. If any cell contains fewer than 5 observations, the exact parameter is set.

Usage

effect_counts_one_grouped(data, col, cross, clean = TRUE, ...)

Arguments

data

A tibble.

col

The column holding factor values.

cross

The column holding groups to compare.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from effect_counts.

Details

Phi is derived from the Chi squared value by sqrt(fit$statistic / n). Cramer's V is derived by sqrt(phi / (min(dim(contingency)[1], dim(contingency)[2]) - 1)). Cramer's V is set to 1.0 for diagonal contingency matrices, indicating perfect association.

Value

A volker list with two volker tibbles. The first tibble contains npmi values for each combination:

The second tibble contains effect sizes based on the cross table:

Examples

library(volker)
data <- volker::chatgpt

effect_counts_one_grouped(data, adopter, sd_gender)
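How the effect sizes relate to the chi-squared statistic can be sketched in base R. The contingency table is hypothetical, and the code uses the textbook definitions phi = sqrt(chi2 / n) and V = sqrt(chi2 / (n * (min(rows, columns) - 1))); these may differ in detail from the package's internal computation.

```r
# Hypothetical 2 x 3 contingency table
contingency <- matrix(
  c(20, 10, 15, 15, 5, 25), nrow = 2,
  dimnames = list(group = c("a", "b"), answer = c("low", "mid", "high"))
)

fit  <- stats::chisq.test(contingency)
chi2 <- unname(fit$statistic)
n    <- sum(contingency)

# Textbook definitions of phi and Cramer's V
phi      <- sqrt(chi2 / n)
cramer_v <- sqrt(chi2 / (n * (min(dim(contingency)) - 1)))
```

For tables with only two rows or two columns, the denominator term min(rows, columns) - 1 equals 1, so Cramer's V and phi coincide.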

Output effect sizes and test statistics for metric data

Description

The calculations depend on the number of selected columns:

Group comparisons:

By default, if you provide two column selections, the second column is treated as categorical. Setting the metric parameter to TRUE will call the appropriate functions for correlation analysis:

[Experimental]

Usage

effect_metrics(data, cols, cross = NULL, metric = FALSE, clean = TRUE, ...)

Arguments

data

A data frame.

cols

A tidy column selection, e.g. a single column (without quotes) or multiple columns selected by methods such as starts_with().

cross

Optional, a grouping column (without quotes).

metric

When crossing variables, the cross column parameter can contain categorical or metric values. By default, the cross column selection is treated as categorical data. Set metric to TRUE to treat it as metric and calculate correlations.

clean

Prepare data by data_clean.

...

Other parameters passed to the appropriate effect function.

Value

A volker tibble.

Examples

library(volker)
data <- volker::chatgpt

effect_metrics(data, sd_age, sd_gender)

Test whether a distribution is normal for each item

Description

The test is calculated using stats::shapiro.test.

Usage

effect_metrics_items(
  data,
  cols,
  adjust = "fdr",
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble containing item measures.

cols

The column holding metric values.

adjust

Performing multiple significance tests inflates the alpha error. Thus, p values need to be adjusted according to the number of tests. Set a method supported by stats::p.adjust, e.g. "fdr" (the default) or "bonferroni". Disable adjustment with FALSE.

labels

If TRUE (default), extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from effect_metrics.

Value

A volker table containing itemwise statistics:

Examples

library(volker)
data <- volker::chatgpt

effect_metrics_items(data, starts_with("cg_adoption"))

Output correlation coefficients for items and one metric variable

Description

The correlation is calculated using stats::cor.test.

Usage

effect_metrics_items_cor(
  data,
  cols,
  cross,
  method = "pearson",
  adjust = "fdr",
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble containing item measures.

cols

Tidyselect item variables (e.g. starts_with...).

cross

The column holding metric values to correlate.

method

The output metrics, pearson = Pearson's R, spearman = Spearman's rho.

adjust

Performing multiple significance tests inflates the alpha error. Thus, p values need to be adjusted according to the number of tests. Set a method supported by stats::p.adjust, e.g. "fdr" (the default) or "bonferroni". Disable adjustment with FALSE.

labels

If TRUE (default), extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from effect_metrics.

Value

A volker table containing itemwise correlations:

If method = "pearson":

If method = "spearman":

Examples

library(volker)
data <- volker::chatgpt

effect_metrics_items_cor(
  data, starts_with("cg_adoption_adv"), sd_age
)
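Correlating several items with one metric column and adjusting the resulting p values can be sketched with stats::cor.test and stats::p.adjust. The example uses the built-in mtcars data, and the result columns item, r, p and p_adjusted are hypothetical names, not the columns of the volker table.

```r
# Built-in example data, used only for illustration
items  <- mtcars[, c("mpg", "hp", "wt")]
target <- mtcars$qsec

# One correlation test per item
result <- do.call(rbind, lapply(names(items), function(item) {
  fit <- stats::cor.test(items[[item]], target, method = "pearson")
  data.frame(item = item, r = unname(fit$estimate), p = fit$p.value)
}))

# Adjust the p values for the number of tests
result$p_adjusted <- stats::p.adjust(result$p, method = "fdr")
```

Passing method = "spearman" to stats::cor.test yields Spearman's rho instead of Pearson's r.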

Output correlation coefficients for multiple items

Description

The correlation is calculated using stats::cor.test.

Usage

effect_metrics_items_cor_items(
  data,
  cols,
  cross,
  method = "pearson",
  adjust = "fdr",
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble containing item measures.

cols

Tidyselect item variables (e.g. starts_with...).

cross

Tidyselect item variables (e.g. starts_with...).

method

The output metrics, pearson = Pearson's R, spearman = Spearman's rho.

adjust

Performing multiple significance tests inflates the alpha error. Thus, p values need to be adjusted according to the number of tests. Set a method supported by stats::p.adjust, e.g. "fdr" (the default) or "bonferroni". Disable adjustment with FALSE.

labels

If TRUE (default), extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from effect_metrics.

Value

A volker table containing correlations.

If method = "pearson":

If method = "spearman":

Examples

library(volker)
data <- volker::chatgpt

effect_metrics_items_cor_items(
  data,
  starts_with("cg_adoption_adv"),
  starts_with("use"),
  metric = TRUE
)

Compare groups for each item by calculating F-statistics and effect sizes

Description

The models are fitted using stats::lm. ANOVA of type II is computed for each fitted model using car::Anova. Eta Squared is calculated for each ANOVA result using effectsize::eta_squared.

Usage

effect_metrics_items_grouped(
  data,
  cols,
  cross,
  adjust = "fdr",
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble containing item measures.

cols

Tidyselect item variables (e.g. starts_with...).

cross

The column holding groups to compare.

adjust

Performing multiple significance tests inflates the alpha error. Thus, p values need to be adjusted according to the number of tests. Set a method supported by stats::p.adjust, e.g. "fdr" (the default) or "bonferroni". Disable adjustment with FALSE.

labels

If TRUE (default), extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from effect_metrics.

Value

A volker tibble with the following statistical measures:

Examples

library(volker)
data <- volker::chatgpt

effect_metrics(data, starts_with("cg_adoption_"), adopter)
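The F statistic and eta squared for a single item can be sketched in base R. Note the substitutions: this sketch uses stats::anova instead of car::Anova and computes eta squared by hand instead of calling effectsize::eta_squared; with a single grouping factor, sequential and type II sums of squares coincide. The built-in iris data stands in for survey items.

```r
# Built-in example data: sepal length compared across three species
fit <- stats::lm(Sepal.Length ~ Species, data = iris)

# ANOVA table with sums of squares and the F statistic
tab <- stats::anova(fit)

ss_effect <- tab["Species", "Sum Sq"]
ss_resid  <- tab["Residuals", "Sum Sq"]

# Eta squared: share of the total variance explained by the grouping
eta_sq <- ss_effect / (ss_effect + ss_resid)
f_stat <- tab["Species", "F value"]
```

In this one-factor case, eta squared equals the R squared of the linear model.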

Compare groups for each item with multiple target items by calculating F-statistics and effect sizes

Description

Not yet implemented. The future will come.

Usage

effect_metrics_items_grouped_items(data, cols, cross, clean = TRUE, ...)

Arguments

data

A tibble containing item measures.

cols

Tidyselect item variables (e.g. starts_with...).

cross

The grouping items.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from effect_counts.

Value

A volker tibble.


Test whether a distribution is normal

Description

The test is calculated using stats::shapiro.test.

Usage

effect_metrics_one(data, col, labels = TRUE, clean = TRUE, ...)

Arguments

data

A tibble.

col

The column holding metric values.

labels

If TRUE (default), extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from effect_metrics.

Value

A volker list object with the following statistical measures:

Examples

library(volker)
data <- volker::chatgpt

effect_metrics_one(data, sd_age)

Test whether the correlation is different from zero

Description

The correlation is calculated using stats::cor.test.

Usage

effect_metrics_one_cor(
  data,
  col,
  cross,
  method = "pearson",
  adjust = "fdr",
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble.

col

The column holding metric values.

cross

The column holding metric values to correlate.

method

The output metrics, TRUE or pearson = Pearson's R, spearman = Spearman's rho.

adjust

Performing multiple significance tests inflates the alpha error. Thus, p values need to be adjusted according to the number of tests. Set a method supported by stats::p.adjust, e.g. "fdr" (the default) or "bonferroni". Disable adjustment with FALSE.

labels

If TRUE (default), extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from effect_metrics.

Value

A volker table containing the requested statistics.

If method = "pearson":

If method = "spearman":

Examples

library(volker)
data <- volker::chatgpt

effect_metrics_one_cor(data, sd_age, use_private, metric = TRUE)

Output a regression table with estimates and macro statistics

Description

The regression output comes from stats::lm. T-test is performed using stats::t.test. Normality check is performed using stats::shapiro.test. Equality of variances across groups is assessed using car::leveneTest. Cohen's d is calculated using effectsize::cohens_d.

Usage

effect_metrics_one_grouped(
  data,
  col,
  cross,
  method = "lm",
  adjust = "fdr",
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble.

col

The column holding metric values.

cross

The column holding groups to compare.

method

A character vector of methods, e.g. c("t.test","lm"). Supported methods are t.test (only valid if the cross column contains two levels) and lm (regression results).

adjust

Performing multiple significance tests inflates the alpha error. Thus, p values need to be adjusted according to the number of tests. Set a method supported by stats::p.adjust, e.g. "fdr" (the default) or "bonferroni". Disable adjustment with FALSE.

labels

If TRUE (default), extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from effect_metrics.

Value

A volker list object containing volker tables with the requested statistics.

Regression table:

Macro statistics:

If method = t.test:

Shapiro-Wilk test (normality check):

Levene test (equality of variances):

Cohen's d (effect size):

t-test

Examples

library(volker)
data <- volker::chatgpt

effect_metrics_one_grouped(data, sd_age, sd_gender)
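
To get a feeling for the base R building blocks named in the description, the following sketch calls stats::lm, stats::t.test and stats::shapiro.test directly on the built-in mtcars data. It is an illustration under that assumption only: the data frame df and its columns are made up for the example, and the Levene test and Cohen's d are left out because they need the car and effectsize packages.

```r
# Two groups from the built-in mtcars data (automatic vs. manual transmission)
df <- data.frame(
  value = mtcars$mpg,
  group = factor(mtcars$am, labels = c("automatic", "manual"))
)

# Regression of the metric value on the grouping variable
fit <- stats::lm(value ~ group, data = df)

# Welch two-sample t-test (requires exactly two groups)
tt <- stats::t.test(value ~ group, data = df)

# Shapiro-Wilk normality check of the metric value
sw <- stats::shapiro.test(df$value)

print(summary(fit)$coefficients)
print(tt$p.value)
print(sw$p.value)
```

With a two-level grouping variable, the regression slope equals the difference between the two group means, which is why both methods describe the same comparison.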

Select variables by their postfix

Description

See tidyselect::ends_with for details.


Get plot with factor analysis result

Description

PCA is performed using add_factors.

[Experimental]

Usage

factor_plot(  data,  cols,  newcols = NULL,  k = 2,  method = "pca",  labels = TRUE,  clean = TRUE,  ...)

Arguments

data

A dataframe.

cols

A tidy selection of item columns. If the first column already contains a pca from add_factors, the result is used. Other parameters are ignored. If there is no pca result yet, it is calculated by add_factors first.

newcols

Names of the factor columns as a character vector. Must be the same length as k or NULL. Set to NULL (default) to automatically build a name from the common column prefix, prefixed with "fct_", postfixed with the factor number.

k

Number of factors to calculate. Set to NULL to generate a scree plot with eigenvalues for all components up to the number of items and automatically choose k. Eigenvalues and the decision on k are calculated by psych::fa.parallel.

method

The method as character value. Currently, only pca is supported.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from plot_metrics.

Value

A ggplot object.

Examples

library(volker)
ds <- volker::chatgpt

volker::factor_plot(ds, starts_with("cg_adoption"), k = 3)

Get tables with factor analysis results

Description

PCA is performed using add_factors.

[Experimental]

Usage

factor_tab(  data,  cols,  newcols = NULL,  k = 2,  method = "pca",  labels = TRUE,  clean = TRUE,  ...)

Arguments

data

A dataframe.

cols

A tidy selection of item columns. If the first column already contains a pca result from add_factors, the result is used. Other parameters are ignored. If there is no pca result yet, it is calculated by add_factors first.

newcols

Names of the new factor columns as a character vector. Must be the same length as k or NULL. Set to NULL (default) to automatically build a name from the common column prefix, prefixed with "fct_", postfixed with the factor number.

k

Number of factors to calculate. Set to NULL to report eigenvalues for all components up to the number of items and automatically choose k. Eigenvalues and the decision on k are calculated by psych::fa.parallel.

method

The method as character value. Currently, only pca is supported.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from tab_metrics.

Value

A volker list with three volker tabs: loadings, variances and diagnostics.

Examples

library(volker)
ds <- volker::chatgpt

volker::factor_tab(ds, starts_with("cg_adoption"), k = 3)
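
To see what loadings and eigenvalues look like without the package, here is a base R sketch on the built-in mtcars data. It is not the package's implementation: the package delegates the choice of k to psych::fa.parallel, whereas this sketch swaps in the simpler Kaiser criterion (eigenvalue greater than 1) purely for illustration. The names items, eig and k_kaiser are made up for the example.

```r
# A handful of metric items from the built-in mtcars data
items <- mtcars[, c("mpg", "disp", "hp", "drat", "wt", "qsec")]

# Eigenvalues of the correlation matrix, one per component
eig <- eigen(stats::cor(items))$values

# Kaiser criterion as a simple stand-in for parallel analysis
k_kaiser <- sum(eig > 1)

# PCA on standardized items; the rotation matrix holds the component weights
pca <- stats::prcomp(items, center = TRUE, scale. = TRUE)
loadings <- pca$rotation[, seq_len(k_kaiser), drop = FALSE]

print(round(eig, 2))
print(round(loadings, 2))
```

The squared standard deviations of the components returned by prcomp are the same eigenvalues, so either route can feed a scree plot.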

Filter function

Description

See dplyr::filter for details.


Get number of items and Cronbach's alpha of a scale added by add_index()

Description

TODO: Rename to index_tab, return volker list as in factor_tab()

Usage

get_alpha(data)

Arguments

data

A data frame column.

Value

A named list with the keys "items" and "alpha".


Angle labels

Description

Calculate angle for label adjustment based on character length.

Usage

get_angle(  labels,  threshold = VLKR_PLOT_ANGLE_THRESHOLD,  angle = VLKR_PLOT_ANGLE_VALUE)

Arguments

labels

Vector of labels to check. The values are converted to characters.

threshold

Length threshold beyond which the angle is applied. Default is 20. Override with options(vlkr.angle.threshold=10).

angle

The angle to apply if any label exceeds the threshold. Default is 45. Override with options(vlkr.angle.value=30).

Value

A single angle value.
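
The rule described above can be sketched in a few lines of base R. The helper name angle_for_labels is hypothetical, and returning 0 when no label exceeds the threshold is an assumption, since the documentation only states what happens when the threshold is exceeded.

```r
# Hypothetical helper illustrating the documented rule; not the package function
angle_for_labels <- function(labels, threshold = 20, angle = 45) {
  # Values are converted to characters before measuring their length
  labels <- as.character(labels)
  # Assumption: no rotation (0) if all labels are short enough
  if (any(nchar(labels) > threshold, na.rm = TRUE)) angle else 0
}

angle_for_labels(c("yes", "no"))
angle_for_labels(c("yes", "a considerably longer answer option"))
```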


Get a formatted baseline from attributes of an object.

Description

The following attributes are considered:

Usage

get_baseline(obj, ignore = c())

Arguments

obj

An object with supported attributes.

ignore

Character vector of attributes to ignore.

Value

A formatted message or NULL if none of the supported attributes is present.


Calculate standardized betas

Description

Calculate standardized betas

Usage

get_betas(fit)

Arguments

fit

A model fitted with lm()

Value

A data frame with a row for each term
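
A common way to standardize a regression coefficient is to multiply the estimate by the ratio of the predictor's standard deviation to the outcome's standard deviation. The sketch below applies that definition to a model on the built-in mtcars data; it assumes numeric predictors only and is not the package's code.

```r
# Model with two numeric predictors
fit <- stats::lm(mpg ~ wt + hp, data = mtcars)

# Unstandardized estimates without the intercept
est <- stats::coef(fit)[-1]

# Standard deviations of the predictors and of the outcome
x_sd <- sapply(mtcars[names(est)], stats::sd)
y_sd <- stats::sd(mtcars$mpg)

# Standardized betas: estimate * x_sd / y_sd
std_beta <- est * x_sd / y_sd
print(std_beta)
```

The same values result from refitting the model on z-standardized variables, which is a handy cross-check.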


Calculate ci values to be used for error bars on a plot

Description

Calculate ci values to be used for error bars on a plot

Usage

get_ci(x, conf = 0.95)

Arguments

x

A numeric vector.

conf

The confidence level.

Value

A named list with values for y, ymin, and ymax.
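
As an illustration of the returned structure, the sketch below computes a t-based confidence interval of the mean and returns it under the keys y, ymin and ymax. Whether the package uses exactly this formula is an assumption; the helper name ci_mean is hypothetical.

```r
# Hypothetical helper: t-based confidence interval of the mean
ci_mean <- function(x, conf = 0.95) {
  x <- x[!is.na(x)]
  n <- length(x)
  m <- mean(x)
  # Standard error of the mean
  se <- stats::sd(x) / sqrt(n)
  # Two-sided critical value of the t distribution
  margin <- stats::qt(1 - (1 - conf) / 2, df = n - 1) * se
  list(y = m, ymin = m - margin, ymax = m + margin)
}

res <- ci_mean(mtcars$mpg)
print(res)
```

A list in this shape can be passed to ggplot2 summary layers that expect y, ymin and ymax for error bars.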


Calculate correlation between two vectors

Description

Calculate correlation between two vectors

Usage

get_correlation(x, y, method, category = NULL, test = TRUE)

Arguments

x

First vector

y

Second vector

method

One of "spearman" or "pearson" for metric vectors. For categorical vectors, use "cramer" or "npmi".

category

A vector of values to focus. Necessary for the npmi method only.

test

Boolean; whether to perform significance tests.

Value

The result of cor.test() for metric vectors, chisq.test for Cramer's V.
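
The two base R tests mentioned in the return value can be tried directly. The sketch below runs stats::cor.test on two metric vectors and derives Cramér's V from stats::chisq.test using the textbook formula; the data and variable names are illustrative, and the package may compute V differently in detail.

```r
# Metric vectors: Pearson correlation with significance test
ct <- stats::cor.test(mtcars$mpg, mtcars$wt, method = "pearson")

# Categorical vectors: chi-squared test on the cross table
tab <- table(mtcars$cyl, mtcars$am)
chi <- suppressWarnings(stats::chisq.test(tab, correct = FALSE))

# Cramér's V = sqrt(chi-squared / (n * (min(rows, cols) - 1)))
n <- sum(tab)
cramer_v <- sqrt(unname(chi$statistic) / (n * (min(dim(tab)) - 1)))

print(ct$estimate)
print(cramer_v)
```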


Detect whether a scale is a numeric sequence

Description

From all values in the selected columns, the numbers are extracted. If no numeric values can be found, returns 0. Otherwise, if any positive values form an ascending sequence, returns -1. In all other cases, returns 1.

Usage

get_direction(data, cols, extract = TRUE)

Arguments

data

The dataframe.

cols

The tidy selection.

extract

Whether to extract numeric values from characters.

Value

0 = an undirected scale, -1 = descending values, 1 = ascending values.


Calculate Eta squared

Description

Calculate Eta squared

Usage

get_etasq(fit)

Arguments

fit

A model

Value

A data frame with at least the column Eta2
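
For orientation, eta squared is the share of the total sum of squares that an effect explains. The sketch below computes it from a base R ANOVA table for a model with a single factor; in that special case classical and partial eta squared coincide and both equal R squared. It is an illustration on mtcars, not the package's computation.

```r
# One-factor model on the built-in mtcars data
fit <- stats::lm(mpg ~ factor(cyl), data = mtcars)

# Sums of squares: first row is the effect, second row the residuals
ss <- stats::anova(fit)[["Sum Sq"]]

# Eta squared = effect sum of squares / total sum of squares
eta2 <- ss[1] / sum(ss)
print(eta2)
```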


Calculate the Gini coefficient

Description

Calculate the Gini coefficient

Usage

get_gini(x)

Arguments

x

A vector of counts or other values

Value

The gini coefficient
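
One standard formula for the Gini coefficient works on the sorted values. The sketch below implements that formula without any small-sample correction; which variant the package uses is not stated here, so treat the helper gini as an illustration with a made-up name.

```r
# Hypothetical helper: Gini coefficient from sorted values
gini <- function(x) {
  x <- sort(x[!is.na(x)])
  n <- length(x)
  # Weighted sum of ranked values, normalized by n times the total
  sum((2 * seq_len(n) - n - 1) * x) / (n * sum(x))
}

# Perfectly equal counts
gini(c(10, 10, 10, 10))

# Everything concentrated in one category
gini(c(0, 0, 0, 40))
```

With this uncorrected formula, total concentration in one of n categories yields (n - 1) / n rather than exactly 1.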


Get the labels of values from a codebook

Description

Get the labels of values from a codebook

Usage

get_labels(codes, values)

Arguments

codes

The codebook as it results from the codebook() function

values

A vector of labels

Value

The labels. If the values are not present in the codebook, returns the values.
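
The documented fallback (values missing from the codebook are returned unchanged) can be sketched with match(). The column names value_name and value_label follow the codebook format used elsewhere in this manual; the helper name lookup_labels and the sample codebook are made up.

```r
# Small illustrative codebook with value names and their labels
codes <- data.frame(
  value_name = c("1", "2", "3"),
  value_label = c("low", "medium", "high"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: look up labels, fall back to the value itself
lookup_labels <- function(codes, values) {
  idx <- match(values, codes$value_name)
  ifelse(is.na(idx), values, codes$value_label[idx])
}

lookup_labels(codes, c("3", "1", "9"))
```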


Get the numeric range from the labels

Description

Gets the range of all values in the selected columns by the first successful of the following methods:

Usage

get_limits(data, cols, negative = TRUE)

Arguments

data

The labeled data frame.

cols

A tidy variable selection.

negative

Whether to include negative values.

Details

Value

A list or NULL.


Calculate npmi

Description

Calculate npmi

Usage

get_npmi(  data,  col,  cross,  category = NULL,  smoothing = 0,  adjust = "fdr",  test = TRUE,  ...)

Arguments

data

A tibble.

col

The column holding factor values.

cross

The column to correlate.

category

A vector of values to focus. If not null, all other values will be removed from the result.

smoothing

Add pseudocount. Calculate the pseudocount based on the number of trials to apply Laplace's rule of succession.

adjust

Performing multiple significance tests inflates the alpha error. Thus, p values need to be adjusted according to the number of tests. Set a method supported by stats::p.adjust, e.g. "fdr" (the default) or "bonferroni". Disable adjustment with FALSE.

test

Boolean; whether to perform significance tests (default TRUE).

...

Placeholder to allow calling the method with unused parameters from tab_counts.

Value

A volker tibble.
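
Normalized pointwise mutual information relates the observed joint probability of two categories to the probability expected under independence and scales the result to the range from -1 to 1. The base R sketch below applies the standard definition to a cross table from mtcars; it leaves out smoothing and significance tests, and setting empty cells to -1 is a common convention assumed here rather than documented package behavior.

```r
# Cross table of two categorical variables
tab <- table(mtcars$cyl, mtcars$am)

# Joint and marginal probabilities
p_xy <- unclass(tab) / sum(tab)
p_x <- rowSums(p_xy)
p_y <- colSums(p_xy)

# Pointwise mutual information: log of observed over expected probability
pmi <- log(p_xy / outer(p_x, p_y))

# Normalize by the negative log of the joint probability
npmi <- pmi / -log(p_xy)

# Assumed convention: combinations that never occur get -1
npmi[p_xy == 0] <- -1

print(round(npmi, 2))
```

Positive values mark combinations that occur more often than expected, negative values those that occur less often.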


Get the common prefix of character values

Description

Helper function taken from the Biobase package. Duplicated here instead of loading the package to avoid overhead. See https://github.com/Bioconductor/Biobase

Usage

get_prefix(  x,  ignore.case = FALSE,  trim = FALSE,  delimiters = c(":", "\n"),  minlength = 3)

Arguments

x

Character vector.

ignore.case

Whether case matters (default).

trim

Whether non alphabetic characters should be trimmed.

delimiters

A list of prefix delimiters. If any of the delimiters is present in the extracted prefix, the part after is removed from the prefix. Consider the following two items as an example: c("Usage: in private context", "Usage: in work context"). The common prefix would be "Usage: in ", but it makes more sense to break it after the colon.

minlength

Minimum length of the prefix. Consider the following two items as an example: c("coder one", "cg_act_write"). The common prefix would be "c", although the items have nothing in common. Requiring a minimum common prefix length should help in many cases.

Value

The longest common prefix of the strings.
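
The two examples from the argument descriptions can be reproduced with a small base R sketch. The helper common_prefix is hypothetical: it only covers the character-by-character comparison and the minlength rule, omits delimiter handling, trimming and case folding, and returns an empty string for a too-short prefix as an assumption.

```r
# Hypothetical helper: longest common prefix with a minimum length
common_prefix <- function(x, minlength = 3) {
  x <- as.character(x)
  if (length(x) == 0) return("")
  chars <- strsplit(x, "")
  n <- min(lengths(chars))
  len <- 0
  for (i in seq_len(n)) {
    # Stop at the first position where the strings differ
    if (length(unique(vapply(chars, `[`, character(1), i))) > 1) break
    len <- i
  }
  prefix <- substr(x[1], 1, len)
  # Assumption: prefixes shorter than minlength are discarded
  if (nchar(prefix) < minlength) "" else prefix
}

common_prefix(c("Usage: in private context", "Usage: in work context"))
common_prefix(c("coder one", "cg_act_write"))
```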


Get significance stars from p values

Description

Get significance stars from p values

Usage

get_stars(x)

Arguments

x

A vector of p values.

Value

A character vector with significance stars.
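
The thresholds are not listed here, so the following sketch assumes the widespread convention of three stars below 0.001, two below 0.01 and one below 0.05. The helper name p_stars is hypothetical and the cut-offs may differ from those used by the package.

```r
# Hypothetical helper using conventional cut-offs (an assumption)
p_stars <- function(p) {
  ifelse(p < 0.001, "***",
    ifelse(p < 0.01, "**",
      ifelse(p < 0.05, "*", "")
    )
  )
}

p_stars(c(0.0005, 0.005, 0.03, 0.2))
```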


Get a common title for a column selection

Description

Get a common title for a column selection

Usage

get_title(data, cols, default = NULL, max_length = 40)

Arguments

data

A tibble.

cols

A tidy column selection.

default

A character string used in case no prefix is found.

Value

A character string.


Volker style HTML document format

Description

Based on the standard theme, tweaks the pill navigation to switch between tables and plots. To use the format, in the header of your Markdown document, set output: volker::html_report.

Usage

html_report(...)

Arguments

...

Additional arguments passed to html_document.

Value

R Markdown output format.

Examples

## Not run:
# Add `volker::html_report` to the output options of your Markdown document:
#
# ```
# ---
# title: "How to create reports?"
# output: volker::html_report
# ---
# ```

## End(Not run)

Deprecated Alias for add_index

Description

[Deprecated] idx_add() was renamed to add_index().

Usage

idx_add(data, cols, newcol = NULL, reverse = NULL, clean = TRUE)

Details

This function is a deprecated alias for add_index.


Printing method for volker plots when knitting

Description

Printing method for volker plots when knitting

Usage

## S3 method for class 'vlkr_plt'
knit_print(x, ...)

Arguments

x

The volker plot.

...

Further parameters passed to print().

Value

Knitr asis output

Examples

library(volker)
data <- volker::chatgpt

pl <- plot_metrics(data, sd_age)
print(pl)

Wrap labels in plot scales

Description

Wrap labels in plot scales

Usage

label_scale(x, scale)

Arguments

x

The label vector.

scale

A named label vector to select elements that should be wrapped.Prevents numbers from being wrapped.

Value

A vector of wrapped labels.


Set column and value labels

Description

[Experimental]

Usage

labs_apply(data, codes = NULL, cols = NULL, items = TRUE, values = TRUE)

Arguments

data

A tibble containing the dataset.

codes

A tibble in codebook format.

cols

A tidy column selection. Set to NULL (default) to apply to all columns found in the codebook. Restricting the columns is helpful when you want to set value labels. In this case, provide a tibble with value_name and value_label columns and specify the columns that should be modified.

items

If TRUE, column labels will be retrieved from the codes (the default). If FALSE, no column labels will be changed. Alternatively, a named list of column names with their labels.

values

If TRUE, value labels will be retrieved from the codes (default). If FALSE, no value labels will be changed. Alternatively, a named list of value names with their labels. In this case, use the cols-parameter to define which columns should be changed.

Details

You can either provide a data frame in codebook format to the codes-parameter or provide named lists to the items- or values-parameter.

When working with a codebook in the codes-parameter:

When working with lists in the items- or values-parameter:

Value

A tibble containing the dataset with new labels.

Examples

library(volker)

# Set column labels using the items-parameter
volker::chatgpt %>%
  labs_apply(
    items = list(
      "cg_adoption_advantage_01" = "Allgemeine Vorteile",
      "cg_adoption_advantage_02" = "Finanzielle Vorteile",
      "cg_adoption_advantage_03" = "Vorteile bei der Arbeit",
      "cg_adoption_advantage_04" = "Macht mehr Spaß"
    )
  ) %>%
  tab_metrics(starts_with("cg_adoption_advantage_"))

# Set value labels using the values-parameter
volker::chatgpt %>%
  labs_apply(
    cols = starts_with("cg_adoption"),
    values = list(
      "1" = "Stimme überhaupt nicht zu",
      "2" = "Stimme nicht zu",
      "3" = "Unentschieden",
      "4" = "Stimme zu",
      "5" = "Stimme voll und ganz zu"
    )
  ) %>%
  plot_metrics(starts_with("cg_adoption"))

Remove all comments from the selected columns

Description

[Experimental]

Usage

labs_clear(data, cols, labels = NULL)

Arguments

data

A tibble.

cols

Tidyselect columns.

labels

The attributes to remove. NULL to remove all attributes except levels and class.

Value

A tibble with comments removed.

Examples

library(volker)

volker::chatgpt |>
  labs_clear()

Add missing residual labels in numeric columns that have at least one labeled value

Description

Add missing residual labels in numeric columns that have at least one labeled value

Usage

labs_impute(data)

Arguments

data

A tibble

Value

A tibble with added value labels


Remove common prefix from the first column

Description

Remove common prefix from the first column

Usage

labs_prefix(result)

Arguments

result

A tibble with item names in the first column

Value

A tibble with the first column renamed to the prefix and the prefix removed from column values.


Replace item value names in a column by their labels

Description

Replace item value names in a column by their labels

Usage

labs_replace(  data,  col,  codes,  col_from = "value_name",  col_to = "value_label",  na.missing = FALSE)

Arguments

data

A tibble.

col

The column holding item values.

codes

The codebook to use: A tibble with the columns value_name and value_label. Can be created by the codebook function, e.g. by calling codes <- codebook(data, myitemcolumn).

col_from

The tidyselect column with source values, defaults to value_name. If the column is not found in the codebook, the first column is used.

col_to

The tidyselect column with target values, defaults to value_label. If the column is not found in the codebook, the second column is used.

na.missing

By default, the column is converted to a factor with levels combined from the codebook and the data. Set na.missing to TRUE to set all levels not found in the codes to NA.

Value

Tibble with new labels.


Restore labels from the codebook stored in the codebook attribute.

Description

[Experimental]

Usage

labs_restore(data, cols = NULL)

Arguments

data

A data frame.

cols

A tidyselect column selection.

Details

You can store labels before mutate operations by calling labs_store.

Value

A data frame.

Examples

library(dplyr)
library(volker)

volker::chatgpt |>
  labs_store() |>
  mutate(sd_age = 2024 - sd_age) |>
  labs_restore() |>
  tab_metrics(sd_age)

Get the current codebook and store it in the codebook attribute.

Description

[Experimental]

Usage

labs_store(data)

Arguments

data

A data frame.

Details

You can restore the labels after mutate operations by calling labs_restore.

Value

A data frame.

Examples

library(dplyr)
library(volker)

volker::chatgpt |>
  labs_store() |>
  mutate(sd_age = 2024 - sd_age) |>
  labs_restore() |>
  tab_metrics(sd_age)

Get a label for a key

Description

Get a label for a key

Usage

map_label(key, options = list())

Arguments

key

The key

options

A list of labels with their keys

Value

The label from the options that matches the key


Plot regression coefficients

Description

The regression output comes from stats::lm.

[Experimental]

Usage

model_metrics_plot(  data,  col,  categorical,  metric,  interactions = NULL,  diagnostics = FALSE,  labels = TRUE,  clean = TRUE,  ...)

Arguments

data

A tibble.

col

The target column holding metric values.

categorical

A tidy column selection holding categorical variables.

metric

A tidy column selection holding metric variables.

interactions

A vector of interaction effects to calculate. Each interaction effect should be provided as multiplication of the variables. Example: c(sd_gender * adopter).

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from effect_metrics.

Value

A volker list object containing volker plots

Examples

library(volker)
data <- volker::chatgpt

data |>
  filter(sd_gender != "diverse") |>
  model_metrics_plot(use_work, categorical = c(sd_gender, adopter), metric = sd_age)

Output a regression table with estimates and macro statistics for multiple categorical or metric independent variables

Description

The regression output comes from stats::lm. The effect sizes are calculated by heplots::etasq. The variance inflation is calculated by car::vif. The standardized beta (in the column standard beta) is calculated by multiplying the estimate with the ratio x_sd / y_sd where x_sd contains the standard deviation of the predictor values and y_sd the standard deviation of the predicted value.

[Experimental]

Usage

model_metrics_tab(  data,  col,  categorical,  metric,  interactions = NULL,  adjust = "fdr",  labels = TRUE,  clean = TRUE,  ...)

Arguments

data

A tibble.

col

The target column holding metric values.

categorical

A tidy column selection holding independent categorical variables.

metric

A tidy column selection holding independent metric variables.

interactions

A vector of interaction effects to calculate. Each interaction effect should be provided as multiplication of the variables. Example: c(sd_gender * adopter).

adjust

Performing multiple significance tests inflates the alpha error. Thus, p values need to be adjusted according to the number of tests. Set a method supported by stats::p.adjust, e.g. "fdr" (the default) or "bonferroni". Disable adjustment with FALSE.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from effect_metrics.

Value

A volker list object containing volker tables with the requested statistics.

Examples

library(volker)
data <- volker::chatgpt

data |>
  filter(sd_gender != "diverse") |>
  model_metrics_tab(use_work, categorical = c(sd_gender, adopter), metric = sd_age)

Mutate function

Description

See dplyr::mutate for details.


Convert a named vector to a list

Description

Convert a named vector to a list

Usage

named.to.list(x)

Arguments

x

A named vector or a list

Value

Lists are returned as is. Vectors are converted to lists with names as list names.


Volker style PDF document format

Description

Based on the standard theme, tweaks tex headers. To use the format, in the header of your Markdown document, set output: volker::pdf_report.

Usage

pdf_report(...)

Arguments

...

Additional arguments passed to pdf_document.

Value

R Markdown output format.

Examples

## Not run:
# Add `volker::pdf_report` to the output options of your Markdown document:
#
# ```
# ---
# title: "How to create reports?"
# output: volker::pdf_report
# ---
# ```

## End(Not run)

Output a frequency plot

Description

The type of frequency plot depends on the number of selected columns:

Cross tabulations:

By default, if you provide two column selections, the second selection is treated as categorical. Setting the metric-parameter to TRUE will call the appropriate functions for correlation analysis:

Parameters that may be passed to the count functions (see the respective function help):

[Experimental]

Usage

plot_counts(data, cols, cross = NULL, metric = FALSE, clean = TRUE, ...)

Arguments

data

A data frame.

cols

A tidy column selection, e.g. a single column (without quotes) or multiple columns selected by methods such as starts_with().

cross

Optional, a grouping column. The column name without quotes.

metric

When crossing variables, the cross column parameter can contain categorical or metric values. By default, the cross column selection is treated as categorical data. Set metric to TRUE to treat it as metric and calculate correlations.

clean

Prepare data by data_clean.

...

Other parameters passed to the appropriate plot function.

Value

A ggplot2 plot object.

Examples

library(volker)
data <- volker::chatgpt

plot_counts(data, sd_gender)

Output frequencies for multiple variables

Description

Output frequencies for multiple variables

Usage

plot_counts_items(  data,  cols,  category = NULL,  ordered = NULL,  ci = FALSE,  limits = NULL,  numbers = NULL,  title = TRUE,  labels = TRUE,  clean = TRUE,  ...)

Arguments

data

A tibble containing item measures.

cols

Tidyselect item variables (e.g. starts_with...).

category

The value FALSE will force to plot all categories. A character value will focus a selected category. When NULL, in case of boolean values, only the TRUE category is plotted.

ordered

Values can be nominal (0), ordered ascending (1), or descending (-1). By default (NULL), the ordering is automatically detected. An appropriate color scale should be chosen depending on the ordering. For unordered values, colors from VLKR_FILLDISCRETE are used. For ordered values, shades of the VLKR_FILLGRADIENT option are used.

ci

Whether to plot error bars for 95% confidence intervals.

limits

The scale limits, autoscaled by default. Set to c(0,100) to make a 100% plot.

numbers

The values to print on the bars: "n" (frequency), "p" (percentage) or both.

title

If TRUE (default) shows a plot title derived from the column labels. Disable the title with FALSE or provide a custom title as character value.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from plot_counts.

Value

A ggplot object.

Examples

library(volker)
data <- volker::chatgpt

plot_counts_items(data, starts_with("cg_adoption_"))

Plot percent shares of multiple items compared by a metric variable split into groups

Description

Plot percent shares of multiple items compared by a metric variable split into groups

Usage

plot_counts_items_cor(  data,  cols,  cross,  category = NULL,  title = TRUE,  labels = TRUE,  clean = TRUE,  ...)

Arguments

data

A tibble containing item measures.

cols

Tidyselect item variables (e.g. starts_with...).

cross

A metric column that will be split into groups at the median.

category

Summarizing multiple items (the cols parameter) by group requires a focus category. By default, for logical column types, only TRUE values are counted. For other column types, the first category is counted. To override the default behavior, provide a vector of values in the dataset or labels from the codebook.

title

If TRUE (default) shows a plot title derived from the column labels. Disable the title with FALSE or provide a custom title as character value.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from plot_counts.

Value

A ggplot object.

Examples

library(volker)
data <- volker::chatgpt

plot_counts_items_cor(
  data, starts_with("cg_adoption_"), sd_age,
  category = c("agree", "strongly agree")
)

plot_counts_items_cor(
  data, starts_with("cg_adoption_"), sd_age,
  category = c(4, 5)
)
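
The idea of splitting a metric column at the median and comparing the share of a focus category between the two halves can be sketched in base R. The data are made up, and assigning values equal to the median to the lower group is an assumption; the package may handle ties differently.

```r
# Made-up data: a metric variable and whether the focus category was chosen
age <- c(18, 22, 25, 31, 40, 47, 52, 63)
agree <- c(TRUE, FALSE, TRUE, TRUE, FALSE, FALSE, TRUE, FALSE)

# Median split (assumption: ties go to the lower group)
grp <- ifelse(age <= stats::median(age), "low", "high")

# Percent share of the focus category within each group
shares <- tapply(agree, grp, mean) * 100
print(shares)
```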

Correlation of categorical items with metric items

Description

Not yet implemented. The future will come.

Usage

plot_counts_items_cor_items(  data,  cols,  cross,  title = TRUE,  labels = TRUE,  clean = TRUE,  ...)

Arguments

data

A tibble containing item measures.

cols

Tidyselect item variables (e.g. starts_with...).

cross

Tidyselect item variables (e.g. starts_with...).

title

If TRUE (default) shows a plot title derived from the column labels. Disable the title with FALSE or provide a custom title as character value.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from plot_counts.

Value

A ggplot object.


Plot percent shares of multiple items compared by groups

Description

Plot percent shares of multiple items compared by groups

Usage

plot_counts_items_grouped(  data,  cols,  cross,  category = NULL,  limits = NULL,  title = TRUE,  labels = TRUE,  clean = TRUE,  ...)

Arguments

data

A tibble containing item measures.

cols

Tidyselect item variables (e.g. starts_with...).

cross

The column holding groups to compare.

category

Summarizing multiple items (the cols parameter) by group requires a focus category. By default, for logical column types, only TRUE values are counted. For other column types, the first category is counted. To override the default behavior, provide a vector of values in the dataset or labels from the codebook.

limits

The scale limits, autoscaled by default. Set to c(0,100) to make a 100% plot. If the data is binary or focused on a single category, by default a 100% plot is created.

title

If TRUE (default) shows a plot title derived from the column labels. Disable the title with FALSE or provide a custom title as character value.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from plot_counts.

Value

A ggplot object.

Examples

library(volker)
data <- volker::chatgpt

plot_counts_items_grouped(
  data, starts_with("cg_adoption_"), adopter,
  category = c("agree", "strongly agree")
)

plot_counts_items_grouped(
  data, starts_with("cg_adoption_"), adopter,
  category = c(4, 5)
)

Correlation of categorical items with categorical items

Description

Correlation of categorical items with categorical items

Usage

plot_counts_items_grouped_items(  data,  cols,  cross,  method = "cramer",  category = NULL,  title = TRUE,  labels = TRUE,  clean = TRUE,  ...)

Arguments

data

A tibble containing item measures.

cols

Tidyselect item variables (e.g. starts_with...).

cross

Tidyselect item variables (e.g. starts_with...).

method

The method of correlation calculation:

  • cramer for Cramer's V,

  • npmi for Normalized Pointwise Mutual Information.

title

If TRUE (default) shows a plot title derived from the column labels. Disable the title with FALSE or provide a custom title as character value.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from plot_counts.

Value

A ggplot object.

Examples

library(volker)
data <- volker::chatgpt

plot_counts_items_grouped_items(
  data,
  starts_with("cg_adoption_advantage"),
  starts_with("cg_adoption_fearofuse"),
  method = "cramer"
)

Plot the frequency of values in one column

Description

Plot the frequency of values in one column

Usage

plot_counts_one(  data,  col,  category = NULL,  ci = FALSE,  limits = NULL,  numbers = NULL,  title = TRUE,  labels = TRUE,  clean = TRUE,  ...)

Arguments

data

A tibble.

col

The column holding values to count.

category

The value FALSE will force to plot all categories. A character value will focus a selected category. When NULL, in case of boolean values, only the TRUE category is plotted.

ci

Whether to plot error bars for 95% confidence intervals.

limits

The scale limits, autoscaled by default. Set to c(0,100) to make a 100% plot. If the data is binary or focused on a single category, by default a 100% plot is created.

numbers

The values to print on the bars: "n" (frequency), "p" (percentage) or both.

title

If TRUE (default) shows a plot title derived from the column labels. Disable the title with FALSE or provide a custom title as character value.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from plot_counts.

Value

A ggplot object.

Examples

library(volker)
data <- volker::chatgpt

plot_counts_one(data, sd_gender)

Plot frequencies cross tabulated with a metric column that will be split into groups

Description

Plot frequencies cross tabulated with a metric column that will be split into groups

Usage

plot_counts_one_cor(  data,  col,  cross,  category = NULL,  prop = "total",  limits = NULL,  ordered = NULL,  numbers = NULL,  title = TRUE,  labels = TRUE,  clean = TRUE,  ...)

Arguments

data

A tibble.

col

The column holding factor values.

cross

A metric column that will be split into groups at the median.

category

The value FALSE will force to plot all categories. A character value will focus a selected category. When NULL, in case of boolean values, only the TRUE category is plotted.

prop

The basis of percent calculation: "total" (the default), "rows" or "cols". Plotting row or column percentages results in stacked bars that add up to 100%. Whether you set rows or cols determines which variable is in the legend (fill color) and which on the vertical scale.

limits

The scale limits, autoscaled by default. Set to c(0,100) to make a 100% plot.

ordered

The values of the cross column can be nominal (0), ordered ascending (1), or descending (-1). By default (NULL), the ordering is automatically detected. An appropriate color scale should be chosen depending on the ordering. For unordered values, colors from VLKR_FILLDISCRETE are used. For ordered values, shades of the VLKR_FILLGRADIENT option are used.

numbers

The numbers to print on the bars: "n" (frequency), "p" (percentage) or both.

title

If TRUE (default) shows a plot title derived from the column labels. Disable the title with FALSE or provide a custom title as character value.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from plot_counts.

Value

A ggplot object.

Examples

library(volker)
data <- volker::chatgpt

plot_counts_one_cor(data, adopter, sd_age)

Plot frequencies cross tabulated with a grouping column

Description

Plot frequencies cross tabulated with a grouping column

Usage

plot_counts_one_grouped(  data,  col,  cross,  category = NULL,  prop = "total",  width = NULL,  tiles = FALSE,  npmi = FALSE,  limits = NULL,  ordered = NULL,  numbers = NULL,  title = TRUE,  labels = TRUE,  clean = TRUE,  ...)

Arguments

data

A tibble.

col

The column holding factor values.

cross

The column holding groups to split.

category

The value FALSE will force to plot all categories.A character value will focus a selected category.When NULL, in case of boolean values, only the TRUE category is plotted.

prop

The basis of percent calculation: "total" (the default), "rows" or "cols".Plotting row or column percentages results in stacked bars that add up to 100%.Whether you set rows or cols determines which variable is in the legend (fill color)and which on the vertical scale.

width

By default, when setting the prop parameter to "rows" or "cols",the bar or column width reflects the number of cases.You can disable this behavior by setting width to FALSE.

tiles

Set npmi toTRUE to generate a heatmap based on npmi values. Only valid in combination with tiles = TRUE.

limits

The scale limits, autoscaled by default. Set to c(0,100) to make a 100 % plot.

ordered

The values of the cross column can be nominal (0), ordered ascending (1), or descending (-1). By default (NULL), the ordering is automatically detected. An appropriate color scale should be chosen depending on the ordering. For unordered values, colors from VLKR_FILLDISCRETE are used. For ordered values, shades of the VLKR_FILLGRADIENT option are used.

numbers

The numbers to print on the bars or tiles, a vector with one or more of:

  • "n" (frequency),

  • "p" (percentage),

  • "npmi" (normalized pointwise mutual information).

title

If TRUE (default) shows a plot title derived from the column labels. Disable the title with FALSE or provide a custom title as character value.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from plot_counts.

Value

A ggplot object.

Examples

library(volker)
data <- volker::chatgpt
plot_counts_one_grouped(data, adopter, sd_gender)
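The npmi option refers to normalized pointwise mutual information. As a rough sketch of the idea, and not necessarily the exact computation the package performs, NPMI can be derived for each cell of a cross table from joint and marginal proportions in base R. The helper name npmi_table and the toy data are illustrative assumptions.

```r
# Hypothetical helper (not part of volker): NPMI per cell of a two-way table.
# npmi = log(p_xy / (p_x * p_y)) / -log(p_xy)
npmi_table <- function(x, y) {
  counts <- table(x, y)
  p_xy <- counts / sum(counts)          # joint proportions
  p_x <- rowSums(p_xy)                  # marginal proportions of x
  p_y <- colSums(p_xy)                  # marginal proportions of y
  pmi <- log(p_xy / outer(p_x, p_y))    # pointwise mutual information
  npmi <- pmi / -log(p_xy)              # normalize to the range [-1, 1]
  npmi[counts == 0] <- -1               # combinations that never occur
  npmi
}

# Toy data: two perfectly associated variables
adopter <- c("early", "early", "late", "late")
gender  <- c("f", "f", "m", "m")
result <- npmi_table(adopter, gender)
print(round(result, 2))
```

Values near 1 mark combinations that occur together more often than expected under independence, values near -1 mark combinations that rarely or never occur together.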

Output a plot with distribution parameters such as the mean values

Description

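The plot type depends on the number of selected columns:

  • One metric column: see plot_metrics_one

  • Multiple metric columns: see plot_metrics_items

Group comparisons:

  • One metric column and one grouping column: see plot_metrics_one_grouped

  • Multiple metric columns and one grouping column: see plot_metrics_items_grouped

  • Multiple metric columns and multiple grouping columns: see plot_metrics_items_grouped_items (not yet implemented)

By default, if you provide two column selections, the second selection is treated as categorical. Setting the metric parameter to TRUE will call the appropriate functions for correlation analysis:

  • Two metric columns: see plot_metrics_one_cor

  • Multiple metric columns and one metric column: see plot_metrics_items_cor

  • Two metric column selections: see plot_metrics_items_cor_items

Further parameters may be passed to the metric functions; see the respective function help.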

[Experimental]

Usage

plot_metrics(data, cols, cross = NULL, metric = FALSE, clean = TRUE, ...)

Arguments

data

A data frame.

cols

A tidy column selection, e.g. a single column (without quotes) or multiple columns selected by methods such as starts_with().

cross

Optional, a grouping column (without quotes).

metric

When crossing variables, the cross column parameter can contain categorical or metric values. By default, the cross column selection is treated as categorical data. Set metric to TRUE to treat it as metric and calculate correlations.

clean

Prepare data by data_clean.

...

Other parameters passed to the appropriate plot function.

Value

A ggplot object.

Examples

library(volker)
data <- volker::chatgpt
plot_metrics(data, sd_age)

Output averages for multiple variables

Description

Output averages for multiple variables

Usage

plot_metrics_items(
  data,
  cols,
  ci = FALSE,
  box = FALSE,
  limits = NULL,
  title = TRUE,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble containing item measures.

cols

Tidyselect item variables (e.g. starts_with...).

ci

Whether to plot the 95% confidence interval of the mean.

box

Whether to add boxplots.

limits

The scale limits. Set NULL to extract limits from the labels. NOT IMPLEMENTED YET.

title

If TRUE (default) shows a plot title derived from the column labels. Disable the title with FALSE or provide a custom title as character value.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from plot_metrics.

Value

A ggplot object.

Examples

library(volker)
data <- volker::chatgpt
plot_metrics_items(data, starts_with("cg_adoption_"))

Multiple items correlated with one metric variable

Description

Multiple items correlated with one metric variable

Usage

plot_metrics_items_cor(
  data,
  cols,
  cross,
  ci = FALSE,
  method = "pearson",
  title = TRUE,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble containing item measures.

cols

Tidyselect item variables (e.g. starts_with...).

cross

The column to correlate.

ci

Whether to plot confidence intervals of the correlation coefficient.

method

The method of correlation calculation, pearson = Pearson's R, spearman = Spearman's rho.

title

If TRUE (default) shows a plot title derived from the column labels. Disable the title with FALSE or provide a custom title as character value.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from plot_metrics.

Value

A ggplot object.

Examples

library(volker)
data <- volker::chatgpt
plot_metrics_items_cor(data, starts_with("use_"), sd_age)
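To illustrate what the method and ci parameters refer to, the following base R sketch correlates two items with one metric column using stats::cor.test. It is only an approximation of the idea, not the package's implementation; the helper cor_one and the simulated data are assumptions made for this example.

```r
# Toy data: two item columns and one metric column (all hypothetical)
set.seed(1)
age   <- round(runif(50, 18, 80))
item1 <- age / 10 + rnorm(50)
item2 <- rnorm(50)

# Hypothetical helper: coefficient and confidence interval for one item
cor_one <- function(item, cross, method = "pearson") {
  test <- cor.test(item, cross, method = method, exact = FALSE)
  # cor.test() only reports a confidence interval for Pearson's r
  ci <- if (is.null(test$conf.int)) c(NA, NA) else test$conf.int
  data.frame(r = unname(test$estimate), ci_low = ci[1], ci_high = ci[2])
}

result <- rbind(
  item1 = cor_one(item1, age),
  item2 = cor_one(item2, age)
)
print(round(result, 2))

# Spearman's rho for the first item (no confidence interval available here)
rho <- cor_one(item1, age, method = "spearman")
print(rho)
```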

Heatmap for correlations between multiple items

Description

Heatmap for correlations between multiple items

Usage

plot_metrics_items_cor_items(
  data,
  cols,
  cross,
  method = "pearson",
  title = TRUE,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble containing item measures.

cols

Tidyselect item variables (e.g. starts_with...).

cross

Tidyselect item variables to correlate (e.g. starts_with...).

method

The method of correlation calculation, pearson = Pearson's R, spearman = Spearman's rho.

title

If TRUE (default) shows a plot title derived from the column labels. Disable the title with FALSE or provide a custom title as character value.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from plot_metrics.

Value

A ggplot object.

Examples

library(volker)
data <- volker::chatgpt
plot_metrics_items_cor_items(data, starts_with("cg_adoption_adv"), starts_with("use_"))

Output averages for multiple variables compared by a grouping variable

Description

Output averages for multiple variables compared by a grouping variable

Usage

plot_metrics_items_grouped(
  data,
  cols,
  cross,
  limits = NULL,
  title = TRUE,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble containing item measures.

cols

Tidyselect item variables (e.g. starts_with...).

cross

The column holding groups to compare.

limits

The scale limits. Set NULL to extract limits from the labels.

title

If TRUE (default) shows a plot title derived from the column labels. Disable the title with FALSE or provide a custom title as character value.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from plot_metrics.

Value

A ggplot object.

Examples

library(volker)
data <- volker::chatgpt
plot_metrics_items_grouped(data, starts_with("cg_adoption_"), sd_gender)

Correlation of metric items with categorical items

Description

Not yet implemented. The future will come.

Usage

plot_metrics_items_grouped_items(
  data,
  cols,
  cross,
  title = TRUE,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble containing item measures.

cols

Tidyselect item variables (e.g. starts_with...).

cross

Tidyselect item variables (e.g. starts_with...).

title

If TRUE (default) shows a plot title derived from the column labels. Disable the title with FALSE or provide a custom title as character value.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from plot_metrics.

Value

A ggplot object.


Output a density plot for a single metric variable

Description

Output a density plot for a single metric variable

Usage

plot_metrics_one(
  data,
  col,
  ci = FALSE,
  box = FALSE,
  limits = NULL,
  title = TRUE,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble.

col

The column holding metric values.

ci

Whether to plot the confidence interval.

box

Whether to add a boxplot.

limits

The scale limits. Set NULL to extract limits from the label.

title

If TRUE (default) shows a plot title derived from the column labels. Disable the title with FALSE or provide a custom title as character value.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from plot_metrics.

Value

A ggplot object.

Examples

library(volker)
data <- volker::chatgpt
plot_metrics_one(data, sd_age)

Correlate two items

Description

Correlate two items

Usage

plot_metrics_one_cor(
  data,
  col,
  cross,
  limits = NULL,
  log = FALSE,
  jitter = FALSE,
  title = TRUE,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble.

col

The first column holding metric values.

cross

The second column holding metric values.

limits

The scale limits, a list with x and y components, e.g. list(x=c(0,100), y=c(20,100)). Set NULL to extract limits from the labels.

log

Whether to plot log scales.

jitter

Whether to jitter the points.

title

If TRUE (default) shows a plot title derived from the column labels. Disable the title with FALSE or provide a custom title as character value.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from plot_metrics.

Value

A ggplot object.

Examples

library(volker)
data <- volker::chatgpt
plot_metrics_one_cor(data, use_private, sd_age)

Output averages for one metric variable compared by a grouping variable

Description

Output averages for one metric variable compared by a grouping variable

Usage

plot_metrics_one_grouped(
  data,
  col,
  cross,
  ci = FALSE,
  box = FALSE,
  limits = NULL,
  title = TRUE,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble.

col

The column holding metric values.

cross

The column holding groups to compare.

ci

Whether to add error bars with 95% confidence intervals.

box

Whether to add boxplots.

limits

The scale limits. Set NULL to extract limits from the labels.

title

If TRUE (default) shows a plot title derived from the column labels. Disable the title with FALSE or provide a custom title as character value.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from plot_metrics.

Value

A ggplot object.

Examples

library(volker)
data <- volker::chatgpt
plot_metrics_one_grouped(data, sd_age, sd_gender)

Prepare the scale attribute values

Description

Prepare the scale attribute values

Usage

prepare_scale(data)

Arguments

data

A tibble with a scale attribute.

Value

A named list or NULL.


Printing method for volker lists

Description

Printing method for volker lists

Usage

## S3 method for class 'vlkr_list'
print(x, ...)

Arguments

x

The volker list.

...

Further parameters passed to print.

Value

No return value.

Examples

library(volker)
data <- volker::chatgpt
rp <- report_metrics(data, sd_age, sd_gender, effect = TRUE)
print(rp)

Printing method for volker plots

Description

Printing method for volker plots

Usage

## S3 method for class 'vlkr_plt'
print(x, ...)

## S3 method for class 'vlkr_plt'
plot(x, ...)

Arguments

x

The volker plot.

...

Further parameters passed to print().

Value

No return value.

Examples

library(volker)
data <- volker::chatgpt
pl <- plot_metrics(data, sd_age)
print(pl)

Printing method for volker reports

Description

Printing method for volker reports

Usage

## S3 method for class 'vlkr_rprt'
print(x, ...)

Arguments

x

The volker report object.

...

Further parameters passed to print.

Value

No return value.

Examples

library(volker)
data <- volker::chatgpt
rp <- report_metrics(data, sd_age)
print(rp)

Printing method for volker tables.

Description

Printing method for volker tables.

Usage

## S3 method for class 'vlkr_tbl'
print(x, ...)

Arguments

x

The volker table.

...

Further parameters passed to print().

Value

No return value.

Examples

library(volker)
data <- volker::chatgpt
tb <- tab_metrics(data, sd_age)
print(tb)

Create table and plot for categorical variables

Description

Depending on your column selection, different types of plots and tables are generated. See plot_counts and tab_counts.

Usage

report_counts(
  data,
  cols,
  cross = NULL,
  metric = FALSE,
  ids = NULL,
  agree = FALSE,
  index = FALSE,
  effect = FALSE,
  numbers = NULL,
  title = TRUE,
  close = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A data frame.

cols

A tidy column selection, e.g. a single column (without quotes) or multiple columns selected by methods such as starts_with().

cross

Optional, a grouping column (without quotes).

metric

When crossing variables, the cross column parameter can contain categorical or metric values. By default, the cross column selection is treated as categorical data. Set metric to TRUE to treat it as metric and calculate correlations.

ids

A column containing unique identifiers for the cases, used only in combination with the agree parameter.

agree

Setting agree to "reliability" or TRUE adds reliability coefficients to the report (e.g. Kappa). Setting agree to "classification" adds classification performance indicators to the report (e.g. F1). You need to provide a column selection for values in the cols parameter, a column with coders or classification source in the cross parameter, and a column containing unique case ids in the ids parameter.

index

When the cols contain items on a metric scale (as determined by get_direction), an index will be calculated using the 'psych' package. Set to FALSE to suppress index generation.

effect

Whether to report statistical tests and effect sizes. See effect_counts for further parameters.

numbers

The numbers to print on the bars: "n" (frequency), "p" (percentage) or both. Set to NULL to remove numbers.

title

A character providing the heading or TRUE (default) to output a heading. Classes for tabset pills will be added.

close

Whether to close the last tab (default value TRUE) or to keep it open. Keep it open to add further custom tabs by adding headers on the fifth level in Markdown (e.g. ##### Method).

clean

Prepare data by data_clean.

...

Parameters passed to the plot_counts, tab_counts and effect_counts functions.

Details

For item batteries, an index is calculated and reported. When used in combination with the Markdown template "html_report", the different parts of the report are grouped under a tabsheet selector.

[Experimental]

Value

A volker report object.

Examples

library(volker)
data <- volker::chatgpt
report_counts(data, sd_gender)

Create table and plot for metric variables

Description

Depending on your column selection, different types of plots and tables are generated. See plot_metrics and tab_metrics.

Usage

report_metrics(
  data,
  cols,
  cross = NULL,
  metric = FALSE,
  ...,
  index = FALSE,
  factors = FALSE,
  clusters = FALSE,
  model = FALSE,
  effect = FALSE,
  title = TRUE,
  close = TRUE,
  clean = TRUE
)

Arguments

data

A data frame.

cols

A tidy column selection, e.g. a single column (without quotes) or multiple columns selected by methods such as starts_with().

cross

Optional, a grouping or correlation column (without quotes).

metric

When crossing variables, the cross column parameter can contain categorical or metric values. By default, the cross column selection is treated as categorical data. Set metric to TRUE to treat it as metric and calculate correlations. Alternatively, for multivariable models (if the model parameter is TRUE), provide the metric column selection in the metric parameter and the categorical column selection in the cross parameter.

...

Parameters passed to the plot_metrics, tab_metrics and effect_metrics functions.

index

When the cols contain items on a metric scale (as determined by get_direction), an index will be calculated using the 'psych' package. Set to FALSE to suppress index generation.

factors

The number of factors to calculate. Set to FALSE to suppress factor analysis. Set to TRUE to output a scree plot and automatically choose the number of factors. When the cols contain items on a metric scale (as determined by get_direction), factors will be calculated using the 'psych' package. See add_factors.

clusters

The number of clusters to calculate. Clusters are determined using kmeans after scaling the items. Set to FALSE to suppress cluster analysis. Set to TRUE to output a scree plot and automatically choose the number of clusters based on the elbow criterion. See add_clusters.

model

Set to TRUE for multivariable models. The dependent variable must be provided in the first parameter (cols). Independent categorical variables are provided in the second parameter (cross), which supports tidy column selections or vectors of multiple columns. Independent metric variables are provided in the metric parameter as tidy column selections or vectors of multiple columns. Interaction terms are provided in the interactions parameter and passed to the model functions. You can get diagnostic plots by setting the diagnostics parameter to TRUE. See model_metrics_tab, model_metrics_plot and add_model for further options.

effect

Whether to report statistical tests and effect sizes. See effect_metrics for further parameters.

title

A character providing the heading or TRUE (default) to output a heading. Classes for tabset pills will be added.

close

Whether to close the last tab (default value TRUE) or to keep it open. Keep it open to add further custom tabs by adding headers on the fifth level in Markdown (e.g. ##### Method).

clean

Prepare data by data_clean.

Details

For item batteries, an index is calculated and reported. When used in combination with the Markdown template "html_report", the different parts of the report are grouped under a tabsheet selector.

[Experimental]

Value

A volker report object.

Examples

library(volker)
data <- volker::chatgpt
report_metrics(data, sd_age)
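The clusters parameter is described as running kmeans on scaled items and choosing the number of clusters by the elbow criterion. A minimal base R sketch of that general procedure, on simulated data and without claiming to mirror the package internals, could look as follows. All object names are assumptions for this example.

```r
# Toy data: two well separated groups measured on two hypothetical items
set.seed(42)
items <- rbind(
  matrix(rnorm(40, mean = 1), ncol = 2),
  matrix(rnorm(40, mean = 5), ncol = 2)
)

# Standardize the items so that each contributes equally to the distances
scaled <- scale(items)

# Total within-cluster sum of squares for k = 1..6 (the basis of a scree plot)
wss <- sapply(1:6, function(k) {
  kmeans(scaled, centers = k, nstart = 10)$tot.withinss
})
print(round(wss, 1))

# Fit the final solution, here with two clusters
fit <- kmeans(scaled, centers = 2, nstart = 10)
print(table(fit$cluster))
```

The elbow is the value of k after which the within-cluster sum of squares stops dropping sharply; in this toy data that is expected at two clusters.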

Select function

Description

See dplyr::select for details.


A skimmer for boxplot generation

Description

Returns a five point summary, mean and sd, items count and alpha for scales added by add_index(). Additionally, the whiskers are calculated, defined as the minimum and maximum values, respectively, within 1.5 * IQR. Outliers are returned in a list column.

Usage

skim_boxplot(data, ..., .data_name = NULL)
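The whisker rule mentioned in the description can be sketched in base R. The snippet below assumes the conventional boxplot definition (fences at 1.5 * IQR beyond the first and third quartile); the helper boxplot_summary is hypothetical and not the skimmer itself.

```r
# Hypothetical helper: boxplot statistics for a numeric vector
boxplot_summary <- function(x) {
  x <- x[!is.na(x)]
  q <- quantile(x, c(0.25, 0.5, 0.75), names = FALSE)
  iqr <- q[3] - q[1]
  lower_fence <- q[1] - 1.5 * iqr
  upper_fence <- q[3] + 1.5 * iqr
  inside <- x[x >= lower_fence & x <= upper_fence]
  list(
    min = min(x), q1 = q[1], median = q[2], q3 = q[3], max = max(x),
    mean = mean(x), sd = sd(x),
    whisker_low = min(inside),    # smallest value inside the fences
    whisker_high = max(inside),   # largest value inside the fences
    outliers = x[x < lower_fence | x > upper_fence]
  )
}

result <- boxplot_summary(c(1, 2, 3, 4, 5, 6, 7, 8, 9, 50))
str(result)
```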

Calculate a metric by groups

Description

Calculate a metric by groups

Usage

skim_grouped(data, cols, cross, value = "numeric.mean", labels = TRUE)

Arguments

data

A tibble.

cols

The item columns that hold the values to summarize.

cross

The column holding groups to compare.

value

The metric to extract from the skim result, e.g. numeric.mean or numeric.sd.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

Value

A tibble with each item in a row, a total column and columns for all groups.
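The shape of the return value (one row per item, a total column and one column per group) can be illustrated with base R. This is a sketch with made-up column names, using the mean as the summarized metric; it does not use skimr and is not the function's implementation.

```r
# Toy data: two items and one grouping column (all names hypothetical)
df <- data.frame(
  item_a = c(1, 2, 3, 4, 5, 6),
  item_b = c(2, 2, 4, 4, 6, 6),
  group  = c("x", "x", "x", "y", "y", "y")
)
items <- c("item_a", "item_b")

# Mean of each item within each group (rows: groups, columns: items)
group_means <- sapply(df[items], function(col) tapply(col, df$group, mean))

# One row per item: the overall mean plus one column per group
result <- data.frame(
  item = items,
  total = colMeans(df[items]),
  t(group_means),
  row.names = NULL
)
print(result)
```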


A reduced skimmer for metric variables

Description

Returns a five point summary, mean and sd, items count and alpha for scales added by add_index().

Usage

skim_metrics(data, ..., .data_name = NULL)

Value

A skimmer, see skim_with.

Examples

library(volker)
data <- volker::chatgpt
skim_metrics(data)

Select variables by their prefix

Description

See tidyselect::starts_with for details.


Output a frequency table

Description

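The type of frequency table depends on the number of selected columns:

  • One categorical column: see tab_counts_one

  • Multiple categorical columns: see tab_counts_items

Cross tabulations:

  • One categorical column and one grouping column: see tab_counts_one_grouped

  • Multiple categorical columns and one grouping column: see tab_counts_items_grouped

  • Multiple categorical columns and multiple grouping columns: see tab_counts_items_grouped_items

By default, if you provide two column selections, the second column is treated as categorical. Setting the metric parameter to TRUE will call the appropriate functions for correlation analysis:

  • One categorical column and one metric column: see tab_counts_one_cor

  • Multiple categorical columns and one metric column: see tab_counts_items_cor

  • Multiple categorical columns and multiple metric columns: see tab_counts_items_cor_items (not yet implemented)

Further parameters may be passed to the specific count functions; see the respective function help.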

[Experimental]

Usage

tab_counts(data, cols, cross = NULL, metric = FALSE, clean = TRUE, ...)

Arguments

data

A data frame.

cols

A tidy column selection, e.g. a single column (without quotes) or multiple columns selected by methods such as starts_with().

cross

Optional, a grouping column. The column name without quotes.

metric

When crossing variables, the cross column parameter can contain categorical or metric values. By default, the cross column selection is treated as categorical data. Set metric to TRUE to treat it as metric and calculate correlations.

clean

Prepare data by data_clean.

...

Other parameters passed to the appropriate table function.

Value

A volker tibble.

Examples

library(volker)
data <- volker::chatgpt
tab_counts(data, sd_gender)

Output frequencies for multiple variables

Description

Output frequencies for multiple variables

Usage

tab_counts_items(
  data,
  cols,
  ci = FALSE,
  percent = TRUE,
  values = c("n", "p"),
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble containing item measures.

cols

Tidyselect item variables (e.g. starts_with...).

ci

Whether to compute 95% confidence intervals.

percent

Set to FALSE to prevent calculating percents from proportions.

values

The values to output: n (frequency) or p (percentage) or both (the default).

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from tab_counts.

Value

A volker tibble.

Examples

library(volker)
data <- volker::chatgpt
tab_counts_items(data, starts_with("cg_adoption_"))

Compare the values in multiple items by a metric column that will be split into groups

Description

Compare the values in multiple items by a metric column that will be split into groups

Usage

tab_counts_items_cor(
  data,
  cols,
  cross,
  category = NULL,
  split = NULL,
  percent = TRUE,
  values = c("n", "p"),
  title = TRUE,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble containing item measures.

cols

Tidyselect item variables (e.g. starts_with...).

cross

A metric column that will be split into groups at the median value.

category

Summarizing multiple items (the cols parameter) by group requires a focus category. By default, for logical column types, only TRUE values are counted. For other column types, the first category is counted. Accepts both character and numeric values to override default counting behavior.

split

Not implemented yet.

percent

Proportions are formatted as percent by default. Set to FALSE to get bare proportions.

values

The values to output: n (frequency) or p (percentage) or both (the default).

title

If TRUE (default) shows a plot title derived from the column labels. Disable the title with FALSE or provide a custom title as character value.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from plot_counts.

Value

A volker tibble.

Examples

library(volker)
data <- volker::chatgpt
tab_counts_items_cor(
  data, starts_with("cg_adoption_"), sd_age,
  category = c("agree", "strongly agree")
)

Correlation of categorical items with metric items

Description

Not yet implemented. The future will come.

Usage

tab_counts_items_cor_items(
  data,
  cols,
  cross,
  title = TRUE,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble containing item measures.

cols

Tidyselect item variables (e.g. starts_with...).

cross

Tidyselect item variables (e.g. starts_with...).

title

If TRUE (default) shows a plot title derived from the column labels. Disable the title with FALSE or provide a custom title as character value.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from plot_counts.

Value

A volker tibble.


Compare the values in multiple items by a grouping column

Description

Compare the values in multiple items by a grouping column

Usage

tab_counts_items_grouped(
  data,
  cols,
  cross,
  category = NULL,
  percent = TRUE,
  values = c("n", "p"),
  title = TRUE,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble containing item measures.

cols

Tidyselect item variables (e.g. starts_with...).

cross

The column holding groups to compare.

category

Summarizing multiple items (the cols parameter) by group requires a focus category. By default, for logical column types, only TRUE values are counted. For other column types, the first category is counted. Accepts both character and numeric values to override default counting behavior.

percent

Proportions are formatted as percent by default. Set to FALSE to get bare proportions.

values

The values to output: n (frequency) or p (percentage) or both (the default).

title

If TRUE (default) shows a plot title derived from the column labels. Disable the title with FALSE or provide a custom title as character value.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from plot_counts.

Value

A volker tibble.

Examples

library(volker)
data <- volker::chatgpt
tab_counts_items_grouped(
  data, starts_with("cg_adoption_"), adopter,
  category = c("agree", "strongly agree")
)
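To make the idea of a focus category more concrete, the following base R sketch counts, per item and group, how many responses fall into the focus categories. It is one plausible reading of the category parameter, with hypothetical data and with the group size assumed as the denominator for the shares.

```r
# Toy data: two items and a grouping column (all names hypothetical)
df <- data.frame(
  item_a = c("agree", "disagree", "strongly agree", "agree"),
  item_b = c("disagree", "disagree", "agree", "strongly agree"),
  group  = c("x", "x", "y", "y")
)
items <- c("item_a", "item_b")
focus <- c("agree", "strongly agree")

# For each item: number of focus responses per group
n <- sapply(df[items], function(col) tapply(col %in% focus, df$group, sum))

# For each item: share of focus responses within each group
p <- sapply(df[items], function(col) tapply(col %in% focus, df$group, mean))

# Transpose so that items are in rows and groups in columns
print(t(n))
print(t(p))
```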

Correlation of categorical items with categorical items

Description

Correlation of categorical items with categorical items

Usage

tab_counts_items_grouped_items(
  data,
  cols,
  cross,
  method = "cramer",
  category = NULL,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble containing item measures.

cols

Tidyselect item variables (e.g. starts_with...).

cross

Tidyselect item variables (e.g. starts_with...).

method

The method of correlation calculation:

  • cramer for Cramer's V,

  • npmi for Normalized Pointwise Mutual Information. Not implemented yet.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from plot_counts.

Value

A volker tibble.
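For readers unfamiliar with the cramer method, Cramer's V for a pair of categorical variables can be computed from the chi-squared statistic in base R. The helper cramers_v below is a hypothetical textbook version without bias correction; the package may compute the coefficient differently.

```r
# Hypothetical helper: Cramer's V for two categorical vectors
# V = sqrt(chi2 / (n * (k - 1))), with k = min(number of rows, number of columns)
cramers_v <- function(x, y) {
  counts <- table(x, y)
  chi2 <- suppressWarnings(chisq.test(counts, correct = FALSE))$statistic
  k <- min(nrow(counts), ncol(counts))
  unname(sqrt(chi2 / (sum(counts) * (k - 1))))
}

# Perfectly associated variables
a <- c("yes", "yes", "no", "no", "yes", "no")
b <- c("high", "high", "low", "low", "high", "low")
v_perfect <- cramers_v(a, b)

# Independent variables
c1 <- c("yes", "yes", "no", "no")
c2 <- c("high", "low", "high", "low")
v_none <- cramers_v(c1, c2)

print(c(perfect = v_perfect, none = v_none))
```

Applied to two item batteries, such a function would be evaluated once for every pair of a cols item and a cross item.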


Output a frequency table for the values in one column

Description

Output a frequency table for the values in one column

Usage

tab_counts_one(
  data,
  col,
  ci = FALSE,
  percent = TRUE,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble.

col

The column holding values to count.

ci

Whether to compute 95% confidence intervals using stats::prop.test.

percent

Proportions are formatted as percent by default. Set to FALSE to get bare proportions.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from tab_counts.

Value

A volker tibble.

Examples

library(volker)
data <- volker::chatgpt
tab_counts_one(data, sd_gender)
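The ci parameter refers to stats::prop.test. As a sketch of how per-category confidence intervals can be obtained with that function, the following base R code tests each category count against the total number of cases. The data and the exact way the package calls prop.test are assumptions.

```r
# Toy data: a categorical column (values are hypothetical)
gender <- c(rep("female", 40), rep("male", 55), rep("diverse", 5))

counts <- table(gender)
total <- sum(counts)
n <- as.vector(counts)

# One prop.test() per category: the count against the total number of cases
ci <- t(sapply(n, function(k) prop.test(k, total)$conf.int))

result <- data.frame(
  item = names(counts),
  n = n,
  p = n / total,
  ci_low = ci[, 1],
  ci_high = ci[, 2]
)
print(result)
```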

Count values by a metric column that will be split into groups

Description

Count values by a metric column that will be split into groups

Usage

tab_counts_one_cor(
  data,
  col,
  cross,
  prop = "total",
  percent = TRUE,
  values = c("n", "p"),
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble.

col

The column holding factor values.

cross

The metric column that will be split into groups at the median.

prop

The basis of percent calculation: "total" (the default), "cols", or "rows".

percent

Proportions are formatted as percent by default. Set to FALSE to get bare proportions.

values

The values to output: n (frequency) or p (percentage) or both (the default).

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from tab_counts.

Value

A volker tibble.

Examples

library(volker)
data <- volker::chatgpt
tab_counts_one_cor(data, adopter, sd_age)
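The median split of the cross column can be pictured with a few lines of base R. The sketch below uses made-up values and assigns cases equal to the median to the lower group, which is an assumption; the package may handle ties and group labels differently.

```r
# Toy data (hypothetical values)
age     <- c(18, 22, 25, 31, 40, 47, 52, 60)
adopter <- c("yes", "yes", "yes", "no", "yes", "no", "no", "no")

# Split the metric column at its median into a low and a high group
med <- median(age)
age_group <- factor(ifelse(age <= med, "low", "high"), levels = c("low", "high"))

# Cross tabulate the categorical column with the two groups
result <- table(adopter, age_group)
print(result)
```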

Output frequencies cross tabulated with a grouping column

Description

Output frequencies cross tabulated with a grouping column

Usage

tab_counts_one_grouped(
  data,
  col,
  cross,
  prop = "total",
  percent = TRUE,
  values = c("n", "p"),
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble.

col

The column holding factor values.

cross

The column holding groups to split.

prop

The basis of percent calculation: "total" (the default), "cols", or "rows".

percent

Proportions are formatted as percent by default. Set to FALSE to get bare proportions.

values

The values to output: n (frequency) or p (percentage) or both (the default).

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from tab_counts.

Value

A volker tibble.

Examples

library(volker)
data <- volker::chatgpt
tab_counts_one_grouped(data, adopter, sd_gender)
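The three bases of percent calculation named in the prop parameter correspond to the usual total, row and column proportions of a cross table. In base R they can be sketched with prop.table; which margin the package maps to "rows" and "cols" is an assumption here, and the data are made up.

```r
# Toy data (hypothetical values)
adopter <- c("yes", "yes", "no", "yes", "no", "no", "yes", "no")
gender  <- c("f", "f", "f", "m", "m", "m", "m", "m")
counts <- table(adopter, gender)

p_total <- prop.table(counts)              # all cells together sum to 1
p_rows  <- prop.table(counts, margin = 1)  # each row sums to 1
p_cols  <- prop.table(counts, margin = 2)  # each column sums to 1

# Row percentages, rounded to one digit
print(round(100 * p_rows, 1))
```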

Output a table with distribution parameters

Description

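The table type depends on the number of selected columns:

  • One metric column: see tab_metrics_one

  • Multiple metric columns: see tab_metrics_items

Group comparisons:

  • One metric column and one grouping column: see tab_metrics_one_grouped

  • Multiple metric columns and one grouping column: see tab_metrics_items_grouped

  • Multiple metric columns and multiple grouping columns: see tab_metrics_items_grouped_items (not yet implemented)

By default, if you provide two column selections, the second column is treated as categorical. Setting the metric parameter to TRUE will call the appropriate functions for correlation analysis:

  • Two metric columns: see tab_metrics_one_cor

  • Multiple metric columns and one metric column: see tab_metrics_items_cor

  • Two metric column selections: see tab_metrics_items_cor_items

Further parameters may be passed to the specific metric functions; see the respective function help.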

[Experimental]

Usage

tab_metrics(data, cols, cross = NULL, metric = FALSE, clean = TRUE, ...)

Arguments

data

A data frame.

cols

A tidy column selection, e.g. a single column (without quotes) or multiple columns selected by methods such as starts_with().

cross

Optional, a grouping column (without quotes).

metric

When crossing variables, the cross column parameter can contain categorical or metric values. By default, the cross column selection is treated as categorical data. Set metric to TRUE to treat it as metric and calculate correlations.

clean

Prepare data by data_clean.

...

Other parameters passed to the appropriate table function.

Value

A volker tibble.

Examples

library(volker)
data <- volker::chatgpt
tab_metrics(data, sd_age)

Output a five point summary table for multiple items

Description

Output a five point summary table for multiple items

Usage

tab_metrics_items(
  data,
  cols,
  ci = FALSE,
  digits = 1,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble.

cols

The columns holding metric values.

ci

Whether to compute confidence intervals of the mean.

digits

The number of digits to print.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from tab_metrics.

Value

A volker tibble.

Examples

library(volker)
data <- volker::chatgpt
tab_metrics_items(data, starts_with("cg_adoption_"))

Output a correlation table for item battery and one metric variable

Description

[Experimental]

Usage

tab_metrics_items_cor(
  data,
  cols,
  cross,
  method = "pearson",
  digits = 2,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble.

cols

The source columns.

cross

The target columns or NULL to calculate correlations within the source columns.

method

The output metrics, pearson = Pearson's R, spearman = Spearman's rho.

digits

The number of digits to print.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from tab_metrics.

Value

A volker tibble.

Examples

library(volker)
data <- volker::chatgpt
tab_metrics_items_cor(
  data,
  starts_with("cg_adoption_adv"),
  sd_age,
  metric = TRUE
)

Output a correlation table for item battery and item battery

Description

[Experimental]

Usage

tab_metrics_items_cor_items(
  data,
  cols,
  cross,
  method = "pearson",
  digits = 2,
  ci = FALSE,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble.

cols

The source columns.

cross

The target columns or NULL to calculate correlations within the source columns.

method

The output metrics, pearson = Pearson's R, spearman = Spearman's rho.

digits

The number of digits to print.

ci

Whether to calculate 95% confidence intervals of the correlation coefficient.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from tab_metrics.

Value

A volker tibble.

Examples

library(volker)
data <- volker::chatgpt
tab_metrics_items_cor_items(
  data,
  starts_with("cg_adoption_adv"),
  starts_with("use"),
  metric = TRUE
)

Output the means for groups in one or multiple columns

Description

Output the means for groups in one or multiple columns

Usage

tab_metrics_items_grouped(
  data,
  cols,
  cross,
  digits = 1,
  values = c("m", "sd"),
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble.

cols

The item columns that hold the values to summarize.

cross

The column holding groups to compare.

digits

The number of digits to print.

values

The output metrics, mean (m), the standard deviation (sd) or both (the default).

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from tab_metrics.

Value

A volker tibble.

Examples

library(volker)
data <- volker::chatgpt
tab_metrics_items_grouped(data, starts_with("cg_adoption_"), sd_gender)
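The kind of table described here, with one row per item and one column per group, can be approximated in base R. The sketch below is only an illustration using the built-in mtcars data; the function name grouped_means_sketch and the "mean (sd)" cell format are assumptions, not the package's actual output.

```r
# Hypothetical helper: mean and standard deviation of several items per group
grouped_means_sketch <- function(data, cols, cross, digits = 1) {
  # One data frame of item columns per group
  groups <- split(data[cols], data[[cross]])

  # For each group, format every item as "mean (sd)"
  cells <- lapply(groups, function(g) {
    m <- vapply(g, mean, numeric(1), na.rm = TRUE)
    s <- vapply(g, stats::sd, numeric(1), na.rm = TRUE)
    paste0(
      formatC(m, format = "f", digits = digits),
      " (",
      formatC(s, format = "f", digits = digits),
      ")"
    )
  })

  # Items in rows, groups in columns
  data.frame(item = cols, do.call(cbind, cells), check.names = FALSE)
}

res <- grouped_means_sketch(mtcars, c("mpg", "wt"), "cyl")
print(res)
```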

Correlation of metric items with categorical items

Description

Not yet implemented. The future will come.

Usage

tab_metrics_items_grouped_items(
  data,
  cols,
  cross,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble containing item measures.

cols

Tidyselect item variables (e.g. starts_with...).

cross

Tidyselect item variables (e.g. starts_with...).

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from plot_metrics.

Value

A volker tibble.


Output a five point summary table for the values in multiple columns

Description

Output a five point summary table for the values in multiple columns

Usage

tab_metrics_one(
  data,
  col,
  ci = FALSE,
  digits = 1,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble.

col

The column holding metric values.

ci

Whether to calculate 95% confidence intervals of the mean.

digits

The number of digits to print.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from tab_metrics.

Value

A volker tibble.

Examples

library(volker)
data <- volker::chatgpt
tab_metrics_one(data, sd_age)
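For orientation, a five-point summary with an optional confidence interval of the mean can be computed in base R as sketched below. Which statistics the package reports, and how it computes the interval, is not stated on this page; the selection here (minimum, quartiles, maximum, mean, standard deviation, n, and a t-based interval) and the name five_point_sketch are assumptions.

```r
# Hypothetical helper: five-point summary plus mean, sd and n
five_point_sketch <- function(x, ci = FALSE, digits = 1) {
  x <- x[!is.na(x)]
  out <- c(
    min = min(x),
    q1 = unname(stats::quantile(x, 0.25)),
    median = stats::median(x),
    q3 = unname(stats::quantile(x, 0.75)),
    max = max(x),
    mean = mean(x),
    sd = stats::sd(x),
    n = length(x)
  )
  if (ci) {
    # t-based 95% confidence interval around the mean (an assumption)
    half <- stats::qt(0.975, df = length(x) - 1) * stats::sd(x) / sqrt(length(x))
    out <- c(out, ci_low = mean(x) - half, ci_high = mean(x) + half)
  }
  round(out, digits)
}

res <- five_point_sketch(c(18, 22, 25, 31, 40, NA), ci = TRUE)
print(res)
```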

Correlate two columns

Description

Correlate two columns

Usage

tab_metrics_one_cor(
  data,
  col,
  cross,
  method = "pearson",
  ci = FALSE,
  digits = 2,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble.

col

The first column holding metric values.

cross

The second column holding metric values.

method

The output metrics, TRUE or pearson = Pearson's R, spearman = Spearman's rho.

ci

Whether to output confidence intervals.

digits

The number of digits to print.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from tab_counts.

Value

A volker tibble.

Examples

library(volker)
data <- volker::chatgpt
tab_metrics_one_cor(data, use_private, sd_age)
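The underlying statistics can be reproduced with stats::cor.test, as in the following sketch. It uses mtcars instead of the chatgpt dataset; the function name correlate_two_sketch and the output column names are assumptions and do not reflect the package's actual table layout.

```r
# Hypothetical helper: correlation of two columns with an optional 95% CI
correlate_two_sketch <- function(data, col, cross, method = "pearson", digits = 2) {
  test <- stats::cor.test(data[[col]], data[[cross]], method = method)
  out <- data.frame(
    item1 = col,
    item2 = cross,
    n = sum(stats::complete.cases(data[, c(col, cross)])),
    r = round(unname(test$estimate), digits),
    p = test$p.value
  )
  # cor.test() only reports a confidence interval for Pearson's r
  if (!is.null(test$conf.int)) {
    out$ci_low <- round(test$conf.int[1], digits)
    out$ci_high <- round(test$conf.int[2], digits)
  }
  out
}

result <- correlate_two_sketch(mtcars, "mpg", "wt")
print(result)
```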

Output a five point summary for groups

Description

Output a five point summary for groups

Usage

tab_metrics_one_grouped(
  data,
  col,
  cross,
  ci = FALSE,
  digits = 1,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble.

col

The column holding metric values.

cross

The column holding groups to compare.

ci

Whether to output 95% confidence intervals.

digits

The number of digits to print.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from tab_metrics.

Value

A volker tibble.

Examples

library(volker)
data <- volker::chatgpt
tab_metrics_one_grouped(data, sd_age, sd_gender)
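A grouped summary of this kind amounts to splitting the metric column by the grouping column and summarizing each part. The base R sketch below shows the idea with mtcars; the name grouped_summary_sketch and the chosen statistics are assumptions rather than the package's exact output.

```r
# Hypothetical helper: one summary row per group
grouped_summary_sketch <- function(data, col, cross, digits = 1) {
  groups <- split(data[[col]], data[[cross]])
  rows <- lapply(names(groups), function(g) {
    x <- groups[[g]]
    x <- x[!is.na(x)]
    data.frame(
      group = g,
      min = min(x),
      q1 = unname(stats::quantile(x, 0.25)),
      median = stats::median(x),
      q3 = unname(stats::quantile(x, 0.75)),
      max = max(x),
      mean = round(mean(x), digits),
      sd = round(stats::sd(x), digits),
      n = length(x)
    )
  })
  do.call(rbind, rows)
}

res <- grouped_summary_sketch(mtcars, "mpg", "cyl")
print(res)
```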

Get, set, and modify the active ggplot theme

Description

See ggplot2::theme_set for details.


Define a default theme for volker plots

Description

Set ggplot colors, sizes and layout parameters.

Usage

theme_vlkr(
  base_size = 11,
  base_color = "black",
  base_fill = VLKR_FILLDISCRETE,
  base_gradient = VLKR_FILLGRADIENT
)

Arguments

base_size

Base font size.

base_color

Base font color.

base_fill

A list of fill color sets or at least one fill color set. Example: list(c("red"), c("red", "blue", "green")). Each set can contain different numbers of colors. Depending on the number of colors needed, the set with at least the number of required colors is used. The first color is always used for simple bar charts.

base_gradient

A color vector used for creating gradient fill colors, e.g. in stacked bar plots.

Details

[Experimental]

Value

A theme function.

Examples

library(volker)
library(ggplot2)
data <- volker::chatgpt
theme_set(theme_vlkr(base_size = 15, base_fill = list("red")))
plot_counts(data, sd_gender)

Tidy tibbles

Description

See tibble::tibble for details.


Tidy lm results, replace categorical parameter names by their levels and add the reference level

Description

Tidy lm results, replace categorical parameter names by their levels and add the reference level

Usage

tidy_lm_levels(fit)

Arguments

fit

Result of an lm call.

Value

A tibble with regression parameters.

Author(s)

Created with the help of ChatGPT.


Tidy tribbles

Description

See tibble::tribble for details.


Remove trailing zeros and trailing or leading whitespaces, colons, hyphens and underscores

Description

Remove trailing zeros and trailing or leading whitespaces, colons, hyphens and underscores

Usage

trim_label(x)

Arguments

x

A character value.

Value

The trimmed character value.
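The described trimming can be expressed with two regular expressions. The following is a minimal sketch under the assumption that zeros are only stripped at the end and the other characters at both ends; trim_label_sketch is a hypothetical name, and the package's own patterns may differ.

```r
# Hypothetical re-implementation of the described trimming rules
trim_label_sketch <- function(x) {
  # Strip trailing whitespace, colons, underscores, zeros and hyphens
  x <- sub("[[:space:]:_0-]+$", "", x)
  # Strip leading whitespace, colons, underscores and hyphens
  x <- sub("^[[:space:]:_-]+", "", x)
  x
}

labels <- trim_label_sketch(c(" - Usage: ", "cg_adoption_", "adoption_0"))
print(labels)
```

Note that in this sketch any trailing zero is removed, so a label ending in a number such as "10" would lose its last digit.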


Remove a prefix from a character vector or a factor

Description

If the resulting character values would be empty, the prefix is returned. At the end, all items in the vector are trimmed using trim_label.

Usage

trim_prefix(x, prefix = TRUE)

Arguments

x

A character or factor vector.

prefix

The prefix. Set to TRUE to first extract the prefix.

Details

If x is a factor, the order of factor levels is retained.

Value

The trimmed character or factor vector.
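The rules above (extract a common prefix, fall back to the prefix for values that would become empty, keep factor level order) can be sketched in base R as follows. Both function names are hypothetical, and the prefix detection by character-wise comparison is an assumption about how prefix = TRUE might work, not the package's code.

```r
# Hypothetical helper: longest common prefix of all values
common_prefix_sketch <- function(x) {
  x <- as.character(x)
  if (length(x) == 0) return("")
  chars <- strsplit(x, "")
  n <- min(lengths(chars))
  i <- 0
  # Advance while all values share the same character at position i + 1
  while (i < n && length(unique(vapply(chars, `[`, "", i + 1))) == 1) {
    i <- i + 1
  }
  substr(x[1], 1, i)
}

# Hypothetical re-implementation of the described prefix removal
trim_prefix_sketch <- function(x, prefix = TRUE) {
  values <- if (is.factor(x)) levels(x) else x
  if (isTRUE(prefix)) prefix <- common_prefix_sketch(values)

  trimmed <- ifelse(
    startsWith(values, prefix),
    substring(values, nchar(prefix) + 1),
    values
  )
  # Values that would become empty keep the prefix instead
  trimmed <- ifelse(trimmed == "", prefix, trimmed)
  # Final cleanup of leading and trailing separators
  trimmed <- gsub("^[[:space:]:_-]+|[[:space:]:_-]+$", "", trimmed)

  if (is.factor(x)) {
    levels(x) <- trimmed  # level order is retained
    x
  } else {
    trimmed
  }
}

print(trim_prefix_sketch(c("Usage: private", "Usage: work")))
print(trim_prefix_sketch(c("Usage", "Usage: work")))
```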


Truncate labels

Description

Truncate labels that exceed a specified maximum length.

Usage

trunc_labels(x, max_length = 20)

Arguments

x

A character vector.

max_length

Maximum length, default is 20. The ellipsis "..." is appended to shortened labels.

Value

A character vector with truncated labels.
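A minimal base R sketch of such a truncation is shown below. It assumes that labels are cut after max_length characters and the ellipsis is appended afterwards, so shortened labels end up three characters longer than max_length; whether the package counts the ellipsis toward the limit is not stated here. The name trunc_labels_sketch is hypothetical.

```r
# Hypothetical re-implementation of the described truncation
trunc_labels_sketch <- function(x, max_length = 20) {
  too_long <- !is.na(x) & nchar(x) > max_length
  # Cut after max_length characters, then append the ellipsis
  x[too_long] <- paste0(substr(x[too_long], 1, max_length), "...")
  x
}

labels <- trunc_labels_sketch(c("short", "abcdefghijklmnopqrstuvwxyz"), max_length = 10)
print(labels)
```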


Interpolate an alpha value based on case numbers

Description

Interpolate an alpha value based on case numbers

Usage

vlkr_alpha_interpolated(
  n,
  n_min = 20,
  n_max = 100,
  alpha_min = VLKR_POINT_ALPHA,
  alpha_max = 1
)

Arguments

n

Number of cases

n_min

The case number where the minimum alpha value starts

n_max

The case number where the maximum alpha value ends

alpha_min

The minimum alpha value

alpha_max

The maximum alpha value

Value

A value between the minimum and the maximum alpha value
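The interpolation can be pictured as a clamped linear mapping from case numbers to alpha values. The sketch below assumes that few cases get the maximum alpha (opaque points) and many cases get the minimum alpha (transparent points); the parameter descriptions above do not settle the direction, so treat it as an assumption. The default alpha_min = 0.3 is a placeholder because the value of VLKR_POINT_ALPHA is not given here, and the function name is hypothetical.

```r
# Hypothetical re-implementation: clamped linear interpolation of alpha
alpha_interpolated_sketch <- function(n, n_min = 20, n_max = 100,
                                      alpha_min = 0.3, alpha_max = 1) {
  # Position of n between n_min and n_max, clamped to [0, 1]
  share <- (n - n_min) / (n_max - n_min)
  share <- pmin(pmax(share, 0), 1)
  # Assumed direction: more cases lead to a lower alpha
  alpha_max - share * (alpha_max - alpha_min)
}

alphas <- alpha_interpolated_sketch(c(10, 20, 60, 100, 500))
print(alphas)
```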


Given a vector of hex fill colors, choose an appropriate color for text

Description

Given a vector of hex fill colors, choose an appropriate color for text

Usage

vlkr_colors_contrast(colors, threshold = 0.5)

Arguments

colors

A vector of hex colors

threshold

Luminance threshold for choosing black over white

Value

A vector of the same length as colors with white or black colors.
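One common way to make this choice is to compute a luminance value per fill color and compare it to the threshold. The sketch below uses the Rec. 601 luma weights on the RGB channels and returns the color names "black" and "white"; both the formula and the return format are assumptions for illustration, since the package's calculation is not documented here.

```r
# Hypothetical re-implementation: pick black or white text per fill color
contrast_text_color_sketch <- function(colors, threshold = 0.5) {
  # RGB channels scaled to [0, 1], one column per color
  rgb <- grDevices::col2rgb(colors) / 255
  # Rec. 601 luma as a simple luminance estimate (an assumption)
  luminance <- 0.299 * rgb["red", ] + 0.587 * rgb["green", ] + 0.114 * rgb["blue", ]
  # Light fills get black text, dark fills get white text
  unname(ifelse(luminance > threshold, "black", "white"))
}

text_colors <- contrast_text_color_sketch(c("#FFFFFF", "#000000", "#FFFF00", "#1F77B4"))
print(text_colors)
```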


Get colors for discrete scales

Description

If the option ggplot2.discrete.fill is set, gets color values from the first list item that has enough colors and reverses them to start filling from the left in grouped bar charts.

Usage

vlkr_colors_discrete(n = NULL, inv = FALSE)

Arguments

n

Number of colors.

inv

Whether to get a text color with good contrast on the chosen fill colors.

Details

Falls back to scale_fill_hue().

Value

A vector of colors.
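The selection rule described above (first color set with enough colors, reversed) can be sketched as follows. The function name pick_fill_set_sketch is hypothetical; returning NULL stands in for the fallback to the default hue scale, and whether the package shortens the chosen set to exactly n colors is not stated, so the sketch returns the whole set.

```r
# Hypothetical helper: choose the first color set with at least n colors
pick_fill_set_sketch <- function(sets, n) {
  enough <- which(lengths(sets) >= n)
  # No set is large enough: the caller would fall back to a default palette
  if (length(enough) == 0) return(NULL)
  # Reverse the chosen set, as described for grouped bar charts
  rev(sets[[enough[1]]])
}

sets <- list(c("red"), c("red", "blue", "green"))
print(pick_fill_set_sketch(sets, 1))
print(pick_fill_set_sketch(sets, 2))
```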


Get colors for polarized scales

Description

Creates a gradient scale based on VLKR_FILLPOLARIZED.

Usage

vlkr_colors_polarized(n = NULL, inv = FALSE)

Arguments

n

Number of colors or NULL to get the raw colors from the config

inv

Whether to get a text color with good contrast on the chosen fill colors.

Value

A vector of colors.


Get colors for sequential scales

Description

Creates a gradient scale based on VLKR_FILLGRADIENT.

Usage

vlkr_colors_sequential(n = NULL, inv = FALSE)

Arguments

n

Number of colors or NULL to get the raw colors from the config

inv

Whether to get a text color with good contrast on the chosen fill colors.

Value

A vector of colors.
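Expanding a small set of base colors into n gradient colors is typically done by interpolation. The sketch below uses grDevices::colorRampPalette for this; the base colors are placeholders because the contents of VLKR_FILLGRADIENT are not listed on this page, and sequential_colors_sketch is a hypothetical name.

```r
# Placeholder base colors (the actual VLKR_FILLGRADIENT values are not shown here)
base_gradient <- c("#E5F7FF", "#96DFDE", "#008B8B", "#006363", "#001212")

# Hypothetical re-implementation: raw colors for n = NULL, otherwise interpolated
sequential_colors_sketch <- function(n = NULL, base = base_gradient) {
  if (is.null(n)) return(base)
  grDevices::colorRampPalette(base)(n)
}

gradient <- sequential_colors_sketch(7)
print(gradient)
```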


Wrap a string

Description

Wrap a string

Usage

wrap_label(x, width = 40)

Arguments

x

A character vector.

width

The number of chars after which to break.

Value

A character vector with wrapped strings.
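In base R, comparable wrapping can be achieved with strwrap, as in the following sketch. It is not the package's implementation; wrap_label_sketch is a hypothetical name, and strwrap breaks at whitespace only, so single words longer than width stay on one line.

```r
# Hypothetical re-implementation: wrap each label and join lines with newlines
wrap_label_sketch <- function(x, width = 40) {
  vapply(
    x,
    function(label) paste(strwrap(label, width = width), collapse = "\n"),
    character(1),
    USE.NAMES = FALSE
  )
}

wrapped <- wrap_label_sketch("one two three four", width = 10)
cat(wrapped, "\n")
```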


Combine two identically shaped data frames by adding values of each column from the second data frame into the corresponding column in the first data frame using parentheses

Description

Combine two identically shaped data frames by adding values of each column from the second data frame into the corresponding column in the first data frame using parentheses

Usage

zip_tables(x, y, newline = TRUE, brackets = FALSE)

Arguments

x

The first data frame.

y

The second data frame.

newline

Whether to add a new line character between the values (default: TRUE).

brackets

Whether to set the secondary values in brackets (default: FALSE).

Value

A combined data frame.
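The combination can be sketched in base R by pasting the cells of both data frames column by column. The sketch assumes that every column is combined, that secondary values appear in parentheses by default, and that brackets = TRUE switches to square brackets; these details are not spelled out above, and zip_tables_sketch is a hypothetical name.

```r
# Hypothetical re-implementation: paste cells of y next to the cells of x
zip_tables_sketch <- function(x, y, newline = TRUE, brackets = FALSE) {
  stopifnot(identical(dim(x), dim(y)))
  sep <- if (newline) "\n" else " "
  open <- if (brackets) "[" else "("    # assumed meaning of brackets
  close <- if (brackets) "]" else ")"
  out <- x
  for (j in seq_along(x)) {
    out[[j]] <- paste0(x[[j]], sep, open, y[[j]], close)
  }
  out
}

means <- data.frame(m = c(1.5, 2.5), n = c(10, 20))
sds <- data.frame(m = c(0.5, 0.7), n = c(1, 2))
zipped <- zip_tables_sketch(means, sds, newline = FALSE)
print(zipped)
```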

