| Title: | A Toolkit for Behavioral Scientists |
| Version: | 0.6.1 |
| Description: | A collection of functions for analyzing data typically collected or used by behavioral scientists. Examples of the functions include a function that compares groups in a factorial experimental design, a function that conducts two-way analysis of variance (ANOVA), and a function that cleans a data set generated by Qualtrics surveys. Some of the functions will require installing additional package(s). Such packages and other references are cited within the section describing the relevant functions. Many functions in this package rely heavily on these two popular R packages: Dowle et al. (2021), https://CRAN.R-project.org/package=data.table, and Wickham et al. (2021), https://CRAN.R-project.org/package=ggplot2. |
| License: | GPL-3 |
| URL: | https://github.com/jinkim3/kim, https://jinkim.science |
| BugReports: | https://github.com/jinkim3/kim/issues |
| Imports: | data.table, remotes |
| Suggests: | boot, ggplot2, moments, testthat (≥ 3.0.0) |
| Encoding: | UTF-8 |
| RoxygenNote: | 7.3.0 |
| Config/testthat/edition: | 3 |
| NeedsCompilation: | no |
| Packaged: | 2025-02-26 08:04:13 UTC; a |
| Author: | Jin Kim |
| Maintainer: | Jin Kim <jinkim@aya.yale.edu> |
| Repository: | CRAN |
| Date/Publication: | 2025-02-26 12:40:02 UTC |
Akaike Weights
Description
Compare adequacy of different models by calculating their Akaike weights and the associated evidence ratio.
Usage
akaike_weights(aic_values = NULL, print_output_explanation = TRUE)
Arguments
aic_values | a vector of AIC values |
print_output_explanation | logical. Should an explanation about how to read the output be printed? (default = TRUE). |
Details
Please refer to Wagenmakers & Farrell (2004), doi:10.3758/BF03206482.
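For orientation, Akaike weights are a simple transformation of AIC differences (Wagenmakers & Farrell, 2004). A minimal base-R sketch of the calculation (not necessarily the package's internal code):
aic_values <- c(204, 202, 206, 206, 214)
delta <- aic_values - min(aic_values)  # AIC differences from the best (lowest-AIC) model
weights <- exp(-0.5 * delta) / sum(exp(-0.5 * delta))  # Akaike weights
max(weights) / weights  # evidence ratios relative to the best model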
Value
the output will be a data.table showing AIC weights, their evidence ratio(s), etc.
Examples
# default reference AIC value is the minimum AIC value, e.g., 202 below.
akaike_weights(c(204, 202, 206, 206, 214))
Assign function parameters as values
Description
Take a function and assign all the parameters defined within it as values in the specified environment (e.g., global environment)
Usage
assign_fn_parameters_as_vars(fun = NULL, envir = NULL)
Arguments
fun | a function |
envir | an environment in which to assign the parameters as values (default = |
Details
This function can be useful when you are testing a function and you need to set all the function's parameters in a single operation.
Examples
## Not run:
assign_fn_parameters_as_vars(pm)
assign_fn_parameters_as_vars(mean)
assign_fn_parameters_as_vars(sum)
assign_fn_parameters_as_vars(lm)
assign_fn_parameters_as_vars(floodlight_2_by_continuous)
## End(Not run)
Barplot for counts
Description
Barplot for counts
Usage
barplot_for_counts(data = NULL, x, y)
Arguments
data | a data object (a data frame or a data.table) |
x | name of the variable that will be on the x axis of the barplot |
y | name of the variable that will be on the y axis of the barplot |
Examples
barplot_for_counts(x = 1:3, y = 7:9)
barplot_for_counts(data = data.frame(cyl = names(table(mtcars$cyl)),
count = as.vector(table(mtcars$cyl))), x = "cyl", y = "count")
Binomial test
Description
Conduct a binomial test. In other words, test whether an observed proportion of "successes" (e.g., proportion of heads in a series of coin tosses) is greater than the expected proportion (e.g., 0.5). This function uses the 'binom.test' function from the 'stats' package.
Usage
binomial_test(x = NULL, success = NULL, failure = NULL, p = 0.5,
alternative = "two.sided", ci = 0.95, round_percentages = 0)
Arguments
x | a vector of values, each of which represents an instance of either a "success" or "failure" (e.g., c("s", "f", "s", "s", "f", "s")) |
success | which value(s) indicate "successes"? |
failure | (optional) which value(s) indicate "failures"? If no input is provided for this argument, then all the non-NA values that are not declared to be "successes" will be treated as "failures". |
p | hypothesized probability of success (default = 0.5) |
alternative | indicates the alternative hypothesis and must be one of "two.sided", "greater", or "less". You can specify just the initial letter. By default, |
ci | width of the confidence interval (default = 0.95) |
round_percentages | number of decimal places to which to round the percentages in the summary table (default = 0) |
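Because the function is a wrapper around 'binom.test' from the 'stats' package, the core test can be reproduced directly in base R. A minimal sketch (the tallying of successes and failures here is illustrative, not the package's internal code):
sample_vector <- c(0, 1, 1, 0, 1, 98, 98, 99, NA)
x <- sample_vector[sample_vector %in% c(0, 1)]  # keep only the declared success/failure values
binom.test(x = sum(x == 1), n = length(x), p = 0.5, alternative = "two.sided")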
Examples
# sample vector
sample_vector <- c(0, 1, 1, 0, 1, 98, 98, 99, NA)
binomial_test(x = sample_vector, success = 1, failure = 0)
binomial_test(x = sample_vector, success = 1, failure = 0,
p = 0.1, alternative = "greater")
binomial_test(x = sample_vector, success = c(1, 99), failure = c(0, 98),
p = 0.6, alternative = "less")
Draw a bracket on a ggplot
Description
Draw a square bracket with a label on a ggplot
Usage
bracket(xmin = NULL, xmax = NULL, ymin = NULL, ymax = NULL, vertical = NULL,
horizontal = NULL, open = NULL, bracket_shape = NULL, thickness = 2,
bracket_color = "black", label = NULL, label_hjust = NULL, label_vjust = NULL,
label_font_size = 5, label_font_face = "bold", label_color = "black",
label_parse = FALSE)
Arguments
xmin | xmin |
xmax | xmax |
ymin | ymin |
ymax | ymax |
vertical | vertical |
horizontal | horizontal |
open | open |
bracket_shape | bracket_shape |
thickness | thickness |
bracket_color | bracket_color |
label | label |
label_hjust | label_hjust |
label_vjust | label_vjust |
label_font_size | label_font_size |
label_font_face | label_font_face |
label_color | label_color |
label_parse | label_parse |
Value
a ggplot object; there will be no meaningful output from this function. Instead, this function should be used with another ggplot object
Examples
library(ggplot2)
ggplot(mtcars, aes(x = cyl, y = mpg)) + geom_point() +
bracket(6.1, 6.2, 17, 22, bracket_shape = "]", label = "abc")
Capitalize a substring
Description
Capitalizes the first letter (by default) or a substring of a given character string or each element of the character vector
Usage
capitalize(x, start = 1, end = 1)
Arguments
x | a character string or a character vector |
start | starting position of the substring (default = 1) |
end | ending position of the substring (default = 1) |
Value
a character string or a character vector
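For orientation, the core of such capitalization in base R is replacement via substr(); a minimal sketch (not necessarily the package's internal code):
x <- "abc"
substr(x, 1, 1) <- toupper(substr(x, 1, 1))  # capitalize the substring from position 1 to 1
x  # "Abc"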
Examples
capitalize("abc")capitalize(c("abc", "xyx"), start = 2, end = 3)Change variable names in a data set
Description
Change variable names in a data set
Usage
change_var_names(data = NULL, old_var_names = NULL, new_var_names = NULL,
skip_absent = FALSE, print_summary = TRUE, output_type = "dt")
Arguments
data | a data object (a data frame or a data.table) |
old_var_names | a vector of old variable names (i.e., variable names to change) |
new_var_names | a vector of new variable names |
skip_absent | If |
print_summary | If |
output_type | type of the output. If |
Value
a data.table object with changed variable names
Examples
change_var_names(mtcars, old = c("mpg", "cyl"), new = c("mpg_new", "cyl_new"))
Check modes of objects
Description
Check modes of objects
Usage
check_modes(..., mode_to_confirm = NULL)
Arguments
... | R objects. |
mode_to_confirm | The function will test whether each input is of this mode. For example, |
Examples
check_modes(1L, mode_to_confirm = "numeric")
check_modes(TRUE, FALSE, 1L, 1:3, 1.1, c(1.2, 1.3), "abc", 1 + 2i, intToBits(1L),
mode_to_confirm = "numeric")
Check for required packages
Description
Check whether required packages are installed.
Usage
check_req_pkg(pkg = NULL)
Arguments
pkg | a character vector containing names of packages to check |
Value
there will be no output from this function. Rather, the function will check whether the packages given as inputs are installed.
Examples
check_req_pkg("data.table")check_req_pkg(c("base", "utils", "ggplot2", "data.table"))Chi-squared test
Description
Conduct a chi-squared test and produce a contingency table
Usage
chi_squared_test(data = NULL, iv_name = NULL, dv_name = NULL,
round_chi_sq_test_stat = 2, round_p = 3, sigfigs_proportion = 2,
correct = TRUE, odds_ratio_ci = 0.95, round_odds_ratio_ci_limits = 2,
invert = FALSE, notify_na_count = NULL, save_as_png = FALSE, png_name = NULL,
width = 1200, height = 800, units = "px", res = 200, layout_matrix = NULL)
Arguments
data | a data object (a data frame or a data.table) |
iv_name | name of the independent variable |
dv_name | name of the dependent variable (must be a binary variable) |
round_chi_sq_test_stat | number of decimal places to which to round the chi-squared test statistic (default = 2) |
round_p | number of decimal places to which to round the p-value from the chi-squared test (default = 3) |
sigfigs_proportion | number of significant digits to round to (for the table of proportions). By default |
correct | logical. Should continuity correction be applied? (default = TRUE) |
odds_ratio_ci | width of the confidence interval for the odds ratio. Input can be any value less than 1 and greater than or equal to 0. By default, |
round_odds_ratio_ci_limits | number of decimal places to which to round the limits of the odds ratio's confidence interval (default = 2) |
invert | logical. Whether the inverse of the odds ratio (i.e., 1 / odds ratio) should be returned. |
notify_na_count | if |
save_as_png | if |
png_name | name of the PNG file to be saved. By default, the name will be "chi_sq_" followed by a timestamp of the current time. The timestamp will be in the format, jan_01_2021_1300_10_000001, where "jan_01_2021" would indicate January 01, 2021; 1300 would indicate 13:00 (i.e., 1 PM); and 10_000001 would indicate 10.000001 seconds after the hour. |
width | width of the PNG file (default = 1200) |
height | height of the PNG file (default = 800) |
units | the units for the |
res | The nominal resolution in ppi which will be recorded in the png file, if a positive integer. Used for units other than the default. By default, |
layout_matrix | The layout argument for arranging section titles and tables using the |
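For orientation, the core test can be reproduced with base R's 'chisq.test'; a minimal sketch (not necessarily the package's internal code, and without the formatted contingency table):
tbl <- table(mtcars$cyl, mtcars$am)  # contingency table of IV by DV
chisq.test(tbl, correct = TRUE)      # correct = TRUE applies continuity correction for 2 x 2 tables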
Examples
chi_squared_test(data = mtcars, iv_name = "cyl", dv_name = "am")
# if the iv has only two levels, odds ratio will also be calculated
chi_squared_test(data = mtcars, iv_name = "vs", dv_name = "am")
Chi-squared test, pairwise
Description
Conducts a chi-squared test for every possible pairwise comparison with Bonferroni correction
Usage
chi_squared_test_pairwise(data = NULL, iv_name = NULL, dv_name = NULL,
focal_dv_value = NULL, contingency_table = TRUE, contingency_table_sigfigs = 2,
percent_and_total = FALSE, percentages_only = NULL, counts_only = NULL,
sigfigs = 3, chi_sq_test_stats = FALSE, correct = TRUE, save_as_png = FALSE,
png_name = NULL, width = 2000, height = 800, units = "px", res = 200,
layout_matrix = NULL)
Arguments
data | a data object (a data frame or a data.table) |
iv_name | name of the independent variable (must be a categorical variable) |
dv_name | name of the dependent variable (must be a binary variable) |
focal_dv_value | focal value of the dependent variable whose frequencies will be calculated (i.e., the value of the dependent variable that will be considered a "success" or a result of interest) |
contingency_table | If |
contingency_table_sigfigs | number of significant digits that the contingency table's percentage values should be rounded to (default = 2) |
percent_and_total | logical. If |
percentages_only | tabulate percentages of the focal DV value only |
counts_only | tabulate counts of the focal DV value only |
sigfigs | number of significant digits to round to |
chi_sq_test_stats | if |
correct | logical. Should continuity correction be applied? (default = TRUE) |
save_as_png | if |
png_name | name of the PNG file to be saved. By default, the name will be "chi_sq_" followed by a timestamp of the current time. The timestamp will be in the format, jan_01_2021_1300_10_000001, where "jan_01_2021" would indicate January 01, 2021; 1300 would indicate 13:00 (i.e., 1 PM); and 10_000001 would indicate 10.000001 seconds after the hour. |
width | width of the PNG file (default = 2000) |
height | height of the PNG file (default = 800) |
units | the units for the |
res | The nominal resolution in ppi which will be recorded in the png file, if a positive integer. Used for units other than the default. By default, |
layout_matrix | The layout argument for arranging section titles and tables using the |
Examples
chi_squared_test_pairwise(data = mtcars, iv_name = "vs", dv_name = "am")
chi_squared_test_pairwise(data = mtcars, iv_name = "vs", dv_name = "am",
percentages_only = TRUE)
# using 3 mtcars data sets combined
chi_squared_test_pairwise(data = rbind(mtcars, rbind(mtcars, mtcars)),
iv_name = "cyl", dv_name = "am")
# include the total counts
chi_squared_test_pairwise(data = rbind(mtcars, rbind(mtcars, mtcars)),
iv_name = "cyl", dv_name = "am", percent_and_total = TRUE)
# display counts
chi_squared_test_pairwise(data = rbind(mtcars, rbind(mtcars, mtcars)),
iv_name = "cyl", dv_name = "am", contingency_table = "counts")
Confidence Interval of the Mean of a Vector
Description
Returns the confidence interval of the mean of a numeric vector.
Usage
ci_of_mean(x = NULL, confidence_level = 0.95, notify_na_count = NULL)
Arguments
x | a numeric vector |
confidence_level | What is the desired confidence level expressed as a decimal? (default = 0.95) |
notify_na_count | if |
Value
the output will be a named numeric vector with the lower and upper limit of the confidence interval.
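For orientation, the usual t-based interval can be computed in base R as follows; a minimal sketch (not necessarily the package's internal code):
x <- 1:100
m <- mean(x)
se <- sd(x) / sqrt(length(x))            # standard error of the mean
t_crit <- qt(0.975, df = length(x) - 1)  # critical t for a 95% confidence level
c(lower = m - t_crit * se, upper = m + t_crit * se)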
Examples
ci_of_mean(x = 1:100, confidence_level = 0.95)
ci_of_mean(mtcars$mpg)
Clean data from Qualtrics
Description
Clean a data set downloaded from Qualtrics
Usage
clean_data_from_qualtrics(data = NULL, remove_survey_preview_data = TRUE,
remove_test_response_data = TRUE, default_cols_by_qualtrics = NULL,
default_cols_by_qualtrics_new = NULL, warn_accuracy_loss = FALSE,
click_data_cols = "rm", page_submit_cols = "move_to_right")
Arguments
data | a data object (a data frame or a data.table) |
remove_survey_preview_data | logical. Whether to remove data from survey preview (default = TRUE) |
remove_test_response_data | logical. Whether to remove data from test response (default = TRUE) |
default_cols_by_qualtrics | names of columns that Qualtrics includes in the data set by default (e.g., "StartDate", "Finished"). Accepting the default value |
default_cols_by_qualtrics_new | new names for columns that Qualtrics includes in the data set by default (e.g., "StartDate", "Finished"). Accepting the default value |
warn_accuracy_loss | logical. whether to warn the user if converting character to numeric leads to loss of accuracy. (default = FALSE) |
click_data_cols | if |
page_submit_cols | if |
Value
a data.table object
Examples
clean_data_from_qualtrics(mtcars)
clean_data_from_qualtrics(mtcars, default_cols_by_qualtrics = "mpg",
default_cols_by_qualtrics_new = "mpg2")
Coefficient of variation
Description
Calculates the (population or sample) coefficient of variation of a given numeric vector
Usage
coefficent_of_variation(vector, pop_or_sample = "pop")
Arguments
vector | a numeric vector |
pop_or_sample | should coefficient of variation be calculated for a "population" or a "sample"? |
Value
a numeric value
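For orientation, the coefficient of variation is the standard deviation divided by the mean; the population and sample versions differ only in the standard deviation's denominator. A minimal base-R sketch (not necessarily the package's internal code):
x <- 1:4
sd(x) / mean(x)                                   # sample CV (sd uses the n - 1 denominator)
sqrt(sum((x - mean(x))^2) / length(x)) / mean(x)  # population CV (sd uses the n denominator)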
Examples
coefficent_of_variation(1:4, pop_or_sample = "sample")
coefficent_of_variation(1:4, pop_or_sample = "pop")
Calculate Cohen's d and its confidence interval using the package 'psych'
Description
To run this function, the following package(s) must be installed: Package 'psych' v2.1.9 (or possibly a higher version) by William Revelle (2021), https://cran.r-project.org/package=psych
Usage
cohen_d(sample_1 = NULL, sample_2 = NULL, data = NULL, iv_name = NULL,
dv_name = NULL, ci_range = 0.95, output_type = "all")
Arguments
sample_1 | a vector of values in the first of two samples |
sample_2 | a vector of values in the second of two samples |
data | a data object (a data frame or a data.table) |
iv_name | name of the independent variable |
dv_name | name of the dependent variable |
ci_range | range of the confidence interval for Cohen's d (default = 0.95) |
output_type | If |
Examples
## Not run:
cohen_d(sample_1 = 1:10, sample_2 = 3:12)
cohen_d(data = mtcars, iv_name = "vs", dv_name = "mpg", ci_range = 0.99)
sample_dt <- data.table::data.table(iris)[Species != "setosa"]
cohen_d(data = sample_dt, iv_name = "Species", dv_name = "Petal.Width")
## End(Not run)
Calculate Cohen's d as illustrated by Borenstein et al. (2009, ISBN: 978-0-470-05724-7)
Description
Calculates Cohen's d, its standard error, and confidence interval, as illustrated in Borenstein et al. (2009, ISBN: 978-0-470-05724-7).
Usage
cohen_d_borenstein(sample_1 = NULL, sample_2 = NULL, data = NULL,
iv_name = NULL, dv_name = NULL, direction = "2_minus_1", ci_range = 0.95,
output_type = "all", initial_value = 0)
Arguments
sample_1 | a vector of values in the first of two samples |
sample_2 | a vector of values in the second of two samples |
data | a data object (a data frame or a data.table) |
iv_name | name of the independent variable |
dv_name | name of the dependent variable |
direction | If |
ci_range | range of the confidence interval for Cohen's d (default = 0.95) |
output_type | If |
initial_value | initial value of the noncentrality parameter for optimization (default = 0). Adjust this value if confidence interval results look strange. |
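For orientation, the standard pooled-standard-deviation formulation of Cohen's d and its variance (cf. Borenstein et al., 2009) can be sketched in base R as follows (not necessarily the package's internal code):
x1 <- 1:10; x2 <- 3:12
n1 <- length(x1); n2 <- length(x2)
sd_pooled <- sqrt(((n1 - 1) * var(x1) + (n2 - 1) * var(x2)) / (n1 + n2 - 2))
d <- (mean(x2) - mean(x1)) / sd_pooled
var_d <- (n1 + n2) / (n1 * n2) + d^2 / (2 * (n1 + n2))  # variance of d; standard error = sqrt(var_d)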
Examples
cohen_d_borenstein(sample_1 = 1:10, sample_2 = 3:12)
cohen_d_borenstein(data = mtcars, iv_name = "vs", dv_name = "mpg", ci_range = 0.99)
sample_dt <- data.table::data.table(iris)[Species != "setosa"]
cohen_d_borenstein(data = sample_dt, iv_name = "Species", dv_name = "Petal.Width",
initial_value = 10)
Calculate Cohen's d to accompany a one-sample t-test
Description
To run this function, the following package(s) must be installed: Package 'psych' v2.1.9 (or possibly a higher version) by William Revelle (2021), https://cran.r-project.org/package=psych
Usage
cohen_d_for_one_sample(x = NULL, mu = NULL)
Arguments
x | a numeric vector containing values whose mean will be calculated |
mu | the true mean |
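For orientation, the usual one-sample Cohen's d is the difference between the sample mean and the hypothesized mean, divided by the sample standard deviation; a minimal sketch (not necessarily the package's internal code):
x <- 1:10; mu <- 3
(mean(x) - mu) / sd(x)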
Examples
cohen_d_for_one_sample(x = 1:10, mu = 3)
cohen_d_for_one_sample(x = c(1:10, NA, NA), mu = 3)
Cohen's d from Jacob Cohen's textbook (1988)
Description
Calculates Cohen's d as described in Jacob Cohen's textbook (1988), Statistical Power Analysis for the Behavioral Sciences, 2nd Edition. Cohen, J. (1988), doi:10.4324/9780203771587
Usage
cohen_d_from_cohen_textbook(sample_1 = NULL, sample_2 = NULL, data = NULL,
iv_name = NULL, dv_name = NULL)
Arguments
sample_1 | a vector of values in the first of two samples |
sample_2 | a vector of values in the second of two samples |
data | a data object (a data frame or a data.table) |
iv_name | name of the independent variable |
dv_name | name of the dependent variable |
Value
the output will be a Cohen's d value (a numeric vector of length one)
Examples
cohen_d_from_cohen_textbook(1:10, 3:12)
cohen_d_from_cohen_textbook(data = mtcars, iv_name = "vs", dv_name = "mpg")
Cohen's d as a function of sample size
Description
Plot Cohen's d as sample size increases.
Usage
cohen_d_over_n(data = NULL, iv_name = NULL, dv_name = NULL,
save_as_png = FALSE, png_name = NULL, xlab = NULL, ylab = NULL,
width = 16, height = 9)
Arguments
data | a data object (a data frame or a data.table) |
iv_name | name of the independent variable (grouping variable) |
dv_name | name of the dependent variable (measure variable of interest) |
save_as_png | if |
png_name | name of the PNG file to be saved. By default, the name will be "cohen_d_over_n_" followed by a timestamp of the current time. The timestamp will be in the format, jan_01_2021_1300_10_000001, where "jan_01_2021" would indicate January 01, 2021; 1300 would indicate 13:00 (i.e., 1 PM); and 10_000001 would indicate 10.000001 seconds after the hour. |
xlab | title of the x-axis for the histogram by group. If |
ylab | title of the y-axis for the histogram by group. If |
width | width of the plot to be saved. This argument will be directly entered as the |
height | height of the plot to be saved. This argument will be directly entered as the |
Value
the output will be a list of (1) a ggplot object (histogram by group) and (2) a data.table with Cohen's d by sample size
Examples
## Not run:
cohen_d_over_n(data = mtcars, iv_name = "am", dv_name = "mpg")
## End(Not run)
Convert Cohen's d to r
Description
Convert d (standardized mean difference or Cohen's d) to r (correlation), as illustrated in Borenstein et al. (2009, p. 48, ISBN: 978-0-470-05724-7)
Usage
cohen_d_to_r(d = NULL, n1 = NULL, n2 = NULL, d_var = NULL)
Arguments
d | Cohen's d (the input can be a vector of values) |
n1 | sample size in the first of two groups (the input can be a vector of values) |
n2 | sample size in the second of two groups (the input can be a vector of values) |
d_var | (optional argument) variance of d (the input can be a vector of values). If this argument receives an input, variance of r will be returned as well. |
Value
the output will be a vector of correlation values (and variances of r if the argument d_var received an input)
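For orientation, the conversion on p. 48 of Borenstein et al. (2009) uses a correction factor based on the two sample sizes; a minimal base-R sketch (not necessarily the package's internal code):
d <- 1; n1 <- 100; n2 <- 100
a <- (n1 + n2)^2 / (n1 * n2)  # correction factor; equals 4 when n1 == n2
d / sqrt(d^2 + a)             # r corresponding to d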
Examples
## Not run:
cohen_d_to_r(1)
cohen_d_to_r(d = 1:3)
cohen_d_to_r(d = 1:3, n1 = c(100, 200, 300), n2 = c(50, 250, 900))
cohen_d_to_r(1.1547)
cohen_d_to_r(d = 1.1547, d_var = .0550)
cohen_d_to_r(d = 1:2, d_var = 1:2)
## End(Not run)
Calculate Cohen's d and its confidence interval using the package 'effsize'
Description
To run this function, the following package(s) must be installed: Package 'effsize' v0.8.1 (or possibly a higher version) by Marco Torchiano (2020), https://cran.r-project.org/package=effsize
Usage
cohen_d_torchiano(sample_1 = NULL, sample_2 = NULL, data = NULL,
iv_name = NULL, dv_name = NULL, ci_range = 0.95)
Arguments
sample_1 | a vector of values in the first of two samples |
sample_2 | a vector of values in the second of two samples |
data | a data object (a data frame or a data.table) |
iv_name | name of the independent variable |
dv_name | name of the dependent variable |
ci_range | range of the confidence interval for Cohen's d (default = 0.95) |
Examples
cohen_d_torchiano(1:10, 3:12)
cohen_d_torchiano(data = mtcars, iv_name = "vs", dv_name = "mpg", ci_range = 0.99)
Combine data across columns
Description
Combine data across columns. If NA is the only value across all focal columns for given row(s), NA will be returned for those row(s).
Usage
combine_data_across_cols(data = NULL, cols = NULL)
Arguments
data | a data object (a data frame or a data.table) |
cols | a character vector containing names of columns, across which to combine data |
Value
the output will be a numeric or character vector.
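For orientation, combining two columns amounts to taking the first column's value where it is not NA and falling back to the second column otherwise; a minimal base-R sketch for two columns (not necessarily the package's internal code):
dt <- data.frame(v1 = c(1, NA), v2 = c(NA, 2))
ifelse(!is.na(dt$v1), dt$v1, dt$v2)  # returns 1 and 2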
Examples
dt <- data.frame(v1 = c(1, NA), v2 = c(NA, 2))dtcombine_data_across_cols(data = dt, cols = c("v1", "v2"))dt <- data.frame(v1 = c(1, 2, NA), v2 = c(NA, 4, 3))dtcombine_data_across_cols(data = dt, cols = c("v1", "v2"))dt <- data.frame(v1 = c(1, NA, NA), v2 = c(NA, 2, NA))dtcombine_data_across_cols(data = dt, cols = c("v1", "v2"))Convert a comma-separated string of numbers
Description
Convert a comma-separated string of numbers
Usage
comma_sep_string_to_numbers(string)
Arguments
string | a character string consisting of numbers separated by commas |
Value
a character string
Examples
comma_sep_string_to_numbers("1, 2, 3,4, 5 6")Compare data sets
Description
Compares whether or not data sets are identical
Usage
compare_datasets(dataset_1 = NULL, dataset_2 = NULL, dataset_list = NULL)
Arguments
dataset_1 | a data object (a data frame or a data.table) |
dataset_2 | another data object (a data frame or a data.table) |
dataset_list | list of data objects (data.frame or data.table) |
Value
the output will be a data.table showing differences in data sets
Examples
# catch differences in class attributes of the data sets
compare_datasets(dataset_1 = data.frame(a = 1:2, b = 3:4),
dataset_2 = data.table::data.table(a = 1:2, b = 3:4))
# catch differences in number of columns
compare_datasets(dataset_1 = data.frame(a = 1:2, b = 3:4, c = 5:6),
dataset_2 = data.frame(a = 1:2, b = 3:4))
# catch differences in number of rows
compare_datasets(dataset_1 = data.frame(a = 1:2, b = 3:4),
dataset_2 = data.frame(a = 1:10, b = 11:20))
# catch differences in column names
compare_datasets(dataset_1 = data.frame(A = 1:2, B = 3:4),
dataset_2 = data.frame(a = 1:2, b = 3:4))
# catch differences in values within corresponding columns
compare_datasets(dataset_1 = data.frame(a = 1:2, b = c(3, 400)),
dataset_2 = data.frame(a = 1:2, b = 3:4))
compare_datasets(dataset_1 = data.frame(a = 1:2, b = 3:4, c = 5:6),
dataset_2 = data.frame(a = 1:2, b = c(3, 4), c = c(5, 6)))
# check if data sets in a list are identical
compare_datasets(dataset_list = list(
dt1 = data.frame(a = 1:2, b = 3:4, c = 5:6),
dt2 = data.frame(a = 1:2, b = 3:4),
dt3 = data.frame(a = 1:2, b = 3:4, c = 5:6)))
Compare dependent correlations
Description
Compares whether two dependent correlations from the same sample are significantly different from each other.
Usage
compare_dependent_rs(data = NULL, var_1_name = NULL, var_2_name = NULL,
var_3_name = NULL, one_tailed = FALSE, round_r = 3, round_p = 3, round_t = 2,
print_summary = TRUE, return_dt = FALSE)
Arguments
data | a data object (a data frame or a data.table) |
var_1_name | name of the variable whose correlations with two other variables will be compared. |
var_2_name | name of the first of the two variables whose correlations with |
var_3_name | name of the second of the two variables whose correlations with |
one_tailed | logical. Should the p value be based on a one-tailed t-test? (default = FALSE) |
round_r | number of decimal places to which to round correlation coefficients (default = 3) |
round_p | number of decimal places to which to round p-values (default = 3) |
round_t | number of decimal places to which to round the t-statistic (default = 2) |
print_summary | logical. Should the summary be printed? (default = TRUE) |
return_dt | logical. Should the function return a summary table as an output, as opposed to returning the output through the "invisible" function? (default = FALSE) |
Details
Suppose that Variables A, B, and C are measured from a group of subjects. This function tests whether A is related to B differently than to C. Put differently, this function tests H0: r(A, B) = r(A, C)
For more information on formulas used in this function, please refer to Steiger (1980), doi:10.1037/0033-2909.87.2.245, and Chen & Popovich (2002), doi:10.4135/9781412983808
Value
the output will be a summary of the test comparing two dependent correlations
Examples
compare_dependent_rs(data = mtcars, var_1_name = "mpg", var_2_name = "hp", var_3_name = "wt")
Compare effect sizes
Description
Compares effect sizes. See p. 156 of Borenstein et al. (2009, ISBN: 978-0-470-05724-7).
Usage
compare_effect_sizes(effect_sizes = NULL, effect_size_variances = NULL,
round_stats = TRUE, round_p = 3, round_se = 2, round_z = 2,
pretty_round_p_value = TRUE)
Arguments
effect_sizes | a vector of estimated effect sizes |
effect_size_variances | a vector of variances of the effect sizes |
round_stats | logical. Should the statistics be rounded? (default = TRUE) |
round_p | number of decimal places to which to round p-values (default = 3) |
round_se | number of decimal places to which to round the standard errors of the difference (default = 2) |
round_z | number of decimal places to which to round the z-statistic (default = 2) |
pretty_round_p_value | logical. Should the p-values be rounded in a pretty format (i.e., lower threshold: "<.001"). By default, |
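For orientation, the comparison treats the two estimates as independent and tests their difference with a z statistic (cf. Borenstein et al., 2009); a minimal base-R sketch (not necessarily the package's internal code):
es <- c(0.6111, 0.3241)
v <- c(0.0029, 0.0033)
z <- (es[1] - es[2]) / sqrt(v[1] + v[2])  # difference divided by its standard error
2 * pnorm(-abs(z))                        # two-tailed p-value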
Examples
compare_effect_sizes(effect_sizes = c(0.6111, 0.3241, 0.5),
effect_size_variances = c(.0029, 0.0033, 0.01))
Compare groups
Description
Compares groups by (1) creating a histogram by group; (2) summarizing descriptive statistics by group; and (3) conducting pairwise comparisons (t-tests and Mann-Whitney tests).
Usage
compare_groups(data = NULL, iv_name = NULL, dv_name = NULL, sigfigs = 3,
stats = "basic", welch = TRUE, cohen_d = TRUE, cohen_d_w_ci = TRUE,
adjust_p = "holm", bonferroni = NULL, mann_whitney = TRUE, t_test_stats = TRUE,
round_p = 3, anova = FALSE, round_f = 2, round_t = 2, round_t_test_df = 2,
save_as_png = FALSE, png_name = NULL, xlab = NULL, ylab = NULL,
x_limits = NULL, x_breaks = NULL, x_labels = NULL, width = 5000,
height = 3600, units = "px", res = 300, layout_matrix = NULL,
col_names_nicer = TRUE, convert_dv_to_numeric = TRUE)
Arguments
data | a data object (a data frame or a data.table) |
iv_name | name of the independent variable (grouping variable) |
dv_name | name of the dependent variable (measure variable of interest) |
sigfigs | number of significant digits to round to |
stats | statistics to calculate for each group. If |
welch | Should Welch's t-tests be conducted? By default, |
cohen_d | if |
cohen_d_w_ci | if |
adjust_p | the name of the method to use to adjust p-values. If |
bonferroni | The use of this argument is deprecated. Use the 'adjust_p' argument instead. If |
mann_whitney | if |
t_test_stats | if |
round_p | number of decimal places to which to round p-values (default = 3) |
anova | Should a one-way ANOVA be conducted and reported? By default, |
round_f | number of decimal places to which to round the f statistic (default = 2) |
round_t | number of decimal places to which to round the t statistic (default = 2) |
round_t_test_df | number of decimal places to which to round the degrees of freedom for t tests (default = 2) |
save_as_png | if |
png_name | name of the PNG file to be saved. By default, the name will be "compare_groups_results_" followed by a timestamp of the current time. The timestamp will be in the format, jan_01_2021_1300_10_000001, where "jan_01_2021" would indicate January 01, 2021; 1300 would indicate 13:00 (i.e., 1 PM); and 10_000001 would indicate 10.000001 seconds after the hour. |
xlab | title of the x-axis for the histogram by group. If |
ylab | title of the y-axis for the histogram by group. If |
x_limits | a numeric vector with values of the endpoints of the x axis. |
x_breaks | a numeric vector indicating the points at which to place tick marks on the x axis. |
x_labels | a vector containing labels for the tick marks on the x axis. |
width | width of the PNG file (default = 5000) |
height | height of the PNG file (default = 3600) |
units | the units for the |
res | The nominal resolution in ppi which will be recorded in the png file, if a positive integer. Used for units other than the default. By default, |
layout_matrix | The layout argument for arranging plots and tables using the |
col_names_nicer | if |
convert_dv_to_numeric | logical. Should the values in the dependent variable be converted to numeric for plotting the histograms? (default = TRUE) |
holm | if |
Value
the output will be a list of (1) a ggplot object (histogram by group); (2) a data.table with descriptive statistics by group; and (3) a data.table with pairwise comparison results. If save_as_png = TRUE, the plot and tables will also be saved on the local drive as a PNG file.
Examples
## Not run:
compare_groups(data = iris, iv_name = "Species", dv_name = "Sepal.Length")
compare_groups(data = iris, iv_name = "Species", dv_name = "Sepal.Length",
x_breaks = 4:8)
# Welch's t-test
compare_groups(data = mtcars, iv_name = "am", dv_name = "hp")
# A Student's t-test
compare_groups(data = mtcars, iv_name = "am", dv_name = "hp", welch = FALSE)
## End(Not run)
Compare independent correlations
Description
Compares whether two correlations from two independent samples are significantly different from each other. See Field et al. (2012, ISBN: 978-1-4462-0045-2).
Usage
compare_independent_rs(r1 = NULL, n1 = NULL, r2 = NULL, n2 = NULL,
one_tailed = FALSE, round_p = 3, round_z_diff = 2, round_r = 2,
print_summary = TRUE, output_type = NULL)
Arguments
r1 | correlation in the first sample |
n1 | size of the first sample |
r2 | correlation in the second sample |
n2 | size of the second sample |
one_tailed | logical. Should the p value be based on a one-tailed t-test? (default = FALSE) |
round_p | (only for displaying purposes) number of decimal places to which to round the p-value (default = 3) |
round_z_diff | (only for displaying purposes) number of decimal places to which to round the z-score (default = 2) |
round_r | (only for displaying purposes) number of decimal places to which to round correlation coefficients (default = 2) |
print_summary | logical. Should the summary be printed? (default = TRUE) |
output_type | type of the output. If |
Value
the output will be the results of a test comparing two independent correlations.
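For orientation, the standard approach transforms each correlation with Fisher's r-to-Z and compares the difference to its standard error (cf. Field et al., 2012); a minimal base-R sketch (not necessarily the package's internal code):
r1 <- .1; n1 <- 100; r2 <- .2; n2 <- 200
z1 <- atanh(r1); z2 <- atanh(r2)  # Fisher's r-to-Z transformation
z_diff <- (z1 - z2) / sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
2 * pnorm(-abs(z_diff))           # two-tailed p-value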
Examples
compare_independent_rs(r1 = .1, n1 = 100, r2 = .2, n2 = 200)
compare_independent_rs(r1 = .1, n1 = 100, r2 = .2, n2 = 200, one_tailed = TRUE)
compare_independent_rs(r1 = .506, n1 = 52, r2 = .381, n2 = 51)
Contingency table
Description
Create a contingency table that takes two variables as inputs
Usage
contingency_table(data = NULL, row_var_name = NULL, col_var_name = NULL,
row = NULL, col = NULL, output_type = "table")
Arguments
data | a data object (a data frame or a data.table) |
row_var_name | name of the variable whose values will fill the rows of the contingency table |
col_var_name | name of the variable whose values will fill the columns of the contingency table |
row | a vector whose values will fill the rows of thecontingency table |
col | a vector whose values will fill the columns of thecontingency table |
output_type | If |
Examples
contingency_table(data = mtcars, row_var_name = "am", col_var_name = "cyl")
contingency_table(row = mtcars$cyl, col = mtcars$am)
contingency_table(mtcars, "am", "cyl", output_type = "dt")
Convert columns to numeric
Description
Check whether each column in a data.table can be converted to numeric, and if so, convert every such column.
Usage
convert_cols_to_numeric(data = NULL, classes = "character",
warn_accuracy_loss = TRUE, print_summary = TRUE, silent = FALSE)
Arguments
data | a data object (a data frame or a data.table) |
classes | a character vector specifying classes of columns that will be converted. For example, if |
warn_accuracy_loss | logical. whether to warn the user if converting character to numeric leads to loss of accuracy. (default = TRUE) |
print_summary | If |
silent | If |
Examples
data_frame_1 <- data.frame(a = c("1", "2"), b = c("1", "b"), c = 1:2)convert_cols_to_numeric(data = data_frame_1)data_table_1 <- data.table::data.table(a = c("1", "2"), b = c("1", "b"), c = 1:2)convert_cols_to_numeric(data = data_table_1)Convert character to Excel formula
Description
Convert elements of a character vector to Excel formulas to preserve the character (string) format when opened in an Excel file.
Usage
convert_to_excel_formula(vector = NULL)
Arguments
vector | a character vector |
Value
the output will be a character vector formatted as an Excel formula. For example, if an element in the input vector was ".500", this element will be converted to =".500", which will show up as ".500" in Excel, rather than as "0.5"
Examples
## Not run:
# compare the two csv files below
# example 1
dt <- data.table::data.table(a = ".500")
data.table::fwrite(dt, "example1.csv") # the csv will show "0.5"
# example 2
dt <- data.table::data.table(a = convert_to_excel_formula(".500"))
data.table::fwrite(dt, "example2.csv") # the csv will show ".500"
## End(Not run)
Estimate the correlation between two variables
Description
Estimate the correlation between two variables
Usage
correlation_kim(x = NULL, y = NULL, data = NULL, x_var_name = NULL,
y_var_name = NULL, ci_range = 0.95, round_r = 2, round_p = 3,
output_type = "summary")
Arguments
x | a numeric vector of data values |
y | a numeric vector of data values |
data | (optional) a data object (a data frame or a data.table) |
x_var_name | (optional) name of the first variable (if using a data set as an input) |
y_var_name | (optional) name of the second variable (if using a data set as an input) |
ci_range | range of the confidence interval for the correlation coefficient. If |
round_r | number of decimal places to which to round correlation coefficients (default = 2) |
round_p | number of decimal places to which to round p-values (default = 3) |
output_type | type of the output. If |
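For orientation, the same estimate, p-value, and confidence interval can be obtained with base R's 'cor.test' (not necessarily what this function uses internally):
cor.test(x = 1:4, y = c(1, 3, 2, 4), conf.level = 0.95)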
Examples
## Not run:
correlation_kim(x = 1:4, y = c(1, 3, 2, 4))
correlation_kim(x = 1:4, y = c(1, 3, 2, 4), ci_range = FALSE)
# output as a data table
correlation_kim(x = 1:4, y = c(1, 3, 2, 4), output_type = "dt")
## End(Not run)
correlation matrix
Description
Creates a correlation matrix
Usage
correlation_matrix(data = NULL, var_names = NULL, row_var_names = NULL,
col_var_names = NULL, round_r = 2, round_p = 3, output_type = "rp",
numbered_cols = NULL)
Arguments
data | a data object (a data frame or a data.table) |
var_names | names of the variables for which to calculate all pairwise correlations |
row_var_names | names of the variables that will go on the rows of the correlation matrix |
col_var_names | names of the variables that will go on the columns of the correlation matrix |
round_r | number of decimal places to which to round correlation coefficients (default = 2) |
round_p | number of decimal places to which to round p-values (default = 3) |
output_type | which value should be filled in cells of the correlation matrix? If |
numbered_cols | logical. If |
Value
the output will be a correlation matrix in a data.table format
Examples
correlation_matrix(data = mtcars, var_names = c("mpg", "cyl", "wt"))
correlation_matrix(data = mtcars,
row_var_names = c("mpg", "cyl", "hp"), col_var_names = c("wt", "am"))
correlation_matrix(data = mtcars, var_names = c("mpg", "cyl", "wt"),
numbered_cols = FALSE)
correlation_matrix(data = mtcars, var_names = c("mpg", "cyl", "wt"), output_type = "r")
Cumulative percentage plot
Description
Plots or tabulates cumulative percentages associated with elements in a vector
Usage
cum_percent_plot(vector, output_type = "plot")
Arguments
vector | a numeric vector |
output_type | if |
Examples
cum_percent_plot(c(1:100, NA, NA))
cum_percent_plot(mtcars$mpg)
cum_percent_plot(vector = mtcars$mpg, output_type = "dt")
Descriptive statistics
Description
Returns descriptive statistics for a numeric vector.
Usage
desc_stats(vector = NULL, output_type = "vector", sigfigs = 3,
se_of_mean = FALSE, ci = FALSE, pi = FALSE, skewness = FALSE,
kurtosis = FALSE, notify_na_count = NULL, print_dt = FALSE)
Arguments
vector | a numeric vector |
output_type | if |
sigfigs | number of significant digits to round to (default = 3) |
se_of_mean | logical. Should the standard errors around the mean be included in the descriptive stats? (default = FALSE) |
ci | logical. Should 95% CI be included in the descriptive stats? (default = FALSE) |
pi | logical. Should 95% PI be included in the descriptive stats? (default = FALSE) |
skewness | logical. Should the skewness statistic be included in the descriptive stats? (default = FALSE) |
kurtosis | logical. Should the kurtosis statistic be included in the descriptive stats? (default = FALSE) |
notify_na_count | if |
print_dt | if |
Value
ifoutput_type = "vector", the output will be anamed numeric vector of descriptive statistics;ifoutput_type = "dt", the output will be data.table ofdescriptive statistics.
Examples
desc_stats(1:100)
desc_stats(1:100, ci = TRUE, pi = TRUE, sigfigs = 2)
desc_stats(1:100, se_of_mean = TRUE, ci = TRUE, pi = TRUE, sigfigs = 2,
skewness = TRUE, kurtosis = TRUE)
desc_stats(c(1:100, NA))
example_dt <- desc_stats(vector = c(1:100, NA), output_type = "dt")
example_dt
Descriptive statistics by group
Description
Returns descriptive statistics by group
Usage
desc_stats_by_group(data = NULL, var_for_stats = NULL, grouping_vars = NULL,
stats = "all", sigfigs = NULL, cols_to_round = NULL)
Arguments
data | a data object (a data frame or a data.table) |
var_for_stats | name of the variable for which descriptive statistics will be calculated |
grouping_vars | name(s) of grouping variables |
stats | statistics to calculate. If |
sigfigs | number of significant digits to round to |
cols_to_round | names of columns whose values will be rounded |
Value
the output will be a data.table showing descriptive statistics of the variable for each of the groups formed by the grouping variables.
Examples
desc_stats_by_group(data = mtcars, var_for_stats = "mpg",
grouping_vars = c("vs", "am"))
desc_stats_by_group(data = mtcars, var_for_stats = "mpg",
grouping_vars = c("vs", "am"), sigfigs = 3)
desc_stats_by_group(data = mtcars, var_for_stats = "mpg",
grouping_vars = c("vs", "am"), stats = "basic", sigfigs = 2)
desc_stats_by_group(data = mtcars, var_for_stats = "mpg",
grouping_vars = c("vs", "am"), stats = "basic", sigfigs = 2,
cols_to_round = "all")
desc_stats_by_group(data = mtcars, var_for_stats = "mpg",
grouping_vars = c("vs", "am"), stats = c("mean", "median"), sigfigs = 2,
cols_to_round = "all")
Detach all user-installed packages
Description
Detach all user-installed packages
Usage
detach_user_installed_pkgs(exceptions = NULL, force = FALSE, keep_kim = TRUE)
Arguments
exceptions | a character vector of names of packages to keep attached |
force | logical. Should a package be detached even though other attached packages depend on it? By default, |
keep_kim | logical. If |
Examples
## Not run:
detach_user_installed_pkgs()
## End(Not run)
Duplicated values in a vector
Description
Return all duplicated values in a vector. This function is a copy of the earlier function, find_duplicates, in Package 'kim'
Usage
duplicated_values(vector = NULL, na.rm = TRUE, sigfigs = 2, output = "summary")
Arguments
vector | a vector whose elements will be checked for duplicates |
na.rm | logical. If |
sigfigs | number of significant digits to round to in the percent column of the summary (default = 2) |
output | type of output. If |
Value
the output will be a data.table object (summary), a vector of duplicated values, or a vector of non-duplicated values.
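For orientation, the duplicated values themselves can be obtained with base R's 'duplicated'; a minimal sketch (not necessarily the package's internal code):
v <- c(mtcars$cyl, 11:20)
unique(v[duplicated(v)])  # values that appear more than once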
Examples
duplicated_values(mtcars$cyl)
duplicated_values(mtcars$cyl, output = "duplicated_values")
duplicated_values(vector = c(mtcars$cyl, 11:20, NA, NA))
duplicated_values(vector = c(mtcars$cyl, 11:20, NA, NA), na.rm = FALSE)
duplicated_values(vector = c(mtcars$cyl, 11:20, NA, NA),
na.rm = FALSE, sigfigs = 4, output = "duplicated_values")
Excel formula, convert (to)
Description
Alias for the 'convert_to_excel_formula' function. Convert elements of a character vector to Excel formulas to preserve the character (string) format when opened in an Excel file.
Usage
excel_formula_convert(vector = NULL)
Arguments
vector | a character vector |
Value
the output will be a character vector formatted as an Excel formula. For example, if an element in the input vector was ".500", this element will be converted to =".500", which will show up as ".500" in Excel, rather than as "0.5"
Examples
## Not run:
# compare the two csv files below
# example 1
dt <- data.table::data.table(a = ".500")
data.table::fwrite(dt, "example1.csv") # the csv will show "0.5"
# example 2
dt <- data.table::data.table(a = excel_formula_convert(".500"))
data.table::fwrite(dt, "example2.csv") # the csv will show ".500"
## End(Not run)
Exit from a Parent Function
Description
Exit from a Parent Function
Usage
exit_from_parent_function(n = 1, silent = FALSE,
message = "Exiting from a parent function")
Arguments
n | the number of generations to go back (default = 1) |
silent | logical. If |
message | message to print |
Examples
fn1 <- function() {print(1)print(2)}fn1()fn2 <- function() {print(1)exit_from_parent_function()print(2)}fn2()Factorial ANOVA 2-Way (Two-Way Factorial ANOVA)
Description
Conduct a two-way factorial analysis of variance (ANOVA).
Usage
factorial_anova_2_way(data = NULL, dv_name = NULL, iv_1_name = NULL,
iv_2_name = NULL, iv_1_values = NULL, iv_2_values = NULL, sigfigs = 3,
robust = FALSE, iterations = 2000, plot = TRUE, error_bar = "ci",
error_bar_range = 0.95, error_bar_tip_width = 0.13, error_bar_thickness = 1,
error_bar_caption = TRUE, line_colors = NULL, line_types = NULL,
line_thickness = 1, dot_size = 3, position_dodge = 0.13, x_axis_title = NULL,
y_axis_title = NULL, y_axis_title_vjust = 0.85, legend_title = NULL,
legend_position = "right", output = "anova_table", png_name = NULL,
width = 7000, height = 4000, units = "px", res = 300, layout_matrix = NULL)
Arguments
data | a data object (a data frame or a data.table) |
dv_name | name of the dependent variable |
iv_1_name | name of the first independent variable |
iv_2_name | name of the second independent variable |
iv_1_values | restrict all analyses to observations having these values for the first independent variable |
iv_2_values | restrict all analyses to observations having these values for the second independent variable |
sigfigs | number of significant digits to which to round values in anova table (default = 3) |
robust | if |
iterations | number of bootstrap samples for robust ANOVA. The default is set at 2000, but consider increasing the number of samples to 5000, 10000, or an even larger number, if slower handling time is not an issue. |
plot | if |
error_bar | if |
error_bar_range | width of the confidence interval (default = 0.95 for 95 percent confidence interval). This argument will not apply when |
error_bar_tip_width | graphically, width of the segments at the end of error bars (default = 0.13) |
error_bar_thickness | thickness of the error bars (default = 1) |
error_bar_caption | should a caption be included to indicate the width of the error bars? (default = TRUE). |
line_colors | colors of the lines connecting means (default = NULL). If the second IV has two levels, then by default, |
line_types | types of the lines connecting means (default = NULL). If the second IV has two levels, then by default, |
line_thickness | thickness of the lines connecting group means (default = 1) |
dot_size | size of the dots indicating group means (default = 3) |
position_dodge | by how much should the group means and error bars be horizontally offset from each other so as not to overlap? (default = 0.13) |
x_axis_title | a character string for the x-axis title. If no input is entered, then, by default, the first value of |
y_axis_title | a character string for the y-axis title. If no input is entered, then, by default, |
y_axis_title_vjust | position of the y axis title (default = 0.85). By default, |
legend_title | a character for the legend title. If no input is entered, then, by default, the second value of |
legend_position | position of the legend: |
output | output type can be one of the following: |
png_name | name of the PNG file to be saved. If |
width | width of the PNG file (default = 7000) |
height | height of the PNG file (default = 4000) |
units | the units for the |
res | The nominal resolution in ppi which will be recorded in the png file, if a positive integer. Used for units other than the default. If not specified, taken as 300 ppi to set the size of text and line widths. |
layout_matrix | The layout argument for arranging plots and tables using the |
Details
The following package(s) must be installed prior to running this function: Package 'car' v3.0.9 (or possibly a higher version) by Fox et al. (2020), https://cran.r-project.org/package=car
If robust ANOVA is to be conducted, the following package(s) must be installed prior to running the function: Package 'WRS2' v1.1-1 (or possibly a higher version) by Mair & Wilcox (2021), https://cran.r-project.org/package=WRS2
Value
by default, the output will be"anova_table"
Examples
factorial_anova_2_way(data = mtcars, dv_name = "mpg", iv_1_name = "vs",
iv_2_name = "am", iterations = 100)
anova_results <- factorial_anova_2_way(data = mtcars, dv_name = "mpg",
iv_1_name = "vs", iv_2_name = "am", output = "all")
anova_results
Find duplicated values in a vector
Description
Find duplicated values in a vector
Usage
find_duplicates(vector = NULL, na.rm = TRUE, sigfigs = 2, output = "summary")
Arguments
vector | a vector whose elements will be checked for duplicates |
na.rm | logical. If |
sigfigs | number of significant digits to round to in the percent column of the summary (default = 2) |
output | type of output. If |
Value
the output will be a data.table object (summary), a vector of duplicated values, or a vector of non-duplicated values.
Examples
find_duplicates(mtcars$cyl)
find_duplicates(mtcars$cyl, output = "duplicated_values")
find_duplicates(vector = c(mtcars$cyl, 11:20, NA, NA))
find_duplicates(vector = c(mtcars$cyl, 11:20, NA, NA), na.rm = FALSE)
find_duplicates(vector = c(mtcars$cyl, 11:20, NA, NA),
na.rm = FALSE, sigfigs = 4, output = "duplicated_values")
Fisher's Z transformation
Description
Perform Fisher's r-to-Z transformation for given correlation coefficient(s).
Usage
fisher_z_transform(r = NULL)
Arguments
r | a (vector of) correlation coefficient(s) |
Value
the output will be a vector of Z values which were transformed from the given r values.
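For orientation, the transformation itself is Z = 0.5 * ln((1 + r) / (1 - r)), i.e., the inverse hyperbolic tangent; a minimal base-R sketch (not necessarily the package's internal code):
r <- 0.99
0.5 * log((1 + r) / (1 - r))  # identical to atanh(r)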
Examples
fisher_z_transform(0.99)
fisher_z_transform(r = seq(0.1, 0.5, 0.1))
Floodlight 2 by Continuous
Description
Conduct a floodlight analysis for 2 x Continuous design.
Usage
floodlight_2_by_continuous(data = NULL, iv_name = NULL, dv_name = NULL,
mod_name = NULL, covariate_name = NULL, interaction_p_include = TRUE,
iv_level_order = NULL, output = "reg_lines_plot", jitter_x_y_percent = 0,
jitter_x_percent = 0, jitter_y_percent = 0, dot_alpha = 0.5, dot_size = 4,
interaction_p_value_font_size = 8, jn_point_label_add = TRUE,
jn_point_font_size = 8, jn_point_label_hjust = NULL,
lines_at_mod_extremes = FALSE, interaction_p_vjust = -3,
plot_margin = ggplot2::unit(c(75, 7, 7, 7), "pt"), legend_position = "right",
reg_line_types = c("solid", "dashed"), jn_line_types = c("solid", "solid"),
jn_line_thickness = 1.5, colors_for_iv = c("red", "blue"),
sig_region_color = "green", sig_region_alpha = 0.08,
nonsig_region_color = "gray", nonsig_region_alpha = 0.08,
x_axis_title = NULL, y_axis_title = NULL, legend_title = NULL,
round_decimals_int_p_value = 3, line_of_fit_thickness = 1,
round_jn_point_labels = 2)
Arguments
data | a data object (a data frame or a data.table) |
iv_name | name of the binary independent variable |
dv_name | name of the dependent variable |
mod_name | name of the continuous moderator variable |
covariate_name | name of the variables to control for |
interaction_p_include | logical. Should the plot include a p-value for the interaction term? |
iv_level_order | order of levels in the independent variable for legend. By default, it will be set as levels of the independent variable ordered using R's base function |
output | type of output (default = "reg_lines_plot"). Possible inputs: "interactions_pkg_results", "simple_effects_plot", "jn_points", "regions", "reg_lines_plot" |
jitter_x_y_percent | horizontally and vertically jitter dots by a percentage of the respective ranges of x and y values. |
jitter_x_percent | horizontally jitter dots by a percentage of the range of x values |
jitter_y_percent | vertically jitter dots by a percentage of the range of y values |
dot_alpha | opacity of the dots (0 = completely transparent, 1 = completely opaque). By default, |
dot_size | size of the dots (default = 4) |
interaction_p_value_font_size | font size for the interaction p value (default = 8) |
jn_point_label_add | logical. Should labels for the Johnson-Neyman points be added to the plot? (default = TRUE) |
jn_point_font_size | font size for Johnson-Neyman point labels (default = 8) |
jn_point_label_hjust | a vector of hjust values for Johnson-Neyman point labels. By default, the hjust value will be 0.5 for all the points. |
lines_at_mod_extremes | logical. Should vertical lines be drawn at the observed extreme values of the moderator if those values lie in significant region(s)? (default = FALSE) |
interaction_p_vjust | By how much should the label for the interaction p-value be adjusted vertically? By default, |
plot_margin | margin for the plot. By default |
legend_position | position of the legend (default = "right"). If |
reg_line_types | types of the regression lines for the two levels of the independent variable. By default, |
jn_line_types | types of the lines for Johnson-Neyman points. By default, |
jn_line_thickness | thickness of the lines at Johnson-Neyman points (default = 1.5) |
colors_for_iv | colors for the two values of the independent variable (default = c("red", "blue")) |
sig_region_color | color of the significant region, i.e., range(s) of the moderator variable for which simple effect of the independent variable on the dependent variable is statistically significant. |
sig_region_alpha | opacity for |
nonsig_region_color | color of the non-significant region, i.e., range(s) of the moderator variable for which simple effect of the independent variable on the dependent variable is not statistically significant. |
nonsig_region_alpha | opacity for |
x_axis_title | title of the x axis. By default, it will be set as input for |
y_axis_title | title of the y axis. By default, it will be set as input for |
legend_title | title of the legend. By default, it will be set as input for |
round_decimals_int_p_value | To how many digits after the decimal point should the p value for the interaction term be rounded? (default = 3) |
line_of_fit_thickness | thickness of the lines of fit (default = 1) |
round_jn_point_labels | To how many digits after the decimal point should the jn point labels be rounded? (default = 2) |
Details
The following package(s) must be installed prior to running this function: Package 'interactions' v1.1.1 (or possibly a higher version) by Jacob A. Long (2020), https://cran.r-project.org/package=interactions. See the following references: Spiller et al. (2013), doi:10.1509/jmr.12.0420; Kim (2021), doi:10.5281/zenodo.4445388
Examples
# typical example
floodlight_2_by_continuous(data = mtcars, iv_name = "am", dv_name = "mpg",
mod_name = "qsec")
# add covariates
floodlight_2_by_continuous(data = mtcars, iv_name = "am", dv_name = "mpg",
mod_name = "qsec", covariate_name = c("cyl", "hp"))
# adjust the jn point label positions
floodlight_2_by_continuous(data = mtcars, iv_name = "am", dv_name = "mpg",
mod_name = "qsec", jn_point_label_hjust = c(1, 0))
# return regions of significance and nonsignificance
floodlight_2_by_continuous(data = mtcars, iv_name = "am", dv_name = "mpg",
mod_name = "qsec", output = "regions")
# draw lines at the extreme values of the moderator
# if they are included in the significant region
floodlight_2_by_continuous(data = mtcars, iv_name = "am", dv_name = "mpg",
mod_name = "qsec", lines_at_mod_extremes = TRUE)
# remove the labels for jn points
floodlight_2_by_continuous(data = mtcars, iv_name = "am", dv_name = "mpg",
mod_name = "qsec", jn_point_label_add = FALSE)
Floodlight 2 by Continuous for a Logistic Regression
Description
Conduct a floodlight analysis for a logistic regression with a 2 x Continuous design involving a binary dependent variable.
Usage
floodlight_2_by_continuous_logistic(data = NULL, iv_name = NULL,
dv_name = NULL, mod_name = NULL, interaction_p_include = TRUE,
iv_level_order = NULL, dv_level_order = NULL,
jn_points_disregard_threshold = NULL, output = "reg_lines_plot",
num_of_spotlights = 20, jitter_x_percent = 0, jitter_y_percent = 5,
dot_alpha = 0.3, dot_size = 6, interaction_p_value_font_size = 8,
jn_point_label_add = TRUE, jn_point_font_size = 8,
jn_point_label_hjust = NULL, interaction_p_vjust = -3,
plot_margin = ggplot2::unit(c(75, 7, 7, 7), "pt"), legend_position = "right",
line_types_for_pred_values = c("solid", "dashed"),
line_thickness_for_pred_values = 2.5, jn_line_types = c("solid", "solid"),
jn_line_thickness = 1.5, sig_region_color = "green",
sig_region_alpha = 0.08, nonsig_region_color = "gray",
nonsig_region_alpha = 0.08, x_axis_title = NULL, y_axis_title = NULL,
legend_title = NULL, round_decimals_int_p_value = 3,
round_jn_point_labels = 2)
Arguments
data | a data object (a data frame or a data.table) |
iv_name | name of the binary independent variable |
dv_name | name of the binary dependent variable |
mod_name | name of the continuous moderator variable |
interaction_p_include | logical. Should the plot include a p-value for the interaction term? |
iv_level_order | order of levels in the independent variable for legend. By default, it will be set as levels of the independent variable ordered using R's base function |
dv_level_order | order of levels in the dependent variable. By default, it will be set as levels of the dependent variable ordered using R's base function |
jn_points_disregard_threshold | the Minimum Distance in the unit of the moderator variable that will be used for various purposes, such as (1) to disregard the second Johnson-Neyman point that is different from the first Johnson-Neyman (JN) point by less than the Minimum Distance; (2) to determine regions of significance, which will calculate the p-value of the IV's effect (the focal dummy variable's effect) on DV at a candidate JN point + / - the Minimum Distance. This input is hard to explain, but a user can enter a really low value for this argument (e.g., |
output | type of output (default = "reg_lines_plot"). Possible inputs: "interactions_pkg_results", "simple_effects_plot", "jn_points", "regions", "reg_lines_plot" |
num_of_spotlights | How many spotlight analyses should be conducted to plot the predicted values at various values of the moderator? (default = 20) |
jitter_x_percent | horizontally jitter dots by a percentage of the range of x values (default = 0) |
jitter_y_percent | vertically jitter dots by a percentage of the range of y values (default = 5) |
dot_alpha | opacity of the dots (0 = completely transparent, 1 = completely opaque). By default, |
dot_size | size of the dots (default = 6) |
interaction_p_value_font_size | font size for the interactionp value (default = 8) |
jn_point_label_add | logical. Should the labels forJohnson-Neyman point labels be added to the plot? (default = TRUE) |
jn_point_font_size | font size for Johnson-Neyman point labels(default = 8) |
jn_point_label_hjust | a vector of hjust values forJohnson-Neyman point labels. By default, the hjust value will be 0.5 forall the points. |
interaction_p_vjust | By how much should the label for theinteraction p-value be adjusted vertically?By default, |
plot_margin | margin for the plotBy default |
legend_position | position of the legend (default = "right").If |
line_types_for_pred_values | types of the lines for plottingthe predicted valuesBy default, |
line_thickness_for_pred_values | thickness of the linesfor plotting the predicted values (default = 2.5) |
jn_line_types | types of the lines for Johnson-Neyman points.By default, |
jn_line_thickness | thickness of the lines at Johnson-Neyman points(default = 1.5) |
sig_region_color | color of the significant region, i.e., range(s)of the moderator variable for which simple effect of the independentvariable on the dependent variable is statistically significant. |
sig_region_alpha | opacity for |
nonsig_region_color | color of the non-significant region,i.e., range(s) of the moderator variable for which simple effect ofthe independent variable on the dependent variable is notstatistically significant. |
nonsig_region_alpha | opacity for |
x_axis_title | title of the x axis. By default, it will be setas input for |
y_axis_title | title of the y axis. By default, it will be setas input for |
legend_title | title of the legend. By default, it will be setas input for |
round_decimals_int_p_value | To how many digits after thedecimal point should the p value for the interaction term berounded? (default = 3) |
round_jn_point_labels | To how many digits after thedecimal point should the jn point labels be rounded? (default = 2) |
Details
See the following reference(s):
Spiller et al. (2013) doi:10.1509/jmr.12.0420
Kim (2023) https://jinkim.science/docs/floodlight.pdf
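For intuition about what the floodlight analysis computes, the spotlight idea can be sketched with base R's glm() alone. The snippet below is only an illustrative approximation (it reuses the mtcars variables from the examples that follow) and is not the package's internal implementation.
# At each candidate value of the moderator, center the moderator there,
# refit the logistic model, and record the p-value of the focal IV (am).
mod_values <- seq(min(mtcars$mpg), max(mtcars$mpg), length.out = 20)
p_values <- sapply(mod_values, function(m) {
  d <- mtcars
  d$mpg_c <- d$mpg - m  # center the moderator at the spotlight value
  fit <- glm(vs ~ am * mpg_c, family = binomial(), data = d)
  summary(fit)$coefficients["am", "Pr(>|z|)"]
})
# Moderator values where the p-value crosses .05 approximate the
# Johnson-Neyman points that the floodlight plot marks.
round(data.frame(mod_value = mod_values, p_value = p_values), 3)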
Examples
floodlight_2_by_continuous_logistic(
  data = mtcars, iv_name = "am", dv_name = "vs", mod_name = "mpg")
# adjust the number of spotlights
# (i.e., predict values at only 4 values of the moderator)
floodlight_2_by_continuous_logistic(
  data = mtcars, iv_name = "am", dv_name = "vs", mod_name = "mpg",
  num_of_spotlights = 4)
Floodlight 2 by Continuous for a Multilevel Logistic Regression
Description
Conduct a floodlight analysis for a multilevel logistic regression with a 2 x Continuous design involving a binary dependent variable.
Usage
floodlight_2_by_continuous_mlm_logistic( data = NULL, iv_name = NULL, dv_name = NULL, mod_name = NULL, interaction_p_include = TRUE, iv_level_order = NULL, dv_level_order = NULL, jn_points_disregard_threshold = NULL, output = "reg_lines_plot", num_of_spotlights = 20, jitter_x_percent = 0, jitter_y_percent = 5, dot_alpha = 0.3, dot_size = 6, interaction_p_value_font_size = 8, jn_point_font_size = 8, jn_point_label_hjust = NULL, interaction_p_vjust = -3, plot_margin = ggplot2::unit(c(75, 7, 7, 7), "pt"), legend_position = "right", line_types_for_pred_values = c("solid", "dashed"), line_thickness_for_pred_values = 2.5, jn_line_types = c("solid", "solid"), jn_line_thickness = 1.5, sig_region_color = "green", sig_region_alpha = 0.08, nonsig_region_color = "gray", nonsig_region_alpha = 0.08, x_axis_title = NULL, y_axis_title = NULL, legend_title = NULL, round_decimals_int_p_value = 3, round_jn_point_labels = 2)Arguments
data | a data object (a data frame or a data.table) |
iv_name | name of the binary independent variable |
dv_name | name of the binary dependent variable |
mod_name | name of the continuous moderator variable |
interaction_p_include | logical. Should the plot include ap-value for the interaction term? |
iv_level_order | order of levels in the independentvariable for legend. By default, it will be set as levels of theindependent variable ordered using R's base function |
dv_level_order | order of levels in the dependent variable.By default, it will be set as levels of thedependent variable ordered using R's base function |
jn_points_disregard_threshold | the Minimum Distance inthe unit of the moderator variable that will be used for various purposes,such as (1) to disregard the second Johnson-Neyman pointthat is different from the first Johnson-Neyman (JN) point byless than the Minimum Distance; (2) to determine regions ofsignificance, which will calculate the p-value of the IV's effect(the focal dummy variable's effect) on DV at a candidateJN point + / - the Minimum Distance.This input is hard to explain, but a user can enter a really low valuefor this argument (e.g., |
output | type of output (default = "reg_lines_plot").Possible inputs: "interactions_pkg_results", "simple_effects_plot","jn_points", "regions", "reg_lines_plot" |
num_of_spotlights | How many spotlight analyses should beconducted to plot the predicted values at various values of themoderator? (default = 20) |
jitter_x_percent | horizontally jitter dots by a percentage of therange of x values (default = 0) |
jitter_y_percent | vertically jitter dots by a percentage of therange of y values (default = 5) |
dot_alpha | opacity of the dots (0 = completely transparent,1 = completely opaque). By default, |
dot_size | size of the dots (default = 6) |
interaction_p_value_font_size | font size for the interactionp value (default = 8) |
jn_point_font_size | font size for Johnson-Neyman point labels(default = 8) |
jn_point_label_hjust | a vector of hjust values forJohnson-Neyman point labels. By default, the hjust value will be 0.5 forall the points. |
interaction_p_vjust | By how much should the label for theinteraction p-value be adjusted vertically?By default, |
plot_margin | margin for the plotBy default |
legend_position | position of the legend (default = "right").If |
line_types_for_pred_values | types of the lines for plottingthe predicted valuesBy default, |
line_thickness_for_pred_values | thickness of the linesfor plotting the predicted values (default = 2.5) |
jn_line_types | types of the lines for Johnson-Neyman points.By default, |
jn_line_thickness | thickness of the lines at Johnson-Neyman points(default = 1.5) |
sig_region_color | color of the significant region, i.e., range(s)of the moderator variable for which simple effect of the independentvariable on the dependent variable is statistically significant. |
sig_region_alpha | opacity for |
nonsig_region_color | color of the non-significant region,i.e., range(s) of the moderator variable for which simple effect ofthe independent variable on the dependent variable is notstatistically significant. |
nonsig_region_alpha | opacity for |
x_axis_title | title of the x axis. By default, it will be setas input for |
y_axis_title | title of the y axis. By default, it will be setas input for |
legend_title | title of the legend. By default, it will be setas input for |
round_decimals_int_p_value | To how many digits after thedecimal point should the p value for the interaction term berounded? (default = 3) |
round_jn_point_labels | To how many digits after thedecimal point should the jn point labels be rounded? (default = 2) |
Details
See the following reference(s):
Spiller et al. (2013) doi:10.1509/jmr.12.0420
Kim (2023) https://jinkim.science/docs/floodlight.pdf
Examples
floodlight_2_by_continuous_logistic(
  data = mtcars, iv_name = "am", dv_name = "vs", mod_name = "mpg")
# adjust the number of spotlights
# (i.e., predict values at only 4 values of the moderator)
floodlight_2_by_continuous_logistic(
  data = mtcars, iv_name = "am", dv_name = "vs", mod_name = "mpg",
  num_of_spotlights = 4)
Floodlight Analyses for a Set of Contrasts
Description
Conduct a floodlight analysis for a set of contrasts with a continuous moderator variable.
Usage
floodlight_for_contrasts( data = NULL, iv_name = NULL, dv_name = NULL, mod_name = NULL, contrasts = NULL, contrasts_for_floodlight = NULL, covariate_name = NULL, interaction_p_include = TRUE, iv_category_order = NULL, heteroskedasticity_consistent_se = "HC4", round_r_squared = 3, round_f = 2, sigfigs = 2, jn_points_disregard_threshold = NULL, print_floodlight_plots = TRUE, output = "reg_lines_plot", jitter_x_percent = 0, jitter_y_percent = 0, dot_alpha = 0.5, dot_size = 4, interaction_p_value_font_size = 6, jn_point_font_size = 6, jn_point_label_hjust = NULL, interaction_p_vjust = -3, plot_margin = ggplot2::unit(c(75, 7, 7, 7), "pt"), legend_position = "right", line_of_fit_types = c("solid", "dashed"), line_of_fit_thickness = 1.5, jn_line_types = c("solid", "solid"), jn_line_thickness = 1.5, sig_region_color = "green", sig_region_alpha = 0.08, nonsig_region_color = "gray", nonsig_region_alpha = 0.08, x_axis_title = NULL, y_axis_title = NULL, legend_title = NULL, round_decimals_int_p_value = 3, round_jn_point_labels = 2)Arguments
data | a data object (a data frame or a data.table) |
iv_name | name of the multicategorical independent variable;this variable must have three or more categories. |
dv_name | name of the dependent variable |
mod_name | name of the continuous moderator variable |
contrasts | names of the contrast variables |
contrasts_for_floodlight | names of the contrast variables forwhich floodlight analyses will be conducted |
covariate_name | name of the variables to control for |
interaction_p_include | logical. Should the plot include ap-value for the interaction term? |
iv_category_order | order of levels in the independentvariable for legend. By default, it will be set as levels of theindependent variable ordered using R's base function |
heteroskedasticity_consistent_se | which kind ofheteroskedasticity-consistent (robust) standard errors should becalculated? (default = "HC4") |
round_r_squared | number of decimal places to which to roundr-squared values (default = 3) |
round_f | number of decimal places to which to roundthe f statistic for model comparison (default = 2) |
sigfigs | number of significant digits to round to(for values in the regression tables, except for p values).By default |
jn_points_disregard_threshold | the Minimum Distance inthe unit of the moderator variable that will be used for various purposes,such as (1) to disregard the second Johnson-Neyman pointthat is different from the first Johnson-Neyman (JN) point byless than the Minimum Distance; (2) to determine regions ofsignificance, which will calculate the p-value of the IV's effect(the focal dummy variable's effect) on DV at a candidateJN point + / - the Minimum Distance.This input is hard to explain, but a user can enter a really low valuefor this argument (e.g., |
print_floodlight_plots | If |
output | output of the function (default = "all").Possible inputs: "reg_models", "reg_tables", "reg_tables_rounded","all" |
jitter_x_percent | horizontally jitter dots by a percentage of therange of x values |
jitter_y_percent | vertically jitter dots by a percentage of therange of y values |
dot_alpha | opacity of the dots (0 = completely transparent,1 = completely opaque). By default, |
dot_size | size of the dots (default = 4) |
interaction_p_value_font_size | font size for the interactionp value (default = 8) |
jn_point_font_size | font size for Johnson-Neyman point labels(default = 6) |
jn_point_label_hjust | a vector of hjust values forJohnson-Neyman point labels. By default, the hjust value will be 0.5 forall the points. |
interaction_p_vjust | By how much should the label for theinteraction p-value be adjusted vertically?By default, |
plot_margin | margin for the plotBy default |
legend_position | position of the legend (default = "right").If |
line_of_fit_types | types of the lines of fit for the two levelsof the independent variable.By default, |
line_of_fit_thickness | thickness of the lines of fit (default = 1.5) |
jn_line_types | types of the lines for Johnson-Neyman points.By default, |
jn_line_thickness | thickness of the lines at Johnson-Neyman points(default = 1.5) |
sig_region_color | color of the significant region, i.e., range(s)of the moderator variable for which simple effect of the independentvariable on the dependent variable is statistically significant. |
sig_region_alpha | opacity for |
nonsig_region_color | color of the non-significant region,i.e., range(s) of the moderator variable for which simple effect ofthe independent variable on the dependent variable is notstatistically significant. |
nonsig_region_alpha | opacity for |
x_axis_title | title of the x axis. By default, it will be setas input for |
y_axis_title | title of the y axis. By default, it will be setas input for |
legend_title | title of the legend. By default, it will be setas input for |
round_decimals_int_p_value | To how many digits after thedecimal point should the p value for the interaction term berounded? (default = 3) |
round_jn_point_labels | To how many digits after thedecimal point should the jn point labels be rounded? (default = 2) |
Details
See the following reference, which covers a related topic:
Hayes & Montoya (2017) doi:10.1080/19312458.2016.1271116
Examples
## Not run:
# typical example
# copy and modify the 'mtcars' data
mtcars2 <- setDT(data.table::copy(mtcars))
# make sure the data.table package is attached
mtcars2[, contrast_1 := fcase(cyl == 4, -2, cyl %in% c(6, 8), 1)]
mtcars2[, contrast_2 := fcase(cyl == 4, 0, cyl == 6, 1, cyl == 8, -1)]
floodlight_for_contrasts(
  data = mtcars2, iv_name = "cyl", dv_name = "mpg", mod_name = "qsec",
  contrasts = paste0("contrast_", 1:2),
  contrasts_for_floodlight = "contrast_2")
## End(Not run)
Floodlight Multicategorical by Continuous
Description
Conduct a floodlight analysis for a Multicategorical IV x Continuous Moderator design.
Usage
floodlight_multi_by_continuous( data = NULL, iv_name = NULL, dv_name = NULL, mod_name = NULL, coding = "indicator", baseline_category = NULL, covariate_name = NULL, interaction_p_include = TRUE, iv_category_order = NULL, heteroskedasticity_consistent_se = "HC4", round_r_squared = 3, round_f = 2, sigfigs = 2, jn_points_disregard_threshold = NULL, print_floodlight_plots = TRUE, output = "all", jitter_x_percent = 0, jitter_y_percent = 0, dot_alpha = 0.5, dot_size = 4, interaction_p_value_font_size = 8, jn_point_font_size = 8, jn_point_label_hjust = NULL, interaction_p_vjust = -3, plot_margin = ggplot2::unit(c(75, 7, 7, 7), "pt"), legend_position = "right", line_of_fit_types = c("solid", "dashed"), line_of_fit_thickness = 1.5, jn_line_types = c("solid", "solid"), jn_line_thickness = 1.5, colors_for_iv = c("red", "blue"), sig_region_color = "green", sig_region_alpha = 0.08, nonsig_region_color = "gray", nonsig_region_alpha = 0.08, x_axis_title = NULL, y_axis_title = NULL, legend_title = NULL, round_decimals_int_p_value = 3, round_jn_point_labels = 2)Arguments
data | a data object (a data frame or a data.table) |
iv_name | name of the multicategorical independent variable;this variable must have three or more categories. |
dv_name | name of the dependent variable |
mod_name | name of the continuous moderator variable |
coding | name of the coding scheme to use; the current versionof the function allows only the "indicator" coding scheme.By default, |
baseline_category | value of the independent variable thatwill be the reference value against which other values of theindependent variable will be compared |
covariate_name | name of the variables to control for |
interaction_p_include | logical. Should the plot include ap-value for the interaction term? |
iv_category_order | order of levels in the independentvariable for legend. By default, it will be set as levels of theindependent variable ordered using R's base function |
heteroskedasticity_consistent_se | which kind ofheteroskedasticity-consistent (robust) standard errors should becalculated? (default = "HC4") |
round_r_squared | number of decimal places to which to roundr-squared values (default = 3) |
round_f | number of decimal places to which to roundthe f statistic for model comparison (default = 2) |
sigfigs | number of significant digits to round to(for values in the regression tables, except for p values).By default |
jn_points_disregard_threshold | the Minimum Distance inthe unit of the moderator variable that will be used for various purposes,such as (1) to disregard the second Johnson-Neyman pointthat is different from the first Johnson-Neyman (JN) point byless than the Minimum Distance; (2) to determine regions ofsignificance, which will calculate the p-value of the IV's effect(the focal dummy variable's effect) on DV at a candidateJN point + / - the Minimum Distance.This input is hard to explain, but a user can enter a really low valuefor this argument (e.g., |
print_floodlight_plots | If |
output | output of the function (default = "all").Possible inputs: "reg_models", "reg_tables", "reg_tables_rounded","all" |
jitter_x_percent | horizontally jitter dots by a percentage of therange of x values |
jitter_y_percent | vertically jitter dots by a percentage of therange of y values |
dot_alpha | opacity of the dots (0 = completely transparent,1 = completely opaque). By default, |
dot_size | size of the dots (default = 4) |
interaction_p_value_font_size | font size for the interactionp value (default = 8) |
jn_point_font_size | font size for Johnson-Neyman point labels(default = 8) |
jn_point_label_hjust | a vector of hjust values forJohnson-Neyman point labels. By default, the hjust value will be 0.5 forall the points. |
interaction_p_vjust | By how much should the label for theinteraction p-value be adjusted vertically?By default, |
plot_margin | margin for the plotBy default |
legend_position | position of the legend (default = "right").If |
line_of_fit_types | types of the lines of fit for the two levelsof the independent variable.By default, |
line_of_fit_thickness | thickness of the lines of fit (default = 1.5) |
jn_line_types | types of the lines for Johnson-Neyman points.By default, |
jn_line_thickness | thickness of the lines at Johnson-Neyman points(default = 1.5) |
colors_for_iv | colors for the two values of theindependent variable (default = c("red", "blue")) |
sig_region_color | color of the significant region, i.e., range(s)of the moderator variable for which simple effect of the independentvariable on the dependent variable is statistically significant. |
sig_region_alpha | opacity for |
nonsig_region_color | color of the non-significant region,i.e., range(s) of the moderator variable for which simple effect ofthe independent variable on the dependent variable is notstatistically significant. |
nonsig_region_alpha | opacity for |
x_axis_title | title of the x axis. By default, it will be setas input for |
y_axis_title | title of the y axis. By default, it will be setas input for |
legend_title | title of the legend. By default, it will be setas input for |
round_decimals_int_p_value | To how many digits after thedecimal point should the p value for the interaction term berounded? (default = 3) |
round_jn_point_labels | To how many digits after thedecimal point should the jn point labels be rounded? (default = 2) |
Details
See the following reference(s):
Hayes & Montoya (2017) doi:10.1080/19312458.2016.1271116
Williams (2004) on r-squared values when calculating robust standard errors:
https://web.archive.org/web/20230627025457/https://www.stata.com/statalist/archive/2004-05/msg00107.html
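The "indicator" coding scheme named by the coding argument is standard dummy coding. As a rough illustration (an assumption about the setup, not taken from the package's source), the dummy variables for a three-category IV such as mtcars$cyl can be produced in base R as follows:
mtcars2 <- mtcars
mtcars2$cyl <- relevel(factor(mtcars2$cyl), ref = "4")  # "4" as the baseline category
head(model.matrix(~ cyl, data = mtcars2))  # columns cyl6 and cyl8 are the indicator (dummy) variables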
Examples
## Not run:
# typical example
floodlight_multi_by_continuous(
  data = mtcars, iv_name = "cyl", dv_name = "mpg", mod_name = "qsec")
## End(Not run)
Forest plot
Description
Create a forest plot using outputs from the 'metafor' package
Usage
forest_plot( estimates = NULL, estimate_ci_ll = NULL, estimate_ci_ul = NULL, point_size_range = c(2, 10), error_bar_size = 1, error_bar_tip_height = 0.3, weights = NULL, diamond_x = NULL, diamond_ci_ll = NULL, diamond_ci_ul = NULL, diamond_height = 1.2, diamond_gap_height = 0.3, diamond_1_tip_at_top_y = -0.5, diamond_colors = "black", study_labels = NULL, diamond_labels = NULL, diamond_label_size = 6, diamond_label_hjust = 0, diamond_label_fontface = "bold", diamond_estimate_label_hjust = 0, diamond_estimate_label_size = 6, diamond_estimate_label_fontface = "bold", round_estimates = 2, x_axis_title = "Observed Outcome", vline_size = 1, vline_intercept = 0, vline_type = "dotted", study_label_hjust = 0, study_label_begin_x = NULL, study_label_begin_x_perc = 60, study_label_size = 6, study_label_fontface = "plain", estimate_label_begin_x = NULL, estimate_label_begin_x_perc = 25, estimate_label_hjust = 0, estimate_label_size = 6, estimate_label_fontface = "plain", x_axis_tick_marks = NULL, x_axis_tick_mark_label_size = 6, legend_position = "none", plot_margin = NULL)Arguments
estimates | default = NULL |
estimate_ci_ll | default = NULL |
estimate_ci_ul | default = NULL |
point_size_range | default = c(2, 10) |
error_bar_size | default = 1 |
error_bar_tip_height | default = 0.3 |
weights | default = NULL |
diamond_x | default = NULL |
diamond_ci_ll | default = NULL |
diamond_ci_ul | default = NULL |
diamond_height | default = 1.2 |
diamond_gap_height | default = 0.3 |
diamond_1_tip_at_top_y | default = -0.5 |
diamond_colors | default = "black" |
study_labels | default = NULL |
diamond_labels | default = NULL |
diamond_label_size | default = 6 |
diamond_label_hjust | default = 0 |
diamond_label_fontface | default = "bold" |
diamond_estimate_label_hjust | default = 0 |
diamond_estimate_label_size | default = 6 |
diamond_estimate_label_fontface | default = "bold" |
round_estimates | default = 2 |
x_axis_title | default = "Observed Outcome" |
vline_size | default = 1 |
vline_intercept | default = 0 |
vline_type | default = "dotted" |
study_label_hjust | default = 0 |
study_label_begin_x | default = NULL |
study_label_begin_x_perc | default = 60 |
study_label_size | default = 6 |
study_label_fontface | default = "plain" |
estimate_label_begin_x | default = NULL |
estimate_label_begin_x_perc | default = 25 |
estimate_label_hjust | default = 0 |
estimate_label_size | default = 6 |
estimate_label_fontface | default = "plain" |
x_axis_tick_marks | default = NULL |
x_axis_tick_mark_label_size | default = 6 |
legend_position | default = "none" |
plot_margin | default = NULL |
Examples
forest_plot(
  estimates = c(2, 3, 4),
  estimate_ci_ll = c(1, 2, 3),
  estimate_ci_ul = c(3, 4, 6),
  weights = 1:3,
  diamond_x = 2,
  diamond_labels = "RE",
  diamond_ci_ll = 1.8,
  diamond_ci_ul = 2.2,
  estimate_label_begin_x_perc = 40,
  x_axis_tick_marks = seq(-2, 6, 2))
Geometric mean
Description
Calculate the geometric mean of a numeric vector
Usage
geomean(x = NULL, zero_or_neg_convert_to = NA)Arguments
x | a numeric vector |
zero_or_neg_convert_to | the value to which zero or negative values will be converted. If |
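For reference, the geometric mean of positive values is conventionally computed as exp(mean(log(x))). The one-liner below is a hedged sketch of that formula, not necessarily the package's exact implementation (e.g., it ignores the zero/negative handling described above); geomean_sketch is a hypothetical helper name.
geomean_sketch <- function(x) exp(mean(log(x), na.rm = TRUE))
geomean_sketch(c(1, 100))  # 10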
Examples
## Not run:
geomean(c(1, 4))
geomean(c(1, 100))
geomean(c(1, 100, NA))
geomean(c(1, 100, NA, 0, -1, -2))
geomean(x = c(1, 100, NA, 0, -1, -2), zero_or_neg_convert_to = 1)
geomean(c(1, 100, NA, 1, 1, 1))
## End(Not run)
ggsave quick
Description
Quickly save the current plot with a timestamp
Usage
ggsave_quick( name = NULL, file_name_extension = "png", timestamp = NULL, width = 16, height = 9)Arguments
name | a character string of the png file name.By default, if no input is given ( |
file_name_extension | file name extension (default = "png").If |
timestamp | if |
width | width of the plot to be saved. This argument will bedirectly entered as the |
height | height of the plot to be saved. This argument will bedirectly entered as the |
Value
the output will be a .png image file in the working directory.
Examples
## Not run:
kim::histogram(rep(1:30, 3))
ggsave_quick()
## End(Not run)
Histogram
Description
Create a histogram based on the output of the hist function in the graphics package.
Usage
histogram( vector = NULL, breaks = NULL, counts = NULL, percent = FALSE, bin_fill_color = "green4", bin_border_color = "black", bin_border_thickness = 1, notify_na_count = NULL, x_axis_tick_marks = NULL, y_axis_tick_marks = NULL, cap_axis_lines = TRUE, x_axis_title = "Value", y_axis_title = NULL, y_axis_title_vjust = 0.85)Arguments
vector | a numeric vector |
breaks | a numeric vector indicating breaks for the bins.By default, no input is required for this argument. |
counts | a numeric vector containing counts for the bins(i.e., heights of the bins). By default, no input is requiredfor this argument. |
percent | logical. If |
bin_fill_color | color of the area inside each bin(default = "green4") |
bin_border_color | color of the border around each bin(default = "black") |
bin_border_thickness | thickness of the border around each bin(default = 1) |
notify_na_count | if |
x_axis_tick_marks | a vector of values at which to place tick markson the x axis (e.g., setting |
y_axis_tick_marks | a vector of values at which to place tick markson the y axis (e.g., setting |
cap_axis_lines | logical. Should the axis lines be capped at the outer tick marks? (default = TRUE) |
x_axis_title | title for x axis (default = "Value") |
y_axis_title | title for y axis (default = "Count" or "Percentage",depending on the value of |
y_axis_title_vjust | position of the y axis title (default = 0.85). |
Value
the output will be a histogram, a ggplot object.
Examples
histogram(1:100)
histogram(c(1:100, NA))
histogram(vector = mtcars[["mpg"]])
histogram(vector = mtcars[["mpg"]], percent = TRUE)
histogram(vector = mtcars[["mpg"]],
  x_axis_tick_marks = c(10, 25, 35), y_axis_title_vjust = 0.5,
  y_axis_title = "Freq", x_axis_title = "Values of mpg")
Histogram by group
Description
Creates histograms by group to compare distributions.
Usage
histogram_by_group( data = NULL, iv_name = NULL, dv_name = NULL, order_of_groups_top_to_bot = NULL, number_of_bins = 40, space_between_histograms = 0.15, draw_baseline = FALSE, xlab = NULL, ylab = NULL, x_limits = NULL, x_breaks = NULL, x_labels = NULL, sigfigs = 3, convert_dv_to_numeric = TRUE)Arguments
data | a data object (a data frame or a data.table) |
iv_name | name of the independent variable |
dv_name | name of the dependent variable |
order_of_groups_top_to_bot | a character vector indicating the desired presentation order of levels in the independent variable (from top to bottom). Omitting a group from this argument will remove that group from the set of histograms. |
number_of_bins | number of bins for the histograms (default = 40) |
space_between_histograms | space between histograms(minimum = 0, maximum = 1, default = 0.15) |
draw_baseline | logical. Should the baseline and the trailinglines to either side of the histogram be drawn? (default = FALSE) |
xlab | title of the x-axis for the histogram by group.If |
ylab | title of the y-axis for the histogram by group.If |
x_limits | a numeric vector with values of the endpointsof the x axis. |
x_breaks | a numeric vector indicating the points at which toplace tick marks on the x axis. |
x_labels | a vector containing labels for the place tick markson the x axis. |
sigfigs | number of significant digits to round to (default = 3) |
convert_dv_to_numeric | logical. Should the values in thedependent variable be converted to numeric for plotting thehistograms? (default = TRUE) |
Details
The following package(s) must be installed prior to running this function:
Package 'ggridges' v0.5.3 (or possibly a higher version) by Claus O. Wilke (2021),
https://cran.r-project.org/package=ggridges
Value
the output will be a set of vertically arranged histograms (a ggplot object), i.e., one histogram for each level of the independent variable.
Examples
histogram_by_group(data = mtcars, iv_name = "cyl", dv_name = "mpg")
histogram_by_group(
  data = mtcars, iv_name = "cyl", dv_name = "mpg",
  order_of_groups_top_to_bot = c("8", "4"), number_of_bins = 10,
  space_between_histograms = 0.5)
histogram_by_group(
  data = iris, iv_name = "Species", dv_name = "Sepal.Length",
  x_breaks = 4:8, x_limits = c(4, 8))
Histogram
Description
Create a histogram
Usage
histogram_deprecated_1( vector = NULL, number_of_bins = 30, x_tick_marks = NULL, y_tick_marks = NULL, fill_color = "cyan4", border_color = "black", y_axis_title_vjust = 0.85, x_axis_title = NULL, y_axis_title = NULL, cap_axis_lines = FALSE, notify_na_count = NULL)Arguments
vector | a numeric vector |
number_of_bins | number of bins for the histogram (default = 30) |
x_tick_marks | a vector of values at which to place tick markson the x axis (e.g., setting |
y_tick_marks | a vector of values at which to place tick markson the y axis (e.g., setting |
fill_color | color for inside of the bins (default = "cyan4") |
border_color | color for borders of the bins (default = "black") |
y_axis_title_vjust | position of the y axis title (default = 0.85). |
x_axis_title | title for x axis (default = "Value") |
y_axis_title | title for y axis (default = "Count") |
cap_axis_lines | logical. Should the axis lines be capped at theouter tick marks? (default = FALSE) |
notify_na_count | if |
Value
the output will be a histogram, a ggplot object.
Examples
histogram_deprecated_1(1:100)
histogram_deprecated_1(c(1:100, NA))
histogram_deprecated_1(vector = mtcars[["mpg"]])
histogram_deprecated_1(vector = mtcars[["mpg"]], x_tick_marks = seq(10, 36, 2))
histogram_deprecated_1(vector = mtcars[["mpg"]], x_tick_marks = seq(10, 36, 2),
  y_tick_marks = seq(0, 8, 2), y_axis_title_vjust = 0.5,
  y_axis_title = "Freq", x_axis_title = "Values of mpg")
Histogram from hist function
Description
Create a histogram based on the output of the hist function in the graphics package.
Usage
histogram_from_hist( vector = NULL, breaks = NULL, counts = NULL, percent = FALSE, bin_fill_color = "green4", bin_border_color = "black", bin_border_thickness = 1, notify_na_count = NULL, x_axis_tick_marks = NULL, y_axis_tick_marks = NULL, cap_axis_lines = TRUE, x_axis_title = "Value", y_axis_title = NULL, y_axis_title_vjust = 0.85)Arguments
vector | a numeric vector |
breaks | a numeric vector indicating breaks for the bins.By default, no input is required for this argument. |
counts | a numeric vector containing counts for the bins(i.e., heights of the bins). By default, no input is requiredfor this argument. |
percent | logical. If |
bin_fill_color | color of the area inside each bin(default = "green4") |
bin_border_color | color of the border around each bin(default = "black") |
bin_border_thickness | thickness of the border around each bin(default = 1) |
notify_na_count | if |
x_axis_tick_marks | a vector of values at which to place tick markson the x axis (e.g., setting |
y_axis_tick_marks | a vector of values at which to place tick markson the y axis (e.g., setting |
cap_axis_lines | logical. Should the axis lines be capped at the outer tick marks? (default = TRUE) |
x_axis_title | title for x axis (default = "Value") |
y_axis_title | title for y axis (default = "Count" or "Percentage",depending on the value of |
y_axis_title_vjust | position of the y axis title (default = 0.85). |
Value
the output will be a histogram, a ggplot object.
Examples
histogram_from_hist(1:100)
histogram_from_hist(c(1:100, NA))
histogram_from_hist(vector = mtcars[["mpg"]])
histogram_from_hist(vector = mtcars[["mpg"]], percent = TRUE)
histogram_from_hist(vector = mtcars[["mpg"]],
  x_axis_tick_marks = c(10, 25, 35), y_axis_title_vjust = 0.5,
  y_axis_title = "Freq", x_axis_title = "Values of mpg")
Histogram with outlier bins
Description
Create a histogram with outlier bins
Usage
histogram_w_outlier_bins( vector = NULL, bin_cutoffs = NULL, outlier_bin_left = TRUE, outlier_bin_right = TRUE, x_tick_marks = NULL, x_tick_mark_labels = NULL, y_tick_marks = NULL, outlier_bin_fill_color = "coral", non_outlier_bin_fill_color = "cyan4", border_color = "black", y_axis_title_vjust = 0.85, x_axis_title = NULL, y_axis_title = NULL, notify_na_count = NULL, plot_proportion = TRUE, plot_frequency = FALSE, mean = TRUE, ci = TRUE, median = TRUE, median_position = 15, error_bar_size = 3)Arguments
vector | a numeric vector |
bin_cutoffs | cutoff points for bins |
outlier_bin_left | logical. Should the leftmost bin be treated as an outlier bin? (default = TRUE) |
outlier_bin_right | logical. Should the rightmost bin be treated as an outlier bin? (default = TRUE) |
x_tick_marks | a vector of values at which to place tick markson the x axis. Note that the first bar spans from 0.5 to 1.5,second bar from 1.5 to 2.5, ... nth bar from n - 0.5 to n + 0.5.See the example. By default, tick marks will be placed at everycutoff point for bins |
x_tick_mark_labels | a character vector to label tick marks.By default, the vector of cutoff points for bins will also beused as labels. |
y_tick_marks | a vector of values at which to place tick markson the y axis (e.g., setting |
outlier_bin_fill_color | color to fill inside of theoutlier bins (default = "coral") |
non_outlier_bin_fill_color | color to fill inside of thenon-outlier bins (default = "cyan4") |
border_color | color for borders of the bins (default = "black") |
y_axis_title_vjust | position of the y axis title (default = 0.85). |
x_axis_title | title for x axis (default = "Value"). If |
y_axis_title | title for y axis. By default, it will be either"Proportion" or "Count". |
notify_na_count | if |
plot_proportion | logical. Should proportions be plotted,as opposed to frequencies? (default = TRUE) |
plot_frequency | logical. Should frequencies be plotted,as opposed to proportions? (default = FALSE).If |
mean | logical. Should the mean be marked on the histogram? (default = TRUE) |
ci | logical. Should the 95% confidence interval be marked on the histogram? (default = TRUE) |
median | logical. Should the median be marked on the histogram? (default = TRUE) |
median_position | position of the median label as a percentage ofheight of the tallest bin (default = 15) |
error_bar_size | size of the error bars (default = 3) |
Value
a ggplot object
Examples
histogram_w_outlier_bins(vector = 1:100, bin_cutoffs = seq(0, 100, 10))
histogram_w_outlier_bins(vector = 0:89, bin_cutoffs = seq(0, 90, 10),
  x_tick_marks = seq(0.5, 9.5, 3), x_tick_mark_labels = seq(0, 90, 30))
histogram_w_outlier_bins(vector = 1:10, bin_cutoffs = seq(0, 10, 2.5))
histogram_w_outlier_bins(vector = 1:5, bin_cutoffs = seq(0, 10, 2.5))
histogram_w_outlier_bins(vector = 1:15, bin_cutoffs = c(5.52, 10.5))
Holm-adjusted p-values
Description
Adjust a vector of p-values using the method proposed by Holm
Usage
holm_adjusted_p(p = NULL)Arguments
p | a numeric vector of p-values |
Details
See the following reference(s):
Holm (1979) https://www.jstor.org/stable/4615733
Manual for the 'p.adjust' function in the 'stats' package:
https://stat.ethz.ch/R-manual/R-devel/library/stats/html/p.adjust.html
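Assuming holm_adjusted_p() follows Holm (1979) as stated above, its results should be comparable to base R's stats::p.adjust(); the lines below are only a cross-check, not part of the package.
p <- c(.05, .01, .03)
stats::p.adjust(p, method = "holm")  # Holm-adjusted p-values from base R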
Examples
holm_adjusted_p(c(.05, .01))
holm_adjusted_p(c(.05, .05, .05))
ID across datasets
Description
Create an ID column in each of the data sets. The ID values will span across the data sets.
Usage
id_across_datasets( dt_list = NULL, id_col_name = "id", id_col_position = "first", silent = FALSE)Arguments
dt_list | a list of data.table objects |
id_col_name | name of the column that will contain ID values.By default, |
id_col_position | position of the newly created ID column.If |
silent | If |
Value
the output will be a list of data.table objects.
Examples
# running the examples below requires importing the data.table package.
prep(data.table)
id_across_datasets(dt_list = list(setDT(copy(mtcars)), setDT(copy(iris))))
id_across_datasets(
  dt_list = list(setDT(copy(mtcars)), setDT(copy(iris)), setDT(copy(women))),
  id_col_name = "newly_created_id_col",
  id_col_position = "last")
Check whether all inputs are identical
Description
Check whether all inputs are identical
Usage
identical_all(...)Arguments
... | two or more R objects. If a vector or list is entered asan input, the function will test whether the vector's or list'selements are identical. |
Value
the output will be TRUE if all inputs are identical or FALSE if not
Examples
identical_all(1:3, 1:3) # should return TRUE
identical_all(1:3, 1:3, 1:3, 1:3, 1:3) # should return TRUE
identical_all(1:3, 1:3, 1:3, 1:3, 1:3, 1:4) # should return FALSE
identical_all(1:10) # should return FALSE
identical_all(rep(1, 100)) # should return TRUE
identical_all(list(1, 1, 1)) # should return TRUE
identical_all(TRUE, FALSE) # should return FALSE
identical_all(FALSE, TRUE) # should return FALSE
Install all dependencies for all functions
Description
Install all dependencies for all functions in Package 'kim'.
Usage
install_all_dependencies()Value
there will be no output from this function. Rather, dependencies of all functions in Package 'kim' will be installed.
Examples
## Not run:
install_all_dependencies()
## End(Not run)
Kurtosis
Description
Calculate kurtosis of the sample using a formula for either (1) the biased estimator or (2) an unbiased estimator of the population kurtosis. Formulas were taken from DeCarlo (1997), doi:10.1037/1082-989X.2.3.292
Usage
kurtosis(vector = NULL, unbiased = TRUE)Arguments
vector | a numeric vector |
unbiased | logical. If |
Value
a numeric value, i.e., kurtosis of the given vector
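For readers who want to see the arithmetic, the two estimators can be written out by hand as below. This is a sketch of the standard formulas (biased: m4 / m2^2; unbiased: the small-sample correction used by Excel's KURT), offered as an assumption about what the function computes rather than a copy of its source.
x <- c(1, 2, 3, 4, 5, 10)
n <- length(x)
m2 <- mean((x - mean(x))^2)
m4 <- mean((x - mean(x))^4)
m4 / m2 ^ 2  # biased estimator (comparable to moments::kurtosis)
n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) *
  sum(((x - mean(x)) / sd(x)) ^ 4) -
  3 * (n - 1) ^ 2 / ((n - 2) * (n - 3))  # unbiased estimator (comparable to Excel's KURT)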
Examples
# calculate the unbiased estimator (e.g., kurtosis value that
# Excel 2016 will produce)
kim::kurtosis(c(1, 2, 3, 4, 5, 10))
# calculate the biased estimator (e.g., kurtosis value that
# R Package 'moments' will produce)
kim::kurtosis(c(1, 2, 3, 4, 5, 10), unbiased = FALSE)
# compare with kurtosis from 'moments' package
moments::kurtosis(c(1, 2, 3, 4, 5, 10))
lenu: Length of unique values
Description
Extract unique elements and get the length of those elements
Usage
lenu(x = NULL)Arguments
x | a vector or a data frame or an array or NULL. |
Value
the output will be the number of unique elements in 'x' (i.e., the length of 'x' after duplicate elements are removed).
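In other words, for a vector the result should generally match length(unique(x)) in base R; this equivalence is an assumption based on the description above, not a quote of the package's source.
length(unique(c(10, 3, 7, 10)))  # 3, the number of unique values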
Examples
unique(c(10, 3, 7, 10))
lenu(c(10, 3, 7, 10))
unique(c(10, 3, 7, 10, NA))
lenu(c(10, 3, 7, 10, NA))
lenu(c("b", "z", "b", "a", NA, NA, NA))
Levene's test
Description
Conduct Levene's test (i.e., test the null hypothesis that the variances in different groups are equal)
Usage
levene_test( data = NULL, dv_name = NULL, iv_1_name = NULL, iv_2_name = NULL, round_f = 2, round_p = 3, output_type = "text")Arguments
data | a data object (a data frame or a data.table) |
dv_name | name of the dependent variable |
iv_1_name | name of the first independent variable |
iv_2_name | name of the second independent variable |
round_f | number of decimal places to which to round theF-statistic from Levene's test (default = 2) |
round_p | number of decimal places to which to round thep-value from Levene's test (default = 3) |
output_type | If |
Value
the output of the function depends on the input foroutput_type. By default, the output will be theresults of Levene's test in a text format (i.e., character).
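Conceptually, Levene's test is an ANOVA on the absolute deviations of each observation from its group center. The base-R sketch below (using deviations from group means, which may differ from the exact variant this function implements) shows the idea:
d <- mtcars
d$grp <- interaction(d$vs, d$am)  # the 2 x 2 cells
abs_dev <- abs(d$mpg - ave(d$mpg, d$grp))  # absolute deviations from cell means
anova(lm(abs_dev ~ grp, data = d))  # F-test of equal spread across cells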
Examples
## Not run:
levene_test(data = mtcars, dv_name = "mpg",
  iv_1_name = "vs", iv_2_name = "am")
## End(Not run)
Log odds ratio
Description
Calculate log odds ratio (i.e., ln of odds ratio), as illustrated in Borenstein et al. (2009, p. 36, ISBN: 978-0-470-05724-7)
Usage
log_odds_ratio( data = NULL, iv_name = NULL, dv_name = NULL, contingency_table = NULL, ci = 0.95, var_include = FALSE, invert = FALSE)Arguments
data | a data object (a data frame or a data.table) |
iv_name | name of the independent variable (grouping variable) |
dv_name | name of the dependent variable (binary outcome) |
contingency_table | a contingency table, which can be directlyentered as an input for calculating the odds ratio |
ci | width of the confidence interval. Input can be any valueless than 1 and greater than or equal to 0. By default, |
var_include | logical. Should the output includevariance of the log of odds ratio? (default = FALSE) |
invert | logical. Whether the inverse of the odds ratio(i.e., 1 / odds ratio) should be returned. |
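To make the quantity concrete, the log odds ratio and its variance can be computed by hand from a 2 x 2 table using the standard formulas in Borenstein et al. (2009): ln((a * d) / (b * c)) and 1/a + 1/b + 1/c + 1/d. The snippet below is only a hand calculation for comparison; the cell orientation assumed here may differ from the function's internal convention.
ct <- matrix(c(5, 10, 95, 90), nrow = 2)  # same table as in the examples below
a <- ct[1, 1]; b <- ct[1, 2]; c2 <- ct[2, 1]; d <- ct[2, 2]
log((a * d) / (b * c2))  # log odds ratio
1 / a + 1 / b + 1 / c2 + 1 / d  # variance of the log odds ratio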
Examples
## Not run:
log_odds_ratio(data = mtcars, iv_name = "vs", dv_name = "am")
log_odds_ratio(contingency_table = matrix(c(5, 10, 95, 90), nrow = 2))
log_odds_ratio(contingency_table = matrix(c(5, 10, 95, 90), nrow = 2),
  invert = TRUE)
log_odds_ratio(contingency_table = matrix(c(34, 39, 16, 11), nrow = 2))
log_odds_ratio(contingency_table = matrix(c(34, 39, 16, 11), nrow = 2),
  var_include = TRUE)
## End(Not run)
Convert log odds ratio to Cohen's d
Description
Convert a log odds ratio to Cohen's d (standardized mean difference), as illustrated in Borenstein et al. (2009, p. 47, ISBN: 978-0-470-05724-7)
Usage
log_odds_ratio_to_d(log_odds_ratio = NULL, unname = TRUE)Arguments
log_odds_ratio | log odds ratio (the input can be a vector of values),which will be converted to Cohen's d |
unname | logical. Should the names from the input be removed?(default = TRUE) |
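The conversion described in Borenstein et al. (2009) is d = log(OR) * sqrt(3) / pi; the line below is a hand calculation for comparison with the function's output (assuming the function uses this standard formula).
log(2) * sqrt(3) / pi  # Cohen's d implied by an odds ratio of 2 (about 0.38)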
Examples
## Not run:
log_odds_ratio_to_d(log(1))
log_odds_ratio_to_d(log(2))
## End(Not run)
Logistic regression with an interaction term
Description
Conduct logistic regression for a model with an interaction between two predictor variables
Usage
logistic_reg_w_interaction( data = NULL, dv_name = NULL, iv_1_name = NULL, iv_2_name = NULL, round_p = 3, round_chi_sq = 2, dv_ordered_levels = NULL, iv_1_ordered_levels = NULL, iv_2_ordered_levels = NULL, one_line_summary_only = FALSE, p_value_interaction_only = FALSE, return_dt_w_binary = FALSE)Arguments
data | a data object (a data frame or a data.table) |
dv_name | name of the dependent variable (must be a binary variable) |
iv_1_name | name of the first independent variable |
iv_2_name | name of the second independent variable |
round_p | number of decimal places to which to roundp-values (default = 3) |
round_chi_sq | number of decimal places to which to roundchi square statistics (default = 2) |
dv_ordered_levels | a vector with the ordered levels of thedependent variable, the first and second elements of which will becoded as 0 and 1, respectively, to run logistic regression.E.g., |
iv_1_ordered_levels | (only if the first independent variableis a binary variable) a vector with the ordered levels of the firstindependent variable, the first and second elements of which will becoded as 0 and 1, respectively, to run logistic regression.E.g., |
iv_2_ordered_levels | (only if the second independent variable is a binary variable) a vector with the ordered levels of the second independent variable, the first and second elements of which will be coded as 0 and 1, respectively, to run logistic regression. E.g., |
one_line_summary_only | logical. Should the output simply be aprintout of a one-line summary on the interaction term? (default = FALSE) |
p_value_interaction_only | logical. Should the output simply be ap-value of the interaction term in the logistic regression model?(default = FALSE) |
return_dt_w_binary | logical. If |
Value
the output will be a summary of logistic regression results,unless set otherwise by arguments to the function.
Examples
logistic_reg_w_interaction(data = mtcars, dv_name = "vs",
  iv_1_name = "mpg", iv_2_name = "am")
Logistic regression
Description
Conduct a logistic regression analysis
Usage
logistic_regression( data = NULL, formula = NULL, formula_1 = NULL, formula_2 = NULL, z_values_keep = FALSE, constant_row_clean = TRUE, odds_ratio_cols_combine = TRUE, round_b_and_se = 3, round_z = 3, round_p = 3, round_odds_ratio = 3, round_r_sq = 3, round_model_chi_sq = 3, pretty_round_p_value = TRUE, print_glm_default_summary = FALSE, print_summary_dt_list = TRUE, print_model_comparison = TRUE, output_type = "summary_dt_list")Arguments
data | a data object (a data frame or a data.table) |
formula | formula for estimating a single logistic regression model |
formula_1 | formula for estimating logistic regression model 1 of 2 |
formula_2 | formula for estimating logistic regression model 2 of 2 |
z_values_keep | logical. Should the z values be kept in the table?(default = FALSE) |
constant_row_clean | logical. Should the row for the constantbe cleared except for b and standard error of b? (default = TRUE) |
odds_ratio_cols_combine | logical. Should the odds ratio columnsbe combined? (default = TRUE) |
round_b_and_se | number of decimal places to which to roundb and standard error of b (default = 3) |
round_z | number of decimal places to which to roundz values (default = 3) |
round_p | number of decimal places to which to roundp-values (default = 3) |
round_odds_ratio | number of decimal places to which to roundodds ratios (default = 3) |
round_r_sq | number of decimal places to which to roundR-squared values (default = 3) |
round_model_chi_sq | number of decimal places to which to roundmodel chi-squared values (default = 3) |
pretty_round_p_value | logical. Should the p-values be roundedin a pretty format (i.e., lower threshold: "<.001").By default, |
print_glm_default_summary | logical. Should the default summaryoutput of the glm objects be printed? (default = FALSE) |
print_summary_dt_list | logical. Should the summaries oflogistic regressions in a data table format be printed? (default = TRUE) |
print_model_comparison | logical. Should the comparison oftwo logistic regression models be printed? (default = TRUE) |
output_type | If |
Value
the output will be a summary of logistic regression results, unless set otherwise by the output_type argument to the function.
Examples
logistic_regression(data = mtcars, formula = am ~ mpg)
logistic_regression(data = mtcars,
  formula_1 = am ~ mpg,
  formula_2 = am ~ mpg + wt)
Logistic regression table
Description
Construct a table of logistic regression results from the given glm object estimating a logistic regression model.
Usage
logistic_regression_table( logistic_reg_glm_object = NULL, z_values_keep = FALSE, constant_row_clean = TRUE, odds_ratio_cols_combine = TRUE, round_b_and_se = 3, round_z = 3, round_p = 3, round_odds_ratio = 3, round_r_sq = 3, round_model_chi_sq = 3, pretty_round_p_value = TRUE)Arguments
logistic_reg_glm_object | a glm object estimating alogistic regression model |
z_values_keep | logical. Should the z values be kept in the table?(default = FALSE) |
constant_row_clean | logical. Should the row for the constantbe cleared except for b and standard error of b? (default = TRUE) |
odds_ratio_cols_combine | logical. Should the odds ratio columnsbe combined? (default = TRUE) |
round_b_and_se | number of decimal places to which to roundb and standard error of b (default = 3) |
round_z | number of decimal places to which to roundz values (default = 3) |
round_p | number of decimal places to which to roundp-values (default = 3) |
round_odds_ratio | number of decimal places to which to roundodds ratios (default = 3) |
round_r_sq | number of decimal places to which to roundR-squared values (default = 3) |
round_model_chi_sq | number of decimal places to which to roundmodel chi-squared values (default = 3) |
pretty_round_p_value | logical. Should the p-values be roundedin a pretty format (i.e., lower threshold: "<.001").By default, |
Value
the output will be a summary of logistic regression results.
Examples
logistic_regression_table(logistic_reg_glm_object =
  glm(formula = am ~ mpg, family = binomial(), data = mtcars))
logistic_regression_table(logistic_reg_glm_object =
  glm(formula = am ~ mpg, family = binomial(), data = mtcars),
  z_values_keep = TRUE, constant_row_clean = FALSE,
  odds_ratio_cols_combine = FALSE)
Loglinear analysis
Description
Conduct a loglinear analysis
Usage
loglinear_analysis( data = NULL, dv_name = NULL, iv_1_name = NULL, iv_2_name = NULL, iv_1_values = NULL, iv_2_values = NULL, output = "all", round_p = 3, round_chi_sq = 2, mosaic_plot = TRUE, report_as_field = FALSE)Arguments
data | a data object (a data frame or a data.table) |
dv_name | name of the dependent variable |
iv_1_name | name of the first independent variable |
iv_2_name | name of the second independent variable |
iv_1_values | restrict all analyses to observations havingthese values for the first independent variable |
iv_2_values | restrict all analyses to observations havingthese values for the second independent variable |
output | type of the output. If |
round_p | number of decimal places to which to roundp-values (default = 3) |
round_chi_sq | number of decimal places to which to roundchi-squared test statistics (default = 2) |
mosaic_plot | If |
report_as_field | If |
Examples
loglinear_analysis(data = data.frame(Titanic), "Survived", "Sex", "Age")
Remove outliers using the MAD method
Description
Detect outliers in a numeric vector using the Median Absolute Deviation (MAD) method and remove or convert them. For more information on MAD, see Leys et al. (2013), doi:10.1016/j.jesp.2013.03.013
Usage
mad_remove_outliers( x = NULL, threshold = 2.5, constant = 1.4826, convert_outliers_to = NA, output_type = "converted_vector")Arguments
x | a numeric vector |
threshold | the threshold value for determining outliers.If |
constant | scale factor for the 'mad' function in the 'stats'package. It is the constant linked to the assumed distribution.In case of normality, constant = 1.4826.By default, |
convert_outliers_to | the value to which outliers will be converted.For example, if |
output_type | type of the output.If |
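The cutoff rule described above can be sketched with stats::mad() as follows; this is an illustrative approximation of the MAD method (Leys et al., 2013), not the function's exact code.
x <- c(1, 3, 3, 6, 8, 10, 10, 1000)
cutoffs <- median(x) + c(-1, 1) * 2.5 * mad(x, constant = 1.4826)
cutoffs  # lower and upper bounds for non-outliers
x[x < cutoffs[1] | x > cutoffs[2]]  # values flagged as outliers (here, 1000)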
Examples
## Not run:
mad_remove_outliers(x = c(1, 3, 3, 6, 8, 10, 10, 1000))
mad_remove_outliers(x = c(1, 3, 3, 6, 8, 10, 10, 1000, -10000))
# return the vector with the outlier converted to NA values
mad_remove_outliers(x = c(1, 3, 3, 6, 8, 10, 10, 1000, -10000),
  output_type = "converted_vector")
# return the cutoff values for determining outliers
mad_remove_outliers(x = c(1, 3, 3, 6, 8, 10, 10, 1000, -10000),
  output_type = "cutoff_values")
# return the outliers
mad_remove_outliers(x = c(1, 3, 3, 6, 8, 10, 10, 1000, -10000),
  output_type = "outliers")
mad_remove_outliers(x = c(1, 3, 3, 6, 8, 10, 10, 1000, -10000),
  output_type = "non_outlier_values")
## End(Not run)
Mann-Whitney U Test (Also called Wilcoxon Rank-Sum Test)
Description
A nonparametric equivalent of the independent t-test
Usage
mann_whitney( data = NULL, iv_name = NULL, dv_name = NULL, iv_level_order = NULL, sigfigs = 3)Arguments
data | a data object (a data frame or a data.table) |
iv_name | name of the independent variable (grouping variable) |
dv_name | name of the dependent variable (measure variableof interest) |
iv_level_order | order of levels in the independentvariable. By default, it will be set as levels of theindependent variable ordered using R's base function |
sigfigs | number of significant digits to round to |
Value
the output will be a data.table object with all pairwiseMann-Whitney test results
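For a single pair of groups, the underlying test is available in base R as stats::wilcox.test(); the lines below are only a cross-check against one of the pairwise results the function reports (assuming it relies on the same rank-sum test).
d <- subset(iris, Species %in% c("setosa", "versicolor"))
d$Species <- droplevels(d$Species)
wilcox.test(Sepal.Length ~ Species, data = d)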
Examples
mann_whitney(data = iris, iv_name = "Species", dv_name = "Sepal.Length")
Prepare a two-column data.table that will be used to fill values in a matrix
Description
Prepare a two-column data.table that will be used to fill values in a matrix
Usage
matrix_prep_dt(row_var_names = NULL, col_var_names = NULL)Arguments
row_var_names | a vector of variable names, each of which will beheader of a row in the eventual matrix |
col_var_names | a vector of variable names, each of which will beheader of a column in the eventual matrix |
Examples
matrix_prep_dt(
  row_var_names = c("mpg", "cyl"), col_var_names = c("hp", "gear"))
Mean center
Description
Mean-center a variable, i.e., subtract the mean of a numeric vector from each value in the numeric vector
Usage
mean_center(x)Arguments
x | a numeric vector; though not thoroughly tested, the functioncan accept a matrix as an input. |
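A base-R sketch of the same operation is shown below; the equivalence with scale() is an assumption based on the description above, not a statement about the function's source.
x <- 1:5
x - mean(x)  # manual mean-centering
drop(scale(x, scale = FALSE))  # base-R equivalent (returns the centered values)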
Examples
mean_center(1:5)
mean_center(1:6)
# if the input is a matrix
matrix(1:9, nrow = 3)
mean_center(matrix(1:9, nrow = 3))
Mediation analysis
Description
Conducts a mediation analysis to estimate an independent variable's indirect effect on the dependent variable through a mediator variable. The current version of the package only supports a simple mediation model consisting of one independent variable, one mediator variable, and one dependent variable.
Usage
mediation_analysis( data = NULL, iv_name = NULL, mediator_name = NULL, dv_name = NULL, covariates_names = NULL, robust_se = TRUE, iterations = 1000, sigfigs = 3, output_type = "summary_dt", silent = FALSE)Arguments
data | a data object (a data frame or a data.table) |
iv_name | name of the independent variable |
mediator_name | name of the mediator variable |
dv_name | name of the dependent variable |
covariates_names | names of covariates to control for |
robust_se | if |
iterations | number of bootstrap samples. The default is set at 1000, but consider increasing the number of samples to 5000, 10000, or an even larger number, if a longer running time is not an issue. |
sigfigs | number of significant digits to round to |
output_type | if |
silent | if |
Details
This function requires installing Package 'mediation' v4.5.0 (or possibly a higher version) by Tingley et al. (2019), and uses the source code from a function in the package.
https://cran.r-project.org/package=mediation
Value
ifoutput_type = "summary_dt", which is the default,the output will be a data.table showing a summary of mediationanalysis results; ifoutput_type = "mediate_output",the output will be the output from themediate functionin the 'mediate' package; ifoutput_type = "indirect_effect_p",the output will be the p-value associated with the indirect effectestimated in the mediation model (a numeric vector of length one).
Examples
mediation_analysis(
  data = mtcars, iv_name = "cyl", mediator_name = "disp",
  dv_name = "mpg", iterations = 100)
mediation_analysis(
  data = iris, iv_name = "Sepal.Length", mediator_name = "Sepal.Width",
  dv_name = "Petal.Length", iterations = 100)
Merge a list of data tables
Description
Successively merge a list of data.table objects in a recursive fashion. That is, merge the second data table in the list around the first data table in the list; then, around this resulting data table, merge the third data table in the list; and so on.
Usage
merge_data_table_list(dt_list = NULL, id = NULL, silent = TRUE)Arguments
dt_list | a list of data.table objects |
id | name(s) of the column(s) that will contain the ID valuesin the two data tables. The name(s) of the ID column(s) must be identicalin the two data tables. |
silent | If |
Details
If there are any duplicated ID values and column names across the data tables, the cell values in the earlier data table will remain intact and the cell values in the later data table will be discarded for the resulting merged data table in each recursion.
Value
a data.table object, which successively merges (joins) a data table around (i.e., outside) the previous data table in the list of data tables.
Examples
data_1 <- data.table::data.table(
  id_col = c(4, 2, 1, 3), a = 3:6, b = 5:8, c = c("w", "x", "y", "z"))
data_2 <- data.table::data.table(
  id_col = c(1, 4, 99), d = 6:8, b = c("p", "q", "r"),
  e = c(TRUE, FALSE, FALSE))
data_3 <- data.table::data.table(
  id_col = c(200, 3), f = 11:12, b = c(300, "abc"))
merge_data_table_list(dt_list = list(data_1, data_2, data_3), id = "id_col")
Merge data tables
Description
Merge two data.table objects. If there are any duplicated ID values and column names across the two data tables, the cell values in the first data.table will remain intact and the cell values in the second data.table will be discarded for the resulting merged data table.
Usage
merge_data_tables(dt1 = NULL, dt2 = NULL, id = NULL, silent = TRUE)Arguments
dt1 | the first data.table which will remain intact |
dt2 | the second data.table which will be joined outside of(around) the first data.table. If there are any duplicatedID values and column names across the two data tables, thecell values in the first data.table will remain intact andthe cell values in the second data.table will be discarded for theresulting merged data table. |
id | name(s) of the column(s) that will contain the ID valuesin the two data tables. The name(s) of the ID column(s) must be identicalin the two data tables. |
silent | If |
Value
a data.table object, which merges (joins) the second data.table around the first data.table.
Examples
## Example 1: Typical Usage
data_1 <- data.table::data.table(
  id_col = c(4, 2, 1, 3), a = 3:6, b = 5:8, c = c("w", "x", "y", "z"))
data_2 <- data.table::data.table(
  id_col = c(1, 99, 4), e = 6:8, b = c("p", "q", "r"),
  d = c(TRUE, FALSE, FALSE))
# check the two example data tables
data_1
data_2
# check the result of merging the two data tables above and
# note how data_1 (the upper left portion) is intact in the resulting
# data table
merge_data_tables(dt1 = data_1, dt2 = data_2, id = "id_col")
# compare the result above with the result from the `merge` function
merge(data_1, data_2, by = "id_col", all = TRUE)
## Example 2: Some values can be converted
data_3 <- data.table::data.table(
  id_col = 99, a = "abc", b = TRUE, c = TRUE)
data_1
data_3
merge_data_tables(data_1, data_3, id = "id_col")
# In the example above, note how the value of TRUE gets
# converted to 1 in the last row of Column 'b' in the resulting data table
## Example 3: A simpler case
data_4 <- data.table::data.table(id_col = c(5, 3), a = c("a", NA))
data_5 <- data.table::data.table(id_col = 1, a = 2)
# check the two example data tables
data_4
data_5
merge_data_tables(data_4, data_5, id = "id_col")
## Example 4: Merging data tables using multiple ID columns
data_6 <- data.table::data.table(
  id_col_1 = 3:1, id_col_2 = c("a", "b", "c"), id_col_3 = 4:6,
  a = 7:9, b = 10:12)
data_7 <- data.table::data.table(
  id_col_1 = c(3, 2), id_col_3 = c(3, 5), id_col_2 = c("a", "b"),
  c = 13:14, a = 15:16)
# check the example data sets
data_6
data_7
# merge data sets using the three id columns
suppressWarnings(merge_data_tables(
  dt1 = data_6, dt2 = data_7,
  id = c("id_col_1", "id_col_2", "id_col_3")))
Mixed ANOVA 2-Way (Two-Way Mixed ANOVA)
Description
Conduct a two-way mixed analysis of variance (ANOVA).
Usage
mixed_anova_2_way( data = NULL, iv_name_bw_group = NULL, repeated_measures_col_names = NULL, iv_name_bw_group_values = NULL, colors = NULL, error_bar = "ci", position_dodge = 0.13, legend_title = NULL, x_axis_expansion_add = c(0.2, 0.03), x_axis_title = NULL, y_axis_title = "Mean", output = "all")Arguments
data | a data object (a data frame or a data.table) |
iv_name_bw_group | name of the between-group independent variable |
repeated_measures_col_names | names of the columns containingthe repeated measures |
iv_name_bw_group_values | restrict all analyses toobservations having these values for the between-groupindependent variable |
colors | colors of the dots and lines connecting means(default = NULL) If there are exactly two repeated measures,then, by default, |
error_bar | if |
position_dodge | by how much should the group means and error barsbe horizontally offset from each other so as not to overlap?(default = 0.13) |
legend_title | a character for the legend title. If no inputis entered, then, by default, the legend title will be removed. |
x_axis_expansion_add | inputs for the |
x_axis_title | a character string for the x-axis title.If |
y_axis_title | a character string for the y-axis title(default = "Mean"). If |
output | output type can be one of the following: |
Details
The following package(s) must be installed prior to running this function: Package 'afex' v3.0.9 (or possibly a higher version) by Fox et al. (2020), https://cran.r-project.org/package=car
Examples
mixed_anova_2_way(
  data = iris, iv_name_bw_group = "Species",
  repeated_measures_col_names = c("Sepal.Length", "Petal.Length"))
g1 <- mixed_anova_2_way(
  data = iris, iv_name_bw_group = "Species",
  repeated_measures_col_names = c("Sepal.Length", "Petal.Length"),
  error_bar = "se", output = "plot")
Find modes of objects
Description
Find modes of objects
Usage
modes_of_objects(...)Arguments
... | R objects. |
Value
the output will be a data.table listing the objects and their modes.
Examples
modes_of_objects(TRUE, FALSE, 1L, 1:3, 1.1, c(1.2, 1.3), "abc", 1 + 2i, intToBits(1L))
Multiple regression
Description
Conduct multiple regression analysis and summarize the resultsin a data.table.
Usage
multiple_regression( data = NULL, formula = NULL, vars_to_mean_center = NULL, mean_center_vars = NULL, sigfigs = NULL, round_digits_after_decimal = NULL, round_p = NULL, pretty_round_p_value = TRUE, return_table_upper_half = FALSE, round_r_squared = 3, round_f_stat = 2, prettify_reg_table_col_names = TRUE, silent = FALSE, save_as_png = FALSE, png_name = NULL, width = 1600, height = 1200, units = "px", res = 200)Arguments
data | a data object (a data frame or a data.table) |
formula | a formula object for the regression equation |
vars_to_mean_center | (deprecated) a character vector specifying namesof variables that will be mean-centered before the regression modelis estimated |
mean_center_vars | a character vector specifying namesof variables that will be mean-centered before the regression modelis estimated |
sigfigs | number of significant digits to round to |
round_digits_after_decimal | round to nth digit after decimal(alternative to |
round_p | number of decimal places to round p values(overrides all other rounding arguments) |
pretty_round_p_value | logical. Should the p-values be roundedin a pretty format (i.e., lower threshold: "<.001").By default, |
return_table_upper_half | logical. Should only the upper partof the table be returned?By default, |
round_r_squared | number of digits after the decimal both r-squaredand adjusted r-squared values should be rounded to (default 3) |
round_f_stat | number of digits after the decimal the f statisticof the regression model should be rounded to (default 2) |
prettify_reg_table_col_names | logical. Should the column namesof the regression table be made pretty (e.g., change "std_beta" to"Std. Beta")? (Default = |
silent | If |
save_as_png | if |
png_name | name of the PNG file to be saved. By default, the name will be "mult_reg_" followed by a timestamp of the current time. The timestamp will be in the format jan_01_2021_1300_10_000001, where "jan_01_2021" would indicate January 01, 2021; 1300 would indicate 13:00 (i.e., 1 PM); and 10_000001 would indicate 10.000001 seconds after the hour. |
width | width of the PNG file (default = 1600) |
height | height of the PNG file (default = 1200) |
units | the units for the |
res | The nominal resolution in ppi which will be recordedin the png file, if a positive integer. Used for unitsother than the default. By default, |
Details
To include standardized beta(s) in the regression results table, the following package(s) must be installed prior to running the function: Package 'lm.beta' v1.5-1 (or possibly a higher version) by Stefan Behrendt (2014), https://cran.r-project.org/package=lm.beta
Value
the output will be a data.table showing multiple regression results.
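As a point of reference, mean-centering a predictor (the mean_center_vars argument) simply subtracts that variable's mean before the model is fit. A minimal base-R sketch of the idea, not the function's internal code ('gear_centered' is a made-up name used only for illustration):
dt <- data.table::as.data.table(mtcars)
dt[, gear_centered := gear - mean(gear, na.rm = TRUE)]  # mean-center 'gear'
summary(lm(mpg ~ gear_centered * cyl, data = dt))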
Examples
multiple_regression(data = mtcars, formula = mpg ~ gear * cyl)
multiple_regression(
  data = mtcars, formula = mpg ~ gear * cyl,
  mean_center_vars = "gear", round_digits_after_decimal = 2)
multiple_regression(
  data = mtcars, formula = mpg ~ gear * cyl, png_name = "mtcars reg table 1")
Find noncentrality parameter
Description
Find noncentrality parameter
Usage
noncentrality_parameter(t_stat, df, initial_value = 0, ci = 0.95)Arguments
t_stat | the t-statistic associated with the noncentrality parameters |
df | degrees of freedom associated with the noncentrality parameters |
initial_value | initial value of the noncentrality parameter foroptimization (default = 0). Adjust this value if results look strange. |
ci | width of the confidence interval associated with thenoncentrality parameters (default = 0.95) |
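To clarify the idea behind the confidence interval, here is a rough base-R sketch of the standard construction (an assumption added for illustration, not necessarily this function's exact implementation): the upper limit of the noncentrality parameter is the value at which the observed t statistic falls at the 2.5th percentile of the noncentral t distribution, and the lower limit the value at which it falls at the 97.5th percentile.
t_stat <- 4.29
df <- 9
# find the noncentrality parameters that bracket the observed t statistic
upper <- uniroot(function(d) stats::pt(t_stat, df, ncp = d) - 0.025, c(0, 30))$root
lower <- uniroot(function(d) stats::pt(t_stat, df, ncp = d) - 0.975, c(0, 30))$root
c(lower, upper)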
Examples
noncentrality_parameter(4.29, 9)
Odds ratio
Description
Calculate odds ratio, as illustrated in Borenstein et al. (2009, pp. 33-36, ISBN: 978-0-470-05724-7)
Usage
odds_ratio( data = NULL, iv_name = NULL, dv_name = NULL, contingency_table = NULL, ci = 0.95, round_ci_limits = 2, invert = FALSE)Arguments
data | a data object (a data frame or a data.table) |
iv_name | name of the independent variable (grouping variable) |
dv_name | name of the dependent variable (binary outcome) |
contingency_table | a contingency table, which can be directlyentered as an input for calculating the odds ratio |
ci | width of the confidence interval. Input can be any valueless than 1 and greater than or equal to 0. By default, |
round_ci_limits | number of decimal places to which toround the limits of the confidence interval (default = 2) |
invert | logical. Whether the inverse of the odds ratio(i.e., 1 / odds ratio) should be returned. |
Examples
## Not run:
odds_ratio(data = mtcars, iv_name = "vs", dv_name = "am")
odds_ratio(data = mtcars, iv_name = "vs", dv_name = "am", ci = 0.9)
odds_ratio(contingency_table = matrix(c(5, 10, 95, 90), nrow = 2))
odds_ratio(contingency_table = matrix(c(5, 10, 95, 90), nrow = 2), invert = TRUE)
odds_ratio(contingency_table = matrix(c(34, 39, 16, 11), nrow = 2))
## End(Not run)
Order rows specifically in a data table
Description
Order rows in a data.table in a specific order
Usage
order_rows_specifically_in_dt( dt = NULL, col_to_order_by = NULL, specific_order = NULL)Arguments
dt | a data.table object |
col_to_order_by | a character value indicatingthe name of the column by which to order the data.table |
specific_order | a vector indicating a specific order ofthe values in the column by which to order the data.table. |
Value
the output will be a data.table object whose rows will be ordered as specified.
Examples
order_rows_specifically_in_dt(mtcars, "carb", c(3, 2, 1, 4, 8, 6))
Outlier
Description
Return outliers in a vector
Usage
outlier(x = NULL, iqr = 1.5, na.rm = TRUE, type = 7, unique_outliers = FALSE)Arguments
x | a numeric vector |
iqr | a nonnegative constant by which interquartile range (IQR)will be multiplied to build a "fence," outside which observationswill be considered outliers. For example, if |
na.rm | logical. |
type |
|
unique_outliers | logical. If |
Value
the output will be a numeric vector containing the outliers in the given vector.
Examples
# Example 1
outlier(c(1:10, 100))
# The steps below show how the outlier, 100, was obtained
# v1 is the vector of interest
v1 <- c(1:10, 100)
# quantiles
stats::quantile(v1)
# first and third quartiles
q1 <- stats::quantile(v1, 0.25)
q3 <- stats::quantile(v1, 0.75)
# interquartile range
interquartile_range <- unname(q3 - q1)
# fence, using the default 1.5 as the factor to multiply the IQR
cutoff_low <- unname(q1 - 1.5 * interquartile_range)
cutoff_high <- unname(q3 + 1.5 * interquartile_range)
v1[v1 < cutoff_low | v1 > cutoff_high]
Find the overlapping interval of two ranges.
Description
This function should be applied to cases where the two ranges are inclusive of both endpoints. For example, the function can work for a pair of ranges like [0, 1] and [3, 4], but not for pairs like [0, 1) and (3, 5).
Usage
overlapping_interval( interval_1_begin = NULL, interval_1_end = NULL, interval_2_begin = NULL, interval_2_end = NULL)Arguments
interval_1_begin | a number at which the first interval begins(the left INCLUSIVE endpoint of interval 1) |
interval_1_end | a number at which the first interval ends(the right INCLUSIVE endpoint of interval 1) |
interval_2_begin | a number at which the second interval begins(the left INCLUSIVE endpoint of interval 2) |
interval_2_end | a number at which the second interval ends(the right INCLUSIVE endpoint of interval 2) |
Value
the output will be NULL if there is no overlapping region, or a vector of the endpoints of the overlapping interval.
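For instance, the inclusive ranges [0, 1] and [3, 4] mentioned above share no points, so a call like the one below would be expected to return NULL (a small illustration added here for clarity; it is not one of the package's shipped examples):
overlapping_interval(0, 1, 3, 4)  # no overlap, so NULL is expected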
Examples
overlapping_interval(1, 3, 2, 4)
overlapping_interval(1, 2.22, 2.22, 3)
Paste0
Description
A shorthand for the function paste0: concatenate vectors after converting to character.
Usage
p0(..., collapse = NULL, recycle0 = FALSE)Arguments
... | one or more R objects, to be converted to character vectors.This is the same argument that would be used in the |
collapse | an optional character string to separate the results.Not NA_character_.This is the same argument that would be used in the |
recycle0 | logical indicating if zero-length characterarguments should lead to the zero-length character(0)after the sep-phase (which turns into "" in thecollapse-phase, i.e., when collapse is not NULL).This is the same argument that would be used in the |
Examples
paste0("a", "b")
p0("a", "b")
Packages - List the default packages
Description
List the default packages in R
Usage
package_list_default(package_type = c("base", "recommended"))Arguments
package_type | a vector of package types. By default, |
Examples
package_list_default()
package_list_default(package_type = "base")
Parallel analysis
Description
Conducts a parallel analysis to determine how many factors to retain in a factor analysis.
Usage
parallel_analysis( data = NULL, names_of_vars = NULL, iterations = NULL, percentile_for_eigenvalue = 95, line_types = c("dashed", "solid"), colors = c("red", "blue"), eigenvalue_random_label_x_pos = NULL, eigenvalue_random_label_y_pos = NULL, unadj_eigenvalue_label_x_pos = NULL, unadj_eigenvalue_label_y_pos = NULL, label_offset_percent = 2, label_size = 6, dot_size = 5, line_thickness = 1.5, y_axis_title_vjust = 0.8, title_text_size = 26, axis_text_size = 22)Arguments
data | a data object (a data frame or a data.table) |
names_of_vars | names of the variables |
iterations | number of random data sets. If no input is entered,this value will be set as 30 * number of variables. |
percentile_for_eigenvalue | percentile used in estimating bias(default = 95). |
line_types | types of the lines connecting eigenvalues.By default, |
colors | colors of the dots and lines denoting eigenvalues (default = c("red", "blue")). |
eigenvalue_random_label_x_pos | (optional) x coordinate ofthe label for eigenvalues from randomly generated data. |
eigenvalue_random_label_y_pos | (optional) y coordinate ofthe label for eigenvalues from randomly generated data. |
unadj_eigenvalue_label_x_pos | (optional) x coordinate ofthe label for unadjusted eigenvalues |
unadj_eigenvalue_label_y_pos | (optional) y coordinate ofthe label for unadjusted eigenvalues |
label_offset_percent | How much should labels for theeigenvalue curves be offset, as a percentage of the plot'sx and y range? (default = 2) |
label_size | size of the labels for the eigenvalue curves(default = 6). |
dot_size | size of the dots denoting eigenvalues (default = 5). |
line_thickness | thickness of the eigenvalue curves (default = 1.5). |
y_axis_title_vjust | position of the y axis title as aproportion of the range (default = 0.8). |
title_text_size | size of the plot title (default = 26). |
axis_text_size | size of the text on the axes (default = 22). |
Details
The following package(s) must be installed prior to running the function: Package 'paran' v1.5.2 (or possibly a higher version) by Alexis Dinno (2018), https://cran.r-project.org/package=paran
Examples
parallel_analysis(data = mtcars, names_of_vars = c("disp", "hp", "drat"))
# parallel_analysis(
#   data = mtcars, names_of_vars = c("carb", "vs", "gear", "am"))
Percentile rank
Description
Calculate percentile rank of each value in a vector
Usage
percentile_rank(vector)Arguments
vector | a numeric vector |
Examples
percentile_rank(1:5)
percentile_rank(1:10)
percentile_rank(1:100)
Pivot Table
Description
Create a pivot table.
Usage
pivot_table( data = NULL, row_names = NULL, col_names = NULL, function_as_character = NULL, sigfigs = 3, output = "dt", remove_col_names = TRUE)Arguments
data | a data object (a data frame or a data.table) |
row_names | names of variables for constructing rows |
col_names | names of variables for constructing columns |
function_as_character | function to perform for each cell inthe pivot table |
sigfigs | number of significant digits to which to roundvalues in the pivot table (default = 3) |
output | type of output. If |
remove_col_names | logical. Should the column names(i.e., v1, v2, ...) be removed in the data table output? |
Value
the output will be a contingency table in a data.table format
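For readers more familiar with data.table, the first usage example below is conceptually similar to the dcast() call sketched here (an analogy added for clarity, not the function's internal implementation):
data.table::dcast(
  data.table::as.data.table(mtcars),
  cyl + vs ~ am, value.var = "mpg", fun.aggregate = mean)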
Examples
pivot_table(
  data = mtcars, col_names = "am", row_names = c("cyl", "vs"),
  function_as_character = "mean(mpg)")
pivot_table(
  data = mtcars, col_names = "am", row_names = c("cyl", "vs"),
  function_as_character = "sum(mpg < 17)")
pivot_table(
  data = mtcars, col_names = "am", row_names = c("cyl", "vs"),
  function_as_character = "round(sum(mpg < 17) / sum(!is.na(mpg)) * 100, 0)")
Plot group means
Description
Creates a plot of sample means and error bars by group.
Usage
plot_group_means( data = NULL, dv_name = NULL, iv_name = NULL, na.rm = TRUE, error_bar = "ci", error_bar_range = 0.95, error_bar_tip_width = 0.13, error_bar_thickness = 1, error_bar_caption = TRUE, lines_connecting_means = TRUE, line_colors = NULL, line_types = NULL, line_thickness = 1, line_size = NULL, dot_size = 3, position_dodge = 0.13, x_axis_title = NULL, y_axis_title = NULL, y_axis_title_vjust = 0.85, legend_title = NULL, legend_position = "right")Arguments
data | a data object (a data frame or a data.table) |
dv_name | name of the dependent variable |
iv_name | name(s) of the independent variable(s).Up to two independent variables can be supplied. |
na.rm | logical. If |
error_bar | if |
error_bar_range | width of the confidence or prediction interval(default = 0.95 for 95 percent confidence or prediction interval).This argument will not apply when |
error_bar_tip_width | graphically, width of the segmentsat the end of error bars (default = 0.13) |
error_bar_thickness | thickness of the error bars (default = 1) |
error_bar_caption | should a caption be included to indicatethe width of the error bars? (default = TRUE). |
lines_connecting_means | logical. Should lines connecting meanswithin each group be drawn? (default = TRUE) |
line_colors | colors of the lines connecting means (default = NULL)If the second IV has two levels, then by default, |
line_types | types of the lines connecting means (default = NULL)If the second IV has two levels, then by default, |
line_thickness | thickness of the lines connecting group means(default = 1) |
line_size | Deprecated. Use the 'linewidth' argument instead.(default = 1) |
dot_size | size of the dots indicating group means (default = 3) |
position_dodge | by how much should the group means and error barsbe horizontally offset from each other so as not to overlap?(default = 0.13) |
x_axis_title | a character string for the x-axis title. If noinput is entered, then, by default, the first value of |
y_axis_title | a character string for the y-axis title. If noinput is entered, then, by default, |
y_axis_title_vjust | position of the y axis title (default = 0.85).By default, |
legend_title | a character for the legend title. If no inputis entered, then, by default, the second value of |
legend_position | position of the legend: |
Value
by default, the output will be a ggplot object. If output = "table", the output will be a data.table object.
Examples
plot_group_means(data = mtcars, dv_name = "mpg", iv_name = c("vs", "am"))
plot_group_means(
  data = mtcars, dv_name = "mpg", iv_name = c("vs", "am"), error_bar = "se")
plot_group_means(
  data = mtcars, dv_name = "mpg", iv_name = c("vs", "am"),
  error_bar = "pi", error_bar_range = 0.99)
# set line colors and types manually
plot_group_means(
  data = mtcars, dv_name = "mpg", iv_name = c("vs", "am"),
  line_colors = c("green4", "purple"), line_types = c("solid", "solid"))
# remove axis titles
plot_group_means(
  data = mtcars, dv_name = "mpg", iv_name = c("vs", "am"),
  x_axis_title = FALSE, y_axis_title = FALSE, legend_title = FALSE)
Paste for message
Description
Combines the base functions paste0 and message
Usage
pm(..., collapse = NULL)Arguments
... | one or more R objects, to be converted to character vectors.Input(s) to this argument will be passed onto the paste0 function. |
collapse | an optional character string to separate the results.Not |
Value
there will be no output from this function. Rather, a message will be generated from the arguments.
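In other words, a call such as pm("hello", 123) behaves like the base-R expression sketched below (an illustration of the idea, not the function's verbatim source):
message(paste0("hello", 123))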
Examples
pm("hello", 123)
pm(c("hello", 123), collapse = ", ")
Population variance of a vector
Description
Calculates the population variance, rather than the sample variance, of a vector
Usage
population_variance(vector, na.rm = TRUE)Arguments
vector | a numeric vector |
na.rm | if |
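As a quick illustration of the difference, the population variance divides the sum of squared deviations by n, whereas the sample variance (var) divides by n - 1; for the vector 1:4 (mean = 2.5):
sum((1:4 - mean(1:4))^2) / 4  # population variance = 1.25
var(1:4)                      # sample variance = 5 / 3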
Examples
population_variance(1:4)
var(1:4)
Prepare package(s) for use
Description
Installs, loads, and attaches package(s). If package(s) are not installed, installs them prior to loading and attaching.
Usage
prep( ..., pkg_names_as_object = FALSE, silent_if_successful = FALSE, silent_load_pkgs = NULL)Arguments
... | names of packages to load and attach, separated by commas,e.g., |
pkg_names_as_object | logical. If |
silent_if_successful | logical. If |
silent_load_pkgs | a character vector indicating names of packages to load silently (i.e., suppress messages that get printed when loading the packages). By default, |
Value
there will be no output from this function. Rather, packages given as inputs to the function will be installed, loaded, and attached.
Examples
prep(data.table)
prep("data.table", silent_if_successful = TRUE)
prep("base", utils, ggplot2, "data.table")
pkgs <- c("ggplot2", "data.table")
prep(pkgs, pkg_names_as_object = TRUE)
prep("data.table", silent_load_pkgs = "data.table")
Pretty round p-value
Description
Round p-values to the desired number of decimals and remove leading 0s before the decimal.
Usage
pretty_round_p_value( p_value_vector = NULL, round_digits_after_decimal = 3, include_p_equals = FALSE)Arguments
p_value_vector | one number or a numeric vector |
round_digits_after_decimal | how many digits after the decimalpoint should the p-value be rounded to? |
include_p_equals | if |
Value
the output will be a character vector with p values, e.g., a vector of strings like "< .001" (or "p < .001").
Examples
pretty_round_p_value(0.00001)
pretty_round_p_value(0.00001, round_digits_after_decimal = 4)
pretty_round_p_value(0.00001, round_digits_after_decimal = 5)
# WARNING: the line of code below adds precision that may be unwarranted
pretty_round_p_value(0.00001, round_digits_after_decimal = 6)
pretty_round_p_value(
  p_value_vector = 0.049, round_digits_after_decimal = 2,
  include_p_equals = FALSE)
pretty_round_p_value(c(0.0015, 0.0014, 0.0009), include_p_equals = TRUE)
Pretty round r
Description
Round correlation coefficients in APA style (7th Ed.)
Usage
pretty_round_r(r = NULL, round_digits_after_decimal = 2)Arguments
r | a (vector of) correlation coefficient(s) |
round_digits_after_decimal | how many digits after the decimal point should the correlation coefficient(s) be rounded to? (default = 2) |
Value
the output will be a character vector of correlation coefficient(s).
Examples
pretty_round_r(r = -0.123)
pretty_round_r(c(-0.12345, 0.45678), round_digits_after_decimal = 3)
pretty_round_r(c(-0.12, 0.45), round_digits_after_decimal = 4)
print loop progress
Description
Print current progress inside a loop (e.g., for loop or lapply)
Usage
print_loop_progress( iteration_number = NULL, iteration_start = 1, iteration_end = NULL, text_before = "", percent = 1, output_method = "cat")Arguments
iteration_number | current number of iteration |
iteration_start | iteration number at which the loop begins(default = 1) |
iteration_end | iteration number at which the loop ends. |
text_before | text to add before "Loop Progress..."By default, it is set to be blank, i.e., |
percent | if |
output_method | if |
Examples
for (i in seq_len(250)) {
  Sys.sleep(0.001)
  print_loop_progress(iteration_number = i, iteration_end = 250)
}
unlist(lapply(seq_len(7), function (i) {
  Sys.sleep(0.1)
  print_loop_progress(iteration_number = i, iteration_end = 7)
  return(i)
}))
Proportion of given values in a vector
Description
Proportion of given values in a vector
Usage
proportion_of_values_in_vector( values = NULL, vector = NULL, na.exclude = TRUE, output_type = "proportion", silent = FALSE, conf.level = 0.95, correct_yates = TRUE)Arguments
values | a set of values that will count as successes (hits) |
vector | a numeric or character vector containingsuccesses (hits) and failures (misses) |
na.exclude | if |
output_type | By default, |
silent | If |
conf.level | confidence level of the returned confidence interval.Input to this argument will be passed onto the conf.level argumentin the |
correct_yates | a logical indicating whether Yates' continuitycorrection should be applied where possible (default = TRUE).Input to this argument will be passed onto the |
Examples
proportion_of_values_in_vector(
  values = 2:3, vector = c(rep(1:3, each = 10), rep(NA, 10)))
proportion_of_values_in_vector(
  values = 2:3, vector = c(rep(1:3, each = 10), rep(NA, 10)),
  output_type = "se")
proportion_of_values_in_vector(
  values = 2:3, vector = c(rep(1:3, each = 10), rep(NA, 10)),
  conf.level = 0.99)
proportion_of_values_in_vector(
  values = c(2:3, NA), vector = c(rep(1:3, each = 10), rep(NA, 10)),
  na.exclude = FALSE)
Q statistic for testing homogeneity of correlations
Description
Calculate the Q statistic to test for homogeneity of correlation coefficients. See p. 235 of the book Hedges & Olkin (1985), Statistical Methods for Meta-Analysis (ISBN: 0123363802).
Usage
q_stat_test_homo_r(z = NULL, n = NULL)Arguments
z | a vector of z values |
n | a vector of sample sizes which will be used to calculate the weights, which in turn will be used to calculate the weighted z. |
Value
the output will be the Q statistic for testing the homogeneity of the correlation coefficients.
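For reference, the standard Hedges & Olkin formulation builds Q from the Fisher z values with weights of (n - 3); the rough sketch below mirrors the first usage example (an assumption about the underlying formula, not the function's verbatim source):
z <- 1:3
n <- c(100, 200, 300)
w <- n - 3                    # weights
z_bar <- sum(w * z) / sum(w)  # weighted mean z
sum(w * (z - z_bar)^2)        # Q statistic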
Examples
q_stat_test_homo_r(1:3, c(100, 200, 300))
q_stat_test_homo_r(z = c(1:3, NA), n = c(100, 200, 300, NA))
Read a csv file
Description
Read a csv file
Usage
read_csv(name = NULL, head = FALSE, dirname = NULL, ...)Arguments
name | a character string of the csv file name without the ".csv" extension. For example, if the csv file to read is "myfile.csv", enter |
head | logical. if |
dirname | a character string of the directory containingthe csv file, e.g., |
... | optional arguments for the |
Value
the output will be a data.table object, that is, an output from the data.table function fread
Examples
## Not run:
mydata <- read_csv("myfile")
## End(Not run)
Read the sole csv file in the working directory
Description
Read the sole csv file in the working directory
Usage
read_sole_csv(head = FALSE, ...)Arguments
head | logical. if |
... | optional arguments for the |
Value
the output will be a data.table object, that is, an output from the data.table function fread
Examples
mydata <- read_sole_csv()
mydata <- read_sole_csv(head = TRUE)
mydata <- read_sole_csv(fill = TRUE, nrows = 5)
Regular expression matches
Description
Returns elements of a character vector that match the given regular expression
Usage
regex_match(regex = NULL, vector = NULL, silent = FALSE, perl = FALSE)Arguments
regex | a regular expression |
vector | a character vector in which to search for regularexpression matches, or a data table whose column names will be searched |
silent | logical. If |
perl | logical. Should Perl-compatible regexps be used? |
Examples
regex_match("p$", names(mtcars))
colnames_ending_with_p <- regex_match("p$", names(mtcars))
Find relative position of a value in a vector
Description
Find relative position of a value in a vector that may or may not contain the value
Usage
rel_pos_of_value_in_vector(value = NULL, vector = NULL)Arguments
value | a value whose relative position is to be searched in a vector |
vector | a numeric vector |
Value
a number indicating the relative position of the value in the vector
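For example, the first usage example below would be expected to return 1.5, because the value 3 lies halfway between the first element (2) and the second element (4) of the vector (the expected output is noted here as a clarification, not copied from the package):
rel_pos_of_value_in_vector(value = 3, vector = c(2, 4))  # expected: 1.5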
Examples
rel_pos_of_value_in_vector(value = 3, vector = c(2, 4))
rel_pos_of_value_in_vector(value = 3, vector = c(2, 6))
rel_pos_of_value_in_vector(value = 3, vector = 1:3)
Find relative value of a position in a vector
Description
Find relative value of a position in a vector
Usage
rel_value_of_pos_in_vector(vector = NULL, position = NULL)Arguments
vector | a numeric vector |
position | position of a vector |
Value
a number indicating the relative value of the position in the vector
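Conversely to the previous function, this one interpolates a value from a (possibly fractional) position; for example, position 1.5 in the vector c(0, 100) would be expected to yield 50, the value halfway between the first and second elements (the expected output is noted here as a clarification, not copied from the package):
rel_value_of_pos_in_vector(vector = c(0, 100), position = 1.5)  # expected: 50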
Examples
rel_value_of_pos_in_vector(vector = c(0, 100), position = 1.5)
rel_value_of_pos_in_vector(vector = 2:4, position = 2)
rel_value_of_pos_in_vector(vector = c(2, 4, 6), position = 2.5)
Remove from a vector
Description
Remove certain values from a vector
Usage
remove_from_vector(values = NULL, vector = NULL, silent = FALSE)Arguments
values | a single value or a vector of values which will beremoved from the target vector |
vector | a character or numeric vector |
silent | if |
Value
the output will be a vector with the given values removed.
Examples
remove_from_vector(values = 1, vector = 1:3)
remove_from_vector(values = NA, vector = c(1:3, NA))
remove_from_vector(values = c(1, NA), vector = c(1:3, NA))
remove_from_vector(values = 1:5, vector = 1:10)
Remove all user installed packages
Description
Remove all user installed packages
Usage
remove_user_installed_pkgs( exceptions = NULL, type_of_pkg_to_keep = c("base", "recommended"), keep_kim = FALSE)Arguments
exceptions | a character vector of names of packages to keep |
type_of_pkg_to_keep | a character vector indicating typesof packages to keep. The default, |
keep_kim | logical. If |
Examples
## Not run:
remove_user_installed_pkgs()
## End(Not run)
Repeated-Measures ANOVA
Description
Conduct a repeated-measures analysis of variance (ANOVA). This analysis is appropriate for a within-subjects experimental design.
Usage
repeated_measures_anova( data = NULL, p_col_name = NULL, measure_vars = NULL, histograms = TRUE, round_w = 2, round_epsilon = 2, round_df_model = 2, round_df_error = 2, round_f = 2, round_ges = 2)Arguments
data | a data object (a data frame or a data.table) |
p_col_name | name of the column identifying participants |
measure_vars | names of the columns containing repeated measures(within-subjects variables) |
histograms | logical. If |
round_w | number of decimal places to which to roundW statistic from Mauchly's test (default = 2) |
round_epsilon | number of decimal places to which to roundthe epsilon statistic from Greenhouse-Geisser or Huynh-Feldtcorrection (default = 2) |
round_df_model | number of decimal places to which to roundthe corrected degrees of freedom for model (default = 2) |
round_df_error | number of decimal places to which to roundthe corrected degrees of freedom for error (default = 2) |
round_f | number of decimal places to which to roundthe F statistic (default = 2) |
round_ges | number of decimal places to which to roundgeneralized eta-squared (default = 2) |
Details
The following package(s) must be installed prior to running the function: Package 'ez' v4.4-0 (or possibly a higher version) by Michael A Lawrence (2016), https://cran.r-project.org/package=ez
Examples
## Not run:
repeated_measures_anova(
  data = mtcars, p_col_name = "cyl", measure_vars = c("wt", "qsec"))
## End(Not run)
Replace values in a data table
Description
Replace values in a data.table
Usage
replace_values_in_dt( data = NULL, old_values = NULL, new_values = NULL, silent = FALSE)Arguments
data | a data object (a data frame or a data.table) |
old_values | a vector of old values that need to be replaced |
new_values | a new value or a vector of new values that willreplace the old values |
silent | If |
Examples
replace_values_in_dt(data = mtcars, old_values = 21.0, new_values = 888)
replace_values_in_dt(data = mtcars, old_values = c(0, 1), new_values = 999)
replace_values_in_dt(data = mtcars, old_values = c(0, 1), new_values = 990:991)
replace_values_in_dt(
  data = data.table::data.table(a = NA_character_, b = NA_character_),
  old_values = NA, new_values = "")
Robust regression (bootstrapped regression)
Description
Estimate coefficients in a multiple regression model by bootstrapping.
Usage
robust_regression( data = NULL, formula = NULL, sigfigs = NULL, round_digits_after_decimal = NULL, iterations = 1000)Arguments
data | a data object (a data frame or a data.table) |
formula | a formula object for the regression equation |
sigfigs | number of significant digits to round to |
round_digits_after_decimal | round to nth digit after decimal(alternative to |
iterations | number of bootstrap samples. The default is set at 1000, but consider increasing the number of samples to 5000, 10000, or an even larger number if the longer running time is not an issue. |
Details
The following package(s) must be installed prior to running this function: Package 'boot' v1.3-26 (or possibly a higher version) by Canty & Ripley (2021), https://cran.r-project.org/package=boot
Examples
## Not run:
robust_regression(data = mtcars, formula = mpg ~ cyl * hp, iterations = 100)
## End(Not run)
Round flexibly
Description
Round numbers to a flexible number of significant digits. "Flexible" rounding refers to rounding all numbers to the highest level of precision seen among the numbers that would have resulted from the 'signif()' function in base R. The usage examples of this function demonstrate flexible rounding (see below).
Usage
round_flexibly(x = NULL, sigfigs = 3)Arguments
x | a numeric vector |
sigfigs | number of significant digits to flexibly round to.By default, |
Value
the output will be a numeric vector with values rounded to the highest level of precision seen among the numbers that result from the 'signif()' function in base R.
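A rough base-R sketch of the general idea (an assumption about the logic, not the function's actual source): find how many decimal places each value needs after signif(), then round every value to the largest of those.
x <- c(0.00012345, pi)
sig <- signif(x, 3)
# number of decimal places each value would need after signif()
decimals_needed <- vapply(sig, function(v) {
  s <- format(v, scientific = FALSE, trim = TRUE)
  if (grepl("\\.", s)) nchar(sub(".*\\.", "", s)) else 0L
}, integer(1))
round(x, max(decimals_needed))  # everything rounded to the highest precision seen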
Examples
# Example 1
# First, observe results from the 'signif' function:
c(0.00012345, pi)
signif(c(0.00012345, pi), 3)
# In the result above, notice how info is lost on some digits
# (e.g., 3.14159265 becomes 3.140000).
# In contrast, flexible rounding retains the lost info in the digits
round_flexibly(x = c(0.00012345, pi), sigfigs = 3)
# Example 2
# Again, first observe results from the 'signif' function:
c(0.12345, 1234, 0.12, 1.23, .01)
signif(c(0.12345, 1234, 0.12, 1.23, .01), 3)
# In the result above, notice how info is lost on some digits
# (e.g., 1234 becomes 1230.000).
# In contrast, flexible rounding retains the lost info in the digits.
# Specifically, in the example below, 0.12345 rounded to 3 significant
# digits (default) is signif(0.12345, 3) = 0.123 (3 decimal places).
# Because this 3 decimal places is the highest precision seen among
# all numbers, all other numbers will also be rounded to 3 decimal places.
round_flexibly(c(0.12345, 1234, 0.12, 1.23, .01))
# Example 3
# If the input is a character vector, the original input will be returned.
round_flexibly(c("a", "b", "c"))
# Example 4
# If the input is a list (e.g., a data.frame) that contains at least
# one numeric vector, the numeric vector element(s) will be rounded flexibly.
round_flexibly(data.frame(a = c(1.2345, 123.45), b = c("a", "b")))
# Example 5
# If the input is a matrix, all numbers will be rounded flexibly
round_flexibly(matrix(c(1.23, 2.345, 3.4567, 4.56789), ncol = 2), sigfigs = 3)
Scatterplot
Description
Creates a scatter plot and calculates a correlation between two variables.
Usage
scatterplot( data = NULL, x_var_name = NULL, y_var_name = NULL, print_correlation = TRUE, dot_label_var_name = NULL, weight_var_name = NULL, alpha = 1, annotate_stats = TRUE, annotate_y_pos_rel = 5, annotate_y_pos_abs = NULL, annotated_stats_color = "green4", annotated_stats_font_size = 6, annotated_stats_font_face = "bold", line_of_fit_type = "lm", ci_for_line_of_fit = FALSE, line_of_fit_color = "blue", line_of_fit_thickness = 1, dot_color = "black", x_axis_label = NULL, y_axis_label = NULL, x_axis_tick_marks = NULL, y_axis_tick_marks = NULL, dot_size = 2, dot_label_size = NULL, dot_size_range = c(3, 12), jitter_x_y_percent = 0, jitter_x_percent = 0, jitter_y_percent = 0, cap_axis_lines = TRUE, color_dots_by = NULL, png_name = NULL, save_as_png = FALSE, width = 13, height = 9)Arguments
data | a data object (a data frame or a data.table) |
x_var_name | name of the variable that will go on the x axis |
y_var_name | name of the variable that will go on the y axis |
print_correlation | should the correlation be printed in theconsole? (default = TRUE) |
dot_label_var_name | name of the variable that will be used tolabel individual observations |
weight_var_name | name of the variable by which to weightthe individual observations for calculating correlation and plottingthe line of fit |
alpha | opacity of the dots (0 = completely transparent,1 = completely opaque) |
annotate_stats | if |
annotate_y_pos_rel | position of the annotated stats, expressedas a percentage of the range of y values by which the annotatedstats will be placed above the maximum value of y in the data set(default = 5). This value will be determined relative to the data.If |
annotate_y_pos_abs | as an alternative to the argument |
annotated_stats_color | color of the annotated stats(default = "green4"). |
annotated_stats_font_size | font size of the annotated stats(default = 6). |
annotated_stats_font_face | font face of the annotated stats(default = "bold"). |
line_of_fit_type | if |
ci_for_line_of_fit | if |
line_of_fit_color | color of the line of fit (default = "blue") |
line_of_fit_thickness | thickness of the line of fit (default = 1) |
dot_color | color of the dots (default = "black") |
x_axis_label | alternative label for the x axis |
y_axis_label | alternative label for the y axis |
x_axis_tick_marks | a numeric vector indicating thepositions of the tick marks on the x axis |
y_axis_tick_marks | a numeric vector indicating thepositions of the tick marks on the y axis |
dot_size | size of the dots on the plot (default = 2) |
dot_label_size | size for dots' labels on the plot. If noinput is entered for this argument, it will be set as |
dot_size_range | minimum and maximum size for dotson the plot when they are weighted |
jitter_x_y_percent | horizontally and vertically jitter dotsby a percentage of the respective ranges of x and y values. |
jitter_x_percent | horizontally jitter dots by a percentage of therange of x values. |
jitter_y_percent | vertically jitter dots by a percentage of therange of y values |
cap_axis_lines | logical. Should the axis lines be capped at theouter tick marks? (default = TRUE) |
color_dots_by | name of the variable that will determinecolors of the dots |
png_name | name of the PNG file to be saved. By default, the name will be "scatterplot_" followed by a timestamp of the current time. The timestamp will be in the format jan_01_2021_1300_10_000001, where "jan_01_2021" would indicate January 01, 2021; 1300 would indicate 13:00 (i.e., 1 PM); and 10_000001 would indicate 10.000001 seconds after the hour. |
save_as_png | if |
width | width of the plot to be saved. This argument will bedirectly entered as the |
height | height of the plot to be saved. This argument will bedirectly entered as the |
Details
If a weighted correlation is to be calculated, the following package(s) must be installed prior to running the function: Package 'weights' v1.0 (or possibly a higher version) by Josh Pasek (2018), https://cran.r-project.org/package=weights
Value
the output will be a scatter plot, a ggplot object.
Examples
## Not run:
scatterplot(data = mtcars, x_var_name = "wt", y_var_name = "mpg")
scatterplot(
  data = mtcars, x_var_name = "wt", y_var_name = "mpg",
  dot_label_var_name = "hp", weight_var_name = "drat", annotate_stats = TRUE)
scatterplot(
  data = mtcars, x_var_name = "wt", y_var_name = "mpg",
  dot_label_var_name = "hp", weight_var_name = "cyl",
  dot_label_size = 7, annotate_stats = TRUE)
scatterplot(
  data = mtcars, x_var_name = "wt", y_var_name = "mpg", color_dots_by = "gear")
## End(Not run)
Score scale items
Description
Score items in a scale (e.g., Likert scale items) by computing the sum or mean of the items.
Usage
score_scale_items( item_list = NULL, reverse_item_list = NULL, operation = "mean", na.rm = FALSE, na_summary = TRUE, reverse_code_minuend = NULL)Arguments
item_list | a list of scale items (i.e., list of vectors of ratings)to code normally (as opposed to reverse coding). |
reverse_item_list | a list of scale items to reverse code. |
operation | if |
na.rm | logical. The |
na_summary | logical. If |
reverse_code_minuend | required for reverse coding; the number from which to subtract item ratings when reverse-coding. For example, if the items to reverse code are measured on a 7-point scale, enter |
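To make the reverse-coding rule concrete: with reverse_code_minuend = 6 (as in the examples below), a reverse-coded rating r is scored as 6 - r, so a rating of 5 becomes 1, 3 stays 3, and 1 becomes 5 (a small illustration added here, not copied from the package):
6 - c(5, 3, 1)  # reverse-coded ratings on a 5-point scale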
Examples
score_scale_items(
  item_list = list(1:5, rep(3, 5)),
  reverse_item_list = list(rep(5, 5)), reverse_code_minuend = 6)
score_scale_items(
  item_list = list(c(1, 1), c(1, 5)), reverse_item_list = list(c(5, 3)),
  reverse_code_minuend = 6, na_summary = FALSE)
score_scale_items(
  item_list = list(c(1, 1), c(1, 5)), reverse_item_list = list(c(5, 1)),
  reverse_code_minuend = 6, operation = "sum")
score_scale_items(item_list = list(1:5, rep(3, 5)))
score_scale_items(item_list = list(c(1, NA, 3), c(NA, 2, 3)))
score_scale_items(item_list = list(c(1, NA, 3), c(NA, 2, 3)), na.rm = TRUE)
Standard error of the mean
Description
Standard error of the mean
Usage
se_of_mean(vector, na.rm = TRUE, notify_na_count = NULL)Arguments
vector | a numeric vector |
na.rm | Deprecated. By default, NA values will be removedbefore calculation |
notify_na_count | if |
Value
the output will be a numeric vector of length one, which will be the standard error of the mean for the given numeric vector.
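For reference, the standard error of the mean is the sample standard deviation divided by the square root of the number of non-missing observations; a minimal check against the usage example, assuming this standard formula:
v <- c(1:10, NA)
sd(v, na.rm = TRUE) / sqrt(sum(!is.na(v)))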
Examples
se_of_mean(c(1:10, NA))
Standard Error (SE) of a percentage
Description
Calculate the standard error of a percentage. See Fowler, Jr. (2014, p. 34, ISBN: 978-1-4833-1240-8)
Usage
se_of_percentage(percent = NULL, n = NULL)Arguments
percent | a vector of percentages; each of the percentage valuesmust be between 0 and 100 |
n | a vector of sample sizes; number of observations used to calculate each of the percentage values |
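The textbook formula for the standard error of a percentage is sqrt(p * (100 - p) / n); a worked illustration for the first usage example below, assuming the function follows this standard formula:
sqrt(40 * (100 - 40) / 50)  # roughly 6.93 for 40 percent with n = 50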
Examples
se_of_percentage(percent = 40, n = 50)
se_of_percentage(percent = 50, n = 10)
Standard Error (SE) of a proportion
Description
Calculate the standard error of a proportion. See Anderson and Finn (1996, p. 364, ISBN: 978-1-4612-8466-6)
Usage
se_of_proportion(p = NULL, n = NULL)Arguments
p | a vector of proportions; each of the proportion valuesmust be between 0 and 1 |
n | a vector of sample sizes; number of observations used to calculate each of the proportion values |
Examples
se_of_proportion(p = 0.56, n = 400)
se_of_proportion(p = 0.5, n = 10)
Set up R environment
Description
Set up R environment by (1) clearing the console; (2) removing all objects in the global environment; (3) setting the working directory to the active document (in RStudio only); (4) unloading and loading the kim package.
Usage
setup_r_env( clear_console = TRUE, clear_global_env = TRUE, setwd_to_active_doc = TRUE, prep_kim = TRUE)Arguments
clear_console | if |
clear_global_env | if |
setwd_to_active_doc | if |
prep_kim | if |
Examples
## Not run:
setup_r_env()
## End(Not run)
Set working directory to active document in RStudio
Description
Set working directory to location of the active document in RStudio
Usage
setwd_to_active_doc()Value
there will be no output from this function. Rather, the working directory will be set to the location of the active document.
Examples
## Not run:
setwd_to_active_doc()
## End(Not run)
Simple Effects Analysis
Description
Conduct a simple effects analysis to probe a two-way interaction effect. See Field et al. (2012, ISBN: 978-1-4462-0045-2).
Usage
simple_effects_analysis( data = NULL, dv_name = NULL, iv_1_name = NULL, iv_2_name = NULL, iv_1_levels = NULL, iv_2_levels = NULL, print_contrast_table = "weights_sums_and_products", output = NULL)Arguments
data | a data object (a data frame or a data.table) |
dv_name | name of the dependent variable (DV) |
iv_1_name | name of the first independent variable (IV1), whosemain effects will be examined in the first set of contrasts |
iv_2_name | name of the second independent variable (IV2), whosesimple effects at each level of IV1 will be examined in the second setof contrasts |
iv_1_levels | ordered levels of IV1 |
iv_2_levels | ordered levels of IV2 |
print_contrast_table | If |
output | output can be one of the following: |
Value
By default, the function will print a table of contrasts and a table of simple effects.
Examples
factorial_anova_2_way(
  data = mtcars, dv_name = "mpg", iv_1_name = "vs", iv_2_name = "am",
  iterations = 100, plot = TRUE)
simple_effects_analysis(
  data = mtcars, dv_name = "mpg", iv_1_name = "vs", iv_2_name = "am")
Simple slopes analysis
Description
Conduct a simple slopes analysis, typically to probe a two-wayinteraction.
Usage
simple_slopes_analysis( data = NULL, iv_name = NULL, dv_name = NULL, mod_name = NULL, round_focal_value = 2, round_b = 2, round_se = 2, round_t = 2, round_p = 3, focal_values = NULL)Arguments
data | a data object (a data frame or a data.table) |
iv_name | name of the independent variable (IV) |
dv_name | name of the dependent variable (DV) |
mod_name | name of the moderator variable (MOD) |
round_focal_value | number of decimal places to which to roundthe focal values (default = 2) |
round_b | number of decimal places to which to roundcoefficients from the regression analysis (default = 2) |
round_se | number of decimal places to which to roundstandard error values from the regression analysis (default = 2) |
round_t | number of decimal places to which to roundt statistics from the regression analysis (default = 2) |
round_p | number of decimal places to which to round p values from the regression analysis (default = 3) |
focal_values | this input will be used only in cases wheremoderator is continuous. In such cases, what are the focal valuesof the moderator at which to estimate the effect of IV on DV?By default, values corresponding to the mean of MOD, andmean of MOD +/-1 SD will be used. |
Examples
simple_slopes_analysis(data = mtcars, iv_name = "vs", dv_name = "mpg", mod_name = "am")
simple_slopes_analysis(data = mtcars, iv_name = "vs", dv_name = "mpg", mod_name = "hp")
simple_slopes_analysis(data = mtcars, iv_name = "disp", dv_name = "mpg", mod_name = "hp")
simple_slopes_analysis(data = mtcars, iv_name = "vs", dv_name = "am", mod_name = "hp")
simple_slopes_analysis(data = mtcars, iv_name = "disp", dv_name = "am", mod_name = "hp")
Simple slopes analysis with logistic regression analyses
Description
Conduct a simple slopes analysis with logistic regression analyses, typically to probe a two-way interaction when the dependent variable is binary.
Usage
simple_slopes_analysis_logistic( data = NULL, iv_name = NULL, dv_name = NULL, mod_name = NULL, round_b = 2, round_se = 2, round_z = 2, round_p = 3, focal_values = NULL)Arguments
data | a data object (a data frame or a data.table) |
iv_name | name of the independent variable (IV) |
dv_name | name of the dependent variable (DV) |
mod_name | name of the moderator variable (MOD) |
round_b | number of decimal places to which to roundcoefficients from the regression analysis (default = 2) |
round_se | number of decimal places to which to roundstandard error values from the regression analysis (default = 2) |
round_z | number of decimal places to which to round z statistics from the regression analysis (default = 2) |
round_p | number of decimal places to which to round p values from the regression analysis (default = 3) |
focal_values | this input will be used only in cases where MODis continuous. In such cases, what are the focal values of the MODat which to estimate the effect of IV on DV? By default, valuescorresponding to the mean of MOD, and mean of MOD +/-1 SD will be used. |
Examples
simple_slopes_analysis_logistic(
  data = mtcars, iv_name = "vs", dv_name = "am", mod_name = "hp")
simple_slopes_analysis_logistic(
  data = mtcars, iv_name = "disp", dv_name = "am", mod_name = "hp")
Skewness
Description
Calculate skewness using one of three formulas: (1) the traditional Fisher-Pearson coefficient of skewness; (2) the adjusted Fisher-Pearson standardized moment coefficient; (3) the Pearson 2 skewness coefficient. Formulas were taken from Doane & Seward (2011), doi:10.1080/10691898.2011.11889611
Usage
skewness(vector = NULL, type = "adjusted")Arguments
vector | a numeric vector |
type | a character string indicating the type of skewness tocalculate. If |
Value
a numeric value, i.e., skewness of the given vector
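For reference, a rough sketch of the first two formulas as given in standard treatments (an assumption about the underlying math, not the function's verbatim source):
x <- c(1, 2, 3, 4, 5, 10)
n <- length(x)
m2 <- sum((x - mean(x))^2) / n      # second central moment
m3 <- sum((x - mean(x))^3) / n      # third central moment
g1 <- m3 / m2^1.5                   # traditional Fisher-Pearson coefficient
g1 * sqrt(n * (n - 1)) / (n - 2)    # adjusted (standardized moment) coefficient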
Examples
# calculate the adjusted Fisher-Pearson standardized moment coefficient
kim::skewness(c(1, 2, 3, 4, 5, 10))
# calculate the traditional Fisher-Pearson coefficient of skewness
kim::skewness(c(1, 2, 3, 4, 5, 10), type = "traditional")
# compare with skewness from 'moments' package
moments::skewness(c(1, 2, 3, 4, 5, 10))
# calculate the Pearson 2 skewness coefficient
kim::skewness(c(1, 2, 3, 4, 5, 10), type = "pearson_2")
Spotlight 2 by Continuous
Description
Conduct a spotlight analysis for a 2 x Continuous design. See Spiller et al. (2013), doi:10.1509/jmr.12.0420
Usage
spotlight_2_by_continuous( data = NULL, iv_name = NULL, dv_name = NULL, mod_name = NULL, logistic = NULL, covariate_name = NULL, focal_values = NULL, interaction_p_include = TRUE, iv_level_order = NULL, output_type = "plot", colors = c("red", "blue"), dot_size = 3, observed_dots = FALSE, reg_lines = FALSE, reg_line_width = 1, reg_line_size = 1, lines_connecting_est_dv = TRUE, lines_connecting_est_dv_width = 1, estimated_dv_dot_shape = 15, estimated_dv_dot_size = 6, error_bar = "ci", error_bar_range = 0.95, error_bar_tip_width = NULL, error_bar_tip_width_percent = 8, error_bar_thickness = 1, error_bar_offset = NULL, error_bar_offset_percent = 8, simp_eff_bracket_leg_ht = NULL, simp_eff_bracket_leg_ht_perc = 2, simp_eff_bracket_offset = NULL, simp_eff_bracket_offset_perc = 1, simp_eff_bracket_color = "black", simp_eff_bracket_line_width = 1, simp_eff_text_offset = NULL, simp_eff_text_offset_percent = 7, simp_eff_text_hjust = 0.5, simp_eff_text_part_1 = "Simple Effect\n", simp_eff_text_color = "black", simp_eff_font_size = 5, interaction_p_value_x = NULL, interaction_p_value_y = NULL, interaction_p_value_font_size = 6, interaction_p_value_vjust = -1, interaction_p_value_hjust = 0.5, x_axis_breaks = NULL, x_axis_limits = NULL, x_axis_tick_mark_labels = NULL, y_axis_breaks = NULL, y_axis_limits = NULL, x_axis_space_left_perc = 10, x_axis_space_right_perc = 30, y_axis_tick_mark_labels = NULL, x_axis_title = NULL, y_axis_title = NULL, legend_title = NULL, legend_position = "right", y_axis_title_vjust = 0.85, round_decimals_int_p_value = 3, jitter_x_percent = 0, jitter_y_percent = 0, dot_alpha = 0.2, reg_line_alpha = 0.5, jn_point_font_size = 6, reg_line_types = c("solid", "dashed"), caption = NULL, plot_margin = ggplot2::unit(c(60, 30, 7, 7), "pt"), silent = FALSE)Arguments
data | a data object (a data frame or a data.table) |
iv_name | name of the binary independent variable (IV) |
dv_name | name of the dependent variable (DV) |
mod_name | name of the continuous moderator variable (MOD) |
logistic | logical. Should logistic regressions be conducted,rather than ordinary least squares regressions? By default,ordinary least squares regressions will be conducted. |
covariate_name | name(s) of the variable(s) to control for inestimating conditional values of the DV. |
focal_values | focal values of the moderator variable at whichto estimate IV's effect on DV. |
interaction_p_include | logical. Should the plot include ap-value for the interaction term? |
iv_level_order | order of levels in the independentvariable for legend. By default, it will be set as levels of theindependent variable ordered using R's base function |
output_type | type of output (default = "plot"). Otherpossible values include "spotlight_results", "dt_for_plotting","modified_dt" |
colors | set colors for the two levels of the independent variableBy default, |
dot_size | size of the observed_dots (default = 3) |
observed_dots | logical. If |
reg_lines | logical. If |
reg_line_width | thickness of the regression lines (default = 1). |
reg_line_size | deprecated. Use |
lines_connecting_est_dv | logical. Should lines connecting theestimated values of DV be drawn? (default = TRUE) |
lines_connecting_est_dv_width | thickness of the lines connectingthe estimated values of DV (default = 1). |
estimated_dv_dot_shape | ggplot value for shape of the dotsat estimated values of DV (default = 15, a square shape). |
estimated_dv_dot_size | size of the dots at estimated values ofDV (default = 6). |
error_bar | if |
error_bar_range | width of the confidence interval(default = 0.95 for a 95 percent confidence interval).This argument will not apply when |
error_bar_tip_width | graphically, width of the segmentsat the end of error bars (default = 0.13) |
error_bar_tip_width_percent | (default) |
error_bar_thickness | thickness of the error bars (default = 1) |
error_bar_offset | (default) |
error_bar_offset_percent | (default) |
simp_eff_bracket_leg_ht | (default) |
simp_eff_bracket_leg_ht_perc | (default) |
simp_eff_bracket_offset | (default) |
simp_eff_bracket_offset_perc | (default) |
simp_eff_bracket_color | (default) |
simp_eff_bracket_line_width | (default) |
simp_eff_text_offset | (default) |
simp_eff_text_offset_percent | (default) |
simp_eff_text_hjust | (default) |
simp_eff_text_part_1 | The first part of the text forlabeling simple effects.By default, |
simp_eff_text_color | color for the text indicating p-valuesof simple effects (default = "black"). |
simp_eff_font_size | font size of the text indicatingp-values of simple effects (default = 5). |
interaction_p_value_x | (default) |
interaction_p_value_y | (default) |
interaction_p_value_font_size | font size for the interactionp value (default = 6) |
interaction_p_value_vjust | (default) |
interaction_p_value_hjust | (default) |
x_axis_breaks | (default) |
x_axis_limits | (default) |
x_axis_tick_mark_labels | (default) |
y_axis_breaks | (default) |
y_axis_limits | (default) |
x_axis_space_left_perc | (default) |
x_axis_space_right_perc | (default) |
y_axis_tick_mark_labels | (default) |
x_axis_title | title of the x axis. By default, it will be setas input for |
y_axis_title | title of the y axis. By default, it will be setas input for |
legend_title | title of the legend. By default, it will be setas input for |
legend_position | position of the legend (default = "right").If |
y_axis_title_vjust | position of the y axis title (default = 0.85).If default is used, |
round_decimals_int_p_value | To how many digits after thedecimal point should the p value for the interaction term berounded? (default = 3) |
jitter_x_percent | horizontally jitter dots by a percentage of therange of x values |
jitter_y_percent | vertically jitter dots by a percentage of therange of y values |
dot_alpha | opacity of the dots (0 = completely transparent,1 = completely opaque). By default, |
reg_line_alpha | (default) |
jn_point_font_size | (default) |
reg_line_types | types of the regression lines for the two levelsof the independent variable.By default, |
caption | (default) |
plot_margin | margin for the plotBy default |
silent | If |
Examples
spotlight_2_by_continuous(
  data = mtcars, iv_name = "am", dv_name = "mpg", mod_name = "qsec")
# control for variables
spotlight_2_by_continuous(
  data = mtcars, iv_name = "am", dv_name = "mpg", mod_name = "qsec",
  covariate_name = c("cyl", "hp"))
# control for variables and adjust simple effect labels
spotlight_2_by_continuous(
  data = mtcars, iv_name = "am", dv_name = "mpg", mod_name = "qsec",
  covariate_name = c("cyl", "hp"), reg_lines = TRUE, observed_dots = TRUE,
  error_bar_offset_percent = 3, error_bar_tip_width_percent = 3,
  simp_eff_text_offset_percent = 3, simp_eff_bracket_leg_ht_perc = 2,
  dot_alpha = 0.2, simp_eff_text_part_1 = "")
# spotlight at specific values
spotlight_2_by_continuous(
  data = mtcars, iv_name = "am", dv_name = "mpg", mod_name = "qsec",
  covariate_name = c("cyl", "hp"), focal_values = seq(15, 22, 1),
  reg_lines = TRUE, observed_dots = TRUE, dot_alpha = 0.2,
  simp_eff_text_part_1 = "", simp_eff_font_size = 4,
  error_bar_offset_percent = 3, error_bar_tip_width_percent = 3,
  simp_eff_text_offset_percent = 3, simp_eff_bracket_leg_ht_perc = 1,
  x_axis_breaks = seq(15, 22, 1))
# spotlight for logistic regression
spotlight_2_by_continuous(
  data = mtcars, iv_name = "am", dv_name = "vs", mod_name = "drat",
  logistic = TRUE)
Standardize
Description
Standardize (i.e., normalize, obtain z-scores, or obtain the standard scores)
Usage
standardize(x = NULL)Arguments
x | a numeric vector |
Value
the output will be a vector of the standard scores of the input.
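Standard scores are simply each value's distance from the mean in standard-deviation units, i.e., (x - mean(x)) / sd(x); a quick check against the usage example, assuming the usual z-score formula:
x <- 1:10
(x - mean(x)) / sd(x)  # should match standardize(1:10) under the usual z-score formula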
Examples
standardize(1:10)
Standardized Regression
Description
This function standardizes all variables for a regression analysis (i.e., dependent variable and all independent variables) and then conducts a regression with the standardized variables.
Usage
standardized_regression( data = NULL, formula = NULL, reverse_code_vars = NULL, sigfigs = NULL, round_digits_after_decimal = NULL, round_p = 3, pretty_round_p_value = TRUE, return_table_upper_half = FALSE, round_r_squared = 3, round_f_stat = 2, prettify_reg_table_col_names = TRUE)Arguments
data | a data object (a data frame or a data.table) |
formula | a formula object for the regression equation |
reverse_code_vars | names of binary variables to reverse code |
sigfigs | number of significant digits to round to |
round_digits_after_decimal | round to nth digit after decimal(alternative to |
round_p | number of decimal places to which to round p-values(default = 3) |
pretty_round_p_value | logical. Should the p-values be roundedin a pretty format (i.e., lower threshold: "<.001").By default, |
return_table_upper_half | logical. Should only the upper partof the table be returned?By default, |
round_r_squared | number of digits after the decimal both r-squaredand adjusted r-squared values should be rounded to (default 3) |
round_f_stat | number of digits after the decimal the f statisticof the regression model should be rounded to (default 2) |
prettify_reg_table_col_names | logical. Should the column namesof the regression table be made pretty (e.g., change "std_beta" to"Std. Beta")? (Default = |
Value
the output will be a data.table showing multiple regression results.
Examples
standardized_regression(data = mtcars, formula = mpg ~ gear * cyl)
standardized_regression(
  data = mtcars, formula = mpg ~ gear + gear:am + disp * cyl,
  round_digits_after_decimal = 3)
Start kim
Description
Start kim (update kim; attach default packages; set working directory, etc.). This function requires installing Package 'remotes' v2.4.2 (or possibly a higher version) by Csardi et al. (2021), https://cran.r-project.org/package=remotes
Usage
start_kim( update = TRUE, upgrade_other_pkg = FALSE, setup_r_env = TRUE, default_packages = c("data.table", "ggplot2"), silent_load_pkgs = c("data.table", "ggplot2"))Arguments
update | If |
upgrade_other_pkg | input for the |
setup_r_env | logical. If |
default_packages | a vector of names of packages to load and attach.By default, |
silent_load_pkgs | a character vector indicating names ofpackages to load silently (i.e., suppress messages that get printedwhen loading the packages).By default, |
Examples
## Not run:
start_kim()
start_kim(default_packages = c("dplyr", "ggplot2"))
start_kim(update = TRUE, setup_r_env = FALSE)
## End(Not run)
su: Sorted unique values
Description
Extract unique elements and sort them
Usage
su(x = NULL, na.last = TRUE, decreasing = FALSE)Arguments
x | a vector or a data frame or an array or NULL. |
na.last | an argument to be passed onto the 'sort' function (in base R) for controlling the treatment of NA values. If TRUE (default), NA values will be put last in the sorted output; if FALSE, they will be put first; if NA, they will be removed. |
decreasing | logical. Should the sort be increasing or decreasing? An argument to be passed onto the 'sort' function (in base R). By default, decreasing = FALSE. |
Value
a vector, data frame, or array-like 'x' but with duplicate elements/rows removed.
Examples
su(c(10, 3, 7, 10, NA))
su(c("b", "z", "b", "a", NA, NA, NA))
t-tests, pairwise
Description
Conducts a t-test for every possible pairwise comparison with Holm or Bonferroni correction
Usage
t_test_pairwise(data = NULL, iv_name = NULL, dv_name = NULL, sigfigs = 3,
welch = TRUE, cohen_d = TRUE, cohen_d_w_ci = TRUE, adjust_p = "holm",
bonferroni = NULL, mann_whitney = TRUE, mann_whitney_exact = FALSE,
t_test_stats = TRUE, sd = FALSE, round_p = 3, anova = FALSE, round_f = 2,
round_t = 2, round_t_test_df = 2)
Arguments
data | a data object (a data frame or a data.table) |
iv_name | name of the independent variable |
dv_name | name of the dependent variable |
sigfigs | number of significant digits to round to |
welch | Should Welch's t-tests be conducted? By default, welch = TRUE. |
cohen_d | if TRUE (default), Cohen's d statistics will be included in the output. |
cohen_d_w_ci | if TRUE (default), Cohen's d with its confidence interval will be included in the output. |
adjust_p | the name of the method to use to adjust p-values (default = "holm"). |
bonferroni | The use of this argument is deprecated. Use the 'adjust_p' argument instead. |
mann_whitney | if TRUE (default), Mann-Whitney test (i.e., Wilcoxon rank-sum test) results will be included in the output. |
mann_whitney_exact | this is the input for the 'exact' argument used in the 'stats::wilcox.test' function, which conducts a Mann-Whitney test. By default, mann_whitney_exact = FALSE. |
t_test_stats | if TRUE (default), t-test statistics (e.g., the t statistic and degrees of freedom) will be included in the output. |
sd | if TRUE, standard deviations will be included in the output (default = FALSE). |
round_p | number of decimal places to which to round p-values (default = 3) |
anova | Should a one-way ANOVA be conducted and reported? By default, anova = FALSE. |
round_f | number of decimal places to which to round the F statistic (default = 2) |
round_t | number of decimal places to which to round the t statistic (default = 2) |
round_t_test_df | number of decimal places to which to round the degrees of freedom for t tests (default = 2) |
Value
the output will be a data.table showing results of all pairwise comparisons between levels of the independent variable.
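As a point of comparison (not this function's implementation), base R's stats::pairwise.t.test applies the same Holm adjustment to all pairwise comparisons, although it does not report Cohen's d or Mann-Whitney results:
# Holm-adjusted pairwise comparisons of Sepal.Length across Species;
# pool.sd = FALSE yields separate-variance (Welch-type) t-tests
pairwise.t.test(iris$Sepal.Length, iris$Species,
p.adjust.method = "holm", pool.sd = FALSE)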
Examples
## Not run:
# Basic example
t_test_pairwise(data = iris, iv_name = "Species", dv_name = "Sepal.Length")
# Welch's t-test
t_test_pairwise(data = mtcars, iv_name = "am", dv_name = "hp")
# A Student's t-test
t_test_pairwise(data = mtcars, iv_name = "am", dv_name = "hp", welch = FALSE)
# Other examples
t_test_pairwise(data = iris, iv_name = "Species",
dv_name = "Sepal.Length", t_test_stats = TRUE, sd = TRUE)
t_test_pairwise(data = iris, iv_name = "Species", dv_name = "Sepal.Length",
mann_whitney = FALSE)
## End(Not run)
Tabulate vector
Description
Shows frequency and proportion of unique values in a table format
Usage
tabulate_vector(vector = NULL, na.rm = TRUE, sort_by_decreasing_count = NULL,
sort_by_increasing_count = NULL, sort_by_decreasing_value = NULL,
sort_by_increasing_value = NULL, total_included = TRUE, sigfigs = NULL,
round_digits_after_decimal = NULL, output_type = "dt")
Arguments
vector | a character or numeric vector |
na.rm | if TRUE (default), NA values will be removed before tabulating the vector. |
sort_by_decreasing_count | if TRUE, the output will be sorted in the order of decreasing count. |
sort_by_increasing_count | if TRUE, the output will be sorted in the order of increasing count. |
sort_by_decreasing_value | if TRUE, the output will be sorted in the order of decreasing value. |
sort_by_increasing_value | if TRUE, the output will be sorted in the order of increasing value. |
total_included | if TRUE (default), a row showing the total count will be included in the output. |
sigfigs | number of significant digits to round to |
round_digits_after_decimal | round to nth digit after decimal (alternative to sigfigs) |
output_type | if "dt" (default), the output will be a data.table; if "df", the output will be a data.frame. |
Value
if output_type = "dt", which is the default, the output will be a data.table showing the count and proportion (percent) of each element in the given vector; if output_type = "df", the output will be a data.frame showing the count and proportion (percent) of each value in the given vector.
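For comparison (not this function's implementation), the counts and percentages can be approximated with base R's table() and prop.table():
x <- c("a", "b", "b", "c", "c", "c", NA)
counts <- table(x, useNA = "no") # frequency of each unique value, NA removed
percent <- 100 * prop.table(counts) # proportions expressed as percentages
data.frame(value = names(counts), count = as.vector(counts),
percent = as.vector(percent))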
Examples
tabulate_vector(c("a", "b", "b", "c", "c", "c", NA))
tabulate_vector(c("a", "b", "b", "c", "c", "c", NA), sort_by_increasing_count = TRUE)
tabulate_vector(c("a", "b", "b", "c", "c", "c", NA), sort_by_decreasing_value = TRUE)
tabulate_vector(c("a", "b", "b", "c", "c", "c", NA), sort_by_increasing_value = TRUE)
tabulate_vector(c("a", "b", "b", "c", "c", "c", NA), sigfigs = 4)
tabulate_vector(c("a", "b", "b", "c", "c", "c", NA), round_digits_after_decimal = 1)
tabulate_vector(c("a", "b", "b", "c", "c", "c", NA), output_type = "df")
Tau-squared (between-studies variance for meta analysis)
Description
Calculate tau-squared, the between-studies variance (the variance of the effect size parameters across the population of studies), as illustrated in Borenstein et al. (2009, pp. 72-73, ISBN: 978-0-470-05724-7).
Usage
tau_squared(effect_sizes = NULL, effect_size_variances = NULL)
Arguments
effect_sizes | effect sizes (e.g., standardized mean differences) |
effect_size_variances | within-study variances |
Details
Negative values of tau-squared are converted to 0 in the output (see Cheung, 2013; https://web.archive.org/web/20230512225539/https://openmx.ssri.psu.edu/thread/2432)
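The sketch below follows the method-of-moments (DerSimonian-Laird) estimator described in Borenstein et al. (2009); it is an illustration under that assumption, and the helper name dl_tau_squared is hypothetical rather than part of the package:
# method-of-moments (DerSimonian-Laird) estimate of tau-squared (illustration)
dl_tau_squared <- function(effect_sizes, effect_size_variances) {
  w <- 1 / effect_size_variances # inverse-variance weights
  q <- sum(w * effect_sizes^2) - (sum(w * effect_sizes))^2 / sum(w) # Q statistic
  df <- length(effect_sizes) - 1 # degrees of freedom
  c_const <- sum(w) - sum(w^2) / sum(w) # scaling constant C
  max(0, (q - df) / c_const) # negative estimates are set to 0
}
dl_tau_squared(c(0.095, 0.277, 0.367), c(0.033, 0.031, 0.050))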
Examples
## Not run: tau_squared(effect_sizes = c(1, 2), effect_size_variances = c(3, 4))
# a negative tau squared value is converted to 0:
tau_squared(effect_sizes = c(1.1, 1.4), effect_size_variances = c(1, 4))
## End(Not run)
Theme Kim
Description
A custom ggplot theme
Usage
theme_kim(legend_position = "none", legend_spacing_y = 1, legend_key_size = 3,
base_size = 20, axis_tick_font_size = 20, axis_tick_marks_color = "black",
axis_title_font_size = 24, y_axis_title_vjust = 0.85,
axis_title_margin_size = 24, cap_axis_lines = FALSE)
Arguments
legend_position | position of the legend (default = "none") |
legend_spacing_y | vertical spacing of the legend keys inthe unit of "cm" (default = 1) |
legend_key_size | size of the legend keys in the unit of "lines"(default = 3) |
base_size | base font size |
axis_tick_font_size | font size for axis tick marks |
axis_tick_marks_color | color of the axis tick marks |
axis_title_font_size | font size for axis title |
y_axis_title_vjust | position of the y axis title (default = 0.85) |
axis_title_margin_size | size of the margin between axis title and the axis line |
cap_axis_lines | logical. Should the axis lines be capped at theouter tick marks? (default = FALSE) |
Details
If the axis lines are to be capped at the ends, the following package(s) must be installed prior to running the function: Package 'lemon' v0.4.4 (or possibly a higher version) by Edwards et al. (2020), https://cran.r-project.org/package=lemon
Value
a ggplot object; there will be no meaningful output from this function. Instead, this function should be used with another ggplot object, e.g., ggplot(mtcars, aes(x = disp, y = mpg)) + theme_kim()
Examples
prep(ggplot2)
ggplot2::ggplot(mtcars, aes(x = cyl, y = mpg)) +
geom_point() + theme_kim()
Top, median, or bottom
Description
Indicates whether each value in a vector belongs to top, median, or bottom
Usage
top_median_or_bottom(vector)
Arguments
vector | a numeric vector |
Value
a character vector indicating whether each element in a vector belongs to "top", "median", or "bottom"
Examples
top_median_or_bottom(c(1, 2, 3, NA))
top_median_or_bottom(c(1, 2, 2, NA))
top_median_or_bottom(c(1, 1, 2, NA))
Tabulate vector
Description
Shows frequency and proportion of unique values in a table format. This function is a copy of the earlier function, tabulate_vector, in Package 'kim'.
Usage
tv(vector = NULL, na.rm = FALSE, sort_by_decreasing_count = NULL,
sort_by_increasing_count = NULL, sort_by_decreasing_value = NULL,
sort_by_increasing_value = NULL, total_included = TRUE, sigfigs = NULL,
round_digits_after_decimal = NULL, output_type = "dt")
Arguments
vector | a character or numeric vector |
na.rm | if TRUE, NA values will be removed before tabulating the vector (default = FALSE). |
sort_by_decreasing_count | if TRUE, the output will be sorted in the order of decreasing count. |
sort_by_increasing_count | if TRUE, the output will be sorted in the order of increasing count. |
sort_by_decreasing_value | if TRUE, the output will be sorted in the order of decreasing value. |
sort_by_increasing_value | if TRUE, the output will be sorted in the order of increasing value. |
total_included | if TRUE (default), a row showing the total count will be included in the output. |
sigfigs | number of significant digits to round to |
round_digits_after_decimal | round to nth digit after decimal (alternative to sigfigs) |
output_type | if "dt" (default), the output will be a data.table; if "df", the output will be a data.frame. |
Value
if output_type = "dt", which is the default, the output will be a data.table showing the count and proportion (percent) of each element in the given vector; if output_type = "df", the output will be a data.frame showing the count and proportion (percent) of each value in the given vector.
Examples
tv(c("a", "b", "b", "c", "c", "c", NA))
tv(c("a", "b", "b", "c", "c", "c", NA), sort_by_increasing_count = TRUE)
tv(c("a", "b", "b", "c", "c", "c", NA), sort_by_decreasing_value = TRUE)
tv(c("a", "b", "b", "c", "c", "c", NA), sort_by_increasing_value = TRUE)
tv(c("a", "b", "b", "c", "c", "c", NA), sigfigs = 4)
tv(c("a", "b", "b", "c", "c", "c", NA), round_digits_after_decimal = 1)
tv(c("a", "b", "b", "c", "c", "c", NA), output_type = "df")
Two-Way Factorial ANOVA
Description
This function is deprecated. Use the function 'factorial_anova_2_way' instead.
Usage
two_way_anova(data = NULL, dv_name = NULL, iv_1_name = NULL, iv_2_name = NULL,
iv_1_values = NULL, iv_2_values = NULL, sigfigs = 3, robust = FALSE,
iterations = 2000, plot = TRUE, error_bar = "ci", error_bar_range = 0.95,
error_bar_tip_width = 0.13, error_bar_thickness = 1, error_bar_caption = TRUE,
line_colors = NULL, line_types = NULL, line_thickness = 1, dot_size = 3,
position_dodge = 0.13, x_axis_title = NULL, y_axis_title = NULL,
y_axis_title_vjust = 0.85, legend_title = NULL, legend_position = "right",
output = "anova_table", png_name = NULL, width = 7000, height = 4000,
units = "px", res = 300, layout_matrix = NULL)
Arguments
data | a data object (a data frame or a data.table) |
dv_name | name of the dependent variable |
iv_1_name | name of the first independent variable |
iv_2_name | name of the second independent variable |
iv_1_values | restrict all analyses to observations having these values for the first independent variable |
iv_2_values | restrict all analyses to observations having these values for the second independent variable |
sigfigs | number of significant digits to which to round values in the anova table (default = 3) |
robust | if TRUE, a robust two-way ANOVA will be conducted (default = FALSE). |
iterations | number of bootstrap samples for robust ANOVA. The default is set at 2000, but consider increasing the number of samples to 5000, 10000, or an even larger number, if slower handling time is not an issue. |
plot | if TRUE (default), a plot of the results will be produced. |
error_bar | if "ci" (default), error bars will show confidence intervals; if "se", error bars will show standard errors. |
error_bar_range | width of the confidence interval (default = 0.95 for 95 percent confidence interval). This argument will not apply when error_bar = "se". |
error_bar_tip_width | graphically, width of the segments at the end of error bars (default = 0.13) |
error_bar_thickness | thickness of the error bars (default = 1) |
error_bar_caption | should a caption be included to indicatethe width of the error bars? (default = TRUE). |
line_colors | colors of the lines connecting means (default = NULL) |
line_types | types of the lines connecting means (default = NULL) |
line_thickness | thickness of the lines connecting group means (default = 1) |
dot_size | size of the dots indicating group means (default = 3) |
position_dodge | by how much should the group means and error bars be horizontally offset from each other so as not to overlap? (default = 0.13) |
x_axis_title | a character string for the x-axis title. If no input is entered, then, by default, the name of the first independent variable will be used as the x-axis title. |
y_axis_title | a character string for the y-axis title. If no input is entered, then, by default, the name of the dependent variable will be used as the y-axis title. |
y_axis_title_vjust | position of the y axis title (default = 0.85) |
legend_title | a character string for the legend title. If no input is entered, then, by default, the name of the second independent variable will be used as the legend title. |
legend_position | position of the legend (e.g., "none", "top", "right", "bottom", "left"; default = "right") |
output | output type. By default, output = "anova_table"; setting output = "all" returns all results (see Examples). |
png_name | name of the PNG file to be saved |
width | width of the PNG file (default = 7000) |
height | height of the PNG file (default = 4000) |
units | the units for the width and height arguments (default = "px") |
res | The nominal resolution in ppi which will be recorded in the png file, if a positive integer. Used for units other than the default. If not specified, taken as 300 ppi to set the size of text and line widths. |
layout_matrix | the layout argument for arranging plots and tables in the output |
Details
Conduct a two-way factorial analysis of variance (ANOVA).
The following package(s) must be installed prior to running this function: Package 'car' v3.0.9 (or possibly a higher version) by Fox et al. (2020), https://cran.r-project.org/package=car
If robust ANOVA is to be conducted, the following package(s) must be installed prior to running the function: Package 'WRS2' v1.1-1 (or possibly a higher version) by Mair & Wilcox (2021), https://cran.r-project.org/package=WRS2
Value
by default, the output will be "anova_table"
Examples
## Not run: two_way_anova(data = mtcars, dv_name = "mpg",
iv_1_name = "vs", iv_2_name = "am", iterations = 100)
anova_results <- two_way_anova(data = mtcars, dv_name = "mpg",
iv_1_name = "vs", iv_2_name = "am", output = "all")
anova_results
## End(Not run)
Undocumented functions
Description
A collection of miscellaneous functions lacking documentation
Usage
und(fn, ...)
Arguments
fn | name of the function |
... | arguments for the function |
Value
the output will vary by function
Examples
# correlation
und(corr_text, x = 1:5, y = c(1, 2, 2, 2, 3))
# mean center
und(mean_center, 1:10)
# compare results with base function
scale(1:10, scale = TRUE)
# find the modes
und(mode, c(3, 3, 3, 1, 2, 2))
# return values that are not outliers
und(outlier_rm, c(12:18, 100))
kim::outlier(c(1:10, 100))
Unload all user-installed packages
Description
Unload all user-installed packages
Usage
unload_user_installed_pkgs(exceptions = NULL, force = FALSE, keep_kim = TRUE)
Arguments
exceptions | a character vector of names of packages to keep loaded |
force | logical. Should a package be unloaded even though other attached packages depend on it? By default, force = FALSE. |
keep_kim | logical. If TRUE (default), the package 'kim' will not be unloaded; if FALSE, it will be unloaded along with the other user-installed packages. |
Examples
## Not run: unload_user_installed_pkgs()
## End(Not run)
Update the package 'kim'
Description
Updates the current package 'kim' by installing the most recent version of the package from GitHub. This function requires installing Package 'remotes' v2.4.2 (or possibly a higher version) by Csardi et al. (2021), https://cran.r-project.org/package=remotes
Usage
update_kim(force = TRUE, upgrade_other_pkg = FALSE, confirm = TRUE)
Arguments
force | logical. If TRUE (default), the package will be reinstalled even if the installed version is already the most recent one. |
upgrade_other_pkg | input for the upgrade argument to be passed on to the remotes::install_github function (default = FALSE). |
confirm | logical. If TRUE (default), the user will be asked to confirm before proceeding with the update. |
Value
there will be no output from this function. Rather, executing this function will update the current 'kim' package by installing the most recent version of the package from GitHub.
Examples
## Not run: if (interactive()) {update_kim()}
## End(Not run)
Convert variance of log odds ratio to variance of d
Description
Convert the variance of a log odds ratio to the variance of a Cohen's d (standardized mean difference), as illustrated in Borenstein et al. (2009, p. 47, ISBN: 978-0-470-05724-7)
Usage
var_of_log_odds_ratio_to_var_of_d(var_of_log_odds_ratio = NULL)
Arguments
var_of_log_odds_ratio | the variance of a log odds ratio (the input can be a vector of values) |
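The conversion illustrated in Borenstein et al. (2009, p. 47) multiplies the variance of the log odds ratio by 3 / pi^2; assuming that formula, the calculation is a one-liner:
# assumed formula: Var(d) = Var(log odds ratio) * 3 / pi^2
var_of_log_or <- 1
var_of_log_or * 3 / pi^2 # approximately 0.304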
Examples
## Not run: var_of_log_odds_ratio_to_var_of_d(1)
## End(Not run)
Variance of a percentage
Description
Calculate the variance of a percentage. See Fowler, Jr. (2014, p. 34, ISBN: 978-1-4833-1240-8)
Usage
var_of_percentage(percent = NULL, n = NULL)
Arguments
percent | a vector of percentages; each of the percentage values must be between 0 and 100 |
n | a vector of sample sizes; number of observations used to calculate each of the percentage values |
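Assuming the textbook formula p(100 - p) / n for the sampling variance of a percentage (an assumption; see Fowler, Jr., 2014 for the exact treatment), the calculation looks like this:
# assumed formula: variance of a percentage = p * (100 - p) / n
p <- 40 # percentage
n <- 50 # sample size
p * (100 - p) / n # 48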
Examples
var_of_percentage(percent = 40, n = 50)
var_of_percentage(percent = 50, n = 10)
Variance of a proportion
Description
Calculate the variance of a proportion. See Anderson and Finn (1996, p. 364, ISBN: 978-1-4612-8466-6)
Usage
var_of_proportion(p = NULL, n = NULL)
Arguments
p | a vector of proportions; each of the proportion values must be between 0 and 1 |
n | a vector of sample sizes; number of observations used to calculate each of the proportion values |
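Assuming the standard formula p(1 - p) / n for the sampling variance of a proportion (see Anderson and Finn, 1996), a minimal sketch:
# assumed formula: variance of a proportion = p * (1 - p) / n
p <- 0.56
n <- 400
p * (1 - p) / n # 0.000616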
Examples
var_of_proportion(p = 0.56, n = 400)
var_of_proportion(p = 0.5, n = 100)
var_of_proportion(p = 0.4, n = 50)
var_of_proportion(p = c(0.5, 0.9), n = c(100, 200))
Vlookup
Description
Look up values in a reference data.table and return values associated with the looked-up values contained in the reference data.table
Usage
vlookup(lookup_values = NULL, reference_dt = NULL,
col_name_for_lookup_values = NULL, col_name_for_output_values = NULL)
Arguments
lookup_values | a vector of values to look up |
reference_dt | a data.table containing the values to look up as well as values associated with the looked-up values that need to be returned. |
col_name_for_lookup_values | in the reference data.table, the name of the column containing the lookup values |
col_name_for_output_values | in the reference data.table, the name of the column containing values to return (i.e., values associated with the looked-up values that will be the function's output) |
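For comparison (not this function's implementation), a similar lookup can be done in base R with match(), which returns the first row whose lookup column exactly equals each lookup value:
# base-R analogue of a vlookup: match "wt" values and return the "qsec" values
lookup_values <- c(2.620, 2.875)
reference <- mtcars[1:9, ]
reference$qsec[match(lookup_values, reference$wt)]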
Examples
vlookup(lookup_values = c(2.620, 2.875), reference_dt = mtcars[1:9, ],
col_name_for_lookup_values = "wt", col_name_for_output_values = "qsec")
Estimate the mean effect size in a meta analysis
Description
Estimate the mean effect size in a meta analysis, as illustrated in Borenstein et al. (2009, pp. 73-74, ISBN: 978-0-470-05724-7)
Usage
weighted_mean_effect_size(effect_sizes = NULL, effect_size_variances = NULL,
ci = 0.95, one_tailed = FALSE, random_vs_fixed = "random")
Arguments
effect_sizes | effect sizes (e.g., standardized mean differences) |
effect_size_variances | within-study variances |
ci | width of the confidence interval (default = 0.95) |
one_tailed | logical. If FALSE (default), a two-tailed p-value will be calculated; if TRUE, a one-tailed p-value will be calculated. |
random_vs_fixed | If "random" (default), a random-effects model will be used to estimate the weighted mean effect size; if "fixed", a fixed-effect model will be used. |
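A minimal fixed-effect sketch of the inverse-variance weighting described in Borenstein et al. (2009) is shown below; the random-effects option additionally adds the between-studies variance (tau-squared) to each within-study variance before computing the weights. This is an illustration, not the function's internal code:
# fixed-effect inverse-variance weighting (illustration only)
effect_sizes <- c(0.095, 0.277, 0.367, 0.664, 0.462, 0.185)
variances <- c(0.033, 0.031, 0.050, 0.011, 0.043, 0.023)
w <- 1 / variances # inverse-variance weights
m <- sum(w * effect_sizes) / sum(w) # weighted mean effect size
se <- sqrt(1 / sum(w)) # standard error of the weighted mean
c(mean = m, ci_lower = m - 1.96 * se, ci_upper = m + 1.96 * se) # 95% CI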
Examples
## Not run: weighted_mean_effect_size(effect_sizes = c(1, 2), effect_size_variances = c(3, 4))
weighted_mean_effect_size(effect_sizes = c(0.095, 0.277, 0.367, 0.664, 0.462, 0.185),
effect_size_variances = c(0.033, 0.031, 0.050, 0.011, 0.043, 0.023))
# if effect sizes have a variance of 0, they will be excluded from
# the analysis
weighted_mean_effect_size(effect_sizes = c(1.1, 1.2, 1.3, 1.4),
effect_size_variances = c(1, 0, 0, 4))
## End(Not run)
Weighted mean correlation
Description
Calculate the weighted mean correlation coefficient for a given set of correlations and sample sizes. This function uses the Hedges-Olkin method with random effects. See Field (2001), doi:10.1037/1082-989X.6.2.161
Usage
weighted_mean_r(r = NULL, n = NULL, ci = 0.95, sigfigs = 3, silent = FALSE)
Arguments
r | a (vector of) correlation coefficient(s) |
n | a (vector of) sample size(s) |
ci | width of the confidence interval. Input can be any value less than 1 and greater than or equal to 0. By default, ci = 0.95. |
sigfigs | number of significant digits to round to (default = 3) |
silent | logical. If FALSE (default), a summary of the results will be printed; if TRUE, this printed output will be suppressed. |
Value
the output will be a list of vectors of correlation coefficient(s).
Examples
weighted_mean_r(r = c(0.2, 0.4), n = c(100, 100))
weighted_mean_r(r = c(0.2, 0.4), n = c(100, 20000))
# example consistent with using MedCalc
weighted_mean_r(r = c(0.51, 0.48, 0.3, 0.21, 0.6, 0.46, 0.22, 0.25),
n = c(131, 129, 155, 121, 111, 119, 112, 145))
Weighted z
Description
Calculate the weighted z (for calculating weighted mean correlation). See p. 231 of the book Hedges & Olkin (1985), Statistical Methods for Meta-Analysis (ISBN: 0123363802).
Usage
weighted_z(z = NULL, n = NULL)
Arguments
z | a vector of z values |
n | a vector of sample sizes which will be used to calculate the weights, which in turn will be used to calculate the weighted z. |
Value
the output will be a weighted z value.
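Assuming the common convention of weighting each Fisher z by n - 3 (the reciprocal of its sampling variance; see Hedges & Olkin, 1985), the computation can be sketched as follows; the weighting scheme is an assumption, not a statement about the package's internals:
# weighted mean of z values with assumed weights of (n - 3)
z <- 1:3
n <- c(100, 200, 300)
sum((n - 3) * z) / sum(n - 3)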
Examples
weighted_z(1:3, c(100, 200, 300))
weighted_z(z = c(1:3, NA), n = c(100, 200, 300, NA))
Wilcoxon Rank-Sum Test (Also called the Mann-Whitney U Test)
Description
A nonparametric equivalent of the independent t-test
Usage
wilcoxon_rank_sum_test(data = NULL, iv_name = NULL, dv_name = NULL, sigfigs = 3)
Arguments
data | a data object (a data frame or a data.table) |
iv_name | name of the independent variable (grouping variable) |
dv_name | name of the dependent variable (measure variableof interest) |
sigfigs | number of significant digits to round to |
Value
the output will be a data.table object with all pairwise Wilcoxon rank-sum test results
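For a simple two-group comparison, the underlying test is also available directly in base R via stats::wilcox.test (shown here only as a point of comparison):
# Wilcoxon rank-sum (Mann-Whitney U) test comparing hp across the two am groups
wilcox.test(hp ~ am, data = mtcars)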
Examples
wilcoxon_rank_sum_test(data = iris, iv_name = "Species", dv_name = "Sepal.Length")
Write to a csv file
Description
Write to a csv file
Usage
write_csv(data = NULL, name = NULL, timestamp = NULL)
Arguments
data | a data object (a data frame or a data.table) |
name | a character string of the csv file name without the ".csv" extension. For example, if the csv file to write to is "myfile.csv", enter name = "myfile". |
timestamp | logical. Should the timestamp be appended to thefile name? |
Value
the output will be a .csv file saved in the working directory (i.e., an output from the data.table function fwrite)
Examples
## Not run: write_csv(mtcars, "mtcars_from_write_csv")
write_csv(mtcars)
## End(Not run)
z score
Description
Calculate z-scores (i.e., standardize or obtain the standard scores)
Usage
z_score(x = NULL, na.rm = TRUE)
Arguments
x | a numeric vector |
na.rm | logical. If TRUE (default), NA values will be removed before calculating the z-scores. |
Value
the output will be a vector of z-scores.
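The calculation itself is just centering and scaling; a minimal sketch (equivalent in spirit to base R's scale(), though the function's NA handling may differ in detail):
# z-scores computed by hand, ignoring NA values
x <- c(1:10, NA)
(x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)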
Examples
z_score(1:10)
Z to r transformation (Inverse of Fisher's Z transformation)
Description
Perform the Z-to-r transformation (i.e., the inverse of Fisher's r-to-Z transformation) for given Z value(s).
Usage
z_to_r_transform(z = NULL)
Arguments
z | a (vector of) Z values |
Value
the output will be a vector of correlation coefficient(s) thatare the result(s) of the Z-to-r transformation.
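Mathematically, the inverse of Fisher's transformation is r = tanh(Z) = (exp(2Z) - 1) / (exp(2Z) + 1); a one-line sketch:
# inverse Fisher transformation: r = tanh(z)
tanh(2.646652) # approximately 0.99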
Examples
z_to_r_transform(2.646652)
z_to_r_transform(z = -3:3)