Stacked bar charts with statistical tests

Source: R/ggbarstats.R

ggbarstats.Rd

Bar charts for categorical data with statistical details included in the plot as a subtitle.

Usage

ggbarstats(
  data,
  x,
  y,
  counts = NULL,
  type = "parametric",
  paired = FALSE,
  results.subtitle = TRUE,
  label = "percentage",
  label.args = list(alpha = 1, fill = "white"),
  sample.size.label.args = list(size = 4),
  digits = 2L,
  proportion.test = results.subtitle,
  digits.perc = 0L,
  bf.message = TRUE,
  ratio = NULL,
  conf.level = 0.95,
  sampling.plan = "indepMulti",
  fixed.margin = "rows",
  prior.concentration = 1,
  title = NULL,
  subtitle = NULL,
  caption = NULL,
  legend.title = NULL,
  xlab = NULL,
  ylab = NULL,
  ggtheme = ggstatsplot::theme_ggstatsplot(),
  package = "RColorBrewer",
  palette = "Dark2",
  ggplot.component = NULL,
  ...
)

Arguments

data

A data frame (or a tibble) from which variables specified are to be taken. Other data types (e.g., matrix, table, array, etc.) will not be accepted. Additionally, grouped data frames from {dplyr} should be ungrouped before they are entered as data.

x

The variable to use as the rows in the contingency table. Please note that if there are empty factor levels in your variable, they will be dropped.

y

The variable to use as the columns in the contingency table. Please note that if there are empty factor levels in your variable, they will be dropped. Default is NULL. If NULL, a one-sample proportion test (a goodness of fit test) will be run for the x variable. Otherwise an appropriate association test will be run. This argument cannot be NULL for ggbarstats().

counts

The variable in data containing counts, or NULL if each row represents a single observation.
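A minimal sketch of the pre-tabulated case, assuming the base R `Titanic` dataset as a stand-in for your own data. After `as.data.frame()`, each row is one cell of the table and `Freq` holds its count, which is the layout `counts` is meant for. The `ggbarstats()` call is guarded so that the tabulation still runs when {ggstatsplot} is not installed.

```r
# Titanic ships with base R as a 4-way table; as.data.frame() yields one row
# per cell, with the cell count stored in the `Freq` column.
titanic_counts <- as.data.frame(Titanic)

# Collapse to the two variables of interest to inspect the contingency table.
tab <- xtabs(Freq ~ Survived + Sex, data = titanic_counts)
print(tab)

# Pass the count column through `counts` (only if {ggstatsplot} is available).
if (requireNamespace("ggstatsplot", quietly = TRUE)) {
  p_counts <- ggstatsplot::ggbarstats(
    data   = titanic_counts,
    x      = Survived,
    y      = Sex,
    counts = Freq
  )
}
```

If every row of your data is already a single observation, leave `counts = NULL`.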

type

A character specifying the type of statistical approach:

"parametric"
"nonparametric"
"robust"
"bayes"

You can specify just the initial letter.

paired

Logical indicating whether data came from a within-subjects or repeated measures design study (Default: FALSE).

results.subtitle

Decides whether the results of statistical tests are to be displayed as a subtitle (Default: TRUE). If set to FALSE, only the plot will be returned.

label

Character that decides what information is displayed on the label in each pie slice. Possible options are "percentage" (default), "counts", and "both".

label.args

Additional aesthetic arguments that will be passed to ggplot2::geom_label().

sample.size.label.args

Additional aesthetic arguments that will be passed to ggplot2::geom_text().

digits

Number of digits for rounding or significant figures. May also be "signif" to return significant figures or "scientific" to return scientific notation. Control the number of digits by adding the value as suffix, e.g. digits = "scientific4" to have scientific notation with 4 decimal places, or digits = "signif5" for 5 significant figures (see also signif()).

proportion.test

Decides whether a proportion test for the x variable is to be carried out for each level of y. Defaults to results.subtitle. In ggbarstats(), only p-values from this test will be displayed.

digits.perc

Numeric that decides the number of decimal places for percentage labels (Default: 0L).

bf.message

Logical that decides whether to display the Bayes Factor in favor of the null hypothesis. This argument is relevant only for the parametric test (Default: TRUE).

ratio

A vector of proportions: the expected proportions for the proportion test (should sum to 1). Default is NULL, which means the null is equal theoretical proportions across the levels of the nominal variable. E.g., ratio = c(0.5, 0.5) for two levels, ratio = c(0.25, 0.25, 0.25, 0.25) for four levels, etc.
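To illustrate what such a vector expresses, here is a sketch using `stats::chisq.test()`, the function this page lists for the goodness-of-fit test. It takes expected proportions through its `p` argument. The proportions below are hypothetical and chosen only for the example; how `ggbarstats()` forwards `ratio` internally is not shown here.

```r
# Observed counts for a three-level variable (cyl has levels 4, 6, 8).
cyl_counts <- table(mtcars$cyl)

# Hypothetical expected proportions, one per level; they must sum to 1.
expected <- c(0.25, 0.25, 0.5)
stopifnot(isTRUE(all.equal(sum(expected), 1)))

# Goodness-of-fit test against the stated proportions.
gof <- chisq.test(cyl_counts, p = expected)
print(gof)

# With the default (equal proportions), the null is 1/3 for each level.
gof_equal <- chisq.test(cyl_counts)
```

The vector needs one entry per non-empty factor level, in the order of the levels.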

conf.level

Scalar between 0 and 1 (default: 95% confidence/credible intervals, 0.95). If NULL, no confidence intervals will be computed.

sampling.plan

Character describing the sampling plan. Possible options:

"indepMulti" (independent multinomial; default)
"poisson"
"jointMulti" (joint multinomial)
"hypergeom" (hypergeometric)

For more, see BayesFactor::contingencyTableBF().

fixed.margin

For the independent multinomial sampling plan, which margin is fixed ("rows" or "cols"). Defaults to "rows".

prior.concentration

Specifies the prior concentration parameter, set to 1 by default. It indexes the expected deviation from the null hypothesis under the alternative, and corresponds to Gunel and Dickey's (1974) "a" parameter.

title

The text for the plot title.

subtitle

The text for the plot subtitle. Will work only if results.subtitle = FALSE.

caption

The text for the plot caption. This argument is relevant only if bf.message = FALSE.

legend.title

Title text for the legend.

xlab

Label for the x axis variable. If NULL (default), the variable name for x will be used.

ylab

Label for the y axis variable. If NULL (default), the variable name for y will be used.

ggtheme

A {ggplot2} theme. Default value is theme_ggstatsplot(). Any of the {ggplot2} themes (e.g., ggplot2::theme_bw()), or themes from extension packages are allowed (e.g., ggthemes::theme_fivethirtyeight(), hrbrthemes::theme_ipsum_ps(), etc.). But note that sometimes these themes will remove some of the details that {ggstatsplot} plots typically contain. For example, if relevant, ggbetweenstats() shows details about the multiple comparison test as a label on the secondary Y-axis. Some themes (e.g. ggthemes::theme_fivethirtyeight()) will remove the secondary Y-axis and thus the details as well.

package, palette

Name of the package from which the given palette is to be extracted. The available palettes and packages can be checked by running View(paletteer::palettes_d_names).

ggplot.component

A ggplot component to be added to the plot prepared by {ggstatsplot}. This argument is primarily helpful for grouped_ variants of all primary functions. Default is NULL. The argument should be entered as a {ggplot2} function or a list of {ggplot2} functions.

...

Currently ignored.

Details

For details, see: https://indrajeetpatil.github.io/ggstatsplot/articles/web_only/ggpiestats.html

Summary of graphics

graphical element	`geom` used	argument for further modification
bars	`ggplot2::geom_bar()`	`NA`
descriptive labels	`ggplot2::geom_label()`	`label.args`
sample size labels	`ggplot2::geom_text()`	`sample.size.label.args`

Contingency table analyses

The tables below summarize:

statistical test carried out for inferential statistics
type of effect size estimate and a measure of uncertainty for this estimate
functions used internally to compute these details

two-way table

Hypothesis testing

Type	Design	Test	Function used
Parametric/Non-parametric	Unpaired	Pearson's chi-squared test	`stats::chisq.test()`
Bayesian	Unpaired	Bayesian Pearson's chi-squared test	`BayesFactor::contingencyTableBF()`
Parametric/Non-parametric	Paired	McNemar's chi-squared test	`stats::mcnemar.test()`
Bayesian	Paired	No	No
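The two frequentist rows of the table can be tried directly with the base R functions it names. This is a sketch rather than a reproduction of what `ggbarstats()` does internally. The unpaired case reuses `mtcars`, and the paired counts are hypothetical numbers for a before/after design.

```r
# Unpaired design: Pearson's chi-squared test on a vs-by-cyl table.
unpaired_tab <- table(vs = mtcars$vs, cyl = mtcars$cyl)
# Some expected counts are small, so R warns that the chi-squared
# approximation may be inaccurate; the warning is silenced for brevity.
unpaired_res <- suppressWarnings(chisq.test(unpaired_tab))
print(unpaired_res)

# Paired design: McNemar's test on a square table of matched responses.
# These counts are hypothetical and only illustrate the required layout.
paired_tab <- matrix(
  c(794, 86, 150, 570),
  nrow = 2,
  dimnames = list(before = c("yes", "no"), after = c("yes", "no"))
)
paired_res <- mcnemar.test(paired_tab)
print(paired_res)
```

McNemar's test only uses the off-diagonal cells, which is why it needs a square table built from matched pairs.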

Effect size estimation

Type	Design	Effect size	CI available?	Function used
Parametric/Non-parametric	Unpaired	Cramer's V	Yes	`effectsize::cramers_v()`
Bayesian	Unpaired	Cramer's V	Yes	`effectsize::cramers_v()`
Parametric/Non-parametric	Paired	Cohen's g	Yes	`effectsize::cohens_g()`
Bayesian	Paired	No	No	No
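For intuition about Cramer's V, the sketch below computes it by hand in base R for the `mtcars` table used in the Examples. The bias-corrected variant uses Bergsma's small-sample correction; that this is the adjustment behind the "Cramer's V (adj.)" label is an assumption here, although the result appears consistent with the estimate printed in the Examples.

```r
tab <- table(mtcars$vs, mtcars$cyl)

# Pearson's chi-squared statistic (approximation warning silenced).
chi2 <- unname(suppressWarnings(chisq.test(tab, correct = FALSE))$statistic)

n <- sum(tab)   # total number of observations
r <- nrow(tab)  # number of rows
k <- ncol(tab)  # number of columns

# Unadjusted Cramer's V.
v_raw <- sqrt(chi2 / (n * (min(r, k) - 1)))

# Bias-corrected Cramer's V (Bergsma's correction; assumed, see lead-in).
phi2_adj <- max(0, chi2 / n - (r - 1) * (k - 1) / (n - 1))
r_adj <- r - (r - 1)^2 / (n - 1)
k_adj <- k - (k - 1)^2 / (n - 1)
v_adj <- sqrt(phi2_adj / min(r_adj - 1, k_adj - 1))

print(c(unadjusted = v_raw, adjusted = v_adj))
```

In practice, `effectsize::cramers_v()` also returns the confidence interval, which this hand calculation does not attempt.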

one-way table

Hypothesis testing

Type	Test	Function used
Parametric/Non-parametric	Goodness of fit chi-squared test	`stats::chisq.test()`
Bayesian	Bayesian Goodness of fit chi-squared test	(custom)

Effect size estimation

Type	Effect size	CI available?	Function used
Parametric/Non-parametric	Pearson's C	Yes	`effectsize::pearsons_c()`
Bayesian	No	No	No

Examples

# for reproducibility
set.seed(123)

# creating a plot
p <- ggbarstats(mtcars, x = vs, y = cyl)

# looking at the plot
p

# extracting details from statistical tests
extract_stats(p)
#> $subtitle_data
#> # A tibble: 1 × 13
#>   statistic    df   p.value method                     effectsize
#>       <dbl> <int>     <dbl> <chr>                      <chr>
#> 1      21.3     2 0.0000232 Pearson's Chi-squared test Cramer's V (adj.)
#>   estimate conf.level conf.low conf.high conf.method conf.distribution n.obs
#>      <dbl>      <dbl>    <dbl>     <dbl> <chr>       <chr>             <int>
#> 1    0.789       0.95    0.371         1 ncp         chisq                32
#>   expression
#>   <list>
#> 1 <language>
#>
#> $caption_data
#> # A tibble: 1 × 15
#>   term  conf.level effectsize estimate conf.low conf.high
#>   <chr>      <dbl> <chr>         <dbl>    <dbl>     <dbl>
#> 1 Ratio       0.95 Cramers_v     0.683    0.436     0.840
#>   prior.distribution      prior.location prior.scale   bf10
#>   <chr>                            <dbl>       <dbl>  <dbl>
#> 1 independent multinomial              0           1 30129.
#>   method                              conf.method log_e_bf10 n.obs expression
#>   <chr>                               <chr>            <dbl> <int> <list>
#> 1 Bayesian contingency table analysis ETI               10.3    32 <language>
#>
#> $pairwise_comparisons_data
#> NULL
#>
#> $descriptive_data
#> # A tibble: 5 × 5
#>   cyl   vs    counts   perc .label
#>   <fct> <fct>  <int>  <dbl> <chr>
#> 1 4     1         10  90.9  91%
#> 2 6     1          4  57.1  57%
#> 3 4     0          1   9.09 9%
#> 4 6     0          3  42.9  43%
#> 5 8     0         14 100    100%
#>
#> $one_sample_data
#> # A tibble: 3 × 19
#>   cyl   counts  perc N        statistic    df p.value method effectsize estimate
#>   <fct>  <int> <dbl> <chr>        <dbl> <dbl>   <dbl> <chr>  <chr>         <dbl>
#> 1 8         14  43.8 (n = 14)    14         1 1.83e-4 Chi-s… Pearson's…    0.707
#> 2 6          7  21.9 (n = 7)      0.143     1 7.05e-1 Chi-s… Pearson's…    0.141
#> 3 4         11  34.4 (n = 11)     7.36      1 6.66e-3 Chi-s… Pearson's…    0.633
#> # ℹ 9 more variables: conf.level <dbl>, conf.low <dbl>, conf.high <dbl>,
#> #   conf.method <chr>, conf.distribution <chr>, n.obs <int>, expression <list>,
#> #   .label <glue>, .p.label <glue>
#>
#> $tidy_data
#> NULL
#>
#> $glance_data
#> NULL
#>
#> attr(,"class")
#> [1] "ggstatsplot_stats" "list"
