Title: Visualizations of Distributions and Uncertainty
Version: 3.3.3
Date: 2025-04-20
Maintainer: Matthew Kay <mjskay@northwestern.edu>
Description: Provides primitives for visualizing distributions using 'ggplot2' that are particularly tuned for visualizing uncertainty in either a frequentist or Bayesian mode. Both analytical distributions (such as frequentist confidence distributions or Bayesian priors) and distributions represented as samples (such as bootstrap distributions or Bayesian posterior samples) are easily visualized. Visualization primitives include but are not limited to: points with multiple uncertainty intervals, eye plots (Spiegelhalter D., 1999) <https://ideas.repec.org/a/bla/jorssa/v162y1999i1p45-58.html>, density plots, gradient plots, dot plots (Wilkinson L., 1999) <doi:10.1080/00031305.1999.10474474>, quantile dot plots (Kay M., Kola T., Hullman J., Munson S., 2016) <doi:10.1145/2858036.2858558>, complementary cumulative distribution function barplots (Fernandes M., Walls L., Munson S., Hullman J., Kay M., 2018) <doi:10.1145/3173574.3173718>, and fit curves with multiple uncertainty ribbons.
Depends: R (≥ 4.0.0)
Imports: grid, ggplot2 (≥ 3.5.0), scales, rlang (≥ 0.3.0), cli, tibble, vctrs, withr, glue, gtable, distributional (≥ 0.3.2), numDeriv, quadprog, Rcpp
Suggests: tidyselect, dplyr (≥ 1.0.0), fda, posterior (≥ 1.4.0), beeswarm (≥ 0.4.0), rmarkdown, knitr, testthat (≥ 3.0.0), vdiffr (≥ 1.0.0), svglite (≥ 2.1.0), fontquiver, sysfonts, showtext, mvtnorm, covr, broom (≥ 0.5.6), patchwork, tidyr (≥ 1.0.0), ragg (≥ 1.3.0), pkgdown
License: GPL (≥ 3)
Language: en-US
BugReports: https://github.com/mjskay/ggdist/issues
URL: https://mjskay.github.io/ggdist/, https://github.com/mjskay/ggdist/
VignetteBuilder: knitr
RoxygenNote: 7.3.2
LazyData: true
Encoding: UTF-8
Collate: "ggdist-package.R" "util.R" "compat.R" "rd.R" "RcppExports.R" "abstract_geom.R" "abstract_stat.R" "abstract_stat_slabinterval.R" "auto_partial.R" "binning_methods.R" "bounder.R" "curve_interval.R" "cut_cdf_qi.R" "data.R" "density.R" "distributions.R" "draw_key_slabinterval.R" "geom.R" "geom_slabinterval.R" "geom_dotsinterval.R" "geom_blur_dots.R" "geom_interval.R" "geom_lineribbon.R" "geom_pointinterval.R" "geom_slab.R" "geom_spike.R" "geom_swarm.R" "guide_rampbar.R" "interval_widths.R" "lkjcorr_marginal.R" "parse_dist.R" "partial_colour_ramp.R" "point_interval.R" "position_dodgejust.R" "pr.R" "rd_density.R" "rd_dotsinterval.R" "rd_slabinterval.R" "rd_spike.R" "rd_lineribbon.R" "scale_colour_ramp.R" "scale_thickness.R" "scale_side_mirrored.R" "scale_.R" "smooth.R" "stat.R" "stat_slabinterval.R" "stat_dotsinterval.R" "stat_mcse_dots.R" "stat_pointinterval.R" "stat_interval.R" "stat_lineribbon.R" "stat_spike.R" "student_t.R" "subguide.R" "subscale.R" "testthat.R" "theme_ggdist.R" "thickness.R" "tidy_format_translators.R" "weighted_ecdf.R" "weighted_hist.R" "weighted_quantile.R" "deprecated.R"
Config/testthat/edition: 3
LinkingTo: Rcpp
NeedsCompilation: yes
Packaged: 2025-04-22 23:25:44 UTC; matth
Author: Matthew Kay [aut, cre], Brenton M. Wiernik [ctb]
Repository: CRAN
Date/Publication: 2025-04-23 00:20:02 UTC

Visualizations of Distributions and Uncertainty

Description

ggdist is an R package that aims to make it easy to integrate popular Bayesian modeling methods into a tidy data + ggplot workflow.

Details

ggdist is an R package that provides a flexible set of ggplot2 geoms and stats designed especially for visualizing distributions and uncertainty. It is designed for both frequentist and Bayesian uncertainty visualization, taking the view that uncertainty visualization can be unified through the perspective of distribution visualization: for frequentist models, one visualizes confidence distributions or bootstrap distributions (see vignette("freq-uncertainty-vis")); for Bayesian models, one visualizes probability distributions (see vignette("tidybayes", package = "tidybayes")).

The geom_slabinterval() / stat_slabinterval() family (see vignette("slabinterval")) makes it easy to visualize point summaries and intervals, eye plots, half-eye plots, ridge plots, CCDF bar plots, gradient plots, histograms, and more.

The geom_dotsinterval() / stat_dotsinterval() family (see vignette("dotsinterval")) makes it easy to visualize dot+interval plots, Wilkinson dotplots, beeswarm plots, and quantile dotplots.

The geom_lineribbon() / stat_lineribbon() family (see vignette("lineribbon")) makes it easy to visualize fit lines with an arbitrary number of uncertainty bands.

Author(s)

Maintainer: Matthew Kay <mjskay@northwestern.edu>

Other contributors:

  • Brenton M. Wiernik [ctb]

See Also

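Useful links:

  • https://mjskay.github.io/ggdist/

  • https://github.com/mjskay/ggdist/

  • Report bugs at https://github.com/mjskay/ggdist/issues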


Base ggproto classes for ggdist

Description

Base ggproto classes for ggdist

See Also

ggproto()


Probability expressions in ggdist aesthetics

Description

Experimental probability-like expressions that can be used in place of some after_stat() expressions in aesthetic assignments in ggdist stats.

Usage

Pr_(x)
p_(x)

Arguments

x

<bare language> Expressions. See Probability expressions, below.

Details

Pr_() and p_() are an experimental mini-language for specifying aesthetic values based on probabilities and probability densities derived from distributions supplied to ggdist stats (e.g., in stat_slabinterval(), stat_dotsinterval(), etc.). They generate expressions that use after_stat() and the computed variables of the stat (such as cdf and pdf; see e.g. the Computed Variables section of stat_slabinterval()) to compute the desired probabilities or densities.

For example, one way to map the density of a distribution onto the alpha aesthetic of a slab is to use after_stat(pdf):

ggplot() +
  stat_slab(aes(xdist = distributional::dist_normal(), alpha = after_stat(pdf)))

ggdist probability expressions offer an alternative, equivalent syntax:

ggplot() +
  stat_slab(aes(xdist = distributional::dist_normal(), alpha = !!p_(x)))

Where p_(x) is the probability density function. The use of !! is necessary to splice the generated expression into the aes() call; for more information, see quasiquotation.

Probability expressions

Probability expressions consist of a call to Pr_() or p_() containing a small number of valid combinations of operators and variable names.

Valid variables in probability expressions include:

Pr_() generates expressions for probabilities, e.g. cumulative distribution functions (CDFs). Valid operators inside Pr_() are:

p_() generates expressions for probability density functions or probability mass functions (depending on whether the underlying distribution is continuous or discrete). It currently does not allow any operators in the expression, and must be passed one of x, y, or value.

See Also

The Computed Variables section of stat_slabinterval() (especially cdf and pdf) and the after_stat() function.

Examples

library(ggplot2)
library(distributional)

df = data.frame(
  d = c(dist_normal(2.7, 1), dist_lognormal(1, 1/3)),
  name = c("normal", "lognormal")
)

# map density onto alpha of the fill
ggplot(df, aes(y = name, xdist = d)) +
  stat_slabinterval(aes(alpha = !!p_(x)))

# map CCDF onto thickness (like stat_ccdfinterval())
ggplot(df, aes(y = name, xdist = d)) +
  stat_slabinterval(aes(thickness = !!Pr_(xdist > x)))

# map containing interval onto fill
ggplot(df, aes(y = name, xdist = d)) +
  stat_slabinterval(aes(fill = !!Pr_(x %in% interval)))

# the color scale in the previous example is not great, so turn the
# probability into an ordered factor and adjust the fill scale.
# Though, see also the `level` computed variable in `stat_slabinterval()`,
# which is probably easier to use to create this style of chart.
ggplot(df, aes(y = name, xdist = d)) +
  stat_slabinterval(aes(fill = ordered(!!Pr_(x %in% interval)))) +
  scale_fill_brewer(direction = -1)

Thinned subset of posterior sample from a Bayesian analysis of perception of correlation.

Description

Data from Kay and Heer (2016), primarily used for testing and examples.

Details

For more details, see Kay and Heer (2016) or the GitHub repository describing the analysis: https://github.com/mjskay/ranking-correlation. The original experiment (but not this analysis of it) is described in Harrison et al. (2014).

data("RankCorr") is a substantially thinned version of the original posterior sample and has omitted several parameters in order for it to be a more manageable size.

data("RankCorr_u_tau") is used for testing and examples and is roughly the equivalent of the following:

data("RankCorr")
RankCorr_u_tau = tidybayes::spread_draws(RankCorr, u_tau[i])

References

Kay, Matthew, and Jeffrey Heer. (2016). "Beyond Weber's law: A second look at ranking visualizations of correlation." IEEE Transactions on Visualization and Computer Graphics 22(1): 469-478. doi:10.1109/TVCG.2015.2467671

Harrison, Lane, Fumeng Yang, Steven Franconeri, and Remco Chang. (2014). "Ranking visualizations of correlation using Weber's law." IEEE Transactions on Visualization and Computer Graphics 20(12): 1943-1952. doi:10.1109/TVCG.2014.2346979


Break (bin) alignment methods

Description

Methods for aligning breaks (bins) in histograms, as used in the align argument to density_histogram().

Supports automatic partial function application with waived arguments.

Usage

align_none(breaks)
align_boundary(breaks, at = 0)
align_center(breaks, at = 0)

Arguments

breaks

<numeric> A sorted vector of breaks (bin edges).

at

<scalar numeric> The alignment point.

Details

These functions take a sorted vector of equally-spaced breaks giving bin edges and return a numeric offset which, if subtracted from breaks, will align them as desired:

For align_boundary() (respectively align_center()), if no bin edge (or center) in the range of breaks would line up with at, it ensures that at is an integer multiple of the bin width away from a bin edge (or center).
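The contract described above (an offset that, once subtracted from breaks, produces the desired alignment) can be sketched in base R. The helpers offset_boundary() and offset_center() below are hypothetical illustrations that assume equally spaced breaks; they are not ggdist's implementation and may differ from it in the sign or size of the offset they choose.

```r
# Hypothetical helpers (not part of ggdist): compute an offset that, when
# subtracted from equally spaced breaks, puts `at` an integer number of bin
# widths away from a bin edge (or a bin center).
offset_boundary <- function(breaks, at = 0) {
  width <- breaks[2] - breaks[1]
  (breaks[1] - at) %% width
}

offset_center <- function(breaks, at = 0) {
  width <- breaks[2] - breaks[1]
  (breaks[1] + width / 2 - at) %% width
}

breaks <- seq(0.3, 5.3, by = 1)
aligned_edges <- breaks - offset_boundary(breaks)    # edges fall on integers
aligned_centers <- breaks - offset_center(breaks)    # centers fall on integers
aligned_edges
aligned_centers + 0.5
```

Because the offset is always taken in the range [0, bin width), this sketch only ever shifts the breaks downward; a production implementation would also need to make sure the shifted breaks still cover the data.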

Value

A scalar numeric giving the offset to be subtracted from breaks.

See Also

density_histogram(), breaks

Examples

library(ggplot2)

set.seed(1234)
x = rnorm(200, 1, 2)

# If we manually specify a bin width using breaks_fixed(), the default
# alignment (align_none()) will not align bin edges to any "pretty" numbers.
# Here is a comparison of the three alignment methods on such a histogram:
ggplot(data.frame(x), aes(x)) +
  stat_slab(
    aes(y = "align_none()\nor 'none'"),
    density = "histogram",
    breaks = breaks_fixed(width = 1),
    outline_bars = TRUE,
    # no need to specify align; align_none() is the default
    color = "black",
  ) +
  stat_slab(
    aes(y = "align_center(at = 0)\nor 'center'"),
    density = "histogram",
    breaks = breaks_fixed(width = 1),
    align = align_center(at = 0),   # or align = "center"
    outline_bars = TRUE,
    color = "black",
  ) +
  stat_slab(
    aes(y = "align_boundary(at = 0)\nor 'boundary'"),
    density = "histogram",
    breaks = breaks_fixed(width = 1),
    align = align_boundary(at = 0), # or align = "boundary"
    outline_bars = TRUE,
    color = "black",
  ) +
  geom_point(aes(y = 0.7), alpha = 0.5) +
  labs(
    subtitle = "ggdist::stat_slab(density = 'histogram', ...)",
    y = "align =",
    x = NULL
  ) +
  geom_vline(xintercept = 0, linetype = "22", color = "red")

Automatic partial function application in ggdist

Description

Several ggdist functions support automatic partial application: when called, if not all of their required arguments have been provided, the function returns a modified version of itself that uses the arguments passed to it so far as defaults. Technically speaking, these functions are essentially "Curried" with respect to their required arguments, but I think "automatic partial application" gets the idea across more clearly.

Functions supporting automatic partial application include:

Partial application makes it easier to supply custom parameters to these functions when using them inside other functions, such as geoms and stats. For example, smoothers for geom_dots() can be supplied in one of three ways:

Many other common arguments for ggdist functions work similarly; e.g. density, align, breaks, bandwidth, and point_interval arguments.

These function families (except point_interval()) also support passing waivers to their optional arguments: if waiver() is passed to any of these arguments, their default value (or the most recently-partially-applied non-waiver value) is used instead.

Use the auto_partial() function to create new functions that support automatic partial application.

Usage

auto_partial(f, name = NULL, waivable = TRUE)

Arguments

f

<function> Function to automatically partially-apply.

name

<string> Name of the function, to be used when printing.

waivable

<scalar logical> If TRUE, optional arguments that get passed a waiver() will keep their default value (or whatever non-waiver value has been most recently partially applied for that argument).

Value

A modified version of f that will automatically be partially applied if not all of its required arguments are given.

Examples

set.seed(1234)
x = rnorm(100)

# the first required argument, `x`, of the density_ family is the vector
# to calculate a kernel density estimate from. If it is not provided, the
# function is partially applied and returned as-is
density_unbounded()

# we could create a new function that uses half the default bandwidth
density_half_bw = density_unbounded(adjust = 0.5)
density_half_bw

# we can overwrite partially-applied arguments
density_quarter_bw_trimmed = density_half_bw(adjust = 0.25, trim = TRUE)
density_quarter_bw_trimmed

# when we eventually call the function and provide the required argument
# `x`, it is applied using the arguments we have "saved up" so far
density_quarter_bw_trimmed(x)

# create a custom automatically partially applied function
f = auto_partial(function(x, y, z = 3) (x + y) * z)
f()
f(1)
g = f(y = 2)(z = 4)
g
g(1)

# pass waiver() to optional arguments to use existing values
f(z = waiver())(1, 2)  # uses default z = 3
f(z = 4)(z = waiver())(1, 2)  # uses z = 4

Bandwidth estimators

Description

Bandwidth estimators for densities, used in the bandwidth argument to density functions (e.g. density_bounded(), density_unbounded()).

Supports automatic partial function application with waived arguments.

Usage

bandwidth_nrd0(x, ...)
bandwidth_nrd(x, ...)
bandwidth_ucv(x, ...)
bandwidth_bcv(x, ...)
bandwidth_SJ(x, ...)
bandwidth_dpi(x, ...)

Arguments

x

<numeric> Vector containing a sample.

...

Arguments passed on to stats::bw.SJ

nb

number of bins to use.

lower, upper

range over which to minimize. The default is almost always satisfactory. hmax is calculated internally from a normal reference bandwidth.

method

either "ste" ("solve-the-equation") or "dpi" ("direct plug-in"). Can be abbreviated.

tol

for method "ste", the convergence tolerance for uniroot. The default leads to bandwidth estimates with only slightly more than one digit accuracy, which is sufficient for practical density estimation, but possibly not for theoretical simulation studies.

Details

These are loose wrappers around the corresponding bw.-prefixed functions in stats. See, for example, bw.SJ().

bandwidth_dpi(), which is the default bandwidth estimator in ggdist, is the Sheather-Jones direct plug-in estimator, i.e. bw.SJ(..., method = "dpi").

With the exception of bandwidth_nrd0(), these estimators may fail in some cases, often when a sample contains many duplicates. If they do, they will automatically fall back to bandwidth_nrd0() with a warning. However, these failures are typically symptomatic of situations where you should not want to use a kernel density estimator in the first place (e.g. data with duplicates and/or discrete data). In these cases consider using a dotplot (geom_dots()) or histogram (density_histogram()) instead.
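Since these estimators are described as wrappers around the bw.-prefixed functions in stats, the underlying base-R calls give a feel for the values involved. The sketch below uses only stats::bw.SJ() and stats::bw.nrd0(); it does not call ggdist, and the fallback-with-warning behavior described above is not reproduced here.

```r
set.seed(1234)
x <- rnorm(200)

# Sheather-Jones direct plug-in bandwidth: the call that the text above
# gives as the definition of bandwidth_dpi()
bw_dpi <- stats::bw.SJ(x, method = "dpi")

# Silverman's rule of thumb: the base-R function corresponding to the
# fallback estimator, bandwidth_nrd0()
bw_nrd0 <- stats::bw.nrd0(x)

c(dpi = bw_dpi, nrd0 = bw_nrd0)
```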

Value

A single number giving the bandwidth.

See Also

density_bounded(), density_unbounded().


Bin data values using a dotplot algorithm

Description

Bins the provided data values using one of several dotplot algorithms.

Usage

bin_dots(
  x,
  y,
  binwidth,
  heightratio = 1,
  stackratio = 1,
  layout = c("bin", "weave", "hex", "swarm", "bar"),
  side = c("topright", "top", "right", "bottomleft", "bottom", "left", "topleft",
    "bottomright", "both"),
  orientation = c("horizontal", "vertical", "y", "x"),
  overlaps = "nudge"
)

Arguments

x

<numeric> x values.

y

<numeric> y values (same length as x).

binwidth

<scalar numeric> Bin width.

heightratio

<scalar numeric> Ratio of bin width to dot height.

stackratio

<scalar numeric> Ratio of dot height to vertical distance between dot centers.

layout

<string> The layout method used for the dots. One of:

  • "bin" (default): places dots on the off-axis at the midpoint of their bins as in the classic Wilkinson dotplot. This maintains the alignment of rows and columns in the dotplot. This layout is slightly different from the classic Wilkinson algorithm in that: (1) it nudges bins slightly to avoid overlapping bins and (2) if the input data are symmetrical it will return a symmetrical layout.

  • "weave": uses the same basic binning approach of "bin", but places dots in the off-axis at their actual positions (unless overlaps = "nudge", in which case overlaps may be nudged out of the way). This maintains the alignment of rows but does not align dots within columns.

  • "hex": uses the same basic binning approach of "bin", but alternates placing dots + binwidth/4 or - binwidth/4 in the off-axis from the bin center. This allows hexagonal packing by setting a stackratio less than 1 (something like 0.9 tends to work).

  • "swarm": uses the "compactswarm" layout from beeswarm::beeswarm(). Does not maintain alignment of rows or columns, but can be more compact and neat looking, especially for sample data (as opposed to quantile dotplots of theoretical distributions, which may look better with "bin", "weave", or "hex").

  • "bar": for discrete distributions, lays out duplicate values in rectangular bars.

side

Which side to place the slab on. "topright", "top", and "right" are synonyms which cause the slab to be drawn on the top or the right depending on if orientation is "horizontal" or "vertical". "bottomleft", "bottom", and "left" are synonyms which cause the slab to be drawn on the bottom or the left depending on if orientation is "horizontal" or "vertical". "topleft" causes the slab to be drawn on the top or the left, and "bottomright" causes the slab to be drawn on the bottom or the right. "both" draws the slab mirrored on both sides (as in a violin plot).

orientation

<string> Whether the dots are laid out horizontally or vertically. Follows the naming scheme of geom_slabinterval():

  • "horizontal" assumes the data values for the dotplot are in the x variable and that dots will be stacked up in the y direction.

  • "vertical" assumes the data values for the dotplot are in the y variable and that dots will be stacked up in the x direction.

For compatibility with the base ggplot naming scheme for orientation, "x" can be used as an alias for "vertical" and "y" as an alias for "horizontal".

overlaps

<string> How to handle overlapping dots or bins in the "bin", "weave", and "hex" layouts (dots never overlap in the "swarm" or "bar" layouts). For the purposes of this argument, dots are only considered to be overlapping if they would be overlapping when dotsize = 1 and stackratio = 1; i.e. if you set those arguments to other values, overlaps may still occur. One of:

  • "keep": leave overlapping dots as they are. Dots may overlap (usually only slightly) in the "bin", "weave", and "hex" layouts.

  • "nudge": nudge overlapping dots out of the way. Overlaps are avoided using a constrained optimization which minimizes the squared distance of dots to their desired positions, subject to the constraint that adjacent dots do not overlap.

Value

A data.frame with three columns:

See Also

find_dotplot_binwidth() for an algorithm that finds good bin widths to use with this function; geom_dotsinterval() for geometries that use these algorithms to create dotplots.

Examples

library(dplyr)
library(ggplot2)

x = qnorm(ppoints(20))
bin_df = bin_dots(x = x, y = 0, binwidth = 0.5, heightratio = 1)
bin_df

# we can manually plot the binning above, though this is only recommended
# if you are using find_dotplot_binwidth() and bin_dots() to build your own
# grob. For practical use it is much easier to use geom_dots(), which will
# automatically select good bin widths for you (and which uses
# find_dotplot_binwidth() and bin_dots() internally)
bin_df %>%
  ggplot(aes(x = x, y = y)) +
  geom_point(size = 4) +
  coord_fixed()

Blur functions for blurry dot plots

Description

Methods for constructing blurs, as used in the blur argument to geom_blur_dots() or stat_mcse_dots().

Supports automatic partial function application with waived arguments.

Usage

blur_gaussian(x, r, sd)
blur_interval(x, r, sd, .width = 0.95)

Arguments

x

<numeric> Vector of positive distances from the center of the dot (assumed to be 0) to evaluate blur function at.

r

<scalar numeric> Radius of the dot that is being blurred.

sd

<scalar numeric> Standard deviation of the dot that is being blurred.

.width

<scalar numeric> For blur_interval(), a probability giving the width of the interval.

Details

These functions are passed x, r, and sd when geom_blur_dots() draws in order to create a radial gradient representing each dot in the dotplot. They return values between 0 and 1 giving the opacity of the dot at each value of x.

blur_gaussian() creates a dot with radius r that has a Gaussian blur with standard deviation sd applied to it. It does this by calculating \alpha(x; r, \sigma), the opacity at distance x from the center of a dot with radius r that has had a Gaussian blur with standard deviation \sigma = sd applied to it:

\alpha(x; r, \sigma) = \Phi \left(\frac{x + r}{\sigma} \right) - \Phi \left(\frac{x - r}{\sigma} \right)
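The opacity formula above translates directly into base R, with pnorm() standing in for \Phi. The function name gaussian_blur_opacity() is a hypothetical stand-in used for illustration; this is a transcription of the displayed formula, not ggdist's source code.

```r
# Hypothetical transcription of the displayed formula (not ggdist's source):
# opacity at distance x from the center of a dot of radius r after a
# Gaussian blur with standard deviation sd
gaussian_blur_opacity <- function(x, r, sd) {
  pnorm((x + r) / sd) - pnorm((x - r) / sd)
}

x <- seq(0, 3, by = 0.5)
opacity <- gaussian_blur_opacity(x, r = 1, sd = 0.5)
round(opacity, 3)
```

The opacity is highest at the center of the dot and falls off smoothly around x = r; a smaller sd gives a sharper edge.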

blur_interval() creates an interval-type representation around the dot at 50% opacity, where the interval is a Gaussian quantile interval with mass equal to .width and standard deviation sd.

Value

A vector with the same length as x giving the opacity of the radial gradient representing the dot at each x value.

See Also

geom_blur_dots() and stat_mcse_dots() for geometries making use of blur functions.

Examples

# see examples in geom_blur_dots()

Estimate bounds of a distribution using the CDF of its order statistics

Description

Estimate the bounds of the distribution a sample came from using the CDF of the order statistics of the sample. Use with the bounder argument to density_bounded().

Supports automatic partial function application with waived arguments.

Usage

bounder_cdf(x, p = 0.01)

Arguments

x

<numeric> Sample to estimate the bounds of.

p

<scalar numeric> in [0,1]: Percentile of the order statistic distribution to use as the estimate. p = 1 will return range(x); p = 0.5 will give the median estimate, p = 0 will give a very wide estimate (effectively treating the distribution as unbounded when used with density_bounded()).

Details

bounder_cdf() uses the distribution of the order statistics of X to estimate where the first and last order statistics (i.e. the min and max) of this distribution would be, assuming the sample x is the distribution. Then, it adjusts the boundary outwards from min(x) (or max(x)) by the distance between min(x) (or max(x)) and the nearest estimated order statistic.

Taking X = x, the distributions of the first and last order statistics are:

\begin{array}{rcl}F_{X_{(1)}}(x) &=& 1 - \left[1 - F_X(x)\right]^n\\F_{X_{(n)}}(x) &=& F_X(x)^n\end{array}

Re-arranging, we can get the inverse CDFs (quantile functions) of each order statistic in terms of the quantile function of X (which we can estimate from the data), giving us an estimate for the minimum and maximum order statistic:

\begin{array}{rcrcl}\hat{x_1} &=& F_{X_{(1)}}^{-1}(p) &=& F_X^{-1}\left[1 - (1 - p)^{1/n}\right]\\\hat{x_n} &=& F_{X_{(n)}}^{-1}(p) &=& F_X^{-1}\left[p^{1/n}\right]\end{array}

Then the estimated bounds are:

\left[2\min(x) - \hat{x_1}, 2\max(x) - \hat{x_n} \right]
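As a rough illustration, the displayed formulas can be transcribed into base R, using the empirical quantile function of the sample in place of F_X^{-1}. The helper cdf_bounds_sketch() is hypothetical and evaluates the formulas at the median, p = 0.5; ggdist's own implementation may be organized differently.

```r
# Hypothetical transcription of the displayed formulas (not ggdist's source).
# quantile() on the sample stands in for the quantile function F_X^{-1}.
cdf_bounds_sketch <- function(x, p = 0.5) {
  n <- length(x)
  x_hat_1 <- quantile(x, 1 - (1 - p)^(1 / n), names = FALSE)
  x_hat_n <- quantile(x, p^(1 / n), names = FALSE)
  # push each boundary outwards by the distance to the estimated order statistic
  c(2 * min(x) - x_hat_1, 2 * max(x) - x_hat_n)
}

set.seed(1234)
x <- runif(1000)  # a sample whose true bounds are 0 and 1
bounds <- cdf_bounds_sketch(x)
bounds
range(x)
```

Because the estimated order statistics always lie within the sample, the resulting bounds can only be as wide as or wider than range(x).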

These bounds depend on p, the percentile of the distribution of the order statistic used to form the estimate. While p = 0.5 (the median) might be a reasonable choice (and gives results similar to bounder_cooke()), this tends to be a bit too aggressive in "detecting" bounded distributions, especially in small sample sizes. Thus, we use a default of p = 0.01, which tends to be very conservative in small samples (in that it usually gives results roughly equivalent to an unbounded distribution), but which still performs well on bounded distributions when sample sizes are larger (in the thousands).

Value

A length-2 numeric vector giving an estimate of the minimum and maximum bounds of the distribution that x came from.

See Also

The bounder argument to density_bounded().

Other bounds estimators: bounder_cooke(), bounder_range()


Estimate bounds of a distribution using Cooke's method

Description

Estimate the bounds of the distribution a sample came from using Cooke's method. Use with the bounder argument to density_bounded().

Supports automatic partial function application with waived arguments.

Usage

bounder_cooke(x)

Arguments

x

<numeric> Sample to estimate the bounds of.

Details

Estimate the bounds of a distribution using the method from Cooke (1979); i.e. method 2.3 from Loh (1984). These bounds are:

\left[\begin{array}{l}2X_{(1)} - \sum_{i = 1}^n \left[\left(1 - \frac{i - 1}{n}\right)^n - \left(1 - \frac{i}{n}\right)^n \right] X_{(i)}\\2X_{(n)} - \sum_{i = 1}^n \left[\left(1 - \frac{n - i}{n}\right)^n - \left(1 - \frac{n + 1 - i}{n} \right)^n\right] X_{(i)}\end{array}\right]

Where X_{(i)} is the ith order statistic of x (i.e. its ith-smallest value).
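The two weighted sums above can be evaluated in a few lines of base R. The helper cooke_bounds_sketch() is a hypothetical name used for illustration; it transcribes the displayed formula term by term and is not taken from ggdist's source.

```r
# Hypothetical transcription of the displayed formula (not ggdist's source)
cooke_bounds_sketch <- function(x) {
  x <- sort(x)          # x[i] is now the i-th order statistic X_(i)
  n <- length(x)
  i <- seq_len(n)
  # weights on X_(i) in the lower- and upper-bound sums; each set sums to 1
  w_lower <- (1 - (i - 1) / n)^n - (1 - i / n)^n
  w_upper <- (1 - (n - i) / n)^n - (1 - (n + 1 - i) / n)^n
  c(2 * x[1] - sum(w_lower * x), 2 * x[n] - sum(w_upper * x))
}

set.seed(1234)
x <- runif(1000)  # a sample whose true bounds are 0 and 1
bounds <- cooke_bounds_sketch(x)
bounds
```

Each sum is a weighted average of the order statistics that concentrates on the extreme values, so each bound is moved outwards from the sample minimum or maximum by a small, data-driven amount.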

Value

A length-2 numeric vector giving an estimate of the minimum and maximum bounds of the distribution that x came from.

References

Cooke, P. (1979). Statistical inference for bounds of random variables. Biometrika 66(2), 367–374. doi:10.1093/biomet/66.2.367.

Loh, W. Y. (1984). Estimating an endpoint of a distribution with resampling methods. The Annals of Statistics 12(4), 1543–1550. doi:10.1214/aos/1176346811

See Also

The bounder argument to density_bounded().

Other bounds estimators: bounder_cdf(), bounder_range()


Estimate bounds of a distribution using the range of the sample

Description

Estimate the bounds of the distribution a sample came from using the range of the sample. Use with the bounder argument to density_bounded().

Supports automatic partial function application with waived arguments.

Usage

bounder_range(x)

Arguments

x

<numeric> Sample to estimate the bounds of.

Details

Estimate the bounds of a distribution using range(x).

Value

A length-2 numeric vector giving an estimate of the minimum and maximum bounds of the distribution that x came from.

See Also

The bounder argument to density_bounded().

Other bounds estimators: bounder_cdf(), bounder_cooke()


Break (bin) selection algorithms for histograms

Description

Methods for determining breaks (bins) in histograms, as used in the breaks argument to density_histogram().

Supports automatic partial function application with waived arguments.

Usage

breaks_fixed(x, weights = NULL, width = 1)
breaks_Sturges(x, weights = NULL)
breaks_Scott(x, weights = NULL)
breaks_FD(x, weights = NULL, digits = 5)
breaks_quantiles(x, weights = NULL, max_n = "Scott", min_width = 0.5)

Arguments

x

<numeric> Sample values.

weights

<numeric | NULL> Optional weights to apply to x, which will be normalized to sum to 1.

width

<scalar numeric> For breaks_fixed(), the desired bin width.

digits

<scalar numeric> For breaks_FD(), the number of significant digits to keep when rounding in the Freedman-Diaconis algorithm. For an explanation of this parameter, see the documentation of the corresponding parameter in grDevices::nclass.FD().

max_n

<scalar numeric | function | string> For breaks_quantiles(), either a scalar numeric giving the maximum number of bins, or another breaks function (or string giving the suffix of the name of a function prefixed with "breaks_") that will return the maximum number of bins. breaks_quantiles() will construct at most max_n bins.

min_width

<scalar numeric> For breaks_quantiles(), a numeric between 0 and 1 giving the minimum bin width as a proportion of diff(range(x)) / max_n.

Details

These functions take a sample and its weights and return a value suitable for the breaks argument to density_histogram() that will determine the histogram breaks.

Value

Either a single number (giving the number of bins) or a vector giving the edges between bins.
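The two kinds of return value can be illustrated with base R alone. The sketch below assumes that the classical, unweighted rules in grDevices (nclass.Sturges(), nclass.scott(), nclass.FD()) are reasonable stand-ins for the corresponding breaks_ functions when no weights are given; the fixed-width edge construction is a hypothetical illustration, not ggdist's code.

```r
set.seed(1234)
x <- rnorm(2000, 1, 2)

# A single number: bin counts from the classical (unweighted) base-R rules
n_bins <- c(
  Sturges = grDevices::nclass.Sturges(x),
  Scott   = grDevices::nclass.scott(x),
  FD      = grDevices::nclass.FD(x)
)
n_bins

# A vector of bin edges: fixed-width bins that cover the whole sample
width <- 0.5
edges <- seq(
  floor(min(x) / width) * width,
  ceiling(max(x) / width) * width,
  by = width
)
range(edges)
```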

See Also

density_histogram(), align

Examples

library(ggplot2)

set.seed(1234)
x = rnorm(2000, 1, 2)

# Let's compare the different break-selection algorithms on this data:
ggplot(data.frame(x), aes(x)) +
  stat_slab(
    aes(y = "breaks_fixed(width = 0.5)"),
    density = "histogram",
    breaks = breaks_fixed(width = 0.5),
    outline_bars = TRUE,
    color = "black",
  ) +
  stat_slab(
    aes(y = "breaks_Sturges()\nor 'Sturges'"),
    density = "histogram",
    breaks = "Sturges",
    outline_bars = TRUE,
    color = "black",
  ) +
  stat_slab(
    aes(y = "breaks_Scott()\nor 'Scott'"),
    density = "histogram",
    breaks = "Scott",
    outline_bars = TRUE,
    color = "black",
  ) +
  stat_slab(
    aes(y = "breaks_FD()\nor 'FD'"),
    density = "histogram",
    breaks = "FD",
    outline_bars = TRUE,
    color = "black",
  ) +
  stat_slab(
    aes(y = "breaks_quantiles()\nor 'quantiles'"),
    density = "histogram",
    breaks = "quantiles",
    outline_bars = TRUE,
    color = "black",
  ) +
  geom_point(aes(y = 0.7), alpha = 0.5) +
  labs(
    subtitle = "ggdist::stat_slab(density = 'histogram', ...)",
    y = "breaks =",
    x = NULL
  )

Curvewise point and interval summaries for tidy data frames of draws from distributions

Description

Translates draws from distributions in a grouped data frame into a set of point and interval summaries using a curve boxplot-inspired approach.

Usage

curve_interval(
  .data,
  ...,
  .along = NULL,
  .width = 0.5,
  na.rm = FALSE,
  .interval = c("mhd", "mbd", "bd", "bd-mbd")
)

## S3 method for class 'matrix'
curve_interval(
  .data,
  ...,
  .along = NULL,
  .width = 0.5,
  na.rm = FALSE,
  .interval = c("mhd", "mbd", "bd", "bd-mbd")
)

## S3 method for class 'rvar'
curve_interval(
  .data,
  ...,
  .along = NULL,
  .width = 0.5,
  na.rm = FALSE,
  .interval = c("mhd", "mbd", "bd", "bd-mbd")
)

## S3 method for class 'data.frame'
curve_interval(
  .data,
  ...,
  .along = NULL,
  .width = 0.5,
  na.rm = FALSE,
  .interval = c("mhd", "mbd", "bd", "bd-mbd"),
  .simple_names = TRUE,
  .exclude = c(".chain", ".iteration", ".draw", ".row")
)

Arguments

.data

<data.frame | rvar | matrix> One of:

  • A data frame (or grouped data frame as returned by dplyr::group_by()) that contains draws to summarize.

  • A posterior::rvar vector.

  • A matrix; in which case the first dimension should be draws and the second dimension values of the curve.

...

<bare language> Bare column names or expressions that, when evaluated in the context of .data, represent draws to summarize. If this is empty, then by default all columns that are not group columns and which are not in .exclude (by default ".chain", ".iteration", ".draw", and ".row") will be summarized. This can be numeric columns, list columns containing numeric vectors, or posterior::rvar()s.

.along

<tidyselect> Which columns are the input values to the functiondescribing the curve (e.g., the "x" values). Intervals are calculated jointly withrespect to these variables, conditional on all other grouping variables in the data frame. The default(NULL) causescurve_interval() to use all grouping variables in the input data frame as the valuefor.along, which will generate the most conservative intervals. However, if you want to calculateintervals for some functiony = f(x) conditional on some other variable(s) (say, conditional on afactorg), you would group byg, then use.along = x to calculate intervals jointly overxconditional ong. To avoid selecting any variables as input values to the function describing thecurve, usecharacter(); this will produce conditional intervals only (the result in this case shouldbe very similar tomedian_qi()). Currently only supported when.data is a data frame.

.width

<numeric> Vector of probabilities to use that determine the widths of the resultingintervals. If multiple probabilities are provided, multiple rows per group are generated, each witha different probability interval (and value of the corresponding.width column).

na.rm

<scalar logical> Should NA values be stripped before the computation proceeds? If FALSE (the default), the presence of NA values in the columns to be summarized will generally result in an error. If TRUE, NA values will be removed in the calculation of intervals so long as .interval is "mhd"; other methods do not currently support na.rm. Be cautious in applying this parameter: in general, it is unclear what a joint interval should be when any of the values are missing!

.interval

<string> The method used to calculate the intervals. Currently, all methods rank the curves using some measure of data depth, then create envelopes containing the .width% "deepest" curves. Available methods are:

  • "mhd": mean halfspace depth (Fraiman and Muniz 2001).

  • "mbd": modified band depth (Sun and Genton 2011): calls fda::fbplot() with method = "MBD".

  • "bd": band depth (Sun and Genton 2011): calls fda::fbplot() with method = "BD2".

  • "bd-mbd": band depth, breaking ties with modified band depth (Sun and Genton 2011): calls fda::fbplot() with method = "Both".

.simple_names

<scalar logical> When TRUE and only a single column / vector is to be summarized, use the name .lower for the lower end of the interval and .upper for the upper end. When FALSE and .data is a data frame, names the lower and upper intervals for each column x as x.lower and x.upper.

.exclude

<character> Vector of names of columns to be excluded from summarization if no column names are specified to be summarized. Default ignores several meta-data column names used in ggdist and tidybayes.

Details

Intervals are calculated by ranking the curves using some measure of data depth, then using binary search to find a cutoff k such that an envelope containing the k% "deepest" curves also contains .width% of the curves, for each value of .width (note that k and .width are not necessarily the same). This is in contrast to most functional boxplot or curve boxplot approaches, which tend to simply take the .width% deepest curves, and are generally quite conservative (i.e. they may contain more than .width% of the curves).

See Mirzargar et al. (2014) or Juul et al. (2020) for an accessible introduction to data depth and curve boxplots / functional boxplots.
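The ranking and binary search described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the helper name curve_envelope_sketch() is hypothetical, the input is assumed to be a matrix with one curve per row, and depth is taken to be the pointwise halfspace depth averaged over x.

```r
# Hypothetical helper (not part of ggdist): rows of `draws` are curves,
# columns are x positions.
curve_envelope_sketch <- function(draws, width = 0.5) {
  n <- nrow(draws)

  # Pointwise halfspace depth of each curve at each x, averaged over x
  # (an assumed stand-in for the "mhd" depth measure).
  depth <- rowMeans(apply(draws, 2, function(y) {
    pmin(rank(y, ties.method = "max"), n + 1 - rank(y, ties.method = "min")) / n
  }))
  deepest <- order(depth, decreasing = TRUE)

  # Proportion of all curves lying entirely inside the envelope
  # spanned by the k deepest curves.
  coverage <- function(k) {
    core <- draws[deepest[seq_len(k)], , drop = FALSE]
    lower <- apply(core, 2, min)
    upper <- apply(core, 2, max)
    mean(apply(draws, 1, function(curve) all(curve >= lower & curve <= upper)))
  }

  # Binary search for the smallest k whose envelope covers at least `width`.
  lo <- 1
  hi <- n
  while (lo < hi) {
    mid <- (lo + hi) %/% 2
    if (coverage(mid) >= width) hi <- mid else lo <- mid + 1
  }

  core <- draws[deepest[seq_len(lo)], , drop = FALSE]
  list(
    k = lo,
    lower = apply(core, 2, min),
    upper = apply(core, 2, max),
    coverage = coverage(lo)
  )
}

set.seed(1)
x <- seq(-3, 3, length.out = 25)
draws <- t(sapply(rnorm(40), function(m) dnorm(x, mean = m)))
env <- curve_envelope_sketch(draws, width = 0.5)
env$k / nrow(draws)  # proportion of curves used to build the envelope
env$coverage         # proportion of curves the envelope actually contains
```

Because the envelope of the k deepest curves usually contains some shallower curves as well, the proportion used to build it can be smaller than the proportion it covers, which is why k and .width need not match.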

Value

A data frame containing point summaries and intervals, with at least one column corresponding to the point summary, one to the lower end of the interval, one to the upper end of the interval, the width of the interval (.width), the type of point summary (.point), and the type of interval (.interval).

Author(s)

Matthew Kay

References

Fraiman, Ricardo and Graciela Muniz. (2001). "Trimmed means for functional data". Test 10: 419–440. doi:10.1007/BF02595706.

Sun, Ying and Marc G. Genton. (2011). "Functional Boxplots". Journal of Computational and Graphical Statistics, 20(2): 316-334. doi:10.1198/jcgs.2011.09224

Mirzargar, Mahsa, Ross T Whitaker, and Robert M Kirby. (2014). "Curve Boxplot: Generalization of Boxplot for Ensembles of Curves". IEEE Transactions on Visualization and Computer Graphics. 20(12): 2654-2663. doi:10.1109/TVCG.2014.2346455

Juul, Jonas, Kaare Græsbøll, Lasse Engbo Christiansen, and Sune Lehmann. (2020). "Fixed-time descriptive statistics underestimate extremes of epidemic curve ensembles". arXiv e-print. arXiv:2007.05035

See Also

point_interval() for pointwise intervals. See vignette("lineribbon") for more examples and discussion of the differences between pointwise and curvewise intervals.

Examples

library(dplyr)
library(ggplot2)

# generate a set of curves
k = 11 # number of curves
n = 201
df = tibble(
    .draw = rep(1:k, n),
    mean = rep(seq(-5,5, length.out = k), n),
    x = rep(seq(-15,15,length.out = n), each = k),
    y = dnorm(x, mean, 3)
  )

# see pointwise intervals...
df %>%
  group_by(x) %>%
  median_qi(y, .width = c(.5)) %>%
  ggplot(aes(x = x, y = y)) +
  geom_lineribbon(aes(ymin = .lower, ymax = .upper)) +
  geom_line(aes(group = .draw), alpha=0.15, data = df) +
  scale_fill_brewer() +
  ggtitle("50% pointwise intervals with point_interval()") +
  theme_ggdist()

# ... compare them to curvewise intervals
df %>%
  group_by(x) %>%
  curve_interval(y, .width = c(.5)) %>%
  ggplot(aes(x = x, y = y)) +
  geom_lineribbon(aes(ymin = .lower, ymax = .upper)) +
  geom_line(aes(group = .draw), alpha=0.15, data = df) +
  scale_fill_brewer() +
  ggtitle("50% curvewise intervals with curve_interval()") +
  theme_ggdist()

Categorize values from a CDF into quantile intervals

Description

Given a vector of probabilities from a cumulative distribution function (CDF) and a list of desired quantile intervals, return a vector categorizing each element of the input vector according to which quantile interval it falls into. NOTE: While this function can be used for (and was originally designed for) drawing slabs with intervals overlaid on the density, this can now be done more easily by mapping the .width or level computed variable to slab fill or color. See Examples.

Usage

cut_cdf_qi(p, .width = c(0.66, 0.95, 1), labels = NULL)

Arguments

p

<numeric> Vector of values from a cumulative distribution function, such as values returned by p-prefixed distribution functions in base R (e.g. pnorm()), the cdf() function, or values of the cdf computed aesthetic from the stat_slabinterval() family of stats.

.width

<numeric> Vector of probabilities to use that determine the widths of the resulting intervals.

labels

<character | function | NULL> One of:

  • A character vector giving labels (must be same length as .width)

  • A function that takes numeric probabilities as input and returns labels as output (a good candidate might be scales::percent_format()).

  • NULL to use the default labels (.width converted to a character vector).

Value

An ordered factor of the same length as p giving the quantile interval to which each value of p belongs.
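To make the categorization concrete, here is a minimal base R sketch of the idea. It assumes central (equal-tailed) intervals, so that a CDF value p lies in the interval of width w exactly when 2 * |p - 0.5| <= w. The helper name cut_cdf_sketch() is hypothetical, and its level ordering and handling of values outside the widest interval (returned as NA here) are assumptions rather than a description of cut_cdf_qi() itself.

```r
# Hypothetical helper (not part of ggdist): assign each CDF value to the
# narrowest central interval that contains it.
cut_cdf_sketch <- function(p, .width = c(0.66, 0.95, 1)) {
  .width <- sort(.width)
  # width of the smallest central interval that contains each p
  central <- 2 * abs(p - 0.5)
  # index of the first .width that is >= central (NA if there is none)
  idx <- findInterval(central, .width, left.open = TRUE) + 1
  factor(.width[idx], levels = .width, ordered = TRUE)
}

p <- c(0.5, 0.9, 0.99, 0.1)
res <- cut_cdf_sketch(p)
as.character(res)
# → "0.66" "0.95" "1" "0.95"
```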

See Also

See stat_slabinterval() and its shortcut stats, which generate cdf aesthetics that can be used with cut_cdf_qi() to draw slabs colored by their intervals.

Examples

library(ggplot2)
library(dplyr)
library(scales)
library(distributional)

theme_set(theme_ggdist())

# NOTE: cut_cdf_qi() used to be the recommended way to do intervals overlaid
# on densities, like this...
tibble(x = dist_normal(0, 1)) %>%
  ggplot(aes(xdist = x)) +
  stat_slab(
    aes(fill = after_stat(cut_cdf_qi(cdf)))
  ) +
  scale_fill_brewer(direction = -1)

# ... however this is now more easily and flexibly accomplished by directly
# mapping .width or level onto fill:
tibble(x = dist_normal(0, 1)) %>%
  ggplot(aes(xdist = x)) +
  stat_slab(
    aes(fill = after_stat(level)),
    .width = c(.66, .95, 1)
  ) +
  scale_fill_brewer()

# See vignette("slabinterval") for more examples. The remaining examples
# below using cut_cdf_qi() are kept for posterity.

# With a halfeye (or other geom with slab and interval), NA values will
# show up in the fill scale from the CDF function applied to the internal
# interval geometry data and can be ignored, hence na.translate = FALSE
tibble(x = dist_normal(0, 1)) %>%
  ggplot(aes(xdist = x)) +
  stat_halfeye(aes(
    fill = after_stat(cut_cdf_qi(cdf, .width = c(.5, .8, .95, 1)))
  )) +
  scale_fill_brewer(direction = -1, na.translate = FALSE)

# we could also use the labels parameter to apply nicer formatting
# and provide a better name for the legend, and omit the 100% interval
# if desired
tibble(x = dist_normal(0, 1)) %>%
  ggplot(aes(xdist = x)) +
  stat_halfeye(aes(
    fill = after_stat(cut_cdf_qi(
      cdf,
      .width = c(.5, .8, .95),
      labels = percent_format(accuracy = 1)
    ))
  )) +
  labs(fill = "Interval") +
  scale_fill_brewer(direction = -1, na.translate = FALSE)

Bounded density estimator using the reflection method

Description

Bounded density estimator using the reflection method.

Supports automatic partial function application with waived arguments.
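As a rough illustration of the reflection method, the base R sketch below evaluates an ordinary Gaussian kernel density estimate and adds back the mass that falls outside each bound by mirroring it across that bound. It is a sketch under assumptions, not the code used by density_bounded(): the helper name reflect_density_sketch() is hypothetical, both bounds are assumed known, and the bandwidth comes from stats::bw.nrd0() rather than the package's default.

```r
# Hypothetical helper (not part of ggdist): reflection-method KDE on
# known bounds [lower, upper] with a Gaussian kernel.
reflect_density_sketch <- function(x, lower, upper, n = 501, bw = bw.nrd0(x)) {
  grid <- seq(lower, upper, length.out = n)
  # ordinary (unbounded) kernel density estimate evaluated at points t
  kde <- function(t) {
    vapply(t, function(ti) mean(dnorm(ti, mean = x, sd = bw)), numeric(1))
  }
  # density inside the bounds, plus mass reflected back across each bound
  y <- kde(grid) + kde(2 * lower - grid) + kde(2 * upper - grid)
  list(x = grid, y = y)
}

set.seed(123)
x <- rbeta(1000, 1, 3)
d <- reflect_density_sketch(x, lower = 0, upper = 1)

# the estimate should integrate to approximately 1 over [0, 1]
area <- sum(diff(d$x) * (head(d$y, -1) + tail(d$y, -1)) / 2)
area
```

Without the reflected terms, roughly half of the kernel mass near x = 0 would spill below the bound and the estimate there would be about half as high as it should be.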

Usage

density_bounded(
  x,
  weights = NULL,
  n = 501,
  bandwidth = "dpi",
  adjust = 1,
  kernel = "gaussian",
  trim = TRUE,
  bounds = c(NA, NA),
  bounder = "cdf",
  adapt = 1,
  na.rm = FALSE,
  ...,
  range_only = FALSE
)

Arguments

x

<numeric> Sample to compute a density estimate for.

weights

<numeric | NULL> Optional weights to apply to x.

n

<scalar numeric> The number of grid points to evaluate the density estimator at.

bandwidth

<scalar numeric | function | string> Bandwidth of the density estimator. One of:

  • a numeric: the bandwidth, as the standard deviation of the kernel

  • a function: a function taking x (the sample) and returning the bandwidth

  • a string: the suffix of the name of a function starting with "bandwidth_" that will be used to determine the bandwidth. See bandwidth for a list.

adjust

<scalar numeric> Value to multiply the bandwidth of the density estimator by. Default 1.

kernel

<string> The smoothing kernel to be used. This must partially match one of "gaussian", "rectangular", "triangular", "epanechnikov", "biweight", "cosine", or "optcosine". See stats::density().

trim

<scalar logical> Should the density estimate be trimmed to the range of the data? Default TRUE.

bounds

<length-2 numeric> Min and max bounds. If a bound is NA, then that bound is estimated from the data using the method specified by bounder.

bounder

<function | string> Method to use to find missing (NA) bounds. A function that takes a numeric vector of values and returns a length-2 vector of the estimated lower and upper bound of the distribution. Can also be a string giving the suffix of the name of such a function that starts with "bounder_". Useful values include:

  • "cdf": Use the CDF of the minimum and maximum order statistics of the sample to estimate the bounds. See bounder_cdf().

  • "cooke": Use the method from Cooke (1979); i.e. method 2.3 from Loh (1984). See bounder_cooke().

  • "range": Use the range of x (i.e. the min or max). See bounder_range().

adapt

<positive integer> (very experimental) The name and interpretation of this argument are subject to change without notice. If adapt > 1, uses an adaptive approach to calculate the density. First, uses the adaptive bandwidth algorithm of Abramson (1982) to determine local (pointwise) bandwidths, then groups these bandwidths into adapt groups, then calculates and sums the densities from each group. You can set this to a very large number (e.g. Inf) for a fully adaptive approach, but this will be very slow; typically something around 100 yields nearly identical results.

na.rm

<scalar logical> Should missing (NA) values in x be removed?

...

Additional arguments (ignored).

range_only

<scalar logical> If TRUE, the range of the output of this density estimator is computed and is returned in the $x element of the result, and c(NA, NA) is returned in $y. This gives a faster way to determine the range of the output than density_XXX(n = 2).

Value

An object of class "density", mimicking the output format of stats::density().

This allows existing methods for density objects, like print() and plot(), to work if desired. This output format (and in particular, the x and y components) is also the format expected by the density argument of the stat_slabinterval() and the smooth_ family of functions.

References

Cooke, P. (1979). Statistical inference for bounds of random variables. Biometrika 66(2), 367–374. doi:10.1093/biomet/66.2.367.

Loh, W. Y. (1984). Estimating an endpoint of a distribution with resampling methods. The Annals of Statistics 12(4), 1543–1550. doi:10.1214/aos/1176346811

See Also

Other density estimators: density_histogram(), density_unbounded()

Examples

library(distributional)
library(dplyr)
library(ggplot2)

# For compatibility with existing code, the return type of density_bounded()
# is the same as stats::density(), ...
set.seed(123)
x = rbeta(5000, 1, 3)
d = density_bounded(x)
d

# ... thus, while designed for use with the `density` argument of
# stat_slabinterval(), output from density_bounded() can also be used with
# base::plot():
plot(d)

# here we'll use the same data as above, but pick either density_bounded()
# or density_unbounded() (which is equivalent to stats::density()). Notice
# how the unbounded density (green) is biased near the boundary of the support,
# while the bounded density (orange) is not.
data.frame(x) %>%
  ggplot() +
  stat_slab(
    aes(xdist = dist), data = data.frame(dist = dist_beta(1, 3)),
    alpha = 0.25
  ) +
  stat_slab(aes(x), density = "bounded", fill = NA, color = "#d95f02", alpha = 0.5) +
  stat_slab(aes(x), density = "unbounded", fill = NA, color = "#1b9e77", alpha = 0.5) +
  scale_thickness_shared() +
  theme_ggdist()

# We can also supply arguments to the density estimators by using their
# full function names instead of the string suffix; e.g. we can supply
# the exact bounds of c(0,1) rather than using the bounds of the data.
data.frame(x) %>%
  ggplot() +
  stat_slab(
    aes(xdist = dist), data = data.frame(dist = dist_beta(1, 3)),
    alpha = 0.25
  ) +
  stat_slab(
    aes(x), fill = NA, color = "#d95f02", alpha = 0.5,
    density = density_bounded(bounds = c(0,1))
  ) +
  scale_thickness_shared() +
  theme_ggdist()

Histogram density estimator

Description

Histogram density estimator.

Supports automatic partial function application with waived arguments.

Usage

density_histogram(
  x,
  weights = NULL,
  breaks = "Scott",
  align = "none",
  outline_bars = FALSE,
  right_closed = TRUE,
  outermost_closed = TRUE,
  na.rm = FALSE,
  ...,
  range_only = FALSE
)

Arguments

x

<numeric> Sample to compute a density estimate for.

weights

<numeric | NULL> Optional weights to apply to x.

breaks

<numeric | function | string> Determines the breakpoints defining bins. Default "Scott". Similar to (but not exactly the same as) the breaks argument to graphics::hist(). One of:

  • A scalar (length-1) numeric giving the number of bins

  • A vector numeric giving the breakpoints between histogram bins

  • A function taking x and weights and returning either the number of bins or a vector of breakpoints

  • A string giving the suffix of a function that starts with "breaks_". ggdist provides weighted implementations of the "Sturges", "Scott", and "FD" break-finding algorithms from graphics::hist(), as well as breaks_fixed() for manually setting the bin width. See breaks.

For example, breaks = "Sturges" will use the breaks_Sturges() algorithm, breaks = 9 will create 9 bins, and breaks = breaks_fixed(width = 1) will set the bin width to 1.

align

<scalar numeric | function | string> Determines how to align the breakpoints defining bins. Default "none" (performs no alignment). One of:

  • A scalar (length-1) numeric giving an offset that is subtracted from the breaks. The offset must be between 0 and the bin width.

  • A function taking a sorted vector of breaks (bin edges) and returning an offset to subtract from the breaks.

  • A string giving the suffix of a function that starts with "align_" used to determine the alignment, such as align_none(), align_boundary(), or align_center().

For example, align = "none" will provide no alignment, align = align_center(at = 0) will center a bin on 0, and align = align_boundary(at = 0) will align a bin edge on 0.

outline_bars

<scalar logical> Should outlines in between the bars (i.e. density values of 0) be included?

right_closed

<scalar logical> Should the right edge of each bin be closed? For a bin with endpoints L and U:

  • if TRUE, use (L, U]: the interval containing all x such that L < x ≤ U.

  • if FALSE, use [L, U): the interval containing all x such that L ≤ x < U.

Equivalent to the right argument of hist() or the left.open argument of findInterval().

outermost_closed

<scalar logical> Should values on the edges of the outermost (first or last) bins always be included in those bins? If TRUE, the first edge (when right_closed = TRUE) or the last edge (when right_closed = FALSE) is treated as closed.

Equivalent to the include.lowest argument of hist() or the rightmost.closed argument of findInterval().
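Since both arguments are described in terms of findInterval(), the base R call itself shows how values that fall exactly on a bin edge are assigned. The snippet below is only an illustration with made-up breaks; it uses base::findInterval() directly and does not call density_histogram().

```r
breaks <- c(0, 1, 2, 3)
x <- c(0, 1, 2, 3)  # every value sits exactly on a bin edge

# Analogue of right_closed = TRUE, outermost_closed = TRUE:
# bins are [0, 1], (1, 2], (2, 3]
bins_right <- findInterval(x, breaks, left.open = TRUE, rightmost.closed = TRUE)
bins_right
# → 1 1 2 3

# Analogue of right_closed = FALSE, outermost_closed = TRUE:
# bins are [0, 1), [1, 2), [2, 3]
bins_left <- findInterval(x, breaks, left.open = FALSE, rightmost.closed = TRUE)
bins_left
# → 1 2 3 3

# Without closing the outermost edge, the first value falls outside all bins
# (index 0) when bins are right-closed.
bins_open <- findInterval(x, breaks, left.open = TRUE, rightmost.closed = FALSE)
bins_open
# → 0 1 2 3
```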

na.rm

<scalar logical> Should missing (NA) values in x be removed?

...

Additional arguments (ignored).

range_only

<scalar logical> If TRUE, the range of the output of this density estimator is computed and is returned in the $x element of the result, and c(NA, NA) is returned in $y. This gives a faster way to determine the range of the output than density_XXX(n = 2).

Value

An object of class "density", mimicking the output format of stats::density().

This allows existing methods for density objects, like print() and plot(), to work if desired. This output format (and in particular, the x and y components) is also the format expected by the density argument of the stat_slabinterval() and the smooth_ family of functions.

See Also

Other density estimators: density_bounded(), density_unbounded()

Examples

library(distributional)
library(dplyr)
library(ggplot2)

# For compatibility with existing code, the return type of density_histogram()
# is the same as stats::density(), ...
set.seed(123)
x = rbeta(5000, 1, 3)
d = density_histogram(x)
d

# ... thus, while designed for use with the `density` argument of
# stat_slabinterval(), output from density_histogram() can also be used with
# base::plot():
plot(d)

# here we'll use the same data as above with stat_slab():
data.frame(x) %>%
  ggplot() +
  stat_slab(
    aes(xdist = dist), data = data.frame(dist = dist_beta(1, 3)),
    alpha = 0.25
  ) +
  stat_slab(aes(x), density = "histogram", fill = NA, color = "#d95f02", alpha = 0.5) +
  scale_thickness_shared() +
  theme_ggdist()

Unbounded density estimator

Description

Unbounded density estimator using stats::density().

Supports automatic partial function application with waived arguments.

Usage

density_unbounded(
  x,
  weights = NULL,
  n = 501,
  bandwidth = "dpi",
  adjust = 1,
  kernel = "gaussian",
  trim = TRUE,
  adapt = 1,
  na.rm = FALSE,
  ...,
  range_only = FALSE
)

Arguments

x

<numeric> Sample to compute a density estimate for.

weights

<numeric | NULL> Optional weights to apply to x.

n

<scalar numeric> The number of grid points to evaluate the density estimator at.

bandwidth

<scalar numeric | function | string> Bandwidth of the density estimator. One of:

  • a numeric: the bandwidth, as the standard deviation of the kernel

  • a function: a function taking x (the sample) and returning the bandwidth

  • a string: the suffix of the name of a function starting with "bandwidth_" that will be used to determine the bandwidth. See bandwidth for a list.

adjust

<scalar numeric> Value to multiply the bandwidth of the density estimator by. Default 1.

kernel

<string> The smoothing kernel to be used. This must partially match one of "gaussian", "rectangular", "triangular", "epanechnikov", "biweight", "cosine", or "optcosine". See stats::density().

trim

<scalar logical> Should the density estimate be trimmed to the range of the data? Default TRUE.

adapt

<positive integer> (very experimental) The name and interpretation of this argument are subject to change without notice. If adapt > 1, uses an adaptive approach to calculate the density. First, uses the adaptive bandwidth algorithm of Abramson (1982) to determine local (pointwise) bandwidths, then groups these bandwidths into adapt groups, then calculates and sums the densities from each group. You can set this to a very large number (e.g. Inf) for a fully adaptive approach, but this will be very slow; typically something around 100 yields nearly identical results.

na.rm

<scalar logical> Should missing (NA) values in x be removed?

...

Additional arguments (ignored).

range_only

<scalar logical> If TRUE, the range of the output of this density estimator is computed and is returned in the $x element of the result, and c(NA, NA) is returned in $y. This gives a faster way to determine the range of the output than density_XXX(n = 2).

Value

An object of class "density", mimicking the output format of stats::density().

This allows existing methods for density objects, like print() and plot(), to work if desired. This output format (and in particular, the x and y components) is also the format expected by the density argument of the stat_slabinterval() and the smooth_ family of functions.

See Also

Other density estimators: density_bounded(), density_histogram()

Examples

library(distributional)
library(dplyr)
library(ggplot2)

# For compatibility with existing code, the return type of density_unbounded()
# is the same as stats::density(), ...
set.seed(123)
x = rbeta(5000, 1, 3)
d = density_unbounded(x)
d

# ... thus, while designed for use with the `density` argument of
# stat_slabinterval(), output from density_unbounded() can also be used with
# base::plot():
plot(d)

# here we'll use the same data as above, but pick either density_bounded()
# or density_unbounded() (which is equivalent to stats::density()). Notice
# how the unbounded density (green) is biased near the boundary of the support,
# while the bounded density (orange) is not.
data.frame(x) %>%
  ggplot() +
  stat_slab(
    aes(xdist = dist), data = data.frame(dist = dist_beta(1, 3)),
    alpha = 0.25
  ) +
  stat_slab(aes(x), density = "bounded", fill = NA, color = "#d95f02", alpha = 0.5) +
  stat_slab(aes(x), density = "unbounded", fill = NA, color = "#1b9e77", alpha = 0.5) +
  scale_thickness_shared() +
  theme_ggdist()

Dynamically select a good bin width for a dotplot

Description

Searches for a nice-looking bin width to use to draw a dotplot such that the height of the dotplot fits within a given space (maxheight).

Usage

find_dotplot_binwidth(
  x,
  maxheight,
  heightratio = 1,
  stackratio = 1,
  layout = c("bin", "weave", "hex", "swarm", "bar")
)

Arguments

x

<numeric> Data values.

maxheight

<scalar numeric> Maximum height of the dotplot.

heightratio

<scalar numeric> Ratio of bin width to dot height.

stackratio

<scalar numeric> Ratio of dot height to vertical distance between dot centers.

layout

<string> The layout method used for the dots. One of:

  • "bin" (default): places dots on the off-axis at the midpoint of their bins as in the classic Wilkinson dotplot. This maintains the alignment of rows and columns in the dotplot. This layout is slightly different from the classic Wilkinson algorithm in that: (1) it nudges bins slightly to avoid overlapping bins and (2) if the input data are symmetrical it will return a symmetrical layout.

  • "weave": uses the same basic binning approach of "bin", but places dots in the off-axis at their actual positions (unless overlaps = "nudge", in which case overlaps may be nudged out of the way). This maintains the alignment of rows but does not align dots within columns.

  • "hex": uses the same basic binning approach of "bin", but alternates placing dots + binwidth/4 or - binwidth/4 in the off-axis from the bin center. This allows hexagonal packing by setting a stackratio less than 1 (something like 0.9 tends to work).

  • "swarm": uses the "compactswarm" layout from beeswarm::beeswarm(). Does not maintain alignment of rows or columns, but can be more compact and neat looking, especially for sample data (as opposed to quantile dotplots of theoretical distributions, which may look better with "bin", "weave", or "hex").

  • "bar": for discrete distributions, lays out duplicate values in rectangular bars.

Details

This dynamic bin selection algorithm uses a binary search over the number of bins to find a bin width such that if the input data (x) is binned using a Wilkinson-style dotplot algorithm the height of the tallest bin will be less than maxheight.

This algorithm is used by geom_dotsinterval() (and its variants) to automatically select bin widths. Unless you are manually implementing your own dotplot grob or geom, you probably do not need to use this function directly.
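The search can be sketched in base R under simplifying assumptions. The sketch below uses plain equal-width bins instead of a Wilkinson-style layout, treats each dot as exactly as tall as the bin is wide (that is, heightratio = 1 and stackratio = 1), and assumes x has at least two distinct values. Both helper names are hypothetical and the code is not the package's implementation.

```r
# Hypothetical helper: height of the tallest stack when x is cut into `nbins`
# equal-width bins and every dot is as tall as the bin is wide.
stack_height_sketch <- function(x, nbins) {
  binwidth <- diff(range(x)) / nbins
  bins <- pmin(floor((x - min(x)) / binwidth) + 1, nbins)
  max(tabulate(bins, nbins)) * binwidth
}

# Hypothetical helper: binary search over the number of bins for a bin width
# whose tallest stack fits within `maxheight` (assumes maxheight > 0).
binwidth_sketch <- function(x, maxheight) {
  lo <- 1
  hi <- length(x)
  # widen the search range until its upper end is known to fit
  while (stack_height_sketch(x, hi) > maxheight) hi <- hi * 2
  # invariant: the layout with `hi` bins always fits within maxheight
  while (lo < hi) {
    mid <- (lo + hi) %/% 2
    if (stack_height_sketch(x, mid) <= maxheight) hi <- mid else lo <- mid + 1
  }
  diff(range(x)) / lo
}

x <- qnorm(ppoints(20))
bw <- binwidth_sketch(x, maxheight = 4)
nbins <- round(diff(range(x)) / bw)
stack_height_sketch(x, nbins) <= 4  # the selected layout fits
```

Fewer bins give larger dots, so the search looks for a small number of bins that still fits. Stack height is not strictly monotone in the number of bins, so a search like this returns a layout that fits, not necessarily the one with the largest possible dots.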

Value

A suitable bin width such that a dotplot created with this bin width and heightratio should have its tallest bin be less than or equal to maxheight.

See Also

bin_dots() for an algorithm that can bin dots using bin widths selected by this function; geom_dotsinterval() for geometries that use these algorithms to create dotplots.

Examples

library(dplyr)
library(ggplot2)

x = qnorm(ppoints(20))
binwidth = find_dotplot_binwidth(x, maxheight = 4, heightratio = 1)
binwidth

bin_df = bin_dots(x = x, y = 0, binwidth = binwidth, heightratio = 1)
bin_df

# we can manually plot the binning above, though this is only recommended
# if you are using find_dotplot_binwidth() and bin_dots() to build your own
# grob. For practical use it is much easier to use geom_dots(), which will
# automatically select good bin widths for you (and which uses
# find_dotplot_binwidth() and bin_dots() internally)
bin_df %>%
  ggplot(aes(x = x, y = y)) +
  geom_point(size = 4) +
  coord_fixed()

Blurry dot plot (geom)

Description

Variant of geom_dots() for creating blurry dotplots. Accepts an sd aesthetic that gives the standard deviation of the blur applied to the dots. Requires a graphics engine supporting radial gradients. Unlike geom_dots(), this geom only supports circular and square shapes.

Usage

geom_blur_dots(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  ...,
  blur = "gaussian",
  binwidth = NA,
  dotsize = 1.07,
  stackratio = 1,
  layout = "bin",
  overlaps = "nudge",
  smooth = "none",
  overflow = "warn",
  verbose = FALSE,
  orientation = NA,
  subguide = "slab",
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)

Arguments

mapping

Set of aesthetic mappings created by aes(). If specified and inherit.aes = TRUE (the default), it is combined with the default mapping at the top level of the plot. You must supply mapping if there is no plot mapping.

data

The data to be displayed in this layer. There are three options:

If NULL, the default, the data is inherited from the plot data as specified in the call to ggplot().

A data.frame, or other object, will override the plot data. All objects will be fortified to produce a data frame. See fortify() for which variables will be created.

A function will be called with a single argument, the plot data. The return value must be a data.frame, and will be used as the layer data. A function can be created from a formula (e.g. ~ head(.x, 10)).

stat

The statistical transformation to use on the data for this layer. When using a geom_*() function to construct a layer, the stat argument can be used to override the default coupling between geoms and stats. The stat argument accepts the following:

  • A Stat ggproto subclass, for example StatCount.

  • A string naming the stat. To give the stat as a string, strip the function name of the stat_ prefix. For example, to use stat_count(), give the stat as "count".

  • For more information and other ways to specify the stat, see the layer stat documentation.

position

<Position | string> Position adjustment, either as a string, or the result of a call to a position adjustment function. Setting this equal to "dodge" (position_dodge()) or "dodgejust" (position_dodgejust()) can be useful if you have overlapping geometries.

...

Other arguments passed to layer(). These are often aesthetics, used to set an aesthetic to a fixed value, like colour = "red" or linewidth = 3 (see Aesthetics, below). They may also be parameters to the paired geom/stat.

blur

<function | string> Blur function to apply to dots. One of:

  • A function that takes a numeric vector of distances from the dot center, the dot radius, and the standard deviation of the blur and returns a vector of opacities in [0, 1], such as blur_gaussian() or blur_interval().

  • A string indicating what blur function to use, as the suffix to a function name starting with blur_; e.g. "gaussian" (the default) applies blur_gaussian().
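To illustrate the function form, here is a minimal base R sketch of a blur function with the three inputs described above. The name blur_sketch() and the argument names x, r, and sd are assumptions made for illustration, and the formula (a dot of radius r whose edge is smeared by a normal distribution with standard deviation sd) is one plausible Gaussian blur, not necessarily what blur_gaussian() computes.

```r
# Hypothetical blur function: x is a vector of distances from the dot center,
# r the dot radius, and sd the standard deviation of the blur.
blur_sketch <- function(x, r, sd) {
  # probability that a point at distance x lies within the dot once the
  # dot's position is jittered by a Normal(0, sd) offset
  pnorm(x + r, mean = 0, sd = sd) - pnorm(x - r, mean = 0, sd = sd)
}

opacity <- blur_sketch(c(0, 1, 5), r = 1, sd = 0.5)
opacity
```

Opacity is close to 1 at the center, about one half at the edge of the dot (x = r), and falls to 0 far from the dot, so all values stay within [0, 1].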

binwidth

<numeric | unit> The bin width to use for laying out the dots. One of:

  • NA (the default): Dynamically select the bin width based on the size of the plot when drawn. This will pick a binwidth such that the tallest stack of dots is at most scale in height (ideally exactly scale in height, though this is not guaranteed).

  • A length-1 (scalar) numeric or unit object giving the exact bin width.

  • A length-2 (vector) numeric or unit object giving the minimum and maximum desired bin width. The bin width will be dynamically selected within these bounds.

If the value is numeric, it is assumed to be in units of data. The bin width (or its bounds) can also be specified using unit(), which may be useful if it is desired that the dots be a certain point size or a certain percentage of the width/height of the viewport. For example, unit(0.1, "npc") would make dots that are exactly 10% of the viewport size along whichever dimension the dotplot is drawn; unit(c(0, 0.1), "npc") would make dots that are at most 10% of the viewport size (while still ensuring the tallest stack is less than or equal to scale).

dotsize

<scalar numeric> The width of the dots relative to the binwidth. The default, 1.07, makes dots be just a bit wider than the bin width, which is a manually-tuned parameter that tends to work well with the default circular shape, preventing gaps between bins from appearing to be too large visually (as might arise from dots being precisely the binwidth). If it is desired to have dots be precisely the binwidth, set dotsize = 1.

stackratio

<scalar numeric> The distance between the center of the dots in the same stack relative to the dot height. The default, 1, makes dots in the same stack just touch each other.

layout

<string> The layout method used for the dots. One of:

  • "bin" (default): places dots on the off-axis at the midpoint of their bins as in the classic Wilkinson dotplot. This maintains the alignment of rows and columns in the dotplot. This layout is slightly different from the classic Wilkinson algorithm in that: (1) it nudges bins slightly to avoid overlapping bins and (2) if the input data are symmetrical it will return a symmetrical layout.

  • "weave": uses the same basic binning approach of "bin", but places dots in the off-axis at their actual positions (unless overlaps = "nudge", in which case overlaps may be nudged out of the way). This maintains the alignment of rows but does not align dots within columns.

  • "hex": uses the same basic binning approach of "bin", but alternates placing dots + binwidth/4 or - binwidth/4 in the off-axis from the bin center. This allows hexagonal packing by setting a stackratio less than 1 (something like 0.9 tends to work).

  • "swarm": uses the "compactswarm" layout from beeswarm::beeswarm(). Does not maintain alignment of rows or columns, but can be more compact and neat looking, especially for sample data (as opposed to quantile dotplots of theoretical distributions, which may look better with "bin", "weave", or "hex").

  • "bar": for discrete distributions, lays out duplicate values in rectangular bars.

overlaps

<string> How to handle overlapping dots or bins in the "bin", "weave", and "hex" layouts (dots never overlap in the "swarm" or "bar" layouts). For the purposes of this argument, dots are only considered to be overlapping if they would be overlapping when dotsize = 1 and stackratio = 1; i.e. if you set those arguments to other values, overlaps may still occur. One of:

  • "keep": leave overlapping dots as they are. Dots may overlap (usually only slightly) in the "bin", "weave", and "hex" layouts.

  • "nudge": nudge overlapping dots out of the way. Overlaps are avoided using a constrained optimization which minimizes the squared distance of dots to their desired positions, subject to the constraint that adjacent dots do not overlap.
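The optimization problem behind "nudge" can be illustrated in one dimension with base R. For sorted desired positions d and a minimum spacing w, subtracting (i - 1) * w from the i-th position turns the spacing constraint into a monotonicity constraint, which isotonic regression (stats::isoreg()) solves exactly in the least-squares sense. This is a swapped-in technique chosen so the sketch needs no extra packages, not a description of how ggdist solves the problem; the helper name nudge_sketch() is hypothetical.

```r
# Hypothetical helper: move positions as little as possible (in squared
# distance) so that neighbouring positions are at least `width` apart.
nudge_sketch <- function(desired, width) {
  ord <- order(desired)
  d <- desired[ord]
  offset <- (seq_along(d) - 1) * width
  # p[i + 1] - p[i] >= width  is equivalent to  (p - offset) being nondecreasing
  z <- isoreg(d - offset)$yf
  nudged <- z + offset
  nudged[order(ord)]  # return positions in the original input order
}

nudged <- nudge_sketch(c(0, 0.1, 0.2, 2), width = 0.5)
nudged
# → -0.4  0.1  0.6  2.0
```

The three crowded positions are spread out symmetrically around their original mean, while the last position already has enough room and is left where it was.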

smooth

<function | string> Smoother to apply to dot positions. One of:

  • A function that takes a numeric vector of dot positions and returns a smoothed version of that vector, such as smooth_bounded(), smooth_unbounded(), smooth_discrete(), or smooth_bar().

  • A string indicating what smoother to use, as the suffix to a function name starting with smooth_; e.g. "none" (the default) applies smooth_none(), which simply returns the given vector without applying smoothing.

Smoothing is most effective when the smoother is matched to the support of the distribution; e.g. using smooth_bounded(bounds = ...).

overflow

<string> How to handle overflow of dots beyond the extent of the geom when a minimum binwidth (or an exact binwidth) is supplied. One of:

  • "keep": Keep the overflow, drawing dots outside the geom bounds.

  • "warn": Keep the overflow, but produce a warning suggesting solutions, such as setting binwidth = NA or overflow = "compress".

  • "compress": Compress the layout. Reduces the binwidth to the size necessary to keep the dots within bounds, then adjusts stackratio and dotsize so that the apparent dot size is the user-specified minimum binwidth times the user-specified dotsize.

If you find the default layout has dots that are too small, and you are okay with dots overlapping, consider setting overflow = "compress" and supplying an exact or minimum dot size using binwidth.

verbose

<scalar logical> If TRUE, print out the bin width of the dotplot. Can be useful if you want to start from an automatically-selected bin width and then adjust it manually. Bin width is printed both as data units and as normalized parent coordinates or "npc"s (see unit()). Note that if you just want to scale the selected bin width to fit within a desired area, it is probably easier to use scale than to copy and scale binwidth manually, and if you just want to provide constraints on the bin width, you can pass a length-2 vector to binwidth.

orientation

<string> Whether this geom is drawn horizontally or vertically. One of:

  • NA (default): automatically detect the orientation based on how the aesthetics are assigned. Automatic detection works most of the time.

  • "horizontal" (or "y"): draw horizontally, using the y aesthetic to identify different groups. For each group, uses the x, xmin, xmax, and thickness aesthetics to draw points, intervals, and slabs.

  • "vertical" (or "x"): draw vertically, using the x aesthetic to identify different groups. For each group, uses the y, ymin, ymax, and thickness aesthetics to draw points, intervals, and slabs.

For compatibility with the base ggplot naming scheme for orientation, "x" can be used as an alias for "vertical" and "y" as an alias for "horizontal" (ggdist had an orientation parameter before base ggplot did, hence the discrepancy).

subguide

<function | string> Sub-guide used to annotate the thickness scale. One of:

  • A function that takes a scale argument giving a ggplot2::Scale object and an orientation argument giving the orientation of the geometry and then returns a grid::grob that will draw the axis annotation, such as subguide_axis() (to draw a traditional axis) or subguide_none() (to draw no annotation). See subguide_axis() for a list of possibilities and examples.

  • A string giving the name of such a function when prefixed with "subguide_"; e.g. "axis" or "none". The values "slab", "dots", and "spike" use the default subguide for their geom families (no subguide), which can be modified by setting subguide_slab, subguide_dots, or subguide_spike; see the documentation for those functions.

na.rm

<scalar logical> If FALSE, the default, missing values are removed with a warning. If TRUE, missing values are silently removed.

show.legend

logical. Should this layer be included in the legends? NA, the default, includes if any aesthetics are mapped. FALSE never includes, and TRUE always includes. It can also be a named logical vector to finely select the aesthetics to display.

inherit.aes

If FALSE, overrides the default aesthetics, rather than combining with them. This is most useful for helper functions that define both data and aesthetics and shouldn't inherit behaviour from the default plot specification, e.g. borders().

check.aes, check.param

If TRUE, the default, will check that supplied parameters and aesthetics are understood by the geom or stat. Use FALSE to suppress the checks.

Details

The dots family of stats and geoms are similar to ggplot2::geom_dotplot() but with a number of differences:

Stats and geoms in this family include:

stat_dots() and stat_dotsinterval(), when used with the quantiles argument, are particularly useful for constructing quantile dotplots, which can be an effective way to communicate uncertainty using a frequency framing that may be easier for laypeople to understand (Kay et al. 2016, Fernandes et al. 2018).

Value

A ggplot2::Geom representing a blurry dot geometry which can be added to a ggplot() object.

Aesthetics

The dots+interval stats and geoms have a wide variety of aesthetics that control the appearance of their three sub-geometries: the dots (aka the slab), the point, and the interval.

Positional aesthetics

Dots-specific (aka Slab-specific) aesthetics

Interval-specific aesthetics

Point-specific aesthetics

Color aesthetics

Line aesthetics

Slab-specific color and line override aesthetics

Interval-specific color and line override aesthetics

Point-specific color and line override aesthetics

Deprecated aesthetics

Other aesthetics (these work as in standard geoms)

See examples of some of these aesthetics in action in vignette("dotsinterval"). Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").

References

Kay, M., Kola, T., Hullman, J. R., & Munson, S. A. (2016). When (ish) is My Bus? User-centered Visualizations of Uncertainty in Everyday, Mobile Predictive Systems. Conference on Human Factors in Computing Systems - CHI '16, 5092–5103. doi:10.1145/2858036.2858558.

Fernandes, M., Walls, L., Munson, S., Hullman, J., & Kay, M. (2018). Uncertainty Displays Using Quantile Dotplots or CDFs Improve Transit Decision-Making. Conference on Human Factors in Computing Systems - CHI '18. doi:10.1145/3173574.3173718.

See Also

See geom_dotsinterval() for the geometry this shortcut is based on.

See vignette("dotsinterval") for a variety of examples of use.

Other dotsinterval geoms: geom_dots(), geom_dotsinterval(), geom_swarm(), geom_weave()

Examples

library(dplyr)
library(ggplot2)
library(ggdist)

theme_set(theme_ggdist())

set.seed(1234)
x = rnorm(1000)

# manually calculate quantiles and their MCSE
# this could also be done more succinctly with stat_mcse_dots()
p = ppoints(100)
df = data.frame(
  q = quantile(x, p),
  se = posterior::mcse_quantile(x, p)
)

# default blur, using the sd aesthetic
df %>%
  ggplot(aes(x = q, sd = se)) +
  geom_blur_dots()

df %>%
  ggplot(aes(x = q, sd = se)) +
  # or blur = blur_interval(.width = .95) to set the interval width
  geom_blur_dots(blur = "interval")

Dot plot (shortcut geom)

Description

Shortcut version of geom_dotsinterval() for creating dot plots. Geoms based on geom_dotsinterval() create dotplots that automatically ensure the plot fits within the available space.

Roughly equivalent to:

geom_dotsinterval(
  show_point = FALSE,
  show_interval = FALSE
)

Usage

geom_dots(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  ...,
  binwidth = NA,
  dotsize = 1.07,
  stackratio = 1,
  layout = "bin",
  overlaps = "nudge",
  smooth = "none",
  overflow = "warn",
  verbose = FALSE,
  orientation = NA,
  subguide = "slab",
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)

Arguments

mapping

Set of aesthetic mappings created by aes(). If specified and inherit.aes = TRUE (the default), it is combined with the default mapping at the top level of the plot. You must supply mapping if there is no plot mapping.

data

The data to be displayed in this layer. There are three options:

If NULL, the default, the data is inherited from the plot data as specified in the call to ggplot().

A data.frame, or other object, will override the plot data. All objects will be fortified to produce a data frame. See fortify() for which variables will be created.

A function will be called with a single argument, the plot data. The return value must be a data.frame, and will be used as the layer data. A function can be created from a formula (e.g. ~ head(.x, 10)).

stat

The statistical transformation to use on the data for this layer. When using a geom_*() function to construct a layer, the stat argument can be used to override the default coupling between geoms and stats. The stat argument accepts the following:

  • A Stat ggproto subclass, for example StatCount.

  • A string naming the stat. To give the stat as a string, strip the function name of the stat_ prefix. For example, to use stat_count(), give the stat as "count".

  • For more information and other ways to specify the stat, see the layer stat documentation.

position

<Position | string> Position adjustment, either as a string, or the result of a call to a position adjustment function. Setting this equal to "dodge" (position_dodge()) or "dodgejust" (position_dodgejust()) can be useful if you have overlapping geometries.

...

Other arguments passed to layer(). These are often aesthetics, used to set an aesthetic to a fixed value, like colour = "red" or linewidth = 3 (see Aesthetics, below). They may also be parameters to the paired geom/stat.

binwidth

<numeric | unit> The bin width to use for laying out the dots. One of:

  • NA (the default): Dynamically select the bin width based on the size of the plot when drawn. This will pick a binwidth such that the tallest stack of dots is at most scale in height (ideally exactly scale in height, though this is not guaranteed).

  • A length-1 (scalar) numeric or unit object giving the exact bin width.

  • A length-2 (vector) numeric or unit object giving the minimum and maximum desired bin width. The bin width will be dynamically selected within these bounds.

If the value is numeric, it is assumed to be in units of data. The bin width (or its bounds) can also be specified using unit(), which may be useful if it is desired that the dots be a certain point size or a certain percentage of the width/height of the viewport. For example, unit(0.1, "npc") would make dots that are exactly 10% of the viewport size along whichever dimension the dotplot is drawn; unit(c(0, 0.1), "npc") would make dots that are at most 10% of the viewport size (while still ensuring the tallest stack is less than or equal to scale).
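As a rough sketch of the forms binwidth can take, the following builds three plots from the same data. The data frame df, its value column, and the specific widths are made up for illustration.

```r
library(ggplot2)
library(ggdist)
library(grid)  # for unit()

set.seed(1234)
df <- data.frame(value = rnorm(100))  # hypothetical sample data

# default: bin width is chosen when the plot is drawn
p_auto <- ggplot(df, aes(x = value)) +
  geom_dots(binwidth = NA)

# exact bin width, in data units
p_exact <- ggplot(df, aes(x = value)) +
  geom_dots(binwidth = 0.2)

# bin width chosen automatically, but at most 5% of the viewport
p_capped <- ggplot(df, aes(x = value)) +
  geom_dots(binwidth = unit(c(0, 0.05), "npc"))
```

Printing any of these objects draws the plot; the bin width is only resolved at draw time, so resizing the graphics device can change the result for p_auto and p_capped.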

dotsize

<scalar numeric> The width of the dots relative to the binwidth. The default, 1.07, makes dots just a bit wider than the bin width; this is a manually-tuned value that tends to work well with the default circular shape, preventing gaps between bins from appearing to be too large visually (as might arise from dots being precisely the binwidth). If it is desired to have dots be precisely the binwidth, set dotsize = 1.

stackratio

<scalar numeric> The distance between the centers of the dots in the same stack relative to the dot height. The default, 1, makes dots in the same stack just touch each other.

layout

<string> The layout method used for the dots. One of:

  • "bin" (default): places dots on the off-axis at the midpoint of their bins as in the classic Wilkinson dotplot. This maintains the alignment of rows and columns in the dotplot. This layout is slightly different from the classic Wilkinson algorithm in that: (1) it nudges bins slightly to avoid overlapping bins and (2) if the input data are symmetrical it will return a symmetrical layout.

  • "weave": uses the same basic binning approach of "bin", but places dots in the off-axis at their actual positions (unless overlaps = "nudge", in which case overlaps may be nudged out of the way). This maintains the alignment of rows but does not align dots within columns.

  • "hex": uses the same basic binning approach of "bin", but alternates placing dots + binwidth/4 or - binwidth/4 in the off-axis from the bin center. This allows hexagonal packing by setting a stackratio less than 1 (something like 0.9 tends to work).

  • "swarm": uses the "compactswarm" layout from beeswarm::beeswarm(). Does not maintain alignment of rows or columns, but can be more compact and neat looking, especially for sample data (as opposed to quantile dotplots of theoretical distributions, which may look better with "bin", "weave", or "hex").

  • "bar": for discrete distributions, lays out duplicate values in rectangular bars.
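To compare layouts side by side, one option is to build the same plot once per layout. This is only a sketch: df and value are hypothetical, and "swarm" is left out because it needs the suggested beeswarm package at draw time.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
df <- data.frame(value = rnorm(150))  # hypothetical sample data

layouts <- c("bin", "weave", "hex")

# one plot per layout; "hex" uses a stackratio below 1 so rows can interlock
plots <- lapply(layouts, function(layout) {
  ggplot(df, aes(x = value)) +
    geom_dots(
      layout = layout,
      stackratio = if (layout == "hex") 0.9 else 1
    ) +
    ggtitle(layout)
})
names(plots) <- layouts
```

Each element of plots can then be printed on its own, e.g. plots[["hex"]], or combined with a plot-composition package.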

overlaps

<string> How to handle overlapping dots or bins in the "bin", "weave", and "hex" layouts (dots never overlap in the "swarm" or "bar" layouts). For the purposes of this argument, dots are only considered to be overlapping if they would be overlapping when dotsize = 1 and stackratio = 1; i.e. if you set those arguments to other values, overlaps may still occur. One of:

  • "keep": leave overlapping dots as they are. Dots may overlap (usually only slightly) in the "bin", "weave", and "hex" layouts.

  • "nudge": nudge overlapping dots out of the way. Overlaps are avoided using a constrained optimization which minimizes the squared distance of dots to their desired positions, subject to the constraint that adjacent dots do not overlap.

smooth

<function | string> Smoother to apply to dot positions. One of:

  • A function that takes a numeric vector of dot positions and returns a smoothed version of that vector, such as smooth_bounded(), smooth_unbounded(), smooth_discrete(), or smooth_bar().

  • A string indicating what smoother to use, as the suffix to a function name starting with smooth_; e.g. "none" (the default) applies smooth_none(), which simply returns the given vector without applying smoothing.

Smoothing is most effective when the smoother is matched to the support of the distribution; e.g. using smooth_bounded(bounds = ...).
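The two ways of specifying a smoother might look like the following sketch. The Beta-distributed sample and the bounds c(0, 1) are assumptions chosen so that the support of the data is known.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# hypothetical sample whose support is the interval [0, 1]
df <- data.frame(value = rbeta(200, 2, 5))

# string form: "none" resolves to smooth_none()
p_none <- ggplot(df, aes(x = value)) +
  geom_dots(smooth = "none")

# function form: a bounded smoother matched to the support of the data
p_bounded <- ggplot(df, aes(x = value)) +
  geom_dots(smooth = smooth_bounded(bounds = c(0, 1)))

# a smoother can also be applied directly to a numeric vector
smoothed <- smooth_bounded(df$value, bounds = c(0, 1))
unsmoothed <- smooth_none(df$value)
```

Calling the smoother directly, as in the last two lines, is a convenient way to inspect what it does to the dot positions before using it in a plot.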

overflow

<string> How to handle overflow of dots beyond the extent of the geom when a minimum binwidth (or an exact binwidth) is supplied. One of:

  • "keep": Keep the overflow, drawing dots outside the geom bounds.

  • "warn": Keep the overflow, but produce a warning suggesting solutions, such as setting binwidth = NA or overflow = "compress".

  • "compress": Compress the layout. Reduces the binwidth to the size necessary to keep the dots within bounds, then adjusts stackratio and dotsize so that the apparent dot size is the user-specified minimum binwidth times the user-specified dotsize.

If you find the default layout has dots that are too small, and you are okay with dots overlapping, consider setting overflow = "compress" and supplying an exact or minimum dot size using binwidth.
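A minimal sketch of that suggestion, assuming a sample large enough that dots of the requested size would not fit (the data and the 1.5 mm minimum are illustrative choices):

```r
library(ggplot2)
library(ggdist)
library(grid)  # for unit()

set.seed(1234)
# hypothetical large sample; automatically sized dots would be very small
df <- data.frame(value = rnorm(2000))

# dots are drawn at least 1.5 mm wide; if the stacks would then extend
# past the geom bounds, the layout is compressed (dots may overlap)
p_compress <- ggplot(df, aes(x = value)) +
  geom_dots(
    binwidth = unit(c(1.5, Inf), "mm"),
    overflow = "compress"
  )
```

Because compression lets dots overlap, a semi-transparent fill or an outline colour can help keep individual dots distinguishable.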

verbose

<scalar logical> If TRUE, print out the bin width of the dotplot. Can be useful if you want to start from an automatically-selected bin width and then adjust it manually. Bin width is printed both as data units and as normalized parent coordinates or "npc"s (see unit()). Note that if you just want to scale the selected bin width to fit within a desired area, it is probably easier to use scale than to copy and scale binwidth manually, and if you just want to provide constraints on the bin width, you can pass a length-2 vector to binwidth.

orientation

<string> Whether this geom is drawn horizontally or vertically. One of:

  • NA (default): automatically detect the orientation based on how the aesthetics are assigned. Automatic detection works most of the time.

  • "horizontal" (or "y"): draw horizontally, using the y aesthetic to identify different groups. For each group, uses the x, xmin, xmax, and thickness aesthetics to draw points, intervals, and slabs.

  • "vertical" (or "x"): draw vertically, using the x aesthetic to identify different groups. For each group, uses the y, ymin, ymax, and thickness aesthetics to draw points, intervals, and slabs.

For compatibility with the base ggplot naming scheme for orientation, "x" can be used as an alias for "vertical" and "y" as an alias for "horizontal" (ggdist had an orientation parameter before base ggplot did, hence the discrepancy).

subguide

<function | string> Sub-guide used to annotate the thickness scale. One of:

  • A function that takes a scale argument giving a ggplot2::Scale object and an orientation argument giving the orientation of the geometry and then returns a grid::grob that will draw the axis annotation, such as subguide_axis() (to draw a traditional axis) or subguide_none() (to draw no annotation). See subguide_axis() for a list of possibilities and examples.

  • A string giving the name of such a function when prefixed with "subguide_"; e.g. "axis" or "none". The values "slab", "dots", and "spike" use the default subguide for their geom families (no subguide), which can be modified by setting subguide_slab, subguide_dots, or subguide_spike; see the documentation for those functions.

na.rm

<scalar logical> If FALSE, the default, missing values are removed with a warning. If TRUE, missing values are silently removed.

show.legend

logical. Should this layer be included in the legends? NA, the default, includes if any aesthetics are mapped. FALSE never includes, and TRUE always includes. It can also be a named logical vector to finely select the aesthetics to display.

inherit.aes

If FALSE, overrides the default aesthetics, rather than combining with them. This is most useful for helper functions that define both data and aesthetics and shouldn't inherit behaviour from the default plot specification, e.g. borders().

check.aes, check.param

If TRUE, the default, will check that supplied parameters and aesthetics are understood by the geom or stat. Use FALSE to suppress the checks.

Details

The dots family of stats and geoms are similar to ggplot2::geom_dotplot() but with a number of differences:

Stats and geoms in this family include:

stat_dots() and stat_dotsinterval(), when used with the quantiles argument, are particularly useful for constructing quantile dotplots, which can be an effective way to communicate uncertainty using a frequency framing that may be easier for laypeople to understand (Kay et al. 2016, Fernandes et al. 2018).
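A quantile dotplot along these lines could be sketched as follows. The sample x is simulated for illustration, and the manual version is only an approximation of the idea; the stat may compute its quantiles slightly differently.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
x <- rnorm(5000)  # hypothetical sample, e.g. draws from a posterior

# let the stat summarize the sample as 100 quantiles
p_stat <- ggplot(data.frame(x = x), aes(x = x)) +
  stat_dots(quantiles = 100)

# roughly the same idea by hand: compute 100 quantiles, then use geom_dots()
qdf <- data.frame(q = unname(quantile(x, ppoints(100))))
p_manual <- ggplot(qdf, aes(x = q)) +
  geom_dots()
```

With 100 dots, each dot stands for a 1-in-100 chance, which is the frequency framing described above.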

Value

A ggplot2::Geom representing a dot geometry which can be added to a ggplot() object.

Aesthetics

The dots+interval stats and geoms have a wide variety of aesthetics that control the appearance of their three sub-geometries: the dots (aka the slab), the point, and the interval.

Positional aesthetics

Dots-specific (aka Slab-specific) aesthetics

Interval-specific aesthetics

Point-specific aesthetics

Color aesthetics

Line aesthetics

Slab-specific color and line override aesthetics

Interval-specific color and line override aesthetics

Point-specific color and line override aesthetics

Deprecated aesthetics

Other aesthetics (these work as in standard geoms)

See examples of some of these aesthetics in action in vignette("dotsinterval"). Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").

References

Kay, M., Kola, T., Hullman, J. R., & Munson, S. A. (2016). When (ish) is My Bus? User-centered Visualizations of Uncertainty in Everyday, Mobile Predictive Systems. Conference on Human Factors in Computing Systems - CHI '16, 5092–5103. doi:10.1145/2858036.2858558.

Fernandes, M., Walls, L., Munson, S., Hullman, J., & Kay, M. (2018). Uncertainty Displays Using Quantile Dotplots or CDFs Improve Transit Decision-Making. Conference on Human Factors in Computing Systems - CHI '18. doi:10.1145/3173574.3173718.

See Also

See stat_dots() for the stat version, intended for use on sample data or analytical distributions.

See geom_dotsinterval() for the geometry this shortcut is based on.

See vignette("dotsinterval") for a variety of examples of use.

Other dotsinterval geoms: geom_blur_dots(), geom_dotsinterval(), geom_swarm(), geom_weave()

Examples

library(dplyr)
library(ggplot2)
library(ggdist)

theme_set(theme_ggdist())

set.seed(12345)
df = tibble(
  g = rep(c("a", "b"), 200),
  value = rnorm(400, c(0, 3), c(0.75, 1))
)

# orientation is detected automatically based on
# which axis is discrete

# horizontal: groups on the y axis
df %>%
  ggplot(aes(x = value, y = g)) +
  geom_dots()

# vertical: groups on the x axis
df %>%
  ggplot(aes(y = value, x = g)) +
  geom_dots()

Automatic dotplot + point + interval meta-geom

Description

This meta-geom supports drawing combinations of dotplots, points, and intervals. Geoms and stats based on geom_dotsinterval() create dotplots that automatically determine a bin width that ensures the plot fits within the available space. They also ensure dots do not overlap, and allow the generation of quantile dotplots using the quantiles argument to stat_dotsinterval()/stat_dots(). Generally follows the naming scheme and arguments of the geom_slabinterval() and stat_slabinterval() family of geoms and stats.

Usage

geom_dotsinterval(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  ...,
  binwidth = NA,
  dotsize = 1.07,
  stackratio = 1,
  layout = "bin",
  overlaps = "nudge",
  smooth = "none",
  overflow = "warn",
  verbose = FALSE,
  orientation = NA,
  interval_size_domain = c(1, 6),
  interval_size_range = c(0.6, 1.4),
  fatten_point = 1.8,
  arrow = NULL,
  show_slab = TRUE,
  show_point = TRUE,
  show_interval = TRUE,
  subguide = "slab",
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)

Arguments

mapping

Set of aesthetic mappings created by aes(). If specified and inherit.aes = TRUE (the default), it is combined with the default mapping at the top level of the plot. You must supply mapping if there is no plot mapping.

data

The data to be displayed in this layer. There are three options:

If NULL, the default, the data is inherited from the plot data as specified in the call to ggplot().

A data.frame, or other object, will override the plot data. All objects will be fortified to produce a data frame. See fortify() for which variables will be created.

A function will be called with a single argument, the plot data. The return value must be a data.frame, and will be used as the layer data. A function can be created from a formula (e.g. ~ head(.x, 10)).

stat

The statistical transformation to use on the data for this layer. When using a geom_*() function to construct a layer, the stat argument can be used to override the default coupling between geoms and stats. The stat argument accepts the following:

  • A Stat ggproto subclass, for example StatCount.

  • A string naming the stat. To give the stat as a string, strip the function name of the stat_ prefix. For example, to use stat_count(), give the stat as "count".

  • For more information and other ways to specify the stat, see the layer stat documentation.

position

<Position | string> Position adjustment, either as a string, or the result of a call to a position adjustment function. Setting this equal to "dodge" (position_dodge()) or "dodgejust" (position_dodgejust()) can be useful if you have overlapping geometries.

...

Other arguments passed to layer(). These are often aesthetics, used to set an aesthetic to a fixed value, like colour = "red" or linewidth = 3 (see Aesthetics, below). They may also be parameters to the paired geom/stat.

binwidth

<numeric | unit> The bin width to use for laying out the dots. One of:

  • NA (the default): Dynamically select the bin width based on the size of the plot when drawn. This will pick a binwidth such that the tallest stack of dots is at most scale in height (ideally exactly scale in height, though this is not guaranteed).

  • A length-1 (scalar) numeric or unit object giving the exact bin width.

  • A length-2 (vector) numeric or unit object giving the minimum and maximum desired bin width. The bin width will be dynamically selected within these bounds.

If the value is numeric, it is assumed to be in units of data. The bin width (or its bounds) can also be specified using unit(), which may be useful if it is desired that the dots be a certain point size or a certain percentage of the width/height of the viewport. For example, unit(0.1, "npc") would make dots that are exactly 10% of the viewport size along whichever dimension the dotplot is drawn; unit(c(0, 0.1), "npc") would make dots that are at most 10% of the viewport size (while still ensuring the tallest stack is less than or equal to scale).

dotsize

<scalar numeric> The width of the dots relative to the binwidth. The default, 1.07, makes dots just a bit wider than the bin width; this is a manually-tuned value that tends to work well with the default circular shape, preventing gaps between bins from appearing to be too large visually (as might arise from dots being precisely the binwidth). If it is desired to have dots be precisely the binwidth, set dotsize = 1.

stackratio

<scalar numeric> The distance between the centers of the dots in the same stack relative to the dot height. The default, 1, makes dots in the same stack just touch each other.

layout

<string> The layout method used for the dots. One of:

  • "bin" (default): places dots on the off-axis at the midpoint of their bins as in the classic Wilkinson dotplot. This maintains the alignment of rows and columns in the dotplot. This layout is slightly different from the classic Wilkinson algorithm in that: (1) it nudges bins slightly to avoid overlapping bins and (2) if the input data are symmetrical it will return a symmetrical layout.

  • "weave": uses the same basic binning approach of "bin", but places dots in the off-axis at their actual positions (unless overlaps = "nudge", in which case overlaps may be nudged out of the way). This maintains the alignment of rows but does not align dots within columns.

  • "hex": uses the same basic binning approach of "bin", but alternates placing dots + binwidth/4 or - binwidth/4 in the off-axis from the bin center. This allows hexagonal packing by setting a stackratio less than 1 (something like 0.9 tends to work).

  • "swarm": uses the "compactswarm" layout from beeswarm::beeswarm(). Does not maintain alignment of rows or columns, but can be more compact and neat looking, especially for sample data (as opposed to quantile dotplots of theoretical distributions, which may look better with "bin", "weave", or "hex").

  • "bar": for discrete distributions, lays out duplicate values in rectangular bars.

overlaps

<string> How to handle overlapping dots or bins in the "bin", "weave", and "hex" layouts (dots never overlap in the "swarm" or "bar" layouts). For the purposes of this argument, dots are only considered to be overlapping if they would be overlapping when dotsize = 1 and stackratio = 1; i.e. if you set those arguments to other values, overlaps may still occur. One of:

  • "keep": leave overlapping dots as they are. Dots may overlap (usually only slightly) in the "bin", "weave", and "hex" layouts.

  • "nudge": nudge overlapping dots out of the way. Overlaps are avoided using a constrained optimization which minimizes the squared distance of dots to their desired positions, subject to the constraint that adjacent dots do not overlap.
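The kind of constrained optimization described for "nudge" can be illustrated with a small quadratic program. This is a standalone sketch using quadprog::solve.QP(), not the package's actual implementation; the positions in desired and the dot width are made-up values.

```r
library(quadprog)

# hypothetical desired dot positions (sorted) and dot width, in data units
desired <- c(0.0, 0.3, 0.5, 2.0)
width <- 1
n <- length(desired)

# constraint matrix: column j encodes x[j + 1] - x[j] >= width
Amat <- matrix(0, nrow = n, ncol = n - 1)
for (j in seq_len(n - 1)) {
  Amat[j, j] <- -1
  Amat[j + 1, j] <- 1
}

# solve.QP() minimizes 1/2 x'Dx - d'x; with D = I and d = desired this is
# equivalent to minimizing the squared distance sum((x - desired)^2)
fit <- solve.QP(
  Dmat = diag(n),
  dvec = desired,
  Amat = Amat,
  bvec = rep(width, n - 1)
)
nudged <- fit$solution
```

In nudged, every pair of adjacent positions is at least width apart, while each position stays as close as the constraints allow to its entry in desired.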

smooth

<function | string> Smoother to apply to dot positions. One of:

  • A function that takes a numeric vector of dot positions and returns a smoothed version of that vector, such as smooth_bounded(), smooth_unbounded(), smooth_discrete(), or smooth_bar().

  • A string indicating what smoother to use, as the suffix to a function name starting with smooth_; e.g. "none" (the default) applies smooth_none(), which simply returns the given vector without applying smoothing.

Smoothing is most effective when the smoother is matched to the support of the distribution; e.g. using smooth_bounded(bounds = ...).

overflow

<string> How to handle overflow of dots beyond the extent of the geom when a minimum binwidth (or an exact binwidth) is supplied. One of:

  • "keep": Keep the overflow, drawing dots outside the geom bounds.

  • "warn": Keep the overflow, but produce a warning suggesting solutions, such as setting binwidth = NA or overflow = "compress".

  • "compress": Compress the layout. Reduces the binwidth to the size necessary to keep the dots within bounds, then adjusts stackratio and dotsize so that the apparent dot size is the user-specified minimum binwidth times the user-specified dotsize.

If you find the default layout has dots that are too small, and you are okay with dots overlapping, consider setting overflow = "compress" and supplying an exact or minimum dot size using binwidth.

verbose

<scalar logical> If TRUE, print out the bin width of the dotplot. Can be useful if you want to start from an automatically-selected bin width and then adjust it manually. Bin width is printed both as data units and as normalized parent coordinates or "npc"s (see unit()). Note that if you just want to scale the selected bin width to fit within a desired area, it is probably easier to use scale than to copy and scale binwidth manually, and if you just want to provide constraints on the bin width, you can pass a length-2 vector to binwidth.

orientation

<string> Whether this geom is drawn horizontally or vertically. One of:

  • NA (default): automatically detect the orientation based on how the aesthetics are assigned. Automatic detection works most of the time.

  • "horizontal" (or "y"): draw horizontally, using the y aesthetic to identify different groups. For each group, uses the x, xmin, xmax, and thickness aesthetics to draw points, intervals, and slabs.

  • "vertical" (or "x"): draw vertically, using the x aesthetic to identify different groups. For each group, uses the y, ymin, ymax, and thickness aesthetics to draw points, intervals, and slabs.

For compatibility with the base ggplot naming scheme for orientation, "x" can be used as an alias for "vertical" and "y" as an alias for "horizontal" (ggdist had an orientation parameter before base ggplot did, hence the discrepancy).

interval_size_domain

<length-2 numeric> Minimum and maximum of the values of the size and linewidth aesthetics that will be translated into actual sizes for intervals drawn according to interval_size_range (see the documentation for that argument).

interval_size_range

<length-2 numeric> This geom scales the raw size aesthetic values when drawing interval and point sizes, as they tend to be too thick when using the default settings of scale_size_continuous(), which give sizes with a range of c(1, 6). The interval_size_domain value indicates the input domain of raw size values (typically this should be equal to the value of the range argument of the scale_size_continuous() function), and interval_size_range indicates the desired output range of the size values (the min and max of the actual sizes used to draw intervals). Most of the time it is not recommended to change the value of this argument, as it may result in strange scaling of legends; this argument is a holdover from earlier versions that did not have size aesthetics targeting the point and interval separately. If you want to adjust the size of the interval or points separately, you can also use the linewidth or point_size aesthetics; see sub-geometry-scales.

fatten_point

<scalar numeric> A multiplicative factor used to adjust the size of the point relative to the size of the thickest interval line. If you wish to specify point sizes directly, you can also use the point_size aesthetic and scale_point_size_continuous() or scale_point_size_discrete(); sizes specified with that aesthetic will not be adjusted using fatten_point.

arrow

<arrow | NULL> Type of arrow heads to use on the interval, or NULL for no arrows.

show_slab

<scalar logical> Should the slab portion of the geom be drawn?

show_point

<scalar logical> Should the point portion of the geom be drawn?

show_interval

<scalar logical> Should the interval portion of the geom be drawn?

subguide

<function | string> Sub-guide used to annotate the thickness scale. One of:

  • A function that takes a scale argument giving a ggplot2::Scale object and an orientation argument giving the orientation of the geometry and then returns a grid::grob that will draw the axis annotation, such as subguide_axis() (to draw a traditional axis) or subguide_none() (to draw no annotation). See subguide_axis() for a list of possibilities and examples.

  • A string giving the name of such a function when prefixed with "subguide_"; e.g. "axis" or "none". The values "slab", "dots", and "spike" use the default subguide for their geom families (no subguide), which can be modified by setting subguide_slab, subguide_dots, or subguide_spike; see the documentation for those functions.

na.rm

<scalar logical> If FALSE, the default, missing values are removed with a warning. If TRUE, missing values are silently removed.

show.legend

logical. Should this layer be included in the legends? NA, the default, includes if any aesthetics are mapped. FALSE never includes, and TRUE always includes. It can also be a named logical vector to finely select the aesthetics to display.

inherit.aes

If FALSE, overrides the default aesthetics, rather than combining with them. This is most useful for helper functions that define both data and aesthetics and shouldn't inherit behaviour from the default plot specification, e.g. borders().

check.aes, check.param

If TRUE, the default, will check that supplied parameters and aesthetics are understood by the geom or stat. Use FALSE to suppress the checks.

Details

The dots family of stats and geoms are similar to ggplot2::geom_dotplot() but with a number of differences:

Stats and geoms in this family include:

stat_dots() and stat_dotsinterval(), when used with the quantiles argument, are particularly useful for constructing quantile dotplots, which can be an effective way to communicate uncertainty using a frequency framing that may be easier for laypeople to understand (Kay et al. 2016, Fernandes et al. 2018).

To visualize sample data, such as a data distribution, samples from a bootstrap distribution, or a Bayesian posterior, you can supply samples to the x or y aesthetic.

To visualize analytical distributions, you can use the xdist or ydist aesthetic. For historical reasons, you can also use dist to specify the distribution, though this is not recommended as it does not work as well with orientation detection. These aesthetics can be used as follows:
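As one illustration (a sketch rather than the package's own example), the xdist aesthetic can be given distribution objects from the distributional package, which ggdist imports. The data frame dists and its columns g, mu, and sigma are made up for this example.

```r
library(ggplot2)
library(ggdist)
library(distributional)

# hypothetical data: one analytical distribution per group
dists = data.frame(
  g = c("a", "b"),
  mu = c(0, 3),
  sigma = c(0.75, 1)
)

# dist_normal() is vectorized, so each row gets its own distribution;
# quantiles = 50 draws each distribution as a 50-dot quantile dotplot
p = ggplot(dists, aes(y = g, xdist = dist_normal(mu, sigma))) +
  stat_dotsinterval(quantiles = 50)
```

Printing p draws the plot. Swapping xdist for ydist (and y for x) gives the vertical version.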

Value

A ggplot2::Geom or ggplot2::Stat representing a dotplot or combined dotplot+interval geometry which can be added to a ggplot() object.

Aesthetics

The dots+interval stats and geoms have a wide variety of aesthetics that control the appearance of their three sub-geometries: the dots (aka the slab), the point, and the interval.

Positional aesthetics

Dots-specific (aka Slab-specific) aesthetics

Interval-specific aesthetics

Point-specific aesthetics

Color aesthetics

Line aesthetics

Slab-specific color and line override aesthetics

Interval-specific color and line override aesthetics

Point-specific color and line override aesthetics

Deprecated aesthetics

Other aesthetics (these work as in standard geoms)

See examples of some of these aesthetics in action in vignette("dotsinterval"). Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").

Author(s)

Matthew Kay

References

Kay, M., Kola, T., Hullman, J. R., & Munson, S. A. (2016). When (ish) is My Bus? User-centered Visualizations of Uncertainty in Everyday, Mobile Predictive Systems. Conference on Human Factors in Computing Systems - CHI '16, 5092–5103. doi:10.1145/2858036.2858558.

Fernandes, M., Walls, L., Munson, S., Hullman, J., & Kay, M. (2018). Uncertainty Displays Using Quantile Dotplots or CDFs Improve Transit Decision-Making. Conference on Human Factors in Computing Systems - CHI '18. doi:10.1145/3173574.3173718.

See Also

See the stat_slabinterval() family for other stats built on top of geom_slabinterval(). See vignette("dotsinterval") for a variety of examples of use.

Other dotsinterval geoms: geom_blur_dots(), geom_dots(), geom_swarm(), geom_weave()

Examples

library(dplyr)
library(ggplot2)
library(ggdist)

theme_set(theme_ggdist())

set.seed(12345)
df = tibble(
  g = rep(c("a", "b"), 200),
  value = rnorm(400, c(0, 3), c(0.75, 1))
)

# orientation is detected automatically based on
# which axis is discrete
df %>%
  ggplot(aes(x = value, y = g)) +
  geom_dotsinterval()

df %>%
  ggplot(aes(y = value, x = g)) +
  geom_dotsinterval()

# stat_dots can summarize quantiles, creating quantile dotplots
data(RankCorr_u_tau, package = "ggdist")

RankCorr_u_tau %>%
  ggplot(aes(x = u_tau, y = factor(i))) +
  stat_dots(quantiles = 100)

# color and fill aesthetics can be mapped within the geom
# dotsinterval adds an interval
RankCorr_u_tau %>%
  ggplot(aes(x = u_tau, y = factor(i), fill = after_stat(x > 6))) +
  stat_dotsinterval(quantiles = 100)

Multiple-interval plot (shortcut geom)

Description

Shortcut version of geom_slabinterval() for creating multiple-interval plots.

Roughly equivalent to:

geom_slabinterval(
  aes(
    datatype = "interval",
    side = "both"
  ),
  interval_size_range = c(1, 6),
  show_slab = FALSE,
  show_point = FALSE
)

Usage

geom_interval(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  ...,
  orientation = NA,
  interval_size_range = c(1, 6),
  interval_size_domain = c(1, 6),
  arrow = NULL,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)

Arguments

mapping

Set of aesthetic mappings created by aes(). If specified and inherit.aes = TRUE (the default), it is combined with the default mapping at the top level of the plot. You must supply mapping if there is no plot mapping.

data

The data to be displayed in this layer. There are three options:

If NULL, the default, the data is inherited from the plot data as specified in the call to ggplot().

A data.frame, or other object, will override the plot data. All objects will be fortified to produce a data frame. See fortify() for which variables will be created.

A function will be called with a single argument, the plot data. The return value must be a data.frame, and will be used as the layer data. A function can be created from a formula (e.g. ~ head(.x, 10)).

stat

The statistical transformation to use on the data for this layer. When using a geom_*() function to construct a layer, the stat argument can be used to override the default coupling between geoms and stats. The stat argument accepts the following:

  • A Stat ggproto subclass, for example StatCount.

  • A string naming the stat. To give the stat as a string, strip the function name of the stat_ prefix. For example, to use stat_count(), give the stat as "count".

  • For more information and other ways to specify the stat, see the layer stat documentation.

position

<Position | string> Position adjustment, either as a string, or the result of a call to a position adjustment function. Setting this equal to "dodge" (position_dodge()) or "dodgejust" (position_dodgejust()) can be useful if you have overlapping geometries.

...

Other arguments passed to layer(). These are often aesthetics, used to set an aesthetic to a fixed value, like colour = "red" or linewidth = 3 (see Aesthetics, below). They may also be parameters to the paired geom/stat.

orientation

<string> Whether this geom is drawn horizontally or vertically. One of:

  • NA (default): automatically detect the orientation based on how the aesthetics are assigned. Automatic detection works most of the time.

  • "horizontal" (or "y"): draw horizontally, using the y aesthetic to identify different groups. For each group, uses the x, xmin, xmax, and thickness aesthetics to draw points, intervals, and slabs.

  • "vertical" (or "x"): draw vertically, using the x aesthetic to identify different groups. For each group, uses the y, ymin, ymax, and thickness aesthetics to draw points, intervals, and slabs.

For compatibility with the base ggplot naming scheme for orientation, "x" can be used as an alias for "vertical" and "y" as an alias for "horizontal" (ggdist had an orientation parameter before base ggplot did, hence the discrepancy).

interval_size_range

<length-2 numeric> This geom scales the raw size aesthetic values when drawing interval and point sizes, as they tend to be too thick when using the default settings of scale_size_continuous(), which give sizes with a range of c(1, 6). The interval_size_domain value indicates the input domain of raw size values (typically this should be equal to the value of the range argument of the scale_size_continuous() function), and interval_size_range indicates the desired output range of the size values (the min and max of the actual sizes used to draw intervals). Most of the time it is not recommended to change the value of this argument, as it may result in strange scaling of legends; this argument is a holdover from earlier versions that did not have size aesthetics targeting the point and interval separately. If you want to adjust the size of the interval or points separately, you can also use the linewidth or point_size aesthetics; see sub-geometry-scales.

interval_size_domain

<length-2 numeric> Minimum and maximum of the values of the size and linewidth aesthetics that will be translated into actual sizes for intervals drawn according to interval_size_range (see the documentation for that argument).
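As a rough sketch of how the domain and range relate, assume the translation is a plain linear rescaling from the input domain onto the output range. The helper rescale_size() below is hypothetical and not part of ggdist; the package's internals may differ in detail.

```r
# Hypothetical helper, for illustration only: linearly map raw size values
# from the input `domain` onto the output `range`.
rescale_size = function(size, domain = c(1, 6), range = c(1, 6)) {
  (size - domain[1]) / (domain[2] - domain[1]) * (range[2] - range[1]) + range[1]
}

# With the geom_interval() defaults, domain and range are both c(1, 6),
# so raw sizes pass through unchanged: 1.0 3.5 6.0
same = rescale_size(c(1, 3.5, 6))

# With a narrower output range, the same raw sizes are compressed
# into thinner lines: 0.6 1.0 1.4
thin = rescale_size(c(1, 3.5, 6), range = c(0.6, 1.4))
```

Under this assumption, leaving both arguments at c(1, 6) means the size scale's output is used as is, which is why changing only one of them can make legends look inconsistent with the drawn intervals.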

arrow

<arrow | NULL> Type of arrow heads to use on the interval, or NULL for no arrows.

na.rm

<scalar logical> If FALSE, the default, missing values are removed with a warning. If TRUE, missing values are silently removed.

show.legend

logical. Should this layer be included in the legends? NA, the default, includes if any aesthetics are mapped. FALSE never includes, and TRUE always includes. It can also be a named logical vector to finely select the aesthetics to display.

inherit.aes

If FALSE, overrides the default aesthetics, rather than combining with them. This is most useful for helper functions that define both data and aesthetics and shouldn't inherit behaviour from the default plot specification, e.g. borders().

check.aes, check.param

If TRUE, the default, will check that supplied parameters and aesthetics are understood by the geom or stat. Use FALSE to suppress the checks.

Details

This geom wraps geom_slabinterval() with defaults designed to produce multiple-interval plots. Default aesthetic mappings are applied if the .width column is present in the input data (e.g., as generated by the point_interval() family of functions), making this geom often more convenient than vanilla ggplot2 geometries when used with functions like median_qi(), mean_qi(), mode_hdi(), etc.

Specifically, if .width is present in the input, geom_interval() acts as if its default aesthetics are aes(colour = forcats::fct_rev(ordered(.width))).
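To see where the .width column comes from, here is a small sketch that summarizes made-up draws with median_qi() before plotting. The data frame draws and its columns g and value are assumptions for illustration.

```r
library(dplyr)
library(ggplot2)
library(ggdist)

set.seed(1234)
# hypothetical sample data: 500 draws in each of two groups
draws = tibble(
  g = rep(c("a", "b"), each = 500),
  value = rnorm(1000, mean = rep(c(0, 2), each = 500))
)

# one row per group per interval width, with the interval bounds in
# .lower and .upper and the interval width in .width
intervals = draws %>%
  group_by(g) %>%
  median_qi(value, .width = c(.5, .8, .95))

# because .width is present, the colour mapping is applied by default
p = intervals %>%
  ggplot(aes(y = g, x = value, xmin = .lower, xmax = .upper)) +
  geom_interval() +
  scale_colour_brewer()
```

Two groups times three widths gives six rows in intervals, and each row becomes one interval segment in the plot.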

Value

A ggplot2::Geom representing a multiple-interval geometry which can be added to a ggplot() object.

Aesthetics

The slab+interval stats and geoms have a wide variety of aesthetics that control the appearance of their three sub-geometries: the slab, the point, and the interval.

Positional aesthetics

Interval-specific aesthetics

Color aesthetics

Line aesthetics

Interval-specific color and line override aesthetics

Deprecated aesthetics

Other aesthetics (these work as in standard geoms)

See examples of some of these aesthetics in action in vignette("slabinterval"). Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").

See Also

See stat_interval() for the stat version, intended for use on sample data or analytical distributions. See geom_slabinterval() for the geometry this shortcut is based on.

Other slabinterval geoms: geom_pointinterval(), geom_slab(), geom_spike()

Examples

library(dplyr)
library(ggplot2)
library(ggdist)

theme_set(theme_ggdist())

data(RankCorr_u_tau, package = "ggdist")

# orientation is detected automatically based on
# use of xmin/xmax or ymin/ymax
RankCorr_u_tau %>%
  group_by(i) %>%
  median_qi(.width = c(.5, .8, .95, .99)) %>%
  ggplot(aes(y = i, x = u_tau, xmin = .lower, xmax = .upper)) +
  geom_interval() +
  scale_color_brewer()

RankCorr_u_tau %>%
  group_by(i) %>%
  median_qi(.width = c(.5, .8, .95, .99)) %>%
  ggplot(aes(x = i, y = u_tau, ymin = .lower, ymax = .upper)) +
  geom_interval() +
  scale_color_brewer()

Line + multiple-ribbon plots (ggplot geom)

Description

A combination of geom_line() and geom_ribbon() with default aesthetics designed for use with output from point_interval().

Usage

geom_lineribbon(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  ...,
  step = FALSE,
  orientation = NA,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)

Arguments

mapping

Set of aesthetic mappings created by aes(). If specified and inherit.aes = TRUE (the default), it is combined with the default mapping at the top level of the plot. You must supply mapping if there is no plot mapping.

data

The data to be displayed in this layer. There are three options:

If NULL, the default, the data is inherited from the plot data as specified in the call to ggplot().

A data.frame, or other object, will override the plot data. All objects will be fortified to produce a data frame. See fortify() for which variables will be created.

A function will be called with a single argument, the plot data. The return value must be a data.frame, and will be used as the layer data. A function can be created from a formula (e.g. ~ head(.x, 10)).

stat

The statistical transformation to use on the data for this layer. When using a geom_*() function to construct a layer, the stat argument can be used to override the default coupling between geoms and stats. The stat argument accepts the following:

  • A Stat ggproto subclass, for example StatCount.

  • A string naming the stat. To give the stat as a string, strip the function name of the stat_ prefix. For example, to use stat_count(), give the stat as "count".

  • For more information and other ways to specify the stat, see the layer stat documentation.

position

A position adjustment to use on the data for this layer. This can be used in various ways, including to prevent overplotting and to improve the display. The position argument accepts the following:

  • The result of calling a position function, such as position_jitter(). This method allows for passing extra arguments to the position.

  • A string naming the position adjustment. To give the position as a string, strip the function name of the position_ prefix. For example, to use position_jitter(), give the position as "jitter".

  • For more information and other ways to specify the position, see the layer position documentation.

...

Other arguments passed to layer(). These are often aesthetics, used to set an aesthetic to a fixed value, like colour = "red" or linewidth = 3 (see Aesthetics, below). They may also be parameters to the paired geom/stat.

step

<scalar logical | string> Should the line/ribbon be drawn as a step function? One of:

  • FALSE (default): do not draw as a step function.

  • "mid" (or TRUE): draw steps midway between adjacent x values.

  • "hv": draw horizontal-then-vertical steps.

  • "vh": draw vertical-then-horizontal steps.

TRUE is an alias for "mid", because for a step function with ribbons "mid" is a reasonable default (for the other two step approaches the ribbons at either the very first or very last x value will not be visible).
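A brief sketch of the step argument in use, reusing the kind of simulated data shown in this page's examples (the object names summary_df and p_mid are made up here):

```r
library(dplyr)
library(ggplot2)
library(ggdist)

set.seed(12345)
# simulated draws: 100 values of y at each of 10 x positions
summary_df = tibble(
  x = rep(1:10, 100),
  y = rnorm(1000, x)
) %>%
  group_by(x) %>%
  median_qi(y, .width = c(.5, .8, .95))

# step = "mid" places each step midway between adjacent x values,
# so the ribbons at the first and last x values remain visible
p_mid = summary_df %>%
  ggplot(aes(x = x, y = y, ymin = .lower, ymax = .upper)) +
  geom_lineribbon(step = "mid") +
  scale_fill_brewer()
```

Replacing "mid" with "hv" or "vh" switches to the other two step styles described above.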

orientation

<string> Whether this geom is drawn horizontally or vertically. One of:

  • NA (default): automatically detect the orientation based on how the aesthetics are assigned. Automatic detection works most of the time.

  • "horizontal" (or "y"): draw horizontally, using the y aesthetic to identify different groups. For each group, uses the x, xmin, xmax, and thickness aesthetics to draw points, intervals, and slabs.

  • "vertical" (or "x"): draw vertically, using the x aesthetic to identify different groups. For each group, uses the y, ymin, ymax, and thickness aesthetics to draw points, intervals, and slabs.

For compatibility with the base ggplot naming scheme for orientation, "x" can be used as an alias for "vertical" and "y" as an alias for "horizontal" (ggdist had an orientation parameter before base ggplot did, hence the discrepancy).

na.rm

<scalar logical> If FALSE, the default, missing values are removed with a warning. If TRUE, missing values are silently removed.

show.legend

logical. Should this layer be included in the legends? NA, the default, includes if any aesthetics are mapped. FALSE never includes, and TRUE always includes. It can also be a named logical vector to finely select the aesthetics to display.

inherit.aes

If FALSE, overrides the default aesthetics, rather than combining with them. This is most useful for helper functions that define both data and aesthetics and shouldn't inherit behaviour from the default plot specification, e.g. borders().

check.aes, check.param

If TRUE, the default, will check that supplied parameters and aesthetics are understood by the geom or stat. Use FALSE to suppress the checks.

Details

geom_lineribbon() is a combination of geom_line() and geom_ribbon() designed for use with output from point_interval(). This geom sets some default aesthetics equal to the .width column generated by the point_interval() family of functions, making it often more convenient than a vanilla geom_ribbon() + geom_line().

Specifically, geom_lineribbon() acts as if its default aesthetics are aes(fill = forcats::fct_rev(ordered(.width))).

Value

A ggplot2::Geom representing a combined line + multiple-ribbon geometry which can be added to a ggplot() object.

Aesthetics

The line+ribbon stats and geoms have a wide variety of aesthetics that control the appearance of their two sub-geometries: the line and the ribbon.

Positional aesthetics

Ribbon-specific aesthetics

Color aesthetics

Line aesthetics

Other aesthetics (these work as in standard geoms)

See examples of some of these aesthetics in action in vignette("lineribbon"). Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").

Author(s)

Matthew Kay

See Also

See stat_lineribbon() for a version that summarizes samples into points and intervals within ggplot. See geom_pointinterval() for a similar geom intended for point summaries and intervals. See geom_line() and geom_ribbon() for the geoms this is based on.

Examples

library(dplyr)
library(ggplot2)
library(ggdist)

theme_set(theme_ggdist())

set.seed(12345)
tibble(
  x = rep(1:10, 100),
  y = rnorm(1000, x)
) %>%
  group_by(x) %>%
  median_qi(.width = c(.5, .8, .95)) %>%
  ggplot(aes(x = x, y = y, ymin = .lower, ymax = .upper)) +
  # automatically uses aes(fill = forcats::fct_rev(ordered(.width)))
  geom_lineribbon() +
  scale_fill_brewer()

Point + multiple-interval plot (shortcut geom)

Description

Shortcut version of geom_slabinterval() for creating point + multiple-interval plots.

Roughly equivalent to:

geom_slabinterval(
  aes(
    datatype = "interval",
    side = "both"
  ),
  show_slab = FALSE,
  show.legend = c(size = FALSE)
)

Usage

geom_pointinterval(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  ...,
  orientation = NA,
  interval_size_domain = c(1, 6),
  interval_size_range = c(0.6, 1.4),
  fatten_point = 1.8,
  arrow = NULL,
  na.rm = FALSE,
  show.legend = c(size = FALSE),
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)

Arguments

mapping

Set of aesthetic mappings created by aes(). If specified and inherit.aes = TRUE (the default), it is combined with the default mapping at the top level of the plot. You must supply mapping if there is no plot mapping.

data

The data to be displayed in this layer. There are three options:

If NULL, the default, the data is inherited from the plot data as specified in the call to ggplot().

A data.frame, or other object, will override the plot data. All objects will be fortified to produce a data frame. See fortify() for which variables will be created.

A function will be called with a single argument, the plot data. The return value must be a data.frame, and will be used as the layer data. A function can be created from a formula (e.g. ~ head(.x, 10)).

stat

The statistical transformation to use on the data for this layer. When using a geom_*() function to construct a layer, the stat argument can be used to override the default coupling between geoms and stats. The stat argument accepts the following:

  • A Stat ggproto subclass, for example StatCount.

  • A string naming the stat. To give the stat as a string, strip the function name of the stat_ prefix. For example, to use stat_count(), give the stat as "count".

  • For more information and other ways to specify the stat, see the layer stat documentation.

position

<Position | string> Position adjustment, either as a string, or the result of a call to a position adjustment function. Setting this equal to "dodge" (position_dodge()) or "dodgejust" (position_dodgejust()) can be useful if you have overlapping geometries.

...

Other arguments passed to layer(). These are often aesthetics, used to set an aesthetic to a fixed value, like colour = "red" or linewidth = 3 (see Aesthetics, below). They may also be parameters to the paired geom/stat.

orientation

<string> Whether this geom is drawn horizontally or vertically. One of:

  • NA (default): automatically detect the orientation based on how the aesthetics are assigned. Automatic detection works most of the time.

  • "horizontal" (or "y"): draw horizontally, using the y aesthetic to identify different groups. For each group, uses the x, xmin, xmax, and thickness aesthetics to draw points, intervals, and slabs.

  • "vertical" (or "x"): draw vertically, using the x aesthetic to identify different groups. For each group, uses the y, ymin, ymax, and thickness aesthetics to draw points, intervals, and slabs.

For compatibility with the base ggplot naming scheme for orientation, "x" can be used as an alias for "vertical" and "y" as an alias for "horizontal" (ggdist had an orientation parameter before base ggplot did, hence the discrepancy).

interval_size_domain

<length-2 numeric> Minimum and maximum of the values of the size and linewidth aesthetics that will be translated into actual sizes for intervals drawn according to interval_size_range (see the documentation for that argument).

interval_size_range

<length-2 numeric> This geom scales the raw size aesthetic values when drawing interval and point sizes, as they tend to be too thick when using the default settings of scale_size_continuous(), which give sizes with a range of c(1, 6). The interval_size_domain value indicates the input domain of raw size values (typically this should be equal to the value of the range argument of the scale_size_continuous() function), and interval_size_range indicates the desired output range of the size values (the min and max of the actual sizes used to draw intervals). Most of the time it is not recommended to change the value of this argument, as it may result in strange scaling of legends; this argument is a holdover from earlier versions that did not have size aesthetics targeting the point and interval separately. If you want to adjust the size of the interval or points separately, you can also use the linewidth or point_size aesthetics; see sub-geometry-scales.

fatten_point

<scalar numeric> A multiplicative factor used to adjust the size of the point relative to the size of the thickest interval line. If you wish to specify point sizes directly, you can also use the point_size aesthetic and scale_point_size_continuous() or scale_point_size_discrete(); sizes specified with that aesthetic will not be adjusted using fatten_point.
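The two ways of controlling point size described above can be sketched as follows, using the RankCorr_u_tau data from this page's examples. The object names intervals, base, p_fatten, and p_direct are made up for illustration, and the specific values 3 and 4 are arbitrary.

```r
library(dplyr)
library(ggplot2)
library(ggdist)

data(RankCorr_u_tau, package = "ggdist")

intervals = RankCorr_u_tau %>%
  group_by(i) %>%
  median_qi(u_tau, .width = c(.8, .95))

base = intervals %>%
  ggplot(aes(y = i, x = u_tau, xmin = .lower, xmax = .upper))

# point size given relative to the thickest interval line
p_fatten = base + geom_pointinterval(fatten_point = 3)

# point size set directly as a fixed aesthetic value;
# fatten_point is not applied to it
p_direct = base + geom_pointinterval(point_size = 4)
```

The first form keeps points proportional to the interval lines if those are later resized; the second pins the point size regardless of the intervals.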

arrow

<arrow | NULL> Type of arrow heads to use on the interval, or NULL for no arrows.

na.rm

<scalar logical> If FALSE, the default, missing values are removed with a warning. If TRUE, missing values are silently removed.

show.legend

<logical> Should this layer be included in the legends? Default is c(size = FALSE), unlike most geoms, to match its common use cases. FALSE hides all legends, TRUE shows all legends, and NA shows only those that are mapped (the default for most geoms). It can also be a named logical vector to finely select the aesthetics to display.

inherit.aes

If FALSE, overrides the default aesthetics, rather than combining with them. This is most useful for helper functions that define both data and aesthetics and shouldn't inherit behaviour from the default plot specification, e.g. borders().

check.aes, check.param

If TRUE, the default, will check that supplied parameters and aesthetics are understood by the geom or stat. Use FALSE to suppress the checks.

Details

This geom wraps geom_slabinterval() with defaults designed to produce point + multiple-interval plots. Default aesthetic mappings are applied if the .width column is present in the input data (e.g., as generated by the point_interval() family of functions), making this geom often more convenient than vanilla ggplot2 geometries when used with functions like median_qi(), mean_qi(), mode_hdi(), etc.

Specifically, if .width is present in the input, geom_pointinterval() acts as if its default aesthetics are aes(size = -.width).

Value

A ggplot2::Geom representing a point + multiple-interval geometry which can be added to a ggplot() object.

Aesthetics

The slab+interval stats and geoms have a wide variety of aesthetics that control the appearance of their three sub-geometries: the slab, the point, and the interval.

Positional aesthetics

Interval-specific aesthetics

Point-specific aesthetics

Color aesthetics

Line aesthetics

Interval-specific color and line override aesthetics

Point-specific color and line override aesthetics

Deprecated aesthetics

Other aesthetics (these work as in standard geoms)

See examples of some of these aesthetics in action in vignette("slabinterval"). Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").

See Also

See stat_pointinterval() for the stat version, intended for use on sample data or analytical distributions. See geom_slabinterval() for the geometry this shortcut is based on.

Other slabinterval geoms: geom_interval(), geom_slab(), geom_spike()

Examples

library(dplyr)
library(ggplot2)
library(ggdist)

data(RankCorr_u_tau, package = "ggdist")

# orientation is detected automatically based on
# use of xmin/xmax or ymin/ymax
RankCorr_u_tau %>%
  group_by(i) %>%
  median_qi(.width = c(.8, .95)) %>%
  ggplot(aes(y = i, x = u_tau, xmin = .lower, xmax = .upper)) +
  geom_pointinterval()

RankCorr_u_tau %>%
  group_by(i) %>%
  median_qi(.width = c(.8, .95)) %>%
  ggplot(aes(x = i, y = u_tau, ymin = .lower, ymax = .upper)) +
  geom_pointinterval()

Slab (ridge) plot (shortcut geom)

Description

Shortcut version of geom_slabinterval() for creating slab (ridge) plots.

Roughly equivalent to:

geom_slabinterval(
  show_point = FALSE,
  show_interval = FALSE
)

Usage

geom_slab(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  ...,
  orientation = NA,
  subscale = "thickness",
  normalize = "all",
  fill_type = "segments",
  subguide = "slab",
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)

Arguments

mapping

Set of aesthetic mappings created by aes(). If specified and inherit.aes = TRUE (the default), it is combined with the default mapping at the top level of the plot. You must supply mapping if there is no plot mapping.

data

The data to be displayed in this layer. There are three options:

If NULL, the default, the data is inherited from the plot data as specified in the call to ggplot().

A data.frame, or other object, will override the plot data. All objects will be fortified to produce a data frame. See fortify() for which variables will be created.

A function will be called with a single argument, the plot data. The return value must be a data.frame, and will be used as the layer data. A function can be created from a formula (e.g. ~ head(.x, 10)).

stat

The statistical transformation to use on the data for this layer. When using a geom_*() function to construct a layer, the stat argument can be used to override the default coupling between geoms and stats. The stat argument accepts the following:

  • A Stat ggproto subclass, for example StatCount.

  • A string naming the stat. To give the stat as a string, strip the function name of the stat_ prefix. For example, to use stat_count(), give the stat as "count".

  • For more information and other ways to specify the stat, see the layer stat documentation.

position

<Position | string> Position adjustment, either as a string, or the result of a call to a position adjustment function. Setting this equal to "dodge" (position_dodge()) or "dodgejust" (position_dodgejust()) can be useful if you have overlapping geometries.

...

Other arguments passed to layer(). These are often aesthetics, used to set an aesthetic to a fixed value, like colour = "red" or linewidth = 3 (see Aesthetics, below). They may also be parameters to the paired geom/stat.

orientation

<string> Whether this geom is drawn horizontally or vertically. One of:

  • NA (default): automatically detect the orientation based on how the aesthetics are assigned. Automatic detection works most of the time.

  • "horizontal" (or "y"): draw horizontally, using the y aesthetic to identify different groups. For each group, uses the x, xmin, xmax, and thickness aesthetics to draw points, intervals, and slabs.

  • "vertical" (or "x"): draw vertically, using the x aesthetic to identify different groups. For each group, uses the y, ymin, ymax, and thickness aesthetics to draw points, intervals, and slabs.

For compatibility with the base ggplot naming scheme for orientation, "x" can be used as an alias for "vertical" and "y" as an alias for "horizontal" (ggdist had an orientation parameter before base ggplot did, hence the discrepancy).

subscale

<function | string> Sub-scale used to scale values of the thickness aesthetic within the groups determined by normalize. One of:

  • A function that takes an x argument giving a numeric vector of values to be scaled and then returns a thickness vector representing the scaled values, such as subscale_thickness() or subscale_identity().

  • A string giving the name of such a function when prefixed with "subscale_"; e.g. "thickness" or "identity". The value "thickness" uses the default subscale, which can be modified by setting subscale_thickness; see the documentation for that function.

For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.

normalize

<string> Groups within which to scale values of the thickness aesthetic. One of:

  • "all": normalize so that the maximum height across all data is 1.

  • "panels": normalize within panels so that the maximum height in each panel is 1.

  • "xy": normalize within the x/y axis opposite the orientation of this geom so that the maximum height at each value of the opposite axis is 1.

  • "groups": normalize within values of the opposite axis and within each group so that the maximum height in each group is 1.

  • "none": values are taken as is with no normalization (this should probably only be used with functions whose values are in [0,1], such as CDFs).

For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.
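The difference between normalize = "all" and normalize = "groups" can be sketched with densities whose peaks differ in height. The data frame below (columns sd, input, group, density) is made up for illustration.

```r
library(ggplot2)
library(ggdist)

# hypothetical densities: three normal distributions with different
# spreads, so their peak densities differ
df = expand.grid(
  sd = c(0.5, 1, 2),
  input = seq(-6, 6, length.out = 101)
)
df$group = factor(df$sd)
df$density = dnorm(df$input, mean = 0, sd = df$sd)

base = ggplot(df, aes(y = group, x = input, thickness = density))

# default: only the tallest slab across all data reaches height 1,
# so wider distributions appear shorter
p_all = base + geom_slab(normalize = "all")

# each slab is scaled so that its own maximum height is 1
p_groups = base + geom_slab(normalize = "groups")
```

With "all", slab heights stay comparable across groups; with "groups", every slab uses its full vertical space at the cost of that comparability.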

fill_type

<string> What type of fill to use when the fill color or alpha varies within a slab. One of:

  • "segments": breaks up the slab geometry into segments for each unique combination of fill color and alpha value. This approach is supported by all graphics devices and works well for sharp cutoff values, but can give ugly results if a large number of unique fill colors are being used (as in gradients, like in stat_gradientinterval()).

  • "gradient": a grid::linearGradient() is used to create a smooth gradient fill. This works well for large numbers of unique fill colors, but requires R >= 4.1 and is not yet supported on all graphics devices. As of this writing, the png() graphics device with type = "cairo", the svg() device, the pdf() device, and the ragg::agg_png() device are known to support this option. On R < 4.1, this option will fall back to fill_type = "segments" with a message.

  • "auto": attempts to use fill_type = "gradient" if support for it can be auto-detected. On R >= 4.2, support for gradients can be auto-detected on some graphics devices; if support is not detected, this option will fall back to fill_type = "segments" (in case of a false negative, fill_type = "gradient" can be set explicitly). On R < 4.2, support for gradients cannot be auto-detected, so this will always fall back to fill_type = "segments", in which case you can set fill_type = "gradient" explicitly if you are using a graphics device that supports gradients.
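As a minimal sketch of a fill that varies within a slab, the example below maps fill to the density itself and requests a gradient fill. The data frame and its columns are made up, and whether the gradient renders smoothly depends on the graphics device in use, as described above.

```r
library(ggplot2)
library(ggdist)

# hypothetical data: a single standard normal density
df = data.frame(
  group = "a",
  input = seq(-4, 4, length.out = 201)
)
df$density = dnorm(df$input)

# fill varies continuously along the slab, so many unique fill colors
# are in play; "gradient" asks for a smooth fill instead of segments
p_gradient = ggplot(df, aes(y = group, x = input, thickness = density, fill = density)) +
  geom_slab(fill_type = "gradient")
```

If the output shows visible banding or no gradient at all, the device probably lacks gradient support; rendering through one of the devices listed above is the usual remedy.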

subguide

<function | string> Sub-guide used to annotate the thickness scale. One of:

  • A function that takes a scale argument giving a ggplot2::Scale object and an orientation argument giving the orientation of the geometry and then returns a grid::grob that will draw the axis annotation, such as subguide_axis() (to draw a traditional axis) or subguide_none() (to draw no annotation). See subguide_axis() for a list of possibilities and examples.

  • A string giving the name of such a function when prefixed with "subguide_"; e.g. "axis" or "none". The values "slab", "dots", and "spike" use the default subguide for their geom families (no subguide), which can be modified by setting subguide_slab, subguide_dots, or subguide_spike; see the documentation for those functions.
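The string form of subguide can be sketched like this, using a made-up single-density data frame. Only the documented string values "axis" and "none" are used here.

```r
library(ggplot2)
library(ggdist)

# hypothetical data: a single standard normal density
df = data.frame(
  group = "a",
  input = seq(-4, 4, length.out = 101)
)
df$density = dnorm(df$input)

base = ggplot(df, aes(y = group, x = input, thickness = density))

# "axis" resolves to subguide_axis() and annotates the thickness scale
p_axis = base + geom_slab(subguide = "axis")

# "none" resolves to subguide_none() and draws no annotation
p_none = base + geom_slab(subguide = "none")
```

Since the default "slab" subguide also draws nothing, p_none should look the same as a plain geom_slab() unless the slab default has been changed.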

na.rm

<scalar logical> If FALSE, the default, missing values are removed with a warning. If TRUE, missing values are silently removed.

show.legend

logical. Should this layer be included in the legends? NA, the default, includes if any aesthetics are mapped. FALSE never includes, and TRUE always includes. It can also be a named logical vector to finely select the aesthetics to display.

inherit.aes

If FALSE, overrides the default aesthetics, rather than combining with them. This is most useful for helper functions that define both data and aesthetics and shouldn't inherit behaviour from the default plot specification, e.g. borders().

check.aes, check.param

If TRUE, the default, will check that supplied parameters and aesthetics are understood by the geom or stat. Use FALSE to suppress the checks.

Value

A ggplot2::Geom representing a slab (ridge) geometry which can be added to a ggplot() object.

Aesthetics

The slab+interval stats and geoms have a wide variety of aesthetics that control the appearance of their three sub-geometries: the slab, the point, and the interval.

Positional aesthetics

Slab-specific aesthetics

Color aesthetics

Line aesthetics

Slab-specific color and line override aesthetics

Deprecated aesthetics

Other aesthetics (these work as in standard geoms)

See examples of some of these aesthetics in action in vignette("slabinterval"). Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").

See Also

See stat_slab() for the stat version, intended for use on sample data or analytical distributions. See geom_slabinterval() for the geometry this shortcut is based on.

Other slabinterval geoms: geom_interval(), geom_pointinterval(), geom_spike()

Examples

library(dplyr)
library(ggplot2)
library(ggdist)

theme_set(theme_ggdist())

# we will manually demonstrate plotting a density with geom_slab(),
# though generally speaking this is easier to do using stat_slab(), which
# will determine sensible limits automatically and correctly adjust
# densities when using scale transformations
df = expand.grid(
    mean = 1:3,
    input = seq(-2, 6, length.out = 100)
  ) %>%
  mutate(
    group = letters[4 - mean],
    density = dnorm(input, mean, 1)
  )

# orientation is detected automatically based on
# use of x or y
df %>%
  ggplot(aes(y = group, x = input, thickness = density)) +
  geom_slab()

df %>%
  ggplot(aes(x = group, y = input, thickness = density)) +
  geom_slab()

# RIDGE PLOTS
# "ridge" plots can be created by increasing the slab height and
# setting the slab color
df %>%
  ggplot(aes(y = group, x = input, thickness = density)) +
  geom_slab(height = 2, color = "black")

Slab + point + interval meta-geom

Description

This meta-geom supports drawing combinations of functions (as slabs, aka ridge plots or joy plots), points, and intervals. It acts as a meta-geom for many other ggdist geoms that are wrappers around this geom, including eye plots, half-eye plots, CCDF barplots, and point+multiple interval plots, and supports both horizontal and vertical orientations, dodging (via the position argument), and relative justification of slabs with their corresponding intervals.

Usage

geom_slabinterval(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  ...,
  orientation = NA,
  subscale = "thickness",
  normalize = "all",
  fill_type = "segments",
  interval_size_domain = c(1, 6),
  interval_size_range = c(0.6, 1.4),
  fatten_point = 1.8,
  arrow = NULL,
  show_slab = TRUE,
  show_point = TRUE,
  show_interval = TRUE,
  subguide = "slab",
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)

Arguments

mapping

Set of aesthetic mappings created by aes(). If specified and inherit.aes = TRUE (the default), it is combined with the default mapping at the top level of the plot. You must supply mapping if there is no plot mapping.

data

The data to be displayed in this layer. There are three options:

If NULL, the default, the data is inherited from the plot data as specified in the call to ggplot().

A data.frame, or other object, will override the plot data. All objects will be fortified to produce a data frame. See fortify() for which variables will be created.

A function will be called with a single argument, the plot data. The return value must be a data.frame, and will be used as the layer data. A function can be created from a formula (e.g. ~ head(.x, 10)).
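As a small illustration of the function form of data, the sketch below uses plain ggplot2 (nothing ggdist-specific) with the built-in mtcars dataset; both the dataset and the choice of geom_point() are assumptions made only for the example. The second layer receives a formula, which ggplot2 converts into a function of the plot data:

```r
library(ggplot2)

# Plot-level data is mtcars; the second layer derives its own data
# from the plot data via a formula (shorthand for a function of .x).
p <- ggplot(mtcars, aes(x = mpg, y = factor(cyl))) +
  geom_point(colour = "gray60") +
  geom_point(data = ~ head(.x, 10), colour = "red")

# layer_data() returns the computed data for one layer; layer 2
# holds only the rows kept by head().
n_rows_layer2 <- nrow(layer_data(p, 2))
```

The same pattern should apply to any layer function that takes a data argument, including the geoms documented here.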

stat

The statistical transformation to use on the data for this layer. When using a geom_*() function to construct a layer, the stat argument can be used to override the default coupling between geoms and stats. The stat argument accepts the following:

  • A Stat ggproto subclass, for example StatCount.

  • A string naming the stat. To give the stat as a string, strip the function name of the stat_ prefix. For example, to use stat_count(), give the stat as "count".

  • For more information and other ways to specify the stat, see the layer stat documentation.

position

<Position | string> Position adjustment, either as a string, or the result of a call to a position adjustment function. Setting this equal to "dodge" (position_dodge()) or "dodgejust" (position_dodgejust()) can be useful if you have overlapping geometries.

...

Other arguments passed to layer(). These are often aesthetics, used to set an aesthetic to a fixed value, like colour = "red" or linewidth = 3 (see Aesthetics, below). They may also be parameters to the paired geom/stat.

orientation

<string> Whether this geom is drawn horizontally or vertically. One of:

  • NA (default): automatically detect the orientation based on how the aesthetics are assigned. Automatic detection works most of the time.

  • "horizontal" (or "y"): draw horizontally, using the y aesthetic to identify different groups. For each group, uses the x, xmin, xmax, and thickness aesthetics to draw points, intervals, and slabs.

  • "vertical" (or "x"): draw vertically, using the x aesthetic to identify different groups. For each group, uses the y, ymin, ymax, and thickness aesthetics to draw points, intervals, and slabs.

For compatibility with the base ggplot naming scheme for orientation, "x" can be used as an alias for "vertical" and "y" as an alias for "horizontal" (ggdist had an orientation parameter before base ggplot did, hence the discrepancy).

subscale

<function | string> Sub-scale used to scale values of the thickness aesthetic within the groups determined by normalize. One of:

  • A function that takes an x argument giving a numeric vector of values to be scaled and then returns a thickness vector representing the scaled values, such as subscale_thickness() or subscale_identity().

  • A string giving the name of such a function when prefixed with "subscale_"; e.g. "thickness" or "identity". The value "thickness" uses the default subscale, which can be modified by setting subscale_thickness; see the documentation for that function.

For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.

normalize

<string> Groups within which to scale values of the thickness aesthetic. One of:

  • "all": normalize so that the maximum height across all data is 1.

  • "panels": normalize within panels so that the maximum height in each panel is 1.

  • "xy": normalize within the x/y axis opposite the orientation of this geom so that the maximum height at each value of the opposite axis is 1.

  • "groups": normalize within values of the opposite axis and within each group so that the maximum height in each group is 1.

  • "none": values are taken as is with no normalization (this should probably only be used with functions whose values are in [0,1], such as CDFs).

For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.
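To make the difference between the normalization groups concrete, here is a minimal base R sketch. It is not ggdist's implementation; it assumes non-negative thickness values, for which scaling to a maximum of 1 reduces to dividing by the relevant maximum. The toy data frame and its column names are hypothetical.

```r
# Toy slab data: two values on the grouping axis (y), three thickness
# values each. Column names are illustrative only.
df <- data.frame(
  y = c("a", "a", "a", "b", "b", "b"),
  thickness = c(0.1, 0.4, 0.2, 0.5, 1.0, 2.0)
)

# Analogue of normalize = "all": one maximum shared by every slab,
# so relative heights between slabs are preserved.
df$norm_all <- df$thickness / max(df$thickness)

# Analogue of normalize = "xy": a separate maximum for each value of y,
# so every slab reaches a height of 1.
df$norm_xy <- df$thickness / ave(df$thickness, df$y, FUN = max)

print(df)
```

With the "all" analogue the slab for "a" peaks at 0.2 while "b" peaks at 1; with the "xy" analogue both peak at 1.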

fill_type

<string> What type of fill to use when the fill color or alpha varies within a slab. One of:

  • "segments": breaks up the slab geometry into segments for each unique combination of fill color and alpha value. This approach is supported by all graphics devices and works well for sharp cutoff values, but can give ugly results if a large number of unique fill colors are being used (as in gradients, like in stat_gradientinterval()).

  • "gradient": a grid::linearGradient() is used to create a smooth gradient fill. This works well for large numbers of unique fill colors, but requires R >= 4.1 and is not yet supported on all graphics devices. As of this writing, the png() graphics device with type = "cairo", the svg() device, the pdf() device, and the ragg::agg_png() device are known to support this option. On R < 4.1, this option will fall back to fill_type = "segments" with a message.

  • "auto": attempts to use fill_type = "gradient" if support for it can be auto-detected. On R >= 4.2, support for gradients can be auto-detected on some graphics devices; if support is not detected, this option will fall back to fill_type = "segments" (in case of a false negative, fill_type = "gradient" can be set explicitly). On R < 4.2, support for gradients cannot be auto-detected, so this will always fall back to fill_type = "segments", in which case you can set fill_type = "gradient" explicitly if you are using a graphics device that supports gradients.
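A minimal sketch of requesting a gradient fill explicitly, assuming the ggdist and distributional packages are installed. The single standard normal distribution and the group label are made up for illustration, and the plot still needs to be rendered on one of the devices listed above to show a smooth gradient.

```r
library(ggplot2)
library(ggdist)
library(distributional)

# One analytical distribution to visualize (illustrative data).
df <- tibble::tibble(g = "a", d = dist_normal(0, 1))

# stat_gradientinterval() varies the fill within the slab, which is the
# case where fill_type matters. "gradient" asks for a smooth
# grid::linearGradient() fill instead of many small segments.
p <- ggplot(df, aes(y = g, xdist = d)) +
  stat_gradientinterval(fill_type = "gradient")
```

If the output looks banded or the fill is missing, switching the device or falling back to fill_type = "segments" is the first thing to try.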

interval_size_domain

<length-2 numeric> Minimum and maximum of the values of the size and linewidth aesthetics that will be translated into actual sizes for intervals drawn according to interval_size_range (see the documentation for that argument).

interval_size_range

<length-2 numeric> This geom scales the raw size aesthetic values when drawing interval and point sizes, as they tend to be too thick when using the default settings of scale_size_continuous(), which give sizes with a range of c(1, 6). The interval_size_domain value indicates the input domain of raw size values (typically this should be equal to the value of the range argument of the scale_size_continuous() function), and interval_size_range indicates the desired output range of the size values (the min and max of the actual sizes used to draw intervals). Most of the time it is not recommended to change the value of this argument, as it may result in strange scaling of legends; this argument is a holdover from earlier versions that did not have size aesthetics targeting the point and interval separately. If you want to adjust the size of the interval or points separately, you can also use the linewidth or point_size aesthetics; see sub-geometry-scales.
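The domain-to-range mapping described above can be pictured as a linear rescale. The helper below is hypothetical (it is not a ggdist function, and ggdist's internal handling may differ in details such as clamping); it only illustrates how the default domain c(1, 6) would map onto the default range c(0.6, 1.4) under that assumption.

```r
# Hypothetical helper: linearly map raw size values from `domain`
# onto `range`. Not part of ggdist; for illustration only.
rescale_interval_size <- function(size, domain = c(1, 6), range = c(0.6, 1.4)) {
  # Position of each size within the domain, as a fraction in [0, 1]
  # for sizes inside the domain.
  fraction <- (size - domain[1]) / (domain[2] - domain[1])
  range[1] + fraction * (range[2] - range[1])
}

# Endpoints and midpoint of the default domain.
sizes <- rescale_interval_size(c(1, 3.5, 6))
```

Under this sketch the smallest raw size maps to the low end of the range, the largest to the high end, and values in between are interpolated.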

fatten_point

<scalar numeric> A multiplicative factor used to adjust the size of the point relative to the size of the thickest interval line. If you wish to specify point sizes directly, you can also use the point_size aesthetic and scale_point_size_continuous() or scale_point_size_discrete(); sizes specified with that aesthetic will not be adjusted using fatten_point.

arrow

<arrow | NULL> Type of arrow heads to use on the interval, or NULL for no arrows.

show_slab

<scalar logical> Should the slab portion of the geom be drawn?

show_point

<scalar logical> Should the point portion of the geom be drawn?

show_interval

<scalar logical> Should the interval portion of the geom be drawn?

subguide

<function | string> Sub-guide used to annotate the thickness scale. One of:

  • A function that takes a scale argument giving a ggplot2::Scale object and an orientation argument giving the orientation of the geometry and then returns a grid::grob that will draw the axis annotation, such as subguide_axis() (to draw a traditional axis) or subguide_none() (to draw no annotation). See subguide_axis() for a list of possibilities and examples.

  • A string giving the name of such a function when prefixed with "subguide_"; e.g. "axis" or "none". The values "slab", "dots", and "spike" use the default subguide for their geom families (no subguide), which can be modified by setting subguide_slab, subguide_dots, or subguide_spike; see the documentation for those functions.
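The two ways of specifying a subguide can be sketched as follows, assuming ggdist is installed. The simulated data are made up, and stat_slab() is used here only because it is a convenient member of the slabinterval family that takes the same subguide argument.

```r
library(ggplot2)
library(ggdist)

set.seed(1)
df <- data.frame(g = rep(c("a", "b"), each = 100), x = rnorm(200))

# String form: "axis" is resolved by prefixing it with "subguide_".
p_string <- ggplot(df, aes(x = x, y = g)) +
  stat_slab(subguide = "axis")

# Function form: pass the subguide function itself.
p_function <- ggplot(df, aes(x = x, y = g)) +
  stat_slab(subguide = subguide_axis)
```

Both plots are intended to annotate the thickness scale with a traditional axis; the string form is simply shorthand.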

na.rm

<scalar logical> If FALSE, the default, missing values are removed with a warning. If TRUE, missing values are silently removed.

show.legend

logical. Should this layer be included in the legends? NA, the default, includes if any aesthetics are mapped. FALSE never includes, and TRUE always includes. It can also be a named logical vector to finely select the aesthetics to display.

inherit.aes

If FALSE, overrides the default aesthetics, rather than combining with them. This is most useful for helper functions that define both data and aesthetics and shouldn't inherit behaviour from the default plot specification, e.g. borders().

check.aes, check.param

If TRUE, the default, will check that supplied parameters and aesthetics are understood by the geom or stat. Use FALSE to suppress the checks.

Details

geom_slabinterval() is a flexible meta-geom that you can use directly or through a variety of "shortcut" geoms that represent useful combinations of the various parameters of this geom. In many cases you will want to use the shortcut geoms instead as they create more useful mnemonic primitives, such as eye plots, half-eye plots, point+interval plots, or CCDF barplots.

The slab portion of the geom is much like a ridge or "joy" plot: it represents the value of a function scaled to fit between values on the x or y axis (depending on the value of orientation). Values of the functions are specified using the thickness aesthetic and are scaled to fit into scale times the distance between points on the relevant axis. E.g., if orientation is "horizontal", scale is 0.9, and y is a discrete variable, then the thickness aesthetic specifies the value of some function of x that is drawn for every y value and scaled to fit into 0.9 times the distance between points on the y axis.

For the interval portion of the geom, x and y aesthetics specify the location of the point, and ymin/ymax or xmin/xmax (depending on the value of orientation) specify the endpoints of the interval. A scaling factor for interval line width and point size is applied through the interval_size_domain, interval_size_range, and fatten_point parameters. These scaling factors are designed to give multiple uncertainty intervals reasonable scaling at the default settings for scale_size_continuous().

As a combination geom, this geom expects a datatype aesthetic specifying which part of the geom a given row in the input data corresponds to: "slab" or "interval". However, specifying this aesthetic manually is typically only necessary if you use this geom directly; the numerous wrapper geoms will usually set this aesthetic for you as needed, and their use is recommended unless you have a very custom use case.

Wrapper geoms include:

In addition, the stat_slabinterval() family of stats uses geoms from the geom_slabinterval() family, and is often easier to use than using these geoms directly. Typically, the geom_* versions are meant for use with already-summarized data (such as intervals) and the stat_* versions summarize the data themselves (usually draws from a distribution) to produce the geom.

Value

A ggplot2::Geom representing a slab or combined slab+interval geometry which can be added to a ggplot() object.

Aesthetics

The slab+interval stats and geoms have a wide variety of aesthetics that control the appearance of their three sub-geometries: the slab, the point, and the interval.

Positional aesthetics

Slab-specific aesthetics

Interval-specific aesthetics

Point-specific aesthetics

Color aesthetics

Line aesthetics

Slab-specific color and line override aesthetics

Interval-specific color and line override aesthetics

Point-specific color and line override aesthetics

Deprecated aesthetics

Other aesthetics (these work as in standard geoms)

See examples of some of these aesthetics in action in vignette("slabinterval"). Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").

Author(s)

Matthew Kay

See Also

See geom_lineribbon() for a combination geom designed for fit curves plus probability bands. See geom_dotsinterval() for a combination geom designed for plotting dotplots with intervals. See stat_slabinterval() for families of stats built on top of this geom for common use cases (like stat_halfeye()). See vignette("slabinterval") for a variety of examples of use.

Examples

# geom_slabinterval() is typically not that useful on its own.
# See vignette("slabinterval") for a variety of examples of the use of its
# shortcut geoms and stats, which are more useful than using
# geom_slabinterval() directly.

Spike plot (ggplot2 geom)

Description

Geometry for drawing "spikes" (optionally with points on them) on top of geom_slabinterval() geometries: this geometry understands the scaling and positioning of the thickness aesthetic from geom_slabinterval(), which allows you to position spikes and points along a slab.

Usage

geom_spike(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  ...,
  subguide = "spike",
  orientation = NA,
  subscale = "thickness",
  normalize = "all",
  arrow = NULL,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)

Arguments

mapping

Set of aesthetic mappings created by aes(). If specified and inherit.aes = TRUE (the default), it is combined with the default mapping at the top level of the plot. You must supply mapping if there is no plot mapping.

data

The data to be displayed in this layer. There are three options:

If NULL, the default, the data is inherited from the plot data as specified in the call to ggplot().

A data.frame, or other object, will override the plot data. All objects will be fortified to produce a data frame. See fortify() for which variables will be created.

A function will be called with a single argument, the plot data. The return value must be a data.frame, and will be used as the layer data. A function can be created from a formula (e.g. ~ head(.x, 10)).

stat

The statistical transformation to use on the data for this layer. When using a geom_*() function to construct a layer, the stat argument can be used to override the default coupling between geoms and stats. The stat argument accepts the following:

  • A Stat ggproto subclass, for example StatCount.

  • A string naming the stat. To give the stat as a string, strip the function name of the stat_ prefix. For example, to use stat_count(), give the stat as "count".

  • For more information and other ways to specify the stat, see the layer stat documentation.

position

<Position | string> Position adjustment, either as a string, or the result of a call to a position adjustment function. Setting this equal to "dodge" (position_dodge()) or "dodgejust" (position_dodgejust()) can be useful if you have overlapping geometries.

...

Other arguments passed to layer(). These are often aesthetics, used to set an aesthetic to a fixed value, like colour = "red" or linewidth = 3 (see Aesthetics, below). They may also be parameters to the paired geom/stat.

subguide

<function | string> Sub-guide used to annotate the thickness scale. One of:

  • A function that takes a scale argument giving a ggplot2::Scale object and an orientation argument giving the orientation of the geometry and then returns a grid::grob that will draw the axis annotation, such as subguide_axis() (to draw a traditional axis) or subguide_none() (to draw no annotation). See subguide_axis() for a list of possibilities and examples.

  • A string giving the name of such a function when prefixed with "subguide_"; e.g. "axis" or "none". The values "slab", "dots", and "spike" use the default subguide for their geom families (no subguide), which can be modified by setting subguide_slab, subguide_dots, or subguide_spike; see the documentation for those functions.

orientation

<string> Whether this geom is drawn horizontally or vertically. One of:

  • NA (default): automatically detect the orientation based on how the aesthetics are assigned. Automatic detection works most of the time.

  • "horizontal" (or "y"): draw horizontally, using the y aesthetic to identify different groups. For each group, uses the x, xmin, xmax, and thickness aesthetics to draw points, intervals, and slabs.

  • "vertical" (or "x"): draw vertically, using the x aesthetic to identify different groups. For each group, uses the y, ymin, ymax, and thickness aesthetics to draw points, intervals, and slabs.

For compatibility with the base ggplot naming scheme for orientation, "x" can be used as an alias for "vertical" and "y" as an alias for "horizontal" (ggdist had an orientation parameter before base ggplot did, hence the discrepancy).

subscale

<function | string> Sub-scale used to scale values of the thickness aesthetic within the groups determined by normalize. One of:

  • A function that takes an x argument giving a numeric vector of values to be scaled and then returns a thickness vector representing the scaled values, such as subscale_thickness() or subscale_identity().

  • A string giving the name of such a function when prefixed with "subscale_"; e.g. "thickness" or "identity". The value "thickness" uses the default subscale, which can be modified by setting subscale_thickness; see the documentation for that function.

For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.

normalize

<string> Groups within which to scale values of the thickness aesthetic. One of:

  • "all": normalize so that the maximum height across all data is 1.

  • "panels": normalize within panels so that the maximum height in each panel is 1.

  • "xy": normalize within the x/y axis opposite the orientation of this geom so that the maximum height at each value of the opposite axis is 1.

  • "groups": normalize within values of the opposite axis and within each group so that the maximum height in each group is 1.

  • "none": values are taken as is with no normalization (this should probably only be used with functions whose values are in [0,1], such as CDFs).

For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.

arrow

<arrow | NULL> Type of arrow heads to use on the spike, or NULL for no arrows.

na.rm

<scalar logical> If FALSE, the default, missing values are removed with a warning. If TRUE, missing values are silently removed.

show.legend

logical. Should this layer be included in the legends? NA, the default, includes if any aesthetics are mapped. FALSE never includes, and TRUE always includes. It can also be a named logical vector to finely select the aesthetics to display.

inherit.aes

If FALSE, overrides the default aesthetics, rather than combining with them. This is most useful for helper functions that define both data and aesthetics and shouldn't inherit behaviour from the default plot specification, e.g. borders().

check.aes, check.param

If TRUE, the default, will check that supplied parameters and aesthetics are understood by the geom or stat. Use FALSE to suppress the checks.

Details

This geometry consists of a "spike" (vertical/horizontal line segment) and a "point" (at the end of the line segment). It uses the thickness aesthetic to determine where the endpoint of the line is, which allows it to be used with geom_slabinterval() geometries for labeling specific values of the thickness function.

Value

A ggplot2::Geom representing a spike geometry which can be added to a ggplot() object.

Aesthetics

The spike geom has a wide variety of aesthetics that control the appearance of its two sub-geometries: the spike and the point.

Positional aesthetics

Spike-specific (aka Slab-specific) aesthetics

Color aesthetics

Line aesthetics

Other aesthetics (these work as in standard geoms)

See examples of some of these aesthetics in action in vignette("slabinterval"). Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").

See Also

See stat_spike() for the stat version, intended for use on sample data or analytical distributions.

Other slabinterval geoms: geom_interval(), geom_pointinterval(), geom_slab()

Examples

library(ggplot2)
library(ggdist)
library(distributional)
library(dplyr)

# geom_spike is easiest to use with distributional or
# posterior::rvar objects
df = tibble(
  d = dist_normal(1:2, 1:2), g = c("a", "b")
)

# annotate the density at the mean of a distribution
df %>% mutate(
  mean = mean(d),
  density(d, list(density_at_mean = mean))
) %>%
  ggplot(aes(y = g)) +
  stat_slab(aes(xdist = d)) +
  geom_spike(aes(x = mean, thickness = density_at_mean)) +
  # need shared thickness scale so that stat_slab and geom_spike line up
  scale_thickness_shared()

# annotate the endpoints of intervals of a distribution
# here we'll use an arrow instead of a point by setting size = 0
arrow_spec = arrow(angle = 45, type = "closed", length = unit(4, "pt"))
df %>% mutate(
  median_qi(d, .width = 0.9),
  density(d, list(density_lower = .lower, density_upper = .upper))
) %>%
  ggplot(aes(y = g)) +
  stat_halfeye(aes(xdist = d), .width = 0.9, color = "gray35") +
  geom_spike(
    aes(x = .lower, thickness = density_lower),
    size = 0, arrow = arrow_spec, color = "blue", linewidth = 0.75
  ) +
  geom_spike(
    aes(x = .upper, thickness = density_upper),
    size = 0, arrow = arrow_spec, color = "red", linewidth = 0.75
  ) +
  scale_thickness_shared()

Beeswarm plot (shortcut geom)

Description

Shortcut version of geom_dotsinterval() for creating beeswarm plots. Geoms based on geom_dotsinterval() create dotplots that automatically ensure the plot fits within the available space.

Roughly equivalent to:

geom_dots(
  aes(side = "both"),
  overflow = "compress",
  binwidth = unit(1.5, "mm"),
  layout = "swarm"
)

Usage

geom_swarm(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  ...,
  overflow = "compress",
  binwidth = unit(1.5, "mm"),
  layout = "swarm",
  dotsize = 1.07,
  stackratio = 1,
  overlaps = "nudge",
  smooth = "none",
  verbose = FALSE,
  orientation = NA,
  subguide = "slab",
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)

Arguments

mapping

Set of aesthetic mappings created by aes(). If specified and inherit.aes = TRUE (the default), it is combined with the default mapping at the top level of the plot. You must supply mapping if there is no plot mapping.

data

The data to be displayed in this layer. There are three options:

If NULL, the default, the data is inherited from the plot data as specified in the call to ggplot().

A data.frame, or other object, will override the plot data. All objects will be fortified to produce a data frame. See fortify() for which variables will be created.

A function will be called with a single argument, the plot data. The return value must be a data.frame, and will be used as the layer data. A function can be created from a formula (e.g. ~ head(.x, 10)).

stat

The statistical transformation to use on the data for this layer. When using a geom_*() function to construct a layer, the stat argument can be used to override the default coupling between geoms and stats. The stat argument accepts the following:

  • A Stat ggproto subclass, for example StatCount.

  • A string naming the stat. To give the stat as a string, strip the function name of the stat_ prefix. For example, to use stat_count(), give the stat as "count".

  • For more information and other ways to specify the stat, see the layer stat documentation.

position

<Position | string> Position adjustment, either as a string, or the result of a call to a position adjustment function. Setting this equal to "dodge" (position_dodge()) or "dodgejust" (position_dodgejust()) can be useful if you have overlapping geometries.

...

Other arguments passed to layer(). These are often aesthetics, used to set an aesthetic to a fixed value, like colour = "red" or linewidth = 3 (see Aesthetics, below). They may also be parameters to the paired geom/stat.

overflow

<string> How to handle overflow of dots beyond the extent of the geom when a minimum binwidth (or an exact binwidth) is supplied. One of:

  • "keep": Keep the overflow, drawing dots outside the geom bounds.

  • "warn": Keep the overflow, but produce a warning suggesting solutions, such as setting binwidth = NA or overflow = "compress".

  • "compress": Compress the layout. Reduces the binwidth to the size necessary to keep the dots within bounds, then adjusts stackratio and dotsize so that the apparent dot size is the user-specified minimum binwidth times the user-specified dotsize.

If you find the default layout has dots that are too small, and you are okay with dots overlapping, consider setting overflow = "compress" and supplying an exact or minimum dot size using binwidth.

binwidth

<numeric | unit> The bin width to use for laying out the dots. One of:

  • NA (the default): Dynamically select the bin width based on the size of the plot when drawn. This will pick a binwidth such that the tallest stack of dots is at most scale in height (ideally exactly scale in height, though this is not guaranteed).

  • A length-1 (scalar) numeric or unit object giving the exact bin width.

  • A length-2 (vector) numeric or unit object giving the minimum and maximum desired bin width. The bin width will be dynamically selected within these bounds.

If the value is numeric, it is assumed to be in units of data. The bin width (or its bounds) can also be specified using unit(), which may be useful if it is desired that the dots be a certain point size or a certain percentage of the width/height of the viewport. For example, unit(0.1, "npc") would make dots that are exactly 10% of the viewport size along whichever dimension the dotplot is drawn; unit(c(0, 0.1), "npc") would make dots that are at most 10% of the viewport size (while still ensuring the tallest stack is less than or equal to scale).
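The three ways of giving binwidth can be sketched as below, assuming ggdist is installed. The simulated data are illustrative, and geom_dots() is used in place of geom_swarm() only so that the example does not depend on the suggested beeswarm package; the binwidth argument is the same for both.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
df <- data.frame(g = rep(c("a", "b"), each = 50), x = rnorm(100))
base <- ggplot(df, aes(x = x, y = g))

# Exact bin width in data units (a plain numeric).
p_data_units <- base + geom_dots(binwidth = 0.2)

# Exact bin width as a fraction of the viewport: dots are 10% of the
# viewport size along the dimension the dotplot is drawn in.
p_exact_npc <- base + geom_dots(binwidth = unit(0.1, "npc"))

# Bounds only: the bin width is chosen dynamically, at most 10% of
# the viewport size.
p_bounded_npc <- base + geom_dots(binwidth = unit(c(0, 0.1), "npc"))
```

An exact bin width can overflow the available space when there are many dots, which is where the overflow argument comes into play.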

layout

<string> The layout method used for the dots. One of:

  • "bin" (default): places dots on the off-axis at the midpoint of their bins as in the classic Wilkinson dotplot. This maintains the alignment of rows and columns in the dotplot. This layout is slightly different from the classic Wilkinson algorithm in that: (1) it nudges bins slightly to avoid overlapping bins and (2) if the input data are symmetrical it will return a symmetrical layout.

  • "weave": uses the same basic binning approach of "bin", but places dots in the off-axis at their actual positions (unless overlaps = "nudge", in which case overlaps may be nudged out of the way). This maintains the alignment of rows but does not align dots within columns.

  • "hex": uses the same basic binning approach of "bin", but alternates placing dots + binwidth/4 or - binwidth/4 in the off-axis from the bin center. This allows hexagonal packing by setting a stackratio less than 1 (something like 0.9 tends to work).

  • "swarm": uses the "compactswarm" layout from beeswarm::beeswarm(). Does not maintain alignment of rows or columns, but can be more compact and neat looking, especially for sample data (as opposed to quantile dotplots of theoretical distributions, which may look better with "bin", "weave", or "hex").

  • "bar": for discrete distributions, lays out duplicate values in rectangular bars.
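A short sketch of switching layouts, assuming ggdist is installed; the simulated data are made up. The "hex" variant follows the suggestion above of pairing it with a stackratio slightly below 1. The "swarm" layout is left out because it additionally needs the beeswarm package.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
df <- data.frame(g = rep(c("a", "b"), each = 50), x = rnorm(100))
base <- ggplot(df, aes(x = x, y = g))

# Classic binned layout: rows and columns stay aligned.
p_bin <- base + geom_dots(layout = "bin")

# Rows stay aligned, but dots sit at their actual positions.
p_weave <- base + geom_dots(layout = "weave")

# Alternating offsets; a stackratio below 1 lets the rows pack
# hexagonally.
p_hex <- base + geom_dots(layout = "hex", stackratio = 0.9)
```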

dotsize

<scalarnumeric> The width of the dots relative to thebinwidth. The default,1.07, makes dots be just a bit wider than the bin width, which is amanually-tuned parameter that tends to work well with the default circularshape, preventing gaps between bins from appearing to be too large visually(as might arise from dots beingprecisely thebinwidth). If it is desiredto have dots be precisely thebinwidth, setdotsize = 1.

stackratio

<scalarnumeric> The distance between the center of the dots in the samestack relative to the dot height. The default,1, makes dots in the samestack just touch each other.

overlaps

<string> How to handle overlapping dots or bins in the "bin", "weave", and "hex" layouts (dots never overlap in the "swarm" or "bar" layouts). For the purposes of this argument, dots are only considered to be overlapping if they would be overlapping when dotsize = 1 and stackratio = 1; i.e. if you set those arguments to other values, overlaps may still occur. One of:

  • "keep": leave overlapping dots as they are. Dots may overlap (usually only slightly) in the "bin", "weave", and "hex" layouts.

  • "nudge": nudge overlapping dots out of the way. Overlaps are avoided using a constrained optimization which minimizes the squared distance of dots to their desired positions, subject to the constraint that adjacent dots do not overlap.
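The "nudge" optimization can be illustrated in one dimension with base R alone. The sketch below is not ggdist's implementation: nudge_sketch() is a hypothetical helper, and it swaps in isotonic regression (stats::isoreg()) as one way to solve the same least-squares problem under a minimum-spacing constraint, assuming the positions are sorted and every dot has the same width.

```r
# Hypothetical helper (not part of ggdist): move sorted positions as little
# as possible, in squared distance, so neighbours are at least `width` apart.
nudge_sketch = function(desired, width) {
  desired = sort(desired)
  offset = (seq_along(desired) - 1) * width
  # Substituting y_i = x_i - (i - 1) * width turns the spacing constraint
  # x_{i+1} - x_i >= width into y_{i+1} >= y_i, which is isotonic regression.
  y = stats::isoreg(desired - offset)$yf
  y + offset
}

desired = c(0, 0.5, 0.6, 3)
nudged = nudge_sketch(desired, width = 1)
nudged
diff(nudged)   # every gap is at least `width`, up to floating point error
```

The first three dots are spread apart around their common centre, while the fourth, which was already far enough away, stays where it was.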

smooth

<function | string> Smoother to apply to dot positions. One of:

  • A function that takes a numeric vector of dot positions and returns a smoothed version of that vector, such as smooth_bounded(), smooth_unbounded(), smooth_discrete(), or smooth_bar().

  • A string indicating what smoother to use, as the suffix to a function name starting with smooth_; e.g. "none" (the default) applies smooth_none(), which simply returns the given vector without applying smoothing.

Smoothing is most effective when the smoother is matched to the support of the distribution; e.g. using smooth_bounded(bounds = ...).

verbose

<scalar logical> If TRUE, print out the bin width of the dotplot. Can be useful if you want to start from an automatically-selected bin width and then adjust it manually. Bin width is printed both as data units and as normalized parent coordinates or "npc"s (see unit()). Note that if you just want to scale the selected bin width to fit within a desired area, it is probably easier to use scale than to copy and scale binwidth manually, and if you just want to provide constraints on the bin width, you can pass a length-2 vector to binwidth.

orientation

<string> Whether this geom is drawn horizontally or vertically. One of:

  • NA (default): automatically detect the orientation based on how the aesthetics are assigned. Automatic detection works most of the time.

  • "horizontal" (or "y"): draw horizontally, using the y aesthetic to identify different groups. For each group, uses the x, xmin, xmax, and thickness aesthetics to draw points, intervals, and slabs.

  • "vertical" (or "x"): draw vertically, using the x aesthetic to identify different groups. For each group, uses the y, ymin, ymax, and thickness aesthetics to draw points, intervals, and slabs.

For compatibility with the base ggplot naming scheme for orientation, "x" can be used as an alias for "vertical" and "y" as an alias for "horizontal" (ggdist had an orientation parameter before base ggplot did, hence the discrepancy).

subguide

<function | string> Sub-guide used to annotate the thickness scale. One of:

  • A function that takes a scale argument giving a ggplot2::Scale object and an orientation argument giving the orientation of the geometry and then returns a grid::grob that will draw the axis annotation, such as subguide_axis() (to draw a traditional axis) or subguide_none() (to draw no annotation). See subguide_axis() for a list of possibilities and examples.

  • A string giving the name of such a function when prefixed with "subguide_"; e.g. "axis" or "none". The values "slab", "dots", and "spike" use the default subguide for their geom families (no subguide), which can be modified by setting subguide_slab, subguide_dots, or subguide_spike; see the documentation for those functions.

na.rm

<scalar logical> If FALSE, the default, missing values are removed with a warning. If TRUE, missing values are silently removed.

show.legend

logical. Should this layer be included in the legends? NA, the default, includes if any aesthetics are mapped. FALSE never includes, and TRUE always includes. It can also be a named logical vector to finely select the aesthetics to display.

inherit.aes

If FALSE, overrides the default aesthetics, rather than combining with them. This is most useful for helper functions that define both data and aesthetics and shouldn't inherit behaviour from the default plot specification, e.g. borders().

check.aes, check.param

If TRUE, the default, will check that supplied parameters and aesthetics are understood by the geom or stat. Use FALSE to suppress the checks.

Details

The dots family of stats and geoms is similar to ggplot2::geom_dotplot() but with a number of differences:

Stats and geoms in this family include:

stat_dots() and stat_dotsinterval(), when used with the quantiles argument, are particularly useful for constructing quantile dotplots, which can be an effective way to communicate uncertainty using a frequency framing that may be easier for laypeople to understand (Kay et al. 2016, Fernandes et al. 2018).

Value

A ggplot2::Geom representing a beeswarm geometry which can be added to a ggplot() object.

Aesthetics

The dots+interval stats and geoms have a wide variety of aesthetics that control the appearance of their three sub-geometries: the dots (aka the slab), the point, and the interval.

Positional aesthetics

Dots-specific (aka Slab-specific) aesthetics

Interval-specific aesthetics

Point-specific aesthetics

Color aesthetics

Line aesthetics

Slab-specific color and line override aesthetics

Interval-specific color and line override aesthetics

Point-specific color and line override aesthetics

Deprecated aesthetics

Other aesthetics (these work as in standard geoms)

See examples of some of these aesthetics in action in vignette("dotsinterval"). Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").

References

Kay, M., Kola, T., Hullman, J. R., & Munson, S. A. (2016). When (ish) is My Bus? User-centered Visualizations of Uncertainty in Everyday, Mobile Predictive Systems. Conference on Human Factors in Computing Systems - CHI '16, 5092–5103. doi:10.1145/2858036.2858558.

Fernandes, M., Walls, L., Munson, S., Hullman, J., & Kay, M. (2018). Uncertainty Displays Using Quantile Dotplots or CDFs Improve Transit Decision-Making. Conference on Human Factors in Computing Systems - CHI '18. doi:10.1145/3173574.3173718.

See Also

See geom_dotsinterval() for the geometry this shortcut is based on.

See vignette("dotsinterval") for a variety of examples of use.

Other dotsinterval geoms: geom_blur_dots(), geom_dots(), geom_dotsinterval(), geom_weave()

Examples

library(dplyr)
library(ggplot2)

theme_set(theme_ggdist())

set.seed(12345)
df = tibble(
  g = rep(c("a", "b"), 200),
  value = rnorm(400, c(0, 3), c(0.75, 1))
)

# orientation is detected automatically based on
# which axis is discrete
df %>%
  ggplot(aes(x = value, y = g)) +
  geom_swarm()

df %>%
  ggplot(aes(y = value, x = g)) +
  geom_swarm()

Dot-weave plot (shortcut geom)

Description

Shortcut version of geom_dotsinterval() for creating dot-weave plots. Geoms based on geom_dotsinterval() create dotplots that automatically ensure the plot fits within the available space.

Roughly equivalent to:

geom_dots(
  aes(side = "both"),
  layout = "weave",
  overflow = "compress",
  binwidth = unit(1.5, "mm")
)

Usage

geom_weave(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  ...,
  layout = "weave",
  overflow = "compress",
  binwidth = unit(1.5, "mm"),
  dotsize = 1.07,
  stackratio = 1,
  overlaps = "nudge",
  smooth = "none",
  verbose = FALSE,
  orientation = NA,
  subguide = "slab",
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)

Arguments

mapping

Set of aesthetic mappings created by aes(). If specified and inherit.aes = TRUE (the default), it is combined with the default mapping at the top level of the plot. You must supply mapping if there is no plot mapping.

data

The data to be displayed in this layer. There are three options:

If NULL, the default, the data is inherited from the plot data as specified in the call to ggplot().

A data.frame, or other object, will override the plot data. All objects will be fortified to produce a data frame. See fortify() for which variables will be created.

A function will be called with a single argument, the plot data. The return value must be a data.frame, and will be used as the layer data. A function can be created from a formula (e.g. ~ head(.x, 10)).

stat

The statistical transformation to use on the data for this layer. When using a geom_*() function to construct a layer, the stat argument can be used to override the default coupling between geoms and stats. The stat argument accepts the following:

  • A Stat ggproto subclass, for example StatCount.

  • A string naming the stat. To give the stat as a string, strip the function name of the stat_ prefix. For example, to use stat_count(), give the stat as "count".

  • For more information and other ways to specify the stat, see the layer stat documentation.

position

<Position | string> Position adjustment, either as a string, or the result of a call to a position adjustment function. Setting this equal to "dodge" (position_dodge()) or "dodgejust" (position_dodgejust()) can be useful if you have overlapping geometries.

...

Other arguments passed to layer(). These are often aesthetics, used to set an aesthetic to a fixed value, like colour = "red" or linewidth = 3 (see Aesthetics, below). They may also be parameters to the paired geom/stat.

layout

<string> The layout method used for the dots. One of:

  • "bin" (default): places dots on the off-axis at the midpoint of their bins as in the classic Wilkinson dotplot. This maintains the alignment of rows and columns in the dotplot. This layout is slightly different from the classic Wilkinson algorithm in that: (1) it nudges bins slightly to avoid overlapping bins and (2) if the input data are symmetrical it will return a symmetrical layout.

  • "weave": uses the same basic binning approach of "bin", but places dots in the off-axis at their actual positions (unless overlaps = "nudge", in which case overlaps may be nudged out of the way). This maintains the alignment of rows but does not align dots within columns.

  • "hex": uses the same basic binning approach of "bin", but alternates placing dots + binwidth/4 or - binwidth/4 in the off-axis from the bin center. This allows hexagonal packing by setting a stackratio less than 1 (something like 0.9 tends to work).

  • "swarm": uses the "compactswarm" layout from beeswarm::beeswarm(). Does not maintain alignment of rows or columns, but can be more compact and neat looking, especially for sample data (as opposed to quantile dotplots of theoretical distributions, which may look better with "bin", "weave", or "hex").

  • "bar": for discrete distributions, lays out duplicate values in rectangular bars.

overflow

<string> How to handle overflow of dots beyond the extent of the geom when a minimum binwidth (or an exact binwidth) is supplied. One of:

  • "keep": Keep the overflow, drawing dots outside the geom bounds.

  • "warn": Keep the overflow, but produce a warning suggesting solutions, such as setting binwidth = NA or overflow = "compress".

  • "compress": Compress the layout. Reduces the binwidth to the size necessary to keep the dots within bounds, then adjusts stackratio and dotsize so that the apparent dot size is the user-specified minimum binwidth times the user-specified dotsize.

If you find the default layout has dots that are too small, and you are okay with dots overlapping, consider setting overflow = "compress" and supplying an exact or minimum dot size using binwidth.

binwidth

<numeric | unit> The bin width to use for laying out the dots. One of:

  • NA (the default): Dynamically select the bin width based on the size of the plot when drawn. This will pick a binwidth such that the tallest stack of dots is at most scale in height (ideally exactly scale in height, though this is not guaranteed).

  • A length-1 (scalar) numeric or unit object giving the exact bin width.

  • A length-2 (vector) numeric or unit object giving the minimum and maximum desired bin width. The bin width will be dynamically selected within these bounds.

If the value is numeric, it is assumed to be in units of data. The bin width (or its bounds) can also be specified using unit(), which may be useful if it is desired that the dots be a certain point size or a certain percentage of the width/height of the viewport. For example, unit(0.1, "npc") would make dots that are exactly 10% of the viewport size along whichever dimension the dotplot is drawn; unit(c(0, 0.1), "npc") would make dots that are at most 10% of the viewport size (while still ensuring the tallest stack is less than or equal to scale).
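As a brief illustration of the three forms binwidth can take, the following sketch builds the same dot-weave plot with an exact width in data units, an exact width in absolute units, and a bounded width in "npc" units. The data frame and the object names (df, base, p_exact, p_mm, p_bounded) are made up for the example; unit() is used as re-exported by ggplot2.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
df = data.frame(g = rep(c("a", "b"), 50), value = rnorm(100, c(0, 3)))

base = ggplot(df, aes(x = value, y = g))

# exact bin width, in data units
p_exact = base + geom_weave(binwidth = 0.2)

# exact bin width, in absolute units (the geom_weave() default is 1.5 mm)
p_mm = base + geom_weave(binwidth = unit(2, "mm"))

# bin width selected dynamically, but at most 10% of the viewport
p_bounded = base + geom_weave(binwidth = unit(c(0, 0.1), "npc"))
```

Printing each plot at a few different device sizes is a quick way to see the difference: the data-unit and "mm" versions keep their dot size fixed relative to the data or the page, while the "npc" version rescales with the panel.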

dotsize

<scalar numeric> The width of the dots relative to the binwidth. The default, 1.07, makes dots be just a bit wider than the bin width, which is a manually-tuned parameter that tends to work well with the default circular shape, preventing gaps between bins from appearing to be too large visually (as might arise from dots being precisely the binwidth). If it is desired to have dots be precisely the binwidth, set dotsize = 1.

stackratio

<scalar numeric> The distance between the center of the dots in the same stack relative to the dot height. The default, 1, makes dots in the same stack just touch each other.

overlaps

<string> How to handle overlapping dots or bins in the "bin", "weave", and "hex" layouts (dots never overlap in the "swarm" or "bar" layouts). For the purposes of this argument, dots are only considered to be overlapping if they would be overlapping when dotsize = 1 and stackratio = 1; i.e. if you set those arguments to other values, overlaps may still occur. One of:

  • "keep": leave overlapping dots as they are. Dots may overlap (usually only slightly) in the "bin", "weave", and "hex" layouts.

  • "nudge": nudge overlapping dots out of the way. Overlaps are avoided using a constrained optimization which minimizes the squared distance of dots to their desired positions, subject to the constraint that adjacent dots do not overlap.

smooth

<function | string> Smoother to apply to dot positions. One of:

  • A function that takes a numeric vector of dot positions and returns a smoothed version of that vector, such as smooth_bounded(), smooth_unbounded(), smooth_discrete(), or smooth_bar().

  • A string indicating what smoother to use, as the suffix to a function name starting with smooth_; e.g. "none" (the default) applies smooth_none(), which simply returns the given vector without applying smoothing.

Smoothing is most effective when the smoother is matched to the support of the distribution; e.g. using smooth_bounded(bounds = ...).
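The two ways of specifying a smoother can be compared side by side. This is a small sketch on made-up data with support on (0, 1); the object names are illustrative only, and the bounds passed to smooth_bounded() are chosen to match that support.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
df = data.frame(value = rbeta(200, 2, 5))   # support is (0, 1)

# by name: "bounded" is looked up as smooth_bounded()
p_name = ggplot(df, aes(x = value)) +
  geom_weave(smooth = "bounded")

# as a function, with the bounds of the support stated explicitly
p_fun = ggplot(df, aes(x = value)) +
  geom_weave(smooth = smooth_bounded(bounds = c(0, 1)))
```

Passing the function form is the more explicit choice when the support is known in advance, since the bounds are then stated by the caller.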

verbose

<scalar logical> If TRUE, print out the bin width of the dotplot. Can be useful if you want to start from an automatically-selected bin width and then adjust it manually. Bin width is printed both as data units and as normalized parent coordinates or "npc"s (see unit()). Note that if you just want to scale the selected bin width to fit within a desired area, it is probably easier to use scale than to copy and scale binwidth manually, and if you just want to provide constraints on the bin width, you can pass a length-2 vector to binwidth.

orientation

<string> Whether this geom is drawn horizontally or vertically. One of:

  • NA (default): automatically detect the orientation based on how the aesthetics are assigned. Automatic detection works most of the time.

  • "horizontal" (or "y"): draw horizontally, using the y aesthetic to identify different groups. For each group, uses the x, xmin, xmax, and thickness aesthetics to draw points, intervals, and slabs.

  • "vertical" (or "x"): draw vertically, using the x aesthetic to identify different groups. For each group, uses the y, ymin, ymax, and thickness aesthetics to draw points, intervals, and slabs.

For compatibility with the base ggplot naming scheme for orientation, "x" can be used as an alias for "vertical" and "y" as an alias for "horizontal" (ggdist had an orientation parameter before base ggplot did, hence the discrepancy).

subguide

<function | string> Sub-guide used to annotate the thickness scale. One of:

  • A function that takes a scale argument giving a ggplot2::Scale object and an orientation argument giving the orientation of the geometry and then returns a grid::grob that will draw the axis annotation, such as subguide_axis() (to draw a traditional axis) or subguide_none() (to draw no annotation). See subguide_axis() for a list of possibilities and examples.

  • A string giving the name of such a function when prefixed with "subguide_"; e.g. "axis" or "none". The values "slab", "dots", and "spike" use the default subguide for their geom families (no subguide), which can be modified by setting subguide_slab, subguide_dots, or subguide_spike; see the documentation for those functions.

na.rm

<scalar logical> If FALSE, the default, missing values are removed with a warning. If TRUE, missing values are silently removed.

show.legend

logical. Should this layer be included in the legends? NA, the default, includes if any aesthetics are mapped. FALSE never includes, and TRUE always includes. It can also be a named logical vector to finely select the aesthetics to display.

inherit.aes

If FALSE, overrides the default aesthetics, rather than combining with them. This is most useful for helper functions that define both data and aesthetics and shouldn't inherit behaviour from the default plot specification, e.g. borders().

check.aes, check.param

If TRUE, the default, will check that supplied parameters and aesthetics are understood by the geom or stat. Use FALSE to suppress the checks.

Details

The dots family of stats and geoms is similar to ggplot2::geom_dotplot() but with a number of differences:

Stats and geoms in this family include:

stat_dots() and stat_dotsinterval(), when used with the quantiles argument, are particularly useful for constructing quantile dotplots, which can be an effective way to communicate uncertainty using a frequency framing that may be easier for laypeople to understand (Kay et al. 2016, Fernandes et al. 2018).

Value

A ggplot2::Geom representing a dot-weave geometry which can be added to a ggplot() object.

Aesthetics

The dots+interval stats and geoms have a wide variety of aesthetics that control the appearance of their three sub-geometries: the dots (aka the slab), the point, and the interval.

Positional aesthetics

Dots-specific (aka Slab-specific) aesthetics

Interval-specific aesthetics

Point-specific aesthetics

Color aesthetics

Line aesthetics

Slab-specific color and line override aesthetics

Interval-specific color and line override aesthetics

Point-specific color and line override aesthetics

Deprecated aesthetics

Other aesthetics (these work as in standard geoms)

See examples of some of these aesthetics in action in vignette("dotsinterval"). Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").

References

Kay, M., Kola, T., Hullman, J. R., & Munson, S. A. (2016). When (ish) is My Bus? User-centered Visualizations of Uncertainty in Everyday, Mobile Predictive Systems. Conference on Human Factors in Computing Systems - CHI '16, 5092–5103. doi:10.1145/2858036.2858558.

Fernandes, M., Walls, L., Munson, S., Hullman, J., & Kay, M. (2018). Uncertainty Displays Using Quantile Dotplots or CDFs Improve Transit Decision-Making. Conference on Human Factors in Computing Systems - CHI '18. doi:10.1145/3173574.3173718.

See Also

See geom_dotsinterval() for the geometry this shortcut is based on.

See vignette("dotsinterval") for a variety of examples of use.

Other dotsinterval geoms: geom_blur_dots(), geom_dots(), geom_dotsinterval(), geom_swarm()

Examples

library(dplyr)
library(ggplot2)

theme_set(theme_ggdist())

set.seed(12345)
df = tibble(
  g = rep(c("a", "b"), 200),
  value = rnorm(400, c(0, 3), c(0.75, 1))
)

# orientation is detected automatically based on
# which axis is discrete
df %>%
  ggplot(aes(x = value, y = g)) +
  geom_weave()

df %>%
  ggplot(aes(y = value, x = g)) +
  geom_weave()

Deprecated functions and arguments in ggdist

Description

Deprecated functions and arguments and their alternatives are listed below.

Deprecated stats and geoms

The stat_sample_... and stat_dist_... families of stats were merged in ggdist 3.1. This means:

The old stat_dist_... names are currently kept as aliases, but may be removed in the future.

Deprecated arguments

Deprecated parameters for stat_slabinterval() and family:

Deprecated parameters for geom_slabinterval() and family:

Author(s)

Matthew Kay


Continuous guide for colour ramp scales (ggplot2 guide)

Description

A colour ramp bar guide that shows continuous colour ramp scales mapped onto values as a smooth gradient. Designed for use with scale_fill_ramp_continuous() and scale_colour_ramp_continuous(). Based on guide_colourbar().

Usage

guide_rampbar(
  ...,
  to = "gray65",
  available_aes = c("fill_ramp", "colour_ramp")
)

Arguments

...

Arguments passed on to ggplot2::guide_colourbar

title

A character string or expression indicating a title of guide. If NULL, the title is not shown. By default (waiver()), the name of the scale object or the name specified in labs() is used for the title.

theme

A theme object to style the guide individually or differently from the plot's theme settings. The theme argument in the guide overrides, and is combined with, the plot's theme.

nbin

A numeric specifying the number of bins for drawing the colourbar. A smoother colourbar results from a larger value.

display

A string indicating a method to display the colourbar. Can be one of the following:

  • "raster" to display as a bitmap image.

  • "rectangles" to display as a series of rectangles.

  • "gradient" to display as a linear gradient.

Note that not all devices are able to render rasters and gradients.

raster

[Deprecated] A logical. If TRUE then the colourbar is rendered as a raster object. If FALSE then the colourbar is rendered as a set of rectangles. Note that not all graphics devices are capable of rendering raster images.

alpha

A numeric between 0 and 1 setting the colour transparency of the bar. Use NA to preserve the alpha encoded in the colour itself (default).

draw.ulim

A logical specifying if the upper limit tick marks should be visible.

draw.llim

A logical specifying if the lower limit tick marks should be visible.

position

A character string indicating where the legend should be placed relative to the plot panels.

direction

A character string indicating the direction of the guide. One of "horizontal" or "vertical".

reverse

logical. If TRUE the colourbar is reversed. By default, the highest value is on the top and the lowest value is on the bottom.

order

positive integer less than 99 that specifies the order of this guide among multiple guides. This controls the order in which multiple guides are displayed, not the contents of the guide itself. If 0 (default), the order is determined by a secret algorithm.

to

<string> The color to ramp to in the guide. Corresponds to 1 on the scale.

available_aes

<character> Vector listing the aesthetics for which a guide_rampbar() can be drawn.

Details

This guide creates smooth gradient color bars for use with scale_fill_ramp_continuous() and scale_colour_ramp_continuous(). The color to ramp from is determined by the from argument of the scale_* function, and the color to ramp to is determined by the to argument to guide_rampbar().

Guides can be specified in each scale_* function or in guides(). guide = "rampbar" in scale_* is syntactic sugar for guide = guide_rampbar(); e.g. scale_colour_ramp_continuous(guide = "rampbar"). For how to specify the guide for each scale in more detail, see guides().

Value

A guide object.

Author(s)

Matthew Kay

See Also

Other colour ramp functions: partial_colour_ramp(), ramp_colours(), scale_colour_ramp

Examples

library(dplyr)
library(ggplot2)
library(distributional)

# The default guide for ramp scales is guide_legend(), which creates a
# discrete style scale:
tibble(d = dist_uniform(0, 1)) %>%
  ggplot(aes(y = 0, xdist = d)) +
  stat_slab(aes(fill_ramp = after_stat(x)), fill = "blue") +
  scale_fill_ramp_continuous(from = "red")

# We can use guide_rampbar() to instead create a continuous guide, but
# it does not know what color to ramp to (defaults to "gray65"):
tibble(d = dist_uniform(0, 1)) %>%
  ggplot(aes(y = 0, xdist = d)) +
  stat_slab(aes(fill_ramp = after_stat(x)), fill = "blue") +
  scale_fill_ramp_continuous(from = "red", guide = guide_rampbar())

# We can tell the guide what color to ramp to using the `to` argument:
tibble(d = dist_uniform(0, 1)) %>%
  ggplot(aes(y = 0, xdist = d)) +
  stat_slab(aes(fill_ramp = after_stat(x)), fill = "blue") +
  scale_fill_ramp_continuous(from = "red", guide = guide_rampbar(to = "blue"))

Nicely-spaced sets of interval widths

Description

Create nicely-spaced sets of nested interval widths for use with (e.g.) the .width parameter of point_interval(), stat_slabinterval(), or stat_lineribbon():

Intervals should be evenly-spaced on any symmetric reference distribution when applied to data from distributions with the same shape. If dist is not symmetric, intervals may only be approximately evenly-spaced above the median.

Usage

interval_widths(n, dist = dist_normal(), max = 1 - 0.1/n, precision = NULL)

pretty_widths(
  n,
  dist = dist_normal(),
  max = if (n <= 4) 0.95 else 1 - 0.1/n,
  precision = if (n <= 4) 0.05 else 0.01
)

Arguments

n

<numeric> in $[0, \infty)$: Number of intervals to generate.

dist

<distribution>: Reference distribution.

max

<numeric> in $(0, 1)$: Maximum interval width.

precision

<numeric | NULL>: If not NULL, a value in $(0, 1)$ giving the precision to round resulting widths to. In order to guarantee n unique intervals are returned, widths will only be rounded if the result does not create duplicate values.

Details

Given the cumulative distribution function $F_\textrm{dist}(q)$ and the quantile function $F^{-1}_\textrm{dist}(p)$ of dist, the following is a sequence of $n + 1$ evenly-spaced quantiles of dist that could represent upper limits of nested intervals, where $q_i = q_0 + i\frac{q_n - q_0}{n}$:

$$q_0, \ldots, q_n = F^{-1}_\textrm{dist}(0.5), \ldots, F^{-1}_\textrm{dist}\left(0.5 + \frac{\textrm{max}}{2}\right)$$

interval_widths(n) returns the $n$ interval widths corresponding to the upper interval limits $q_1, \ldots, q_n$:

$$2\cdot\left[F_\textrm{dist}(q_1) - 0.5\right], \ldots, 2\cdot\left[F_\textrm{dist}(q_n) - 0.5\right]$$
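The formulas above can be traced by hand for a normal reference distribution using only base R. This is a sketch, not the package's code: interval_widths_sketch() is a hypothetical function, it fixes the reference distribution to the standard normal (qnorm() and pnorm() stand in for the quantile function and CDF of dist), and it assumes that rounding to precision is applied to the whole vector or not at all.

```r
# Hypothetical re-implementation for a standard normal reference distribution.
interval_widths_sketch = function(n, max = 1 - 0.1 / n, precision = NULL) {
  q_0 = qnorm(0.5)                           # median of the reference distribution
  q_n = qnorm(0.5 + max / 2)                 # upper limit of the widest interval
  q = q_0 + seq_len(n) * (q_n - q_0) / n     # evenly spaced q_1, ..., q_n
  widths = 2 * (pnorm(q) - 0.5)              # mass of each nested interval
  if (!is.null(precision)) {
    rounded = round(widths / precision) * precision
    # assumption: keep the rounded widths only if none of them coincide
    if (!anyDuplicated(rounded)) widths = rounded
  }
  widths
}

interval_widths_sketch(1)   # 0.9
interval_widths_sketch(2)   # 0.672..., 0.95
interval_widths_sketch(3, max = 0.95, precision = 0.05)   # 0.50, 0.80, 0.95
```

The last call uses the max and precision defaults that pretty_widths() documents for n <= 4, which is why it lands on round numbers.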

Value

A length-n numeric vector of interval widths (masses) between 0 and 1 (exclusive) in increasing order.

See Also

The .width argument to point_interval(), stat_slabinterval(), stat_lineribbon(), etc.

Examples

library(ggplot2)
library(distributional)

interval_widths(1)   # 0.9
# this is roughly +/- 1 SD and +/- 2 SD
interval_widths(2)   # 0.672..., 0.95
interval_widths(3)   # 0.521..., 0.844..., 0.966...

# "pretty" widths may be useful for legends with a small number of widths
pretty_widths(1)     # 0.95
pretty_widths(2)     # 0.65, 0.95
pretty_widths(3)     # 0.50, 0.80, 0.95

# larger numbers of intervals can be useful for plots
ggplot(data.frame(x = 1:20/20)) +
  aes(x, ydist = dist_normal((x * 5)^2, 1 + x * 5)) +
  stat_lineribbon(.width = pretty_widths(10))

# large numbers of intervals can be used to create gradients -- particularly
# useful if you shade ribbons according to density (not interval width)
# (this is currently experimental)
withr::with_options(list(ggdist.experimental.slab_data_in_intervals = TRUE), print(
  ggplot(data.frame(x = 1:20/20)) +
    aes(x, ydist = dist_normal((x * 5)^2, 1 + x * 5)) +
    stat_lineribbon(
      aes(fill_ramp = after_stat(ave(pdf_min, level))),
      .width = interval_widths(40),
      fill = "gray50"
    ) +
    theme_ggdist()
))

Marginal distribution of a single correlation from an LKJ distribution

Description

Marginal distribution for the correlation in a single cell from a correlation matrix distributed according to an LKJ distribution.

Usage

dlkjcorr_marginal(x, K, eta, log = FALSE)

plkjcorr_marginal(q, K, eta, lower.tail = TRUE, log.p = FALSE)

qlkjcorr_marginal(p, K, eta, lower.tail = TRUE, log.p = FALSE)

rlkjcorr_marginal(n, K, eta)

Arguments

x,q

vector of quantiles.

K

<numeric> Dimension of the correlation matrix. Must be greater than or equal to 2.

eta

<numeric> Parameter controlling the shape of the distribution

log,log.p

logical; if TRUE, probabilities p are given as log(p).

lower.tail

logical; if TRUE (default), probabilities are $P[X \le x]$; otherwise, $P[X > x]$.

p

vector of probabilities.

n

number of observations. If length(n) > 1, the length is taken to be the number required.

Details

The LKJ distribution is a distribution over correlation matrices with a single parameter, $\eta$. For a given $\eta$ and a $K \times K$ correlation matrix $R$:

$$R \sim \textrm{LKJ}(\eta)$$

Each off-diagonal entry of $R$, $r_{ij}: i \ne j$, has the following marginal distribution (Lewandowski, Kurowicka, and Joe 2009):

$$\frac{r_{ij} + 1}{2} \sim \textrm{Beta}\left(\eta - 1 + \frac{K}{2}, \eta - 1 + \frac{K}{2}\right)$$

In other words, $r_{ij}$ is marginally distributed according to the above Beta distribution scaled into $(-1, 1)$.
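The scaled-Beta relationship can be written out directly with base R's Beta functions. The helpers below are hypothetical names used only for illustration (the package's own functions are dlkjcorr_marginal() and friends); they follow from the change of variables $r = 2b - 1$ with $b \sim \textrm{Beta}(\eta - 1 + K/2, \eta - 1 + K/2)$, which contributes a factor of 1/2 to the density.

```r
# Hypothetical helpers illustrating the scaled Beta marginal.
dlkj_marginal_sketch = function(x, K, eta) {
  shape = eta - 1 + K / 2
  # density of r = 2 * b - 1 where b ~ Beta(shape, shape)
  dbeta((x + 1) / 2, shape, shape) / 2
}

rlkj_marginal_sketch = function(n, K, eta) {
  shape = eta - 1 + K / 2
  2 * rbeta(n, shape, shape) - 1
}

# K = 2 and eta = 1 give Beta(1, 1), i.e. a uniform density on (-1, 1)
dlkj_marginal_sketch(0, K = 2, eta = 1)   # 0.5

# the density integrates to (approximately) 1 over (-1, 1)
integrate(dlkj_marginal_sketch, -1, 1, K = 4, eta = 3)$value
```

Larger values of eta or K increase both shape parameters together, so the marginal concentrates symmetrically around zero.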

Value

The length of the result is determined by n for rlkjcorr_marginal, and is the maximum of the lengths of the numerical arguments for the other functions.

The numerical arguments other than n are recycled to the length of the result. Only the first elements of the logical arguments are used.

References

Lewandowski, D., Kurowicka, D., & Joe, H. (2009). Generating random correlation matrices based on vines and extended onion method. Journal of Multivariate Analysis, 100(9), 1989–2001. doi:10.1016/j.jmva.2009.04.008.

See Also

parse_dist() and marginalize_lkjcorr() for parsing specs that use the LKJ correlation distribution and the stat_slabinterval() family of stats for visualizing them.

Examples

library(dplyr)
library(ggplot2)

theme_set(theme_ggdist())

expand.grid(
  eta = 1:6,
  K = 2:6
) %>%
  ggplot(aes(y = ordered(eta), dist = "lkjcorr_marginal", arg1 = K, arg2 = eta)) +
  stat_slab() +
  facet_grid(~ paste0(K, "x", K)) +
  scale_y_discrete(limits = rev) +
  labs(
    title = paste0(
      "Marginal correlation for LKJ(eta) prior on different matrix sizes:\n",
      "dlkjcorr_marginal(K, eta)"
    ),
    subtitle = "Correlation matrix size (KxK)",
    y = "eta",
    x = "Marginal correlation"
  ) +
  theme(axis.title = element_text(hjust = 0))

Turn spec for LKJ distribution into spec for marginal LKJ distribution

Description

Turns specs for an LKJ correlation matrix distribution as returned by parse_dist() into specs for the marginal distribution of a single cell in an LKJ-distributed correlation matrix (i.e., lkjcorr_marginal()). Useful for visualizing prior correlations from LKJ distributions.

Usage

marginalize_lkjcorr(
  data,
  K,
  predicate = NULL,
  dist = ".dist",
  args = ".args",
  dist_obj = ".dist_obj"
)

Arguments

data

<data.frame> A data frame containing a column with distribution names (".dist" by default) and a list column of distribution arguments (".args" by default), such as output by parse_dist().

K

<numeric> Dimension of the correlation matrix. Must be greater than or equal to 2.

predicate

<bare language | NULL> Expression for selecting the rows of data to modify. This is useful if data contains more than one row with an LKJ prior in it and you only want to modify some of the distributions; if this is the case, give a predicate expression that evaluates to TRUE on the rows you want to modify.

If NULL (the default), all lkjcorr distributions in data are modified.

dist

<string> The name of the column containing distribution names. See parse_dist().

args

<string> The name of the column containing distribution arguments. See parse_dist().

dist_obj

<string> The name of the output column to contain a distributional object representing the distribution. See parse_dist().

Details

The LKJ(eta) prior on a correlation matrix induces a marginal prior on each correlationin the matrix that depends on both the value ofetaandK, the dimensionof theK \times K correlation matrix. Thus to visualize the marginal prioron the correlations, it is necessary to specify the value ofK, which dependson what your model specification looks like.

Given a data frame representing parsed distribution specifications (suchas returned byparse_dist()), this function updates any rows with.dist == "lkjcorr"so that the first argument to the distribution (stored in.args) is equal to the specified dimensionof the correlation matrix (K), changes the distribution name in.dist to"lkjcorr_marginal",and assigns adistributional object representing this distribution to.dist_obj.This allows the distribution to be easily visualized using thestat_slabinterval()family of ggplot2 stats.

Value

A data frame of the same size and column names as the input, with the dist, args, and dist_obj columns modified on rows where dist == "lkjcorr" such that they represent a marginal LKJ correlation distribution with name lkjcorr_marginal and args having K equal to the input value of K.

See Also

parse_dist(), lkjcorr_marginal()

Examples

library(dplyr)
library(ggplot2)

# Say we have an LKJ(3) prior on a 2x2 correlation matrix. We can visualize
# its marginal distribution as follows...
data.frame(prior = "lkjcorr(3)") %>%
  parse_dist(prior) %>%
  marginalize_lkjcorr(K = 2) %>%
  ggplot(aes(y = prior, xdist = .dist_obj)) +
  stat_halfeye() +
  xlim(-1, 1) +
  xlab("Marginal correlation for LKJ(3) prior on 2x2 correlation matrix")

# Say our prior list has multiple LKJ priors on correlation matrices
# of different sizes, we can supply a predicate expression to select
# only those rows we want to modify
data.frame(coef = c("a", "b"), prior = "lkjcorr(3)") %>%
  parse_dist(prior) %>%
  marginalize_lkjcorr(K = 2, coef == "a") %>%
  marginalize_lkjcorr(K = 4, coef == "b")

Parse distribution specifications into columns of a data frame

Description

Parses simple string distribution specifications, like "normal(0, 1)", into two columns of a data frame, suitable for use with the dist and args aesthetics of stat_slabinterval() and its shortcut stats (like stat_halfeye()). This format is output by brms::get_prior, making it particularly useful for visualizing priors from brms models.

Usage

parse_dist(
  object,
  ...,
  dist = ".dist",
  args = ".args",
  dist_obj = ".dist_obj",
  package = NULL,
  to_r_names = TRUE
)

## Default S3 method:
parse_dist(object, ...)

## S3 method for class 'data.frame'
parse_dist(
  object,
  dist_col,
  ...,
  dist = ".dist",
  args = ".args",
  dist_obj = ".dist_obj",
  package = NULL,
  lb = "lb",
  ub = "ub",
  to_r_names = TRUE
)

## S3 method for class 'character'
parse_dist(
  object,
  ...,
  dist = ".dist",
  args = ".args",
  dist_obj = ".dist_obj",
  package = NULL,
  to_r_names = TRUE
)

## S3 method for class 'factor'
parse_dist(
  object,
  ...,
  dist = ".dist",
  args = ".args",
  dist_obj = ".dist_obj",
  package = NULL,
  to_r_names = TRUE
)

## S3 method for class 'brmsprior'
parse_dist(
  object,
  dist_col = prior,
  ...,
  dist = ".dist",
  args = ".args",
  dist_obj = ".dist_obj",
  package = NULL,
  to_r_names = TRUE
)

r_dist_name(dist_name)

Arguments

object

<character | data.frame> One of:

  • A character vector containing distribution specifications, like c("normal(0,1)", "exp(1)")

  • A data frame with a column containing distribution specifications.

...

Arguments passed to other implementations of parse_dist().

dist

<string> The name of the output column to contain the distribution name.

args

<string> The name of the output column to contain the arguments to the distribution.

dist_obj

<string> The name of the output column to contain a distributional object representing the distribution.

package

<string | environment | NULL> The package or environment to search for distribution functions in. Passed to distributional::dist_wrap(). One of:

  • a string: use the environment for the package with the given name

  • an environment: use the given environment

  • NULL (default): use the calling environment

to_r_names

<scalar logical> If TRUE (the default), certain common aliases for distribution names are automatically translated into names that R can recognize (i.e., names which have functions starting with r, p, q, and d representing random number generators, distribution functions, etc. for that distribution), using the r_dist_name function. For example, "normal" is translated into "norm" and "lognormal" is translated into "lnorm".

dist_col

<bare language> Column or column expression of object that resolves to a character vector of distribution specifications (when object is a data.frame()).

lb

<string> The name of an input column (for data.frame and brms::prior objects) that contains the lower bound of the distribution, which if present will produce a truncated distribution using dist_truncated(). Ignored if object[[lb]] is NULL or if it is NA for the corresponding input row.

ub

<string> The name of an input column (for data.frame and brms::prior objects) that contains the upper bound of the distribution, which if present will produce a truncated distribution using dist_truncated(). Ignored if object[[ub]] is NULL or if it is NA for the corresponding input row.

dist_name

<character> For r_dist_name(), a character vector of distribution names to be translated into distribution names R recognizes. Unrecognized names are left as-is.

Details

parse_dist() can be applied to character vectors or to a data frame + bare column name of the column to parse, and returns a data frame with ".dist" and ".args" columns added. parse_dist() uses r_dist_name() to translate distribution names into names recognized by R.

r_dist_name() takes a character vector of names and translates common names into R distribution names. Names are first made into valid R names using make.names(), then translated (ignoring character case, ".", and "_"). Thus, "lognormal", "LogNormal", "log_normal", "log-Normal", and any number of other variants all get translated into "lnorm".
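The name translation described above can be checked in isolation. This is a small sketch, assuming ggdist is installed; the variable names variants and parsed are illustrative only.

```r
library(ggdist)

# Spelling variants of the same distribution name
variants <- c("lognormal", "LogNormal", "log_normal", "log-Normal")

# Case, ".", and "_" are ignored, so every variant maps to the same R name
r_dist_name(variants)

# parse_dist() applies the same translation to the names it extracts
parsed <- parse_dist(c("normal(0,1)", "lognormal(0, 1)"))
parsed$.dist
parsed$.args
```

Passing to_r_names = FALSE to parse_dist() is the documented way to skip this translation when the original names should be kept.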

Value

See Also

See stat_slabinterval() and its shortcut stats, which can easily make use of the output of this function using the dist and args aesthetics.

Examples

library(dplyr)

# parse dist can operate on strings directly...
parse_dist(c("normal(0,1)", "student_t(3,0,1)"))

# ... or on columns of a data frame, where it adds the
# parsed specs back on as columns
data.frame(prior = c("normal(0,1)", "student_t(3,0,1)")) %>%
  parse_dist(prior)

# parse_dist is particularly useful with the output of brms::prior(),
# which follows the same format as above

Partial colour ramp (datatype)

Description

A representation of a partial ramp between two colours: the origin colour (from) and the distance from the origin colour to the target colour (amount, a value between 0 and 1). The target colour of the ramp can be filled in later using ramp_colours(), producing a colour.

Usage

partial_colour_ramp(amount = double(), from = "white")

Arguments

amount

<numeric> Vector of values between 0 and 1 giving amounts to ramp the colour. 0 corresponds to the colour from.

from

<character> Vector giving colours to ramp from.

Details

This datatype is used by scale_colour_ramp to create ramped colours in ggdist geoms. It is a vctrs::rcrd datatype with two fields: "amount", the amount to ramp, and "from", the colour to ramp from.

Colour ramps can be applied (i.e. translated into colours) using ramp_colours(), which can be used with partial_colour_ramp() to implement geoms that make use of colour_ramp or fill_ramp scales.
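Since the datatype is a vctrs record, its two fields can be read back with vctrs::field(). The sketch below assumes ggdist and vctrs are installed; the variable names pcr and cols are illustrative only.

```r
library(ggdist)

# Four partial ramps that all start from red
pcr <- partial_colour_ramp(c(0, 0.25, 0.75, 1), from = "red")

# The record behaves like a vector with one element per ramp
length(pcr)

# Read back the two documented fields
vctrs::field(pcr, "amount")
vctrs::field(pcr, "from")

# Supplying a target colour turns the partial ramps into actual colours
cols <- ramp_colours("blue", pcr)
cols
```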

Value

A vctrs::rcrd of class "ggdist_partial_colour_ramp" with fields "amount" and "from".

Author(s)

Matthew Kay

See Also

Other colour ramp functions: guide_rampbar(), ramp_colours(), scale_colour_ramp

Examples

pcr = partial_colour_ramp(c(0, 0.25, 0.75, 1), "red")
pcr

ramp_colours("blue", pcr)

Point and interval summaries for tidy data frames of draws from distributions

Description

Translates draws from distributions in a (possibly grouped) data frame into point and interval summaries (or set of point and interval summaries, if there are multiple groups in a grouped data frame).

Supports automatic partial function application.

Usage

point_interval(
  .data,
  ...,
  .width = 0.95,
  .point = median,
  .interval = qi,
  .simple_names = TRUE,
  na.rm = FALSE,
  .exclude = c(".chain", ".iteration", ".draw", ".row"),
  .prob
)

## Default S3 method:
point_interval(
  .data,
  ...,
  .width = 0.95,
  .point = median,
  .interval = qi,
  .simple_names = TRUE,
  na.rm = FALSE,
  .exclude = c(".chain", ".iteration", ".draw", ".row"),
  .prob
)

## S3 method for class 'tbl_df'
point_interval(.data, ...)

## S3 method for class 'numeric'
point_interval(
  .data,
  ...,
  .width = 0.95,
  .point = median,
  .interval = qi,
  .simple_names = FALSE,
  na.rm = FALSE,
  .exclude = c(".chain", ".iteration", ".draw", ".row"),
  .prob
)

## S3 method for class 'rvar'
point_interval(
  .data,
  ...,
  .width = 0.95,
  .point = median,
  .interval = qi,
  .simple_names = TRUE,
  na.rm = FALSE
)

## S3 method for class 'distribution'
point_interval(
  .data,
  ...,
  .width = 0.95,
  .point = median,
  .interval = qi,
  .simple_names = TRUE,
  na.rm = FALSE
)

qi(x, .width = 0.95, .prob, na.rm = FALSE)

ll(x, .width = 0.95, na.rm = FALSE)

ul(x, .width = 0.95, na.rm = FALSE)

hdi(
  x,
  .width = 0.95,
  na.rm = FALSE,
  ...,
  density = density_bounded(trim = TRUE),
  n = 4096,
  .prob
)

Mode(x, na.rm = FALSE, ...)

## Default S3 method:
Mode(
  x,
  na.rm = FALSE,
  ...,
  density = density_bounded(trim = TRUE),
  n = 2001,
  weights = NULL
)

## S3 method for class 'rvar'
Mode(x, na.rm = FALSE, ...)

## S3 method for class 'distribution'
Mode(x, na.rm = FALSE, ...)

hdci(x, .width = 0.95, na.rm = FALSE)

mean_qi(.data, ..., .width = 0.95)

median_qi(.data, ..., .width = 0.95)

mode_qi(.data, ..., .width = 0.95)

mean_ll(.data, ..., .width = 0.95)

median_ll(.data, ..., .width = 0.95)

mode_ll(.data, ..., .width = 0.95)

mean_ul(.data, ..., .width = 0.95)

median_ul(.data, ..., .width = 0.95)

mode_ul(.data, ..., .width = 0.95)

mean_hdi(.data, ..., .width = 0.95)

median_hdi(.data, ..., .width = 0.95)

mode_hdi(.data, ..., .width = 0.95)

mean_hdci(.data, ..., .width = 0.95)

median_hdci(.data, ..., .width = 0.95)

mode_hdci(.data, ..., .width = 0.95)

Arguments

.data

<data.frame | grouped_df> Data frame (or grouped data frame as returned by dplyr::group_by()) that contains draws to summarize.

...

<bare language> Column names or expressions that, when evaluated in the context of .data, represent draws to summarize. If this is empty, then by default all columns that are not group columns and which are not in .exclude (by default ".chain", ".iteration", ".draw", and ".row") will be summarized. These columns can be numeric, distributional objects, posterior::rvars, or list columns of numeric values to summarise.

.width

<numeric> Vector of probabilities to use that determine the widths of the resulting intervals. If multiple probabilities are provided, multiple rows per group are generated, each with a different probability interval (and value of the corresponding .width column).

.point

<function> Point summary function, which takes a vector and returns a single value, e.g. mean, median, or Mode.

.interval

<function> Interval function, which takes a vector and a probability (.width) and returns a two-element vector representing the lower and upper bound of an interval; e.g. qi, hdi

.simple_names

<scalar logical> When TRUE and only a single column / vector is to be summarized, use the name .lower for the lower end of the interval and .upper for the upper end. If .data is a vector and this is TRUE, this will also set the column name of the point summary to .value. When FALSE and .data is a data frame, names the lower and upper intervals for each column x x.lower and x.upper. When FALSE and .data is a vector, uses the naming scheme y, ymin and ymax (for use with ggplot).

na.rm

<scalar logical> Should NA values be stripped before the computation proceeds? If FALSE (the default), any vectors to be summarized that contain NA will result in point and interval summaries equal to NA.

.exclude

<character> Vector of names of columns to be excluded from summarization if no column names are specified to be summarized in .... Default ignores several meta-data column names used in ggdist and tidybayes.

.prob

Deprecated. Use .width instead.

x

<numeric> Vector to summarize (for interval functions: qi(), hdi(), etc)

density

<function | string> For hdi() and Mode(), the kernel density estimator to use, either as a function (e.g. density_bounded, density_unbounded) or as a string giving the suffix to a function that starts with density_ (e.g. "bounded" or "unbounded"). The default, "bounded", uses the bounded density estimator of density_bounded(), which itself estimates the bounds of the distribution, and tends to work well on both bounded and unbounded data.

n

<scalar numeric> For hdi() and Mode(), the number of points to use to estimate highest-density intervals or modes.

weights

<numeric | NULL> For Mode(), an optional vector, which (if not NULL) is of the same length as x and provides weights for each element of x.

Details

If .data is a data frame, then ... is a list of bare names of columns (or expressions derived from columns) of .data, on which the point and interval summaries are derived. Column expressions are processed using the tidy evaluation framework (see rlang::eval_tidy()).

For a column named x, the resulting data frame will have a column named x containing its point summary. If there is a single column to be summarized and .simple_names is TRUE, the output will also contain columns .lower (the lower end of the interval) and .upper (the upper end of the interval). Otherwise, for every summarized column x, the output will contain x.lower (the lower end of the interval) and x.upper (the upper end of the interval). Finally, the output will have a .width column containing the probability for the interval on each output row.
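The two naming schemes for data frame input can be compared side by side. This is a minimal sketch, assuming ggdist is installed; the data frame draws and the result names one_col and two_cols are illustrative only.

```r
library(ggdist)

set.seed(42)
draws <- data.frame(
  x = rnorm(500),
  y = rnorm(500, mean = 2)
)

# One summarized column with the default .simple_names = TRUE:
# the interval ends are named .lower and .upper
one_col <- median_qi(draws, x, .width = c(0.5, 0.95))
names(one_col)

# Two summarized columns: each gets its own <name>.lower / <name>.upper pair
two_cols <- median_qi(draws, x, y)
names(two_cols)
```

Requesting two widths in the first call yields one output row per width, with the width recorded in the .width column.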

If .data includes groups (see e.g. dplyr::group_by()), the points and intervals are calculated within the groups.

If .data is a vector, ... is ignored and the result is a data frame with one row per value of .width and the following columns: y (the point summary), ymin (the lower end of the interval), ymax (the upper end of the interval), and .width, the probability corresponding to the interval. This behavior allows point_interval and its derived functions (like median_qi, mean_qi, mode_hdi, etc) to be easily used to plot intervals in ggplot stats using methods like stat_eye(), stat_halfeye(), or stat_summary().
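The vector case can be sketched as follows, assuming ggdist is installed; the names x and vec_summary are illustrative only.

```r
library(ggdist)

set.seed(123)
x <- rnorm(1000)

# A bare numeric vector is summarized into y / ymin / ymax columns,
# with one row for each requested interval width
vec_summary <- median_qi(x, .width = c(0.5, 0.9))
vec_summary
names(vec_summary)
```

The y, ymin, and ymax names match the aesthetics that ggplot2 summary layers expect, which is why these functions can be handed to such layers directly.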

median_qi, mode_hdi, etc are short forms for point_interval(..., .point = median, .interval = qi), etc.

qi yields the quantile interval (also known as the percentile interval or equi-tailed interval) as a 1x2 matrix.

hdi yields the highest-density interval(s) (also known as the highest posterior density interval). Note: If the distribution is multimodal, hdi may return multiple intervals for each probability level (these will be spread over rows). You may wish to use hdci (below) instead if you want a single highest-density interval, with the caveat that when the distribution is multimodal hdci is not a highest-density interval.

hdci yields the highest-density continuous interval, also known as the shortest probability interval. Note: If the distribution is multimodal, this may not actually be the highest-density interval (there may be a higher-density discontinuous interval, which can be found using hdi).

ll and ul yield lower limits and upper limits, respectively (where the opposite limit is set to either Inf or -Inf).
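The interval functions can also be called directly on a numeric vector, which makes their differences easier to see. This is a sketch on simulated bimodal draws, assuming ggdist is installed; the vector name bimodal is illustrative, and no particular number of hdi rows is assumed.

```r
library(ggdist)

set.seed(1)
# Two well-separated modes
bimodal <- c(rnorm(5000, 0, 1), rnorm(2500, 6, 1))

# Equi-tailed interval: always a single lower/upper pair
qi(bimodal, .width = 0.8)

# Highest-density interval(s): may span several rows on multimodal data
hdi(bimodal, .width = 0.8)

# Highest-density continuous interval: a single row by construction
hdci(bimodal, .width = 0.8)

# One-sided limits: the opposite end is infinite
ll(bimodal, .width = 0.95)
ul(bimodal, .width = 0.95)
```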

Value

A data frame containing point summaries and intervals, with at least one column corresponding to the point summary, one to the lower end of the interval, one to the upper end of the interval, the width of the interval (.width), the type of point summary (.point), and the type of interval (.interval).

Author(s)

Matthew Kay

Examples

library(dplyr)
library(ggplot2)

set.seed(123)

rnorm(1000) %>%
  median_qi()

data.frame(x = rnorm(1000)) %>%
  median_qi(x, .width = c(.50, .80, .95))

data.frame(
    x = rnorm(1000),
    y = rnorm(1000, mean = 2, sd = 2)
  ) %>%
  median_qi(x, y)

data.frame(
    x = rnorm(1000),
    group = "a"
  ) %>%
  rbind(data.frame(
    x = rnorm(1000, mean = 2, sd = 2),
    group = "b")
  ) %>%
  group_by(group) %>%
  median_qi(.width = c(.50, .80, .95))

multimodal_draws = data.frame(
    x = c(rnorm(5000, 0, 1), rnorm(2500, 4, 1))
  )

multimodal_draws %>%
  mode_hdi(.width = c(.66, .95))

multimodal_draws %>%
  ggplot(aes(x = x, y = 0)) +
  stat_halfeye(point_interval = mode_hdi, .width = c(.66, .95))

Dodge overlapping objects side-to-side, preserving justification

Description

A justification-preserving variant of ggplot2::position_dodge() which preserves the vertical position of a geom while adjusting the horizontal position (or vice versa when in a horizontal orientation). Unlike ggplot2::position_dodge(), position_dodgejust() attempts to preserve the "justification" of x positions relative to the bounds containing them (xmin/xmax) (or y positions relative to ymin/ymax when in a horizontal orientation). This makes it useful for dodging annotations to geoms and stats from the geom_slabinterval() family, which also preserve the justification of their intervals relative to their slabs when dodging.

Usage

position_dodgejust(
  width = NULL,
  preserve = c("total", "single"),
  justification = NULL
)

Arguments

width

Dodging width, when different to the width of the individual elements. This is useful when you want to align narrow geoms with wider geoms. See the examples.

preserve

Should dodging preserve the "total" width of all elements at a position, or the width of a "single" element?

justification

<scalar numeric> Justification of the point position (x/y) relative to its bounds (xmin/xmax or ymin/ymax), where 0 indicates bottom/left justification and 1 indicates top/right justification (depending on orientation). This is only used if xmin/xmax/ymin/ymax are not supplied; in that case, justification will be used along with width to determine the bounds of the object prior to dodging.

Examples

library(dplyr)
library(ggplot2)
library(distributional)

dist_df = tribble(
  ~group, ~subgroup, ~mean, ~sd,
  1,          "h",     5,   1,
  2,          "h",     7,   1.5,
  3,          "h",     8,   1,
  3,          "i",     9,   1,
  3,          "j",     7,   1
)

# An example with normal "dodge" positioning
# Notice how dodge points are placed in the center of their bounding boxes,
# which can cause slabs to be positioned outside their bounds.
dist_df %>%
  ggplot(aes(
    x = factor(group), ydist = dist_normal(mean, sd),
    fill = subgroup
  )) +
  stat_halfeye(
    position = "dodge"
  ) +
  geom_rect(
    aes(xmin = group, xmax = group + 1, ymin = 2, ymax = 13, color = subgroup),
    position = "dodge",
    data = . %>% filter(group == 3),
    alpha = 0.1
  ) +
  geom_point(
    aes(x = group, y = 7.5, color = subgroup),
    position = position_dodge(width = 1),
    data = . %>% filter(group == 3),
    shape = 1,
    size = 4,
    stroke = 1.5
  ) +
  scale_fill_brewer(palette = "Set2") +
  scale_color_brewer(palette = "Dark2")

# This same example with "dodgejust" positioning. For the points we
# supply a justification parameter to position_dodgejust which mimics the
# justification parameter of stat_halfeye, ensuring that they are
# placed appropriately. On slabinterval family geoms, position_dodgejust()
# will automatically detect the appropriate justification.
dist_df %>%
  ggplot(aes(
    x = factor(group), ydist = dist_normal(mean, sd),
    fill = subgroup
  )) +
  stat_halfeye(
    position = "dodgejust"
  ) +
  geom_rect(
    aes(xmin = group, xmax = group + 1, ymin = 2, ymax = 13, color = subgroup),
    position = "dodgejust",
    data = . %>% filter(group == 3),
    alpha = 0.1
  ) +
  geom_point(
    aes(x = group, y = 7.5, color = subgroup),
    position = position_dodgejust(width = 1, justification = 0),
    data = . %>% filter(group == 3),
    shape = 1,
    size = 4,
    stroke = 1.5
  ) +
  scale_fill_brewer(palette = "Set2") +
  scale_color_brewer(palette = "Dark2")

Apply partial colour ramps

Description

Given vectors of colours and partial_colour_ramps, ramps the colours according to the parameters of the partial colour ramps, returning a vector of the same length as the inputs giving the transformed (ramped) colours.

Usage

ramp_colours(colour, ramp)

Arguments

colour

<character> Vector of colours to ramp to.

ramp

<partial_colour_ramp> Vector of colour ramps (same length as colour) giving the colour to ramp from and the amount to ramp.

Details

Takes vectors of colours and partial_colour_ramps and produces colours by interpolating between each from colour and the target colour the specified amount (where amount and from are the corresponding fields of the ramp).

For example, to add support for the fill_ramp aesthetic to a geometry, this line could be used inside the draw_group() or draw_panel() method of a geom:

data$fill = ramp_colours(data$fill, data$fill_ramp)
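The endpoints of the interpolation can be checked outside of a geom. This is a small sketch, assuming ggdist is installed; the names ends and out are illustrative, and the comparison is made in RGB space with col2rgb() because the exact string format of the returned colours is not assumed here.

```r
library(ggdist)

# amount = 0 should give the `from` colour, amount = 1 the target colour
ends <- partial_colour_ramp(c(0, 1), from = "red")
out <- ramp_colours("blue", ends)

# Convert both to RGB triples for comparison
col2rgb(out)
col2rgb(c("red", "blue"))
```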

Value

A character vector of colours.

Author(s)

Matthew Kay

See Also

Other colour ramp functions: guide_rampbar(), partial_colour_ramp(), scale_colour_ramp

Examples

pcr = partial_colour_ramp(c(0, 0.25, 0.75, 1), "red")
pcr

ramp_colours("blue", pcr)

Secondary color scale that ramps from another color (ggplot2 scale)

Description

This scale creates a secondary scale that modifies the fill or color scale of geoms that support it (geom_lineribbon() and geom_slabinterval()) to "ramp" from a secondary color (by default white) to the primary fill color (determined by the standard color or fill aesthetics). It uses the partial_colour_ramp() data type.

Usage

scale_colour_ramp_continuous(
  from = "white",
  ...,
  limits = function(l) c(min(0, l[[1]]), l[[2]]),
  range = c(0, 1),
  guide = "legend",
  aesthetics = "colour_ramp"
)

scale_color_ramp_continuous(
  from = "white",
  ...,
  limits = function(l) c(min(0, l[[1]]), l[[2]]),
  range = c(0, 1),
  guide = "legend",
  aesthetics = "colour_ramp"
)

scale_colour_ramp_discrete(
  from = "white",
  ...,
  range = c(0.2, 1),
  aesthetics = "colour_ramp"
)

scale_color_ramp_discrete(
  from = "white",
  ...,
  range = c(0.2, 1),
  aesthetics = "colour_ramp"
)

scale_fill_ramp_continuous(..., aesthetics = "fill_ramp")

scale_fill_ramp_discrete(..., aesthetics = "fill_ramp")

Arguments

from

<string> The color to ramp from. Corresponds to 0 on the scale.

...

Arguments passed to underlying scale or guide functions. E.g. scale_colour_ramp_discrete() passes arguments to discrete_scale(), scale_colour_ramp_continuous() passes arguments to continuous_scale(). See those functions for more details.

limits

One of:

  • NULL to use the default scale range

  • A numeric vector of length two providing limits of the scale. Use NA to refer to the existing minimum or maximum

  • A function that accepts the existing (automatic) limits and returns new limits. Also accepts rlang lambda function notation. Note that setting limits on positional scales will remove data outside of the limits. If the purpose is to zoom, use the limit argument in the coordinate system (see coord_cartesian()).

range

<length-2 numeric> Minimum and maximum values after the scale transformation. These values should be between 0 (the from color) and 1 (the color determined by the fill aesthetic).

guide

<Guide | string> A function used to create a guide or its name. For scale_colour_ramp_continuous() and scale_fill_ramp_continuous(), guide_rampbar() can be used to create gradient color bars. See guides() for information on other guides.

aesthetics

<character> Names of aesthetics to set scales for.

Details

These scales transform data into partial_colour_ramps. Each partial_colour_ramp is a pair of two values: a from colour and a numeric amount between 0 and 1 representing a distance between from and the target color (where 0 indicates the from color and 1 the target color).

The target color is determined by the corresponding aesthetic: for example, the colour_ramp aesthetic creates ramps between from and whatever the value of the colour aesthetic is; the fill_ramp aesthetic creates ramps between from and whatever the value of the fill aesthetic is. When the colour_ramp aesthetic is set, ggdist geometries will modify their colour by applying the colour ramp between from and colour (and similarly for fill_ramp and fill).

Colour ramps can be applied (i.e. translated into colours) using ramp_colours(), which can be used with partial_colour_ramp() to implement geoms that make use of colour_ramp or fill_ramp scales.

Value

A ggplot2::Scale representing a scale for the colour_ramp and/or fill_ramp aesthetics for ggdist geoms. Can be added to a ggplot() object.

Author(s)

Matthew Kay

See Also

Other ggdist scales: scale_side_mirrored(), scale_thickness, sub-geometry-scales

Other colour ramp functions: guide_rampbar(), partial_colour_ramp(), ramp_colours()

Examples

library(dplyr)
library(ggplot2)
library(distributional)

tibble(d = dist_uniform(0, 1)) %>%
  ggplot(aes(y = 0, xdist = d)) +
  stat_slab(aes(fill_ramp = after_stat(x)))

tibble(d = dist_uniform(0, 1)) %>%
  ggplot(aes(y = 0, xdist = d)) +
  stat_slab(aes(fill_ramp = after_stat(x)), fill = "blue") +
  scale_fill_ramp_continuous(from = "red")

# you can invert the order of `range` to change the order of the blend
tibble(d = dist_normal(0, 1)) %>%
  ggplot(aes(y = 0, xdist = d)) +
  stat_slab(aes(fill_ramp = after_stat(cut_cdf_qi(cdf))), fill = "blue") +
  scale_fill_ramp_discrete(from = "red", range = c(1, 0))

Side scale for mirrored slabs (ggplot2 scale)

Description

This scale creates mirrored slabs for the side aesthetic of the geom_slabinterval() and geom_dotsinterval() family of geoms and stats. It works on discrete variables of two or three levels.

Usage

scale_side_mirrored(start = "topright", ..., aesthetics = "side")

Arguments

start

<string> The side to start from. Can be any valid value of the side aesthetic except "both".

...

Arguments passed on to ggplot2::discrete_scale

scale_name

[Deprecated] The name of the scale that should be used for error messages associated with this scale.

palette

A palette function that when called with a single integer argument (the number of levels in the scale) returns the values that they should take (e.g., scales::pal_hue()).

name

The name of the scale. Used as the axis or legend title. If waiver(), the default, the name of the scale is taken from the first mapping used for that aesthetic. If NULL, the legend title will be omitted.

breaks

One of:

  • NULL for no breaks

  • waiver() for the default breaks (the scale limits)

  • A character vector of breaks

  • A function that takes the limits as input and returns breaks as output. Also accepts rlang lambda function notation.

labels

One of:

  • NULL for no labels

  • waiver() for the default labels computed by the transformation object

  • A character vector giving labels (must be same length as breaks)

  • An expression vector (must be the same length as breaks). See ?plotmath for details.

  • A function that takes the breaks as input and returns labels as output. Also accepts rlang lambda function notation.

limits

One of:

  • NULL to use the default scale values

  • A character vector that defines possible values of the scale and their order

  • A function that accepts the existing (automatic) values and returns new ones. Also accepts rlang lambda function notation.

expand

For position scales, a vector of range expansion constants used to add some padding around the data to ensure that they are placed some distance away from the axes. Use the convenience function expansion() to generate the values for the expand argument. The defaults are to expand the scale by 5% on each side for continuous variables, and by 0.6 units on each side for discrete variables.

na.translate

Unlike continuous scales, discrete scales can easily show missing values, and do so by default. If you want to remove missing values from a discrete scale, specify na.translate = FALSE.

na.value

If na.translate = TRUE, what aesthetic value should the missing values be displayed as? Does not apply to position scales where NA is always placed at the far right.

drop

Should unused factor levels be omitted from the scale? The default, TRUE, uses the levels that appear in the data; FALSE includes the levels in the factor. Please note that to display every level in a legend, the layer should use show.legend = TRUE.

guide

A function used to create a guide or its name. See guides() for more information.

position

For position scales, the position of the axis. left or right for y axes, top or bottom for x axes.

call

The call used to construct the scale for reporting messages.

super

The super class to use for the constructed scale

aesthetics

<character> Names of aesthetics to set scales for.

Value

A ggplot2::Scale representing a scale for the side aesthetic for ggdist geoms. Can be added to a ggplot() object.

Author(s)

Matthew Kay

See Also

Other ggdist scales: scale_colour_ramp, scale_thickness, sub-geometry-scales

Examples

library(dplyr)
library(ggplot2)

set.seed(1234)
data.frame(
  x = rnorm(400, c(1,4)),
  g = c("a","b")
) %>%
  ggplot(aes(x, fill = g, side = g)) +
  geom_weave(linewidth = 0, scale = 0.5) +
  scale_side_mirrored()

Slab thickness scale (ggplot2 scale)

Description

This ggplot2 scale linearly scales all thickness values of geoms that support the thickness aesthetic (such as geom_slabinterval()). It can be used to align the thickness scales across multiple geoms (by default, thickness is normalized on a per-geom level instead of as a global scale). For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.

Usage

scale_thickness_shared(
  name = waiver(),
  breaks = waiver(),
  labels = waiver(),
  limits = function(l) c(min(0, l[[1]]), l[[2]]),
  renormalize = FALSE,
  oob = scales::oob_keep,
  guide = "none",
  expand = c(0, 0),
  ...
)

scale_thickness_identity(..., guide = "none")

Arguments

name

The name of the scale. Used as the axis or legend title. If waiver(), the default, the name of the scale is taken from the first mapping used for that aesthetic. If NULL, the legend title will be omitted.

breaks

One of:

  • NULL for no breaks

  • waiver() for the default breaks computed by the transformation object

  • A numeric vector of positions

  • A function that takes the limits as input and returns breaks as output (e.g., a function returned by scales::extended_breaks()). Note that for position scales, limits are provided after scale expansion. Also accepts rlang lambda function notation.

labels

One of:

  • NULL for no labels

  • waiver() for the default labels computed by the transformation object

  • A character vector giving labels (must be same length as breaks)

  • An expression vector (must be the same length as breaks). See ?plotmath for details.

  • A function that takes the breaks as input and returns labels as output. Also accepts rlang lambda function notation.

limits

One of:

  • NULL to use the default scale range

  • A numeric vector of length two providing limits of the scale. Use NA to refer to the existing minimum or maximum

  • A function that accepts the existing (automatic) limits and returns new limits. Also accepts rlang lambda function notation. Note that setting limits on positional scales will remove data outside of the limits. If the purpose is to zoom, use the limit argument in the coordinate system (see coord_cartesian()).

renormalize

<scalar logical> When mapping values to the thickness scale, should those values be allowed to be renormalized by geoms (e.g. via the normalize parameter to geom_slabinterval())? The default is FALSE: if scale_thickness_shared() is in use, the geom-specific normalize parameter is ignored (this is achieved by flagging values as already normalized by wrapping them in thickness()). Set this to TRUE to allow geoms to also apply their own normalization. Note that if you set renormalize to TRUE, subguides created via the subguide parameter to geom_slabinterval() will display the scaled values output by this scale, not the original data values.

oob

One of:

guide

A function used to create a guide or its name. See guides() for more information.

expand

<numeric> Vector of limit expansion constants of length 2 or 4, following the same format used by the expand argument of continuous_scale(). The default is not to expand the limits. You can use the convenience function expansion() to generate the expansion values; expanding the lower limit is usually not recommended (because with most thickness scales the lower limit is the baseline and represents 0), so a typical usage might be something like expand = expansion(c(0, 0.05)) to expand the top end of the scale by 5%.

...

Arguments passed on to ggplot2::continuous_scale

aesthetics

The names of the aesthetics that this scale works with.

scale_name

[Deprecated] The name of the scale that should be used for error messages associated with this scale.

palette

A palette function that when called with a numeric vector with values between 0 and 1 returns the corresponding output values (e.g., scales::pal_area()).

minor_breaks

One of:

  • NULL for no minor breaks

  • waiver() for the default breaks (one minor break between each major break)

  • A numeric vector of positions

  • A function that given the limits returns a vector of minor breaks. Also accepts rlang lambda function notation. When the function has two arguments, it will be given the limits and major breaks.

n.breaks

An integer guiding the number of major breaks. The algorithm may choose a slightly different number to ensure nice break labels. Will only have an effect if breaks = waiver(). Use NULL to use the default number of breaks given by the transformation.

rescaler

A function used to scale the input values to the range [0, 1]. This is always scales::rescale(), except for diverging and n colour gradients (i.e., scale_colour_gradient2(), scale_colour_gradientn()). The rescaler is ignored by position scales, which always use scales::rescale(). Also accepts rlang lambda function notation.

na.value

Missing values will be replaced with this value.

transform

For continuous scales, the name of a transformation object or the object itself. Built-in transformations include "asn", "atanh", "boxcox", "date", "exp", "hms", "identity", "log", "log10", "log1p", "log2", "logit", "modulus", "probability", "probit", "pseudo_log", "reciprocal", "reverse", "sqrt" and "time".

A transformation object bundles together a transform, its inverse, and methods for generating breaks and labels. Transformation objects are defined in the scales package, and are called transform_<name>. If transformations require arguments, you can call them from the scales package, e.g. scales::transform_boxcox(p = 2). You can create your own transformation with scales::new_transform().

trans

[Deprecated] Deprecated in favour of transform.

position

For position scales, the position of the axis: left or right for y axes, top or bottom for x axes.

call

The call used to construct the scale for reporting messages.

super

The super class to use for the constructed scale

Details

By default, normalization/scaling of slab thicknesses is controlled by geometries, not by a ggplot2 scale function. This allows various functionality not otherwise possible, such as (1) allowing different geometries to have different thickness scales and (2) allowing the user to control at what level of aggregation (panels, groups, the entire plot, etc) thickness scaling is done via the normalize parameter to geom_slabinterval().

However, this default approach has one drawback: two different geoms will always have their own scaling of thickness. scale_thickness_shared() offers an alternative approach: when added to a chart, all geoms will use the same thickness scale, and geom-level normalization (via their normalize parameters) is ignored. This is achieved by "marking" thickness values as already normalized by wrapping them in the thickness() data type (this can be disabled by setting renormalize = TRUE).

Note: while a slightly more typical name for scale_thickness_shared() might be scale_thickness_continuous(), the latter name would cause this scale to be applied to all thickness aesthetics by default according to the rules ggplot2 uses to find default scales. Thus, to retain the usual behavior of stat_slabinterval() (per-geom normalization of thickness), this scale is called scale_thickness_shared().

Value

A ggplot2::Scale representing a scale for the thickness aesthetic for ggdist geoms. Can be added to a ggplot() object.

Author(s)

Matthew Kay

See Also

The thickness datatype.

The thickness aesthetic of geom_slabinterval().

subscale_thickness(), for setting a thickness sub-scale within a single geom_slabinterval().

Other ggdist scales: scale_colour_ramp, scale_side_mirrored(), sub-geometry-scales

Examples

library(ggdist)
library(distributional)
library(ggplot2)
library(dplyr)

prior_post = data.frame(
  prior = dist_normal(0, 1),
  posterior = dist_normal(0.1, 0.5)
)

# By default, separate geoms have their own thickness scales, which means
# distributions plotted using two separate geoms will not have their slab
# functions drawn on the same scale (thus here, the two distributions have
# different areas under their density curves):
prior_post %>%
  ggplot() +
  stat_halfeye(aes(xdist = posterior)) +
  stat_slab(aes(xdist = prior), fill = NA, color = "red")

# For this kind of prior/posterior chart, it makes more sense to have the
# densities on the same scale; thus, the areas under both would be the same.
# We can do that using scale_thickness_shared():
prior_post %>%
  ggplot() +
  stat_halfeye(aes(xdist = posterior)) +
  stat_slab(aes(xdist = prior), fill = NA, color = "#e41a1c") +
  scale_thickness_shared()

Smooth dot positions in a dotplot using a kernel density estimator ("density dotplots")

Description

Smooths x values using a density estimator, returning new x of the same length. Can be used with a dotplot (e.g. geom_dots(smooth = ...)) to create "density dotplots".

Supports automatic partial function application with waived arguments.

Usage

smooth_bounded(
  x,
  density = "bounded",
  bounds = c(NA, NA),
  bounder = "cooke",
  trim = FALSE,
  ...
)

smooth_unbounded(x, density = "unbounded", trim = FALSE, ...)

Arguments

x

<numeric> Values to smooth.

density

<function | string> Density estimator to use for smoothing. One of:

  • A function which takes a numeric vector and returns a list with elements x (giving grid points for the density estimator) and y (the corresponding densities). ggdist provides a family of functions following this format, including density_unbounded() and density_bounded().

  • A string giving the suffix of a function name that starts with "density_"; e.g. "bounded" for density_bounded().

bounds

<length-2 numeric> Min and max bounds. If a bound is NA, then that bound is estimated from the data using the method specified by bounder.

bounder

<function | string> Method to use to find missing (NA) bounds. A function that takes a numeric vector of values and returns a length-2 vector of the estimated lower and upper bound of the distribution. Can also be a string giving the suffix of the name of such a function that starts with "bounder_". Useful values include:

  • "cdf": Use the CDF of the minimum and maximum order statistics of the sample to estimate the bounds. See bounder_cdf().

  • "cooke": Use the method from Cooke (1979); i.e. method 2.3 from Loh (1984). See bounder_cooke().

  • "range": Use the range of x (i.e. the min and max). See bounder_range().

trim

<scalar logical> Passed to density: Should the density estimate be trimmed to the range of the data? Default FALSE.

...

Arguments passed to the density estimator specified by density.

Details

Applies a kernel density estimator (KDE) to x, then uses weighted quantiles of the KDE to generate a new set of x values with smoothed values. Plotted using a dotplot (e.g. geom_dots(smooth = "bounded") or geom_dots(smooth = smooth_bounded(...))), these values create a variation on a "density dotplot" (Zvinca 2018).

Such plots are recommended only for very large sample sizes, where precise positions of individual values are not particularly meaningful. In small samples, normal dotplots should generally be used.

Two variants are supplied by default:

  • smooth_bounded(), which uses density_bounded()

  • smooth_unbounded(), which uses density_unbounded()

It is generally recommended to pick the smooth based on the known bounds of your data, e.g. by using smooth_bounded() with the bounds parameter if there are finite bounds, or smooth_unbounded() if both bounds are infinite.
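The procedure described in these details (fit a KDE, then take evenly spaced quantiles of it) can be sketched in base R. This is an illustrative approximation under stated assumptions, not ggdist's implementation: smooth_sketch() is a hypothetical helper, the KDE comes from stats::density() with its defaults, and the quantiles are found by interpolating a cumulative sum over the density grid.

```r
# Hypothetical helper illustrating the idea; NOT ggdist's implementation.
smooth_sketch <- function(x, n_grid = 512) {
  # Kernel density estimate on a regular grid (default Gaussian kernel)
  d <- stats::density(x, n = n_grid)

  # Approximate the CDF of the KDE by a normalized cumulative sum
  cdf <- cumsum(d$y)
  cdf <- cdf / cdf[length(cdf)]

  # One evenly spaced probability per observation
  p <- stats::ppoints(length(x), a = 0.5)

  # Invert the approximate CDF by linear interpolation
  q <- stats::approx(cdf, d$x, xout = p, ties = "ordered", rule = 2)$y

  # Hand the smoothed values back in the rank order of the original data,
  # so the i-th output corresponds to the i-th input
  q[rank(x, ties.method = "first")]
}

set.seed(1234)
x <- rnorm(1000)
x_smooth <- smooth_sketch(x)
```

The output has the same length and the same ordering as the input, which is the property the dotplot relies on; only the positions change.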

Value

A numeric vector of length(x), where each entry is a smoothed version of the corresponding entry in x.

If x is missing, returns a partial application of itself. See automatic-partial-functions.

References

Zvinca, Daniel. "In the pursuit of diversity in data visualization. Jittering data to access details." https://www.linkedin.com/pulse/pursuit-diversity-data-visualization-jittering-access-daniel-zvinca/.

See Also

Other dotplot smooths: smooth_discrete(), smooth_none()

Examples

library(ggdist)
library(ggplot2)

set.seed(1234)
x = rnorm(1000)

# basic dotplot is noisy
ggplot(data.frame(x), aes(x)) +
  geom_dots()

# density dotplot is smoother, but does move points (most noticeable
# in areas of low density)
ggplot(data.frame(x), aes(x)) +
  geom_dots(smooth = "unbounded")

# you can adjust the kernel and bandwidth...
ggplot(data.frame(x), aes(x)) +
  geom_dots(smooth = smooth_unbounded(kernel = "triangular", adjust = 0.5))

# for bounded data, you should use the bounded smoother
x_beta = rbeta(1000, 0.5, 0.5)

ggplot(data.frame(x_beta), aes(x_beta)) +
  geom_dots(smooth = smooth_bounded(bounds = c(0, 1)))

Smooth dot positions in a dotplot of discrete values ("bar dotplots")

Description

Note: Better-looking bar dotplots are typically easier to achieve using layout = "bar" with the geom_dotsinterval() family instead of smooth = "bar" or smooth = "discrete".

Smooths x values where x is presumed to be discrete, returning a new x of the same length. Both smooth_discrete() and smooth_bar() use the resolution() of the data to apply smoothing around unique values in the dataset; smooth_discrete() uses a kernel density estimator and smooth_bar() places values in an evenly-spaced grid. Can be used with a dotplot (e.g. geom_dots(smooth = ...)) to create "bar dotplots".

Supports automatic partial function application with waived arguments.

Usage

smooth_discrete(
  x,
  kernel = c("rectangular", "gaussian", "epanechnikov", "triangular", "biweight",
    "cosine", "optcosine"),
  width = 0.7,
  ...
)

smooth_bar(x, width = 0.7, ...)

Arguments

x

<numeric> Values to smooth.

kernel

<string> The smoothing kernel to be used. This must partially match one of "gaussian", "rectangular", "triangular", "epanechnikov", "biweight", "cosine", or "optcosine". See stats::density().

width

<scalar numeric> Approximate width of the bars as a fraction of data resolution().

...

Additional parameters; smooth_discrete() passes these to smooth_unbounded() and thereby to density_unbounded(); smooth_bar() ignores them.

Details

smooth_discrete() applies a kernel density estimator (default: rectangular) to x. It automatically sets the bandwidth to be such that the kernel's width (for each kernel type) is approximately width times the resolution() of the data. This means it essentially creates smoothed bins around each unique value. It calls down to smooth_unbounded().

smooth_bar() generates an evenly-spaced grid of values spanning +/- width/2 around each unique value in x.
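As a rough illustration of the grid placement that smooth_bar() is described as doing, the following base R sketch spreads the observations at each unique value evenly across a bar. It is an approximation under stated assumptions, not ggdist's code: bar_sketch() is a hypothetical helper, the data resolution is taken to be the smallest gap between distinct values, and the exact spacing rule (midpoints of equal sub-intervals) is a guess.

```r
# Hypothetical helper illustrating the idea; NOT ggdist's implementation.
bar_sketch <- function(x, width = 0.7) {
  x <- as.numeric(x)
  ux <- sort(unique(x))

  # Assumption: data resolution = smallest gap between distinct values
  res <- if (length(ux) > 1) min(diff(ux)) else 1

  out <- x
  for (v in ux) {
    idx <- which(x == v)
    # Evenly spaced offsets strictly inside (-0.5, 0.5), one per observation
    offsets <- stats::ppoints(length(idx), a = 0.5) - 0.5
    # Scale offsets so each bar spans +/- (width * res) / 2 around v
    out[idx] <- v + offsets * width * res
  }
  out
}

set.seed(1234)
x <- rpois(1000, 2)
x_bar <- bar_sketch(x)
```

Every output stays within half a bar width of its original value, so rounding the result recovers the input counts.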

Value

A numeric vector of length(x), where each entry is a smoothed version of the corresponding entry in x.

If x is missing, returns a partial application of itself. See automatic-partial-functions.

See Also

Other dotplot smooths: smooth_density, smooth_none()

Examples

library(ggdist)
library(ggplot2)

set.seed(1234)
x = rpois(1000, 2)

# automatic binwidth in basic dotplot on large counts in discrete
# distributions is very small
ggplot(data.frame(x), aes(x)) +
  geom_dots()

# NOTE: It is now recommended to use layout = "bar" instead of
# smooth = "discrete" or smooth = "bar"; the latter are retained because
# they can sometimes be useful in combination with other layouts for
# more specialized (but finicky) applications.
ggplot(data.frame(x), aes(x)) +
  geom_dots(layout = "bar")

# smooth_discrete() constructs wider bins of dots
ggplot(data.frame(x), aes(x)) +
  geom_dots(smooth = "discrete")

# smooth_bar() is an alternative approach to rectangular layouts
ggplot(data.frame(x), aes(x)) +
  geom_dots(smooth = "bar")

# adjust the shape by changing the kernel or the width. epanechnikov
# works well with side = "both"
ggplot(data.frame(x), aes(x)) +
  geom_dots(smooth = smooth_discrete(kernel = "epanechnikov", width = 0.8), side = "both")

Apply no smooth to a dotplot

Description

Default smooth for dotplots: no smooth. Simply returns the input values.

Supports automatic partial function application with waived arguments.

Usage

smooth_none(x, ...)

Arguments

x

<numeric> Values to smooth.

...

ignored

Details

This is the default value for the smooth argument of geom_dotsinterval().

Value

x

If x is missing, returns a partial application of itself. See automatic-partial-functions.

See Also

Other dotplot smooths: smooth_density, smooth_discrete()


CCDF bar plot (shortcut stat)

Description

Shortcut version of stat_slabinterval() with geom_slabinterval() for creating CCDF bar plots.

Roughly equivalent to:

stat_slabinterval(
  aes(
    thickness = after_stat(thickness(1 - cdf, 0, 1)),
    justification = after_stat(0.5),
    side = after_stat("topleft")
  ),
  normalize = "none",
  expand = TRUE
)
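The mapping thickness = 1 - cdf plots the complementary CDF, i.e. the probability of exceeding each value. A minimal base R sketch of that quantity for sample data follows. It assumes an empirical CDF from stats::ecdf() evaluated on an arbitrary 101-point grid; the cdf variable that the stat itself computes may be derived differently, so treat this as illustration only.

```r
set.seed(1234)
value <- rnorm(500, mean = 5, sd = 1)

# Grid of x positions at which the bar height is evaluated
grid <- seq(min(value), max(value), length.out = 101)

# Empirical CDF and its complement on that grid
cdf <- stats::ecdf(value)(grid)
ccdf <- 1 - cdf

head(data.frame(x = grid, cdf = cdf, ccdf = ccdf))
```

Because the values already lie in [0, 1], no further normalization is needed, which is consistent with normalize = "none" in the shortcut.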

Usage

stat_ccdfinterval(
  mapping = NULL,
  data = NULL,
  geom = "slabinterval",
  position = "identity",
  ...,
  normalize = "none",
  expand = TRUE,
  p_limits = c(NA, NA),
  density = "bounded",
  adjust = waiver(),
  trim = waiver(),
  breaks = waiver(),
  align = waiver(),
  outline_bars = waiver(),
  point_interval = "median_qi",
  limits = NULL,
  n = waiver(),
  .width = c(0.66, 0.95),
  orientation = NA,
  na.rm = FALSE,
  show.legend = c(size = FALSE),
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)

Arguments

mapping

Set of aesthetic mappings created by aes(). If specified and inherit.aes = TRUE (the default), it is combined with the default mapping at the top level of the plot. You must supply mapping if there is no plot mapping.

data

The data to be displayed in this layer. There are three options:

If NULL, the default, the data is inherited from the plot data as specified in the call to ggplot().

A data.frame, or other object, will override the plot data. All objects will be fortified to produce a data frame. See fortify() for which variables will be created.

A function will be called with a single argument, the plot data. The return value must be a data.frame, and will be used as the layer data. A function can be created from a formula (e.g. ~ head(.x, 10)).

geom

<Geom | string> Use to override the default connection between stat_ccdfinterval() and geom_slabinterval()

position

<Position | string> Position adjustment, either as a string, or the result of a call to a position adjustment function. Setting this equal to "dodge" (position_dodge()) or "dodgejust" (position_dodgejust()) can be useful if you have overlapping geometries.

...

Other arguments passed to layer(). These are often aesthetics, used to set an aesthetic to a fixed value, like colour = "red" or linewidth = 3 (see Aesthetics, below). They may also be parameters to the paired geom/stat. When paired with the default geom, geom_slabinterval(), these include:

subscale

<function | string> Sub-scale used to scale values of the thickness aesthetic within the groups determined by normalize. One of:

  • A function that takes an x argument giving a numeric vector of values to be scaled and then returns a thickness vector representing the scaled values, such as subscale_thickness() or subscale_identity().

  • A string giving the name of such a function when prefixed with "subscale_"; e.g. "thickness" or "identity". The value "thickness" uses the default subscale, which can be modified by setting subscale_thickness; see the documentation for that function.

For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.

fill_type

<string> What type of fill to use when the fill color or alpha varies within a slab. One of:

  • "segments": breaks up the slab geometry into segments for each unique combination of fill color and alpha value. This approach is supported by all graphics devices and works well for sharp cutoff values, but can give ugly results if a large number of unique fill colors are being used (as in gradients, like in stat_gradientinterval()).

  • "gradient": a grid::linearGradient() is used to create a smooth gradient fill. This works well for large numbers of unique fill colors, but requires R >= 4.1 and is not yet supported on all graphics devices. As of this writing, the png() graphics device with type = "cairo", the svg() device, the pdf() device, and the ragg::agg_png() devices are known to support this option. On R < 4.1, this option will fall back to fill_type = "segments" with a message.

  • "auto": attempts to use fill_type = "gradient" if support for it can be auto-detected. On R >= 4.2, support for gradients can be auto-detected on some graphics devices; if support is not detected, this option will fall back to fill_type = "segments" (in case of a false negative, fill_type = "gradient" can be set explicitly). On R < 4.2, support for gradients cannot be auto-detected, so this will always fall back to fill_type = "segments", in which case you can set fill_type = "gradient" explicitly if you are using a graphics device that supports gradients.

interval_size_domain

<length-2 numeric> Minimum and maximum of the values of the size and linewidth aesthetics that will be translated into actual sizes for intervals drawn according to interval_size_range (see the documentation for that argument).

interval_size_range

<length-2 numeric> This geom scales the raw size aesthetic values when drawing interval and point sizes, as they tend to be too thick when using the default settings of scale_size_continuous(), which give sizes with a range of c(1, 6). The interval_size_domain value indicates the input domain of raw size values (typically this should be equal to the value of the range argument of the scale_size_continuous() function), and interval_size_range indicates the desired output range of the size values (the min and max of the actual sizes used to draw intervals). Most of the time it is not recommended to change the value of this argument, as it may result in strange scaling of legends; this argument is a holdover from earlier versions that did not have size aesthetics targeting the point and interval separately. If you want to adjust the size of the interval or points separately, you can also use the linewidth or point_size aesthetics; see sub-geometry-scales.

fatten_point

<scalar numeric> A multiplicative factor used to adjust the size of the point relative to the size of the thickest interval line. If you wish to specify point sizes directly, you can also use the point_size aesthetic and scale_point_size_continuous() or scale_point_size_discrete(); sizes specified with that aesthetic will not be adjusted using fatten_point.

arrow

<arrow | NULL> Type of arrow heads to use on the interval, or NULL for no arrows.

subguide

<function | string> Sub-guide used to annotate the thickness scale. One of:

  • A function that takes a scale argument giving a ggplot2::Scale object and an orientation argument giving the orientation of the geometry and then returns a grid::grob that will draw the axis annotation, such as subguide_axis() (to draw a traditional axis) or subguide_none() (to draw no annotation). See subguide_axis() for a list of possibilities and examples.

  • A string giving the name of such a function when prefixed with "subguide_"; e.g. "axis" or "none". The values "slab", "dots", and "spike" use the default subguide for their geom families (no subguide), which can be modified by setting subguide_slab, subguide_dots, or subguide_spike; see the documentation for those functions.

normalize

<string> Groups within which to scale values of the thickness aesthetic. One of:

  • "all": normalize so that the maximum height across all data is 1.

  • "panels": normalize within panels so that the maximum height in each panel is 1.

  • "xy": normalize within the x/y axis opposite the orientation of this geom so that the maximum height at each value of the opposite axis is 1.

  • "groups": normalize within values of the opposite axis and within each group so that the maximum height in each group is 1.

  • "none": values are taken as is with no normalization (this should probably only be used with functions whose values are in [0,1], such as CDFs).

For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.
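To make the difference between the normalize options concrete, here is a small base R sketch that rescales made-up slab heights so that the maximum is 1 either across all data or within each group. The data frame and column names are hypothetical, and the sketch ignores panels and sub-scales; it only illustrates the "maximum height is 1" rule stated in the list.

```r
# Hypothetical slab heights for two groups
d <- data.frame(
  group = rep(c("a", "b"), each = 3),
  density = c(0.1, 0.4, 0.2, 0.2, 0.8, 0.3)
)

# Analogue of normalize = "all": one maximum shared by every slab
d$thickness_all <- d$density / max(d$density)

# Analogue of normalize = "groups": each group gets its own maximum
d$thickness_groups <- d$density / ave(d$density, d$group, FUN = max)

d
```

With shared normalization, group "a" peaks at half the height of group "b"; with per-group normalization, both peak at 1, so relative heights across groups are no longer comparable.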

expand

<logical> For sample data, should the slab be expanded to the limits of the scale? Default FALSE in stat_slabinterval(); stat_ccdfinterval() sets this to TRUE (see Usage). Can be a length-two logical vector to control expansion to the lower and upper limit respectively.

p_limits

<length-2 numeric> Probability limits. Used to determine the lower and upper limits of analytical distributions (distributions from samples ignore this parameter and determine their limits based on the limits of the sample and the value of the trim parameter). E.g., if this is c(.001, .999), then a slab is drawn for the distribution from the quantile at p = .001 to the quantile at p = .999. If the lower (respectively upper) limit is NA, then the lower (upper) limit will be the minimum (maximum) of the distribution's support if it is finite, and 0.001 (0.999) if it is not finite. E.g., if p_limits is c(NA, NA), on a gamma distribution the effective value of p_limits would be c(0, .999) since the gamma distribution is defined on (0, Inf); whereas on a normal distribution it would be equivalent to c(.001, .999) since the normal distribution is defined on (-Inf, Inf).
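The effect of p_limits = c(NA, NA) on the two distributions mentioned can be worked through with base R quantile functions. This is a sketch of the rule as described in the text, not a call into ggdist; the gamma parameters (shape = 2, rate = 1) are arbitrary values chosen for illustration.

```r
p_limits <- c(0.001, 0.999)

# Normal: support is (-Inf, Inf), so both slab limits come from quantiles
normal_limits <- qnorm(p_limits, mean = 0, sd = 1)

# Gamma: the lower end of the support is finite (0), so it is used directly;
# only the upper limit falls back to the 0.999 quantile
gamma_limits <- c(0, qgamma(p_limits[2], shape = 2, rate = 1))

normal_limits
gamma_limits
```

For the standard normal this gives limits of roughly -3.09 and 3.09.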

density

<function | string> Density estimator for sample data. One of:

  • A function which takes a numeric vector and returns a list with elements x (giving grid points for the density estimator) and y (the corresponding densities). ggdist provides a family of functions following this format, including density_unbounded() and density_bounded(). This format is also compatible with stats::density().

  • A string giving the suffix of a function name that starts with "density_"; e.g. "bounded" for density_bounded(), "unbounded" for density_unbounded(), or "histogram" for density_histogram(). Defaults to "bounded", i.e. density_bounded(), which estimates the bounds from the data and then uses a bounded density estimator based on the reflection method.

adjust

<scalar numeric | waiver> Passed to density (e.g. density_bounded()): Value to multiply the bandwidth of the density estimator by. Default waiver() defers to the default of the density estimator, which is usually 1.

trim

<scalar logical | waiver> Passed to density (e.g. density_bounded()): Should the density estimate be trimmed to the range of the data? Default waiver() defers to the default of the density estimator, which is usually TRUE.

breaks

<numeric | function | string | waiver> Passed to density (e.g. density_histogram()): Determines the breakpoints defining bins. Default waiver() defers to the default of the density estimator, which is usually "Scott". Similar to (but not exactly the same as) the breaks argument to graphics::hist(). One of:

  • A scalar (length-1) numeric giving the number of bins

  • A numeric vector giving the breakpoints between histogram bins

  • A function taking x and weights and returning either the number of bins or a vector of breakpoints

  • A string giving the suffix of a function that starts with "breaks_". ggdist provides weighted implementations of the "Sturges", "Scott", and "FD" break-finding algorithms from graphics::hist(), as well as breaks_fixed() for manually setting the bin width. See breaks.

For example, breaks = "Sturges" will use the breaks_Sturges() algorithm, breaks = 9 will create 9 bins, and breaks = breaks_fixed(width = 1) will set the bin width to 1.

align

<scalar numeric | function | string | waiver> Passed to density (e.g. density_histogram()): Determines how to align the breakpoints defining bins. Default waiver() defers to the default of the density estimator, which is usually "none" (performs no alignment). One of:

  • A scalar (length-1) numeric giving an offset that is subtracted from the breaks. The offset must be between 0 and the bin width.

  • A function taking a sorted vector of breaks (bin edges) and returning an offset to subtract from the breaks.

  • A string giving the suffix of a function that starts with "align_" used to determine the alignment, such as align_none(), align_boundary(), or align_center().

For example, align = "none" will provide no alignment, align = align_center(at = 0) will center a bin on 0, and align = align_boundary(at = 0) will align a bin edge on 0.
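The idea of an offset that is subtracted from the breaks can be worked through by hand for equal-width bins. The base R sketch below shows one way such an offset could be derived so that a bin edge, or a bin center, lands on 0. The modular arithmetic is an assumption made for illustration and is not taken from ggdist's source; the break values are arbitrary.

```r
width <- 1
breaks <- seq(-2.3, 2.7, by = width)  # equal-width bin edges
at <- 0

# Offset that puts a bin edge on `at` (the effect described for
# align_boundary(at = 0)); always lies in [0, width)
offset_boundary <- (breaks[1] - at) %% width
breaks_boundary <- breaks - offset_boundary

# Offset that puts a bin center on `at` (the effect described for
# align_center(at = 0))
offset_center <- (breaks[1] - at + width / 2) %% width
breaks_center <- breaks - offset_center

# Bin centers after center alignment
centers <- head(breaks_center, -1) + width / 2
```

Both offsets stay between 0 and the bin width, matching the constraint stated for scalar offsets.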

outline_bars

<scalar logical | waiver> Passed to density (e.g. density_histogram()) and also used for discrete analytical distributions (whose slabs are drawn as histograms). Determines if outlines in between the bars are drawn. If waiver() or FALSE (the default), the outline is drawn only along the tops of the bars. If TRUE, outlines in between bars are also drawn (though you may have to set the slab_color or color aesthetic to see the outlines).

point_interval

<function | string> A function from the point_interval() family (e.g., median_qi, mean_qi, mode_hdi, etc), or a string giving the name of a function from that family (e.g., "median_qi", "mean_qi", "mode_hdi", etc; if a string, the caller's environment is searched for the function, followed by the ggdist environment). This function determines the point summary (typically mean, median, or mode) and interval type (quantile interval, qi; highest-density interval, hdi; or highest-density continuous interval, hdci). Output will be converted to the appropriate x- or y-based aesthetics depending on the value of orientation. See the point_interval() family of functions for more information.

limits

<length-2 numeric> Manually-specified limits for the slab, as a vector of length two. These limits are combined with those computed based on p_limits as well as the limits defined by the scales of the plot to determine the limits used to draw the slab functions: these limits specify the maximal limits; i.e., if specified, the limits will not be wider than these (but may be narrower). Use NA to leave a limit alone; e.g. limits = c(0, NA) will ensure that the lower limit does not go below 0, but let the upper limit be determined by either p_limits or the scale settings.

n

<scalar numeric> Number of points at which to evaluate the function that defines the slab. Also passed to density (e.g. density_bounded()). Default waiver() uses the value 501 for analytical distributions and defers to the default of the density estimator for sample-based distributions, which is also usually 501.

.width

<numeric> The .width argument passed to point_interval: a vector of probabilities to use that determine the widths of the resulting intervals. If multiple probabilities are provided, multiple intervals per group are generated, each with a different probability interval (and value of the corresponding .width and level generated variables).
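To show what "multiple intervals per group" means for the default .width = c(0.66, 0.95), the following base R sketch computes a median and equal-tailed quantile intervals for one sample. It mimics the kind of summary a median plus quantile interval produces but does not call ggdist; the column names other than .width are illustrative assumptions.

```r
set.seed(1234)
x <- rnorm(1000)
widths <- c(0.66, 0.95)

# One row per requested interval width
intervals <- do.call(rbind, lapply(widths, function(w) {
  # Equal-tailed interval: cut (1 - w) / 2 of the mass from each tail
  probs <- c((1 - w) / 2, 1 - (1 - w) / 2)
  q <- stats::quantile(x, probs, names = FALSE)
  data.frame(point = stats::median(x), lower = q[1], upper = q[2], .width = w)
}))

intervals
```

The 95% interval contains the 66% interval, which is why the two are typically drawn as nested lines of different thickness.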

orientation

<string> Whether this geom is drawn horizontally or vertically. One of:

  • NA (default): automatically detect the orientation based on how the aesthetics are assigned. Automatic detection works most of the time.

  • "horizontal" (or "y"): draw horizontally, using the y aesthetic to identify different groups. For each group, uses the x, xmin, xmax, and thickness aesthetics to draw points, intervals, and slabs.

  • "vertical" (or "x"): draw vertically, using the x aesthetic to identify different groups. For each group, uses the y, ymin, ymax, and thickness aesthetics to draw points, intervals, and slabs.

For compatibility with the base ggplot naming scheme for orientation, "x" can be used as an alias for "vertical" and "y" as an alias for "horizontal" (ggdist had an orientation parameter before base ggplot did, hence the discrepancy).

na.rm

<scalar logical> If FALSE, the default, missing values are removed with a warning. If TRUE, missing values are silently removed.

show.legend

<logical> Should this layer be included in the legends? Default is c(size = FALSE), unlike most geoms, to match its common use cases. FALSE hides all legends, TRUE shows all legends, and NA shows only those that are mapped (the default for most geoms). It can also be a named logical vector to finely select the aesthetics to display.

inherit.aes

If FALSE, overrides the default aesthetics, rather than combining with them. This is most useful for helper functions that define both data and aesthetics and shouldn't inherit behaviour from the default plot specification, e.g. borders().

check.aes, check.param

If TRUE, the default, will check that supplied parameters and aesthetics are understood by the geom or stat. Use FALSE to suppress the checks.

Details

To visualize sample data, such as a data distribution, samples from a bootstrap distribution, or a Bayesian posterior, you can supply samples to the x or y aesthetic.

To visualize analytical distributions, you can use the xdist or ydist aesthetic. For historical reasons, you can also use dist to specify the distribution, though this is not recommended as it does not work as well with orientation detection. These aesthetics accept vectorized distribution types such as distributional::dist_normal() and posterior::rvar() (see Examples).

Value

A ggplot2::Stat representing a CCDF bar geometry which can be added to a ggplot() object.

Computed Variables

The following variables are computed by this stat and made available for use in aesthetic specifications (aes()) using the after_stat() function or the after_stat argument of stage():

Aesthetics

The slab+interval stats and geoms have a wide variety of aesthetics that control the appearance of their three sub-geometries: the slab, the point, and the interval.

These stats support the following aesthetics:

In addition, in their default configuration (paired with geom_slabinterval()) the following aesthetics are supported by the underlying geom:

Slab-specific aesthetics

Interval-specific aesthetics

Point-specific aesthetics

Color aesthetics

Line aesthetics

Slab-specific color and line override aesthetics

Interval-specific color and line override aesthetics

Point-specific color and line override aesthetics

Deprecated aesthetics

Other aesthetics (these work as in standard geoms)

See examples of some of these aesthetics in action in vignette("slabinterval"). Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").

See Also

See geom_slabinterval() for the geom underlying this stat. See stat_slabinterval() for the stat this shortcut is based on.

Other slabinterval stats: stat_cdfinterval(), stat_eye(), stat_gradientinterval(), stat_halfeye(), stat_histinterval(), stat_interval(), stat_pointinterval(), stat_slab(), stat_spike()

Examples

library(ggdist)
library(dplyr)
library(ggplot2)
library(distributional)

theme_set(theme_ggdist())

# ON SAMPLE DATA
set.seed(1234)
df = data.frame(
  group = c("a", "b", "c"),
  value = rnorm(1500, mean = c(5, 7, 9), sd = c(1, 1.5, 1))
)
df %>%
  ggplot(aes(x = value, y = group)) +
  stat_ccdfinterval() +
  expand_limits(x = 0)

# ON ANALYTICAL DISTRIBUTIONS
dist_df = data.frame(
  group = c("a", "b", "c"),
  mean =  c(  5,   7,   8),
  sd =    c(  1, 1.5,   1)
)
# Vectorized distribution types, like distributional::dist_normal()
# and posterior::rvar(), can be used with the `xdist` / `ydist` aesthetics
dist_df %>%
  ggplot(aes(y = group, xdist = dist_normal(mean, sd))) +
  stat_ccdfinterval() +
  expand_limits(x = 0)

CDF bar plot (shortcut stat)

Description

Shortcut version of stat_slabinterval() with geom_slabinterval() for creating CDF bar plots.

Roughly equivalent to:

stat_slabinterval(
  aes(
    thickness = after_stat(thickness(cdf, 0, 1)),
    justification = after_stat(0.5),
    side = after_stat("topleft")
  ),
  normalize = "none",
  expand = TRUE
)

Usage

stat_cdfinterval(
  mapping = NULL,
  data = NULL,
  geom = "slabinterval",
  position = "identity",
  ...,
  normalize = "none",
  expand = TRUE,
  p_limits = c(NA, NA),
  density = "bounded",
  adjust = waiver(),
  trim = waiver(),
  breaks = waiver(),
  align = waiver(),
  outline_bars = waiver(),
  point_interval = "median_qi",
  limits = NULL,
  n = waiver(),
  .width = c(0.66, 0.95),
  orientation = NA,
  na.rm = FALSE,
  show.legend = c(size = FALSE),
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)

Arguments

mapping

Set of aesthetic mappings created byaes(). If specified andinherit.aes = TRUE (the default), it is combined with the default mappingat the top level of the plot. You must supplymapping if there is no plotmapping.

data

The data to be displayed in this layer. There are threeoptions:

IfNULL, the default, the data is inherited from the plotdata as specified in the call toggplot().

Adata.frame, or other object, will override the plotdata. All objects will be fortified to produce a data frame. Seefortify() for which variables will be created.

Afunction will be called with a single argument,the plot data. The return value must be adata.frame, andwill be used as the layer data. Afunction can be createdfrom aformula (e.g.~ head(.x, 10)).

geom

<Geom | string> Use to override the default connection between stat_cdfinterval() and geom_slabinterval().

position

<Position | string> Position adjustment, either as a string, or the result of a call to a position adjustment function. Setting this equal to "dodge" (position_dodge()) or "dodgejust" (position_dodgejust()) can be useful if you have overlapping geometries.

...

Other arguments passed to layer(). These are often aesthetics, used to set an aesthetic to a fixed value, like colour = "red" or linewidth = 3 (see Aesthetics, below). They may also be parameters to the paired geom/stat. When paired with the default geom, geom_slabinterval(), these include:

subscale

<function | string> Sub-scale used to scale values of the thickness aesthetic within the groups determined by normalize. One of:

  • A function that takes an x argument giving a numeric vector of values to be scaled and then returns a thickness vector representing the scaled values, such as subscale_thickness() or subscale_identity().

  • A string giving the name of such a function when prefixed with "subscale_"; e.g. "thickness" or "identity". The value "thickness" uses the default subscale, which can be modified by setting subscale_thickness; see the documentation for that function.

For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.

fill_type

<string> What type of fill to use when the fill color or alpha varies within a slab. One of:

  • "segments": breaks up the slab geometry into segments for each unique combination of fill color and alpha value. This approach is supported by all graphics devices and works well for sharp cutoff values, but can give ugly results if a large number of unique fill colors are being used (as in gradients, like in stat_gradientinterval()).

  • "gradient": a grid::linearGradient() is used to create a smooth gradient fill. This works well for large numbers of unique fill colors, but requires R >= 4.1 and is not yet supported on all graphics devices. As of this writing, the png() graphics device with type = "cairo", the svg() device, the pdf() device, and the ragg::agg_png() devices are known to support this option. On R < 4.1, this option will fall back to fill_type = "segments" with a message.

  • "auto": attempts to use fill_type = "gradient" if support for it can be auto-detected. On R >= 4.2, support for gradients can be auto-detected on some graphics devices; if support is not detected, this option will fall back to fill_type = "segments" (in case of a false negative, fill_type = "gradient" can be set explicitly). On R < 4.2, support for gradients cannot be auto-detected, so this will always fall back to fill_type = "segments", in which case you can set fill_type = "gradient" explicitly if you are using a graphics device that supports gradients.

interval_size_domain

<length-2 numeric> Minimum and maximum of the values of the size and linewidth aesthetics that will be translated into actual sizes for intervals drawn according to interval_size_range (see the documentation for that argument).

interval_size_range

<length-2 numeric> This geom scales the raw size aesthetic values when drawing interval and point sizes, as they tend to be too thick when using the default settings of scale_size_continuous(), which give sizes with a range of c(1, 6). The interval_size_domain value indicates the input domain of raw size values (typically this should be equal to the value of the range argument of the scale_size_continuous() function), and interval_size_range indicates the desired output range of the size values (the min and max of the actual sizes used to draw intervals). Most of the time it is not recommended to change the value of this argument, as it may result in strange scaling of legends; this argument is a holdover from earlier versions that did not have size aesthetics targeting the point and interval separately. If you want to adjust the size of the interval or points separately, you can also use the linewidth or point_size aesthetics; see sub-geometry-scales.

fatten_point

<scalar numeric> A multiplicative factor used to adjust the size of the point relative to the size of the thickest interval line. If you wish to specify point sizes directly, you can also use the point_size aesthetic and scale_point_size_continuous() or scale_point_size_discrete(); sizes specified with that aesthetic will not be adjusted using fatten_point.

arrow

<arrow | NULL> Type of arrow heads to use on the interval, or NULL for no arrows.

subguide

<function | string> Sub-guide used to annotate the thickness scale. One of:

  • A function that takes a scale argument giving a ggplot2::Scale object and an orientation argument giving the orientation of the geometry and then returns a grid::grob that will draw the axis annotation, such as subguide_axis() (to draw a traditional axis) or subguide_none() (to draw no annotation). See subguide_axis() for a list of possibilities and examples.

  • A string giving the name of such a function when prefixed with "subguide_"; e.g. "axis" or "none". The values "slab", "dots", and "spike" use the default subguide for their geom families (no subguide), which can be modified by setting subguide_slab, subguide_dots, or subguide_spike; see the documentation for those functions.

normalize

<string> Groups within which to scale values of the thickness aesthetic. One of:

  • "all": normalize so that the maximum height across all data is 1.

  • "panels": normalize within panels so that the maximum height in each panel is 1.

  • "xy": normalize within the x/y axis opposite the orientation of this geom so that the maximum height at each value of the opposite axis is 1.

  • "groups": normalize within values of the opposite axis and within each group so that the maximum height in each group is 1.

  • "none": values are taken as is with no normalization (this should probably only be used with functions whose values are in [0,1], such as CDFs).

For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.

expand

<logical> For sample data, should the slab be expanded to the limits of the scale? stat_cdfinterval() sets this to TRUE (see Usage), whereas the stat_slabinterval() default is FALSE. Can be a length-two logical vector to control expansion to the lower and upper limit respectively.

p_limits

<length-2 numeric> Probability limits. Used to determine the lower and upper limits of analytical distributions (distributions from samples ignore this parameter and determine their limits based on the limits of the sample and the value of the trim parameter). E.g., if this is c(.001, .999), then a slab is drawn for the distribution from the quantile at p = .001 to the quantile at p = .999. If the lower (respectively upper) limit is NA, then the lower (upper) limit will be the minimum (maximum) of the distribution's support if it is finite, and 0.001 (0.999) if it is not finite. E.g., if p_limits is c(NA, NA), on a gamma distribution the effective value of p_limits would be c(0, .999) since the gamma distribution is defined on (0, Inf); whereas on a normal distribution it would be equivalent to c(.001, .999) since the normal distribution is defined on (-Inf, Inf).
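The fallback rule for NA limits can be checked with base R quantile functions. This is a minimal sketch rather than ggdist's internal code; the gamma shape parameter and the variable names are assumptions chosen for illustration.

```r
# Effective drawing limits when p_limits = c(NA, NA).
# Illustrative only: shape = 2 and the variable names are arbitrary choices.

# Normal: the support is (-Inf, Inf), so both limits fall back to the
# 0.001 and 0.999 quantiles.
normal_limits <- qnorm(c(0.001, 0.999), mean = 0, sd = 1)

# Gamma: the lower end of the support (0) is finite, so it is used directly;
# the upper end is infinite, so the 0.999 quantile is used instead.
gamma_limits <- c(0, qgamma(0.999, shape = 2))

normal_limits
gamma_limits
```

Setting p_limits explicitly, e.g. p_limits = c(0.01, 0.99), would replace both fallbacks with the corresponding quantiles.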

density

<function | string> Density estimator for sample data. One of:

  • A function which takes a numeric vector and returns a list with elements x (giving grid points for the density estimator) and y (the corresponding densities). ggdist provides a family of functions following this format, including density_unbounded() and density_bounded(). This format is also compatible with stats::density().

  • A string giving the suffix of a function name that starts with "density_"; e.g. "bounded" for density_bounded(), "unbounded" for density_unbounded(), or "histogram" for density_histogram(). Defaults to "bounded", i.e. density_bounded(), which estimates the bounds from the data and then uses a bounded density estimator based on the reflection method.
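The x/y return format can be inspected by calling the estimators directly. The following is a small sketch assuming ggdist is installed; the exponential sample and the object names are hypothetical.

```r
library(ggdist)

set.seed(1234)
x <- rexp(200)  # hypothetical sample with a hard lower bound at 0

# Both estimators return an object with grid points `x` and densities `y`
d_bounded   <- density_bounded(x)
d_unbounded <- density_unbounded(x)

# stats::density() produces the same x/y structure
d_base <- stats::density(x)

# The bounded estimate stays near the range of the data, while the
# unbounded estimate may extend below 0
range(d_bounded$x)
range(d_unbounded$x)
```

Any of these can be supplied to the stat, e.g. stat_cdfinterval(density = "unbounded") or stat_cdfinterval(density = density_unbounded).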

adjust

<scalar numeric | waiver> Passed to density (e.g. density_bounded()): Value to multiply the bandwidth of the density estimator by. Default waiver() defers to the default of the density estimator, which is usually 1.

trim

<scalar logical | waiver> Passed to density (e.g. density_bounded()): Should the density estimate be trimmed to the range of the data? Default waiver() defers to the default of the density estimator, which is usually TRUE.

breaks

<numeric | function | string | waiver> Passed to density (e.g. density_histogram()): Determines the breakpoints defining bins. Default waiver() defers to the default of the density estimator, which is usually "Scott". Similar to (but not exactly the same as) the breaks argument to graphics::hist(). One of:

  • A scalar (length-1) numeric giving the number of bins

  • A numeric vector giving the breakpoints between histogram bins

  • A function taking x and weights and returning either the number of bins or a vector of breakpoints

  • A string giving the suffix of a function that starts with "breaks_". ggdist provides weighted implementations of the "Sturges", "Scott", and "FD" break-finding algorithms from graphics::hist(), as well as breaks_fixed() for manually setting the bin width. See breaks.

For example, breaks = "Sturges" will use the breaks_Sturges() algorithm, breaks = 9 will create 9 bins, and breaks = breaks_fixed(width = 1) will set the bin width to 1.
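The three forms in the example above can be tried directly on density_histogram(). This is a hedged sketch assuming ggdist is installed; the sample and object names are hypothetical.

```r
library(ggdist)

set.seed(1234)
x <- rnorm(500)  # hypothetical sample

# Named algorithm: the string "Sturges" selects breaks_Sturges()
h_sturges <- density_histogram(x, breaks = "Sturges")

# Fixed number of bins
h_nine <- density_histogram(x, breaks = 9)

# Fixed bin width of 1, using breaks_fixed() with only `width` supplied
h_fixed <- density_histogram(x, breaks = breaks_fixed(width = 1))

# Each result has grid points `x` and densities `y`
length(h_nine$x)
length(h_nine$y)
```

The same values can be passed through the stat, e.g. stat_cdfinterval(density = "histogram", breaks = 9).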

align

<scalar numeric | function | string | waiver> Passed to density (e.g. density_histogram()): Determines how to align the breakpoints defining bins. Default waiver() defers to the default of the density estimator, which is usually "none" (performs no alignment). One of:

  • A scalar (length-1) numeric giving an offset that is subtracted from the breaks. The offset must be between 0 and the bin width.

  • A function taking a sorted vector of breaks (bin edges) and returning an offset to subtract from the breaks.

  • A string giving the suffix of a function that starts with "align_" used to determine the alignment, such as align_none(), align_boundary(), or align_center().

For example, align = "none" will provide no alignment, align = align_center(at = 0) will center a bin on 0, and align = align_boundary(at = 0) will align a bin edge on 0.

outline_bars

<scalar logical | waiver> Passed to density (e.g. density_histogram()) and also used for discrete analytical distributions (whose slabs are drawn as histograms). Determines if outlines in between the bars are drawn. If waiver() or FALSE (the default), the outline is drawn only along the tops of the bars. If TRUE, outlines in between bars are also drawn (though you may have to set the slab_color or color aesthetic to see the outlines).

point_interval

<function | string> A function from the point_interval() family (e.g., median_qi, mean_qi, mode_hdi, etc), or a string giving the name of a function from that family (e.g., "median_qi", "mean_qi", "mode_hdi", etc; if a string, the caller's environment is searched for the function, followed by the ggdist environment). This function determines the point summary (typically mean, median, or mode) and interval type (quantile interval, qi; highest-density interval, hdi; or highest-density continuous interval, hdci). Output will be converted to the appropriate x- or y-based aesthetics depending on the value of orientation. See the point_interval() family of functions for more information.
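To see what a point_interval() function produces before it is handed to the stat, it can be applied to a plain numeric vector. A minimal sketch, assuming ggdist is installed; the draws and object names are hypothetical.

```r
library(ggdist)

set.seed(1234)
draws <- rnorm(1000, mean = 5)  # hypothetical posterior draws

# Median with 66% and 95% quantile intervals: one row per interval width
qi_summary <- median_qi(draws, .width = c(0.66, 0.95))
qi_summary

# Mode with a 95% highest-density interval
hdi_summary <- mode_hdi(draws, .width = 0.95)
hdi_summary
```

The stat performs the same summarization per group when given point_interval = "median_qi" and .width = c(0.66, 0.95), which are the defaults shown in Usage.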

limits

<length-2 numeric> Manually-specified limits for the slab, as a vector of length two. These limits are combined with those computed based on p_limits as well as the limits defined by the scales of the plot to determine the limits used to draw the slab functions: these limits specify the maximal limits; i.e., if specified, the limits will not be wider than these (but may be narrower). Use NA to leave a limit alone; e.g. limits = c(0, NA) will ensure that the lower limit does not go below 0, but let the upper limit be determined by either p_limits or the scale settings.

n

<scalar numeric> Number of points at which to evaluate the function that defines the slab. Also passed to density (e.g. density_bounded()). Default waiver() uses the value 501 for analytical distributions and defers to the default of the density estimator for sample-based distributions, which is also usually 501.

.width

<numeric> The .width argument passed to point_interval: a vector of probabilities to use that determine the widths of the resulting intervals. If multiple probabilities are provided, multiple intervals per group are generated, each with a different probability interval (and value of the corresponding .width and level generated variables).

orientation

<string> Whether this geom is drawn horizontally or vertically. One of:

  • NA (default): automatically detect the orientation based on how the aesthetics are assigned. Automatic detection works most of the time.

  • "horizontal" (or "y"): draw horizontally, using the y aesthetic to identify different groups. For each group, uses the x, xmin, xmax, and thickness aesthetics to draw points, intervals, and slabs.

  • "vertical" (or "x"): draw vertically, using the x aesthetic to identify different groups. For each group, uses the y, ymin, ymax, and thickness aesthetics to draw points, intervals, and slabs.

For compatibility with the base ggplot naming scheme for orientation, "x" can be used as an alias for "vertical" and "y" as an alias for "horizontal" (ggdist had an orientation parameter before base ggplot did, hence the discrepancy).

na.rm

<scalar logical> If FALSE, the default, missing values are removed with a warning. If TRUE, missing values are silently removed.

show.legend

<logical> Should this layer be included in the legends? Default is c(size = FALSE), unlike most geoms, to match its common use cases. FALSE hides all legends, TRUE shows all legends, and NA shows only those that are mapped (the default for most geoms). It can also be a named logical vector to finely select the aesthetics to display.

inherit.aes

If FALSE, overrides the default aesthetics, rather than combining with them. This is most useful for helper functions that define both data and aesthetics and shouldn't inherit behaviour from the default plot specification, e.g. borders().

check.aes, check.param

If TRUE, the default, will check that supplied parameters and aesthetics are understood by the geom or stat. Use FALSE to suppress the checks.

Details

To visualize sample data, such as a data distribution, samples from a bootstrap distribution, or a Bayesian posterior, you can supply samples to the x or y aesthetic.

To visualize analytical distributions, you can use the xdist or ydist aesthetic. For historical reasons, you can also use dist to specify the distribution, though this is not recommended as it does not work as well with orientation detection. These aesthetics can be used as follows:

Value

A ggplot2::Stat representing a CDF bar geometry which can be added to a ggplot() object.

Computed Variables

The following variables are computed by this stat and made available for use in aesthetic specifications (aes()) using the after_stat() function or the after_stat argument of stage():
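One way to see which variables a stat computes is to build a plot and inspect its layer data. The sketch below assumes ggplot2 and ggdist are installed; the data frame and object names are hypothetical, and the only computed variable it relies on is cdf, which appears in the "Roughly equivalent to" call above.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
df <- data.frame(
  group = c("a", "b", "c"),
  value = rnorm(300, mean = c(5, 7, 9))
)

p <- ggplot(df, aes(x = value, y = group)) +
  stat_cdfinterval()

# layer_data() returns the data frame produced by the stat, including
# computed variables such as `cdf`
computed <- layer_data(p)
names(computed)

# CDF values should lie in [0, 1]
range(computed$cdf, na.rm = TRUE)
```

Any column listed by names(computed) that comes from the stat can be referenced inside aes() with after_stat().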

Aesthetics

The slab+interval stats and geoms have a wide variety of aesthetics that control the appearance of their three sub-geometries: the slab, the point, and the interval.

These stats support the following aesthetics:

In addition, in their default configuration (paired with geom_slabinterval()) the following aesthetics are supported by the underlying geom:

Slab-specific aesthetics

Interval-specific aesthetics

Point-specific aesthetics

Color aesthetics

Line aesthetics

Slab-specific color and line override aesthetics

Interval-specific color and line override aesthetics

Point-specific color and line override aesthetics

Deprecated aesthetics

Other aesthetics (these work as in standard geoms)

See examples of some of these aesthetics in action in vignette("slabinterval"). Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").

See Also

See geom_slabinterval() for the geom underlying this stat. See stat_slabinterval() for the stat this shortcut is based on.

Other slabinterval stats: stat_ccdfinterval(), stat_eye(), stat_gradientinterval(), stat_halfeye(), stat_histinterval(), stat_interval(), stat_pointinterval(), stat_slab(), stat_spike()

Examples

library(dplyr)
library(ggplot2)
library(distributional)
library(ggdist)
theme_set(theme_ggdist())
# ON SAMPLE DATA
set.seed(1234)
df = data.frame(
  group = c("a", "b", "c"),
  value = rnorm(1500, mean = c(5, 7, 9), sd = c(1, 1.5, 1))
)
df %>%
  ggplot(aes(x = value, y = group)) +
  stat_cdfinterval()
# ON ANALYTICAL DISTRIBUTIONS
dist_df = data.frame(
  group = c("a", "b", "c"),
  mean =  c(  5,   7,   8),
  sd =    c(  1, 1.5,   1)
)
# Vectorized distribution types, like distributional::dist_normal()
# and posterior::rvar(), can be used with the `xdist` / `ydist` aesthetics
dist_df %>%
  ggplot(aes(y = group, xdist = dist_normal(mean, sd))) +
  stat_cdfinterval()

Dot plot (shortcut stat)

Description

A combination of stat_slabinterval() and geom_dotsinterval() with sensible defaults for making dot plots. While geom_dotsinterval() is intended for use on data frames that have already been summarized using a point_interval() function, stat_dots() is intended for use directly on data frames of draws or of analytical distributions, and will perform the summarization using a point_interval() function. Geoms based on geom_dotsinterval() create dotplots that automatically determine a bin width that ensures the plot fits within the available space. They can also ensure dots do not overlap.

Roughly equivalent to:

stat_dotsinterval(
  aes(size = NULL),
  geom = "dots",
  show_point = FALSE,
  show_interval = FALSE,
  show.legend = NA
)

Usage

stat_dots(
  mapping = NULL,
  data = NULL,
  geom = "dots",
  position = "identity",
  ...,
  quantiles = NA,
  orientation = NA,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)

Arguments

mapping

Set of aesthetic mappings created by aes(). If specified and inherit.aes = TRUE (the default), it is combined with the default mapping at the top level of the plot. You must supply mapping if there is no plot mapping.

data

The data to be displayed in this layer. There are three options:

  • If NULL, the default, the data is inherited from the plot data as specified in the call to ggplot().

  • A data.frame, or other object, will override the plot data. All objects will be fortified to produce a data frame. See fortify() for which variables will be created.

  • A function will be called with a single argument, the plot data. The return value must be a data.frame, and will be used as the layer data. A function can be created from a formula (e.g. ~ head(.x, 10)).

geom

<Geom | string> Use to override the default connection between stat_dots() and geom_dots().

position

<Position | string> Position adjustment, either as a string, or the result of a call to a position adjustment function. Setting this equal to "dodge" (position_dodge()) or "dodgejust" (position_dodgejust()) can be useful if you have overlapping geometries.

...

Other arguments passed to layer(). These are often aesthetics, used to set an aesthetic to a fixed value, like colour = "red" or linewidth = 3 (see Aesthetics, below). They may also be parameters to the paired geom/stat. When paired with the default geom, geom_dots(), these include:

binwidth

<numeric | unit> The bin width to use for laying out the dots. One of:

  • NA (the default): Dynamically select the bin width based on the size of the plot when drawn. This will pick a binwidth such that the tallest stack of dots is at most scale in height (ideally exactly scale in height, though this is not guaranteed).

  • A length-1 (scalar) numeric or unit object giving the exact bin width.

  • A length-2 (vector) numeric or unit object giving the minimum and maximum desired bin width. The bin width will be dynamically selected within these bounds.

If the value is numeric, it is assumed to be in units of data. The bin width (or its bounds) can also be specified using unit(), which may be useful if it is desired that the dots be a certain point size or a certain percentage of the width/height of the viewport. For example, unit(0.1, "npc") would make dots that are exactly 10% of the viewport size along whichever dimension the dotplot is drawn; unit(c(0, 0.1), "npc") would make dots that are at most 10% of the viewport size (while still ensuring the tallest stack is less than or equal to scale).
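The three forms of binwidth can be compared side by side. This is a minimal sketch assuming ggplot2 and ggdist are installed; the sample, the bin width of 0.1, and the plot object names are hypothetical choices.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
df <- data.frame(value = rnorm(200))  # hypothetical sample

# Default (NA): bin width is chosen automatically when the plot is drawn
p_auto <- ggplot(df, aes(x = value)) +
  stat_dots()

# Exact bin width, in data units
p_exact <- ggplot(df, aes(x = value)) +
  stat_dots(binwidth = 0.1)

# Bin width chosen automatically, but at most 10% of the viewport size
p_bounded <- ggplot(df, aes(x = value)) +
  stat_dots(binwidth = grid::unit(c(0, 0.1), "npc"))
```

Printing each plot (e.g. print(p_bounded)) and resizing the graphics device shows how the automatic and bounded variants adapt while the exact variant does not.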

dotsize

<scalar numeric> The width of the dots relative to the binwidth. The default, 1.07, makes dots be just a bit wider than the bin width, which is a manually-tuned parameter that tends to work well with the default circular shape, preventing gaps between bins from appearing to be too large visually (as might arise from dots being precisely the binwidth). If it is desired to have dots be precisely the binwidth, set dotsize = 1.

stackratio

<scalar numeric> The distance between the center of the dots in the same stack relative to the dot height. The default, 1, makes dots in the same stack just touch each other.

layout

<string> The layout method used for the dots. One of:

  • "bin" (default): places dots on the off-axis at the midpoint of their bins as in the classic Wilkinson dotplot. This maintains the alignment of rows and columns in the dotplot. This layout is slightly different from the classic Wilkinson algorithm in that: (1) it nudges bins slightly to avoid overlapping bins and (2) if the input data are symmetrical it will return a symmetrical layout.

  • "weave": uses the same basic binning approach of "bin", but places dots in the off-axis at their actual positions (unless overlaps = "nudge", in which case overlaps may be nudged out of the way). This maintains the alignment of rows but does not align dots within columns.

  • "hex": uses the same basic binning approach of "bin", but alternates placing dots + binwidth/4 or - binwidth/4 in the off-axis from the bin center. This allows hexagonal packing by setting a stackratio less than 1 (something like 0.9 tends to work).

  • "swarm": uses the "compactswarm" layout from beeswarm::beeswarm(). Does not maintain alignment of rows or columns, but can be more compact and neat looking, especially for sample data (as opposed to quantile dotplots of theoretical distributions, which may look better with "bin", "weave", or "hex").

  • "bar": for discrete distributions, lays out duplicate values in rectangular bars.
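A short sketch of switching layouts, assuming ggplot2 and ggdist are installed; the sample and plot object names are hypothetical. The "swarm" layout is left out because it additionally needs the beeswarm package.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
df <- data.frame(value = rnorm(200))  # hypothetical sample

# Default "bin" layout: rows and columns stay aligned
p_bin <- ggplot(df, aes(x = value)) +
  stat_dots(layout = "bin")

# "weave": dots sit at their actual positions within each row
p_weave <- ggplot(df, aes(x = value)) +
  stat_dots(layout = "weave")

# "hex": hexagonal packing, using a stackratio a little below 1
p_hex <- ggplot(df, aes(x = value)) +
  stat_dots(layout = "hex", stackratio = 0.9)
```

The stackratio of 0.9 follows the suggestion in the "hex" entry above; other values below 1 may suit different dot sizes.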

overlaps

<string> How to handle overlapping dots or bins in the "bin", "weave", and "hex" layouts (dots never overlap in the "swarm" or "bar" layouts). For the purposes of this argument, dots are only considered to be overlapping if they would be overlapping when dotsize = 1 and stackratio = 1; i.e. if you set those arguments to other values, overlaps may still occur. One of:

  • "keep": leave overlapping dots as they are. Dots may overlap (usually only slightly) in the "bin", "weave", and "hex" layouts.

  • "nudge": nudge overlapping dots out of the way. Overlaps are avoided using a constrained optimization which minimizes the squared distance of dots to their desired positions, subject to the constraint that adjacent dots do not overlap.

smooth

<function | string> Smoother to apply to dot positions. One of:

  • A function that takes a numeric vector of dot positions and returns a smoothed version of that vector, such as smooth_bounded(), smooth_unbounded(), smooth_discrete(), or smooth_bar().

  • A string indicating what smoother to use, as the suffix to a function name starting with smooth_; e.g. "none" (the default) applies smooth_none(), which simply returns the given vector without applying smoothing.

Smoothing is most effective when the smoother is matched to the support of the distribution; e.g. using smooth_bounded(bounds = ...).
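Matching the smoother to the support can be sketched as follows, assuming ggplot2 and ggdist are installed. The beta-distributed sample (supported on [0, 1]) and the object names are hypothetical.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
x <- rbeta(200, 2, 5)  # hypothetical sample supported on [0, 1]

# Applied directly, a smoother returns one smoothed position per input value
x_smooth <- smooth_bounded(x, bounds = c(0, 1))

# Supplied to the stat, with the bounds matched to the support of the data
p_smooth <- ggplot(data.frame(x = x), aes(x = x)) +
  stat_dots(smooth = smooth_bounded(bounds = c(0, 1)))
```

Passing the string form, e.g. smooth = "bounded", would instead let the smoother use its default bounds.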

overflow

<string> How to handle overflow of dots beyond the extent of the geom when a minimum binwidth (or an exact binwidth) is supplied. One of:

  • "keep": Keep the overflow, drawing dots outside the geom bounds.

  • "warn": Keep the overflow, but produce a warning suggesting solutions, such as setting binwidth = NA or overflow = "compress".

  • "compress": Compress the layout. Reduces the binwidth to the size necessary to keep the dots within bounds, then adjusts stackratio and dotsize so that the apparent dot size is the user-specified minimum binwidth times the user-specified dotsize.

If you find the default layout has dots that are too small, and you are okay with dots overlapping, consider setting overflow = "compress" and supplying an exact or minimum dot size using binwidth.
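The combination suggested above might look like the following sketch, assuming ggplot2 and ggdist are installed. The sample size, the 2 mm dot size, and the object names are hypothetical choices.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
big_df <- data.frame(value = rnorm(2000))  # hypothetical large sample

# Ask for dots exactly 2 mm wide; with this many points the stacks would
# overflow the geom, so "compress" squeezes the layout (dots will overlap)
p_compress <- ggplot(big_df, aes(x = value)) +
  stat_dots(binwidth = grid::unit(2, "mm"), overflow = "compress")
```

With overflow = "keep" or "warn" the same binwidth would instead let the tallest stacks extend past the geom bounds.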

verbose

<scalar logical> If TRUE, print out the bin width of the dotplot. Can be useful if you want to start from an automatically-selected bin width and then adjust it manually. Bin width is printed both as data units and as normalized parent coordinates or "npc"s (see unit()). Note that if you just want to scale the selected bin width to fit within a desired area, it is probably easier to use scale than to copy and scale binwidth manually, and if you just want to provide constraints on the bin width, you can pass a length-2 vector to binwidth.

subguide

<function | string> Sub-guide used to annotate the thickness scale. One of:

  • A function that takes a scale argument giving a ggplot2::Scale object and an orientation argument giving the orientation of the geometry and then returns a grid::grob that will draw the axis annotation, such as subguide_axis() (to draw a traditional axis) or subguide_none() (to draw no annotation). See subguide_axis() for a list of possibilities and examples.

  • A string giving the name of such a function when prefixed with "subguide_"; e.g. "axis" or "none". The values "slab", "dots", and "spike" use the default subguide for their geom families (no subguide), which can be modified by setting subguide_slab, subguide_dots, or subguide_spike; see the documentation for those functions.

quantiles

<scalar numeric> Number of quantiles to plot in the dotplot. Use NA (the default) to plot all data points. Setting this to a value other than NA will produce a quantile dotplot: that is, a dotplot of quantiles from the sample or distribution (for analytical distributions, the default of NA is taken to mean 100 quantiles). See Kay et al. (2016) and Fernandes et al. (2018) for more information on quantile dotplots.

orientation

<string> Whether this geom is drawn horizontally or vertically. One of:

  • NA (default): automatically detect the orientation based on how the aesthetics are assigned. Automatic detection works most of the time.

  • "horizontal" (or "y"): draw horizontally, using the y aesthetic to identify different groups. For each group, uses the x, xmin, xmax, and thickness aesthetics to draw points, intervals, and slabs.

  • "vertical" (or "x"): draw vertically, using the x aesthetic to identify different groups. For each group, uses the y, ymin, ymax, and thickness aesthetics to draw points, intervals, and slabs.

For compatibility with the base ggplot naming scheme for orientation, "x" can be used as an alias for "vertical" and "y" as an alias for "horizontal" (ggdist had an orientation parameter before base ggplot did, hence the discrepancy).

na.rm

<scalar logical> If FALSE, the default, missing values are removed with a warning. If TRUE, missing values are silently removed.

show.legend

logical. Should this layer be included in the legends? NA, the default, includes if any aesthetics are mapped. FALSE never includes, and TRUE always includes. It can also be a named logical vector to finely select the aesthetics to display.

inherit.aes

If FALSE, overrides the default aesthetics, rather than combining with them. This is most useful for helper functions that define both data and aesthetics and shouldn't inherit behaviour from the default plot specification, e.g. borders().

check.aes, check.param

If TRUE, the default, will check that supplied parameters and aesthetics are understood by the geom or stat. Use FALSE to suppress the checks.

Details

The dots family of stats and geoms are similar to ggplot2::geom_dotplot() but with a number of differences:

Stats and geoms in this family include:

stat_dots() and stat_dotsinterval(), when used with the quantiles argument, are particularly useful for constructing quantile dotplots, which can be an effective way to communicate uncertainty using a frequency framing that may be easier for laypeople to understand (Kay et al. 2016, Fernandes et al. 2018).

To visualize sample data, such as a data distribution, samples from a bootstrap distribution, or a Bayesian posterior, you can supply samples to the x or y aesthetic.

To visualize analytical distributions, you can use the xdist or ydist aesthetic. For historical reasons, you can also use dist to specify the distribution, though this is not recommended as it does not work as well with orientation detection. These aesthetics can be used as follows:

Value

A ggplot2::Stat representing a dot geometry which can be added to a ggplot() object.

Computed Variables

The following variables are computed by this stat and made available for use in aesthetic specifications (aes()) using the after_stat() function or the after_stat argument of stage():

Aesthetics

The dots+interval stats and geoms have a wide variety of aesthetics that control the appearance of their three sub-geometries: the dots (aka the slab), the point, and the interval.

These stats support the following aesthetics:

In addition, in their default configuration (paired with geom_dots()) the following aesthetics are supported by the underlying geom:

Dots-specific (aka Slab-specific) aesthetics

Interval-specific aesthetics

Point-specific aesthetics

Color aesthetics

Line aesthetics

Slab-specific color and line override aesthetics

Interval-specific color and line override aesthetics

Point-specific color and line override aesthetics

Deprecated aesthetics

Other aesthetics (these work as in standard geoms)

See examples of some of these aesthetics in action in vignette("dotsinterval"). Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").

References

Kay, M., Kola, T., Hullman, J. R., & Munson, S. A. (2016). When (ish) is My Bus? User-centered Visualizations of Uncertainty in Everyday, Mobile Predictive Systems. Conference on Human Factors in Computing Systems - CHI '16, 5092–5103. doi:10.1145/2858036.2858558.

Fernandes, M., Walls, L., Munson, S., Hullman, J., & Kay, M. (2018). Uncertainty Displays Using Quantile Dotplots or CDFs Improve Transit Decision-Making. Conference on Human Factors in Computing Systems - CHI '18. doi:10.1145/3173574.3173718.

See Also

See geom_dots() for the geom underlying this stat. See vignette("dotsinterval") for a variety of examples of use.

Other dotsinterval stats: stat_dotsinterval(), stat_mcse_dots()

Examples

library(dplyr)
library(ggplot2)
library(distributional)
library(ggdist)
theme_set(theme_ggdist())
# ON SAMPLE DATA
set.seed(12345)
tibble(
  x = rep(1:10, 100),
  y = rnorm(1000, x)
) %>%
  ggplot(aes(x = x, y = y)) +
  stat_dots()
# ON ANALYTICAL DISTRIBUTIONS
# Vectorized distribution types, like distributional::dist_normal()
# and posterior::rvar(), can be used with the `xdist` / `ydist` aesthetics
tibble(
  x = 1:10,
  sd = seq(1, 3, length.out = 10)
) %>%
  ggplot(aes(x = x, ydist = dist_normal(x, sd))) +
  stat_dots(quantiles = 50)

Dots + point + interval plot (shortcut stat)

Description

A combination of stat_slabinterval() and geom_dotsinterval() with sensible defaults for making dots + point + interval plots. While geom_dotsinterval() is intended for use on data frames that have already been summarized using a point_interval() function, stat_dotsinterval() is intended for use directly on data frames of draws or of analytical distributions, and will perform the summarization using a point_interval() function. Geoms based on geom_dotsinterval() create dotplots that automatically determine a bin width that ensures the plot fits within the available space. They can also ensure dots do not overlap.

Usage

stat_dotsinterval(
  mapping = NULL,
  data = NULL,
  geom = "dotsinterval",
  position = "identity",
  ...,
  quantiles = NA,
  point_interval = "median_qi",
  .width = c(0.66, 0.95),
  orientation = NA,
  na.rm = FALSE,
  show.legend = c(size = FALSE),
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)

Arguments

mapping

Set of aesthetic mappings created by aes(). If specified and inherit.aes = TRUE (the default), it is combined with the default mapping at the top level of the plot. You must supply mapping if there is no plot mapping.

data

The data to be displayed in this layer. There are three options:

  • If NULL, the default, the data is inherited from the plot data as specified in the call to ggplot().

  • A data.frame, or other object, will override the plot data. All objects will be fortified to produce a data frame. See fortify() for which variables will be created.

  • A function will be called with a single argument, the plot data. The return value must be a data.frame, and will be used as the layer data. A function can be created from a formula (e.g. ~ head(.x, 10)).

geom

<Geom | string> Use to override the default connection between stat_dotsinterval() and geom_dotsinterval().

position

<Position | string> Position adjustment, either as a string, or the result of a call to a position adjustment function. Setting this equal to "dodge" (position_dodge()) or "dodgejust" (position_dodgejust()) can be useful if you have overlapping geometries.

...

Other arguments passed to layer(). These are often aesthetics, used to set an aesthetic to a fixed value, like colour = "red" or linewidth = 3 (see Aesthetics, below). They may also be parameters to the paired geom/stat. When paired with the default geom, geom_dotsinterval(), these include:

binwidth

<numeric | unit> The bin width to use for laying out the dots. One of:

  • NA (the default): Dynamically select the bin width based on the size of the plot when drawn. This will pick a binwidth such that the tallest stack of dots is at most scale in height (ideally exactly scale in height, though this is not guaranteed).

  • A length-1 (scalar) numeric or unit object giving the exact bin width.

  • A length-2 (vector) numeric or unit object giving the minimum and maximum desired bin width. The bin width will be dynamically selected within these bounds.

If the value is numeric, it is assumed to be in units of data. The bin width (or its bounds) can also be specified using unit(), which may be useful if it is desired that the dots be a certain point size or a certain percentage of the width/height of the viewport. For example, unit(0.1, "npc") would make dots that are exactly 10% of the viewport size along whichever dimension the dotplot is drawn; unit(c(0, 0.1), "npc") would make dots that are at most 10% of the viewport size (while still ensuring the tallest stack is less than or equal to scale).
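As a minimal sketch of the three forms of binwidth described above (the data frame `df` and its columns are made up for illustration):

```r
library(ggplot2)
library(ggdist)
library(grid)  # for unit()

set.seed(12345)
# Hypothetical sample data: two groups of 200 draws each
df <- data.frame(
  group = rep(c("a", "b"), each = 200),
  value = rnorm(400)
)

# binwidth = NA (the default): bin width is chosen when the plot is drawn
p_auto <- ggplot(df, aes(x = value, y = group)) +
  stat_dotsinterval()

# A scalar numeric: exact bin width, in data units
p_exact <- ggplot(df, aes(x = value, y = group)) +
  stat_dotsinterval(binwidth = 0.1)

# A length-2 unit: dots are at most 10% of the viewport size
p_bounded <- ggplot(df, aes(x = value, y = group)) +
  stat_dotsinterval(binwidth = unit(c(0, 0.1), "npc"))
```

With an exact bin width the stacks may not fit in the available space; the overflow argument controls what happens in that case.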

dotsize

<scalar numeric> The width of the dots relative to the binwidth. The default, 1.07, makes dots just a bit wider than the bin width; this manually-tuned value tends to work well with the default circular shape, preventing gaps between bins from appearing to be too large visually (as might arise from dots being precisely the binwidth). If it is desired to have dots be precisely the binwidth, set dotsize = 1.

stackratio

<scalar numeric> The distance between the centers of the dots in the same stack relative to the dot height. The default, 1, makes dots in the same stack just touch each other.

layout

<string> The layout method used for the dots. One of:

  • "bin" (default): places dots on the off-axis at the midpoint of their bins as in the classic Wilkinson dotplot. This maintains the alignment of rows and columns in the dotplot. This layout is slightly different from the classic Wilkinson algorithm in that: (1) it nudges bins slightly to avoid overlapping bins and (2) if the input data are symmetrical it will return a symmetrical layout.

  • "weave": uses the same basic binning approach of "bin", but places dots in the off-axis at their actual positions (unless overlaps = "nudge", in which case overlaps may be nudged out of the way). This maintains the alignment of rows but does not align dots within columns.

  • "hex": uses the same basic binning approach of "bin", but alternates placing dots + binwidth/4 or - binwidth/4 in the off-axis from the bin center. This allows hexagonal packing by setting a stackratio less than 1 (something like 0.9 tends to work).

  • "swarm": uses the "compactswarm" layout from beeswarm::beeswarm(). Does not maintain alignment of rows or columns, but can be more compact and neat looking, especially for sample data (as opposed to quantile dotplots of theoretical distributions, which may look better with "bin", "weave", or "hex").

  • "bar": for discrete distributions, lays out duplicate values in rectangular bars.
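The layouts above are selected by name. A short sketch on made-up sample data, comparing the default with the "weave" and "hex" layouts (the latter with the stackratio suggested above):

```r
library(ggplot2)
library(ggdist)

set.seed(12345)
# Hypothetical sample data
df <- data.frame(value = rnorm(300))

# Classic binned layout (the default)
p_bin <- ggplot(df, aes(x = value)) +
  stat_dotsinterval(layout = "bin")

# Dots keep their actual positions; rows stay aligned, columns do not
p_weave <- ggplot(df, aes(x = value)) +
  stat_dotsinterval(layout = "weave")

# Hexagonal packing: "hex" layout plus a stackratio below 1
p_hex <- ggplot(df, aes(x = value)) +
  stat_dotsinterval(layout = "hex", stackratio = 0.9)
```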

overlaps

<string> How to handle overlapping dots or bins in the "bin", "weave", and "hex" layouts (dots never overlap in the "swarm" or "bar" layouts). For the purposes of this argument, dots are only considered to be overlapping if they would be overlapping when dotsize = 1 and stackratio = 1; i.e. if you set those arguments to other values, overlaps may still occur. One of:

  • "keep": leave overlapping dots as they are. Dots may overlap (usually only slightly) in the "bin", "weave", and "hex" layouts.

  • "nudge": nudge overlapping dots out of the way. Overlaps are avoided using a constrained optimization which minimizes the squared distance of dots to their desired positions, subject to the constraint that adjacent dots do not overlap.

smooth

<function | string> Smoother to apply to dot positions. One of:

  • A function that takes a numeric vector of dot positions and returns a smoothed version of that vector, such as smooth_bounded(), smooth_unbounded(), smooth_discrete(), or smooth_bar().

  • A string indicating what smoother to use, as the suffix to a function name starting with smooth_; e.g. "none" (the default) applies smooth_none(), which simply returns the given vector without applying smoothing.

Smoothing is most effective when the smoother is matched to the support of the distribution; e.g. using smooth_bounded(bounds = ...).
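A sketch of matching the smoother to the support, assuming exponential sample data (support on [0, Inf)) and assuming that an infinite upper bound is accepted by the bounds argument:

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# Hypothetical sample with support on [0, Inf)
x <- rexp(200)

# A smoother maps a vector of dot positions to a smoothed vector
# of the same length
x_smooth <- smooth_bounded(x, bounds = c(0, Inf))

# The same smoother, partially applied and handed to the stat
p_fun <- ggplot(data.frame(x = x), aes(x = x)) +
  stat_dotsinterval(smooth = smooth_bounded(bounds = c(0, Inf)))

# Or selected by the suffix of its function name
p_str <- ggplot(data.frame(x = x), aes(x = x)) +
  stat_dotsinterval(smooth = "unbounded")
```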

overflow

<string> How to handle overflow of dots beyond the extent of the geom when a minimum binwidth (or an exact binwidth) is supplied. One of:

  • "keep": Keep the overflow, drawing dots outside the geom bounds.

  • "warn": Keep the overflow, but produce a warning suggesting solutions, such as setting binwidth = NA or overflow = "compress".

  • "compress": Compress the layout. Reduces the binwidth to the size necessary to keep the dots within bounds, then adjusts stackratio and dotsize so that the apparent dot size is the user-specified minimum binwidth times the user-specified dotsize.

If you find the default layout has dots that are too small, and you are okay with dots overlapping, consider setting overflow = "compress" and supplying an exact or minimum dot size using binwidth.
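For instance, a sketch of the suggestion above on made-up data, with an illustrative 2 mm dot size that would likely overflow without compression:

```r
library(ggplot2)
library(ggdist)
library(grid)  # for unit()

set.seed(12345)
# Hypothetical sample data, large enough that 2 mm dots will not fit
df <- data.frame(value = rnorm(2000))

# Fix the apparent dot size at 2 mm and let the layout compress
# (dots may overlap) instead of spilling outside the geom bounds
p <- ggplot(df, aes(x = value)) +
  stat_dotsinterval(binwidth = unit(2, "mm"), overflow = "compress")
```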

verbose

<scalar logical> If TRUE, print out the bin width of the dotplot. Can be useful if you want to start from an automatically-selected bin width and then adjust it manually. Bin width is printed both as data units and as normalized parent coordinates or "npc"s (see unit()). Note that if you just want to scale the selected bin width to fit within a desired area, it is probably easier to use scale than to copy and scale binwidth manually, and if you just want to provide constraints on the bin width, you can pass a length-2 vector to binwidth.

interval_size_domain

<length-2 numeric> Minimum and maximum of the values of the size and linewidth aesthetics that will be translated into actual sizes for intervals drawn according to interval_size_range (see the documentation for that argument).

interval_size_range

<length-2 numeric> This geom scales the raw size aesthetic values when drawing interval and point sizes, as they tend to be too thick when using the default settings of scale_size_continuous(), which give sizes with a range of c(1, 6). The interval_size_domain value indicates the input domain of raw size values (typically this should be equal to the value of the range argument of the scale_size_continuous() function), and interval_size_range indicates the desired output range of the size values (the min and max of the actual sizes used to draw intervals). Most of the time it is not recommended to change the value of this argument, as it may result in strange scaling of legends; this argument is a holdover from earlier versions that did not have size aesthetics targeting the point and interval separately. If you want to adjust the size of the interval or points separately, you can also use the linewidth or point_size aesthetics; see sub-geometry-scales.

fatten_point

<scalar numeric> A multiplicative factor used to adjust the size of the point relative to the size of the thickest interval line. If you wish to specify point sizes directly, you can also use the point_size aesthetic and scale_point_size_continuous() or scale_point_size_discrete(); sizes specified with that aesthetic will not be adjusted using fatten_point.

arrow

<arrow | NULL> Type of arrow heads to use on the interval, or NULL for no arrows.

subguide

<function | string> Sub-guide used to annotate the thickness scale. One of:

  • A function that takes a scale argument giving a ggplot2::Scale object and an orientation argument giving the orientation of the geometry and then returns a grid::grob that will draw the axis annotation, such as subguide_axis() (to draw a traditional axis) or subguide_none() (to draw no annotation). See subguide_axis() for a list of possibilities and examples.

  • A string giving the name of such a function when prefixed with "subguide_"; e.g. "axis" or "none". The values "slab", "dots", and "spike" use the default subguide for their geom families (no subguide), which can be modified by setting subguide_slab, subguide_dots, or subguide_spike; see the documentation for those functions.

quantiles

<scalar numeric> Number of quantiles to plot in the dotplot. Use NA (the default) to plot all data points. Setting this to a value other than NA will produce a quantile dotplot: that is, a dotplot of quantiles from the sample or distribution (for analytical distributions, the default of NA is taken to mean 100 quantiles). See Kay et al. (2016) and Fernandes et al. (2018) for more information on quantile dotplots.
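To make the idea of a quantile dotplot concrete, the sketch below computes 50 quantiles of a standard normal by hand and compares that with asking the stat for 50 quantiles. The evenly spaced probabilities from ppoints() are an assumption made for illustration; the exact probabilities ggdist uses are not stated here.

```r
library(ggplot2)
library(ggdist)
library(distributional)

# Probabilities at the midpoints of 50 equal-width slices of [0, 1]
# (assumed spacing, for illustration only)
probs <- ppoints(50, a = 1/2)

# The corresponding quantiles of a standard normal: one value per dot
q <- qnorm(probs)

# A plain dotplot of the hand-computed quantiles ...
p_manual <- ggplot(data.frame(q = q), aes(x = q)) +
  stat_dots()

# ... versus a quantile dotplot requested from the stat directly
p_stat <- ggplot(data.frame(mean = 0, sd = 1),
                 aes(xdist = dist_normal(mean, sd))) +
  stat_dots(quantiles = 50)
```

Each dot then stands for a 1-in-50 chance, which is the frequency framing that quantile dotplots rely on.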

point_interval

<function | string> A function from the point_interval() family (e.g., median_qi, mean_qi, mode_hdi, etc), or a string giving the name of a function from that family (e.g., "median_qi", "mean_qi", "mode_hdi", etc; if a string, the caller's environment is searched for the function, followed by the ggdist environment). This function determines the point summary (typically mean, median, or mode) and interval type (quantile interval, qi; highest-density interval, hdi; or highest-density continuous interval, hdci). Output will be converted to the appropriate x- or y-based aesthetics depending on the value of orientation. See the point_interval() family of functions for more information.

.width

<numeric> The .width argument passed to point_interval: a vector of probabilities to use that determine the widths of the resulting intervals. If multiple probabilities are provided, multiple intervals per group are generated, each with a different probability interval (and value of the corresponding .width and level generated variables).
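A short sketch of how point_interval and .width fit together, on a made-up sample: the summary function is first called directly, then the same choice is passed to the stat.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# Hypothetical sample of draws
x <- rnorm(1000)

# Called directly on a numeric vector, a point_interval() function
# returns one row per requested interval width
s <- mean_qi(x, .width = c(0.5, 0.8, 0.95))
print(s)

# The same summary function and widths, supplied to the stat by name
p <- ggplot(data.frame(x = x), aes(x = x)) +
  stat_dotsinterval(point_interval = "mean_qi", .width = c(0.5, 0.8, 0.95))
```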

orientation

<string> Whether this geom is drawn horizontally or vertically. One of:

  • NA (default): automatically detect the orientation based on how the aesthetics are assigned. Automatic detection works most of the time.

  • "horizontal" (or "y"): draw horizontally, using the y aesthetic to identify different groups. For each group, uses the x, xmin, xmax, and thickness aesthetics to draw points, intervals, and slabs.

  • "vertical" (or "x"): draw vertically, using the x aesthetic to identify different groups. For each group, uses the y, ymin, ymax, and thickness aesthetics to draw points, intervals, and slabs.

For compatibility with the base ggplot naming scheme for orientation, "x" can be used as an alias for "vertical" and "y" as an alias for "horizontal" (ggdist had an orientation parameter before base ggplot did, hence the discrepancy).

na.rm

<scalar logical> If FALSE, the default, missing values are removed with a warning. If TRUE, missing values are silently removed.

show.legend

<logical> Should this layer be included in the legends? The default here is c(size = FALSE) (see Usage), unlike most geoms, where the default is NA. NA includes the layer if any aesthetics are mapped, FALSE never includes it, and TRUE always includes it. It can also be a named logical vector to finely select the aesthetics to display.

inherit.aes

If FALSE, overrides the default aesthetics, rather than combining with them. This is most useful for helper functions that define both data and aesthetics and shouldn't inherit behaviour from the default plot specification, e.g. borders().

check.aes, check.param

If TRUE, the default, will check that supplied parameters and aesthetics are understood by the geom or stat. Use FALSE to suppress the checks.

Details

The dots family of stats and geoms are similar to ggplot2::geom_dotplot() but with a number of differences:

Stats and geoms in this family include:

stat_dots() and stat_dotsinterval(), when used with the quantiles argument, are particularly useful for constructing quantile dotplots, which can be an effective way to communicate uncertainty using a frequency framing that may be easier for laypeople to understand (Kay et al. 2016, Fernandes et al. 2018).

To visualize sample data, such as a data distribution, samples from a bootstrap distribution, or a Bayesian posterior, you can supply samples to the x or y aesthetic.

To visualize analytical distributions, you can use the xdist or ydist aesthetic. For historical reasons, you can also use dist to specify the distribution, though this is not recommended as it does not work as well with orientation detection. These aesthetics can be used as follows:
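A minimal sketch of the xdist and ydist aesthetics with distributional objects; the data frame `dist_df` and its parameter columns are made up for illustration:

```r
library(ggplot2)
library(ggdist)
library(distributional)

# Hypothetical parameters of two normal distributions
dist_df <- data.frame(
  group = c("a", "b"),
  mean = c(0, 2),
  sd = c(1, 0.5)
)

# xdist: distributions laid out along the x axis (horizontal orientation)
p_horizontal <- ggplot(dist_df, aes(y = group, xdist = dist_normal(mean, sd))) +
  stat_dotsinterval(quantiles = 100)

# ydist: the same distributions along the y axis (vertical orientation)
p_vertical <- ggplot(dist_df, aes(x = group, ydist = dist_normal(mean, sd))) +
  stat_dotsinterval(quantiles = 100)
```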

Value

A ggplot2::Stat representing a dots + point + interval geometry which can be added to a ggplot() object.

Computed Variables

The following variables are computed by this stat and made available for use in aesthetic specifications (aes()) using the after_stat() function or the after_stat argument of stage():

Aesthetics

The dots+interval stats and geoms have a wide variety of aesthetics that control the appearance of their three sub-geometries: the dots (aka the slab), the point, and the interval.

These stats support the following aesthetics:

In addition, in their default configuration (paired with geom_dotsinterval()) the following aesthetics are supported by the underlying geom:

Dots-specific (aka Slab-specific) aesthetics

Interval-specific aesthetics

Point-specific aesthetics

Color aesthetics

Line aesthetics

Slab-specific color and line override aesthetics

Interval-specific color and line override aesthetics

Point-specific color and line override aesthetics

Deprecated aesthetics

Other aesthetics (these work as in standard geoms)

See examples of some of these aesthetics in action in vignette("dotsinterval"). Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").

References

Kay, M., Kola, T., Hullman, J. R., & Munson, S. A. (2016). When (ish) is My Bus? User-centered Visualizations of Uncertainty in Everyday, Mobile Predictive Systems. Conference on Human Factors in Computing Systems - CHI '16, 5092–5103. doi:10.1145/2858036.2858558.

Fernandes, M., Walls, L., Munson, S., Hullman, J., & Kay, M. (2018). Uncertainty Displays Using Quantile Dotplots or CDFs Improve Transit Decision-Making. Conference on Human Factors in Computing Systems - CHI '18. doi:10.1145/3173574.3173718.

See Also

See geom_dotsinterval() for the geom underlying this stat. See vignette("dotsinterval") for a variety of examples of use.

Other dotsinterval stats: stat_dots(), stat_mcse_dots()

Examples

library(dplyr)
library(ggplot2)
library(ggdist)
library(distributional)

theme_set(theme_ggdist())

# ON SAMPLE DATA
set.seed(12345)
tibble(
  x = rep(1:10, 100),
  y = rnorm(1000, x)
) %>%
  ggplot(aes(x = x, y = y)) +
  stat_dotsinterval()

# ON ANALYTICAL DISTRIBUTIONS
# Vectorized distribution types, like distributional::dist_normal()
# and posterior::rvar(), can be used with the `xdist` / `ydist` aesthetics
tibble(
  x = 1:10,
  sd = seq(1, 3, length.out = 10)
) %>%
  ggplot(aes(x = x, ydist = dist_normal(x, sd))) +
  stat_dotsinterval(quantiles = 50)

Eye (violin + interval) plot (shortcut stat)

Description

Shortcut version of stat_slabinterval() with geom_slabinterval() for creating eye (violin + interval) plots.

Roughly equivalent to:

stat_slabinterval(
  aes(side = after_stat("both"))
)
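As an illustration of the equivalence, the two plots below should look roughly the same on made-up sample data. Here side is set as a fixed value rather than through aes(), which is assumed to have the same effect:

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# Hypothetical sample data for three groups
df <- data.frame(
  group = c("a", "b", "c"),
  value = rnorm(1500, mean = c(5, 7, 9), sd = c(1, 1.5, 1))
)

# The shortcut stat
p_eye <- ggplot(df, aes(x = value, y = group)) +
  stat_eye()

# The underlying stat with the slab drawn on both sides
p_slabinterval <- ggplot(df, aes(x = value, y = group)) +
  stat_slabinterval(side = "both")
```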

Usage

stat_eye(
  mapping = NULL,
  data = NULL,
  geom = "slabinterval",
  position = "identity",
  ...,
  p_limits = c(NA, NA),
  density = "bounded",
  adjust = waiver(),
  trim = waiver(),
  breaks = waiver(),
  align = waiver(),
  outline_bars = waiver(),
  expand = FALSE,
  point_interval = "median_qi",
  limits = NULL,
  n = waiver(),
  .width = c(0.66, 0.95),
  orientation = NA,
  na.rm = FALSE,
  show.legend = c(size = FALSE),
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)

Arguments

mapping

Set of aesthetic mappings created by aes(). If specified and inherit.aes = TRUE (the default), it is combined with the default mapping at the top level of the plot. You must supply mapping if there is no plot mapping.

data

The data to be displayed in this layer. There are three options:

  • If NULL, the default, the data is inherited from the plot data as specified in the call to ggplot().

  • A data.frame, or other object, will override the plot data. All objects will be fortified to produce a data frame. See fortify() for which variables will be created.

  • A function will be called with a single argument, the plot data. The return value must be a data.frame, and will be used as the layer data. A function can be created from a formula (e.g. ~ head(.x, 10)).

geom

<Geom | string> Use to override the default connection between stat_eye() and geom_slabinterval().

position

<Position | string> Position adjustment, either as a string, or the result of a call to a position adjustment function. Setting this equal to "dodge" (position_dodge()) or "dodgejust" (position_dodgejust()) can be useful if you have overlapping geometries.

...

Other arguments passed to layer(). These are often aesthetics, used to set an aesthetic to a fixed value, like colour = "red" or linewidth = 3 (see Aesthetics, below). They may also be parameters to the paired geom/stat. When paired with the default geom, geom_slabinterval(), these include:

subscale

<function | string> Sub-scale used to scale values of the thickness aesthetic within the groups determined by normalize. One of:

  • A function that takes an x argument giving a numeric vector of values to be scaled and then returns a thickness vector representing the scaled values, such as subscale_thickness() or subscale_identity().

  • A string giving the name of such a function when prefixed with "subscale_"; e.g. "thickness" or "identity". The value "thickness" uses the default subscale, which can be modified by setting subscale_thickness; see the documentation for that function.

For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.

normalize

<string> Groups within which to scale values of the thickness aesthetic. One of:

  • "all": normalize so that the maximum height across all data is 1.

  • "panels": normalize within panels so that the maximum height in each panel is 1.

  • "xy": normalize within the x/y axis opposite the orientation of this geom so that the maximum height at each value of the opposite axis is 1.

  • "groups": normalize within values of the opposite axis and within each group so that the maximum height in each group is 1.

  • "none": values are taken as is with no normalization (this should probably only be used with functions whose values are in [0,1], such as CDFs).

For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.

fill_type

<string> What type of fill to use when the fill color or alpha varies within a slab. One of:

  • "segments": breaks up the slab geometry into segments for each unique combination of fill color and alpha value. This approach is supported by all graphics devices and works well for sharp cutoff values, but can give ugly results if a large number of unique fill colors are being used (as in gradients, like in stat_gradientinterval()).

  • "gradient": a grid::linearGradient() is used to create a smooth gradient fill. This works well for large numbers of unique fill colors, but requires R >= 4.1 and is not yet supported on all graphics devices. As of this writing, the png() graphics device with type = "cairo", the svg() device, the pdf() device, and the ragg::agg_png() devices are known to support this option. On R < 4.1, this option will fall back to fill_type = "segments" with a message.

  • "auto": attempts to use fill_type = "gradient" if support for it can be auto-detected. On R >= 4.2, support for gradients can be auto-detected on some graphics devices; if support is not detected, this option will fall back to fill_type = "segments" (in case of a false negative, fill_type = "gradient" can be set explicitly). On R < 4.2, support for gradients cannot be auto-detected, so this will always fall back to fill_type = "segments", in which case you can set fill_type = "gradient" explicitly if you are using a graphics device that supports gradients.

interval_size_domain

<length-2 numeric> Minimum and maximum of the values of the size and linewidth aesthetics that will be translated into actual sizes for intervals drawn according to interval_size_range (see the documentation for that argument).

interval_size_range

<length-2 numeric> This geom scales the raw size aesthetic values when drawing interval and point sizes, as they tend to be too thick when using the default settings of scale_size_continuous(), which give sizes with a range of c(1, 6). The interval_size_domain value indicates the input domain of raw size values (typically this should be equal to the value of the range argument of the scale_size_continuous() function), and interval_size_range indicates the desired output range of the size values (the min and max of the actual sizes used to draw intervals). Most of the time it is not recommended to change the value of this argument, as it may result in strange scaling of legends; this argument is a holdover from earlier versions that did not have size aesthetics targeting the point and interval separately. If you want to adjust the size of the interval or points separately, you can also use the linewidth or point_size aesthetics; see sub-geometry-scales.

fatten_point

<scalar numeric> A multiplicative factor used to adjust the size of the point relative to the size of the thickest interval line. If you wish to specify point sizes directly, you can also use the point_size aesthetic and scale_point_size_continuous() or scale_point_size_discrete(); sizes specified with that aesthetic will not be adjusted using fatten_point.

arrow

<arrow | NULL> Type of arrow heads to use on the interval, or NULL for no arrows.

subguide

<function | string> Sub-guide used to annotate the thickness scale. One of:

  • A function that takes a scale argument giving a ggplot2::Scale object and an orientation argument giving the orientation of the geometry and then returns a grid::grob that will draw the axis annotation, such as subguide_axis() (to draw a traditional axis) or subguide_none() (to draw no annotation). See subguide_axis() for a list of possibilities and examples.

  • A string giving the name of such a function when prefixed with "subguide_"; e.g. "axis" or "none". The values "slab", "dots", and "spike" use the default subguide for their geom families (no subguide), which can be modified by setting subguide_slab, subguide_dots, or subguide_spike; see the documentation for those functions.

p_limits

<length-2 numeric> Probability limits. Used to determine the lower and upper limits of analytical distributions (distributions from samples ignore this parameter and determine their limits based on the limits of the sample and the value of the trim parameter). E.g., if this is c(.001, .999), then a slab is drawn for the distribution from the quantile at p = .001 to the quantile at p = .999. If the lower (respectively upper) limit is NA, then the lower (upper) limit will be the minimum (maximum) of the distribution's support if it is finite, and 0.001 (0.999) if it is not finite. E.g., if p_limits is c(NA, NA), on a gamma distribution the effective value of p_limits would be c(0, .999) since the gamma distribution is defined on (0, Inf); whereas on a normal distribution it would be equivalent to c(.001, .999) since the normal distribution is defined on (-Inf, Inf).
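The effective limits described above can be reproduced with base R quantile functions. This is only an illustration of the rule, not the package's internal code; the gamma shape parameter is arbitrary.

```r
library(ggplot2)
library(ggdist)
library(distributional)

# Gamma: the support has a finite minimum (0), so with p_limits = c(NA, NA)
# the lower limit is 0 and only the upper limit uses the 0.999 quantile
gamma_limits <- qgamma(c(0, 0.999), shape = 2)

# Normal: the support is unbounded, so both limits fall back to quantiles
normal_limits <- qnorm(c(0.001, 0.999))

# Explicit, narrower limits can be requested through p_limits
p <- ggplot(data.frame(mean = 0, sd = 1),
            aes(xdist = dist_normal(mean, sd))) +
  stat_eye(p_limits = c(0.01, 0.99))
```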

density

<function | string> Density estimator for sample data. One of:

  • A function which takes a numeric vector and returns a list with elements x (giving grid points for the density estimator) and y (the corresponding densities). ggdist provides a family of functions following this format, including density_unbounded() and density_bounded(). This format is also compatible with stats::density().

  • A string giving the suffix of a function name that starts with "density_"; e.g. "bounded" for density_bounded(), "unbounded" for density_unbounded(), or "histogram" for density_histogram(). Defaults to "bounded", i.e. density_bounded(), which estimates the bounds from the data and then uses a bounded density estimator based on the reflection method.
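A brief sketch of the density estimator format on made-up data: the estimators are first called directly to show the shared x / y structure, then one is selected by its suffix in the stat.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# Hypothetical sample with a hard lower bound at 0
x <- rexp(500)

# Both results carry grid points in $x and densities in $y
d_bounded <- density_bounded(x)
d_base <- stats::density(x)

# Selecting an estimator by the suffix of its function name
p <- ggplot(data.frame(x = x), aes(x = x)) +
  stat_eye(density = "unbounded")
```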

adjust

<scalar numeric | waiver> Passed to density (e.g. density_bounded()): Value to multiply the bandwidth of the density estimator by. Default waiver() defers to the default of the density estimator, which is usually 1.

trim

<scalar logical | waiver> Passed to density (e.g. density_bounded()): Should the density estimate be trimmed to the range of the data? Default waiver() defers to the default of the density estimator, which is usually TRUE.

breaks

<numeric | function | string | waiver> Passed to density (e.g. density_histogram()): Determines the breakpoints defining bins. Default waiver() defers to the default of the density estimator, which is usually "Scott". Similar to (but not exactly the same as) the breaks argument to graphics::hist(). One of:

  • A scalar (length-1) numeric giving the number of bins

  • A numeric vector giving the breakpoints between histogram bins

  • A function taking x and weights and returning either the number of bins or a vector of breakpoints

  • A string giving the suffix of a function that starts with "breaks_". ggdist provides weighted implementations of the "Sturges", "Scott", and "FD" break-finding algorithms from graphics::hist(), as well as breaks_fixed() for manually setting the bin width. See breaks.

For example, breaks = "Sturges" will use the breaks_Sturges() algorithm, breaks = 9 will create 9 bins, and breaks = breaks_fixed(width = 1) will set the bin width to 1.

align

<scalar numeric | function | string | waiver> Passed to density (e.g. density_histogram()): Determines how to align the breakpoints defining bins. Default waiver() defers to the default of the density estimator, which is usually "none" (performs no alignment). One of:

  • A scalar (length-1) numeric giving an offset that is subtracted from the breaks. The offset must be between 0 and the bin width.

  • A function taking a sorted vector of breaks (bin edges) and returning an offset to subtract from the breaks.

  • A string giving the suffix of a function that starts with "align_" used to determine the alignment, such as align_none(), align_boundary(), or align_center().

For example, align = "none" will provide no alignment, align = align_center(at = 0) will center a bin on 0, and align = align_boundary(at = 0) will align a bin edge on 0.
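A sketch combining breaks and align with the histogram density estimator, using the values from the examples above on made-up data:

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# Hypothetical sample data
x <- rnorm(500, mean = 5)

# Called directly: bins of width 1 with a bin edge aligned on 0
h <- density_histogram(
  x,
  breaks = breaks_fixed(width = 1),
  align = align_boundary(at = 0)
)

# The same settings passed through the stat to the density estimator
p <- ggplot(data.frame(x = x), aes(x = x)) +
  stat_eye(
    density = "histogram",
    breaks = breaks_fixed(width = 1),
    align = align_boundary(at = 0)
  )
```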

outline_bars

<scalar logical | waiver> Passed to density (e.g. density_histogram()) and also used for discrete analytical distributions (whose slabs are drawn as histograms). Determines if outlines in between the bars are drawn. If waiver() or FALSE (the default), the outline is drawn only along the tops of the bars. If TRUE, outlines in between bars are also drawn (though you may have to set the slab_color or color aesthetic to see the outlines).

expand

<logical> For sample data, should the slab be expanded to the limits of the scale? Default FALSE. Can be a length-two logical vector to control expansion to the lower and upper limit respectively.

point_interval

<function | string> A function from the point_interval() family (e.g., median_qi, mean_qi, mode_hdi, etc), or a string giving the name of a function from that family (e.g., "median_qi", "mean_qi", "mode_hdi", etc; if a string, the caller's environment is searched for the function, followed by the ggdist environment). This function determines the point summary (typically mean, median, or mode) and interval type (quantile interval, qi; highest-density interval, hdi; or highest-density continuous interval, hdci). Output will be converted to the appropriate x- or y-based aesthetics depending on the value of orientation. See the point_interval() family of functions for more information.

limits

<length-2 numeric> Manually-specified limits for the slab, as a vector of length two. These limits are combined with those computed based on p_limits as well as the limits defined by the scales of the plot to determine the limits used to draw the slab functions: these limits specify the maximal limits; i.e., if specified, the limits will not be wider than these (but may be narrower). Use NA to leave a limit alone; e.g. limits = c(0, NA) will ensure that the lower limit does not go below 0, but let the upper limit be determined by either p_limits or the scale settings.

n

<scalar numeric> Number of points at which to evaluate the function that defines the slab. Also passed to density (e.g. density_bounded()). Default waiver() uses the value 501 for analytical distributions and defers to the default of the density estimator for sample-based distributions, which is also usually 501.

.width

<numeric> The .width argument passed to point_interval: a vector of probabilities to use that determine the widths of the resulting intervals. If multiple probabilities are provided, multiple intervals per group are generated, each with a different probability interval (and value of the corresponding .width and level generated variables).

orientation

<string> Whether this geom is drawn horizontally or vertically. One of:

  • NA (default): automatically detect the orientation based on how the aesthetics are assigned. Automatic detection works most of the time.

  • "horizontal" (or "y"): draw horizontally, using the y aesthetic to identify different groups. For each group, uses the x, xmin, xmax, and thickness aesthetics to draw points, intervals, and slabs.

  • "vertical" (or "x"): draw vertically, using the x aesthetic to identify different groups. For each group, uses the y, ymin, ymax, and thickness aesthetics to draw points, intervals, and slabs.

For compatibility with the base ggplot naming scheme for orientation, "x" can be used as an alias for "vertical" and "y" as an alias for "horizontal" (ggdist had an orientation parameter before base ggplot did, hence the discrepancy).

na.rm

<scalar logical> If FALSE, the default, missing values are removed with a warning. If TRUE, missing values are silently removed.

show.legend

<logical> Should this layer be included in the legends? Default is c(size = FALSE), unlike most geoms, to match its common use cases. FALSE hides all legends, TRUE shows all legends, and NA shows only those that are mapped (the default for most geoms). It can also be a named logical vector to finely select the aesthetics to display.

inherit.aes

If FALSE, overrides the default aesthetics, rather than combining with them. This is most useful for helper functions that define both data and aesthetics and shouldn't inherit behaviour from the default plot specification, e.g. borders().

check.aes, check.param

If TRUE, the default, will check that supplied parameters and aesthetics are understood by the geom or stat. Use FALSE to suppress the checks.

Details

To visualize sample data, such as a data distribution, samples from a bootstrap distribution, or a Bayesian posterior, you can supply samples to the x or y aesthetic.

To visualize analytical distributions, you can use the xdist or ydist aesthetic. For historical reasons, you can also use dist to specify the distribution, though this is not recommended as it does not work as well with orientation detection. These aesthetics can be used as follows:

Value

A ggplot2::Stat representing an eye (violin + interval) geometry which can be added to a ggplot() object.

Computed Variables

The following variables are computed by this stat and made available for use in aesthetic specifications (aes()) using the after_stat() function or the after_stat argument of stage():

Aesthetics

The slab+interval stats and geoms have a wide variety of aesthetics that control the appearance of their three sub-geometries: the slab, the point, and the interval.

These stats support the following aesthetics:

In addition, in their default configuration (paired with geom_slabinterval()) the following aesthetics are supported by the underlying geom:

Slab-specific aesthetics

Interval-specific aesthetics

Point-specific aesthetics

Color aesthetics

Line aesthetics

Slab-specific color and line override aesthetics

Interval-specific color and line override aesthetics

Point-specific color and line override aesthetics

Deprecated aesthetics

Other aesthetics (these work as in standard geoms)

See examples of some of these aesthetics in action in vignette("slabinterval"). Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").
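A small sketch of the sub-geometry override aesthetics, set here as fixed values rather than mapped. The data frame and the specific colours are assumptions for illustration; interval_color, slab_color, and point_size are the aesthetic names mentioned in this documentation:

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# Hypothetical sample data
df <- data.frame(
  group = rep(c("a", "b"), each = 500),
  value = rnorm(1000, mean = rep(c(5, 7), each = 500))
)

p_override <- ggplot(df, aes(x = value, y = group)) +
  stat_eye(
    slab_color = "gray30",        # outline of the slab only
    interval_color = "firebrick", # colour of the interval lines only
    point_size = 3                # size of the point summary only
  )
```

Because each override targets a single sub-geometry, the plain color aesthetic can still be used for whatever is not overridden.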

See Also

See geom_slabinterval() for the geom underlying this stat. See stat_slabinterval() for the stat this shortcut is based on.

Other slabinterval stats: stat_ccdfinterval(), stat_cdfinterval(), stat_gradientinterval(), stat_halfeye(), stat_histinterval(), stat_interval(), stat_pointinterval(), stat_slab(), stat_spike()

Examples

library(dplyr)
library(ggplot2)
library(distributional)
library(ggdist)  # provides stat_eye() and theme_ggdist()
theme_set(theme_ggdist())
# ON SAMPLE DATA
set.seed(1234)
df = data.frame(
  group = c("a", "b", "c"),
  value = rnorm(1500, mean = c(5, 7, 9), sd = c(1, 1.5, 1))
)
df %>%
  ggplot(aes(x = value, y = group)) +
  stat_eye()
# ON ANALYTICAL DISTRIBUTIONS
dist_df = data.frame(
  group = c("a", "b", "c"),
  mean =  c(  5,   7,   8),
  sd =    c(  1, 1.5,   1)
)
# Vectorized distribution types, like distributional::dist_normal()
# and posterior::rvar(), can be used with the `xdist` / `ydist` aesthetics
dist_df %>%
  ggplot(aes(y = group, xdist = dist_normal(mean, sd))) +
  stat_eye()

Gradient + interval plot (shortcut stat)

Description

Shortcut version of stat_slabinterval() with geom_slabinterval() for creating gradient + interval plots.

Roughly equivalent to:

stat_slabinterval(
  aes(
    justification = after_stat(0.5),
    thickness = after_stat(thickness(1)),
    slab_alpha = after_stat(f)
  ),
  fill_type = "auto",
  show.legend = c(size = FALSE, slab_alpha = FALSE)
)

If your graphics device supports it, it is recommended to use this stat with fill_type = "gradient" (see the description of that parameter). On R >= 4.2, support for fill_type = "gradient" should be auto-detected based on the graphics device you are using.
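A minimal sketch of requesting the gradient fill explicitly, assuming a device that supports it. The data frame and the output file name in the comment are hypothetical:

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# Hypothetical sample data
df <- data.frame(
  group = rep(c("a", "b", "c"), each = 500),
  value = rnorm(1500, mean = rep(c(5, 7, 9), each = 500))
)

# Ask for a smooth gradient instead of relying on auto-detection
p_gradient <- ggplot(df, aes(x = value, y = group)) +
  stat_gradientinterval(fill_type = "gradient")

# To render, use a device known to support gradients, for example:
# ragg::agg_png("gradient.png"); print(p_gradient); dev.off()
```

On a device without gradient support, fill_type = "segments" remains the portable choice.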

Usage

stat_gradientinterval(
  mapping = NULL,
  data = NULL,
  geom = "slabinterval",
  position = "identity",
  ...,
  fill_type = "auto",
  p_limits = c(NA, NA),
  density = "bounded",
  adjust = waiver(),
  trim = waiver(),
  breaks = waiver(),
  align = waiver(),
  outline_bars = waiver(),
  expand = FALSE,
  point_interval = "median_qi",
  limits = NULL,
  n = waiver(),
  .width = c(0.66, 0.95),
  orientation = NA,
  na.rm = FALSE,
  show.legend = c(size = FALSE, slab_alpha = FALSE),
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)

Arguments

mapping

Set of aesthetic mappings created by aes(). If specified and inherit.aes = TRUE (the default), it is combined with the default mapping at the top level of the plot. You must supply mapping if there is no plot mapping.

data

The data to be displayed in this layer. There are three options:

If NULL, the default, the data is inherited from the plot data as specified in the call to ggplot().

A data.frame, or other object, will override the plot data. All objects will be fortified to produce a data frame. See fortify() for which variables will be created.

A function will be called with a single argument, the plot data. The return value must be a data.frame, and will be used as the layer data. A function can be created from a formula (e.g. ~ head(.x, 10)).

geom

<Geom | string> Use to override the default connection between stat_gradientinterval() and geom_slabinterval().

position

<Position | string> Position adjustment, either as a string, or the result of a call to a position adjustment function. Setting this equal to "dodge" (position_dodge()) or "dodgejust" (position_dodgejust()) can be useful if you have overlapping geometries.
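A short sketch of dodging overlapping geometries. The data frame and its condition column are invented for the example; only the "dodge" and "dodgejust" position names come from the text above:

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# Hypothetical data: two conditions within each of two groups
df <- data.frame(
  group = rep(c("a", "b"), each = 400),
  condition = rep(rep(c("control", "treatment"), each = 200), times = 2),
  value = rnorm(800, mean = rep(c(5, 6, 7, 8), each = 200))
)

# Without dodging, the two conditions in a group would be drawn on top of each other
p_dodge <- ggplot(df, aes(x = value, y = group, fill = condition)) +
  stat_gradientinterval(position = "dodge")

# "dodgejust" additionally takes the justification of the slab into account
p_dodgejust <- ggplot(df, aes(x = value, y = group, fill = condition)) +
  stat_gradientinterval(position = "dodgejust")
```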

...

Other arguments passed to layer(). These are often aesthetics, used to set an aesthetic to a fixed value, like colour = "red" or linewidth = 3 (see Aesthetics, below). They may also be parameters to the paired geom/stat. When paired with the default geom, geom_slabinterval(), these include:

subscale

<function | string> Sub-scale used to scale values of the thickness aesthetic within the groups determined by normalize. One of:

  • A function that takes an x argument giving a numeric vector of values to be scaled and then returns a thickness vector representing the scaled values, such as subscale_thickness() or subscale_identity().

  • A string giving the name of such a function when prefixed with "subscale_"; e.g. "thickness" or "identity". The value "thickness" uses the default subscale, which can be modified by setting subscale_thickness; see the documentation for that function.

For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.

normalize

<string> Groups within which to scale values of the thickness aesthetic. One of:

  • "all": normalize so that the maximum height across all data is 1.

  • "panels": normalize within panels so that the maximum height in each panel is 1.

  • "xy": normalize within the x/y axis opposite the orientation of this geom so that the maximum height at each value of the opposite axis is 1.

  • "groups": normalize within values of the opposite axis and within each group so that the maximum height in each group is 1.

  • "none": values are taken as is with no normalization (this should probably only be used with functions whose values are in [0,1], such as CDFs).

For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.

interval_size_domain

<length-2 numeric> Minimum and maximum of the values of the size and linewidth aesthetics that will be translated into actual sizes for intervals drawn according to interval_size_range (see the documentation for that argument).

interval_size_range

<length-2 numeric> This geom scales the raw size aesthetic values when drawing interval and point sizes, as they tend to be too thick when using the default settings of scale_size_continuous(), which give sizes with a range of c(1, 6). The interval_size_domain value indicates the input domain of raw size values (typically this should be equal to the value of the range argument of the scale_size_continuous() function), and interval_size_range indicates the desired output range of the size values (the min and max of the actual sizes used to draw intervals). Most of the time it is not recommended to change the value of this argument, as it may result in strange scaling of legends; this argument is a holdover from earlier versions that did not have size aesthetics targeting the point and interval separately. If you want to adjust the size of the interval or points separately, you can also use the linewidth or point_size aesthetics; see sub-geometry-scales.

fatten_point

<scalar numeric> A multiplicative factor used to adjust the size of the point relative to the size of the thickest interval line. If you wish to specify point sizes directly, you can also use the point_size aesthetic and scale_point_size_continuous() or scale_point_size_discrete(); sizes specified with that aesthetic will not be adjusted using fatten_point.

arrow

<arrow | NULL> Type of arrow heads to use on the interval, or NULL for no arrows.

subguide

<function | string> Sub-guide used to annotate the thickness scale. One of:

  • A function that takes a scale argument giving a ggplot2::Scale object and an orientation argument giving the orientation of the geometry and then returns a grid::grob that will draw the axis annotation, such as subguide_axis() (to draw a traditional axis) or subguide_none() (to draw no annotation). See subguide_axis() for a list of possibilities and examples.

  • A string giving the name of such a function when prefixed with "subguide_"; e.g. "axis" or "none". The values "slab", "dots", and "spike" use the default subguide for their geom families (no subguide), which can be modified by setting subguide_slab, subguide_dots, or subguide_spike; see the documentation for those functions.

fill_type

<string> What type of fill to use when the fill color or alpha varies within a slab. One of:

  • "segments": breaks up the slab geometry into segments for each unique combination of fill color and alpha value. This approach is supported by all graphics devices and works well for sharp cutoff values, but can give ugly results if a large number of unique fill colors are being used (as in gradients, like in stat_gradientinterval()).

  • "gradient": a grid::linearGradient() is used to create a smooth gradient fill. This works well for large numbers of unique fill colors, but requires R >= 4.1 and is not yet supported on all graphics devices. As of this writing, the png() graphics device with type = "cairo", the svg() device, the pdf() device, and the ragg::agg_png() devices are known to support this option. On R < 4.1, this option will fall back to fill_type = "segments" with a message.

  • "auto": attempts to use fill_type = "gradient" if support for it can be auto-detected. On R >= 4.2, support for gradients can be auto-detected on some graphics devices; if support is not detected, this option will fall back to fill_type = "segments" (in case of a false negative, fill_type = "gradient" can be set explicitly). On R < 4.2, support for gradients cannot be auto-detected, so this will always fall back to fill_type = "segments", in which case you can set fill_type = "gradient" explicitly if you are using a graphics device that supports gradients.

p_limits

<length-2 numeric> Probability limits. Used to determine the lower and upper limits of analytical distributions (distributions from samples ignore this parameter and determine their limits based on the limits of the sample and the value of the trim parameter). E.g., if this is c(.001, .999), then a slab is drawn for the distribution from the quantile at p = .001 to the quantile at p = .999. If the lower (respectively upper) limit is NA, then the lower (upper) limit will be the minimum (maximum) of the distribution's support if it is finite, and 0.001 (0.999) if it is not finite. E.g., if p_limits is c(NA, NA), on a gamma distribution the effective value of p_limits would be c(0, .999) since the gamma distribution is defined on (0, Inf); whereas on a normal distribution it would be equivalent to c(.001, .999) since the normal distribution is defined on (-Inf, Inf).

density

<function | string> Density estimator for sample data. One of:

  • A function which takes a numeric vector and returns a list with elements x (giving grid points for the density estimator) and y (the corresponding densities). ggdist provides a family of functions following this format, including density_unbounded() and density_bounded(). This format is also compatible with stats::density().

  • A string giving the suffix of a function name that starts with "density_"; e.g. "bounded" for density_bounded(), "unbounded" for density_unbounded(), or "histogram" for density_histogram(). Defaults to "bounded", i.e. density_bounded(), which estimates the bounds from the data and then uses a bounded density estimator based on the reflection method.

adjust

<scalar numeric | waiver> Passed to density (e.g. density_bounded()): Value to multiply the bandwidth of the density estimator by. Default waiver() defers to the default of the density estimator, which is usually 1.

trim

<scalar logical | waiver> Passed to density (e.g. density_bounded()): Should the density estimate be trimmed to the range of the data? Default waiver() defers to the default of the density estimator, which is usually TRUE.

breaks

<numeric | function | string | waiver> Passed to density (e.g. density_histogram()): Determines the breakpoints defining bins. Default waiver() defers to the default of the density estimator, which is usually "Scott". Similar to (but not exactly the same as) the breaks argument to graphics::hist(). One of:

  • A scalar (length-1) numeric giving the number of bins

  • A numeric vector giving the breakpoints between histogram bins

  • A function taking x and weights and returning either the number of bins or a vector of breakpoints

  • A string giving the suffix of a function that starts with "breaks_". ggdist provides weighted implementations of the "Sturges", "Scott", and "FD" break-finding algorithms from graphics::hist(), as well as breaks_fixed() for manually setting the bin width. See breaks.

For example, breaks = "Sturges" will use the breaks_Sturges() algorithm, breaks = 9 will create 9 bins, and breaks = breaks_fixed(width = 1) will set the bin width to 1.

align

<scalar numeric | function | string | waiver> Passed to density (e.g. density_histogram()): Determines how to align the breakpoints defining bins. Default waiver() defers to the default of the density estimator, which is usually "none" (performs no alignment). One of:

  • A scalar (length-1) numeric giving an offset that is subtracted from the breaks. The offset must be between 0 and the bin width.

  • A function taking a sorted vector of breaks (bin edges) and returning an offset to subtract from the breaks.

  • A string giving the suffix of a function that starts with "align_" used to determine the alignment, such as align_none(), align_boundary(), or align_center().

For example, align = "none" will provide no alignment, align = align_center(at = 0) will center a bin on 0, and align = align_boundary(at = 0) will align a bin edge on 0.
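The breaks and align arguments only take effect with a histogram-type density estimator. A minimal sketch, assuming made-up sample data and using stat_slabinterval() (the stat this shortcut is based on) with density = "histogram":

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# Hypothetical sample data
df <- data.frame(
  group = rep(c("a", "b"), each = 500),
  value = rnorm(1000, mean = rep(c(5, 8), each = 500), sd = 2)
)

# Bins of width 1 with a bin edge aligned on 0
p_fixed <- ggplot(df, aes(x = value, y = group)) +
  stat_slabinterval(
    density = "histogram",
    breaks = breaks_fixed(width = 1),
    align = align_boundary(at = 0)
  )

# String and numeric forms: the Sturges algorithm, or exactly 9 bins
p_sturges <- ggplot(df, aes(x = value, y = group)) +
  stat_slabinterval(density = "histogram", breaks = "Sturges")
p_nine <- ggplot(df, aes(x = value, y = group)) +
  stat_slabinterval(density = "histogram", breaks = 9)
```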

outline_bars

<scalar logical | waiver> Passed to density (e.g. density_histogram()) and also used for discrete analytical distributions (whose slabs are drawn as histograms). Determines if outlines in between the bars are drawn. If waiver() or FALSE (the default), the outline is drawn only along the tops of the bars. If TRUE, outlines in between bars are also drawn (though you may have to set the slab_color or color aesthetic to see the outlines).

expand

<logical> For sample data, should the slab be expanded to the limits of the scale? Default FALSE. Can be a length-two logical vector to control expansion to the lower and upper limit respectively.

point_interval

<function | string> A function from the point_interval() family (e.g., median_qi, mean_qi, mode_hdi, etc.), or a string giving the name of a function from that family (e.g., "median_qi", "mean_qi", "mode_hdi", etc.; if a string, the caller's environment is searched for the function, followed by the ggdist environment). This function determines the point summary (typically mean, median, or mode) and interval type (quantile interval, qi; highest-density interval, hdi; or highest-density continuous interval, hdci). Output will be converted to the appropriate x- or y-based aesthetics depending on the value of orientation. See the point_interval() family of functions for more information.
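To see what a point_interval() function produces on its own, it can be called directly on a numeric vector. This is a sketch with made-up samples; the column names checked at the end reflect how these functions behave on plain vectors as far as I am aware:

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
x <- rnorm(1000, mean = 5)  # hypothetical posterior-like samples

# One row per requested interval width: median plus lower/upper quantile bounds
summary_df <- median_qi(x, .width = c(0.66, 0.95))
print(summary_df)

# The same family can be passed to the stat by name or as a function
p_mean <- ggplot(data.frame(value = x), aes(x = value)) +
  stat_gradientinterval(point_interval = "mean_qi")
p_mean_fn <- ggplot(data.frame(value = x), aes(x = value)) +
  stat_gradientinterval(point_interval = mean_qi)
```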

limits

<length-2 numeric> Manually-specified limits for the slab, as a vector of length two. These limits are combined with those computed based on p_limits as well as the limits defined by the scales of the plot to determine the limits used to draw the slab functions: these limits specify the maximal limits; i.e., if specified, the limits will not be wider than these (but may be narrower). Use NA to leave a limit alone; e.g. limits = c(0, NA) will ensure that the lower limit does not go below 0, but let the upper limit be determined by either p_limits or the scale settings.

n

<scalar numeric> Number of points at which to evaluate the function that defines the slab. Also passed to density (e.g. density_bounded()). Default waiver() uses the value 501 for analytical distributions and defers to the default of the density estimator for sample-based distributions, which is also usually 501.

.width

<numeric> The .width argument passed to point_interval: a vector of probabilities to use that determine the widths of the resulting intervals. If multiple probabilities are provided, multiple intervals per group are generated, each with a different probability interval (and value of the corresponding .width and level generated variables).
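A brief sketch of requesting three nested intervals instead of the default two. The data frame is hypothetical; the probabilities are arbitrary choices for illustration:

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# Hypothetical sample data
df <- data.frame(
  group = rep(c("a", "b", "c"), each = 500),
  value = rnorm(1500, mean = rep(c(5, 7, 9), each = 500))
)

# 50%, 80%, and 95% intervals are drawn for every group
p_widths <- ggplot(df, aes(x = value, y = group)) +
  stat_gradientinterval(.width = c(0.5, 0.8, 0.95))
```

Each probability yields its own interval per group, so three values give three intervals stacked on the same point summary.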

orientation

<string> Whether this geom is drawn horizontally or vertically. One of:

  • NA (default): automatically detect the orientation based on how the aesthetics are assigned. Automatic detection works most of the time.

  • "horizontal" (or "y"): draw horizontally, using the y aesthetic to identify different groups. For each group, uses the x, xmin, xmax, and thickness aesthetics to draw points, intervals, and slabs.

  • "vertical" (or "x"): draw vertically, using the x aesthetic to identify different groups. For each group, uses the y, ymin, ymax, and thickness aesthetics to draw points, intervals, and slabs.

For compatibility with the base ggplot naming scheme for orientation, "x" can be used as an alias for "vertical" and "y" as an alias for "horizontal" (ggdist had an orientation parameter before base ggplot did, hence the discrepancy).

na.rm

<scalar logical> If FALSE, the default, missing values are removed with a warning. If TRUE, missing values are silently removed.

show.legend

<logical> Should this layer be included in the legends? Default is c(size = FALSE, slab_alpha = FALSE) (as shown under Usage), unlike most geoms, to match its common use cases. FALSE hides all legends, TRUE shows all legends, and NA shows only those that are mapped (the default for most geoms). It can also be a named logical vector to finely select the aesthetics to display.

inherit.aes

If FALSE, overrides the default aesthetics, rather than combining with them. This is most useful for helper functions that define both data and aesthetics and shouldn't inherit behaviour from the default plot specification, e.g. borders().

check.aes, check.param

If TRUE, the default, will check that supplied parameters and aesthetics are understood by the geom or stat. Use FALSE to suppress the checks.

Details

To visualize sample data, such as a data distribution, samples from a bootstrap distribution, or a Bayesian posterior, you can supply samples to the x or y aesthetic.

To visualize analytical distributions, you can use the xdist or ydist aesthetic. For historical reasons, you can also use dist to specify the distribution, though this is not recommended as it does not work as well with orientation detection. These aesthetics can be used as follows:

Value

A ggplot2::Stat representing a gradient + interval geometry which can be added to a ggplot() object.

Computed Variables

The following variables are computed by this stat and made available for use in aesthetic specifications (aes()) using the after_stat() function or the after_stat argument of stage():
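One computed variable that appears in this documentation is f, which the "Roughly equivalent to" snippet in the Description maps to slab_alpha. A sketch of using it through after_stat() on a half-eye, with made-up data:

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# Hypothetical sample data
df <- data.frame(
  group = rep(c("a", "b"), each = 500),
  value = rnorm(1000, mean = rep(c(5, 7), each = 500))
)

# Fade the slab according to the computed variable f,
# in addition to the usual density outline
p_after_stat <- ggplot(df, aes(x = value, y = group)) +
  stat_halfeye(aes(slab_alpha = after_stat(f)))
```

after_stat() delays evaluation of the mapping until the stat has run, which is why f can be referenced even though it is not a column of df.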

Aesthetics

The slab+interval stats and geoms have a wide variety of aesthetics that control the appearance of their three sub-geometries: the slab, the point, and the interval.

These stats support the following aesthetics:

In addition, in their default configuration (paired with geom_slabinterval()) the following aesthetics are supported by the underlying geom:

Slab-specific aesthetics

Interval-specific aesthetics

Point-specific aesthetics

Color aesthetics

Line aesthetics

Slab-specific color and line override aesthetics

Interval-specific color and line override aesthetics

Point-specific color and line override aesthetics

Deprecated aesthetics

Other aesthetics (these work as in standard geoms)

See examples of some of these aesthetics in action in vignette("slabinterval"). Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").

See Also

See geom_slabinterval() for the geom underlying this stat. See stat_slabinterval() for the stat this shortcut is based on.

Other slabinterval stats: stat_ccdfinterval(), stat_cdfinterval(), stat_eye(), stat_halfeye(), stat_histinterval(), stat_interval(), stat_pointinterval(), stat_slab(), stat_spike()

Examples

library(dplyr)
library(ggplot2)
library(distributional)
library(ggdist)  # provides stat_gradientinterval() and theme_ggdist()
theme_set(theme_ggdist())
# ON SAMPLE DATA
set.seed(1234)
df = data.frame(
  group = c("a", "b", "c"),
  value = rnorm(1500, mean = c(5, 7, 9), sd = c(1, 1.5, 1))
)
df %>%
  ggplot(aes(x = value, y = group)) +
  stat_gradientinterval()
# ON ANALYTICAL DISTRIBUTIONS
dist_df = data.frame(
  group = c("a", "b", "c"),
  mean =  c(  5,   7,   8),
  sd =    c(  1, 1.5,   1)
)
# Vectorized distribution types, like distributional::dist_normal()
# and posterior::rvar(), can be used with the `xdist` / `ydist` aesthetics
dist_df %>%
  ggplot(aes(y = group, xdist = dist_normal(mean, sd))) +
  stat_gradientinterval()

Half-eye (density + interval) plot (shortcut stat)

Description

Equivalent to stat_slabinterval(), whose default settings create half-eye (density + interval) plots.

Usage

stat_halfeye(
  mapping = NULL,
  data = NULL,
  geom = "slabinterval",
  position = "identity",
  ...,
  p_limits = c(NA, NA),
  density = "bounded",
  adjust = waiver(),
  trim = waiver(),
  breaks = waiver(),
  align = waiver(),
  outline_bars = waiver(),
  expand = FALSE,
  point_interval = "median_qi",
  limits = NULL,
  n = waiver(),
  .width = c(0.66, 0.95),
  orientation = NA,
  na.rm = FALSE,
  show.legend = c(size = FALSE),
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)

Arguments

mapping

Set of aesthetic mappings created by aes(). If specified and inherit.aes = TRUE (the default), it is combined with the default mapping at the top level of the plot. You must supply mapping if there is no plot mapping.

data

The data to be displayed in this layer. There are three options:

If NULL, the default, the data is inherited from the plot data as specified in the call to ggplot().

A data.frame, or other object, will override the plot data. All objects will be fortified to produce a data frame. See fortify() for which variables will be created.

A function will be called with a single argument, the plot data. The return value must be a data.frame, and will be used as the layer data. A function can be created from a formula (e.g. ~ head(.x, 10)).

geom

<Geom | string> Use to override the default connection between stat_halfeye() and geom_slabinterval().

position

<Position | string> Position adjustment, either as a string, or the result of a call to a position adjustment function. Setting this equal to "dodge" (position_dodge()) or "dodgejust" (position_dodgejust()) can be useful if you have overlapping geometries.

...

Other arguments passed to layer(). These are often aesthetics, used to set an aesthetic to a fixed value, like colour = "red" or linewidth = 3 (see Aesthetics, below). They may also be parameters to the paired geom/stat. When paired with the default geom, geom_slabinterval(), these include:

subscale

<function | string> Sub-scale used to scale values of the thickness aesthetic within the groups determined by normalize. One of:

  • A function that takes an x argument giving a numeric vector of values to be scaled and then returns a thickness vector representing the scaled values, such as subscale_thickness() or subscale_identity().

  • A string giving the name of such a function when prefixed with "subscale_"; e.g. "thickness" or "identity". The value "thickness" uses the default subscale, which can be modified by setting subscale_thickness; see the documentation for that function.

For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.

normalize

<string> Groups within which to scale values of the thickness aesthetic. One of:

  • "all": normalize so that the maximum height across all data is 1.

  • "panels": normalize within panels so that the maximum height in each panel is 1.

  • "xy": normalize within the x/y axis opposite the orientation of this geom so that the maximum height at each value of the opposite axis is 1.

  • "groups": normalize within values of the opposite axis and within each group so that the maximum height in each group is 1.

  • "none": values are taken as is with no normalization (this should probably only be used with functions whose values are in [0,1], such as CDFs).

For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.
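The effect of normalize is easiest to see when groups have very different spreads. A sketch with invented data in which one group is much more concentrated than the other:

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# Hypothetical data: group "a" is narrow (tall density), group "b" is wide (flat density)
df <- data.frame(
  group = rep(c("a", "b"), each = 500),
  value = c(rnorm(500, mean = 5, sd = 0.5), rnorm(500, mean = 7, sd = 3))
)

# "all": heights are comparable across groups, so "b" looks much flatter than "a"
p_all <- ggplot(df, aes(x = value, y = group)) +
  stat_halfeye(normalize = "all")

# "groups": every slab is rescaled to the same maximum height
p_groups <- ggplot(df, aes(x = value, y = group)) +
  stat_halfeye(normalize = "groups")
```

Normalizing within groups makes each shape easier to read but gives up the ability to compare density heights between groups.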

fill_type

<string> What type of fill to use when the fill color or alpha varies within a slab. One of:

  • "segments": breaks up the slab geometry into segments for each unique combination of fill color and alpha value. This approach is supported by all graphics devices and works well for sharp cutoff values, but can give ugly results if a large number of unique fill colors are being used (as in gradients, like in stat_gradientinterval()).

  • "gradient": a grid::linearGradient() is used to create a smooth gradient fill. This works well for large numbers of unique fill colors, but requires R >= 4.1 and is not yet supported on all graphics devices. As of this writing, the png() graphics device with type = "cairo", the svg() device, the pdf() device, and the ragg::agg_png() devices are known to support this option. On R < 4.1, this option will fall back to fill_type = "segments" with a message.

  • "auto": attempts to use fill_type = "gradient" if support for it can be auto-detected. On R >= 4.2, support for gradients can be auto-detected on some graphics devices; if support is not detected, this option will fall back to fill_type = "segments" (in case of a false negative, fill_type = "gradient" can be set explicitly). On R < 4.2, support for gradients cannot be auto-detected, so this will always fall back to fill_type = "segments", in which case you can set fill_type = "gradient" explicitly if you are using a graphics device that supports gradients.

interval_size_domain

<length-2 numeric> Minimum and maximum of the values of the size and linewidth aesthetics that will be translated into actual sizes for intervals drawn according to interval_size_range (see the documentation for that argument).

interval_size_range

<length-2 numeric> This geom scales the raw size aesthetic values when drawing interval and point sizes, as they tend to be too thick when using the default settings of scale_size_continuous(), which give sizes with a range of c(1, 6). The interval_size_domain value indicates the input domain of raw size values (typically this should be equal to the value of the range argument of the scale_size_continuous() function), and interval_size_range indicates the desired output range of the size values (the min and max of the actual sizes used to draw intervals). Most of the time it is not recommended to change the value of this argument, as it may result in strange scaling of legends; this argument is a holdover from earlier versions that did not have size aesthetics targeting the point and interval separately. If you want to adjust the size of the interval or points separately, you can also use the linewidth or point_size aesthetics; see sub-geometry-scales.

fatten_point

<scalar numeric> A multiplicative factor used to adjust the size of the point relative to the size of the thickest interval line. If you wish to specify point sizes directly, you can also use the point_size aesthetic and scale_point_size_continuous() or scale_point_size_discrete(); sizes specified with that aesthetic will not be adjusted using fatten_point.

arrow

<arrow | NULL> Type of arrow heads to use on the interval, or NULL for no arrows.

subguide

<function | string> Sub-guide used to annotate the thickness scale. One of:

  • A function that takes a scale argument giving a ggplot2::Scale object and an orientation argument giving the orientation of the geometry and then returns a grid::grob that will draw the axis annotation, such as subguide_axis() (to draw a traditional axis) or subguide_none() (to draw no annotation). See subguide_axis() for a list of possibilities and examples.

  • A string giving the name of such a function when prefixed with "subguide_"; e.g. "axis" or "none". The values "slab", "dots", and "spike" use the default subguide for their geom families (no subguide), which can be modified by setting subguide_slab, subguide_dots, or subguide_spike; see the documentation for those functions.

p_limits

<length-2 numeric> Probability limits. Used to determine the lower and upper limits of analytical distributions (distributions from samples ignore this parameter and determine their limits based on the limits of the sample and the value of the trim parameter). E.g., if this is c(.001, .999), then a slab is drawn for the distribution from the quantile at p = .001 to the quantile at p = .999. If the lower (respectively upper) limit is NA, then the lower (upper) limit will be the minimum (maximum) of the distribution's support if it is finite, and 0.001 (0.999) if it is not finite. E.g., if p_limits is c(NA, NA), on a gamma distribution the effective value of p_limits would be c(0, .999) since the gamma distribution is defined on (0, Inf); whereas on a normal distribution it would be equivalent to c(.001, .999) since the normal distribution is defined on (-Inf, Inf).
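The NA rule described above can be illustrated with base R quantile functions. The helper effective_limits() below is hypothetical, written only to mirror the rule as stated; it is not ggdist's internal code:

```r
# Hypothetical helper mirroring the stated rule:
# a non-NA p limit -> the quantile at that probability;
# an NA p limit    -> the support bound if finite, else the quantile at 0.001 / 0.999.
effective_limits <- function(p_limits, quantile_fun, support) {
  probs <- ifelse(is.na(p_limits), c(0.001, 0.999), p_limits)
  lims <- quantile_fun(probs)
  use_support <- is.na(p_limits) & is.finite(support)
  lims[use_support] <- support[use_support]
  lims
}

# Normal(5, 1): support is (-Inf, Inf), so both limits fall back to quantiles
normal_limits <- effective_limits(
  c(NA, NA), function(p) qnorm(p, mean = 5, sd = 1), c(-Inf, Inf)
)

# Gamma(shape = 2): the lower support bound 0 is finite, so it is used directly
gamma_limits <- effective_limits(
  c(NA, NA), function(p) qgamma(p, shape = 2), c(0, Inf)
)

print(normal_limits)
print(gamma_limits)
```

Passing explicit probabilities, such as c(0.025, 0.975), skips the support check entirely and simply returns the two quantiles.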

density

<function | string> Density estimator for sample data. One of:

  • A function which takes a numeric vector and returns a list with elements x (giving grid points for the density estimator) and y (the corresponding densities). ggdist provides a family of functions following this format, including density_unbounded() and density_bounded(). This format is also compatible with stats::density().

  • A string giving the suffix of a function name that starts with "density_"; e.g. "bounded" for density_bounded(), "unbounded" for density_unbounded(), or "histogram" for density_histogram(). Defaults to "bounded", i.e. density_bounded(), which estimates the bounds from the data and then uses a bounded density estimator based on the reflection method.
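A sketch of the density estimator format, using made-up exponential samples (which have a hard lower bound at 0, where the bounded and unbounded estimators tend to differ most):

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
x <- rexp(500)  # hypothetical samples bounded below at 0

# Both estimators return a list-like object with grid points x and densities y
d_bounded <- density_bounded(x)
d_unbounded <- density_unbounded(x)
str(d_bounded[c("x", "y")])

# Selecting the estimator by suffix in a stat
p_unbounded <- ggplot(data.frame(value = x), aes(x = value)) +
  stat_halfeye(density = "unbounded")
p_hist <- ggplot(data.frame(value = x), aes(x = value)) +
  stat_halfeye(density = "histogram")
```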

adjust

<scalar numeric | waiver> Passed to density (e.g. density_bounded()): Value to multiply the bandwidth of the density estimator by. Default waiver() defers to the default of the density estimator, which is usually 1.

trim

<scalar logical | waiver> Passed to density (e.g. density_bounded()): Should the density estimate be trimmed to the range of the data? Default waiver() defers to the default of the density estimator, which is usually TRUE.

breaks

<numeric | function | string | waiver> Passed to density (e.g. density_histogram()): Determines the breakpoints defining bins. Default waiver() defers to the default of the density estimator, which is usually "Scott". Similar to (but not exactly the same as) the breaks argument to graphics::hist(). One of:

  • A scalar (length-1) numeric giving the number of bins

  • A numeric vector giving the breakpoints between histogram bins

  • A function taking x and weights and returning either the number of bins or a vector of breakpoints

  • A string giving the suffix of a function that starts with "breaks_". ggdist provides weighted implementations of the "Sturges", "Scott", and "FD" break-finding algorithms from graphics::hist(), as well as breaks_fixed() for manually setting the bin width. See breaks.

For example, breaks = "Sturges" will use the breaks_Sturges() algorithm, breaks = 9 will create 9 bins, and breaks = breaks_fixed(width = 1) will set the bin width to 1.

align

<scalar numeric | function | string | waiver> Passed to density (e.g. density_histogram()): Determines how to align the breakpoints defining bins. Default waiver() defers to the default of the density estimator, which is usually "none" (performs no alignment). One of:

  • A scalar (length-1) numeric giving an offset that is subtracted from the breaks. The offset must be between 0 and the bin width.

  • A function taking a sorted vector of breaks (bin edges) and returning an offset to subtract from the breaks.

  • A string giving the suffix of a function that starts with "align_" used to determine the alignment, such as align_none(), align_boundary(), or align_center().

For example, align = "none" will provide no alignment, align = align_center(at = 0) will center a bin on 0, and align = align_boundary(at = 0) will align a bin edge on 0.

outline_bars

<scalar logical | waiver> Passed to density (e.g. density_histogram()) and also used for discrete analytical distributions (whose slabs are drawn as histograms). Determines if outlines in between the bars are drawn. If waiver() or FALSE (the default), the outline is drawn only along the tops of the bars. If TRUE, outlines in between bars are also drawn (though you may have to set the slab_color or color aesthetic to see the outlines).

expand

<logical> For sample data, should the slab be expanded to the limits of the scale? Default FALSE. Can be a length-two logical vector to control expansion to the lower and upper limit respectively.

point_interval

<function | string> A function from the point_interval() family (e.g., median_qi, mean_qi, mode_hdi, etc), or a string giving the name of a function from that family (e.g., "median_qi", "mean_qi", "mode_hdi", etc; if a string, the caller's environment is searched for the function, followed by the ggdist environment). This function determines the point summary (typically mean, median, or mode) and interval type (quantile interval, qi; highest-density interval, hdi; or highest-density continuous interval, hdci). Output will be converted to the appropriate x- or y-based aesthetics depending on the value of orientation. See the point_interval() family of functions for more information.
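As a rough illustration of what the point_interval() family computes, the functions can also be called directly on a numeric vector. This is only a sketch: the sample x is made up, and it assumes plain-vector input, for which the summaries are returned in columns named y, ymin, and ymax.

```r
library(ggdist)

set.seed(1234)
x = rnorm(1000, mean = 5, sd = 1)  # hypothetical sample, for illustration only

# median with a 95% quantile interval
median_qi(x, .width = 0.95)

# mean with 66% and 95% quantile intervals: one row per interval width
s = mean_qi(x, .width = c(0.66, 0.95))
s

# mode with an 80% highest-density interval
mode_hdi(x, .width = 0.80)
```

Passing point_interval = "mean_qi" (or the function mean_qi itself) to a stat applies the same summary within each group.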

limits

<length-2 numeric> Manually-specified limits for the slab, as a vector of length two. These limits are combined with those computed based on p_limits as well as the limits defined by the scales of the plot to determine the limits used to draw the slab functions: these limits specify the maximal limits; i.e., if specified, the limits will not be wider than these (but may be narrower). Use NA to leave a limit alone; e.g. limits = c(0, NA) will ensure that the lower limit does not go below 0, but let the upper limit be determined by either p_limits or the scale settings.

n

<scalar numeric> Number of points at which to evaluate the function that defines the slab. Also passed to density (e.g. density_bounded()). Default waiver() uses the value 501 for analytical distributions and defers to the default of the density estimator for sample-based distributions, which is also usually 501.

.width

<numeric> The .width argument passed to point_interval: a vector of probabilities to use that determine the widths of the resulting intervals. If multiple probabilities are provided, multiple intervals per group are generated, each with a different probability interval (and value of the corresponding .width and level generated variables).

orientation

<string> Whether this geom is drawn horizontally or vertically. One of:

  • NA (default): automatically detect the orientation based on how the aesthetics are assigned. Automatic detection works most of the time.

  • "horizontal" (or "y"): draw horizontally, using the y aesthetic to identify different groups. For each group, uses the x, xmin, xmax, and thickness aesthetics to draw points, intervals, and slabs.

  • "vertical" (or "x"): draw vertically, using the x aesthetic to identify different groups. For each group, uses the y, ymin, ymax, and thickness aesthetics to draw points, intervals, and slabs.

For compatibility with the base ggplot naming scheme for orientation, "x" can be used as an alias for "vertical" and "y" as an alias for "horizontal" (ggdist had an orientation parameter before base ggplot did, hence the discrepancy).

na.rm

<scalar logical> If FALSE, the default, missing values are removed with a warning. If TRUE, missing values are silently removed.

show.legend

<logical> Should this layer be included in the legends? Default is c(size = FALSE), unlike most geoms, to match its common use cases. FALSE hides all legends, TRUE shows all legends, and NA shows only those that are mapped (the default for most geoms). It can also be a named logical vector to finely select the aesthetics to display.

inherit.aes

If FALSE, overrides the default aesthetics, rather than combining with them. This is most useful for helper functions that define both data and aesthetics and shouldn't inherit behaviour from the default plot specification, e.g. borders().

check.aes, check.param

If TRUE, the default, will check that supplied parameters and aesthetics are understood by the geom or stat. Use FALSE to suppress the checks.

Details

To visualize sample data, such as a data distribution, samples from a bootstrap distribution, or a Bayesian posterior, you can supply samples to the x or y aesthetic.

To visualize analytical distributions, you can use the xdist or ydist aesthetic. For historical reasons, you can also use dist to specify the distribution, though this is not recommended as it does not work as well with orientation detection. These aesthetics can be used as follows:

Value

A ggplot2::Stat representing a half-eye (density + interval) geometry which can be added to a ggplot() object.

Computed Variables

The following variables are computed by this stat and made available for use in aesthetic specifications (aes()) using the after_stat() function or the after_stat argument of stage():

Aesthetics

The slab+interval stats and geoms have a wide variety of aesthetics that control the appearance of their three sub-geometries: the slab, the point, and the interval.

These stats support the following aesthetics:

In addition, in their default configuration (paired with geom_slabinterval()) the following aesthetics are supported by the underlying geom:

Slab-specific aesthetics

Interval-specific aesthetics

Point-specific aesthetics

Color aesthetics

Line aesthetics

Slab-specific color and line override aesthetics

Interval-specific color and line override aesthetics

Point-specific color and line override aesthetics

Deprecated aesthetics

Other aesthetics (these work as in standard geoms)

See examples of some of these aesthetics in action in vignette("slabinterval"). Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").

See Also

See geom_slabinterval() for the geom underlying this stat. See stat_slabinterval() for the stat this shortcut is based on.

Other slabinterval stats: stat_ccdfinterval(), stat_cdfinterval(), stat_eye(), stat_gradientinterval(), stat_histinterval(), stat_interval(), stat_pointinterval(), stat_slab(), stat_spike()

Examples

library(dplyr)
library(ggplot2)
library(distributional)
library(ggdist)

theme_set(theme_ggdist())

# ON SAMPLE DATA
set.seed(1234)
df = data.frame(
  group = c("a", "b", "c"),
  value = rnorm(1500, mean = c(5, 7, 9), sd = c(1, 1.5, 1))
)
df %>%
  ggplot(aes(x = value, y = group)) +
  stat_halfeye()

# ON ANALYTICAL DISTRIBUTIONS
dist_df = data.frame(
  group = c("a", "b", "c"),
  mean =  c(  5,   7,   8),
  sd =    c(  1, 1.5,   1)
)
# Vectorized distribution types, like distributional::dist_normal()
# and posterior::rvar(), can be used with the `xdist` / `ydist` aesthetics
dist_df %>%
  ggplot(aes(y = group, xdist = dist_normal(mean, sd))) +
  stat_halfeye()

Histogram + interval plot (shortcut stat)

Description

Shortcut version of stat_slabinterval() with geom_slabinterval() for creating histogram + interval plots.

Roughly equivalent to:

stat_slabinterval(
  density = "histogram"
)

Usage

stat_histinterval(
  mapping = NULL,
  data = NULL,
  geom = "slabinterval",
  position = "identity",
  ...,
  density = "histogram",
  p_limits = c(NA, NA),
  adjust = waiver(),
  trim = waiver(),
  breaks = waiver(),
  align = waiver(),
  outline_bars = waiver(),
  expand = FALSE,
  point_interval = "median_qi",
  limits = NULL,
  n = waiver(),
  .width = c(0.66, 0.95),
  orientation = NA,
  na.rm = FALSE,
  show.legend = c(size = FALSE),
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)

Arguments

mapping

Set of aesthetic mappings created by aes(). If specified and inherit.aes = TRUE (the default), it is combined with the default mapping at the top level of the plot. You must supply mapping if there is no plot mapping.

data

The data to be displayed in this layer. There are three options:

If NULL, the default, the data is inherited from the plot data as specified in the call to ggplot().

A data.frame, or other object, will override the plot data. All objects will be fortified to produce a data frame. See fortify() for which variables will be created.

A function will be called with a single argument, the plot data. The return value must be a data.frame, and will be used as the layer data. A function can be created from a formula (e.g. ~ head(.x, 10)).

geom

<Geom | string> Use to override the default connection between stat_histinterval() and geom_slabinterval()

position

<Position | string> Position adjustment, either as a string, or the result of a call to a position adjustment function. Setting this equal to "dodge" (position_dodge()) or "dodgejust" (position_dodgejust()) can be useful if you have overlapping geometries.

...

Other arguments passed to layer(). These are often aesthetics, used to set an aesthetic to a fixed value, like colour = "red" or linewidth = 3 (see Aesthetics, below). They may also be parameters to the paired geom/stat. When paired with the default geom, geom_slabinterval(), these include:

subscale

<function | string> Sub-scale used to scale values of the thickness aesthetic within the groups determined by normalize. One of:

  • A function that takes an x argument giving a numeric vector of values to be scaled and then returns a thickness vector representing the scaled values, such as subscale_thickness() or subscale_identity().

  • A string giving the name of such a function when prefixed with "subscale_"; e.g. "thickness" or "identity". The value "thickness" uses the default subscale, which can be modified by setting subscale_thickness; see the documentation for that function.

For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.

normalize

<string> Groups within which to scale values of the thickness aesthetic. One of:

  • "all": normalize so that the maximum height across all data is 1.

  • "panels": normalize within panels so that the maximum height in each panel is 1.

  • "xy": normalize within the x/y axis opposite the orientation of this geom so that the maximum height at each value of the opposite axis is 1.

  • "groups": normalize within values of the opposite axis and within each group so that the maximum height in each group is 1.

  • "none": values are taken as is with no normalization (this should probably only be used with functions whose values are in [0,1], such as CDFs).

For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.

fill_type

<string> What type of fill to use when the fill color or alpha varies within a slab. One of:

  • "segments": breaks up the slab geometry into segments for each unique combination of fill color and alpha value. This approach is supported by all graphics devices and works well for sharp cutoff values, but can give ugly results if a large number of unique fill colors are being used (as in gradients, like in stat_gradientinterval()).

  • "gradient": a grid::linearGradient() is used to create a smooth gradient fill. This works well for large numbers of unique fill colors, but requires R >= 4.1 and is not yet supported on all graphics devices. As of this writing, the png() graphics device with type = "cairo", the svg() device, the pdf() device, and the ragg::agg_png() device are known to support this option. On R < 4.1, this option will fall back to fill_type = "segments" with a message.

  • "auto": attempts to use fill_type = "gradient" if support for it can be auto-detected. On R >= 4.2, support for gradients can be auto-detected on some graphics devices; if support is not detected, this option will fall back to fill_type = "segments" (in case of a false negative, fill_type = "gradient" can be set explicitly). On R < 4.2, support for gradients cannot be auto-detected, so this will always fall back to fill_type = "segments", in which case you can set fill_type = "gradient" explicitly if you are using a graphics device that supports gradients.

interval_size_domain

<length-2 numeric> Minimum and maximum of the values of the size and linewidth aesthetics that will be translated into actual sizes for intervals drawn according to interval_size_range (see the documentation for that argument).

interval_size_range

<length-2 numeric> This geom scales the raw size aesthetic values when drawing interval and point sizes, as they tend to be too thick when using the default settings of scale_size_continuous(), which give sizes with a range of c(1, 6). The interval_size_domain value indicates the input domain of raw size values (typically this should be equal to the value of the range argument of the scale_size_continuous() function), and interval_size_range indicates the desired output range of the size values (the min and max of the actual sizes used to draw intervals). Most of the time it is not recommended to change the value of this argument, as it may result in strange scaling of legends; this argument is a holdover from earlier versions that did not have size aesthetics targeting the point and interval separately. If you want to adjust the size of the interval or points separately, you can also use the linewidth or point_size aesthetics; see sub-geometry-scales.

fatten_point

<scalar numeric> A multiplicative factor used to adjust the size of the point relative to the size of the thickest interval line. If you wish to specify point sizes directly, you can also use the point_size aesthetic and scale_point_size_continuous() or scale_point_size_discrete(); sizes specified with that aesthetic will not be adjusted using fatten_point.

arrow

<arrow | NULL> Type of arrow heads to use on the interval, or NULL for no arrows.

subguide

<function | string> Sub-guide used to annotate the thickness scale. One of:

  • A function that takes a scale argument giving a ggplot2::Scale object and an orientation argument giving the orientation of the geometry and then returns a grid::grob that will draw the axis annotation, such as subguide_axis() (to draw a traditional axis) or subguide_none() (to draw no annotation). See subguide_axis() for a list of possibilities and examples.

  • A string giving the name of such a function when prefixed with "subguide_"; e.g. "axis" or "none". The values "slab", "dots", and "spike" use the default subguide for their geom families (no subguide), which can be modified by setting subguide_slab, subguide_dots, or subguide_spike; see the documentation for those functions.

density

<function | string> Density estimator for sample data. One of:

  • A function which takes a numeric vector and returns a list with elements x (giving grid points for the density estimator) and y (the corresponding densities). ggdist provides a family of functions following this format, including density_unbounded() and density_bounded(). This format is also compatible with stats::density().

  • A string giving the suffix of a function name that starts with "density_"; e.g. "bounded" for density_bounded(), "unbounded" for density_unbounded(), or "histogram" for density_histogram(). Defaults to "bounded", i.e. density_bounded(), which estimates the bounds from the data and then uses a bounded density estimator based on the reflection method; stat_histinterval() overrides this default with "histogram" (see Usage).
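To make the x/y return format concrete, the estimators can be called directly on a sample. This is a minimal sketch rather than a description of how the stat invokes them internally; the sample is made up for illustration.

```r
library(ggdist)

set.seed(1234)
x = rexp(1000)  # hypothetical sample with a lower bound at 0

d_bounded   = density_bounded(x)
d_unbounded = density_unbounded(x)
d_base      = stats::density(x)

# each result exposes grid points in $x and the corresponding densities in $y
length(d_bounded$x)
range(d_bounded$x)
range(d_unbounded$x)
```

Because all three share the same x/y structure, any of them can stand in as the density estimator for sample data.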

p_limits

<length-2 numeric> Probability limits. Used to determine the lower and upper limits of analytical distributions (distributions from samples ignore this parameter and determine their limits based on the limits of the sample and the value of the trim parameter). E.g., if this is c(.001, .999), then a slab is drawn for the distribution from the quantile at p = .001 to the quantile at p = .999. If the lower (respectively upper) limit is NA, then the lower (upper) limit will be the minimum (maximum) of the distribution's support if it is finite, and 0.001 (0.999) if it is not finite. E.g., if p_limits is c(NA, NA), on a gamma distribution the effective value of p_limits would be c(0, .999) since the gamma distribution is defined on (0, Inf); whereas on a normal distribution it would be equivalent to c(.001, .999) since the normal distribution is defined on (-Inf, Inf).
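The rule for p_limits can be sketched with base R quantile functions. This illustrates the arithmetic only and is not ggdist's internal code; the distribution parameters (standard normal, gamma with shape 2 and rate 1) are arbitrary choices for the example.

```r
# Explicit probability limits on a normal distribution:
# the slab runs from the 0.001 quantile to the 0.999 quantile
p_limits = c(0.001, 0.999)
normal_limits = qnorm(p_limits, mean = 0, sd = 1)

# p_limits = c(NA, NA) on a gamma distribution: the lower end of the
# support is finite (0), so it is used directly; the upper end is
# infinite, so the 0.999 quantile is used instead
gamma_limits = c(0, qgamma(0.999, shape = 2, rate = 1))

normal_limits
gamma_limits
```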

adjust

<scalar numeric | waiver> Passed to density (e.g. density_bounded()): Value to multiply the bandwidth of the density estimator by. Default waiver() defers to the default of the density estimator, which is usually 1.

trim

<scalar logical | waiver> Passed to density (e.g. density_bounded()): Should the density estimate be trimmed to the range of the data? Default waiver() defers to the default of the density estimator, which is usually TRUE.

breaks

<numeric | function | string | waiver> Passed to density (e.g. density_histogram()): Determines the breakpoints defining bins. Default waiver() defers to the default of the density estimator, which is usually "Scott". Similar to (but not exactly the same as) the breaks argument to graphics::hist(). One of:

  • A scalar (length-1) numeric giving the number of bins

  • A numeric vector giving the breakpoints between histogram bins

  • A function taking x and weights and returning either the number of bins or a vector of breakpoints

  • A string giving the suffix of a function that starts with "breaks_". ggdist provides weighted implementations of the "Sturges", "Scott", and "FD" break-finding algorithms from graphics::hist(), as well as breaks_fixed() for manually setting the bin width. See breaks.

For example, breaks = "Sturges" will use the breaks_Sturges() algorithm, breaks = 9 will create 9 bins, and breaks = breaks_fixed(width = 1) will set the bin width to 1.
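The forms of breaks listed above can be tried directly on density_histogram(), the estimator this argument is passed to. This is a sketch with a made-up sample; the explicit breakpoints in the last call are an arbitrary choice that happens to cover the data.

```r
library(ggdist)

set.seed(1234)
x = rnorm(500)  # hypothetical sample

h_count = density_histogram(x, breaks = 9)                        # number of bins
h_named = density_histogram(x, breaks = "Sturges")                # uses breaks_Sturges()
h_fixed = density_histogram(x, breaks = breaks_fixed(width = 1))  # fixed bin width
h_edges = density_histogram(x, breaks = seq(-5, 5, by = 0.5))     # explicit breakpoints

# each result holds the histogram outline as $x / $y pairs
head(data.frame(x = h_fixed$x, y = h_fixed$y))
```

The same values can be given as the breaks argument of stat_histinterval(), which forwards them to the density estimator.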

align

<scalar numeric | function | string | waiver> Passed to density (e.g. density_histogram()): Determines how to align the breakpoints defining bins. Default waiver() defers to the default of the density estimator, which is usually "none" (performs no alignment). One of:

  • A scalar (length-1) numeric giving an offset that is subtracted from the breaks. The offset must be between 0 and the bin width.

  • A function taking a sorted vector of breaks (bin edges) and returning an offset to subtract from the breaks.

  • A string giving the suffix of a function that starts with "align_" used to determine the alignment, such as align_none(), align_boundary(), or align_center().

For example, align = "none" will provide no alignment, align = align_center(at = 0) will center a bin on 0, and align = align_boundary(at = 0) will align a bin edge on 0.
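A short sketch of how breaks and align might be combined in a plot follows. The data frame mirrors the one used in the Examples section, and the bin width of 0.5 is an arbitrary choice for illustration.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
df = data.frame(
  group = c("a", "b", "c"),
  value = rnorm(1500, mean = c(5, 7, 9), sd = c(1, 1.5, 1))
)

# bins of width 0.5, shifted so that a bin edge falls on 0
p_boundary = ggplot(df, aes(x = value, y = group)) +
  stat_histinterval(
    breaks = breaks_fixed(width = 0.5),
    align = align_boundary(at = 0)
  )

# same bin width, shifted so that a bin is centered on 0
p_center = ggplot(df, aes(x = value, y = group)) +
  stat_histinterval(
    breaks = breaks_fixed(width = 0.5),
    align = align_center(at = 0)
  )

p_boundary
```

Fixing both the width and the alignment gives every group the same bin edges, which tends to make the histograms easier to compare across rows.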

outline_bars

<scalar logical | waiver> Passed to density (e.g. density_histogram()) and also used for discrete analytical distributions (whose slabs are drawn as histograms). Determines if outlines in between the bars are drawn. If waiver() or FALSE (the default), the outline is drawn only along the tops of the bars. If TRUE, outlines in between bars are also drawn (though you may have to set the slab_color or color aesthetic to see the outlines).

expand

<logical> For sample data, should the slab be expanded to the limits of the scale? Default FALSE. Can be a length-two logical vector to control expansion to the lower and upper limit respectively.

point_interval

<function | string> A function from the point_interval() family (e.g., median_qi, mean_qi, mode_hdi, etc), or a string giving the name of a function from that family (e.g., "median_qi", "mean_qi", "mode_hdi", etc; if a string, the caller's environment is searched for the function, followed by the ggdist environment). This function determines the point summary (typically mean, median, or mode) and interval type (quantile interval, qi; highest-density interval, hdi; or highest-density continuous interval, hdci). Output will be converted to the appropriate x- or y-based aesthetics depending on the value of orientation. See the point_interval() family of functions for more information.

limits

<length-2 numeric> Manually-specified limits for the slab, as a vector of length two. These limits are combined with those computed based on p_limits as well as the limits defined by the scales of the plot to determine the limits used to draw the slab functions: these limits specify the maximal limits; i.e., if specified, the limits will not be wider than these (but may be narrower). Use NA to leave a limit alone; e.g. limits = c(0, NA) will ensure that the lower limit does not go below 0, but let the upper limit be determined by either p_limits or the scale settings.

n

<scalar numeric> Number of points at which to evaluate the function that defines the slab. Also passed to density (e.g. density_bounded()). Default waiver() uses the value 501 for analytical distributions and defers to the default of the density estimator for sample-based distributions, which is also usually 501.

.width

<numeric> The .width argument passed to point_interval: a vector of probabilities to use that determine the widths of the resulting intervals. If multiple probabilities are provided, multiple intervals per group are generated, each with a different probability interval (and value of the corresponding .width and level generated variables).

orientation

<string> Whether this geom is drawn horizontally or vertically. One of:

  • NA (default): automatically detect the orientation based on how the aesthetics are assigned. Automatic detection works most of the time.

  • "horizontal" (or "y"): draw horizontally, using the y aesthetic to identify different groups. For each group, uses the x, xmin, xmax, and thickness aesthetics to draw points, intervals, and slabs.

  • "vertical" (or "x"): draw vertically, using the x aesthetic to identify different groups. For each group, uses the y, ymin, ymax, and thickness aesthetics to draw points, intervals, and slabs.

For compatibility with the base ggplot naming scheme for orientation, "x" can be used as an alias for "vertical" and "y" as an alias for "horizontal" (ggdist had an orientation parameter before base ggplot did, hence the discrepancy).
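The two orientations can be compared side by side. This is a sketch using the same made-up data as the Examples section; orientation is set explicitly in the second plot only to show the argument, since automatic detection would normally pick it up.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
df = data.frame(
  group = c("a", "b", "c"),
  value = rnorm(1500, mean = c(5, 7, 9), sd = c(1, 1.5, 1))
)

# horizontal: groups on the y axis, values on the x axis (auto-detected)
p_horizontal = ggplot(df, aes(x = value, y = group)) +
  stat_histinterval()

# vertical: groups on the x axis, values on the y axis
p_vertical = ggplot(df, aes(x = group, y = value)) +
  stat_histinterval(orientation = "vertical")

p_vertical
```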

na.rm

<scalar logical> If FALSE, the default, missing values are removed with a warning. If TRUE, missing values are silently removed.

show.legend

<logical> Should this layer be included in the legends? Default is c(size = FALSE), unlike most geoms, to match its common use cases. FALSE hides all legends, TRUE shows all legends, and NA shows only those that are mapped (the default for most geoms). It can also be a named logical vector to finely select the aesthetics to display.

inherit.aes

If FALSE, overrides the default aesthetics, rather than combining with them. This is most useful for helper functions that define both data and aesthetics and shouldn't inherit behaviour from the default plot specification, e.g. borders().

check.aes, check.param

If TRUE, the default, will check that supplied parameters and aesthetics are understood by the geom or stat. Use FALSE to suppress the checks.

Details

To visualize sample data, such as a data distribution, samples from a bootstrap distribution, or a Bayesian posterior, you can supply samples to the x or y aesthetic.

To visualize analytical distributions, you can use the xdist or ydist aesthetic. For historical reasons, you can also use dist to specify the distribution, though this is not recommended as it does not work as well with orientation detection. These aesthetics can be used as follows:

Value

A ggplot2::Stat representing a histogram + interval geometry which can be added to a ggplot() object.

Computed Variables

The following variables are computed by this stat and made available for use in aesthetic specifications (aes()) using the after_stat() function or the after_stat argument of stage():

Aesthetics

The slab+interval stats and geoms have a wide variety of aesthetics that control the appearance of their three sub-geometries: the slab, the point, and the interval.

These stats support the following aesthetics:

In addition, in their default configuration (paired with geom_slabinterval()) the following aesthetics are supported by the underlying geom:

Slab-specific aesthetics

Interval-specific aesthetics

Point-specific aesthetics

Color aesthetics

Line aesthetics

Slab-specific color and line override aesthetics

Interval-specific color and line override aesthetics

Point-specific color and line override aesthetics

Deprecated aesthetics

Other aesthetics (these work as in standard geoms)

See examples of some of these aesthetics in action in vignette("slabinterval"). Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").

See Also

See geom_slabinterval() for the geom underlying this stat. See stat_slabinterval() for the stat this shortcut is based on.

Other slabinterval stats: stat_ccdfinterval(), stat_cdfinterval(), stat_eye(), stat_gradientinterval(), stat_halfeye(), stat_interval(), stat_pointinterval(), stat_slab(), stat_spike()

Examples

library(dplyr)
library(ggplot2)
library(distributional)
library(ggdist)

theme_set(theme_ggdist())

# ON SAMPLE DATA
set.seed(1234)
df = data.frame(
  group = c("a", "b", "c"),
  value = rnorm(1500, mean = c(5, 7, 9), sd = c(1, 1.5, 1))
)
df %>%
  ggplot(aes(x = value, y = group)) +
  stat_histinterval()

# ON ANALYTICAL DISTRIBUTIONS
dist_df = data.frame(
  group = c("a", "b", "c"),
  mean =  c(  5,   7,   8),
  sd =    c(  1, 1.5,   1)
)
# Vectorized distribution types, like distributional::dist_normal()
# and posterior::rvar(), can be used with the `xdist` / `ydist` aesthetics
dist_df %>%
  ggplot(aes(y = group, xdist = dist_normal(mean, sd))) +
  stat_histinterval()

Multiple-interval plot (shortcut stat)

Description

Shortcut version of stat_slabinterval() with geom_interval() for creating multiple-interval plots.

Roughly equivalent to:

stat_slabinterval(
  aes(
    colour = after_stat(level),
    size = NULL
  ),
  geom = "interval",
  show_point = FALSE,
  .width = c(0.5, 0.8, 0.95),
  show_slab = FALSE,
  show.legend = NA
)

Usage

stat_interval(
  mapping = NULL,
  data = NULL,
  geom = "interval",
  position = "identity",
  ...,
  .width = c(0.5, 0.8, 0.95),
  point_interval = "median_qi",
  orientation = NA,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)

Arguments

mapping

Set of aesthetic mappings created by aes(). If specified and inherit.aes = TRUE (the default), it is combined with the default mapping at the top level of the plot. You must supply mapping if there is no plot mapping.

data

The data to be displayed in this layer. There are three options:

If NULL, the default, the data is inherited from the plot data as specified in the call to ggplot().

A data.frame, or other object, will override the plot data. All objects will be fortified to produce a data frame. See fortify() for which variables will be created.

A function will be called with a single argument, the plot data. The return value must be a data.frame, and will be used as the layer data. A function can be created from a formula (e.g. ~ head(.x, 10)).
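As a sketch of the third option, a formula can be passed as data to restrict a layer to part of the plot data. The data frame and the choice to drop group "c" are purely illustrative.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
df = data.frame(
  group = c("a", "b", "c"),
  value = rnorm(1500, mean = c(5, 7, 9), sd = c(1, 1.5, 1))
)

# layer inherits the full plot data (data = NULL)
p_all = ggplot(df, aes(x = value, y = group)) +
  stat_interval()

# layer data computed from the plot data by a formula; .x is the plot data
p_subset = ggplot(df, aes(x = value, y = group)) +
  stat_interval(data = ~ subset(.x, group != "c"))

p_subset
```

This form is mostly useful when several layers share one plot-level data frame but one of them should only show a subset.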

geom

<Geom | string> Use to override the default connection between stat_interval() and geom_interval()

position

<Position | string> Position adjustment, either as a string, or the result of a call to a position adjustment function. Setting this equal to "dodge" (position_dodge()) or "dodgejust" (position_dodgejust()) can be useful if you have overlapping geometries.

...

Other arguments passed to layer(). These are often aesthetics, used to set an aesthetic to a fixed value, like colour = "red" or linewidth = 3 (see Aesthetics, below). They may also be parameters to the paired geom/stat. When paired with the default geom, geom_interval(), these include:

interval_size_range

<length-2 numeric> This geom scales the raw size aesthetic values when drawing interval and point sizes, as they tend to be too thick when using the default settings of scale_size_continuous(), which give sizes with a range of c(1, 6). The interval_size_domain value indicates the input domain of raw size values (typically this should be equal to the value of the range argument of the scale_size_continuous() function), and interval_size_range indicates the desired output range of the size values (the min and max of the actual sizes used to draw intervals). Most of the time it is not recommended to change the value of this argument, as it may result in strange scaling of legends; this argument is a holdover from earlier versions that did not have size aesthetics targeting the point and interval separately. If you want to adjust the size of the interval or points separately, you can also use the linewidth or point_size aesthetics; see sub-geometry-scales.

interval_size_domain

<length-2 numeric> Minimum and maximum of the values of the size and linewidth aesthetics that will be translated into actual sizes for intervals drawn according to interval_size_range (see the documentation for that argument).

arrow

<arrow | NULL> Type of arrow heads to use on the interval, or NULL for no arrows.

.width

<numeric> The .width argument passed to point_interval: a vector of probabilities to use that determine the widths of the resulting intervals. If multiple probabilities are provided, multiple intervals per group are generated, each with a different probability interval (and value of the corresponding .width and level generated variables).
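To see the effect of .width on the computed data, the layer data of a built plot can be inspected with ggplot2::layer_data(). This is a sketch on made-up data; the widths 0.5 and 0.9 are arbitrary.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
df = data.frame(
  group = c("a", "b", "c"),
  value = rnorm(1500, mean = c(5, 7, 9), sd = c(1, 1.5, 1))
)

p = ggplot(df, aes(x = value, y = group)) +
  stat_interval(.width = c(0.5, 0.9))

# computed data for the layer: one interval per group per width,
# tagged by the .width and level generated variables
ld = layer_data(p)
unique(ld$.width)
```

The level variable is what the default colour mapping, after_stat(level), uses to distinguish the nested intervals.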

point_interval

<function | string> A function from the point_interval() family (e.g., median_qi, mean_qi, mode_hdi, etc), or a string giving the name of a function from that family (e.g., "median_qi", "mean_qi", "mode_hdi", etc; if a string, the caller's environment is searched for the function, followed by the ggdist environment). This function determines the point summary (typically mean, median, or mode) and interval type (quantile interval, qi; highest-density interval, hdi; or highest-density continuous interval, hdci). Output will be converted to the appropriate x- or y-based aesthetics depending on the value of orientation. See the point_interval() family of functions for more information.

orientation

<string> Whether this geom is drawn horizontally or vertically. One of:

  • NA (default): automatically detect the orientation based on how the aesthetics are assigned. Automatic detection works most of the time.

  • "horizontal" (or "y"): draw horizontally, using the y aesthetic to identify different groups. For each group, uses the x, xmin, xmax, and thickness aesthetics to draw points, intervals, and slabs.

  • "vertical" (or "x"): draw vertically, using the x aesthetic to identify different groups. For each group, uses the y, ymin, ymax, and thickness aesthetics to draw points, intervals, and slabs.

For compatibility with the base ggplot naming scheme for orientation, "x" can be used as an alias for "vertical" and "y" as an alias for "horizontal" (ggdist had an orientation parameter before base ggplot did, hence the discrepancy).

na.rm

<scalar logical> If FALSE, the default, missing values are removed with a warning. If TRUE, missing values are silently removed.

show.legend

<logical> Should this layer be included in the legends? For stat_interval() the default is NA (see Usage), which shows only those aesthetics that are mapped (the default for most geoms). FALSE hides all legends and TRUE shows all legends. It can also be a named logical vector to finely select the aesthetics to display.

inherit.aes

If FALSE, overrides the default aesthetics, rather than combining with them. This is most useful for helper functions that define both data and aesthetics and shouldn't inherit behaviour from the default plot specification, e.g. borders().

check.aes, check.param

If TRUE, the default, will check that supplied parameters and aesthetics are understood by the geom or stat. Use FALSE to suppress the checks.

Details

To visualize sample data, such as a data distribution, samples from abootstrap distribution, or a Bayesian posterior, you can supply samples tothex ory aesthetic.

To visualize analytical distributions, you can use thexdist orydistaesthetic. For historical reasons, you can also usedist to specify the distribution, thoughthis is not recommended as it does not work as well with orientation detection.These aesthetics can be used as follows:

Value

A ggplot2::Stat representing a multiple-interval geometry which can be added to a ggplot() object.

Computed Variables

The following variables are computed by this stat and made available for use in aesthetic specifications (aes()) using the after_stat() function or the after_stat argument of stage():

Aesthetics

The slab+interval stats and geoms have a wide variety of aesthetics that control the appearance of their three sub-geometries: the slab, the point, and the interval.

These stats support the following aesthetics:

In addition, in their default configuration (paired with geom_interval()) the following aesthetics are supported by the underlying geom:

Interval-specific aesthetics

Color aesthetics

Line aesthetics

Interval-specific color and line override aesthetics

Deprecated aesthetics

Other aesthetics (these work as in standard geoms)

See examples of some of these aesthetics in action in vignette("slabinterval"). Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").

See Also

See geom_interval() for the geom underlying this stat. See stat_slabinterval() for the stat this shortcut is based on.

Other slabinterval stats: stat_ccdfinterval(), stat_cdfinterval(), stat_eye(), stat_gradientinterval(), stat_halfeye(), stat_histinterval(), stat_pointinterval(), stat_slab(), stat_spike()

Examples

library(dplyr)
library(ggplot2)
library(ggdist)
library(distributional)

theme_set(theme_ggdist())

# ON SAMPLE DATA
set.seed(1234)
df = data.frame(
  group = c("a", "b", "c"),
  value = rnorm(1500, mean = c(5, 7, 9), sd = c(1, 1.5, 1))
)
df %>%
  ggplot(aes(x = value, y = group)) +
  stat_interval() +
  scale_color_brewer()

# ON ANALYTICAL DISTRIBUTIONS
dist_df = data.frame(
  group = c("a", "b", "c"),
  mean =  c(  5,   7,   8),
  sd =    c(  1, 1.5,   1)
)
# Vectorized distribution types, like distributional::dist_normal()
# and posterior::rvar(), can be used with the `xdist` / `ydist` aesthetics
dist_df %>%
  ggplot(aes(y = group, xdist = dist_normal(mean, sd))) +
  stat_interval() +
  scale_color_brewer()

Line + multiple-ribbon plot (shortcut stat)

Description

A combination of stat_slabinterval() and geom_lineribbon() with sensible defaults for making line + multiple-ribbon plots. While geom_lineribbon() is intended for use on data frames that have already been summarized using a point_interval() function, stat_lineribbon() is intended for use directly on data frames of draws or of analytical distributions, and will perform the summarization using a point_interval() function.

Roughly equivalent to:

stat_slabinterval(
  aes(
    group = after_stat(level),
    fill = after_stat(level),
    order = after_stat(level),
    size = NULL
  ),
  geom = "lineribbon",
  .width = c(0.5, 0.8, 0.95),
  show_slab = FALSE,
  show.legend = NA
)

Usage

stat_lineribbon(
  mapping = NULL,
  data = NULL,
  geom = "lineribbon",
  position = "identity",
  ...,
  .width = c(0.5, 0.8, 0.95),
  point_interval = "median_qi",
  orientation = NA,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)

Arguments

mapping

Set of aesthetic mappings created by aes(). If specified and inherit.aes = TRUE (the default), it is combined with the default mapping at the top level of the plot. You must supply mapping if there is no plot mapping.

data

The data to be displayed in this layer. There are three options:

If NULL, the default, the data is inherited from the plot data as specified in the call to ggplot().

A data.frame, or other object, will override the plot data. All objects will be fortified to produce a data frame. See fortify() for which variables will be created.

A function will be called with a single argument, the plot data. The return value must be a data.frame, and will be used as the layer data. A function can be created from a formula (e.g. ~ head(.x, 10)).
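The third option can be sketched as follows: one layer inherits the plot data, while a second layer receives a formula that filters it. The data frame draws and the cutoff x <= 5 are hypothetical choices made only for this illustration.

```r
library(ggplot2)
library(ggdist)

set.seed(12345)
# Hypothetical draws: 100 samples of y at each of 10 x values
draws <- data.frame(
  x = rep(1:10, 100),
  y = rnorm(1000, mean = rep(1:10, 100))
)

p <- ggplot(draws, aes(x = x, y = y)) +
  # Layer 1: data = NULL, so the full plot data is inherited
  stat_lineribbon(alpha = 0.3) +
  # Layer 2: the formula becomes a function of the plot data (.x)
  stat_lineribbon(data = ~ subset(.x, x <= 5)) +
  scale_fill_brewer()

# The second layer is computed from the filtered rows only
layer2 <- layer_data(p, 2)
range(layer2$x)
```

Passing a function instead of a pre-filtered data frame keeps the layer tied to whatever data the plot is given, which can be convenient when the same layers are reused across plots.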

geom

<Geom | string> Use to override the default connection between stat_lineribbon() and geom_lineribbon().

position

<Position | string> Position adjustment, either as a string, or the result of a call to a position adjustment function. Setting this equal to "dodge" (position_dodge()) or "dodgejust" (position_dodgejust()) can be useful if you have overlapping geometries.

...

Other arguments passed to layer(). These are often aesthetics, used to set an aesthetic to a fixed value, like colour = "red" or linewidth = 3 (see Aesthetics, below). They may also be parameters to the paired geom/stat. When paired with the default geom, geom_lineribbon(), these include:

step

<scalar logical | string> Should the line/ribbon be drawn as a step function? One of:

  • FALSE (default): do not draw as a step function.

  • "mid" (or TRUE): draw steps midway between adjacent x values.

  • "hv": draw horizontal-then-vertical steps.

  • "vh": draw vertical-then-horizontal steps.

TRUE is an alias for "mid", because for a step function with ribbons "mid" is a reasonable default (for the other two step approaches the ribbons at either the very first or very last x value will not be visible).
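The step options listed above can be compared with a small sketch like the one below. The data frame draws is hypothetical; step is passed through ... to the paired geom.

```r
library(ggplot2)
library(ggdist)

set.seed(12345)
# Hypothetical draws: 100 samples of y at each of 10 x values
draws <- data.frame(
  x = rep(1:10, 100),
  y = rnorm(1000, mean = rep(1:10, 100))
)

base <- ggplot(draws, aes(x = x, y = y)) + scale_fill_brewer()

p_line <- base + stat_lineribbon()              # step = FALSE (default)
p_mid  <- base + stat_lineribbon(step = "mid")  # same as step = TRUE
p_hv   <- base + stat_lineribbon(step = "hv")   # horizontal-then-vertical steps
```

Printing p_mid and p_hv side by side should make the difference at the first and last x values visible.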

.width

<numeric> The .width argument passed to point_interval: a vector of probabilities to use that determine the widths of the resulting intervals. If multiple probabilities are provided, multiple intervals per group are generated, each with a different probability interval (and value of the corresponding .width and level generated variables).
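To make the effect of .width concrete, the sketch below calls a point_interval() function directly on a hypothetical numeric sample x. It shows one row per requested probability, which is the summary the stat is described as computing for each group.

```r
library(ggdist)

set.seed(1234)
x <- rnorm(1000)  # hypothetical sample

# One row per probability in .width; each row is a quantile interval
# around the same median
intervals <- median_qi(x, .width = c(0.5, 0.8, 0.95))
intervals
```

For a numeric vector the result has the columns y, ymin, ymax, .width, .point, and .interval; wider probabilities give intervals that contain the narrower ones.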

point_interval

<function | string> A function from the point_interval() family (e.g., median_qi, mean_qi, mode_hdi, etc), or a string giving the name of a function from that family (e.g., "median_qi", "mean_qi", "mode_hdi", etc; if a string, the caller's environment is searched for the function, followed by the ggdist environment). This function determines the point summary (typically mean, median, or mode) and interval type (quantile interval, qi; highest-density interval, hdi; or highest-density continuous interval, hdci). Output will be converted to the appropriate x- or y-based aesthetics depending on the value of orientation. See the point_interval() family of functions for more information.
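A minimal sketch of the two ways to supply point_interval, as a function and as a string, using a hypothetical data frame draws. Both forms are expected to yield the same summaries.

```r
library(ggplot2)
library(ggdist)

set.seed(12345)
# Hypothetical draws: 100 samples of y at each of 10 x values
draws <- data.frame(
  x = rep(1:10, 100),
  y = rnorm(1000, mean = rep(1:10, 100))
)

# point_interval given as a function: mean with quantile intervals
p_fun <- ggplot(draws, aes(x = x, y = y)) +
  stat_lineribbon(point_interval = mean_qi)

# point_interval given as a string naming the same function
p_chr <- ggplot(draws, aes(x = x, y = y)) +
  stat_lineribbon(point_interval = "mean_qi")

# A different summary: mode with highest-density intervals
p_hdi <- ggplot(draws, aes(x = x, y = y)) +
  stat_lineribbon(point_interval = mode_hdi)
```

The string form is convenient when the summary is chosen programmatically, for example from a configuration value.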

orientation

<string> Whether this geom is drawn horizontally or vertically. One of:

  • NA (default): automatically detect the orientation based on how the aesthetics are assigned. Automatic detection works most of the time.

  • "horizontal" (or "y"): draw horizontally, using the y aesthetic to identify different groups. For each group, uses the x, xmin, xmax, and thickness aesthetics to draw points, intervals, and slabs.

  • "vertical" (or "x"): draw vertically, using the x aesthetic to identify different groups. For each group, uses the y, ymin, ymax, and thickness aesthetics to draw points, intervals, and slabs.

For compatibility with the base ggplot naming scheme for orientation, "x" can be used as an alias for "vertical" and "y" as an alias for "horizontal" (ggdist had an orientation parameter before base ggplot did, hence the discrepancy).

na.rm

<scalar logical> If FALSE, the default, missing values are removed with a warning. If TRUE, missing values are silently removed.

show.legend

<logical> Should this layer be included in the legends? NA, the default, includes if any aesthetics are mapped. FALSE never includes, and TRUE always includes. It can also be a named logical vector to finely select the aesthetics to display.

inherit.aes

If FALSE, overrides the default aesthetics, rather than combining with them. This is most useful for helper functions that define both data and aesthetics and shouldn't inherit behaviour from the default plot specification, e.g. borders().

check.aes, check.param

If TRUE, the default, will check that supplied parameters and aesthetics are understood by the geom or stat. Use FALSE to suppress the checks.

Details

To visualize sample data, such as a data distribution, samples from a bootstrap distribution, or a Bayesian posterior, you can supply samples to the x or y aesthetic.

To visualize analytical distributions, you can use the xdist or ydist aesthetic. For historical reasons, you can also use dist to specify the distribution, though this is not recommended as it does not work as well with orientation detection. These aesthetics can be used as follows:

Value

A ggplot2::Stat representing a line + multiple-ribbon geometry which can be added to a ggplot() object.

Computed Variables

The following variables are computed by this stat and made available for use in aesthetic specifications (aes()) using the after_stat() function or the after_stat argument of stage():

Aesthetics

The line+ribbon stats and geoms have a wide variety of aesthetics that control the appearance of their two sub-geometries: the line and the ribbon.

These stats support the following aesthetics:

In addition, in their default configuration (paired with geom_lineribbon()) the following aesthetics are supported by the underlying geom:

Ribbon-specific aesthetics

Color aesthetics

Line aesthetics

Other aesthetics (these work as in standard geoms)

See examples of some of these aesthetics in action in vignette("lineribbon"). Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").

See Also

See geom_lineribbon() for the geom underlying this stat.

Other lineribbon stats: stat_ribbon()

Examples

library(dplyr)
library(ggplot2)
library(ggdist)
library(distributional)

theme_set(theme_ggdist())

# ON SAMPLE DATA
set.seed(12345)
tibble(
  x = rep(1:10, 100),
  y = rnorm(1000, x)
) %>%
  ggplot(aes(x = x, y = y)) +
  stat_lineribbon() +
  scale_fill_brewer()

# ON ANALYTICAL DISTRIBUTIONS
# Vectorized distribution types, like distributional::dist_normal()
# and posterior::rvar(), can be used with the `xdist` / `ydist` aesthetics
tibble(
  x = 1:10,
  sd = seq(1, 3, length.out = 10)
) %>%
  ggplot(aes(x = x, ydist = dist_normal(x, sd))) +
  stat_lineribbon() +
  scale_fill_brewer()

Blurry MCSE dot plot (stat)

Description

Variant of stat_dots() for creating blurry dotplots of quantiles. Uses posterior::mcse_quantile() to calculate the Monte Carlo Standard Error of each quantile computed for the dotplot, yielding an se computed variable that is by default mapped onto the sd aesthetic of geom_blur_dots().

Usage

stat_mcse_dots(
  mapping = NULL,
  data = NULL,
  geom = "blur_dots",
  position = "identity",
  ...,
  quantiles = NA,
  orientation = NA,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)

Arguments

mapping

Set of aesthetic mappings created by aes(). If specified and inherit.aes = TRUE (the default), it is combined with the default mapping at the top level of the plot. You must supply mapping if there is no plot mapping.

data

The data to be displayed in this layer. There are three options:

If NULL, the default, the data is inherited from the plot data as specified in the call to ggplot().

A data.frame, or other object, will override the plot data. All objects will be fortified to produce a data frame. See fortify() for which variables will be created.

A function will be called with a single argument, the plot data. The return value must be a data.frame, and will be used as the layer data. A function can be created from a formula (e.g. ~ head(.x, 10)).

geom

<Geom | string> Use to override the default connection between stat_mcse_dots() and geom_blur_dots().

position

<Position | string> Position adjustment, either as a string, or the result of a call to a position adjustment function. Setting this equal to "dodge" (position_dodge()) or "dodgejust" (position_dodgejust()) can be useful if you have overlapping geometries.

...

Other arguments passed to layer(). These are often aesthetics, used to set an aesthetic to a fixed value, like colour = "red" or linewidth = 3 (see Aesthetics, below). They may also be parameters to the paired geom/stat. When paired with the default geom, geom_blur_dots(), these include:

blur

<function | string> Blur function to apply to dots. One of:

  • A function that takes a numeric vector of distances from the dot center, the dot radius, and the standard deviation of the blur and returns a vector of opacities in [0, 1], such as blur_gaussian() or blur_interval().

  • A string indicating what blur function to use, as the suffix to a function name starting with blur_; e.g. "gaussian" (the default) applies blur_gaussian().

binwidth

<numeric | unit> The bin width to use for laying out the dots. One of:

  • NA (the default): Dynamically select the bin width based on the size of the plot when drawn. This will pick a binwidth such that the tallest stack of dots is at most scale in height (ideally exactly scale in height, though this is not guaranteed).

  • A length-1 (scalar) numeric or unit object giving the exact bin width.

  • A length-2 (vector) numeric or unit object giving the minimum and maximum desired bin width. The bin width will be dynamically selected within these bounds.

If the value is numeric, it is assumed to be in units of data. The bin width (or its bounds) can also be specified using unit(), which may be useful if it is desired that the dots be a certain point size or a certain percentage of the width/height of the viewport. For example, unit(0.1, "npc") would make dots that are exactly 10% of the viewport size along whichever dimension the dotplot is drawn; unit(c(0, 0.1), "npc") would make dots that are at most 10% of the viewport size (while still ensuring the tallest stack is less than or equal to scale).
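The three forms of binwidth described above can be sketched as follows. The data frame df is hypothetical, and drawing these plots assumes the posterior package is installed, since stat_mcse_dots() relies on posterior::mcse_quantile().

```r
library(ggplot2)
library(ggdist)
library(grid)  # for unit()

set.seed(1234)
df <- data.frame(x = rnorm(1000))  # hypothetical sample

base <- ggplot(df, aes(x = x))

# Exact bin width of 0.25, in data units
p_exact <- base + stat_mcse_dots(quantiles = 100, binwidth = 0.25)

# Exact bin width of 5% of the viewport
p_npc <- base + stat_mcse_dots(quantiles = 100, binwidth = unit(0.05, "npc"))

# Bin width selected dynamically, but at most 10% of the viewport
p_bounded <- base + stat_mcse_dots(quantiles = 100, binwidth = unit(c(0, 0.1), "npc"))
```

An exact bin width can push dots beyond the geom's extent when stacks are tall; the overflow argument controls what happens in that case.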

dotsize

<scalar numeric> The width of the dots relative to the binwidth. The default, 1.07, makes dots be just a bit wider than the bin width, which is a manually-tuned parameter that tends to work well with the default circular shape, preventing gaps between bins from appearing to be too large visually (as might arise from dots being precisely the binwidth). If it is desired to have dots be precisely the binwidth, set dotsize = 1.

stackratio

<scalar numeric> The distance between the center of the dots in the same stack relative to the dot height. The default, 1, makes dots in the same stack just touch each other.

layout

<string> The layout method used for the dots. One of:

  • "bin" (default): places dots on the off-axis at the midpoint of their bins as in the classic Wilkinson dotplot. This maintains the alignment of rows and columns in the dotplot. This layout is slightly different from the classic Wilkinson algorithm in that: (1) it nudges bins slightly to avoid overlapping bins and (2) if the input data are symmetrical it will return a symmetrical layout.

  • "weave": uses the same basic binning approach of "bin", but places dots in the off-axis at their actual positions (unless overlaps = "nudge", in which case overlaps may be nudged out of the way). This maintains the alignment of rows but does not align dots within columns.

  • "hex": uses the same basic binning approach of "bin", but alternates placing dots + binwidth/4 or - binwidth/4 in the off-axis from the bin center. This allows hexagonal packing by setting a stackratio less than 1 (something like 0.9 tends to work).

  • "swarm": uses the "compactswarm" layout from beeswarm::beeswarm(). Does not maintain alignment of rows or columns, but can be more compact and neat looking, especially for sample data (as opposed to quantile dotplots of theoretical distributions, which may look better with "bin", "weave", or "hex").

  • "bar": for discrete distributions, lays out duplicate values in rectangular bars.
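A short sketch comparing three of the layouts listed above on the same hypothetical sample df. The "hex" variant uses a stackratio of 0.9, following the suggestion in the list; drawing the plots assumes the posterior package is installed.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
df <- data.frame(x = rnorm(1000))  # hypothetical sample

base <- ggplot(df, aes(x = x))

# Classic binned layout: rows and columns aligned (the default)
p_bin <- base + stat_mcse_dots(quantiles = 100, layout = "bin")

# Dots at their actual positions within each row
p_weave <- base + stat_mcse_dots(quantiles = 100, layout = "weave")

# Hexagonal packing: alternate offsets plus a stackratio below 1
p_hex <- base + stat_mcse_dots(quantiles = 100, layout = "hex", stackratio = 0.9)
```

The "swarm" layout is left out of this sketch because it additionally requires the beeswarm package.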

overlaps

<string> How to handle overlapping dots or bins in the "bin", "weave", and "hex" layouts (dots never overlap in the "swarm" or "bar" layouts). For the purposes of this argument, dots are only considered to be overlapping if they would be overlapping when dotsize = 1 and stackratio = 1; i.e. if you set those arguments to other values, overlaps may still occur. One of:

  • "keep": leave overlapping dots as they are. Dots may overlap (usually only slightly) in the "bin", "weave", and "hex" layouts.

  • "nudge": nudge overlapping dots out of the way. Overlaps are avoided using a constrained optimization which minimizes the squared distance of dots to their desired positions, subject to the constraint that adjacent dots do not overlap.

smooth

<function | string> Smoother to apply to dot positions. One of:

  • A function that takes a numeric vector of dot positions and returns a smoothed version of that vector, such as smooth_bounded(), smooth_unbounded(), smooth_discrete(), or smooth_bar().

  • A string indicating what smoother to use, as the suffix to a function name starting with smooth_; e.g. "none" (the default) applies smooth_none(), which simply returns the given vector without applying smoothing.

Smoothing is most effective when the smoother is matched to the support of the distribution; e.g. using smooth_bounded(bounds = ...).
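A minimal sketch of the two ways to specify a smoother, using a hypothetical sample on [0, 1]. The bounds c(0, 1) are an assumption that matches how this example data was generated; drawing the plots assumes the posterior package is installed.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# Hypothetical sample whose support is the unit interval
df <- data.frame(x = rbeta(500, 2, 5))

base <- ggplot(df, aes(x = x))

# Smoother named by suffix: "bounded" refers to smooth_bounded()
p_chr <- base + stat_mcse_dots(quantiles = 100, smooth = "bounded")

# Smoother as a partially applied function, matched to the known support
p_fun <- base + stat_mcse_dots(quantiles = 100, smooth = smooth_bounded(bounds = c(0, 1)))
```

Stating the bounds explicitly is preferable when the support is known, rather than leaving them to be estimated from the data.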

overflow

<string> How to handle overflow of dots beyond the extent of the geom when a minimum binwidth (or an exact binwidth) is supplied. One of:

  • "keep": Keep the overflow, drawing dots outside the geom bounds.

  • "warn": Keep the overflow, but produce a warning suggesting solutions, such as setting binwidth = NA or overflow = "compress".

  • "compress": Compress the layout. Reduces the binwidth to the size necessary to keep the dots within bounds, then adjusts stackratio and dotsize so that the apparent dot size is the user-specified minimum binwidth times the user-specified dotsize.

If you find the default layout has dots that are too small, and you are okay with dots overlapping, consider setting overflow = "compress" and supplying an exact or minimum dot size using binwidth.

verbose

<scalar logical> If TRUE, print out the bin width of the dotplot. Can be useful if you want to start from an automatically-selected bin width and then adjust it manually. Bin width is printed both as data units and as normalized parent coordinates or "npc"s (see unit()). Note that if you just want to scale the selected bin width to fit within a desired area, it is probably easier to use scale than to copy and scale binwidth manually, and if you just want to provide constraints on the bin width, you can pass a length-2 vector to binwidth.

subguide

<function | string> Sub-guide used to annotate the thickness scale. One of:

  • A function that takes a scale argument giving a ggplot2::Scale object and an orientation argument giving the orientation of the geometry and then returns a grid::grob that will draw the axis annotation, such as subguide_axis() (to draw a traditional axis) or subguide_none() (to draw no annotation). See subguide_axis() for a list of possibilities and examples.

  • A string giving the name of such a function when prefixed with "subguide_"; e.g. "axis" or "none". The values "slab", "dots", and "spike" use the default subguide for their geom families (no subguide), which can be modified by setting subguide_slab, subguide_dots, or subguide_spike; see the documentation for those functions.

quantiles

<scalar numeric> Number of quantiles to plot in the dotplot. Use NA (the default) to plot all data points. Setting this to a value other than NA will produce a quantile dotplot: that is, a dotplot of quantiles from the sample or distribution (for analytical distributions, the default of NA is taken to mean 100 quantiles). See Kay et al. (2016) and Fernandes et al. (2018) for more information on quantile dotplots.
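The difference between plotting every data point and plotting a fixed number of quantiles can be sketched as below. The data frame df and the choice of 50 quantiles are hypothetical; drawing the plots assumes the posterior package is installed.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
df <- data.frame(x = rnorm(1000))  # hypothetical sample

# quantiles = NA (the default): one dot per data point, 1000 dots here
p_all <- ggplot(df, aes(x = x)) +
  stat_mcse_dots()

# Quantile dotplot: 50 dots, each one a quantile of the sample
p_q50 <- ggplot(df, aes(x = x)) +
  stat_mcse_dots(quantiles = 50)
```

With fewer quantiles each dot stands for a larger share of the probability mass, which is what supports the frequency framing discussed in Kay et al. (2016).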

orientation

<string> Whether this geom is drawn horizontally or vertically. One of:

  • NA (default): automatically detect the orientation based on how the aesthetics are assigned. Automatic detection works most of the time.

  • "horizontal" (or "y"): draw horizontally, using the y aesthetic to identify different groups. For each group, uses the x, xmin, xmax, and thickness aesthetics to draw points, intervals, and slabs.

  • "vertical" (or "x"): draw vertically, using the x aesthetic to identify different groups. For each group, uses the y, ymin, ymax, and thickness aesthetics to draw points, intervals, and slabs.

For compatibility with the base ggplot naming scheme for orientation, "x" can be used as an alias for "vertical" and "y" as an alias for "horizontal" (ggdist had an orientation parameter before base ggplot did, hence the discrepancy).

na.rm

<scalar logical> If FALSE, the default, missing values are removed with a warning. If TRUE, missing values are silently removed.

show.legend

logical. Should this layer be included in the legends? NA, the default, includes if any aesthetics are mapped. FALSE never includes, and TRUE always includes. It can also be a named logical vector to finely select the aesthetics to display.

inherit.aes

If FALSE, overrides the default aesthetics, rather than combining with them. This is most useful for helper functions that define both data and aesthetics and shouldn't inherit behaviour from the default plot specification, e.g. borders().

check.aes, check.param

If TRUE, the default, will check that supplied parameters and aesthetics are understood by the geom or stat. Use FALSE to suppress the checks.

Details

The dots family of stats and geoms are similar to ggplot2::geom_dotplot() but with a number of differences:

Stats and geoms in this family include:

stat_dots() and stat_dotsinterval(), when used with the quantiles argument, are particularly useful for constructing quantile dotplots, which can be an effective way to communicate uncertainty using a frequency framing that may be easier for laypeople to understand (Kay et al. 2016, Fernandes et al. 2018).

To visualize sample data, such as a data distribution, samples from a bootstrap distribution, or a Bayesian posterior, you can supply samples to the x or y aesthetic.

To visualize analytical distributions, you can use the xdist or ydist aesthetic. For historical reasons, you can also use dist to specify the distribution, though this is not recommended as it does not work as well with orientation detection. These aesthetics can be used as follows:

Value

A ggplot2::Stat representing a blurry MCSE dot geometry which can be added to a ggplot() object.

Computed Variables

The following variables are computed by this stat and made available for use in aesthetic specifications (aes()) using the after_stat() function or the after_stat argument of stage():

Aesthetics

The dots+interval stats and geoms have a wide variety of aesthetics that control the appearance of their three sub-geometries: the dots (aka the slab), the point, and the interval.

These stats support the following aesthetics:

In addition, in their default configuration (paired with geom_blur_dots()) the following aesthetics are supported by the underlying geom:

Dots-specific (aka Slab-specific) aesthetics

Interval-specific aesthetics

Point-specific aesthetics

Color aesthetics

Line aesthetics

Slab-specific color and line override aesthetics

Interval-specific color and line override aesthetics

Point-specific color and line override aesthetics

Deprecated aesthetics

Other aesthetics (these work as in standard geoms)

See examples of some of these aesthetics in action in vignette("dotsinterval"). Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").

References

Kay, M., Kola, T., Hullman, J. R., & Munson, S. A. (2016). When (ish) is My Bus? User-centered Visualizations of Uncertainty in Everyday, Mobile Predictive Systems. Conference on Human Factors in Computing Systems - CHI '16, 5092–5103. doi:10.1145/2858036.2858558.

Fernandes, M., Walls, L., Munson, S., Hullman, J., & Kay, M. (2018). Uncertainty Displays Using Quantile Dotplots or CDFs Improve Transit Decision-Making. Conference on Human Factors in Computing Systems - CHI '18. doi:10.1145/3173574.3173718.

See Also

See geom_blur_dots() for the geom underlying this stat. See vignette("dotsinterval") for a variety of examples of use.

Other dotsinterval stats: stat_dots(), stat_dotsinterval()

Examples

library(dplyr)
library(ggplot2)
library(ggdist)

theme_set(theme_ggdist())

set.seed(1234)
data.frame(x = rnorm(1000)) %>%
  ggplot(aes(x = x)) +
  stat_mcse_dots(quantiles = 100, layout = "weave")

Point + multiple-interval plot (shortcut stat)

Description

Shortcut version of stat_slabinterval() with geom_pointinterval() for creating point + multiple-interval plots.

Roughly equivalent to:

stat_slabinterval(
  geom = "pointinterval",
  show_slab = FALSE
)
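The rough equivalence stated above can be checked with a small sketch that builds both versions on a hypothetical data frame df and compares the computed layer data. Only the point summaries and interval endpoints are compared here; other defaults of the two layers may differ.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# Hypothetical sample data: 3 groups, 500 draws each
df <- data.frame(
  group = rep(c("a", "b", "c"), each = 500),
  value = rnorm(1500, mean = rep(c(5, 7, 9), each = 500))
)

p_shortcut <- ggplot(df, aes(x = value, y = group)) +
  stat_pointinterval()

p_longhand <- ggplot(df, aes(x = value, y = group)) +
  stat_slabinterval(geom = "pointinterval", show_slab = FALSE)

# Both layers are expected to compute the same summaries
d1 <- layer_data(p_shortcut)
d2 <- layer_data(p_longhand)
all.equal(d1[c("x", "xmin", "xmax")], d2[c("x", "xmin", "xmax")])
```

In everyday use the shortcut is the more readable choice; the longhand form is mostly useful for seeing which pieces of stat_slabinterval() the shortcut switches off.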

Usage

stat_pointinterval(
  mapping = NULL,
  data = NULL,
  geom = "pointinterval",
  position = "identity",
  ...,
  point_interval = "median_qi",
  .width = c(0.66, 0.95),
  orientation = NA,
  na.rm = FALSE,
  show.legend = c(size = FALSE),
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)

Arguments

mapping

Set of aesthetic mappings created by aes(). If specified and inherit.aes = TRUE (the default), it is combined with the default mapping at the top level of the plot. You must supply mapping if there is no plot mapping.

data

The data to be displayed in this layer. There are three options:

If NULL, the default, the data is inherited from the plot data as specified in the call to ggplot().

A data.frame, or other object, will override the plot data. All objects will be fortified to produce a data frame. See fortify() for which variables will be created.

A function will be called with a single argument, the plot data. The return value must be a data.frame, and will be used as the layer data. A function can be created from a formula (e.g. ~ head(.x, 10)).

geom

<Geom | string> Use to override the default connection between stat_pointinterval() and geom_pointinterval().

position

<Position | string> Position adjustment, either as a string, or the result of a call to a position adjustment function. Setting this equal to "dodge" (position_dodge()) or "dodgejust" (position_dodgejust()) can be useful if you have overlapping geometries.

...

Other arguments passed to layer(). These are often aesthetics, used to set an aesthetic to a fixed value, like colour = "red" or linewidth = 3 (see Aesthetics, below). They may also be parameters to the paired geom/stat. When paired with the default geom, geom_pointinterval(), these include:

interval_size_domain

<length-2 numeric> Minimum and maximum of the values of the size and linewidth aesthetics that will be translated into actual sizes for intervals drawn according to interval_size_range (see the documentation for that argument).

interval_size_range

<length-2 numeric> This geom scales the raw size aesthetic values when drawing interval and point sizes, as they tend to be too thick when using the default settings of scale_size_continuous(), which give sizes with a range of c(1, 6). The interval_size_domain value indicates the input domain of raw size values (typically this should be equal to the value of the range argument of the scale_size_continuous() function), and interval_size_range indicates the desired output range of the size values (the min and max of the actual sizes used to draw intervals). Most of the time it is not recommended to change the value of this argument, as it may result in strange scaling of legends; this argument is a holdover from earlier versions that did not have size aesthetics targeting the point and interval separately. If you want to adjust the size of the interval or points separately, you can also use the linewidth or point_size aesthetics; see sub-geometry-scales.

fatten_point

<scalar numeric> A multiplicative factor used to adjust the size of the point relative to the size of the thickest interval line. If you wish to specify point sizes directly, you can also use the point_size aesthetic and scale_point_size_continuous() or scale_point_size_discrete(); sizes specified with that aesthetic will not be adjusted using fatten_point.
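The two ways of controlling point size mentioned above, a relative fatten_point factor or an absolute point_size, can be sketched as follows. The data frame df and the specific values 3 and 4 are hypothetical.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# Hypothetical sample data: 3 groups, 500 draws each
df <- data.frame(
  group = rep(c("a", "b", "c"), each = 500),
  value = rnorm(1500, mean = rep(c(5, 7, 9), each = 500))
)

base <- ggplot(df, aes(x = value, y = group))

# Point size relative to the thickest interval line
p_fatten <- base + stat_pointinterval(fatten_point = 3)

# Point size set directly; not adjusted by fatten_point
p_point_size <- base + stat_pointinterval(point_size = 4)
```

Using point_size keeps the point and interval sizes independent, which is the approach the text above recommends over changing interval_size_range.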

arrow

<arrow | NULL> Type of arrow heads to use on the interval, or NULL for no arrows.

point_interval

<function | string> A function from the point_interval() family (e.g., median_qi, mean_qi, mode_hdi, etc), or a string giving the name of a function from that family (e.g., "median_qi", "mean_qi", "mode_hdi", etc; if a string, the caller's environment is searched for the function, followed by the ggdist environment). This function determines the point summary (typically mean, median, or mode) and interval type (quantile interval, qi; highest-density interval, hdi; or highest-density continuous interval, hdci). Output will be converted to the appropriate x- or y-based aesthetics depending on the value of orientation. See the point_interval() family of functions for more information.

.width

<numeric> The .width argument passed to point_interval: a vector of probabilities to use that determine the widths of the resulting intervals. If multiple probabilities are provided, multiple intervals per group are generated, each with a different probability interval (and value of the corresponding .width and level generated variables).

orientation

<string> Whether this geom is drawn horizontally or vertically. One of:

  • NA (default): automatically detect the orientation based on how the aesthetics are assigned. Automatic detection works most of the time.

  • "horizontal" (or "y"): draw horizontally, using the y aesthetic to identify different groups. For each group, uses the x, xmin, xmax, and thickness aesthetics to draw points, intervals, and slabs.

  • "vertical" (or "x"): draw vertically, using the x aesthetic to identify different groups. For each group, uses the y, ymin, ymax, and thickness aesthetics to draw points, intervals, and slabs.

For compatibility with the base ggplot naming scheme for orientation, "x" can be used as an alias for "vertical" and "y" as an alias for "horizontal" (ggdist had an orientation parameter before base ggplot did, hence the discrepancy).

na.rm

<scalar logical> If FALSE, the default, missing values are removed with a warning. If TRUE, missing values are silently removed.

show.legend

<logical> Should this layer be included in the legends? Default is c(size = FALSE), unlike most geoms, to match its common use cases. FALSE hides all legends, TRUE shows all legends, and NA shows only those that are mapped (the default for most geoms). It can also be a named logical vector to finely select the aesthetics to display.

inherit.aes

If FALSE, overrides the default aesthetics, rather than combining with them. This is most useful for helper functions that define both data and aesthetics and shouldn't inherit behaviour from the default plot specification, e.g. borders().

check.aes, check.param

If TRUE, the default, will check that supplied parameters and aesthetics are understood by the geom or stat. Use FALSE to suppress the checks.

Details

To visualize sample data, such as a data distribution, samples from a bootstrap distribution, or a Bayesian posterior, you can supply samples to the x or y aesthetic.

To visualize analytical distributions, you can use the xdist or ydist aesthetic. For historical reasons, you can also use dist to specify the distribution, though this is not recommended as it does not work as well with orientation detection. These aesthetics can be used as follows:

Value

A ggplot2::Stat representing a point + multiple-interval geometry which can be added to a ggplot() object.

Computed Variables

The following variables are computed by this stat and made available for use in aesthetic specifications (aes()) using the after_stat() function or the after_stat argument of stage():

Aesthetics

The slab+interval stats and geoms have a wide variety of aesthetics that control the appearance of their three sub-geometries: the slab, the point, and the interval.

These stats support the following aesthetics:

In addition, in their default configuration (paired with geom_pointinterval()) the following aesthetics are supported by the underlying geom:

Interval-specific aesthetics

Point-specific aesthetics

Color aesthetics

Line aesthetics

Interval-specific color and line override aesthetics

Point-specific color and line override aesthetics

Deprecated aesthetics

Other aesthetics (these work as in standard geoms)

See examples of some of these aesthetics in action in vignette("slabinterval"). Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").

See Also

See geom_pointinterval() for the geom underlying this stat. See stat_slabinterval() for the stat this shortcut is based on.

Other slabinterval stats: stat_ccdfinterval(), stat_cdfinterval(), stat_eye(), stat_gradientinterval(), stat_halfeye(), stat_histinterval(), stat_interval(), stat_slab(), stat_spike()

Examples

library(dplyr)
library(ggplot2)
library(distributional)

theme_set(theme_ggdist())

# ON SAMPLE DATA
set.seed(1234)
df = data.frame(
  group = c("a", "b", "c"),
  value = rnorm(1500, mean = c(5, 7, 9), sd = c(1, 1.5, 1))
)
df %>%
  ggplot(aes(x = value, y = group)) +
  stat_pointinterval()

# ON ANALYTICAL DISTRIBUTIONS
dist_df = data.frame(
  group = c("a", "b", "c"),
  mean =  c(  5,   7,   8),
  sd =    c(  1, 1.5,   1)
)
# Vectorized distribution types, like distributional::dist_normal()
# and posterior::rvar(), can be used with the `xdist` / `ydist` aesthetics
dist_df %>%
  ggplot(aes(y = group, xdist = dist_normal(mean, sd))) +
  stat_pointinterval()

Multiple-ribbon plot (shortcut stat)

Description

A combination of stat_slabinterval() and geom_lineribbon() with sensible defaults for making multiple-ribbon plots. While geom_lineribbon() is intended for use on data frames that have already been summarized using a point_interval() function, stat_ribbon() is intended for use directly on data frames of draws or of analytical distributions, and will perform the summarization using a point_interval() function.

Roughly equivalent to:

stat_lineribbon(
  show_point = FALSE
)

Usage

stat_ribbon(
  mapping = NULL,
  data = NULL,
  geom = "lineribbon",
  position = "identity",
  ...,
  .width = c(0.5, 0.8, 0.95),
  point_interval = "median_qi",
  orientation = NA,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)

Arguments

mapping

Set of aesthetic mappings created by aes(). If specified and inherit.aes = TRUE (the default), it is combined with the default mapping at the top level of the plot. You must supply mapping if there is no plot mapping.

data

The data to be displayed in this layer. There are three options:

If NULL, the default, the data is inherited from the plot data as specified in the call to ggplot().

A data.frame, or other object, will override the plot data. All objects will be fortified to produce a data frame. See fortify() for which variables will be created.

A function will be called with a single argument, the plot data. The return value must be a data.frame, and will be used as the layer data. A function can be created from a formula (e.g. ~ head(.x, 10)).
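As a rough sketch of the third option, a one-sided formula given to data is turned into a function of the plot data. The data frame df and the filter on x are illustrative assumptions, not taken from the package documentation.

```r
library(ggplot2)
library(ggdist)

set.seed(1)
# Illustrative data (an assumption for this sketch): 100 draws at each x
x_vals <- rep(1:10, 100)
df <- data.frame(x = x_vals, y = rnorm(1000, mean = x_vals))

# The formula receives the plot data as .x; only rows with x <= 5 reach this layer
p <- ggplot(df, aes(x = x, y = y)) +
  stat_ribbon(data = ~ subset(.x, x <= 5)) +
  scale_fill_brewer()
```

The plot-level data stays untouched, so other layers added to p would still see all ten x values.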

geom

<Geom | string> Use to override the default connection between stat_ribbon() and geom_lineribbon().

position

<Position | string> Position adjustment, either as a string, or the result of a call to a position adjustment function. Setting this equal to "dodge" (position_dodge()) or "dodgejust" (position_dodgejust()) can be useful if you have overlapping geometries.

...

Other arguments passed to layer(). These are often aesthetics, used to set an aesthetic to a fixed value, like colour = "red" or linewidth = 3 (see Aesthetics, below). They may also be parameters to the paired geom/stat. When paired with the default geom, geom_lineribbon(), these include:

step

<scalar logical | string> Should the line/ribbon be drawn as a step function? One of:

  • FALSE (default): do not draw as a step function.

  • "mid" (or TRUE): draw steps midway between adjacent x values.

  • "hv": draw horizontal-then-vertical steps.

  • "vh": draw vertical-then-horizontal steps.

TRUE is an alias for "mid", because for a step function with ribbons "mid" is a reasonable default (for the other two step approaches the ribbons at either the very first or very last x value will not be visible).
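A minimal sketch of the step options, assuming step is forwarded through ... to geom_lineribbon() as described above. The data frame is illustrative.

```r
library(ggplot2)
library(ggdist)

set.seed(1)
# Illustrative data (an assumption for this sketch)
x_vals <- rep(1:10, 100)
df <- data.frame(x = x_vals, y = rnorm(1000, mean = x_vals))

# step = "mid" (equivalently TRUE): steps fall midway between adjacent x values
p_mid <- ggplot(df, aes(x = x, y = y)) +
  stat_ribbon(step = "mid") +
  scale_fill_brewer()

# step = "hv": horizontal-then-vertical steps
p_hv <- ggplot(df, aes(x = x, y = y)) +
  stat_ribbon(step = "hv") +
  scale_fill_brewer()
```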

.width

<numeric> The .width argument passed to point_interval: a vector of probabilities to use that determine the widths of the resulting intervals. If multiple probabilities are provided, multiple intervals per group are generated, each with a different probability interval (and value of the corresponding .width and level generated variables).
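To make the effect of several probabilities concrete, the sketch below calls median_qi() directly on a vector of draws. The draws are simulated for illustration only.

```r
library(ggdist)

set.seed(1)
draws <- rnorm(4000)  # illustrative draws, an assumption for this sketch

# One row per requested probability; the .width column records which
# probability each interval corresponds to
intervals <- median_qi(draws, .width = c(0.5, 0.8, 0.95))
intervals
```

Each additional value in .width adds one row here and, in a ribbon plot, one nested ribbon.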

point_interval

<function | string> A function from the point_interval() family (e.g., median_qi, mean_qi, mode_hdi, etc), or a string giving the name of a function from that family (e.g., "median_qi", "mean_qi", "mode_hdi", etc; if a string, the caller's environment is searched for the function, followed by the ggdist environment). This function determines the point summary (typically mean, median, or mode) and interval type (quantile interval, qi; highest-density interval, hdi; or highest-density continuous interval, hdci). Output will be converted to the appropriate x- or y-based aesthetics depending on the value of orientation. See the point_interval() family of functions for more information.
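The two accepted forms, a function or its name as a string, can be sketched as follows. The data frame is illustrative, and mean_qi is used only because it is one of the family members named above.

```r
library(ggplot2)
library(ggdist)

set.seed(1)
# Illustrative data (an assumption for this sketch)
x_vals <- rep(1:10, 100)
df <- data.frame(x = x_vals, y = rnorm(1000, mean = x_vals))

# By name: the string is looked up as a function
p_name <- ggplot(df, aes(x = x, y = y)) +
  stat_ribbon(point_interval = "mean_qi") +
  scale_fill_brewer()

# By function: pass the function object itself
p_fun <- ggplot(df, aes(x = x, y = y)) +
  stat_ribbon(point_interval = mean_qi) +
  scale_fill_brewer()
```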

orientation

<string> Whether this geom is drawn horizontally or vertically. One of:

  • NA (default): automatically detect the orientation based on how the aesthetics are assigned. Automatic detection works most of the time.

  • "horizontal" (or "y"): draw horizontally, using the y aesthetic to identify different groups. For each group, uses the x, xmin, xmax, and thickness aesthetics to draw points, intervals, and slabs.

  • "vertical" (or "x"): draw vertically, using the x aesthetic to identify different groups. For each group, uses the y, ymin, ymax, and thickness aesthetics to draw points, intervals, and slabs.

For compatibility with the base ggplot naming scheme for orientation, "x" can be used as an alias for "vertical" and "y" as an alias for "horizontal" (ggdist had an orientation parameter before base ggplot did, hence the discrepancy).

na.rm

<scalar logical> If FALSE, the default, missing values are removed with a warning. If TRUE, missing values are silently removed.

show.legend

<logical> Should this layer be included in the legends? NA, the default, includes if any aesthetics are mapped. FALSE never includes, and TRUE always includes. It can also be a named logical vector to finely select the aesthetics to display.

inherit.aes

If FALSE, overrides the default aesthetics, rather than combining with them. This is most useful for helper functions that define both data and aesthetics and shouldn't inherit behaviour from the default plot specification, e.g. borders().

check.aes, check.param

If TRUE, the default, will check that supplied parameters and aesthetics are understood by the geom or stat. Use FALSE to suppress the checks.

Details

To visualize sample data, such as a data distribution, samples from a bootstrap distribution, or a Bayesian posterior, you can supply samples to the x or y aesthetic.

To visualize analytical distributions, you can use the xdist or ydist aesthetic. For historical reasons, you can also use dist to specify the distribution, though this is not recommended as it does not work as well with orientation detection. These aesthetics can be used as follows:

Value

A ggplot2::Stat representing a multiple-ribbon geometry which can be added to a ggplot() object.

Computed Variables

The following variables are computed by this stat and made available for use in aesthetic specifications (aes()) using the after_stat() function or the after_stat argument of stage():

Aesthetics

The line+ribbon stats and geoms have a wide variety of aesthetics that control the appearance of their two sub-geometries: the line and the ribbon.

These stats support the following aesthetics:

In addition, in their default configuration (paired with geom_lineribbon()) the following aesthetics are supported by the underlying geom:

Ribbon-specific aesthetics

Color aesthetics

Other aesthetics (these work as in standard geoms)

See examples of some of these aesthetics in action in vignette("lineribbon"). Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").

See Also

See geom_lineribbon() for the geom underlying this stat.

Other lineribbon stats: stat_lineribbon()

Examples

library(dplyr)
library(ggplot2)
library(distributional)

theme_set(theme_ggdist())

# ON SAMPLE DATA
set.seed(12345)
tibble(
  x = rep(1:10, 100),
  y = rnorm(1000, x)
) %>%
  ggplot(aes(x = x, y = y)) +
  stat_ribbon() +
  scale_fill_brewer()

# ON ANALYTICAL DISTRIBUTIONS
# Vectorized distribution types, like distributional::dist_normal()
# and posterior::rvar(), can be used with the `xdist` / `ydist` aesthetics
tibble(
  x = 1:10,
  sd = seq(1, 3, length.out = 10)
) %>%
  ggplot(aes(x = x, ydist = dist_normal(x, sd))) +
  stat_ribbon() +
  scale_fill_brewer()

Slab (ridge) plot (shortcut stat)

Description

Shortcut version of stat_slabinterval() with geom_slab() for creating slab (ridge) plots.

Roughly equivalent to:

stat_slabinterval(
  aes(size = NULL),
  geom = "slab",
  show_point = FALSE,
  show_interval = FALSE,
  show.legend = NA
)

Usage

stat_slab(
  mapping = NULL,
  data = NULL,
  geom = "slab",
  position = "identity",
  ...,
  p_limits = c(NA, NA),
  density = "bounded",
  adjust = waiver(),
  trim = waiver(),
  breaks = waiver(),
  align = waiver(),
  outline_bars = waiver(),
  expand = FALSE,
  limits = NULL,
  n = waiver(),
  orientation = NA,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)

Arguments

mapping

Set of aesthetic mappings created by aes(). If specified and inherit.aes = TRUE (the default), it is combined with the default mapping at the top level of the plot. You must supply mapping if there is no plot mapping.

data

The data to be displayed in this layer. There are three options:

If NULL, the default, the data is inherited from the plot data as specified in the call to ggplot().

A data.frame, or other object, will override the plot data. All objects will be fortified to produce a data frame. See fortify() for which variables will be created.

A function will be called with a single argument, the plot data. The return value must be a data.frame, and will be used as the layer data. A function can be created from a formula (e.g. ~ head(.x, 10)).

geom

<Geom | string> Use to override the default connection between stat_slab() and geom_slab().

position

<Position | string> Position adjustment, either as a string, or the result of a call to a position adjustment function. Setting this equal to "dodge" (position_dodge()) or "dodgejust" (position_dodgejust()) can be useful if you have overlapping geometries.

...

Other arguments passed to layer(). These are often aesthetics, used to set an aesthetic to a fixed value, like colour = "red" or linewidth = 3 (see Aesthetics, below). They may also be parameters to the paired geom/stat. When paired with the default geom, geom_slab(), these include:

subscale

<function | string> Sub-scale used to scale values of the thickness aesthetic within the groups determined by normalize. One of:

  • A function that takes an x argument giving a numeric vector of values to be scaled and then returns a thickness vector representing the scaled values, such as subscale_thickness() or subscale_identity().

  • A string giving the name of such a function when prefixed with "subscale_"; e.g. "thickness" or "identity". The value "thickness" uses the default subscale, which can be modified by setting subscale_thickness; see the documentation for that function.

For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.

normalize

<string> Groups within which to scale values of the thickness aesthetic. One of:

  • "all": normalize so that the maximum height across all data is 1.

  • "panels": normalize within panels so that the maximum height in each panel is 1.

  • "xy": normalize within the x/y axis opposite the orientation of this geom so that the maximum height at each value of the opposite axis is 1.

  • "groups": normalize within values of the opposite axis and within each group so that the maximum height in each group is 1.

  • "none": values are taken as is with no normalization (this should probably only be used with functions whose values are in [0,1], such as CDFs).

For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.
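As an illustration of two of the normalize options, the sketch below draws the same three normal distributions under "all" and "groups". The data frame dist_df and its values are assumptions chosen so that the peak densities differ.

```r
library(ggplot2)
library(ggdist)
library(distributional)

# Illustrative distributions with different spreads (an assumption for this sketch)
dist_df <- data.frame(
  group = c("a", "b", "c"),
  mean  = c(5, 7, 8),
  sd    = c(1, 1.5, 3)
)

# "all": one shared scale, so the narrowest distribution has the tallest slab
p_all <- ggplot(dist_df, aes(y = group, xdist = dist_normal(mean, sd))) +
  stat_slab(normalize = "all")

# "groups": each slab is rescaled so that its own maximum height is 1
p_groups <- ggplot(dist_df, aes(y = group, xdist = dist_normal(mean, sd))) +
  stat_slab(normalize = "groups")
```

With "all", relative heights stay comparable across slabs; with "groups", that comparability is traded for slabs that each use the full available height.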

fill_type

<string> What type of fill to use when the fill color or alpha varies within a slab. One of:

  • "segments": breaks up the slab geometry into segments for each unique combination of fill color and alpha value. This approach is supported by all graphics devices and works well for sharp cutoff values, but can give ugly results if a large number of unique fill colors are being used (as in gradients, like in stat_gradientinterval()).

  • "gradient": a grid::linearGradient() is used to create a smooth gradient fill. This works well for large numbers of unique fill colors, but requires R >= 4.1 and is not yet supported on all graphics devices. As of this writing, the png() graphics device with type = "cairo", the svg() device, the pdf() device, and the ragg::agg_png() devices are known to support this option. On R < 4.1, this option will fall back to fill_type = "segments" with a message.

  • "auto": attempts to use fill_type = "gradient" if support for it can be auto-detected. On R >= 4.2, support for gradients can be auto-detected on some graphics devices; if support is not detected, this option will fall back to fill_type = "segments" (in case of a false negative, fill_type = "gradient" can be set explicitly). On R < 4.2, support for gradients cannot be auto-detected, so this will always fall back to fill_type = "segments", in which case you can set fill_type = "gradient" explicitly if you are using a graphics device that supports gradients.
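A sharp cutoff is the case where "segments" is described as working well. The sketch below fills each slab differently on either side of a threshold; the threshold of 7 and the data frame are illustrative assumptions.

```r
library(ggplot2)
library(ggdist)
library(distributional)

# Illustrative distributions (an assumption for this sketch)
dist_df <- data.frame(
  group = c("a", "b", "c"),
  mean  = c(5, 7, 8),
  sd    = c(1, 1.5, 1)
)

# The fill varies within each slab (two values: x > 7 or not), so the slab
# is split into segments at the cutoff
p <- ggplot(dist_df, aes(y = group, xdist = dist_normal(mean, sd))) +
  stat_slab(aes(fill = after_stat(x > 7)), fill_type = "segments")
```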

subguide

<function | string> Sub-guide used to annotate the thickness scale. One of:

  • A function that takes a scale argument giving a ggplot2::Scale object and an orientation argument giving the orientation of the geometry and then returns a grid::grob that will draw the axis annotation, such as subguide_axis() (to draw a traditional axis) or subguide_none() (to draw no annotation). See subguide_axis() for a list of possibilities and examples.

  • A string giving the name of such a function when prefixed with "subguide_"; e.g. "axis" or "none". The values "slab", "dots", and "spike" use the default subguide for their geom families (no subguide), which can be modified by setting subguide_slab, subguide_dots, or subguide_spike; see the documentation for those functions.

p_limits

<length-2 numeric> Probability limits. Used to determine the lower and upper limits of analytical distributions (distributions from samples ignore this parameter and determine their limits based on the limits of the sample and the value of the trim parameter). E.g., if this is c(.001, .999), then a slab is drawn for the distribution from the quantile at p = .001 to the quantile at p = .999. If the lower (respectively upper) limit is NA, then the lower (upper) limit will be the minimum (maximum) of the distribution's support if it is finite, and 0.001 (0.999) if it is not finite. E.g., if p_limits is c(NA, NA), on a gamma distribution the effective value of p_limits would be c(0, .999) since the gamma distribution is defined on (0, Inf); whereas on a normal distribution it would be equivalent to c(.001, .999) since the normal distribution is defined on (-Inf, Inf).
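The NA rule above can be worked through in base R. The helper resolve_p_limits() is hypothetical (it is not a ggdist function) and only mirrors the description; the gamma shape and rate are arbitrary illustrative values.

```r
# Hypothetical helper (not part of ggdist): fill in NA entries of p_limits
# from the distribution's support, following the rule described above.
resolve_p_limits <- function(p_limits, support) {
  defaults    <- c(0.001, 0.999)  # used where the support is unbounded
  finite_edge <- c(0, 1)          # a finite support bound corresponds to p = 0 or p = 1
  ifelse(
    is.na(p_limits),
    ifelse(is.finite(support), finite_edge, defaults),
    p_limits
  )
}

gamma_p  <- resolve_p_limits(c(NA, NA), support = c(0, Inf))     # gamma: (0, Inf)
normal_p <- resolve_p_limits(c(NA, NA), support = c(-Inf, Inf))  # normal: (-Inf, Inf)

# Quantiles at those probabilities give the range over which the slab is drawn
gamma_range  <- qgamma(gamma_p, shape = 2, rate = 1)
normal_range <- qnorm(normal_p)
```

Here gamma_p and normal_p reproduce the two effective values given in the text, c(0, .999) and c(.001, .999).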

density

<function | string> Density estimator for sample data. One of:

  • A function which takes a numeric vector and returns a list with elements x (giving grid points for the density estimator) and y (the corresponding densities). ggdist provides a family of functions following this format, including density_unbounded() and density_bounded(). This format is also compatible with stats::density().

  • A string giving the suffix of a function name that starts with "density_"; e.g. "bounded" for density_bounded(), "unbounded" for density_unbounded(), or "histogram" for density_histogram(). Defaults to "bounded", i.e. density_bounded(), which estimates the bounds from the data and then uses a bounded density estimator based on the reflection method.
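The function form can be sketched with a thin wrapper around stats::density(). The name my_density and the choice of 101 grid points are illustrative assumptions, not something ggdist defines.

```r
# Illustrative custom estimator (an assumption, not part of ggdist): returns
# the list(x, y) shape described above. Extra arguments are accepted and
# ignored so that callers may pass additional options.
my_density <- function(x, ...) {
  d <- stats::density(x, n = 101)
  list(x = d$x, y = d$y)
}

set.seed(1)
est <- my_density(rnorm(500))
str(est)
```

A function of this shape could then be supplied as density = my_density. Which further arguments the stat passes along is not covered here, which is why the sketch accepts and discards them.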

adjust

<scalar numeric | waiver> Passed to density (e.g. density_bounded()): Value to multiply the bandwidth of the density estimator by. Default waiver() defers to the default of the density estimator, which is usually 1.

trim

<scalar logical | waiver> Passed to density (e.g. density_bounded()): Should the density estimate be trimmed to the range of the data? Default waiver() defers to the default of the density estimator, which is usually TRUE.

breaks

<numeric | function | string | waiver> Passed to density (e.g. density_histogram()): Determines the breakpoints defining bins. Default waiver() defers to the default of the density estimator, which is usually "Scott". Similar to (but not exactly the same as) the breaks argument to graphics::hist(). One of:

  • A scalar (length-1) numeric giving the number of bins

  • A numeric vector giving the breakpoints between histogram bins

  • A function taking x and weights and returning either the number of bins or a vector of breakpoints

  • A string giving the suffix of a function that starts with "breaks_". ggdist provides weighted implementations of the "Sturges", "Scott", and "FD" break-finding algorithms from graphics::hist(), as well as breaks_fixed() for manually setting the bin width. See breaks.

For example, breaks = "Sturges" will use the breaks_Sturges() algorithm, breaks = 9 will create 9 bins, and breaks = breaks_fixed(width = 1) will set the bin width to 1.
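The three forms mentioned in that example can be put side by side. This is a sketch on simulated data (an assumption), using density = "histogram" so that breaks has an effect.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
df <- data.frame(value = rnorm(500, mean = 5))  # illustrative sample

# A number of bins
p_count <- ggplot(df, aes(x = value)) +
  stat_slab(density = "histogram", breaks = 9)

# A named break-finding algorithm
p_sturges <- ggplot(df, aes(x = value)) +
  stat_slab(density = "histogram", breaks = "Sturges")

# A fixed bin width
p_width <- ggplot(df, aes(x = value)) +
  stat_slab(density = "histogram", breaks = breaks_fixed(width = 1))
```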

align

<scalar numeric | function | string | waiver> Passed to density (e.g. density_histogram()): Determines how to align the breakpoints defining bins. Default waiver() defers to the default of the density estimator, which is usually "none" (performs no alignment). One of:

  • A scalar (length-1) numeric giving an offset that is subtracted from the breaks. The offset must be between 0 and the bin width.

  • A function taking a sorted vector of breaks (bin edges) and returning an offset to subtract from the breaks.

  • A string giving the suffix of a function that starts with "align_" used to determine the alignment, such as align_none(), align_boundary(), or align_center().

For example, align = "none" will provide no alignment, align = align_center(at = 0) will center a bin on 0, and align = align_boundary(at = 0) will align a bin edge on 0.
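A brief sketch contrasting the two alignment helpers on a histogram with unit-width bins. The sample is simulated for illustration.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
df <- data.frame(value = rnorm(500, mean = 5))  # illustrative sample

# Bins of width 1 with one bin centred on 0, so edges fall on ..., 4.5, 5.5, ...
p_center <- ggplot(df, aes(x = value)) +
  stat_slab(
    density = "histogram",
    breaks = breaks_fixed(width = 1),
    align = align_center(at = 0)
  )

# Bins of width 1 with a bin edge on 0, so edges fall on ..., 4, 5, 6, ...
p_boundary <- ggplot(df, aes(x = value)) +
  stat_slab(
    density = "histogram",
    breaks = breaks_fixed(width = 1),
    align = align_boundary(at = 0)
  )
```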

outline_bars

<scalar logical | waiver> Passed to density (e.g. density_histogram()) and also used for discrete analytical distributions (whose slabs are drawn as histograms). Determines if outlines in between the bars are drawn. If waiver() or FALSE (the default), the outline is drawn only along the tops of the bars. If TRUE, outlines in between bars are also drawn (though you may have to set the slab_color or color aesthetic to see the outlines).

expand

<logical> For sample data, should the slab be expanded to the limits of the scale? Default FALSE. Can be a length-two logical vector to control expansion to the lower and upper limit respectively.

limits

<length-2 numeric> Manually-specified limits for the slab, as a vector of length two. These limits are combined with those computed based on p_limits as well as the limits defined by the scales of the plot to determine the limits used to draw the slab functions: these limits specify the maximal limits; i.e., if specified, the limits will not be wider than these (but may be narrower). Use NA to leave a limit alone; e.g. limits = c(0, NA) will ensure that the lower limit does not go below 0, but let the upper limit be determined by either p_limits or the scale settings.
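The limits = c(0, NA) case from the text might look like this in use. The distribution (a normal with mean 1 and sd 2, whose lower tail extends below 0) is an illustrative assumption.

```r
library(ggplot2)
library(ggdist)
library(distributional)

# The lower tail of Normal(1, 2) extends below 0; limits = c(0, NA) caps the
# slab at 0 on the left and leaves the upper limit to p_limits or the scale
p <- ggplot(data.frame(group = "a"), aes(y = group, xdist = dist_normal(1, 2))) +
  stat_slab(limits = c(0, NA))
```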

n

<scalar numeric> Number of points at which to evaluate the function that defines the slab. Also passed to density (e.g. density_bounded()). Default waiver() uses the value 501 for analytical distributions and defers to the default of the density estimator for sample-based distributions, which is also usually 501.

orientation

<string> Whether this geom is drawn horizontally or vertically. One of:

  • NA (default): automatically detect the orientation based on how the aesthetics are assigned. Automatic detection works most of the time.

  • "horizontal" (or "y"): draw horizontally, using the y aesthetic to identify different groups. For each group, uses the x, xmin, xmax, and thickness aesthetics to draw points, intervals, and slabs.

  • "vertical" (or "x"): draw vertically, using the x aesthetic to identify different groups. For each group, uses the y, ymin, ymax, and thickness aesthetics to draw points, intervals, and slabs.

For compatibility with the base ggplot naming scheme for orientation, "x" can be used as an alias for "vertical" and "y" as an alias for "horizontal" (ggdist had an orientation parameter before base ggplot did, hence the discrepancy).
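Setting the orientation explicitly, rather than relying on detection, can be sketched as follows. The data frame is illustrative.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# Illustrative sample data (an assumption for this sketch)
df <- data.frame(
  group = c("a", "b", "c"),
  value = rnorm(1500, mean = c(5, 7, 9))
)

# Horizontal: y identifies the groups, densities are drawn along x
p_h <- ggplot(df, aes(x = value, y = group)) +
  stat_slab(orientation = "horizontal")

# Vertical: x identifies the groups, densities are drawn along y
p_v <- ggplot(df, aes(x = group, y = value)) +
  stat_slab(orientation = "vertical")
```

Per the alias rule above, orientation = "y" and orientation = "x" would be the equivalent spellings.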

na.rm

<scalar logical> If FALSE, the default, missing values are removed with a warning. If TRUE, missing values are silently removed.

show.legend

<logical> Should this layer be included in the legends? Default is c(size = FALSE), unlike most geoms, to match its common use cases. FALSE hides all legends, TRUE shows all legends, and NA shows only those that are mapped (the default for most geoms). It can also be a named logical vector to finely select the aesthetics to display.

inherit.aes

If FALSE, overrides the default aesthetics, rather than combining with them. This is most useful for helper functions that define both data and aesthetics and shouldn't inherit behaviour from the default plot specification, e.g. borders().

check.aes, check.param

If TRUE, the default, will check that supplied parameters and aesthetics are understood by the geom or stat. Use FALSE to suppress the checks.

Details

To visualize sample data, such as a data distribution, samples from a bootstrap distribution, or a Bayesian posterior, you can supply samples to the x or y aesthetic.

To visualize analytical distributions, you can use the xdist or ydist aesthetic. For historical reasons, you can also use dist to specify the distribution, though this is not recommended as it does not work as well with orientation detection. These aesthetics can be used as follows:

Value

A ggplot2::Stat representing a slab (ridge) geometry which can be added to a ggplot() object.

Computed Variables

The following variables are computed by this stat and made available for use in aesthetic specifications (aes()) using the after_stat() function or the after_stat argument of stage():
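As a sketch of after_stat() in use, the example below maps slab thickness to the cumulative distribution function, assuming a computed variable named cdf is available (as used by the CDF-based shortcut stats). It sets normalize = "none" because CDF values already lie in [0,1].

```r
library(ggplot2)
library(ggdist)
library(distributional)

# Illustrative distributions (an assumption for this sketch)
dist_df <- data.frame(
  group = c("a", "b", "c"),
  mean  = c(5, 7, 8),
  sd    = c(1, 1.5, 1)
)

# thickness is taken from the computed cdf variable rather than the density
p <- ggplot(dist_df, aes(y = group, xdist = dist_normal(mean, sd))) +
  stat_slab(aes(thickness = after_stat(cdf)), normalize = "none")
```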

Aesthetics

The slab+interval stats and geoms have a wide variety of aesthetics that control the appearance of their three sub-geometries: the slab, the point, and the interval.

These stats support the following aesthetics:

In addition, in their default configuration (paired with geom_slab()) the following aesthetics are supported by the underlying geom:

Slab-specific aesthetics

Color aesthetics

Line aesthetics

Slab-specific color and line override aesthetics

Deprecated aesthetics

Other aesthetics (these work as in standard geoms)

See examples of some of these aesthetics in action in vignette("slabinterval"). Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").

See Also

See geom_slab() for the geom underlying this stat. See stat_slabinterval() for the stat this shortcut is based on.

Other slabinterval stats: stat_ccdfinterval(), stat_cdfinterval(), stat_eye(), stat_gradientinterval(), stat_halfeye(), stat_histinterval(), stat_interval(), stat_pointinterval(), stat_spike()

Examples

library(dplyr)
library(ggplot2)
library(distributional)

theme_set(theme_ggdist())

# ON SAMPLE DATA
set.seed(1234)
df = data.frame(
  group = c("a", "b", "c"),
  value = rnorm(1500, mean = c(5, 7, 9), sd = c(1, 1.5, 1))
)
df %>%
  ggplot(aes(x = value, y = group)) +
  stat_slab()

# ON ANALYTICAL DISTRIBUTIONS
dist_df = data.frame(
  group = c("a", "b", "c"),
  mean =  c(  5,   7,   8),
  sd =    c(  1, 1.5,   1)
)
# Vectorized distribution types, like distributional::dist_normal()
# and posterior::rvar(), can be used with the `xdist` / `ydist` aesthetics
dist_df %>%
  ggplot(aes(y = group, xdist = dist_normal(mean, sd))) +
  stat_slab()

# RIDGE PLOTS
# "ridge" plots can be created by expanding the slabs to the limits of the plot
# (expand = TRUE), allowing the density estimator to be nonzero outside the
# limits of the data (trim = FALSE), and increasing the height of the slabs.
data.frame(
  group = letters[1:3],
  value = rnorm(3000, 3:1)
) %>%
  ggplot(aes(y = group, x = value)) +
  stat_slab(color = "black", expand = TRUE, trim = FALSE, height = 2)

Slab + interval plots for sample data and analytical distributions (ggplot stat)

Description

"Meta" stat for computing distribution functions (densities or CDFs) + intervals for use with geom_slabinterval(). Useful for creating eye plots, half-eye plots, CCDF bar plots, gradient plots, histograms, and more. Sample data can be supplied to the x and y aesthetics or analytical distributions (in a variety of formats) can be supplied to the xdist and ydist aesthetics. See Details.

Usage

stat_slabinterval(
  mapping = NULL,
  data = NULL,
  geom = "slabinterval",
  position = "identity",
  ...,
  p_limits = c(NA, NA),
  density = "bounded",
  adjust = waiver(),
  trim = waiver(),
  breaks = waiver(),
  align = waiver(),
  outline_bars = waiver(),
  expand = FALSE,
  point_interval = "median_qi",
  limits = NULL,
  n = waiver(),
  .width = c(0.66, 0.95),
  orientation = NA,
  na.rm = FALSE,
  show.legend = c(size = FALSE),
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)

Arguments

mapping

Set of aesthetic mappings created by aes(). If specified and inherit.aes = TRUE (the default), it is combined with the default mapping at the top level of the plot. You must supply mapping if there is no plot mapping.

data

The data to be displayed in this layer. There are three options:

If NULL, the default, the data is inherited from the plot data as specified in the call to ggplot().

A data.frame, or other object, will override the plot data. All objects will be fortified to produce a data frame. See fortify() for which variables will be created.

A function will be called with a single argument, the plot data. The return value must be a data.frame, and will be used as the layer data. A function can be created from a formula (e.g. ~ head(.x, 10)).

geom

<Geom | string> Use to override the default connection between stat_slabinterval() and geom_slabinterval().

position

<Position | string> Position adjustment, either as a string, or the result of a call to a position adjustment function. Setting this equal to "dodge" (position_dodge()) or "dodgejust" (position_dodgejust()) can be useful if you have overlapping geometries.

...

Other arguments passed to layer(). These are often aesthetics, used to set an aesthetic to a fixed value, like colour = "red" or linewidth = 3 (see Aesthetics, below). They may also be parameters to the paired geom/stat. When paired with the default geom, geom_slabinterval(), these include:

subscale

<function | string> Sub-scale used to scale values of the thickness aesthetic within the groups determined by normalize. One of:

  • A function that takes an x argument giving a numeric vector of values to be scaled and then returns a thickness vector representing the scaled values, such as subscale_thickness() or subscale_identity().

  • A string giving the name of such a function when prefixed with "subscale_"; e.g. "thickness" or "identity". The value "thickness" uses the default subscale, which can be modified by setting subscale_thickness; see the documentation for that function.

For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.

normalize

<string> Groups within which to scale values of the thickness aesthetic. One of:

  • "all": normalize so that the maximum height across all data is 1.

  • "panels": normalize within panels so that the maximum height in each panel is 1.

  • "xy": normalize within the x/y axis opposite the orientation of this geom so that the maximum height at each value of the opposite axis is 1.

  • "groups": normalize within values of the opposite axis and within each group so that the maximum height in each group is 1.

  • "none": values are taken as is with no normalization (this should probably only be used with functions whose values are in [0,1], such as CDFs).

For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.

fill_type

<string> What type of fill to use when the fill color or alpha varies within a slab. One of:

  • "segments": breaks up the slab geometry into segments for each unique combination of fill color and alpha value. This approach is supported by all graphics devices and works well for sharp cutoff values, but can give ugly results if a large number of unique fill colors are being used (as in gradients, like in stat_gradientinterval()).

  • "gradient": a grid::linearGradient() is used to create a smooth gradient fill. This works well for large numbers of unique fill colors, but requires R >= 4.1 and is not yet supported on all graphics devices. As of this writing, the png() graphics device with type = "cairo", the svg() device, the pdf() device, and the ragg::agg_png() devices are known to support this option. On R < 4.1, this option will fall back to fill_type = "segments" with a message.

  • "auto": attempts to use fill_type = "gradient" if support for it can be auto-detected. On R >= 4.2, support for gradients can be auto-detected on some graphics devices; if support is not detected, this option will fall back to fill_type = "segments" (in case of a false negative, fill_type = "gradient" can be set explicitly). On R < 4.2, support for gradients cannot be auto-detected, so this will always fall back to fill_type = "segments", in which case you can set fill_type = "gradient" explicitly if you are using a graphics device that supports gradients.

interval_size_domain

<length-2 numeric> Minimum and maximum of the values of the size and linewidth aesthetics that will be translated into actual sizes for intervals drawn according to interval_size_range (see the documentation for that argument).

interval_size_range

<length-2 numeric> This geom scales the raw size aesthetic values when drawing interval and point sizes, as they tend to be too thick when using the default settings of scale_size_continuous(), which give sizes with a range of c(1, 6). The interval_size_domain value indicates the input domain of raw size values (typically this should be equal to the value of the range argument of the scale_size_continuous() function), and interval_size_range indicates the desired output range of the size values (the min and max of the actual sizes used to draw intervals). Most of the time it is not recommended to change the value of this argument, as it may result in strange scaling of legends; this argument is a holdover from earlier versions that did not have size aesthetics targeting the point and interval separately. If you want to adjust the size of the interval or points separately, you can also use the linewidth or point_size aesthetics; see sub-geometry-scales.
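The domain-to-range translation can be illustrated with a little arithmetic, assuming it is a plain linear rescale (the text implies this but does not state it). The helper rescale_interval_size() is hypothetical, and the range c(0.6, 1.4) is an illustrative value rather than a documented default.

```r
# Hypothetical helper (not a ggdist function): linearly map raw size values
# from the input domain onto the output range.
rescale_interval_size <- function(size, domain, range) {
  (size - domain[1]) / (domain[2] - domain[1]) * (range[2] - range[1]) + range[1]
}

interval_size_domain <- c(1, 6)      # matches the scale_size_continuous() range in the text
interval_size_range  <- c(0.6, 1.4)  # illustrative output range (an assumption)

# Raw sizes at the bottom, middle, and top of the domain
scaled <- rescale_interval_size(c(1, 3.5, 6), interval_size_domain, interval_size_range)
scaled
```

Under this assumption the ends of the domain land on the ends of the range, and the midpoint 3.5 lands on the midpoint of the range.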

fatten_point

<scalar numeric> A multiplicative factor used to adjust the size of the point relative to the size of the thickest interval line. If you wish to specify point sizes directly, you can also use the point_size aesthetic and scale_point_size_continuous() or scale_point_size_discrete(); sizes specified with that aesthetic will not be adjusted using fatten_point.

arrow

<arrow | NULL> Type of arrow heads to use on the interval, or NULL for no arrows.
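An arrow specification is typically built with grid::arrow(). The sketch below assumes the value is forwarded through ... to geom_slabinterval() as described above; the arrow length and the data frame are illustrative.

```r
library(ggplot2)
library(ggdist)
library(distributional)

# Illustrative distributions (an assumption for this sketch)
dist_df <- data.frame(
  group = c("a", "b", "c"),
  mean  = c(5, 7, 8),
  sd    = c(1, 1.5, 1)
)

# Arrow heads on both ends of each interval
p <- ggplot(dist_df, aes(y = group, xdist = dist_normal(mean, sd))) +
  stat_slabinterval(
    arrow = grid::arrow(length = grid::unit(2, "mm"), ends = "both")
  )
```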

subguide

<function | string> Sub-guide used to annotate the thickness scale. One of:

  • A function that takes a scale argument giving a ggplot2::Scale object and an orientation argument giving the orientation of the geometry and then returns a grid::grob that will draw the axis annotation, such as subguide_axis() (to draw a traditional axis) or subguide_none() (to draw no annotation). See subguide_axis() for a list of possibilities and examples.

  • A string giving the name of such a function when prefixed with "subguide_"; e.g. "axis" or "none". The values "slab", "dots", and "spike" use the default subguide for their geom families (no subguide), which can be modified by setting subguide_slab, subguide_dots, or subguide_spike; see the documentation for those functions.

p_limits

<length-2 numeric> Probability limits. Used to determine the lower and upper limits of analytical distributions (distributions from samples ignore this parameter and determine their limits based on the limits of the sample and the value of the trim parameter). E.g., if this is c(.001, .999), then a slab is drawn for the distribution from the quantile at p = .001 to the quantile at p = .999. If the lower (respectively upper) limit is NA, then the lower (upper) limit will be the minimum (maximum) of the distribution's support if it is finite, and 0.001 (0.999) if it is not finite. E.g., if p_limits is c(NA, NA), on a gamma distribution the effective value of p_limits would be c(0, .999) since the gamma distribution is defined on (0, Inf); whereas on a normal distribution it would be equivalent to c(.001, .999) since the normal distribution is defined on (-Inf, Inf).

density

<function | string> Density estimator for sample data. One of:

  • A function which takes a numeric vector and returns a list with elements x (giving grid points for the density estimator) and y (the corresponding densities). ggdist provides a family of functions following this format, including density_unbounded() and density_bounded(). This format is also compatible with stats::density().

  • A string giving the suffix of a function name that starts with "density_"; e.g. "bounded" for density_bounded(), "unbounded" for density_unbounded(), or "histogram" for density_histogram(). Defaults to "bounded", i.e. density_bounded(), which estimates the bounds from the data and then uses a bounded density estimator based on the reflection method.
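As a rough illustration of the function format described above, the sketch below wraps stats::density() so that it returns a list with x and y elements. The name density_gaussian_sketch() is hypothetical, and accepting extra arguments through `...` is an assumption about how a stat might call the estimator.

```r
# A custom estimator in the documented format: numeric vector in,
# list with `x` (grid points) and `y` (densities) out.
# Assumption: the caller may pass extra arguments, so `...` absorbs them.
density_gaussian_sketch <- function(x, n = 501, ...) {
  d <- stats::density(x, kernel = "gaussian", n = n)
  list(x = d$x, y = d$y)
}

set.seed(1234)
est <- density_gaussian_sketch(rnorm(200))
str(est)
```

A function of this shape should be usable as the density argument (e.g. density = density_gaussian_sketch), though that usage is not tested here.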

adjust

<scalar numeric | waiver> Passed to density (e.g. density_bounded()): Value to multiply the bandwidth of the density estimator by. Default waiver() defers to the default of the density estimator, which is usually 1.

trim

<scalar logical | waiver> Passed to density (e.g. density_bounded()): Should the density estimate be trimmed to the range of the data? Default waiver() defers to the default of the density estimator, which is usually TRUE.

breaks

<numeric | function | string | waiver> Passed to density (e.g. density_histogram()): Determines the breakpoints defining bins. Default waiver() defers to the default of the density estimator, which is usually "Scott". Similar to (but not exactly the same as) the breaks argument to graphics::hist(). One of:

  • A scalar (length-1) numeric giving the number of bins

  • A numeric vector giving the breakpoints between histogram bins

  • A function taking x and weights and returning either the number of bins or a vector of breakpoints

  • A string giving the suffix of a function that starts with "breaks_". ggdist provides weighted implementations of the "Sturges", "Scott", and "FD" break-finding algorithms from graphics::hist(), as well as breaks_fixed() for manually setting the bin width. See breaks.

For example, breaks = "Sturges" will use the breaks_Sturges() algorithm, breaks = 9 will create 9 bins, and breaks = breaks_fixed(width = 1) will set the bin width to 1.
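To get a feel for how the named algorithms differ, the sketch below uses the unweighted base R counterparts from grDevices rather than ggdist's weighted versions, so the numbers are only indicative. The fixed-width breaks are built by hand and merely resemble what a fixed bin width of 1 would produce.

```r
set.seed(1234)
x <- rnorm(1000)

# Unweighted base R counterparts of the three named algorithms
n_bins <- c(
  Sturges = grDevices::nclass.Sturges(x),
  Scott = grDevices::nclass.scott(x),
  FD = grDevices::nclass.FD(x)
)

# Hand-built breakpoints with a fixed bin width of 1 that cover the data
fixed_breaks <- seq(floor(min(x)), ceiling(max(x)), by = 1)
```

Sturges depends only on the sample size, whereas Scott and FD also depend on the spread of the data, which is why they usually suggest different bin counts.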

align

<scalar numeric | function | string | waiver> Passed to density (e.g. density_histogram()): Determines how to align the breakpoints defining bins. Default waiver() defers to the default of the density estimator, which is usually "none" (performs no alignment). One of:

  • A scalar (length-1) numeric giving an offset that is subtracted from the breaks. The offset must be between 0 and the bin width.

  • A function taking a sorted vector of breaks (bin edges) and returning an offset to subtract from the breaks.

  • A string giving the suffix of a function that starts with "align_" used to determine the alignment, such as align_none(), align_boundary(), or align_center().

For example, align = "none" will provide no alignment, align = align_center(at = 0) will center a bin on 0, and align = align_boundary(at = 0) will align a bin edge on 0.
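The function form of align can be pictured as follows: it receives the sorted bin edges and returns an offset between 0 and the bin width, which is then subtracted from every edge. The rule below, center_offset(), is a hypothetical base R sketch of centering a bin on a value; it is not claimed to be how align_center() is implemented.

```r
bin_width <- 1
breaks <- seq(-2.3, 2.7, by = bin_width)

# Hypothetical offset rule that centers one bin on `at`
center_offset <- function(breaks, at = 0) {
  w <- breaks[2] - breaks[1]
  # distance from the first edge down to the nearest edge of a bin centered on `at`
  (breaks[1] - (at - w / 2)) %% w
}

offset <- center_offset(breaks, at = 0)
aligned <- breaks - offset
```

After subtracting the offset, the edges fall on half-integers, so one bin spans -0.5 to 0.5 and is centered on 0.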

outline_bars

<scalar logical | waiver> Passed to density (e.g. density_histogram()) and also used for discrete analytical distributions (whose slabs are drawn as histograms). Determines if outlines in between the bars are drawn. If waiver() or FALSE (the default), the outline is drawn only along the tops of the bars. If TRUE, outlines in between bars are also drawn (though you may have to set the slab_color or color aesthetic to see the outlines).

expand

<logical> For sample data, should the slab be expanded to the limits of the scale? Default FALSE. Can be a length-two logical vector to control expansion to the lower and upper limit respectively.

point_interval

<function | string> A function from the point_interval() family (e.g., median_qi, mean_qi, mode_hdi, etc), or a string giving the name of a function from that family (e.g., "median_qi", "mean_qi", "mode_hdi", etc; if a string, the caller's environment is searched for the function, followed by the ggdist environment). This function determines the point summary (typically mean, median, or mode) and interval type (quantile interval, qi; highest-density interval, hdi; or highest-density continuous interval, hdci). Output will be converted to the appropriate x- or y-based aesthetics depending on the value of orientation. See the point_interval() family of functions for more information.

limits

<length-2 numeric> Manually-specified limits for the slab, as a vector of length two. These limits are combined with those computed based on p_limits as well as the limits defined by the scales of the plot to determine the limits used to draw the slab functions: these limits specify the maximal limits; i.e., if specified, the limits will not be wider than these (but may be narrower). Use NA to leave a limit alone; e.g. limits = c(0, NA) will ensure that the lower limit does not go below 0, but let the upper limit be determined by either p_limits or the scale settings.

n

<scalar numeric> Number of points at which to evaluate the function that defines the slab. Also passed to density (e.g. density_bounded()). Default waiver() uses the value 501 for analytical distributions and defers to the default of the density estimator for sample-based distributions, which is also usually 501.

.width

<numeric> The .width argument passed to point_interval: a vector of probabilities to use that determine the widths of the resulting intervals. If multiple probabilities are provided, multiple intervals per group are generated, each with a different probability interval (and value of the corresponding .width and level generated variables).
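The effect of passing several probabilities in .width can be sketched with base R alone: one row per probability, each holding the same point summary and a different quantile interval. median_qi_sketch() is a hypothetical stand-in written for illustration; the column names are assumptions and need not match the output of the point_interval() family exactly.

```r
set.seed(1234)
draws <- rnorm(4000, mean = 5, sd = 1)

# Hypothetical stand-in for a median + quantile-interval summary
median_qi_sketch <- function(x, .width = c(0.66, 0.95)) {
  do.call(rbind, lapply(.width, function(w) {
    # equal-tailed interval containing probability w
    probs <- c((1 - w) / 2, 1 - (1 - w) / 2)
    q <- unname(quantile(x, probs))
    data.frame(y = median(x), ymin = q[1], ymax = q[2], .width = w)
  }))
}

summary_df <- median_qi_sketch(draws)
summary_df
```

With two values in .width the result has two rows, and the 95% interval encloses the 66% interval.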

orientation

<string> Whether this geom is drawn horizontally or vertically. One of:

  • NA (default): automatically detect the orientation based on how the aesthetics are assigned. Automatic detection works most of the time.

  • "horizontal" (or "y"): draw horizontally, using the y aesthetic to identify different groups. For each group, uses the x, xmin, xmax, and thickness aesthetics to draw points, intervals, and slabs.

  • "vertical" (or "x"): draw vertically, using the x aesthetic to identify different groups. For each group, uses the y, ymin, ymax, and thickness aesthetics to draw points, intervals, and slabs.

For compatibility with the base ggplot naming scheme for orientation, "x" can be used as an alias for "vertical" and "y" as an alias for "horizontal" (ggdist had an orientation parameter before base ggplot did, hence the discrepancy).

na.rm

<scalar logical> If FALSE, the default, missing values are removed with a warning. If TRUE, missing values are silently removed.

show.legend

<logical> Should this layer be included in the legends? Default is c(size = FALSE), unlike most geoms, to match its common use cases. FALSE hides all legends, TRUE shows all legends, and NA shows only those that are mapped (the default for most geoms). It can also be a named logical vector to finely select the aesthetics to display.

inherit.aes

If FALSE, overrides the default aesthetics, rather than combining with them. This is most useful for helper functions that define both data and aesthetics and shouldn't inherit behaviour from the default plot specification, e.g. borders().

check.aes, check.param

If TRUE, the default, will check that supplied parameters and aesthetics are understood by the geom or stat. Use FALSE to suppress the checks.

Details

A highly configurable stat for generating a variety of plots that combine a "slab" that describes a distribution plus a point summary and any number of intervals. Several "shortcut" stats are provided which combine multiple options to create useful geoms, particularly eye plots (a violin plot of density plus interval), half-eye plots (a density plot plus interval), CCDF bar plots (a complementary CDF plus interval), and gradient plots (a density encoded in color alpha plus interval).

The shortcut stats include:

To visualize sample data, such as a data distribution, samples from a bootstrap distribution, or a Bayesian posterior, you can supply samples to the x or y aesthetic.

To visualize analytical distributions, you can use the xdist or ydist aesthetic. For historical reasons, you can also use dist to specify the distribution, though this is not recommended as it does not work as well with orientation detection. These aesthetics can be used as follows:

Value

A ggplot2::Stat representing a slab or combined slab+interval geometry which can be added to a ggplot() object.

Computed Variables

The following variables are computed by this stat and made available for use in aesthetic specifications (aes()) using the after_stat() function or the after_stat argument of stage():

Aesthetics

The slab+interval stats and geoms have a wide variety of aesthetics that control the appearance of their three sub-geometries: the slab, the point, and the interval.

These stats support the following aesthetics:

In addition, in their default configuration (paired with geom_slabinterval()) the following aesthetics are supported by the underlying geom:

Slab-specific aesthetics

Interval-specific aesthetics

Point-specific aesthetics

Color aesthetics

Line aesthetics

Slab-specific color and line override aesthetics

Interval-specific color and line override aesthetics

Point-specific color and line override aesthetics

Deprecated aesthetics

Other aesthetics (these work as in standard geoms)

See examples of some of these aesthetics in action in vignette("slabinterval"). Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").

See Also

See geom_slabinterval() for more information on the geom these stats use by default and some of the options it has. See vignette("slabinterval") for a variety of examples of use.

Examples

library(dplyr)
library(ggplot2)
library(distributional)
library(ggdist)

theme_set(theme_ggdist())

# EXAMPLES ON SAMPLE DATA

set.seed(1234)
df = data.frame(
  group = c("a", "b", "c", "c", "c"),
  value = rnorm(2500, mean = c(5, 7, 9, 9, 9), sd = c(1, 1.5, 1, 1, 1))
)

# here are vertical eyes:
df %>%
  ggplot(aes(x = group, y = value)) +
  stat_eye()

# note the sample size is not automatically incorporated into the
# area of the densities in case one wishes to plot densities against
# a reference (e.g. a prior distribution).
# But you may wish to account for sample size if using these geoms
# for something other than visualizing posteriors; in which case
# you can use after_stat(pdf*n):
df %>%
  ggplot(aes(x = group, y = value)) +
  stat_eye(aes(thickness = after_stat(pdf*n)))

# EXAMPLES ON ANALYTICAL DISTRIBUTIONS

dist_df = tribble(
  ~group, ~subgroup, ~mean, ~sd,
  "a",          "h",     5,   1,
  "b",          "h",     7,   1.5,
  "c",          "h",     8,   1,
  "c",          "i",     9,   1,
  "c",          "j",     7,   1
)

# Using functions from the distributional package (like dist_normal()) with the
# xdist or ydist aesthetic can lead to more compact/expressive specifications
dist_df %>%
  ggplot(aes(x = group, ydist = dist_normal(mean, sd), fill = subgroup)) +
  stat_eye(position = "dodge")

# using the old character vector + args approach
dist_df %>%
  ggplot(aes(x = group, dist = "norm", arg1 = mean, arg2 = sd, fill = subgroup)) +
  stat_eye(position = "dodge")

# the stat_slabinterval family applies a Jacobian adjustment to densities
# when plotting on transformed scales in order to plot them correctly.
# It determines the Jacobian using symbolic differentiation if possible,
# using stats::D(). If symbolic differentiation fails, it falls back
# to numericDeriv(), which is less reliable; therefore, it is
# advisable to use scale transformation functions that are defined in
# terms of basic math functions so that their derivatives can be
# determined analytically (most of the transformation functions in the
# scales package currently have this property).
# For example, here is a log-Normal distribution plotted on the log
# scale, where it will appear Normal:
data.frame(dist = "lnorm", logmean = log(10), logsd = 2*log(10)) %>%
  ggplot(aes(y = 1, dist = dist, arg1 = logmean, arg2 = logsd)) +
  stat_halfeye() +
  scale_x_log10(breaks = 10^seq(-5,7, by = 2))

# see vignette("slabinterval") for many more examples.
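The Jacobian adjustment mentioned in the last example can be checked by hand in base R. The sketch below is a worked instance of the change-of-variables rule for the log10 scale, using the same log-Normal parameters as that example; it does not reproduce ggdist's own derivative machinery.

```r
meanlog <- log(10)
sdlog <- 2 * log(10)

y <- seq(-3, 5, length.out = 201)   # positions on the log10 axis
x <- 10^y                           # corresponding values on the data scale

# Density of Y = log10(X): f_X(x) times the Jacobian |dx/dy| = x * log(10)
density_on_log10_axis <- dlnorm(x, meanlog, sdlog) * x * log(10)

# The same density written directly as a Normal in log10 units
normal_reference <- dnorm(y, mean = meanlog / log(10), sd = sdlog / log(10))

max(abs(density_on_log10_axis - normal_reference))
```

The two curves agree up to floating-point error, which is why the adjusted log-Normal looks Normal on the log scale; without the Jacobian factor the curve would be skewed.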

Spike plot (ggplot2 stat)

Description

Stat for drawing "spikes" (optionally with points on them) at specific points on a distribution (numerical or determined as a function of the distribution), intended for annotating stat_slabinterval() geometries.

Usage

stat_spike(
  mapping = NULL,
  data = NULL,
  geom = "spike",
  position = "identity",
  ...,
  at = "median",
  p_limits = c(NA, NA),
  density = "bounded",
  adjust = waiver(),
  trim = waiver(),
  breaks = waiver(),
  align = waiver(),
  outline_bars = waiver(),
  expand = FALSE,
  limits = NULL,
  n = waiver(),
  orientation = NA,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)

Arguments

mapping

Set of aesthetic mappings created by aes(). If specified and inherit.aes = TRUE (the default), it is combined with the default mapping at the top level of the plot. You must supply mapping if there is no plot mapping.

data

The data to be displayed in this layer. There are three options:

If NULL, the default, the data is inherited from the plot data as specified in the call to ggplot().

A data.frame, or other object, will override the plot data. All objects will be fortified to produce a data frame. See fortify() for which variables will be created.

A function will be called with a single argument, the plot data. The return value must be a data.frame, and will be used as the layer data. A function can be created from a formula (e.g. ~ head(.x, 10)).

geom

<Geom | string> Use to override the default connection between stat_spike() and geom_spike().

position

<Position | string> Position adjustment, either as a string, or the result of a call to a position adjustment function. Setting this equal to "dodge" (position_dodge()) or "dodgejust" (position_dodgejust()) can be useful if you have overlapping geometries.

...

Other arguments passed to layer(). These are often aesthetics, used to set an aesthetic to a fixed value, like colour = "red" or linewidth = 3 (see Aesthetics, below). They may also be parameters to the paired geom/stat. When paired with the default geom, geom_spike(), these include:

subguide

<function | string> Sub-guide used to annotate the thickness scale. One of:

  • A function that takes a scale argument giving a ggplot2::Scale object and an orientation argument giving the orientation of the geometry and then returns a grid::grob that will draw the axis annotation, such as subguide_axis() (to draw a traditional axis) or subguide_none() (to draw no annotation). See subguide_axis() for a list of possibilities and examples.

  • A string giving the name of such a function when prefixed with "subguide_"; e.g. "axis" or "none". The values "slab", "dots", and "spike" use the default subguide for their geom families (no subguide), which can be modified by setting subguide_slab, subguide_dots, or subguide_spike; see the documentation for those functions.

subscale

<function | string> Sub-scale used to scale values of the thickness aesthetic within the groups determined by normalize. One of:

  • A function that takes an x argument giving a numeric vector of values to be scaled and then returns a thickness vector representing the scaled values, such as subscale_thickness() or subscale_identity().

  • A string giving the name of such a function when prefixed with "subscale_"; e.g. "thickness" or "identity". The value "thickness" uses the default subscale, which can be modified by setting subscale_thickness; see the documentation for that function.

For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.

normalize

<string> Groups within which to scale values of the thickness aesthetic. One of:

  • "all": normalize so that the maximum height across all data is 1.

  • "panels": normalize within panels so that the maximum height in each panel is 1.

  • "xy": normalize within the x/y axis opposite the orientation of this geom so that the maximum height at each value of the opposite axis is 1.

  • "groups": normalize within values of the opposite axis and within each group so that the maximum height in each group is 1.

  • "none": values are taken as is with no normalization (this should probably only be used with functions whose values are in [0,1], such as CDFs).

For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.
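The difference between the normalization groupings comes down to which maximum each value is divided by. The toy data frame and column names below are made up for illustration, and the arithmetic is a base R sketch of the "all" and "groups" options rather than the package's own scaling code.

```r
slab <- data.frame(
  group = rep(c("a", "b"), each = 3),
  f = c(0.1, 0.4, 0.2, 1.0, 2.0, 0.5)
)

# "all": one maximum shared by every slab
slab$thickness_all <- slab$f / max(slab$f)

# "groups": each slab is rescaled by its own maximum
slab$thickness_groups <- ave(slab$f, slab$group, FUN = function(v) v / max(v))

slab
```

Under "all", group a peaks at 0.2 because group b holds the global maximum; under "groups", both slabs peak at 1, which hides their relative heights.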

arrow

<arrow | NULL> Type of arrow heads to use on the spike, or NULL for no arrows.

at

<numeric | function | character | list> The points at which to evaluate the PDF and CDF of the distribution. One of:

  • numeric vector: points to evaluate the PDF and CDF of the distributions at.

  • function or character vector: function (or names of functions) which, when applied on a distribution-like object (e.g. a distributional object or a posterior::rvar()), returns a vector of values to evaluate the distribution functions at.

  • a list where each element is any of the above (e.g. a numeric, function, or name of a function): the evaluation points determined by each element of the list are concatenated together. This means, e.g., c(0, median, qi) would add a spike at 0, the median, and the endpoints of the qi of the distribution.

The values of at are also converted into a character vector which is supplied as a computed variable (also called at) generated by this stat, which can be mapped onto aesthetics using after_stat(). Non-empty names can be used to override the values of the computed variable; e.g. at = c(zero = 0, "median", mode = "Mode") will generate a computed variable with the values c("zero", "median", "mode") that is evaluated at 0, the median, and the mode of the distribution.
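The way a list-valued at is flattened into evaluation points can be sketched on a plain numeric sample. resolve_at() and qi_sketch() below are hypothetical helpers written for this illustration; the stat itself applies the functions to distribution-like objects and also tracks the names, which this sketch omits.

```r
# Hypothetical resolver: turn an `at`-style list into evaluation points
# for a plain numeric sample.
resolve_at <- function(at, x) {
  unlist(lapply(at, function(a) {
    if (is.character(a)) a <- match.fun(a)  # name of a function
    if (is.function(a)) a(x) else a         # function, or literal numbers
  }))
}

set.seed(1234)
x <- rnorm(1000, mean = 1)

# Stand-in for a quantile-interval function returning two endpoints
qi_sketch <- function(x) quantile(x, c(0.025, 0.975), names = FALSE)

spike_at <- resolve_at(list(0, "median", qi_sketch), x)
spike_at
```

The three list elements yield four points in total: the literal 0, the sample median, and the two interval endpoints.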

p_limits

<length-2 numeric> Probability limits. Used to determine the lower and upper limits of analytical distributions (distributions from samples ignore this parameter and determine their limits based on the limits of the sample and the value of the trim parameter). E.g., if this is c(.001, .999), then a slab is drawn for the distribution from the quantile at p = .001 to the quantile at p = .999. If the lower (respectively upper) limit is NA, then the lower (upper) limit will be the minimum (maximum) of the distribution's support if it is finite, and 0.001 (0.999) if it is not finite. E.g., if p_limits is c(NA, NA), on a gamma distribution the effective value of p_limits would be c(0, .999) since the gamma distribution is defined on (0, Inf); whereas on a normal distribution it would be equivalent to c(.001, .999) since the normal distribution is defined on (-Inf, Inf).

density

<function | string> Density estimator for sample data. One of:

  • A function which takes a numeric vector and returns a list with elements x (giving grid points for the density estimator) and y (the corresponding densities). ggdist provides a family of functions following this format, including density_unbounded() and density_bounded(). This format is also compatible with stats::density().

  • A string giving the suffix of a function name that starts with "density_"; e.g. "bounded" for density_bounded(), "unbounded" for density_unbounded(), or "histogram" for density_histogram(). Defaults to "bounded", i.e. density_bounded(), which estimates the bounds from the data and then uses a bounded density estimator based on the reflection method.

adjust

<scalar numeric | waiver> Passed to density (e.g. density_bounded()): Value to multiply the bandwidth of the density estimator by. Default waiver() defers to the default of the density estimator, which is usually 1.

trim

<scalar logical | waiver> Passed to density (e.g. density_bounded()): Should the density estimate be trimmed to the range of the data? Default waiver() defers to the default of the density estimator, which is usually TRUE.

breaks

<numeric | function | string | waiver> Passed to density (e.g. density_histogram()): Determines the breakpoints defining bins. Default waiver() defers to the default of the density estimator, which is usually "Scott". Similar to (but not exactly the same as) the breaks argument to graphics::hist(). One of:

  • A scalar (length-1) numeric giving the number of bins

  • A numeric vector giving the breakpoints between histogram bins

  • A function taking x and weights and returning either the number of bins or a vector of breakpoints

  • A string giving the suffix of a function that starts with "breaks_". ggdist provides weighted implementations of the "Sturges", "Scott", and "FD" break-finding algorithms from graphics::hist(), as well as breaks_fixed() for manually setting the bin width. See breaks.

For example, breaks = "Sturges" will use the breaks_Sturges() algorithm, breaks = 9 will create 9 bins, and breaks = breaks_fixed(width = 1) will set the bin width to 1.

align

<scalar numeric | function | string | waiver> Passed to density (e.g. density_histogram()): Determines how to align the breakpoints defining bins. Default waiver() defers to the default of the density estimator, which is usually "none" (performs no alignment). One of:

  • A scalar (length-1) numeric giving an offset that is subtracted from the breaks. The offset must be between 0 and the bin width.

  • A function taking a sorted vector of breaks (bin edges) and returning an offset to subtract from the breaks.

  • A string giving the suffix of a function that starts with "align_" used to determine the alignment, such as align_none(), align_boundary(), or align_center().

For example, align = "none" will provide no alignment, align = align_center(at = 0) will center a bin on 0, and align = align_boundary(at = 0) will align a bin edge on 0.

outline_bars

<scalar logical | waiver> Passed to density (e.g. density_histogram()) and also used for discrete analytical distributions (whose slabs are drawn as histograms). Determines if outlines in between the bars are drawn. If waiver() or FALSE (the default), the outline is drawn only along the tops of the bars. If TRUE, outlines in between bars are also drawn (though you may have to set the slab_color or color aesthetic to see the outlines).

expand

<logical> For sample data, should the slab be expanded to the limits of the scale? Default FALSE. Can be a length-two logical vector to control expansion to the lower and upper limit respectively.

limits

<length-2 numeric> Manually-specified limits for the slab, as a vector of length two. These limits are combined with those computed based on p_limits as well as the limits defined by the scales of the plot to determine the limits used to draw the slab functions: these limits specify the maximal limits; i.e., if specified, the limits will not be wider than these (but may be narrower). Use NA to leave a limit alone; e.g. limits = c(0, NA) will ensure that the lower limit does not go below 0, but let the upper limit be determined by either p_limits or the scale settings.

n

<scalar numeric> Number of points at which to evaluate the function that defines the slab. Also passed to density (e.g. density_bounded()). Default waiver() uses the value 501 for analytical distributions and defers to the default of the density estimator for sample-based distributions, which is also usually 501.

orientation

<string> Whether this geom is drawn horizontally or vertically. One of:

  • NA (default): automatically detect the orientation based on how the aesthetics are assigned. Automatic detection works most of the time.

  • "horizontal" (or "y"): draw horizontally, using the y aesthetic to identify different groups. For each group, uses the x, xmin, xmax, and thickness aesthetics to draw points, intervals, and slabs.

  • "vertical" (or "x"): draw vertically, using the x aesthetic to identify different groups. For each group, uses the y, ymin, ymax, and thickness aesthetics to draw points, intervals, and slabs.

For compatibility with the base ggplot naming scheme for orientation, "x" can be used as an alias for "vertical" and "y" as an alias for "horizontal" (ggdist had an orientation parameter before base ggplot did, hence the discrepancy).

na.rm

<scalar logical> If FALSE, the default, missing values are removed with a warning. If TRUE, missing values are silently removed.

show.legend

<logical> Should this layer be included in the legends? Default is c(size = FALSE), unlike most geoms, to match its common use cases. FALSE hides all legends, TRUE shows all legends, and NA shows only those that are mapped (the default for most geoms). It can also be a named logical vector to finely select the aesthetics to display.

inherit.aes

If FALSE, overrides the default aesthetics, rather than combining with them. This is most useful for helper functions that define both data and aesthetics and shouldn't inherit behaviour from the default plot specification, e.g. borders().

check.aes, check.param

If TRUE, the default, will check that supplied parameters and aesthetics are understood by the geom or stat. Use FALSE to suppress the checks.

Details

This stat computes slab values (i.e. PDF and CDF values) at specified locations on a distribution, as determined by the at parameter.

To visualize sample data, such as a data distribution, samples from a bootstrap distribution, or a Bayesian posterior, you can supply samples to the x or y aesthetic.

To visualize analytical distributions, you can use the xdist or ydist aesthetic. For historical reasons, you can also use dist to specify the distribution, though this is not recommended as it does not work as well with orientation detection. These aesthetics can be used as follows:

Value

A ggplot2::Stat representing a spike geometry which can be added to a ggplot() object.

Aesthetics

The spike geom has a wide variety of aesthetics that control the appearance of its two sub-geometries: the spike and the point.

These stats support the following aesthetics:

In addition, in their default configuration (paired with geom_spike()) the following aesthetics are supported by the underlying geom:

Spike-specific (aka Slab-specific) aesthetics

Color aesthetics

Line aesthetics

Other aesthetics (these work as in standard geoms)

See examples of some of these aesthetics in action in vignette("slabinterval"). Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").

Computed Variables

The following variables are computed by this stat and made available for use in aesthetic specifications (aes()) using the after_stat() function or the after_stat argument of stage():

See Also

See geom_spike() for the geom underlying this stat. See stat_slabinterval() for the stat this shortcut is based on.

Other slabinterval stats: stat_ccdfinterval(), stat_cdfinterval(), stat_eye(), stat_gradientinterval(), stat_halfeye(), stat_histinterval(), stat_interval(), stat_pointinterval(), stat_slab()

Examples

library(ggplot2)
library(distributional)
library(dplyr)
library(ggdist)

df = tibble(
  d = c(dist_normal(1), dist_gamma(2,2)),
  g = c("a", "b")
)

# annotate the density at the mode of a distribution
df %>%
  ggplot(aes(y = g, xdist = d)) +
  stat_slab(aes(xdist = d)) +
  stat_spike(at = "Mode") +
  # need shared thickness scale so that stat_slab and geom_spike line up
  scale_thickness_shared()

# annotate the endpoints of intervals of a distribution
# here we'll use an arrow instead of a point by setting size = 0
arrow_spec = arrow(angle = 45, type = "closed", length = unit(4, "pt"))
df %>%
  ggplot(aes(y = g, xdist = d)) +
  stat_halfeye(point_interval = mode_hdci) +
  stat_spike(
    at = function(x) hdci(x, .width = .66),
    size = 0, arrow = arrow_spec, color = "blue", linewidth = 0.75
  ) +
  scale_thickness_shared()

# annotate quantiles of a sample
set.seed(1234)
data.frame(x = rnorm(1000, 1:2), g = c("a","b")) %>%
  ggplot(aes(x, g)) +
  stat_slab() +
  stat_spike(at = function(x) quantile(x, ppoints(10))) +
  scale_thickness_shared()

Scaled and shifted Student's t distribution

Description

Density, distribution function, quantile function and random generation for the scaled and shifted Student's t distribution, parameterized by degrees of freedom (df), location (mu), and scale (sigma).

Usage

dstudent_t(x, df, mu = 0, sigma = 1, log = FALSE)

pstudent_t(q, df, mu = 0, sigma = 1, lower.tail = TRUE, log.p = FALSE)

qstudent_t(p, df, mu = 0, sigma = 1, lower.tail = TRUE, log.p = FALSE)

rstudent_t(n, df, mu = 0, sigma = 1)

Arguments

x,q

vector of quantiles.

df

degrees of freedom (> 0, may be non-integer). df = Inf is allowed.

mu

<numeric> Location parameter (median).

sigma

<numeric> Scale parameter.

log,log.p

logical; if TRUE, probabilities p are given as log(p).

lower.tail

logical; if TRUE (default), probabilities are P[X ≤ x], otherwise, P[X > x].

p

vector of probabilities.

n

number of observations. If length(n) > 1, the length is taken to be the number required.

Value

The length of the result is determined by n for rstudent_t, and is the maximum of the lengths of the numerical arguments for the other functions.

The numerical arguments other than n are recycled to the length of the result. Only the first elements of the logical arguments are used.
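The scaled and shifted parameterization follows from the standard location-scale transformation of a Student's t variable. The sketch below re-derives it with base R's dt(), pt(), and qt(); the *_scaled_t names are hypothetical and the functions are expected, but not verified here, to agree with dstudent_t(), pstudent_t(), and qstudent_t().

```r
# Hypothetical re-derivation using only base R's standard t functions:
# if Z ~ t(df), then X = mu + sigma * Z has the scaled and shifted t distribution.
d_scaled_t <- function(x, df, mu = 0, sigma = 1) dt((x - mu) / sigma, df) / sigma
p_scaled_t <- function(q, df, mu = 0, sigma = 1) pt((q - mu) / sigma, df)
q_scaled_t <- function(p, df, mu = 0, sigma = 1) mu + sigma * qt(p, df)

# The density integrates to 1 and the median sits at mu
total_mass <- integrate(d_scaled_t, -Inf, Inf, df = 5, mu = 2, sigma = 1.5)$value
median_value <- q_scaled_t(0.5, df = 5, mu = 2, sigma = 1.5)

# The quantile and distribution functions are inverses of each other
round_trip <- p_scaled_t(q_scaled_t(0.9, df = 5, mu = 2, sigma = 1.5),
                         df = 5, mu = 2, sigma = 1.5)
```

Dividing the density by sigma is the step that is easiest to forget: it keeps the total probability at 1 after stretching the x axis.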

See Also

parse_dist() for parsing distribution specs and the stat_slabinterval() family of stats for visualizing them.

Examples

library(dplyr)
library(ggplot2)
library(ggdist)

expand.grid(
  df = c(3,5,10,30),
  scale = c(1,1.5)
) %>%
  ggplot(aes(y = 0, dist = "student_t", arg1 = df, arg2 = 0, arg3 = scale, color = ordered(df))) +
  stat_slab(p_limits = c(.01, .99), fill = NA) +
  scale_y_continuous(breaks = NULL) +
  facet_grid( ~ scale) +
  labs(
    title = "dstudent_t(x, df, 0, sigma)",
    subtitle = "Scale (sigma)",
    y = NULL,
    x = NULL
  ) +
  theme_ggdist() +
  theme(axis.title = element_text(hjust = 0))

Sub-geometry scales for geom_slabinterval (ggplot2 scales)

Description

These scales allow more specific aesthetic mappings to be made when using geom_slabinterval() and stats/geoms based on it (like eye plots).

Usage

scale_point_colour_discrete(..., aesthetics = "point_colour")

scale_point_color_discrete(..., aesthetics = "point_colour")

scale_point_colour_continuous(
  ...,
  aesthetics = "point_colour",
  guide = guide_colourbar2()
)

scale_point_color_continuous(
  ...,
  aesthetics = "point_colour",
  guide = guide_colourbar2()
)

scale_point_fill_discrete(..., aesthetics = "point_fill")

scale_point_fill_continuous(
  ...,
  aesthetics = "point_fill",
  guide = guide_colourbar2()
)

scale_point_alpha_continuous(..., range = c(0.1, 1))

scale_point_alpha_discrete(..., range = c(0.1, 1))

scale_point_size_continuous(..., range = c(1, 6))

scale_point_size_discrete(..., range = c(1, 6), na.translate = FALSE)

scale_interval_colour_discrete(..., aesthetics = "interval_colour")

scale_interval_color_discrete(..., aesthetics = "interval_colour")

scale_interval_colour_continuous(
  ...,
  aesthetics = "interval_colour",
  guide = guide_colourbar2()
)

scale_interval_color_continuous(
  ...,
  aesthetics = "interval_colour",
  guide = guide_colourbar2()
)

scale_interval_alpha_continuous(..., range = c(0.1, 1))

scale_interval_alpha_discrete(..., range = c(0.1, 1))

scale_interval_size_continuous(..., range = c(1, 6))

scale_interval_size_discrete(..., range = c(1, 6), na.translate = FALSE)

scale_interval_linetype_discrete(..., na.value = "blank")

scale_interval_linetype_continuous(...)

scale_slab_colour_discrete(..., aesthetics = "slab_colour")

scale_slab_color_discrete(..., aesthetics = "slab_colour")

scale_slab_colour_continuous(
  ...,
  aesthetics = "slab_colour",
  guide = guide_colourbar2()
)

scale_slab_color_continuous(
  ...,
  aesthetics = "slab_colour",
  guide = guide_colourbar2()
)

scale_slab_fill_discrete(..., aesthetics = "slab_fill")

scale_slab_fill_continuous(
  ...,
  aesthetics = "slab_fill",
  guide = guide_colourbar2()
)

scale_slab_alpha_continuous(
  ...,
  limits = function(l) c(min(0, l[[1]]), l[[2]]),
  range = c(0, 1)
)

scale_slab_alpha_discrete(..., range = c(0.1, 1))

scale_slab_size_continuous(..., range = c(1, 6))

scale_slab_size_discrete(..., range = c(1, 6), na.translate = FALSE)

scale_slab_linewidth_continuous(..., range = c(1, 6))

scale_slab_linewidth_discrete(..., range = c(1, 6), na.translate = FALSE)

scale_slab_linetype_discrete(..., na.value = "blank")

scale_slab_linetype_continuous(...)

scale_slab_shape_discrete(..., solid = TRUE)

scale_slab_shape_continuous(...)

guide_colourbar2(...)

guide_colorbar2(...)

Arguments

...

Arguments passed to underlying scale or guide functions. E.g. scale_point_color_discrete passes arguments to scale_color_discrete(). See those functions for more details.

aesthetics

<character> Names of aesthetics to set scales for.

guide

<Guide | string> Guide to use for legends for an aesthetic.

range

<length-2 numeric> The minimum and maximum size of the plotting symbol after transformation.

na.translate

<scalar logical> In discrete scales, should we show missing values?

na.value

<linetype> When na.translate is TRUE, what value should be shown?

limits

One of:

  • NULL to use the default scale range

  • A numeric vector of length two providing limits of the scale. Use NA to refer to the existing minimum or maximum

  • A function that accepts the existing (automatic) limits and returns new limits. Also accepts rlang lambda function notation. Note that setting limits on positional scales will remove data outside of the limits. If the purpose is to zoom, use the limit argument in the coordinate system (see coord_cartesian()).

solid

Should the shapes be solid, TRUE, or hollow, FALSE?

Details

The following additional scales / aesthetics are defined for use with geom_slabinterval() and related geoms:

scale_point_color_*

Point color

scale_point_fill_*

Point fill color

scale_point_alpha_*

Point alpha level / opacity

scale_point_size_*

Point size

scale_interval_color_*

Interval line color

scale_interval_alpha_*

Interval alpha level / opacity

scale_interval_linetype_*

Interval line type

scale_slab_color_*

Slab outline color

scale_slab_fill_*

Slab fill color

scale_slab_alpha_*

Slab alpha level / opacity. The default settings of scale_slab_alpha_continuous differ from scale_alpha_continuous() and are designed for gradient plots (e.g. stat_gradientinterval()) by ensuring that densities of 0 get mapped to 0 in the output.

scale_slab_linewidth_*

Slab outline line width

scale_slab_linetype_*

Slab outline line type

scale_slab_shape_*

Slab dot shape (for geom_dotsinterval())
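As a small illustration of the scale_slab_alpha_* entry above: the default limits function shown in the usage of scale_slab_alpha_continuous() extends the lower limit down to 0 whenever the automatic minimum is positive, which is what lets a density of 0 map to an alpha of 0. The sketch below only evaluates that function on two made-up ranges in base R; the name default_limits is hypothetical and not part of ggdist.

```r
# Default `limits` function copied from the usage of
# scale_slab_alpha_continuous(); `default_limits` is a hypothetical name.
default_limits = function(l) c(min(0, l[[1]]), l[[2]])

# Automatic limits that start above 0 are extended down to 0 ...
positive_range = default_limits(c(0.2, 0.9))

# ... while limits that already reach below 0 are left unchanged
negative_range = default_limits(c(-1, 2))

print(positive_range)
print(negative_range)
```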

See the corresponding scale documentation in ggplot for more information; e.g. scale_color_discrete(), scale_color_continuous(), etc.

Other scale functions can be used with the aesthetics/scales defined here by using the aesthetics argument to that scale function. For example, to use color brewer scales with the point_color aesthetic:

scale_color_brewer(..., aesthetics = "point_color")

With continuous color scales, you may also need to provide a guide as the default guide does not work properly; this is what guide_colorbar2 is for:

scale_color_distiller(..., guide = "colorbar2", aesthetics = "point_color")

These scales have been deprecated:

scale_interval_size_*

Use scale_linewidth_*

scale_slab_size_*

Use scale_slab_linewidth_*

Value

A ggplot2::Scale representing one of the aesthetics used to target the appearance of specific parts of composite ggdist geoms. Can be added to a ggplot() object.

Author(s)

Matthew Kay

See Also

Other ggplot2 scales: scale_color_discrete(), scale_color_continuous(), etc.

Other ggdist scales: scale_colour_ramp, scale_side_mirrored(), scale_thickness

Examples

library(dplyr)
library(ggplot2)

# This plot shows how to set multiple specific aesthetics
# NB it is very ugly and is only for demo purposes.
data.frame(distribution = "Normal(1,2)") %>%
  parse_dist(distribution) %>%
  ggplot(aes(y = distribution, xdist = .dist, args = .args)) +
  stat_halfeye(
    shape = 21,  # this point shape has a fill and outline
    point_color = "red",
    point_fill = "black",
    point_alpha = .1,
    point_size = 6,
    stroke = 2,
    interval_color = "blue",
    # interval line widths are scaled from [1, 6] onto [0.6, 1.4] by default
    # see the interval_size_range parameter in help("geom_slabinterval")
    linewidth = 8,
    interval_linetype = "dashed",
    interval_alpha = .25,
    # fill sets the fill color of the slab (here the density)
    slab_color = "green",
    slab_fill = "purple",
    slab_linewidth = 3,
    slab_linetype = "dotted",
    slab_alpha = .5
  )

Axis sub-guide for thickness scales

Description

This is a sub-guide intended for annotating the thickness and dot-count subscales in ggdist. It can be used with the subguide parameter of geom_slabinterval() and geom_dotsinterval().

Supports automatic partial function application with waived arguments.

Usage

subguide_axis(
  values,
  title = NULL,
  breaks = waiver(),
  labels = waiver(),
  position = 0,
  just = 0,
  label_side = "topright",
  orientation = "horizontal",
  theme = theme_get()
)

subguide_inside(..., label_side = "inside")

subguide_outside(..., label_side = "outside", just = 1)

subguide_integer(..., breaks = scales::breaks_extended(Q = c(1, 5, 2, 4, 3)))

subguide_count(..., breaks = scales::breaks_width(1))

subguide_slab(values, ...)

subguide_dots(values, ...)

subguide_spike(values, ...)

Arguments

values

<numeric> Values used to construct the scale used for this guide. Typically provided automatically by geom_slabinterval().

title

<string> The title of the scale shown on the sub-guide's axis.

breaks

One of:

  • NULL for no breaks

  • waiver() for the default breaks computed by the transformation object

  • A numeric vector of positions

  • A function that takes the limits as input and returns breaks as output (e.g., a function returned by scales::extended_breaks()). Note that for position scales, limits are provided after scale expansion. Also accepts rlang lambda function notation.

labels

One of:

  • NULL for no labels

  • waiver() for the default labels computed by the transformation object

  • A character vector giving labels (must be same length as breaks)

  • An expression vector (must be the same length as breaks). See ?plotmath for details.

  • A function that takes the breaks as input and returns labels as output. Also accepts rlang lambda function notation.

position

<scalar numeric> Value between 0 and 1 giving the position of the guide relative to the axis: 0 causes the sub-guide to be drawn on the left or bottom depending on whether orientation is "horizontal" or "vertical", and 1 causes the sub-guide to be drawn on the top or right depending on whether orientation is "horizontal" or "vertical". May also be a string indicating the position: "top", "right", "bottom", "left", "topright", "topleft", "bottomright", or "bottomleft".

just

<scalar numeric> Value between 0 and 1 giving the justification of the guide relative to its position: 0 means aligned towards the inside of the axis edge, 1 means aligned towards the outside of the axis edge.

label_side

<string> Which side of the axis to draw the ticks and labels on. "topright", "top", and "right" are synonyms which cause the labels to be drawn on the top or the right depending on whether orientation is "horizontal" or "vertical". "bottomleft", "bottom", and "left" are synonyms which cause the labels to be drawn on the bottom or the left depending on whether orientation is "horizontal" or "vertical". "topleft" causes the labels to be drawn on the top or the left, and "bottomright" causes the labels to be drawn on the bottom or the right. "inside" causes the labels to be drawn on the side closest to the inside of the chart, depending on position, and "outside" on the side closest to the outside of the chart.

orientation

<string> Orientation of the geometry this sub-guide is for. One of "horizontal" ("y") or "vertical" ("x"). See the orientation parameter to geom_slabinterval().

theme

<theme> Theme used to determine the style that the sub-guide elements are drawn in. The title label is drawn using the "axis.title.x" or "axis.title.y" theme setting, and the axis line, ticks, and tick labels are drawn using guide_axis(), so the same theme settings that normally apply to axis guides will be followed.

...

Arguments passed to other functions, typically back to subguide_axis() itself.

Details

subguide_inside() is a shortcut for drawing labels inside of the chart region.

subguide_outside() is a shortcut for drawing labels outside of the chart region.

subguide_integer() only draws breaks that are integer values, useful for labeling counts in geom_dots().

subguide_count() is a shortcut for drawing labels where every whole number is labeled, useful for labeling counts in geom_dots(). If your max count is large, subguide_integer() may be better.

subguide_slab(), subguide_dots(), and subguide_spike() are aliases for subguide_none() that allow you to change the default subguide used for the geom_slabinterval(), geom_dotsinterval(), and geom_spike() families. If you overwrite these in the global environment, you can set the corresponding default subguide. For example:

subguide_slab = ggdist::subguide_inside(position = "right")

This will cause geom_slabinterval()s to default to having a guide on the right side of the geom.

See Also

The thickness datatype.

The thickness aesthetic of geom_slabinterval().

scale_thickness_shared(), for setting a thickness scale across all geometries using the thickness aesthetic.

subscale_thickness(), for setting a thickness sub-scale within a single geom_slabinterval().

Other sub-guides: subguide_none()

Examples

library(ggplot2)
library(distributional)

df = data.frame(d = dist_normal(2:3, 2:3), g = c("a", "b"))

# subguides allow you to label thickness axes
ggplot(df, aes(xdist = d, y = g)) +
  stat_slabinterval(subguide = "inside")

# they respect normalization and use of scale_thickness_shared()
ggplot(df, aes(xdist = d, y = g)) +
  stat_slabinterval(subguide = "inside", normalize = "groups")

# they can also be positioned outside the plot area, though
# this typically requires manually adjusting plot margins
ggplot(df, aes(xdist = d, y = g)) +
  stat_slabinterval(subguide = subguide_outside(title = "density", position = "right")) +
  theme(plot.margin = margin(5.5, 50, 5.5, 5.5))

# any of the subguide types will also work to indicate bin counts in
# geom_dots(); subguide_integer() and subguide_count() can be useful for
# dotplots as they only label integers / whole numbers:
df = data.frame(d = dist_gamma(2:3, 2:3), g = c("a", "b"))
ggplot(df, aes(xdist = d, y = g)) +
  stat_dots(subguide = subguide_count(label_side = "left", title = "count")) +
  scale_y_discrete(expand = expansion(add = 0.1)) +
  scale_x_continuous(expand = expansion(add = 0.5))

Empty sub-guide for thickness scales

Description

This is a blank sub-guide that omits annotations for the thickness and dot-count sub-scales in ggdist. It can be used with the subguide parameter of geom_slabinterval() and geom_dotsinterval().

Supports automatic partial function application with waived arguments.

Usage

subguide_none(values, ...)

Arguments

values

<numeric> Values used to construct the scale used for this guide. Typically provided automatically by geom_slabinterval().

...

ignored.

See Also

Other sub-guides: subguide_axis()


Identity sub-scale for thickness aesthetic

Description

This is an identity sub-scale for the thickness aesthetic in ggdist. It returns its input as a thickness vector without rescaling. It can be used with the subscale parameter of geom_slabinterval().

Usage

subscale_identity(x)

Arguments

x

<numeric> Vector to be rescaled. Typically provided automatically by geom_slabinterval().

Value

A thickness vector of the same length as x, with infinite values in x squished into the data range.
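A minimal usage sketch, assuming ggdist is installed and that the class name "ggdist_thickness" given in the thickness datatype documentation applies to the result. The input values are made up for illustration and are treated as already being on the thickness scale.

```r
library(ggdist)

# Values already on the thickness scale
# (0 = base of the slab, 1 = its maximum extent)
x = c(0, 0.25, 1)
th = subscale_identity(x)

# The result is a thickness vector with one element per input value
class(th)
length(th)
```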

See Also

Other sub-scales: subscale_thickness()


Sub-scale for thickness aesthetic

Description

This is a sub-scale intended for adjusting the scaling of the thickness aesthetic at a geometry (or sub-geometry) level in ggdist. It can be used with the subscale parameter of geom_slabinterval().

Supports automatic partial function application with waived arguments.

Usage

subscale_thickness(
  x,
  limits = function(l) c(min(0, l[1]), l[2]),
  expand = c(0, 0)
)

Arguments

x

<numeric> Vector to be rescaled. Typically provided automatically by geom_slabinterval().

limits

<length-2 numeric | function | NULL> One of:

  • A numeric vector of length two providing the limits of the scale. Use NA to use the default minimum or maximum.

  • A function that accepts a length-2 numeric vector of the automatic limits and returns new limits. Unlike positional scales, these limits will not remove data.

  • NULL to use the range of the data

expand

<numeric> Vector of limit expansion constants of length 2 or 4, following the same format used by the expand argument of continuous_scale(). The default is not to expand the limits. You can use the convenience function expansion() to generate the expansion values; expanding the lower limit is usually not recommended (because with most thickness scales the lower limit is the baseline and represents 0), so a typical usage might be something like expand = expansion(c(0, 0.05)) to expand the top end of the scale by 5%.

Details

You can overwrite subscale_thickness in the global environment to set the default properties of the thickness subscale. For example:

subscale_thickness = ggdist::subscale_thickness(expand = expansion(c(0, 0.05)))

This will cause geom_slabinterval()s to default to a thickness subscale that expands by 5% at the top of the scale. Always prefix such a definition with ggdist:: to avoid infinite loops caused by recursion.
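To make the effect of limits and expand more concrete, here is a base R sketch of the kind of rescaling described above. It is not ggdist's implementation: rescale_thickness_sketch is a hypothetical helper, it only handles the length-4 form of expand, and it assumes the ggplot2 convention in which expansion(c(0, 0.05)) corresponds to the constants c(0, 0, 0.05, 0) (lower multiplier, lower addition, upper multiplier, upper addition).

```r
# Hypothetical helper (not part of ggdist): rescale `x` onto [0, 1] given a
# limits function and ggplot2-style expansion constants
# c(mult_lower, add_lower, mult_upper, add_upper).
rescale_thickness_sketch = function(
  x,
  limits = function(l) c(min(0, l[1]), l[2]),
  expand = c(0, 0, 0, 0)
) {
  lim = limits(range(x, finite = TRUE))
  width = lim[2] - lim[1]
  # expand the lower and upper limits separately
  lower = lim[1] - expand[1] * width - expand[2]
  upper = lim[2] + expand[3] * width + expand[4]
  (x - lower) / (upper - lower)
}

density = c(0.1, 0.2, 0.4)

# no expansion: 0 is the baseline and the maximum maps to 1
unexpanded = rescale_thickness_sketch(density)

# 5% multiplicative expansion at the top: the maximum now maps to 1 / 1.05,
# leaving some headroom above it
expanded = rescale_thickness_sketch(density, expand = c(0, 0, 0.05, 0))

print(unexpanded)
print(expanded)
```

Because the default limits function includes 0, the baseline stays fixed and only the top of the scale moves when the upper expansion is changed.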

Value

A thickness vector of the same length as x scaled to be between 0 and 1.

See Also

The thickness datatype.

The thickness aesthetic of geom_slabinterval().

scale_thickness_shared(), for setting a thickness scale across all geometries using the thickness aesthetic.

Other sub-scales: subscale_identity()

Examples

library(ggplot2)
library(distributional)

df = data.frame(d = dist_normal(2:3, 1), g = c("a", "b"))

# breaks on thickness subguides are always limited to the bounds of the
# subscale, which may leave labels off near the edge of the subscale
# (e.g. here `0.4` is omitted because the max value is approx `0.39`)
ggplot(df, aes(xdist = d, y = g)) +
  stat_slabinterval(
    subguide = "inside"
  )

# We can use the subscale to expand the upper limit of the thickness scale
# by 5% (similar to the default for positional scales), allowing breaks just
# beyond the data maximum, like `0.4`, to be shown.
ggplot(df, aes(xdist = d, y = g)) +
  stat_slabinterval(
    subguide = "inside",
    subscale = subscale_thickness(expand = expansion(c(0, 0.05)))
  )

Simple, light ggplot2 theme for ggdist and tidybayes

Description

A simple, relatively minimalist ggplot2 theme, and some helper functions to go with it.

Usage

theme_ggdist(
  base_size = 11,
  base_family = "",
  base_line_size = base_size/22,
  base_rect_size = base_size/22
)

theme_tidybayes(
  base_size = 11,
  base_family = "",
  base_line_size = base_size/22,
  base_rect_size = base_size/22
)

facet_title_horizontal()

axis_titles_bottom_left()

facet_title_left_horizontal()

facet_title_right_horizontal()

Arguments

base_size

base font size, given in pts.

base_family

base font family

base_line_size

base size for line elements

base_rect_size

base size for rect elements

Details

This is a relatively minimalist ggplot2 theme, intended to be used for making publication-ready plots. It is currently based on ggplot2::theme_light().

A word of warning: this theme may (and very likely will) change in the future as I tweak it to my taste.

theme_ggdist() and theme_tidybayes() are aliases.
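The helper functions in the usage are not described individually here. The sketch below assumes, based on their names and on how themes compose in ggplot2, that each returns a theme fragment that can be added to a plot after theme_ggdist(). The mtcars data and the faceting variable are arbitrary choices for illustration.

```r
library(ggplot2)
library(ggdist)

# A faceted scatterplot using the theme for this plot only, plus two of
# the helpers (assumed to adjust facet strip and axis title placement)
p = ggplot(mtcars, aes(x = mpg, y = wt)) +
  geom_point() +
  facet_grid(cyl ~ .) +
  theme_ggdist() +
  facet_title_horizontal() +
  axis_titles_bottom_left()

p
```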

Value

A named list in the format of ggplot2::theme()

Author(s)

Matthew Kay

See Also

ggplot2::theme(), ggplot2::theme_set()

Examples

library(ggplot2)

theme_set(theme_ggdist())

Thickness (datatype)

Description

A representation of the thickness of a slab: a scaled value (x) where 0 is the base of the slab and 1 is its maximum extent, and the lower (lower) and upper (upper) limits of the slab values in their original data units.

Usage

thickness(x = double(), lower = NA_real_, upper = NA_real_)

Arguments

x

<coercible-to-numeric> A numeric vector or an object coercible to a numeric (via vctrs::vec_cast()) representing scaled values to be converted to a thickness() object.

lower

<numeric> The original lower bounds of thickness values before scaling. May be NA to indicate that this bound is not known.

upper

<numeric> The original upper bounds of thickness values before scaling. May be NA to indicate that this bound is not known.

Details

This datatype is used by scale_thickness_shared() and subscale_thickness() to represent numeric()-like objects marked as being in units of slab "thickness".

Unlike regular numeric()s, thickness() values mapped onto the thickness aesthetic are not rescaled by scale_thickness_shared() or geom_slabinterval(). In most cases thickness() is not useful directly, though it can be used to mark values that should not be rescaled; see the definitions of stat_ccdfinterval() and stat_gradientinterval() for some example usages.

thickness objects with unequal lower or upper limits may not be combined. However, thickness objects with NA limits may be combined with thickness objects with non-NA limits. This allows (e.g.) specifying locations on the thickness scale that are independent of data limits.
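The combination rules above can be tried out directly. This is a sketch assuming ggdist is installed. The values and limits are made up, and the limits carried by the combined object are deliberately not asserted here.

```r
library(ggdist)

# scaled values with known data limits of [0, 10]
with_limits = thickness(c(0, 1), lower = 0, upper = 10)

# a location on the thickness scale with unknown (NA) limits
no_limits = thickness(0.5)

# NA limits are compatible with non-NA limits, so these can be combined
combined = c(with_limits, no_limits)
length(combined)

# unequal limits are not compatible, so per the rule above this
# combination is expected to signal an error
other_limits = thickness(1, lower = 0, upper = 20)
result = tryCatch(c(with_limits, other_limits), error = function(e) e)
inherits(result, "error")
```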

Value

A vctrs::rcrd of class "ggdist_thickness" with fields "x", "lower", and "upper".

Author(s)

Matthew Kay

See Also

The thickness aesthetic of geom_slabinterval().

scale_thickness_shared(), for setting a thickness scale across all geometries using the thickness aesthetic.

subscale_thickness(), for setting a thickness sub-scale within a single geom_slabinterval().

Examples

thickness(0:1)
thickness(0:1, 0, 10)

Translate between different tidy data frame formats for draws from distributions

Description

These functions translate ggdist/tidybayes-style data frames to/from different data frame formats (each format using a different naming scheme for its columns).

Usage

to_broom_names(data)

from_broom_names(data)

to_ggmcmc_names(data)

from_ggmcmc_names(data)

Arguments

data

<data.frame> A data frame to translate.

Details

Functions prefixed with to_ translate from the ggdist/tidybayes format to another format; functions prefixed with from_ translate from that format back to the ggdist/tidybayes format. Formats include:

to_broom_names() / from_broom_names():

to_ggmcmc_names() / from_ggmcmc_names():

Value

A data frame with (possibly) new names in some columns, according to the translation scheme described in Details.

Author(s)

Matthew Kay

Examples

library(dplyr)

data(RankCorr_u_tau, package = "ggdist")

df = RankCorr_u_tau %>%
  dplyr::rename(.variable = i, .value = u_tau) %>%
  group_by(.variable) %>%
  median_qi(.value)

df

df %>%
  to_broom_names()

A waived argument

Description

A flag indicating that the default value of an argument should be used.

Usage

waiver()

Details

A waiver() is a flag passed to a function argument that indicates the function should use the default value of that argument. It is used in two cases:

See Also

auto_partial(), ggplot2::waiver()

Examples

f = auto_partial(function(x, y = "b") {
  c(x = x, y = y)
})

f("a")

# uses the default value of `y` ("b")
f("a", y = waiver())

# partially apply `f`
g = f(y = "c")
g

# uses the last partially-applied value of `y` ("c")
g("a", y = waiver())

Weighted empirical cumulative distribution function

Description

A variation of ecdf() that can be applied to weighted samples.

Usage

weighted_ecdf(x, weights = NULL, na.rm = FALSE)

Arguments

x

<numeric> Sample values.

weights

<numeric | NULL> Weights for the sample. One of:

  • numeric vector of same length as x: weights for corresponding values in x, which will be normalized to sum to 1.

  • NULL: indicates no weights are provided, so the unweighted empirical cumulative distribution function (equivalent to ecdf()) is returned.

na.rm

<scalar logical> If TRUE, corresponding entries in x and weights are removed if either is NA.

Details

Generates a weighted empirical cumulative distribution function, F(x). Given x, a sorted vector (derived from x), and w_i, the corresponding weight for x_i, F(x) is a step function with steps at each x_i with F(x_i) equal to the sum of all weights up to and including w_i.
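The definition above can be sketched in a few lines of base R. This is an illustration of the definition only, not ggdist's implementation: weighted_ecdf_sketch is a hypothetical helper, and it assumes x has no duplicate or missing values.

```r
# Hypothetical helper illustrating the definition of a weighted ECDF.
# Assumes `x` has no duplicates or NAs.
weighted_ecdf_sketch = function(x, weights = rep(1, length(x))) {
  ord = order(x)
  x_sorted = x[ord]
  # normalize the weights to sum to 1, in the same order as the sorted x
  w = weights[ord] / sum(weights)
  # F(x_i) is the sum of all weights up to and including w_i;
  # the function is 0 to the left of the smallest x
  stepfun(x_sorted, c(0, cumsum(w)))
}

Fw = weighted_ecdf_sketch(c(3, 1, 2), weights = c(3, 1, 2))

# evaluate below the sample, at a step, between steps, and at the maximum
Fw(c(0.5, 1, 2.5, 3))
```

With equal weights the sketch reduces to the ordinary empirical CDF, which is a convenient sanity check against ecdf().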

Value

weighted_ecdf() returns a function of class "weighted_ecdf", which also inherits from the stepfun() class. Thus, it also has plot() and print() methods. Like ecdf(), weighted_ecdf() also provides a quantile() method, which dispatches to weighted_quantile().

See Also

weighted_quantile()

Examples

weighted_ecdf(1:3, weights = 1:3)
plot(weighted_ecdf(1:3, weights = 1:3))
quantile(weighted_ecdf(1:3, weights = 1:3), 0.4)

Weighted sample quantiles

Description

A variation of quantile() that can be applied to weighted samples.

Usage

weighted_quantile(
  x,
  probs = seq(0, 1, 0.25),
  weights = NULL,
  n = NULL,
  na.rm = FALSE,
  names = TRUE,
  type = 7,
  digits = 7
)

weighted_quantile_fun(x, weights = NULL, n = NULL, na.rm = FALSE, type = 7)

Arguments

x

<numeric> Sample values.

probs

<numeric> Vector of probabilities in [0, 1] defining the quantiles to return.

weights

<numeric | NULL> Weights for the sample. One of:

  • numeric vector of same length as x: weights for corresponding values in x, which will be normalized to sum to 1.

  • NULL: indicates no weights are provided, so unweighted sample quantiles (equivalent to quantile()) are returned.

n

<scalar numeric> Presumed effective sample size. If this is greater than 1 and continuous quantiles (type >= 4) are requested, flat regions may be added to the approximation to the inverse CDF in areas where the normalized weight exceeds 1/n (i.e., regions of high density). This can be used to ensure that if a sample of size n with duplicate x values is summarized into a weighted sample without duplicates, the result of weighted_quantile(..., n = n) on the weighted sample is equal to the result of quantile() on the original sample. One of:

  • NULL: do not make a sample size adjustment.

  • numeric: presumed effective sample size.

  • function or name of function (as a string): A function applied to weights (prior to normalization) to determine the sample size. Some useful values may be:

    • "length": i.e. use the number of elements in weights (equivalently in x) as the effective sample size.

    • "sum": i.e. use the sum of the unnormalized weights as the sample size. Useful if the provided weights is unnormalized so that its sum represents the true sample size.

na.rm

<scalar logical> If TRUE, corresponding entries in x and weights are removed if either is NA.

names

<scalar logical> If TRUE, add names to the output giving the input probs formatted as a percentage.

type

<scalar integer> Value between 1 and 9: determines the type of quantile estimator to be used. Types 1 to 3 are for discontinuous quantiles, types 4 to 9 are for continuous quantiles. See Details.

digits

<scalar numeric> The number of digits to use to format percentages when names is TRUE.

Details

Calculates weighted quantiles using a variation of the quantile types based on a generalization of quantile().

Type 1–3 (discontinuous) quantiles are directly a function of the inverse CDF as a step function, and so can be directly translated to the weighted case using the natural definition of the weighted ECDF as the cumulative sum of the normalized weights.
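As an illustration of the discontinuous case, the following base R sketch computes a type 1 style weighted quantile by inverting the step ECDF: for each probability p it returns the smallest x_k whose cumulative normalized weight is at least p. The helper name is hypothetical and this is not ggdist's implementation; in particular it omits the floating-point fuzz that quantile() applies when p falls exactly on a step.

```r
# Hypothetical helper: type 1 weighted quantile via the inverse step ECDF.
# Assumes no NAs and strictly positive weights.
weighted_quantile_type1_sketch = function(x, probs, weights = rep(1, length(x))) {
  ord = order(x)
  x_sorted = as.numeric(x[ord])
  # weighted ECDF at each sorted value: cumulative sum of normalized weights
  F_k = cumsum(weights[ord]) / sum(weights)
  # for each probability, take the smallest x_k with F(x_k) >= p
  vapply(probs, function(p) x_sorted[which(F_k >= p)[1]], numeric(1))
}

x = c(5, 1, 3, 2, 4)
probs = c(0.1, 0.5, 0.9)

# with equal weights this agrees with quantile(type = 1) for these probs
unweighted = weighted_quantile_type1_sketch(x, probs)
reference = unname(quantile(x, probs, type = 1))

# with weights 1:3 on values 1:3, the ECDF steps are 1/6, 1/2, 1
weighted = weighted_quantile_type1_sketch(1:3, 0.4, weights = 1:3)

print(unweighted)
print(reference)
print(weighted)
```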

Type 4–9 (continuous) quantiles require some translation from the definitions in quantile(). quantile() defines continuous estimators in terms of x_k, which is the kth order statistic, and p_k, which is a function of k and n (the sample size). In the weighted case, we instead take x_k as the kth smallest value of x in the weighted sample (not necessarily an order statistic, because of the weights). Then we can re-write the formulas for p_k in terms of F(x_k) (the empirical CDF at x_k, i.e. the cumulative sum of normalized weights) and f(x_k) (the normalized weight at x_k), by using the fact that, in the unweighted case, k = F(x_k) \cdot n and 1/n = f(x_k):

Type 4

p_k = \frac{k}{n} = F(x_k)

Type 5

p_k = \frac{k - 0.5}{n} = F(x_k) - \frac{f(x_k)}{2}

Type 6

p_k = \frac{k}{n + 1} = \frac{F(x_k)}{1 + f(x_k)}

Type 7

p_k = \frac{k - 1}{n - 1} = \frac{F(x_k) - f(x_k)}{1 - f(x_k)}

Type 8

p_k = \frac{k - 1/3}{n + 1/3} = \frac{F(x_k) - f(x_k)/3}{1 + f(x_k)/3}

Type 9

p_k = \frac{k - 3/8}{n + 1/4} = \frac{F(x_k) - f(x_k) \cdot 3/8}{1 + f(x_k)/4}

Then the quantile function (inverse CDF) is the piece-wise linear function defined by the points (p_k, x_k).
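The Type 7 formula and the piece-wise linear interpolation can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not ggdist's implementation: the helper name is hypothetical, it ignores the n adjustment, and it assumes at least two distinct x values, no NAs, and strictly positive weights.

```r
# Hypothetical helper: type 7 weighted quantile using
# p_k = (F(x_k) - f(x_k)) / (1 - f(x_k)) and linear interpolation.
weighted_quantile_type7_sketch = function(x, probs, weights = rep(1, length(x))) {
  ord = order(x)
  x_k = as.numeric(x[ord])
  f_k = weights[ord] / sum(weights)  # normalized weight at x_k
  F_k = cumsum(f_k)                  # weighted ECDF at x_k
  p_k = (F_k - f_k) / (1 - f_k)      # Type 7 formula
  # piece-wise linear function through the points (p_k, x_k); rule = 2
  # clamps probabilities outside range(p_k) to the smallest / largest x
  approx(p_k, x_k, xout = probs, rule = 2)$y
}

x = c(5, 1, 3, 2, 10)
probs = c(0.1, 0.5, 0.9)

# with equal weights, p_k = (k - 1) / (n - 1), so this matches quantile()
unweighted = weighted_quantile_type7_sketch(x, probs)
reference = unname(quantile(x, probs, type = 7))

# with weights 1:3 on values 1:3, p_k = (0, 0.25, 1)
weighted = weighted_quantile_type7_sketch(1:3, 0.4, weights = 1:3)

print(unweighted)
print(reference)
print(weighted)
```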

Value

weighted_quantile() returns a numeric vector of length(probs) with the estimate of the corresponding quantile from probs.

weighted_quantile_fun() returns a function that takes a single argument, a vector of probabilities, which itself returns the corresponding quantile estimates. It may be useful when weighted_quantile() needs to be called repeatedly for the same sample, re-using some pre-computation.

See Also

weighted_ecdf()

