distionary

License: MIT | R-CMD-check | Codecov test coverage | Lifecycle: stable | Project Status: Active (the project has reached a stable, usable state and is being actively developed) | Status at rOpenSci Software Peer Review

With distionary, you can:

  1. Specify a probability distribution (built-in or your own), and
  2. Evaluate the probability distribution.

The main purpose of distionary is to implement a distribution object, and to make distribution calculations available even if they are not specified in the distribution’s definition. distionary provides the building blocks of the wider probaverse ecosystem for making representative statistical models.

The name “distionary” is a portmanteau of “distribution” and “dictionary”. While a dictionary lists and defines words, distionary defines distributions and makes a list of common distribution families available. The built-in distributions act as building blocks for the wider probaverse.

Statement of Need

When building statistical models, distributions should accurately reflect your data, but out-of-the-box options like the Normal or Poisson distributions often fall short. Achieving realistic probability distributions demands a versatile workbench where distributions can be manipulated, and data can inform their features. This is the goal of the probaverse ecosystem, with distionary providing the foundational building blocks.

distionary provides the fundamental probaverse infrastructure for defining probability distribution objects. It allows for the evaluation of distribution properties, even if they aren’t explicitly specified, offering standalone utility for users needing to define a distribution in various forms and evaluate it comprehensively.

Target Audience

Lots of people work with probability distributions. Lots of people should work with probability distributions but don’t, either because they don’t see the value or because distributions are too clumsy to work with under existing infrastructure. And there are lots of people learning about probability distributions who would have an easier time if they got to “feel” distributions and their multifaceted nature. distionary is for all of these people.

distionary, and the probaverse more widely, is designed for data scientists, statisticians, and researchers who require the flexibility to develop custom statistical models. It caters to those in finance, insurance, environmental science, and engineering, where nuanced distribution modeling is crucial. Whether building complex stochastic models or performing detailed risk assessments, distionary equips users with the tools needed to explore and manipulate probability distributions effectively.

distionary makes reference to common terms regarding probability distributions. If you’re uneasy with these terms and concepts, most intro books in probability will be a good resource to learn from. As distionary develops, more documentation will be made available so that it’s more self-contained.

Installation

To install distionary, run the following code in R:

install.packages("distionary")

Example: Built-in Distributions

library(distionary)

Specify a distribution like a Poisson distribution and a Generalised Extreme Value (GEV) distribution using the dst_*() family of functions.

# Create a Poisson distribution
poisson <- dst_pois(1.5)
# Inspect
poisson
#> Poisson distribution (discrete)
#> --Parameters--
#> lambda
#>    1.5

# Create a GEV distribution
gev <- dst_gev(-1, 1, 0.2)
# Inspect
gev
#> Generalised Extreme Value distribution (continuous)
#> --Parameters--
#> location    scale    shape
#>     -1.0      1.0      0.2

Here is what the distributions look like, via their probability mass (PMF) and density functions.

plot(poisson)

plot(gev)

Evaluate various distributional properties such as mean, skewness, and range of valid values.

mean(gev)
#> [1] -0.1788514
skewness(poisson)
#> [1] 0.8164966
range(gev)
#> [1]  -6 Inf
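
As a cross-check, the three values above can be reproduced from the standard closed-form expressions for the Poisson and GEV families. This is a minimal base R sketch that does not use distionary; the variable names are illustrative, and it is not meant to describe how the package computes these properties internally.

```r
# Base R only: closed-form checks of the properties evaluated above.
lambda <- 1.5
location <- -1
scale <- 1
shape <- 0.2

# Poisson skewness is 1 / sqrt(lambda).
pois_skewness <- 1 / sqrt(lambda)

# GEV mean, valid when shape < 1 and shape != 0.
gev_mean <- location + scale * (gamma(1 - shape) - 1) / shape

# GEV support when shape > 0: bounded below, unbounded above.
gev_range <- c(location - scale / shape, Inf)

print(pois_skewness)
print(gev_mean)
print(gev_range)
```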

Properties that completely define the distribution, such as the PMF or quantiles, are called distributional representations and can be accessed by the eval_*() functions. The eval_*() functions simply evaluate the representation, whereas the enframe_*() functions place the output alongside the input in a data frame or tibble.

eval_pmf(poisson, at = 0:4)
#> [1] 0.22313016 0.33469524 0.25102143 0.12551072 0.04706652
enframe_quantile(gev, at = c(0.2, 0.5, 0.9))
#> # A tibble: 3 × 2
#>    .arg quantile
#>   <dbl>    <dbl>
#> 1   0.2   -1.45
#> 2   0.5   -0.620
#> 3   0.9    1.84
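
To make the difference between the two output shapes concrete, here is a hedged base R sketch that mimics them without distionary: a bare vector for the "eval" style, and a data frame pairing inputs with outputs for the "enframe" style. The helper gev_quantile() is a hypothetical name written for this illustration (it is the textbook GEV quantile formula for a nonzero shape), not a function from the package.

```r
# Base R only; mirrors the two calls above without using distionary.

# "eval" style: a bare vector of PMF values.
at_counts <- 0:4
pmf_values <- dpois(at_counts, lambda = 1.5)

# Hypothetical helper: GEV quantile function for shape != 0.
gev_quantile <- function(p, location, scale, shape) {
  location + scale * ((-log(p))^(-shape) - 1) / shape
}

# "enframe" style: inputs and outputs side by side in a data frame.
at_probs <- c(0.2, 0.5, 0.9)
quantile_table <- data.frame(
  .arg = at_probs,
  quantile = gev_quantile(at_probs, location = -1, scale = 1, shape = 0.2)
)

print(pmf_values)
print(quantile_table)
```

The values agree with the package output shown above up to the printed precision.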

Example: Custom Distributions

You can create a custom distribution using distribution(). The innovative aspect of distionary is its ability to automatically compute properties from the specified representations. By providing just one or two representations (such as CDF and density), distionary can derive other properties as needed.

# Make a distribution by specifying only density and CDF
linear <- distribution(
  density = function(x) {
    d <- 2 * (1 - x)
    d[x < 0 | x > 1] <- 0
    d
  },
  cdf = function(x) {
    p <- 2 * x * (1 - x / 2)
    p[x < 0] <- 0
    p[x > 1] <- 1
    p
  },
  .vtype = "continuous",
  .name = "My Linear"
)
# Inspect
linear
#> My Linear distribution (continuous)
#> --Parameters--
#> NULL

Here is what it looks like (density function).

plot(linear)

Even though only the density and CDF were specified, other properties can be evaluated, like its mean and quantiles:

mean(linear)
#> [1] 0.3333333
enframe_quantile(linear, at = c(0.2, 0.5, 0.9))
#> # A tibble: 3 × 2
#>    .arg quantile
#>   <dbl>    <dbl>
#> 1   0.2    0.106
#> 2   0.5    0.293
#> 3   0.9    0.684
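
One way such properties can be recovered from a density and a CDF alone is by numerical integration and root finding. The base R sketch below illustrates that idea for the same "My Linear" functions; it is an assumption about the general technique, not a description of distionary's internals, and the names linear_density, linear_cdf, and linear_quantile are made up for this example.

```r
# Same density and CDF as the "My Linear" example, as plain functions.
linear_density <- function(x) {
  d <- 2 * (1 - x)
  d[x < 0 | x > 1] <- 0
  d
}
linear_cdf <- function(x) {
  p <- 2 * x * (1 - x / 2)
  p[x < 0] <- 0
  p[x > 1] <- 1
  p
}

# Mean: integrate x * f(x) over the support [0, 1].
linear_mean <- integrate(
  function(x) x * linear_density(x),
  lower = 0, upper = 1
)$value

# Quantile: solve cdf(x) = p for x by root finding on the support.
linear_quantile <- function(p) {
  vapply(p, function(p_i) {
    uniroot(
      function(x) linear_cdf(x) - p_i,
      lower = 0, upper = 1, tol = 1e-10
    )$root
  }, numeric(1))
}
linear_quantiles <- linear_quantile(c(0.2, 0.5, 0.9))

print(linear_mean)
print(linear_quantiles)
```

For this distribution the quantile also has the closed form 1 - sqrt(1 - p), which matches both the numerical roots and the package output above.
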

distionary in the Context of Other Packages

The R ecosystem offers several packages for working with probability distributions, each with unique strengths.

In this landscape, distionary addresses the need for a cohesive and flexible API that can seamlessly integrate the strengths of these packages. It provides a unified framework for defining, manipulating, and evaluating probability distributions. Because distionary only needs some distribution properties to be specified, it offers a level of flexibility central to the probaverse ecosystem.

Acknowledgements

The creation of distionary would not have been possible without the support of BGC Engineering Inc., the R Consortium, the Politecnico di Milano, the European Space Agency, The University of British Columbia, and the Natural Sciences and Engineering Research Council of Canada (NSERC). The authors would also like to thank the reviewers from rOpenSci for their insightful feedback, which greatly contributed to enhancing the quality of this R package.

Citation

To cite package distionary in publications use:

Coia V (2025). distionary: Create and Evaluate Probability Distributions. R package version 0.1.0, https://github.com/probaverse/distionary, https://distionary.probaverse.com/.

Code of Conduct

Please note that the distionary project is released with a Code of Conduct. By contributing to this project, you agree to abide by its terms.
