
The goal of rando is to make generating random numbers easier, in a manner that is context-aware and reproducible.

Installation

Example

With rando, generating random numbers becomes easy, because we do not need to specify how many random numbers we need. rando works out how many are required based on where the number generator is being used.

library(rando)
library(dplyr) # provides tibble(), mutate(), summarise() and %>%

df <- tibble(id = 1:10, x = r_norm())
df
#> # A tibble: 10 x 2
#>       id      x
#>    <int>  <dbl>
#>  1     1 -0.365
#>  2     2  0.173
#>  3     3 -0.294
#>  4     4  0.576
#>  5     5  0.875
#>  6     6  0.359
#>  7     7 -0.527
#>  8     8 -0.819
#>  9     9 -0.990
#> 10    10  0.518

mutate(df, y = r_unif())
#> # A tibble: 10 x 3
#>       id      x      y
#>    <int>  <dbl>  <dbl>
#>  1     1 -0.365 0.210
#>  2     2  0.173 0.354
#>  3     3 -0.294 0.317
#>  4     4  0.576 0.0695
#>  5     5  0.875 0.125
#>  6     6  0.359 0.169
#>  7     7 -0.527 0.305
#>  8     8 -0.819 0.601
#>  9     9 -0.990 0.483
#> 10    10  0.518 0.300

Parameters can also be used to define the number of values to return. If a parameter has length greater than 1, rando will return that many random values. If two parameters have inconsistent lengths, it throws an error instead.

r_norm(mean = 1:10)
#>  [1] 0.4088105 2.2987041 2.2807546 3.9659070 4.5111552 5.4712253 6.5461452
#>  [8] 6.3708207 7.7550056 8.7627581

r_norm(mean = 1:10, sd = 1:2)
#> Error: Inconsistent parameter lengths supplied to r_norm()
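The length rule described above can be illustrated in base R. This is only a minimal sketch under stated assumptions, not rando's actual implementation: `resolve_n()` and `rnorm_auto()` are hypothetical names, and the fallback to a single value when every parameter has length 1 is an assumption made for the example.

```r
# Hypothetical helper (not part of rando): infer how many values to draw
# from the lengths of the supplied parameters.
resolve_n <- function(...) {
  lens <- lengths(list(...))
  # Parameters of length 1 are recycled, so only longer ones constrain n.
  long <- unique(lens[lens > 1])
  if (length(long) > 1) {
    stop("Inconsistent parameter lengths supplied")
  }
  # Assumption for this sketch: fall back to one value if nothing is longer.
  if (length(long) == 1) long else 1L
}

# Hypothetical wrapper around stats::rnorm() that uses the inferred length.
rnorm_auto <- function(mean = 0, sd = 1) {
  n <- resolve_n(mean, sd)
  rnorm(n, mean = mean, sd = sd)
}

length(rnorm_auto(mean = 1:10))        # one value per supplied mean
try(rnorm_auto(mean = 1:10, sd = 1:2)) # lengths 10 and 2 clash, so this errors
```

Treating length-1 parameters as "recyclable" keeps the common case (a vector of means with a single standard deviation) working while still rejecting ambiguous combinations.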

If you want to manually define the number of random numbers to be generated, there are two ways to do it. The old-fashioned way is to provide the n argument (this must be named):

r_unif(n = 20)
#>  [1] 0.75427791 0.97153547 0.06031924 0.43098427 0.45223070 0.54105261
#>  [7] 0.13882213 0.86252549 0.31421104 0.97247948 0.29288323 0.03809931
#> [13] 0.55187415 0.51237188 0.45841500 0.12699633 0.15236584 0.08755528
#> [19] 0.78088410 0.83223010

Or, if we are generating many random numbers, we can set a default n value to be used globally:

set_n(15)

r_norm(mean = 3)
#>  [1] 4.001347 2.561471 3.474956 2.312623 2.508933 5.044508 2.586922 3.051763
#>  [9] 1.205965 3.220328 3.575350 4.599801 2.599194 4.300862 2.722302

r_binom(size = 3)
#>  [1] 1 2 0 1 3 0 1 2 1 1 3 0 2 2 0

Safer and replicable

The rando functions also check whether the supplied parameters are viable and throw an informative error if not. This differs from the default stats random number generating functions, which may return a lot of NaN values with only a vague warning.

rnorm(n = 10, sd = -1)
#> Warning in rnorm(n = 10, sd = -1): NAs produced
#>  [1] NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN

r_norm(sd = -1)
#> Error: sd provided to r_norm() must be strictly positive
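To make the difference concrete, here is a rough base R sketch of the fail-fast idea. `checked_rnorm()` is a hypothetical wrapper written for illustration; it is not how rando performs its checks, and the error message is only modelled on the one shown above.

```r
# Hypothetical wrapper (not rando's implementation): validate sd before
# drawing, so invalid input stops with an error rather than yielding NaN.
checked_rnorm <- function(n, mean = 0, sd = 1) {
  if (any(is.na(sd)) || any(sd <= 0)) {
    stop("sd provided to checked_rnorm() must be strictly positive")
  }
  rnorm(n, mean = mean, sd = sd)
}

checked_rnorm(3, sd = 2)       # three draws, as usual
try(checked_rnorm(3, sd = -1)) # stops with an error instead of returning NaN
```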

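All rando functions can also take a .seed argument. As the examples below show, a numeric .seed is used as the random seed for that call, while .seed = TRUE makes rando generate a seed itself.

If .seed is not NULL (the default), then this seed value (supplied or generated) will be attached to the output, and can be extracted with pull_seed().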

r_norm(.seed = 42)
#>  [1]  1.37095845 -0.56469817  0.36312841  0.63286260  0.40426832 -0.10612452
#>  [7]  1.51152200 -0.09465904  2.01842371 -0.06271410  1.30486965  2.28664539
#> [13] -1.38886070 -0.27878877 -0.13332134
#> attr(,"seed")
#> [1] 42

r_norm(.seed = 42)
#>  [1]  1.37095845 -0.56469817  0.36312841  0.63286260  0.40426832 -0.10612452
#>  [7]  1.51152200 -0.09465904  2.01842371 -0.06271410  1.30486965  2.28664539
#> [13] -1.38886070 -0.27878877 -0.13332134
#> attr(,"seed")
#> [1] 42

x <- r_norm(.seed = TRUE)
x
#>  [1] -1.0515017  2.8143380  1.1880200 -1.2010801 -1.1589546 -0.1876997
#>  [7] -0.1515049  0.7168907 -0.2086623 -1.0248107  0.7394365 -0.5944315
#> [13] -1.9588881  0.5869532  0.6124257
#> attr(,"seed")
#> [1] 1020465408

r_norm(.seed = pull_seed(x))
#>  [1] -1.0515017  2.8143380  1.1880200 -1.2010801 -1.1589546 -0.1876997
#>  [7] -0.1515049  0.7168907 -0.2086623 -1.0248107  0.7394365 -0.5944315
#> [13] -1.9588881  0.5869532  0.6124257
#> attr(,"seed")
#> [1] 1020465408
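The idea of carrying the seed along with the output can be sketched in base R as follows. This is an illustration under assumptions, not rando's implementation: `seeded_rnorm()` is a hypothetical helper, and unlike rando it always generates a seed when none is supplied.

```r
# Hypothetical helper (not rando's implementation): draw values under a
# known seed and record that seed on the result.
# Note: set.seed() changes the global RNG state as a side effect.
seeded_rnorm <- function(n, seed = NULL) {
  if (is.null(seed)) {
    # No seed supplied: generate one so the draw can still be replayed later.
    seed <- sample.int(.Machine$integer.max, 1)
  }
  set.seed(seed)
  x <- rnorm(n)
  attr(x, "seed") <- seed
  x
}

x <- seeded_rnorm(5)
y <- seeded_rnorm(5, seed = attr(x, "seed"))
identical(x, y) # TRUE: the stored seed reproduces the same draw
```

Storing the seed as an attribute means the values and the information needed to regenerate them travel together, which is what makes a single result reproducible after the fact.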

Blueprints

To make simulations easier, rando provides the blueprint() function. This function creates a plan for a simulated dataset using rando functions.

make_tbl <- blueprint(x = r_norm(), y = r_norm())

make_tbl(n = 2)
#> # A tibble: 2 x 2
#>       x     y
#>   <dbl> <dbl>
#> 1 -1.89 1.34
#> 2 -2.28 0.913

make_tbl(n = 5)
#> # A tibble: 5 x 2
#>        x      y
#>    <dbl>  <dbl>
#> 1  0.316 -0.154
#> 2  1.86   1.46
#> 3 -0.396 -1.42
#> 4 -1.08   0.481
#> 5  1.75   0.323

Blueprints can accept additional arguments, and the data will be generated based on the values supplied for them:

make_tbl2 <- blueprint(
  x = r_norm(mean = x_mu),
  y = r_unif(min = y_min, max = y_max)
)

set_n(10000)

make_tbl2(x_mu = 10, y_min = -10, y_max = -5) %>%
  summarise(n = n(), mean_x = mean(x), min_y = min(y), max_y = max(y))
#> # A tibble: 1 x 4
#>       n mean_x min_y max_y
#>   <int>  <dbl> <dbl> <dbl>
#> 1 10000   10.0 -10.0 -5.00

This then allows for quick generation of simulation data using pmap() and analysis using map():

library(purrr) # provides pmap(), map() and map_dfr()

make_sim <- blueprint(
  x = r_norm(mean = x_mu),
  y = r_norm(mean = 2 * x + 10, sd = 2)
)

tibble(x_mu = r_unif(n = 5, -10, 10)) %>%
  pmap(make_sim, n = 100) %>%
  map(lm, formula = y ~ x) %>%
  map_dfr(broom::tidy)
#> # A tibble: 10 x 5
#>    term        estimate std.error statistic  p.value
#>    <chr>          <dbl>     <dbl>     <dbl>    <dbl>
#>  1 (Intercept)     9.29     1.35       6.89 5.45e-10
#>  2 x               1.92     0.202      9.48 1.60e-15
#>  3 (Intercept)     8.69     0.723     12.0  5.58e-21
#>  4 x               2.38     0.193     12.3  1.32e-21
#>  5 (Intercept)    10.6      0.726     14.6  2.91e-26
#>  6 x               1.82     0.252      7.20 1.22e-10
#>  7 (Intercept)    10.1      0.770     13.1  3.20e-23
#>  8 x               2.06     0.202     10.2  4.72e-17
#>  9 (Intercept)     9.78     0.426     22.9  3.54e-41
#> 10 x               1.68     0.218      7.72 1.02e-11

Distribution Functions

The majority of random number generating functions from the stats package have been translated into rando functions. Be sure to look into the documentation for the rando functions you use, as some have been re-parametrised. Function names generally follow the same convention when transitioning from stats to rando: r*() becomes r_*(), e.g. r_norm() replaces rnorm(). The only exceptions are r_tdist() and r_fdist(), which take over the roles of rt() and rf(), respectively. rando also includes several new distributions, such as r_bern() and r_letters().

Arbitrary Distributions

The r_cdf() function is a dynamic random number generator. It can take any cumulative distribution function (CDF) as an argument and produce random numbers with the associated distribution.

my_fun <- function(x, beta = 1) {
  if_else(x < 0, 0, 1 - exp(-beta * x))
}

set_n(1000)
x_data <- r_cdf(my_fun)
hist(x_data, breaks = seq(0, 10, 0.1))

Any additional arguments used by the function can be passed to r_cdf(), and will be used in determining the number of values to generate (just as in the other distribution functions above):

r_cdf(my_fun, beta = 1:10)
#>  [1] 1.59363151 0.01710057 0.51777959 0.10563731 0.15656352 0.04890561
#>  [7] 0.05313754 0.10311007 0.01916289 0.09977221
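One standard way to draw from an arbitrary CDF is inverse transform sampling: draw u from Uniform(0, 1) and solve cdf(x) = u numerically. The base R sketch below illustrates that technique only; whether r_cdf() works this way internally is not stated here, and `sample_from_cdf()`, `exp_cdf()` and the search interval `[lower, upper]` are assumptions made for the example.

```r
# Hypothetical sampler (not rando's implementation): inverse transform
# sampling for a continuous, increasing CDF whose mass lies in [lower, upper].
sample_from_cdf <- function(n, cdf, lower = 0, upper = 100, ...) {
  u <- runif(n)
  vapply(u, function(p) {
    # Solve cdf(x) = p for x with a one-dimensional root finder.
    uniroot(function(x) cdf(x, ...) - p,
            lower = lower, upper = upper, tol = 1e-9)$root
  }, numeric(1))
}

# Exponential CDF with rate beta, written with base R's ifelse().
exp_cdf <- function(x, beta = 1) ifelse(x < 0, 0, 1 - exp(-beta * x))

set.seed(1)
draws <- sample_from_cdf(2000, exp_cdf, beta = 2)
mean(draws) # should be close to the theoretical mean 1 / beta = 0.5
```

Root finding is slower than a closed-form inverse, but it only requires that the CDF can be evaluated, which is what makes the approach work for user-supplied functions.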

Finally, purrr-style functions can be used with r_cdf() to allow for even briefer function definitions. These have been extended so that additional named arguments can be passed to the lambda functions. Either .x or .t can be used for the random variable.

set_n(20)

r_cdf(~ 1 - exp(-.x), min = 0)
#>  [1] 1.00280643 0.51202178 3.15050483 0.38757920 0.16273856 1.37652755
#>  [7] 0.41813254 1.14622712 1.26543641 0.01011491 0.65036416 1.35177970
#> [13] 1.25859380 0.30105710 1.45331025 0.22260547 1.71133876 0.12983680
#> [19] 0.41169524 0.26691556

r_cdf(~ 1 - exp(-beta * .x), beta = 1:10, min = 0, n = 10)
#>  [1] 0.892275572 0.172501802 0.160342455 0.432735682 0.299936533 0.004011393
#>  [7] 0.133234262 0.150531530 0.004047155 0.426167250

Code of Conduct

Please note that the rando project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.