Type:Package
Title:Time-Based Rolling Functions
Version:0.1.7
Description:Provides rolling statistical functions based on date and time windows instead of n-lagged observations.
URL:https://mps9506.github.io/tbrf/
BugReports:https://github.com/mps9506/tbrf/issues
License:GPL-3 | file LICENSE
Encoding:UTF-8
LazyData:true
RoxygenNote:7.3.1
Depends:R (≥ 2.10), ggplot2 (≥ 2.2.1)
Imports:boot, dplyr, lubridate, purrr, rlang, stats, tibble, tidyr
Suggests:spelling, covr, testthat, knitr, rmarkdown
VignetteBuilder:knitr
Language:en-US
Config/Needs/website:mps9506/mpsTemplates
NeedsCompilation:no
Packaged:2025-08-19 13:46:51 UTC; michael.schramm
Author:Michael Schramm [aut, cre, cph], Frank Harrell [ctb], Bob Rudis [ctb]
Maintainer:Michael Schramm <mpschramm@gmail.com>
Repository:CRAN
Date/Publication:2025-08-19 14:10:02 UTC

Time-Based Rolling Functions

Description

Provides rolling statistical functions based on date and time windows instead of n-lagged observations.

Author(s)

Michael Schramm

See Also

Useful links:

  • https://mps9506.github.io/tbrf/

  • Report bugs at https://github.com/mps9506/tbrf/issues


Dissolved oxygen measurements from the Tres Palacios River

Description

Data from the Texas Commission on Environmental Quality Surface Water Quality Monitoring Information System. The 'Average_DO' field is the mean of dissolved oxygen concentrations (mg/L) measured at a field site on that day. The 'Min_DO' field is the minimum dissolved oxygen concentration measured at that site on that day.

Usage

data(Dissolved_Oxygen)

Format

A data frame with 236 rows and 6 variables:

Station_ID

unique water quality monitoring station identifier

Date

sampling date in yyyy-mm-dd format

Param_Code

unique parameter code

Param_Desc

parameter description with units

Average_DO

mean of dissolved oxygen measurement, in mg/L

Min_DO

minimum of dissolved oxygen measurement, in mg/L

Source

https://www80.tceq.texas.gov/SwqmisPublic/public/default.htm


Enterococci bacteria measurements from the Tres Palacios River

Description

Data from the Texas Commission on Environmental Quality Surface Water Quality Monitoring Information System. The 'Value' field is the lab-measured value of Enterococci bacteria (MPN/100 mL) from grab samples collected at 'Station_ID' on the Tres Palacios River on 'Date'.

Usage

data(Entero)

Format

A data frame with 212 rows and 5 variables:

Station_ID

unique water quality monitoring station identifier

Date

sampling date in yyyy-mm-dd format

Param_Code

unique parameter code

Param_Desc

parameter description with units

Value

Enterococci concentration, in MPN/100 mL

Source

https://www80.tceq.texas.gov/SwqmisPublic/public/default.htm


Confidence Intervals for Binomial Probabilities

Description

An implementation of the binconf function in Frank Harrell's Hmisc package. Produces 1-alpha confidence intervals for binomial probabilities.

Usage

binom_ci(x, n, alpha = 0.05, method = c("wilson", "exact", "asymptotic"), return.df = FALSE)

Arguments

x

vector containing the number of "successes" for binomial variates.

n

vector containing the numbers of corresponding observations.

alpha

probability of a type I error, so confidence coefficient = 1-alpha.

method

character string specifying which method to use. The "exact" method uses the F distribution to compute exact (based on the binomial cdf) intervals; the "wilson" interval is score-test-based; and the "asymptotic" is the text-book, asymptotic normal interval. Following Agresti and Coull, the Wilson interval is to be preferred and so is the default.

return.df

logical flag to indicate that a data frame rather than a matrix be returned.

Author(s)

Frank Harrell, modified by Michael Schramm

References

A. Agresti and B.A. Coull, Approximate is better than "exact" for interval estimation of binomial proportions, American Statistician, 52:119–126, 1998.

R.G. Newcombe, Logit confidence intervals and the inverse sinh transformation, American Statistician, 55:200–202, 2001.

L.D. Brown, T.T. Cai and A. DasGupta, Interval estimation for a binomial proportion (with discussion), Statistical Science, 16:101–133, 2001.

Examples

binom_ci(46,50,method="wilson")
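
To make the default "wilson" method more concrete, the score interval can be sketched in base R. This is an illustrative re-derivation from the standard Wilson formula, not the package's source; wilson_ci() and its return names are hypothetical.

```r
# Wilson score interval for a binomial proportion, base R only.
# wilson_ci() is a hypothetical helper for illustration; it is not part of tbrf.
wilson_ci <- function(x, n, alpha = 0.05) {
  p <- x / n                                  # observed proportion
  z <- stats::qnorm(1 - alpha / 2)            # normal critical value
  denom <- 1 + z^2 / n
  centre <- (p + z^2 / (2 * n)) / denom       # interval midpoint, pulled toward 0.5
  half <- z * sqrt(p * (1 - p) / n + z^2 / (4 * n^2)) / denom
  c(PointEst = p, Lower = centre - half, Upper = centre + half)
}

res <- wilson_ci(46, 50)
print(round(res, 4))
```

Unlike the asymptotic interval, the Wilson interval is not centred on the observed proportion, which is one reason it behaves better when the proportion is close to 0 or 1.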

Calculates the Geometric Mean

Description

Originally from Paul McMurdie, Ben Bolker, and Gregor on Stack Overflow: https://stackoverflow.com/questions/2602583/geometric-mean-is-there-a-built-in

Usage

gm_mean(x, na.rm = TRUE, zero.propagate = FALSE)

Arguments

x

vector of numeric values

na.rm

logical TRUE/FALSE remove NA values

zero.propagate

logical TRUE/FALSE. Allows the optional propagation of zeros.

Value

the geometric mean of the vector
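
As a rough illustration of the quantity being computed, the geometric mean of strictly positive values is the exponential of the mean of their logarithms. The helper below, geo_mean(), is a hypothetical base R sketch and does not reproduce how gm_mean handles zeros or the zero.propagate argument.

```r
# Geometric mean via the log-mean-exp identity, base R only.
# geo_mean() is a hypothetical helper; it assumes strictly positive input.
geo_mean <- function(x, na.rm = TRUE) {
  if (na.rm) x <- x[!is.na(x)]   # drop missing values before taking logs
  exp(mean(log(x)))
}

gm <- geo_mean(c(1, 10, 100, NA))
print(gm)
```

Working on the log scale avoids the overflow that a direct product of many large values can cause.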


Returns the Geomean and CI

Description

Generates Geometric mean and confidence intervals using bootstrap.

Usage

gm_mean_ci(window, conf = 0.95, na.rm = TRUE, type = "basic", R = 1000, parallel = "no", ncpus = getOption("boot.ncpus", 1L), cl = NULL, zero.propagate = FALSE)

Arguments

window

vector of data values

conf

confidence level of the required interval. Use NA to skip calculating the bootstrapped CI

na.rm

logical TRUE/FALSE. Remove NAs from the dataset. Defaults to TRUE

type

character string, one of c("norm", "basic", "stud", "perc", "bca"). "all" is not a valid value. See boot.ci

R

the number of bootstrap replicates. See boot

parallel

The type of parallel operation to be used (if any). See boot

ncpus

integer: number of processes to be used in parallel operation. See boot

cl

optional parallel or snow cluster for use if parallel = "snow". See boot

zero.propagate

logical TRUE/FALSE. Allows the optional propagation of zeros.

Value

named list with geometric mean and (optionally) specified confidence interval


List NA

Description

function to return tibble with NAs as specified

Usage

list_NA(x)

Arguments

x

named vector

Value

empty tibble


Returns the mean and CI

Description

Generates mean and confidence intervals using bootstrap.

Usage

mean_ci(window, conf = 0.95, na.rm = TRUE, type = "basic", R = 1000, parallel = "no", ncpus = getOption("boot.ncpus", 1L), cl = NULL)

Arguments

window

vector of data values

conf

confidence level of the required interval. Use NA to skip calculating the bootstrapped CI

na.rm

logical TRUE/FALSE. Remove NAs from the dataset. Defaults to TRUE

type

character string, one of c("norm", "basic", "stud", "perc", "bca"). "all" is not a valid value. See boot.ci

R

the number of bootstrap replicates. See boot

parallel

The type of parallel operation to be used (if any). See boot

ncpus

integer: number of processes to be used in parallel operation. See boot

cl

optional parallel or snow cluster for use if parallel = "snow". See boot

Value

named list with mean and (optionally) specified confidence interval
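
The default type = "basic" refers to the basic bootstrap interval. A minimal base R sketch of that idea follows; it uses sample() and quantile() rather than the boot package, so its endpoints will generally differ slightly from what boot.ci reports. The function basic_boot_ci() and the list names it returns are hypothetical.

```r
# Basic bootstrap confidence interval, base R only.
# basic_boot_ci() is a hypothetical helper; tbrf itself delegates to the boot package.
set.seed(42)

basic_boot_ci <- function(x, stat = mean, conf = 0.95, R = 1000) {
  theta_hat <- stat(x)
  # Recompute the statistic on R resamples drawn with replacement.
  theta_star <- replicate(R, stat(sample(x, replace = TRUE)))
  alpha <- 1 - conf
  q <- stats::quantile(theta_star, c(1 - alpha / 2, alpha / 2), names = FALSE)
  # Basic interval: reflect the bootstrap quantiles around the estimate.
  list(estimate = theta_hat,
       lwr_ci = 2 * theta_hat - q[1],
       upr_ci = 2 * theta_hat - q[2])
}

x <- c(5.1, 6.3, 4.8, 7.2, 5.9, 6.1, 5.5, 6.8)
res <- basic_boot_ci(x)
str(res)
```

Passing stat = median gives the corresponding sketch for a median; the number of replicates R trades run time against the stability of the interval endpoints.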


Returns the median and CI

Description

Generates median and confidence intervals using bootstrap.

Usage

median_ci(window, conf = 0.95, na.rm = TRUE, type = "basic", R = 1000, parallel = "no", ncpus = getOption("boot.ncpus", 1L), cl = NULL)

Arguments

window

vector of data values

conf

confidence level of the required interval. Use NA to skip calculating the bootstrapped CI

na.rm

logical TRUE/FALSE. Remove NAs from the dataset. Defaults to TRUE

type

character string, one of c("norm", "basic", "stud", "perc", "bca"). "all" is not a valid value. See boot.ci

R

the number of bootstrap replicates. See boot

parallel

The type of parallel operation to be used (if any). See boot

ncpus

integer: number of processes to be used in parallel operation. See boot

cl

optional parallel or snow cluster for use if parallel = "snow". See boot

Value

named list with median and (optionally) specified confidence interval


Open Window

Description

Calculates the period between each row and the row of interest.

Usage

open_window(x, tcolumn, unit = "years", n, i, na.pad)

Arguments

x

dataframe

tcolumn

time column

unit

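character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds"

n

numeric, describing the length of the time window in the selected units.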

i

row number

na.pad

logical. If 'na.pad = TRUE' incomplete windows (duration of the window < 'n') return 'NA'.

Value

vector
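
The windowing idea can be sketched in base R as follows. This is an assumption-laden illustration rather than the package's implementation: window_values() is hypothetical, the window length is fixed in days, the window is taken to look backward from row i, and a window is treated as incomplete when less than n_days has elapsed since the earliest observation.

```r
# Select the values that fall inside a backward-looking time window ending at row i.
# window_values() is a hypothetical helper; the na.pad rule shown is an assumption.
window_values <- function(values, times, i, n_days, na.pad = TRUE) {
  # Days elapsed from every row to row i (negative for later rows).
  elapsed <- as.numeric(difftime(times[i], times, units = "days"))
  # Assumed padding rule: not enough history before row i to fill the window.
  if (na.pad && max(elapsed) < n_days) return(NA_real_)
  values[elapsed >= 0 & elapsed <= n_days]
}

times  <- as.Date(c("2020-01-01", "2020-01-03", "2020-01-06", "2020-01-10"))
values <- c(1, 2, 3, 4)

w4 <- window_values(values, times, i = 4, n_days = 5)                  # rows within 5 days of 2020-01-10
w1 <- window_values(values, times, i = 1, n_days = 5)                  # incomplete window
w1_open <- window_values(values, times, i = 1, n_days = 5, na.pad = FALSE)
```

Because membership depends on elapsed time rather than row position, irregularly spaced samples produce windows with varying numbers of observations.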


Step ribbon statistic

Description

Provides stairstep values for ribbon plots. This was originally in Bob Rudis's ggalt package which is no longer on CRAN.

Usage

stat_stepribbon(mapping = NULL, data = NULL, geom = "ribbon", position = "identity", na.rm = FALSE, show.legend = NA, inherit.aes = TRUE, direction = "hv", ...)

Arguments

mapping

Set of aesthetic mappings created by aes(). If specified and inherit.aes = TRUE (the default), it is combined with the default mapping at the top level of the plot. You must supply mapping if there is no plot mapping.

data

The data to be displayed in this layer. There are three options:

If NULL, the default, the data is inherited from the plot data as specified in the call to ggplot().

A data.frame, or other object, will override the plot data. All objects will be fortified to produce a data frame. See fortify() for which variables will be created.

A function will be called with a single argument, the plot data. The return value must be a data.frame, and will be used as the layer data. A function can be created from a formula (e.g. ~ head(.x, 10)).

geom

which geom to use; defaults to "ribbon"

position

A position adjustment to use on the data for this layer. This can be used in various ways, including to prevent overplotting and improving the display. The position argument accepts the following:

  • The result of calling a position function, such as position_jitter(). This method allows for passing extra arguments to the position.

  • A string naming the position adjustment. To give the position as a string, strip the function name of the position_ prefix. For example, to use position_jitter(), give the position as "jitter".

  • For more information and other ways to specify the position, see the layer position documentation.

na.rm

If FALSE, the default, missing values are removed with a warning. If TRUE, missing values are silently removed.

show.legend

logical. Should this layer be included in the legends? NA, the default, includes if any aesthetics are mapped. FALSE never includes, and TRUE always includes. It can also be a named logical vector to finely select the aesthetics to display.

inherit.aes

If FALSE, overrides the default aesthetics, rather than combining with them. This is most useful for helper functions that define both data and aesthetics and shouldn't inherit behaviour from the default plot specification, e.g. borders().

direction

hv for horizontal-vertical steps, vh for vertical-horizontal steps

...

Other arguments passed on to layer()'s params argument. These arguments broadly fall into one of 4 categories below. Notably, further arguments to the position argument, or aesthetics that are required cannot be passed through the ... argument. Unknown arguments that are not part of the 4 categories below are ignored.

  • Static aesthetics that are not mapped to a scale, but are at a fixed value and apply to the layer as a whole. For example, colour = "red" or linewidth = 3. The geom's documentation has an Aesthetics section that lists the available options. The 'required' aesthetics cannot be passed on to the params. Please note that while passing unmapped aesthetics as vectors is technically possible, the order and required length is not guaranteed to be parallel to the input data.

  • When constructing a layer using a stat_*() function, the ... argument can be used to pass on parameters to the geom part of the layer. An example of this is stat_density(geom = "area", outline.type = "both"). The geom's documentation lists which parameters it can accept.

  • Inversely, when constructing a layer using a geom_*() function, the ... argument can be used to pass on parameters to the stat part of the layer. An example of this is geom_area(stat = "density", adjust = 0.5). The stat's documentation lists which parameters it can accept.

  • The key_glyph argument of layer() may also be passed on through the ... argument. This can be one of the functions described as key glyphs, to change the display of the layer in the legend.

Author(s)

Bob Rudis

References

https://groups.google.com/forum/?fromgroups=#!topic/ggplot2/9cFWHaH1CPs

Examples

x <- 1:10
df <- data.frame(x=x, y=x+10, ymin=x+7, ymax=x+12)

gg <- ggplot(df, aes(x, y))
gg <- gg + geom_ribbon(aes(ymin=ymin, ymax=ymax),
                       stat="stepribbon", fill="#b2b2b2")
gg <- gg + geom_step(color="#2b2b2b")
gg

gg <- ggplot(df, aes(x, y))
gg <- gg + geom_ribbon(aes(ymin=ymin, ymax=ymax),
                       stat="stepribbon", fill="#b2b2b2",
                       direction="hv")
gg <- gg + geom_step(color="#2b2b2b")
gg
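
For intuition about what "stairstep values" means with direction = "hv", the transformation can be sketched in base R. stairstep_hv() is a hypothetical helper written for this illustration, modelled on the usual step-plot construction; it is not taken from the package source and assumes at least two points.

```r
# Expand ribbon coordinates into stair-step form: move horizontally first, then vertically.
# stairstep_hv() is a hypothetical helper; it assumes length(x) >= 2.
stairstep_hv <- function(x, ymin, ymax) {
  n  <- length(x)
  xs <- c(1, rep(2:n, each = 2))          # x indices: 1, 2, 2, 3, 3, ..., n, n
  ys <- rep(1:n, each = 2)[-2 * n]        # y indices: 1, 1, 2, 2, ..., n
  data.frame(x = x[xs], ymin = ymin[ys], ymax = ymax[ys])
}

steps <- stairstep_hv(x = 1:3, ymin = c(8, 9, 10), ymax = c(13, 14, 15))
print(steps)
```

Each original value is held constant until the next x position, so n input points become 2n - 1 ribbon vertices.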

Time-Based Rolling Binomial Probability

Description

Produces a rolling time-window based vector of binomial probability and confidence intervals.

Usage

tbr_binom(.tbl, x, tcolumn, unit = "years", n, alpha = 0.05, na.pad = TRUE)

Arguments

.tbl

dataframe with two variables.

x

indicates the variable column containing "success" and "failure" observations coded as 1 or 0.

tcolumn

indicates the variable column containing Date or Date-Time values.

unit

character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds"

n

numeric, describing the length of the time window in the selected units.

alpha

numeric, probability of a type 1 error, so confidence coefficient = 1-alpha

na.pad

logical. If 'na.pad = TRUE' incomplete windows (duration of the window < 'n') return 'NA'. Defaults to 'TRUE'

Value

tibble with binomial point estimate and confidence intervals.

See Also

binom_ci

Examples

## Generate Sample Data
df <- tibble::tibble(
  date = sample(seq(as.Date('2000-01-01'), as.Date('2015/12/30'), by = "day"), 100),
  value = rbinom(100, 1, 0.25)
)

## Run Function
tbr_binom(df, x = value,
          tcolumn = date, unit = "years", n = 5,
          alpha = 0.1, na.pad = FALSE)

Binomial test based on time window

Description

Binomial test based on time window

Usage

tbr_binom_window(x, tcolumn, unit = "years", n, i, alpha, na.pad)

Arguments

x

column containing "success" and "failure" observations as 0 or 1

tcolumn

formatted time column

unit

character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds"

n

numeric, describing the length of the time window.

i

row

alpha

numeric, probability of a type 1 error, so confidence coefficient = 1-alpha

na.pad

logical. If 'na.pad = TRUE' incomplete windows (duration of the window < 'n') return 'NA'.

Value

list


Time-Based Rolling Geometric Mean

Description

Produces a rolling time-window based vector of geometric means and confidence intervals.

Usage

tbr_gmean(.tbl, x, tcolumn, unit = "years", n, na.pad = TRUE, ...)

Arguments

.tbl

a data frame with at least two variables; time column formatted as date, date/time and value column.

x

column containing the values to calculate the geometric mean.

tcolumn

formatted time column.

unit

character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds"

n

numeric, describing the length of the time window.

na.pad

logical. If 'na.pad = TRUE' incomplete windows (duration of the window < 'n') return 'NA'. Defaults to 'TRUE'

...

additional arguments passed to gm_mean_ci

Value

tibble with columns for the rolling geometric mean and upper and lower confidence levels.

See Also

gm_mean_ci

Examples

## Return a tibble with new rolling geometric mean column
tbr_gmean(Dissolved_Oxygen, x = Average_DO, tcolumn = Date, unit = "years", n = 5, na.pad = FALSE)

## Not run: 
## Return a tibble with rolling geometric mean and 95% CI
tbr_gmean(Dissolved_Oxygen, x = Average_DO, tcolumn = Date, unit = "years", n = 5, conf = .95)
## End(Not run)

Geometric mean based on a time-window

Description

Geometric mean based on a time-window

Usage

tbr_gmean_window(x, tcolumn, unit = "years", n, i, na.pad, ...)

Arguments

x

column containing the values to calculate the geometric mean.

tcolumn

formatted time column.

unit

character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds"

n

numeric, describing the length of the time window.

i

row

...

additional arguments passed to gm_mean_ci

Value

list


Time-Based Rolling Mean

Description

Produces a rolling time-window based vector of means and confidence intervals.

Usage

tbr_mean(.tbl, x, tcolumn, unit = "years", n, na.pad = TRUE, ...)

Arguments

.tbl

a data frame with at least two variables; time column formatted as date, date/time and value column.

x

column containing the numeric values to calculate the mean.

tcolumn

formatted time column.

unit

character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds"

n

numeric, describing the length of the time window.

na.pad

logical. If 'na.pad = TRUE' incomplete windows (duration of the window < 'n') return 'NA'. Defaults to 'TRUE'

...

additional arguments passed to mean_ci.

Value

tibble with columns for the rolling mean and upper and lower confidence intervals.

See Also

mean_ci

Examples

## Return a tibble with new rolling mean column
tbr_mean(Dissolved_Oxygen, x = Average_DO, tcolumn = Date, unit = "years", n = 5, na.pad = FALSE)

## Not run: 
## Return a tibble with rolling mean and 95% CI
tbr_mean(Dissolved_Oxygen, x = Average_DO, tcolumn = Date, unit = "years", n = 5, conf = .95)
## End(Not run)

Mean Based on a Time-Window

Description

Mean Based on a Time-Window

Usage

tbr_mean_window(x, tcolumn, unit = "years", n, i, na.pad, ...)

Arguments

x

column containing the values to calculate the mean.

tcolumn

formatted time column.

unit

character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds"

n

numeric, describing the length of the time window.

i

row

na.pad

logical. If 'na.pad = TRUE' incomplete windows (duration of the window < 'n') return 'NA'.

...

additional arguments passed to mean_ci

Value

list


Time-Based Rolling Median

Description

Produces a rolling time-window based vector of medians and confidence intervals.

Usage

tbr_median(.tbl, x, tcolumn, unit = "years", n, na.pad = TRUE, ...)

Arguments

.tbl

a data frame with at least two variables; time column formatted as date, date/time and value column.

x

column containing the numeric values to calculate the median.

tcolumn

formatted time column.

unit

character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds"

n

numeric, describing the length of the time window.

na.pad

logical. If 'na.pad = TRUE' incomplete windows (duration of the window < 'n') return 'NA'. Defaults to 'TRUE'

...

additional arguments passed to median_ci

Value

tibble with columns for the rolling median and upper and lower confidence intervals.

See Also

median_ci

Examples

## Return a tibble with new rolling median column
tbr_median(Dissolved_Oxygen, x = Average_DO, tcolumn = Date, unit = "years", n = 5, na.pad = FALSE)

## Not run: 
## Return a tibble with rolling median and 95% CI
tbr_median(Dissolved_Oxygen, x = Average_DO, tcolumn = Date, unit = "years", n = 5, conf = .95)
## End(Not run)

Median Based on a Time-Window

Description

Median Based on a Time-Window

Usage

tbr_median_window(x, tcolumn, unit = "years", n, i, na.pad, ...)

Arguments

x

column containing the values to calculate the median.

tcolumn

formatted time column.

unit

character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds"

n

numeric, describing the length of the time window.

i

row

na.pad

logical. If 'na.pad = TRUE' incomplete windows (duration of the window < 'n') return 'NA'.

...

additional arguments passed to median_ci

Value

list


Use Generic Functions with Time Windows

Description

Use Generic Functions with Time Windows

Usage

tbr_misc(.tbl, x, tcolumn, unit = "years", n, na.pad = TRUE, func, ...)

Arguments

.tbl

a data frame with at least two variables; time column formatted as date, date/time and value column.

x

column containing the values the function is applied to.

tcolumn

formatted time column.

unit

character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds"

n

numeric, describing the length of the time window.

na.pad

logical. If 'na.pad = TRUE' incomplete windows (duration of the window < 'n') return 'NA'. Defaults to 'TRUE'

func

specified function

...

optional additional arguments passed to function func

Value

tibble

Examples

tbr_misc(Dissolved_Oxygen, x = Average_DO, tcolumn = Date, unit = "years", n = 5, na.pad = FALSE, func = mean)
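
The general pattern behind applying an arbitrary function over time windows, with extra arguments forwarded through ..., can be sketched in base R. roll_apply_time() is a hypothetical stand-in written for this illustration; it fixes the window length in days, looks backward from each row, and omits na.pad handling.

```r
# Apply any summary function over backward-looking time windows, base R only.
# roll_apply_time() is a hypothetical helper; it is not the tbr_misc implementation.
roll_apply_time <- function(values, times, n_days, func, ...) {
  vapply(seq_along(values), function(i) {
    # Days elapsed from every row to row i; keep rows inside [0, n_days].
    elapsed <- as.numeric(difftime(times[i], times, units = "days"))
    func(values[elapsed >= 0 & elapsed <= n_days], ...)
  }, numeric(1))
}

times  <- as.Date(c("2020-01-01", "2020-01-03", "2020-01-06", "2020-01-10"))
values <- c(1, 2, NA, 4)

# na.rm = TRUE is forwarded to max() through the dots.
out <- roll_apply_time(values, times, n_days = 5, func = max, na.rm = TRUE)
print(out)
```

Any function that reduces a numeric vector to a single number fits this pattern, which is why a generic wrapper is enough for statistics that lack a dedicated rolling function.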

Time-Based Rolling Standard Deviation

Description

Time-Based Rolling Standard Deviation

Usage

tbr_sd(.tbl, x, tcolumn, unit = "years", n, na.rm = FALSE, na.pad = TRUE)

Arguments

.tbl

a data frame with at least two variables; time column formatted as date, date/time and value column.

x

column containing the values to calculate the standard deviation.

tcolumn

formatted time column.

unit

character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds"

n

numeric, describing the length of the time window.

na.rm

logical. Should missing values be removed?

na.pad

logical. If 'na.pad = TRUE' incomplete windows (duration of the window < 'n') return 'NA'. Defaults to 'TRUE'

Value

tibble with column for the rolling sd.

See Also

sd

Examples

tbr_sd(Dissolved_Oxygen, x = Average_DO, tcolumn = Date, unit = "years", n = 5, na.pad = FALSE)

Standard Deviation Based on a Time-Window

Description

Standard Deviation Based on a Time-Window

Usage

tbr_sd_window(x, tcolumn, unit = "years", n, i, na.pad, ...)

Arguments

x

column containing the values to calculate the standard deviation.

tcolumn

formatted time column.

unit

character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds"

n

numeric, describing the length of the time window.

i

row

na.pad

logical. If 'na.pad = TRUE' incomplete windows (duration of the window < 'n') return 'NA'.

...

additional arguments passed to base::sd()

Value

numeric value


Time-Based Rolling Sum

Description

Time-Based Rolling Sum

Usage

tbr_sum(.tbl, x, tcolumn, unit = "years", n, na.rm = FALSE, na.pad = TRUE)

Arguments

.tbl

a data frame with at least two variables; time column formatted as date, date/time and value column.

x

column containing the values to calculate the sum.

tcolumn

formatted time column.

unit

character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds"

n

numeric, describing the length of the time window.

na.rm

logical. Should missing values be removed?

na.pad

logical. If 'na.pad = TRUE' incomplete windows (duration of the window < 'n') return 'NA'. Defaults to 'TRUE'

Value

dataframe with column for the rolling sum.

See Also

sum

Examples

tbr_sum(Dissolved_Oxygen, x = Average_DO, tcolumn = Date, unit = "years", n = 5, na.pad = FALSE)

Sum Based on a Time-Window

Description

Sum Based on a Time-Window

Usage

tbr_sum_window(x, tcolumn, unit = "years", n, i, na.rm, na.pad)

Arguments

x

column containing the values to calculate the sum.

tcolumn

formatted time column.

unit

character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds"

n

numeric, describing the length of the time window.

i

row

na.rm

logical. Should missing values be removed?

na.pad

logical. If 'na.pad = TRUE' incomplete windows (duration of the window < 'n') return 'NA'.

Value

numeric value


tbrf extensions to ggplot2

Description

tbrf makes use of the ggproto class system to extend the functionality of ggplot2. In general the actual classes should be of little interest to users as the standard ggplot2 API of using geom_* and stat_* functions for building up the plot is encouraged.

References

https://groups.google.com/forum/?fromgroups=#!topic/ggplot2/9cFWHaH1CPs

