tsibble

The tsibble package provides a data infrastructure for tidy temporal data with wrangling tools. Adapting the tidy data principles, tsibble is a data- and model-oriented object. In tsibble:

  1. Index is a variable with inherent ordering from past to present.
  2. Key is a set of variables that define observational units over time.
  3. Each observation should be uniquely identified by index and key.
  4. Each observational unit should be measured at a common interval, if regularly spaced.
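
The uniqueness rule in point 3 can be checked before coercion. The following is a minimal sketch on hypothetical toy data (the `readings` data frame and its columns are invented for illustration), using tsibble's `is_duplicated()` and `duplicates()` helpers:

```r
library(tsibble)

# Hypothetical toy data: two sensors measured on two days, with one
# (sensor, day) pair recorded twice.
readings <- data.frame(
  sensor = c("a", "a", "b", "b", "b"),
  day = as.Date("2020-01-01") + c(0, 1, 0, 1, 1),
  value = c(1.0, 1.5, 2.0, 2.5, 2.6)
)

# Reports whether any key-index pair occurs more than once.
has_dups <- is_duplicated(readings, key = sensor, index = day)

# Lists the offending rows so they can be resolved before coercion.
dup_rows <- duplicates(readings, key = sensor, index = day)

# After dropping the repeated row, every (sensor, day) pair is unique
# and the data can be declared as a tsibble.
clean <- readings[-5, ]
readings_tsbl <- as_tsibble(clean, key = sensor, index = day)
```

Which duplicate to keep is a modelling decision; dropping the last row here is only for demonstration.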

Installation

You can install the stable version from CRAN:

install.packages("tsibble")

You can install the development version from GitHub using:

# install.packages("remotes")
remotes::install_github("tidyverts/tsibble")

Get started

Coerce to a tsibble with as_tsibble()

To coerce a data frame to tsibble, we need to declare key and index. For example, in the weather data from the package nycflights13, the time_hour column containing the date-times should be declared as index, and origin as key. Other columns can be considered as measured variables.

library(dplyr)
library(tsibble)
weather <- nycflights13::weather %>%
  select(origin, time_hour, temp, humid, precip)
weather_tsbl <- as_tsibble(weather, key = origin, index = time_hour)
weather_tsbl
#> # A tsibble: 26,115 x 5 [1h] <America/New_York>
#> # Key:       origin [3]
#>   origin time_hour            temp humid precip
#>   <chr>  <dttm>              <dbl> <dbl>  <dbl>
#> 1 EWR    2013-01-01 01:00:00  39.0  59.4      0
#> 2 EWR    2013-01-01 02:00:00  39.0  61.6      0
#> 3 EWR    2013-01-01 03:00:00  39.0  64.4      0
#> 4 EWR    2013-01-01 04:00:00  39.9  62.2      0
#> 5 EWR    2013-01-01 05:00:00  39.0  64.4      0
#> # ℹ 26,110 more rows

The key can comprise zero, one, or more variables. See package?tsibble and vignette("intro-tsibble") for details.
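
As a brief illustration of the key cases, the sketch below inspects the `tourism` dataset bundled with tsibble (keyed by several variables) and builds a hypothetical keyless series (`no_key` is an invented name):

```r
library(tsibble)

# The bundled tourism data is keyed by more than one variable.
key_vars(tourism)
index_var(tourism)

# Hypothetical single series: no key is declared, so the key is empty.
no_key <- tsibble(
  year = 2018:2020,
  value = c(1, 2, 3),
  index = year
)
key_vars(no_key)
```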

The interval is computed from the index based on its representation, ranging from year to nanosecond, from numerics to ordered factors. The table below shows how tsibble interprets some common time formats.

| Interval  | Class               |
|-----------|---------------------|
| Annual    | integer/double      |
| Quarterly | yearquarter         |
| Monthly   | yearmonth           |
| Weekly    | yearweek            |
| Daily     | Date/difftime       |
| Subdaily  | POSIXt/difftime/hms |

A full list of index classes supported by tsibble can be found in package?tsibble.
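
To see how the index class drives the interval, here is a small sketch with hypothetical data (the `monthly` and `annual` objects and their `sales` column are invented for illustration):

```r
library(tsibble)

# Hypothetical monthly series: a yearmonth index marks the data as monthly.
monthly <- tsibble(
  month = yearmonth(seq(as.Date("2020-01-01"), by = "1 month", length.out = 3)),
  sales = c(10, 12, 9),
  index = month
)
interval(monthly)

# Years stored as plain integers fall under the integer/double row of the table.
annual <- tsibble(
  year = 2018:2020,
  sales = c(100, 120, 90),
  index = year
)
interval(annual)
```

Storing the index in the class that matches the intended frequency avoids, for example, monthly data held as `Date` being read as daily data with gaps.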

fill_gaps() to turn implicit missing values into explicit missing values

Often there are implicit missing cases in time series. If the observations are made at regular time intervals, we can make this implicit missingness explicit simply by using fill_gaps(), filling gaps in precipitation (precip) with 0 at the same time. It is quite common in time series analysis to replace NAs with the previous observation for each origin, which is easily done using fill() from tidyr.

full_weather <- weather_tsbl %>%
  fill_gaps(precip = 0) %>%
  group_by_key() %>%
  tidyr::fill(temp, humid, .direction = "down")
full_weather
#> # A tsibble: 26,190 x 5 [1h] <America/New_York>
#> # Key:       origin [3]
#> # Groups:    origin [3]
#>   origin time_hour            temp humid precip
#>   <chr>  <dttm>              <dbl> <dbl>  <dbl>
#> 1 EWR    2013-01-01 01:00:00  39.0  59.4      0
#> 2 EWR    2013-01-01 02:00:00  39.0  61.6      0
#> 3 EWR    2013-01-01 03:00:00  39.0  64.4      0
#> 4 EWR    2013-01-01 04:00:00  39.9  62.2      0
#> 5 EWR    2013-01-01 05:00:00  39.0  64.4      0
#> # ℹ 26,185 more rows

fill_gaps() also handles filling in time gaps by values or functions, and respects time zones for date-times. For a quick overview of implicit missing values, see vignette("implicit-na").
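
Before filling, it can help to look at where the gaps are. A minimal sketch on hypothetical daily data (the `obs` object and its `rain` column are invented), using tsibble's `has_gaps()` and `scan_gaps()` alongside `fill_gaps()`:

```r
library(tsibble)

# Hypothetical daily series with 2020-01-03 missing.
obs <- tsibble(
  day = as.Date("2020-01-01") + c(0, 1, 3),
  rain = c(1.2, 0, 3.4),
  index = day
)

# Reports whether any implicit gaps exist.
gap_flag <- has_gaps(obs)

# Lists the time points that are missing.
gap_rows <- scan_gaps(obs)

# Makes the missing day explicit, setting rain to 0 for the new row.
filled <- fill_gaps(obs, rain = 0)
```

Columns not named in `fill_gaps()` are filled with `NA`, which is why the weather example above follows up with `tidyr::fill()`.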

index_by() + summarise() to aggregate over calendar periods

index_by() is the counterpart of group_by() in a temporal context, but it groups the index only. In conjunction with index_by(), summarise() aggregates variables of interest over time periods. index_by() goes hand in hand with index functions including as.Date(), yearweek(), yearmonth(), and yearquarter(), as well as other friends from lubridate. For example, to compute average temperature and total precipitation per month, apply yearmonth() to the index variable (referred to as .).

full_weather %>%
  group_by_key() %>%
  index_by(year_month = ~ yearmonth(.)) %>% # monthly aggregates
  summarise(
    avg_temp = mean(temp, na.rm = TRUE),
    ttl_precip = sum(precip, na.rm = TRUE)
  )
#> # A tsibble: 36 x 4 [1M]
#> # Key:       origin [3]
#>   origin year_month avg_temp ttl_precip
#>   <chr>       <mth>    <dbl>      <dbl>
#> 1 EWR      2013 Jan     35.6       3.53
#> 2 EWR      2013 Feb     34.2       3.83
#> 3 EWR      2013 Mar     40.1       3
#> 4 EWR      2013 Apr     53.0       1.47
#> 5 EWR      2013 May     63.3       5.44
#> # ℹ 31 more rows

When collapsing rows (as summarise() does), group_by() and index_by() take care of updating the key and index respectively. This index_by() + summarise() combo can also help with regularising a tsibble with irregular time spacing.
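
The regularising use case might look like the following sketch. The `events` data and its `amount` column are hypothetical, and the timestamps are kept in UTC so that `as.Date()` assigns each one to an unambiguous day:

```r
library(dplyr)
library(tsibble)

# Hypothetical event log recorded at irregular times.
events <- tsibble(
  time = as.POSIXct(
    c("2020-01-01 00:10:00", "2020-01-01 13:45:00",
      "2020-01-02 08:00:00", "2020-01-03 23:59:00"),
    tz = "UTC"
  ),
  amount = c(5, 7, 3, 4),
  index = time,
  regular = FALSE
)

# Collapse the irregular timestamps to one row per calendar day.
daily <- events %>%
  index_by(date = ~ as.Date(.)) %>%
  summarise(total = sum(amount))
```

The result is indexed by `date` rather than `time`, so downstream tools that expect evenly spaced observations can work with it.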

Learn more about tsibble

An ecosystem, the tidyverts, is built around the tsibble object for tidy time series analysis.


Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.

