
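The tsibble package provides a data infrastructure for tidy temporal data with wrangling tools. Adapting the tidy data principles, tsibble is a data- and model-oriented object. In tsibble, each observation is uniquely identified by an index (a time variable) and a key (the variables that define the observational units).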
You can install the stable version from CRAN:

```r
install.packages("tsibble")
```

You can install the development version from GitHub using:
```r
# install.packages("remotes")
remotes::install_github("tidyverts/tsibble")
```

### `as_tsibble()`

To coerce a data frame to a tsibble, we need to declare the key and index. For example, in the `weather` data from the package `nycflights13`, `time_hour`, which contains the date-times, should be declared as the index, and `origin` as the key. Other columns can be considered measured variables.
```r
library(dplyr)
library(tsibble)
weather <- nycflights13::weather %>%
  select(origin, time_hour, temp, humid, precip)
weather_tsbl <- as_tsibble(weather, key = origin, index = time_hour)
weather_tsbl
#> # A tsibble: 26,115 x 5 [1h] <America/New_York>
#> # Key:       origin [3]
#>   origin time_hour            temp humid precip
#>   <chr>  <dttm>              <dbl> <dbl>  <dbl>
#> 1 EWR    2013-01-01 01:00:00  39.0  59.4      0
#> 2 EWR    2013-01-01 02:00:00  39.0  61.6      0
#> 3 EWR    2013-01-01 03:00:00  39.0  64.4      0
#> 4 EWR    2013-01-01 04:00:00  39.9  62.2      0
#> 5 EWR    2013-01-01 05:00:00  39.0  64.4      0
#> # ℹ 26,110 more rows
```

The key can comprise zero, one, or more variables. See `package?tsibble` and `vignette("intro-tsibble")` for details.
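To make the key and index declaration concrete without downloading `nycflights13`, here is a minimal sketch on a toy data frame. The `readings` data and its column names (`sensor`, `date`, `value`) are invented for illustration; only the tsibble functions themselves come from the package.

```r
library(tsibble)

# Toy data, invented for illustration: two sensors observed on three days.
readings <- data.frame(
  sensor = rep(c("a", "b"), each = 3),
  date   = rep(as.Date("2024-01-01") + 0:2, times = 2),
  value  = c(1.2, 1.5, 1.1, 3.4, 3.1, 3.8)
)

# Check for duplicated key-index pairs before coercing.
has_duplicates <- is_duplicated(readings, key = sensor, index = date)

# One key variable: each sensor is its own series.
readings_tsbl <- as_tsibble(readings, key = sensor, index = date)
key_vars(readings_tsbl)   # names of the key columns
index_var(readings_tsbl)  # name of the index column

# No key: a single series, so the index alone must identify each row.
one_series <- as_tsibble(
  subset(readings, sensor == "a", select = -sensor),
  index = date
)
key_vars(one_series)
```

Checking with `is_duplicated()` first is a design choice in this sketch: coercion fails when key-index pairs are not unique, so it can be easier to inspect the duplicates up front.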
The interval is computed from the index based on its representation, ranging from year to nanosecond, from numerics to ordered factors. The table below shows how tsibble interprets some common time formats.
| Interval | Class |
|---|---|
| Annual | integer/double |
| Quarterly | yearquarter |
| Monthly | yearmonth |
| Weekly | yearweek |
| Daily | Date/difftime |
| Subdaily | POSIXt/difftime/hms |
A full list of index classes supported by tsibble can be found in `package?tsibble`.
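The mapping in the table can be checked directly with `interval()`. The sketch below uses two invented toy series (the column names `month`, `day`, and `value` are illustrative) to show how the class and spacing of the index drive the detected interval.

```r
library(tsibble)

# Monthly index: yearmonth values one month apart.
monthly <- tsibble(
  month = yearmonth("2024 Jan") + 0:3,
  value = c(10, 12, 9, 14),
  index = month
)
interval(monthly)

# Daily index: Date values spaced two days apart.
every_other_day <- tsibble(
  day   = as.Date("2024-01-01") + c(0, 2, 4),
  value = c(1, 2, 3),
  index = day
)
interval(every_other_day)
```

The second series illustrates that the interval reflects the spacing of the observations, not only the class of the index.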
### `fill_gaps()` to turn implicit missing values into explicit missing values

Often there are implicit missing cases in time series. If the observations are made at regular time intervals, we can make these implicit missing values explicit simply by using `fill_gaps()`, filling gaps in precipitation (`precip`) with 0 at the same time. It is quite common in time series analysis to replace `NA`s with the previous observation for each origin, which is easily done using `fill()` from `tidyr`.
```r
full_weather <- weather_tsbl %>%
  fill_gaps(precip = 0) %>%
  group_by_key() %>%
  tidyr::fill(temp, humid, .direction = "down")
full_weather
#> # A tsibble: 26,190 x 5 [1h] <America/New_York>
#> # Key:       origin [3]
#> # Groups:    origin [3]
#>   origin time_hour            temp humid precip
#>   <chr>  <dttm>              <dbl> <dbl>  <dbl>
#> 1 EWR    2013-01-01 01:00:00  39.0  59.4      0
#> 2 EWR    2013-01-01 02:00:00  39.0  61.6      0
#> 3 EWR    2013-01-01 03:00:00  39.0  64.4      0
#> 4 EWR    2013-01-01 04:00:00  39.9  62.2      0
#> 5 EWR    2013-01-01 05:00:00  39.0  64.4      0
#> # ℹ 26,185 more rows
```

`fill_gaps()` also handles filling in time gaps by values or functions, and respects time zones for date-times. Want a quick overview of implicit missing values? Check out `vignette("implicit-na")`.
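Before filling, it can help to see where the gaps are. The following is a minimal sketch on invented toy data (`readings`, with hypothetical columns `sensor`, `date`, and `value`), using `has_gaps()` and `count_gaps()` to locate the gap and `fill_gaps()` to make it explicit.

```r
library(tsibble)

# Toy data, invented for illustration: sensor "a" has no row for 2024-01-03.
readings <- tsibble(
  sensor = c("a", "a", "a", "b", "b", "b", "b"),
  date   = as.Date("2024-01-01") + c(0, 1, 3, 0, 1, 2, 3),
  value  = c(1.2, 1.5, 1.1, 3.4, 3.1, 3.8, 3.6),
  key = sensor, index = date
)

gap_flags <- has_gaps(readings)    # one row per key, with a logical `.gaps` column
gap_spans <- count_gaps(readings)  # one row per gap: `.from`, `.to`, and length `.n`

# Default: the inserted row gets NA in the measured variable.
filled_na <- fill_gaps(readings)

# Alternatively, supply a replacement value for a named column.
filled_zero <- fill_gaps(readings, value = 0)
```

Whether `NA` or a fixed value is the better fill depends on the variable: a fixed 0 suits quantities like precipitation where "no record" plausibly means "none", while `NA` keeps the missingness visible for later imputation.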
### `index_by()` + `summarise()` to aggregate over calendar periods

`index_by()` is the counterpart of `group_by()` in a temporal context, but it groups the index only. In conjunction with `index_by()`, `summarise()` aggregates variables of interest over time periods. `index_by()` goes hand in hand with index functions including `as.Date()`, `yearweek()`, `yearmonth()`, and `yearquarter()`, as well as other friends from `lubridate`. For example, to compute the average temperature and total precipitation per month, apply `yearmonth()` to the index variable (referred to as `.`).
```r
full_weather %>%
  group_by_key() %>%
  index_by(year_month = ~ yearmonth(.)) %>% # monthly aggregates
  summarise(
    avg_temp = mean(temp, na.rm = TRUE),
    ttl_precip = sum(precip, na.rm = TRUE)
  )
#> # A tsibble: 36 x 4 [1M]
#> # Key:       origin [3]
#>   origin year_month avg_temp ttl_precip
#>   <chr>       <mth>    <dbl>      <dbl>
#> 1 EWR      2013 Jan     35.6       3.53
#> 2 EWR      2013 Feb     34.2       3.83
#> 3 EWR      2013 Mar     40.1       3   
#> 4 EWR      2013 Apr     53.0       1.47
#> 5 EWR      2013 May     63.3       5.44
#> # ℹ 31 more rows
```

When collapsing rows (as `summarise()` does), `group_by()` and `index_by()` take care of updating the key and index respectively. This `index_by()` + `summarise()` combo can also help regularise a tsibble with irregular time spacing.
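The same pattern can be tried on a series small enough to check by hand. This is a sketch on invented toy data (`daily`, with hypothetical columns `date` and `value`) that spans a month boundary, so the monthly totals are easy to verify.

```r
library(tsibble)
library(dplyr)

# Toy data, invented for illustration: Jan 30, Jan 31, Feb 1, Feb 2.
daily <- tsibble(
  date  = as.Date("2024-01-30") + 0:3,
  value = c(1, 2, 3, 4),
  index = date
)

monthly <- daily %>%
  index_by(month = ~ yearmonth(.)) %>%  # `.` stands for the index column
  summarise(total = sum(value))

monthly
index_var(monthly)  # the new index is the column created by index_by()
```

The two January values and the two February values are summed separately, and the result is itself a tsibble indexed by `month`.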
An ecosystem, the tidyverts, is built around the tsibble object for tidy time series analysis.
Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.