tidychangepoint
Tidy methods for changepoint analysis
The tidychangepoint package allows you to use any number of algorithms for detecting changepoint sets in univariate time series with a common, tidyverse-compliant interface. It also provides model-fitting procedures for commonly used parametric models, tools for computing various penalty functions, and graphical diagnostic displays.
Changepoint sets are computed using the segment() function, which takes a numeric vector that is coercible into a ts object, and a string indicating the algorithm you wish to use. segment() always returns a tidycpt object.
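The outputs on this page refer to an object named x. A minimal sketch of how such an object might be created follows. It assumes that the input is the simulated DataCPSim series bundled with the package (an assumption: the dataset name is not stated in this text) and that the method string "pelt" selects the PELT algorithm reported in the output further down.

```r
library(tidychangepoint)

# Assumption: DataCPSim is a simulated series that ships with tidychangepoint.
# Any numeric vector that can be coerced to a ts object should work here.
x <- segment(DataCPSim, method = "pelt")

# segment() always returns a tidycpt object
class(x)
```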
Various methods are available for tidycpt objects. For example, as.ts() returns the original data as a ts object, and changepoints() returns the set of changepoint indices.
changepoints(x)
#> [1] 547 822 972

Retrieving information using the broom interface
tidychangepoint follows the design interface of the broom package. Therefore, augment(), tidy(), and glance() methods exist for tidycpt objects.
augment() returns a tsibble that is grouped according to the regions defined by the changepoint set.
augment(x)
#> Registered S3 method overwritten by 'tsibble':
#>   method               from
#>   as_tibble.grouped_df dplyr
#> # A tsibble: 1,096 x 5 [1]
#> # Groups:    region [4]
#>    index     y region  .fitted  .resid
#>    <int> <dbl> <fct>     <dbl>   <dbl>
#>  1     1  35.5 [1,547)    35.3   0.232
#>  2     2  29.0 [1,547)    35.3  -6.27
#>  3     3  35.6 [1,547)    35.3   0.357
#>  4     4  33.0 [1,547)    35.3  -2.29
#>  5     5  29.5 [1,547)    35.3  -5.74
#>  6     6  25.4 [1,547)    35.3  -9.87
#>  7     7  28.8 [1,547)    35.3  -6.45
#>  8     8  50.3 [1,547)    35.3  15.0
#>  9     9  24.9 [1,547)    35.3 -10.3
#> 10    10  58.9 [1,547)    35.3  23.6
#> # ℹ 1,086 more rows

tidy() returns a tbl that provides summary statistics for each region. These include any parameters that were fit, which are prefixed in the output by param_.
tidy(x)
#> # A tibble: 4 × 10
#>   region  num_obs   min   max  mean    sd begin   end param_mu param_sigma_hatsq
#>   <chr>     <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>    <dbl>             <dbl>
#> 1 [1,547)     546  13.7  92.8  35.3  11.3     1   547     35.3              127.
#> 2 [547,8…     275  20.5 163.   58.1  19.3   547   822     58.1              372.
#> 3 [822,9…     150  39.2 215.   96.7  30.5   822   972     96.7              924.
#> 4 [972,1…     125  67.2 299.  156.   49.6   972  1097    156.              2442.

glance() returns a tbl that provides summary statistics for the algorithm. This includes the fitness, which is the value of the penalized objective function that was used.
glance(x)
#> # A tibble: 1 × 8
#>   pkg      version algorithm seg_params model_name criteria fitness elapsed_time
#>   <chr>    <pckg_> <chr>     <list>     <chr>      <chr>      <dbl> <drtn>
#> 1 changep… 2.3     PELT      <list [1]> meanvar    MBIC       9403. 0.068 secs

Other methods
The plot() method leverages ggplot2 to provide an informative plot, with the regions defined by the changepoint set clearly demarcated, and the means within each region also indicated.
plot(x)
Other generic functions defined for tidycpt objects include fitness(), as.model(), and exceedances(). For example, fitness() returns a named vector with the value of the penalized objective function used.
fitness(x)
#>     MBIC
#> 9403.391

Structure
Every tidycpt object contains two main children:
segmenter: The object that results from the changepoint detection algorithm. These can be of any class. Methods for objects of class cpt, ga, and wbs are currently implemented, as well as seg_basket (the default internal class). Given a data set, a model, and a penalized objective function, a segmenter's job is to search the exponentially large space of possible changepoint sets for the one that optimizes the penalized objective function. Some segmenting algorithms (e.g., PELT) are deterministic, while others (e.g., genetic algorithms) are randomized.

model: A model object inheriting from mod_cpt, an internal class for representing model objects. Model objects are created by model-fitting functions, all of whose names start with fit_. The model of a tidycpt object is the model object returned by the fit_*() function that corresponds to the one used by the segmenter. Given a data set, a model description, and a set of changepoints, the corresponding model-fitting function finds the values of the model parameters that optimize the model fit to the data.
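The segmenter's job described above can be sketched in base R as an exhaustive search over single-changepoint sets. This is an illustration only: the helper names neg_loglik() and penalized() are hypothetical, the BIC-style penalty is a stand-in rather than the package's MBIC, and real segmenters such as PELT avoid enumerating every candidate.

```r
# Simulate a series whose mean and standard deviation shift at index 61
set.seed(1)
y <- c(rnorm(60, mean = 0, sd = 1), rnorm(40, mean = 8, sd = 2))
n <- length(y)

# Negative log-likelihood of a normal model with a separate mean and
# (maximum-likelihood) variance in each region.
# tau holds the first index of each new region.
neg_loglik <- function(y, tau) {
  bounds <- c(1, tau, length(y) + 1)
  total <- 0
  for (i in seq_len(length(bounds) - 1)) {
    seg <- y[bounds[i]:(bounds[i + 1] - 1)]
    sigma_sq <- mean((seg - mean(seg))^2)
    total <- total + length(seg) / 2 * (log(2 * pi * sigma_sq) + 1)
  }
  total
}

# BIC-style penalized objective (an assumption, not the package's MBIC):
# two parameters per region plus one location per changepoint
penalized <- function(y, tau) {
  k <- 2 * (length(tau) + 1) + length(tau)
  2 * neg_loglik(y, tau) + k * log(length(y))
}

# Score every single-changepoint set that leaves at least 2 points per region
candidates <- 3:(n - 1)
scores <- vapply(candidates, function(tau) penalized(y, tau), numeric(1))
best_tau <- candidates[which.min(scores)]

best_tau
# Compare against the model with no changepoints at all
penalized(y, best_tau) < penalized(y, integer(0))
```

With m changepoints allowed, the number of candidate sets grows combinatorially, which is why practical algorithms either prune the search (PELT) or sample it (genetic algorithms).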
Both segmenters and models implement methods for the generic functions changepoints(), as.ts(), nobs(), logLik(), model_name(), and glance(). However, it is important to note that while tidychangepoint does its best to match the model used by the segmenter to its corresponding model-fitting function, exact matches do not always exist. Thus, the logLik() of the segmenter may not always match the logLik() of the model. Reconciling these values is the focus of ongoing work.
Segmenters
In the example above, the segmenter is of class cpt, because segment() simply wraps the cpt.meanvar() function from the changepoint package.
x |> as.segmenter() |> str()
#> Formal class 'cpt' [package "changepoint"] with 12 slots
#>   ..@ data.set : Time-Series [1:1096] from 1 to 1096: 35.5 29 35.6 33 29.5 ...
#>   ..@ cpttype  : chr "mean and variance"
#>   ..@ method   : chr "PELT"
#>   ..@ test.stat: chr "Normal"
#>   ..@ pen.type : chr "MBIC"
#>   ..@ pen.value: num 28
#>   ..@ minseglen: num 2
#>   ..@ cpts     : int [1:4] 547 822 972 1096
#>   ..@ ncpts.max: num Inf
#>   ..@ param.est:List of 2
#>   .. ..$ mean    : num [1:4] 35.3 58.2 96.8 156.5
#>   .. ..$ variance: num [1:4] 127 371 921 2406
#>   ..@ date     : chr "Thu Oct 16 19:35:00 2025"
#>   ..@ version  : chr "2.3"

In addition to the generic functions listed above, segmenters implement methods for the generic functions fitness(), model_args(), and seg_params().
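Since this section notes that segment() wraps cpt.meanvar() here, it may help to see what a direct call to the changepoint package looks like. The sketch below uses simulated data rather than the series analyzed on this page, and the argument values are chosen to mirror the method and pen.type slots shown above; how segment() actually passes its arguments through is not shown in this text.

```r
library(changepoint)

# Simulated series with one shift in mean and variance at index 101
set.seed(123)
y <- c(rnorm(100, mean = 0, sd = 1), rnorm(100, mean = 5, sd = 3))

# Fit a mean-and-variance changepoint model with PELT and the MBIC penalty
fit <- cpt.meanvar(y, method = "PELT", penalty = "MBIC")

class(fit)   # an S4 object of class "cpt"
cpts(fit)    # estimated changepoint locations
```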
Models
The model object in this case is created by fit_meanvar(), and is of class mod_cpt.
x |> as.model() |> str()
#> List of 6
#>  $ data         : Time-Series [1:1096] from 1 to 1096: 35.5 29 35.6 33 29.5 ...
#>  $ tau          : int [1:3] 547 822 972
#>  $ region_params: tibble [4 × 3] (S3: tbl_df/tbl/data.frame)
#>   ..$ region           : chr [1:4] "[1,547)" "[547,822)" "[822,972)" "[972,1.1e+03)"
#>   ..$ param_mu         : num [1:4] 35.3 58.1 96.7 155.9
#>   ..$ param_sigma_hatsq: Named num [1:4] 127 372 924 2442
#>   .. ..- attr(*, "names")= chr [1:4] "[1,547)" "[547,822)" "[822,972)" "[972,1.1e+03)"
#>  $ model_params : NULL
#>  $ fitted_values: num [1:1096] 35.3 35.3 35.3 35.3 35.3 ...
#>  $ model_name   : chr "meanvar"
#>  - attr(*, "class")= chr "mod_cpt"

In addition to the generic functions listed above, models implement methods for the generic functions fitted(), residuals(), coef(), augment(), tidy(), and plot().
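To make the region parameters, fitted values, and residuals above concrete, here is a base-R sketch of what fitting a mean-and-variance model to a known changepoint set involves. It is not the package's implementation: the data are simulated, the variable names simply echo the fields shown above, and the use of the maximum-likelihood variance (dividing by the region length) is an assumption.

```r
# Simulated series with three regions
set.seed(42)
y <- c(rnorm(50, mean = 10, sd = 1),
       rnorm(30, mean = 20, sd = 3),
       rnorm(20, mean = 5, sd = 2))

# Changepoints given as the first index of each new region,
# matching the half-open regions such as [1,547) shown on this page
tau <- c(51, 81)

# Label every observation with its half-open region
region <- cut(seq_along(y), breaks = c(1, tau, length(y) + 1), right = FALSE)

# Per-region parameter estimates: mean and maximum-likelihood variance
param_mu <- tapply(y, region, mean)
param_sigma_hatsq <- tapply(y, region, function(v) mean((v - mean(v))^2))

# Fitted values are the region means; residuals are what is left over
fitted_values <- ave(y, region)
resid <- y - fitted_values

# One row per region, similar in spirit to the tidy() output
summary_tbl <- data.frame(
  region = levels(region),
  num_obs = as.vector(table(region)),
  param_mu = as.vector(param_mu),
  param_sigma_hatsq = as.vector(param_sigma_hatsq)
)
summary_tbl
```

Because the changepoints are fixed, each region can be fit independently, which is what makes this step cheap compared with the segmenter's search.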