tidychangepoint

tidychangepoint.Rmd

Tidy methods for changepoint analysis

library(tidychangepoint)

The tidychangepoint package allows you to use any number of algorithms for detecting changepoint sets in univariate time series with a common, tidyverse-compliant interface. It also provides model-fitting procedures for commonly-used parametric models, tools for computing various penalty functions, and graphical diagnostic displays.

Changepoint sets are computed using the segment() function, which takes a numeric vector that is coercible into a ts object, and a string indicating the algorithm you wish to use. segment() always returns a tidycpt object.

x <- segment(DataCPSim, method = "pelt")
class(x)
#> [1] "tidycpt"

Various methods are available for tidycpt objects. For example, as.ts() returns the original data as a ts object, and changepoints() returns the set of changepoint indices.

changepoints(x)
#> [1] 547 822 972

Retrieving information using the broom interface

tidychangepoint follows the design interface of the broom package. Therefore, augment(), tidy(), and glance() methods exist for tidycpt objects.

  • augment() returns a tsibble that is grouped according to the regions defined by the changepoint set.
augment(x)
#> Registered S3 method overwritten by 'tsibble':
#>   method               from
#>   as_tibble.grouped_df dplyr
#> # A tsibble: 1,096 x 5 [1]
#> # Groups:    region [4]
#>    index     y region  .fitted  .resid
#>    <int> <dbl> <fct>     <dbl>   <dbl>
#>  1     1  35.5 [1,547)    35.3   0.232
#>  2     2  29.0 [1,547)    35.3  -6.27
#>  3     3  35.6 [1,547)    35.3   0.357
#>  4     4  33.0 [1,547)    35.3  -2.29
#>  5     5  29.5 [1,547)    35.3  -5.74
#>  6     6  25.4 [1,547)    35.3  -9.87
#>  7     7  28.8 [1,547)    35.3  -6.45
#>  8     8  50.3 [1,547)    35.3  15.0
#>  9     9  24.9 [1,547)    35.3 -10.3
#> 10    10  58.9 [1,547)    35.3  23.6
#> # ℹ 1,086 more rows
  • tidy() returns a tbl that provides summary statistics for each region. These include any parameters that were fit, which are prefixed in the output by param_.
tidy(x)
#> # A tibble: 4 × 10
#>   region  num_obs   min   max  mean    sd begin   end param_mu param_sigma_hatsq
#>   <chr>     <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>    <dbl>             <dbl>
#> 1 [1,547)     546  13.7  92.8  35.3  11.3     1   547     35.3              127.
#> 2 [547,8…     275  20.5 163.   58.1  19.3   547   822     58.1              372.
#> 3 [822,9…     150  39.2 215.   96.7  30.5   822   972     96.7              924.
#> 4 [972,1…     125  67.2 299.  156.   49.6   972  1097    156.              2442.
  • glance() returns a tbl that provides summary statistics for the algorithm. This includes the fitness, which is the value of the penalized objective function that was used.
glance(x)
#> # A tibble: 1 × 8
#>   pkg      version algorithm seg_params model_name criteria fitness elapsed_time
#>   <chr>    <pckg_> <chr>     <list>     <chr>      <chr>      <dbl> <drtn>
#> 1 changep… 2.3     PELT      <list [1]> meanvar    MBIC       9403. 0.068 secs
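To make the region bookkeeping concrete, here is a minimal base-R sketch of how per-region fitted values and summaries, similar in spirit to what augment() and tidy() report, could be computed by hand. It is not the package's implementation: the simulated series y, the changepoint set tau, and all variable names are assumptions made for illustration. Regions are treated as half-open intervals [begin, end), matching the labels in the output above.

```r
# Hypothetical data: three regimes with different means (this is NOT DataCPSim)
set.seed(1)
y <- c(rnorm(50, mean = 0), rnorm(30, mean = 5), rnorm(20, mean = 10))

# Assumed changepoint set: each entry is the first index of a new region
tau <- c(51, 81)

# Region boundaries as half-open intervals [begin, end)
begin <- c(1, tau)
end <- c(tau, length(y) + 1)

# Assign every time index to a region (1, 2, 3, ...)
region_id <- findInterval(seq_along(y), tau) + 1

# Per-region means, then fitted values and residuals for each observation
region_means <- as.numeric(tapply(y, region_id, mean))
fitted_values <- region_means[region_id]
residual_values <- y - fitted_values

# One row of summary statistics per region
region_summary <- data.frame(
  region = paste0("[", begin, ",", end, ")"),
  num_obs = as.integer(table(region_id)),
  mean = region_means,
  sd = as.numeric(tapply(y, region_id, sd)),
  begin = begin,
  end = end
)
print(region_summary)
```

Because each fitted value is the mean of its own region, the residuals sum to zero within every region.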

Other methods

The plot() method leverages ggplot2 to provide an informative plot, with the regions defined by the changepoint set clearly demarcated, and the means within each region also indicated.

plot(x)

Other generic functions defined for tidycpt objects include fitness(), as.model(), and exceedances(). For example, fitness() returns a named vector with the value of the penalized objective function used.

fitness(x)
#>     MBIC
#> 9403.391

Structure

Every tidycpt object contains two main children:

  • segmenter: The object that results from the changepoint detection algorithm. These can be of any class. Methods for objects of class cpt, ga, and wbs are currently implemented, as well as seg_basket (the default internal class). Given a data set, a model, and a penalized objective function, a segmenter’s job is to search the exponentially-large space of possible changepoint sets for the one that optimizes the penalized objective function. Some segmenting algorithms (e.g., PELT) are deterministic, while others (e.g., genetic algorithms) are randomized.
  • model: A model object inheriting from mod_cpt, an internal class for representing model objects. Model objects are created by model-fitting functions, all of whose names start with fit_. The model of a tidycpt object is the model object returned by the fit_*() function that corresponds to the one used by the segmenter. Given a data set, a model description, and a set of changepoints, the corresponding model-fitting function finds the values of the model parameters that optimize the model fit to the data.
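The division of labor described above can be sketched in base R. The function below scores one candidate changepoint set under a normal mean-and-variance model, using a plain BIC penalty. This is an assumption chosen for simplicity: it is not the MBIC criterion reported in the example output, and penalized_fitness() is a hypothetical helper, not a tidychangepoint function. A segmenter would search over many candidate sets for the one with the best score, and a model-fitting function corresponds to the per-region parameter estimates computed inside.

```r
# Hypothetical helper: -2 * log-likelihood + BIC penalty for a changepoint set
# under a normal model with a separate mean and variance in each region.
penalized_fitness <- function(y, tau) {
  n <- length(y)
  # Region membership; with no changepoints everything is region 1
  region_id <- if (length(tau) == 0) {
    rep(1L, n)
  } else {
    findInterval(seq_len(n), tau) + 1L
  }
  # Model fitting: maximum likelihood estimates within each region
  neg2_loglik <- sum(tapply(y, region_id, function(v) {
    sigma_hatsq <- mean((v - mean(v))^2)
    length(v) * (log(2 * pi * sigma_hatsq) + 1)
  }))
  # Parameter count: two per region plus one per changepoint location
  num_params <- 2 * (length(tau) + 1) + length(tau)
  neg2_loglik + num_params * log(n)
}

# Hypothetical data with true changes at indices 51 and 81
set.seed(1)
y <- c(rnorm(50, mean = 0), rnorm(30, mean = 5), rnorm(20, mean = 10))

fit_true <- penalized_fitness(y, c(51, 81))   # well-placed changepoints
fit_none <- penalized_fitness(y, integer(0))  # no changepoints at all
fit_true < fit_none
```

Lower is better here: the penalty charges for every extra changepoint, so a set is only preferred when the improvement in likelihood outweighs its added complexity.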

Both segmenters and models implement methods for the generic functions changepoints(), as.ts(), nobs(), logLik(), model_name(), and glance(). However, while tidychangepoint does its best to match the model used by the segmenter to its corresponding model-fitting function, exact matches do not always exist. Thus, the logLik() of the segmenter may not always match the logLik() of the model. Reconciling these values is the focus of ongoing work.

Segmenters

In the example above, the segmenter is of class cpt, because segment() simply wraps the cpt.meanvar() function from the changepoint package.

x |> as.segmenter() |> str()
#> Formal class 'cpt' [package "changepoint"] with 12 slots
#>   ..@ data.set : Time-Series [1:1096] from 1 to 1096: 35.5 29 35.6 33 29.5 ...
#>   ..@ cpttype  : chr "mean and variance"
#>   ..@ method   : chr "PELT"
#>   ..@ test.stat: chr "Normal"
#>   ..@ pen.type : chr "MBIC"
#>   ..@ pen.value: num 28
#>   ..@ minseglen: num 2
#>   ..@ cpts     : int [1:4] 547 822 972 1096
#>   ..@ ncpts.max: num Inf
#>   ..@ param.est:List of 2
#>   .. ..$ mean    : num [1:4] 35.3 58.2 96.8 156.5
#>   .. ..$ variance: num [1:4] 127 371 921 2406
#>   ..@ date     : chr "Thu Oct 16 19:35:00 2025"
#>   ..@ version  : chr "2.3"

In addition to the generic functions listed above, segmenters implement methods for the generic functions fitness(), model_args(), and seg_params().

Models

The model object in this case is created by fit_meanvar(), and is of class mod_cpt.

x |> as.model() |> str()
#> List of 6
#>  $ data         : Time-Series [1:1096] from 1 to 1096: 35.5 29 35.6 33 29.5 ...
#>  $ tau          : int [1:3] 547 822 972
#>  $ region_params: tibble [4 × 3] (S3: tbl_df/tbl/data.frame)
#>   ..$ region           : chr [1:4] "[1,547)" "[547,822)" "[822,972)" "[972,1.1e+03)"
#>   ..$ param_mu         : num [1:4] 35.3 58.1 96.7 155.9
#>   ..$ param_sigma_hatsq: Named num [1:4] 127 372 924 2442
#>   .. ..- attr(*, "names")= chr [1:4] "[1,547)" "[547,822)" "[822,972)" "[972,1.1e+03)"
#>  $ model_params : NULL
#>  $ fitted_values: num [1:1096] 35.3 35.3 35.3 35.3 35.3 ...
#>  $ model_name   : chr "meanvar"
#>  - attr(*, "class")= chr "mod_cpt"

In addition to the generic functions listed above, models implement methods for the generic functions fitted(), residuals(), coef(), augment(), tidy(), and plot().

