Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Seasonal adjustment of weekly data

License

Unknown, MIT licenses found

Licenses found

Unknown
LICENSE
MIT
LICENSE.md
NotificationsYou must be signed in to change notification settings

timginker/boiwsa

Repository files navigation

boiwsa is an R package for seasonal adjustment and forecasting ofweekly data. It provides a simple, easy-to-use interface for calculatingthe seasonally adjusted estimates, as well as a number of diagnostictools for evaluating the quality of the adjustments. In this tutorial weillustrate its functionality in seasonal adjustment and forecasting ofthe weekly data.

The seasonal adjustment procedure approach aligns closely with thelocally-weighted least squares procedure introduced by Cleveland etal. (2014) albeit with several adjustments. Firstly, instead of relyingon differencing to detrend the data, we opt for a more explicit approachby directly estimating the trend through smoothing techniques. Secondly,we incorporate a variation of Discount weighted regression (Harrison andJohnston, 1984) to enable the seasonal component to evolve dynamicallyover time.

We consider the following decomposition model of the observed series$y_{t}$:

$$y_{t}=T_{t}+S_{t}+H_{t}+O_{t}+I_{t},$$

where$T_{t}$,$S_{t}$,$H_{t}$,$O_{t}$ and$I_{t}$ represent thetrend, seasonal, outlier, holiday- and trading-day, and irregularcomponents, respectively. To evade ambiguity, it is important to notethat in weekly data,$t$ typically denotes the date of the last daywithin a given week. The seasonal component is specified usingtrigonometric variables as:

$$\begin{eqnarray*}S_{t} &=&\sum_{k=1}^{K}\left( \alpha _{k}^{y}\sin (\frac{2\pi kD_{t}^{y}}{n_{t}^{y}})+\beta _{k}^{y}\cos (\frac{2\pi kD_{t}^{y}}{n_{t}^{y}})\right) +\\\&&\sum_{l=1}^{L}\left( \alpha _{l}^{m}\sin (\frac{2\pi lD_{t}^{m}}{n_{t}^{m}})+\beta _{l}^{m}\cos (\frac{2\pi lD_{t}^{m}}{n_{t}^{m}})\right) ,\end{eqnarray*}$$

where$D_{t}^{y}$ and$D_{t}^{m}$ are the day of the year and the day ofthe month, and$n_{t}^{y}$ and$n_{t}^{m}$ are the number of days in thegiven year or month. Thus, the seasonal adjustment procedure takes intoaccount the existence of two cycles, namely intrayearly andintramonthly.

Like the X-11 method (Ladiray and Quenneville, 2001), theboiwsaprocedure uses an iterative principle to estimate the variouscomponents. The seasonal adjustment algorithm comprises eight steps,which are documented below:

  • Step 1: Estimation of trend ($T_{t}^{(1)}$) usingstats::supsmu().

  • Step 2: Estimation of the Seasonal-Irregular component:

$$y_{t}-T_{t}^{(1)}=S_{t}+H_{t}+O_{t}+I_{t}$$

  • Step 2*: Searching for additive outliers using the method proposed byFindley et al. (1998)

  • Step 2**: Identifying the optimal number of trigonometric variables

  • Step 3: Calculation of seasonal factors, along with other potentialfactors such as$H_{t}$ or$O_{t}$, is done through DWR on theseasonal-irregular component extracted in Step 2. In this application,the discounting rate decays over the years. For each year$t$ and theobserved year$\tau$, a geometrically decaying weight function isrepresented as:$w_{t}=r^{|t-\tau|}$, where$r \in (0,1]$. Severalimportant points are worth mentioning. First, when$r=1$, the methodsimplifies to ordinary least squares regression with constantseasonality. On the contrary, smaller values of$r$ permit a morerapid rate of change in the seasonal component. However, it is advisedagainst setting it below 0.5 to prevent overfitting. In addition, thechoice of$r$ affects the strength of revisions in the seasonallyadjusted data, with higher values of$r$ leading to potentiallystronger revisions. Second, our methodology differs from theconventional one-way discounting, enabling the inclusion of futureobservations in the computation of seasonal factors. This approachcircumvents the limitations of the forecasting methods discussed inBandara et al. (2024). Finally, the choice of year-based discountingis driven by the fact that in traditional discount-weightedregression, even with a conservative choice of$r=0.95$, in weeklydata, observations separated by more than 2 years would carry nearlynegligible weight. Therefore, the use of year-based discountingprevents an overly rapid decay which may potentially lead to unstableestimates of the seasonal component.

  • Step 4: Estimation of trend ($T_{t}^{(2)}$) from seasonally andoutlier adjusted series usingstats::supsmu() function (R Core Team,2013)

  • Step 5: Estimation of the Seasonal-Irregular component:$$y_{t}-T_{t}^{(2)}=S_{t}+H_{t}+O_{t}+I_{t}$$

  • Step 6: Computing the final seasonal factors (and possibly otherfactors such as$H_{t}$ or$O_{t}$) using discount weightedregression, as in step 3.

  • Step 7: Estimation of the final seasonally adjusted series:$$y_{t}-S_{t}-H_{t}$$

  • Step 8: Computing final trend ($T_{t}^{(3)}$) estimate from seasonallyand outlier adjusted series usingstats::supsmu().

Installation

To install boiwsa, you can use devtools:

# install.packages("devtools")devtools::install_github("timginker/boiwsa")

Alternatively, you can clone the repository and install the package fromsource:

git clone https://github.com/timginker/boiwsa.gitcd boiwsaR CMD INSTALL.

Usage

Example 1: Gasoline production in the US

Usingboiwsa is simple. First, load theboiwsa package:

library(boiwsa)

Next, load your time series data into a data frame object. Here is anexample that is based on thegasoline data from the US EnergyInformation Administration that we copied from the from thefpp2package:

data("gasoline.data")plot(gasoline.data$date,gasoline.data$y,type="l"     ,xlab="Year",ylab="",main="Weekly US gasoline production")

Once you have your data loaded, you can use theboiwsa function toperform weekly seasonal adjustment:

res<- boiwsa(x=gasoline.data$y,dates=gasoline.data$date)

In general, the procedure can be applied with minimum interventions andrequires only the series to be adjusted (x argument) and theassociated dates (dates argument) provided in a date format. Unlessspecified otherwise (i.e.,my.k_l = NULL), the procedure automaticallyidentifies the best number of trigonometric variables to model theintra-yearly ($K$) and intra-monthly ($L$) seasonal cycles based on theAkaike Information Criterion corrected for small sample sizes (AICc).The information criterion can be adjusted through theic option. Likeother software, there are three options: “aic”, “aicc”, and “bic”. Theweighting decay rate is specified byr. By default$r=0.8$ which issimilar to what is customary in the literature (see Ch. 2 in Harvey(1990)).

In addition, the procedure automatically searches for additive outliers(AO) using the method described in Appendix C of Findley et al. (1998).To disable the automatic AO search, setauto.ao.search = F. To adduser-defined AOs, use theao.list option. As suggested by Findley etal. (1998), the$t$-statistic threshold for outlier search is by defaultset to$3.8$. However, since high-frequency data are generally morenoisy (Proietti and Pedregal, 2023), it could be advantageous toconsider setting a higher threshold by adjusting theout.thresholdargument.

Theboiwsa function returns an S3 class object containing the results.The seasonally adjusted series is stored in a vector calledsa. Theestimated seasonal factors are stored assf. In addition, the user cansee the number of trigonometric terms chosen in automatic search(my.k_l) and the position of additive outliers (ao.list) found bythe automatic routine.

After the seasonal adjustment, we can plot the adjusted data tovisualize the seasonal pattern:

plot(res)

To assess the quality of the adjustment, we can plot the autoregressivespectrum of the original and seasonally adjusted data, as illustrated inthe code below:

plot_spec(res)

It is evident that the series originally had a single intra-yearlyseasonal cycle, but this component was completely removed by theprocedure.

We can also inspect the output to check if the number of trigonometricterms chosen by the automatic procedure matches our visual findings:

print(res)#>#>  number of yearly cycle variables:  12#>  number of monthly cycle variables:  0#>  list of additive outliers:  1998-03-28

As can be seen, the number of yearly terms,$K$, is 12 and the number ofmonthly terms is zero, which is consistent with the observed spectrum.

Example 2: Initial unemployment claims in Israel

Here, we consider the weekly number of initial registrations at theIsraeli Employment Service. Registration and reporting at the EmploymentService are mandatory prerequisites for individuals seeking to receivean unemployment benefit. Therefore, applicants are expected to registerpromptly after their employment has been terminated. Given that mostemployment contracts conclude toward the end of the month, an increasednumber of applications is anticipated at the beginning of the month,leading to an intra-monthly seasonal pattern. Additionally, as can beseen in the Figure below, on an annual basis, three distinct peaks areobserved, with the final one occurring in August. This peak is closelytied to seasonal workers, leading to the creation of an intra-yearlycycle.

library(tidyverse)#> ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──#> ✔ dplyr     1.1.4     ✔ readr     2.1.4#> ✔ forcats   1.0.0     ✔ stringr   1.5.1#> ✔ ggplot2   3.4.4     ✔ tibble    3.2.1#> ✔ lubridate 1.9.3     ✔ tidyr     1.3.0#> ✔ purrr     1.0.2#> ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──#> ✖ dplyr::filter() masks stats::filter()#> ✖ dplyr::lag()    masks stats::lag()#> ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errorsggplot()+  geom_line(aes(x=lbm$date,y=lbm$IES_IN_W_ADJ  ),color="royalblue")+  theme_bw()+  ylab("Number of claims per week")+  xlab("Year")

Furthermore, each year, there are two weeks in which the activityplunges to nearly zero due to the existence of two moving holidaysassociated with Rosh Hashanah and Pesach. Moreover, a working day effectis expected, which leads to a reduced number of applications in weekswith fewer working days. These effects are captured and modeled throughadditional variables generated by the dedicated functions inboiwsa.

To generate a working day variable, we use theboiwsa::simple_tdfunction, designed to aggregate the count of full working days within aweek and normalize it. This function requires two parameters: the datadates and adata.frame object containing information about workingdays. Thedata.frame should be in a daily frequency and contain twocolumns: “date” and “WORKING_DAY_PART”. For a complete working day, the“WORKING_DAY_PART” column should be assigned a value of 1, for a halfworking day 0.5, and for a holiday, the value should be set to 0.

Moving holiday variables can be created using theboiwsa::genholfunction. These variables are computed using the Easter formula in Table2 of Findley et al. (1998), with the calendar centering to avoid bias,as indicated in the documentation. In the present example, the impact ofeach holiday is concentrated within a single week, resulting in anoticeable drop and subsequent increase in the number of registrationsduring the following week. To account for this effect, we employ dummyvariables that are globally centered. These dummy variables are createdusing a custom function -boiwsa::my_rosh, which is created for thisillustrative scenario.

The code below illustrates the entire process based on theboiwsa::lbm dataset: creation of working day adjustment variablesusing theboiwsa::simple_td function; creation of moving holidayvariables using the dedicated functions, and adding the combined inputinto theboiwsa::boiwsa function.

# creating an input for simple_tddates_il %>%  select(DATE_VALUE,ISR_WORKING_DAY_PART) %>%  `colnames<-`(c("date","WORKING_DAY_PART")) %>%  mutate(date= as.Date(date))->df.td# creating a matrix with a working day variabletd<- simple_td(dates=lbm$date,df.td=df.td)# generating the Rosh Hashanah and Pesach moving holiday variablesrosh<- my_rosh(dates=lbm$date,holiday.dates=holiday_dates_il$rosh)# renaming (make sure that all the variables in H have distinct names)colnames(rosh)<- paste0("rosh", colnames(rosh))pesach<- my_rosh(dates=lbm$date,holiday.dates=holiday_dates_il$pesah,start=3,end=-1)colnames(pesach)<- paste0("pesach", colnames(pesach))# combining the prior adjustment variables in a single matrixH<- as.matrix(cbind(rosh[,-1],pesach[,-1],td[,-1]))# running seasonal adjustment routineres<- boiwsa(x=lbm$IES_IN_W_ADJ,dates=lbm$date,H=H,out.threshold=5)

Subsequently, we can visually examine the results of the procedure:

plot(res)

As we can see in the plot, the procedure has successfully eliminated theannual and monthly seasonal cycles, along with the influences of movingholidays.

Following a thorough visual examination of the seasonally adjusted data,we can now move forward with the spectrum diagnostics. As illustrated inthe Figure below, corroborating our initial analysis of potentialunderlying seasonal patterns, it becomes evident that the data has twodistinct seasonal cycles. Additionally, it is noteworthy that ourprocedure successfully removed the corresponding peaks, therebyhighlighting its effectiveness.

plot_spec(res)

Forecasting

It is also possible to forecast weekly data using the predict method forboiwsa. The prediction is based on theforecast::auto.arima modelfitted to the seasonally and outlier-adjusted data, which is thencombined with the seasonal factor estimates fromboiwsa.

The code below illustrates the entire process:

x<-boiwsa::lbm$IES_IN_W_ADJdates<-boiwsa::lbm$date# train-test split (using 90% of observations for train)n_train<- round(length(x)*0.9)n_test<- (length(x)-n_train)h_train<-H[1:n_train, ]h_test<- tail(H,n_test)x_train<-x[1:n_train]x_test<- tail(x,n_test)dates_train<-dates[1:n_train]dates_test<- tail(dates,n_test)fit<- boiwsa(x=x_train,dates=dates_train,H=h_train)fct<- predict(fit,n.ahead=n_test,new_H=h_test)# Visualizing predictionsplot(dates_test,x_test,type="l",ylab="Number of claims per week",xlab="Date")lines(dates_test,fct$forecast$mean,col="red")legend("topright",legend= c("Original","Predicted"),lwd= c(2,1),col= c("black","red"),bty="n")

References

Bandara, K., Hyndman, R. J. and C. Bergmeir (2024). MSTL: Aseasonal-trend decomposition algorithm for time series with multipleseasonal patterns. International Journal of Operational Research, 2024.In press.

Cleveland, W.P., Evans, T.D. and S. Scott (2014). Weekly SeasonalAdjustment-A Locally-weighted Regression Approach (No. 473). Bureau ofLabor Statistics.

Findley, D.F., Monsell, B.C., Bell, W.R., Otto, M.C. and B.C Chen(1998). New capabilities and methods of the X-12-ARIMAseasonal-adjustment program. Journal of Business & Economic Statistics,16(2), pp.127-152.

Harrison, P. J. and F. R. Johnston (1984). Discount weighted regression.Journal of the Operational Research Society 35(10), 923–932.

Hyndman, R. (2023). fpp2: Data for “Forecasting: Principles andPractice” (2nd Edition). R package version 2.5.

Harvey, A. C. (1990). Forecasting, structural time series models and theKalman filter. Cambridge University Press.

Ladiray, D. and B. Quenneville (2001). Seasonal adjustment with the X-11method.

Proietti, T. and D. J. Pedregal (2023). Seasonality in High FrequencyTime Series. Econometrics and Statistics, 27: 62–82, 2023.

R Core Team (2013). R: A Language and Environment for StatisticalComputing. Vienna, Austria: R Foundation for Statistical Computing. ISBN3-900051-07-0.

Disclaimer

The views expressed here are solely of the author and do not necessarilyrepresent the views of the Bank of Israel.

Please note thatboiwsa is still under development and may containbugs or other issues that have not yet been resolved. While we have madeevery effort to ensure that the package is functional and reliable, wecannot guarantee its performance in all situations.

We strongly advise that you regularly check for updates and install anynew versions that become available, as these may contain important bugfixes and other improvements. By using this package, you acknowledge andaccept that it is provided on an “as is” basis, and that we make nowarranties or representations regarding its suitability for yourspecific needs or purposes.

About

Seasonal adjustment of weekly data

Topics

Resources

License

Unknown, MIT licenses found

Licenses found

Unknown
LICENSE
MIT
LICENSE.md

Stars

Watchers

Forks

Packages

No packages published

Contributors2

  •  
  •  

Languages


[8]ページ先頭

©2009-2025 Movatter.jp