The finnts package, commonly referred to as “Finn”, is a standardized time series forecast framework developed by Microsoft Finance. It’s a result of years of effort trying to perfect a centralized forecasting practice that everyone in finance could leverage. Even though it was built for finance-like forecasts, it can easily be extended to any type of time series forecast.
Finn takes years of hard work and thousands of lines of code, and simplifies the forecasting process down to one line of code. A single function, “forecast_time_series”, takes in historical data and applies dozens of models to produce a state-of-the-art forecast. While simplifying the forecasting process down to a single function call might seem limiting, Finn actually allows for a lot of flexibility under the hood. To leverage the best components of Finn, please check out all of the other vignettes within the package.
Getting started with Finn is as simple as 1..2..3
Data used in Finn needs to follow a few requirements, called out below.
A good way to produce your first Finn forecast is to leverage existing example data from the timetk package. Let’s take a monthly example and trim it down to speed up the run time of your first Finn forecast.
library(finnts)

hist_data <- timetk::m4_monthly %>%
  dplyr::filter(date >= "2013-01-01") %>%
  dplyr::rename(Date = date) %>%
  dplyr::mutate(id = as.character(id))

print(hist_data)
#> # A tibble: 120 × 3
#>    id    Date       value
#>    <chr> <date>     <dbl>
#>  1 M1    2013-01-01  9120
#>  2 M1    2013-02-01  8280
#>  3 M1    2013-03-01  7860
#>  4 M1    2013-04-01  7150
#>  5 M1    2013-05-01  8110
#>  6 M1    2013-06-01 10860
#>  7 M1    2013-07-01 10730
#>  8 M1    2013-08-01  9610
#>  9 M1    2013-09-01  8270
#> 10 M1    2013-10-01  9200
#> # ℹ 110 more rows

print(unique(hist_data$id))
#> [1] "M1"    "M2"    "M750"  "M1000"

The above data set contains 4 individual time series, identified using the “id” column.
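The example above renames the time stamp column to `Date`, converts `id` to character, and keeps a numeric `value` column. As a rough sketch, those three properties can be checked before a run with a small helper. `check_finn_input()` is a hypothetical function written for this illustration, not part of finnts, and the three checks are inferred from the example rather than taken from an official requirements list. The toy data frame is made up.

```r
# Hypothetical helper (NOT part of finnts): checks the three properties that
# the example data above satisfies. Returns a character vector of problems,
# which is empty when every check passes.
check_finn_input <- function(data, combo_variables, target_variable) {
  problems <- character(0)

  # Assumption: the time stamp column is named "Date" and has class Date
  if (!"Date" %in% names(data) || !inherits(data[["Date"]], "Date")) {
    problems <- c(problems, "a 'Date' column of class Date is missing")
  }

  # Assumption: every column that identifies a time series is character
  for (col in combo_variables) {
    if (!col %in% names(data) || !is.character(data[[col]])) {
      problems <- c(problems, paste0("combo column '", col, "' is not character"))
    }
  }

  # Assumption: the target column is numeric
  if (!target_variable %in% names(data) || !is.numeric(data[[target_variable]])) {
    problems <- c(problems, paste0("target column '", target_variable, "' is not numeric"))
  }

  problems
}

# Toy data with made-up values, shaped like hist_data
toy_data <- data.frame(
  id = c("A", "A", "B", "B"),
  Date = as.Date(c("2013-01-01", "2013-02-01", "2013-01-01", "2013-02-01")),
  value = c(10, 12, 20, 19),
  stringsAsFactors = FALSE
)

problems <- check_finn_input(toy_data, combo_variables = "id", target_variable = "value")

# A factor id column (the kind as.character() converts above) is flagged
bad_data <- toy_data
bad_data$id <- factor(bad_data$id)
bad_problems <- check_finn_input(bad_data, combo_variables = "id", target_variable = "value")
print(bad_problems)
```

Returning a vector of messages instead of stopping at the first failure lets you see every issue with a data set in a single pass.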
Before we call the Finn forecast function, let’s first set up some run information using set_run_info(). This helps log all components of our Finn forecast successfully.
Calling the “forecast_time_series” function is the easiest part. In this example we will be running just two models.
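A minimal sketch of that setup step is shown here. It assumes `set_run_info()` accepts `experiment_name` and `run_name` arguments. The two names are placeholders chosen for this illustration, so substitute your own. The result is stored in `run_info` so that it can be passed to the forecast call.

```r
library(finnts)

# Placeholder names (assumptions for this sketch, not required values)
run_info <- set_run_info(
  experiment_name = "finn_fcst",
  run_name = "test_run"
)
```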
# no need to assign it to a variable, since all of the outputs are written to disk :)
forecast_time_series(
  run_info = run_info,
  input_data = hist_data,
  combo_variables = c("id"),
  target_variable = "value",
  date_type = "month",
  forecast_horizon = 3,
  back_test_scenarios = 6,
  models_to_run = c("arima", "ets"),
  return_data = FALSE
)

# read the forecast outputs back from disk before filtering them
finn_output_tbl <- get_forecast_data(run_info = run_info)

best_model_tbl <- finn_output_tbl %>%
  dplyr::filter(Best_Model == "Yes") %>%
  dplyr::select(Combo, Model_ID, Model_Name, Model_Type, Recipe_ID) %>%
  dplyr::distinct()

print(best_model_tbl)

Note: the best model for the “M1” combination is a simple average of “arima” and “ets” models.