
The finnts package, commonly referred to as “Finn”, is a standardized time series forecast framework developed by Microsoft Finance. It is the result of years of effort to perfect a centralized forecasting practice that everyone in finance could leverage. Even though it was built for finance-like forecasts, it can easily be extended to any type of time series forecast.

Finn takes years of hard work and thousands of lines of code, and simplifies the forecasting process down to one line of code. A single function, “forecast_time_series”, takes in historical data and applies dozens of models to produce a state-of-the-art forecast. While simplifying the forecasting process down to a single function call might seem limiting, Finn actually allows for a lot of flexibility under the hood. To leverage the best components of Finn, please check out all of the other vignettes within the package.

1. Bring Data

Data used in Finn needs to follow a few requirements, called out below.

Data is tabular, formatted as a data frame, tibble, or Spark DataFrame.
Needs a time stamp or date column, which must be formatted as a date and labeled “Date”. The date values need to start at the beginning of the period. For example, a monthly data set needs each date to fall on the first day of the month. For a quarterly forecast, the first day of the quarter, etc.
Contains at least one unique label to identify one time series from another. These are sometimes referred to as “data combos” or “combo variables” in Finn. For example, a monthly forecast by country should have a column with country names to help Finn split out each country into separate time series.
No duplicate rows at the intersection of data combos and date.
Column headers should contain only letters, numbers, and underscores. They should also start with a letter, not a number. These requirements ensure that R/Python handle your data frame correctly without any errors.
External regressors are optional; they are not required to produce a Finn forecast. To learn more about how to use them, please check out the vignette on external regressors.
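The requirements above can be checked before handing data to Finn. The following is a minimal sketch in base R that assumes monthly data; `check_finn_input` is a hypothetical helper written for illustration and is not part of the finnts package.

```r
# Hypothetical helper (NOT part of finnts): returns a character vector of
# problems found, or an empty vector if the data frame looks usable.
# Assumes monthly data, so every date should be the first day of a month.
check_finn_input <- function(df, combo_variables) {
  problems <- character(0)

  # Column headers: only letters, numbers, underscores; must start with a letter
  bad_names <- names(df)[!grepl("^[A-Za-z][A-Za-z0-9_]*$", names(df))]
  if (length(bad_names) > 0) {
    problems <- c(problems, paste("Invalid column names:", paste(bad_names, collapse = ", ")))
  }

  # At least one combo variable must be present to separate the time series
  missing_combos <- setdiff(combo_variables, names(df))
  if (length(missing_combos) > 0) {
    problems <- c(problems, paste("Missing combo variables:", paste(missing_combos, collapse = ", ")))
  }

  # A column labeled "Date" that is stored as a date
  if (!("Date" %in% names(df)) || !inherits(df$Date, "Date")) {
    problems <- c(problems, "Missing a 'Date' column of class Date")
  } else {
    # Dates must sit at the start of the period (first of the month here)
    if (any(format(df$Date, "%d") != "01", na.rm = TRUE)) {
      problems <- c(problems, "Dates do not all fall on the first day of the month")
    }
    # No duplicate rows at the intersection of combos and date
    if (length(missing_combos) == 0 &&
        any(duplicated(df[, c(combo_variables, "Date"), drop = FALSE]))) {
      problems <- c(problems, "Duplicate rows found for a combo and date")
    }
  }

  problems
}

# Toy data for illustration only
good <- data.frame(
  id = rep(c("A", "B"), each = 2),
  Date = as.Date(c("2023-01-01", "2023-02-01", "2023-01-01", "2023-02-01")),
  value = c(1, 2, 3, 4)
)
bad <- good
bad$Date[2] <- as.Date("2023-01-01") # duplicates row 1 for combo "A"
bad$Date[4] <- as.Date("2023-02-15") # not the first day of the month

print(check_finn_input(good, "id"))
print(check_finn_input(bad, "id"))
```

For quarterly or yearly data the start-of-period check would need to be adapted, for example by also restricting the month.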

A good way to produce your first Finn forecast is to leverage existing data examples from the timetk package. Let’s take a monthly example and trim it down to speed up the run time of your first Finn forecast.

library(finnts)

hist_data <- timetk::m4_monthly %>%
  dplyr::filter(date >= "2013-01-01") %>%
  dplyr::rename(Date = date) %>%
  dplyr::mutate(id = as.character(id))

print(hist_data)
#> # A tibble: 120 × 3
#>    id    Date       value
#>    <chr> <date>     <dbl>
#>  1 M1    2013-01-01  9120
#>  2 M1    2013-02-01  8280
#>  3 M1    2013-03-01  7860
#>  4 M1    2013-04-01  7150
#>  5 M1    2013-05-01  8110
#>  6 M1    2013-06-01 10860
#>  7 M1    2013-07-01 10730
#>  8 M1    2013-08-01  9610
#>  9 M1    2013-09-01  8270
#> 10 M1    2013-10-01  9200
#> # ℹ 110 more rows

print(unique(hist_data$id))
#> [1] "M1"    "M2"    "M750"  "M1000"

The above data set contains 4 individual time series, identified using the “id” column.

2. Create Finn Forecast

Before we call the Finn forecast function, let’s first set up some run information using set_run_info(). This helps log all components of our Finn forecast successfully.

run_info <- set_run_info(
  experiment_name = "finn_forecast",
  run_name = "test_run"
)

Calling the “forecast_time_series” function is the easiest part. In this example we will be running just two models.

# no need to assign it to a variable, since all of the outputs are written to disk :)
forecast_time_series(
  run_info = run_info,
  input_data = hist_data,
  combo_variables = c("id"),
  target_variable = "value",
  date_type = "month",
  forecast_horizon = 3,
  back_test_scenarios = 6,
  models_to_run = c("arima", "ets"),
  return_data = FALSE
)

3. Get Forecast Outputs

Initial Finn Outputs

finn_output_tbl <- get_forecast_data(run_info = run_info)

print(finn_output_tbl)

Future Forecast

future_forecast_tbl <- finn_output_tbl %>%
  dplyr::filter(Run_Type == "Future_Forecast")

print(future_forecast_tbl)

Back Test Results

back_test_tbl <- finn_output_tbl %>%
  dplyr::filter(Run_Type == "Back_Test")

print(back_test_tbl)
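Back test rows pair a forecast with the actual value for the same period, so they can be summarized into an accuracy figure per model. The sketch below is illustrative only: it uses a small made-up table rather than real Finn output, the `Target` and `Forecast` column names and the model IDs are assumptions, and mean absolute percentage error (MAPE) is chosen for simplicity, not because it is necessarily the metric Finn uses to pick the best model.

```r
# Toy stand-in for back_test_tbl. The Target and Forecast column names and
# the Model_ID values are ASSUMPTIONS for illustration, not real Finn output.
toy_back_test <- data.frame(
  Combo = c("M1", "M1", "M1", "M1"),
  Model_ID = c("arima", "arima", "ets", "ets"),
  Target = c(100, 200, 100, 200),
  Forecast = c(110, 180, 100, 150)
)

# Absolute percentage error for each back test row
toy_back_test$abs_pct_error <-
  abs(toy_back_test$Target - toy_back_test$Forecast) / abs(toy_back_test$Target)

# Average the row-level errors by time series and model
mape_tbl <- aggregate(
  abs_pct_error ~ Combo + Model_ID,
  data = toy_back_test,
  FUN = mean
)
names(mape_tbl)[names(mape_tbl) == "abs_pct_error"] <- "MAPE"

print(mape_tbl)
```

In this toy table the "arima" rows average to a lower error than the "ets" rows, so "arima" would rank first under this particular metric.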

Back Test Best Model per Time Series

best_model_tbl <- finn_output_tbl %>%
  dplyr::filter(Best_Model == "Yes") %>%
  dplyr::select(Combo, Model_ID, Model_Name, Model_Type, Recipe_ID) %>%
  dplyr::distinct()

print(best_model_tbl)

Note: the best model for the “M1” combination is a simple average of the “arima” and “ets” models.
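To make the idea of a simple average concrete, here is a minimal sketch in base R. The numbers are invented for illustration and are not Finn output, and the code shows only the arithmetic of an equal-weight average, not how Finn builds it internally.

```r
# Hypothetical 3-month forecasts from two models (made-up values)
arima_fcst <- c(9000, 9500, 10000)
ets_fcst <- c(9400, 9300, 10400)

# Equal-weight average of the two forecasts, period by period
avg_fcst <- rowMeans(cbind(arima_fcst, ets_fcst))

print(avg_fcst)
```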

Trained Models

trained_model_tbl <- get_trained_models(run_info = run_info)

print(trained_model_tbl)

Initial Prepped Data

R1_prepped_data_tbl <- get_prepped_data(
  run_info = run_info,
  recipe = "R1"
)

print(R1_prepped_data_tbl)

R2_prepped_data_tbl <- get_prepped_data(
  run_info = run_info,
  recipe = "R2"
)

print(R2_prepped_data_tbl)

Run Info Metadata

run_info_tbl <- get_run_info(experiment_name = "finn_forecast")

print(run_info_tbl)