The ML.EXPLAIN_FORECAST function

This document describes the ML.EXPLAIN_FORECAST function, which lets you generate forecasts that are based on a trained time series model. It only works on ARIMA_PLUS models with the training option decompose_time_series enabled or on ARIMA_PLUS_XREG models. The ML.EXPLAIN_FORECAST function encompasses the ML.FORECAST function because its output is a superset of the results of ML.FORECAST.

Syntax

# `ARIMA_PLUS` models:
ML.EXPLAIN_FORECAST(
  MODEL `PROJECT_ID.DATASET.MODEL`,
  STRUCT(
    [HORIZON AS horizon]
    [, CONFIDENCE_LEVEL AS confidence_level]))

# `ARIMA_PLUS_XREG` models:
ML.EXPLAIN_FORECAST(
  MODEL `PROJECT_ID.DATASET.MODEL`,
  STRUCT(
    [HORIZON AS horizon]
    [, CONFIDENCE_LEVEL AS confidence_level]),
  { TABLE `PROJECT_ID.DATASET.TABLE` | (QUERY_STATEMENT) })

Note: No input data is required for ARIMA_PLUS models.

Arguments

ML.EXPLAIN_FORECAST takes the following arguments:

PROJECT_ID: the project that contains the resource.

DATASET: the dataset that contains the resource.

MODEL: the name of the model.

TABLE: the name of the input table that contains the data to be evaluated.
If TABLE is specified, the input column names in the table must match the column names in the model, and their types should be compatible according to BigQuery implicit coercion rules.
If there are unused columns from the table, they are ignored.
The TABLE argument is required for the ARIMA_PLUS_XREG model.
QUERY_STATEMENT: the GoogleSQL query that is used to generate the evaluation data. For the supported SQL syntax for the QUERY_STATEMENT clause in GoogleSQL, see Query syntax.
If QUERY_STATEMENT is specified, the input column names from the query must match the column names in the model, and their types should be compatible according to BigQuery implicit coercion rules.
HORIZON: an INT64 value that specifies the number of time points to forecast. The maximum value is the horizon value specified in the CREATE MODEL statement for the time series model, or 1000 if unspecified. The default value is 3. When forecasting multiple time series at the same time, this parameter applies to each time series.
Note: Forecasting takes place when the CREATE MODEL statement runs. The ML.EXPLAIN_FORECAST function retrieves the forecasting values and computes the prediction intervals. If you want to filter results when you're forecasting multiple time series, use the ML.EXPLAIN_FORECAST function. To save query time, specify a value for the HORIZON option in the CREATE MODEL statement.
CONFIDENCE_LEVEL: a FLOAT64 value that specifies the percentage of the future values that fall in the prediction interval. The valid input range is [0, 1). The default value is 0.95.

Output

The ML.EXPLAIN_FORECAST function returns the following columns:

time_series_id_col or time_series_id_cols: a value that contains the identifiers of a time series. time_series_id_col can be an INT64 or STRING value. time_series_id_cols can be an ARRAY<INT64> or ARRAY<STRING> value. Only present when forecasting multiple time series at once. The column names and types are inherited from the TIME_SERIES_ID_COL option as specified in the CREATE MODEL statement.
time_series_timestamp: a TIMESTAMP value that contains the timestamp of the time series. This column has a type of TIMESTAMP regardless of the type of the input time_series_timestamp_col. For each time series, the output rows are sorted in chronological order by the time_series_timestamp value.
time_series_type: a STRING value that contains either history or forecast. The rows that have a value of history in this column are used in training, either directly from the training table, or from interpolation using the training data.
time_series_data: a FLOAT64 value that contains the data of the time series. For rows that have a value of history in the time_series_type column, time_series_data is either the training data or the interpolated value using the training data. For rows that have a value of forecast in the time_series_type column, time_series_data is the forecast value.
time_series_adjusted_data: a FLOAT64 value that contains the adjusted data of the time series. For rows that have a value of history in the time_series_type column, this is the value after cleaning spikes and dips, adjusting the step changes, and removing the residuals. It is the aggregation of all the valid components: holiday effect, seasonal components, and trend. For rows that have a value of forecast in the time_series_type column, this is the forecast value, which is the same as the value of time_series_data.
standard_error: a FLOAT64 value that contains the standard error of the residuals during the ARIMA fitting. The values are the same for all rows that have a value of history in the time_series_type column. For rows that have a value of forecast in the time_series_type column, this value increases with time, as the forecast values become less reliable.
confidence_level: a FLOAT64 value that contains the user-specified confidence level or, if unspecified, the default value. This value is the same for all rows that have a value of forecast in the time_series_type column. This value is NULL for all rows that have a value of history in the time_series_type column, because prediction intervals are computed only for forecast rows.
prediction_interval_lower_bound: a FLOAT64 value that contains the lower bound of the prediction result. Only rows that have a value of forecast in the time_series_type column have values other than NULL in this column.
prediction_interval_upper_bound: a FLOAT64 value that contains the upper bound of the prediction result. Only rows that have a value of forecast in the time_series_type column have values other than NULL in this column.
trend: a FLOAT64 value that contains the long-term increase or decrease in the time series data.
seasonal_period_yearly: a FLOAT64 value that contains the time series data value affected by the time of the year. This value is NULL if no yearly effect is found.
seasonal_period_quarterly: a FLOAT64 value that contains the time series data value affected by the time of the quarter. This value is NULL if no quarterly effect is found.
seasonal_period_monthly: a FLOAT64 value that contains the time series data value affected by the time of the month. This value is NULL if no monthly effect is found.
seasonal_period_weekly: a FLOAT64 value that contains the time series data value affected by the time of the week. This value is NULL if no weekly effect is found.
seasonal_period_daily: a FLOAT64 value that contains the time series data value affected by the time of the day. This value is NULL if no daily effect is found.
holiday_effect: a FLOAT64 value that contains the time series data value affected by different holidays. This is the sum of the maximum positive individual holiday effect and the minimum negative individual holiday effect. This is shown in the following formula, where \(H\) is the overall holiday effect and \(h(i)\) is the individual holiday effect:
\[H=\max\limits_{h(i)>0} h(i) + \min\limits_{h(i)<0} h(i)\]
This value is NULL if no holiday effect is found.
spikes_and_dips: a FLOAT64 value that contains the unexpectedly high or low values of the time series. For rows that have a value of history in the time_series_type column, the value is NULL if no spike or dip is found. For rows that have a value of forecast in the time_series_type column, this value is NULL.
step_changes: a FLOAT64 value that contains the abrupt or structural change in the distributional properties of the time series. For rows that have a value of history in the time_series_type column, this value is NULL if no step change is found. For rows that have a value of forecast in the time_series_type column, this value is NULL.
residual: a FLOAT64 value that contains the difference between the actual time series and the fitted time series after model training. The residual value is only meaningful for historical data. For rows that have a value of forecast in the time_series_type column, the residual value is NULL.
holiday_effect_holiday_name: a FLOAT64 value that contains the time series data value affected by the holiday that's identified in holiday_name. If no holiday effect is found, this value is NULL.
There is one holiday_effect_holiday_name column for each holiday that's included in the model.
attribution_feature_name: a FLOAT64 value that contains the attribution of each feature to the final forecast. This only applies to ARIMA_PLUS_XREG models. The value is calculated by multiplying the weight of the feature with the feature value. This is shown in the following formula, where \(\beta_{fn}\) is the weight of feature fn in the linear regression and \(X_{fn}\) is the numericalized feature value:
\[attribution_{fn}=\beta_{fn} * X_{fn}\]
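
As a rough illustration of how these columns are typically read, the following query keeps only the forecast rows and returns each forecast value together with its standard error and prediction interval. This is a sketch, not a prescribed pattern: mydataset.mymodel is the placeholder model name used in the examples in this document, and the horizon and confidence level values are arbitrary.

```sql
-- Sketch only: `mydataset.mymodel` is a placeholder model name.
SELECT
  time_series_timestamp,
  time_series_data AS forecast_value,
  standard_error,
  prediction_interval_lower_bound,
  prediction_interval_upper_bound
FROM
  ML.EXPLAIN_FORECAST(
    MODEL `mydataset.mymodel`,
    STRUCT(30 AS horizon, 0.8 AS confidence_level))
WHERE
  -- History rows have NULL prediction interval bounds, so they are filtered out.
  time_series_type = 'forecast'
ORDER BY
  time_series_timestamp
```

If the model forecasts multiple time series, adding the time series ID column to the SELECT list and to the ORDER BY clause keeps the rows of each series together.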

Mathematical explanation

The mathematical relationship of the output columns is described in the following sections.

`time_series_data`

The time_series_data value is decomposed into several components for better explainability. For ARIMA_PLUS models, the component list includes the following:

trend
seasonal_period_yearly
seasonal_period_quarterly
seasonal_period_monthly
seasonal_period_weekly
seasonal_period_daily
holiday_effect
spikes_and_dips
step_changes
residual

For ARIMA_PLUS_XREG models, this list also includes the feature contribution attribution_feature_name. For future data, the spikes_and_dips, step_changes, and residual values aren't applicable.

The following formulas show what components make up the time_series_data value for historical and forecast data for time series models:

For ARIMA_PLUS models:

Historical data:

time_series_data = trend + seasonal_period_yearly + seasonal_period_quarterly + seasonal_period_monthly
                   + seasonal_period_weekly + seasonal_period_daily + holiday_effect
                   + spikes_and_dips + step_changes + residual

Forecast data:

time_series_data = trend + seasonal_period_yearly + seasonal_period_quarterly + seasonal_period_monthly
                   + seasonal_period_weekly + seasonal_period_daily + holiday_effect

For ARIMA_PLUS_XREG models:

Historical data:

time_series_data = trend + seasonal_period_yearly + seasonal_period_quarterly + seasonal_period_monthly
                   + seasonal_period_weekly + seasonal_period_daily + holiday_effect
                   + spikes_and_dips + step_changes + residual
                   + (attribution_feature_1 + ... + attribution_feature_n)

Forecast data:

time_series_data = trend + seasonal_period_yearly + seasonal_period_quarterly + seasonal_period_monthly
                   + seasonal_period_weekly + seasonal_period_daily + holiday_effect
                   + (attribution_feature_1 + ... + attribution_feature_n)
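
One way to sanity-check the ARIMA_PLUS formulas is to recompute the sum of the components next to time_series_data. The following query is a sketch under two assumptions that the formulas don't state: that mydataset.mymodel is an ARIMA_PLUS model with a single time series, and that a NULL component contributes nothing to the sum (it is replaced with 0 here). Because spikes_and_dips, step_changes, and residual are NULL on forecast rows, the same expression covers both the historical and the forecast formula.

```sql
-- Sketch only: `mydataset.mymodel` is a placeholder for an ARIMA_PLUS model.
-- Assumption: a NULL component contributes nothing, so IFNULL replaces it with 0.
SELECT
  time_series_timestamp,
  time_series_type,
  time_series_data,
  IFNULL(trend, 0)
    + IFNULL(seasonal_period_yearly, 0)
    + IFNULL(seasonal_period_quarterly, 0)
    + IFNULL(seasonal_period_monthly, 0)
    + IFNULL(seasonal_period_weekly, 0)
    + IFNULL(seasonal_period_daily, 0)
    + IFNULL(holiday_effect, 0)
    + IFNULL(spikes_and_dips, 0)
    + IFNULL(step_changes, 0)
    + IFNULL(residual, 0) AS component_sum
FROM
  ML.EXPLAIN_FORECAST(
    MODEL `mydataset.mymodel`,
    STRUCT(30 AS horizon, 0.8 AS confidence_level))
ORDER BY
  time_series_timestamp
```

If the formulas hold for the model, component_sum should agree with time_series_data up to floating-point rounding. For an ARIMA_PLUS_XREG model, the attribution columns of the model's features would have to be added to the sum in the same way.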

`time_series_adjusted_data`

The time_series_adjusted_data value is the value that remains after cleaning spikes and dips, adjusting the step changes, and removing the residuals. Its formula is the same for both historical and forecast data.

For ARIMA_PLUS models:

time_series_adjusted_data = trend + seasonal_period_yearly + seasonal_period_quarterly
                            + seasonal_period_monthly + seasonal_period_weekly + seasonal_period_daily
                            + holiday_effect

For ARIMA_PLUS_XREG models:

time_series_adjusted_data = trend + seasonal_period_yearly + seasonal_period_quarterly
                            + seasonal_period_monthly + seasonal_period_weekly + seasonal_period_daily
                            + holiday_effect + (attribution_feature_1 + ... + attribution_feature_n)

Note: For rows that have a value of forecast in the time_series_type column, the time_series_data and time_series_adjusted_data values are the same, because the forecast formulas contain no spikes_and_dips, step_changes, or residual terms.
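
Subtracting the time_series_adjusted_data formula from the historical time_series_data formula leaves only spikes_and_dips + step_changes + residual. The following sketch shows both sides of that difference for history rows. It assumes that mydataset.mymodel is a placeholder model with a single time series and that NULL components count as 0, which the formulas themselves don't state.

```sql
-- Sketch only: `mydataset.mymodel` is a placeholder model name.
-- Assumption: NULL spikes_and_dips or step_changes values count as 0.
SELECT
  time_series_timestamp,
  time_series_data - time_series_adjusted_data AS removed_by_adjustment,
  IFNULL(spikes_and_dips, 0)
    + IFNULL(step_changes, 0)
    + IFNULL(residual, 0) AS anomalies_plus_residual
FROM
  ML.EXPLAIN_FORECAST(
    MODEL `mydataset.mymodel`,
    STRUCT(30 AS horizon, 0.8 AS confidence_level))
WHERE
  time_series_type = 'history'
ORDER BY
  time_series_timestamp
```

Rows where removed_by_adjustment is large are a reasonable place to start when looking for the spikes, dips, or step changes that the model cleaned out of the training data.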

`holiday_effect`

The holiday_effect_holiday_name value is a subcomponent. The holiday_effect value is the sum of all the holiday_effect_holiday_name values. For example, if you specify holidays xmas and mlk, the formula is holiday_effect = holiday_effect_xmas + holiday_effect_mlk.
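
To inspect this relationship on real output, the per-holiday columns can be listed next to the overall holiday_effect value. The following query is only a sketch: the column names holiday_effect_xmas and holiday_effect_mlk are hypothetical and follow the xmas and mlk example above, so they must be replaced with the holiday columns that the model actually returns. Treating NULL as 0 is also an assumption.

```sql
-- Sketch only: `mydataset.mymodel` is a placeholder model name, and
-- holiday_effect_xmas and holiday_effect_mlk are hypothetical column names
-- taken from the xmas and mlk example. Use the model's real holiday columns.
SELECT
  time_series_timestamp,
  holiday_effect,
  holiday_effect_xmas,
  holiday_effect_mlk,
  IFNULL(holiday_effect_xmas, 0) + IFNULL(holiday_effect_mlk, 0) AS holiday_sum
FROM
  ML.EXPLAIN_FORECAST(
    MODEL `mydataset.mymodel`,
    STRUCT(30 AS horizon, 0.8 AS confidence_level))
WHERE
  -- Keep only the rows where some holiday effect was found.
  holiday_effect IS NOT NULL
ORDER BY
  time_series_timestamp
```

Comparing holiday_sum with holiday_effect row by row shows how the individual holidays combine into the overall value for a given model.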

`ARIMA_PLUS` example

The following example forecasts 30 time points with a confidence level of 0.8:

SELECT
  *
FROM
  ML.EXPLAIN_FORECAST(
    MODEL `mydataset.mymodel`,
    STRUCT(30 AS horizon, 0.8 AS confidence_level))

`ARIMA_PLUS_XREG` example

The following example forecasts 30 time points with a confidence level of 0.8, using future feature values that are supplied by a query:

SELECT
  *
FROM
  ML.EXPLAIN_FORECAST(
    MODEL `mydataset.mymodel`,
    STRUCT(30 AS horizon, 0.8 AS confidence_level),
    (SELECT * FROM `mydataset.mytable`))
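
The Syntax section also allows the future features to be passed with the TABLE clause instead of a query statement. The following sketch shows that form with the same placeholder names. It assumes that the column names in mydataset.mytable already match the feature columns of the model, so no renaming or casting in a query is needed.

```sql
-- Sketch only: `mydataset.mymodel` and `mydataset.mytable` are placeholder names.
-- Assumption: the table's column names already match the model's feature columns.
SELECT
  *
FROM
  ML.EXPLAIN_FORECAST(
    MODEL `mydataset.mymodel`,
    STRUCT(30 AS horizon, 0.8 AS confidence_level),
    TABLE `mydataset.mytable`)
```

The query statement form is the more flexible choice when the feature columns need to be renamed, cast, or filtered before forecasting.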

What's next

For more information about Explainable AI, see BigQuery Explainable AI overview.
For more information about supported SQL statements and functions for time series forecasting models, see End-to-end user journeys for time series forecasting models.

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-12-15 UTC.
