Forecast multiple time series with an ARIMA_PLUS univariate model

This tutorial teaches you how to use an ARIMA_PLUS univariate time series model to forecast the future value of a given column, based on the historical values for that column.

This tutorial forecasts for multiple time series. Forecasted values are calculated for each time point, for each value in one or more specified columns. For example, if you wanted to forecast weather and specified a column containing city data, the forecasted data would contain forecasts for all time points for City A, then forecasted values for all time points for City B, and so forth.

This tutorial uses data from the public bigquery-public-data.new_york.citibike_trips table. This table contains information about Citi Bike trips in New York City.

Before reading this tutorial, we highly recommend that you read Forecast a single time series with a univariate model.

Objectives

This tutorial guides you through completing the following tasks:

  • Creating a BigQuery dataset to store the model.
  • Creating an ARIMA_PLUS time series model that forecasts multiple time series at once.
  • Evaluating the model and inspecting its coefficients.
  • Using the model to forecast future values and explain the forecasting results.

Costs

This tutorial uses billable components of Google Cloud, including:

  • BigQuery
  • BigQuery ML

For more information about BigQuery costs, see the BigQuery pricing page.

For more information about BigQuery ML costs, see BigQuery ML pricing.

Before you begin

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Roles required to select or create a project

    • Select a project: Selecting a project doesn't require a specific IAM role—you can select any project that you've been granted a role on.
    • Create a project: To create a project, you need the Project Creator role (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.
    Note: If you don't plan to keep the resources that you create in this procedure, create a project instead of selecting an existing project. After you finish these steps, you can delete the project, removing all resources associated with the project.

    Go to project selector

  3. Verify that billing is enabled for your Google Cloud project.

  4. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Roles required to select or create a project

    • Select a project: Selecting a project doesn't require a specific IAM role—you can select any project that you've been granted a role on.
    • Create a project: To create a project, you need the Project Creator role (roles/resourcemanager.projectCreator), which contains theresourcemanager.projects.create permission.Learn how to grant roles.
    Note: If you don't plan to keep the resources that you create in this procedure, create a project instead of selecting an existing project. After you finish these steps, you can delete the project, removing all resources associated with the project.

    Go to project selector

  5. Verify that billing is enabled for your Google Cloud project.

  6. BigQuery is automatically enabled in new projects. To activate BigQuery in a pre-existing project, go to

    Enable the BigQuery API.

    Roles required to enable APIs

    To enable APIs, you need the Service Usage Admin IAM role (roles/serviceusage.serviceUsageAdmin), which contains the serviceusage.services.enable permission. Learn how to grant roles.

    Enable the API

Required permissions

  • To create the dataset, you need the bigquery.datasets.create IAM permission.

  • To create the model, you need the following permissions:

    • bigquery.jobs.create
    • bigquery.models.create
    • bigquery.models.getData
    • bigquery.models.updateData
  • To run inference, you need the following permissions:

    • bigquery.models.getData
    • bigquery.jobs.create

For more information about IAM roles and permissions in BigQuery, see Introduction to IAM.

Create a dataset

Create a BigQuery dataset to store your ML model.

Console

  1. In the Google Cloud console, go to the BigQuery page.

    Go to the BigQuery page

  2. In the Explorer pane, click your project name.

  3. Click View actions > Create dataset.

  4. On the Create dataset page, do the following:

    • For Dataset ID, enter bqml_tutorial.

    • For Location type, select Multi-region, and then select US (multiple regions in United States).

    • Leave the remaining default settings as they are, and click Create dataset.

bq

To create a new dataset, use the bq mk command with the --location flag. For a full list of possible parameters, see the bq mk --dataset command reference.

  1. Create a dataset named bqml_tutorial with the data location set to US and a description of BigQuery ML tutorial dataset:

    bq --location=US mk -d \
        --description "BigQuery ML tutorial dataset." \
        bqml_tutorial

    Instead of using the --dataset flag, the command uses the -d shortcut. If you omit -d and --dataset, the command defaults to creating a dataset.

  2. Confirm that the dataset was created:

    bq ls

API

Call the datasets.insert method with a defined dataset resource.

{
  "datasetReference": {
    "datasetId": "bqml_tutorial"
  }
}

BigQuery DataFrames

Before trying this sample, follow the BigQuery DataFrames setup instructions in the BigQuery quickstart using BigQuery DataFrames. For more information, see the BigQuery DataFrames reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up ADC for a local development environment.

import google.cloud.bigquery

bqclient = google.cloud.bigquery.Client()
bqclient.create_dataset("bqml_tutorial", exists_ok=True)

Visualize the input data

Before creating the model, you can optionally visualize your input time series data to get a sense of the distribution. You can do this by using Looker Studio.

SQL

The SELECT statement of the following query uses the EXTRACT function to extract the date information from the starttime column. The query uses COUNT(*) to get the daily total number of Citi Bike trips.

Follow these steps to visualize the time series data:

  1. In the Google Cloud console, go to the BigQuery page.

    Go to BigQuery

  2. In the query editor, paste in the following query and click Run:

    SELECT
      EXTRACT(DATE from starttime) AS date,
      COUNT(*) AS num_trips
    FROM
      `bigquery-public-data.new_york.citibike_trips`
    GROUP BY date;
  3. When the query completes, click Open in > Looker Studio. Looker Studio opens in a new tab. Complete the following steps in the new tab.

  4. In Looker Studio, click Insert > Time series chart.

  5. In the Chart pane, choose the Setup tab.

  6. In the Metric section, add the num_trips field, and remove the default Record Count metric. The resulting chart looks similar to the following:

    Chart showing bike trip data over time.

BigQuery DataFrames

Before trying this sample, follow the BigQuery DataFrames setup instructions in the BigQuery quickstart using BigQuery DataFrames. For more information, see the BigQuery DataFrames reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up ADC for a local development environment.

import bigframes.pandas as bpd

df = bpd.read_gbq("bigquery-public-data.new_york.citibike_trips")

features = bpd.DataFrame(
    {
        "num_trips": df.starttime,
        "date": df["starttime"].dt.date,
    }
)
num_trips = features.groupby(["date"]).count()

# Results from running "print(num_trips)"
#                num_trips
# date
# 2013-07-01      16650
# 2013-07-02      22745
# 2013-07-03      21864
# 2013-07-04      22326
# 2013-07-05      21842
# 2013-07-06      20467
# 2013-07-07      20477
# 2013-07-08      21615
# 2013-07-09      26641
# 2013-07-10      25732
# 2013-07-11      24417
# 2013-07-12      19006
# 2013-07-13      26119
# 2013-07-14      29287
# 2013-07-15      28069
# 2013-07-16      29842
# 2013-07-17      30550
# 2013-07-18      28869
# 2013-07-19      26591
# 2013-07-20      25278
# 2013-07-21      30297
# 2013-07-22      25979
# 2013-07-23      32376
# 2013-07-24      35271
# 2013-07-25      31084

num_trips.plot.line(
    # Rotate the x labels so they are more visible.
    rot=45,
)

Create the time series model

You want to forecast the number of bike trips for each Citi Bike station, which requires many time series models, one for each Citi Bike station that is included in the input data. You can create multiple models to do this, but that can be a tedious and time-consuming process, especially when you have a large number of time series. Instead, you can use a single query to create and fit a set of time series models in order to forecast multiple time series at once.

SQL

In the following query, the OPTIONS(model_type='ARIMA_PLUS', time_series_timestamp_col='date', ...) clause indicates that you are creating an ARIMA-based time series model. You use the time_series_id_col option of the CREATE MODEL statement to specify one or more columns in the input data that you want to get forecasts for, in this case the Citi Bike station, as represented by the start_station_name column. You use the WHERE clause to limit the start stations to those with Central Park in their names. The auto_arima_max_order option of the CREATE MODEL statement controls the search space for hyperparameter tuning in the auto.ARIMA algorithm. The decompose_time_series option of the CREATE MODEL statement defaults to TRUE, so that information about the time series data is returned when you evaluate the model in the next step.

Follow these steps to create the model:

  1. In the Google Cloud console, go to the BigQuery page.

    Go to BigQuery

  2. In the query editor, paste in the following query and click Run:

    CREATE OR REPLACE MODEL `bqml_tutorial.nyc_citibike_arima_model_group`
    OPTIONS
      (model_type = 'ARIMA_PLUS',
       time_series_timestamp_col = 'date',
       time_series_data_col = 'num_trips',
       time_series_id_col = 'start_station_name',
       auto_arima_max_order = 5) AS
    SELECT
      start_station_name,
      EXTRACT(DATE from starttime) AS date,
      COUNT(*) AS num_trips
    FROM
      `bigquery-public-data.new_york.citibike_trips`
    WHERE start_station_name LIKE '%Central Park%'
    GROUP BY start_station_name, date;

    The query takes approximately 24 seconds to complete, after which you can access the nyc_citibike_arima_model_group model. Because the query uses a CREATE MODEL statement, you don't see query results.

This query creates twelve time series models, one for each of the twelve Citi Bike start stations in the input data. The time cost, approximately 24 seconds, is only 1.4 times more than that of creating a single time series model because of the parallelism. However, if you remove the WHERE ... LIKE ... clause, there would be 600+ time series to forecast, and they wouldn't be forecast completely in parallel because of slot capacity limitations. In that case, the query would take approximately 15 minutes to finish. To reduce the query runtime with the compromise of a potential slight drop in model quality, you could decrease the value of auto_arima_max_order. This shrinks the search space of hyperparameter tuning in the auto.ARIMA algorithm. For more information, see Large-scale time series forecasting best practices.

BigQuery DataFrames

In the following snippet, you are creating an ARIMA-based time series model.

Before trying this sample, follow the BigQuery DataFrames setup instructions in the BigQuery quickstart using BigQuery DataFrames. For more information, see the BigQuery DataFrames reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up ADC for a local development environment.

from bigframes.ml import forecasting
import bigframes.pandas as bpd

model = forecasting.ARIMAPlus(
    # To reduce the query runtime with the compromise of a potential slight
    # drop in model quality, you could decrease the value of the
    # auto_arima_max_order. This shrinks the search space of hyperparameter
    # tuning in the auto.ARIMA algorithm.
    auto_arima_max_order=5,
)

df = bpd.read_gbq("bigquery-public-data.new_york.citibike_trips")

# This query creates twelve time series models, one for each of the twelve
# Citi Bike start stations in the input data. If you remove this row
# filter, there would be 600+ time series to forecast.
df = df[df["start_station_name"].str.contains("Central Park")]

features = bpd.DataFrame(
    {
        "start_station_name": df["start_station_name"],
        "num_trips": df["starttime"],
        "date": df["starttime"].dt.date,
    }
)
num_trips = features.groupby(
    ["start_station_name", "date"],
    as_index=False,
).count()

X = num_trips["date"].to_frame()
y = num_trips["num_trips"].to_frame()

model.fit(
    X,
    y,
    # The input data that you want to get forecasts for,
    # in this case the Citi Bike station, as represented by the
    # start_station_name column.
    id_col=num_trips["start_station_name"].to_frame(),
)

# The model.fit() call above created a temporary model.
# Use the to_gbq() method to write to a permanent location.
model.to_gbq(
    your_model_id,  # For example: "bqml_tutorial.nyc_citibike_arima_model"
    replace=True,
)

This creates twelve time series models, one for each of the twelve Citi Bike start stations in the input data. The time cost, approximately 24 seconds, is only 1.4 times more than that of creating a single time series model because of the parallelism.

Evaluate the model

SQL

Evaluate the time series model by using the ML.ARIMA_EVALUATE function. The ML.ARIMA_EVALUATE function shows you the evaluation metrics that were generated for the model during the process of automatic hyperparameter tuning.

Follow these steps to evaluate the model:

  1. In the Google Cloud console, go to the BigQuery page.

    Go to BigQuery

  2. In the query editor, paste in the following query and click Run:

    SELECT
      *
    FROM
      ML.ARIMA_EVALUATE(MODEL `bqml_tutorial.nyc_citibike_arima_model_group`);

    The results should look like the following:

    Evaluation metrics for the time series model.

    While auto.ARIMA evaluates dozens of candidate ARIMA models for each time series, ML.ARIMA_EVALUATE by default only outputs the information of the best model to make the output table compact. To view all the candidate models, you can set the ML.ARIMA_EVALUATE function's show_all_candidate_models argument to TRUE.

BigQuery DataFrames

Before trying this sample, follow the BigQuery DataFrames setup instructions in the BigQuery quickstart using BigQuery DataFrames. For more information, see the BigQuery DataFrames reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up ADC for a local development environment.

# Evaluate the time series models by using the summary() function. The summary()
# function shows you the evaluation metrics of all the candidate models evaluated
# during the process of automatic hyperparameter tuning.
summary = model.summary()
print(summary.peek())

# Expected output:
#    start_station_name                  non_seasonal_p  non_seasonal_d  non_seasonal_q  has_drift  log_likelihood           AIC     variance ...
# 1         Central Park West & W 72 St               0               1               5      False    -1966.449243   3944.898487  1215.689281 ...
# 8            Central Park W & W 96 St               0               0               5      False     -274.459923    562.919847   655.776577 ...
# 9        Central Park West & W 102 St               0               0               0      False     -226.639918    457.279835    258.83582 ...
# 11        Central Park West & W 76 St               1               1               2      False    -1700.456924   3408.913848   383.254161 ...
# 4   Grand Army Plaza & Central Park S               0               1               5      False    -5507.553498  11027.106996   624.138741 ...

The start_station_name column identifies the input data column for which time series were created. This is the column that you specified with the time_series_id_col option when creating the model.

The non_seasonal_p, non_seasonal_d, non_seasonal_q, and has_drift output columns define an ARIMA model in the training pipeline. The log_likelihood, AIC, and variance output columns are relevant to the ARIMA model fitting process. The fitting process determines the best ARIMA model by using the auto.ARIMA algorithm, one for each time series.

The auto.ARIMA algorithm uses the KPSS test to determine the best value for non_seasonal_d, which in this case is 1. When non_seasonal_d is 1, the auto.ARIMA algorithm trains 42 different candidate ARIMA models in parallel. In this example, all 42 candidate models are valid, so the output contains 42 rows, one for each candidate ARIMA model; in cases where some of the models aren't valid, they are excluded from the output. These candidate models are returned in ascending order by AIC. The model in the first row has the lowest AIC, and is considered the best model. This best model is saved as the final model and is used when you forecast data, evaluate the model, and inspect the model's coefficients as shown in the following steps.
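
The AIC ranking can be made concrete with a small sketch outside of BigQuery. The following Python snippet scores candidates with the AIC formula, AIC = 2k - 2 * log-likelihood, where k is the number of estimated parameters, and picks the minimum. The first two log_likelihood values reuse rows from the evaluation output above; the third candidate and all parameter counts are illustrative assumptions.

```python
def aic(log_likelihood: float, num_params: int) -> float:
    """Akaike information criterion: lower is better."""
    return 2 * num_params - 2 * log_likelihood

# Candidate models for one time series: (label, log_likelihood, k), where k
# counts the estimated parameters (AR terms, MA terms, drift, variance). The
# first two rows reuse log_likelihood values from the summary output above;
# the third candidate is hypothetical.
candidates = [
    ("ARIMA(0,1,5)", -1966.449243, 6),
    ("ARIMA(1,1,2)", -1700.456924, 4),
    ("ARIMA(2,1,1)", -1712.3, 4),
]

# Sort by AIC and take the minimum, as auto.ARIMA does for each time series.
scored = sorted((aic(ll, k), name) for name, ll, k in candidates)
best_aic, best_name = scored[0]
print(best_name, round(best_aic, 2))  # ARIMA(1,1,2) 3408.91
```

Note that the reconstructed AIC values for the first two candidates match the AIC column in the evaluation output (up to rounding), which is how the assumed parameter counts were checked.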

The seasonal_periods column contains information about the seasonal pattern identified in the time series data. Each time series can have different seasonal patterns. For example, from the figure, you can see that one time series has a yearly pattern, while others don't.

The has_holiday_effect, has_spikes_and_dips, and has_step_changes columns are only populated when decompose_time_series=TRUE. These columns also reflect information about the input time series data, and are not related to the ARIMA modeling. These columns have the same values across all output rows.

Inspect the model's coefficients

SQL

Inspect the time series model's coefficients by using the ML.ARIMA_COEFFICIENTS function.

Follow these steps to retrieve the model's coefficients:

  1. In the Google Cloud console, go to the BigQuery page.

    Go to BigQuery

  2. In the query editor, paste in the following query and click Run:

    SELECT
      *
    FROM
      ML.ARIMA_COEFFICIENTS(MODEL `bqml_tutorial.nyc_citibike_arima_model_group`);

    The query takes less than a second to complete. The results should look similar to the following:

    Coefficients for the time series model.

    For more information about the output columns, see the ML.ARIMA_COEFFICIENTS function.

BigQuery DataFrames

Inspect the time series model's coefficients by using the coef_ attribute.

Before trying this sample, follow the BigQuery DataFrames setup instructions in the BigQuery quickstart using BigQuery DataFrames. For more information, see the BigQuery DataFrames reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up ADC for a local development environment.

coef = model.coef_
print(coef.peek())

# Expected output:
#    start_station_name                                              ar_coefficients                                   ma_coefficients  intercept_or_drift
# 5    Central Park West & W 68 St                                                [] [-0.41014089  0.21979212 -0.59854213 -0.251438...                 0.0
# 6         Central Park S & 6 Ave                                                [] [-0.71488957 -0.36835772  0.61008532  0.183290...                 0.0
# 0    Central Park West & W 85 St                                                [] [-0.39270166 -0.74494638  0.76432596  0.489146...                 0.0
# 3    W 82 St & Central Park West                         [-0.50219511 -0.64820817]             [-0.20665325  0.67683137 -0.68108631]                 0.0
# 11  W 106 St & Central Park West [-0.70442887 -0.66885553 -0.25030325 -0.34160669]                                                []                 0.0

The start_station_name column identifies the input data column for which time series were created. This is the column that you specified in the time_series_id_col option when creating the model.

The ar_coefficients output column shows the model coefficients of the autoregressive (AR) part of the ARIMA model. Similarly, the ma_coefficients output column shows the model coefficients of the moving-average (MA) part of the ARIMA model. Both of these columns contain array values, whose lengths are equal to non_seasonal_p and non_seasonal_q, respectively. The intercept_or_drift value is the constant term in the ARIMA model.
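
To make the role of these coefficients concrete, here is a sketch of a one-step-ahead ARMA prediction on the differenced series, using the ar_coefficients and ma_coefficients reported above for W 82 St & Central Park West. The recent differenced values and residuals are made-up numbers, and the sketch ignores the seasonal and decomposition components that ARIMA_PLUS adds, so treat it as a reading aid rather than a reproduction of BigQuery ML's internals.

```python
# Coefficients from the coefficients output above for "W 82 St & Central Park
# West"; len(ar_coefficients) == non_seasonal_p, len(ma_coefficients) == non_seasonal_q.
ar_coefficients = [-0.50219511, -0.64820817]              # phi_1, phi_2
ma_coefficients = [-0.20665325, 0.67683137, -0.68108631]  # theta_1..theta_3
intercept_or_drift = 0.0

# Most recent values of the differenced series and the model residuals,
# newest last. These are made-up numbers for illustration.
diffs = [12.0, -5.0, 3.0]
residuals = [1.5, -2.0, 0.5]

# One-step-ahead ARMA prediction on the differenced series:
#   y_hat = c + sum_i(phi_i * y_{t-i}) + sum_j(theta_j * e_{t-j})
y_hat = intercept_or_drift
y_hat += sum(phi * y for phi, y in zip(ar_coefficients, reversed(diffs)))
y_hat += sum(theta * e for theta, e in zip(ma_coefficients, reversed(residuals)))
print(round(y_hat, 6))
```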

Use the model to forecast data

SQL

Forecast future time series values by using the ML.FORECAST function.

In the following GoogleSQL query, the STRUCT(3 AS horizon, 0.9 AS confidence_level) clause indicates that the query forecasts 3 future time points, and generates a prediction interval with a 90% confidence level.

Follow these steps to forecast data with the model:

  1. In the Google Cloud console, go to the BigQuery page.

    Go to BigQuery

  2. In the query editor, paste in the following query and click Run:

    SELECT
      *
    FROM
      ML.FORECAST(MODEL `bqml_tutorial.nyc_citibike_arima_model_group`,
                  STRUCT(3 AS horizon, 0.9 AS confidence_level));

    The query takes less than a second to complete. The results should look like the following:

    ML.FORECAST output.

For more information about the output columns, see the ML.FORECAST function.

BigQuery DataFrames

Forecast future time series values by using the predict function.

Before trying this sample, follow the BigQuery DataFrames setup instructions in the BigQuery quickstart using BigQuery DataFrames. For more information, see the BigQuery DataFrames reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up ADC for a local development environment.

prediction = model.predict(horizon=3, confidence_level=0.9)
print(prediction.peek())

# Expected output:
#            forecast_timestamp                             start_station_name  forecast_value  standard_error  confidence_level ...
# 4   2016-10-01 00:00:00+00:00                         Central Park S & 6 Ave      302.377201       32.572948               0.9 ...
# 14  2016-10-02 00:00:00+00:00  Central Park North & Adam Clayton Powell Blvd      263.917567       45.284082               0.9 ...
# 1   2016-09-25 00:00:00+00:00                    Central Park West & W 85 St      189.574706       39.874856               0.9 ...
# 20  2016-10-02 00:00:00+00:00                    Central Park West & W 72 St      175.474862       40.940794               0.9 ...
# 12  2016-10-01 00:00:00+00:00                   W 106 St & Central Park West        63.88163       18.088868               0.9 ...

The first column, start_station_name, annotates the time series that each time series model is fitted against. Each start_station_name has three rows of forecasted results, as specified by the horizon value.

For each start_station_name, the output rows are in chronological order by the forecast_timestamp column value. In time series forecasting, the prediction interval, as represented by the prediction_interval_lower_bound and prediction_interval_upper_bound column values, is as important as the forecast_value column value. The forecast_value value is the middle point of the prediction interval. The prediction interval depends on the standard_error and confidence_level column values.
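
The relationship between these columns can be sketched with a normal approximation: the bounds sit at the forecast value plus or minus z * standard_error, where z is the standard normal quantile for the chosen confidence level. The normal approximation is an assumption made for this sketch, so BigQuery ML's exact computation may differ slightly; the input values below are taken from the first forecast row above.

```python
from statistics import NormalDist

def prediction_interval(forecast_value, standard_error, confidence_level):
    """Two-sided interval around the forecast, assuming a normal approximation."""
    z = NormalDist().inv_cdf(0.5 + confidence_level / 2)  # ~1.645 for 0.9
    margin = z * standard_error
    return forecast_value - margin, forecast_value + margin

# forecast_value and standard_error for Central Park S & 6 Ave above.
lower, upper = prediction_interval(302.377201, 32.572948, 0.9)
print(round(lower, 2), round(upper, 2))
```

The forecast value sits exactly in the middle of the interval, and a higher confidence_level widens the interval for the same standard_error.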

Explain the forecasting results

SQL

You can get explainability metrics in addition to forecast data by using the ML.EXPLAIN_FORECAST function. The ML.EXPLAIN_FORECAST function forecasts future time series values and also returns all the separate components of the time series. If you just want to return forecast data, use the ML.FORECAST function instead, as shown in Use the model to forecast data.

The STRUCT(3 AS horizon, 0.9 AS confidence_level) clause used in the ML.EXPLAIN_FORECAST function indicates that the query forecasts 3 future time points and generates a prediction interval with 90% confidence.

Follow these steps to explain the model's results:

  1. In the Google Cloud console, go to the BigQuery page.

    Go to BigQuery

  2. In the query editor, paste in the following query and click Run:

    SELECT
      *
    FROM
      ML.EXPLAIN_FORECAST(MODEL `bqml_tutorial.nyc_citibike_arima_model_group`,
                          STRUCT(3 AS horizon, 0.9 AS confidence_level));

    The query takes less than a second to complete. The results should look like the following:

    The first nine output columns of forecasted data and forecast explanations. The tenth through seventeenth output columns of forecasted data and forecast explanations. The last six output columns of forecasted data and forecast explanations.

    The first several thousand rows returned are all history data. You must scroll through the results to see the forecast data.

    The output rows are ordered first by start_station_name, then chronologically by the time_series_timestamp column value. In time series forecasting, the prediction interval, as represented by the prediction_interval_lower_bound and prediction_interval_upper_bound column values, is as important as the forecast_value column value. The forecast_value value is the middle point of the prediction interval. The prediction interval depends on the standard_error and confidence_level column values.

    For more information about the output columns, see the ML.EXPLAIN_FORECAST function.

BigQuery DataFrames

You can get explainability metrics in addition to forecast data by using the predict_explain function. The predict_explain function forecasts future time series values and also returns all the separate components of the time series. If you just want to return forecast data, use the predict function instead, as shown in Use the model to forecast data.

The horizon=3, confidence_level=0.9 arguments used in the predict_explain function indicate that the query forecasts 3 future time points and generates a prediction interval with 90% confidence.

Before trying this sample, follow the BigQuery DataFrames setup instructions in the BigQuery quickstart using BigQuery DataFrames. For more information, see the BigQuery DataFrames reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up ADC for a local development environment.

explain = model.predict_explain(horizon=3, confidence_level=0.9)
print(explain.peek(5))

# Expected output:
#        time_series_timestamp  start_station_name                 time_series_type  time_series_data  time_series_adjusted_data  standard_error  confidence_level  prediction_interval_lower_bound  prediction_interval_upper_bound  trend       seasonal_period_yearly  seasonal_period_quarterly  seasonal_period_monthly  seasonal_period_weekly  seasonal_period_daily  holiday_effect  spikes_and_dips  step_changes  residual
# 0  2013-07-01 00:00:00+00:00  Central Park S & 6 Ave             history           69.0              154.168527                 32.572948       <NA>              <NA>                             <NA>                             0.0         35.477484               <NA>                       <NA>                     -28.402102              <NA>                   <NA>            0.0              -85.168527    147.093145
# 1  2013-07-01 00:00:00+00:00  Grand Army Plaza & Central Park S  history           79.0              79.0                       24.982769       <NA>              <NA>                             <NA>                             0.0         43.46428                <NA>                       <NA>                     -30.01599               <NA>                   <NA>            0.0              0.0           65.55171
# 2  2013-07-02 00:00:00+00:00  Central Park S & 6 Ave             history           180.0             204.045651                 32.572948       <NA>              <NA>                             <NA>                             147.093045  72.498327               <NA>                       <NA>                     -15.545721              <NA>                   <NA>            0.0              -85.168527    61.122876
# 3  2013-07-02 00:00:00+00:00  Grand Army Plaza & Central Park S  history           129.0             99.556269                  24.982769       <NA>              <NA>                             <NA>                             65.551665   45.836432               <NA>                       <NA>                     -11.831828              <NA>                   <NA>            0.0              0.0           29.443731
# 4  2013-07-03 00:00:00+00:00  Central Park S & 6 Ave             history           115.0             205.968236                 32.572948       <NA>              <NA>                             <NA>                             191.32754   59.220766               <NA>                       <NA>                     -44.580071              <NA>                   <NA>            0.0              -85.168527    -5.799709

The output rows are ordered first chronologically by the time_series_timestamp column value, and then by the start_station_name column value. In time series forecasting, the prediction interval, as represented by the prediction_interval_lower_bound and prediction_interval_upper_bound column values, is as important as the forecast_value column value. The forecast_value value is the middle point of the prediction interval. The prediction interval depends on the standard_error and confidence_level column values.
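
A useful property of this output is that the component columns sum back to the observed value: trend, the seasonal components, holiday_effect, spikes_and_dips, step_changes, and residual add up to time_series_data (treating <NA> components as zero), while time_series_adjusted_data is the observed value with the anomaly components removed. The following sketch checks this against the first history row above; it illustrates how the columns relate, not how BigQuery ML computes them internally.

```python
# Component values from the first history row above
# (Central Park S & 6 Ave, 2013-07-01). <NA> components are treated as zero.
components = {
    "trend": 0.0,
    "seasonal_period_yearly": 35.477484,
    "seasonal_period_weekly": -28.402102,
    "spikes_and_dips": 0.0,
    "step_changes": -85.168527,
    "residual": 147.093145,
}
time_series_data = 69.0

# The components reconstruct the observed value.
reconstructed = sum(components.values())
print(round(reconstructed, 6))  # 69.0

# The adjusted series removes the anomaly components (spikes/dips, step changes).
adjusted = time_series_data - components["spikes_and_dips"] - components["step_changes"]
print(round(adjusted, 6))  # 154.168527
```

The reconstructed value matches time_series_data, and the adjusted value matches the time_series_adjusted_data column in the same row.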

Clean up

To avoid incurring charges to your Google Cloud account for the resources used in this tutorial, either delete the project that contains the resources, or keep the project and delete the individual resources.

  • You can delete the project you created.
  • Or you can keep the project and delete the dataset.

Delete your dataset

Deleting your project removes all datasets and all tables in the project. If you prefer to reuse the project, you can delete the dataset you created in this tutorial:

  1. If necessary, open the BigQuery page in the Google Cloud console.

    Go to the BigQuery page

  2. In the navigation, click the bqml_tutorial dataset you created.

  3. Click Delete dataset to delete the dataset, the table, and all of the data.

  4. In the Delete dataset dialog, confirm the delete command by typing the name of your dataset (bqml_tutorial) and then click Delete.

Delete your project

To delete the project:

    Caution: Deleting a project has the following effects:
    • Everything in the project is deleted. If you used an existing project for the tasks in this document, when you delete it, you also delete any other work you've done in the project.
    • Custom project IDs are lost. When you created this project, you might have created a custom project ID that you want to use in the future. To preserve the URLs that use the project ID, such as an appspot.com URL, delete selected resources inside the project instead of deleting the whole project.

    If you plan to explore multiple architectures, tutorials, or quickstarts, reusing projects can help you avoid exceeding project quota limits.

  1. In the Google Cloud console, go to the Manage resources page.

    Go to Manage resources

  2. In the project list, select the project that you want to delete, and then clickDelete.
  3. In the dialog, type the project ID, and then clickShut down to delete the project.

What's next

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-12-15 UTC.