Export models

This page shows you how to export BigQuery ML models. You can export BigQuery ML models to Cloud Storage and use them for online prediction, or edit them in Python. You can export a BigQuery ML model by using the Google Cloud console, the EXPORT MODEL statement, the bq command-line tool, or the BigQuery API.

You can export the following model types:

  • AUTOENCODER
  • AUTOML_CLASSIFIER
  • AUTOML_REGRESSOR
  • BOOSTED_TREE_CLASSIFIER
  • BOOSTED_TREE_REGRESSOR
  • DNN_CLASSIFIER
  • DNN_REGRESSOR
  • DNN_LINEAR_COMBINED_CLASSIFIER
  • DNN_LINEAR_COMBINED_REGRESSOR
  • KMEANS
  • LINEAR_REG
  • LOGISTIC_REG
  • MATRIX_FACTORIZATION
  • RANDOM_FOREST_CLASSIFIER
  • RANDOM_FOREST_REGRESSOR
  • TENSORFLOW (imported TensorFlow models)
  • PCA
  • TRANSFORM_ONLY

Export model formats and samples

The following table shows the export destination formats for each BigQuery ML model type and provides a sample of the files that get written in the Cloud Storage bucket.

Model type: AUTOML_CLASSIFIER, AUTOML_REGRESSOR
Export model format: TensorFlow SavedModel (TF 2.1.0)
Exported files sample:

gcs_bucket/
  assets/
    f1.txt
    f2.txt
  saved_model.pb
  variables/
    variables.data-00-of-01
    variables.index

Model type: AUTOENCODER, DNN_CLASSIFIER, DNN_REGRESSOR, DNN_LINEAR_COMBINED_CLASSIFIER, DNN_LINEAR_COMBINED_REGRESSOR, KMEANS, LINEAR_REG, LOGISTIC_REG, MATRIX_FACTORIZATION, PCA, TRANSFORM_ONLY
Export model format: TensorFlow SavedModel (TF 1.15 or higher)
Exported files sample: same layout as above.

Model type: BOOSTED_TREE_CLASSIFIER, BOOSTED_TREE_REGRESSOR, RANDOM_FOREST_CLASSIFIER, RANDOM_FOREST_REGRESSOR
Export model format: Booster (XGBoost 0.82)
Exported files sample:

gcs_bucket/
  assets/
    0.txt
    1.txt
    model_metadata.json
  main.py
  model.bst
  xgboost_predictor-0.1.tar.gz
    ....
     predictor.py
    ....

main.py is for local runs. See Model deployment for more details.

Model type: TENSORFLOW (imported)
Export model format: TensorFlow SavedModel
Exported files sample: exactly the same files that were present when the model was imported.
Note: The automatic data preprocessing performed during model creation, such as standardization and label encoding, is saved in the exported files as part of the graph for TensorFlow SavedModel, and in the external files for Booster. Explicit preprocessing is not needed before passing data for prediction. Input should generally match that used for BigQuery ML's ML.PREDICT. All numerical values in the exported model signatures are cast as data type FLOAT64. Also, all STRUCT fields must be expanded into separate fields. For example, field f1 in STRUCT f2 should be renamed as f2_f1 and passed as a separate column.
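For example, here is a minimal input sketch for a hypothetical model with a numeric column age and a STRUCT column f2 that contains a field f1 (the column names are illustrative, not part of any exported model):

import tensorflow as tf

# Numeric inputs are passed as FLOAT64, even for columns that were INT64 in BigQuery.
age = tf.constant([42.0, 17.0], dtype=tf.float64)

# The STRUCT field f2.f1 is expanded into its own column, named f2_f1.
f2_f1 = tf.constant([1.5, 0.25], dtype=tf.float64)

inputs = {"age": age, "f2_f1": f2_f1}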

Export model trained with TRANSFORM

If the model is trained with the TRANSFORM clause, then an additional preprocessing model performs the same logic as the TRANSFORM clause and is saved in the TensorFlow SavedModel format under the subdirectory transform. You can deploy a model trained with the TRANSFORM clause to Vertex AI as well as locally. For more information, see model deployment.

Export model format:
  Prediction model: TensorFlow SavedModel or Booster (XGBoost 0.82).
  Preprocessing model for the TRANSFORM clause: TensorFlow SavedModel (TF 2.5 or higher).

Exported files sample:

gcs_bucket/
  ....(model files)
  transform/
    assets/
        f1.txt
        f2.txt
    saved_model.pb
    variables/
        variables.data-00-of-01
        variables.index

The model doesn't contain information about the feature engineering performed outside the TRANSFORM clause during training (for example, anything in the SELECT statement). So you need to manually convert the input data before feeding it into the preprocessing model.
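As a concrete example, the following is a minimal sketch of running the exported preprocessing model locally, assuming the export was copied from Cloud Storage to ./gcs_bucket and that the TRANSFORM clause took a STRING column f1 and a FLOAT64 column f2 (the path and column names are illustrative):

import tensorflow as tf

# Load the preprocessing model saved under the transform/ subdirectory.
preprocess = tf.saved_model.load("./gcs_bucket/transform")
transform_fn = preprocess.signatures["serving_default"]

# Feed inputs using the dtypes listed in the data types table in the next section.
transformed = transform_fn(
    f1=tf.constant(["abc", "def"], dtype=tf.string),
    f2=tf.constant([1.0, 2.0], dtype=tf.float64),
)
print(transformed)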

Supported data types

When exporting models trained with the TRANSFORM clause, the following data types are supported for feeding into the TRANSFORM clause.

TRANSFORM input type: INT64
TRANSFORM input samples: 10, 11
Exported preprocessing model input:
tf.constant([10, 11], dtype=tf.int64)

TRANSFORM input type: NUMERIC
TRANSFORM input samples: NUMERIC 10, NUMERIC 11
Exported preprocessing model input:
tf.constant([10, 11], dtype=tf.float64)

TRANSFORM input type: BIGNUMERIC
TRANSFORM input samples: BIGNUMERIC 10, BIGNUMERIC 11
Exported preprocessing model input:
tf.constant([10, 11], dtype=tf.float64)

TRANSFORM input type: FLOAT64
TRANSFORM input samples: 10.0, 11.0
Exported preprocessing model input:
tf.constant([10, 11], dtype=tf.float64)

TRANSFORM input type: BOOL
TRANSFORM input samples: TRUE, FALSE
Exported preprocessing model input:
tf.constant([True, False], dtype=tf.bool)

TRANSFORM input type: STRING
TRANSFORM input samples: 'abc', 'def'
Exported preprocessing model input:
tf.constant(['abc', 'def'], dtype=tf.string)

TRANSFORM input type: BYTES
TRANSFORM input samples: b'abc', b'def'
Exported preprocessing model input:
tf.constant(['abc', 'def'], dtype=tf.string)

TRANSFORM input type: DATE
TRANSFORM input samples: DATE '2020-09-27', DATE '2020-09-28'
Exported preprocessing model input ("%F" format):
tf.constant(['2020-09-27', '2020-09-28'], dtype=tf.string)

TRANSFORM input type: DATETIME
TRANSFORM input samples: DATETIME '2023-02-02 02:02:01.152903', DATETIME '2023-02-03 02:02:01.152903'
Exported preprocessing model input ("%F %H:%M:%E6S" format):
tf.constant(['2023-02-02 02:02:01.152903', '2023-02-03 02:02:01.152903'], dtype=tf.string)

TRANSFORM input type: TIME
TRANSFORM input samples: TIME '16:32:36.152903', TIME '17:32:36.152903'
Exported preprocessing model input ("%H:%M:%E6S" format):
tf.constant(['16:32:36.152903', '17:32:36.152903'], dtype=tf.string)

TRANSFORM input type: TIMESTAMP
TRANSFORM input samples: TIMESTAMP '2017-02-28 12:30:30.45-08', TIMESTAMP '2018-02-28 12:30:30.45-08'
Exported preprocessing model input ("%F %H:%M:%E1S %z" format):
tf.constant(['2017-02-28 20:30:30.4 +0000', '2018-02-28 20:30:30.4 +0000'], dtype=tf.string)

TRANSFORM input type: ARRAY
TRANSFORM input samples: ['a', 'b'], ['c', 'd']
Exported preprocessing model input:
tf.constant([['a', 'b'], ['c', 'd']], dtype=tf.string)

TRANSFORM input type: ARRAY<STRUCT<INT64, FLOAT64>>
TRANSFORM input samples: [(1, 1.0), (2, 1.0)], [(2, 1.0), (3, 1.0)]
Exported preprocessing model input:
tf.sparse.from_dense(
  tf.constant(
    [[0, 1.0, 1.0, 0],
     [0, 0, 1.0, 1.0]],
    dtype=tf.float64))

TRANSFORM input type: NULL
TRANSFORM input samples: NULL, NULL
Exported preprocessing model input (the placeholder depends on the column's type):
tf.constant([123456789.0e10, 123456789.0e10], dtype=tf.float64)
tf.constant([1234567890000000000, 1234567890000000000], dtype=tf.int64)
tf.constant([' __MISSING__ ', ' __MISSING__ '], dtype=tf.string)

Supported SQL functions

When exporting models trained with the TRANSFORM clause, you can use the following SQL functions inside the TRANSFORM clause.

  • Operators
    • +, -, *, /, =, <, >, <=, >=, !=, <>, [NOT] BETWEEN, [NOT] IN, IS [NOT] NULL, IS [NOT] TRUE, IS [NOT] FALSE, NOT, AND, OR.
  • Conditional expressions
    • CASE expr, CASE, COALESCE, IF, IFNULL, NULLIF.
  • Mathematical functions
    • ABS, ACOS, ACOSH, ASINH, ATAN, ATAN2, ATANH, CBRT, CEIL, CEILING, COS, COSH, COT, COTH, CSC, CSCH, EXP, FLOOR, IS_INF, IS_NAN, LN, LOG, LOG10, MOD, POW, POWER, SEC, SECH, SIGN, SIN, SINH, SQRT, TAN, TANH.
  • Conversion functions
    • CAST AS INT64, CAST AS FLOAT64, CAST AS NUMERIC, CAST AS BIGNUMERIC, CAST AS STRING, SAFE_CAST AS INT64, SAFE_CAST AS FLOAT64.
  • String functions
    • CONCAT, LEFT, LENGTH, LOWER, REGEXP_REPLACE, RIGHT, SPLIT, SUBSTR, SUBSTRING, TRIM, UPPER.
  • Date functions
    • DATE, DATE_ADD, DATE_SUB, DATE_DIFF, DATE_TRUNC, EXTRACT, FORMAT_DATE, PARSE_DATE, SAFE.PARSE_DATE.
  • Datetime functions
    • DATETIME, DATETIME_ADD, DATETIME_SUB, DATETIME_DIFF, DATETIME_TRUNC, EXTRACT, PARSE_DATETIME, SAFE.PARSE_DATETIME.
  • Time functions
    • TIME, TIME_ADD, TIME_SUB, TIME_DIFF, TIME_TRUNC, EXTRACT, FORMAT_TIME, PARSE_TIME, SAFE.PARSE_TIME.
  • Timestamp functions
    • TIMESTAMP, TIMESTAMP_ADD, TIMESTAMP_SUB, TIMESTAMP_DIFF, TIMESTAMP_TRUNC, FORMAT_TIMESTAMP, PARSE_TIMESTAMP, SAFE.PARSE_TIMESTAMP, TIMESTAMP_MICROS, TIMESTAMP_MILLIS, TIMESTAMP_SECONDS, EXTRACT, STRING, UNIX_MICROS, UNIX_MILLIS, UNIX_SECONDS.
  • Manual preprocessing functions
    • ML.IMPUTER, ML.HASH_BUCKETIZE, ML.LABEL_ENCODER, ML.MULTI_HOT_ENCODER, ML.NGRAMS, ML.ONE_HOT_ENCODER, ML.BUCKETIZE, ML.MAX_ABS_SCALER, ML.MIN_MAX_SCALER, ML.NORMALIZER, ML.QUANTILE_BUCKETIZE, ML.ROBUST_SCALER, ML.STANDARD_SCALER.

Limitations

The following limitations apply when exporting models:

  • Model export is not supported if any of the following features were used during training:

    • ARRAY, TIMESTAMP, or GEOGRAPHY feature types were present in the input data.

  • Exported models for model types AUTOML_REGRESSOR and AUTOML_CLASSIFIER do not support Vertex AI deployment for online prediction.

  • The model size limit is 1 GB for matrix factorization model export. The model size is roughly proportional to num_factors, so you can reduce num_factors during training to shrink the model size if you reach the limit.

  • For models trained with the BigQuery ML TRANSFORM clause for manual feature preprocessing, see the data types and functions supported for exporting.

  • Models trained with the BigQuery ML TRANSFORM clause before 18 September 2023 must be retrained before they can be deployed through Model Registry for online prediction.

  • During model export, ARRAY<STRUCT<INT64, FLOAT64>>, ARRAY, and TIMESTAMP are supported as pre-transformed data, but are not supported as post-transformed data.

Export BigQuery ML models

To export a model, select one of the following:

Console

  1. Open the BigQuery page in the Google Cloud console.

    Go to the BigQuery page

  2. In the left pane, click Explorer.

    If you don't see the left pane, click Expand left pane to open the pane.

  3. In the Explorer pane, expand your project, click Datasets, and then click your dataset.

  4. Click Overview > Models and click the name of the model that you're exporting.

  5. Click More > Export.

  6. In the Export model to Google Cloud Storage dialog:

    • For Select GCS location, browse for the bucket or folder location where you want to export the model, and click Select.
    • Click Submit to export the model.

To check on the progress of the job, in the Explorer pane, click Job history, and look for an EXTRACT type job.

SQL

The EXPORT MODEL statement lets you export BigQuery ML models to Cloud Storage using GoogleSQL query syntax.

To export a BigQuery ML model in the Google Cloud console by using the EXPORT MODEL statement, follow these steps:

  1. In the Google Cloud console, open the BigQuery page.

    Go to BigQuery

  2. Click Compose new query.

  3. In the Query editor field, type your EXPORT MODEL statement.

    The following query exports a model named myproject.mydataset.mymodel to a Cloud Storage bucket with URI gs://bucket/path/to/saved_model/.

    EXPORT MODEL `myproject.mydataset.mymodel`
    OPTIONS(URI = 'gs://bucket/path/to/saved_model/')

  4. Click Run. When the query is complete, the following appears in the Query results pane: Successfully exported model.

bq

Note: To export a model using the bq command-line tool, you must have bq tool version 2.0.56 or later, which is included with gcloud CLI version 287.0.0 and later. To see your installed bq tool version, use bq version and, if needed, update the gcloud CLI using gcloud components update.

Use the bq extract command with the --model flag.

(Optional) Supply the --destination_format flag to pick the format of the exported model. (Optional) Supply the --location flag and set the value to your location.

bq --location=location extract \
--destination_format format \
--model project_id:dataset.model \
gs://bucket/model_folder

Where:

  • location is the name of your location. The --location flag is optional. For example, if you are using BigQuery in the Tokyo region, you can set the flag's value to asia-northeast1. You can set a default value for the location using the .bigqueryrc file.
  • destination_format is the format for the exported model: ML_TF_SAVED_MODEL (default) or ML_XGBOOST_BOOSTER.
  • project_id is your project ID.
  • dataset is the name of the source dataset.
  • model is the model you're exporting.
  • bucket is the name of the Cloud Storage bucket to which you're exporting the data. The BigQuery dataset and the Cloud Storage bucket must be in the same location.
  • model_folder is the name of the folder where the exported model files will be written.

Examples:

The following command exports mydataset.mymodel in TensorFlow SavedModel format to a folder named mymodel_folder in a Cloud Storage bucket named example-bucket.

bq extract --model \
'mydataset.mymodel' \
gs://example-bucket/mymodel_folder

The default value of destination_format is ML_TF_SAVED_MODEL.

The following command exports mydataset.mymodel in XGBoost Booster format to a folder named mymodel_folder in a Cloud Storage bucket named example-bucket.

bq extract --model \
--destination_format ML_XGBOOST_BOOSTER \
'mydataset.mymodel' \
gs://example-bucket/mymodel_folder

API

To export a model, create an extract job and populate the job configuration.

(Optional) Specify your location in the location property in the jobReference section of the job resource.

  1. Create an extract job that points to the BigQuery ML model and the Cloud Storage destination.

  2. Specify the source model by using the sourceModel configuration object that contains the project ID, dataset ID, and model ID.

  3. The destination URI(s) property must be fully qualified, in the format gs://bucket/model_folder.

  4. Specify the destination format by setting the configuration.extract.destinationFormat property. For example, to export a boosted tree model, set this property to the value ML_XGBOOST_BOOSTER. (A sketch of the full job resource follows this list.)

  5. To check the job status, call jobs.get(job_id) with the ID of the job returned by the initial request.

    • If status.state = DONE, the job completed successfully.
    • If the status.errorResult property is present, the request failed, and that object will include information describing what went wrong.
    • If status.errorResult is absent, the job finished successfully, although there might have been some non-fatal errors. Non-fatal errors are listed in the returned job object's status.errors property.
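Putting steps 1 through 4 together, the job resource sent to jobs.insert might look like the following sketch, expressed here as a Python dict. The project, dataset, model, bucket, and job ID values are illustrative, not defaults of any API.

import uuid

# Illustrative job resource for jobs.insert; the field names follow the steps above.
job = {
    "jobReference": {
        "projectId": "myproject",
        # A unique, caller-generated job ID makes retries safe (see the API notes below).
        "jobId": "export_model_" + uuid.uuid4().hex,
        "location": "US",
    },
    "configuration": {
        "extract": {
            # Step 2: the source model.
            "sourceModel": {
                "projectId": "myproject",
                "datasetId": "mydataset",
                "modelId": "mymodel",
            },
            # Step 3: a fully qualified Cloud Storage URI.
            "destinationUris": ["gs://bucket/model_folder"],
            # Step 4: the export format.
            "destinationFormat": "ML_XGBOOST_BOOSTER",
        }
    },
}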

API notes:

  • As a best practice, generate a unique ID and pass it as jobReference.jobId when calling jobs.insert to create a job. This approach is more robust to network failure because the client can poll or retry on the known job ID.

  • Calling jobs.insert on a given job ID is idempotent; in other words, you can retry as many times as you like on the same job ID, and at most one of those operations will succeed.

Java

Before trying this sample, follow the Java setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Java API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryException;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.ExtractJobConfiguration;
import com.google.cloud.bigquery.Job;
import com.google.cloud.bigquery.JobInfo;
import com.google.cloud.bigquery.ModelId;

// Sample to extract model to GCS bucket
public class ExtractModel {

  public static void main(String[] args) throws InterruptedException {
    // TODO(developer): Replace these variables before running the sample.
    String projectName = "bigquery-public-data";
    String datasetName = "samples";
    String modelName = "model";
    String bucketName = "MY-BUCKET-NAME";
    String destinationUri = "gs://" + bucketName + "/path/to/file";
    extractModel(projectName, datasetName, modelName, destinationUri);
  }

  public static void extractModel(
      String projectName, String datasetName, String modelName, String destinationUri)
      throws InterruptedException {
    try {
      // Initialize client that will be used to send requests. This client only needs to be created
      // once, and can be reused for multiple requests.
      BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();

      ModelId modelId = ModelId.of(projectName, datasetName, modelName);

      ExtractJobConfiguration extractConfig =
          ExtractJobConfiguration.newBuilder(modelId, destinationUri).build();

      Job job = bigquery.create(JobInfo.of(extractConfig));

      // Blocks until this job completes its execution, either failing or succeeding.
      Job completedJob = job.waitFor();
      if (completedJob == null) {
        System.out.println("Job not executed since it no longer exists.");
        return;
      } else if (completedJob.getStatus().getError() != null) {
        System.out.println(
            "BigQuery was unable to extract due to an error: \n" + job.getStatus().getError());
        return;
      }
      System.out.println("Model extract successful");
    } catch (BigQueryException ex) {
      System.out.println("Model extraction job was interrupted. \n" + ex.toString());
    }
  }
}

Model deployment

You can deploy the exported model to Vertex AI as well as locally. If the model's TRANSFORM clause contains Date functions, Datetime functions, Time functions, or Timestamp functions, you must use the bigquery-ml-utils library in the container. The exception is if you are deploying through Model Registry, which does not need exported models or serving containers.

Vertex AI deployment

Export model format: TensorFlow SavedModel (non-AutoML models)
Deployment: Deploy a TensorFlow SavedModel. You must create the SavedModel file using a supported version of TensorFlow.

Export model format: TensorFlow SavedModel (AutoML models)
Deployment: Not supported.

Export model format: XGBoost Booster
Deployment: Use a custom prediction routine. For XGBoost Booster models, preprocessing and postprocessing information is saved in the exported files, and a custom prediction routine lets you deploy the model with the extra exported files. You must create the model files using a supported version of XGBoost.

Local deployment

Export model format: TensorFlow SavedModel (non-AutoML models)
Deployment: SavedModel is a standard format, and you can deploy it in a TensorFlow Serving Docker container. You can also leverage the local run of Vertex AI online prediction.

Export model format: TensorFlow SavedModel (AutoML models)
Deployment: Containerize and run the model.

Export model format: XGBoost Booster
Deployment: To run XGBoost Booster models locally, you can use the exported main.py file:
  1. Download all of the files from Cloud Storage to the local directory.
  2. Unzip the predictor.py file from xgboost_predictor-0.1.tar.gz to the local directory.
  3. Run main.py (see instructions in main.py).
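For the SavedModel formats, the following is a minimal local-inference sketch, assuming the exported files were downloaded to ./saved_model and that the model has a single STRING input column named f1 (both the path and the column name are illustrative):

import tensorflow as tf

# Load the exported SavedModel and get its default serving signature.
model = tf.saved_model.load("./saved_model")
infer = model.signatures["serving_default"]

# Input names and dtypes must match the exported signature;
# numeric inputs are passed as FLOAT64, as noted earlier on this page.
print(infer(f1=tf.constant(["abc", "def"], dtype=tf.string)))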

Prediction output format

This section provides the prediction output format of the exported models for each model type. All exported models support batch prediction; they can handle multiple input rows at a time. For example, there are two input rows in each of the following output format examples.

AUTOENCODER

Prediction output format:

+------------------------+------------------------+------------------------+
|      LATENT_COL_1      |      LATENT_COL_2      |           ...          |
+------------------------+------------------------+------------------------+
|       [FLOAT]          |         [FLOAT]        |           ...          |
+------------------------+------------------------+------------------------+

Output sample:

+------------------+------------------+------------------+------------------+
|   LATENT_COL_1   |   LATENT_COL_2   |   LATENT_COL_3   |   LATENT_COL_4   |
+------------------+------------------+------------------+------------------+
|    0.21384512    |    0.93457112    |    0.64978097    |    0.00480489    |
+------------------+------------------+------------------+------------------+

AUTOML_CLASSIFIER

Prediction output format:

+------------------------------------------+
| predictions                              |
+------------------------------------------+
| [{"scores":[FLOAT], "classes":[STRING]}] |
+------------------------------------------+

Output sample:

+---------------------------------------------+
| predictions                                 |
+---------------------------------------------+
| [{"scores":[1, 2], "classes":['a', 'b']},   |
|  {"scores":[3, 0.2], "classes":['a', 'b']}] |
+---------------------------------------------+

AUTOML_REGRESSOR

Prediction output format:

+-----------------+
| predictions     |
+-----------------+
| [FLOAT]         |
+-----------------+

Output sample:

+-----------------+
| predictions     |
+-----------------+
| [1.8, 2.46]     |
+-----------------+

BOOSTED_TREE_CLASSIFIER and RANDOM_FOREST_CLASSIFIER

Prediction output format:

+-------------+--------------+-----------------+
| LABEL_PROBS | LABEL_VALUES | PREDICTED_LABEL |
+-------------+--------------+-----------------+
| [FLOAT]     | [STRING]     | STRING          |
+-------------+--------------+-----------------+

Output sample:

+-------------+--------------+-----------------+
| LABEL_PROBS | LABEL_VALUES | PREDICTED_LABEL |
+-------------+--------------+-----------------+
| [0.1, 0.9]  | ['a', 'b']   | ['b']           |
+-------------+--------------+-----------------+
| [0.8, 0.2]  | ['a', 'b']   | ['a']           |
+-------------+--------------+-----------------+

BOOSTED_TREE_REGRESSOR and RANDOM_FOREST_REGRESSOR

Prediction output format:

+-----------------+
| predicted_label |
+-----------------+
| FLOAT           |
+-----------------+

Output sample:

+-----------------+
| predicted_label |
+-----------------+
| [1.8]           |
+-----------------+
| [2.46]          |
+-----------------+

DNN_CLASSIFIER

Prediction output format:

+---------------+-------------+-----------+---------+------------------------+--------+---------------+
| ALL_CLASS_IDS | ALL_CLASSES | CLASS_IDS | CLASSES | LOGISTIC (binary only) | LOGITS | PROBABILITIES |
+---------------+-------------+-----------+---------+------------------------+--------+---------------+
| [INT64]       | [STRING]    | INT64     | STRING  | FLOAT                  | [FLOAT]| [FLOAT]       |
+---------------+-------------+-----------+---------+------------------------+--------+---------------+

Output sample:

+---------------+-------------+-----------+---------+------------------------+--------+---------------+
| ALL_CLASS_IDS | ALL_CLASSES | CLASS_IDS | CLASSES | LOGISTIC (binary only) | LOGITS | PROBABILITIES |
+---------------+-------------+-----------+---------+------------------------+--------+---------------+
| [0, 1]        | ['a', 'b']  | [0]       | ['a']   | [0.36]                 | [-0.53]| [0.64, 0.36]  |
+---------------+-------------+-----------+---------+------------------------+--------+---------------+
| [0, 1]        | ['a', 'b']  | [0]       | ['a']   | [0.2]                  | [-1.38]| [0.8, 0.2]    |
+---------------+-------------+-----------+---------+------------------------+--------+---------------+

DNN_REGRESSOR

Prediction output format:

+-----------------+
| PREDICTED_LABEL |
+-----------------+
| FLOAT           |
+-----------------+

Output sample:

+-----------------+
| PREDICTED_LABEL |
+-----------------+
| [1.8]           |
+-----------------+
| [2.46]          |
+-----------------+

DNN_LINEAR_COMBINED_CLASSIFIER

Prediction output format:

+---------------+-------------+-----------+---------+------------------------+--------+---------------+
| ALL_CLASS_IDS | ALL_CLASSES | CLASS_IDS | CLASSES | LOGISTIC (binary only) | LOGITS | PROBABILITIES |
+---------------+-------------+-----------+---------+------------------------+--------+---------------+
| [INT64]       | [STRING]    | INT64     | STRING  | FLOAT                  | [FLOAT]| [FLOAT]       |
+---------------+-------------+-----------+---------+------------------------+--------+---------------+

Output sample:

+---------------+-------------+-----------+---------+------------------------+--------+---------------+
| ALL_CLASS_IDS | ALL_CLASSES | CLASS_IDS | CLASSES | LOGISTIC (binary only) | LOGITS | PROBABILITIES |
+---------------+-------------+-----------+---------+------------------------+--------+---------------+
| [0, 1]        | ['a', 'b']  | [0]       | ['a']   | [0.36]                 | [-0.53]| [0.64, 0.36]  |
+---------------+-------------+-----------+---------+------------------------+--------+---------------+
| [0, 1]        | ['a', 'b']  | [0]       | ['a']   | [0.2]                  | [-1.38]| [0.8, 0.2]    |
+---------------+-------------+-----------+---------+------------------------+--------+---------------+

DNN_LINEAR_COMBINED_REGRESSOR

Prediction output format:

+-----------------+
| PREDICTED_LABEL |
+-----------------+
| FLOAT           |
+-----------------+

Output sample:

+-----------------+
| PREDICTED_LABEL |
+-----------------+
| [1.8]           |
+-----------------+
| [2.46]          |
+-----------------+

KMEANS

Prediction output format:

+--------------------+--------------+---------------------+
| CENTROID_DISTANCES | CENTROID_IDS | NEAREST_CENTROID_ID |
+--------------------+--------------+---------------------+
| [FLOAT]            | [INT64]      | INT64               |
+--------------------+--------------+---------------------+

Output sample:

+--------------------+--------------+---------------------+
| CENTROID_DISTANCES | CENTROID_IDS | NEAREST_CENTROID_ID |
+--------------------+--------------+---------------------+
| [1.2, 1.3]         | [1, 2]       | [1]                 |
+--------------------+--------------+---------------------+
| [0.4, 0.1]         | [1, 2]       | [2]                 |
+--------------------+--------------+---------------------+

LINEAR_REG

Prediction output format:

+-----------------+
| PREDICTED_LABEL |
+-----------------+
| FLOAT           |
+-----------------+

Output sample:

+-----------------+
| PREDICTED_LABEL |
+-----------------+
| [1.8]           |
+-----------------+
| [2.46]          |
+-----------------+

LOGISTIC_REG

Prediction output format:

+-------------+--------------+-----------------+
| LABEL_PROBS | LABEL_VALUES | PREDICTED_LABEL |
+-------------+--------------+-----------------+
| [FLOAT]     | [STRING]     | STRING          |
+-------------+--------------+-----------------+

Output sample:

+-------------+--------------+-----------------+
| LABEL_PROBS | LABEL_VALUES | PREDICTED_LABEL |
+-------------+--------------+-----------------+
| [0.1, 0.9]  | ['a', 'b']   | ['b']           |
+-------------+--------------+-----------------+
| [0.8, 0.2]  | ['a', 'b']   | ['a']           |
+-------------+--------------+-----------------+

MATRIX_FACTORIZATION

Note: Matrix factorization models only support taking a single input user, and they output the top 50 (predicted_rating, predicted_item) pairs sorted by predicted_rating in descending order.

Prediction output format:

+------------------+----------------+
| PREDICTED_RATING | PREDICTED_ITEM |
+------------------+----------------+
| [FLOAT]          | [STRING]       |
+------------------+----------------+

Output sample:

+------------------+----------------+
| PREDICTED_RATING | PREDICTED_ITEM |
+------------------+----------------+
| [5.5, 1.7]       | ['A', 'B']     |
+------------------+----------------+
| [7.2, 2.7]       | ['B', 'A']     |
+------------------+----------------+

TENSORFLOW (imported)

Prediction output format: same as the imported model.

PCA

Prediction output format:

+-------------------------+---------------------------------+
| PRINCIPAL_COMPONENT_IDS | PRINCIPAL_COMPONENT_PROJECTIONS |
+-------------------------+---------------------------------+
|       [INT64]           |             [FLOAT]             |
+-------------------------+---------------------------------+

Output sample:

+-------------------------+---------------------------------+
| PRINCIPAL_COMPONENT_IDS | PRINCIPAL_COMPONENT_PROJECTIONS |
+-------------------------+---------------------------------+
|       [1, 2]            |             [1.2, 5.0]          |
+-------------------------+---------------------------------+

TRANSFORM_ONLY

Prediction output format: same as the columns specified in the model's TRANSFORM clause.

XGBoost model visualization

You can visualize the boosted trees using the plot_tree Python API after model export. For example, you can leverage Colab without installing the dependencies:

  1. Export the boosted tree model to a Cloud Storage bucket.
  2. Download themodel.bst file from the Cloud Storage bucket.
  3. In a Colab notebook, upload the model.bst file to Files.
  4. Run the following code in the notebook:

    import xgboost as xgb
    import matplotlib.pyplot as plt

    model = xgb.Booster(model_file="model.bst")
    num_iterations = <iteration_number>
    for tree_num in range(num_iterations):
      xgb.plot_tree(model, num_trees=tree_num)
    plt.show()

This example plots multiple trees (one tree per iteration).

Note: We use the label encoder to encode categorical features, so you can get the corresponding category for a split value from the vocabulary file in the assets/ directory inside the model export Cloud Storage bucket. For example, when you see "f0 < 2.95" in a node, you can find the corresponding category in the vocabulary file by looking for the 3rd item.

We don't save feature names in the model, so you will see names such as "f0", "f1", and so on. You can find the corresponding feature names in the assets/model_metadata.json exported file using these names (such as "f0") as indexes.
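For example, a minimal sketch for inspecting that mapping locally; the exact JSON layout of model_metadata.json isn't documented on this page, so this only loads and prints the file after you download it from the assets/ directory:

import json

# Print the exported metadata so you can look up the feature names behind "f0", "f1", ...
with open("model_metadata.json") as f:
    print(json.dumps(json.load(f), indent=2))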

Required permissions

To export a BigQuery ML model to Cloud Storage, you need permissions to access the BigQuery ML model, permissions to run an extract job, and permissions to write the data to the Cloud Storage bucket.

BigQuery permissions

  • To export a BigQuery ML model, you must be granted the bigquery.models.export permission.
  • To run an extract job, you must be granted the bigquery.jobs.create permission.

Cloud Storage permissions

  • To write the data to an existing Cloud Storage bucket, you must be granted storage.objects.create permissions. The following predefined IAM roles are granted storage.objects.create permissions:

    • storage.objectCreator
    • storage.objectAdmin
    • storage.admin

For more information on IAM roles and permissions in BigQuery ML, see Access control.

Move BigQuery data between locations

You cannot change the location of a dataset after it is created, but you can make a copy of the dataset.

Quota policy

For information on extract job quotas, see Extract jobs on the Quotas and limits page.

Pricing

There is no charge for exporting BigQuery ML models, but exports are subject to BigQuery's Quotas and limits. For more information on BigQuery pricing, see the Pricing page.

After the data is exported, you are charged for storing the data in Cloud Storage. For more information on Cloud Storage pricing, see the Cloud Storage Pricing page.
