Hyperparameter tuning overview
In machine learning, hyperparameter tuning identifies a set of optimal hyperparameters for a learning algorithm. A hyperparameter is a model argument whose value is set before the learning process begins. By contrast, the values of other parameters, such as the coefficients of a linear model, are learned.
Hyperparameter tuning lets you spend less time manually iterating hyperparameters and more time focusing on exploring insights from data.
You can specify hyperparameter tuning options for the following model types:
- Linear and logistic regression
- K-means
- Matrix factorization
- Autoencoder
- Boosted trees
- Random forest
- Deep neural network (DNN)
- Wide & Deep network
For these types of models, hyperparameter tuning is enabled when you specify a value for the NUM_TRIALS option in the CREATE MODEL statement.
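For example, the following minimal sketch enables tuning with 10 trials; the dataset, table, and label column names are hypothetical placeholders, and everything else about the model definition stays the same:

```sql
-- Minimal sketch: NUM_TRIALS alone turns on hyperparameter tuning.
-- `mydataset.sample_model`, `mydataset.training_data`, and `label`
-- are hypothetical names.
CREATE OR REPLACE MODEL `mydataset.sample_model`
OPTIONS (
  MODEL_TYPE = 'LINEAR_REG',
  NUM_TRIALS = 10,
  INPUT_LABEL_COLS = ['label']
) AS
SELECT * FROM `mydataset.training_data`;
```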
To try running hyperparameter tuning on a linear regression model, see Use the BigQuery ML hyperparameter tuning to improve model performance.
The following models also support hyperparameter tuning but don't allow you to specify particular values:
- AutoML Tables models have automatic hyperparameter tuning embedded in the model training by default.
- ARIMA_PLUS models let you set the AUTO_ARIMA argument to perform hyperparameter tuning using the auto.ARIMA algorithm. This algorithm performs hyperparameter tuning for the trend module. Hyperparameter tuning isn't supported for the entire modeling pipeline.
Locations
For information about which locations support hyperparameter tuning, see BigQuery ML locations.
Set hyperparameters
To tune a hyperparameter, you must specify a range of values for that hyperparameter that the model can use for a set of trials. You can do this by using one of the following keywords when setting the hyperparameter in the CREATE MODEL statement, instead of providing a single value:
- HPARAM_RANGE: A two-element ARRAY(FLOAT64) value that defines the minimum and maximum bounds of the search space of continuous values for a hyperparameter. Use this option to specify a range of values for a hyperparameter, for example LEARN_RATE = HPARAM_RANGE(0.0001, 1.0).
- HPARAM_CANDIDATES: An ARRAY(STRUCT) value that specifies the set of discrete values for the hyperparameter. Use this option to specify a set of values for a hyperparameter, for example OPTIMIZER = HPARAM_CANDIDATES(['ADAGRAD', 'SGD', 'FTRL']).
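To make this concrete, here is a sketch that combines both keywords in one statement. The dataset, table, and label column names are hypothetical; the options are the ones documented above:

```sql
-- HPARAM_RANGE defines a continuous search space; HPARAM_CANDIDATES
-- defines a discrete one. Dataset, table, and label names are hypothetical.
CREATE OR REPLACE MODEL `mydataset.tuned_dnn`
OPTIONS (
  MODEL_TYPE = 'DNN_CLASSIFIER',
  NUM_TRIALS = 20,
  LEARN_RATE = HPARAM_RANGE(0.0001, 1.0),
  OPTIMIZER = HPARAM_CANDIDATES(['ADAGRAD', 'SGD', 'FTRL']),
  INPUT_LABEL_COLS = ['label']
) AS
SELECT * FROM `mydataset.training_data`;
```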
Hyperparameters and objectives
The following table lists the supported hyperparameters and objectives for each model type that supports hyperparameter tuning:
| Model type | Hyperparameter objectives | Hyperparameter | Valid range | Default range | Scale type |
|---|---|---|---|---|---|
| LINEAR_REG | MEAN_ABSOLUTE_ERROR<br>MEAN_SQUARED_ERROR<br>MEAN_SQUARED_LOG_ERROR<br>MEDIAN_ABSOLUTE_ERROR<br>R2_SCORE (default)<br>EXPLAINED_VARIANCE | L1_REG<br>L2_REG | (0, ∞]<br>(0, ∞] | (0, 10]<br>(0, 10] | LOG<br>LOG |
| LOGISTIC_REG | PRECISION<br>RECALL<br>ACCURACY<br>F1_SCORE<br>LOG_LOSS<br>ROC_AUC (default) | L1_REG<br>L2_REG | (0, ∞]<br>(0, ∞] | (0, 10]<br>(0, 10] | LOG<br>LOG |
| KMEANS | DAVIES_BOULDIN_INDEX | NUM_CLUSTERS | [2, 100] | [2, 10] | LINEAR |
| MATRIX_FACTORIZATION (explicit) | MEAN_SQUARED_ERROR | NUM_FACTORS<br>L2_REG | [2, 200]<br>(0, ∞) | [2, 20]<br>(0, 10] | LINEAR<br>LOG |
| MATRIX_FACTORIZATION (implicit) | MEAN_AVERAGE_PRECISION (default)<br>MEAN_SQUARED_ERROR<br>NORMALIZED_DISCOUNTED_CUMULATIVE_GAIN<br>AVERAGE_RANK | NUM_FACTORS<br>L2_REG<br>WALS_ALPHA | [2, 200]<br>(0, ∞)<br>[0, ∞) | [2, 20]<br>(0, 10]<br>[0, 100] | LINEAR<br>LOG<br>LINEAR |
| AUTOENCODER | MEAN_ABSOLUTE_ERROR<br>MEAN_SQUARED_ERROR (default)<br>MEAN_SQUARED_LOG_ERROR | LEARN_RATE<br>BATCH_SIZE<br>L1_REG<br>L2_REG<br>L1_REG_ACTIVATION<br>DROPOUT<br>HIDDEN_UNITS<br>OPTIMIZER<br>ACTIVATION_FN | [0, 1]<br>(0, ∞)<br>(0, ∞)<br>(0, ∞)<br>(0, ∞)<br>[0, 1)<br>Array of [1, ∞)<br>{ADAM, ADAGRAD, FTRL, RMSPROP, SGD}<br>{RELU, RELU6, CRELU, ELU, SELU, SIGMOID, TANH} | [0, 1]<br>[16, 1024]<br>(0, 10]<br>(0, 10]<br>(0, 10]<br>[0, 0.8]<br>N/A<br>{ADAM, ADAGRAD, FTRL, RMSPROP, SGD}<br>N/A | LOG<br>LOG<br>LOG<br>LOG<br>LOG<br>LINEAR<br>N/A<br>N/A<br>N/A |
| DNN_CLASSIFIER | PRECISION<br>RECALL<br>ACCURACY<br>F1_SCORE<br>LOG_LOSS<br>ROC_AUC (default) | BATCH_SIZE<br>DROPOUT<br>HIDDEN_UNITS<br>LEARN_RATE<br>OPTIMIZER<br>L1_REG<br>L2_REG<br>ACTIVATION_FN | (0, ∞)<br>[0, 1)<br>Array of [1, ∞)<br>[0, 1]<br>{ADAM, ADAGRAD, FTRL, RMSPROP, SGD}<br>(0, ∞)<br>(0, ∞)<br>{RELU, RELU6, CRELU, ELU, SELU, SIGMOID, TANH} | [16, 1024]<br>[0, 0.8]<br>N/A<br>[0, 1]<br>{ADAM, ADAGRAD, FTRL, RMSPROP, SGD}<br>(0, 10]<br>(0, 10]<br>N/A | LOG<br>LINEAR<br>N/A<br>LINEAR<br>N/A<br>LOG<br>LOG<br>N/A |
| DNN_REGRESSOR | MEAN_ABSOLUTE_ERROR<br>MEAN_SQUARED_ERROR<br>MEAN_SQUARED_LOG_ERROR<br>MEDIAN_ABSOLUTE_ERROR<br>R2_SCORE (default)<br>EXPLAINED_VARIANCE | Same as DNN_CLASSIFIER | | | |
| DNN_LINEAR_COMBINED_CLASSIFIER | PRECISION<br>RECALL<br>ACCURACY<br>F1_SCORE<br>LOG_LOSS<br>ROC_AUC (default) | BATCH_SIZE<br>DROPOUT<br>HIDDEN_UNITS<br>L1_REG<br>L2_REG<br>ACTIVATION_FN | (0, ∞)<br>[0, 1)<br>Array of [1, ∞)<br>(0, ∞)<br>(0, ∞)<br>{RELU, RELU6, CRELU, ELU, SELU, SIGMOID, TANH} | [16, 1024]<br>[0, 0.8]<br>N/A<br>(0, 10]<br>(0, 10]<br>N/A | LOG<br>LINEAR<br>N/A<br>LOG<br>LOG<br>N/A |
| DNN_LINEAR_COMBINED_REGRESSOR | MEAN_ABSOLUTE_ERROR<br>MEAN_SQUARED_ERROR<br>MEAN_SQUARED_LOG_ERROR<br>MEDIAN_ABSOLUTE_ERROR<br>R2_SCORE (default)<br>EXPLAINED_VARIANCE | Same as DNN_LINEAR_COMBINED_CLASSIFIER | | | |
| BOOSTED_TREE_CLASSIFIER | PRECISION<br>RECALL<br>ACCURACY<br>F1_SCORE<br>LOG_LOSS<br>ROC_AUC (default) | LEARN_RATE<br>L1_REG<br>L2_REG<br>DROPOUT<br>MAX_TREE_DEPTH<br>SUBSAMPLE<br>MIN_SPLIT_LOSS<br>NUM_PARALLEL_TREE<br>MIN_TREE_CHILD_WEIGHT<br>COLSAMPLE_BYTREE<br>COLSAMPLE_BYLEVEL<br>COLSAMPLE_BYNODE<br>BOOSTER_TYPE<br>DART_NORMALIZE_TYPE<br>TREE_METHOD | [0, ∞)<br>(0, ∞)<br>(0, ∞)<br>[0, 1]<br>[1, 20]<br>(0, 1]<br>[0, ∞)<br>[1, ∞)<br>[0, ∞)<br>[0, 1]<br>[0, 1]<br>[0, 1]<br>{GBTREE, DART}<br>{TREE, FOREST}<br>{AUTO, EXACT, APPROX, HIST} | [0, 1]<br>(0, 10]<br>(0, 10]<br>N/A<br>[1, 10]<br>(0, 1]<br>N/A<br>N/A<br>N/A<br>N/A<br>N/A<br>N/A<br>N/A<br>N/A<br>N/A | LINEAR<br>LOG<br>LOG<br>LINEAR<br>LINEAR<br>LINEAR<br>LINEAR<br>LINEAR<br>LINEAR<br>LINEAR<br>LINEAR<br>LINEAR<br>N/A<br>N/A<br>N/A |
| BOOSTED_TREE_REGRESSOR | MEAN_ABSOLUTE_ERROR<br>MEAN_SQUARED_ERROR<br>MEAN_SQUARED_LOG_ERROR<br>MEDIAN_ABSOLUTE_ERROR<br>R2_SCORE (default)<br>EXPLAINED_VARIANCE | Same as BOOSTED_TREE_CLASSIFIER | | | |
| RANDOM_FOREST_CLASSIFIER | PRECISION<br>RECALL<br>ACCURACY<br>F1_SCORE<br>LOG_LOSS<br>ROC_AUC (default) | L1_REG<br>L2_REG<br>MAX_TREE_DEPTH<br>SUBSAMPLE<br>MIN_SPLIT_LOSS<br>NUM_PARALLEL_TREE<br>MIN_TREE_CHILD_WEIGHT<br>COLSAMPLE_BYTREE<br>COLSAMPLE_BYLEVEL<br>COLSAMPLE_BYNODE<br>TREE_METHOD | (0, ∞)<br>(0, ∞)<br>[1, 20]<br>(0, 1)<br>[0, ∞)<br>[2, ∞)<br>[0, ∞)<br>[0, 1]<br>[0, 1]<br>[0, 1]<br>{AUTO, EXACT, APPROX, HIST} | (0, 10]<br>(0, 10]<br>[1, 20]<br>(0, 1)<br>N/A<br>[2, 200]<br>N/A<br>N/A<br>N/A<br>N/A<br>N/A | LOG<br>LOG<br>LINEAR<br>LINEAR<br>LINEAR<br>LINEAR<br>LINEAR<br>LINEAR<br>LINEAR<br>LINEAR<br>N/A |
| RANDOM_FOREST_REGRESSOR | MEAN_ABSOLUTE_ERROR<br>MEAN_SQUARED_ERROR<br>MEAN_SQUARED_LOG_ERROR<br>MEDIAN_ABSOLUTE_ERROR<br>R2_SCORE (default)<br>EXPLAINED_VARIANCE | Same as RANDOM_FOREST_CLASSIFIER | | | |
Most LOG scale hyperparameters use the open lower boundary of 0. You can still set 0 as the lower boundary by using the HPARAM_RANGE keyword to set the hyperparameter range. For example, in a boosted tree classifier model, you could set the range for the L1_REG hyperparameter as L1_REG = HPARAM_RANGE(0, 5). A value of 0 gets converted to 1e-14.
Conditional hyperparameters are supported. For example, in a boosted tree regressor model, you can only tune the DART_NORMALIZE_TYPE hyperparameter when the value of the BOOSTER_TYPE hyperparameter is DART. In this case, you specify both search spaces and the conditions are handled automatically, as shown in the following example:
```sql
BOOSTER_TYPE = HPARAM_CANDIDATES(['DART', 'GBTREE'])
DART_NORMALIZE_TYPE = HPARAM_CANDIDATES(['TREE', 'FOREST'])
```
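Placed in context, a complete statement might look like the following sketch; the dataset, table, and label column names are hypothetical:

```sql
CREATE OR REPLACE MODEL `mydataset.boosted_tree_tuned`
OPTIONS (
  MODEL_TYPE = 'BOOSTED_TREE_REGRESSOR',
  NUM_TRIALS = 20,
  BOOSTER_TYPE = HPARAM_CANDIDATES(['DART', 'GBTREE']),
  -- DART_NORMALIZE_TYPE is only tuned in trials where BOOSTER_TYPE is
  -- DART; the condition is handled automatically.
  DART_NORMALIZE_TYPE = HPARAM_CANDIDATES(['TREE', 'FOREST']),
  INPUT_LABEL_COLS = ['label']
) AS
SELECT * FROM `mydataset.training_data`;
```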
Search starting point

If you don't specify a search space for a hyperparameter by using HPARAM_RANGE or HPARAM_CANDIDATES, the search starts from the default value of that hyperparameter, as documented in the CREATE MODEL topic for that model type. For example, if you are running hyperparameter tuning for a boosted tree model, and you don't specify a value for the L1_REG hyperparameter, then the search starts from 0, the default value.
If you specify a search space for a hyperparameter by using HPARAM_RANGE or HPARAM_CANDIDATES, the search starting point depends on whether the specified search space includes the default value for that hyperparameter, as documented in the CREATE MODEL topic for that model type:
- If the specified range contains the default value, that's where the search starts. For example, if you are running hyperparameter tuning for an implicit matrix factorization model, and you specify the value [20, 30, 40, 50] for the WALS_ALPHA hyperparameter, then the search starts at 40, the default value (see the sketch following this list).
- If the specified range doesn't contain the default value, the search starts from the point in the specified range that is closest to the default value. For example, if you specify the value [10, 20, 30] for the WALS_ALPHA hyperparameter, then the search starts from 30, which is the closest value to the default value of 40.
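The following sketch specifies the candidate list from the first bullet for an implicit matrix factorization model. The dataset, table, and column names are hypothetical, and passing numeric candidates to HPARAM_CANDIDATES is an assumption here; check the CREATE MODEL reference for the exact syntax:

```sql
-- The WALS_ALPHA candidates include the default value of 40, so the
-- search starts there. All names below are hypothetical.
CREATE OR REPLACE MODEL `mydataset.mf_tuned`
OPTIONS (
  MODEL_TYPE = 'MATRIX_FACTORIZATION',
  FEEDBACK_TYPE = 'IMPLICIT',
  NUM_TRIALS = 10,
  WALS_ALPHA = HPARAM_CANDIDATES([20, 30, 40, 50]),
  USER_COL = 'user_id',
  ITEM_COL = 'item_id',
  RATING_COL = 'rating'
) AS
SELECT user_id, item_id, rating FROM `mydataset.ratings`;
```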
Data split
When you specify a value for the NUM_TRIALS option, the service identifies that you are doing hyperparameter tuning and automatically performs a 3-way split on the input data to divide it into training, evaluation, and test sets. By default, the input data is randomized and then split 80% for training, 10% for evaluation, and 10% for testing.
The training and evaluation sets are used in each trial training, the same as in models that don't use hyperparameter tuning. The trial hyperparameter suggestions are calculated based on the model evaluation metrics for that model type. At the end of each trial training, the test set is used to test the trial and record its metrics in the model. This ensures the objectivity of the final reporting evaluation metrics by using data that has not yet been analyzed by the model. Evaluation data is used to calculate the intermediate metrics for hyperparameter suggestion, while the test data is used to calculate the final, objective model metrics.
If you want to use only a training set, specify NO_SPLIT for the DATA_SPLIT_METHOD option of the CREATE MODEL statement.
If you want to use only training and evaluation sets, specify 0 for the DATA_SPLIT_TEST_FRACTION option of the CREATE MODEL statement. When the test set is empty, the evaluation set is used as the test set for the final evaluation metrics reporting.
The metrics from models that are generated from a normal training job and those from a hyperparameter tuning training job are only comparable when the data split fractions are equal. For example, the following models are comparable:
- Non-hyperparameter tuning: DATA_SPLIT_METHOD='RANDOM', DATA_SPLIT_EVAL_FRACTION=0.2
- Hyperparameter tuning: DATA_SPLIT_METHOD='RANDOM', DATA_SPLIT_EVAL_FRACTION=0.2, DATA_SPLIT_TEST_FRACTION=0
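In SQL terms, the hyperparameter tuning model from the second bullet could be declared like the following sketch; the model, table, and label names are hypothetical:

```sql
-- Data split options matching the comparable-models example above:
-- 20% evaluation data and no test set, so metrics line up with a normal
-- training job that used DATA_SPLIT_EVAL_FRACTION = 0.2.
CREATE OR REPLACE MODEL `mydataset.comparable_tuned_model`
OPTIONS (
  MODEL_TYPE = 'LINEAR_REG',
  NUM_TRIALS = 10,
  L1_REG = HPARAM_RANGE(0, 5),
  DATA_SPLIT_METHOD = 'RANDOM',
  DATA_SPLIT_EVAL_FRACTION = 0.2,
  DATA_SPLIT_TEST_FRACTION = 0,
  INPUT_LABEL_COLS = ['label']
) AS
SELECT * FROM `mydataset.training_data`;
```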
Performance
Model performance when using hyperparameter tuning is typically no worse than model performance when using the default hyperparameters without hyperparameter tuning. This is because a hyperparameter tuning model that uses the default search space always uses the default hyperparameters in the first trial.
To confirm the model performance improvements provided by hyperparameter tuning, compare the optimal trial for the hyperparameter tuning model to the first trial for the non-hyperparameter tuning model.
Transfer learning
Transfer learning is enabled by default when you set the HPARAM_TUNING_ALGORITHM option in the CREATE MODEL statement to VIZIER_DEFAULT. The hyperparameter tuning for a model benefits by learning from previously tuned models if it meets the following requirements:
- It has the same model type as previously tuned models.
- It resides in the same project as previously tuned models.
- It uses the same hyperparameter search space OR a subset of the hyperparameter search space of previously tuned models. A subset uses the same hyperparameter names and types, but doesn't have to have the same ranges. For example, (a:[0, 10]) is considered a subset of (a:[-1, 1], b:[0, 1]).
Transfer learning doesn't require that the input data be the same.
Transfer learning helps solve the cold start problem where the system performs random exploration during the first trial batch. Transfer learning provides the system with some initial knowledge about the hyperparameters and their objectives. To continuously improve the model quality, always train a new hyperparameter tuning model with the same hyperparameters or a subset of them.
Transfer learning helps hyperparameter tuning converge faster, instead of helping submodels to converge.
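As a sketch (model, table, and label names are hypothetical), the only explicit opt-in is the HPARAM_TUNING_ALGORITHM option; the transfer itself happens automatically when the requirements above are met:

```sql
CREATE OR REPLACE MODEL `mydataset.tuned_v2`
OPTIONS (
  MODEL_TYPE = 'DNN_CLASSIFIER',
  NUM_TRIALS = 20,
  -- VIZIER_DEFAULT enables transfer learning from previously tuned
  -- models of the same type in the same project.
  HPARAM_TUNING_ALGORITHM = 'VIZIER_DEFAULT',
  -- Using the same search space (or a subset) as the earlier model
  -- lets transfer learning apply.
  LEARN_RATE = HPARAM_RANGE(0.001, 0.1),
  INPUT_LABEL_COLS = ['label']
) AS
SELECT * FROM `mydataset.training_data`;
```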
Error handling
Hyperparameter tuning handles errors in the following ways:
- Cancellation: If a training job is cancelled while running, then all successful trials remain usable.
- Invalid input: If the user input is invalid, then the service returns a user error.
- Invalid hyperparameters: If the hyperparameters are invalid for a trial, then the trial is skipped and marked as INFEASIBLE in the output from the ML.TRIAL_INFO function.
- Trial internal error: If more than 10% of the NUM_TRIALS value fail due to INTERNAL_ERROR, then the training job stops and returns a user error. If less than 10% of the NUM_TRIALS value fail due to INTERNAL_ERROR, the training continues with the failed trials marked as FAILED in the output from the ML.TRIAL_INFO function.
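To see which trials were skipped or failed, you can inspect the model's trials; a minimal query (the model name is hypothetical) looks like this:

```sql
-- Each row describes one trial; skipped trials are reported as
-- INFEASIBLE and failed trials as FAILED in the status information.
SELECT *
FROM ML.TRIAL_INFO(MODEL `mydataset.tuned_model`);
```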
Model serving functions
You can use output models from hyperparameter tuning with a number of existing model serving functions. To use these functions, follow these rules:
- When the function takes input data, only the result from one trial is returned. By default this is the optimal trial, but you can also choose a particular trial by specifying the TRIAL_ID as an argument for the given function. You can get the TRIAL_ID from the output of the ML.TRIAL_INFO function.
- When the function doesn't take input data, all trial results are returned, and the first output column is TRIAL_ID.
The output from ML.FEATURE_INFO doesn't change, because all trials share the same input data.
Evaluation metrics from ML.EVALUATE and ML.TRIAL_INFO can be different because of the way the input data is split. By default, ML.EVALUATE runs against the test data, while ML.TRIAL_INFO runs against the evaluation data. For more information, see Data split.
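For example, assuming a tuned model named mydataset.tuned_model and an input table mydataset.input_data (both hypothetical), trial selection for a serving function such as ML.PREDICT might look like the following sketch; the STRUCT argument with trial_id is how these functions are commonly parameterized, but check each function's reference for its exact signature:

```sql
-- Default: predictions come from the optimal trial.
SELECT * FROM ML.PREDICT(MODEL `mydataset.tuned_model`,
                         TABLE `mydataset.input_data`);

-- Choose a particular trial by passing its TRIAL_ID, taken from
-- the ML.TRIAL_INFO output.
SELECT * FROM ML.PREDICT(MODEL `mydataset.tuned_model`,
                         TABLE `mydataset.input_data`,
                         STRUCT(3 AS trial_id));
```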
Unsupported functions
The ML.TRAINING_INFO function returns information for each iteration, and iteration results aren't saved in hyperparameter tuning models. Trial results are saved instead. You can use the ML.TRIAL_INFO function to get information about trial results.
Model export
You can export models created with hyperparameter tuning to Cloud Storage locations by using the EXPORT MODEL statement. You can export the default optimal trial or any specified trial.
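A minimal export sketch follows; the model name and Cloud Storage URI are hypothetical. By default this exports the optimal trial, and the EXPORT MODEL reference documents the option for naming a specific trial:

```sql
-- Exports the optimal trial of the tuned model to Cloud Storage.
EXPORT MODEL `mydataset.tuned_model`
OPTIONS (URI = 'gs://your_bucket/tuned_model/');
```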
Pricing
The cost of hyperparameter tuning training is the sum of the cost of all executed trials. The pricing of a trial is consistent with the existing BigQuery ML pricing model.
FAQ
This section provides answers to some frequently asked questions about hyperparameter tuning.
How many trials do I need to tune a model?
We recommend using at least 10 trials for one hyperparameter, so the total number of trials should be at least 10 * num_hyperparameters. For example, a LINEAR_REG model tunes L1_REG and L2_REG by default, so a NUM_TRIALS value of at least 20 is appropriate. If you are using the default search space, refer to the Hyperparameters column in the Hyperparameters and objectives table for the number of hyperparameters tuned by default for a given model type.
What if I don't see performance improvements by using hyperparameter tuning?
Make sure you follow the guidance in this document to get a fair comparison. If you still don't see performance improvements, it might mean the default hyperparameters already work well for you. You might want to focus on feature engineering or try other model types before trying another round of hyperparameter tuning.
What if I want to continue tuning a model?
Train a new hyperparameter tuning model with the same search space. The built-in transfer learning helps to continue tuning based on your previously tuned models.
Do I need to retrain the model with all data and the optimal hyperparameters?
It depends on the following factors:
- K-means models already use all data as the training data, so there's no need to retrain the model.
- For matrix factorization models, you can retrain the model with the selected hyperparameters and all input data for better coverage of users and items.
- For all other model types, retraining is usually unnecessary. The service already keeps 80% of the input data for training during the default random data split. You can still retrain the model with more training data and the selected hyperparameters if your dataset is small, but leaving little evaluation data for early stopping might worsen overfitting. A sketch of such a retraining statement follows this list.
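If you do retrain, the statement might look like the following sketch; all names and hyperparameter values are hypothetical, and in practice you would take the values from the optimal trial in the ML.TRIAL_INFO output:

```sql
-- Retrain on all input data with fixed, previously selected
-- hyperparameters; NO_SPLIT keeps every row in the training set.
CREATE OR REPLACE MODEL `mydataset.final_model`
OPTIONS (
  MODEL_TYPE = 'BOOSTED_TREE_REGRESSOR',
  LEARN_RATE = 0.1,     -- value chosen by the tuning job (hypothetical)
  MAX_TREE_DEPTH = 6,   -- value chosen by the tuning job (hypothetical)
  DATA_SPLIT_METHOD = 'NO_SPLIT',
  INPUT_LABEL_COLS = ['label']
) AS
SELECT * FROM `mydataset.training_data`;
```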
What's next
- To try running hyperparameter tuning, see Use the BigQuery ML hyperparameter tuning to improve model performance.
- For more information about supported SQL statements and functions for ML models, see End-to-end user journeys for ML models.