Overview of hyperparameter tuning

Hyperparameter tuning takes advantage of the processing infrastructure of Google Cloud to test different hyperparameter configurations when training your model. It can give you optimized hyperparameter values that maximize your model's predictive accuracy.

What's a hyperparameter?

Hyperparameters are the variables that govern the training process itself.

Your training application handles three categories of data as it trains your model:

  • Your input data (also called training data) is a collection of individual records (instances) containing the features important to your machine learning problem. This data is used during training to configure your model to accurately make inferences about new instances of similar data. However, the values in your input data never directly become part of your model.

  • Your model's parameters are the variables that your chosen machine learning technique uses to adjust to your data. For example, a deep neural network (DNN) is composed of processing nodes (neurons), each with an operation performed on data as it travels through the network. When your DNN is trained, each node has a weight value that tells your model how much impact it has on the final inference. Those weights are an example of your model's parameters. In many ways, your model's parameters are the model: they are what distinguishes your particular model from other models of the same type working on similar data.

  • Your hyperparameters are the variables that govern the training process itself. For example, part of designing a DNN is deciding how many hidden layers of nodes to use between the input and output layers, and how many nodes each hidden layer should use. These variables are not directly related to the training data. They are configuration variables. Note that parameters change during a training job, while hyperparameters are usually constant during a job.

Your model parameters are optimized (you could say "tuned") by the training process: you run data through the operations of the model, compare the resulting inference with the actual value for each data instance, evaluate the accuracy, and adjust until you find the best values. Hyperparameters are tuned by running your whole training job, looking at the aggregate accuracy, and adjusting. In both cases, you are modifying the composition of your model to find the best combination to handle your problem.

Without an automated technology like Vertex AI hyperparameter tuning, you need to make manual adjustments to the hyperparameters over the course of many training runs to arrive at the optimal values. Hyperparameter tuning makes the process of determining the best hyperparameter settings easier and less tedious.

How hyperparameter tuning works

Hyperparameter tuning works by running multiple trials of your training application with values for your chosen hyperparameters, set within limits you specify. Vertex AI keeps track of the results of each trial and makes adjustments for subsequent trials. When the job is finished, you can get a summary of all the trials along with the most effective configuration of values according to the criteria you specify.

Hyperparameter tuning requires explicit communication between Vertex AI and your training application. Your training application defines all the information that your model needs. You define the hyperparameters (variables) that you want to adjust, and the target variables that are used to evaluate each trial.
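For example, with the Vertex AI SDK for Python this communication is set up when you create the tuning job: the hyperparameter and metric definitions live in the job configuration, and the training application only needs to accept the values and report the metric. The following is a rough sketch, not a complete recipe; the project, bucket, script, container image, metric name, and hyperparameter names and ranges are all placeholders, and the exact SDK surface can vary by version.

```python
from google.cloud import aiplatform
from google.cloud.aiplatform import hyperparameter_tuning as hpt

# Placeholders: substitute your own project, region, and staging bucket.
aiplatform.init(project="my-project", location="us-central1",
                staging_bucket="gs://my-staging-bucket")

# The training code that each trial runs. Every trial receives its own
# hyperparameter values as command-line arguments (see "The flow of
# hyperparameter values" below).
custom_job = aiplatform.CustomJob.from_local_script(
    display_name="my-training-job",
    script_path="train.py",                        # placeholder training script
    container_uri="TRAINING_CONTAINER_IMAGE_URI",  # placeholder container image
)

hp_job = aiplatform.HyperparameterTuningJob(
    display_name="my-tuning-job",
    custom_job=custom_job,
    # The hyperparameter metric (target variable) and its goal.
    metric_spec={"accuracy": "maximize"},
    # The hyperparameters to adjust, each with the limits to search within.
    parameter_spec={
        "learning_rate": hpt.DoubleParameterSpec(min=1e-4, max=1e-1, scale="log"),
        "batch_size": hpt.DiscreteParameterSpec(values=[16, 32, 64], scale="linear"),
    },
    max_trial_count=20,      # total number of trials to run
    parallel_trial_count=4,  # trials to run at the same time
)

hp_job.run()
```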

Learn more about Bayesian optimization for hyperparameter tuning.

In addition to Bayesian optimization, Vertex AI optimizes across hyperparameter tuning jobs. If you are doing hyperparameter tuning against similar models, changing only the objective function or adding a new input column, Vertex AI is able to improve over time and make the hyperparameter tuning more efficient.

What hyperparameter tuning optimizes

Hyperparameter tuning optimizes target variables that you specify, called hyperparameter metrics. Model accuracy, as calculated from an evaluation pass, is a common metric. Metrics must be numeric.

When configuring a hyperparameter tuning job, you define the name and goal of each metric. The goal specifies whether you want to tune your model to maximize or minimize the value of this metric.

How Vertex AI gets your metrics

Use the cloudml-hypertune Python package to pass metrics to Vertex AI. This library provides helper functions for reporting metrics to Vertex AI.
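For example, a training application might report its evaluation accuracy once per evaluation pass. This is a minimal sketch assuming the cloudml-hypertune package is installed and that the tuning job was configured with a metric named accuracy; the metric value and step shown are placeholders.

```python
import hypertune  # provided by the cloudml-hypertune package

hpt = hypertune.HyperTune()

# Report the metric so Vertex AI can compare this trial with others.
# The tag must match the metric name defined in the tuning job.
hpt.report_hyperparameter_tuning_metric(
    hyperparameter_metric_tag="accuracy",
    metric_value=0.87,   # placeholder: the value computed in your evaluation pass
    global_step=1000,    # optional: the training step at which it was measured
)
```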

Learn more about reporting hyperparameter metrics.

The flow of hyperparameter values

Without hyperparameter tuning, you can set your hyperparameters by whatever means you like in your training application. For example, you can configure the hyperparameters by passing command-line arguments to your main application module, or feed them to your application in a configuration file.

When you use hyperparameter tuning, you must use the following procedure to set the values of the hyperparameters that you're using for tuning:

  • Define a command-line argument in your main training module for each tuned hyperparameter.

  • Use the value passed in those arguments to set the corresponding hyperparameter in your application's code.

When you configure a hyperparameter tuning job, you define each hyperparameter to tune, its data type, and the range of values to try. You identify each hyperparameter using the same name as the corresponding argument you defined in your main module. The training service includes command-line arguments using these names when it runs your application.
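For example, if the tuning job defines hyperparameters named learning_rate and batch_size, the main training module might parse them like this. This is only a sketch; train_model stands in for your own training function, and the defaults are placeholders used when the script runs outside of a tuning job.

```python
import argparse

def parse_args():
    parser = argparse.ArgumentParser()
    # Argument names must match the hyperparameter names defined in the
    # tuning job; the training service passes the values using these names.
    parser.add_argument("--learning_rate", type=float, default=0.01)
    parser.add_argument("--batch_size", type=int, default=32)
    return parser.parse_args()

def main():
    args = parse_args()
    # Use the passed values to set the corresponding hyperparameters in your code.
    # train_model is a placeholder for your own training function.
    train_model(learning_rate=args.learning_rate, batch_size=args.batch_size)

if __name__ == "__main__":
    main()
```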

Learn more about the requirements for parsing command-line arguments.

Select hyperparameters to tune

There is little universal advice to give about how to choose which hyperparameters you should tune. If you have experience with the machine learning technique that you're using, you may have insight into how its hyperparameters behave. You may also be able to find advice from machine learning communities.

However you choose them, it's important to understand the implications. Every hyperparameter that you choose to tune has the potential to increase the number of trials required for a successful tuning job. When you run a hyperparameter tuning job on Vertex AI, the amount you're charged is based on the duration of the trials initiated by your hyperparameter tuning job. A careful choice of hyperparameters to tune can reduce the time and cost of your hyperparameter tuning job.

Hyperparameter data types

In a ParameterSpec object, you specify the hyperparameter data type as an instance of a parameter value specification. The following table lists the supported parameter value specifications.

Type                 | Data type   | Value ranges        | Value data
DoubleValueSpec      | DOUBLE      | minValue & maxValue | Floating-point values
IntegerValueSpec     | INTEGER     | minValue & maxValue | Integer values
CategoricalValueSpec | CATEGORICAL | categoricalValues   | List of category strings
DiscreteValueSpec    | DISCRETE    | discreteValues      | List of values in ascending order
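In the Vertex AI SDK for Python, these value specifications correspond roughly to the parameter spec classes shown below. This is a hedged sketch; the hyperparameter names, ranges, and values are placeholders, and the scale argument is discussed in the next section.

```python
from google.cloud.aiplatform import hyperparameter_tuning as hpt

parameter_spec = {
    # DOUBLE: floating-point values between minValue and maxValue
    "learning_rate": hpt.DoubleParameterSpec(min=1e-4, max=1e-1, scale="log"),
    # INTEGER: integer values between minValue and maxValue
    "num_hidden_layers": hpt.IntegerParameterSpec(min=1, max=8, scale="linear"),
    # CATEGORICAL: a list of category strings
    "optimizer": hpt.CategoricalParameterSpec(values=["adam", "sgd"]),
    # DISCRETE: a list of values in ascending order
    "batch_size": hpt.DiscreteParameterSpec(values=[16, 32, 64, 128], scale="linear"),
}
```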

Scale hyperparameters

In a ParameterSpec object, you can specify that scaling should be performed on this hyperparameter. Scaling is recommended for the DOUBLE and INTEGER data types. The available scaling types are:

  • SCALE_TYPE_UNSPECIFIED: No scaling is applied to this hyperparameter.
  • UNIT_LINEAR_SCALE: Scales the feasible space linearly to 0 through 1.
  • UNIT_LOG_SCALE: Scales the feasible space logarithmically to 0 through 1. The entire feasible space must be strictly positive.
  • UNIT_REVERSE_LOG_SCALE: Scales the feasible space "reverse" logarithmically to 0 through 1. The result is that values close to the top of the feasible space are spread out more than points near the bottom. The entire feasible space must be strictly positive.
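As a rule of thumb, a hyperparameter such as a learning rate that spans several orders of magnitude is a natural candidate for log scaling, while a narrow integer range is usually left linear. The following short sketch uses the SDK's scale strings, assuming "linear", "log", and "reverse_log" map to the scaling types above; the names and ranges are placeholders.

```python
from google.cloud.aiplatform import hyperparameter_tuning as hpt

# Log scaling spreads the search evenly across orders of magnitude.
learning_rate = hpt.DoubleParameterSpec(min=1e-5, max=1e-1, scale="log")

# A small integer range is searched linearly.
num_hidden_layers = hpt.IntegerParameterSpec(min=1, max=8, scale="linear")

# Reverse log scaling spreads out values near the top of the range.
keep_probability = hpt.DoubleParameterSpec(min=0.2, max=0.99, scale="reverse_log")
```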

Conditional hyperparameters

The ConditionalParameterSpec object lets you add hyperparameters to a trial when the value of its parent hyperparameter matches a condition that you specify.

For example, you could define a hyperparameter tuning job with the goal of finding an optimal model using either linear regression or a deep neural network (DNN). To let your tuning job specify the training method, you define a categorical hyperparameter named training_method with the following options: LINEAR_REGRESSION and DNN. When the training_method is LINEAR_REGRESSION, your tuning job must specify a hyperparameter for the learning rate. When the training_method is DNN, your tuning job must specify parameters for the learning rate and the number of hidden layers.

Since the number of hidden layers is applicable only when a trial's training_method is DNN, you define a conditional parameter that adds a hyperparameter named num_hidden_layers when the training_method is DNN.

Since the learning rate is used by both training_method options, you must decide if this conditional hyperparameter should be shared. If the hyperparameter is shared, the tuning job uses what it has learned from LINEAR_REGRESSION and DNN trials to tune the learning rate. In this case, it makes more sense to have separate learning rates for each training_method, since the learning rate for training a model using LINEAR_REGRESSION shouldn't affect the learning rate for training a model using DNN. So, you define the following conditional hyperparameters:

  • A hyperparameter named learning_rate that is added when the training_method is LINEAR_REGRESSION.
  • A hyperparameter named learning_rate that is added when the training_method is DNN.

Conditional hyperparameters let you define the hyperparameters for your tuning job as a graph. This lets you tune your training process using different training techniques, each with their own hyperparameter dependencies.
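Put together, the training_method example above might be expressed with a structure like the following. This is an illustrative sketch only, written as a Python dict that mirrors the ParameterSpec and ConditionalParameterSpec fields; the value ranges are placeholders, and the exact field names and value encodings used by the API may differ.

```python
# Sketch of a parameter spec graph: training_method is the parent, and each
# conditional parameter is added to a trial only when the parent's value
# matches the listed parent values.
training_method_spec = {
    "parameterId": "training_method",
    "categoricalValueSpec": {"values": ["LINEAR_REGRESSION", "DNN"]},
    "conditionalParameterSpecs": [
        {
            # Learning rate used only for LINEAR_REGRESSION trials.
            "parentCategoricalValues": {"values": ["LINEAR_REGRESSION"]},
            "parameterSpec": {
                "parameterId": "learning_rate",
                "doubleValueSpec": {"minValue": 1e-4, "maxValue": 1e-1},
                "scaleType": "UNIT_LOG_SCALE",
            },
        },
        {
            # A separate learning rate used only for DNN trials.
            "parentCategoricalValues": {"values": ["DNN"]},
            "parameterSpec": {
                "parameterId": "learning_rate",
                "doubleValueSpec": {"minValue": 1e-5, "maxValue": 1e-2},
                "scaleType": "UNIT_LOG_SCALE",
            },
        },
        {
            # num_hidden_layers applies only to DNN trials.
            "parentCategoricalValues": {"values": ["DNN"]},
            "parameterSpec": {
                "parameterId": "num_hidden_layers",
                "integerValueSpec": {"minValue": 1, "maxValue": 10},
            },
        },
    ],
}
```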

Search algorithms

You can specify a search algorithm in the StudySpec object. If you don't specify an algorithm, your job uses the default Vertex AI algorithm. The default algorithm applies Bayesian optimization to arrive at the optimal solution with a more effective search over the parameter space.

Available values:

  • ALGORITHM_UNSPECIFIED: Same as not specifying an algorithm. Vertex AI chooses the best search algorithm among Gaussian process bandits, linear combination search, and their variants.

  • GRID_SEARCH: A grid search within the feasible space. This option is particularly useful if you want to specify a quantity of trials that is greater than the number of points in the feasible space. In such cases, if you don't specify a grid search, the Vertex AI default algorithm may generate duplicate suggestions. To use grid search, all parameters must be of type INTEGER, CATEGORICAL, or DISCRETE.

  • RANDOM_SEARCH: A random search within the feasible space.
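With the Vertex AI SDK for Python, the algorithm is typically chosen when the tuning job is created; leaving it unset uses the default Bayesian optimization. The following is a hedged sketch that reuses the custom_job from the earlier example; the names, values, and trial counts are placeholders.

```python
from google.cloud import aiplatform
from google.cloud.aiplatform import hyperparameter_tuning as hpt

# Grid search requires INTEGER, CATEGORICAL, or DISCRETE parameters only.
grid_parameter_spec = {
    "batch_size": hpt.DiscreteParameterSpec(values=[16, 32, 64, 128], scale="linear"),
    "optimizer": hpt.CategoricalParameterSpec(values=["adam", "sgd"]),
}

hp_job = aiplatform.HyperparameterTuningJob(
    display_name="my-grid-search-job",
    custom_job=custom_job,          # as defined in the earlier sketch
    metric_spec={"accuracy": "maximize"},
    parameter_spec=grid_parameter_spec,
    max_trial_count=8,              # 4 batch sizes x 2 optimizers = 8 grid points
    parallel_trial_count=4,
    search_algorithm="grid",        # "random" selects random search; omit for the default
)
```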

