Configure container settings for Vertex AI serverless training
When you perform Vertex AI serverless training, you must specify what machine learning (ML) code you want Vertex AI to run. To do this, configure training container settings for either a custom container or a Python training application that runs on a prebuilt container.

To determine whether you want to use a custom container or a prebuilt container, read Training code requirements.

This document describes the fields of the Vertex AI API that you must specify in either of the preceding cases.
Where to specify container settings
Specify configuration details within a WorkerPoolSpec. Depending on how you perform serverless training, put this WorkerPoolSpec in one of the following API fields:

- If you are creating a CustomJob resource, specify the WorkerPoolSpec in CustomJob.jobSpec.workerPoolSpecs. If you are using the Google Cloud CLI, then you can use the --worker-pool-spec flag or the --config flag on the gcloud ai custom-jobs create command to specify worker pool options. Learn more about creating a CustomJob.
- If you are creating a HyperparameterTuningJob resource, specify the WorkerPoolSpec in HyperparameterTuningJob.trialJobSpec.workerPoolSpecs. If you are using the gcloud CLI, then you can use the --config flag on the gcloud ai hp-tuning-jobs create command to specify worker pool options. Learn more about creating a HyperparameterTuningJob.
- If you are creating a TrainingPipeline resource without hyperparameter tuning, specify the WorkerPoolSpec in TrainingPipeline.trainingTaskInputs.workerPoolSpecs. Learn more about creating a custom TrainingPipeline.
- If you are creating a TrainingPipeline with hyperparameter tuning, specify the WorkerPoolSpec in TrainingPipeline.trainingTaskInputs.trialJobSpec.workerPoolSpecs.
If you are performing distributed training, you can use different settings for each worker pool.
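As a minimal sketch of the structure described above, the following Python function assembles a CustomJob request body with its workerPoolSpecs list, including an optional second pool with different settings for distributed training. The machine types, replica counts, and image URI are illustrative placeholders, not recommendations:

```python
# Minimal sketch: where WorkerPoolSpec entries sit inside a CustomJob request
# body, using the REST field names shown above. The machine types and image
# URI below are placeholder assumptions for illustration.

def build_custom_job(display_name, image_uri, worker_replicas=0):
    """Return a CustomJob request body with one worker pool per role.

    The first entry in workerPoolSpecs describes the primary replica; an
    optional second entry describes additional workers for distributed
    training, each pool with its own settings.
    """
    worker_pool_specs = [
        {
            "machineSpec": {"machineType": "n1-standard-4"},
            "replicaCount": 1,
            "containerSpec": {"imageUri": image_uri},
        }
    ]
    if worker_replicas:
        # Distributed training: a second pool may use different settings.
        worker_pool_specs.append(
            {
                "machineSpec": {"machineType": "n1-highmem-8"},
                "replicaCount": worker_replicas,
                "containerSpec": {"imageUri": image_uri},
            }
        )
    return {
        "displayName": display_name,
        "jobSpec": {"workerPoolSpecs": worker_pool_specs},
    }


job = build_custom_job(
    "my-job", "gcr.io/my-project/trainer:latest", worker_replicas=2
)
```

The same nested shape appears under trialJobSpec for a HyperparameterTuningJob, or under trainingTaskInputs for a TrainingPipeline.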
Configure container settings
Depending on whether you are using a prebuilt container or a custom container, you must specify different fields within the WorkerPoolSpec. Select the tab for your scenario:
Prebuilt container
1. Select a prebuilt container that supports the ML framework you plan to use for training. Specify one of the container image's URIs in the pythonPackageSpec.executorImageUri field.
2. Specify the Cloud Storage URIs of your Python training application in the pythonPackageSpec.packageUris field.
3. Specify your training application's entry point module in the pythonPackageSpec.pythonModule field.
4. Optionally, specify a list of command-line arguments to pass to your training application's entry point module in the pythonPackageSpec.args field.
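The four fields above can be sketched as a small helper that assembles the pythonPackageSpec portion of a WorkerPoolSpec. The executor image URI, Cloud Storage path, and module name here are hypothetical placeholders, not real resources:

```python
# Sketch of the pythonPackageSpec fields listed above, under the assumption
# of a prebuilt container image and a packaged training application; all
# URIs and names are illustrative placeholders.

def build_python_package_spec(executor_image_uri, package_uris, python_module, args=None):
    """Assemble the pythonPackageSpec portion of a WorkerPoolSpec."""
    spec = {
        "executorImageUri": executor_image_uri,  # prebuilt container image URI
        "packageUris": package_uris,             # Cloud Storage URIs of the training app
        "pythonModule": python_module,           # entry point module
    }
    if args:
        spec["args"] = args                      # optional command-line arguments
    return spec


worker_pool_spec = {
    "machineSpec": {"machineType": "n1-standard-4"},
    "replicaCount": 1,
    "pythonPackageSpec": build_python_package_spec(
        "us-docker.pkg.dev/vertex-ai/training/pytorch-gpu.2-1:latest",
        ["gs://my-bucket/trainer-0.1.tar.gz"],
        "trainer.task",
        args=["--epochs=10"],
    ),
}
```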
The following examples highlight where you specify these container settings when you create a CustomJob:
Console
In the Google Cloud console, you can't create a CustomJob directly. However, you can create a TrainingPipeline that creates a CustomJob. When you create a TrainingPipeline in the Google Cloud console, you can specify prebuilt container settings in certain fields on the Training container step:

- pythonPackageSpec.executorImageUri: Use the Model framework and Model framework version drop-down lists.
- pythonPackageSpec.packageUris: Use the Package location field.
- pythonPackageSpec.pythonModule: Use the Python module field.
- pythonPackageSpec.args: Use the Arguments field.
gcloud
gcloud ai custom-jobs create \
  --region=LOCATION \
  --display-name=JOB_NAME \
  --python-package-uris=PYTHON_PACKAGE_URIS \
  --worker-pool-spec=machine-type=MACHINE_TYPE,replica-count=REPLICA_COUNT,executor-image-uri=PYTHON_PACKAGE_EXECUTOR_IMAGE_URI,python-module=PYTHON_MODULE

For more context, read the guide to creating a CustomJob.
Custom container
1. Specify the Artifact Registry or Docker Hub URI of your custom container in the containerSpec.imageUri field.
2. Optionally, if you want to override the ENTRYPOINT or CMD instructions in your container, specify the containerSpec.command or containerSpec.args fields. These fields affect how your container runs according to the following rules:
   - If you specify neither field: Your container runs according to its ENTRYPOINT instruction and CMD instruction (if it exists). Refer to the Docker documentation about how CMD and ENTRYPOINT interact.
   - If you specify only containerSpec.command: Your container runs with the value of containerSpec.command replacing its ENTRYPOINT instruction. If the container has a CMD instruction, it is ignored.
   - If you specify only containerSpec.args: Your container runs according to its ENTRYPOINT instruction, with the value of containerSpec.args replacing its CMD instruction.
   - If you specify both fields: Your container runs with containerSpec.command replacing its ENTRYPOINT instruction and containerSpec.args replacing its CMD instruction.
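The four override rules above can be modeled as a small resolver: given a container's ENTRYPOINT and CMD and the optional containerSpec.command / containerSpec.args values, it returns the effective command line the container runs. This is an illustrative model of the rules, not Vertex AI or Docker code:

```python
# Illustrative model of the containerSpec.command / containerSpec.args
# override rules described above. ENTRYPOINT, CMD, and the sample values
# below are hypothetical.

def effective_command(entrypoint, cmd, spec_command=None, spec_args=None):
    """Return the command line the container runs, per the four rules."""
    if spec_command and spec_args:
        # Both fields: command replaces ENTRYPOINT, args replaces CMD.
        return spec_command + spec_args
    if spec_command:
        # Only command: it replaces ENTRYPOINT, and CMD is ignored.
        return spec_command
    if spec_args:
        # Only args: ENTRYPOINT is kept, args replaces CMD.
        return entrypoint + spec_args
    # Neither field: the image's own ENTRYPOINT and CMD apply.
    return entrypoint + cmd


# Image with ENTRYPOINT ["python", "train.py"] and CMD ["--epochs=5"]:
entrypoint, cmd = ["python", "train.py"], ["--epochs=5"]
```

For example, passing only spec_args=["--epochs=20"] yields the image's ENTRYPOINT followed by the new arguments, mirroring how containerSpec.args replaces CMD.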
The following example highlights where you can specify some of these container settings when you create a CustomJob:
Console
In the Google Cloud console, you can't create a CustomJob directly. However, you can create a TrainingPipeline that creates a CustomJob. When you create a TrainingPipeline in the Google Cloud console, you can specify custom container settings in certain fields on the Training container step:

- containerSpec.imageUri: Use the Container image field.
- containerSpec.command: This API field is not configurable in the Google Cloud console.
- containerSpec.args: Use the Arguments field.
gcloud
gcloud ai custom-jobs create \
  --region=LOCATION \
  --display-name=JOB_NAME \
  --worker-pool-spec=machine-type=MACHINE_TYPE,replica-count=REPLICA_COUNT,container-image-uri=CUSTOM_CONTAINER_IMAGE_URI

Java
Before trying this sample, follow the Java setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Java API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
import com.google.cloud.aiplatform.v1.AcceleratorType;
import com.google.cloud.aiplatform.v1.ContainerSpec;
import com.google.cloud.aiplatform.v1.CustomJob;
import com.google.cloud.aiplatform.v1.CustomJobSpec;
import com.google.cloud.aiplatform.v1.JobServiceClient;
import com.google.cloud.aiplatform.v1.JobServiceSettings;
import com.google.cloud.aiplatform.v1.LocationName;
import com.google.cloud.aiplatform.v1.MachineSpec;
import com.google.cloud.aiplatform.v1.WorkerPoolSpec;
import java.io.IOException;

// Create a custom job to run machine learning training code in Vertex AI
public class CreateCustomJobSample {

  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    String project = "PROJECT";
    String displayName = "DISPLAY_NAME";
    // Vertex AI runs your training application in a Docker container image. A Docker container
    // image is a self-contained software package that includes code and all dependencies. Learn
    // more about preparing your training application at
    // https://cloud.google.com/vertex-ai/docs/training/overview#prepare_your_training_application
    String containerImageUri = "CONTAINER_IMAGE_URI";
    createCustomJobSample(project, displayName, containerImageUri);
  }

  static void createCustomJobSample(
      String project, String displayName, String containerImageUri) throws IOException {
    JobServiceSettings settings =
        JobServiceSettings.newBuilder()
            .setEndpoint("us-central1-aiplatform.googleapis.com:443")
            .build();
    String location = "us-central1";

    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests.
    try (JobServiceClient client = JobServiceClient.create(settings)) {
      MachineSpec machineSpec =
          MachineSpec.newBuilder()
              .setMachineType("n1-standard-4")
              .setAcceleratorType(AcceleratorType.NVIDIA_TESLA_T4)
              .setAcceleratorCount(1)
              .build();

      ContainerSpec containerSpec =
          ContainerSpec.newBuilder().setImageUri(containerImageUri).build();

      WorkerPoolSpec workerPoolSpec =
          WorkerPoolSpec.newBuilder()
              .setMachineSpec(machineSpec)
              .setReplicaCount(1)
              .setContainerSpec(containerSpec)
              .build();

      CustomJobSpec customJobSpecJobSpec =
          CustomJobSpec.newBuilder().addWorkerPoolSpecs(workerPoolSpec).build();

      CustomJob customJob =
          CustomJob.newBuilder()
              .setDisplayName(displayName)
              .setJobSpec(customJobSpecJobSpec)
              .build();
      LocationName parent = LocationName.of(project, location);
      CustomJob response = client.createCustomJob(parent, customJob);
      System.out.format("response: %s\n", response);
      System.out.format("Name: %s\n", response.getName());
    }
  }
}

Node.js
Before trying this sample, follow the Node.js setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Node.js API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
/**
 * TODO(developer): Uncomment these variables before running the sample.
 * (Not necessary if passing values as arguments)
 */
// const customJobDisplayName = 'YOUR_CUSTOM_JOB_DISPLAY_NAME';
// const containerImageUri = 'YOUR_CONTAINER_IMAGE_URI';
// const project = 'YOUR_PROJECT_ID';
// const location = 'YOUR_PROJECT_LOCATION';

// Imports the Google Cloud Job Service Client library
const {JobServiceClient} = require('@google-cloud/aiplatform');

// Specifies the location of the api endpoint
const clientOptions = {
  apiEndpoint: 'us-central1-aiplatform.googleapis.com',
};

// Instantiates a client
const jobServiceClient = new JobServiceClient(clientOptions);

async function createCustomJob() {
  // Configure the parent resource
  const parent = `projects/${project}/locations/${location}`;
  const customJob = {
    displayName: customJobDisplayName,
    jobSpec: {
      workerPoolSpecs: [
        {
          machineSpec: {
            machineType: 'n1-standard-4',
            acceleratorType: 'NVIDIA_TESLA_T4',
            acceleratorCount: 1,
          },
          replicaCount: 1,
          containerSpec: {
            imageUri: containerImageUri,
            command: [],
            args: [],
          },
        },
      ],
    },
  };
  const request = {parent, customJob};

  // Create custom job request
  const [response] = await jobServiceClient.createCustomJob(request);

  console.log('Create custom job response:\n', JSON.stringify(response));
}
createCustomJob();

Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
from google.cloud import aiplatform


def create_custom_job_sample(
    project: str,
    display_name: str,
    container_image_uri: str,
    location: str = "us-central1",
    api_endpoint: str = "us-central1-aiplatform.googleapis.com",
):
    # The AI Platform services require regional API endpoints.
    client_options = {"api_endpoint": api_endpoint}
    # Initialize client that will be used to create and send requests.
    # This client only needs to be created once, and can be reused for multiple requests.
    client = aiplatform.gapic.JobServiceClient(client_options=client_options)
    custom_job = {
        "display_name": display_name,
        "job_spec": {
            "worker_pool_specs": [
                {
                    "machine_spec": {
                        "machine_type": "n1-standard-4",
                        "accelerator_type": aiplatform.gapic.AcceleratorType.NVIDIA_TESLA_K80,
                        "accelerator_count": 1,
                    },
                    "replica_count": 1,
                    "container_spec": {
                        "image_uri": container_image_uri,
                        "command": [],
                        "args": [],
                    },
                }
            ]
        },
    }
    parent = f"projects/{project}/locations/{location}"
    response = client.create_custom_job(parent=parent, custom_job=custom_job)
    print("response:", response)

For more context, read the guide to creating a CustomJob.
What's next
- Learn how to perform serverless training by creating a CustomJob.
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2025-12-15 UTC.