ModelContainerSpec
Specification of a container for serving predictions. Some fields in this message correspond to fields in the Kubernetes Container v1 core specification.
imageUri: string
Required. Immutable. URI of the Docker image to be used as the custom container for serving predictions. This URI must identify an image in Artifact Registry. Learn more about the container publishing requirements, including permissions requirements for the Vertex AI Service Agent.
The container image is ingested upon ModelService.UploadModel and stored internally; the original path is not used afterwards.
To learn about the requirements for the Docker image itself, see Custom container requirements.
You can use the URI of one of Vertex AI's pre-built container images for prediction in this field.
command[]: string
Immutable. Specifies the command that runs when the container starts. This overrides the container's ENTRYPOINT. Specify this field as an array of executable and arguments, similar to a Docker ENTRYPOINT's "exec" form, not its "shell" form.
If you do not specify this field, then the container's ENTRYPOINT runs, in conjunction with the args field or the container's CMD, if either exists. If this field is not specified and the container does not have an ENTRYPOINT, then refer to the Docker documentation about how CMD and ENTRYPOINT interact.
If you specify this field, then you can also specify the args field to provide additional arguments for this command. However, if you specify this field, then the container's CMD is ignored. See the Kubernetes documentation about how the command and args fields interact with a container's ENTRYPOINT and CMD.
In this field, you can reference environment variables set by Vertex AI and environment variables set in the env field. You cannot reference environment variables set in the Docker image. To have environment variables expanded, reference them using the following syntax:
$(VARIABLE_NAME)
Note that this differs from Bash variable expansion, which does not use parentheses. If a variable cannot be resolved, the reference in the input string is used unchanged. To avoid variable expansion, you can escape this syntax with $$; for example:
$$(VARIABLE_NAME)
This field corresponds to the command field of the Kubernetes Containers v1 core API.
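The expansion rules described above can be sketched in a few lines of Python. This is an illustrative approximation only, not the code Vertex AI or Kubernetes runs: the `expand` function and its regular expression are hypothetical, and the sketch handles only the `$(NAME)` and `$$(NAME)` forms mentioned in this section.

```python
import re

# Matches an escaped reference "$$(NAME)" first, then a plain reference "$(NAME)".
_REFERENCE = re.compile(r"\$\$\(([^)]*)\)|\$\(([^)]*)\)")


def expand(value: str, variables: dict[str, str]) -> str:
    """Expand $(NAME) references in `value` using `variables` (hypothetical helper)."""

    def substitute(match: re.Match) -> str:
        escaped_name, name = match.group(1), match.group(2)
        if escaped_name is not None:
            # "$$(NAME)" is an escape: emit the literal text "$(NAME)".
            return f"$({escaped_name})"
        # A reference that cannot be resolved is left in the string unchanged.
        return variables.get(name, match.group(0))

    return _REFERENCE.sub(substitute, value)
```

For instance, `expand("--dir=$(VARIABLE_NAME)", {"VARIABLE_NAME": "/models"})` yields `--dir=/models`, while the escaped form `$$(VARIABLE_NAME)` comes back as the literal `$(VARIABLE_NAME)`.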
args[]: string
Immutable. Specifies arguments for the command that runs when the container starts. This overrides the container's CMD. Specify this field as an array of executable and arguments, similar to a Docker CMD's "default parameters" form.
If you don't specify this field but do specify the command field, then the command from the command field runs without any additional arguments. See the Kubernetes documentation about how the command and args fields interact with a container's ENTRYPOINT and CMD.
If you don't specify this field and don't specify the command field, then the container's ENTRYPOINT and CMD determine what runs based on their default behavior. See the Docker documentation about how CMD and ENTRYPOINT interact.
In this field, you can reference environment variables set by Vertex AI and environment variables set in the env field. You cannot reference environment variables set in the Docker image. To have environment variables expanded, reference them using the following syntax:
$(VARIABLE_NAME)
Note that this differs from Bash variable expansion, which does not use parentheses. If a variable cannot be resolved, the reference in the input string is used unchanged. To avoid variable expansion, you can escape this syntax with $$; for example:
$$(VARIABLE_NAME)
This field corresponds to the args field of the Kubernetes Containers v1 core API.
env[]: object (EnvVar)
Immutable. List of environment variables to set in the container. After the container starts running, code running in the container can read these environment variables.
Additionally, the command and args fields can reference these variables. Later entries in this list can also reference earlier entries. For example, the following sets the variable VAR_2 to the value foo bar:
[ { "name": "VAR_1", "value": "foo" }, { "name": "VAR_2", "value": "$(VAR_1) bar" } ]
If you switch the order of the variables in the example, then the expansion does not occur.
This field corresponds to the env field of the Kubernetes Containers v1 core API.
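The order dependence of the env list can be illustrated with a small Python sketch. The `resolve_env` helper below is hypothetical and simplified (it ignores the `$$` escape); it only shows why an entry can see the entries before it but not the ones after it.

```python
import re


def resolve_env(entries: list[dict]) -> dict[str, str]:
    """Resolve env entries in list order (hypothetical, simplified helper)."""
    resolved: dict[str, str] = {}
    for entry in entries:
        # Only variables already placed in `resolved` (earlier entries) can be
        # substituted; unknown references are left unchanged.
        value = re.sub(
            r"\$\(([^)]*)\)",
            lambda match: resolved.get(match.group(1), match.group(0)),
            entry["value"],
        )
        resolved[entry["name"]] = value
    return resolved
```

With the list from the example above, `VAR_2` resolves to `foo bar`. With the two entries swapped, `VAR_2` keeps the literal text `$(VAR_1) bar`, because `VAR_1` has not been defined yet when `VAR_2` is processed.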
ports[]: object (Port)
Immutable. List of ports to expose from the container. Vertex AI sends any prediction requests that it receives to the first port on this list. Vertex AI also sends liveness and health checks to this port.
If you do not specify this field, it defaults to the following value:
[ { "containerPort": 8080 } ]
Vertex AI does not use ports other than the first one listed. This field corresponds to the ports field of the Kubernetes Containers v1 core API.
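As a rough illustration of the rule above, the following Python sketch picks the port that would receive traffic from a container spec expressed as a plain dictionary. The `serving_port` function and the dictionary shape are assumptions made for this example, not part of any client library.

```python
# Default taken from the field description above.
DEFAULT_PORTS = [{"containerPort": 8080}]


def serving_port(container_spec: dict) -> int:
    """Return the port that receives requests (hypothetical helper)."""
    ports = container_spec.get("ports") or DEFAULT_PORTS
    # Only the first listed port is used; later entries are ignored.
    port = ports[0]["containerPort"]
    if not 1 <= port <= 65535:
        raise ValueError(f"containerPort out of range: {port}")
    return port
```

An empty spec resolves to 8080, and a spec listing ports 9000 and 9001 resolves to 9000.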
predictRoute: string
Immutable. HTTP path on the container to send prediction requests to. Vertex AI forwards requests sent using projects.locations.endpoints.predict to this path on the container's IP address and port. Vertex AI then returns the container's response in the API response.
For example, if you set this field to /foo, then when Vertex AI receives a prediction request, it forwards the request body in a POST request to the /foo path on the port of your container specified by the first value of this ModelContainerSpec's ports field.
If you don't specify this field, it defaults to the following value when you deploy this Model to an Endpoint:
/v1/endpoints/ENDPOINT/deployedModels/DEPLOYED_MODEL:predict
The placeholders in this value are replaced as follows:
ENDPOINT: The last segment (following endpoints/) of the Endpoint.name field of the Endpoint where this Model has been deployed. (Vertex AI makes this value available to your container code as the AIP_ENDPOINT_ID environment variable.)
DEPLOYED_MODEL: DeployedModel.id of the DeployedModel. (Vertex AI makes this value available to your container code as the AIP_DEPLOYED_MODEL_ID environment variable.)
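Container code can reconstruct this default path from the two environment variables named above. The following is a minimal sketch; the function names are hypothetical, and the second helper assumes both variables are present in the environment it is given.

```python
import os
from collections.abc import Mapping


def default_predict_route(endpoint_id: str, deployed_model_id: str) -> str:
    """Fill the ENDPOINT and DEPLOYED_MODEL placeholders of the default route."""
    return f"/v1/endpoints/{endpoint_id}/deployedModels/{deployed_model_id}:predict"


def default_predict_route_from_env(environ: Mapping[str, str] = os.environ) -> str:
    """Build the default route from AIP_ENDPOINT_ID and AIP_DEPLOYED_MODEL_ID."""
    return default_predict_route(
        environ["AIP_ENDPOINT_ID"], environ["AIP_DEPLOYED_MODEL_ID"]
    )
```

For example, an endpoint ID of `123` and a deployed model ID of `456` give `/v1/endpoints/123/deployedModels/456:predict`.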
healthRoute: string
Immutable. HTTP path on the container to send health checks to. Vertex AI intermittently sends GET requests to this path on the container's IP address and port to check that the container is healthy. Read more about health checks.
For example, if you set this field to /bar, then Vertex AI intermittently sends a GET request to the /bar path on the port of your container specified by the first value of this ModelContainerSpec's ports field.
If you don't specify this field, it defaults to the following value when you deploy this Model to an Endpoint:
/v1/endpoints/ENDPOINT/deployedModels/DEPLOYED_MODEL:predict
The placeholders in this value are replaced as follows:
ENDPOINT: The last segment (following endpoints/) of the Endpoint.name field of the Endpoint where this Model has been deployed. (Vertex AI makes this value available to your container code as the AIP_ENDPOINT_ID environment variable.)
DEPLOYED_MODEL: DeployedModel.id of the DeployedModel. (Vertex AI makes this value available to your container code as the AIP_DEPLOYED_MODEL_ID environment variable.)
invokeRoutePrefix: string
Immutable. Invoke route prefix for the custom container. "/*" is the only supported value right now. When this field is set, any non-root route on this model is accessible through an invoke HTTP call, for example "/invoke/foo/bar". However, the PredictionService.Invoke RPC is not supported yet.
Only one of predictRoute or invokeRoutePrefix can be set; predictRoute is used by default if this field is not set. If this field is set, the Model can only be deployed to a dedicated endpoint.
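The mutual exclusion between the two route fields could be checked on the client side before uploading a model. The sketch below works on a plain dictionary; the `route_mode` function is hypothetical, and the actual validation performed by the service may differ.

```python
def route_mode(container_spec: dict) -> str:
    """Return which routing field applies to the spec (hypothetical helper)."""
    predict_route = container_spec.get("predictRoute")
    invoke_prefix = container_spec.get("invokeRoutePrefix")
    if predict_route and invoke_prefix:
        raise ValueError("set only one of predictRoute or invokeRoutePrefix")
    if invoke_prefix:
        if invoke_prefix != "/*":
            raise ValueError('"/*" is the only supported invokeRoutePrefix value')
        return "invokeRoutePrefix"
    # predictRoute is used by default, including when neither field is set.
    return "predictRoute"
```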
grpcPorts[]: object (Port)
Immutable. List of ports to expose from the container. Vertex AI sends gRPC prediction requests that it receives to the first port on this list. Vertex AI also sends liveness and health checks to this port.
If you do not specify this field, gRPC requests to the container will be disabled.
Vertex AI does not use ports other than the first one listed. This field corresponds to the ports field of the Kubernetes Containers v1 core API.
deploymentTimeout: string (Duration format)
Immutable. Deployment timeout. The maximum deployment timeout is 2 hours.
A duration in seconds with up to nine fractional digits, ending with 's'. Example: "3.5s".
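A Duration string of this shape can be produced from a number of seconds as sketched below. The `to_duration` helper is hypothetical; rejecting negative values is an assumption of this example, while the 2-hour ceiling comes from the field description.

```python
from decimal import Decimal

# Limit stated in the field description: 2 hours.
MAX_DEPLOYMENT_TIMEOUT_SECONDS = 2 * 60 * 60


def to_duration(seconds) -> str:
    """Format seconds as a Duration string such as "3.5s" (hypothetical helper)."""
    value = Decimal(str(seconds))
    if value < 0 or value > MAX_DEPLOYMENT_TIMEOUT_SECONDS:
        raise ValueError(f"deployment timeout out of range: {seconds}")
    # Keep at most nine fractional digits, then drop trailing zeros.
    text = format(value.quantize(Decimal("1.000000000")), "f")
    text = text.rstrip("0").rstrip(".")
    return text + "s"
```

For example, `to_duration(3.5)` returns `"3.5s"` and `to_duration(1800)` returns `"1800s"`.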
sharedMemorySizeMb: string (int64 format)
Immutable. The amount of VM memory, in megabytes, to reserve as shared memory for the model.
startupProbe: object (Probe)
Immutable. Specification for Kubernetes startup probe.
healthProbe: object (Probe)
Immutable. Specification for Kubernetes readiness probe.
livenessProbe: object (Probe)
Immutable. Specification for Kubernetes liveness probe.
| JSON representation |
|---|
{"imageUri":string,"command":[string],"args":[string],"env":[{object ( |
Port
Represents a network port in a container.
containerPort: integer
The number of the port to expose on the pod's IP address. Must be a valid port number, between 1 and 65535 inclusive.
| JSON representation |
|---|
{"containerPort":integer} |
Probe
Probe describes a health check to be performed against a container to determine whether it is alive or ready to receive traffic.
periodSeconds: integer
How often (in seconds) to perform the probe. Defaults to 10 seconds. Minimum value is 1. Must be less than timeoutSeconds.
Maps to Kubernetes probe argument 'periodSeconds'.
timeoutSeconds: integer
Number of seconds after which the probe times out. Defaults to 1 second. Minimum value is 1. Must be greater than or equal to periodSeconds.
Maps to Kubernetes probe argument 'timeoutSeconds'.
failureThreshold: integer
Number of consecutive failures before the probe is considered failed. Defaults to 3. Minimum value is 1.
Maps to Kubernetes probe argument 'failureThreshold'.
successThreshold: integer
Number of consecutive successes before the probe is considered successful. Defaults to 1. Minimum value is 1.
Maps to Kubernetes probe argument 'successThreshold'.
initialDelaySeconds: integer
Number of seconds to wait before starting the probe. Defaults to 0. Minimum value is 0.
Maps to Kubernetes probe argument 'initialDelaySeconds'.
probe_type: Union type
probe_type can be only one of the following:
exec: object (ExecAction)
ExecAction probes the health of a container by executing a command.
httpGet: object (HttpGetAction)
HttpGetAction probes the health of a container by sending an HTTP GET request.
grpc: object (GrpcAction)
GrpcAction probes the health of a container by sending a gRPC request.
tcpSocket: object (TcpSocketAction)
TcpSocketAction probes the health of a container by opening a TCP socket connection.
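The per-field minimums and the one-of rule for probe_type can be checked with a short Python sketch over a plain dictionary. The `validate_probe` function is hypothetical, and it deliberately leaves out the stated relation between periodSeconds and timeoutSeconds, checking only the minimum values and the union constraint.

```python
PROBE_TYPES = ("exec", "httpGet", "grpc", "tcpSocket")

# Minimum values taken from the field descriptions above.
MINIMUMS = {
    "periodSeconds": 1,
    "timeoutSeconds": 1,
    "failureThreshold": 1,
    "successThreshold": 1,
    "initialDelaySeconds": 0,
}


def validate_probe(probe: dict):
    """Check a Probe dict and return the probe_type that is set, if any."""
    chosen = [name for name in PROBE_TYPES if name in probe]
    if len(chosen) > 1:
        raise ValueError(f"probe_type can be only one of {PROBE_TYPES}, got {chosen}")
    for field, minimum in MINIMUMS.items():
        if field in probe and probe[field] < minimum:
            raise ValueError(f"{field} must be at least {minimum}")
    return chosen[0] if chosen else None
```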
| JSON representation |
|---|
{"periodSeconds":integer,"timeoutSeconds":integer,"failureThreshold":integer,"successThreshold":integer,"initialDelaySeconds":integer,// probe_type"exec":{object ( |
ExecAction
ExecAction specifies a command to execute.
command[]: string
Command is the command line to execute inside the container; the working directory for the command is root ('/') in the container's filesystem. The command is simply exec'd and is not run inside a shell, so traditional shell instructions ('|', etc.) won't work. To use a shell, you need to explicitly call out to that shell. Exit status of 0 is treated as live/healthy and non-zero is unhealthy.
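The exec semantics described above resemble running an argument list without a shell. The Python sketch below imitates that behavior locally; `run_exec_probe` is a hypothetical helper, treating a timeout or a missing executable as unhealthy is an assumption of this example, and it is not how Vertex AI itself executes the probe.

```python
import subprocess


def run_exec_probe(command: list[str], timeout_seconds: float = 1.0) -> bool:
    """Run `command` without a shell, from '/', and map exit status 0 to healthy."""
    try:
        completed = subprocess.run(
            command,  # a list is exec'd directly, so '|' is just another argument
            cwd="/",
            timeout=timeout_seconds,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except (subprocess.TimeoutExpired, OSError):
        # Assumption: a timeout or a missing executable counts as unhealthy.
        return False
    return completed.returncode == 0
```

To use shell features such as pipes, the shell has to be the executable, for example a command of the form `["/bin/sh", "-c", "<pipeline>"]`, where the pipeline is whatever check your container needs.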
| JSON representation |
|---|
{"command":[string]} |
HttpGetAction
HttpGetAction describes an action based on HTTP Get requests.
path: string
Path to access on the HTTP server.
port: integer
Number of the port to access on the container. Number must be in the range 1 to 65535.
host: string
Host name to connect to, defaults to the model serving container's IP. You probably want to set "host" in httpHeaders instead.
scheme: string
Scheme to use for connecting to the host. Defaults to HTTP. Acceptable values are "HTTP" or "HTTPS".
httpHeaders[]: object (HttpHeader)
Custom headers to set in the request. HTTP allows repeated headers.
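To make the defaults concrete, the following Python sketch turns an HttpGetAction dictionary into the URL and header list a prober might use. The `probe_request` function is hypothetical, and falling back to "/" when path is omitted is an assumption of this example.

```python
def probe_request(action: dict, container_ip: str):
    """Return (url, headers) for an HttpGetAction dict (hypothetical helper)."""
    scheme = action.get("scheme", "HTTP")  # defaults to HTTP
    if scheme not in ("HTTP", "HTTPS"):
        raise ValueError(f"unsupported scheme: {scheme}")
    host = action.get("host") or container_ip  # defaults to the container's IP
    path = action.get("path", "/")  # assumption: "/" when no path is given
    if not path.startswith("/"):
        path = "/" + path
    url = f"{scheme.lower()}://{host}:{action['port']}{path}"
    # Headers stay a list of pairs because HTTP allows repeated headers.
    headers = [(h["name"], h["value"]) for h in action.get("httpHeaders", [])]
    return url, headers
```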
| JSON representation |
|---|
{"path":string,"port":integer,"host":string,"scheme":string,"httpHeaders":[{object ( |
HttpHeader
HttpHeader describes a custom header to be used in HTTP probes.
name: string
The header field name. This will be canonicalized upon output, so case-variant names will be understood as the same header.
value: string
The header field value.
| JSON representation |
|---|
{"name":string,"value":string} |
GrpcAction
GrpcAction checks the health of a container using a gRPC service.
port: integer
Port number of the gRPC service. Number must be in the range 1 to 65535.
service: string
Service is the name of the service to place in the gRPC HealthCheckRequest. See https://github.com/grpc/grpc/blob/master/doc/health-checking.md.
If this is not specified, the default behavior is defined by gRPC.
| JSON representation |
|---|
{"port":integer,"service":string} |
TcpSocketAction
TcpSocketAction probes the health of a container by opening a TCP socket connection.
port: integer
Number of the port to access on the container. Number must be in the range 1 to 65535.
host: string
Optional: host name to connect to, defaults to the model serving container's IP.
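A TCP socket probe amounts to checking whether a connection can be opened. The sketch below shows that idea with the Python standard library; `tcp_probe` is a hypothetical helper for local testing, and the timeout default mirrors the Probe timeoutSeconds default of 1 second.

```python
import socket


def tcp_probe(port: int, host: str, timeout_seconds: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    try:
        # The connection is closed again as soon as it has been established.
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:
        # Refused connections and timeouts both count as a failed probe.
        return False
```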
| JSON representation |
|---|
{"port":integer,"host":string} |
Last updated 2025-07-31 UTC.