Create an A3 Ultra or A4 instance

This document describes how to create instances with attached GPUs from the A3 Ultra or A4 machine series. To learn more about creating instances with attached GPUs, see Overview of creating an instance with attached GPUs.

The A4 and A3 Ultra machine series are designed to enable you to run large-scale AI/ML clusters with features such as targeted workload placement, advanced cluster maintenance controls, and topology-aware scheduling. For more information, see Cluster management overview.

Before you begin

To review limitations and additional prerequisite steps for creating instances with attached GPUs, such as how to select an OS image or check GPU quota, see Overview of creating an instance with attached GPUs.
If you haven't already, set up authentication. Authentication verifies your identity for access to Google Cloud services and APIs. To run code or samples from a local development environment, you can authenticate to Compute Engine by selecting one of the following options:
Select the tab for how you plan to use the samples on this page:
Console
When you use the Google Cloud console to access Google Cloud services and APIs, you don't need to set up authentication.
gcloud
1. Install the Google Cloud CLI. After installation, initialize the Google Cloud CLI by running the following command:
  gcloud init
  If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.
  Note: If you installed the gcloud CLI previously, make sure you have the latest version by running gcloud components update.
2. Set a default region and zone.
REST
To use the REST API samples on this page in a local development environment, you use the credentials you provide to the gcloud CLI.
For more information, see Authenticate for using REST in the Google Cloud authentication documentation.

Required roles

To get the permissions that you need to create instances, ask your administrator to grant you the Compute Instance Admin (v1) (roles/compute.instanceAdmin.v1) IAM role on the project. For more information about granting roles, see Manage access to projects, folders, and organizations.

This predefined role contains the permissions required to create instances. To see the exact permissions that are required, expand the Required permissions section:

Required permissions

The following permissions are required to create instances:

compute.instances.create on the project
To use a custom image to create the VM: compute.images.useReadOnly on the image
To use a snapshot to create the VM: compute.snapshots.useReadOnly on the snapshot
To use an instance template to create the VM: compute.instanceTemplates.useReadOnly on the instance template
To specify a subnet for your VM: compute.subnetworks.use on the project or on the chosen subnet
To specify a static IP address for the VM: compute.addresses.use on the project
To assign an external IP address to the VM when using a VPC network: compute.subnetworks.useExternalIp on the project or on the chosen subnet
To assign a legacy network to the VM: compute.networks.use on the project
To assign an external IP address to the VM when using a legacy network: compute.networks.useExternalIp on the project
To set VM instance metadata for the VM: compute.instances.setMetadata on the project
To set tags for the VM: compute.instances.setTags on the VM
To set labels for the VM: compute.instances.setLabels on the VM
To set a service account for the VM to use: compute.instances.setServiceAccount on the VM
To create a new disk for the VM: compute.disks.create on the project
To attach an existing disk in read-only or read-write mode: compute.disks.use on the disk
To attach an existing disk in read-only mode: compute.disks.useReadOnly on the disk

You might also be able to get these permissions with custom roles or other predefined roles.

Determine how to create A3 Ultra or A4 instances

To determine the options that you want to use to create A3 Ultra or A4 instances, complete the following steps:

Choose a consumption option: To learn how to choose a consumption option for an A3 Ultra or A4 instance, see Choose a consumption option in the AI Hypercomputer documentation.
Note: A3 Ultra and A4 instances don't support on-demand instances, which is the default option when creating Compute Engine instances.
Obtain capacity: To learn how to obtain capacity for A3 Ultra or A4 instances for the consumption option that you chose, see Capacity overview in the AI Hypercomputer documentation.
Select creation instructions: To learn about all the options that you can use to create A3 Ultra or A4 instances, such as managed instance groups (MIGs) or clusters, see Overview of creating VMs and clusters in the AI Hypercomputer documentation.
If you want to use the cluster management features of A3 Ultra or A4, or if you don't want to create standalone instances, then select a creation option in the AI Hypercomputer documentation instead.

Create an A3 Ultra or A4 instance

To create an A3 Ultra or A4 instance, complete the following steps:

Create VPC networks

Tip: If you are setting up a quick test, you can skip this step and specify a single NIC by using the --network-interface=nic-type=GVNIC flag instead.

To set up the network for an A4 or A3 Ultra machine type, create three VPC networks for the following network interfaces:

Two regular VPC networks for the gVNIC network interfaces (NICs). These are used for host-to-host communication.
One VPC network with the RoCE network profile for the CX-7 NICs. The RoCE VPC network needs eight subnets, one subnet for each CX-7 NIC. These NICs use RDMA over Converged Ethernet (RoCE), providing the high-bandwidth, low-latency communication that's essential for GPU-to-GPU communication.

For more information about NIC arrangement, see Review network bandwidth and NIC arrangement.

Create the networks either manually by following the instruction guides or automatically by using the provided script.

Instruction guides

To create the networks, you can use the following instructions:

To create the regular VPC networks for the gVNICs, see Create and manage Virtual Private Cloud networks.
To create the RoCE VPC network, see Create a Virtual Private Cloud network for RDMA NICs.

For these VPC networks, we recommend setting the maximum transmission unit (MTU) to a larger value. For the A4 or A3 Ultra machine types, the recommended MTU is 8896 bytes. To review the recommended MTU settings for other GPU machine types, see MTU settings for GPU machine types.

Script

To create the networks, follow these steps.

Use the following script to create VPC networks for the gVNICs and CX-7 NICs.

    #!/bin/bash

    # Create regular VPC networks and subnets for the gVNICs
    for N in $(seq 0 1); do
      gcloud compute networks create GVNIC_NAME_PREFIX-net-$N \
        --subnet-mode=custom \
        --mtu=8896

      gcloud compute networks subnets create GVNIC_NAME_PREFIX-sub-$N \
        --network=GVNIC_NAME_PREFIX-net-$N \
        --region=REGION \
        --range=10.$N.0.0/16

      gcloud compute firewall-rules create GVNIC_NAME_PREFIX-internal-$N \
        --network=GVNIC_NAME_PREFIX-net-$N \
        --action=ALLOW \
        --rules=tcp:0-65535,udp:0-65535,icmp \
        --source-ranges=10.0.0.0/8
    done

    # Create SSH firewall rules
    gcloud compute firewall-rules create GVNIC_NAME_PREFIX-ssh \
      --network=GVNIC_NAME_PREFIX-net-0 \
      --action=ALLOW \
      --rules=tcp:22 \
      --source-ranges=IP_RANGE

    # Assumes that an external IP is only created for vNIC 0
    gcloud compute firewall-rules create GVNIC_NAME_PREFIX-allow-ping-net-0 \
      --network=GVNIC_NAME_PREFIX-net-0 \
      --action=ALLOW \
      --rules=icmp \
      --source-ranges=IP_RANGE

    # List and make sure network profiles exist in the machine type's zone
    gcloud compute network-profiles list --filter "location.name=ZONE"

    # Create network for CX-7
    gcloud compute networks create RDMA_NAME_PREFIX-mrdma \
      --network-profile=ZONE-vpc-roce \
      --subnet-mode custom \
      --mtu=8896

    # Create subnets
    for N in $(seq 0 7); do
      gcloud compute networks subnets create RDMA_NAME_PREFIX-mrdma-sub-$N \
        --network=RDMA_NAME_PREFIX-mrdma \
        --region=REGION \
        --range=10.$((N+2)).0.0/16  # offset to avoid overlap with gVNICs
    done

Replace the following:

GVNIC_NAME_PREFIX: the custom name prefix to use for the regular VPC networks and subnets for the gVNICs.
RDMA_NAME_PREFIX: the custom name prefix to use for the RoCE VPC network and subnets for the CX-7 NICs.
ZONE: a zone in which the machine type that you want to use is available, such as us-central1-a. For information about regions, see GPU availability by regions and zones.
REGION: the region where you want to create the subnets. This region must correspond to the zone specified. For example, if your zone is us-central1-a, then your region is us-central1.
IP_RANGE: the IP range to use for the SSH firewall rules.
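Because REGION must correspond to ZONE, one way to avoid a mismatch is to derive the region from the zone name. The following is a minimal sketch, assuming that zone names follow the REGION-SUFFIX pattern used in the example above (us-central1-a). The function name region_from_zone is hypothetical and isn't part of the gcloud CLI.

```shell
#!/bin/sh
# Hypothetical helper (not a gcloud command): derive a region from a zone name
# by removing the final "-<suffix>" segment, for example "-a".
region_from_zone() {
  printf '%s\n' "${1%-*}"
}

ZONE="us-central1-a"                  # example zone from the text above
REGION="$(region_from_zone "$ZONE")"
echo "ZONE=$ZONE REGION=$REGION"      # prints: ZONE=us-central1-a REGION=us-central1
```

You could then substitute the two shell variables for the ZONE and REGION placeholders in the script.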

Optional: To verify that the VPC network resources are created successfully, check the network settings in the Google Cloud console:
1. In the Google Cloud console, go to the VPC networks page.
2. Search the list for the networks that you created in the previous step.
3. To view the subnets, firewall rules, and other network settings, click the name of the network.
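To know which subnets and ranges to look for, it can help to list what the script is expected to create. The sketch below only prints the subnet names and ranges implied by the script's two loops; it doesn't call any Google Cloud API. The function name print_subnet_plan and the example prefixes my-gvnic and my-rdma are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print "<subnet name> <range>" for each subnet that the
# network script creates. Arguments: gVNIC name prefix, RDMA name prefix.
print_subnet_plan() {
  plan_gvnic_prefix="$1"
  plan_rdma_prefix="$2"

  # Two gVNIC subnets: 10.0.0.0/16 and 10.1.0.0/16
  plan_n=0
  while [ "$plan_n" -le 1 ]; do
    echo "${plan_gvnic_prefix}-sub-${plan_n} 10.${plan_n}.0.0/16"
    plan_n=$((plan_n + 1))
  done

  # Eight RoCE subnets, offset by 2 so they don't overlap the gVNIC ranges
  plan_n=0
  while [ "$plan_n" -le 7 ]; do
    echo "${plan_rdma_prefix}-mrdma-sub-${plan_n} 10.$((plan_n + 2)).0.0/16"
    plan_n=$((plan_n + 1))
  done
}

print_subnet_plan my-gvnic my-rdma
```

The output has ten lines, one per subnet, which you can compare against the subnets shown in the console.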

Create the instance

To create an instance, use one of the following options.

Console

In the Google Cloud console, go to the Create an instance page.
The Create an instance screen appears and displays the Machine configuration pane.
In the Machine configuration pane, complete the following steps:
1. Specify a Name for your instance. See Resource naming convention.
2. Select the Region and Zone where you have reserved capacity.
3. Click the GPUs tab, and then complete the following steps:
  1. In the GPU type list, select your GPU type.
    - For A4 instances, select NVIDIA B200.
    - For A3 Ultra instances, select NVIDIA H200 141GB.
  2. In the Number of GPUs list, select 8.
In the navigation menu, click OS and storage. In the OS and storage pane that appears, complete the following steps:
1. Click Change. The Boot disk configuration pane appears.
2. On the Public images tab, select a recommended image. For a list of recommended images, see Operating systems.
3. To confirm your boot disk options, click Select.
To create a multi-NIC instance, complete the following steps. Otherwise, to create a single-NIC instance, skip these steps.
- In the navigation menu, click Networking. In the Networking pane that appears, complete the following steps:
  1. In the Network interfaces section, delete the default network interface. To delete the interface, click Delete.
  2. Click Add a network interface. Use this option to add network interfaces that attach to the VPC networks that you created in the previous section. When you add the network interfaces, remember the following:
    - For a network interface that is used for host-to-host communication, select a regular VPC network and subnet from the Network and Subnetwork lists, and set the Network interface card list to gVNIC.
    - For a network interface that is used for GPU-to-GPU communication, select the RoCE VPC network and subnet from the Network and Subnetwork lists, and set the Network interface card list to MRDMA.
In the navigation menu, click Advanced. Then, complete the following steps for the provisioning model that you want to use.
Flex-start
1. In the Provisioning model section, in the VM provisioning model list, select Flex-start.
2. In the Enter number of hours field, enter the maximum amount of time that you want the VM to run. The value must be between 36 seconds (0.01 hours) and seven days (168 hours).
3. Select Set a wait time for VM creation.
  Based on the zonal requirements for your workload, specify one of the following durations to help increase your chances that your VM creation request succeeds:
  - Workloads with strict zonal requirements: if your workload requires you to create the VM in a specific zone, then specify a duration between 90 seconds and 2 hours. Longer durations give you higher chances of obtaining resources.
  - Workloads without strict zonal requirements: if the VM can run in any zone within the region, then specify a duration of 0 seconds or clear the Set a wait time for VM creation checkbox. This action specifies that Compute Engine only allocates resources if they are immediately available. If the VM creation request fails because resources are unavailable, then retry the request in a different zone.
4. In the On VM termination field, select whether to stop or delete the VM at the end of its run duration:
  - To delete the VM, select Delete.
  - To stop the VM, select Stop.
Reservation-bound
1. Click Choose a reservation. This action opens a pane with a list of available reservations within your selected zone. From the reservation list, complete the following steps:
  1. Select the reservation that you want to use for the VM. You can also select a specific block within the reservation.
  2. Click Choose.
Spot
1. In the Provisioning model section, select Spot from the VM provisioning model list.
2. Optional: To select the termination action that happens when Compute Engine preempts the VM, complete the following steps:
  1. Expand the VM provisioning model advanced settings section.
  2. In the On VM termination list, select one of the following options:
    - To stop the VM during preemption, select Stop (default).
    - To delete the VM during preemption, select Delete.
To create and start the instance, click Create.

gcloud

To create the VM, use the gcloud compute instances create command.

The parameters that you need to specify depend on the consumption option that you are using for this deployment. Select the tab that corresponds to your consumption option's provisioning model.

Flex-start

    gcloud compute instances create VM_NAME \
        --machine-type=MACHINE_TYPE \
        --image-family=IMAGE_FAMILY \
        --image-project=IMAGE_PROJECT \
        --zone=ZONE \
        --boot-disk-type=hyperdisk-balanced \
        --boot-disk-size=DISK_SIZE \
        --scopes=cloud-platform \
        --network-interface=nic-type=GVNIC,network=GVNIC_NAME_PREFIX-net-0,subnet=GVNIC_NAME_PREFIX-sub-0 \
        --network-interface=nic-type=GVNIC,network=GVNIC_NAME_PREFIX-net-1,subnet=GVNIC_NAME_PREFIX-sub-1,no-address \
        --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-0,no-address \
        --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-1,no-address \
        --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-2,no-address \
        --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-3,no-address \
        --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-4,no-address \
        --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-5,no-address \
        --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-6,no-address \
        --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-7,no-address \
        --reservation-affinity=none \
        --provisioning-model=FLEX_START \
        --request-valid-for-duration=REQUEST_VALID_FOR_DURATION \
        --max-run-duration=MAX_RUN_DURATION \
        --instance-termination-action=TERMINATION_ACTION \
        --maintenance-policy=TERMINATE

Replace the following:

VM_NAME: the name of the VM.
MACHINE_TYPE: the machine type to use for the VM. For more information, see GPU machine types.
IMAGE_FAMILY: the image family of the OS image that you want to use. For a list of supported operating systems, see Operating system details.
IMAGE_PROJECT: the project ID of the OS image.
ZONE: the zone in which the machine type that you want to use is available. For information about regions, see GPU availability by regions and zones.
DISK_SIZE: the size of the boot disk in GB.
GVNIC_NAME_PREFIX: the name prefix that you specified when creating the standard VPC networks and subnets that use gVNICs.
RDMA_NAME_PREFIX: the name prefix that you specified when creating the VPC networks and subnets that use RDMA NICs.
REQUEST_VALID_FOR_DURATION: how long the request to create the VM remains valid. You must format the value as the number of days, hours, minutes, or seconds followed by d, h, m, and s, respectively. For example, specify 30m for 30 minutes or 1h2m3s for one hour, two minutes, and three seconds.
Based on the zonal requirements for your workload, specify one of the following durations to help increase your chances that your VM creation request succeeds:
- Workloads with strict zonal requirements: if your workload requires you to create the VM in a specific zone, then specify a duration between 90 seconds (90s) and two hours (2h). Longer durations give you higher chances of obtaining resources.
- Workloads without strict zonal requirements: if the VM can run in any zone within the region, then specify a duration of zero seconds (0s). This action specifies that Compute Engine only allocates resources if they are immediately available. If the VM creation request fails because resources are unavailable, then retry the request in a different zone.
MAX_RUN_DURATION: how long you want the requested VMs to run. You must format the value as the number of days, hours, minutes, or seconds followed by d, h, m, and s, respectively. The value must be between 10 minutes and seven days.
TERMINATION_ACTION: whether Compute Engine stops (STOP) or deletes (DELETE) the VM at the end of its run duration.
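The duration format and the MAX_RUN_DURATION bounds described above can be checked locally before you send a request. The following is a minimal sketch rather than anything the gcloud CLI provides: the function names duration_to_seconds and valid_max_run_duration are hypothetical, and the parser assumes the value is one or more integer-plus-unit pairs (d, h, m, s), as in the 30m and 1h2m3s examples.

```shell
#!/bin/sh
# Hypothetical helpers (not part of the gcloud CLI).

# Convert a duration such as 30m or 1h2m3s to a number of seconds.
# Accepts one or more <integer><unit> pairs, where unit is d, h, m, or s.
duration_to_seconds() {
  printf '%s\n' "$1" | grep -Eq '^((0|[1-9][0-9]*)[dhms])+$' || return 1
  # Rewrite each unit as a multiplier, for example 1h2m3s -> 1*3600+2*60+3+
  dur_terms=$(printf '%s\n' "$1" |
    sed -e 's/d/*86400+/g' -e 's/h/*3600+/g' -e 's/m/*60+/g' -e 's/s/+/g')
  # Append a final 0 so that the trailing "+" forms a valid expression
  echo $(( ${dur_terms}0 ))
}

# Check the documented MAX_RUN_DURATION bounds: 10 minutes to seven days.
valid_max_run_duration() {
  dur_secs=$(duration_to_seconds "$1") || return 1
  [ "$dur_secs" -ge 600 ] && [ "$dur_secs" -le 604800 ]
}

for d in 1h2m3s 30m 7d 5m; do
  if valid_max_run_duration "$d"; then
    echo "$d = $(duration_to_seconds "$d") seconds: within bounds"
  else
    echo "$d: not parseable, or outside the 10 minute to seven day range"
  fi
done
```

The same conversion is useful if you later switch to the REST API, which expects these durations as a number of seconds.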

Reservation-bound

    gcloud compute instances create VM_NAME \
        --machine-type=MACHINE_TYPE \
        --image-family=IMAGE_FAMILY \
        --image-project=IMAGE_PROJECT \
        --zone=ZONE \
        --boot-disk-type=hyperdisk-balanced \
        --boot-disk-size=DISK_SIZE \
        --scopes=cloud-platform \
        --network-interface=nic-type=GVNIC,network=GVNIC_NAME_PREFIX-net-0,subnet=GVNIC_NAME_PREFIX-sub-0 \
        --network-interface=nic-type=GVNIC,network=GVNIC_NAME_PREFIX-net-1,subnet=GVNIC_NAME_PREFIX-sub-1,no-address \
        --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-0,no-address \
        --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-1,no-address \
        --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-2,no-address \
        --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-3,no-address \
        --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-4,no-address \
        --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-5,no-address \
        --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-6,no-address \
        --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-7,no-address \
        --reservation-affinity=specific \
        --reservation=RESERVATION \
        --provisioning-model=RESERVATION_BOUND \
        --instance-termination-action=TERMINATION_ACTION \
        --maintenance-policy=TERMINATE \
        --restart-on-failure

Replace the following:

VM_NAME: the name of the VM.
MACHINE_TYPE: the machine type to use for the VM. For more information, see GPU machine types.
IMAGE_FAMILY: the image family of the OS image that you want to use. For a list of supported operating systems, see Operating system details.
IMAGE_PROJECT: the project ID of the OS image.
ZONE: the zone in which the machine type that you want to use is available. For information about regions, see GPU availability by regions and zones.
DISK_SIZE: the size of the boot disk in GB.
GVNIC_NAME_PREFIX: the name prefix that you specified when creating the standard VPC networks and subnets that use gVNICs.
RDMA_NAME_PREFIX: the name prefix that you specified when creating the VPC networks and subnets that use RDMA NICs.
RESERVATION: either the reservation name or a specific block within a reservation. To get the reservation name or the available blocks, see View reserved capacity. Based on your requirement for instance placement, choose one of the following:
- To create the instance on any block:
```
projects/RESERVATION_OWNER_PROJECT_ID/reservations/RESERVATION_NAME
```
  Additionally, to create multiple instances in the same block, apply the same compact placement policy that specifies a block collocation (maxDistance=2) when creating each instance. Compute Engine then applies the policy to the reservation and creates instances on the same block.
- To create the instance on a specific block:
```
projects/RESERVATION_OWNER_PROJECT_ID/reservations/RESERVATION_NAME/reservationBlocks/RESERVATION_BLOCK_NAME
```
Tip: If the reservation exists in the current project, then you can omit projects/RESERVATION_OWNER_PROJECT_ID/reservations/ from the reservation value.
TERMINATION_ACTION: whether Compute Engine stops (STOP) or deletes (DELETE) the VM at the end of the reservation period.
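The two RESERVATION formats above differ only in the optional reservationBlocks suffix. As an illustration, the following sketch assembles either form from its parts. The function name reservation_value and the example project, reservation, and block names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: build the value for the RESERVATION placeholder.
# Arguments: owner project ID, reservation name, optional block name.
reservation_value() {
  res_value="projects/$1/reservations/$2"
  if [ -n "${3:-}" ]; then
    # A third argument targets a specific block within the reservation
    res_value="${res_value}/reservationBlocks/$3"
  fi
  printf '%s\n' "$res_value"
}

reservation_value example-project example-reservation
reservation_value example-project example-reservation example-block
```

The first call prints the any-block form and the second prints the specific-block form; you could pass either result to the --reservation flag.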

Spot

    gcloud compute instances create VM_NAME \
        --machine-type=MACHINE_TYPE \
        --image-family=IMAGE_FAMILY \
        --image-project=IMAGE_PROJECT \
        --zone=ZONE \
        --boot-disk-type=hyperdisk-balanced \
        --boot-disk-size=DISK_SIZE \
        --scopes=cloud-platform \
        --network-interface=nic-type=GVNIC,network=GVNIC_NAME_PREFIX-net-0,subnet=GVNIC_NAME_PREFIX-sub-0 \
        --network-interface=nic-type=GVNIC,network=GVNIC_NAME_PREFIX-net-1,subnet=GVNIC_NAME_PREFIX-sub-1,no-address \
        --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-0,no-address \
        --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-1,no-address \
        --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-2,no-address \
        --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-3,no-address \
        --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-4,no-address \
        --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-5,no-address \
        --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-6,no-address \
        --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-7,no-address \
        --provisioning-model=SPOT \
        --instance-termination-action=TERMINATION_ACTION \
        --maintenance-policy=TERMINATE \
        --no-restart-on-failure

Replace the following:

VM_NAME: the name of the VM.

MACHINE_TYPE: the machine type to use for the VM. For more information, see GPU machine types.
IMAGE_FAMILY: the image family of the OS image that you want to use. For a list of supported operating systems, see Operating system details.
IMAGE_PROJECT: the project ID of the OS image.
ZONE: the zone in which the machine type that you want to use is available. For information about regions, see GPU availability by regions and zones.
DISK_SIZE: the size of the boot disk in GB.
GVNIC_NAME_PREFIX: the name prefix that you specified when creating the standard VPC networks and subnets that use gVNICs.
RDMA_NAME_PREFIX: the name prefix that you specified when creating the VPC networks and subnets that use RDMA NICs.
TERMINATION_ACTION: the action to take when Compute Engine preempts the instance, either STOP (default) or DELETE.
Important: Make sure your application can handle preemption. For example, handle preemption by specifying a shutdown script during instance creation. Learn how to handle preemption with a shutdown script.
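In each of the gcloud commands above, the eight MRDMA --network-interface flags differ only in the subnet index (0 through 7). If you script instance creation, one option is to generate those flags in a loop instead of writing them out. The sketch below only prints the flags; the function name mrdma_interface_flags and the example prefix my-rdma are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print one --network-interface flag per CX-7 NIC,
# using the subnet naming scheme from the network creation script.
mrdma_interface_flags() {
  flag_prefix="$1"
  flag_n=0
  while [ "$flag_n" -le 7 ]; do
    echo "--network-interface=nic-type=MRDMA,network=${flag_prefix}-mrdma,subnet=${flag_prefix}-mrdma-sub-${flag_n},no-address"
    flag_n=$((flag_n + 1))
  done
}

mrdma_interface_flags my-rdma
```

Because the generated flags contain no whitespace, an unquoted $(mrdma_interface_flags "$RDMA_NAME_PREFIX") could be placed in a gcloud compute instances create command line alongside the other flags. Treat that as a convenience for scripts, not a replacement for reviewing the full command.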

REST

To create the VM, make a POST request to the instances.insert method.

The parameters that you need to specify depend on the consumption option that you are using for this deployment. Select the tab that corresponds to your consumption option's provisioning model.

Flex-start

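    POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instances

    {
      "machineType": "projects/PROJECT_ID/zones/ZONE/machineTypes/MACHINE_TYPE",
      "name": "VM_NAME",
      "disks": [
        {
          "boot": true,
          "initializeParams": {
            "diskSizeGb": "DISK_SIZE",
            "diskType": "hyperdisk-balanced",
            "sourceImage": "projects/IMAGE_PROJECT/global/images/family/IMAGE_FAMILY"
          },
          "mode": "READ_WRITE",
          "type": "PERSISTENT"
        }
      ],
      "serviceAccounts": [
        {
          "email": "default",
          "scopes": [
            "https://www.googleapis.com/auth/cloud-platform"
          ]
        }
      ],
      "networkInterfaces": [
        {
          "accessConfigs": [
            {
              "name": "external-nat",
              "type": "ONE_TO_ONE_NAT"
            }
          ],
          "network": "projects/NETWORK_PROJECT_ID/global/networks/GVNIC_NAME_PREFIX-net-0",
          "nicType": "GVNIC",
          "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/GVNIC_NAME_PREFIX-sub-0"
        },
        {
          "network": "projects/NETWORK_PROJECT_ID/global/networks/GVNIC_NAME_PREFIX-net-1",
          "nicType": "GVNIC",
          "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/GVNIC_NAME_PREFIX-sub-1"
        },
        {
          "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
          "nicType": "MRDMA",
          "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-0"
        },
        {
          "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
          "nicType": "MRDMA",
          "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-1"
        },
        {
          "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
          "nicType": "MRDMA",
          "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-2"
        },
        {
          "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
          "nicType": "MRDMA",
          "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-3"
        },
        {
          "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
          "nicType": "MRDMA",
          "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-4"
        },
        {
          "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
          "nicType": "MRDMA",
          "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-5"
        },
        {
          "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
          "nicType": "MRDMA",
          "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-6"
        },
        {
          "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
          "nicType": "MRDMA",
          "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-7"
        }
      ],
      "reservationAffinity": {
        "consumeReservationType": "NO_RESERVATION"
      },
      "scheduling": {
        "provisioningModel": "FLEX_START",
        "requestValidForDuration": {
          "seconds": REQUEST_VALID_FOR_DURATION
        },
        "maxRunDuration": {
          "seconds": MAX_RUN_DURATION
        },
        "instanceTerminationAction": "TERMINATION_ACTION",
        "onHostMaintenance": "TERMINATE"
      }
    }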

Replace the following:

PROJECT_ID: the project ID of the project where you want to create the VM.
ZONE: the zone in which the machine type that you want to use is available. For information about regions, see GPU availability by regions and zones.
MACHINE_TYPE: the machine type to use for the VM. For more information, see GPU machine types.
VM_NAME: the name of the VM.
DISK_SIZE: the size of the boot disk in GB.
IMAGE_PROJECT: the project ID of the OS image.
IMAGE_FAMILY: the image family of the OS image that you want to use. For a list of supported operating systems, see Operating system details.
NETWORK_PROJECT_ID: the project ID of the network.
GVNIC_NAME_PREFIX: the name prefix that you specified when creating the standard VPC networks and subnets that use gVNICs.
REGION: the region of the subnetwork.
RDMA_NAME_PREFIX: the name prefix that you specified when creating the VPC networks and subnets that use RDMA NICs.
REQUEST_VALID_FOR_DURATION: the duration, in seconds, that the request to create the VM remains valid.
Based on the zonal requirements for your workload, specify one of the following durations to help increase your chances that your VM creation request succeeds:
- Workloads with strict zonal requirements: if your workload requires you to create the VM in a specific zone, then specify a duration between 90 seconds (90) and two hours (7200). Longer durations give you higher chances of obtaining resources.
- Workloads without strict zonal requirements: if the VM can run in any zone within the region, then specify a duration of zero seconds (0). This action specifies that Compute Engine only allocates resources if they are immediately available. If the VM creation request fails because resources are unavailable, then retry the request in a different zone.
MAX_RUN_DURATION: the duration that you want the requested VMs to run. You must format the value as the number of seconds. For example, specify 86400 for 86,400 seconds (24 hours). The value must be between 10 minutes and seven days.
TERMINATION_ACTION: whether Compute Engine stops (STOP) or deletes (DELETE) the VM at the end of its run duration.
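The guidance for REQUEST_VALID_FOR_DURATION above distinguishes a value of 0 from the 90 to 7200 second range. The following sketch labels a candidate value accordingly before you place it in the request body. It reflects only the recommendations in this section, not limits enforced by the API, and the function name and its output labels are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: label a REQUEST_VALID_FOR_DURATION value (in seconds)
# according to the recommendations described above.
describe_request_valid_for() {
  case "$1" in
    ''|*[!0-9]*) echo "invalid"; return 1 ;;   # must be a whole number
  esac
  if [ "$1" -eq 0 ]; then
    echo "immediate"            # allocate only if resources are available now
  elif [ "$1" -ge 90 ] && [ "$1" -le 7200 ]; then
    echo "wait"                 # recommended range for strict zonal requirements
  else
    echo "outside-recommended"  # neither 0 nor within 90 to 7200 seconds
  fi
}

for secs in 0 90 7200 30; do
  echo "$secs -> $(describe_request_valid_for "$secs")"
done
```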

Reservation-bound

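    POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instances

    {
      "machineType": "projects/PROJECT_ID/zones/ZONE/machineTypes/MACHINE_TYPE",
      "name": "VM_NAME",
      "disks": [
        {
          "boot": true,
          "initializeParams": {
            "diskSizeGb": "DISK_SIZE",
            "diskType": "hyperdisk-balanced",
            "sourceImage": "projects/IMAGE_PROJECT/global/images/family/IMAGE_FAMILY"
          },
          "mode": "READ_WRITE",
          "type": "PERSISTENT"
        }
      ],
      "serviceAccounts": [
        {
          "email": "default",
          "scopes": [
            "https://www.googleapis.com/auth/cloud-platform"
          ]
        }
      ],
      "networkInterfaces": [
        {
          "accessConfigs": [
            {
              "name": "external-nat",
              "type": "ONE_TO_ONE_NAT"
            }
          ],
          "network": "projects/NETWORK_PROJECT_ID/global/networks/GVNIC_NAME_PREFIX-net-0",
          "nicType": "GVNIC",
          "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/GVNIC_NAME_PREFIX-sub-0"
        },
        {
          "network": "projects/NETWORK_PROJECT_ID/global/networks/GVNIC_NAME_PREFIX-net-1",
          "nicType": "GVNIC",
          "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/GVNIC_NAME_PREFIX-sub-1"
        },
        {
          "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
          "nicType": "MRDMA",
          "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-0"
        },
        {
          "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
          "nicType": "MRDMA",
          "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-1"
        },
        {
          "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
          "nicType": "MRDMA",
          "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-2"
        },
        {
          "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
          "nicType": "MRDMA",
          "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-3"
        },
        {
          "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
          "nicType": "MRDMA",
          "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-4"
        },
        {
          "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
          "nicType": "MRDMA",
          "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-5"
        },
        {
          "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
          "nicType": "MRDMA",
          "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-6"
        },
        {
          "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
          "nicType": "MRDMA",
          "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-7"
        }
      ],
      "reservationAffinity": {
        "consumeReservationType": "SPECIFIC_RESERVATION",
        "key": "compute.googleapis.com/reservation-name",
        "values": [
          "RESERVATION"
        ]
      },
      "scheduling": {
        "provisioningModel": "RESERVATION_BOUND",
        "instanceTerminationAction": "TERMINATION_ACTION",
        "onHostMaintenance": "TERMINATE",
        "automaticRestart": true
      }
    }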

Replace the following:

PROJECT_ID: the project ID of the project where you want to create the VM.
ZONE: the zone in which the machine type that you want to use is available. For information about regions, see GPU availability by regions and zones.
MACHINE_TYPE: the machine type to use for the VM. For more information, see GPU machine types.
VM_NAME: the name of the VM.
DISK_SIZE: the size of the boot disk in GB.
IMAGE_PROJECT: the project ID of the OS image.
IMAGE_FAMILY: the image family of the OS image that you want to use. For a list of supported operating systems, see Operating system details.
NETWORK_PROJECT_ID: the project ID of the network.
GVNIC_NAME_PREFIX: the name prefix that you specified when creating the standard VPC networks and subnets that use gVNICs.
REGION: the region of the subnetwork.
RDMA_NAME_PREFIX: the name prefix that you specified when creating the VPC networks and subnets that use RDMA NICs.
RESERVATION: either the reservation name or a specific block within a reservation. To get the reservation name or the available blocks, see View reserved capacity. Based on your requirement for instance placement, choose one of the following:
- To create the instance on any block:
```
projects/RESERVATION_OWNER_PROJECT_ID/reservations/RESERVATION_NAME
```
  Additionally, to create multiple instances in the same block, apply the same compact placement policy that specifies a block collocation (maxDistance=2) when creating each instance. Compute Engine then applies the policy to the reservation and creates instances on the same block.
- To create the instance on a specific block:
```
projects/RESERVATION_OWNER_PROJECT_ID/reservations/RESERVATION_NAME/reservationBlocks/RESERVATION_BLOCK_NAME
```
Tip: If the reservation exists in the current project, then you can omit projects/RESERVATION_OWNER_PROJECT_ID/reservations/ from the reservation value.
TERMINATION_ACTION: whether Compute Engine stops (STOP) or deletes (DELETE) the VM at the end of the reservation period.
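
The RESERVATION formats and the reservation-bound fields described above can be assembled programmatically. The following Python sketch is illustrative only: the function names and the sample values (`owner-project`, `my-reservation`, `block-a`) are hypothetical, and the field names and constants are copied from the request body shown earlier. It only builds the JSON fragment and does not call the Compute Engine API.

```python
import json


def reservation_value(reservation_name, owner_project=None, block_name=None):
    """Return a string for the RESERVATION placeholder.

    owner_project may be omitted when the reservation lives in the
    current project (see the tip above). block_name targets one block.
    """
    value = reservation_name
    if owner_project is not None:
        value = f"projects/{owner_project}/reservations/{reservation_name}"
    if block_name is not None:
        value = f"{value}/reservationBlocks/{block_name}"
    return value


def reservation_bound_fields(reservation, termination_action="DELETE"):
    """Build the reservationAffinity and scheduling parts of the body."""
    if termination_action not in ("STOP", "DELETE"):
        raise ValueError("termination_action must be STOP or DELETE")
    return {
        "reservationAffinity": {
            "consumeReservationType": "SPECIFIC_RESERVATION",
            "key": "compute.googleapis.com/reservation-name",
            "values": [reservation],
        },
        "scheduling": {
            "provisioningModel": "RESERVATION_BOUND",
            "instanceTerminationAction": termination_action,
            "onHostMaintenance": "TERMINATE",
            "automaticRestart": True,
        },
    }


# Hypothetical values, for illustration only.
target = reservation_value("my-reservation", "owner-project", "block-a")
print(json.dumps(reservation_bound_fields(target, "STOP"), indent=2))
```

Merging the returned dictionary into the rest of the request body (machine type, disks, network interfaces) is left to the caller.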

Spot

POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instances
{
  "machineType": "projects/PROJECT_ID/zones/ZONE/machineTypes/MACHINE_TYPE",
  "name": "VM_NAME",
  "disks": [
    {
      "boot": true,
      "initializeParams": {
        "diskSizeGb": "DISK_SIZE",
        "diskType": "hyperdisk-balanced",
        "sourceImage": "projects/IMAGE_PROJECT/global/images/family/IMAGE_FAMILY"
      },
      "mode": "READ_WRITE",
      "type": "PERSISTENT"
    }
  ],
  "serviceAccounts": [
    {
      "email": "default",
      "scopes": [
        "https://www.googleapis.com/auth/cloud-platform"
      ]
    }
  ],
  "networkInterfaces": [
    {
      "accessConfigs": [
        {
          "name": "external-nat",
          "type": "ONE_TO_ONE_NAT"
        }
      ],
      "network": "projects/NETWORK_PROJECT_ID/global/networks/GVNIC_NAME_PREFIX-net-0",
      "nicType": "GVNIC",
      "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/GVNIC_NAME_PREFIX-sub-0"
    },
    {
      "network": "projects/NETWORK_PROJECT_ID/global/networks/GVNIC_NAME_PREFIX-net-1",
      "nicType": "GVNIC",
      "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/GVNIC_NAME_PREFIX-sub-1"
    },
    {
      "network": "projects/NETWORK_PROJECT_ID/global/networks/GVNIC_NAME_PREFIX-net-2",
      "nicType": "GVNIC",
      "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/GVNIC_NAME_PREFIX-sub-2"
    },
    {
      "network": "projects/NETWORK_PROJECT_ID/global/networks/GVNIC_NAME_PREFIX-net-3",
      "nicType": "GVNIC",
      "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/GVNIC_NAME_PREFIX-sub-3"
    },
    {
      "network": "projects/NETWORK_PROJECT_ID/global/networks/GVNIC_NAME_PREFIX-net-4",
      "nicType": "GVNIC",
      "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/GVNIC_NAME_PREFIX-sub-4"
    }
  ],
  "scheduling": {
    "provisioningModel": "SPOT",
    "instanceTerminationAction": "TERMINATION_ACTION",
    "onHostMaintenance": "TERMINATE",
    "automaticRestart": false
  }
}

Replace the following:

PROJECT_ID: the project ID of the project where you want to create the VM.

ZONE: the zone in which the machine type that you want to use is available. For information about regions, see GPU availability by regions and zones.
MACHINE_TYPE: the machine type to use for the VM. For more information, see GPU machine types.
VM_NAME: the name of the VM.
DISK_SIZE: the size of the boot disk in GB.
IMAGE_PROJECT: the project ID of the OS image.
IMAGE_FAMILY: the image family of the OS image that you want to use. For a list of supported operating systems, see Operating system details.
NETWORK_PROJECT_ID: the project ID of the network.
GVNIC_NAME_PREFIX: the name prefix that you specified when creating the standard VPC networks and subnets that use gVNICs.
REGION: the region of the subnetwork.
RDMA_NAME_PREFIX: the name prefix that you specified when creating the VPC networks and subnets that use RDMA NICs.
TERMINATION_ACTION: the action to take when Compute Engine preempts the instance, either STOP (default) or DELETE.
Important: Make sure your application can handle preemption. For example, handle preemption by specifying a shutdown script during instance creation. Learn how to handle preemption with a shutdown script.
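
As a rough illustration of the Spot request above, the following Python sketch generates the five gVNIC interfaces from the GVNIC_NAME_PREFIX naming pattern and builds the Spot scheduling block, optionally attaching a shutdown script. The function names and sample values are hypothetical. The sketch assumes the shutdown script is passed through the `shutdown-script` instance metadata key, and it writes the subnetwork path with the `regions/REGION` segment that Compute Engine resource URLs use. It builds JSON only and sends no request.

```python
import json


def gvnic_interfaces(network_project, region, prefix, count=5):
    """Build `count` gVNIC interfaces named PREFIX-net-N and PREFIX-sub-N."""
    interfaces = []
    for i in range(count):
        nic = {
            "network": f"projects/{network_project}/global/networks/{prefix}-net-{i}",
            "nicType": "GVNIC",
            "subnetwork": (
                f"projects/{network_project}/regions/{region}"
                f"/subnetworks/{prefix}-sub-{i}"
            ),
        }
        if i == 0:
            # Only the first interface carries an external IP config,
            # mirroring the request body shown above.
            nic["accessConfigs"] = [
                {"name": "external-nat", "type": "ONE_TO_ONE_NAT"}
            ]
        interfaces.append(nic)
    return interfaces


def spot_fields(termination_action="STOP", shutdown_script=None):
    """Build the Spot scheduling block, plus optional shutdown-script metadata."""
    if termination_action not in ("STOP", "DELETE"):
        raise ValueError("termination_action must be STOP or DELETE")
    fields = {
        "scheduling": {
            "provisioningModel": "SPOT",
            "instanceTerminationAction": termination_action,
            "onHostMaintenance": "TERMINATE",
            "automaticRestart": False,
        }
    }
    if shutdown_script is not None:
        # Assumption: the script is supplied as instance metadata.
        fields["metadata"] = {
            "items": [{"key": "shutdown-script", "value": shutdown_script}]
        }
    return fields


# Hypothetical values, for illustration only.
body = {"networkInterfaces": gvnic_interfaces("net-project", "us-central1", "demo")}
body.update(spot_fields("DELETE", "#!/bin/bash\necho 'saving checkpoint'"))
print(json.dumps(body, indent=2))
```

Generating the interfaces in a loop keeps the network and subnet suffixes in step, which is easy to get wrong when the five entries are edited by hand.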

Prepare the instance for use

To prepare an instance that has GPUs attached for use, complete the following steps:

To enable an A4 or A3 Ultra instance to use its attached GPUs, the instance must have GPU drivers installed. Unless the image in the instance already includes the required GPU drivers, install GPU drivers.
If you created a Spot VM in the previous section, then complete the following steps:
- To prepare your Spot VM for a potential preemption, see Manage preemption of Spot VMs.
- Optional: Learn about best practices for Spot VMs.
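
One way to confirm that GPU drivers are installed is to check whether the driver's `nvidia-smi` tool is present and lists the attached GPUs. The following Python sketch is a minimal check to run on the instance itself, not an official verification procedure. It assumes that the NVIDIA driver installs `nvidia-smi` on the PATH and that `nvidia-smi -L` prints one line per GPU in the form `GPU 0: <name> (UUID: ...)`. The function names are hypothetical.

```python
import shutil
import subprocess


def parse_gpu_names(listing: str) -> list[str]:
    """Extract GPU names from `nvidia-smi -L` style output."""
    names = []
    for line in listing.splitlines():
        line = line.strip()
        if line.startswith("GPU ") and ":" in line:
            # "GPU 0: NAME (UUID: GPU-...)" -> "NAME"
            after_index = line.split(":", 1)[1]
            names.append(after_index.split("(UUID", 1)[0].strip())
    return names


def list_gpus() -> list[str]:
    """Return visible GPU names, or an empty list if no driver tool is found."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return []  # nvidia-smi is missing, so the driver is likely not installed.
    result = subprocess.run([exe, "-L"], capture_output=True, text=True, check=False)
    if result.returncode != 0:
        return []
    return parse_gpu_names(result.stdout)


gpus = list_gpus()
if gpus:
    print(f"Found {len(gpus)} GPU(s): {', '.join(gpus)}")
else:
    print("No GPUs reported; GPU drivers might not be installed.")
```

An empty result does not distinguish between a missing driver and a driver that fails to start, so treat it as a prompt to follow the driver installation steps rather than as a diagnosis.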

What's next

To monitor GPU performance, see Monitor GPU performance.
To troubleshoot GPU instances, see Troubleshoot GPU VMs.
Learn more about GPU platforms.

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2026-02-18 UTC.
