Create a Flex-start VM
This document explains how to create a Flex-start virtual machine (VM) instance. Flex-start VMs run for up to seven days and help you acquire high-demand resources like GPUs at a discounted price. These features make Flex-start VMs a cost-effective solution for running short-duration workloads, such as model fine-tuning and batch inference.
To learn more about the key characteristics of Flex-start VMs, including the requirements and limitations that apply when you create them, see About Flex-start VMs.
Before you begin
Based on the machine type that you want to use, review one of the following configuration requirements:
- For an accelerator-optimized machine type (except A4X or G4), see Overview of creating an instance with attached GPUs.
- For an H4D machine type, see Create an instance that uses Cloud RDMA.
- If you haven't already, set up authentication. Authentication verifies your identity for access to Google Cloud services and APIs. To run code or samples from a local development environment, you can authenticate to Compute Engine by selecting one of the following options:
Select the tab for how you plan to use the samples on this page:
Console
When you use the Google Cloud console to access Google Cloud services and APIs, you don't need to set up authentication.
gcloud
Install the Google Cloud CLI. After installation, initialize the Google Cloud CLI by running the following command:
gcloud init
If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.
Note: If you installed the gcloud CLI previously, make sure you have the latest version by running gcloud components update.
- Set a default region and zone.
REST
To use the REST API samples on this page in a local development environment, you use the credentials you provide to the gcloud CLI.
Install the Google Cloud CLI. After installation, initialize the Google Cloud CLI by running the following command:
gcloud init
If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.
Note: If you installed the gcloud CLI previously, make sure you have the latest version by running gcloud components update.
For more information, see Authenticate for using REST in the Google Cloud authentication documentation.
Required roles
To get the permissions that you need to create Flex-start VMs, ask your administrator to grant you the Compute Instance Admin (v1) (roles/compute.instanceAdmin.v1) IAM role on the project. For more information about granting roles, see Manage access to projects, folders, and organizations.
This predefined role contains the permissions required to create Flex-start VMs. To see the exact permissions that are required, expand the Required permissions section:
Required permissions
The following permissions are required to create Flex-start VMs:
- compute.instances.create on the project
- To use a custom image to create the VM: compute.images.useReadOnly on the image
- To use a snapshot to create the VM: compute.snapshots.useReadOnly on the snapshot
- To use an instance template to create the VM: compute.instanceTemplates.useReadOnly on the instance template
- To specify a subnet for your VM: compute.subnetworks.use on the project or on the chosen subnet
- To specify a static IP address for the VM: compute.addresses.use on the project
- To assign an external IP address to the VM when using a VPC network: compute.subnetworks.useExternalIp on the project or on the chosen subnet
- To assign a legacy network to the VM: compute.networks.use on the project
- To assign an external IP address to the VM when using a legacy network: compute.networks.useExternalIp on the project
- To set VM instance metadata for the VM: compute.instances.setMetadata on the project
- To set tags for the VM: compute.instances.setTags on the VM
- To set labels for the VM: compute.instances.setLabels on the VM
- To set a service account for the VM to use: compute.instances.setServiceAccount on the VM
- To create a new disk for the VM: compute.disks.create on the project
- To attach an existing disk in read-only or read-write mode: compute.disks.use on the disk
- To attach an existing disk in read-only mode: compute.disks.useReadOnly on the disk
You might also be able to get these permissions with custom roles or other predefined roles.
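As a planning aid, the permission list above can be expressed as a lookup table. The following is a minimal sketch, not an official tool: the feature keys (such as custom_image) and the function name are hypothetical labels chosen for this example, while the permission strings are the ones listed above.

```python
# Sketch: work out which permissions a planned VM configuration needs.
# The feature keys are hypothetical names used only in this example.
BASE_PERMISSION = "compute.instances.create"

FEATURE_PERMISSIONS = {
    "custom_image": "compute.images.useReadOnly",
    "snapshot": "compute.snapshots.useReadOnly",
    "instance_template": "compute.instanceTemplates.useReadOnly",
    "subnet": "compute.subnetworks.use",
    "static_ip": "compute.addresses.use",
    "external_ip_vpc": "compute.subnetworks.useExternalIp",
    "legacy_network": "compute.networks.use",
    "external_ip_legacy": "compute.networks.useExternalIp",
    "metadata": "compute.instances.setMetadata",
    "tags": "compute.instances.setTags",
    "labels": "compute.instances.setLabels",
    "service_account": "compute.instances.setServiceAccount",
    "new_disk": "compute.disks.create",
    "existing_disk": "compute.disks.use",
    "existing_disk_read_only": "compute.disks.useReadOnly",
}


def required_permissions(features):
    """Return the sorted permissions needed for the given feature keys."""
    unknown = set(features) - set(FEATURE_PERMISSIONS)
    if unknown:
        raise ValueError(f"Unknown feature keys: {sorted(unknown)}")
    needed = {BASE_PERMISSION}
    needed.update(FEATURE_PERMISSIONS[feature] for feature in features)
    return sorted(needed)


if __name__ == "__main__":
    for permission in required_permissions(["custom_image", "labels"]):
        print(permission)
```

The result is only a checklist to compare against the roles that your administrator grants; it does not query IAM.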
Create a Flex-start VM
To create a Flex-start VM, select one of the following options:
Console
In the Google Cloud console, go to the Create an instance page.
In the Machine configuration pane, complete the following steps:
In the Name field, enter a name for the Flex-start VM.
Specify the Region and Zone where you want to create your VM. To review the regions and zones where the machine type that you want to use is available, see Available regions and zones.
Based on the workload that you want to run, specify a machine type as follows:
To specify an accelerator-optimized machine type, do the following:
Click the GPUs tab.
In the GPU type list, select a GPU type, except NVIDIA GB200 192GB (A4X) and NVIDIA RTX PRO 6000 (G4).
In the Number of GPUs list, select the number of GPUs to attach to your VM.
Optional: If your GPU model supports NVIDIA RTX Virtual Workstations (vWS) for graphics workloads, and you plan to run graphics-intensive workloads, select Enable Virtual Workstation (NVIDIA GRID).
To specify an H4D machine type, do the following:
Click the Compute optimized tab.
In the Series column, select H4D.
In the navigation menu, click Advanced. In the Advanced pane that appears, complete the following steps:
In the Provisioning model section, in the VM provisioning model list, select Flex-start.
In the Enter number of hours field, enter the maximum amount of time that you want the VM to run. The value must be between 0.01 (36 seconds) and 168 (168 hours, or seven days).
Select the Set a wait time for VM creation checkbox. Then, based on the zonal requirements for your workload, specify one of the following durations to help increase the chances that your VM creation request succeeds:
If your workload requires you to create the VM in a specific zone, then specify a duration between 90 seconds and 2 hours. Longer durations give you higher chances of obtaining resources.
If the VM can run in any zone within the region, then specify a duration of 0 seconds or clear the Set a wait time for VM creation checkbox. This action specifies that Compute Engine only allocates resources if they are immediately available. If the VM creation request fails because resources are unavailable, then retry the request in a different zone.
In the On VM termination field, select whether to stop or delete the Flex-start VM at the end of its run duration:
To delete the VM, select Delete.
To stop the VM, select Stop.
To create the Flex-start VM, click Create.
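The numeric limits in the console steps above can be checked before you fill in the form. The following sketch is illustrative only: the function names are hypothetical, and clamping an out-of-range wait time to the 90-second to 2-hour window is a choice made for this example rather than documented console behavior.

```python
# Sketch: sanity-check the values entered in the Advanced pane.
MIN_RUN_HOURS = 0.01  # 36 seconds
MAX_RUN_HOURS = 168   # seven days

MIN_ZONAL_WAIT_SECONDS = 90
MAX_ZONAL_WAIT_SECONDS = 2 * 60 * 60  # two hours


def run_hours_to_seconds(hours):
    """Validate a run duration in hours and return it in whole seconds."""
    if not MIN_RUN_HOURS <= hours <= MAX_RUN_HOURS:
        raise ValueError("Run duration must be between 0.01 and 168 hours")
    return round(hours * 3600)


def choose_wait_seconds(requires_specific_zone,
                        preferred_wait_seconds=MAX_ZONAL_WAIT_SECONDS):
    """Pick a wait time that follows the zonal guidance above.

    Zone-flexible workloads use 0 seconds (allocate only if resources are
    immediately available). Zone-pinned workloads get the preferred wait,
    clamped to the 90-second to 2-hour window (an assumption of this sketch).
    """
    if not requires_specific_zone:
        return 0
    return max(MIN_ZONAL_WAIT_SECONDS,
               min(preferred_wait_seconds, MAX_ZONAL_WAIT_SECONDS))


if __name__ == "__main__":
    print(run_hours_to_seconds(0.01))
    print(choose_wait_seconds(True, 600))
```

Defaulting to the longest wait for zone-pinned workloads reflects the note that longer durations give higher chances of obtaining resources.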
gcloud
To create a Flex-start VM, use the gcloud compute instances create command with the following flags:
- The --request-valid-for-duration flag
- The --provisioning-model=FLEX_START flag
- The --instance-termination-action flag
- The --max-run-duration flag
- The --maintenance-policy=TERMINATE flag
- The --reservation-affinity=none flag
To create a Flex-start VM, run the following command:
gcloud compute instances create VM_NAME \
    --machine-type=MACHINE_TYPE \
    --zone=ZONE \
    --request-valid-for-duration=VALID_FOR_DURATION \
    --provisioning-model=FLEX_START \
    --instance-termination-action=TERMINATION_ACTION \
    --max-run-duration=RUN_DURATION \
    --maintenance-policy=TERMINATE \
    --reservation-affinity=none
Replace the following:
- VM_NAME: the name of your new VM.
- MACHINE_TYPE: the machine type to use for the Flex-start VM. If you specify a G2 or N1 machine type, then consider the following:
  - For G2 machine types, you can optionally specify NVIDIA RTX Virtual Workstations (vWS) to use for graphics-intensive workloads. To do so, include the --accelerator flag in the command as follows:
    --accelerator=count=VWS_ACCELERATOR_COUNT,type=nvidia-l4-vws
    Replace VWS_ACCELERATOR_COUNT with the number of NVIDIA RTX vWS that your workload requires.
  - For N1 machine types, you must specify the number and type of GPUs to attach to your VM. Otherwise, creating the VM fails. To attach GPUs to an N1 VM, include the --accelerator flag in the command as follows:
    --accelerator=count=NUMBER_OF_ACCELERATORS,type=ACCELERATOR_TYPE
    Replace the following:
    - NUMBER_OF_ACCELERATORS: the number of GPUs to attach to your N1 VM.
    - ACCELERATOR_TYPE: a supported GPU model for N1 VMs.
- ZONE: the zone where you want to create the VM. To verify that your specified machine type is available in the zone where you want to create the VM, see Available regions and zones.
- VALID_FOR_DURATION: the maximum time to wait for provisioning your requested resources. You must format the value as the number of days, hours, minutes, or seconds followed by d, h, m, or s, respectively. For example, specify 30m for 30 minutes or 1h2m3s for one hour, two minutes, and three seconds. Based on the zonal requirements for your workload, specify one of the following durations to help increase the chances that your VM creation request succeeds:
  - If your workload requires you to create the VM in a specific zone, then specify a duration between 90 seconds (90s) and two hours (2h). Longer durations give you higher chances of obtaining resources.
  - If the VM can run in any zone within the region, then specify a duration of zero seconds (0s). This value specifies that Compute Engine only allocates resources if they are immediately available. If the creation request fails because resources are unavailable, then retry the request in a different zone.
- TERMINATION_ACTION: whether to stop or delete the VM at the end of its run duration. Specify one of the following values:
  - To stop the VM: STOP
  - To delete the VM: DELETE
- RUN_DURATION: the maximum time that the VM runs before Compute Engine automatically stops or deletes it. You must format the value as the number of days, hours, minutes, or seconds followed by d, h, m, or s, respectively. The value must be between 10 minutes and seven days.
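If you generate the command from a script, it can help to validate the duration values first. The sketch below is an assumption-laden illustration: duration_to_seconds is a simplified stand-in for the d/h/m/s format described above (not the gcloud CLI's own parser), the function names are hypothetical, and the example values passed in the usage section are placeholders.

```python
import re
import shlex

# Seconds per unit suffix, following the d/h/m/s format described above.
_UNIT_SECONDS = {"d": 86400, "h": 3600, "m": 60, "s": 1}


def duration_to_seconds(text):
    """Convert a value such as '30m' or '1h2m3s' to a number of seconds."""
    if not re.fullmatch(r"(\d+[dhms])+", text):
        raise ValueError(f"Unrecognized duration: {text!r}")
    parts = re.findall(r"(\d+)([dhms])", text)
    return sum(int(amount) * _UNIT_SECONDS[unit] for amount, unit in parts)


def build_create_command(vm_name, machine_type, zone, valid_for,
                         termination_action, run_duration):
    """Return the argument list for the gcloud command shown above."""
    if termination_action not in ("STOP", "DELETE"):
        raise ValueError("termination_action must be STOP or DELETE")
    # RUN_DURATION must be between 10 minutes and seven days.
    if not 600 <= duration_to_seconds(run_duration) <= 7 * 86400:
        raise ValueError("run_duration must be between 10m and 7d")
    duration_to_seconds(valid_for)  # Raises ValueError if malformed.
    return [
        "gcloud", "compute", "instances", "create", vm_name,
        f"--machine-type={machine_type}",
        f"--zone={zone}",
        f"--request-valid-for-duration={valid_for}",
        "--provisioning-model=FLEX_START",
        f"--instance-termination-action={termination_action}",
        f"--max-run-duration={run_duration}",
        "--maintenance-policy=TERMINATE",
        "--reservation-affinity=none",
    ]


if __name__ == "__main__":
    # Placeholder values; substitute your own VM name, machine type, and zone.
    argv = build_create_command(
        "example-vm", "MACHINE_TYPE", "ZONE", "2h", "DELETE", "4h")
    print(shlex.join(argv))
```

Returning an argument list rather than a single string lets you pass the result to subprocess.run without shell quoting concerns; shlex.join is used only to print a copyable command line.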
REST
To create a Flex-start VM, make a POST request to the instances.insert method. In the request body, include the following fields:
- The params.requestValidForDuration field.
- The scheduling.provisioningModel field set to FLEX_START.
- The scheduling.instanceTerminationAction field.
- The scheduling.maxRunDuration field.
- The scheduling.onHostMaintenance field set to TERMINATE.
- The reservationAffinity.consumeReservationType field set to NO_RESERVATION.
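Before sending a request, you can check a candidate body against the field list above. This is a minimal sketch with hypothetical helper names; it only verifies that the listed fields are present or set to the listed values, and it doesn't validate anything else that the API requires.

```python
# Sketch: report which Flex-start fields are missing or wrong in a body dict.
REQUIRED_PRESENT = [
    ("params", "requestValidForDuration"),
    ("scheduling", "instanceTerminationAction"),
    ("scheduling", "maxRunDuration"),
]

REQUIRED_VALUES = {
    ("scheduling", "provisioningModel"): "FLEX_START",
    ("scheduling", "onHostMaintenance"): "TERMINATE",
    ("reservationAffinity", "consumeReservationType"): "NO_RESERVATION",
}


def _lookup(body, path):
    """Follow a path of keys through nested dicts; return None if absent."""
    node = body
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def flex_start_field_problems(body):
    """Return a list of human-readable problems; empty means all fields pass."""
    problems = []
    for path in REQUIRED_PRESENT:
        if _lookup(body, path) is None:
            problems.append(".".join(path) + " is missing")
    for path, expected in REQUIRED_VALUES.items():
        if _lookup(body, path) != expected:
            problems.append(".".join(path) + f" must be {expected}")
    return problems


if __name__ == "__main__":
    print(flex_start_field_problems({"scheduling": {}}))
```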
To create a Flex-start VM, make a POST request as follows:
POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instances
{
  "name": "VM_NAME",
  "machineType": "zones/ZONE/machineTypes/MACHINE_TYPE",
  "disks": [
    {
      "initializeParams": {
        "sourceImage": "projects/IMAGE_PROJECT/global/images/IMAGE"
      },
      "boot": true
    }
  ],
  "networkInterfaces": [
    {
      "network": "global/networks/default"
    }
  ],
  "params": {
    "requestValidForDuration": {
      "seconds": VALID_FOR_DURATION
    }
  },
  "scheduling": {
    "provisioningModel": "FLEX_START",
    "instanceTerminationAction": "TERMINATION_ACTION",
    "maxRunDuration": {
      "seconds": RUN_DURATION
    },
    "onHostMaintenance": "TERMINATE"
  },
  "reservationAffinity": {
    "consumeReservationType": "NO_RESERVATION"
  }
}
Replace the following:
- PROJECT_ID: the ID of the project in which to create the VM.
- ZONE: the zone where you want to create the VM. To verify that a machine type is available in the zone where you want to create the VM, see Available regions and zones.
- VM_NAME: the name of your new VM.
- MACHINE_TYPE: the machine type to use for the Flex-start VM. If you specify a G2 or N1 machine type, then consider the following:
  - For G2 machine types, you can optionally specify NVIDIA RTX Virtual Workstations (vWS) to use for graphics-intensive workloads. To do so, include the guestAccelerators field in the request body as follows:
    "guestAccelerators": [
      {
        "acceleratorCount": VWS_ACCELERATOR_COUNT,
        "acceleratorType": "projects/PROJECT_ID/zones/ZONE/acceleratorTypes/nvidia-l4-vws"
      }
    ]
    Replace VWS_ACCELERATOR_COUNT with the number of NVIDIA RTX vWS that your workload requires.
  - For N1 machine types, you must specify the number and type of GPUs to attach to your VM. Otherwise, creating the VM fails. To attach GPUs to an N1 VM, include the guestAccelerators field in the request body as follows:
    "guestAccelerators": [
      {
        "acceleratorCount": ACCELERATOR_COUNT,
        "acceleratorType": "projects/PROJECT_ID/zones/ZONE/acceleratorTypes/ACCELERATOR_TYPE"
      }
    ]
    Replace the following:
    - ACCELERATOR_COUNT: the number of GPUs to attach to your N1 VM.
    - ACCELERATOR_TYPE: a supported GPU model for N1 VMs.
- IMAGE_PROJECT: the image project that contains the image, for example, debian-cloud. For more information about the supported image projects, see Public images.
- IMAGE: specify one of the following:
  - A specific version of the OS image, for example, debian-12-bookworm-v20240617.
  - An image family, which must be formatted as family/IMAGE_FAMILY. This value specifies to use the most recent, non-deprecated OS image. For example, if you specify family/debian-12, the latest version in the Debian 12 image family is used. For more information about using image families, see Image families best practices.
- VALID_FOR_DURATION: the maximum time in seconds to wait for the VM to be provisioned. Based on the zonal requirements for your workload, specify one of the following durations to help increase the chances that your VM creation request succeeds:
  - If your workload requires you to create the VM in a specific zone, then specify a duration between 90 seconds (90) and two hours (7200). Longer durations give you higher chances of obtaining resources.
  - If the VM can run in any zone within the region, then specify a duration of zero seconds (0). This value specifies that Compute Engine only allocates resources if they are immediately available. If the creation request fails because resources aren't available, then retry the request in a different zone.
- TERMINATION_ACTION: whether to stop or delete the VM at the end of its run duration. Specify one of the following values:
  - To stop the VM: STOP
  - To delete the VM: DELETE
- RUN_DURATION: the maximum time, in seconds, that the VM runs before Compute Engine automatically stops or deletes it. The value must be between 600 (600 seconds, or 10 minutes) and 604800 (604,800 seconds, or seven days).
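The zone-retry guidance above can be sketched as a small loop. The following example makes several assumptions: build_request and create_in_first_available_zone are hypothetical helper names, and send is a caller-supplied function (not part of any Google library) that performs the authenticated POST and returns True on success. A real implementation would inspect the API response to distinguish resource unavailability from other errors.

```python
# Sketch: build the instances.insert request and retry across zones.
def build_request(project_id, zone, vm_name, machine_type, image_project,
                  image, valid_for_seconds, termination_action, run_seconds):
    """Return (url, body) mirroring the REST sample above."""
    url = ("https://compute.googleapis.com/compute/v1/"
           f"projects/{project_id}/zones/{zone}/instances")
    body = {
        "name": vm_name,
        "machineType": f"zones/{zone}/machineTypes/{machine_type}",
        "disks": [{
            "initializeParams": {
                "sourceImage":
                    f"projects/{image_project}/global/images/{image}",
            },
            "boot": True,
        }],
        "networkInterfaces": [{"network": "global/networks/default"}],
        "params": {"requestValidForDuration": {"seconds": valid_for_seconds}},
        "scheduling": {
            "provisioningModel": "FLEX_START",
            "instanceTerminationAction": termination_action,
            "maxRunDuration": {"seconds": run_seconds},
            "onHostMaintenance": "TERMINATE",
        },
        "reservationAffinity": {"consumeReservationType": "NO_RESERVATION"},
    }
    return url, body


def create_in_first_available_zone(zones, send, **vm_settings):
    """Try each zone in order; return the first zone where send() succeeds.

    `send(url, body)` is a hypothetical callable supplied by the caller.
    Returns None if no zone succeeds.
    """
    for zone in zones:
        url, body = build_request(zone=zone, **vm_settings)
        if send(url, body):
            return zone
    return None
```

For a zone-flexible workload, pass valid_for_seconds=0 so that each attempt fails fast when resources aren't immediately available, and the loop can move on to the next zone.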
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2025-12-15 UTC.