About future reservation requests in calendar mode
Preview
This product or feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of the Service Specific Terms. Pre-GA products and features are available "as is" and might have limited support. For more information, see the launch stage descriptions.
This document gives an overview of future reservation requests in calendar mode.
Use future reservation requests in calendar mode to obtain high-demand resources, such as for creating virtual machine (VM) instances that have GPUs or TPUs attached. When Google Cloud approves a reservation request, Compute Engine provisions your reserved resources at your specified date and time, and for a duration of up to 90 days. You can then use the reserved resources for creating GPU VMs, H4D VMs, or TPU VMs to run the following workloads:
Model pre-training jobs
Model fine-tuning jobs
High performance computing (HPC) simulation workloads
Short-term expected increases in inference workloads
For more information about other ways to reserve resources in Compute Engine, see Choose a reservation type.
Create a request in calendar mode
The following sections explain how to view resource availability, as well as what details to specify when you create a future reservation request in calendar mode.
View resources future availability
Before you create a future reservation request in calendar mode, you can view the future availability in a region of the following resources:
For GPU or H4D VMs, up to 60 days in advance
For TPUs, up to 120 days in advance
Compute Engine uses the Dynamic Workload Scheduler (DWS) to view when your requested resources are available. When you create a request, specify the number, type, and reservation period for the resources that you confirmed as available. Google Cloud is more likely to approve your request if you supply this information.
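The look-ahead windows above can be sketched as a small date check. This is only an illustration: the function name, the resource-kind labels, and the choice to treat the last day of the window as still viewable are assumptions, not part of any Google Cloud API.

```python
from datetime import date, timedelta

# Look-ahead windows from the text above, in days.
# The keys are hypothetical labels chosen for this sketch.
AVAILABILITY_WINDOW_DAYS = {"GPU": 60, "H4D": 60, "TPU": 120}


def can_view_availability(resource_kind: str, target: date, today: date) -> bool:
    """Return True if `target` falls inside the viewable window.

    Assumption: the window is inclusive of both `today` and the last day.
    """
    window = timedelta(days=AVAILABILITY_WINDOW_DAYS[resource_kind])
    return today <= target <= today + window
```

For example, a target date 90 days out would fall inside the TPU window but outside the GPU and H4D window.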
Define request properties
When you create a future reservation request in calendar mode, you must specify the following properties:
Auto-delete. This property determines if Compute Engine deletes the automatically created (auto-created) reservation for your request at the end time, even if the reservation isn't fully consumed. To create a request in calendar mode, you must enable the auto-delete option.
Consumption type. This property defines how VMs consume the auto-created reservation. When you create a request in calendar mode, you must specify that you want to create specifically-targeted reservations. This setting means that only VMs that target the reservation can consume it.
Deployment type. This property defines the collocation of your reserved resources. Based on the type of resources that you reserve, resources are reserved as follows:
For GPU or H4D VMs, you must specify to densely reserve resources to minimize network latency.
For TPUs, resources are reserved as close as possible on a best-effort basis.
Name. The name of your request, which must be unique within your project.
Number of resources. The number of GPU or H4D VMs or TPUs to reserve at your requested start time.
Planning status. This property defines if you immediately submit your request to Google Cloud for review, or save it as a draft and submit it later. When you create a request in calendar mode, you must specify to immediately submit the request for review.
Reservation mode. This property defines the method to reserve resources, which you must set to CALENDAR for a request in calendar mode.
Reservation name. The name for the reservation that Compute Engine automatically creates if Google Cloud approves your request.
Share type. This property defines if other projects in your organization can consume the auto-created reservation for your approved request. You can specify one of the following options:
Single-project. Only your project can consume the reserved capacity.
Shared. You can share the reserved capacity with up to 100 other projects in your organization. If you specify this option, then you must specify the projects to share the auto-created reservation with. For more information, see the best practices for shared reservations.
Reservation period. The date and time when Compute Engine provisions your requested capacity, and you can consume it. The reservation period includes the following:
Start time. When you want to start consuming your reserved capacity. Based on the resources that you reserve, the start time must be at least one of the following values from when you create and submit a request:
For GPU and H4D VMs, 87 hours (three days and 15 hours)
For TPUs, six hours
End time. When your requested capacity is no longer reserved for you. At this time, Compute Engine deletes the auto-created reservation, and stops or deletes any VMs that consume the reservation based on the termination action that you specified for the VMs.
Resource properties. The hardware requirements of the GPU VMs, H4D VMs, or TPUs that you want to reserve. VMs can only use a reservation if their properties match the reservation's properties. For more information, see the requirements to consume reservations.
Workload type. If you reserve TPU v5e, then you must specify how to reserve capacity based on your workload type:
Batch. For workloads that handle large amounts of data in single or multiple operations, such as machine learning (ML) training workloads.
Serving. For workloads that handle concurrent requests and require minimal network latency, such as ML inference workloads.
Zone. The zone where you want to reserve capacity.
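The timing rules for the reservation period can be checked before you submit a request. The following is a minimal sketch, assuming the minimum lead times listed above and the reservation duration of 1 to 90 days stated elsewhere in this document. The function name, the resource-kind labels, and the message strings are hypothetical and don't correspond to any Compute Engine API.

```python
from datetime import datetime, timedelta, timezone

# Minimum time between submitting a request and its start time (from the text).
MIN_LEAD_TIME = {
    "GPU": timedelta(hours=87),
    "H4D": timedelta(hours=87),
    "TPU": timedelta(hours=6),
}
MIN_DURATION = timedelta(days=1)
MAX_DURATION = timedelta(days=90)


def check_reservation_period(kind, submitted_at, start, end):
    """Return a list of problems; an empty list means the period passes these checks."""
    problems = []
    if start - submitted_at < MIN_LEAD_TIME[kind]:
        problems.append("start time is too soon after submission")
    duration = end - start
    if not (MIN_DURATION <= duration <= MAX_DURATION):
        problems.append("duration must be between 1 and 90 days")
    return problems


# Example: a 7-day GPU reservation that starts exactly 87 hours after submission.
submitted = datetime(2026, 1, 1, tzinfo=timezone.utc)
start = submitted + timedelta(hours=87)
print(check_reservation_period("GPU", submitted, start, start + timedelta(days=7)))
```

A local check like this only catches timing mistakes. It says nothing about whether the requested resources are actually available.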
Request review process
To reserve capacity by using a future reservation request in calendar mode, you must create and submit the request to Google Cloud for review. After you create and submit a request, Google Cloud reviews it within a minute, and then one of the following occurs:
Google Cloud approves your request. Compute Engine reserves your requested resources and, within a minute after approval, automatically creates an empty reservation. At the request start time, Compute Engine provisions your requested capacity by increasing the number of GPU VMs, H4D VMs, or TPUs in the reservation.
Caution: After you create a request, you can't cancel, delete, or modify it. You commit to pay for the requested capacity at the request start time, regardless of whether you use the capacity.
You encounter an error. The request fails because the request's zone lacks sufficient resources. We recommend that you view future resource availability again, and then create and submit a new request for review.
Request lifecycle
The following diagram shows the different states that Compute Engine can set a future reservation request in calendar mode to:

The states and flow of events shown in the preceding diagram are as follows:
PENDING_APPROVAL: You created and submitted a request for review. Within a minute, Google Cloud approves the request.
APPROVED: Google Cloud approved your request. Then, within a minute, Compute Engine automatically creates an empty reservation and changes the request state to PROCURING.
PROCURING: Compute Engine schedules the provisioning of your reserved resources. Before the request start time, the request state changes to PROVISIONING.
PROVISIONING: Compute Engine is provisioning your reserved resources by increasing the number of reserved GPU VMs, H4D VMs, or TPUs in the auto-created reservation. At the request start time, the request state changes to FULFILLED.
FULFILLED: Compute Engine has provisioned your reserved resources, and you're charged for them. You can consume the auto-created reservation by creating VMs until the request end time.
At the request end time, Compute Engine deletes the request and the auto-created reservation. It also stops or deletes any VMs that consume the reservation based on the termination action that you specified for the VMs.
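The state flow described above can be modeled as a simple ordered sequence. The following sketch uses the state names from this section and covers only the approval path. The class and function names are hypothetical, and failure cases (such as a request that fails review) aren't modeled.

```python
from enum import Enum


class RequestState(Enum):
    # Declared in the order the text describes; Enum iteration keeps this order.
    PENDING_APPROVAL = "PENDING_APPROVAL"
    APPROVED = "APPROVED"
    PROCURING = "PROCURING"
    PROVISIONING = "PROVISIONING"
    FULFILLED = "FULFILLED"


# Approval path only; failure states are intentionally left out of this sketch.
HAPPY_PATH = list(RequestState)


def next_state(state):
    """Return the following state on the approval path, or None after FULFILLED."""
    index = HAPPY_PATH.index(state)
    return HAPPY_PATH[index + 1] if index + 1 < len(HAPPY_PATH) else None


def is_consumable(state):
    """Per the text, VMs can consume the reservation only once the request is FULFILLED."""
    return state is RequestState.FULFILLED
```

A helper like `is_consumable` could guard VM creation in a script that polls the request state, so that VMs aren't created before capacity is provisioned.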
Consume provisioned capacity
After Google Cloud approves a future reservation request in calendar mode, Compute Engine automatically creates a reservation with the following characteristics:
The auto-created reservation has no reserved GPU VMs, H4D VMs, or TPUs; you can't consume it yet.
The auto-created reservation inherits the VM or TPU properties specified in your request.
At the request start time, Compute Engine provisions your requested capacity by increasing the number of GPU VMs, H4D VMs, or TPUs in the auto-created reservation. You can then consume the reservation by creating GPU VMs, H4D VMs, or TPU VMs that meet all of the following conditions:
The VMs and the reservation have matching properties.
The VMs use the reservation-bound provisioning model.
The VMs must be stopped or deleted at the reservation end time.
You can create VMs until the reservation is fully consumed or until the request end time. At the request end time, Compute Engine deletes the auto-created reservation, and stops or deletes any VMs that consume the reservation.
Quota
Future reservation requests in calendar mode must use the reservation-bound provisioning model. This model doesn't require Compute Engine quota to reserve resources. However, before you create a request, verify that you have sufficient quota for any resources that aren't part of a reservation when you create VMs, such as disks or IP addresses.
Pricing
When you create a future reservation request in calendar mode, you aren't charged. Instead, you incur charges when the following occurs:
Compute Engine provisions your requested capacity. When a request reaches the FULFILLED state, you're charged for the provisioned resources according to DWS pricing. This pricing model offers vCPUs, memory, GPUs, and TPUs at a discounted price compared to standard pricing.
You use resources not covered by the reservation. When you create VMs that consume an auto-created reservation, you aren't charged again for the consumed resources. You're only charged for resources that aren't part of the reservation, such as disks or IP addresses.
You stop incurring charges for the reserved resources at the request end time. At this time, Compute Engine deletes the auto-created reservation, and stops or deletes any VMs that consume the reservation.
Important: If you specified to stop a VM at the end of the reservation period, then, after Compute Engine stops the VM, you keep incurring charges for any resources that are attached to it. To avoid unnecessary charges, detach and delete any resources that you don't need anymore.
Limitations
The following sections explain the limitations for future reservation requests in calendar mode.
Limitations for all requests
All future reservation requests in calendar mode have the following limitations:
You can reserve resources for a period between 1 and 90 days.
After you create and submit a request, you can't cancel, delete, or modify your request.
Limitations for requests for VMs
You can only reserve GPU VMs or H4D VMs as follows:
You can reserve between 1 and 80 GPU VMs per request.
You can reserve up to 256 H4D VMs per request.
You can reserve the following machine series:
You can reserve GPU VMs only in specific zones. For H4D regional availability, see Available regions and zones and use the Machine series filter to view only the zones where you can reserve H4D instances.
You can't create requests for GPU VMs by using an instance template.
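The per-request VM count limits above can be sketched as a range check. The names are hypothetical, and the lower bound of 1 for H4D VMs is an assumption, because the text states only the upper bound of 256.

```python
# (minimum, maximum) VMs per request. The GPU range is from the text;
# the H4D minimum of 1 is an assumption made for this sketch.
VM_COUNT_LIMITS = {"GPU": (1, 80), "H4D": (1, 256)}


def vm_count_allowed(kind: str, count: int) -> bool:
    """Return True if `count` is within the per-request limit for the VM kind."""
    low, high = VM_COUNT_LIMITS[kind]
    return low <= count <= high
```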
Limitations for requests for TPUs
You can only reserve TPUs as follows:
You can reserve 1, 4, 8, 16, 32, 64, 128, 256, 512, or 1,024 TPU chips per request.
You can reserve the following TPU versions:
You can only reserve 1, 4, or 8 TPU v5e chips for serving (SERVING) workload types.
You can reserve TPUs only in the following zones:
TPU7x:
us-central1-c
TPU v6e:
asia-northeast1-b
europe-west4-a
us-east5-a
us-east5-b
TPU v5p:
us-east5-a
TPU v5e:
For batch (BATCH) workload types:
europe-west4-b
us-west4-b
For serving (SERVING) workload types:
us-south1-a
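The TPU limits above (allowed chip counts, the tighter limit for TPU v5e serving workloads, and the per-version zones) can be combined into one local check. This is a sketch only: the version labels used as dictionary keys, the function name, and the message strings are hypothetical, and the zone sets are copied from the list above, so they go stale if that list changes.

```python
ALLOWED_CHIP_COUNTS = {1, 4, 8, 16, 32, 64, 128, 256, 512, 1024}
SERVING_V5E_CHIP_COUNTS = {1, 4, 8}

# Zones keyed by (version label, workload type). The workload type is None
# for versions where the text doesn't distinguish between workload types.
TPU_ZONES = {
    ("TPU7x", None): {"us-central1-c"},
    ("v6e", None): {"asia-northeast1-b", "europe-west4-a", "us-east5-a", "us-east5-b"},
    ("v5p", None): {"us-east5-a"},
    ("v5e", "BATCH"): {"europe-west4-b", "us-west4-b"},
    ("v5e", "SERVING"): {"us-south1-a"},
}


def check_tpu_request(version, chips, zone, workload=None):
    """Return a list of problems; an empty list means the request passes these checks."""
    problems = []
    if chips not in ALLOWED_CHIP_COUNTS:
        problems.append("unsupported chip count")
    elif version == "v5e" and workload == "SERVING" and chips not in SERVING_V5E_CHIP_COUNTS:
        problems.append("v5e serving requests allow only 1, 4, or 8 chips")
    zones = TPU_ZONES.get((version, workload))
    if zones is None:
        problems.append("unknown version/workload combination")
    elif zone not in zones:
        problems.append("zone not listed for this TPU version")
    return problems
```

For example, `check_tpu_request("v5e", 16, "us-south1-a", "SERVING")` reports the serving chip-count limit, because 16 chips are allowed in general but not for TPU v5e serving workloads.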
Limitations for all auto-created reservations
An auto-created reservation for a request has the following limitations:
You can only modify the reservation as follows:
To allow or disallow Vertex AI jobs from consuming it.
After the reservation start time.
You can't apply committed use discounts (CUDs) or sustained use discounts (SUDs) to the reservation.
You can't delete the reservation; Compute Engine deletes it at the end time for the reservation.
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2025-12-09 UTC.