About future reservation requests in calendar mode

This document gives an overview of future reservation requests in calendar mode.

Use future reservation requests in calendar mode to obtain high-demand resources, such as those needed to create virtual machine (VM) instances that have GPUs or TPUs attached. When Google Cloud approves a reservation request, Compute Engine provisions your reserved resources at your specified date and time, and for a duration of up to 90 days. You can then use the reserved resources to create GPU VMs, H4D VMs, or TPU VMs that run the following workloads:

  • Model pre-training jobs

  • Model fine-tuning jobs

  • High performance computing (HPC) simulation workloads

  • Short-term expected increases in inference workloads

For more information about other ways to reserve resources in Compute Engine, see Choose a reservation type.

Create a request in calendar mode

The following sections explain how to view resource availability, as well as what details to specify when you create a future reservation request in calendar mode.

View resource future availability

Before you create a future reservation request in calendar mode, you can view the future availability in a region of the following resources:

  • For GPU or H4D VMs, up to 60 days in advance

  • For TPUs, up to 120 days in advance

Compute Engine uses the Dynamic Workload Scheduler (DWS) to view when your requested resources are available. When you create a request, specify the number, type, and reservation period for the resources that you confirmed as available. Google Cloud is more likely to approve your request if you supply this information.
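The lookahead windows listed above can be checked before you query availability. The following is a minimal sketch, not part of any Google Cloud API: the function name, the resource labels (`gpu`, `h4d`, `tpu`), and the choice to treat the 60-day and 120-day limits as inclusive are all assumptions made for illustration.

```python
from datetime import date, timedelta

# Lookahead windows from the list above. The keys are illustrative labels,
# not identifiers from any Google Cloud API.
AVAILABILITY_LOOKAHEAD_DAYS = {"gpu": 60, "h4d": 60, "tpu": 120}


def can_view_availability(resource_kind: str, target: date, today: date) -> bool:
    """Return True if `target` falls inside the viewable window.

    Assumption: the window is inclusive of both `today` and the last day.
    """
    limit = today + timedelta(days=AVAILABILITY_LOOKAHEAD_DAYS[resource_kind])
    return today <= target <= limit
```

For example, with `today` set to 2026-02-01, a target date of 2026-05-01 is outside the GPU window but inside the TPU window.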

Define request properties

When you create a future reservation request in calendar mode, you must specify the following properties:

Request review process

To reserve capacity by using a future reservation request in calendar mode, you must create and submit the request to Google Cloud for review. After you create and submit a request, Google Cloud reviews it within a minute, and then one of the following occurs:

Request lifecycle

The following diagram shows the different states that Compute Engine can set a future reservation request in calendar mode to:

A flowchart showing the different states a future reservation request in calendar mode can go through.

The states and flow of events shown in the preceding diagram are as follows:

At the request end time, Compute Engine deletes the request and the auto-created reservation. It also stops or deletes any VMs that consume the reservation based on the termination action that you specified for the VMs.

Consume provisioned capacity

After Google Cloud approves a future reservation request in calendar mode, Compute Engine automatically creates a reservation with the following characteristics:

  • The auto-created reservation has no reserved GPU VMs, H4D VMs, or TPUs; you can't consume it yet.

  • The auto-created reservation inherits the VM or TPU properties specified in your request.

At the request start time, Compute Engine provisions your requested capacity by increasing the number of GPU VMs, H4D VMs, or TPUs in the auto-created reservation. You can then consume the reservation by creating GPU VMs, H4D VMs, or TPU VMs that meet all of the following conditions:

You can create VMs until the reservation is fully consumed or until the request end time. At the request end time, Compute Engine deletes the auto-created reservation, and stops or deletes any VMs that consume the reservation.
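The consumption window described in this section can be expressed as a small check. This is only a sketch under stated assumptions: `can_consume` and its parameters are hypothetical names, the end time is treated as exclusive, and the check does not model the VM property conditions that a consuming VM must also meet.

```python
from datetime import datetime


def can_consume(now: datetime, start: datetime, end: datetime,
                reserved_count: int, in_use_count: int) -> bool:
    """Return True if one more VM could consume the auto-created reservation."""
    if now < start:
        # Before the start time, the reservation holds no provisioned capacity.
        return False
    if now >= end:
        # At the end time, Compute Engine deletes the reservation.
        return False
    # Between start and end, capacity remains until it is fully consumed.
    return in_use_count < reserved_count
```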

Quota

Future reservation requests in calendar mode must use the reservation-bound provisioning model. This model doesn't require Compute Engine quota to reserve resources. However, before you create a request, verify that you have sufficient quota for any resources that aren't part of a reservation when you create VMs, such as disks or IP addresses.

Pricing

When you create and submit a future reservation request in calendar mode, and Google Cloud approves your request, you don't immediately incur charges. Instead, you incur charges when the following occurs:

  • Compute Engine provisions your requested capacity. When your request reaches the FULFILLED state at the request's start time, you incur charges for the provisioned resources according to DWS pricing. This pricing model offers vCPUs, memory, GPUs, and TPUs at a discounted price compared to standard pricing.

  • You use resources outside of the reservation. When you create VMs that consume an auto-created reservation, you don't incur additional charges for the consumed resources. You only incur charges for resources that aren't part of the reservation, such as disks or IP addresses.

You stop incurring charges for the reserved resources at the request end time. At this time, Compute Engine deletes the auto-created reservation, and stops or deletes any VMs that consume the reservation based on their termination action.

Important: If a VM's termination action specifies to stop the VM, then, after Compute Engine stops the VM at the end of the reservation period, you keep incurring charges for any resources that are attached to the VM. To avoid unnecessary charges, detach and delete any resources that you don't need anymore.
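To estimate the billable period for reserved capacity, you can count resource-hours between the request start and end times. The sketch below assumes that charges accrue for every reserved resource over the whole period, whether or not VMs consume it, which the text above implies but doesn't state outright. The function name is hypothetical, and no rates are included; multiply the result by the applicable DWS price yourself.

```python
from datetime import datetime


def reserved_resource_hours(start: datetime, end: datetime, resource_count: int) -> float:
    """Return total resource-hours between the request start and end times.

    Assumption: charges begin at the start time and stop at the end time
    for every reserved resource.
    """
    if end <= start:
        raise ValueError("end time must be after start time")
    hours = (end - start).total_seconds() / 3600
    return hours * resource_count
```

This figure covers only the reserved resources. Disks, IP addresses, and anything else outside the reservation are billed separately.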

Limitations

The following sections explain the limitations for future reservation requests in calendar mode.

Limitations for all requests

All future reservation requests in calendar mode have the following limitations:

  • You can reserve resources for a period between 1 and 90 days.

  • After you create and submit a request, you can't cancel, delete, or modify your request.
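Because a submitted request can't be canceled, deleted, or modified, it can help to validate the reservation period locally first. The following is a minimal sketch; the function name is hypothetical, and treating both the 1-day and 90-day bounds as inclusive is an assumption.

```python
from datetime import datetime, timedelta

MIN_DURATION = timedelta(days=1)
MAX_DURATION = timedelta(days=90)


def validate_reservation_period(start: datetime, end: datetime) -> timedelta:
    """Return the reservation duration, or raise ValueError if it is out of range."""
    duration = end - start
    if duration < MIN_DURATION or duration > MAX_DURATION:
        raise ValueError(
            f"reservation period must be between 1 and 90 days, got {duration}"
        )
    return duration
```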

Limitations for requests for VMs

You can only reserve GPU VMs or H4D VMs as follows:

Limitations for requests for TPUs

You can only reserve TPUs as follows:

  • You can reserve 1, 4, 8, 16, 32, 64, 128, 256, 512, or 1,024 TPU chips per request.

  • You can reserve the following TPU versions:

  • You can only reserve 1, 4, or 8 TPU v5e chips for serving (SERVING) workload types.

  • You can reserve TPUs only in the following zones:

    • TPU7x:

      • us-central1-c
    • TPU v6e:

      • asia-northeast1-b

      • europe-west4-a

      • us-east5-a

      • us-east5-b

      • us-south1-ai1b

    • TPU v5p:

      • us-east5-a
    • TPU v5e:

      • For batch (BATCH) workload types:

        • europe-west4-b

        • us-west4-b

      • For serving (SERVING) workload types:

        • us-south1-a
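The TPU limits above can be encoded as a pre-submission check. This sketch copies the chip counts and zones from this section as of the page's last update. The function name and the version labels (`tpu7x`, `v6e`, `v5p`, `v5e`) are illustrative, not API identifiers, and requiring a workload type for TPU v5e is an assumption based on how the zones are grouped.

```python
ALLOWED_CHIP_COUNTS = {1, 4, 8, 16, 32, 64, 128, 256, 512, 1024}
V5E_SERVING_CHIP_COUNTS = {1, 4, 8}

# Zones copied from the list above; version keys are illustrative labels.
ZONES_BY_VERSION = {
    "tpu7x": {"us-central1-c"},
    "v6e": {"asia-northeast1-b", "europe-west4-a", "us-east5-a",
            "us-east5-b", "us-south1-ai1b"},
    "v5p": {"us-east5-a"},
}
V5E_ZONES_BY_WORKLOAD = {
    "BATCH": {"europe-west4-b", "us-west4-b"},
    "SERVING": {"us-south1-a"},
}


def check_tpu_request(version, chips, zone, workload_type=None):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if chips not in ALLOWED_CHIP_COUNTS:
        problems.append(f"unsupported chip count: {chips}")
    if version == "v5e":
        # Assumption: v5e requests always name a workload type.
        if workload_type not in V5E_ZONES_BY_WORKLOAD:
            problems.append("v5e requires workload type BATCH or SERVING")
            return problems
        if workload_type == "SERVING" and chips not in V5E_SERVING_CHIP_COUNTS:
            problems.append("v5e SERVING allows only 1, 4, or 8 chips")
        allowed_zones = V5E_ZONES_BY_WORKLOAD[workload_type]
    elif version in ZONES_BY_VERSION:
        allowed_zones = ZONES_BY_VERSION[version]
    else:
        problems.append(f"unknown TPU version: {version}")
        return problems
    if zone not in allowed_zones:
        problems.append(f"zone {zone} is not listed for {version}")
    return problems
```

Returning a list of problems instead of raising on the first one lets a caller report every issue at once, which matters because a submitted request can't be modified.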

Limitations for all auto-created reservations

An auto-created reservation for a request has the following limitations:

  • You can only modify the reservation as follows:

    • To allow or disallow Vertex AI jobs from consuming it.

    • After the reservation start time.

  • You can't apply committed use discounts (CUDs) or sustained use discounts (SUDs) to the reservation.

  • You can't delete the reservation; Compute Engine deletes it at the end time for the reservation.

What's next

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2026-02-19 UTC.