Overview of HPC clusters with enhanced cluster management capabilities

To create the infrastructure for tightly-coupled applications that scale across multiple nodes, you can create a cluster of virtual machine (VM) instances. This guide provides a high-level overview of the key considerations and steps to configure such a cluster for high performance computing (HPC) workloads using dense resource allocation.

With H4D, Compute Engine adds support for running massive HPC workloads by treating an entire cluster of VM instances as a single computer. Using topology-aware placement of VMs lets you access many instances within a single networking superblock and minimizes network latency. You can also configure Cloud RDMA on these instances to maximize inter-node communication performance, which is crucial for tightly-coupled HPC workloads.

Note: This type of configuration relies on similar features and concepts as those documented in the AI Hypercomputer documentation for accelerator-optimized VMs with GPUs.

You create these HPC VM clusters with H4D by reserving blocks of capacity instead of individual resources. Using blocks of capacity for your cluster enables enhanced cluster management capabilities.

HPC clusters with H4D instances can be created either with or without enhanced cluster management capabilities. If you don't require enhanced cluster management capabilities with your H4D HPC cluster, or if you want to create HPC clusters using a machine series other than H4D, then use the standard instructions for creating HPC instances or clusters instead.

Cluster terminology

When working with blocks of capacity, the following terms are used:

Blocks
Multiple sub-blocks interconnect with a non-blocking fabric, providing a high-bandwidth interconnect. Any CPU within the block is reachable in a maximum of two network hops. The system exposes block and sub-block metadata to orchestrators to enable optimal job placement.
Clusters
Multiple blocks interconnect to form a cluster that scales to thousands of CPUs for running large-scale HPC workloads. Each cluster is globally unique. Communication across different blocks adds only one additional hop, maintaining high performance and predictability, even at a massive scale. Cluster-level metadata is also available to orchestrators for intelligent, large-scale job placement.
Cluster Toolkit
An open source tool offered by Google that simplifies the configuration and deployment for clusters that use either Slurm or Google Kubernetes Engine. You use predefined blueprints to build a deployment folder that is based on the blueprint. You can modify blueprints or the deployment folder to customize deployments and your software stack. You then use Terraform or Packer to run the commands generated by Cluster Toolkit to deploy the cluster.
Dense deployment
A resource request that allocates your compute instance resources physically close to each other to minimize network hops and optimize for the lowest latency.
Network fabric
A network fabric provides high-bandwidth, low-latency connectivity across all blocks and Google Cloud services in a cluster. Jupiter is Google's data center network architecture that leverages software-defined networking and optical circuit switches to evolve the network and optimize its performance.
Node or host
A single physical server machine in the data center. Each host has associated compute resources: CPUs, memory, and network interfaces. The number and configuration of these compute resources depend on the machine family. VM instances are provisioned on top of a physical host.
Orchestrator
An orchestrator automates the management of your clusters. With an orchestrator, you don't have to manage each VM instance in the cluster. An orchestrator, such as Slurm or Google Kubernetes Engine (GKE), handles tasks like job queueing, resource allocation, autoscaling (with GKE), and other day-to-day cluster management tasks.
Sub-blocks
These are foundational units where a group of hosts physically co-locates on a single rack. A Top-of-Rack (ToR) switch connects these hosts, enabling extremely efficient, single-hop communication between any two CPUs within the sub-block. Cloud RDMA facilitates this direct communication.
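Orchestrators and job schedulers can use this block and sub-block information to place tightly-coupled ranks close together. As a minimal, hedged sketch (the zone and instance names are placeholders, and this assumes your instances were created with dense or compact placement so that the `resourceStatus.physicalHost` field is populated), you can inspect where each instance landed with the gcloud CLI:

```
# List the physical topology identifier for each instance in a zone.
# The physicalHost value encodes the placement path (for example,
# segments for block, sub-block, and host); exact formatting can vary.
gcloud compute instances list \
    --zones=us-central1-a \
    --format="table(name, resourceStatus.physicalHost)"

# Inspect a single instance.
gcloud compute instances describe my-h4d-vm \
    --zone=us-central1-a \
    --format="value(resourceStatus.physicalHost)"
```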

Overview of cluster creation process with H4D VMs

To create HPC clusters on reserved blocks of capacity, you must complete the following steps:

  1. Review available provisioning models
  2. Choose a consumption option and obtain capacity
  3. Choose a deployment option and orchestrator
  4. Choose the operating system or cluster image
  5. Create your cluster

Provisioning models for VM and cluster creation

When creating VM instances, you can use the provisioning models described in Compute Engine instances provisioning models.

To create tightly-coupled H4D instances, you must use one of the following provisioning models to obtain the necessary resources for creating compute instances:

  • Reservation-bound: you can reserve resources at a discounted price for a future date and duration. At the start of your reservation period, you can use the reserved resources to create VMs or clusters. You have exclusive access to your reserved resources for the reservation period.

  • Flex-start: you can request discounted resources for up to seven days. Compute Engine makes best-effort attempts to schedule the provisioning of your requested resources as soon as they're available. You have exclusive access to your obtained resources for your requested period.

  • Spot: based on availability, you can immediately obtain deeply discounted resources. However, Compute Engine might stop or delete the VM instances at any time to reclaim capacity.

Reservation-bound provisioning model

The reservation-bound provisioning model links your created VM instances to the capacity that you previously reserved. When you reserve capacity, Compute Engine creates an empty reservation. Then, at the reservation start time, the following occurs:

  • Compute Engine adds your reserved resources to the reservation. You have exclusive access to the reserved capacity until the reservation end time.

  • Google Cloud charges you for the reserved capacity until the end of your reservation period, whether you use the capacity or not.

You can then use the reserved resources to create VMs without additional charges. You only pay for resources that aren't included in the reservation, such as disks or IP addresses.

You can reserve resources for as many VMs as you like, for as long as you like, for a future date. Then, you can use the reserved resources to create and run VMs until the end of the reservation period. If you reserve resources for one year or longer, then you must purchase and attach a resource-based commitment.

To provision resources using the reservation-bound provisioning model with H4D instances, specify the reservation-bound provisioning model when creating individual VMs, an HPC cluster, or a group of VMs.
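As an illustrative, hedged example (the instance name, zone, machine type, and reservation name are placeholders, and the exact set of required flags can differ by reservation type), creating a single VM that consumes a specific reservation might look like this with the gcloud CLI:

```
# Create a VM that targets a specific reservation. RESERVATION_NAME must
# refer to a reservation whose properties (machine type, zone, and so on)
# match the VM you request.
gcloud compute instances create my-h4d-vm \
    --zone=us-central1-a \
    --machine-type=MACHINE_TYPE \
    --reservation-affinity=specific \
    --reservation=RESERVATION_NAME

# Depending on your reservation, you might also need to set the
# reservation-bound provisioning model explicitly (for example,
# --provisioning-model=RESERVATION_BOUND); verify the exact value in the
# current gcloud reference.
```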

Flex-start provisioning model

To run short-duration workloads that require densely allocated resources, you can request compute resources for up to seven days by using Flex-start. Whenever resources are available, Compute Engine creates your requested number of VMs. You can stop standalone Flex-start VMs, but you can't stop Flex-start VMs that a managed instance group (MIG) creates through resize requests. The Flex-start VMs exist until you delete them, or until Compute Engine deletes the VMs at the end of their run duration.

Flex-start is ideal for workloads that can start at any time. The Flex-start provisioning model provisions resources from a secure capacity pool, so the allocated resources are densely allocated to minimize network latency.

When you add Flex-start VMs to a managed instance group (MIG) by using resize requests, the MIG creates the VMs all at once. This approach helps you avoid unnecessary charges for partial capacity that Compute Engine might deliver while you wait for the full capacity needed to start your workload.

You can use Flex-start provisioning with H4D instances, using any available deployment model.
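The following is a minimal sketch only; the instance name, zone, and machine type are placeholders, and the provisioning-model value and companion flags for Flex-start (shown here as FLEX_START with a maximum run duration) are assumptions to verify against the current gcloud reference:

```
# Request a standalone Flex-start VM with a bounded run duration.
# Flex-start VMs are deleted at the end of their run duration.
gcloud compute instances create my-flex-vm \
    --zone=us-central1-a \
    --machine-type=MACHINE_TYPE \
    --provisioning-model=FLEX_START \
    --max-run-duration=7d \
    --instance-termination-action=DELETE
```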

Spot provisioning model

To run fault-tolerant workloads, you can obtain compute resources immediately based on availability. You get resources at the lowest price possible. However, Compute Engine might stop or delete the created Spot VMs at any time to reclaim capacity. This process is called preemption.

Spot VMs are ideal for workloads where interruptions are acceptable,such as:

  • Batch processing
  • High performance computing (HPC)
  • Data analytics
  • Continuous integration and continuous deployment (CI/CD)
  • Media encoding

You can use Spot VMs with any machine type, except A4X, X4, and bare metal machine types. Dense allocation depends on resource availability. To help ensure a closer allocation, you can apply a compact placement policy to the Spot VMs.

Note: Spot VMs are not covered by any Service Level Agreement and are excluded from the Compute Engine SLA.

You can use Spot VMs with the dense deployment options described in Choose a deployment option.
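As a hedged sketch (the policy name, instance name, region, zone, and machine type are placeholders), you might create a compact placement policy and then request a Spot VM that uses it:

```
# Create a compact placement policy so that VMs are allocated close together.
gcloud compute resource-policies create group-placement my-compact-policy \
    --region=us-central1 \
    --collocation=collocated

# Create a Spot VM that uses the placement policy. Compute Engine can stop
# or delete Spot VMs at any time to reclaim capacity.
gcloud compute instances create my-spot-vm \
    --zone=us-central1-a \
    --machine-type=MACHINE_TYPE \
    --provisioning-model=SPOT \
    --instance-termination-action=DELETE \
    --resource-policies=my-compact-policy
```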

Choose a consumption option and obtain capacity

Consumption options determine how resources are obtained for your cluster. To create a cluster that uses enhanced cluster management capabilities, you must request blocks of capacity for a dense deployment.

The following table summarizes the key differences between the consumption options for blocks of capacity:

Note: You can also request a future reservation for more than 90 days. If you need to reserve this capacity, see Reserve capacity through your account team.
| Consumption option | Future reservations for capacity blocks | Future reservations for up to 90 days (in calendar mode) | Flex-start | Spot |
|---|---|---|---|---|
| Workload characteristics | Long-running, large-scale distributed workloads that require densely allocated resources | Short-duration workloads that require densely allocated resources | Short-duration workloads that require densely allocated resources | Fault-tolerant workloads |
| Lifespan | Any time | Up to 90 days | Up to 7 days | Any time, but subject to preemption |
| Preemptible | No | No | No | Yes |
| Capacity assurance | Very high | Very high | Best effort | Best effort |
| Quota | Check that you have enough quota before creating instances. | No quota is charged. | Preemptible quota is charged. | Preemptible quota is charged. |
| Pricing | | | | |
| Resource allocation | Dense | Dense | Dense | Standard (compact placement policy optional) |
| Provisioning model | Reservation-bound | Reservation-bound | Flex-start | Spot |

Creation method

  • Future reservations for capacity blocks: To create HPC clusters and VMs, do the following:

    1. Reserve capacity through your account team.
    2. At your chosen date and time, use the reserved capacity to create HPC clusters. See Choose a deployment option.

  • Future reservations for up to 90 days (in calendar mode): To create HPC clusters and VMs, do the following:

    1. Create a future reservation request in calendar mode.
    2. At your chosen date and time, use the reserved capacity to create HPC clusters. See Choose a deployment option.

  • Flex-start: To create VMs, use any of the supported creation methods. When your requested capacity becomes available, Compute Engine provisions it.

  • Spot: You can immediately create VMs. See Choose a deployment option.
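As an illustrative, hedged sketch of requesting reserved capacity for a future window (the reservation name, zone, machine type, count, and timestamps are placeholders, and the exact flags for calendar-mode requests are assumptions to verify against the current gcloud reference):

```
# Request capacity for a future time window. At the reservation start time,
# Compute Engine delivers the capacity and you can create VMs or clusters
# against it until the end time.
gcloud compute future-reservations create my-future-reservation \
    --zone=us-central1-a \
    --machine-type=MACHINE_TYPE \
    --total-count=16 \
    --start-time=2026-03-01T00:00:00Z \
    --end-time=2026-03-08T00:00:00Z
```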

Choose a deployment option

High performance computing (HPC) workloads aggregate computing resources to gain performance greater than that of a single workstation, server, or computer. HPC is used to solve problems in academic research, science, design, simulation, and business intelligence.

For HPC clusters with enhanced cluster management capabilities, choose the H4D machine series. If you plan to use a different machine series, follow the documentation at Create an HPC-ready VM instance instead of using the deployment methods listed on this page.

Some of the available deployment options include the installation and configuration of an orchestrator for enhanced management of the HPC cluster.

To select the most appropriate way to create your VMs or clusters for your use case, choose one of the following options:

| Option | Use case |
|---|---|
| Cluster Toolkit | You want to use open-source software that simplifies the process of deploying both Slurm and Google Kubernetes Engine (GKE) clusters. Cluster Toolkit is designed to be highly customizable and extensible. A hedged deployment sketch follows this table. |
| GKE | You want maximum flexibility in configuring your Google Kubernetes Engine cluster based on the needs of your workload. To learn more, see Run HPC workloads with H4D. |
| Compute Engine | You want full control of the infrastructure layer so that you can set up your own orchestrator. |
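As a minimal, hedged sketch of the Cluster Toolkit flow (the blueprint file, deployment folder name, and project ID are placeholders, and the binary name can differ between releases; older releases use ghpc while newer ones use gcluster), deploying a Slurm cluster from a predefined blueprint might look like this:

```
# Build the Cluster Toolkit binary from the open-source repository.
git clone https://github.com/GoogleCloudPlatform/cluster-toolkit.git
cd cluster-toolkit && make

# Generate a deployment folder (Terraform and Packer files) from a blueprint.
./gcluster create examples/hpc-slurm.yaml --vars project_id=MY_PROJECT_ID

# Deploy the generated deployment folder.
./gcluster deploy hpc-slurm
```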

Choose the operating system image

The operating system (OS) image you choose depends on the service you use to deploy your cluster.

  • For clusters on GKE: Use a GKE node image, such as Container-Optimized OS. If you use Cluster Toolkit to deploy your GKE cluster, a Container-Optimized OS image is used by default. For more information about node images, see Node images in the GKE documentation.

  • For clusters on Compute Engine: You can use one of the following images:

  • For Slurm clusters: Cluster Toolkit deploys the Slurm cluster with an HPC VM image based on Rocky Linux 8 that is optimized for tightly-coupled HPC workloads.
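If you create Compute Engine instances directly rather than through Cluster Toolkit, you can reference the HPC VM image yourself. As a hedged sketch (the instance name, zone, and machine type are placeholders; the image family and project shown are the commonly published ones for the Rocky Linux 8 HPC VM image, but verify them against the current public image list):

```
# Create a VM from the HPC VM image (Rocky Linux 8 based).
gcloud compute instances create my-hpc-vm \
    --zone=us-central1-a \
    --machine-type=MACHINE_TYPE \
    --image-family=hpc-rocky-linux-8 \
    --image-project=cloud-hpc-image-public
```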

Create your HPC cluster

After you review the cluster creation process and make preliminary decisions for your workload, create your cluster by using any of the deployment options.

Enhanced cluster management capabilities for your HPC cluster

When you create H4D instances with densely allocated resources using the deployment methods mentioned in Choose a deployment option, you can use enhanced HPC cluster management capabilities with your instances.

For more information about these capabilities, see Enhanced HPC cluster management with H4D instances.

What's next
