TPU v6e
This document describes the architecture and supported configurations of Cloud TPU v6e (Trillium). On all technical surfaces, such as the API and logs, and throughout this document, Trillium will be referred to as v6e.
With a 256-chip footprint per Pod, v6e shares many similarities with v5e. This system is optimized for transformer, text-to-image, and convolutional neural network (CNN) training, fine-tuning, and serving.
System architecture
Each v6e chip contains one TensorCore. Each TensorCore has 2 matrix-multiply units (MXUs), a vector unit, and a scalar unit. The following table shows the key specifications and their values for TPU v6e.
| Specification | Values |
|---|---|
| Performance/total cost of ownership (TCO) (expected) | 1 |
| Peak compute per chip (bf16) | 918 TFLOPs |
| Peak compute per chip (Int8) | 1836 TOPs |
| HBM capacity per chip | 32 GB |
| HBM bandwidth per chip | 1600 GBps |
| Bidirectional inter-chip interconnect (ICI) bandwidth (per chip) | 800 GBps |
| ICI ports per chip | 4 |
| DRAM per host | 1536 GiB |
| Chips per host | 8 |
| TPU Pod size | 256 chips |
| Interconnect topology | 2D torus |
| BF16 peak compute per Pod | 234.9 PFLOPs |
| All-reduce bandwidth per Pod | 102.4 TB/s |
| Bisection bandwidth per Pod | 3.2 TB/s |
| Per-host NIC configuration | 4 x 200 Gbps NIC |
| Data center network bandwidth per Pod | 25.6 Tbps |
| Special features | SparseCore |
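To see how these chip-level specifications surface to software, the following minimal sketch lists the TPU devices attached to a VM and runs a bf16 matrix multiply, which executes on the TensorCore's MXUs. It assumes JAX with TPU support is installed on a v6e TPU VM; the matrix sizes are arbitrary, and the number of devices printed depends on the VM type.

```python
# Minimal sketch: inspect the TPU devices on a v6e VM and run a bf16 matmul.
# Assumes JAX with TPU support is installed on a v6e TPU VM.
import jax
import jax.numpy as jnp

# List the TPU chips visible to this VM (up to 8 on a full-host v6e VM).
for d in jax.devices():
    print(d.id, d.device_kind, d.platform)

# bf16 matrix multiply; on v6e this runs on the TensorCore's matrix-multiply units.
a = jnp.ones((4096, 4096), dtype=jnp.bfloat16)
b = jnp.ones((4096, 4096), dtype=jnp.bfloat16)
c = jnp.dot(a, b)
print(c.shape, c.dtype)
```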
Supported configurations
The following table shows the 2D slice shapes that are supported for v6e:
| Topology | TPU chips | Hosts | VMs | Machine type (GKE API) | Scope |
|---|---|---|---|---|---|
| 1x1 | 1 | 1/8 | 1 | ct6e-standard-1t | Sub-host |
| 2x2 | 4 | 1/2 | 1 | ct6e-standard-4t | Sub-host |
| 2x4 | 8 | 1 | 1 | ct6e-standard-8t | Single-host |
| 2x4 | 8 | 1 | 2 | ct6e-standard-4t | Single-host |
| 4x4 | 16 | 2 | 4 | ct6e-standard-4t | Multi-host |
| 4x8 | 32 | 4 | 8 | ct6e-standard-4t | Multi-host |
| 8x8 | 64 | 8 | 16 | ct6e-standard-4t | Multi-host |
| 8x16 | 128 | 16 | 32 | ct6e-standard-4t | Multi-host |
| 16x16 | 256 | 32 | 64 | ct6e-standard-4t | Multi-host |
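The topology, host, VM, and chip counts in the table map directly to what a framework reports on each VM. The following sketch is a minimal illustration assuming JAX running on every VM of a v6e-16 slice (4x4 topology, 4 VMs with 4 chips each); the printed counts change with the slice shape.

```python
# Minimal sketch: run the same script on every VM of a multi-host v6e slice
# (for example v6e-16, 4x4 topology). Assumes JAX with TPU support.
import jax

# Optional on Cloud TPU for basic device discovery, but recommended for
# multi-host workloads; with no arguments JAX discovers the other hosts.
jax.distributed.initialize()

print("process index: ", jax.process_index())       # 0..3 on a 4-VM slice
print("local devices: ", jax.local_device_count())   # 4 chips attached to this VM
print("global devices:", jax.device_count())         # 16 chips across the slice
```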
Slices with 8 chips (v6e-8) attached to a single VM are optimized for inference, allowing all 8 chips to be used in a single serving workload. You can perform multi-host inference using Pathways on Cloud. For more information, see Perform multihost inference using Pathways.
For information about the number of VMs for each topology, see VM Types.
VM types
Each TPU v6e VM can contain 1, 4, or 8 chips. 4-chip and smaller slices share a single non-uniform memory access (NUMA) node. For more information about NUMA nodes, see Non-uniform memory access on Wikipedia.

v6e slices are created using half-host VMs, each with 4 TPU chips. There are two exceptions to this rule:
- v6e-1: A VM with only a single chip, primarily intended for testing.
- v6e-8: A full-host VM optimized for an inference use case, with all 8 chips attached to a single VM.
The following table shows a comparison of TPU v6e VM types:
| VM type | Number of vCPUs per VM | RAM (GB) per VM | Number of NUMA nodes per VM |
|---|---|---|---|
| 1-chip VM | 44 | 176 | 1 |
| 4-chip VM | 180 | 720 | 1 |
| 8-chip VM | 180 | 1440 | 2 |
Using 8-chip slices (v6e-8 with one VM) is not recommended for dual networks due to performance impacts.
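For the single-VM v6e-8 case described above, a serving workload typically shards the model across all 8 chips. The following sketch assumes JAX on a v6e-8 VM and builds a 2x4 device mesh matching the slice topology; the axis names and sharding layout are illustrative choices, not a prescribed configuration.

```python
# Minimal sketch: shard an array across all 8 chips of a v6e-8 VM.
# Assumes JAX with TPU support on a single-host v6e-8 slice (2x4 topology).
import jax
import jax.numpy as jnp
from jax.experimental import mesh_utils
from jax.sharding import Mesh, NamedSharding, PartitionSpec

# Arrange the 8 local chips into a 2x4 mesh matching the slice topology.
devices = mesh_utils.create_device_mesh((2, 4))
mesh = Mesh(devices, axis_names=("data", "model"))

# Shard a weight matrix over the "model" axis and replicate it over "data".
sharding = NamedSharding(mesh, PartitionSpec(None, "model"))
weights = jax.device_put(jnp.ones((8192, 8192), dtype=jnp.bfloat16), sharding)
print(weights.sharding)
```

Keeping the mesh shape aligned with the physical 2x4 topology keeps communication on neighboring ICI links, which is generally the preferred layout for single-host serving.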