TPU v6e

This document describes the architecture and supported configurations of Cloud TPU v6e (Trillium). On all technical surfaces, such as the API and logs, and throughout this document, Trillium is referred to as v6e.

With a 256-chip footprint per Pod, v6e shares many similarities with v5e. This system is optimized for transformer, text-to-image, and convolutional neural network (CNN) training, fine-tuning, and serving.

System architecture

Each v6e chip contains one TensorCore. Each TensorCore has two matrix-multiply units (MXUs), a vector unit, and a scalar unit. The following table shows the key specifications and their values for TPU v6e.

Specification | Value
Performance/total cost of ownership (TCO) (expected) | 1
Peak compute per chip (bf16) | 918 TFLOPs
Peak compute per chip (Int8) | 1836 TOPs
HBM capacity per chip | 32 GB
HBM bandwidth per chip | 1600 GBps
Bidirectional inter-chip interconnect (ICI) bandwidth (per chip) | 800 GBps
ICI ports per chip | 4
DRAM per host | 1536 GiB
Chips per host | 8
TPU Pod size | 256 chips
Interconnect topology | 2D torus
BF16 peak compute per Pod | 234.9 PFLOPs
All-reduce bandwidth per Pod | 102.4 TB/s
Bisection bandwidth per Pod | 3.2 TB/s
Per-host NIC configuration | 4 x 200 Gbps NIC
Data center network bandwidth per Pod | 25.6 Tbps
Special features | SparseCore
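Several of the Pod-level rows can be cross-checked against the per-chip and per-host rows by simple multiplication. The following is a minimal sketch of that arithmetic; the constant and variable names are illustrative, and the total HBM per Pod is a derived figure that does not appear in the table. Multiplying the rounded 918 TFLOPs per-chip value by 256 gives roughly 235.0 PFLOPs, slightly above the 234.9 PFLOPs listed, presumably because the per-chip value in the table is rounded.

```python
# Figures copied from the specification table.
PEAK_BF16_TFLOPS_PER_CHIP = 918
HBM_GB_PER_CHIP = 32
CHIPS_PER_HOST = 8
CHIPS_PER_POD = 256
NICS_PER_HOST = 4
NIC_GBPS = 200

# A full Pod spans 256 / 8 = 32 hosts.
hosts_per_pod = CHIPS_PER_POD // CHIPS_PER_HOST

# Aggregate BF16 peak compute, converted from TFLOPs to PFLOPs.
pod_bf16_pflops = PEAK_BF16_TFLOPS_PER_CHIP * CHIPS_PER_POD / 1000

# Aggregate HBM capacity (derived; not listed in the table).
pod_hbm_gb = HBM_GB_PER_CHIP * CHIPS_PER_POD

# Data center network bandwidth: hosts x NICs per host x NIC speed, in Tbps.
pod_dcn_tbps = hosts_per_pod * NICS_PER_HOST * NIC_GBPS / 1000

print(f"Hosts per Pod:            {hosts_per_pod}")
print(f"BF16 peak per Pod:        {pod_bf16_pflops:.1f} PFLOPs")
print(f"HBM per Pod (derived):    {pod_hbm_gb} GB")
print(f"DCN bandwidth per Pod:    {pod_dcn_tbps:.1f} Tbps")
```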

Supported configurations

The following table shows the 2D slice shapes that are supported for v6e:

Topology | TPU chips | Hosts | VMs | Machine type (GKE API) | Scope
1x1 | 1 | 1/8 | 1 | ct6e-standard-1t | Sub-host
2x2 | 4 | 1/2 | 1 | ct6e-standard-4t | Sub-host
2x4 | 8 | 1 | 1 | ct6e-standard-8t | Single-host
2x4 | 8 | 1 | 2 | ct6e-standard-4t | Single-host
4x4 | 16 | 2 | 4 | ct6e-standard-4t | Multi-host
4x8 | 32 | 4 | 8 | ct6e-standard-4t | Multi-host
8x8 | 64 | 8 | 16 | ct6e-standard-4t | Multi-host
8x16 | 128 | 16 | 32 | ct6e-standard-4t | Multi-host
16x16 | 256 | 32 | 64 | ct6e-standard-4t | Multi-host
Note: The 8-chip (2x4) configuration attached to 2 VMs is only supported when using the GKE API.
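The chip count, host share, and scope columns follow directly from the topology string and the 8-chips-per-host figure. The sketch below illustrates that relationship; `describe_topology` and `SUPPORTED_TOPOLOGIES` are hypothetical names used only for this example, not part of any Cloud TPU API.

```python
from fractions import Fraction

CHIPS_PER_HOST = 8  # from the specification table

# Slice shapes listed in the supported configurations table.
SUPPORTED_TOPOLOGIES = {
    "1x1", "2x2", "2x4", "4x4", "4x8", "8x8", "8x16", "16x16",
}


def describe_topology(topology: str) -> dict:
    """Derive chips, host share, and scope for a supported v6e topology."""
    if topology not in SUPPORTED_TOPOLOGIES:
        raise ValueError(f"unsupported v6e topology: {topology!r}")
    rows, cols = (int(n) for n in topology.split("x"))
    chips = rows * cols
    # A Fraction keeps sub-host shares such as 1/8 and 1/2 exact.
    hosts = Fraction(chips, CHIPS_PER_HOST)
    if hosts < 1:
        scope = "Sub-host"
    elif hosts == 1:
        scope = "Single-host"
    else:
        scope = "Multi-host"
    return {"chips": chips, "hosts": hosts, "scope": scope}


for name in sorted(SUPPORTED_TOPOLOGIES, key=lambda t: describe_topology(t)["chips"]):
    info = describe_topology(name)
    print(f"{name:>6}: {info['chips']:>3} chips, {info['hosts']} host(s), {info['scope']}")
```

The VM count is not derived here because it depends on the VM type chosen for the slice, as the 2x4 rows show.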

Slices with 8 chips (v6e-8) attached to a single VM are optimized for inference, allowing all 8 chips to be used in a single serving workload. You can perform multi-host inference using Pathways on Cloud. For more information, see Perform multihost inference using Pathways.

For information about the number of VMs for each topology, see VM types.

VM types

Each TPU v6e VM can contain 1, 4, or 8 chips. In slices with 4 or fewer chips, all chips share the same non-uniform memory access (NUMA) node. For more information about NUMA nodes, see Non-uniform memory access on Wikipedia.

Figure: Diagram of a v6e host

v6e slices are created using half-host VMs, each with 4 TPU chips. There are two exceptions to this rule:

  • v6e-1: A VM with only a single chip, primarily intended for testing
  • v6e-8: A full-host VM that has been optimized for an inference use case with all 8 chips attached to a single VM.
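The half-host rule and its two exceptions can be sketched as a small helper that returns the VM layout for a given slice size. The function names and the `full_host_vm` flag are hypothetical, introduced only for illustration; the machine type pattern is inferred from the names in the supported configurations table.

```python
# Slice sizes (in chips) listed in the supported configurations table.
SUPPORTED_CHIP_COUNTS = {1, 4, 8, 16, 32, 64, 128, 256}


def vm_layout(chips: int, full_host_vm: bool = True) -> tuple[int, int]:
    """Return (number_of_vms, chips_per_vm) for a v6e slice.

    full_host_vm only matters for 8-chip slices: True selects the single
    8-chip VM, False selects two 4-chip VMs (supported only with the GKE API).
    """
    if chips not in SUPPORTED_CHIP_COUNTS:
        raise ValueError(f"unsupported v6e slice size: {chips}")
    if chips == 1:
        return (1, 1)  # exception: single-chip VM, mainly for testing
    if chips == 8 and full_host_vm:
        return (1, 8)  # exception: full-host VM optimized for inference
    return (chips // 4, 4)  # general rule: half-host VMs with 4 chips each


def machine_type(chips_per_vm: int) -> str:
    """Build the GKE machine type name, following the pattern in the table."""
    return f"ct6e-standard-{chips_per_vm}t"


for size in (1, 4, 8, 16, 256):
    vms, per_vm = vm_layout(size)
    print(f"v6e-{size}: {vms} VM(s) x {per_vm} chip(s), {machine_type(per_vm)}")
```

Calling `vm_layout(8, full_host_vm=False)` returns the two-VM layout for the 2x4 topology.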

The following table shows a comparison of TPU v6e VM types:

VM type | Number of vCPUs per VM | RAM (GB) per VM | Number of NUMA nodes per VM
1-chip VM | 44 | 176 | 1
4-chip VM | 180 | 720 | 1
8-chip VM | 180 | 1440 | 2
Note: We don't recommend using a full-host VM (v6e-8 with one VM) for dual networks due to performance impacts.


Last updated 2026-02-12 UTC.