Harden workload isolation with GKE Sandbox

This page describes how to use GKE Sandbox to protect the host kernel on your nodes when containers in the Pod execute unknown or untrusted code, or need extra isolation from the node. This page explains how you can enable GKE Sandbox and monitor your clusters when GKE Sandbox is running.

This page is for Security specialists who must isolate their workloads for additional protection from unknown or untrusted code. To learn more about common roles and example tasks that we reference in Google Cloud content, see Common GKE user roles and tasks.

Before reading this page, ensure that you're familiar with the general overview of GKE Sandbox.

Enable GKE Sandbox

GKE Sandbox is ready to use in Autopilot clusters running GKE version 1.27.4-gke.800 and later. To start deploying Autopilot workloads in a sandbox, skip to Use GKE Sandbox in Autopilot and Standard.

To use GKE Sandbox in new or existing GKE Standard clusters, you must manually enable GKE Sandbox on the cluster.

For details on supported GPU versions, see GPU model support.

Before you begin

Before you start, make sure that you have performed the following tasks:

  • Enable the Google Kubernetes Engine API.
  • If you want to use the Google Cloud CLI for this task, install and then initialize the gcloud CLI. If you previously installed the gcloud CLI, get the latest version by running the gcloud components update command. Earlier gcloud CLI versions might not support running the commands in this document. Note: For existing gcloud CLI installations, make sure to set the compute/region property. If you primarily use zonal clusters, set the compute/zone property instead (see the example after this list). By setting a default location, you can avoid errors in the gcloud CLI like the following: One of [--zone, --region] must be supplied: Please specify location. You might need to specify the location in certain commands if the location of your cluster differs from the default that you set.
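
For example, the following commands set a default region or zone. The locations shown here are placeholders for illustration only; use the location where you run your clusters:

gcloud config set compute/region us-central1
gcloud config set compute/zone us-central1-a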

Enable GKE Sandbox on a new Standard cluster

The default node pool, which is created when you create a new cluster, can't use GKE Sandbox if it's the only node pool in the cluster, because GKE-managed system workloads must run separately from untrusted sandboxed workloads. To enable GKE Sandbox during cluster creation, you must add at least one extra node pool to the cluster.

Console

To view your clusters, visit the Google Kubernetes Engine menu in the Google Cloud console.

  1. In the Google Cloud console, go to the Create a Kubernetes cluster page.

    Go to Create a Kubernetes cluster

  2. Optional but recommended: From the navigation menu, under Cluster, click Features and select the following checkboxes so that gVisor messages are logged:

    • Cloud Logging
    • Cloud Monitoring
    • Managed Service for Prometheus
  3. Click Add Node Pool.

  4. From the navigation menu, under Node Pools, expand the new node pool and click Nodes.

  5. Configure the following settings for the node pool:

    1. From the Image type drop-down list, select Container-Optimized OS with containerd (cos_containerd). This is the only supported image type for GKE Sandbox.
    2. Under Machine Configuration, select a Series and Machine type.
    3. Optionally, if you're running a supported GKE version, select a GPU or TPU type. This must be one of the following GPU types:

      • nvidia-gb200: NVIDIA GB200 NVL72 (Preview)
      • nvidia-b200: NVIDIA B200 (180GB) (Preview)
      • nvidia-h200-141gb: NVIDIA H200 (141GB) (Preview)
      • nvidia-h100-mega-80gb: NVIDIA H100 Mega (80GB)
      • nvidia-h100-80gb: NVIDIA H100 (80GB)
      • nvidia-a100-80gb: NVIDIA A100 (80GB)
      • nvidia-tesla-a100: NVIDIA A100 (40GB)
      • nvidia-rtx-pro-6000: NVIDIA RTX PRO 6000 (Preview)
      • nvidia-l4: NVIDIA L4
      • nvidia-tesla-t4: NVIDIA T4
      For more information, see GPU model support.

      or one of the supported TPU types. For details, see GKE Sandbox.

  6. From the navigation menu, under the name of the node pool you are configuring, click Security and select the Enable sandbox with gVisor checkbox.

  7. Continue to configure the cluster and node pools as needed.

  8. Click Create.

gcloud

GKE Sandbox can't be enabled for the default node pool, and it isn't possible to create additional node pools at the same time as you create a new cluster using the gcloud command. Instead, create your cluster as you normally would. Although optional, it's recommended that you enable Logging and Monitoring so that gVisor messages are logged.

Next, use the gcloud container node-pools create command, and set the --sandbox flag to type=gvisor. The node image type must be cos_containerd for GKE Sandbox.

gcloud container node-pools create NODE_POOL_NAME \
    --cluster=CLUSTER_NAME \
    --node-version=NODE_VERSION \
    --machine-type=MACHINE_TYPE \
    --image-type=cos_containerd \
    --sandbox type=gvisor

Replace the following variables:

  • NODE_POOL_NAME: the name of your new node pool.
  • CLUSTER_NAME: the name of your cluster.
  • NODE_VERSION: the version to use for the node pool.
  • MACHINE_TYPE: the type of machine to use for the nodes.
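
For illustration, the following command sketches a sandbox-enabled node pool. The node pool name, cluster name, and machine type are hypothetical examples; the --node-version flag is omitted, so the node pool uses the cluster's default node version:

gcloud container node-pools create gvisor-pool \
    --cluster=example-cluster \
    --machine-type=e2-standard-4 \
    --image-type=cos_containerd \
    --sandbox type=gvisor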

To create a GPU node pool with GKE Sandbox, run the following command:

gcloud container node-pools create NODE_POOL_NAME \
    --cluster=CLUSTER_NAME \
    --node-version=NODE_VERSION \
    --machine-type=MACHINE_TYPE \
    --accelerator=type=GPU_TYPE,gpu-driver-version=DRIVER_VERSION \
    --image-type=cos_containerd \
    --sandbox type=gvisor

Replace the following:

  • GPU_TYPE: a supported GPU type. For details, see GKE Sandbox.

  • MACHINE_TYPE: a machine type that matches the requested GPU type. For details, see Google Kubernetes Engine GPU requirements.

  • DRIVER_VERSION: the NVIDIA driver version to install. Can be one of the following:

    • default: Install the default driver version for your GKE version.
    • latest: Install the latest available driver version for your GKE version. Available only for nodes that use Container-Optimized OS.
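
As a sketch only, the following command creates a sandboxed node pool with NVIDIA L4 GPUs. The node pool name, cluster name, and machine type are hypothetical; choose a machine type that matches your GPU type, as described earlier:

gcloud container node-pools create gvisor-gpu-pool \
    --cluster=example-cluster \
    --machine-type=g2-standard-8 \
    --accelerator=type=nvidia-l4,count=1,gpu-driver-version=default \
    --image-type=cos_containerd \
    --sandbox type=gvisor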

To create a TPU node pool with GKE Sandbox, run the following command:

gcloud container node-pools create NODE_POOL_NAME \
    --cluster=CLUSTER_NAME \
    --node-version=NODE_VERSION \
    --num-nodes=NUM_NODES \
    --tpu-topology=TPU_TOPOLOGY \
    --machine-type=MACHINE_TYPE \
    --image-type=cos_containerd \
    --sandbox type=gvisor

Replace the following:

  • MACHINE_TYPE: a supported TPU machine type. For details, see GKE Sandbox.
  • NUM_NODES: the number of nodes to create in the node pool.
  • TPU_TOPOLOGY: the TPU topology to use for the TPU slice.

Enable GKE Sandbox on an existing Standard cluster

You can enable GKE Sandbox on an existing Standard cluster by adding a new node pool and enabling the feature for that node pool.

Console

To create a new node pool with GKE Sandbox enabled:

  1. Go to the Google Kubernetes Engine page in the Google Cloud console.

    Go to Google Kubernetes Engine

  2. Click the name of the cluster you want to modify.

  3. Click Add Node Pool.

  4. Configure the Node pool details page as needed.

  5. From the navigation menu, click Nodes and configure the following settings:

    1. From the Image type drop-down list, select Container-Optimized OS with containerd (cos_containerd). This is the only supported image type for GKE Sandbox.
    2. Under Machine Configuration, select a Series and Machine type.
    3. Optionally, if you're running a supported GKE version, select a GPU or TPU type. This must be one of the following GPU types:

      • nvidia-gb200: NVIDIA GB200 NVL72 (Preview)
      • nvidia-b200: NVIDIA B200 (180GB) (Preview)
      • nvidia-h200-141gb: NVIDIA H200 (141GB) (Preview)
      • nvidia-h100-mega-80gb: NVIDIA H100 Mega (80GB)
      • nvidia-h100-80gb: NVIDIA H100 (80GB)
      • nvidia-a100-80gb: NVIDIA A100 (80GB)
      • nvidia-tesla-a100: NVIDIA A100 (40GB)
      • nvidia-rtx-pro-6000: NVIDIA RTX PRO 6000 (Preview)
      • nvidia-l4: NVIDIA L4
      • nvidia-tesla-t4: NVIDIA T4
      For more information, see GPU model support.

      or one of the supported TPU types. For details, see GKE Sandbox.

  6. From the navigation menu, click Security and select the Enable sandbox with gVisor checkbox.

  7. Click Create.

gcloud

To create a new node pool with GKE Sandbox enabled, use a command like the following:

gcloud container node-pools create NODE_POOL_NAME \
    --cluster=CLUSTER_NAME \
    --machine-type=MACHINE_TYPE \
    --image-type=cos_containerd \
    --sandbox type=gvisor

The node image type must be cos_containerd for GKE Sandbox.

To create a GPU node pool with GKE Sandbox, run the following command:

gcloud container node-pools create NODE_POOL_NAME \
    --cluster=CLUSTER_NAME \
    --node-version=NODE_VERSION \
    --machine-type=MACHINE_TYPE \
    --accelerator=type=GPU_TYPE,gpu-driver-version=DRIVER_VERSION \
    --image-type=cos_containerd \
    --sandbox type=gvisor

Replace the following:

  • GPU_TYPE: a supported GPU type. For details, see GKE Sandbox.

  • MACHINE_TYPE: a machine type that matches the requested GPU type. For details, see Google Kubernetes Engine GPU requirements.

  • DRIVER_VERSION: the NVIDIA driver version to install. Can be one of the following:

    • default: Install the default driver version for your GKE version.
    • latest: Install the latest available driver version for your GKE version. Available only for nodes that use Container-Optimized OS.

To create a TPU node pool with GKE Sandbox, run the following command:

gcloud container node-pools create NODE_POOL_NAME \
    --cluster=CLUSTER_NAME \
    --node-version=NODE_VERSION \
    --num-nodes=NUM_NODES \
    --tpu-topology=TPU_TOPOLOGY \
    --machine-type=MACHINE_TYPE \
    --image-type=cos_containerd \
    --sandbox type=gvisor

Replace the following:

  • MACHINE_TYPE: a supported TPU machine type. For details, see GKE Sandbox.
  • NUM_NODES: the number of nodes to create in the node pool.
  • TPU_TOPOLOGY: the TPU topology to use for the TPU slice.

Optional: Enable monitoring and logging

It is optional but recommended that you enable Cloud Logging and Cloud Monitoring on the cluster, so that gVisor messages are logged. These services are enabled by default for new clusters.

You can use the Google Cloud console to enable these features on an existing cluster.

  1. Go to the Google Kubernetes Engine page in the Google Cloud console.

    Go to Google Kubernetes Engine

  2. Click the name of the cluster you want to modify.

  3. Under Features, in the Cloud Logging field, click Edit Cloud Logging.

  4. Select the Enable Cloud Logging checkbox.

  5. Click Save Changes.

  6. Repeat the same steps for the Cloud Monitoring and Managed Service for Prometheus fields to enable those features.
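
Alternatively, you can enable these features with the gcloud CLI. The following commands are a sketch; the cluster name is a placeholder, and you might need to adjust the logging and monitoring components for your environment:

gcloud container clusters update CLUSTER_NAME --logging=SYSTEM,WORKLOAD
gcloud container clusters update CLUSTER_NAME --monitoring=SYSTEM
gcloud container clusters update CLUSTER_NAME --enable-managed-prometheus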

Use GKE Sandbox in Autopilot and Standard

In Autopilot clusters and in Standard clusters with GKE Sandbox enabled, you request a sandboxed environment for a Pod by specifying the gvisor RuntimeClass in the Pod specification.

For Autopilot clusters, ensure that you're running GKE version 1.27.4-gke.800 or later.

Run an application in a sandbox

To make a Deployment run on a node with GKE Sandbox enabled, set its spec.template.spec.runtimeClassName to gvisor, as shown in the following example:

# httpd.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: httpd
  labels:
    app: httpd
spec:
  replicas: 1
  selector:
    matchLabels:
      app: httpd
  template:
    metadata:
      labels:
        app: httpd
    spec:
      runtimeClassName: gvisor
      containers:
      - name: httpd
        image: httpd

Create the Deployment:

kubectl apply -f httpd.yaml

The Pod is deployed to a node with GKE Sandbox enabled. To verify the deployment, find the node where the Pod is deployed:

kubectl get pods

The output is similar to the following:

NAME                    READY   STATUS    RESTARTS   AGE
httpd-db5899bc9-dk7lk   1/1     Running   0          24s

From the output, find the name of the Pod, and then check the value of its RuntimeClass:

kubectl get pods POD_NAME -o jsonpath='{.spec.runtimeClassName}'

The output is gvisor.

Alternatively, you can list the RuntimeClass of each Pod and look for the Pods where it is set to gvisor:

kubectl get pods -o jsonpath=$'{range .items[*]}{.metadata.name}: {.spec.runtimeClassName}\n{end}'

The output is the following:

POD_NAME: gvisor

This method of verifying that the Pod is running in a sandbox is trustworthy because it does not rely on any data within the sandbox itself. Anything reported from within the sandbox is untrustworthy, because it could be defective or malicious.
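
As an additional cross-check from outside the sandbox, you can confirm that the gvisor RuntimeClass object exists in the cluster:

kubectl get runtimeclass gvisor

If GKE Sandbox is available in the cluster, the command returns a RuntimeClass named gvisor.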

Run a Pod with accelerators on GKE Sandbox

To run a GPU or TPU workload on GKE Sandbox, add the runtimeClassName: gvisor field to your manifest, as shown in the following examples:

  • Example manifest for Standard mode GPU Pods:

    apiVersion: v1
    kind: Pod
    metadata:
      name: my-gpu-pod
    spec:
      runtimeClassName: gvisor
      containers:
      - name: my-gpu-container
        image: nvidia/samples:vectoradd-cuda10.2
        resources:
          limits:
            nvidia.com/gpu: 1
  • Example manifest for Autopilot mode GPU Pods:

    apiVersion: v1
    kind: Pod
    metadata:
      name: my-gpu-pod
    spec:
      runtimeClassName: gvisor
      nodeSelector:
        cloud.google.com/gke-gpu-driver-version: "latest"
        cloud.google.com/gke-accelerator: nvidia-tesla-t4
      containers:
      - name: my-gpu-container
        image: nvidia/samples:vectoradd-cuda10.2
        resources:
          limits:
            nvidia.com/gpu: 1
  • Example manifest for Standard or Autopilot mode TPU Pods:

    apiVersion: v1
    kind: Pod
    metadata:
      name: my-tpu-pod
    spec:
      runtimeClassName: gvisor
      nodeSelector:
        cloud.google.com/gke-tpu-accelerator: tpu-v5-lite-podslice
        cloud.google.com/gke-tpu-topology: 1x1
      containers:
      - name: my-tpu-container
        image: us-docker.pkg.dev/cloud-tpu-images/jax-ai-image/tpu:latest
        command:
        - bash
        - -c
        - |
          python -c 'import jax; print("TPU cores:", jax.device_count())'
        resources:
          limits:
            google.com/tpu: 1
          requests:
            google.com/tpu: 1

You can run any Autopilot or Standard mode accelerator Pods that meet the version and accelerator type requirements on GKE Sandbox by adding the runtimeClassName: gvisor field to the manifest. For instructions on running GPU Pods and TPU Pods in GKE, see the GKE documentation about GPUs and TPUs.

Supported GPU types for Autopilot are the following:

  • nvidia-gb200: NVIDIA GB200 NVL72 (Preview)
  • nvidia-b200: NVIDIA B200 (180GB) (Preview)
  • nvidia-h200-141gb: NVIDIA H200 (141GB) (Preview)
  • nvidia-h100-mega-80gb: NVIDIA H100 Mega (80GB)
  • nvidia-h100-80gb: NVIDIA H100 (80GB)
  • nvidia-a100-80gb: NVIDIA A100 (80GB)
  • nvidia-tesla-a100: NVIDIA A100 (40GB)
  • nvidia-rtx-pro-6000: NVIDIA RTX PRO 6000 (Preview)
  • nvidia-l4: NVIDIA L4
  • nvidia-tesla-t4: NVIDIA T4
For more information, see GPU model support.

Run a regular Pod along with sandboxed Pods

The steps in this section apply to Standard mode workloads. You don't need to run regular Pods alongside sandboxed Pods in Autopilot mode, because the Autopilot pricing model eliminates the need to manually optimize the number of Pods scheduled on nodes.

After enabling GKE Sandbox on a node pool, you can run trusted applications on those nodes without using a sandbox by using node taints and tolerations. These Pods are referred to as "regular Pods" to distinguish them from sandboxed Pods.

Regular Pods, just like sandboxed Pods, are prevented from accessing other Google Cloud services or cluster metadata. This prevention is part of the node's configuration. If your regular Pods or sandboxed Pods require access to Google Cloud services, use Workload Identity Federation for GKE.

Running untrusted code on the same nodes as critical system services comes with potential risk, even if the untrusted code is running in a sandbox. Consider these risks when designing your applications.

GKE Sandbox adds the following label and taint to nodes that can run sandboxed Pods:

labels:
  sandbox.gke.io/runtime: gvisor

taints:
- effect: NoSchedule
  key: sandbox.gke.io/runtime
  value: gvisor

In addition to any node affinity and toleration settings in your Pod manifest, GKE Sandbox applies the following node affinity and toleration to all Pods with RuntimeClass set to gvisor:

affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: sandbox.gke.io/runtime
          operator: In
          values:
          - gvisor

tolerations:
- effect: NoSchedule
  key: sandbox.gke.io/runtime
  operator: Equal
  value: gvisor

To schedule a regular Pod on a node with GKE Sandbox enabled, manually apply the node affinity and toleration described earlier in your Pod manifest.

  • If your Pod can run on nodes with GKE Sandbox enabled, add the toleration.
  • If your Pod must run on nodes with GKE Sandbox enabled, add both the node affinity and toleration.

For example, the following manifest modifies the manifest used in Run an application in a sandbox so that it runs as a regular Pod on a node with sandboxed Pods, by removing the runtimeClassName field and adding both the node affinity and toleration described earlier.

# httpd-no-sandbox.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: httpd-no-sandbox
  labels:
    app: httpd
spec:
  replicas: 1
  selector:
    matchLabels:
      app: httpd
  template:
    metadata:
      labels:
        app: httpd
    spec:
      containers:
      - name: httpd
        image: httpd
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: sandbox.gke.io/runtime
                operator: In
                values:
                - gvisor
      tolerations:
      - effect: NoSchedule
        key: sandbox.gke.io/runtime
        operator: Equal
        value: gvisor

First, verify that the Deployment is not running in a sandbox:

kubectl get pods -o jsonpath=$'{range .items[*]}{.metadata.name}: {.spec.runtimeClassName}\n{end}'

The output is similar to:

httpd-db5899bc9-dk7lk: gvisor
httpd-no-sandbox-5bf87996c6-cfmmd:

The httpd Deployment created earlier is running in a sandbox, because its runtimeClassName is gvisor. The httpd-no-sandbox Deployment has no value for runtimeClassName, so it is not running in a sandbox.

Next, verify that the non-sandboxed Deployment is running on a node with GKE Sandbox by running the following command:

kubectl get pods -o jsonpath=$'{range .items[*]}{.metadata.name}: {.spec.nodeName}\n{end}'

The name of the node pool is embedded in the value of nodeName. Verify that the Pod is running on a node in a node pool with GKE Sandbox enabled.
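
You can also cross-check the node directly by listing your nodes with the sandbox label that GKE Sandbox adds, as described earlier in this section; nodes in sandbox-enabled node pools show gvisor in the extra column:

kubectl get nodes -L sandbox.gke.io/runtime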

Note: If the regular Pod is unschedulable, verify that the node affinity and toleration are set correctly in the Pod manifest.
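
For example, one way to see why the Pod can't be scheduled is to describe it and review the Events section of the output, where taint and affinity mismatches are reported:

kubectl describe pod POD_NAME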

Verify metadata protection

To validate the assertion that metadata is protected from nodes that can run sandboxed Pods, you can run a test:

  1. Create a sandboxed Deployment from the following manifest, using kubectl apply -f. It uses the fedora image, which includes the curl command. The Pod runs the /bin/sleep command to ensure that the Deployment runs for 10000 seconds.

    # sandbox-metadata-test.yaml
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: fedora
      labels:
        app: fedora
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: fedora
      template:
        metadata:
          labels:
            app: fedora
        spec:
          runtimeClassName: gvisor
          containers:
          - name: fedora
            image: fedora
            command: ["/bin/sleep", "10000"]
  2. Get the name of the Pod using kubectl get pods, then use kubectl exec to connect to the Pod interactively.

    kubectl exec -it POD_NAME -- /bin/sh

    You are connected to a container running in the Pod, in a /bin/sh session.

  3. Within the interactive session, attempt to access a URL that returns cluster metadata:

    curl-s"http://169.254.169.254/computeMetadata/v1/instance/attributes/kube-env"-H"Metadata-Flavor: Google"

    The command hangs and eventually times out, because the packets are silently dropped.

  4. Press Ctrl+C to terminate the curl command, and type exit to disconnect from the Pod.

  5. Remove the runtimeClassName line from the YAML manifest and redeploy the Pod using kubectl apply -f FILENAME. The sandboxed Pod is terminated and recreated on a node without GKE Sandbox.

  6. Get the new Pod name, connect to it using kubectl exec, and run the curl command again. This time, results are returned. This example output is truncated.

    ALLOCATE_NODE_CIDRS: "true"
    API_SERVER_TEST_LOG_LEVEL: --v=3
    AUTOSCALER_ENV_VARS: kube_reserved=cpu=60m,memory=960Mi,ephemeral-storage=41Gi;...
    ...

    Type exit to disconnect from the Pod.

  7. Remove the deployment:

    kubectl delete deployment fedora

Disable GKE Sandbox

You can't disable GKE Sandbox in GKE Autopilot clusters or in GKE Standard node pools. If you want to stop using GKE Sandbox, delete the node pool.
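
For example, one way to delete a Standard node pool with the gcloud CLI is the following; the placeholders match the ones used earlier on this page:

gcloud container node-pools delete NODE_POOL_NAME \
    --cluster=CLUSTER_NAME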

What's next
