About seccomp in GKE

This document describes the Linux secure computing mode (seccomp) in Google Kubernetes Engine (GKE). This document assumes that you know about the following:

Use the information in this document to understand which actions your containerized applications can perform on the host virtual machine (VM) that backs your nodes.

This document is for Security specialists who use seccomp as part of their organization's security strategy and want to understand how GKE interacts with seccomp profiles. To learn more about common roles and example tasks that we reference in Google Cloud content, see Common GKE user roles and tasks.

What is seccomp?

Secure computing mode, or seccomp, is a security capability in Linux that lets you restrict the system calls (syscalls) that a process can make to the Linux kernel.
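To see whether a seccomp filter is active for a process, you can read the Seccomp field that the kernel reports in /proc/<pid>/status. The following is a minimal Python sketch, not part of GKE tooling; the function name seccomp_mode is an assumption made for illustration, and the mode values follow the proc(5) manual page.

```python
from pathlib import Path

# Meaning of the "Seccomp:" field in /proc/<pid>/status, per proc(5).
SECCOMP_MODES = {0: "disabled", 1: "strict", 2: "filter"}


def seccomp_mode(status_text: str) -> str:
    """Return the seccomp mode named in the text of a /proc/<pid>/status file."""
    for line in status_text.splitlines():
        # "Seccomp_filters:" is a different field and does not match this prefix.
        if line.startswith("Seccomp:"):
            value = int(line.split(":", 1)[1].strip())
            return SECCOMP_MODES.get(value, f"unknown ({value})")
    return "not reported"  # for example, a kernel built without seccomp support


if __name__ == "__main__":
    status_file = Path("/proc/self/status")
    if status_file.exists():  # the file exists only on Linux
        print("seccomp mode:", seccomp_mode(status_file.read_text()))
```

A process that runs under a seccomp profile is expected to report the filter mode.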

By default, GKE nodes use the Container-Optimized OS operating system with the containerd container runtime. containerd protects the Linux kernel by limiting the allowed Linux capabilities to a default list, and you can further limit allowed syscalls with a seccomp profile. containerd has a default seccomp profile available. Whether GKE applies the default seccomp profile for you depends on the cluster mode that you use, as follows:

  • Autopilot (recommended): GKE applies the containerd default seccomp profile to all workloads automatically.
  • Standard: GKE does not apply the containerd default seccomp profile to all workloads automatically. We recommend that you apply either the default seccomp profile or a custom seccomp profile to your workloads.

The default containerd seccomp profile provides baseline hardening while maintaining compatibility with most workloads. The full seccomp profile definition for containerd is available on GitHub.

Linux capabilities and syscalls

Non-root processes running on Linux systems might require specific privileges to perform actions as the root user. Linux uses capabilities to divide the available privileges into groups, so that a non-root process can perform a specific action without being granted all privileges. For a process to successfully make a specific syscall, the process must have the corresponding privileges granted by a capability.

For a list of all Linux capabilities, refer to capabilities.
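As an illustration of how capabilities split up root privileges, the kernel reports a process's effective capabilities as a hexadecimal bitmask in the CapEff field of /proc/<pid>/status. The following Python sketch decodes that mask for the three capabilities that this document mentions. The function name held_capabilities is hypothetical, the bit positions come from linux/capability.h, and the sample mask in the usage lines is an example value of the kind often seen in unprivileged containers, not output captured from a GKE node.

```python
# Bit positions from <linux/capability.h>. Only the capabilities that this
# document mentions are listed, so the table is deliberately incomplete.
CAPABILITY_BITS = {
    "CAP_DAC_READ_SEARCH": 2,
    "CAP_SYS_ADMIN": 21,
    "CAP_SYS_BOOT": 22,
}


def held_capabilities(cap_eff_hex: str) -> dict[str, bool]:
    """Map each known capability to whether its bit is set in a CapEff mask."""
    mask = int(cap_eff_hex, 16)
    return {name: bool((mask >> bit) & 1) for name, bit in CAPABILITY_BITS.items()}


if __name__ == "__main__":
    # Example mask (assumption): a restricted set with none of the three bits set.
    for name, held in held_capabilities("00000000a80425fb").items():
        print(f"{name}: {'held' if held else 'dropped'}")
```

With the example mask, all three capabilities are reported as dropped, which matches the situation that the denied-syscalls table describes.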

Denied syscalls in the default GKE seccomp profile

The containerd default seccomp profile blocks all syscalls and then selectively allows specific syscalls, some of which depend on the CPU architecture of the node's VM and the kernel version. The syscalls variable in the DefaultProfile function lists the allowed syscalls for all architectures.

The default seccomp profile blocks syscalls that can be used to bypass container isolation boundaries and allow privileged access to the node or to other containers. The following table describes some of the significant syscalls that the default seccomp profile denies:
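The deny-by-default structure can be sketched in a few lines of Python. The profile below is a hypothetical, heavily shortened example in the JSON layout that container runtimes accept for seccomp profiles; it is not the containerd default profile, and the helper action_for is an assumption that ignores argument filters and architecture-specific rules.

```python
import json

# Hypothetical, shortened profile. The real default profile allows far more
# syscalls and includes rules that depend on arguments and architecture.
EXAMPLE_PROFILE = json.loads("""
{
  "defaultAction": "SCMP_ACT_ERRNO",
  "syscalls": [
    {"names": ["read", "write", "openat", "close"], "action": "SCMP_ACT_ALLOW"}
  ]
}
""")


def action_for(profile: dict, syscall: str) -> str:
    """Return the action a profile assigns to a syscall, ignoring argument filters."""
    for rule in profile.get("syscalls", []):
        if syscall in rule.get("names", []):
            return rule["action"]
    # No rule names the syscall, so the profile-wide default applies.
    return profile["defaultAction"]


if __name__ == "__main__":
    for name in ("read", "mount"):
        print(name, "->", action_for(EXAMPLE_PROFILE, name))
```

Any syscall that no rule names falls through to defaultAction, which is why a syscall such as mount is rejected without being listed explicitly.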

Denied syscalls

mount, umount, umount2, fsmount, mount_setattr
    Restrict processes from accessing or manipulating the node filesystem outside of the container boundaries. Also denied because the CAP_SYS_ADMIN capability is dropped.

bpf
    Restrict processes from creating eBPF programs in the kernel, which can lead to privilege escalation on the node. For example, CVE-2021-3490 used the bpf syscall. Also denied because the CAP_SYS_ADMIN capability is dropped.

clone, clone3, unshare
    Restrict processes from creating new processes in new namespaces that might be outside the container's restricted namespace. These new processes might have elevated permissions and capabilities. For example, CVE-2022-0185 used the unshare syscall. Also denied because the CAP_SYS_ADMIN capability is dropped.

reboot
    Restrict processes from rebooting the node. Denied because the CAP_SYS_BOOT capability is dropped.

open_by_handle_at, name_to_handle_at
    Restrict access to files outside of the container. These syscalls were used in one of the earliest Docker container escape exploits. Also denied because the CAP_DAC_READ_SEARCH capability and the CAP_SYS_ADMIN capability are dropped.

Note: This table only describes a subset of the syscalls that the default seccomp profile blocks. For a full list, refer to the profile definition on GitHub.

How to use seccomp in GKE

In Autopilot clusters, GKE automatically applies the containerd default seccomp profile to all your workloads. No further action is required. Attempts to make restricted syscalls fail. Autopilot disallows custom seccomp profiles because GKE manages the nodes.

In Standard clusters, you must manually apply a seccomp profile. GKE doesn't apply a profile for you.

Enable seccomp in Standard clusters

Apply a seccomp profile manually by setting the Pod or container Security Context using the spec.securityContext.seccompProfile field in the Pod specification, such as in the following example. We strongly recommend that you use a seccomp profile for your workloads unless your use case requires restricted syscalls. The two supported seccompProfile types are as follows:

The following example manifest sets the seccomp profile to the runtime default profile:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: my-deployment
      labels:
        app: default-pod
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: default-pod
      template:
        metadata:
          labels:
            app: default-pod
        spec:
          securityContext:
            seccompProfile:
              type: RuntimeDefault
          containers:
          - name: seccomp-test
            image: nginx

Important: You can't apply a seccomp profile to containers that run in Privileged mode.

When you deploy this manifest, if a container in the Pod tries to make a syscall that violates the runtime default seccomp profile, the Pod or the workload might experience unexpected behavior. For example, a Pod that makes a restricted syscall during startup would fail to start. If an application tries to make a restricted syscall while the Pod is running, you might notice errors in the container. The severity of a failed syscall depends on how the application handles errors.
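How an application handles a rejected syscall is up to the application. The following Python sketch shows one possible pattern: treat a rejection as a signal to fall back to a degraded code path instead of crashing. It assumes that the profile rejects the call with the EPERM error, which Python raises as an OSError with errno set to EPERM; the helper name call_with_fallback and both callables are hypothetical.

```python
import errno


def call_with_fallback(operation, fallback):
    """Run operation(); if it is rejected with EPERM, run fallback() instead."""
    try:
        return operation()
    except OSError as err:
        if err.errno != errno.EPERM:
            raise  # unrelated failure: let the caller see it
        return fallback()


if __name__ == "__main__":
    def restricted_operation():
        # Stand-in for code that makes a syscall the profile denies.
        raise PermissionError(errno.EPERM, "Operation not permitted")

    print(call_with_fallback(restricted_operation, lambda: "running without the feature"))
```

An application that doesn't catch the error, or that makes the call during startup, would instead exit, which corresponds to the Pod failing to start.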

Use a custom seccomp profile in Standard clusters

If the runtime default seccomp profile is too restrictive for your application (or not restrictive enough), you can apply a custom seccomp profile to Pods in Standard clusters. This process requires access to the filesystem on the node. For a tutorial on how to load and use custom seccomp profiles, refer to Restrict a Container's Syscalls with seccomp.
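As a rough sketch of the two pieces involved, the following Python code builds a deny-by-default profile document and the seccompProfile stanza that points a Pod at a profile file on the node, using the Kubernetes Localhost profile type. The function names, the three-syscall allowlist, and the path profiles/my-app.json are placeholders chosen for illustration; a real workload needs a much longer allowlist, and the profile file must already exist on the node in the location that the kubelet reads profiles from.

```python
import json


def build_profile(allowed_syscalls: list[str]) -> dict:
    """Build a deny-by-default profile that allows only the named syscalls."""
    return {
        "defaultAction": "SCMP_ACT_ERRNO",
        "syscalls": [
            {"names": sorted(allowed_syscalls), "action": "SCMP_ACT_ALLOW"},
        ],
    }


def seccomp_security_context(profile_path: str) -> dict:
    """Return the securityContext stanza that selects a node-local profile."""
    return {"seccompProfile": {"type": "Localhost", "localhostProfile": profile_path}}


if __name__ == "__main__":
    # Placeholder allowlist: far too short for a real container.
    print(json.dumps(build_profile(["write", "read", "exit_group"]), indent=2))
    # Placeholder path, relative to the kubelet's seccomp profile directory.
    print(json.dumps(seccomp_security_context("profiles/my-app.json"), indent=2))
```

The first output is what you would save as the profile file on the node; the second is what you would place under spec.securityContext in the Pod specification.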

Last updated 2025-12-15 UTC.