Movatterモバイル変換

Jump to content

cgroups

From Wikipedia, the free encyclopedia

Resource limit method in Linux

"cgroup" redirects here. For other uses, seeC group.

cgroups
Original authors	v1: Paul Menage, Rohit Seth, Memory Controller by Balbir Singh, CPU controller by Srivatsa Vaddagiri v2: Tejun Heo
Developers	Tejun Heo, Johannes Weiner, Michal Hocko, Waiman Long, Roman Gushchin, Chris Down et al.
Initial release	2007; 19 years ago (2007)
Written in	C
Operating system	Linux
Type	System software
License	GPL andLGPL
Website	Cgroup v1,Cgroup v2

cgroups (abbreviated fromcontrol groups) is aLinux kernel feature that limits, accounts for, and isolates theresource usage (CPU, memory, disk I/O, etc.)^[1]^{: § Controllers} of a collection ofprocesses.

Engineers atGoogle started the work on this feature in 2006 under the name "process containers".^[2] In late 2007, the nomenclature changed to "control groups" to avoid confusion caused by multiple meanings of the term "container" in the Linux kernel context, and the control groups functionality was merged into theLinux kernel mainline in kernel version 2.6.24, which was released in January 2008.^[3] Since then, developers have added controllers for the kernel's own memory allocation,^[4] netfilterfirewalling,^[5] theOOM killer,^[6] and many other parts.

A major change in the history of cgroups iscgroup v2, which removes the ability to use multiple process hierarchies and to discriminate between threads as found in the original cgroup (now called "v1").^[1]^{: § Issues with v1 and Rationales for v2} Work on the single, unified hierarchy started with the repurposing of v1's dummy hierarchy as a place for holding all controllers not yet used by others in 2014.^[7] cgroup v2 was merged in Linux kernel 4.5 (2016).^[8]

Versions

There are two versions of cgroups. They can co-exist in a system.

The original version of cgroups was written by Paul Menage and Rohit Seth. It was merged into the mainline Linux kernel in 2007 (2.6.2). Development and maintenance of cgroups was then taken over byTejun Heo, who instituted major redesigns without breaking the interface (see§ Redesigns of v1). It was renamed "Control Group version 1" (cgroup-v1) after cgroups-v2 appeared in Linux 4.5.^[9]
Tejun Heo found that further redesign of v1 could not proceed without breaking the interface. As a result, he added a separate, new system called "Control Group version 2" (cgroup-v2). Unlike v1, cgroup v2 has only a single process hierarchy (because a controller can only be assigned to one hierarchy, processes in separate hierarchies cannot be managed by the same controller; this change sidesteps the issue). It also removes the ability to discriminate between threads, choosing to work on a granularity of processes instead (disabling an "abuse" of the system which led to convoluted APIs).^[1]^{: § Issues with v1 and Rationales for v2} The first version of the unified hierarchy The document first appeared in Linux kernel 4.5 released on 14 March 2016.^[8]

Features

One of the design goals of cgroups is to provide a unified interface to many differentuse cases, from controlling single processes (by usingnice, for example) to fulloperating system-level virtualization (as provided byOpenVZ,Linux-VServer orLXC, for example). Cgroups provides:

Resource limiting: groups can be set not to exceed a configuredmemory limit, which also includes thefile system cache,^[10]^[11] I/O bandwidth limit,^[12] CPU quota limit,^[13] CPU set limit,^[14] ormaximum open files.^[15]
Prioritization: some groups may get a larger share of CPU utilization^[16] or disk I/O throughput^[17]
Accounting: measures a group's resource usage, which may be used, for example, for billing purposes^[18]
Control: freezing groups of processes, theircheckpointing and restarting^[18]

Use

As an example of indirect usage, systemd assumes exclusive access to the cgroups facility

A control group (abbreviated as cgroup) is a collection of processes that are bound by the same criteria and associated with a set of parameters or limits. These groups can be hierarchical, meaning that each group inherits limits from its parent group. The kernel provides access to multiple controllers (also called subsystems) through the cgroup interface;^[3] for example, the "memory" controller limits memory use, "cpuacct" accounts CPU usage, etc.

Control groups can be used in multiple ways:

By accessing the cgroup virtual file system manually.
By creating and managing groups on the fly using tools likecgcreate,cgexec, andcgclassify (fromlibcgroup).
Through the "rules engine daemon" that can automatically move processes of certain users, groups, or commands to cgroups as specified in its configuration.
Indirectly through other software that uses cgroups, such asDocker,Firejail,LXC,^[19]libvirt,systemd,Open Grid Scheduler/Grid Engine,^[20] and Google's developmentally defunctlmctfy.

The Linux kernel documentation contains some technical details of the setup and use of control groups version 1^[21] and version 2.^[1]

Interfaces

Both versions of cgroup act through a pseudo-filesystem (cgroup for v1 andcgroup2 for v2). Like all filesystems they can bemounted on any path, but the general convention is to mount one of the versions (generally v2) on/sys/fs/cgroup under thesysfs default location of/sys. As mentioned before the two cgroup versions can be active at the same time; this too applies to the filesystems so long as they are mounted to a different path.^[21]^[1] For the description below we assume a setup where the v2 hierarchy lies in/sys/fs/cgroup. The v1 hierarchy, if ever required, will be mounted at a different location.

At initialization cgroup2 should have no defined control groups except the top-level one. In other words,/sys/fs/cgroup should have no directories, only a number of files that control the system as a whole. At this point, runningls /sys/fs/cgroup could list the following on one example system:

cgroup.controllers
cgroup.max.depth
cgroup.max.descendants
cgroup.pressure
cgroup.procs
cgroup.stat
cgroup.subtree_control
cgroup.threads
cpu.pressure
cpuset.cpus.effective
cpuset.cpus.isolated
cpuset.mems.effective
cpu.stat
cpu.stat.local
io.cost.model
io.cost.qos
io.pressure
io.prio.class
io.stat
irq.pressure
memory.numa_stat
memory.pressure
memory.reclaim
memory.stat
memory.zswap.writeback
misc.capacity
misc.current
misc.peak

These files are named according to the controllers that handle them. For example,cgroup.* deal with the cgroup system itself andmemory.* deal with the memory subsystem. Example: to request the kernel to 1 gigabyte of memory from anywhere in the system, one can runecho "1G swappiness=50" > /sys/fs/cgroup/memory.reclaim.^[1]

To create a subgroup, one simply creates a new directory under an existing group (including the top-level one). The files corresponding to available controls for this group are automatically created.^[1] For example, runningmkdir /sys/fs/cgroup/example; ls /sys/fs/cgroup/example would produce a list of files largely similar to the one above, but with noticeable changes. On one example system, these files are added:

These changes are not unexpected because some controls and statistics only make sense on a subset of processes (e.g.nice level being the CPU priority of processes relative to the rest of the system).^[1]

Processes are assigned to subgroups by writing to/proc/<PID>/cgroup. The cgroup a process is in can be found by reading the same file.^[1]

On systems based onsystemd, a hierarchy of subgroups is predefined to encapsulate every process directly and indirectly launched by systemd under a subgroup: the very basis of how systemd manages processes. An explanation of the nomenclature of these groups can be found in theRed Hat Enterprise Linux 7 manual.^[22] Red Hat also provides a guide on creating a systemd service file that causes a process to run in a separate cgroup.^[23]

systemd-cgtop^[24] command can be used to show top control groups by their resource usage.

V1 coexistence

On a system with v2, v1 can still be mounted and given access to controllersnot in use by v2. However, a modern system typically already places all controllers in use in v2, so there is no controller available for v1 at all even if a hierarchy is created. It is possible to clear all uses of a controller from v2 and hand it to v1, but moving controllers between hierarchies after the system is up and running is cumbersome and not recommended.^[1]

Major evolutions

Redesigns of v1

Redesign of cgroups started in 2013,^[25] with additional changes brought by versions 3.15 and 3.16 of the Linux kernel.^[26]^[27]^[28]

The following changes concern the kernel before 4.5/4.6, i.e. when cgroups-v2 were added. In other words they describe how cgroups-v1 had been changed, though most of them have also been inherited into v2 (after all, v1 and v2 share the same codebase).

Namespace isolation

Main article:Linux namespaces

While not technically part of the cgroups work, a related feature of the Linux kernel isnamespace isolation, where groups of processes are separated such that they cannot "see" resources in other groups. For example, a PID namespace provides a separate enumeration ofprocess identifiers within each namespace. Also available are mount, user, UTS (Unix Time Sharing), network and SysV IPC namespaces.

ThePID namespace provides isolation for the allocation ofprocess identifiers (PIDs), lists of processes and their details. While the new namespace is isolated from other siblings, processes in its "parent" namespace still see all processes in child namespaces—albeit with different PID numbers.^[29]
Network namespace isolates thenetwork interface controllers (physical or virtual),iptables firewall rules, routing tables etc. Network namespaces can be connected with each other using the "veth" virtual Ethernet device.^[30]
"UTS" namespace allows changing thehostname.
Mount namespace allows creating a different file system layout, or making certain mount points read-only.^[31]
IPC namespace isolates the System Vinter-process communication between namespaces.
User namespace isolates the user IDs between namespaces.^[32]
Cgroup namespace^[33]

Namespaces are created with the "unshare" command orsyscall, or as "new" flags in a "clone" syscall.^[34]

The "ns" subsystem was added early in cgroups development to integrate namespaces and control groups. If the "ns" cgroup was mounted, each namespace would also create a new group in the cgroup hierarchy. This was an experiment that was later judged to be a poor fit for the cgroups API, and removed from the kernel.

Linux namespaces were inspired by the more general namespace functionality used heavily throughoutPlan 9 from Bell Labs.^[35]

Conversion to kernfs

Kernfs was introduced into the Linux kernel with version 3.14 in March 2014, the main author being Tejun Heo.^[36] One of the main motivators for a separate kernfs is the cgroups file system. Kernfs is basically created by splitting off some of thesysfs logic into an independent entity, thus easing for other kernel subsystems the implementation of their own virtual file system with handling for device connect and disconnect, dynamic creation and removal, and other attributes. This does not affect how cgroups is used, but makes maintaining the code easier.^[37]

New features introduced during v1

Kernel memory control groups (kmemcg) were merged into version 3.8 (2013 February 18; 13 years ago (18-02-2013)) of theLinux kernel mainline.^[38]^[39]^[4] The kmemcg controller can limit the amount of memory that the kernel can utilize to manage its own internal processes.

Support for per-groupnetfilter setup was added in 2014.^[5]

The unified hierarchy was added in 2014. It repurposes of v1's dummy hierarchy to hold all controllers not yet used by others. This changed dummy hierarchy would become the only available hierarchy in v2.^[7]

Changes after v2

Unlike v1, cgroup v2 has only a single process hierarchy and discriminates between processes, not threads.

cgroup awareness of OOM killer

Linux Kernel 4.19 (October 2018) introduced cgroup awareness ofOOM killer implementation which adds an ability to kill a cgroup as a single unit and so guarantee the integrity of the workload.^[6]

Adoption

Various projects use cgroups as their basis, includingCoreOS,Docker (in 2013),Hadoop,Jelastic,Kubernetes,^[40]lmctfy (Let Me Contain That For You),LXC (Linux Containers),systemd,Mesos and Mesosphere,^[40]HTCondor, andFlatpak.

Major Linux distributions also adopted it such asRed Hat Enterprise Linux (RHEL) 6.0 in November 2010, three years before adoption by the mainline Linux kernel.^[41]

On 29 October 2019, theFedora Project modified Fedora 31 to use CgroupsV2 by default.^[42]

See also

Operating system–level virtualization implementations
Process group
Tc (Linux) – a traffic control utility slightly overlapping in functionality with network-oriented cgroup settings
Job object – the equivalentWindows concept, as managed by that platform’sObject Manager

References

^^a ^b ^c ^d ^e ^f ^g ^h ⁱ ^j
"Control Group v2".docs.kernel.org.
Sections referenced in this document:
- Controllers
- Issues with v1 and Rationales for v2
^Jonathan Corbet (29 May 2007)."Process containers". LWN.net.
^^a ^bJonathan Corbet (29 October 2007)."Notes from a container".LWN.net. Retrieved14 April 2015.The original 'containers' name was considered to be too generic – this code is an important part of a container solution, but it's far from the whole thing. So containers have now been renamed 'control groups' (or 'cgroups') and merged for 2.6.24.
^^a ^b"memcg: add documentation about the kmem controller".kernel.org. 18 December 2012.
^^a ^b"netfilter: x_tables: lightweight process control group matching". 23 April 2014. Archived fromthe original on 24 April 2014.
^^a ^b"Linux_4.19 - Linux Kernel Newbies".
^^a ^b"cgroup: prepare for the default unified hierarchy". 13 March 2014.
^^a ^b"Documentation/cgroup-v2.txt as appeared in Linux kernel 4.5". 14 March 2016.
^"diff between Linux kernel 4.4 and 4.5". 14 March 2016.
^Jonathan Corbet (31 July 2007)."Controlling memory use in containers". LWN.
^Balbir Singh, Vaidynathan Srinivasan (July 2007)."Containers: Challenges with the memory resource controller and its performance"(PDF). Ottawa Linux Symposium.
^Carvalho, André (18 October 2017)."Using cgroups to limit I/O".andrestc.com. Retrieved12 September 2022.
^Luu, Dan."The container throttling problem".danluu.com. Retrieved12 September 2022.
^Derr, Simon (2004)."CPUSETS". Retrieved12 September 2022.
^"setrlimit(2) — Arch manual pages".man.archlinux.org. Retrieved27 November 2023.
^Jonathan Corbet (23 October 2007)."Kernel space: Fair user scheduling for Linux". Network World. Archived fromthe original on 19 October 2013. Retrieved22 August 2012.
^Kamkamezawa Hiroyu (19 November 2008).Cgroup and Memory Resource Controller(PDF). Japan Linux Symposium. Archived fromthe original(PDF presentation slides) on 22 July 2011.
^^a ^bHansen D, IBM Linux Technology Center (2009).Resource Management(PDF presentation slides). Linux Foundation.
^Matt Helsley (3 February 2009)."LXC: Linux container tools". IBM developerWorks.
^"Grid Engine cgroups Integration". Scalable Logic. 22 May 2012.
^^a ^b"Control Groups version 1".docs.kernel.org.
^"1.2. Default Cgroup Hierarchies | Resource Management Guide | Red Hat Enterprise Linux | 7 | Red Hat Documentation".docs.redhat.com.
^"Managing cgroups with systemd".www.redhat.com.
^"Systemd-cgtop".
^"All About the Linux Kernel: Cgroup's Redesign".Linux.com. 15 August 2013. Archived fromthe original on 28 April 2019. Retrieved19 May 2014.
^"The unified control group hierarchy in 3.16".LWN.net. 11 June 2014.
^"Pull cgroup updates for 3.15 from Tejun Heo".kernel.org. 3 April 2014.
^"Pull cgroup updates for 3.16 from Tejun Heo".kernel.org. 9 June 2014.
^Pavel Emelyanov, Kir Kolyshkin (19 November 2007)."PID namespaces in the 2.6.24 kernel". LWN.net.
^Jonathan Corbet (30 January 2007)."Network namespaces". LWN.net.
^Serge E. Hallyn, Ram Pai (17 September 2007)."Applying mount namespaces". IBM developerWorks.
^Michael Kerrisk (27 February 2013)."Namespaces in operation, part 5: User namespaces". lwn.net Linux Info from the Source.
^"LKML: Linus Torvalds: Linux 4.6-rc1".
^Janak Desai (11 January 2006)."Linux kernel documentation on unshare".
^"The Use of Name Spaces in Plan 9". 1992. Archived fromthe original on 6 September 2014. Retrieved15 February 2015.
^"kernfs, sysfs, driver-core: implement synchronous self-removal".LWN.net. 3 February 2014. Retrieved7 April 2014.
^"Linux kernel source tree: kernel/git/torvalds/linux.git: cgroups: convert to kernfs".kernel.org. 11 February 2014. Retrieved23 May 2014.
^"memcg: kmem controller infrastructure".kernel.org source code. 18 December 2012.
^"memcg: kmem accounting basic infrastructure".kernel.org source code. 18 December 2012.
^^a ^b"Mesosphere to Bring Google's Kubernetes to Mesos". Mesosphere.io. 10 July 2014. Archived fromthe original on 6 September 2015. Retrieved13 July 2014.
^"Red Hat Enterprise Linux - 6.0 Release Notes"(PDF).redhat.com. Retrieved12 September 2023.
^"1732114 – Modify Fedora 31 to use CgroupsV2 by default".

External links

Official Linux kernel documentation on cgroups v1 andcgroups v2
Red Hat Resource Management Guide on cgroups
Ubuntu manpage on cgroups Archived 9 August 2021 at theWayback Machine
Linux kernel Namespaces and cgroups by Rami Rosen (2013)
Namespaces and cgroups, the basis of Linux containers (including cgroups v2), slides of a talk by Rami Rosen, Netdev 1.1, Seville, Spain, 2016
Understanding the new control groups API,LWN.net, by Rami Rosen, March 2016
Large-scale cluster management at Google with Borg, April 2015, by Abhishek Verma, Luis Pedrosa, Madhukar Korupolu, David Oppenheimer, Eric Tune and John Wilkes
Job Objects, similar feature on Windows

v
t
e

Virtualization software

Comparison of platform virtualization software

Hardware
(hypervisors)

Native

Hosted

Specialized	Basilisk II Bochs Cooperative Linux DOSBox DOSEMU 86Box PCem PikeOS SheepShaver SIMH Windows on Windows Virtual DOS machine Win4Lin
Independent	bhyve Microsoft Virtual Server Parallels Workstation Extreme Parallels Desktop for Mac Parallels Server for Mac PearPC QEMU UTM VirtualBox Virtual Iron Virtual PC VMware Fusion VMware Server VMware Workstation Player

Tools

Operating
system

OScontainers	FreeBSD jail iCore Virtual Accounts Linux-VServer Linux Containers OpenVZ Solaris Containers Virtuozzo Workload Partitions
Application containers	Docker Podman lmctfy Distrobox rkt
Virtual kernel architectures	Rump kernel User-mode Linux vkernel
Related kernel features	BrandZ cgroups chroot namespaces eBPF seccomp
Orchestration	Amazon ECS Kubernetes OpenShift

Distributed Overlay Virtual Ethernet (DOVE)
Ethernet VPN (EVPN)
NVGRE
Open vSwitch
Virtual security switch
Virtual Extensible LAN (VXLAN)
Generic Network Virtualization Encapsulation (GENEVE)

See also

BlueStacks

See also:List of emulators,List of computer system emulators

v
t
e

Organization

Kernel	Linux Foundation Linux Mark Institute Linus's law Tanenbaum–Torvalds debate Tux SCO disputes Linaro GNU GPL v2 menuconfig Supported computer architectures Version history Criticism
Support	Developers The Linux Programming Interface kernel.org LKML Linux conferences Users Linux User Group (LUG)
People	Werner Almesberger H. Peter Anvin Jens Axboe Moshe Bar Suparna Bhattacharya Andries Brouwer Rémy Card Alan Cox Matthew Garrett Avi Kivity Con Kolivas Greg Kroah-Hartman Benson Leung Robert Love David S. Miller Ingo Molnár Andrew Morton Hans Reiser Rusty Russell Shuah Khan Linus Torvalds Theodore Ts'o Stephen Tweedie Harald Welte Chris Wright

Technical

Debugging

Kernel

System Call Interface	POSIX ioctl select open read close sync … Linux-only futex epoll splice dnotify inotify readahead …
In-kernel	ALSA Crypto API io_uring DRM kernfs Memory barrier New API RCU Video4Linux IIO

Daemons, File systems	bpffs configfs devfs devpts debugfs FUSE hugetlbfs pipefs procfs securityfs sockfs sysfs tmpfs systemd udev Kmscon binfmt_misc
Wrapper libraries	C standard library glibc uClibc Bionic libhybris dietlibc EGLIBC klibc musl Newlib libcgroup libdrm libalsa libevdev libusb liburing

Components

Variants

Virtualization	Hypervisor KVM Xen OS-level virtualization Linux-VServer Lguest LXC OpenVZ Other L4Linux User-mode Linux MkLinux coLinux

Range of use	Desktop Embedded Gaming Thin client: LTSP Server: LAMP LYME-LYCE Devices
Adopters	List of Linux adopters

Linux portal
Free and open-source software portal
Category

Retrieved from "https://en.wikipedia.org/w/index.php?title=Cgroups&oldid=1335456376"

Hidden categories:

[8]ページ先頭

©2009-2026 Movatter.jp