OS-level virtualization is an operating system (OS) virtualization paradigm in which the kernel allows the existence of multiple isolated user space instances, including containers (LXC, Solaris Containers, AIX WPARs, HP-UX SRP Containers, Docker, Podman, Guix), zones (Solaris Containers), virtual private servers (OpenVZ), partitions, virtual environments (VEs), virtual kernels (DragonFly BSD), and jails (FreeBSD jail and chroot).[1] Such instances may look like real computers from the point of view of programs running in them. A computer program running on an ordinary operating system can see all resources (connected devices, files and folders, network shares, CPU power, quantifiable hardware capabilities) of that computer. Programs running inside a container can only see the container's contents and devices assigned to the container.
On Unix-like operating systems, this feature can be seen as an advanced implementation of the standard chroot mechanism, which changes the apparent root folder for the current running process and its children. In addition to isolation mechanisms, the kernel often provides resource-management features to limit the impact of one container's activities on other containers. Linux containers are all based on the virtualization, isolation, and resource management mechanisms provided by the Linux kernel, notably Linux namespaces and cgroups.[2]
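The idea of an "apparent root folder" can be illustrated with a small sketch. This is a toy model only: a real chroot is enforced by the kernel rather than by string handling, and the function name `to_host_path` and the directory `/srv/jail` are hypothetical names chosen for illustration. Symbolic links are ignored.

```python
import posixpath


def to_host_path(jail_root: str, path_in_jail: str) -> str:
    """Map a path as seen by a jailed process to the host path it would name.

    Toy model of a changed apparent root; symbolic links are not considered.
    """
    # Anchor the path at the jail's own "/" and normalize it. For an absolute
    # path, normpath drops leading ".." components, so "/.." stays at "/".
    inside = posixpath.normpath(posixpath.join("/", path_in_jail))
    relative = inside.lstrip("/")
    # Re-anchor the normalized path under the jail directory on the host.
    return posixpath.join(jail_root, relative) if relative else jail_root


print(to_host_path("/srv/jail", "/etc/passwd"))
print(to_host_path("/srv/jail", "../../etc/passwd"))
```

In this model, both calls name the same file under `/srv/jail`, because `..` at the apparent root leads back to the apparent root rather than to the host's real root.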
Although the word container most commonly refers to OS-level virtualization, it is sometimes used to refer to fuller virtual machines operating in varying degrees of concert with the host OS,[citation needed] such as Microsoft's Hyper-V containers.[citation needed] For an overview of virtualization since 1960, see Timeline of virtualization technologies.
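On ordinary operating systems for personal computers, a computer program can see (even though it might not be able to access) all the system's resources. These include connected devices, files and folders, network shares, CPU power, and other quantifiable hardware capabilities.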
The operating system may be able to allow or deny access to such resources based on which program requests them and the user account in whose context it runs. The operating system may also hide those resources, so that when the computer program enumerates them, they do not appear in the enumeration results. Nevertheless, from a programming point of view, the computer program has interacted with those resources and the operating system has managed an act of interaction.
With operating-system-level virtualization, or containerization, it is possible to run programs within containers, to which only parts of these resources are allocated. A program expecting to see the whole computer, once run inside a container, can only see the allocated resources and believes them to be all that is available. Several containers can be created on each operating system, to each of which a subset of the computer's resources is allocated. Each container may contain any number of computer programs. These programs may run concurrently or separately, and may even interact with one another.
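The allocation of resource subsets described above can be modeled in a few lines. This is a minimal sketch, not how any kernel implements isolation; the class `Host`, its methods, and the resource names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """Toy host: a set of named resources plus per-container allocations."""

    resources: frozenset
    allocations: dict = field(default_factory=dict)

    def create_container(self, name, allocated):
        allocated = frozenset(allocated)
        # A container can only be given resources that exist on the host.
        unknown = allocated - self.resources
        if unknown:
            raise ValueError(f"not present on host: {sorted(unknown)}")
        self.allocations[name] = allocated

    def enumerate_resources(self, container=None):
        # Outside any container the whole machine is visible; inside one,
        # enumeration returns only what was allocated to that container.
        if container is None:
            return sorted(self.resources)
        return sorted(self.allocations[container])


host = Host(frozenset({"cpu0", "cpu1", "eth0", "/srv/web", "/srv/db"}))
host.create_container("web", {"cpu0", "eth0", "/srv/web"})
host.create_container("db", {"cpu1", "/srv/db"})
print(host.enumerate_resources("web"))
```

A program asking for the resource list from inside the hypothetical `web` container has no way, in this model, to learn that `cpu1` or `/srv/db` exist.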
Containerization has similarities to application virtualization: in the latter, only one computer program is placed in an isolated container and the isolation applies to the file system only.
Operating-system-level virtualization is commonly used in virtual hosting environments, where it is useful for securely allocating finite hardware resources among a large number of mutually distrusting users. System administrators may also use it for consolidating server hardware by moving services on separate hosts into containers on a single server.
Other typical scenarios include separating several programs into separate containers for improved security, hardware independence, and added resource management features.[3] The improved security provided by the use of a chroot mechanism, however, is not perfect.[4] Operating-system-level virtualization implementations capable of live migration can also be used for dynamic load balancing of containers between nodes in a cluster.
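As a rough illustration of load balancing by migration, the sketch below moves one container at a time from the busiest node to the least busy one. The policy, the function names, and the integer load figures are assumptions made for this example; they are not taken from any particular implementation, and the "migration" here is just moving a dictionary entry.

```python
def node_load(containers: dict) -> int:
    """Total load of a node, given a mapping of container name to load."""
    return sum(containers.values())


def rebalance_once(cluster: dict):
    """Migrate at most one container from the busiest node to the least busy.

    Returns (container, source, destination), or None if no single move
    would reduce the imbalance between those two nodes.
    """
    src = max(cluster, key=lambda n: node_load(cluster[n]))
    dst = min(cluster, key=lambda n: node_load(cluster[n]))
    gap = node_load(cluster[src]) - node_load(cluster[dst])
    # Moving a container narrows the gap only if its load is below the gap.
    candidates = {c: load for c, load in cluster[src].items() if 0 < load < gap}
    if not candidates:
        return None
    # Prefer the container whose load is closest to half the gap.
    name = min(candidates, key=lambda c: abs(candidates[c] - gap / 2))
    cluster[dst][name] = cluster[src].pop(name)
    return name, src, dst


cluster = {
    "node1": {"web": 50, "db": 30, "cache": 10},
    "node2": {"batch": 20},
}
print(rebalance_once(cluster))
```

Calling `rebalance_once` repeatedly until it returns `None` gives a simple greedy balancer; a real scheduler would also weigh migration cost, memory, and placement constraints.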
Operating-system-level virtualization usually imposes less overhead than full virtualization because programs in OS-level virtual partitions use the operating system's normal system call interface and do not need to be subjected to emulation or be run in an intermediate virtual machine, as is the case with full virtualization (such as VMware ESXi, QEMU, or Hyper-V) and paravirtualization (such as Xen or User-mode Linux). This form of virtualization also does not require hardware support for efficient performance.
Operating-system-level virtualization is not as flexible as other virtualization approaches since it cannot host a guest operating system different from the host one, or a different guest kernel. For example, with Linux, different distributions are fine, but other operating systems such as Windows cannot be hosted. Operating systems using variable input systematics are subject to limitations within the virtualized architecture. Adaptation methods including cloud-server relay analytics maintain the OS-level virtual environment within these applications.[5]
Solaris partially overcomes the limitation described above with its branded zones feature, which provides the ability to run an environment within a container that emulates an older Solaris 8 or 9 version in a Solaris 10 host. Linux branded zones (referred to as "lx" branded zones) are also available on x86-based Solaris systems, providing a complete Linux user space and support for the execution of Linux applications; additionally, Solaris provides utilities needed to install Red Hat Enterprise Linux 3.x or CentOS 3.x Linux distributions inside "lx" zones.[6][7] However, in 2010 Linux branded zones were removed from Solaris; in 2014 they were reintroduced in Illumos, which is the open source Solaris fork, supporting 32-bit Linux kernels.[8]
Some implementations provide file-level copy-on-write (CoW) mechanisms. (Most commonly, a standard file system is shared between partitions, and those partitions that change the files automatically create their own copies.) This approach is easier to back up, more space-efficient and simpler to cache than the block-level copy-on-write schemes common on whole-system virtualizers. Whole-system virtualizers, however, can work with non-native file systems and create and roll back snapshots of the entire system state.
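File-level copy-on-write can be sketched with an in-memory model in which each partition reads from a shared set of files until it modifies one, at which point it keeps a private copy. The class `CowPartition` and the file names are hypothetical, and whole files are represented as strings for simplicity.

```python
class CowPartition:
    """Toy partition sharing a base file set with file-level copy-on-write."""

    def __init__(self, shared_files: dict):
        self._shared = shared_files  # shared base; never modified by this class
        self._private = {}           # files this partition has changed

    def read(self, path: str) -> str:
        # A private copy, if one exists, shadows the shared file.
        if path in self._private:
            return self._private[path]
        return self._shared[path]

    def append(self, path: str, data: str) -> None:
        # First modification: copy the whole file into the private area,
        # then change only the copy. The shared file is left untouched.
        if path not in self._private:
            self._private[path] = self._shared.get(path, "")
        self._private[path] += data

    def private_paths(self) -> list:
        return sorted(self._private)


base = {"/etc/motd": "hello\n"}
a = CowPartition(base)
b = CowPartition(base)
a.append("/etc/motd", "from a\n")
print(repr(a.read("/etc/motd")), repr(b.read("/etc/motd")))
```

Only the changed file occupies extra space in partition `a`, which is why, in this model, backing up a partition amounts to saving its private files.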
| Mechanism | Operating system | License | Start of development | File system isolation | Copy on write | Disk quotas | I/O rate limiting | Memory limits | CPU quotas | Network isolation | Nested virtualization | Partition checkpointing and live migration | Root privilege isolation |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| chroot | Most UNIX-like operating systems | Varies by operating system | 1982 | Partial[a] | No | No | No | No | No | No | Yes | No | No |
| Docker | Linux,[10] Windows x64,[11] macOS[12] | Apache License 2.0 | 2013 | Yes | Yes | Partial[b] | Yes (since 1.10) | Yes | Yes | Yes | Yes | Only in experimental mode with CRIU[1] | Yes (since 1.10) |
| Podman | Linux, Windows, macOS, FreeBSD | Apache License 2.0 | 2018 | Yes | Yes | Yes[14] | Yes | Yes | Yes | Yes | Yes | Yes[15] | Yes |
| LXC | Linux | GNU GPLv2 | 2008 | Yes[16] | Yes | Partial[c] | Partial[d] | Yes | Yes | Yes | Yes | Yes | Yes[16] |
| Apptainer (formerly Singularity[17]) | Linux | BSD Licence | 2015[18] | Yes[19] | Yes | Yes | No | No | No | No | No | No | Yes[20] |
| OpenVZ | Linux | GNU GPLv2 | 2005 | Yes | Yes[21] | Yes | Yes[e] | Yes | Yes | Yes[f] | Partial[g] | Yes | Yes[h] |
| Virtuozzo | Linux, Windows | Trialware | 2000[25] | Yes | Yes | Yes | Yes[i] | Yes | Yes | Yes[f] | Partial[j] | Yes | Yes |
| Solaris Containers (Zones) | illumos (OpenSolaris), Solaris | CDDL, Proprietary | 2004 | Yes | Yes (ZFS) | Yes | Partial[k] | Yes | Yes | Yes[l][28][29] | Partial[m] | Partial[n][o] | Yes[p] |
| FreeBSD jail | FreeBSD, DragonFly BSD | BSD License | 2000[31] | Yes | Yes (ZFS) | Yes[q] | Yes | Yes[32] | Yes | Yes[33] | Yes | Partial[34][35] | Yes[36] |
| vkernel | DragonFly BSD | BSD Licence | 2006[37] | Yes[38] | Yes[38] | — | ? | Yes[39] | Yes[39] | Yes[40] | ? | ? | Yes |
| WPARs | AIX | Commercial proprietary software | 2007 | Yes | No | Yes | Yes | Yes | Yes | Yes[r] | No | Yes[42] | ? |
| iCore Virtual Accounts | Windows XP | Freeware | 2008 | Yes | No | Yes | No | No | No | No | ? | No | ? |
| Sandboxie | Windows | GNU GPLv3 | 2004 | Yes | Yes | Partial | No | No | No | Partial | No | No | Yes |
| systemd-nspawn | Linux | GNU LGPLv2.1+ | 2010 | Yes | Yes | Yes[43][44] | Yes[43][44] | Yes[43][44] | Yes[43][44] | Yes | ? | ? | Yes |
| Turbo | Windows | Freemium | 2012 | Yes | No | No | No | No | No | Yes | No | No | Yes |
| Mechanism | Operating system | License | Actively developed since or between | File system isolation | Copy on write | Disk quotas | I/O rate limiting | Memory limits | CPU quotas | Network isolation | Nested virtualization | Partition checkpointing and live migration | Root privilege isolation |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Linux-VServer (security context) | Linux, Windows Server 2016 | GNU GPLv2 | 2001–2018 | Yes | Yes | Yes | Yes[s] | Yes | Yes | Partial[t] | ? | No | Partial[u] |
| lmctfy | Linux | Apache License 2.0 | 2013–2015 | Yes | Yes | Yes | Yes[s] | Yes | Yes | Partial[t] | ? | No | Partial[u] |
| sysjail | OpenBSD, NetBSD | BSD License | 2006–2009 | Yes | No | No | No | No | No | Yes | No | No | ? |
| rkt (rocket) | Linux | Apache License 2.0 | 2014[46]–2018 | Yes | Yes | Yes | Yes | Yes | Yes | Yes | ? | ? | Yes |