libcontainer
README
libcontainer
Libcontainer provides a native Go implementation for creating containers with namespaces, cgroups, capabilities, and filesystem access controls. It allows you to manage the lifecycle of the container, performing additional operations after the container is created.
Container
A container is a self-contained execution environment that shares the kernel of the host system and which is (optionally) isolated from other containers in the system.
Using libcontainer
Container init
Because containers are spawned in a two-step process, you will need a binary that will be executed as the init process for the container. In libcontainer, we use the current binary (/proc/self/exe) as the init process and pass it the argument "init". We call this first step "bootstrap", so you always need an "init" function as the entry point of "bootstrap".
In addition to the Go init function, the early-stage bootstrap is handled by importing nsenter.
For details on how runc implements such an "init", see init.go and libcontainer/init_linux.go.
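The re-exec pattern described above can be sketched with the standard library alone. This is a simplified illustration, not runc's actual code: isInitInvocation and bootstrapCommand are hypothetical helper names, the check is done in main rather than in a Go init function, and a real program would hand control to libcontainer where the comment indicates.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// isInitInvocation reports whether the binary was started as the container
// init process, i.e. with "init" as its first argument.
// Hypothetical helper, for illustration only.
func isInitInvocation(args []string) bool {
	return len(args) > 1 && args[1] == "init"
}

// bootstrapCommand builds (but does not start) a command that re-executes
// the current binary via /proc/self/exe with the "init" argument.
func bootstrapCommand() *exec.Cmd {
	return exec.Command("/proc/self/exe", "init")
}

func main() {
	if isInitInvocation(os.Args) {
		// A real program would call into libcontainer's init entry point here.
		fmt.Println("running as container init")
		return
	}
	cmd := bootstrapCommand()
	fmt.Println("would re-exec:", cmd.Args)
}
```

The sketch only builds the command; starting it is left out because the second stage needs the namespace setup that libcontainer performs.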
Device management
If you want containers that have access to some devices, you need to import this package into your code:
    import (
        _ "github.com/opencontainers/cgroups/devices"
    )
Without doing this, the libcontainer cgroup manager won't be able to set up device access rules, and will fail if devices are specified in the container configuration.
Container creation
To create a container you first have to create a configuration struct describing how the container is to be created. A sample would look similar to this:
    defaultMountFlags := unix.MS_NOEXEC | unix.MS_NOSUID | unix.MS_NODEV
    var devices []*devices.Rule
    for _, device := range specconv.AllowedDevices {
        devices = append(devices, &device.Rule)
    }
    config := &configs.Config{
        Rootfs: "/your/path/to/rootfs",
        Capabilities: &configs.Capabilities{
            Bounding: []string{
                "CAP_KILL",
                "CAP_AUDIT_WRITE",
            },
            Effective: []string{
                "CAP_KILL",
                "CAP_AUDIT_WRITE",
            },
            Permitted: []string{
                "CAP_KILL",
                "CAP_AUDIT_WRITE",
            },
        },
        Namespaces: configs.Namespaces([]configs.Namespace{
            {Type: configs.NEWNS},
            {Type: configs.NEWUTS},
            {Type: configs.NEWIPC},
            {Type: configs.NEWPID},
            {Type: configs.NEWUSER},
            {Type: configs.NEWNET},
            {Type: configs.NEWCGROUP},
        }),
        Cgroups: &configs.Cgroup{
            Name:   "test-container",
            Parent: "system",
            Resources: &configs.Resources{
                MemorySwappiness: nil,
                Devices:          devices,
            },
        },
        MaskPaths: []string{
            "/proc/kcore",
            "/sys/firmware",
        },
        ReadonlyPaths: []string{
            "/proc/sys", "/proc/sysrq-trigger", "/proc/irq", "/proc/bus",
        },
        Devices:  specconv.AllowedDevices,
        Hostname: "testing",
        Mounts: []*configs.Mount{
            {
                Source:      "proc",
                Destination: "/proc",
                Device:      "proc",
                Flags:       defaultMountFlags,
            },
            {
                Source:      "tmpfs",
                Destination: "/dev",
                Device:      "tmpfs",
                Flags:       unix.MS_NOSUID | unix.MS_STRICTATIME,
                Data:        "mode=755",
            },
            {
                Source:      "devpts",
                Destination: "/dev/pts",
                Device:      "devpts",
                Flags:       unix.MS_NOSUID | unix.MS_NOEXEC,
                Data:        "newinstance,ptmxmode=0666,mode=0620,gid=5",
            },
            {
                Device:      "tmpfs",
                Source:      "shm",
                Destination: "/dev/shm",
                Data:        "mode=1777,size=65536k",
                Flags:       defaultMountFlags,
            },
            {
                Source:      "mqueue",
                Destination: "/dev/mqueue",
                Device:      "mqueue",
                Flags:       defaultMountFlags,
            },
            {
                Source:      "sysfs",
                Destination: "/sys",
                Device:      "sysfs",
                Flags:       defaultMountFlags | unix.MS_RDONLY,
            },
        },
        UIDMappings: []configs.IDMap{
            {
                ContainerID: 0,
                HostID:      1000,
                Size:        65536,
            },
        },
        GIDMappings: []configs.IDMap{
            {
                ContainerID: 0,
                HostID:      1000,
                Size:        65536,
            },
        },
        Networks: []*configs.Network{
            {
                Type:    "loopback",
                Address: "127.0.0.1/0",
                Gateway: "localhost",
            },
        },
        Rlimits: []configs.Rlimit{
            {
                Type: unix.RLIMIT_NOFILE,
                Hard: uint64(1025),
                Soft: uint64(1025),
            },
        },
    }
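To make the UIDMappings and GIDMappings entries in the sample concrete: each entry maps a contiguous range of container IDs onto host IDs. The following is a minimal sketch of that arithmetic, using a local idMap type that mirrors the three fields shown in the sample. Both the type and the toHostID helper are illustrative and not part of libcontainer.

```go
package main

import "fmt"

// idMap mirrors the ContainerID/HostID/Size fields of the sample mappings.
// Local stand-in type, for illustration only.
type idMap struct {
	ContainerID int64
	HostID      int64
	Size        int64
}

// toHostID translates an ID inside the container to the corresponding host
// ID. The boolean is false when the ID falls outside the mapped range.
func toHostID(m idMap, id int64) (int64, bool) {
	if id < m.ContainerID || id >= m.ContainerID+m.Size {
		return 0, false
	}
	return m.HostID + (id - m.ContainerID), true
}

func main() {
	m := idMap{ContainerID: 0, HostID: 1000, Size: 65536}
	for _, id := range []int64{0, 1, 65535, 65536} {
		host, ok := toHostID(m, id)
		fmt.Println(id, host, ok)
	}
}
```

With the sample values, root inside the container (ID 0) corresponds to host ID 1000, and ID 65536 is the first one that has no mapping.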
Once you have the configuration populated you can create a container with a specified ID under a specified state directory:
    container, err := libcontainer.Create("/run/containers", "container-id", config)
    if err != nil {
        logrus.Fatal(err)
        return
    }
To spawn bash as the initial process inside the container and have the process's pid returned in order to wait, signal, or kill the process:
    process := &libcontainer.Process{
        Args:   []string{"/bin/bash"},
        Env:    []string{"PATH=/bin"},
        User:   "daemon",
        Stdin:  os.Stdin,
        Stdout: os.Stdout,
        Stderr: os.Stderr,
        Init:   true,
    }
    err := container.Run(process)
    if err != nil {
        container.Destroy()
        logrus.Fatal(err)
        return
    }
    // wait for the process to finish.
    // err is already declared above, so assign with = rather than :=.
    _, err = process.Wait()
    if err != nil {
        logrus.Fatal(err)
    }
    // destroy the container.
    container.Destroy()
Additional ways to interact with a running container are:
    // return all the pids for all processes running inside the container.
    processes, err := container.Processes()
    // get detailed cpu, memory, io, and network statistics for the container and
    // its processes.
    stats, err := container.Stats()
    // pause all processes inside the container.
    container.Pause()
    // resume all paused processes.
    container.Resume()
    // send signal to container's init process.
    container.Signal(signal)
    // update container resource constraints.
    // Set takes a configs.Config value, so dereference the *configs.Config.
    container.Set(*config)
    // get current status of the container.
    status, err := container.Status()
    // get current container's state information.
    state, err := container.State()
Checkpoint & Restore
libcontainer now integrates CRIU for checkpointing and restoring containers. This lets you save the state of a process running inside a container to disk, and then restore that state into a new process, on the same machine or on another machine.
criu version 1.5.2 or higher is required to use checkpoint and restore. If you don't already have criu installed, you can build it from source, following the online instructions. criu is also installed in the docker image generated when building libcontainer with docker.
Copyright and license
Code and documentation copyright 2014 Docker, inc. The code and documentation are released under the Apache 2.0 license. The documentation is also released under Creative Commons Attribution 4.0 International License. You may obtain a copy of the license, titled CC-BY-4.0, at http://creativecommons.org/licenses/by/4.0/.
Documentation
Overview
Package libcontainer provides a native Go implementation for creating containers with namespaces, cgroups, capabilities, and filesystem access controls. It allows you to manage the lifecycle of the container, performing additional operations after the container is created.
Index
- Constants
- Variables
- func Init()
- type BaseState
- type Boolmsg
- type Bytemsg
- type Container
- func (c *Container) Checkpoint(criuOpts *CriuOpts) error
- func (c *Container) Config() configs.Config
- func (c *Container) Destroy() error
- func (c *Container) Exec() error
- func (c *Container) ID() string
- func (c *Container) NotifyMemoryPressure(level PressureLevel) (<-chan struct{}, error)
- func (c *Container) NotifyOOM() (<-chan struct{}, error)
- func (c *Container) OCIState() (*specs.State, error)
- func (c *Container) Pause() error
- func (c *Container) Processes() ([]int, error)
- func (c *Container) Restore(process *Process, criuOpts *CriuOpts) error
- func (c *Container) Resume() error
- func (c *Container) Run(process *Process) error
- func (c *Container) Set(config configs.Config) error
- func (c *Container) Signal(s os.Signal) error
- func (c *Container) Start(process *Process) error
- func (c *Container) State() (*State, error)
- func (c *Container) Stats() (*Stats, error)
- func (c *Container) Status() (Status, error)
- type CriuOpts
- type CriuPageServerInfo
- type IO
- type Int32msg
- type PressureLevel
- type Process
- type State
- type Stats
- type Status
- type VethPairName
Constants
    const (
        InitMsg          uint16 = 62000
        CloneFlagsAttr   uint16 = 27281
        NsPathsAttr      uint16 = 27282
        UidmapAttr       uint16 = 27283
        GidmapAttr       uint16 = 27284
        SetgroupAttr     uint16 = 27285
        OomScoreAdjAttr  uint16 = 27286
        RootlessEUIDAttr uint16 = 27287
        UidmapPathAttr   uint16 = 27288
        GidmapPathAttr   uint16 = 27289
        TimeOffsetsAttr  uint16 = 27290
    )
List of known message types we want to send to the bootstrap program. The number is randomly chosen to not conflict with known netlink types.
Variables
    var (
        ErrExist          = errors.New("container with given ID already exists")
        ErrInvalidID      = errors.New("invalid container ID format")
        ErrNotExist       = errors.New("container does not exist")
        ErrPaused         = errors.New("container paused")
        ErrRunning        = errors.New("container still running")
        ErrNotRunning     = errors.New("container not running")
        ErrNotPaused      = errors.New("container not paused")
        ErrCgroupNotExist = errors.New("cgroup not exist")
    )
    var ErrCriuMissingFeatures = errors.New("criu is missing features")
Functions
Types
type BaseState (added in v0.0.5)
    type BaseState struct {
        // ID is the container ID.
        ID string `json:"id"`
        // InitProcessPid is the init process id in the parent namespace.
        InitProcessPid int `json:"init_process_pid"`
        // InitProcessStartTime is the init process start time in clock cycles since boot time.
        InitProcessStartTime uint64 `json:"init_process_start"`
        // Created is the unix timestamp for the creation time of the container in UTC
        Created time.Time `json:"created"`
        // Config is the container's configuration.
        Config configs.Config `json:"config"`
    }
BaseState represents the platform agnostic pieces relating to a running container's state
type Bytemsg (added in v0.0.6)
Bytemsg has the following representation
    | nlattr len | nlattr type |
    | value      | pad         |
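The layout above follows the usual netlink attribute shape: a 2-byte length, a 2-byte type, the value, and padding up to a 4-byte boundary. The sketch below encodes one attribute that way. It is a simplified illustration under stated assumptions: encodeAttr is a hypothetical helper, little-endian byte order is assumed (netlink uses host byte order), and libcontainer's own length calculation may differ in details.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"math"
)

const (
	attrHeaderLen = 4 // 2-byte length followed by 2-byte type
	attrAlign     = 4 // attributes are padded to a 4-byte boundary
)

// encodeAttr lays out | len | type | value | pad | in a byte slice.
// The length field covers the header and the value, but not the padding.
func encodeAttr(typ uint16, value []byte) ([]byte, error) {
	length := attrHeaderLen + len(value)
	if length > math.MaxUint16 {
		return nil, errors.New("attribute value too long")
	}
	padded := (length + attrAlign - 1) &^ (attrAlign - 1)
	buf := make([]byte, padded)
	binary.LittleEndian.PutUint16(buf[0:2], uint16(length))
	binary.LittleEndian.PutUint16(buf[2:4], typ)
	copy(buf[attrHeaderLen:], value)
	return buf, nil
}

func main() {
	// 27281 is the CloneFlagsAttr value from the constants above; the
	// payload here is arbitrary sample data.
	buf, err := encodeAttr(27281, []byte("abcde"))
	if err != nil {
		panic(err)
	}
	fmt.Println(len(buf), buf)
}
```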
type Container
    type Container struct {
        // contains filtered or unexported fields
    }
Container is a libcontainer container object.
func Create (added in v1.2.0)
Create creates a new container with the given id inside a given state directory (root), and returns a Container object.
The root is a state directory which many containers can share. It can be used later to get the list of containers, or to get information about a particular container (see Load).
The id must not be empty and must consist only of the following characters: ASCII letters, digits, underscore, plus, minus, period. The id must be unique and non-existent for the given root path.
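The character rule for container IDs can be checked up front with a regular expression. This is a minimal sketch of the rule as stated above, not the library's own validation: validID and idPattern are hypothetical names, and Create may apply further checks (uniqueness under the root, for instance, cannot be checked this way).

```go
package main

import (
	"fmt"
	"regexp"
)

// idPattern accepts one or more ASCII letters, digits, underscore, plus,
// minus, or period, and nothing else.
var idPattern = regexp.MustCompile(`^[A-Za-z0-9_+.-]+$`)

// validID reports whether id satisfies the character rule described above.
// Hypothetical helper, for illustration only.
func validID(id string) bool {
	return idPattern.MatchString(id)
}

func main() {
	for _, id := range []string{"container-id", "web_1+x.y", "", "a/b", "bad id"} {
		fmt.Printf("%q valid=%v\n", id, validID(id))
	}
}
```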
func Load (added in v1.2.0)
Load takes a path to the state directory (root) and an id of an existing container, and returns a Container object reconstructed from the saved state. This presents a read only view of the container.
func (*Container) Checkpoint
func (*Container) Destroy
Destroy destroys the container, if it's in a valid state.
Any event registrations are removed before the container is destroyed. No error is returned if the container is already destroyed.
Running containers must first be stopped using Signal. Paused containers must first be resumed using Resume.
func (*Container) Exec (added in v1.2.0)
Exec signals the container to exec the user's process at the end of the init.
func (*Container) NotifyMemoryPressure (added in v0.0.7)
    func (c *Container) NotifyMemoryPressure(level PressureLevel) (<-chan struct{}, error)
NotifyMemoryPressure returns a read-only channel signaling when the container reaches a given pressure level.
func (*Container) NotifyOOM
NotifyOOM returns a read-only channel signaling when the container receives an OOM notification.
func (*Container) Pause
Pause pauses the container, if its state is RUNNING or CREATED, changing its state to PAUSED. If the state is already PAUSED, does nothing.
func (*Container) Processes
Processes returns the PIDs inside this container. The PIDs are in the namespace of the calling process.
Some of the returned PIDs may no longer refer to processes in the container, unless the container state is PAUSED in which case every PID in the slice is valid.
func (*Container) Restore
Restore restores the checkpointed container to a running state using the criu(8) utility.
func (*Container) Resume
Resume resumes the execution of any user processes in the container before setting the container state to RUNNING. This is only performed if the current state is PAUSED. If the Container state is RUNNING, does nothing.
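The state rules for Pause and Resume described above can be modelled as a small transition function. This is a toy model with local types, not libcontainer's implementation. In particular, returning an error for states the text does not mention is an assumption made for this sketch.

```go
package main

import (
	"errors"
	"fmt"
)

// status is a local stand-in for the container status values.
type status int

const (
	created status = iota
	running
	paused
	stopped
)

// errBadState is a hypothetical error for states the documentation does not
// cover; the real library may behave differently.
var errBadState = errors.New("operation not valid in current state")

// pause: RUNNING or CREATED becomes PAUSED, and PAUSED stays PAUSED.
func pause(s status) (status, error) {
	switch s {
	case running, created, paused:
		return paused, nil
	default:
		return s, errBadState
	}
}

// resume: PAUSED becomes RUNNING, and RUNNING stays RUNNING.
func resume(s status) (status, error) {
	switch s {
	case paused, running:
		return running, nil
	default:
		return s, errBadState
	}
}

func main() {
	s, _ := pause(running)
	fmt.Println(s == paused)
	s, _ = resume(s)
	fmt.Println(s == running)
	_, err := pause(stopped)
	fmt.Println(err)
}
```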
func (*Container) Run (added in v1.2.0)
Run immediately starts the process inside the container. Returns an error if the process fails to start. It does not block waiting for the exec fifo after start returns but opens the fifo after start returns.
func (*Container) Set
Set resources of container as configured. Can be used to change resources when the container is running.
func (*Container) Signal (added in v0.0.3)
Signal sends a specified signal to container's init.
When s is SIGKILL and the container does not have its own PID namespace, all the container's processes are killed. In this scenario, the libcontainer user may be required to implement a proper child reaper.
func (*Container) Start
Start starts a process inside the container. Returns an error if the process fails to start. You can track the process lifecycle with the passed Process structure.
type CriuOpts
    type CriuOpts struct {
        ImagesDirectory         string             // directory for storing image files
        WorkDirectory           string             // directory to cd and write logs/pidfiles/stats to
        ParentImage             string             // directory for storing parent image files in pre-dump and dump
        LeaveRunning            bool               // leave container in running state after checkpoint
        TcpEstablished          bool               // checkpoint/restore established TCP connections
        TcpSkipInFlight         bool               // skip in-flight TCP connections
        LinkRemap               bool               // allow one to link unlinked files back when possible
        ExternalUnixConnections bool               // allow external unix connections
        ShellJob                bool               // allow to dump and restore shell jobs
        FileLocks               bool               // handle file locks, for safety
        PreDump                 bool               // call criu predump to perform iterative checkpoint
        PageServer              CriuPageServerInfo // allow to dump to criu page server
        VethPairs               []VethPairName     // pass the veth to criu when restore
        EmptyNs                 uint32             // don't c/r properties for namespace from this mask
        AutoDedup               bool               // auto deduplication for incremental dumps
        LazyPages               bool               // restore memory pages lazily using userfaultfd
        StatusFd                int                // fd for feedback when lazy server is ready
        LsmProfile              string             // LSM profile used to restore the container
        LsmMountContext         string             // LSM mount context value to use during restore
        // ManageCgroupsMode tells how criu should manage cgroups during
        // checkpoint or restore. Possible values are: "soft", "full",
        // "strict", "ignore", or "" (empty string) for criu default.
        // See https://criu.org/CGroups for more details.
        ManageCgroupsMode string
    }
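Since ManageCgroupsMode is a plain string, a caller may want to reject unknown values before handing options to a checkpoint or restore. A minimal sketch, assuming only the values listed in the comment above are acceptable. The validManageCgroupsMode helper is hypothetical and not part of libcontainer.

```go
package main

import "fmt"

// validManageCgroupsMode reports whether mode is one of the values listed in
// the CriuOpts comment: "soft", "full", "strict", "ignore", or "" (default).
// Hypothetical helper, for illustration only.
func validManageCgroupsMode(mode string) bool {
	switch mode {
	case "", "soft", "full", "strict", "ignore":
		return true
	}
	return false
}

func main() {
	for _, mode := range []string{"", "soft", "full", "strict", "ignore", "none"} {
		fmt.Printf("%q valid=%v\n", mode, validManageCgroupsMode(mode))
	}
}
```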
type IO (added in v0.0.7)
    type IO struct {
        Stdin  io.WriteCloser
        Stdout io.ReadCloser
        Stderr io.ReadCloser
    }
IO holds the process's STDIO
type PressureLevel (added in v0.0.7)
    type PressureLevel uint
    const (
        LowPressure PressureLevel = iota
        MediumPressure
        CriticalPressure
    )
type Process
    type Process struct {
        // The command to be run followed by any arguments.
        Args []string
        // Env specifies the environment variables for the process.
        Env []string
        // UID and GID of the executing process running inside the container
        // local to the container's user and group configuration.
        UID, GID int
        // AdditionalGroups specifies the gids that should be added to supplementary groups
        // in addition to those that the user belongs to.
        AdditionalGroups []int
        // Cwd will change the process's current working directory inside the container's rootfs.
        Cwd string
        // Stdin is a reader which provides the standard input stream.
        Stdin io.Reader
        // Stdout is a writer which receives the standard output stream.
        Stdout io.Writer
        // Stderr is a writer which receives the standard error stream.
        Stderr io.Writer
        // ExtraFiles specifies additional open files to be inherited by the process.
        ExtraFiles []*os.File
        // Initial size for the console.
        ConsoleWidth  uint16
        ConsoleHeight uint16
        // Capabilities specify the capabilities to keep when executing the process.
        // All capabilities not specified will be dropped from the processes capability mask.
        //
        // If not nil, takes precedence over container's [configs.Config.Capabilities].
        Capabilities *configs.Capabilities
        // AppArmorProfile specifies the profile to apply to the process and is
        // changed at the time the process is executed.
        //
        // If not empty, takes precedence over container's [configs.Config.AppArmorProfile].
        AppArmorProfile string
        // Label specifies the label to apply to the process. It is commonly used by selinux.
        //
        // If not empty, takes precedence over container's [configs.Config.ProcessLabel].
        Label string
        // NoNewPrivileges controls whether processes can gain additional privileges.
        //
        // If not nil, takes precedence over container's [configs.Config.NoNewPrivileges].
        NoNewPrivileges *bool
        // Rlimits specifies the resource limits, such as max open files, to set for the process.
        // If unset, the process will inherit rlimits from the parent process.
        //
        // If not empty, takes precedence over container's [configs.Config.Rlimit].
        Rlimits []configs.Rlimit
        // ConsoleSocket provides the masterfd console.
        ConsoleSocket *os.File
        // PidfdSocket provides process file descriptor of it own.
        PidfdSocket *os.File
        // Init specifies whether the process is the first process in the container.
        Init bool
        // LogLevel is a string containing a numeric representation of the current
        // log level (i.e. "4", but never "info"). It is passed on to runc init as
        // _LIBCONTAINER_LOGLEVEL environment variable.
        LogLevel string
        // SubCgroupPaths specifies sub-cgroups to run the process in.
        // Map keys are controller names, map values are paths (relative to
        // container's top-level cgroup).
        //
        // If empty, the default top-level container's cgroup is used.
        //
        // For cgroup v2, the only key allowed is "".
        SubCgroupPaths map[string]string
        // Scheduler represents the scheduling attributes for a process.
        //
        // If not empty, takes precedence over container's [configs.Config.Scheduler].
        Scheduler *configs.Scheduler
        // IOPriority is a process I/O priority.
        //
        // If not empty, takes precedence over container's [configs.Config.IOPriority].
        IOPriority *configs.IOPriority
        CPUAffinity *configs.CPUAffinity
        // contains filtered or unexported fields
    }
Process defines the configuration and IO for a process inside a container.
Note that some Process properties are also present in container configuration (configs.Config). In all such cases, Process properties take precedence over container configuration ones.
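The precedence rule can be pictured as "use the process value when it is set, otherwise fall back to the container value". The sketch below shows that pattern for a pointer-typed field and a string field, mirroring the "if not nil" and "if not empty" wording in the struct comments. The helper names are hypothetical and this is not how libcontainer itself is organised.

```go
package main

import "fmt"

// effectiveNoNewPrivileges picks the process-level value when it is set
// (non-nil) and otherwise falls back to the container-level value.
func effectiveNoNewPrivileges(processValue *bool, containerValue bool) bool {
	if processValue != nil {
		return *processValue
	}
	return containerValue
}

// effectiveLabel picks the process-level label when it is non-empty and
// otherwise falls back to the container-level label.
func effectiveLabel(processLabel, containerLabel string) string {
	if processLabel != "" {
		return processLabel
	}
	return containerLabel
}

func main() {
	override := false
	fmt.Println(effectiveNoNewPrivileges(&override, true)) // process value wins
	fmt.Println(effectiveNoNewPrivileges(nil, true))       // container value used
	fmt.Println(effectiveLabel("", "container-label"))     // container value used
}
```

A pointer is needed for the boolean so that "not set" can be told apart from an explicit false.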
func (*Process) InitializeIO (added in v0.0.7)
InitializeIO creates pipes for use with the process's stdio and returns the opposite side for each. Do not use this if you want to have a pseudoterminal set up for you by libcontainer (TODO: fix that too). TODO: This is mostly unnecessary, and should be handled by clients.
type State
    type State struct {
        BaseState
        // Specified if the container was started under the rootless mode.
        // Set to true if BaseState.Config.RootlessEUID && BaseState.Config.RootlessCgroups
        Rootless bool `json:"rootless"`
        // Paths to all the container's cgroups, as returned by (*cgroups.Manager).GetPaths
        //
        // For cgroup v1, a key is cgroup subsystem name, and the value is the path
        // to the cgroup for this subsystem.
        //
        // For cgroup v2 unified hierarchy, a key is "", and the value is the unified path.
        CgroupPaths map[string]string `json:"cgroup_paths"`
        // NamespacePaths are filepaths to the container's namespaces. Key is the namespace type
        // with the value as the path.
        NamespacePaths map[configs.NamespaceType]string `json:"namespace_paths"`
        // Container's standard descriptors (std{in,out,err}), needed for checkpoint and restore
        ExternalDescriptors []string `json:"external_descriptors,omitempty"`
        // Intel RDT "resource control" filesystem path
        IntelRdtPath string `json:"intel_rdt_path"`
    }
State represents a running container's state
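Because CgroupPaths uses different keys for cgroup v1 and v2, code that reads it usually has to handle both shapes. The sketch below looks up a subsystem path and falls back to the unified "" key. The cgroupPath helper and the sample paths are made up for illustration.

```go
package main

import "fmt"

// cgroupPath returns the path for the given subsystem. It first tries the
// subsystem name (cgroup v1 layout) and then the "" key (cgroup v2 unified
// layout). Hypothetical helper, for illustration only.
func cgroupPath(paths map[string]string, subsystem string) (string, bool) {
	if p, ok := paths[subsystem]; ok {
		return p, true
	}
	p, ok := paths[""]
	return p, ok
}

func main() {
	// Sample maps with invented paths, one per cgroup version.
	v1 := map[string]string{
		"memory": "/sys/fs/cgroup/memory/system/test-container",
		"cpu":    "/sys/fs/cgroup/cpu/system/test-container",
	}
	v2 := map[string]string{
		"": "/sys/fs/cgroup/system/test-container",
	}
	fmt.Println(cgroupPath(v1, "memory"))
	fmt.Println(cgroupPath(v2, "memory"))
	fmt.Println(cgroupPath(v1, "pids"))
}
```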
type Status
    type Status int
Status is the status of a container.
    const (
        // Created is the status that denotes the container exists but has not been run yet.
        Created Status = iota
        // Running is the status that denotes the container exists and is running.
        Running
        // Paused is the status that denotes the container exists, but all its processes are paused.
        Paused
        // Stopped is the status that denotes the container does not have a created or running process.
        Stopped
    )
type VethPairName (added in v0.0.4)
Source Files
- console_linux.go
- container.go
- container_linux.go
- criu_linux.go
- criu_opts_linux.go
- env.go
- error.go
- factory_linux.go
- init_linux.go
- message_linux.go
- mount_linux.go
- network_linux.go
- notify_linux.go
- notify_v2_linux.go
- process.go
- process_linux.go
- restored_process.go
- rootfs_linux.go
- setns_init_linux.go
- standard_init_linux.go
- state_linux.go
- stats_linux.go
- sync.go
- sync_unix.go
Directories
Path | Synopsis |
---|---|
integration | integration is used for integration testing of libcontainer |
internal | |
specconv | Package specconv implements conversion of specifications to libcontainer configurations |
user | Package user is an alias for github.com/moby/sys/user. |
 | Deprecated: use github.com/moby/sys/userns |