Alert GO-2025-3543: WITHDRAWN: Libcontainer is affected by capabilities elevation in github.com/opencontainers/runc

libcontainer

Package libcontainer, version v1.3.0

Warning: This package is not in the latest version of its module.

Published: Apr 29, 2025 | License: Apache-2.0 | Imports: 59 | Imported by: 652

Repository: github.com/opencontainers/runc

README

libcontainer


Libcontainer provides a native Go implementation for creating containers with namespaces, cgroups, capabilities, and filesystem access controls. It allows you to manage the lifecycle of the container, performing additional operations after the container is created.

Container

A container is a self-contained execution environment that shares the kernel of the host system and which is (optionally) isolated from other containers in the system.

Using libcontainer

Container init

Because containers are spawned in a two-step process, you need a binary that will be executed as the init process for the container. In libcontainer, the current binary (/proc/self/exe) is executed as the init process with the argument "init". The first step of this process is called "bootstrap", so you always need an "init" function as the entry point of "bootstrap".

In addition to the Go init function, the early-stage bootstrap is handled by importing nsenter.

For details on how runc implements such an "init", see init.go and libcontainer/init_linux.go.
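The re-exec pattern described above can be sketched with the standard library alone. This is a minimal illustration, not runc's actual init.go: the helper name `isInitInvocation` is hypothetical, and a real program would hand control to `libcontainer.Init()` at the marked point instead of printing a message.

```go
package main

import (
	"fmt"
	"os"
)

// isInitInvocation reports whether the binary was re-executed as the
// container init process, i.e. with "init" as its first argument.
// The helper name is hypothetical and exists only for this sketch.
func isInitInvocation(args []string) bool {
	return len(args) > 1 && args[1] == "init"
}

func main() {
	if isInitInvocation(os.Args) {
		// A real runtime would call libcontainer.Init() here.
		fmt.Println("bootstrap: running as container init")
		return
	}
	fmt.Println("running as the parent (runtime) process")
}
```

Keeping the check at the very top of main means the parent and the init process share one binary, which is why /proc/self/exe can be used for the second step.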

Device management

If you want containers that have access to some devices, you need to import this package into your code:

    import (
        _ "github.com/opencontainers/cgroups/devices"
    )

Without doing this, the libcontainer cgroup manager won't be able to set up device access rules, and will fail if devices are specified in the container configuration.

Container creation

To create a container you first have to create a configuration struct describing how the container is to be created. A sample would look similar to this:

    defaultMountFlags := unix.MS_NOEXEC | unix.MS_NOSUID | unix.MS_NODEV
    var devices []*devices.Rule
    for _, device := range specconv.AllowedDevices {
        devices = append(devices, &device.Rule)
    }
    config := &configs.Config{
        Rootfs: "/your/path/to/rootfs",
        Capabilities: &configs.Capabilities{
            Bounding: []string{
                "CAP_KILL",
                "CAP_AUDIT_WRITE",
            },
            Effective: []string{
                "CAP_KILL",
                "CAP_AUDIT_WRITE",
            },
            Permitted: []string{
                "CAP_KILL",
                "CAP_AUDIT_WRITE",
            },
        },
        Namespaces: configs.Namespaces([]configs.Namespace{
            {Type: configs.NEWNS},
            {Type: configs.NEWUTS},
            {Type: configs.NEWIPC},
            {Type: configs.NEWPID},
            {Type: configs.NEWUSER},
            {Type: configs.NEWNET},
            {Type: configs.NEWCGROUP},
        }),
        Cgroups: &configs.Cgroup{
            Name:   "test-container",
            Parent: "system",
            Resources: &configs.Resources{
                MemorySwappiness: nil,
                Devices:          devices,
            },
        },
        MaskPaths: []string{
            "/proc/kcore",
            "/sys/firmware",
        },
        ReadonlyPaths: []string{
            "/proc/sys", "/proc/sysrq-trigger", "/proc/irq", "/proc/bus",
        },
        Devices:  specconv.AllowedDevices,
        Hostname: "testing",
        Mounts: []*configs.Mount{
            {
                Source:      "proc",
                Destination: "/proc",
                Device:      "proc",
                Flags:       defaultMountFlags,
            },
            {
                Source:      "tmpfs",
                Destination: "/dev",
                Device:      "tmpfs",
                Flags:       unix.MS_NOSUID | unix.MS_STRICTATIME,
                Data:        "mode=755",
            },
            {
                Source:      "devpts",
                Destination: "/dev/pts",
                Device:      "devpts",
                Flags:       unix.MS_NOSUID | unix.MS_NOEXEC,
                Data:        "newinstance,ptmxmode=0666,mode=0620,gid=5",
            },
            {
                Device:      "tmpfs",
                Source:      "shm",
                Destination: "/dev/shm",
                Data:        "mode=1777,size=65536k",
                Flags:       defaultMountFlags,
            },
            {
                Source:      "mqueue",
                Destination: "/dev/mqueue",
                Device:      "mqueue",
                Flags:       defaultMountFlags,
            },
            {
                Source:      "sysfs",
                Destination: "/sys",
                Device:      "sysfs",
                Flags:       defaultMountFlags | unix.MS_RDONLY,
            },
        },
        UIDMappings: []configs.IDMap{
            {
                ContainerID: 0,
                HostID:      1000,
                Size:        65536,
            },
        },
        GIDMappings: []configs.IDMap{
            {
                ContainerID: 0,
                HostID:      1000,
                Size:        65536,
            },
        },
        Networks: []*configs.Network{
            {
                Type:    "loopback",
                Address: "127.0.0.1/0",
                Gateway: "localhost",
            },
        },
        Rlimits: []configs.Rlimit{
            {
                Type: unix.RLIMIT_NOFILE,
                Hard: uint64(1025),
                Soft: uint64(1025),
            },
        },
    }
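To make the UIDMappings and GIDMappings entries in the sample concrete, here is a small sketch of how such a mapping translates an ID inside the container to an ID on the host. The `idMap` type and `toHostID` helper are local stand-ins invented for this example, not part of libcontainer; only the three field names are taken from the sample.

```go
package main

import "fmt"

// idMap mirrors the three fields used in the sample configuration.
// It is a local stand-in, not the configs.IDMap type itself.
type idMap struct {
	ContainerID int64
	HostID      int64
	Size        int64
}

// toHostID translates an ID inside the container to the matching host ID.
// It returns false when no mapping range covers the ID.
func toHostID(maps []idMap, id int64) (int64, bool) {
	for _, m := range maps {
		if id >= m.ContainerID && id < m.ContainerID+m.Size {
			return m.HostID + (id - m.ContainerID), true
		}
	}
	return 0, false
}

func main() {
	// Same values as the sample: container IDs 0..65535 map to host IDs 1000..66535.
	maps := []idMap{{ContainerID: 0, HostID: 1000, Size: 65536}}
	host, ok := toHostID(maps, 0)
	fmt.Println(host, ok) // container root corresponds to host ID 1000
}
```

With the sample values, root inside the container is an unprivileged ID on the host, which is the point of pairing the mappings with the NEWUSER namespace.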

Once you have the configuration populated you can create a container with a specified ID under a specified state directory:

    container, err := libcontainer.Create("/run/containers", "container-id", config)
    if err != nil {
        // logrus.Fatal logs the error and exits the program.
        logrus.Fatal(err)
    }

To spawn bash as the initial process inside the container and have the process's PID returned in order to wait, signal, or kill the process:

    process := &libcontainer.Process{
        Args: []string{"/bin/bash"},
        Env:  []string{"PATH=/bin"},
        // Process takes numeric UID and GID fields rather than a user name.
        // 1 is assumed here to be the "daemon" user inside the rootfs.
        UID:    1,
        GID:    1,
        Stdin:  os.Stdin,
        Stdout: os.Stdout,
        Stderr: os.Stderr,
        Init:   true,
    }

    // err was already declared by the Create call, so assign with "=".
    err = container.Run(process)
    if err != nil {
        container.Destroy()
        logrus.Fatal(err)
    }

    // wait for the process to finish.
    _, err = process.Wait()
    if err != nil {
        logrus.Fatal(err)
    }

    // destroy the container.
    container.Destroy()

Additional ways to interact with a running container are:

    // return all the pids for all processes running inside the container.
    processes, err := container.Processes()

    // get detailed cpu, memory, io, and network statistics for the container and
    // its processes.
    stats, err := container.Stats()

    // pause all processes inside the container.
    container.Pause()

    // resume all paused processes.
    container.Resume()

    // send signal to container's init process.
    container.Signal(signal)

    // update container resource constraints (Set takes a configs.Config value).
    container.Set(*config)

    // get current status of the container.
    status, err := container.Status()

    // get current container's state information.
    state, err := container.State()

Checkpoint & Restore

libcontainer now integrates CRIU for checkpointing and restoring containers. This lets you save the state of a process running inside a container to disk, and then restore that state into a new process, on the same machine or on another machine.

criu version 1.5.2 or higher is required to use checkpoint and restore. If you don't already have criu installed, you can build it from source, following the online instructions. criu is also installed in the docker image generated when building libcontainer with docker.

Copyright and license

Code and documentation copyright 2014 Docker, Inc. The code and documentation are released under the Apache 2.0 license. The documentation is also released under Creative Commons Attribution 4.0 International License. You may obtain a copy of the license, titled CC-BY-4.0, at http://creativecommons.org/licenses/by/4.0/.

Documentation

Overview

Package libcontainer provides a native Go implementation for creating containers with namespaces, cgroups, capabilities, and filesystem access controls. It allows you to manage the lifecycle of the container, performing additional operations after the container is created.

Index

Constants

    const (
        InitMsg          uint16 = 62000
        CloneFlagsAttr   uint16 = 27281
        NsPathsAttr      uint16 = 27282
        UidmapAttr       uint16 = 27283
        GidmapAttr       uint16 = 27284
        SetgroupAttr     uint16 = 27285
        OomScoreAdjAttr  uint16 = 27286
        RootlessEUIDAttr uint16 = 27287
        UidmapPathAttr   uint16 = 27288
        GidmapPathAttr   uint16 = 27289
        TimeOffsetsAttr  uint16 = 27290
    )

List of known message types we want to send to the bootstrap program. The number is randomly chosen to not conflict with known netlink types.

Variables

    var (
        ErrExist          = errors.New("container with given ID already exists")
        ErrInvalidID      = errors.New("invalid container ID format")
        ErrNotExist       = errors.New("container does not exist")
        ErrPaused         = errors.New("container paused")
        ErrRunning        = errors.New("container still running")
        ErrNotRunning     = errors.New("container not running")
        ErrNotPaused      = errors.New("container not paused")
        ErrCgroupNotExist = errors.New("cgroup not exist")
    )

    var ErrCriuMissingFeatures = errors.New("criu is missing features")

Functions

func Init (added in v1.2.0)

func Init()

Init is part of "runc init" implementation.

Types

type BaseState (added in v0.0.5)

    type BaseState struct {
        // ID is the container ID.
        ID string `json:"id"`

        // InitProcessPid is the init process id in the parent namespace.
        InitProcessPid int `json:"init_process_pid"`

        // InitProcessStartTime is the init process start time in clock cycles since boot time.
        InitProcessStartTime uint64 `json:"init_process_start"`

        // Created is the unix timestamp for the creation time of the container in UTC
        Created time.Time `json:"created"`

        // Config is the container's configuration.
        Config configs.Config `json:"config"`
    }

BaseState represents the platform-agnostic pieces relating to a running container's state.

type Boolmsg (added in v0.0.9)

    type Boolmsg struct {
        Type  uint16
        Value bool
    }

func (*Boolmsg) Len (added in v0.0.9)

func (msg *Boolmsg) Len() int

func (*Boolmsg) Serialize (added in v0.0.9)

func (msg *Boolmsg) Serialize() []byte

type Bytemsg (added in v0.0.6)

    type Bytemsg struct {
        Type  uint16
        Value []byte
    }

Bytemsg has the following representation:

    | nlattr len | nlattr type |
    | value | pad |
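The layout above can be illustrated with a short standard-library sketch. It is not the library's serializer: the function name `serializeAttr` is hypothetical, little-endian byte order and 4-byte alignment are assumptions made for the example, and the real implementation may differ in details such as byte order or a terminating byte.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

const (
	attrHeaderLen = 4 // 2-byte length followed by 2-byte type
	attrAlign     = 4 // assumed alignment for the trailing pad
)

// serializeAttr lays out one attribute as | len | type | value | pad |.
// The length field covers the header and value but not the padding.
// Values longer than 65531 bytes would overflow the 16-bit length field
// and are not handled in this sketch.
func serializeAttr(typ uint16, value []byte) []byte {
	l := attrHeaderLen + len(value)
	// Round the buffer size up to the next multiple of attrAlign.
	buf := make([]byte, (l+attrAlign-1)&^(attrAlign-1))
	binary.LittleEndian.PutUint16(buf[0:2], uint16(l))
	binary.LittleEndian.PutUint16(buf[2:4], typ)
	copy(buf[4:], value)
	return buf
}

func main() {
	// 27282 is the NsPathsAttr constant; the payload is illustrative only.
	fmt.Printf("% x\n", serializeAttr(27282, []byte("abc")))
}
```

A three-byte value needs one byte of padding, so the buffer is eight bytes long while the length field records seven.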

func (*Bytemsg) Len (added in v0.0.6)

func (msg *Bytemsg) Len() int

func (*Bytemsg) Serialize (added in v0.0.6)

func (msg *Bytemsg) Serialize() []byte

type Container

    type Container struct {
        // contains filtered or unexported fields
    }

Container is a libcontainer container object.

func Create (added in v1.2.0)

func Create(root, id string, config *configs.Config) (*Container, error)

Create creates a new container with the given id inside a given state directory (root), and returns a Container object.

The root is a state directory which many containers can share. It can be used later to get the list of containers, or to get information about a particular container (see Load).

The id must not be empty and consist of only the following characters: ASCII letters, digits, underscore, plus, minus, period. The id must be unique and non-existent for the given root path.
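The character rule for container IDs can be checked up front with a regular expression. This is a sketch of the documented rule only: `validID` is a hypothetical helper, and Create may apply further checks (such as uniqueness under the root path) that a pattern cannot express.

```go
package main

import (
	"fmt"
	"regexp"
)

// idPattern encodes the documented rule: one or more ASCII letters,
// digits, underscore, plus, minus, or period.
var idPattern = regexp.MustCompile(`^[A-Za-z0-9_+.-]+$`)

// validID reports whether id satisfies the documented character rule.
// It does not check uniqueness within a state directory.
func validID(id string) bool {
	return idPattern.MatchString(id)
}

func main() {
	for _, id := range []string{"container-id", "web_1+blue.v2", "", "a/b"} {
		fmt.Printf("%q valid=%v\n", id, validID(id))
	}
}
```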

func Load (added in v1.2.0)

func Load(root, id string) (*Container, error)

Load takes a path to the state directory (root) and an id of an existing container, and returns a Container object reconstructed from the saved state. This presents a read-only view of the container.

func (*Container) Checkpoint

func (c *Container) Checkpoint(criuOpts *CriuOpts) error

func (*Container) Config

func (c *Container) Config() configs.Config

Config returns the container's configuration.

func (*Container) Destroy

func (c *Container) Destroy() error

Destroy destroys the container, if it's in a valid state.

Any event registrations are removed before the container is destroyed. No error is returned if the container is already destroyed.

Running containers must first be stopped using Signal. Paused containers must first be resumed using Resume.

func (*Container) Exec (added in v1.2.0)

func (c *Container) Exec() error

Exec signals the container to exec the user's process at the end of the init.

func (*Container) ID

func (c *Container) ID() string

ID returns the container's unique ID.

func (*Container) NotifyMemoryPressure (added in v0.0.7)

func (c *Container) NotifyMemoryPressure(level PressureLevel) (<-chan struct{}, error)

NotifyMemoryPressure returns a read-only channel signaling when the container reaches a given pressure level.

func (*Container) NotifyOOM

func (c *Container) NotifyOOM() (<-chan struct{}, error)

NotifyOOM returns a read-only channel signaling when the container receives an OOM notification.

func (*Container) OCIState (added in v1.2.0)

func (c *Container) OCIState() (*specs.State, error)

OCIState returns the current container's state information.

func (*Container) Pause

func (c *Container) Pause() error

Pause pauses the container, if its state is RUNNING or CREATED, changing its state to PAUSED. If the state is already PAUSED, does nothing.

func (*Container) Processes

func (c *Container) Processes() ([]int, error)

Processes returns the PIDs inside this container. The PIDs are in the namespace of the calling process.

Some of the returned PIDs may no longer refer to processes in the container, unless the container state is PAUSED in which case every PID in the slice is valid.

func (*Container) Restore

func (c *Container) Restore(process *Process, criuOpts *CriuOpts) error

Restore restores the checkpointed container to a running state using the criu(8) utility.

func (*Container) Resume

func (c *Container) Resume() error

Resume resumes the execution of any user processes in the container before setting the container state to RUNNING. This is only performed if the current state is PAUSED. If the Container state is RUNNING, does nothing.
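The Pause and Resume descriptions together define a small state machine, sketched below with local types. The transitions follow the documented behavior; returning an error for the states the documentation does not mention is an assumption of this sketch, and `errBadState` is a hypothetical error, not one of the package's exported errors.

```go
package main

import (
	"errors"
	"fmt"
)

// status is a local stand-in for the package's Status type.
type status int

const (
	created status = iota
	running
	paused
	stopped
)

// errBadState is hypothetical; the real methods return their own errors.
var errBadState = errors.New("operation not valid in current state")

// pause follows the documented rule: RUNNING or CREATED becomes PAUSED,
// and an already PAUSED container is left unchanged.
func pause(s status) (status, error) {
	switch s {
	case running, created, paused:
		return paused, nil
	}
	return s, errBadState
}

// resume follows the documented rule: PAUSED becomes RUNNING, and a
// RUNNING container is left unchanged.
func resume(s status) (status, error) {
	switch s {
	case paused, running:
		return running, nil
	}
	return s, errBadState
}

func main() {
	s, _ := pause(created)
	fmt.Println(s == paused)
	s, _ = resume(s)
	fmt.Println(s == running)
}
```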

func (*Container) Run (added in v1.2.0)

func (c *Container) Run(process *Process) error

Run immediately starts the process inside the container. Returns an error if the process fails to start. It does not block waiting for the exec fifo after start returns but opens the fifo after start returns.

func (*Container) Set

func (c *Container) Set(config configs.Config) error

Set resources of container as configured. Can be used to change resources when the container is running.

func (*Container) Signal (added in v0.0.3)

func (c *Container) Signal(s os.Signal) error

Signal sends a specified signal to container's init.

When s is SIGKILL and the container does not have its own PID namespace, all the container's processes are killed. In this scenario, the libcontainer user may be required to implement a proper child reaper.

func (*Container) Start

func (c *Container) Start(process *Process) error

Start starts a process inside the container. Returns an error if the process fails to start. You can track the process lifecycle with the passed Process structure.

func (*Container) State

func (c *Container) State() (*State, error)

State returns the current container's state information.

func (*Container) Stats

func (c *Container) Stats() (*Stats, error)

Stats returns statistics for the container.

func (*Container) Status

func (c *Container) Status() (Status, error)

Status returns the current status of the container.

type CriuOpts

    type CriuOpts struct {
        ImagesDirectory         string             // directory for storing image files
        WorkDirectory           string             // directory to cd and write logs/pidfiles/stats to
        ParentImage             string             // directory for storing parent image files in pre-dump and dump
        LeaveRunning            bool               // leave container in running state after checkpoint
        TcpEstablished          bool               // checkpoint/restore established TCP connections
        TcpSkipInFlight         bool               // skip in-flight TCP connections
        LinkRemap               bool               // allow one to link unlinked files back when possible
        ExternalUnixConnections bool               // allow external unix connections
        ShellJob                bool               // allow to dump and restore shell jobs
        FileLocks               bool               // handle file locks, for safety
        PreDump                 bool               // call criu predump to perform iterative checkpoint
        PageServer              CriuPageServerInfo // allow to dump to criu page server
        VethPairs               []VethPairName     // pass the veth to criu when restore
        EmptyNs                 uint32             // don't c/r properties for namespace from this mask
        AutoDedup               bool               // auto deduplication for incremental dumps
        LazyPages               bool               // restore memory pages lazily using userfaultfd
        StatusFd                int                // fd for feedback when lazy server is ready
        LsmProfile              string             // LSM profile used to restore the container
        LsmMountContext         string             // LSM mount context value to use during restore

        // ManageCgroupsMode tells how criu should manage cgroups during
        // checkpoint or restore. Possible values are: "soft", "full",
        // "strict", "ignore", or "" (empty string) for criu default.
        // See https://criu.org/CGroups for more details.
        ManageCgroupsMode string
    }
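Because ManageCgroupsMode is a free-form string, a caller may want to reject unknown values before handing options to a checkpoint or restore call. The helper below is a hypothetical sketch based only on the values listed in the field comment; it is not part of libcontainer.

```go
package main

import "fmt"

// validCgroupsMode reports whether mode is one of the values listed in
// the ManageCgroupsMode comment. The empty string selects criu's default.
func validCgroupsMode(mode string) bool {
	switch mode {
	case "", "soft", "full", "strict", "ignore":
		return true
	}
	return false
}

func main() {
	for _, mode := range []string{"", "soft", "relaxed"} {
		fmt.Printf("%q valid=%v\n", mode, validCgroupsMode(mode))
	}
}
```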

type CriuPageServerInfo

    type CriuPageServerInfo struct {
        Address string // IP address of CRIU page server
        Port    int32  // port number of CRIU page server
    }

type IO (added in v0.0.7)

    type IO struct {
        Stdin  io.WriteCloser
        Stdout io.ReadCloser
        Stderr io.ReadCloser
    }

IO holds the process's STDIO.

type Int32msg (added in v0.0.6)

    type Int32msg struct {
        Type  uint16
        Value uint32
    }

func (*Int32msg) Len (added in v0.0.6)

func (msg *Int32msg) Len() int

func (*Int32msg) Serialize (added in v0.0.6)

func (msg *Int32msg) Serialize() []byte

Serialize serializes the message. Int32msg has the following representation:

    | nlattr len | nlattr type |
    | uint32 value |

type PressureLevel (added in v0.0.7)

    type PressureLevel uint

    const (
        LowPressure PressureLevel = iota
        MediumPressure
        CriticalPressure
    )

type Process

    type Process struct {
        // The command to be run followed by any arguments.
        Args []string

        // Env specifies the environment variables for the process.
        Env []string

        // UID and GID of the executing process running inside the container
        // local to the container's user and group configuration.
        UID, GID int

        // AdditionalGroups specifies the gids that should be added to supplementary groups
        // in addition to those that the user belongs to.
        AdditionalGroups []int

        // Cwd will change the process's current working directory inside the container's rootfs.
        Cwd string

        // Stdin is a reader which provides the standard input stream.
        Stdin io.Reader

        // Stdout is a writer which receives the standard output stream.
        Stdout io.Writer

        // Stderr is a writer which receives the standard error stream.
        Stderr io.Writer

        // ExtraFiles specifies additional open files to be inherited by the process.
        ExtraFiles []*os.File

        // Initial size for the console.
        ConsoleWidth  uint16
        ConsoleHeight uint16

        // Capabilities specify the capabilities to keep when executing the process.
        // All capabilities not specified will be dropped from the process's capability mask.
        //
        // If not nil, takes precedence over container's [configs.Config.Capabilities].
        Capabilities *configs.Capabilities

        // AppArmorProfile specifies the profile to apply to the process and is
        // changed at the time the process is executed.
        //
        // If not empty, takes precedence over container's [configs.Config.AppArmorProfile].
        AppArmorProfile string

        // Label specifies the label to apply to the process. It is commonly used by selinux.
        //
        // If not empty, takes precedence over container's [configs.Config.ProcessLabel].
        Label string

        // NoNewPrivileges controls whether processes can gain additional privileges.
        //
        // If not nil, takes precedence over container's [configs.Config.NoNewPrivileges].
        NoNewPrivileges *bool

        // Rlimits specifies the resource limits, such as max open files, to set for the process.
        // If unset, the process will inherit rlimits from the parent process.
        //
        // If not empty, takes precedence over container's [configs.Config.Rlimit].
        Rlimits []configs.Rlimit

        // ConsoleSocket provides the masterfd console.
        ConsoleSocket *os.File

        // PidfdSocket provides process file descriptor of its own.
        PidfdSocket *os.File

        // Init specifies whether the process is the first process in the container.
        Init bool

        // LogLevel is a string containing a numeric representation of the current
        // log level (i.e. "4", but never "info"). It is passed on to runc init as
        // _LIBCONTAINER_LOGLEVEL environment variable.
        LogLevel string

        // SubCgroupPaths specifies sub-cgroups to run the process in.
        // Map keys are controller names, map values are paths (relative to
        // container's top-level cgroup).
        //
        // If empty, the default top-level container's cgroup is used.
        //
        // For cgroup v2, the only key allowed is "".
        SubCgroupPaths map[string]string

        // Scheduler represents the scheduling attributes for a process.
        //
        // If not empty, takes precedence over container's [configs.Config.Scheduler].
        Scheduler *configs.Scheduler

        // IOPriority is a process I/O priority.
        //
        // If not empty, takes precedence over container's [configs.Config.IOPriority].
        IOPriority *configs.IOPriority

        CPUAffinity *configs.CPUAffinity
        // contains filtered or unexported fields
    }

Process defines the configuration and IO for a process inside a container.

Note that some Process properties are also present in container configuration (configs.Config). In all such cases, Process properties take precedence over container configuration ones.
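The precedence rule can be sketched for a single property. The types below are local stand-ins that model only NoNewPrivileges (a pointer on the process, as in the Process struct above); modeling the container-level setting as a plain bool and the helper name `effectiveNoNewPrivileges` are assumptions of this example, not a description of how libcontainer resolves the value internally.

```go
package main

import "fmt"

// containerConfig and process are local stand-ins for this sketch.
type containerConfig struct {
	NoNewPrivileges bool
}

type process struct {
	// nil means "not set on the process; fall back to the container".
	NoNewPrivileges *bool
}

// effectiveNoNewPrivileges returns the process-level setting when it is
// present and the container-level setting otherwise.
func effectiveNoNewPrivileges(c containerConfig, p process) bool {
	if p.NoNewPrivileges != nil {
		return *p.NoNewPrivileges
	}
	return c.NoNewPrivileges
}

func main() {
	cfg := containerConfig{NoNewPrivileges: true}
	override := false
	fmt.Println(effectiveNoNewPrivileges(cfg, process{}))
	fmt.Println(effectiveNoNewPrivileges(cfg, process{NoNewPrivileges: &override}))
}
```

Using a pointer lets the process distinguish "explicitly false" from "not specified", which a plain bool could not express.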

func (*Process) InitializeIO (added in v0.0.7)

func (p *Process) InitializeIO(rootuid, rootgid int) (i *IO, err error)

InitializeIO creates pipes for use with the process's stdio and returns the opposite side for each. Do not use this if you want to have a pseudoterminal set up for you by libcontainer (TODO: fix that too). TODO: This is mostly unnecessary, and should be handled by clients.

func (Process) Pid

func (p Process) Pid() (int, error)

Pid returns the process ID.

func (Process) Signal

func (p Process) Signal(sig os.Signal) error

Signal sends a signal to the Process.

func (Process) Wait

func (p Process) Wait() (*os.ProcessState, error)

Wait waits for the process to exit. Wait releases any resources associated with the Process.

type State

    type State struct {
        BaseState

        // Specified if the container was started under the rootless mode.
        // Set to true if BaseState.Config.RootlessEUID && BaseState.Config.RootlessCgroups
        Rootless bool `json:"rootless"`

        // Paths to all the container's cgroups, as returned by (*cgroups.Manager).GetPaths
        //
        // For cgroup v1, a key is cgroup subsystem name, and the value is the path
        // to the cgroup for this subsystem.
        //
        // For cgroup v2 unified hierarchy, a key is "", and the value is the unified path.
        CgroupPaths map[string]string `json:"cgroup_paths"`

        // NamespacePaths are filepaths to the container's namespaces. Key is the namespace type
        // with the value as the path.
        NamespacePaths map[configs.NamespaceType]string `json:"namespace_paths"`

        // Container's standard descriptors (std{in,out,err}), needed for checkpoint and restore
        ExternalDescriptors []string `json:"external_descriptors,omitempty"`

        // Intel RDT "resource control" filesystem path
        IntelRdtPath string `json:"intel_rdt_path"`
    }

State represents a running container's state
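The JSON tags on State and BaseState suggest how a serialized state document can be read with the standard library. The sketch below decodes a subset of the fields into a local struct; `stateSubset` and `parseState` are hypothetical names, and the sample document is hand-written for illustration rather than captured from a real container.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// stateSubset models a few fields using the JSON tags shown above.
// Fields of the embedded BaseState appear at the top level of the JSON.
type stateSubset struct {
	ID             string            `json:"id"`
	InitProcessPid int               `json:"init_process_pid"`
	Rootless       bool              `json:"rootless"`
	CgroupPaths    map[string]string `json:"cgroup_paths"`
}

// parseState decodes a state document into the subset struct.
func parseState(data []byte) (stateSubset, error) {
	var s stateSubset
	err := json.Unmarshal(data, &s)
	return s, err
}

func main() {
	// Hand-written sample; the path and PID are illustrative only.
	sample := []byte(`{"id":"container-id","init_process_pid":4242,` +
		`"rootless":false,"cgroup_paths":{"":"/sys/fs/cgroup/system/test-container"}}`)
	s, err := parseState(sample)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(s.ID, s.InitProcessPid, s.CgroupPaths[""])
}
```

The empty-string key in cgroup_paths corresponds to the cgroup v2 unified hierarchy described in the struct comment.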

type Stats

    type Stats struct {
        Interfaces    []*types.NetworkInterface
        CgroupStats   *cgroups.Stats
        IntelRdtStats *intelrdt.Stats
    }

type Status

type Status int

Status is the status of a container.

    const (
        // Created is the status that denotes the container exists but has not been run yet.
        Created Status = iota
        // Running is the status that denotes the container exists and is running.
        Running
        // Paused is the status that denotes the container exists, but all its processes are paused.
        Paused
        // Stopped is the status that denotes the container does not have a created or running process.
        Stopped
    )

func (Status) String (added in v0.0.7)

func (s Status) String() string

type VethPairName (added in v0.0.4)

    type VethPairName struct {
        ContainerInterfaceName string
        HostInterfaceName      string
    }

Directories

Path and synopsis:
integration is used for integration testing of libcontainer
internal
Package specconv implements conversion of specifications to libcontainer configurations
Package user is an alias for github.com/moby/sys/user.
Deprecated: use github.com/moby/sys/userns