Exchanging pixel buffers

As originally designed, the Linux graphics subsystem had extremely limited support for sharing pixel-buffer allocations between processes, devices, and subsystems. Modern systems require extensive integration between all three classes; this document details how applications and kernel subsystems should approach this sharing for two-dimensional image data.

It is written with reference to the DRM subsystem for GPU and display devices, V4L2 for media devices, and also to Vulkan, EGL and Wayland for userspace support; however, any other subsystems should also follow this design and advice.

Glossary of terms

image:

Conceptually a two-dimensional array of pixels. The pixels may be stored in one or more memory buffers. Has width and height in pixels, pixel format and modifier (implicit or explicit).

row:

A span along a single y-axis value, e.g. from co-ordinates (0,100) to (200,100).

scanline:

Synonym for row.

column:

A span along a single x-axis value, e.g. from co-ordinates (100,0) to (100,100).

memory buffer:

A piece of memory for storing (parts of) pixel data. Has stride and size in bytes and at least one handle in some API. May contain one or more planes.

plane:

A two-dimensional array of some or all of an image’s color and alpha channel values.

pixel:

A picture element. Has a single color value which is defined by one or more color channel values, e.g. R, G and B, or Y, Cb and Cr. May also have an alpha value as an additional channel.

pixel data:

Bytes or bits that represent some or all of the color/alpha channel values of a pixel or an image. The data for one pixel may be spread over several planes or memory buffers depending on format and modifier.

color value:

A tuple of numbers, representing a color. Each element in the tuple is a color channel value.

color channel:

One of the dimensions in a color model. For example, the RGB model has channels R, G, and B. The alpha channel is sometimes counted as a color channel as well.

pixel format:

A description of how pixel data represents the pixel’s color and alpha values.

modifier:

A description of how pixel data is laid out in memory buffers.

alpha:

A value that denotes the color coverage in a pixel. Sometimes used for translucency instead.

stride:

A value that denotes the relationship between pixel-location co-ordinates and byte-offset values. Typically used as the byte offset between two pixels at the start of vertically-consecutive tiling blocks. For linear layouts, the byte offset between two vertically-adjacent pixels. For non-linear formats the stride must be computed in a consistent way, which usually is done as-if the layout was linear.

pitch:

Synonym for stride.

Formats and modifiers

Each buffer must have an underlying format. This format describes the color values provided for each pixel. Although each subsystem has its own format descriptions (e.g. V4L2 and fbdev), the DRM_FORMAT_* tokens should be reused wherever possible, as they are the standard descriptions used for interchange. These tokens are described in the drm_fourcc.h file, which is a part of DRM’s uAPI.

Each DRM_FORMAT_* token describes the translation between a pixel co-ordinate in an image, and the color values for that pixel contained within its memory buffers. The number and type of color channels are described: whether they are RGB or YUV, integer or floating-point, the size of each channel and their locations within the pixel memory, and the relationship between color planes.

For example, DRM_FORMAT_ARGB8888 describes a format in which each pixel has a single 32-bit value in memory. Alpha, red, green, and blue color channels are available at 8-bit precision per channel, ordered respectively from most to least significant bits in little-endian storage. DRM_FORMAT_* is not affected by either CPU or device endianness; the byte pattern in memory is always as described in the format definition, which is usually little-endian.
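
As an illustration of that byte ordering, the following is a minimal sketch in C. The helper names pack_argb8888 and store_argb8888 are hypothetical and are not part of any DRM header; the sketch writes the bytes out explicitly so that the result does not depend on the endianness of the host CPU.

```c
#include <stdint.h>

/* Pack 8-bit channels into one 32-bit ARGB8888 pixel value:
 * bits 31:24 = A, 23:16 = R, 15:8 = G, 7:0 = B. */
uint32_t pack_argb8888(uint8_t a, uint8_t r, uint8_t g, uint8_t b)
{
	return ((uint32_t)a << 24) | ((uint32_t)r << 16) |
	       ((uint32_t)g << 8) | (uint32_t)b;
}

/* Store the pixel as four little-endian bytes, independent of the host. */
void store_argb8888(uint8_t *dst, uint32_t pixel)
{
	dst[0] = (uint8_t)(pixel & 0xff);         /* B, lowest address */
	dst[1] = (uint8_t)((pixel >> 8) & 0xff);  /* G */
	dst[2] = (uint8_t)((pixel >> 16) & 0xff); /* R */
	dst[3] = (uint8_t)((pixel >> 24) & 0xff); /* A, highest address */
}
```

For A=0x80, R=0x11, G=0x22 and B=0x33, the packed value is 0x80112233 and the bytes in memory, in increasing address order, are 0x33, 0x22, 0x11, 0x80.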

As a more complex example, DRM_FORMAT_NV12 describes a format in which luma and chroma YUV samples are stored in separate planes, where the chroma plane is stored at half the resolution in both dimensions (i.e. one U/V chroma sample is stored for each 2x2 pixel grouping).

Format modifiers describe a translation mechanism between these per-pixel memory samples, and the actual memory storage for the buffer. The most straightforward modifier is DRM_FORMAT_MOD_LINEAR, describing a scheme in which each plane is laid out row-sequentially, from the top-left to the bottom-right corner. This is considered the baseline interchange format, and most convenient for CPU access.

Modern hardware employs much more sophisticated access mechanisms, typically making use of tiled access and possibly also compression. For example, the DRM_FORMAT_MOD_VIVANTE_TILED modifier describes memory storage where pixels are stored in 4x4 blocks arranged in row-major ordering, i.e. the first tile in a plane stores pixels (0,0) to (3,3) inclusive, and the second tile in a plane stores pixels (4,0) to (7,3) inclusive.
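
The tile numbering described above can be sketched as follows. This is only an illustration: the name ex_tile_index and the EX_TILE_* constants are hypothetical, the plane width is assumed to be rounded up to a whole number of tiles, and the ordering of pixels inside a tile is deliberately not modelled.

```c
#include <stdint.h>

#define EX_TILE_W 4u /* tile width in pixels */
#define EX_TILE_H 4u /* tile height in pixels */

/* Index of the 4x4 tile containing pixel (x, y), with tiles numbered in
 * row-major order across a plane that is width_px pixels wide. */
uint32_t ex_tile_index(uint32_t x, uint32_t y, uint32_t width_px)
{
	/* Round up so that a partially covered tile still counts. */
	uint32_t tiles_per_row = (width_px + EX_TILE_W - 1) / EX_TILE_W;

	return (y / EX_TILE_H) * tiles_per_row + (x / EX_TILE_W);
}
```

For a plane 16 pixels wide, pixels (0,0) to (3,3) fall in tile 0, pixels (4,0) to (7,3) fall in tile 1, and pixel (0,4) starts the second row of tiles at index 4.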

Some modifiers may modify the number of planes required for an image; for example, the I915_FORMAT_MOD_Y_TILED_CCS modifier adds a second plane to RGB formats in which it stores data about the status of every tile, notably including whether the tile is fully populated with pixel data, or can be expanded from a single solid color.

These extended layouts are highly vendor-specific, and even specific to particular generations or configurations of devices per-vendor. For this reason, support of modifiers must be explicitly enumerated and negotiated by all users in order to ensure a compatible and optimal pipeline, as discussed below.

Dimensions and size

Each pixel buffer must be accompanied by logical pixel dimensions. This refers to the number of unique samples which can be extracted from, or stored to, the underlying memory storage. For example, even though a 1920x1080 DRM_FORMAT_NV12 buffer has a luma plane containing 1920x1080 samples for the Y component, and 960x540 samples for the U and V components, the overall buffer is still described as having dimensions of 1920x1080.

The in-memory storage of a buffer is not guaranteed to begin immediately at the base address of the underlying memory, nor is it guaranteed that the memory storage is tightly clipped to either dimension.

Each plane must therefore be described with an offset in bytes, which will be added to the base address of the memory storage before performing any per-pixel calculations. This may be used to combine multiple planes into a single memory buffer; for example, DRM_FORMAT_NV12 may be stored in a single memory buffer where the luma plane’s storage begins immediately at the start of the buffer with an offset of 0, and the chroma plane’s storage follows within the same buffer beginning from the byte offset for that plane.
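
As a rough sketch of such a combined layout, the code below computes per-plane offsets for an NV12 image held in one memory buffer. The struct and function names are hypothetical, and the sketch assumes a tightly packed linear layout with no alignment padding and 8-bit samples; a real allocator is free to choose different strides and offsets, which is why they must always be queried rather than computed.

```c
#include <stdint.h>

struct ex_plane_layout {
	uint32_t offset; /* byte offset from the start of the memory buffer */
	uint32_t stride; /* bytes between the starts of consecutive rows */
	uint32_t rows;   /* number of rows stored in this plane */
};

struct ex_nv12_layout {
	struct ex_plane_layout luma;
	struct ex_plane_layout chroma;
	uint32_t size;   /* total bytes used by both planes */
};

/* Tightly packed NV12 in a single buffer (assumption: no padding). */
struct ex_nv12_layout ex_nv12_single_buffer(uint32_t width, uint32_t height)
{
	struct ex_nv12_layout l;

	l.luma.offset = 0;
	l.luma.stride = width;                 /* one byte per Y sample */
	l.luma.rows = height;

	/* The chroma plane follows the luma plane in the same buffer. */
	l.chroma.offset = l.luma.stride * l.luma.rows;
	l.chroma.stride = ((width + 1) / 2) * 2; /* two bytes per Cb/Cr pair */
	l.chroma.rows = (height + 1) / 2;

	l.size = l.chroma.offset + l.chroma.stride * l.chroma.rows;
	return l;
}
```

For a 1920x1080 image this gives a chroma offset of 2073600 bytes, a chroma stride of 1920 bytes over 540 rows, and a total size of 3110400 bytes.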

Each plane must also have a stride in bytes, expressing the offset in memory between two contiguous rows. For example, a DRM_FORMAT_MOD_LINEAR buffer with dimensions of 1000x1000 may have been allocated as if it were 1024x1000, in order to allow for aligned access patterns. In this case, the buffer will still be described with a width of 1000; however, the stride will be 1024 * bpp (where bpp is the number of bytes per pixel), indicating that there are 24 pixels at the positive extreme of the x axis whose values are not significant.

Buffers may also be padded further in the y dimension, simply by allocating a larger area than would ordinarily be required. For example, many media decoders are not able to natively output buffers of height 1080, but instead require an effective height of 1088 pixels. In this case, the buffer continues to be described as having a height of 1080, with the memory allocation for each buffer being increased to account for the extra padding.
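
The role of offset, stride, and padding in a linear layout can be sketched with two small helpers. Both names are hypothetical, and the sketch assumes DRM_FORMAT_MOD_LINEAR with a whole number of bytes per pixel; it does not apply to tiled or compressed layouts.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte offset of pixel (x, y) from the start of the memory buffer, for a
 * linear plane with the given offset and stride. */
size_t ex_linear_pixel_offset(size_t plane_offset, size_t stride,
			      size_t bytes_per_pixel, uint32_t x, uint32_t y)
{
	return plane_offset + (size_t)y * stride + (size_t)x * bytes_per_pixel;
}

/* Minimum number of bytes the memory buffer must hold for a plane whose
 * allocation covers alloc_rows rows (which may exceed the image height). */
size_t ex_plane_min_size(size_t plane_offset, size_t stride,
			 uint32_t alloc_rows)
{
	return plane_offset + stride * (size_t)alloc_rows;
}
```

With 4 bytes per pixel and a stride of 1024 * 4 = 4096 bytes, the last visible pixel of the first row, (999,0), sits at byte 3996, and the second row starts at byte 4096; the 96 bytes in between are the 24 padding pixels. Likewise, a 1920-byte-stride plane padded from 1080 to 1088 rows needs at least 2088960 bytes.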

Enumeration

Every user of pixel buffers must be able to enumerate a set of supported formats and modifiers, described together. Within KMS, this is achieved with the IN_FORMATS property on each DRM plane, listing the supported DRM formats, and the modifiers supported for each format. In userspace, this is supported through the EGL_EXT_image_dma_buf_import_modifiers extension entrypoints for EGL, the VK_EXT_image_drm_format_modifier extension for Vulkan, and the zwp_linux_dmabuf_v1 extension for Wayland.

Each of these interfaces allows users to query a set of supported format+modifier combinations.
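
For the KMS case, the sketch below shows one way a client might test whether a format+modifier pair appears in an IN_FORMATS blob that it has already fetched. The two structs are local mirrors of struct drm_format_modifier_blob and struct drm_format_modifier, written here from memory as an assumption; real code should include the uAPI header instead and check the layout against it. The function name ex_blob_supports is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed mirror of the blob header: offsets are in bytes from its start. */
struct ex_format_modifier_blob {
	uint32_t version;
	uint32_t flags;
	uint32_t count_formats;
	uint32_t formats_offset;
	uint32_t count_modifiers;
	uint32_t modifiers_offset;
};

/* Assumed mirror of one modifier entry. Bit n of 'formats' set means the
 * modifier applies to the format at index (offset + n) in the format list. */
struct ex_format_modifier {
	uint64_t formats;
	uint32_t offset;
	uint32_t pad;
	uint64_t modifier;
};

bool ex_blob_supports(const void *blob, size_t len,
		      uint32_t format, uint64_t modifier)
{
	const unsigned char *base = blob;
	struct ex_format_modifier_blob hdr;
	uint64_t formats_end, mods_end;
	uint32_t i, j;

	if (len < sizeof(hdr))
		return false;
	memcpy(&hdr, base, sizeof(hdr));

	/* Reject blobs whose arrays would run past the end of the data. */
	formats_end = (uint64_t)hdr.formats_offset +
		      (uint64_t)hdr.count_formats * sizeof(uint32_t);
	mods_end = (uint64_t)hdr.modifiers_offset +
		   (uint64_t)hdr.count_modifiers * sizeof(struct ex_format_modifier);
	if (formats_end > len || mods_end > len)
		return false;

	for (i = 0; i < hdr.count_formats; i++) {
		uint32_t fmt;

		memcpy(&fmt, base + hdr.formats_offset + (size_t)i * sizeof(fmt),
		       sizeof(fmt));
		if (fmt != format)
			continue;

		for (j = 0; j < hdr.count_modifiers; j++) {
			struct ex_format_modifier m;

			memcpy(&m, base + hdr.modifiers_offset +
			       (size_t)j * sizeof(m), sizeof(m));
			if (m.modifier != modifier)
				continue;
			if (i >= m.offset && i - m.offset < 64 &&
			    ((m.formats >> (i - m.offset)) & 1))
				return true;
		}
	}
	return false;
}
```

The data is copied out with memcpy rather than accessed through cast pointers, so the sketch makes no assumption about the alignment of the blob in memory.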

Negotiation

It is the responsibility of userspace to negotiate an acceptable format+modifier combination for its usage. This is performed through a simple intersection of lists. For example, if a user wants to use Vulkan to render an image to be displayed on a KMS plane, it must:

  • query KMS for the IN_FORMATS property for the given plane

  • query Vulkan for the supported formats for its physical device, making sure to pass the VkImageUsageFlagBits and VkImageCreateFlagBits corresponding to the intended rendering use

  • intersect these formats to determine the most appropriate one

  • for this format, intersect the lists of supported modifiers for both KMS and Vulkan, to obtain a final list of acceptable modifiers for that format

This intersection must be performed for all usages. For example, if the user also wishes to encode the image to a video stream, it must query the media API it intends to use for encoding for the set of modifiers it supports, and additionally intersect against this list.

If the intersection of all lists is an empty list, it is not possible to share buffers in this way, and an alternate strategy must be considered (e.g. using CPU access routines to copy data between the different uses, with the corresponding performance cost).

The resulting modifier list is unsorted; the order is not significant.
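
The intersection step can be sketched as a plain list operation. The function names are hypothetical, and the lists are assumed to have been obtained already from the relevant enumeration interfaces; the modifier values in the usage note are arbitrary placeholders rather than tokens from drm_fourcc.h.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if value is present in list[0..n-1], otherwise 0. */
static int ex_contains(const uint64_t *list, size_t n, uint64_t value)
{
	for (size_t i = 0; i < n; i++)
		if (list[i] == value)
			return 1;
	return 0;
}

/* Writes every modifier present in both a and b into out, skipping
 * duplicates, and returns how many were written. out must have room for
 * at least na entries. A return value of 0 means no shared modifier. */
size_t ex_intersect_modifiers(const uint64_t *a, size_t na,
			      const uint64_t *b, size_t nb, uint64_t *out)
{
	size_t count = 0;

	for (size_t i = 0; i < na; i++)
		if (ex_contains(b, nb, a[i]) && !ex_contains(out, count, a[i]))
			out[count++] = a[i];
	return count;
}
```

To cover a third user such as a video encoder, the result of the first call is simply intersected again with the encoder's list. If any call returns 0, the fallback strategy described above applies.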

Allocation

Once userspace has determined an appropriate format, and corresponding list of acceptable modifiers, it must allocate the buffer. As there is no universal buffer-allocation interface available at either kernel or userspace level, the client makes an arbitrary choice of allocation interface such as Vulkan, GBM, or a media API.

Each allocation request must take, at a minimum: the pixel format, a list of acceptable modifiers, and the buffer’s width and height. Each API may extend this set of properties in different ways, such as allowing allocation in more than two dimensions, intended usage patterns, etc.

The component which allocates the buffer will make an arbitrary choice of what it considers the ‘best’ modifier within the acceptable list for the requested allocation, any padding required, and further properties of the underlying memory buffers such as whether they are stored in system or device-specific memory, whether or not they are physically contiguous, and their cache mode. These properties of the memory buffer are not visible to userspace; however, the dma-heaps API is an effort to address this.

After allocation, the client must query the allocator to determine the actual modifier selected for the buffer, as well as the per-plane offset and stride. Allocators are not permitted to vary the format in use, to select a modifier not provided within the acceptable list, nor to vary the pixel dimensions other than the padding expressed through offset, stride, and size.
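
A client that wants to verify these guarantees after querying the allocator could use a check along the following lines. The struct and function names are hypothetical, and the sketch assumes the request carried a non-empty modifier list.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* What the client asked for. */
struct ex_alloc_request {
	uint32_t format;
	uint32_t width, height;
	const uint64_t *modifiers; /* acceptable modifiers */
	size_t num_modifiers;
};

/* What the client read back from the allocator after allocation. */
struct ex_alloc_result {
	uint32_t format;
	uint32_t width, height;
	uint64_t modifier;
};

/* True if the result keeps the requested format and pixel dimensions and
 * uses a modifier taken from the acceptable list. */
bool ex_alloc_result_valid(const struct ex_alloc_request *req,
			   const struct ex_alloc_result *res)
{
	if (res->format != req->format)
		return false;
	if (res->width != req->width || res->height != req->height)
		return false;

	for (size_t i = 0; i < req->num_modifiers; i++)
		if (req->modifiers[i] == res->modifier)
			return true;
	return false;
}
```

Padding is intentionally absent from the comparison: it shows up only in the per-plane offset and stride and in the allocation size, never in the logical width and height.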

Communicating additional constraints, such as alignment of stride or offset, placement within a particular memory area, etc, is out of scope of dma-buf, and is not solved by format and modifier tokens.

Import

To use a buffer within a different context, device, or subsystem, the user passes these parameters (format, modifier, width, height, and per-plane offset and stride) to an importing API.

Each memory buffer is referred to by a buffer handle, which may be unique or duplicated within an image. For example, a DRM_FORMAT_NV12 buffer may have the luma and chroma buffers combined into a single memory buffer by use of the per-plane offset parameters, or they may be completely separate allocations in memory. For this reason, each import and allocation API must provide a separate handle for each plane.

Each kernel subsystem has its own types and interfaces for buffer management. DRM uses GEM buffer objects (BOs), V4L2 has its own references, etc. These types are not portable between contexts, processes, devices, or subsystems.

To address this, dma-buf handles are used as the universal interchange for buffers. Subsystem-specific operations are used to export native buffer handles to a dma-buf file descriptor, and to import those file descriptors into a native buffer handle. dma-buf file descriptors can be transferred between contexts, processes, devices, and subsystems.

For example, a Wayland media player may use V4L2 to decode a video frame into a DRM_FORMAT_NV12 buffer. This will result in two memory planes (luma and chroma) being dequeued by the user from V4L2. These planes are then exported to one dma-buf file descriptor per plane; these descriptors are then sent along with the metadata (format, modifier, width, height, per-plane offset and stride) to the Wayland server. The Wayland server will then import these file descriptors as an EGLImage for use through EGL/OpenGL (ES), a VkImage for use through Vulkan, or a KMS framebuffer object; each of these import operations will take the same metadata and convert the dma-buf file descriptors into their native buffer handles.

Having a non-empty intersection of supported modifiers does not guarantee that import will succeed into all consumers; they may have constraints beyond those implied by modifiers which must be satisfied.

Implicit modifiers

The concept of modifiers post-dates all of the subsystems mentioned above. As such, it has been retrofitted into all of these APIs, and in order to ensure backwards compatibility, support is needed for drivers and userspace which do not (yet) support modifiers.

As an example, GBM is used to allocate buffers to be shared between EGL for rendering and KMS for display. It has two entrypoints for allocating buffers: gbm_bo_create which only takes the format, width, height, and a usage token, and gbm_bo_create_with_modifiers which extends this with a list of modifiers.

In the latter case, the allocation is as discussed above, being provided with a list of acceptable modifiers that the implementation can choose from (or fail if it is not possible to allocate within those constraints). In the former case where modifiers are not provided, the GBM implementation must make its own choice as to what is likely to be the ‘best’ layout. Such a choice is entirely implementation-specific: some will internally use tiled layouts which are not CPU-accessible if the implementation decides that is a good idea through whatever heuristic. It is the implementation’s responsibility to ensure that this choice is appropriate.

To support this case where the layout is not known because there is no awareness of modifiers, a special DRM_FORMAT_MOD_INVALID token has been defined. This pseudo-modifier declares that the layout is not known, and that the driver should use its own logic to determine what the underlying layout may be.

Note

DRM_FORMAT_MOD_INVALID is a non-zero value. The modifier value zero is DRM_FORMAT_MOD_LINEAR, which is an explicit guarantee that the image has the linear layout. Care and attention should be taken to ensure that zero as a default value is not mixed up with either no modifier or the linear modifier. Also note that in some APIs the invalid modifier value is specified with an out-of-band flag, like in DRM_IOCTL_MODE_ADDFB2.
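
The zero-default pitfall can be illustrated with a small descriptor type. The EX_ constants below are local stand-ins whose values are believed to match DRM_FORMAT_MOD_LINEAR and DRM_FORMAT_MOD_INVALID in drm_fourcc.h; real code should use the header's definitions. The struct and its initialiser are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Local stand-ins (assumed values); use drm_fourcc.h in real code. */
#define EX_FORMAT_MOD_LINEAR  0x0000000000000000ULL
#define EX_FORMAT_MOD_INVALID 0x00ffffffffffffffULL

struct ex_image_desc {
	uint32_t format;
	uint32_t width;
	uint32_t height;
	uint64_t modifier;
};

/* Zero-filling alone would leave modifier == 0, which claims a linear
 * layout. Set the modifier to the invalid token so that "not yet known"
 * cannot be mistaken for "linear". */
void ex_image_desc_init(struct ex_image_desc *desc)
{
	memset(desc, 0, sizeof(*desc));
	desc->modifier = EX_FORMAT_MOD_INVALID;
}
```

A descriptor created with calloc or a `{0}` initialiser and never updated would otherwise be indistinguishable from one that explicitly promises a linear layout.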

There are four cases where this token may be used:

  • during enumeration, an interface may return DRM_FORMAT_MOD_INVALID, either as the sole member of a modifier list to declare that explicit modifiers are not supported, or as part of a larger list to declare that implicit modifiers may be used

  • during allocation, a user may supply DRM_FORMAT_MOD_INVALID, either as the sole member of a modifier list (equivalent to not supplying a modifier list at all) to declare that explicit modifiers are not supported and must not be used, or as part of a larger list to declare that an allocation using implicit modifiers is acceptable

  • in a post-allocation query, an implementation may return DRM_FORMAT_MOD_INVALID as the modifier of the allocated buffer to declare that the underlying layout is implementation-defined and that an explicit modifier description is not available; per the above rules, this may only be returned when the user has included DRM_FORMAT_MOD_INVALID as part of the list of acceptable modifiers, or not provided a list

  • when importing a buffer, the user may supply DRM_FORMAT_MOD_INVALID as the buffer modifier (or not supply a modifier) to indicate that the modifier is unknown for whatever reason; this is only acceptable when the buffer has not been allocated with an explicit modifier

It follows from this that for any single buffer, the complete chain of operations formed by the producer and all the consumers must be either fully implicit or fully explicit. For example, if a user wishes to allocate a buffer for use between GPU, display, and media, but the media API does not support modifiers, then the user must not allocate the buffer with explicit modifiers and attempt to import the buffer into the media API with no modifier. Instead, the user must either perform the allocation using implicit modifiers, or allocate the buffer for media use separately and copy between the two buffers.

As one exception to the above, allocations may be ‘upgraded’ from implicit to explicit modifiers. For example, if the buffer is allocated with gbm_bo_create (taking no modifiers), the user may then query the modifier with gbm_bo_get_modifier and then use this modifier as an explicit modifier token if a valid modifier is returned.
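
One possible reading of the import rule and the ‘upgrade’ exception is sketched below as a predicate a client could apply before importing. The struct, its fields, and the function name are hypothetical bookkeeping on the client side, not part of any API, and EX_FORMAT_MOD_INVALID is a local stand-in whose value is believed to match DRM_FORMAT_MOD_INVALID.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in (assumed value); use drm_fourcc.h in real code. */
#define EX_FORMAT_MOD_INVALID 0x00ffffffffffffffULL

struct ex_buffer {
	/* True if the allocation request carried explicit modifiers. */
	bool allocated_explicit;
	/* Modifier reported by the post-allocation query; may be the
	 * invalid token when the layout is implementation-defined. */
	uint64_t modifier;
};

/* Decide whether import_modifier may be passed to an importing API. */
bool ex_import_modifier_allowed(const struct ex_buffer *buf,
				uint64_t import_modifier)
{
	/* Implicit import is only acceptable for implicit allocations. */
	if (import_modifier == EX_FORMAT_MOD_INVALID)
		return !buf->allocated_explicit;

	/* Explicit import must name the modifier the allocator reported.
	 * This also covers the upgrade case, where an implicit allocation
	 * later reports a valid modifier. */
	return import_modifier == buf->modifier;
}
```

Under this reading, a buffer allocated without modifiers whose query returns a valid modifier may be imported either implicitly or with that exact modifier, while a buffer allocated explicitly may never be imported with the invalid token.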

When allocating buffers for exchange between different users and modifiers are not available, implementations are strongly encouraged to use DRM_FORMAT_MOD_LINEAR for their allocation, as this is the universal baseline for exchange. However, it is not guaranteed that this will result in the correct interpretation of buffer content, as implicit modifier operation may still be subject to driver-specific heuristics.

Any new users - userspace programs and protocols, kernel subsystems, etc - wishing to exchange buffers must offer interoperability through dma-buf file descriptors for memory planes, DRM format tokens to describe the format, DRM format modifiers to describe the layout in memory, at least width and height for dimensions, and at least offset and stride for each memory plane.