DRM Memory Management

Modern Linux systems require a large amount of graphics memory to store frame buffers, textures, vertices and other graphics-related data. Given the very dynamic nature of much of that data, managing graphics memory efficiently is thus crucial for the graphics stack and plays a central role in the DRM infrastructure.

The DRM core includes two memory managers, namely Translation Table Maps (TTM) and the Graphics Execution Manager (GEM). TTM was the first DRM memory manager to be developed and tried to be a one-size-fits-them-all solution. It provides a single userspace API to accommodate the need of all hardware, supporting both Unified Memory Architecture (UMA) devices and devices with dedicated video RAM (i.e. most discrete video cards). This resulted in a large, complex piece of code that turned out to be hard to use for driver development.

GEM started as an Intel-sponsored project in reaction to TTM's complexity. Its design philosophy is completely different: instead of providing a solution to every graphics memory-related problem, GEM identified common code between drivers and created a support library to share it. GEM has simpler initialization and execution requirements than TTM, but has no video RAM management capabilities and is thus limited to UMA devices.

The Translation Table Manager (TTM)

TTM design background and information belongs here.

TTM initialization

Warning: This section is outdated.

Drivers wishing to support TTM must pass a filled ttm_bo_driver structure to ttm_bo_device_init, together with an initialized global reference to the memory manager. The ttm_bo_driver structure contains several fields with function pointers for initializing the TTM, allocating and freeing memory, waiting for command completion and fence synchronization, and memory migration.

The struct drm_global_reference is made up of several fields:

struct drm_global_reference {
        enum ttm_global_types global_type;
        size_t size;
        void *object;
        int (*init)(struct drm_global_reference *);
        void (*release)(struct drm_global_reference *);
};

There should be one global reference structure for your memory manager as a whole, and there will be others for each object created by the memory manager at runtime. Your global TTM should have a type of TTM_GLOBAL_TTM_MEM. The size field for the global object should be sizeof(struct ttm_mem_global), and the init and release hooks should point at your driver-specific init and release routines, which probably eventually call ttm_mem_global_init and ttm_mem_global_release, respectively.

Once your global TTM accounting structure is set up and initialized by calling ttm_global_item_ref() on it, you need to create a buffer object TTM to provide a pool for buffer object allocation by clients and the kernel itself. The type of this object should be TTM_GLOBAL_TTM_BO, and its size should be sizeof(struct ttm_bo_global). Again, driver-specific init and release functions may be provided, likely eventually calling ttm_bo_global_ref_init() and ttm_bo_global_ref_release(), respectively. Also, like the previous object, ttm_global_item_ref() is used to create an initial reference count for the TTM, which will call your initialization function.
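Putting the steps above together, the global reference setup could be sketched as follows. This is only a rough sketch of the (outdated) API described in this section; struct foo_device, its fields and the foo_ttm_*_global_* helpers are illustrative driver-specific names:

```c
/* Hedged sketch of the TTM global reference setup described above.
 * struct foo_device and the foo_ttm_*_global_* helpers are
 * illustrative driver-specific names. */
static int foo_ttm_global_init(struct foo_device *fdev)
{
	struct drm_global_reference *ref;
	int ret;

	/* One accounting object for the memory manager as a whole. */
	ref = &fdev->mem_global_ref;
	ref->global_type = TTM_GLOBAL_TTM_MEM;
	ref->size = sizeof(struct ttm_mem_global);
	ref->init = &foo_ttm_mem_global_init;	/* calls ttm_mem_global_init() */
	ref->release = &foo_ttm_mem_global_release;
	ret = ttm_global_item_ref(ref);		/* invokes ref->init */
	if (ret)
		return ret;

	/* The buffer object pool used by clients and the kernel. */
	ref = &fdev->bo_global_ref;
	ref->global_type = TTM_GLOBAL_TTM_BO;
	ref->size = sizeof(struct ttm_bo_global);
	ref->init = &foo_ttm_bo_global_init;	/* calls ttm_bo_global_ref_init() */
	ref->release = &foo_ttm_bo_global_release;
	return ttm_global_item_ref(ref);
}
```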

See the radeon_ttm.c file for an example of usage.

The Graphics Execution Manager (GEM)

The GEM design approach has resulted in a memory manager that doesn't provide full coverage of all (or even all common) use cases in its userspace or kernel API. GEM exposes a set of standard memory-related operations to userspace and a set of helper functions to drivers, and lets drivers implement hardware-specific operations with their own private API.

The GEM userspace API is described in the GEM - the Graphics Execution Manager article on LWN. While slightly outdated, the document provides a good overview of the GEM API principles. Buffer allocation and read and write operations, described as part of the common GEM API, are currently implemented using driver-specific ioctls.

GEM is data-agnostic. It manages abstract buffer objects without knowing what individual buffers contain. APIs that require knowledge of buffer contents or purpose, such as buffer allocation or synchronization primitives, are thus outside of the scope of GEM and must be implemented using driver-specific ioctls.

On a fundamental level, GEM involves several operations:

  • Memory allocation and freeing
  • Command execution
  • Aperture management at command execution time

Buffer object allocation is relatively straightforward and largely provided by Linux's shmem layer, which provides memory to back each object.

Device-specific operations, such as command execution, pinning, buffer read & write, mapping, and domain ownership transfers, are left to driver-specific ioctls.

GEM Initialization

Drivers that use GEM must set the DRIVER_GEM bit in the struct drm_driver driver_features field. The DRM core will then automatically initialize the GEM core before calling the load operation. Behind the scenes, this will create a DRM Memory Manager object which provides an address space pool for object allocation.
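For example, enabling GEM in a driver might look like this (a sketch; foo_driver is an illustrative name and all unrelated fields are elided):

```c
static struct drm_driver foo_driver = {
	.driver_features = DRIVER_GEM,	/* GEM core initialized before load */
	/* ... other operations and fields elided ... */
};
```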

In a KMS configuration, drivers need to allocate and initialize a command ring buffer following core GEM initialization if required by the hardware. UMA devices usually have what is called a "stolen" memory region, which provides space for the initial framebuffer and large, contiguous memory regions required by the device. This space is typically not managed by GEM, and must be initialized separately into its own DRM MM object.

GEM Objects Creation

GEM splits creation of GEM objects and allocation of the memory that backs them in two distinct operations.

GEM objects are represented by an instance of struct drm_gem_object. Drivers usually need to extend GEM objects with private information and thus create a driver-specific GEM object structure type that embeds an instance of struct drm_gem_object.

To create a GEM object, a driver allocates memory for an instance of its specific GEM object type and initializes the embedded struct drm_gem_object with a call to drm_gem_object_init(). The function takes a pointer to the DRM device, a pointer to the GEM object and the buffer object size in bytes.

GEM uses shmem to allocate anonymous pageable memory. drm_gem_object_init() will create an shmfs file of the requested size and store it into the struct drm_gem_object filp field. The memory is used as either main storage for the object when the graphics hardware uses system memory directly or as a backing store otherwise.

Drivers are responsible for the actual physical pages allocation by calling shmem_read_mapping_page_gfp() for each page. Note that they can decide to allocate pages when initializing the GEM object, or to delay allocation until the memory is needed (for instance when a page fault occurs as a result of a userspace memory access or when the driver needs to start a DMA transfer involving the memory).

Anonymous pageable memory allocation is not always desired, for instance when the hardware requires physically contiguous system memory as is often the case in embedded devices. Drivers can create GEM objects with no shmfs backing (called private GEM objects) by initializing them with a call to drm_gem_private_object_init() instead of drm_gem_object_init(). Storage for private GEM objects must be managed by drivers.
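The creation path described above can be sketched as follows, assuming a driver-specific object type (struct foo_gem_object and foo_gem_create() are illustrative names):

```c
struct foo_gem_object {
	struct drm_gem_object base;
	/* driver-private data ... */
};

static struct foo_gem_object *foo_gem_create(struct drm_device *dev,
					     size_t size)
{
	struct foo_gem_object *obj;
	int ret;

	obj = kzalloc(sizeof(*obj), GFP_KERNEL);
	if (!obj)
		return ERR_PTR(-ENOMEM);

	/* Creates the shmfs backing file of the requested size. For a
	 * private GEM object, drm_gem_private_object_init() would be
	 * called here instead and storage managed by the driver. */
	ret = drm_gem_object_init(dev, &obj->base, size);
	if (ret) {
		kfree(obj);
		return ERR_PTR(ret);
	}
	return obj;
}
```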

GEM Objects Lifetime

All GEM objects are reference-counted by the GEM core. References can be acquired and released by calling drm_gem_object_get() and drm_gem_object_put() respectively. The caller must hold the struct drm_device struct_mutex lock when calling drm_gem_object_get(). As a convenience, GEM provides a drm_gem_object_put_unlocked() function that can be called without holding the lock.

When the last reference to a GEM object is released the GEM core calls the struct drm_driver gem_free_object_unlocked operation. That operation is mandatory for GEM-enabled drivers and must free the GEM object and all associated resources.

void (*gem_free_object) (struct drm_gem_object *obj);

Drivers are responsible for freeing all GEM object resources. This includes the resources created by the GEM core, which need to be released with drm_gem_object_release().
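A driver's free operation might therefore look roughly like this (a sketch; struct foo_gem_object is an illustrative embedded-object type):

```c
static void foo_gem_free_object(struct drm_gem_object *gem_obj)
{
	struct foo_gem_object *obj =
		container_of(gem_obj, struct foo_gem_object, base);

	/* Release the resources created by the GEM core first
	 * (shmfs backing file, mmap offset, ...). */
	drm_gem_object_release(gem_obj);

	/* Then free driver-private resources and the object itself. */
	kfree(obj);
}
```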

GEM Objects Naming

Communication between userspace and the kernel refers to GEM objects using local handles, global names or, more recently, file descriptors. All of those are 32-bit integer values; the usual Linux kernel limits apply to the file descriptors.

GEM handles are local to a DRM file. Applications get a handle to a GEM object through a driver-specific ioctl, and can use that handle to refer to the GEM object in other standard or driver-specific ioctls. Closing a DRM file handle frees all its GEM handles and dereferences the associated GEM objects.

To create a handle for a GEM object drivers call drm_gem_handle_create(). The function takes a pointer to the DRM file and the GEM object and returns a locally unique handle. When the handle is no longer needed drivers delete it with a call to drm_gem_handle_delete(). Finally the GEM object associated with a handle can be retrieved by a call to drm_gem_object_lookup().

Handles don't take ownership of GEM objects, they only take a reference to the object that will be dropped when the handle is destroyed. To avoid leaking GEM objects, drivers must make sure they drop the reference(s) they own (such as the initial reference taken at object creation time) as appropriate, without any special consideration for the handle. For example, in the particular case of combined GEM object and handle creation in the implementation of the dumb_create operation, drivers must drop the initial reference to the GEM object before returning the handle.
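A combined creation-and-handle path, such as a dumb_create implementation, could therefore be sketched as follows (foo_gem_create() stands for a hypothetical driver allocation helper):

```c
static int foo_gem_create_with_handle(struct drm_file *file_priv,
				      struct drm_device *dev,
				      size_t size, u32 *handle)
{
	struct foo_gem_object *obj = foo_gem_create(dev, size);
	int ret;

	if (IS_ERR(obj))
		return PTR_ERR(obj);

	ret = drm_gem_handle_create(file_priv, &obj->base, handle);
	/* The handle now holds its own reference (if created), so the
	 * initial creation reference must be dropped unconditionally
	 * to avoid leaking the object. */
	drm_gem_object_put_unlocked(&obj->base);
	return ret;
}
```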

GEM names are similar in purpose to handles but are not local to DRM files. They can be passed between processes to reference a GEM object globally. Names can't be used directly to refer to objects in the DRM API; applications must convert handles to names and names to handles using the DRM_IOCTL_GEM_FLINK and DRM_IOCTL_GEM_OPEN ioctls respectively. The conversion is handled by the DRM core without any driver-specific support.
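From userspace, the handle/name conversion uses the two ioctls directly; a sketch (error handling shortened, fd is an open DRM file descriptor):

```c
/* Export a handle as a global name ... */
struct drm_gem_flink flink = { .handle = handle };
if (drmIoctl(fd, DRM_IOCTL_GEM_FLINK, &flink))
	return -errno;
/* flink.name can now be passed to another process. */

/* ... and turn a name back into a local handle. */
struct drm_gem_open op = { .name = flink.name };
if (drmIoctl(fd, DRM_IOCTL_GEM_OPEN, &op))
	return -errno;
/* op.handle now refers to the object in this DRM file. */
```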

GEM also supports buffer sharing with dma-buf file descriptors through PRIME. GEM-based drivers must use the provided helper functions to implement the exporting and importing correctly. See ?. Since sharing file descriptors is inherently more secure than the easily guessable and global GEM names, it is the preferred buffer sharing mechanism. Sharing buffers through GEM names is only supported for legacy userspace. Furthermore, PRIME also allows cross-device buffer sharing since it is based on dma-bufs.

GEM Objects Mapping

Because mapping operations are fairly heavyweight, GEM favours read/write-like access to buffers, implemented through driver-specific ioctls, over mapping buffers to userspace. However, when random access to the buffer is needed (to perform software rendering for instance), direct access to the object can be more efficient.

The mmap system call can't be used directly to map GEM objects, as they don't have their own file handle. Two alternative methods currently co-exist to map GEM objects to userspace. The first method uses a driver-specific ioctl to perform the mapping operation, calling do_mmap() under the hood. This is often considered dubious, seems to be discouraged for new GEM-enabled drivers, and will thus not be described here.

The second method uses the mmap system call on the DRM file handle.

void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset);

DRM identifies the GEM object to be mapped by a fake offset passed through the mmap offset argument. Prior to being mapped, a GEM object must thus be associated with a fake offset. To do so, drivers must call drm_gem_create_mmap_offset() on the object.

Once allocated, the fake offset value must be passed to the applicationin a driver-specific way and can then be used as the mmap offsetargument.
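From userspace the whole sequence then reduces to an ordinary mmap call; a sketch, where fake_offset was obtained through the driver-specific mechanism mentioned above:

```c
void *ptr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED,
		 drm_fd, fake_offset);
if (ptr == MAP_FAILED)
	return -errno;
/* ptr now gives direct CPU access to the GEM object's memory. */
```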

The GEM core provides a helper method drm_gem_mmap() to handle object mapping. The method can be set directly as the mmap file operation handler. It will look up the GEM object based on the offset value and set the VMA operations to the struct drm_driver gem_vm_ops field. Note that drm_gem_mmap() doesn't map memory to userspace, but relies on the driver-provided fault handler to map pages individually.

To use drm_gem_mmap(), drivers must fill the struct drm_driver gem_vm_ops field with a pointer to VM operations.

The VM operations are described by a struct vm_operations_struct made up of several fields, the more interesting ones being:

struct vm_operations_struct {
        void (*open)(struct vm_area_struct *area);
        void (*close)(struct vm_area_struct *area);
        vm_fault_t (*fault)(struct vm_fault *vmf);
};

The open and close operations must update the GEM object reference count. Drivers can use the drm_gem_vm_open() and drm_gem_vm_close() helper functions directly as open and close handlers.

The fault operation handler is responsible for mapping individual pages to userspace when a page fault occurs. Depending on the memory allocation scheme, drivers can allocate pages at fault time, or can decide to allocate memory for the GEM object at the time the object is created.
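A typical VM operations table combining the GEM helpers with a driver fault handler might look like this (foo_gem_fault is an illustrative name):

```c
static const struct vm_operations_struct foo_gem_vm_ops = {
	.open = drm_gem_vm_open,	/* takes a GEM object reference */
	.close = drm_gem_vm_close,	/* drops the reference */
	.fault = foo_gem_fault,		/* maps pages at fault time */
};
```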

Drivers that want to map the GEM object upfront instead of handling page faults can implement their own mmap file operation handler.

For platforms without MMU, the GEM core provides a helper method drm_gem_cma_get_unmapped_area(). The mmap() routines will call this to get a proposed address for the mapping.

To use drm_gem_cma_get_unmapped_area(), drivers must fill the struct file_operations get_unmapped_area field with a pointer to drm_gem_cma_get_unmapped_area().
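Wiring this up could be sketched as follows (a hedged example; foo_fops is illustrative and unrelated fields are elided):

```c
static const struct file_operations foo_fops = {
	.owner = THIS_MODULE,
	.mmap = drm_gem_mmap,
	/* Only needed on no-MMU platforms: */
	.get_unmapped_area = drm_gem_cma_get_unmapped_area,
	/* ... */
};
```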

More detailed information about get_unmapped_area can be found in Documentation/nommu-mmap.txt.

Memory Coherency

When mapped to the device or used in a command buffer, backing pages for an object are flushed to memory and marked write combined so as to be coherent with the GPU. Likewise, if the CPU accesses an object after the GPU has finished rendering to the object, then the object must be made coherent with the CPU's view of memory, usually involving GPU cache flushing of various kinds. This core CPU<->GPU coherency management is provided by a device-specific ioctl, which evaluates an object's current domain and performs any necessary flushing or synchronization to put the object into the desired coherency domain (note that the object may be busy, i.e. an active render target; in that case, setting the domain blocks the client and waits for rendering to complete before performing any necessary flushing operations).

Command Execution

Perhaps the most important GEM function for GPU devices is providing a command execution interface to clients. Client programs construct command buffers containing references to previously allocated memory objects, and then submit them to GEM. At that point, GEM takes care to bind all the objects into the GTT, execute the buffer, and provide necessary synchronization between clients accessing the same buffers. This often involves evicting some objects from the GTT and re-binding others (a fairly expensive operation), and providing relocation support which hides fixed GTT offsets from clients. Clients must take care not to submit command buffers that reference more objects than can fit in the GTT; otherwise, GEM will reject them and no rendering will occur. Similarly, if several objects in the buffer require fence registers to be allocated for correct rendering (e.g. 2D blits on pre-965 chips), care must be taken not to require more fence registers than are available to the client. Such resource management should be abstracted from the client in libdrm.

GEM Function Reference

struct drm_gem_object_funcs

GEM object functions

Definition

struct drm_gem_object_funcs {
        void (*free)(struct drm_gem_object *obj);
        int (*open)(struct drm_gem_object *obj, struct drm_file *file);
        void (*close)(struct drm_gem_object *obj, struct drm_file *file);
        void (*print_info)(struct drm_printer *p, unsigned int indent, const struct drm_gem_object *obj);
        struct dma_buf *(*export)(struct drm_gem_object *obj, int flags);
        int (*pin)(struct drm_gem_object *obj);
        void (*unpin)(struct drm_gem_object *obj);
        struct sg_table *(*get_sg_table)(struct drm_gem_object *obj);
        void *(*vmap)(struct drm_gem_object *obj);
        void (*vunmap)(struct drm_gem_object *obj, void *vaddr);
        int (*mmap)(struct drm_gem_object *obj, struct vm_area_struct *vma);
        const struct vm_operations_struct *vm_ops;
};

Members

free

Deconstructor for drm_gem_objects.

This callback is mandatory.

open

Called upon GEM handle creation.

This callback is optional.

close

Called upon GEM handle release.

This callback is optional.

print_info

If a driver subclasses struct drm_gem_object, it can implement this optional hook for printing additional driver-specific info.

drm_printf_indent() should be used in the callback, passing it the indent argument.

This callback is called from drm_gem_print_info().

This callback is optional.

export

Export backing buffer as a dma_buf. If this is not set, drm_gem_prime_export() is used.

This callback is optional.

pin

Pin backing buffer in memory. Used by the drm_gem_map_attach() helper.

This callback is optional.

unpin

Unpin backing buffer. Used by the drm_gem_map_detach() helper.

This callback is optional.

get_sg_table

Returns a Scatter-Gather table representation of the buffer. Used when exporting a buffer by the drm_gem_map_dma_buf() helper. Releasing is done by calling dma_unmap_sg_attrs() and sg_free_table() in drm_gem_unmap_buf(), therefore these helpers and this callback here cannot be used for sg tables pointing at driver private memory ranges.

See also drm_prime_pages_to_sg().

vmap

Returns a virtual address for the buffer. Used by the drm_gem_dmabuf_vmap() helper.

This callback is optional.

vunmap

Releases the address previously returned by vmap. Used by the drm_gem_dmabuf_vunmap() helper.

This callback is optional.

mmap

Handle mmap() of the GEM object, setting up the VMA accordingly.

This callback is optional.

The callback is used by both drm_gem_mmap_obj() and drm_gem_prime_mmap(). When mmap is present, vm_ops is not used; the mmap callback must set vma->vm_ops instead.

vm_ops

Virtual memory operations used with mmap.

This is optional but necessary for mmap support.

struct drm_gem_object

GEM buffer object

Definition

struct drm_gem_object {
        struct kref refcount;
        unsigned handle_count;
        struct drm_device *dev;
        struct file *filp;
        struct drm_vma_offset_node vma_node;
        size_t size;
        int name;
        struct dma_buf *dma_buf;
        struct dma_buf_attachment *import_attach;
        struct dma_resv *resv;
        struct dma_resv _resv;
        const struct drm_gem_object_funcs *funcs;
};

Members

refcount

Reference count of this object

Please use drm_gem_object_get() to acquire and drm_gem_object_put() or drm_gem_object_put_unlocked() to release a reference to a GEM buffer object.

handle_count

This is the GEM file_priv handle count of this object.

Each handle also holds a reference. Note that when the handle_count drops to 0 any global names (e.g. the id in the flink namespace) will be cleared.

Protected by drm_device.object_name_lock.

dev
DRM dev this object belongs to.
filp
SHMEM file node used as backing storage for swappable buffer objects. GEM also supports driver private objects with driver-specific backing storage (contiguous CMA memory, special reserved blocks). In this case filp is NULL.
vma_node

Mapping info for this object to support mmap. Drivers are supposed to allocate the mmap offset using drm_gem_create_mmap_offset(). The offset itself can be retrieved using drm_vma_node_offset_addr().

Memory mapping itself is handled by drm_gem_mmap(), which also checks that userspace is allowed to access the object.

size
Size of the object, in bytes. Immutable over the object's lifetime.
name
Global name for this object, starts at 1. 0 means unnamed. Access is covered by drm_device.object_name_lock. This is used by the GEM_FLINK and GEM_OPEN ioctls.
dma_buf

dma-buf associated with this GEM object.

Pointer to the dma-buf associated with this gem object (either through importing or exporting). We break the resulting reference loop when the last gem handle for this object is released.

Protected by drm_device.object_name_lock.

import_attach

dma-buf attachment backing this object.

Any foreign dma_buf imported as a gem object has this set to the attachment point for the device. This is invariant over the lifetime of a gem object.

The drm_driver.gem_free_object callback is responsible for cleaning up the dma_buf attachment and references acquired at import time.

Note that the drm gem/prime core does not depend upon drivers setting this field any more. So for drivers where this doesn't make sense (e.g. virtual devices or a displaylink behind a USB bus) they can simply leave it as NULL.

resv

Pointer to the reservation object associated with this GEM object.

Normally (resv == &_resv) except for imported GEM objects.

_resv

A reservation object for this GEM object.

This is unused for imported GEM objects.

funcs

Optional GEM object functions. If this is set, it will be used instead of the corresponding drm_driver GEM callbacks.

New drivers should use this.

Description

This structure defines the generic parts for GEM buffer objects, which are mostly around handling mmap and userspace handles.

Buffer objects are often abbreviated to BO.

DEFINE_DRM_GEM_FOPS(name)

macro to generate file operations for GEM drivers

Parameters

name
name for the generated structure

Description

This macro autogenerates a suitable struct file_operations for GEM based drivers, which can be assigned to drm_driver.fops. Note that this structure cannot be shared between drivers, because it contains a reference to the current module using THIS_MODULE.

Note that the declaration is already marked as static - if you need a non-static version of this you're probably doing it wrong and will break the THIS_MODULE reference by accident.
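Typical usage is a single invocation at file scope, with the generated structure then referenced from the driver (foo is an illustrative name):

```c
DEFINE_DRM_GEM_FOPS(foo_gem_fops);

static struct drm_driver foo_driver = {
	.fops = &foo_gem_fops,
	/* ... */
};
```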

void drm_gem_object_get(struct drm_gem_object *obj)

acquire a GEM buffer object reference

Parameters

struct drm_gem_object *obj
GEM buffer object

Description

This function acquires an additional reference to obj. It is illegal to call this without already holding a reference. No locks required.

void __drm_gem_object_put(struct drm_gem_object *obj)

raw function to release a GEM buffer object reference

Parameters

struct drm_gem_object *obj
GEM buffer object

Description

This function is meant to be used by drivers which are not encumbered with drm_device.struct_mutex legacy locking and which are using the gem_free_object_unlocked callback. It avoids all the locking checks and locking overhead of drm_gem_object_put() and drm_gem_object_put_unlocked().

Drivers should never call this directly in their code. Instead they should wrap it up into a driver_gem_object_put(struct driver_gem_object *obj) wrapper function, and use that. Shared code should never call this, to avoid breaking drivers by accident which still depend upon drm_device.struct_mutex locking.

int drm_gem_object_init(struct drm_device *dev, struct drm_gem_object *obj, size_t size)

initialize an allocated shmem-backed GEM object

Parameters

struct drm_device *dev
drm_device the object should be initialized for
struct drm_gem_object *obj
drm_gem_object to initialize
size_t size
object size

Description

Initialize an already allocated GEM object of the specified size with shmfs backing store.

void drm_gem_private_object_init(struct drm_device *dev, struct drm_gem_object *obj, size_t size)

initialize an allocated private GEM object

Parameters

struct drm_device *dev
drm_device the object should be initialized for
struct drm_gem_object *obj
drm_gem_object to initialize
size_t size
object size

Description

Initialize an already allocated GEM object of the specified size with no GEM provided backing store. Instead the caller is responsible for backing the object and handling it.

int drm_gem_handle_delete(struct drm_file *filp, u32 handle)

deletes the given file-private handle

Parameters

struct drm_file *filp
drm file-private structure to use for the handle look up
u32 handle
userspace handle to delete

Description

Removes the GEM handle from the filp lookup table which has been added with drm_gem_handle_create(). If this is the last handle also cleans up linked resources like GEM names.

int drm_gem_dumb_map_offset(struct drm_file *file, struct drm_device *dev, u32 handle, u64 *offset)

return the fake mmap offset for a gem object

Parameters

struct drm_file *file
drm file-private structure containing the gem object
struct drm_device *dev
corresponding drm_device
u32 handle
gem object handle
u64 *offset
return location for the fake mmap offset

Description

This implements the drm_driver.dumb_map_offset kms driver callback for drivers which use gem to manage their backing storage.

Return

0 on success or a negative error code on failure.

int drm_gem_dumb_destroy(struct drm_file *file, struct drm_device *dev, uint32_t handle)

dumb fb callback helper for gem based drivers

Parameters

struct drm_file *file
drm file-private structure to remove the dumb handle from
struct drm_device *dev
corresponding drm_device
uint32_t handle
the dumb handle to remove

Description

This implements the drm_driver.dumb_destroy kms driver callback for drivers which use gem to manage their backing storage.

int drm_gem_handle_create(struct drm_file *file_priv, struct drm_gem_object *obj, u32 *handlep)

create a gem handle for an object

Parameters

struct drm_file *file_priv
drm file-private structure to register the handle for
struct drm_gem_object *obj
object to register
u32 *handlep
pointer to return the created handle to the caller

Description

Create a handle for this object. This adds a handle reference to the object, which includes a regular reference count. Callers will likely want to dereference the object afterwards.

Since this publishes obj to userspace it must be fully set up by this point, drivers must call this last in their buffer object creation callbacks.

void drm_gem_free_mmap_offset(struct drm_gem_object *obj)

release a fake mmap offset for an object

Parameters

struct drm_gem_object *obj
obj in question

Description

This routine frees fake offsets allocated by drm_gem_create_mmap_offset().

Note that drm_gem_object_release() already calls this function, so drivers don't have to take care of releasing the mmap offset themselves when freeing the GEM object.

int drm_gem_create_mmap_offset_size(struct drm_gem_object *obj, size_t size)

create a fake mmap offset for an object

Parameters

struct drm_gem_object *obj
obj in question
size_t size
the virtual size

Description

GEM memory mapping works by handing back to userspace a fake mmap offset it can use in a subsequent mmap(2) call. The DRM core code then looks up the object based on the offset and sets up the various memory mapping structures.

This routine allocates and attaches a fake offset for obj, in cases where the virtual size differs from the physical size (i.e. drm_gem_object.size). Otherwise just use drm_gem_create_mmap_offset().

This function is idempotent and handles an already allocated mmap offset transparently. Drivers do not need to check for this case.

int drm_gem_create_mmap_offset(struct drm_gem_object *obj)

create a fake mmap offset for an object

Parameters

struct drm_gem_object *obj
obj in question

Description

GEM memory mapping works by handing back to userspace a fake mmap offset it can use in a subsequent mmap(2) call. The DRM core code then looks up the object based on the offset and sets up the various memory mapping structures.

This routine allocates and attaches a fake offset for obj.

Drivers can call drm_gem_free_mmap_offset() before freeing obj to release the fake offset again.

struct page **drm_gem_get_pages(struct drm_gem_object *obj)

helper to allocate backing pages for a GEM object from shmem

Parameters

struct drm_gem_object *obj
obj in question

Description

This reads the page-array of the shmem-backing storage of the given gem object. An array of pages is returned. If a page is not allocated or swapped-out, this will allocate/swap-in the required pages. Note that the whole object is covered by the page-array and pinned in memory.

Use drm_gem_put_pages() to release the array and unpin all pages.

This uses the GFP-mask set on the shmem-mapping (see mapping_set_gfp_mask()). If you require other GFP-masks, you have to do those allocations yourself.

Note that you are not allowed to change gfp-zones during runtime. That is, shmem_read_mapping_page_gfp() must be called with the same gfp_zone(gfp) as set during initialization. If you have special zone constraints, set them after drm_gem_object_init() via mapping_set_gfp_mask(). shmem-core takes care to keep pages in the required zone during swap-in.

void drm_gem_put_pages(struct drm_gem_object *obj, struct page **pages, bool dirty, bool accessed)

helper to free backing pages for a GEM object

Parameters

struct drm_gem_object *obj
obj in question
struct page **pages
pages to free
bool dirty
if true, pages will be marked as dirty
bool accessed
if true, the pages will be marked as accessed

int drm_gem_objects_lookup(struct drm_file *filp, void __user *bo_handles, int count, struct drm_gem_object ***objs_out)

look up GEM objects from an array of handles

Parameters

struct drm_file *filp
DRM file private data
void __user *bo_handles
user pointer to array of userspace handles
int count
size of handle array
struct drm_gem_object ***objs_out
returned pointer to array of drm_gem_object pointers

Description

Takes an array of userspace handles and returns a newly allocated array of GEM objects.

For a single handle lookup, use drm_gem_object_lookup().

Return

objs filled in with GEM object pointers. Returned GEM objects need to be released with drm_gem_object_put(). -ENOENT is returned on a lookup failure. 0 is returned on success.

struct drm_gem_object *drm_gem_object_lookup(struct drm_file *filp, u32 handle)

look up a GEM object from its handle

Parameters

struct drm_file *filp
DRM file private data
u32 handle
userspace handle

Return

A reference to the object named by the handle if such exists on filp, NULL otherwise.

Description

If looking up an array of handles, use drm_gem_objects_lookup().

long drm_gem_dma_resv_wait(struct drm_file *filep, u32 handle, bool wait_all, unsigned long timeout)

Wait on a GEM object's reservation's shared and/or exclusive fences.

Parameters

struct drm_file *filep
DRM file private data
u32 handle
userspace handle
bool wait_all
if true, wait on all fences, else wait on just the exclusive fence
unsigned long timeout
timeout value in jiffies or zero to return immediately

Return

Returns -ERESTARTSYS if interrupted, 0 if the wait timed out, or greater than 0 on success.

void drm_gem_object_release(struct drm_gem_object *obj)

release GEM buffer object resources

Parameters

struct drm_gem_object *obj
GEM buffer object

Description

This releases any structures and resources used by obj and is the inverse of drm_gem_object_init().

void drm_gem_object_free(struct kref *kref)

free a GEM object

Parameters

struct kref *kref
kref of the object to free

Description

Called after the last reference to the object has been lost. Must be called holding drm_device.struct_mutex.

Frees the object.

void drm_gem_object_put_unlocked(struct drm_gem_object *obj)

drop a GEM buffer object reference

Parameters

struct drm_gem_object *obj
GEM buffer object

Description

This releases a reference to obj. Callers must not hold the drm_device.struct_mutex lock when calling this function.

See also __drm_gem_object_put().

void drm_gem_object_put(struct drm_gem_object *obj)

release a GEM buffer object reference

Parameters

struct drm_gem_object *obj
GEM buffer object

Description

This releases a reference to obj. Callers must hold the drm_device.struct_mutex lock when calling this function, even when the driver doesn't use drm_device.struct_mutex for anything.

For drivers not encumbered with legacy locking use drm_gem_object_put_unlocked() instead.

void drm_gem_vm_open(struct vm_area_struct *vma)

vma->ops->open implementation for GEM

Parameters

struct vm_area_struct *vma
VM area structure

Description

This function implements the #vm_operations_struct open() callback for GEM drivers. This must be used together with drm_gem_vm_close().

void drm_gem_vm_close(struct vm_area_struct *vma)

vma->ops->close implementation for GEM

Parameters

struct vm_area_struct *vma
VM area structure

Description

This function implements the #vm_operations_struct close() callback for GEM drivers. This must be used together with drm_gem_vm_open().

int drm_gem_mmap_obj(struct drm_gem_object *obj, unsigned long obj_size, struct vm_area_struct *vma)

memory map a GEM object

Parameters

struct drm_gem_object *obj
the GEM object to map
unsigned long obj_size
the object size to be mapped, in bytes
struct vm_area_struct *vma
VMA for the area to be mapped

Description

Set up the VMA to prepare mapping of the GEM object using the gem_vm_ops provided by the driver. Depending on their requirements, drivers can either provide a fault handler in their gem_vm_ops (in which case any accesses to the object will be trapped, to perform migration, GTT binding, surface register allocation, or performance monitoring), or mmap the buffer memory synchronously after calling drm_gem_mmap_obj().

This function is mainly intended to implement the DMABUF mmap operation, when the GEM object is not looked up based on its fake offset. To implement the DRM mmap operation, drivers should use the drm_gem_mmap() function.

drm_gem_mmap_obj() assumes the user is granted access to the buffer while drm_gem_mmap() prevents unprivileged users from mapping random objects. So callers must verify access restrictions before calling this helper.

Return 0 on success or -EINVAL if the object size is smaller than the VMA size, or if no gem_vm_ops are provided.
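As a sketch of the DMABUF use case described above, a driver could forward its dma-buf mmap path to drm_gem_mmap_obj(). The function name my_gem_prime_mmap is hypothetical; only the drm_gem_mmap_obj() call is part of this API:

```c
/* Hypothetical gem_prime_mmap implementation. Access checks have
 * already been performed by the dma-buf layer, so the object can be
 * mapped directly with drm_gem_mmap_obj(). */
static int my_gem_prime_mmap(struct drm_gem_object *obj,
			     struct vm_area_struct *vma)
{
	return drm_gem_mmap_obj(obj, obj->size, vma);
}
```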

int drm_gem_mmap(struct file *filp, struct vm_area_struct *vma)

memory map routine for GEM objects

Parameters

struct file *filp
DRM file pointer
struct vm_area_struct *vma
VMA for the area to be mapped

Description

If a driver supports GEM object mapping, mmap calls on the DRM file descriptor will end up here.

Look up the GEM object based on the offset passed in (vma->vm_pgoff will contain the fake offset we created when the GTT map ioctl was called on the object) and map it with a call to drm_gem_mmap_obj().

If the caller is not granted access to the buffer object, the mmap will fail with -EACCES. Please see the vma manager for more information.

int drm_gem_lock_reservations(struct drm_gem_object **objs, int count, struct ww_acquire_ctx *acquire_ctx)

Sets up the ww context and acquires the lock on an array of GEM objects.

Parameters

struct drm_gem_object **objs
drm_gem_objects to lock
int count
Number of objects in objs
struct ww_acquire_ctx *acquire_ctx
struct ww_acquire_ctx that will be initialized as part of tracking this set of locked reservations.

Description

Once you’ve locked your reservations, you’ll want to set up space for your shared fences (if applicable), submit your job, then call drm_gem_unlock_reservations().
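A hedged sketch of that flow; my_job, my_submit_job and my_queue_job are illustrative names, only the drm_gem_* helpers belong to this API:

```c
static int my_submit_job(struct my_job *job)
{
	struct ww_acquire_ctx ctx;
	int ret;

	ret = drm_gem_lock_reservations(job->bos, job->bo_count, &ctx);
	if (ret)
		return ret;

	/* Reservations are held: reserve fence slots and queue the job. */
	ret = my_queue_job(job);

	drm_gem_unlock_reservations(job->bos, job->bo_count, &ctx);
	return ret;
}
```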

int drm_gem_fence_array_add(struct xarray *fence_array, struct dma_fence *fence)

Adds the fence to an array of fences to be waited on, deduplicating fences from the same context.

Parameters

struct xarray *fence_array
array of dma_fence * for the job to block on.
struct dma_fence *fence
the dma_fence to add to the list of dependencies.

Return

0 on success, or an error on failing to expand the array.

int drm_gem_fence_array_add_implicit(struct xarray *fence_array, struct drm_gem_object *obj, bool write)

Adds the implicit dependencies tracked in the GEM object’s reservation object to an array of dma_fences for use in scheduling a rendering job.

Parameters

struct xarray *fence_array
array of dma_fence * for the job to block on.
struct drm_gem_object *obj
the gem object to add new dependencies from.
bool write
whether the job might write the object (so we need to depend on shared fences in the reservation object).

Description

This should be called after drm_gem_lock_reservations() on your array of GEM objects used in the job but before updating the reservations with your own fences.
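An illustrative sketch of collecting the implicit fences of every object in a job into one dependency array; my_job and its fields are hypothetical:

```c
static int my_collect_implicit_deps(struct my_job *job)
{
	int i, ret;

	for (i = 0; i < job->bo_count; i++) {
		/* Pull in the fences already attached to each object's
		 * reservation; 'write' selects the shared fences too. */
		ret = drm_gem_fence_array_add_implicit(&job->deps,
						       job->bos[i],
						       job->is_write);
		if (ret)
			return ret;
	}
	return 0;
}
```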

GEM CMA Helper Functions Reference

The Contiguous Memory Allocator reserves a pool of memory at early bootthat is used to service requests for large blocks of contiguous memory.

The DRM GEM/CMA helpers use this allocator as a means to provide buffer objects that are physically contiguous in memory. This is useful for display drivers that are unable to map scattered buffers via an IOMMU.

struct drm_gem_cma_object

GEM object backed by CMA memory allocations

Definition

struct drm_gem_cma_object {
    struct drm_gem_object base;
    dma_addr_t paddr;
    struct sg_table *sgt;
    void *vaddr;
};

Members

base
base GEM object
paddr
physical address of the backing memory
sgt
scatter/gather table for imported PRIME buffers. The table can have more than one entry but they are guaranteed to have contiguous DMA addresses.
vaddr
kernel virtual address of the backing memory

DEFINE_DRM_GEM_CMA_FOPS(name)

macro to generate file operations for CMA drivers

Parameters

name
name for the generated structure

Description

This macro autogenerates a suitable struct file_operations for CMA based drivers, which can be assigned to drm_driver.fops. Note that this structure cannot be shared between drivers, because it contains a reference to the current module using THIS_MODULE.

Note that the declaration is already marked as static - if you need a non-static version of this you’re probably doing it wrong and will break the THIS_MODULE reference by accident.
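A minimal usage sketch; my_fops and my_driver are illustrative names:

```c
/* The macro declares a static struct file_operations named my_fops,
 * which is then referenced from the driver structure. */
DEFINE_DRM_GEM_CMA_FOPS(my_fops);

static struct drm_driver my_driver = {
	/* ... other callbacks and feature flags ... */
	.fops = &my_fops,
};
```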

DRM_GEM_CMA_VMAP_DRIVER_OPS()

CMA GEM driver operations ensuring a virtual address on the buffer

Parameters

Description

This macro provides a shortcut for setting the default GEM operations in the drm_driver structure for drivers that need the virtual address also on imported buffers.

struct drm_gem_cma_object *drm_gem_cma_create(struct drm_device *drm, size_t size)

allocate an object with the given size

Parameters

struct drm_device *drm
DRM device
size_t size
size of the object to allocate

Description

This function creates a CMA GEM object and allocates a contiguous chunk of memory as backing store. The backing memory has the writecombine attribute set.

Return

A struct drm_gem_cma_object * on success or an ERR_PTR()-encoded negative error code on failure.

void drm_gem_cma_free_object(struct drm_gem_object *gem_obj)

free resources associated with a CMA GEM object

Parameters

struct drm_gem_object *gem_obj
GEM object to free

Description

This function frees the backing memory of the CMA GEM object, cleans up the GEM object state and frees the memory used to store the object itself. If the buffer is imported and the virtual address is set, it is released. Drivers using the CMA helpers should set this as their drm_driver.gem_free_object_unlocked callback.

int drm_gem_cma_dumb_create_internal(struct drm_file *file_priv, struct drm_device *drm, struct drm_mode_create_dumb *args)

create a dumb buffer object

Parameters

struct drm_file *file_priv
DRM file-private structure to create the dumb buffer for
struct drm_device *drm
DRM device
struct drm_mode_create_dumb *args
IOCTL data

Description

This aligns the pitch and size arguments to the minimum required. This is an internal helper that can be wrapped by a driver to account for hardware with more specific alignment requirements. It should not be used directly as their drm_driver.dumb_create callback.

Return

0 on success or a negative error code on failure.

int drm_gem_cma_dumb_create(struct drm_file *file_priv, struct drm_device *drm, struct drm_mode_create_dumb *args)

create a dumb buffer object

Parameters

struct drm_file *file_priv
DRM file-private structure to create the dumb buffer for
struct drm_device *drm
DRM device
struct drm_mode_create_dumb *args
IOCTL data

Description

This function computes the pitch of the dumb buffer and rounds it up to an integer number of bytes per pixel. Drivers for hardware that doesn’t have any additional restrictions on the pitch can directly use this function as their drm_driver.dumb_create callback.

For hardware with additional restrictions, drivers can adjust the fields set up by userspace and pass the IOCTL data along to the drm_gem_cma_dumb_create_internal() function.
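A sketch of such a wrapper, assuming hypothetical hardware that requires the pitch to be a multiple of 64 bytes (the alignment value and function name are illustrative):

```c
static int my_dumb_create(struct drm_file *file_priv,
			  struct drm_device *drm,
			  struct drm_mode_create_dumb *args)
{
	/* Minimum pitch for the requested width and bits per pixel. */
	unsigned int min_pitch = DIV_ROUND_UP(args->width * args->bpp, 8);

	/* Hypothetical hardware restriction: 64-byte pitch alignment. */
	args->pitch = roundup(min_pitch, 64);

	return drm_gem_cma_dumb_create_internal(file_priv, drm, args);
}
```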

Return

0 on success or a negative error code on failure.

int drm_gem_cma_mmap(struct file *filp, struct vm_area_struct *vma)

memory-map a CMA GEM object

Parameters

struct file *filp
file object
struct vm_area_struct *vma
VMA for the area to be mapped

Description

This function implements an augmented version of the GEM DRM file mmap operation for CMA objects: In addition to the usual GEM VMA setup it immediately faults in the entire object instead of using on-demand faulting. Drivers which employ the CMA helpers should use this function as their ->mmap() handler in the DRM device file’s file_operations structure.

Instead of directly referencing this function, drivers should use the DEFINE_DRM_GEM_CMA_FOPS() macro.

Return

0 on success or a negative error code on failure.

unsigned long drm_gem_cma_get_unmapped_area(struct file *filp, unsigned long addr, unsigned long len, unsigned long pgoff, unsigned long flags)

propose address for mapping in noMMU cases

Parameters

struct file *filp
file object
unsigned long addr
memory address
unsigned long len
buffer size
unsigned long pgoff
page offset
unsigned long flags
memory flags

Description

This function is used in noMMU platforms to propose an address mapping for a given buffer. It’s intended to be used as a direct handler for the struct file_operations.get_unmapped_area operation.

Return

mapping address on success or a negative error code on failure.

void drm_gem_cma_print_info(struct drm_printer *p, unsigned int indent, const struct drm_gem_object *obj)

Print drm_gem_cma_object info for debugfs

Parameters

struct drm_printer *p
DRM printer
unsigned int indent
Tab indentation level
const struct drm_gem_object *obj
GEM object

Description

This function can be used as the drm_driver->gem_print_info callback. It prints paddr and vaddr for use in e.g. debugfs output.

struct sg_table *drm_gem_cma_prime_get_sg_table(struct drm_gem_object *obj)

provide a scatter/gather table of pinned pages for a CMA GEM object

Parameters

struct drm_gem_object *obj
GEM object

Description

This function exports a scatter/gather table suitable for PRIME usage by calling the standard DMA mapping API. Drivers using the CMA helpers should set this as their drm_driver.gem_prime_get_sg_table callback.

Return

A pointer to the scatter/gather table of pinned pages or NULL on failure.

struct drm_gem_object *drm_gem_cma_prime_import_sg_table(struct drm_device *dev, struct dma_buf_attachment *attach, struct sg_table *sgt)

produce a CMA GEM object from another driver’s scatter/gather table of pinned pages

Parameters

struct drm_device *dev
device to import into
struct dma_buf_attachment *attach
DMA-BUF attachment
struct sg_table *sgt
scatter/gather table of pinned pages

Description

This function imports a scatter/gather table exported via DMA-BUF by another driver. Imported buffers must be physically contiguous in memory (i.e. the scatter/gather table must contain a single entry). Drivers that use the CMA helpers should set this as their drm_driver.gem_prime_import_sg_table callback.

Return

A pointer to a newly created GEM object or an ERR_PTR-encoded negative error code on failure.

int drm_gem_cma_prime_mmap(struct drm_gem_object *obj, struct vm_area_struct *vma)

memory-map an exported CMA GEM object

Parameters

struct drm_gem_object *obj
GEM object
struct vm_area_struct *vma
VMA for the area to be mapped

Description

This function maps a buffer imported via DRM PRIME into a userspace process’s address space. Drivers that use the CMA helpers should set this as their drm_driver.gem_prime_mmap callback.

Return

0 on success or a negative error code on failure.

void *drm_gem_cma_prime_vmap(struct drm_gem_object *obj)

map a CMA GEM object into the kernel’s virtual address space

Parameters

struct drm_gem_object *obj
GEM object

Description

This function maps a buffer exported via DRM PRIME into the kernel’s virtual address space. Since the CMA buffers are already mapped into the kernel virtual address space this simply returns the cached virtual address. Drivers using the CMA helpers should set this as their DRM driver’s drm_driver.gem_prime_vmap callback.

Return

The kernel virtual address of the CMA GEM object’s backing store.

void drm_gem_cma_prime_vunmap(struct drm_gem_object *obj, void *vaddr)

unmap a CMA GEM object from the kernel’s virtual address space

Parameters

struct drm_gem_object *obj
GEM object
void *vaddr
kernel virtual address where the CMA GEM object was mapped

Description

This function removes a buffer exported via DRM PRIME from the kernel’s virtual address space. This is a no-op because CMA buffers cannot be unmapped from kernel space. Drivers using the CMA helpers should set this as their drm_driver.gem_prime_vunmap callback.

struct drm_gem_object *drm_cma_gem_create_object_default_funcs(struct drm_device *dev, size_t size)

Create a CMA GEM object with a default function table

Parameters

struct drm_device *dev
DRM device
size_t size
Size of the object to allocate

Description

This sets the GEM object functions to the default CMA helper functions. This function can be used as the drm_driver.gem_create_object callback.

Return

A pointer to an allocated GEM object or an error pointer on failure.

struct drm_gem_object *drm_gem_cma_prime_import_sg_table_vmap(struct drm_device *dev, struct dma_buf_attachment *attach, struct sg_table *sgt)

PRIME import another driver’s scatter/gather table and get the virtual address of the buffer

Parameters

struct drm_device *dev
DRM device
struct dma_buf_attachment *attach
DMA-BUF attachment
struct sg_table *sgt
Scatter/gather table of pinned pages

Description

This function imports a scatter/gather table using drm_gem_cma_prime_import_sg_table() and uses dma_buf_vmap() to get the kernel virtual address. This ensures that a CMA GEM object always has its virtual address set. This address is released when the object is freed.

This function can be used as the drm_driver.gem_prime_import_sg_table callback. The DRM_GEM_CMA_VMAP_DRIVER_OPS() macro provides a shortcut to set the necessary DRM driver operations.

Return

A pointer to a newly created GEM object or an ERR_PTR-encoded negative error code on failure.

GEM VRAM Helper Functions Reference

This library provides struct drm_gem_vram_object (GEM VRAM), a GEM buffer object that is backed by video RAM (VRAM). It can be used for framebuffer devices with dedicated memory.

The data structure struct drm_vram_mm and its helpers implement a memory manager for simple framebuffer devices with dedicated video memory. GEM VRAM buffer objects are either placed in the video memory or remain evicted to system memory.

With the GEM interface userspace applications create, manage and destroy graphics buffers, such as an on-screen framebuffer. GEM does not provide an implementation of these interfaces. It’s up to the DRM driver to provide an implementation that suits the hardware. If the hardware device contains dedicated video memory, the DRM driver can use the VRAM helper library. Each active buffer object is stored in video RAM. Active buffers are used for drawing the current frame, typically something like the frame’s scanout buffer or the cursor image. If there’s no more space left in VRAM, inactive GEM objects can be moved to system memory.

The easiest way to use the VRAM helper library is to call drm_vram_helper_alloc_mm(). The function allocates and initializes an instance of struct drm_vram_mm in struct drm_device.vram_mm. Use DRM_GEM_VRAM_DRIVER to initialize struct drm_driver and DRM_VRAM_MM_FILE_OPERATIONS to initialize struct file_operations, as illustrated below.

struct file_operations fops = {
        .owner = THIS_MODULE,
        DRM_VRAM_MM_FILE_OPERATIONS
};
struct drm_driver drv = {
        .driver_feature = DRM_...,
        .fops = &fops,
        DRM_GEM_VRAM_DRIVER
};

int init_drm_driver()
{
        struct drm_device *dev;
        uint64_t vram_base;
        unsigned long vram_size;
        int ret;

        /* setup device, vram base and size */
        /* ... */

        ret = drm_vram_helper_alloc_mm(dev, vram_base, vram_size);
        if (ret)
                return ret;
        return 0;
}

This creates an instance of struct drm_vram_mm, exports DRM userspace interfaces for GEM buffer management and initializes file operations to allow for accessing created GEM buffers. With this setup, the DRM driver manages an area of video RAM with VRAM MM and provides GEM VRAM objects to userspace.

To clean up the VRAM memory management, call drm_vram_helper_release_mm() in the driver’s clean-up code.

void fini_drm_driver()
{
        struct drm_device *dev = ...;

        drm_vram_helper_release_mm(dev);
}

For drawing or scanout operations, buffer objects have to be pinned in video RAM. Call drm_gem_vram_pin() with DRM_GEM_VRAM_PL_FLAG_VRAM or DRM_GEM_VRAM_PL_FLAG_SYSTEM to pin a buffer object in video RAM or system memory. Call drm_gem_vram_unpin() to release the pinned object afterwards.

A buffer object that is pinned in video RAM has a fixed address within that memory region. Call drm_gem_vram_offset() to retrieve this value. Typically it’s used to program the hardware’s scanout engine for framebuffers, set the cursor overlay’s image for a mouse cursor, or use it as input to the hardware’s drawing engine.

To access a buffer object’s memory from the DRM driver, call drm_gem_vram_kmap(). It (optionally) maps the buffer into kernel address space and returns the memory address. Use drm_gem_vram_kunmap() to release the mapping.

struct drm_gem_vram_object

GEM object backed by VRAM

Definition

struct drm_gem_vram_object {
    struct ttm_buffer_object bo;
    struct ttm_bo_kmap_obj kmap;
    unsigned int kmap_use_count;
    struct ttm_placement placement;
    struct ttm_place placements[2];
    int pin_count;
};

Members

bo
TTM buffer object
kmap
Mapping information for bo
kmap_use_count
Reference count on the virtual address. The address is unmapped when the count reaches zero.
placement
TTM placement information. Supported placements are TTM_PL_VRAM and TTM_PL_SYSTEM
placements
TTM placement information.
pin_count
Pin counter

Description

The type struct drm_gem_vram_object represents a GEM object that is backed by VRAM. It can be used for simple framebuffer devices with dedicated memory. The buffer object can be evicted to system memory if video memory becomes scarce.

GEM VRAM objects perform reference counting for pin and mapping operations. So a buffer object that has been pinned N times with drm_gem_vram_pin() must be unpinned N times with drm_gem_vram_unpin(). The same applies to pairs of drm_gem_vram_kmap() and drm_gem_vram_kunmap(), as well as pairs of drm_gem_vram_vmap() and drm_gem_vram_vunmap().
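A sketch of one balanced pin/map cycle; my_update_buffer and my_access_buffer are hypothetical names:

```c
static int my_update_buffer(struct drm_gem_vram_object *gbo)
{
	void *vaddr;
	int ret;

	ret = drm_gem_vram_pin(gbo, DRM_GEM_VRAM_PL_FLAG_VRAM);
	if (ret)
		return ret;

	vaddr = drm_gem_vram_kmap(gbo, true, NULL);
	if (IS_ERR(vaddr)) {
		drm_gem_vram_unpin(gbo);
		return PTR_ERR(vaddr);
	}

	my_access_buffer(vaddr);

	/* Each kmap and pin must be balanced by kunmap and unpin. */
	drm_gem_vram_kunmap(gbo);
	drm_gem_vram_unpin(gbo);
	return 0;
}
```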

struct drm_gem_vram_object *drm_gem_vram_of_bo(struct ttm_buffer_object *bo)

Parameters

struct ttm_buffer_object *bo
the VRAM buffer object

Description

Returns the container of type struct drm_gem_vram_object for field bo.

Return

The containing GEM VRAM object

struct drm_gem_vram_object *drm_gem_vram_of_gem(struct drm_gem_object *gem)

Parameters

struct drm_gem_object *gem
the GEM object

Description

Returns the container of type struct drm_gem_vram_object for field gem.

Return

The containing GEM VRAM object

DRM_GEM_VRAM_DRIVER()

default callback functions for struct drm_driver

Parameters

Description

Drivers that use VRAM MM and GEM VRAM can use this macro to initialize struct drm_driver with default functions.

struct drm_vram_mm

An instance of VRAM MM

Definition

struct drm_vram_mm {
    uint64_t vram_base;
    size_t vram_size;
    struct ttm_bo_device bdev;
};

Members

vram_base
Base address of the managed video memory
vram_size
Size of the managed video memory in bytes
bdev
The TTM BO device.

Description

The fields struct drm_vram_mm.vram_base and struct drm_vram_mm.vram_size are managed by VRAM MM, but are available for public read access. Use the field struct drm_vram_mm.bdev to access the TTM BO device.

struct drm_vram_mm *drm_vram_mm_of_bdev(struct ttm_bo_device *bdev)

Returns the container of type struct drm_vram_mm for field bdev.

Parameters

struct ttm_bo_device *bdev
the TTM BO device

Return

The containing instance of struct drm_vram_mm

struct drm_gem_vram_object *drm_gem_vram_create(struct drm_device *dev, size_t size, unsigned long pg_align)

Creates a VRAM-backed GEM object

Parameters

struct drm_device *dev
the DRM device
size_t size
the buffer size in bytes
unsigned long pg_align
the buffer’s alignment in multiples of the page size

Return

A new instance of struct drm_gem_vram_object on success, or an ERR_PTR()-encoded error code otherwise.

void drm_gem_vram_put(struct drm_gem_vram_object *gbo)

Releases a reference to a VRAM-backed GEM object

Parameters

struct drm_gem_vram_object *gbo
the GEM VRAM object

Description

See ttm_bo_put() for more information.

u64 drm_gem_vram_mmap_offset(struct drm_gem_vram_object *gbo)

Returns a GEM VRAM object’s mmap offset

Parameters

struct drm_gem_vram_object *gbo
the GEM VRAM object

Description

See drm_vma_node_offset_addr() for more information.

Return

The buffer object’s offset for userspace mappings on success, or 0 if no offset is allocated.

s64 drm_gem_vram_offset(struct drm_gem_vram_object *gbo)

Returns a GEM VRAM object’s offset in video memory

Parameters

struct drm_gem_vram_object *gbo
the GEM VRAM object

Description

This function returns the buffer object’s offset in the device’s video memory. The buffer object has to be pinned to TTM_PL_VRAM.

Return

The buffer object’s offset in video memory on success, or a negative errno code otherwise.

int drm_gem_vram_pin(struct drm_gem_vram_object *gbo, unsigned long pl_flag)

Pins a GEM VRAM object in a region.

Parameters

struct drm_gem_vram_object *gbo
the GEM VRAM object
unsigned long pl_flag
a bitmask of possible memory regions

Description

Pinning a buffer object ensures that it is not evicted from a memory region. A pinned buffer object has to be unpinned before it can be pinned to another region. If the pl_flag argument is 0, the buffer is pinned at its current location (video RAM or system memory).

Small buffer objects, such as cursor images, can lead to memory fragmentation if they are pinned in the middle of video RAM. This is especially a problem on devices with only a small amount of video RAM. Fragmentation can prevent the primary framebuffer from fitting in, even though there’s enough memory overall. The modifier DRM_GEM_VRAM_PL_FLAG_TOPDOWN marks the buffer object to be pinned at the high end of the memory region to avoid fragmentation.

Return

0 on success, or a negative error code otherwise.
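An illustrative fragment of the top-down placement just described; cursor_gbo is a hypothetical variable:

```c
/* Pin a small cursor object at the high end of VRAM so that the
 * middle of the region stays free for the primary framebuffer. */
ret = drm_gem_vram_pin(cursor_gbo, DRM_GEM_VRAM_PL_FLAG_VRAM |
				   DRM_GEM_VRAM_PL_FLAG_TOPDOWN);
if (ret)
	return ret;
```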

intdrm_gem_vram_unpin(structdrm_gem_vram_object * gbo)

Unpins a GEM VRAM object

Parameters

structdrm_gem_vram_object*gbo
the GEM VRAM object

Return

0 on success, ora negative error code otherwise.

void *drm_gem_vram_kmap(struct drm_gem_vram_object *gbo, bool map, bool *is_iomem)

Maps a GEM VRAM object into kernel address space

Parameters

struct drm_gem_vram_object *gbo
the GEM VRAM object
bool map
establish a mapping if necessary
bool *is_iomem
returns true if the mapped memory is I/O memory, or false otherwise; can be NULL

Description

This function maps the buffer object into the kernel’s address space or returns the current mapping. If the parameter map is false, the function only queries the current mapping, but does not establish a new one.

Return

The buffer’s virtual address if mapped, or NULL if not mapped, or an ERR_PTR()-encoded error code otherwise.

void drm_gem_vram_kunmap(struct drm_gem_vram_object *gbo)

Unmaps a GEM VRAM object

Parameters

struct drm_gem_vram_object *gbo
the GEM VRAM object

void *drm_gem_vram_vmap(struct drm_gem_vram_object *gbo)

Pins and maps a GEM VRAM object into kernel address space

Parameters

struct drm_gem_vram_object *gbo
The GEM VRAM object to map

Description

The vmap function pins a GEM VRAM object to its current location, either system or video memory, and maps its buffer into kernel address space. As pinned objects cannot be relocated, you should avoid pinning objects permanently. Call drm_gem_vram_vunmap() with the returned address to unmap and unpin the GEM VRAM object.

If you have special requirements for the pinning or mapping operations, call drm_gem_vram_pin() and drm_gem_vram_kmap() directly.

Return

The buffer’s virtual address on success, or an ERR_PTR()-encoded error code otherwise.
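A fragment sketching the short-lived access pattern described above:

```c
/* Pin and map in one call; vaddr stays valid until vunmap. */
void *vaddr = drm_gem_vram_vmap(gbo);

if (IS_ERR(vaddr))
	return PTR_ERR(vaddr);

/* ... access the buffer through vaddr ... */

drm_gem_vram_vunmap(gbo, vaddr);
```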

void drm_gem_vram_vunmap(struct drm_gem_vram_object *gbo, void *vaddr)

Unmaps and unpins a GEM VRAM object

Parameters

struct drm_gem_vram_object *gbo
The GEM VRAM object to unmap
void *vaddr
The mapping’s base address as returned by drm_gem_vram_vmap()

Description

A call to drm_gem_vram_vunmap() unmaps and unpins a GEM VRAM buffer. See the documentation for drm_gem_vram_vmap() for more information.

int drm_gem_vram_fill_create_dumb(struct drm_file *file, struct drm_device *dev, unsigned long pg_align, unsigned long pitch_align, struct drm_mode_create_dumb *args)

Helper for implementing struct drm_driver.dumb_create

Parameters

struct drm_file *file
the DRM file
struct drm_device *dev
the DRM device
unsigned long pg_align
the buffer’s alignment in multiples of the page size
unsigned long pitch_align
the scanline’s alignment in powers of 2
struct drm_mode_create_dumb *args
the arguments as provided to struct drm_driver.dumb_create

Description

This helper function fills struct drm_mode_create_dumb, which is used by struct drm_driver.dumb_create. Implementations of this interface should forward their arguments to this helper, plus the driver-specific parameters.

Return

0 on success, or a negative error code otherwise.

int drm_gem_vram_driver_dumb_create(struct drm_file *file, struct drm_device *dev, struct drm_mode_create_dumb *args)

Implements struct drm_driver.dumb_create

Parameters

struct drm_file *file
the DRM file
struct drm_device *dev
the DRM device
struct drm_mode_create_dumb *args
the arguments as provided to struct drm_driver.dumb_create

Description

This function requires the driver to use drm_device.vram_mm for its instance of VRAM MM.

Return

0 on success, or a negative error code otherwise.

int drm_gem_vram_driver_dumb_mmap_offset(struct drm_file *file, struct drm_device *dev, uint32_t handle, uint64_t *offset)

Implements struct drm_driver.dumb_mmap_offset

Parameters

struct drm_file *file
DRM file pointer.
struct drm_device *dev
DRM device.
uint32_t handle
GEM handle
uint64_t *offset
Returns the mapping’s memory offset on success

Return

0 on success, or a negative errno code otherwise.

int drm_gem_vram_plane_helper_prepare_fb(struct drm_plane *plane, struct drm_plane_state *new_state)

Parameters

struct drm_plane *plane
a DRM plane
struct drm_plane_state *new_state
the plane’s new state

Description

During plane updates, this function sets the plane’s fence and pins the GEM VRAM objects of the plane’s new framebuffer to VRAM. Call drm_gem_vram_plane_helper_cleanup_fb() to unpin them.

Return

0 on success, or a negative errno code otherwise.

void drm_gem_vram_plane_helper_cleanup_fb(struct drm_plane *plane, struct drm_plane_state *old_state)

Parameters

struct drm_plane *plane
a DRM plane
struct drm_plane_state *old_state
the plane’s old state

Description

During plane updates, this function unpins the GEM VRAM objects of the plane’s old framebuffer from VRAM. Complements drm_gem_vram_plane_helper_prepare_fb().
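A sketch of hooking both helpers into a plane's helper functions; my_plane_helper_funcs is an illustrative name:

```c
static const struct drm_plane_helper_funcs my_plane_helper_funcs = {
	/* ... atomic_check, atomic_update, ... */
	.prepare_fb = drm_gem_vram_plane_helper_prepare_fb,
	.cleanup_fb = drm_gem_vram_plane_helper_cleanup_fb,
};
```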

int drm_gem_vram_simple_display_pipe_prepare_fb(struct drm_simple_display_pipe *pipe, struct drm_plane_state *new_state)

Parameters

struct drm_simple_display_pipe *pipe
a simple display pipe
struct drm_plane_state *new_state
the plane’s new state

Description

During plane updates, this function pins the GEM VRAM objects of the plane’s new framebuffer to VRAM. Call drm_gem_vram_simple_display_pipe_cleanup_fb() to unpin them.

Return

0 on success, or a negative errno code otherwise.

void drm_gem_vram_simple_display_pipe_cleanup_fb(struct drm_simple_display_pipe *pipe, struct drm_plane_state *old_state)

Parameters

struct drm_simple_display_pipe *pipe
a simple display pipe
struct drm_plane_state *old_state
the plane’s old state

Description

During plane updates, this function unpins the GEM VRAM objects of the plane’s old framebuffer from VRAM. Complements drm_gem_vram_simple_display_pipe_prepare_fb().

void drm_vram_mm_debugfs_init(struct drm_minor *minor)

Register VRAM MM debugfs file.

Parameters

struct drm_minor *minor
drm minor device.

struct drm_vram_mm *drm_vram_helper_alloc_mm(struct drm_device *dev, uint64_t vram_base, size_t vram_size)

Allocates a device’s instance of struct drm_vram_mm

Parameters

struct drm_device *dev
the DRM device
uint64_t vram_base
the base address of the video memory
size_t vram_size
the size of the video memory in bytes

Return

The new instance of struct drm_vram_mm on success, or an ERR_PTR()-encoded errno code otherwise.

void drm_vram_helper_release_mm(struct drm_device *dev)

Releases a device’s instance of struct drm_vram_mm

Parameters

struct drm_device *dev
the DRM device

enum drm_mode_status drm_vram_helper_mode_valid(struct drm_device *dev, const struct drm_display_mode *mode)

Tests if a display mode’s framebuffer fits into the available video memory.

Parameters

struct drm_device *dev
the DRM device
const struct drm_display_mode *mode
the mode to test

Description

This function tests if enough video memory is available for using the specified display mode. Atomic modesetting requires importing the designated framebuffer into video memory before evicting the active one. Hence, any framebuffer may consume at most half of the available VRAM. Display modes that require a larger framebuffer cannot be used, even if the CRTC does support them. Each framebuffer is assumed to have 32-bit color depth.

Note

The function can only test if the display mode is supported in general. If there are too many framebuffers pinned to video memory, a display mode may still not be usable in practice. The color depth of 32-bit fits all current use cases. A more flexible test can be added when necessary.

Return

MODE_OK if the display mode is supported, or an error code of type enum drm_mode_status otherwise.

GEM TTM Helper Functions Reference

This library provides helper functions for GEM objects backed by TTM.

void drm_gem_ttm_print_info(struct drm_printer *p, unsigned int indent, const struct drm_gem_object *gem)

Print ttm_buffer_object info for debugfs

Parameters

struct drm_printer *p
DRM printer
unsigned int indent
Tab indentation level
const struct drm_gem_object *gem
GEM object

Description

This function can be used as the drm_gem_object_funcs.print_info callback.

int drm_gem_ttm_mmap(struct drm_gem_object *gem, struct vm_area_struct *vma)

mmap a ttm_buffer_object

Parameters

struct drm_gem_object *gem
GEM object.
struct vm_area_struct *vma
vm area.

Description

This function can be used as the drm_gem_object_funcs.mmap callback.

VMA Offset Manager

The vma-manager is responsible for mapping arbitrary driver-dependent memory regions into the linear user address-space. It provides offsets to the caller which can then be used on the address_space of the drm-device. It takes care to not overlap regions, size them appropriately and to not confuse mm-core by inconsistent fake vm_pgoff fields. Drivers shouldn’t use this for object placement in VMEM. This manager should only be used to manage mappings into linear user-space VMs.

We use drm_mm as backend to manage object allocations. But it is highly optimized for alloc/free calls, not lookups. Hence, we use an rb-tree to speed up offset lookups.

You must not use multiple offset managers on a single address_space. Otherwise, mm-core will be unable to tear down memory mappings as the VM will no longer be linear.

This offset manager works on page-based addresses. That is, every argument and return code (with the exception of drm_vma_node_offset_addr()) is given in number of pages, not number of bytes. That means, object sizes and offsets must always be page-aligned (as usual). If you want to get a valid byte-based user-space address for a given offset, please see drm_vma_node_offset_addr().

Additionally to offset management, the vma offset manager also handles access management. For every open-file context that is allowed to access a given node, you must call drm_vma_node_allow(). Otherwise, an mmap() call on this open-file with the offset of the node will fail with -EACCES. To revoke access again, use drm_vma_node_revoke(). However, the caller is responsible for destroying already existing mappings, if required.
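A sketch of this access management, granting access when a handle is opened on a DRM file and revoking it on close. The callback names are hypothetical, and the drm_vma_node_allow()/drm_vma_node_revoke() signatures shown (taking a struct drm_file tag) are an assumption about the current interface:

```c
static int my_gem_open(struct drm_gem_object *obj, struct drm_file *file)
{
	/* Allow mmap() of this node's offset on this open file. */
	return drm_vma_node_allow(&obj->vma_node, file);
}

static void my_gem_close(struct drm_gem_object *obj, struct drm_file *file)
{
	/* Existing mappings are not torn down here; see above. */
	drm_vma_node_revoke(&obj->vma_node, file);
}
```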

struct drm_vma_offset_node *drm_vma_offset_exact_lookup_locked(struct drm_vma_offset_manager *mgr, unsigned long start, unsigned long pages)

Look up node by exact address

Parameters

struct drm_vma_offset_manager *mgr
Manager object
unsigned long start
Start address (page-based, not byte-based)
unsigned long pages
Size of object (page-based)

Description

Same as drm_vma_offset_lookup_locked() but does not allow any offset into the node. It only returns the exact object with the given start address.

Return

Node at exact start address start.

void drm_vma_offset_lock_lookup(struct drm_vma_offset_manager *mgr)

Lock lookup for extended private use

Parameters

struct drm_vma_offset_manager *mgr
Manager object

Description

Lock VMA manager for extended lookups. Only locked VMA function calls are allowed while holding this lock. All other contexts are blocked from VMA until the lock is released via drm_vma_offset_unlock_lookup().

Use this if you need to take a reference to the objects returned by drm_vma_offset_lookup_locked() before releasing this lock again.

This lock must not be used for anything else than extended lookups. You must not call any other VMA helpers while holding this lock.

Note

You’re in atomic context while holding this lock!

void drm_vma_offset_unlock_lookup(struct drm_vma_offset_manager *mgr)

Unlock lookup for extended private use

Parameters

struct drm_vma_offset_manager *mgr
Manager object

Description

Release lookup-lock. See drm_vma_offset_lock_lookup() for more information.

void drm_vma_node_reset(struct drm_vma_offset_node *node)

Initialize or reset node object

Parameters

struct drm_vma_offset_node *node
Node to initialize or reset

Description

Reset a node to its initial state. This must be called before using it with any VMA offset manager.

This must not be called on an already allocated node, or you will leak memory.

unsigned long drm_vma_node_start(const struct drm_vma_offset_node *node)

Return start address for page-based addressing

Parameters

const struct drm_vma_offset_node *node
Node to inspect

Description

Return the start address of the given node. This can be used as offset into the linear VM space that is provided by the VMA offset manager. Note that this can only be used for page-based addressing. If you need a proper offset for user-space mappings, you must apply “<< PAGE_SHIFT” or use the drm_vma_node_offset_addr() helper instead.

Return

Start address of node for page-based addressing. 0 if the node does not have an offset allocated.

unsigned long drm_vma_node_size(struct drm_vma_offset_node *node)

Return size (page-based)

Parameters

struct drm_vma_offset_node *node
Node to inspect

Description

Return the size as number of pages for the given node. This is the same size that was passed to drm_vma_offset_add(). If no offset is allocated for the node, this is 0.

Return

Size of node as number of pages. 0 if the node does not have an offset allocated.

__u64 drm_vma_node_offset_addr(struct drm_vma_offset_node *node)

Return sanitized offset for user-space mmaps

Parameters

struct drm_vma_offset_node *node
Linked offset node

Description

Same as drm_vma_node_start() but returns the address as a valid offset that can be used for user-space mappings during mmap(). This must not be called on unlinked nodes.

Return

Offset of node for byte-based addressing. 0 if the node does not have an object allocated.

void drm_vma_node_unmap(struct drm_vma_offset_node *node, struct address_space *file_mapping)

Unmap offset node

Parameters

struct drm_vma_offset_node *node
Offset node
struct address_space *file_mapping
Address space to unmap node from

Description

Unmap all userspace mappings for a given offset node. The mappings must be associated with the file_mapping address space. If no offset exists, nothing is done.

This call is unlocked. The caller must guarantee that drm_vma_offset_remove() is not called on this node concurrently.

int drm_vma_node_verify_access(struct drm_vma_offset_node *node, struct drm_file *tag)

Access verification helper for TTM

Parameters

struct drm_vma_offset_node *node
Offset node
struct drm_file *tag
Tag of file to check

Description

This checks whether tag is granted access to node. It is the same as drm_vma_node_is_allowed() but suitable as a drop-in helper for TTM verify_access() callbacks.

Return

0 if access is granted, -EACCES otherwise.

void drm_vma_offset_manager_init(struct drm_vma_offset_manager *mgr, unsigned long page_offset, unsigned long size)

Initialize new offset-manager

Parameters

struct drm_vma_offset_manager *mgr
Manager object
unsigned long page_offset
Offset of available memory area (page-based)
unsigned long size
Size of available address space range (page-based)

Description

Initialize a new offset-manager. The offset and area size available for the manager are given as page_offset and size. Both are interpreted as page numbers, not bytes.

Adding/removing nodes from the manager is locked internally and protected against concurrent access. However, node allocation and destruction is left for the caller. While calling into the vma-manager, a given node must always be guaranteed to be referenced.

void drm_vma_offset_manager_destroy(struct drm_vma_offset_manager *mgr)

Destroy offset manager

Parameters

struct drm_vma_offset_manager *mgr
Manager object

Description

Destroy an object manager which was previously created via drm_vma_offset_manager_init(). The caller must remove all allocated nodes before destroying the manager. Otherwise, drm_mm will refuse to free the requested resources.

The manager must not be accessed after this function is called.

struct drm_vma_offset_node *drm_vma_offset_lookup_locked(struct drm_vma_offset_manager *mgr, unsigned long start, unsigned long pages)

Find node in offset space

Parameters

struct drm_vma_offset_manager *mgr
Manager object
unsigned long start
Start address for object (page-based)
unsigned long pages
Size of object (page-based)

Description

Find a node given a start address and object size. This returns the _best_ match for the given node. That is, start may point somewhere into a valid region and the given node will be returned, as long as the node spans the whole requested area (given the size in number of pages as pages).

Note that before lookup the vma offset manager lookup lock must be acquired with drm_vma_offset_lock_lookup(). See there for an example. This can then be used to implement weakly referenced lookups using kref_get_unless_zero().

Example

drm_vma_offset_lock_lookup(mgr);
node = drm_vma_offset_lookup_locked(mgr);
if (node)
    kref_get_unless_zero(container_of(node, sth, entr));
drm_vma_offset_unlock_lookup(mgr);

Return

Returns NULL if no suitable node can be found. Otherwise, the best match is returned. It’s the caller’s responsibility to make sure the node doesn’t get destroyed before the caller can access it.

int drm_vma_offset_add(struct drm_vma_offset_manager *mgr, struct drm_vma_offset_node *node, unsigned long pages)

Add offset node to manager

Parameters

struct drm_vma_offset_manager *mgr
Manager object
struct drm_vma_offset_node *node
Node to be added
unsigned long pages
Allocation size visible to user-space (in number of pages)

Description

Add a node to the offset-manager. If the node was already added, this does nothing and returns 0. pages is the size of the object given in number of pages. After this call succeeds, you can access the offset of the node until it is removed again.

If this call fails, it is safe to retry the operation or to call drm_vma_offset_remove() anyway. However, no cleanup is required in that case.

pages is not required to be the same size as the underlying memory object that you want to map. It only limits the size that user-space can map into their address space.

Return

0 on success, negative error code on failure.

void drm_vma_offset_remove(struct drm_vma_offset_manager *mgr, struct drm_vma_offset_node *node)

Remove offset node from manager

Parameters

struct drm_vma_offset_manager *mgr
Manager object
struct drm_vma_offset_node *node
Node to be removed

Description

Remove a node from the offset manager. If the node wasn’t added before, this does nothing. After this call returns, the offset and size will be 0 until a new offset is allocated via drm_vma_offset_add() again. Helper functions like drm_vma_node_start() and drm_vma_node_offset_addr() will return 0 if no offset is allocated.

int drm_vma_node_allow(struct drm_vma_offset_node *node, struct drm_file *tag)

Add open-file to list of allowed users

Parameters

struct drm_vma_offset_node *node
Node to modify
struct drm_file *tag
Tag of file to add

Description

Add tag to the list of allowed open-files for this node. If tag is already on this list, the ref-count is incremented.

The list of allowed-users is preserved across drm_vma_offset_add() and drm_vma_offset_remove() calls. You may even call it if the node is currently not added to any offset-manager.

You must remove all open-files the same number of times as you added them before destroying the node. Otherwise, you will leak memory.

This is locked against concurrent access internally.

Return

0 on success, negative error code on internal failure (out of memory)

void drm_vma_node_revoke(struct drm_vma_offset_node *node, struct drm_file *tag)

Remove open-file from list of allowed users

Parameters

struct drm_vma_offset_node *node
Node to modify
struct drm_file *tag
Tag of file to remove

Description

Decrement the ref-count of tag in the list of allowed open-files on node. If the ref-count drops to zero, remove tag from the list. You must call this once for every drm_vma_node_allow() on tag.

This is locked against concurrent access internally.

If tag is not on the list, nothing is done.

bool drm_vma_node_is_allowed(struct drm_vma_offset_node *node, struct drm_file *tag)

Check whether an open-file is granted access

Parameters

struct drm_vma_offset_node *node
Node to check
struct drm_file *tag
Tag of file to check

Description

Search the list in node for whether tag is currently on the list of allowed open-files (see drm_vma_node_allow()).

This is locked against concurrent access internally.

Return

true iff tag is on the list

PRIME Buffer Sharing

PRIME is the cross-device buffer sharing framework in drm, originally created for the OPTIMUS range of multi-gpu platforms. To userspace, PRIME buffers are dma-buf based file descriptors.

Overview and Lifetime Rules

Similar to GEM global names, PRIME file descriptors are also used to share buffer objects across processes. They offer additional security: as file descriptors must be explicitly sent over UNIX domain sockets to be shared between applications, they can’t be guessed like the globally unique GEM names.

Drivers that support the PRIME API implement the drm_driver.prime_handle_to_fd and drm_driver.prime_fd_to_handle operations. GEM based drivers must use drm_gem_prime_handle_to_fd() and drm_gem_prime_fd_to_handle() to implement these. For GEM based drivers the actual driver interface is provided through the drm_gem_object_funcs.export and drm_driver.gem_prime_import hooks.

dma_buf_ops implementations for GEM drivers are all individually exported for drivers which need to overwrite or reimplement some of them.

Reference Counting for GEM Drivers

On export the dma_buf holds a reference to the exported buffer object, usually a drm_gem_object. It takes this reference in the PRIME_HANDLE_TO_FD IOCTL, when it first calls drm_gem_object_funcs.export and stores the exporting GEM object in the dma_buf.priv field. This reference needs to be released when the final reference to the dma_buf itself is dropped and its dma_buf_ops.release function is called. For GEM-based drivers, the dma_buf should be exported using drm_gem_dmabuf_export() and then released by drm_gem_dmabuf_release().

Thus the chain of references always flows in one direction, avoiding loops: importing GEM object -> dma-buf -> exported GEM bo. A further complication are the lookup caches for import and export. These are required to guarantee that any given object will always have only one unique userspace handle. This is required to allow userspace to detect duplicated imports, since some GEM drivers do fail command submissions if a given buffer object is listed more than once. These import and export caches in drm_prime_file_private only retain a weak reference, which is cleaned up when the corresponding object is released.

Self-importing: If userspace is using PRIME as a replacement for flink then it will get an fd->handle request for a GEM object that it created. Drivers should detect this situation and return back the underlying object from the dma-buf private. For GEM based drivers this is handled in drm_gem_prime_import() already.

PRIME Helper Functions

Drivers can implement drm_gem_object_funcs.export and drm_driver.gem_prime_import in terms of simpler APIs by using the helper functions drm_gem_prime_export() and drm_gem_prime_import(). These functions implement dma-buf support in terms of some lower-level helpers, which are again exported for drivers to use individually:

Exporting buffers

Optional pinning of buffers is handled at dma-buf attach and detach time in drm_gem_map_attach() and drm_gem_map_detach(). Backing storage itself is handled by drm_gem_map_dma_buf() and drm_gem_unmap_dma_buf(), which rely on drm_gem_object_funcs.get_sg_table.

For kernel-internal access there’s drm_gem_dmabuf_vmap() and drm_gem_dmabuf_vunmap(). Userspace mmap support is provided by drm_gem_dmabuf_mmap().

Note that these export helpers can only be used if the underlying backing storage is fully coherent and either permanently pinned, or it is safe to pin it indefinitely.
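Wired together, the export helpers above form a driver's dma_buf_ops table. A non-compilable sketch of that wiring in kernel context; the table name my_prime_dmabuf_ops is hypothetical, while the callbacks are the helpers documented in this section:

```c
/* Sketch: hooking the GEM PRIME export helpers into dma_buf_ops.
 * Only valid if the backing storage is coherent and pinnable. */
static const struct dma_buf_ops my_prime_dmabuf_ops = {
        .attach         = drm_gem_map_attach,
        .detach         = drm_gem_map_detach,
        .map_dma_buf    = drm_gem_map_dma_buf,
        .unmap_dma_buf  = drm_gem_unmap_dma_buf,
        .release        = drm_gem_dmabuf_release,
        .mmap           = drm_gem_dmabuf_mmap,
        .vmap           = drm_gem_dmabuf_vmap,
        .vunmap         = drm_gem_dmabuf_vunmap,
};
```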

FIXME: The underlying helper functions are named rather inconsistently.

Importing buffers

Importing dma-bufs using drm_gem_prime_import() relies on drm_driver.gem_prime_import_sg_table.

Note that similarly to the export helpers this permanently pins the underlying backing storage. That is fine for scanout, but it is not the best option for sharing lots of buffers for rendering.

PRIME Function References

struct drm_prime_file_private

per-file tracking for PRIME

Definition

struct drm_prime_file_private {};

Members

Description

This just contains the internal struct dma_buf and handle caches for each struct drm_file used by the PRIME core code.

struct dma_buf *drm_gem_dmabuf_export(struct drm_device *dev, struct dma_buf_export_info *exp_info)

dma_buf export implementation for GEM

Parameters

struct drm_device *dev
parent device for the exported dmabuf
struct dma_buf_export_info *exp_info
the export information used by dma_buf_export()

Description

This wraps dma_buf_export() for use by generic GEM drivers that are using drm_gem_dmabuf_release(). In addition to calling dma_buf_export(), we take a reference to the drm_device and the exported drm_gem_object (stored in dma_buf_export_info.priv) which is released by drm_gem_dmabuf_release().

Returns the new dmabuf.

void drm_gem_dmabuf_release(struct dma_buf *dma_buf)

dma_buf release implementation for GEM

Parameters

struct dma_buf *dma_buf
buffer to be released

Description

Generic release function for dma_bufs exported as PRIME buffers. GEM drivers must use this in their dma_buf_ops structure as the release callback. drm_gem_dmabuf_release() should be used in conjunction with drm_gem_dmabuf_export().

int drm_gem_prime_fd_to_handle(struct drm_device *dev, struct drm_file *file_priv, int prime_fd, uint32_t *handle)

PRIME import function for GEM drivers

Parameters

struct drm_device *dev
dev to export the buffer from
struct drm_file *file_priv
drm file-private structure
int prime_fd
fd id of the dma-buf which should be imported
uint32_t *handle
pointer to storage for the handle of the imported buffer object

Description

This is the PRIME import function which must be used mandatorily by GEM drivers to ensure correct lifetime management of the underlying GEM object. The actual importing of the GEM object from the dma-buf is done through the drm_driver.gem_prime_import driver callback.

Returns 0 on success or a negative error code on failure.

int drm_gem_prime_handle_to_fd(struct drm_device *dev, struct drm_file *file_priv, uint32_t handle, uint32_t flags, int *prime_fd)

PRIME export function for GEM drivers

Parameters

struct drm_device *dev
dev to export the buffer from
struct drm_file *file_priv
drm file-private structure
uint32_t handle
buffer handle to export
uint32_t flags
flags like DRM_CLOEXEC
int *prime_fd
pointer to storage for the fd id of the created dma-buf

Description

This is the PRIME export function which must be used mandatorily by GEM drivers to ensure correct lifetime management of the underlying GEM object. The actual exporting from GEM object to a dma-buf is done through the drm_driver.gem_prime_export driver callback.

int drm_gem_map_attach(struct dma_buf *dma_buf, struct dma_buf_attachment *attach)

dma_buf attach implementation for GEM

Parameters

struct dma_buf *dma_buf
buffer to attach device to
struct dma_buf_attachment *attach
buffer attachment data

Description

Calls drm_gem_object_funcs.pin for device specific handling. This can be used as the dma_buf_ops.attach callback. Must be used together with drm_gem_map_detach().

Returns 0 on success, negative error code on failure.

void drm_gem_map_detach(struct dma_buf *dma_buf, struct dma_buf_attachment *attach)

dma_buf detach implementation for GEM

Parameters

struct dma_buf *dma_buf
buffer to detach from
struct dma_buf_attachment *attach
attachment to be detached

Description

Calls drm_gem_object_funcs.unpin for device specific handling. Cleans up dma_buf_attachment from drm_gem_map_attach(). This can be used as the dma_buf_ops.detach callback.

struct sg_table *drm_gem_map_dma_buf(struct dma_buf_attachment *attach, enum dma_data_direction dir)

map_dma_buf implementation for GEM

Parameters

struct dma_buf_attachment *attach
attachment whose scatterlist is to be returned
enum dma_data_direction dir
direction of DMA transfer

Description

Calls drm_gem_object_funcs.get_sg_table and then maps the scatterlist. This can be used as the dma_buf_ops.map_dma_buf callback. Should be used together with drm_gem_unmap_dma_buf().

Return

sg_table containing the scatterlist to be returned; returns ERR_PTR on error. May return -EINTR if it is interrupted by a signal.

void drm_gem_unmap_dma_buf(struct dma_buf_attachment *attach, struct sg_table *sgt, enum dma_data_direction dir)

unmap_dma_buf implementation for GEM

Parameters

struct dma_buf_attachment *attach
attachment to unmap buffer from
struct sg_table *sgt
scatterlist info of the buffer to unmap
enum dma_data_direction dir
direction of DMA transfer

Description

This can be used as the dma_buf_ops.unmap_dma_buf callback.

void *drm_gem_dmabuf_vmap(struct dma_buf *dma_buf)

dma_buf vmap implementation for GEM

Parameters

struct dma_buf *dma_buf
buffer to be mapped

Description

Sets up a kernel virtual mapping. This can be used as the dma_buf_ops.vmap callback. Calls into drm_gem_object_funcs.vmap for device specific handling.

Returns the kernel virtual address or NULL on failure.

void drm_gem_dmabuf_vunmap(struct dma_buf *dma_buf, void *vaddr)

dma_buf vunmap implementation for GEM

Parameters

struct dma_buf *dma_buf
buffer to be unmapped
void *vaddr
the virtual address of the buffer

Description

Releases a kernel virtual mapping. This can be used as the dma_buf_ops.vunmap callback. Calls into drm_gem_object_funcs.vunmap for device specific handling.

int drm_gem_prime_mmap(struct drm_gem_object *obj, struct vm_area_struct *vma)

PRIME mmap function for GEM drivers

Parameters

struct drm_gem_object *obj
GEM object
struct vm_area_struct *vma
Virtual address range

Description

This function sets up a userspace mapping for PRIME exported buffers using the same codepath that is used for regular GEM buffer mapping on the DRM fd. The fake GEM offset is added to vma->vm_pgoff and drm_driver->fops->mmap is called to set up the mapping.

Drivers can use this as their drm_driver.gem_prime_mmap callback.

int drm_gem_dmabuf_mmap(struct dma_buf *dma_buf, struct vm_area_struct *vma)

dma_buf mmap implementation for GEM

Parameters

struct dma_buf *dma_buf
buffer to be mapped
struct vm_area_struct *vma
virtual address range

Description

Provides memory mapping for the buffer. This can be used as the dma_buf_ops.mmap callback. It just forwards to drm_driver.gem_prime_mmap, which should be set to drm_gem_prime_mmap().

FIXME: There’s really no point to this wrapper; drivers which need anything else but drm_gem_prime_mmap can roll their own dma_buf_ops.mmap callback.

Returns 0 on success or a negative error code on failure.

struct sg_table *drm_prime_pages_to_sg(struct page **pages, unsigned int nr_pages)

converts a page array into an sg list

Parameters

struct page **pages
pointer to the array of page pointers to convert
unsigned int nr_pages
length of the page vector

Description

This helper creates an sg table object from a set of pages. The driver is responsible for mapping the pages into the importer's address space for use with dma_buf itself.

This is useful for implementing drm_gem_object_funcs.get_sg_table.

struct dma_buf *drm_gem_prime_export(struct drm_gem_object *obj, int flags)

helper library implementation of the export callback

Parameters

struct drm_gem_object *obj
GEM object to export
int flags
flags like DRM_CLOEXEC and DRM_RDWR

Description

This is the implementation of the drm_gem_object_funcs.export functions for GEM drivers using the PRIME helpers. It is used as the default in drm_gem_prime_handle_to_fd().

struct drm_gem_object *drm_gem_prime_import_dev(struct drm_device *dev, struct dma_buf *dma_buf, struct device *attach_dev)

core implementation of the import callback

Parameters

struct drm_device *dev
drm_device to import into
struct dma_buf *dma_buf
dma-buf object to import
struct device *attach_dev
struct device to dma_buf attach

Description

This is the core of drm_gem_prime_import(). It’s designed to be called by drivers who want to use a different device structure than drm_device.dev for attaching via dma_buf. This function calls drm_driver.gem_prime_import_sg_table internally.

Drivers must arrange to call drm_prime_gem_destroy() from their drm_gem_object_funcs.free hook when using this function.

struct drm_gem_object *drm_gem_prime_import(struct drm_device *dev, struct dma_buf *dma_buf)

helper library implementation of the import callback

Parameters

struct drm_device *dev
drm_device to import into
struct dma_buf *dma_buf
dma-buf object to import

Description

This is the implementation of the gem_prime_import functions for GEM drivers using the PRIME helpers. Drivers can use this as their drm_driver.gem_prime_import implementation. It is used as the default implementation in drm_gem_prime_fd_to_handle().

Drivers must arrange to call drm_prime_gem_destroy() from their drm_gem_object_funcs.free hook when using this function.

int drm_prime_sg_to_page_addr_arrays(struct sg_table *sgt, struct page **pages, dma_addr_t *addrs, int max_entries)

convert an sg table into a page array

Parameters

struct sg_table *sgt
scatter-gather table to convert
struct page **pages
optional array of page pointers to store the page array in
dma_addr_t *addrs
optional array to store the dma bus address of each page
int max_entries
size of both the passed-in arrays

Description

Exports an sg table into an array of pages and addresses. This is currently required by the TTM driver in order to do correct fault handling.

Drivers can use this in their drm_driver.gem_prime_import_sg_table implementation.

void drm_prime_gem_destroy(struct drm_gem_object *obj, struct sg_table *sg)

helper to clean up a PRIME-imported GEM object

Parameters

struct drm_gem_object *obj
GEM object which was created from a dma-buf
struct sg_table *sg
the sg-table which was pinned at import time

Description

This is the cleanup function which GEM drivers need to call when they use drm_gem_prime_import() or drm_gem_prime_import_dev() to import dma-bufs.

DRM MM Range Allocator

Overview

drm_mm provides a simple range allocator. Drivers are free to use the resource allocator from the linux core if it suits them; the upside of drm_mm is that it’s in the DRM core, which means that it’s easier to extend for some of the crazier special purpose needs of gpus.

The main data struct is drm_mm; allocations are tracked in drm_mm_node. Drivers are free to embed either of them into their own suitable data structures. drm_mm itself will not do any memory allocations of its own, so if drivers choose not to embed nodes they still need to allocate them themselves.

The range allocator also supports reservation of preallocated blocks. This is useful for taking over initial mode setting configurations from the firmware, where an object needs to be created which exactly matches the firmware’s scanout target. As long as the range is still free, it can be inserted anytime after the allocator is initialized, which helps with avoiding looped dependencies in the driver load sequence.

drm_mm maintains a stack of most recently freed holes, which of all simplistic datastructures seems to be a fairly decent approach to clustering allocations and avoiding too much fragmentation. This means free space searches are O(num_holes). Given all the fancy features drm_mm supports, something better would be fairly complex, and since gfx thrashing is a fairly steep cliff this is not a real concern. Removing a node again is O(1).

drm_mm supports a few features: Alignment and range restrictions can be supplied. Furthermore every drm_mm_node has a color value (which is just an opaque unsigned long) which in conjunction with a driver callback can be used to implement sophisticated placement restrictions. The i915 DRM driver uses this to implement guard pages between incompatible caching domains in the graphics TT.

Two behaviors are supported for searching and allocating: bottom-up and top-down. The default is bottom-up. Top-down allocation can be used if the memory area has different restrictions, or just to reduce fragmentation.

Finally, iteration helpers to walk all nodes and all holes are provided, as are some basic allocator dumpers for debugging.

Note that this range allocator is not thread-safe; drivers need to protect modifications with their own locking. The idea behind this is that for a full memory manager additional data needs to be protected anyway, hence internal locking would be fully redundant.
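Putting the points above together, here is a minimal, non-compilable kernel-context sketch of a driver embedding drm_mm and drm_mm_node and supplying its own lock around insertion; the my_* names are hypothetical, only the drm_mm_* API is from the DRM core:

```c
/* Sketch: drm_mm does no allocations and no locking of its own,
 * so the driver embeds the node and supplies the lock. */
struct my_manager {
        struct drm_mm mm;
        struct mutex lock;              /* protects mm */
};

struct my_buffer {
        struct drm_mm_node node;        /* embedded, no extra allocation */
};

static int my_alloc_range(struct my_manager *mgr, struct my_buffer *buf,
                          u64 size)
{
        int ret;

        mutex_lock(&mgr->lock);
        ret = drm_mm_insert_node(&mgr->mm, &buf->node, size);
        mutex_unlock(&mgr->lock);
        return ret;
}
```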

LRU Scan/Eviction Support

Very often GPUs need to have contiguous allocations for a given object. When evicting objects to make space for a new one it is therefore not most efficient to simply select objects from the tail of an LRU until there’s a suitable hole: Especially for big objects or nodes that otherwise have special allocation constraints there’s a good chance we evict lots of (smaller) objects unnecessarily.

The DRM range allocator supports this use-case through the scanning interfaces. First a scan operation needs to be initialized with drm_mm_scan_init() or drm_mm_scan_init_with_range(). The driver adds objects to the roster, probably by walking an LRU list, but this can be freely implemented. Eviction candidates are added using drm_mm_scan_add_block() until a suitable hole is found or there are no further evictable objects. Eviction roster metadata is tracked in struct drm_mm_scan.

The driver must walk through all objects again in exactly the reverse order to restore the allocator state. Note that while the allocator is used in scan mode no other operation is allowed.

Finally the driver evicts all objects selected in the scan (those for which drm_mm_scan_remove_block() reported true), plus any overlapping nodes after color adjustment (drm_mm_scan_color_evict()). Adding and removing an object is O(1), and since freeing a node is also O(1) the overall complexity is O(scanned_objects). So, like the free stack which needs to be walked before a scan operation even begins, this is linear in the number of objects. It doesn’t seem to hurt too badly.
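The scan flow above can be sketched as follows. This is a non-compilable kernel-context sketch: the LRU list, list-link names and struct my_obj are hypothetical, only the drm_mm_scan_* calls are from the DRM core:

```c
/* Sketch: build an eviction roster until a hole fits, then unwind. */
struct drm_mm_scan scan;
struct my_obj *obj, *tmp;
LIST_HEAD(evict_list);
bool found = false;

drm_mm_scan_init(&scan, &mm, size, alignment, color, DRM_MM_INSERT_LOW);
list_for_each_entry(obj, &lru_list, lru_link) {
        /* Prepending keeps evict_list in reverse order of addition,
         * which is the order the unwind below must use. */
        list_add(&obj->evict_link, &evict_list);
        if (drm_mm_scan_add_block(&scan, &obj->node)) {
                found = true;   /* a suitable hole was assembled */
                break;
        }
}

/* Walk the roster in exact reverse order; blocks not needed for the
 * hole are restored into the allocator and dropped from the list. */
list_for_each_entry_safe(obj, tmp, &evict_list, evict_link) {
        if (!drm_mm_scan_remove_block(&scan, &obj->node))
                list_del(&obj->evict_link);     /* not selected: keep it */
}
/* Objects remaining on evict_list were selected and must be evicted. */
```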

DRM MM Range Allocator Function References

enum drm_mm_insert_mode

control search and allocation behaviour

Constants

DRM_MM_INSERT_BEST

Search for the smallest hole (within the search range) that fits the desired node.

Allocates the node from the bottom of the found hole.

DRM_MM_INSERT_LOW

Search for the lowest hole (address closest to 0, within the search range) that fits the desired node.

Allocates the node from the bottom of the found hole.

DRM_MM_INSERT_HIGH

Search for the highest hole (address closest to U64_MAX, within the search range) that fits the desired node.

Allocates the node from the top of the found hole. The specified alignment for the node is applied to the base of the node (drm_mm_node.start).

DRM_MM_INSERT_EVICT

Search for the most recently evicted hole (within the search range) that fits the desired node. This is appropriate for use immediately after performing an eviction scan (see drm_mm_scan_init()) and removing the selected nodes to form a hole.

Allocates the node from the bottom of the found hole.

DRM_MM_INSERT_ONCE

Only check the first hole for suitability and report -ENOSPC immediately otherwise, rather than checking every hole until a suitable one is found. Can only be used in conjunction with another search method such as DRM_MM_INSERT_HIGH or DRM_MM_INSERT_LOW.

DRM_MM_INSERT_HIGHEST

Only check the highest hole (the hole with the largest address) and insert the node at the top of the hole, or report -ENOSPC if unsuitable.

Does not search all holes.

DRM_MM_INSERT_LOWEST

Only check the lowest hole (the hole with the smallest address) and insert the node at the bottom of the hole, or report -ENOSPC if unsuitable.

Does not search all holes.

Description

The struct drm_mm range manager supports finding a suitable hole using a number of search trees. These trees are organised by size, by address, and in most-recent eviction order. This allows the user to find either the smallest hole to reuse, the lowest or highest address to reuse, or simply reuse the most recent eviction that fits. When allocating the drm_mm_node from within the hole, the drm_mm_insert_mode also dictates whether to allocate the lowest matching address or the highest.
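As a hedged sketch of picking one of these modes, the following non-compilable kernel-context fragment places a node top-down within a bounded range; the my_insert_high wrapper and the 4 GiB bound are hypothetical, the drm_mm_insert_node_in_range() call and DRM_MM_INSERT_HIGH flag are from the DRM core:

```c
/* Sketch: place a node in [0, 1 << 32) from the top of the range down,
 * keeping the low addresses free for more constrained allocations. */
static int my_insert_high(struct drm_mm *mm, struct drm_mm_node *node,
                          u64 size)
{
        return drm_mm_insert_node_in_range(mm, node, size,
                                           0,           /* alignment: any */
                                           0,           /* color: unused */
                                           0,           /* range start */
                                           1ULL << 32,  /* range end */
                                           DRM_MM_INSERT_HIGH);
}
```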

struct drm_mm_node

allocated block in the DRM allocator

Definition

struct drm_mm_node {
    unsigned long color;
    u64 start;
    u64 size;
};

Members

color
Opaque driver-private tag.
start
Start address of the allocated block.
size
Size of the allocated block.

Description

This represents an allocated block in a drm_mm allocator. Except for pre-reserved nodes inserted using drm_mm_reserve_node() the structure is entirely opaque and should only be accessed through the provided functions. Since allocation of these nodes is entirely handled by the driver they can be embedded.

struct drm_mm

DRM allocator

Definition

struct drm_mm {
    void (*color_adjust)(const struct drm_mm_node *node, unsigned long color, u64 *start, u64 *end);
};

Members

color_adjust
Optional driver callback to further apply restrictions on a hole. The node argument points at the node containing the hole from which the block would be allocated (see drm_mm_hole_follows() and friends). The other arguments are the size of the block to be allocated. The driver can adjust the start and end as needed to e.g. insert guard pages.

Description

DRM range allocator with a few special functions and features geared towards managing GPU memory. Except for the color_adjust callback the structure is entirely opaque and should only be accessed through the provided functions and macros. This structure can be embedded into larger driver structures.

struct drm_mm_scan

DRM allocator eviction roster data

Definition

struct drm_mm_scan {};

Members

Description

This structure tracks data needed for the eviction roster set up using drm_mm_scan_init(), and used with drm_mm_scan_add_block() and drm_mm_scan_remove_block(). The structure is entirely opaque and should only be accessed through the provided functions and macros. It is meant to be allocated temporarily by the driver on the stack.

bool drm_mm_node_allocated(const struct drm_mm_node *node)

checks whether a node is allocated

Parameters

const struct drm_mm_node *node
drm_mm_node to check

Description

Drivers are required to clear a node prior to using it with the drm_mm range manager.

Drivers should use this helper for proper encapsulation of drm_mm internals.

Return

True if the node is allocated.

bool drm_mm_initialized(const struct drm_mm *mm)

checks whether an allocator is initialized

Parameters

const struct drm_mm *mm
drm_mm to check

Description

Drivers should clear the struct drm_mm prior to initialisation if they want to use this function.

Drivers should use this helper for proper encapsulation of drm_mm internals.

Return

True if the mm is initialized.

bool drm_mm_hole_follows(const struct drm_mm_node *node)

checks whether a hole follows this node

Parameters

const struct drm_mm_node *node
drm_mm_node to check

Description

Holes are embedded into the drm_mm using the tail of a drm_mm_node. If you wish to know whether a hole follows this particular node, query this function. See also drm_mm_hole_node_start() and drm_mm_hole_node_end().

Return

True if a hole follows the node.

u64 drm_mm_hole_node_start(const struct drm_mm_node *hole_node)

computes the start of the hole following node

Parameters

const struct drm_mm_node *hole_node
drm_mm_node which implicitly tracks the following hole

Description

This is useful for driver-specific debug dumpers. Otherwise drivers should not inspect holes themselves. Drivers must check first whether a hole indeed follows by looking at drm_mm_hole_follows().

Return

Start of the subsequent hole.

u64 drm_mm_hole_node_end(const struct drm_mm_node *hole_node)

computes the end of the hole following node

Parameters

const struct drm_mm_node *hole_node
drm_mm_node which implicitly tracks the following hole

Description

This is useful for driver-specific debug dumpers. Otherwise drivers should not inspect holes themselves. Drivers must check first whether a hole indeed follows by looking at drm_mm_hole_follows().

Return

End of the subsequent hole.

drm_mm_nodes(mm)

list of nodes under the drm_mm range manager

Parameters

mm
the struct drm_mm range manager

Description

As the drm_mm range manager hides its node_list deep within its structure, extracting it looks painful and repetitive. This is not expected to be used outside of the drm_mm_for_each_node() macros and similar internal functions.

Return

The node list, may be empty.

drm_mm_for_each_node(entry, mm)

iterator to walk over all allocated nodes

Parameters

entry
struct drm_mm_node to assign to in each iteration step
mm
drm_mm allocator to walk

Description

This iterator walks over all nodes in the range allocator. It is implemented with list_for_each(), so it is not safe against removal of elements.

drm_mm_for_each_node_safe(entry, next, mm)

iterator to walk over all allocated nodes

Parameters

entry
struct drm_mm_node to assign to in each iteration step
next
struct drm_mm_node to store the next step
mm
drm_mm allocator to walk

Description

This iterator walks over all nodes in the range allocator. It is implemented with list_for_each_safe(), so it is safe against removal of elements.

drm_mm_for_each_hole(pos, mm, hole_start, hole_end)

iterator to walk over all holes

Parameters

pos
drm_mm_node used internally to track progress
mm
drm_mm allocator to walk
hole_start
ulong variable to assign the hole start to on each iteration
hole_end
ulong variable to assign the hole end to on each iteration

Description

This iterator walks over all holes in the range allocator. It is implemented with list_for_each(), so it is not safe against removal of elements. pos is used internally and will not reflect a real drm_mm_node for the very first hole. Hence users of this iterator may not access it.

Implementation Note: We need to inline list_for_each_entry in order to be able to set hole_start and hole_end on each iteration while keeping the macro sane.

int drm_mm_insert_node_generic(struct drm_mm *mm, struct drm_mm_node *node, u64 size, u64 alignment, unsigned long color, enum drm_mm_insert_mode mode)

search for space and insert node

Parameters

struct drm_mm *mm
drm_mm to allocate from
struct drm_mm_node *node
preallocated node to insert
u64 size
size of the allocation
u64 alignment
alignment of the allocation
unsigned long color
opaque tag value to use for this node
enum drm_mm_insert_mode mode
fine-tune the allocation search and placement

Description

This is a simplified version of drm_mm_insert_node_in_range() with no range restrictions applied.

The preallocated node must be cleared to 0.

Return

0 on success, -ENOSPC if there’s no suitable hole.

int drm_mm_insert_node(struct drm_mm *mm, struct drm_mm_node *node, u64 size)

search for space and insert node

Parameters

struct drm_mm *mm
drm_mm to allocate from
struct drm_mm_node *node
preallocated node to insert
u64 size
size of the allocation

Description

This is a simplified version of drm_mm_insert_node_generic() with color set to 0.

The preallocated node must be cleared to 0.

Return

0 on success, -ENOSPC if there’s no suitable hole.

bool drm_mm_clean(const struct drm_mm *mm)

checks whether an allocator is clean

Parameters

const struct drm_mm *mm
drm_mm allocator to check

Return

True if the allocator is completely free, false if there’s still a node allocated in it.

drm_mm_for_each_node_in_range(node__, mm__, start__, end__)

iterator to walk over a range of allocated nodes

Parameters

node__
drm_mm_node structure to assign to in each iteration step
mm__
drm_mm allocator to walk
start__
starting offset, the first node will overlap this
end__
ending offset, the last node will start before this (but may overlap)

Description

This iterator walks over all nodes in the range allocator that lie between start and end. It is implemented similarly to list_for_each(), but using the internal interval tree to accelerate the search for the starting node, and so is not safe against removal of elements. It assumes that end is within (or is the upper limit of) the drm_mm allocator. If [start, end] are beyond the range of the drm_mm, the iterator may walk over the special _unallocated_ drm_mm.head_node, and may even continue indefinitely.

void drm_mm_scan_init(struct drm_mm_scan *scan, struct drm_mm *mm, u64 size, u64 alignment, unsigned long color, enum drm_mm_insert_mode mode)

initialize lru scanning

Parameters

struct drm_mm_scan *scan
scan state
struct drm_mm *mm
drm_mm to scan
u64 size
size of the allocation
u64 alignment
alignment of the allocation
unsigned long color
opaque tag value to use for the allocation
enum drm_mm_insert_mode mode
fine-tune the allocation search and placement

Description

This is a simplified version of drm_mm_scan_init_with_range() with no range restrictions applied.

This simply sets up the scanning routines with the parameters for the desired hole.

Warning: As long as the scan list is non-empty, no other operations than adding/removing nodes to/from the scan list are allowed.

int drm_mm_reserve_node(struct drm_mm *mm, struct drm_mm_node *node)

insert a pre-initialized node

Parameters

struct drm_mm *mm
drm_mm allocator to insert node into
struct drm_mm_node *node
drm_mm_node to insert

Description

This function inserts an already set-up drm_mm_node into the allocator, meaning that start, size and color must be set by the caller. All other fields must be cleared to 0. This is useful to initialize the allocator with preallocated objects which must be set up before the range allocator can be set up, e.g. when taking over a firmware framebuffer.

Return

0 on success, -ENOSPC if there’s no hole where node is.
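For example, the firmware-framebuffer case mentioned above could be sketched as follows (the base/size values and the my_* wrapper are illustrative):

```c
/* Sketch: pre-reserve the range occupied by a firmware framebuffer so
 * the allocator never hands it out. Addresses are illustrative. */
static struct drm_mm_node fw_fb;

static int my_reserve_fw_fb(struct drm_mm *mm, u64 fb_base, u64 fb_size)
{
	memset(&fw_fb, 0, sizeof(fw_fb));	/* other fields must be 0 */
	fw_fb.start = fb_base;	/* caller sets start, size (and color) */
	fw_fb.size = fb_size;

	return drm_mm_reserve_node(mm, &fw_fb);	/* -ENOSPC if no hole there */
}
```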

int drm_mm_insert_node_in_range(struct drm_mm *const mm, struct drm_mm_node *const node, u64 size, u64 alignment, unsigned long color, u64 range_start, u64 range_end, enum drm_mm_insert_mode mode)

ranged search for space and insert node

Parameters

struct drm_mm *const mm
drm_mm to allocate from
struct drm_mm_node *const node
preallocated node to insert
u64 size
size of the allocation
u64 alignment
alignment of the allocation
unsigned long color
opaque tag value to use for this node
u64 range_start
start of the allowed range for this node
u64 range_end
end of the allowed range for this node
enum drm_mm_insert_mode mode
fine-tune the allocation search and placement

Description

The preallocated node must be cleared to 0.

Return

0 on success, -ENOSPC if there’s no suitable hole.

void drm_mm_remove_node(struct drm_mm_node *node)

Remove a memory node from the allocator.

Parameters

struct drm_mm_node *node
drm_mm_node to remove

Description

This just removes a node from its drm_mm allocator. The node does not need to be cleared again before it can be re-inserted into this or any other drm_mm allocator. It is a bug to call this function on an unallocated node.

void drm_mm_replace_node(struct drm_mm_node *old, struct drm_mm_node *new)

move an allocation from old to new

Parameters

struct drm_mm_node *old
drm_mm_node to remove from the allocator
struct drm_mm_node *new
drm_mm_node which should inherit old’s allocation

Description

This is useful for when drivers embed the drm_mm_node structure and hence can’t move allocations by reassigning pointers. It’s a combination of remove and insert with the guarantee that the allocation start will match.

void drm_mm_scan_init_with_range(struct drm_mm_scan *scan, struct drm_mm *mm, u64 size, u64 alignment, unsigned long color, u64 start, u64 end, enum drm_mm_insert_mode mode)

initialize range-restricted lru scanning

Parameters

struct drm_mm_scan *scan
scan state
struct drm_mm *mm
drm_mm to scan
u64 size
size of the allocation
u64 alignment
alignment of the allocation
unsigned long color
opaque tag value to use for the allocation
u64 start
start of the allowed range for the allocation
u64 end
end of the allowed range for the allocation
enum drm_mm_insert_mode mode
fine-tune the allocation search and placement

Description

This simply sets up the scanning routines with the parameters for the desired hole.

Warning: As long as the scan list is non-empty, no other operations than adding/removing nodes to/from the scan list are allowed.

bool drm_mm_scan_add_block(struct drm_mm_scan *scan, struct drm_mm_node *node)

add a node to the scan list

Parameters

struct drm_mm_scan *scan
the active drm_mm scanner
struct drm_mm_node *node
drm_mm_node to add

Description

Add a node to the scan list that might be freed to make space for the desired hole.

Return

True if a hole has been found, false otherwise.

bool drm_mm_scan_remove_block(struct drm_mm_scan *scan, struct drm_mm_node *node)

remove a node from the scan list

Parameters

struct drm_mm_scan *scan
the active drm_mm scanner
struct drm_mm_node *node
drm_mm_node to remove

Description

Nodes must be removed in exactly the reverse order from the scan list as they have been added (e.g. using list_add() as they are added and then list_for_each() over that eviction list to remove), otherwise the internal state of the memory manager will be corrupted.

When the scan list is empty, the selected memory nodes can be freed. An immediately following drm_mm_insert_node_in_range_generic() or one of the simpler versions of that function with !DRM_MM_SEARCH_BEST will then return the just freed block (because it’s at the top of the free_stack list).

Return

True if this block should be evicted, false otherwise. Will always return false when no hole has been found.
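The scan protocol above might be sketched as follows in a driver; the LRU list, the my_obj structure and my_evict() are assumptions about a typical driver, not part of drm_mm:

```c
/* Hedged sketch of the eviction roster protocol: walk an LRU, add
 * candidate nodes until a hole is found, then remove everything from
 * the scan (in reverse order of addition, which list_add() gives us)
 * and only afterwards evict the flagged nodes. */
static int my_evict_something(struct drm_mm *mm, struct list_head *lru,
			      u64 size, u64 alignment)
{
	struct drm_mm_scan scan;
	struct my_obj *obj, *next;
	LIST_HEAD(eviction_list);
	bool found = false;

	drm_mm_scan_init(&scan, mm, size, alignment, 0, DRM_MM_INSERT_BEST);

	list_for_each_entry(obj, lru, lru_link) {
		list_add(&obj->eviction_link, &eviction_list);
		if (drm_mm_scan_add_block(&scan, &obj->node)) {
			found = true;
			break;
		}
	}

	/* Phase 1: every candidate must leave the scan list before any
	 * other drm_mm operation; keep only the nodes flagged for
	 * eviction on the local list. */
	list_for_each_entry_safe(obj, next, &eviction_list, eviction_link) {
		if (!drm_mm_scan_remove_block(&scan, &obj->node))
			list_del(&obj->eviction_link);
	}

	/* Phase 2: the scan list is empty, so it is now safe to evict. */
	list_for_each_entry_safe(obj, next, &eviction_list, eviction_link) {
		list_del(&obj->eviction_link);
		my_evict(obj);	/* driver-specific; frees obj->node */
	}

	return found ? 0 : -ENOSPC;
}
```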

struct drm_mm_node *drm_mm_scan_color_evict(struct drm_mm_scan *scan)

evict overlapping nodes on either side of hole

Parameters

struct drm_mm_scan *scan
drm_mm scan with target hole

Description

After completing an eviction scan and removing the selected nodes, we may need to remove a few more nodes from either side of the target hole if mm.color_adjust is being used.

Return

A node to evict, or NULL if there are no overlapping nodes.

void drm_mm_init(struct drm_mm *mm, u64 start, u64 size)

initialize a drm-mm allocator

Parameters

struct drm_mm *mm
the drm_mm structure to initialize
u64 start
start of the range managed by mm
u64 size
size of the range managed by mm

Description

Note that mm must be cleared to 0 before calling this function.

void drm_mm_takedown(struct drm_mm *mm)

clean up a drm_mm allocator

Parameters

struct drm_mm *mm
drm_mm allocator to clean up

Description

Note that it is a bug to call this function on an allocator which is not clean.

void drm_mm_print(const struct drm_mm *mm, struct drm_printer *p)

print allocator state

Parameters

const struct drm_mm *mm
drm_mm allocator to print
struct drm_printer *p
DRM printer to use

DRM Cache Handling

void drm_clflush_pages(struct page *pages, unsigned long num_pages)

Flush dcache lines of a set of pages.

Parameters

struct page *pages
List of pages to be flushed.
unsigned long num_pages
Number of pages in the array.

Description

Flush every data cache line entry that points to an address belonging to a page in the array.

void drm_clflush_sg(struct sg_table *st)

Flush dcache lines pointing to a scatter-gather list.

Parameters

struct sg_table *st
struct sg_table.

Description

Flush every data cache line entry that points to an address in the sg.

void drm_clflush_virt_range(void *addr, unsigned long length)

Flush dcache lines of a region

Parameters

void *addr
Initial kernel memory address.
unsigned long length
Region size.

Description

Flush every data cache line entry that points to an address in the region requested.

DRM Sync Objects

DRM synchronisation objects (syncobj, see struct drm_syncobj) provide a container for a synchronization primitive which can be used by userspace to explicitly synchronize GPU commands, can be shared between userspace processes, and can be shared between different DRM drivers. Their primary use-case is to implement Vulkan fences and semaphores. The syncobj userspace API provides ioctls for several operations:

  • Creation and destruction of syncobjs
  • Import and export of syncobjs to/from a syncobj file descriptor
  • Import and export a syncobj’s underlying fence to/from a sync file
  • Reset a syncobj (set its fence to NULL)
  • Signal a syncobj (set a trivially signaled fence)
  • Wait for a syncobj’s fence to appear and be signaled

The syncobj userspace API also provides operations to manipulate a syncobj in terms of a timeline of struct dma_fence_chain rather than a single struct dma_fence, through the following operations:

  • Signal a given point on the timeline
  • Wait for a given point to appear and/or be signaled
  • Import and export from/to a given point of a timeline

At its core, a syncobj is simply a wrapper around a pointer to a struct dma_fence which may be NULL. When a syncobj is first created, its pointer is either NULL or a pointer to an already signaled fence depending on whether the DRM_SYNCOBJ_CREATE_SIGNALED flag is passed to DRM_IOCTL_SYNCOBJ_CREATE.

If the syncobj is considered as a binary (its state is either signaled or unsignaled) primitive, when GPU work is enqueued in a DRM driver to signal the syncobj, the syncobj’s fence is replaced with a fence which will be signaled by the completion of that work. If the syncobj is considered as a timeline primitive, when GPU work is enqueued in a DRM driver to signal a given point of the syncobj, a new struct dma_fence_chain is created, pointing to the DRM driver’s fence and also to the previous fence that was in the syncobj. The new struct dma_fence_chain fence replaces the syncobj’s fence and will be signaled by completion of the DRM driver’s work and also any work associated with the fence previously in the syncobj.

When GPU work which waits on a syncobj is enqueued in a DRM driver, at the time the work is enqueued, it waits on the syncobj’s fence before submitting the work to hardware. That fence is either:

  • The syncobj’s current fence if the syncobj is considered as a binary primitive.
  • The struct dma_fence associated with a given point if the syncobj is considered as a timeline primitive.

If the syncobj’s fence is NULL or not present in the syncobj’s timeline, the enqueue operation is expected to fail.

With a binary syncobj, all manipulation of the syncobj’s fence happens in terms of the current fence at the time the ioctl is called by userspace, regardless of whether that operation is an immediate host-side operation (signal or reset) or an operation which is enqueued in some driver queue. DRM_IOCTL_SYNCOBJ_RESET and DRM_IOCTL_SYNCOBJ_SIGNAL can be used to manipulate a syncobj from the host by resetting its pointer to NULL or setting its pointer to a fence which is already signaled.

With a timeline syncobj, all manipulation of the syncobj’s fence happens in terms of a u64 value referring to a point in the timeline. See dma_fence_chain_find_seqno() to see how a given point is found in the timeline.

Note that applications should be careful to always use the timeline set of ioctl() when dealing with a syncobj considered as a timeline. Using a binary set of ioctl() with a syncobj considered as a timeline could result in incorrect synchronization. The use of binary syncobjs is supported through the timeline set of ioctl() by using a point value of 0; this will reproduce the behavior of the binary set of ioctl() (for example replace the syncobj’s fence when signaling).
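A minimal userspace sketch of the creation ioctl described above (error handling elided; the wrapper functions are illustrative, while the struct and flag names come from the DRM uAPI):

```c
/* Userspace sketch: create a binary syncobj that starts signaled,
 * then destroy it. Requires a DRM device fd; error handling elided. */
#include <drm/drm.h>
#include <sys/ioctl.h>

static int create_signaled_syncobj(int drm_fd, __u32 *handle)
{
	struct drm_syncobj_create create = {
		.flags = DRM_SYNCOBJ_CREATE_SIGNALED,
	};
	int ret = ioctl(drm_fd, DRM_IOCTL_SYNCOBJ_CREATE, &create);

	if (ret == 0)
		*handle = create.handle;
	return ret;
}

static void destroy_syncobj(int drm_fd, __u32 handle)
{
	struct drm_syncobj_destroy destroy = { .handle = handle };

	ioctl(drm_fd, DRM_IOCTL_SYNCOBJ_DESTROY, &destroy);
}
```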

Host-side wait on syncobjs

DRM_IOCTL_SYNCOBJ_WAIT takes an array of syncobj handles and does a host-side wait on all of the syncobj fences simultaneously. If DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL is set, the wait ioctl will wait on all of the syncobj fences to be signaled before it returns. Otherwise, it returns once at least one syncobj fence has been signaled and the index of a signaled fence is written back to the client.

Unlike the enqueued GPU work dependencies which fail if they see a NULL fence in a syncobj, if DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT is set, the host-side wait will first wait for the syncobj to receive a non-NULL fence and then wait on that fence. If DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT is not set and any one of the syncobjs in the array has a NULL fence, -EINVAL will be returned. Assuming the syncobj starts off with a NULL fence, this allows a client to do a host wait in one thread (or process) which waits on GPU work submitted in another thread (or process) without having to manually synchronize between the two. This requirement is inherited from the Vulkan fence API.

Similarly, DRM_IOCTL_SYNCOBJ_TIMELINE_WAIT takes an array of syncobj handles as well as an array of u64 points and does a host-side wait on all of the syncobj fences at the given points simultaneously.

DRM_IOCTL_SYNCOBJ_TIMELINE_WAIT also adds the ability to wait for a given fence to materialize on the timeline without waiting for the fence to be signaled by using the DRM_SYNCOBJ_WAIT_FLAGS_WAIT_AVAILABLE flag. This requirement is inherited from the wait-before-signal behavior required by the Vulkan timeline semaphore API.

Import/export of syncobjs

DRM_IOCTL_SYNCOBJ_FD_TO_HANDLE and DRM_IOCTL_SYNCOBJ_HANDLE_TO_FD provide two mechanisms for import/export of syncobjs.

The first lets the client import or export an entire syncobj to a file descriptor. These fds are opaque and have no other use case, except passing the syncobj between processes. All exported file descriptors and any syncobj handles created as a result of importing those file descriptors own a reference to the same underlying struct drm_syncobj and the syncobj can be used persistently across all the processes with which it is shared. The syncobj is freed only once the last reference is dropped. Unlike dma-buf, importing a syncobj creates a new handle (with its own reference) for every import instead of de-duplicating. The primary use-case of this persistent import/export is for shared Vulkan fences and semaphores.

The second import/export mechanism, which is indicated by DRM_SYNCOBJ_FD_TO_HANDLE_FLAGS_IMPORT_SYNC_FILE or DRM_SYNCOBJ_HANDLE_TO_FD_FLAGS_EXPORT_SYNC_FILE, lets the client import/export the syncobj’s current fence from/to a sync_file. When a syncobj is exported to a sync file, that sync file wraps the syncobj’s fence at the time of export and any later signal or reset operations on the syncobj will not affect the exported sync file. When a sync file is imported into a syncobj, the syncobj’s fence is set to the fence wrapped by that sync file. Because sync files are immutable, resetting or signaling the syncobj will not affect any sync files whose fences have been imported into the syncobj.

Import/export of timeline points in timeline syncobjs

DRM_IOCTL_SYNCOBJ_TRANSFER provides a mechanism to transfer a struct dma_fence_chain of a syncobj at a given u64 point to another u64 point into another syncobj.

Note that if you want to transfer a struct dma_fence_chain from a given point on a timeline syncobj from/into a binary syncobj, you can use the point 0 to mean take/replace the fence in the syncobj.

struct drm_syncobj

sync object.

Definition

struct drm_syncobj {
    struct kref refcount;
    struct dma_fence __rcu *fence;
    struct list_head cb_list;
    spinlock_t lock;
    struct file *file;
};

Members

refcount
Reference count of this object.
fence

NULL or a pointer to the fence bound to this object.

This field should not be used directly. Use drm_syncobj_fence_get() and drm_syncobj_replace_fence() instead.

cb_list
List of callbacks to call when the fence gets replaced.
lock
Protects cb_list and write-locks fence.
file
A file backing for this syncobj.

Description

This structure defines a generic sync object which wraps a dma_fence.

void drm_syncobj_get(struct drm_syncobj *obj)

acquire a syncobj reference

Parameters

struct drm_syncobj *obj
sync object

Description

This acquires an additional reference to obj. It is illegal to call this without already holding a reference. No locks required.

void drm_syncobj_put(struct drm_syncobj *obj)

release a reference to a sync object.

Parameters

struct drm_syncobj *obj
sync object.

struct dma_fence *drm_syncobj_fence_get(struct drm_syncobj *syncobj)

get a reference to a fence in a sync object

Parameters

struct drm_syncobj *syncobj
sync object.

Description

This acquires an additional reference to the drm_syncobj.fence contained in syncobj, if not NULL. It is illegal to call this without already holding a reference. No locks required.

Return

Either the fence of syncobj or NULL if there’s none.

struct drm_syncobj *drm_syncobj_find(struct drm_file *file_private, u32 handle)

lookup and reference a sync object.

Parameters

struct drm_file *file_private
drm file private pointer
u32 handle
sync object handle to lookup.

Description

Returns a reference to the syncobj pointed to by handle or NULL. The reference must be released by calling drm_syncobj_put().

void drm_syncobj_add_point(struct drm_syncobj *syncobj, struct dma_fence_chain *chain, struct dma_fence *fence, uint64_t point)

add new timeline point to the syncobj

Parameters

struct drm_syncobj *syncobj
sync object to add the timeline point to
struct dma_fence_chain *chain
chain node to use to add the point
struct dma_fence *fence
fence to encapsulate in the chain node
uint64_t point
sequence number to use for the point

Description

Add the chain node as a new timeline point to the syncobj.

void drm_syncobj_replace_fence(struct drm_syncobj *syncobj, struct dma_fence *fence)

replace fence in a sync object.

Parameters

struct drm_syncobj *syncobj
Sync object to replace fence in
struct dma_fence *fence
fence to install in the sync object.

Description

This replaces the fence on a sync object.

int drm_syncobj_find_fence(struct drm_file *file_private, u32 handle, u64 point, u64 flags, struct dma_fence **fence)

lookup and reference the fence in a sync object

Parameters

struct drm_file *file_private
drm file private pointer
u32 handle
sync object handle to lookup.
u64 point
timeline point
u64 flags
DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT or not
struct dma_fence **fence
out parameter for the fence

Description

This is just a convenience function that combines drm_syncobj_find() and drm_syncobj_fence_get().

Returns 0 on success or a negative error value on failure. On success fence contains a reference to the fence, which must be released by calling dma_fence_put().

void drm_syncobj_free(struct kref *kref)

free a sync object.

Parameters

struct kref *kref
kref to free.

Description

Only to be called from kref_put in drm_syncobj_put.

int drm_syncobj_create(struct drm_syncobj **out_syncobj, uint32_t flags, struct dma_fence *fence)

create a new syncobj

Parameters

struct drm_syncobj **out_syncobj
returned syncobj
uint32_t flags
DRM_SYNCOBJ_* flags
struct dma_fence *fence
if non-NULL, the syncobj will represent this fence

Description

This is the first function to create a sync object. After creating, drivers probably want to make it available to userspace, either through drm_syncobj_get_handle() or drm_syncobj_get_fd().

Returns 0 on success or a negative error value on failure.

int drm_syncobj_get_handle(struct drm_file *file_private, struct drm_syncobj *syncobj, u32 *handle)

get a handle from a syncobj

Parameters

struct drm_file *file_private
drm file private pointer
struct drm_syncobj *syncobj
Sync object to export
u32 *handle
out parameter with the new handle

Description

Exports a sync object created with drm_syncobj_create() as a handle on file_private to userspace.

Returns 0 on success or a negative error value on failure.

int drm_syncobj_get_fd(struct drm_syncobj *syncobj, int *p_fd)

get a file descriptor from a syncobj

Parameters

struct drm_syncobj *syncobj
Sync object to export
int *p_fd
out parameter with the new file descriptor

Description

Exports a sync object created with drm_syncobj_create() as a file descriptor.

Returns 0 on success or a negative error value on failure.

signed long drm_timeout_abs_to_jiffies(int64_t timeout_nsec)

calculate jiffies timeout from absolute value

Parameters

int64_t timeout_nsec
absolute timeout in ns, 0 for poll

Description

Calculate the timeout in jiffies from an absolute time in sec/nsec.

GPU Scheduler

Overview

The GPU scheduler provides entities which allow userspace to push jobs into software queues which are then scheduled on a hardware run queue. The software queues have a priority among them. The scheduler selects the entities from the run queue using a FIFO. The scheduler provides dependency handling features among jobs. The driver is supposed to provide callback functions for backend operations to the scheduler, like submitting a job to the hardware run queue, returning the dependencies of a job, etc.

The organisation of the scheduler is the following:

  1. Each hw run queue has one scheduler
  2. Each scheduler has multiple run queues with different priorities (e.g., HIGH_HW, HIGH_SW, KERNEL, NORMAL)
  3. Each scheduler run queue has a queue of entities to schedule
  4. Entities themselves maintain a queue of jobs that will be scheduled on the hardware.

The jobs in an entity are always scheduled in the order that they were pushed.
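The driver/scheduler contract described above might be sketched like this; all my_* helpers are illustrative, while the ops struct and its hook names come from drm_sched_backend_ops below:

```c
/* Hedged sketch of a driver's scheduler backend. my_hw_submit(),
 * my_gpu_recover() and to_my_job() are illustrative driver internals. */
static struct dma_fence *my_sched_dependency(struct drm_sched_job *job,
					     struct drm_sched_entity *entity)
{
	return NULL;	/* no further dependencies: run_job() may be called */
}

static struct dma_fence *my_sched_run_job(struct drm_sched_job *job)
{
	/* submit to the hardware ring, return the hardware fence */
	return my_hw_submit(to_my_job(job));
}

static void my_sched_timedout_job(struct drm_sched_job *job)
{
	my_gpu_recover(to_my_job(job));	/* trigger GPU recovery */
}

static void my_sched_free_job(struct drm_sched_job *job)
{
	kfree(to_my_job(job));	/* finished fence signaled: clean up */
}

static const struct drm_sched_backend_ops my_sched_ops = {
	.dependency	= my_sched_dependency,
	.run_job	= my_sched_run_job,
	.timedout_job	= my_sched_timedout_job,
	.free_job	= my_sched_free_job,
};
```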

Scheduler Function References

struct drm_sched_entity

A wrapper around a job queue (typically attached to the DRM file_priv).

Definition

struct drm_sched_entity {
    struct list_head                list;
    struct drm_sched_rq             *rq;
    struct drm_gpu_scheduler        **sched_list;
    unsigned int                    num_sched_list;
    enum drm_sched_priority         priority;
    spinlock_t                      rq_lock;
    struct spsc_queue               job_queue;
    atomic_t                        fence_seq;
    uint64_t                        fence_context;
    struct dma_fence                *dependency;
    struct dma_fence_cb             cb;
    atomic_t                        *guilty;
    struct dma_fence                *last_scheduled;
    struct task_struct              *last_user;
    bool                            stopped;
    struct completion               entity_idle;
};

Members

list
used to append this struct to the list of entities in the runqueue.
rq
runqueue on which this entity is currently scheduled.
sched_list
A list of schedulers (drm_gpu_schedulers). Jobs from this entity can be scheduled on any scheduler on this list.
num_sched_list
number of drm_gpu_schedulers in the sched_list.
priority
priority of the entity
rq_lock
lock to modify the runqueue to which this entity belongs.
job_queue
the list of jobs of this entity.
fence_seq
a linearly increasing seqno incremented with each new drm_sched_fence which is part of the entity.
fence_context
a unique context for all the fences which belong to this entity. The drm_sched_fence.scheduled uses the fence_context but drm_sched_fence.finished uses fence_context + 1.
dependency
the dependency fence of the job which is on the top of the job queue.
cb
callback for the dependency fence above.
guilty
points to ctx’s guilty.
last_scheduled
points to the finished fence of the last scheduled job.
last_user
last group leader pushing a job into the entity.
stopped
Marks the entity as removed from rq and destined for termination.
entity_idle
Signals when the entity is not in use.

Description

Entities will emit jobs in order to their corresponding hardware ring, and the scheduler will alternate between entities based on scheduling policy.

struct drm_sched_rq

queue of entities to be scheduled.

Definition

struct drm_sched_rq {
    spinlock_t                      lock;
    struct drm_gpu_scheduler        *sched;
    struct list_head                entities;
    struct drm_sched_entity         *current_entity;
};

Members

lock
to modify the entities list.
sched
the scheduler to which this rq belongs.
entities
list of the entities to be scheduled.
current_entity
the entity which is to be scheduled.

Description

A run queue is a set of entities scheduling command submissions for one specific ring. It implements the scheduling policy that selects the next entity to emit commands from.

struct drm_sched_fence

fences corresponding to the scheduling of a job.

Definition

struct drm_sched_fence {
    struct dma_fence                scheduled;
    struct dma_fence                finished;
    struct dma_fence                *parent;
    struct drm_gpu_scheduler        *sched;
    spinlock_t                      lock;
    void                            *owner;
};

Members

scheduled
this fence is what will be signaled by the scheduler when the job is scheduled.
finished

this fence is what will be signaled by the scheduler when the job is completed.

When setting up an out fence for the job, you should use this, since it’s available immediately upon drm_sched_job_init(), and the fence returned by the driver from run_job() won’t be created until the dependencies have resolved.

parent
the fence returned by drm_sched_backend_ops.run_job when scheduling the job on hardware. We signal the drm_sched_fence.finished fence once parent is signalled.
sched
the scheduler instance to which the job having this struct belongs.
lock
the lock used by the scheduled and the finished fences.
owner
job owner for debugging

struct drm_sched_job

A job to be run by an entity.

Definition

struct drm_sched_job {
    struct spsc_node                queue_node;
    struct drm_gpu_scheduler        *sched;
    struct drm_sched_fence          *s_fence;
    struct dma_fence_cb             finish_cb;
    struct list_head                node;
    uint64_t                        id;
    atomic_t                        karma;
    enum drm_sched_priority         s_priority;
    struct drm_sched_entity         *entity;
    struct dma_fence_cb             cb;
};

Members

queue_node
used to append this struct to the queue of jobs in an entity.
sched
the scheduler instance on which this job is scheduled.
s_fence
contains the fences for the scheduling of the job.
finish_cb
the callback for the finished fence.
node
used to append this struct to the drm_gpu_scheduler.ring_mirror_list.
id
a unique id assigned to each job scheduled on the scheduler.
karma
increment on every hang caused by this job. If this exceeds the hang limit of the scheduler then the job is marked guilty and will not be scheduled further.
s_priority
the priority of the job.
entity
the entity to which this job belongs.
cb
the callback for the parent fence in s_fence.

Description

A job is created by the driver using drm_sched_job_init(), and the driver should call drm_sched_entity_push_job() once it wants the scheduler to schedule the job.

structdrm_sched_backend_ops

Definition

struct drm_sched_backend_ops {  struct dma_fence *(*dependency)(struct drm_sched_job *sched_job, struct drm_sched_entity *s_entity);  struct dma_fence *(*run_job)(struct drm_sched_job *sched_job);  void (*timedout_job)(struct drm_sched_job *sched_job);  void (*free_job)(struct drm_sched_job *sched_job);};

Members

dependency
Called when the scheduler is considering schedulingthis job next, to get another struct dma_fence for this job toblock on. Once it returns NULL, run_job() may be called.
run_job
Called to execute the job once all of the dependencies have been resolved. This may be called multiple times, if timedout_job() has happened and drm_sched_job_recovery() decides to try it again.
timedout_job
Called when a job has taken too long to execute, to trigger GPU recovery.
free_job
Called once the job’s finished fence has been signaled and it’s time to clean it up.

Description

Define the backend operations called by the scheduler; these functions should be implemented by the driver.
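A minimal implementation could look like the following sketch; all my_* helpers and types are illustrative placeholders for driver code, not part of the scheduler API:

static struct dma_fence *my_run_job(struct drm_sched_job *sched_job)
{
        /* Submit the job to hardware and return the hardware fence. */
        return my_hw_submit(sched_job);
}

static void my_timedout_job(struct drm_sched_job *sched_job)
{
        /* Kick off driver GPU recovery here. */
        my_gpu_recover(sched_job->sched);
}

static void my_free_job(struct drm_sched_job *sched_job)
{
        drm_sched_job_cleanup(sched_job);
        kfree(container_of(sched_job, struct my_job, base));
}

static const struct drm_sched_backend_ops my_sched_ops = {
        .run_job      = my_run_job,
        .timedout_job = my_timedout_job,
        .free_job     = my_free_job,
};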

struct drm_gpu_scheduler

Definition

struct drm_gpu_scheduler {
    const struct drm_sched_backend_ops *ops;
    uint32_t                        hw_submission_limit;
    long                            timeout;
    const char                      *name;
    struct drm_sched_rq             sched_rq[DRM_SCHED_PRIORITY_MAX];
    wait_queue_head_t               wake_up_worker;
    wait_queue_head_t               job_scheduled;
    atomic_t                        hw_rq_count;
    atomic64_t                      job_id_count;
    struct delayed_work             work_tdr;
    struct task_struct              *thread;
    struct list_head                ring_mirror_list;
    spinlock_t                      job_list_lock;
    int                             hang_limit;
    atomic_t                        num_jobs;
    bool                            ready;
    bool                            free_guilty;
};

Members

ops
backend operations provided by the driver.
hw_submission_limit
the max size of the hardware queue.
timeout
the time after which a job is removed from the scheduler.
name
name of the ring for which this scheduler is being used.
sched_rq
priority wise array of run queues.
wake_up_worker
the wait queue on which the scheduler sleeps until a job is ready to be scheduled.
job_scheduled
once drm_sched_entity_do_release is called the scheduler waits on this wait queue until all the scheduled jobs are finished.
hw_rq_count
the number of jobs currently in the hardware queue.
job_id_count
used to assign a unique id to each job.
work_tdr
schedules a delayed call to drm_sched_job_timedout after the timeout interval is over.
thread
the kthread on which the scheduler runs.
ring_mirror_list
the list of jobs which are currently in the job queue.
job_list_lock
lock to protect the ring_mirror_list.
hang_limit
once the hangs caused by a job cross this limit the job is marked guilty and will not be considered for scheduling further.
num_jobs
the number of jobs currently queued in the scheduler.
ready
marks if the underlying HW is ready to work
free_guilty
A hint to the timeout handler to free the guilty job.

Description

One scheduler is implemented for each hardware ring.

bool drm_sched_dependency_optimized(struct dma_fence *fence, struct drm_sched_entity *entity)

Parameters

struct dma_fence *fence
the dependency fence
struct drm_sched_entity *entity
the entity which depends on the above fence

Description

Returns true if the dependency can be optimized and false otherwise

void drm_sched_fault(struct drm_gpu_scheduler *sched)

immediately start timeout handler

Parameters

struct drm_gpu_scheduler *sched
scheduler where the timeout handling should be started.

Description

Start timeout handling immediately when the driver detects a hardware fault.
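For example, a driver with a dedicated fault interrupt could cut the timeout short from its handler; the my_device type and the interrupt wiring are illustrative assumptions:

static irqreturn_t my_fault_irq(int irq, void *arg)
{
        struct my_device *mdev = arg;

        /* A fault was detected: don't wait for the job timeout to expire. */
        drm_sched_fault(&mdev->sched);
        return IRQ_HANDLED;
}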

unsigned long drm_sched_suspend_timeout(struct drm_gpu_scheduler *sched)

Suspend scheduler job timeout

Parameters

struct drm_gpu_scheduler *sched
scheduler instance for which to suspend the timeout

Description

Suspend the delayed work timeout for the scheduler. This is done by modifying the delayed work timeout to an arbitrarily large value, MAX_SCHEDULE_TIMEOUT in this case.

Returns the timeout remaining

void drm_sched_resume_timeout(struct drm_gpu_scheduler *sched, unsigned long remaining)

Resume scheduler job timeout

Parameters

struct drm_gpu_scheduler *sched
scheduler instance for which to resume the timeout
unsigned long remaining
remaining timeout

Description

Resume the delayed work timeout for the scheduler.
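The suspend/resume pair is meant to bracket a driver operation that intentionally stalls the ring; a sketch, where my_do_maintenance() is a placeholder for such an operation:

static void my_quiesce(struct drm_gpu_scheduler *sched)
{
        unsigned long remaining;

        /* Park the job timeout while the ring is intentionally stalled. */
        remaining = drm_sched_suspend_timeout(sched);

        my_do_maintenance();

        /* Re-arm the timeout with the time the jobs had left. */
        drm_sched_resume_timeout(sched, remaining);
}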

void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)

stop the scheduler

Parameters

struct drm_gpu_scheduler *sched
scheduler instance
struct drm_sched_job *bad
job which caused the time out

Description

Stop the scheduler and also remove and free all completed jobs.

Note

The bad job will not be freed as it might be used later, so it is the caller’s responsibility to release it manually if it is no longer part of the mirror list.

void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)

recover jobs after a reset

Parameters

struct drm_gpu_scheduler *sched
scheduler instance
bool full_recovery
proceed with complete sched restart

void drm_sched_resubmit_jobs(struct drm_gpu_scheduler *sched)

helper to relaunch jobs from the ring mirror list

Parameters

struct drm_gpu_scheduler *sched
scheduler instance
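Together with drm_sched_stop() and drm_sched_start(), this gives the usual recovery sequence in a driver’s timedout_job handler; my_gpu_reset() is a placeholder for the driver’s actual hardware reset:

static void my_timedout_job(struct drm_sched_job *sched_job)
{
        struct drm_gpu_scheduler *sched = sched_job->sched;

        /* Stop scheduling and prune already-completed jobs. */
        drm_sched_stop(sched, sched_job);

        my_gpu_reset();

        /* Re-queue the unfinished jobs and restart the scheduler. */
        drm_sched_resubmit_jobs(sched);
        drm_sched_start(sched, true);
}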

int drm_sched_job_init(struct drm_sched_job *job, struct drm_sched_entity *entity, void *owner)

init a scheduler job

Parameters

struct drm_sched_job *job
scheduler job to init
struct drm_sched_entity *entity
scheduler entity to use
void *owner
job owner for debugging

Description

Refer to drm_sched_entity_push_job() documentation for locking considerations.

Returns 0 for success, negative error code otherwise.

void drm_sched_job_cleanup(struct drm_sched_job *job)

clean up scheduler job resources

Parameters

struct drm_sched_job *job
scheduler job to clean up

struct drm_gpu_scheduler *drm_sched_pick_best(struct drm_gpu_scheduler **sched_list, unsigned int num_sched_list)

Get a drm sched from a sched_list with the least load

Parameters

struct drm_gpu_scheduler **sched_list
list of drm_gpu_schedulers
unsigned int num_sched_list
number of drm_gpu_schedulers in the sched_list

Description

Returns a pointer to the sched with the least load, or NULL if none of the drm_gpu_schedulers are ready.

int drm_sched_init(struct drm_gpu_scheduler *sched, const struct drm_sched_backend_ops *ops, unsigned hw_submission, unsigned hang_limit, long timeout, const char *name)

Init a gpu scheduler instance

Parameters

struct drm_gpu_scheduler *sched
scheduler instance
const struct drm_sched_backend_ops *ops
backend operations for this scheduler
unsigned hw_submission
number of hw submissions that can be in flight
unsigned hang_limit
number of times to allow a job to hang before dropping it
long timeout
timeout value in jiffies for the scheduler
const char *name
name used for debugging

Description

Return 0 on success, otherwise error code.
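Since one scheduler instance is created per hardware ring, initialization typically loops over the rings. In the following sketch NUM_RINGS, my_sched_ops, the my_device layout and the chosen limits are illustrative assumptions:

static int my_init_scheds(struct my_device *mdev)
{
        int i, ret;

        for (i = 0; i < NUM_RINGS; i++) {
                ret = drm_sched_init(&mdev->ring[i].sched, &my_sched_ops,
                                     64,        /* hw_submission */
                                     3,         /* hang_limit */
                                     msecs_to_jiffies(10000), /* timeout */
                                     mdev->ring[i].name);
                if (ret)
                        return ret;
        }
        return 0;
}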

void drm_sched_fini(struct drm_gpu_scheduler *sched)

Destroy a gpu scheduler

Parameters

struct drm_gpu_scheduler *sched
scheduler instance

Description

Tears down and cleans up the scheduler.