DRM Memory Management

Modern Linux systems require large amounts of graphics memory to store frame buffers, textures, vertices and other graphics-related data. Given the very dynamic nature of much of that data, managing graphics memory efficiently is crucial for the graphics stack and plays a central role in the DRM infrastructure.

The DRM core includes two memory managers, namely the Translation Table Manager (TTM) and the Graphics Execution Manager (GEM). TTM was the first DRM memory manager to be developed and tried to be a one-size-fits-all solution. It provides a single userspace API to accommodate the needs of all hardware, supporting both Unified Memory Architecture (UMA) devices and devices with dedicated video RAM (i.e. most discrete video cards). This resulted in a large, complex piece of code that turned out to be hard to use for driver development.

GEM started as an Intel-sponsored project in reaction to TTM’s complexity. Its design philosophy is completely different: instead of providing a solution to every graphics memory-related problem, GEM identified common code between drivers and created a support library to share it. GEM has simpler initialization and execution requirements than TTM, but has no video RAM management capabilities and is thus limited to UMA devices.

The Translation Table Manager (TTM)

TTM is a memory manager for accelerator devices with dedicated memory.

The basic idea is that resources are grouped together in buffer objects of a certain size, and TTM handles the lifetime, movement and CPU mappings of those objects.

TODO: Add more design background and information here.

enum ttm_caching

CPU caching and BUS snooping behavior.

Constants

ttm_uncached

Most defensive option for device mappings, don’t even allow write combining.

ttm_write_combined

Don’t cache read accesses, but allow at least writes to be combined.

ttm_cached

Fully cached like normal system memory, requires that devices snoop the CPU cache on accesses.

TTM device object reference

struct ttm_global

Buffer object driver global data.

Definition:

struct ttm_global {
    struct page *dummy_read_page;
    struct list_head device_list;
    atomic_t bo_count;
};

Members

dummy_read_page

Pointer to a dummy page used for mapping requests of unpopulated pages. Constant after init.

device_list

List of buffer object devices. Protected by ttm_global_mutex.

bo_count

Number of buffer objects allocated by devices.

struct ttm_device

Buffer object driver device-specific data.

Definition:

struct ttm_device {
    struct list_head device_list;
    unsigned int alloc_flags;
    const struct ttm_device_funcs *funcs;
    struct ttm_resource_manager sysman;
    struct ttm_resource_manager *man_drv[TTM_NUM_MEM_TYPES];
    struct drm_vma_offset_manager *vma_manager;
    struct ttm_pool pool;
    spinlock_t lru_lock;
    struct list_head unevictable;
    struct address_space *dev_mapping;
    struct workqueue_struct *wq;
};

Members

device_list

Our entry in the global device list. Constant after bo device init.

alloc_flags

TTM_ALLOCATION_* flags.

funcs

Function table for the device. Constant after bo device init.

sysman

Resource manager for the system domain. Access via ttm_manager_type.

man_drv

An array of resource_managers, one per resource type.

vma_manager

Address space manager for finding BOs to mmap.

pool

page pool for the device.

lru_lock

Protection for the per manager LRU and ddestroy lists.

unevictable

Buffer objects which are pinned or swapped and as such not on an LRU list.

dev_mapping

A pointer to the struct address_space for invalidating CPU mappings on buffer move. Protected by load/unload sync.

wq

Work queue structure for the delayed delete workqueue.

int ttm_device_prepare_hibernation(struct ttm_device *bdev)

move GTT BOs to shmem for hibernation.

Parameters

struct ttm_device *bdev

A pointer to a struct ttm_device to prepare hibernation for.

Return

0 on success, negative number on failure.

int ttm_device_init(struct ttm_device *bdev, const struct ttm_device_funcs *funcs, struct device *dev, struct address_space *mapping, struct drm_vma_offset_manager *vma_manager, unsigned int alloc_flags)

Parameters

struct ttm_device *bdev

A pointer to a struct ttm_device to initialize.

const struct ttm_device_funcs *funcs

Function table for the device.

struct device *dev

The core kernel device pointer for DMA mappings and allocations.

struct address_space *mapping

The address space to use for this bo.

struct drm_vma_offset_manager *vma_manager

A pointer to a vma manager.

unsigned int alloc_flags

TTM_ALLOCATION_* flags.

Description

Initializes a struct ttm_device.

Return

!0: Failure.
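
As an illustration, a minimal sketch of calling ttm_device_init() from a driver's load path; the mydev and my_ttm_funcs names, the zero alloc_flags value, and the use of the DRM device's anon_inode mapping and vma_offset_manager are assumptions for the example, not taken from this document:

    /* Sketch: initialize the TTM device embedded in a hypothetical driver structure. */
    ret = ttm_device_init(&mydev->bdev, &my_ttm_funcs, drm->dev,
                          drm->anon_inode->i_mapping,
                          drm->vma_offset_manager,
                          0 /* TTM_ALLOCATION_* flags as needed */);
    if (ret)
        return ret;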

TTM resource placement reference

struct ttm_place

Definition:

struct ttm_place {
    unsigned fpfn;
    unsigned lpfn;
    uint32_t mem_type;
    uint32_t flags;
};

Members

fpfn

first valid page frame number to put the object

lpfn

last valid page frame number to put the object

mem_type

One of TTM_PL_* where the resource should be allocated from.

flags

memory domain and caching flags for the object

Description

Structure indicating a possible place to put an object.

struct ttm_placement

Definition:

struct ttm_placement {
    unsigned num_placement;
    const struct ttm_place *placement;
};

Members

num_placement

number of preferred placements

placement

preferred placements

Description

Structure indicating the placement you request for an object.
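For example, a driver could describe a single preferred placement in a dedicated-memory domain like this (a sketch; the vram_place and vram_placement names are made up, TTM_PL_VRAM is one of the standard TTM_PL_* types):

    static const struct ttm_place vram_place = {
        .fpfn = 0,          /* no restriction on the first page frame */
        .lpfn = 0,          /* no restriction on the last page frame */
        .mem_type = TTM_PL_VRAM,
        .flags = 0,
    };

    static const struct ttm_placement vram_placement = {
        .num_placement = 1,
        .placement = &vram_place,
    };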

TTM resource object reference

TTM_NUM_MOVE_FENCES

TTM_NUM_MOVE_FENCES

How many entities can be used for evictions

Description

Pipelined evictions can be spread on multiple entities. Thisis the max number of entities that can be used by the driverfor that purpose.

enum ttm_lru_item_type

enumerate ttm_lru_item subclasses

Constants

TTM_LRU_RESOURCE

The resource subclass

TTM_LRU_HITCH

The iterator hitch subclass

struct ttm_lru_item

The TTM lru list node base class

Definition:

struct ttm_lru_item {
    struct list_head link;
    enum ttm_lru_item_type type;
};

Members

link

The list link

type

The subclass type

void ttm_lru_item_init(struct ttm_lru_item *item, enum ttm_lru_item_type type)

initialize astructttm_lru_item

Parameters

structttm_lru_item*item

The item to initialize

enumttm_lru_item_typetype

The subclass type

struct ttm_resource_manager

Definition:

struct ttm_resource_manager {
    bool use_type;
    bool use_tt;
    struct ttm_device *bdev;
    uint64_t size;
    const struct ttm_resource_manager_func *func;
    spinlock_t eviction_lock;
    struct dma_fence *eviction_fences[TTM_NUM_MOVE_FENCES];
    struct list_head lru[TTM_MAX_BO_PRIORITY];
    uint64_t usage;
    struct dmem_cgroup_region *cg;
};

Members

use_type

The memory type is enabled.

use_tt

If a TT object should be used for the backing store.

bdev

ttm device this manager belongs to

size

Size of the managed region.

func

structure pointer implementing the range manager. See above

eviction_lock

lock for eviction fences

eviction_fences

The fences of the last pipelined move operation.

lru

The lru list for this memory type.

usage

How much of the resources are used, protected by thebdev->lru_lock.

cg

dmem_cgroup_region used for memory accounting, if not NULL.

Description

This structure is used to identify and manage memory types for a device.

struct ttm_bus_placement

Definition:

struct ttm_bus_placement {
    void *addr;
    phys_addr_t offset;
    bool is_iomem;
    enum ttm_caching caching;
};

Members

addr

mapped virtual address

offset

physical addr

is_iomem

is this io memory ?

caching

Seeenumttm_caching

Description

Structure indicating the bus placement of an object.

struct ttm_resource

Definition:

struct ttm_resource {
    unsigned long start;
    size_t size;
    uint32_t mem_type;
    uint32_t placement;
    struct ttm_bus_placement bus;
    struct ttm_buffer_object *bo;
    struct dmem_cgroup_pool_state *css;
    struct ttm_lru_item lru;
};

Members

start

Start of the allocation.

size

Actual size of resource in bytes.

mem_type

Resource type of the allocation.

placement

Placement flags.

bus

Placement on io bus accessible to the CPU

bo

weak reference to the BO, protected by ttm_device::lru_lock

css

cgroup state this resource is charged to

lru

Least recently used list, seettm_resource_manager.lru

Description

Structure indicating the placement and space resources used by abuffer object.

struct ttm_resource *ttm_lru_item_to_res(struct ttm_lru_item *item)

Downcast astructttm_lru_item to astructttm_resource

Parameters

structttm_lru_item*item

Thestructttm_lru_item to downcast

Return

Pointer to the embeddingstructttm_resource

struct ttm_lru_bulk_move_pos

Definition:

struct ttm_lru_bulk_move_pos {
    struct ttm_resource *first;
    struct ttm_resource *last;
};

Members

first

first res in the bulk move range

last

last res in the bulk move range

Description

Range of resources for a lru bulk move.

struct ttm_lru_bulk_move

Definition:

struct ttm_lru_bulk_move {
    struct ttm_lru_bulk_move_pos pos[TTM_NUM_MEM_TYPES][TTM_MAX_BO_PRIORITY];
    struct list_head cursor_list;
};

Members

pos

first/last lru entry for resources in each domain/priority

cursor_list

The list of cursors currently traversing any of the sublists of pos. Protected by the ttm device’s lru_lock.

Description

Container for the current bulk move state. Should be used with ttm_lru_bulk_move_init() and ttm_bo_set_bulk_move(). All BOs in a bulk_move structure need to share the same reservation object to ensure that the bulk as a whole is locked for eviction even if only one BO of the bulk is evicted.
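
A rough usage sketch, assuming a long-lived driver-side bulk structure and BOs that all share one reservation object (bo, bdev and the surrounding locking context are placeholders, not part of this API description):

    struct ttm_lru_bulk_move bulk;

    ttm_lru_bulk_move_init(&bulk);

    /* With the BO reserved, attach each BO of the group to the bulk move. */
    ttm_bo_set_bulk_move(bo, &bulk);

    /* Later, e.g. after command submission, move the whole range in one go. */
    spin_lock(&bdev->lru_lock);
    ttm_lru_bulk_move_tail(&bulk);
    spin_unlock(&bdev->lru_lock);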

struct ttm_resource_cursor

Definition:

struct ttm_resource_cursor {
    struct ttm_resource_manager *man;
    struct ttm_lru_item hitch;
    struct list_head bulk_link;
    struct ttm_lru_bulk_move *bulk;
    unsigned int mem_type;
    unsigned int priority;
};

Members

man

The resource manager currently being iterated over

hitch

A hitch list node inserted before the next resourceto iterate over.

bulk_link

A list link for the list of cursors traversing thebulk sublist ofbulk. Protected by the ttm device’s lru_lock.

bulk

Pointer tostructttm_lru_bulk_move whose subrangehitch isinserted to. NULL if none. Never dereference this pointer sincethestructttm_lru_bulk_move object pointed to might have beenfreed. The pointer is only for comparison.

mem_type

The memory type of the LRU list being traversed.This field is valid iffbulk != NULL.

priority

the current priority

Description

Cursor to iterate over the resources in a manager.

struct ttm_kmap_iter_iomap

Specialization for astructio_mapping +structsg_table backedstructttm_resource.

Definition:

struct ttm_kmap_iter_iomap {
    struct ttm_kmap_iter base;
    struct io_mapping *iomap;
    struct sg_table *st;
    resource_size_t start;
    struct {
        struct scatterlist *sg;
        pgoff_t i;
        pgoff_t end;
        pgoff_t offs;
    } cache;
};

Members

base

Embeddedstructttm_kmap_iter providing the usage interface.

iomap

structio_mapping representing the underlying linear io_memory.

st

sg_table intoiomap, representing the memory of thestructttm_resource.

start

Offset that needs to be subtracted fromst to makesg_dma_address(st->sgl) -start == 0 foriomap start.

cache

Scatterlist traversal cache for fast lookups.

cache.sg

Pointer to the currently cached scatterlist segment.

cache.i

First index ofsg. PAGE_SIZE granularity.

cache.end

Last index + 1 ofsg. PAGE_SIZE granularity.

cache.offs

First offset intoiomap ofsg. PAGE_SIZE granularity.

struct ttm_kmap_iter_linear_io

Iterator specialization for linear io

Definition:

struct ttm_kmap_iter_linear_io {
    struct ttm_kmap_iter base;
    struct iosys_map dmap;
    bool needs_unmap;
};

Members

base

The base iterator

dmap

Points to the starting address of the region

needs_unmap

Whether we need to unmap on fini

void ttm_resource_manager_set_used(struct ttm_resource_manager *man, bool used)

Parameters

structttm_resource_manager*man

A memory manager object.

boolused

usage state to set.

Description

Set the manager in use flag. If disabled the manager is no longerused for object placement.

bool ttm_resource_manager_used(struct ttm_resource_manager *man)

Parameters

structttm_resource_manager*man

Manager to get used state for

Description

Get the in use flag for a manager.

Return

true if used, false if not.

void ttm_resource_manager_cleanup(struct ttm_resource_manager *man)

Parameters

structttm_resource_manager*man

A memory manager object.

Description

Cleanup the move fences from the memory manager object.

ttm_resource_manager_for_each_res

ttm_resource_manager_for_each_res(cursor, res)

iterate over all resources

Parameters

cursor

structttm_resource_cursor for the current position

res

the current resource

Description

Iterate over all the evictable resources in a resource manager.

void ttm_lru_bulk_move_init(struct ttm_lru_bulk_move *bulk)

initialize a bulk move structure

Parameters

structttm_lru_bulk_move*bulk

the structure to init

Description

For now just memset the structure to zero.

void ttm_lru_bulk_move_fini(struct ttm_device *bdev, struct ttm_lru_bulk_move *bulk)

finalize a bulk move structure

Parameters

structttm_device*bdev

Thestructttm_device

structttm_lru_bulk_move*bulk

the structure to finalize

Description

Sanity checks that bulk moves don’t have anyresources left and hence no cursors attached.

void ttm_lru_bulk_move_tail(struct ttm_lru_bulk_move *bulk)

bulk move range of resources to the LRU tail.

Parameters

structttm_lru_bulk_move*bulk

bulk move structure

Description

Bulk move BOs to the LRU tail, only valid to use when driver makes sure thatresource order never changes. Should be called withttm_device.lru_lock held.

void ttm_resource_init(struct ttm_buffer_object *bo, const struct ttm_place *place, struct ttm_resource *res)

resource object constructor

Parameters

structttm_buffer_object*bo

buffer object this resource is allocated for

conststructttm_place*place

placement of the resource

structttm_resource*res

the resource object to initialize

Description

Initialize a new resource object. Counterpart ofttm_resource_fini().

void ttm_resource_fini(struct ttm_resource_manager *man, struct ttm_resource *res)

resource destructor

Parameters

structttm_resource_manager*man

the resource manager this resource belongs to

structttm_resource*res

the resource to clean up

Description

Should be used by resource manager backends to clean up the TTM resourceobjects before freeing the underlying structure. Makes sure the resource isremoved from the LRU before destruction.Counterpart ofttm_resource_init().

void ttm_resource_manager_init(struct ttm_resource_manager *man, struct ttm_device *bdev, uint64_t size)

Parameters

structttm_resource_manager*man

memory manager object to init

structttm_device*bdev

ttm device this manager belongs to

uint64_tsize

size of managed resources in arbitrary units

Description

Initialise core parts of a manager object.

uint64_t ttm_resource_manager_usage(struct ttm_resource_manager *man)

Parameters

structttm_resource_manager*man

A memory manager object.

Description

Return how many resources are currently used.

void ttm_resource_manager_debug(struct ttm_resource_manager *man, struct drm_printer *p)

Parameters

structttm_resource_manager*man

manager type to dump.

structdrm_printer*p

printer to use for debug.

struct ttm_kmap_iter *ttm_kmap_iter_iomap_init(struct ttm_kmap_iter_iomap *iter_io, struct io_mapping *iomap, struct sg_table *st, resource_size_t start)

Initialize astructttm_kmap_iter_iomap

Parameters

structttm_kmap_iter_iomap*iter_io

Thestructttm_kmap_iter_iomap to initialize.

structio_mapping*iomap

Thestructio_mapping representing the underlying linear io_memory.

structsg_table*st

sg_table intoiomap, representing the memory of thestructttm_resource.

resource_size_tstart

Offset that needs to be subtracted fromst to makesg_dma_address(st->sgl) -start == 0 foriomap start.

Return

Pointer to the embeddedstructttm_kmap_iter.

void ttm_resource_manager_create_debugfs(struct ttm_resource_manager *man, struct dentry *parent, const char *name)

Create debugfs entry for specified resource manager.

Parameters

structttm_resource_manager*man

The TTM resource manager for which the debugfs stats file will be created

structdentry*parent

debugfs directory in which the file will reside

constchar*name

The filename to create.

Description

This function sets up a debugfs file that can be used to look at debug statistics of the specified ttm_resource_manager.

TTM TT object reference

struct ttm_tt

This is a structure holding the pages, caching- and aperture binding status for a buffer object that isn’t backed by fixed (VRAM / AGP) memory.

Definition:

struct ttm_tt {
    struct page **pages;
#define TTM_TT_FLAG_SWAPPED             BIT(0)
#define TTM_TT_FLAG_ZERO_ALLOC          BIT(1)
#define TTM_TT_FLAG_EXTERNAL            BIT(2)
#define TTM_TT_FLAG_EXTERNAL_MAPPABLE   BIT(3)
#define TTM_TT_FLAG_DECRYPTED           BIT(4)
#define TTM_TT_FLAG_BACKED_UP           BIT(5)
#define TTM_TT_FLAG_PRIV_POPULATED      BIT(6)
    uint32_t page_flags;
    uint32_t num_pages;
    struct sg_table *sg;
    dma_addr_t *dma_address;
    struct file *swap_storage;
    struct file *backup;
    enum ttm_caching caching;
    struct ttm_pool_tt_restore *restore;
};

Members

pages

Array of pages backing the data.

page_flags

The page flags.

Supported values:

TTM_TT_FLAG_SWAPPED: Set by TTM when the pages have been unpopulated and swapped out by TTM. Calling ttm_tt_populate() will then swap the pages back in, and unset the flag. Drivers should in general never need to touch this.

TTM_TT_FLAG_ZERO_ALLOC: Set if the pages will be zeroed on allocation.

TTM_TT_FLAG_EXTERNAL: Set if the underlying pages were allocated externally, like with dma-buf or userptr. This effectively disables TTM swapping out such pages. Also important is to prevent TTM from ever directly mapping these pages.

Note that enum ttm_bo_type.ttm_bo_type_sg objects will always enable this flag.

TTM_TT_FLAG_EXTERNAL_MAPPABLE: Same behaviour as TTM_TT_FLAG_EXTERNAL, but with the reduced restriction that it is still valid to use TTM to map the pages directly. This is useful when implementing a ttm_tt backend which still allocates driver owned pages underneath (say with shmem).

Note that since this also implies TTM_TT_FLAG_EXTERNAL, the usage here should always be:

page_flags = TTM_TT_FLAG_EXTERNAL | TTM_TT_FLAG_EXTERNAL_MAPPABLE;

TTM_TT_FLAG_DECRYPTED: The mapped ttm pages should be marked as not encrypted. The framework will try to match what the dma layer is doing, but note that it is a little fragile because ttm page fault handling abuses the DMA api a bit and dma_map_attrs can’t be used to assure pgprot always matches.

TTM_TT_FLAG_BACKED_UP: TTM internal only. This is set if the struct ttm_tt has been (possibly partially) backed up.

TTM_TT_FLAG_PRIV_POPULATED: TTM internal only. DO NOT USE. This is set by TTM after ttm_tt_populate() has successfully returned, and is then unset when TTM calls ttm_tt_unpopulate().

num_pages

Number of pages in the page array.

sg

for SG objects via dma-buf.

dma_address

The DMA (bus) addresses of the pages.

swap_storage

Pointer to shmemstructfile for swap storage.

backup

Pointer to backup struct for backed up tts.Could be unified withswap_storage. Meanwhile, the driver’sttm_tt_create() callback is responsible for assigningthis field.

caching

The current caching state of the pages, seeenumttm_caching.

restore

Partial restoration from backup state. TTM private

struct ttm_kmap_iter_tt

Specialization of a mapping iterator for a tt.

Definition:

struct ttm_kmap_iter_tt {
    struct ttm_kmap_iter base;
    struct ttm_tt *tt;
    pgprot_t prot;
};

Members

base

Embeddedstructttm_kmap_iter providing the usage interface

tt

Cachedstructttm_tt.

prot

Cached page protection for mapping.

bool ttm_tt_is_swapped(const struct ttm_tt *tt)

Whether the ttm_tt is swapped out or backed up

Parameters

conststructttm_tt*tt

Thestructttm_tt.

Return

true if swapped or backed up, false otherwise.

bool ttm_tt_is_backed_up(const struct ttm_tt *tt)

Whether the ttm_tt is backed up

Parameters

conststructttm_tt*tt

Thestructttm_tt.

Return

true if swapped or backed up, false otherwise.

void ttm_tt_clear_backed_up(struct ttm_tt *tt)

Clear the ttm_tt backed-up status

Parameters

structttm_tt*tt

Thestructttm_tt.

Description

Drivers can use this function to clear the backed-up status, for example before destroying or re-validating a purged tt.

int ttm_tt_create(struct ttm_buffer_object *bo, bool zero_alloc)

Parameters

structttm_buffer_object*bo

pointer to astructttm_buffer_object

boolzero_alloc

true if allocated pages need to be zeroed

Description

Make sure we have a TTM structure allocated for the given BO.No pages are actually allocated.

int ttm_tt_init(struct ttm_tt *ttm, struct ttm_buffer_object *bo, uint32_t page_flags, enum ttm_caching caching, unsigned long extra_pages)

Parameters

structttm_tt*ttm

Thestructttm_tt.

structttm_buffer_object*bo

The buffer object we create the ttm for.

uint32_tpage_flags

Page flags as identified by TTM_TT_FLAG_XX flags.

enumttm_cachingcaching

the desired caching state of the pages

unsignedlongextra_pages

Extra pages needed for the driver.

Description

Create a struct ttm_tt to back data with system memory pages. No pages are actually allocated.

Return

NULL: Out of memory.
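
As an illustration, a minimal driver ttm_tt_create() backend built on ttm_tt_init() could look like the following sketch (the my_ttm_tt_create name is hypothetical, the cached caching mode and zero extra pages are arbitrary example choices):

    static struct ttm_tt *my_ttm_tt_create(struct ttm_buffer_object *bo,
                                           uint32_t page_flags)
    {
        struct ttm_tt *tt;

        tt = kzalloc(sizeof(*tt), GFP_KERNEL);
        if (!tt)
            return NULL;

        /* Set up the ttm_tt; no pages are allocated yet, that happens on populate. */
        if (ttm_tt_init(tt, bo, page_flags, ttm_cached, 0)) {
            kfree(tt);
            return NULL;
        }

        return tt;
    }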

void ttm_tt_fini(struct ttm_tt *ttm)

Parameters

structttm_tt*ttm

the ttm_tt structure.

Description

Free memory of ttm_tt structure

void ttm_tt_destroy(struct ttm_device *bdev, struct ttm_tt *ttm)

Parameters

structttm_device*bdev

the ttm_device this object belongs to

structttm_tt*ttm

Thestructttm_tt.

Description

Unbind, unpopulate and destroy commonstructttm_tt.

int ttm_tt_swapin(struct ttm_tt *ttm)

Parameters

structttm_tt*ttm

Thestructttm_tt.

Description

Swap in a previously swapped-out ttm_tt.

int ttm_tt_populate(struct ttm_device *bdev, struct ttm_tt *ttm, struct ttm_operation_ctx *ctx)

allocate pages for a ttm

Parameters

structttm_device*bdev

the ttm_device this object belongs to

structttm_tt*ttm

Pointer to the ttm_tt structure

structttm_operation_ctx*ctx

operation context for populating the tt object.

Description

Calls the driver method to allocate pages for a ttm

void ttm_tt_unpopulate(struct ttm_device *bdev, struct ttm_tt *ttm)

free pages from a ttm

Parameters

structttm_device*bdev

the ttm_device this object belongs to

structttm_tt*ttm

Pointer to the ttm_tt structure

Description

Calls the driver method to free all pages from a ttm

void ttm_tt_mark_for_clear(struct ttm_tt *ttm)

Mark pages for clearing on populate.

Parameters

structttm_tt*ttm

Pointer to the ttm_tt structure

Description

Marks pages for clearing so that the next time the page vector ispopulated, the pages will be cleared.

struct ttm_backup_flags

Flags to govern backup behaviour.

Definition:

struct ttm_backup_flags {
    u32 purge : 1;
    u32 writeback : 1;
};

Members

purge

Free pages without backing up. Bypass pools.

writeback

Attempt to copy contents directly to swap space, evenif that means blocking on writes to external memory.

struct ttm_tt *ttm_agp_tt_create(struct ttm_buffer_object *bo, struct agp_bridge_data *bridge, uint32_t page_flags)

Parameters

structttm_buffer_object*bo

Buffer object we allocate the ttm for.

structagp_bridge_data*bridge

The agp bridge this device is sitting on.

uint32_tpage_flags

Page flags as identified by TTM_TT_FLAG_XX flags.

Description

Create a TTM backend that uses the indicated AGP bridge as an aperturefor TT memory. This function uses the linux agpgart interface tobind and unbind memory backing a ttm_tt.

struct ttm_kmap_iter *ttm_kmap_iter_tt_init(struct ttm_kmap_iter_tt *iter_tt, struct ttm_tt *tt)

Initialize astructttm_kmap_iter_tt

Parameters

structttm_kmap_iter_tt*iter_tt

Thestructttm_kmap_iter_tt to initialize.

structttm_tt*tt

Struct ttm_tt holding page pointers of thestructttm_resource.

Return

Pointer to the embeddedstructttm_kmap_iter.

int ttm_tt_setup_backup(struct ttm_tt *tt)

Allocate and assign a backup structure for a ttm_tt

Parameters

structttm_tt*tt

The ttm_tt for which to allocate and assign a backup structure.

Description

Assign a backup structure to be used for tt backup. This shouldtypically be done at bo creation, to avoid allocations at shrinkingtime.

Return

0 on success, negative error code on failure.

TTM page pool reference

struct ttm_pool_type

Pool for a certain memory type

Definition:

struct ttm_pool_type {
    struct ttm_pool *pool;
    unsigned int order;
    enum ttm_caching caching;
    struct list_head shrinker_list;
    spinlock_t lock;
    struct list_head pages;
};

Members

pool

the pool we belong to, might be NULL for the global ones

order

the allocation order our pages have

caching

the caching type our pages have

shrinker_list

our place on the global shrinker list

lock

protection of the page list

pages

the list of pages in the pool

struct ttm_pool

Pool for all caching and orders

Definition:

struct ttm_pool {
    struct device *dev;
    int nid;
    unsigned int alloc_flags;
    struct {
        struct ttm_pool_type orders[NR_PAGE_ORDERS];
    } caching[TTM_NUM_CACHING_TYPES];
};

Members

dev

the device we allocate pages for

nid

which numa node to use

alloc_flags

TTM_ALLOCATION_POOL_* flags

caching

pools for each caching/order

int ttm_pool_alloc(struct ttm_pool *pool, struct ttm_tt *tt, struct ttm_operation_ctx *ctx)

Fill a ttm_tt object

Parameters

structttm_pool*pool

ttm_pool to use

structttm_tt*tt

ttm_tt object to fill

structttm_operation_ctx*ctx

operation context

Description

Fill the ttm_tt object with pages and also make sure to DMA map them whennecessary.

Return

0 on success, negative error code otherwise.

void ttm_pool_free(struct ttm_pool *pool, struct ttm_tt *tt)

Free the backing pages from a ttm_tt object

Parameters

structttm_pool*pool

Pool to give pages back to.

structttm_tt*tt

ttm_tt object to unpopulate

Description

Give the backing pages back to a pool or free them

void ttm_pool_init(struct ttm_pool *pool, struct device *dev, int nid, unsigned int alloc_flags)

Initialize a pool

Parameters

structttm_pool*pool

the pool to initialize

structdevice*dev

device for DMA allocations and mappings

intnid

NUMA node to use for allocations

unsignedintalloc_flags

TTM_ALLOCATION_POOL_* flags

Description

Initialize the pool and its pool types.

void ttm_pool_fini(struct ttm_pool *pool)

Cleanup a pool

Parameters

structttm_pool*pool

the pool to clean up

Description

Free all pages in the pool and unregister the types from the globalshrinker.

int ttm_pool_debugfs(struct ttm_pool *pool, struct seq_file *m)

Debugfs dump function for a pool

Parameters

structttm_pool*pool

the pool to dump the information for

structseq_file*m

seq_file to dump to

Description

Make a debugfs dump with the per pool and global information.

The Graphics Execution Manager (GEM)

The GEM design approach has resulted in a memory manager that doesn’t provide full coverage of all (or even all common) use cases in its userspace or kernel API. GEM exposes a set of standard memory-related operations to userspace and a set of helper functions to drivers, and lets drivers implement hardware-specific operations with their own private API.

The GEM userspace API is described in the GEM - the Graphics Execution Manager article on LWN. While slightly outdated, the document provides a good overview of the GEM API principles. Buffer allocation and read and write operations, described as part of the common GEM API, are currently implemented using driver-specific ioctls.

GEM is data-agnostic. It manages abstract buffer objects without knowing what individual buffers contain. APIs that require knowledge of buffer contents or purpose, such as buffer allocation or synchronization primitives, are thus outside of the scope of GEM and must be implemented using driver-specific ioctls.

On a fundamental level, GEM involves several operations:

  • Memory allocation and freeing

  • Command execution

  • Aperture management at command execution time

Buffer object allocation is relatively straightforward and largely provided by Linux’s shmem layer, which provides memory to back each object.

Device-specific operations, such as command execution, pinning, buffer read & write, mapping, and domain ownership transfers are left to driver-specific ioctls.

GEM Initialization

Drivers that use GEM must set the DRIVER_GEM bit in the struct drm_driver driver_features field. The DRM core will then automatically initialize the GEM core before calling the load operation. Behind the scenes, this will create a DRM Memory Manager object which provides an address space pool for object allocation.
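
For instance (a sketch; the my_driver name is a placeholder and additional fields are elided):

    static const struct drm_driver my_driver = {
        .driver_features = DRIVER_GEM | DRIVER_MODESET,
        /* ... */
    };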

In a KMS configuration, drivers need to allocate and initialize a command ring buffer following core GEM initialization if required by the hardware. UMA devices usually have what is called a “stolen” memory region, which provides space for the initial framebuffer and large, contiguous memory regions required by the device. This space is typically not managed by GEM, and must be initialized separately into its own DRM MM object.

GEM Objects Creation

GEM splits creation of GEM objects and allocation of the memory that backs them in two distinct operations.

GEM objects are represented by an instance of struct drm_gem_object. Drivers usually need to extend GEM objects with private information and thus create a driver-specific GEM object structure type that embeds an instance of struct drm_gem_object.

To create a GEM object, a driver allocates memory for an instance of its specific GEM object type and initializes the embedded struct drm_gem_object with a call to drm_gem_object_init(). The function takes a pointer to the DRM device, a pointer to the GEM object and the buffer object size in bytes.
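
A minimal sketch of a driver-specific GEM object type and a creation helper built on drm_gem_object_init() (my_gem_object and my_gem_create are hypothetical names used for illustration):

    struct my_gem_object {
        struct drm_gem_object base;
        /* driver-private data follows */
    };

    static struct my_gem_object *my_gem_create(struct drm_device *dev, size_t size)
    {
        struct my_gem_object *obj;
        int ret;

        obj = kzalloc(sizeof(*obj), GFP_KERNEL);
        if (!obj)
            return ERR_PTR(-ENOMEM);

        /* Set up the embedded GEM object with shmem backing of the given size. */
        ret = drm_gem_object_init(dev, &obj->base, size);
        if (ret) {
            kfree(obj);
            return ERR_PTR(ret);
        }

        return obj;
    }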

GEM uses shmem to allocate anonymous pageable memory. drm_gem_object_init() will create an shmfs file of the requested size and store it into the struct drm_gem_object filp field. The memory is used as either main storage for the object when the graphics hardware uses system memory directly or as a backing store otherwise. Drivers can call drm_gem_huge_mnt_create() to create, mount and use a huge shmem mountpoint instead of the default one (‘shm_mnt’). For builds with CONFIG_TRANSPARENT_HUGEPAGE enabled, further calls to drm_gem_object_init() will let shmem allocate huge pages when possible.

Drivers are responsible for the actual physical pages allocation by calling shmem_read_mapping_page_gfp() for each page. Note that they can decide to allocate pages when initializing the GEM object, or to delay allocation until the memory is needed (for instance when a page fault occurs as a result of a userspace memory access or when the driver needs to start a DMA transfer involving the memory).

Anonymous pageable memory allocation is not always desired, for instance when the hardware requires physically contiguous system memory as is often the case in embedded devices. Drivers can create GEM objects with no shmfs backing (called private GEM objects) by initializing them with a call to drm_gem_private_object_init() instead of drm_gem_object_init(). Storage for private GEM objects must be managed by drivers.

GEM Objects Lifetime

All GEM objects are reference-counted by the GEM core. References can be acquired and released by calling drm_gem_object_get() and drm_gem_object_put() respectively.

When the last reference to a GEM object is released the GEM core calls the struct drm_gem_object_funcs free operation. That operation is mandatory for GEM-enabled drivers and must free the GEM object and all associated resources.

void (*free) (struct drm_gem_object *obj); Drivers are responsible for freeing all GEM object resources. This includes the resources created by the GEM core, which need to be released with drm_gem_object_release().
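
A sketch of such a free operation for the hypothetical my_gem_object type from the creation example above:

    static void my_gem_free(struct drm_gem_object *gem_obj)
    {
        struct my_gem_object *obj = container_of(gem_obj, struct my_gem_object, base);

        /* Release the resources set up by the GEM core (shmfs file, mmap offset). */
        drm_gem_object_release(gem_obj);

        /* Then free driver-private resources and the object itself. */
        kfree(obj);
    }

    static const struct drm_gem_object_funcs my_gem_funcs = {
        .free = my_gem_free,
    };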

GEM Objects Naming

Communication between userspace and the kernel refers to GEM objects using local handles, global names or, more recently, file descriptors. All of those are 32-bit integer values; the usual Linux kernel limits apply to the file descriptors.

GEM handles are local to a DRM file. Applications get a handle to a GEM object through a driver-specific ioctl, and can use that handle to refer to the GEM object in other standard or driver-specific ioctls. Closing a DRM file handle frees all its GEM handles and dereferences the associated GEM objects.

To create a handle for a GEM object drivers call drm_gem_handle_create(). The function takes a pointer to the DRM file and the GEM object and returns a locally unique handle. When the handle is no longer needed drivers delete it with a call to drm_gem_handle_delete(). Finally the GEM object associated with a handle can be retrieved by a call to drm_gem_object_lookup().

Handles don’t take ownership of GEM objects, they only take a reference to the object that will be dropped when the handle is destroyed. To avoid leaking GEM objects, drivers must make sure they drop the reference(s) they own (such as the initial reference taken at object creation time) as appropriate, without any special consideration for the handle. For example, in the particular case of combined GEM object and handle creation in the implementation of the dumb_create operation, drivers must drop the initial reference to the GEM object before returning the handle.
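
A sketch of a dumb_create implementation following that rule, reusing the hypothetical my_gem_create() helper from above (the pitch alignment is an arbitrary example value, not a requirement):

    static int my_dumb_create(struct drm_file *file_priv, struct drm_device *dev,
                              struct drm_mode_create_dumb *args)
    {
        struct my_gem_object *obj;
        int ret;

        args->pitch = ALIGN(args->width * DIV_ROUND_UP(args->bpp, 8), 64);
        args->size = (u64)args->pitch * args->height;

        obj = my_gem_create(dev, args->size);
        if (IS_ERR(obj))
            return PTR_ERR(obj);

        ret = drm_gem_handle_create(file_priv, &obj->base, &args->handle);
        /* Drop the initial reference; the handle now keeps the object alive. */
        drm_gem_object_put(&obj->base);

        return ret;
    }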

GEM names are similar in purpose to handles but are not local to DRM files. They can be passed between processes to reference a GEM object globally. Names can’t be used directly to refer to objects in the DRM API, applications must convert handles to names and names to handles using the DRM_IOCTL_GEM_FLINK and DRM_IOCTL_GEM_OPEN ioctls respectively. The conversion is handled by the DRM core without any driver-specific support.

GEM also supports buffer sharing with dma-buf file descriptors through PRIME. GEM-based drivers must use the provided helper functions to implement the exporting and importing correctly. Since sharing file descriptors is inherently more secure than the easily guessable and global GEM names it is the preferred buffer sharing mechanism. Sharing buffers through GEM names is only supported for legacy userspace. Furthermore PRIME also allows cross-device buffer sharing since it is based on dma-bufs.

GEM Objects Mapping

Because mapping operations are fairly heavyweight GEM favours read/write-like access to buffers, implemented through driver-specific ioctls, over mapping buffers to userspace. However, when random access to the buffer is needed (to perform software rendering for instance), direct access to the object can be more efficient.

The mmap system call can’t be used directly to map GEM objects, as they don’t have their own file handle. Two alternative methods currently co-exist to map GEM objects to userspace. The first method uses a driver-specific ioctl to perform the mapping operation, calling do_mmap() under the hood. This is often considered dubious, seems to be discouraged for new GEM-enabled drivers, and will thus not be described here.

The second method uses the mmap system call on the DRM file handle. void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset); DRM identifies the GEM object to be mapped by a fake offset passed through the mmap offset argument. Prior to being mapped, a GEM object must thus be associated with a fake offset. To do so, drivers must call drm_gem_create_mmap_offset() on the object.

Once allocated, the fake offset value must be passed to the application in a driver-specific way and can then be used as the mmap offset argument.

The GEM core provides a helper method drm_gem_mmap() to handle object mapping. The method can be set directly as the mmap file operation handler. It will look up the GEM object based on the offset value and set the VMA operations to the struct drm_driver gem_vm_ops field. Note that drm_gem_mmap() doesn’t map memory to userspace, but relies on the driver-provided fault handler to map pages individually.

To use drm_gem_mmap(), drivers must fill the struct drm_driver gem_vm_ops field with a pointer to VM operations.

The VM operations is a struct vm_operations_struct made up of several fields, the more interesting ones being:

struct vm_operations_struct {
    void (*open)(struct vm_area_struct *area);
    void (*close)(struct vm_area_struct *area);
    vm_fault_t (*fault)(struct vm_fault *vmf);
};

The open and close operations must update the GEM object reference count. Drivers can use the drm_gem_vm_open() and drm_gem_vm_close() helper functions directly as open and close handlers.

The fault operation handler is responsible for mapping pages to userspace when a page fault occurs. Depending on the memory allocation scheme, drivers can allocate pages at fault time, or can decide to allocate memory for the GEM object at the time the object is created.
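
Putting this together, a driver's VM operations table might look like the following sketch, where my_gem_fault is a hypothetical driver-provided fault handler:

    static vm_fault_t my_gem_fault(struct vm_fault *vmf); /* driver fault handler */

    static const struct vm_operations_struct my_gem_vm_ops = {
        .open  = drm_gem_vm_open,   /* takes a GEM object reference */
        .close = drm_gem_vm_close,  /* drops the reference again */
        .fault = my_gem_fault,
    };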

Drivers that want to map the GEM object upfront instead of handling pagefaults can implement their own mmap file operation handler.

In order to reduce page table overhead, if the internal shmem mountpoint “shm_mnt” is configured to use transparent huge pages (for builds with CONFIG_TRANSPARENT_HUGEPAGE enabled) and if the shmem backing store managed to allocate a huge page for a faulty address, the fault handler will first attempt to insert that huge page into the VMA before falling back to individual page insertion. mmap() user address alignment for GEM objects is handled by providing a custom get_unmapped_area file operation which forwards to the shmem backing store. For most drivers, which don’t create a huge mountpoint by default or through a module parameter, transparent huge pages can be enabled by either setting the “transparent_hugepage_shmem” kernel parameter or the “/sys/kernel/mm/transparent_hugepage/shmem_enabled” sysfs knob.

For platforms without MMU the GEM core provides a helper method drm_gem_dma_get_unmapped_area(). The mmap() routines will call this to get a proposed address for the mapping.

To use drm_gem_dma_get_unmapped_area(), drivers must fill the struct file_operations get_unmapped_area field with a pointer to drm_gem_dma_get_unmapped_area().

More detailed information about get_unmapped_area can be found in No-MMU memory mapping support.

Memory Coherency

When mapped to the device or used in a command buffer, backing pages for an object are flushed to memory and marked write combined so as to be coherent with the GPU. Likewise, if the CPU accesses an object after the GPU has finished rendering to the object, then the object must be made coherent with the CPU’s view of memory, usually involving GPU cache flushing of various kinds. This core CPU<->GPU coherency management is provided by a device-specific ioctl, which evaluates an object’s current domain and performs any necessary flushing or synchronization to put the object into the desired coherency domain (note that the object may be busy, i.e. an active render target; in that case, setting the domain blocks the client and waits for rendering to complete before performing any necessary flushing operations).

Command Execution

Perhaps the most important GEM function for GPU devices is providing a command execution interface to clients. Client programs construct command buffers containing references to previously allocated memory objects, and then submit them to GEM. At that point, GEM takes care to bind all the objects into the GTT, execute the buffer, and provide necessary synchronization between clients accessing the same buffers. This often involves evicting some objects from the GTT and re-binding others (a fairly expensive operation), and providing relocation support which hides fixed GTT offsets from clients. Clients must take care not to submit command buffers that reference more objects than can fit in the GTT; otherwise, GEM will reject them and no rendering will occur. Similarly, if several objects in the buffer require fence registers to be allocated for correct rendering (e.g. 2D blits on pre-965 chips), care must be taken not to require more fence registers than are available to the client. Such resource management should be abstracted from the client in libdrm.

GEM Function Reference

enum drm_gem_object_status

bitmask of object state for fdinfo reporting

Constants

DRM_GEM_OBJECT_RESIDENT

object is resident in memory (ie. not unpinned)

DRM_GEM_OBJECT_PURGEABLE

object marked as purgeable by userspace

DRM_GEM_OBJECT_ACTIVE

object is currently used by an active submission

Description

Bitmask of status used for fdinfo memory stats, see drm_gem_object_funcs.status and drm_show_fdinfo(). Note that an object can report DRM_GEM_OBJECT_PURGEABLE and be active or not resident, in which case drm_show_fdinfo() will not account for it as purgeable. So drivers do not need to check if the buffer is idle and resident to return this bit, i.e. userspace can mark a buffer as purgeable even while it is still busy on the GPU. It will not get reported in the purgeable stats until it becomes idle. The status gem object func does not need to consider this.

struct drm_gem_object_funcs

GEM object functions

Definition:

struct drm_gem_object_funcs {
    void (*free)(struct drm_gem_object *obj);
    int (*open)(struct drm_gem_object *obj, struct drm_file *file);
    void (*close)(struct drm_gem_object *obj, struct drm_file *file);
    void (*print_info)(struct drm_printer *p, unsigned int indent, const struct drm_gem_object *obj);
    struct dma_buf *(*export)(struct drm_gem_object *obj, int flags);
    int (*pin)(struct drm_gem_object *obj);
    void (*unpin)(struct drm_gem_object *obj);
    struct sg_table *(*get_sg_table)(struct drm_gem_object *obj);
    int (*vmap)(struct drm_gem_object *obj, struct iosys_map *map);
    void (*vunmap)(struct drm_gem_object *obj, struct iosys_map *map);
    int (*mmap)(struct drm_gem_object *obj, struct vm_area_struct *vma);
    int (*evict)(struct drm_gem_object *obj);
    enum drm_gem_object_status (*status)(struct drm_gem_object *obj);
    size_t (*rss)(struct drm_gem_object *obj);
    const struct vm_operations_struct *vm_ops;
};

Members

free

Deconstructor for drm_gem_objects.

This callback is mandatory.

open

Called upon GEM handle creation.

This callback is optional.

close

Called upon GEM handle release.

This callback is optional.

print_info

If driver subclasses structdrm_gem_object, it can implement thisoptional hook for printing additional driver specific info.

drm_printf_indent() should be used in the callback passing it theindent argument.

This callback is called fromdrm_gem_print_info().

This callback is optional.

export

Export backing buffer as adma_buf.If this is not setdrm_gem_prime_export() is used.

This callback is optional.

pin

Pin backing buffer in memory, such that dma-buf importers canaccess it. Used by thedrm_gem_map_attach() helper.

This callback is optional.

unpin

Unpin backing buffer. Used by thedrm_gem_map_detach() helper.

This callback is optional.

get_sg_table

Returns a Scatter-Gather table representation of the buffer.Used when exporting a buffer by thedrm_gem_map_dma_buf() helper.Releasing is done by callingdma_unmap_sg_attrs() andsg_free_table()indrm_gem_unmap_buf(), therefore these helpers and this callbackhere cannot be used for sg tables pointing at driver private memoryranges.

See alsodrm_prime_pages_to_sg().

vmap

Returns a virtual address for the buffer. Used by thedrm_gem_dmabuf_vmap() helper. Called with a held GEM reservationlock.

This callback is optional.

vunmap

Releases the address previously returned byvmap. Used by thedrm_gem_dmabuf_vunmap() helper. Called with a held GEM reservationlock.

This callback is optional.

mmap

Handle mmap() of the gem object, setup vma accordingly.

This callback is optional.

The callback is used by bothdrm_gem_mmap_obj() anddrm_gem_prime_mmap(). Whenmmap is presentvm_ops is notused, themmap callback must set vma->vm_ops instead.

evict

Evicts gem object out from memory. Used by thedrm_gem_object_evict()helper. Returns 0 on success, -errno otherwise. Called with a heldGEM reservation lock.

This callback is optional.

status

The optional status callback can return additional object statewhich determines which stats the object is counted against. Thecallback is called under table_lock. Racing against object statuschange is “harmless”, and the callback can expect to not raceagainst object destruction.

Called bydrm_show_memory_stats().

rss

Return resident size of the object in physical memory.

Called bydrm_show_memory_stats().

vm_ops

Virtual memory operations used with mmap.

This is optional but necessary for mmap support.

struct drm_gem_lru

A simple LRU helper

Definition:

struct drm_gem_lru {
    struct mutex *lock;
    long count;
    struct list_head list;
};

Members

lock

Lock protecting movement of GEM objects between LRUs. AllLRUs that the object can move between should be protectedby the same lock.

count

The total number of backing pages of the GEM objects inthis LRU.

list

The LRU list.

Description

A helper for tracking GEM objects in a given state, to aid indriver’s shrinker implementation. Tracks the count of pagesfor locklessshrinker.count_objects, and providesdrm_gem_lru_scan for driver’sshrinker.scan_objectsimplementation.

struct drm_gem_object

GEM buffer object

Definition:

struct drm_gem_object {
    struct kref refcount;
    unsigned handle_count;
    struct drm_device *dev;
    struct file *filp;
    struct drm_vma_offset_node vma_node;
    size_t size;
    int name;
    struct dma_buf *dma_buf;
    struct dma_buf_attachment *import_attach;
    struct dma_resv *resv;
    struct dma_resv _resv;
    struct {
        struct list_head list;
        struct mutex lock;
    } gpuva;
    const struct drm_gem_object_funcs *funcs;
    struct list_head lru_node;
    struct drm_gem_lru *lru;
};

Members

refcount

Reference count of this object

Please usedrm_gem_object_get() to acquire anddrm_gem_object_put_locked()ordrm_gem_object_put() to release a reference to a GEMbuffer object.

handle_count

This is the GEM file_priv handle count of this object.

Each handle also holds a reference. Note that when the handle_countdrops to 0 any global names (e.g. the id in the flink namespace) willbe cleared.

Protected bydrm_device.object_name_lock.

dev

DRM dev this object belongs to.

filp

SHMEM file node used as backing storage for swappable buffer objects.GEM also supports driver private objects with driver-specific backingstorage (contiguous DMA memory, special reserved blocks). In thiscasefilp is NULL.

vma_node

Mapping info for this object to support mmap. Drivers are supposed toallocate the mmap offset usingdrm_gem_create_mmap_offset(). Theoffset itself can be retrieved usingdrm_vma_node_offset_addr().

Memory mapping itself is handled bydrm_gem_mmap(), which also checksthat userspace is allowed to access the object.

size

Size of the object, in bytes. Immutable over the object’slifetime.

name

Global name for this object, starts at 1. 0 means unnamed.Access is covered bydrm_device.object_name_lock. This is used bythe GEM_FLINK and GEM_OPEN ioctls.

dma_buf

dma-buf associated with this GEM object.

Pointer to the dma-buf associated with this gem object (eitherthrough importing or exporting). We break the resulting referenceloop when the last gem handle for this object is released.

Protected bydrm_device.object_name_lock.

import_attach

dma-buf attachment backing this object.

Any foreign dma_buf imported as a gem object has this set to theattachment point for the device. This is invariant over the lifetimeof a gem object.

Thedrm_gem_object_funcs.free callback is responsible forcleaning up the dma_buf attachment and references acquired at importtime.

Note that the drm gem/prime core does not depend upon drivers settingthis field any more. So for drivers where this doesn’t make sense(e.g. virtual devices or a displaylink behind an usb bus) they cansimply leave it as NULL.

resv

Pointer to reservation object associated with this GEM object.

Normally (resv == &_resv) except for imported GEM objects.

_resv

A reservation object for this GEM object.

This is unused for imported GEM objects.

gpuva

Fields used by GPUVM to manage mappings pointing to this GEM object.

When DRM_GPUVM_IMMEDIATE_MODE is set, this list is protected by themutex. Otherwise, the list is protected by the GEMsdma_resv lock.

Note that all entries in this list must agree on whetherDRM_GPUVM_IMMEDIATE_MODE is set.

gpuva.list

list of GPUVM mappings attached to this GEM object.

Drivers should lock list accesses with either the GEMsdma_resv lock (drm_gem_object.resv) or thedrm_gem_object.gpuva.lock mutex.

gpuva.lock

lock protecting access todrm_gem_object.gpuva.listwhen DRM_GPUVM_IMMEDIATE_MODE is used.

Only used when DRM_GPUVM_IMMEDIATE_MODE is set. It should besafe to take this mutex during the fence signalling path, sodo not allocate memory while holding this lock. Otherwise,thedma_resv lock should be used.

funcs

Optional GEM object functions. If this is set, it will be used instead of thecorrespondingdrm_driver GEM callbacks.

New drivers should use this.

lru_node

List node in adrm_gem_lru.

lru

The current LRU list that the GEM object is on.

Description

This structure defines the generic parts for GEM buffer objects, which aremostly around handling mmap and userspace handles.

Buffer objects are often abbreviated to BO.

DRM_GEM_FOPS

DRM_GEM_FOPS

Default drm GEM file operations

Description

This macro provides a shorthand for setting the GEM file ops in the file_operations structure. If all you need are the default ops, use DEFINE_DRM_GEM_FOPS instead.

DEFINE_DRM_GEM_FOPS

DEFINE_DRM_GEM_FOPS(name)

macro to generate file operations for GEM drivers

Parameters

name

name for the generated structure

Description

This macro autogenerates a suitable struct file_operations for GEM based drivers, which can be assigned to drm_driver.fops. Note that this structure cannot be shared between drivers, because it contains a reference to the current module using THIS_MODULE.

Note that the declaration is already marked as static - if you need a non-static version of this you’re probably doing it wrong and will break the THIS_MODULE reference by accident.
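
Typical usage is a sketch like the following, with the generated structure assigned to drm_driver.fops (my_driver_fops and my_driver are placeholder names):

    DEFINE_DRM_GEM_FOPS(my_driver_fops);

    static const struct drm_driver my_driver = {
        .driver_features = DRIVER_GEM,
        .fops = &my_driver_fops,
        /* ... */
    };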

struct vfsmount *drm_gem_get_huge_mnt(struct drm_device *dev)

Get the huge tmpfs mountpoint used by a DRM device

Parameters

structdrm_device*dev

DRM device

Description

This function gets the huge tmpfs mountpoint used by DRM devicedev. A hugetmpfs mountpoint is used instead ofshm_mnt after a successful call todrm_gem_huge_mnt_create() when CONFIG_TRANSPARENT_HUGEPAGE is enabled.

Return

The huge tmpfs mountpoint in use, NULL otherwise.

void drm_gem_object_get(struct drm_gem_object *obj)

acquire a GEM buffer object reference

Parameters

structdrm_gem_object*obj

GEM buffer object

Description

This function acquires an additional reference toobj. It is illegal tocall this without already holding a reference. No locks required.

void drm_gem_object_put(struct drm_gem_object *obj)

drop a GEM buffer object reference

Parameters

structdrm_gem_object*obj

GEM buffer object

Description

This releases a reference toobj.

bool drm_gem_object_is_shared_for_memory_stats(struct drm_gem_object *obj)

helper for shared memory stats

Parameters

structdrm_gem_object*obj

obj in question

Description

This helper should only be used for fdinfo shared memory stats to determineif a GEM object is shared.

bool drm_gem_is_imported(const struct drm_gem_object *obj)

Tests if GEM object’s buffer has been imported

Parameters

conststructdrm_gem_object*obj

the GEM object

Return

True if the GEM object’s buffer has been imported, false otherwise

void drm_gem_gpuva_init(struct drm_gem_object *obj)

initialize the gpuva list of a GEM object

Parameters

structdrm_gem_object*obj

thedrm_gem_object

Description

This initializes thedrm_gem_object’sdrm_gpuvm_bo list.

Calling this function is only necessary for drivers intending to support thedrm_driver_feature DRIVER_GEM_GPUVA.

See alsodrm_gem_gpuva_set_lock().

drm_gem_for_each_gpuvm_bo

drm_gem_for_each_gpuvm_bo(entry__, obj__)

iterator to walk over a list ofdrm_gpuvm_bo

Parameters

entry__

drm_gpuvm_bo structure to assign to in each iteration step

obj__

thedrm_gem_object thedrm_gpuvm_bo to walk are associated with

Description

This iterator walks over alldrm_gpuvm_bo structures associated with thedrm_gem_object.

drm_gem_for_each_gpuvm_bo_safe

drm_gem_for_each_gpuvm_bo_safe(entry__, next__, obj__)

iterator to safely walk over a list ofdrm_gpuvm_bo

Parameters

entry__

drm_gpuvm_bostructure to assign to in each iteration step

next__

nextdrm_gpuvm_bo to store the next step

obj__

thedrm_gem_object thedrm_gpuvm_bo to walk are associated with

Description

This iterator walks over all drm_gpuvm_bo structures associated with the drm_gem_object. It is implemented with list_for_each_entry_safe(), hence it is safe against removal of elements.

int drm_gem_huge_mnt_create(struct drm_device *dev, const char *value)

Create, mount and use a huge tmpfs mountpoint

Parameters

structdrm_device*dev

DRM device that will use the huge tmpfs mountpoint

constchar*value

huge tmpfs mount option value

Description

This function creates and mounts a dedicated huge tmpfs mountpoint for thelifetime of the DRM devicedev which is used at GEM object initializationwithdrm_gem_object_init().

The most common option forvalue is “within_size” which only allocates hugepages if the page will be fully within the GEM object size. “always”,“advise” and “never” are supported too but the latter would just create amountpoint similar to the default one (shm_mnt). See shmemfs andTransparent Hugepage for more information.

Return

0 on success or a negative error code on failure.

int drm_gem_object_init(struct drm_device *dev, struct drm_gem_object *obj, size_t size)

initialize an allocated shmem-backed GEM object

Parameters

structdrm_device*dev

drm_device the object should be initialized for

structdrm_gem_object*obj

drm_gem_object to initialize

size_tsize

object size

Description

Initialize an already allocated GEM object of the specified size withshmfs backing store. A huge mountpoint can be used by callingdrm_gem_huge_mnt_create() beforehand.

void drm_gem_private_object_init(struct drm_device *dev, struct drm_gem_object *obj, size_t size)

initialize an allocated private GEM object

Parameters

structdrm_device*dev

drm_device the object should be initialized for

structdrm_gem_object*obj

drm_gem_object to initialize

size_tsize

object size

Description

Initialize an already allocated GEM object of the specified size withno GEM provided backing store. Instead the caller is responsible forbacking the object and handling it.

void drm_gem_private_object_fini(struct drm_gem_object *obj)

Finalize a failed drm_gem_object

Parameters

structdrm_gem_object*obj

drm_gem_object

Description

Uninitialize an already allocated GEM object when its initialization failed.

int drm_gem_handle_delete(struct drm_file *filp, u32 handle)

deletes the given file-private handle

Parameters

structdrm_file*filp

drm file-private structure to use for the handle look up

u32handle

userspace handle to delete

Description

Removes the GEM handle from thefilp lookup table which has been added withdrm_gem_handle_create(). If this is the last handle also cleans up linkedresources like GEM names.

int drm_gem_dumb_map_offset(struct drm_file *file, struct drm_device *dev, u32 handle, u64 *offset)

return the fake mmap offset for a gem object

Parameters

structdrm_file*file

drm file-private structure containing the gem object

structdrm_device*dev

corresponding drm_device

u32handle

gem object handle

u64*offset

return location for the fake mmap offset

Description

This implements thedrm_driver.dumb_map_offset kms driver callback fordrivers which use gem to manage their backing storage.

Return

0 on success or a negative error code on failure.

int drm_gem_handle_create(struct drm_file *file_priv, struct drm_gem_object *obj, u32 *handlep)

create a gem handle for an object

Parameters

structdrm_file*file_priv

drm file-private structure to register the handle for

structdrm_gem_object*obj

object to register

u32*handlep

pointer to return the created handle to the caller

Description

Create a handle for this object. This adds a handle reference to the object,which includes a regular reference count. Callers will likely want todereference the object afterwards.

Since this publishesobj to userspace it must be fully set up by this point,drivers must call this last in their buffer object creation callbacks.

void drm_gem_free_mmap_offset(struct drm_gem_object *obj)

release a fake mmap offset for an object

Parameters

structdrm_gem_object*obj

obj in question

Description

This routine frees fake offsets allocated bydrm_gem_create_mmap_offset().

Note thatdrm_gem_object_release() already calls this function, so driversdon’t have to take care of releasing the mmap offset themselves when freeingthe GEM object.

intdrm_gem_create_mmap_offset_size(structdrm_gem_object*obj,size_tsize)

create a fake mmap offset for an object

Parameters

structdrm_gem_object*obj

obj in question

size_tsize

the virtual size

Description

GEM memory mapping works by handing back to userspace a fake mmap offset it can use in a subsequent mmap(2) call. The DRM core code then looks up the object based on the offset and sets up the various memory mapping structures.

This routine allocates and attaches a fake offset for obj, in cases where the virtual size differs from the physical size (ie. drm_gem_object.size). Otherwise just use drm_gem_create_mmap_offset().

This function is idempotent and handles an already allocated mmap offset transparently. Drivers do not need to check for this case.

intdrm_gem_create_mmap_offset(structdrm_gem_object*obj)

create a fake mmap offset for an object

Parameters

structdrm_gem_object*obj

obj in question

Description

GEM memory mapping works by handing back to userspace a fake mmap offset it can use in a subsequent mmap(2) call. The DRM core code then looks up the object based on the offset and sets up the various memory mapping structures.

This routine allocates and attaches a fake offset for obj.

Drivers can call drm_gem_free_mmap_offset() before freeing obj to release the fake offset again.

structpage**drm_gem_get_pages(structdrm_gem_object*obj)

helper to allocate backing pages for a GEM object from shmem

Parameters

structdrm_gem_object*obj

obj in question

Description

This reads the page-array of the shmem-backing storage of the given gem object. An array of pages is returned. If a page is not allocated or swapped-out, this will allocate/swap-in the required pages. Note that the whole object is covered by the page-array and pinned in memory.

Use drm_gem_put_pages() to release the array and unpin all pages.

This uses the GFP-mask set on the shmem-mapping (see mapping_set_gfp_mask()). If you require other GFP-masks, you have to do those allocations yourself.

Note that you are not allowed to change gfp-zones during runtime. That is, shmem_read_mapping_page_gfp() must be called with the same gfp_zone(gfp) as set during initialization. If you have special zone constraints, set them after drm_gem_object_init() via mapping_set_gfp_mask(). shmem-core takes care to keep pages in the required zone during swap-in.

This function is only valid on objects initialized with drm_gem_object_init(), but not for those initialized with drm_gem_private_object_init() only.
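
For illustration, a driver with special zone constraints might restrict the shmem mapping’s GFP mask right after object initialization; the mask below (32-bit DMA zone) is only an example and device specific:

ret = drm_gem_object_init(dev, obj, size);
if (ret)
        return ret;

/* Example constraint: keep backing pages in the 32-bit DMA zone. */
mapping_set_gfp_mask(obj->filp->f_mapping, GFP_USER | __GFP_DMA32);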

voiddrm_gem_put_pages(structdrm_gem_object*obj,structpage**pages,booldirty,boolaccessed)

helper to free backing pages for a GEM object

Parameters

structdrm_gem_object*obj

obj in question

structpage**pages

pages to free

booldirty

if true, pages will be marked as dirty

boolaccessed

if true, the pages will be marked as accessed

intdrm_gem_objects_lookup(structdrm_file*filp,void__user*bo_handles,intcount,structdrm_gem_object***objs_out)

look up GEM objects from an array of handles

Parameters

structdrm_file*filp

DRM file private data

void__user*bo_handles

user pointer to array of userspace handles

intcount

size of handle array

structdrm_gem_object***objs_out

returned pointer to array of drm_gem_object pointers

Description

Takes an array of userspace handles and returns a newly allocated array of GEM objects.

For a single handle lookup, use drm_gem_object_lookup().

Return

objs filled in with GEM object pointers. Returned GEM objects need to be released with drm_gem_object_put(). -ENOENT is returned on a lookup failure. 0 is returned on success.

structdrm_gem_object*drm_gem_object_lookup(structdrm_file*filp,u32handle)

look up a GEM object from its handle

Parameters

structdrm_file*filp

DRM file private data

u32handle

userspace handle

Description

If looking up an array of handles, use drm_gem_objects_lookup().

Return

A reference to the object named by the handle if such exists on filp, NULL otherwise.
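
A short sketch of the usual lookup pattern inside an ioctl handler, assuming handle was passed in from userspace:

struct drm_gem_object *obj;

obj = drm_gem_object_lookup(file_priv, handle);
if (!obj)
        return -ENOENT;

/* ... operate on the object ... */

drm_gem_object_put(obj); /* drop the lookup reference */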

longdrm_gem_dma_resv_wait(structdrm_file*filep,u32handle,boolwait_all,unsignedlongtimeout)

Wait on GEM object’s reservation’s objects shared and/or exclusive fences.

Parameters

structdrm_file*filep

DRM file private data

u32handle

userspace handle

boolwait_all

if true, wait on all fences, else wait on just exclusive fence

unsignedlongtimeout

timeout value in jiffies or zero to return immediately

Return

Returns -ERESTARTSYS if interrupted, 0 if the wait timed out, or greater than 0 on success.

voiddrm_gem_object_release(structdrm_gem_object*obj)

release GEM buffer object resources

Parameters

structdrm_gem_object*obj

GEM buffer object

Description

This releases any structures and resources used by obj and is the inverse of drm_gem_object_init().

voiddrm_gem_object_free(structkref*kref)

free a GEM object

Parameters

structkref*kref

kref of the object to free

Description

Called after the last reference to the object has been lost.

Frees the object

voiddrm_gem_vm_open(structvm_area_struct*vma)

vma->ops->open implementation for GEM

Parameters

structvm_area_struct*vma

VM area structure

Description

This function implements the #vm_operations_struct open() callback for GEM drivers. This must be used together with drm_gem_vm_close().

voiddrm_gem_vm_close(structvm_area_struct*vma)

vma->ops->close implementation for GEM

Parameters

structvm_area_struct*vma

VM area structure

Description

This function implements the #vm_operations_struct close() callback for GEM drivers. This must be used together with drm_gem_vm_open().
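
For example, a driver that traps accesses with its own fault handler might wire the two helpers into its vm_ops like this; my_gem_fault() is hypothetical:

static const struct vm_operations_struct my_gem_vm_ops = {
        .fault = my_gem_fault,     /* hypothetical driver fault handler */
        .open = drm_gem_vm_open,   /* takes a GEM reference on VMA open */
        .close = drm_gem_vm_close, /* drops it again on VMA close */
};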

intdrm_gem_mmap_obj(structdrm_gem_object*obj,unsignedlongobj_size,structvm_area_struct*vma)

memory map a GEM object

Parameters

structdrm_gem_object*obj

the GEM object to map

unsignedlongobj_size

the object size to be mapped, in bytes

structvm_area_struct*vma

VMA for the area to be mapped

Description

Set up the VMA to prepare mapping of the GEM object using the GEM object’s vm_ops. Depending on their requirements, GEM objects can either provide a fault handler in their vm_ops (in which case any accesses to the object will be trapped, to perform migration, GTT binding, surface register allocation, or performance monitoring), or mmap the buffer memory synchronously after calling drm_gem_mmap_obj.

This function is mainly intended to implement the DMABUF mmap operation, when the GEM object is not looked up based on its fake offset. To implement the DRM mmap operation, drivers should use the drm_gem_mmap() function.

drm_gem_mmap_obj() assumes the user is granted access to the buffer while drm_gem_mmap() prevents unprivileged users from mapping random objects. So callers must verify access restrictions before calling this helper.

Return 0 on success or -EINVAL if the object size is smaller than the VMA size, or if no vm_ops are provided.

unsignedlongdrm_gem_get_unmapped_area(structfile*filp,unsignedlonguaddr,unsignedlonglen,unsignedlongpgoff,unsignedlongflags)

get memory mapping region routine for GEM objects

Parameters

structfile*filp

DRM file pointer

unsignedlonguaddr

User address hint

unsignedlonglen

Mapping length

unsignedlongpgoff

Offset (in pages)

unsignedlongflags

Mapping flags

Description

If a driver supports GEM object mapping, before ending up in drm_gem_mmap(), mmap calls on the DRM file descriptor will first try to find a free linear address space large enough for a mapping. Since GEM objects are backed by shmem buffers, this should preferably be handled by the shmem virtual memory filesystem which can appropriately align addresses to huge page sizes when needed.

Look up the GEM object based on the offset passed in (vma->vm_pgoff will contain the fake offset we created) and call shmem_get_unmapped_area() with the right file pointer.

If a GEM object is not available at the given offset or if the caller is not granted access to it, fall back to mm_get_unmapped_area().

intdrm_gem_mmap(structfile*filp,structvm_area_struct*vma)

memory map routine for GEM objects

Parameters

structfile*filp

DRM file pointer

structvm_area_struct*vma

VMA for the area to be mapped

Description

If a driver supports GEM object mapping, mmap calls on the DRM file descriptor will end up here.

Look up the GEM object based on the offset passed in (vma->vm_pgoff will contain the fake offset we created) and map it with a call to drm_gem_mmap_obj().

If the caller is not granted access to the buffer object, the mmap will fail with EACCES. Please see the vma manager for more information.

intdrm_gem_lock_reservations(structdrm_gem_object**objs,intcount,structww_acquire_ctx*acquire_ctx)

Sets up the ww context and acquires the lock on an array of GEM objects.

Parameters

structdrm_gem_object**objs

drm_gem_objects to lock

intcount

Number of objects inobjs

structww_acquire_ctx*acquire_ctx

struct ww_acquire_ctx that will be initialized as part of tracking this set of locked reservations.

Description

Once you’ve locked your reservations, you’ll want to set up space for your shared fences (if applicable), submit your job, then drm_gem_unlock_reservations().
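
A minimal sketch of a job-submission path built around these helpers, assuming objs and count were obtained via drm_gem_objects_lookup():

struct ww_acquire_ctx ctx;
int ret;

ret = drm_gem_lock_reservations(objs, count, &ctx);
if (ret)
        return ret;

/* ... reserve fence slots, attach fences, push the job ... */

drm_gem_unlock_reservations(objs, count, &ctx);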

voiddrm_gem_lru_init(structdrm_gem_lru*lru,structmutex*lock)

initialize a LRU

Parameters

structdrm_gem_lru*lru

The LRU to initialize

structmutex*lock

The lock protecting the LRU

voiddrm_gem_lru_remove(structdrm_gem_object*obj)

remove object from whatever LRU it is in

Parameters

structdrm_gem_object*obj

The GEM object to remove from current LRU

Description

If the object is currently in any LRU, remove it.

voiddrm_gem_lru_move_tail_locked(structdrm_gem_lru*lru,structdrm_gem_object*obj)

move the object to the tail of the LRU

Parameters

structdrm_gem_lru*lru

The LRU to move the object into.

structdrm_gem_object*obj

The GEM object to move into this LRU

Description

Like drm_gem_lru_move_tail() but the lru lock must be held

voiddrm_gem_lru_move_tail(structdrm_gem_lru*lru,structdrm_gem_object*obj)

move the object to the tail of the LRU

Parameters

structdrm_gem_lru*lru

The LRU to move the object into.

structdrm_gem_object*obj

The GEM object to move into this LRU

Description

If the object is already in this LRU it will be moved to the tail. Otherwise it will be removed from whichever other LRU it is in (if any) and moved into this LRU.

unsignedlongdrm_gem_lru_scan(structdrm_gem_lru*lru,unsignedintnr_to_scan,unsignedlong*remaining,bool(*shrink)(structdrm_gem_object*obj,structww_acquire_ctx*ticket),structww_acquire_ctx*ticket)

helper to implement shrinker.scan_objects

Parameters

structdrm_gem_lru*lru

The LRU to scan

unsignedintnr_to_scan

The number of pages to try to reclaim

unsignedlong*remaining

The number of pages left to reclaim, should be initialized by caller

bool(*shrink)(structdrm_gem_object*obj,structww_acquire_ctx*ticket)

Callback to try to shrink/reclaim the object.

structww_acquire_ctx*ticket

Optional ww_acquire_ctx context to use for locking

Description

If the shrink callback succeeds, it is expected that the driver move the object out of this LRU.

If the LRU possibly contains active buffers, it is the responsibility of the shrink callback to check for this (ie. dma_resv_test_signaled()) or if necessary block until the buffer becomes idle.
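
A hedged sketch of such a shrink callback; my_purge_pages() stands in for whatever driver-specific reclaim the object needs, and the fence check follows the note above. The callback would be passed as the shrink argument to drm_gem_lru_scan() from the driver’s shrinker:

static bool my_shrink(struct drm_gem_object *obj, struct ww_acquire_ctx *ticket)
{
        /* Skip objects that still have unsignaled fences. */
        if (!dma_resv_test_signaled(obj->resv, DMA_RESV_USAGE_BOOKKEEP))
                return false;

        my_purge_pages(obj);     /* hypothetical: drop the backing storage */
        drm_gem_lru_remove(obj); /* take the object off this LRU */

        return true;
}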

intdrm_gem_evict_locked(structdrm_gem_object*obj)

helper to evict backing pages for a GEM object

Parameters

structdrm_gem_object*obj

obj in question

GEM DMA Helper Functions Reference

The DRM GEM/DMA helpers are a means to provide buffer objects that are presented to the device as a contiguous chunk of memory. This is useful for devices that do not support scatter-gather DMA (either directly or by using an intimately attached IOMMU).

For devices that access the memory bus through an (external) IOMMU the buffer objects are allocated using a traditional page-based allocator and may be scattered through physical memory. However they are contiguous in the IOVA space and so appear contiguous to devices using them.

For other devices the helpers rely on CMA to provide buffer objects that are physically contiguous in memory.

For GEM callback helpers in struct drm_gem_object functions, see likewise named functions with an _object_ infix (e.g., drm_gem_dma_object_vmap() wraps drm_gem_dma_vmap()). These helpers perform the necessary type conversion.
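
As an illustration, a driver wiring its GEM object functions to these wrappers by hand might look as follows; most drivers do not need to spell this out, it is shown here only to make the wrapping explicit:

static const struct drm_gem_object_funcs my_dma_gem_funcs = {
        .free = drm_gem_dma_object_free,
        .print_info = drm_gem_dma_object_print_info,
        .get_sg_table = drm_gem_dma_object_get_sg_table,
        .vmap = drm_gem_dma_object_vmap,
        .mmap = drm_gem_dma_object_mmap,
};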

structdrm_gem_dma_object

GEM object backed by DMA memory allocations

Definition:

struct drm_gem_dma_object {
        struct drm_gem_object base;
        dma_addr_t dma_addr;
        struct sg_table *sgt;
        void *vaddr;
        bool map_noncoherent;
};

Members

base

base GEM object

dma_addr

DMA address of the backing memory

sgt

scatter/gather table for imported PRIME buffers. The table can havemore than one entry but they are guaranteed to have contiguousDMA addresses.

vaddr

kernel virtual address of the backing memory

map_noncoherent

if true, the GEM object is backed by non-coherent memory

voiddrm_gem_dma_object_free(structdrm_gem_object*obj)

GEM object function fordrm_gem_dma_free()

Parameters

structdrm_gem_object*obj

GEM object to free

Description

This function wraps drm_gem_dma_free(). Drivers that employ the DMA helpers should use it as their drm_gem_object_funcs.free handler.

voiddrm_gem_dma_object_print_info(structdrm_printer*p,unsignedintindent,conststructdrm_gem_object*obj)

Printdrm_gem_dma_object info for debugfs

Parameters

structdrm_printer*p

DRM printer

unsignedintindent

Tab indentation level

conststructdrm_gem_object*obj

GEM object

Description

This function wrapsdrm_gem_dma_print_info(). Drivers that employ the DMA helpersshould use this function as theirdrm_gem_object_funcs.print_info handler.

structsg_table*drm_gem_dma_object_get_sg_table(structdrm_gem_object*obj)

GEM object function fordrm_gem_dma_get_sg_table()

Parameters

structdrm_gem_object*obj

GEM object

Description

This function wrapsdrm_gem_dma_get_sg_table(). Drivers that employ the DMA helpers shoulduse it as theirdrm_gem_object_funcs.get_sg_table handler.

Return

A pointer to the scatter/gather table of pinned pages or NULL on failure.

intdrm_gem_dma_object_mmap(structdrm_gem_object*obj,structvm_area_struct*vma)

GEM object function fordrm_gem_dma_mmap()

Parameters

structdrm_gem_object*obj

GEM object

structvm_area_struct*vma

VMA for the area to be mapped

Description

This function wrapsdrm_gem_dma_mmap(). Drivers that employ the dma helpers shoulduse it as theirdrm_gem_object_funcs.mmap handler.

Return

0 on success or a negative error code on failure.

DRM_GEM_DMA_DRIVER_OPS_WITH_DUMB_CREATE

DRM_GEM_DMA_DRIVER_OPS_WITH_DUMB_CREATE(dumb_create_func)

DMA GEM driver operations

Parameters

dumb_create_func

callback function for .dumb_create

Description

This macro provides a shortcut for setting the default GEM operations in thedrm_driver structure.

This macro is a variant of DRM_GEM_DMA_DRIVER_OPS for drivers that override the default implementation of struct drm_driver.dumb_create. Use DRM_GEM_DMA_DRIVER_OPS if possible. Drivers that require a virtual address on imported buffers should use DRM_GEM_DMA_DRIVER_OPS_VMAP_WITH_DUMB_CREATE() instead.

DRM_GEM_DMA_DRIVER_OPS

DRM_GEM_DMA_DRIVER_OPS

DMA GEM driver operations

Description

This macro provides a shortcut for setting the default GEM operations in thedrm_driver structure.

Drivers that come with their own implementation ofstructdrm_driver.dumb_create should useDRM_GEM_DMA_DRIVER_OPS_WITH_DUMB_CREATE() instead. UseDRM_GEM_DMA_DRIVER_OPS if possible. Drivers that require a virtual addresson imported buffers should use DRM_GEM_DMA_DRIVER_OPS_VMAP instead.

DRM_GEM_DMA_DRIVER_OPS_VMAP_WITH_DUMB_CREATE

DRM_GEM_DMA_DRIVER_OPS_VMAP_WITH_DUMB_CREATE(dumb_create_func)

DMA GEM driver operations ensuring a virtual address on the buffer

Parameters

dumb_create_func

callback function for .dumb_create

Description

This macro provides a shortcut for setting the default GEM operations in thedrm_driver structure for drivers that need the virtual address also onimported buffers.

This macro is a variant of DRM_GEM_DMA_DRIVER_OPS_VMAP for drivers thatoverride the default implementation ofstructdrm_driver.dumb_create. UseDRM_GEM_DMA_DRIVER_OPS_VMAP if possible. Drivers that do not require avirtual address on imported buffers should useDRM_GEM_DMA_DRIVER_OPS_WITH_DUMB_CREATE() instead.

DRM_GEM_DMA_DRIVER_OPS_VMAP

DRM_GEM_DMA_DRIVER_OPS_VMAP

DMA GEM driver operations ensuring a virtual address on the buffer

Description

This macro provides a shortcut for setting the default GEM operations in thedrm_driver structure for drivers that need the virtual address also onimported buffers.

Drivers that come with their own implementation ofstructdrm_driver.dumb_create should useDRM_GEM_DMA_DRIVER_OPS_VMAP_WITH_DUMB_CREATE() instead. UseDRM_GEM_DMA_DRIVER_OPS_VMAP if possible. Drivers that do not require avirtual address on imported buffers should use DRM_GEM_DMA_DRIVER_OPSinstead.

DEFINE_DRM_GEM_DMA_FOPS

DEFINE_DRM_GEM_DMA_FOPS(name)

macro to generate file operations for DMA drivers

Parameters

name

name for the generated structure

Description

This macro autogenerates a suitable struct file_operations for DMA based drivers, which can be assigned to drm_driver.fops. Note that this structure cannot be shared between drivers, because it contains a reference to the current module using THIS_MODULE.

Note that the declaration is already marked as static - if you need a non-static version of this you’re probably doing it wrong and will break the THIS_MODULE reference by accident.
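
Putting the pieces together, a DMA-helper based driver might declare its file operations and driver structure roughly like this; the driver name and feature flags are placeholders:

DEFINE_DRM_GEM_DMA_FOPS(my_fops);

static const struct drm_driver my_driver = {
        .driver_features = DRIVER_GEM | DRIVER_MODESET | DRIVER_ATOMIC,
        .fops = &my_fops,
        DRM_GEM_DMA_DRIVER_OPS,
        /* ... name, desc and other driver fields ... */
};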

structdrm_gem_dma_object*drm_gem_dma_create(structdrm_device*drm,size_tsize)

allocate an object with the given size

Parameters

structdrm_device*drm

DRM device

size_tsize

size of the object to allocate

Description

This function creates a DMA GEM object and allocates memory as backing store. The allocated memory will occupy a contiguous chunk of bus address space.

For devices that are directly connected to the memory bus the allocated memory will be physically contiguous. For devices that access through an IOMMU, the allocated memory is not expected to be physically contiguous, because having contiguous IOVAs is sufficient to meet a device’s DMA requirements.

Return

Astructdrm_gem_dma_object * on success or anERR_PTR()-encoded negativeerror code on failure.

voiddrm_gem_dma_free(structdrm_gem_dma_object*dma_obj)

free resources associated with a DMA GEM object

Parameters

structdrm_gem_dma_object*dma_obj

DMA GEM object to free

Description

This function frees the backing memory of the DMA GEM object, cleans up theGEM object state and frees the memory used to store the object itself.If the buffer is imported and the virtual address is set, it is released.

intdrm_gem_dma_dumb_create_internal(structdrm_file*file_priv,structdrm_device*drm,structdrm_mode_create_dumb*args)

create a dumb buffer object

Parameters

structdrm_file*file_priv

DRM file-private structure to create the dumb buffer for

structdrm_device*drm

DRM device

structdrm_mode_create_dumb*args

IOCTL data

Description

This aligns the pitch and size arguments to the minimum required. This is an internal helper that can be wrapped by a driver to account for hardware with more specific alignment requirements. It should not be used directly as their drm_driver.dumb_create callback.

Return

0 on success or a negative error code on failure.

intdrm_gem_dma_dumb_create(structdrm_file*file_priv,structdrm_device*drm,structdrm_mode_create_dumb*args)

create a dumb buffer object

Parameters

structdrm_file*file_priv

DRM file-private structure to create the dumb buffer for

structdrm_device*drm

DRM device

structdrm_mode_create_dumb*args

IOCTL data

Description

This function computes the pitch of the dumb buffer and rounds it up to aninteger number of bytes per pixel. Drivers for hardware that doesn’t haveany additional restrictions on the pitch can directly use this function astheirdrm_driver.dumb_create callback.

For hardware with additional restrictions, drivers can adjust the fieldsset up by userspace and pass the IOCTL data along to thedrm_gem_dma_dumb_create_internal() function.

Return

0 on success or a negative error code on failure.

unsignedlongdrm_gem_dma_get_unmapped_area(structfile*filp,unsignedlongaddr,unsignedlonglen,unsignedlongpgoff,unsignedlongflags)

propose address for mapping in noMMU cases

Parameters

structfile*filp

file object

unsignedlongaddr

memory address

unsignedlonglen

buffer size

unsignedlongpgoff

page offset

unsignedlongflags

memory flags

Description

This function is used on noMMU platforms to propose an address mapping for a given buffer. It’s intended to be used as a direct handler for the struct file_operations.get_unmapped_area operation.

Return

mapping address on success or a negative error code on failure.

voiddrm_gem_dma_print_info(conststructdrm_gem_dma_object*dma_obj,structdrm_printer*p,unsignedintindent)

Printdrm_gem_dma_object info for debugfs

Parameters

conststructdrm_gem_dma_object*dma_obj

DMA GEM object

structdrm_printer*p

DRM printer

unsignedintindent

Tab indentation level

Description

This function prints dma_addr and vaddr for use in e.g. debugfs output.

structsg_table*drm_gem_dma_get_sg_table(structdrm_gem_dma_object*dma_obj)

provide a scatter/gather table of pinned pages for a DMA GEM object

Parameters

structdrm_gem_dma_object*dma_obj

DMA GEM object

Description

This function exports a scatter/gather table by calling the standardDMA mapping API.

Return

A pointer to the scatter/gather table of pinned pages or NULL on failure.

structdrm_gem_object*drm_gem_dma_prime_import_sg_table(structdrm_device*dev,structdma_buf_attachment*attach,structsg_table*sgt)

produce a DMA GEM object from another driver’s scatter/gather table of pinned pages

Parameters

structdrm_device*dev

device to import into

structdma_buf_attachment*attach

DMA-BUF attachment

structsg_table*sgt

scatter/gather table of pinned pages

Description

This function imports a scatter/gather table exported via DMA-BUF byanother driver. Imported buffers must be physically contiguous in memory(i.e. the scatter/gather table must contain a single entry). Drivers thatuse the DMA helpers should set this as theirdrm_driver.gem_prime_import_sg_table callback.

Return

A pointer to a newly created GEM object or an ERR_PTR-encoded negativeerror code on failure.

intdrm_gem_dma_vmap(structdrm_gem_dma_object*dma_obj,structiosys_map*map)

map a DMA GEM object into the kernel’s virtual address space

Parameters

structdrm_gem_dma_object*dma_obj

DMA GEM object

structiosys_map*map

Returns the kernel virtual address of the DMA GEM object’s backingstore.

Description

This function maps a buffer into the kernel’s virtual address space.Since the DMA buffers are already mapped into the kernel virtual addressspace this simply returns the cached virtual address.

Return

0 on success, or a negative error code otherwise.

intdrm_gem_dma_mmap(structdrm_gem_dma_object*dma_obj,structvm_area_struct*vma)

memory-map an exported DMA GEM object

Parameters

structdrm_gem_dma_object*dma_obj

DMA GEM object

structvm_area_struct*vma

VMA for the area to be mapped

Description

This function maps a buffer into a userspace process’s address space.In addition to the usual GEM VMA setup it immediately faults in the entireobject instead of using on-demand faulting.

Return

0 on success or a negative error code on failure.

structdrm_gem_object*drm_gem_dma_prime_import_sg_table_vmap(structdrm_device*dev,structdma_buf_attachment*attach,structsg_table*sgt)

PRIME import another driver’s scatter/gather table and get the virtual address of the buffer

Parameters

structdrm_device*dev

DRM device

structdma_buf_attachment*attach

DMA-BUF attachment

structsg_table*sgt

Scatter/gather table of pinned pages

Description

This function imports a scatter/gather table usingdrm_gem_dma_prime_import_sg_table() and usesdma_buf_vmap() to get the kernelvirtual address. This ensures that a DMA GEM object always has its virtualaddress set. This address is released when the object is freed.

This function can be used as thedrm_driver.gem_prime_import_sg_tablecallback. TheDRM_GEM_DMA_DRIVER_OPS_VMAP macro provides a shortcut to setthe necessary DRM driver operations.

Return

A pointer to a newly created GEM object or an ERR_PTR-encoded negativeerror code on failure.

GEM SHMEM Helper Function Reference

This library provides helpers for GEM objects backed by shmem buffers allocated using anonymous pageable memory.

Functions that operate on the GEM object receive struct drm_gem_shmem_object. For GEM callback helpers in struct drm_gem_object functions, see likewise named functions with an _object_ infix (e.g., drm_gem_shmem_object_vmap() wraps drm_gem_shmem_vmap()). These helpers perform the necessary type conversion.

structdrm_gem_shmem_object

GEM object backed by shmem

Definition:

struct drm_gem_shmem_object {
        struct drm_gem_object base;
        struct page **pages;
        refcount_t pages_use_count;
        refcount_t pages_pin_count;
        int madv;
        struct list_head madv_list;
        struct sg_table *sgt;
        void *vaddr;
        refcount_t vmap_use_count;
        bool pages_mark_dirty_on_put : 1;
        bool pages_mark_accessed_on_put : 1;
        bool map_wc : 1;
};

Members

base

Base GEM object

pages

Page table

pages_use_count

Reference count on the pages table.The pages are put when the count reaches zero.

pages_pin_count

Reference count on the pinned pages table.

Pages are hard-pinned and reside in memory if the count is greater than zero. Otherwise, when the count is zero, the pages are allowed to be evicted and purged by the memory shrinker.

madv

State for madvise

0 is active/inuse. A negative value means the object is purged. Positive values are driver specific and not used by the helpers.

madv_list

List entry for madvise tracking

Typically used by drivers to track purgeable objects

sgt

Scatter/gather table for imported PRIME buffers

vaddr

Kernel virtual address of the backing memory

vmap_use_count

Reference count on the virtual address. The address is unmapped when the count reaches zero.

pages_mark_dirty_on_put

Mark pages as dirty when they are put.

pages_mark_accessed_on_put

Mark pages as accessed when they are put.

map_wc

map object write-combined (instead of using shmem defaults).

voiddrm_gem_shmem_object_free(structdrm_gem_object*obj)

GEM object function fordrm_gem_shmem_free()

Parameters

structdrm_gem_object*obj

GEM object to free

Description

This function wrapsdrm_gem_shmem_free(). Drivers that employ the shmem helpersshould use it as theirdrm_gem_object_funcs.free handler.

voiddrm_gem_shmem_object_print_info(structdrm_printer*p,unsignedintindent,conststructdrm_gem_object*obj)

Printdrm_gem_shmem_object info for debugfs

Parameters

structdrm_printer*p

DRM printer

unsignedintindent

Tab indentation level

conststructdrm_gem_object*obj

GEM object

Description

This function wrapsdrm_gem_shmem_print_info(). Drivers that employ the shmem helpers shoulduse this function as theirdrm_gem_object_funcs.print_info handler.

intdrm_gem_shmem_object_pin(structdrm_gem_object*obj)

GEM object function fordrm_gem_shmem_pin()

Parameters

structdrm_gem_object*obj

GEM object

Description

This function wrapsdrm_gem_shmem_pin(). Drivers that employ the shmem helpers shoulduse it as theirdrm_gem_object_funcs.pin handler.

voiddrm_gem_shmem_object_unpin(structdrm_gem_object*obj)

GEM object function fordrm_gem_shmem_unpin()

Parameters

structdrm_gem_object*obj

GEM object

Description

This function wrapsdrm_gem_shmem_unpin(). Drivers that employ the shmem helpers shoulduse it as theirdrm_gem_object_funcs.unpin handler.

structsg_table*drm_gem_shmem_object_get_sg_table(structdrm_gem_object*obj)

GEM object function fordrm_gem_shmem_get_sg_table()

Parameters

structdrm_gem_object*obj

GEM object

Description

This function wrapsdrm_gem_shmem_get_sg_table(). Drivers that employ the shmem helpers shoulduse it as theirdrm_gem_object_funcs.get_sg_table handler.

Return

A pointer to the scatter/gather table of pinned pages or error pointer on failure.

intdrm_gem_shmem_object_mmap(structdrm_gem_object*obj,structvm_area_struct*vma)

GEM object function fordrm_gem_shmem_mmap()

Parameters

structdrm_gem_object*obj

GEM object

structvm_area_struct*vma

VMA for the area to be mapped

Description

This function wrapsdrm_gem_shmem_mmap(). Drivers that employ the shmem helpers shoulduse it as theirdrm_gem_object_funcs.mmap handler.

Return

0 on success or a negative error code on failure.

DRM_GEM_SHMEM_DRIVER_OPS

DRM_GEM_SHMEM_DRIVER_OPS

Default shmem GEM operations

Description

This macro provides a shortcut for setting the shmem GEM operations in the drm_driver structure. Drivers that do not require an s/g table for imported buffers should use this.
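
For illustration, a shmem-helper based driver can combine this macro with file operations generated by DEFINE_DRM_GEM_FOPS() from the core GEM helpers; driver name and feature flags below are placeholders:

DEFINE_DRM_GEM_FOPS(my_fops);

static const struct drm_driver my_driver = {
        .driver_features = DRIVER_GEM | DRIVER_MODESET | DRIVER_ATOMIC,
        .fops = &my_fops,
        DRM_GEM_SHMEM_DRIVER_OPS,
};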

intdrm_gem_shmem_init(structdrm_device*dev,structdrm_gem_shmem_object*shmem,size_tsize)

Initialize an allocated object.

Parameters

structdrm_device*dev

DRM device

structdrm_gem_shmem_object*shmem

shmem GEM object to initialize

size_tsize

Buffer size in bytes

Description

This function initializes an allocated shmem GEM object.

Return

0 on success, or a negative error code on failure.

structdrm_gem_shmem_object*drm_gem_shmem_create(structdrm_device*dev,size_tsize)

Allocate an object with the given size

Parameters

structdrm_device*dev

DRM device

size_tsize

Size of the object to allocate

Description

This function creates a shmem GEM object.

Return

Astructdrm_gem_shmem_object * on success or anERR_PTR()-encoded negativeerror code on failure.

voiddrm_gem_shmem_release(structdrm_gem_shmem_object*shmem)

Release resources associated with a shmem GEM object.

Parameters

structdrm_gem_shmem_object*shmem

shmem GEM object

Description

This function cleans up the GEM object state, but does not free the memory used to store theobject itself. This function is meant to be a dedicated helper for the Rust GEM bindings.

voiddrm_gem_shmem_free(structdrm_gem_shmem_object*shmem)

Free resources associated with a shmem GEM object

Parameters

structdrm_gem_shmem_object*shmem

shmem GEM object to free

Description

This function cleans up the GEM object state and frees the memory used tostore the object itself.

intdrm_gem_shmem_pin(structdrm_gem_shmem_object*shmem)

Pin backing pages for a shmem GEM object

Parameters

structdrm_gem_shmem_object*shmem

shmem GEM object

Description

This function makes sure the backing pages are pinned in memory while thebuffer is exported.

Return

0 on success or a negative error code on failure.

voiddrm_gem_shmem_unpin(structdrm_gem_shmem_object*shmem)

Unpin backing pages for a shmem GEM object

Parameters

structdrm_gem_shmem_object*shmem

shmem GEM object

Description

This function removes the requirement that the backing pages are pinned inmemory.

intdrm_gem_shmem_dumb_create(structdrm_file*file,structdrm_device*dev,structdrm_mode_create_dumb*args)

Create a dumb shmem buffer object

Parameters

structdrm_file*file

DRM file structure to create the dumb buffer for

structdrm_device*dev

DRM device

structdrm_mode_create_dumb*args

IOCTL data

Description

This function computes the pitch of the dumb buffer and rounds it up to aninteger number of bytes per pixel. Drivers for hardware that doesn’t haveany additional restrictions on the pitch can directly use this function astheirdrm_driver.dumb_create callback.

For hardware with additional restrictions, drivers can adjust the fieldsset up by userspace before calling into this function.

Return

0 on success or a negative error code on failure.

intdrm_gem_shmem_mmap(structdrm_gem_shmem_object*shmem,structvm_area_struct*vma)

Memory-map a shmem GEM object

Parameters

structdrm_gem_shmem_object*shmem

shmem GEM object

structvm_area_struct*vma

VMA for the area to be mapped

Description

This function implements an augmented version of the GEM DRM file mmapoperation for shmem objects.

Return

0 on success or a negative error code on failure.

voiddrm_gem_shmem_print_info(conststructdrm_gem_shmem_object*shmem,structdrm_printer*p,unsignedintindent)

Printdrm_gem_shmem_object info for debugfs

Parameters

conststructdrm_gem_shmem_object*shmem

shmem GEM object

structdrm_printer*p

DRM printer

unsignedintindent

Tab indentation level

structsg_table*drm_gem_shmem_get_sg_table(structdrm_gem_shmem_object*shmem)

Provide a scatter/gather table of pinned pages for a shmem GEM object

Parameters

structdrm_gem_shmem_object*shmem

shmem GEM object

Description

This function exports a scatter/gather table suitable for PRIME usage bycalling the standard DMA mapping API.

Drivers that need to acquire a scatter/gather table for objects need to call drm_gem_shmem_get_pages_sgt() instead.

Return

A pointer to the scatter/gather table of pinned pages or error pointer on failure.

structsg_table*drm_gem_shmem_get_pages_sgt(structdrm_gem_shmem_object*shmem)

Pin pages, dma map them, and return a scatter/gather table for a shmem GEM object.

Parameters

structdrm_gem_shmem_object*shmem

shmem GEM object

Description

This function returns a scatter/gather table suitable for driver usage. If the sg table doesn’t exist, the pages are pinned, dma-mapped, and a sg table created.

This is the main function for drivers to get at backing storage, and it hides the difference between dma-buf imported and natively allocated objects. drm_gem_shmem_get_sg_table() should not be called directly by drivers.

Return

A pointer to the scatter/gather table of pinned pages or errno on failure.

structdrm_gem_object*drm_gem_shmem_prime_import_sg_table(structdrm_device*dev,structdma_buf_attachment*attach,structsg_table*sgt)

Produce a shmem GEM object from another driver’s scatter/gather table of pinned pages

Parameters

structdrm_device*dev

Device to import into

structdma_buf_attachment*attach

DMA-BUF attachment

structsg_table*sgt

Scatter/gather table of pinned pages

Description

This function imports a scatter/gather table exported via DMA-BUF byanother driver. Drivers that use the shmem helpers should set this as theirdrm_driver.gem_prime_import_sg_table callback.

Return

A pointer to a newly created GEM object or an ERR_PTR-encoded negativeerror code on failure.

structdrm_gem_object*drm_gem_shmem_prime_import_no_map(structdrm_device*dev,structdma_buf*dma_buf)

Import dmabuf without mapping its sg_table

Parameters

structdrm_device*dev

Device to import into

structdma_buf*dma_buf

dma-buf object to import

Description

Drivers that use the shmem helpers but also want to import dmabuf without mapping its sg_table can use this as their drm_driver.gem_prime_import implementation.

GEM VRAM Helper Functions Reference

This library provides struct drm_gem_vram_object (GEM VRAM), a GEM buffer object that is backed by video RAM (VRAM). It can be used for framebuffer devices with dedicated memory.

The data structure struct drm_vram_mm and its helpers implement a memory manager for simple framebuffer devices with dedicated video memory. GEM VRAM buffer objects are either placed in the video memory or remain evicted to system memory.

With the GEM interface userspace applications create, manage and destroy graphics buffers, such as an on-screen framebuffer. GEM does not provide an implementation of these interfaces. It’s up to the DRM driver to provide an implementation that suits the hardware. If the hardware device contains dedicated video memory, the DRM driver can use the VRAM helper library. Each active buffer object is stored in video RAM. Active buffers are used for drawing the current frame, typically something like the frame’s scanout buffer or the cursor image. If there’s no more space left in VRAM, inactive GEM objects can be moved to system memory.

To initialize the VRAM helper library call drmm_vram_helper_init(). The function allocates and initializes an instance of struct drm_vram_mm in struct drm_device.vram_mm. Use DRM_GEM_VRAM_DRIVER to initialize struct drm_driver and DRM_VRAM_MM_FILE_OPERATIONS to initialize struct file_operations; as illustrated below.

struct file_operations fops = {
        .owner = THIS_MODULE,
        DRM_VRAM_MM_FILE_OPERATIONS
};

struct drm_driver drv = {
        .driver_features = DRM_...,
        .fops = &fops,
        DRM_GEM_VRAM_DRIVER
};

int init_drm_driver()
{
        struct drm_device *dev;
        uint64_t vram_base;
        unsigned long vram_size;
        int ret;

        // setup device, vram base and size
        // ...

        ret = drmm_vram_helper_init(dev, vram_base, vram_size);
        if (ret)
                return ret;

        return 0;
}

This creates an instance of struct drm_vram_mm, exports DRM userspace interfaces for GEM buffer management and initializes file operations to allow for accessing created GEM buffers. With this setup, the DRM driver manages an area of video RAM with VRAM MM and provides GEM VRAM objects to userspace.

You don’t have to clean up the instance of VRAM MM. drmm_vram_helper_init() is a managed interface that installs a clean-up handler to run during the DRM device’s release.

A buffer object that is pinned in video RAM has a fixed address within that memory region. Call drm_gem_vram_offset() to retrieve this value. Typically it’s used to program the hardware’s scanout engine for framebuffers, set the cursor overlay’s image for a mouse cursor, or use it as input to the hardware’s drawing engine.

To access a buffer object’s memory from the DRM driver, call drm_gem_vram_vmap(). It maps the buffer into kernel address space and returns the memory address. Use drm_gem_vram_vunmap() to release the mapping.
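
A short sketch of the vmap/vunmap pairing; gbo is assumed to be an already created GEM VRAM object:

struct iosys_map map;
int ret;

ret = drm_gem_vram_vmap(gbo, &map);
if (ret)
        return ret;

/* ... access the buffer through map.vaddr or map.vaddr_iomem ... */

drm_gem_vram_vunmap(gbo, &map);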

structdrm_gem_vram_object

GEM object backed by VRAM

Definition:

struct drm_gem_vram_object {
        struct ttm_buffer_object bo;
        struct iosys_map map;
        unsigned int vmap_use_count;
        struct ttm_placement placement;
        struct ttm_place placements[2];
};

Members

bo

TTM buffer object

map

Mapping information forbo

vmap_use_count

Reference count on the virtual address. The address is unmapped when the count reaches zero.

placement

TTM placement information. Supported placements are TTM_PL_VRAM and TTM_PL_SYSTEM

placements

TTM placement information.

Description

The type struct drm_gem_vram_object represents a GEM object that is backed by VRAM. It can be used for simple framebuffer devices with dedicated memory. The buffer object can be evicted to system memory if video memory becomes scarce.

GEM VRAM objects perform reference counting for pin and mapping operations. So a buffer object that has been pinned N times with drm_gem_vram_pin() must be unpinned N times with drm_gem_vram_unpin(). The same applies to pairs of drm_gem_vram_kmap() and drm_gem_vram_kunmap(), as well as pairs of drm_gem_vram_vmap() and drm_gem_vram_vunmap().

structdrm_gem_vram_object*drm_gem_vram_of_bo(structttm_buffer_object*bo)

Returns the container of typestructdrm_gem_vram_object for field bo.

Parameters

structttm_buffer_object*bo

the VRAM buffer object

Return

The containing GEM VRAM object

structdrm_gem_vram_object*drm_gem_vram_of_gem(structdrm_gem_object*gem)

Returns the container of typestructdrm_gem_vram_object for field gem.

Parameters

structdrm_gem_object*gem

the GEM object

Return

The containing GEM VRAM object

DRM_GEM_VRAM_PLANE_HELPER_FUNCS

DRM_GEM_VRAM_PLANE_HELPER_FUNCS

Initializesstructdrm_plane_helper_funcs for VRAM handling

Description

Drivers may use GEM BOs as VRAM helpers for the framebuffer memory. Thismacro initializesstructdrm_plane_helper_funcs to use the respective helperfunctions.

DRM_GEM_VRAM_DRIVER

DRM_GEM_VRAM_DRIVER

default callback functions forstructdrm_driver

Description

Drivers that use VRAM MM and GEM VRAM can use this macro to initializestructdrm_driver with default functions.

structdrm_vram_mm

An instance of VRAM MM

Definition:

struct drm_vram_mm {
        uint64_t vram_base;
        size_t vram_size;
        struct ttm_device bdev;
};

Members

vram_base

Base address of the managed video memory

vram_size

Size of the managed video memory in bytes

bdev

The TTM BO device.

Description

The fields struct drm_vram_mm.vram_base and struct drm_vram_mm.vram_size are managed by VRAM MM, but are available for public read access. Use the field struct drm_vram_mm.bdev to access the TTM BO device.

structdrm_vram_mm*drm_vram_mm_of_bdev(structttm_device*bdev)

Returns the container of type struct drm_vram_mm for field bdev.

Parameters

structttm_device*bdev

the TTM BO device

Return

The containing instance ofstructdrm_vram_mm

structdrm_gem_vram_object*drm_gem_vram_create(structdrm_device*dev,size_tsize,unsignedlongpg_align)

Creates a VRAM-backed GEM object

Parameters

structdrm_device*dev

the DRM device

size_tsize

the buffer size in bytes

unsignedlongpg_align

the buffer’s alignment in multiples of the page size

Description

GEM objects are allocated by callingstructdrm_driver.gem_create_object,if set. Otherwisekzalloc() will be used. Drivers can set their own GEMobject functions instructdrm_driver.gem_create_object. If no functionsare set, the new GEM object will use the default functions from GEM VRAMhelpers.

Return

A new instance ofstructdrm_gem_vram_object on success, oranERR_PTR()-encoded error code otherwise.

voiddrm_gem_vram_put(structdrm_gem_vram_object*gbo)

Releases a reference to a VRAM-backed GEM object

Parameters

structdrm_gem_vram_object*gbo

the GEM VRAM object

Description

See ttm_bo_fini() for more information.

s64drm_gem_vram_offset(structdrm_gem_vram_object*gbo)

Returns a GEM VRAM object’s offset in video memory

Parameters

structdrm_gem_vram_object*gbo

the GEM VRAM object

Description

This function returns the buffer object’s offset in the device’s videomemory. The buffer object has to be pinned toTTM_PL_VRAM.

Return

The buffer object’s offset in video memory on success, ora negative errno code otherwise.

intdrm_gem_vram_vmap(structdrm_gem_vram_object*gbo,structiosys_map*map)

Pins and maps a GEM VRAM object into kernel address space

Parameters

structdrm_gem_vram_object*gbo

The GEM VRAM object to map

structiosys_map*map

Returns the kernel virtual address of the VRAM GEM object’s backingstore.

Description

The vmap function pins a GEM VRAM object to its current location, either system or video memory, and maps its buffer into kernel address space. As pinned objects cannot be relocated, you should avoid pinning objects permanently. Call drm_gem_vram_vunmap() with the returned address to unmap and unpin the GEM VRAM object.

Return

0 on success, or a negative error code otherwise.

voiddrm_gem_vram_vunmap(structdrm_gem_vram_object*gbo,structiosys_map*map)

Unmaps and unpins a GEM VRAM object

Parameters

structdrm_gem_vram_object*gbo

The GEM VRAM object to unmap

structiosys_map*map

Kernel virtual address where the VRAM GEM object was mapped

Description

A call todrm_gem_vram_vunmap() unmaps and unpins a GEM VRAM buffer. Seethe documentation fordrm_gem_vram_vmap() for more information.

intdrm_gem_vram_fill_create_dumb(structdrm_file*file,structdrm_device*dev,unsignedlongpg_align,unsignedlongpitch_align,structdrm_mode_create_dumb*args)

Helper for implementingstructdrm_driver.dumb_create

Parameters

structdrm_file*file

the DRM file

structdrm_device*dev

the DRM device

unsignedlongpg_align

the buffer’s alignment in multiples of the page size

unsignedlongpitch_align

the scanline’s alignment in powers of 2

structdrm_mode_create_dumb*args

the arguments as provided tostructdrm_driver.dumb_create

Description

This helper function fills struct drm_mode_create_dumb, which is used by struct drm_driver.dumb_create. Implementations of this interface should forward their arguments to this helper, plus the driver-specific parameters.

Return

0 on success, ora negative error code otherwise.

intdrm_gem_vram_driver_dumb_create(structdrm_file*file,structdrm_device*dev,structdrm_mode_create_dumb*args)

Implementsstructdrm_driver.dumb_create

Parameters

structdrm_file*file

the DRM file

structdrm_device*dev

the DRM device

structdrm_mode_create_dumb*args

the arguments as provided tostructdrm_driver.dumb_create

Description

This function requires the driver to usedrm_device.vram_mm for itsinstance of VRAM MM.

Return

0 on success, ora negative error code otherwise.

intdrm_gem_vram_plane_helper_prepare_fb(structdrm_plane*plane,structdrm_plane_state*new_state)

Implementsstructdrm_plane_helper_funcs.prepare_fb

Parameters

structdrm_plane*plane

a DRM plane

structdrm_plane_state*new_state

the plane’s new state

Description

During plane updates, this function sets the plane’s fence andpins the GEM VRAM objects of the plane’s new framebuffer to VRAM.Calldrm_gem_vram_plane_helper_cleanup_fb() to unpin them.

Return

0 on success, ora negative errno code otherwise.

voiddrm_gem_vram_plane_helper_cleanup_fb(structdrm_plane*plane,structdrm_plane_state*old_state)

Implementsstructdrm_plane_helper_funcs.cleanup_fb

Parameters

structdrm_plane*plane

a DRM plane

structdrm_plane_state*old_state

the plane’s old state

Description

During plane updates, this function unpins the GEM VRAMobjects of the plane’s old framebuffer from VRAM. Complementsdrm_gem_vram_plane_helper_prepare_fb().

voiddrm_vram_mm_debugfs_init(structdrm_minor*minor)

Register VRAM MM debugfs file.

Parameters

structdrm_minor*minor

drm minor device.

intdrmm_vram_helper_init(structdrm_device*dev,uint64_tvram_base,size_tvram_size)

Initializes a device’s instance ofstructdrm_vram_mm

Parameters

structdrm_device*dev

the DRM device

uint64_tvram_base

the base address of the video memory

size_tvram_size

the size of the video memory in bytes

Description

Creates a new instance ofstructdrm_vram_mm and stores it instructdrm_device.vram_mm. The instance is auto-managed and cleanedup as part of device cleanup. Calling this function multiple timeswill generate an error message.

Return

0 on success, or a negative errno code otherwise.

enumdrm_mode_statusdrm_vram_helper_mode_valid(structdrm_device*dev,conststructdrm_display_mode*mode)

Tests if a display mode’s framebuffer fits into the available video memory.

Parameters

structdrm_device*dev

the DRM device

conststructdrm_display_mode*mode

the mode to test

Description

This function tests if enough video memory is available for using the specified display mode. Atomic modesetting requires importing the designated framebuffer into video memory before evicting the active one. Hence, any framebuffer may consume at most half of the available VRAM. Display modes that require a larger framebuffer cannot be used, even if the CRTC does support them. Each framebuffer is assumed to have 32-bit color depth.

Note

The function can only test if the display mode is supported in general. If there are too many framebuffers pinned to video memory, a display mode may still not be usable in practice. The color depth of 32-bit fits all current use cases. A more flexible test can be added when necessary.

Return

MODE_OK if the display mode is supported, or an error code of typeenumdrm_mode_status otherwise.

GEM TTM Helper Functions Reference

This library provides helper functions for gem objects backed byttm.

voiddrm_gem_ttm_print_info(structdrm_printer*p,unsignedintindent,conststructdrm_gem_object*gem)

Printttm_buffer_object info for debugfs

Parameters

structdrm_printer*p

DRM printer

unsignedintindent

Tab indentation level

conststructdrm_gem_object*gem

GEM object

Description

This function can be used asdrm_gem_object_funcs.print_infocallback.

intdrm_gem_ttm_vmap(structdrm_gem_object*gem,structiosys_map*map)

vmapttm_buffer_object

Parameters

structdrm_gem_object*gem

GEM object.

structiosys_map*map

[out] returns the dma-buf mapping.

Description

Maps a GEM object withttm_bo_vmap(). This function can be used asdrm_gem_object_funcs.vmap callback.

Return

0 on success, or a negative errno code otherwise.

voiddrm_gem_ttm_vunmap(structdrm_gem_object*gem,structiosys_map*map)

vunmapttm_buffer_object

Parameters

structdrm_gem_object*gem

GEM object.

structiosys_map*map

dma-buf mapping.

Description

Unmaps a GEM object with ttm_bo_vunmap(). This function can be used as drm_gem_object_funcs.vunmap callback.

intdrm_gem_ttm_mmap(structdrm_gem_object*gem,structvm_area_struct*vma)

mmapttm_buffer_object

Parameters

structdrm_gem_object*gem

GEM object.

structvm_area_struct*vma

vm area.

Description

This function can be used asdrm_gem_object_funcs.mmapcallback.

intdrm_gem_ttm_dumb_map_offset(structdrm_file*file,structdrm_device*dev,uint32_thandle,uint64_t*offset)

Implements structdrm_driver.dumb_map_offset

Parameters

structdrm_file*file

DRM file pointer.

structdrm_device*dev

DRM device.

uint32_thandle

GEM handle

uint64_t*offset

Returns the mapping’s memory offset on success

Description

Provides an implementation of structdrm_driver.dumb_map_offset forTTM-based GEM drivers. TTM allocates the offset internally anddrm_gem_ttm_dumb_map_offset() returns it for dumb-buffer implementations.

See structdrm_driver.dumb_map_offset.

Return

0 on success, or a negative errno code otherwise.

VMA Offset Manager

The vma-manager is responsible for mapping arbitrary driver-dependent memory regions into the linear user address-space. It provides offsets to the caller which can then be used on the address_space of the drm-device. It takes care to not overlap regions, size them appropriately and to not confuse mm-core by inconsistent fake vm_pgoff fields. Drivers shouldn’t use this for object placement in VMEM. This manager should only be used to manage mappings into linear user-space VMs.

We use drm_mm as backend to manage object allocations. But it is highly optimized for alloc/free calls, not lookups. Hence, we use an rb-tree to speed up offset lookups.

You must not use multiple offset managers on a single address_space. Otherwise, mm-core will be unable to tear down memory mappings as the VM will no longer be linear.

This offset manager works on page-based addresses. That is, every argument and return code (with the exception of drm_vma_node_offset_addr()) is given in number of pages, not number of bytes. That means, object sizes and offsets must always be page-aligned (as usual). If you want to get a valid byte-based user-space address for a given offset, please see drm_vma_node_offset_addr().

Additionally to offset management, the vma offset manager also handles access management. For every open-file context that is allowed to access a given node, you must call drm_vma_node_allow(). Otherwise, an mmap() call on this open-file with the offset of the node will fail with -EACCES. To revoke access again, use drm_vma_node_revoke(). However, the caller is responsible for destroying already existing mappings, if required.
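
As a rough sketch, a driver could grant and revoke access from its GEM object open/close callbacks; note that the DRM GEM core already performs these calls when handles are created and deleted, so this is purely illustrative:

static int my_gem_open(struct drm_gem_object *obj, struct drm_file *file)
{
        /* Allow this open-file to mmap the object's fake offset. */
        return drm_vma_node_allow(&obj->vma_node, file);
}

static void my_gem_close(struct drm_gem_object *obj, struct drm_file *file)
{
        /* Revoke access again when the handle goes away. */
        drm_vma_node_revoke(&obj->vma_node, file);
}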

structdrm_vma_offset_node*drm_vma_offset_exact_lookup_locked(structdrm_vma_offset_manager*mgr,unsignedlongstart,unsignedlongpages)

Look up node by exact address

Parameters

structdrm_vma_offset_manager*mgr

Manager object

unsignedlongstart

Start address (page-based, not byte-based)

unsignedlongpages

Size of object (page-based)

Description

Same asdrm_vma_offset_lookup_locked() but does not allow any offset into the node.It only returns the exact object with the given start address.

Return

Node at exact start addressstart.

voiddrm_vma_offset_lock_lookup(structdrm_vma_offset_manager*mgr)

Lock lookup for extended private use

Parameters

structdrm_vma_offset_manager*mgr

Manager object

Description

Lock VMA manager for extended lookups. Only locked VMA function callsare allowed while holding this lock. All other contexts are blocked from VMAuntil the lock is released viadrm_vma_offset_unlock_lookup().

Use this if you need to take a reference to the objects returned bydrm_vma_offset_lookup_locked() before releasing this lock again.

This lock must not be used for anything else than extended lookups. You mustnot call any other VMA helpers while holding this lock.

Note

You’re in atomic-context while holding this lock!

voiddrm_vma_offset_unlock_lookup(structdrm_vma_offset_manager*mgr)

Unlock lookup for extended private use

Parameters

structdrm_vma_offset_manager*mgr

Manager object

Description

Release lookup-lock. Seedrm_vma_offset_lock_lookup() for more information.

voiddrm_vma_node_reset(structdrm_vma_offset_node*node)

Initialize or reset node object

Parameters

structdrm_vma_offset_node*node

Node to initialize or reset

Description

Reset a node to its initial state. This must be called before using it withany VMA offset manager.

This must not be called on an already allocated node, or you will leakmemory.

unsignedlongdrm_vma_node_start(conststructdrm_vma_offset_node*node)

Return start address for page-based addressing

Parameters

conststructdrm_vma_offset_node*node

Node to inspect

Description

Return the start address of the given node. This can be used as offset intothe linear VM space that is provided by the VMA offset manager. Note thatthis can only be used for page-based addressing. If you need a proper offsetfor user-space mappings, you must apply “<< PAGE_SHIFT” or use thedrm_vma_node_offset_addr() helper instead.

Return

Start address ofnode for page-based addressing. 0 if the node does nothave an offset allocated.

unsignedlongdrm_vma_node_size(structdrm_vma_offset_node*node)

Return size (page-based)

Parameters

structdrm_vma_offset_node*node

Node to inspect

Description

Return the size as number of pages for the given node. This is the same sizethat was passed todrm_vma_offset_add(). If no offset is allocated for thenode, this is 0.

Return

Size ofnode as number of pages. 0 if the node does not have an offsetallocated.

__u64drm_vma_node_offset_addr(structdrm_vma_offset_node*node)

Return sanitized offset for user-space mmaps

Parameters

structdrm_vma_offset_node*node

Linked offset node

Description

Same asdrm_vma_node_start() but returns the address as a valid offset thatcan be used for user-space mappings during mmap().This must not be called on unlinked nodes.

Return

Offset of node for byte-based addressing. 0 if the node does not have an object allocated.

voiddrm_vma_node_unmap(structdrm_vma_offset_node*node,structaddress_space*file_mapping)

Unmap offset node

Parameters

structdrm_vma_offset_node*node

Offset node

structaddress_space*file_mapping

Address space to unmapnode from

Description

Unmap all userspace mappings for a given offset node. The mappings must beassociated with thefile_mapping address-space. If no offset existsnothing is done.

This call is unlocked. The caller must guarantee thatdrm_vma_offset_remove()is not called on this node concurrently.

intdrm_vma_node_verify_access(structdrm_vma_offset_node*node,structdrm_file*tag)

Access verification helper for TTM

Parameters

structdrm_vma_offset_node*node

Offset node

structdrm_file*tag

Tag of file to check

Description

This checks whethertag is granted access tonode. It is the same asdrm_vma_node_is_allowed() but suitable as drop-in helper for TTMverify_access() callbacks.

Return

0 if access is granted, -EACCES otherwise.

voiddrm_vma_offset_manager_init(structdrm_vma_offset_manager*mgr,unsignedlongpage_offset,unsignedlongsize)

Initialize new offset-manager

Parameters

structdrm_vma_offset_manager*mgr

Manager object

unsignedlongpage_offset

Offset of available memory area (page-based)

unsignedlongsize

Size of available address space range (page-based)

Description

Initialize a new offset-manager. The offset and area size available for themanager are given aspage_offset andsize. Both are interpreted aspage-numbers, not bytes.

Adding/removing nodes from the manager is locked internally and protectedagainst concurrent access. However, node allocation and destruction is leftfor the caller. While calling into the vma-manager, a given node mustalways be guaranteed to be referenced.

voiddrm_vma_offset_manager_destroy(structdrm_vma_offset_manager*mgr)

Destroy offset manager

Parameters

structdrm_vma_offset_manager*mgr

Manager object

Description

Destroy an object manager which was previously created viadrm_vma_offset_manager_init(). The caller must remove all allocated nodesbefore destroying the manager. Otherwise, drm_mm will refuse to free therequested resources.

The manager must not be accessed after this function is called.

structdrm_vma_offset_node*drm_vma_offset_lookup_locked(structdrm_vma_offset_manager*mgr,unsignedlongstart,unsignedlongpages)

Find node in offset space

Parameters

structdrm_vma_offset_manager*mgr

Manager object

unsignedlongstart

Start address for object (page-based)

unsignedlongpages

Size of object (page-based)

Description

Find a node given a start address and object size. This returns the _best_ match for the given node. That is, start may point somewhere into a valid region and the given node will be returned, as long as the node spans the whole requested area (given the size in number of pages as pages).

Note that before lookup the vma offset manager lookup lock must be acquired with drm_vma_offset_lock_lookup(). See there for an example. This can then be used to implement weakly referenced lookups using kref_get_unless_zero().

Example

drm_vma_offset_lock_lookup(mgr);
node = drm_vma_offset_lookup_locked(mgr);
if (node)
        kref_get_unless_zero(container_of(node, sth, entr));
drm_vma_offset_unlock_lookup(mgr);

Return

Returns NULL if no suitable node can be found. Otherwise, the best match is returned. It's the caller's responsibility to make sure the node doesn't get destroyed before the caller can access it.

intdrm_vma_offset_add(structdrm_vma_offset_manager*mgr,structdrm_vma_offset_node*node,unsignedlongpages)

Add offset node to manager

Parameters

structdrm_vma_offset_manager*mgr

Manager object

structdrm_vma_offset_node*node

Node to be added

unsignedlongpages

Allocation size visible to user-space (in number of pages)

Description

Add a node to the offset-manager. If the node was already added, this does nothing and returns 0. pages is the size of the object given in number of pages. After this call succeeds, you can access the offset of the node until it is removed again.

If this call fails, it is safe to retry the operation or to call drm_vma_offset_remove() anyway. However, no cleanup is required in that case.

pages is not required to be the same size as the underlying memory object that you want to map. It only limits the size that user-space can map into their address space.

Return

0 on success, negative error code on failure.

voiddrm_vma_offset_remove(structdrm_vma_offset_manager*mgr,structdrm_vma_offset_node*node)

Remove offset node from manager

Parameters

structdrm_vma_offset_manager*mgr

Manager object

structdrm_vma_offset_node*node

Node to be removed

Description

Remove a node from the offset manager. If the node wasn't added before, this does nothing. After this call returns, the offset and size will be 0 until a new offset is allocated via drm_vma_offset_add() again. Helper functions like drm_vma_node_start() and drm_vma_node_offset_addr() will return 0 if no offset is allocated.

intdrm_vma_node_allow(structdrm_vma_offset_node*node,structdrm_file*tag)

Add open-file to list of allowed users

Parameters

structdrm_vma_offset_node*node

Node to modify

structdrm_file*tag

Tag of file to add

Description

Add tag to the list of allowed open-files for this node. If tag is already on this list, the ref-count is incremented.

The list of allowed-users is preserved across drm_vma_offset_add() and drm_vma_offset_remove() calls. You may even call it if the node is currently not added to any offset-manager.

You must remove all open-files the same number of times as you added thembefore destroying the node. Otherwise, you will leak memory.

This is locked against concurrent access internally.

Return

0 on success, negative error code on internal failure (out-of-mem)

intdrm_vma_node_allow_once(structdrm_vma_offset_node*node,structdrm_file*tag)

Add open-file to list of allowed users

Parameters

structdrm_vma_offset_node*node

Node to modify

structdrm_file*tag

Tag of file to add

Description

Add tag to the list of allowed open-files for this node.

The list of allowed-users is preserved across drm_vma_offset_add() and drm_vma_offset_remove() calls. You may even call it if the node is currently not added to any offset-manager.

This is not ref-counted, unlike drm_vma_node_allow(), hence drm_vma_node_revoke() should only be called once after this.

This is locked against concurrent access internally.

Return

0 on success, negative error code on internal failure (out-of-mem)

voiddrm_vma_node_revoke(structdrm_vma_offset_node*node,structdrm_file*tag)

Remove open-file from list of allowed users

Parameters

structdrm_vma_offset_node*node

Node to modify

structdrm_file*tag

Tag of file to remove

Description

Decrement the ref-count of tag in the list of allowed open-files on node. If the ref-count drops to zero, remove tag from the list. You must call this once for every drm_vma_node_allow() on tag.

This is locked against concurrent access internally.

If tag is not on the list, nothing is done.
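As a rough sketch of the allow/revoke pairing (the GEM core already performs this for GEM objects when handles are created and destroyed), a driver managing its own objects could hook its per-file open and close paths like this; my_gem_open()/my_gem_close() are hypothetical names:

    static int my_gem_open(struct drm_gem_object *obj, struct drm_file *file)
    {
            /* Grant this open-file access to the object's mmap offset. */
            return drm_vma_node_allow(&obj->vma_node, file);
    }

    static void my_gem_close(struct drm_gem_object *obj, struct drm_file *file)
    {
            /* Must be called once per successful drm_vma_node_allow(). */
            drm_vma_node_revoke(&obj->vma_node, file);
    }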

booldrm_vma_node_is_allowed(structdrm_vma_offset_node*node,structdrm_file*tag)

Check whether an open-file is granted access

Parameters

structdrm_vma_offset_node*node

Node to check

structdrm_file*tag

Tag of file to check

Description

Search the list in node whether tag is currently on the list of allowed open-files (see drm_vma_node_allow()).

This is locked against concurrent access internally.

Return

true if tag is on the list

PRIME Buffer Sharing

PRIME is the cross device buffer sharing framework in drm, originally created for the OPTIMUS range of multi-gpu platforms. To userspace, PRIME buffers are dma-buf based file descriptors.

Overview and Lifetime Rules

Similar to GEM global names, PRIME file descriptors are also used to share buffer objects across processes. They offer additional security: as file descriptors must be explicitly sent over UNIX domain sockets to be shared between applications, they can't be guessed like the globally unique GEM names.

Drivers that support the PRIME API implement the drm_gem_object_funcs.export and drm_driver.gem_prime_import hooks. dma_buf_ops implementations for drivers are all individually exported for drivers which need to overwrite or reimplement some of them.
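As a hedged sketch of the wiring, a GEM driver typically points these hooks at the PRIME helpers documented below; the callback names are real, while the my_* identifiers are placeholders (and both assignments shown here are also the defaults the core falls back to when the hooks are left unset):

    static const struct drm_gem_object_funcs my_gem_funcs = {
            .free         = my_gem_free,
            .export       = drm_gem_prime_export,
            .get_sg_table = my_gem_get_sg_table,
    };

    static const struct drm_driver my_drm_driver = {
            /* ... */
            .gem_prime_import = drm_gem_prime_import,
    };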

Reference Counting for GEM Drivers

On export, the dma_buf holds a reference to the exported buffer object, usually a drm_gem_object. It takes this reference in the PRIME_HANDLE_TO_FD IOCTL, when it first calls drm_gem_object_funcs.export and stores the exporting GEM object in the dma_buf.priv field. This reference needs to be released when the final reference to the dma_buf itself is dropped and its dma_buf_ops.release function is called. For GEM-based drivers, the dma_buf should be exported using drm_gem_dmabuf_export() and then released by drm_gem_dmabuf_release().

Thus the chain of references always flows in one direction, avoiding loops: importing GEM object -> dma-buf -> exported GEM bo. A further complication are the lookup caches for import and export. These are required to guarantee that any given object will always have only one unique userspace handle. This is required to allow userspace to detect duplicated imports, since some GEM drivers do fail command submissions if a given buffer object is listed more than once. These import and export caches in drm_prime_file_private only retain a weak reference, which is cleaned up when the corresponding object is released.

Self-importing: If userspace is using PRIME as a replacement for flink then it will get a fd->handle request for a GEM object that it created. Drivers should detect this situation and return back the underlying object from the dma-buf private. For GEM based drivers this is handled in drm_gem_prime_import() already.

PRIME Helper Functions

Drivers can implement drm_gem_object_funcs.export and drm_driver.gem_prime_import in terms of simpler APIs by using the helper functions drm_gem_prime_export() and drm_gem_prime_import(). These functions implement dma-buf support in terms of some lower-level helpers, which are again exported for drivers to use individually:

Exporting buffers

Optional pinning of buffers is handled at dma-buf attach and detach time in drm_gem_map_attach() and drm_gem_map_detach(). Backing storage itself is handled by drm_gem_map_dma_buf() and drm_gem_unmap_dma_buf(), which relies on drm_gem_object_funcs.get_sg_table. If drm_gem_object_funcs.get_sg_table is unimplemented, exports into another device are rejected.

For kernel-internal access there's drm_gem_dmabuf_vmap() and drm_gem_dmabuf_vunmap(). Userspace mmap support is provided by drm_gem_dmabuf_mmap().

Note that these export helpers can only be used if the underlying backing storage is fully coherent and either permanently pinned, or it is safe to pin it indefinitely.

FIXME: The underlying helper functions are named rather inconsistently.
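A sketch of how an exporting driver might assemble its dma_buf_ops from these helpers; the field names are those of struct dma_buf_ops, my_prime_dmabuf_ops is a placeholder, and drivers that simply rely on drm_gem_prime_export() get equivalent behavior without defining their own table:

    static const struct dma_buf_ops my_prime_dmabuf_ops = {
            .attach        = drm_gem_map_attach,
            .detach        = drm_gem_map_detach,
            .map_dma_buf   = drm_gem_map_dma_buf,
            .unmap_dma_buf = drm_gem_unmap_dma_buf,
            .release       = drm_gem_dmabuf_release,
            .mmap          = drm_gem_dmabuf_mmap,
            .vmap          = drm_gem_dmabuf_vmap,
            .vunmap        = drm_gem_dmabuf_vunmap,
    };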

Importing buffers

Importing dma-bufs using drm_gem_prime_import() relies on drm_driver.gem_prime_import_sg_table.

Note that, similarly to the export helpers, this permanently pins the underlying backing storage. That is fine for scanout, but is not the best option for sharing lots of buffers for rendering.

PRIME Function References

structdrm_prime_file_private

per-file tracking for PRIME

Definition:

struct drm_prime_file_private {};

Members

Description

This just contains the internal struct dma_buf and handle caches for each struct drm_file used by the PRIME core code.

structdma_buf*drm_gem_dmabuf_export(structdrm_device*dev,structdma_buf_export_info*exp_info)

dma_buf export implementation for GEM

Parameters

structdrm_device*dev

parent device for the exported dmabuf

structdma_buf_export_info*exp_info

the export information used by dma_buf_export()

Description

This wraps dma_buf_export() for use by generic GEM drivers that are using drm_gem_dmabuf_release(). In addition to calling dma_buf_export(), we take a reference to the drm_device and the exported drm_gem_object (stored in dma_buf_export_info.priv) which is released by drm_gem_dmabuf_release().

Returns the new dmabuf.

voiddrm_gem_dmabuf_release(structdma_buf*dma_buf)

dma_buf release implementation for GEM

Parameters

structdma_buf*dma_buf

buffer to be released

Description

Generic release function for dma_bufs exported as PRIME buffers. GEM drivers must use this in their dma_buf_ops structure as the release callback. drm_gem_dmabuf_release() should be used in conjunction with drm_gem_dmabuf_export().

intdrm_gem_prime_fd_to_handle(structdrm_device*dev,structdrm_file*file_priv,intprime_fd,uint32_t*handle)

PRIME import function for GEM drivers

Parameters

structdrm_device*dev

drm_device to import into

structdrm_file*file_priv

drm file-private structure

intprime_fd

fd id of the dma-buf which should be imported

uint32_t*handle

pointer to storage for the handle of the imported buffer object

Description

This is the PRIME import function which must be used by GEM drivers to ensure correct lifetime management of the underlying GEM object. The actual importing of the GEM object from the dma-buf is done through the drm_driver.gem_prime_import driver callback.

Returns 0 on success or a negative error code on failure.

structdma_buf*drm_gem_prime_handle_to_dmabuf(structdrm_device*dev,structdrm_file*file_priv,uint32_thandle,uint32_tflags)

PRIME export function for GEM drivers

Parameters

structdrm_device*dev

dev to export the buffer from

structdrm_file*file_priv

drm file-private structure

uint32_thandle

buffer handle to export

uint32_tflags

flags like DRM_CLOEXEC

Description

This is the PRIME export function which must be used by GEM drivers to ensure correct lifetime management of the underlying GEM object. The actual exporting from the GEM object to a dma-buf is done through the drm_gem_object_funcs.export callback.

Unlike drm_gem_prime_handle_to_fd(), it returns the struct dma_buf it has created, without attaching it to any file descriptors. The difference between those two is similar to that between anon_inode_getfile() and anon_inode_getfd(); insertion into the descriptor table is something you can not revert if any cleanup is needed, so the descriptor-returning variants should only be used when you are past the last failure exit and the only thing left is passing the new file descriptor to userland. When all you need is the object itself or when you need to do something else that might fail, use that one instead.

intdrm_gem_prime_handle_to_fd(structdrm_device*dev,structdrm_file*file_priv,uint32_thandle,uint32_tflags,int*prime_fd)

PRIME export function for GEM drivers

Parameters

structdrm_device*dev

dev to export the buffer from

structdrm_file*file_priv

drm file-private structure

uint32_thandle

buffer handle to export

uint32_tflags

flags like DRM_CLOEXEC

int*prime_fd

pointer to storage for the fd id of the created dma-buf

Description

This is the PRIME export function which must be used by GEM drivers to ensure correct lifetime management of the underlying GEM object. The actual exporting from the GEM object to a dma-buf is done through the drm_gem_object_funcs.export callback.

intdrm_gem_map_attach(structdma_buf*dma_buf,structdma_buf_attachment*attach)

dma_buf attach implementation for GEM

Parameters

structdma_buf*dma_buf

buffer to attach device to

structdma_buf_attachment*attach

buffer attachment data

Description

Calls drm_gem_object_funcs.pin for device specific handling. This can be used as the dma_buf_ops.attach callback. Must be used together with drm_gem_map_detach().

Returns 0 on success, negative error code on failure.

voiddrm_gem_map_detach(structdma_buf*dma_buf,structdma_buf_attachment*attach)

dma_buf detach implementation for GEM

Parameters

structdma_buf*dma_buf

buffer to detach from

structdma_buf_attachment*attach

attachment to be detached

Description

Calls drm_gem_object_funcs.unpin for device specific handling. Cleans up the dma_buf_attachment from drm_gem_map_attach(). This can be used as the dma_buf_ops.detach callback.

structsg_table*drm_gem_map_dma_buf(structdma_buf_attachment*attach,enumdma_data_directiondir)

map_dma_buf implementation for GEM

Parameters

structdma_buf_attachment*attach

attachment whose scatterlist is to be returned

enumdma_data_directiondir

direction of DMA transfer

Description

Calls drm_gem_object_funcs.get_sg_table and then maps the scatterlist. This can be used as the dma_buf_ops.map_dma_buf callback. Should be used together with drm_gem_unmap_dma_buf().

Return

sg_table containing the scatterlist to be returned; returns ERR_PTR on error. May return -EINTR if it is interrupted by a signal.

voiddrm_gem_unmap_dma_buf(structdma_buf_attachment*attach,structsg_table*sgt,enumdma_data_directiondir)

unmap_dma_buf implementation for GEM

Parameters

structdma_buf_attachment*attach

attachment to unmap buffer from

structsg_table*sgt

scatterlist info of the buffer to unmap

enumdma_data_directiondir

direction of DMA transfer

Description

This can be used as thedma_buf_ops.unmap_dma_buf callback.

intdrm_gem_dmabuf_vmap(structdma_buf*dma_buf,structiosys_map*map)

dma_buf vmap implementation for GEM

Parameters

structdma_buf*dma_buf

buffer to be mapped

structiosys_map*map

the virtual address of the buffer

Description

Sets up a kernel virtual mapping. This can be used as the dma_buf_ops.vmap callback. Calls into drm_gem_object_funcs.vmap for device specific handling. The kernel virtual address is returned in map.

Returns 0 on success or a negative errno code otherwise.

voiddrm_gem_dmabuf_vunmap(structdma_buf*dma_buf,structiosys_map*map)

dma_buf vunmap implementation for GEM

Parameters

structdma_buf*dma_buf

buffer to be unmapped

structiosys_map*map

the virtual address of the buffer

Description

Releases a kernel virtual mapping. This can be used as the dma_buf_ops.vunmap callback. Calls into drm_gem_object_funcs.vunmap for device specific handling.

intdrm_gem_prime_mmap(structdrm_gem_object*obj,structvm_area_struct*vma)

PRIME mmap function for GEM drivers

Parameters

structdrm_gem_object*obj

GEM object

structvm_area_struct*vma

Virtual address range

Description

This function sets up a userspace mapping for PRIME exported buffers using the same codepath that is used for regular GEM buffer mapping on the DRM fd. The fake GEM offset is added to vma->vm_pgoff and drm_driver->fops->mmap is called to set up the mapping.

intdrm_gem_dmabuf_mmap(structdma_buf*dma_buf,structvm_area_struct*vma)

dma_buf mmap implementation for GEM

Parameters

structdma_buf*dma_buf

buffer to be mapped

structvm_area_struct*vma

virtual address range

Description

Provides memory mapping for the buffer. This can be used as the dma_buf_ops.mmap callback. It just forwards to drm_gem_prime_mmap().

Returns 0 on success or a negative error code on failure.

structsg_table*drm_prime_pages_to_sg(structdrm_device*dev,structpage**pages,unsignedintnr_pages)

converts a page array into an sg list

Parameters

structdrm_device*dev

DRM device

structpage**pages

pointer to the array of page pointers to convert

unsignedintnr_pages

length of the page vector

Description

This helper creates an sg table object from a set of pages; the driver is responsible for mapping the pages into the importer's address space for use with dma_buf itself.

This is useful for implementing drm_gem_object_funcs.get_sg_table.
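For instance, a driver that keeps a page array for each object could implement drm_gem_object_funcs.get_sg_table roughly as follows; struct my_bo and its pages member are hypothetical driver details:

    static struct sg_table *my_gem_get_sg_table(struct drm_gem_object *obj)
    {
            struct my_bo *bo = container_of(obj, struct my_bo, base);

            /* bo->pages was filled when the backing storage was allocated. */
            return drm_prime_pages_to_sg(obj->dev, bo->pages,
                                         obj->size >> PAGE_SHIFT);
    }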

unsignedlongdrm_prime_get_contiguous_size(structsg_table*sgt)

returns the contiguous size of the buffer

Parameters

structsg_table*sgt

sg_table describing the buffer to check

Description

This helper calculates the contiguous size in the DMA address space of the buffer described by the provided sg_table.

This is useful for implementing drm_driver.gem_prime_import_sg_table.

structdma_buf*drm_gem_prime_export(structdrm_gem_object*obj,intflags)

helper library implementation of the export callback

Parameters

structdrm_gem_object*obj

GEM object to export

intflags

flags like DRM_CLOEXEC and DRM_RDWR

Description

This is the implementation of the drm_gem_object_funcs.export functions for GEM drivers using the PRIME helpers. It is used as the default in drm_gem_prime_handle_to_fd().

booldrm_gem_is_prime_exported_dma_buf(structdrm_device*dev,structdma_buf*dma_buf)

checks if the DMA-BUF was exported from a GEM object belonging to dev.

Parameters

structdrm_device*dev

drm_device to check against

structdma_buf*dma_buf

dma-buf object to import

Return

true if the DMA-BUF was exported from a GEM object belonging to dev, false otherwise.

structdrm_gem_object*drm_gem_prime_import_dev(structdrm_device*dev,structdma_buf*dma_buf,structdevice*attach_dev)

core implementation of the import callback

Parameters

structdrm_device*dev

drm_device to import into

structdma_buf*dma_buf

dma-buf object to import

structdevice*attach_dev

structdevice to dma_buf attach

Description

This is the core of drm_gem_prime_import(). It's designed to be called by drivers who want to use a different device structure than drm_device.dev for attaching via dma_buf. This function calls drm_driver.gem_prime_import_sg_table internally.

Drivers must arrange to call drm_prime_gem_destroy() from their drm_gem_object_funcs.free hook when using this function.

structdrm_gem_object*drm_gem_prime_import(structdrm_device*dev,structdma_buf*dma_buf)

helper library implementation of the import callback

Parameters

structdrm_device*dev

drm_device to import into

structdma_buf*dma_buf

dma-buf object to import

Description

This is the implementation of the gem_prime_import functions for GEM drivers using the PRIME helpers. Drivers can use this as their drm_driver.gem_prime_import implementation. It is used as the default implementation in drm_gem_prime_fd_to_handle().

Drivers must arrange to call drm_prime_gem_destroy() from their drm_gem_object_funcs.free hook when using this function.

intdrm_prime_sg_to_page_array(structsg_table*sgt,structpage**pages,intmax_entries)

convert an sg table into a page array

Parameters

structsg_table*sgt

scatter-gather table to convert

structpage**pages

array of page pointers to store the pages in

intmax_entries

size of the passed-in array

Description

Exports an sg table into an array of pages.

This function is deprecated and its use is strongly discouraged. The page array is only useful for page faults, and those can corrupt fields in the struct page if they are not handled by the exporting driver.

intdrm_prime_sg_to_dma_addr_array(structsg_table*sgt,dma_addr_t*addrs,intmax_entries)

convert an sg table into a dma addr array

Parameters

structsg_table*sgt

scatter-gather table to convert

dma_addr_t*addrs

array to store the dma bus address of each page

intmax_entries

size of both the passed-in arrays

Description

Exports an sg table into an array of addresses.

Drivers should use this in their drm_driver.gem_prime_import_sg_table implementation.
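A rough sketch of such an implementation, with error unwinding omitted for brevity; my_bo_create() and the dma_addrs member are hypothetical driver details:

    static struct drm_gem_object *
    my_gem_prime_import_sg_table(struct drm_device *dev,
                                 struct dma_buf_attachment *attach,
                                 struct sg_table *sgt)
    {
            unsigned int npages = attach->dmabuf->size >> PAGE_SHIFT;
            struct my_bo *bo;
            int ret;

            bo = my_bo_create(dev, attach->dmabuf->size);
            if (IS_ERR(bo))
                    return ERR_CAST(bo);

            bo->dma_addrs = kvmalloc_array(npages, sizeof(*bo->dma_addrs),
                                           GFP_KERNEL);
            if (!bo->dma_addrs)
                    return ERR_PTR(-ENOMEM);

            /* Flatten the imported sg_table into per-page DMA addresses. */
            ret = drm_prime_sg_to_dma_addr_array(sgt, bo->dma_addrs, npages);
            if (ret < 0)
                    return ERR_PTR(ret);

            return &bo->base;
    }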

voiddrm_prime_gem_destroy(structdrm_gem_object*obj,structsg_table*sg)

helper to clean up a PRIME-imported GEM object

Parameters

structdrm_gem_object*obj

GEM object which was created from a dma-buf

structsg_table*sg

the sg-table which was pinned at import time

Description

This is the cleanup function which GEM drivers need to call when they use drm_gem_prime_import() or drm_gem_prime_import_dev() to import dma-bufs.

DRM MM Range Allocator

Overview

drm_mm provides a simple range allocator. The drivers are free to use the resource allocator from the linux core if it suits them, but the upside of drm_mm is that it's in the DRM core, which means that it's easier to extend for some of the crazier special purpose needs of gpus.

The main data structure is drm_mm; allocations are tracked in drm_mm_node. Drivers are free to embed either of them into their own suitable data structures. drm_mm itself will not do any memory allocations of its own, so if drivers choose not to embed nodes they need to still allocate them themselves.

The range allocator also supports reservation of preallocated blocks. This is useful for taking over initial mode setting configurations from the firmware, where an object needs to be created which exactly matches the firmware's scanout target. As long as the range is still free it can be inserted anytime after the allocator is initialized, which helps with avoiding looped dependencies in the driver load sequence.

drm_mm maintains a stack of most recently freed holes, which of all simplistic data structures seems to be a fairly decent approach to clustering allocations and avoiding too much fragmentation. This means free space searches are O(num_holes). Given all the fancy features drm_mm supports, something better would be fairly complex, and since gfx thrashing is a fairly steep cliff anyway this is not a real concern. Removing a node again is O(1).

drm_mm supports a few features: Alignment and range restrictions can be supplied. Furthermore every drm_mm_node has a color value (which is just an opaque unsigned long) which in conjunction with a driver callback can be used to implement sophisticated placement restrictions. The i915 DRM driver uses this to implement guard pages between incompatible caching domains in the graphics TT.

Two behaviors are supported for searching and allocating: bottom-up andtop-down. The default is bottom-up. Top-down allocation can be used if thememory area has different restrictions, or just to reduce fragmentation.

Finally, iteration helpers to walk all nodes and all holes are provided, as are some basic allocator dumpers for debugging.

Note that this range allocator is not thread-safe; drivers need to protect modifications with their own locking. The idea behind this is that for a full memory manager additional data needs to be protected anyway, hence internal locking would be fully redundant.
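A minimal sketch of the usual embedding pattern, with a driver-provided mutex supplying the external locking; struct my_vram_mgr is a hypothetical wrapper and all sizes here are in pages:

    struct my_vram_mgr {
            struct drm_mm mm;
            struct mutex lock;
    };

    static void my_vram_mgr_init(struct my_vram_mgr *mgr, u64 npages)
    {
            mutex_init(&mgr->lock);
            /* The structure must already be zeroed before drm_mm_init(). */
            drm_mm_init(&mgr->mm, 0, npages);
    }

    static int my_vram_alloc(struct my_vram_mgr *mgr, struct drm_mm_node *node,
                             u64 npages)
    {
            int ret;

            memset(node, 0, sizeof(*node));
            mutex_lock(&mgr->lock);
            ret = drm_mm_insert_node_generic(&mgr->mm, node, npages, 0, 0,
                                             DRM_MM_INSERT_BEST);
            mutex_unlock(&mgr->lock);
            return ret;
    }

    static void my_vram_free(struct my_vram_mgr *mgr, struct drm_mm_node *node)
    {
            mutex_lock(&mgr->lock);
            drm_mm_remove_node(node);
            mutex_unlock(&mgr->lock);
    }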

LRU Scan/Eviction Support

Very often GPUs need contiguous allocations for a given object. When evicting objects to make space for a new one, it is therefore not most efficient to simply select objects from the tail of an LRU until there's a suitable hole: especially for big objects or nodes that otherwise have special allocation constraints, there's a good chance we evict lots of (smaller) objects unnecessarily.

The DRM range allocator supports this use-case through the scanning interfaces. First a scan operation needs to be initialized with drm_mm_scan_init() or drm_mm_scan_init_with_range(). The driver adds objects to the roster, probably by walking an LRU list, but this can be freely implemented. Eviction candidates are added using drm_mm_scan_add_block() until a suitable hole is found or there are no further evictable objects. Eviction roster metadata is tracked in struct drm_mm_scan.

The driver must walk through all objects again in exactly the reverse order to restore the allocator state. Note that while the allocator is used in the scan mode no other operation is allowed.

Finally the driver evicts all objects selected (drm_mm_scan_remove_block() reported true) in the scan, and any overlapping nodes after color adjustment (drm_mm_scan_color_evict()). Adding and removing an object is O(1), and since freeing a node is also O(1) the overall complexity is O(scanned_objects). So like the free stack which needs to be walked before a scan operation even begins, this is linear in the number of objects. It doesn't seem to hurt too badly.
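Putting the above steps together, a driver's eviction path might look roughly like the following sketch; struct my_bo, the mgr->lru list and my_bo_evict() are hypothetical driver constructs, and my_vram_mgr is the wrapper assumed in the earlier sketch:

    static int my_evict_for(struct my_vram_mgr *mgr, u64 npages, u64 align)
    {
            struct drm_mm_scan scan;
            struct drm_mm_node *evict_node;
            struct my_bo *bo, *tmp;
            LIST_HEAD(evict_list);
            bool found = false;

            drm_mm_scan_init(&scan, &mgr->mm, npages, align, 0,
                             DRM_MM_INSERT_EVICT);

            /* Build the eviction roster from the LRU until a hole shows up. */
            list_for_each_entry(bo, &mgr->lru, lru_link) {
                    list_add(&bo->evict_link, &evict_list);
                    if (drm_mm_scan_add_block(&scan, &bo->node)) {
                            found = true;
                            break;
                    }
            }

            /* Walk the roster in reverse order of addition to restore the
             * allocator state; keep only the blocks selected for eviction. */
            list_for_each_entry_safe(bo, tmp, &evict_list, evict_link) {
                    if (!drm_mm_scan_remove_block(&scan, &bo->node))
                            list_del_init(&bo->evict_link);
            }

            if (!found)
                    return -ENOSPC;

            /* Actually evict the selected objects, freeing their nodes. */
            list_for_each_entry_safe(bo, tmp, &evict_list, evict_link)
                    my_bo_evict(bo);

            /* Evict nodes overlapping the hole after color adjustment. */
            while ((evict_node = drm_mm_scan_color_evict(&scan)))
                    my_bo_evict(container_of(evict_node, struct my_bo, node));

            return 0;
    }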

DRM MM Range Allocator Function References

enumdrm_mm_insert_mode

control search and allocation behaviour

Constants

DRM_MM_INSERT_BEST

Search for the smallest hole (within the search range) that fits the desired node.

Allocates the node from the bottom of the found hole.

DRM_MM_INSERT_LOW

Search for the lowest hole (address closest to 0, within the search range) that fits the desired node.

Allocates the node from the bottom of the found hole.

DRM_MM_INSERT_HIGH

Search for the highest hole (address closest to U64_MAX, within the search range) that fits the desired node.

Allocates the node from the top of the found hole. The specified alignment for the node is applied to the base of the node (drm_mm_node.start).

DRM_MM_INSERT_EVICT

Search for the most recently evicted hole (within the search range) that fits the desired node. This is appropriate for use immediately after performing an eviction scan (see drm_mm_scan_init()) and removing the selected nodes to form a hole.

Allocates the node from the bottom of the found hole.

DRM_MM_INSERT_ONCE

Only check the first hole for suitability and report -ENOSPC immediately otherwise, rather than check every hole until a suitable one is found. Can only be used in conjunction with another search method such as DRM_MM_INSERT_HIGH or DRM_MM_INSERT_LOW.

DRM_MM_INSERT_HIGHEST

Only check the highest hole (the hole with the largest address) and insert the node at the top of the hole, or report -ENOSPC if unsuitable.

Does not search all holes.

DRM_MM_INSERT_LOWEST

Only check the lowest hole (the hole with the smallest address) and insert the node at the bottom of the hole, or report -ENOSPC if unsuitable.

Does not search all holes.

Description

The struct drm_mm range manager supports finding a suitable hole using a number of search trees. These trees are organised by size, by address and in most recent eviction order. This allows the user to find either the smallest hole to reuse, the lowest or highest address to reuse, or simply reuse the most recent eviction that fits. When allocating the drm_mm_node from within the hole, the drm_mm_insert_mode also dictates whether to allocate the lowest matching address or the highest.

structdrm_mm_node

allocated block in the DRM allocator

Definition:

struct drm_mm_node {    unsigned long color;    u64 start;    u64 size;};

Members

color

Opaque driver-private tag.

start

Start address of the allocated block.

size

Size of the allocated block.

Description

This represents an allocated block in a drm_mm allocator. Except for pre-reserved nodes inserted using drm_mm_reserve_node() the structure is entirely opaque and should only be accessed through the provided functions. Since allocation of these nodes is entirely handled by the driver they can be embedded.

structdrm_mm

DRM allocator

Definition:

struct drm_mm {    void (*color_adjust)(const struct drm_mm_node *node, unsigned long color, u64 *start, u64 *end);};

Members

color_adjust

Optional driver callback to further apply restrictions on a hole. The node argument points at the node containing the hole from which the block would be allocated (see drm_mm_hole_follows() and friends). The other arguments are the size of the block to be allocated. The driver can adjust the start and end as needed to e.g. insert guard pages.

Description

DRM range allocator with a few special functions and features geared towards managing GPU memory. Except for the color_adjust callback the structure is entirely opaque and should only be accessed through the provided functions and macros. This structure can be embedded into larger driver structures.

structdrm_mm_scan

DRM allocator eviction roster data

Definition:

struct drm_mm_scan {};

Members

Description

This structure tracks data needed for the eviction roster set up using drm_mm_scan_init(), and used with drm_mm_scan_add_block() and drm_mm_scan_remove_block(). The structure is entirely opaque and should only be accessed through the provided functions and macros. It is meant to be allocated temporarily by the driver on the stack.

booldrm_mm_node_allocated(conststructdrm_mm_node*node)

checks whether a node is allocated

Parameters

conststructdrm_mm_node*node

drm_mm_node to check

Description

Drivers are required to clear a node prior to using it with the drm_mm range manager.

Drivers should use this helper for proper encapsulation of drm_mm internals.

Return

True if the node is allocated.

booldrm_mm_initialized(conststructdrm_mm*mm)

checks whether an allocator is initialized

Parameters

conststructdrm_mm*mm

drm_mm to check

Description

Drivers should clear the struct drm_mm prior to initialisation if they want to use this function.

Drivers should use this helper for proper encapsulation of drm_mm internals.

Return

True if the mm is initialized.

booldrm_mm_hole_follows(conststructdrm_mm_node*node)

checks whether a hole follows this node

Parameters

conststructdrm_mm_node*node

drm_mm_node to check

Description

Holes are embedded into the drm_mm using the tail of a drm_mm_node. If you wish to know whether a hole follows this particular node, query this function. See also drm_mm_hole_node_start() and drm_mm_hole_node_end().

Return

True if a hole follows the node.

u64drm_mm_hole_node_start(conststructdrm_mm_node*hole_node)

computes the start of the hole following node

Parameters

conststructdrm_mm_node*hole_node

drm_mm_node which implicitly tracks the following hole

Description

This is useful for driver-specific debug dumpers. Otherwise drivers should not inspect holes themselves. Drivers must check first whether a hole indeed follows by looking at drm_mm_hole_follows().

Return

Start of the subsequent hole.

u64drm_mm_hole_node_end(conststructdrm_mm_node*hole_node)

computes the end of the hole following node

Parameters

conststructdrm_mm_node*hole_node

drm_mm_node which implicitly tracks the following hole

Description

This is useful for driver-specific debug dumpers. Otherwise drivers should not inspect holes themselves. Drivers must check first whether a hole indeed follows by looking at drm_mm_hole_follows().

Return

End of the subsequent hole.

drm_mm_nodes

drm_mm_nodes(mm)

list of nodes under the drm_mm range manager

Parameters

mm

the struct drm_mm range manager

Description

As the drm_mm range manager hides its node_list deep within its structure, extracting it looks painful and repetitive. This is not expected to be used outside of the drm_mm_for_each_node() macros and similar internal functions.

Return

The node list, may be empty.

drm_mm_for_each_node

drm_mm_for_each_node(entry,mm)

iterator to walk over all allocated nodes

Parameters

entry

structdrm_mm_node to assign to in each iteration step

mm

drm_mm allocator to walk

Description

This iterator walks over all nodes in the range allocator. It is implemented with list_for_each(), so it is not safe against removal of elements.

drm_mm_for_each_node_safe

drm_mm_for_each_node_safe(entry,next,mm)

iterator to walk over all allocated nodes

Parameters

entry

structdrm_mm_node to assign to in each iteration step

next

structdrm_mm_node to store the next step

mm

drm_mm allocator to walk

Description

This iterator walks over all nodes in the range allocator. It is implemented with list_for_each_safe(), so it is safe against removal of elements.

drm_mm_for_each_hole

drm_mm_for_each_hole(pos,mm,hole_start,hole_end)

iterator to walk over all holes

Parameters

pos

drm_mm_node used internally to track progress

mm

drm_mm allocator to walk

hole_start

ulong variable to assign the hole start to on each iteration

hole_end

ulong variable to assign the hole end to on each iteration

Description

This iterator walks over all holes in the range allocator. It is implemented with list_for_each(), so it is not safe against removal of elements. pos is used internally and will not reflect a real drm_mm_node for the very first hole; hence users of this iterator may not access it.

Implementation Note: We need to inline list_for_each_entry in order to be able to set hole_start and hole_end on each iteration while keeping the macro sane.

intdrm_mm_insert_node_generic(structdrm_mm*mm,structdrm_mm_node*node,u64size,u64alignment,unsignedlongcolor,enumdrm_mm_insert_modemode)

search for space and insert node

Parameters

structdrm_mm*mm

drm_mm to allocate from

structdrm_mm_node*node

preallocate node to insert

u64size

size of the allocation

u64alignment

alignment of the allocation

unsignedlongcolor

opaque tag value to use for this node

enumdrm_mm_insert_modemode

fine-tune the allocation search and placement

Description

This is a simplified version of drm_mm_insert_node_in_range() with no range restrictions applied.

The preallocated node must be cleared to 0.

Return

0 on success, -ENOSPC if there’s no suitable hole.

intdrm_mm_insert_node(structdrm_mm*mm,structdrm_mm_node*node,u64size)

search for space and insert node

Parameters

structdrm_mm*mm

drm_mm to allocate from

structdrm_mm_node*node

preallocate node to insert

u64size

size of the allocation

Description

This is a simplified version of drm_mm_insert_node_generic() with color set to 0.

The preallocated node must be cleared to 0.

Return

0 on success, -ENOSPC if there’s no suitable hole.

booldrm_mm_clean(conststructdrm_mm*mm)

checks whether an allocator is clean

Parameters

conststructdrm_mm*mm

drm_mm allocator to check

Return

True if the allocator is completely free, false if there's still a node allocated in it.

drm_mm_for_each_node_in_range

drm_mm_for_each_node_in_range(node__,mm__,start__,end__)

iterator to walk over a range of allocated nodes

Parameters

node__

drm_mm_node structure to assign to in each iteration step

mm__

drm_mm allocator to walk

start__

starting offset, the first node will overlap this

end__

ending offset, the last node will start before this (but may overlap)

Description

This iterator walks over all nodes in the range allocator that lie between start and end. It is implemented similarly to list_for_each(), but using the internal interval tree to accelerate the search for the starting node, and so is not safe against removal of elements. It assumes that end is within (or is the upper limit of) the drm_mm allocator. If [start, end] are beyond the range of the drm_mm, the iterator may walk over the special _unallocated_ drm_mm.head_node, and may even continue indefinitely.

voiddrm_mm_scan_init(structdrm_mm_scan*scan,structdrm_mm*mm,u64size,u64alignment,unsignedlongcolor,enumdrm_mm_insert_modemode)

initialize lru scanning

Parameters

structdrm_mm_scan*scan

scan state

structdrm_mm*mm

drm_mm to scan

u64size

size of the allocation

u64alignment

alignment of the allocation

unsignedlongcolor

opaque tag value to use for the allocation

enumdrm_mm_insert_modemode

fine-tune the allocation search and placement

Description

This is a simplified version of drm_mm_scan_init_with_range() with no range restrictions applied.

This simply sets up the scanning routines with the parameters for the desired hole.

Warning: As long as the scan list is non-empty, no other operations than adding/removing nodes to/from the scan list are allowed.

intdrm_mm_reserve_node(structdrm_mm*mm,structdrm_mm_node*node)

insert a pre-initialized node

Parameters

structdrm_mm*mm

drm_mm allocator to insert node into

structdrm_mm_node*node

drm_mm_node to insert

Description

This function inserts an already set-up drm_mm_node into the allocator, meaning that start, size and color must be set by the caller. All other fields must be cleared to 0. This is useful to initialize the allocator with preallocated objects which must be set up before the range allocator can be set up, e.g. when taking over a firmware framebuffer.

Return

0 on success, -ENOSPC if there's no hole where node is.
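For example, taking over a firmware framebuffer range might look like this sketch; external locking and the my_* names are the driver's own, and all units are in whatever the allocator was initialized with:

    static int my_reserve_firmware_fb(struct drm_mm *mm, struct drm_mm_node *node,
                                      u64 fb_start, u64 fb_pages)
    {
            /* All fields other than start, size and color must be zero. */
            memset(node, 0, sizeof(*node));
            node->start = fb_start;
            node->size = fb_pages;

            /* Returns -ENOSPC if the range is no longer free. */
            return drm_mm_reserve_node(mm, node);
    }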

intdrm_mm_insert_node_in_range(structdrm_mm*constmm,structdrm_mm_node*constnode,u64size,u64alignment,unsignedlongcolor,u64range_start,u64range_end,enumdrm_mm_insert_modemode)

ranged search for space and insert node

Parameters

structdrm_mm*constmm

drm_mm to allocate from

structdrm_mm_node*constnode

preallocate node to insert

u64size

size of the allocation

u64alignment

alignment of the allocation

unsignedlongcolor

opaque tag value to use for this node

u64range_start

start of the allowed range for this node

u64range_end

end of the allowed range for this node

enumdrm_mm_insert_modemode

fine-tune the allocation search and placement

Description

The preallocated node must be cleared to 0.

Return

0 on success, -ENOSPC if there’s no suitable hole.

voiddrm_mm_remove_node(structdrm_mm_node*node)

Remove a memory node from the allocator.

Parameters

structdrm_mm_node*node

drm_mm_node to remove

Description

This just removes a node from its drm_mm allocator. The node does not need to be cleared again before it can be re-inserted into this or any other drm_mm allocator. It is a bug to call this function on an unallocated node.

voiddrm_mm_scan_init_with_range(structdrm_mm_scan*scan,structdrm_mm*mm,u64size,u64alignment,unsignedlongcolor,u64start,u64end,enumdrm_mm_insert_modemode)

initialize range-restricted lru scanning

Parameters

structdrm_mm_scan*scan

scan state

structdrm_mm*mm

drm_mm to scan

u64size

size of the allocation

u64alignment

alignment of the allocation

unsignedlongcolor

opaque tag value to use for the allocation

u64start

start of the allowed range for the allocation

u64end

end of the allowed range for the allocation

enumdrm_mm_insert_modemode

fine-tune the allocation search and placement

Description

This simply sets up the scanning routines with the parameters for the desired hole.

Warning: As long as the scan list is non-empty, no other operations than adding/removing nodes to/from the scan list are allowed.

booldrm_mm_scan_add_block(structdrm_mm_scan*scan,structdrm_mm_node*node)

add a node to the scan list

Parameters

structdrm_mm_scan*scan

the active drm_mm scanner

structdrm_mm_node*node

drm_mm_node to add

Description

Add a node to the scan list that might be freed to make space for the desired hole.

Return

True if a hole has been found, false otherwise.

booldrm_mm_scan_remove_block(structdrm_mm_scan*scan,structdrm_mm_node*node)

remove a node from the scan list

Parameters

structdrm_mm_scan*scan

the active drm_mm scanner

structdrm_mm_node*node

drm_mm_node to remove

Description

Nodes must be removed in exactly the reverse order from the scan list as they have been added (e.g. using list_add() as they are added and then list_for_each() over that eviction list to remove), otherwise the internal state of the memory manager will be corrupted.

When the scan list is empty, the selected memory nodes can be freed. An immediately following drm_mm_insert_node_in_range_generic() or one of the simpler versions of that function with !DRM_MM_SEARCH_BEST will then return the just freed block (because it's at the top of the free_stack list).

Return

True if this block should be evicted, false otherwise. Will always return false when no hole has been found.

structdrm_mm_node*drm_mm_scan_color_evict(structdrm_mm_scan*scan)

evict overlapping nodes on either side of hole

Parameters

structdrm_mm_scan*scan

drm_mm scan with target hole

Description

After completing an eviction scan and removing the selected nodes, we may need to remove a few more nodes from either side of the target hole if mm.color_adjust is being used.

Return

A node to evict, or NULL if there are no overlapping nodes.

voiddrm_mm_init(structdrm_mm*mm,u64start,u64size)

initialize a drm-mm allocator

Parameters

structdrm_mm*mm

the drm_mm structure to initialize

u64start

start of the range managed by mm

u64size

size of the range managed by mm

Description

Note that mm must be cleared to 0 before calling this function.

voiddrm_mm_takedown(structdrm_mm*mm)

clean up a drm_mm allocator

Parameters

structdrm_mm*mm

drm_mm allocator to clean up

Description

Note that it is a bug to call this function on an allocator which is notclean.

voiddrm_mm_print(conststructdrm_mm*mm,structdrm_printer*p)

print allocator state

Parameters

conststructdrm_mm*mm

drm_mm allocator to print

structdrm_printer*p

DRM printer to use

DRM GPUVM

Overview

The DRM GPU VA Manager, represented by struct drm_gpuvm, keeps track of a GPU's virtual address (VA) space and manages the corresponding virtual mappings represented by drm_gpuva objects. It also keeps track of the mapping's backing drm_gem_object buffers.

drm_gem_object buffers maintain a list of drm_gpuva objects representing all existing GPU VA mappings using this drm_gem_object as backing buffer.

GPU VAs can be flagged as sparse, such that drivers may use GPU VAs to also keep track of sparse PTEs in order to support Vulkan 'Sparse Resources'.

The GPU VA manager internally uses a rb-tree to manage the drm_gpuva mappings within a GPU's virtual address space.

The drm_gpuvm structure contains a special drm_gpuva representing the portion of VA space reserved by the kernel. This node is initialized together with the GPU VA manager instance and removed when the GPU VA manager is destroyed.

In a typical application drivers would embed struct drm_gpuvm and struct drm_gpuva within their own driver specific structures; there won't be any memory allocations of its own nor memory allocations of drm_gpuva entries.

The data structures needed to store drm_gpuvas within the drm_gpuvm are contained within struct drm_gpuva already. Hence, for inserting drm_gpuva entries from within dma-fence signalling critical sections it is enough to pre-allocate the drm_gpuva structures.

drm_gem_objects which are private to a single VM can share a common dma_resv in order to improve locking efficiency (e.g. with drm_exec). For this purpose drivers must pass a drm_gem_object to drm_gpuvm_init(), in the following called 'resv object', which serves as the container of the GPUVM's shared dma_resv. This resv object can be a driver specific drm_gem_object, such as the drm_gem_object containing the root page table, but it can also be a 'dummy' object, which can be allocated with drm_gpuvm_resv_object_alloc().

In order to connect a struct drm_gpuva to its backing drm_gem_object each drm_gem_object maintains a list of drm_gpuvm_bo structures, and each drm_gpuvm_bo contains a list of drm_gpuva structures.

A drm_gpuvm_bo is an abstraction that represents a combination of a drm_gpuvm and a drm_gem_object. Every such combination should be unique. This is ensured by the API through drm_gpuvm_bo_obtain() and drm_gpuvm_bo_obtain_prealloc() which first look into the corresponding drm_gem_object list of drm_gpuvm_bos for an existing instance of this particular combination. If not present, a new instance is created and linked to the drm_gem_object.

drm_gpuvm_bo structures, since unique for a given drm_gpuvm, are also used as entry for the drm_gpuvm's lists of external and evicted objects. Those lists are maintained in order to accelerate locking of dma-resv locks and validation of evicted objects bound in a drm_gpuvm. For instance, all drm_gem_object's dma_resv of a given drm_gpuvm can be locked by calling drm_gpuvm_exec_lock(). Once locked, drivers can call drm_gpuvm_validate() in order to validate all evicted drm_gem_objects. It is also possible to lock additional drm_gem_objects by providing the corresponding parameters to drm_gpuvm_exec_lock() as well as open code the drm_exec loop while making use of helper functions such as drm_gpuvm_prepare_range() or drm_gpuvm_prepare_objects().

Every bound drm_gem_object is treated as an external object when its dma_resv structure is different from the drm_gpuvm's common dma_resv structure.

Split and Merge

Besides its capability to manage and represent a GPU VA space, the GPU VA manager also provides functions to let the drm_gpuvm calculate a sequence of operations to satisfy a given map or unmap request.

Therefore the DRM GPU VA manager provides an algorithm implementing splitting and merging of existing GPU VA mappings with the ones that are requested to be mapped or unmapped. This feature is required by the Vulkan API to implement Vulkan 'Sparse Memory Bindings' - drivers UAPIs often refer to this as VM BIND.

Drivers can call drm_gpuvm_sm_map() to receive a sequence of callbacks containing map, unmap and remap operations for a given newly requested mapping. The sequence of callbacks represents the set of operations to execute in order to integrate the new mapping cleanly into the current state of the GPU VA space.

Depending on how the new GPU VA mapping intersects with the existing mappings of the GPU VA space, the drm_gpuvm_ops callbacks contain an arbitrary amount of unmap operations, a maximum of two remap operations and a single map operation. The caller might receive no callback at all if no operation is required, e.g. if the requested mapping already exists in the exact same way.

The single map operation represents the original map operation requested by the caller.

drm_gpuva_op_unmap contains a 'keep' field, which indicates whether the drm_gpuva to unmap is physically contiguous with the original mapping request. Optionally, if 'keep' is set, drivers may keep the actual page table entries for this drm_gpuva, adding the missing page table entries only, and update the drm_gpuvm's view of things accordingly.

Drivers may do the same optimization, namely delta page table updates, also for remap operations. This is possible since drm_gpuva_op_remap consists of one unmap operation and one or two map operations, such that drivers can derive the page table update delta accordingly.

Note that there can't be more than two existing mappings to split up, one at the beginning and one at the end of the new mapping, hence there is a maximum of two remap operations.

Analogous to drm_gpuvm_sm_map(), drm_gpuvm_sm_unmap() uses drm_gpuvm_ops to call back into the driver in order to unmap a range of GPU VA space. The logic behind this function is way simpler though: for all existing mappings enclosed by the given range, unmap operations are created. For mappings which are only partially located within the given range, remap operations are created such that those mappings are split up and re-mapped partially.

As an alternative to drm_gpuvm_sm_map() and drm_gpuvm_sm_unmap(), drm_gpuvm_sm_map_ops_create() and drm_gpuvm_sm_unmap_ops_create() can be used to directly obtain an instance of struct drm_gpuva_ops containing a list of drm_gpuva_op, which can be iterated with drm_gpuva_for_each_op(). This list contains the drm_gpuva_ops analogous to the callbacks one would receive when calling drm_gpuvm_sm_map() or drm_gpuvm_sm_unmap(). While this way requires more memory (to allocate the drm_gpuva_ops), it provides drivers a way to iterate the drm_gpuva_op multiple times, e.g. once in a context where memory allocations are possible (e.g. to allocate GPU page tables) and once in the dma-fence signalling critical path.

To update the drm_gpuvm's view of the GPU VA space, drm_gpuva_insert() and drm_gpuva_remove() may be used. These functions can safely be used from drm_gpuvm_ops callbacks originating from drm_gpuvm_sm_map() or drm_gpuvm_sm_unmap(). However, it might be more convenient to use the provided helper functions drm_gpuva_map(), drm_gpuva_remap() and drm_gpuva_unmap() instead.

The following diagram depicts the basic relationships of existing GPU VA mappings, a newly requested mapping and the resulting mappings as implemented by drm_gpuvm_sm_map() - it doesn't cover any arbitrary combinations of these.

  1. Requested mapping is identical. Replace it, but indicate the backing PTEs
     could be kept.

          0     a     1
     old: |-----------| (bo_offset=n)

          0     a     1
     req: |-----------| (bo_offset=n)

          0     a     1
     new: |-----------| (bo_offset=n)

  2. Requested mapping is identical, except for the BO offset, hence replace
     the mapping.

          0     a     1
     old: |-----------| (bo_offset=n)

          0     a     1
     req: |-----------| (bo_offset=m)

          0     a     1
     new: |-----------| (bo_offset=m)

  3. Requested mapping is identical, except for the backing BO, hence replace
     the mapping.

          0     a     1
     old: |-----------| (bo_offset=n)

          0     b     1
     req: |-----------| (bo_offset=n)

          0     b     1
     new: |-----------| (bo_offset=n)

  4. Existent mapping is a left aligned subset of the requested one, hence
     replace the existing one.

          0  a  1
     old: |-----|       (bo_offset=n)

          0     a     2
     req: |-----------| (bo_offset=n)

          0     a     2
     new: |-----------| (bo_offset=n)

     Note

     We expect to see the same result for a request with a different BO
     and/or non-contiguous BO offset.

  5. Requested mapping's range is a left aligned subset of the existing one,
     but backed by a different BO. Hence, map the requested mapping and split
     the existing one, adjusting its BO offset.

          0     a     2
     old: |-----------| (bo_offset=n)

          0  b  1
     req: |-----|       (bo_offset=n)

          0  b  1  a' 2
     new: |-----|-----| (b.bo_offset=n, a.bo_offset=n+1)

     Note

     We expect to see the same result for a request with a different BO
     and/or non-contiguous BO offset.

  6. Existent mapping is a superset of the requested mapping. Split it up, but
     indicate that the backing PTEs could be kept.

          0     a     2
     old: |-----------| (bo_offset=n)

          0  a  1
     req: |-----|       (bo_offset=n)

          0  a  1  a' 2
     new: |-----|-----| (a.bo_offset=n, a'.bo_offset=n+1)

  7. Requested mapping's range is a right aligned subset of the existing one,
     but backed by a different BO. Hence, map the requested mapping and split
     the existing one, without adjusting the BO offset.

          0     a     2
     old: |-----------| (bo_offset=n)

                1  b  2
     req:       |-----| (bo_offset=m)

          0  a  1  b  2
     new: |-----|-----| (a.bo_offset=n, b.bo_offset=m)

  8. Existent mapping is a superset of the requested mapping. Split it up, but
     indicate that the backing PTEs could be kept.

          0     a     2
     old: |-----------| (bo_offset=n)

                1  a  2
     req:       |-----| (bo_offset=n+1)

          0  a' 1  a  2
     new: |-----|-----| (a'.bo_offset=n, a.bo_offset=n+1)

  9. Existent mapping is overlapped at the end by the requested mapping backed
     by a different BO. Hence, map the requested mapping and split up the
     existing one, without adjusting the BO offset.

          0     a     2
     old: |-----------|       (bo_offset=n)

                1     b     3
     req:       |-----------| (bo_offset=m)

          0  a  1     b     3
     new: |-----|-----------| (a.bo_offset=n, b.bo_offset=m)

  10. Existent mapping is overlapped by the requested mapping, both having the
      same backing BO with a contiguous offset. Indicate the backing PTEs of
      the old mapping could be kept.

           0     a     2
      old: |-----------|       (bo_offset=n)

                 1     a     3
      req:       |-----------| (bo_offset=n+1)

           0  a' 1     a     3
      new: |-----|-----------| (a'.bo_offset=n, a.bo_offset=n+1)

  11. Requested mapping's range is a centered subset of the existing one
      having a different backing BO. Hence, map the requested mapping and
      split up the existing one in two mappings, adjusting the BO offset of
      the right one accordingly.

           0        a        3
      old: |-----------------| (bo_offset=n)

                 1  b  2
      req:       |-----|       (bo_offset=m)

           0  a  1  b  2  a' 3
      new: |-----|-----|-----| (a.bo_offset=n, b.bo_offset=m, a'.bo_offset=n+2)

  12. Requested mapping is a contiguous subset of the existing one. Split it
      up, but indicate that the backing PTEs could be kept.

           0        a        3
      old: |-----------------| (bo_offset=n)

                 1  a  2
      req:       |-----|       (bo_offset=n+1)

           0  a' 1  a  2 a'' 3
      new: |-----|-----|-----| (a'.bo_offset=n, a.bo_offset=n+1, a''.bo_offset=n+2)

  13. Existent mapping is a right aligned subset of the requested one, hence
      replace the existing one.

                 1  a  2
      old:       |-----| (bo_offset=n+1)

           0     a     2
      req: |-----------| (bo_offset=n)

           0     a     2
      new: |-----------| (bo_offset=n)

      Note

      We expect to see the same result for a request with a different bo
      and/or non-contiguous bo_offset.

  14. Existent mapping is a centered subset of the requested one, hence
      replace the existing one.

                 1  a  2
      old:       |-----| (bo_offset=n+1)

           0        a       3
      req: |----------------| (bo_offset=n)

           0        a       3
      new: |----------------| (bo_offset=n)

      Note

      We expect to see the same result for a request with a different bo
      and/or non-contiguous bo_offset.

  15. Existent mapping is overlapped at the beginning by the requested mapping
      backed by a different BO. Hence, map the requested mapping and split up
      the existing one, adjusting its BO offset accordingly.

                 1     a     3
      old:       |-----------| (bo_offset=n)

           0     b     2
      req: |-----------|       (bo_offset=m)

           0     b     2  a' 3
      new: |-----------|-----| (b.bo_offset=m, a.bo_offset=n+2)

Locking

In terms of managing drm_gpuva entries DRM GPUVM does not take care of locking itself; it is the driver's responsibility to take care of locking. Drivers might want to protect the following operations: inserting, removing and iterating drm_gpuva objects as well as generating all kinds of operations, such as split / merge or prefetch.

DRM GPUVM also does not take care of the locking of the backing drm_gem_object buffers' GPU VA lists and drm_gpuvm_bo abstractions by itself; drivers are responsible to enforce mutual exclusion using either the GEM's dma_resv lock or the GEM's gpuva.lock mutex.

However, DRM GPUVM contains lockdep checks to ensure callers of its API hold the corresponding lock whenever the drm_gem_object's GPU VA list is accessed by functions such as drm_gpuva_link() or drm_gpuva_unlink(), but also drm_gpuvm_bo_obtain() and drm_gpuvm_bo_put().

The latter is required since on creation and destruction of a drm_gpuvm_bo the drm_gpuvm_bo is attached / removed from the drm_gem_object's gpuva list. Subsequent calls to drm_gpuvm_bo_obtain() for the same drm_gpuvm and drm_gem_object must be able to observe previous creations and destructions of drm_gpuvm_bos in order to keep instances unique.

The drm_gpuvm's lists for keeping track of external and evicted objects are protected against concurrent insertion / removal and iteration internally.

However, drivers still need to ensure that concurrent calls to functions iterating those lists, namely drm_gpuvm_prepare_objects() and drm_gpuvm_validate(), are protected.

Alternatively, drivers can set the DRM_GPUVM_RESV_PROTECTED flag to indicate that the corresponding dma_resv locks are held in order to protect the lists. If DRM_GPUVM_RESV_PROTECTED is set, internal locking is disabled and the corresponding lockdep checks are enabled. This is an optimization for drivers which are capable of taking the corresponding dma_resv locks and hence do not require internal locking.

Examples

This section gives two examples on how to let the DRM GPUVA Manager generate drm_gpuva_op in order to satisfy a given map or unmap request and how to make use of them.

The below code is strictly limited to illustrate the generic usage pattern. To maintain simplicity, it doesn't make use of any abstractions for common code, different (asynchronous) stages with fence signalling critical paths, any other helpers or error handling in terms of freeing memory and dropping previously taken locks.

  1. Obtain a list of drm_gpuva_op to create a new mapping:

// Allocates a new &drm_gpuva.
struct drm_gpuva * driver_gpuva_alloc(void);

// Typically drivers would embed the &drm_gpuvm and &drm_gpuva
// structure in individual driver structures and lock the dma-resv with
// drm_exec or similar helpers.
int driver_mapping_create(struct drm_gpuvm *gpuvm,
                          u64 addr, u64 range,
                          struct drm_gem_object *obj, u64 offset)
{
        struct drm_gpuvm_map_req map_req = {
                .map.va.addr = addr,
                .map.va.range = range,
                .map.gem.obj = obj,
                .map.gem.offset = offset,
        };
        struct drm_gpuva_ops *ops;
        struct drm_gpuva_op *op;
        struct drm_gpuvm_bo *vm_bo;

        driver_lock_va_space();
        ops = drm_gpuvm_sm_map_ops_create(gpuvm, &map_req);
        if (IS_ERR(ops))
                return PTR_ERR(ops);

        vm_bo = drm_gpuvm_bo_obtain(gpuvm, obj);
        if (IS_ERR(vm_bo))
                return PTR_ERR(vm_bo);

        drm_gpuva_for_each_op(op, ops) {
                struct drm_gpuva *va;

                switch (op->op) {
                case DRM_GPUVA_OP_MAP:
                        va = driver_gpuva_alloc();
                        if (!va)
                                ; // unwind previous VA space updates,
                                  // free memory and unlock

                        driver_vm_map();
                        drm_gpuva_map(gpuvm, va, &op->map);
                        drm_gpuva_link(va, vm_bo);

                        break;
                case DRM_GPUVA_OP_REMAP: {
                        struct drm_gpuva *prev = NULL, *next = NULL;

                        va = op->remap.unmap->va;

                        if (op->remap.prev) {
                                prev = driver_gpuva_alloc();
                                if (!prev)
                                        ; // unwind previous VA space
                                          // updates, free memory and
                                          // unlock
                        }

                        if (op->remap.next) {
                                next = driver_gpuva_alloc();
                                if (!next)
                                        ; // unwind previous VA space
                                          // updates, free memory and
                                          // unlock
                        }

                        driver_vm_remap();
                        drm_gpuva_remap(prev, next, &op->remap);

                        if (prev)
                                drm_gpuva_link(prev, va->vm_bo);
                        if (next)
                                drm_gpuva_link(next, va->vm_bo);
                        drm_gpuva_unlink(va);

                        break;
                }
                case DRM_GPUVA_OP_UNMAP:
                        va = op->unmap.va;

                        driver_vm_unmap();
                        drm_gpuva_unlink(va);
                        drm_gpuva_unmap(&op->unmap);

                        break;
                default:
                        break;
                }
        }
        drm_gpuvm_bo_put(vm_bo);
        driver_unlock_va_space();

        return 0;
}
  2. Receive a callback for each drm_gpuva_op to create a new mapping:

struct driver_context {
        struct drm_gpuvm *gpuvm;
        struct drm_gpuvm_bo *vm_bo;
        struct drm_gpuva *new_va;
        struct drm_gpuva *prev_va;
        struct drm_gpuva *next_va;
};

// ops to pass to drm_gpuvm_init()
static const struct drm_gpuvm_ops driver_gpuvm_ops = {
        .sm_step_map = driver_gpuva_map,
        .sm_step_remap = driver_gpuva_remap,
        .sm_step_unmap = driver_gpuva_unmap,
};

// Typically drivers would embed the &drm_gpuvm and &drm_gpuva
// structure in individual driver structures and lock the dma-resv with
// drm_exec or similar helpers.
int driver_mapping_create(struct drm_gpuvm *gpuvm,
                          u64 addr, u64 range,
                          struct drm_gem_object *obj, u64 offset)
{
        struct drm_gpuvm_map_req map_req = {
                .map.va.addr = addr,
                .map.va.range = range,
                .map.gem.obj = obj,
                .map.gem.offset = offset,
        };
        struct driver_context ctx;
        struct drm_gpuva_ops *ops;
        struct drm_gpuva_op *op;
        int ret = 0;

        ctx.gpuvm = gpuvm;

        ctx.new_va = kzalloc(sizeof(*ctx.new_va), GFP_KERNEL);
        ctx.prev_va = kzalloc(sizeof(*ctx.prev_va), GFP_KERNEL);
        ctx.next_va = kzalloc(sizeof(*ctx.next_va), GFP_KERNEL);
        ctx.vm_bo = drm_gpuvm_bo_create(gpuvm, obj);
        if (!ctx.new_va || !ctx.prev_va || !ctx.next_va || !ctx.vm_bo) {
                ret = -ENOMEM;
                goto out;
        }

        // Typically protected with a driver specific GEM gpuva lock
        // used in the fence signaling path for drm_gpuva_link() and
        // drm_gpuva_unlink(), hence pre-allocate.
        ctx.vm_bo = drm_gpuvm_bo_obtain_prealloc(ctx.vm_bo);

        driver_lock_va_space();
        ret = drm_gpuvm_sm_map(gpuvm, &ctx, &map_req);
        driver_unlock_va_space();

out:
        drm_gpuvm_bo_put(ctx.vm_bo);
        kfree(ctx.new_va);
        kfree(ctx.prev_va);
        kfree(ctx.next_va);

        return ret;
}

int driver_gpuva_map(struct drm_gpuva_op *op, void *__ctx)
{
        struct driver_context *ctx = __ctx;

        drm_gpuva_map(ctx->gpuvm, ctx->new_va, &op->map);

        drm_gpuva_link(ctx->new_va, ctx->vm_bo);

        // prevent the new GPUVA from being freed in
        // driver_mapping_create()
        ctx->new_va = NULL;

        return 0;
}

int driver_gpuva_remap(struct drm_gpuva_op *op, void *__ctx)
{
        struct driver_context *ctx = __ctx;
        struct drm_gpuva *va = op->remap.unmap->va;

        drm_gpuva_remap(ctx->prev_va, ctx->next_va, &op->remap);

        if (op->remap.prev) {
                drm_gpuva_link(ctx->prev_va, va->vm_bo);
                ctx->prev_va = NULL;
        }

        if (op->remap.next) {
                drm_gpuva_link(ctx->next_va, va->vm_bo);
                ctx->next_va = NULL;
        }

        drm_gpuva_unlink(va);
        kfree(va);

        return 0;
}

int driver_gpuva_unmap(struct drm_gpuva_op *op, void *__ctx)
{
        drm_gpuva_unlink(op->unmap.va);
        drm_gpuva_unmap(&op->unmap);
        kfree(op->unmap.va);

        return 0;
}

DRM GPUVM Function References

enumdrm_gpuva_flags

flags forstructdrm_gpuva

Constants

DRM_GPUVA_INVALIDATED

Flag indicating that thedrm_gpuva’s backing GEM is invalidated.

DRM_GPUVA_SPARSE

Flag indicating that thedrm_gpuva is a sparse mapping.

DRM_GPUVA_USERBITS

user defined bits

structdrm_gpuva

structure to track a GPU VA mapping

Definition:

struct drm_gpuva {
    struct drm_gpuvm *vm;
    struct drm_gpuvm_bo *vm_bo;
    enum drm_gpuva_flags flags;
    struct {
        u64 addr;
        u64 range;
    } va;
    struct {
        u64 offset;
        struct drm_gem_object *obj;
        struct list_head entry;
    } gem;
    struct {
        struct rb_node node;
        struct list_head entry;
        u64 __subtree_last;
    } rb;
};

Members

vm

thedrm_gpuvm this object is associated with

vm_bo

thedrm_gpuvm_bo abstraction for the mappeddrm_gem_object

flags

thedrm_gpuva_flags for this mapping

va

structure containing the address and range of thedrm_gpuva

va.addr

the start address

va.range

the range of the mapping

gem

structure containing thedrm_gem_object and its offset

gem.offset

the offset within thedrm_gem_object

gem.obj

the mappeddrm_gem_object

gem.entry

thelist_head to attach this object to adrm_gpuvm_bo

rb

structure containing data to storedrm_gpuvas in a rb-tree

rb.node

the rb-tree node

rb.entry

Thelist_head to additionally connectdrm_gpuvasin the same order they appear in the interval tree. This isuseful to keep iteratingdrm_gpuvas from a start node foundthrough the rb-tree while doing modifications on the rb-treeitself.

rb.__subtree_last

needed by the interval tree, holding last-in-subtree

Description

This structure represents a GPU VA mapping and is associated with adrm_gpuvm.

Typically, this structure is embedded in bigger driver structures.
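
As a minimal sketch (all driver_* names below are hypothetical), such embedding typically looks like this, with the embedded drm_gpuva handed to the drm_gpuva_*() helpers and the wrapper recovered via container_of():

struct driver_gpuva {
        struct drm_gpuva base;  /* managed by the drm_gpuva_*() helpers */
        u64 pt_flags;           /* hypothetical driver-private data */
};

static inline struct driver_gpuva *to_driver_gpuva(struct drm_gpuva *va)
{
        return container_of(va, struct driver_gpuva, base);
}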

voiddrm_gpuva_invalidate(structdrm_gpuva*va,boolinvalidate)

sets whether the backing GEM of thisdrm_gpuva is invalidated

Parameters

structdrm_gpuva*va

thedrm_gpuva to set the invalidate flag for

boolinvalidate

indicates whether thedrm_gpuva is invalidated

booldrm_gpuva_invalidated(structdrm_gpuva*va)

indicates whether the backing BO of thisdrm_gpuva is invalidated

Parameters

structdrm_gpuva*va

thedrm_gpuva to check

Return

true if the GPU VA is invalidated,false otherwise

enumdrm_gpuvm_flags

flags forstructdrm_gpuvm

Constants

DRM_GPUVM_RESV_PROTECTED

GPUVM is protected externally by theGPUVM’sdma_resv lock

DRM_GPUVM_IMMEDIATE_MODE

use the locking scheme for GEMs designedfor modifying the GPUVM during the fence signalling path

When set, gpuva.lock is used to protect gpuva.list in all GEMobjects associated with this GPUVM. Otherwise, the GEMs dma-resv isused.

DRM_GPUVM_USERBITS

user defined bits

structdrm_gpuvm

DRM GPU VA Manager

Definition:

struct drm_gpuvm {
    const char *name;
    enum drm_gpuvm_flags flags;
    struct drm_device *drm;
    u64 mm_start;
    u64 mm_range;
    struct {
        struct rb_root_cached tree;
        struct list_head list;
    } rb;
    struct kref kref;
    struct drm_gpuva kernel_alloc_node;
    const struct drm_gpuvm_ops *ops;
    struct drm_gem_object *r_obj;
    struct {
        struct list_head list;
        struct list_head *local_list;
        spinlock_t lock;
    } extobj;
    struct {
        struct list_head list;
        struct list_head *local_list;
        spinlock_t lock;
    } evict;
    struct llist_head bo_defer;
};

Members

name

the name of the DRM GPU VA space

flags

thedrm_gpuvm_flags of this GPUVM

drm

thedrm_device this VM lives in

mm_start

start of the VA space

mm_range

length of the VA space

rb

structures to trackdrm_gpuva entries

rb.tree

the rb-tree to track GPU VA mappings

rb.list

thelist_head to track GPU VA mappings

kref

reference count of this object

kernel_alloc_node

drm_gpuva representing the address space cutout reserved forthe kernel

ops

drm_gpuvm_ops providing the split/merge steps to drivers

r_obj

Resv GEM object; representing the GPUVM’s commondma_resv.

extobj

structure holding the extobj list

extobj.list

list_head storingdrm_gpuvm_bos serving asexternal object

extobj.local_list

pointer to the local list temporarilystoring entries from the external object list

extobj.lock

spinlock to protect the extobj list

evict

structure holding the evict list and evict list lock

evict.list

list_head storingdrm_gpuvm_bos currentlybeing evicted

evict.local_list

pointer to the local list temporarilystoring entries from the evicted object list

evict.lock

spinlock to protect the evict list

bo_defer

structure holding vm_bos that need to be destroyed

Description

The DRM GPU VA Manager keeps track of a GPU's virtual address space using an interval tree (see the rb member above) of drm_gpuva structures. Typically, this structure is embedded in bigger driver structures.

Drivers can pass addresses and ranges in an arbitrary unit, e.g. bytes orpages.

There should be one manager instance per GPU virtual address space.

structdrm_gpuvm*drm_gpuvm_get(structdrm_gpuvm*gpuvm)

acquire astructdrm_gpuvm reference

Parameters

structdrm_gpuvm*gpuvm

thedrm_gpuvm to acquire the reference of

Description

This function acquires an additional reference togpuvm. It is illegal tocall this without already holding a reference. No locks required.

Return

thestructdrm_gpuvm pointer

booldrm_gpuvm_resv_protected(structdrm_gpuvm*gpuvm)

indicates whetherDRM_GPUVM_RESV_PROTECTED is set

Parameters

structdrm_gpuvm*gpuvm

thedrm_gpuvm

Return

true ifDRM_GPUVM_RESV_PROTECTED is set, false otherwise.

booldrm_gpuvm_immediate_mode(structdrm_gpuvm*gpuvm)

indicates whetherDRM_GPUVM_IMMEDIATE_MODE is set

Parameters

structdrm_gpuvm*gpuvm

thedrm_gpuvm

Return

true ifDRM_GPUVM_IMMEDIATE_MODE is set, false otherwise.

drm_gpuvm_resv

drm_gpuvm_resv(gpuvm__)

returns thedrm_gpuvm’sdma_resv

Parameters

gpuvm__

thedrm_gpuvm

Return

a pointer to thedrm_gpuvm’s shareddma_resv

drm_gpuvm_resv_obj

drm_gpuvm_resv_obj(gpuvm__)

returns thedrm_gem_object holding thedrm_gpuvm’sdma_resv

Parameters

gpuvm__

thedrm_gpuvm

Return

a pointer to thedrm_gem_object holding thedrm_gpuvm’s shareddma_resv

booldrm_gpuvm_is_extobj(structdrm_gpuvm*gpuvm,structdrm_gem_object*obj)

indicates whether the givendrm_gem_object is an external object

Parameters

structdrm_gpuvm*gpuvm

thedrm_gpuvm to check

structdrm_gem_object*obj

thedrm_gem_object to check

Return

true if thedrm_gem_objectdma_resv differs from thedrm_gpuvmsdma_resv, false otherwise

drm_gpuvm_for_each_va_range

drm_gpuvm_for_each_va_range(va__,gpuvm__,start__,end__)

iterate over a range ofdrm_gpuvas

Parameters

va__

drm_gpuva structure to assign to in each iteration step

gpuvm__

drm_gpuvm to walk over

start__

starting offset, the first gpuva will overlap this

end__

ending offset, the last gpuva will start before this (but mayoverlap)

Description

This iterator walks over alldrm_gpuvas in thedrm_gpuvm that liebetweenstart__ andend__. It is implemented similarly tolist_for_each(),but is using thedrm_gpuvm’s internal interval tree to acceleratethe search for the startingdrm_gpuva, and hence isn’t safe against removalof elements. It assumes thatend__ is within (or is the upper limit of) thedrm_gpuvm. This iterator does not skip over thedrm_gpuvm’skernel_alloc_node.
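
For illustration, a hedged sketch of using this iterator; the surrounding protection of the VA space is assumed to be provided by the driver, since the iterator is not safe against removal:

static unsigned int driver_count_vas_in_range(struct drm_gpuvm *gpuvm,
                                              u64 addr, u64 range)
{
        struct drm_gpuva *va;
        unsigned int count = 0;

        drm_gpuvm_for_each_va_range(va, gpuvm, addr, addr + range)
                count++;

        return count;
}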

drm_gpuvm_for_each_va_range_safe

drm_gpuvm_for_each_va_range_safe(va__,next__,gpuvm__,start__,end__)

safely iterate over a range ofdrm_gpuvas

Parameters

va__

drm_gpuva to assign to in each iteration step

next__

anotherdrm_gpuva to use as temporary storage

gpuvm__

drm_gpuvm to walk over

start__

starting offset, the first gpuva will overlap this

end__

ending offset, the last gpuva will start before this (but mayoverlap)

Description

This iterator walks over alldrm_gpuvas in thedrm_gpuvm that liebetweenstart__ andend__. It is implemented similarly tolist_for_each_safe(), but is using thedrm_gpuvm’s internal intervaltree to accelerate the search for the startingdrm_gpuva, and hence is safeagainst removal of elements. It assumes thatend__ is within (or is theupper limit of) thedrm_gpuvm. This iterator does not skip over thedrm_gpuvm’skernel_alloc_node.

drm_gpuvm_for_each_va

drm_gpuvm_for_each_va(va__,gpuvm__)

iterate over alldrm_gpuvas

Parameters

va__

drm_gpuva to assign to in each iteration step

gpuvm__

drm_gpuvm to walk over

Description

This iterator walks over alldrm_gpuva structures associated with the givendrm_gpuvm.

drm_gpuvm_for_each_va_safe

drm_gpuvm_for_each_va_safe(va__,next__,gpuvm__)

safely iterate over alldrm_gpuvas

Parameters

va__

drm_gpuva to assign to in each iteration step

next__

anotherdrm_gpuva to use as temporary storage

gpuvm__

drm_gpuvm to walk over

Description

This iterator walks over alldrm_gpuva structures associated with the givendrm_gpuvm. It is implemented withlist_for_each_entry_safe(), andhence safe against the removal of elements.

structdrm_gpuvm_exec

drm_gpuvm abstraction ofdrm_exec

Definition:

struct drm_gpuvm_exec {
    struct drm_exec exec;
    u32 flags;
    struct drm_gpuvm *vm;
    unsigned int num_fences;
    struct {
        int (*fn)(struct drm_gpuvm_exec *vm_exec);
        void *priv;
    } extra;
};

Members

exec

thedrm_exec structure

flags

the flags for thestructdrm_exec

vm

thedrm_gpuvm to lock its DMA reservations

num_fences

the number of fences to reserve for thedma_resv of thelockeddrm_gem_objects

extra

Callback and corresponding private data for the driver tolock arbitrary additionaldrm_gem_objects.

extra.fn

The driver callback to lock additionaldrm_gem_objects.

extra.priv

driver private data for thefn callback

Description

This structure should be created on the stack asdrm_exec should be.

Optionally,extra can be set in order to lock additionaldrm_gem_objects.
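
A hedged sketch of a typical submission path using a stack-allocated drm_gpuvm_exec follows; the job fence, the chosen dma_resv_usage values and the driver_* naming are illustrative only, not prescribed by this API:

int driver_exec_and_fence(struct drm_gpuvm *gpuvm, struct dma_fence *job_fence)
{
        struct drm_gpuvm_exec vm_exec = {
                .vm = gpuvm,
                .flags = DRM_EXEC_INTERRUPTIBLE_WAIT,
                .num_fences = 1,
        };
        int ret;

        ret = drm_gpuvm_exec_lock(&vm_exec);
        if (ret)
                return ret;

        ret = drm_gpuvm_exec_validate(&vm_exec);
        if (ret)
                goto out_unlock;

        /* ... submit the job to the hardware ... */

        drm_gpuvm_exec_resv_add_fence(&vm_exec, job_fence,
                                      DMA_RESV_USAGE_BOOKKEEP,
                                      DMA_RESV_USAGE_BOOKKEEP);
out_unlock:
        drm_gpuvm_exec_unlock(&vm_exec);
        return ret;
}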

voiddrm_gpuvm_exec_unlock(structdrm_gpuvm_exec*vm_exec)

unlock all dma-resv of all associated BOs

Parameters

structdrm_gpuvm_exec*vm_exec

thedrm_gpuvm_exec wrapper

Description

Releases all dma-resv locks of alldrm_gem_objects previously acquiredthroughdrm_gpuvm_exec_lock() or its variants.

voiddrm_gpuvm_exec_resv_add_fence(structdrm_gpuvm_exec*vm_exec,structdma_fence*fence,enumdma_resv_usageprivate_usage,enumdma_resv_usageextobj_usage)

add fence to private and all extobj

Parameters

structdrm_gpuvm_exec*vm_exec

thedrm_gpuvm_exec wrapper

structdma_fence*fence

fence to add

enumdma_resv_usageprivate_usage

private dma-resv usage

enumdma_resv_usageextobj_usage

extobj dma-resv usage

Description

Seedrm_gpuvm_resv_add_fence().

intdrm_gpuvm_exec_validate(structdrm_gpuvm_exec*vm_exec)

validate all BOs marked as evicted

Parameters

structdrm_gpuvm_exec*vm_exec

thedrm_gpuvm_exec wrapper

Description

Seedrm_gpuvm_validate().

Return

0 on success, negative error code on failure.

structdrm_gpuvm_bo

structure representing adrm_gpuvm anddrm_gem_object combination

Definition:

struct drm_gpuvm_bo {
    struct drm_gpuvm *vm;
    struct drm_gem_object *obj;
    bool evicted;
    struct kref kref;
    struct {
        struct list_head gpuva;
        struct {
            struct list_head gem;
            struct list_head extobj;
            struct list_head evict;
            struct llist_node bo_defer;
        } entry;
    } list;
};

Members

vm

Thedrm_gpuvm theobj is mapped in. This is a referencecounted pointer.

obj

Thedrm_gem_object being mapped invm. This is a referencecounted pointer.

evicted

Indicates whether thedrm_gem_object is evicted; fieldprotected by thedrm_gem_object’s dma-resv lock.

kref

The reference count for thisdrm_gpuvm_bo.

list

Structure containing alllist_heads.

list.gpuva

The list of linkeddrm_gpuvas.

It is safe to access entries from this list as long as theGEM’s gpuva lock is held. See alsostructdrm_gem_object.

list.entry

Structure containing alllist_heads serving asentry.

list.entry.gem

List entry to attach to the drm_gem_objects gpuva list.

list.entry.extobj

List entry to attach to the drm_gpuvms extobj list.

list.entry.evict

List entry to attach to thedrm_gpuvms evict list.

list.entry.bo_defer

List entry to attach tothedrm_gpuvms bo_defer list.

Description

This structure is an abstraction representing adrm_gpuvm anddrm_gem_object combination. It serves as an indirection to accelerateiterating alldrm_gpuvas within adrm_gpuvm backed by the samedrm_gem_object.

Furthermore, it is used to cache evicted GEM objects for a certain GPU-VM to accelerate validation.

Typically, drivers want to create an instance of astructdrm_gpuvm_bo oncea GEM object is mapped first in a GPU-VM and release the instance once thelast mapping of the GEM object in this GPU-VM is unmapped.

structdrm_gpuvm_bo*drm_gpuvm_bo_get(structdrm_gpuvm_bo*vm_bo)

acquire astructdrm_gpuvm_bo reference

Parameters

structdrm_gpuvm_bo*vm_bo

thedrm_gpuvm_bo to acquire the reference of

Description

This function acquires an additional reference tovm_bo. It is illegal tocall this without already holding a reference. No locks required.

Return

the struct drm_gpuvm_bo pointer

voiddrm_gpuvm_bo_gem_evict(structdrm_gem_object*obj,boolevict)

add/remove alldrm_gpuvm_bo’s in the list to/from thedrm_gpuvms evicted list

Parameters

structdrm_gem_object*obj

thedrm_gem_object

boolevict

indicates whetherobj is evicted

Description

Seedrm_gpuvm_bo_evict().

drm_gpuvm_bo_for_each_va

drm_gpuvm_bo_for_each_va(va__,vm_bo__)

iterator to walk over a list ofdrm_gpuva

Parameters

va__

drm_gpuva structure to assign to in each iteration step

vm_bo__

thedrm_gpuvm_bo thedrm_gpuva to walk are associated with

Description

This iterator walks over alldrm_gpuva structures associated with thedrm_gpuvm_bo.

The caller must hold the GEM’s gpuva lock.

drm_gpuvm_bo_for_each_va_safe

drm_gpuvm_bo_for_each_va_safe(va__,next__,vm_bo__)

iterator to safely walk over a list ofdrm_gpuva

Parameters

va__

drm_gpuva structure to assign to in each iteration step

next__

nextdrm_gpuva to store the next step

vm_bo__

thedrm_gpuvm_bo thedrm_gpuva to walk are associated with

Description

This iterator walks over all drm_gpuva structures associated with the drm_gpuvm_bo. It is implemented with list_for_each_entry_safe(), hence it is safe against removal of elements.

The caller must hold the GEM’s gpuva lock.

enumdrm_gpuva_op_type

GPU VA operation type

Constants

DRM_GPUVA_OP_MAP

the map op type

DRM_GPUVA_OP_REMAP

the remap op type

DRM_GPUVA_OP_UNMAP

the unmap op type

DRM_GPUVA_OP_PREFETCH

the prefetch op type

DRM_GPUVA_OP_DRIVER

the driver defined op type

Description

Operations to alter the GPU VA mappings tracked by thedrm_gpuvm.

structdrm_gpuva_op_map

GPU VA map operation

Definition:

struct drm_gpuva_op_map {
    struct {
        u64 addr;
        u64 range;
    } va;
    struct {
        u64 offset;
        struct drm_gem_object *obj;
    } gem;
};

Members

va

structure containing address and range of a mapoperation

va.addr

the base address of the new mapping

va.range

the range of the new mapping

gem

structure containing thedrm_gem_object and its offset

gem.offset

the offset within thedrm_gem_object

gem.obj

thedrm_gem_object to map

Description

This structure represents a single map operation generated by theDRM GPU VA manager.

structdrm_gpuva_op_unmap

GPU VA unmap operation

Definition:

struct drm_gpuva_op_unmap {
    struct drm_gpuva *va;
    bool keep;
};

Members

va

thedrm_gpuva to unmap

keep

Indicates whether thisdrm_gpuva is physically contiguous with theoriginal mapping request.

Optionally, ifkeep is set, drivers may keep the actual page tablemappings for thisdrm_gpuva, adding the missing page table entriesonly and update thedrm_gpuvm accordingly.

Description

This structure represents a single unmap operation generated by theDRM GPU VA manager.

structdrm_gpuva_op_remap

GPU VA remap operation

Definition:

struct drm_gpuva_op_remap {
    struct drm_gpuva_op_map *prev;
    struct drm_gpuva_op_map *next;
    struct drm_gpuva_op_unmap *unmap;
};

Members

prev

the preceding part of a split mapping

next

the subsequent part of a split mapping

unmap

the unmap operation for the original existing mapping

Description

This represents a single remap operation generated by the DRM GPU VA manager.

A remap operation is generated when an existing GPU VA mapping is split up by inserting a new GPU VA mapping or by partially unmapping existing mapping(s), hence it consists of a maximum of two map and one unmap operation.

Theunmap operation takes care of removing the original existing mapping.prev is used to remap the preceding part,next the subsequent part.

If either a new mapping’s start address is aligned with the start addressof the old mapping or the new mapping’s end address is aligned with theend address of the old mapping, eitherprev ornext is NULL.

Note, the reason for a dedicated remap operation, rather than arbitraryunmap and map operations, is to give drivers the chance of extracting driverspecific data for creating the new mappings from the unmap operations’sdrm_gpuva structure which typically is embedded in larger driver specificstructures.
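
As an illustrative example (addresses chosen arbitrarily): if an existing mapping covers the VA range [0x0000, 0x10000) and a new mapping is requested for [0x4000, 0x8000), the resulting remap operation carries a prev map op for [0x0000, 0x4000), a next map op for [0x8000, 0x10000), and the unmap op for the original mapping. Had the request instead started at 0x0000, prev would be NULL.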

structdrm_gpuva_op_prefetch

GPU VA prefetch operation

Definition:

struct drm_gpuva_op_prefetch {
    struct drm_gpuva *va;
};

Members

va

thedrm_gpuva to prefetch

Description

This structure represents a single prefetch operation generated by theDRM GPU VA manager.

structdrm_gpuva_op

GPU VA operation

Definition:

struct drm_gpuva_op {
    struct list_head entry;
    enum drm_gpuva_op_type op;
    union {
        struct drm_gpuva_op_map map;
        struct drm_gpuva_op_remap remap;
        struct drm_gpuva_op_unmap unmap;
        struct drm_gpuva_op_prefetch prefetch;
    };
};

Members

entry

Thelist_head used to distribute instances of thisstructwithindrm_gpuva_ops.

op

the type of the operation

{unnamed_union}

anonymous

map

the map operation

remap

the remap operation

unmap

the unmap operation

prefetch

the prefetch operation

Description

This structure represents a single generic operation.

The particular type of the operation is defined byop.

structdrm_gpuva_ops

wraps a list ofdrm_gpuva_op

Definition:

struct drm_gpuva_ops {
    struct list_head list;
};

Members

list

thelist_head

drm_gpuva_for_each_op

drm_gpuva_for_each_op(op,ops)

iterator to walk overdrm_gpuva_ops

Parameters

op

drm_gpuva_op to assign in each iteration step

ops

drm_gpuva_ops to walk

Description

This iterator walks over all ops within a given list of operations.

drm_gpuva_for_each_op_safe

drm_gpuva_for_each_op_safe(op,next,ops)

iterator to safely walk overdrm_gpuva_ops

Parameters

op

drm_gpuva_op to assign in each iteration step

next

nextdrm_gpuva_op to store the next step

ops

drm_gpuva_ops to walk

Description

This iterator walks over all ops within a given list of operations. It is implemented with list_for_each_safe(), so it is safe against removal of elements.

drm_gpuva_for_each_op_from_reverse

drm_gpuva_for_each_op_from_reverse(op,ops)

iterate backwards from the given point

Parameters

op

drm_gpuva_op to assign in each iteration step

ops

drm_gpuva_ops to walk

Description

This iterator walks over all ops within a given list of operations beginningfrom the given operation in reverse order.

drm_gpuva_for_each_op_reverse

drm_gpuva_for_each_op_reverse(op,ops)

iterator to walk overdrm_gpuva_ops in reverse

Parameters

op

drm_gpuva_op to assign in each iteration step

ops

drm_gpuva_ops to walk

Description

This iterator walks over all ops within a given list of operations in reverse order.

drm_gpuva_first_op

drm_gpuva_first_op(ops)

returns the firstdrm_gpuva_op fromdrm_gpuva_ops

Parameters

ops

the drm_gpuva_ops to get the first drm_gpuva_op from

drm_gpuva_last_op

drm_gpuva_last_op(ops)

returns the lastdrm_gpuva_op fromdrm_gpuva_ops

Parameters

ops

thedrm_gpuva_ops to get the lastdrm_gpuva_op from

drm_gpuva_prev_op

drm_gpuva_prev_op(op)

previousdrm_gpuva_op in the list

Parameters

op

the currentdrm_gpuva_op

drm_gpuva_next_op

drm_gpuva_next_op(op)

nextdrm_gpuva_op in the list

Parameters

op

the currentdrm_gpuva_op

structdrm_gpuvm_map_req

arguments passed to drm_gpuvm_sm_map[_ops_create]()

Definition:

struct drm_gpuvm_map_req {
    struct drm_gpuva_op_map map;
};

Members

map

structdrm_gpuva_op_map

structdrm_gpuvm_ops

callbacks for split/merge steps

Definition:

struct drm_gpuvm_ops {
    void (*vm_free)(struct drm_gpuvm *gpuvm);
    struct drm_gpuva_op *(*op_alloc)(void);
    void (*op_free)(struct drm_gpuva_op *op);
    struct drm_gpuvm_bo *(*vm_bo_alloc)(void);
    void (*vm_bo_free)(struct drm_gpuvm_bo *vm_bo);
    int (*vm_bo_validate)(struct drm_gpuvm_bo *vm_bo, struct drm_exec *exec);
    int (*sm_step_map)(struct drm_gpuva_op *op, void *priv);
    int (*sm_step_remap)(struct drm_gpuva_op *op, void *priv);
    int (*sm_step_unmap)(struct drm_gpuva_op *op, void *priv);
};

Members

vm_free

called when the last reference of astructdrm_gpuvm isdropped

This callback is mandatory.

op_alloc

called when thedrm_gpuvm allocatesastructdrm_gpuva_op

Some drivers may want to embedstructdrm_gpuva_op into driverspecific structures. By implementing this callback drivers canallocate memory accordingly.

This callback is optional.

op_free

called when thedrm_gpuvm frees astructdrm_gpuva_op

Some drivers may want to embedstructdrm_gpuva_op into driverspecific structures. By implementing this callback drivers canfree the previously allocated memory accordingly.

This callback is optional.

vm_bo_alloc

called when thedrm_gpuvm allocatesastructdrm_gpuvm_bo

Some drivers may want to embedstructdrm_gpuvm_bo into driverspecific structures. By implementing this callback drivers canallocate memory accordingly.

This callback is optional.

vm_bo_free

called when thedrm_gpuvm frees astructdrm_gpuvm_bo

Some drivers may want to embedstructdrm_gpuvm_bo into driverspecific structures. By implementing this callback drivers canfree the previously allocated memory accordingly.

This callback is optional.

vm_bo_validate

called fromdrm_gpuvm_validate()

Drivers receive this callback for every evicteddrm_gem_object beingmapped in the correspondingdrm_gpuvm.

Typically, drivers would call their driver specific variant ofttm_bo_validate() from within this callback.

sm_step_map

called fromdrm_gpuvm_sm_map to finally insert themapping once all previous steps were completed

Thepriv pointer matches the one the driver passed todrm_gpuvm_sm_map ordrm_gpuvm_sm_unmap, respectively.

Can be NULL ifdrm_gpuvm_sm_map is used.

sm_step_remap

called fromdrm_gpuvm_sm_map anddrm_gpuvm_sm_unmap to split up an existent mapping

This callback is called when an existing mapping needs to be split up. This is the case when either a newly requested mapping overlaps or is enclosed by an existing mapping, or a partial unmap of an existing mapping is requested.

Thepriv pointer matches the one the driver passed todrm_gpuvm_sm_map ordrm_gpuvm_sm_unmap, respectively.

Can be NULL if neitherdrm_gpuvm_sm_map nordrm_gpuvm_sm_unmap isused.

sm_step_unmap

called fromdrm_gpuvm_sm_map anddrm_gpuvm_sm_unmap to unmap an existing mapping

This callback is called when an existing mapping needs to be unmapped. This is the case when either a newly requested mapping encloses an existing mapping or an unmap of an existing mapping is requested.

Thepriv pointer matches the one the driver passed todrm_gpuvm_sm_map ordrm_gpuvm_sm_unmap, respectively.

Can be NULL if neitherdrm_gpuvm_sm_map nordrm_gpuvm_sm_unmap isused.

Description

This structure defines the callbacks used bydrm_gpuvm_sm_map anddrm_gpuvm_sm_unmap to provide the split/merge steps for map and unmapoperations to drivers.

voiddrm_gpuva_op_remap_to_unmap_range(conststructdrm_gpuva_op_remap*op,u64*start_addr,u64*range)

Helper to get the start and range of the unmap stage of a remap op.

Parameters

conststructdrm_gpuva_op_remap*op

Remap op.

u64*start_addr

Output pointer for the start of the required unmap.

u64*range

Output pointer for the length of the required unmap.

Description

The given start address and range will be set such that they represent therange of the address space that was previously covered by the mapping beingre-mapped, but is now empty.
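
A brief sketch of how a driver's remap handler might use this helper; driver_vm_unmap_range() is a hypothetical driver function standing in for the actual page table teardown:

static void driver_handle_remap(const struct drm_gpuva_op_remap *op)
{
        u64 unmap_start, unmap_range;

        /* the hole left between the prev and next parts of the split */
        drm_gpuva_op_remap_to_unmap_range(op, &unmap_start, &unmap_range);
        driver_vm_unmap_range(unmap_start, unmap_range);
}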

booldrm_gpuvm_range_valid(structdrm_gpuvm*gpuvm,u64addr,u64range)

checks whether the given range is valid for the givendrm_gpuvm

Parameters

structdrm_gpuvm*gpuvm

the GPUVM to check the range for

u64addr

the base address

u64range

the range starting from the base address

Description

Checks whether the range is within the GPUVM’s managed boundaries.

Return

true for a valid range, false otherwise

structdrm_gem_object*drm_gpuvm_resv_object_alloc(structdrm_device*drm)

allocate a dummydrm_gem_object

Parameters

structdrm_device*drm

the driversdrm_device

Description

Allocates a dummydrm_gem_object which can be passed todrm_gpuvm_init() inorder to serve as root GEM object providing thedrm_resv shared acrossdrm_gem_objects local to a single GPUVM.

Return

thedrm_gem_object on success, NULL on failure

voiddrm_gpuvm_init(structdrm_gpuvm*gpuvm,constchar*name,enumdrm_gpuvm_flagsflags,structdrm_device*drm,structdrm_gem_object*r_obj,u64start_offset,u64range,u64reserve_offset,u64reserve_range,conststructdrm_gpuvm_ops*ops)

initialize adrm_gpuvm

Parameters

structdrm_gpuvm*gpuvm

pointer to thedrm_gpuvm to initialize

constchar*name

the name of the GPU VA space

enumdrm_gpuvm_flagsflags

thedrm_gpuvm_flags for this GPUVM

structdrm_device*drm

thedrm_device this VM resides in

structdrm_gem_object*r_obj

the resvdrm_gem_object providing the GPUVM’s commondma_resv

u64start_offset

the start offset of the GPU VA space

u64range

the size of the GPU VA space

u64reserve_offset

the start of the kernel reserved GPU VA area

u64reserve_range

the size of the kernel reserved GPU VA area

conststructdrm_gpuvm_ops*ops

drm_gpuvm_ops called ondrm_gpuvm_sm_map /drm_gpuvm_sm_unmap

Description

Thedrm_gpuvm must be initialized with this function before use.

Note thatgpuvm must be cleared to 0 before calling this function. The givenname is expected to be managed by the surrounding driver structures.
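
A hedged initialization sketch follows; struct driver_vm, the VA space extents and driver_gpuvm_ops (which must at least implement vm_free) are hypothetical, no kernel reserved area is set up, and reference handling of r_obj after initialization is driver policy and not shown here:

int driver_vm_init(struct drm_device *drm, struct driver_vm *dvm)
{
        struct drm_gem_object *r_obj;

        r_obj = drm_gpuvm_resv_object_alloc(drm);
        if (!r_obj)
                return -ENOMEM;

        /* dvm->gpuvm is assumed to be zero initialized, e.g. by kzalloc() */
        drm_gpuvm_init(&dvm->gpuvm, "driver-vm", 0, drm, r_obj,
                       DRIVER_VA_START, DRIVER_VA_SIZE,
                       0, 0, &driver_gpuvm_ops);
        return 0;
}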

voiddrm_gpuvm_put(structdrm_gpuvm*gpuvm)

drop astructdrm_gpuvm reference

Parameters

structdrm_gpuvm*gpuvm

thedrm_gpuvm to release the reference of

Description

This releases a reference togpuvm.

This function may be called from atomic context.

intdrm_gpuvm_prepare_vm(structdrm_gpuvm*gpuvm,structdrm_exec*exec,unsignedintnum_fences)

prepare the GPUVMs common dma-resv

Parameters

structdrm_gpuvm*gpuvm

thedrm_gpuvm

structdrm_exec*exec

thedrm_exec context

unsignedintnum_fences

the amount ofdma_fences to reserve

Description

Callsdrm_exec_prepare_obj() for the GPUVMs dummydrm_gem_object; ifnum_fences is zerodrm_exec_lock_obj() is called instead.

Using this function directly, it is the drivers responsibility to calldrm_exec_init() anddrm_exec_fini() accordingly.

Return

0 on success, negative error code on failure.

intdrm_gpuvm_prepare_objects(structdrm_gpuvm*gpuvm,structdrm_exec*exec,unsignedintnum_fences)

prepare all associated BOs

Parameters

structdrm_gpuvm*gpuvm

thedrm_gpuvm

structdrm_exec*exec

thedrm_exec locking context

unsignedintnum_fences

the amount ofdma_fences to reserve

Description

Callsdrm_exec_prepare_obj() for alldrm_gem_objects the givendrm_gpuvm contains mappings of; ifnum_fences is zerodrm_exec_lock_obj()is called instead.

Using this function directly, it is the drivers responsibility to calldrm_exec_init() anddrm_exec_fini() accordingly.

Note

This function is safe against concurrent insertion and removal ofexternal objects, however it is not safe against concurrent usage itself.

Drivers need to make sure to protect this case with either an outer VM lockor by callingdrm_gpuvm_prepare_vm() before this function within thedrm_exec_until_all_locked() loop, such that the GPUVM’s dma-resv lock ensuresmutual exclusion.

Return

0 on success, negative error code on failure.
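
For illustration, a sketch of the pattern described above, assuming the caller owns the drm_exec context (initialized with drm_exec_init() and torn down with drm_exec_fini()) and reserves one fence slot per object:

static int driver_lock_vm_bos(struct drm_gpuvm *gpuvm, struct drm_exec *exec)
{
        int ret;

        drm_exec_until_all_locked(exec) {
                /* locking the VM resv first provides the mutual exclusion
                 * required by drm_gpuvm_prepare_objects() */
                ret = drm_gpuvm_prepare_vm(gpuvm, exec, 1);
                drm_exec_retry_on_contention(exec);
                if (ret)
                        return ret;

                ret = drm_gpuvm_prepare_objects(gpuvm, exec, 1);
                drm_exec_retry_on_contention(exec);
                if (ret)
                        return ret;
        }

        return 0;
}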

intdrm_gpuvm_prepare_range(structdrm_gpuvm*gpuvm,structdrm_exec*exec,u64addr,u64range,unsignedintnum_fences)

prepare all BOs mapped within a given range

Parameters

structdrm_gpuvm*gpuvm

thedrm_gpuvm

structdrm_exec*exec

thedrm_exec locking context

u64addr

the start address within the VA space

u64range

the range to iterate within the VA space

unsignedintnum_fences

the amount ofdma_fences to reserve

Description

Callsdrm_exec_prepare_obj() for alldrm_gem_objects mapped betweenaddrandaddr +range; ifnum_fences is zerodrm_exec_lock_obj() is calledinstead.

Return

0 on success, negative error code on failure.

intdrm_gpuvm_exec_lock(structdrm_gpuvm_exec*vm_exec)

lock all dma-resv of all associated BOs

Parameters

structdrm_gpuvm_exec*vm_exec

thedrm_gpuvm_exec wrapper

Description

Acquires all dma-resv locks of alldrm_gem_objects the givendrm_gpuvm contains mappings of.

Additionally, when calling this function withstructdrm_gpuvm_exec::extrabeing set the driver receives the givenfn callback to lock additionaldma-resv in the context of thedrm_gpuvm_exec instance. Typically, driverswould calldrm_exec_prepare_obj() from within this callback.

Return

0 on success, negative error code on failure.

intdrm_gpuvm_exec_lock_array(structdrm_gpuvm_exec*vm_exec,structdrm_gem_object**objs,unsignedintnum_objs)

lock all dma-resv of all associated BOs

Parameters

structdrm_gpuvm_exec*vm_exec

thedrm_gpuvm_exec wrapper

structdrm_gem_object**objs

additionaldrm_gem_objects to lock

unsignedintnum_objs

the number of additionaldrm_gem_objects to lock

Description

Acquires all dma-resv locks of alldrm_gem_objects the givendrm_gpuvmcontains mappings of, plus the ones given throughobjs.

Return

0 on success, negative error code on failure.

intdrm_gpuvm_exec_lock_range(structdrm_gpuvm_exec*vm_exec,u64addr,u64range)

prepare all BOs mapped within a given range

Parameters

structdrm_gpuvm_exec*vm_exec

thedrm_gpuvm_exec wrapper

u64addr

the start address within the VA space

u64range

the range to iterate within the VA space

Description

Acquires all dma-resv locks of alldrm_gem_objects mapped betweenaddr andaddr +range.

Return

0 on success, negative error code on failure.

intdrm_gpuvm_validate(structdrm_gpuvm*gpuvm,structdrm_exec*exec)

validate all BOs marked as evicted

Parameters

structdrm_gpuvm*gpuvm

thedrm_gpuvm to validate evicted BOs

structdrm_exec*exec

thedrm_exec instance used for locking the GPUVM

Description

Calls thedrm_gpuvm_ops::vm_bo_validate callback for all evicted bufferobjects being mapped in the givendrm_gpuvm.

Return

0 on success, negative error code on failure.

voiddrm_gpuvm_resv_add_fence(structdrm_gpuvm*gpuvm,structdrm_exec*exec,structdma_fence*fence,enumdma_resv_usageprivate_usage,enumdma_resv_usageextobj_usage)

add fence to private and all extobj dma-resv

Parameters

structdrm_gpuvm*gpuvm

thedrm_gpuvm to add a fence to

structdrm_exec*exec

thedrm_exec locking context

structdma_fence*fence

fence to add

enumdma_resv_usageprivate_usage

private dma-resv usage

enumdma_resv_usageextobj_usage

extobj dma-resv usage

structdrm_gpuvm_bo*drm_gpuvm_bo_create(structdrm_gpuvm*gpuvm,structdrm_gem_object*obj)

create a new instance ofstructdrm_gpuvm_bo

Parameters

structdrm_gpuvm*gpuvm

Thedrm_gpuvm theobj is mapped in.

structdrm_gem_object*obj

Thedrm_gem_object being mapped in thegpuvm.

Description

If provided by the driver, this function uses thedrm_gpuvm_opsvm_bo_alloc() callback to allocate.

Return

a pointer to thedrm_gpuvm_bo on success, NULL on failure

booldrm_gpuvm_bo_put(structdrm_gpuvm_bo*vm_bo)

drop astructdrm_gpuvm_bo reference

Parameters

structdrm_gpuvm_bo*vm_bo

thedrm_gpuvm_bo to release the reference of

Description

This releases a reference tovm_bo.

If the reference count drops to zero, thegpuvm_bo is destroyed, whichincludes removing it from the GEMs gpuva list. Hence, if a call to thisfunction can potentially let the reference count drop to zero the caller musthold the lock that the GEM uses for its gpuva list (either the GEM’sdma-resv or gpuva.lock mutex).

This function may only be called from non-atomic context.

Return

true if vm_bo was destroyed, false otherwise.

booldrm_gpuvm_bo_put_deferred(structdrm_gpuvm_bo*vm_bo)

drop astructdrm_gpuvm_bo reference with deferred cleanup

Parameters

structdrm_gpuvm_bo*vm_bo

thedrm_gpuvm_bo to release the reference of

Description

This releases a reference tovm_bo.

This might take and release the GEMs GPUVA lock. You should calldrm_gpuvm_bo_deferred_cleanup() later to complete the cleanup process.

Return

true if vm_bo is being destroyed, false otherwise.

voiddrm_gpuvm_bo_deferred_cleanup(structdrm_gpuvm*gpuvm)

clean up BOs in the deferred cleanup list

Parameters

structdrm_gpuvm*gpuvm

the VM to clean up

Description

Cleans updrm_gpuvm_bo instances in the deferred cleanup list.

structdrm_gpuvm_bo*drm_gpuvm_bo_find(structdrm_gpuvm*gpuvm,structdrm_gem_object*obj)

find thedrm_gpuvm_bo for the givendrm_gpuvm anddrm_gem_object

Parameters

structdrm_gpuvm*gpuvm

Thedrm_gpuvm theobj is mapped in.

structdrm_gem_object*obj

Thedrm_gem_object being mapped in thegpuvm.

Description

Find thedrm_gpuvm_bo representing the combination of the givendrm_gpuvm anddrm_gem_object. If found, increases the referencecount of thedrm_gpuvm_bo accordingly.

Return

a pointer to thedrm_gpuvm_bo on success, NULL on failure

structdrm_gpuvm_bo*drm_gpuvm_bo_obtain_locked(structdrm_gpuvm*gpuvm,structdrm_gem_object*obj)

obtains an instance of thedrm_gpuvm_bo for the givendrm_gpuvm anddrm_gem_object

Parameters

structdrm_gpuvm*gpuvm

Thedrm_gpuvm theobj is mapped in.

structdrm_gem_object*obj

Thedrm_gem_object being mapped in thegpuvm.

Description

Find thedrm_gpuvm_bo representing the combination of the givendrm_gpuvm anddrm_gem_object. If found, increases the referencecount of thedrm_gpuvm_bo accordingly. If not found, allocates a newdrm_gpuvm_bo.

Requires the lock for the GEMs gpuva list.

A newdrm_gpuvm_bo is added to the GEMs gpuva list.

Return

a pointer to thedrm_gpuvm_bo on success, an ERR_PTR on failure

structdrm_gpuvm_bo*drm_gpuvm_bo_obtain_prealloc(structdrm_gpuvm_bo*__vm_bo)

obtains an instance of thedrm_gpuvm_bo for the givendrm_gpuvm anddrm_gem_object

Parameters

structdrm_gpuvm_bo*__vm_bo

A pre-allocatedstructdrm_gpuvm_bo.

Description

Find thedrm_gpuvm_bo representing the combination of the givendrm_gpuvm anddrm_gem_object. If found, increases the referencecount of the founddrm_gpuvm_bo accordingly, while the__vm_bo referencecount is decreased. If not found__vm_bo is returned without furtherincrease of the reference count.

The provided__vm_bo must not already be in the gpuva, evict, or extobjlists prior to calling this method.

A newdrm_gpuvm_bo is added to the GEMs gpuva list.

Return

a pointer to the founddrm_gpuvm_bo or__vm_bo if no existingdrm_gpuvm_bo was found

voiddrm_gpuvm_bo_extobj_add(structdrm_gpuvm_bo*vm_bo)

adds thedrm_gpuvm_bo to itsdrm_gpuvm’s extobj list

Parameters

structdrm_gpuvm_bo*vm_bo

Thedrm_gpuvm_bo to add to itsdrm_gpuvm’s the extobj list.

Description

Adds the given vm_bo to its drm_gpuvm's extobj list if it is not on the list already and the corresponding drm_gem_object actually is an external object.

voiddrm_gpuvm_bo_evict(structdrm_gpuvm_bo*vm_bo,boolevict)

add / remove adrm_gpuvm_bo to / from thedrm_gpuvms evicted list

Parameters

structdrm_gpuvm_bo*vm_bo

thedrm_gpuvm_bo to add or remove

boolevict

indicates whether the object is evicted

Description

Adds adrm_gpuvm_bo to or removes it from thedrm_gpuvm’s evicted list.

intdrm_gpuva_insert(structdrm_gpuvm*gpuvm,structdrm_gpuva*va)

insert adrm_gpuva

Parameters

structdrm_gpuvm*gpuvm

thedrm_gpuvm to insert thedrm_gpuva in

structdrm_gpuva*va

thedrm_gpuva to insert

Description

Insert adrm_gpuva with a given address and range into adrm_gpuvm.

It is safe to use this function using the safe versions of iterating the GPUVA space, such asdrm_gpuvm_for_each_va_safe() anddrm_gpuvm_for_each_va_range_safe().

Return

0 on success, negative error code on failure.

voiddrm_gpuva_remove(structdrm_gpuva*va)

remove adrm_gpuva

Parameters

structdrm_gpuva*va

thedrm_gpuva to remove

Description

This removes the givenva from the underlying tree.

It is safe to use this function using the safe versions of iterating the GPUVA space, such asdrm_gpuvm_for_each_va_safe() anddrm_gpuvm_for_each_va_range_safe().

voiddrm_gpuva_link(structdrm_gpuva*va,structdrm_gpuvm_bo*vm_bo)

link adrm_gpuva

Parameters

structdrm_gpuva*va

thedrm_gpuva to link

structdrm_gpuvm_bo*vm_bo

thedrm_gpuvm_bo to add thedrm_gpuva to

Description

This adds the givenva to the GPU VA list of thedrm_gpuvm_bo and thedrm_gpuvm_bo to thedrm_gem_object it is associated with.

For everydrm_gpuva entry added to thedrm_gpuvm_bo an additionalreference of the latter is taken.

This function expects the caller to protect the GEM’s GPUVA list againstconcurrent access using either the GEM’s dma-resv or gpuva.lock mutex.

voiddrm_gpuva_unlink(structdrm_gpuva*va)

unlink adrm_gpuva

Parameters

structdrm_gpuva*va

thedrm_gpuva to unlink

Description

This removes the givenva from the GPU VA list of thedrm_gem_object it isassociated with.

This removes the givenva from the GPU VA list of thedrm_gpuvm_bo andthedrm_gpuvm_bo from thedrm_gem_object it is associated with in casethis call unlinks the lastdrm_gpuva from thedrm_gpuvm_bo.

For everydrm_gpuva entry removed from thedrm_gpuvm_bo a reference ofthe latter is dropped.

This function expects the caller to protect the GEM’s GPUVA list againstconcurrent access using either the GEM’s dma-resv or gpuva.lock mutex.

voiddrm_gpuva_unlink_defer(structdrm_gpuva*va)

unlink adrm_gpuva with deferred vm_bo cleanup

Parameters

structdrm_gpuva*va

thedrm_gpuva to unlink

Description

Similar todrm_gpuva_unlink(), but usesdrm_gpuvm_bo_put_deferred() and takesthe lock for the caller.

structdrm_gpuva*drm_gpuva_find_first(structdrm_gpuvm*gpuvm,u64addr,u64range)

find the firstdrm_gpuva in the given range

Parameters

structdrm_gpuvm*gpuvm

thedrm_gpuvm to search in

u64addr

thedrm_gpuvas address

u64range

thedrm_gpuvas range

Return

the firstdrm_gpuva within the given range

structdrm_gpuva*drm_gpuva_find(structdrm_gpuvm*gpuvm,u64addr,u64range)

find adrm_gpuva

Parameters

structdrm_gpuvm*gpuvm

thedrm_gpuvm to search in

u64addr

thedrm_gpuvas address

u64range

thedrm_gpuvas range

Return

thedrm_gpuva at a givenaddr and with a givenrange

structdrm_gpuva*drm_gpuva_find_prev(structdrm_gpuvm*gpuvm,u64start)

find thedrm_gpuva before the given address

Parameters

structdrm_gpuvm*gpuvm

thedrm_gpuvm to search in

u64start

the given GPU VA’s start address

Description

Find the adjacentdrm_gpuva before the GPU VA with givenstart address.

Note that if there is any free space between the GPU VA mappings no mappingis returned.

Return

a pointer to the founddrm_gpuva or NULL if none was found

structdrm_gpuva*drm_gpuva_find_next(structdrm_gpuvm*gpuvm,u64end)

find thedrm_gpuva after the given address

Parameters

structdrm_gpuvm*gpuvm

thedrm_gpuvm to search in

u64end

the given GPU VA’s end address

Description

Find the adjacentdrm_gpuva after the GPU VA with givenend address.

Note that if there is any free space between the GPU VA mappings no mappingis returned.

Return

a pointer to the founddrm_gpuva or NULL if none was found

booldrm_gpuvm_interval_empty(structdrm_gpuvm*gpuvm,u64addr,u64range)

indicate whether a given interval of the VA space is empty

Parameters

structdrm_gpuvm*gpuvm

thedrm_gpuvm to check the range for

u64addr

the start address of the range

u64range

the range of the interval

Return

true if the interval is empty, false otherwise

voiddrm_gpuva_map(structdrm_gpuvm*gpuvm,structdrm_gpuva*va,conststructdrm_gpuva_op_map*op)

helper to insert adrm_gpuva according to adrm_gpuva_op_map

Parameters

structdrm_gpuvm*gpuvm

thedrm_gpuvm

structdrm_gpuva*va

thedrm_gpuva to insert

conststructdrm_gpuva_op_map*op

thedrm_gpuva_op_map to initializeva with

Description

Initializes theva from theop and inserts it into the givengpuvm.

voiddrm_gpuva_remap(structdrm_gpuva*prev,structdrm_gpuva*next,conststructdrm_gpuva_op_remap*op)

helper to remap adrm_gpuva according to adrm_gpuva_op_remap

Parameters

structdrm_gpuva*prev

thedrm_gpuva to remap when keeping the start of a mapping

structdrm_gpuva*next

thedrm_gpuva to remap when keeping the end of a mapping

conststructdrm_gpuva_op_remap*op

thedrm_gpuva_op_remap to initializeprev andnext with

Description

Removes the currently mappeddrm_gpuva and remaps it usingprev and/ornext.

voiddrm_gpuva_unmap(conststructdrm_gpuva_op_unmap*op)

helper to remove adrm_gpuva according to adrm_gpuva_op_unmap

Parameters

conststructdrm_gpuva_op_unmap*op

thedrm_gpuva_op_unmap specifying thedrm_gpuva to remove

Description

Removes thedrm_gpuva associated with thedrm_gpuva_op_unmap.

intdrm_gpuvm_sm_map(structdrm_gpuvm*gpuvm,void*priv,conststructdrm_gpuvm_map_req*req)

calls thedrm_gpuva_op split/merge steps

Parameters

structdrm_gpuvm*gpuvm

thedrm_gpuvm representing the GPU VA space

void*priv

pointer to a driver private data structure

conststructdrm_gpuvm_map_req*req

ptr tostructdrm_gpuvm_map_req

Description

This function iterates the given range of the GPU VA space. It utilizes thedrm_gpuvm_ops to call back into the driver providing the split and mergesteps.

Drivers may use these callbacks to update the GPU VA space right away withinthe callback. In case the driver decides to copy and store the operations forlater processing neither this function nordrm_gpuvm_sm_unmap is allowed tobe called before thedrm_gpuvm’s view of the GPU VA space wasupdated with the previous set of operations. To update thedrm_gpuvm’s view of the GPU VA spacedrm_gpuva_insert(),drm_gpuva_destroy_locked() and/ordrm_gpuva_destroy_unlocked() should beused.

A sequence of callbacks can contain map, unmap and remap operations, butthe sequence of callbacks might also be empty if no operation is required,e.g. if the requested mapping already exists in the exact same way.

There can be an arbitrary amount of unmap operations, a maximum of two remapoperations and a single map operation. The latter one represents the originalmap operation requested by the caller.

Return

0 on success or a negative error code

intdrm_gpuvm_sm_unmap(structdrm_gpuvm*gpuvm,void*priv,u64req_addr,u64req_range)

calls thedrm_gpuva_ops to split on unmap

Parameters

structdrm_gpuvm*gpuvm

thedrm_gpuvm representing the GPU VA space

void*priv

pointer to a driver private data structure

u64req_addr

the start address of the range to unmap

u64req_range

the range of the mappings to unmap

Description

This function iterates the given range of the GPU VA space. It utilizes thedrm_gpuvm_ops to call back into the driver providing the operations tounmap and, if required, split existing mappings.

Drivers may use these callbacks to update the GPU VA space right away withinthe callback. In case the driver decides to copy and store the operations forlater processing neither this function nordrm_gpuvm_sm_map is allowed to becalled before thedrm_gpuvm’s view of the GPU VA space was updatedwith the previous set of operations. To update thedrm_gpuvm’s viewof the GPU VA spacedrm_gpuva_insert(),drm_gpuva_destroy_locked() and/ordrm_gpuva_destroy_unlocked() should be used.

A sequence of callbacks can contain unmap and remap operations, depending onwhether there are actual overlapping mappings to split.

There can be an arbitrary amount of unmap operations and a maximum of tworemap operations.

Return

0 on success or a negative error code

intdrm_gpuvm_sm_map_exec_lock(structdrm_gpuvm*gpuvm,structdrm_exec*exec,unsignedintnum_fences,structdrm_gpuvm_map_req*req)

locks the objects touched by adrm_gpuvm_sm_map()

Parameters

structdrm_gpuvm*gpuvm

thedrm_gpuvm representing the GPU VA space

structdrm_exec*exec

thedrm_exec locking context

unsignedintnum_fences

for newly mapped objects, the # of fences to reserve

structdrm_gpuvm_map_req*req

ptr to drm_gpuvm_map_req struct

Description

This function locks (drm_exec_lock_obj()) objects that will be unmapped/remapped, and locks and prepares (drm_exec_prepare_obj()) objects that will be newly mapped.

The expected usage is:

vm_bind {
    struct drm_exec exec;

    // IGNORE_DUPLICATES is required, INTERRUPTIBLE_WAIT is recommended:
    drm_exec_init(&exec, IGNORE_DUPLICATES | INTERRUPTIBLE_WAIT, 0);

    drm_exec_until_all_locked (&exec) {
        for_each_vm_bind_operation {
            switch (op->op) {
            case DRIVER_OP_UNMAP:
                ret = drm_gpuvm_sm_unmap_exec_lock(gpuvm, &exec, op->addr, op->range);
                break;
            case DRIVER_OP_MAP:
                ret = drm_gpuvm_sm_map_exec_lock(gpuvm, &exec, num_fences, &req);
                break;
            }

            drm_exec_retry_on_contention(&exec);
            if (ret)
                return ret;
        }
    }
}

This enables all locking to be performed before the driver begins modifyingthe VM. This is safe to do in the case of overlapping DRIVER_VM_BIND_OPs,where an earlier op can alter the sequence of steps generated for a laterop, because the later altered step will involve the same GEM object(s)already seen in the earlier locking step. For example:

  1. An earlier driver DRIVER_OP_UNMAP op removes the need for aDRM_GPUVA_OP_REMAP/UNMAP step. This is safe because we’ve alreadylocked the GEM object in the earlier DRIVER_OP_UNMAP op.

  2. An earlier DRIVER_OP_MAP op overlaps with a later DRIVER_OP_MAP/UNMAPop, introducing a DRM_GPUVA_OP_REMAP/UNMAP that wouldn’t have beenrequired without the earlier DRIVER_OP_MAP. This is safe because we’vealready locked the GEM object in the earlier DRIVER_OP_MAP step.

Return

0 on success or a negative error code

intdrm_gpuvm_sm_unmap_exec_lock(structdrm_gpuvm*gpuvm,structdrm_exec*exec,u64req_addr,u64req_range)

locks the objects touched bydrm_gpuvm_sm_unmap()

Parameters

structdrm_gpuvm*gpuvm

thedrm_gpuvm representing the GPU VA space

structdrm_exec*exec

thedrm_exec locking context

u64req_addr

the start address of the range to unmap

u64req_range

the range of the mappings to unmap

Description

This function locks (drm_exec_lock_obj()) objects that will be unmapped/remapped bydrm_gpuvm_sm_unmap().

Seedrm_gpuvm_sm_map_exec_lock() for expected usage.

Return

0 on success or a negative error code

structdrm_gpuva_ops*drm_gpuvm_sm_map_ops_create(structdrm_gpuvm*gpuvm,conststructdrm_gpuvm_map_req*req)

creates thedrm_gpuva_ops to split and merge

Parameters

structdrm_gpuvm*gpuvm

thedrm_gpuvm representing the GPU VA space

conststructdrm_gpuvm_map_req*req

map request arguments

Description

This function creates a list of operations to perform splitting and mergingof existing mapping(s) with the newly requested one.

The list can be iterated withdrm_gpuva_for_each_op and must be processedin the given order. It can contain map, unmap and remap operations, but italso can be empty if no operation is required, e.g. if the requested mappingalready exists in the exact same way.

There can be an arbitrary amount of unmap operations, a maximum of two remapoperations and a single map operation. The latter one represents the originalmap operation requested by the caller.

Note that before calling this function again with another mapping request itis necessary to update thedrm_gpuvm’s view of the GPU VA space. Thepreviously obtained operations must be either processed or abandoned. Toupdate thedrm_gpuvm’s view of the GPU VA spacedrm_gpuva_insert(),drm_gpuva_destroy_locked() and/ordrm_gpuva_destroy_unlocked() should beused.

After the caller finished processing the returneddrm_gpuva_ops, they mustbe freed withdrm_gpuva_ops_free.

Return

a pointer to thedrm_gpuva_ops on success, an ERR_PTR on failure

structdrm_gpuva_ops*drm_gpuvm_madvise_ops_create(structdrm_gpuvm*gpuvm,conststructdrm_gpuvm_map_req*req)

creates thedrm_gpuva_ops to split

Parameters

structdrm_gpuvm*gpuvm

thedrm_gpuvm representing the GPU VA space

conststructdrm_gpuvm_map_req*req

map request arguments

Description

This function creates a list of operations to perform splitting of existing mapping(s) at their start or end, based on the requested map.

The list can be iterated with drm_gpuva_for_each_op and must be processed in the given order. It can contain map and remap operations, but it also can be empty if no operation is required, e.g. if the requested mapping already exists in the exact same way.

There will be no unmap operations, a maximum of two remap operations and two map operations. The two map operations correspond to: one from start to the end of drm_gpuvaX, and another from the start of drm_gpuvaY to end.

Note that before calling this function again with another mapping request itis necessary to update thedrm_gpuvm’s view of the GPU VA space. Thepreviously obtained operations must be either processed or abandoned. Toupdate thedrm_gpuvm’s view of the GPU VA spacedrm_gpuva_insert(),drm_gpuva_destroy_locked() and/ordrm_gpuva_destroy_unlocked() should beused.

After the caller finished processing the returneddrm_gpuva_ops, they mustbe freed withdrm_gpuva_ops_free.

Return

a pointer to thedrm_gpuva_ops on success, an ERR_PTR on failure

structdrm_gpuva_ops*drm_gpuvm_sm_unmap_ops_create(structdrm_gpuvm*gpuvm,u64req_addr,u64req_range)

creates thedrm_gpuva_ops to split on unmap

Parameters

structdrm_gpuvm*gpuvm

thedrm_gpuvm representing the GPU VA space

u64req_addr

the start address of the range to unmap

u64req_range

the range of the mappings to unmap

Description

This function creates a list of operations to perform unmapping and, ifrequired, splitting of the mappings overlapping the unmap range.

The list can be iterated withdrm_gpuva_for_each_op and must be processedin the given order. It can contain unmap and remap operations, depending onwhether there are actual overlapping mappings to split.

There can be an arbitrary amount of unmap operations and a maximum of tworemap operations.

Note that before calling this function again with another range to unmap itis necessary to update thedrm_gpuvm’s view of the GPU VA space. Thepreviously obtained operations must be processed or abandoned. To update thedrm_gpuvm’s view of the GPU VA spacedrm_gpuva_insert(),drm_gpuva_destroy_locked() and/ordrm_gpuva_destroy_unlocked() should beused.

After the caller finished processing the returneddrm_gpuva_ops, they mustbe freed withdrm_gpuva_ops_free.

Return

a pointer to thedrm_gpuva_ops on success, an ERR_PTR on failure
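
A hedged sketch of the create/process/free cycle for unmap operations; the page table update in the loop body is left to the driver:

static int driver_unmap_range(struct drm_gpuvm *gpuvm, u64 addr, u64 range)
{
        struct drm_gpuva_ops *ops;
        struct drm_gpuva_op *op;

        ops = drm_gpuvm_sm_unmap_ops_create(gpuvm, addr, range);
        if (IS_ERR(ops))
                return PTR_ERR(ops);

        drm_gpuva_for_each_op(op, ops) {
                /* tear down page tables and update the drm_gpuvm view,
                 * analogous to the mapping example further above */
        }

        drm_gpuva_ops_free(gpuvm, ops);
        return 0;
}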

structdrm_gpuva_ops*drm_gpuvm_prefetch_ops_create(structdrm_gpuvm*gpuvm,u64addr,u64range)

creates thedrm_gpuva_ops to prefetch

Parameters

structdrm_gpuvm*gpuvm

thedrm_gpuvm representing the GPU VA space

u64addr

the start address of the range to prefetch

u64range

the range of the mappings to prefetch

Description

This function creates a list of operations to perform prefetching.

The list can be iterated withdrm_gpuva_for_each_op and must be processedin the given order. It can contain prefetch operations.

There can be an arbitrary amount of prefetch operations.

After the caller finished processing the returneddrm_gpuva_ops, they mustbe freed withdrm_gpuva_ops_free.

Return

a pointer to thedrm_gpuva_ops on success, an ERR_PTR on failure

structdrm_gpuva_ops*drm_gpuvm_bo_unmap_ops_create(structdrm_gpuvm_bo*vm_bo)

creates thedrm_gpuva_ops to unmap a GEM

Parameters

structdrm_gpuvm_bo*vm_bo

thedrm_gpuvm_bo abstraction

Description

This function creates a list of operations to perform unmapping for everyGPUVA attached to a GEM.

The list can be iterated withdrm_gpuva_for_each_op and consists out of anarbitrary amount of unmap operations.

After the caller finished processing the returneddrm_gpuva_ops, they mustbe freed withdrm_gpuva_ops_free.

This function expects the caller to protect the GEM’s GPUVA list againstconcurrent access using either the GEM’s dma-resv or gpuva.lock mutex.

Return

a pointer to thedrm_gpuva_ops on success, an ERR_PTR on failure

voiddrm_gpuva_ops_free(structdrm_gpuvm*gpuvm,structdrm_gpuva_ops*ops)

free the givendrm_gpuva_ops

Parameters

structdrm_gpuvm*gpuvm

thedrm_gpuvm the ops were created for

structdrm_gpuva_ops*ops

thedrm_gpuva_ops to free

Description

Frees the givendrm_gpuva_ops structure including all the ops associatedwith it.

DRM Buddy Allocator

Buddy Allocator Function References (GPU buddy)

intgpu_buddy_init(structgpu_buddy*mm,u64size,u64chunk_size)

init memory manager

Parameters

structgpu_buddy*mm

GPU buddy manager to initialize

u64size

size in bytes to manage

u64chunk_size

minimum page size in bytes for our allocations

Description

Initializes the memory manager and its resources.

Return

0 on success, error code on failure.

voidgpu_buddy_fini(structgpu_buddy*mm)

tear down the memory manager

Parameters

structgpu_buddy*mm

GPU buddy manager to free

Description

Clean up memory manager resources and the freetree.

structgpu_buddy_block*gpu_get_buddy(structgpu_buddy_block*block)

get buddy address

Parameters

structgpu_buddy_block*block

GPU buddy block

Description

Returns the corresponding buddy block for block, or NULL if this is a root block and can't be merged further. Requires some kind of locking to protect against any concurrent allocate and free operations.

voidgpu_buddy_reset_clear(structgpu_buddy*mm,boolis_clear)

reset blocks clear state

Parameters

structgpu_buddy*mm

GPU buddy manager

boolis_clear

blocks clear state

Description

Reset the clear state based onis_clear value for each blockin the freetree.

voidgpu_buddy_free_block(structgpu_buddy*mm,structgpu_buddy_block*block)

free a block

Parameters

structgpu_buddy*mm

GPU buddy manager

structgpu_buddy_block*block

block to be freed

voidgpu_buddy_free_list(structgpu_buddy*mm,structlist_head*objects,unsignedintflags)

free blocks

Parameters

structgpu_buddy*mm

GPU buddy manager

structlist_head*objects

input list head to free blocks

unsignedintflags

optional flags like GPU_BUDDY_CLEARED

intgpu_buddy_block_trim(structgpu_buddy*mm,u64*start,u64new_size,structlist_head*blocks)

free unused pages

Parameters

structgpu_buddy*mm

GPU buddy manager

u64*start

start address to begin the trimming.

u64new_size

original size requested

structlist_head*blocks

Input and output list of allocated blocks. MUST contain a single block as input to be trimmed. On success will contain the newly allocated blocks making up the new_size. Blocks always appear in ascending order.

Description

For contiguous allocation, we round up the size to the nearest power-of-two value. Drivers consume the actual size, so the remaining portions are unused and can optionally be freed with this function.

Return

0 on success, error code on failure.

intgpu_buddy_alloc_blocks(structgpu_buddy*mm,u64start,u64end,u64size,u64min_block_size,structlist_head*blocks,unsignedlongflags)

allocate power-of-two blocks

Parameters

structgpu_buddy*mm

GPU buddy manager to allocate from

u64start

start of the allowed range for this block

u64end

end of the allowed range for this block

u64size

size of the allocation in bytes

u64min_block_size

alignment of the allocation

structlist_head*blocks

output list head to add allocated blocks

unsignedlongflags

GPU_BUDDY_*_ALLOCATION flags

Description

alloc_range_bias() is called when range limitations are given; it traverses the tree and returns the desired block.

alloc_from_freetree() is called when no range restrictions are enforced; it picks the block from the freetree.

Return

0 on success, error code on failure.
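To make the calling convention concrete, here is a minimal, hedged sketch that manages a 256 MiB region and carves out a 1 MiB, 64 KiB-aligned allocation from it; the sizes, the zero flags value and the error handling are illustrative only:

#include <linux/sizes.h>

static int my_buddy_example(void)
{
        struct gpu_buddy mm;
        LIST_HEAD(blocks);
        int ret;

        /* Manage 256 MiB with a 4 KiB minimum chunk size. */
        ret = gpu_buddy_init(&mm, 256ULL << 20, SZ_4K);
        if (ret)
                return ret;

        /* 1 MiB allocation, 64 KiB minimum block size, anywhere in range. */
        ret = gpu_buddy_alloc_blocks(&mm, 0, 256ULL << 20, SZ_1M, SZ_64K,
                                     &blocks, 0);
        if (ret)
                goto out_fini;

        /* ... program the hardware with the blocks ... */

        gpu_buddy_free_list(&mm, &blocks, 0);
out_fini:
        gpu_buddy_fini(&mm);
        return ret;
}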

voidgpu_buddy_block_print(structgpu_buddy*mm,structgpu_buddy_block*block)

print block information

Parameters

structgpu_buddy*mm

GPU buddy manager

structgpu_buddy_block*block

GPU buddy block

voidgpu_buddy_print(structgpu_buddy*mm)

print allocator state

Parameters

structgpu_buddy*mm

GPU buddy manager

DRM Buddy Specific Logging Function References

voiddrm_buddy_block_print(structgpu_buddy*mm,structgpu_buddy_block*block,structdrm_printer*p)

print block information

Parameters

structgpu_buddy*mm

DRM buddy manager

structgpu_buddy_block*block

DRM buddy block

structdrm_printer*p

DRM printer to use

voiddrm_buddy_print(structgpu_buddy*mm,structdrm_printer*p)

print allocator state

Parameters

structgpu_buddy*mm

DRM buddy manager

structdrm_printer*p

DRM printer to use

DRM Cache Handling and Fast WC memcpy()

voiddrm_clflush_pages(structpage*pages[],unsignedlongnum_pages)

Flush dcache lines of a set of pages.

Parameters

structpage*pages[]

List of pages to be flushed.

unsignedlongnum_pages

Number of pages in the array.

Description

Flush every data cache line entry that points to an address belonging to a page in the array.

voiddrm_clflush_sg(structsg_table*st)

Flush dcache lines pointing to a scatter-gather table.

Parameters

structsg_table*st

structsg_table.

Description

Flush every data cache line entry that points to an address in the sg.

voiddrm_clflush_virt_range(void*addr,unsignedlonglength)

Flush dcache lines of a region

Parameters

void*addr

Initial kernel memory address.

unsignedlonglength

Region size.

Description

Flush every data cache line entry that points to an address in the region requested.

voiddrm_memcpy_from_wc(structiosys_map*dst,conststructiosys_map*src,unsignedlonglen)

Perform the fastest available memcpy from a source that may be WC.

Parameters

structiosys_map*dst

The destination pointer

conststructiosys_map*src

The source pointer

unsignedlonglen

The size of the area to transfer in bytes

Description

Tries an arch-optimized memcpy that uses prefetching for reading out of a WC region, and if no such beast is available, falls back to a normal memcpy.

DRM Sync Objects

DRM synchronisation objects (syncobj, see struct drm_syncobj) provide a container for a synchronization primitive which can be used by userspace to explicitly synchronize GPU commands, can be shared between userspace processes, and can be shared between different DRM drivers. Their primary use-case is to implement Vulkan fences and semaphores. The syncobj userspace API provides ioctls for several operations:

  • Creation and destruction of syncobjs

  • Import and export of syncobjs to/from a syncobj file descriptor

  • Import and export a syncobj’s underlying fence to/from a sync file

  • Reset a syncobj (set its fence to NULL)

  • Signal a syncobj (set a trivially signaled fence)

  • Wait for a syncobj’s fence to appear and be signaled

The syncobj userspace API also provides operations to manipulate a syncobj in terms of a timeline of struct dma_fence_chain rather than a single struct dma_fence, through the following operations:

  • Signal a given point on the timeline

  • Wait for a given point to appear and/or be signaled

  • Import and export from/to a given point of a timeline

At its core, a syncobj is simply a wrapper around a pointer to a struct dma_fence which may be NULL. When a syncobj is first created, its pointer is either NULL or a pointer to an already signaled fence, depending on whether the DRM_SYNCOBJ_CREATE_SIGNALED flag is passed to DRM_IOCTL_SYNCOBJ_CREATE.

If the syncobj is considered as a binary (its state is either signaled or unsignaled) primitive, when GPU work is enqueued in a DRM driver to signal the syncobj, the syncobj's fence is replaced with a fence which will be signaled by the completion of that work. If the syncobj is considered as a timeline primitive, when GPU work is enqueued in a DRM driver to signal a given point of the syncobj, a new struct dma_fence_chain is created, pointing to the DRM driver's fence and also pointing to the previous fence that was in the syncobj. The new struct dma_fence_chain fence replaces the syncobj's fence and will be signaled by completion of the DRM driver's work and also any work associated with the fence previously in the syncobj.

When GPU work which waits on a syncobj is enqueued in a DRM driver, at the time the work is enqueued, it waits on the syncobj's fence before submitting the work to hardware. That fence is either:

  • The syncobj's current fence if the syncobj is considered as a binary primitive.

  • The struct dma_fence associated with a given point if the syncobj is considered as a timeline primitive.

If the syncobj's fence is NULL or not present in the syncobj's timeline, the enqueue operation is expected to fail.

With a binary syncobj, all manipulation of the syncobj's fence happens in terms of the current fence at the time the ioctl is called by userspace, regardless of whether that operation is an immediate host-side operation (signal or reset) or an operation which is enqueued in some driver queue. DRM_IOCTL_SYNCOBJ_RESET and DRM_IOCTL_SYNCOBJ_SIGNAL can be used to manipulate a syncobj from the host by resetting its pointer to NULL or setting its pointer to a fence which is already signaled.

With a timeline syncobj, all manipulation of the syncobj's fence happens in terms of a u64 value referring to a point in the timeline. See dma_fence_chain_find_seqno() to see how a given point is found in the timeline.

Note that applications should be careful to always use the timeline set of ioctl() when dealing with a syncobj considered as a timeline. Using the binary set of ioctl() with a syncobj considered as a timeline could result in incorrect synchronization. The use of binary syncobjs is supported through the timeline set of ioctl() by using a point value of 0; this will reproduce the behavior of the binary set of ioctl() (for example, replace the syncobj's fence when signaling).
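For reference, creating and destroying a syncobj from userspace boils down to two ioctls. The sketch below is hedged: it assumes an already-open DRM device fd and uses the raw uapi directly, whereas most applications would go through the libdrm wrappers (e.g. drmSyncobjCreate()):

#include <stdint.h>
#include <sys/ioctl.h>
#include <drm.h>        /* DRM uapi header, typically shipped with libdrm */

static int create_signaled_syncobj(int fd, uint32_t *handle)
{
        struct drm_syncobj_create create = {
                .flags = DRM_SYNCOBJ_CREATE_SIGNALED,
        };

        if (ioctl(fd, DRM_IOCTL_SYNCOBJ_CREATE, &create))
                return -1;

        *handle = create.handle;
        return 0;
}

static void destroy_syncobj(int fd, uint32_t handle)
{
        struct drm_syncobj_destroy destroy = { .handle = handle };

        ioctl(fd, DRM_IOCTL_SYNCOBJ_DESTROY, &destroy);
}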

Host-side wait on syncobjs

DRM_IOCTL_SYNCOBJ_WAIT takes an array of syncobj handles and does a host-side wait on all of the syncobj fences simultaneously. If DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL is set, the wait ioctl will wait on all of the syncobj fences to be signaled before it returns. Otherwise, it returns once at least one syncobj fence has been signaled and the index of a signaled fence is written back to the client.

Unlike the enqueued GPU work dependencies which fail if they see a NULL fence in a syncobj, if DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT is set, the host-side wait will first wait for the syncobj to receive a non-NULL fence and then wait on that fence. If DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT is not set and any one of the syncobjs in the array has a NULL fence, -EINVAL will be returned. Assuming the syncobj starts off with a NULL fence, this allows a client to do a host wait in one thread (or process) which waits on GPU work submitted in another thread (or process) without having to manually synchronize between the two. This requirement is inherited from the Vulkan fence API.

If DRM_SYNCOBJ_WAIT_FLAGS_WAIT_DEADLINE is set, the ioctl will also set a fence deadline hint on the backing fences before waiting, to provide the fence signaler with an appropriate sense of urgency. The deadline is specified as an absolute CLOCK_MONOTONIC value in units of ns.

Similarly, DRM_IOCTL_SYNCOBJ_TIMELINE_WAIT takes an array of syncobj handles as well as an array of u64 points and does a host-side wait on all of the syncobj fences at the given points simultaneously.

DRM_IOCTL_SYNCOBJ_TIMELINE_WAIT also adds the ability to wait for a given fence to materialize on the timeline without waiting for the fence to be signaled, by using the DRM_SYNCOBJ_WAIT_FLAGS_WAIT_AVAILABLE flag. This requirement is inherited from the wait-before-signal behavior required by the Vulkan timeline semaphore API.

Alternatively, DRM_IOCTL_SYNCOBJ_EVENTFD can be used to wait without blocking: an eventfd will be signaled when the syncobj is. This is useful to integrate the wait in an event loop.
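As a hedged userspace sketch of the wait behavior above: wait at most one second for any of two syncobjs, also waiting for their fences to be submitted first. The timeout arithmetic and error handling are illustrative:

#include <stdint.h>
#include <time.h>
#include <sys/ioctl.h>
#include <drm.h>

static int wait_any_syncobj(int fd, uint32_t handles[2])
{
        struct timespec ts;
        struct drm_syncobj_wait wait = {
                .handles = (uintptr_t)handles,
                .count_handles = 2,
                .flags = DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT,
        };

        /* timeout_nsec is an absolute CLOCK_MONOTONIC value. */
        clock_gettime(CLOCK_MONOTONIC, &ts);
        wait.timeout_nsec = (int64_t)ts.tv_sec * 1000000000ll + ts.tv_nsec +
                            1000000000ll;

        if (ioctl(fd, DRM_IOCTL_SYNCOBJ_WAIT, &wait))
                return -1;

        /* Without WAIT_ALL, first_signaled holds the index of a signaled
         * syncobj. */
        return (int)wait.first_signaled;
}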

Import/export of syncobjs

DRM_IOCTL_SYNCOBJ_FD_TO_HANDLE and DRM_IOCTL_SYNCOBJ_HANDLE_TO_FD provide two mechanisms for import/export of syncobjs.

The first lets the client import or export an entire syncobj to a file descriptor. These fds are opaque and have no other use case, except passing the syncobj between processes. All exported file descriptors and any syncobj handles created as a result of importing those file descriptors own a reference to the same underlying struct drm_syncobj and the syncobj can be used persistently across all the processes with which it is shared. The syncobj is freed only once the last reference is dropped. Unlike dma-buf, importing a syncobj creates a new handle (with its own reference) for every import instead of de-duplicating. The primary use-case of this persistent import/export is for shared Vulkan fences and semaphores.

The second import/export mechanism, which is indicated by DRM_SYNCOBJ_FD_TO_HANDLE_FLAGS_IMPORT_SYNC_FILE or DRM_SYNCOBJ_HANDLE_TO_FD_FLAGS_EXPORT_SYNC_FILE, lets the client import/export the syncobj's current fence from/to a sync_file. When a syncobj is exported to a sync file, that sync file wraps the syncobj's fence at the time of export and any later signal or reset operations on the syncobj will not affect the exported sync file. When a sync file is imported into a syncobj, the syncobj's fence is set to the fence wrapped by that sync file. Because sync files are immutable, resetting or signaling the syncobj will not affect any sync files whose fences have been imported into the syncobj.

Import/export of timeline points in timeline syncobjs

DRM_IOCTL_SYNCOBJ_TRANSFER provides a mechanism to transfer a struct dma_fence_chain of a syncobj at a given u64 point to another u64 point into another syncobj.

Note that if you want to transfer a struct dma_fence_chain from a given point on a timeline syncobj from/into a binary syncobj, you can use the point 0 to mean take/replace the fence in the syncobj.

structdrm_syncobj

sync object.

Definition:

struct drm_syncobj {    struct kref refcount;    struct dma_fence  *fence;    struct list_head cb_list;    struct list_head ev_fd_list;    spinlock_t lock;    struct file *file;};

Members

refcount

Reference count of this object.

fence

NULL or a pointer to the fence bound to this object.

This field should not be used directly. Use drm_syncobj_fence_get() and drm_syncobj_replace_fence() instead.

cb_list

List of callbacks to call when thefence gets replaced.

ev_fd_list

List of registered eventfd.

lock

Protectscb_list andev_fd_list, and write-locksfence.

file

A file backing for this syncobj.

Description

This structure defines a generic sync object which wraps adma_fence.

voiddrm_syncobj_get(structdrm_syncobj*obj)

acquire a syncobj reference

Parameters

structdrm_syncobj*obj

sync object

Description

This acquires an additional reference to obj. It is illegal to call this without already holding a reference. No locks required.

voiddrm_syncobj_put(structdrm_syncobj*obj)

release a reference to a sync object.

Parameters

structdrm_syncobj*obj

sync object.

structdma_fence*drm_syncobj_fence_get(structdrm_syncobj*syncobj)

get a reference to a fence in a sync object

Parameters

structdrm_syncobj*syncobj

sync object.

Description

This acquires an additional reference to the drm_syncobj.fence contained in obj, if not NULL. It is illegal to call this without already holding a reference. No locks required.

Return

Either the fence ofobj or NULL if there’s none.

structdrm_syncobj*drm_syncobj_find(structdrm_file*file_private,u32handle)

lookup and reference a sync object.

Parameters

structdrm_file*file_private

drm file private pointer

u32handle

sync object handle to lookup.

Description

Returns a reference to the syncobj pointed to by handle or NULL. The reference must be released by calling drm_syncobj_put().

voiddrm_syncobj_add_point(structdrm_syncobj*syncobj,structdma_fence_chain*chain,structdma_fence*fence,uint64_tpoint)

add new timeline point to the syncobj

Parameters

structdrm_syncobj*syncobj

sync object to add the timeline point to

structdma_fence_chain*chain

chain node to use to add the point

structdma_fence*fence

fence to encapsulate in the chain node

uint64_tpoint

sequence number to use for the point

Description

Add the chain node as new timeline point to the syncobj.
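A driver-side sketch of how a timeline point is typically attached, assuming the chain node is allocated up front so the signalling path never has to allocate memory; the my_* naming is illustrative:

static int my_signal_timeline_point(struct drm_syncobj *syncobj,
                                    struct dma_fence *hw_fence, u64 point)
{
        struct dma_fence_chain *chain;

        chain = dma_fence_chain_alloc();
        if (!chain)
                return -ENOMEM;

        /* drm_syncobj_add_point() takes over the chain node. */
        drm_syncobj_add_point(syncobj, chain, hw_fence, point);
        return 0;
}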

voiddrm_syncobj_replace_fence(structdrm_syncobj*syncobj,structdma_fence*fence)

replace fence in a sync object.

Parameters

structdrm_syncobj*syncobj

Sync object to replace fence in

structdma_fence*fence

fence to install in the sync object.

Description

This replaces the fence on a sync object.

intdrm_syncobj_find_fence(structdrm_file*file_private,u32handle,u64point,u64flags,structdma_fence**fence)

lookup and reference the fence in a sync object

Parameters

structdrm_file*file_private

drm file private pointer

u32handle

sync object handle to lookup.

u64point

timeline point

u64flags

DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT or not

structdma_fence**fence

out parameter for the fence

Description

This is just a convenience function that combines drm_syncobj_find() and drm_syncobj_fence_get().

Returns 0 on success or a negative error value on failure. On success fence contains a reference to the fence, which must be released by calling dma_fence_put().

voiddrm_syncobj_free(structkref*kref)

free a sync object.

Parameters

structkref*kref

kref to free.

Description

Only to be called from kref_put in drm_syncobj_put.

intdrm_syncobj_create(structdrm_syncobj**out_syncobj,uint32_tflags,structdma_fence*fence)

create a new syncobj

Parameters

structdrm_syncobj**out_syncobj

returned syncobj

uint32_tflags

DRM_SYNCOBJ_* flags

structdma_fence*fence

if non-NULL, the syncobj will represent this fence

Description

This is the first function to create a sync object. After creating, drivers probably want to make it available to userspace, either through drm_syncobj_get_handle() or drm_syncobj_get_fd().

Returns 0 on success or a negative error value on failure.

intdrm_syncobj_get_handle(structdrm_file*file_private,structdrm_syncobj*syncobj,u32*handle)

get a handle from a syncobj

Parameters

structdrm_file*file_private

drm file private pointer

structdrm_syncobj*syncobj

Sync object to export

u32*handle

out parameter with the new handle

Description

Exports a sync object created with drm_syncobj_create() as a handle on file_private to userspace.

Returns 0 on success or a negative error value on failure.

intdrm_syncobj_get_fd(structdrm_syncobj*syncobj,int*p_fd)

get a file descriptor from a syncobj

Parameters

structdrm_syncobj*syncobj

Sync object to export

int*p_fd

out parameter with the new file descriptor

Description

Exports a sync object created withdrm_syncobj_create() as a file descriptor.

Returns 0 on success or a negative error value on failure.

signedlongdrm_timeout_abs_to_jiffies(int64_ttimeout_nsec)

calculate jiffies timeout from absolute value

Parameters

int64_ttimeout_nsec

timeout nsec component in ns, 0 for poll

Description

Calculate the timeout in jiffies from an absolute time in sec/nsec.

DRM Execution context

This component mainly abstracts the retry loop necessary for locking multiple GEM objects while preparing hardware operations (e.g. command submissions, page table updates, etc.).

If a contention is detected while locking a GEM object the cleanup procedure unlocks all previously locked GEM objects and locks the contended one first before locking any further objects.

After an object is locked fence slots can optionally be reserved on the dma_resv object inside the GEM object.

A typical usage pattern should look like this:

struct drm_gem_object *obj;
struct drm_exec exec;
unsigned long index;
int ret;

drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT, 0);
drm_exec_until_all_locked(&exec) {
        ret = drm_exec_prepare_obj(&exec, boA, 1);
        drm_exec_retry_on_contention(&exec);
        if (ret)
                goto error;

        ret = drm_exec_prepare_obj(&exec, boB, 1);
        drm_exec_retry_on_contention(&exec);
        if (ret)
                goto error;
}

drm_exec_for_each_locked_object(&exec, index, obj) {
        dma_resv_add_fence(obj->resv, fence, DMA_RESV_USAGE_READ);
        ...
}

drm_exec_fini(&exec);

See struct drm_exec for more details.

structdrm_exec

Execution context

Definition:

struct drm_exec {    u32 flags;    struct ww_acquire_ctx   ticket;    unsigned int            num_objects;    unsigned int            max_objects;    struct drm_gem_object   **objects;    struct drm_gem_object   *contended;    struct drm_gem_object *prelocked;};

Members

flags

Flags to control locking behavior

ticket

WW ticket used for acquiring locks

num_objects

number of objects locked

max_objects

maximum objects in array

objects

array of the locked objects

contended

contended GEM object we backed off for

prelocked

already locked GEM object due to contention

structdrm_gem_object*drm_exec_obj(structdrm_exec*exec,unsignedlongindex)

Return the object for a given drm_exec index

Parameters

structdrm_exec*exec

Pointer to the drm_exec context

unsignedlongindex

The index.

Return

Pointer to the locked object corresponding to index if index is within the number of locked objects. NULL otherwise.

drm_exec_for_each_locked_object

drm_exec_for_each_locked_object(exec,index,obj)

iterate over all the locked objects

Parameters

exec

drm_exec object

index

unsigned long index for the iteration

obj

the current GEM object

Description

Iterate over all the locked GEM objects inside the drm_exec object.

drm_exec_for_each_locked_object_reverse

drm_exec_for_each_locked_object_reverse(exec,index,obj)

iterate over all the locked objects in reverse locking order

Parameters

exec

drm_exec object

index

unsigned long index for the iteration

obj

the current GEM object

Description

Iterate over all the locked GEM objects inside the drm_exec object in reverse locking order. Note that index may go below zero and wrap, but that will be caught by drm_exec_obj(), returning a NULL object.

drm_exec_until_all_locked

drm_exec_until_all_locked(exec)

loop until all GEM objects are locked

Parameters

exec

drm_exec object

Description

Core functionality of the drm_exec object. Loops until all GEM objects are locked and no more contention exists. At the beginning of the loop it is guaranteed that no GEM object is locked.

Since labels can't be defined local to the loop's body we use a jump pointer to make sure that the retry is only used from within the loop's body.

drm_exec_retry_on_contention

drm_exec_retry_on_contention(exec)

restart the loop to grab all locks

Parameters

exec

drm_exec object

Description

Control flow helper to continue when a contention was detected and we need to clean up and re-start the loop to prepare all GEM objects.

booldrm_exec_is_contended(structdrm_exec*exec)

check for contention

Parameters

structdrm_exec*exec

drm_exec object

Description

Returns true if the drm_exec object has run into some contention while locking a GEM object and needs to clean up.

voiddrm_exec_init(structdrm_exec*exec,u32flags,unsignednr)

initialize a drm_exec object

Parameters

structdrm_exec*exec

the drm_exec object to initialize

u32flags

controls locking behavior, see DRM_EXEC_* defines

unsignednr

the initial # of objects

Description

Initialize the object and make sure that we can track locked objects.

If nr is non-zero then it is used as the initial objects table size. In either case, the table will grow (be re-allocated) on demand.

voiddrm_exec_fini(structdrm_exec*exec)

finalize a drm_exec object

Parameters

structdrm_exec*exec

the drm_exec object to finalize

Description

Unlock all locked objects, drop the references to objects and free all memory used for tracking the state.

booldrm_exec_cleanup(structdrm_exec*exec)

cleanup when contention is detected

Parameters

structdrm_exec*exec

the drm_exec object to cleanup

Description

Cleanup the current state and return true if we should stay inside the retry loop, false if there wasn't any contention detected and we can keep the objects locked.

intdrm_exec_lock_obj(structdrm_exec*exec,structdrm_gem_object*obj)

lock a GEM object for use

Parameters

structdrm_exec*exec

the drm_exec object with the state

structdrm_gem_object*obj

the GEM object to lock

Description

Lock a GEM object for use and grab a reference to it.

Return

-EDEADLK if a contention is detected, -EALREADY when the object is already locked (can be suppressed by setting the DRM_EXEC_IGNORE_DUPLICATES flag), -ENOMEM when memory allocation failed and zero for success.

voiddrm_exec_unlock_obj(structdrm_exec*exec,structdrm_gem_object*obj)

unlock a GEM object in this exec context

Parameters

structdrm_exec*exec

the drm_exec object with the state

structdrm_gem_object*obj

the GEM object to unlock

Description

Unlock the GEM object and remove it from the collection of locked objects. Should only be used to unlock the most recently locked objects. It's not time efficient to unlock objects locked long ago.

intdrm_exec_prepare_obj(structdrm_exec*exec,structdrm_gem_object*obj,unsignedintnum_fences)

prepare a GEM object for use

Parameters

structdrm_exec*exec

the drm_exec object with the state

structdrm_gem_object*obj

the GEM object to prepare

unsignedintnum_fences

how many fences to reserve

Description

Prepare a GEM object for use by locking it and reserving fence slots.

Return

-EDEADLK if a contention is detected, -EALREADY when the object is already locked, -ENOMEM when memory allocation failed and zero for success.

intdrm_exec_prepare_array(structdrm_exec*exec,structdrm_gem_object**objects,unsignedintnum_objects,unsignedintnum_fences)

helper to prepare an array of objects

Parameters

structdrm_exec*exec

the drm_exec object with the state

structdrm_gem_object**objects

array of GEM object to prepare

unsignedintnum_objects

number of GEM objects in the array

unsignedintnum_fences

number of fences to reserve on each GEM object

Description

Prepares all GEM objects in an array, aborts on first error. Reserves num_fences on each GEM object after locking it.

Return

-EDEADLK on contention, -EALREADY when an object is already locked, -ENOMEM when memory allocation failed and zero for success.

GPU Scheduler

Overview

The GPU scheduler provides entities which allow userspace to push jobs into software queues which are then scheduled on a hardware run queue. The software queues have a priority among them. The scheduler selects the entities from the run queue using a FIFO. The scheduler provides dependency handling features among jobs. The driver is supposed to provide callback functions for backend operations to the scheduler, like submitting a job to the hardware run queue, returning the dependencies of a job, etc.

The organisation of the scheduler is the following:

  1. Each hw run queue has one scheduler

  2. Each scheduler has multiple run queues with different priorities (e.g., HIGH_HW, HIGH_SW, KERNEL, NORMAL)

  3. Each scheduler run queue has a queue of entities to schedule

  4. Entities themselves maintain a queue of jobs that will be scheduled on the hardware.

The jobs in an entity are always scheduled in the order in which they were pushed.

Note that once a job was taken from the entity's queue and pushed to the hardware, i.e. the pending queue, the entity must not be referenced anymore through the job's entity pointer.

Flow Control

The DRM GPU scheduler provides a flow control mechanism to regulate the rate at which the jobs fetched from scheduler entities are executed.

In this context the drm_gpu_scheduler keeps track of a driver-specified credit limit representing the capacity of this scheduler and a credit count; every drm_sched_job carries a driver-specified number of credits.

Once a job is executed (but not yet finished), the job's credits contribute to the scheduler's credit count until the job is finished. If by executing one more job the scheduler's credit count would exceed the scheduler's credit limit, the job won't be executed. Instead, the scheduler will wait until the credit count has decreased enough to not overflow its credit limit. This implies waiting for previously executed jobs.
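As a hedged sketch of how a driver might use this: size the per-job credits by the ring space the job consumes, so the scheduler's credit limit maps directly onto ring capacity. struct my_job and the slot accounting are illustrative; only the drm_sched_job_init() call is DRM API:

struct my_job {
        struct drm_sched_job base;
        u32 ring_slots;         /* ring space this job will consume */
};

static int my_job_init(struct my_job *job, struct drm_sched_entity *entity,
                       void *owner, u64 client_id)
{
        /* The job contributes 'ring_slots' credits; the scheduler will not
         * run it while doing so would exceed the configured credit limit. */
        return drm_sched_job_init(&job->base, entity, job->ring_slots,
                                  owner, client_id);
}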

Scheduler Function References

DRM_SCHED_FENCE_DONT_PIPELINE

DRM_SCHED_FENCE_DONT_PIPELINE

Prevent dependency pipelining

Description

Setting this flag on a scheduler fence prevents pipelining of jobs depending on this fence. In other words we always insert a full CPU round trip before dependent jobs are pushed to the hw queue.

DRM_SCHED_FENCE_FLAG_HAS_DEADLINE_BIT

DRM_SCHED_FENCE_FLAG_HAS_DEADLINE_BIT

A fence deadline hint has been set

Description

Because a deadline hint can be set before the backing hw fence is created, we need to keep track of whether a deadline has already been set.

structdrm_sched_entity

A wrapper around a job queue (typically attached to the DRM file_priv).

Definition:

struct drm_sched_entity {    struct list_head                list;    spinlock_t lock;    struct drm_sched_rq             *rq;    struct drm_gpu_scheduler        **sched_list;    unsigned int                    num_sched_list;    enum drm_sched_priority         priority;    struct spsc_queue               job_queue;    atomic_t fence_seq;    uint64_t fence_context;    struct dma_fence                *dependency;    struct dma_fence_cb             cb;    atomic_t *guilty;    struct dma_fence           *last_scheduled;    struct task_struct              *last_user;    bool stopped;    struct completion               entity_idle;    ktime_t oldest_job_waiting;    struct rb_node                  rb_tree_node;};

Members

list

Used to append this struct to the list of entities in the runqueue rq under drm_sched_rq.entities.

Protected by drm_sched_rq.lock of rq.

lock

Lock protecting the run-queue (rq) to which this entity belongs, priority and the list of schedulers (sched_list, num_sched_list).

rq

Runqueue on which this entity is currently scheduled.

FIXME: Locking is very unclear for this. Writers are protected by lock, but readers are generally lockless and seem to just race with not even a READ_ONCE.

sched_list

A list of schedulers (struct drm_gpu_scheduler). Jobs from this entity can be scheduled on any scheduler on this list.

This can be modified by calling drm_sched_entity_modify_sched(). Locking is entirely up to the driver, see the above function for more details.

This will be set to NULL if num_sched_list equals 1 and rq has been set already.

FIXME: This means priority changes through drm_sched_entity_set_priority() will be lost henceforth in this case.

num_sched_list

Number of drm_gpu_schedulers in the sched_list.

priority

Priority of the entity. This can be modified by calling drm_sched_entity_set_priority(). Protected by lock.

job_queue

the list of jobs of this entity.

fence_seq

A linearly increasing seqno incremented with each new drm_sched_fence which is part of the entity.

FIXME: Callers of drm_sched_job_arm() need to ensure correct locking, this doesn't need to be atomic.

fence_context

A unique context for all the fences which belong to this entity. The drm_sched_fence.scheduled uses the fence_context but drm_sched_fence.finished uses fence_context + 1.

dependency

The dependency fence of the job which is on the top of the job queue.

cb

Callback for the dependency fence above.

guilty

Points to entities’ guilty.

last_scheduled

Points to the finished fence of the last scheduled job. Only written by drm_sched_entity_pop_job(). Can be accessed locklessly from drm_sched_job_arm() if the queue is empty.

last_user

last group leader pushing a job into the entity.

stopped

Marks the entity as removed from the rq and destined for termination. This is set by calling drm_sched_entity_flush() and by drm_sched_fini().

entity_idle

Signals when entity is not in use, used to sequence entity cleanup indrm_sched_entity_fini().

oldest_job_waiting

Marks earliest job waiting in SW queue

rb_tree_node

The node used to insert this entity into time based priority queue

Description

Entities will emit jobs in order to their corresponding hardware ring, and the scheduler will alternate between entities based on scheduling policy.

structdrm_sched_rq

queue of entities to be scheduled.

Definition:

struct drm_sched_rq {    struct drm_gpu_scheduler        *sched;    spinlock_t lock;    struct drm_sched_entity         *current_entity;    struct list_head                entities;    struct rb_root_cached           rb_tree_root;};

Members

sched

the scheduler to which this rq belongs to.

lock

protectsentities,rb_tree_root andcurrent_entity.

current_entity

the entity which is to be scheduled.

entities

list of the entities to be scheduled.

rb_tree_root

root of time based priority queue of entities for FIFO scheduling

Description

Run queue is a set of entities scheduling command submissions for one specific ring. It implements the scheduling policy that selects the next entity to emit commands from.

structdrm_sched_fence

fences corresponding to the scheduling of a job.

Definition:

struct drm_sched_fence {    struct dma_fence                scheduled;    struct dma_fence                finished;    ktime_t deadline;    struct dma_fence                *parent;    struct drm_gpu_scheduler        *sched;    spinlock_t lock;    void *owner;    uint64_t drm_client_id;};

Members

scheduled

this fence is what will be signaled by the scheduler when the job is scheduled.

finished

this fence is what will be signaled by the scheduler when the job is completed.

When setting up an out fence for the job, you should use this, since it's available immediately upon drm_sched_job_init(), and the fence returned by the driver from run_job() won't be created until the dependencies have resolved.

deadline

deadline set on drm_sched_fence.finished which potentially needs to be propagated to drm_sched_fence.parent

parent

the fence returned by drm_sched_backend_ops.run_job when scheduling the job on hardware. We signal the drm_sched_fence.finished fence once parent is signalled.

sched

the scheduler instance to which the job having this struct belongs.

lock

the lock used by the scheduled and the finished fences.

owner

job owner for debugging

drm_client_id

The client_id of the drm_file which owns the job.

structdrm_sched_job

A job to be run by an entity.

Definition:

struct drm_sched_job {    ktime_t submit_ts;    struct drm_gpu_scheduler        *sched;    struct drm_sched_fence          *s_fence;    struct drm_sched_entity         *entity;    enum drm_sched_priority         s_priority;    u32 credits;    unsigned int                    last_dependency;    atomic_t karma;    struct spsc_node                queue_node;    struct list_head                list;    union {        struct dma_fence_cb     finish_cb;        struct work_struct      work;    };    struct dma_fence_cb             cb;    struct xarray                   dependencies;};

Members

submit_ts

When the job was pushed into the entity queue.

sched

The scheduler this job is or will be scheduled on. Gets set bydrm_sched_job_arm(). Valid until drm_sched_backend_ops.free_job()has finished.

s_fence

contains the fences for the scheduling of job.

entity

the entity to which this job belongs.

s_priority

the priority of the job.

credits

the number of credits this job contributes to the scheduler

last_dependency

tracks dependencies as they signal

karma

increment on every hang caused by this job. If this exceeds the hang limit of the scheduler then the job is marked guilty and will not be scheduled further.

queue_node

used to append this struct to the queue of jobs in an entity.

list

a job participates in the “pending” and “done” lists.

{unnamed_union}

anonymous

finish_cb

the callback for the finished fence.

work

Helper to reschedule job kill to different context.

cb

the callback for the parent fence in s_fence.

dependencies

Contains the dependencies as struct dma_fence for this job, see drm_sched_job_add_dependency() and drm_sched_job_add_implicit_dependencies().

Description

A job is created by the driver using drm_sched_job_init(), and should call drm_sched_entity_push_job() once it wants the scheduler to schedule the job.
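The lifecycle described above, as a hedged sketch (one credit per job, a single optional input fence; error handling trimmed to the essentials):

static int my_submit_job(struct drm_sched_job *job,
                         struct drm_sched_entity *entity,
                         struct dma_fence *in_fence,
                         void *owner, u64 client_id)
{
        int ret;

        ret = drm_sched_job_init(job, entity, 1, owner, client_id);
        if (ret)
                return ret;

        /* Dependencies are added between init and arm. The fence reference
         * is consumed, so take one for the scheduler. */
        if (in_fence) {
                ret = drm_sched_job_add_dependency(job, dma_fence_get(in_fence));
                if (ret)
                        goto err_cleanup;
        }

        drm_sched_job_arm(job);         /* point of no return */
        drm_sched_entity_push_job(job);
        return 0;

err_cleanup:
        drm_sched_job_cleanup(job);
        return ret;
}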

enumdrm_gpu_sched_stat

the scheduler’s status

Constants

DRM_GPU_SCHED_STAT_NONE

Reserved. Do not use.

DRM_GPU_SCHED_STAT_RESET

The GPU hung and successfully reset.

DRM_GPU_SCHED_STAT_ENODEV

Error: Device is not available anymore.

DRM_GPU_SCHED_STAT_NO_HANG

Contrary to scheduler’s assumption, the GPUdid not hang and is still running.

structdrm_sched_backend_ops

Define the backend operations called by the scheduler

Definition:

struct drm_sched_backend_ops {    struct dma_fence *(*prepare_job)(struct drm_sched_job *sched_job, struct drm_sched_entity *s_entity);    struct dma_fence *(*run_job)(struct drm_sched_job *sched_job);    enum drm_gpu_sched_stat (*timedout_job)(struct drm_sched_job *sched_job);    void (*free_job)(struct drm_sched_job *sched_job);    void (*cancel_job)(struct drm_sched_job *sched_job);};

Members

prepare_job

Called when the scheduler is considering scheduling this job next, to get another struct dma_fence for this job to block on. Once it returns NULL, run_job() may be called.

Can be NULL if no additional preparation to the dependencies is necessary. Skipped when jobs are killed instead of run.

run_job

Called to execute the job once all of the dependencieshave been resolved.

sched_job: the job to run

The deprecated drm_sched_resubmit_jobs() (called by struct drm_sched_backend_ops.timedout_job) can invoke this again with the same parameters. Using this is discouraged because it violates dma_fence rules, notably dma_fence_init() has to be called on already initialized fences for a second time. Moreover, this is dangerous because attempts to allocate memory might deadlock with memory management code waiting for the reset to complete.

TODO: Document what drivers should do / use instead.

This method is called in a workqueue context - either from the submit_wq the driver passed through drm_sched_init(), or, if the driver passed NULL, a separate, ordered workqueue the scheduler allocated.

Note that the scheduler expects to ‘inherit’ its own reference to this fence from the callback. It does not invoke an extra dma_fence_get() on it. Consequently, this callback must take a reference for the scheduler, and additional ones for the driver's respective needs.

Return:

  • On success: dma_fence the driver must signal once the hardware has completed the job (“hardware fence”).

  • On failure: NULL or an ERR_PTR.

timedout_job

Called when a job has taken too long to execute,to trigger GPU recovery.

sched_job: The job that has timed out

Drivers typically issue a reset to recover from GPU hangs. This procedure looks very different depending on whether a firmware or a hardware scheduler is being used.

For a FIRMWARE SCHEDULER, each ring has one scheduler, and each scheduler has one entity. Hence, the steps taken typically look as follows:

  1. Stop the scheduler using drm_sched_stop(). This will pause the scheduler workqueues and cancel the timeout work, guaranteeing that nothing is queued while the ring is being removed.

  2. Remove the ring. The firmware will make sure that the corresponding parts of the hardware are reset, and that other rings are not impacted.

  3. Kill the entity and the associated scheduler.

For a HARDWARE SCHEDULER, a scheduler instance schedules jobs from one or more entities to one ring. This implies that all entities associated with the affected scheduler cannot be torn down, because this would effectively also affect innocent userspace processes which did not submit faulty jobs (for example).

Consequently, the procedure to recover with a hardware schedulershould look like this:

  1. Stop all schedulers impacted by the reset using drm_sched_stop().

  2. Kill the entity the faulty job stems from.

  3. Issue a GPU reset on all faulty rings (driver-specific).

  4. Re-submit jobs on all schedulers impacted, by re-submitting them to the entities which are still alive.

  5. Restart all schedulers that were stopped in step #1 using drm_sched_start().

Note that some GPUs have distinct hardware queues but need to reset the GPU globally, which requires extra synchronization between the timeout handlers of different schedulers. One way to achieve this synchronization is to create an ordered workqueue (using alloc_ordered_workqueue()) at the driver level, and pass this queue as drm_sched_init()'s timeout_wq parameter. This will guarantee that timeout handlers are executed sequentially.

Return: The scheduler's status, defined by enum drm_gpu_sched_stat

free_job

Called once the job's finished fence has been signaled and it's time to clean it up.

cancel_job

Used by the scheduler to guarantee remaining jobs' fences get signaled in drm_sched_fini().

Used by the scheduler to cancel all jobs that have not been executed with struct drm_sched_backend_ops.run_job by the time drm_sched_fini() gets invoked.

Drivers need to signal the passed job's hardware fence with an appropriate error code (e.g., -ECANCELED) in this callback. They must not free the job.

The scheduler will only call this callback once it stopped calling all other callbacks forever, with the exception of struct drm_sched_backend_ops.free_job.

Description

These functions should be implemented on the driver side.
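A hedged sketch of a minimal ops table for a fictional driver; struct my_drv_job, the hardware submission and the reset itself are placeholders, only the scheduler entry points and return conventions follow the documentation above:

struct my_drv_job {
        struct drm_sched_job base;
        struct dma_fence *hw_fence;     /* created by the driver's submit path */
};

#define to_my_drv_job(j) container_of(j, struct my_drv_job, base)

static struct dma_fence *my_run_job(struct drm_sched_job *sched_job)
{
        struct my_drv_job *job = to_my_drv_job(sched_job);

        /* Kick the hardware here. The scheduler inherits the returned
         * reference, so take one explicitly. */
        return dma_fence_get(job->hw_fence);
}

static enum drm_gpu_sched_stat my_timedout_job(struct drm_sched_job *sched_job)
{
        /* Driver-specific reset procedure goes here, see the steps above. */
        return DRM_GPU_SCHED_STAT_RESET;
}

static void my_free_job(struct drm_sched_job *sched_job)
{
        drm_sched_job_cleanup(sched_job);
        kfree(to_my_drv_job(sched_job));
}

static const struct drm_sched_backend_ops my_sched_ops = {
        .run_job        = my_run_job,
        .timedout_job   = my_timedout_job,
        .free_job       = my_free_job,
};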

structdrm_gpu_scheduler

scheduler instance-specific data

Definition:

struct drm_gpu_scheduler {    const struct drm_sched_backend_ops      *ops;    u32 credit_limit;    atomic_t credit_count;    long timeout;    const char                      *name;    u32 num_rqs;    struct drm_sched_rq             **sched_rq;    wait_queue_head_t job_scheduled;    atomic64_t job_id_count;    struct workqueue_struct         *submit_wq;    struct workqueue_struct         *timeout_wq;    struct work_struct              work_run_job;    struct work_struct              work_free_job;    struct delayed_work             work_tdr;    struct list_head                pending_list;    spinlock_t job_list_lock;    int hang_limit;    atomic_t *score;    atomic_t _score;    bool ready;    bool free_guilty;    bool pause_submit;    bool own_submit_wq;    struct device                   *dev;};

Members

ops

backend operations provided by the driver.

credit_limit

the credit limit of this scheduler

credit_count

the current credit count of this scheduler

timeout

the time after which a job is removed from the scheduler.

name

name of the ring for which this scheduler is being used.

num_rqs

Number of run-queues. This is at most DRM_SCHED_PRIORITY_COUNT, as there's usually one run-queue per priority, but could be less.

sched_rq

An allocated array of run-queues of size num_rqs.

job_scheduled

once drm_sched_entity_flush() is called the scheduler waits on this wait queue until all the scheduled jobs are finished.

job_id_count

used to assign a unique id to each job.

submit_wq

workqueue used to queue work_run_job and work_free_job

timeout_wq

workqueue used to queue work_tdr

work_run_job

work which calls run_job op of each scheduler.

work_free_job

work which calls free_job op of each scheduler.

work_tdr

schedules a delayed call todrm_sched_job_timedout after thetimeout interval is over.

pending_list

the list of jobs which are currently in the job queue.

job_list_lock

lock to protect the pending_list.

hang_limit

once the hangs caused by a job cross this limit then it is marked guilty and it will no longer be considered for scheduling.

score

score to help the loadbalancer pick an idle sched

_score

score used when the driver doesn’t provide one

ready

marks if the underlying HW is ready to work

free_guilty

A hint to the timeout handler to free the guilty job.

pause_submit

pause queuing of work_run_job on submit_wq

own_submit_wq

scheduler owns allocation of submit_wq

dev

system struct device

Description

One scheduler is implemented for each hardware ring.

structdrm_sched_init_args

parameters for initializing a DRM GPU scheduler

Definition:

struct drm_sched_init_args {    const struct drm_sched_backend_ops *ops;    struct workqueue_struct *submit_wq;    struct workqueue_struct *timeout_wq;    u32 num_rqs;    u32 credit_limit;    unsigned int hang_limit;    long timeout;    atomic_t *score;    const char *name;    struct device *dev;};

Members

ops

backend operations provided by the driver

submit_wq

workqueue to use for submission. If NULL, an ordered wq isallocated and used.

timeout_wq

workqueue to use for timeout work. If NULL, the system_wq is used.

num_rqs

Number of run-queues. This may be at most DRM_SCHED_PRIORITY_COUNT,as there’s usually one run-queue per priority, but may be less.

credit_limit

the number of credits this scheduler can hold from all jobs

hang_limit

number of times to allow a job to hang before dropping it.This mechanism is DEPRECATED. Set it to 0.

timeout

timeout value in jiffies for submitted jobs.

score

score atomic shared with other schedulers. May be NULL.

name

name (typically the driver’s name). Used for debugging

dev

associated device. Used for debugging

structdrm_sched_pending_job_iter

DRM scheduler pending job iterator state

Definition:

struct drm_sched_pending_job_iter {    struct drm_gpu_scheduler *sched;};

Members

sched

DRM scheduler associated with pending job iterator

drm_sched_for_each_pending_job

drm_sched_for_each_pending_job(__job,__sched,__entity)

Iterator for each pending job in scheduler

Parameters

__job

Current pending job being iterated over

__sched

DRM scheduler to iterate over pending jobs

__entity

DRM scheduler entity to filter jobs, NULL indicates no filter

Description

Iterator for each pending job in the scheduler, filtering on an entity, and enforcing that the scheduler is fully stopped

voiddrm_sched_tdr_queue_imm(structdrm_gpu_scheduler*sched)

immediately start job timeout handler

Parameters

structdrm_gpu_scheduler*sched

scheduler for which the timeout handling should be started.

Description

Start timeout handling immediately for the named scheduler.

voiddrm_sched_fault(structdrm_gpu_scheduler*sched)

immediately start timeout handler

Parameters

structdrm_gpu_scheduler*sched

scheduler where the timeout handling should be started.

Description

Start timeout handling immediately when the driver detects a hardware fault.

unsignedlongdrm_sched_suspend_timeout(structdrm_gpu_scheduler*sched)

Suspend scheduler job timeout

Parameters

structdrm_gpu_scheduler*sched

scheduler instance for which to suspend the timeout

Description

Suspend the delayed work timeout for the scheduler. This is done by modifying the delayed work timeout to an arbitrarily large value, MAX_SCHEDULE_TIMEOUT in this case.

Returns the timeout remaining

voiddrm_sched_resume_timeout(structdrm_gpu_scheduler*sched,unsignedlongremaining)

Resume scheduler job timeout

Parameters

structdrm_gpu_scheduler*sched

scheduler instance for which to resume the timeout

unsignedlongremaining

remaining timeout

Description

Resume the delayed work timeout for the scheduler.

voiddrm_sched_stop(structdrm_gpu_scheduler*sched,structdrm_sched_job*bad)

stop the scheduler

Parameters

structdrm_gpu_scheduler*sched

scheduler instance

structdrm_sched_job*bad

job which caused the time out

Description

Stops the scheduler and also removes and frees all completed jobs.

Note

The bad job will not be freed as it might be used later and so it's the caller's responsibility to release it manually if it's not part of the pending list any more.

This function is typically used for reset recovery (see the documentation of drm_sched_backend_ops.timedout_job() for details). Do not call it for scheduler teardown, i.e., before calling drm_sched_fini().

As it's only used for reset recovery, drivers must not call this function in their struct drm_sched_backend_ops.timedout_job callback when they skip a reset using enum drm_gpu_sched_stat.DRM_GPU_SCHED_STAT_NO_HANG.

voiddrm_sched_start(structdrm_gpu_scheduler*sched,interrno)

recover jobs after a reset

Parameters

structdrm_gpu_scheduler*sched

scheduler instance

interrno

error to set on the pending fences

Description

This function is typically used for reset recovery (see the documentation of drm_sched_backend_ops.timedout_job() for details). Do not call it for scheduler startup. The scheduler itself is fully operational after drm_sched_init() succeeded.

As it's only used for reset recovery, drivers must not call this function in their struct drm_sched_backend_ops.timedout_job callback when they skip a reset using enum drm_gpu_sched_stat.DRM_GPU_SCHED_STAT_NO_HANG.

voiddrm_sched_resubmit_jobs(structdrm_gpu_scheduler*sched)

Deprecated, don’t use in new code!

Parameters

structdrm_gpu_scheduler*sched

scheduler instance

Description

Re-submitting jobs was a concept AMD came up with as a cheap way to implement recovery after a job timeout.

This turned out to not work very well. First of all there are many problems with the dma_fence implementation and requirements. Either the implementation is risking deadlocks with core memory management or violating documented implementation details of the dma_fence object.

Drivers can still save and restore their state for recovery operations, but we shouldn't make this a general scheduler feature around the dma_fence interface. The suggested driver-side replacement is to use drm_sched_for_each_pending_job() after stopping the scheduler and implement their own recovery operations.
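A hedged sketch of that suggested driver-side flow: stop the scheduler, walk the pending jobs to rebuild driver state, perform the reset, then restart. What exactly a driver records per job is hardware-specific and only hinted at here:

static void my_recover_after_timeout(struct drm_gpu_scheduler *sched,
                                     struct drm_sched_job *bad)
{
        struct drm_sched_job *job;

        drm_sched_stop(sched, bad);

        drm_sched_for_each_pending_job(job, sched, NULL) {
                /* Save and/or re-create driver state for 'job' here. */
        }

        /* Driver-specific GPU reset goes here. */

        drm_sched_start(sched, 0);
}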

intdrm_sched_job_init(structdrm_sched_job*job,structdrm_sched_entity*entity,u32credits,void*owner,uint64_tdrm_client_id)

init a scheduler job

Parameters

structdrm_sched_job*job

scheduler job to init

structdrm_sched_entity*entity

scheduler entity to use

u32credits

the number of credits this job contributes to the scheduler's credit limit

void*owner

job owner for debugging

uint64_tdrm_client_id

structdrm_file.client_id of the owner (used by traceevents)

Description

Refer to drm_sched_entity_push_job() documentation for locking considerations.

Drivers must make sure to call drm_sched_job_cleanup() if this function returns successfully, even when job is aborted before drm_sched_job_arm() is called.

Note that this function does not assign a valid value to each struct member of struct drm_sched_job. Take a look at that struct's documentation to see who sets which struct member with what lifetime.

WARNING: amdgpu abuses drm_sched.ready to signal when the hardware has died, which can mean that there's no valid runqueue for an entity. This function returns -ENOENT in this case (which probably should be -EIO as a more meaningful return value).

Returns 0 for success, negative error code otherwise.

voiddrm_sched_job_arm(structdrm_sched_job*job)

arm a scheduler job for execution

Parameters

structdrm_sched_job*job

scheduler job to arm

Description

This arms a scheduler job for execution. Specifically it initializes the drm_sched_job.s_fence of job, so that it can be attached to struct dma_resv or other places that need to track the completion of this job. It also initializes sequence numbers, which are fundamental for fence ordering.

Refer to drm_sched_entity_push_job() documentation for locking considerations.

Once this function was called, you must submit job with drm_sched_entity_push_job().

This can only be called if drm_sched_job_init() succeeded.

intdrm_sched_job_add_dependency(structdrm_sched_job*job,structdma_fence*fence)

adds the fence as a job dependency

Parameters

structdrm_sched_job*job

scheduler job to add the dependencies to

structdma_fence*fence

the dma_fence to add to the list of dependencies.

Description

Note thatfence is consumed in both the success and error cases.

Return

0 on success, or an error on failing to expand the array.

intdrm_sched_job_add_syncobj_dependency(structdrm_sched_job*job,structdrm_file*file,u32handle,u32point)

adds a syncobj’s fence as a job dependency

Parameters

structdrm_sched_job*job

scheduler job to add the dependencies to

structdrm_file*file

drm file private pointer

u32handle

syncobj handle to lookup

u32point

timeline point

Description

This adds the fence matching the given syncobj tojob.

Return

0 on success, or an error on failing to expand the array.

intdrm_sched_job_add_resv_dependencies(structdrm_sched_job*job,structdma_resv*resv,enumdma_resv_usageusage)

add all fences from the resv to the job

Parameters

structdrm_sched_job*job

scheduler job to add the dependencies to

structdma_resv*resv

the dma_resv object to get the fences from

enumdma_resv_usageusage

the dma_resv_usage to use to filter the fences

Description

This adds all fences matching the given usage from resv to job. Must be called with the resv lock held.

Return

0 on success, or an error on failing to expand the array.

intdrm_sched_job_add_implicit_dependencies(structdrm_sched_job*job,structdrm_gem_object*obj,boolwrite)

adds implicit dependencies as job dependencies

Parameters

structdrm_sched_job*job

scheduler job to add the dependencies to

structdrm_gem_object*obj

the gem object to add new dependencies from.

boolwrite

whether the job might write the object (so we need to depend onshared fences in the reservation object).

Description

This should be called after drm_gem_lock_reservations() on your array of GEM objects used in the job but before updating the reservations with your own fences.

Return

0 on success, or an error on failing to expand the array.

booldrm_sched_job_has_dependency(structdrm_sched_job*job,structdma_fence*fence)

check whether fence is the job’s dependency

Parameters

structdrm_sched_job*job

scheduler job to check

structdma_fence*fence

fence to look for

Return

True iffence is found within the job’s dependencies, or otherwise false.

voiddrm_sched_job_cleanup(structdrm_sched_job*job)

clean up scheduler job resources

Parameters

structdrm_sched_job*job

scheduler job to clean up

Description

Cleans up the resources allocated with drm_sched_job_init().

Drivers should call this from their error unwind code if job is aborted before drm_sched_job_arm() is called.

drm_sched_job_arm() is a point of no return since it initializes the fences and their sequence number etc. Once that function has been called, you must submit it with drm_sched_entity_push_job() and cannot simply abort it by calling drm_sched_job_cleanup().

This function should be called in the drm_sched_backend_ops.free_job callback.

structdrm_gpu_scheduler*drm_sched_pick_best(structdrm_gpu_scheduler**sched_list,unsignedintnum_sched_list)

Get a drm sched from a sched_list with the least load

Parameters

structdrm_gpu_scheduler**sched_list

list of drm_gpu_schedulers

unsignedintnum_sched_list

number of drm_gpu_schedulers in the sched_list

Description

Returns a pointer to the sched with the least load or NULL if none of the drm_gpu_schedulers are ready

intdrm_sched_init(structdrm_gpu_scheduler*sched,conststructdrm_sched_init_args*args)

Init a gpu scheduler instance

Parameters

structdrm_gpu_scheduler*sched

scheduler instance

conststructdrm_sched_init_args*args

scheduler initialization arguments

Description

Return 0 on success, otherwise error code.
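A hedged sketch of initializing one scheduler per hardware ring; the credit limit and timeout values are illustrative, and my_sched_ops refers to a driver-provided ops table like the one sketched earlier:

static int my_ring_sched_init(struct drm_gpu_scheduler *sched,
                              struct device *dev)
{
        const struct drm_sched_init_args args = {
                .ops = &my_sched_ops,
                .submit_wq = NULL,      /* let the scheduler allocate one */
                .timeout_wq = NULL,     /* fall back to the system wq */
                .num_rqs = DRM_SCHED_PRIORITY_COUNT,
                .credit_limit = 64,
                .hang_limit = 0,        /* deprecated mechanism, keep at 0 */
                .timeout = msecs_to_jiffies(500),
                .name = "my-ring",
                .dev = dev,
        };

        return drm_sched_init(sched, &args);
}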

voiddrm_sched_fini(structdrm_gpu_scheduler*sched)

Destroy a gpu scheduler

Parameters

structdrm_gpu_scheduler*sched

scheduler instance

Description

Tears down and cleans up the scheduler.

This stops submission of new jobs to the hardware through struct drm_sched_backend_ops.run_job. If struct drm_sched_backend_ops.cancel_job is implemented, all jobs will be canceled through it and afterwards cleaned up through struct drm_sched_backend_ops.free_job. If cancel_job is not implemented, memory could leak.

voiddrm_sched_increase_karma(structdrm_sched_job*bad)

Update sched_entity guilty flag

Parameters

structdrm_sched_job*bad

The job guilty of time out

Description

Increment on every hang caused by the ‘bad’ job. If this exceeds the hang limit of the scheduler then the respective sched entity is marked guilty and jobs from it will not be scheduled further

booldrm_sched_wqueue_ready(structdrm_gpu_scheduler*sched)

Is the scheduler ready for submission

Parameters

structdrm_gpu_scheduler*sched

scheduler instance

Description

Returns true if submission is ready

voiddrm_sched_wqueue_stop(structdrm_gpu_scheduler*sched)

stop scheduler submission

Parameters

structdrm_gpu_scheduler*sched

scheduler instance

Description

Stops the scheduler from pulling new jobs from entities. It also stopsfreeing jobs automatically through drm_sched_backend_ops.free_job().

voiddrm_sched_wqueue_start(structdrm_gpu_scheduler*sched)

start scheduler submission

Parameters

structdrm_gpu_scheduler*sched

scheduler instance

Description

Restarts the scheduler after drm_sched_wqueue_stop() has stopped it.

This function is not necessary for ‘conventional’ startup. The scheduler is fully operational after drm_sched_init() succeeded.

booldrm_sched_is_stopped(structdrm_gpu_scheduler*sched)

Checks whether drm_sched is stopped

Parameters

structdrm_gpu_scheduler*sched

DRM scheduler

Return

true if sched is stopped, false otherwise

booldrm_sched_job_is_signaled(structdrm_sched_job*job)

DRM scheduler job is signaled

Parameters

structdrm_sched_job*job

DRM scheduler job

Description

Determine if the DRM scheduler job is signaled. The DRM scheduler should be stopped to obtain a stable snapshot of state. Both the parent fence (hardware fence) and the finished fence (software fence) are checked to determine the signaling state.

Return

true if job is signaled, false otherwise

intdrm_sched_entity_init(structdrm_sched_entity*entity,enumdrm_sched_prioritypriority,structdrm_gpu_scheduler**sched_list,unsignedintnum_sched_list,atomic_t*guilty)

Init a context entity used by scheduler when submit to HW ring.

Parameters

structdrm_sched_entity*entity

scheduler entity to init

enumdrm_sched_prioritypriority

priority of the entity

structdrm_gpu_scheduler**sched_list

the list of drm scheds on which jobs from thisentity can be submitted

unsignedintnum_sched_list

number of drm sched in sched_list

atomic_t*guilty

atomic_t set to 1 when a job on this queueis found to be guilty causing a timeout

Description

Note that the sched_list must have at least one element to schedule the entity.

For changing priority later on at runtime see drm_sched_entity_set_priority(). For changing the set of schedulers sched_list at runtime see drm_sched_entity_modify_sched().

An entity is cleaned up by calling drm_sched_entity_fini(). See also drm_sched_entity_destroy().

Returns 0 on success or a negative error code on failure.
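A hedged sketch of a per-context entity that is only ever scheduled on a single ring's scheduler, at normal priority and without guilty tracking:

static int my_context_entity_init(struct drm_sched_entity *entity,
                                  struct drm_gpu_scheduler *sched)
{
        struct drm_gpu_scheduler *sched_list[] = { sched };

        return drm_sched_entity_init(entity, DRM_SCHED_PRIORITY_NORMAL,
                                     sched_list, ARRAY_SIZE(sched_list),
                                     NULL);
}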

voiddrm_sched_entity_modify_sched(structdrm_sched_entity*entity,structdrm_gpu_scheduler**sched_list,unsignedintnum_sched_list)

Modify sched of an entity

Parameters

structdrm_sched_entity*entity

scheduler entity to init

structdrm_gpu_scheduler**sched_list

the list of new drm scheds which will replaceexisting entity->sched_list

unsignedintnum_sched_list

number of drm sched in sched_list

Description

Note that this must be called under the same common lock for entity as drm_sched_job_arm() and drm_sched_entity_push_job(), or the driver needs to guarantee through some other means that this is never called while new jobs can be pushed to entity.

intdrm_sched_entity_error(structdrm_sched_entity*entity)

return error of last scheduled job

Parameters

structdrm_sched_entity*entity

scheduler entity to check

Description

Opportunistically return the error of the last scheduled job. Result canchange any time when new jobs are pushed to the hw.

longdrm_sched_entity_flush(structdrm_sched_entity*entity,longtimeout)

Flush a context entity

Parameters

structdrm_sched_entity*entity

scheduler entity

longtimeout

time to wait for the queue to become empty, in jiffies.

Description

drm_sched_entity_fini() is split into two functions; this first one does the waiting, removes the entity from the runqueue and returns an error when the process was killed.

Returns the remaining time in jiffies left from the input timeout

voiddrm_sched_entity_fini(structdrm_sched_entity*entity)

Destroy a context entity

Parameters

structdrm_sched_entity*entity

scheduler entity

Description

Cleans up entity which has been initialized by drm_sched_entity_init().

If there are potentially jobs still in flight or getting newly queued, drm_sched_entity_flush() must be called first. This function then goes over the entity and signals all jobs with an error code if the process was killed.

voiddrm_sched_entity_destroy(structdrm_sched_entity*entity)

Destroy a context entity

Parameters

structdrm_sched_entity*entity

scheduler entity

Description

Callsdrm_sched_entity_flush() anddrm_sched_entity_fini() as aconvenience wrapper.

voiddrm_sched_entity_set_priority(structdrm_sched_entity*entity,enumdrm_sched_prioritypriority)

Sets priority of the entity

Parameters

structdrm_sched_entity*entity

scheduler entity

enumdrm_sched_prioritypriority

scheduler priority

Description

Update the priority of runqueues used for the entity.

voiddrm_sched_entity_push_job(structdrm_sched_job*sched_job)

Submit a job to the entity’s job queue

Parameters

structdrm_sched_job*sched_job

job to submit

Note

To guarantee that the order of insertion to the queue matches the job's fence sequence number, this function should be called with drm_sched_job_arm() under a common lock for the struct drm_sched_entity that was set up for sched_job in drm_sched_job_init().