Execution Queue¶
An execution queue is an interface to a HW execution context. The user creates an execution queue, submits GPU jobs through those queues, and destroys them when done.
Execution queues can also be created by XeKMD itself for driver-internal operations such as object migration.
An execution queue is associated with a specified HW engine or a group of engines (belonging to the same tile and engine class), and any GPU job submitted on the queue will run on one of these engines.
An execution queue is tied to an address space (VM). It holds a reference to the associated VM and the underlying Logical Ring Context(s) (LRCs) until the queue is destroyed.
The execution queue sits on top of the submission backend. It opaquely handles whichever backend the platform uses (GuC or Execlist) and the ring operations the different engine classes support.
Multi Queue Group¶
Multi Queue Group is another mode of execution supported by the compute and blitter copy command streamers (CCS and BCS, respectively). It is an enhancement of the existing hardware architecture and leverages the same submission model. It enables efficient, parallel execution of multiple queues within a single shared context. The multi queue group functionality is only supported with the GuC submission backend. All the queues of a group must use the same address space (VM).
The DRM_XE_EXEC_QUEUE_SET_PROPERTY_MULTI_QUEUE execution queue property supports creating a multi queue group and adding queues to a queue group.
The XE_EXEC_QUEUE_CREATE ioctl call with the above property and the value field set to DRM_XE_MULTI_GROUP_CREATE creates a new multi queue group, with the queue being created becoming the primary queue (aka q0) of the group. To add secondary queues to the group, create them with the above property and the id of the primary queue as the value. The properties of the primary queue (like priority and time slice) apply to the whole group, so these properties can't be set for secondary queues of a group.
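As a sketch, creating a group versus joining one differs only in the value field of the set-property extension passed to queue creation. The struct layout and the numeric XE_PROP_MULTI_QUEUE / XE_MULTI_GROUP_CREATE values below are simplified placeholders for illustration, not the real ABI from include/uapi/drm/xe_drm.h:

```c
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for the xe uapi set-property extension; the real
 * definitions live in include/uapi/drm/xe_drm.h. The property id and
 * the group-create value below are placeholders, not ABI values. */
struct ext_set_property {
	uint64_t next_extension;	/* chained extensions, 0 terminates */
	uint32_t property;
	uint64_t value;
};

#define XE_PROP_MULTI_QUEUE	100	/* placeholder property id */
#define XE_MULTI_GROUP_CREATE	0	/* placeholder for DRM_XE_MULTI_GROUP_CREATE */

/* Build the extension that makes the new queue the primary (q0) of a
 * fresh multi queue group. */
static void ext_create_group(struct ext_set_property *ext)
{
	memset(ext, 0, sizeof(*ext));
	ext->property = XE_PROP_MULTI_QUEUE;
	ext->value = XE_MULTI_GROUP_CREATE;
}

/* Build the extension that adds a secondary queue to an existing group,
 * identified by the primary queue's exec queue id. */
static void ext_join_group(struct ext_set_property *ext, uint32_t primary_id)
{
	memset(ext, 0, sizeof(*ext));
	ext->property = XE_PROP_MULTI_QUEUE;
	ext->value = primary_id;
}
```

The extension would then be chained into the exec queue create ioctl; group-wide properties such as priority are set only on q0.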
The hardware does not support removing a queue from a multi-queue group. However, queues can be dynamically added to the group. A group can have up to 64 queues. To support this, XeKMD holds references to the LRCs of the queues, even after the queues are destroyed by the user, until the whole group is destroyed. The secondary queues hold a reference to the primary queue, thus preventing the group from being destroyed when the user destroys the primary queue. Once the primary queue is destroyed, secondary queues can't be added to the queue group and new job submissions on existing secondary queues are not allowed.
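The lifetime rules above can be modeled with plain reference counting: each secondary queue pins the primary, so the group outlives a user destroy of q0 until the last secondary goes away. This is an illustrative toy model with invented names, not driver code:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of multi queue group lifetime (not the driver's types). */
struct group_model {
	int refs;		/* references held on the primary queue (q0) */
	bool primary_destroyed;	/* user has destroyed q0 */
	bool group_freed;	/* last reference dropped, group gone */
};

static void group_init(struct group_model *g)
{
	g->refs = 1;		/* the user's own reference on q0 */
	g->primary_destroyed = false;
	g->group_freed = false;
}

static void put_primary_ref(struct group_model *g)
{
	if (--g->refs == 0)
		g->group_freed = true;
}

static void add_secondary(struct group_model *g)
{
	/* No new queues may join once q0 has been destroyed. */
	assert(!g->primary_destroyed);
	g->refs++;		/* secondary takes a ref on q0 */
}

static void destroy_primary(struct group_model *g)
{
	g->primary_destroyed = true;
	put_primary_ref(g);
}

static void destroy_secondary(struct group_model *g)
{
	put_primary_ref(g);	/* drop the secondary's ref on q0 */
}
```

Destroying the primary first merely drops one reference; the group is freed only after every secondary has also been destroyed.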
The queues of a multi queue group can set their priority within the group through the DRM_XE_EXEC_QUEUE_SET_PROPERTY_MULTI_QUEUE_PRIORITY property. This multi queue priority can also be set dynamically through the XE_EXEC_QUEUE_SET_PROPERTY ioctl. Other than DRM_XE_EXEC_QUEUE_SET_PROPERTY_MULTI_QUEUE, this is the only property supported by the secondary queues of a multi queue group.
When GuC reports an error on any of the queues of a multi queue group, the queue cleanup mechanism is invoked for all the queues of the group, as hardware cannot make progress on the multi queue context.
Refer to Multi Queue Group GuC interface for details of the multi queue group GuC interface.
Multi Queue Group GuC interface¶
The multi queue group coordination between KMD and GuC happens through a software construct called the Context Group Page (CGP). The CGP is a KMD-managed 4KB page allocated in the global GTT.
CGP format:
| DWORD     | Name                      | Description                                  |
|-----------|---------------------------|----------------------------------------------|
| 0         | Version                   | Bits [15:8] = Major ver, [7:0] = Minor ver   |
| 1..15     | RESERVED                  | MBZ                                          |
| 16        | KMD_QUEUE_UPDATE_MASK_DW0 | KMD queue mask for queues 31..0              |
| 17        | KMD_QUEUE_UPDATE_MASK_DW1 | KMD queue mask for queues 63..32             |
| 18..31    | RESERVED                  | MBZ                                          |
| 32        | Q0CD_DW0                  | Queue 0 context LRC descriptor lower DWORD   |
| 33        | Q0ContextIndex            | Context ID for Queue 0                       |
| 34        | Q1CD_DW0                  | Queue 1 context LRC descriptor lower DWORD   |
| 35        | Q1ContextIndex            | Context ID for Queue 1                       |
| ...       | ...                       | ...                                          |
| 158       | Q63CD_DW0                 | Queue 63 context LRC descriptor lower DWORD  |
| 159       | Q63ContextIndex           | Context ID for Queue 63                      |
| 160..1023 | RESERVED                  | MBZ                                          |
While registering Q0 with GuC, the CGP is updated with the Q0 entry and GuC is notified through the XE_GUC_ACTION_REGISTER_CONTEXT_MULTI_QUEUE H2G message, which specifies the CGP address. When secondary queues are added to the group, the CGP is updated with an entry for that queue and GuC is notified through the XE_GUC_ACTION_MULTI_QUEUE_CONTEXT_CGP_SYNC H2G interface. GuC responds to these H2G messages with a XE_GUC_ACTION_NOTIFY_MULTIQ_CONTEXT_CGP_SYNC_DONE G2H message. GuC also sends a XE_GUC_ACTION_NOTIFY_MULTI_QUEUE_CGP_CONTEXT_ERROR notification for any error in the CGP. Only one of these CGP update messages can be outstanding (waiting for a GuC response) at any time. The bits in the KMD_QUEUE_UPDATE_MASK_DW* fields indicate which queue entry is being updated in the CGP.
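The per-queue CGP offsets follow directly from the layout table above: queue n occupies DWORDs 32 + 2n (descriptor) and 32 + 2n + 1 (context index), and its update bit lands in mask DW0 for queues 0..31 or DW1 for queues 32..63. A small sketch of those helpers (illustrative, not the driver's code):

```c
#include <stdint.h>

/* CGP layout constants taken from the table above. */
#define CGP_MASK_DW0	16	/* KMD_QUEUE_UPDATE_MASK_DW0: queues 31..0  */
#define CGP_MASK_DW1	17	/* KMD_QUEUE_UPDATE_MASK_DW1: queues 63..32 */
#define CGP_Q_BASE_DW	32	/* first queue entry (Q0CD_DW0)             */

/* DWORD index of queue n's LRC descriptor lower DWORD. */
static uint32_t cgp_qcd_dw(unsigned int n)
{
	return CGP_Q_BASE_DW + 2 * n;
}

/* DWORD index of queue n's context index. */
static uint32_t cgp_qctx_dw(unsigned int n)
{
	return CGP_Q_BASE_DW + 2 * n + 1;
}

/* Which mask DWORD carries queue n's update bit. */
static uint32_t cgp_mask_dw(unsigned int n)
{
	return n < 32 ? CGP_MASK_DW0 : CGP_MASK_DW1;
}

/* Queue n's bit within that mask DWORD. */
static uint32_t cgp_mask_bit(unsigned int n)
{
	return 1u << (n % 32);
}
```

Queue 63 lands at DWORDs 158/159, matching the last entries in the table.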
The primary queue (Q0) represents the multi queue group context in GuC, and submission on any queue of the group must go through the Q0 GuC interface only.
As it is not required to register secondary queues with GuC, the secondary queue context IDs in the CGP are populated with the Q0 context ID.
Internal API¶
- enum xe_multi_queue_priority¶
Multi Queue priority values
Constants
XE_MULTI_QUEUE_PRIORITY_LOW: Priority low
XE_MULTI_QUEUE_PRIORITY_NORMAL: Priority normal
XE_MULTI_QUEUE_PRIORITY_HIGH: Priority high
Description
The priority values of the queues within the multi queue group.
- struct xe_exec_queue_group¶
Execution multi queue group
Definition:
struct xe_exec_queue_group {
        struct xe_exec_queue *primary;
        struct xe_bo *cgp_bo;
        struct xarray xa;
        struct list_head list;
        struct mutex list_lock;
        bool sync_pending;
        bool banned;
        bool stopped;
};
Members
primary: Primary queue of this group
cgp_bo: BO for the Context Group Page
xa: xarray to store LRCs
list: List of all secondary queues in the group
list_lock: Secondary queue list lock
sync_pending: CGP_SYNC_DONE G2H response pending
banned: Group banned
stopped: Group is stopped, protected by list_lock
Description
Contains multi queue group information.
- struct xe_exec_queue¶
Execution queue
Definition:
struct xe_exec_queue {
        struct xe_file *xef;
        struct xe_gt *gt;
        struct xe_hw_engine *hwe;
        struct kref refcount;
        struct xe_vm *vm;
        struct xe_vm *user_vm;
        enum xe_engine_class class;
        u32 logical_mask;
        char name[MAX_FENCE_NAME_LEN];
        u16 width;
        u16 msix_vec;
        struct xe_hw_fence_irq *fence_irq;
        struct dma_fence *last_fence;
#define EXEC_QUEUE_FLAG_KERNEL            BIT(0)
#define EXEC_QUEUE_FLAG_PERMANENT         BIT(1)
#define EXEC_QUEUE_FLAG_VM                BIT(2)
#define EXEC_QUEUE_FLAG_BIND_ENGINE_CHILD BIT(3)
#define EXEC_QUEUE_FLAG_HIGH_PRIORITY     BIT(4)
#define EXEC_QUEUE_FLAG_LOW_LATENCY       BIT(5)
#define EXEC_QUEUE_FLAG_MIGRATE           BIT(6)
        unsigned long flags;
        union {
                struct list_head multi_gt_list;
                struct list_head multi_gt_link;
        };
        union {
                struct xe_execlist_exec_queue *execlist;
                struct xe_guc_exec_queue *guc;
        };
        struct {
                struct xe_exec_queue_group *group;
                struct list_head link;
                enum xe_multi_queue_priority priority;
                spinlock_t lock;
                u8 pos;
                u8 valid:1;
                u8 is_primary:1;
        } multi_queue;
        struct {
                u32 timeslice_us;
                u32 preempt_timeout_us;
                u32 job_timeout_ms;
                enum xe_exec_queue_priority priority;
        } sched_props;
        struct {
                struct dma_fence *pfence;
                u64 context;
                u32 seqno;
                struct list_head link;
        } lr;
#define XE_EXEC_QUEUE_TLB_INVAL_PRIMARY_GT 0
#define XE_EXEC_QUEUE_TLB_INVAL_MEDIA_GT   1
#define XE_EXEC_QUEUE_TLB_INVAL_COUNT      (XE_EXEC_QUEUE_TLB_INVAL_MEDIA_GT + 1)
        struct {
                struct xe_dep_scheduler *dep_scheduler;
                struct dma_fence *last_fence;
        } tlb_inval[XE_EXEC_QUEUE_TLB_INVAL_COUNT];
        struct list_head vm_exec_queue_link;
        struct {
                u8 type;
                struct list_head link;
        } pxp;
        struct drm_syncobj *ufence_syncobj;
        u64 ufence_timeline_value;
        void *replay_state;
        const struct xe_exec_queue_ops *ops;
        const struct xe_ring_ops *ring_ops;
        struct drm_sched_entity *entity;
#define XE_MAX_JOB_COUNT_PER_EXEC_QUEUE 1000
        atomic_t job_cnt;
        u64 tlb_flush_seqno;
        struct list_head hw_engine_group_link;
        struct xe_lrc *lrc[];
};
Members
xef: Back pointer to xe file if this is a user-created exec queue
gt: GT structure this exec queue can submit to
hwe: A hardware engine of the same class. May (physical engine) or may not (virtual engine) be the engine where jobs actually end up running. Should never really be used for submissions.
refcount: ref count of this exec queue
vm: VM (address space) for this exec queue
user_vm: User VM (address space) for this exec queue (bind queues only)
class: class of this exec queue
logical_mask: logical mask of where jobs submitted to this exec queue can run
name: name of this exec queue
width: width (number of BBs submitted per exec) of this exec queue
msix_vec: MSI-X vector (for platforms that support it)
fence_irq: fence IRQ used to signal job completion
last_fence: last fence for TLB invalidation, protected by vm->lock in write mode
flags: flags for this exec queue, should be statically set up aside from the ban bit
{unnamed_union}: anonymous
multi_gt_list: list head for VM bind engines if multi-GT
multi_gt_link: link for VM bind engines if multi-GT
{unnamed_union}: anonymous
execlist: Execlist backend specific state for exec queue
guc: GuC backend specific state for exec queue
multi_queue: Multi queue information
multi_queue.priority: Queue priority within the multi queue group. It is protected by multi_queue.lock.
sched_props: scheduling properties
lr: long-running exec queue state
tlb_inval: TLB invalidations exec queue state
tlb_inval.dep_scheduler: The TLB invalidation dependency scheduler
vm_exec_queue_link: Link to track exec queue within a VM's list of exec queues
pxp: PXP info tracking
ufence_syncobj: User fence syncobj
ufence_timeline_value: User fence timeline value
replay_state: GPU hang replay state
ops: submission backend exec queue operations
ring_ops: ring operations for this exec queue
entity: DRM sched entity for this exec queue (1 to 1 relationship)
job_cnt: number of drm jobs in this exec queue
tlb_flush_seqno: The seqno of the last rebind TLB flush performed. Protected by the vm's resv. Unused if vm == NULL.
hw_engine_group_link: link into exec queues in the same hw engine group
lrc: logical ring context for this exec queue
Description
Contains all state necessary for submissions. Can either be a user object ora kernel object.
- struct xe_exec_queue_ops¶
Submission backend exec queue operations
Definition:
struct xe_exec_queue_ops {
        int (*init)(struct xe_exec_queue *q);
        void (*kill)(struct xe_exec_queue *q);
        void (*fini)(struct xe_exec_queue *q);
        void (*destroy)(struct xe_exec_queue *q);
        int (*set_priority)(struct xe_exec_queue *q, enum xe_exec_queue_priority priority);
        int (*set_timeslice)(struct xe_exec_queue *q, u32 timeslice_us);
        int (*set_preempt_timeout)(struct xe_exec_queue *q, u32 preempt_timeout_us);
        int (*set_multi_queue_priority)(struct xe_exec_queue *q, enum xe_multi_queue_priority priority);
        int (*suspend)(struct xe_exec_queue *q);
        int (*suspend_wait)(struct xe_exec_queue *q);
        void (*resume)(struct xe_exec_queue *q);
        bool (*reset_status)(struct xe_exec_queue *q);
        bool (*active)(struct xe_exec_queue *q);
};
Members
init: Initialize exec queue for submission backend
kill: Kill inflight submissions for backend
fini: Undoes the init() for submission backend
destroy: Destroy exec queue for submission backend. The backend function must call xe_exec_queue_fini() (which will in turn call the fini() backend function) to ensure the queue is properly cleaned up.
set_priority: Set priority for exec queue
set_timeslice: Set timeslice for exec queue
set_preempt_timeout: Set preemption timeout for exec queue
set_multi_queue_priority: Set multi queue priority
suspend: Suspend exec queue from executing. Allowed to be called multiple times in a row before resume, with the caveat that suspend_wait must return before calling suspend again.
suspend_wait: Wait for an exec queue to suspend executing. Should be called after suspend. This is in the dma-fencing path and thus must return within a reasonable amount of time. An -ETIME return indicates an error waiting for suspend, resulting in the associated VM getting killed. An -EAGAIN return indicates the wait should be tried again; if the wait is within a work item, the work item should be requeued as a deadlock avoidance mechanism.
resume: Resume exec queue execution. The exec queue must be in a suspended state, and the dma fence returned from the most recent suspend call must be signalled when this function is called.
reset_status: Check exec queue reset status
active: Check whether exec queue is active
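A submission backend wires these operations up as a vtable of function pointers. The sketch below uses simplified stand-in types (not the driver's struct xe_exec_queue) purely to show the shape of a backend implementation:

```c
#include <stdbool.h>

/* Simplified stand-ins for the driver's exec queue and ops types. */
struct exec_queue;

struct exec_queue_ops {
	int  (*init)(struct exec_queue *q);
	void (*kill)(struct exec_queue *q);
	bool (*reset_status)(struct exec_queue *q);
};

struct exec_queue {
	const struct exec_queue_ops *ops;
	bool initialized;
	bool killed;
};

/* A trivial stub backend: init marks the queue ready, kill marks it
 * dead, and reset_status reports whether it was killed. */
static int stub_init(struct exec_queue *q)
{
	q->initialized = true;
	return 0;
}

static void stub_kill(struct exec_queue *q)
{
	q->killed = true;
}

static bool stub_reset_status(struct exec_queue *q)
{
	return q->killed;
}

static const struct exec_queue_ops stub_ops = {
	.init = stub_init,
	.kill = stub_kill,
	.reset_status = stub_reset_status,
};
```

Callers always go through q->ops, which is what lets the core code stay agnostic of whether GuC or Execlist submission is underneath.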
- bool xe_exec_queue_is_multi_queue(struct xe_exec_queue *q)¶
Whether an exec_queue is part of a queue group.
Parameters
struct xe_exec_queue *q: The exec_queue
Return
True if the exec_queue is part of a queue group, false otherwise.
- bool xe_exec_queue_is_multi_queue_primary(struct xe_exec_queue *q)¶
Whether an exec_queue is the primary queue of a multi queue group.
Parameters
struct xe_exec_queue *q: The exec_queue
Return
True if q is the primary queue of a queue group, false otherwise.
- bool xe_exec_queue_is_multi_queue_secondary(struct xe_exec_queue *q)¶
Whether an exec_queue is a secondary queue of a multi queue group.
Parameters
struct xe_exec_queue *q: The exec_queue
Return
True if q is a secondary queue of a queue group, false otherwise.
- struct xe_exec_queue *xe_exec_queue_multi_queue_primary(struct xe_exec_queue *q)¶
Get a multi queue group's primary queue
Parameters
struct xe_exec_queue *q: The exec_queue
Description
If q belongs to a multi queue group, the primary queue of the group is returned. Otherwise, q itself is returned.
- bool xe_exec_queue_idle_skip_suspend(struct xe_exec_queue *q)¶
Can exec queue skip suspend
Parameters
struct xe_exec_queue *q: The exec_queue
Description
If an exec queue is not parallel and is idle, the suspend steps can be skipped in the submission backend, immediately signaling the suspend fence. Parallel queues cannot skip this step due to limitations in the submission backend.
Return
True if the exec queue is idle and can skip the suspend steps, false otherwise
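The rule above reduces to a simple predicate. Assuming that width > 1 marks a parallel queue (as suggested by the width member of struct xe_exec_queue), an illustrative sketch:

```c
#include <stdbool.h>

/* Illustrative predicate, not driver code: only a non-parallel
 * (width == 1), idle queue may bypass the backend suspend steps and
 * have its suspend fence signaled immediately. The width-based
 * parallel check is an assumption drawn from struct xe_exec_queue. */
static bool can_skip_suspend(unsigned int width, bool idle)
{
	return width == 1 && idle;
}
```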
- struct xe_exec_queue *xe_exec_queue_create_bind(struct xe_device *xe, struct xe_tile *tile, struct xe_vm *user_vm, u32 flags, u64 extensions)¶
Create bind exec queue.
Parameters
struct xe_device *xe: Xe device.
struct xe_tile *tile: tile which the bind exec queue belongs to.
struct xe_vm *user_vm: The user VM which this exec queue belongs to
u32 flags: exec queue creation flags
u64 extensions: exec queue creation extensions
Description
Normalize bind exec queue creation. A bind exec queue is tied to the migration VM for access to the physical memory required for page table programming. On faulting devices, the reserved copy engine instance must be used to avoid deadlocking (user binds cannot get stuck behind faults, as kernel binds which resolve faults depend on user binds). On non-faulting devices, any copy engine can be used.
Returns exec queue on success, ERR_PTR on failure
- struct xe_lrc *xe_exec_queue_lrc(struct xe_exec_queue *q)¶
Get the LRC from exec queue.
Parameters
struct xe_exec_queue *q: The exec_queue.
Description
Retrieves the primary LRC for the exec queue. Note that this function returns only the first LRC instance, even when multiple parallel LRCs are configured.
Return
Pointer to LRC on success, error on failure
- bool xe_exec_queue_is_lr(struct xe_exec_queue *q)¶
Whether an exec_queue is long-running
Parameters
struct xe_exec_queue *q: The exec_queue
Return
True if the exec_queue is long-running, false otherwise.
- bool xe_exec_queue_is_idle(struct xe_exec_queue *q)¶
Whether an exec_queue is idle.
Parameters
struct xe_exec_queue *q: The exec_queue
Description
FIXME: Need to determine what to use as the short-lived timeline lock for the exec_queues, so that the return value of this function becomes more than just an advisory snapshot in time. The timeline lock must protect the seqno from racing submissions on the same exec_queue. Typically vm->resv, but user-created timeline locks use the migrate vm and never grab the migrate vm->resv, so we have a race there.
Return
True if the exec_queue is idle, false otherwise.
- void xe_exec_queue_update_run_ticks(struct xe_exec_queue *q)¶
Update run time in ticks for this exec queue from hw
Parameters
struct xe_exec_queue *q: The exec queue
Description
Update the timestamp saved by HW for this exec queue and save the run ticks calculated using the delta from the last update.
- void xe_exec_queue_kill(struct xe_exec_queue *q)¶
permanently stop all execution from an exec queue
Parameters
struct xe_exec_queue *q: The exec queue
Description
This function permanently stops all activity on an exec queue. If the queue is actively executing on the HW, it will be kicked off the engine; any pending jobs are discarded and all future submissions are rejected. This function is safe to call multiple times.
- void xe_exec_queue_last_fence_put(struct xe_exec_queue *q, struct xe_vm *vm)¶
Drop ref to last fence
Parameters
struct xe_exec_queue *q: The exec queue
struct xe_vm *vm: The VM the engine does a bind or exec for
- void xe_exec_queue_last_fence_put_unlocked(struct xe_exec_queue *q)¶
Drop ref to last fence unlocked
Parameters
struct xe_exec_queue *q: The exec queue
Description
Only safe to be called from xe_exec_queue_destroy().
- struct dma_fence *xe_exec_queue_last_fence_get(struct xe_exec_queue *q, struct xe_vm *vm)¶
Get last fence
Parameters
struct xe_exec_queue *q: The exec queue
struct xe_vm *vm: The VM the engine does a bind or exec for
Description
Get last fence, takes a ref
Return
last fence if not signaled, dma fence stub if signaled
- struct dma_fence *xe_exec_queue_last_fence_get_for_resume(struct xe_exec_queue *q, struct xe_vm *vm)¶
Get last fence
Parameters
struct xe_exec_queue *q: The exec queue
struct xe_vm *vm: The VM the engine does a bind or exec for
Description
Get last fence, takes a ref. Only safe to be called in the context of resuming the hw engine group's long-running exec queue, when the group semaphore is held.
Return
last fence if not signaled, dma fence stub if signaled
- void xe_exec_queue_last_fence_set(struct xe_exec_queue *q, struct xe_vm *vm, struct dma_fence *fence)¶
Set last fence
Parameters
struct xe_exec_queue *q: The exec queue
struct xe_vm *vm: The VM the engine does a bind or exec for
struct dma_fence *fence: The fence
Description
Set the last fence for the engine. Increases the reference count of fence; when closing the engine, xe_exec_queue_last_fence_put should be called.
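The last-fence get/put/set contract can be modeled with a toy refcounted fence (this is not the real dma_fence API): set takes its own reference, get takes a reference but substitutes a signaled stub once the stored fence has signaled, and put drops the reference:

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy fence with a manual refcount; stands in for struct dma_fence. */
struct fence {
	int refs;
	bool signaled;
};

/* Shared "already signaled" stub, mirroring the dma fence stub the
 * real API returns for signaled last fences. */
static struct fence stub = { .refs = 1, .signaled = true };

static struct fence *fence_get(struct fence *f) { f->refs++; return f; }
static void fence_put(struct fence *f) { f->refs--; }

struct queue {
	struct fence *last_fence;
};

/* Set last fence: drop the old reference (if any), take a new one. */
static void last_fence_set(struct queue *q, struct fence *f)
{
	if (q->last_fence)
		fence_put(q->last_fence);
	q->last_fence = fence_get(f);
}

/* Get last fence with a reference; a signaled (or absent) fence is
 * replaced by the signaled stub. */
static struct fence *last_fence_get(struct queue *q)
{
	if (!q->last_fence || q->last_fence->signaled)
		return fence_get(&stub);
	return fence_get(q->last_fence);
}

/* Drop the queue's own reference on close. */
static void last_fence_put(struct queue *q)
{
	if (q->last_fence) {
		fence_put(q->last_fence);
		q->last_fence = NULL;
	}
}
```

The same pattern repeats below for the per-GT TLB invalidation last fences.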
- void xe_exec_queue_tlb_inval_last_fence_put(struct xe_exec_queue *q, struct xe_vm *vm, unsigned int type)¶
Drop ref to last TLB invalidation fence
Parameters
struct xe_exec_queue *q: The exec queue
struct xe_vm *vm: The VM the engine does a bind for
unsigned int type: Either primary or media GT
- void xe_exec_queue_tlb_inval_last_fence_put_unlocked(struct xe_exec_queue *q, unsigned int type)¶
Drop ref to last TLB invalidation fence unlocked
Parameters
struct xe_exec_queue *q: The exec queue
unsigned int type: Either primary or media GT
Description
Only safe to be called from xe_exec_queue_destroy().
- struct dma_fence *xe_exec_queue_tlb_inval_last_fence_get(struct xe_exec_queue *q, struct xe_vm *vm, unsigned int type)¶
Get last fence for TLB invalidation
Parameters
struct xe_exec_queue *q: The exec queue
struct xe_vm *vm: The VM the engine does a bind for
unsigned int type: Either primary or media GT
Description
Get last fence, takes a ref
Return
last fence if not signaled, dma fence stub if signaled
- void xe_exec_queue_tlb_inval_last_fence_set(struct xe_exec_queue *q, struct xe_vm *vm, struct dma_fence *fence, unsigned int type)¶
Set last fence for TLB invalidation
Parameters
struct xe_exec_queue *q: The exec queue
struct xe_vm *vm: The VM the engine does a bind for
struct dma_fence *fence: The fence
unsigned int type: Either primary or media GT
Description
Set the last fence for the TLB invalidation type on the queue. Increases the reference count of fence; when closing the queue, xe_exec_queue_tlb_inval_last_fence_put should be called.
- int xe_exec_queue_contexts_hwsp_rebase(struct xe_exec_queue *q, void *scratch)¶
Re-compute GGTT references within all LRCs of a queue.
Parameters
struct xe_exec_queue *q: the xe_exec_queue struct instance containing target LRCs
void *scratch: scratch buffer to be used as temporary storage
Return
zero on success, negative error code on failure