drm/v3d Broadcom V3D Graphics Driver
This driver supports the Broadcom V3D 3.3 and 4.1 OpenGL ES GPUs. For V3D 2.x support, see the VC4 driver.
The V3D GPU includes a tiled renderer (composed of a bin and a render pipeline), the TFU (texture formatting unit), and the CSD (compute shader dispatch).
GPU buffer object (BO) management
Compared to VC4 (V3D 2.x), V3D 3.3 introduces an MMU between the GPU and the bus, allowing us to use shmem objects for our storage instead of CMA.
Physically contiguous objects may still be imported to V3D, but the driver doesn’t allocate physically contiguous objects on its own. Display engines requiring physically contiguous allocations should look into Mesa’s “renderonly” support (as used by the Mesa pl111 driver) for an example of how to integrate with V3D.
Address space management
The V3D 3.x hardware (compared to VC4) now includes an MMU. It uses a single level of page tables to map the V3D’s 4GB address space to AXI bus addresses, so it could need up to 4MB of physically contiguous memory to store the PTEs.
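The 4MB figure follows from the size of the address space. The sketch below works through the arithmetic, assuming 4KB MMU pages and 32-bit PTEs; neither value is stated above, and every `sketch_`/`SKETCH_` name is made up for illustration rather than taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions for illustration: 4KB pages, 32-bit PTEs, 32-bit GPU VA. */
#define SKETCH_PAGE_SHIFT 12u
#define SKETCH_PTE_SIZE   4u
#define SKETCH_VA_BITS    32u

/* Number of PTEs a single-level table needs to cover the address space. */
static uint64_t sketch_pte_count(void)
{
	return (UINT64_C(1) << SKETCH_VA_BITS) >> SKETCH_PAGE_SHIFT;
}

/* Size in bytes of that table; it must be one contiguous allocation. */
static uint64_t sketch_pt_bytes(void)
{
	return sketch_pte_count() * SKETCH_PTE_SIZE;
}

/* Index of the PTE that translates a given GPU virtual address. */
static uint32_t sketch_pte_index(uint32_t gpu_va)
{
	uint32_t idx = gpu_va >> SKETCH_PAGE_SHIFT;

	assert(idx < sketch_pte_count());
	return idx;
}
```

Under these assumptions the table holds 1,048,576 entries of 4 bytes each, which is where the 4MB upper bound comes from.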
Because the 4MB of contiguous memory for page tables is precious, and switching between them is expensive, we load all BOs into the same 4GB address space.
To protect clients from each other, we should use the GMP to quickly mask out (at 128KB granularity) what pages are available to each client. This is not yet implemented.
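To give a feel for what 128KB granularity means over a 4GB address space, here is a purely hypothetical sketch of a per-client mask with one bit per 128KB region. It does not describe the GMP’s real table format or any existing driver code; the structure and all `sketch_` names are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_REGION_SHIFT 17u  /* 128KB regions */
#define SKETCH_REGION_COUNT (1u << (32u - SKETCH_REGION_SHIFT))

/* Hypothetical per-client mask: one bit per 128KB region of the 4GB space. */
struct sketch_client_mask {
	uint8_t bits[SKETCH_REGION_COUNT / 8];
};

/* Start with no region accessible. */
static void sketch_mask_clear(struct sketch_client_mask *m)
{
	memset(m->bits, 0, sizeof(m->bits));
}

/* Allow every region that overlaps [start, start + size). */
static void sketch_mask_allow(struct sketch_client_mask *m,
			      uint32_t start, uint32_t size)
{
	uint32_t first, last, r;

	/* Reject empty ranges and ranges that wrap past 4GB. */
	assert(size > 0 && size - 1 <= UINT32_MAX - start);

	first = start >> SKETCH_REGION_SHIFT;
	last = (start + (size - 1)) >> SKETCH_REGION_SHIFT;
	for (r = first; r <= last; r++)
		m->bits[r / 8] |= (uint8_t)(1u << (r % 8));
}

/* Check whether the client may touch the region containing addr. */
static bool sketch_mask_allows(const struct sketch_client_mask *m,
			       uint32_t addr)
{
	uint32_t r = addr >> SKETCH_REGION_SHIFT;

	return ((m->bits[r / 8] >> (r % 8)) & 1u) != 0;
}
```

In this sketch the whole address space is 32768 regions, so a mask fits in 4KB. A range that only partly covers a region still exposes the whole region, which is the cost of the coarse granularity.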
GPU Scheduling
The shared DRM GPU scheduler is used to coordinate submitting jobs to the hardware. Each DRM fd (roughly a client process) gets its own scheduler entity, which will process jobs in order. The GPU scheduler will schedule the clients with a FIFO scheduling algorithm.
For simplicity, and in order to keep latency low for interactive jobs when bulk background jobs are queued up, we submit a new job to the HW only when it has completed the last one, instead of filling up the CT[01]Q FIFOs with jobs. Similarly, we use drm_sched_job_add_dependency() to manage the dependency between bin and render, instead of having the clients submit jobs using the HW’s semaphores to interlock between them.
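The two rules above (one job in flight at a time, and render waiting on bin in software) can be modelled in a few lines of plain C. This is a toy model, not the DRM scheduler API: the `sketch_` types and functions are invented for illustration, and the sketch assumes one software queue per hardware unit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_QUEUE_LEN 8

struct sketch_job {
	const char *name;
	struct sketch_job *dep;  /* job that must finish first, or NULL */
	bool done;
};

struct sketch_queue {
	struct sketch_job *fifo[SKETCH_QUEUE_LEN];
	size_t head, count;
	struct sketch_job *in_flight;  /* at most one job on the "hardware" */
};

/* Append a job in submission order; fails if the queue is full. */
static bool sketch_push(struct sketch_queue *q, struct sketch_job *job)
{
	if (q->count == SKETCH_QUEUE_LEN)
		return false;
	q->fifo[(q->head + q->count) % SKETCH_QUEUE_LEN] = job;
	q->count++;
	return true;
}

/*
 * Hand the oldest job to the hardware, but only if the hardware is idle
 * and the job's dependency has completed. Returns the job, or NULL.
 */
static struct sketch_job *sketch_try_submit(struct sketch_queue *q)
{
	struct sketch_job *job;

	if (q->in_flight || q->count == 0)
		return NULL;
	job = q->fifo[q->head];
	if (job->dep && !job->dep->done)
		return NULL;
	q->head = (q->head + 1) % SKETCH_QUEUE_LEN;
	q->count--;
	q->in_flight = job;
	return job;
}

/* Completion path: mark the running job done and free the hardware. */
static void sketch_complete(struct sketch_queue *q)
{
	assert(q->in_flight != NULL);
	q->in_flight->done = true;
	q->in_flight = NULL;
}
```

With a bin queue and a render queue, a render job whose `dep` points at its bin job is not submitted until `sketch_complete()` has run for that bin job, and a second bin job stays queued while the first is still in flight.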
Interrupts
When we take a bin, render, TFU done, or CSD done interrupt, we need to signal the fence for that job so that the scheduler can queue up the next one and unblock any waiters.
When we take the binner out of memory interrupt, we need to allocate some new memory and pass it to the binner so that the current job can make progress.
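The two kinds of interrupt described above can be sketched as a single dispatch routine. The status bit layout, the chunk size, and every `sketch_`/`SKETCH_` name below are hypothetical; the real register definitions are not given here, and a real handler would signal a fence object and program new memory into the binner rather than update plain fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status bits, for illustration only. */
#define SKETCH_INT_BIN_DONE    (1u << 0)
#define SKETCH_INT_RENDER_DONE (1u << 1)
#define SKETCH_INT_TFU_DONE    (1u << 2)
#define SKETCH_INT_CSD_DONE    (1u << 3)
#define SKETCH_INT_BIN_OOM     (1u << 4)

/* Assumed amount of extra binner memory handed over per OOM interrupt. */
#define SKETCH_BINNER_CHUNK (256u * 1024u)

enum sketch_queue_id {
	SKETCH_BIN, SKETCH_RENDER, SKETCH_TFU, SKETCH_CSD, SKETCH_NUM_QUEUES
};

struct sketch_dev {
	bool fence_signaled[SKETCH_NUM_QUEUES];  /* stands in for real fences */
	size_t binner_pool_bytes;                /* memory given to the binner */
};

/* Handle a status word; returns the mask of bits that were handled. */
static uint32_t sketch_irq(struct sketch_dev *dev, uint32_t status)
{
	static const uint32_t done_bit[SKETCH_NUM_QUEUES] = {
		SKETCH_INT_BIN_DONE, SKETCH_INT_RENDER_DONE,
		SKETCH_INT_TFU_DONE, SKETCH_INT_CSD_DONE,
	};
	uint32_t handled = 0;
	int q;

	assert(dev != NULL);

	/* "Done" interrupts: signal the fence so the next job can be queued. */
	for (q = 0; q < SKETCH_NUM_QUEUES; q++) {
		if (status & done_bit[q]) {
			dev->fence_signaled[q] = true;
			handled |= done_bit[q];
		}
	}

	/* Binner out of memory: grow its pool so the current job can continue. */
	if (status & SKETCH_INT_BIN_OOM) {
		dev->binner_pool_bytes += SKETCH_BINNER_CHUNK;
		handled |= SKETCH_INT_BIN_OOM;
	}

	return handled;
}
```

Note that the out-of-memory case does not signal a fence: the bin job is still running and only needs more memory to make progress.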