Hardware workarounds

Hardware workarounds are register programming sequences documented to be executed in the driver that fall outside of the normal programming sequences for a platform. There are some basic categories of workarounds, depending on how/when they are applied:

  • LRC workarounds: workarounds that touch registers that are saved/restored to/from the HW context image. The list is emitted (via Load Register Immediate commands) once when initializing the device and saved in the default context. That default context is then used on every context creation to have a “primed golden context”, i.e. a context image that already contains the changes needed to all the registers. See drivers/gpu/drm/xe/xe_lrc.c for default context handling.

  • Engine workarounds: the list of these WAs is applied whenever the specific engine is reset. It’s also possible that a set of engine classes share a common power domain and are reset together. This happens on some platforms with render and compute engines. In this case (at least) one of them needs to keep the workaround programming: the approach taken in the driver is to tie those workarounds to the first compute/render engine that is registered. When executing with GuC submission, engine resets are outside of kernel driver control, hence the list of registers involved is written once, on engine initialization, and then passed to GuC, which saves/restores their values before/after the reset takes place. See drivers/gpu/drm/xe/xe_guc_ads.c for reference.

  • GT workarounds: the list of these WAs is applied whenever these registers revert to their default values: on GPU reset, suspend/resume[1], etc.

  • Register whitelist: some workarounds need to be implemented in userspace, but need to touch privileged registers. The whitelist in the kernel instructs the hardware to allow the access to happen. From the kernel side, this is just a special case of an MMIO workaround (as we write the list of these to-be-whitelisted registers to some special HW registers).

  • Workaround batchbuffers: buffers that get executed automatically by the hardware on every HW context restore. These buffers are created and programmed in the default context so the hardware always goes through those programming sequences when switching contexts. The support for workaround batchbuffers is enabled via these hardware mechanisms:

    1. INDIRECT_CTX (also known as mid context restore bb): A batchbuffer and an offset are provided in the default context, pointing the hardware to jump to that location when that offset is reached in the context restore. When a context is being restored, this is executed after the ring context, in the middle (or beginning) of the engine context image.

    2. BB_PER_CTX_PTR (also known as post context restore bb): A batchbuffer is provided in the default context, pointing the hardware to a buffer to continue executing after the engine registers are restored in a context restore sequence.

    Below is the timeline for a context restore sequence:

                INDIRECT_CTX_OFFSET
                   |----------->|
      .------------.------------.-------------.------------.--------------.-----------.
      |Ring        | Engine     | Mid-context | Engine     | Post-context | Ring      |
      |Restore     | Restore (1)| BB Restore  | Restore (2)| BB Restore   | Execution |
      `------------'------------'-------------'------------'--------------'-----------'

  • Other/OOB: There are WAs that, due to their nature, cannot be applied from a central place. Those are peppered around the rest of the code, as needed. There’s a central place to control which workarounds are enabled: drivers/gpu/drm/xe/xe_wa_oob.rules for GT workarounds and drivers/gpu/drm/xe/xe_device_wa_oob.rules for device/SoC workarounds. These files only record which workarounds are enabled: during early device initialization those rules are evaluated and recorded by the driver. Then later the driver checks with XE_GT_WA() and XE_DEVICE_WA() to implement them.
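
The OOB flow in the last bullet (evaluate the rules once during early initialization, record the result, test it later at the point of use) can be modeled in standalone C. This is only an illustrative sketch: the toy_* names, the rule fields, the version numbers and the workaround IDs are all hypothetical and do not reflect the driver's data structures. In the driver itself, the check is made with XE_GT_WA() and XE_DEVICE_WA().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical workaround IDs, used only for this sketch. */
enum toy_wa_id { TOY_WA_A, TOY_WA_B, TOY_WA_C, TOY_WA_COUNT };

/* Hypothetical rule: a workaround applies to a range of IP versions. */
struct toy_oob_rule {
	enum toy_wa_id id;
	unsigned int ver_min, ver_max;
};

struct toy_gt {
	unsigned int graphics_ver;
	uint32_t oob_active;	/* one bit per workaround, filled in at init */
};

static const struct toy_oob_rule toy_rules[] = {
	{ TOY_WA_A, 1200, 1210 },
	{ TOY_WA_B, 1270, 1274 },
	{ TOY_WA_C, 1200, 2004 },
};

/* Early init: evaluate every rule once and record which WAs are active. */
static void toy_process_gt_oob(struct toy_gt *gt)
{
	size_t i;

	gt->oob_active = 0;
	for (i = 0; i < sizeof(toy_rules) / sizeof(toy_rules[0]); i++) {
		const struct toy_oob_rule *r = &toy_rules[i];

		assert(r->id < TOY_WA_COUNT);
		if (gt->graphics_ver >= r->ver_min &&
		    gt->graphics_ver <= r->ver_max)
			gt->oob_active |= UINT32_C(1) << r->id;
	}
}

/* Point of use: a cheap bit test on the recorded result. */
static bool toy_gt_wa(const struct toy_gt *gt, enum toy_wa_id id)
{
	return (gt->oob_active & (UINT32_C(1) << id)) != 0;
}
```

Code scattered around such a model would then guard its special-case programming with something like `if (toy_gt_wa(gt, TOY_WA_B))`, without re-evaluating any rule.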

[1]

Technically, some registers are power context saved & restored, so they survive a suspend/resume. In practice, writing them again is not too costly and simplifies things, so it’s the approach taken in the driver.
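
The trade-off in this footnote (rewrite everything rather than track which registers survived) can be sketched as a list of register writes that is built once and replayed after every reset or resume. This is a toy model under stated assumptions: the register file is a plain array whose default value is 0, and the toy_* names, offsets and values are hypothetical rather than the driver's save/restore implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NUM_REGS	8
#define TOY_MAX_ENTRIES	4

/* Hypothetical entry: program (val & mask) into 'reg', keep other bits. */
struct toy_sr_entry {
	unsigned int reg;
	uint32_t mask;
	uint32_t val;
};

struct toy_hw {
	uint32_t regs[TOY_NUM_REGS];	/* stand-in for MMIO space */
	struct toy_sr_entry sr[TOY_MAX_ENTRIES];
	size_t sr_count;
};

/* Built once at init: remember what has to be programmed. */
static void toy_sr_add(struct toy_hw *hw, unsigned int reg,
		       uint32_t mask, uint32_t val)
{
	assert(hw->sr_count < TOY_MAX_ENTRIES && reg < TOY_NUM_REGS);
	hw->sr[hw->sr_count++] = (struct toy_sr_entry){ reg, mask, val & mask };
}

/* Replayed whenever the registers may have reverted to their defaults. */
static void toy_sr_apply(struct toy_hw *hw)
{
	size_t i;

	for (i = 0; i < hw->sr_count; i++) {
		const struct toy_sr_entry *e = &hw->sr[i];

		hw->regs[e->reg] = (hw->regs[e->reg] & ~e->mask) | e->val;
	}
}

/* Model a reset or resume: every register goes back to its default. */
static void toy_reset(struct toy_hw *hw)
{
	size_t i;

	for (i = 0; i < TOY_NUM_REGS; i++)
		hw->regs[i] = 0;
}
```

Because replaying an entry on a register that already holds the right value changes nothing, applying the whole list unconditionally is safe in this model, which is what makes the simpler approach viable.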

Note

Hardware workarounds in xe work the same way as in i915; the difference is in how they are maintained in the code. xe uses the xe_rtp infrastructure so the workarounds can be kept in tables, following a more declarative approach rather than a procedural one.
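
To illustrate what “declarative rather than procedural” means here, the sketch below keeps workarounds as table rows (a name, a match condition and a register action) and walks the table with one generic loop. It does not use the xe_rtp API: every toy_* name, field and value is hypothetical and chosen only for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NUM_REGS 4

struct toy_dev {
	unsigned int graphics_ver;
	bool has_render;
	uint32_t regs[TOY_NUM_REGS];	/* stand-in for MMIO space */
};

/* One table row: when 'match' accepts the device, set 'bits' in 'reg'. */
struct toy_wa_entry {
	const char *name;
	bool (*match)(const struct toy_dev *dev);
	unsigned int reg;
	uint32_t bits;
};

static bool toy_match_ver_1200_plus(const struct toy_dev *dev)
{
	return dev->graphics_ver >= 1200;
}

static bool toy_match_render(const struct toy_dev *dev)
{
	return dev->has_render;
}

static const struct toy_wa_entry toy_wa_table[] = {
	{ "toy-wa-1", toy_match_ver_1200_plus, 0, 0x1 },
	{ "toy-wa-2", toy_match_render,        1, 0x8 },
	{ "toy-wa-3", toy_match_ver_1200_plus, 1, 0x2 },
};

/* Generic processing loop; returns how many entries matched. */
static size_t toy_wa_process(struct toy_dev *dev)
{
	size_t i, applied = 0;

	for (i = 0; i < sizeof(toy_wa_table) / sizeof(toy_wa_table[0]); i++) {
		const struct toy_wa_entry *e = &toy_wa_table[i];

		assert(e->reg < TOY_NUM_REGS);
		if (e->match(dev)) {
			dev->regs[e->reg] |= e->bits;
			applied++;
		}
	}
	return applied;
}
```

In this style, adding a workaround means adding a row to the table; the processing loop stays untouched.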

Internal API

void xe_wa_process_device_oob(struct xe_device *xe)

process OOB workaround table

Parameters

struct xe_device *xe

device instance to process workarounds for

Description

Process OOB workaround table for this device, marking in xe the workarounds that are active.

void xe_wa_process_gt_oob(struct xe_gt *gt)

process GT OOB workaround table

Parameters

struct xe_gt *gt

GT instance to process workarounds for

Description

Process OOB workaround table for this platform, marking in gt the workarounds that are active.

void xe_wa_process_gt(struct xe_gt *gt)

process GT workaround table

Parameters

struct xe_gt *gt

GT instance to process workarounds for

Description

Process GT workaround table for this platform, saving in gt all the workarounds that need to be applied at the GT level.

void xe_wa_process_engine(struct xe_hw_engine *hwe)

process engine workaround table

Parameters

struct xe_hw_engine *hwe

engine instance to process workarounds for

Description

Process engine workaround table for this platform, saving in hwe all the workarounds that need to be applied at the engine level that match this engine.

void xe_wa_process_lrc(struct xe_hw_engine *hwe)

process context workaround table

Parameters

struct xe_hw_engine *hwe

engine instance to process workarounds for

Description

Process context workaround table for this platform, saving in hwe all the workarounds that need to be applied on context restore. These are workarounds touching registers that are part of the HW context image.

int xe_wa_device_init(struct xe_device *xe)

initialize device with workaround oob bookkeeping

Parameters

struct xe_device *xe

Xe device instance to initialize

Description

Returns 0 for success, negative error code otherwise.

int xe_wa_gt_init(struct xe_gt *gt)

initialize gt with workaround bookkeeping

Parameters

struct xe_gt *gt

GT instance to initialize

Description

Returns 0 for success, negative error code otherwise.

int xe_wa_gt_dump(struct xe_gt *gt, struct drm_printer *p)

Dump GT workarounds into a drm printer.

Parameters

struct xe_gt *gt

the xe_gt

struct drm_printer *p

the drm_printer

Return

always 0.