CPU Hotplug and ACPI

CPU hotplug in the arm64 world is commonly used to describe the kernel taking CPUs online/offline using PSCI. This document is about ACPI firmware allowing CPUs that were not available during boot to be added to the system later.

The terms possible and present refer to the state of the CPU as seen by Linux.

CPU Hotplug on physical systems - CPUs not present at boot

Physical systems need to mark a CPU that is possible but not present as being present. An example would be a dual socket machine, where the package in one of the sockets can be replaced while the system is running.

This is not supported.

In the arm64 world CPUs are not a single device but a slice of the system. There are no systems that support the physical addition (or removal) of CPUs while the system is running, and ACPI is not able to sufficiently describe them.

e.g. New CPUs come with new caches, but the platform’s cache topology is described in a static table, the PPTT. How caches are shared between CPUs is not discoverable, and must be described by firmware.

e.g. The GIC redistributor for each CPU must be accessed by the driver during boot to discover the system wide supported features. ACPI’s MADT GICC structures can describe a redistributor associated with a disabled CPU, but can’t describe whether the redistributor is accessible, only that it is not ‘always on’.

arm64’s ACPI tables assume that everything described is present.

CPU Hotplug on virtual systems - CPUs not enabled at boot

Virtual systems have the advantage that all the properties the system will ever have can be described at boot. There are no power-domain considerations as such devices are emulated.

CPU Hotplug on virtual systems is supported. It is distinct from physical CPU Hotplug as all resources are described as present, but CPUs may be marked as disabled by firmware. Only the CPU’s online/offline behaviour is influenced by firmware. An example is where a virtual machine boots with a single CPU, and additional CPUs are added once a cloud orchestrator deploys the workload.

For a virtual machine, the VMM (e.g. QEMU) plays the part of firmware.

Virtual hotplug is implemented as a firmware policy affecting which CPUs can be brought online. Firmware can enforce its policy via PSCI’s return codes, e.g. DENIED.
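The way a caller might tell a policy refusal apart from other failures can be sketched as follows. This is an illustrative model only, not kernel code: the numeric values are the PSCI return codes, while describe_cpu_on_result() and its result strings are hypothetical names invented for this example.

```python
# PSCI return codes (values as defined by the Arm PSCI specification).
PSCI_SUCCESS = 0
PSCI_NOT_SUPPORTED = -1
PSCI_INVALID_PARAMETERS = -2
PSCI_DENIED = -3
PSCI_ALREADY_ON = -4
PSCI_ON_PENDING = -5
PSCI_INTERNAL_FAILURE = -6


def describe_cpu_on_result(ret):
    """Hypothetical helper: interpret the value returned by a CPU_ON call."""
    if ret == PSCI_SUCCESS:
        return "online"
    if ret == PSCI_DENIED:
        # Firmware (the VMM on a virtual machine) refused by policy.
        return "blocked by firmware policy"
    if ret in (PSCI_ALREADY_ON, PSCI_ON_PENDING):
        return "already on or coming up"
    # Anything else is treated as a generic failure in this sketch.
    return "failed"
```

The point of the sketch is that DENIED is not an error in the platform: the CPU exists and is described, but firmware has chosen not to let it run yet.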

The ACPI tables must describe all the resources of the virtual machine. CPUs that firmware wishes to disable either from boot (or later) should not be enabled in the MADT GICC structures, but should have the online capable bit set, to indicate they can be enabled later. The boot CPU must be marked as enabled. The ‘always on’ GICR structure must be used to describe the redistributors.
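The flag combinations described above can be modelled with a small classifier. This is a sketch under stated assumptions: the bit positions (Enabled in bit 0, Online Capable in bit 3 of the GICC flags field) follow the ACPI 6.5 MADT GICC layout, treating both bits set as invalid is this example's reading of the specification, and classify_gicc() and boot_cpu_is_valid() are hypothetical helpers, not kernel functions.

```python
# MADT GICC flags (bit positions assumed from the ACPI 6.5 GICC structure).
GICC_ENABLED = 1 << 0
GICC_ONLINE_CAPABLE = 1 << 3


def classify_gicc(flags):
    """Hypothetical helper: classify one GICC entry by its flags field."""
    enabled = bool(flags & GICC_ENABLED)
    capable = bool(flags & GICC_ONLINE_CAPABLE)
    if enabled and capable:
        # Assumption: the two bits are mutually exclusive.
        return "invalid"
    if enabled:
        return "enabled"
    if capable:
        return "online-capable"
    # Neither bit set: the CPU can never be brought online.
    return "unusable"


def boot_cpu_is_valid(boot_cpu_flags):
    """The boot CPU must be marked as enabled in the static table."""
    return classify_gicc(boot_cpu_flags) == "enabled"
```

For a virtual machine that boots with one CPU and may gain three more later, the four GICC entries would classify as one "enabled" and three "online-capable".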

CPUs described as online capable but not enabled can be set to enabled by the DSDT’s Processor object’s _STA method. On virtual systems the _STA method must always report the CPU as present. Changes to the firmware policy can be notified to the OS via device-check or eject-request.
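How an OS might interpret the value returned by _STA under these rules can be sketched as below. The bit positions are those of the ACPI _STA return value (bit 0 present, bit 1 enabled); sta_allows_online() is a hypothetical helper written for this example and does not mirror any particular kernel function.

```python
# ACPI _STA result bits.
STA_PRESENT = 1 << 0
STA_ENABLED = 1 << 1


def sta_allows_online(sta):
    """Hypothetical helper: may this CPU be brought online right now?"""
    if not sta & STA_PRESENT:
        # On virtual systems _STA must always report the CPU as present,
        # so a clear present bit is treated as a firmware bug here.
        raise ValueError("_STA reports the CPU as not present")
    # Only the enabled bit reflects the firmware policy.
    return bool(sta & STA_ENABLED)
```

A CPU that is online capable but held back by policy would report present without enabled (0x1 in this model); once the policy changes and the OS is notified, re-evaluating _STA would show both bits set.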

CPUs described as enabled in the static table should not have their _STA modified dynamically by firmware. Soft-restart features such as kexec will re-read the static properties of the system from these static tables, and may malfunction if these no longer describe the running system. Linux will re-discover the dynamic properties of the system from the _STA method later during boot.