The Linux Kernel Device Model

Patrick Mochel <mochel@digitalimplant.org>

Drafted 26 August 2002
Updated 31 January 2006

Overview

The Linux Kernel Driver Model is a unification of all the disparate driver models that were previously used in the kernel. It is intended to augment the bus-specific drivers for bridges and devices by consolidating a set of data and operations into globally accessible data structures.

Traditional driver models implemented some sort of tree-like structure (sometimes just a list) for the devices they control. There wasn’t any uniformity across the different bus types.

The current driver model provides a common, uniform data model for describing a bus and the devices that can appear under the bus. The unified bus model includes a set of common attributes which all buses carry, and a set of common callbacks, such as device discovery during bus probing, bus shutdown, bus power management, etc.

The common device and bridge interface reflects the goals of the modern computer: namely the ability to do seamless device “plug and play”, power management, and hot plug. In particular, the model dictated by Intel and Microsoft (namely ACPI) ensures that almost every device on almost any bus on an x86-compatible system can work within this paradigm. Of course, not every bus is able to support all such operations, although most buses support most of those operations.

Downstream Access

Common data fields have been moved out of individual bus layers into a common data structure. These fields must still be accessed by the bus layers, and sometimes by the device-specific drivers.

Other bus layers are encouraged to do what has been done for the PCI layer. struct pci_dev now looks like this:

struct pci_dev {
      ...
      struct device dev;     /* Generic device interface */
      ...
};

Note first that the struct device dev within the struct pci_dev is statically allocated. This means only one allocation on device discovery.

Note also that struct device dev is not necessarily defined at the front of the pci_dev structure. This is to make people think about what they’re doing when switching between the bus driver and the global driver, and to discourage meaningless and incorrect casts between the two.
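Because the embedded struct device need not sit at offset zero, a plain pointer cast between the two types is wrong. The kernel provides the container_of() macro (and a to_pci_dev() helper built on it) for this conversion. The following is a minimal userspace sketch of the idea, not the kernel's code: the structure fields, the simplified macro body, and the name to_pci_dev_sketch() are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof */

/* Simplified stand-ins for the kernel structures; fields are hypothetical. */
struct device {
    const char *name;
};

struct pci_dev {
    unsigned int devfn;     /* bus-specific data placed first on purpose */
    struct device dev;      /* generic device interface, not at offset 0 */
    unsigned int irq;
};

/* Userspace approximation of the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Recover the enclosing pci_dev from a pointer to its embedded device. */
static struct pci_dev *to_pci_dev_sketch(struct device *dev)
{
    assert(dev != NULL);
    return container_of(dev, struct pci_dev, dev);
}
```

Subtracting the member offset gives the correct enclosing structure wherever dev is placed, which is why a direct cast is never needed.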

The PCI bus layer freely accesses the fields of struct device. It knows about the structure of struct pci_dev, and it should know the structure of struct device. Individual PCI device drivers that have been converted to the current driver model generally do not and should not touch the fields of struct device, unless there is a compelling reason to do so.

The above abstraction prevents unnecessary pain during transitional phases. If it were not done this way, then when a field was renamed or removed, every downstream driver would break. On the other hand, if only the bus layer (and not the device layer) accesses the struct device, it is only the bus layer that needs to change.

User Interface

By virtue of having a complete hierarchical view of all the devices in the system, exporting a complete hierarchical view to userspace becomes relatively easy. This has been accomplished by implementing a special purpose virtual file system named sysfs.

Almost all mainstream Linux distros mount this filesystem automatically; you can see some variation of the following in the output of the “mount” command:

$ mount
...
none on /sys type sysfs (rw,noexec,nosuid,nodev)
...
$

The auto-mounting of sysfs is typically accomplished by an entry similar to the following in the /etc/fstab file:

none          /sys    sysfs    defaults               0 0

or something similar in the /lib/init/fstab file on Debian-based systems:

none            /sys    sysfs    nodev,noexec,nosuid    0 0

If sysfs is not automatically mounted, you can always do it manually with:

# mount -t sysfs sysfs /sys
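A program can make the same check that the “mount” output allows by eye. The sketch below scans a mount table with the glibc getmntent() interface and reports whether a filesystem of a given type is mounted on a given directory. The function name fs_is_mounted() and its return convention are assumptions made for this example.

```c
#include <assert.h>
#include <mntent.h>   /* setmntent, getmntent, endmntent (glibc) */
#include <stdio.h>
#include <string.h>

/*
 * Return 1 if the mount table at `table` lists a filesystem of type
 * `fstype` mounted on `dir`, 0 if it does not, and -1 if the table
 * cannot be opened.
 */
static int fs_is_mounted(const char *table, const char *fstype,
                         const char *dir)
{
    assert(table != NULL && fstype != NULL && dir != NULL);

    FILE *fp = setmntent(table, "r");
    if (fp == NULL)
        return -1;

    int found = 0;
    struct mntent *ent;
    while ((ent = getmntent(fp)) != NULL) {
        if (strcmp(ent->mnt_type, fstype) == 0 &&
            strcmp(ent->mnt_dir, dir) == 0) {
            found = 1;
            break;
        }
    }
    endmntent(fp);
    return found;
}
```

On a typical system, fs_is_mounted("/proc/mounts", "sysfs", "/sys") would be the natural call; a result of 0 suggests the manual mount command above is needed.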

Whenever a device is inserted into the tree, a directory is created for it. This directory may be populated at each layer of discovery - the global layer, the bus layer, or the device layer.

The global layer currently creates two files - ‘name’ and ‘power’. The former only reports the name of the device. The latter reports the current power state of the device. It will also be used to set the current power state.

The bus layer may also create files for the devices it finds while probing the bus. For example, the PCI layer currently creates ‘irq’ and ‘resource’ files for each PCI device.
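From userspace, such files are read like any other text file. The following is a small sketch of a helper that reads the first line of an attribute file; the name read_attr() and its return convention are assumptions made for this example, not part of any kernel or library interface.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Read the first line of a sysfs-style attribute file into buf and
 * strip the trailing newline. Returns 0 on success, -1 if the file
 * cannot be opened or is empty.
 */
static int read_attr(const char *path, char *buf, size_t len)
{
    assert(path != NULL && buf != NULL && len > 0);

    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;

    if (fgets(buf, (int)len, fp) == NULL) {
        fclose(fp);
        return -1;
    }
    fclose(fp);

    buf[strcspn(buf, "\n")] = '\0';   /* drop the newline, if any */
    return 0;
}
```

A call such as read_attr("/sys/bus/pci/devices/0000:00:1f.2/irq", buf, sizeof(buf)) would fetch the ‘irq’ value for one device; the PCI address in that path is hypothetical and will differ from system to system.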

A device-specific driver may also export files in its directory to expose device-specific data or tunable interfaces.

More information about the sysfs directory layout can be found in the other documents in this directory and in the file sysfs - _The_ filesystem for exporting kernel objects.