DWARF module versioning

Introduction

When CONFIG_MODVERSIONS is enabled, symbol versions for modules are typically calculated from preprocessed source code using the genksyms tool. However, this is incompatible with languages such as Rust, where the source code has insufficient information about the resulting ABI. With CONFIG_GENDWARFKSYMS (and CONFIG_DEBUG_INFO) selected, gendwarfksyms is used instead to calculate symbol versions from the DWARF debugging information, which contains the necessary details about the final module ABI.

Usage

gendwarfksyms accepts a list of object files on the command line, and a list of symbol names (one per line) in standard input:

Usage: gendwarfksyms [options] elf-object-file ... < symbol-list

Options:
  -d, --debug          Print debugging information
      --dump-dies      Dump DWARF DIE contents
      --dump-die-map   Print debugging information about die_map changes
      --dump-types     Dump type strings
      --dump-versions  Dump expanded type strings used for symbol versions
  -s, --stable         Support kABI stability features
  -T, --symtypes file  Write a symtypes file
  -h, --help           Print this message

Type information availability

While symbols are typically exported in the same translation unit (TU) where they’re defined, it’s also perfectly fine for a TU to export external symbols. For example, this is done when calculating symbol versions for exports in stand-alone assembly code.

To ensure the compiler emits the necessary DWARF type information in the TU where symbols are actually exported, gendwarfksyms adds a pointer to exported symbols in the EXPORT_SYMBOL() macro using the following macro:

#define __GENDWARFKSYMS_EXPORT(sym)                             \
        static typeof(sym) *__gendwarfksyms_ptr_##sym __used    \
                __section(".discard.gendwarfksyms") = &sym;

When a symbol pointer is found in DWARF, gendwarfksyms can use its type for calculating symbol versions even if the symbol is defined elsewhere. The name of the symbol pointer is expected to start with __gendwarfksyms_ptr_, followed by the name of the exported symbol.
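The naming convention above can be illustrated with a minimal sketch. The helper name symbol_from_ptr_name is hypothetical and is not taken from the gendwarfksyms sources; it only shows how an exported symbol name could be recovered from a symbol pointer name:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Prefix that symbol pointer names are expected to start with. */
#define SYMBOL_PTR_PREFIX "__gendwarfksyms_ptr_"

/*
 * Hypothetical helper: return the exported symbol name embedded in a
 * symbol pointer name, or NULL if the prefix is missing or nothing
 * follows it.
 */
static const char *symbol_from_ptr_name(const char *name)
{
        size_t len = strlen(SYMBOL_PTR_PREFIX);

        assert(name != NULL);

        if (strncmp(name, SYMBOL_PTR_PREFIX, len) != 0 || name[len] == '\0')
                return NULL;
        return name + len;
}
```

For instance, passing "__gendwarfksyms_ptr_printk" would yield "printk", while a name without the prefix yields NULL.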

Symtypes output format

Similarly to genksyms, gendwarfksyms supports writing a symtypes file for each processed object that contains types for exported symbols and each referenced type that was used in calculating symbol versions. These files can be useful when trying to determine what exactly caused symbol versions to change between builds. To generate symtypes files during a kernel build, set KBUILD_SYMTYPES=1.

Matching the existing format, the first column of each line contains either a type reference or a symbol name. Type references have a one-letter prefix followed by “#” and the name of the type. Four reference types are supported:

e#<type> = enum
s#<type> = struct
t#<type> = typedef
u#<type> = union

Type names with spaces in them are wrapped in single quotes, e.g.:

s#'core::result::Result<u8, core::num::error::ParseIntError>'
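As a rough illustration of the reference syntax described above, the following sketch builds a type reference from a prefix letter and a type name, quoting the name when it contains a space. The function format_type_ref is a hypothetical helper written for this example, not part of gendwarfksyms:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write "<prefix>#<name>" to out, wrapping the name
 * in single quotes when it contains a space. Returns the snprintf()
 * result, i.e. the length the complete string requires.
 */
static int format_type_ref(char prefix, const char *name, char *out,
                           size_t size)
{
        assert(name != NULL && out != NULL);

        if (strchr(name, ' '))
                return snprintf(out, size, "%c#'%s'", prefix, name);
        return snprintf(out, size, "%c#%s", prefix, name);
}
```

With the prefix 's' and a Rust type name such as the one shown above, the result is the quoted form; a name without spaces is emitted unquoted.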

The rest of the line contains a type string. Unlike with genksyms that produces C-style type strings, gendwarfksyms uses the same simple parsed DWARF format produced by --dump-dies, but with type references instead of fully expanded strings.

Maintaining a stable kABI

Distribution maintainers often need the ability to make ABI compatible changes to kernel data structures due to LTS updates or backports. Using the traditional #ifndef __GENKSYMS__ to hide these changes from symbol versioning won’t work when processing object files. To support this use case, gendwarfksyms provides kABI stability features designed to hide changes that won’t affect the ABI when calculating versions. These features are all gated behind the --stable command line flag and are not used in the mainline kernel. To use stable features during a kernel build, set KBUILD_GENDWARFKSYMS_STABLE=1.

Examples for using these features are provided in the scripts/gendwarfksyms/examples directory, including helper macros for source code annotation. Note that as these features are only used to transform the inputs for symbol versioning, the user is responsible for ensuring that their changes actually won’t break the ABI.

kABI rules

kABI rules allow distributions to fine-tune certain parts of gendwarfksyms output and thus control how symbol versions are calculated. These rules are defined in the .discard.gendwarfksyms.kabi_rules section of the object file and consist of simple null-terminated strings with the following structure:

version\0type\0target\0value\0

This string sequence is repeated as many times as needed to express all the rules. The fields are as follows:

  • version: Ensures backward compatibility for future changes to the structure. Currently expected to be “1”.

  • type: Indicates the type of rule being applied.

  • target: Specifies the target of the rule, typically the fully qualified name of the DWARF Debugging Information Entry (DIE).

  • value: Provides rule-specific data.
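To make the record layout concrete, here is a minimal sketch of how such a sequence of four null-terminated strings could be walked. The struct kabi_rule type and the parse_kabi_rule function are assumptions made for this example and do not reflect the actual gendwarfksyms implementation:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory view of one rule; fields point into the buffer. */
struct kabi_rule {
        const char *version;
        const char *type;
        const char *target;
        const char *value;
};

/*
 * Hypothetical parser: read one rule (four null-terminated strings) from
 * buf, which holds len bytes. Returns the number of bytes consumed, or 0
 * if a terminator is missing.
 */
static size_t parse_kabi_rule(const char *buf, size_t len,
                              struct kabi_rule *rule)
{
        const char **fields[] = { &rule->version, &rule->type,
                                  &rule->target, &rule->value };
        size_t pos = 0;

        assert(buf != NULL && rule != NULL);

        for (size_t i = 0; i < 4; i++) {
                const char *end = memchr(buf + pos, '\0', len - pos);

                if (!end)
                        return 0; /* truncated record */
                *fields[i] = buf + pos;
                pos = (size_t)(end - buf) + 1;
        }
        return pos;
}
```

Calling the function repeatedly, advancing the buffer by the returned byte count each time, visits every rule in the section until it returns 0.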

The following helper macros, for example, can be used to specify rules in the source code:

#define ___KABI_RULE(hint, target, value)                           \
        static const char __PASTE(__gendwarfksyms_rule_,             \
                                  __COUNTER__)[] __used __aligned(1) \
                __section(".discard.gendwarfksyms.kabi_rules") =     \
                        "1\0" #hint "\0" target "\0" value

#define __KABI_RULE(hint, target, value) \
        ___KABI_RULE(hint, #target, #value)

Currently, only the rules discussed in this section are supported, but the format is extensible enough to allow further rules to be added as need arises.
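The bytes that a rule occupies can be inspected with a simplified userspace approximation of the helper macros. The RULE_BYTES macro below is hypothetical: it drops the kernel-specific section and attribute annotations and only reproduces the string concatenation, so that the resulting layout can be examined in an ordinary C program:

```c
#include <assert.h>
#include <string.h>

/*
 * Simplified, hypothetical stand-in for the kernel helper macros: it
 * stringifies its arguments and concatenates them in the
 * version\0type\0target\0value\0 layout. The final terminator is the
 * implicit NUL at the end of the string literal.
 */
#define RULE_BYTES(hint, target, value) \
        "1\0" #hint "\0" #target "\0" #value

/* A declonly rule for struct s; the value field is empty. */
static const char declonly_rule[] = RULE_BYTES(declonly, s, );

/* An enumerator_ignore rule; the target is "e B" (enum and enumerator). */
static const char enum_rule[] = RULE_BYTES(enumerator_ignore, e B, );
```

Keeping each field in its own string literal matters: escape sequences are processed before adjacent literals are concatenated, so "\0" "16" is a NUL followed by "16", whereas "\016" written as a single literal would be one octal escape.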

Managing definition visibility

A declaration can change into a full definition when additional includes are pulled into the translation unit. This changes the versions of any symbol that references the type even if the ABI remains unchanged. As it may not be possible to drop includes without breaking the build, the declonly rule can be used to specify a type as declaration-only, even if the debugging information contains the full definition.

The rule fields are expected to be as follows:

  • type: “declonly”

  • target: The fully qualified name of the target data structure (as shown in --dump-dies output).

  • value: This field is ignored.

Using the __KABI_RULE macro, this rule can be defined as:

#define KABI_DECLONLY(fqn) __KABI_RULE(declonly, fqn, )

Example usage:

struct s {
        /* definition */
};

KABI_DECLONLY(s);

Adding enumerators

For enums, all enumerators and their values are included in calculating symbol versions, which becomes a problem if we later need to add more enumerators without changing symbol versions. The enumerator_ignore rule allows us to hide named enumerators from the input.

The rule fields are expected to be as follows:

  • type: “enumerator_ignore”

  • target: The fully qualified name of the target enum (as shown in --dump-dies output) and the name of the enumerator field separated by a space.

  • value: This field is ignored.

Using the __KABI_RULE macro, this rule can be defined as:

#define KABI_ENUMERATOR_IGNORE(fqn, field) \
        __KABI_RULE(enumerator_ignore, fqn field, )

Example usage:

enum e {
        A, B, C, D,
};

KABI_ENUMERATOR_IGNORE(e, B);
KABI_ENUMERATOR_IGNORE(e, C);

If the enum additionally includes an end marker and new values must be added in the middle, we may need to use the old value for the last enumerator when calculating versions. The enumerator_value rule allows us to override the value of an enumerator for version calculation:

  • type: “enumerator_value”

  • target: The fully qualified name of the target enum (as shown in --dump-dies output) and the name of the enumerator field separated by a space.

  • value: Integer value used for the field.

Using the __KABI_RULE macro, this rule can be defined as:

#define KABI_ENUMERATOR_VALUE(fqn, field, value) \
        __KABI_RULE(enumerator_value, fqn field, value)

Example usage:

enum e {
        A, B, C, LAST,
};

KABI_ENUMERATOR_IGNORE(e, C);
KABI_ENUMERATOR_VALUE(e, LAST, 2);
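The reason the value 2 is used for LAST in the example above can be seen by comparing the enum before and after the change. The sketch below assumes that C was the newly added enumerator and uses hypothetical OLD_ and NEW_ prefixes so both versions can coexist in one file:

```c
#include <assert.h>

/* Original enum: the end marker has the value 2. */
enum e_old { OLD_A, OLD_B, OLD_LAST };

/* After inserting C before the end marker, LAST moves to 3. */
enum e_new { NEW_A, NEW_B, NEW_C, NEW_LAST };

/* Existing enumerators keep their values; only the end marker shifts. */
static_assert(NEW_A == OLD_A && NEW_B == OLD_B, "existing values changed");
static_assert(NEW_LAST == OLD_LAST + 1, "end marker moved by one");
```

Hiding C and overriding LAST with its old value makes the versioning input match the original enum.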

Managing structure size changes

A data structure can be partially opaque to modules if its allocation is handled by the core kernel, and modules only need to access some of its members. In this situation, it’s possible to append new members to the structure without breaking the ABI, as long as the layout for the original members remains unchanged.

To append new members, we can hide them from symbol versioning as described in section Hiding members, but we can’t hide the increase in structure size. The byte_size rule allows us to override the structure size used for symbol versioning.

The rule fields are expected to be as follows:

  • type: “byte_size”

  • target: The fully qualified name of the target data structure (as shown in --dump-dies output).

  • value: A positive decimal number indicating the structure size in bytes.

Using the __KABI_RULE macro, this rule can be defined as:

#define KABI_BYTE_SIZE(fqn, value) \
        __KABI_RULE(byte_size, fqn, value)

Example usage:

struct s {
        /* Unchanged original members */
        unsigned long a;
        void *p;

        /* Appended new members */
        KABI_IGNORE(0, unsigned long n);
};

KABI_BYTE_SIZE(s, 16);
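The condition that the layout of the original members stays unchanged can be checked at compile time. The following sketch is an illustration only: the s_old and s_new names and the original_size helper are hypothetical, and the new member is written out directly instead of using the KABI_IGNORE macro:

```c
#include <assert.h>
#include <stddef.h>

/* Layout before the change. */
struct s_old {
        unsigned long a;
        void *p;
};

/* Layout after appending a member. */
struct s_new {
        unsigned long a;
        void *p;
        unsigned long n; /* appended member */
};

/* The original members must not move, even though the size grows. */
static_assert(offsetof(struct s_new, a) == offsetof(struct s_old, a),
              "member a moved");
static_assert(offsetof(struct s_new, p) == offsetof(struct s_old, p),
              "member p moved");
static_assert(sizeof(struct s_new) > sizeof(struct s_old),
              "size did not grow");

/* Size to report through a byte_size rule: that of the original layout. */
static size_t original_size(void)
{
        return sizeof(struct s_old);
}
```

On a typical 64-bit target where both unsigned long and pointers are 8 bytes, the original size is 16, which corresponds to the value used in the example above.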

Overriding type strings

In rare situations where distributions must make significant changes to otherwise opaque data structures that have inadvertently been included in the published ABI, keeping symbol versions stable using the more targeted kABI rules can become tedious. The type_string rule allows us to override the full type string for a type or a symbol, and even add types for versioning that no longer exist in the kernel.

The rule fields are expected to be as follows:

  • type: “type_string”

  • target: The fully qualified name of the target data structure (as shown in --dump-dies output) or symbol.

  • value: A valid type string (as shown in --symtypes output) to use instead of the real type.

Using the ___KABI_RULE macro, which takes the target and value as string literals, this rule can be defined as:

#define KABI_TYPE_STRING(type, str) \
        ___KABI_RULE(type_string, type, str)

Example usage:

/* Override type for a structure */
KABI_TYPE_STRING("s#s",
        "structure_type s { "
                "member base_type int byte_size(4) "
                        "encoding(5) n "
                "data_member_location(0) "
        "} byte_size(8)");

/* Override type for a symbol */
KABI_TYPE_STRING("my_symbol", "variable s#s");

The type_string rule should be used only as a last resort if maintaining stable symbol versions cannot be reasonably achieved using other means. Overriding a type string increases the risk of actual ABI breakages going unnoticed as it hides all changes to the type.

Adding structure members

Perhaps the most common ABI compatible change is adding a member to a kernel data structure. When changes to a structure are anticipated, distribution maintainers can pre-emptively reserve space in the structure and take it into use later without breaking the ABI. If changes are needed to data structures without reserved space, existing alignment holes can potentially be used instead. While kABI rules could be added for these types of changes, using unions is typically a more natural method. This section describes gendwarfksyms support for using reserved space in data structures and hiding members that don’t change the ABI when calculating symbol versions.

Reserving space and replacing members

Space is typically reserved for later use by appending integer types, or arrays, to the end of the data structure, but any type can be used. Each reserved member needs a unique name, but as the actual purpose is usually not known at the time the space is reserved, for convenience, names that start with __kabi_ are left out when calculating symbol versions:

struct s {
        long a;
        long __kabi_reserved_0; /* reserved for future use */
};

The reserved space can be taken into use by wrapping the member in a union, which includes the original type and the replacement member:

struct s {
        long a;
        union {
                long __kabi_reserved_0; /* original type */
                struct b b; /* replaced field */
        };
};

If the __kabi_ naming scheme was used when reserving space, the name of the first member of the union must start with __kabi_reserved. This ensures the original type is used when calculating versions, but the name is again left out. The rest of the union is ignored.

If we’re replacing a member that doesn’t follow this naming convention, we also need to preserve the original name to avoid changing versions, which we can do by changing the first union member’s name to start with __kabi_renamed followed by the original name.

The examples include KABI_(RESERVE|USE|REPLACE)* macros that help simplify the process and also ensure the replacement member is correctly aligned and its size won’t exceed the reserved space.
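The size and alignment requirements mentioned above can be expressed as compile-time checks. The sketch below does not reproduce the macros from the examples directory; it spells out the same kind of checks by hand, and the contents of struct b are an assumption made for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical replacement type; its contents are made up for this sketch. */
struct b {
        int x;
};

struct s {
        long a;
        union {
                long __kabi_reserved_0; /* original type */
                struct b b;             /* replaced field */
        };
};

/* The replacement must fit within the reserved member... */
static_assert(sizeof(struct b) <= sizeof(long), "replacement too large");
/* ...and must not require stricter alignment than it provides. */
static_assert(_Alignof(struct b) <= _Alignof(long),
              "replacement over-aligned");
/* As a result, the structure keeps the size it had before the change. */
static_assert(sizeof(struct s) == 2 * sizeof(long), "struct size changed");
```

If the replacement type later grows beyond the reserved space, the first assertion fails at build time instead of silently changing the structure layout.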

Hiding members

Predicting which structures will require changes during the support timeframe isn’t always possible, in which case one might have to resort to placing new members into existing alignment holes:

struct s {
        int a;
        /* a 4-byte alignment hole */
        unsigned long b;
};

While this won’t change the size of the data structure, one needs to be able to hide the added members from symbol versioning. Similarly to reserved fields, this can be accomplished by wrapping the added member in a union where one of the fields has a name starting with __kabi_ignored:

struct s {
        int a;
        union {
                char __kabi_ignored_0;
                int n;
        };
        unsigned long b;
};

With --stable, both versions produce the same symbol version. The examples include a KABI_IGNORE macro to simplify the code.
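Whether an alignment hole is actually available depends on the target ABI, so it can be worth verifying the layout before relying on it. The following sketch uses hypothetical names (s_old, s_new, hole_size, layout_unchanged) and writes the union out by hand rather than using KABI_IGNORE:

```c
#include <assert.h>
#include <stddef.h>

/* Original layout; a hole follows a if unsigned long is 8-byte aligned. */
struct s_old {
        int a;
        unsigned long b;
};

/* Layout with a new member placed where the hole is expected. */
struct s_new {
        int a;
        union {
                char __kabi_ignored_0;
                int n; /* added member */
        };
        unsigned long b;
};

/* Bytes of padding between a and b in the original layout. */
static size_t hole_size(void)
{
        return offsetof(struct s_old, b) - sizeof(int);
}

/* Nonzero if adding the member left the size and the offset of b intact. */
static int layout_unchanged(void)
{
        return sizeof(struct s_new) == sizeof(struct s_old) &&
               offsetof(struct s_new, b) == offsetof(struct s_old, b);
}
```

On a target where unsigned long is 8 bytes with 8-byte alignment, hole_size() is 4 and the layout is unchanged; on a target without the hole, the added member would grow the structure and this approach would not apply.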