Assembler Annotations
Copyright (c) 2017-2019 Jiri Slaby
This document describes the new macros for annotation of data and code in assembly. In particular, it contains information about SYM_FUNC_START, SYM_FUNC_END, SYM_CODE_START, and similar.
Rationale
Some code like entries, trampolines, or boot code needs to be written in assembly. As in C, such code is grouped into functions and accompanied by data. Standard assemblers do not force users to mark these pieces precisely as code or data, or even to specify their length. Nevertheless, assemblers provide developers with such annotations to aid debuggers throughout assembly. On top of that, developers also want to mark some functions as global so that they are visible outside of their translation units.
Over time, the Linux kernel has adopted macros from various projects (like binutils) to facilitate such annotations. So for historic reasons, developers have been using ENTRY, END, ENDPROC, and other annotations in assembly. Because these macros were never documented, they are used in the wrong contexts at some locations. ENTRY was intended to denote the beginning of global symbols (be it data or code). END was used to mark the end of data or the end of special functions with a non-standard calling convention. In contrast, ENDPROC should annotate only the ends of standard functions.
When these macros are used correctly, they help assemblers generate a nice object with both sizes and types set correctly. For example, the result of arch/x86/lib/putuser.S:
    Num:    Value          Size Type    Bind   Vis      Ndx Name
     25: 0000000000000000    33 FUNC    GLOBAL DEFAULT    1 __put_user_1
     29: 0000000000000030    37 FUNC    GLOBAL DEFAULT    1 __put_user_2
     32: 0000000000000060    36 FUNC    GLOBAL DEFAULT    1 __put_user_4
     35: 0000000000000090    37 FUNC    GLOBAL DEFAULT    1 __put_user_8
This is not only important for debugging purposes. When there are properly annotated objects like this, tools can be run on them to generate more useful information. In particular, on properly annotated objects, objtool can be run to check and fix the object if needed. Currently, objtool can report missing frame pointer setup/destruction in functions. It can also automatically generate annotations for the ORC unwinder for most code. Both of these are especially important to support reliable stack traces, which are in turn necessary for kernel live patching.
Caveat and Discussion
As one might realize, there were only three macros previously. That is indeed insufficient to cover all the combinations of cases:
- standard/non-standard function
- code/data
- global/local symbol
There was a discussion and, instead of extending the current ENTRY/END* macros, it was decided that brand new macros should be introduced:

    So how about using macro names that actually show the purpose, instead
    of importing all the crappy, historic, essentially randomly chosen
    debug symbol macro names from the binutils and older kernels?
Macros Description
The new macros are prefixed with the SYM_ prefix and can be divided into three main groups:
- SYM_FUNC_* – to annotate C-like functions. This means functions with standard C calling conventions. For example, on x86, this means that the stack contains a return address at the predefined place and a return from the function can happen in a standard way. When frame pointers are enabled, save/restore of frame pointer shall happen at the start/end of a function, respectively, too.

  Checking tools like objtool should ensure such marked functions conform to these rules. The tools can also easily annotate these functions with debugging information (like ORC data) automatically.

- SYM_CODE_* – special functions called with special stack. Be it interrupt handlers with special stack content, trampolines, or startup functions.

  Checking tools mostly ignore checking of these functions. But some debug information still can be generated automatically. For correct debug data, this code needs hints like UNWIND_HINT_REGS provided by developers.

- SYM_DATA* – obviously data belonging to .data sections and not to .text. Data do not contain instructions, so they have to be treated specially by the tools: they should not treat the bytes as instructions, nor assign any debug information to them.
Instruction Macros
This section covers SYM_FUNC_* and SYM_CODE_* enumerated above.
- SYM_FUNC_START and SYM_FUNC_START_LOCAL are supposed to be the most frequent markings. They are used for functions with standard calling conventions – global and local. Like in C, they both align the functions to architecture specific __ALIGN bytes. There are also _NOALIGN variants for special cases where developers do not want this implicit alignment.

  SYM_FUNC_START_WEAK and SYM_FUNC_START_WEAK_NOALIGN markings are also offered as an assembler counterpart to the weak attribute known from C.

  All of these shall be coupled with SYM_FUNC_END. First, it marks the sequence of instructions as a function and records its size in the generated object file. Second, it also eases checking and processing such object files as the tools can trivially find exact function boundaries.

  So in most cases, developers should write something like the following example, having some asm instructions in between the macros, of course:

      SYM_FUNC_START(memset)
          ... asm insns ...
      SYM_FUNC_END(memset)

  In fact, this kind of annotation corresponds to the now deprecated ENTRY and ENDPROC macros.

- SYM_FUNC_START_ALIAS and SYM_FUNC_START_LOCAL_ALIAS serve for those who decided to have two or more names for one function. The typical use is:

      SYM_FUNC_START_ALIAS(__memset)
      SYM_FUNC_START(memset)
          ... asm insns ...
      SYM_FUNC_END(memset)
      SYM_FUNC_END_ALIAS(__memset)

  In this example, one can call __memset or memset with the same result, except the debug information for the instructions is generated to the object file only once – for the non-ALIAS case.

- SYM_CODE_START and SYM_CODE_START_LOCAL should be used only in special cases – if you know what you are doing. This is used exclusively for interrupt handlers and similar where the calling convention is not the C one. _NOALIGN variants exist too. The use is the same as for the FUNC category above:

      SYM_CODE_START_LOCAL(bad_put_user)
          ... asm insns ...
      SYM_CODE_END(bad_put_user)

  Again, every SYM_CODE_START* shall be coupled with SYM_CODE_END.

  To some extent, this category corresponds to the deprecated ENTRY and END, except that END had several other meanings too.

- SYM_INNER_LABEL* is used to denote a label inside some SYM_{CODE,FUNC}_START and SYM_{CODE,FUNC}_END. They are very similar to C labels, except they can be made global. An example of use:

      SYM_CODE_START(ftrace_caller)
          /* save_mcount_regs fills in first two parameters */
          ...

      SYM_INNER_LABEL(ftrace_caller_op_ptr, SYM_L_GLOBAL)
          /* Load the ftrace_ops into the 3rd parameter */
          ...

      SYM_INNER_LABEL(ftrace_call, SYM_L_GLOBAL)
          call ftrace_stub
          ...
          retq
      SYM_CODE_END(ftrace_caller)
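To make the function macros above more concrete, here is a rough sketch of the plain GNU as directives that such annotations boil down to. It assumes x86-64 and AT&T syntax; the names add_one, __add_one, and add_one_ret are hypothetical, and the 16-byte alignment is an assumption standing in for the architecture's __ALIGN. The real expansion is defined in the kernel's linkage headers and may differ in detail between versions.

```asm
	.text

	/* Roughly SYM_FUNC_START_ALIAS(__add_one): a second global name */
	.globl	__add_one
	.p2align 4		/* assumed; the kernel uses the arch's __ALIGN */
__add_one:
	/* Roughly SYM_FUNC_START(add_one): global, aligned entry point */
	.globl	add_one
	.p2align 4		/* already aligned here, so no padding is added */
add_one:
	leaq	1(%rdi), %rax	/* return value = first argument + 1 */

	/* Roughly SYM_INNER_LABEL(add_one_ret, SYM_L_GLOBAL) */
	.type	add_one_ret, @notype
	.globl	add_one_ret
add_one_ret:
	ret

	/* Roughly SYM_FUNC_END(add_one): set the symbol's type and size */
	.type	add_one, @function
	.size	add_one, . - add_one
	/* Roughly SYM_FUNC_END_ALIAS(__add_one) */
	.type	__add_one, @function
	.size	__add_one, . - __add_one
```

Assembling this and running readelf -s on the object should list add_one and __add_one as FUNC GLOBAL symbols at the same address with the same size, and add_one_ret as a NOTYPE GLOBAL symbol inside them. A SYM_CODE_* pair would look the same except that the closing .type would be @notype instead of @function.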
Data Macros
Similar to instructions, there are a couple of macros to describe data in the assembly.
- SYM_DATA_START and SYM_DATA_START_LOCAL mark the start of some data and shall be used in conjunction with either SYM_DATA_END or SYM_DATA_END_LABEL. The latter also adds a label to the end, so that people can use lstack and (local) lstack_end in the following example:

      SYM_DATA_START_LOCAL(lstack)
          .skip 4096
      SYM_DATA_END_LABEL(lstack, SYM_L_LOCAL, lstack_end)

- SYM_DATA and SYM_DATA_LOCAL are variants for simple, mostly one-line data:

      SYM_DATA(HEAP,     .long rm_heap)
      SYM_DATA(heap_end, .long rm_stack)

  In the end, they expand to SYM_DATA_START with SYM_DATA_END internally.
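As an illustration of the data macros, the lstack example can be written out in plain GNU as directives. This is a sketch under assumptions (x86-64 ELF, data placed in .data for the sake of the example), not the literal macro output, which is defined in the kernel's linkage headers.

```asm
	.data

	/* Roughly SYM_DATA_START_LOCAL(lstack): local linkage, so no
	 * .globl or .weak directive, and no implicit alignment */
lstack:
	.skip	4096		/* reserve 4096 bytes */

	/* Roughly SYM_DATA_END_LABEL(lstack, SYM_L_LOCAL, lstack_end) */
	.type	lstack_end, @object
lstack_end:			/* local label marking the end of the data */
	.type	lstack, @object
	.size	lstack, . - lstack
```

With this, readelf -s should report lstack as an OBJECT LOCAL symbol of size 4096, with lstack_end pointing just past it. Because the symbol is typed as an object, tools have no reason to interpret its bytes as instructions.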
Support Macros
All the above reduce themselves to some invocation of SYM_START, SYM_END, or SYM_ENTRY in the end. Normally, developers should avoid using these.
Further, in the above examples, one could see SYM_L_LOCAL. There are also SYM_L_GLOBAL and SYM_L_WEAK. All are intended to denote linkage of a symbol marked by them. They are used either in _LABEL variants of the earlier macros, or in SYM_START.
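The three linkage kinds map onto ordinary assembler directives. The sketch below shows one symbol of each kind in plain GNU as (x86-64 assumed); the symbol names are hypothetical, and the comments name the linkage macro each directive is assumed to correspond to.

```asm
	.text

	.globl	sym_global	/* SYM_L_GLOBAL: visible to other objects */
sym_global:
	ret

	.weak	sym_weak	/* SYM_L_WEAK: may be overridden by a
				 * non-weak definition at link time */
sym_weak:
	ret

sym_local:			/* SYM_L_LOCAL: no directive at all, the
				 * symbol stays local to this object */
	ret
```

In readelf -s output, the Bind column for these three symbols should read GLOBAL, WEAK, and LOCAL respectively.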
Overriding Macros
Architectures can also override any of the macros in their own asm/linkage.h, including macros specifying the type of a symbol (SYM_T_FUNC, SYM_T_OBJECT, and SYM_T_NONE). As every macro described in this file is surrounded by #ifndef + #endif, it is enough to define the macros differently in the aforementioned architecture-dependent header.