Assembler Annotations
=====================

Copyright (c) 2017-2019 Jiri Slaby

This document describes the new macros for annotation of data and code in
assembly. In particular, it contains information about ``SYM_FUNC_START``,
``SYM_FUNC_END``, ``SYM_CODE_START``, and similar.


Some code like entries, trampolines, or boot code needs to be written in
assembly. As in C, such code is grouped into functions and accompanied by
data. Standard assemblers do not force users to mark these pieces precisely
as code or data, or even to specify their length. Nevertheless, assemblers
provide developers with such annotations to aid debuggers throughout
assembly. On top of that, developers also want to mark some functions as
*global* in order to be visible outside of their translation unit.

Over time, the Linux kernel has adopted macros from various projects (like
``binutils``) to facilitate such annotations. So for historic reasons,
developers have been using ``ENTRY``, ``END``, ``ENDPROC``, and other
annotations in assembly. Because these macros were never documented, they
are used in wrong contexts at some locations. Clearly, ``ENTRY`` was
intended to denote the beginning of global symbols (be it data or code).
``END`` was used to mark the end of data or the end of special functions with
a *non-standard* calling convention. In contrast, ``ENDPROC`` should annotate
only the ends of *standard* functions.


When these macros are used correctly, they help assemblers generate a nice
object with both sizes and types set correctly. For example, the symbol
table of a correctly annotated object can look like this::

   Num:    Value          Size Type    Bind   Vis      Ndx Name
    25: 0000000000000000    33 FUNC    GLOBAL DEFAULT    1 __put_user_1
    29: 0000000000000030    37 FUNC    GLOBAL DEFAULT    1 __put_user_2
    32: 0000000000000060    36 FUNC    GLOBAL DEFAULT    1 __put_user_4
    35: 0000000000000090    37 FUNC    GLOBAL DEFAULT    1 __put_user_8


This is not only important for debugging purposes. When there are properly
annotated objects like this, tools can be run on them to generate more useful
information. In particular, on properly annotated objects, ``objtool`` can be
run to check and fix the object if needed. Currently, ``objtool`` can report
missing frame pointer setup/destruction in functions. It can also
automatically generate annotations for the
:doc:`ORC unwinder <x86/orc-unwinder>` for most code. Both of these are
especially important to support reliable stack traces, which are in turn
necessary for kernel live patching.

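
As a rough illustration of what "sizes and types set correctly" means, the
following C sketch parses rows in the format of the symbol table shown earlier
and checks that a symbol is a ``FUNC`` with a nonzero size. The names
``sym_row``, ``parse_sym_row``, ``is_sized_func``, and ``sym_end`` are
hypothetical helpers invented for this example; they are not part of
``objtool`` or the kernel, and the parser assumes a numeric ``Ndx`` column.

.. code-block:: c

    #include <stdio.h>
    #include <string.h>

    /* One parsed row of a symbol table in the format shown in this document. */
    struct sym_row {
        unsigned long value;   /* start offset within the section */
        unsigned long size;    /* size in bytes, recorded by the end annotation */
        char type[16];         /* FUNC, OBJECT, NOTYPE, ... */
        char name[64];
    };

    /* Parse one table row; returns 1 on success, 0 if the line does not match. */
    static int parse_sym_row(const char *line, struct sym_row *out)
    {
        int num, ndx;
        char bind[16], vis[16];

        return sscanf(line, "%d: %lx %lu %15s %15s %15s %d %63s",
                      &num, &out->value, &out->size, out->type,
                      bind, vis, &ndx, out->name) == 8;
    }

    /* Assumption of this sketch: "properly annotated" means FUNC with size > 0. */
    static int is_sized_func(const struct sym_row *s)
    {
        return strcmp(s->type, "FUNC") == 0 && s->size > 0;
    }

    /* First offset past the symbol; handy for checking that functions do not overlap. */
    static unsigned long sym_end(const struct sym_row *s)
    {
        return s->value + s->size;
    }

For the ``__put_user_2`` row, the parsed value is 0x30 and the size is 37, so
``sym_end`` yields 0x55, which lies below the start of ``__put_user_4`` at
0x60.
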

Caveat and Discussion
---------------------

As one might realize, there were only three macros previously. That is indeed
insufficient to cover all the combinations of cases:

* standard/non-standard function
* code/data
* global/local symbol

There was a discussion_ and instead of extending the current ``ENTRY/END*``
macros, it was decided that brand new macros should be introduced::

    So how about using macro names that actually show the purpose, instead
    of importing all the crappy, historic, essentially randomly chosen
    debug symbol macro names from the binutils and older kernels?

.. _discussion:


Macros Description
------------------

The new macros are prefixed with the ``SYM_`` prefix and can be divided into
three main groups:

1. ``SYM_FUNC_*`` -- to annotate C-like functions. This means functions with
   standard C calling conventions. For example, on x86, this means that the
   stack contains a return address at the predefined place and a return from
   the function can happen in a standard way. When frame pointers are enabled,
   the frame pointer shall also be saved at the start and restored at the end
   of the function.

   Checking tools like ``objtool`` should ensure such marked functions conform
   to these rules. The tools can also easily annotate these functions with
   debugging information (like *ORC data*) automatically.

2. ``SYM_CODE_*`` -- special functions called with a special stack, be it
   interrupt handlers with special stack content, trampolines, or startup
   functions.

   Checking tools mostly skip these functions, but some debug information can
   still be generated automatically. For correct debug data, this code needs
   hints like ``UNWIND_HINT_REGS`` provided by developers.

3. ``SYM_DATA*`` -- obviously data belonging to ``.data`` sections and not to
   ``.text``. Data do not contain instructions, so they have to be treated
   specially by the tools: they should not treat the bytes as instructions,
   nor assign any debug information to them.

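
The three groups, combined with the global/local distinction, can be
summarized as a small decision function. The C sketch below is only a reading
aid: ``pick_start_macro`` and its parameters are hypothetical names invented
here, and the function covers just the plain ``START`` variants named in this
document (no weak, alias, or ``_NOALIGN`` forms).

.. code-block:: c

    #include <stdbool.h>

    /*
     * Map the three properties discussed above to the name of the opening
     * annotation macro. Hypothetical helper, for illustration only.
     */
    static const char *pick_start_macro(bool is_code, bool standard_call,
                                        bool is_global)
    {
        /* Data: the calling convention is irrelevant. */
        if (!is_code)
            return is_global ? "SYM_DATA_START" : "SYM_DATA_START_LOCAL";

        /* C-like functions with the standard calling convention. */
        if (standard_call)
            return is_global ? "SYM_FUNC_START" : "SYM_FUNC_START_LOCAL";

        /* Everything else: interrupt handlers, trampolines, startup code. */
        return is_global ? "SYM_CODE_START" : "SYM_CODE_START_LOCAL";
    }

For instance, a file-local interrupt handler (code, non-standard convention,
local) maps to ``SYM_CODE_START_LOCAL``.
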

Instruction Macros
~~~~~~~~~~~~~~~~~~

This section covers ``SYM_FUNC_*`` and ``SYM_CODE_*`` enumerated above.

``objtool`` requires that all code must be contained in an ELF symbol. Symbol
names that have a ``.L`` prefix do not emit symbol table entries. ``.L``
prefixed symbols can be used within a code region, but should be avoided for
denoting a range of code via ``SYM_*_START/END`` annotations.


* ``SYM_FUNC_START`` and ``SYM_FUNC_START_LOCAL`` are supposed to be **the
  most frequent markings**. They are used for functions with standard calling
  conventions -- global and local. Like in C, they both align the functions to
  architecture specific ``__ALIGN`` bytes. There are also ``_NOALIGN`` variants
  for special cases where developers do not want this implicit alignment.

  ``SYM_FUNC_START_WEAK`` and ``SYM_FUNC_START_WEAK_NOALIGN`` markings are
  also offered as an assembler counterpart to the *weak* attribute known from
  C.

  All of these **shall** be coupled with ``SYM_FUNC_END``. First, it marks
  the sequence of instructions as a function and records its size in the
  generated object file. Second, it also eases checking and processing such
  object files as the tools can trivially find exact function boundaries.

  So in most cases, developers should write something like the following
  example, having some asm instructions in between the macros, of course::

    SYM_FUNC_START(memset)
        ... asm insns ...
    SYM_FUNC_END(memset)

  In fact, this kind of annotation corresponds to the now deprecated ``ENTRY``
  and ``ENDPROC`` macros.


* ``SYM_FUNC_START_ALIAS`` and ``SYM_FUNC_START_LOCAL_ALIAS`` are for cases
  where one function needs two or more names. The typical use is::

    SYM_FUNC_START_ALIAS(__memset)
    SYM_FUNC_START(memset)
        ... asm insns ...
    SYM_FUNC_END(memset)
    SYM_FUNC_END_ALIAS(__memset)

  In this example, one can call ``__memset`` or ``memset`` with the same
  result, except the debug information for the instructions is emitted into
  the object file only once -- for the non-``ALIAS`` case.


* ``SYM_CODE_START`` and ``SYM_CODE_START_LOCAL`` should be used only in
  special cases -- if you know what you are doing. They are used exclusively
  for interrupt handlers and similar code where the calling convention is not
  the C one. ``_NOALIGN`` variants exist too. The use is the same as for the
  ``FUNC`` category above::

    SYM_CODE_START_LOCAL(bad_put_user)
        ... asm insns ...
    SYM_CODE_END(bad_put_user)

  Again, every ``SYM_CODE_START*`` **shall** be coupled with ``SYM_CODE_END``.

  To some extent, this category corresponds to the deprecated ``ENTRY`` and
  ``END``, except that ``END`` had several other meanings too.


* ``SYM_INNER_LABEL*`` macros denote a label inside some
  ``SYM_{CODE,FUNC}_START`` and ``SYM_{CODE,FUNC}_END`` pair. They are very
  similar to C labels, except they can be made global. An example of use::

    SYM_CODE_START(ftrace_caller)
        /* save_mcount_regs fills in first two parameters */
        ...

    SYM_INNER_LABEL(ftrace_caller_op_ptr, SYM_L_GLOBAL)
        /* Load the ftrace_ops into the 3rd parameter */
        ...

    SYM_INNER_LABEL(ftrace_call, SYM_L_GLOBAL)
        call ftrace_stub
        ...
        retq
    SYM_CODE_END(ftrace_caller)

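
The pairing rule stated above (every ``SYM_FUNC_START*`` is closed by
``SYM_FUNC_END*``, every ``SYM_CODE_START*`` by ``SYM_CODE_END``) can be
checked mechanically. The following C sketch is a toy checker over source
lines, not how ``objtool`` works; ``annotations_balanced``, ``sym_name``, and
``MAX_OPEN`` are hypothetical names, and the sketch does not verify that an
``_ALIAS`` start is closed by an ``_ALIAS`` end.

.. code-block:: c

    #include <stddef.h>
    #include <string.h>

    #define MAX_OPEN 8   /* arbitrary nesting limit for this sketch */

    /* Copy the first macro argument (the symbol name) into out. */
    static int sym_name(const char *line, char *out, size_t len)
    {
        const char *open = strchr(line, '(');
        size_t n;

        if (!open)
            return 0;
        n = strcspn(open + 1, ",)");
        if (n == 0 || n >= len)
            return 0;
        memcpy(out, open + 1, n);
        out[n] = '\0';
        return 1;
    }

    /*
     * Returns 1 if every FUNC/CODE start annotation is closed by an end
     * annotation of the same family naming the same symbol, 0 otherwise.
     */
    static int annotations_balanced(const char *const *lines, size_t count)
    {
        char open_name[MAX_OPEN][64];
        char open_kind[MAX_OPEN];
        size_t depth = 0, i;

        for (i = 0; i < count; i++) {
            const char *l = lines[i] + strspn(lines[i], " \t");
            char name[64];
            char kind;

            if (!strncmp(l, "SYM_FUNC_", 9))
                kind = 'F';
            else if (!strncmp(l, "SYM_CODE_", 9))
                kind = 'C';
            else
                continue;   /* instructions, inner labels, data, ... */

            if (!sym_name(l, name, sizeof(name)))
                return 0;

            if (!strncmp(l + 9, "START", 5)) {
                if (depth == MAX_OPEN)
                    return 0;
                open_kind[depth] = kind;
                strcpy(open_name[depth], name);
                depth++;
            } else if (!strncmp(l + 9, "END", 3)) {
                if (depth == 0)
                    return 0;
                depth--;
                if (open_kind[depth] != kind || strcmp(open_name[depth], name))
                    return 0;
            }
        }
        return depth == 0;   /* anything still open is an error */
    }

Fed the five lines of the ``__memset``/``memset`` alias example, the function
returns 1; a ``SYM_CODE_START_LOCAL(bad_put_user)`` closed by
``SYM_FUNC_END(bad_put_user)`` makes it return 0.
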

Data Macros
~~~~~~~~~~~

Similar to instructions, there are a couple of macros to describe data in
assembly.

* ``SYM_DATA_START`` and ``SYM_DATA_START_LOCAL`` mark the start of some data
  and shall be used in conjunction with either ``SYM_DATA_END``, or
  ``SYM_DATA_END_LABEL``. The latter also adds a label at the end, so that
  people can use ``lstack`` and (local) ``lstack_end`` in the following
  example::

    SYM_DATA_START_LOCAL(lstack)
        .skip 4096
    SYM_DATA_END_LABEL(lstack, SYM_L_LOCAL, lstack_end)

* ``SYM_DATA`` and ``SYM_DATA_LOCAL`` are variants for simple, mostly one-line
  data::

    SYM_DATA(HEAP,     .long rm_heap)
    SYM_DATA(heap_end, .long rm_stack)

  In the end, they expand to ``SYM_DATA_START`` with ``SYM_DATA_END``
  internally.


Support Macros
~~~~~~~~~~~~~~

All the above eventually reduce to some invocation of ``SYM_START``,
``SYM_END``, or ``SYM_ENTRY``. Normally, developers should avoid using
these.

Further, in the above examples, one could see ``SYM_L_LOCAL``. There are also
``SYM_L_GLOBAL`` and ``SYM_L_WEAK``. All are intended to denote linkage of a
symbol marked by them. They are used either in ``_LABEL`` variants of the
earlier macros, or in ``SYM_START``.

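
To make the layering concrete, the following C sketch defines simplified
stand-ins for the support macros and uses preprocessor stringification to
capture what a ``SYM_FUNC_START``/``SYM_FUNC_END`` pair could expand to. The
macro bodies are assumptions made for illustration, not a copy of the kernel
header: the separator ``;``, the ``.balign 16`` alignment, and the symbol name
``my_func`` are arbitrary choices, and ``has_directive`` is a hypothetical
helper.

.. code-block:: c

    #include <string.h>

    /* Two-step stringification so that the argument is macro-expanded first. */
    #define STR(...)  #__VA_ARGS__
    #define XSTR(...) STR(__VA_ARGS__)

    #define ASM_NL ;                        /* statement separator (assumed) */

    /* Linkage helpers: each emits the directive that sets the binding. */
    #define SYM_L_GLOBAL(name) .globl name
    #define SYM_L_WEAK(name)   .weak name
    #define SYM_L_LOCAL(name)               /* nothing: local is the default */

    /* Alignment helpers; 16 is a stand-in for the architecture's __ALIGN. */
    #define SYM_A_ALIGN .balign 16
    #define SYM_A_NONE

    #define SYM_T_FUNC STT_FUNC             /* symbol type passed to .type */

    /* Lowest layer: binding, alignment, then the label itself. */
    #define SYM_ENTRY(name, linkage, align) \
        linkage(name) ASM_NL align ASM_NL name:

    #define SYM_START(name, linkage, align) SYM_ENTRY(name, linkage, align)

    /* Closing layer: record the type and the size of the symbol. */
    #define SYM_END(name, sym_type) \
        .type name sym_type ASM_NL .size name, .-name

    /* User-facing macros are thin wrappers that pick linkage and alignment. */
    #define SYM_FUNC_START(name)       SYM_START(name, SYM_L_GLOBAL, SYM_A_ALIGN)
    #define SYM_FUNC_START_LOCAL(name) SYM_START(name, SYM_L_LOCAL, SYM_A_ALIGN)
    #define SYM_FUNC_END(name)         SYM_END(name, SYM_T_FUNC)

    /* Expansions captured as text so they can be inspected at run time. */
    static const char *const func_start_text = XSTR(SYM_FUNC_START(my_func));
    static const char *const func_local_text = XSTR(SYM_FUNC_START_LOCAL(my_func));
    static const char *const func_end_text   = XSTR(SYM_FUNC_END(my_func));

    /* Returns 1 if the captured expansion contains the given directive text. */
    static int has_directive(const char *expansion, const char *directive)
    {
        return strstr(expansion, directive) != NULL;
    }

In this model, ``func_start_text`` contains ``.globl my_func`` and the label
``my_func:``, ``func_local_text`` contains no ``.globl`` at all, and
``func_end_text`` contains the ``.type`` and ``.size`` directives. Passing the
linkage helper by name, as ``SYM_START`` does, is what lets one lower-level
macro serve the global, weak, and local variants.
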

Overriding Macros
~~~~~~~~~~~~~~~~~

An architecture can also override any of the macros in its own
``asm/linkage.h``, including macros specifying the type of a symbol
(``SYM_T_FUNC``, ``SYM_T_OBJECT``, and ``SYM_T_NONE``).  As every macro
described in this file is surrounded by ``#ifndef`` + ``#endif``, it is enough
to define the macros differently in the aforementioned architecture-dependent
header.

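
The override mechanism relies on nothing more than preprocessor guards. The
following C sketch imitates it within a single file: the first part plays the
role of an architecture's ``asm/linkage.h``, the second part the generic
header. The string values, including the ``"%function"`` override, are
placeholders chosen for this example, and the accessor functions are
hypothetical.

.. code-block:: c

    /* --- stand-in for an architecture's asm/linkage.h ------------------ */
    /* Hypothetical override: this architecture wants a different spelling. */
    #define SYM_T_FUNC "%function"

    /* --- stand-in for the generic header -------------------------------- */
    /* Each default applies only if the architecture did not define it. */
    #ifndef SYM_T_FUNC
    #define SYM_T_FUNC "STT_FUNC"
    #endif

    #ifndef SYM_T_OBJECT
    #define SYM_T_OBJECT "STT_OBJECT"
    #endif

    #ifndef SYM_T_NONE
    #define SYM_T_NONE "STT_NOTYPE"
    #endif

    /* Accessors so the effective values can be inspected at run time. */
    static const char *sym_type_func(void)   { return SYM_T_FUNC; }
    static const char *sym_type_object(void) { return SYM_T_OBJECT; }
    static const char *sym_type_none(void)   { return SYM_T_NONE; }

Here ``sym_type_func()`` returns the overridden value, while the other two
accessors return the generic defaults. The order matters: the
architecture-specific definitions have to be seen before the guarded defaults.
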