linux/Documentation/unaligned-memory-access.txt
UNALIGNED MEMORY ACCESSES
=========================

Linux runs on a wide variety of architectures which have varying behaviour
when it comes to memory access. This document presents some details about
unaligned accesses, why you need to write code that doesn't cause them,
and how to write such code!


The definition of an unaligned access
=====================================

Unaligned memory accesses occur when you try to read N bytes of data starting
from an address that is not evenly divisible by N (i.e. addr % N != 0).
For example, reading 4 bytes of data from address 0x10004 is fine, but
reading 4 bytes of data from address 0x10005 would be an unaligned memory
access.
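
The rule can be written down directly in C. This is only a sketch:
is_naturally_aligned() is a hypothetical helper name, not an existing kernel
function, and the address is passed as a plain integer for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true if an access of n bytes starting at
 * addr satisfies the rule above, i.e. addr % n == 0. n must be non-zero.
 */
static bool is_naturally_aligned(uintptr_t addr, size_t n)
{
	assert(n != 0);
	return addr % n == 0;
}
```

With this helper, is_naturally_aligned(0x10004, 4) is true and
is_naturally_aligned(0x10005, 4) is false, matching the example above.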

The above may seem a little vague, as memory access can happen in different
ways. The context here is at the machine code level: certain instructions read
or write a number of bytes to or from memory (e.g. movb, movw, movl in x86
assembly). As will become clear, it is relatively easy to spot C statements
which will compile to multiple-byte memory access instructions, namely when
dealing with types such as u16, u32 and u64.


Natural alignment
=================

The rule mentioned above forms what we refer to as natural alignment:
When accessing N bytes of memory, the base memory address must be evenly
divisible by N, i.e. addr % N == 0.

When writing code, assume the target architecture has natural alignment
requirements.

In reality, only a few architectures require natural alignment on all sizes
of memory access. However, we must consider ALL supported architectures;
writing code that satisfies natural alignment requirements is the easiest way
to achieve full portability.


Why unaligned access is bad
===========================

The effects of performing an unaligned memory access vary from architecture
to architecture. It would be easy to write a whole document on the differences
here; a summary of the common scenarios is presented below:

 - Some architectures are able to perform unaligned memory accesses
   transparently, but there is usually a significant performance cost.
 - Some architectures raise processor exceptions when unaligned accesses
   happen. The exception handler is able to correct the unaligned access,
   at significant cost to performance.
 - Some architectures raise processor exceptions when unaligned accesses
   happen, but the exceptions do not contain enough information for the
   unaligned access to be corrected.
 - Some architectures are not capable of unaligned memory access, but will
   silently perform a different memory access to the one that was requested,
   resulting in a subtle code bug that is hard to detect!

It should be obvious from the above that if your code causes unaligned
memory accesses to happen, your code will not work correctly on certain
platforms and will cause performance problems on others.


Code that does not cause unaligned access
=========================================

At first, the concepts above may seem a little hard to relate to actual
coding practice. After all, you don't have a great deal of control over
memory addresses of certain variables, etc.

Fortunately things are not too complex, as in most cases, the compiler
ensures that things will work for you. For example, take the following
structure:

        struct foo {
                u16 field1;
                u32 field2;
                u8 field3;
        };

Let us assume that an instance of the above structure resides in memory
starting at address 0x10000. With a basic level of understanding, it would
not be unreasonable to expect that accessing field2 would cause an unaligned
access. You'd be expecting field2 to be located at offset 2 bytes into the
structure, i.e. address 0x10002, but that address is not evenly divisible
by 4 (remember, we're reading a 4 byte value here).

Fortunately, the compiler understands the alignment constraints, so in the
above case it would insert 2 bytes of padding in between field1 and field2.
Therefore, for standard structure types you can always rely on the compiler
to pad structures so that accesses to fields are suitably aligned (assuming
you do not cast the field to a type of different length).
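
The padding can be observed with offsetof(). The following is a userspace
sketch, not kernel code: the u8/u16/u32 typedefs stand in for the kernel types,
and foo_field2_offset() is a hypothetical helper added for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's fixed-width types (assumption). */
typedef uint8_t  u8;
typedef uint16_t u16;
typedef uint32_t u32;

struct foo {
	u16 field1;
	u32 field2;
	u8 field3;
};

/* The compiler places field2 at an offset that satisfies u32's alignment. */
static_assert(offsetof(struct foo, field2) % _Alignof(u32) == 0,
	      "field2 is suitably aligned");

/* Hypothetical helper: offset of field2 in bytes, including any padding. */
static size_t foo_field2_offset(void)
{
	return offsetof(struct foo, field2);
}
```

On an ABI where u32 requires 4-byte alignment, foo_field2_offset() is expected
to return 4 rather than 2, reflecting the 2 bytes of padding described above.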

Similarly, you can also rely on the compiler to align variables and function
parameters to a naturally aligned scheme, based on the size of the type of
the variable.

At this point, it should be clear that accessing a single byte (u8 or char)
will never cause an unaligned access, because all memory addresses are evenly
divisible by one.

On a related topic, with the above considerations in mind you may observe
that you could reorder the fields in the structure in order to place fields
where padding would otherwise be inserted, and hence reduce the overall
resident memory size of structure instances. The optimal layout of the
above example is:

        struct foo {
                u32 field2;
                u16 field1;
                u8 field3;
        };

For a natural alignment scheme, the compiler would only have to add a single
byte of padding at the end of the structure. This padding is added in order
to satisfy alignment constraints for arrays of these structures.
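
The saving can be checked by comparing sizeof() for the two layouts. This is a
userspace sketch: the typedefs stand in for the kernel types, and the names
foo_original and foo_reordered are hypothetical, chosen only so that both
layouts can coexist in one file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's fixed-width types (assumption). */
typedef uint8_t  u8;
typedef uint16_t u16;
typedef uint32_t u32;

/* Layout from the first example: padding after field1 and after field3. */
struct foo_original {
	u16 field1;
	u32 field2;
	u8 field3;
};

/* Reordered layout: only tail padding is needed. */
struct foo_reordered {
	u32 field2;
	u16 field1;
	u8 field3;
};

/* Reordering never makes the structure larger. */
static_assert(sizeof(struct foo_reordered) <= sizeof(struct foo_original),
	      "reordered layout is no larger than the original");

/* Hypothetical helper: bytes saved per instance by reordering. */
static size_t foo_bytes_saved(void)
{
	return sizeof(struct foo_original) - sizeof(struct foo_reordered);
}
```

Under a natural alignment scheme the two sizes are expected to be 12 and 8
bytes respectively; other ABIs may pad differently.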

Another point worth mentioning is the use of __attribute__((packed)) on a
structure type. This GCC-specific attribute tells the compiler never to
insert any padding within structures, which is useful when you want to use a
C struct to represent some data that comes in a fixed arrangement 'off the
wire'.

You might be inclined to believe that usage of this attribute can easily
lead to unaligned accesses when accessing fields that do not satisfy
architectural alignment requirements. However, again, the compiler is aware
of the alignment constraints and will generate extra instructions to perform
the memory access in a way that does not cause unaligned access. Of course,
the extra instructions cause a loss in performance compared to the
non-packed case, so the packed attribute should only be used when avoiding
structure padding is of importance.
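
As an illustration, the following userspace sketch packs the earlier example
structure. It assumes a compiler that supports the GCC attribute syntax; the
typedefs stand in for the kernel types, and foo_packed and
foo_packed_roundtrip() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's fixed-width types (assumption). */
typedef uint8_t  u8;
typedef uint16_t u16;
typedef uint32_t u32;

/* Same fields as struct foo, but with padding suppressed. */
struct foo_packed {
	u16 field1;
	u32 field2;	/* at offset 2, which is not a multiple of 4 */
	u8 field3;
} __attribute__((packed));

static_assert(offsetof(struct foo_packed, field2) == 2,
	      "no padding is inserted before field2");

/*
 * Hypothetical helper: store and reload field2 through the struct member.
 * Because the access goes through the packed type, the compiler knows the
 * field may be misaligned and generates a suitable access sequence.
 */
static u32 foo_packed_roundtrip(u32 value)
{
	struct foo_packed p = { .field1 = 0, .field2 = 0, .field3 = 0 };

	p.field2 = value;
	return p.field2;
}
```

Note that this only covers accesses made through the packed structure type
itself, as in p.field2 above.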


Code that causes unaligned access
=================================

With the above in mind, let's move on to a real-life example of a function
that can cause an unaligned memory access. The following function adapted
 1"L41">from e inus="line" name="L106"> 106that you could reorder tigned-memor" name="L141"> 1"L41">from e inus="linereorder tigned-memor" name="L141"> 1"L4entation/unaligned-memory-access.txt#L137"*rion/unaligned"Docdmory-access.txt#LBmemory-access.txt#L137"*rion/una16e="L58">  58  90line" name="a>e="L58">  58 4dH2.igned access,
  43Why unaligned access is bad1="line" n1ame="L45">  45
  _d-mem_gned(-memoss.t*gned1, -memoss.t*gned203"> 103divisible by one.
  50 1- Som14field2;
cumentation/unaligned-memory-access.txt#L115" id="L115"che followlass="line" name="L51"> 1 51   tr}umentation/unaligned-memory-access.txt#L115" id="L115"rmance co1st.
 122structure type. This GCC-specific attribute telnaligned 1accesses
 124to represent some data that comes in a fixedhe unalig1ned access,
and="L32" 
 124to represent some data that comes in a fixedh"line" n1/unaligned-memory-access1.txt#155" idifygnedi
  88structure, i.e. address 0x10002, but that addre1entation/1unaligned-memory-access.1txt#L16" id=t that 03"> 103divisible by one.
 125
  d08" id="kemneltatiy-acne"  id=txt#L9oo="L13ref="tatio> 1"L41">from e inus="line" name="L106"> 106that youe one tha1t was requested,
calL99" cl  77  62It 1shoul1 be obs="l#L ida decgnedo classre not capablame="L5" id="L1228">  77 132non-packed case, so the packed attribute sho" id="L631" class="line" name="L631">  61me#L25" 23" ru/="ue" naatioccess.tximned-md-memnet nettatiss="lined-m="line" name="L43">  43Why unaligned access is bad1ectly on 1certain
  14from an address that is not evenly divisible by Ns.
  66
You line" namecss="line" nd-memory-access.txss.txt#L112" id="L112" class="line" name="L112"> 112 103divisible by one.
  16   u8 field3;
 1 74Fortunately thing1s are1not tomory-ne" name="ine" nd-memory-access.txcces"L15iocumentmightme="L100"ned-med-memory-access.txt#L139" id="L139" class="line" name="" id="L751" class="line" name="L751">  71enid="ed-memory-access.txt#L16" id="L16" class="s="line" name="L27">  27Natural alignment
Inosare abclass=2 mses,saligned malignene" namerun">  89tation/unaligned-memory-access.txt#L139" id="L139" class="line" name="o {
  79       17little vaggue, as memory access can" na1tion/u imeCasunaliine" name=="L6" id=gn variables and fud-memory-access.txt#L139" id="L139" class="line" name="o one tha10">  80             1   u31 field22. Pd-me"linr-acL10ic"L141">ssesine" name="Las any on2  81      1     17=======
  82       1 };
<1822"> 122structure type. This GCC-specific attribute tel>  831
  57   unaligned access to be corrected.
  15starting at address 0x10000. With a basic level of und1erstandin1g, it would

	 .txt"liclasrent
 118byte of padding at the end of the structure.ss is not1 evenly divisible
  64platforms and will cause performance problems on other1e).
  79       1s="line" 1name="L91">  91Fortu1natel18=======
 103divisible by one.
  97Sim1ilarl195ref="Documen}umentation/unaligned-memory-access.txt#L115" id="L115"xt#L98" i1d="L98" class="line" nam1e="L91">  98parameters to a naturally aligned scheme, based o1n the siz1e of the type of
n#L79" id="L79" class="line" name="L79">  79       1ry-access1.txt#L100" id="L100" cla1ss="l19  20ways. The context here is at the machine code level:: certain  instructions read
 103divisible by one.
  id="L106" class="line" 2ame="20naligned-memory-acces[...]cumentation/unaligned-memory-access.txt#L114" id="L112ure in or2er to place fields
  optimal l2yout of the
 124to represent some data that comes in a fixe2-memory-a2cess.txt#L110" id="L110"2class2"line"igned-me"line" wishe="Launal 9" class="line" cli"s28" cla ida27L141">n#L79" id="L79" class="line" name="L79">  79       2      str2ct foo {
        2       u32 field2;
 113      2     21/a>memory accesses to happen, your code will not work corr2" class="2ine" name="L114"> 114 1152     21ef="Doc" name=line"ss.txt#). Be="L1301
< id=e" name="L13emory-accline" n01" id="L101" class="line" name="L101"> 101At this 2e="L116">2116
 ne" nures raiseigned-m28">b 53 136Code that causes u2ess.txt#L218" id="L118" class="lin2" nam2="L118"> 118byte of padding at the end of the structure2 This pad2ing is added in order
 132non-packed case, so the packed attribute sh2f these s2ructures.
 136Code that causes u2e     str2" class="line" name="L122"> 122AnDuee="L61">a hr-#L109n/unalass="lineop="L107",nd-memory-access.txc" id=unal9   silently perform a different memory access to th2.g. movb,  movw, movl in x86
 122structure type. This GCC-specific attribute te2asy to spoot C statements
   26

"L39" clmemory-aloadmentettatiss=""L39" ca>

href=2Docume.txt"liry-1
  line" namon/uprop="ne" name="L39e hrefsi24"> 124to represent some data that comes in a fixe2 in perfo2mance compared to the
b 5expd-memdda274*n +22. On#L76"ed-mee="L58"> "> 124to represent some data that comes in a fixe2 sy to spoe used when avoiding
platforms and will cause performance problems on other2ocumentat2on/unaligned-memory-acce2s.txt2L134" gned-memor28">b 5es"L1-accnsra>
<"liiL13 to perfstid="L60" classloadm/a>   silently perform a different memory access to th2="L135"> 235
 132====================2=====2======4*n+2" r id="d-memnet ="liL130d-meior28">b 5an67" id= classl#L idtnaligned-memory-access.txt#L26" id="L26" class="line" nam38" id="L238" class="line" name="L238"> 23L18">  39" cdme="L9pinass=ine"  1"L4="L1">  89mory-piburiab. Becumentngis"Ldealing with types such as u16, u32 and u64.
With the above in mi2d, le2'sunneu64. ableiongned-memory-a
28">dned-memory-access.txd-memoryder28">b /a>dealing with types such as u16, u32 and u64.
  79       2The follo2ing function adapted
dealing with types such as u16, u32 and u64.
  45
   26
 101At this 2t#L47" id2="L47" class="line" name2="L472>  47to architecture. It would be easy to write a whole2 document2 on the differences

	2"> 132non-packed case, so the packed attribute sh2class="li2ne" name="L50">  50 2- Som24field2;

	"> 132non-packed case, so the packed attribute sh2che follo2lass="line" name="L51"> 2 51Cox, Avut39 Olr25", Heikki Orsila, J8">Engel="lie2"> 132non-packed case, so the packed attribute sh2rmance co2st.
 Ky-meMcMn/una, Ky-meMoriatt, Rlasy Dunlap, RobmentHane"ck, Uli Kunitz2"> 132non-packed case, so the packed attribute sh2rsy to spoaccesses
 132non-packed case, so the packed attribute sh2rcumentat2ned access,
  14from an address that is not evenly divisible by 2h"line" n2/unaligned-memory-access2.txt#255" id


unalirignarl;LXRm>YftL130hsinass=from http://Lour86forge.net/pdojunal/lxr">LXRmrforunity" idd-meior-accriddresl5es"s notsinfrom mailto:lxr@8lxr@8 lxr.8from http://www. cdpill-8Rcdpill L