linux/Documentation/RCU/checklist.txt
Review Checklist for RCU Patches


This document contains a checklist for producing and reviewing patches
that make use of RCU.  Violating any of the rules listed below will
result in the same sorts of problems that leaving out a locking primitive
would cause.  This list is based on experiences reviewing such patches
over a rather long period of time, but improvements are always welcome!

0.      Is RCU being applied to a read-mostly situation?  If the data
        structure is updated more than about 10% of the time, then you
        should strongly consider some other approach, unless detailed
        performance measurements show that RCU is nonetheless the right
        tool for the job.  Yes, RCU does reduce read-side overhead by
        increasing write-side overhead, which is exactly why normal uses
        of RCU will do much more reading than updating.

        Another exception is where performance is not an issue, and RCU
        provides a simpler implementation.  An example of this situation
        is the dynamic NMI code in the Linux 2.6 kernel, at least on
        architectures where NMIs are rare.

        Yet another exception is where the low real-time latency of RCU's
        read-side primitives is critically important.

1.      Does the update code have proper mutual exclusion?

        RCU does allow -readers- to run (almost) naked, but -writers- must
        still use some sort of mutual exclusion, such as:

        a.      locking,
        b.      atomic operations, or
        c.      restricting updates to a single task.

        If you choose #b, be prepared to describe how you have handled
        memory barriers on weakly ordered machines (pretty much all of
        them -- even x86 allows later loads to be reordered to precede
        earlier stores), and be prepared to explain why this added
        complexity is worthwhile.  If you choose #c, be prepared to
        explain how this single task does not become a major bottleneck on
        big multiprocessor machines (for example, if the task is updating
        information relating to itself that other tasks can read, there
        by definition can be no bottleneck).

2.      Do the RCU read-side critical sections make proper use of
        rcu_read_lock() and friends?  These primitives are needed
        to prevent grace periods from ending prematurely, which
        could result in data being unceremoniously freed out from
        under your read-side code, which can greatly increase the
        actuarial risk of your kernel.

        As a rough rule of thumb, any dereference of an RCU-protected
        pointer must be covered by rcu_read_lock(), rcu_read_lock_bh(),
        rcu_read_lock_sched(), or by the appropriate update-side lock.
        Disabling of preemption can serve as rcu_read_lock_sched(), but
        is less readable.

3.      Does the update code tolerate concurrent accesses?

        The whole point of RCU is to permit readers to run without
        any locks or atomic operations.  This means that readers will
        be running while updates are in progress.  There are a number
        of ways to handle this concurrency, depending on the situation:

        a.      Use the RCU variants of the list and hlist update
                primitives to add, remove, and replace elements on
                an RCU-protected list.  Alternatively, use the other
                RCU-protected data structures that have been added to
                the Linux kernel.

                This is almost always the best approach.

        b.      Proceed as in (a) above, but also maintain per-element
                locks (that are acquired by both readers and writers)
                that guard per-element state.  Of course, fields that
                the readers refrain from accessing can be guarded by
                some other lock acquired only by updaters, if desired.

                This works quite well, also.

        c.      Make updates appear atomic to readers.  For example,
                pointer updates to properly aligned fields will
                appear atomic, as will individual atomic primitives.
                Sequences of operations performed under a lock will -not-
                appear to be atomic to RCU readers, nor will sequences
                of multiple atomic primitives.

                This can work, but is starting to get a bit tricky.

        d.      Carefully order the updates and the reads so that
                readers see valid data at all phases of the update.
                This is often more difficult than it sounds, especially
                given modern CPUs' tendency to reorder memory references.
                One must usually liberally sprinkle memory barriers
                (smp_wmb(), smp_rmb(), smp_mb()) through the code,
                making it difficult to understand and to test.

                It is usually better to group the changing data into
                a separate structure, so that the change may be made
                to appear atomic by updating a pointer to reference
                a new structure containing updated values.
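
        The last suggestion in (3d) can be sketched in user-space C11 as
        shown here.  This is only an illustration: struct cfg, cur_cfg,
        cfg_budget_ms(), and cfg_replace() are hypothetical names, and
        the C11 atomics merely stand in for rcu_dereference() and
        rcu_assign_pointer().  The grace-period wait before freeing the
        old version is deliberately left to the caller.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical configuration whose two fields must change together. */
struct cfg {
	int timeout_ms;
	int retries;
};

/* Stand-in for an RCU-protected pointer. */
static _Atomic(struct cfg *) cur_cfg;

/*
 * Reader: a single pointer load yields a snapshot whose fields belong
 * together.  Kernel code would use rcu_dereference() under
 * rcu_read_lock() instead of the acquire load.
 */
static int cfg_budget_ms(void)
{
	struct cfg *c = atomic_load_explicit(&cur_cfg, memory_order_acquire);

	return c ? c->timeout_ms * c->retries : -1;
}

/*
 * Updater: build the new version off to the side, then publish it with
 * one pointer store.  The old version is handed back through oldp; in
 * the kernel it could be freed only after a grace period has elapsed.
 * Returns 0 on success, -1 if allocation fails.
 */
static int cfg_replace(int timeout_ms, int retries, struct cfg **oldp)
{
	struct cfg *n = malloc(sizeof(*n));

	if (!n)
		return -1;
	n->timeout_ms = timeout_ms;
	n->retries = retries;
	*oldp = atomic_exchange_explicit(&cur_cfg, n, memory_order_acq_rel);
	return 0;
}
```

        Because readers never see a half-updated struct cfg, no memory
        barriers need to be sprinkled through the reader's use of the
        individual fields.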

4.      Weakly ordered CPUs pose special challenges.  Almost all CPUs
        are weakly ordered -- even x86 CPUs allow later loads to be
        reordered to precede earlier stores.  RCU code must take all of
        the following measures to prevent memory-corruption problems:

        a.      Readers must maintain proper ordering of their memory
                accesses.  The rcu_dereference() primitive ensures that
                the CPU picks up the pointer before it picks up the data
                that the pointer points to.  This really is necessary
                on Alpha CPUs.  If you don't believe me, see:

                        http://www.openvms.compaq.com/wizard/wiz_2637.html

                The rcu_dereference() primitive is also an excellent
                documentation aid, letting the person reading the code
                know exactly which pointers are protected by RCU.
                Please note that compilers can also reorder code, and
                they are becoming increasingly aggressive about doing
                just that.  The rcu_dereference() primitive therefore
                also prevents destructive compiler optimizations.

                The rcu_dereference() primitive is used by the
                various "_rcu()" list-traversal primitives, such
                as the list_for_each_entry_rcu().  Note that it is
                perfectly legal (if redundant) for update-side code to
                use rcu_dereference() and the "_rcu()" list-traversal
                primitives.  This is particularly useful in code that
                is common to readers and updaters.  However, lockdep
                will complain if you use rcu_dereference() outside
                of an RCU read-side critical section.  See lockdep.txt
                to learn what to do about this.

                Of course, neither rcu_dereference() nor the "_rcu()"
                list-traversal primitives can substitute for a good
                concurrency design coordinating among multiple updaters.

        b.      If the list macros are being used, the list_add_tail_rcu()
                and list_add_rcu() primitives must be used in order
                to prevent weakly ordered machines from misordering
                structure initialization and pointer planting.
                Similarly, if the hlist macros are being used, the
                hlist_add_head_rcu() primitive is required.

        c.      If the list macros are being used, the list_del_rcu()
                primitive must be used to keep list_del()'s pointer
                poisoning from inflicting toxic effects on concurrent
                readers.  Similarly, if the hlist macros are being used,
                the hlist_del_rcu() primitive is required.

                The list_replace_rcu() and hlist_replace_rcu() primitives
                may be used to replace an old structure with a new one
                in their respective types of RCU-protected lists.

        d.      Rules similar to (4b) and (4c) apply to the "hlist_nulls"
                type of RCU-protected linked lists.

        e.      Updates must ensure that initialization of a given
                structure happens before pointers to that structure are
                publicized.  Use the rcu_assign_pointer() primitive
                when publicizing a pointer to a structure that can
                be traversed by an RCU read-side critical section.
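
        The ordering that (4b) and (4e) require can be modeled in
        user-space C11 roughly as follows.  This is a sketch, not the
        kernel's list implementation: struct node, head, node_add_head(),
        and node_present() are made-up names, the release store plays
        the role of rcu_assign_pointer(), and the acquire loads play the
        role of rcu_dereference().  A single updater (or an update-side
        lock held by the caller) is assumed.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical list element; readers follow ->next. */
struct node {
	int key;
	_Atomic(struct node *) next;
};

static _Atomic(struct node *) head;

/*
 * Updater: initialize every field first, then publish with a release
 * store so that the initialization cannot be reordered to follow the
 * store that makes the element reachable.
 */
static void node_add_head(struct node *n, int key)
{
	n->key = key;
	atomic_store_explicit(&n->next,
			      atomic_load_explicit(&head, memory_order_relaxed),
			      memory_order_relaxed);
	atomic_store_explicit(&head, n, memory_order_release);
}

/* Reader: each pointer is picked up before the data it points to. */
static int node_present(int key)
{
	struct node *n;

	for (n = atomic_load_explicit(&head, memory_order_acquire); n;
	     n = atomic_load_explicit(&n->next, memory_order_acquire))
		if (n->key == key)
			return 1;
	return 0;
}
```

        Swapping the two stores in node_add_head() for plain assignments
        would permit a reader to reach a node whose key has not yet been
        written, which is exactly the misordering that list_add_rcu()
        is there to prevent.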

5.      If call_rcu(), or a related primitive such as call_rcu_bh(),
        call_rcu_sched(), or call_srcu() is used, the callback function
        must be written to be called from softirq context.  In particular,
        it cannot block.

6.      Since synchronize_rcu() can block, it cannot be called from
        any sort of irq context.  The same rule applies for
        synchronize_rcu_bh(), synchronize_sched(), synchronize_srcu(),
        synchronize_rcu_expedited(), synchronize_rcu_bh_expedited(),
        synchronize_sched_expedited(), and synchronize_srcu_expedited().

        The expedited forms of these primitives have the same semantics
        as the non-expedited forms, but expediting is both expensive
        and unfriendly to real-time workloads.  Use of the expedited
        primitives should be restricted to rare configuration-change
        operations that would not normally be undertaken while a real-time
        workload is running.

        In particular, if you find yourself invoking one of the expedited
        primitives repeatedly in a loop, please do everyone a favor:
        Restructure your code so that it batches the updates, allowing
        a single non-expedited primitive to cover the entire batch.
        This will very likely be faster than the loop containing the
        expedited primitive, and will be much, much easier on the rest
        of the system, especially on real-time workloads running on
        the rest of the system.

        In addition, it is illegal to call the expedited forms from
        a CPU-hotplug notifier, or while holding a lock that is acquired
        by a CPU-hotplug notifier.  Failing to observe this restriction
        will result in deadlock.

7.      If the updater uses call_rcu() or synchronize_rcu(), then the
        corresponding readers must use rcu_read_lock() and
        rcu_read_unlock().  If the updater uses call_rcu_bh() or
        synchronize_rcu_bh(), then the corresponding readers must
        use rcu_read_lock_bh() and rcu_read_unlock_bh().  If the
        updater uses call_rcu_sched() or synchronize_sched(), then
        the corresponding readers must disable preemption, possibly
        by calling rcu_read_lock_sched() and rcu_read_unlock_sched().
        If the updater uses synchronize_srcu() or call_srcu(),
        then the corresponding readers must use srcu_read_lock() and
        srcu_read_unlock(), and with the same srcu_struct.  The rules for
        the expedited primitives are the same as for their non-expedited
        counterparts.  Mixing things up will result in confusion and
        broken kernels.

        One exception to this rule: rcu_read_lock() and rcu_read_unlock()
        may be substituted for rcu_read_lock_bh() and rcu_read_unlock_bh()
        in cases where local bottom halves are already known to be
        disabled, for example, in irq or softirq context.  Commenting
        such cases is a must, of course!  And the jury is still out on
        whether the increased speed is worth it.

8.      Although synchronize_rcu() is slower than is call_rcu(), it
        usually results in simpler code.  So, unless update performance
        is critically important or the updaters cannot block,
        synchronize_rcu() should be used in preference to call_rcu().

        An especially important property of the synchronize_rcu()
        primitive is that it automatically self-limits: if grace periods
        are delayed for whatever reason, then the synchronize_rcu()
        primitive will correspondingly delay updates.  In contrast,
        code using call_rcu() should explicitly limit update rate in
        cases where grace periods are delayed, as failing to do so can
        result in excessive realtime latencies or even OOM conditions.

        Ways of gaining this self-limiting property when using call_rcu()
        include:

        a.      Keeping a count of the number of data-structure elements
                used by the RCU-protected data structure, including
                those waiting for a grace period to elapse.  Enforce a
                limit on this number, stalling updates as needed to allow
                previously deferred frees to complete.  Alternatively,
                limit only the number awaiting deferred free rather than
                the total number of elements.

                One way to stall the updates is to acquire the update-side
                mutex.  (Don't try this with a spinlock -- other CPUs
                spinning on the lock could prevent the grace period
                from ever ending.)  Another way to stall the updates
                is for the updates to use a wrapper function around
                the memory allocator, so that this wrapper function
                simulates OOM when there is too much memory awaiting an
                RCU grace period.  There are of course many other
                variations on this theme.

        b.      Limiting update rate.  For example, if updates occur only
                once per hour, then no explicit rate limiting is required,
                unless your system is already badly broken.  The dcache
                subsystem takes this approach -- updates are guarded
                by a global lock, limiting their rate.

        c.      Trusted update -- if updates can only be done manually by
                superuser or some other trusted user, then it might not
                be necessary to automatically limit them.  The theory
                here is that superuser already has lots of ways to crash
                the machine.

        d.      Use call_rcu_bh() rather than call_rcu(), in order to take
                advantage of call_rcu_bh()'s faster grace periods.

        e.      Periodically invoke synchronize_rcu(), permitting a limited
                number of updates per grace period.

        The same cautions apply to call_rcu_bh() and call_rcu_sched().
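
        One way of limiting the number of elements awaiting a grace
        period can be modeled with a simple counter, as in the
        single-threaded user-space sketch below.  Everything here is an
        assumption made for illustration: MAX_PENDING, nr_pending,
        nr_stalls, grace_period_end(), and retire_element() are invented
        names, and grace_period_end() only pretends that a grace period
        has elapsed.  Kernel code would instead block the updater, for
        example in synchronize_rcu().

```c
#include <stddef.h>

#define MAX_PENDING 3	/* hypothetical cap on elements awaiting a grace period */

static size_t nr_pending;	/* elements handed to the deferred-free path */
static size_t nr_stalls;	/* times an updater had to wait instead */

/* Stand-in for the end of a grace period: deferred frees have all run. */
static void grace_period_end(void)
{
	nr_pending = 0;
}

/*
 * Called by an updater for each element it removes.  Once the cap is
 * reached, the updater stalls until the backlog has drained, so the
 * memory tied up in not-yet-freed elements stays bounded.
 */
static void retire_element(void)
{
	if (nr_pending >= MAX_PENDING) {
		nr_stalls++;
		grace_period_end();
	}
	nr_pending++;
}
```

        With MAX_PENDING set to 3, seven back-to-back calls to
        retire_element() stall twice and leave one element pending.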

9.      All RCU list-traversal primitives, which include
        rcu_dereference(), list_for_each_entry_rcu(), and
        list_for_each_safe_rcu(), must be either within an RCU read-side
        critical section or must be protected by appropriate update-side
        locks.  RCU read-side critical sections are delimited by
        rcu_read_lock() and rcu_read_unlock(), or by similar primitives
        such as rcu_read_lock_bh() and rcu_read_unlock_bh(), in which
        case the matching rcu_dereference() primitive must be used in
        order to keep lockdep happy, in this case, rcu_dereference_bh().

        The reason that it is permissible to use RCU list-traversal
        primitives when the update-side lock is held is that doing so
        can be quite helpful in reducing code bloat when common code is
        shared between readers and updaters.  Additional primitives
        are provided for this case, as discussed in lockdep.txt.

10.     Conversely, if you are in an RCU read-side critical section,
        and you don't hold the appropriate update-side lock, you -must-
        use the "_rcu()" variants of the list macros.  Failing to do so
        will break Alpha, cause aggressive compilers to generate bad code,
        and confuse people trying to read your code.

11.     Note that synchronize_rcu() -only- guarantees to wait until
        all currently executing rcu_read_lock()-protected RCU read-side
        critical sections complete.  It does -not- necessarily guarantee
        that all currently running interrupts, NMIs, preempt_disable()
        code, or idle loops will complete.  Therefore, if you do not have
        rcu_read_lock()-protected read-side critical sections, do -not-
        use synchronize_rcu().

        Similarly, disabling preemption is not an acceptable substitute
        for rcu_read_lock().  Code that attempts to use preemption
        disabling where it should be using rcu_read_lock() will break
        in real-time kernel builds.

        If you want to wait for interrupt handlers, NMI handlers, and
        code under the influence of preempt_disable(), you instead
        need to use synchronize_irq() or synchronize_sched().

        This same limitation also applies to synchronize_rcu_bh()
        and synchronize_srcu(), as well as to the asynchronous and
        expedited forms of the three primitives, namely call_rcu(),
        call_rcu_bh(), call_srcu(), synchronize_rcu_expedited(),
        synchronize_rcu_bh_expedited(), and synchronize_srcu_expedited().

12.     Any lock acquired by an RCU callback must be acquired elsewhere
        with softirq disabled, e.g., via spin_lock_irqsave(),
        spin_lock_bh(), etc.  Failing to disable irq on a given
        acquisition of that lock will result in deadlock as soon as
        the RCU softirq handler happens to run your RCU callback while
        interrupting that acquisition's critical section.

13.     RCU callbacks can be and are executed in parallel.  In many cases,
        the callback code is simply a wrapper around kfree(), so that this
        is not an issue (or, more accurately, to the extent that it is
        an issue, the memory-allocator locking handles it).  However,
        if the callbacks do manipulate a shared data structure, they
        must use whatever locking or other synchronization is required
        to safely access and/or modify that data structure.

        RCU callbacks are -usually- executed on the same CPU that executed
        the corresponding call_rcu(), call_rcu_bh(), or call_rcu_sched(),
        but are by -no- means guaranteed to be.  For example, if a given
        CPU goes offline while having an RCU callback pending, then that
        RCU callback will execute on some surviving CPU.  (If this was
        not the case, a self-spawning RCU callback would prevent the
        victim CPU from ever going offline.)

14.     SRCU (srcu_read_lock(), srcu_read_unlock(), srcu_dereference(),
        synchronize_srcu(), synchronize_srcu_expedited(), and call_srcu())
        may only be invoked from process context.  Unlike other forms of
        RCU, it -is- permissible to block in an SRCU read-side critical
        section (demarked by srcu_read_lock() and srcu_read_unlock()),
        hence the "SRCU": "sleepable RCU".  Please note that if you
        don't need to sleep in read-side critical sections, you should be
        using RCU rather than SRCU, because RCU is almost always faster
        and easier to use than is SRCU.

        If you need to enter your read-side critical section in a
        hardirq or exception handler, and then exit that same read-side
        critical section in the task that was interrupted, then you need
        to use srcu_read_lock_raw() and srcu_read_unlock_raw(), which avoid
        the lockdep checking that would otherwise make this practice illegal.

        Also unlike other forms of RCU, explicit initialization
        and cleanup is required via init_srcu_struct() and
        cleanup_srcu_struct().  These are passed a "struct srcu_struct"
        that defines the scope of a given SRCU domain.  Once initialized,
        the srcu_struct is passed to srcu_read_lock(), srcu_read_unlock(),
        synchronize_srcu(), synchronize_srcu_expedited(), and call_srcu().
        A given synchronize_srcu() waits only for SRCU read-side critical
        sections governed by srcu_read_lock() and srcu_read_unlock()
        calls that have been passed the same srcu_struct.  This property
        is what makes sleeping read-side critical sections tolerable --
        a given subsystem delays only its own updates, not those of other
        subsystems using SRCU.  Therefore, SRCU is less prone to OOM the
        system than RCU would be if RCU's read-side critical sections
        were permitted to sleep.

        The ability to sleep in read-side critical sections does not
        come for free.  First, corresponding srcu_read_lock() and
        srcu_read_unlock() calls must be passed the same srcu_struct.
        Second, grace-period-detection overhead is amortized only
        over those updates sharing a given srcu_struct, rather than
        being globally amortized as they are for other forms of RCU.
        Therefore, SRCU should be used in preference to rw_semaphore
        only in extremely read-intensive situations, or in situations
        requiring SRCU's read-side deadlock immunity or low read-side
        realtime latency.

        Note that rcu_assign_pointer() relates to SRCU just as it does
        to other forms of RCU.

15.     The whole point of call_rcu(), synchronize_rcu(), and friends
        is to wait until all pre-existing readers have finished before
        carrying out some otherwise-destructive operation.  It is
        therefore critically important to -first- remove any path
        that readers can follow that could be affected by the
        destructive operation, and -only- -then- invoke call_rcu(),
        synchronize_rcu(), or friends.

        Because these primitives only wait for pre-existing readers, it
        is the caller's responsibility to guarantee that any subsequent
        readers will execute safely.
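
        The remove-then-wait-then-destroy ordering can be illustrated
        with a toy user-space model.  Please treat every name here as
        hypothetical: struct obj, gptr, nr_readers, obj_publish(),
        reader_peek(), and obj_retire() exist only for this sketch, and
        the shared reader counter is a crude stand-in for RCU's
        grace-period machinery, not a description of how RCU works.

```c
#include <stdatomic.h>
#include <stdlib.h>

struct obj {
	int val;
};

static _Atomic(struct obj *) gptr;	/* the only path readers can follow */
static atomic_int nr_readers;		/* toy stand-in for reader tracking */

/* Assumes nothing is currently published.  Returns 0, or -1 on OOM. */
static int obj_publish(int val)
{
	struct obj *p = malloc(sizeof(*p));

	if (!p)
		return -1;
	p->val = val;
	atomic_store(&gptr, p);
	return 0;
}

static int reader_peek(void)
{
	struct obj *p;
	int v;

	atomic_fetch_add(&nr_readers, 1);	/* think rcu_read_lock() */
	p = atomic_load(&gptr);
	v = p ? p->val : -1;
	atomic_fetch_sub(&nr_readers, 1);	/* think rcu_read_unlock() */
	return v;
}

static void obj_retire(void)
{
	/* First: remove the path, so that no new reader can find the object. */
	struct obj *old = atomic_exchange(&gptr, NULL);

	/* Then: wait for pre-existing readers (think synchronize_rcu()). */
	while (atomic_load(&nr_readers) != 0)
		;

	/* Only now is the destructive operation safe. */
	free(old);
}
```

        Freeing before the exchange, or before the wait, would allow a
        reader that already holds the pointer to touch freed memory.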

16.     The various RCU read-side primitives do -not- necessarily contain
        memory barriers.  You should therefore plan for the CPU
        and the compiler to freely reorder code into and out of RCU
        read-side critical sections.  It is the responsibility of the
        RCU update-side primitives to deal with this.

17.     Use CONFIG_PROVE_RCU, CONFIG_DEBUG_OBJECTS_RCU_HEAD, and the
        __rcu sparse checks to validate your RCU code.  These can help
        find problems as follows:

        CONFIG_PROVE_RCU: check that accesses to RCU-protected data
                structures are carried out under the proper RCU
                read-side critical section, while holding the right
                combination of locks, or whatever other conditions
                are appropriate.

        CONFIG_DEBUG_OBJECTS_RCU_HEAD: check that you don't pass the
                same object to call_rcu() (or friends) before an RCU
                grace period has elapsed since the last time that you
                passed that same object to call_rcu() (or friends).

        __rcu sparse checks: tag the pointer to the RCU-protected data
                structure with __rcu, and sparse will warn you if you
                access that pointer without the services of one of the
                variants of rcu_dereference().

        These debugging aids can help you find problems that are
        otherwise extremely difficult to spot.