linux/Documentation/mutex-design.txt
Generic Mutex Subsystem

started by Ingo Molnar <mingo@redhat.com>

  "Why on earth do we need a new mutex subsystem, and what's wrong
   with semaphores?"

firstly, there's nothing wrong with semaphores. But if the simpler
mutex semantics are sufficient for your code, then there are a couple
of advantages of mutexes:

 - 'struct mutex' is smaller on most architectures: e.g. on x86,
   'struct semaphore' is 20 bytes, 'struct mutex' is 16 bytes.
   A smaller structure size means less RAM footprint, and better
   CPU-cache utilization.

 - tighter code. On x86 I get the following .text sizes when
   switching all mutex-alike semaphores in the kernel to the mutex
   subsystem:

        text    data     bss     dec     hex filename
     3280380  868188  396860 4545428  455b94 vmlinux-semaphore
     3255329  865296  396732 4517357  44eded vmlinux-mutex

   that's 25051 bytes of code saved, or a 0.76% win - off the hottest
   codepaths of the kernel. (The .data savings are 2892 bytes, or 0.33%)
   Smaller code means better icache footprint, which is one of the
   major optimization goals in the Linux kernel currently.

 - the mutex subsystem is slightly faster and has better scalability for
   contended workloads. On an 8-way x86 system, running a mutex-based
   kernel and testing creat+unlink+close (of separate, per-task files)
   in /tmp with 16 parallel tasks, the average number of ops/sec is:

    Semaphores:                        Mutexes:

    $ ./test-mutex V 16 10             $ ./test-mutex V 16 10
    8 CPUs, running 16 tasks.          8 CPUs, running 16 tasks.
    checking VFS performance.          checking VFS performance.
    avg loops/sec:      34713          avg loops/sec:      84153
    CPU utilization:    63%            CPU utilization:    22%

   i.e. in this workload, the mutex based kernel was 2.4 times faster
   than the semaphore based kernel, _and_ it also had 2.8 times less CPU
   utilization. (In terms of 'ops per CPU cycle', the semaphore kernel
   performed 551 ops/sec per 1% of CPU time used, while the mutex kernel
   performed 3825 ops/sec per 1% of CPU time used - it was 6.9 times
   more efficient.)

   the scalability difference is visible even on a 2-way P4 HT box:

    Semaphores:                        Mutexes:

    $ ./test-mutex V 16 10             $ ./test-mutex V 16 10
    4 CPUs, running 16 tasks.          4 CPUs, running 16 tasks.
    checking VFS performance.          checking VFS performance.
    avg loops/sec:      127659         avg loops/sec:      181082
    CPU utilization:    100%           CPU utilization:    34%

   (the straight performance advantage of mutexes is 41%, the per-cycle
    efficiency of mutexes is 4.1 times better.)
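
The per-cycle efficiency figures quoted above follow from dividing the
measured loops/sec by the CPU utilization percentage. A minimal sketch of
that arithmetic is given here; the helper names ops_per_cpu_percent() and
efficiency_ratio() are made up for this illustration and are not part of
test-mutex or of the kernel.

```c
#include <assert.h>

/* Hypothetical helper: loops/sec achieved per 1% of CPU time used. */
static double ops_per_cpu_percent(double loops_per_sec, double cpu_util_percent)
{
    assert(cpu_util_percent > 0.0);   /* guard against division by zero */
    return loops_per_sec / cpu_util_percent;
}

/* Hypothetical helper: how many times more efficient the mutex run was. */
static double efficiency_ratio(double sem_loops, double sem_util,
                               double mtx_loops, double mtx_util)
{
    return ops_per_cpu_percent(mtx_loops, mtx_util) /
           ops_per_cpu_percent(sem_loops, sem_util);
}
```

With the 8-way numbers, efficiency_ratio(34713, 63, 84153, 22) comes out
a little above 6.9; with the P4 HT numbers,
efficiency_ratio(127659, 100, 181082, 34) comes out a little above 4.1.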

 - there are no fastpath tradeoffs, the mutex fastpath is just as tight
   as the semaphore fastpath. On x86, the locking fastpath is 2
   instructions:

    c0377ccb <mutex_lock>:
    c0377ccb:       f0 ff 08                lock decl (%eax)
    c0377cce:       78 0e                   js     c0377cde <.text..lock.mutex>
    c0377cd0:       c3                      ret

   the unlocking fastpath is equally tight:

    c0377cd1 <mutex_unlock>:
    c0377cd1:       f0 ff 00                lock incl (%eax)
    c0377cd4:       7e 0f                   jle    c0377ce5 <.text..lock.mutex+0x7>
    c0377cd6:       c3                      ret
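
The two listings above can be read as an atomic decrement followed by a
sign test, and an atomic increment followed by a "less than or equal to
zero" test. The following is a userspace sketch of that idea using C11
atomics, not the kernel implementation. It assumes the counter convention
that the listings suggest (1: unlocked, 0: locked, negative: locked with
possible waiters); struct sketch_mutex and the sketch_*() functions are
hypothetical names, and the slowpath is only reported, not implemented.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of the lock word; not the kernel's 'struct mutex'. */
struct sketch_mutex {
    atomic_int count;   /* assumed: 1 unlocked, 0 locked, <0 contended */
};

static void sketch_mutex_init(struct sketch_mutex *m)
{
    atomic_init(&m->count, 1);
}

/*
 * Mirrors 'lock decl; js slowpath': atomically decrement and take the
 * slowpath if the result is negative. Returns true when the fastpath
 * alone acquired the lock.
 */
static bool sketch_lock_fastpath(struct sketch_mutex *m)
{
    int newval = atomic_fetch_sub(&m->count, 1) - 1;
    return newval >= 0;
}

/*
 * Mirrors 'lock incl; jle slowpath': atomically increment and take the
 * slowpath if the result is still <= 0 (a waiter may need waking).
 * Returns true when the fastpath alone released the lock.
 */
static bool sketch_unlock_fastpath(struct sketch_mutex *m)
{
    int newval = atomic_fetch_add(&m->count, 1) + 1;
    assert(newval <= 1);   /* more unlocks than locks would be a bug */
    return newval > 0;
}
```

In the uncontended case both functions return true after a single atomic
operation and a single test, which is the property the disassembly shows.
A false return marks the point where real code would branch to the
out-of-line slowpath.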

 - 'struct mutex' semantics are well-defined and are enforced if
   CONFIG_DEBUG_MUTEXES is turned on. Semaphores on the other hand have
   virtually no debugging code or instrumentation. The mutex subsystem
   checks and enforces the following rules:

   * - only one task can hold the mutex at a time
   * - only the owner can unlock the mutex
   * - multiple unlocks are not permitted
   * - recursive locking is not permitted
   * - a mutex object must be initialized via the API
   * - a mutex object must not be initialized via memset or copying
   * - task may not exit with mutex held
   * - memory areas where held locks reside must not be freed
   * - held mutexes must not be reinitialized
   * - mutexes may not be used in hardware or software interrupt
   *   contexts such as tasklets and timers

   furthermore, there are also convenience features in the debugging
   code:

   * - uses symbolic names of mutexes, whenever they are printed in debug output
   * - point-of-acquire tracking, symbolic lookup of function names
   * - list of all locks held in the system, printout of them
   * - owner tracking
   * - detects self-recursing locks and prints out all relevant info
   * - detects multi-task circular deadlocks and prints out all affected
   *   locks and tasks (and only those tasks)
