                Notes on Analysing Behaviour Using Events and Tracepoints

                        Documentation written by Mel Gorman
                PCL information heavily based on email from Ingo Molnar

1. Introduction
===============

Tracepoints (see Documentation/trace/tracepoints.txt) can be used without
creating custom kernel modules to register probe functions using the event
tracing infrastructure.

Simplistically, tracepoints represent important events that can be
taken in conjunction with other tracepoints to build a "Big Picture" of
what is going on within the system. There are a large number of methods for
gathering and interpreting these events. Lacking any current Best Practices,
this document describes some of the methods that can be used.

This document assumes that debugfs is mounted on /sys/kernel/debug and that
the appropriate tracing options have been configured into the kernel. It is
assumed that the PCL tool tools/perf has been installed and is in your path.

2. Listing Available Events
===========================

2.1 Standard Utilities
----------------------

All possible events are visible from /sys/kernel/debug/tracing/events. Simply
calling

  $ find /sys/kernel/debug/tracing/events -type d

will give a fair indication of the number of events available.

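The directory listing above can be condensed into a per-subsystem count.
The following is a minimal sketch rather than an established tool: the
function name count_events is hypothetical, and it assumes that each event
appears as <events>/<subsystem>/<event>/enable and that find supports
-mindepth and -maxdepth (GNU find does).

```shell
# count_events [events-dir]
# Print "<subsystem> <number of events>" for each subsystem, sorted by name.
# Assumption: every event is a directory containing an "enable" file at
# <events-dir>/<subsystem>/<event>/enable.
count_events() {
    events_root="${1:-/sys/kernel/debug/tracing/events}"
    # Depth 3 below the root selects per-event "enable" files only and
    # skips the global and per-subsystem "enable" files.
    find "$events_root" -mindepth 3 -maxdepth 3 -name enable |
        awk -F/ '{ count[$(NF-2)]++ }
                 END { for (s in count) printf "%s %d\n", s, count[s] }' |
        sort
}
```

Running count_events with no argument reads the default debugfs location,
which normally requires root.
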
2.2 PCL (Performance Counters for Linux)
----------------------------------------

Discovery and enumeration of all counters and events, including tracepoints,
is available with the perf tool. Getting a list of available events is a
simple case of:

  $ perf list 2>&1 | grep Tracepoint
  ext4:ext4_free_inode                     [Tracepoint event]
  ext4:ext4_request_inode                  [Tracepoint event]
  ext4:ext4_allocate_inode                 [Tracepoint event]
  ext4:ext4_write_begin                    [Tracepoint event]
  ext4:ext4_ordered_write_end              [Tracepoint event]
  [ .... remaining output snipped .... ]


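The perf list output can also be summarised per subsystem. This is only a
sketch: count_tracepoints is a hypothetical helper, and it assumes the
"subsystem:event ... [Tracepoint event]" line layout shown above.

```shell
# count_tracepoints
# Read "perf list" output on stdin and print "<subsystem> <count>" for
# every line tagged "[Tracepoint event]", sorted by subsystem name.
count_tracepoints() {
    awk '/\[Tracepoint event\]/ {
             split($1, parts, ":")    # "ext4:ext4_free_inode" -> "ext4"
             count[parts[1]]++
         }
         END { for (s in count) printf "%s %d\n", s, count[s] }' |
        sort
}

# Usage, assuming perf is in your path:
#   perf list 2>&1 | count_tracepoints
```
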
3. Enabling Events
==================

3.1 System-Wide Event Enabling
------------------------------

See Documentation/trace/events.txt for a proper description of how events
can be enabled system-wide. A short example of enabling all events related
to page allocation would look something like:

  $ for i in $(find /sys/kernel/debug/tracing/events -name "enable" | grep mm_); do echo 1 > "$i"; done

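Events left enabled keep adding overhead, so it helps to have a matching
way to switch them off again. The sketch below wraps the loop above in a
function; set_events is a hypothetical name, and the optional third
argument exists only so the function can be tried against a scratch
directory instead of debugfs.

```shell
# set_events <pattern> <0|1> [events-dir]
# Write 0 or 1 to every "enable" file whose path matches <pattern>.
# As with the loop it is based on, the pattern is matched against the
# whole path, not just the event name.
set_events() {
    pattern="$1"
    value="$2"
    events_root="${3:-/sys/kernel/debug/tracing/events}"
    find "$events_root" -name enable | grep -- "$pattern" |
    while IFS= read -r f; do
        echo "$value" > "$f"
    done
}

# Usage (as root):
#   set_events mm_ 1    # enable the page allocation events
#   set_events mm_ 0    # disable them again when finished
```
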
3.2 System-Wide Event Enabling with SystemTap
---------------------------------------------

In SystemTap, tracepoints are accessible using the kernel.trace() function
call. The following is an example that reports every 5 seconds which
processes were allocating pages.

  # Number of page allocations seen, indexed by process name
  global page_allocs

  probe kernel.trace("mm_page_alloc") {
        page_allocs[execname()]++
  }

  function print_count() {
        printf ("%-25s %-s\n", "#Pages Allocated", "Process Name")
        # The trailing '-' iterates in descending order of count
        foreach (proc in page_allocs-)
                printf("%-25d %s\n", page_allocs[proc], proc)
        printf ("\n")
        delete page_allocs
  }

  probe timer.s(5) {
        print_count()
  }

3.3 System-Wide Event Enabling with PCL
---------------------------------------

By specifying the -a switch and analysing sleep, the system-wide events
for a duration of time can be examined.

 $ perf stat -a \
        -e kmem:mm_page_alloc -e kmem:mm_page_free \
        -e kmem:mm_page_free_batched \
        sleep 10
 Performance counter stats for 'sleep 10':

           9630  kmem:mm_page_alloc
           2143  kmem:mm_page_free
           7424  kmem:mm_page_free_batched

   10.002577764  seconds time elapsed

Similarly, one could execute a shell and exit it as desired to get a report
at that point.

3.4 Local Event Enabling
------------------------

Documentation/trace/ftrace.txt describes how to enable events on a per-thread
basis using set_ftrace_pid.

3.5 Local Event Enablement with PCL
-----------------------------------

Events can be activated and tracked for the duration of a process on a local
basis using PCL as follows.

  $ perf stat -e kmem:mm_page_alloc -e kmem:mm_page_free \
                 -e kmem:mm_page_free_batched ./hackbench 10
  Time: 0.909

    Performance counter stats for './hackbench 10':

          17803  kmem:mm_page_alloc
          12398  kmem:mm_page_free
           4827  kmem:mm_page_free_batched

    0.973913387  seconds time elapsed

4. Event Filtering
==================

Documentation/trace/ftrace.txt covers in depth how to filter events in
ftrace. Using grep and awk on trace_pipe is also an option, as is any
script that reads trace_pipe.

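As an illustration of the awk approach, the sketch below counts how often
each event appears in trace output read from stdin. The function name
count_trace_events is hypothetical. It assumes each event line contains
"<timestamp>: <event>:", as in the usual human-readable ftrace output;
lines that do not match, such as header comments, are ignored.

```shell
# count_trace_events
# Read ftrace text output on stdin and print "<count> <event>" lines,
# most frequent event first.
count_trace_events() {
    awk '{
        # Locate "<seconds>.<fraction>: <event>:" within the line.
        if (match($0, /[0-9]+\.[0-9]+: [A-Za-z0-9_]+:/)) {
            s = substr($0, RSTART, RLENGTH)
            sub(/^[0-9]+\.[0-9]+: /, "", s)    # drop the timestamp
            sub(/:$/, "", s)                   # drop the trailing colon
            count[s]++
        }
    }
    END { for (e in count) printf "%d %s\n", count[e], e }' |
        sort -rn
}

# Usage (as root). trace_pipe blocks waiting for data, so bound the read,
# for example with timeout from coreutils:
#   timeout 10 cat /sys/kernel/debug/tracing/trace_pipe | count_trace_events
```
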
5. Analysing Event Variances with PCL
=====================================

Any workload can exhibit variances between runs and it can be important
to know what the standard deviation is. By and large, this is left to the
performance analyst to do by hand. In the event that the discrete event
occurrences are useful to the performance analyst, then perf can be used.

  $ perf stat --repeat 5 -e kmem:mm_page_alloc -e kmem:mm_page_free \
                        -e kmem:mm_page_free_batched ./hackbench 10
  Time: 0.890
  Time: 0.895
  Time: 0.915
  Time: 1.001
  Time: 0.899

   Performance counter stats for './hackbench 10' (5 runs):

          16630  kmem:mm_page_alloc         ( +-   3.542% )
          11486  kmem:mm_page_free          ( +-   4.771% )
           4730  kmem:mm_page_free_batched  ( +-   2.325% )

    0.982653002  seconds time elapsed   ( +-   1.448% )

In the event that some higher-level event is required that depends on some
aggregation of discrete events, then a script would need to be developed.

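For figures that perf does not summarise, such as the "Time:" lines that
hackbench prints, the mean and standard deviation can be worked out with a
few lines of awk. This is a sketch under stated assumptions: run_stats is a
hypothetical helper, it only looks at lines beginning with "Time:", and it
reports the sample standard deviation (dividing by n - 1), which is not
necessarily the same statistic that perf prints after "+-".

```shell
# run_stats
# Read lines of the form "Time: <seconds>" on stdin and print the number
# of runs, the mean and the sample standard deviation.
run_stats() {
    awk '/^Time:/ {
             n++
             sum   += $2
             sumsq += $2 * $2
         }
         END {
             if (n < 2) { print "need at least two samples"; exit 1 }
             mean = sum / n
             var  = (sumsq - n * mean * mean) / (n - 1)   # sample variance
             if (var < 0) var = 0       # guard against rounding error
             printf "runs=%d mean=%.3f stddev=%.3f\n", n, mean, sqrt(var)
         }'
}

# Usage, assuming the workload prints "Time:" on stdout as hackbench does:
#   for i in 1 2 3 4 5; do ./hackbench 10; done | run_stats
```
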
Using --repeat, it is also possible to view how events are fluctuating over
time on a system-wide basis using -a and sleep.