linux/Documentation/iostats.txt
I/O statistics fields
---------------

Since 2.4.20 (and some versions before, with patches), and 2.5.45,
more extensive disk statistics have been introduced to help measure disk
activity. Tools such as sar and iostat typically interpret these and do
the work for you, but in case you are interested in creating your own
tools, the fields are explained here.

In 2.4 now, the information is found as additional fields in
/proc/partitions.  In 2.6, the same information is found in two
places: one is in the file /proc/diskstats, and the other is within
the sysfs file system, which must be mounted in order to obtain
the information. Throughout this document we'll assume that sysfs
is mounted on /sys, although of course it may be mounted anywhere.
Both /proc/diskstats and sysfs use the same source for the information
and so should not differ.

Here are examples of these different formats:

2.4:
   3     0   39082680 hda 446216 784926 9550688 4382310 424847 312726 5922052 19310380 0 3376340 23705160
   3     1    9221278 hda1 35486 0 35496 38030 0 0 0 0 0 38030 38030


2.6 sysfs:
   446216 784926 9550688 4382310 424847 312726 5922052 19310380 0 3376340 23705160
   35486    38030    38030    38030

2.6 diskstats:
   3    0   hda 446216 784926 9550688 4382310 424847 312726 5922052 19310380 0 3376340 23705160
   3    1   hda1 35486 38030 38030 38030
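
As a rough illustration of the 2.6 diskstats layout shown above, the
sketch below splits one line into the major and minor device numbers,
the device name, and the statistics values that follow. The function
name parse_diskstats_line and the dictionary keys are invented for this
example; they are not part of any kernel or library interface.

```python
def parse_diskstats_line(line):
    """Split one /proc/diskstats-style line into its columns.

    Hypothetical helper for illustration only.
    """
    parts = line.split()
    if len(parts) < 4:
        raise ValueError("not a diskstats line: %r" % line)
    return {
        "major": int(parts[0]),
        "minor": int(parts[1]),
        "name": parts[2],
        # Eleven values for a whole disk; only four for a partition
        # in the 2.6 example above.
        "stats": [int(p) for p in parts[3:]],
    }


disk = parse_diskstats_line(
    "   3    0   hda 446216 784926 9550688 4382310 424847 312726 "
    "5922052 19310380 0 3376340 23705160"
)
part = parse_diskstats_line("   3    1   hda1 35486 38030 38030 38030")
```

The same splitting should work for /sys/block/hda/stat if the three
leading columns are dropped, since that file carries only the
statistics values.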

On 2.4 you might execute "grep 'hda ' /proc/partitions". On 2.6, you have
a choice of "cat /sys/block/hda/stat" or "grep 'hda ' /proc/diskstats".
The advantage of one over the other is that the sysfs choice works well
if you are watching a known, small set of disks.  /proc/diskstats may
be a better choice if you are watching a large number of disks because
you'll avoid the overhead of 50, 100, or 500 or more opens/closes with
each snapshot of your disk statistics.

In 2.4, the statistics fields are those after the device name. In
the above example, the first field of statistics would be 446216.
By contrast, in 2.6 if you look at /sys/block/hda/stat, you'll
find just the eleven fields, beginning with 446216.  If you look at
/proc/diskstats, the eleven fields will be preceded by the major and
minor device numbers, and device name.  Each of these formats provides
eleven fields of statistics, each meaning exactly the same things.
All fields except field 9 are cumulative since boot.  Field 9 should
go to zero as I/Os complete; all others only increase (unless they
overflow and wrap).  Yes, these are (32-bit or 64-bit) unsigned long
(native word size) numbers, and on a very busy or long-lived system they
may wrap. Applications should be prepared to deal with that; unless
your observations are measured in large numbers of minutes or hours,
they should not wrap twice before you notice them.

Each set of stats only applies to the indicated device; if you want
system-wide stats you'll have to find all the devices and sum them all up.

Field  1 -- # of reads completed
    This is the total number of reads completed successfully.
Field  2 -- # of reads merged, field 6 -- # of writes merged
    Reads and writes which are adjacent to each other may be merged for
    efficiency.  Thus two 4K reads may become one 8K read before it is
    ultimately handed to the disk, and so it will be counted (and queued)
    as only one I/O.  This field lets you know how often this was done.
Field  3 -- # of sectors read
    This is the total number of sectors read successfully.
Field  4 -- # of milliseconds spent reading
    This is the total number of milliseconds spent by all reads (as
    measured from __make_request() to end_that_request_last()).
Field  5 -- # of writes completed
    This is the total number of writes completed successfully.
Field  7 -- # of sectors written
    This is the total number of sectors written successfully.
Field  8 -- # of milliseconds spent writing
    This is the total number of milliseconds spent by all writes (as
    measured from __make_request() to end_that_request_last()).
Field  9 -- # of I/Os currently in progress
    The only field that should go to zero. Incremented as requests are
    given to the appropriate struct request_queue and decremented as they
    finish.
Field 10 -- # of milliseconds spent doing I/Os
    This field increases so long as field 9 is nonzero.
Field 11 -- weighted # of milliseconds spent doing I/Os
    This field is incremented at each I/O start, I/O completion, I/O
    merge, or read of these stats by the number of I/Os in progress
    (field 9) times the number of milliseconds spent doing I/O since the
    last update of this field.  This can provide an easy measure of both
    I/O completion time and the backlog that may be accumulating.


To avoid introducing performance bottlenecks, no locks are held while
modifying these counters.  This implies that minor inaccuracies may be
introduced when changes collide, so (for instance) adding up all the
read I/Os issued per partition should equal those made to the disks ...
but due to the lack of locking it may only be very close.

In 2.6, there are counters for each CPU, which make the lack of locking
almost a non-issue.  When the statistics are read, the per-CPU counters
are summed (possibly overflowing the unsigned long variable they are
summed to) and the result given to the user.  There is no convenient
user interface for accessing the per-CPU counters themselves.

Disks vs Partitions
-------------------

There were significant changes between 2.4 and 2.6 in the I/O subsystem.
As a result, some statistic information disappeared. The translation from
a disk address relative to a partition to the disk address relative to
the host disk happens much earlier.  All merges and timings now happen
at the disk level rather than at both the disk and partition level as
in 2.4.  Consequently, you'll see a different statistics output on 2.6 for
partitions from that for disks.  There are only *four* fields available
for partitions on 2.6 machines.  This is reflected in the examples above.

Field  1 -- # of reads issued
    This is the total number of reads issued to this partition.
Field  2 -- # of sectors read
    This is the total number of sectors requested to be read from this
    partition.
Field  3 -- # of writes issued
    This is the total number of writes issued to this partition.
Field  4 -- # of sectors written
    This is the total number of sectors requested to be written to
    this partition.

Note that since the address is translated to a disk-relative one, and no
record of the partition-relative address is kept, the subsequent success
or failure of the read cannot be attributed to the partition.  In other
words, the number of reads for partitions is counted slightly before time
of queuing for partitions, and at completion for whole disks.  This is
a subtle distinction that is probably uninteresting for most cases.

More significant is the error induced by counting the numbers of
reads/writes before merges for partitions and after for disks. Since a
typical workload usually contains a lot of successive and adjacent requests,
the number of reads/writes issued can be several times higher than the
number of reads/writes completed.

In 2.6.25, the full statistic set is again available for partitions and
disk and partition statistics are consistent again. Since we still don't
keep record of the partition-relative address, an operation is attributed to
the partition which contains the first sector of the request after the
eventual merges. As requests can be merged across partitions, this could lead
to some (probably insignificant) inaccuracy.

Additional notes
----------------

In 2.6, sysfs is not mounted by default.  If your distribution of
Linux hasn't added it already, here's the line you'll want to add to
your /etc/fstab:

none /sys sysfs defaults 0 0


In 2.6, all disk statistics were removed from /proc/stat.  In 2.4, they
appear in both /proc/partitions and /proc/stat, although the ones in
/proc/stat take a very different format from those in /proc/partitions
(see proc(5), if your system has it.)

-- ricklind@us.ibm.com
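
For anyone writing their own tool, the wrap-around caveat in the field
descriptions can be handled with modular arithmetic when comparing two
snapshots. The sketch below is one possible approach, not a reference
implementation: it assumes the caller knows the kernel's word size
(32 or 64 bits) and that a counter wrapped at most once between
samples. The names counter_delta and stats_delta are made up for this
example.

```python
def counter_delta(old, new, bits=64):
    """Difference between two samples of a counter that may have wrapped
    at most once (assumption: counter width is `bits`)."""
    return (new - old) % (1 << bits)


def stats_delta(old_stats, new_stats, bits=64):
    """Per-field deltas between two snapshots of one device's statistics."""
    deltas = [counter_delta(o, n, bits) for o, n in zip(old_stats, new_stats)]
    # Field 9 (index 8) counts I/Os currently in progress and is not
    # cumulative, so report the latest value instead of a difference.
    # Four-field partition snapshots have no such field.
    if len(deltas) > 8:
        deltas[8] = new_stats[8]
    return deltas
```

With bits=32, a counter that moved from 4294967290 to 5 yields a delta
of 11 rather than a large negative number.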