================================================================
Documentation for Kdump - The kexec-based Crash Dumping Solution
================================================================

This document includes overview, setup and installation, and analysis
information.

Overview
========

Kdump uses kexec to quickly boot to a dump-capture kernel whenever a
dump of the system kernel's memory needs to be taken (for example, when
the system panics). The system kernel's memory image is preserved across
the reboot and is accessible to the dump-capture kernel.

You can use common commands, such as cp and scp, to copy the
memory image to a dump file on the local disk, or across the network to
a remote system.

Kdump and kexec are currently supported on the x86, x86_64, ppc64, ia64,
and s390x architectures.

When the system kernel boots, it reserves a small section of memory for
the dump-capture kernel. This ensures that ongoing Direct Memory Access
(DMA) from the system kernel does not corrupt the dump-capture kernel.
The kexec -p command loads the dump-capture kernel into this reserved
memory.

On x86 machines, the first 640 KB of physical memory is needed to boot,
regardless of where the kernel loads. Therefore, kexec backs up this
region just before rebooting into the dump-capture kernel.

Similarly, on PPC64 machines the first 32 KB of physical memory is needed
for booting, regardless of where the kernel is loaded. To support a 64K
page size, kexec backs up the first 64 KB of memory.

For s390x, when kdump is triggered, the crashkernel region is exchanged
with the region [0, crashkernel region size] and then the kdump kernel
runs in [0, crashkernel region size]. Therefore no relocatable kernel is
needed for s390x.

All of the necessary information about the system kernel's core image is
encoded in the ELF format, and stored in a reserved area of memory
before a crash. The physical address of the start of the ELF header is
passed to the dump-capture kernel through the elfcorehdr= boot
parameter. Optionally, the size of the ELF header can also be passed
when using the elfcorehdr=[size[KMG]@]offset[KMG] syntax.

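As an illustration of the elfcorehdr= syntax above, the following is a
minimal sketch that splits a value into its optional size and its offset
and converts both to bytes. The function names to_bytes and
parse_elfcorehdr are hypothetical helpers written for this example; they
are not part of kexec-tools or the kernel. The sketch assumes that the
K, M and G suffixes are powers of 1024.

```shell
#!/bin/bash
# Convert a number with an optional K, M or G suffix to bytes.
# Decimal and 0x-prefixed hexadecimal values are accepted.
to_bytes() {
    local val=$1
    case $val in
        *[Kk]) echo $(( ${val%?} * 1024 )) ;;
        *[Mm]) echo $(( ${val%?} * 1024 * 1024 )) ;;
        *[Gg]) echo $(( ${val%?} * 1024 * 1024 * 1024 )) ;;
        *)     echo $(( $val )) ;;
    esac
}

# Parse the value of elfcorehdr=[size[KMG]@]offset[KMG].
# Prints "size=<bytes> offset=<bytes>", or only the offset when no size
# is given.
parse_elfcorehdr() {
    local spec=$1 size="" offset
    case $spec in
        *@*) size=${spec%%@*}; offset=${spec#*@} ;;
        *)   offset=$spec ;;
    esac
    if [ -n "$size" ]; then
        echo "size=$(to_bytes "$size") offset=$(to_bytes "$offset")"
    else
        echo "offset=$(to_bytes "$offset")"
    fi
}
```

For example, "parse_elfcorehdr 4K@0x2000000" prints
"size=4096 offset=33554432".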

With the dump-capture kernel, you can access the memory image, or "old
memory," in two ways:

- Through a /dev/oldmem device interface. A capture utility can read the
  device file and write out the memory in raw format. This is a raw dump
  of memory. Analysis and capture tools must be intelligent enough to
  determine where to look for the right information.

- Through /proc/vmcore. This exports the dump as an ELF-format file that
  you can write out using file copy commands such as cp or scp. Further,
  you can use analysis tools such as the GNU Debugger (GDB) and the Crash
  tool to debug the dump file. This method ensures that the dump pages are
  correctly ordered.


Setup and Installation
======================

Install kexec-tools
-------------------

1) Log in as the root user.

2) Download the kexec-tools user-space package from the following URL:

http://kernel.org/pub/linux/utils/kernel/kexec/kexec-tools.tar.gz

This is a symlink to the latest version.

The latest kexec-tools git tree is available at:

git://git.kernel.org/pub/scm/utils/kernel/kexec/kexec-tools.git
and
http://www.kernel.org/pub/scm/utils/kernel/kexec/kexec-tools.git

There is also a gitweb interface available at
http://www.kernel.org/git/?p=utils/kernel/kexec/kexec-tools.git

More information about kexec-tools can be found at
http://horms.net/projects/kexec/

3) Unpack the tarball with the tar command, as follows:

   tar xvpzf kexec-tools.tar.gz

4) Change to the kexec-tools directory, as follows:

   cd kexec-tools-VERSION

5) Configure the package, as follows:

   ./configure

6) Compile the package, as follows:

   make

7) Install the package, as follows:

   make install


Build the system and dump-capture kernels
-----------------------------------------
There are two possible methods of using Kdump.

1) Build a separate custom dump-capture kernel for capturing the
   kernel core dump.

2) Use the system kernel binary itself as the dump-capture kernel, so
   that there is no need to build a separate dump-capture kernel. This
   is possible only on architectures that support a relocatable kernel.
   As of today, the i386, x86_64, ppc64 and ia64 architectures support a
   relocatable kernel.

Using a relocatable kernel is advantageous because one does not have to
build a second kernel for capturing the dump. At the same time, one
might want to build a custom dump-capture kernel suited to one's needs.

The following configuration settings are required for the system and
dump-capture kernels to enable kdump support.

System kernel config options
----------------------------

1) Enable "kexec system call" in "Processor type and features."

   CONFIG_KEXEC=y

2) Enable "sysfs file system support" in "Filesystem" -> "Pseudo
   filesystems." This is usually enabled by default.

   CONFIG_SYSFS=y

   Note that "sysfs file system support" might not appear in the "Pseudo
   filesystems" menu if "Configure standard kernel features (for small
   systems)" is not enabled in "General Setup." In this case, check the
   .config file itself to ensure that sysfs is turned on, as follows:

   grep 'CONFIG_SYSFS' .config

3) Enable "Compile the kernel with debug info" in "Kernel hacking."

   CONFIG_DEBUG_INFO=y

   This causes the kernel to be built with debug symbols. The dump
   analysis tools require a vmlinux with debug symbols in order to read
   and analyze a dump file.

Dump-capture kernel config options (Arch Independent)
-----------------------------------------------------

1) Enable "kernel crash dumps" support under "Processor type and
   features":

   CONFIG_CRASH_DUMP=y

2) Enable "/proc/vmcore support" under "Filesystems" -> "Pseudo filesystems".

   CONFIG_PROC_VMCORE=y
   (CONFIG_PROC_VMCORE is set by default when CONFIG_CRASH_DUMP is selected.)

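The grep check shown earlier for CONFIG_SYSFS can be extended to all of
the options listed so far. The following is a minimal sketch; the
function name check_kdump_config is a hypothetical helper for this
example, and it assumes that each option must appear as "OPTION=y" in
the given .config file.

```shell
#!/bin/bash
# Report which of the given options are set to "y" in a kernel .config.
# Usage: check_kdump_config <config-file> <option>...
# Returns 0 if every option is enabled, 1 otherwise.
check_kdump_config() {
    local config=$1 missing=0 opt
    shift
    for opt in "$@"; do
        if grep -q "^${opt}=y\$" "$config"; then
            echo "ok:      $opt"
        else
            echo "missing: $opt"
            missing=1
        fi
    done
    return "$missing"
}
```

For the system kernel, this could be run as:

   check_kdump_config .config CONFIG_KEXEC CONFIG_SYSFS CONFIG_DEBUG_INFO

and for the dump-capture kernel as:

   check_kdump_config .config CONFIG_CRASH_DUMP CONFIG_PROC_VMCORE
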
Dump-capture kernel config options (Arch Dependent, i386 and x86_64)
--------------------------------------------------------------------

1) On i386, enable high memory support under "Processor type and
   features":

   CONFIG_HIGHMEM64G=y
   or
   CONFIG_HIGHMEM4G=y

2) On i386 and x86_64, disable symmetric multi-processing support
   under "Processor type and features":

   CONFIG_SMP=n

   (If CONFIG_SMP=y, then specify maxcpus=1 on the kernel command line
   when loading the dump-capture kernel, see section "Load the Dump-capture
   Kernel".)

3) If you want to build and use a relocatable kernel, enable "Build a
   relocatable kernel" support under "Processor type and features":

   CONFIG_RELOCATABLE=y

4) Use a suitable value for "Physical address where the kernel is
   loaded" (under "Processor type and features"). This only appears when
   "kernel crash dumps" is enabled. A suitable value depends upon
   whether the kernel is relocatable or not.

   If you are using a relocatable kernel, use CONFIG_PHYSICAL_START=0x100000.
   This compiles the kernel for physical address 1MB, but because the
   kernel is relocatable, it can be run from any physical address. The
   kexec boot loader will therefore load it into the memory region
   reserved for the dump-capture kernel.

   Otherwise, it should be the start of the memory region reserved for
   the second kernel using the boot parameter "crashkernel=Y@X". Here X
   is the start of the memory region reserved for the dump-capture
   kernel. Generally X is 16MB (0x1000000), so you can set
   CONFIG_PHYSICAL_START=0x1000000.

5) Make and install the kernel and its modules. DO NOT add this kernel
   to the boot loader configuration files.

Dump-capture kernel config options (Arch Dependent, ppc64)
----------------------------------------------------------

1) Enable "Build a kdump crash kernel" support under "Kernel" options:

   CONFIG_CRASH_DUMP=y

2) Enable "Build a relocatable kernel" support:

   CONFIG_RELOCATABLE=y

   Make and install the kernel and its modules.

Dump-capture kernel config options (Arch Dependent, ia64)
---------------------------------------------------------

- No specific options are required to create a dump-capture kernel
  for ia64, other than those specified in the arch independent section
  above. This means that it is possible to use the system kernel
  as a dump-capture kernel if desired.

  The crashkernel region can be automatically placed by the system
  kernel at run time. This is done by specifying the base address as 0,
  or omitting it altogether.

  crashkernel=256M@0
  or
  crashkernel=256M

  If the start address is specified, note that the start address of the
  kernel will be aligned to 64 MB. If the specified start address is not
  aligned, any space below the alignment point will be wasted.


Extended crashkernel syntax
===========================

While the "crashkernel=size[@offset]" syntax is sufficient for most
configurations, sometimes it's handy to have the reserved memory dependent
on the value of System RAM -- that's mostly for distributors that pre-setup
the kernel command line to avoid an unbootable system after some memory has
been removed from the machine.

The syntax is:

    crashkernel=<range1>:<size1>[,<range2>:<size2>,...][@offset]
    range=start-[end]

    'start' is inclusive and 'end' is exclusive.

For example:

    crashkernel=512M-2G:64M,2G-:128M

This would mean:

    1) if the RAM is smaller than 512M, then don't reserve anything
       (this is the "rescue" case)
    2) if the RAM size is between 512M and 2G (exclusive), then reserve 64M
    3) if the RAM size is 2G or larger, then reserve 128M

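The range selection described above can be sketched in shell as follows.
This is only an illustration of the rule "start is inclusive, end is
exclusive"; it is not the kernel's parser. The function names to_bytes
and crashkernel_reserve are hypothetical helpers for this example, and
the sketch assumes that K, M and G are powers of 1024.

```shell
#!/bin/bash
# Convert a number with an optional K, M or G suffix to bytes.
to_bytes() {
    local val=$1
    case $val in
        *[Kk]) echo $(( ${val%?} * 1024 )) ;;
        *[Mm]) echo $(( ${val%?} * 1024 * 1024 )) ;;
        *[Gg]) echo $(( ${val%?} * 1024 * 1024 * 1024 )) ;;
        *)     echo $(( $val )) ;;
    esac
}

# Print the size that an extended crashkernel= value selects for a given
# amount of system RAM, or 0 if no range matches.
# Usage: crashkernel_reserve <ram> <range1>:<size1>[,<range2>:<size2>,...][@offset]
crashkernel_reserve() {
    local ram spec=${2%%@*} entry range size start end
    ram=$(to_bytes "$1")
    local IFS=,
    for entry in $spec; do
        range=${entry%%:*}          # e.g. "512M-2G" or "2G-"
        size=${entry#*:}            # e.g. "64M"
        start=$(to_bytes "${range%%-*}")
        end=${range#*-}             # empty means "no upper bound"
        if [ "$ram" -ge "$start" ] &&
           { [ -z "$end" ] || [ "$ram" -lt "$(to_bytes "$end")" ]; }; then
            echo "$size"
            return 0
        fi
    done
    echo 0
}
```

With the example value above, "crashkernel_reserve 1G
512M-2G:64M,2G-:128M" prints "64M", a RAM size of 256M prints "0", and
a RAM size of exactly 2G prints "128M".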


Boot into System Kernel
=======================

1) Update the boot loader (such as grub, yaboot, or lilo) configuration
   files as necessary.

2) Boot the system kernel with the boot parameter "crashkernel=Y@X",
   where Y specifies how much memory to reserve for the dump-capture kernel
   and X specifies the beginning of this reserved memory. For example,
   "crashkernel=64M@16M" tells the system kernel to reserve 64 MB of memory
   starting at physical address 0x01000000 (16MB) for the dump-capture kernel.

   On x86 and x86_64, use "crashkernel=64M@16M".

   On ppc64, use "crashkernel=128M@32M".

   On ia64, 256M@256M is a generous value that typically works.
   The region may be automatically placed on ia64, see the
   dump-capture kernel config option notes above.

   On s390x, typically use "crashkernel=xxM". The value of xx is dependent
   on the memory consumption of the kdump system. In general this is not
   dependent on the memory size of the production system.

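After booting with crashkernel=, it can be useful to confirm that the
reservation took effect. The following is a minimal sketch that looks
for the "Crash kernel" entry in /proc/iomem and reports its start
address and size. The function name crash_region is a hypothetical
helper for this example. The sketch assumes that /proc/iomem is read as
root, since unprivileged reads may show zeroed addresses.

```shell
#!/bin/bash
# Print the start address and size of the reserved crash kernel region.
# Usage: crash_region [iomem-file]   (defaults to /proc/iomem)
crash_region() {
    local iomem=${1:-/proc/iomem} range start end
    # Entries look like "  01000000-04ffffff : Crash kernel".
    range=$(sed -n 's/^ *\([0-9a-f]*-[0-9a-f]*\) : Crash kernel$/\1/p' \
            "$iomem" | head -n 1)
    if [ -z "$range" ]; then
        echo "no crash kernel region reserved"
        return 1
    fi
    start=$(( 0x${range%-*} ))
    end=$(( 0x${range#*-} ))
    echo "start=0x${range%-*} size=$(( (end - start + 1) / 1024 / 1024 ))M"
}
```

For a system booted with "crashkernel=64M@16M", the expected output is
"start=0x01000000 size=64M".
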
Load the Dump-capture Kernel
============================

After booting to the system kernel, the dump-capture kernel needs to be
loaded.

Based on the architecture and type of image (relocatable or not), one
can choose to load the uncompressed vmlinux or compressed bzImage/vmlinuz
of the dump-capture kernel. The following is a summary.

For i386 and x86_64:
        - Use vmlinux if kernel is not relocatable.
        - Use bzImage/vmlinuz if kernel is relocatable.
For ppc64:
        - Use vmlinux
For ia64:
        - Use vmlinux or vmlinuz.gz
For s390x:
        - Use image or bzImage


If you are using an uncompressed vmlinux image, use the following command
to load the dump-capture kernel.

   kexec -p <dump-capture-kernel-vmlinux-image> \
   --initrd=<initrd-for-dump-capture-kernel> --args-linux \
   --append="root=<root-dev> <arch-specific-options>"

If you are using a compressed bzImage/vmlinuz, use the following command
to load the dump-capture kernel.

   kexec -p <dump-capture-kernel-bzImage> \
   --initrd=<initrd-for-dump-capture-kernel> \
   --append="root=<root-dev> <arch-specific-options>"

Note that --args-linux does not need to be specified for ia64.
It is planned to make this a no-op on that architecture, but for now
it should be omitted.

The following arch-specific command line options should be used when
loading the dump-capture kernel.

For i386, x86_64 and ia64:
        "1 irqpoll maxcpus=1 reset_devices"

For ppc64:
        "1 maxcpus=1 noirqdistrib reset_devices"

For s390x:
        "1 maxcpus=1 cgroup_disable=memory"

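The per-architecture options above can be selected in a script. The
following is a minimal sketch that builds the --append string for a
given architecture and root device. The function name kdump_append_args
is a hypothetical helper for this example. Treating "i686" the same as
i386 is an assumption, made because "uname -m" commonly reports that
name on 32-bit x86 systems.

```shell
#!/bin/bash
# Print the --append string for the dump-capture kernel.
# Usage: kdump_append_args <arch> <root-dev>
kdump_append_args() {
    local arch=$1 rootdev=$2 opts
    case $arch in
        i386|i686|x86_64|ia64)
            opts="1 irqpoll maxcpus=1 reset_devices" ;;
        ppc64)
            opts="1 maxcpus=1 noirqdistrib reset_devices" ;;
        s390x)
            opts="1 maxcpus=1 cgroup_disable=memory" ;;
        *)
            echo "unsupported architecture: $arch" >&2
            return 1 ;;
    esac
    echo "root=$rootdev $opts"
}
```

The result could then be passed to kexec, for example:

   kexec -p <dump-capture-kernel-bzImage> \
   --initrd=<initrd-for-dump-capture-kernel> \
   --append="$(kdump_append_args "$(uname -m)" <root-dev>)"
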
Notes on loading the dump-capture kernel:

* By default, the ELF headers are stored in ELF64 format to support
  systems with more than 4GB memory. On i386, kexec automatically checks if
  the physical RAM size exceeds the 4 GB limit and if not, uses ELF32.
  So, on non-PAE systems, ELF32 is always used.

  The --elf32-core-headers option can be used to force the generation of ELF32
  headers. This is necessary because GDB currently cannot open vmcore files
  with ELF64 headers on 32-bit systems.

* The "irqpoll" boot parameter reduces driver initialization failures
  due to shared interrupts in the dump-capture kernel.

* You must specify <root-dev> in the format corresponding to the root
  device name in the output of the mount command.

* Boot parameter "1" boots the dump-capture kernel into single-user
  mode without networking. If you want networking, use "3".

* We generally don't have to bring up an SMP kernel just to capture the
  dump. Hence it is generally useful either to build a UP dump-capture
  kernel or to specify the maxcpus=1 option while loading the
  dump-capture kernel.

* For s390x there are two kdump modes: If an ELF header is specified with
  the elfcorehdr= kernel parameter, it is used by the kdump kernel as it
  is done on all other architectures. If no elfcorehdr= kernel parameter is
  specified, the s390x kdump kernel dynamically creates the header. The
  second mode has the advantage that for CPU and memory hotplug, kdump
  does not have to be reloaded with kexec_load().

* For s390x systems with many attached devices the "cio_ignore" kernel
  parameter should be used for the kdump kernel in order to prevent allocation
  of kernel memory for devices that are not relevant for kdump. The same
  applies to systems that use SCSI/FCP devices. In that case the
  "allow_lun_scan" zfcp module parameter should be set to zero before
  setting FCP devices online.

Kernel Panic
============

After successfully loading the dump-capture kernel as previously
described, the system will reboot into the dump-capture kernel if a
system crash is triggered. Trigger points are located in panic(),
die(), die_nmi() and in the sysrq handler (ALT-SysRq-c).

The following conditions will execute a crash trigger point:

If a hard lockup is detected and "NMI watchdog" is configured, the system
will boot into the dump-capture kernel (die_nmi()).

If die() is called, and it happens to be a thread with pid 0 or 1, or die()
is called inside interrupt context or die() is called and panic_on_oops is set,
the system will boot into the dump-capture kernel.

On powerpc systems when a soft-reset is generated, die() is called by all cpus
and the system will boot into the dump-capture kernel.

For testing purposes, you can trigger a crash by using "ALT-SysRq-c",
"echo c > /proc/sysrq-trigger" or by writing a module to force the panic.

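When testing, it helps to confirm that a dump-capture kernel is loaded
before forcing the crash, because otherwise the system simply panics
without capturing a dump. The following is a minimal sketch that checks
/sys/kernel/kexec_crash_loaded first. The function name
trigger_test_crash and its optional path arguments are hypothetical and
exist only so that the logic can be tried out against ordinary files.

```shell
#!/bin/bash
# Crash the system for testing, but only if a dump-capture kernel is
# loaded. Must be run as root. WARNING: with the default paths this
# crashes the running system immediately.
# Usage: trigger_test_crash [loaded-file] [trigger-file]
trigger_test_crash() {
    local loaded=${1:-/sys/kernel/kexec_crash_loaded}
    local trigger=${2:-/proc/sysrq-trigger}
    # The sysfs file contains "1" when a crash kernel has been loaded.
    if [ "$(cat "$loaded" 2>/dev/null)" != "1" ]; then
        echo "no dump-capture kernel loaded; refusing to crash" >&2
        return 1
    fi
    sync                    # flush file system buffers before the crash
    echo c > "$trigger"
}
```
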
Write Out the Dump File
=======================

After the dump-capture kernel is booted, write out the dump file with
the following command:

   cp /proc/vmcore <dump-file>

You can also access dumped memory as a /dev/oldmem device for a linear
and raw view. To create the device, use the following command:

   mknod /dev/oldmem c 1 12

Use the dd command with suitable options for count, bs, and skip to
access specific portions of the dump.

To see the entire memory, use the following command:

   dd if=/dev/oldmem of=oldmem.001

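As an illustration of choosing count, bs, and skip, the following is a
minimal sketch that copies a byte range out of a raw memory source by
converting the start address and length into block counts. The function
name dump_region is a hypothetical helper for this example, and the
4096-byte block size is an assumption; both start and length must be
multiples of it.

```shell
#!/bin/bash
# Copy <length> bytes starting at byte offset <start> from <src> to <dst>.
# Usage: dump_region <src> <dst> <start> <length>
dump_region() {
    local src=$1 dst=$2 start=$3 length=$4 bs=4096
    if [ $(( $start % $bs )) -ne 0 ] || [ $(( $length % $bs )) -ne 0 ]; then
        echo "start and length must be multiples of $bs" >&2
        return 1
    fi
    # skip and count are expressed in units of bs.
    dd if="$src" of="$dst" bs="$bs" \
       skip=$(( $start / $bs )) count=$(( $length / $bs ))
}
```

For example, "dump_region /dev/oldmem region.bin 0x1000000 0x100000"
would copy 1 MB starting at physical address 16MB into the file
region.bin (a hypothetical output name).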

Analysis
========

Before analyzing the dump image, you should reboot into a stable kernel.

You can do limited analysis using GDB on the dump file copied out of
/proc/vmcore. Use the debug vmlinux built with -g and run the following
command:

   gdb vmlinux <dump-file>

Stack trace for the task on processor 0, register display, and memory
display work fine.

Note: GDB cannot analyze core files generated in ELF64 format for x86.
On systems with a maximum of 4GB of memory, you can generate
ELF32-format headers by passing the --elf32-core-headers option to kexec
when loading the dump-capture kernel.

You can also use the Crash utility to analyze dump files in Kdump
format. Crash is available on Dave Anderson's site at the following URL:

   http://people.redhat.com/~anderson/


To Do
=====

1) Provide relocatable kernels for all architectures. This avoids having
   to maintain multiple kernels for crash dumping, because the same
   kernel that serves as the system kernel can be used to capture the
   dump.


Contact
=======

Vivek Goyal (vgoyal@redhat.com)
Maneesh Soni (maneesh@in.ibm.com)