linux/Documentation/powerpc/phyp-assisted-dump.txt

                   Hypervisor-Assisted Dump
                   ------------------------
                       November 2007

The goal of hypervisor-assisted dump is to enable the dump of
a crashed system, and to do so from a fully-reset system, and
to minimize the total elapsed time until the system is back
in production use.

As compared to kdump or other strategies, hypervisor-assisted
dump offers several strong, practical advantages:

-- Unlike kdump, the system has been reset, and loaded
   with a fresh copy of the kernel.  In particular,
   PCI and I/O devices have been reinitialized and are
   in a clean, consistent state.
-- As the dump is performed, the dumped memory becomes
   immediately available to the system for normal use.
-- After the dump is completed, no further reboots are
   required; the system will be fully usable, and running
   in its normal, production mode on its normal kernel.

The above can only be accomplished by coordination with,
and assistance from, the hypervisor. The procedure is
as follows:

-- When a system crashes, the hypervisor will save
   the low 256MB of RAM to a previously registered
   save region. It will also save system state, system
   registers, and hardware PTEs.

-- After the low 256MB area has been saved, the
   hypervisor will reset PCI and other hardware state.
   It will *not* clear RAM. It will then launch the
   bootloader, as normal.

-- The freshly booted kernel will notice that there
   is a new node (ibm,dump-kernel) in the device tree,
   indicating that there is crash data available from
   a previous boot. It will boot into only 256MB of RAM,
   reserving the rest of system memory.

-- Userspace tools will parse /sys/kernel/release_region
   and read /proc/vmcore to obtain the contents of memory,
   which holds the previously crashed kernel. The userspace
   tools may copy this info to disk, network, NAS, SAN,
   iSCSI, etc. as desired.

   For example, the values in /sys/kernel/release_region
   would look something like this (address-range pairs):
   CPU:0x177fee000-0x10000: HPTE:0x177ffe020-0x1000: /
   DUMP:0x177fff020-0x10000000, 0x10000000-0x16F1D370A

-- As the userspace tools complete saving a portion of
   the dump, they echo an offset and size to
   /sys/kernel/release_region to release the reserved
   memory back to general use.

   An example of this is:
     "echo 0x40000000 0x10000000 > /sys/kernel/release_region"
   which will release 256MB at the 1GB boundary.

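The handling of /sys/kernel/release_region by a userspace tool
can be sketched in Python as follows. This is an illustration
only, not part of the kernel or of any existing tool: the names
parse_regions(), release_request() and release() are made up
here, and the entry layout is inferred solely from the sample
values shown in this document. Whether the second number of
each pair is a length or an end address is not stated, so the
sketch returns both numbers unchanged.

```python
import re

# One token is either a region name ("CPU:") or a pair of hex numbers
# ("0x177fee000-0x10000").  This layout is an ASSUMPTION inferred from
# the sample values in this document, not a specified format.
_TOKEN = re.compile(r"([A-Za-z]+):|(0x[0-9a-fA-F]+)-(0x[0-9a-fA-F]+)")


def parse_regions(text):
    """Return {name: [(first, second), ...]} for release_region text."""
    regions = {}
    current = None
    # Separators (":", ",", whitespace, the "/" continuation) match
    # neither alternative and are simply skipped by finditer().
    for match in _TOKEN.finditer(text):
        name, first, second = match.groups()
        if name is not None:
            current = name
            regions.setdefault(current, [])
        elif current is not None:
            regions[current].append((int(first, 16), int(second, 16)))
    return regions


def release_request(offset, size):
    """Format the "offset size" string that is echoed to release_region."""
    return "0x%x 0x%x" % (offset, size)


def release(offset, size, path="/sys/kernel/release_region"):
    """Write one release request; only works where the file exists."""
    with open(path, "w") as f:
        f.write(release_request(offset, size) + "\n")
```

Here release_request(0x40000000, 0x10000000) produces the same
string as the echo example: 0x40000000 is the 1GB boundary and
0x10000000 is 256MB.
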
Please note that the hypervisor-assisted dump feature
is only available on Power6-based systems with recent
firmware versions.

Implementation details:
-----------------------

During boot, a check is made to see if firmware supports
this feature on this particular machine. If it does, then
we check to see if an active dump is waiting for us. If so,
then everything but 256MB of RAM is reserved during early
boot. This area is released once userland scripts have
collected the dump. If there is dump data, then
the /sys/kernel/release_region file is created, and
the reserved memory is held.

If there is no waiting dump data, then only the highest
256MB of RAM is reserved as a scratch area. This area
is *not* released: this region will be kept permanently
reserved, so that it can act as a receptacle for a copy
of the low 256MB in case a crash does occur. See,
however, "open issues" below, as to whether
such a reserved region is really needed.

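The two reservation cases can be summarised in a few lines of
Python. This is a sketch of the decision only, not the kernel
code: reserved_range() is a hypothetical name, and RAM is
assumed to be one contiguous block starting at address 0.

```python
LOW_SIZE = 256 * 1024 * 1024  # the 256MB figure used in this document


def reserved_range(ram_size, dump_waiting):
    """Return (base, size) of the memory reserved during early boot.

    ASSUMPTION: RAM is contiguous and starts at address 0.
    """
    if ram_size <= LOW_SIZE:
        raise ValueError("need more than 256MB of RAM")
    if dump_waiting:
        # Boot in the low 256MB only; everything above it is held
        # until userspace releases it via release_region.
        return (LOW_SIZE, ram_size - LOW_SIZE)
    # No dump waiting: keep the highest 256MB as a permanent scratch
    # area that can receive a copy of the low 256MB after a crash.
    return (ram_size - LOW_SIZE, LOW_SIZE)
```

For a 4GB machine this reserves 0x10000000 through the end of
RAM when a dump is waiting, and only the top 256MB starting at
0xF0000000 otherwise.
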
Currently the dump will be copied from /proc/vmcore to
a new file upon user intervention. The starting address
to be read and the range for each data point are provided
in /sys/kernel/release_region.

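A chunked copy of one such range might look like the Python
sketch below. It is illustrative only: copy_range() is a
hypothetical helper, and how an address from release_region
maps to a file offset in /proc/vmcore is not specified in this
document, so the caller is assumed to supply a file offset.
In-memory files stand in for /proc/vmcore and the output file.

```python
import io


def copy_range(src, dst, offset, size, chunk=1 << 20):
    """Copy up to `size` bytes from `src` at `offset` into `dst`.

    Returns the number of bytes actually copied, which is smaller
    than `size` if the source ends early.
    """
    src.seek(offset)
    copied = 0
    while copied < size:
        buf = src.read(min(chunk, size - copied))
        if not buf:
            break  # source ended before `size` bytes were read
        dst.write(buf)
        copied += len(buf)
    return copied


# Example with in-memory stand-ins (NOT the real /proc/vmcore).
demo_src = io.BytesIO(bytes(range(16)))
demo_dst = io.BytesIO()
demo_copied = copy_range(demo_src, demo_dst, offset=4, size=8, chunk=3)
```

Copying in bounded chunks keeps memory use small, which matters
while most of RAM is still reserved.
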
The tools to examine the dump will be the same as the ones
used for kdump.

General notes:
--------------
Security: please note that there are potential security issues
with any sort of dump mechanism. In particular, plaintext
(unencrypted) data, and possibly passwords, may be present in
the dump data. Userspace tools must take adequate precautions to
preserve security.

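As one example of such a precaution, a tool could create the
file that receives the dump so that only its owner can read it.
The Python sketch below is an assumption about how a tool might
do this, not a description of an existing tool;
create_private_file() and the file name are hypothetical.

```python
import os
import stat
import tempfile


def create_private_file(path):
    """Create `path` with owner-only permissions; fail if it exists."""
    # O_EXCL refuses to reuse an existing file; mode 0600 grants no
    # access to group or others (the umask can only remove more bits).
    flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL
    fd = os.open(path, flags, 0o600)
    return os.fdopen(fd, "wb")


# Example: create a dump file in a scratch directory and read its mode.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "vmcore.copy")
with create_private_file(demo_path) as f:
    f.write(b"\0" * 16)
demo_mode = stat.S_IMODE(os.stat(demo_path).st_mode)
```

File permissions cover only data at rest on the local system;
copies sent over the network need their own protection.
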
Open issues/ToDo:
-----------------
 o The various code paths that tell the hypervisor that a crash
   occurred, vs. it simply being a normal reboot, should be
   reviewed, and possibly clarified/fixed.

 o Instead of using /sys/kernel, should there be a /sys/dump?
   There is a dump_subsys being created by the s390 code;
   perhaps the pseries code should use a similar layout as well.

 o Is reserving a 256MB region really required? The goal of
   reserving a 256MB scratch area is to make sure that no
   important crash data is clobbered when the hypervisor
   saves low memory to the scratch area. But, if one could
   assure that nothing important is located in some 256MB
   area, then it would not need to be reserved. This is
   something that can be improved in subsequent versions.

 o Still working with the kdump team to integrate this with
   kdump; some work remains, but this would not affect the
   current patches.

 o Still need to write a shell script to copy the dump away.
   Currently I am parsing it manually.