   1                 MEMORY ATTRIBUTE ALIASING ON IA-64
   2
   3                           Bjorn Helgaas
   4                       <bjorn.helgaas@hp.com>
   5                            May 4, 2006
   6
   7
   8MEMORY ATTRIBUTES
   9
  10    Itanium supports several attributes for virtual memory references.
  11    The attribute is part of the virtual translation, i.e., it is
  12    contained in the TLB entry.  The ones of most interest to the Linux
  13    kernel are:
  14
  15        WB              Write-back (cacheable)
  16        UC              Uncacheable
  17        WC              Write-coalescing
  18
  19    System memory typically uses the WB attribute.  The UC attribute is
  20    used for memory-mapped I/O devices.  The WC attribute is uncacheable
  21    like UC is, but writes may be delayed and combined to increase
  22    performance for things like frame buffers.
  23
  24    The Itanium architecture requires that we avoid accessing the same
  25    page with both a cacheable mapping and an uncacheable mapping[1].
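
As an illustration only, the rule above can be written as a small predicate.
The names mem_attr, attr_is_cacheable() and attrs_alias() are hypothetical and
are not kernel identifiers.  The sketch models only the cacheable versus
uncacheable rule as stated in this document; the SDM may impose further
constraints that are not captured here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for the three attributes discussed in this document. */
enum mem_attr { ATTR_WB, ATTR_UC, ATTR_WC };

/* WB is the only cacheable attribute of the three; UC and WC are uncacheable. */
static bool attr_is_cacheable(enum mem_attr a)
{
    assert(a == ATTR_WB || a == ATTR_UC || a == ATTR_WC);
    return a == ATTR_WB;
}

/*
 * Two mappings of the same page conflict, in the sense used here, when one
 * is cacheable and the other is not.
 */
static bool attrs_alias(enum mem_attr a, enum mem_attr b)
{
    return attr_is_cacheable(a) != attr_is_cacheable(b);
}
```

Under this model a WB mapping conflicts with either a UC or a WC mapping of
the same page, while two uncacheable mappings do not conflict with each other.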
  26
  27    The design of the chipset determines which attributes are supported
  28    on which regions of the address space.  For example, some chipsets
  29    support either WB or UC access to main memory, while others support
  30    only WB access.
  31
  32MEMORY MAP
  33
  34    Platform firmware describes the physical memory map and the
  35    supported attributes for each region.  At boot-time, the kernel uses
  36    the EFI GetMemoryMap() interface.  ACPI can also describe memory
  37    devices and the attributes they support, but Linux/ia64 currently
  38    doesn't use this information.
  39
  40    The kernel uses the efi_memmap table returned from GetMemoryMap() to
  41    learn the attributes supported by each region of physical address
  42    space.  Unfortunately, this table does not completely describe the
  43    address space because some machines omit some or all of the MMIO
  44    regions from the map.
  45
  46    The kernel maintains another table, kern_memmap, which describes the
  47    memory Linux is actually using and the attribute for each region.
  48    This contains only system memory; it does not contain MMIO space.
  49
  50    The kern_memmap table typically contains only a subset of the system
  51    memory described by the efi_memmap.  Linux/ia64 can't use all memory
  52    in the system because of constraints imposed by the identity mapping
  53    scheme.
  54
  55    The efi_memmap table is preserved unmodified because the original
  56    boot-time information is required for kexec.
  57
  58KERNEL IDENTITY MAPPINGS
  59
  60    Linux/ia64 identity mappings are done with large pages, currently
  61    either 16MB or 64MB, referred to as "granules."  Cacheable mappings
  62    are speculative[2], so the processor can read any location in the
  63    page at any time, independent of the programmer's intentions.  This
  64    means that to avoid attribute aliasing, Linux can create a cacheable
  65    identity mapping only when the entire granule supports cacheable
  66    access.
  67
  68    Therefore, kern_memmap contains only full granule-sized regions that
  69    can be referenced safely by an identity mapping.
  70
  71    Uncacheable mappings are not speculative, so the processor will
  72    generate UC accesses only to locations explicitly referenced by
  73    software.  This allows UC identity mappings to cover granules that
  74    are only partially populated, or populated with a combination of UC
  75    and WB regions.
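
The granule rule for cacheable identity mappings might be sketched as follows.
This is not the kernel's implementation: struct region, GRANULE_SIZE and
granule_supports_wb() are hypothetical names, the granule size is assumed to
be 16MB, and the memory map is assumed to be sorted by start address with no
overlapping entries.  The first example map is the HP sx1000 layout with VGA
enabled that is described under PAST PROBLEM CASES; the second map is made up
for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { ATTR_WB = 1, ATTR_UC = 2 };        /* hypothetical attribute bits */

struct region {                           /* covers [start, end) */
    uint64_t start, end;
    unsigned attrs;                       /* supported attributes */
};

#define GRANULE_SIZE (16ULL << 20)        /* assumption: 16MB granules */

/* Low megabyte on HP sx1000 with VGA enabled, as listed in this document. */
static const struct region vga_enabled_map[] = {
    { 0x00000, 0xA0000,  ATTR_WB },
    { 0xA0000, 0xC0000,  ATTR_UC },       /* VGA frame buffer */
    { 0xC0000, 0x100000, ATTR_WB },
};

/* Made-up map: 64MB of memory that supports WB everywhere. */
static const struct region hypothetical_all_wb_map[] = {
    { 0x0, 64ULL << 20, ATTR_WB },
};

/* True only if every byte of the granule containing addr supports WB. */
static bool granule_supports_wb(const struct region *map, size_t n,
                                uint64_t addr)
{
    uint64_t gstart = addr & ~(GRANULE_SIZE - 1);
    uint64_t gend = gstart + GRANULE_SIZE;
    uint64_t covered = gstart;            /* [gstart, covered) is known WB */

    for (size_t i = 0; i < n && covered < gend; i++) {
        assert(map[i].start < map[i].end);
        if (map[i].end <= covered)
            continue;                     /* entirely below the granule */
        if (map[i].start > covered)
            return false;                 /* hole: not described by the map */
        if (!(map[i].attrs & ATTR_WB))
            return false;                 /* part of the granule lacks WB */
        covered = map[i].end;
    }
    return covered >= gend;
}
```

With these maps, the granule at address 0 of vga_enabled_map is rejected
because of the UC-only frame buffer, which matches the reasoning above: a
speculative WB access could land anywhere in the granule.  A UC identity
mapping needs no such check, since UC accesses are generated only for
addresses that software references explicitly.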
  76
  77USER MAPPINGS
  78
  79    User mappings are typically done with 16K or 64K pages.  The smaller
  80    page size allows more flexibility because only 16K or 64K has to be
  81    homogeneous with respect to memory attributes.
  82
  83POTENTIAL ATTRIBUTE ALIASING CASES
  84
  85    There are several ways the kernel creates new mappings:
  86
  87    mmap of /dev/mem
  88
  89        This uses remap_pfn_range(), which creates user mappings.  These
  90        mappings may be either WB or UC.  If the region being mapped
  91        happens to be in kern_memmap, meaning that it may also be mapped
  92        by a kernel identity mapping, the user mapping must use the same
  93        attribute as the kernel mapping.
  94
  95        If the region is not in kern_memmap, the user mapping should use
  96        an attribute reported as being supported in the EFI memory map.
  97
  98        Since the EFI memory map does not describe MMIO on some
  99        machines, this should use an uncacheable mapping as a fallback.
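
The three rules above for /dev/mem mmap can be summarized in a short sketch.
struct phys_info and devmem_mmap_attr() are hypothetical names used only for
illustration, and preferring WB over UC when the EFI memory map reports both
is an assumption made for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum mem_attr { ATTR_WB, ATTR_UC, ATTR_WC };  /* hypothetical names */

/* Hypothetical summary of what is known about the region being mapped. */
struct phys_info {
    bool in_kern_memmap;       /* may also have a kernel identity mapping */
    enum mem_attr kern_attr;   /* meaningful only if in_kern_memmap */
    bool in_efi_memmap;        /* described by the EFI memory map */
    bool efi_wb, efi_uc;       /* attributes EFI reports as supported */
};

static enum mem_attr devmem_mmap_attr(const struct phys_info *p)
{
    assert(p != NULL);

    /* Rule 1: match the kernel mapping to avoid attribute aliasing. */
    if (p->in_kern_memmap)
        return p->kern_attr;

    /* Rule 2: use an attribute EFI reports (WB first is an assumption). */
    if (p->in_efi_memmap) {
        if (p->efi_wb)
            return ATTR_WB;
        if (p->efi_uc)
            return ATTR_UC;
    }

    /* Rule 3: EFI may omit MMIO entirely, so fall back to uncacheable. */
    return ATTR_UC;
}
```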
 100
 101    mmap of /sys/class/pci_bus/.../legacy_mem
 102
 103        This is very similar to mmap of /dev/mem, except that legacy_mem
 104        only allows mmap of the one megabyte "legacy MMIO" area for a
 105        specific PCI bus.  Typically this is the first megabyte of
 106        physical address space, but it may be different on machines with
 107        several VGA devices.
 108
 109        "X" uses this to access VGA frame buffers.  Using legacy_mem
 110        rather than /dev/mem allows multiple instances of X to talk to
 111        different VGA cards.
 112
 113        The /dev/mem mmap constraints apply.
 114
 115    mmap of /proc/bus/pci/.../??.?
 116
 117        This is an MMIO mmap of a PCI function.  The caller may
 118        additionally request the WC attribute.
 119
 120        If WC is requested, and the region in kern_memmap is either WC
 121        or UC, and the EFI memory map designates the region as WC, then
 122        the WC mapping is allowed.
 123
 124        Otherwise, the user mapping must use the same attribute as the
 125        kernel mapping.
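
The WC condition above is a conjunction of three tests, which a sketch makes
explicit.  pci_mmap_attr() and its parameters are hypothetical names, not the
kernel's interface.

```c
#include <assert.h>
#include <stdbool.h>

enum mem_attr { ATTR_WB, ATTR_UC, ATTR_WC };  /* hypothetical names */

/*
 * want_wc:   the caller requested a write-coalescing mapping
 * kern_attr: attribute of the kernel mapping for the region
 * efi_wc:    the EFI memory map designates the region as WC
 */
static enum mem_attr pci_mmap_attr(bool want_wc, enum mem_attr kern_attr,
                                   bool efi_wc)
{
    assert(kern_attr == ATTR_WB || kern_attr == ATTR_UC ||
           kern_attr == ATTR_WC);

    /* WC is allowed only when all three conditions hold. */
    if (want_wc && (kern_attr == ATTR_WC || kern_attr == ATTR_UC) && efi_wc)
        return ATTR_WC;

    /* Otherwise the user mapping follows the kernel mapping. */
    return kern_attr;
}
```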
 126
 127    read/write of /dev/mem
 128
 129        This uses copy_to_user() for reads and copy_from_user() for
 130        writes, both of which implicitly use a kernel identity mapping.
 131        This is obviously safe for things in kern_memmap.
 132
 133        There may be corner cases of things that are not in kern_memmap,
 134        but could be accessed this way.  For example, registers in MMIO
 135        space are not in kern_memmap, but could be accessed with a UC
 136        mapping.  This would not cause attribute aliasing.  But
 137        registers typically can be accessed only with four-byte or
 138        eight-byte accesses, and the user-copy path doesn't allow
 139        any control over the access size, so this would be dangerous.
 140
 141    ioremap()
 142
 143        This returns a mapping for use inside the kernel.
 144
 145        If the region is in kern_memmap, we should use the attribute
 146        specified there.
 147
 148        If the EFI memory map reports that the entire granule supports
 149        WB, we should use that (granules that are partially reserved
 150        or occupied by firmware do not appear in kern_memmap).
 151
 152        If the granule contains non-WB memory, but we can cover the
 153        region safely with kernel page table mappings, we can use
 154        ioremap_page_range() as most other architectures do.
 155
 156        Failing all of the above, we have to fall back to a UC mapping.
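
The order of the ioremap() checks above matters, so a sketch of the decision
as a single function may help.  The enum values, struct ioremap_query and
ioremap_choose() are hypothetical names; the three booleans stand in for
lookups that the real code would perform against kern_memmap, the EFI memory
map and the kernel page tables.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical labels for the four outcomes described in the text. */
enum ioremap_kind {
    MAP_KERN_ATTR,     /* identity mapping with the kern_memmap attribute */
    MAP_WB_IDENTITY,   /* WB identity mapping; whole granule supports WB */
    MAP_PAGE_TABLE,    /* small pages via ioremap_page_range() */
    MAP_UC_IDENTITY,   /* last resort: UC mapping */
};

/* Hypothetical inputs, each the result of a lookup not shown here. */
struct ioremap_query {
    bool in_kern_memmap;   /* region appears in kern_memmap */
    bool granule_all_wb;   /* EFI reports WB for the entire granule */
    bool page_table_ok;    /* region can be covered safely by page tables */
};

static enum ioremap_kind ioremap_choose(const struct ioremap_query *q)
{
    assert(q != NULL);

    if (q->in_kern_memmap)
        return MAP_KERN_ATTR;
    if (q->granule_all_wb)
        return MAP_WB_IDENTITY;
    if (q->page_table_ok)
        return MAP_PAGE_TABLE;
    return MAP_UC_IDENTITY;
}
```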
 157
 158PAST PROBLEM CASES
 159
 160    mmap of various MMIO regions from /dev/mem by "X" on Intel platforms
 161
 162      The EFI memory map may not report these MMIO regions.
 163
 164      These must be allowed so that X will work.  This means that
 165      when the EFI memory map is incomplete, every /dev/mem mmap must
 166      succeed.  It may create either WB or UC user mappings, depending
 167      on whether the region is in kern_memmap or the EFI memory map.
 168
 169    mmap of 0x0-0x9FFFF /dev/mem by "hwinfo" on HP sx1000 with VGA enabled
 170
 171      The EFI memory map reports the following attributes:
 172        0x00000-0x9FFFF WB only
 173        0xA0000-0xBFFFF UC only (VGA frame buffer)
 174        0xC0000-0xFFFFF WB only
 175
 176      This mmap is done with user pages, not kernel identity mappings,
 177      so it is safe to use WB mappings.
 178
 179      The kernel VGA driver may ioremap the VGA frame buffer at 0xA0000,
 180      which uses a granule-sized UC mapping.  This granule will cover some
 181      WB-only memory, but since UC is non-speculative, the processor will
 182      never generate an uncacheable reference to the WB-only areas unless
 183      the driver explicitly touches them.
 184
 185    mmap of 0x0-0xFFFFF legacy_mem by "X"
 186
 187      If the EFI memory map reports that the entire range supports the
 188      same attributes, we can allow the mmap (and we will prefer WB if
 189      supported, as is the case with HP sx[12]000 machines with VGA
 190      disabled).
 191
 192      If EFI reports the range as partly WB and partly UC (as on sx[12]000
 193      machines with VGA enabled), we must fail the mmap because there's no
 194      safe attribute to use.
 195
 196      If EFI reports some of the range but not all (as on Intel firmware
 197      that doesn't report the VGA frame buffer at all), we should fail the
 198      mmap and force the user to map just the specific region of interest.
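
The three outcomes for a 0x0-0xFFFFF legacy_mem mmap can be sketched as a
walk over the EFI memory map.  This models only the three sub-cases just
described and is not the kernel's code: struct region and legacy_mmap_attr()
are hypothetical names, the map is assumed to be sorted by start address with
no overlapping entries, and the entries in no_fb_map are made up to stand for
firmware that omits the VGA frame buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { ATTR_WB = 1, ATTR_UC = 2 };        /* hypothetical attribute bits */

struct region {                           /* covers [start, end) */
    uint64_t start, end;
    unsigned attrs;                       /* supported attributes */
};

/* HP sx1000 with VGA disabled: whole megabyte is WB only. */
static const struct region vga_disabled_map[] = {
    { 0x00000, 0x100000, ATTR_WB },
};

/* HP sx1000 with VGA enabled: partly WB, partly UC. */
static const struct region vga_enabled_map[] = {
    { 0x00000, 0xA0000,  ATTR_WB },
    { 0xA0000, 0xC0000,  ATTR_UC },       /* VGA frame buffer */
    { 0xC0000, 0x100000, ATTR_WB },
};

/* Made-up entries: firmware that does not report the frame buffer. */
static const struct region no_fb_map[] = {
    { 0x00000, 0xA0000,  ATTR_WB },
    { 0xC0000, 0x100000, ATTR_WB },
};

/* Returns the attribute to use for [start, end), or -1 to fail the mmap. */
static int legacy_mmap_attr(const struct region *map, size_t n,
                            uint64_t start, uint64_t end)
{
    uint64_t covered = start;             /* [start, covered) checked so far */
    unsigned attrs = 0;
    bool first = true;

    assert(start < end);
    for (size_t i = 0; i < n && covered < end; i++) {
        if (map[i].end <= covered)
            continue;                     /* entirely below the range */
        if (map[i].start > covered)
            return -1;                    /* EFI does not report this part */
        if (first) {
            attrs = map[i].attrs;
            first = false;
        } else if (map[i].attrs != attrs) {
            return -1;                    /* mixed attributes: none is safe */
        }
        covered = map[i].end;
    }
    if (covered < end)
        return -1;                        /* range runs past the map */
    if (attrs & ATTR_WB)
        return ATTR_WB;                   /* prefer WB if supported */
    if (attrs & ATTR_UC)
        return ATTR_UC;
    return -1;
}
```

For the full 0x0-0xFFFFF range, the VGA-disabled map yields WB, while the
VGA-enabled map and the map with the missing frame buffer both fail.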
 199
 200    mmap of 0xA0000-0xBFFFF legacy_mem by "X" on HP sx1000 with VGA disabled
 201
 202      The EFI memory map reports the following attributes:
 203        0x00000-0xFFFFF WB only (no VGA MMIO hole)
 204
 205      This is a special case of the previous case, and the mmap should
 206      fail for the same reason as above.
 207
 208    read of /sys/devices/.../rom
 209
 210      For VGA devices, this may cause an ioremap() of 0xC0000.  This
 211      used to be done with a UC mapping, because the VGA frame buffer
 212      at 0xA0000 prevents use of a WB granule.  The UC mapping causes
 213      an MCA on HP sx[12]000 chipsets.
 214
 215      We should use WB page table mappings to avoid covering the VGA
 216      frame buffer.
 217
 218NOTES
 219
 220    [1] SDM rev 2.2, vol 2, sec 4.4.1.
 221    [2] SDM rev 2.2, vol 2, sec 4.4.6.
 222