                The MSI Driver Guide HOWTO
        Tom L Nguyen tom.l.nguyen@intel.com
                        10/03/2003

1. About this guide

This guide describes the basics of Message Signaled Interrupts (MSI),
the advantages of using MSI over traditional interrupt mechanisms, and
how to enable your driver to use MSI or MSI-X. A Frequently Asked
Questions (FAQ) section is also included.

2. Copyright 2003 Intel Corporation

3. What is MSI/MSI-X?

Message Signaled Interrupt (MSI), as described in the PCI Local Bus
Specification Revision 2.3 or later, is an optional feature for PCI
devices and a required feature for PCI Express devices. MSI enables a
device function to request service by sending an Inbound Memory Write
on its PCI bus to the FSB as a Message Signaled Interrupt transaction.
Because MSI is generated in the form of a Memory Write, all transaction
conditions, such as a Retry, Master-Abort, Target-Abort or normal
completion, are supported.

A PCI device that supports MSI must also support the pin IRQ assertion
interrupt mechanism to provide backward compatibility for systems that
do not support MSI. In systems that support MSI, the bus driver is
responsible for initializing the message address and message data of
the device function's MSI/MSI-X capability structure during initial
device configuration.

An MSI capable device function indicates MSI support by implementing
the MSI/MSI-X capability structure in its PCI capability list. The
device function may implement both the MSI capability structure and
the MSI-X capability structure; however, the bus driver should not
enable both, but instead enable only the MSI-X capability structure.
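
The capability lookup described above can be sketched in user space as
follows. This is a minimal illustration, not the kernel's code: it
walks a 256-byte copy of a function's configuration space held in an
ordinary array. The register offsets and capability IDs follow the PCI
specification; the function names find_capability() and
choose_msi_capability() are hypothetical.

```c
#include <stdint.h>

#define PCI_STATUS            0x06  /* Status register offset */
#define PCI_STATUS_CAP_LIST   0x10  /* Status bit 4: capability list present */
#define PCI_CAPABILITY_LIST   0x34  /* Offset of the first capability pointer */
#define PCI_CAP_ID_MSI        0x05  /* MSI capability ID */
#define PCI_CAP_ID_MSIX       0x11  /* MSI-X capability ID */

/* Return the config-space offset of capability cap_id, or 0 if absent. */
static int find_capability(const uint8_t cfg[256], uint8_t cap_id)
{
    int ttl = 48;   /* guard against malformed, looping lists */
    uint8_t pos;

    if (!(cfg[PCI_STATUS] & PCI_STATUS_CAP_LIST))
        return 0;
    pos = cfg[PCI_CAPABILITY_LIST] & 0xFC;
    while (pos >= 0x40 && ttl-- > 0) {
        if (cfg[pos] == cap_id)
            return pos;
        pos = cfg[pos + 1] & 0xFC;  /* follow the "next" pointer */
    }
    return 0;
}

/* Hypothetical policy helper: prefer MSI-X over MSI, never select both. */
static uint8_t choose_msi_capability(const uint8_t cfg[256])
{
    if (find_capability(cfg, PCI_CAP_ID_MSIX))
        return PCI_CAP_ID_MSIX;
    if (find_capability(cfg, PCI_CAP_ID_MSI))
        return PCI_CAP_ID_MSI;
    return 0;
}
```

A function that implements both structures yields PCI_CAP_ID_MSIX here,
which mirrors the rule that only the MSI-X capability structure should
be enabled in that case.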

The MSI capability structure contains the Message Control register,
the Message Address register and the Message Data register. These
registers give the bus driver control over MSI. The Message Control
register indicates the MSI capability supported by the device. The
Message Address register specifies the target address and the Message
Data register specifies the characteristics of the message. To request
service, the device function writes the content of the Message Data
register to the target address. The device and its software driver
are prohibited from writing to these registers.
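
As a rough illustration of what the Message Control register reports,
the sketch below decodes its fields from a 16-bit value. The bit
positions follow the PCI specification's MSI capability layout; the
macro and function names are hypothetical and are not taken from the
patch.

```c
#include <stdint.h>

#define SKETCH_MSI_ENABLE    0x0001u  /* bit 0: MSI enable */
#define SKETCH_MSI_QMASK     0x000eu  /* bits 3:1: multiple message capable */
#define SKETCH_MSI_QSIZE     0x0070u  /* bits 6:4: multiple message enable */
#define SKETCH_MSI_64BIT     0x0080u  /* bit 7: 64-bit address capable */
#define SKETCH_MSI_MASKBIT   0x0100u  /* bit 8: per-vector masking capable */

/* Number of vectors the function is able to request (a power of two). */
static unsigned sketch_msi_vectors_capable(uint16_t control)
{
    return 1u << ((control & SKETCH_MSI_QMASK) >> 1);
}

/* Number of vectors system software has actually granted. */
static unsigned sketch_msi_vectors_enabled(uint16_t control)
{
    return 1u << ((control & SKETCH_MSI_QSIZE) >> 4);
}

/*
 * Offset of the Message Data register from the start of the capability:
 * it follows a 32-bit or a 64-bit Message Address register.
 */
static unsigned sketch_msi_data_offset(uint16_t control)
{
    return (control & SKETCH_MSI_64BIT) ? 0x0c : 0x08;
}
```

For a control value of 0x008a, the function advertises 32 vectors and a
64-bit address, while only one vector is enabled.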

The MSI-X capability structure is an optional extension to MSI. It
uses an independent and separate capability structure. Implementing
the MSI-X capability structure has some key advantages over the MSI
capability structure, as described below.

        - Support a larger maximum number of vectors per function.

        - Provide the ability for system software to configure
        each vector with an independent message address and message
        data, specified by a table that resides in Memory Space.

        - MSI and MSI-X both support per-vector masking. Per-vector
        masking is an optional extension of MSI but a required
        feature for MSI-X. Per-vector masking gives the kernel
        the ability to mask/unmask MSI when servicing its software
        interrupt service routine. If per-vector masking is
        not supported, then the device driver should provide the
        hardware/software synchronization to ensure that the device
        generates MSI only when the driver wants it to do so.
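
The per-vector address, data and mask described in the list above can
be pictured with the following sketch of an MSI-X table entry. The
16-byte entry layout and the Table Size encoding (N-1 in bits 10:0 of
Message Control) follow the PCI specification; the type and function
names are hypothetical. A real table lives in device memory and would
be accessed through the kernel's MMIO accessors, whereas this sketch
uses ordinary memory.

```c
#include <stdint.h>

#define SKETCH_MSIX_TABLE_SIZE_MASK 0x07ffu  /* Message Control bits 10:0 */
#define SKETCH_MSIX_ENTRY_MASKED    0x1u     /* Vector Control bit 0 */

/* One MSI-X table entry: four 32-bit words, 16 bytes in total. */
struct sketch_msix_entry {
    uint32_t addr_lo;      /* message address, lower 32 bits */
    uint32_t addr_hi;      /* message address, upper 32 bits */
    uint32_t data;         /* message data */
    uint32_t vector_ctrl;  /* bit 0 masks this vector */
};

/* The Table Size field is encoded as N-1. */
static unsigned sketch_msix_table_size(uint16_t control)
{
    return (control & SKETCH_MSIX_TABLE_SIZE_MASK) + 1u;
}

/* Give one vector its own independent address and data. */
static void sketch_msix_program(struct sketch_msix_entry *e,
                                uint64_t addr, uint32_t data)
{
    e->addr_lo = (uint32_t)(addr & 0xffffffffu);
    e->addr_hi = (uint32_t)(addr >> 32);
    e->data = data;
}

static void sketch_msix_mask(struct sketch_msix_entry *e)
{
    e->vector_ctrl |= SKETCH_MSIX_ENTRY_MASKED;
}

static void sketch_msix_unmask(struct sketch_msix_entry *e)
{
    e->vector_ctrl &= ~SKETCH_MSIX_ENTRY_MASKED;
}

static int sketch_msix_is_masked(const struct sketch_msix_entry *e)
{
    return (e->vector_ctrl & SKETCH_MSIX_ENTRY_MASKED) != 0;
}
```

Because every entry carries its own address and data, two vectors of
the same function can target different CPUs without any relationship
between their vector numbers.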

4. Why use MSI?

MSI simplifies board design by allowing board designers to remove
out-of-band interrupt routing. MSI is another step towards a
legacy-free environment.

Due to increasing pressure on chipset and processor packages to
reduce pin count, the need for interrupt pins is expected to
diminish over time. Devices, due to pin constraints, may implement
messages to increase performance.

PCI Express endpoints use INTx emulation (in-band messages) instead
of IRQ pin assertion. Using INTx emulation requires interrupt
sharing among devices connected to the same node (PCI bridge), while
MSI is unique (non-shared) and does not require BIOS configuration
support. As a result, the PCI Express technology requires MSI
support for better interrupt performance.

Using MSI enables the device functions to support two or more
vectors, which can be configured to target different CPUs to
increase scalability.

5. Configuring a driver to use MSI/MSI-X

By default, once the patch is installed, the kernel does not enable
MSI/MSI-X on the devices that support this capability. A kernel
configuration option must be selected to enable MSI/MSI-X support.

5.1 Including MSI support into the kernel

To include MSI support in the kernel, users must apply the
VECTOR-base patch first and then the MSI patch, because MSI support
needs the VECTOR based scheme. Once these patches are installed,
setting CONFIG_PCI_USE_VECTOR enables the VECTOR based scheme and
the option for MSI-capable device drivers to selectively enable MSI
(using pci_enable_msi as described below).

Since the target of the inbound message is the local APIC,
CONFIG_PCI_USE_VECTOR is available only when CONFIG_X86_LOCAL_APIC
is enabled.

int pci_enable_msi(struct pci_dev *)

With this new API, any existing device driver that wants MSI enabled
on its device function must call this function explicitly. A
successful call will initialize the MSI/MSI-X capability structure
with ONE vector, regardless of whether the device function is
capable of supporting multiple messages. This vector replaces the
pre-assigned dev->irq with a new MSI vector. To avoid a conflict
between the newly assigned vector and the existing pre-assigned
vector, the device driver must call this API before calling
request_irq(...).
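
The required call order can be sketched as below. This is a user-space
illustration only: sketch_pci_enable_msi() and sketch_request_irq() are
stubs standing in for pci_enable_msi() and request_irq(), with
simplified signatures that are not the kernel's, and struct
sketch_pci_dev is a stand-in for struct pci_dev. The vector number 193
is arbitrary.

```c
/* Stand-in for the kernel's struct pci_dev; only the fields used here. */
struct sketch_pci_dev {
    unsigned int irq;   /* pre-assigned pin IRQ, or MSI vector after enable */
    int msi_capable;    /* hypothetical flag: device has an MSI capability */
};

typedef void (*sketch_handler_t)(int irq, void *dev_id);

static unsigned int sketch_registered_irq;  /* what the request_irq stub saw */

/* Stub for pci_enable_msi(): on success dev->irq becomes an MSI vector. */
static int sketch_pci_enable_msi(struct sketch_pci_dev *dev)
{
    if (!dev->msi_capable)
        return -1;          /* device stays in pin-IRQ assertion mode */
    dev->irq = 193;         /* arbitrary illustrative vector number */
    return 0;
}

/* Stub for request_irq() with a simplified, non-kernel signature. */
static int sketch_request_irq(unsigned int irq, sketch_handler_t handler,
                              const char *name, void *dev_id)
{
    (void)handler; (void)name; (void)dev_id;
    sketch_registered_irq = irq;
    return 0;
}

static void sketch_isr(int irq, void *dev_id)
{
    (void)irq; (void)dev_id;
}

/* Probe path: enable MSI first, then register the handler on dev->irq. */
static int sketch_probe(struct sketch_pci_dev *dev)
{
    if (sketch_pci_enable_msi(dev) != 0) {
        /* MSI unavailable: dev->irq still holds the pin IRQ, so carry on. */
    }
    return sketch_request_irq(dev->irq, sketch_isr, "sketch", dev);
}
```

Calling the two functions in the opposite order would register the
handler on the old pin IRQ while the device signals on the new vector.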

The diagram below shows the events that switch the interrupt
mode on the MSI-capable device function between MSI mode and
PIN-IRQ assertion mode.

         ------------   pci_enable_msi   ------------------------
        |            | <=============== |                        |
        | MSI MODE   |                  | PIN-IRQ ASSERTION MODE |
        |            | ===============> |                        |
         ------------   free_irq         ------------------------

5.2 Configuring for MSI support

Due to the non-contiguous fashion in vector assignment of the
existing Linux kernel, this patch does not support multiple
messages, regardless of whether the device function is capable of
supporting more than one vector. The bus driver initializes only
entry 0 of this capability if pci_enable_msi(...) is called
successfully by the device driver.

5.3 Configuring for MSI-X support

Both the MSI capability structure and the MSI-X capability structure
share the semantics described above; however, because system software
can configure each vector of the MSI-X capability structure with an
independent message address and message data, the non-contiguous
fashion in vector assignment of the existing Linux kernel has no
impact on supporting multiple messages on an MSI-X capable device
function. By default, as mentioned above, ONE vector is always
allocated to the MSI-X capability structure at entry 0. The bus
driver does not initialize other entries of the MSI-X table.

Note that the PCI subsystem should have full control of an MSI-X
table that resides in Memory Space. The software device driver
should not access this table.

To request additional vectors, the device software driver should
call the function msi_alloc_vectors(). It is recommended that the
software driver call this function once, during the initialization
phase of the device driver.

The function msi_alloc_vectors(), once invoked, enables either
all or nothing, depending on the current availability of vector
resources. If no vector resources are available, the device function
still works with ONE vector. If the vector resources are available
for the number of vectors requested by the driver, this function
will reconfigure the MSI-X capability structure of the device with
additional messages, starting from entry 1. The number requested
need not match the device's capacity: for example, a device may be
capable of supporting a maximum of 32 vectors while its software
driver typically requests only 4.

For each vector, after this successful call, the device driver is
responsible for calling other functions like request_irq(),
enable_irq(), etc. to enable this vector with its corresponding
interrupt service handler. It is the device driver's choice whether
all vectors share the same interrupt service handler or each vector
has a unique interrupt service handler.

In addition to the function msi_alloc_vectors(), another function,
msi_free_vectors(), is provided to allow the software driver to
release a number of vectors back to the vector resources. Once
invoked, the PCI subsystem disables (masks) each vector released.
These vectors are no longer valid for the hardware device and its
software driver to use. As with free_irq, it is recommended that the
device driver also call msi_free_vectors to release all additional
vectors previously requested.

int msi_alloc_vectors(struct pci_dev *dev, int *vector, int nvec)

This API enables the software driver to request additional messages
from the PCI subsystem. Depending on the number of vectors
available, the PCI subsystem enables either all or nothing.

Argument dev points to the device (pci_dev) structure.
Argument vector points to an array of integers. The number of
elements is indicated in argument nvec.
Argument nvec is an integer indicating the number of messages
requested.
A return of zero indicates that the requested number of vectors
was successfully allocated. Otherwise, it indicates that resources
are not available.

int msi_free_vectors(struct pci_dev* dev, int *vector, int nvec)

This API enables the software driver to inform the PCI subsystem
that it is willing to release a number of vectors back to the
MSI resource pool. Once invoked, the PCI subsystem disables each
MSI-X entry associated with each vector stored in the vector
argument. These vectors are no longer valid for the hardware device
and its software driver to use.

Argument dev points to the device (pci_dev) structure.
Argument vector points to an array of integers. The number of
elements is indicated in argument nvec.
Argument nvec is an integer indicating the number of messages
released.
A return of zero indicates that the given number of vectors
was successfully released. Otherwise, it indicates a failure.
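
The all-or-nothing behaviour of the two calls above can be modelled
with a small user-space pool. This is only a sketch of the semantics
as described in this guide, not the patch's implementation: the pool
size, the base vector number and the names sketch_alloc_vectors() and
sketch_free_vectors() are assumptions, and the dev argument is omitted.

```c
#define SKETCH_POOL_SIZE 8     /* hypothetical number of spare vectors */
#define SKETCH_POOL_BASE 200   /* arbitrary first vector number */

static int sketch_pool_used[SKETCH_POOL_SIZE];

/* Allocate nvec vectors into vector[], or allocate none at all. */
static int sketch_alloc_vectors(int *vector, int nvec)
{
    int i, n = 0;

    if (nvec <= 0)
        return -1;
    for (i = 0; i < SKETCH_POOL_SIZE; i++)
        if (!sketch_pool_used[i])
            n++;
    if (n < nvec)
        return -1;              /* not enough: nothing is allocated */
    for (i = 0, n = 0; i < SKETCH_POOL_SIZE && n < nvec; i++) {
        if (!sketch_pool_used[i]) {
            sketch_pool_used[i] = 1;
            vector[n++] = SKETCH_POOL_BASE + i;
        }
    }
    return 0;
}

/* Release nvec vectors; fail without side effects if any is invalid. */
static int sketch_free_vectors(const int *vector, int nvec)
{
    int i, idx;

    if (nvec <= 0)
        return -1;
    for (i = 0; i < nvec; i++) {
        idx = vector[i] - SKETCH_POOL_BASE;
        if (idx < 0 || idx >= SKETCH_POOL_SIZE || !sketch_pool_used[idx])
            return -1;
    }
    for (i = 0; i < nvec; i++)
        sketch_pool_used[vector[i] - SKETCH_POOL_BASE] = 0;
    return 0;
}
```

A driver following the recommendation in this section would make one
allocation call during initialization and one matching release call
during teardown.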

5.4 Hardware requirements for MSI support
MSI support requires support from both system hardware and
individual hardware device functions.

5.4.1 System hardware support
Since the target of the MSI address is the local APIC, enabling
MSI support in the Linux kernel depends on whether the existing
system hardware supports a local APIC. Users should verify that
their system runs with CONFIG_X86_LOCAL_APIC=y.

In an SMP environment, CONFIG_X86_LOCAL_APIC is set automatically;
however, in a UP environment, users must set CONFIG_X86_LOCAL_APIC
manually. Once CONFIG_X86_LOCAL_APIC=y, setting
CONFIG_PCI_USE_VECTOR enables the VECTOR based scheme and
the option for MSI-capable device drivers to selectively enable
MSI (using pci_enable_msi as described above).

Note that the CONFIG_X86_IO_APIC setting is irrelevant because MSI
vectors are newly allocated at runtime and MSI support does not
depend on BIOS support. This key independence enables MSI support
on future IOxAPIC-free platforms.

5.4.2 Device hardware support
The hardware device function supports MSI by indicating the
MSI/MSI-X capability structure on its PCI capability list. By
default, this capability structure will not be initialized by
the kernel to enable MSI during system boot. In other words,
the device function runs in its default pin assertion mode.
Note that in many cases hardware supporting MSI has bugs,
which may result in a system hang. The software driver of specific
MSI-capable hardware is responsible for deciding whether to call
pci_enable_msi. A return of zero indicates that the kernel
successfully initialized the MSI/MSI-X capability structure of the
device function. The device function is then running in MSI mode.

5.5 How to tell whether MSI is enabled on a device function

At the driver level, a return of zero from pci_enable_msi(...)
indicates to the device driver that its device function is
initialized successfully and ready to run in MSI mode.

At the user level, users can use the command 'cat /proc/interrupts'
to display the vector allocated for the device and its interrupt
mode, as shown below.

           CPU0       CPU1
  0:     324639          0    IO-APIC-edge  timer
  1:       1186          0    IO-APIC-edge  i8042
  2:          0          0          XT-PIC  cascade
 12:       2797          0    IO-APIC-edge  i8042
 14:       6543          0    IO-APIC-edge  ide0
 15:          1          0    IO-APIC-edge  ide1
169:          0          0   IO-APIC-level  uhci-hcd
185:          0          0   IO-APIC-level  uhci-hcd
193:        138         10         PCI MSI  aic79xx
201:         30          0         PCI MSI  aic79xx
225:         30          0   IO-APIC-level  aic7xxx
233:         30          0   IO-APIC-level  aic7xxx
NMI:          0          0
LOC:     324553     325068
ERR:          0
MIS:          0
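
A user-level check of this output could look like the sketch below,
which classifies one line of /proc/interrupts at a time. It assumes
the interrupt mode is printed exactly as "PCI MSI", as in the listing
above; other kernels may label it differently. The function names are
hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Return 1 if the line reports a vector running in MSI mode, else 0. */
static int sketch_line_is_msi(const char *line)
{
    return strstr(line, "PCI MSI") != NULL;
}

/*
 * Return the vector number at the start of the line, or -1 for
 * non-numeric rows such as NMI, LOC, ERR and MIS.
 */
static long sketch_line_vector(const char *line)
{
    char *end;
    long v = strtol(line, &end, 10);   /* skips leading spaces */

    return (end != line && *end == ':') ? v : -1;
}
```

Applied to the listing above, vectors 193 and 201 (both aic79xx) are
reported as MSI, while 225 and 233 (aic7xxx) are not.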

6. FAQ

Q1. Are there any limitations on using MSI?

A1. If the PCI device supports MSI and conforms to the
specification and the platform supports the APIC local bus,
then using MSI should work.

Q2. Will it work on all the Pentium processors (P3, P4, Xeon,
AMD processors)? In P3, IPIs are transmitted on the APIC local
bus, and in P4 and Xeon they are transmitted on the system
bus. Are there any implications with this?

A2. MSI support enables a PCI device to send an inbound
memory write (0xfeexxxxx as target address) on its PCI bus
directly to the FSB. Since the message address has its
redirection hint bit cleared, it should work.
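
To make the 0xfeexxxxx target address concrete, the sketch below
composes a message address and message data value using the x86 field
layout documented in Intel's architecture manuals: destination ID in
address bits 19:12, redirection hint in bit 3, destination mode in
bit 2, vector in data bits 7:0 and delivery mode in data bits 10:8.
The function names are hypothetical, and this is not the code the
patch uses.

```c
#include <stdint.h>

#define SKETCH_MSI_ADDR_BASE 0xfee00000u  /* address bits 31:20 = 0xfee */

/* Build the 32-bit message address for a target local APIC. */
static uint32_t sketch_msi_addr(uint8_t dest_apic_id,
                                int redirection_hint, int logical_mode)
{
    return SKETCH_MSI_ADDR_BASE
         | ((uint32_t)dest_apic_id << 12)
         | (redirection_hint ? (1u << 3) : 0u)
         | (logical_mode ? (1u << 2) : 0u);
}

/* Build the message data: edge-triggered, given vector and delivery mode. */
static uint16_t sketch_msi_data(uint8_t vector, uint8_t delivery_mode)
{
    return (uint16_t)(((delivery_mode & 0x7u) << 8) | vector);
}
```

With the redirection hint cleared, as in the answer above, a message
for APIC ID 1 uses address 0xfee01000, and vector 193 in fixed
delivery mode gives data 0x00c1.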

Q3. The target address 0xfeexxxxx will be translated by the
Host Bridge into an interrupt message. Are there any
limitations on the chipsets such as Intel 8xx, Intel e7xxx,
or VIA?

A3. If these chipsets support an inbound memory write with
the target address set to 0xfeexxxxx, in conformance with PCI
specification 2.3 or later, then it should work.

Q4. From the driver point of view, if the MSI is lost because
of errors that occur during the inbound memory write, then the
driver may wait forever. Is there a mechanism for it to recover?

A4. Since the target of the transaction is an inbound memory
write, all transaction termination conditions (Retry,
Master-Abort, Target-Abort, or normal completion) are
supported. A device sending an MSI must abide by all the PCI
rules and conditions regarding that inbound memory write. So,
if a retry is signaled it must retry, etc. We believe that
the recommendation for Abort is also a retry (refer to PCI
specification 2.3 or later).