                          The Linux IPMI Driver
                          ---------------------
                              Corey Minyard
                          <minyard@mvista.com>
                            <minyard@acm.org>

The Intelligent Platform Management Interface, or IPMI, is a
standard for controlling intelligent devices that monitor a system.
It provides for dynamic discovery of sensors in the system and the
ability to monitor the sensors and be informed when the sensors'
values change or go outside certain boundaries. It also has a
standardized database for field-replaceable units (FRUs) and a watchdog
timer.

To use this, you need an interface to an IPMI controller in your
system (called a Baseboard Management Controller, or BMC) and
management software that can use the IPMI system.

This document describes how to use the IPMI driver for Linux. If you
are not familiar with IPMI itself, see the web site at
http://www.intel.com/design/servers/ipmi/index.htm. IPMI is a big
subject and I can't cover it all here!

Configuration
-------------

The Linux IPMI driver is modular, which means you have to pick several
things to have it work right depending on your hardware. Most of
these are available in the 'Character Devices' menu.

No matter what, you must pick 'IPMI top-level message handler' to use
IPMI. What you do beyond that depends on your needs and hardware.

The message handler does not provide any user-level interfaces.
Kernel code (like the watchdog) can still use it. If you need access
from userland through a device driver, select 'Device interface for
IPMI'. Another interface is also available: select 'IPMI sockets' in
the 'Networking Support' main menu. This provides a socket interface
to IPMI. You may select both of these at the same time; they will
work together.

The driver interface depends on your hardware.
If you have a board with a standard interface (generally "KCS",
"SMIC", or "BT"; consult your hardware manual), choose the 'IPMI SI
handler' option. A driver also exists for direct I2C access to the
IPMI management controller. Some boards support this, but it is
unknown if it will work on every board. For this, choose 'IPMI SMBus
handler', but be prepared to experiment to see whether it works on
your board.

There is also a KCS-only driver interface supplied, but it is
deprecated in favor of the SI interface.

You should generally enable ACPI on your system, as systems with IPMI
should have ACPI tables describing them.

If you have a standard interface and the board manufacturer has done
their job correctly, the IPMI controller should be automatically
detected (via ACPI or SMBIOS tables) and should just work. Sadly, many
boards do not have this information. The driver attempts standard
defaults, but they may not work. If you fall into this situation, you
need to read the section below named 'The SI Driver' on how to
hand-configure your system.

IPMI defines a standard watchdog timer. You can enable this with the
'IPMI Watchdog Timer' config option. If you compile the driver into
the kernel, then via a kernel command-line option you can have the
watchdog timer start as soon as it initializes. It also has many
other options; see the 'Watchdog' section below for more details.
Note that you can also have the watchdog continue to run if it is
closed (by default it is disabled on close). Go into the 'Watchdog
Cards' menu, enable 'Watchdog Timer Support', and enable the option
'Disable watchdog shutdown on close'.


Basic Design
------------

The Linux IPMI driver is designed to be very modular and flexible; you
only need to take the pieces you need and you can use it in many
different ways. Because of that, it's broken into many chunks of
code.
These chunks are:

ipmi_msghandler - This is the central piece of software for the IPMI
system. It handles all messages, message timing, and responses. The
IPMI users tie into this, and the IPMI physical interfaces (called
System Management Interfaces, or SMIs) also tie in here. This
provides the kernelland interface for IPMI, but does not provide an
interface for use by application processes.

ipmi_devintf - This provides a userland IOCTL interface for the IPMI
driver. Each open file for this device ties in to the message handler
as an IPMI user.

ipmi_si - A driver for various system interfaces. This supports
KCS, SMIC, and may support BT in the future. Unless you have your own
custom interface, you probably need to use this.

ipmi_smb - A driver for accessing BMCs on the SMBus. It uses the
I2C kernel driver's SMBus interfaces to send and receive IPMI messages
over the SMBus.

af_ipmi - A network socket interface to IPMI. This doesn't take up
a character device in your system.

Note that the KCS-only interface has been removed.

Much documentation for the interface is in the include files. The
IPMI include files are:

net/af_ipmi.h - Contains the socket interface.

linux/ipmi.h - Contains the user interface and IOCTL interface for IPMI.

linux/ipmi_smi.h - Contains the interface for system management interfaces
(things that interface to IPMI controllers) to use.

linux/ipmi_msgdefs.h - General definitions for base IPMI messaging.


Addressing
----------

IPMI addressing works much like IP addressing: you have an overlay
to handle the different address types. The overlay is:

  struct ipmi_addr
  {
        int   addr_type;
        short channel;
        char  data[IPMI_MAX_ADDR_SIZE];
  };

The addr_type determines what the address really is. The driver
currently understands two different types of addresses.

"System Interface" addresses are defined as:

  struct ipmi_system_interface_addr
  {
        int   addr_type;
        short channel;
  };

and the type is IPMI_SYSTEM_INTERFACE_ADDR_TYPE. This is used for talking
straight to the BMC on the current card. The channel must be
IPMI_BMC_CHANNEL.

Messages that are destined to go out on the IPMB bus use the
IPMI_IPMB_ADDR_TYPE address type. The format is:

  struct ipmi_ipmb_addr
  {
        int           addr_type;
        short         channel;
        unsigned char slave_addr;
        unsigned char lun;
  };

The "channel" here is generally zero, but some devices support more
than one channel. It corresponds to the channel as defined in the IPMI
spec.


Messages
--------

Messages are defined as:

  struct ipmi_msg
  {
        unsigned char netfn;
        unsigned char lun;
        unsigned char cmd;
        unsigned char *data;
        int           data_len;
  };

The driver takes care of adding/stripping the header information. The
data portion is just the data to be sent (do NOT put addressing info
here) or the response. Note that the completion code of a response is
the first item in "data"; it is not stripped out because that is how
all the messages are defined in the spec (and thus makes counting the
offsets a little easier :-).

When using the IOCTL interface from userland, you must provide a block
of data for "data", fill it, and set data_len to the length of the
block of data, even when receiving messages. Otherwise the driver
will have no place to put the message.

Messages coming up from the message handler in kernelland will come in
as:

  struct ipmi_recv_msg
  {
        struct list_head link;

        /* The type of message as defined in the "Receive Types"
           defines above. */
        int              recv_type;

        ipmi_user_t      *user;
        struct ipmi_addr addr;
        long             msgid;
        struct ipmi_msg  msg;

        /* Call this when done with the message.
           It will presumably free the message and do any other
           necessary cleanup. */
        void (*done)(struct ipmi_recv_msg *msg);

        /* Place-holder for the data, don't make any assumptions about
           the size or existence of this, since it may change. */
        unsigned char    msg_data[IPMI_MAX_MSG_LENGTH];
  };

You should look at the receive type and handle the message
appropriately.


The Upper Layer Interface (Message Handler)
-------------------------------------------

The upper layer of the interface provides the users with a consistent
view of the IPMI interfaces. It allows multiple SMI interfaces to be
addressed (because some boards actually have multiple BMCs on them)
and the user should not have to care what type of SMI is below them.


Creating the User

To use the message handler, you must first create a user using
ipmi_create_user(). The interface number specifies which SMI you want
to connect to, and you must supply callback functions to be called
when data comes in. The callback functions can run at interrupt level,
so be careful using the callbacks. This also allows you to pass in a
piece of data, the handler_data, that will be passed back to you on
all calls.

Once you are done, call ipmi_destroy_user() to get rid of the user.

From userland, opening the device automatically creates a user, and
closing the device automatically destroys the user.


Messaging

To send a message from kernel-land, the ipmi_request() call does
pretty much all message handling. Most of the parameters are
self-explanatory. However, it takes a "msgid" parameter. This is NOT
the sequence number of messages. It is simply a long value that is
passed back when the response for the message is returned. You may
use it for anything you like.
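
One common use for the msgid is to tie a response back to whatever
the caller was doing when it sent the request. The sketch below is an
illustration only, not driver code: it uses a local copy of the
struct ipmi_msg layout shown in the Messages section, fills in a Get
Device ID request (network function 0x06, command 0x01, as given in
the IPMI spec), and uses the msgid as an index into a small table
owned by the caller. Every demo_ and DEMO_ name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Local copy of the message layout from the Messages section.
   Real code would include <linux/ipmi.h> instead. */
struct ipmi_msg {
	unsigned char netfn;
	unsigned char lun;
	unsigned char cmd;
	unsigned char *data;
	int           data_len;
};

/* Values from the IPMI spec: the application network function
   (request form) and its Get Device ID command. */
#define DEMO_NETFN_APP_REQUEST 0x06
#define DEMO_CMD_GET_DEVICE_ID 0x01

/* Hypothetical bookkeeping: one slot per outstanding request. */
#define DEMO_MAX_PENDING 8

struct demo_pending {
	int        in_use;
	const char *what;	/* caller's own context for the request */
};

static struct demo_pending demo_table[DEMO_MAX_PENDING];

/* Fill in a Get Device ID request.  It carries no data bytes and no
   addressing information; the address is passed separately. */
static void demo_fill_get_device_id(struct ipmi_msg *msg)
{
	assert(msg != NULL);
	msg->netfn = DEMO_NETFN_APP_REQUEST;
	msg->lun = 0;
	msg->cmd = DEMO_CMD_GET_DEVICE_ID;
	msg->data = NULL;
	msg->data_len = 0;
}

/* Reserve a slot and return its index, which the caller would pass
   as the msgid.  Returns -1 if every slot is busy. */
static long demo_alloc_msgid(const char *what)
{
	long i;

	for (i = 0; i < DEMO_MAX_PENDING; i++) {
		if (!demo_table[i].in_use) {
			demo_table[i].in_use = 1;
			demo_table[i].what = what;
			return i;
		}
	}
	return -1;
}

/* Look up the context for the msgid carried by a response and free
   the slot.  Returns NULL if the msgid is not outstanding. */
static const char *demo_complete(long msgid)
{
	const char *what;

	if (msgid < 0 || msgid >= DEMO_MAX_PENDING ||
	    !demo_table[msgid].in_use)
		return NULL;
	what = demo_table[msgid].what;
	demo_table[msgid].in_use = 0;
	return what;
}
```

Since the callbacks may run at interrupt level, real kernel code
using a table like this would also need suitable locking around it,
which the sketch leaves out.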

Responses come back in the function pointed to by the ipmi_recv_hndl
field of the "handler" that you passed in to ipmi_create_user().
Remember again, these may be running at interrupt level. Remember to
look at the receive type, too.

From userland, you fill out an ipmi_req_t structure and use the
IPMICTL_SEND_COMMAND ioctl. For incoming stuff, you can use select()
or poll() to wait for messages to come in. However, you cannot use
read() to get them; you must call the IPMICTL_RECEIVE_MSG ioctl with
the ipmi_recv_t structure to actually get the message. Remember that
you must supply a pointer to a block of data in the msg.data field,
and you must fill in the msg.data_len field with the size of the data.
This gives the receiver a place to actually put the message.

If the message cannot fit into the data you provide, you will get an
EMSGSIZE error and the driver will leave the data in the receive
queue. If you want to get it and have it truncate the message, use
the IPMICTL_RECEIVE_MSG_TRUNC ioctl.

When you send a command (which is defined by the lowest-order bit of
the netfn per the IPMI spec) on the IPMB bus, the driver will
automatically assign the sequence number to the command and save the
command. If the response is not received in the IPMI-specified 5
seconds, it will generate a response automatically saying the command
timed out. If an unsolicited response comes in (if it was after 5
seconds, for instance), that response will be ignored.

In kernelland, after you receive a message and are done with it, you
MUST call ipmi_free_recv_msg() on it, or you will leak messages. Note
that you should NEVER mess with the "done" field of a message; that is
required to properly clean up the message.

Note that when sending, there is an ipmi_request_supply_msgs() call
that lets you supply the smi and receive message.
This is useful for pieces of code that need to work even if the
system is out of buffers (the watchdog timer uses this, for
instance). You supply your own buffer and own free routines. This is
not recommended for normal use, though, since it is tricky to manage
your own buffers.


Events and Incoming Commands

The driver takes care of polling for IPMI events and receiving
commands (commands are messages that are not responses; they are
commands that other things on the IPMB bus have sent you). To receive
these, you must register for them; they will not automatically be sent
to you.

To receive events, you must call ipmi_set_gets_events() and set the
"val" to non-zero. Any events that have been received by the driver
since startup will immediately be delivered to the first user that
registers for events. After that, if multiple users are registered
for events, they will all receive all events that come in.

For receiving commands, you have to individually register commands you
want to receive. Call ipmi_register_for_cmd() and supply the netfn
and command for each command you want to receive. Only one user
may be registered for each netfn/cmd, but different users may register
for different commands.

From userland, equivalent IOCTLs are provided to do these functions.


The Lower Layer (SMI) Interface
-------------------------------

As mentioned before, multiple SMI interfaces may be registered to the
message handler. Each of these is assigned an interface number when
it registers with the message handler. They are generally assigned
in the order they register, although if an SMI unregisters and then
another one registers, all bets are off.

The ipmi_smi.h include file defines the interface for management
interfaces; see it for more details.
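
The command registration rule from 'Events and Incoming Commands'
above (only one user per netfn/cmd pair, but different users may own
different pairs) can be pictured as a small lookup table. The sketch
below is a user-space illustration under that one assumption; it is
not how the message handler is implemented, users are represented by
plain non-negative integers, and every demo_ name is hypothetical.

```c
#include <stddef.h>

#define DEMO_MAX_REGS 16	/* arbitrary table size for this sketch */

/* One registration: which user owns a given netfn/cmd pair. */
struct demo_cmd_reg {
	int           in_use;
	unsigned char netfn;
	unsigned char cmd;
	int           user;	/* hypothetical user id, must be >= 0 */
};

static struct demo_cmd_reg demo_regs[DEMO_MAX_REGS];

/* Return the user registered for netfn/cmd, or -1 if nobody is. */
static int demo_find_user(unsigned char netfn, unsigned char cmd)
{
	size_t i;

	for (i = 0; i < DEMO_MAX_REGS; i++) {
		if (demo_regs[i].in_use &&
		    demo_regs[i].netfn == netfn &&
		    demo_regs[i].cmd == cmd)
			return demo_regs[i].user;
	}
	return -1;
}

/* Register user for netfn/cmd.  Returns 0 on success, or -1 if the
   pair already has an owner or the table is full. */
static int demo_register_for_cmd(int user, unsigned char netfn,
				 unsigned char cmd)
{
	size_t i;

	if (demo_find_user(netfn, cmd) != -1)
		return -1;

	for (i = 0; i < DEMO_MAX_REGS; i++) {
		if (!demo_regs[i].in_use) {
			demo_regs[i].in_use = 1;
			demo_regs[i].netfn = netfn;
			demo_regs[i].cmd = cmd;
			demo_regs[i].user = user;
			return 0;
		}
	}
	return -1;
}
```

With this table, an incoming command is delivered to the user that
demo_find_user() returns, and a second attempt to register the same
pair is refused.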


The SI Driver
-------------

The SI driver allows up to 4 KCS or SMIC interfaces to be configured
in the system. By default, the driver scans the ACPI tables for
interfaces, and if it doesn't find any it will attempt to register
one KCS interface at the spec-specified I/O port 0xca2 without
interrupts. You can change this at module load time (for a module)
with:

  modprobe ipmi_si type=<type1>,<type2>...
       ports=<port1>,<port2>... addrs=<addr1>,<addr2>...
       irqs=<irq1>,<irq2>... trydefaults=[0|1]
       regspacings=<sp1>,<sp2>,... regsizes=<size1>,<size2>,...
       regshifts=<shift1>,<shift2>,...

Each of these except trydefaults is a list: the first item is for the
first interface, the second item for the second interface, etc.

The type may be either "kcs", "smic", or "bt". If you leave it blank, it
defaults to "kcs".

If you specify addrs as non-zero for an interface, the driver will
use the memory address given as the address of the device. This
overrides ports.

If you specify ports as non-zero for an interface, the driver will
use the I/O port given as the device address.

If you specify irqs as non-zero for an interface, the driver will
attempt to use the given interrupt for the device.

trydefaults sets whether the standard IPMI interface at 0xca2 and
any interfaces specified by ACPI are tried. By default, the driver
tries them; set this value to zero to turn this off.

The next three parameters have to do with register layout. The
registers used by the interfaces may not appear at successive
locations and they may not be in 8-bit registers. These parameters
allow the layout of the data in the registers to be more precisely
specified.

The regspacings parameter gives the number of bytes between successive
register start addresses.
For instance, if regspacings is set to 4 and the start address is
0xca2, then the address for the second register would be 0xca6. This
defaults to 1.

The regsizes parameter gives the size of a register, in bytes. The
data used by IPMI is 8 bits wide, but it may be inside a larger
register. This parameter allows the read and write type to be
specified. It may be 1, 2, 4, or 8. The default is 1.

Since the register size may be larger than 8 bits, the IPMI data may not
be in the lower 8 bits. The regshifts parameter gives the amount to shift
the data to get to the actual IPMI data.

When compiled into the kernel, the addresses can be specified on the
kernel command line as:

  ipmi_si.type=<type1>,<type2>...
       ipmi_si.ports=<port1>,<port2>... ipmi_si.addrs=<addr1>,<addr2>...
       ipmi_si.irqs=<irq1>,<irq2>... ipmi_si.trydefaults=[0|1]
       ipmi_si.regspacings=<sp1>,<sp2>,...
       ipmi_si.regsizes=<size1>,<size2>,...
       ipmi_si.regshifts=<shift1>,<shift2>,...

It works the same as the module parameters of the same names.

By default, the driver will attempt to detect any device specified by
ACPI, and if none is found, a KCS device at the spec-specified
0xca2. If you want to turn this off, set the "trydefaults" option to
false.

If you have high-res timers compiled into the kernel, the driver will
use them to provide much better performance. Note that if you do not
have high-res timers enabled in the kernel and you don't have
interrupts enabled, the driver will run VERY slowly. Don't blame me,
these interfaces suck.


The SMBus Driver
----------------

The SMBus driver allows up to 4 SMBus devices to be configured in the
system. By default, the driver will register any SMBus interfaces it finds
in the I2C address range of 0x20 to 0x4f on any adapter.
You can change this at module load time (for a module) with:

  modprobe ipmi_smb
	addr=<adapter1>,<i2caddr1>[,<adapter2>,<i2caddr2>[,...]]
	dbg=<flags1>,<flags2>...
	[defaultprobe=0] [dbg_probe=1]

The addresses are specified in pairs: the first is the adapter ID and the
second is the I2C address on that adapter.

The debug flags are bit flags for each BMC found. They are:
IPMI messages: 1, driver state: 2, timing: 4, I2C probe: 8

Setting defaultprobe to zero disables the default probing of SMBus
interfaces at address range 0x20 to 0x4f. This means that only the
BMCs specified on the addr line will be detected.

Setting dbg_probe to 1 will enable debugging of the probing and
detection process for BMCs on the SMBusses.

Discovering the IPMI-compliant BMC on the SMBus can cause devices
on the I2C bus to fail. The SMBus driver writes a "Get Device ID" IPMI
message as a block write to the I2C bus and waits for a response.
This action can be detrimental to some I2C devices. It is highly recommended
that the known I2C address be given to the SMBus driver in the addr
parameter. The default address range will not be used when an addr
parameter is provided.

When compiled into the kernel, the addresses can be specified on the
kernel command line as:

  ipmi_smb.addr=<adapter1>,<i2caddr1>[,<adapter2>,<i2caddr2>[,...]]
	ipmi_smb.dbg=<flags1>,<flags2>...
	ipmi_smb.defaultprobe=0 ipmi_smb.dbg_probe=1

These are the same options as on the module command line.

Note that you might need some I2C changes if CONFIG_IPMI_PANIC_EVENT
is enabled along with this, so the I2C driver knows to run to
completion while sending a panic event.


Other Pieces
------------

Watchdog
--------

A watchdog timer is provided that implements the Linux-standard
watchdog timer interface.
It has several module parameters that can be used to control it:

  modprobe ipmi_watchdog timeout=<t> pretimeout=<t> action=<action type>
      preaction=<preaction type> preop=<preop type> start_now=x
      nowayout=x

The timeout is the number of seconds to the action, and the pretimeout
is the number of seconds before the reset that the pre-timeout panic will
occur (if pretimeout is zero, then pretimeout will not be enabled). Note
that the pretimeout is the time before the final timeout. So if the
timeout is 50 seconds and the pretimeout is 10 seconds, then the pretimeout
will occur in 40 seconds (10 seconds before the timeout).

The action may be "reset", "power_cycle", or "power_off". It
specifies what to do when the timer times out, and defaults to
"reset".

The preaction may be "pre_smi" for an indication through the SMI
interface, "pre_int" for an indication through the SMI with an
interrupt, and "pre_nmi" for an NMI on a preaction. This is how
the driver is informed of the pretimeout.

The preop may be set to "preop_none" for no operation on a pretimeout,
"preop_panic" to set the preoperation to panic, or "preop_give_data"
to provide data to read from the watchdog device when the pretimeout
occurs. A "pre_nmi" setting CANNOT be used with "preop_give_data"
because you can't do data operations from an NMI.

When preop is set to "preop_give_data", one byte comes ready to read
on the device when the pretimeout occurs. Select and fasync work on
the device, as well.

If start_now is set to 1, the watchdog timer will start running as
soon as the driver is loaded.

If nowayout is set to 1, the watchdog timer will not stop when the
watchdog device is closed. The default value of nowayout is true
if the CONFIG_WATCHDOG_NOWAYOUT option is enabled, or false if not.
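
Two of the rules above are easy to get wrong when picking values: the
pretimeout counts back from the final timeout, and "pre_nmi" cannot
be combined with "preop_give_data". The helpers below are a sketch
for checking a set of values before loading the module. They are not
part of the driver, the demo_ names are hypothetical, and treating a
pretimeout that is not shorter than the timeout as invalid is an
assumption of this sketch rather than something stated above.

```c
#include <string.h>

/* Seconds after the timer is armed at which the pretimeout fires.
   Returns -1 if the pretimeout is disabled (zero) or if the values
   are rejected by this sketch (negative, or not shorter than the
   timeout; the latter check is an assumption, see above). */
static int demo_pretimeout_fires_at(int timeout, int pretimeout)
{
	if (pretimeout == 0)
		return -1;
	if (pretimeout < 0 || pretimeout >= timeout)
		return -1;
	return timeout - pretimeout;
}

/* Returns 1 if the preaction/preop pair is allowed, 0 if it is the
   forbidden "pre_nmi" with "preop_give_data" combination. */
static int demo_preop_allowed(const char *preaction, const char *preop)
{
	if (strcmp(preaction, "pre_nmi") == 0 &&
	    strcmp(preop, "preop_give_data") == 0)
		return 0;
	return 1;
}
```

For the example in the text, demo_pretimeout_fires_at(50, 10) gives
40, matching the pretimeout occurring 10 seconds before the timeout.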

When compiled into the kernel, the kernel command line is available
for configuring the watchdog:

  ipmi_watchdog.timeout=<t> ipmi_watchdog.pretimeout=<t>
	ipmi_watchdog.action=<action type>
	ipmi_watchdog.preaction=<preaction type>
	ipmi_watchdog.preop=<preop type>
	ipmi_watchdog.start_now=x
	ipmi_watchdog.nowayout=x

The options are the same as the module parameter options.

The watchdog will panic and start a 120 second reset timeout if it
gets a pre-action. During a panic or a reboot, the watchdog will
start a 120 second timer if it is running to make sure the reboot
occurs.

Note that if you use the NMI preaction for the watchdog, you MUST
NOT use nmi watchdog mode 1. If you use the NMI watchdog, you
must use mode 2.

Once you open the watchdog timer, you must write a 'V' character to the
device to close it, or the timer will not stop. This is a new semantic
for the driver, but makes it consistent with the rest of the watchdog
drivers in Linux.
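
The close sequence described above can be wrapped in a small
user-space helper. This is a minimal sketch, assuming the caller has
already opened the watchdog device for writing and holds its file
descriptor; the helper name is hypothetical.

```c
#include <unistd.h>

/* Write the 'V' character to the watchdog and then close it, so
   that the timer stops.  Returns 0 on success, -1 if the write or
   the close failed.  The descriptor is closed in either case. */
static int demo_watchdog_magic_close(int fd)
{
	const char magic = 'V';
	int ok = (write(fd, &magic, 1) == 1);

	if (close(fd) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

A caller would typically get the descriptor from
open("/dev/watchdog", O_WRONLY); that path is the conventional Linux
watchdog device node and is an assumption here, since this document
does not name it. If nowayout is set, the timer keeps running even
after this sequence.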

