			Dynamic DMA mapping
			===================

		 David S. Miller <davem@redhat.com>
		 Richard Henderson <rth@cygnus.com>
		  Jakub Jelinek <jakub@redhat.com>

Most of the 64-bit platforms have special hardware that translates bus
addresses (DMA addresses) into physical addresses. This is similar to
how page tables and/or a TLB translate virtual addresses to physical
addresses on a CPU. This is needed so that e.g. PCI devices can
access with a Single Address Cycle (32-bit DMA address) any page in the
64-bit physical address space. Previously in Linux those 64-bit
platforms had to set artificial limits on the maximum RAM size in the
system, so that the virt_to_bus() static scheme works (the DMA address
translation tables were simply filled on bootup to map each bus
address to the physical page __pa(bus_to_virt())).

So that Linux can use the dynamic DMA mapping, it needs some help from the
drivers, namely it has to take into account that DMA addresses should be
mapped only for the time they are actually used and unmapped after the DMA
transfer.

The following API will work of course even on platforms where no such
hardware exists, see e.g. include/asm-i386/pci.h for how it is implemented on
top of the virt_to_bus interface.

First of all, you should make sure

#include <linux/pci.h>

is in your driver. This file provides the definition of the
dma_addr_t type (which can hold any valid DMA address for the platform),
which should be used everywhere you hold a DMA (bus) address
returned from the DMA mapping functions.

			What memory is DMA'able?

The first piece of information you must know is what kernel memory can
be used with the DMA mapping facilities. There has been an unwritten
set of rules regarding this, and this text is an attempt to finally
write them down.

If you acquired your memory via the page allocator
(i.e. __get_free_page*()) or the generic memory allocators
(i.e.
kmalloc() or kmem_cache_alloc()) then you may DMA to/from
that memory using the addresses returned from those routines.

This means specifically that you may _not_ use the memory/addresses
returned from vmalloc() for DMA. It is possible to DMA to the
_underlying_ memory mapped into a vmalloc() area, but this requires
walking page tables to get the physical addresses, and then
translating each of those pages back to a kernel address using
something like __va(). [ EDIT: Update this when we integrate
Gerd Knorr's generic code which does this. ]

This rule also means that you may not use kernel image addresses
(i.e. items in the kernel's data/text/bss segment, or your driver's)
nor may you use kernel stack addresses for DMA. Both of these items
might be mapped somewhere entirely different than the rest of physical
memory.

Also, this means that you cannot take the return of a kmap()
call and DMA to/from that. This is similar to vmalloc().

What about block I/O and networking buffers? The block I/O and
networking subsystems make sure that the buffers they use are valid
for you to DMA from/to.

			DMA addressing limitations

Does your device have any DMA addressing limitations? For example, is
your device only capable of driving the low order 24-bits of address
on the PCI bus for SAC DMA transfers? If so, you need to inform the
PCI layer of this fact.

By default, the kernel assumes that your device can address the full
32-bits in a SAC cycle. For a 64-bit DAC capable device, this needs
to be increased. And for a device with limitations, as discussed in
the previous paragraph, it needs to be decreased.

For correct operation, you must interrogate the PCI layer in your
device probe routine to see if the PCI controller on the machine can
properly support the DMA addressing limitation your device has.
It is good style to do this even if your device holds the default setting,
because this shows that you did think about these issues wrt. your
device.

The query is performed via a call to pci_set_dma_mask():

	int pci_set_dma_mask(struct pci_dev *pdev, u64 device_mask);

Here, pdev is a pointer to the PCI device struct of your device, and
device_mask is a bit mask describing which bits of a PCI address your
device supports. It returns zero if your card can perform DMA
properly on the machine given the address mask you provided.

If it returns non-zero, your device cannot perform DMA properly on
this platform, and attempting to do so will result in undefined
behavior. You must either use a different mask, or not use DMA.

This means that in the failure case, you have three options:

1) Use another DMA mask, if possible (see below).
2) Use some non-DMA mode for data transfer, if possible.
3) Ignore this device and do not initialize it.

It is recommended that your driver print a kernel KERN_WARNING message
when you end up performing either #2 or #3. In this manner, if a user
of your driver reports that performance is bad or that the device is not
even detected, you can ask them for the kernel messages to find out
exactly why.

The standard 32-bit addressing PCI device would do something like
this:

	if (pci_set_dma_mask(pdev, 0xffffffff)) {
		printk(KERN_WARNING
		       "mydev: No suitable DMA available.\n");
		goto ignore_this_device;
	}

Another common scenario is a 64-bit capable device. The approach
here is to try for 64-bit DAC addressing, but back down to a
32-bit mask should that fail. The PCI platform code may fail the
64-bit mask not because the platform is not capable of 64-bit
addressing. Rather, it may fail in this case simply because
32-bit SAC addressing is done more efficiently than DAC addressing.
Sparc64 is one platform which behaves in this way.

Here is how you would handle a 64-bit capable device which can drive
all 64-bits during a DAC cycle:

	int using_dac;

	if (!pci_set_dma_mask(pdev, 0xffffffffffffffff)) {
		using_dac = 1;
	} else if (!pci_set_dma_mask(pdev, 0xffffffff)) {
		using_dac = 0;
	} else {
		printk(KERN_WARNING
		       "mydev: No suitable DMA available.\n");
		goto ignore_this_device;
	}

If your 64-bit device is going to be an enormous consumer of DMA
mappings, this can be problematic since the DMA mappings are a
finite resource on many platforms. Please see the "DAC Addressing
for Address Space Hungry Devices" section near the end of this
document for how to handle this case.

Finally, if your device can only drive the low 24-bits of
address during PCI bus mastering you might do something like:

	if (pci_set_dma_mask(pdev, 0x00ffffff)) {
		printk(KERN_WARNING
		       "mydev: 24-bit DMA addressing not available.\n");
		goto ignore_this_device;
	}

When pci_set_dma_mask() is successful, and returns zero, the PCI layer
saves away this mask you have provided. The PCI layer will use this
information later when you make DMA mappings.

There is a case which we are aware of at this time, which is worth
mentioning in this documentation. If your device supports multiple
functions (for example a sound card provides playback and record
functions) and the various different functions have _different_
DMA addressing limitations, you may wish to probe each mask and
only provide the functionality which the machine can handle. It
is important that the last call to pci_set_dma_mask() be for the
most specific mask.

Here is pseudo-code showing how this might be done:

	#define PLAYBACK_ADDRESS_BITS	0xffffffff
	#define RECORD_ADDRESS_BITS	0x00ffffff

	struct my_sound_card *card;
	struct pci_dev *pdev;

	...
	/* pci_set_dma_mask() returns zero on success. */
	if (!pci_set_dma_mask(pdev, PLAYBACK_ADDRESS_BITS)) {
		card->playback_enabled = 1;
	} else {
		card->playback_enabled = 0;
		printk(KERN_WARNING "%s: Playback disabled due to DMA limitations.\n",
		       card->name);
	}
	if (!pci_set_dma_mask(pdev, RECORD_ADDRESS_BITS)) {
		card->record_enabled = 1;
	} else {
		card->record_enabled = 0;
		printk(KERN_WARNING "%s: Record disabled due to DMA limitations.\n",
		       card->name);
	}

A sound card was used as an example here because this genre of PCI
devices seems to be littered with ISA chips given a PCI front end,
and thus retaining the 16MB DMA addressing limitations of ISA.

			Types of DMA mappings

There are two types of DMA mappings:

- Consistent DMA mappings which are usually mapped at driver
  initialization, unmapped at the end and for which the hardware should
  guarantee that the device and the CPU can access the data
  in parallel and will see updates made by each other without any
  explicit software flushing.

  Think of "consistent" as "synchronous" or "coherent".

  Consistent DMA mappings are always SAC addressable. That is
  to say, consistent DMA addresses given to the driver will always
  be in the low 32-bits of the PCI bus space.

  Good examples of what to use consistent mappings for are:

	- Network card DMA ring descriptors.
	- SCSI adapter mailbox command data structures.
	- Device firmware microcode executed out of
	  main memory.

  The invariant these examples all require is that any CPU store
  to memory is immediately visible to the device, and vice
  versa. Consistent mappings guarantee this.

  IMPORTANT: Consistent DMA memory does not preclude the usage of
             proper memory barriers. The CPU may reorder stores to
             consistent memory just as it may normal memory.
             Example:
             if it is important for the device to see the first word
             of a descriptor updated before the second, you must do
             something like:

		desc->word0 = address;
		wmb();
		desc->word1 = DESC_VALID;

             in order to get correct behavior on all platforms.

- Streaming DMA mappings which are usually mapped for one DMA transfer,
  unmapped right after it (unless you use pci_dma_sync below) and for which
  hardware can optimize for sequential accesses.

  Think of "streaming" as "asynchronous" or "outside the coherency
  domain".

  Good examples of what to use streaming mappings for are:

	- Networking buffers transmitted/received by a device.
	- Filesystem buffers written/read by a SCSI device.

  The interfaces for using this type of mapping were designed in
  such a way that an implementation can make whatever performance
  optimizations the hardware allows. To this end, when using
  such mappings you must be explicit about what you want to happen.

Neither type of DMA mapping has alignment restrictions that come
from PCI, although some devices may have such restrictions.

		 Using Consistent DMA mappings.

To allocate and map large (PAGE_SIZE or so) consistent DMA regions,
you should do:

	dma_addr_t dma_handle;

	cpu_addr = pci_alloc_consistent(dev, size, &dma_handle);

where dev is a struct pci_dev *. You should pass NULL for PCI like buses
where devices don't have struct pci_dev (like ISA, EISA). This may be
called in interrupt context.

This argument is needed because the DMA translations may be bus
specific (and often are private to the bus which the device is attached
to).

Size is the length of the region you want to allocate, in bytes.

This routine will allocate RAM for that region, so it acts similarly to
__get_free_pages (but takes size instead of a page order).
If your driver needs regions sized smaller than a page, you may prefer
using the pci_pool interface, described below.

The consistent DMA mapping interfaces, for non-NULL dev, will always
return a DMA address which is SAC (Single Address Cycle) addressable.
Even if the device indicates (via PCI dma mask) that it may address
the upper 32-bits and thus perform DAC cycles, consistent allocation
will still only return 32-bit PCI addresses for DMA. This is true
of the pci_pool interface as well.

In fact, as mentioned above, all consistent memory provided by the
kernel DMA APIs is always SAC addressable.

pci_alloc_consistent returns two values: the virtual address which you
can use to access it from the CPU and dma_handle which you pass to the
card.

The cpu return address and the DMA bus master address are both
guaranteed to be aligned to the smallest PAGE_SIZE order which
is greater than or equal to the requested size. This invariant
exists (for example) to guarantee that if you allocate a chunk
which is smaller than or equal to 64 kilobytes, the extent of the
buffer you receive will not cross a 64K boundary.

To unmap and free such a DMA region, you call:

	pci_free_consistent(dev, size, cpu_addr, dma_handle);

where dev, size are the same as in the above call and cpu_addr and
dma_handle are the values pci_alloc_consistent returned to you.
This function may not be called in interrupt context.

If your driver needs lots of smaller memory regions, you can write
custom code to subdivide pages returned by pci_alloc_consistent,
or you can use the pci_pool API to do that. A pci_pool is like
a kmem_cache, but it uses pci_alloc_consistent not __get_free_pages.
Also, it understands common hardware constraints for alignment,
like queue heads needing to be aligned on N byte boundaries.
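
The alignment guarantee stated above for pci_alloc_consistent can be
checked with a little arithmetic. The following is a user-space sketch
only, not kernel code: SKETCH_PAGE_SIZE (assumed to be 4096 here) and
both helper names are hypothetical and exist purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed page size for this sketch; the real PAGE_SIZE is platform specific. */
#define SKETCH_PAGE_SIZE 4096UL

/*
 * Alignment implied by the rule "smallest PAGE_SIZE order which is
 * greater than or equal to the requested size": the page size doubled
 * until it covers size bytes.  The sketch only handles sizes up to 1GB.
 */
static unsigned long sketch_alloc_alignment(size_t size)
{
	unsigned long align = SKETCH_PAGE_SIZE;

	assert(size > 0 && size <= (1UL << 30));
	while (align < size)
		align <<= 1;
	return align;
}

/* Non-zero if the range [addr, addr + size) straddles a multiple of boundary. */
static int sketch_crosses_boundary(uint64_t addr, size_t size,
				   uint64_t boundary)
{
	assert(size > 0 && boundary > 0);
	return (addr / boundary) != ((addr + size - 1) / boundary);
}
```

For a 40000 byte request the sketch yields an alignment of 65536 bytes,
so a buffer that starts on any multiple of 65536 stays inside a single
64K window, which is the property the guarantee is meant to provide.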

Create a pci_pool like this:

	struct pci_pool *pool;

	pool = pci_pool_create(name, dev, size, align, alloc, flags);

The "name" is for diagnostics (like a kmem_cache name); dev and size
are as above. The device's hardware alignment requirement for this
type of data is "align" (which is expressed in bytes, and must be a
power of two). The flags are SLAB_ flags as you'd pass to
kmem_cache_create. Not all flags are understood, but SLAB_POISON may
help you find driver bugs. If you call this in a non-sleeping
context (e.g. in_interrupt is true or while holding SMP locks), pass
SLAB_ATOMIC. If your device has no boundary crossing restrictions,
pass 0 for alloc; passing 4096 says memory allocated from this pool
must not cross 4KByte boundaries (but in that case it may be better to
go for pci_alloc_consistent directly instead).

Allocate memory from a pci pool like this:

	cpu_addr = pci_pool_alloc(pool, flags, &dma_handle);

flags are SLAB_KERNEL if blocking is permitted (not in_interrupt nor
holding SMP locks), SLAB_ATOMIC otherwise. Like pci_alloc_consistent,
this returns two values, cpu_addr and dma_handle.

Free memory that was allocated from a pci_pool like this:

	pci_pool_free(pool, cpu_addr, dma_handle);

where pool is what you passed to pci_pool_alloc, and cpu_addr and
dma_handle are the values pci_pool_alloc returned. This function
may be called in interrupt context.

Destroy a pci_pool by calling:

	pci_pool_destroy(pool);

Make sure you've called pci_pool_free for all memory allocated
from a pool before you destroy the pool. This function may not
be called in interrupt context.
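
To get a feel for how "size", "align" and the boundary value passed as
"alloc" interact, here is a small user-space model. It is only a sketch
under stated assumptions and does not claim to mirror how pci_pool is
implemented: the helper names are hypothetical, and it simply assumes
that blocks are laid out back to back from the start of a chunk whose
length equals the boundary.

```c
#include <assert.h>
#include <stddef.h>

/* Round size up to the next multiple of align (align must be a power of two). */
static size_t sketch_round_up(size_t size, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (size + align - 1) & ~(align - 1);
}

/*
 * Number of blocks of 'size' bytes, each starting on an 'align' byte
 * multiple, that fit into one chunk of 'boundary' bytes without any
 * block running past the end of the chunk.  Returns 0 if not even one
 * block fits.
 */
static size_t sketch_blocks_per_chunk(size_t size, size_t align,
				      size_t boundary)
{
	size_t stride = sketch_round_up(size, align);

	assert(size > 0 && boundary > 0);
	if (size > boundary)
		return 0;
	/* Block k starts at k * stride and must end at or before boundary. */
	return (boundary - size) / stride + 1;
}
```

With 24 byte objects aligned to 16 bytes and a 4096 byte boundary, the
model packs 128 blocks per chunk. A block as large as the boundary fits
exactly once, which is why going to pci_alloc_consistent directly may be
the better choice in that situation.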

			DMA Direction

The interfaces described in subsequent portions of this document
take a DMA direction argument, which is an integer and takes on
one of the following values:

 PCI_DMA_BIDIRECTIONAL
 PCI_DMA_TODEVICE
 PCI_DMA_FROMDEVICE
 PCI_DMA_NONE

You should provide the exact DMA direction if you know it.

PCI_DMA_TODEVICE means "from main memory to the PCI device"
PCI_DMA_FROMDEVICE means "from the PCI device to main memory"
It is the direction in which the data moves during the DMA
transfer.

You are _strongly_ encouraged to specify this as precisely
as you possibly can.

If you absolutely cannot know the direction of the DMA transfer,
specify PCI_DMA_BIDIRECTIONAL. It means that the DMA can go in
either direction. The platform guarantees that you may legally
specify this, and that it will work, but this may be at the
cost of performance for example.

The value PCI_DMA_NONE is to be used for debugging. You can
hold this in a data structure before you come to know the
precise direction, and this will help catch cases where your
direction tracking logic has failed to set things up properly.

Another advantage of specifying this value precisely (outside of
potential platform-specific optimizations of such) is for debugging.
Some platforms actually have a write permission boolean which DMA
mappings can be marked with, much like page protections in the user
program address space. Such platforms can and do report errors in the
kernel logs when the PCI controller hardware detects violation of the
permission setting.

Only streaming mappings specify a direction; consistent mappings
implicitly have a direction attribute setting of
PCI_DMA_BIDIRECTIONAL.
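
The debugging use of PCI_DMA_NONE described above might look roughly
like the following user-space sketch. Everything in it is hypothetical:
the enum merely stands in for the PCI_DMA_* values (its numeric values
are arbitrary), and struct sketch_buf is an invented piece of driver
bookkeeping.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the PCI_DMA_* direction values. */
enum sketch_dma_dir {
	SKETCH_DMA_BIDIRECTIONAL,
	SKETCH_DMA_TODEVICE,
	SKETCH_DMA_FROMDEVICE,
	SKETCH_DMA_NONE
};

/* Hypothetical per-buffer state kept by a driver. */
struct sketch_buf {
	void *ptr;
	unsigned long len;
	enum sketch_dma_dir dir;
};

/* A freshly set up buffer has no known direction yet. */
static void sketch_buf_init(struct sketch_buf *b, void *ptr, unsigned long len)
{
	assert(b != NULL);
	b->ptr = ptr;
	b->len = len;
	b->dir = SKETCH_DMA_NONE;
}

/*
 * Meant to be called just before the buffer would be mapped.  Returns 0
 * if a direction was assigned, -1 if the tracking logic never set one.
 */
static int sketch_buf_check_dir(const struct sketch_buf *b)
{
	return b->dir == SKETCH_DMA_NONE ? -1 : 0;
}
```

The point of the sentinel is that a forgotten assignment shows up as an
explicit failure at the check, rather than as a mapping silently made
with the wrong direction.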

The SCSI subsystem provides mechanisms for you to easily obtain
the direction to use, in the SCSI command:

	scsi_to_pci_dma_dir(SCSI_DIRECTION)

Where SCSI_DIRECTION is obtained from the 'sc_data_direction'
member of the SCSI command your driver is working on. The
mentioned interface above returns a value suitable for passing
into the streaming DMA mapping interfaces below.

For Networking drivers, it's a rather simple affair. For transmit
packets, map/unmap them with the PCI_DMA_TODEVICE direction
specifier. For receive packets, just the opposite, map/unmap them
with the PCI_DMA_FROMDEVICE direction specifier.

		  Using Streaming DMA mappings

The streaming DMA mapping routines can be called from interrupt
context. There are two versions of each map/unmap, one which will
map/unmap a single memory region, and one which will map/unmap a
scatterlist.

To map a single region, you do:

	struct pci_dev *pdev = mydev->pdev;
	dma_addr_t dma_handle;
	void *addr = buffer->ptr;
	size_t size = buffer->len;

	dma_handle = pci_map_single(pdev, addr, size, direction);

and to unmap it:

	pci_unmap_single(pdev, dma_handle, size, direction);

You should call pci_unmap_single when the DMA activity is finished, e.g.
from the interrupt which told you that the DMA transfer is done.

Using cpu pointers like this for single mappings has a disadvantage:
you cannot reference HIGHMEM memory in this way. Thus, there is a
map/unmap interface pair akin to pci_{map,unmap}_single. These
interfaces deal with page/offset pairs instead of cpu pointers.
Specifically:

	struct pci_dev *pdev = mydev->pdev;
	dma_addr_t dma_handle;
	struct page *page = buffer->page;
	unsigned long offset = buffer->offset;
	size_t size = buffer->len;

	dma_handle = pci_map_page(pdev, page, offset, size, direction);

	...

	pci_unmap_page(pdev, dma_handle, size, direction);

Here, "offset" means byte offset within the given page.

With scatterlists, you map a region gathered from several regions by:

	int i, count = pci_map_sg(dev, sglist, nents, direction);
	struct scatterlist *sg;

	for (i = 0, sg = sglist; i < count; i++, sg++) {
		hw_address[i] = sg_dma_address(sg);
		hw_len[i] = sg_dma_len(sg);
	}

where nents is the number of entries in the sglist.

The implementation is free to merge several consecutive sglist entries
into one (e.g. if DMA mapping is done with PAGE_SIZE granularity, any
consecutive sglist entries can be merged into one provided the first one
ends and the second one starts on a page boundary - in fact this is a huge
advantage for cards which either cannot do scatter-gather or have a very
limited number of scatter-gather entries) and returns the actual number
of sg entries it mapped them to.

Then you should loop count times (note: this can be less than nents times)
and use sg_dma_address() and sg_dma_len() macros where you previously
accessed sg->address and sg->length as shown above.

To unmap a scatterlist, just call:

	pci_unmap_sg(dev, sglist, nents, direction);

Again, make sure DMA activity has already finished.

PLEASE NOTE:  The 'nents' argument to the pci_unmap_sg call must be
              the _same_ one you passed into the pci_map_sg call,
	      it should _NOT_ be the 'count' value _returned_ from the
              pci_map_sg call.

Every pci_map_{single,sg} call should have its pci_unmap_{single,sg}
counterpart, because the bus address space is a shared resource (although
in some ports the mapping is per bus, so fewer devices contend for the
same bus address space) and you could render the machine unusable by eating
all bus addresses.
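
The difference between 'nents' and the returned 'count' is easy to get
wrong, so here is a user-space model of a map step that merges entries.
It is a deliberately simplified sketch: struct sketch_sg and
sketch_map_sg are hypothetical, the "DMA address" is assumed to equal
the CPU-side address, and two entries are merged only when the second
starts exactly where the first ends. A real platform may use quite
different merging rules.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified scatterlist entry for this sketch. */
struct sketch_sg {
	uint64_t addr;      /* CPU-side start of the segment */
	uint32_t len;       /* CPU-side length in bytes */
	uint64_t dma_addr;  /* filled in by the map step */
	uint32_t dma_len;   /* filled in by the map step */
};

/*
 * Model of a map step that merges adjacent segments.  An entry is folded
 * into the current DMA segment when it begins exactly where that segment
 * ends.  The DMA results land in the first 'count' entries; the return
 * value is 'count', which may be smaller than nents.
 */
static int sketch_map_sg(struct sketch_sg *sg, int nents)
{
	int i, count = 0;

	assert(sg != NULL && nents > 0);
	for (i = 0; i < nents; i++) {
		if (count > 0 &&
		    sg[count - 1].dma_addr + sg[count - 1].dma_len == sg[i].addr) {
			/* Contiguous with the previous DMA segment: extend it. */
			sg[count - 1].dma_len += sg[i].len;
		} else {
			/* Start a new DMA segment. */
			sg[count].dma_addr = sg[i].addr;
			sg[count].dma_len = sg[i].len;
			count++;
		}
	}
	return count;
}
```

With three entries of which the first two are contiguous, the model
returns a count of 2. A driver would program two entries into its
hardware, yet still pass the original nents of 3 to the unmap call.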

If you need to use the same streaming DMA region multiple times and touch
the data in between the DMA transfers, just map it with
pci_map_{single,sg}, and after each DMA transfer call either:

	pci_dma_sync_single(dev, dma_handle, size, direction);

or:

	pci_dma_sync_sg(dev, sglist, nents, direction);

as appropriate.

After the last DMA transfer call one of the DMA unmap routines
pci_unmap_{single,sg}. If you don't touch the data from the first pci_map_*
call till pci_unmap_*, then you don't have to call the pci_dma_sync_*
routines at all.

Here is pseudo code which shows a situation in which you would need
to use the pci_dma_sync_*() interfaces.

	void my_card_setup_receive_buffer(struct my_card *cp, char *buffer, int len)
	{
		dma_addr_t mapping;

		mapping = pci_map_single(cp->pdev, buffer, len, PCI_DMA_FROMDEVICE);

		cp->rx_buf = buffer;
		cp->rx_len = len;
		cp->rx_dma = mapping;

		give_rx_buf_to_card(cp);
	}

	...

	void my_card_interrupt_handler(int irq, void *devid, struct pt_regs *regs)
	{
		struct my_card *cp = devid;

		...
		if (read_card_status(cp) == RX_BUF_TRANSFERRED) {
			struct my_card_header *hp;

			/* Examine the header to see if we wish
			 * to accept the data. But synchronize
			 * the DMA transfer with the CPU first
			 * so that we see updated contents.
			 */
			pci_dma_sync_single(cp->pdev, cp->rx_dma, cp->rx_len,
					    PCI_DMA_FROMDEVICE);

			/* Now it is safe to examine the buffer. */
			hp = (struct my_card_header *) cp->rx_buf;
			if (header_is_ok(hp)) {
				pci_unmap_single(cp->pdev, cp->rx_dma, cp->rx_len,
						 PCI_DMA_FROMDEVICE);
				pass_to_upper_layers(cp->rx_buf);
				make_and_setup_new_rx_buf(cp);
			} else {
				/* Just give the buffer back to the card.
				 */
				give_rx_buf_to_card(cp);
			}
		}
	}

Drivers converted fully to this interface should not use virt_to_bus any
longer, nor should they use bus_to_virt. Some drivers have to be changed a
little bit, because there is no longer an equivalent to bus_to_virt in the
dynamic DMA mapping scheme - you have to always store the DMA addresses
returned by the pci_alloc_consistent, pci_pool_alloc, and pci_map_single
calls (pci_map_sg stores them in the scatterlist itself if the platform
supports dynamic DMA mapping in hardware) in your driver structures and/or
in the card registers.

All PCI drivers should be using these interfaces with no exceptions.
It is planned to completely remove virt_to_bus() and bus_to_virt() as
they are entirely deprecated. Some ports already do not provide these
as it is impossible to correctly support them.

		64-bit DMA and DAC cycle support

Do you understand all of the text above? Great, then you already
know how to use 64-bit DMA addressing under Linux. Simply make
the appropriate pci_set_dma_mask() calls based upon your card's
capabilities, then use the mapping APIs above.

It is that simple.

Well, not for some odd devices. See the next section for information
about that.

	DAC Addressing for Address Space Hungry Devices

There exists a class of devices which do not mesh well with the PCI
DMA mapping API. By definition these "mappings" are a finite
resource. The number of total available mappings per bus is platform
specific, but there will always be a reasonable amount.

What is "reasonable"? Reasonable means that networking and block I/O
devices need not worry about using too many mappings.

As an example of a problematic device, consider compute cluster cards.
They can potentially need to access gigabytes of memory at once via
DMA. Dynamic mappings are unsuitable for this kind of access pattern.

To this end we've provided a small API by which a device driver
may use DAC cycles to directly address all of physical memory.
Not all platforms support this, but most do. It is easy to determine
whether the platform will work properly at probe time.

First, understand that there may be a SEVERE performance penalty for
using these interfaces on some platforms. Therefore, you MUST only
use these interfaces if it is absolutely required. 99% of devices can
use the normal APIs without any problems.

Note that for streaming type mappings you must either use these
interfaces, or the dynamic mapping interfaces above. You may not mix
usage of both for the same device. Such an act is illegal and is
guaranteed to put a banana in your tailpipe.

However, consistent mappings may in fact be used in conjunction with
these interfaces. Remember that, as defined, consistent mappings are
always going to be SAC addressable.

The first thing your driver needs to do is query the PCI platform
layer with your device's DAC addressing capabilities:

	int pci_dac_set_dma_mask(struct pci_dev *pdev, u64 mask);

This routine behaves identically to pci_set_dma_mask. You may not
use the following interfaces if this routine fails.

Next, DMA addresses using this API are kept track of using the
dma64_addr_t type. It is guaranteed to be big enough to hold any
DAC address the platform layer will give to you from the following
routines. If you have consistent mappings as well, you still
use plain dma_addr_t to keep track of those.

All mappings obtained here will be direct. The mappings are not
translated, and this is the purpose of this dialect of the DMA API.

All routines work with page/offset pairs. This is the _ONLY_ way to
portably refer to any piece of memory.
If you have a cpu pointer
(which may be validly DMA'd too) you may easily obtain the page
and offset using something like this:

	struct page *page = virt_to_page(ptr);
	unsigned long offset = ((unsigned long)ptr & ~PAGE_MASK);

Here are the interfaces:

	dma64_addr_t pci_dac_page_to_dma(struct pci_dev *pdev,
					 struct page *page,
					 unsigned long offset,
					 int direction);

The DAC address for the tuple PAGE/OFFSET is returned. The direction
argument is the same as for pci_{map,unmap}_single(). The same rules
for cpu/device access apply here as for the streaming mapping
interfaces. To reiterate:

	The cpu may touch the buffer before pci_dac_page_to_dma.
	The device may touch the buffer after pci_dac_page_to_dma
	is made, but the cpu may NOT.

When the DMA transfer is complete, invoke:

	void pci_dac_dma_sync_single(struct pci_dev *pdev,
				     dma64_addr_t dma_addr,
				     size_t len, int direction);

This must be done before the CPU looks at the buffer again.
This interface behaves identically to pci_dma_sync_{single,sg}().

If you need to get back to the PAGE/OFFSET tuple from a dma64_addr_t
the following interfaces are provided:

	struct page *pci_dac_dma_to_page(struct pci_dev *pdev,
					 dma64_addr_t dma_addr);
	unsigned long pci_dac_dma_to_offset(struct pci_dev *pdev,
					    dma64_addr_t dma_addr);

This is possible with the DAC interfaces purely because they are
not translated in any way.

		Optimizing Unmap State Space Consumption

On many platforms, pci_unmap_{single,page}() is simply a nop.
Therefore, keeping track of the mapping address and length is a waste
of space. Instead of filling your drivers up with ifdefs and the like
to "work around" this (which would defeat the whole purpose of a
portable API) the following facilities are provided.

Actually, instead of describing the macros one by one, we'll
transform some example code.

1) Use DECLARE_PCI_UNMAP_{ADDR,LEN} in state saving structures.
   Example, before:

	struct ring_state {
		struct sk_buff *skb;
		dma_addr_t mapping;
		__u32 len;
	};

   after:

	struct ring_state {
		struct sk_buff *skb;
		DECLARE_PCI_UNMAP_ADDR(mapping)
		DECLARE_PCI_UNMAP_LEN(len)
	};

   NOTE: DO NOT put a semicolon at the end of the DECLARE_*()
         macro.

2) Use pci_unmap_{addr,len}_set to set these values.
   Example, before:

	ringp->mapping = FOO;
	ringp->len = BAR;

   after:

	pci_unmap_addr_set(ringp, mapping, FOO);
	pci_unmap_len_set(ringp, len, BAR);

3) Use pci_unmap_{addr,len} to access these values.
   Example, before:

	pci_unmap_single(pdev, ringp->mapping, ringp->len,
			 PCI_DMA_FROMDEVICE);

   after:

	pci_unmap_single(pdev,
			 pci_unmap_addr(ringp, mapping),
			 pci_unmap_len(ringp, len),
			 PCI_DMA_FROMDEVICE);

It really should be self-explanatory. We treat the ADDR and LEN
separately, because it is possible for an implementation to only
need the address in order to perform the unmap operation.

			Platform Issues

If you are just writing drivers for Linux and do not maintain
an architecture port for the kernel, you can safely skip down
to "Closing".

1) Struct scatterlist requirements.

   Struct scatterlist must contain, at a minimum, the following
   members:

	char *address;
	struct page *page;
	unsigned int offset;
	unsigned int length;

   The "address" member will disappear in 2.5.x.

   This means that your pci_{map,unmap}_sg() and all other
   interfaces dealing with scatterlists must be able to cope
   properly with page being non NULL.

   A scatterlist is in one of two states. The base address is
   either specified by "address" or by a "page+offset" pair.
   If "address" is NULL, then "page+offset" is being used.
   If "page" is NULL, then "address" is being used.

   In 2.5.x, all scatterlists will use "page+offset". But during
   2.4.x we still have to support the old method.

2) More to come...

			   Closing

This document, and the API itself, would not be in its current
form without the feedback and suggestions from numerous individuals.
We would like to specifically mention, in no particular order, the
following people:

	Russell King <rmk@arm.linux.org.uk>
	Leo Dagum <dagum@barrel.engr.sgi.com>
	Ralf Baechle <ralf@oss.sgi.com>
	Grant Grundler <grundler@cup.hp.com>
	Jay Estabrook <Jay.Estabrook@compaq.com>
	Thomas Sailer <sailer@ife.ee.ethz.ch>
	Andrea Arcangeli <andrea@suse.de>
	Jens Axboe <axboe@suse.de>
	David Mosberger-Tang <davidm@hpl.hp.com>

