An introduction to the videobuf layer
Jonathan Corbet <>
Current as of 2.6.33

The videobuf layer functions as a sort of glue layer between a V4L2 driver
and user space.  It handles the allocation and management of buffers for
the storage of video frames.  There is a set of functions which can be used
to implement many of the standard POSIX I/O system calls, including read(),
poll(), and, happily, mmap().  Another set of functions can be used to
implement the bulk of the V4L2 ioctl() calls related to streaming I/O,
including buffer allocation, queueing and dequeueing, and streaming
control.  Using videobuf imposes a few design decisions on the driver
author, but the payback comes in the form of reduced code in the driver and
a consistent implementation of the V4L2 user-space API.

Buffer types

Not all video devices use the same kind of buffers.  In fact, there are (at
least) three common variations:

 - Buffers which are scattered in both the physical and (kernel) virtual
   address spaces.  (Almost) all user-space buffers are like this, but it
   makes great sense to allocate kernel-space buffers this way as well when
   it is possible.  Unfortunately, it is not always possible; working with
   this kind of buffer normally requires hardware which can do
   scatter/gather DMA operations.

 - Buffers which are physically scattered, but which are virtually
   contiguous; buffers allocated with vmalloc(), in other words.  These
   buffers are just as hard to use for DMA operations, but they can be
   useful in situations where DMA is not available but virtually-contiguous
   buffers are convenient.

 - Buffers which are physically contiguous.  Allocation of this kind of
   buffer can be unreliable on fragmented systems, but simpler DMA
   controllers cannot deal with anything else.

Videobuf can work with all three types of buffers, but the driver author
must pick one at the outset and design the driver around that decision.

[It's worth noting that there's a fourth kind of buffer: "overlay" buffers
which are located within the system's video memory.  The overlay
functionality is considered to be deprecated for most use, but it still
shows up occasionally in system-on-chip drivers where the performance
benefits merit the use of this technique.  Overlay buffers can be handled
as a form of scattered buffer, but there are very few implementations in
the kernel and a description of this technique is currently beyond the
scope of this document.]

Data structures, callbacks, and initialization

Depending on which type of buffers are being used, the driver should
include one of the following files:

    <media/videobuf-dma-sg.h>           /* Physically scattered  */
    <media/videobuf-vmalloc.h>          /* vmalloc() buffers     */
    <media/videobuf-dma-contig.h>       /* Physically contiguous */

The driver's data structure describing a V4L2 device should include a
struct videobuf_queue instance for the management of the buffer queue,
along with a list_head for the queue of available buffers.  There will also
need to be an interrupt-safe spinlock which is used to protect (at least)
the queue.

The next step is to write four simple callbacks to help videobuf deal with
the management of buffers:

    struct videobuf_queue_ops {
        int (*buf_setup)(struct videobuf_queue *q,
                         unsigned int *count, unsigned int *size);
        int (*buf_prepare)(struct videobuf_queue *q,
                           struct videobuf_buffer *vb,
                           enum v4l2_field field);
        void (*buf_queue)(struct videobuf_queue *q,
                          struct videobuf_buffer *vb);
        void (*buf_release)(struct videobuf_queue *q,
                            struct videobuf_buffer *vb);
    };

buf_setup() is called early in the I/O process, when streaming is being
initiated; its purpose is to tell videobuf about the I/O stream.  The count
parameter will be a suggested number of buffers to use; the driver should
check it for rationality and adjust it if need be.  As a practical rule, a
minimum of two buffers is needed for proper streaming, and there is
usually a maximum (which cannot exceed 32) which makes sense for each
device.  The size parameter should be set to the expected (maximum) size
for each frame of data.
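
As a rough illustration of the count and size handling just described, here
is a minimal buf_setup() sketch.  It is written to compile in user space:
struct videobuf_queue is a cut-down stand-in for the kernel structure, and
struct mydev, its fields, and the MYDEV_* limits are hypothetical names
invented for this example.

```c
/* Stand-in for the kernel's struct videobuf_queue, reduced to the one
 * field this sketch uses. */
struct videobuf_queue {
	void *priv_data;	/* the priv pointer given at queue init time */
};

/* Hypothetical per-device state; not part of videobuf. */
struct mydev {
	unsigned int width, height;	/* current frame geometry */
	unsigned int bytes_per_pixel;
};

#define MYDEV_MIN_BUFS	2	/* needed for proper streaming */
#define MYDEV_MAX_BUFS	32	/* the count cannot exceed this */

static int mydev_buf_setup(struct videobuf_queue *q,
			   unsigned int *count, unsigned int *size)
{
	struct mydev *dev = q->priv_data;

	/* Expected (maximum) size of one frame of data. */
	*size = dev->width * dev->height * dev->bytes_per_pixel;

	/* Check the suggested count for rationality and adjust it. */
	if (*count < MYDEV_MIN_BUFS)
		*count = MYDEV_MIN_BUFS;
	if (*count > MYDEV_MAX_BUFS)
		*count = MYDEV_MAX_BUFS;
	return 0;
}
```

A real device may well want a tighter upper bound than 32, based on how
much memory each frame consumes.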

Each buffer (in the form of a struct videobuf_buffer pointer) will be
passed to buf_prepare(), which should set the buffer's size, width, height,
and field fields properly.  If the buffer's state field is
VIDEOBUF_NEEDS_INIT, the driver should pass it to:

    int videobuf_iolock(struct videobuf_queue *q, struct videobuf_buffer *vb,
                        struct v4l2_framebuffer *fbuf);

Among other things, this call will usually allocate memory for the buffer.
Finally, the buf_prepare() function should set the buffer's state to
VIDEOBUF_PREPARED.
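
The sequence above might look roughly like the following sketch.  It is a
user-space model, not driver code: the structures and enums are reduced
stand-ins for the kernel definitions, videobuf_iolock() is replaced by a
stub which merely counts its calls, and struct mydev is a hypothetical
device structure.  Passing NULL as fbuf is an assumption suitable for a
driver which does not use overlay buffers.

```c
#include <stddef.h>

/* Stand-ins for kernel definitions, reduced to what this sketch uses. */
enum v4l2_field { V4L2_FIELD_NONE = 1 };
enum videobuf_state { VIDEOBUF_NEEDS_INIT, VIDEOBUF_PREPARED };
struct v4l2_framebuffer;		/* opaque; only NULL is passed here */

struct videobuf_buffer {
	unsigned int width, height;
	unsigned long size;
	enum v4l2_field field;
	enum videobuf_state state;
};

struct videobuf_queue {
	void *priv_data;
};

/* Stub: the real call will usually allocate memory for the buffer. */
static int iolock_calls;
static int videobuf_iolock(struct videobuf_queue *q,
			   struct videobuf_buffer *vb,
			   struct v4l2_framebuffer *fbuf)
{
	(void)q; (void)vb; (void)fbuf;
	iolock_calls++;
	return 0;
}

/* Hypothetical per-device state. */
struct mydev {
	unsigned int width, height, bytes_per_pixel;
};

static int mydev_buf_prepare(struct videobuf_queue *q,
			     struct videobuf_buffer *vb,
			     enum v4l2_field field)
{
	struct mydev *dev = q->priv_data;
	int ret;

	vb->width = dev->width;
	vb->height = dev->height;
	vb->size = (unsigned long)dev->width * dev->height *
		   dev->bytes_per_pixel;
	vb->field = field;

	/* Only a buffer which has never been set up needs the iolock call. */
	if (vb->state == VIDEOBUF_NEEDS_INIT) {
		ret = videobuf_iolock(q, vb, NULL);
		if (ret)
			return ret;
	}
	vb->state = VIDEOBUF_PREPARED;
	return 0;
}
```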

When a buffer is queued for I/O, it is passed to buf_queue(), which should
put it onto the driver's list of available buffers and set its state to
VIDEOBUF_QUEUED.  Note that this function is called with the queue spinlock
held; if it tries to acquire it as well things will come to a screeching
halt.  Yes, this is the voice of experience.  Note also that videobuf may
wait on the first buffer in the queue; placing other buffers in front of it
could again gum up the works.  So use list_add_tail() to enqueue buffers.
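
To make the ordering requirement concrete, here is a small user-space
sketch of a buf_queue() callback.  The list helpers are minimal
re-implementations of the kernel primitives of the same names, the
videobuf structures are cut-down stand-ins, and struct mydev and
mydev_first_buffer() are hypothetical names used only for illustration.

```c
#include <stddef.h>

/* Minimal user-space stand-ins for the kernel's list primitives. */
struct list_head { struct list_head *next, *prev; };

static void INIT_LIST_HEAD(struct list_head *h)
{
	h->next = h->prev = h;
}

static void list_add_tail(struct list_head *new, struct list_head *head)
{
	new->prev = head->prev;
	new->next = head;
	head->prev->next = new;
	head->prev = new;
}

/* Stand-ins for videobuf definitions, reduced to what is used here. */
enum videobuf_state { VIDEOBUF_NEEDS_INIT, VIDEOBUF_PREPARED,
		      VIDEOBUF_QUEUED };

struct videobuf_buffer {
	unsigned int i;			/* buffer index */
	enum videobuf_state state;
	struct list_head queue;		/* links into the driver's list */
};

struct videobuf_queue { void *priv_data; };

/* Hypothetical per-device state holding the list of available buffers. */
struct mydev { struct list_head buffers; };

static void mydev_buf_queue(struct videobuf_queue *q,
			    struct videobuf_buffer *vb)
{
	struct mydev *dev = q->priv_data;

	/* The queue spinlock is already held here; do not take it again. */
	list_add_tail(&vb->queue, &dev->buffers);
	vb->state = VIDEOBUF_QUEUED;
}

/* Helper for the example: oldest queued buffer, or NULL if none. */
static struct videobuf_buffer *mydev_first_buffer(struct mydev *dev)
{
	if (dev->buffers.next == &dev->buffers)
		return NULL;
	return (struct videobuf_buffer *)((char *)dev->buffers.next -
			offsetof(struct videobuf_buffer, queue));
}
```

Because buffers are always added at the tail, the buffer which videobuf may
be waiting on stays at the head of the list.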

Finally, buf_release() is called when a buffer is no longer intended to be
used.  The driver should ensure that there is no I/O active on the buffer,
then pass it to the appropriate free routine(s):

    /* Scatter/gather drivers */
    int videobuf_dma_unmap(struct videobuf_queue *q,
                           struct videobuf_dmabuf *dma);
    int videobuf_dma_free(struct videobuf_dmabuf *dma);

    /* vmalloc drivers */
    void videobuf_vmalloc_free(struct videobuf_buffer *buf);

    /* Contiguous drivers */
    void videobuf_dma_contig_free(struct videobuf_queue *q,
                                  struct videobuf_buffer *buf);

One way to ensure that a buffer is no longer under I/O is to pass it to:

    int videobuf_waiton(struct videobuf_buffer *vb, int non_blocking, int intr);

Here, vb is the buffer, non_blocking indicates whether non-blocking I/O
should be used (it should be zero in the buf_release() case), and intr
controls whether an interruptible wait is used.
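
For a vmalloc-based driver, a buf_release() callback could be sketched as
follows.  This is a user-space model: the two videobuf calls are stubs
which only record the order in which they run, and the structures are
reduced stand-ins.  Resetting the state to VIDEOBUF_NEEDS_INIT afterward is
an assumption made here, so that a later buf_prepare() would set the buffer
up again; the choice of an uninterruptible wait (intr = 0) is likewise just
one option.

```c
/* Stand-ins for kernel definitions, reduced to what this sketch uses. */
enum videobuf_state { VIDEOBUF_NEEDS_INIT, VIDEOBUF_PREPARED,
		      VIDEOBUF_QUEUED, VIDEOBUF_ACTIVE, VIDEOBUF_DONE };

struct videobuf_buffer { enum videobuf_state state; };
struct videobuf_queue { void *priv_data; };

/* Stubs which only record the order in which they were called. */
static int call_seq, waiton_at, free_at;

static int videobuf_waiton(struct videobuf_buffer *vb,
			   int non_blocking, int intr)
{
	(void)vb; (void)non_blocking; (void)intr;
	waiton_at = ++call_seq;
	return 0;
}

static void videobuf_vmalloc_free(struct videobuf_buffer *buf)
{
	(void)buf;
	free_at = ++call_seq;
}

static void mydev_buf_release(struct videobuf_queue *q,
			      struct videobuf_buffer *vb)
{
	(void)q;

	/* Blocking wait (non_blocking = 0) until no I/O is active. */
	videobuf_waiton(vb, 0, 0);

	/* Only then hand the buffer to the free routine. */
	videobuf_vmalloc_free(vb);

	/* Assumption: force a fresh setup on the next buf_prepare(). */
	vb->state = VIDEOBUF_NEEDS_INIT;
}
```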

File operations

At this point, much of the work is done; much of the rest is slipping
videobuf calls into the implementation of the other driver callbacks.  The
first step is in the open() function, which must initialize the
videobuf queue.  The function to use depends on the type of buffer used:

    void videobuf_queue_sg_init(struct videobuf_queue *q,
                                struct videobuf_queue_ops *ops,
                                struct device *dev,
                                spinlock_t *irqlock,
                                enum v4l2_buf_type type,
                                enum v4l2_field field,
                                unsigned int msize,
                                void *priv);

    void videobuf_queue_vmalloc_init(struct videobuf_queue *q,
                                     struct videobuf_queue_ops *ops,
                                     struct device *dev,
                                     spinlock_t *irqlock,
                                     enum v4l2_buf_type type,
                                     enum v4l2_field field,
                                     unsigned int msize,
                                     void *priv);

    void videobuf_queue_dma_contig_init(struct videobuf_queue *q,
                                        struct videobuf_queue_ops *ops,
                                        struct device *dev,
                                        spinlock_t *irqlock,
                                        enum v4l2_buf_type type,
                                        enum v4l2_field field,
                                        unsigned int msize,
                                        void *priv);

In each case, the parameters are the same: q is the queue structure for the
device, ops is the set of callbacks as described above, dev is the device
structure for this video device, irqlock is an interrupt-safe spinlock to
protect access to the data structures, type is the buffer type used by the
device (cameras will use V4L2_BUF_TYPE_VIDEO_CAPTURE, for example), field
describes which field is being captured (often V4L2_FIELD_NONE for
progressive devices), msize is the size of any containing structure used
around struct videobuf_buffer, and priv is a private data pointer which
shows up in the priv_data field of struct videobuf_queue.  Note that these
are void functions which, evidently, are immune to failure.

V4L2 capture drivers can be written to support either of two APIs: the
read() system call and the rather more complicated streaming mechanism.  As
a general rule, it is necessary to support both to ensure that all
applications have a chance of working with the device.  Videobuf makes it
easy to do that with the same code.  To implement read(), the driver need
only make a call to one of:

    ssize_t videobuf_read_one(struct videobuf_queue *q,
                              char __user *data, size_t count,
                              loff_t *ppos, int nonblocking);

    ssize_t videobuf_read_stream(struct videobuf_queue *q,
                                 char __user *data, size_t count,
                                 loff_t *ppos, int vbihack, int nonblocking);

Either one of these functions will read frame data into data, returning the
amount actually read; the difference is that videobuf_read_one() will only
read a single frame, while videobuf_read_stream() will read multiple frames
if they are needed to satisfy the count requested by the application.  A
typical driver read() implementation will start the capture engine, call
one of the above functions, then stop the engine before returning (though a
smarter implementation might leave the engine running for a little while in
anticipation of another read() call happening in the near future).

The poll() function can usually be implemented with a direct call to:

    unsigned int videobuf_poll_stream(struct file *file,
                                      struct videobuf_queue *q,
                                      poll_table *wait);

Note that the actual wait queue eventually used will be the one associated
with the first available buffer.

When streaming I/O is done to kernel-space buffers, the driver must support
the mmap() system call to enable user space to access the data.  In many
V4L2 drivers, the often-complex mmap() implementation simplifies to a
single call to:

    int videobuf_mmap_mapper(struct videobuf_queue *q,
                             struct vm_area_struct *vma);

Everything else is handled by the videobuf code.

The release() function requires two separate videobuf calls:

    void videobuf_stop(struct videobuf_queue *q);
    int videobuf_mmap_free(struct videobuf_queue *q);

The call to videobuf_stop() terminates any I/O in progress - though it is
still up to the driver to stop the capture engine.  The call to
videobuf_mmap_free() will ensure that all buffers have been unmapped; if
so, they will all be passed to the buf_release() callback.  If buffers
remain mapped, videobuf_mmap_free() returns an error code instead.  The
purpose is clearly to cause the closing of the file descriptor to fail if
buffers are still mapped, but every driver in the 2.6.32 kernel cheerfully
ignores its return value.
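
A release() implementation along these lines might be sketched as shown
below.  Everything here is a user-space model: struct file and struct
videobuf_queue are reduced stand-ins (the mapped flag in particular is
invented for the stub), the two videobuf calls are stubs, struct mydev and
its engine_running flag are hypothetical, and the -EBUSY value returned by
the stub is an assumption.  Stopping the engine before calling
videobuf_stop(), and passing the videobuf_mmap_free() result back to the
caller rather than ignoring it, are choices made for this example only.

```c
#include <errno.h>

/* Stand-ins for kernel structures, reduced to what this sketch uses. */
struct videobuf_queue {
	int streaming;		/* I/O in progress */
	int mapped;		/* invented flag: buffers still mapped */
};

struct file { void *private_data; };

/* Stub: the real call terminates any I/O in progress. */
static void videobuf_stop(struct videobuf_queue *q)
{
	q->streaming = 0;
}

/* Stub: the real call fails if any buffer remains mapped. */
static int videobuf_mmap_free(struct videobuf_queue *q)
{
	return q->mapped ? -EBUSY : 0;
}

/* Hypothetical per-device state. */
struct mydev {
	struct videobuf_queue vb_queue;
	int engine_running;
};

static int mydev_release(struct file *file)
{
	struct mydev *dev = file->private_data;

	/* Stopping the capture engine is the driver's own job. */
	dev->engine_running = 0;

	videobuf_stop(&dev->vb_queue);

	/* Report (rather than discard) a failure to free the buffers. */
	return videobuf_mmap_free(&dev->vb_queue);
}
```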

ioctl() operations

The V4L2 API includes a very long list of driver callbacks to respond to
the many ioctl() commands made available to user space.  A number of these
- those associated with streaming I/O - turn almost directly into videobuf
calls.  The relevant helper functions are:

    int videobuf_reqbufs(struct videobuf_queue *q,
                         struct v4l2_requestbuffers *req);
    int videobuf_querybuf(struct videobuf_queue *q, struct v4l2_buffer *b);
    int videobuf_qbuf(struct videobuf_queue *q, struct v4l2_buffer *b);
    int videobuf_dqbuf(struct videobuf_queue *q, struct v4l2_buffer *b,
                       int nonblocking);
    int videobuf_streamon(struct videobuf_queue *q);
    int videobuf_streamoff(struct videobuf_queue *q);

So, for example, a VIDIOC_REQBUFS call turns into a call to the driver's
vidioc_reqbufs() callback which, in turn, usually only needs to locate the
proper struct videobuf_queue pointer and pass it to videobuf_reqbufs().
These support functions can replace a great deal of buffer management
boilerplate in a lot of V4L2 drivers.

The vidioc_streamon() and vidioc_streamoff() functions will be a bit more
complex, of course, since they will also need to deal with starting and
stopping the capture engine.
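
The pattern can be sketched in user space as follows.  The videobuf helpers
are stubs, the V4L2 structures are reduced stand-ins, and struct mydev with
its engine_running flag is hypothetical.  Locating the device through
file->private_data is an assumption about how the driver stored its state
in open(), and the ordering used here (videobuf first on streamon, engine
first on streamoff) is only one reasonable choice.

```c
#include <errno.h>
#include <stddef.h>

/* Stand-ins for kernel definitions, reduced to what this sketch uses. */
struct file { void *private_data; };
enum v4l2_buf_type { V4L2_BUF_TYPE_VIDEO_CAPTURE = 1 };
struct v4l2_requestbuffers { unsigned int count; };
struct videobuf_queue { int streaming; unsigned int nbufs; };

/* Stubs for the videobuf helpers. */
static int videobuf_reqbufs(struct videobuf_queue *q,
			    struct v4l2_requestbuffers *req)
{
	q->nbufs = req->count;
	return 0;
}

static int videobuf_streamon(struct videobuf_queue *q)
{
	q->streaming = 1;
	return 0;
}

static int videobuf_streamoff(struct videobuf_queue *q)
{
	q->streaming = 0;
	return 0;
}

/* Hypothetical per-device state. */
struct mydev {
	struct videobuf_queue vb_queue;
	int engine_running;
};

/* Locate the queue and hand the request straight to videobuf. */
static int mydev_vidioc_reqbufs(struct file *file, void *fh,
				struct v4l2_requestbuffers *req)
{
	struct mydev *dev = file->private_data;

	(void)fh;
	return videobuf_reqbufs(&dev->vb_queue, req);
}

static int mydev_vidioc_streamon(struct file *file, void *fh,
				 enum v4l2_buf_type type)
{
	struct mydev *dev = file->private_data;
	int ret;

	(void)fh;
	if (type != V4L2_BUF_TYPE_VIDEO_CAPTURE)
		return -EINVAL;
	ret = videobuf_streamon(&dev->vb_queue);
	if (ret)
		return ret;
	dev->engine_running = 1;	/* stand-in for starting hardware */
	return 0;
}

static int mydev_vidioc_streamoff(struct file *file, void *fh,
				  enum v4l2_buf_type type)
{
	struct mydev *dev = file->private_data;

	(void)fh;
	if (type != V4L2_BUF_TYPE_VIDEO_CAPTURE)
		return -EINVAL;
	dev->engine_running = 0;	/* stand-in for stopping hardware */
	return videobuf_streamoff(&dev->vb_queue);
}
```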

Buffer allocation

Thus far, we have talked about buffers, but have not looked at how they are
allocated.  The scatter/gather case is the most complex on this front.  For
allocation, the driver can leave buffer allocation entirely up to the
videobuf layer; in this case, buffers will be allocated as anonymous
user-space pages and will be very scattered indeed.  If the application is
using user-space buffers, no allocation is needed; the videobuf layer will
take care of calling get_user_pages() and filling in the scatterlist array.

If the driver needs to do its own memory allocation, it should be done in
the vidioc_reqbufs() function, *after* calling videobuf_reqbufs().  The
first step is a call to:

    struct videobuf_dmabuf *videobuf_to_dma(struct videobuf_buffer *buf);

The returned videobuf_dmabuf structure (defined in
<media/videobuf-dma-sg.h>) includes a couple of relevant fields:

    struct scatterlist  *sglist;
    int                 sglen;

The driver must allocate an appropriately-sized scatterlist array and
populate it with pointers to the pieces of the allocated buffer; sglen
should be set to the length of the array.

Drivers using the vmalloc() method need not (and cannot) concern themselves
with buffer allocation at all; videobuf will handle those details.  The
same is normally true of contiguous-DMA drivers as well; videobuf will
allocate the buffers (with dma_alloc_coherent()) when it sees fit.  That
means that these drivers may be trying to do high-order allocations at any
time, an operation which is not always guaranteed to work.  Some drivers
play tricks by allocating DMA space at system boot time; videobuf does not
currently play well with those drivers.

As of 2.6.31, contiguous-DMA drivers can work with a user-supplied buffer,
as long as that buffer is physically contiguous.  Normal user-space
allocations will not meet that criterion, but buffers obtained from other
kernel drivers, or those contained within huge pages, will work with these
drivers.

Filling the buffers

The final part of a videobuf implementation has no direct callback - it's
the portion of the code which actually puts frame data into the buffers,
usually in response to interrupts from the device.  For all types of
drivers, this process works approximately as follows:

 - Obtain the next available buffer and make sure that somebody is actually
   waiting for it.

 - Get a pointer to the memory and put video data there.

 - Mark the buffer as done and wake up the process waiting for it.

The first step above is done by looking at the driver-managed list_head
structure - the one which is filled in the buf_queue() callback.  Because
starting the engine and enqueueing buffers are done in separate steps, it's
possible for the engine to be running without any buffers available - in
the vmalloc() case especially.  So the driver should be prepared for the
list to be empty.  It is equally possible that nobody is yet interested in
the buffer; the driver should not remove it from the list or fill it until
a process is waiting on it.  That test can be done by examining the
buffer's done field (a wait_queue_head_t structure) with
waitqueue_active().

A buffer's state should be set to VIDEOBUF_ACTIVE before being mapped for
DMA; that ensures that the videobuf layer will not try to do anything with
it while the device is transferring data.

For scatter/gather drivers, the needed memory pointers will be found in the
scatterlist structure described above.  Drivers using the vmalloc() method
can get a memory pointer with:

    void *videobuf_to_vmalloc(struct videobuf_buffer *buf);

For contiguous DMA drivers, the function to use is:

    dma_addr_t videobuf_to_dma_contig(struct videobuf_buffer *buf);

The contiguous DMA API goes out of its way to hide the kernel-space address
of the DMA buffer from drivers.

The final step is to set the size field of the relevant videobuf_buffer
structure to the actual size of the captured image, set state to
VIDEOBUF_DONE, then call wake_up() on the done queue.  At this point, the
buffer is owned by the videobuf layer and the driver should not touch it
again.
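
Putting the steps of this section together for the vmalloc() case, a
frame-completion routine could be modeled as below.  This is a user-space
sketch only: the list and wait-queue helpers are simplified stand-ins for
the kernel primitives of the same names, the vaddr field and the stub
videobuf_to_vmalloc() stand in for the real buffer memory, and struct mydev
and mydev_frame_ready() are hypothetical.  A real driver would run this
with its interrupt-safe spinlock held, and a DMA-capable device would write
the data itself rather than use memcpy().

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for kernel list and wait-queue primitives. */
struct list_head { struct list_head *next, *prev; };

static int list_empty(const struct list_head *h)
{
	return h->next == h;
}

static void list_del(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
}

typedef struct { int waiters; int woken; } wait_queue_head_t;

static int waitqueue_active(wait_queue_head_t *wq)
{
	return wq->waiters > 0;
}

static void wake_up(wait_queue_head_t *wq)
{
	wq->woken = 1;
}

/* Stand-ins for videobuf definitions, reduced to what is used here. */
enum videobuf_state { VIDEOBUF_NEEDS_INIT, VIDEOBUF_PREPARED,
		      VIDEOBUF_QUEUED, VIDEOBUF_ACTIVE, VIDEOBUF_DONE };

struct videobuf_buffer {
	unsigned long size;
	enum videobuf_state state;
	struct list_head queue;
	wait_queue_head_t done;
	void *vaddr;		/* invented: memory behind the stub below */
};

static void *videobuf_to_vmalloc(struct videobuf_buffer *buf)
{
	return buf->vaddr;
}

/* Hypothetical per-device state. */
struct mydev { struct list_head buffers; };

/* Returns 1 if the frame was delivered, 0 if it had to be dropped. */
static int mydev_frame_ready(struct mydev *dev, const void *frame,
			     unsigned long len)
{
	struct videobuf_buffer *vb;

	/* The engine may be running with no buffers queued. */
	if (list_empty(&dev->buffers))
		return 0;
	vb = (struct videobuf_buffer *)((char *)dev->buffers.next -
			offsetof(struct videobuf_buffer, queue));

	/* Leave the buffer alone until somebody is waiting on it. */
	if (!waitqueue_active(&vb->done))
		return 0;

	list_del(&vb->queue);
	vb->state = VIDEOBUF_ACTIVE;	/* keep videobuf away during I/O */
	memcpy(videobuf_to_vmalloc(vb), frame, len);

	vb->size = len;			/* actual size of the image */
	vb->state = VIDEOBUF_DONE;
	wake_up(&vb->done);		/* buffer now belongs to videobuf */
	return 1;
}
```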

Developers who are interested in more information can go into the relevant
header files; there are a few low-level functions declared there which have
not been talked about here.  Also worthwhile is the vivi driver
(drivers/media/video/vivi.c), which is maintained as an example of how V4L2
drivers should be written.  Vivi only uses the vmalloc() API, but it's good
enough to get started with.  Note also that all of these calls are exported
GPL-only, so they will not be available to non-GPL kernel modules.