.. SPDX-License-Identifier: GPL-2.0

===================================
File management in the Linux kernel
===================================

This document describes how locking works for files (struct file)
and for the file descriptor table (struct files_struct).

Up until 2.6.12, the file descriptor table was protected
with a lock (files->file_lock) and a reference count (files->count).
->file_lock protected accesses to all the file-related fields
of the table. ->count was used for sharing the file descriptor
table between tasks cloned with the CLONE_FILES flag. Typically
this would be the case for POSIX threads. As with the common
refcounting model in the kernel, the last task doing
a put_files_struct() frees the file descriptor (fd) table.
The files (struct file) themselves are protected using
a reference count (->f_count).

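
The "last put frees" refcounting pattern described above can be sketched
in userspace C. This is only an illustration under stated assumptions:
struct demo_shared_table, demo_table_alloc(), demo_table_get() and
demo_table_put() are hypothetical names, not kernel APIs, and C11 atomics
stand in for the kernel's own atomic helpers. A caller would take one
reference per sharing task and call demo_table_put() when each task exits;
only the final call returns 1 and frees the table.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for a table shared between tasks. */
struct demo_shared_table {
        atomic_int count;       /* number of tasks sharing the table */
};

/* Allocate a table that starts with a single reference. */
static struct demo_shared_table *demo_table_alloc(void)
{
        struct demo_shared_table *table = malloc(sizeof(*table));

        if (table)
                atomic_init(&table->count, 1);
        return table;
}

/* Take an additional reference, e.g. for a new task sharing the table. */
static void demo_table_get(struct demo_shared_table *table)
{
        atomic_fetch_add(&table->count, 1);
}

/*
 * Drop a reference. Returns 1 if this was the last reference and the
 * table was freed, 0 otherwise.
 */
static int demo_table_put(struct demo_shared_table *table)
{
        /* atomic_fetch_sub() returns the value before the decrement. */
        if (atomic_fetch_sub(&table->count, 1) == 1) {
                free(table);
                return 1;
        }
        return 0;
}
```
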
In the new lock-free model of file descriptor management,
the reference counting is similar, but the locking is
based on RCU. The file descriptor table contains multiple
elements: the fd sets (open_fds and close_on_exec), the
array of file pointers, the sizes of the sets and the array,
etc. In order for the updates to appear atomic to
a lock-free reader, all the elements of the file descriptor
table are in a separate structure, struct fdtable.
files_struct contains a pointer to struct fdtable through
which the actual fd table is accessed. Initially the
fdtable is embedded in files_struct itself. On a subsequent
expansion of fdtable, a new fdtable structure is allocated
and files->fdt points to the new structure. The old fdtable
structure is freed with RCU, and lock-free readers see either
the old fdtable or the new fdtable, making the update
appear atomic. Here are the locking rules for
the fdtable structure:

1. All references to the fdtable must be done through
   the files_fdtable() macro::

        struct fdtable *fdt;

        rcu_read_lock();

        fdt = files_fdtable(files);
        ....
        if (n < fdt->max_fds)
                ....
        ...
        rcu_read_unlock();

   files_fdtable() uses the rcu_dereference() macro, which takes care of
   the memory barrier requirements for lock-free dereference.
   The fdtable pointer must be read within the read-side
   critical section.

2. Reading of the fdtable as described above must be protected
   by rcu_read_lock()/rcu_read_unlock().

3. For any update to the fd table, files->file_lock must
   be held.

4. To look up the file structure given an fd, a reader
   must use either the lookup_fd_rcu() or the files_lookup_fd_rcu() API. These
   take care of barrier requirements due to lock-free lookup.

   An example::

        struct file *file;

        rcu_read_lock();
        file = lookup_fd_rcu(fd);
        if (file) {
                ...
        }
        ....
        rcu_read_unlock();

5. Handling of the file structures is special. Since the look-up
   of the fd (fget()/fget_light()) is lock-free, it is possible
   that the look-up may race with the last put() operation on the
   file structure. This is avoided using atomic_long_inc_not_zero()
   on ->f_count::

        rcu_read_lock();
        file = files_lookup_fd_rcu(files, fd);
        if (file) {
                if (atomic_long_inc_not_zero(&file->f_count))
                        *fput_needed = 1;
                else
                        /* Didn't get the reference, someone's freed */
                        file = NULL;
        }
        rcu_read_unlock();
        ....
        return file;

   atomic_long_inc_not_zero() detects if the refcount is already zero or
   goes to zero during the increment. If it does, we fail
   fget()/fget_light().

6. Since both fdtable and file structures can be looked up
   lock-free, they must be installed using the rcu_assign_pointer()
   API. If they are looked up lock-free, rcu_dereference()
   must be used. However, it is advisable to use files_fdtable()
   and lookup_fd_rcu()/files_lookup_fd_rcu(), which take care of these issues.

7. While updating, the fdtable pointer must be looked up while
   holding files->file_lock. If ->file_lock is dropped, then
   another thread may expand the files, thereby creating a new
   fdtable and making the earlier fdtable pointer stale.

   For example::

        spin_lock(&files->file_lock);
        fd = locate_fd(files, file, start);
        if (fd >= 0) {
                /* locate_fd() may have expanded fdtable, load the ptr */
                fdt = files_fdtable(files);
                __set_open_fd(fd, fdt);
                __clear_close_on_exec(fd, fdt);
                spin_unlock(&files->file_lock);
        .....

   Since locate_fd() can drop ->file_lock (and reacquire ->file_lock),
   the fdtable pointer (fdt) must be loaded after locate_fd().
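
Rules 1 and 6 pair a publishing store with an ordered load. A rough
userspace analogy is sketched below, with C11 release/acquire atomics
standing in for rcu_assign_pointer() and rcu_dereference(). All demo_*
names are hypothetical and this is not how the kernel implements either
macro. The sketch also leaves out the part that RCU actually provides:
deferring the free of the old table until every reader has left its
read-side critical section. Here the old pointer is simply handed back
to the caller.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-in for struct fdtable. */
struct demo_fdtable {
        unsigned int max_fds;
};

/* Hypothetical stand-in for files_struct: holds the published pointer. */
struct demo_files {
        _Atomic(struct demo_fdtable *) fdt;
};

/*
 * Reader side. The acquire load ensures that the fields of the table
 * are read only after the pointer itself (assumption: acquire is used
 * here as a stronger substitute for dependency ordering).
 */
static struct demo_fdtable *demo_fdtable_load(struct demo_files *files)
{
        return atomic_load_explicit(&files->fdt, memory_order_acquire);
}

/*
 * Writer side. The caller is assumed to hold the update lock and to
 * have fully initialised new_fdt. Returns the previous table, which
 * must not be freed while any reader may still be using it.
 */
static struct demo_fdtable *demo_fdtable_publish(struct demo_files *files,
                                                 struct demo_fdtable *new_fdt)
{
        return atomic_exchange_explicit(&files->fdt, new_fdt,
                                        memory_order_acq_rel);
}
```

A reader that loaded the pointer before the exchange keeps working on the
old table, and one that loads it afterwards sees the new table. In neither
case does it observe a half-updated mix of the two.
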
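
The increment-unless-zero check from rule 5 can also be illustrated
outside the kernel. The following is a minimal sketch using a C11
compare-and-swap loop. demo_inc_not_zero() is a hypothetical helper
written for this example and is not the kernel's implementation of
atomic_long_inc_not_zero().

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Take a reference only if the count is currently non-zero.
 * Returns true if the count was incremented, false if it was
 * already zero (the object is being, or has been, released).
 */
static bool demo_inc_not_zero(atomic_long *count)
{
        long old = atomic_load(count);

        while (old != 0) {
                /*
                 * On failure the compare-exchange stores the current
                 * value in old, so the loop re-checks it against zero.
                 */
                if (atomic_compare_exchange_weak(count, &old, old + 1))
                        return true;
        }
        return false;
}
```

A lookup built on this helper would treat a false return as "file not
found", which corresponds to setting file to NULL in the rule 5 example.
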