BTRFS
=====

Btrfs is a copy-on-write filesystem for Linux aimed at
implementing advanced features while focusing on fault tolerance,
repair and easy administration. Initially developed by Oracle, Btrfs
is licensed under the GPL and open for contribution from anyone.

Linux has a wealth of filesystems to choose from, but we are facing a
number of challenges with scaling to the large storage subsystems that
are becoming common in today's data centers. Filesystems need to scale
in their ability to address and manage large storage, and also in
their ability to detect, repair and tolerate errors in the data stored
on disk. Btrfs is under heavy development, and is not suitable for
any uses other than benchmarking and review. The Btrfs disk format is
not yet finalized.

The main Btrfs features include:

    * Extent based file storage (2^64 max file size)
    * Space efficient packing of small files
    * Space efficient indexed directories
    * Dynamic inode allocation
    * Writable snapshots
    * Subvolumes (separate internal filesystem roots)
    * Object level mirroring and striping
    * Checksums on data and metadata (multiple algorithms available)
    * Compression
    * Integrated multiple device support, with several raid algorithms
    * Online filesystem check (not yet implemented)
    * Very fast offline filesystem check
    * Efficient incremental backup and FS mirroring (not yet implemented)
    * Online filesystem defragmentation


Mount Options
=============

When mounting a btrfs filesystem, the following options are accepted.
Unless otherwise specified, all options default to off.

  alloc_start=<bytes>
        Debugging option to force all block allocations above a certain
        byte threshold on each block device. The value is specified in
        bytes, optionally with a K, M, or G suffix, case insensitive.
        Default is 1MB.

  autodefrag
        Detect small random writes into files and queue them up for the
        defrag process.
        Works best for small files; not well suited for large database
        workloads.

  check_int
  check_int_data
  check_int_print_mask=<value>
        These debugging options control the behavior of the integrity checking
        module (requires the BTRFS_FS_CHECK_INTEGRITY config option).

        check_int enables the integrity checker module, which examines all
        block write requests to ensure on-disk consistency, at a large
        memory and CPU cost.

        check_int_data includes extent data in the integrity checks, and
        implies the check_int option.

        check_int_print_mask takes a bitmask of BTRFSIC_PRINT_MASK_* values
        as defined in fs/btrfs/check-integrity.c, to control the integrity
        checker module behavior.

        See comments at the top of fs/btrfs/check-integrity.c for more info.

  compress
  compress=<type>
  compress-force
  compress-force=<type>
        Control BTRFS file data compression. Type may be specified as "zlib",
        "lzo" or "no" (for no compression, used for remounting). If no type
        is specified, zlib is used. If compress-force is specified,
        all files will be compressed, whether or not they compress well.
        If compression is enabled, nodatacow and nodatasum are disabled.

  degraded
        Allow mounts to continue with missing devices. A read-write mount may
        fail with too many devices missing, for example if a stripe member
        is completely missing.

  device=<devicepath>
        Specify a device during mount so that ioctls on the control device
        can be avoided. Especially useful when trying to mount a multi-device
        setup as root. May be specified multiple times for multiple devices.

  discard
        Issue frequent commands to let the block device reclaim space freed by
        the filesystem. This is useful for SSD devices, thinly provisioned
        LUNs and virtual machine images, but may have a significant
        performance impact. (The fstrim command is also available to
        initiate batch trims from userspace.)

  enospc_debug
        Debugging option to be more verbose in some ENOSPC conditions.

  fatal_errors=<action>
        Action to take when encountering a fatal error:
          "bug" - BUG() on a fatal error. This is the default.
          "panic" - panic() on a fatal error.

  flushoncommit
        The 'flushoncommit' mount option forces any data dirtied by a write in a
        prior transaction to commit as part of the current commit. This makes
        the committed state a fully consistent view of the file system from the
        application's perspective (i.e., it includes all completed file system
        operations). This was previously the behavior only when a snapshot is
        created.

  inode_cache
        Enable free inode number caching. Defaults to off due to an overflow
        problem when the free space crcs don't fit inside a single page.

  max_inline=<bytes>
        Specify the maximum amount of space, in bytes, that can be inlined in
        a metadata B-tree leaf. The value is specified in bytes, optionally
        with a K, M, or G suffix, case insensitive. In practice, this value
        is limited by the root sector size, with some space unavailable due
        to leaf headers. For a 4k sectorsize, max inline data is ~3900 bytes.

  metadata_ratio=<value>
        Specify that 1 metadata chunk should be allocated after every <value>
        data chunks. Off by default.

  noacl
        Disable support for POSIX Access Control Lists (ACLs). See the
        acl(5) manual page for more information about ACLs.

  nobarrier
        Disables the use of block layer write barriers. Write barriers ensure
        that certain IOs make it through the device cache and are on persistent
        storage. If used on a device with a volatile (non-battery-backed)
        write-back cache, this option will lead to filesystem corruption on a
        system crash or power loss.

  nodatacow
        Disable data copy-on-write for newly created files. Implies nodatasum,
        and disables all compression.

  nodatasum
        Disable data checksumming for newly created files.

  notreelog
        Disable the tree logging used for fsync and O_SYNC writes.

  recovery
        Enable autorecovery attempts if a bad tree root is found at mount time.
        Currently this scans a list of several previous tree roots and tries to
        use the first readable.

  skip_balance
        Skip automatic resume of interrupted balance operation after mount.
        May be resumed with "btrfs balance resume".

  space_cache (*)
        Enable the on-disk freespace cache.
  nospace_cache
        Disable freespace cache loading without clearing the cache.
  clear_cache
        Force clearing and rebuilding of the disk space cache if something
        has gone wrong.

  ssd
  nossd
  ssd_spread
        Options to control SSD allocation schemes. By default, BTRFS will
        enable or disable SSD allocation heuristics depending on whether a
        rotational or nonrotational disk is in use. The ssd and nossd options
        can override this autodetection.

        The ssd_spread mount option attempts to allocate into big chunks
        of unused space, and may perform better on low-end SSDs. ssd_spread
        implies ssd, enabling all other SSD heuristics as well.

  subvol=<path>
        Mount subvolume at <path> rather than the root subvolume. <path> is
        relative to the top level subvolume.

  subvolid=<ID>
        Mount subvolume specified by an ID number rather than the root subvolume.
        This allows mounting of subvolumes which are not in the root of the mounted
        filesystem.
        You can use "btrfs subvolume list" to see subvolume ID numbers.

  subvolrootid=<objectid> (deprecated)
        Mount subvolume specified by <objectid> rather than the root subvolume.
        This allows mounting of subvolumes which are not in the root of the mounted
        filesystem.
        You can use "btrfs subvolume show" to see the object ID for a subvolume.

  thread_pool=<number>
        The number of worker threads to allocate. The default number is equal
        to the number of CPUs + 2, or 8, whichever is smaller.

  user_subvol_rm_allowed
        Allow subvolumes to be deleted by a non-root user. Use with caution.

MAILING LIST
============

There is a Btrfs mailing list hosted on vger.kernel.org. You can
find details on how to subscribe here:

http://vger.kernel.org/vger-lists.html#linux-btrfs

Mailing list archives are available from gmane:

http://dir.gmane.org/gmane.comp.file-systems.btrfs


IRC
===

Discussion of Btrfs also occurs on the #btrfs channel of the Freenode
IRC network.


UTILITIES
=========

Userspace tools for creating and manipulating Btrfs file systems are
available from the git repository at the following location:

 http://git.kernel.org/?p=linux/kernel/git/mason/btrfs-progs.git
 git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs.git

These include the following tools:

mkfs.btrfs: create a filesystem

btrfsctl: control program to create snapshots and subvolumes:

        mount /dev/sda2 /mnt
        btrfsctl -s new_subvol_name /mnt
        btrfsctl -s snapshot_of_default /mnt/default
        btrfsctl -s snapshot_of_new_subvol /mnt/new_subvol_name
        btrfsctl -s snapshot_of_a_snapshot /mnt/snapshot_of_new_subvol
        ls /mnt
        default snapshot_of_a_snapshot snapshot_of_new_subvol
        new_subvol_name snapshot_of_default

        Snapshots and subvolumes cannot be deleted right now, but you can
        rm -rf all the files and directories inside them.

btrfsck: do a limited check of the FS extent trees.

btrfs-debug-tree: print all of the FS metadata in text form. Example:

        btrfs-debug-tree /dev/sda2 >& big_output_file
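
The options listed under "Mount Options" are passed to mount(8) as a single
comma-separated -o argument. The sketch below shows one way such an argument
might be assembled. It is illustrative only: the join_opts helper is
hypothetical (it is not part of btrfs-progs), the selection of options is
arbitrary, and /dev/sda2 and /mnt are the same placeholder device and mount
point used in the btrfsctl example. It prints the mount command instead of
running it, so it can be tried without root privileges.

```shell
#!/bin/sh
# Hypothetical helper (not part of btrfs-progs): join all arguments
# with commas, the separator that "mount -o" expects.
join_opts() {
	# "$*" joins the arguments using the first character of IFS;
	# the subshell keeps the IFS change local to this function.
	(IFS=,; printf '%s\n' "$*")
}

# Arbitrary selection of options documented under "Mount Options".
opts=$(join_opts compress=lzo autodefrag thread_pool=4)

# Dry run: print the command rather than executing it.
# /dev/sda2 and /mnt are placeholders; adjust them for the local system.
echo "mount -t btrfs -o $opts /dev/sda2 /mnt"
```

Once the printed command looks right, it can be run as root, or the same
option string can be placed in the options field of an /etc/fstab entry.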