linux/fs/cramfs/README
Notes on Filesystem Layout
--------------------------

These notes describe what mkcramfs generates.  Kernel requirements are
a bit looser, e.g. the kernel doesn't care if the <file_data> items are
swapped around (though it does care that directory entries (inodes) in
a given directory are contiguous, as this is used by readdir).

All data is currently in host-endian format; neither mkcramfs nor the
kernel ever does swabbing.  (See section `Block Size' below.)

<filesystem>:
        <superblock>
        <directory_structure>
        <data>

<superblock>: struct cramfs_super (see cramfs_fs.h).

<directory_structure>:
        For each file:
                struct cramfs_inode (see cramfs_fs.h).
                Filename.  Not generally null-terminated, but it is
                 null-padded to a multiple of 4 bytes.
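
The filename padding rule can be illustrated with a minimal sketch.
The helper names padded_name_len and put_padded_name are hypothetical
and are not taken from mkcramfs; the code only shows the rounding to a
multiple of 4 bytes and the fact that a name whose length is already a
multiple of 4 gets no terminating NUL.

```c
#include <stddef.h>
#include <string.h>

/* Round a filename length up to the next multiple of 4 bytes. */
static size_t padded_name_len(size_t len)
{
        return (len + 3) & ~(size_t)3;
}

/*
 * Copy name into dst and NUL-pad it to a multiple of 4 bytes.
 * Returns the number of bytes written.  dst must have room for
 * padded_name_len(strlen(name)) bytes.  (Hypothetical helper.)
 */
static size_t put_padded_name(unsigned char *dst, const char *name)
{
        size_t len = strlen(name);
        size_t padded = padded_name_len(len);

        memcpy(dst, name, len);
        memset(dst + len, 0, padded - len);
        return padded;
}
```

A 3-byte name such as "etc" occupies 4 bytes including one NUL of
padding, whereas a 4-byte name such as "init" occupies exactly 4 bytes
with no NUL at all.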

The order of inode traversal is described as "width-first" (not to be
confused with breadth-first); i.e. like depth-first but listing all of
a directory's entries before recursing down its subdirectories: the
same order as `ls -AUR' (but without the /^\..*:$/ directory header
lines); put another way, the same order as `find -type d -exec
ls -AU1 {} \;'.

Beginning in 2.4.7, directory entries are sorted.  This optimization
allows cramfs_lookup to return more quickly when a filename does not
exist, speeds up user-space directory sorts, etc.
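
Why sorting helps a failed lookup can be sketched as follows.  This is
not the kernel's cramfs_lookup; sorted_lookup is a hypothetical
function, and it assumes names are NUL-terminated strings ordered by
strcmp, which may differ from the comparison the kernel really uses.

```c
#include <string.h>

/*
 * Linear search in a sorted list of names.  Returns the index of
 * name, or -1 if it is absent.  *visited is set to the number of
 * entries examined.  (Hypothetical illustration only.)
 */
static int sorted_lookup(const char *const names[], int n,
                         const char *name, int *visited)
{
        int i;

        *visited = 0;
        for (i = 0; i < n; i++) {
                int cmp = strcmp(name, names[i]);

                (*visited)++;
                if (cmp == 0)
                        return i;
                if (cmp < 0)
                        break;  /* every later entry sorts after name */
        }
        return -1;
}
```

With entries "bin", "etc", "lib", "usr", a lookup of the missing name
"dev" stops after examining two entries instead of all four.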

<data>:
        One <file_data> for each file that's either a symlink or a
         regular file of non-zero st_size.

<file_data>:
        nblocks * <block_pointer>
         (where nblocks = (st_size - 1) / blksize + 1)
        nblocks * <block>
        padding to multiple of 4 bytes

The i'th <block_pointer> for a file stores the byte offset of the
*end* of the i'th <block> (i.e. one past the last byte, which is the
same as the start of the (i+1)'th <block> if there is one).  The first
<block> immediately follows the last <block_pointer> for the file.
<block_pointer>s are each 32 bits long.
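
The block pointer arithmetic above can be sketched as follows.  The
names cramfs_nblocks, block_extent and struct block_span are
hypothetical.  The sketch assumes that block pointers and data_start
(the offset of the file's first <block_pointer>) are byte offsets
measured from the same origin, and that the pointers have already been
loaded into host byte order.

```c
#include <stdint.h>

struct block_span {
        uint32_t start;         /* offset of first byte of the block */
        uint32_t len;           /* compressed length in bytes */
};

/* Number of blocks for a file; only meaningful for st_size > 0. */
static uint32_t cramfs_nblocks(uint32_t st_size, uint32_t blksize)
{
        return (st_size - 1) / blksize + 1;
}

/*
 * Locate the i'th compressed block.  ptrs[] holds the file's nblocks
 * block pointers, each giving the end offset of the matching block.
 */
static struct block_span block_extent(uint32_t data_start,
                                      const uint32_t *ptrs,
                                      uint32_t nblocks, uint32_t i)
{
        struct block_span s;

        /* Block 0 starts right after the pointer table (4 bytes each). */
        s.start = (i == 0) ? data_start + nblocks * 4 : ptrs[i - 1];
        s.len = ptrs[i] - s.start;
        return s;
}
```

For example, a 10000-byte file with blksize 4096 has 3 blocks; if its
pointer table starts at offset 1000, the first block starts at 1012.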

The order of <file_data> items is a depth-first descent of the
directory tree, i.e. the same order as `find -size +0 \( -type f -o
-type l \) -print'.


<block>: The i'th <block> is the output of zlib's compress function
applied to the i'th blksize-sized chunk of the input data.
(For the last <block> of the file, the input may of course be smaller.)
Each <block> may be a different size.  (See <block_pointer> above.)
<block>s are merely byte-aligned, not generally u32-aligned.


Holes
-----

This kernel supports cramfs holes (i.e. [efficient representation of]
blocks in uncompressed data consisting entirely of NUL bytes), but by
default mkcramfs doesn't test for or create holes, since cramfs in
kernels up to at least 2.3.39 didn't support holes.  Run mkcramfs
with -z if you want it to create files that can have holes in them.
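
Testing whether an uncompressed block qualifies as a hole amounts to
checking that every byte is NUL.  The following is only a sketch of
such a test; block_is_hole is a hypothetical name, and this is not
claimed to be how mkcramfs implements -z.

```c
#include <stddef.h>

/* Return 1 if buf[0..len-1] consists entirely of NUL bytes, else 0. */
static int block_is_hole(const unsigned char *buf, size_t len)
{
        size_t i;

        for (i = 0; i < len; i++)
                if (buf[i] != 0)
                        return 0;
        return 1;
}
```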


Tools
-----

The cramfs user-space tools, including mkcramfs and cramfsck, are
located at <http://sourceforge.net/projects/cramfs/>.


Future Development
==================

Block Size
----------

(Block size in cramfs refers to the size of input data that is
compressed at a time.  It's intended to be somewhere around
PAGE_CACHE_SIZE for cramfs_readpage's convenience.)

The superblock ought to indicate the block size that the fs was
written for, since comments in <linux/pagemap.h> indicate that
PAGE_CACHE_SIZE may grow in future (if I interpret the comment
correctly).

Currently, mkcramfs #define's PAGE_CACHE_SIZE as 4096 and uses that
for blksize, whereas Linux-2.3.39 uses its PAGE_CACHE_SIZE, which in
turn is defined as PAGE_SIZE (which can be as large as 32KB on arm).
This discrepancy is a bug, though it's not clear which should be
changed.

One option is to change mkcramfs to take its PAGE_CACHE_SIZE from
<asm/page.h>.  Personally I don't like this option, but it does
require the least amount of change: just change `#define
PAGE_CACHE_SIZE (4096)' to `#include <asm/page.h>'.  The disadvantage
is that the generated cramfs cannot always be shared between different
kernels, not even necessarily kernels of the same architecture if
PAGE_CACHE_SIZE is subject to change between kernel versions
(currently possible with arm and ia64).

The remaining options try to make cramfs more sharable.

One part of that is addressing endianness.  The two options here are
`always use little-endian' (like ext2fs) or `writer chooses
endianness; kernel adapts at runtime'.  Little-endian wins because of
code simplicity and little CPU overhead even on big-endian machines.

The cost of swabbing is changing the code to use the le32_to_cpu
etc. macros as used by ext2fs.  We don't need to swab the compressed
data, only the superblock, inodes and block pointers.
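
What `always use little-endian' means for a 32-bit field can be
sketched in portable user-space C.  The helpers get_le32 and put_le32
are hypothetical stand-ins, not the kernel's le32_to_cpu and
cpu_to_le32 macros; they work byte by byte so that the result is the
same on little-endian and big-endian hosts.

```c
#include <stdint.h>

/* Decode a little-endian 32-bit value from 4 raw bytes. */
static uint32_t get_le32(const unsigned char *p)
{
        return (uint32_t)p[0] |
               ((uint32_t)p[1] << 8) |
               ((uint32_t)p[2] << 16) |
               ((uint32_t)p[3] << 24);
}

/* Encode a 32-bit value as 4 little-endian bytes. */
static void put_le32(unsigned char *p, uint32_t v)
{
        p[0] = (unsigned char)(v & 0xff);
        p[1] = (unsigned char)((v >> 8) & 0xff);
        p[2] = (unsigned char)((v >> 16) & 0xff);
        p[3] = (unsigned char)((v >> 24) & 0xff);
}
```

Only fixed-width fields such as those in the superblock, inodes and
block pointers would pass through such helpers; the compressed data is
a byte stream and is left alone.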


The other part of making cramfs more sharable is choosing a block
size.  The options are:

  1. Always 4096 bytes.

  2. Writer chooses blocksize; kernel adapts but rejects blocksize >
     PAGE_CACHE_SIZE.

  3. Writer chooses blocksize; kernel adapts even to blocksize >
     PAGE_CACHE_SIZE.

It's easy enough to change the kernel to use a smaller value than
PAGE_CACHE_SIZE: just make cramfs_readpage read multiple blocks.
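
The check implied by option 2, together with the `read multiple
blocks' idea, might look like the sketch below.  blocks_per_page is a
hypothetical function, and the requirement that blksize divide the
page size evenly is an extra assumption made here to keep the example
simple; the text above does not require it.

```c
#include <stdint.h>

/*
 * Return how many cramfs blocks cover one page, or 0 if the block
 * size would be rejected.  Rejects blksize > page_size (option 2)
 * and, as an assumption of this sketch, block sizes that do not
 * divide the page size evenly.
 */
static uint32_t blocks_per_page(uint32_t page_size, uint32_t blksize)
{
        if (blksize == 0 || blksize > page_size)
                return 0;
        if (page_size % blksize != 0)
                return 0;
        return page_size / blksize;
}
```

With a 16KB page and a 4096-byte blksize, a readpage implementation
along these lines would decompress 4 blocks per page.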

The cost of option 1 is that kernels with a larger PAGE_CACHE_SIZE
value don't get as good compression as they could.

The cost of option 2 relative to option 1 is that the code uses
variables instead of #define'd constants.  The gain is that people
with kernels having larger PAGE_CACHE_SIZE can make use of that if
they don't mind their cramfs being inaccessible to kernels with
smaller PAGE_CACHE_SIZE values.

Option 3 is easy to implement if we don't mind being CPU-inefficient:
e.g. get readpage to decompress to a buffer of size MAX_BLKSIZE (which
must be no larger than 32KB) and discard what it doesn't need.
Getting readpage to read into all the covered pages is harder.

The main advantage of option 3 over options 1 and 2 is better
compression.  The cost is greater complexity.  Probably not worth it,
but I hope someone will disagree.  (If it is implemented, then I'll
re-use that code in e2compr.)


Another cost of options 2 and 3 over option 1 is making mkcramfs use a
different block size, but that just means adding and parsing a -b
option.


Inode Size
----------

Given that cramfs will probably be used for CDs etc. as well as just
silicon ROMs, it might make sense to expand the inode a little from
its current 12 bytes.  Inodes other than the root inode are followed
by a filename, so the expansion doesn't even have to be a multiple of
4 bytes.