                    Network Block Device (TCP version)

   Note: Network Block Device is currently experimental, which roughly
   means that it works on my computer, and it worked on one of the school
   computers.

   What is it: With this compiled into the kernel, Linux can use a remote
   server as one of its block devices. Every time the client computer
   wants to read /dev/nd0, it sends a request over TCP to the server, which
   replies with the data read. This can be used by stations with
   low disk space (or even diskless ones, if you boot from floppy) to
   borrow disk space from another computer. Unlike NFS, it is possible to
   put any filesystem on it. It is impossible to use NBD as a root
   filesystem, since it requires a user-level program to start. It also
   allows you to run a block device in user land (with server and client
   on the same physical computer, communicating over loopback).

   Current state: It currently works. The network block device appears to
   be fairly stable. I originally thought that it was impossible to swap
   over TCP. That turned out not to be true - swapping over TCP now works
   and seems to be deadlock-free, but it requires heavy patches to
   Linux's network layer.

   Devices: Network block device uses major 43, minors 0..n (where n is
   configurable in nbd.h). Create these files with mknod when needed. After
   that, your ls -l /dev/ should look like:

brw-rw-rw-   1 root     root      43,   0 Apr 11 00:28 nd0
brw-rw-rw-   1 root     root      43,   1 Apr 11 00:28 nd1
...

   Protocol: A userland program passes a file handle with a connected TCP
   socket to the actual kernel driver. This way, the kernel does not have
   to care about connecting etc.
   The protocol is rather simple: if the driver is asked to read from the
   block device, it sends a packet of the following form, "request" (all
   data are in network byte order):

  __u32 magic;        must be equal to 0x12560953
  __u32 from;         position in bytes to read from / write at
  __u32 len;          number of bytes to be read / written
  __u64 handle;       handle of operation
  __u32 type;         0 = read
                      1 = write
  ...                 in case of a write operation, this is
                      immediately followed by len bytes of data

   When the operation is completed, the server responds with a packet of
   the following structure, "reply":

  __u32 magic;        must be equal to
  __u64 handle;       handle copied from request
  __u32 error;        0 = operation completed successfully,
                      else error code
  ...                 in case of a read operation with no error,
                      this is immediately followed by len bytes of data

   For more information, look at http://atrey.karlin.mff.cuni.cz/~pavel.
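
   The request and reply layouts described above can be sketched in a few
   lines of userland Python. This is only an illustration, not the driver's
   or server's actual code. It assumes the fields appear on the wire in
   exactly the order listed in this document, with no padding; nbd.h
   remains the authority on the real layout. The reply magic is not given
   here, so the caller has to supply it. The function names, the serve()
   helper, the in-memory "disk" and the error code 1 are all hypothetical.

```python
import struct

REQUEST_MAGIC = 0x12560953   # value given in the request description
REQUEST_FMT = "!IIIQI"       # magic, from, len, handle, type ("!" = network order)
REPLY_FMT = "!IQI"           # magic, handle, error
CMD_READ, CMD_WRITE = 0, 1


def pack_request(offset, length, handle, cmd, payload=b""):
    """Build a request; a write carries `length` bytes of data after the header."""
    if cmd == CMD_WRITE and len(payload) != length:
        raise ValueError("write payload must be exactly `length` bytes")
    header = struct.pack(REQUEST_FMT, REQUEST_MAGIC, offset, length, handle, cmd)
    return header + (payload if cmd == CMD_WRITE else b"")


def unpack_request(data):
    """Split a request into (offset, length, handle, cmd, payload)."""
    size = struct.calcsize(REQUEST_FMT)
    magic, offset, length, handle, cmd = struct.unpack(REQUEST_FMT, data[:size])
    if magic != REQUEST_MAGIC:
        raise ValueError("bad request magic")
    payload = data[size:size + length] if cmd == CMD_WRITE else b""
    if cmd == CMD_WRITE and len(payload) != length:
        raise ValueError("truncated write payload")
    return offset, length, handle, cmd, payload


def pack_reply(reply_magic, handle, error=0, payload=b""):
    """Build a reply; data follows only when the operation had no error."""
    header = struct.pack(REPLY_FMT, reply_magic, handle, error)
    return header + (payload if error == 0 else b"")


def serve(request_bytes, disk, reply_magic):
    """Hypothetical server step: apply one request to `disk` (a bytearray)."""
    offset, length, handle, cmd, payload = unpack_request(request_bytes)
    if offset + length > len(disk):
        # Error code 1 is illustrative; the document does not list codes.
        return pack_reply(reply_magic, handle, error=1)
    if cmd == CMD_WRITE:
        disk[offset:offset + length] = payload
        return pack_reply(reply_magic, handle)
    return pack_reply(reply_magic, handle, payload=bytes(disk[offset:offset + length]))
```

   With this layout the request header is 24 bytes and the reply header is
   16 bytes. A server would call serve() once per request received, echoing
   the handle back so that the client can match each reply to its request.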

