# File Systems - different ways of storing persistend data - file systems - organize and store data on block devices - 2 main abstractions 1. File - collection of bytes associated with a name 2. Directories - collection of directories and files ## File (System) APIs Basic ones - open - open a file - close - close a file - read/write - read or write from or to an open file Others: - lseek - adjust the offset of the given file - fsync - flush buffers by writing to the disk - rename - (atomically) rename a file - (f)stat - get file metadata - link - create a new name and associate it with file data, that is, create a hard link - unlink - remove a name (label) from file data - if number of links drops to 0, this means "delete the file" Directories: - mkdir - rmdir - opendir / closedir - readdir ## Filesystem Implementation - implemented as a device driver - file system then communicates with the block device driver - 2 major tasks of a file system: 1. Translate names + offsets to the actual data/storage 2. Free space management - design of a FS 1. Data structs: contents, metadata 2. Access methods ## Useful observations about a typicial FS - Most files: relatively small - ~2K most common size - Average size: growin - ~200K - Most bytes occupied by large files - Relatively large number of files - ~100K - File systems - roughly half full - Directories are relatively small - most directories will have ~20 files or less # Organization - a disk is divided blocks - typical size 4K = 4096B - 1M drive = 256 blocks - blocks numbered 0 to N - 1 (for a disk partition size of 4K * N) - let's say we only have 32 blocks - here's how we might split the disk up: +---------------------------------------------------------------+ |S|i|d|I|I|I|I|I|D|D|D|D|D|D|D|D|D|D|D|D|D|D|D|D|D|D|D|D|D|D|D|D| +---------------------------------------------------------------+ 0 1 2 3 4 5 6 7 11 15 31 - We have the following regions + S: superblock + i/d: free space bitmap for I blocks and D blocks + I: metadata stored as i-nodes + D: data blocks - actual file contents ## I-nodes - metadata about the file - the disk contains a table of these - simply an array of structs stored on the disk - fixed size - for easy searching - find file using it's i-node number (i-node number * sizeof(inode_t)) - Fields (example): 1. size - long int - 8 bytes 2. Owner - short int - 2 bytes 3. Group - short int - 2 bytes 4. Access - short int - 2 bytes 5. Modification timestamp - long int - 8 bytes 6. Refernce count - short int - 2 bytes 7. Blocks - array of char - 8 bytes ## Directories - To get from filename -> inode: directories - A directory is a specially flagged i-node and a data block containing a table of filenames associated with i-node numbers - Root directory of the file system: a designated i-node (e.g., i-node 0) ### Example - example: file.txt - size 2K (occupies one block) How to get from `/home/ferd/file.txt` -> i-node 42 1. go to i-node 0 (/) 2. find its datablock: home d 3 usr d 2 3. go to i-node 3 (home) 4. find its datablock: ferd d 7 bob d 5 alice d 12 5. go to i-node 7 (ferd) 6. find its datablock file.txt f 42 test.csv f 55 7. go to-inode 42