1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
<!-- !> Notes from Ferd's class 
-->

# File Systems

- different ways of storing persistend data
- file systems - organize and store data on block devices

- 2 main abstractions
  1. File
     - collection of bytes associated with a name
  2. Directories
     - collection of directories and files


## File (System) APIs

Basic ones
- open - open a file
- close - close a file
- read/write - read or write from or to an open file

Others:
- lseek - adjust the offset of the given file
- fsync - flush buffers by writing to the disk
- rename - (atomically) rename a file
- (f)stat - get file metadata
- link - create a new name and associate it with file data, that is,
  create a hard link
- unlink - remove a name (label) from file data - if number of links
  drops to 0, this means "delete the file"

Directories:
- mkdir
- rmdir
- opendir / closedir
- readdir


## Filesystem Implementation

- implemented as a device driver
  - file system then communicates with the block device driver

- 2 major tasks of a file system:
  1. Translate names + offsets to the actual data/storage
  2. Free space management

- design of a FS
  1. Data structs: contents, metadata
  2. Access methods

## Useful observations about a typicial FS

- Most files: relatively small
  - ~2K most common size
- Average size: growin
  - ~200K 
- Most bytes occupied by large files
- Relatively large number of files
  - ~100K
- File systems - roughly half full
- Directories are relatively small
  - most directories will have ~20 files or less

# Organization

- a disk is divided blocks
- typical size 4K = 4096B
- 1M drive = 256 blocks
- blocks numbered 0 to N - 1 (for a disk partition size of 4K * N)
- let's say we only have 32 blocks 
- here's how we might split the disk up:

      +---------------------------------------------------------------+
      |S|i|d|I|I|I|I|I|D|D|D|D|D|D|D|D|D|D|D|D|D|D|D|D|D|D|D|D|D|D|D|D|
      +---------------------------------------------------------------+
      0 1 2 3 4 5 6 7       11      15                              31

- We have the following regions
  + S: superblock
  + i/d: free space bitmap for I blocks and D blocks
  + I: metadata stored as i-nodes
  + D: data blocks - actual file contents

## I-nodes

- metadata about the file
- the disk contains a table of these
- simply an array of structs stored on the disk
- fixed size - for easy searching
  - find file using it's i-node number (i-node number * sizeof(inode_t))
- Fields (example):
  1. size - long int - 8 bytes
  2. Owner - short int - 2 bytes
  3. Group - short int - 2 bytes
  4. Access - short int - 2 bytes
  5. Modification timestamp - long int - 8 bytes
  6. Refernce count - short int - 2 bytes
  7. Blocks - array of char - 8 bytes


## Directories

- To get from filename -> inode: directories
- A directory is a specially flagged i-node and a data block containing a
  table of filenames associated with i-node numbers
- Root directory of the file system: a designated i-node (e.g., i-node 0)


### Example
- example: file.txt - size 2K (occupies one block)

How to get from `/home/ferd/file.txt` -> i-node 42

1. go to i-node 0 (/)

2. find its datablock:

       home   d   3
       usr    d   2

3. go to i-node 3 (home)

4. find its datablock:

       ferd   d   7
       bob    d   5
       alice  d   12

5. go to i-node 7 (ferd)

6. find its datablock

       file.txt  f  42
       test.csv  f  55

7. go to-inode 42