Virtual Memory

Motivation

We talked about how processes have their address space laid out
Each process has its variables and code (labels) at some set addresses (simplification)
These addresses can overlap
It’s even more apparent when we fork
How come multiple processes don’t step on each other’s toes and overwrite each other’s variables or call the wrong functions?

Memory that we access in our programs is a big lie
It’s a conspiracy between the OS and the hardware to create the illusion of a huge contiguous private address space
How do we achieve this?
Where are things actually stored? RAM? Disk?
Both!
How do we do this?

We need some sort of translation function
Something that, for any given address, returns where the address is actually located
How would we implement such a function?
For simplicity, let’s start by assuming that everything is stored in RAM, that we only have, say, 64K of RAM available, and that an address space is only 16K
So we just need a translation, from virtual addresses (VA) to physical addresses (PA), for a particular process
Signature, something like: translate : Proc VA -> PA
Let’s also think about two processes that are running at the same time, and thus have their bits in RAM - maybe it’s even the same program
Well, we could come up with all sorts of translations, e.g., using the process IDs to somehow partition the physical addresses
Such (mathematical) approaches have problems: wasted space, how can we predict how many processes will run, etc.
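To make the partitioning idea concrete, here is a minimal sketch, assuming the 64K of RAM and 16K address spaces from above. The name translate_by_pid and the choice of "PID modulo number of slots" are made up for illustration; this is not a scheme a real OS uses.

```python
RAM_SIZE = 64 * 1024            # 64K of physical memory (assumed above)
ADDRESS_SPACE_SIZE = 16 * 1024  # each process sees a 16K address space
SLOTS = RAM_SIZE // ADDRESS_SPACE_SIZE  # only 4 address spaces fit in RAM


def translate_by_pid(pid: int, va: int) -> int:
    """Hypothetical scheme: the PID alone picks a fixed 16K slot in RAM."""
    if not 0 <= va < ADDRESS_SPACE_SIZE:
        raise ValueError(f"virtual address {va:#x} is outside the address space")
    slot = pid % SLOTS
    return slot * ADDRESS_SPACE_SIZE + va
```

With these numbers, PIDs 1 and 5 land in the same slot, so translate_by_pid(1, 0x10) and translate_by_pid(5, 0x10) return the same physical address. A process that needs only 1K still occupies a full 16K slot.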
How about we just use a table?
- Mapping VA to PA?
- Options: global or per-process?
What is in the table?
- Each entry is a VA and the corresponding PA?
- Each entry is the location of the process’ address space?
- Each entry is a variable-sized range of VAs and a range of PAs?
  - Segments
- Each entry is a fixed chunk of VAs and PAs?
  - Pages
What are the downsides/upsides?

Mapping each individual VA to a PA is (obviously?) the most inefficient solution
If we had a table that maps individual virtual addresses to physical addresses, we would, in the worst case, need at least as much memory to store our translation table as the amount of memory we are trying to map
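A quick back-of-the-envelope check of that claim, as a sketch. It assumes byte-addressable memory, one table entry per virtual address, and entries just wide enough to hold a physical address rounded up to whole bytes. The function name is made up for illustration.

```python
def per_address_table_bytes(address_space_bytes: int, ram_bytes: int) -> int:
    """Size of a table with one entry (a physical address) per virtual address."""
    # Bits needed to name any byte in RAM, rounded up to whole bytes per entry.
    pa_bits = max(1, (ram_bytes - 1).bit_length())
    entry_bytes = (pa_bits + 7) // 8
    return address_space_bytes * entry_bytes


# 16K address space mapped into 64K of RAM, as in the running example.
table_size = per_address_table_bytes(16 * 1024, 64 * 1024)
```

Under these assumptions each entry takes 2 bytes, so the table for one 16K address space takes 32K: twice the memory it maps.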

Another approach is storing the boundaries of a process’ address space
What do we need? Two addresses per process: where the address space starts (the base) and where it ends (the bound)
Then to translate an address, we
1. Take the virtual address and add it to the base address to get the physical address
2. Then we check that the physical address is within the boundary of the address space
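The two steps above can be sketched as follows. This is a minimal illustration, not how the hardware is built; it assumes the bound is the physical address just past the end of the address space (exclusive), and the exception name is made up.

```python
class AddressOutOfBounds(Exception):
    """Raised when an address falls outside the process' address space."""


def translate(base: int, bound: int, va: int) -> int:
    """Translate a virtual address using a base and a bound.

    base  -- physical address where the address space starts
    bound -- physical address where it ends (exclusive; an assumption)
    """
    pa = base + va                 # step 1: relocate the virtual address
    if not base <= pa < bound:     # step 2: check the boundary
        raise AddressOutOfBounds(f"virtual address {va:#x} is out of bounds")
    return pa
```

For a 16K address space placed at 0x8000 (so the bound is 0xC000), translate(0x8000, 0xC000, 0x100) gives 0x8100, while virtual address 0x4000 is rejected.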

Where does this translation take place?
One answer could be: the OS, since the OS is the manager of the process and the process’ address space
This would mean that for every instruction we execute, the OS would have to translate at least one address (that of the instruction itself)
But this contradicts our goal of direct execution for programs and would make things really slow
Just like with multiprocessing, the OS relies on hardware to allow efficiently managed but direct execution of a program’s instructions on the CPU
The piece of hardware is typically called a Memory Management Unit (MMU)
These can be separate chips, but are usually part of the CPU itself nowadays
For mapping complete address spaces, the MMU is really simple: just two registers holding the base and the bounds for the currently running process
The OS’ responsibility is to maintain base and bounds for each process in its data structures and then load the two registers for the process that is about to run
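The division of labour between the MMU and the OS can be modelled roughly like this. It is a toy simulation under stated assumptions: the classes MMU and OS, the method names, and the ProtectionFault exception are all invented for illustration, and real hardware registers are of course not Python attributes.

```python
from dataclasses import dataclass


class ProtectionFault(Exception):
    """Raised by the toy MMU when an access is out of bounds."""


@dataclass
class MMU:
    """Models the two hardware registers of a base-and-bounds MMU."""
    base: int = 0
    bound: int = 0

    def translate(self, va: int) -> int:
        # Done by the hardware on every memory access, without the OS.
        pa = self.base + va
        if not self.base <= pa < self.bound:
            raise ProtectionFault(f"virtual address {va:#x} is out of bounds")
        return pa


class OS:
    """Keeps (base, bound) per process and loads the MMU on a context switch."""

    def __init__(self, mmu: MMU) -> None:
        self.mmu = mmu
        self.process_table = {}  # pid -> (base, bound)

    def create_process(self, pid: int, base: int, size: int) -> None:
        self.process_table[pid] = (base, base + size)

    def context_switch(self, pid: int) -> None:
        self.mmu.base, self.mmu.bound = self.process_table[pid]


mmu = MMU()
kernel = OS(mmu)
kernel.create_process(pid=1, base=0x0000, size=0x4000)
kernel.create_process(pid=2, base=0x8000, size=0x4000)

kernel.context_switch(1)
pa_in_p1 = mmu.translate(0x0100)  # process 1 sees VA 0x0100 at PA 0x0100
kernel.context_switch(2)
pa_in_p2 = mmu.translate(0x0100)  # same VA, but process 2 gets PA 0x8100
```

The same virtual address ends up at different physical addresses depending on which process is running, and the OS is only involved at the context switch, not on every access.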

Mapping complete address spaces is wasteful
A typical address space will have a large chunk of space that is not being used - between the heap and the stack
A better approach is to map individual segments of the address space: code, data, heap, stack, etc.
These can vary in size - for example the heap and the stack will grow and shrink during the existence of a process
Separating the segments gives the OS more flexibility in using the available memory
The MMU gets a little more complicated: we no longer need 2 addresses per process, we need 2 addresses per segment
If there is a small number of segments, these can be registers
If there is a larger number, a segment table is needed, stored in memory instead of in CPU registers
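A segment table can be sketched as a list of entries, each mapping a variable-sized range of VAs to a range of PAs. This is an illustrative model only: the Segment fields, the linear search, and the example addresses are assumptions (real hardware typically selects the segment from some bits of the address or from the kind of access).

```python
from typing import NamedTuple


class Segment(NamedTuple):
    name: str
    va_start: int  # first virtual address of the segment
    size: int      # length of the segment in bytes
    pa_start: int  # where the segment sits in physical memory


class SegmentationFault(Exception):
    """Raised when a virtual address belongs to no segment."""


def translate_segmented(table: list, va: int) -> int:
    for seg in table:
        offset = va - seg.va_start
        if 0 <= offset < seg.size:
            return seg.pa_start + offset
    raise SegmentationFault(f"virtual address {va:#x} is not mapped")


# A 16K address space with a large unmapped hole between heap and stack.
segment_table = [
    Segment("code", 0x0000, 0x0800, 0x4000),
    Segment("heap", 0x0800, 0x0400, 0x9000),
    Segment("stack", 0x3C00, 0x0400, 0x2000),
]
```

Here only 4K of physical memory backs the 16K address space; translate_segmented(segment_table, 0x0900) lands in the heap at 0x9100, while an address in the hole, such as 0x2000, raises SegmentationFault.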
Problem of segments: because of their varying size, segmentation leads to fragmentation of the memory - the state where we have enough available space, but not in a contiguous block - memory is wasted
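A small sketch of what fragmentation looks like in numbers. The list of free holes is invented for illustration: it stands for physical memory after segments of different sizes have come and gone.

```python
def can_place(free_holes: list, size: int) -> bool:
    """A segment must fit entirely inside a single free hole."""
    return any(hole_size >= size for _start, hole_size in free_holes)


# (start address, size) of each free hole in physical memory; made-up values.
free_holes = [(0x1000, 4 * 1024), (0x5000, 4 * 1024), (0xA000, 4 * 1024)]

total_free = sum(hole_size for _start, hole_size in free_holes)
fits_6k = can_place(free_holes, 6 * 1024)
```

In this example 12K is free in total, yet a 6K segment cannot be placed because no single hole is larger than 4K.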

While splitting and mapping memory as segments - chunks of varying sizes - allows more flexibility and wastes less space, it also leads to fragmentation because of the differing sizes of replaced segments
This leads us to the last approach to try: split and map memory as fixed-size chunks, aka pages
A typical page size on a modern OS is 4K (= 4096 bytes)
The address space is split into pages, just like the physical memory (in physical memory we call them page frames - frames that can hold pages)
A per-process page table is maintained that maps virtual pages (VP) to physical page frames (PF)
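A minimal sketch of translation through a per-process page table, assuming 4K pages, the 16K address space from earlier (4 virtual pages) and 64K of RAM (16 page frames). The dictionary representation, the frame numbers, and the PageFault name are assumptions for illustration.

```python
PAGE_SIZE = 4096  # 4K pages


class PageFault(Exception):
    """Raised when a virtual page has no page frame assigned."""


def translate_paged(page_table: dict, va: int) -> int:
    # Split the VA into a virtual page number and an offset within the page.
    vp, offset = divmod(va, PAGE_SIZE)
    if vp not in page_table:
        raise PageFault(f"virtual page {vp} is not mapped")
    # The offset is unchanged; only the page number is replaced by the frame number.
    return page_table[vp] * PAGE_SIZE + offset


# VP -> PF for one process; virtual page 2 is currently unmapped.
page_table = {0: 5, 1: 2, 3: 11}
```

For example, virtual address 0x1004 is offset 4 in virtual page 1, which maps to page frame 2, so it translates to 0x2004. Because every chunk has the same size, any free frame can hold any page.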

To follow. For now check out OSTEP Ch. 18 and 19

To follow. For now check out OSTEP Ch. 20 and 21