Something that, for any given address, returns where the address is
actually located
How would we implement such a function?
For simplicity, let’s start by assuming that stuff can only be
stored in RAM and that we only have, say, 64K of RAM available and that
an address space is only 16K
So we just need a translation, from virtual addresses (VA) to
physical addresses (PA), for a particular process
Signature, something like:
translate : Proc VA -> PA
Let’s also think about two processes that are running at the same
time, and thus have their bits in RAM - maybe it’s even the same
program
Well, we could come up with all sorts of translations, e.g., using
the process IDs to somehow partition the physical addresses
Such (mathematical) approaches have problems: wasted space, how can
we predict how many processes will run, etc.
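As a rough illustration of such a formula-based translation, the sketch below gives each process a fixed 16K slot chosen by its process ID. The slot scheme and the names (translate_by_pid, RAM_SIZE, SPACE_SIZE) are assumptions made for this example, not something these notes prescribe.

```python
RAM_SIZE = 64 * 1024      # 64K of physical memory, as assumed above
SPACE_SIZE = 16 * 1024    # each address space is 16K

def translate_by_pid(pid: int, va: int) -> int:
    """Map a virtual address to a physical one using only arithmetic on the PID."""
    if not 0 <= va < SPACE_SIZE:
        raise ValueError("virtual address outside the 16K address space")
    pa = pid * SPACE_SIZE + va   # process `pid` owns slot number `pid`
    if not 0 <= pa < RAM_SIZE:
        raise ValueError("no slot exists for this process ID")
    return pa
```

With these numbers only process IDs 0 to 3 fit, and each one ties up a full 16K slot even if the process uses only a fraction of it.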
How about we just use a table?
Mapping VA to PA?
Options: global or per-process?
What is in the table?
Each entry is a VA and the corresponding PA?
Each entry is the location of the process’ address space?
Each entry is a variable-sized range of VAs and a range of PAs?
Segments
Each entry is a fixed chunk of VAs and PAs?
Pages
What are the downsides/upsides?
Mapping address-by-address
This is (obviously?) the most inefficient solution
If we had a table that maps individual virtual addresses to physical
addresses, we would need at least as much memory to store our
translation table as the amount of memory we are trying to map
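A back-of-the-envelope check of the table size, assuming byte-addressable memory, the 16K address space and 64K of RAM from above, and one table entry per virtual address. The entry layout is an assumption for illustration.

```python
RAM_SIZE = 64 * 1024
SPACE_SIZE = 16 * 1024

# An entry must be able to name any of the 64K bytes of RAM.
pa_bits = (RAM_SIZE - 1).bit_length()     # bits needed for a physical address
entry_bytes = (pa_bits + 7) // 8          # rounded up to whole bytes

# One entry for every byte-sized virtual address in the address space.
table_bytes = SPACE_SIZE * entry_bytes
```

Under these assumptions pa_bits is 16, each entry takes 2 bytes, and the table for a single process takes 32K, twice the size of the address space it maps.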
Mapping the address space
Another approach is storing the boundaries of a process’ address
space
What do we need? Two addresses per process: where the address space
starts (the base) and where it ends (the bound)
Then to translate an address, we
Take the virtual address and add it to the base address to get the
physical address
Then we check that the physical address is within the boundary of
the address space
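The two steps above can be sketched as follows. The function name, the AddressFault exception, and the convention that the bound is the first physical address past the end of the address space are assumptions for this example.

```python
class AddressFault(Exception):
    """Raised when a translated address falls outside the address space."""

def translate(base: int, bound: int, va: int) -> int:
    """Base-and-bounds translation for one process.

    base  -- physical address where the address space starts
    bound -- physical address just past its end (assumed exclusive)
    """
    pa = base + va                 # step 1: add the virtual address to the base
    if not base <= pa < bound:     # step 2: check we are still inside the space
        raise AddressFault(f"virtual address {va} is out of bounds")
    return pa
```

For a process whose 16K address space was placed at 32K, translate(32 * 1024, 48 * 1024, 100) gives 32868, while virtual address 16384 is rejected.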
Aside: Hardware Support
Where does this translation take place?
One answer could be: the OS, since the OS is the manager of the
process and the process’ address space
This would mean that for every instruction we execute, the OS would
have to translate at least one address (the address of the instruction
itself)
But this contradicts our goal of direct execution for
programs and would make things really slow
Just like with multiprocessing, the OS relies on hardware to allow
efficiently managed but direct execution of a program’s instructions
on the CPU
The piece of hardware is typically called a Memory Management Unit
(MMU)
These can be separate chips, but are usually part of the CPU itself
nowadays
For mapping complete address spaces, the MMU is really simple:
just two registers holding the base and the bounds for the currently
running process
The OS’ responsibility is to maintain base and bounds for each
process in its data structures, and to load the two registers for the
process that is about to run
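To make this division of labour concrete, here is a small simulation. The MMU and OS classes, the process-table layout, and the method names are all hypothetical; a real MMU does the translation in hardware.

```python
class AddressFault(Exception):
    """Raised when a translated address falls outside the address space."""

class MMU:
    """Simulated MMU: nothing more than two registers."""
    def __init__(self) -> None:
        self.base = 0
        self.bound = 0

    def translate(self, va: int) -> int:
        pa = self.base + va
        if not self.base <= pa < self.bound:
            raise AddressFault(f"virtual address {va} is out of bounds")
        return pa

class OS:
    """Keeps base and bounds per process and loads them on a switch."""
    def __init__(self, mmu: MMU) -> None:
        self.mmu = mmu
        self.process_table = {}          # pid -> (base, bound)

    def create_process(self, pid: int, base: int, size: int) -> None:
        self.process_table[pid] = (base, base + size)

    def switch_to(self, pid: int) -> None:
        # Load the two registers for the process that is about to run.
        self.mmu.base, self.mmu.bound = self.process_table[pid]
```

The same virtual address then lands in different places depending on which process the OS switched to last, without the OS being involved in each individual translation.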
Segmentation
Mapping complete address spaces is wasteful
A typical address space will have a large chunk of space that is not
being used - between the heap and the stack
A better approach is to map individual segments of the
address space: code, data, heap, stack, etc.
These can vary in size - for example the heap and the stack will
grow and shrink during the existence of a process
Separating the segments gives the OS more flexibility in using the
available memory
The MMU gets a little more complicated: we no longer need 2
addresses per process, we need 2 addresses per segment
If there is a small number of segments, these can be registers
If there is a larger number, a segment table is needed, stored in
memory, instead of the CPU
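A minimal sketch of translation through a segment table. These notes do not say how a virtual address names its segment; this example assumes the top 2 bits of a 14-bit virtual address (16K = 2^14) select the segment and the remaining 12 bits are the offset. It also stores a base and a size per segment instead of two absolute addresses, and ignores that a stack usually grows downward. All names and table contents are made up.

```python
OFFSET_BITS = 12                       # 14-bit VA: 2 segment bits + 12 offset bits
OFFSET_MASK = (1 << OFFSET_BITS) - 1

class SegmentFault(Exception):
    """Raised for an unmapped segment or an offset past the segment's size."""

# segment number -> (physical base, size in bytes); values are illustrative
segment_table = {
    0: (32 * 1024, 2 * 1024),   # code
    1: (34 * 1024, 1 * 1024),   # heap
    3: (28 * 1024, 1 * 1024),   # stack
}

def translate(va: int) -> int:
    segment = va >> OFFSET_BITS        # which segment the address belongs to
    offset = va & OFFSET_MASK          # position inside that segment
    if segment not in segment_table:
        raise SegmentFault(f"segment {segment} is not mapped")
    base, size = segment_table[segment]
    if offset >= size:
        raise SegmentFault(f"offset {offset} is beyond segment {segment}")
    return base + offset
```

Only the segments that exist take up physical memory; the unused gap (segment 2 here) has no entry and costs nothing.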
Problem with segments: because of their varying size, segmentation
leads to fragmentation of the memory - the state where we have enough
available space in total, but not in one contiguous block, so memory is
wasted
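A small example of this state, using a hypothetical list of free holes and a first-fit search. Both the hole sizes and the search strategy are assumptions for illustration.

```python
# Free holes in physical memory as (start, size) pairs; values are made up.
free_holes = [(0, 3 * 1024), (10 * 1024, 2 * 1024), (20 * 1024, 3 * 1024)]

def first_fit(holes, size):
    """Return the start of the first hole that can hold `size` bytes, or None."""
    for start, hole_size in holes:
        if hole_size >= size:
            return start
    return None

total_free = sum(size for _, size in free_holes)   # 8K free in total
placement = first_fit(free_holes, 4 * 1024)        # no single hole holds 4K
```

Here 8K is free overall, yet a 4K segment cannot be placed anywhere, so placement is None.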
Paging
While splitting and mapping memory as segments - chunks of varying
sizes - allows more flexibility and wastes less space, it also leads to
fragmentation because of the differing sizes of replaced segments
This leads us to the last approach to try: split and map memory as
fixed-size chunks, aka pages
A typical page size on a modern OS is 4K (= 4096 bytes)
The address space is split into pages, just like the physical memory
(in physical memory we call them page frames - frames that can hold pages)
A per-process page table is maintained that maps virtual pages (VP)
to physical page frames (PF)
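A minimal sketch of a page-table lookup with 4K pages. The page table contents, the PageFault exception, and the function name are hypothetical; the split of an address into a page number and an offset follows from the fixed page size.

```python
PAGE_SIZE = 4096

class PageFault(Exception):
    """Raised when a virtual page has no page frame assigned."""

# Per-process page table: virtual page number -> physical page frame number.
page_table = {0: 5, 1: 2, 3: 9}

def translate(va: int) -> int:
    vp, offset = divmod(va, PAGE_SIZE)   # virtual page and offset within it
    if vp not in page_table:
        raise PageFault(f"virtual page {vp} is not mapped")
    pf = page_table[vp]
    return pf * PAGE_SIZE + offset       # same offset, inside the page frame
```

Because every page and every page frame has the same size, any free frame can hold any page, so neighbouring virtual pages do not need neighbouring frames.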