Lecture 3: Memory, the Stack, Recursion

Introduction

Today, we’ll discuss:
1. How memory is laid out when our program is resident
2. What the different segments of our program’s address space are
3. What a call stack is and how we use it
4. How to write functions & the assembly design recipe

Memory and addressing modes

From our (a programmer’s) perspective, main memory (think RAM, even though the story is more complicated) is a byte array indexed by addresses from 0 to 2⁶⁴ − 1 (on our 64-bit systems)
In assembly, we normally access memory indirectly using an address stored in a register or via a label

E.g., if a valid address is stored in %rax, the following will first store the number 1234 at that address and then move the contents of that address into %rbx

# move the quadword 1234 into memory at the address stored in %rax
movq $1234, (%rax)  
# move the quadword stored in memory, at the address stored in %rax, into %rbx
movq (%rax), %rbx

We can also use offsets (displacement) to access memory a certain number of bytes before or after an address stored in a register:
```
# copy the 4 bytes at the address %rax - 8 to the address %rax + 4
movl -8(%rax), %ebx
movl %ebx, 4(%rax)
```
Remember: how much data is actually moved depends on the instruction’s size suffix (movl moves 4 bytes, movq moves 8)
Note: an x86 mov instruction cannot have two memory operands, so you cannot move a value from one memory location to another with a single mov. The following is not valid:
```
movl -8(%rax), 4(%rax) # ERROR
```
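
To make the byte arithmetic concrete, here is a minimal C sketch of what the valid two-instruction copy above does. The helper name copy_dword and the variable tmp (standing in for %ebx) are made up for illustration, and memcpy is used so the sketch does not depend on alignment.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper mirroring
 *   movl -8(%rax), %ebx
 *   movl %ebx, 4(%rax)
 * where `addr` plays the role of the address held in %rax. */
void copy_dword(unsigned char *addr) {
    uint32_t tmp;                        /* stands in for %ebx */
    assert(addr != NULL);
    memcpy(&tmp, addr - 8, sizeof tmp);  /* load 4 bytes from addr - 8 */
    memcpy(addr + 4, &tmp, sizeof tmp);  /* store 4 bytes at addr + 4 */
}
```

The caller has to make sure that both addr - 8 and addr + 4 lie inside memory it owns, just as the assembly assumes %rax holds a valid address.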

Working with arrays

Working with consecutive chunks of the same size (i.e., arrays) is made easier using a so-called base-index-scale syntax
This addressing mode uses a base register (the beginning of the array), an index register (the element index) and scale (the size of each element in bytes)
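
In AT&T syntax such an operand is written (base, index, scale), optionally with a displacement in front, and it refers to the address base + index * scale. The C sketch below spells out that computation. The names element_address and load_long are hypothetical, and the registers mentioned in the comments are just one possible assignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: computes base + index * scale, the address an
 * operand such as (%rdi,%rcx,8) refers to when %rdi holds `base`,
 * %rcx holds `index`, and the scale is 8. */
uintptr_t element_address(uintptr_t base, size_t index, size_t scale) {
    /* x86-64 only permits scale factors of 1, 2, 4 and 8 */
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    return base + index * scale;
}

/* Reads element `index` of an array of longs through the computed
 * address, much as movq (%rdi,%rcx,8), %rax would on a system where
 * long is 8 bytes. */
long load_long(const long *array, size_t index) {
    uintptr_t addr = element_address((uintptr_t)array, index, sizeof(long));
    return *(const long *)addr;
}
```

For an array of 4-byte ints the scale would be 4 and the instruction a movl.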

A Program’s Address Space

When our program is loaded into memory, some of the things that go there come directly from the executable: the code (.text) and global variables (.data, .bss)
Other things, such as the stack and the heap, are created while the program is running

A program’s memory space (that is, the portion of memory that a program can access and use) is partitioned into a few chunks (segments):

+-----------------------------------+ <- High address
| Environment vars + args           |
+-----------------------------------+
|         STACK                     |
|           |                       |
|           v                       |
|...................................|
|                                   |
|                                   |
|                                   |
|                                   |
|                                   |
|                                   |
|                                   |
|                                   |
|                                   |
|                                   |
|...................................|
|             ^                     |
|             |                     | Dynamically allocated memory
|            HEAP                   |
+-----------------------------------+
| Uninitialized globals (.bss)      |
+-----------------------------------+
| Initialized globals (.data)       |   .text, .data, .bss come from the executable
+-----------------------------------+
|                                   |
| Code (.text)                      |
+-----------------------------------+
| OS stuff                          |
+-----------------------------------+ <- low address
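
One way to get a feel for this layout is to look at where a few objects actually end up. The following C sketch (hypothetical names throughout) collects one address from each region. The exact values, and even their relative order, depend on the OS, the compiler and address randomization, so treat it as an experiment rather than a guarantee.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

int initialized_global = 7;  /* initialized global: .data */
int uninitialized_global;    /* uninitialized global: .bss */

/* One sample address per region of the diagram. */
struct segment_sample {
    uintptr_t code, data, bss, heap, stack;
};

/* Hypothetical helper: records where a function, two globals,
 * a heap block and a local variable live. */
struct segment_sample sample_segments(void) {
    struct segment_sample s;
    int local = 0;             /* lives in this function's stack frame */
    void *block = malloc(16);  /* lives on the heap */
    assert(block != NULL);     /* keep the sketch simple: no recovery */

    /* Converting a function pointer to an integer is not strict ISO C,
     * but it works on the usual x86-64 toolchains. */
    s.code  = (uintptr_t)&sample_segments;
    s.data  = (uintptr_t)&initialized_global;
    s.bss   = (uintptr_t)&uninitialized_global;
    s.heap  = (uintptr_t)block;
    s.stack = (uintptr_t)&local;

    free(block);
    return s;
}
```

Printing the fields (for example with the PRIxPTR format macro from <inttypes.h>) and comparing them against the diagram is a useful exercise.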

The Stack (and the Heap)

The stack and the heap are 2 areas where a program can allocate memory during its lifetime
Heap-allocated memory is managed using library functions which pass requests to the OS
Lifetime of data on the heap can vary - memory needs to be allocated and deallocated explicitly
We’ll talk more about the heap when we start working in C next week
The stack is used for “automatic” local allocation - it is very easy and quick to allocate memory
Memory is released when a function returns
The stack is organized in stack frames which are managed using the registers %rsp (the stack pointer) and %rbp (the base pointer)
The other thing to note is that while the heap grows upward (“new” memory will have a higher address than “old” memory), the stack grows downward, from higher addresses to lower addresses

Stack Frames

A stack frame is an area of the stack delimited by the registers %rbp and %rsp
Normally, every time a function is called, it sets up a stack frame for itself for storing local information
Once the function exits, the stack frame is released
Setting up the stack frame is exactly what the instruction enter does
On the other hand, releasing the stack frame is the job of leave - this cleans up whatever the function might have stored on the stack

Setting up the stack frame can also be achieved using the following pair of instructions:

pushq %rbp      # save the previous stack frame base to the stack
movq %rsp, %rbp # copy the current stack pointer into the base pointer, creating a stack frame of size 0

leave can then be replaced by the following instructions:

movq %rbp, %rsp # drop the current stack frame by resetting %rsp to the base of the frame
popq %rbp       # restore the previous frame base

As mentioned above, stack frames are useful for storing local information a function needs during its lifetime
This can be done either with push/pop or by using offsets from %rbp as local variables

How does this work?

First we need to allocate some number of bytes on the stack
Let’s say that we want to store two long variables (let’s call them a and b) on the stack
That’s a total of 16 bytes
We’ll tell enter that we want an initial stack frame of size 16 bytes instead of 0:
```
enter $16, $0
...
```

Now we can map a to -8(%rbp) and b to -16(%rbp) (remember that the stack grows downward!)

...
movq $42, -8(%rbp)    # a = 42
movq $1, -16(%rbp)    # b = 1
addq $12, -16(%rbp)   # b += 12

# return b;
movq -16(%rbp), %rax
leave
ret
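
For comparison, this is roughly the C function that the assembly above corresponds to. It is only a sketch: the name locals_demo is made up, and a C compiler is free to keep a and b in registers instead of on the stack.

```c
/* Hypothetical name; the notes do not name this function. */
long locals_demo(void) {
    long a = 42;  /* mapped to -8(%rbp) in the assembly  */
    long b = 1;   /* mapped to -16(%rbp) in the assembly */
    b += 12;
    (void)a;      /* a is stored but never read, as in the assembly */
    return b;     /* the result travels back in %rax */
}
```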

We’ll need to use local variables if:
1. we have more locals than available registers
2. we write recursive functions

Writing Functions

In assembly, a function is represented as a label, a prologue (with stack frame setup), a body, and an epilogue (stack frame teardown and return)
In this class, we also add comments with the C signature and variable mappings (following the Assembly Design Recipe)

Here’s an example:

# long double(long x)
# x -> %rdi
double:
  # PROLOGUE
  enter $0, $0

  # BODY

  # return x + x;
  movq %rdi, %rax
  addq %rdi, %rax

  # EPILOGUE
  leave
  ret

The ret instruction jumps back to the instruction right after the call instruction that invoked the function
How does ret know where to jump? The call instruction pushes the return address onto the stack just before it jumps to the function; ret pops that address off the stack and jumps to it
Now, go and read Nat Tuck’s Assembly Design Recipe
The recipe breaks the process of writing a function into 5 steps:
1. Signature
2. Pseudocode
3. Variable mappings
4. Skeleton
5. Body

Writing Recursive Functions

After reading the assembly design recipe, let’s try to write a recursive factorial function in assembly

Signature
- Our function will take an unsigned long and will return an unsigned long
```
# unsigned long fact(unsigned long n)
```

Pseudocode

As “pseudocode”, we’ll write the usual recursive C implementation of factorial:

unsigned long fact(unsigned long n) {
  if (n < 2) 
    return 1;
  else
    return n * fact(n - 1);
}

Variable mappings

# n -> %rdi

Skeleton

…

Body

…

To be continued…