## Assignment 10: Racer Snake: Register allocation

#### Due: Wed 4/10 at 8:59pm

`git clone `

In this compiler you’ll implement register allocation, to replace the naive stack allocation scheme. After all, Register Allocation Contributes to Efficient Runtime...

There is relatively little starter code this time. The main change is a new
command-line flag, `-alloc <naive|register>`

, to configure your compiler
to use either the former naive allocation strategy or your new register
allocation strategy. You can also use this command line flag in your options
files for your tests; there is not a test helper-function in `test.ml`

to
do this for you, though you can easily add one.

### 1` `Language

Racer starts with the same semantics and runtime behavior as Garter, and simply revisits the naive stack-allocation scheme to assign locations to let-bound variables.

### 2` `Implementation

Read the TODO list below for a recommended order of operations for this assignment. Here, we’ll focus on the new code, rather than the refactorings of old code.

Cache the free variables on every expression.

Define an interference graph for a given program.

Color the graph such that no edge has two endpoints of the same color.

Map the coloring back to actual locations

Implement a dead-variable elimination pass

#### 2.1` `Caching free variables

You have already implemented `free_vars`

in Fer-de-lance, to
compute the set of free variables of a given expression. In this
assignment, you’re going to need to access those free variable sets
repeatedly, so rather than recomputing them over and over (which would
be exponential in the size of your AST), you should design a function
`free_vars_cache`

that takes a tagged `aprog`

and walks the AST,
computing the free variables at every expression, and produces a new
`aprog`

that stores those free variable sets.

#### 2.2` `Defining an interference graph

Consider a let-expression `let x = e in b`

. In order for
`b`

to run correctly, all of its free variables must not be
placed wherever we place `x`

. Accordingly, we define
inteference edges between `x`

and every free variable within
`b`

. Implicitly, this means we need to add nodes to the
interference graph for each variable mentioned in this whole
expression. Note that we do not add interference edges between
`x`

and any free variables of `e`

, since `x`

isn’t
defined until after `e`

executes.

Consider a lambda-expression `lam(x, y, ...): b end`

. We build
an interference graph for `b`

. To that graph, add interference
edges for every pair of free variables of the entire lambda (i.e., the
free variables of `b`

except for the arguments `x`

,
`y`

, etc.), since those are precisely the values that are packed
into the closure and that need to be unpacked to distinct locations
when the closure gets called.

Consider a let-rec expression
`let rec x1 = e1, x2 = e2, ... in b`

.
By well-formedness, each of the `e#`

expressions is a
lambda. We therefore collect the union of the free variables of all
the lambdas, and add interference edges with each of the `x#`

variables. We also add interference edges between each pair of
`x#`

variables. (Note that this technically introduces
some potentially-unneeded interference edges, but the bookkeeping
needed to avoid the consequences of them (in how you fill in the
bodies of the closures) is sufficiently painful to make these edges
worthwhile.)

When building the interference graph for an `aexpr`

expression,
note that every `cexpr`

cannot introduce additional interference
edges...except for `CIf`

, which can contain `aexpr`

s that
themselves contain further let-bindings or lambdas. In particular,
do not recur into any lambda expressions; only use their cached
free-variables set. I.e. the interference graph for the lambda itself
should not be merged with the interference graph for the expression
containing that lambda expression.

#### 2.3` `Graph coloring

Given an interference graph, we need to color each node such that no
edge of the graph has two endpoints with the same color. Our set of
“colors” consists of each of the general-purpose registers that are
not already in use by our program. This would be `R10`

,
`R11`

, `R12`

, `R13`

, `R14`

, and `RBX`

, possibly
excluding one or two of those as temporary registers. (You are
welcome to try to include the six argument-passing registers
`RSI`

, `RDI`

, `RCX`

, `RDX`

, `R8`

, `R9`

, though
additional stack-`push`

and `pop`

bookkeeping will be needed
to preserve them across function-calls. If you’ve been pushing
arguments onto the stack for your calling convention, then you should
definitely include these registers in your available set of registers
to use!) Additionally, we can create arbitrarily many more “colors”
corresponding to stack slots `RBP-8`

, `RBP-16`

, etc. Each of
these colors is an `arg`

.

Graph coloring is rather inconveniently an NP-complete problem to solve optimally, though there are many heuristics. We’re going to implement the so-called smallest-last algorithm. We’ll break the process of graph coloring into two steps: first, producing a worklist of nodes that we’ll process in order, and then processing each node of that worklist in turn. Our worklist will be a stack. To determine the order to process the nodes of the graph:

Initialize the worklist to the empty stack.

Find the node in the graph of smallest degree. Push it onto the worklist, and remove that node (and all its edges) from the interference graph.

Repeat this process with the reduced graph, repeatedly removing the node of highest (remaining) degree and pushing onto the stack, until the graph is empty.

To actually color the nodes:1A fairly simple greedy argument shows that we definitely don’t need more colors than one plus the maximum degree of any node in the graph. The reason the smallest-last ordering tends to work well is that each step shrinks the degress of the remaining vertices, which means that the largest degree ever encountered in this process (the so-called "degeneracy" of a graph) is lower than than the original largest degree. By coloring the vertices in the reverse order of their removal, we ensure that all the vertices so far can be colored in at most the maximum degree encountered so far, which in turn is at most the degeneracy of the original graph.

Create a mapping from node names to colors (i.e. an

`arg envt`

), initialized to the empty mapping.For each node on the stack, examine all of its neighbors in the original interference graph, and collect all their currently-known colors. Find the minimum color that isn’t used yet by that set of colors, and allocate that new minimum color to that node. Obviously, we prefer register “colors” to stack-slot colors!

Repeat until the stack is empty.

This `arg envt`

is the final environment for that lambda.

You will compute an interference graph and color it, for each function
in your program (including `our_code_starts_here`

...), and you’ll
store these resulting environments in a map from function names to
environments —`arg envt envt`

that matches the
return type of `naive_stack_alloc`

, so your register allocator
function should be a drop-in replacement for that simpler function.

#### 2.4` `Dead variable elimination

(Required for CS6410; optional for CS4410)

Consider the program `let x = e1 in let y = e2 in x`

. According
to our rules for interference graph construction, `x`

and
`y`

interfere. However, `y`

is dead: it is never
used after its definition. Therefore we don’t even need to store its
value at all! In general, though, we can’t eliminate the computation
of `e2`

, since it may have side effects. Instead, we can detect
this scenario, where a variable defined by a let is not used within
its own body, and transform the `ALet`

into an `ASeq`

.

The analogous situation for let-rec expressions is trickier. Consider
a let-rec `let rec x1 = e1, x2 = e2, ... in b`

where some of the
bindings are not used in any of the other bindings. Because of
well-formedness, the right hand sides of those bindings are lambdas
—`e#`

, and find the
set of unused `x#`

variables (that aren’t in that union)...and
just delete those bindings entirely.

Implement these two transformations as a new phase of your compiler,
after `free_vars_cache`

and before register-allocation.

### 3` `Recommended TODO List

Move over code from past assignments and/or lecture code to get the basics going.

Implement

`free_vars_cache`

, to take a tagged`aprog`

and produce an`aprog`

augmented with the set of free variables at every single expression position. Test as you go.Duplicate your existing

`naive_stack_allocation`

implementation into the body of`register_allocation`

. These two functions should already be wired up to the compilation pipeline in`compile.ml`

.Modify

`register_allocation`

to initially call`free_vars_cache`

, and then refactor its behavior to depend entirely on those cached results: at this point you should have code that works identically to`naive_stack_allocation`

, but whose code is ready for the next step.Implement

`interfere`

for everything other than lambda expressions (i.e. skip`CLambda`

and`ALetRec`

). Test very carefully: minor bugs here can make later allocation decisions very tricky to debug. Write several test programs that avoid let-rec and lambdas, but use variables in interesting ways: work your way back up through all the languages we’ve seen. Confirm that all let-free Adder programs use at most one register. Confirm that all let-free Boa programs use at most as many registers as the rightmost depth of the infix expression. Confirm that any programs using lets use an expected number of registers. Confirm that the branches of if expresions don’t interfere with each other in Cobra programs. Etc...Implement

`color_graph`

, testing as you go.Revise

`register_allocation`

to use`interfere`

and`color_graph`

to make allocation decisions, rather than just assigning stack slots to all variables. You should be able to run existing integration tests by omitting builtins, for now.Add support for

`CLambda`

and`ALetRec`

to`interfere`

. Test very carefully.

### 4` `List of Deliverables

all your modified files

tests in an OUnit test module (

`test.ml`

)any test input programs (

`input/*/*.racer`

files)

Again, please ensure the makefile builds your code properly. The black-box tests will give you an automatic 0 if they cannot compile your code!

DO NOT SUBMIT YOUR `.git`

DIRECTORY! For that matter, don’t submit
your `output`

or `_build`

directories. Basically, run
`make clean`

and then submit a zip of the remaining directory.

### 5` `Grading Standards

For this assignment, you will be graded on

Whether your code implements the specification (functional correctness),

the clarity and cleanliness of your code, and

the comprehensiveness of your test coverage

### 6` `Submission

Wait! Please read the assignment again and verify that you have not forgotten anything!

Please submit your homework to https://handins.khoury.northeastern.edu/ by the above deadline.

1A fairly simple greedy argument shows that we definitely don’t need more colors than one plus the maximum degree of any node in the graph. The reason the smallest-last ordering tends to work well is that each step shrinks the degress of the remaining vertices, which means that the largest degree ever encountered in this process (the so-called "degeneracy" of a graph) is lower than than the original largest degree. By coloring the vertices in the reverse order of their removal, we ensure that all the vertices so far can be colored in at most the maximum degree encountered so far, which in turn is at most the degeneracy of the original graph.