Homework 9: More Mutability

Due: Sunday, November 9th, 11:59pm

Start with the template code here.

Replace each failwith "TODO" with your solution.
Be sure to test your solutions in utop or in tests alongside your submission. To load your file into utop, use #use "hw9.ml";;.
Submit your version of hw9.ml to the HW9 assignment on Gradescope. Only submit hw9.ml; NOT any other files or folders.
At the top of the template file, there is a comment for you to write in approximately how many hours you spent on the assignment. Filling this in is optional, but helpful feedback for the instructors.

Q1: Combining Modules and Mutable State

Using mutable state, we can implement a notion of a "fresh integer" like so:

module Fresh = struct
    let ctr = ref 0
    let fresh () = 
        ctr := !ctr + 1;
        !ctr
end

The intention is that the surrounding OCaml code can use the Fresh module and be assured that, whenever fresh : unit -> int is called, we truly get an integer that wasn't returned before by that function. Unfortunately, the above module does not guarantee this property. The following module does, however:

module type FreshSig = sig
    val fresh : unit -> int
end

module FixedFresh : FreshSig = struct 
    let ctr = ref 0
    let fresh () = 
        ctr := !ctr + 1;
        !ctr
end

In a code example and accompanying comment, give a piece of code which shows that we can trick Fresh.fresh into returning the same integer on two different calls; and explain why the same code example won't work with FixedFresh. The code example should return a pair of integers --- each coming from different calls to fresh --- but should return the same number in both components.

(* Return two integers --- both coming from different calls to Fresh.fresh --- that should always be equal. *)
let break_fresh () : int * int = failwith "TODO"

(* TODO: Why does the above code not work for FixedFresh? 
    ---

*)

Q1: Grading Criteria

Whether break_fresh can return the same integer twice with two different calls to Fresh.fresh; and whether you comment explains why the same code won't work for FixedFresh.

Q2: Implementing References

As described in class, a value of type t ref is, in a sense, nothing more than a reference (or pointer) to a value of type t stored "somewhere else". When we call ref to create a new reference:

let x = ref 32 in 
...

What ends up happening (through many layers of indirection) is, very roughly, something like this: the OCaml program asks the surrounding environment for a fresh "memory cell". This memory cell is nothing more than an integer i, which indexes into a large buffer B of space. Then, when we call !x, we then index into the buffer using the index --- computing something like B[i] --- and grabbing the value there. If we call x := 5, then we are modifying this large buffer by setting B[i] := 5. The piece of code that manages this buffer B, and calculates and returns indices i, is called an allocator. If you have used C, then you have likely called malloc, which is this idea.

This is all happening under the hood, but we can bring this idea to light by building our own simple allocator in OCaml. The way we are going to do this is by having two types:

An arena, which holds a large buffer of memory we can use (like our B above); and
An mref (standing for manual reference), which is an index into this arena.

The idea is that an arena is a large chunk of data we can use (and reuse) to implement mutation. We actually use mutation via mrefs, which are just indices into the arena. Our mref type willl support the normal operations of ref --- creating new ones, assigning to them, and reading from them --- but also the ability to free an mref, which allows another piece of code to reuse that buffer space. When we free an mref, this is a promise that we will never read from that mref again; thus, further reads to the mref might fail. We can implement these ideas using a module type:

module type ManualRefSig = sig
    type 'a arena 
    type 'a mref

    (* Create a new arena. The argument (capacity of the arena) must be nonnegative. *)
    val fresh_arena : int -> 'a arena

    (* Find an unused slot in the arena and store the given value there; and return an mref to that location. 
       Return None if no free space is available.
    *)
    val alloc : 'a arena -> 'a -> 'a mref option

    (* Free the mref, which should allow that slot in the arena to be re-used. 
       If the mref points to an already freed location, the operation should do nothing. *)
    val free : 'a arena -> 'a mref -> unit

    (* Read an mref. Should return None if the mref is pointing to a freed location in the arena. *)
    val read : 'a arena -> 'a mref -> 'a option

    (* Write to an mref. Should return None if the mref is pointign to a freed location in the arena. *)
    val write : 'a arena -> 'a mref -> 'a -> unit option

    (* Do the two mrefs point to the same index in the arena? *)
    val phys_eq : 'a mref -> 'a mref -> bool

    (* Return the number of currently free cells in the arena. *)
    val num_free : 'a arena -> int

    (* Return the number of times we have called alloc on the arena. *)
    val num_allocations : 'a arena -> int
end

Your task in this question is to implement ManualRefSig:

module ManualRef : ManualRefSig = struct
    ...
end

Importantly, the only actual allocation of mutable memory you should use shall be stored in the arena. The 'a mref should merely store pure data (e.g., int), and should not hold a value of type 'a (since they are supposed to be stored in the arena).

To create a fresh arena, you should use (at least) OCaml's arrays, which are similar to ref but holds N values at once, for a user defined N. You create arrays using Array.make, and you read/modify them using Array.get and Array.set, respectively. While functions raise exceptions if you pass an out-of-bound index into the array, your code should never raise these exceptions. (Note: you may also use ref inside of the arena in addition to an Array.t, if both are useful.)

Other than using Array.t inside arenas, and the fact that mrefs can't store mutable data (e.g., no ref, Array.t, or other mutable data structures from the standard library), the rest of the implementation is up to you. Note that fresh_arena returns an 'a arena without specifying values for every slot; thus, the array inside the arena cannot hold 'a exactly, but a type that can optionally hold an 'a. Additionally, note that since alloc must find an unused slot in the arena, the arena must keep track of what slots are used and which are free.

Finally, we also require that you implement functions num_free : 'a arena -> int and num_allocations : 'a arena -> int, to get the current number of free slots in the arena and the total number of allocations performed in the arena, respectively. (Thus, you need to keep track of this data as well in the arena.)

Q2: Grading Criteria

Correctness of your implementation of ManualRef.

Q3: Specifications about Memory

Now that we have our own model of memory, we can use it to reason about the behavior of code with respect to memory. In particular: have we freed every reference created? And how much memory are we using? Ordinarily, answers to questions like these need to be made "outside" the program, using tools like Valgrind which instruments (adds additional checks to) your code to answer these questions. However, by using num_free and num_allocations, we can answer these questions more directly.

To make the solutions to Q3 be generic over the particular implementation of ManualRefSig, we will use a functor that takes a particular implementation of the module as an argument:

module MemorySpecs (R : ManualRefSig) = struct
    open R
    ... (* Q3 content here *)
end

We open R inside of the functor so that we can make reference to type types 'a arena directly (instead of 'a R.arena).

You can test your code in this question by writing

module MyMemorySpecs = MemorySpecs(ManualRef)

at the bottom of the file, and calling the functions in / opening the module MyMemorySpecs.

Q3.1 First, we can implement a function leakage_free to ask if a certain function f : 'a arena -> unit has any memory leaks. A call to f has a memory leak whenever the number of free cells in the arena is different after f is called than before f is called.

let leakage_free (a : 'a arena) (f : 'a arena -> 'b) : bool = failwith "TODO"

You may assume that the only memory leaks we care about are the ones arising from how f uses the arena a (i.e., no other arenas are used). As an example of a memory leak,

fun (a : int arena) -> 
    let _ = alloc a 32 in 
    ()

leaks a single unit of memory, because we are storing 32 in the memory but throwing away the pointer --- thus, we can not free that memory later.

Q3.2 Next, we can implement a function num_allocations_in that computes the number of allocations done by the function:

let num_allocations_in (a : 'a arena) (f : 'a arena -> 'b) : int = failwith "TODO"

Q3: Grading Criteria

Correctness of the two functions above (evaluated independently of the correctness of ManualRef from Q2).

Q4: Using Mutable Memory

In this question, we will explore an algorithm which uses mutable memory in a subtle way. Our problem starts with computing the n'th Fibonacci number \(F(n)\), where we define \(F(0) = 0, F(1) = 1\), and \(F(n) = F(n - 2) + F(n - 1)\) for \(n \geq 2\). We can easily implement this as a functional, recursive program:

let rec fib (n : int) = 
    if n <= 0 then 0 else 
        if n = 1 then 1 else
            fib (n - 2) + fib (n - 1)

(We use n <= 0 instead of n = 0 just to make sure that fib never loops forever.) The main problem with this function is its time complexity: to evaluate fib 5, we have to first:

evaluate fib 3, which calls fib 1 and then calls fib 2 (which calls fib 0 and fib 1); then,
evaluate fib 4, which calls fib 2 (which calls fib 0 and fib 1), and then calls fib 3, which calls fib 1 and then calls fib 2 (which calls fib 0 and fib 1).

As you can see, the number of calls to fib n is exponential in n! This is no good. Your job in this assignment is to use mutation to implement fib in a faster way.

Like Q3, we begin by setting up a functor:

module Fib (R : ManualRefSig) = struct
    open R
    module IntMap = Map.Make(Int)

    (* Helper functions *)
    let create_map (n : int) (f : int -> 'a) : 'a IntMap.t = 
        IntMap.of_list (List.init n (fun i -> (i, f i)))

    (* Create a map using f, and return None if any calls to f return None. *)
    let create_map_option (n : int) (f : int -> 'a option) : ('a IntMap.t) option = 
        let pairs = List.init n (fun i -> (i, f i)) in 
        if List.exists (fun p -> (Option.is_none (snd p))) pairs then 
            None 
        else 
            Some (IntMap.of_list (List.map (fun p -> (fst p, Option.get (snd p))) pairs))

    let iter_map (m : 'a IntMap.t) (f : int -> 'a -> unit) : unit = 
        IntMap.iter f m

    let fib_fast (a : int arena) (n : int) : int option = failwith "TODO"
end

You may implement fib_fast in any way you wish, with the following requirements:

It must compute the same final answer as fib, with the exception that if one of the functions inside of R (e.g., alloc / free / read / write) returns None, you should also return None. Given an arena with enough space, fib_fast should return Some with the correct answer.
fib_fast cannot itself be recursive on n. Instead, it can only use create_map/create_map_option to create an IntMap; and iter_map, which allows you to traverse the IntMap in ascending order of the integer keys. (This can emulate a for loop, since you can create a map from 0 to N, and then iterate over it in ascending order.)
fib_fast must not use ref or Array.t, and so on; but should instead use the functions inside of R to manipulate mutable memory using the supplied arena a. Additionally, you may not call fresh_arena. Thus, for this this problem, you may ONLY store integers inside of mrefs.
Your code should not have memory leaks: all mrefs you create with alloc must be freed before the function returns.
You may feel free to write helper functions, as long as these helper functions do not use recursion.

Hint

You can create an (int mref) IntMap.t (using create_map or create_map_option) which maps each index i to a fresh int mref which holds the value of fib i (initialized with -1, if that index hasn't been computed yet). Via create_map_option, you can return None if any allocation failed. Then, you can compute the value of fib n by iterating over the map via iter_map.

Q4: Grading Criteria

Correctness and leakage-freedom of fib_fast (again, which is evaluated independently of the correctness of ManualRef from Q2).