Lecture 7.2: More Functors and Mutability

More Uses of Functors

Just to wrap up our discussion of functors, last week we defined a type with an associated equality: a function which, given two ts, will decide if they are equal. This allows us to use the type in data structures which assume a given type with a user-defined equality.

Let's take a look at an EqType for string insensitive equality StringEq and string sensitive equality StringSensitiveEq:

module StringEq : (EqType with type t = string) = struct
    type t = string
    let equals p1 p2 = String.(lowercase_ascii p1 = lowercase_ascii p2)
end

module StringSensitiveEq : (EqType with type t = string) = struct
    type t = string
    let equals p1 p2 = p1 = p2
end

Now, what if we want a list of unordered pairs? We could implement this directly:

module StringListEq : (EqType with type t = string list) = struct
    type t = string list
    let equals p1 p2 = ???
end

But we would soon see that the complexity of equals increases dramatically with the size of the type. Instead, we can use functors to build new EqTypes out of old ones:

module EqTypeList (T : EqType) : (EqType with type t = T.t list) = struct
    type t = T.t
    let rec equals xs ys = 
        match xs, ys with 
        | [], [] -> true
        | x :: xs', y :: ys' -> T.equals x y && equals xs' ys'
        | _, _ -> false

Here is another for trees:

type 'a tree =
  | Leaf of 'a
  | Node of 'a tree * 'a tree

module EqTypeTree (T : EqType) : (EqType with type t = T.t tree) = struct
    type t = T.t tree
    let rec equals t1 t2 = 
        match t1, t2 with
        | Leaf x, Leaf y -> T.equals x y
        | Node (t11, t12), Node (t21, t22) -> 
            equals t11 t21 && equals t12 t22
        | _, _ -> false
end

We can even generalize our unordered pair module:

module UnorderedPairOf (T : EqType) : (EqType with type t = T.t * T.t) = struct
    type t = T.t * T.t
    let equals p1 p2 = 
        (T.equals (fst p1) (fst p2) && T.equals (snd p1) (snd p2))
        &&
        (T.equals (snd p1) (fst p2) && T.equals (fst p1) (snd p2))
end

Then, we can get for free an EqType for unordered pairs of lists of trees that contain integers:

module IntEqType = struct
    type t = int
    let equals x y = (x = y)
end

module MyType = UnorderedPairOf (EqTypeList (EqTypeTree (StringEq)))

Mutability

Switching gears, you may have noticed that, so far in class, we have only worked with immutable data. As an example, the following function,

let add_one x = x + 1

let f () = 
    let x = 1 in 
    let y = add_one x in 
    let x = 37 in 
    print_endline ("y: " ^ string_of_int y)

will print

y: 2

The reason for this is that the second assignment of x to 37 is rebinding the value of x. Thus, the old version of x is, in a sense, still there: we just can't get access to it because we have shadowed the name x with a new binding. Thus, y still is equal to 2, even after we re-assign x to a new value.

Sometimes, we do want y to recompute and change when x changes. We can do that through making x a reference, and having y recompute whenever we want the value:

let add_one (x : int ref) = !x + 1

let f () = 
    let x : int ref = ref 1 in (* Create a new reference *)
    let y : unit -> int = fun _ -> add_one x in 
    x := 37; (* Mutate x to hold 37 *)
    print_endline ("y: " ^ y ()) (* Prints "y: 38")

While int/bool/string all stand for pure values (i.e., a specific integer, such as 42), a value of type int ref is a memory cell: a box which holds an integer. We create a memory cell by the function ref, which has type 'a -> 'a ref. In other words, it can take in something of any type 'a, and return a memory cell which holds an 'a. (Note here that ref means two different things: it is a function to create a new memory cell; and it is also a parameterized type of those memory cells.) Then, if x has type int ref, we can:

peek inside the box by calling !x, returning an int; and
modify the box by writing x := i, where i : int; this returns a unit.

Why have we waited until now to introduce mutability? The semantics can get tricky very fast. Let's see a few examples of increasing complexity:

(* Evaluates to 22 *)
let test1 () = 
    let y = ref 10 in 
    let f x = x + !y  in
    y := 20;
    f 2

Here's another one:

(* Evaluates to 13 *)
let test2 () = 
    let f x = 
        let y = ref 10 in 
        y := !y + 1;
        x + !y
    in
    let y = ref 0 in 
    f 2

How about this?

let test3 () = 
    (* Evaluates to 13 *)
    let f = 
        let y = ref 10 in
        fun x -> 
            y := !y + 1;
            x + !y
    in
    let _ = f 0 in 
    f 1

Hiding references inside of functions

Let's create a function f : unit -> int that returns an incremented number each time it's called:

let f =
    let y = ref 0 in 
    fun () -> 
        y := !y + 1;
        !y

Storing arbitrary types

You can store anything inside of a reference, including a function:

let p_to_f : (int -> int) ref = ref (fun x -> x) in 
let double_p_to_f () = 
    let f = !p_to_f in 
    p_to_f := (fun x -> f x * 2)
in
double_p_to_f ();
double_p_to_f ();
(* Evaluates to 8 *)
(! p_to_f) 2

Aliasing

A memory cell is represented in memory as an address. You can think of there being a global heap, or "scratchpad", that's holding the current state of values. For example, when we evaluate let x = ref 32, we can think of x's value being the heap address 0x1234. Then, we also have, on the side:

(* *** HEAP *** *)

(* Memory address 0x1234 holds 32 inside *)
0x1234 |-> 32

Then, if we modify x -- say, by evaluating x := 0 -- we return (), but get a new heap:

(* *** HEAP *** *)

(* Memory address 0x1234 holds 32 inside *)
0x1234 |-> 0

Since x's value is not the integer itself, but rather an address to the integer, we can think about what happens in the following program:

let x = ref 32 in 
let y = x in 
y := 0;
!x

This evaluates to 32. The runtime value of x is a memory location (e.g., our 0x1234), and when we let y = x, we are setting y to this same memory location. Thus, if we modify y, there is no guarantee that x stays the same.

This problem is called aliasing: references (pointers to memory cells) may point to the same location, and thus affect each other. Let's see how this could affect the following function:

let f (x : int ref) (y : int ref) = 
    y := 0;
    !x

Now, what happens when we call f like so?

let x = ref 2 in 
let y = ref 4 in 
f x y

This evaluates to 2, because x and y are different memory cells. But what about this?

let x = ref 2 in 
f x x

What gets evaluated here? We get 0 back. This is concerning, because the type of f has no information about whether x and y point to the same or different memory locations. Thus, as soon as we have functions, we can't really be confident that any ref doesn't coincide with any other one. We won't discuss this in more detail, but this is exactly the problem that Rust tackles.