Skip to content

Lecture 5.1: "Module" 2 with module files

Our discussion of polymorphism wraps up Module 1, where we are learning the foundations of OCaml.

Today we will touch base on type inference, a skill that will be useful beyond this class, and look into OCaml-specific (alt: functional programming specific) features to help us reason about complex software.

Sleuthing for types.

Last class, in lecture 4.3, we touched on type inference. We described, at a high level, what is going on. Similar to our design recipe for structural recursion, we'll slowly step through type inference to give you a sense for how break down this process slowly.

Looking at this function:

let f = fun a -> (snd a, fst a)

The main idea is to tabulate facts about the expression, starting from the top-level expression (the function), to the deepest expressions (fst a and snd a):

  1. Identify the overall shape of the expression: in this case, a single argument function: 'a -> 'b.
  2. Looking at expression just inside of the function body, we see a tuple / pair type: C? * D?
  3. Make the necessary substitution where 'b = 'c * 'd. Now we see f : 'a -> 'c * 'd
  4. Observe that fst and snd operate on pairs. So their input argument a must also be a pair. For now we will say a : 'x * 'y and we can write down that f : 'x * 'y -> 'c * 'd -- we aren't done, though!
  5. We still need to look at the expressions fst a and snd a. Given that we know that a : 'x * 'y, we'll see fst a : 'x and snd a : 'y. These are placed in a tuple, so (snd a, fst a) : 'y * 'x
  6. We had some placeholders for the output type, 'c * 'd, now we can observe that these are equivalent to 'y * 'x.
  7. Finally, we see there is no work left to do: we started with a top-level type 'a -> 'b and evaluated all nested expressions, so the final type of f is f : 'x * 'y -> 'y * 'x.
A little mental model to keep in mind is to think of yourself as a "type detective."

You enter the crime scene. Before you open the door you see...

let f = fun a -> (snd a, fst a)

The top level expression - it's a function, like others you've seen before (fun _ -> _), this one just has one argument. Simple:

f : 'a -> 'b

Entering the room, you see the first argument a sitting between fun and ->. Clearly, this a had no idea what it was in for. Fairly obvious this a is the same type as 'a. You open your notebook:

f : 'a -> 'b
a : 'a

What could the return be? It's a tuple (_,_), sticking out from under the table, like a box. The main subject of the investigation. Let's just give it the usual type 'c * 'd. Of course, we were looking for a 'b so we can update the facts:

f : 'a -> 'c  * 'd
a : 'a

Diving in: fst and snd. operations on tuples:

fst : 'x * 'y -> 'x
snd : 'x * 'y -> 'y

Looks like these functions did a number on a. Does it type check? It still can, 'a is unknown! Let's set 'a = 'x * 'y and add fst a and snd a to our evidence:

f : 'x * 'y -> 'c  * 'd
a : 'x * 'y
fst a : 'x
snd a : 'y

So far so good... just one thing left before heading back to the diner and donuts:

(snd a, fst a)

What is that again? Looks like a 'c * 'd. Or... was it an 'x * 'y? Ah, it's 'y * 'x, Sam and Ferd are trying to be clever. We still haven't made a call on 'c * 'd, so let's say something like:

(snd a, fst a) : 'c  * 'd = 'y * 'x

Since we have more information on a, we'll leave it as 'y * 'x:

f : 'x * 'y -> 'y * 'x

Just a few more minutes before you're off the clock. Looks like a polymorphic function, check, looked at every expression top-to-bottom, check, nothing left to do. Next up: diner and donuts.

For some more practice, check out this type inference problem generator game.

Motivating Modules

Polymorphism let's us remove boilerplate and think about the most general expression which would work for any arbitrary type: for instance, in hw4 you will see a polymorphic priority queue, exposed as a collection of functions which we package together with a record type. It looks approximately like the following:

type 'a queue = 'a list

type 'el q_ops = { 
  push : 'el queue -> 'el -> 'el queue;
  peak  : 'el queue -> 'el option ;
  (* ... *)
}

Today, we will look APIs in a different light: instead of using polymorphism, we will use abstraction. For example, let's take a look at the API for mutable queues in OCaml's standard library (slightly simplified, and leaving out many more functions):

type 'a t (* Queues holding elements of type 'a *)
val create : unit -> 'a t
val add : 'a -> 'a t -> unit
val take_opt : 'a t -> 'a option

Here, 'a t is the type for a mutable FIFO queue holding things of type 'a. The function create initializes a new queue; the function add pushes an element onto the queue; and the function take_opt will return and pop the top element of the queue (or return None), if no top element exists. Thus, this API represents a mutable queue, since add and take_opt are not pure functions, but instead have side effects. (This is different than your queue in HW4, which should not use side effects at all.) We will learn more about mutability in later weeks, but the overall point about APIs is the same regardless of mutability.

We can use the Queue module by doing the following:

utop # open Queue;;
utop # let q : int t = create ();;
val q : int t = <abstr>

utop # add 32 q;;
- : unit = ()

utop # add 24 q;;
- : unit = ()

utop # take_opt q;;
- : int option = Some 32

utop # take_opt q;;
- : int option = Some 24

utop # take_opt q;;
- : int option = None

The command open Queue;; opens the Queue module into the local scope. If we didn't open Queue, the function create and the type int t would be undefined.

Now, note that utop's answer for create () is val q : int t = <abstr>. This is because the t type for Queues is an abstract type. While one can find the definition of 'a t if we were to peer inside of Stdlib's implementation (and is a particular record type), this is not meant to be exposed to the user. OCaml enforces this in our client code. Thus, even though queue.ml contains a length field for the type t, we cannot access it:

utop # let l = q.length;;
Error: Unbound record field length

Why is the queue type called t? While one can use queues by calling open Queue, the more common way is to not call open Queue, and instead make reference to it as follows:

utop # let q : int Queue.t = Queue.create ();;
val q : int Queue.t = <abstr>

utop # Queue.add 32 q;;
- : unit = ()

utop # Queue.add 24 q;;
- : unit = ()

utop # Queue.take_opt q;;
- : int option = Some 32

utop # Queue.take_opt q;;
- : int option = Some 24

utop # Queue.take_opt q;;
- : int option = None

What's going on here is that, while we haven't called open Queue, the Queue module itself is still in scope (since it's part of the standard library). A module is sort of like a record --- because you can use . to access various fields of it --- but instead of being a value, it's a set of definitions, including types. The convention in OCaml is that if we have a module M that defines some type that M is "about", then we call that type t in the module --- so that we can say M.t for that type.

The List module

Another example of a module that you have probably seen already is List:

utop # List.map (fun x -> x + 1) [1; 2; 3];;
val - : int list = [2; 3; 4]

Here, List is a module from the standard library (as you can see here). Another way to use these definitions is to open the module:

utop # map (fun x -> x + 1) [1; 2; 3];;
Error: Unbound value map
Hint: Did you mean ma'x

utop # open List;;
utop # map (fun x -> x + 1) [1; 2; 3];;
- : int list = [2; 3; 4]

If you look at List, we see that the list type is defined to be:

type 'a t = 'a list = 
| []
| (::) of 'a * 'a list
Thus, instead of saying int list, we can also say int List.t:
utop # let xs : int List.t = [1; 2; 3];;
val xs : int list = [1; 2; 3]
(list is somewhat special in OCaml, so we typically just say list.)

Modules as files

If we had a file foo.ml:

foo.ml
let x : int = 32

we can make reference to it from another file. To do that, we first compile foo.ml into a library:

> ocamlc foo.ml -c
This creates foo.cmo. We can then load it into utop:
> utop foo.cmo
and then see that we can reference Foo.x (or open Foo and use x):
utop # let y = Foo.x;;
val y : int = 32

Capitalization of Module Names

Even if we call it foo.ml, OCaml enforces that all module names start with a capital letter --- so we get the module Foo. It's common in OCaml for us to write lib.ml, and have it turn into the module Lib.

Hiding Types in Interface Files

Let's replicate a version of Queue as our own file. We will make some simplifications: first, instead of a queue, we will implement a stack (last-in first-out); we will do so in an immutable style (thus not using state, and instead returning updated stacks; and we will fix the element type to be int.

foo.ml
type stack = int list
let empty : stack = []
let push : int -> stack -> stack = fun x xs -> x :: xs
let peek : stack -> int option = fun xs -> 
    match xs with 
    | [] -> None
    | x :: _ -> Some x

However, this exposes all of the details of foo.ml --- including the definitions of the types. We can see this below with utop foo.cmo (after compiling foo.ml):

utop # let xs = Foo.empty;;
val xs : int list = []

Thus, even though we defined stack in foo.ml, OCaml still knows that it is a list. Indeed, we can do bad things like:

utop # let xs = Foo.empty;;
val xs : int list = []

utop # let ys = 1 :: xs;;
val ys : int list = [1]

utop # Foo.peek ys;;
- : int option = Some 1

Instead, we want to prevent the outside world from messing with our internal stack data structure. We can do this by using an interface file, or .mli:

foo.mli
type stack 
val empty : stack 
val push : int -> stack -> stack 
val peek : stack -> int option 

This interface file leaves stack abstract because we don't give a definition of stack, but just declare that stack is a type (with a single type parameter).

We then compile foo.ml by first compiling foo.mli:

> ocamlc foo.mli -c
> ocamlc foo.ml -c
Now, if we load it into utop foo.cmo:
utop # let xs : Foo.stack = Foo.empty;;
val xs : Foo.stack = <abstr>

utop # let ys = 1 :: xs;;
Error: The value xs has type Foo.stack but an expression was expected of type
         int list
we get an error when trying to cons 1 onto xs, since xs is not a list (from our perspective), but an element of the abstract type Foo.stack.

However, if we modified the interface file to expose the type:

foo.mli
type stack = 'a list
val empty : stack 
val push : int -> stack -> stack 
val peek : stack -> int option 

then we would get the original (maybe undesired) behavior:

utop # let xs = Foo.empty ;;
val xs : Foo.stack = []

utop # let ys = 1 :: xs ;;
val ys : int list = [1]
because here, stack is not an abstract type, but simply a type alias to int list.