Skip to content

Lecture 5.1: Modules via .mli files

Last week, through polymorphism, we saw the idea of an API: a way of associating a set of operations to a type. For example, a simple set API could look like

type 'a set_ops = {
    empty : 'a;
    insert : 'a -> string -> 'a;
    remove : 'a -> string -> 'a;
    lookup : 'a -> string -> bool
}

Then, a client of the set API can use it by being polymorphic over the type variable:

let client (ops : 'a set_ops) = 
    let s1 = ops.empty in 
    let s2 = ops.insert s1 "x" in
    ...

Today, we will look APIs in a different light: instead of using polymorphism, we will use abstraction. For example, let's take a look at the API for mutable queues in OCaml's standard library (slightly simplified, and leaving out many more functions):

type 'a t (* Queues holding elements of type 'a *)
val create : unit -> 'a t
val add : 'a -> 'a t -> unit
val take_opt : 'a t -> 'a option

Here, 'a t is the type for a mutable FIFO queue holding things of type 'a. The function create initializes a new queue; the function add pushes an element onto the queue; and the function take_opt will return and pop the top element of the queue (or return None), if no top element exists. Thus, this API represents a mutable queue, since add and take_opt are not pure functions, but instead have side effects. (This is different than your queue in HW4, which should not use side effects at all.) We will learn more about mutability in later weeks, but the overall point about APIs is the same regardless of mutability.

We can use the Queue module by doing the following:

utop # open Queue;;
utop # let q : int t = create ();;
val q : int t = <abstr>

utop # add 32 q;;
- : unit = ()

utop # add 24 q;;
- : unit = ()

utop # take_opt q;;
- : int option = Some 32

utop # take_opt q;;
- : int option = Some 24

utop # take_opt q;;
- : int option = None

The command open Queue;; opens the Queue module into the local scope. If we didn't open Queue, the function create and the type int t would be undefined.

Now, note that utop's answer for create () is val q : int t = <abstr>. This is because the t type for Queues is an abstract type. While one can find the definition of 'a t if we were to peer inside of Stdlib's implementation (and is a particular record type), this is not meant to be exposed to the user. OCaml enforces this in our client code. Thus, even though queue.ml contains a length field for the type t, we cannot access it:

utop # let l = q.length;;
Error: Unbound record field length

Why is the queue type called t? While one can use queues by calling open Queue, the more common way is to not call open Queue, and instead make reference to it as follows:

utop # let q : int Queue.t = Queue.create ();;
val q : int Queue.t = <abstr>

utop # Queue.add 32 q;;
- : unit = ()

utop # Queue.add 24 q;;
- : unit = ()

utop # Queue.take_opt q;;
- : int option = Some 32

utop # Queue.take_opt q;;
- : int option = Some 24

utop # Queue.take_opt q;;
- : int option = None

What's going on here is that, while we haven't called open Queue, the Queue module itself is still in scope (since it's part of the standard library). A module is sort of like a record --- because you can use . to access various fields of it --- but instead of being a value, it's a set of definitions, including types. The convention in OCaml is that if we have a module M that defines some type that M is "about", then we call that type t in the module --- so that we can say M.t for that type.

The List module

Another example of a module that you have probably seen already is List:

utop # List.map (fun x -> x + 1) [1; 2; 3];;
val - : int list = [2; 3; 4]

Here, List is a module from the standard library (as you can see here). Another way to use these definitions is to open the module:

utop # map (fun x -> x + 1) [1; 2; 3];;
Error: Unbound value map
Hint: Did you mean max?

utop # open List;;
utop # map (fun x -> x + 1) [1; 2; 3];;
- : int list = [2; 3; 4]

If you look at List, we see that the list type is defined to be:

type 'a t = 'a list = 
| []
| (::) of 'a * 'a list
Thus, instead of saying int list, we can also say int List.t:
utop # let xs : int List.t = [1; 2; 3];;
val xs : int list = [1; 2; 3]
(list is somewhat special in OCaml, so we typically just say list.)

Modules as files

If we had a file foo.ml:

foo.ml
let x : int = 32

we can make reference to it from another file. To do that, we first compile foo.ml into a library:

> ocamlc foo.ml -c
This creates foo.cmo. We can then load it into utop:
> utop foo.cmo
and then see that we can reference Foo.x (or open Foo and use x):
utop # let y = Foo.x;;
val y : int = 32

Capitalization of Module Names

Even if we call it foo.ml, OCaml enforces that all module names start with a capital letter --- so we get the module Foo. It's common in OCaml for us to write lib.ml, and have it turn into the module Lib.

Hiding Types in Interface Files

Let's replicate a version of Queue as our own file. We will make some simplifications: first, instead of a queue, we will implement a stack (last-in first-out); we will do so in an immutable style (thus not using state, and instead returning updated stacks; and we will fix the element type to be int.

foo.ml
type stack = int list
let empty : stack = []
let push : int -> stack -> stack = fun x xs -> x :: xs
let peek : stack -> int option = fun xs -> 
    match xs with 
    | [] -> None
    | x :: _ -> Some x

However, this exposes all of the details of foo.ml --- including the definitions of the types. We can see this below with utop foo.cmo (after compiling foo.ml):

utop # let xs = Foo.empty;;
val xs : int list = []

Thus, even though we defined stack in foo.ml, OCaml still knows that it is a list. Indeed, we can do bad things like:

utop # let xs = Foo.empty;;
val xs : int list = []

utop # let ys = 1 :: xs;;
val ys : int list = [1]

utop # Foo.peek ys;;
- : int option = Some 1

Instead, we want to prevent the outside world from messing with our internal stack data structure. We can do this by using an interface file, or .mli:

foo.mli
type stack 
val empty : stack 
val push : int -> stack -> stack 
val peek : stack -> int option 

This interface file leaves stack abstract because we don't give a definition of stack, but just declare that stack is a type (with a single type parameter).

We then compile foo.ml by first compiling foo.mli:

> ocamlc foo.mli -c
> ocamlc foo.ml -c
Now, if we load it into utop foo.cmo:
utop # let xs : Foo.stack = Foo.empty;;
val xs : Foo.stack = <abstr>

utop # let ys = 1 :: xs;;
Error: The value xs has type Foo.stack but an expression was expected of type
         int list
we get an error when trying to cons 1 onto xs, since xs is not a list (from our perspective), but an element of the abstract type Foo.stack.

However, if we modified the interface file to expose the type:

foo.mli
type stack = 'a list
val empty : stack 
val push : int -> stack -> stack 
val peek : stack -> int option 

then we would get the original (maybe undesired) behavior:

utop # let xs = Foo.empty ;;
val xs : Foo.stack = []

utop # let ys = 1 :: xs ;;
val ys : int list = [1]
because here, stack is not an abstract type, but simply a type alias to int list.