Lecture 5.2: Manipulating Modules
Last class, we saw how to implement immutable stacks by hiding its implementation in an interface file. Let's see that again:
type t
val empty : t
val push : int -> t -> t
val peek : t -> int option
val pop : t -> t
Above, we have the .mli file for a simple stack interface. An interface file can contain types (either abstract declarations, like above, or full type definitions like type q = int list), as well as declarations for values, which we write using val. Then, we have the .ml file (the implementation):
type t = int list
let empty : t = []
let push : int -> t -> t = fun x xs -> x :: xs
let peek : t -> int option = fun xs ->
match xs with
| [] -> None
| x :: _ -> Some x
let pop : t -> t = fun xs ->
match xs with
| [] -> []
| _ :: xs' -> xs'
We can get very far with this style of writing modules. However, OCaml lets us do more: we can also write modules inline, in a single file. Let's replicate our stack below in this style:
module type MyStackSig = sig
type t
val empty : t
val push : int -> t -> t
val peek : t -> int option
val pop : t -> t
end
module MyStack : MyStackSig = struct
type t = int list
let empty : t = []
let push : int -> t -> t = fun x xs -> x :: xs
let peek : t -> int option = fun xs ->
match xs with
| [] -> None
| x :: _ -> Some x
let pop : t -> t = fun xs ->
match xs with
| [] -> []
| _ :: xs' -> xs'
end
First, we write the stack interface. This is given by a module type, which is
secretly what an .mli file is -- just a sequence of type and value
declarations. Then, we have the module, which is the implementation --- this
is what goes in the .ml file.
Let's see an example use:
let rec build_stack_rec (n : int) (acc : MyStack.t) : MyStack.t =
if n <= 0 then acc else
let x = Random.int 50 in
build_stack_rec (n - 1) (MyStack.push x acc)
let build_stack n = build_stack_rec n MyStack.empty
let pop_twice (q : MyStack.t) =
MyStack.pop (MyStack.pop q)
Now, since we are focusing on abstraction: what happens if we try to break the abstraction boundary?
let bad =
let q = MyStack.empty in
24 :: q (* Fails: this expression (q) has type MyStack.t but an expression was expected of type int list *)
We get the error we want. Again, why do we want this? We want to prevent users of our code from doing bad things. Let's now see what happens if we change things a bit:
The only difference here is that we leave off the module type. Then, our bad example will type check:
What's going on here is that, when we give our module a module type, we are effectively sealing it. Inside of MyStack, we are allowed to know that the stack is a list; but outside, we only know it behaves as a MyStackSig: we have a type t, but that's it.
Let's now see some more examples. Below is a simple interface for sets of strings:
module type StringSet = sig
type t
val empty : t
val insert : string -> t -> t
val remove : string -> t -> t
val mem : string -> t -> bool
end
While below is an interface for maps, where the keys are strings:
module type StringMap = sig
type 'a t
val empty : 'a t
val insert : string -> 'a -> 'a t -> 'a t
val remove : string -> 'a t -> 'a t
val lookup : string -> 'a t -> 'a option
end
On API Design
Designing a good interface for your code is one of, if not the, most important idea in programming.
If the interface is too narrow, it might be restrictive. For example, what if we want to count the number of elements in a StringSet.t?
We are out of luck.
But if the interface is too wide, then this presents other issues. What if we added a function elems : t -> string list?
This forces our set interface to "remember" all of the elements we have stored in the set. While this might sound reasonable, consider this implementation of sets:
module HashSet : StringSet = struct
type t = string list
let empty = []
(*
Digest.string : string -> string
computes a hash of the input.
*)
let insert xs x = Digest.string x :: xs
let remove xs x = List.filter (fun y -> y <> Digest.string x) xs
let mem xs x = List.mem (Digest.string x) xs
end
This choice comes down to what the API is actually for. There are many ways in which sets are used. Are we storing elements for later? Or are we just testing for the presence of certain elements?
Cognitive Burden
Another important facet of program correctness --- equally important as formal testing --- is decreasing cognitive burden. While you can use testing to make sure that your code is correct, another approach is to set up your APIs so that it is hard to create bad code. Here's another API for sets:
module type OtherStringSet = sig
type t
(* Add an element to a set, returning the set and its index into the set. *)
val add : string -> t -> (t * int)
(* Now, we must work with indices. *)
val get_nth : int -> t -> string option
val remove_nth : int -> t -> t
val mem : int -> t -> bool
end
Constraining Implementations via Abstract Types
Let's look now deeper at abstract types by thinking about a few examples.
Suppose we have a module M of module type S. Then how can we create a value of type t? We can only make reference to x and y. Can we do anything with these values? Not really. We can compare them for equality, but that's about it. (We shouldn't do this, though -- because this breaks through the abstraction boundary.)
Booleans as an abstract type
What about this?
module type S2 = sig
type t
val tru : t
val fls : t
val ite : t -> 'a -> 'a -> 'a
end
module M2 : S2 = struct ... end
What can we do here? We can have tru : t and fls : t, and we can case on a t by using ite. While we can case on a t, there are still only two values of t we can use. This means that M2.t is in bijection with the booleans: any boolean can be turned into a t, and vice versa.
We can actually implement these functions:
let t_of_bool (b : bool) : M2.t = if b then M2.true else M2.fls
let bool_of_t (x : M2.t) : bool = M2.ite x true false
Let's now implement it:
module M : S2 = struct
type t = unit option
let true = None
let fls = Some ()
let ite o x y = match o with | None -> x | Some _ -> y
end
Natural numbers as an abstract type
One more:
How many values of type t can we create? Infinitely many! There is one value of t for every natural number. Zero corresponds to z, while n + 1 corresponds to s n. We can make this bijection explicit by adding a way to recurse on values:
module type S4 = sig
type t
val z : t
val s : t -> t
(* `recurse x v f` means: if x = zero, then v; else, compute the answer for "x - 1", and apply f to it. *)
val recurse : t -> 'a -> ('a -> 'a) -> 'a
end
module M4 : S3 = struct ... end
let int_of_t (x : M4.t) : int =
M4.rec x 0 (fun x -> x + 1)
let rec t_of_int (x : int) : M4.t =
if x <= 0 then M4.z else
M4.s (t_of_int (x - 1))
module M4 : S3 = struct
type t = unit list
let z = []
let s xs = () :: xs
let rec recurse xs v f =
match xs with
| [] -> v
| _ :: xs' -> f (recurse xs' v f)
end
module M5 : S3 = struct
type t = float
let z = 0.0
let s x = x +. 1.0
let rec recurse x v f = if x <= 0.0 then v else f (recurse (x -. 1.0) v f)
end
module M6 : S3 = struct
type t = | Z | S of t
type z = Z
let s x = S x
let rec recurse x v f =
match x with
| Z -> v
| S x' -> f (recurse x' v f)
end