Lecture 5.2: Manipulating Modules
Last class, we saw how to implement immutable stacks by hiding its implementation in an interface file. Let's see that again:
type 'a set = 'a list
val empty : 'a set
val mem : 'a -> 'a set -> bool
val add : 'a -> 'a set -> 'a set
val remove : 'a -> 'a set -> 'a set
Above, we have the .mli file for a simple stack interface. An interface file can contain types (either abstract declarations, like above, or full type definitions like type q = int list), as well as declarations for values, which we write using val. Then, we have the .ml file (the implementation):
(* {1 2 3 3} = {1 2 3} = {3 2 1} *)
type 'a set = 'a list
let empty : 'a set
= []
let rec mem (x : 'a) (s : 'a set) : bool
= match s with
| [] -> false
| x' :: s' -> if x' = x then true else mem x s'
let add (el : 'a) (s : 'a set) : 'a set
= if mem el s then s else el :: s
(* = el :: s *)
let rec remove (x : 'a) (s : 'a set) : 'a set
= match s with
| [] -> []
| x' :: s' -> if x' = x then s' else x' :: remove x s'
We can get very far with this style of writing modules. However, OCaml lets us do more: we can also write modules inline, in a single file. Let's replicate our stack below in this style:
(* keeping it simple: just for ints *)
module type MyIntSetSig = sig
type t
val empty : t
val mem : int -> t -> bool
val add : int -> t -> t
val remove : int -> t -> t
end
module MyIntSet : MyIntSetSig = struct
type t = int list
let empty : t = []
let rec mem (x : int) (s : t) : bool = List.mem x s
let add (el : int) (s : t) : t = el :: s
let rec remove (x : int) (s : t) : t
= List.filter (fun x' -> x' <> x) s
end
We can also make this polymorphic
module type MySetSig = sig
type 'a t
val empty : 'a t
val mem : 'a -> 'a t -> bool
val add : 'a -> 'a t -> 'a t
val remove : 'a -> 'a t -> 'a t
end
module MySet : MySetSig = struct
type 'a t = 'a list
let empty : 'a t = []
let rec mem (x : 'a) (s : 'a t) : bool = List.mem x s
let add (el : 'a) (s : 'a t) : 'a t = el :: s
let rec remove (x : 'a) (s : 'a t) : 'a t
= List.filter ((<>) x) s
end
First, we write the stack interface. This is given by a module type, which is
secretly what an .mli file is -- just a sequence of type and value
declarations. Then, we have the module, which is the implementation --- this
is what goes in the .ml file.
Let's see an example use:
let ex0 = MySet.empty ;;
let ex1 =
MySet.add 1 (
MySet.add 2 (
MySet.add 3 xs ));;
(* scoped open *)
let ex2 = let open Myset in
add 1 (add 2 (add 3 empty))
(* local open *)
let ex3 = Myset.(add 1 (add 2 (add 3 empty)))
let rec build_set_acc (acc : int MySet.t) (n : int) : int MySet.t =
if n <= 0
then acc
else
let x = Random.int 10 in
build_set_acc (MySet.add x acc) (n - 1)
let build_set (n : int) = build_set_acc MySet.empty n
let build_set : int -> MySet.t = build_set_acc MySet.empty
(* utop *)
utop# let xs = build_set 10;;
utop# MySet.mem 3 xs;;
utop# MySet.mem 4 xs;;
utop# MySet.mem 5 xs;;
Now, since we are focusing on abstraction: what happens if we try to break the abstraction boundary?
let bad =
let q = MySet.empty in
24 :: q (* Fails: this expression (q) has type MySet.t but an expression was expected of type int list *)
We get the error we want. Again, why do we want this? We want to prevent users of our code from doing bad things. Let's now see what happens if we change things a bit:
The only difference here is that we leave off the module type. Then, our bad example will type check:
What's going on here is that, when we give our module a module type, we are effectively sealing it. Inside of MySet, we are allowed to know that the stack is a list; but outside, we only know it behaves as a MySetSig: we have a type t, but that's it.
Let's now see some more examples. Below is a simple interface for sets of strings:
module type StringSet = sig
type t
val empty : t
val mem : string -> t -> bool
val add : string -> t -> t
val remove : string -> t -> t
end
On API Design
Designing a good interface for your code is one of, if not the, most important idea in programming.
If the interface is too narrow, it might be restrictive. For example, what if we want to count the number of elements in a StringSet.t?
We are out of luck.
But if the interface is too wide, then this presents other issues. What if we added a function elems : t -> string list?
This forces our set interface to "remember" all of the elements we have stored in the set. While this might sound reasonable, consider this implementation of sets:
module HashSet : StringSet = struct
type t = string list
let empty = []
(*
Digest.string : string -> string
computes a hash of the input.
*)
let add xs x = Digest.string x :: xs
let mem xs x = List.mem (Digest.string x) xs
let remove xs x = List.filter (fun y -> y <> Digest.string x) xs
end
This choice comes down to what the API is actually for. There are many ways in which sets are used. Are we storing elements for later? Or are we just testing for the presence of certain elements?
Cognitive Burden
Another important facet of program correctness --- equally important as formal testing --- is decreasing cognitive burden. While you can use testing to make sure that your code is correct, another approach is to set up your APIs so that it is hard to create bad code. Here's another API for sets:
Which one is easier to use? I would say the original one, because the module type is closer to what I think of when I think of a set. This other interface, while as functional as before, is much harder to use, and therefore it's more likely that bugs will arise.