Lecture 2.3: Defining New Types; more experience writing properties
So far, we have learned how to write and specify OCaml code using built-in types, such as lists, pairs, and option types. Today, we will focus on user-defined types, which will enable us to write and specify many more types of programs.
Binary trees
Let's look at an implementation of a binary tree type in OCaml, where the leaves are integers:
This defines an algebraic data type, or ADT, which has a number of constructors --- here, Leaf
and Branch
--- which each have arguments: Leaf
has one argument of type int
, while Branch
has two arguments of type bintree
.
Note
While bintree
has two arguments of type bintree
, you should, in this case, not think of this as one argument of the pair type bintree * bintree
.
Let's look at a few binary trees, and how they can be represented in OCaml.
This would be written Branch (Leaf 2, Leaf 7)
.
This would be written Branch (Branch (Leaf 0, Leaf (-1)), Leaf 42)
.
Why trees?
(Binary) trees are used all over in computer science. Here, we will think of them as data structures (supporting efficient implementations of maps and sets); later on in this class, we will use trees to represent the syntax of a simple programming language.
Instead of writing trees directly, we can generate them using recursive functions. Let's create a generator for trees:
let rec mk_random_tree (n : int) =
if n <= 0 then Leaf (Random.int 42)
else Branch (mk_random_tree (n - 1), mk_random_tree (n - 1))
let rec string_of_bintree (t : bintree) =
match t with
| Leaf i -> string_of_int i
| Branch (x, y) -> "{" ^ string_of_bintree x ^ ", " ^ string_of_bintree y ^ "}"
Using binary trees
To use a binary tree, we can pattern match on it, similar to lists:
let rec sum_tree (t : bintree) =
match t with
| Leaf i -> i
| Branch (x, y) -> sum_tree x + sum_tree y
Another example:
let rec flip_tree (t : bintree) =
match t with
| Leaf i -> Leaf i
| Branch (x, y) -> Branch (flip_tree y, flip_tree x)
Formalizing a property about flip_tree
To check whether flip_tree
really is performing a flip, we can write a few properties. First, we can see that, for most inputs, flip_tree
won't return the same tree as the one it takes as input:
let _ =
let t = mk_random_tree 3 in
if not (t = flip_tree t) then
()
else
print_endline ("Found tree that stays the same: " ^ string_of_bintree t)
mk_random_tree
will cause the above equality to fail. However, this is certainly not a theorem! Any symmetric tree will cause it to succeed:
Equality on ADTs
As long as your ADT does not contain functions in it, the equality operator =
will work on the ADT (as it does for bintree
).
Note
Thus, the following theorem is true: \(\exists t, t = \mathsf{fliptree}(t).\)
But it is not true that \(\forall t, t = \mathsf{fliptree}(t)\).
While it's not always true that t = flip_tree t
, it definitely is true that t = flip_tree (flip_tree t)
. In other words, if you flip a tree twice, you get back to the original tree. In other words, flip_tree
is involutive: applying the same function twice gets you back to the original value.
let flip_tree_inv () =
let t = mk_random_tree 4 in
if t = flip_tree (flip_tree t) then
() (* OK *)
else
print_endline ("Found a counterexample: " ^ string_of_bintree t)
What's algebraic about ADTs?
Another name for an ADT such as
is a sum type, because it can be thought of as the disjoint sum of the two constructors: it is either an Int
, or a Bool
. We can think of this as being represented as the following equation:
$$
\texttt{int_or_bool} = \mathsf{Int} + \mathsf{Bool}
$$
Our bintree
example is similar, except that the equation is recursive:
$$
\mathsf{bintree} = \mathsf{int} + (\mathsf{bintree} * \mathsf{bintree})
$$
More on ADTs
Let's look at few more examples of ADTs, and how you might use them. Suppose you're creating a graphics application that lets you create various shapes and give them colors. First, we can define colors by an enumeration:
Here, the type is just an enumeration between six choices, without any data attached. While we could representcolor
with an integer (e.g., 0
for red, 1
for green, and so on), using an explicit enumeration with names is useful for code clarity.
For example, let's write a function to compute complementary colors:
let complementary (c : color) =
match c with
| Red -> Green
| Green -> Red
| Blue -> Orange
| Orange -> Blue
| Yellow -> Purple
| Purple -> Yellow
(As you might have noticed, this is another example of an involutive function.)
Next, let's define a type for shapes:
type shape = | Rectangle of int * int (* width and height *)
| Circle of int (* radius *)
| Square of int (* side length *)
Finally, we want to define a type that holds both a color and a shape. While we could do this with a pair type (color * shape
), there is another way: by using a record:
This defines a record, which must hold a field b_color
of type color
, and b_shape
of type shape. Let's create one and analyze it: