Lecture 4.3: More about Interfaces; Differential Testing for Interfaces

A Note about Polymorphic Declarations

If you declare the following function in OCaml:

let f (x : 'a) : 'a = x

this will type check and OCaml will give it type 'a -> 'a.

However, if you do this:

let g (x : 'a) : 'a = x + 32

then g will also type check --- but here, OCaml will give it type int -> int. This is because we think of type variables as having constraints. Above, 'a is unconstrained in f, while 'a has the constraint 'a = int in g.

A "real" polymorphic function is one that has unconstrained type variables (thus 'a can be anything). However, if we add constraints by forcing 'a = int, then g : int -> int is not a polymorphic function.

Thus, when you are defining a new function, the type annotations you insert do not force your function to be polymorphic. Instead, the type OCaml gives back to you (such as 'a -> 'a, or int -> int) after type checking will tell you whether it's polymorphic.

Last class, we saw the idea of an interface: a set of operations that describe a data structure. In short, instead of describing what a data structure is (a list, or a binary tree), we describe what it does. Today, we will dig a little deeper into what it means to specify an interface by showing more about specifying each operation; and another technique called differential testing.

Here is our interface for sets:

type 's set_ops = {
    empty : 's;
    insert : 's -> string -> 's;
    remove : 's -> string -> 's;
    mem : 's -> string -> bool 
}

Last lecture, we described one way of specifying sets: via properties that should relate the various operations of the set. We have five properties for sets:

mem empty x = false, for all x;
mem (insert s x) x = true, for all x;
mem (insert s x) y = mem s y, for all y != x;
mem (remove s x) x = false, for all x;
mem (remove s x) y = mem s y, for all y != x.

Obtaining Specifications

How did we obtain these properties? We got them from the following principle: there are ways to construct sets, and there are ways to observe sets. For us, we can construct them via empty, insert, and remove; and we can observe them via mem. For each pair of a constructor and an observation, we need to specify what it means to combine them.

Adding a new constructor

If we added a new constructor (say, union: 's -> 's -> 's), then to specify it, we would only need to know what it means to observe it. Since our only observer is mem, we get the following template for a spec:

mem (union s s') x = ???

Here, we then will take inspiration from set theory and know that x is in \(A \cup B\) only if it's in \(A\) or \(B\). Thus, we get the following logical spec:

mem (union s s') x = mem s x || mem s' x

Similarly, for intersect, we get

mem (intersect s s') x = mem s x && mem s' x.

Adding a new observation

What if we want a new way of observing a set? (What are some other data we can get from a set?)

One possibility is adding a notion of cardinality: card : 's -> int, counting the number of elements in the set. What should be true about card? First of all, it should always be positive:

forall s, card s >= 0

Next, we need to update our list of specs so that we know what it means to observe each constructor using card. Thus, we need to specify the following (in full generality):

card empty = ???
card (insert s x) = ???
card (remove s x) = ???

Now, we can fill in the gaps. First, card empty is easy; that should be 0. What about card (insert s x)? If x is already in s, we aren't changing s; otherwise, we're adding one element to it. Thus, we get:

card (insert s x) = if mem s x then card s else card s + 1

card (remove s x) is then the opposite:

card (remove s x) = if mem s x then card s - 1 else card s

Specifying a Queue

Let's move on from sets, and look at (FIFO) queues. A queue is a data structure that supports pushing and popping, as well as peeking (and an empty constructor).

Let's first specify the signature of these things. Giving a type V of values, a queue Q supports the operations:

empty : Q;
push : Q -> V -> Q;
pop : Q -> (Q * V) option; and
peek : Q -> V option.

Next, we implement this in OCaml (again choosing string for the type of values):

type 'q queue_ops = {
    empty : 'q;
    push : 'q -> string -> 'q;
    pop : 'q -> ('q * string) option;
    peek : 'q -> string option
}

Next, we can derive a specification for it by considering all the ways of constructing and observing the queue. Constructors are all the ways of returning a new value of type 'q; while observations are all the ways of obtaining something of a type that is not a type variable. Thus, we have the constructors empty, push, and pop; and the observers pop and peek. (pop is both a constructor and and observer, because it returns both something of type q and string).

Thus, we now need to know what happens when we compose a constructor with an observer. For pop:

pop empty = ???
pop (push q x) = ???
if pop q = Some (v, q'), then pop q' = ???

And for peek: - peek empty = ??? - peek (push q x) = ??? - if pop q = Some (v, q'), then peek q = ???

We will just analyze pop here. We see that pop empty should be None, since there is nothing to pop; while pop (push q x) should be Some (x, q). In the last case, where we are composing pop with itself, there ends up being nothing interesting to specify, so we won't worry about that case.

Once we have these specifications in place, we can then go ahead and implement them. For pop, they would be

let pop_empty (ops : 'q queue_ops) : bool = 
    ops.pop ops.empty = None

let pop_push (ops : 'q queue_ops) (q : 'q) (x : string) : bool = 
    ops.pop (ops.push q x) = Some (x, q)

Equality of Queues

Note above that we are comparing queues for equality in pop_push, by using OCaml's structural equality operator =. While this might work for simple cases, it will not work if we are using references or arrays internally, or working with abstract modules (we will see all of these later). Thus, a better approach is to add a notion of equality to queues:

type 'q queue_ops = {
    ...
    equal : 'q -> 'q -> bool
}

and use it in specifications:

let pop_push (ops : 'q queue_ops) (q : 'q) (x : string) : bool = 
    match ops.pop (ops.push q x) with
    | None -> false
    | Some (x', q') -> (x = x') && (ops.equal q q')

Differential Testing

Now, we will switch gears and think about another way to specify an interface. Suppose we had a high quality implementation of a set that we trust to be correct:

let list_set_ops : (string list) set_ops = {
    empty = [];
    insert = (fun s x -> x :: s);
    remove = (fun s x -> List.filter (fun y -> y <> x) s); (* <> = "not equal" *)
    mem = (fun s x -> List.mem s x);
}

While list_set_ops will do the trick, we find out it's really slow. So instead, we have a fast version of sets:

type string_tree = | Leaf of string | Node of string_tree * string_tree

let tree_set_ops : string_tree set_ops = {
    ...
}

How should we specify tree_set_ops? One way is to re-run all of our tests on list_set_ops. While that is likely to catch some bugs, it might not catch them all. So instead, we will consider another approach, where we will test equivalence of tree_set_ops and list_set_ops. The idea here, called differential testing, is that instead of coming up with a formal spec, we instead just see if the two implementations behave the same. If they do, that's great; if they don't, then one of them has a bug.

Our strategy for asking if they "behave the same" will be to construct two sets in the same way --- one with list_set_ops, and one with tree_set_ops --- and make sure that they behave the same on their observations (here, mem). For example:

let test_equiv (opsA : 's set_ops) (opsB : 't set_ops) : bool = 
    let s1A = opsA.insert opsA.empty "X" in
    let s1B = opsB.insert opsB.empty "X" in
    let s2A = opsA.insert s1A "Y" in
    let s2B = opsB.insert s1B "Y" in
    opsA.mem s2A "X" = opsB.mem s2B "X"

If the above test returns true, then the two interfaces are behaving the same; if it returns false, then one of the two interfaces is behaving wrong (since there is only one possible correct way for sets to behave).

We can turn this example into a random test by randomly sampling two parallel sets:

(* Possible strings *)
let ss = ["A", "B"; "C"; "D"; "E"; "F"; "G"]

(* Randomly choose a string from the list *)
let gen_string () = List.nth ss (Random.int (List.length ss))

let rec gen_sets (opsA : 's set_ops) (opsB : 't set_ops) (n : int) : ('s * 't) = 
    if n <= 0 then (opA.empty, opsB.empty)
        else 
        let x = gen_string () in 
        let (sA, sB) = gen_sets opsA opsB (n - 1) in 
        if Random.bool () then 
            (opsA.insert sA x, opsB.insert sB x)
        if Random.bool () then 
            (opsA.remove sA x, opsB.remove sB x)

The idea behind gen_sets is that we will generate two sets following the same instructions on both sides. Thus, these two sets --- even though they have different representations --- should be logically equivalent.

Then, our differential testing strategy can be as follows:

randomly generate two "identical" sets using gen_sets; and
ensure that each set behaves the same when we query for membership of the same string.

let gen_pair (genA : unit -> 'a) (genB : unit -> 'b) : unit -> ('a * 'b) = 
    fun _ -> 
        let x = genA () in 
        let y = genA () in 
        (x, y)

let forall (gen : unit -> 'a) (prop : 'a -> bool) (trials : int) : 'a option = 
    ... like we have seen ...

let diff_test (opsA : 's set_ops) (opsB : 't set_ops) () : ('s * 't) option = 
    forall (gen_pair mk_sets gen_string) (fun ((sA, sB), x) -> 
        (* sA : 's *)
        (* sB : 't *)
        (* x : string *)
        opsA.mem sA x = opsB.mem sB x
    ) 32

If diff_test always returns None, we can't find a bug that shows that opsA and opsB behave differently; thus, this serves as evidence that they are behaving the same. If it returns Some, we have a bug.

The counter-example will be of type ('s * 't): this is hard to inspect in general, because the only way we can "look inside" these sets is by using mem. In HW4, we will investigate a more effective way to inspect counter-examples by differential testing.