Lecture 3.1: Specification and Testing

Overview

So far in class, we have been learning OCaml through a variety of examples, and a few use cases of formal specification and testing. Now that we've gained some experience in using OCaml, we can now dive deeper into specification.

Specifications

Simply put, a specification, or a "spec" for short, is just a description of possible implementations. We can formalize this idea by saying that a spec is given by a set $S$ of possible implementations.

Specifications can take many forms. Some specifications are given by comments:

(* Return the larger of the two numbers. *)
let max (i : int) (j : int) = ...

Here, our spec is

\[ S = \{ P \mid \text{P implements ``Return the larger of the two numbers.''} \} \]

What are the issues with specs like this? There are two:

This set $S$ is described imprecisely: two different people might have different ideas of whether a particular program $P$ is in the set.
This set $S$ cannot not be evaluated: even for one person, whether $P \in S$ might depend on how big $P$ or the comment is, the time of day, if you ate breakfast, ...

It is worth thinking about the advantages of them, too:

They are easy to write. No additional training required!

Whether your specific project needs a formal spec is a matter of taste. Accounting software for the backend of a bank? Formal specs are likely useful here. The graphics rendering pipeline for a video game? Less useful.

An example

If you have a list of integers, it is easy to see what this formal spec should mean intuitively:

(* Sort this list. *)
let rec sort (xs : int list) = ...

However, suppose you have a database where each row has three fields: a name, and two unique IDs:

type row = {
        name : string;
        gid : string; (* Global ID *)
        lid : string; (* Local ID *)
}

Then, you are tasked with the following problem:

(* Sort the rows. *)
let rec sort_rows (xs : row list) = ...

What possible sets $S$ for specifications are there?

Formal Specs

While the above example might seem a little contrived, examples like this can easily happen when programming. To prevent these issues, we want better, more precise, ways to describe the set $S$. This is done by using a formal specification: an encoding using mathematical logic, types, or other tools that allows one to decide (or perhaps test) whether $P \in S$, rather than just guessing.

Types as Specifications

The first kind of spec that you have all been using already are types. For example, if we have the function

let f (x : int) (y : bool) : int list = ...

then we know that (as long as f typechecks) that whenever we pass f an integer and a boolean, we are guaranteed to get back an int list. We can think of this as a type of specification:

\[ S = \{ P \mid P \text{ has type $\mathsf{int} \to \mathsf{bool} \to \mathsf{int\ list}$} \}. \]

While there are lots of functions that have this type --- and thus, the set $S$ is too large to be useful on its own --- having type information like this can still be quite helpful. There are two approaches to make this set $S$ smaller: use logical properties (like we have seen last week) to constrain the implementation; or use more sophisticated types that can express more properties. We will focus now on the first approach, but later on in this class we will see how polymorphic and abstract types can allow us express tight (as opposed to loose) specifications with types alone.

Properties as Specifications

Another way to define a specification is by a set of properties, which are formulas that should be true about your specification. For example, let's think about sorting. What should be true about a function sort : int list -> int list?

The result should be sorted: if $\mathsf{sort}(xs) = ys$, then $\forall 0 \leq i < j < |ys|, ys[i] \leq ys[j]$.
The result should have the same number of each element: if $\mathsf{sort}(xs) = ys$, then $\forall i, \mathsf{count}(xs, i) = \mathsf{count}(ys, i)$, where $\mathsf{count}(xs, i)$ counts the number of occurences of $i$ in $xs$.

Note

There are number of other specs that feel right on first glance, but aren't strong enough. For example, what about this one:

\[ \forall i, i \in xs \iff i \in ys.\]

Another Example: `find`

Let's try another function: find : (string -> bool) -> string list -> string option. What should be true about it?

If find succeeds, the result should be in the list: $\forall v, \mathsf{find}(p, xs) = \mathsf{Some}(v) \implies v \in xs$
If find succeeds, the predicate should hold on the value: $\forall v, \mathsf{find}(p, xs) = \mathsf{Some}(v) \implies p(v)$
If find fails, no elements should satisfy the predicate: $\mathsf{find}(p, xs) = \mathsf{None} \implies \forall x \in xs, \neg p(x)$.

Types of Logical Specs

There are many different ways to specify programs. The two specifications above for find and sort are defined in terms of properties of their outputs. We can be more precise by saying we are defining relations between their inputs and outputs:

\[ S = \{ P \mid \forall i, R(i, P(i)) \}. \]

Here, i is the set of inputs: one input of type int list for sort, and two inputs of type (string -> bool) and string list for find. The relation R is then the conjunction of all desired properties for the function.

While relations between inputs and outputs of a single program are very useful, they are not the only way to specify a program. Another way is to relate two different programs. Suppose you have two implementations of the same function: a fast, but really complicated version, f, and a slow, but easy-to-read version, spec. Even if there is no obvious relation between inputs and outputs for f and spec, we can still show f is correct by checking equivalence between the two:

\[ S = \{ P \mid \forall i, P(i) = \texttt{spec}(i) \}. \]

Thus, we would f is correct if $f \in S$.

An example from cryptography

One of the fundamental operations in cryptography is encryption, where you use a secret key to encrypt private data. Encryption is ubiquitous: indeed, whenever you use HTTPS (which most websites use now), your computer/phone is talking to the webserver by using encryption. Because encryption is so common, performance matters a ton: even a 2% improvement can have huge financial and energy-usage implications.

Because of this, cryptographers typically write encryption functions not in higher-level languages, like C, but directly in assembly, which can result in code that is much faster than what you can get by using a C compiler. But doing this is extremely dangerous, because it's so much easier to make coding errors in assembly.

This is where equivalence properties come into play. By testing (and proving, by using formal verification) that the hand-optimized assembly version is equivalent to the higher-level C version, you get the best of both worlds: the performance of hand-written assembly, and the easy-to-read higher-level spec. For more info, you can read this article.

Next Class

Today, we only talked about what specifications are: how to write them, and what they can express. Next class, we will talk about how you can check that functions meet their specifications.