Lecture 3.3: Working with Specifications

Last class, we discussed what specifications are; now, we will discuss what we can do with them. Let's take a look at the spec for sort : int list -> int list, which has two parts:

(P1) The result should be sorted: if \(\mathsf{sort}(xs) = ys\), then \(\forall 0 \leq i < j < |ys|, ys[i] \leq ys[j]\).
(P2) The result should have the same number of each element: if \(\mathsf{sort}(xs) = ys\), then \(\forall i, \mathsf{count}(xs, i) = \mathsf{count}(ys, i)\), where \(\mathsf{count}(xs, i)\) counts the number of occurences of \(i\) in \(xs\).

Formalizing a spec

To do anything interesting with this specification, we will first formalize it by encoding it into a programming language.

let rec all (p : int -> bool) (xs : int list) =
    match xs with
    | [] -> true
    | x :: xs' -> p x && all p xs'


let rec sorted (xs : int list) = 
    match xs with 
    | [] -> true
    | _ :: [] -> true
    | i :: j :: xs' -> i <= j && sorted (j :: xs')


let rec count (xs : int list) (y : int) =
    match xs with
    | [] -> 0
    | x::xs' -> (if x = y then 1 else 0) + count xs' y

let equal_elements (xs : int list) (ys : int list) = 
    all (fun (x : int) -> 
        count xs x = count ys x
    ) xs
    &&
    all (fun (x : int) -> 
        count xs x = count ys x
    ) ys

let sort_spec (xs : int list) (sort : int list -> int list) = 
    let ys = sort xs in 
    sorted ys && equal_elements xs ys

We made some specific coding choices above that made our two logical into a computable check that can actually be run. Indeed, sort_spec xs sort returns true if and only if sort returns the correct answer on xs. There are some tricky points here.

Why is our sorted function the same as our predicate for the list being sorted?
What about equal_elements?

What to do with a formalized spec?

What we have done above by defining sort_spec is show that, for a given list xs, we can decide whether sort behaves correctly on that list xs. Correctness of an algorithm on a specific input is usually easy to directly check. What's harder is showing that sort behaves correctly on all inputs:

\[ \forall xs \in \texttt{int list}, \texttt{sort_spec}(xs, sort). \]

In general, questions like this one above is undecidable: no computer program, given the code for sort, can automatically compute with absolute certainty whether the above formula is true. This is mostly due to the fact that int list is infinite (modulo restrictions in OCaml about the sizes of integers), so we can't exhaustively check every case. However, even though this problem is undecidable, there are still a lot of things you can do.

Decidability

We can generalize the above discussion to talk about decidability in general. Given a set \(S\) (encoded via logical specs), the decision problem for \(S\) is the task of asking whether a given function \(f\) is inside of \(S\) or not. The decision problem for \(S\) is decidable if there exists a program decide that, given \(f\), can always output a correct yes/no answer for whether \(f \in S\). A problem is undecidable if there does not exist such a program. Thus, this set is decidable:

\[ S = \{(xs, f) \mid \texttt{sort_spec}(xs, f) \}\]

while this one isn't:

\[ S = \{f \mid \forall xs, \texttt{sort_spec}(xs, f) \}.\]

Property-Based Testing

Given a specification \(S\) given via logical specs, we can use property-based testing, or PBT, to test whether a given function \(f\) is in \(S\). The essence of PBT is to reinterpret all universal quantifiers as probabilistic checks, where we sample from a distribution (often many times). While we can't decide if \(f \in S\) with absolute certainty, PBT still allows us to automatically find counter-examples with minimal effort.

Since our logical spec for sort is \(\forall xs, \texttt{sort_spec}(xs, \texttt{sort})\), we see we have one quantifier that we need to sample for. We sample this quantifier using a generator:

let gen_uniform_list (len_bound : int) : unit -> int list = fun () ->  
    let len = Random.int len_bound in 
    (* Standard library function to create lists using a function *)
    List.init len (fun _ -> Random.int 100)

A note about Random

In OCaml, Random.int generates random values based on a pseudorandom number generator (PRNG), which creates random values given a seed. If you use Random.int (or any other function in Random) and you want it to actually be random, you need to call Random.self_init () first. Alternatively, you can fix the seed by calling Random.init 32.

In general, a generator for t will have type unit -> t; thus, we can think of gen_uniform_list as being a function from integers (for the length bound) to generators for int list. We now create a sampler for the quantifier, which we will suggestively call forall:

let rec forall_list (gen : unit -> int list) (prop : int list -> bool) (trials : int) : (int list) option =
    if trials <= 0 then     
        None (* No counter-example found *)
    else 
        let xs = gen () in 
        if prop xs then 
            forall_list gen prop (trials - 1)
        else Some xs

Above, we are using the OCaml standard library functions for List. We will use them more going forward. If all tests succeed, we return None; else, we return Some xs, where xs is the counter-example of correctness. Additionally, we are abstracting over the generator we will use for the input.

Next class we will run this on an implementation of sort!