Special Topics: Verification with Dafny
1 Limitations of PBT, Intro to Interactive Theorem Provers
Throughout the semester, we’ve seen the enormous variety of logical invariants we can express using Property Based Testing.
The obvious weakness of PBT, which we have discussed many times, is that it only tests a certain subset of the inputs. In order to have that be useful, we have to be sure that we are testing a representative subset, and to know if it representative, we may have to already know where bugs may appear: possibly not a reasonable assumption! Despite that, it is remarkably flexible and useful.
However, if our code had a bug that only shows up on larger inputs (or particular uncommon cases), then this testing might not find it. For many applications, this is good enough – it’s certainly better than how the vast majority of software is tested – but there are domains (medical devices, cryptography, fly-by-wire systems, etc) where we want to do better, and logic gives us tools.
How can we possibly reason about data of arbitrary size? Or at least, arbitrary size that will fit into our computers? We certainly can’t run on all inputs. The key idea is the same one as how we write recursive functions that would, in principle, work on data of arbitrary size. While you cannot construct an infinitely long list, the following length function will correctly work on any list you can construct it (of which there are, in practical terms, infinitely many):
(define (len l) (cond [(empty? l) 0] [(cons? l) (add1 (len (rest l)))]))
This may seems basic, since you’ve been dealing with these kinds of functions for a while, but it’s worth reflecting on that fact: this three line function will work on any list you can create.
Now let’s move a little closer to math, and think about a first proof.
Consider one of the simplest possible functions:
; f : Natural -> Natural ; adds 0 to its input
And we want to show that f(x) = x for all natural numbers. If we were happy testing a few random numbers, we could use PBT. But we want to know this holds for any natural number.
In particular, we want to be sure that the function doesn’t do, for example, this:
i.e., for any numbers beyond 2^{64}, instead of returning the same number, it just returns 0. How can we prove that our function doesn’t do that?
2 Dafny
To prove truly universal properties, like that f(x) = x for all x, one needs to move from property-based testing to verification. We will use the verification tool Dafny. We will use the textbook Program Proofs, which you can access through Northeastern’s library here.
Chapter 0
Chapter 1
Chapter 4-10
3 List Library
In class, we developed a library for lists in Dafny, written as an inductive datatype.
datatype List<A> = Nil | Cons(v : A, tl : List<A>) |
|
function list_map<A, B>(xs : List<A>, f : A -> B) : List<B> { |
match xs |
case Nil => Nil |
case Cons(v, tl) => Cons(f(v), list_map(tl, f)) |
} |
|
lemma list_map_map<A, B, C>(xs : List<A>, f : A -> B, g : B -> C) |
ensures list_map(list_map(xs, f), g) == list_map(xs, x => g(f(x))) |
{ |
|
} |
|
function foo() : List<nat> { |
list_map(Cons(3, Nil), (x : nat) => x + 1) |
} |
|
function append<A>(xs : List<A>, ys : List<A>) : List<A> { |
match xs |
case Nil => ys |
case Cons(x, xs') => Cons(x, append(xs', ys)) |
} |
|
function append2<A>(xs : List<A>, ys : List<A>) : List<A> { |
if xs.Nil? then ys else Cons(xs.v, append2(xs.tl, ys)) |
} |
|
lemma {:induction false} append_eq<A>(xs : List<A>, ys : List<A>) |
ensures append(xs, ys) == append2(xs, ys) |
{ |
if xs.Nil? { |
|
} |
else { |
append_eq(xs.tl, ys); |
} |
} |
|
function reverse<A>(xs : List<A>) : List<A> { |
match xs |
case Nil => Nil |
case Cons(v, ys) => append(reverse(ys), Cons(v, Nil)) |
} |
|
lemma {:induction false} append_nil<A>(xs : List<A>) |
ensures append(xs, Nil) == xs |
{ |
match xs |
case Nil => {} |
case Cons(v, xs') => { |
append_nil(xs'); |
} |
} |
|
lemma {:induction false} append_assoc<A>(xs : List<A>, ys : List<A>, zs : List<A>) |
ensures append(append(xs, ys), zs) == append(xs, append(ys, zs)) |
{ |
match xs |
case Nil => {} |
case Cons(v, xs') => { |
append_assoc(xs', ys, zs); |
} |
} |
|
lemma {:induction false} rev_append<A>(xs : List<A>, ys : List<A>) |
ensures reverse(append(xs, ys)) == append(reverse(ys), reverse(xs)) |
{ |
match xs |
case Nil => { |
calc { |
reverse(append(Nil, ys)); |
== |
reverse(ys); |
== {append_nil(reverse(ys)); } |
append(reverse(ys), Nil); |
== |
append(reverse(ys), reverse(Nil)); |
} |
} |
case Cons(v, xs') => { |
calc { |
reverse(append(Cons(v, xs'), ys)); |
== |
reverse(Cons(v, append(xs', ys))); |
== |
append(reverse(append(xs', ys)), Cons(v, Nil)); |
== { rev_append(xs', ys); } |
append(append(reverse(ys), reverse(xs')), Cons(v, Nil)); |
== { append_assoc(reverse(ys), reverse(xs'), Cons(v, Nil)); } |
append(reverse(ys), append(reverse(xs'), Cons(v, Nil))); |
} |
} |
} |
|
lemma {:induction false} rev_rev<A>(xs : List<A>) |
ensures reverse(reverse(xs)) == xs |
{ |
match xs |
case Nil => {} |
case Cons(v, ys) => { |
calc { |
reverse(reverse(Cons(v, ys))); |
== |
reverse(append(reverse(ys), Cons(v, Nil))); |
== { rev_append(reverse(ys), Cons(v, Nil)); } |
append(reverse(Cons(v, Nil)), reverse(reverse(ys))); |
== { rev_rev(ys); } |
append(reverse(Cons(v, Nil)), ys); |
} |
} |
} |
4 Arithmetic Expressions
We then wrote a library for arithmetic expressions, and proved a simpler compiler into a stack-based language correct.
|
include "lists.dfy" |
|
datatype Expr = Num(v : nat) | Add(l : Expr, r : Expr) | Var(x : string) |
{ |
function eval(m : map<string, nat>) : nat { |
match this |
case Var(x) => if x in m then m[x] else 0 |
case Num(v) => v |
case Add(l, r) => l.eval(m) + r.eval(m) |
} |
} |
|
function const_fold(e : Expr) : Expr { |
match e |
case Var(_) => e |
case Num(_) => e |
case Add(Num(n), Num(m)) => Num(n+m) |
case Add(l, r) => Add(const_fold(l), const_fold(r)) |
} |
|
lemma {:induction false} const_fold_eq(e : Expr, m: map<string, nat>) |
ensures const_fold(e).eval(m) == e.eval(m) |
{ |
match e |
case Var(_) => {} |
case Num(_) => {} |
case Add(Num(n), Num(m)) => { } |
case Add(l, r) => { |
const_fold_eq(l, m); |
const_fold_eq(r, m); |
} |
} |
|
datatype Ins = IGet(s : string) | IPush(v : nat) | IAdd |
type Stack = List<nat> |
|
function compile(e : Expr) : List<Ins> { |
match e |
case Var(x) => Cons(IGet(x), Nil) |
case Num(n) => Cons(IPush(n), Nil) |
case Add(l, r) => Cons(IAdd, append(compile(r), compile(l))) |
} |
|
function eval_i(i : Ins, s : Stack, m : map<string, nat>) : Stack |
{ |
match i |
case IGet(x) => Cons(if x in m then m[x] else 0, s) |
case IPush(v) => Cons(v, s) |
case IAdd => |
match s |
case Cons(v1, Cons(v2, rest)) => Cons(v1 + v2, rest) |
case _ => Nil |
} |
|
function eval_ilist(xs : List<Ins>, s : Stack, m : map<string, nat>) : Stack |
{ |
match xs |
case Nil => s |
case Cons(x, xs') => eval_i(x, eval_ilist(xs', s, m), m) |
} |
|
lemma eval_ilist_app(xs : List<Ins>, ys : List<Ins>, s : Stack, m : map<string, nat>) |
ensures eval_ilist(append(xs, ys), s, m) == eval_ilist(xs, eval_ilist(ys, s, m), m) |
{ |
} |
|
lemma compile_correct(e : Expr, s : Stack, m : map<string, nat>) |
ensures eval_ilist(compile(e), s, m) == Cons(e.eval(m), s) |
{ |
match e |
case Var(_) => {} |
case Num(_) => {} |
case Add(l, r) => { |
calc { |
eval_ilist(Cons(IAdd, append(compile(r), compile(l))), s, m); |
== |
eval_i(IAdd, eval_ilist(append(compile(r), compile(l)), s, m), m); |
== { eval_ilist_app(compile(r), compile(l), s, m); } |
eval_i(IAdd, eval_ilist(compile(r), eval_ilist(compile(l), s, m), m), m); |
== { compile_correct(l, s, m); } |
eval_i(IAdd, eval_ilist(compile(r), Cons(l.eval(m), s), m), m); |
== { compile_correct(r, Cons(l.eval(m), s), m); } |
eval_i(IAdd, Cons(r.eval(m), Cons(l.eval(m), s)), m); |
} |
} |
} |