Homework 3
Due: Sunday, September 28th at 11:59pm.
Submission Instructions
Start with the template code here.
- Replace each
failwith "TODO"
with your solution. (failwith
has typestring -> t
for any typet
, and throws an exception when it evaluates. Your solution to each problem should not includefailwith
.) - Be sure to test your solutions in
utop
or in tests alongside your submission. To load your file intoutop
, use#use "hw3.ml";;
. - Submit your version of
hw3.ml
to the HW3 assignment on Gradescope. Your version ofhw3.ml
should include (at least) the functions above, with the same types as above.
- At the top of the template file, there is a comment for you to write in approximately how many hours you spent on the assignment. Filling this in is optional, but helpful feedback for the instructors.
Binary Search Trees
A binary search tree is an efficient data structure for implementing a map, which allows you to insert key/value pairs, look up the value for a key, and remove keys from the map.
This homework will task you with implementing various operations on binary search trees and debugging their correctness using PBT.
Q1: Writing some specs
To use PBT, we first need to come up with specifications: functions that we take as definitions for correctness.
Q1.1 First, implement a function has_key
to test if a certain key is in a binary search tree:
Grading criteria: Q1.1
Whether your version of has_key
computes the correct answer.
Q1.2 Additionally, implement a function has_entry
to test if a certain entry is in the tree:
Grading criteria: Q1.2
Whether your version of has_entry
computes the correct answer.
Q1.3 Finally, we will need to test if a tree is sorted.
The definition of "sorted" for our bst
type is similar to int_tree
from last homework, except that we perform comparisons over the key
s. In other words, a bst
is sorted if for every Node(x, e, y)
in the tree, e.key
is greater than or equal to all keys stored in x
and less than or equal to all keys stored in y
.
Grading criteria: Q1.3
Whether your version of sorted
computes the correct answer.
Q2: A generator for BSTs
Next, use functions from Random
to implement a generator gen_bst
for bst
s. You can find documentation for Random
here.
This generator will take an integer n
to serve as a bound for the maximal depth of the tree.
You are free to write the generator in any style you wish, as long as it is useful for your PBT experiments. Later exercises will grade you on whether your PBT tests actually find bugs, so it cannot be trivial. Feel free to use randomness to generate unbalanced trees.
Note: here, a value of the bst
type is, without any additional properties, a "raw" binary search tree, which might not be sorted. Do not assume values of type bst
are sorted without explicit properties which state so.
Grading criteria: Q2
While later problems will depend on your gen_bst
generator being useful, we will grade gen_bst
itself on whether it
successfully uses randomness to produce any tree.
forall
functions for using generators
There are no homework problems in this section, but you will need to use code explained here in later questions.
We will use generators to conduct property-based tests; i.e., random sampling to test if universally-quantified formulas are true.
To make them easy to use, we will use a generic form of forall
. This generic form will be similar to what we saw in Week 3, but will use polymorphism (which will be introduced Monday of Week 4) to make them more flexible.
Below is our generic versions of forall
:
let rec forall (gen : unit -> 'a) (prop : 'a -> bool) (trials : int) : 'a option =
if trials <= 0 then None else
let x = gen () in
if prop x then forall gen prop (trials - 1) else Some x
This version of forall
is generic over the type 'a
of things we are quantifying over. Thus, this will type check:
'a = bst
). But this will also work:
since here 'a = int
, and encodes the property \(\forall x \in \texttt{int}, x + 1 = 1 + x\).
You will not need to use polymorphism in your code for this homework except when you use forall
.
We will also give you versions of forall
that lets you quantify over two and three things, instead of just one:
let rec forall2 (genA : unit -> 'a) (genB : unit -> 'b) (prop : 'a -> 'b -> bool) (trials : int) : ('a * 'b) option =
if trials <= 0 then None else
let x = genA () in
let y = genB () in
if prop x y then forall2 genA genB prop (trials - 1) else Some (x, y)
let rec forall3 (genA : unit -> 'a) (genB : unit -> 'b) (genC : unit -> 'c) (prop : 'a -> 'b -> 'c -> bool) (trials : int) : ('a * 'b * 'c) option =
if trials <= 0 then None else
let x = genA () in
let y = genB () in
let z = genC () in
if prop x y z then forall3 genA genB genC prop (trials - 1) else Some (x, y, z)
Working examples of forall2
and forall3
are below:
forall2 (fun _ -> Random.int 10) (fun _ -> Random.int 20) (fun (x : int) (y : int) -> x + y = y + x) 32
(* OK, since 'a = int, 'b = int *)
(* (Would return None, since the logical property is true *)
forall3 (fun _ -> Random.bool) (gen_bst 5) (fun _ -> Random.int 20) (fun (x : bool) (y : bst) (z : int) ->
b) 32
(* OK, since 'a = bool, 'b = bst, 'c = int *)
(* (Would return Some ..., since if we sample a random bool and get false, the property is false) *)
Q3: Looking up a key
Q3.1 Given an arbitrary binary search tree, if we want to look up the value with a given key, the best we can do is search through the whole tree --- since the key might be anywhere. Implement this:
Here, you should return Some v
, if v
is the value associated to the first entry you find that matches key k
. If no entry matches, return None
.
Grading criteria: Q3.1
Whether your function type checks and does not contain failwith
. We will evaluate its correctness in the next question.
Q3.2
Next, we will check that it is correct by using PBT. Show using PBT that the following formula is (most likely) true:
In natural language: \((P1)\) states that if \(\mathsf{lookup}(t, k)\) returns \(\mathsf{None}\), then \(k\) must not be in the tree.
To show property \((P1)\), define a function called p1
that contains a use of forall2
to demonstrate this property.
This theorem should return None
if and only if forall2
cannot find a counterexample.
You will need to write your own generator for int
s.
Grading criteria: Q3.2
Whether p1
correctly uses forall2
to implement and evaluate the theorem \((P1)\) (using gen_bst
to generate the bst
s); and if PBT does not find any counter-examples.
Q3.3 Next, use PBT to evaluate the following property:
This property is false.
Using forall2
, implement p2
to test whether (P2)
is true or false.
This will require you to write a generator for entry
.
Additionally, using your local PBT experiments, find a counter-example and write the counter-example (a pair of a bst
and entry
which makes the property false) in
lookup_has_entry_counter_example
. Additionally, write in a comment above why the property is false.
If your gen_bst
generator cannot find a counter-example, you should go back and tweak it so that it does.
(* This property is false, so it should return a counter-example. *)
let p2 () : (bst * entry) option = failwith "TODO"
(* Briefly explain below why this counter-example breaks the property:
TODO
*)
let p2_counter_example () : (bst * entry) =
failwith "TODO"
Grading criteria: Q3.3
Whether:
p2
correctly usesforall2
to implement and evaluate property \((P2)\) (usinggen_bst
to generate thebst
);- You have written a correct counter-example for \((P2)\) in
p2_counter_example
. - You have a brief explanation in a comment above
p2_counter_example
that explains why the property is broken on the counter-example.
Q3.4 While \((P2)\) is false, a slightly fixed version of \((P2)\) is true:
Note that we use "\(\implies\)" rather than "\(\iff\)".
Using forall2
, implement the below test which should use PBT to evaluate the correctness of \((P2')\).
Additionally, in a comment above p2'
, write a brief explanation for why \((P2')\) is true but \((P2)\) is false.
Grading criteria: Q3.4
Whether:
p2'
correctly usesforall2
to implement and evaluate the theorem \((P2')\) (usinggen_bst
to generate thebst
); and if PBT does not find any counter-examples;- Your comment adequately explains why \((P2)\) is false, but \((P2')\) is true.
Q4: Using sortedness to lookup a key faster
Above, lookup
needs to potentially traverse the whole tree, which can be prohibitively slow. However, given a sorted tree, we can do better. Since the tree is sorted, we can test if the key we are searching for is larger or smaller than the current entry's key; based on this information, we can go left or right. This means we do not traverse the whole tree, but only a certain path of it.
Q4.1 Implement this faster lookup:
Grading criteria: Q4.1
Whether your function type checks and does not contain failwith
. We will evaluate its correctness in the next question.
Correctness of lookup_sorted
(Part of) the correctness of lookup_sorted
is that it should not return None
if the key actually is in the tree.
But this is only true for sorted bst
s:
We can show this property in two ways.
Q4.2 First, use forall2
with your gen_bst
generator to show that
the above property is true for all t
and k
:
While the above likely won't find a counter-example, the above is not a very good test. Depending on how you wrote your gen_bst
function, it might be the case that you almost never generate a sorted function; and thus, the left hand side of the implication will always be false.
Grading criteria: Q4.2
Whether p3
correctly uses forall2
to evaluate the truth value of \((P3)\) (using gen_bst
to generate the bst
s); and if it does not find any counter-examples.
Q4.3 To fix this, we need to write a generator for bst
s that makes the left-hand side of the implication true. Thus,
write a generator for bst
that should always return a sorted bst
:
Grading criteria: Q4.3
Whether your function type checks and does not contain failwith
. We will evaluate its correctness in the next question.
And evaluate two properties:
Q4.4 First, gen_sorted
actually returns sorted bst
s. To do this, pick a bound n
, and use forall
to test whether sorted
is true for outputs of gen_sorted n
:
(This should always return None
, if gen_sorted
and sorted
are written correctly.)
Grading criteria: Q4.4
Whether you correctly use forall
to evaluate if gen_sorted
always returns sorted
trees.
Q4.5 Second, that our property about lookup_sorted
is still true when we use gen_sorted
instead of gen_bst
:
Grading criteria: Q4.5
Whether you correctly use forall2
to evaluate if property \((P3)\) is true when using gen_sorted
as the generator for bst
s.
Violating lookup_sorted
's hypothesis
Q4.6 If we consider all bst
s, instead of the sorted ones, then the conclusion of \((P3)\) is false. That is, the below statement is false:
Give a counter-example for the above statement's truth below:
Grading criteria: Q4.6
Whether the value you construct for p4
causes the conclusion of \((P4)\) (i.e., \(\mathsf{lookup\_sorted}(t, k) = \mathsf{None} \iff \mathsf{has\_key}(t, k) = \mathsf{false}\)) to be false.
Q5: Specifying and implementing remove
Q5.1 Now, implement a function remove : int -> bst -> bst
such that, for a key k
and tree t
, remove k t
returns a tree with all entries with key k
removed.
Hint
To remove \(k\) from Branch(t1, e, t2)
, if e
's key is equal to k
, you will need to construct a new tree out of
remove k t1
and remove k t2
by merging them together.
Feel free to create a helper function to do this. Your new solution may take any strategy for merging subtrees.
Grading criteria: Q5.1
Whether your function remove
type checks, does not contain failwith
, and does not output trees that contain entries with key k
.
Q5.2 Next, write a logical specification for what should happen when we call lookup (remove k1 t) k2
. This specification will universally quantify over k1
, k2
, and t
, and include two subparts in its body: what should be true when k1
is equal to k2
; and what should be true when k1
is not equal to k2
. Your property should include (at least) one call to lookup (remove k1 t) k2
.
(* Write your logical specification of remove below:
forall k1 : key, k2 : key, t : bst.
.... TODO .....
*)
This question is intentionally open-ended, as the purpose is for you to discover an appropriate logical specification for how lookup
should behave after we call remove
.
Grading criteria: Q5.2
Whether your logical specification in your comment is a well-formed propositional formula.
Q5.3 Next, use PBT as we did in Q3 to evaluate this property. Since we are quantifying over three things, you will use forall3
. First, transcribe the property into OCaml:
let p_remove () : (int * int * bst) option = failwith "TODO"
(* TODO:
- Is your property true or false?
- If true, explain why. If false, give a counter-example you found using PBT.
... TODO ...
*)
Then, after running p_remove
locally (trying out various generators for key
might be useful here),
write in a comment whether your logical property from Q5.2 is true or false. If it is true, explain why; if it is false,
give a concrete counter-example and explain why your counter-example breaks your property.
Grading criteria: Q5.3
Whether:
- Your
p_remove
is a correct transcription of your logical property, usingforall3
(and any generators you wish); - You correctly evaluated the truth value of your property;
- If you gave a counter-example, if it is a correct counter-example to the property's truth;
- If you gave a reason why the property is true, if it is a valid argument of its truth. (It does not have to be a formal proof, but should be a reasonable argument for why it is true.)