On this page:
1 Purpose
2 Outline
2.1 Failing to generate
8.15

Lecture 13: Using LSL III🔗

1 Purpose🔗

Show how to use Tyche to understand how well check-contract is working

2 Outline🔗

We’ve now used check-contract for many different purposes, in class and in HW. Up until this point, we’ve been a little bit vague about how well it is exploring our properties. Part of that is because despite us using property-based testing (PBT) throughout, this is not really a course about PBT – it’s a course about specifications, and we use PBT to give some meaning to them by actually looking for counter examples. The other reason we haven’t focused too much on the actual distributions of values, or even how many tests we are running, is that even running a relatively small number of examples can be surprisingly effective.

But we do want to give you a little more of an under-the-hood perspective, and show you a tool that can help you understand called Tyche.

We’ve built support for Tyche directly into DrRacket, so when you have an LSL program open, next to the Run button there is a magnifying glass that says "Tyche". If you click that instead of clicking run, we will run all of your check-contracts, but instead of just showing that they passed (assuming they do), it opens up a web browser window with an interactive page to show information about the type of examples that were used.

Let’s take, as a basic example, the binary tree we had a few days ago:

(define-struct leaf [value])
(define-struct node [left right])
(define-contract (Leaf X) (Struct leaf [X]))
(define-contract (Node X Y) (Struct node [X Y]))

In order to fully use Tyche, we are often going to want to have full control over the generation, so we’ll use the inttree? and inttree-generate functions to construct a contract:

(define (inttree? t)
  (or (and (leaf? t) (integer? (leaf-value t)))
      (and (node? t)
           (inttree? (node-left t))
           (inttree? (node-right t)))))
 
(define (inttree-generate fuel)
  (cond [(zero? fuel) (make-leaf (contract-generate Integer fuel))]
        [else (if (< (random) 1/2)
                  (make-leaf (contract-generate Integer fuel))
                  (make-node (inttree-generate (sub1 fuel))
                             (inttree-generate (sub1 fuel))))]))
 
(define-contract IntTree (Immediate (check inttree?)
                                    (generate inttree-generate)))

Now if I write a height function and property about it:

(: height (-> IntTree Natural))
(define (height t)
  (cond [(leaf? t) 0]
        [(node? t) (add1 (max (height (node-left t)) (height (node-right t))))]))
(: height-rec-prop (-> IntTree True))
(define (height-rec-prop t)
  (and (equal? (add1 (height t))
               (height (make-node t (make-leaf 0))))
       (equal? (add1 (height t))
               (height (make-node (make-leaf 0) t)))))
(check-contract height-rec-prop)

And run Tyche (note: always run LSL normally, as running Tyche only works if no errors occur!).

In the window, we can see that all inputs passed the property, and, on our run, 93 of them were unique. That means that 7 were duplicate, but that’s relatively low. If we click on either segment, we can see the actual samples that were run, but Tyche can do quite a bit more.

For example, maybe we want to know if we are getting a good range of integers in the trees we generate. We could define a function that returns the magnitude of the largest integer that exists in the tree:

(define (absmax t)
  (cond [(leaf? t) (abs (leaf-value t))]
        [(node? t) (max (absmax (node-left t)) (absmax (node-right t)))]))

Now we could update IntTree to add an additional clause: a feature of the data. features are functions of the value that return some value that is relevant for Tyche. They additionally have a string label.

For example, we can revise IntTree to IntTree2:

(define-contract IntTree2 (Immediate (check inttree?)
                                    (generate inttree-generate)
                                    (feature "max magnitude integer" absmax)))

Now when we run Tyche, in addition to the previous output we have a new panel: distribution of "max magnitude integer". This shows us that most of our numbers are pretty small, and none are above 100, which may be a concern!

We can add more features – indeed, height might be a useful feature on its own. If we define height before and use it as a feature for IntTree, as follows:

(define (height-nc t)
  (cond [(leaf? t) 0]
        [(node? t) (add1 (max (height-nc (node-left t)) (height-nc (node-right t))))]))
 
(define-contract IntTree3 (Immediate (check inttree?)
                                    (generate inttree-generate)
                                    (feature "max magnitude integer" absmax)
                                    (feature "tree height" height-nc)))

We can see when we run Tyche that almost all the trees are quite short! This may be an actual issue, though maybe not, as there are a few that are larger.

2.1 Failing to generate🔗

What if we decided we wanted to only generate trees with even numbers in them? One possibility would be to make a new pair of functions like inttree? and inttree-generate, but for eventree. But, another way is to generate using "IntTree" and then filter out inputs that don’t satisfy the given property. Then all we need to do is make a predicate, not a way of generating.

We can do this as follows:

(define (eventree? t)
  (cond [(leaf? t) (even? (leaf-value t))]
        [(node? t) (and (eventree? (node-left t))
                        (eventree? (node-right t)))]))

Now let’s say we want to check if a definition of max-tree works over trees that are full of even numbers. We can express one invariant about max-tree as follows:

(: max-tree (-> (AllOf IntTree eventree?) Integer))
(define (max-tree t)
  (cond [(leaf? t) (leaf-value t)]
        [(node? t) (max (max-tree (node-left t)) (max-tree (node-right t)))]))
 
(: max-tree-prop (-> (AllOf IntTree eventree?) (AllOf IntTree eventree?) True))
(define (max-tree-prop t1 t2)
  (let ([m1 (max-tree t1)]
        [m2 (max-tree t2)])
    (and (equal? (max-tree (make-node t1 t2))
                 (max m1 m2))
         (equal? (max-tree (make-node t2 t1))
                 (max m1 m2)))))
(check-contract max-tree-prop)

Running this with check-contract reports no test failures.

Running it with Tyche reveals something much more interesting: out of 100 samples, 91 (in one run) didn’t pass the eventree? predicate, so only 9 even went through the property. Even worse, if we look at them, they are all very small, which makes sense if we reflect: as trees get bigger, the chance of generating one with no odd numbers gets very small! This means that if our code had a bug that only occurred on larger trees, it would be very unlikely, even if we generated a ton of examples, that we would ever generate one that could trigger the bug.

We contrast this with defining EvenTree "by hand":

(define (eventree-generate fuel)
  (cond [(zero? fuel) (make-leaf (* 2 (contract-generate Integer fuel)))]
        [else (if (< (random) 1/2)
                  (make-leaf (* 2 (contract-generate Integer fuel)))
                  (make-node (eventree-generate (sub1 fuel))
                             (eventree-generate (sub1 fuel))))]))
 
(define-contract EvenTree (Immediate (check eventree?)
                                    (generate eventree-generate)
                                    (feature "max magnitude integer" absmax)
                                    (feature "tree height" height-nc)))

(Where we re-used some features from IntTree)

Now if we change the input to the property above to EvenTree:

(: max-tree-prop2 (-> EvenTree EvenTree True))
(define (max-tree-prop2 t1 t2)
  (let ([m1 (max-tree t1)]
        [m2 (max-tree t2)])
    (and (equal? (max-tree (make-node t1 t2))
                 (max m1 m2))
         (equal? (max-tree (make-node t2 t1))
                 (max m1 m2)))))
(check-contract max-tree-prop2)

When we run Tyche, we can see that all the inputs are used, and while there are still quite a few short trees, there are some that are long.

If we decided we didn’t like that there were short trees, we could also address this! This is due to our decision to, at every step, decide if we are going to stop the tree at this level, and we do this with probability 1/2. This means that a tall tree is much less likely than a short one (leaves occur with probability 50%!).

There are several ways of changing this. A simple one is to just tweak the threshold – rather than 1/2, make it be 1/3 chance that we stop at a leaf. Even though this seems like a "small" change, it ends up significantly changing the distribution, such that if we run more than ~30 tests it can take a while to find them.

(check-contract max-tree-prop2 30)