Lecture 7: ==>, PBT filtering
1 Purpose
Show ==>; logical implication corresponds to filtering for PBT.
2 Outline
Last time, we saw how to use for-all to generate random inputs
This is, in some sense, the fundamental idea of property-based testing. Between universal quantification, which is approximated by random generation, and logical assertions, which are realized by code that evaluates to booleans, we have a very rich way of describing the behavior of code.
(define (rev-app-helper l1 l2)
(list (reverse (append l1 l2))
(append (reverse l2)
(reverse l1))))
(define (rev-app-prop l1 l2)
(let [(ra (rev-app-helper l1 l2))]
(equal? (first ra) (second ra))))
(check-property
(for-all [(l1 [ListOf Natural])
(l2 [ListOf Natural])]
(rev-app-prop l1 l2)))
One additional useful logical tool is implication. That is, a certain property may only hold if certain preconditions also hold.
For example, consider the example from our very first lecture, control software for an automatic insulin pump:
; control : [ListOf BloodSugar] [ListOf InsulinGiven] -> InsulinGiven
; takes history of readings, history of dosing (any further back unknown), calculates new dose
(define (control Hx Rx)
1)
(define MAX-DOSE-OVER-T 10)
(define T 60)
(define MAX-DOSE 2)
(define (sum lon) (foldr + 0 lon))
(define (average lon) (/ (sum lon) (length lon)))
(define (get-n-readings l n)
(cond [(empty? l) (cond [(zero? n) '()]
[(positive? n) (build-list n (λ(_) 0))])]
[(cons? l) (cond [(zero? n) '()]
[(positive? n) (cons (first l)
(get-n-readings (rest l)
(sub1 n)))])]))
(check-property (for-all [(Hx [ListOf Natural]) (Rx [ListOf Natural])]
(let [(already-given (sum (get-n-readings Rx (- T 1))))]
(==> (< already-given MAX-DOSE-OVER-T)
(< (control Hx Rx) (- MAX-DOSE-OVER-T already-given))))))
Here, the property we were interested in was that given any set of readings, the pump wouldn’t decide to exceed a certain dose over a certain time interval. But since the historical data was also randomly generated, and is an input to the function, we don’t need to consider what happens if the dose had been exceeded in the past (as that is just random data, not anything actually produced by the function). As a result, we add a precondition to the property using the logical implication operator ==>. What this means is that if the precondition doesn’t hold, the property is vacuously true, and thus we will only consider violations of the property when the precondition holds.
Filtering to get more sensible inputs
Now, while quantifying over all Hx might make sense: readings could be large or small, and possibly the hardware that produces them could have bugs in it, but quantifying over all Rx may be less sensible, as, in particular, we have a safety requirement (ensured by properties) that a given dose never exceeds MAX-DOSE. So let’s include that restriction on Rx in the precondition, to make sure that any counterexamples we get are actually plausible:
(check-property (for-all [(Hx [ListOf Natural]) (Rx [ListOf Natural])]
(==> (andmap (λ(r) (< r MAX-DOSE)) Rx)
(let [(already-given (sum (get-n-readings Rx (- T 1))))]
(==> (< already-given MAX-DOSE-OVER-T)
(< (control Hx Rx) (- MAX-DOSE-OVER-T already-given)))))))
Now the test passes. That seems suspect, as our control function hasn’t changed. The issue is that of the random inputs we generated, all of those that passed the precondition did not trigger the bug. We can certainly come up with one (e.g., (list 1 1 1 1 1 1 1 1 1)), but it turns out that this input, or one like it with zeros mixed in, is the only input that’ll violate our spec, and our random generation doesn’t happen to pick any of those lists.
This is a weakness of PBT, but is avoidable. Any ideas how?
Carefully generate random inputs The idea is that rather than randomly generating arbitrary lists of naturals (and then filtering), we will generate only lists of naturals that seem plausible. Now, this is a risky thing to do, as if our "seem plausible" doesn’t match reality, we may miss bugs. But it’s an important technique, and critical to scaling PBT to real problems.
We’ll talk more about this tomorrow, but we’ll get a little into it today.
In addition to providing check-property, for-all, ==>, and a few types Integer, Natural, String, Boolean, ListOf, RecordOf, ISL-Spec exports the entirely of the Racket Quickcheck library (docs at https://docs.racket-lang.org/quickcheck/) with the qc: prefix.
One primitive provided by the library is qc:choose-one-of, which takes a list of values and returns one of them at random. We can use this to only return naturals 0 and 1. There is one detail that becomes apparent when we switch to these lower-level primitives: the library distinguishes between "arbitrary" instances, that can be shrunk, and "generators", that underly them. While all of the built-in types (Integer, String, Boolean, ListOf, RecordOf) are arbitraries, the combinators to build up new types are mostly generators, so usually when we drop down to using the library directly, we’ll be using generators.
We can use either with for-all, but we can’t pass generators to parametric arbitraries (i.e., ListOf (qc:choose-one-of (list 0 1)) will give an error, because ListOf expects an arbitrary, not a generator), so we need to use qc:choose-list, and the latter expects a size, so we used qc:sized to have it have random size:
(check-property (for-all [(Hx [ListOf Natural])
(Rx [qc:sized (λ(n) (qc:choose-list (qc:choose-one-of (list 0 1)) n))])]
(let [(already-given (sum (get-n-readings Rx (- T 1))))]
(==> (< already-given MAX-DOSE-OVER-T)
(< (control Hx Rx) (- MAX-DOSE-OVER-T already-given))))))
Now, as hoped, we get a counterexample.