Lecture 9: Dependent Contracts
1 Purpose
Output specification depending on input.
2 Outline
Last time we introduced the idea of relational properties, using shortest-string as an example.
Can you think of any other relational properties?
Think about sorting. Say you have a data definition for a person, with a name and an age:
(define-struct person [name age])
And you want to write a function sort-people that sorts people by their age, ascending:
(define (sort-people lop) ...)
Writing unit tests for this is problematic, as you end up with multiple possibilities:
(check-expect (sort-people (list (make-person "A" 19) (make-person "B" 19))) (list (make-person "A" 19) (make-person "B" 19))) (check-expect (sort-people (list (make-person "A" 19) (make-person "B" 19))) (list (make-person "B" 19) (make-person "A" 19)))
Both of these options are correct, and indeed, if you want to allow different sorting algorithms, it’s probably quite important not to enforce a certain ordering of otherwise equivalent elements.
How do we avoid this? By defining the notion of sortedness as a property, and making our specification be that sort-people results in a sorted list!
(define (sorted l) (cond [(empty? l) #t] [(cons? l) (and (andmap (lambda (p) (>= (person-age p) (person-age (first l)))) (rest l)) (sort-people-prop (rest l)))]))
(define (sort-people-prop l) (sorted (sort-people l)))
What is this expressing? That for every element in the list, everything later in the list has an age greater than or equal to it. That captures the notion of sortedness.
Now I can write tests in a form that allows any correct sorting algorithm:
(check-expect (sort-people-prop (list (make-person "A" 19) (make-person "B" 19))) #t)
Now, when we are coming up with these specifications, we do have to be careful: one benefit of unit tests is that they had us think about the input & the output, and write them down carefully. Just as our code can have bugs, so can our specifications, and they may not be obvious, at first. So often, we want to still include unit tests. In this case, it would be good to have unit tests for lists that did not have duplicate ages (and thus did not result) in ambiguity.
But if we just look at the specification, do we see any issue?
One problem: we are no longer ensuring that the same elements, or even any elements, come out of our function. We could define sort-people to return the empty list on all input and it would pass this specification, as indeed, the empty list is sorted!
Clearly, if we included some unit tests, we would rule out that possibility, but we can also extend our property to include other details. For example, we might want to ensure that the length of the input & output lists were the same, and that all elements in the input occurred somewhere in the output.
Figure out how to write those two as properties, and extend sort-people-prop to include them both.
(define (count-in x l) (length (filter (lambda (e) (equal? x e)) l)))
(define (elements-in? l1 l2) (andmap (lambda (p1) (= (count-in p1 l1) (count-in p1 l2))) l1))
(define (sort-people-prop1 l) (local [(define SORTED (sort-people l))] (and (sorted SORTED) (= (length l) (length SORTED) (elements-in? l SORTED)))))
Struct Contracts
If we tried to give signatures to the above, it would be difficult. While in principle, we could build a specific Immediate contract for Person that checked the fields appropriately, this can get unwieldy; and it’s unnecessary. LSL has built-in support for structs: whenever you define a new struct (like person above), you get a parametric contract (in this case, Person) created automatically. What parametric means is that, like List, it isn’t meaningful on its own: it has to be applied to other contracts to yield a useful, concrete, contract.
We can use that to give the above definitions signatures as:
(define-contract Person~ (Person String Number)) (: sort-people (-> (List Person) (List Person))) (: sort-people-prop (-> (List Person) True)) (: sort-people-prop1 (-> (List Person) True))
Function Contracts
At this point, we’ve written quite a few specifications. We’ve written these both as type-based signatures (i.e., that a function takes an Integer and returns an Integer), and as boolean predicates (that a given function should always return #t). We’ve used these both to describe properties (or invariants) about data: is a piece of data sorted, for example, to use to help build a specification for a sort function, but also, to describe what should happen when we call a given function: i.e., if we convert all other colors to black & white, none of them will be within a certain luminance of the given color, converted to black and white.
When we have used these specifications in tests, we have chosen particular inputs. For some data (e.g., our colors), we can exhaustively test all inputs, which forms a proof, a la truth table, that the function is indeed correct.
For other inputs, it’s not feasible to test all inputs. For simple atomic types, we’ve shown how you can use either check-contract or verify-contract to either randomly generate inputs and run the code, ensuring the output satisfies the contract, or symbolically execute the code, ensuring again that the output satisfies the output contract.
All of the contracts we’ve seen so far can be expressed, effectively, as simple boolean predicates: Any -> Boolean functions. i.e., Integer can be expressed as integer?, Boolean as boolean?. Any time we have had a more interesting invariant: e.g., the example from the second lecture that a prime-factor function actually returns a factor, we end up writing that as a separate function that is intended to always return #t.
(: prime-factor-of (-> Integer Integer)) (define (prime-factor-of n) 2) (: prime-factor-of-prop (-> Integer True))
(define (prime-factor-of-prop n) (let ([factor (prime-factor-of n)] [divides? (lambda (x y) (= (remainder y x) 0))]) (divides? factor n)))
(That’s a buggy implementation, of course, but the specification would catch that).
Writing these predicates separately works okay, but they are harder to re-use (e.g., if we had another function that involved factors, we could not reuse prime-factor-of-prop, since it calls prime-factor-of) and are perhaps more verbose than expressing directly in the signature of prime-factor-of what we expect of the output. One of the reasons for starting with this more primitive form is that it relies less on features of LSL, but as we get to more complex examples, we’ll appreciate the convenience that LSL provides.
So instead of expressing the property that the output should be a factor of the input as above, we can instead define a dependent contract, i.e., a contract that depends on a value, Divides:
(define-contract (Divides n) (Immediate (check (lambda (x) (= (remainder x n) 0)))))
In this case, we’ve defined this as a contract that cannot be used for generation: we could define how to generate factors of n, but in this case, we don’t need it.
With that contract, we can now redefine our prime factor function to be:
(: prime-factor-of-2 (Function (arguments [n Integer]) (result (AllOf Integer (Divides n))))) (define (prime-factor-of-2 n) 2)
This uses an alternative form of defining contracts for functions: rather than writing (-> Arg1 Arg2 ... Ret), you can write (Function (var1 Arg1) (var2 Arg2) ... Ret): i.e., you can give names to the arguments, which means they can be referred to either in later arguments or, as in this case, in the return. Indeed, (-> Arg1 Arg2 ... Ret) is actually shorthand for Function where all the vars are unused in later arguments or return.
In this case, what this says is that the first argument should be an Integer, and we’ll refer to it as n (note that we used the same contract variable as argument variable in the function header, but that’s merely convention: they can be different). The return should then be an Integer, but also (AllOf combines two contracts and ensures both hold) it should Divides n, i.e., it should be a factor of n, the argument of the function.
Now, checking the contract of this function (with (check-contract prime-factor-of-2)) not only checks that the types match, but also checks the logical constraint that the output is a factor of the input. We could also define a IsPrime contract, and change the output contract to be (AllOf Integer (Divides n) IsPrime) to truly capture what it means for the function to be correct.
Another example.
We can express the correctness property for the shortest-string function in this way:
(define (ShorterThanIn l) (lambda (x) (andmap (lambda (s) (>= (string-length s) x)) l))) (: shortest-string (Function (arguments [l (List String)]) (result (AllOf String (ShorterThanIn l)))))
(define (shortest-string los) ...)