On this page:
1 Purpose
2 Outline
3 Slim  Check
8.7

Lecture 37: Demo of Slim Check

1 Purpose

Show how you can do property based testing to look for counter-examples to theorems in Lean.

2 Outline

At the beginning of the course, and again in your current homework, we showed how, given logical properties, we can use testing to partially verify them. Technically, what we are doing is looking for counter-examples: the longer we search, and the smaller the space of possible inputs, the closer we get to a proof, and the more confident we can be that the property holds.

All of the properties that we were using were boolean predicates, in the end, in that we ran code that either returned true or false, and we were looking for inputs that would result in false.

Once we moved to Lean, we ended up using propositions that were not boolean, and didn’t have an obvious way to run. Despite the fact that there is often a connection between an executable predicate and a logical one, we weren’t able to directly apply the same techniques. Instead, we had to produce a proof, in the form of a term in the Lean language, that the Lean typechecker could check against the proposition. Sometimes, carrying out that proof was quite difficult! And this is only for cases where we knew that the theorems were true (as you were always proving theorems that had already been proved).

One thing we didn’t explore at all is how to figure out if you are even proving the right theorem? If you are trying to prove a large theorem, and end up needing smaller helper theorems, what if your proposed helper theorems end up being not true? What if the overall theorem isn’t actually true? Sometimes you can do a lot of work, and realize there are subtleties that you didn’t realize at first that you need to take into account in order to prove this. The closest we got to this, and only in the notes, not in class, for Lecture 22, where a proof ended up going wrong because we needed a more precise notion of error handling in our stack calculator.

Both of these things can be addressed at once with a relatively simple idea: apply property based testing to theorems.

How? Well, it requires a bit of work, as we need a way to run our logical statements (which is possible for some, but not all), and a way to generate inputs (again, true for some, but not all), but at least for the subset that we can, we get the following benefits:

1. If a theorem is not true, we may be able to easily find a counter-example automatically. This is quite different than our typical approach, which requires we attempt to prove it, get stuck (potentially after doing a lot of work), and then decide that we are stuck not because we don’t know how to prove it (possible) but because it actually isn’t true.

2. For theorems we haven’t proved (or haven’t proved yet), rather than leaving them completely unverified (i.e., just sorry), we can have more confidence that they may be true if after doing a significant amount of property based testing we did not find a counter-example. Clearly, this is not the same as a proof (as we talked about when we talked about the limitations of PBT, it’s possible that there is a particular bad input, or our input generation isn’t well-distributed, etc), but it’s a lot better than nothing!

Let’s see how this can be done. In this lecture, I’m going to be using a tool called "SlimCheck". This is a part of (but not released yet) Mathlib4, which in a Lean4 port of the massive Lean3 library Mathlib. It defines not only how to do randomized testing (and assuming no counter-example is found, admit theorems with sorry), but how to generate inputs, and how to convert logical statements into executable expressions (so we can run them).

3 SlimCheck

The rest of these notes are an annotated file, which will only work with yet-to-be-merged code in Mathlib4, so I don’t expect anyone (without quite a bit of work) will be able to reproduce it!

import Mathlib

namespace SlimCheck

def List.shrink : (l : List A) -> List { y : List A // WellFoundedRelation.rel y l }
| [] => []
|x::xs => 
  let rest := List.shrink xs 
  xs, by simp_wf :: rest.map (fun y => {y with property := by match y with |y,h => simp_wf; sorry })

instance List.shrinkable [Shrinkable A] : Shrinkable (List A) where
  shrink  := List.shrink

instance List.sampleableExt [SampleableExt A] [Repr A] [Shrinkable A] : SampleableExt (List A) :=
  SampleableExt.mkSelfContained 
    (do Gen.listOf (SampleableExt.interpSample A))

end SlimCheck


















example : forall x : Nat, 2 * x = x + x := 
  by slim_check -- Success







example : forall x y : Nat, x + y = y + x := by slim_check -- Success


def odd : Nat -> Bool
| 0 => false
| 1 => true
| Nat.succ (Nat.succ n) => odd n

example : forall x y : Nat, 
  x % y = 1 -> odd y -> odd x := 
  by intros x
     induction x <;> intros y H1 H2 
     simp    
  



-- example : true == false



example (xs : List Nat) 
        (w :  x  xs, x < 3) :  y  xs, y < 10 := by slim_check
/-
===================
Found problems!
xs := [9, 3, 11, 1, 11, 11, 8, 10]
guard: ⋯
y := 11
guard: ⋯
issue: 11 < 5 does not hold
(2 shrinks)
-/





example : forall (α : Type) (xs ys : List α), 
  xs ++ ys = ys ++ xs := 
  by slim_check
/-
===================
Found problems!
α := "ℤ"
xs := [1]
ys := [2]
issue: [1, 2] = [2, 1] does not hold
(0 shrinks)
-/



-- How does this work? Relies upon a few ideas.

-- First, a more general idea, which you may have seen in Java as 
-- an "interface", but is called in Lean (and some other languages)
-- as a "type class" (there are some differences, but nothing we 
-- need to get into). These are used in several ways.

-- First, you need to be able to generate data. This relies upon the 
-- idea that certain types are "sampleable". i.e., the types are 
-- "instances" of the "Sampleable" type class (analogously, they satisfy 
-- its interface, but its done _externally_ to the definition of the type).

-- Some examples:
#sample Nat
/- Prints, e.g.,
0
0
0
0
4
0
4
6
1
2
-/

example : forall n : PrimeNat, ... toNat n ...

#sample Bool
#sample Bool
#sample Bool


#sample List Nat


#sample List Bool
/- Prints, e.g.,
[]
[false]
[false]
[false]
[true, true]
[true, false, true, true, false]
[true, false, true]
[false, true]
[true]
[false, false, true, true, true, false, false, false]
-/

-- In addition to sampling, we also want the idea of "shrinking", 
-- as once we find a counter-example, we'd like to make it smaller. 
-- This is done via a "Shrinkable" typeclass, which given an element,
-- returns a list of smaller candidates.

-- What if you want to use your own types? Need to make them Sampleable
-- & Shrinkable

structure MyType where
  x : Nat
  y : Nat
  h : x  y
  deriving Repr

instance : SlimCheck.Shrinkable MyType where
  shrink := λ x,y,h =>
    let proxy := SlimCheck.Shrinkable.shrink (x, y - x)
    proxy.map (λ ⟨⟨fst, snd⟩, ha => ⟨⟨fst, fst + snd, sorry⟩, sorry⟩)

instance : SlimCheck.SampleableExt MyType :=
  SlimCheck.SampleableExt.mkSelfContained do
    let x  SlimCheck.SampleableExt.interpSample Nat
    let xyDiff  SlimCheck.SampleableExt.interpSample Nat
    pure $ x, x + xyDiff, sorry

-- That's a bit noisy, and we aren't going to go into the details 
-- (some of which are elided with `sorry`), but the basic idea is 
-- that shrink takes an element and gives a list of smaller elements: 
-- we're sort of cheating by re-using existing 
-- instances for pairs. For sampling, we are choosing a random Nat 
-- and a random difference and using those two to make the data 
-- structure (the proof is not generated, though it should be!). 


-- Now I can sample from MyType
#sample MyType

-- And use it in theorems that I try to disprove
example :  a b : MyType, a.y  b.x  a.x  b.y := by slim_check -- Gave up 12 times

-- And theorems that aren't true!
example :  a b : MyType, a.y  b.x  a.x = b.y := by slim_check
/-
===================
Found problems!
a := { x := 0, y := 0, h := _ }
b := { x := 1, y := 1, h := _ }
guard: 0 ≤ 1
issue: 0 = 1 does not hold
(0 shrinks)
-/

-- Next, you need to have theorem statements that are _decidable_. 
-- This is done via another type class: Testable.

#check SlimCheck.Testable

-- Key idea: define how some Props can be _run_ to produce Booleans.
-- The "Gen" part is another feature, that we definitely can't get into, 
-- that allows randomness to be controlled quite precisely, and captures
-- the fact that the result in _random_ (which has to be done carefully in 
-- a pure language like Lean).
/-
class Testable (p : Prop) where
  run (cfg : Configuration) (minimize : Bool) : Gen (TestResult p)
-/

-- For example:
#check SlimCheck.Testable.andTestable

/-
instance andTestable [Testable p] [Testable q] : Testable (p ∧ q) where
  run := λ cfg min => do
    let xp ← runProp p cfg min
    let xq ← runProp q cfg min
    pure $ and xp xq
-/

-- And more generally, we can rely on the notion of "decidable" that is in the
-- core Lean libraries (and underlies `if`)

#check SlimCheck.Testable.decidableTestable

/-
instance (priority := low) decidableTestable {p : Prop} [PrintableProp p] [Decidable p] :
    Testable p where
  run := λ _ _ =>
    if h : p then
      pure $ success (PSum.inr h)
    else
      let s := printProp p
      pure $ failure h [s!"issue: {s} does not hold"] 0
-/

-- That uses two other type classes, one that comes from SlimCheck, and one from the
-- standard library.
#check SlimCheck.PrintableProp

/-
class PrintableProp (p : Prop) where
  printProp : String

instance (priority := low) : PrintableProp p where
  printProp := "⋯"
-/

#check Decidable


/-
class inductive Decidable (p : Prop) where
  /-- Prove that `p` is decidable by supplying a proof of `¬p` -/
  | isFalse (h : Not p) : Decidable p
  /-- Prove that `p` is decidable by supplying a proof of `p` -/
  | isTrue (h : p) : Decidable p

abbrev DecidableEq (α : Sort u) :=
  (a b : α) → Decidable (Eq a b)

/-- Decidable equality for Bool -/
@[inline] def Bool.decEq (a b : Bool) : Decidable (Eq a b) :=
   match a, b with
   | false, false => isTrue rfl
   | false, true  => isFalse (fun h => Bool.noConfusion h)
   | true, false  => isFalse (fun h => Bool.noConfusion h)
   | true, true   => isTrue rfl

@[inline] instance : DecidableEq Bool :=
   Bool.decEq
-/


-- In conclusion -- we can apply ideas from property based testing even to 
-- _theorems_ -- we just need a way of generating data and running the theorems,
-- and to understand how to use the results! Clearly this does not get us proofs,
-- but it can potentially be useful in making us not have to waste time trying to 
-- prove theorems that aren't true!

If you’d like to see a much more developed version of this (with tutorials, research papers), check out QuickChick: https://github.com/QuickChick/QuickChick