Special Topics: Interactive Theorem Proving with Rocq
1 Limitations of PBT, Intro to Interactive Theorem Provers
Throughout the semester, we’ve seen the enormous variety of logical invariants we can express using Property Based Testing.
The obvious weakness of PBT, which we have discussed many times, is that it only tests a certain subset of the inputs. For that to be useful, we have to be sure that we are testing a representative subset, and to know if it is representative, we may have to already know where bugs may appear: possibly not a reasonable assumption! Despite that, it is remarkably flexible and useful.
However, if our code had a bug that only shows up on larger inputs (or particular uncommon cases), then this testing might not find it. For many applications, this is good enough – it’s certainly better than how the vast majority of software is tested – but there are domains (medical devices, cryptography, fly-by-wire systems, etc) where we want to do better, and logic gives us tools.
How can we possibly reason about data of arbitrary size? Or at least, arbitrary size that will fit into our computers? We certainly can’t run our code on all inputs. The key idea is the same one as how we write recursive functions that would, in principle, work on data of arbitrary size. While you cannot construct an infinitely long list, the following length function will correctly work on any list you can construct (of which there are, in practical terms, infinitely many):
(define (len l)
  (cond [(empty? l) 0]
        [(cons? l) (add1 (len (rest l)))]))
This may seem basic, since you’ve been dealing with these kinds of functions for a while, but it’s worth reflecting on that fact: this three-line function will work on any list you can create.
Now let’s move a little closer to math, and think about a first proof.
Consider one of the simplest possible functions:
; f : Natural -> Natural
; adds 0 to its input
And we want to show that f(x) = x for all natural numbers. If we were happy testing a few random numbers, we could use PBT. But we want to know this holds for any natural number.
In particular, we want to be sure that the function doesn’t do, for example, this:
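For instance (a hypothetical sketch: the name f_bad and the use of Nat.ltb are ours, purely for illustration, not any function from our code), such a pathological function might look like:

```coq
(* Hypothetical misbehaving function: behaves like f for inputs
   below 2^64, but returns 0 for anything larger. *)
Definition f_bad (x : nat) : nat :=
  if Nat.ltb x (2 ^ 64) then x + 0 else 0.
```

No practical amount of random testing on small inputs can distinguish f_bad from a correct f, which is exactly why we want proof.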
i.e., for any numbers beyond 2^64, instead of returning the same number, it just returns 0. How can we prove that our function doesn’t do that?
To understand this, we have to think more carefully about how we define natural numbers, as our proof is going to rely upon that exact definition. There are multiple ways of doing that, but in Rocq, they are defined as:
Inductive nat : Set :=
O : nat
| S : nat -> nat.
i.e., a nat is either O (zero), or it is the successor of some other nat. This is an inductive definition, as it is defined in terms of itself. All terms in Rocq have types, including constructors (O and S) and types (nat).
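For example, the number 2 is just S applied twice to O; we can check this in Rocq (the comments paraphrase what Check and Eval report):

```coq
Check O.                  (* O : nat *)
Check S (S O).            (* S (S O) : nat *)
Eval compute in S (S O).  (* = 2 : nat, printed back in decimal notation *)
```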
To prove our theorem, that is, that \forall x : nat, f(x) = x, we need to show that it holds for all natural numbers. To do this, we appeal to the inductive structure of the data: we show that the property holds for O, and then show that, assuming it holds for some arbitrary nat m, it holds for S m (i.e., one more than m).
By just showing those two cases, just like with our recursive len function, we can show that the property holds for any natural number, as any natural number is constructed by repeated applications of S on top of O, and thus we can repeatedly appeal to the second case of our proof until we get to the base case and then appeal to the first case.
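This recipe is exactly the induction principle that Rocq generates automatically when nat is defined; we can inspect it with Check (the comment paraphrases the reported type):

```coq
Check nat_ind.
(* nat_ind : forall P : nat -> Prop,
     P 0 ->
     (forall n : nat, P n -> P (S n)) ->
     forall n : nat, P n *)
```

The two premises correspond precisely to the two cases of our proof: the base case for O, and the step from m to S m.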
Because proofs can get somewhat unwieldy, we will usually use tactics to help construct them. Tactics are procedures that can inspect the current partially constructed proof and add to it. All theorems in Rocq use tactics by default, so when you start working on a proof, you are switched to tactic mode.
This allows us, among other things, to construct trivial equality proofs with reflexivity and to unfold the definition of f.
Note that, in general, we can only directly prove trivial equalities (using reflexivity); for more interesting ones, we often have to rewrite them (using rewrite) to a trivial equality.
Definition f (x : nat) := x.
Theorem f_is_good : forall x, f x = x.
Proof.
intros x.
induction x.
- reflexivity.
- unfold f. reflexivity.
Qed.
A couple of other important things in this, our first proof: first, note that we have the same forall construct as we’ve seen before. Rocq is a higher-order logic, which means that we can quantify over terms (we’ll see exactly what that means later). Next, note that our theorem statement is the type of our function: it’s what we return. This is the type of an equality proof (= is actually just another inductive type, though one with lots of built-in support for working with it).
The idea that we construct proofs as functional programs, and the types are the theorem statements, is a very important idea, often called the Curry-Howard isomorphism, and it first showed up in a theorem prover called AUTOMATH, created by the mathematician and computer scientist N.G. de Bruijn in 1968. We talked about this a little in Lecture 20: Recursion & Induction.
It’s worth actually reflecting a bit on the history of these theorem provers, and really, the notion of truth that they give us.
1.1 What is proof?
To start, we need to be clear about what a proof is, and in particular, what it means within the Rocq Theorem Prover. First, what is a proof? A proof is an argument for some statement, described so that someone else can follow, and in doing so be convinced of the statement. Proofs are thus, historically, very social and contextual things: a step that the reader will be able to jump over can be left out, for example.
Related to proofs, and in some sense, simpler, are counter-examples: these are refutations of statements by means of a concrete example. For example, the statement "all prime numbers are odd" can be refuted by giving the counter-example 2, which is both a prime number and not odd (related, the fact that finding contradictions is often easier is the reason why many proofs in math assume the statement to be proved is not true, and then use that to derive a contradiction. It’s also how Property Based Testing works!).
There is still context, of course: what is a prime number, what is odd, etc.: every bit of mathematics and logic builds upon some shared social understanding of the meaning of certain things that people over centuries (or millennia) have found useful in trying to understand the world.
1.2 What is mechanized proof?
Much more recently, but nearly as old as computers (1950s), people have been trying to use computers to carry out, or assist in carrying out, proofs. The appeal is two-fold: first, if proofs are intended to be a series of steps, each of which follows according to some logical rule from the previous, computers are quite good at following steps without forgetting ones in the middle. Humans are certainly more creative, but this comes at a cost: we sometimes make leaps without explanation. Of course, sometimes those leaps are correct, and some of the most creative, baffling mathematics arises from those kinds of jumps (e.g., some of the number theory of Ramanujan, who described his results as divinely inspired, appearing to him in dreams). However, sometimes these leaps are difficult to confirm, or end up turning out to be not true!
The other reason why computers seemed appealing, even from the beginning, was the thought that if proof was merely the application of logical steps, one after another, perhaps proof could be rendered mechanical, and people could focus on the "interesting" parts: developing new theories, adding new rules, and posing new conjectures (to be automatically proved or disproved).
It turned out that the initial optimism was somewhat misplaced: while computers can come up with proofs in limited theories (SMT solvers are probably the most amazing result from that realm), beyond that, things become intractable.
Indeed, for most interesting math, there are simply too many possible approaches, and there is no tractable procedure by which a proof will be discovered. This led to an alternative approach, and a new set of tools, proof assistants, of which Rocq is one of the main incarnations.
In a proof assistant, a person is primarily responsible for constructing the proof, but each step that they carry out is checked to be valid. Thus, while proofs cannot be discovered automatically (at least, not in general), even quite rich proofs can be checked automatically, and that still turns out to be incredibly valuable, as it eliminates a serious potential risk. Assuming one trusts the proof checker (which we’ll talk about later), one need not trust the person who wrote the proof, only that the statement says what we want it to say. Given that proofs can often be orders of magnitude more complicated than theorem statements, this is a huge benefit.
1.3 What’s in Rocq?
There are two key ideas in Rocq. The first, which we mentioned earlier, is the idea that theorems are types and proofs are terms. While some people are enamored with this from a theoretical perspective, its appeal is quite practical: it means we can reuse key ideas across programming and proving. Indeed, while in some ways mechanized theorem proving will feel quite different, in other ways it should feel familiar: like learning a new programming language, just a very, very different one.
There’s another idea, which actually comes from an entirely different theorem prover, developed around the same time as de Bruijn’s AUTOMATH: LCF (or Edinburgh LCF), created by Robin Milner in 1972. The main idea was to have a trusted core that protected logical truth, and then a flexible language with which to define conveniences for writing proofs. What Milner realized, even then, was that mechanized proofs could be very tedious things, and much easier if you could leverage programming to make them easier. These procedures, called tactics, are a key idea in Rocq: indeed, we pretty much won’t write proofs without using some tactics!
We used some tactics in the above proof: rewrite is one of the main ones we’ll use, and it is the main way that we use equalities in our proofs. reflexivity is also a tactic, but a simpler one: it constructs a proof of equality between two terms that are equivalent. Equality is itself an inductive type, with a single constructor that allows reflexive proofs, and so reflexivity is really just a convenience over the constructor eq_refl.
While we are motivated to move to Rocq by proving properties involving infinite types, we will begin our journey at the beginning: with Rocq as a programming language.
2 Installing Rocq
Please follow instructions for your platform at https://rocq-prover.org/install
3 Programming in Rocq
Rocq is a typed functional programming language. Its type system is more powerful than nearly all others, but it shares many features with other typed functional languages like OCaml, Haskell, Scala, etc.
For example: all values have types, which we can inspect with Check
; this
is part of the interactive nature of Rocq. After each of these
commands, we can run the file "to the point" – i.e., where our cursor is. If
you are using VSCode, there is a horizontal arrow near the top of the window
(and a corresponding hotkey). Other editors have a similar mechanism.
Check true.
Check 42.
We can make Definitions:
Definition m : nat := 1. (* m is a natural number *)
Definition n : nat := 0.
Definition b1 : bool := true. (* b1 is a boolean *)
Definition b2 : bool := false.
Here we’ve given explicit type annotations to each of the definitions. Often, Rocq can infer these, but we will usually write them explicitly both to avoid any ambiguity and because it can make it easier to read.
We can, of course, ask for the type of definitions and expressions using them.
Check m. (* output: nat *)
Check n.
Check n + 0. (* nat *)
Check m * (n + 0). (* nat *)
Check b1. (* bool *)
(* Check b1 && b2. Doesn't work. *)
Check (b1 && b2)%bool. (* "&&" is the Boolean and *)
Check (b1 || b2)%bool. (* Boolean or *)
What happened with the last two? Rocq has a very flexible syntax system, and one
mechanism of that is that infix notation (e.g., &&
) is defined in what
are called "scopes". You can create and add to these yourself, and to show all
the currently available ones, use:
Print Scopes.
From that, we can see bool_scope, which has the "delimiting key" bool. Any expression can be wrapped with (...)%key and the notation within will use the scope described by key.
You can also open scopes:
Open Scope bool_scope.
After which you don’t need to use the delimiter, which is done by default for nat_scope (which is why + and * worked above).
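For example, once the scope is open, the Check that failed earlier works without the delimiter:

```coq
Open Scope bool_scope.
Check b1 && b2. (* bool *)
```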
And since it is a programming language, we can also evaluate expressions:
Eval compute in 5 * 4. (* 20 *)
Eval compute in m + 2. (* 3 *)
Eval compute in b1 && b2. (* false *)
Some notations and operations are defined in other modules, which we can import with Require Import:
(* Eval compute in (m + 7) mod 2. Doesn't work*)
Print Scope nat_scope.
Require Import Arith.
Print Scope nat_scope.
Eval compute in (m + 7) mod 2. (* 0 *)
Definition can also be used to create (non-recursive) functions, which are expressions that are abstracted over (typed) arguments:
Require Import String.
Definition g (s : string) : bool := Nat.ltb 0 (String.length s).
Here, we are using an operation String.length (functions from modules can always be qualified by their module name) to calculate how long a string is. Note that we are not using <, despite it existing, as that would move us into the world of proof – something we will come back to. Instead, we use the "ltb" (less than boolean) function from the natural number module to compare.
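As a quick sanity check, we can evaluate g on a couple of strings (string literals live in string_scope, hence the %string delimiters):

```coq
Eval compute in g "hello"%string. (* true *)
Eval compute in g ""%string.      (* false *)
```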
We can also use other typical data structures, like lists. But recursive functions, in Rocq, are defined with Fixpoint, not with Definition.
Fixpoint my_length (l : list nat) :=
  (match l with
   | nil => 0
   | _ :: xs => 1 + my_length xs
   end)%list.
This additionally shows pattern matching, which can be used both to discriminate different cases (here, the nil and cons cases) and also to unpack the data and give names to parts (as in the cons case, where we only want the rest of the list; we additionally use the infix syntax for cons, written ::).
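We can evaluate my_length to see it in action:

```coq
Eval compute in my_length (1 :: 2 :: 3 :: nil)%list. (* = 3 : nat *)
Eval compute in my_length nil.                       (* = 0 : nat *)
```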
But we aren’t really here to learn about typed functional programming, no matter how rich. Our goal is to prove properties of programs. So how do we get from a typed functional programming language to theorem proving?
4 Minimal propositional logic
We’ll start with a simple problem, perhaps the simplest: how to express, and prove theorems about, minimal propositional logic. That is, the language that only has logical variables and implications. This will allow us to become familiar with what it means to express theorems and prove them, and understand what that means!
In Rocq, as we’ve said, everything has a type, including logical statements, or propositions.
The type of propositions is Prop, and we can declare that we have arbitrary ones using Parameter:
Parameter P Q R : Prop.
Here, we’ve declared three variables that all have type Prop. We can read this as "consider arbitrary propositions P, Q, R". Parameter works for any type; we could write "consider arbitrary natural numbers, or arbitrary lists, or arbitrary graphs, etc.".
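For instance (these names are hypothetical, just to illustrate the point):

```coq
Parameter some_nat  : nat.       (* consider an arbitrary natural number *)
Parameter some_list : list nat.  (* consider an arbitrary list of naturals *)
```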
Now, Rocq is both a programming language and a tool for expressing and proving theorems. It could have added separate rules for logic: implication as an arrow \Rightarrow, and then logical rules for introduction (abstraction) and elimination (modus ponens), and many more rules for different logical constructions, but it turns out that the programming language is rich enough to express the logic. In the context of minimal propositional logic, rather than having a separate implication arrow, we can re-use the same arrow \rightarrow that we use for functions.
Thus Rocq makes no distinction between a function that takes P and produces Q and a theorem that, given P, yields Q. This is partly made possible because function arrows in Rocq are dependent: the output type can depend on the value of the input. This is quite important for theorems: e.g., in the example we had last time, x is the value of the input, and it is used in the output type (a proof about equality). It can also be useful for programming, though we will not explore this much: e.g., you might encode invariants of a data structure in the types, and the output could then relate in sophisticated ways to the input.
If a function arrow is an implication, then what is a proof of such a statement? i.e., we can write down the following theorem:
Parameter P Q R : Prop.
Theorem T : (P -> Q) -> P -> Q.
Which we can read as: given a theorem from P to Q and a proof of P, we can prove Q. The proof is a convincing argument that one can proceed as described. A particularly convincing argument is a procedure that, given the inputs, produces the output. What procedure could that be? Again, we can appeal to our understanding of this as a type in a programming language. The procedure is a program with the given type. We can complete the proof as follows:
Parameter P Q R : Prop.
Theorem T : (P -> Q) -> P -> Q.
Proof.
exact (fun Hpq Hp => Hpq Hp).
Qed.
i.e., our proof consists of a single tactic, exact, which completes a proof with a term that inhabits the type that is the goal of the theorem (or the goal of whatever part of the theorem the tactic is used in – since it is used at the beginning, it is the entire theorem). Here, it is a function that takes Hpq and Hp and returns Hpq applied to Hp.
We could write exactly the same thing in LSL (or ISL+):
(: T' (All (P Q) (-> (-> P Q) P Q)))
(define (T' Hpq Hp) (Hpq Hp))
Note that the type takes (P -> Q) as an argument, returning a function from P to Q. For compactness, we wrote the term as a function of two arguments, rather than a series of functions of one argument, but the latter would be equivalent:
Parameter P Q R : Prop.
Theorem T : (P -> Q) -> P -> Q.
Proof.
exact (fun Hpq => fun Hp => Hpq Hp).
Qed.
We could also write this without the anonymous function, as:
Parameter P Q R : Prop.
Definition F (Hpq : P -> Q) (p : P) : Q := Hpq p.
Theorem T : (P -> Q) -> P -> Q.
Proof.
exact F.
Qed.
This same dichotomy between functions defined with or without lambda is present in LSL and ISL+, where the following two are identical:
(define (myfun x y) (+ x y))
(define myfun (lambda (x y) (+ x y)))
As an exercise, construct a term for the following theorem:
Parameter P Q R : Prop.
Theorem T' : (P -> Q -> R) -> Q -> P -> R.
Proof.
exact _.
Qed.
5 Part 2
Last time, we introduced Rocq, the notion of proof, and how Rocq serves as both a programming language and a proof assistant at the same time. Today, we’re going to go into more detail about what Truth is, how it relates to Proof, and continue on our journey exploring what we can express in Rocq, moving from Minimal Propositional Logic to full Propositional Logic.
6 Truth and falsehood
In propositional logic, a formula or statement is valid if for all assignments of variables, the statement evaluates to T. In PBT, true statements are those for which we have not yet found a counter-example (thus, they are only possibly true). Sometimes our inputs are finite and thus we can exhaustively test, but this is pretty rare.
We can be a lot more precise in Rocq, but we also have to be a lot more careful: precision comes with a cost that small mistakes could render a so-called theorem not always true.
So let’s talk about how to represent truth and falsehood in Rocq. These will be our second encounter with data definitions in Rocq (after nat), also called inductive types. Inductive types are an important enough part of the system that the theory that underlies Rocq is called the Calculus of Inductive Constructions. There is a lot more to them, but for now, we will focus on a couple of important examples. So what are true and false?
Well, it turns out there are two different versions, and the difference is very important. First, we have the ordinary, boolean data definition that should be familiar from any number of programming languages, though perhaps written in an unfamiliar notation.
Inductive bool : Set :=
true : bool
| false : bool.
Let’s read this, carefully. First, it says we are declaring a new Inductive type called bool. In Rocq, everything has a type, including types, which is what the : Set is saying: the type of bool is Set, which is the type of ordinary data types. This data definition has two cases: true and false.
Aside from the syntax, this should be very familiar. And indeed, we have familiar operations on
type bool
: and, or, not, if, etc. However, in Rocq, we have a separate, much more important
definition of truth and falsehood that exists for a very important reason. As we mentioned
above, in theorem provers that subscribe to the types-as-theorems idea that originated in AUTOMATH,
we should not be able to prove false! One of the oldest statements from logic is ex falso quodlibet.
Translated as "from falsehood, anything follows", the idea is that if you have a proof of false, you
can prove any theorem. As a result, any logical system in which you can prove false would
not be very useful. That means if we use our type bool
as our definition of logical truth and
falsehood, we are in trouble, as we can easily "prove" false: we just use the constructor for false
.
Why is this not an issue in PBT? Essentially, our only reasoning is based on evaluation and assertions (often, of equalities). Our only theorems are individual test cases, and their proofs are based on evaluation, which must terminate. We thus do not have the principle of explosion (ex falso quodlibet), along with just about any other logical reasoning! Indeed, we rarely have proofs: we simply have searches for counter-examples, though in certain limited cases it can be exhaustive (and thus form a proof) – e.g., when we were writing truth tables for propositional formulas.
So how should we define logical truth and falsehood? Truth is the following:
Inductive True : Prop := I : True.
i.e., True is an inductive type that has type Prop and a single constructor, I.
Why Prop and not Set? Rocq distinguishes between types that are used for theorems, which have type Prop, and types that are used for computation, which have type Set. The situation is actually a bit more complicated, as there is an entire hierarchy of types that exists to avoid circularities in the logic, but generally we’ll be able to avoid worrying about that.
Why is True
a type that has a single constructor? Well, logical truth is essentially trivial: it
tells you nothing, on its own, and thus you can always construct a proof of it (you don’t need to know
anything).
Logical falsehood ends up being a lot more important. How do we define it?
Inductive False : Prop := .
This type is much more baffling, at first. It is a type, that has type Prop, but it has no constructors. Why does it have no constructors? Because it is impossible to construct. This is exactly what we want out of our logic, which is our programming language! Thus, we can formally write the statement "from false follows anything", without worrying, since there should be no way of constructing a value of type False. Note the importance of the distinction between the types (the theorem statements) and the terms (the proofs).
The type False
can easily be written down, and indeed, it is a very important type! We will often want to
express that certain cases are impossible. How will we do that? We will write that if such cases occur,
False
will result, and as a result, we need not do anything more in those cases (as they become trivial).
If, in a proof, we end up with a term of type False
, by inspecting the value, we can determine that we must be in
an impossible case, because such a value could never have been constructed, and thus no further work is needed.
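We can see this concretely in code (the name from_false is ours, for illustration): pattern matching on a term of type False needs no branches at all, since there are no constructors to consider, and that is precisely how "from falsehood, anything follows" appears as a program:

```coq
Definition from_false (A : Prop) (contra : False) : A :=
  match contra with end.  (* zero constructors, so zero cases to handle *)
```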
7 Propositional Logic
Now that we have a way to describe truth and falsehood, let’s continue on with operators from propositional logic. We already have implication (which exists in minimal propositional logic); how about logical and?
Inductive and (A B : Prop) : Prop := conj : A -> B -> A /\ B.
And a separate infix notation is defined for it, /\. There are two projection functions, proj1 and proj2, to get the two fields out. To use it, we could write a statement like P /\ Q -> P as:
Definition and_1 (p : P /\ Q) := proj1 p.
(Though this is quite redundant, since proj1 already exists.)
Despite the new syntax, this is entirely something you could have written in LSL!
(define-struct intro [a b])
(define-contract (And X Y) (Intro X Y))
(define (and_1 p)
  (intro-a p))
What is that argument p? Well, it has type P /\ Q, i.e., it is a conjunction of two propositions. That is its type: the term is simply a pair, and so we can use the projections (proj1 and proj2) to extract fields from the pair; in this case, the first field is the proposition of type P. So what this program does is, given a proof of a conjunction (a term of type P /\ Q), return a proof of P (a term of that type).
Written in this style, both the theorem (the type) and the proof (the term) are somewhat mixed together.
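Going the other direction, we can build a proof of a conjunction with its single constructor, conj. As a small example (the name and_swap is ours, for illustration), here is a term proving P /\ Q -> Q /\ P, assuming P and Q declared as Parameters as above:

```coq
Definition and_swap (pq : P /\ Q) : Q /\ P :=
  conj (proj2 pq) (proj1 pq).  (* swap the two halves of the pair *)
```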
In addition to logical and, we have logical or, defined as another inductive type:
Inductive or (A B : Prop) : Prop :=
or_introl : A -> A \/ B | or_intror : B -> A \/ B.
Which similarly has infix syntax \/.
Again, this could be defined in LSL:
(define-struct inl [a])
(define-struct inr [b])
(define-contract (Or X Y) (OneOf (Inl X) (Inr Y)))
Unlike and, logical or is an inductive type with two cases: or_introl and or_intror. This is a constructive view of logical disjunction: if you have a proof of A \lor B, that should mean that you either have a proof of A or a proof of B. Thus, the two ways of constructing A \/ B are constructors that take terms of type A and B, respectively.
To use a disjunction, we end up having to consider the two cases: given a term (proof) of type A \/ B, we must handle the case that either we had a proof of A or we had a proof of B. This means that, unlike with and, there are no projections. If x : A \/ B, we don’t have proj1 x : A and proj2 x : B. Rather, we have two possibilities for x: it is either or_introl a or or_intror b, where a : A and b : B.
We can deal with that in code using pattern matching, as follows:
Definition or_1 (pq : P \/ Q) (hp : P -> R) (hq : Q -> R) : R :=
match pq with
| or_introl p => hp p
| or_intror q => hq q
end.
Here, we handle the two possibilities, and invoke the corresponding function (hp or hq) accordingly. This is exactly analogous to how we program with itemizations in LSL. The same function could be defined with our data definition for Or above as:
(: or_1 (All (P Q R) (-> (Or P Q) (-> P R) (-> Q R) R)))
(define (or_1 pq hp hq)
(cond [(inl? pq) (hp (inl-a pq))]
[(inr? pq) (hq (inr-b pq))]))
Logical not is defined not directly, but by way of logical falsehood:
not = fun A : Prop => A -> False
    : Prop -> Prop
8 Part 3
As noted before, logical negation is defined in terms of False, and notation is provided as ~A for not A. That is, ~P holds if P -> False. This is a constructive view of negation: the negation of P is true if, from P, you can construct a proof of False. It is also why we end up using False a lot in proofs, even while we rarely use True: whenever we are reasoning about negation, we end up reasoning about False.
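As a tiny example of working with negation, ~False is provable: unfolding not, the goal is False -> False, and the identity function suffices (exact accepts it because ~False and False -> False are definitionally equal):

```coq
Theorem not_false : ~ False.
Proof.
  exact (fun contra : False => contra).  (* the identity on False *)
Qed.
```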
One obvious consequence of this is that the statement and P (not P) is nonsense: a statement can never be both true and false. If we unfold the definition of not, we see that what that statement is saying is that P -> False and P. Thus, we can project out the P -> False function and the P, and apply the first to the second to get a proof of False.
Parameter P : Prop.
Theorem F : (and P (not P)) -> False.
Proof.
exact (fun andpnotp => (proj2 andpnotp) (proj1 andpnotp)).
Qed.
Now that we have a proof of False (a term that has type False), we truly are in an impossible situation, as, if you recall, there are no terms of type False. It is defined as an inductive type with no constructors:
Inductive False : Prop := .
As a result, there is an elimination rule (a rule for using the term) that says: if we have a proof of False, we can conclude anything.
Check False_ind. (* False_ind : forall P : Prop, False -> P *)
For example, if we wanted to prove the statement 1 = 2 (which is a valid statement, but not provable), and we knew and P (not P), we could do that as follows:
Theorem T1 : (and P (not P)) -> 1 = 2.
Proof.
exact (fun andpnotp => False_ind _ ((proj2 andpnotp) (proj1 andpnotp))).
Qed.
i.e., we first pull out the two components of the conjunction, applying the second to the first, which gives us a term of type False. We can then use False_ind, which takes a property and something of type False and returns something of the type of that property (i.e., a proof)! In particular, it will produce a proof of 1 = 2 in this case. Note one special detail: rather than actually writing down the type of the property (i.e., False_ind (1 = 2) ...) we used an underscore. This can be done in many positions, as an indication that we would like Rocq to infer it (if it can).
9 Tactics
Up to now, we’ve mostly been writing proofs as explicit terms, but that is something that is actually done quite rarely. It turns out, rather than constructing the terms directly, which for complicated proofs, might be rather involved, we can instead invoke higher-level tactics that will construct a term for us. This is the incredibly important legacy of Milner’s LCF.
For the rest of this lecture, we’re going to focus primarily on the "Goals" panel, and how it, along with tactics, allow us to much more easily, and interactively, construct proofs.
By default, whenever we start a Proof
, we switch to tactic mode. Indeed,
we’ve been using a tactic in all our proofs: exact
, which takes a
complete term that satisfies the type of the goal. But now we’ll see more
interactive, step-by-step tactics.
On their own, tactics can make our code harder to read, as they allow you to
leave out details apparent in the terms, but the interactive nature of the
"Goals" provides interactive feedback that results in a really nice environment
for constructing proofs. The most basic tactic (beyond exact
!) is
reflexivity
, which performs evaluation on both sides, and then solves
trivial equalities: equalities that are the exact same on both sides.
Tactic Use | Tactic Description
reflexivity | If the goal is an equality, complete the proof if the two sides are either the same, or can be made the same by evaluation.
Let’s see how this works:
Lemma L1 : 1 + 1 = 2.
Proof.
reflexivity.
Qed.
Lemma is a synonym for Theorem that you can use to hint that these are smaller results. It has no internal significance to Rocq.
To learn more about tactics, let’s see a slightly more involved example, returning to our simple theorem about and
:
Theorem and_2'' : forall {P Q : Prop}, P /\ Q -> Q.
Proof.
intros P Q pq.
destruct pq as [p q].
exact q.
Qed.
Here, we use three tactics. The first, intros, takes identifiers from the goal and moves them into the premises of the proof. This corresponds, in the underlying term, to creating a function. Here we give explicit names to the propositions P and Q, and name the proof of P /\ Q as pq.
The Goals panel will always show information about the current goal and how many remain (none if the proof is complete), and each goal will have assumptions or premises (or hypotheses) that are available to use in the proof of that goal. The assumptions are separated from the goal by a horizontal line, and the goal is the proposition that we are trying to prove.
Tactic Use | Tactic Description
intros x1 ... xn | adds new hypotheses x1 ... xn by moving premises from the goal into the context.
We then do case analysis on one of our hypotheses: which one? The one called pq. Case analysis is done with destruct, and on a pair it will produce two new hypotheses for the two conjuncts. In this case, we matched the two produced hypotheses to the names p q; if we had left out the as [p q] part, then Rocq would choose names for our two hypotheses, which can make proofs harder to read.
Note that it does not manipulate the goal in this case, but simply changes the form of the hypotheses. If we did case analysis on a disjunction (also known as a sum), we would get multiple goals, as there would be multiple different cases to consider. Case analysis in tactics corresponds to pattern matching in the underlying terms, though the Goal state can make it a bit easier to manage.
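For instance, destructing a disjunction produces one goal per disjunct. A small sketch (the theorem name or_swap is ours, and it uses only tactics seen so far):

```coq
Theorem or_swap : forall P Q : Prop, P \/ Q -> Q \/ P.
Proof.
  intros P Q pq.
  destruct pq as [p | q].    (* two goals: one per disjunct *)
  - (* here we have p : P *)
    apply or_intror. assumption.
  - (* here we have q : Q *)
    apply or_introl. assumption.
Qed.
```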
Tactic Use | Tactic Description
destruct H | if H is a hypothesis of an inductive type, performs case analysis, producing one goal per constructor.
destruct H as [p q] | like destruct, but gives explicit names to the produced hypotheses in case there is only a single case. More complicated patterns can handle disjuncts.
At this point, we can complete the proof with the term q
, since it has
the type Q
which is our goal. We do this using the tactic exact
. This is a
tactic that takes a term, and if it matches the goal, it completes the proof.
Tactic Use | Tactic Description
exact t | if the term t has exactly the type of the goal, completes the proof.
There are many other ways we could write this. First, we could use the tactic
assumption
to complete the proof: it looks for a hypothesis
that matches the goal.
Theorem and_2'' : forall {P Q : Prop}, P /\ Q -> Q.
Proof.
intros P Q pq.
destruct pq as [p q].
assumption.
Qed.
Tactic Use | Tactic Description
assumption | if there is some hypothesis that has the same type as the goal, then this tactic completes the proof by using that hypothesis.
Since we no longer need to refer to q
by name, we now can allow Rocq to
determine the names when we use destruct
:
Theorem and_2'' : forall {P Q : Prop}, P /\ Q -> Q.
Proof.
intros P Q pq.
destruct pq.
assumption.
Qed.
10 Re-doing a propositional logic proof
Let’s prove a theorem that you did in Homework 2:
¬(P ∨ Q) ≡ ¬P ∧ ¬Q
Here we've written ≡ as "if-and-only-if" double implication, rather than as
propositional equality; the double implication is more natural in Rocq, since
Prop cannot be computed (and thus typical equality does not work).
Theorem demorgan1 : forall P Q : Prop, ~(P \/ Q) <-> ~P /\ ~Q.
Proof.
intros P Q.
constructor.
- intro mp.
constructor.
* intro p.
apply mp.
apply or_introl.
assumption.
* intro q.
apply mp.
apply or_intror.
assumption.
- intro mpr.
intro pq.
destruct mpr as [mp mq].
destruct pq as [p | q].
* apply mp. assumption.
* apply mq. assumption.
Qed.
Here we used two new tactics:
Tactic Use | Tactic Description
constructor | if the goal is an inductive type, try applying its constructors to see if any match.
apply H | if the conclusion of H matches the goal, replaces the goal with the premises of H (backward reasoning); if H matches the goal exactly, completes it.
Interestingly, the other proof that you did:
¬(P ∧ Q) ≡ ¬P ∨ ¬Q
cannot be done in the core logic of Rocq. The issue is that it is not constructive; it can be done if we add classical axioms.
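As a sketch of what that looks like: demorgan2 is our name, and the proof relies on classic, the excluded-middle axiom provided by the standard Classical module.

```coq
Require Import Classical.

Theorem demorgan2 : forall P Q : Prop, ~(P /\ Q) <-> ~P \/ ~Q.
Proof.
  intros P Q.
  constructor.
  - intro npq.
    (* classic : forall P : Prop, P \/ ~P -- the classical axiom *)
    destruct (classic P) as [p | np].
    + apply or_intror.    (* P holds, so it must be Q that fails *)
      intro q.
      apply npq.
      exact (conj p q).
    + apply or_introl.    (* P fails, and we are done *)
      assumption.
  - intros npq pq.
    destruct pq as [p q].
    destruct npq as [np | nq].
    + apply np. assumption.
    + apply nq. assumption.
Qed.
```

Note how the first direction is stuck without case-analyzing P ∨ ¬P, which is exactly the step the core (constructive) logic does not provide.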
Stepping back: proofs here vs. proofs with truth tables
Now that we’ve seen a few proofs, including doing a proof that you did in propositional logic, it’s important to understand the difference between proofs in Rocq and proofs in propositional logic.
In propositional logic, all variables can be either true or false, and thus any formula can be computed via a finite truth table. In a higher order logic like Rocq, propositional variables are not either true or false; rather, they are either provable or not provable. For a statement to be provable, it means that we have constructed a proof term with that type, but that type may involve data of arbitrary size, and thus the truth or falsity of a given statement cannot be determined by a finite truth table.
Thus while the proof we just did of one of De Morgan's laws is syntactically the same as what you did in Homework 2, it is a much more powerful proof, as it holds for any provable propositions, not just for T/F propositional variables. Another way to think about this is that Rocq has propositional logic embedded, by way of the boolean data type. Thus, our proof from Homework 2 was the following:
Theorem demorganprop :
forall P Q : bool, negb (orb P Q) = andb (negb P) (negb Q).
Proof.
intros P Q.
destruct P; destruct Q; try reflexivity.
Qed.
Here we can see a new tactic (or, technically, a "tactical", since it operates
on other tactics): ;
, which is used to chain tactics together by applying
what follows to everything that comes out of the former. Here we can see that by
doing case analysis on the two boolean inputs, we can complete the proof
trivially, as once we have constants, simply evaluating the expression will give
us the result.
Tactic Use | Tactic Description
t1; t2 | Apply t2 to every goal produced by t1.
Attempting to do that for the other proof will fail, as we cannot do
case analysis on Prop
: it is not an inductive type with a finite
number of constructors.
11 Part 4
As we’ve learned to program with Rocq, we’ve seen several inductive types: not only natural numbers, but lists and various logical connectives and constants.
Today, we’re going to go into more detail about inductive types, and how we can program and prove with them.
Inductive types allow us to define both structures and enumerations: they are the single type definition mechanism that we have, and they are incredibly powerful. Indeed, the theory that underlies Rocq is called the Calculus of Inductive Constructions for a reason!
We will show two different ways of defining an arithmetic expression (in the form of a binary tree) in the following:
Inductive ArithExp : Set :=
| num : nat -> ArithExp
| add : ArithExp -> ArithExp -> ArithExp.
This is the first, and the representation that underlies everything else. Since
everything has a type, our new inductive type must be given a type: in this
case, Set
, which is the type of "ordinary type" (i.e., data). Then, we
have a list of constructors. There can be any (finite) number of constructors,
including zero! Each constructor can take any number of arguments (including
zero!) and must return the type being constructed. The final restriction is
called the "positivity" test. It means you cannot define a type like:
Inductive MyT : Prop :=
| C : (MyT -> Prop) -> MyT.
(There are more subtleties to this, which we don’t have time to get into!)
Positions in types are called "negative" or "positive", and all occurrences of the type being defined must be positive in its definition. Positive occurrences are the arguments to the constructor, or the return types of function arguments to the constructor. Negative occurrences are arguments to function arguments to the constructor, as above. That means we could define a type like this:
Inductive MyT : Prop :=
| C : (Prop -> MyT) -> MyT.
This requirement is a much more subtle version of the requirement that all functions terminate. If negative occurrences in types were allowed, it would be possible to construct instances of uninhabited types, and in particular prove logical falsehood. This, in turn, would render the logic not useful as a logic, so it is ruled out by the system. We generally won’t run into this, but if you do see that error, you should have some idea of what you should be looking for.
To construct elements of type ArithExp
, we use the constructors:
Definition a1 := num 10.
Definition a2 := add (num 1) (num 2).
To use an ArithExp
in code or in a proof, we can use
pattern matching, a common feature in typed functional
programming, and one we’ve started getting familiar with. This is a more
concise form of the pattern of conditionally branching on which case of
the data type you are in, and then binding the corresponding elements.
Definition a3 := add (num 3) (num 4).
Definition a4 := match a3 with
| add e1 _ => e1
| _ => a3
end.
A match
expression takes an expression (here a3
), and then
a series of clauses. Rocq requires that the clauses are exhaustive: they
must cover all of the cases of the inductive type, but you can use wildcard
(_
) patterns to cover multiple cases, or within cases.
While that syntax of inductive definition is the core syntax, we will often use the following shorthand, which gives names to the arguments to constructors, writing them a bit more like enumerated struct definitions:
Inductive ArithExp : Set :=
| num (n : nat)
| add (l r : ArithExp).
This is a logically equivalent type, but the names of the fields will be used in the definitions that Rocq generates, which makes them more convenient to work with.
You can also apply only some of the arguments, and end up with a partially applied function expecting the rest:
Check add (num 3). (* add (num 3) : ArithExp -> ArithExp *)
Let’s practice by writing a function that uses pattern matching to evaluate our arith expressions to a number:
Fixpoint eval (e : ArithExp) : nat := match e with
| num n => n
| add l r => Nat.add (eval l) (eval r)
end.
Note how when we write recursive functions, we must use Fixpoint
, vs
non-recursive functions can be defined with Definition
. There are more
severe restrictions on recursive functions than this mere keyword switch: all
recursive functions must be terminating, and by default this uses a relatively
simple syntactic check that corresponds more-or-less to recursive calls that
follow the structural template (so all generative recursion would be ruled out).
There are ways to encode generative recursive functions, but it’s more complex.
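For example, a size function over ArithExp passes the termination check because every recursive call is on a direct subterm of the input (size is our name, for illustration):

```coq
Fixpoint size (e : ArithExp) : nat := match e with
  | num _ => 1
  | add l r => 1 + size l + size r  (* recursive calls only on subterms l and r *)
  end.
```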
We’ll come back to using this function in a proof in a bit.
Let’s now define a new inductive type, this time for colors. It should be an enumeration of a few constants:
Inductive Color : Set :=
| red : Color
| green : Color
| blue : Color.
Now, let’s define a different representation, where our type has a single constructor that takes three natural numbers as arguments (for red, green, blue light values):
Inductive RGB : Set :=
| mk : nat -> nat -> nat -> RGB.
We can now define functions that convert between the two representations:
Definition colortoRGB (c : Color) : RGB := match c with
| red => mk 255 0 0
| green => mk 0 255 0
| blue => mk 0 0 255
end.
Definition RGBtoColor (r : RGB) : Color := match r with
| mk 255 0 0 => red
| mk 0 255 0 => green
| mk 0 0 255 => blue
| _ => red
end.
What had to happen when we converted from RGB to color? We had to pick a way of handling "all the rest" of the colors. In this case, we just turned everything else into Red. There are other approaches, but the important part is that Rocq will always make us handle every case in some way. This is quite different from similar pattern matching in functional languages like Haskell or OCaml, where you can leave out cases and get errors at runtime.
Now, let’s prove that if we convert from Color to RGB and back, we get the same thing:
Theorem colorRGB_roundtrip : forall c, RGBtoColor (colortoRGB c) = c.
Proof.
intros c.
destruct c; auto using colortoRGB, RGBtoColor.
Qed.
Here we started to introduce automation; once we do case analysis on
c
using destruct
, we know that we should be able to complete the
proof by simplifying both functions; the auto
tactic can do
simplification (among other things), and using
allows us to suggest that
auto
should use these definitions (rather than just built in definitions
like and
, or
, etc, which wouldn’t have been enough).
Inductive proofs
Our motivation for moving to Rocq was to be able to prove properties about unbounded data. Now we know how to define inductive data types, which allow us to describe all sorts of unbounded data: not only numbers, but lists, trees, graphs, etc. Almost any of the data types that you designed in Fundies 1 are types that you should be able to write down as a suitable inductive definition.
But writing data definitions and defining functions over them was not our goal: we wanted to prove properties. Let’s start to talk about how that works.
We’ve shown how in Rocq, logical statements are types, and terms that have the type are proofs of the statement. Just as possibly many terms can have the same type, there are often many ways to prove the same statement. We’ve shown, in detail, how we can apply this to minimal propositional logic – how implication is expressed through function arrows, and application corresponds to modus ponens.
Now we are going to talk about how recursion corresponds to induction.
Inductive types are named that because they can be defined inductively – i.e., the constructors can refer to the type itself. Provided there is at least one base case (i.e., a constructor that does not refer to itself), we can build examples of the data: i.e., it is inhabited.
We will use an inductive type by pattern matching, and often computing recursively.
We’ll start by proving a simple property of lists: that appending an empty list results in the same list. Proving this on the left is trivial (try it!), but on the right requires induction, because of how List.append
is defined.
Theorem list_app_r_nil : forall (T : Type) (L : list T),
(L ++ nil)%list = L.
Proof.
intros T L.
induction L.
- reflexivity.
- simpl. rewrite IHL. reflexivity.
Qed.
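For comparison, the left version mentioned above really is immediate, since ++ recurs on its first argument (our theorem name):

```coq
Theorem list_app_l_nil : forall (T : Type) (L : list T),
    (nil ++ L)%list = L.
Proof.
  intros T L.
  reflexivity.  (* nil ++ L evaluates directly to L *)
Qed.
```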
Here we see a few new tactics. Not only induction
, but simpl
and rewrite
.
Tactic Use | Tactic Description
induction x | If x is an element of an inductive type, does case analysis, additionally providing an inductive hypothesis for each recursive subpart.
simpl | Evaluates the goal as far as possible.
simpl in H | The same as above, but simplifies in a hypothesis H.
rewrite H | If H is an equality, rewrites occurrences of its left side to its right side in the goal.
rewrite <- H | The arrow allows rewriting using the equation in the other direction, and the in variant (rewrite H in H') rewrites in a hypothesis instead of the goal.
Now, consider our ArithExp
inductive type, and eval
function. We talked about this as programming: it
is a function that takes an expression and returns a number. But we can
also think of the type as a logical statement: that given an
ArithExp
, it is possible to construct a Nat
. This isn’t a
terribly interesting statement, because we can construct a Nat
without anything: we can just use 0
, or 1
, or any of the
infinite other Nat
s. It thus has lots of proofs, many of which
are probably not very useful, at least if we want to treat them also
as programs.
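To make that concrete, here is one such uninteresting "proof" of ArithExp -> nat (the name ignore_exp is ours): it has the right type, but tells us nothing about its input.

```coq
Definition ignore_exp : ArithExp -> nat := fun _ => 0.
```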
Things become much more interesting when we introduce the ability to
express equivalence in our logical statements. There are two
different ways of doing this in Rocq. First, we can express it as
two implications: a statement P
is equivalent to a statement
Q
if P -> Q
and Q -> P
. We can write this
compactly as P <-> Q
. Constructing a proof for such a theorem
involves proving both implications. The other form of equivalence is
P = Q
, which is an inductive type with a single constructor
that takes two identical elements. This means that if you have a proof
(i.e., term) of type P = Q
, you know that P
and Q
must actually be the same, as that’s the only way you could have
constructed the term. We’ll use both forms of equivalence, though
the latter is stronger, since it shows that the two
statements must actually be the same, whereas P <-> Q
only
means that you can prove them each from the other. The other issue is
that P <-> Q
is only defined for propositions, whereas
P = Q
is defined for any type.
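We can ask Rocq for the definition of this equality type; it is (roughly) a one-constructor inductive, where the single constructor forces both sides to be the same term:

```coq
Print eq.
(* Inductive eq (A : Type) (x : A) : A -> Prop := eq_refl : x = x *)
```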
Let’s use this with our ArithExp
definition. Let’s define an
optimization, const_fold
, that replaces any ArithExp.add
of two numbers with the result of the
operation.
Fixpoint const_fold (e : ArithExp) : ArithExp := match e with
| num n => num n
| add l r =>
match l, r with
| num n, num m => num (Nat.add n m)
| l', r' => add (const_fold l')
(const_fold r')
end
end.
We can prove that this is correct, by showing that the result of
calling eval
on the result of const_fold
is the same as
calling eval
on the original expression, for all expressions.
Theorem const_fold_correct1 :
forall (e : ArithExp), eval e = eval (const_fold e).
Proof.
intro e.
induction e.
- reflexivity.
- destruct e1.
* destruct e2.
-- simpl. reflexivity.
-- assert (H : eval (const_fold (add (num n) (add e2_1 e2_2))) =
eval (add (num n) (const_fold (add e2_1 e2_2)))) by auto.
rewrite H.
clear H.
assert (H : forall x, eval (add (num n) x) =
Nat.add n (eval x)) by auto.
rewrite H. rewrite H. rewrite IHe2.
reflexivity.
* destruct e2.
-- assert (H : eval (const_fold (add (add e1_1 e1_2) (num n))) =
eval (add (const_fold (add e1_1 e1_2)) (num n))) by auto.
rewrite H.
clear H.
assert (H : forall x, eval (add x (num n)) =
Nat.add (eval x) n) by auto.
rewrite H. rewrite H. rewrite IHe1.
reflexivity.
-- assert (H : eval (const_fold (add (add e1_1 e1_2) (add e2_1 e2_2))) =
eval (add (const_fold (add e1_1 e1_2)) (const_fold (add e2_1 e2_2)))) by auto.
rewrite H.
clear H.
assert (H : forall x y, eval (add x y) = Nat.add (eval x) (eval y)) by auto.
rewrite H. rewrite IHe1. rewrite IHe2.
reflexivity.
Qed.
NOTE: this is a horrible proof.
First, note that this was what we approximated earlier, using PBT or Rosette. Now we have proved it for all expressions. We did it by induction. What is induction? It’s a strategy for constructing a proof for an arbitrary sized piece of data, in the same way as we write recursive functions over arbitrary sized data.
Let’s look at the proof state after we run the induction
tactic. See we
have two goals now: one for each case of our data type. This is the
same way that a recursive function on ArithExp
would have two cases in
pattern matching, or two branches in a template. In the base case(s) we have to
prove the statement in that case. In the other cases, we have to prove the
statement, but we can use (analogously to recursive calls!) the statement on the
subparts of the data. i.e., we can assume it already holds on the subparts of
the data.
Why does this work? Well, for any example piece of data, it is built out of the constructors, and if we have proved each of these cases, we can build up a proof of the example by starting with the proofs of the base case and then applying the inductive cases over and over. Since we can do that for any data, it works for all!
Now, that was kind of repetitive. If we do a little work, we can make that proof a lot shorter, by relying on some of the tacticals and some automation:
Theorem const_fold_correct2 :
forall (e : ArithExp), eval e = eval (const_fold e).
Proof.
intro e.
induction e.
- reflexivity.
- destruct e1; destruct e2;
simpl; eauto.
Qed.
If we want to see how this worked, we can replace the ; eauto
with
incremental applications of info_eauto
, which will show what sequence of
tactics it used to complete each goal.
12 More Proof Practice
Let’s do a few proofs about numbers. Here we are going to represent "evenness" as an inductive type, rather than a boolean predicate.
Fixpoint double (n : nat) : nat :=
match n with
| 0 => 0
| S n' => S (S (double n'))
end.
Theorem double_succ : forall n : nat,
double (S n) = S (S (double n)).
Proof.
intros n. simpl. reflexivity.
Qed.
Inductive ev : nat -> Prop :=
| ev_O : ev 0
| ev_SS : forall n : nat, ev n -> ev (S (S n)).
Theorem ev_4 : ev 4.
Proof.
apply ev_SS. apply ev_SS. apply ev_O.
Qed.
Theorem ev_plus4 : forall n : nat, ev n -> ev (n + 4).
Proof.
intros n H.
repeat (rewrite <- plus_n_Sm).
rewrite <- plus_n_O.
apply ev_SS. apply ev_SS. assumption.
Qed.
Theorem ev_double : forall n : nat, ev (double n).
Proof.
intros n.
induction n as [| n' IH].
- simpl. apply ev_O.
- simpl. apply ev_SS. assumption.
Qed.
The only slightly tricky part of these was dealing with the n + 4 –
since addition is defined by recursion on the first argument, we
actually have to appeal to a lemma (proven with induction) that
n + (S m) = S (n + m), and then finally that n + 0 = n.
Now, we’ll try to show that our definition of even is equivalent to a more typical boolean predicate definition:
Fixpoint even (n : nat) : bool :=
match n with
| 0 => true
| 1 => false
| S (S n') => even n'
end.
Definition nat_ind2 :
forall (P : nat -> Prop),
P 0 ->
P 1 ->
(forall n : nat, P n -> P (S(S n))) ->
forall n : nat , P n :=
fun P => fun P0 => fun P1 => fun PSS =>
fix f (n:nat) := match n with
0 => P0
| 1 => P1
| S (S n') => PSS n' (f n')
end.
Theorem ev_eq : forall n : nat, ev n <-> even n = true.
Proof.
intro n.
split.
- (* -> direction *)
intro H.
induction H as [| n' H' IH].
+ simpl. reflexivity.
+ simpl. assumption.
- (* <- direction *)
intro H.
induction n as [| n' IH | n' IH] using nat_ind2.
+ apply ev_O.
+ simpl in H. discriminate.
+ simpl in H. apply ev_SS. apply IH. assumption.
Qed.
Initially, this seems to go well, but we need to do induction on the structure of our relational definition of even, rather than on the structure of the natural numbers. Going in the other direction, from the even predicate to the ev relation, ordinary natural-number induction just doesn't work. The reason is that natural number induction goes up one number at a time. To prove that S n is even, we can assume that n is even. But this is never going to be true! Either n isn't even, or S n isn't even!
The solution is to instead use a different induction principle on natural numbers, one that goes up two at a time. An easy way to arrive at it is to go back to our intuition from recursive definitions.
This uses a new tactic: discriminate. This is used whenever we have an assumption that is a false equality.
Tactic Use | Tactic Description
discriminate | Looks for an impossible equality as a hypothesis (i.e., one between different constructors) and, if found, completes the proof.
13 Relations vs. Functions
You’ve written many functions in your programming career, and a bunch in this class. You’ve also now seen a few "relations" in this class, which often capture similar ideas. e.g., we just saw an even function:
Fixpoint even (n : nat) : bool :=
match n with
| 0 => true
| 1 => false
| S (S n') => even n'
end.
And an even relation:
Inductive ev : nat -> Prop :=
| ev_O : ev 0
| ev_SS : forall n : nat, ev n -> ev (S (S n)).
We talked (and proved) that the two were equivalent by showing that if you had one you could get the other, e.g., proved the following theorem:
Theorem even_ev : forall n, even n = true <-> ev n.
Proof.
Admitted.
But how are they different, and how are they the same? That is the purpose of the rest of this lecture.
14 Functions
First, we need to be more formal about what a function is. Mathematically (and the functions in Rocq are mathematical functions, i.e., given the same input, they always return the same output), a function is a mapping from inputs to outputs. Another way of stating that is that it is a (potentially infinite) set of input-output pairs, with the restriction that for any input, there is only one output (as calling a function on a particular input can only give you one output).
e.g., we could characterize even as:
Input | Output
0 | true
1 | false
2 | true
3 | false
4 | true
... | ...
Where we would have to give infinitely many pairs.
We can do this for any function. For example, the factorial function:
Fixpoint factorial (n : nat) := match n with
| 0 => 1
| S n => (S n) * factorial n
end.
Could also be defined as:
Input | Output
0 | 1
1 | 1
2 | 2
3 | 6
4 | 24
5 | 120
... | ...
15 Relations
So what is a relation? A relation is a generalization of a function, in that it does not have the restriction that there is only a single output for a given input. Instead, it is a set of input-output pairs, where there can be multiple outputs for a given input. So any function is a relation, but some relations are not functions. More concisely, you can think of a relation as a connection between inputs and outputs. One way you might get to a relation is by defining a function, and then making the function nondeterministic, i.e., allowing it to potentially return different outputs for the same input.
How can we represent relations? Well, in any language we can represent a relation as a predicate (boolean returning function) on pairs (of input-output). e.g., a relation for even could be:
Require Import Arith.
Definition even_rel (x : nat * bool) : bool := match x with
| (n, true) => n mod 2 =? 0
| (n, false) => n mod 2 =? 1
end.
Or we could represent the factorial relation as:
Definition factorial_rel (x : nat * nat) : bool := match x with
| (n, m) => factorial n =? m
end.
The latter being a bit circular, but a valid definition.
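Using the definitions above, we can test these predicate encodings directly with Compute:

```coq
Compute even_rel (4, true).     (* 4 mod 2 = 0, so this is true *)
Compute factorial_rel (4, 24).  (* factorial 4 = 24, so this is true *)
```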
But in Rocq, we have another way of representing relations that turns out to be really useful when we are writing proofs. Rather than writing them as functions that take pairs and return booleans, we can define them as inductive types.
This is useful because if we are trying to prove something about, e.g., the
factorial function, if we have the function, all we know is the input and
output: the structure of the code is opaque. But if we represent it as a
relation, and encode that as an inductive type, then the structure of the
computation corresponds to the constructors of the inductive type. We can do
case analysis, induction, etc. We’ve seen the ev
relation many times, so
let’s do one for factorial
instead:
Inductive fact : nat -> nat -> Prop :=
| fact_O : fact 0 1
| fact_S : forall (n m : nat), fact n m -> fact (S n) ((S n) * m).
Now, why does that relation mean the same thing? And in general, why does this
approach work? Rather than proving it (which we’ll do later), let’s look at how
the factorial
function evaluates. e.g., factorial 4
:
factorial 4 =
4 * factorial 3 =
4 * (3 * factorial 2) =
4 * (3 * (2 * factorial 1)) =
4 * (3 * (2 * 1 * factorial 0)) =
4 * (3 * (2 * 1 * 1)) = 24
Now, we know this, and we can confirm that we got from the beginning to the end
by running Compute (factorial 4)
, but if we are trying to prove something
about the factorial function, e.g., we have a hypothesis factorial n = m
,
there is no way to extract that structure from the hypothesis.
Now, let’s look at the same thing, but with the relation. In this case, we want
to see how a value of type fact 4 24
is constructed: i.e., how we
construct a proof that the pair (4, 24) is in the factorial relation.
Check (fact_S 3 _ (fact_S 2 _ (fact_S 1 _ (fact_S 0 _ fact_O))) : fact 4 24).
We can see that the term is a sequence of constructors, each of which
is a step in the computation: the same steps as we see above when we wrote out
how factorial
evaluated, but unlike something on paper, this is an actual
value we can use in our proofs.
To see this, and to see how this structure can be useful in our proofs, let's show that the two representations are equivalent. One direction (going from the inductive) ends up being easier, because the inductive relation contains so much more structure we can make use of in our proof:
Theorem fact_factorial : forall n m, fact n m <-> factorial n = m.
Proof.
intros n m.
split.
- (* forward direction *)
intros H.
induction H.
+ simpl. reflexivity.
+ simpl. rewrite IHfact. reflexivity.
- (* backward direction *)
intros H.
revert m H.
induction n as [|n' IH]; intros m H.
+ simpl in H. subst m. constructor.
+ simpl in H. subst m.
constructor.
apply IH. reflexivity.
Qed.