Shrinking a language, part 1
We defined a small language yesterday that could implement simple functions. We could handle numbers, booleans, our own definitions of functions, and builtin functions. Today, we’re going to shrink our language: we are going to try to encode numbers and booleans (and the functions that work with them) using nothing but lambdas. Our goal is not to create a convenient language to use, but rather to see just how minimal of a language we can get that still suffices to compute whatever we want to compute. In fact, while we’re at it, let’s limit ourselves to lambdas of just a single argument, just to see how minimal we can get.
; Nat -> Nat (define (fact n) (if (zero? n) 1 (* n (fact (sub1 n)))))
Sample Problem Identify all the concepts we need to translate, in order to simplify this into only lambdas.
Booleans: true, false and if
Numbers: 1, zero?, * and sub1
Definitions and recursion
1 Booleans
Our strategy for defining each of these topics is to identify the constants we need to work with, and some defining equations that describe the requirements we have for them to work properly. For Booleans, it’s easy: there are three constants, TRUE, FALSE and IF, and we have the following requirements:
Actually, these equations are ever so slightly incomplete. We’ll see why, next time.
IF TRUE a b ===> a IF FALSE a b ===> b
How can we define TRUE and FALSE using nothing but lambdas? Limiting ourselves to lambdas means giving up some obvious abilities: among them, we can’t check whether a value is equal to a particular constant or not, because we can’t check lambdas for equality. All we can do with lambdas is call them. So we have to shift our perspective from asking what the constants are to asking what the constants do.
; Any Any -> Any (define TRUE.v1 (λ(a b) a)) (define FALSE.v1 (λ(a b) b))
; A BOOL is a function of signature Any -> [Any -> Any], ; and is one of: (define TRUE (λ(a) (λ(b) a))) (define FALSE (λ(a) (λ(b) b)))
Sample Problem Define IF, and test it.
; BOOL -> [Any -> [Any -> Any]] (define IF (λ(cond) (λ(then) (λ(else) ((cond then) else)))))
A slight caution: the derivations below are not written to follow exactly the evaluation order that DrRacket would use. I write it this way to keep things clearer, and we can actually prove that in this setting, either evaluation order is allowed. You are encouraged to work through the derivations in the order DrRacket would evaluate. For example, the first step below should substitute both IF and TRUE...
(((IF TRUE) x) y) ; {substitute IF} ==> ((((λ(cond) (λ(then) (λ(else) ((cond then) else)))) TRUE) x) y) ; {apply the lambdas} ==> ((TRUE x) y) ; {substitute TRUE} ==> (((λ(a) (λ(b) a)) x) y) ; {apply the lambdas} ==> x (((IF FALSE) x) y) ; {substitute IF} ==> ((((λ(cond) (λ(then) (λ(else) ((cond then) else)))) FALSE) x) y) ; {apply the lambdas} ==> ((FALSE x) y) ; {substitute TRUE} ==> (((λ(a) (λ(b) b)) x) y) ; {apply the lambdas} ==> y
This seems to meet our definitions, so we have a working encoding of Booleans and if, using nothing but lambdas.
2 Pairs
In the previous lecture we ignored structured data entirely. Let’s remedy that a bit here. We won’t handle arbitrary structure definitions; we’ll just work with a single definition of a pair of values. (We can use pairs to encode everything else: lists, trees, graphs, etc. So conceptually, we only need to support just this one kind of structure.)
Sample Problem Suppose we wrote (define-struct pair [left right]) in ISL. What functions would we be defining?
We only need three constants to describe a pair: MK-PAIR to construct a pair, and LEFT and RIGHT to get the pieces of a pair. We don’t need PAIR?, since pairs are the only structure we have. (This is fortunate, since we have no way of asking a given lambda whether it is a pair or not.) Our constants will have the signatures
; MK-PAIR : Any -> Any -> Pair ; LEFT : Pair -> Any ; RIGHT : Pair -> Any
and the defining equations
(LEFT (MK-PAIR a b)) ===> a (RIGHT (MK-PAIR a b)) ===> b
Exercise 11 Design encodings for these three constants.
The key insight here is that a pair is useless until we take it apart into its pieces, and doing so amounts to choosing which piece we want. That suggests the following encoding:
; A PAIR is a [BOOL -> Any] ; INTERPRETATION: when given TRUE, the pair will return its left value; ; and when given FALSE, it will return its right value (define MK-PAIR (λ(a) (λ(b) (λ(choose) ((choose a) b))))) (define LEFT (λ(pair) (pair TRUE))) (define RIGHT (λ(pair) (pair FALSE)))
Let’s see whether these definitions work with our requirements.
(LEFT ((MK-PAIR x) y)) ; {substitute LEFT} ==> ((λ(pair) (pair TRUE)) ((MK-PAIR x) y)) ; {apply the lambda} ==> (((MK-PAIR x) y) TRUE) ; {substitute MK-PAIR} ==> ((((λ(a) (λ(b) (λ(choose) ((choose a) b)))) x) y) TRUE) ; {apply the lambdas} ==> ((TRUE x) y) ; {by the definition of TRUE} ==> x (RIGHT ((MK-PAIR x) y)) ; {substitute RIGHT} ==> ((λ(pair) (pair FALSE)) ((MK-PAIR x) y)) ; {apply the lambda} ==> (((MK-PAIR x) y) FALSE) ; {substitute MK-PAIR} ==> ((((λ(a) (λ(b) (λ(choose) ((choose a) b)))) x) y) FALSE) ; {apply the lambdas} ==> ((FALSE x) y) ; {by the definition of FALSE} ==> y
This seems to meet our definitions, so we have a working encoding of Pairs, using nothing but lambdas.
3 Numbers
First, let’s agree that we’re not going to implement all possible numbers. Implementing complex numbers isn’t technically challenging: a complex number is simply a pair of real numbers...and we have pairs now. Implementing real numbers is very challenging, but can be implemented by something called Dedekind cuts, which is well beyond our scope but can be encoded by a predicate function on fractions...and we already have Booleans. A fraction is nothing more than a pair of integers...and we have pairs. And an integer is nothing more than the difference between two natural numbers...which is another pair. So all we need to implement are the natural numbers.
We’ve seen data definitions of naturals already: either zero, or one more than some natural. The essence of such numbers appears to be to count how many times to do something: in other words, these are the Repeaters you’ve worked with in lab.
Exercise 12 Finish (part "lab12") on repeaters.
(define ZERO? (λ(n) ((n (λ(dont-care) FALSE)) TRUE)))
Exercise 13 Test this function to confirm that it works for several numbers.
3.1 Making numbers bigger
We need to design ADD1, which ought to take in a NAT and produce a new NAT that represents the next larger number. Therefore, its signature is
Sample Problem Determine the general scaffolding for ADD1. What can we deduce just from the signature and data definition?
(define ADD1 (λ(n) (λ(f) (λ(x) ...))))
(define ADD1 (λ(n) (λ(f) (λ(x) (f ((n f) x))))))
(define ADD (λ(m) (λ(n) ((m ADD1) n))))
Sample Problem Given addition, define multiplication.
(define TIMES (λ(m) (λ(n) ((m (ADD n)) ZERO))))
Exercise 14 Given multiplication, define exponentiation.
3.2 Making numbers smaller
The challenge of implementing subtraction is subtle. Our encoding of natural numbers as repeaters menas that all we can do with these values is call them or ignore them; we can’t "partially" call them to repeat "fewer times". Said another way, if we have a repeater that applies a function f a total of n times, we can’t in general "undo" one of those calls to f – instead, we have to figure out a way to never have called it in the first place.
If we can’t count "down from n", then maybe we can count "up from ZERO." One reasonable intuition is to say "let’s define a function that takes its argument n, then starts an accumulator x from ZERO and checks whether (ADD1 x) is equal to n – if it is, then x is our answer, otherwise, ADD1 to it and recur." The problem is we need to figure out how to define equality on numbers, and right now the only number we know how to check for is ZERO. So we’d have to subtract one number from the next and compare to zero...and we don’t have subtraction yet. So this approach doesn’t quite work.
But the gist of the idea is promising, though: maybe we can define a function that, when repeated exactly n times, will give us n-1? Let’s build a simple table of inputs and outputs to see what we expect:
n | (SUB1 n) |
0 | ?? |
1 | 0 |
2 | 1 |
3 | 2 |
4 | 3 |
... | ... |
n | n - 1 |
This is one of the foundational axioms of Peano arithmetic. Axioms 1 and 6 correspond exactly to our NAT data definition...and our encoding actually very neatly matches this: the original notation of repeaters used s and z instead of f and x, because these two variables serve the exact same roles as "successor" and "zero".
Sample Problem Solve this!
(define SUB1 (λ(n) (RIGHT ; take the right value of the pair that we get by ((n (λ(row) ; repeat n times: given a row, (MK-PAIR ; make a pair where we (ADD1 (LEFT row)) ; add 1 to the left item of the prior row, then copy (LEFT row)))) ; the left item of the prior row to the right of this row (MK-PAIR ZERO ZERO))))) ; starting from a pair of zeros
This is truly elegant: we’ve used pairs to give ourselves an accumulator parameter, revised our perspective to think about repeaters as acting on rows of a table instead of just individual numbers, revised our perspective of subtraction to be a process of counting up, and in the end we obtain a function that works perfectly.
Exercise 15 Given SUB1, define arbitrary subtraction.
4 Recursion
At this point, we can translate our code to be
; NAT -> NAT (define fact (λ(n) (IF (ZERO? n) ONE (* n (fact (SUB1 n))))))
We’ve translated everything in our function...except the recursive call itself. That will have to wait for another day.
5 Summary
The encodings we’ve discussed today are called the Church encodings, named after Alonzo Church, and the language that we’ve reduced to, which consists solely of variables, lambdas and applications, is called the Lambda calculus. There is a wealth of research and programming language theory that stems from this, and if you’re interested, there’s far more to explore here.