Defining an interpreter for a small language
We’ve been using DrRacket to help us write and run programs in BSL and ISL this semester, and we’ve relied on its ability to evaluate our code and produce either an answer or an error message. But how exactly does it do that? Writing an interpreter for a language is one of the ways programming-language researchers and implementors understand and create new languages. DrRacket is itself a program: can we design our way to building an interpreter for a small langauge?
1 What makes a language?
The kinds of values our language can manipulate: things like Numbers, Booleans, Symbols, structure values, etc;
The syntax of expressions we use to manipulate values: things like define, if, function application, etc;
The semantics or meaning of each of those expression forms
Those three pieces define a language. Moreover, the semantics of the language is essentially a function from expressions to values: it assigns meaning to each expression by determining what its result should be.
For our language, we’re going to pick a very small set of values and a small set of expressions to manipulate them. Building a bigger language doesn’t introduce any new essential concepts for us here.
2 Data definitions
Sample Problem What kinds of values can we manipulate in ISL+Lambda? What is a minimal set of interesting values to work with?
Moving up, we work with structured (and recursive) data. These are more complicated; we’ll leave them out for now.
Most importantly, we have functions, which are the workhorse of BSL and ISL. Functions come in two varieties: the built-in ones, and the ones we write ourselves. We’ll start with defining our own functions, and incorporate built-in ones later.
Sample Problem Design a data definition for Values.
; A Value is one of ; - Number ; - Boolean ; - CustomFunction ; - BuiltInFunction
Sample Problem What kinds of expressions do we manipulate in ISL+Lambda? What is a minimal set of interesting expressions to work with?
What’s left? For booleans, we need the ability to write if expressions...and that’s
pretty much enough. (We can write cond expressions using a lot of
ifs, so cond is not essential —
Working with numbers is done entirely through function calls, so we don’t need any special-purpose expressions for that.
Writing our own functions requires lambda expressions. Using functions requires being able to call them, so we need some syntax for function-application.
Finally, using the parameters of a function means we need to have names in our program.
Sample Problem Design a data definition for Expressions.
; A Expr is one of ; - Number ; - Boolean ; - Identifier ; - (list 'if Expr Expr Expr) ; - (list 'fun [List-of Identifier] Expr) ; - [NEList-of Expr] ; A Identifier is any Symbol except any Keywords ; A Keyword is one of ; - 'if ; - 'fun
Exercise 4 Construct at least ten examples of Exprs, covering all the cases. Be sure some of the examples nest expressions inside each other.
'(fun (x) (if x 5 6))
If you’re not familiar with the quote syntax, read Intermezzo 2. We will use it freely in these notes.
Note! We’re creating a data definition to represent programs of
our toy language. Those data definitions happen to look a lot like BSL itself.
That’s not just a coincidence: our expressions above are all examples of
s-expressions —
If you find yourself getting confused whether something is an Expr or "just" a ISL expression, rely on signatures to help you out.
Exercise 5 Design the function keyword=?, that takes a BSL value and a Keyword, and checks if they’re equal.
; keyword=? : Any Keyword -> Boolean ; Is this value equal to the given keyword? (define (keyword=? val kw) (and (symbol? val) (symbol=? val kw))) (check-expect (keyword=? 42 'if) #false) (check-expect (keyword=? 'fun 'fun) #true)
Exercise 6 Construct the template for Expr.
(define (expr-temp expr) (cond [(number? expr) ...] [(boolean? expr) ...] [(symbol? expr) ...] [else (local ((define first-expr (first expr))) (cond [(keyword=? first-expr 'if) (... (expr-temp (second expr)) ; the condition of the if ... (expr-temp (third expr)) ; the "then" clause ... (expr-temp (fourth expr)) ; the "else" clause ...)] [(keyword=? first-expr 'fun) (... (second expr) ; the arguments of the function ... (expr-temp (third expr)) ; the function body ...)] [else (... (expr-temp (first expr)) ; the function ... (map expr-temp (rest expr)) ; the function arguments ...)]))]))
3 Evaluation
Our goal is to develop a function eval, that takes a Expr and returns a Value. We’ll build this function in several stages.
3.1 First attempt: just numbers and booleans
Exercise 7 Fill in the number and boolean cases of the template above. Complete the case of 'if expressions. Be careful! How many possible outcomes can there be for an 'if expression?
; Expr -> Value ; Evaluate the given expression to an answer (define (eval expr) (cond [(number? expr) expr] [(boolean? expr) expr] ...))
; Is this value #true, and nothing else? ; Any -> Boolean (define (true? v) (and (boolean? v) v)) ; Expr -> Value ; Evaluate the given expression to an answer (define (eval expr) (cond [(number? expr) expr] [(boolean? expr) expr] [(symbol? expr) ...] [else (local ((define first-expr (first expr))) (cond [(keyword=? first-expr 'if) (local ((define cond-val (expr-temp (second expr)))) ; evaluate the condition (cond [(true? cond-val) (eval (third expr))] ; if it's true, use the "then" clause [(false? cond-val) (eval (fourth expr))] ; if it's false, use the "else" clause [else (error "If expects a boolean, got " cond-val)]))] ...))]))
Exercise 8 Write several tests for eval. Confirm that it behaves as expected.
3.2 Including identifiers
x
((λ(x) (+ x 2)) 5)
Exercise 9 Design a function all-bound?, that takes a Expr and returns true if all the uses of variables in the program are properly bound, and false if any of them are not.
; An Environment (Env) is a [List (list Identifier Value)] ; INTERPRETATION: Each pair in the list associates a variable name to ; its value.
; Identifier Env -> Value (define (lookup name env) (cond [(empty? env) (error name "not defined!")] [(cons? env) (if (symbol=? (first (first env)) name) (second (first env)) (lookup name (rest env)))]))
Sample Problem Define several interesting test cases for lookup.
; Expr Env -> Value ; Evaluate the given expression to an answer ; ACCUMULATOR: env includes all the variables that are currently in scope (define (eval expr env) (cond [(number? expr) expr] [(boolean? expr) expr] [(symbol? expr) (lookup expr env)] [else (local ((define first-expr (first expr))) (cond [(keyword=? first-expr 'if) (local ((define cond-val (eval (second expr) env))) ; evaluate the condition (cond [(true? cond-val) (eval (third expr) env)] ; if it's true, use the "then" clause [(false? cond-val) (eval (fourth expr) env)] ; if it's false, use the "else" clause [else (error "If expects a boolean, got " cond-val)]))] ...))]))
Exercise 10 Test this new evaluator thoroughly. Mix and match identifiers, expressions, and different environments.
3.3 Implementing our own functions
(define foo (λ(x) x)) (foo 5)
(define foo (λ(a) (λ(b) (+ a b)))) (define bar (foo 3)) (define baz (foo 5)) (bar 5)
; A CustomFunction is a (make-closure [List-of Identifier] Expr Env) ; INTERPRETATION: the argument names and the body of a function, ; packaged with the environment that existed at the moment the closure was created. (define-struct closure [args body env])
; Expr Env -> Value ; Evaluate the given expression to an answer ; ACCUMULATOR: env includes all the variables that are currently in scope (define (eval expr env) (cond [(number? expr) expr] [(boolean? expr) expr] [(symbol? expr) (lookup expr env)] [else (local ((define first-expr (first expr))) (cond [(keyword=? first-expr 'if) (local ((define cond-val (eval (second expr) env))) ; evaluate the condition (cond [(true? cond-val) (eval (third expr) env)] ; if it's true, use the "then" clause [(false? cond-val) (eval (fourth expr) env)] ; if it's false, use the "else" clause [else (error "If expects a boolean, got " cond-val)]))] [(keyword=? first-expr 'fun) (make-closure (second expr) ; the args (third expr) ; the body – not evaluated! env ; the environment)] [else (local ((define eval-fun (eval (first expr) env)) (define eval-args (map (λ(arg) (eval arg env)) (rest expr)))) (do-application eval-fun eval-args))]))]))
; Value [List-of Value] -> Value ; Applies the given function-value to the given argument-values (define (do-application fun-val args-vals) (cond [(closure? fun-val) (... (closure-args fun-val) (closure-body fun-val) (closure-env fun-val))] [else (error "not a function: " fun-val)]))
Exercise 11 Finish this function!
; Value [List-of Value] -> Value ; Applies the given function-value to the given argument-values (define (do-application fun-val args-vals) (cond [(closure? fun-val) (cond [(= (length (closure-args fun-val)) (length args-vals)) (local ((define args-bindings ; combine each argument name with the corresponding ; argument value (map (λ(arg val) (list arg val)) (closure-args fun-val) args-vals)) ; the new environment includes the arguments and the old environment (define new-env (append args-bindings (closure-env fun-val)))) (eval (closure-body fun-val) new-env))] [else (error "Invalid function call: wrong number of arguments")])] [else (error "not a function: " fun-val)]))
Exercise 12 Confirm that this shadowing works properly.
Exercise 13 Test the new functions thoroughly.
3.4 Including built-in functions
'((fun (n) (add-one (num-square n))) 5)
; A BuiltinFunction is a (make-builtin ISLFunction) ; INTERPRETATION: A way to include ISL functions into our environments (define-struct builtin [fun])
; Value [List-of Value] -> Value ; Applies the given function-value to the given argument-values (define (do-application fun-val args-vals) (cond [(closure? fun-val) (cond [(= (length (closure-args fun-val)) (length args-vals)) (local ((define args-bindings ; combine each argument name with the corresponding ; argument value (map (λ(arg val) (list arg val)) (closure-args fun-val) args-vals)) ; the new environment includes the arguments and the old environment (define new-env (append args-bindings (closure-env fun-val)))) (eval (closure-body fun-val) new-env))] [else (error "Invalid function call: wrong number of arguments")])] [(builtin? fun-val) (apply (builtin-fun fun-val) args-vals)] [else (error "not a function: " fun-val)]))
(define AN-ENV (list (list 'add-one (make-builtin add1)) (list 'num-square (make-builtin sqr))))
4 Summary
Building a small language is relatively easy. We started with the concepts of Values and Expressions, and we chose representations for them that were convenient for us to create and process (using s-expressions). Then we systematically and incrementally built up an evaluator, the function that gives meaning to expressions by evaluating them to values. Our function was based heavily on the template for expressions, and we found ourselves reaching for just one additional accumulator parameter beyond that template-driven design. As we implemented CustomFunctions, we wound up with mutually-recursive data definitions (because now Values contained Exprs inside a closure-body), so our evaluator split into two mutually recursive functions, eval and do-application. Nothing in this program is particularly difficult, but it is sophisticated, and rereading it a few times to make sure you understand it all is definitely worthwhile.
If you found this material surprisingly interesting, definitely take a Programming Languages course, where you’ll learn a lot more about how languages are designed and what additional features we can add to them. If you want to know more about "real languages" like DrRacket and Java and others are implemented, consider taking a Compilers course after PL.