Lecture 9.3: Binding

Last class, we ended up with an expression AST like so:

type value = | Nat of int | Bool of bool

type op = | Add | Mul | And | Or | Not | Eq

type expr =
  | Const of value
  | Op of op * expr list
  | Ite of expr * expr * expr

along with a semantics that allowed us to evaluate expressions to values:

let rec eval (e : expr) : value = ...

While the type says we always return a value, recall that it throws an exception (via failwith) if there is a "type error" in the semantics (e.g., adding a number to a boolean).

This class, we will tackle a new challenge when it comes to growing our language for realism: adding let-binders, so we can have expressions like:

(* a single let expression *)
let x = 2 in x + 3

(* sequenced let expressions *)
let x = 2 in 
let y = 3 in 
let z = 3 in 
x + y + z

(* nested let expressions *)
let x = (let y = 3 in y) in
let z = 3 in 
x + z

Note that all of the above expressions are "closed" (all variables that we use are defined), but we will also want to be able to parse "open" expressions, such as

let x = 3 in 
y

where the variable y is "free" (not defined).

We can adapt for let-bindings and variables by adding these nodes to the AST :

type value = | Nat of int | Bool of bool

type op = | Add | Mul | And | Or | Not | Eq

type expr =
  | Var of string 
  | Let of string * expr * expr (* Let(x, e1, e2) is "let x = e1 in e2" *)
  | Const of value
  | Op of op * expr list
  | Ite of expr * expr * expr

Note two things: first, that our other nodes of the AST (e.g., Op) do not change; and that, even though we are using strings for variables, this is NOT the same as strings appearing inside the language itself. If we wanted the language to be able to manipulate strings, we would need to add a new type of value and new operators.

Let's write down some example programs from above:

let x = 4 in x * x
-->>> Let("x", Const (Nat 4), Op (Mul, [Var "x"; Var "x"]))

From the perspective of parsing, free variables are fine (just like how false + false is a valid AST), so the following are also valid ASTs:

y
-->>> Var "y"

let x = 4 in y
-->>> Let("x", Const (Nat 4), Var "y")

let x = y in x
-->>> Let("x", Var "y", Var "x")

Interpreting Let Bindings

How should we evaluate a program like let x = 2 + 2 in x * x? Let's think about what happens in OCaml: first, we evaluate 2 + 2, giving us 4; then, we now know that x = 4, which lets us evaluate x * x. , giving us 16. Let's think of a few more examples:

utop # let x = 2 in let x = false in x;;
- : bool = false

This gives us false, because the second binding shadows (overwrites) the first binding.

utop # let x = (let y = 3 in y) in y;;
Error: Unbound value y

Here, y is out of scope outside of the inner let-binding. Thus, what this shows us is that bindings for variables don't flow out of the let expression that created it; but only flow down into the right-hand side of the let expression.

Note

This is one way of showing that OCaml, like most other languages, implements lexical scope: the knowledge of what variables are in scope is defined by the lexical structure of the source program.

Thus, what do we know?

There needs to be a way to assign values to variables;
When evaluating let x = e1 in e2, we first evaluate e1, grab the value for x, and evaluate e2 now with this new binding; and
When evaluating let x = (let y = e0 in e1) in e2, we don't want y to be in scope for e2.

There are multiple choices for how to implement a semantics that does this. The solution we will tackle here is to use environments, which are maps from variables to values.

module StringMap = Map.Make(String)

let rec eval (e : expr) (m : value StringMap.t) : value = ...

First, take note of a few things:

We implement the environment using OCaml's Map API, documented here;
We need to change the type of eval to take the environment m as an input;
But we do not need to return an environment. This is because of our choice to implement lexical scope: the environment only flows down the tree; not back up.

Let's now sketch out eval, and see some of the various cases. First, let's see how the old cases work now:

let rec eval (e : expr) (m : value StringMap.t) : value = 
  match e with
  | Const v -> v
  | Var s -> ...
  | Let (x, e1, e2) -> ...
  | Op (o, es) -> let vs = List.map (fun e -> eval e m) es in eval_op o vs
  | Ite (cond, e1, e2) -> 
    match eval cond m with 
    | Bool b -> if b then eval e1 m else eval e2 m
    | _ -> failwith "Ite: didn't get a boolean"

When we evaluate an Op, the only thing that changes is that, instead of calling List.map eval es, we need to pass the map m in as the second argument to eval. In this way, we are passing the environment downwards to the subexpressions. The code for Ite works the exact same way.

Let's now see the other cases:

let rec eval (e : expr) (m : value StringMap.t) : value = 
  match e with
  ...
  | Var s -> 
    (match StringMap.find_opt s m with 
      | Some v -> v
      | None -> failwith "Variable not found"
    )
  | Let (x, e1, e2) -> 
    let v = eval e1 m in 
    eval e2 (StringMap.add x v m)
  ...

In the Var case, we look up the variable in the environment, and throw an exception if the variable isn't found. In the Let case, we first evaluate e1 under the original environment, and then evaluate e2 under the environment extended with the new value for x.

The full code is given here. As an example piece of code, we can try evaluating

(* let x = (if true && false then 0 else 1) in (x + x) *)
let test_program = 
    Let ("x", 
      Ite (Op (And, [Const (Bool true); Const (Bool false)]),
        Const (Nat 0),
        Const (Nat 1)
      ),
      Op (Add, [Var "x"; Var "x"]))

To evaluate this, we need to call eval. With no other variables in scope, we can just use StringMap.empty:

utop # eval test_program StringMap.empty;;
- : value = Nat 2