Skip to content

Homework 10: Type Checking

Due: Sunday, November 16th, 11:59pm

Start with the template code here.

  • Replace each failwith "TODO" with your solution.
  • Be sure to test your solutions in utop or in tests alongside your submission. To load your file into utop, use #use "hw10.ml";;.
  • Submit your version of hw10.ml to the HW10 assignment on Gradescope. Only submit hw10.ml; NOT any other files or folders.
  • At the top of the template file, there is a comment for you to write in approximately how many hours you spent on the assignment. Filling this in is optional, but helpful feedback for the instructors.

A Generic Type Checker

In class, we saw a simple AST for a programming language built out of a series of operators, values, and expressions of the following form:

type expr =
  | Value of value
  | Var   of string
  | Let   of string * expr * expr
  | Ite   of expr * expr * expr
  | App   of op * expr list

Then, after seeing a basic type system, we extended it in various ways, such as dividing TNat into three subsets: Any, Even, and Odd.

For the first part of the homework, you will repeat this experiment, but with a variety of other analyses. To keep things simple, we will have values just be int (treating ints as bool by comparing with 1), but have the following set of operators:

type value = int
type op = | Add | Mul | Sub | Negate | Div | And | Or | LT | EQ | Not
Note here that, since we have subtraction and negation, we will have negative numbers be valid values.

We will also give you eval_op in the template file:

val eval_op : op -> value list -> value 
Note that eval_op may throw an exception. (Where can it throw exceptions?) We interpret And/Or/LT/Not by thinking of the integer as a boolean: treating 1 as true, and any other number (such as 0) as false.

To make the homework modular, we will parameterize the type checker by a theory, which contains the set of types and needed operations for types:

module type Theory = sig
  type ty

  val type_of_const : value -> ty option

  val type_of_op : op -> ty list -> ty option

  (* Used for if/then/else *)
  val merge_ty : ty -> ty -> ty option

  val interpret_ty : ty -> value -> bool
end

Then, given a theory, the module BasicLang in the template is our language from class:

module BasicLang (S : Theory) = struct
  type expr = | Value of value | Var of string | Let of string * expr * expr | App of op * expr list | Ite of expr * expr * expr
  let rec eval (m : value StringMap.t) (e : expr) : value =  ...
  let rec typecheck (m : S.ty StringMap.t) (e : expr) : S.ty option = 
end

Most importantly, the theory given to BasicLang must be sound, meaning two things:

  • if the theory allows the type checker to succeed, eval must not crash; and
  • if an expression type checks with type Some t under the theory, then eval must evaluate to a value v where interpret_ty t v must hold.

In the template, we give you a basic theory, IntTheory, where the only type is TInt. Note that IntTheory disallows division by not returning Some in type_of_op in the case for Div.

As shown in BasicLang.typecheck, the functions type_of_const and type_of_op inside of the Theory signature are used for type checking Value and App nodes, respectively; while merge_ty is used for type checking if-then-else by attempting to merge the two types obtained by type checking the two branches. In the case of IntTheory, merge_ty just requires the two types are the same (which always happens to work); but for other theories, such as a theory for even/odd/any, a more complicated merge function is needed: for example, if we want to merge odd and even, it should succeed, but we will get any.

Finally, the function interpret_ty is used to say whether the value is "in" the subset of values corresponding to that type.

Q1: Warmup

Explain why IntTheory must disallow div for soundness, and give an expression that, if we allowed Div in IntTheory, would type check but evaluation would crash.

Grading Criteria

A correct explanation for why IntTheory can't handle div soundly, and an expression that would unsoundly type check if we did allow division in IntTheory.

Q2: Sign Analysis

In class, we worked out the details that would enable you to create a Theory for even/odd/any. Here, you will instead create a theory for sign analysis, where the type definition

  type ty = | Pos | Zero | Neg | Any
captures the subsets of integers that can either be positive (strictly larger than zero); zero; negative (strictly less than zero); or any integer.

Implement the rest of this theory in SignTheory. Your implementation must be both sound (satisfying the two soundness conditions mentioned above); but also as complete as possible (meaning that it returns Some whenever it can); and as precise as possible, which means using the most accurate type (i.e., if you can return Zero instead of Any, do so).

Grading Criteria

The correctness of SignTheory --- both in terms of soundness, completeness, and precision.

Q3: Interval Analysis

For our little language, an even more powerful analysis is possible: interval analysis, where types are represented as ranges:

  type ty = (int * int) (* First component must be <= second component! *)
Each type here is interpreted as the range \([a, b]\); i.e., the set of integers \(i\) such that \(a \leq i \leq b\). For the range to be sensible, it must be that case that \(a \leq b\). Implement this theory in IntervalTheory. This theory should also be implemented in a way that is as sound, complete, and precise as possible. Here, "precise" means that the intervals you assign should be as small as possible: you should not assign the type (0, 100) to the result when (0, 50) would also do.

Grading Criteria

The correctness of IntervalTheory --- both in terms of soundness, completeness, and precision.

Q4: Modulus Analysis

Finally, one can generalize the even/odd/any analysis to a modulus analysis, where the types are:

  type ty = | Mod5 of int (* Must be in range [0..4] *)| Any
Here, Any represents any integer (as before); while the type Mod5 i represents integers that are equal to i mod 5. Here, i must live in the set {0, 1, 2, 3, 4}. Implement this in Mod5Theory.

Grading Criteria

The correctness of Mod5Theory --- both in terms of soundness, completeness, and precision.

Q5: Type Checking an Imperative Language

The language in BasicLang is a simple, let-based language, modeled after OCaml. However, many other languages are based on commands, where we seperate the AST into two parts: expressions, which are either variables or applications of operators; and commands, which are statements that "run" but have no return value itself. An example command-based language is something like Python, where you can write programs such as:

x = 0
y = 50
while y > 0:
    y = y - 1
    x = x + y
    if x > 0:
        z = 0
    else:
        z = 1
The meaning of assignment here z = 0 is not a local let-binding but instead a global mutation: thus, after the loop terminates, we should be able to read z to see if the last value of x is greater than zero. In this way, variables in this language function more like an int ref in OCaml than an int.

In this problem, you will modify our type checker to now support a command-based language. The AST, found in MiniPython, looks like this:

type expr = | Var of string | App of op * expr list

type cmd = | Assign of string * expr | Ite of expr * cmd * cmd | Seq of cmd * cmd | While of expr * cmd | Skip
An expression is either a variable or an application of an operator, while a command may be:

  • An assignment of a variable to an expression (such as x := x + 1, which would be written Assign("x", App (Add [Var "x"; Value 1])));
  • An if/then/else, which checks if the expression returns 1; and if so, moves to the first command, and otherwise the second;
  • A sequencing, which lets you chain two command together (like c1; c2);
  • A while loop, which first tests if the expression is 1; if it is, it runs the command, and then runs the while loop again; and if the expression is not 1, it stops; and
  • A special command skip, which just means "do nothing".

We give this language a semantics via two functions:

let rec eval_expr (m : value StringMap.t) (e: expr) : value = ...

let rec eval_cmd (m : value StringMap.t) (c : cmd) : value StringMap.t = 
Here, eval_expr works just like our original eval_expr; however, eval_cmd works differently. It takes in the environment of variables m and returns not a value, but instead a new environment of variables for after the command runs.

For example, evaluating Assign("x", e) will first evaluate e to a value v, and then return a new map by modifying the old one with "x" updated to v. If/then/else evaluates by testing if the expression is true, as before, while sequencing runs the first command to obtain a new map, and then passes that new map to the evaluation of the second command. Evaluating skip just returns the original map.

Your task in this question is to implement a type checker for this command language:

let rec typecheck_expr (ty_env : S.ty StringMap.t) (e: expr) : S.ty option = failwith "TODO" 
let rec typecheck_cmd (ty_env : S.ty StringMap.t) (c : cmd) : bool = failwith "TODO"

Type checking expressions is similar to BasicLang. For type checking commands, the type checker outputs a boolean: true if the command is "well-formed", and false if the command is not "well-formed". A command is well-formed if, for every assignment x := e in the program:

  • The variable x is in the typing environment ty_env, so has some type Some t;
  • and e type checks to the same type, Some t;

and for every test of a guard g in Ite (g, c1, c2) or While (g, c), that g type checks under typecheck_expr (i.e., does not return None, but we don't care about the exact type it outputs).

Thus, we assume that every variable we care about is already in the typing environment. Note that the well-formedness check for While will not guarantee that the while loop terminates; just that, if it performs an assignment or a branch, that assignment or branch is allowed under our well-formedness condition. Finally, note that since there are no let-bindings but only mutation of global variables, ty_env should not change throughout the executino of typecheck_expr or typecheck_cmd.

Grading Criteria

The correctness of your type checker for expressions and commands with respect to this well-formedness condition.