Homework 10: Type Checking
Due: Sunday, November 16th, 11:59pm
Start with the template code here.
- Replace each
failwith "TODO"with your solution. - Be sure to test your solutions in
utopor in tests alongside your submission. To load your file intoutop, use#use "hw10.ml";;. - Submit your version of
hw10.mlto the HW10 assignment on Gradescope. Only submithw10.ml; NOT any other files or folders. - At the top of the template file, there is a comment for you to write in approximately how many hours you spent on the assignment. Filling this in is optional, but helpful feedback for the instructors.
A Generic Type Checker
In class, we saw a simple AST for a programming language built out of a series of operators, values, and expressions of the following form:
type expr =
| Value of value
| Var of string
| Let of string * expr * expr
| Ite of expr * expr * expr
| App of op * expr list
Then, after seeing a basic type system, we extended it in various ways, such as dividing TNat into three subsets: Any, Even, and Odd.
For the first part of the homework, you will repeat this experiment, but with a variety of other analyses. To keep things simple, we will have values just be int (treating ints as bool by comparing with 1), but have the following set of operators:
We will also give you eval_op in the template file:
eval_op may throw an exception. (Where can it throw exceptions?)
We interpret And/Or/LT/Not by thinking of the integer as a boolean: treating 1 as true, and any other number (such as 0) as false.
To make the homework modular, we will parameterize the type checker by a theory, which contains the set of types and needed operations for types:
module type Theory = sig
type ty
val type_of_const : value -> ty option
val type_of_op : op -> ty list -> ty option
(* Used for if/then/else *)
val merge_ty : ty -> ty -> ty option
val interpret_ty : ty -> value -> bool
end
Then, given a theory, the module BasicLang in the template is our language from class:
module BasicLang (S : Theory) = struct
type expr = | Value of value | Var of string | Let of string * expr * expr | App of op * expr list | Ite of expr * expr * expr
let rec eval (m : value StringMap.t) (e : expr) : value = ...
let rec typecheck (m : S.ty StringMap.t) (e : expr) : S.ty option =
end
Most importantly, the theory given to BasicLang must be sound, meaning two things:
- if the theory allows the type checker to succeed,
evalmust not crash; and - if an expression type checks with type
Some tunder the theory, thenevalmust evaluate to a valuevwhereinterpret_ty t vmust hold.
In the template, we give you a basic theory, IntTheory, where the only type is TInt.
Note that IntTheory disallows division by not returning Some in type_of_op in the case for Div.
As shown in BasicLang.typecheck, the functions type_of_const and type_of_op inside of the Theory signature are used for type checking Value and App nodes, respectively; while merge_ty is used for type checking if-then-else by attempting to merge the two types
obtained by type checking the two branches. In the case of IntTheory, merge_ty just requires the two types are the same (which always happens to work); but for other theories, such as a theory for even/odd/any, a more complicated merge function is needed:
for example, if we want to merge odd and even, it should succeed, but we will get any.
Finally, the function interpret_ty is used to say whether the value is "in" the subset of values corresponding to that type.
Q1: Warmup
Explain why IntTheory must disallow div for soundness, and give an expression that, if we allowed Div in IntTheory, would type check but evaluation would crash.
Grading Criteria
A correct explanation for why IntTheory can't handle div soundly, and an expression that would unsoundly type check if we did allow division in IntTheory.
Q2: Sign Analysis
In class, we worked out the details that would enable you to create a Theory for even/odd/any. Here, you will instead create a theory for sign analysis, where the type definition
Implement the rest of this theory in SignTheory. Your implementation must be both sound (satisfying the two soundness conditions mentioned above); but also as complete as possible (meaning that it returns Some whenever it can); and as precise as possible,
which means using the most accurate type (i.e., if you can return Zero instead of Any, do so).
Grading Criteria
The correctness of SignTheory --- both in terms of soundness, completeness, and precision.
Q3: Interval Analysis
For our little language, an even more powerful analysis is possible: interval analysis, where types are represented as ranges:
Each type here is interpreted as the range \([a, b]\); i.e., the set of integers \(i\) such that \(a \leq i \leq b\). For the range to be sensible, it must be that case that \(a \leq b\). Implement this theory inIntervalTheory.
This theory should also be implemented in a way that is as sound, complete, and precise as possible. Here, "precise" means that the intervals you assign should be as small as possible: you should not assign the type (0, 100) to the result when (0, 50) would also do.
Grading Criteria
The correctness of IntervalTheory --- both in terms of soundness, completeness, and precision.
Q4: Modulus Analysis
Finally, one can generalize the even/odd/any analysis to a modulus analysis, where the types are:
Any represents any integer (as before); while the type Mod5 i represents integers that are equal to i mod 5. Here, i must live in the set {0, 1, 2, 3, 4}.
Implement this in Mod5Theory.
Grading Criteria
The correctness of Mod5Theory --- both in terms of soundness, completeness, and precision.
Q5: Type Checking an Imperative Language
The language in BasicLang is a simple, let-based language, modeled after OCaml. However, many other languages are based on commands, where we seperate the AST into two parts: expressions, which are either variables or applications of operators; and commands, which are statements that "run" but have no return value itself.
An example command-based language is something like Python, where you can write programs such as:
z = 0 is not a local let-binding but instead a global mutation: thus, after the loop terminates, we should be able to read z to see if the last value of x is greater than zero.
In this way, variables in this language function more like an int ref in OCaml than an int.
In this problem, you will modify our type checker to now support a command-based language.
The AST, found in MiniPython, looks like this:
type expr = | Var of string | App of op * expr list
type cmd = | Assign of string * expr | Ite of expr * cmd * cmd | Seq of cmd * cmd | While of expr * cmd | Skip
- An assignment of a variable to an expression (such as
x := x + 1, which would be writtenAssign("x", App (Add [Var "x"; Value 1]))); - An if/then/else, which checks if the expression returns
1; and if so, moves to the first command, and otherwise the second; - A sequencing, which lets you chain two command together (like
c1; c2); - A while loop, which first tests if the expression is
1; if it is, it runs the command, and then runs the while loop again; and if the expression is not1, it stops; and - A special command skip, which just means "do nothing".
We give this language a semantics via two functions:
let rec eval_expr (m : value StringMap.t) (e: expr) : value = ...
let rec eval_cmd (m : value StringMap.t) (c : cmd) : value StringMap.t =
eval_expr works just like our original eval_expr; however, eval_cmd works differently. It takes in the environment of variables m and returns not a value, but instead a new environment of variables for after the command runs.
For example, evaluating Assign("x", e) will first evaluate e to a value v, and then return a new map by modifying the old one with "x" updated to v.
If/then/else evaluates by testing if the expression is true, as before, while sequencing runs the first command to obtain a new map, and then passes that new map to the evaluation of the second command.
Evaluating skip just returns the original map.
Your task in this question is to implement a type checker for this command language:
let rec typecheck_expr (ty_env : S.ty StringMap.t) (e: expr) : S.ty option = failwith "TODO"
let rec typecheck_cmd (ty_env : S.ty StringMap.t) (c : cmd) : bool = failwith "TODO"
Type checking expressions is similar to BasicLang. For type checking commands, the type checker outputs a boolean: true if the command is "well-formed", and false if the command is not "well-formed".
A command is well-formed if, for every assignment x := e in the program:
- The variable
xis in the typing environmentty_env, so has some typeSome t; - and
etype checks to the same type,Some t;
and for every test of a guard g in Ite (g, c1, c2) or While (g, c), that g type checks under typecheck_expr (i.e., does not return None, but we don't care about the exact type it outputs).
Thus, we assume that every variable we care about is already in the typing environment.
Note that the well-formedness check for While will not guarantee that the while loop terminates; just that, if it performs an assignment or a branch, that assignment or branch is allowed under our well-formedness condition.
Finally, note that since there are no let-bindings but only mutation of global variables, ty_env should not change throughout the executino of typecheck_expr or typecheck_cmd.
Grading Criteria
The correctness of your type checker for expressions and commands with respect to this well-formedness condition.