Assignment 5: Diamondback:   Defining functions
1 The Diamondback Language
1.1 Concrete Syntax
1.2 Abstract Syntax
1.3 Semantics
1.4 Implementation
1.5 Tail calls
1.6 Extra credit:   Printing stack traces
1.7 Testing
2 Running main
3 List of Deliverables
4 Grading Standards
5 Submission
7.2

Assignment 5: Diamondback: Defining functions

Due: Thu 02/14 at 8:59pm

git clone

In this assignment, you’ll implement a compiler for a small language with functions declarations and function calls, conforming to the C stack layout. You’ll also add some static well-formedness checks to the compiler.

As for the project acronym, so far our greatest minds have come up with

1 The Diamondback Language

As usual, we have concrete and abstract syntaxes, along with a specification of semantics.

1.1 Concrete Syntax

The major addition to Diamondback are function declarations. Our programs are now a sequence of zero or more function declarations, followed by a single main expression.

‹program›: | ‹decls› ‹expr› | ‹expr› ‹binop-expr›: | IDENTIFIER | NUMBER | true | false | ‹prim1› ( ‹expr› ) | ‹expr› ‹prim2› ‹expr› | IDENTIFIER ( ‹exprs› ) | IDENTIFIER ( ) | true | false | ( ‹expr› ) ‹prim1›: | add1 | sub1 | ! | print | printStack | isbool | isnum ‹prim2›: | + | - | * | < | > | <= | >= | == | && | || ‹decls›: | ‹decl› | ‹decl› ‹decls› ‹decl›: | def IDENTIFIER ( ‹ids› ) : ‹expr› | def IDENTIFIER ( ) : ‹expr› ‹ids›: | IDENTIFIER | IDENTIFIER , ‹ids› ‹expr›: | let ‹bindings› in ‹expr› | if ‹expr› : ‹expr› else: ‹expr› | ‹binop-expr› ‹exprs›: | ‹expr› | ‹expr› , ‹exprs› ‹bindings›: | IDENTIFIER = ‹expr› | IDENTIFIER = ‹expr› , ‹bindings›

Diamondback also supports line comments, beginning with a # and continuing to the end of the line.

The other addition is function applications, which are written IDENTIFIER(‹exprs›), for example f(1, 2, 3). This is the syntax for a call to a function. Note that primops and function calls are whitespace-sensitive: there can be no space before the left parenthesis. (This is a workaround for a grammatical ambiguity in the language. You can see a remnant of that ambiguity by the "1 shift/reduce conflict" warning produced by ocamlyacc in the provided code.)

1.2 Abstract Syntax

In this assignment, we distinguish between the surface syntax that the parser produces (expr) from the ANF expression forms (aexpr). All these types are given in exprs.ml; below we show a simplified, undecorated form:

type prim1 =
  | Add1
  | Sub1
  | Not
  | Print
  | IsNum
  | IsBool

type prim2 =
  | Plus
  | Minus
  | Times
  | And
  | Or
  | Greater
  | GreaterEq
  | Less
  | LessEq
  | Eq

(* Surface expressions *)
type bind = (string * expr)

type expr =
  | ELet of bind list * expr
  | EPrim1 of prim1 * expr
  | EPrim2 of prim2 * expr * expr
  | EApp of string * expr list
  | EIf of expr * expr * expr
  | ENumber of int
  | EBool of bool
  | EId of string

type decl =
  | DFun of string * string list * expr

type program =
  | Program of decl list * expr


(* Internal ANF expressions *)
type immexpr =
  | ImmNumber of int
  | ImmBool of bool
  | ImmId of string

and cexpr =
  | CPrim1 of prim1 * immexpr
  | CPrim2 of prim2 * immexpr * immexpr
  | CApp of string * immexpr list
  | CIf of immexpr * aexpr * aexpr
  | CImmExpr of immexpr

and aexpr =
  | ALet of string * cexpr * aexpr
  | ACExpr of cexpr

and adecl =
  | ADFun of string * string list * aexpr

and aprogram =
  | AProgram of adecl list * aexpr

The EApp and CApp constructors correspond to function applications, and the decl and adecl types respresent function declarations. A program is a list of decls (surface) or adecls (ANF), with a main expression.

1.3 Semantics

There are several distinguishing features of Diamondback. The first is function applications. A function application should give the answer we’d get if we followed the rules for substituting argument values for parameter names. So, for example:

def f(x, y):
  x + y

f(1, 2)

should produce 3.

Your compiler should use the rules for C stacks discussed in class and at this assembly guide to implement this behavior.

There are a number of new errors that can occur now that we have function declarations and calls. Your implementation should catch all of these cases statically; that is, before the program runs:

Again, these errors should stop the program from compiling, not happen at runtime. Moreover, you should report all the errors in this program, not just the first one encountered. See the notes on is_well_formed below for implementation details.

These errors, in addition to ParseErrors, are all defined in errors.ml. Additionally, there are two more errors given to you: NotYetImplemented errors, which you should obviously remove and implement; and InternalCompilerErrors, which indicate intended-to-be-impossible scenarios in your compiler. Ideally, your final compiler should not need these (either by coming up with useful semantics in all scenarios, or by using the ML type system to your advantage to avoid any such unplanned scenarios, but in practice you might still need them.

1.4 Implementation

The file assembly.ml gives you all the assembly instructions you should need, though you’re free to add your own new instructions if you want.

There are a few new pieces to the implementation:

1.5 Tail calls

Refine your compilation of function applications to support tail-calls. There are three levels of complexity here:

To support the last one, you’ll need to be creative with the calling convention. The only truly essential feature of a stack frame is knowing where the saved ESP, EBP and return addresses must be, and all other variables’ locations can be calculated from knowing those. In particular, you’ll need to change the ordering of values in your stack frame, where EBP points between parameters and locals. Instead, both parameters and locals will need to be above EBP: change your compilation of function calls — and anything else that is affected! — to support this new convention. This change will prevent you from using the call instruction (why?), so you’ll need to come up with something clever to save the desired return address. Come see me if you try this, for a hint.

To detect that you’ve implemented tail calls correctly, you may want to attempt the following extra credit.

1.6 Extra credit: Printing stack traces

Implement a new prim1 operator, printStack, that takes a single argument and returns it, just like print does. What makes printStack special is its output: it will print out a stack trace of all the call frames from wherever it’s called, down to wherever our_code_starts_here began. For example, given the following program:

def fact(acc, n):
  if n < 1: printStack(acc)
  else: fact(acc * n, n - 1) + 0
  # NOTE: This + 0 is to prevent tail-recursion in this example

fact(1, 3)
Here is a potential stack trace:

ESP: 0xffa4ee40	==>  -2984148
EBP: 0xffa4ee58	==>  -2984132
(difference: -6)
Requested return val: 0xc ==> 6
Num args: 2
    0xffa4ee40: 0xffa4ee58	==>  old ebp
    0xffa4ee44: 0x00000002	==>  1
    0xffa4ee48: 0000000000	==>  0
    0xffa4ee4c: 0000000000	==>  0
    0xffa4ee50: 0000000000	==>  0
    0xffa4ee54: 0xffffffff	==>  true
EBP 0xffa4ee58: 0xffa4ee78	==>  old ebp
    0xffa4ee5c: 0x08048c44	==>  saved ret
    0xffa4ee60: 0x0000000c	==>  6
    0xffa4ee64: 0000000000	==>  0
    0xffa4ee68: 0000000000	==>  0
    0xffa4ee6c: 0000000000	==>  0
    0xffa4ee70: 0x0000000c	==>  6
    0xffa4ee74: 0x7fffffff	==>  false
EBP 0xffa4ee78: 0xffa4ee98	==>  old ebp
    0xffa4ee7c: 0x08048c44	==>  saved ret
    0xffa4ee80: 0x0000000c	==>  6
    0xffa4ee84: 0x00000002	==>  1
    0xffa4ee88: 0000000000	==>  0
    0xffa4ee8c: 0x00000002	==>  1
    0xffa4ee90: 0x0000000c	==>  6
    0xffa4ee94: 0x7fffffff	==>  false
EBP 0xffa4ee98: 0xffa4eeb8	==>  old ebp
    0xffa4ee9c: 0x08048c44	==>  saved ret
    0xffa4eea0: 0x00000006	==>  3
    0xffa4eea4: 0x00000004	==>  2
    0xffa4eea8: 0000000000	==>  0
    0xffa4eeac: 0x00000004	==>  2
    0xffa4eeb0: 0x00000006	==>  3
    0xffa4eeb4: 0x7fffffff	==>  false
EBP 0xffa4eeb8: 0xffa4eec8	==>  old ebp
    0xffa4eebc: 0x08048c7b	==>  saved ret
    0xffa4eec0: 0x00000002	==>  1
    0xffa4eec4: 0x00000006	==>  3
BOT 0xffa4eec8: 0xffa4eef8	==>  old ebp
    0xffa4eecc: 0x08048b74	==>  saved ret
    0xffa4eed0: 0x1	==>  heap
6

The output shows:

Hint: To identify where to stop printing the stack (after all, main isn’t the very last thing in the stack; there are operating-system frames as well...), I used a trick: declare a global, non-const variable in C, int* STACK_BOTTOM, and mark it as extern. At the beginning of our_code_starts_here, initialize that value with the current value of EBP (you’ll need a new assembly arg for this), and then you’re guaranteed to know where your stack ends, dynamically.

Note: if you’ve properly implemented tail calls, then eliminating the + 0 from the program above, and rerunning it, should produce a much shorter stack trace...

Note: If you implement this extra credit, provide a brief, written explanation (in a comment in main.c, near whatever new functions you define) of why this must be implemented as a EPrim1 and not as a EApp expression.

1.7 Testing

There is one new testing function provided, tvg. This works just like t, except it runs valgrind on the executable it creates, and checks whether or not it reports errors. If valgrind does report memory errors, the test fails and the errors are reported. The test succeeds if there are no memory errors and the answer is correct.

You can test the well-formedness errors using the error tester as usual.

2 Running main

Running your own programs is the same as with Cobra, except you’ll give them the .diamond file extension.

3 List of Deliverables

Again, please ensure the makefile builds your code properly. The black-box tests will give you an automatic 0 if they cannot compile your code!

DO NOT SUBMIT YOUR .git DIRECTORY! For that matter, don’t submit your output or _build directories.

4 Grading Standards

For this assignment, you will be graded on

5 Submission

Wait! Please read the assignment again and verify that you have not forgotten anything!

Please submit your homework to https://handins.ccs.neu.edu/ by the above deadline.