Lecture 5: Runtime representations for multiple value types

7.9

Lecture 5: Runtime representations for multiple value types

We ought to be able to handle more kinds of values than just numbers. Let’s try to add at least one more type of value to our language. Doing so will trigger a large number of problems, which will all need to be resolved to make things work again.

1 Growing the language: adding boolean values

Reminder: Every time we enhance our source language, we need to consider several things:

Its impact on the concrete syntax of the language
Examples using the new enhancements, so we build intuition of them
Its impact on the abstract syntax and semantics of the language
Any new or changed transformations needed to process the new forms
Executable tests to confirm the enhancement works as intended

When we introduced conditionals and branching control-flow last time, we were a bit dissatisfied with the semantics of “branch if it’s nonzero or not” — that seemed like a hack justified only by the limitations of our language. Today, let’s consider adding the simplest additional type of data to our language: boolean values. It’s just two new values! What will we need to support them?

Do Now!
What new primitive operations should we add to operate with booleans? What should happen if we give numeric arguments to boolean operations? Or vice versa?

Exercise
We currently represent numbers as themselves. How can we represent booleans?

Exercise
How can we prevent things from going wrong?

The key challenge, from which all the other problems today will arise, is that our current encoding of numbers is too big. We’ve assigned a numeric meaning to every 64-bit pattern, and therefore used them all up: there are no bit patterns left to represent boolean values. Somehow, we need to carve out space for our booleans. As soon as we do that, though:

Not all operations should operate on all bit patterns, which means we have to ensure that operations can distinguish valid operands from invalid ones.
Not all patterns can be considered numbers, which means we have to ensure that our compilation of numbers makes sense in this new extended setup.
It becomes apparent that just supporting booleans is shortsighted: we’d like to be able to broaden these techniques to other datatypes later on, such as pointers, tuples, structures, closures, or others.

A lot of work, for just two values! But that work forms a blueprint to follow when growing the language further.

1.1 Wait, what about static typing?

Surely we can resolve some if not all of these problems by designing a type-checker for our language? After all, if we can determine statically that the program never miuses an operation with the wrong type of operands, then we don’t need to distinguish those types of values at runtime?

Sadly, not quite: while some operations will be simplified if we do this, there are still features we’d like to have in our compiler, such as garbage collection, that will inevitably need to treat some types of values differently than others. Without getting into the details now (we’ll save that for later), learning how to deal with distinguishing values now, while the language is still small, will be good practice when the language gets more complicated soon.

1.2 The new concrete syntax

Today’s language changes themselves are syntactically quite simple: only two new expression forms, and a bunch of new operators. Notice that we have operators that consume and return booleans, and also operators that consume numbers and return booleans – language features interact.

‹prim1›: ... | ! ‹prim2›: ... | && | || | < | <= | > | >= | == | != ‹expr›: ... | true | false

1.3 Examples

Exercise
Develop a suite of test examples for using booleans. Be thorough, and consider as many edge cases as possible.

1.4 Enhancing the abstract syntax

The abstract syntax changes are similarly straightforward to the concrete ones.

type prim1 = ... | Not
type prim2 = ... | And | Or | Less | Greater | LessEq | GreaterEq | Eq | Ne
type 'a expr = ... | EBool of bool * 'a

Like numbers, we will consider EBool _ values to be immediates for the purposes of ANF: no further computation on them is needed.

2 Representation choices

If all of our bit-patterns so far are used for numbers, then when we choose some patterns to reinterpret as booleans, we must have some way of distinguishing boolean (or other) patterns from numeric ones.

A note about notation: since we will be discussing actual bit-level representations today, it will be useful to use either binary or hexadecimal notations for numbers, rather than more common decimal notation. I.e.

0b00000000_0000....0000_00001010 = 0x000000000000000A = 10

For clarity, if needed, we’ll include underscores to separate the bytes in the binary representation, and elide the middle bits if they’re not interesting. Further, if there are “wildcard” bits that we don’t care about, we will mark them as w in the binary representation.

2.1 Attempt 1: Why limit ourselves to 64-bits?

If all the patterns in a 64-bit word are used up...let’s use more words. Suppose we encode all our values as two words, where the first word is a tag word describing what type of value comes next — e.g. 0 for number, 1 for boolean — and the subsequent word contains the actual value.

Do Now!
List at least three consequences of this approach.

On the one hand, this approach is certainly future-proof: we have room to distinguish 9 quintillion distinct types of data in our programs! But on the other, consider how we would have to implement any of our operations. Because our values are now larger than our word-size, they won’t fit into registers any more, which in turn means that our invariant of “the answer always goes in RAX” won’t work any longer. Instead, we will need to always put our answers on the stack, and then put the address of the first word of the answer into RAX: i.e., the actual answer can be found in [RAX] and [RAX - 8]. This in turn means that every single operation we want to perform will require at least two memory reads and writes — which is an exorbitant cost.

2.2 Attempt 2: We don’t need 64 bits for a tag

If having 64 tag bits is overkill, let’s try the minimum needed: could we get away with a single tag bit? We can pick any bit we like for this, but for reasons that will become apparent soon, let’s choose the least-significant bit (LSB) to denote numbers versus booleans:

LSB = 0 implies the value is a number
LSB = 1 implies the value is a boolean

This may still feel like overkill: we have 4 quintillion bit patterns reserved just for booleans, and we’ve given up half of our available space for numbers. In practice, we’ll accept the loss of range on numbers for now, and we’ll refine our notion of what it means to “tag a boolean” so that we can use the rest of that reserved space for other types of values, later.

2.2.1 Booleans

We now need to choose representations for true and false themselves. Since it’s very likely that tags are going to get more complex later, we probably want to reserve several low-order bits for future tag use — so using any of those bits to encode our values is probably not future-proof. Instead, let’s look to the other end of the value, at its most-significant bit (MSB), and declare that

MSB = 0 encodes false
MSB = 1 encodes true

In other words, our true and false values will therefore be 0b1wwwwwww_wwww.....wwww_wwwwwww1 and 0b0wwwwwww_wwww.....wwww_wwwwwww1, respectively. We can choose whatever bits we want in between: two obvious choices would be either all zeros or all ones. Arbitrarily, we’ll choose all-ones.

2.2.2 Numbers

How should we represent numbers, then? Since we’ve decided that an LSB of zero indicates a number, we effectively have all the bit-patterns-formerly-interpreted-as-even-numbers available to us as numbers. This lends itself to a convenient encoding: we’ll simply represent the number n as the bit-pattern that used to represent 2n:

Number		Representation
`1`		`0b00000000_0000.....0000_00000010`
`6`		`0b00000000_0000.....0000_00001100`
`42`		`0b00000000_0000.....0000_01010100`

This is clearly a one-to-one mapping, so it’s a faithful encoding; let’s see if it’s a useful one, too.

2.3 Compiling constants

Compiling our new constants is fairly straightforward. There is a twist: we need to ensure at compile time that any numeric constants we have in our source program aren’t inherently overflowing:

let min_cobra_int = Int64.div Int64.min_int 2L
let max_cobra_int = Int64.div Int64.max_int 2L
let compile_imm e env =
  match e with
  | ENumber(n, _) ->
     if n > max_cobra_int || n < min_cobra_int then
       failwith ("Integer overflow: " ^ (Int64.to_string n))
     else
       Const(n lsl 1)
  | EBool(true, _) -> const_true
  | EBool(false, _) -> const_false
  ...

Here, the lsl operator logically shifts left the bits of its first operand by some number of places given by the second operand. The newly vacated bits are filled in with zeros. As a consequence, (n lsl 1) == 2 * n...but since we’re contemplating these values as bit patterns, it’s a useful observation to notice that multiplying by two simply slides the bits around: perhaps we can rely on that later to unslide them as needed.

Do Now!
It’s worth pausing to confirm that lsl does have the desired semantics in all cases, especially when n is negative. Given that our numbers use two’s complement representation, what can we determine about the most significant bits of any negative number in our valid range of numbers? Given that, what we can we deduce about the behavior of lsl?

The const_true and const_false values are whatever bit patterns we chose above; for legibility, we ought to define them as named constants in our compiler, rather than scatter magic numbers throughought our code.

Exercise
Now that we have defined how to compile constants, what output will we get from executing the program 15? What will we get from running the program true?

2.4 Defining arithmetic over our representations

Well clearly, even before arithmetic, the first operation we should redefine is print – or else we cannot trust any output of our program!

Do Now!
How many cases of output should we check for, if we’re coding print defensively?

Since we now have at least two different types of values in our language, it doesn’t quite make sense to use the C type int64_t to describe them: they’re not meaningfully numbers in C, necessarily, but rather are just 64 bits of data. Accordingly, we’ll use typedef to create a type synonym, to remind ourselves of which values are meaningful values-in-our-program, and which values are meaningful-in-C. All bit patterns ending in zeros will be considered numbers. Only two patterns that end in a one have any meaning, as true and false; everything else is an error for now:

typedef uint64_t SNAKEVAL;
const uint64_t BOOL_TAG   = 0x0000000000000001;
const SNAKEVAL BOOL_TRUE  = ...; // These must be the same values
const SNAKEVAL BOOL_FALSE = ...; // as chosen in compile.ml
SNAKEVAL print(SNAKEVAL val) {
  if ((val & BOOL_TAG) == 0) { // val is even ==> number
    printf("%ld", ((int64_t)(val)) / 2); // shift bits right to remove tag
  } else if (val == BOOL_TRUE) {
    printf("true");
  } else if (val == BOOL_FALSE) {
    printf("false");
  } else {
    printf("Unknown value: %#018x", val); // print unknown val in hex
  }
  return val;
}

Now let’s consider arithmetic operations.

Do Now!
What currently is output if we execute add1(0)? Redefine add1 and sub1 to fix this.

Zero is represented as 0x00000000, so add1(0) currently produces 0x00000001, which is a representation of a boolean value! Our implementations of add1, and more generally addition and subtraction, need to take that representation into account.

Operation	Representation	Result	Interpretation
`3 + 4`	`0x0000000000000006 + 0x0000000000000008`	`0x000000000000000D`	`7`
`8 + 9`	`0x0000000000000010 + 0x0000000000000012`	`0x0000000000000022`	`17`
n1 + n2	2n1 + 2n2	2(n1 + n2)	n1 + n2

Interestingly, our notion of addition and subtraction just works with our encoding, because multiplication distributes over addition. This is very convenient: no changes are needed to how we compile addition or subtraction.1Not quite! What important cases are we missing? There’s nothing we can do about them yet, but we shoudl be aware that there are lingering problems. And it informs us how we should revise add1 and sub1: they should add or subtract not 1, but rather the representation of 1, i.e. 2.

Do Now!
What currently is output if we execute 3 * 2? Redefine multiplication to fix this.

Following the same logic as above, 3 * 2 is encoded as 6 * 4, which produces 24 as a result, which is interpreted as 12, which is twice as big as the desired result. When we multiply, we wind up with an extra factor of two since both operands have individually been multiplied by two. Our implementation of multiplication must shift the result right by one bit, to compensate. The shr val, bits instruction shifts the given val to the right by the given number of bits.

Do Now!
Ok, so multiplication seems fixed. What happens if we execute 2 * (-1)? Why does that result appear?

Our program prints 4611686018427387902, which is represented by 9223372036854775804, or 0x7FFFFFFFFFFFFFFFC. The trouble is that negative numbers are represented by having the MSB set to 1 (the “sign bit”), and shifting right does not preserve that sign bit: our output has an MSB of zero, even though it should be representing a negative number. Instead, we need arithmetic shift right, instead of logical shift right: a dedicated instruction sar that behaves like shr, except it keeps the sign bit unchanged.

2.5 Defining logic operations over our representations

Logical and and or are very easy to define. Because we’ve confined the differences between the representations of true and false to a single bit (the MSB), and in all other bits they are the same, logical and and or are equivalent to bitwise and and or, for which we have dedicated assembly instructions. Specifically:

val1	val2	val1-rep	val2-rep	val1-rep bitwise-AND val2-rep	val1 and val2
`true`	`true`	`0b1ww...ww1`	`0b1ww...ww1`	`0b1ww...ww1`	`true`
`true`	`false`	`0b1ww...ww1`	`0b0ww...ww1`	`0b0ww...ww1`	`false`
`false`	`true`	`0b0ww...ww1`	`0b1ww...ww1`	`0b0ww...ww1`	`false`
`false`	`false`	`0b0ww...ww1`	`0b0ww...ww1`	`0b0ww...ww1`	`false`

val1	val2	val1-rep	val2-rep	val1-rep bitwise-OR val2-rep	val1 or val2
`true`	`true`	`0b1ww...ww1`	`0b1ww...ww1`	`0b1ww...ww1`	`true`
`true`	`false`	`0b1ww...ww1`	`0b0ww...ww1`	`0b1ww...ww1`	`true`
`false`	`true`	`0b0ww...ww1`	`0b1ww...ww1`	`0b1ww...ww1`	`true`
`false`	`false`	`0b0ww...ww1`	`0b0ww...ww1`	`0b0ww...ww1`	`false`

Logical not is slightly trickier. We could use not to invert the MSB, but that would invert all the bits of the value — including the tag bit, which would convert the result into a representation of a number. Really, we want to invert only the MSB, and leave all the other bits alone. In other words, we want the following truth table:

Input bit	Flip?	Output
1	1	0
0	1	1
1	0	1
0	0	0

This is the truth table that defines the exclusive-or operation, and the xor instruction applies this operation bitwise. Suppose we designed a mask BOOL_MASK equal to 0x8000000000000000, whose only active bit is the MSB. Then xor RAX, BOOL_MASK will negate only the MSB, and leave the rest unchanged, as desired:

val	val-rep	BOOL_MASK	val-rep bitwise-XOR BOOL_MASK	not(val)
`true`	`0b1ww...ww1`	`0x8000000000000000`	`0b0ww...ww1`	`false`
`false`	`0b0ww...ww1`	`0x8000000000000000`	`0b1ww...ww1`	`true`

Note that unfortunately, we cannot write xor RAX, 0x8000000000000000. Due to a limitation of x64 syntax, we cannot specify a 64-bit literal argument to xor (or to many other operators, either). Instead, we would have to mov that value into a temporary register or a stack slot first, and then we could xor RAX with that temporary value. This infelicity of x64 assembly syntax will come back to bother us later, so it’s useful to point out now.

2.6 Defining comparisons over our representations

Comparisons are slightly finicky to define well.

Do Now!
What are the kinds of numbers most likely to cause problems? Work through the bitwise representation of those numbers, and try to construct some assembly sequence that would take two (representations of) numbers and produce a (representation of a) boolean that represents their comparison.

Here are two potential strategies, focusing just on the less-than operator:

2.6.1 Using `cmp`

Most simply, we can just use the cmp operation, which compares its two operands arithmetically.2In fact, the cmp operator simply subtracts one argument from the other, just like sub does, and sets the same overflow, zero, and signedness flags. In contrast to sub, cmp discards the result of the subtraction, and doesn’t overwrite its first argument. This link provides a nice summary of the various conditional jumps available in assembly, as well as the meaning of the flags that are set by comparisons or subtractions. We then use one of the conditional jump instructions much like compiling an if-expression, and move the appropriate boolean constant into RAX as a result. Assuming we have values in stack slots 1 and 2, we might start with the following assembly:

  ;;; NOTE: cmp does not allow both arguments to be memory accesses
  mov RAX, [RSP - 8*1] ;; Load first value
  cmp RAX, [RSP - 8*2] ;; Compare with second value
  jl less
greater:
  ....
  j done
less:
  ....
done:
  ...

and we can refine it to produce the needed boolean. This scaffold looks pretty similar to the structure for compiling an if expression, but we don’t need the full generality of that structure. We can assume the result will be true, perform the comparison and jump, and if the comparison fails we’ll update the result to be false:

  mov RAX, [RSP - 8*1]
  cmp RAX, [RSP - 8*2]
  ;; The comparison has now set the necessary flags,
  ;; so we can safely clobber RAX:
  mov RAX, 0x8000000000000001 ;; Assume the result is true
  jl less                     ;; If this jump falls through,
  mov RAX, 0x0000000000000001 ;; then the comparison failed
less:
  ...

We could also have flipped the assumption, started from false, and jumped if the comparison failed:

  mov RAX, [RSP - 8*1]
  cmp RAX, [RSP - 8*2]
  ;; The comparison has now set the necessary flags,
  ;; so we can safely clobber RAX:
  mov RAX, 0x0000000000000001 ;; Assume the result is false
  jge less                    ;; If this (inverted) jump falls through,
  mov RAX, 0x8000000000000001 ;; then the comparison succeeded
greater_eq:
  ...

This is straightforward, but potentially expensive, since conditional jump instructions can stall the processor if the processor “guesses wrong” about whether the branch will be taken or not. We can try to build an optimizer to choose which of these two compilations to pick, based on whether we think the condition is more often true or false, and that might be worthwhile for critical inner loops...

Can we do “better” and avoid needing any conditional jumps?

2.6.2 Being “clever”?

There are arithmetic and bitwise tricks we can play to get the answer correct in nearly every case. For example, to determine if 6 < 15, we can simply compute their difference, or -9. Because this is a negative number, its MSB is 1, which is exactly the defining bit of true. We merely need to set all the remaining bits to BOOL_TRUE’s bits. If we chose a representation with all zeros, then and’ing our answer with BOOL_MASK will force all the remaining bits to zeros. If we chose a representation with all ones, then or’ing our answer with the complement of BOOL_MASK will force all the remaining bits to ones. A similar argument holds for checking if 15 < 6, because that subtraction yields a positive number, whose MSB is zero, which is the defining bit for false. So for example, we might try the following for 6 < 15:

mov RAX, 12 ;; Representation of 6
sub RAX, 30 ;; Representation of 15
and RAX, 0x8000000000000000 ;; Mask off all the irrelevant bits,
or  RAX, 0x0000000000000001 ;; and make sure to tag it back as a boolean!

Unfortunately, this approach fails for large numbers. Checking if 4611686018427387903 < -4611686018427387904 fails with our representations, because we represent the former by 0x7ffffffffffffffe, and the latter by 0x8000000000000000. Subtracting the latter from the former yields 0xfffffffffffffffe, or -2 (which in turn encodes -1 in our representation), whose MSB is 1, leading to a conclusion that 4611686018427387903 < -4611686018427387904 is true! The problem is that our subtraction overflowed 63 bits on these large numbers, and no longer makes sense under our representation. The solution, then, is to use sar to shift both operands right by one bit, before doing the subtraction, so that overflow can’t spill into the sign bit. Using these two extra shifts affects our compilation: even though we’ve compiled both arguments to the comparison into immediates, we actually need to preprocess both of them, which means we need a place to stash them both. Practically speaking, we can use RAX for one, and R11 for the other, so long as nothing else in our compilation is using R11 (which at the moment, indeed, nothing is3R11 is the conventional choice to use as a temporary register. The more “obvious” choices, like RBX or RCX, typically have other meanings.). This would look like:4If we wanted to be a bit more careful with producing exactly our two canonical booleans, all of whose bits are 1, then in the final step we should or with 0x7fffffffffffffff.

mov RAX, 0x7ffffffffffffffe ;; Representation of 2^62 - 1
sar RAX, 1                  ;; Shift (arithmetic) right by 1 bit
mov RBX, 0x8000000000000000 ;; Representation of -2^62
sar RBX, 1                  ;; Shift (arithmetic) right by 1 bit
sub RAX, RBX                ;; Produce the difference
and RAX, 0x8000000000000000 ;; Mask off the irrelevant bits
or  RAX, 0x0000000000000001 ;; and tag it again as a boolean

(Actually, technically, this approach also fails, but for a far less interesting reason: again, due to the limitation on 64-bit literal constants in most assembly instructions. If we were to try to implement the code above properly, we’d actually have to move the argument to and into a temporary register first, and then and RAX with that temporary register. The following command for or “works”, only because truncating that constant to its lower 32 bits yields the same value of 1 as the entire 64-bit constant does.)

2.6.3 So why does `cmp` work?

Internally, cmp really does perform a subtraction on its two arguments, but as we just showed above, that doesn’t quite work as smoothly as we’d like! Looking carefully at the failure above, we see that when an overflow occurs, our sign bit is exactly wrong. Reading the documentation for cmp and sub, it turns out they set a few flag bits indicating whether an operation has overflowed, is negative, is zero, etc. The meaning of jl, subtly, is to check whether either the operation overflowed and is negative, or the operation didn’t overflow and is positive. In other words, it checks whether the overflow flag is not equal to the signedness flag. Since we don’t have a trivial way of implementing that check via and, or or xor, our simple code above failed.

There exist techniques to copy the flags register into an actual register (or onto the stack) so that we can manipulate it further, but a detailed analysis of the cycle counts for conditional jumps versus pushing and popping the flags register shows that the conditional jump is both simpler and faster, uses fewer registers, and works in all cases.

Do Now!
Complete this analysis for other comparison operations, being careful about overflow in all cases.

3 Overflow and Bogus Arguments

Unfortunately, our code currently does not prohibit bogus programs, such as 5 + true. Indeed, we don’t even have a mechanism yet for enforcing such things. Additionally, while our compilation of constants prohibits compile-time constants that overflow 63 bits, nothing prohibits us from adding numbers together until they overlow. Checking for such broken behaviors is the focus of next lecture.

4 Testing

As we’ve now added a new type of value to our language, it’s crucial to test thoroughly that we properly process and render each type of value consistently: that is, any operations that produce booleans should always produce valid boolean representations, and similarly for numeric operations. Moreover, operations should process their operands correctly. (We don’t yet have to worry about operations handling operands of the wrong types, as noted above.) Catching inconsistencies at this stage is substantially easier than it will be later, since we haven’t added too many new expression forms to the language.

1Not quite! What important cases are we missing? There’s nothing we can do about them yet, but we shoudl be aware that there are lingering problems.

2In fact, the cmp operator simply subtracts one argument from the other, just like sub does, and sets the same overflow, zero, and signedness flags. In contrast to sub, cmp discards the result of the subtraction, and doesn’t overwrite its first argument. This link provides a nice summary of the various conditional jumps available in assembly, as well as the meaning of the flags that are set by comparisons or subtractions.

3R11 is the conventional choice to use as a temporary register. The more “obvious” choices, like RBX or RCX, typically have other meanings.

4If we wanted to be a bit more careful with producing exactly our two canonical booleans, all of whose bits are 1, then in the final step we should or with 0x7fffffffffffffff.

1	Growing the language: adding boolean values
2	Representation choices
3	Overflow and Bogus Arguments
4	Testing