On this page:
4.1 Update LSL (IMPORTANT)
4.2 Problem 0
4.3 Problem 1
4.4 Problem 2
4.5 Problem 3
8.15

4 Homework 5🔗

This assignment is due Thursday, 2/6 by MIDNIGHT

What to Download:

hw5.rkt

Please start with this file, filling in where appropriate, and submit your homework to the HW5 assignment on Gradescope. This page has additional information, and allows us to format things nicely.

4.1 Update LSL (IMPORTANT)🔗

Note: in order to do Problem 2, you must update LSL. Go to File -> Install Package or Package Manager and click Update.

4.2 Problem 0🔗

The purpose of writing specifications is to ensure that the implementations that they are used to check work as expected. One of the best ways to figure that out is to try to "break" them – i.e., think of an "obviously wrong" implementation that still satisfies the specification.

In this problem, we’d like you to practice this explicitly. Specifically, we are giving you:

A correct implementation of a function left-pad:

 (: left-pad (-> Natural String String)) 
 ; adds n characters of whitespace to the left of the input string s 
 (define (left-pad n s) 
   (string-append (implode (build-list n (lambda (_) " "))) s))

The start of a buggy version, called buggy-left-pad (one of your tasks will be to implement it):

 (: buggy-left-pad (-> Natural String String)) 
 (define (buggy-left-pad n s) 
   ...)

A specification, which we claim is not strong enough, that describes buggy-left-pad:

 (: left-pad-spec (-> Natural String True)) 
 (define (left-pad-spec n s) 
   (= (string-length (buggy-left-pad n s)) 
      (+ n (string-length s))))

Your task is to first implement buggy-left-pad: this should still satisfy left-pad-spec, but it should not be correct.

In order to concretely express that it is not correct, we then want you to give a single set of example inputs to the function:

 (define BUGGY-INPUT-N ...) 
 (define BUGGY-INPUT-S ...)

Where (left-pad BUGGY-INPUT-N BUGGY-INPUT-S) does not produce the same output as (buggy-left-pad BUGGY-INPUT-N BUGGY-INPUT-S).

Since left-pad is our reference solution, the fact that the behavior of buggy-left-pad does not give the same result is confirmation that it is, indeed, buggy! And the fact that it still satisfies left-pad-spec means that the specification was not strong enough.

Note: you are welcome to think about how you could strengthen the specification, but you do not have to for this problem.

4.3 Problem 1🔗

Cryptography, or the science of secret communication, is about as old as math itself. Modern cryptography was born alongside with computers, as during WWII, information was sent over radio and secure communication was critical. Machines were used in what became an arms race, where more sophisticated codes were developed using complex machinery, and at the same time, some of the earliest "computers" were created in order to break them.

One of the first, and most important, cryptanalysis (code breaking) results was done by teams at Bletchley Park, a British estate that housed one of the first machines that could conceivably be called the computer. While many were involved in the effort, including a team of Polish mathematicians, a key figure in the effort was a British mathematician and one of the pioneers of the entire field of computer science named Alan Turing. For his contributions to this early work on computation, and many other things, the highest award in computer science (our "Nobel Prize") is named the Turing award.

Despite his critical role in the effort, and ground breaking work for the field as a whole, Turing was prosecuted for homosexuality after the war, and died by apparent suicide a few years later at the age of only 41. More than a half century later, the British Government apologized for his treatment and officially pardoned him.

Much of cryptanalysis relies upon detecting patterns, and indeed, the ability to do that at scale was exactly how the Bletchley Park team was able to use machines to break the German Enigma.

"Perfect encryption" via so-called one-time pads, or the other hand, has no patterns, and thus is not vulnerable to that type of analysis, even in theory. It is, in effect, unbreakable. It relies, on the other hand, on a secret key, shared beforehand, that is as long as the message itself, which aside from limited cases, is difficult to imagine (the "Red Telephone" in movies between the US and USSR is fiction, but keys to enable this form of encryption were exchanged via embassies). Most modern cryptography, on the other hand, relies upon being able to negotiate secure communication between parties that have never interacted before (you and an online shopping website, for example).

One-time pads, however, still do have value, in that they are provably impossible to break, and we will show why that is in this problem. Further, most realistic cryptosystems, while much more complex, share elements with this incredibly simple one.

First, we will define types for bits, messages (of fixed length of six bits), and keys (of fixed length of six bits). In a realistic setting, we could translate text into bits (each letter getting a unique assignment of bits), and split our message into chunks of length six. We define contracts for "Bit" and "Message". Note that we require Key and Message be lists of length exactly 6.

 (define-contract Bit (OneOf (Constant 0) (Constant 1))) 
  
 (define-contract Key (Tuple Bit Bit Bit Bit Bit Bit)) 
  
 (define-contract Message (Tuple Bit Bit Bit Bit Bit Bit))

The way one-time pads work is via exclusive-or (XOR). To both encrypt and decrypt, we XOR each bit in the message with a corresponding bit in the key. This should result in an encrypted message for which it is impossible to determine the original without knowing the key. We want to prove that, and we’ll do it in the following way. First, we define encryption and decryption, which use the key (which is the same length of the message), XOR’d bit by bit with the message:

 (: xor (-> Bit Bit Bit)) 
 (define (xor b1 b2) 
   (modulo (+ b1 b2) 2)) 
 (check-expect (xor 0 0) 0) 
 (check-expect (xor 0 1) 1) 
 (check-expect (xor 1 0) 1) 
 (check-expect (xor 1 1) 0) 
  
 (: xor-list (-> [List Bit] [List Bit] [List Bit])) 
 (define (xor-list l1 l2) 
   (map xor l1 l2)) 
 (check-expect (xor-list (list 1 0 0) (list 1 1 1)) (list 0 1 1)) 
 (check-expect (xor-list (list 0 0 0) (list 0 0 0)) (list 0 0 0)) 
  
 (: encrypt (-> Message Key Message)) 
 (define encrypt xor-list) 
  
 (: decrypt (-> Message Key Message)) 
 (define decrypt xor-list)

Your task is to define the following property, that captures the fact that encrypt/decrypt result in perfect secrecy. It should state that for a given encrypted message and any possible original message, there exists some key that would have produced it. That means that without knowing something about the key, there is no way of knowing which message was encrypted, since for any other message, some key could have produced the encrypted message. Note that we haven’t talked much about logical exists (\exists), but if you can compute such a key, that is essentially the same thing (to be more technical, it’s very close to an existential in constructive logic). To show that you have, indeed, computed the right key, your property should use the computed key to decrypt the message and ensure it is, indeed, equal to the given arbitrary message.

 (: xor-perfect-prop (-> Message Message True)) 
 (define (xor-perfect-prop encr-msg arbitrary-msg) ...) 
  
 (check-contract xor-perfect-prop)

4.4 Problem 2🔗

One key source of cybersecurity vulnerabilities are "timing attacks", which work by measuring how long a particular computation takes. If the computation takes different amount of time based on different secret values, the attacker may be able to learn information about those secrets.

In this problem, we’ll consider one of the classic examples of this: matching passwords.

In order to measure time, you’ll use a particular function provided by LSL: distinguishable?. This takes, as an argument, two zero-argument functions (called "thunks"), like, e.g.,

(lambda () (+ 1 2))

Or

(define (myfun)
  (+ 1 2))

distinguishable? will call the two thunks, and will record information about how long they take (specifically, information about string comparisons), and then return a boolean: whether or not the two are distinguishable by how many string comparisons they do.

You’ve been given a function check-password that takes as input a Password, which is list of "0" or "1" (which are strings). (For convenience, you can write constants as longer strings and use explode, but do not ever use implode). check-password checks in against the correct password (stored as a constant). It does it by using a password=? helper that calls string=? function on each letter.

 (define-contract SBit (OneOf (Constant "0") (Constant "1"))) 
 (define-contract BitString (List SBit)) 
  
 (define CORRECT-PASSWORD 
   (explode "00110011")) 
  
 (define-contract Password BitString) 
  
 (: password=? (-> Password Password Boolean)) 
 (define (password=? l1 l2) 
   (cond [(and (empty? l1) (empty? l2)) #t] 
         [(and (empty? l1) (cons? l2)) #f] 
         [(and (cons? l1) (empty? l2)) #f] 
         [(and (cons? l1) (cons? l2)) 
          (and (string=? (first l1) (first l2)) 
               (password=? (rest l1) (rest l2)))])) 
  
 (: check-password (-> Password Boolean)) 
 (define (check-password p) 
   (password=? CORRECT-PASSWORD p))

You have two tasks. First, you need to write a specification that ensures that check-password does not leak timing information. In order to do this, you should define a timing-spec that takes two different passwords, and ensures (using distinguishable?) that calling check-password on both results in the same amount of time (they are indistinguishable).

In order to test this, we have given you three unit tests you should use. You can write more, but these are good tests.

 (: timing-spec (-> Password Password True)) 
 (define (timing-spec p1 p2) 
   (... p1 ... p2 ...)) 
  
 (check-expect (timing-spec (explode "0010000") (explode "111111111")) #t) 
 (check-expect (timing-spec (explode "0011001") (explode "111111111")) #t) 
 (check-expect (timing-spec (explode "00110011") (explode "000")) #t)

Once you have that spec, you should find that the implementation of check-password is indeed leaking timing information: not all the tests will pass. An attacker could use that to guess the password, as the current implementation stops as soon as it finds a character that doesn’t match, so they could try passwords starting with alternating bits until they found one that took a little longer: this would mean they discovered the first bit. Then they would continue with the second, etc. You can try this out with distinguishable? (but don’t have to) by trying check-password with passwords that are longer and longer prefixes of the actual password.

Your next task is to go back and fix the implementation of check-password to not reveal that information.

One requirement: you may not do this by using equal? instead of string=?, and you may not use implode to combine the bits into a single string.

The reason for this is: without getting into more low level hardware details that we can in a class like this (or more complicated statistics), our notion of timing is simplified, and requires the above restrictions in order to work. Note that this doesn’t mean an implementation that uses equal? is secure! The same timing information present in comparisons for string=? is there, you would just have to do a lot more work to find it (but an attacker would be happy to do that).

Figure out how to rewrite password=? so that the implementation no longer reveals any timing information. Hint: you may need to do pointless work, which is indeed an important element of this type of security!

4.5 Problem 3🔗

Consider a binary tree, defined as follows:

 (define-struct leaf [value]) 
 (define-struct node [left right]) 
 (define-contract (Leaf T) (Struct leaf [T])) 
 (define-contract (Node X Y) (Struct node [X Y])) 
 (define-contract (Tree X) (OneOf (Leaf X) (Node (Tree X) (Tree X))))

Trees are often used for fast lookup of sorted data, as if it is sorted, at each node we can continue searching down only one half of the tree. But in order to be fast, they have to be balanced; a degenerate tree that only ever branches down one side will perform the same as a list.

A (height) balanced binary tree has a difference in height of the left and right subtrees of at most 1, where height is the longest path to a leaf. Importantly, this property (or invariant) must be mantained when inserting and removing elements from the tree. This may not be easy: if you have a tree:

 (define T1 (make-node (make-node (make-leaf 2) 
                                  (make-leaf 3)) 
                       (make-leaf 4)))

And you want to add 1, while mantaining the fact that the tree is sorted (or else, you won’t be able to use it to quickly look up data), you would want to insert it before 2 in the tree. The easiest thing would be to replace the leaf 2 with a node with 1 and 2:

 (define T2 (make-node (make-node (make-node (make-leaf 1) 
                                             (make-leaf 2)) 
                                  (make-leaf 3)) 
                       (make-leaf 4)))

This is still sorted, but it is no longer balanced, as now the left side of the top-level node has height 3, but the right side has height 1, which is a difference of 2. Instead, you might have to "rotate" the 3 to the other side to rebalance and get the following tree:

 (define T3 (make-node (make-node (make-leaf 1) 
                                  (make-leaf 2)) 
                       (make-node (make-leaf 3) 
                                  (make-leaf 4))))

There are many different ways of doing this, with different names (AVL trees, Red Black trees, etc), and our goal is not to implement insertion, rebalancing, deletion, etc. What we do want to do is capture the invariant of being balanced.

Your task: define balanced-prop as a predicate. You should test your property on several example trees using check-expect; invalid data should be tested with check-error.