8.3

#### Assignment 7: Mutable Worlds

Goals: Further practice with ArrayLists; mutable worlds.

##### Instructions

This assignment is long. Start early.

As always, be very careful with your naming conventions.

The submissions will be organized as follows:

• Homework 7 Problem 1: The Automata.java file.

• Homework 7 Problem 2: The Huffman.java file.

Due Date: Friday, March 11th at 9:00pm

### Problem 1:Cellular Automata and Mutable World Programs

A cellular automaton is a simplified model of a biological cell: it has a state, a short lifespan, and can produce a child cell in some state in response to its own state and to the state of its neighbors. When a collection of cellular automata are arranged in some fashion (in a line, or on a plane, or in a 3-d grid...), interesting collective behaviors can emerge. In this problem, we’re going to work with the simplest kind of cellular automaton, which has only two states (on or off), and we’re only going to work with a 1-dimensional collection of cells. As it turns out, even this simplified setting is enough for impressive patterns to emerge.

Before you begin the assignment below, add the following lines to the top of your Automata.java file:

 import tester.*; // The tester library import javalib.worldimages.*; // images, like RectangleImage or OverlayImages import javalib.impworld.*; // the abstract World class and the big-bang library for imperative worlds import java.awt.Color; // general colors (as triples of red,green,blue values) // and predefined colors (Red, Green, Yellow, Blue, Black, White)

Be sure to use impworld, not funworld.

##### 7.1Individual cells

An individual cell should implement the following interface:
 interface ICell { // gets the state of this ICell int getState(); // render this ICell as an image of a rectangle with this width and height WorldImage render(int width, int height); // produces the child cell of this ICell with the given left and right neighbors ICell childCell(ICell left, ICell right); }

Every cell has a state, but it might be computed in various ways, so ICells provide a getState method to compute it. We make the return type int in case you want to experiment with more sophisticated cell states than just on/off. For now, use 0 for an “off” cell, and 1 for an “on” cell.

Additionally, every cell needs to produce a picture of itself somehow. The render method should produce a picture that fits into a rectangle of the given width and height. You will use this method to render a collection of cells at various locations. You may draw cells however you like; in the pictures below, an “on” cell is drawn as a black square, while an “off” cell is white.

Finally, the childCell method lets a cell produce its child cell, in response to its own state and its neighbors’ states. We’ll describe this method in more detail, below.

##### 7.2.1Inert cells

Define a class InertCell that implements ICell, and is always off. Its state is always 0. When it produces a child, it ignores its neighbors entirely and produces another InertCell.

##### 7.2.2Rule-based cells

A rule-based cell takes the states of its left and right neighbors, along with its own state, into account when producing a child cell. Since each of these three cells can be in one of two states, there are eight possible situations. For each of these eight possible situations, the child cell could be either on or off. Accordingly, there are 28 = 256 possible “rules” for implementing a rule-based cell. We can conveniently name these rules by reading off the outputs as bits, in order from left to right, and treating them as an 8-bit binary number. For example, here is rule 60:

Read each of these images as saying (for example): “if the left cell is in state 1, this cell is in state 0, and the right cell is in state 1, then output a child cell in state 1.”

(We get the number 60 by reading the outputs as bits from left to right, so 0 * 27 + 0 * 26 + 1 * 25 + 1 * 24 + 1 * 23 + 1 * 22 + 0 * 21 + 0 * 20 = 60. We get the exponents by reading off the inputs as bits of a binary number, too.)

Design a class Rule60 that implements ICell, and implements the rule above.

Design a class Rule30 that implements ICell, and implements the rule below:

(Again, we get the number 30 by reading the outputs as bits from left to right, so 0 * 27 + 0 * 26 + 0 * 25 + 1 * 24 + 1 * 23 + 1 * 22 + 1 * 21 + 0 * 20 = 30.)

Abstract as much as you can. Implementing new classes, such as Rule182 or Rule54, should be much easier than copy-paste-modifying your existing code! Instead, look carefully for repeated code in these rule-based classes, and abstract it into an abstract base class (that still implements ICell). Then, simplify your concrete classes as much as possible. You may find this easier to do once you’ve finished the rest of the assignment and have it working, and can confirm (via tests!) that you won’t break anything.

##### 7.3Populations of cells

A population of cells will be a CellArray, which takes in just an ArrayList<ICell> in its constructor. Design a method CellArray nextGen() that produces a new generation of cells from the current population. Cells at the start and end of the list are treated as having InertCells as left and right neighbors, respectively.

It should also have a draw method, which takes in two numbers, representing the width and the height of an indiviual cell.

This code introduces two new keywords, static and final. For now, treat them as “magic words” and use these definitions as given. A partial explanation for them is that static lets you define things associated with a class rather than with an object, and final prohibits you from modifying a value once it’s been initialized. Taken together, and when used properly, they allow you to define constants like these, and the naming convention is to use ALL_CAPS_AND_UNDERSCORES.

Design a class CAWorld that extends World. Complete the class definition below:
 class CAWorld extends World { // constants static final int CELL_WIDTH = 10; static final int CELL_HEIGHT = 10; static final int INITIAL_OFF_CELLS = 20; static final int TOTAL_CELLS = INITIAL_OFF_CELLS * 2 + 1; static final int NUM_HISTORY = 41; static final int TOTAL_WIDTH = TOTAL_CELLS * CELL_WIDTH; static final int TOTAL_HEIGHT = NUM_HISTORY * CELL_HEIGHT; // the current generation of cells CellArray curGen; // the history of previous generations (earliest state at the start of the list) ArrayList history; // Constructs a CAWorld with INITIAL_OFF_CELLS of off cells on the left, // then one on cell, then INITIAL_OFF_CELLS of off cells on the right CAWorld(ICell off, ICell on) { //TODO: Fill in } // Modifies this CAWorld by adding the current generation to the history // and setting the current generation to the next one public void onTick() { //TODO: Fill in } // Draws the current world, scrolling up'' from the bottom of the image public WorldImage makeImage() { // make a light-gray background image big enough to hold 41 generations of 41 cells each WorldImage bg = new RectangleImage(TOTAL_WIDTH, TOTAL_HEIGHT, OutlineMode.SOLID, new Color(240, 240, 240)); // build up the image containing the past and current cells WorldImage cells = new EmptyImage(); for (CellArray array : this.history) { cells = new AboveImage(cells, array.draw(CELL_WIDTH, CELL_HEIGHT)); } cells = new AboveImage(cells, this.curGen.draw(CELL_WIDTH, CELL_HEIGHT)); // draw all the cells onto the background return new OverlayOffsetAlign(AlignModeX.CENTER, AlignModeY.BOTTOM, cells, 0, 0, bg); } public WorldScene makeScene() { WorldScene canvas = new WorldScene(TOTAL_WIDTH, TOTAL_HEIGHT); canvas.placeImageXY(this.makeImage(), TOTAL_WIDTH / 2, TOTAL_HEIGHT / 2); return canvas; } }

The 5th, 15th and 25th generations for Rule 60 should look like this (shrunk to fit side-by-side here):

Likewise, the 5th, 15th and 25th generations for Rule 30 should look like this:

##### 7.5Optional Enhancements

Try designing a three-valued cell (states are 0, 1, 2), or a four-valued cell. Make up coloring rules for them, and make up rules for how the childCell method should work.

Try designing differently-sized neighborhoods, so that cells can see their four nearest neighbors (two on each side).

### Problem 2:Huffman Coding

A character encoding is a way of representing an alphabet (set of characters) with some kind of code. In computer science, that code is often made of single bits, 0s and 1s, since that is how all data is ultimately stored.

Unicode’s industry-standard encoding is what lets us have emojis. :-)

For example, if we had the following encoding:

 Letter Code a 0 b 1,0 c 1,1

Then "abc" would be encoded as the five bits 01011, and 1110 would be decoded as "cb".

After all, there are far fewer short codes than long codes. Why?

For this assignment, you are going to implement Huffman coding, which is a strategy for encoding an alphabet when you know how relatively popular the letters are. For example, a, e, i, o, and u are heavily used in English, whereas j, q, x, and z are not. It makes sense to try to use shorter codes to represent the popular letters, and let the less-common letters be stuck with longer codes.

While there are various properties that make Huffman codings interesting, the one we are most interested in is that it is a prefix code. This means that no encoding of one character is a prefix of any other character’s encoding. The example given above is a prefix code. The following, however, is not:

 Letter Code e 0 f 1 g 01

What would 01 decode to, ef or g? Prefix codes help eliminate these types of ambiguities.

##### 7.1The algorithm

The best way to explain the Huffman coding algorithm is to illustrate it.

For example, say we begin with the following letters:

"a", "b", "c", "d", "e", "f"

Where each letter has the following relative frequencies:

12, 45, 5, 13, 9, 16

To begin, we create 6 leaves, where each leaf stores the letter and its relative frequency.
We then sort them in ascending order by frequency.
Then, we combine the two smallest leaves into a node, whose value is now the sum of its subtrees. The first leaf we place under the red branch, and the second leaf under the blue. We place this node inside of our forest (list of trees) such that it is still sorted.
and repeat...
and repeat...
and repeat...
...until there is just one tree.

This process is guaranteed to produce a prefix code. Why? Could we choose red as 1, and blue as 0? Could we choose differently at every level of the tree?

To encode a character, follow the path from the root to the leaf with that letter, where a red line represents a 0, and a blue a 1. So, encoding "eba" with the tree above would produce 11010100. Decoding a sequence of 0s and 1s is (essentially) the inverse operation, so decoding 101111 would yield "df".

Composing the encoding and decoding operations in either order should produce what is effectively the identity function (on either strings or bit-sequences), but they aren’t quite inverse functions. Why not?

##### 7.2Data design and methods
• Design the Huffman class, which should be constructed with two ArrayLists, one of strings and the other of integers (in that order), representing the letters of the alphabet in question and their relative frequencies. You may assume the integers are always positive and that the strings are always of size 1 (and never equal to "?"), but you should throw an IllegalArgumentException if the lists are of different lengths or if the length of the list of strings is less than 2 (after all, there wouldn’t be much point of encoding a 1-letter alphabet).

• Your forest should be represented by an ArrayList of some kind. You may need a helper class; you may need multiple ArrayLists; the design is up to you.

• In order for your encoded output to match ours, and therefore match the test cases, we need to specify how to sort the leaves and nodes. When sorting the initial leaves, you should use a stable sorting algorithm. This means that if two different elements have the same value for the purposes of sorting, their relative order in the original list should be preserved. (For instance, if two letters have the same frequency, then they have the same value for sorting even though they are not the same letter.)

• Similarly, when you insert a new node into the forest, you should ensure that it appears as deep in the list as possible, so that if it ties with something else already in the list, the existing value “wins” and should come first. You should be sure to carefully test these properties of your sort, and you should not have to attempt to re-sort the entire list when inserting a new node.

• Design the encode method, which takes a string and returns an ArrayList<Boolean>, where 0 is represented by false and 1 by true. If the string contains any character not in the alphabet, an IllegalArgumentException should be thrown with the exact error "Tried to encode <the character in question> but that is not part of the language.", where "<the character in question>" is replaced with the first character in the string that is not in the alphabet.

• Design the decode method, which takes an ArrayList<Boolean>, where 0 is represented by false and 1 by true, and returns the string represented by this encoding. If at the end of the list of booleans you run out of booleans before you can reach a terminal node in the tree, you should end the string with a single "?". For example, in the above tree, the list of false, true, false, would decode to "b?".

(Hint: for a change, we’re not treating Strings as primitive data: you’ll need to search around the methods on Strings to find one that helps get length-1 substrings out from the string. You should not need to work with Java chars for this assignment.)