Lecture 1: Data Definitions in Java
Design of simple classes and classes that contain objects of another class.
Overview
Data vs. Information, and the need for structured data
Saying what you mean: the benefits of static types
Simple classes: Book, Author
Containments: Book and Author
Examples of data
Do Now!
A Do Now! exercise is strongly recommended for you to do now, without reading ahead in the lecture notes. You should think through the question being asked, and then see if your understanding matches the concepts the lecture notes suggest.
Exercise
An Exercise is strongly recommended for you to do, now or when finished reading these notes. These exercises may suggest further applications of the concepts in the lecture notes.
1.1 Representing information by structured data
Information is everywhere: “it is 75 degrees Fahrenheit outside”, “Providence, RI is 65 miles away from Boston, MA”, “the book How to Design Programs was written by Matthias Felleisen”, etc. As humans, we can read those sentences and make sense of them: we understand the information they are trying to convey, and can use that information in other settings. But how can we cause a computer to do the same?
Computers do not work with information: they work with data, which is a representation of the information relevant for the computation. Data is different from information in much the same way that the four characters “2510” are not the same thing as the number two-thousand-five-hundred-ten. The first step in designing a program to compute anything is to figure out an appropriate data representation of the relevant information.
1.2 Types of data
Data in DrRacket
Strings
Numbers
Booleans
Symbols
Images
Structures containing other pieces of data
Unions of other forms of data
;; A number representing the temperature in degrees Fahrenheit
(define outside-temp 75)

;; Some examples...
(define hot-day (+ outside-temp 50))
(define bad-day (+ "75 degrees" 50))
Do Now!
If we type that whole program into DrRacket, what will happen?
Nothing! But if we run the program, the definition of bad-day will complain that
+: expects a number as 1st argument, given "75 degrees".
Do Now!
Why couldn’t DrRacket warn us about our mistake as soon as we wrote it?
Unfortunately, DrRacket cannot use the purpose statement for the + operator to enforce that it only accepts numbers, because the purpose statement for + (or for anything, for that matter) is just a comment. Instead, we must wait until runtime to detect the error.
Data in Java
Look carefully at the purpose statement for outside-temp above: it says that outside-temp is a Number that represents a temperature. The first part of that statement describes the type of the data, while the second part describes what that number represents. In Java, when we define data, we will always include the type of the data in the definition, rather than just in comments. In this way, we can inform Java what type of data we mean, and Java can then check for these type errors even before we run our program.
Let’s see how to define some simple forms of data, corresponding to the kinds of data we saw in Fundies I.
Booleans
boolean isDaytime = true;
boolean inNewYork = false;
Strings
String courseName = "Fundies II";
Numbers
int ten = 10;
double pointSix = 0.6;
We’ll have a lot more to say about the difference between int and double later, especially as it regards writing test cases.
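To tie this back to the bad-day example from DrRacket, here is a minimal sketch of how the same mistake looks in Java. The class name TemperatureExamples and its field names are hypothetical, chosen only for this illustration. The erroneous definition is left as a comment, because Java rejects it before the program can run at all.

```java
// A sketch only: TemperatureExamples and its field names are hypothetical.
class TemperatureExamples {
  // a number representing the temperature in degrees Fahrenheit
  int outsideTemp = 75;

  // fine: both operands are numbers, so the result is an int
  int hotDay = this.outsideTemp + 50;

  // Uncommenting the next line makes the whole program fail to compile:
  // in Java, "75 degrees" + 50 produces a String, and a String cannot be
  // stored in a field that was declared to be an int.
  // int badDay = "75 degrees" + 50;

  public static void main(String[] args) {
    TemperatureExamples examples = new TemperatureExamples();
    System.out.println(examples.hotDay); // prints 125
  }
}
```

The important difference from DrRacket is when the mistake is reported: there, only when the definition of bad-day is evaluated; here, as soon as the program is compiled.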
Symbols and Images
Java does not have DrRacket’s notion of symbols. And in Java, images are not a built-in form of data; we will come back to images next week.
So much for simple forms of data. Our goal now is to learn how to represent more complex information in Java.
Classes of Data
The Design Recipe for Data Definitions says that if the information consists of several components, it should be represented by a structure. For example, here is the data definition for a book in a bookstore in DrRacket:
;; to represent a book in a bookstore
;; A Book is (make-book String Author Number)
(define-struct book (title author price))

;; to represent an author of a book (with the year of birth)
;; An Author is (make-author String Number)
(define-struct author (name yob))
The Design Recipe for Data Definitions tells us to follow up with concrete examples of data:
;; Examples:
(define pat (make-author "Pat Conroy" 1948))
(define beaches (make-book "Beaches" pat 20))
In Java, we don’t define “struct”s to describe compound data; we define classes. (Throughout this course, we’ll see the ways classes differ from structs in how they let us organize our code, and the abstraction mechanisms they provide.) A more visual way to describe these structure definitions is with a diagram:
+---------------+
| Book          |
+---------------+
| String title  |
| Author author |--+
| Number price  |  |
+---------------+  |
                   v
            +-------------+
            | Author      |
            +-------------+
            | String name |
            | Number yob  |
            +-------------+
We will often draw these class diagrams instead of writing out the corresponding Java code. Here, the Java definitions of these classes are:
// to represent a book in a bookstore
class Book {
  String title;
  Author author;
  int price;

  // the constructor
  Book(String title, Author author, int price) {
    this.title = title;
    this.author = author;
    this.price = price;
  }
}

// to represent an author of a book in a bookstore
class Author {
  String name;
  int yob;

  // the constructor
  Author(String name, int yob) {
    this.name = name;
    this.yob = yob;
  }
}
Notice the differences from the DrRacket code. Instead of specifying the signatures of the constructors make-book and make-author in comments, we declare the fields of our classes with their types. And while the define-struct form in DrRacket defined a constructor function for each struct for us, in Java we define the constructors for our classes explicitly. For the time being, they will all look the same: the name of the class, followed by a parenthesized list of argument types and names that (for now) exactly mimic the field definitions, then a block (enclosed in curly braces) of initialization statements of the form this.fieldname = argumentname; that initialize the fields of our object to the provided values. For clarity and convenience, we give the arguments the same names as the fields of the object, so it is obvious which argument corresponds to which field. The this keyword used in these initialization statements indicates which object is being initialized: this one, and not some other one.
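A minimal sketch of why the this. prefix matters when the arguments share their names with the fields. The class ForgetfulAuthor is hypothetical and deliberately wrong: inside its constructor, the plain names name and yob refer to the arguments, not to the fields, so the fields are never initialized. The ExamplesThis class is likewise only for this illustration.

```java
// Correct: the constructor copies each argument into the field of the same name.
class Author {
  String name;
  int yob;

  Author(String name, int yob) {
    this.name = name; // field of this object = argument
    this.yob = yob;
  }
}

// Hypothetical and deliberately wrong: without "this.", both sides of each
// assignment refer to the argument, so the fields keep their default values
// (null for a String, 0 for an int).
class ForgetfulAuthor {
  String name;
  int yob;

  ForgetfulAuthor(String name, int yob) {
    name = name; // assigns the argument to itself
    yob = yob;   // likewise
  }
}

class ExamplesThis {
  public static void main(String[] args) {
    Author pat = new Author("Pat Conroy", 1948);
    ForgetfulAuthor oops = new ForgetfulAuthor("Pat Conroy", 1948);
    System.out.println(pat.name);  // prints Pat Conroy
    System.out.println(oops.name); // prints null
  }
}
```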
Do Now!
Why do you think String is capitalized?
1.3 Examples of Data
While we have now defined the classes Book and Author, we haven’t yet defined any actual books or authors. As a matter of terminology, we say we create instances of these classes, and these instances are known as objects.
Do Now!
Does it make sense to construct examples of e.g. the Book class as fields of the Book class itself?
This means we need to define a new class, whose purpose is to demonstrate examples of our data. Accordingly, we’ll name the class ExamplesSomething – either ExamplesProblemName if we are working on a specific problem, or ExamplesClassName if we are building examples specifically for one of our data classes. Here, for instance, we’re building examples of Books:
// examples and tests for the classes that represent
// books and authors
class ExamplesBooks {
  ExamplesBooks() {}

  Author pat = new Author("Pat Conroy", 1948);
  Book beaches = new Book("Beaches", this.pat, 20);
}
This introduces two new pieces of syntax: first, in addition to declaring fields we are also initializing them to their values. (Compare with the field definitions in Book and Author, which do not use these initializers.) Second, we are constructing new instances of the Book and Author classes. The general syntax new ClassName(arg1, arg2, ..., argN) is the Java equivalent of (make-class-name arg1 arg2 ... argN) in DrRacket.
Notice the use of this in the code above: it’s used to indicate which ExamplesBooks object’s pat field is being used to initialize beaches – this one. This is called field access, and in general its syntax is someObject.aField, meaning “obtain the value of someObject, then obtain the value of its field named aField.” (It is legal Java syntax to repeat this process, and write someObject.aField.anotherField, but it is bad design – don’t be tempted to do it!)
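As a small illustration of construction and field access, here is a sketch that builds one author and one book and then reads the book’s fields, one field at a time. The class ExamplesFieldAccess and the local names inside its main method are assumptions made for this example; they are not part of the definitions above.

```java
// to represent an author of a book in a bookstore
class Author {
  String name;
  int yob;

  Author(String name, int yob) {
    this.name = name;
    this.yob = yob;
  }
}

// to represent a book in a bookstore
class Book {
  String title;
  Author author;
  int price;

  Book(String title, Author author, int price) {
    this.title = title;
    this.author = author;
    this.price = price;
  }
}

// Hypothetical class, used only to demonstrate the syntax.
class ExamplesFieldAccess {
  public static void main(String[] args) {
    // new ClassName(arg1, ..., argN) constructs an instance
    Author pat = new Author("Pat Conroy", 1948);
    Book beaches = new Book("Beaches", pat, 20);

    // someObject.aField obtains the value of one field of one object
    String title = beaches.title;
    int price = beaches.price;
    Author writer = beaches.author; // the very same object as pat

    System.out.println(title);       // prints Beaches
    System.out.println(price);       // prints 20
    System.out.println(writer.name); // prints Pat Conroy
  }
}
```

Each access here reads a single field of a single object, rather than chaining several accesses together.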
Do Now!
In terms of the design recipe from last semester, which part of the recipe would we be violating?
1.4 Anatomy of the Class Definition
A class definition consists of the keyword class followed by the name we choose for this class of data.
Next are the definitions of fields of the class, similar to the fields or components of DrRacket structs. In Java, the field definitions provide both the type of data that the field represents, and the name of the field, so we can refer to it. Contrast this with DrRacket, where the information about the types of data the fields represent was written only in comments. In some classes, such as ExamplesBooks, we also supply initial values for the fields. Recognize these by an = sign after the field name, followed by an expression that provides the desired value.
After the field definitions comes the constructor, which accepts a list of arguments and initializes the fields of the instance being created to the supplied values.
Creating new instances of classes requires invoking the desired constructor. The syntax to do so is the keyword new followed by the name of the class for which we want to construct an instance, followed by a parenthesized list of values to give to the fields.
Every field definition and initializer statement in Java must end in a semicolon. (More or less. We’ll see later some of the subtleties here.)
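The parts described in this section can be labeled on a small example. This is only a sketch: the City and ExamplesCities classes are hypothetical, invented here to have something fresh to label, and are not part of the bookstore problem.

```java
// Hypothetical class, invented only to label the parts of a class definition.
class City {                          // the keyword class, then the class name
  String name;                        // a field: its type, its name, a semicolon
  String state;                       // another field

  City(String name, String state) {   // the constructor: named after the class
    this.name = name;                 // initialization statements, one per field
    this.state = state;
  }
}

// Hypothetical examples class for City.
class ExamplesCities {
  ExamplesCities() {}

  // fields with initializers: an = sign, then an expression for the value;
  // each expression uses new, the class name, and the values for the fields
  City boston = new City("Boston", "MA");
  City providence = new City("Providence", "RI");

  public static void main(String[] args) {
    ExamplesCities examples = new ExamplesCities();
    City boston = examples.boston;
    System.out.println(boston.name); // prints Boston
  }
}
```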
class ExamplesBooks {
  ExamplesBooks() {}

  // BAD IDEA!
  Book beaches = new Book("Beaches", this.pat, 20);
  Author pat = new Author("Pat Conroy", 1948);
  // CRASH
  String beachesAuthorName = this.beaches.author.name;
}
Do Now!
When we create an ExamplesBooks instance, what will the value of the author field of its beaches field be?
There are several problems with this example: First, it is poor design to access a field of a field, and all the more so a field of a field of a field! Second, it is silly to initialize the field beachesAuthorName from some smaller part of beaches; it would be simpler to initialize beachesAuthorName first, then pat based on it, then beaches based on that.
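Java runs field initializers from top to bottom, in the order they are written in the class. In the example above, this.pat has therefore not been initialized yet (it is still null) at the moment beaches is constructed. Here is a sketch of the reordering suggested in the previous paragraph, where every initializer uses only fields defined above it. The main method is an addition for demonstration only.

```java
// to represent an author of a book in a bookstore
class Author {
  String name;
  int yob;

  Author(String name, int yob) {
    this.name = name;
    this.yob = yob;
  }
}

// to represent a book in a bookstore
class Book {
  String title;
  Author author;
  int price;

  Book(String title, Author author, int price) {
    this.title = title;
    this.author = author;
    this.price = price;
  }
}

class ExamplesBooks {
  ExamplesBooks() {}

  // smallest piece first, then the data built from it
  String beachesAuthorName = "Pat Conroy";
  Author pat = new Author(this.beachesAuthorName, 1948);
  Book beaches = new Book("Beaches", this.pat, 20);

  // for demonstration only
  public static void main(String[] args) {
    ExamplesBooks examples = new ExamplesBooks();
    System.out.println(examples.beachesAuthorName); // prints Pat Conroy
  }
}
```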
Exercise
Enhance the definitions of Book and Author above to include Publisher information. A Publisher should have fields representing their name (that is a String), their country of operation (that is also a String), and the year they opened for business (that is an int). Should Books or Authors have Publishers? Enhance the class diagram above to include your new information. Define the new class. And enhance the ExamplesBooks class to include examples of the new data.