Lecture 5: Java Safari
There are a number of Java features we’ll use in this course that you likely haven’t seen or fully understood before. In this lecture we’ll introduce several of them.
1 What the heck does static mean?
1.1 What is it?
In programming generally, static describes things that happen or are
determined at compile time, and dynamic describes things that happen
or are determined at run time. In object-oriented programming, and in
Java in particular, static
means that some member—a field,
method, or nested class—is part of its class, whereas a
non-static
member is associated with every object of the class.
(Note that a class and its static members are created at compile time,
whereas objects are created dynamically.)
1.2 How do we use it?
Field, class, and method members of classes can be declared
static
. (Enumerations and interfaces are always static and need
not be declared as such.) In each case the idea of being associated with
the class versus its instances must be interpreted slightly
differently.
1.2.1 Static fields
An instance (non-static) field is a separate slot in each object of a class, whereas a static field has one slot for the whole class. For example, consider a class with one of each kind of field:
class Widget {
int widgetId;
static int widgetIdCounter = 3;
}
For each Widget w
we create, there’s a separate w.widgetId
variable.
Whereas there is only one Widget.widgetIdCounter
variable, which we
refer to as a member of the class Widget
rather than a particular
object w
:
assertEquals( 3, Widget.widgetIdCounter );
Widget one = new Widget();
one.widgetId = 1;
Widget two = new Widget();
two.widgetId = 2;
assertEquals( 1, one.widgetId );
assertEquals( 2, two.widgetId );
assertEquals( 3, Widget.widgetIdCounter );
one.widgetId = 11;
two.widgetId = 12;
Widget.widgetIdCounter = 13;
assertEquals( 11, one.widgetId );
assertEquals( 12, two.widgetId );
assertEquals( 13, Widget.widgetIdCounter );
In a sense, static fields are Java’s version of global variables, and like globals, they should be used sparingly. Public, static, non-final fields warrant extra suspicion.
1.2.2 Static methods
Unlike dynamic methods, static methods do not require an instance of a class to operate on; as with static fields, the method is treated as a member of the class. For example:
class Widget {
public int getWidgetId() { return widgetId; }
int widgetId;
public static void resetWidgetIdCounter() { widgetIdCounter = 0; }
static int widgetIdCounter = 3;
}
Note that it would be a static (compile-time) error for
resetWidgetIdCounter
to refer to widgetId
, because widgetId
is a
non-static field, and static methods don’t operate on an individual
instance. In order to call non-static method getWidgetId()
, we need an
instance of the widget class to call it on, because a non-static method
can use the non-static fields of the object:
assertEquals( 11, one.getWidgetId() );
assertEquals( 12, two.getWidgetId() );
But static methods are called on the class rather than an instance, and
thus don’t have an instance (this
) to work on:
Widget.resetWidgetIdCounter();
assertEquals( 0, Widget.widgetIdCounter );
1.2.3 Static classes
When a nested class is static
, it behaves like an ordinary
class, just nested in its enclosing class’s namespace. (We defer
discussion of non-static nested classes for now.) If Nested
is a
static member of Outer
, then we refer to it as Outer.Nested
when writing code that’s outside of Outer
. Furthermore, nested classes
share the same private
scope with their enclosing class. Thus,
Outer
can see all of Outer.Nested
’s private members, and
Outer.Nested
can see all of Outer
’s private members.
As an example, rather than use a static field widgetIdCounter
to
generate widget IDs, we can manage the counter in a factory object:
class Widget {
private Widget(...) { ... }
public static class Factory {
public Factory() { ... }
public Widget create(...) {
...
++widgetIdCounter;
...
}
private int widgetIdCounter;
}
}
In order to create Widget
s, we first must create a Widget.Factory
,
which instantiates its own counter. Then we can use the factory to
create Widget
objects numbered using that factory’s counter:
Widget.Factory factObj = new Widget.Factory();
Widget w1 = factObj.create();
Widget w2 = factObj.create();
An advantage to this approach is that now we can create multiple independent factories, perhaps to work with multiple concurrent versions of some client system. Of course, if we actually want only one global numbering, then we can apply the singleton pattern.
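As a minimal sketch of how the elided pieces might be filled in, here is a version of the factory that compiles on its own. The `id` field, the `getId()` accessor, and the shared `DEFAULT` factory are assumptions made for illustration; they are not part of the code above. Keeping the constructor public while also offering one shared instance is a relaxed cousin of the singleton pattern, not the strict form.

```java
// A compilable sketch of the factory idea. The id field, getId(), and
// DEFAULT are hypothetical names chosen for this example.
final class Widget {
  private final int id;

  // Private: only code inside Widget (including Widget.Factory) can call this.
  private Widget(int id) { this.id = id; }

  public int getId() { return id; }

  public static class Factory {
    // One shared factory, for clients that want a single global numbering.
    public static final Factory DEFAULT = new Factory();

    private int widgetIdCounter = 0;

    public Factory() { }

    public Widget create() {
      ++widgetIdCounter;
      // The nested class may call Widget's private constructor.
      return new Widget(widgetIdCounter);
    }
  }
}
```

Two factories created with `new Widget.Factory()` number their widgets independently, while every client that goes through `Widget.Factory.DEFAULT` shares one counter.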
1.3 When should we use static?
Use a static field when you want one variable for the whole class rather
than one per object. Sometimes a class will do this privately to cache
some kind of information or to keep track of some information about its
instances, but for the most part static fields are rare except for
constants. A constant should be a public static final
field
whose contents are immutable, and its name should be in all caps.
Use a static method when you want to associate some method with a class
that doesn’t depend on having an instance. The most common case for
static methods is static factory methods, which produce objects of a
class (and don’t require that we already have an object to do so). It’s
also common to factor out implementation functionality into private
static
helper methods.
Use a static class when you want a helper class that’s strongly associated with the enclosing class, especially when the helper doesn’t make sense on its own. Nesting a helper class also allows the outer class to see its private members and vice versa, which can be helpful when the two classes are tightly coupled. For example, it makes sense to nest a class implementing an iterator for a collection class inside the collection class, because the iterator probably doesn’t make sense without the collection class, and it is often useful for the iterator to be able to see into the collection objects.
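The guidelines for static fields and methods can be sketched in one small class. Everything here (`Temperature`, `ABSOLUTE_ZERO_CELSIUS`, `ofCelsius`, `ofFahrenheit`, `checkNotBelowAbsoluteZero`) is a hypothetical name invented for the example, which shows a constant, two static factory methods, and a private static helper.

```java
// Sketch: common uses of static fields and methods.
// Temperature and all of its members are hypothetical names.
final class Temperature {
  // A constant: public static final, immutable contents, ALL_CAPS name.
  public static final double ABSOLUTE_ZERO_CELSIUS = -273.15;

  private final double celsius;

  private Temperature(double celsius) { this.celsius = celsius; }

  // Static factory methods: callable without an existing Temperature.
  public static Temperature ofCelsius(double celsius) {
    return new Temperature(checkNotBelowAbsoluteZero(celsius));
  }

  public static Temperature ofFahrenheit(double fahrenheit) {
    return ofCelsius((fahrenheit - 32.0) * 5.0 / 9.0);
  }

  // Private static helper: it uses no instance state, so it does not need
  // to be an instance method.
  private static double checkNotBelowAbsoluteZero(double celsius) {
    if (celsius < ABSOLUTE_ZERO_CELSIUS) {
      throw new IllegalArgumentException("below absolute zero: " + celsius);
    }
    return celsius;
  }

  public double getCelsius() { return celsius; }
}
```

A client writes `Temperature.ofFahrenheit(212.0)` and never needs an instance in hand to get started.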
2 Arrays
2.1 What are they?
In Java, an array of type τ[]
(where τ
is any Java
type) is a mutable, fixed-length, constant-time–indexed sequence of
values of type τ
. Let’s unpack that, from back to front:
A sequence means that we have some number of things with a defined order (unlike a set, which has no order).
Constant-time–indexed means that we can get (or set) any element of an array identified by its index (position) in the array, and how long it takes does not depend on the position or the size of the array. (Well...this is a convenient approximation, really. The actual behavior for sufficiently long arrays is not so constant, and depends upon the actual architectural details of the computer.)
Fixed-length means that once an array is created, its length never changes.
Mutable means that the contents of the array (the elements) can be changed at will. Thus, they can change in content but not in length.
2.2 How can we use them?
To create a new array, use the special array form of the new
operator, which comes in two main variants:
A new, fully initialized array new int[] {2, 4, 6, 8} contains exactly the elements listed, which also determine its length.
A new, uninitialized array new int[64] has the given size and is filled with the default value for its element type. For numeric types like int and double the default is zero, and for object types the default is null.
As a special case, when used to initialize a variable as part of its declaration everything before the curly braces can be omitted:
int[] intArray = {2, 4, 6, 8};
The main operation on arrays is indexing, for both observing and
updating the array. It’s also common to find out the length of an array
using its length
field.
assertEquals(4, intArray[1]);
assertEquals(8, intArray[3]);
intArray[3] = 17;
assertEquals(17, intArray[3]);
assertEquals(4, intArray.length);
The class
Arrays
(doc)
provides a large number of static methods for working with arrays, such
as searching, sorting, filling, and copying. It also provides a method
for turning an array of any type T
into a List
(doc) of T
:
<T> List<T> asList(T... a)
. (Note that List is an interface, and the venerable ArrayList is one class that implements it.)
Ah, but what does that T... mean? It’s a special form of array. Read on...
2.2.1 Varargs
Suppose we want a method that takes an arbitrary number of strings. Of course arrays are good for this, since they represent arbitrary length sequences:
void setPlayers(String[] newPlayers);
However, passing an array is ugly and inconvenient when in the common case we don’t have an array already and want to list the elements directly in the method call:
setPlayers(new String[] {"Crosby", "Stills", "Nash", "Young"});
Instead, Java provides a way to declare a method that takes a variable number of arguments:
void setPlayers(String... newPlayers);
The ...
parameter always comes last in the argument list,
though there may be a number of ordinary parameters that come before it.
Now when we call the method, it looks like it takes a variable number of string arguments:
setPlayers("Crosby", "Stills", "Nash", "Young");
Within the method, parameter newPlayers
is an array just like before:
void setPlayers(String... newPlayers) {
for (String player : newPlayers) {
addToGame(player);
}
}
If we already have an array ready to go, we can pass it directly:
setPlayers(someArrayOfStrings);
Whether we pass an existing array or use the varargs method call syntax, the callee receives an array.
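As a small sketch of these rules together, the hypothetical `Roster` class below (its name and members are invented for illustration) has a method with one ordinary parameter followed by a varargs parameter. It can be called with several trailing arguments, with none at all, or with an existing array.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: a varargs parameter after an ordinary one. Roster and its
// members are hypothetical names.
final class Roster {
  private final List<String> players = new ArrayList<>();

  // teamName is an ordinary parameter; newPlayers collects the rest.
  String addPlayers(String teamName, String... newPlayers) {
    for (String player : newPlayers) {
      players.add(player);
    }
    // Inside the method, newPlayers is an ordinary String[].
    return teamName + " +" + newPlayers.length;
  }

  int size() { return players.size(); }
}
```

Calling `addPlayers("CSNY")` with no trailing arguments is legal: the method receives an empty array rather than null.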
2.2.2 Array gotchas
An array is represented as a reference to a chunk of memory, and the value of the array, so far as Java is concerned, is the reference itself, not the chunk of memory. This means that re-assigning or passing an array results in aliasing, having more than one name for the same thing:
int[] anotherArray = intArray;
anotherArray[0] = -9;
assertEquals(-9, intArray[0]);
Not only does ==
for arrays compare references rather than
contents, but equals(Object)
does as well. This means, for
example, that this JUnit test will fail:
assertEquals(new int[] {3, 6}, new int[] {3, 6}); // fails!
Yes, the arrays have the same contents, but not the same physical
identity. To test for equality on the contents of two arrays, use
assertArrayEquals
:
assertArrayEquals(new int[] {3, 6}, new int[] {3, 6});
In order to compare arrays by their values yourself, you should use the
Arrays.equals(Object[], Object[])
method, which compares the
elements of the arrays using their equals(Object)
methods. Of
course, if the contents of the arrays you are comparing are in turn
arrays, those will each be compared by physical identity. If you have
several layers of nested arrays and want to compare by the contents all
the way down, use Arrays.deepEquals(Object[], Object[])
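. (Note that these methods can take Object[]s because in Java every array whose elements have a reference type is a subtype of Object[]. Arrays of primitives, such as int[], are not, which is why Arrays.equals is also overloaded for each primitive array type. This design choice is actually a
big problem, but we won’t get into it right now.)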
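The different notions of array equality can be seen side by side in a short sketch. `ArrayEqualityDemo` is a hypothetical name; the `Arrays` methods it calls are the standard ones from `java.util`.

```java
import java.util.Arrays;

// Sketch: reference equality versus element-wise equality for arrays.
final class ArrayEqualityDemo {
  public static void main(String[] args) {
    int[] a = {3, 6};
    int[] b = {3, 6};
    System.out.println(a == b);              // false: two distinct arrays
    System.out.println(a.equals(b));         // false: also compares references
    System.out.println(Arrays.equals(a, b)); // true: compares the elements

    int[][] nestedA = {{1, 2}, {3}};
    int[][] nestedB = {{1, 2}, {3}};
    // The elements are themselves arrays, so Arrays.equals compares
    // them by reference and reports false.
    System.out.println(Arrays.equals(nestedA, nestedB));
    // deepEquals compares contents all the way down and reports true.
    System.out.println(Arrays.deepEquals(nestedA, nestedB));
  }
}
```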
2.3 When should we use them?
Mainly, you will use arrays because various existing APIs require them.
You can use an array when you want a sequence of values that you can
look up and update efficiently, addressed by their positions in the
sequence. However, if the length of the sequence needs to change, you
probably want to use a List
instead (e.g., ArrayList
or
LinkedList
). You can simulate a variable-length array by allocating
new arrays and copying the elements over as needed, and this is in fact
how ArrayList
works (though there are some subtleties to avoiding a
large number of inefficient copies).
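To make the copying idea concrete, here is a minimal sketch of a growable sequence of ints built on a fixed-length array. `GrowableIntArray` is a hypothetical class written for this example; the real `ArrayList` differs in its details. Doubling the capacity each time the array fills up is one common way to keep the total number of copies small.

```java
import java.util.Arrays;

// Sketch of a growable int sequence built on a fixed-length array.
// GrowableIntArray is a hypothetical name, not a library class.
final class GrowableIntArray {
  private int[] items = new int[4]; // current backing array
  private int size = 0;             // number of slots actually in use

  void add(int value) {
    if (size == items.length) {
      // Full: allocate an array twice as long and copy the elements over.
      items = Arrays.copyOf(items, items.length * 2);
    }
    items[size] = value;
    size++;
  }

  int get(int index) {
    if (index < 0 || index >= size) {
      throw new IndexOutOfBoundsException("index: " + index);
    }
    return items[index];
  }

  int size() { return size; }
}
```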
Since List
s can do everything that arrays can, why would we ever use
an array? One reason would be if you know the length that you need up
front, and you want to guarantee that you can never accidentally change
it. Another reason is for efficiency, since higher-level sequences such
as ArrayList
build on top of arrays with additional checking and
indirection. Usually this small, constant-factor change in resource
efficiency isn’t worth worrying about, but if you are implementing a new
data structure on top of a sequence or sequences, you may want to use an
array directly to avoid an additional layer that every client will then
have to pay for.
3 Characters
3.1 What are they?
The character type, char
, represents single graphemes or
symbols for writing text. This includes letters ('a'
,
'b'
, 'A'
, 'B'
, 'α'
, 'β'
,
etc.), digits ('0'
, '1'
, '١'
, '٢'
,
etc.), punctuation ('.'
, '-'
, '«'
, etc.), and
whitespace (' '
, '\n'
, '\t'
, '\f'
,
etc.).
3.2 How can we use them?
The Character
class provides, among other things, static methods for
working with characters, such as tests for character categories:
assertTrue(Character.isLetter('a'));
assertTrue(Character.isLetter('β'));
assertFalse(Character.isLetter('8'));
assertFalse(Character.isLowerCase('Z'));
assertTrue(Character.isLowerCase('z'));
Strings are sequences of characters, and the Java String
class
provides methods for working with them as such. The method
charAt(int)
lets us treat a string as an (immutable) array by
allowing us to look up characters by position:
assertEquals('a' , "abcde".charAt(0));
assertEquals('c' , "abcde".charAt(2));
Additionally, we can search for characters in strings, and convert between strings and arrays of characters, as the tests below show. (There’s a twist, though: some methods, such as String.indexOf(int), take “characters” represented as type int rather than type char, because it turns out that the Java char type doesn’t have enough bits to represent every Unicode code point. However, because chars are implicitly converted to ints, you can use a character where an integer is expected with no trouble. In the other direction it requires a cast, which is lossy because some int values don’t fit in char.)
assertEquals(0, "abcde".indexOf('a'));
assertEquals(2, "abcde".indexOf('c'));
assertEquals(-1, "abcde".indexOf('f'));
char[] abc = {'a', 'b', 'c'};
assertArrayEquals(abc, "abc".toCharArray());
assertEquals("abc", String.valueOf(abc));
3.3 When should we use them?
Characters are really for only two things:
For processing a String character by character to query, transform, decompose, or construct it incrementally. For example, if we want to split a sentence up into individual words, then processing it character-by-character lets us find the word boundaries and whitespace in between.
For representing text-like values that are always exactly one character long. This is fairly rare, but one place it often shows up is representing keystrokes in GUIs, which actually doesn’t work very well. (Why?)
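The word-splitting example can be sketched as follows. `WordSplitter` and `words` are hypothetical names, and the sketch treats any run of whitespace (as judged by `Character.isWhitespace`) as a word boundary, which is only one reasonable definition of a word.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: splitting a sentence into words by examining one char at a time.
// WordSplitter is a hypothetical name.
final class WordSplitter {
  static List<String> words(String sentence) {
    List<String> result = new ArrayList<>();
    StringBuilder current = new StringBuilder();
    for (int i = 0; i < sentence.length(); i++) {
      char c = sentence.charAt(i);
      if (Character.isWhitespace(c)) {
        // A word boundary: emit the word collected so far, if any.
        if (current.length() > 0) {
          result.add(current.toString());
          current.setLength(0);
        }
      } else {
        current.append(c);
      }
    }
    if (current.length() > 0) {
      result.add(current.toString()); // the final word has no trailing space
    }
    return result;
  }
}
```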
4 Primitive types versus reference types
4.1 What are they?
Java makes a distinction between primitive types and reference (or pointer) types. Understanding this distinction will make a variety of other features and quirks of Java make sense. Let’s consider what Java variables really mean. (Not just variables, because everything below applies to method parameters and results as well.) In this diagram, there are six variables, each of which is represented as a box in the left column that contains the variable’s value:
The first thing to notice about the six variables is that none has a
compound value, composed of multiple components—each contains a single,
simple value, which may be an immediate number, a reference to something
else, or null
.
Variables a
and b
contain values of primitive (meaning “built-in”)
numeric types int
(four bytes) and long
(eight bytes);
for each of these the value is directly in the variable, with no
references to anything else. The prefix 0x
on a number is the syntax
in many programming languages, including Java, for hexadecimal integer
literals; writing the numbers this way makes it clear how many bits each
is represented with. Every bit in each of these types (and all primitive
types) is part of the representation of the number, and there’s no room
for a distinguished null
bit pattern; hence, primitive types
do not include null
as a value.
The other four variables have reference types, of which there are two
subdivisions, object types and array types. Variables c
and d
both
have the object type Posn
. (Assume a class with two int
fields x
and y
.) Java variables cannot hold objects directly (objects are compound data structures), so instead they must hold a reference to an object. (What is a reference? Most likely it’s just the memory address of the object, like a pointer in C or C++, though there are optimizations that can make the situation less simple.) Variable c
contains a
reference to a Posn
object, so that, for example, c.x == 3
.
Note that the fields of an object are a kind of variable, which means
that they, too, can hold only primitives or references, not actual
objects. Variable d
currently does not hold a reference, so its value
is null
, which is a distinguished value that indicates the
absence of a reference.
Variables e
and f
have array types, which means that
each holds a reference to an array (or null
), not the array
itself. Whereas e
refers to an array of primitive int
values,
f
refers to an array of Posn
object references. Three of the
four elements contain references to Posn
objects, and the fourth
contains null
. Note that while there are only two Posn
objects in the diagram, there are four Posn
object references—multiple
references can point to the same object. (This is aliasing as discussed
in the array section above.)
The reason why objects and arrays need to be accessed via references is
that their types do not determine how much space they take up. In
particular, an array type int[]
does not determine the length
of the array, and object type Posn
could include subclasses with
additional fields.
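The practical consequence shows up on assignment: copying a variable copies whatever is in its box, which is either a primitive value or a reference. The sketch below uses a bare-bones `Posn` with mutable, non-private fields purely to keep the demonstration short; `AliasDemo` and `run` are hypothetical names.

```java
// Sketch: assignment copies the variable's contents, which is either a
// primitive value or a reference.
final class Posn {
  int x;
  int y;
  Posn(int x, int y) { this.x = x; this.y = y; }
}

final class AliasDemo {
  // Returns {a, c.x} after modifying the copies b and d.
  static int[] run() {
    int a = 3;
    int b = a;      // copies the value 3
    b = 10;         // a is unaffected

    Posn c = new Posn(3, 4);
    Posn d = c;     // copies the reference: c and d are aliases
    d.x = 10;       // visible through c as well

    return new int[] {a, c.x};
  }
}
```

After `run()`, `a` is still 3 but `c.x` is 10, because `b` held its own copy of a number while `d` held a second arrow to the same object.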
4.1.1 All the primitive types
In total, Java has eight primitive types that can be the immediate values of variables:
| Type | Size | Description | Range |
|---|---|---|---|
| boolean | 1 bit | truth value | true or false |
| byte | 8 bits | signed integer | \(-2^7\) to \(2^7 - 1\) |
| short | 16 bits | signed integer | \(-2^{15}\) to \(2^{15} - 1\) |
| char | 16 bits | Unicode character | \(0\) to \(2^{16} - 1\) |
| int | 32 bits | signed integer | \(-2^{31}\) to \(2^{31} - 1\) |
| float | 32 bits | floating point number | complicated; see the IEEE 754 standard |
| long | 64 bits | signed integer | \(-2^{63}\) to \(2^{63} - 1\) |
| double | 64 bits | floating point number | complicated; see the IEEE 754 standard |
These have different sizes, which means that variables have different sizes. But each one has a known size, and each is a single value rather than some combination of values.
4.1.2 Boxed types
Every primitive type in Java has a corresponding object type:
int
has Integer
(doc),
char
has Character
(doc),
double
has Double
(doc),
and so on. (Only Integer
and Character
have names that differ in
more than their capitalization from their primitive counterparts; the
other six are the same except for the capitalized first letter.)
In each case, the uppercase object type is a class with a field
containing the corresponding lowercase primitive type. For example,
let’s compare the short
primitive type with the Short
object type:
Variable g
of type short
contains a short
value
directly. Compare this to variable h
of type Short
, which contains a
reference to a Short
object that has a field containing its value.
Variable i
also has type Short
, but instead of containing a
reference to an object, it contains null
. Note that
short
cannot be null
but Short
can, double
cannot be null
but Double
can, and so on for the other six
boxed types.
Variable j
contains a reference to an array of primitive
short
values, which are stored directly in the array. Variable
k
is a reference to an array of Short
s, that is, an array of object
references.
Note that these types are boxed because each is a reference to a box (object) containing the primitive type. For the most part, you shouldn’t have to convert between primitive types and their object versions, because Java automatically inserts box and unbox operations where needed.
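A short sketch of where the automatic conversions happen, and of the one place they can bite. `BoxingDemo` and its methods are hypothetical names. The third method shows that unboxing a null reference throws a `NullPointerException`, since there is no primitive value to produce.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: automatic boxing and unboxing. BoxingDemo is a hypothetical name.
final class BoxingDemo {
  static List<Integer> oneTwoThree() {
    List<Integer> numbers = new ArrayList<>();
    numbers.add(1); // each int is boxed to an Integer
    numbers.add(2);
    numbers.add(3);
    return numbers;
  }

  static int sum(List<Integer> numbers) {
    int total = 0;
    for (int n : numbers) { // each Integer is unboxed to an int
      total += n;
    }
    return total;
  }

  // Unboxing null has no primitive value to produce, so it throws.
  static boolean unboxingNullThrows() {
    Short boxed = null;
    try {
      short s = boxed; // implicitly calls boxed.shortValue()
      return false;
    } catch (NullPointerException e) {
      return true;
    }
  }
}
```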
4.2 How can we use them?
You have been, and for the most part you know how. But there’s one thing
worth knowing about that you may not: How to perform Object
operations
such as equality and hashing.
5 Equality: physical and logical
Equality seems like such a simple proposition: two things are either “the same” or they aren’t. Except of course the preceding sentence has two key terms left undefined: “things” and “sameness”. Now that we have a clearer picture of the difference between reference types and value types, there are clearly two different kinds of things: things that contain an arrow, or things that don’t. We therefore have to refine our notion of sameness: For value types, there’s really only one thing to be done: compare the values themselves and see if they are equal. But for references:
We can “follow the arrows” and recursively compare the data they refer to for sameness.
Or, we can see if the arrows point to exactly the same place. This check is what the
==
operator performs.
These last two are known as “logical” and “physical” equality, respectively. But this is a very operational definition: instead of saying what we want the sameness operation to mean, we are saying how to compute the result.
An alternate way of phrasing physical equality is to say that two values are physically equal when they are aliases of each other.
In Fundies 2, we called these notions “intensional” and “extensional” equality. They mean the same thing as “physical” and “logical” equality, except we didn’t need to know exactly how references worked in order to define them!
To make equality convenient to use, Java defines the
boolean equals(Object other)
method, on the Object
class, so that any two
objects can be compared. By default, this operation is defined as
boolean equals(Object that) { return this == that; }
so that the default operation is simply intensional, physical equality.
However we are free to override this method on our own classes, so that
instances of our classes can support logical equality instead. Typically, if
we design a class such that all of its fields are private final
, then
that class is probably meant to behave as a value type, rather than a
reference type, despite the fact that all object types are reference types.
In these cases it’s a good idea to override equals. (Be careful! If any fields of that class, despite being final, are reference types that transitively contain values that aren’t final, then perhaps this type isn’t actually a value type. Or, perhaps the design of that non-final class should be revisited...)
Logical equality can be more sophisticated than merely recursively comparing
all fields for equality. Consider a Fraction
class:
final class Fraction {
private final int num, den; // represents the number (num/den)
...
@Override
public boolean equals(Object obj) {
if (!(obj instanceof Fraction)) return false;
Fraction that = (Fraction)obj;
return this.num == that.num && this.den == that.den; // Oops!
}
}
assertEquals(new Fraction(1, 2), new Fraction(1, 2)); // good...
assertEquals(new Fraction(1, 2), new Fraction(2, 4)); // Fails!
There are multiple ways of representing the “same” fraction, so we need a more general equivalence:
@Override
public boolean equals(Object obj) {
if (!(obj instanceof Fraction)) return false;
Fraction that = (Fraction)obj;
return this.num * that.den == that.num * this.den;
}
Now both tests above pass.
5.1 Equality and autoboxing
Comparing an object to a primitive value seems like a silly thing to do: of
course they can’t ever be equal, right? But in the presence of generics (see
Generics below), Java will automatically box primitives into
their boxed object forms. And at that point, looking for, say, an int
in a List<Integer>
makes a lot of sense. Accordingly, Java ensures that
the equals
method on box types coincides with the ==
operation on
the primitive values: if any two values are ==
, their boxed forms are
equals()
:
long val_x1 = 7L;
long val_x2 = 7L;
Long box_x1 = 7L;
Long box_x2 = 7L;
assertTrue(val_x1 == val_x2); // primitive equality
assertTrue(box_x1.equals(box_x2)); // logical equality of boxes
assertTrue(box_x1.equals(val_x1)); // logical equality with auto-boxing
Unfortunately, the ==
operator will not perform the same way for box
types as it does for primitives. Consider: there are a lot of 64-bit
long
numbers, so allocating an object for each one is exorbitantly
expensive. So the following will occur:
Long x1 = 7L;
Long x2 = 7L;
Long y1 = 720_233_830_121_456L;
Long y2 = 720_233_830_121_456L;
assertTrue ( x1.equals(x2) ); // for small enough numbers, Java
assertTrue ( x1 == x2 ); // will ensure physical equality of boxed values
assertTrue ( y1.equals(y2) ); // but when they get large enough,
assertFalse( y1 == y2 ); // there is no such guarantee.
Use equals() whenever possible, to avoid this confusion.
5.2 Equality and hashing
Equality is intended to mean that two variables “behave the same” in any scenario we care about. This implies that any equality implementation had better respect the following three rules:
Reflexivity: x.equals(x), always. (Unless of course x is null, in which case this crashes. See the notes on null below...)
Symmetry: if x.equals(y), then y.equals(x), always.
Transitivity: if x.equals(y) and y.equals(z), then x.equals(z), always.
(It should be apparent that ==
obeys these three rules too.)
Java includes one additional scenario in which we can observe objects: we can
stick them inside hash tables, in which case Java relies on the int
hashCode()
method. Hashcodes have to obey the following consistency rules
with respect to equality:
Compatibility: If x.equals(y), then x.hashCode() == y.hashCode(). Contrapositively, if x.hashCode() != y.hashCode(), then !x.equals(y).
Non-injectivity: Just because x.hashCode() == y.hashCode(), it doesn’t follow that x.equals(y). Contrapositively, just because !x.equals(y), it doesn’t follow that x.hashCode() != y.hashCode().
In other words, the point of a hashcode is to quickly decide when two objects
are not equal. If that quick check can’t tell them apart, then the full
equals()
method is needed.
Always override hashCode
when you override equals
, or
you’re practically guaranteed to violate these rules.
As a corollary of these rules, the hashCode() method should be a function
of only the same fields that are used by equals(). (Consider a Posn
whose equals() method only checked its x-coordinate, but whose
hashCode() method used its y-coordinate as well.)
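Here is a sketch of what goes wrong in that situation. `XPosn` is a hypothetical class, deliberately broken: its `equals()` looks only at `x`, while its `hashCode()` also depends on `y`. Two objects that `equals()` considers the same then get different hash codes, and a `HashSet` fails to find one when given the other.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Sketch of a deliberately inconsistent equals()/hashCode() pair.
// XPosn and HashMismatchDemo are hypothetical names.
final class XPosn {
  private final int x, y;

  XPosn(int x, int y) { this.x = x; this.y = y; }

  @Override
  public boolean equals(Object obj) {
    if (!(obj instanceof XPosn)) return false;
    return this.x == ((XPosn) obj).x; // ignores y
  }

  @Override
  public int hashCode() {
    return Objects.hash(x, y); // Bug: depends on y, which equals() ignores
  }
}

final class HashMismatchDemo {
  static boolean lookupSucceeds() {
    Set<XPosn> seen = new HashSet<>();
    seen.add(new XPosn(1, 2));
    // equals() says these two are the same, but their hash codes differ,
    // so the set never gets as far as calling equals() on the stored element.
    return seen.contains(new XPosn(1, 3));
  }
}
```

Changing `hashCode()` to `Objects.hash(x)` restores the compatibility rule, and the lookup then succeeds.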
Fortunately, Java provides several static methods to make constructing
hash codes simple. In particular, each of the box types provides a static
hashCode
method, as shown here for double
s:
double val_x = 314.1592;
Double box_x = 314.1592;
assertTrue(box_x.hashCode() == Double.hashCode(val_x));
This takes care of most individual fields or variables. Additionally, there is
a utility class named Objects
that includes a convenience method
int hash(Object... args)
:
final class Posn {
private final int x, y;
...
@Override
public int hashCode() {
return Objects.hash(x, y); // boxes x and y to Integers,
// then gets their hashCode()s, and combines them into a single result
}
}
The actual mathematics of good hash functions are an interesting sub-domain of algorithms; for our purposes, it’s enough to know Java has good built-in defaults that we can use as needed.
Of course, if our particular equality operation is more sophisticated than
simply comparing fields, then this hashing approach won’t work. In particular,
if we used the same approach for Fraction
as we just used for Posn
, the following test would fail:
assertEquals(new Fraction(1, 2).hashCode(),
new Fraction(2, 4).hashCode()); // Fails
We need a hashcode that respects our equality relation. (Try implementing this one!)
6 Unlearning Fundies 2
Fundies 2 laid out a set of guides for how to write Java code that was easy to test and relatively clear, and the rules were fairly easy to follow. However, a few of those rules were too simplistic.
6.1 null is bad
Not entirely true. More accurately, null
is a pain to deal with.
Unlike every other value that can be placed in a reference-typed variable, you
can’t call methods on null
values. Forgetting to check for the
possibility of null
leads to unintended NullPointerException
s
that crash your program at unanticipated times. Far cleaner to create a
sentinel-value type (as we did with Empty
lists) and use that instead.
It turns out that null
values were a historical accident, because they
were simply easy to implement. Their inventor, Sir Tony Hoare (a Turing-award
winner, and creator of many ubiquitous algorithms) calls null
his
“billion-dollar mistake”.
One of the few good use cases for null
is when intending to create
cyclic data. Here, use null
to indicate that the cycle hasn’t been
formed; document well by what point the cycle should be formed, and
check for it early and fail quickly if an unexpected null
value appears.
6.2 == is bad
Not exactly. With our more sophisticated understanding of the distinction
between value types and reference types, just remember that ==
compares
the immediate contents of variables and so provides physical equality
comparisons. Use this operator sparingly but appropriately.
6.3 instanceof and casts are bad
This one’s still true.
The only legitimate use for them is when implementing overridden equals() methods.
7 Enumerations
An enumeration is a finite collection of values, all of which are known statically, and can therefore be given names. We’ve seen enumerated data in Fundies 1, but we skipped over it in Fundies 2, because it required additional syntactic support and because enumerated values don’t make it easy to participate in dynamic dispatch. Enums are most useful when the set of values is reasonably small, and when each value’s behavior is uniformly the same.
7.1 Simple enumerations
A simple enumeration looks like this:
enum TrafficLight { Red, Yellow, Green }
(Per the Google style guide, enum constants should be in
UNDERSCORED_ALL_CAPS
, and this naming convention is common, but a bit
“shouty”. We won’t complain if you use TitleCaseNaming
instead.)
Under the covers, this actually defines a class named
TrafficLight
, with a private constructor and three static fields. In other
words, it’s effectively producing the following:
final class TrafficLight {
private TrafficLight() { }
public static final TrafficLight Red = new TrafficLight();
public static final TrafficLight Yellow = new TrafficLight();
public static final TrafficLight Green = new TrafficLight();
}
Because the constructor is private and the class is final
, these three
final
fields are the only possible non-null values of this type.
As a consequence, you can check which enum value you have using the ==
operator:
TrafficLight nextLight(TrafficLight cur) {
if (cur == TrafficLight.Red) {
return TrafficLight.Green;
} else if (cur == TrafficLight.Yellow) {
return TrafficLight.Red;
} else if (cur == TrafficLight.Green) {
return TrafficLight.Yellow;
} else {
throw new IllegalArgumentException("Bad traffic light");
}
}
We can also use a switch statement instead (see below), and this is far more convenient.
7.2 More elaborate enumerations
Because enumerations are simply classes in disguise, the syntax for creating them is a bit more elaborate, and can be used to convenient effect:
enum UsCoin {
// Define each named value, passing an argument into the constructor
Penny(1), Nickel(5), Dime(10), Quarter(25);
// semicolon is needed to separate the declarations above
// from the fields and methods below
// Define some fields:
private final int value;
// Define the constructor
UsCoin(int value) { this.value = value; }
// Define some methods
public int getCentsValue() { return this.value; }
@Override
public String toString() { return String.format("%d¢", this.value); }
}
UsCoin.Quarter.toString()
yields "25¢"
,
without any additional work on our part.Notice that since the enumerated values are all instances of exactly the same
class, they necessarily have precisely the same implementation of any methods
we define. It is in this sense that enumerated values don’t participate in
dynamic dispatch very well: we can’t use that mechanism to distinguish among
them, so we use a switch statement instead.
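To see the extra field and method in use, here is a small sketch that totals a handful of coins. The CoinPurse class and its totalCents method are hypothetical names invented for this example; UsCoin is a trimmed copy of the definition above.

```java
import java.util.Arrays;
import java.util.List;

// Trimmed copy of the UsCoin enumeration defined above
enum UsCoin {
  Penny(1), Nickel(5), Dime(10), Quarter(25);

  private final int value;

  UsCoin(int value) { this.value = value; }

  public int getCentsValue() { return this.value; }
}

// Hypothetical helper class, not part of the lecture's code
class CoinPurse {
  // Adds up the value in cents of every coin in the list
  static int totalCents(List<UsCoin> coins) {
    int total = 0;
    for (UsCoin coin : coins) {
      total += coin.getCentsValue();
    }
    return total;
  }

  public static void main(String[] args) {
    List<UsCoin> coins = Arrays.asList(UsCoin.Quarter, UsCoin.Dime, UsCoin.Penny);
    System.out.println(totalCents(coins)); // 25 + 10 + 1 = 36
  }
}
```

Every coin shares the single getCentsValue implementation; only the value stored in each instance differs.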
8 The switch statement
In Fundies 1, we used the "cond" expression for two purposes: to
distinguish between the variants of a union-data definition, and to ask a
multi-faceted domain-knowledge question. In Java, the former usage is better
expressed via dynamic dispatch, while the latter usage is better expressed with
if/else if/else statements. However, enumerated values are an odd
middle case, where dynamic dispatch won’t suffice, but something better than
cascaded if statements exists.
8.1 A simple switch statement
By way of introductory example, we could rewrite our nextLight method above as follows:
TrafficLight nextLight(TrafficLight cur) {
switch(cur) {
// case labels name enum constants without the class prefix (required before Java 21)
case Red: return TrafficLight.Green;
case Yellow: return TrafficLight.Red;
case Green: return TrafficLight.Yellow;
default: throw new IllegalArgumentException("Bad traffic light");
}
}
A switch statement simply states that the value being examined falls into one
of several mutually exclusive cases. Classic switch statements may only be used
with enum values, integral primitive values (char, byte, short and int, along with their wrapper types) and (as of Java 7)
String values.
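As a small sketch of a switch over a primitive value, the method below maps a letter grade to a description. The GradeDemo class and its describe method are made up for this example; every case returns immediately, so nothing further runs after a match.

```java
// Hypothetical example class, not part of the lecture's code
class GradeDemo {
  // Maps a letter grade (a char) to a short description
  static String describe(char grade) {
    switch (grade) {
      case 'A': return "excellent";
      case 'B': return "good";
      case 'C': return "passing";
      default: return "unknown grade";
    }
  }

  public static void main(String[] args) {
    System.out.println(describe('B')); // good
    System.out.println(describe('Z')); // unknown grade
  }
}
```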
8.2 Fallthrough behavior
Warning: the cases of a switch statement have fallthrough
behavior, meaning that as soon as a case matches, the body of that case will
run, and then control will fall through into the body of the next case, whether or not its label matches. In the code below,
the second case will match, and so will print "Got here"...
void badSwitchExample() {
switch("Oops") {
case "Won't happen":
System.out.println("doesn't run");
case "Oops":
System.out.println("Got here");
case "Yay":
System.out.println("Hooray");
default:
System.out.println("Huh?");
}
}
...but control then falls through, so it also prints "Hooray" and "Huh?". To avoid this,
we need the break statement to say “please stop, and jump to the end of
this switch statement”:
void goodSwitchExample() {
switch("Oops") {
case "Won't happen":
System.out.println("doesn't run");
break;
case "Oops":
System.out.println("Got here");
break;
case "Yay":
System.out.println("Hooray");
break;
default:
System.out.println("Huh?");
break;
}
}
Now this prints only "Got here". (The break statement is actually more general: it can also be used to exit the nearest enclosing loop, and a labeled break can exit any enclosing labeled statement. An unlabeled break does not apply to if statements, and you cannot use break statements to escape from a method body; that remains an error.)
8.3 Default cases
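Every switch statement should come with a default: case, conventionally as its
final case. (Java does not strictly require one, but without it the whole statement is silently skipped when nothing matches.) This case is used when none of the other cases match the given
value. For strings, characters and numbers, this makes sense: after all, there
are a huge number of possible values for those types! For enums, it may seem
weird, but the compiler does not check that a switch statement covers every enum value, so a method like nextLight still needs a default: case in order to return or throw on every path.
Beware also that null is unfortunately a legal value of an enum type, and switching on null throws a NullPointerException rather than running the default: case.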
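The following sketch shows both points on a small enum: the default: case picks up whichever values are not listed, and a null argument never reaches it. Signal and DefaultCaseDemo are hypothetical names used only for this illustration.

```java
// Hypothetical enum, standing in for TrafficLight
enum Signal { Red, Yellow, Green }

class DefaultCaseDemo {
  static String action(Signal s) {
    switch (s) { // throws NullPointerException if s is null
      case Red: return "stop";
      case Green: return "go";
      default: return "slow down"; // reached for Yellow, the only value not listed
    }
  }

  // Reports whether switching on null throws instead of using the default case
  static boolean switchOnNullThrows() {
    try {
      action(null);
      return false;
    } catch (NullPointerException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println(action(Signal.Yellow));  // slow down
    System.out.println(switchOnNullThrows());   // true
  }
}
```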
9 Exceptions
10 Generics, or raw types considered harmful
11 JUnit
11.1 Review: the tester library
Coming from Fundies 2, you already are familiar with using the tester library to write test cases: something like
class ExamplesWhatever {
// various fields of data
int someData;
Foo aClassThatThrows;
void setupTestFixture() {
// reinitialize all your data
this.someData = 5;
this.aClassThatThrows = new Foo(5);
}
// old-style test method
boolean testSomething(Tester t) {
this.setupTestFixture();
return t.checkExpect(this.someData, 5, "Is it five?")
&& t.checkConstructorException(new IllegalArgumentException("No tens!"),
"Foo",
10)
&& t.checkException(new RuntimeException("Boom"), this.aClassThatThrows, "explode");
}
// new-style test method (named differently so both versions can coexist in one class)
void testSomethingNewStyle(Tester t) {
this.setupTestFixture();
t.checkExpect(this.someData, 5, "Is it five?");
t.checkConstructorException(new IllegalArgumentException("No tens!"),
"Foo",
10);
t.checkException(new RuntimeException("Boom"), this.aClassThatThrows, "explode");
}
}
A test class needs to accomplish several things:
It must correctly reinitialize a test fixture to ensure every test runs in a consistent environment (especially when mutation is involved);
It must be able to test whether some expression evaluates as expected;
It must be able to check for exceptions that occur in constructing objects;
And it must be able to check for exceptions that occur in method invocations.
The tester library in Fundies 2 provided a simplified API for such activities.
In this course, we’ll introduce you to JUnit, the widely-used standard library
for such things, and the support for it that’s built into IntelliJ. This isn’t
quite a language feature.
We translate the sample tests above into JUnit, then explain each feature:
import org.junit.*; // used to define @Test and @Before, etc.
import static org.junit.Assert.*; // used for assertEquals and assertTrue, etc.
// JUnit 4 requires the test class and its annotated methods to be public
public class MyTestClass {
// various fields of data
int someData;
Foo aClassThatThrows;
@Before
public void setupTestFixture() {
// reinitialize all your data
this.someData = 5;
this.aClassThatThrows = new Foo(5);
}
@Test
public void simpleTest() {
assertEquals("Is it five?", 5, this.someData);
}
@Test(expected = IllegalArgumentException.class)
public void constructorExceptionTest() {
new Foo(10);
}
@Test(expected = RuntimeException.class)
public void methodExceptionTest() {
this.aClassThatThrows.explode();
}
}
11.2 Simple tests
The first things to notice are the two import statements at the top; these
are needed to define the assertions and annotations used by JUnit tests.
Second, JUnit does not impose any naming convention ("ExamplesBlah", "testWhatever").
Instead, we simply mark our test methods with the @Test annotation.
The analogue of t.checkExpect is simply assertEquals. Its
arguments are in exactly the reversed order from the tester library’s order:
first, an optional description of the test case, followed by the
expected value of the test, and finally the actual value.
(Technically, the first argument isn’t optional; rather, there are several
overloaded assertEquals methods, of which only some include the
description string.)
The assertEquals testing form compares its arguments using their .equals
method. Keep this firmly in mind, as it is quite different from
the tester library. That library provided a structural equality comparison by
default, because we were using it before we’d defined how equality actually
worked. Here, now that we know the distinctions between == and .equals,
JUnit doesn’t impose any particular regimen on us; it’s up to
us to define what we mean.
Among other things, look again at the notes about comparing arrays for equality
above. Arrays are objects, but their .equals method is the one inherited from Object, which compares by reference (aliasing) rather than by contents, so JUnit provides a customized assertArrayEquals method for
comparing arrays for equality element-by-element, rather than by aliasing.
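The difference between the two notions of array equality can be seen without JUnit at all. In the sketch below, Arrays.equals from the standard library stands in for the element-by-element comparison that assertArrayEquals performs; ArrayEqualityDemo and its helper methods are hypothetical names for this example.

```java
import java.util.Arrays;

// Hypothetical example class, not part of the lecture's code
class ArrayEqualityDemo {
  // The .equals method arrays inherit from Object: true only for the same array
  static boolean equalsMethod(int[] x, int[] y) { return x.equals(y); }

  // Element-by-element comparison
  static boolean sameContents(int[] x, int[] y) { return Arrays.equals(x, y); }

  public static void main(String[] args) {
    int[] a = {1, 2, 3};
    int[] b = {1, 2, 3};
    int[] alias = a;
    System.out.println(equalsMethod(a, b));     // false: two distinct arrays
    System.out.println(equalsMethod(a, alias)); // true: the same array
    System.out.println(sameContents(a, b));     // true: equal elements
  }
}
```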
11.3 Test fixtures
Writing a test fixture is still our responsibility. However, rather than
having to remember to call it manually in every test method, we simply mark the
test fixture with @Before. JUnit will call it for us automatically
before each @Test method.
11.4 Testing for exceptions
The tester library had some rather ungainly mechanisms for testing exceptions:
we passed in the exact exception we expect, followed by either the name of the
class or the object and the name of its method, followed by the arguments to be
passed in to the constructor or the method. (We couldn’t explain at the time,
but checkConstructorException and checkException both accepted a
varargs list of arguments...) This was particularly annoying if there was a
typo in the method name, or a type error in the parameters passed in, as there
was no compile-time checking to let us know of our mistake.
JUnit has a much simpler mechanism. We elaborate the annotation before the method with the expected exception’s class:
@Test(expected = IllegalArgumentException.class)
and then simply invoke the constructor or method as normal. The JUnit framework will wrap every test method in a try-catch statement, and check that an exception is indeed thrown and that it is an instance of the class specified. Subclasses of that class are accepted too, so we should name the most specific exception class we expect (expecting RuntimeException would also accept an IllegalArgumentException), but we don’t have to worry about the precise error message itself, and the same technique works for exceptions thrown both by constructors and by methods. JUnit has additional mechanisms for checking exceptions, but this is the simplest and easiest to use.
Note that this mechanism implies that testing will stop at the first exception
thrown in each test method, because throwing exceptions short-circuits
evaluation (much like how a single test failure short-circuited tests in the
tester library, in the old-style boolean test methods). If you want to test
multiple exceptions, you must write multiple test methods: one per exception.
Or, you can write your own try-catch statements, and write an
assertEquals in the catch block that examines the exception object...but
this is error-prone: it is easy to forget that if the catch block doesn’t run
then the test should have failed. The common case is simply to write multiple
test methods.
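For reference, a hand-written check of this kind might look like the sketch below. It is written in plain Java so that it runs on its own: PickyFoo is a hypothetical stand-in for Foo, and the thrown AssertionErrors play the roles that fail(...) and assertEquals from org.junit.Assert would play inside a real JUnit test.

```java
// Hypothetical stand-in for the Foo class used in the tests above
class PickyFoo {
  PickyFoo(int n) {
    if (n == 10) {
      throw new IllegalArgumentException("No tens!");
    }
  }
}

class ManualExceptionCheck {
  // Returns normally only if constructing PickyFoo(n) throws the expected exception
  static void checkRejects(int n) {
    try {
      new PickyFoo(n);
    } catch (IllegalArgumentException e) {
      if (!"No tens!".equals(e.getMessage())) {
        throw new AssertionError("wrong message: " + e.getMessage());
      }
      return; // the expected exception arrived with the expected message
    }
    // Easy to forget: reaching this line means nothing was thrown at all
    throw new AssertionError("expected an IllegalArgumentException");
  }

  public static void main(String[] args) {
    checkRejects(10);
    System.out.println("10 was rejected as expected");
  }
}
```

Without the final throw, checkRejects(5) would return normally and the missing exception would go unnoticed.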
(In hindsight, we can see why the tester library needed to be implemented the
way it was: at the time, we didn’t have try-catch statements. Since the tester
library was implemented entirely via methods on the Tester class, such
methods would not have the ability to catch exceptions that were thrown during
the evaluation of their arguments. Hence, we passed in the names of the
things to be evaluated by the tester on our behalf, and internally it would use
a try-catch statement to handle the exceptions. JUnit uses a different
mechanism, namely method annotations, that allow it to effectively insert the
try-catch statements around every method for us, leading to the cleaner API.)
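To make the idea of annotation-driven wrapping concrete, here is a deliberately tiny sketch of how a framework could do it with reflection. This is not JUnit’s actual implementation: the @Expects annotation, SampleTests and MiniRunner are all hypothetical names invented for this illustration, and the runner handles only methods that carry the annotation.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

// Hypothetical annotation naming the exception a test method should throw
@Retention(RetentionPolicy.RUNTIME)
@interface Expects {
  Class<? extends Throwable> value();
}

class SampleTests {
  @Expects(IllegalArgumentException.class)
  public void throwsAsPromised() { throw new IllegalArgumentException("No tens!"); }

  @Expects(IllegalArgumentException.class)
  public void forgetsToThrow() { }
}

class MiniRunner {
  // Invokes the named method inside a try-catch and reports whether it threw
  // an instance of the class given in its @Expects annotation
  static boolean passes(Object testObject, String methodName) {
    try {
      Method m = testObject.getClass().getMethod(methodName);
      Class<? extends Throwable> expected = m.getAnnotation(Expects.class).value();
      try {
        m.invoke(testObject);
      } catch (InvocationTargetException e) {
        // Reflection wraps whatever the method threw; unwrap and compare
        return expected.isInstance(e.getCause());
      }
      return false; // nothing was thrown, so the test fails
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("could not run " + methodName, e);
    }
  }

  public static void main(String[] args) {
    SampleTests tests = new SampleTests();
    System.out.println(passes(tests, "throwsAsPromised")); // true
    System.out.println(passes(tests, "forgetsToThrow"));   // false
  }
}
```

Because the runner, not the test, owns the try-catch, the test body is just an ordinary call that the compiler can type-check.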
1. We defer discussion of non-static nested classes for now.
2. Well...this is a convenient approximation, really. The actual behavior for sufficiently long arrays is not so constant, and depends upon the actual architectural details of the computer.
3. Note that List is an interface, and the venerable ArrayList is one class that implements it.
4. There’s a twist, though: some methods, such as String.indexOf(int), take “characters” represented as type int rather than type char, because it turns out that the Java char type doesn’t have enough bits to represent every Unicode code point. However, because chars are implicitly converted to ints, you can use a character where an integer is expected with no trouble. (In the other direction it requires a cast, which is lossy because some int values don’t fit in char.)
5. Not just variables, because everything below applies to method parameters and results as well.
6. What is a reference? Most likely it’s just the memory address of the object, like a pointer in C or C++, though there are optimizations that can make the situation less simple.
7. Be careful! If any fields of that class, despite being final, are reference types that transitively contain values that aren’t final, then perhaps this type isn’t actually a value type. Or, perhaps the design of that non-final class should be revisited...