7.7

#### Lecture 6: Equality and comparison

Equality and comparison between objects

##### Overview

This lecture introduces the notion of comparing objects, either for equality or for ordering.

##### Equality of objects

Often we need to know if two objects are “equal” to each other. We are used to equality with simpler data like numbers and characters, but how do we generalize this notion for arbitrary objects? If we could, that would give us a unified way to deal with all kinds of data.

Equality (or sameness) in math should adhere to some basic properties that makes it rigorous and unambiguous. They are:

1. Reflexivity: A is equal to A (every object is equal to itself).

2. Symmetry: If A is equal to B, then B is equal to A.

3. Transitivity: If A is equal to B and B is equal to C, then A is equal to C.

Thus when we define equality for custom data (i.e. classes that we write) it is important to ensure that our definition has the above properties.

Equality in general has many flavors:

• Shallow vs deep: literally the same object, or two different objects that have the same contents? (e.g. are two different strings with the same contents equal to each other?)

• Intensional vs extensional: literally the same thing, or can be calculated/derived to the same thing? (e.g. is a time duration of 60 seconds equal to a time duration of 1 minute?)

• Nominal vs structural: are two objects of different classes that have the same values for the same instance variables equal to each other? (e.g. is a point (x,y) equal to a vector (x,y)?)

##### Testing equality in Java

There are two ways to test equality of two things in Java. The first is to use the == operator. This is routinely used with primitive types (i.e. int, float, char, etc.). When used with objects it checks for shallow equality (see above). More importantly it is not possible to change this behavior.

The second way to test equality is the equals method, and every class inherits it from the Object class. This method is used extensively by other classes in Java. For example the assertEquals method in Junit classes uses this method when presented with objects. The contains method in the List<T> interface uses this method to determine if a given object is present in the list.

Unlike the == operator we can redefine the equals method for classes so that it has the appropriate flavor from above while still ensuring reflexivity, symmetry and transitivity. When we write our own class and wish that a user use the equals method to compare two objects, we must redefine or override the equals method.

The signature of this method is public boolean equals(Object other). When we override a method we cannot change its signature in any way. Thus we must define a method that has this exact signature.

 public boolean equals(Object other) { ... }

The above method poses a problem: it can take an object of any type as an argument! Obviously we can only compare apples to apples, which means we must have a way to weed out all the other possibilities.

What does the default equals method do? It returns true if this and other are literally the same object, i.e. it checks this==other.

##### Equality in the Person class

Recall from Lecture 2 that we wrote a method public boolean same(Person that) that checked if two Person objects were the “same”. Its implementation was:

 public boolean same(Person other) { return this.firstName.equals(other.firstName) && this.lastName.equals(other.lastName) && this.yearOfBirth == other.yearOfBirth; }

As we can see, it already used the equals method defined for String objects. We can now standardize this method by replacing it with an equals method for the Person class.

First we make it backward compatible: if the default equals method returned true for two objects, our method must do so too.

 public boolean equals(Object other) { if (this==other) { return true; } ... }

Now we need to weed all the cases where the argument other is not a Person object. We can determine if a given object is of the given type by using the instanceof operator.

 public boolean equals(Object other) { if (this==other) { return true; } if (!other instanceof Person) { return false; //other cannot be equal to this as it is not a Person object! } ... }

If other instanceof Person returns true it means the other object is really of type Person. This means we can safely cast it to a Person-type variable, which allows us to access all its methods.

Remember that an object of a derived class can be assigned to a variable of a base class. This means any object in Java can be assigned to an Object-type variable. This is called upcasting and is always valid.

 public boolean equals(Object other) { if (this==other) { return true; } if (!other instanceof Person) { return false; //other cannot be equal to this as it is not a Person object! } Person otherPerson = (Person) other; //this will work return this.firstName.equals(otherPerson.firstName) && this.lastName.equals(otherPerson.lastName) && this.yearOfBirth == otherPerson.yearOfBirth; }

The instanceof operator is a way of explicitly checking the type of an object. This is precisely what we were trying to avoid through dynamic dispatch. The equals method is an exception because we literally need to know the type: we cannot substitute this test with any other programming feature like dynamic dispatch. But in general, avoid querying the type of an object if possible, and use instanceof sparingly.

Thus the general structure of the equals method is:

 public boolean equals(Object other) { if (this==other) { return true; } if (other is not of the correct type) { return false; } //cast other to the correct type of object //actual way of determining equality goes here }
##### Equality for union types

The equals method can be overridden similarly for the Circle and Rectangle classes from Lecture 3.

Suppose we subsequently create a Square shape, which we implement by extending the Rectange class as a square is a rectangle. However this raises a subtlety regarding equality: can a square and a rectangle be equal to each other? One could say “yes”, because a square of side 2 is the same shape as a rectangle with width and height 2. In this case the square can use the same equals method that it inherited from the Rectangle class.

What if we were using the shapes in an application that wanted to differentiate between different types of shapes, even rectangle and square? In this case we may be tempted to write a separate but nearly same equals method for the Square class.

 //In the Rectangle class: public boolean equals(Object other) { if (this==other) { return true; } if (!other instanceof Rectangle) { return false; } Rectangle otherR = (Rectangle) other; return (this.width == otherR.width) && (this.height == otherR.height); } //In the Square class: public boolean equals(Object other) { if (this==other) { return true; } if (!other instanceof Square) { return false; } Square otherR = (Square) other; return (this.width == otherR.width) && (this.height == otherR.height); }

Let us say we create a Rectangle object r1 and a Square object s1 such that they have the same dimensions. s1.equals(r1) returns false (as it should) because the instanceof returns false. However r1.equals(s1) returns true! Thus our implementation violates the symmetry property of equality.

If we trace the origins of this problem, they are in the fact that the parameter of equals method is of type Object, necessitating the instanceof that creates this problem.

Let us take a step back. How would we write a method that only compares a circle with another circle? We can do this by writing a method boolean equalsCircle(Circle other) in the Circle class. How can we generalize this to compare any shape to a circle? We can do this by writing the equalsCircle method with the above signature in all the shape classes. Except for the Circle class all other implementations would simply return false, as those shapes are fundamentally not circles.

How do we force all shapes to have a method boolean equalsCircle(Circle other)? We can add it to the Shape interface. But doing so exposes this method to anybody using Shape objects (as it will be public). So we add it to the abstract class AbstractShape (look at the design from Lecture 3) and make it protected.

 public abstract class AbstractShape implements Shape { protected Point2D reference; protected boolean equalsCircle(Circle other) { ... } ... }

Since possibly many child classes will have this method simply return false, we can put this default implementation here.

 public abstract class AbstractShape implements Shape { protected Point2D reference; protected boolean equalsCircle(Circle other) { return false; //by default "this" shape is not equal to a circle } ... }

By default the equalsCircle method of all shapes returns false. We override this method in the Circle class so that it compares two circles (this and other) more meaningfully.

Thus our shape classes now have a way of testing equality of any shape with a circle. We can write similar methods to compare any shape with a rectangle and a square.

 public abstract class AbstractShape implements Shape { protected Point2D reference; protected boolean equalsCircle(Circle other) { return false; //by default "this" shape is not equal to a circle } protected boolean equalsRectangle(Rectangle other) { return false; //by default "this" shape is not equal to a rectangle } protected boolean equalsSquare(Square other) { return false; //by default "this" shape is not equal to a square } ... }

This solution will allow us to compare any shape to any other shape. However a user must know which one of these methods to call. What we really need is a single method public boolean equalsShape(Shape other) that can be called on any shape with any shape as an argument.

We add this method to our Shape interface. Now consider its implementation in the Circle class.

 //In Circle class public boolean equalsShape(Shape other) { ... }

According to the template of this method, we know that this is a Circle, but we don’t know what other is. If we knew that other was of type AbstractShape we can call its equalsCircle method and pass this to it.

 //In Circle class public boolean equalsShape(Shape other) { if (other instanceof AbstractShape) { AbstractShape ashape = (AbstractShape) other; return ashape.equalsCircle(this); } return false; //since it is not AbstractShape it is not a circle either, so return false }

Depending on what kind of an object other really is, the appropriate method will be called (equalsCircle in the Circle that will meaningfully compare the two circles, or the default for the other kinds of shapes which will return false).

The above instanceof statement checks if other is a kind of AbstractShape, which means it weeds out all other types! Thus the above method is actually equivalent to an equals method!

 //In Circle class public boolean equals(Object other) { if (other instanceof AbstractShape) { AbstractShape ashape = (AbstractShape) other; return ashape.equalsCircle(this); } return false; //since it is not AbstractShape it is not a circle either, so return false }

Similarly we can implement the equals method for the Rectangle and Square classes, replacing equalsCircle with equalsRectangle and equalsSquare respectively.

Let us revisit our problematic example from before. Let us say we create a Rectangle object r1 and a Square object s1. s1.equals(r1) goes to the equals method of the Square class, which in turn calls the equalsSquare method on r1. This returns false (as it should). r1.equals(s1) goes to the equals method of the Rectangle class, which in turn calls the equalsRectangle method on s1. This returns false (as it should). Thus the symmetry property is now obeyed.

In general when we call a.equals(b) Java calls the equals method of the class of which a is, thanks to dynamic dispatch. If it is one of the shape classes, it will then call the appropriate equalsXXX method of the class of which b is, again thanks to dynamic dispatch. This technique is called double dispatch, and in this case, solves the problem of the instanceof operator.

What if we create a new type of shape, say a triangle? We can make it part of this hierarchy by saying public class Triangle extends AbstractShape. In this case, we would need to add a new method protected boolean equalsTriangle(Triangle other) to the AbstractShape class. While having to edit AbstractShape is undesirable, it is unavoidable. Note that we created the abstract class purely for better design. A user of these shape classes needs to work only with the Shape interface and concrete shape classes, never AbstractShape. Thus while editing the AbstractShape is not ideal it does not have an impact on the user code. If we create this class separate from this hierarchy by saying public class Triangle implements Shape then we can write its equals method independently, like we did for the Person class above.

##### From equality to comparison

In many situations we need to compare two objects more thoroughly: instead of equality we wish to know their ordering, i.e. is an object “lesser” or “greater” than the other. An example would be to sort a list of objects. In Lecture 4 we defined such a method for the Book class called cheaperThan(Book other) that ordered two books based on price. Similar to equality, ordering should be rigorous and unambiguous.

1. If A is less than B, then B must be greater than A.

2. If A is less than B and B is less than C, then A must be less than C.

3. If A is not less than B, and B is not less than A, then A and B must be equal.

The last property shows the relationship between ordering and equality. If we have ways of ordering objects, we must make sure that they are consistent with equality of objects.

##### Ordering in Java

Similar to how Java uses the equals method to standardize checking for equality, it has two ways of standardizing ordering. It uses these ways for operations that require ordering of two objects. Examples of this are the Collections.sort(...) method to sort List<T> objects, Arrays.sort(...) to sort arrays of objects, Arrays.binarySearch(...) to efficiently search for an element in a sorted array, etc.

##### The Comparable<T> interface

The Comparable<T> interface in Java contains a single method public int compareTo(T other). This method is expected to return a negative numbef if the calling object is lesser than the other object, a positive number if the calling object is greater than the other object, or 0 otherwise (if they are equal). The generic parameter T becomes the type of the argument of the compareTo method.

If we write a class X in Java, one way to make two of its objects comparable to each other is to make it implement the Comparable<X> interface: public class X implements Comparable<X> {...}.

##### The Comparator<T> interface

There are two situations in which using the Comparable<T> interface is difficult or infeasible. If the class that wishes to define the ordering has already been written, having it implement the Comparable<T> interface after-the-fact involves changing its code. If the user of the class does not have access to its code, then doing this is not even possible. Secondly, having the class implement this interface commits it to a particular way of comparing two objects. What if the same objects can be compared in multiple ways? For example, items on amazon.com can be sorted by name, relevance, price or user rating.

Java offers another way: writing an object whose sole purpose is to compare two objects. Such an object is called a comparator. Java facilitates this through the Comparator<T> interface. If two objects of class X must be compared, we can write a comparator as public class XComparator implements Comparator<X>. This object must now contain a method public int compare(X a, X b) that compares two objects of type X. Its return value follows the same convention as mandated by the Comparable<T> interface: -ve if a<b, +ve if a>b and 0 otherwise.

As this class is separate from the one whose objects it is comparing, it does not involve changing the latter class or even having access to its source code.