Lecture 16: Visitors

Visitors as generic function objects over union data

Introduction

Over the last few lectures, we have seen two powerful abstraction mechanisms: function objects allow us to abstract over behavior, giving us the flexibility of higher-order functions in an object-oriented setting, while generics allow us to abstract over types, defining once-and-for-all families of related types like IList<T>. We can even combine the two, defining interfaces like IFunc<A, R> that describe all function objects that take an argument of type A and return a result of type R. We saw examples of how to define function objects over simple data: for instance, IFunc<Book, String> is the interface for a function object whose apply method takes a Book and returns a String. We could then take those function objects and map() them across lists. So far, so good.
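As a reminder of how these pieces fit together, here is a minimal sketch of a function object mapped over a list of simple data. The IList, MtList and ConsList definitions are pared down to map() alone and are assumed stand-ins for the fuller versions from earlier lectures; the Book fields, the BookTitle function object and the BookDemo driver are hypothetical names chosen for this sketch.

```java
// A function object taking an A and returning an R
interface IFunc<A, R> {
  R apply(A arg);
}

// A generic list, pared down to the single method this sketch needs
interface IList<T> {
  // Produces the list of results of applying func to every item of this list
  <U> IList<U> map(IFunc<T, U> func);
}

class MtList<T> implements IList<T> {
  public <U> IList<U> map(IFunc<T, U> func) {
    return new MtList<U>();
  }
}

class ConsList<T> implements IList<T> {
  T first;
  IList<T> rest;
  ConsList(T first, IList<T> rest) {
    this.first = first;
    this.rest = rest;
  }
  public <U> IList<U> map(IFunc<T, U> func) {
    return new ConsList<U>(func.apply(this.first), this.rest.map(func));
  }
}

// A simplified Book (assumed fields: just a title and an author)
class Book {
  String title;
  String author;
  Book(String title, String author) {
    this.title = title;
    this.author = author;
  }
}

// A hypothetical function object that extracts a Book's title
class BookTitle implements IFunc<Book, String> {
  public String apply(Book b) {
    return b.title;
  }
}

class BookDemo {
  public static void main(String[] args) {
    IList<Book> books = new ConsList<Book>(new Book("HtDP", "MF"), new MtList<Book>());
    IList<String> titles = books.map(new BookTitle());
    // Walk the result by hand, since this pared-down IList has no other methods
    System.out.println(((ConsList<String>) titles).first);
  }
}
```

The template for BookTitle's apply method is easy to fill in because Book is a class with fields. The rest of this lecture is about what to do when the argument is an interface instead.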

But now we have to ask, how well do these techniques work for more complex data types? In particular, what happens if we try to write a function object that takes a value of a union data type?

16.1 Flawed attempt 1: Processing shapes with existing types of function objects

Recall our example from Lecture 4 of shapes. Let’s simplify the definitions, though, by getting rid of all the methods for now, and let’s temporarily only include Circle and Rect.

interface IShape { }
class Circle implements IShape {
  int x, y;
  int radius;
  String color;
  Circle(int x, int y, int r, String color) {
    this.x = x;
    this.y = y;
    this.radius = r;
    this.color = color;
  }
}
class Rect implements IShape {
  int x, y, w, h;
  String color;
  Rect(int x, int y, int w, int h, String color) {
    this.x = x;
    this.y = y;
    this.w = w;
    this.h = h;
    this.color = color;
  }
}

Suppose now we have an IList<IShape>, and we want to obtain the list of areas of all the shapes in the list. (Notice we got rid of all methods on IShape, above!) Recall the signature of map():

// In IList<T>:
<U> IList<U> map(IFunc<T, U> func);

Accordingly, for our IList<IShape>, map() must take a function object argument of type IFunc<IShape, Double>. But if we try to implement this function object, we are stuck:

// A function object that computes the area of IShapes
class ShapeArea implements IFunc<IShape, Double> {
  public Double apply(IShape shape) {
    /* Template:
     * Fields:
     * Methods:
     * Methods of fields:
     */
  }
}

The IShape interface currently has no methods, and certainly has no fields! So how can we implement this functionality?

16.2 Key observation: we’ve seen this problem before

The problem we’re running into in the ShapeArea’s apply method is we don’t know which specific type of shape we’re given as our argument. As a result, our template is empty, and there’s nothing we can do. We’ve encountered this problem before with shapes: when we tried to determine when two IShapes were the sameShape. In that setting, there were two techniques we could use: casting, and double-dispatch.

16.2.1 Flawed attempt 2: Casting

Do Now!
Try writing ShapeArea using casting. What works well in this approach, and what doesn’t?

We might reason that since an IShape is either a Circle or a Rect, the following code should work:

// A function object that computes the area of IShapes
class ShapeArea implements IFunc<IShape, Double> {
  public Double apply(IShape shape) {
    if (shape instanceof Circle) {
      return Math.PI * ((Circle)shape).radius * ((Circle)shape).radius;
    }
    else {
      // Multiplying by 1.0 makes the product a double, which Java can box
      // as a Double (an int product cannot be returned as a Double)
      return 1.0 * ((Rect)shape).w * ((Rect)shape).h;
    }
  }
}

This code does indeed implement the formulas for the area of circles and rectangles. But it is quite ugly having all those casts cluttering up the code. And worse, it is badly brittle — what happens when we add a Square class as another IShape? Our code will still compile, but will crash at runtime with a ClassCastException.
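To see the brittleness concretely, here is a small sketch of what happens once Square joins the union. The shape classes are pared down to the fields that area needs, IFunc is restated in the shape assumed from earlier lectures, and CastingDemo is a hypothetical driver written only for this illustration.

```java
// A function object taking an A and returning an R
interface IFunc<A, R> {
  R apply(A arg);
}

interface IShape { }

class Circle implements IShape {
  int radius;
  Circle(int radius) { this.radius = radius; }
}

class Rect implements IShape {
  int w, h;
  Rect(int w, int h) { this.w = w; this.h = h; }
}

// The newly added shape that the casting code never anticipated
class Square implements IShape {
  int size;
  Square(int size) { this.size = size; }
}

// The casting-based function object, with no change made for Square
class ShapeArea implements IFunc<IShape, Double> {
  public Double apply(IShape shape) {
    if (shape instanceof Circle) {
      return Math.PI * ((Circle) shape).radius * ((Circle) shape).radius;
    }
    else {
      // Assumes "not a Circle" means "a Rect", which is no longer true
      return 1.0 * ((Rect) shape).w * ((Rect) shape).h;
    }
  }
}

class CastingDemo {
  public static void main(String[] args) {
    ShapeArea area = new ShapeArea();
    System.out.println(area.apply(new Rect(3, 4)));  // a Rect still works
    try {
      area.apply(new Square(5));  // compiles fine, but the cast to Rect fails
    }
    catch (ClassCastException e) {
      System.out.println("Square reached the Rect branch: ClassCastException");
    }
  }
}
```

Nothing in ShapeArea had to change for this program to compile, and that is exactly the problem: the compiler gives us no hint that a case is missing.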

16.3 Specific functions on shapes

Let’s take a step back from making everything too generic, and let’s instead try to design an IShape2DoubleFunc interface that represents a function that takes an IShape and returns a Double. (There are no type parameters involved here, yet.) Now we have the flexibility to change this interface as needed.

When we were working with sameShape, we started by writing simpler methods like sameCircle or sameRect. Let’s try the same notion here: instead of having one apply(IShape) method, we’ll have simpler helper methods.

// Represents a function object defined over Shapes that returns a Double
interface IShape2DoubleFunc {
  double applyToCircle(Circle circle);
  double applyToRect(Rect rect);
}

Now each of these methods should be easier to implement, because we have arguments with specific types instead of the general IShape interface:

// Implements a function taking a Shape and returning a Double,
// that computes the area of the given shape
class ShapeArea implements IShape2DoubleFunc {
  public double applyToCircle(Circle circle) {
    return Math.PI * circle.radius * circle.radius;
  }
  public double applyToRect(Rect rect) {
    return rect.w * rect.h;
  }
}

The only remaining trick is to figure out how to use this object! We manage this by adding one method to the IShape interface and implementing it on each shape class. If the purpose of the methods in the IShape2DoubleFunc is to “apply a function to a shape”, then the purpose of this new method in the IShape interface is to “be applied to by some function”:

interface IShape {
  // To return the result of applying the given function to this shape
  double beAppliedToBy(IShape2DoubleFunc func);
}

// In Circle
// To return the result of applying the given function to this Circle
public double beAppliedToBy(IShape2DoubleFunc func) {
  return func.applyToCircle(this);
}

// In Rect
// To return the result of applying the given function to this Rect
public double beAppliedToBy(IShape2DoubleFunc func) {
  return func.applyToRect(this);
}

Exercise
Compare this implementation to the implementation of sameShape, and figure out what methods are the analogues of sameShape, sameCircle and sameRect, and what roles the IShape and IShape2DoubleFunc interfaces play.

Notice that this code is substantially cleaner than the version with casts above, and it cannot possibly fail at runtime with a ClassCastException. Even better, if we need to add new kinds of IShapes, our code will still not crash at runtime! Instead, it will fail to compile, because we won’t have any way to implement beAppliedToBy for the new class. This is much better: it means that we’re using Java’s type system to prevent us from forgetting parts of our implementation.
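Putting the fragments together, the following sketch shows how a client would actually invoke the function object. The shape classes are pared down to the fields that area needs (an assumption made to keep the sketch short), and DispatchDemo is a hypothetical driver.

```java
// Represents a function object defined over shapes that returns a double
interface IShape2DoubleFunc {
  double applyToCircle(Circle circle);
  double applyToRect(Rect rect);
}

interface IShape {
  // To return the result of applying the given function to this shape
  double beAppliedToBy(IShape2DoubleFunc func);
}

class Circle implements IShape {
  int radius;
  Circle(int radius) { this.radius = radius; }
  public double beAppliedToBy(IShape2DoubleFunc func) {
    return func.applyToCircle(this);
  }
}

class Rect implements IShape {
  int w, h;
  Rect(int w, int h) { this.w = w; this.h = h; }
  public double beAppliedToBy(IShape2DoubleFunc func) {
    return func.applyToRect(this);
  }
}

class ShapeArea implements IShape2DoubleFunc {
  public double applyToCircle(Circle circle) {
    return Math.PI * circle.radius * circle.radius;
  }
  public double applyToRect(Rect rect) {
    return rect.w * rect.h;
  }
}

class DispatchDemo {
  public static void main(String[] args) {
    IShape s = new Rect(3, 4);  // the static type is only IShape
    // First dispatch: s selects Rect's beAppliedToBy.
    // Second dispatch: that method calls applyToRect on the function object.
    System.out.println(s.beAppliedToBy(new ShapeArea()));
  }
}
```

The caller never needs to know which kind of shape it is holding: the shape itself picks the right method of the function object, with no instanceof tests and no casts.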

Do Now!
Suppose we add the following class:
class Square implements IShape {
  int x, y, size;
  String color;
  Square(int x, int y, int size, String color) {
    this.x = x;
    this.y = y;
    this.size = size;
    this.color = color;
  }
}
What changes do you need to make to extend the ShapeArea class to handle this new case?

// In IShape2DoubleFunc
double applyToSquare(Square square);

// In ShapeArea
public double applyToSquare(Square square) {
  return square.size * square.size;
}

// In Square
// To return the result of applying the given function to this Square
public double beAppliedToBy(IShape2DoubleFunc func) {
  return func.applyToSquare(this);
}

Exercise
Do the same thing, this time trying to compute the perimeters of the shapes.

16.4 Introducing the Visitor pattern

Now, as with our initial implementation of sameShape via double dispatch, the names of the methods above are not the typical names. Also, we can take another look at where parameterizing our data types might be worthwhile. Having a specialized set of methods for IShapes seems like something we’ll need to keep, but the return type of double seems like something that’s easily made generic.

The proper name for this pattern of double-dispatch with function objects is the visitor pattern. Suppose we have an interface IFoo, and classes X, Y and Z that implement this interface. We define a visitor for this interface:

// To implement a function over Foo objects, returning a value of type R
interface IFooVisitor<R> {
  R visitX(X x);
  R visitY(Y y);
  R visitZ(Z z);
}

In the IFoo interface, we need to add one method to accept the visitor:

// In IFoo:
// To return the result of applying the given visitor to this Foo
<R> R accept(IFooVisitor<R> visitor);

Finally, in each class, we implement this method in the “obvious” way by matching up names:

// In X:
// To return the result of applying the given visitor to this X
public <R> R accept(IFooVisitor<R> visitor) { return visitor.visitX(this); }

// In Y:
// To return the result of applying the given visitor to this Y
public <R> R accept(IFooVisitor<R> visitor) { return visitor.visitY(this); }

// In Z:
// To return the result of applying the given visitor to this Z
public <R> R accept(IFooVisitor<R> visitor) { return visitor.visitZ(this); }

Do Now!
Following this pattern, define an IShapeVisitor<R> interface for our shapes, and rewrite the ShapeArea class to implement IShapeVisitor<Double> instead of the IShape2DoubleFunc interface.

16.5 Discussion

Visitors are nothing more than the natural answer to the question, “how do we make function objects work with union data types?” That answer combines double dispatch, generics, and function objects, which are all concepts we’ve seen already: this pattern is just a subtle, clever combination of those pieces.

When would you need this pattern? Whenever you are defining a union data type as part of a library, whose ultimate uses you can’t completely envision. Suppose for example someone were designing a library for manipulating HTML content. They’d have an interface IHTMLTag, and roughly 90 classes, one for each of the various tag types: IATag, IBrTag, IDivTag, etc. But the sheer variety of “ways to manipulate HTML” means that the library author cannot predict all those possibilities and implement them all as methods! Instead, the author can supply the accept method that takes an IHTMLTagVisitor, and leave this interface available for clients of the library to implement however they choose. One tiny method’s worth of advance planning by the library author (the accept method) means the library is flexible and easy to use in ways the library author could not anticipate.
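The same division of labor can be sketched with our shapes standing in for the library. In the sketch below, the “library” supplies only the union data and the accept method, while the “client” adds two operations with different result types. The shape classes are pared down to a few fields, and ShapeName and IsCircle are hypothetical visitors invented for this illustration.

```java
// "Library" side: the union data, its visitor interface, and accept
interface IShapeVisitor<R> {
  R visitCircle(Circle c);
  R visitRect(Rect r);
}

interface IShape {
  // To return the result of applying the given visitor to this shape
  <R> R accept(IShapeVisitor<R> visitor);
}

class Circle implements IShape {
  int radius;
  Circle(int radius) { this.radius = radius; }
  public <R> R accept(IShapeVisitor<R> visitor) { return visitor.visitCircle(this); }
}

class Rect implements IShape {
  int w, h;
  Rect(int w, int h) { this.w = w; this.h = h; }
  public <R> R accept(IShapeVisitor<R> visitor) { return visitor.visitRect(this); }
}

// "Client" side: operations the library author never wrote

// A hypothetical visitor that describes a shape as a String
class ShapeName implements IShapeVisitor<String> {
  public String visitCircle(Circle c) { return "circle of radius " + c.radius; }
  public String visitRect(Rect r) { return r.w + "x" + r.h + " rect"; }
}

// A hypothetical visitor that answers whether a shape is a Circle
class IsCircle implements IShapeVisitor<Boolean> {
  public Boolean visitCircle(Circle c) { return true; }
  public Boolean visitRect(Rect r) { return false; }
}

class ClientDemo {
  public static void main(String[] args) {
    IShape s = new Rect(3, 4);
    System.out.println(s.accept(new ShapeName()));
    System.out.println(s.accept(new IsCircle()));
  }
}
```

Neither client visitor required touching Circle, Rect or IShape, and because accept is generic in R, a single method serves visitors returning a String, a Boolean, or anything else.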

16.6 Wait — but what about map()?

At the beginning of this lecture, we wanted to implement a function object for IShapes that we could successfully map across a list of shapes. Instead we’ve created this IShapeVisitor<T> interface, but that is not an IFunc<IShape, T> that we can use with map! Recall, we wanted this:

// A function object that computes the area of IShapes
class ShapeArea implements IFunc<IShape, Double> {
  public Double apply(IShape shape) {
    ...
  }
}

But the problem was we had nothing available in our template that would let us implement this method.

At this point, we should take advantage of another feature of Java interfaces that we have not yet needed to explore: a class is permitted to implement more than one interface. In particular, while we currently have

// A shape-visitor that computes the area of shapes
class ShapeArea implements IShapeVisitor<Double> {
  public Double visitCircle(Circle c) {
    return Math.PI * c.radius * c.radius;
  }
  // Multiplying by 1.0 makes each product a double, which Java can box
  // as a Double (an int product cannot be returned as a Double)
  public Double visitRect(Rect r) {
    return 1.0 * r.w * r.h;
  }
  public Double visitSquare(Square s) {
    return 1.0 * s.size * s.size;
  }
}

We can actually revise the class declaration to be

// A shape-visitor AND a function object that computes the area of shapes
class ShapeArea implements IShapeVisitor<Double>, IFunc<IShape, Double> {
  ...
}

The meaning of this declaration is simply that the ShapeArea class promises to implement all the methods from all the interfaces it claims it implements. We already have implementations for all the visitor methods; now we just need to add the methods for IFunc<IShape, Double>:

// A shape-visitor AND a function object that computes the area of shapes
class ShapeArea implements IShapeVisitor<Double>, IFunc<IShape, Double> {
  ...
  public Double apply(IShape shape) {
    // what to do?
  }
}

But this is precisely the method we didn’t know how to implement earlier!

Do Now!
Implement this method. You should need only a single method call.

If only there were a way to delegate to the visitor methods we have handy in this class... and now there is: that’s precisely the purpose of the accept method on IShape!

// A shape-visitor AND a function object that computes the area of shapes
class ShapeArea implements IShapeVisitor<Double>, IFunc<IShape, Double> {
  ...
  public Double apply(IShape shape) {
    return shape.accept(this);
  }
}

Think carefully through the types here. ShapeArea promises to implement IFunc<IShape, Double>, so it must have a method Double apply(IShape shape). IShapes know how to accept an IShapeVisitor<T> (for any T), and ShapeArea promises that it can be an IShapeVisitor<Double>, too. So the method call shape.accept(this) will typecheck. And so will passing a ShapeArea object to map() over an IList<IShape>, because of the IFunc promise. And now we truly do have function objects over union data, as we claimed above.

16.6.1 One last refinement: extending interfaces

It is mildly annoying that IShapeVisitor<T> and IFunc<IShape, T> are completely independent interfaces. Because of that, it’s possible to design a class that implements IShapeVisitor<T> and forgets to also implement IFunc<IShape, T>. This is especially annoying because every visitor can implement IFunc; there’s really no reason not to implement them both.

Remember that we’ve been saying that “a visitor is a function object for union data”. Let’s add just a bit of emphasis to that statement: “a visitor is-a function object for union data”. We have a mechanism for expressing when something is-a specialized kind of something else: the extends keyword. Java uses extends to express when one class is a subclass of another; it also allows us to express when one interface is an enhanced version of another interface. Any class that promises to implement the derived interface implicitly promises to implement every method in that interface, and also every method from the base interface. If we declare that our visitor interface extends the IFunc interface, then there is no longer a way to implement just the visitor interface without also remembering to implement the IFunc interface.

So the final version of our code will be:

// An IShapeVisitor is a function over IShapes
interface IShapeVisitor<R> extends IFunc<IShape, R> {
  R visitCircle(Circle c);
  R visitSquare(Square s);
  R visitRect(Rect r);
}

// ShapeArea is a function object over IShapes that computes their area
class ShapeArea implements IShapeVisitor<Double> {
  // Everything from the IShapeVisitor interface:
  public Double visitCircle(Circle c) { return Math.PI * c.radius * c.radius; }
  // Multiplying by 1.0 makes each product a double, which Java can box as a Double
  public Double visitSquare(Square s) { return 1.0 * s.size * s.size; }
  public Double visitRect(Rect r) { return 1.0 * r.w * r.h; }

  // Everything from the IFunc interface:
  public Double apply(IShape s) { return s.accept(this); }
}
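To check that this really does solve the problem we started with, here is an end-to-end sketch that maps ShapeArea across a list of shapes. The list classes are pared down to map() alone and the shape classes to the fields that area needs; both are assumed simplifications of the definitions from earlier lectures, and MapDemo is a hypothetical driver.

```java
// A function object taking an A and returning an R
interface IFunc<A, R> {
  R apply(A arg);
}

// A generic list, pared down to map() alone
interface IList<T> {
  <U> IList<U> map(IFunc<T, U> func);
}
class MtList<T> implements IList<T> {
  public <U> IList<U> map(IFunc<T, U> func) { return new MtList<U>(); }
}
class ConsList<T> implements IList<T> {
  T first;
  IList<T> rest;
  ConsList(T first, IList<T> rest) {
    this.first = first;
    this.rest = rest;
  }
  public <U> IList<U> map(IFunc<T, U> func) {
    return new ConsList<U>(func.apply(this.first), this.rest.map(func));
  }
}

// An IShapeVisitor is a function over IShapes
interface IShapeVisitor<R> extends IFunc<IShape, R> {
  R visitCircle(Circle c);
  R visitSquare(Square s);
  R visitRect(Rect r);
}
interface IShape {
  <R> R accept(IShapeVisitor<R> visitor);
}
class Circle implements IShape {
  int radius;
  Circle(int radius) { this.radius = radius; }
  public <R> R accept(IShapeVisitor<R> visitor) { return visitor.visitCircle(this); }
}
class Square implements IShape {
  int size;
  Square(int size) { this.size = size; }
  public <R> R accept(IShapeVisitor<R> visitor) { return visitor.visitSquare(this); }
}
class Rect implements IShape {
  int w, h;
  Rect(int w, int h) { this.w = w; this.h = h; }
  public <R> R accept(IShapeVisitor<R> visitor) { return visitor.visitRect(this); }
}

class ShapeArea implements IShapeVisitor<Double> {
  public Double visitCircle(Circle c) { return Math.PI * c.radius * c.radius; }
  public Double visitSquare(Square s) { return 1.0 * s.size * s.size; }
  public Double visitRect(Rect r) { return 1.0 * r.w * r.h; }
  public Double apply(IShape s) { return s.accept(this); }
}

class MapDemo {
  public static void main(String[] args) {
    IList<IShape> shapes =
        new ConsList<IShape>(new Circle(1),
            new ConsList<IShape>(new Square(2),
                new ConsList<IShape>(new Rect(3, 4), new MtList<IShape>())));
    // ShapeArea is an IFunc<IShape, Double>, so map() accepts it directly
    IList<Double> areas = shapes.map(new ShapeArea());
    // Walk the result by hand, since this pared-down IList has no other methods
    System.out.println(((ConsList<Double>) areas).first);
  }
}
```

The call to map() needs no knowledge of visitors at all: it simply calls apply on each item, and apply hands the work back to the shape through accept.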
