16 Lecture 16: Visitors
Visitors as generic function objects over union data
Introduction
Over the last few lectures, we have seen two powerful abstraction mechanisms:
function objects allow us to abstract over behavior, giving us the flexibility of
higher-order functions in an object-oriented setting, while
generics allow us to abstract over types, defining once-and-for-all families of related types like IList<T>.
We can even combine the two, defining interfaces like IFunc<A, R> that describe all function objects
that take an argument of type A and return a result of type R. We saw examples of how
to define function objects over simple data: for instance, IFunc<Book, String>
is the interface for a function object whose apply method takes a Book and returns a String.
We could then take those function objects and map() them across lists. So far, so good.
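As a reminder of how these pieces fit together, here is a minimal sketch. The fields of Book, the BookTitle class, and the MtList/ConsList list classes are assumptions made for illustration; the lecture's own definitions may differ in detail.

```java
// Sketch only: Book's fields and the list classes are assumed for illustration.
interface IFunc<A, R> {
  R apply(A arg);
}

class Book {
  String title;
  String author;

  Book(String title, String author) {
    this.title = title;
    this.author = author;
  }
}

// A function object that takes a Book and returns its title
class BookTitle implements IFunc<Book, String> {
  public String apply(Book book) {
    return book.title;
  }
}

interface IList<T> {
  // Applies the given function object to every item, producing a new list
  <U> IList<U> map(IFunc<T, U> func);
}

class MtList<T> implements IList<T> {
  public <U> IList<U> map(IFunc<T, U> func) {
    return new MtList<U>();
  }
}

class ConsList<T> implements IList<T> {
  T first;
  IList<T> rest;

  ConsList(T first, IList<T> rest) {
    this.first = first;
    this.rest = rest;
  }

  public <U> IList<U> map(IFunc<T, U> func) {
    return new ConsList<U>(func.apply(this.first), this.rest.map(func));
  }
}
```

Mapping new BookTitle() over an IList<Book> yields an IList<String> of titles, in the same order as the books.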
But now we have to ask, how well do these techniques work for more complex data types? In particular,
what happens if we try to write a function object that takes a value of a union data type?
16.1 Flawed attempt 1: Processing shapes with existing types of function objects
Recall our shapes example from Lecture 4. Let’s simplify the definitions, though, by
getting rid of all the methods for now, and let’s temporarily include only
Circle and Rect.
interface IShape { }
class Circle implements IShape {
  int x, y;
  int radius;
  String color;
  Circle(int x, int y, int r, String color) {
    this.x = x;
    this.y = y;
    this.radius = r;
    this.color = color;
  }
}
class Rect implements IShape {
  int x, y, w, h;
  String color;
  Rect(int x, int y, int w, int h, String color) {
    this.x = x;
    this.y = y;
    this.w = w;
    this.h = h;
    this.color = color;
  }
}
Suppose now we have an IList<IShape>, and we want to obtain the list of areas of
all the shapes in the list. Recall the signature of map():
<U> IList<U> map(IFunc<T, U> func);
Accordingly, for our IList<IShape>, map() must take a function object argument
of type IFunc<IShape, Double>. But if we try to implement this function object,
we are stuck:
class ShapeArea implements IFunc<IShape, Double> {
  public Double apply(IShape shape) {

  }
}
The IShape interface currently has no methods, and certainly has no fields! So how can we
implement this functionality?
16.2 Key observation: we’ve seen this problem before
The problem we’re running into in the ShapeArea’s apply method is we don’t
know which specific type of shape we’re given as our argument. As a result, our
template is empty, and there’s nothing we can do. We’ve encountered this problem before
with shapes: when we tried to determine whether two IShapes were the sameShape.
In that setting, there were two techniques we could use: casting, and double-dispatch.
16.2.1 Flawed attempt 2: Casting
Try writing ShapeArea using casting. What works well in this approach, and what doesn’t?
We might reason that since an IShape is either a Circle or a Rect, the following code should work:
class ShapeArea implements IFunc<IShape, Double> {
  public Double apply(IShape shape) {
    if (shape instanceof Circle) {
      return Math.PI * ((Circle)shape).radius * ((Circle)shape).radius;
    } else {
      // Multiply by 1.0 so the result is a double: an int cannot be boxed as a Double
      return 1.0 * ((Rect)shape).w * ((Rect)shape).h;
    }
  }
}
This code does indeed implement the formulas for the area of circles and rectangles.
But it is quite ugly having all those casts cluttering up the code.
And worse, it is badly brittle — what happens when we add a Square class as another
IShape? Our code will still compile, but will crash at runtime with a ClassCastException.
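To see the failure concretely, here is a small self-contained sketch. The shape classes are trimmed to just the fields the area needs (an assumption made to keep the example short), and Square stands in for the newly added shape.

```java
interface IFunc<A, R> {
  R apply(A arg);
}

interface IShape { }

// Trimmed-down shapes: position and color are omitted in this sketch
class Circle implements IShape {
  int radius;
  Circle(int radius) { this.radius = radius; }
}

class Rect implements IShape {
  int w, h;
  Rect(int w, int h) { this.w = w; this.h = h; }
}

// The new kind of shape that the casting code knows nothing about
class Square implements IShape {
  int size;
  Square(int size) { this.size = size; }
}

class ShapeArea implements IFunc<IShape, Double> {
  public Double apply(IShape shape) {
    if (shape instanceof Circle) {
      Circle c = (Circle) shape;
      return Math.PI * c.radius * c.radius;
    } else {
      // Assumes "not a Circle" means "a Rect": this cast throws for a Square
      Rect r = (Rect) shape;
      return 1.0 * r.w * r.h;
    }
  }
}

class CastingDemo {
  public static void main(String[] args) {
    IFunc<IShape, Double> area = new ShapeArea();
    System.out.println(area.apply(new Rect(3, 4)));
    try {
      area.apply(new Square(5));
    } catch (ClassCastException e) {
      System.out.println("Square is not a Rect: the cast failed at runtime");
    }
  }
}
```

Everything here compiles without complaint, which is exactly the problem: nothing warns us that ShapeArea has silently become incomplete.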
16.3 Specific functions on shapes
Let’s take a step back from making everything too generic, and let’s instead try to design an
IShape2DoubleFunc interface that represents a function that takes an IShape and returns a Double.
(There are no type parameters involved here, yet.) Now we have the flexibility to change this interface
as needed.
When we were working with sameShape, we started by writing simpler methods like sameCircle or sameRect.
Let’s try the same notion here: instead of having one apply(IShape) method, we’ll have simpler helper methods.
interface IShape2DoubleFunc {
  double applyToCircle(Circle circle);
  double applyToRect(Rect rect);
}
Now each of these methods should be easier to implement, because we have arguments with specific types instead of
the general IShape interface:
class ShapeArea implements IShape2DoubleFunc {
  public double applyToCircle(Circle circle) {
    return Math.PI * circle.radius * circle.radius;
  }
  public double applyToRect(Rect rect) {
    return rect.w * rect.h;
  }
}
The only remaining trick is to figure out how to use this object! We manage this by adding one method
to the IShape interface and implementing it on each shape class. If the purpose of the methods in the
IShape2DoubleFunc is to “apply a function to a shape”, then the purpose of this new method in the IShape
interface is to “be applied to by some function”:
interface IShape {
  double beAppliedToBy(IShape2DoubleFunc func);
}
// In Circle:
public double beAppliedToBy(IShape2DoubleFunc func) {
  return func.applyToCircle(this);
}
// In Rect:
public double beAppliedToBy(IShape2DoubleFunc func) {
  return func.applyToRect(this);
}
Compare this implementation to the implementation of sameShape, and
figure out what methods are the analogues of sameShape, sameCircle and sameRect,
and what roles the IShape and IShape2DoubleFunc interfaces play.
Notice that this code is substantially cleaner than the version with casts above, and it cannot possibly fail
at runtime with a ClassCastException. Even better, if we need to add new kinds of IShapes,
our code will still not crash at runtime! Instead, it will fail to compile, because we won’t have
any way to implement beAppliedToBy for the new class. This is much better: it means that we’re
using Java’s type system to prevent us from forgetting parts of our implementation.
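Putting the pieces together, a complete version might look like the following sketch. The shapes are again trimmed to the fields the computations need, and ShapePerimeter is a hypothetical second function object, added only to show that new operations require no changes to the shape classes.

```java
interface IShape2DoubleFunc {
  double applyToCircle(Circle circle);
  double applyToRect(Rect rect);
}

interface IShape {
  double beAppliedToBy(IShape2DoubleFunc func);
}

class Circle implements IShape {
  int radius;
  Circle(int radius) { this.radius = radius; }

  // A Circle knows it is a Circle, so it picks the Circle-specific method
  public double beAppliedToBy(IShape2DoubleFunc func) {
    return func.applyToCircle(this);
  }
}

class Rect implements IShape {
  int w, h;
  Rect(int w, int h) { this.w = w; this.h = h; }

  public double beAppliedToBy(IShape2DoubleFunc func) {
    return func.applyToRect(this);
  }
}

class ShapeArea implements IShape2DoubleFunc {
  public double applyToCircle(Circle circle) {
    return Math.PI * circle.radius * circle.radius;
  }
  public double applyToRect(Rect rect) {
    return rect.w * rect.h;
  }
}

// Hypothetical second operation: no shape class had to change to support it
class ShapePerimeter implements IShape2DoubleFunc {
  public double applyToCircle(Circle circle) {
    return 2 * Math.PI * circle.radius;
  }
  public double applyToRect(Rect rect) {
    return 2 * (rect.w + rect.h);
  }
}

class DoubleDispatchDemo {
  public static void main(String[] args) {
    IShape shape = new Rect(3, 4);
    System.out.println(shape.beAppliedToBy(new ShapeArea()));      // prints 12.0
    System.out.println(shape.beAppliedToBy(new ShapePerimeter())); // prints 14.0
  }
}
```

Note that the caller holds only an IShape and never inspects its class: the first dispatch (on the shape) selects beAppliedToBy, and the second (on the function object) selects the right applyTo method.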
Suppose we add the following class:
class Square implements IShape {
  int x, y, size;
  String color;
  Square(int x, int y, int size, String color) {
    this.x = x;
    this.y = y;
    this.size = size;
    this.color = color;
  }
}
What changes do you need to make to extend the ShapeArea class to handle this new case?
// In IShape2DoubleFunc:
double applyToSquare(Square square);
// In ShapeArea:
public double applyToSquare(Square square) {
  return square.size * square.size;
}
// In Square:
public double beAppliedToBy(IShape2DoubleFunc func) {
  return func.applyToSquare(this);
}
16.4 Introducing the Visitor pattern
Now, as with our initial implementation of sameShape via double dispatch, the names of the methods
above are not the typical names. Also, we can take another look at where parameterizing our data types might be
worthwhile. Having a specialized set of methods for IShapes seems like something we’ll need to keep,
but the return type of double seems like something that’s easily made generic.
The proper name for this pattern of double dispatch with function objects is the visitor pattern.
Suppose we have an interface IFoo, and classes X, Y and Z that implement this interface.
We define a visitor for this interface:
interface IFooVisitor<R> {
  R visitX(X x);
  R visitY(Y y);
  R visitZ(Z z);
}
In the IFoo interface, we need to add one method to accept the visitor:
<R> R accept(IFooVisitor<R> visitor);
Finally, in each class, we implement this method in the “obvious” way by matching up names:
// In X:
public <R> R accept(IFooVisitor<R> visitor) { return visitor.visitX(this); }
// In Y:
public <R> R accept(IFooVisitor<R> visitor) { return visitor.visitY(this); }
// In Z:
public <R> R accept(IFooVisitor<R> visitor) { return visitor.visitZ(this); }
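Applying this recipe to our shapes might look like the sketch below. The names IShapeVisitor, visitCircle, visitRect and ShapeName are assumptions that follow the naming scheme above, and the shapes are trimmed to the fields needed here. Having ShapeArea also implement IFunc<IShape, Double> is one possible way to let the same object be passed to map(); it is a design choice of this sketch, not something the pattern requires.

```java
interface IFunc<A, R> {
  R apply(A arg);
}

interface IShapeVisitor<R> {
  R visitCircle(Circle circle);
  R visitRect(Rect rect);
}

interface IShape {
  <R> R accept(IShapeVisitor<R> visitor);
}

class Circle implements IShape {
  int radius;
  Circle(int radius) { this.radius = radius; }

  public <R> R accept(IShapeVisitor<R> visitor) {
    return visitor.visitCircle(this);
  }
}

class Rect implements IShape {
  int w, h;
  Rect(int w, int h) { this.w = w; this.h = h; }

  public <R> R accept(IShapeVisitor<R> visitor) {
    return visitor.visitRect(this);
  }
}

// A visitor producing Doubles that is also usable as an ordinary function object
class ShapeArea implements IShapeVisitor<Double>, IFunc<IShape, Double> {
  public Double visitCircle(Circle circle) {
    return Math.PI * circle.radius * circle.radius;
  }
  public Double visitRect(Rect rect) {
    return 1.0 * rect.w * rect.h;
  }
  // Applying this function to a shape means asking the shape to accept this visitor
  public Double apply(IShape shape) {
    return shape.accept(this);
  }
}

// The same accept method serves a visitor with a different result type
class ShapeName implements IShapeVisitor<String> {
  public String visitCircle(Circle circle) {
    return "circle";
  }
  public String visitRect(Rect rect) {
    return "rectangle";
  }
}
```

With these definitions, an IList<IShape> could be mapped with new ShapeArea() to obtain the list of areas we originally wanted, with no casts anywhere.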
16.5 Discussion
Visitors are nothing more than the natural answer to the question, “how do we make function
objects work with union data types?” That answer combines double dispatch, generics, and function
objects, which are all concepts we’ve seen already: this pattern is just a subtle, clever combination
of those pieces.
When would you need this pattern? Whenever you are defining a union data type as part of a library,
whose ultimate uses you can’t completely envision. Suppose for example someone were designing a
library for manipulating HTML content. They’d have an interface IHTMLTag, and roughly
90 classes, one for each of the various tag types: IATag, IBrTag, IDivTag, etc.
But the sheer variety of “ways to manipulate HTML” means that the library author cannot
predict all those possibilities and implement them as methods themselves! Instead,
they can supply the accept method that takes an IHTMLTagVisitor, and leave
this interface available for clients of the library to implement however they choose.
One tiny method’s worth of advance planning by the library author — the accept method — means
the library is flexible and easy to use in ways the library author could not anticipate.
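A miniature version of such a library might look like the following sketch. It is purely hypothetical: only two tag classes are shown, and their fields, the visit method names, and the client's ToPlainText visitor are all invented for illustration rather than taken from any real HTML library.

```java
// ----- Library code (hypothetical): written once by the library author -----
interface IHTMLTagVisitor<R> {
  R visitA(IATag tag);
  R visitBr(IBrTag tag);
}

interface IHTMLTag {
  <R> R accept(IHTMLTagVisitor<R> visitor);
}

class IATag implements IHTMLTag {
  String href;
  String text;

  IATag(String href, String text) {
    this.href = href;
    this.text = text;
  }

  public <R> R accept(IHTMLTagVisitor<R> visitor) {
    return visitor.visitA(this);
  }
}

class IBrTag implements IHTMLTag {
  public <R> R accept(IHTMLTagVisitor<R> visitor) {
    return visitor.visitBr(this);
  }
}

// ----- Client code: an operation the library author never anticipated -----
class ToPlainText implements IHTMLTagVisitor<String> {
  public String visitA(IATag tag) {
    return tag.text + " (" + tag.href + ")";
  }
  public String visitBr(IBrTag tag) {
    return "\n";
  }
}
```

The client adds a whole new operation on tags without touching, or even seeing, the source of the tag classes.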