Lab 13: Optionals, Streams, and Collectors: Oh My!

Goals: To gain practice using Java’s built-in Optionals and Streams, as well as mock database interfaces for testing purposes.

There is a lot of material for you to discover in this lab. Streams are a powerful recent feature in Java, and using them tends to work best when using lambda expressions as well. You can work on either section of this lab first, then revise your code to integrate both parts. The section on Streams, and especially on Collectors at the end, is not obvious: this lab is intended for you to explore the documentation, try to decipher it, and mostly to formulate questions about what you would like to learn next. You are not expected to magically know all this information already!

13.1 A quick primer on lambda expressions

Java now has support for syntax that makes it easier to write quick, anonymous function objects. Note that these are not a new kind of value! Java still doesn’t have functions; it only has classes and methods. So instead, using lambda syntax allows you to implicitly (1) declare a class that implements a particular function interface, (2) define the implementation of that class and (3) create a single instance of that class. When lambda syntax is sufficient, it’s often much more convenient than writing out class definitions explicitly, but under the hood, it’s doing exactly the same thing as we have been doing manually all along.

13.1.1 Interfaces

As we mentioned in Digression: Function objects in recent Java, Java defines several function-object interfaces for us, which we can use instead of the ones we’ve defined ourselves:

Predicate<T> defines a boolean valued function over arguments of type T
Function<A, R> defines functions from an argument of type A to a result of type R
BiFunction<A1, A2, R> defines a function from two arguments of type A1 and A2 to a result of type R
There are plenty more, defined in the java.util.function package.
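As a rough illustration of the interfaces listed above, the sketch below writes one predicate twice (once as an explicit function-object class, once as a lambda) and then shows a Function and a BiFunction. The names IsEven, FunctionInterfaceExamples, LENGTH and LABEL are made up for this example and are not part of the lab.

```java
import java.util.function.BiFunction;
import java.util.function.Function;
import java.util.function.Predicate;

// An explicitly written function-object class, in the style used before lambdas.
class IsEven implements Predicate<Integer> {
  public boolean test(Integer n) {
    return n % 2 == 0;
  }
}

class FunctionInterfaceExamples {
  // The same predicate, once as a class instance and once as a lambda.
  static final Predicate<Integer> IS_EVEN_CLASS = new IsEven();
  static final Predicate<Integer> IS_EVEN_LAMBDA = n -> n % 2 == 0;

  // Function<A, R>: one argument in, one result out.
  static final Function<String, Integer> LENGTH = s -> s.length();

  // BiFunction<A1, A2, R>: two arguments in, one result out.
  static final BiFunction<String, Integer, String> LABEL =
      (name, count) -> name + " x" + count;

  public static void main(String[] args) {
    System.out.println(IS_EVEN_CLASS.test(4));    // true
    System.out.println(IS_EVEN_LAMBDA.test(7));   // false
    System.out.println(LENGTH.apply("stream"));   // 6
    System.out.println(LABEL.apply("ab", 3));     // ab x3
  }
}
```

Note that each interface names its single method differently: Predicate uses test, while Function and BiFunction use apply.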

In fact, Java will permit you to write lambda expressions for any interface that contains exactly one abstract method: these are called “functional interfaces” in the Java documentation. (Note: that tutorial is fairly long, but we have covered most of the ideas in it already and you should be able to read and understand most of it. We have skipped some concepts in there, but they’re not essential to today’s lab.)

13.1.2 Syntax

There are two ways to write a lambda expression:

If the body of your function is a single expression, such as “a function that takes two ints and adds them”, you can simply write
(int x, int y) -> x + y
This syntax contains the argument list, as usual, followed by an arrow, and then the expression whose value you want to return; there is no need to even write the return keyword.
If the body of your function is more complicated, and needs statements as well, such as “a function that takes two ints and returns the larger of them”, you could write
(int x, int y) -> {
  if (x > y) {
    return x;
  } else {
    return y;
  }
}
This syntax starts the same, with the argument list and an arrow, then contains a brace-delimited block of normal code.

In addition, if your argument list contains only a single argument, you may leave out the parentheses for it. And in some cases, if Java can guess the types of the arguments for you, you can even leave out the types of the arguments as well. Be careful with this last part, since it may not always guess correctly. To get a feel for how this works, start by writing out the types explicitly, and then once your code works, remove the types and see if it still compiles without errors.
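To make the syntax variations concrete, here is a small sketch that assigns each form to a variable, so that Java has a context from which to pick the interface. It uses IntBinaryOperator and IntUnaryOperator from java.util.function (two of the "plenty more" interfaces, chosen here because they work directly on ints); the class and field names are invented for the example.

```java
import java.util.function.IntBinaryOperator;
import java.util.function.IntUnaryOperator;

class LambdaSyntaxExamples {
  // Expression body with explicit argument types.
  static final IntBinaryOperator ADD = (int x, int y) -> x + y;

  // Block body: statements and explicit returns inside braces.
  static final IntBinaryOperator LARGER = (int x, int y) -> {
    if (x > y) {
      return x;
    } else {
      return y;
    }
  };

  // Argument types omitted: Java infers int from IntBinaryOperator.
  static final IntBinaryOperator ADD_INFERRED = (x, y) -> x + y;

  // A single argument with an inferred type may drop its parentheses.
  static final IntUnaryOperator DOUBLE_IT = x -> x * 2;

  public static void main(String[] args) {
    System.out.println(ADD.applyAsInt(2, 3));        // 5
    System.out.println(LARGER.applyAsInt(2, 9));     // 9
    System.out.println(DOUBLE_IT.applyAsInt(21));    // 42
  }
}
```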

13.1.3 Types

How does Java “know” which interface to pick for a given lambda expression? Basically, it guesses based on the context of your program: suppose you had an ArrayList<Circle> bubbles, and you wanted to mutably filter out items in the list that were too small. The removeIf method does this, so you start writing bubbles.removeIf(...). Based on the signature of removeIf, you need to supply a Predicate<Circle>. So if you complete your code by writing bubbles.removeIf(c -> c.area() < 10), Java will look at that lambda expression and at the definition of Predicate<Circle>, confirm that the interface contains a single method and that the method expects a Circle, and so guess that c should be a Circle. It then type-checks the body: c.area() < 10 is valid and produces a boolean, which is what the Predicate expects, so the whole method call type-checks successfully.

The precise definition of removeIf uses another sophisticated feature of Java called “bounded generics”, and its use here is quite tricky. Without getting too deeply into detail right now, the syntax removeIf(Predicate<? super T> pred) means removeIf can accept a predicate over T arguments, or a predicate over any supertype of T. This may seem backwards at first, but it makes sense: our example didn’t rely on any specific features of Circle, but rather only used methods from the IShape interface. You could have used the same exact lambda to filter an ArrayList<IShape> and it would still have type-checked. So the definition of removeIf says “I need a predicate that is no more specific than working with items of type T.”

If you had not written this lambda expression in the context of removeIf, then Java could not have known if you wanted a Predicate<Circle> or a Function<Circle, Boolean>, or an IFunc<Circle, Boolean>, or some other function object interface. Accordingly, you should use lambdas only when the enclosing expression in your program makes it obvious what their signatures should be.
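The removeIf discussion can be sketched as follows. The IShape and Circle definitions here are minimal stand-ins written only for this example (the versions from lecture may differ), and the helper names are hypothetical. The second method passes a Predicate<IShape> to an ArrayList<Circle>, which is what the ? super T bound permits.

```java
import java.util.ArrayList;
import java.util.function.Predicate;

// Minimal stand-ins for the shape types; assumed, not the lecture versions.
interface IShape {
  double area();
}

class Circle implements IShape {
  private final double radius;

  Circle(double radius) {
    this.radius = radius;
  }

  public double area() {
    return Math.PI * this.radius * this.radius;
  }
}

class RemoveIfExample {
  // Builds a list of circles with the given radii.
  static ArrayList<Circle> makeBubbles(double... radii) {
    ArrayList<Circle> bubbles = new ArrayList<>();
    for (double r : radii) {
      bubbles.add(new Circle(r));
    }
    return bubbles;
  }

  // The context (removeIf on an ArrayList<Circle>) lets Java infer that c is a Circle.
  static int countAfterRemovingSmall(double... radii) {
    ArrayList<Circle> bubbles = makeBubbles(radii);
    bubbles.removeIf(c -> c.area() < 10);
    return bubbles.size();
  }

  // A predicate over the supertype IShape is also accepted, thanks to "? super T".
  static int countUsingShapePredicate(double... radii) {
    ArrayList<Circle> bubbles = makeBubbles(radii);
    Predicate<IShape> tooSmall = s -> s.area() < 10;
    bubbles.removeIf(tooSmall);
    return bubbles.size();
  }
}
```

With radii 1, 2 and 3, only the first circle has an area below 10, so both methods leave two circles in the list.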

13.1.4 Scope, and mutation

When you write a lambda expression, you may access any variables that are currently in scope. In other words, the template for the lambda includes everything in the template for the surrounding code. But be careful! You cannot reassign a variable from within a lambda, nor can you reassign variables that the lambda mentions. (If the variables are of some object type, with mutator methods on them, you may invoke those methods; you just are not permitted to modify the variables themselves.) In other words, you must treat every variable that a lambda uses as if it were final. (This at least relaxes a rule from earlier versions of Java, where all such variables truly needed to be explicitly declared final...) This is to prevent some very tricky scenarios: there’s just no good way to make the following example work.

boolean broken() {
  ArrayList<Function<Integer, Boolean>> funcs = new ArrayList<>();
  for (int i = 0; i < 10; i += 1) {
    funcs.add(x -> (x == i)); // Here we mention the variable i, which is
                              // obviously mutable and being mutated...
  }
  return funcs.get(3).apply(6); // and here we call a function, when i has
                                // gone out of scope! So what possible value
                                // should i currently have?
}
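The broken example cannot capture i itself, but a common way to get a compilable variant is to copy the current value into a fresh local variable that is never reassigned, so each lambda captures its own fixed value. The sketch below shows that idea; the names CaptureExample, buildMatchers, fixed and current are made up for illustration.

```java
import java.util.ArrayList;
import java.util.function.Function;

class CaptureExample {
  // Builds ten functions; the n-th one checks whether its argument equals n.
  static ArrayList<Function<Integer, Boolean>> buildMatchers() {
    ArrayList<Function<Integer, Boolean>> funcs = new ArrayList<>();
    for (int i = 0; i < 10; i += 1) {
      // A new variable on every iteration, never reassigned afterwards,
      // so Java treats it as if it were declared final.
      int current = i;
      funcs.add(x -> (x == current));
    }
    return funcs;
  }

  static boolean fixed() {
    // The function at index 3 captured the value 3, so applying it to 6 gives false.
    return buildMatchers().get(3).apply(6);
  }
}
```

This changes the meaning of the program slightly: each lambda now remembers the value i had when the lambda was created, rather than sharing one mutable variable.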

13.1.5 Small examples

Try rewriting some of the simple examples from lecture to use lambdas instead of explicitly written function-object classes. Stick to functions over non-union data: lambdas don’t eliminate the need for visitors, but visitors have more than one method so they don’t play nicely with lambda syntax.

13.2 The Setup

You are now a co-op at Generyc, a very exciting new startup in Boston’s booming tech scene. You have been tasked with enhancing the company’s internal tools that allow them to track their customers and sales. The developer who began building these tools has since quit to pursue their dream of joining the circus, and left you with the following classes. (We have annotated all the fields as private final, indicating they cannot be accessed outside their declaring class, and may never be reassigned once they’ve been initialized in the constructors.)

class Customer {
  private final int id;
  private final String name;

  public Customer(int id, String name) {
    this.id = id;
    this.name = name;
  }

  public int getId() { return this.id; }
  public String getName() { return this.name; }
}

class Inventory {
  private final int id;
  private final String description;
  private final int costPerUnit;

  public Inventory(int id, String description, int costPerUnit) {
    this.id = id;
    this.description = description;
    this.costPerUnit = costPerUnit;
  }

  public int getId() { return this.id; }
  public String getDescription() { return this.description; }
  public int getCostPerUnit() { return this.costPerUnit; }
}

class Purchase {
  private final int id;
  private final List<LineItem> lineItems;
  private final int customerIdOfPurchaser;
  private final int numberOfDaysAgo;

  public Purchase(int id, List<LineItem> lineItems, int customerIdOfPurchaser, int numberOfDaysAgo) {
    this.id = id;
    this.lineItems = new ArrayList<>(lineItems);
    this.customerIdOfPurchaser = customerIdOfPurchaser;
    this.numberOfDaysAgo = numberOfDaysAgo;
  }

  public int getId() { return this.id; }
  public List<LineItem> getLineItems() { return this.lineItems; }
  public int getCustomerIdOfPurchaser() { return this.customerIdOfPurchaser; }
  public int getNumberOfDaysAgo() { return this.numberOfDaysAgo; }
}

class LineItem {
  private final int inventoryId;
  private final int numberOfItemsPurchased;

  public LineItem(int inventoryId, int numberOfItemsPurchased) {
    this.inventoryId = inventoryId;
    this.numberOfItemsPurchased = numberOfItemsPurchased;
  }

  public int getInventoryId() { return this.inventoryId; }
  public int getNumberOfItemsPurchased() { return this.numberOfItemsPurchased; }
}

Java’s built-in libraries around dates are clunky at best, so for simplicity’s sake, we will assume the database can properly instantiate the numberOfDaysAgo field in the Purchase class. Working robustly with dates is harder than you might expect, so we’ll dodge the issue for now.

The developer also created a database to store this information, as well as some interfaces with which to interact with the database:

interface CustomerDB {
  Optional<Customer> getCustomerById(int id);
  List<Customer> getAllCustomers();
}

interface InventoryDB {
  Optional<Inventory> getInventoryById(int id);
  List<Inventory> getAllInventory();
}

interface PurchaseDB {
  Optional<Purchase> getPurchaseById(int id);
  // get purchases made numberOfDaysAgo, numberOfDaysAgo - 1... today
  List<Purchase> getPurchasesSince(int numberOfDaysAgo);
  List<Purchase> getPurchasesFor(int customerId);
  List<Purchase> getPurchasesFor(List<Integer> customerIds);
  List<Purchase> getPurchasesForSince(int customerId, int numberOfDaysAgo);
  List<Purchase> getPurchasesFor(List<Integer> customerIds, int numberOfDaysAgo);
}

Retrieving the information for multiple customers at once from a database can be much faster than querying the database once per customer.

As the developer didn’t know the design recipe, however, there is neither a single example nor test to be found!

Import the Java libraries you think will be necessary to work with these classes and interfaces.
A line item is akin to a single entry on a receipt at a grocery store.
Develop examples of customers, inventories, purchases, and line items. Be sure to create a healthy amount of each.
Develop at least one kind of example per database (using anonymous classes if you prefer) that uses your examples. For the PurchaseDB, be sure to abstract when possible. The Optional#ofNullable method will also likely come in handy, as will streams.
Write tests that ensure your mock databases work as you expect them to. NOTE: while the tester library will still check private fields for you, other testing libraries like JUnit will not. Write your tests such that they never rely on the tester library implicitly checking your objects for sameness: either define an .equals method yourself and use that, or check the results of the various accessor methods.
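One possible shape for a mock database, shown here for CustomerDB only, is an anonymous class backed by an in-memory map. This is a sketch rather than the expected solution: Customer and CustomerDB are copied from the lab so the example compiles on its own, while MockCustomerDBExample and mockCustomerDB are hypothetical names. It relies on Map#get returning null for a missing key, which Optional#ofNullable turns into an empty Optional.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Copied from the lab so that this sketch is self-contained.
class Customer {
  private final int id;
  private final String name;

  public Customer(int id, String name) {
    this.id = id;
    this.name = name;
  }

  public int getId() { return this.id; }
  public String getName() { return this.name; }
}

interface CustomerDB {
  Optional<Customer> getCustomerById(int id);
  List<Customer> getAllCustomers();
}

class MockCustomerDBExample {
  // Builds an in-memory CustomerDB over the given example customers.
  static CustomerDB mockCustomerDB(List<Customer> customers) {
    Map<Integer, Customer> byId = new HashMap<>();
    for (Customer c : customers) {
      byId.put(c.getId(), c);
    }
    return new CustomerDB() {
      public Optional<Customer> getCustomerById(int id) {
        // byId.get(id) is null when the id is unknown;
        // ofNullable converts that null into Optional.empty().
        return Optional.ofNullable(byId.get(id));
      }

      public List<Customer> getAllCustomers() {
        // Return a copy so callers cannot modify the mock's data.
        return new ArrayList<>(customers);
      }
    };
  }
}
```

A test against this mock can then check accessor results, for example that looking up a known id yields a customer with the expected name and that an unknown id yields an empty Optional.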

13.3 Computation Can Be Fun!

Now that you have data to properly test on, it’s time to implement the features that have been requested. Be sure to test your methods as you go, and use streams where appropriate.

Design a class that will be responsible for computing the sales information for your company. Be sure it has all of the above database interfaces as fields.
Design a method that determines how much revenue was created over the past n days. Hint: delegate where appropriate; you are free to add methods onto the classes provided.
Design a method that given the id of a piece of inventory, determines how many units of it have been sold in the past month (which can be estimated as 30 days).
Design a method that produces the name of the customer who has spent the most money at your company. If such a customer doesn’t exist, throw an error. If there is a tie, return the name of any tying customer. Hint: Optional#map and Optional#orElseThrow will likely come in handy.
Do Now!
Why do you think the method is called map? How is it similar to mapping over a list?
(Very tricky!) Design a method that given a customer id, produces the list of ids of the pieces of inventory they have purchased the most times (which is not necessarily the same as the pieces of inventory they have purchased the largest quantities of). The method should error if such a customer does not exist. Hint: Since Stream#max only returns one optional value, you should create your own Collector in a util class which will make this method as readable as possible. The inputs to the collector’s constructor should likely be very similar to the inputs given to Stream#max.
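The hint about Optional#map and Optional#orElseThrow can be illustrated on an unrelated problem, so as not to give away the exercises above. In this sketch, Stream#max produces an Optional<String>, map transforms the value if one is present, and orElseThrow either unwraps the result or throws. The class and method names are invented for the example.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

class OptionalMapExample {
  // Produces the longest word in upper case, or throws if there are no words.
  static String longestWordShouted(List<String> words) {
    // max returns Optional.empty() when the stream has no elements.
    Optional<String> longest =
        words.stream().max(Comparator.comparingInt(String::length));
    return longest
        .map(w -> w.toUpperCase()) // applied only if a value is present
        .orElseThrow(() -> new IllegalArgumentException("no words given"));
  }
}
```

For the words "a", "stream" and "of", the result is "STREAM"; for an empty list, the method throws an IllegalArgumentException.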

Collectors are the trickiest part of working with streams. Think of them as an accumulator-based fold operation. They take a function that produces an initial accumulator value, a function that accumulates each item-being-collected onto the accumulator, and a function that can combine two accumulator values into one. (This last step is unusual for us, but is helpful in higher-performance environments.)

There are several predefined Collectors available, and you can also define your own. The easiest way to define your own is to use Collector#of and pass in three function objects.
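As a minimal sketch of the three function objects described above, the following collector gathers stream items into an ArrayList using the static Collector.of method from java.util.stream. It is deliberately simpler than the collector the exercise asks for; CollectorExample, toArrayList and shortWords are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collector;

class CollectorExample {
  // A hand-written collector that accumulates items into an ArrayList.
  static <T> Collector<T, ArrayList<T>, ArrayList<T>> toArrayList() {
    return Collector.of(
        () -> new ArrayList<T>(),          // produce the initial accumulator
        (acc, item) -> acc.add(item),      // fold one item onto the accumulator
        (left, right) -> {                 // combine two partial accumulators
          left.addAll(right);
          return left;
        });
  }

  // Keeps the words of at most three letters, using the collector above.
  static ArrayList<String> shortWords(List<String> words) {
    return words.stream()
        .filter(w -> w.length() <= 3)
        .collect(toArrayList());
  }
}
```

Collector.of also has an overload that takes a fourth function, a finisher that converts the final accumulator into a different result type, which may be useful when the accumulator needs to carry extra bookkeeping.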

13.4 Your Changing Data

There is a very clear and crucial element missing from the databases provided: data changes over time. New customers are added, new purchases are made, inventory descriptions change, etc. Update each database interface to include the following methods, and be sure to update your implementing classes and any related classes as needed. You may need to restructure your examples class to account for the fact your data is now mutable. Try to minimize the changes: keep as many fields final as possible.

interface CustomerDB {
  // add a customer to the database (and assign it a unique id)
  void addCustomer(String name);
}

interface InventoryDB {
  // add an inventory to the database (and assign it a unique id)
  void addInventory(String description, int costPerUnit);
  // update the inventory's description and/or cost (depending on which have been provided)
  // throw an error if the inventory does not exist
  void updateInventory(int id, Optional<String> newDescription, Optional<Integer> newCostPerUnit);
}

interface PurchaseDB {
  // add a purchase made today (and assign it a unique id)
  void addPurchase(List<LineItem> lineItems, int customerIdOfPurchaser);
}

Note that these methods reveal a deep flaw in the database as it was designed: a line item only keeps track of a piece of inventory and how many units were purchased, but if inventory prices can change over time, there’s no way (as of now) to determine how much a customer actually paid for that purchase. What other issues can you spot? How would you design the data differently to avoid these issues? Whenever writing programs for real-world data, always keep in mind what can and can’t change, and what kind of records may need to be retrieved far into the future.
