7.7

#### Lecture 17: Hierarchical structures

Composite Design Pattern.
Organizational trees.

##### 17.1Representing hierarchies

There are many forms of data that have a natural hierarchical organization. Examples include a file system (files and directories), a family tree, an organizational hierarchy, etc. In such examples, data can be one of many finite forms (a union). However some data is composed of one or more other forms of data, including other instances of its own. In many examples, each instance of data is part of at most one composition. This forms a hierarchical, tree-like structure.

Hierarchical structures are examples of recursive union data types with one characteristic: composition. The design and manipulation of this data has many interesting characteristics.

##### 17.2Organization hierarchy

An organization has employees in managerial and non-managerial roles. A manager, in addition to fulfilling traditional responsibilities, also supervises one or more other employees. Managers have managers too, so this forms a hierarchical organizational structure. Each employee has exactly one manager, except the CEO (who has no managers). The organization also hires employees by a fixed-term contract. These contractual employees also have managers within the organization, but cannot supervise other employees in the organization. Irrespective of the type of employees, they all have name, annual pay and gender (as reported at the time of employment) in the organization records.

An example organization is shown in the figure below.

White, green and red indicate supervisors, non-managerial and contract employees respectively.

Given the organizational data, we would like to create an organization and answer the following types of questions about it:

• How big is the organization (i.e. how many employees work for it)?

• What are the gender demographics of the organization?

• Who does the organization employ (e.g. names of all employees)?

• How many employees make 6-figure salaries (or some other threshold)?

• How many employees are scheduled to leave the organization by a certain date (or how many employees will the organization have on a certain date)?

##### 17.2.1Representation

Realistic design processes are often mixed and messy. A top-down approach starts with what is needed (when we are done) and lets it lead the way. A bottom-up approach fixes the nuts and bolts and then brings them together. It is recommended that we start with a preliminary top-down approach: specify what needs to work for the client (client-facing part). Then design specific data representations if possible. Finally implement each operation required by the client end-to-end, adding operations to the lower-level representations. Always be open to abstracting and refactoring everything but the client-facing layer.

In Lecture 7 we started with the operations of a list, and then encapsulated them into an abstract data type. Realistic scenarios are more complicated: do we proceed with a top-down (design ADT first and then let its operations lead to other representations/methods) or a bottom-up (create lower-level pieces and then encapsulate them in an ADT)? In this lecture, we will try a mixed approach: we express the organization as an ADT and also think about data representations of the employees. Then we will implement an ADT operation end-to-end, inserting methods in the employee representation and abstracting/generalizing where we can.

##### 17.2.2Representation of the organization

We first summarize our organization as an abstract data type. This ADT will offer the following operations:

1. Add an employee to the organization with name, annual pay, gender and details of the supervisor.

2. Get the size of the organization.

3. Get the number of employees of a specified gender.

4. Get the number of employees whose annual pay is above a certain amount.

5. Get a list of all employee names.

6. Get the number of employees who will be terminated by a certain date.

Now we follow our object-oriented design approach and represent the ADT with a Organization interface:

 import java.util.List; /** * This interface represents an organization. It includes methods that an * organization is expected to offer. */ public interface Organization { /** * Add an employee to this organization with the given * specifics and supervisor. This employee will not be * added to the organization if the supervisor cannot be * found. * @param name name of employee to be added * @param pay the annual pay of this employee * @param gender the gender of this employee * @param supervisorName the name of the supervisor. The supervisor should * be an existing employee */ void addEmployee(String name,double pay,Gender gender, String supervisorName); /** * Add a contract employee to this organization with the * given specifics and supervisor. This employee will not * be added to the organization if the supervisor cannot be * found. * @param name name of employee to be added * @param pay the annual pay of this employee * @param gender the gender of this employee * @param endDate the date on which this employee's contract ends * @param endMonth the month in which this employee's contract ends * @param endYear the year in which this employee's contract ends * @param supervisorName the name of the supervisor. The supervisor should * be an existing employee */ void addContractEmployee(String name,double pay,Gender gender, int endDate,int endMonth,int endYear, String supervisorName); /** * Get the size of the organization, i.e. the total number of employees in * this organization. * @return the number of employees in this organization */ int getSize(); /** * Get the number of employees of the specified gender in this organization. * @param gender the specific gender that must be counted * @return the number of employees of the specified gender in this * organization */ int getSizeByGender(Gender gender); /** * Get a list of names of all employees in this organization. * @return a list of names of all employees as a list of {@link String} */ List allEmployees(); /** * Return the number of employees whose annual pay is above the specified * amount * @param amount the lower threshold of the annual pay * @return the number of employees whose annual pay is above the specified * amount */ int countPayAbove(int amount); /** * Return the number of employees who are scheduled to be terminated before * a specific date * @param date the date of termination * @param month the month of termination * @param year the year of termination * @return the number of employees who will be scheduled before this * specific date */ int terminatedBefore(int date,int month,int year); }

Note the comments above each method signature. They specify fully how the method should act, in ideal and exceptional circumstances. Specifying the details of the purpose and contract is even more important when the method is mutating the object (e.g. addEmployee).

##### 17.2.3Representation of the employees

Gleaning from the above description, there are three distinct types of employees:

1. Supervisors: they supervise at least one other employee.

2. Non-managerial employees: they have no managerial responsibilities.

3. Contract employees: similar to non-managerial employees (in that they have no managerial responsibilities) but with a limited-term employment.

As said before, this is a union type. The supervisor employee profile, and the fact that supervisors also have a supervisor, give rise to the characteristic that "certain employees are composed of other employees".

##### 17.2.3.1Classes and interfaces

We create an interface to represent an employee (Employee). We create two implementations of it: NonManagerEmployee and Supervisor representing the first two types of employees in the above list. We also create a ContractEmployee class to represent the third type of employee. Since a contract employee seems to be a "special case" of a regular, non-managerial employee, we use inheritance to relate ContractEmployee with NonManagerEmployee. Finally we create a GenericEmployee class in anticipation that some functionality could be abstracted (similar to the duration examples from Lecture 4). This creates the following preliminary design:

The Supervisor is related to the Employee via a "has-a" relationship (a supervisor has employees, supervisees), called a composition. This is an alternative to the "is-a" relationship which is inheritance.

This is a specific example of the composite design pattern. The composite design pattern is used when data is composed of other forms of data, resulting in a whole-part hierarchy. The general design of the composite pattern is as below:

The main characteristic of the composite pattern is that different forms of (related) data can be dealt uniformly, as a union. A leaf and a composite are both components, although they are structurally different. Although a simple pattern by itself, it has one important design issue that we will encounter later in this lecture.

"Composition" has a different meaning in UML than what is discussed here. Currently we refer to composition as merely "one object has another object inside it". UML has three different kinds of such relationships. The general one is called association, and is represented by a regular arrow head as shown here. When the association implies a "whole-part" hierarchy, it is called an aggregation. Aggregation is represented by a hollow diamond arrow head (pointing to the whole). Finally the strongest association is composition. This implies a whole-part hierarchy where the part cannot exist without being associate with some whole (e.g. a student record has a transcript, but a transcript cannot exist without being associated with a student record). Composition is represented by a solid diamond arrow head (pointing to the whole).

##### 17.2.3.2Attributes

We know that each employee has a name, annual pay and gender. Gender-type data can be represented using an enumerated type, since gender is one of a finite set of options (male, female, undisclosed). We add the name, annual pay and gender as attributes in the GenericEmployee class and add accessor (getter) methods for them in the Employee interface. We keep these attributes protected so that subclasses can directly access them.

There is additional data for specific types of employees, so we add them in their respective classes:

• Supervisor: a supervisor has one or more supervisees. A supervisee can be any employee, including another supervisor. Therefore we represent a supervisee as an Employee object, and add a list of these objects in the Supervisor class.

• ContractEmployee: a contract employee has an end-of-contract date. We add this as an attribute to the ContractEmployee, as a LocalDate object.

Our design evolves to:

We can write preliminary implementations of this design. We give one constructor to each of the classes, that explicitly takes all its attributes as arguments and initializes them accordingly.

 public interface Employee { String getName(); Gender getGender(); double getAnnualPay(); } public abstract class GenericEmployee implements Employee { protected String name; protected double pay; protected Gender gender; public GenericEmployee(String name,double pay, Gender gender) { this.name = name; this.pay = pay; this.gender = gender; } @Override public String getName() { return this.name; } @Override public Gender getGender() { return this.gender; } @Override public double getAnnualPay() { return this.pay; } } public class NonManagerEmployee extends GenericEmployee { public NonManagerEmployee(String name, double pay, Gender gender) { super(name, pay, gender); } } public class ContractEmployee extends NonManagerEmployee { private LocalDate contractEndDate; public ContractEmployee(String name, double pay, Gender gender,int date, int month, int year) throws IllegalArgumentException{ super(name, pay, gender); //validate our date try { contractEndDate = LocalDate.of(year, month, date); } catch (DateTimeException e) { throw new IllegalArgumentException("Invalid contract end date"); } } } public class Supervisor extends GenericEmployee { private List supervisee; public Supervisor(String name, double pay, Gender gender) { super(name, pay, gender); supervisee = new LinkedList(); } }
##### 17.2.4Implementation of the organization

In the linked list representation (see Lecture 7) the ADT tracked the head of the list, which was the first element in it. This could be used to access the remaining list, one element at a time. Similarly we can track a tree-like structure by recording its root. In the above example, the root would be "Bob".

Accordingly we create a class (OrganizationImpl) that implements this interface. Consider how to instantiate it. The addEmployee and addContractEmployee methods mandate the name of an existing employee as the supervisor of the new employee. How would one add the first employee in this design?

We solve this problem by mandating that an organization must be created with at least one employee, and this employee (the founder or the CEO) must be specified during instantiation. This ensures that the organization has an existing employee when an employee is added using the addEmployee or addContractEmployee methods.

 public class OrganizationImpl implements Organization { private Employee root; public OrganizationImpl(String nameCEO, double pay, Gender gender) { root = new NonManagerEmployee(nameCEO,pay,gender); } }

Note that we choose to create a NonManagerEmployee object. This covers the case where the organization has only a single employee.

##### 17.2.5Example

The example below illustrates how we can use the above design to create the organization shown in the earlier figure. This clarifies how we expect a client to use this design.

Conceptually an employee can be added as a supervisee only to a supervisor. This is obvious from the attributes of our Supervisor class. How do we add a supervisee to an existing Supervisor object? We have two choices:

1. We add a addSupervisee method in the Supervisor class. As the composite and leaf are distinct, it is conceivable in general that they have operations that are unique to themselves. This retains type safety, i.e. the property that ensures that operations are offered only by those classes for which they are relevant.

However this means that we must have a Supervisor object in order to call this method. We can no longer treat Employee-type objects uniformly. We must infer the type of the object and possibly cast it to call this method. This breaks the uniformity that the composite pattern brings, and the resulting code is unable to leverage dynamic dispatch fully.

2. We add the addSupervisee method to the Employee interface. This preserves the uniformity of the composite pattern. However this means we must now implement this method in all subclasses, including those for which this operation does not make sense (e.g. NonManagerEmployee and ContractEmployee). Thus this design compromises type safety in favor of uniformity.

The above issue is inherent in the composite design pattern: do we prefer type safety or uniformity? There is no clear answer. In our case we retain the uniformity of the composite pattern so that our client does not have to bother with knowing which Employee-type object it is referring to. Therefore we add the addSupervisee method to the Employee interface. What does its method signature look like?

For simplicity, we provide this method with two arguments: the Employee object to be added into the hierarchy and an identifier for its supervisor (i.e. its name). Remember that we always start at the root: we do not have direct access to the required Supervisor object. This method returns the root of the resulting hierarchy.

Do Now!

Does this contract seem OK to you? Think about why the addSupervisee method has to return anything before reading on. Think about how the client (OrganizationImpl) would use it.

 public interface Employee { ... /** * Add the given employee as the supervisee of the employee * with the given * name. This method has no effect on the hierarchy if the * supervisor cannot be found. * @param supervisorName the name of the supervisor * @param supervisee the employee that will be supervised by this employee * @return the resulting hierarchy of this employee */ Employee addSupervisee(String supervisorName,Employee supervisee); }

Before we implement this method, we should write tests for it.

We can test that the addEmployee and addContractEmployee methods work correctly by creating a hierarchy, and then “seeing” it in some way. We observe that the Organization has the allEmployees method that returns a list of employees. We can use it to test as follows:

 @Test public void testAllEmployees() { List actualResult = startup.allEmployees(); expected = //a sequence of names assertEquals(expected,actualResult.toString()); }

Although this test seems correct, the Organization interface does not state the sequence of names of employees that must be produced by the allEmployees method. Although such a specification should be part of the method declaration, it may be deemed an “implementation detail”. This means that the designer of the interface merely mandated that such a list-producing method be part of all employee objects, but the exact sequence can be decided by a specific implementation. Java’s toString method is an example of this.

We must now implement the addSupervisee method in all subclasses. In the Supervisor class, we first check if this object is that of the supervisor (i.e. whether the supervisor name passed to this method matches this supervisor’s name). If so, we add the supervisee object to its list of supervisees (effectively making it a child of this node). If this object is not the supervisor, we recur to its children and try in each one of them.

Implementing this method in the non-supervisor classes is tricky. Consider the example above of creating a new organization. The root of the hierarchy is a NonManagerEmployee object, which is incapable of having supervisees. When we call addEmployee on this object, it must "promote" itself to a supervisor with one supervisee. The resulting hierarchy’s root will now be a Supervisor object, and the attribute root must mutate to reflect this new root. This mutation is the reason why our addSupervisee method returns a result.

Thus the implementation of this method in the leaf should

1. Check if this leaf employee is the supervisor we are looking for.

2. If this is the supervisor, we create a Supervisor object with the same details as this, add the Employee object as its supervisee and return it.

3. If this is not the supervisor, we return this object (i.e. the hierarchy is unchanged). We document this behavior clearly in the interface!

Since this implementation applies to both NonManagerEmployee and ContractEmployee we implement it in the former.

 //In NonManagerEmployee: @Override public Employee addSupervisee(String supervisorName,Employee supervisee) { if (this.name.equals(supervisorName)) { //must first "promote" this employee Supervisor newSelf = new Supervisor(this.name,this.pay,this .gender); newSelf.addSupervisee(supervisorName,supervisee); return newSelf; } return this; } //In Supervisor: @Override public Employee addSupervisee(String supervisorName,Employee supervisee) { if (this.name.equals(supervisorName)) { this.supervisee.add(supervisee); return this; } for (int i=0;i

Finally we implement this method in the OrganizationImpl class:

 //In OrganizationImpl: @Override public void addEmployee(String name, double pay, Gender gender, String supervisorName) { Employee newEmployee = new NonManagerEmployee(name,pay,gender); root = root.addSupervisee(supervisorName,newEmployee); }

The addContractEmployee method is very similar:

 //In OrganizationImpl: @Override public void addContractEmployee(String name, double pay, Gender gender, int endDate, int endMonth, int endYear, String supervisorName) { Employee newEmployee = new ContractEmployee(name,pay,gender,endDate,endMonth, endYear); root = root.addSupervisee(supervisorName,newEmployee); }

How would we test that this works? We need operations that will (in some way) allow us to “see” the hierarchy.

##### 17.2.6.3Operations in composites

A hierarchy is a self-similar structure (it is made of smaller hierarchies). This is reflected clearly in our design of a composite. Since the definition of the data is recursive, it is no surprise that the addSupervisee method was recursive in nature. We observed this in lists as well: recursive data definitions often lend themselves to recursive implementations of operations.

Every recursive operation has two components: the recursion and one or more base cases. In our design the former is put in the composite (the recursive data) and the latter are put in the leaves (non-recursive data). Our hierarchy always ends in leaves, which corresponds to the recursion "halting" in the base cases.

Remember the technique of writing recursive functions. First state the operation (in plain language) in terms of itself. Once you successfully complete this step, the design and implementation are quite straightforward.

##### 17.2.7.1Tests for sizes

We begin by writing the tests that our implementation must pass.

 @Test public void testGetSize() { assertEquals(8,startup.getSize()); assertEquals(1,startup.getSizeByGender(Gender.Female)); assertEquals(5,startup.getSizeByGender(Gender.Male)); assertEquals(2,startup.getSizeByGender(Gender.UnDisclosed)); }
##### 17.2.7.2Implementation of sizes

Getting the size of an organization amounts to counting the number of employees in it. How can one determine the number of employees in an organization? Technically we must count the number of nodes in our tree, starting from the root.

We state the operation in terms of itself: The number of employees in the hierarchy rooted at R is the sum of the number of employees in each of its sub-hierarchies (rooted at its supervisees), plus 1 (for the root itself). If an employee has no supervisees, the number of employees in its hierarchy is 1. The first sentence leads to a recursive implementation, and the second sentence is the base case!

We add a method to the Employee interface that counts the number of employees in a hierarchy starting there.

 public interface Employee { ... /** * Count the number of employees in this hierarchy * @return the number of employees in this hierarchy */ int count(); }

We notice that every employee needs to count itself, irrespective of its type. Therefore we write the base case in the GenericEmployee class. The NonManagerEmployee and ContractEmployee inherit it directly. We override (and reuse it) in the Supervisor class.

 //In GenericEmployee: @Override public int count() { return 1; } //In Supervisor: @Override public int count() { int count = 0; for (Employee c:supervisee) { count += c.count(); } return count + super.count(); }

We can also implement this using a map followed by a reduce, as follows:

 @Override public int count() { Stream stream = this.supervisee.stream(); return this.supervisee.stream() .mapToInt(b -> b.count()) .sum() + super.count(); }

Specifically we convert the list of supervisees into a stream, then map it onto a list of counts (of each sub-hierarchy) and then reduce it by summing it.

The getSizeByGender method counts the number of employees in the hierarchy that have the specified gender. We can frame the operation as follows: The number of employees in the hierarchy of the specific gender rooted at R is the sum of the number of employees of the specific gender in each of its sub-hierarchies (rooted at its supervisees), plus 1 or 0 depending on the root employee’s gender. If an employee has no supervisees, the number of nodes is 1 or 0 depending on its gender.

The above statement sounds similar to that of getSize. Indeed, its implementation is similar to it as well.

 //In Employee: /** * Count the number of employees in this hierarchy with the specified * gender. * @param gender the specified gender * @return the number of employees in this hierarchy with the * specified gender */ public int countByGender(Gender gender); //In GenericEmployee: @Override public int countByGender(Gender gender) { if (this.gender==gender) { return 1; } else return 0; } //In Supervisor: @Override @Override public int countByGender(Gender gender) { int count = 0; for (Employee c:supervisee) { count += c.countByGender(gender); } return count + super.countByGender(gender); }

Finally the getSize and getSizeByGender methods can be implemented as follows:

 //In OrganizationImpl @Override public int getSize() { return root.count(); } @Override public int getSizeByGender(Gender gender) { return root.countByGender(gender); }

Do Now!

Try to implement the countPayAbove method.

##### 17.2.8Termination of contract: terminatedBefore

This operation computes the number of employees whose employment terminates before a specific date. This only applies to contract employees, as only they have a termination date. Therefore, all other types of employees do not contribute towards the result of this operation.

We can frame this operation as a counting operation: we count the number of employees whose employment ends before a specific date. This seems very similar to the earlier count and countByGender methods. Can we abstract these operations to remove this redundancy?

##### 17.2.9Abstracting the count operation

The common aspects of all three methods is that they count employees. The first one counts unconditionally, but the other two test whether an employee passes muster (gender or termination date) before adding it to the count. We can state all of them as conditional counting methods, with the condition applying to an employee (the counting of employees is simply a condition that computes to true for every employee).

We note that implementations of both count and countByGender traverse the hierarchy one employee at a time. Therefore we can generalize it by passing it a condition that will be tested on each employee before it contributes to the accumulating count. This condition is of the form boolean condition(Employee), which is a predicate.

Based on these observations, we refactor the implementations of the employees (note that Organization does not change). We first combine count and countByGender into one method:

 public interface Employee { ... /** * Count the number of employees in this hierarchy * who fulfill the given predicate * @return the number of employees in this hierarchy that fulfill the * given predicate */ int count(Predicate condition); }

We then implement it as follows:

 //In Supervisor: @Override public int count(Predicate condition) { Stream stream = this.supervisee.stream(); return this.supervisee.stream() .mapToInt(b -> b.count(condition)) .sum() + super.count(condition); } //In GenericEmployee @Override public int count(Predicate condition) { if (condition.test(this)) { return 1; } return 0; }

Finally we change the getSize and getSizeByGender as follows:

 //In OrganizationImpl: @Override public int getSize() { return root.count(b -> true); } @Override public int getSizeByGender(Gender gender) { return root.count(b -> b.getGender()==gender); }

Both methods use the count method, with different predicates. The getSize method provides a tautological predicate (returns true for each employee) so that each employee is counted. The getSizeByGender method provides a predicate that passes only if the employee’s gender is as specified.

As the Organization interface did not change, our earlier tests can be re-run to verify that this implementation works correctly. The fact that we could do this means what we just did (from the test’s perspective) is code refactoring: we changed the internals of the OrganizationImpl class (and others) without affecting how the client uses it.

##### 17.2.10terminatedBefore: Second cut

We can now attempt to frame terminatedBefore as a count with a different condition: the employee passes if it has a date of termination and it is before the specified date.

 //In OrganizationImpl: @Override public int terminatedBefore(int date,int month,int year) { LocalDate threshold; try { threshold = LocalDate.of(year,month,date); } catch (DateTimeException e) { return 0; } return root.count( //the predicate b->{ if (b.getEmploymentEndDate().equals("XXXXXXXX")) return false; // this employee has no termination date else { LocalDate d = LocalDate.parse(b.getEmploymentEndDate(), DateTimeFormatter.ofPattern("MMddyyyy")); return d.compareTo(threshold)<0; //true if before threshold } }); }

We note the power of abstraction. We have managed to express 3 different operations in the Organization interface in terms of a single operation on employees!

Do Now!

Can you reimagine countPayAbove?

##### 17.2.11A list of all names: allEmployees

This method returns part of each employee’s data present in the hierarchy as a list, effectively "serializing" the hierarchy. There are several ways of arranging the contents of a hierarchy sequentially. This involves visiting each element of the tree and adding it a list. The documentation of allEmployees does not mandate a particular sequence, so an implementation is free to choose any valid sequence. Some options include:

• All names in level 1, followed by those in level 2, and so on. Each level can be traversed from left to right (Bob, Bill, Michelle, Mark, Amit, Tom, Chuck, Tim) or from right to left (Bob, Michelle, Bill, Tim, Chuck, Tom, Amit, Mark). These types of traversals are called "breadth-first" or "level-order".

• A top-down traversal that traverses the supervisees of a supervisor in the order it stored them (Bob, Bill, Mark, Amit, Tom, Michelle, Chuck, Tim). A bottom-up traversal is also possible (Mark, Amit, Tom, Bill, Chuck, Tim, Michelle, Bob). These types of traversals are called "depth-first".

Let us attempt a top-down traversal: The list of all employees in the hierarchy contains the employee at the top of the hierarchy, followed by the list of employees for its first supervisee, then the list of employees for its second supervisee, and so on. If an employee has no supervisees, then the list consists of only that employee.

Do Now!

Try to state the bottom-up traversal.

Although we need a list of names of employees, it seems reasonable to have a simpler method in the hierarchy to return a list of employees. We can then extract the data from it.

##### 17.2.11.1Tests for Hierarchy=>List

The tests that we wrote for the addEmployee and addContractEmployee also test the correctness of the allEmployees method:

 @Test public void testAllEmployees() { List actualResult = startup.allEmployees(); expected = "[Bob, Bill, Mark, Amit, Tom, Michelle, Chuck, Tim]"; assertEquals(expected,actualResult.toString()); }
##### 17.2.11.2Implementation for Hierarchy=>List

We first add a method to the Employee interface.

 public interface Employee { ... /** * Convert the employee hierarchy into a list. * @return the resulting list */ List toList(); }

Then we implement it in the subclasses as follows:

 //In Supervisor @Override public List toList() { List result = new ArrayList(); result.add(this); for (Employee e : supervisee) { result.addAll(e.toList()); } return result; } //In GenericEmployee @Override public List toList() { List result = new ArrayList(); result.add(this); return result; }

Finally we implement the allEmployees method. We extract a list of employees from the hierarchy and use it to form another list that contains their corresponding names. (List<Employee> => List<String>...which higher-order function is this?)

 //In OrganizationImpl: @Override public List allEmployees() { return root.toList().stream().map(e->e.getName()).collect(Collectors .toList()); }