On this page:
22.1 Finding an item in an arbitrary ArrayList
22.2 Finding an item in a sorted ArrayList
22.3 Generalizing to arbitrary element types
22.4 Sorting an ArrayList
22.5 Finding the minimum value

22 Lecture 22: ArrayLists

Binary search over sorted ArrayLists, sorting ArrayLists

In the last lecture we began implementing several functions over ArrayLists as methods in a helper utility class. We continue that work in this lecture, designing methods to find an item in an ArrayList matching a predicate, and to sort an ArrayList according to some comparator.

22.1 Finding an item in an arbitrary ArrayList

To find an item that matches a given predicate, we need to add a new method to our utility class as well. Our initial guess at a signature is
// In ArrayUtils
<T> ??? find(ArrayList<T> arr, IPred<T> whichOne) {
  ???
}
What should the return type of our method be? If we simply return the item (i.e. have a return type of T), we are limited in what we can do: we can modify the item, perhaps, but we cannot remove the item from the ArrayList itself, because removal requires specifying an index. Therefore we should return the index where we found the item. If no item is found, we could throw an exception, but since it is a common occurrence for a value not to be found, we should instead return some invalid index, rather than aborting our program:
// In ArrayUtils
// Returns the index of the first item passing the predicate,
// or -1 if no such item was found
<T> int find(ArrayList<T> arr, IPred<T> whichOne) {
  ???
}
How can we implement this method? We don’t have ConsList and MtList classes against which to dynamically dispatch methods. Even if we did, we’d need to keep count of the index we were at (using an accumulator parameter) so that we could return it. Here, we can use that index to drive the iteration as well. We define a helper method
// In ArrayUtils
// Returns the index of the first item passing the predicate at or after the
// given index, or -1 if no such item was found
<T> int findHelp(ArrayList<T> arr, IPred<T> whichOne, int index) {
  if (whichOne.apply(arr.get(index))) {
    return index;
  } else {
    return this.findHelp(arr, whichOne, index + 1);
  }
}

Do Now!

What’s wrong with this code?

We’ve forgotten our base case: this code will continue its recursion until index gets too big, at which point get will throw an exception. We need to compare index with the size of the list:
// In ArrayUtils
// Returns the index of the first item passing the predicate at or after the
// given index, or -1 if no such item was found
<T> int findHelp(ArrayList<T> arr, IPred<T> whichOne, int index) {
  if (index >= arr.size()) {
    return -1;
  } else if (whichOne.apply(arr.get(index))) {
    return index;
  } else {
    return this.findHelp(arr, whichOne, index + 1);
  }
}
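
To try the helper out, here is a minimal self-contained sketch. It assumes that IPred<T> is a one-method interface whose apply method returns a boolean (as the calls above suggest), and that find simply starts the helper at index 0. The class name FindSketch and the StartsWith predicate are hypothetical, introduced only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;

// Assumed shape of the predicate interface: one boolean-valued method
interface IPred<T> {
  boolean apply(T t);
}

// Hypothetical predicate for this example: does the string start with a prefix?
class StartsWith implements IPred<String> {
  String prefix;

  StartsWith(String prefix) {
    this.prefix = prefix;
  }

  public boolean apply(String s) {
    return s.startsWith(this.prefix);
  }
}

class FindSketch {
  // Returns the index of the first item passing the predicate,
  // or -1 if no such item was found
  <T> int find(ArrayList<T> arr, IPred<T> whichOne) {
    return this.findHelp(arr, whichOne, 0);
  }

  // Returns the index of the first item passing the predicate at or after
  // the given index, or -1 if no such item was found
  <T> int findHelp(ArrayList<T> arr, IPred<T> whichOne, int index) {
    if (index >= arr.size()) {
      return -1;
    } else if (whichOne.apply(arr.get(index))) {
      return index;
    } else {
      return this.findHelp(arr, whichOne, index + 1);
    }
  }

  public static void main(String[] args) {
    ArrayList<String> words =
        new ArrayList<String>(Arrays.asList("apple", "banana", "cherry"));
    FindSketch utils = new FindSketch();
    System.out.println(utils.find(words, new StartsWith("b"))); // 1
    System.out.println(utils.find(words, new StartsWith("z"))); // -1
  }
}
```

Note that an empty ArrayList is handled by the same base case: index 0 is already >= size 0, so the result is -1 without ever calling get.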

Do Now!

What would happen if we had used > instead of >=?

22.2 Finding an item in a sorted ArrayList

Suppose we happen to know that our ArrayList contains items that are comparable, and that the ArrayList itself is sorted. Can we do better than blindly scanning through the entire ArrayList? For concreteness, let’s assume our ArrayList is an ArrayList<String> and we’ll use the built-in comparisons on Strings. We’ll revisit this decision after we’ve developed the method, and generalize it to arbitrary element types.

To guide our intuition on a better searching algorithm, consider a well-known sorted list of strings: a dictionary, whose entries are alphabetized. Here is a sample dictionary of words, along with their 0-based indices:

   0       1       2      3    4     5       6       7         8

[apple, banana, cherry, date, fig, grape, honeydew, kiwi, watermelon]

Suppose we were searching for “grape”. We know that words beginning with ‘g’ are not likely to appear at the very front of the dictionary, nor are they likely to appear at the back. Instead we start our search somewhere in the middle of the dictionary. In this case, the middle of our dictionary is index 4, “fig”. Because the dictionary is alphabetized, and “grape” comes after “fig” in the alphabet, we now know that all indices of 4 and below will definitely not contain the word we seek. Instead, we turn our attention to indices 5 (which is one more than the middle index, 4+1) through 8 (our upper bound on which indices might contain our word). We could begin blindly scanning through all those items (and indeed, in this particular example, we’d luckily find our target on the very next try!), but our first approach of checking the “middle” index and eliminating half the dictionary in one shot worked so well; let’s try it again. This time, the middle index is 6 (or 7; either will work, but since indices must be integers, we will use integer division, allowing Java to truncate any fractional part and we’ll get 6 as our answer), “honeydew”. Since “grape” precedes “honeydew”, we now know that indices 6 and up will definitely not contain the word we seek. So we continue with indices 5 (our lower bound) through 5 (which is one less than the middle index, 6-1). Happily, index 5 contains “grape”, so we return 5 as our answer.

Do Now!

What indices would we check if we were searching for “blueberry”?

Once again, we consider the entire ArrayList, from index 0 through index 8, and start our search at the middle index 4, “fig”, which is greater than our target word. So we eliminate indices 4 and up, and focus on indices 0 (our lower bound on where to find the word) through 3 (which is 4-1). Our middle index is 1 (integer division truncates (0+3)/2), “banana”, which is less than our target, so we eliminate indices 1 and below, and focus on indices 2 (which is 1+1) through 3 (our upper bound). Now our middle index is 2, “cherry”, which is greater than “blueberry”, so we eliminate indices 2 and up, and focus on indices 2 (our lower bound) through 1 (which is 2-1). Now our bounds have crossed: our lower bound is greater than our upper bound, so there are no possible words in the dictionary that might be our target. We must not have the target word in our ArrayList; we therefore return -1.

Let’s see how to translate this description into code. This search technique, which splits the search space in half each time, is known as binary search, so we’ll implement a new method to distinguish it from our previous find operation:
// In ArrayUtils
// Returns the index of the target string in the given ArrayList, or -1 if the string is not found
// Assumes that the given ArrayList is sorted alphabetically
int binarySearch(ArrayList<String> strings, String target) {
  ???
}
Once again, we find ourselves in need of a helper method: we need to keep track of the lower and upper bounds on the indices where our target string might be found.
// In ArrayUtils
// Returns the index of the target string in the given ArrayList, or -1 if the string is not found
// Assumes that the given ArrayList is sorted alphabetically
int binarySearchHelp(ArrayList<String> strings, String target, int lowIdx, int highIdx) {
  int midIdx = (lowIdx + highIdx) / 2;
  if (target.compareTo(strings.get(midIdx)) == 0) {
    return midIdx; // found it!
  } else if (target.compareTo(strings.get(midIdx)) > 0) {
    return this.binarySearchHelp(strings, target, midIdx + 1, highIdx); // too low
  } else {
    return this.binarySearchHelp(strings, target, lowIdx, midIdx - 1); // too high
  }
}

Do Now!

What’s wrong with this code?

Once again we forgot our base case: when the indices cross, the target must not be present:
// In ArrayUtils
// Returns the index of the target string in the given ArrayList, or -1 if the string is not found
// Assumes that the given ArrayList is sorted alphabetically
int binarySearchHelp(ArrayList<String> strings, String target, int lowIdx, int highIdx) {
  int midIdx = (lowIdx + highIdx) / 2;
  if (lowIdx > highIdx) {
    return -1; // not found
  } else if (target.compareTo(strings.get(midIdx)) == 0) {
    return midIdx; // found it!
  } else if (target.compareTo(strings.get(midIdx)) > 0) {
    return this.binarySearchHelp(strings, target, midIdx + 1, highIdx); // too low
  } else {
    return this.binarySearchHelp(strings, target, lowIdx, midIdx - 1); // too high
  }
}

Do Now!

What would happen if we didn’t add or subtract 1 from midIdx in the recursive calls?

Consider searching for “clementine”, this time without adding or subtracting 1:
  • We start the search between indices 0 and 8. The middle index is 4, and “fig” is bigger than “clementine”, so we search from the lower bound to the middle index.

  • We search between indices 0 and 4. The middle index is 2, and “cherry” is smaller than “clementine”, so we search from the middle index to the upper bound.

  • We search between indices 2 and 4. The middle index is 3, and “date” is bigger than “clementine”, so we search from the lower bound to the middle index.

  • We search between indices 2 and 3. The middle index is 2, and “cherry” is smaller than “clementine”, so we search from the middle index to the upper bound.

  • We search between indices 2 and 3...

If we don’t add or subtract 1, then we can easily get stuck comparing the same items with the same upper and lower bounds indefinitely. Once again, when dealing with indices, we have to be particularly careful about our edge cases.
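
The stuck search can be observed without actually recurring forever. The following is a rough sketch, not part of the utility class: it follows the same comparisons but passes midIdx itself as the new bound, records each pair of bounds in an accumulator, and stops as soon as a pair repeats. The class name StuckSearchTrace and its methods are hypothetical names chosen for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;

class StuckSearchTrace {
  // Returns the (lowIdx..highIdx) pairs visited by a binary search that does
  // NOT add or subtract 1 from midIdx, stopping at the first repeated pair
  ArrayList<String> trace(ArrayList<String> strings, String target) {
    return this.traceHelp(strings, target, 0, strings.size() - 1,
        new ArrayList<String>());
  }

  // Accumulator: visited holds the bounds examined so far
  ArrayList<String> traceHelp(ArrayList<String> strings, String target,
      int lowIdx, int highIdx, ArrayList<String> visited) {
    String bounds = lowIdx + ".." + highIdx;
    if (lowIdx > highIdx) {
      return visited; // bounds crossed: search is over
    } else if (visited.contains(bounds)) {
      visited.add(bounds + " (repeat)"); // the real method would loop here
      return visited;
    }
    visited.add(bounds);
    int midIdx = (lowIdx + highIdx) / 2;
    int cmp = target.compareTo(strings.get(midIdx));
    if (cmp == 0) {
      return visited; // found it
    } else if (cmp > 0) {
      return this.traceHelp(strings, target, midIdx, highIdx, visited);
    } else {
      return this.traceHelp(strings, target, lowIdx, midIdx, visited);
    }
  }

  public static void main(String[] args) {
    ArrayList<String> words = new ArrayList<String>(Arrays.asList(
        "apple", "banana", "cherry", "date", "fig",
        "grape", "honeydew", "kiwi", "watermelon"));
    // prints [0..8, 0..4, 2..4, 2..3, 2..3 (repeat)]
    System.out.println(new StuckSearchTrace().trace(words, "clementine"));
  }
}
```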

Do Now!

What would happen if our exit condition were if (lowIdx >= highIdx)...?

Now that we have a working helper, we just need to invoke it from the main binarySearch method:
// In ArrayUtils
int binarySearch(ArrayList<String> strings, String target) {
  return this.binarySearchHelp(strings, target, 0, strings.size() - 1);
}
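
Putting the two methods together, the walkthroughs for “grape” and “blueberry” can be checked with a small runnable sketch. The class name BinarySearchSketch is hypothetical; as a minor variation on the helper above, this version checks the bounds before computing the middle index and stores the comparison result in a local variable rather than computing it twice.

```java
import java.util.ArrayList;
import java.util.Arrays;

class BinarySearchSketch {
  // Returns the index of the target string in the given ArrayList,
  // or -1 if the string is not found
  // Assumes that the given ArrayList is sorted alphabetically
  int binarySearch(ArrayList<String> strings, String target) {
    return this.binarySearchHelp(strings, target, 0, strings.size() - 1);
  }

  // Searches only the indices from lowIdx through highIdx (inclusive)
  int binarySearchHelp(ArrayList<String> strings, String target,
      int lowIdx, int highIdx) {
    if (lowIdx > highIdx) {
      return -1; // bounds crossed: not found
    }
    int midIdx = (lowIdx + highIdx) / 2;
    int cmp = target.compareTo(strings.get(midIdx));
    if (cmp == 0) {
      return midIdx; // found it!
    } else if (cmp > 0) {
      // target comes after the middle item: search the upper half
      return this.binarySearchHelp(strings, target, midIdx + 1, highIdx);
    } else {
      // target comes before the middle item: search the lower half
      return this.binarySearchHelp(strings, target, lowIdx, midIdx - 1);
    }
  }

  public static void main(String[] args) {
    ArrayList<String> words = new ArrayList<String>(Arrays.asList(
        "apple", "banana", "cherry", "date", "fig",
        "grape", "honeydew", "kiwi", "watermelon"));
    BinarySearchSketch utils = new BinarySearchSketch();
    System.out.println(utils.binarySearch(words, "grape"));     // 5
    System.out.println(utils.binarySearch(words, "blueberry")); // -1
  }
}
```

An empty ArrayList starts with bounds 0 through -1, which have already crossed, so the search returns -1 immediately.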

22.3 Generalizing to arbitrary element types

For completeness, here is the version of binarySearch that works for arbitrary element types. Our signature gets slightly more complicated, but the logic behind the index computations and comparisons remains the same:
// In ArrayUtils
<T> int binarySearch(ArrayList<T> arr, T target, IComparator<T> comp) {
  return this.binarySearchHelp(arr, target, comp, 0, arr.size() - 1);
}
<T> int binarySearchHelp(ArrayList<T> arr, T target, IComparator<T> comp,
    int lowIdx, int highIdx) {
  int midIdx = (lowIdx + highIdx) / 2;
  if (lowIdx > highIdx) {
    return -1;
  } else if (comp.compare(target, arr.get(midIdx)) == 0) {
    return midIdx;
  } else if (comp.compare(target, arr.get(midIdx)) > 0) {
    return this.binarySearchHelp(arr, target, comp, midIdx + 1, highIdx);
  } else {
    return this.binarySearchHelp(arr, target, comp, lowIdx, midIdx - 1);
  }
}
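
As a quick check of the generic version, here is a hedged, runnable sketch. So that it compiles on its own, it substitutes the standard java.util.Comparator for the IComparator interface used above; that substitution, and the class name GenericBinarySearchSketch, are assumptions made for this example only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;

class GenericBinarySearchSketch {
  // Returns the index of the target in the given ArrayList, or -1 if absent
  // Assumes the ArrayList is sorted according to the given comparator
  <T> int binarySearch(ArrayList<T> arr, T target, Comparator<T> comp) {
    return this.binarySearchHelp(arr, target, comp, 0, arr.size() - 1);
  }

  <T> int binarySearchHelp(ArrayList<T> arr, T target, Comparator<T> comp,
      int lowIdx, int highIdx) {
    if (lowIdx > highIdx) {
      return -1;
    }
    int midIdx = (lowIdx + highIdx) / 2;
    int cmp = comp.compare(target, arr.get(midIdx));
    if (cmp == 0) {
      return midIdx;
    } else if (cmp > 0) {
      return this.binarySearchHelp(arr, target, comp, midIdx + 1, highIdx);
    } else {
      return this.binarySearchHelp(arr, target, comp, lowIdx, midIdx - 1);
    }
  }

  public static void main(String[] args) {
    GenericBinarySearchSketch utils = new GenericBinarySearchSketch();
    Comparator<Integer> ascending = Comparator.naturalOrder();
    Comparator<Integer> descending = Comparator.reverseOrder();

    ArrayList<Integer> up =
        new ArrayList<Integer>(Arrays.asList(1, 3, 5, 7, 9, 11));
    System.out.println(utils.binarySearch(up, 7, ascending)); // 3
    System.out.println(utils.binarySearch(up, 4, ascending)); // -1

    // The list must be sorted by the SAME comparator used for searching
    ArrayList<Integer> down =
        new ArrayList<Integer>(Arrays.asList(11, 9, 7, 5, 3, 1));
    System.out.println(utils.binarySearch(down, 9, descending)); // 1
  }
}
```

The second list illustrates why the comparator is a parameter: “sorted” only has meaning relative to a particular ordering, and the search is only correct when the list and the comparator agree.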

22.4 Sorting an ArrayList

Picture a set of cards spread out in a row on a table, each with a word on it:

   0       1     2      3      4     5       6         7        8

[kiwi, cherry, apple, date, banana, fig, watermelon, grape, honeydew]

How would we sort this? There are many, many techniques we could use, but since we have only two hands to move the cards around, one of the most natural might be the following. We pick up the first card, “kiwi”, and look for the card that ought to go in that spot — “apple” — and replace “kiwi” with “apple”. Since we do not want to lose “kiwi”, and since we have to set it down again somewhere, we might as well place it in the spot where “apple” was: we exchange them.

   0       1     2      3      4     5       6         7        8

[apple, cherry, kiwi, date, banana, fig, watermelon, grape, honeydew]

Do Now!

How did we decide that “apple” was the appropriate replacement for “kiwi”?

Next, we pick up the second card, “cherry”, and look for the card that ought to go in that spot — “banana” — and exchange them.

   0       1     2      3      4     5       6         7        8

[apple, banana, kiwi, date, cherry, fig, watermelon, grape, honeydew]

Do Now!

How did we decide that “banana” was the appropriate replacement for “cherry”?

Let’s be a bit more rigorous about what we’re doing here. In the first case, when we were searching for a replacement for “kiwi”, we were looking for the smallest item of the list. In the second case, we could not possibly have been searching for the smallest item of the list, or else we’d have found “apple” again! Instead, we were searching for the smallest item of the rest of the list, beyond the location we were swapping. Why does this work? Our algorithm essentially partitions the list into two segments: the front of the list has been processed, while the back of the list remains to be processed. Moreover, the front of the list is guaranteed to be sorted.

   0       1   ||  2      3      4     5       6         7        8

[apple, banana,|| kiwi, date, cherry, fig, watermelon, grape, honeydew]

    SORTED  <--++-->  NOT YET SORTED

By searching for the smallest item of the not-yet-sorted portion of the list, and exchanging it with the first item in the not-yet-sorted portion, we have essentially sorted that one item:

                               MIN

   0       1   ||  2      3      4     5       6         7        8

[apple, banana,|| kiwi, date, cherry, fig, watermelon, grape, honeydew]

    SORTED  <--++-->  NOT YET SORTED


    Swap items at index 2 and index 4...


   0       1      2    ||  3     4     5       6         7        8

[apple, banana, cherry,|| date, kiwi, fig, watermelon, grape, honeydew]

        SORTED      <--++-->  NOT YET SORTED

Now if we repeat this process for each index in the list, we’ll have grown the sorted section to encompass the entire list: we’ll have sorted the list.

But how to do that? We cannot use a for-each loop here, because we specifically care about the indices, more than we care about the particular items. We could write our code using a recursive method and an accumulator parameter:
// In ArrayUtils
// EFFECT: Sorts the given list of strings alphabetically
void sort(ArrayList<String> arr) {
  this.sortHelp(arr, 0); // (1)
}
// EFFECT: Sorts the given list of strings alphabetically, starting at the given index
void sortHelp(ArrayList<String> arr, int minIdx) {
  if (minIdx >= arr.size()) { // (2)
    return;
  } else { // (3)
    int idxOfMinValue = ...find minimum value in not-yet-sorted part...
    this.swap(arr, minIdx, idxOfMinValue);
    this.sortHelp(arr, minIdx + 1); // (4)
  }
}
But this feels clumsy: there’s too much clutter surrounding the actual activity of sortHelp. Again, since iterating over all items by position is such a common operation, Java provides syntax to make it easier: a counted-for loop, or just a for loop. We introduce counted-for loop syntax by rewriting sort and sortHelp to use one:
// In ArrayUtils
// EFFECT: Sorts the given list of strings alphabetically
void sort(ArrayList<String> arr) {
  for (int idx = 0;          // (1)
       idx < arr.size();     // (2)
       idx = idx + 1) {      // (4)
    // (3)
    int idxOfMinValue = ...find minimum value in not-yet-sorted part...
    this.swap(arr, idx, idxOfMinValue);
  }
}

A for loop consists of four parts, which are numbered here (and their corresponding parts are numbered in the recursive version of the code). First is the initialization statement, which declares the loop variable and initializes it to its starting value. This is run only once, before the loop begins. Second is the termination condition, which is checked before every iteration of the loop body. As soon as the condition evaluates to false, the loop terminates. Third is the loop body, which is executed every iteration of the loop. Fourth is the update statement, which is executed after each loop body and is used to advance the loop variable to its next value. Read this loop aloud as “For each value of idx starting at 0 and continuing while idx < arr.size(), advancing by 1, execute the body.”
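
To watch the loop run end to end, the sketch below fills in the two missing pieces with standard library calls. These are stand-ins chosen for this illustration, not the intended design: Collections.min and List.indexOf locate the minimum of the not-yet-sorted part (the hand-written method is left as an exercise below), and Collections.swap plays the role of this.swap. The class name SelectionSortSketch is likewise hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class SelectionSortSketch {
  // EFFECT: Sorts the given list of strings alphabetically
  void sort(ArrayList<String> arr) {
    for (int idx = 0; idx < arr.size(); idx = idx + 1) {
      // View of the not-yet-sorted part: indices idx through the end
      List<String> notYetSorted = arr.subList(idx, arr.size());
      // Library stand-in for the missing find-the-minimum step; indexOf is
      // relative to the sublist, so shift the result back by idx
      int idxOfMinValue = idx + notYetSorted.indexOf(Collections.min(notYetSorted));
      // Library stand-in for this.swap(arr, idx, idxOfMinValue)
      Collections.swap(arr, idx, idxOfMinValue);
    }
  }

  public static void main(String[] args) {
    ArrayList<String> cards = new ArrayList<String>(Arrays.asList(
        "kiwi", "cherry", "apple", "date", "banana",
        "fig", "watermelon", "grape", "honeydew"));
    new SelectionSortSketch().sort(cards);
    // prints [apple, banana, cherry, date, fig, grape, honeydew, kiwi, watermelon]
    System.out.println(cards);
  }
}
```

After each pass of the loop body, indices 0 through idx hold the smallest items in order, which is exactly the growing SORTED region from the diagrams.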

The initialization, termination condition, and update statement used here are pretty typical: loops often count by ones, starting at zero and continuing until some upper bound. But these loops can be far more flexible: they could start counting at some large number, and count down to some lower bound:
for (int idx = bigNumber; idx >= smallNumber; idx = idx - 1) { ... }
or count only odd numbers:
for (int idx = smallOddNumber; idx < bigNumber; idx = idx + 2) { ... }
or anything else that the problem at hand requires.
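
As a small illustration of these two variations, the following sketch collects the values that the loop variable takes on. The class and method names (LoopVariantsSketch, countDown, oddsBelow) are hypothetical and exist only for this example.

```java
import java.util.ArrayList;

class LoopVariantsSketch {
  // Returns the integers from high down to low (inclusive), counting down by ones
  ArrayList<Integer> countDown(int high, int low) {
    ArrayList<Integer> result = new ArrayList<Integer>();
    for (int idx = high; idx >= low; idx = idx - 1) {
      result.add(idx);
    }
    return result;
  }

  // Returns every second integer starting at firstOdd, stopping before limit
  // (these are the odd numbers, provided firstOdd itself is odd)
  ArrayList<Integer> oddsBelow(int firstOdd, int limit) {
    ArrayList<Integer> result = new ArrayList<Integer>();
    for (int idx = firstOdd; idx < limit; idx = idx + 2) {
      result.add(idx);
    }
    return result;
  }

  public static void main(String[] args) {
    LoopVariantsSketch loops = new LoopVariantsSketch();
    System.out.println(loops.countDown(5, 1));  // [5, 4, 3, 2, 1]
    System.out.println(loops.oddsBelow(1, 10)); // [1, 3, 5, 7, 9]
  }
}
```

If the termination condition is false before the first iteration (for example, countDown(1, 5)), the body never runs and the result is an empty list.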

Exercise

Practice using the counted-for loop: design a method
<T> ArrayList<T> interleave(ArrayList<T> arr1, ArrayList<T> arr2)
that takes two ArrayLists of the same size, and produces an output ArrayList consisting of one item from arr1, then one from arr2, then another from arr1, etc.

Design a method
<T> ArrayList<T> unshuffle(ArrayList<T> arr)
that takes an input ArrayList and produces a new list containing the first, third, fifth ... items of the list, followed by the second, fourth, sixth ... items.

22.5 Finding the minimum value

Exercise

Design the missing method to finish the sort method above: this method should find the minimum value in the not-yet-sorted part of the given ArrayList<String>.