Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

Refactoring with Loops and Collection Pipelines: Part II, Nested Loops

DZone's Guide to

Refactoring with Loops and Collection Pipelines: Part II, Nested Loops

"Refactoring with Loops and Collection Pipelines" is a four-part series by Martin Fowler. In Part II, learn about nest loops and simplifying the pipeline.

· DevOps Zone
Free Resource

The Nexus Suite is uniquely architected for a DevOps native world and creates value early in the development pipeline, provides precise contextual controls at every phase, and accelerates DevOps innovation with automation you can trust. Read how in this ebook.

EDITOR'S NOTE: "Refactoring with Loops and Collection Pipelines" is a four-part series by Martin Fowler. You can read part one here.

The loop is the classic way of processing collections, but with the greater adoption of first-class functions in programming languages the collection pipeline is an appealing alternative. In this article I look at refactoring loops to collection pipelines with a series of small examples.


Nested Loop - Readers of Books

For a second example, I'll refactor a simple, doubly nested loop. I'll assume I have a online system that allows readers to read books. I have a data service that will tell me which books each reader has read during a particular day. This service returns a hash whose keys are identifiers of readers and values are a collection of identifiers of books

interface DataService…

  Map<String, Collection<String>> getBooksReadOn(Date date);

for this example, I'll switch to Java because I'm sick of method names with a capitalized first letter

Here's the Loop

public Set<String> getReadersOfBooks(Collection<String> readers, Collection<String> books, Date date) {
  Set<String> result = new HashSet<>();
  Map<String, Collection<String>> data = dataService.getBooksReadOn(date);
  for (Map.Entry<String, Collection<String>> e : data.entrySet()) {
    for (String bookId : books) {
      if (e.getValue().contains(bookId) && readers.contains(e.getKey())) {
         result.add(e.getKey());
      }
    }
  }
  return result;
}

I'll use my usual first step, which is to apply Extract Variable to the loop collection

public Set<String> getReadersOfBooks(Collection<String> readers, Collection<String> books, Date date) {
  Set<String> result = new HashSet<>();
  Map<String, Collection<String>> data = dataService.getBooksReadOn(date);
  final Set<Map.Entry<String, Collection<String>>> entries = data.entrySet();
  for (Map.Entry<String, Collection<String>> e : entries) {
    for (String bookId : books) {
      if (e.getValue().contains(bookId) && readers.contains(e.getKey())) {
         result.add(e.getKey());
      }
    }
  }
  return result;
}

Moves like this make me so glad that IntelliJ's automated refactoring saves me from typing that gnarly type expression.

Once I've got the initial collection into a variable, I can work on elements of the loop behavior. All the work is going on inside the conditional so I'll begin with the second clause in that conditional and move its logic to a filter.

public Set<String> getReadersOfBooks(Collection<String> readers, Collection<String> books, Date date) {
  Set<String> result = new HashSet<>();
  Map<String, Collection<String>> data = dataService.getBooksReadOn(date);
  final Set<Map.Entry<String, Collection<String>>> entries = data.entrySet().stream()
          .filter(e -> readers.contains(e.getKey()))
          .collect(Collectors.toSet());
  for (Map.Entry<String, Collection<String>> e : entries) {
    for (String bookId : books) {
      if (e.getValue().contains(bookId) && readers.contains(e.getKey()))) {
        result.add(e.getKey());
      }
    }
  }
  return result;
}

With Java's streams library, the pipeline has to end with a terminal (such as a collector)

The other clause is more tricky to move since it refers to the inner loop variable. This clause is testing to see if the value of the map entry contains any book that's also in the list of books in the method parameter. I can do this by using a set intersection. The java core classes don't include a set intersection method. I can get by using one of the common collection oriented add-ins to Java such as Guava or Apache Commons. Since this is a simple pedagogical example I'll write a crude implementation.

class Utils...

  public static <T> Set<T> intersection (Collection<T> a, Collection<T> b) {
    Set<T> result = new HashSet<T>(a);
    result.retainAll(b);
    return result;
  }

This works here, but for any substantial project, I'd use a common library.

Now I can move the clause from the loop into the pipeline.

public Set<String> getReadersOfBooks(Collection<String> readers, Collection<String> books, Date date) {
  Set<String> result = new HashSet<>();
  Map<String, Collection<String>> data = dataService.getBooksReadOn(date);
  final Set<Map.Entry<String, Collection<String>>> entries = data.entrySet().stream()
          .filter(e -> readers.contains(e.getKey()))
          .filter(e -> !Utils.intersection(e.getValue(), books).isEmpty())
          .collect(Collectors.toSet());
  for (Map.Entry<String, Collection<String>> e : entries) {
    for (String bookId : books) {
      if (e.getValue().contains(bookId) ) {
        result.add(e.getKey());
      }
    }
  }

  return result;
}

Now all the loop is doing is returning the key of map entry, so I can kill the remainder of the loop by adding a map operation to the pipeline

public Set<String> getReadersOfBooks(Collection<String> readers, Collection<String> books, Date date) {
  Set<String> result = new HashSet<>();
  Map<String, Collection<String>> data = dataService.getBooksReadOn(date);
  result = data.entrySet().stream()
          .filter(e -> readers.contains(e.getKey()))
          .filter(e -> !Utils.intersection(e.getValue(), books).isEmpty())
          .map(e -> e.getKey())
          .collect(Collectors.toSet());
  for (Map.Entry<String, Collection<String>> e : entries) {
    for (String bookId : books) {
      result.add(e.getKey());
    }
  }

  return result;
}

Then I can use Inline Temp on result.

public Set<String> getReadersOfBooks(Collection<String> readers, Collection<String> books, Date date) {
  Set<String> result = new HashSet<>();
  Map<String, Collection<String>> data = dataService.getBooksReadOn(date);
  return data.entrySet().stream()
          .filter(e -> readers.contains(e.getKey()))
          .filter(e -> !Utils.intersection(e.getValue(), books).isEmpty())
          .map(e -> e.getKey())
          .collect(Collectors.toSet());
  return result;
 }

Looking that use of intersection, I find it's rather complicated, I have to figure out what it does when I read it - which means I should extract it.

class Utils...

  public static <T> boolean hasIntersection(Collection<T> a, Collection<T> b) {
    return !intersection(a,b).isEmpty();
  }

class Service...

  public Set<String> getReadersOfBooks(Collection<String> readers, Collection<String> books, Date date) {
    Map<String, Collection<String>> data = dataService.getBooksReadOn(date);
    return data.entrySet().stream()
            .filter(e -> readers.contains(e.getKey()))
            .filter(e -> Utils.hasIntersection(e.getValue(), books))
            .map(e -> e.getKey())
            .collect(Collectors.toSet());
   }

The object-oriented approach is at a disadvantage with when you need to do something like this. Switching between static utility methods and normal methods on objects is awkward. With some languages I have a way to make it read like a method on the stream class, but I don't have that option in Java.

Despite this issue, I still find the pipeline version easier to comprehend. I could combine the filters into a single conjunction, but I usually find it easier to understand each filter as a single element.

The DevOps Zone is brought to you in partnership with Sonatype Nexus.  See how the Nexus platform infuses precise open source component intelligence into the DevOps pipeline early, everywhere, and at scale. Read how in this ebook

Topics:
big data ,loops ,refactoring loops ,nested loops

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}