Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

Breadth-First Search With Java 8 Stream API

DZone's Guide to

Breadth-First Search With Java 8 Stream API

See an example of how you can use Java 8's Stream API to traverse your data and collect elements on the go.

· Java Zone
Free Resource

Get the Edge with a Professional Java IDE. 30-day free trial.

I recently had the problem of walking through some data and collect elements on the go. I thought it would be nice to use theStream API in Java 8. After a short search on “the Internet,” I found a solution in this blog post for tree structures and in this one for recursing through a file system (which is a tree structure again). So, story told? No way! I needed to walk through graph structures (i.e., there might be cycles in it). On the contrary, the solutions above will run forever, or more precisely, they will terminate relatively fast with a StackOverflowException.

A solution is e.g. a breadth-first search (BFS). In essence, a BFS explores the search space level by level. On the go, it stores the so-called open list. open contains the nodes (on the next level) not explored, yet. Already processed nodes won’t be put into open again. This enables searching through structures with cycles.

Consider the following as an iterative solution to BFS using the Stream API:

class Node {
    Node(String name) {
        this.name = name;
    }

    String name;
    private List<Node> childs = new LinkedList<>();

    List<Node> getChilds() {
        return childs;
    }
}

class IterativeStreamBFS {
    public static void main(String args[]) {
        final Node first = new Node("first");
        Node second = new Node("second");

        first.getChilds().add(second);
        second.getChilds().add(first);

        final Set<Node> nodes = new LinkedHashSet<Node>();
        nodes.add(first);

        Set<Node> openSet = new HashSet<>(nodes);

        do {
            final Stream<Node> childNodes = openSet.stream()
            .map(Node::getChilds).flatMap(List::stream);
            openSet = childNodes.filter(p -> !nodes.contains(p))
            .collect(Collectors.toSet());
            nodes.addAll(openSet);
        }
        while(!openSet.isEmpty());
    }
}

This solution works, but has some downsides. First, it is not recursive, but iterative. This is not a bad thing for itself, but it is not very functional, something I wanted upfront when I decided to use the Stream API. Second, it is inefficient, as it isn’t lazy and creates new openSet sets in each iteration. Third, (which is related to my first point) it puts the burden onto the programmer to handle all the moving parts (i.e. handling of the openSet and the implicit return value processes). For sure we could hide a lot of this in corresponding methods, but then there is a risk of becoming inflexible.

Looking back at the recursive solutions for trees mentioned before, I came up with this one:

class Node {

    //...

    Stream<Node> flattened() {
        final Set<Node> closedSet = Collections.synchronizedSet(new HashSet<>(1000));
        return flattened(closedSet).parallel();
    }

    private Stream<Node> flattened(Set<Node> closedSet) {
        if (!closedSet.add(this)) return Stream.empty();
        else return Stream.concat(Stream.of(this),
        childs.stream().flatMap(n -> n.flattened(closedSet)));
    }
}

class LazyRecursiveStreamBFS {
    public static void main(String args[]) {
        // ...
        first.flattened().collect(Collectors.toList());
    }
}

So, this solution is lazy and it does not produce intermediate open sets (although it creates a whole bunch of streams on the go). Further on, you can parallelize it and do further stream operations on it. You can e.g. use limit(), which reduces the overall cost significantly, as the exploration is done lazily. Last but not least, its usage is very easy.

Some words to the implementation: the initialization of the HashSet with an initial capacity of 1000 is for performance reasons (so that the set must not be resized so often, for bigger structures). I use a synchronizedSet, so that the execution can be parallelized. I use only atomic operations on the set, so that I do not need any further synchronization (see line 11).

Of course, this solution can be enhanced to search through more complex structures, where not every explored object should be put into the stream. But be careful. If you search through objects that you don’t care and they may have cyclic structures, you have to do extra bookkeeping for them, or you will end in endless loops (or stack overflows) again.

Exciting.

Get the Java IDE that understands code & makes developing enjoyable. Level up your code with IntelliJ IDEA. Download the free trial.

Topics:
stream api ,breadth-first search ,java ,sets

Published at DZone with permission of Johannes Neubauer, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}