Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

How to Use Infinite Sets in Java 9

DZone's Guide to

How to Use Infinite Sets in Java 9

Are infinite sets a thing in Java 9? Read on for some insight and code pertaining to infinite sets, the Immutable StreamSet, and the Positive LongSet.

· Java Zone ·
Free Resource

Download Microservices for Java Developers: A hands-on introduction to frameworks and containers. Brought to you in partnership with Red Hat.

A Set is a collection of elements whereby any given element in the Set appears only once. More formally, a set contains no pair of elements e1 and e2 such that e1.equals(e2). We can easily create a Set in Java 9 like this:

final Set<Integer> s = Set.of(1, 2, 3);
System.out.println(s);


This might produce the following output:

[2, 3, 1]


The Set produced above is immutable, i.e. it cannot change and it is also finite because there is a distinct number of elements in the Set, namely three. The order in which the elements are returned via its read methods (such as stream(), iterator() and forEach()) is unspecified.

An Infinite Set

An infinite set contains an unlimited number of elements. One example of an infinite set is the set of all integers [..., -1, 0, 1, 2, ...] where an integer is not of a Java Integer class, but an integer according to the mathematical definition of an integer whereby there is always a larger integer n+1 for any given integer n.

There are many infinite sets, such as the set of all primes, the set of even integers, and the set of fibonacci numbers, etc. For obvious reasons, we cannot precompute and store all the elements of an infinite Java Set. If we try, we would eventually run out of memory.

A fundamental question we have to ask ourselves is: Are there actually infinite sets for the Java types we have? If we have a Set<Byte> , there are at most 256 elements in the Set and that is far from infinite. The same reasoning goes for Short and even Integer. After all, there are only about four billion different Integer objects, and if we would use a bit-map to represent membership, we could fit a Set<Integer> in just 0.5 GB. While it will be big, it is not infinite.

But, if we are talking about Long or String elements, we are approaching at least virtually infinite sets. To store a bitmap of all Longs, it would require a number of PB of internal storage. A true infinite Set would be a Set of String with all possible combinations of characters [a-z] and length.

The Immutable StreamSet

To move away from a paradigm where we store the elements of a Set, we could create an ImmutableStreamSet that defines the elements of the Set only through its stream() method. The ImmutableStreamSet could be defined as a FunctionalInterface like this:

@FunctionalInterface
public interface ImmutableStreamSet<E> extends Set<E> {

    // This is the only method we need to implements
    @Override
    public Stream<E> stream(); 

    @Override
    default int size() {
        return (int) stream().limit(Integer.MAX_VALUE).count();
    }

    @Override
    default boolean contains(Object o) {
        return stream().anyMatch(e -> Objects.equals(e, o));
    }

    @Override
    default boolean containsAll(Collection<?> c) {
        return (this == c) ? true : c.stream().allMatch(this::contains);
    }

    @Override
    default boolean isEmpty() {
        return !stream().findAny().isPresent();
    }

    @Override
    default <T> T[] toArray(T[] a) {
        return stream().collect(toList()).toArray(a);
    }

    @Override
    default Object[] toArray() {
        return stream().toArray();
    }

    @Override
    default Spliterator<E> spliterator() {
        return stream().spliterator();
    }

    @Override
    default Iterator<E> iterator() {
        return stream().iterator();
    }

    @Override
    default Stream<E> parallelStream() {
        return stream().parallel();
    }

    @Override
    default void forEach(Consumer<? super E> action) {
        stream().forEach(action);
    }

    // We are immutable
    @Override
    default boolean removeIf(Predicate<? super E> filter) {
        throw new UnsupportedOperationException();
    }

    @Override
    default void clear() {
        throw new UnsupportedOperationException();
    }

    @Override
    default boolean removeAll(Collection<?> c) {
        throw new UnsupportedOperationException();
    }

    @Override
    default boolean retainAll(Collection<?> c) {
        throw new UnsupportedOperationException();
    }

    @Override
    default boolean addAll(Collection<? extends E> c) {
        throw new UnsupportedOperationException();
    }

    @Override
    default boolean remove(Object o) {
        throw new UnsupportedOperationException();
    }

    @Override
    default boolean add(E e) {
        throw new UnsupportedOperationException();
    }

    static <E> ImmutableStreamSet<E> of(Supplier<Stream<E>> supplier) {
        // Check out GitHub to see how this Impl class is implemented
        return new ImmutableStreamSetImpl<>(supplier);
    }
}


Awesome, now we can create infinite sets by just providing a stream supplier like this:

    ImmutableStreamSet<Long> setOfAllLong
            = LongStream.rangeClosed(Long.MIN_VALUE, Long.MAX_VALUE)::boxed;


This will create a Set of all Long values (e.g. with 2^64 elements). When providing a stream supplier, it is imperative to make sure to adhere to the Set property of element uniqueness. Consider the following illegal Set:

final ImmutableStreamSet<Long> illegalSet = 
            () -> Stream.of(1l, 2l, 1l);


Clearly, 11 occurs two times in the set, which makes this object violate the Set requirements.

As we will see, it would be better to create concrete classes of the infinite sets we are considering. One particular problem with the default implementation above is that the contains() method can be very slow. Read the following sections to find out why and how we can solve it.

PositiveLongSet

Let us assume that we want to create a Set with all the positive long values and that we want to be able to use the Set efficiently with other sets and objects. This is how we can go about it:

public final class PositiveLongSet implements ImmutableStreamSet<Long> {

    public static final PositiveLongSet INSTANCE = new PositiveLongSet();

    private PositiveLongSet() {
    }

    @Override
    public Stream<Long> stream() {
        return LongStream.rangeClosed(1, Long.MAX_VALUE).boxed();
    }

    @Override
    public int size() {
        return Integer.MAX_VALUE;
    }

    @Override
    public boolean contains(Object o) {
        return SetUtil.contains(this, Long.class, other -> other > 0, o);
    }

    @Override
    public boolean isEmpty() {
        return false;
    }

    @Override
    public String toString() {
        return SetUtil.toString(this);
    }
}


Note how we comply with the formal requirement in the method size() where we return the Integer.MAX_VALUE, even though the Set is much larger. If the Set had been defined today, it is likely that the size() would have returned to a long , instead of an int. But, in the beginning of the 90s, internal RAM was usually less than 1 GB. We are using two utility methods in the class:

The SetUtil.toString() takes a Set and iterates it over the first eight elements and returns a String representation of those elements.

The SetUtil.contains() method takes a Set, the Element type class (here Long.class), and a Predicate that is called if the object we are comparing is of the given Element type class (if the object we are comparing against is null or of another type, then trivially the Set does not contain the object).

Here is what the SetUtil looks like:

final class SetUtil {

    private static final int TO_STRING_MAX_ELEMENTS = 8;

    static <E> String toString(Set<E> set) {
        final List<String> first = set.stream()
            .limit(TO_STRING_MAX_ELEMENTS + 1)
            .map(Object::toString)
            .collect(toList());

        final String endMarker = first.size() > TO_STRING_MAX_ELEMENTS ? ", ...]" : "]";

        return first.stream()
            .limit(TO_STRING_MAX_ELEMENTS)
            .collect(
                joining(", ", "[", endMarker)
            );
    }

    static <E> boolean contains(
        final Set<E> set,
        final Class<E> clazz,
        final Predicate<E> predicate,
        final Object o
    ) {
        if (o == null) {
            return false;
        }
        if (!(clazz.isAssignableFrom(o.getClass()))) {
            return false;
        }
        final E other = clazz.cast(o);
        return predicate.test(other);
    }
}


Armed with the classes ImmutableStreamSet and SetUtil , we can now easily create other infinite sets, like PostitiveEvenLongSet (not shown hereunder, try writing it by yourself), PrimeLongSet (containing all primes that can be represented by a Long), and even FibonacciLongSet (containing all fibonacci numbers that can be represented by a Long). Here is how these classes may look:

PrimeLongSet

public final class PrimeLongSet implements ImmutableStreamSet<Long> {

    public static final PrimeLongSet INSTANCE = new PrimeLongSet();

    private PrimeLongSet() {
    }

    private static final LongPredicate IS_PRIME =
        x -> LongStream.rangeClosed(2, (long) Math.sqrt(x)).allMatch(n -> x % n != 0);

    @Override
    public Stream<Long> stream() {
        return LongStream.rangeClosed(2, Long.MAX_VALUE)
            .filter(IS_PRIME)
            .boxed();
    }

    @Override
    public int size() {
        return Integer.MAX_VALUE; 
    }

    @Override
    public boolean contains(Object o) {
        return SetUtil.contains(this, Long.class, IS_PRIME::test, o);
    }

    @Override
    public boolean isEmpty() {
        return false;
    }

    @Override
    public String toString() {
        return SetUtil.toString(this);
    }
}


FibonacciLongSet

public final class FibonacciLongSet implements ImmutableStreamSet<Long> {

    public static final FibonacciLongSet INSTANCE = new FibonacciLongSet();

    private FibonacciLongSet() {
    }

    @Override
    public Stream<Long> stream() {
        return Stream.concat(
            Stream.of(0l),
            Stream.iterate(new Fibonacci(0, 1), Fibonacci::next)
                .mapToLong(Fibonacci::getAsLong)
                .takeWhile(fib -> fib > 0)
                .boxed()
        );
    }

    @Override
    public int size() {
        return 92;
    }

    @Override
    public boolean contains(Object o) {
        return SetUtil.contains(
            this,
            Long.class,
            other -> stream().anyMatch(fib -> Objects.equals(fib, other)),
            o
        );
    }

    @Override
    public boolean isEmpty() {
        return false;
    }

    @Override
    public String toString() {
        return SetUtil.toString(this);
    }

    private static class Fibonacci {

        final long beforeLast;
        final long last;

        public Fibonacci(long beforeLast, long last) {
            this.beforeLast = beforeLast;
            this.last = last;
        }

        public Fibonacci next() {
            return new Fibonacci(last, last + beforeLast);
        }

        public long getAsLong() {
            return beforeLast + last;
        }

    }

}


Note how we are using the Stream::takeWhile to break the stream when the long wraps around to a negative value. Arguably, we are "cheating" when we precompute and provide a size of 92, but, otherwise, size() would have been a bit slower.

Stitching It All Up

By providing an interface with static providers to instances of these classes, we can encapsulate our predefined sets and make sure that there is only one instance of them in the JVM like this:

public interface Sets {

    static Set<Long> positiveLongSet() {
        return PositiveLongSet.INSTANCE;
    }

    static Set<Long> positiveEvenLongSet() {
        return PositiveEvenLongSet.INSTANCE;
    }

    static Set<Long> primeLongSet() {
        return PrimeLongSet.INSTANCE;
    }

    static Set<Long> fibonacciLongSet() {
        return FibonacciLongSet.INSTANCE;
    }
}


We could also encapsulate our code in a Java 9 module to make sure only the class' Sets and ImmutableStreamSet are visible by exposing them in the projects top-most package and putting all the other classes in a package named "internal" (which is not exposed). This is how our module-info.java could look, provided that the two exposed classes are in the com.speedment.infinite_sets and the implementation classes are in a package like com.speedment.infinite_sets.internal:

module-info.java

module com.speedment.infinite_sets {
    exports com.speedment.infinite_sets;
}


Trying It Out

We can now create another module using our infinite sets by first declaring usage of our existing module like this:

module-info.java

module Infinite_sets_app {
    requires com.speedment.infinite_sets;
}


Then, we have access to the exposed parts of the module. Here is one way of trying out the infinite sets:

import static com.speedment.infinite_sets.Sets.*;
public class Main {

    public static void main(String[] args) {

        Stream.of(
            Set.of(1, 2, 3),
            positiveLongSet(),
            positiveEvenLongSet(),
            primeLongSet(),
            fibonacciLongSet()
        ).forEachOrdered(System.out::println);

        // This actually completes fast due to identity equality
        positiveLongSet().containsAll(positiveLongSet());
    }
}


This might produce the following output:

[3, 2, 1]
[1, 2, 3, 4, 5, 6, 7, 8, ...]
[2, 4, 6, 8, 10, 12, 14, 16, ...]
[2, 3, 5, 7, 11, 13, 17, 19, ...]
[0, 1, 2, 3, 5, 8, 13, 21, ...]


The source code in this post is available on GitHub.

Download Building Reactive Microservices in Java: Asynchronous and Event-Based Application Design. Brought to you in partnership with Red Hat

Topics:
java ,sets ,code ,immutable ,streams ,tutorial

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}