{{announcement.body}}
{{announcement.title}}

# Folding the Universe, Part II: Abstracting Recursion

DZone 's Guide to

# Folding the Universe, Part II: Abstracting Recursion

### How to solve the knapsack problem in Java without relying on recursion at all, except for the folds!

· Java Zone ·
Free Resource

Comment (0)

Save
{{ articles.views | formatCount}} Views

This is a series of articles complementing my book, Functional Programming in Java. The previous article in this series may be found here:

In the the previous article in this series, I showed how to write a functional Java program solving the Knapsack problem, which consists of packing a knapsack of a given capacity with items from a collection, while maximizing the knapsack's value, given that each item has a weight and an individual value. Of course, we must not exceed the knapsack capacity. This problem may be applied to dividing any quantity (the knapsack capacity) in parts (the items) while maximizing the value, which might, in some cases, mean minimizing the unused capacity (when the value of each item is proportional to its weight). There are indeed several variants of this problem, the most common two being packing with a single copy of each item and packing with as many copies as wanted of each item. An example of solving this problem in traditional imperative Java may be found here: Knapsack problem/0-1. Please have a look at it before reading this article.

## The challenge

As I said in the previous article, the challenge is to solve the problem while using no variables (only constants) and no control structures, but only using classes and methods, arithmetic operations and comparisons, and Java 8 `Function` and `BiFunction`. I showed a bi-recursive version in the previous article, and explained why it was not really usable (at least for lists of items longer that thirty elements). Our new challenge is to write this same program without using recursion at all, except for the folds, which are abstracted in the `List` class that we have defined. If you haven’t read part I, please do so before reading this article. (Folding the Universe, part I)

## How to avoid recursion

To avoid bi-recursion, we have to memorize the result of each step in order not not have to compute it again and again. It is obvious that when examining an item to see if we should put it in the knapsack, we must consider what we have already put into it in previous steps. What we will do is:

• Build a list of knapsacks of capacity varying from 1 to the given capacity, in reverse order.
• For each item in the list:
• Try to put it in all of the knapsacks. In the previous example, we tested the item weight to see if it would fit in the knapsack. Here, for each knapsack, we will first filter out the items which capacity exceeds the knapsack capacity.
• Find the knapsack corresponding to index (`item.weight - 1`) and add the item to it.
• Find the knapsack of maximum value and add it to the list
• From the resulting knapsack list, select the one with the maximum value. This is the result of the packing.

Note that if we add a knapsack to the list, we put it in front of the list. So, the knapsack with the maximum value will simply be the head of the list. This is why we select the knapsack of index (`item.weight - 1`). This would correspond to an index of `i - it.weight` if the list was build the other way, where `i` would be the index of the knapsack we are in.

Here is the corresponding code:

``````public static Knapsack pack(List<Item> items, int capacity) {
final Knapsack zero = empty(capacity); (1)
final List<Knapsack> result = List.range(capacity + 1, 1).foldRight(List.list(), (i, acc) -> { (2) (3)
Knapsack knapsack = items.filter(item -> item.weight <= i) (4)
.flatMap(item -> acc.at(item.weight - 1).map(k -> k.add(item))) (5)
.foldRight(zero, Knapsack::maxValue); (6)
return acc.cons(knapsack); (7)
});
return result.head().foldRight(zero, (a, b) -> a); (8)
}``````

(1) A "zero" knapsack is created with the given capacity. This is just a convenience to avoid calling the `Knapsack.empty()` method several times.

(2) A list of integers from the given capacity to 1 is built by calling the `range` method. This method must be added to the `List` class.

(3) This list of integers is mapped to a list of knapsacks. While constructing this list (named `acc`, for "accumulator", by convention), the previously created knapsacks are available.

(4) We filter the items so that we won’t try to insert items with a weight higher than what the knapsack may contain.

(5) We look for a previously created knapsack of index `item.weight - 1`. The `List.at(n)` method has to be created and it will return optional data, which mean a list of zero (not found) or one (found) knapsack. If this knapsack is found, the item is added to it, which is done by mapping the function `k → k.add(item)` to the singleton knapsack list.

(6) From the resulting list, we select the knapsack with the maximum value.

(7) The resulting value is added to the (head of) accumulator. This concludes the function used for folding the list of integers, in (3).

(8) Once the list of integers is folded to a list of knapsacks, we just have to select the maximum value, which is automatically the head of the list. As the head method returns a singleton list, we use a "getOrElse" technique to select either the value if it is present, or a default value if not.

The technique used here is very well known in functional programming and is called memorization. It consists in keeping access to the previous results instead of recomputing them at each step. This technique is also used in imperative programming. The particularity of the functional programming version is the way the previous results are transported during the fold (in the `acc` parameter).

## Implementing the missing methods

In this example, we have used two new methods that are missing in the `List` class. The `range` method is a convenience method allowing creating a list of integers from `start` (included) to `end` (excluded) if `start` is lower than `end`. Otherwise, the list is constructed in reverse order. The fact that `start` is included and `end` is excluded is purely conventional. Here is the corresponding implementation in the `List` class:

``````public static List<Integer> range(int start, int end) {
return start < end
? rangeAsc(list(), start - 1 , end - 1)
: rangeDesc(list(), end, start);
}

private static List<Integer> rangeAsc(List<Integer> acc, int start, int end) {
return start == end
? acc
: rangeAsc(new Cons<>(end, acc), start, end - 1);
}

private static List<Integer> rangeDesc(List<Integer> acc, int start, int end) {
return start == end
? acc
: rangeDesc(new Cons<>(start, acc), start + 1, end);
}``````

The `List.at(n)` method allows indexed access to the list. It is not very efficient, since it needs time proportional to `n`, but is is perfectly usable when `n` is low. Here is the implementation:

``````public List<A> at(int n) {
return tail().flatMap(t -> n == 0 ? head() : t.at(n - 1));
}``````

This method is even less efficient since it uses the `head` method defined in the previous article. We will fix this in the next section.

## Solving the real problem

Beside efficiency, we have a more important issue: our program does not solve our real problem, because it allows using several times the same item. It was done this way for simplification. Now, we can fix this easily:

``````public static Knapsack pack(List<Item> items, int capacity) {
final Knapsack zero = empty(capacity);
final List<Knapsack> result = List.range(capacity + 1, 1).foldRight(List.list(), (i, acc) -> {
Knapsack knapsack = items.filter(item -> item.weight <= i)
.flatMap(item -> acc.at(item.weight - 1).map(k -> k.contains(item) ? k : k.add(item))) (1)
.foldRight(zero, Knapsack::maxValue);
return acc.cons(knapsack);
});
return result.head().foldRight(zero, (a, b) -> a);
}``````

(1) We put the element in the knapsack only if it has not yet been used, which simply means that the knapsack does not already contain it.

To make this possible, we must add the `contains` method to the `Knapsack` class:

``````private boolean contains(Item item) {
return items.contains(item);
}``````

This method simply delegates to the `List.contain(a)` method. Guess what this method will be based on? Bingo! A fold:

``````public boolean contains(A a) {
return foldLeft(false, (result, b) -> a.equals(b) || result);
}``````

Again, this method is not optimized, since it will have to traverse the whole list instead of escaping as soon as a matching element is found.

Note that we must define an `equals` method in our `Item` class. I will not show it here, since you certainly know how to do this. Moreover, any good IDE will generate it for you. However, to be really complete, we should add a unique ID to the `Item` class. Forgetting to do so will make it impossible to have two different items with the same weight and value in the list. A very simple way to do this, if the ID is not to be used for anything else, is to generate a UUID and use it for equality:

``````public class Item {

public final String name;
public final int weight;
public final double value;
private final UUID id = UUID.randomUUID(); (1)

private Item(String name, int weight, double value) {
this.name = name;
this.weight = weight;
this.value = value;
}

@Override
public boolean equals(Object o) {
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
Item item = (Item) o;
return weight == item.weight && Double.compare(item.value, value) == 0 && (name != null
? name.equals(item.name)
: item.name == null && id.equals(item.id)); (2)
}
...``````

A random UUID is generated, and it is used to test equality.

## Representing the absence of data

In this example, we had to represent data that could be present or absent. For example, a method returning the first element of a list could return nothing if the list is empty. Traditional languages use the `null` reference, invented by Sir Charles Antony Richard Hoare (a.k.a. "Tony Hoare"), which is what he called his "million dollars mistake". The `null` reference is the plague of programming because it does not compose. This is something it has in common with exceptions (to which it transposes very easily!). This does not mean it is not possible to compose it, but it does not compose transparently by itself. The programmer has to handle it separately through some control structures.

In the previous article, we saw that it was perfectly possible to use a list to represent optional data. Absence of data is represented by an empty list, and data is represented by a singleton list. The main problem is that there was no mean to differentiate a normal list from a singleton list. The good thing is that it is very easy to compose both.

Another possibility is to use a special class to represent singleton lists. We might call this class `Option`. It will be identical to the `List` class, with two main exceptions: there is no `cons` method (since we can’t add elements) and the folding methods generally do not take a function argument. It could however take one, but it would in most use cases be the identity function, which is why it is most often implemented without this function argument. This method will then simply return either the included value, if it is present, or the "zero" value if not. Of course, the "zero" value will have a more explicit name such as `default`, but this (like names in general) is irrelevant. Here is a minimal `Option` class:

``````public abstract class Option<A> {

public abstract <B> B foldRight(B identity, BiFunction<A, B, B> f);

public <B> Option<B> map(Function<A, B> f) {
return foldRight(none(), (a, option) -> some(f.apply(a)));
}

public <B> Option<B> flatMap(Function<A, Option<B>> f) {
return foldRight(none(), (a, option) -> f.apply(a));
}

public Option<A> filter(Predicate<A> p) {
return foldRight(none(), (a, option) -> p.test(a) ? some(a) : none());
}

public A getOrElse(A defVal) {
return foldRight(defVal, (a, b) -> a);
}

private static class None<A> extends Option<A> {

@Override
public <B> B foldRight(B identity, BiFunction<A, B, B> f) {
return identity;
}

public String toString() {
return "None()";
}
}

private static class Some<A> extends Option<A> {

private final A value;

private Some(A value) {
this.value = value;
}

@Override
public <B> B foldRight(B identity, BiFunction<A, B, B> f) {
return f.apply(value, identity);
}

public String toString() {
return String.format("Some(%s)", value);
}
}

@SuppressWarnings("rawtypes")
private static Option NONE = new None();

@SuppressWarnings("unchecked")
public static <A> Option<A> none() {
return NONE;
}

public static <A> Option<A> some(A a) {
return new Some<>(a);
}
}``````

## What Option changes to our example

Using `Option` to represent optional data instead of a list brings some benefits in the sense that a program is often easier to read if we can easily distinguish between true lists (which can contain any number of elements) and singleton lists representing optional data (which can only contain zero or one element). By the way, the difference is not always visible, since type may sometimes be inferred, such as when chaining methods. Here is our `pack` method using `Option` instead of `List` for optional data:

``````public static Knapsack pack(List<Item> items, int capacity) {
final Knapsack zero = empty(capacity);
final List<Knapsack> result = List.range(capacity + 1, 1).foldRight(List.list(), (i, acc) -> {
Knapsack knapsack = items.filter(item -> item.weight <= i)
.sequence(item -> acc.at(item.weight - 1).map(k -> k.contains(item) ? k : k.add(item))) (1)
.map(list -> list.foldRight(zero, Knapsack::maxValue)) (2)
.getOrElse(zero); (3)
return acc.cons(knapsack);
});
}``````

(1) The flatMap method can’t be used, since it produces a `List` Here, we produce an `Option`, but this would not be visible if we had kept the name `flatMap` for the method. We will of course have to define the `Option.sequence` method.

(2) The `map` method is called on an `Option`, but it is difficult to see by looking at the code!

(3) Getting the value (if present) or a default value (if not) is done through the `getOrElse` method. We changed the name to follow conventional usage although it is implemented through a fold!

(4) We can simplify getting the value or default value from the resulting `Option`.

The `map` and `getOrElse` method can be seen in the previous listing of the `Option` class. You can see that `Option` is really a `List`. The problem happens when a list is mapped with a function returning an `Option`. When we were using a list for optional data, the function was returning a `List`, so we ended with a `List<List<A>>`. Getting a `List<A>` instead was just a matter of using `flatMap` instead of `Map`. With `Option`, we end with a `List<Option<A>`. So we have to create a method for this. We will put it in the `List` class. By convention, this method is called `sequence`. In order to better mimic the `List.flatMap` method, we will also add a convenience `sequence` method taking a function returning an `Option` as its argument:

``````public <B> Option<List<B>> sequence(Function<A, Option<B>> f) {
return sequence(map(f)); (1)
}

public static <A> Option<List<A>> sequence(List<Option<A>> list) {
return list.foldRight(Option.some(List.list()), (oa, acc) -> map2(oa, acc, (a, b) -> b.cons(a))); (2)
}

public static <A, B, C> Option<C> map2(Option<A> a, Option<B> b, BiFunction<A, B, C> f) {
return a.flatMap(av -> b.flatMap(bv -> Option.some(f.apply(av, bv)))); (3)
}``````

(1) This method is a simple convenience allowing applying the function and calling `sequence` in a single operation that mimics the `List.flatMap` method.

(2) The `sequence` method uses a `foldRight` (to preserve the list order) and a utility method `map2` transforming a `BiFunction<A, B, C>` into a `BiFunction<Option<A>, Option<B>, Option<C>>`

(3) This is a very common idiom in functional programming. If you have trouble to understand it, you should insist until you succeed. It is not complicated. The first `flatMap` give access to the value contained in the first `Option` (`av`). The second nested `flatMap` does the same for the second `Option`, producing the value `bv`. We can apply the function to these two values, producing a `C`. As we want an `Option<C>`, we just wrap it into an `Option.Some`.

Note that the `map2` implementation is not optimal. We must take the value out of the first `Option`, thus using `flatMap`. But we are not forced to do the same with the second. An alternate (and far more common) solution is to `map` the second option with the function:

``````public static <A, B, C> Option<C> map2(Option<A> a, Option<B> b, BiFunction<A, B, C> f) {
return a.flatMap(av -> b.map(bv -> f.apply(av, bv)));
}``````

## Conclusion

We have now completed our functional Knapsack example. If you compare to traditional imperative Java (for example Knapsack problem/0-1), you may think that the functional version is not shorter. This is true. You may also think it is not easier to understand. This is totally subjective. I found it cristal clear and I have trouble to understand the imperative version. But this is because I am used to functional programming, and I am reluctant to read imperative programs.

So what are the benefits of the functional programming version? The main benefit is that we have abstracted a great deal of the program in classes (`List` and `Option`) that may be reused for solving other problems. Of course, Java has also `List` and `Optional`. Just try to write the same program with these classes and see for yourself. The real result is that beside the business code (the `Item` class and most of the `Knapsack` class), our program is less that ten lines long. As you know, the shortest the code, the fewer the bugs.

Of course, there is plenty of room for optimization. We will look at some in a next article, while trying to solve a different problem. In the meantime, if you are interested in more advanced techniques about the subject, have a look at my book: Functional Programming in Java.

Topics:
java 8 functional programming ,recursion

Comment (0)

Save
{{ articles.views | formatCount}} Views

Published at DZone with permission of Pierre-Yves Saumont . See the original article here.

Opinions expressed by DZone contributors are their own.