Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

Mahout: Using a Saved Random Forest/DecisionTree

DZone's Guide to

Mahout: Using a Saved Random Forest/DecisionTree

· Big Data Zone
Free Resource

Effortlessly power IoT, predictive analytics, and machine learning applications with an elastic, resilient data infrastructure. Learn how with Mesosphere DC/OS.

One of the things that I wanted to do while playing around with random forests using Mahout was to save the random forest and then use use it again which is something Mahout does cater for.

It was actually much easier to do this than I’d expected and assuming that we already have a DecisionForest built we’d just need the following code to save it to disc:

int numberOfTrees = 1;
Data data = loadData(...);
DecisionForest forest = buildForest(numberOfTrees, data);
 
String path = "saved-trees/" + numberOfTrees + "-trees.txt";
DataOutputStream dos = new DataOutputStream(new FileOutputStream(path));
 
forest.write(dos);

When I was looking through the API for how to load that file back into memory again it seemed like all the public methods required you to be using Hadoop in some way which I thought was going to be a problem as I’m not using it.

For example the signature for DecisionForest.load reads like this:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
 
public static DecisionForest load(Configuration conf, Path forestPath) throws IOException { }

As it turns out though you can just pass an empty configuration and a normal file system path and the forest shall be loaded:

int numberOfTrees = 1;
 
Configuration config = new Configuration();
Path path = new Path("saved-trees/" + numberOfTrees + "-trees.txt");
DecisionForest forest = DecisionForest.load(config, path);

Much easier than expected!

Learn to design and build better data-rich applications with this free eBook from O’Reilly. Brought to you by Mesosphere DC/OS.

Topics:

Published at DZone with permission of Mark Needham, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

THE DZONE NEWSLETTER

Dev Resources & Solutions Straight to Your Inbox

Thanks for subscribing!

Awesome! Check your inbox to verify your email so you can start receiving the latest in tech news and resources.

X

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}