
Implementing Immutable Classes in Machine Learning Algorithms


If you design your classes to be immutable when you're implementing machine learning algorithms in Java, performance will be easier to improve.


Recently, I've been trying to implement some machine learning algorithms in Java. One of the first things you'll need to do when trying to make sense of big data is to crunch the numbers by applying some linear algebra. I could (and should) use the Math library from Apache Commons for this, but this was a learning experience, so I tried to implement the basic matrix and vector operations myself.

Once it worked — in the sense that I could predict the categories of irises from the Iris dataset with about 90% accuracy — I tried out some other datasets. Apart from the fact that other datasets appeared to be more difficult to predict, the other thing I noticed was that the learning phase was... very... slow.

Luckily, I had designed my Vector and Matrix classes to be immutable. It was more or less an instinctive guess that this might be the best way to use these classes. So instead of changing the matrix itself, operations like "multiply" and "add" created a new matrix object containing the results of the operation.
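
The Matrix class itself isn't shown in full here, but a minimal sketch of what such an immutable design could look like (the class, field, and method details below are illustrative rather than my exact code) is:

public final class DoubleMatrix {
    private final double[][] values; // never modified after construction

    public DoubleMatrix(double[][] values) {
        // defensive copy, so callers can't mutate our internal state afterwards
        this.values = new double[values.length][];
        for (int i = 0; i < values.length; i++) {
            this.values[i] = values[i].clone();
        }
    }

    public int rows() { return values.length; }
    public int columns() { return values[0].length; }
    public double get(int row, int column) { return values[row][column]; }

    // "add" leaves this matrix untouched and returns a fresh result
    public DoubleMatrix add(DoubleMatrix other) {
        double[][] result = new double[rows()][columns()];
        for (int i = 0; i < rows(); i++) {
            for (int j = 0; j < columns(); j++) {
                result[i][j] = values[i][j] + other.get(i, j);
            }
        }
        return new DoubleMatrix(result);
    }

    // "multiply" also produces a brand-new object instead of changing this one
    public DoubleMatrix multiply(DoubleMatrix other) {
        double[][] result = new double[rows()][other.columns()];
        for (int i = 0; i < rows(); i++) {
            for (int j = 0; j < other.columns(); j++) {
                double sum = 0.0;
                for (int k = 0; k < columns(); k++) {
                    sum += values[i][k] * other.get(k, j);
                }
                result[i][j] = sum;
            }
        }
        return new DoubleMatrix(result);
    }
}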

Only when I noticed how poorly my application performed did I realize that I could give some of the more computationally intensive operations a boost. On high-dimensional data, some operations, like the QR decomposition of a matrix, can take a long time to compute. For example, my method qr() returns a nested interface QR with the following methods:

public interface QR {
    DoubleMatrix q();       // the orthogonal factor Q
    DoubleMatrix r();       // the upper triangular factor R
    DoubleMatrix inverse(); // inverse of the original matrix, derived from the decomposition
    Eigen eigen();          // eigenvalues and eigenvectors (see below)
    double det();           // determinant of the original matrix
}

And the eigen() method, in turn, returns another interface with these methods:

public interface Eigen {
    DoubleVector values();  // the eigenvalues
    DoubleMatrix vectors(); // the corresponding eigenvectors
}

So in a sense, given one matrix, the qr() operation computes four new matrices and more.
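
Put together, a caller working against this API might do something like the following (the DoubleMatrix constructor is the illustrative one from the sketch above, with the assumption that the class also exposes the qr() method):

DoubleMatrix a = new DoubleMatrix(new double[][] {
    { 4.0, 3.0 },
    { 6.0, 3.0 }
});

QR qr = a.qr();
DoubleMatrix q = qr.q();              // orthogonal factor
DoubleMatrix r = qr.r();              // upper triangular factor
DoubleMatrix inverse = qr.inverse();  // inverse of a
double det = qr.det();                // determinant of a
Eigen eigen = qr.eigen();
DoubleVector eigenvalues = eigen.values();
DoubleMatrix eigenvectors = eigen.vectors();

Each of these calls is expensive on high-dimensional data, which is exactly why recomputing them over and over hurts.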

Knowing that the state of my Matrix object can't change, I brought some simple caching to the rescue. Each time I needed a QR decomposition, I first checked whether I had already computed it for this matrix. If I had, I returned the cached result; otherwise, I computed it once and stored it in the cache.

public QR qr() {
    if (this.qr != null) return this.qr;
    // otherwise, start the hard work ...
}
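
Filled out a little further, the lazy cache could look like this (the qr field and the computeQr() helper are names I'm using for illustration):

private QR qr; // lazily filled cache; safe to keep because the matrix itself never changes

public QR qr() {
    if (this.qr != null) return this.qr; // reuse the result computed earlier
    this.qr = computeQr();               // illustrative helper that does the actual decomposition
    return this.qr;
}

Because the matrix is immutable, even a race between two threads calling qr() at the same time is fairly harmless: both would compute the same decomposition, and at worst the work is done twice. For strict safety under the Java Memory Model, though, the cache field should be volatile (or the QR implementation's fields final) so that no thread ever observes a partially published result.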

Predicting the categories of my datasets is still difficult, but at least I now get my results with much better performance.

Topics:
tutorial, ai, machine learning, predictive analytics, algorithms, app performance, immutable, caching

