# Review: Clojure for Machine Learning (Ch 1-3)

In short, the book provides a good bird's-eye view of the intersection of Clojure and machine learning, useful for people coming from either side. It introduces a number of important methods and shows how to implement or use them in Clojure, but it does not, and cannot, provide deep understanding. If you are new to M.L. and, like me, really like to understand things, you will want to get a proper textbook (or several) on the methods and the math behind them and read it in parallel. If you know M.L. but are relatively new to Clojure, you can skip the M.L. parts you already know and study the code examples and the tools used in them. To read the book you need only elementary knowledge of Clojure, but you do need to be comfortable with math: if you haven't worked with matrices, statistics, or derivatives, or if equations scare you, you will have a hard time with some of the methods. You will learn how to implement some M.L. methods in Clojure, but without deep understanding, without knowledge of their limitations and issues, and without an overview of the alternatives that would let you pick the best method for a particular case.

The main topics are matrices, linear regression, data categorization (e.g. Bayesian classification, k-nearest neighbors, decision trees), neural networks, selection and evaluation of data, support vector machines, data clustering, anomaly detection and recommendation, and big data. The tools used include Incanter, clj-ml (primarily a wrapper of Weka), Enclog (neural networks), and BigML (a facade for ML cloud services).

Some impressions from the first chapters (ch. 1-2 take up a third of the book):

- I miss the big picture: for example, what kinds of regression methods are there, and how do you know which one is appropriate? How do you choose which of the 3-4 categorization methods to use in a given case? Again, a good textbook on M.L. would complement this pragmatically oriented book well.
- As mentioned, the book demonstrates what is possible but does not provide enough explanation and math theory to really understand some of the more complex methods. You won’t be able to start deriving and optimizing good regression models based on this text alone.
- Ch1 introduces matrices, which are later used, for example, to compute the parameters of an Ordinary Least Squares regression model. It mentions a number of concepts, such as eigenvector and determinant, without elaborating on their meaning.
- It would be nice if the author pointed regularly to good online/offline resources where the curious reader can learn more about the math, concepts, and methods being introduced.
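
For readers wondering how matrices come into the Ordinary Least Squares model mentioned above, here is a minimal worked sketch using the standard textbook closed form (the normal equations). It is not taken from the book, whose notation and approach may differ; the three data points and the symbols $X$, $y$ and $\beta$ are my own illustrative assumptions.

```latex
% Fit y = \beta_0 + \beta_1 x to the (made-up) points (0,1), (1,3), (2,5).
% Each row of X is (1, x_i); the leading 1 carries the intercept \beta_0.
X = \begin{pmatrix} 1 & 0 \\ 1 & 1 \\ 1 & 2 \end{pmatrix},
\qquad
y = \begin{pmatrix} 1 \\ 3 \\ 5 \end{pmatrix}

% Normal equations: \beta = (X^{T} X)^{-1} X^{T} y
X^{T} X = \begin{pmatrix} 3 & 3 \\ 3 & 5 \end{pmatrix},
\qquad
X^{T} y = \begin{pmatrix} 9 \\ 13 \end{pmatrix}

% The determinant decides whether the inverse exists:
\det(X^{T} X) = 3 \cdot 5 - 3 \cdot 3 = 6 \neq 0

(X^{T} X)^{-1} = \frac{1}{6} \begin{pmatrix} 5 & -3 \\ -3 & 3 \end{pmatrix}

\beta = \frac{1}{6} \begin{pmatrix} 5 & -3 \\ -3 & 3 \end{pmatrix}
        \begin{pmatrix} 9 \\ 13 \end{pmatrix}
      = \frac{1}{6} \begin{pmatrix} 6 \\ 12 \end{pmatrix}
      = \begin{pmatrix} 1 \\ 2 \end{pmatrix}
```

So the fitted line is $y = 1 + 2x$, which passes through all three points exactly. The example also shows why a concept like the determinant matters here: had it been zero, $X^{T} X$ would have no inverse and this closed form could not be used.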

Tip: You might want to check out the Stanford Machine Learning online course at Coursera, which also draws from numerous case studies and applications.

*Disclaimer: I was unconditionally provided with a free copy of the e-book prior to reviewing it.*

Published at DZone with permission of Jakub Holý, DZone MVB. See the original article here.

