Deep Learning is becoming the next big area for companies and universities to explore. Deep Learning libraries are growing and their adoption is expanding. Since Google open sourced TensorFlow, there has been a massive rise in deep learning adoption. I started using it for its very interesting image recognition capabilities, which can be used out of the box with the ImageRecognition example. Google has also released a new TensorFlow library for image recognition, TF-Slim. TF-Slim is a lightweight library for defining, training and evaluating complex models in TensorFlow, which should speed up adoption and improve ease of use. TensorFlow also has an update to the Inception-ResNet-v2 training library, which should provide better accuracy.
TensorFlow now runs on Android, Linux, Linux with GPU and MacOS. Other platforms will be added in the future. For non-Linux users, I recommend trying it in a cloud or on a VM. I have installed TensorFlow on the Hortonworks 2.5 sandbox, so that could be an option for Windows users. I prefer the VirtualBox version of the sandbox; VirtualBox is free and runs well.
There are also updates to Google's Text Summarization in TensorFlow. This technology extracts pieces of a text and creates a summary based on a metric of what makes them "interesting". It does require building up your training set and then running some heavy-duty processes, but it is very interesting work indeed. It might work out well for someone automatically writing summary articles. Perhaps TensorFlow wrote this article? How would you know whether a person wrote it, a Deep Learning algorithm wrote it, or one edited the other's work? Hmmm, lots of possibilities here.
Google's TensorFlow is a very powerful deep learning (and machine learning) library that includes a lot of usable examples, training data, and tutorials. I highly recommend learning and using it. I have written two tutorials on using it with Apache NiFi: TensorFlow with Tweets and Using Parsey McParseface. Now that I have told you what TensorFlow can do, let's start learning.
To run my example script, I triggered it from Bash:
/opt/demo/tensorflow/bazel-bin/tensorflow/examples/label_image/label_image --image="/tmp/$@" --output_layer="softmax:0" --input_layer="Mul:0" --input_std=128 --input_mean=128 --graph=/opt/demo/tensorflow/tensorflow/examples/label_image/data/tensorflow_inception_graph.pb --labels=/opt/demo/tensorflow/tensorflow/examples/label_image/data/imagenet_comp_graph_label_strings.txt
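If you would rather trigger the same binary from Python than from Bash, here is a minimal sketch. It assumes the same install paths as the command above. The helper names `build_label_image_command` and `normalize_pixel` are hypothetical and are not part of TensorFlow. The second function illustrates, as I understand the label_image example, what `--input_mean` and `--input_std` do: each pixel value has the mean subtracted and is then divided by the standard deviation before it is fed to the input layer.

```python
from __future__ import print_function

# Paths copied from the Bash command above; adjust them for your own checkout.
LABEL_IMAGE_BIN = "/opt/demo/tensorflow/bazel-bin/tensorflow/examples/label_image/label_image"
DATA_DIR = "/opt/demo/tensorflow/tensorflow/examples/label_image/data"


def build_label_image_command(image_name, image_dir="/tmp"):
    """Return the label_image argument list for one image (hypothetical helper)."""
    return [
        LABEL_IMAGE_BIN,
        "--image={}/{}".format(image_dir, image_name),
        "--output_layer=softmax:0",
        "--input_layer=Mul:0",
        "--input_std=128",
        "--input_mean=128",
        "--graph={}/tensorflow_inception_graph.pb".format(DATA_DIR),
        "--labels={}/imagenet_comp_graph_label_strings.txt".format(DATA_DIR),
    ]


def normalize_pixel(value, input_mean=128.0, input_std=128.0):
    """Illustrate the mean/std scaling: (value - mean) / std (hypothetical helper)."""
    return (value - input_mean) / input_std


if __name__ == "__main__":
    # Print the command instead of running it, so this works without TensorFlow built.
    print(" ".join(build_label_image_command("cat.jpg")))
    # With mean and std both 128, pixel values 0..255 land roughly in [-1, 1).
    print(normalize_pixel(0), normalize_pixel(128), normalize_pixel(255))
```

To actually run the classifier, the returned list can be passed to `subprocess.call`. Building the command as a list rather than a single string avoids shell quoting problems with unusual file names.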
For installation, this pip install was critical:
sudo pip install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.7.1-cp27-none-linux_x86_64.whl
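After the install finishes, a quick check that Python can actually import the package saves time later. This is only a sketch: the function name `tensorflow_version` is hypothetical, and the version string you see depends on which wheel you installed.

```python
from __future__ import print_function


def tensorflow_version():
    """Return the installed TensorFlow version string, or None if it cannot be imported."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    # Fall back to None if the module does not expose a version attribute.
    return getattr(tf, "__version__", None)


if __name__ == "__main__":
    version = tensorflow_version()
    if version is None:
        print("TensorFlow is not importable; re-check the pip install step.")
    else:
        print("TensorFlow version:", version)
```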
First start with these excellent TensorFlow tutorials:
Another cool way to learn about TensorFlow is to use it with a machine learning framework that you already know, like H2O. You will need to learn some Python and also learn how to do some complex installs and builds, as TensorFlow uses Google's Bazel build tool and some other tools for building and including many dependencies.
Once you have the basics down, you can start integrating TensorFlow with your existing Machine Learning pipelines. One great way to do that is with H2O. H2O is an ML and DL framework and set of tools that runs in many ways, including on Hadoop and Spark for massively parallel processing on clusters. H2O calls their combination of TensorFlow and H2O Deep Water. Deep Water is very new and probably shouldn't be in your production pipeline, but data scientists and data engineers should start investigating it ASAP.
Here is a well-documented starter example of doing TensorFlow Image Inception with H2O in Python. Deep Water supports a few deep learning frameworks; this example covers TensorFlow. I will write an article on Deep Water in the future. If you are interested, please comment on this article.
Once you have the tools and some practice, you now need to do something with your powerful TensorFlow knowledge, tools and cluster. A very recent and interesting project is the Self-driving Car Challenge. It is a very cool class and challenge from Udacity on building an open source self-driving car using NVidia (GitHub). Download a lot of image training sets and start training your deep learning models. Some useful information on this project is available at NVidia Autopilot on TensorFlow and End to End Learning for Self-driving Cars. Comment here on any problems, questions or interesting findings. You can also engage me on HCC or Twitter.
Never stop learning; there is always a ton of interesting things going on in Big Data.