Apache Deep Learning 101: Using Apache MXNet on an HDF Node
In this article, a data science expert shows how to run deep learning workloads on a single HDF node on Linux. Let's get started!
This is for people preparing to attend my talk on Deep Learning at DataWorks Summit Berlin 2018 (https://dataworkssummit.com/berlin-2018/#agenda) on Thursday, April 19, 2018, at 11:50 AM Berlin time.
This guide covers running Apache MXNet on an HDF 3.1 node with CentOS 7.
Let's get this installed!
The installation instructions on Apache MXNet's website are excellent. Pick your platform and installation style; I am taking the simplest path, on Linux.
We need OpenCV to handle images in Python, so we install it along with all the build tools required to compile both OpenCV and Apache MXNet.
HDF 3.1 / CentOS 7 Setup
```shell
sudo yum groupinstall 'Development Tools' -y
sudo yum install cmake git pkgconfig -y
sudo yum install libpng-devel libjpeg-turbo-devel jasper-devel openexr-devel libtiff-devel libwebp-devel -y
sudo yum install libdc1394-devel libv4l-devel gstreamer-plugins-base-devel -y
sudo yum install gtk2-devel -y
sudo yum install tbb-devel eigen3-devel -y
pip install numpy

# Clone OpenCV and the contrib modules side by side in the home
# directory so OPENCV_EXTRA_MODULES_PATH below resolves correctly
cd ~
git clone https://github.com/Itseez/opencv.git
cd opencv
git checkout 3.1.0
cd ~
git clone https://github.com/Itseez/opencv_contrib.git
cd opencv_contrib
git checkout 3.1.0

cd ~/opencv
mkdir build
cd build
cmake -D CMAKE_BUILD_TYPE=RELEASE \
      -D CMAKE_INSTALL_PREFIX=/usr/local \
      -D OPENCV_EXTRA_MODULES_PATH=~/opencv_contrib/modules \
      -D INSTALL_C_EXAMPLES=OFF \
      -D INSTALL_PYTHON_EXAMPLES=ON \
      -D BUILD_EXAMPLES=ON \
      -D BUILD_OPENCV_PYTHON2=ON ..
make
sudo make install
sudo ldconfig
```
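Once the build finishes, it's worth confirming that Python can actually see the pieces we just installed. A minimal sketch, assuming the standard module names `numpy`, `cv2`, and `mxnet`:

```python
def check_deps():
    """Return an import status for each Python module this walkthrough relies on."""
    status = {}
    for mod in ("numpy", "cv2", "mxnet"):
        try:
            __import__(mod)
            status[mod] = "ok"
        except ImportError:
            status[mod] = "missing"
    return status

if __name__ == "__main__":
    for mod, state in check_deps().items():
        print("{0}: {1}".format(mod, state))
```

If anything reports `missing`, revisit the corresponding install step before moving on.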
Local CentOS 7 Run Script
```shell
# $1 is the filename of the image to analyze
python -W ignore analyzex.py $1
```
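The script itself isn't reproduced here, but the JSON it hands back for downstream processing can be sketched. This is a hypothetical shape: the field names (`uuid`, `imagefilename`, `runtime`, `top1`, `top1pct`, ...) are assumptions for illustration, not the actual schema used by `analyzex.py`:

```python
import json
import uuid
from datetime import datetime, timezone

def build_result(predictions, filename):
    """Package classifier output as a flat JSON record.

    predictions: list of (probability, label) tuples, highest probability first.
    """
    record = {
        "uuid": "mxnet_uuid_img_{0}".format(uuid.uuid4()),
        "imagefilename": filename,
        "runtime": datetime.now(timezone.utc).isoformat(),
    }
    # Flatten the top-N predictions into top1/top1pct, top2/top2pct, ...
    for i, (prob, label) in enumerate(predictions, start=1):
        record["top{0}".format(i)] = label
        record["top{0}pct".format(i)] = round(float(prob), 4)
    return json.dumps(record)

if __name__ == "__main__":
    print(build_result([(0.92, "tabby cat"), (0.05, "tiger cat")], "photo1.jpg"))
```

A flat record like this is easy for NiFi to convert to Avro and ORC later without any nested-field handling.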
See Part 1 here.
Apache NiFi Flow
This first flow retrieves images from the picsum.photos API, stores them locally, and then runs some basic processing. The first branch extracts all the metadata we can. The second branch calls our example Inception Apache MXNet Python script for image recognition. The script returns a JSON file, which we process with the same code that the local version of this program uses.
Once we funnel that out of our process group, we send it to the MXNet process group, which converts the JSON to Apache Avro and then to Apache ORC for storage in HDFS, where it backs an external Apache Hive table.
Our Schema hosted in Hortonworks Schema Registry
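The registry entry itself isn't reproduced here, but an Avro schema for a flat record like the one our script emits would follow the usual shape. A sketch, with hypothetical field names rather than the actual registered schema:

```json
{
  "type": "record",
  "name": "mxnetimage",
  "fields": [
    {"name": "uuid", "type": "string"},
    {"name": "imagefilename", "type": "string"},
    {"name": "runtime", "type": "string"},
    {"name": "top1", "type": "string"},
    {"name": "top1pct", "type": "double"}
  ]
}
```

Registering the schema centrally lets the NiFi record processors and the Hive table agree on field names and types without duplicating the definition.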
Examining The Picture with ExtractMedia...
To Execute Apache MXNet Installed on HDF Node
An Example Image Loaded From the API
Exploring the data with Apache Hive SQL in Apache Zeppelin on HDP 2.6.4
Images REST API Provided by PicSum (Digital Ocean + Beluga CDN)
Images Provided By Unsplash