Deep Learning With TensorFlow, Nvidia and Apache Mesos (DC/OS) (Part 1)

Read on to learn more about the new GPU-based scheduling and see how you can take advantage of it within Mesosphere DC/OS.

DC/OS 1.9 introduced GPU-based scheduling. With it, organizations can share cluster resources between traditional and Machine Learning workloads, dynamically allocating GPU resources inside those clusters and freeing them when they are no longer needed. By using popular Machine Learning tools such as TensorFlow and nvidia-docker, data scientists can test locally on their laptops and deploy to production on DC/OS without changing their applications or models.

This is the first in a multipart tutorial to demonstrate the value of GPUs, and how running them on DC/OS makes your applications not just fast but also efficient, reliable, and scalable.

  • Tutorial #1: Run a TensorFlow Docker image on your laptop and run a Machine Learning model with and without GPUs.
  • Tutorial #2: Run a TensorFlow Docker image on a DC/OS cluster with and without GPUs. See GPU isolation and Jupyter in action.
  • Tutorial #3: Deploy a dynamic, distributed TensorFlow on DC/OS from the Universe. See how TensorFlow on DC/OS dynamically consumes and releases resources on the cluster when done. Run multiple TensorFlows on the same cluster with different resource requirements.

In part one of this tutorial, we’ll use TensorFlow to launch a convolutional neural network example on your local machine, then use nvidia-docker to accelerate the TensorFlow job using GPUs. TensorFlow is a machine intelligence library with architecture specially configured to leverage GPUs for speed and efficiency.

Watch a video version of this tutorial here.

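The only prerequisite for the first section is that Docker is installed on your local machine. The GPU section additionally requires an Nvidia GPU, CUDA drivers, and nvidia-docker, which are covered below.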

Let’s Get Going!

First, we’ll get TensorFlow running without GPUs and time the result.

  1. On your local machine, launch TensorFlow without GPUs using Docker.
    docker run -it tensorflow/tensorflow bash
  2. We will need a Machine Learning model to test with TensorFlow. You can find some good examples in the TensorFlow-Examples repo on GitHub. We’ll install git using apt-get, then clone the examples repo.
    apt-get update; apt-get install -y git
    git clone https://github.com/aymericdamien/TensorFlow-Examples
  3. Now that we have the convolutional network example, let’s run it and time the execution.
    cd TensorFlow-Examples/examples/3_NeuralNetworks
    time python convolutional_network.py

That took my machine about three minutes.
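
The shell's `time` prefix is enough for a one-off comparison. If you would rather capture the wall-clock time from a script, for instance to log several runs, something like the following Python sketch could work. It assumes Python 3.5 or later, and `time_command` is a hypothetical helper name, not part of TensorFlow or the examples repo.

```python
import subprocess
import sys
import time


def time_command(cmd):
    """Run cmd (a list of arguments) and return (wall-clock seconds, exit code)."""
    start = time.perf_counter()
    completed = subprocess.run(cmd)
    elapsed = time.perf_counter() - start
    return elapsed, completed.returncode


# Demo with a trivial command so the sketch runs anywhere.
seconds, code = time_command([sys.executable, "-c", "pass"])
print("exit code {}, wall-clock time {:.2f}s".format(code, seconds))
```

To time the tutorial's example instead, pass `["python", "convolutional_network.py"]` while in the `3_NeuralNetworks` directory.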

TensorFlow Example With GPUs

Now, let’s see how GPUs compare.

Note: This part of the tutorial will not work on Mac OS X because nvidia-docker does not support OS X.
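
Before working through the steps below, it can help to check whether the host has an Nvidia driver and a visible GPU at all. The following Python sketch is an illustration, not part of the original tutorial: it looks for the `nvidia-smi` tool on the PATH and, if it is found, asks it for the GPU names. The helper name `list_gpus` is hypothetical, and Python 3.5 or later is assumed.

```python
import shutil
import subprocess


def list_gpus():
    """Return the GPU names reported by nvidia-smi, or [] if none are visible."""
    # No nvidia-smi on the PATH usually means no Nvidia driver is installed.
    if shutil.which("nvidia-smi") is None:
        return []
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            universal_newlines=True,
            check=True,
        )
    except (subprocess.CalledProcessError, OSError):
        # The tool exists but could not talk to a driver or device.
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


gpus = list_gpus()
print("GPUs visible:", gpus if gpus else "none")
```

An empty list suggests the GPU steps will not work on this machine until the drivers are installed.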

  1. Open a new terminal tab so you can compare the results of the following nvidia-docker steps with the CPU-only run.
  2. Make sure you have the latest CUDA drivers installed on your system. Find them here. As of this writing, version 8.0 is the latest.
  3. Download and install nvidia-docker.
  4. Verify your nvidia-docker installation.
    nvidia-docker run --rm nvidia/cuda nvidia-smi
    nvidia-docker run --rm nvidia/cuda:7.5 nvidia-smi
  5. Launch TensorFlow with GPUs using nvidia-docker.
    nvidia-docker run -it tensorflow/tensorflow:latest-gpu bash
  6. Install git and download TensorFlow-Examples, as you did in the previous section.
    apt-get update; apt-get install -y git
    git clone https://github.com/aymericdamien/TensorFlow-Examples
  7. Run the same convolutional network example as before.
    cd TensorFlow-Examples/examples/3_NeuralNetworks
    time python convolutional_network.py

That probably took about 30 seconds: roughly six times faster than the three-minute CPU-only run. The benefits of using GPUs are clear.
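
To put a number on the comparison, divide the CPU-only time by the GPU time. The short Python sketch below does this with the rough figures quoted in this tutorial (about three minutes versus about 30 seconds). The `speedup` helper is a hypothetical name, and your own timings will differ.

```python
def speedup(baseline_seconds, accelerated_seconds):
    """Return how many times faster the accelerated run was than the baseline."""
    if baseline_seconds <= 0 or accelerated_seconds <= 0:
        raise ValueError("timings must be positive")
    return baseline_seconds / accelerated_seconds


cpu_seconds = 3 * 60   # "about three minutes" without GPUs
gpu_seconds = 30       # "about 30 seconds" with GPUs

# prints: GPU run was 6.0x faster
print("GPU run was {:.1f}x faster".format(speedup(cpu_seconds, gpu_seconds)))
```

Substitute the `real` values reported by `time` on your machine to get your own ratio.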

What’s the Effect of Multiple GPUs?

  1. While we’re here, let’s run the multi-GPU example to see the performance difference if we use more than one GPU for a task.
    cd ../5_MultiGPU
    time python multigpu_basics.py

Even faster!

Published at DZone with permission of Kevin Klues, DZone MVB. See the original article here.
