
Using GPU in TensorFlow Model

This tutorial explains how to run TensorFlow computations on a GPU, control where operations are placed, and manage GPU memory.

By Rinu Gour · Mar. 11, 19 · Tutorial

In our last TensorFlow tutorial, we studied embeddings in TensorFlow. Today, we will look at how to extend our computational resources by using a GPU with TensorFlow. We will cover device placement logging and manual device placement, discuss how to optimize GPU memory usage, and then see how to use a single GPU in a multi-GPU system and how to use multiple GPUs at once.

Let's begin!


GPU in TensorFlow

A typical system may contain multiple devices for computation. TensorFlow supports both CPUs and GPUs, and it identifies each device by a string. For example:

  • If you have a CPU, it might be addressed as “/cpu:0”.
  • TensorFlow GPU strings have an index starting from zero. Therefore, to specify the first GPU, you should write “/device:GPU:0”.
  • Similarly, the second GPU is “/device:GPU:1”.

By default, if your system has both a CPU and a GPU and an operation has implementations for both, TensorFlow gives priority to the GPU.
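The naming scheme above can be sketched with a small helper. This is only an illustration of the string format; `gpu_device_string` is a hypothetical name and not a TensorFlow function.

```python
def gpu_device_string(index):
    """Return the device string for the GPU at the given zero-based index.

    Hypothetical helper for illustration only; it simply formats the
    '/device:GPU:<index>' strings described above.
    """
    if index < 0:
        raise ValueError("GPU indices start at zero")
    return "/device:GPU:{}".format(index)


print(gpu_device_string(0))  # first GPU
print(gpu_device_string(1))  # second GPU
```

A string built this way could be passed to tf.device in the examples that follow.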

Device Placement Logging

You can find out which devices handle particular operations by creating a session with the log_device_placement configuration option set to True.

import tensorflow as tf  # TensorFlow 1.x API (tf.Session, tf.ConfigProto)

# Graph creation.
a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
c = tf.matmul(a, b)
# Create a session that logs the device each operation is placed on.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Running the operation.
print(sess.run(c))

With device placement logging enabled, the output looks like this:

/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: Tesla K40c, pci bus id: 0000:05:00.0
b: /job:localhost/replica:0/task:0/device:GPU:0
a: /job:localhost/replica:0/task:0/device:GPU:0
MatMul: /job:localhost/replica:0/task:0/device:GPU:0
[[ 22.  28.]
 [ 49.  64.]]
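To see where the numbers in the result come from, the same product can be checked by hand. The sketch below uses plain Python lists instead of TensorFlow; the `matmul` function here is a hypothetical stand-in for illustration, not the TensorFlow op.

```python
def matmul(x, y):
    """Multiply two matrices given as lists of rows (plain Python)."""
    return [[sum(r * c for r, c in zip(row, col)) for col in zip(*y)]
            for row in x]


# The six values are laid out row by row, as with shape=[2, 3] and shape=[3, 2].
a = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]
b = [[1.0, 2.0],
     [3.0, 4.0],
     [5.0, 6.0]]

c = matmul(a, b)
print(c)  # [[22.0, 28.0], [49.0, 64.0]]
```

For instance, the top-left entry is 1*1 + 2*3 + 3*5 = 22.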

Manual Device Placement

At times, you may want to choose the device an operation runs on yourself. You can do this by creating a context with tf.device and naming the device, a CPU or a GPU, that should perform the computation, as shown below.

# Graph Creation.
with tf.device('/cpu:0'):
  a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
  b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
c = tf.matmul(a, b)
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Running the operation.
print(sess.run(c))

This code assigns the constants a and b to cpu:0. Because the matmul operation is created outside the tf.device context and has no explicit device, TensorFlow places it on a GPU by default if one is available, and copies the tensors between devices as required.

Device mapping:
/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: Tesla K40c, pci bus id: 0000:05:00.0
b: /job:localhost/replica:0/task:0/cpu:0
a: /job:localhost/replica:0/task:0/cpu:0
MatMul: /job:localhost/replica:0/task:0/device:GPU:0
[[ 22.  28.]
 [ 49.  64.]]

Optimizing TensorFlow GPU Memory

By default, TensorFlow maps almost all of the memory of every GPU that is visible to the process. This lets it use GPU memory more efficiently by reducing memory fragmentation.
In some cases, it is preferable for the process to allocate only a subset of the available memory, or to grow its memory usage only as needed. TensorFlow offers two configuration options to control this, described below:

allow_growth allocates GPU memory based on runtime needs. It starts by allocating very little memory, and as sessions run and more GPU memory is needed, it extends the memory region used by the TensorFlow process. Memory is not released, since releasing it can lead to worse fragmentation. The option is set through ConfigProto:

config = tf.ConfigProto()
config.gpu_options.allow_growth = True
session = tf.Session(config=config)  # pass any other Session arguments as needed

per_process_gpu_memory_fraction is the second option. It determines the fraction of the total memory that should be allocated on each visible GPU. The example below tells TensorFlow to allocate only 40 percent of each GPU's memory:

config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.4
session = tf.Session(config=config)  # pass any other Session arguments as needed

Use this option when you already know the memory requirements of your computation and are sure they will not change during processing, because it places a fixed upper bound on the GPU memory available to the process.
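As a rough worked example of what a fraction of 0.4 means in bytes, the sketch below does the arithmetic for a card with 12 GiB of memory. The 12 GiB figure and the `reserved_bytes` helper are assumptions for illustration; the exact amount TensorFlow reserves may differ slightly.

```python
GIB = 1024 ** 3  # bytes in one gibibyte


def reserved_bytes(total_bytes, fraction):
    """Return the number of bytes covered by `fraction` of `total_bytes`.

    Hypothetical helper: plain arithmetic only, not a TensorFlow call.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in the range (0, 1]")
    return int(total_bytes * fraction)


total = 12 * GIB  # assumed GPU memory size
print(reserved_bytes(total, 0.4) / GIB)  # roughly 4.8 GiB
```

The remaining memory, about 7.2 GiB in this assumed case, stays available to other processes using the same GPU.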

Single GPU in Multi-GPU System

On a system with more than one GPU, TensorFlow selects the GPU with the lowest ID by default. If you want to run on a different GPU, you need to specify it explicitly:

# Creates a graph.
with tf.device('/device:GPU:2'):
  a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
  b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
  c = tf.matmul(a, b)
# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Runs the op.
print(sess.run(c))

If the GPU you specify does not exist, TensorFlow raises an InvalidArgumentError, as shown below:

InvalidArgumentError: Invalid argument: Cannot assign a device to node 'b':
Could not satisfy explicit device specification '/device:GPU:2'
   [[Node: b = Const[dtype=DT_FLOAT, value=Tensor<type: float shape: [3,2]
   values: 1 2 3...>, _device="/device:GPU:2"]()]]

If you want TensorFlow to automatically choose an existing and supported device when the specified one does not exist, set allow_soft_placement to True in the configuration option when the session is created, as illustrated by the code below.

with tf.device('/device:GPU:2'):
  a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
  b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
  c = tf.matmul(a, b)
sess = tf.Session(config=tf.ConfigProto(
      allow_soft_placement=True, log_device_placement=True))
# Running the operation.
print(sess.run(c))
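The difference between strict and soft placement can be sketched as a toy decision function. This is not TensorFlow's actual placement algorithm; the `place` function is hypothetical, and falling back to the first entry of a priority-ordered device list is an assumption made for illustration.

```python
def place(requested, available, allow_soft_placement=False):
    """Toy model of device placement (not TensorFlow's real logic).

    `available` is assumed to be ordered by priority, highest first.
    """
    if requested in available:
        return requested
    if allow_soft_placement:
        # Assumption: fall back to the highest-priority available device.
        return available[0]
    raise ValueError(
        "Could not satisfy explicit device specification %r" % requested)


available = ["/device:GPU:0", "/cpu:0"]  # assumed machine with one GPU
print(place("/device:GPU:2", available, allow_soft_placement=True))
```

Without allow_soft_placement, the same call raises an error, which mirrors the InvalidArgumentError shown earlier.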

Using Multiple GPU in TensorFlow

To run TensorFlow on multiple GPUs, you can build the model in a multi-tower fashion, where each tower is assigned to a different GPU. Let's see an example:

c = []
for d in ['/device:GPU:2', '/device:GPU:3']:
  with tf.device(d):
    a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3])
    b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2])
    c.append(tf.matmul(a, b))
with tf.device('/cpu:0'):
  # Named `total` to avoid shadowing Python's built-in sum().
  total = tf.add_n(c)
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Running the operations.
print(sess.run(total))

The output is as follows:

/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: Tesla K20m, pci bus id: 0000:02:00.0
/job:localhost/replica:0/task:0/device:GPU:1 -> device: 1, name: Tesla K20m, pci bus id: 0000:03:00.0
/job:localhost/replica:0/task:0/device:GPU:2 -> device: 2, name: Tesla K20m, pci bus id: 0000:83:00.0
/job:localhost/replica:0/task:0/device:GPU:3 -> device: 3, name: Tesla K20m, pci bus id: 0000:84:00.0
Const_3: /job:localhost/replica:0/task:0/device:GPU:3
Const_2: /job:localhost/replica:0/task:0/device:GPU:3
MatMul_1: /job:localhost/replica:0/task:0/device:GPU:3
Const_1: /job:localhost/replica:0/task:0/device:GPU:2
Const: /job:localhost/replica:0/task:0/device:GPU:2
MatMul: /job:localhost/replica:0/task:0/device:GPU:2
AddN: /job:localhost/replica:0/task:0/cpu:0
[[  44.   56.]
 [  98.  128.]]
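The tower pattern can be mimicked in plain Python to check the final result: each tower computes the same product, and the results are summed element by element. The `matmul` and `add_n` functions below are hypothetical stand-ins for the TensorFlow ops, and nothing here actually runs on a GPU.

```python
def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(r * c for r, c in zip(row, col)) for col in zip(*y)]
            for row in x]


def add_n(matrices):
    """Element-wise sum of a list of equally shaped matrices."""
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*matrices)]


a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
b = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

# One result per tower; the device names are labels only in this sketch.
towers = {d: matmul(a, b) for d in ['/device:GPU:2', '/device:GPU:3']}
total = add_n(list(towers.values()))
print(total)  # [[44.0, 56.0], [98.0, 128.0]]
```

Each tower yields [[22, 28], [49, 64]], so the sum over two towers is exactly twice that matrix.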

You can try this multi-GPU approach on a simple dataset such as CIFAR10 to experiment with it and get a feel for working with several GPUs.

Conclusion

In this tutorial, we looked at using GPUs with TensorFlow. In contrast to a CPU, a GPU is an array of parallel processors that work together to perform heavy computations. We saw how to log and manually control device placement, how to change the default memory configuration to suit your needs, and how to use one or several GPUs in a multi-GPU system. If you have any questions or thoughts, feel free to comment below.

Published at DZone with permission of Rinu Gour. See the original article here.

Opinions expressed by DZone contributors are their own.
