Getting Started With PyTorch – Deep Learning in Python
PyTorch is one of the fastest-growing Python-based frameworks for deep learning. Let's have a look at the basics and how to build a simple model with it.
Are you trying to design a model using machine learning?
If so, PyTorch is a strong choice. This article will help you understand the basics of deep learning and of PyTorch itself. We will begin by explaining what PyTorch is and the advantages of using it for your projects, and end with a quick comparison between PyTorch and NumPy using an example.
Introduction to PyTorch
Launched by Facebook in 2016, PyTorch is an open-source machine learning framework. It is built on the Torch library, and its primary goal is to enable fast implementation of neural networks.
What makes PyTorch particularly well suited to building neural networks is its use of dynamic computational graphs. Unlike deep learning frameworks that rely on static graphs, PyTorch builds the graph on the fly: it is reconstructed from the operations actually executed on each iteration, so the structure of the computation can change from run to run.
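As a minimal sketch of what "dynamic" means in practice, the snippet below uses ordinary Python control flow whose number of steps depends on the data itself, something a static graph can't express directly:

import torch

x = torch.randn(3, requires_grad=True)
y = x
# The graph is recorded as operations run, so a different
# number of multiplications may be recorded on each execution
while y.norm() < 10:
    y = y * 2
y.sum().backward()  # backprop through whatever graph was built
print(x.grad)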
But that's not the only reason for PyTorch's widespread adoption. Here are some of the other advantages it offers.
Leverages Python
It is well known that Python is one of the most popular programming languages of the decade, and much of today's machine learning and artificial intelligence tooling is written in it. PyTorch is Pythonic in nature, so Python developers can quickly understand and work with the framework. This makes it popular compared to other deep learning frameworks.
Simple to Learn
Even if you aren't well versed in Python, learning PyTorch is fairly easy and shouldn't worry you much. The syntax is comparatively simple, and the overall framework is intuitive.
Quick Debugging
Because PyTorch is tightly integrated with Python, you can debug PyTorch programs with the standard Python debugging tools you already know.
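For example, the standard pdb debugger can pause a script mid-computation and let you inspect tensors interactively (a quick illustrative sketch):

import pdb
import torch

x = torch.randn(4, 3)
h = x.clamp(min=0)
pdb.set_trace()  # drop into the standard Python debugger here;
                 # inspect h.shape, h.mean(), etc. interactively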
Community Support
In addition to all of the above, PyTorch is backed by a large community of developers and programmers. It also has well-organized, structured documentation, which makes it even easier to create ML models with the framework.
Comparison
Besides PyTorch, NumPy is another package frequently used to build networks by hand. To understand the difference, let's build the same network twice, first with NumPy and then with PyTorch.
NumPy
NumPy offers an n-dimensional array object, along with a collection of functions for creating and manipulating those arrays.
However, NumPy is a general-purpose scientific computing library: it knows nothing about deep learning, gradients, or computational graphs.
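For instance, a few of the NumPy operations used in the example below look like this:

import numpy as np

a = np.random.randn(2, 3)   # 2x3 array of random values
b = np.maximum(a, 0)        # elementwise clamp (a "ReLU" by hand)
c = b.dot(b.T)              # matrix multiplication, giving a 2x2 result
print(c.shape)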
In the example below, we use NumPy to fit a two-layer network to random data. The forward and backward passes are implemented manually using NumPy operations.
# -*- coding: utf-8 -*-
import numpy as np
# N is batch size; D_in is input dimension;
# H is a hidden dimension; D_out is the output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10
# Create random input and output data
x = np.random.randn(N, D_in)
y = np.random.randn(N, D_out)
# Randomly initialize weights
w1 = np.random.randn(D_in, H)
w2 = np.random.randn(H, D_out)
learning_rate = 1e-6
for t in range(500):
    # Forward pass: compute predicted y
    h = x.dot(w1)
    h_relu = np.maximum(h, 0)
    y_pred = h_relu.dot(w2)

    # Compute and print loss
    loss = np.square(y_pred - y).sum()
    print(t, loss)

    # Backprop to compute gradients of w1 and w2 with respect to loss
    grad_y_pred = 2.0 * (y_pred - y)
    grad_w2 = h_relu.T.dot(grad_y_pred)
    grad_h_relu = grad_y_pred.dot(w2.T)
    grad_h = grad_h_relu.copy()
    grad_h[h < 0] = 0
    grad_w1 = x.T.dot(grad_h)

    # Update weights
    w1 -= learning_rate * grad_w1
    w2 -= learning_rate * grad_w2
PyTorch
This works, but it is far from ideal. NumPy, useful as it is, cannot use GPUs to accelerate its numerical computations.
For modern deep neural networks, GPUs often provide speedups of 50x or more, so NumPy alone isn't enough. This is where PyTorch comes into play.
The Tensor is one of PyTorch's fundamental concepts. A PyTorch Tensor is conceptually the same as a NumPy array: an n-dimensional array, together with a set of functions for operating on it. The difference is that a Tensor can run on a GPU to accelerate numerical computations, and behind the scenes, Tensors can keep track of gradients and the computational graph. Like NumPy arrays, Tensors are also a general-purpose tool for scientific computing.
As stated above, PyTorch Tensors can run on a GPU; all you need to do is move the Tensor to the GPU device.
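A minimal sketch of moving a Tensor to the GPU (assuming a CUDA-capable GPU is available):

import torch

x = torch.randn(3, 3)      # lives on the CPU by default
if torch.cuda.is_available():
    x = x.to("cuda")       # move the same data to the GPU
y = x.mm(x)                # runs on whichever device x lives on
print(y.device)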
To help you understand better, here is the same two-layer network, fit to random data using PyTorch Tensors instead of NumPy arrays. As before, we implement both the forward and backward passes manually.
import torch
dtype = torch.float
device = torch.device("cpu")
# device = torch.device("cuda:0") # Uncomment this to run on GPU
# N is batch size; D_in is input dimension;
# H is a hidden dimension; D_out is the output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10
# Create random input and output data
x = torch.randn(N, D_in, device=device, dtype=dtype)
y = torch.randn(N, D_out, device=device, dtype=dtype)
# Randomly initialize weights
w1 = torch.randn(D_in, H, device=device, dtype=dtype)
w2 = torch.randn(H, D_out, device=device, dtype=dtype)
learning_rate = 1e-6
for t in range(500):
    # Forward pass: compute predicted y
    h = x.mm(w1)
    h_relu = h.clamp(min=0)
    y_pred = h_relu.mm(w2)

    # Compute and print loss
    loss = (y_pred - y).pow(2).sum().item()
    if t % 100 == 99:
        print(t, loss)

    # Backprop to compute gradients of w1 and w2 with respect to loss
    grad_y_pred = 2.0 * (y_pred - y)
    grad_w2 = h_relu.t().mm(grad_y_pred)
    grad_h_relu = grad_y_pred.mm(w2.t())
    grad_h = grad_h_relu.clone()
    grad_h[h < 0] = 0
    grad_w1 = x.t().mm(grad_h)

    # Update weights using gradient descent
    w1 -= learning_rate * grad_w1
    w2 -= learning_rate * grad_w2
As noted, we had to implement the forward and backward passes manually in both examples above. For a small two-layer network, that isn't a problem, but as networks grow, writing backpropagation by hand quickly becomes tedious and error-prone. Thankfully, PyTorch provides a package, autograd, that automates the process.
To use it, create the weights with requires_grad=True and replace the training loop with the following:
# Create weights with requires_grad=True so autograd tracks them
w1 = torch.randn(D_in, H, device=device, dtype=dtype, requires_grad=True)
w2 = torch.randn(H, D_out, device=device, dtype=dtype, requires_grad=True)

learning_rate = 1e-6
for t in range(500):
    # Forward pass: intermediate values are tracked automatically
    y_pred = x.mm(w1).clamp(min=0).mm(w2)
    loss = (y_pred - y).pow(2).sum()
    if t % 100 == 99:
        print(t, loss.item())
    # Backward pass: autograd computes gradients of the loss
    # with respect to all tensors that require gradients
    loss.backward()
    # Update weights outside of autograd's tracking, then zero
    # the gradients so they don't accumulate across iterations
    with torch.no_grad():
        w1 -= learning_rate * w1.grad
        w2 -= learning_rate * w2.grad
        w1.grad.zero_()
        w2.grad.zero_()
That covers the basics of PyTorch and its usage. Run the code above to get started with the framework; it should serve you well in your deep learning projects.