In this post, we go over the basics of using NumPy, a powerful library for Python that allows for more advanced data manipulation and mathematics.
What Is NumPy?
NumPy is a powerful Python library that is primarily used for performing computations on multidimensional arrays. The name NumPy is derived from two words: Numerical Python. NumPy provides a large set of library functions and operations that help programmers perform numerical computations easily. These kinds of numerical computations are widely used in tasks like:
- Machine Learning Models: Machine Learning algorithms rely on numerical computations on matrices, such as matrix multiplication, transposition, and addition. NumPy makes these computations easy to write and fast to run. NumPy arrays are used to store both the training data and the parameters of Machine Learning models.
- Image Processing and Computer Graphics: Images in the computer are represented as multidimensional arrays of numbers, so NumPy is a natural choice for working with them. NumPy provides library functions for fast manipulation of images, such as mirroring or rotating an image.
- Mathematical tasks: NumPy is useful for various mathematical tasks like numerical integration, differentiation, interpolation, extrapolation, and many others. As such, it serves as a quick Python-based alternative to MATLAB for mathematical tasks.
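As a rough illustration of the image and mathematical tasks listed above, the sketch below treats a tiny 2x3 array as a stand-in for a grayscale image, mirrors and rotates it, and then interpolates a value between two known points. The data and variable names are made up for illustration. Note that np.rot90 only rotates in multiples of 90 degrees; rotating by an arbitrary angle would need a dedicated image library.

```python
import numpy as np

# A tiny 2x3 "image": each number stands in for a pixel intensity (made-up data).
image = np.array([[1, 2, 3],
                  [4, 5, 6]])

mirrored = np.fliplr(image)  # flip left to right (mirror image)
rotated = np.rot90(image)    # rotate 90 degrees counter-clockwise

print(mirrored)
print(rotated)

# Linear interpolation: estimate y at x = 1.5 from the points (1, 10) and (2, 20).
estimate = np.interp(1.5, [1.0, 2.0], [10.0, 20.0])
print(estimate)  # 15.0
```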
NumPy Installation
The fastest and easiest way to install NumPy on your machine is to run the following command in the shell:
pip install numpy
This will install the latest stable version of NumPy on your machine. Installing through pip is the simplest way to install most Python packages. Let us now talk about the most important concept in NumPy: the NumPy array.
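One quick way to confirm that the installation worked is to import the library and print its version. This is only a minimal check; the version string you see depends on what pip installed.

```python
import numpy as np

# If this import succeeds, NumPy is installed; the string shows which release.
print(np.__version__)
```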
Arrays in NumPy
The most important data structure that NumPy provides is a powerful object called a NumPy array. A NumPy array resembles a Python list, but all of its elements share a single data type, which is part of what makes computations on it fast. NumPy arrays come with a large number of functions and operators that help in quickly writing high-performance code for the kinds of computations discussed above. Let us see how we can define a one-dimensional NumPy array:
import numpy as np
my_array = np.array([1, 2, 3, 4, 5])
print(my_array)
In the simple example above, we first imported the NumPy library using import numpy as np. Then we created a NumPy array of 5 integers and printed it. Go ahead and try it on your own machine. Use the steps under the “NumPy Installation” section to make sure that you have installed NumPy on your machine.
Let us now see what we can do with this particular NumPy array.
print(my_array.shape)
This will print the shape of the array that we created: (5,). This indicates that my_array is an array with 5 elements.
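Besides the shape, an array carries a few other descriptive attributes. The sketch below prints the ones most often useful when getting started; the exact integer type reported by dtype depends on your platform.

```python
import numpy as np

my_array = np.array([1, 2, 3, 4, 5])

print(my_array.shape)  # (5,)
print(my_array.ndim)   # 1 (number of dimensions)
print(my_array.size)   # 5 (total number of elements)
print(my_array.dtype)  # an integer type such as int64 (platform-dependent)
```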
We can print the individual elements as well. Just like a normal Python list, NumPy arrays are indexed from 0.
print(my_array[0])
print(my_array[1])
The above commands will print 1 and 2 respectively on the terminal. We can also modify the elements of a NumPy array. For instance, suppose we write the following 2 commands:
my_array[0] = -1
print(my_array)
We will get [-1  2  3  4  5] on the screen as output.
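Indexing is not limited to single non-negative positions. As a small sketch that goes slightly beyond the examples above, negative indices count from the end of the array, and a slice selects or updates several elements at once.

```python
import numpy as np

my_array = np.array([1, 2, 3, 4, 5])

print(my_array[-1])   # last element: 5
print(my_array[1:4])  # elements at indices 1, 2 and 3: [2 3 4]

my_array[1:3] = 0     # assign to indices 1 and 2 in one step
print(my_array)       # [1 0 0 4 5]
```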
Now suppose we want to create a NumPy array of length 5 with all elements set to 0. NumPy provides an easy way to do this:
my_new_array = np.zeros(5)
print(my_new_array)
We will get [0. 0. 0. 0. 0.] as the output. Similar to np.zeros, we also have np.ones. What if we want to create an array of random values?
my_random_array = np.random.random(5)
print(my_random_array)
The output will look something like [0.22051844 0.35278286 0.11342404 0.79671772 0.62263151]. Your output will differ, since the random function assigns each element a random value between 0 (inclusive) and 1 (exclusive).
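For completeness, here is np.ones in action, along with one way to make random values repeatable. The seeded generator from np.random.default_rng is not used in the original example; it is shown here as an optional alternative, and the seed value 42 is arbitrary.

```python
import numpy as np

print(np.ones(5))  # [1. 1. 1. 1. 1.]

# Two generators created with the same seed produce the same sequence.
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)
first = rng_a.random(5)
second = rng_b.random(5)

print(np.array_equal(first, second))  # True: same seed, same values
```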
Let us now see how we can create 2-dimensional arrays using NumPy.
my_2d_array = np.zeros((2, 3))
print(my_2d_array)
This will print the following on the screen:
[[0. 0. 0.]
[0. 0. 0.]]
Guess what the output would be for the following code:
my_2d_array_new = np.ones((2, 4))
print(my_2d_array_new)
Here it is:
[[1. 1. 1. 1.]
[1. 1. 1. 1.]]
Basically, when you use the function np.ones(), you can pass a tuple that specifies the shape of the array. In the two examples above, we used the tuples (2, 3) and (2, 4) to indicate 2 rows with 3 and 4 columns respectively. Multidimensional arrays like the ones above can be indexed using the my_array[i][j] notation, where i indicates the row number and j indicates the column number. Both i and j start from 0.
my_array = np.array([[4, 5], [6, 1]])
print(my_array[0][1])
The output of the above code snippet is 5, since it is the element present in the index 0 row and index 1 column.
You can also print the shape of my_array as follows:
print(my_array.shape)
The output is (2, 2), indicating that there are 2 rows and 2 columns in the array.
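NumPy also accepts both indices inside a single pair of brackets, separated by a comma. The short sketch below shows that this form selects the same element as the chained notation used above.

```python
import numpy as np

my_array = np.array([[4, 5], [6, 1]])

print(my_array[0][1])  # 5
print(my_array[0, 1])  # 5, the same element in a single indexing step
print(my_array.shape)  # (2, 2)
```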
NumPy provides a powerful way to extract rows/columns of a multidimensional array. For instance, consider the example of my_array that we defined above.
[[4 5]
 [6 1]]
Suppose we want to extract all elements of the second column (index 1) from it. Here, the second column consists of two elements: 5 and 1. To do so, we can do the following:
my_array_column_2 = my_array[:, 1]
print(my_array_column_2)
Observe that, instead of a row number, we have provided a colon (:), and for the column number we have used the value 1. The output will be: [5 1].
We can similarly extract a row from a multidimensional NumPy array. Now, let us see the power that NumPy provides when it comes to performing computations on several arrays.
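Extracting a row works the same way, with the colon moved to the column position. The sketch below uses the same my_array; the names row_2 and column_1 are chosen only for this illustration.

```python
import numpy as np

my_array = np.array([[4, 5], [6, 1]])

row_2 = my_array[1, :]     # second row (index 1), all columns
print(row_2)               # [6 1]

column_1 = my_array[:, 0]  # first column (index 0), all rows
print(column_1)            # [4 6]
```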
Array Manipulations in NumPy
Using NumPy, you can easily perform mathematics on arrays. For instance, you can add NumPy arrays, you can subtract them, you can multiply them, and even divide them. Here are a few examples of this:
import numpy as np
a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([[5.0, 6.0], [7.0, 8.0]])
total = a + b
difference = a - b
product = a * b
quotient = a / b
print(f"Sum =\n{total}")
print(f"Difference =\n{difference}")
print(f"Product =\n{product}")
print(f"Quotient =\n{quotient}")
The output will be as follows:
Sum =
[[ 6.  8.]
 [10. 12.]]
Difference =
[[-4. -4.]
 [-4. -4.]]
Product =
[[ 5. 12.]
 [21. 32.]]
Quotient =
[[0.2        0.33333333]
 [0.42857143 0.5       ]]
As you can see, the multiplication operator performs element-wise multiplication instead of matrix multiplication. To perform matrix multiplication, you can do the following:
matrix_product = a.dot(b)
print(f"Matrix Product =\n{matrix_product}")
The output would be:
Matrix Product =
[[19. 22.]
 [43. 50.]]
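For two-dimensional arrays like a and b, the @ operator and np.matmul are alternative spellings of the same matrix product. The sketch below checks that they agree with a.dot(b), and also shows transposition, which was mentioned earlier as a common matrix operation.

```python
import numpy as np

a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([[5.0, 6.0], [7.0, 8.0]])

# For 2-D arrays, these three forms compute the same matrix product.
print(np.array_equal(a @ b, a.dot(b)))         # True
print(np.array_equal(np.matmul(a, b), a @ b))  # True

# Transposition swaps rows and columns.
print(a.T)
```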
As you can see, NumPy is really powerful in terms of the library functions that it provides. You can perform large calculations in a single line of code with the excellent interface that NumPy exposes. This is what makes it an elegant tool for various numerical computations. You should definitely consider mastering it if you wish to develop a career as a mathematician or as a data scientist. You need to know Python before becoming proficient in NumPy.