A Quick Discussion on Python's Pandas
Everything you need to know about Pandas' Series data type.
In this section, we are going to have a quick discussion about Python Pandas and some of its data types.
Pandas is an open-source library built on top of NumPy. It supports fast analysis, data cleaning, data preparation, and data visualization, and it can read data from a wide variety of sources.
How to Install Pandas
Go to your command line or terminal and use:
pip install pandas
(If you haven't installed Python, you can go directly to https://www.python.org/downloads/.)
You can also install Pandas with Anaconda's distribution of Python.
conda install pandas
Either one should work.
Let’s start with the first data type we encounter when working with Pandas.
Series and How They Interact With Pandas
A Series is very similar to a NumPy array (in fact, it is built on top of the NumPy array object). What differentiates a Series from a NumPy array is that a Series can have axis labels, meaning it can be indexed by a label rather than only by position. It can also hold any arbitrary Python object.
Let us import Pandas and explore the Series object.
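A minimal sketch of the import step, assuming both packages are already installed. The aliases np and pd are a widely used convention, not a requirement:

```python
import numpy as np   # conventional alias for NumPy
import pandas as pd  # conventional alias for Pandas

# The Series class is exposed at the top level of the pandas package.
series_class = pd.Series
print(series_class.__name__)  # Series
```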
Creating a Series
You can convert a list, NumPy array, or dictionary to a Series:
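For instance, converting a plain list might look like the sketch below. The variable names and values are illustrative and are not taken from the original article:

```python
import pandas as pd

my_list = [10, 20, 30]  # illustrative values

# With no index given, Pandas assigns integer positions 0, 1, 2 as the index.
from_list = pd.Series(data=my_list)

print(from_list)
print(list(from_list.index))  # [0, 1, 2]
```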
The result looks a lot like a NumPy array, except that it has an index alongside the corresponding data.
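To attach labels instead of the default integer positions, one option is to pass an index argument. A minimal sketch, with hypothetical labels and values:

```python
import pandas as pd

labels = ["a", "b", "c"]  # hypothetical labels
my_list = [10, 20, 30]    # illustrative values

# Each label is paired with the value at the same position.
labeled = pd.Series(data=my_list, index=labels)

print(labeled)
print(labeled["b"])  # 20
```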
When the index is labeled, we can look up data points by their label. We can also build a Series directly from:
NumPy arrays
Dictionaries
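Both of these cases can be sketched as follows. The data is illustrative; the point is that a dictionary supplies its own labels (its keys), while an array needs an index passed in if you want labels:

```python
import numpy as np
import pandas as pd

labels = ["a", "b", "c"]          # hypothetical labels
arr = np.array([10, 20, 30])      # illustrative values
d = {"a": 10, "b": 20, "c": 30}   # same data as a dictionary

from_array = pd.Series(arr, index=labels)  # labels supplied explicitly
from_dict = pd.Series(d)                   # keys become the index

print(from_array)
print(from_dict)
print(list(from_dict.index))  # ['a', 'b', 'c']
```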
Note: A Pandas Series can hold a variety of object types:
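As a rough illustration (the contents here are made up for the example), a Series can store strings or even Python's built-in functions:

```python
import pandas as pd

# A Series of strings.
strings = pd.Series(data=["a", "b", "c"])

# A Series of built-in functions; Pandas stores these with the generic object dtype.
functions = pd.Series([sum, print, len])

print(strings[0])              # a
print(functions.dtype)         # object
print(functions[2]("pandas"))  # 6, because the element at label 2 is len
```

Storing functions in a Series is rarely useful in practice; it simply shows how flexible the container is.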
Grabbing Data From a Series
Here, we create two Series and store them in variables. To grab data out of a Series, we refer to its index label:
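The original data is not shown, so the sketch below uses hypothetical values. The names ser1 and ser2 and the country labels are assumptions, chosen so that only USA and India appear in both Series:

```python
import pandas as pd

# Hypothetical data: only "USA" and "India" are shared between the two Series.
ser1 = pd.Series([1, 2, 3, 4], index=["USA", "India", "China", "Japan"])
ser2 = pd.Series([1, 4, 5, 6], index=["USA", "India", "Italy", "Germany"])

# Look values up by label, much like a dictionary key.
print(ser1["USA"])    # 1
print(ser2["Italy"])  # 5
```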
Performing Various Operations on a Series
We can also perform various operations with Series, such as addition:
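A minimal sketch of addition, again with hypothetical data in which only USA and India are present in both Series:

```python
import pandas as pd

# Hypothetical data: only "USA" and "India" are shared between the two Series.
ser1 = pd.Series([1, 2, 3, 4], index=["USA", "India", "China", "Japan"])
ser2 = pd.Series([1, 4, 5, 6], index=["USA", "India", "Italy", "Germany"])

# Values are paired up by label, not by position.
total = ser1 + ser2

print(total)
print(total["USA"])              # 2.0
print(total["India"])            # 6.0
print(int(total.isna().sum()))   # 4 labels had no partner, so they are NaN
print(total.dtype)               # float64
```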
Notice what is happening. Pandas tries to match up the operation based on the index. India and USA appear in both Series, so the addition is performed for those labels. For the other labels there is no match, so the result is a null value (NaN). Also notice that the data type is converted to float: NaN is a floating-point value, and the conversion lets the result keep as much information as possible.
Division and multiplication align on the index in the same way.
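A short sketch of both operations, using the same hypothetical data as an assumption (only USA and India are shared):

```python
import pandas as pd

# Hypothetical data: only "USA" and "India" are shared between the two Series.
ser1 = pd.Series([1, 2, 3, 4], index=["USA", "India", "China", "Japan"])
ser2 = pd.Series([1, 4, 5, 6], index=["USA", "India", "Italy", "Germany"])

product = ser1 * ser2   # multiplies values that share a label
quotient = ser1 / ser2  # divides values that share a label

print(product["India"])   # 8.0
print(quotient["India"])  # 0.5
print(int(product.isna().sum()))  # 4 unmatched labels become NaN
```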
Thanks for reading!
I hope you enjoyed it.
Note: There will be a follow-up to this article. We will discuss the DataFrame and how it works in Pandas. Then, we'll discuss some particular topics, such as how to work with missing data and how to use the Pandas groupby function.