Pandas: Pandas Tutor
This post explores Pandas Tutor, a tool for learning and debugging Pandas programs. Let’s learn more about it by executing a few basic Pandas programs.
This article focuses on “Pandas Tutor”, a tool for learning and debugging Pandas programs. Pandas Tutor visualizes how Pandas code transforms the data, which makes Pandas quicker to learn and also makes the tool effective for debugging our own programs. Let’s explore it by executing a few basic Pandas programs.
To access Pandas Tutor online, please use this link.
First, set up the data. For simplicity’s sake, we will create our own dictionary and build a DataFrame from it.
Once the dictionary and the DataFrame are created, we click the “Visualize pandas’ expression on the last line” button. The tool then executes “df.head()” and shows the input given to the program and the output it generated. We will perform all our basic operations on this data and visualize each transformation.
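As a minimal sketch of the setup step, the snippet below builds a small employee DataFrame from a dictionary. Only the “Age” and “Department” columns and the employee ‘David’ are mentioned in this article; the “Name” column and all the values are made-up sample data.

```python
import pandas as pd

# Hypothetical sample data: only "Age", "Department" and the employee
# "David" come from the article; every value here is an assumption.
data = {
    "Name": ["Alice", "Bob", "Carol", "David"],
    "Age": [32, 25, 38, 45],
    "Department": ["IT", "HR", "IT", "Finance"],
}
df = pd.DataFrame(data)

# Pandas Tutor visualizes the expression on the last line.
preview = df.head()
print(preview)
```

Because the frame has fewer than five rows, “df.head()” returns all of them.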
Transformation – Filtering
Let’s execute a basic filter operation that selects all the employees whose age is less than 40.
The operation on the last line, “df[df['Age'] < 40]”, is the one being visualized. The tool shows the input given to the program, how the filtering transformation is applied, and the output it generates. The arrows make it clear that the employee ‘David’ is not carried over into the result DataFrame.
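The filter expression “df[df['Age'] < 40]” can be tried with a sketch like the one below. The rows are made-up sample data, chosen so that ‘David’ is the only employee aged 40 or over.

```python
import pandas as pd

# Hypothetical sample data (not the article's original values).
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Carol", "David"],
    "Age": [32, 25, 38, 45],
    "Department": ["IT", "HR", "IT", "Finance"],
})

# df["Age"] < 40 builds a boolean mask; indexing with it keeps
# only the rows where the mask is True.
mask = df["Age"] < 40
under_40 = df[mask]
print(under_40)
```

The filtered rows keep their original index labels, which is what lets the tool draw an arrow from each input row to its place in the output.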
Transformation – Sorting
Let’s sort the DataFrame by a single column, “Age”, and see what kind of visualization the tool produces.
We are sorting by the “Age” column in ascending order. Pandas Tutor takes care of the small details: it highlights the column on which the sort is performed, and the arrows show where each row, identified by its index, ends up after sorting.
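A single-column sort along these lines can be written with “sort_values”. The data below is made-up sample data, and ascending order is spelled out explicitly even though it is the default.

```python
import pandas as pd

# Hypothetical sample data (not the article's original values).
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Carol", "David"],
    "Age": [32, 25, 38, 45],
    "Department": ["IT", "HR", "IT", "Finance"],
})

# Sort rows by "Age", smallest first. The original index labels
# travel with their rows, so the index is reordered too.
by_age = df.sort_values("Age", ascending=True)
print(by_age)
```

Since the index labels move with the rows, the index of the result records where each row came from.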
Let’s try to sort the DataFrame by multiple columns:
The two columns used for sorting are both highlighted.
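The article does not say which two columns were used, so the sketch below assumes “Department” followed by “Age”, applied to made-up sample data. Passing a list to “sort_values” sorts by the first column and breaks ties with the second.

```python
import pandas as pd

# Hypothetical sample data (not the article's original values).
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Carol", "David"],
    "Age": [32, 25, 38, 45],
    "Department": ["IT", "HR", "IT", "Finance"],
})

# Assumed column choice: order by "Department" first, then by "Age"
# within each department.
by_dept_age = df.sort_values(["Department", "Age"])
print(by_dept_age)
```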
Transformation – nunique
“nunique” is a simple function that returns the count of unique values in each column of a DataFrame. Why highlight such a simple operation in this article? Because Pandas Tutor shows that the operation returns a “Series” object, and knowing the return type, especially Series, is important when learning Pandas.
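The return type can also be checked directly in code. The sketch below uses made-up sample data and confirms that “df.nunique()” gives back a Series indexed by the column names.

```python
import pandas as pd

# Hypothetical sample data (not the article's original values).
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Carol", "David"],
    "Age": [32, 25, 38, 45],
    "Department": ["IT", "HR", "IT", "Finance"],
})

# One count of distinct values per column, returned as a Series
# whose index holds the column names.
unique_counts = df.nunique()
print(type(unique_counts))
print(unique_counts)
```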
Transformation – groupby
Let’s count the total number of employees in each department by using the groupby function on the “Department” column of the DataFrame.
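One way to express this count is “groupby” followed by “size”, sketched below on made-up sample data. The article does not show which aggregation was used, so “size()” is an assumption; calling “count()” on a selected column would be an alternative.

```python
import pandas as pd

# Hypothetical sample data (not the article's original values).
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Carol", "David"],
    "Age": [32, 25, 38, 45],
    "Department": ["IT", "HR", "IT", "Finance"],
})

# Group rows by department and count the rows in each group.
# size() is an assumed choice; it returns a Series keyed by department.
employees_per_dept = df.groupby("Department").size()
print(employees_per_dept)
```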
I highly recommend “Pandas Tutor” whether you are a beginner or an experienced user. I wish I had known about the tool earlier; it is helpful both for learning Pandas and for debugging an existing Pandas program.
I hope you find the article useful. Thank you for reading.