
From Static to Interactive: Exploring Python's Finest Data Visualization Tools

In this article, take a detailed look at three popular Python libraries for data visualization: Matplotlib, Seaborn, and Plotly.

By Sandeepkumar Racherla · Jul. 06, 23 · Analysis

Data visualization plays a fundamental role in understanding and communicating the insights we derive from our data when we analyze them.

When it comes to data analysis, Python is one of the most used programming languages for a simple reason: it’s versatile and has several libraries for creating plots, giving us the possibility to choose the one that best suits our needs.

In this article, we’ll talk about three popular Python data visualization libraries: Matplotlib, Seaborn, and Plotly. We’ll explore their characteristics, emphasize their differences, and show practical code examples on how to use them.

Matplotlib

Matplotlib is one of the oldest and most widely used data visualization libraries. It provides a wide range of methods for creating plots, giving us the possibility to visualize data ranging from scatterplots to complicated visualizations.

It’s an extremely flexible library, allowing us to customize every aspect of our graphs. Its interface is fairly low-level, and it serves as the foundation for other visualization libraries, like Seaborn.

Although Matplotlib gives us complete control over our charts, its Achilles’ heel is that it may take a lot of code to produce the results we need, particularly if we’re interested in presenting aesthetically pleasing plots.
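To give a sense of that fine-grained control, here is a minimal sketch that uses Matplotlib's object-oriented interface (plt.subplots() and Axes methods) instead of the plt.* functions used in the rest of this article. The data and the styling choices are illustrative assumptions, not part of the original examples.

```python
import matplotlib.pyplot as plt

# Illustrative data (not from the article)
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

# Create a figure and a single Axes object explicitly
fig, ax = plt.subplots(figsize=(6, 4))

# Plot on the Axes and customize individual elements
ax.plot(x, y, marker='o', linestyle='--', color='teal')
ax.set_xlabel('X-axis')
ax.set_ylabel('Y-axis')
ax.set_title('Customized Line Plot')
ax.set_xticks(x)                        # one tick per data point
ax.spines['top'].set_visible(False)     # hide the top border
ax.spines['right'].set_visible(False)   # hide the right border
```

Calling plt.show() afterwards displays the figure. Every element touched here (ticks, borders, markers) needs its own line, which is the trade-off for this level of control.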

Examples Using Matplotlib

Let’s see some plots we can create with Matplotlib, with code examples.

Scatterplots With Matplotlib

One of the very basic graphs we may be interested in when analyzing data is a scatterplot because this may give us a sense of how the data is distributed.

To create a scatterplot with Matplotlib we can use the method scatter(). Let’s see how to use it:

Python
 
import matplotlib.pyplot as plt
import random

# Generate data
num_points = 50
x = [random.random() for _ in range(num_points)]
y = [random.random() for _ in range(num_points)]

# Create a scatter plot
plt.scatter(x, y)

# Labeling the plot
plt.xlabel('X-axis') # x-axis label
plt.ylabel('Y-axis') # y-axis label
plt.title('Scatter Plot with Random Data') # plot title

# Show the plot
plt.show()


And here’s the result:

Scatter plot with random data

Image by author

As we can see, we can fully customize our plots by adding:

  • X and y axes labels with the methods plt.xlabel() and plt.ylabel()
  • A title to the plot with the method plt.title()

Lineplots With Matplotlib

Another widely used type of plot is the line plot, which connects consecutive data points with a line, making it easy to see how one variable changes with another.

Here’s how we can create one:

Python
 
import matplotlib.pyplot as plt

# Create variables
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

# Create the line plot
plt.plot(x, y, color='gold', linewidth=2)

# Label the plot
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.title('Line Plot')

# Show grid
plt.grid(True)

# Show plot
plt.show()


And we get:
Line plot
Image by author

So, with plt.plot(x, y, color='gold', linewidth=2) we can create a line plot displaying a line:

  • Colored gold, by passing the parameter color='gold'
  • With a width of 2 points, passing the parameter linewidth=2

In these cases, it may also be useful to add a grid by typing plt.grid(True) to improve our visualization experience.

Barplots With Matplotlib

Barplots are another popular kind of plot we may need as they are useful to compare values.

To plot a barplot, we can use the method plt.bar() as follows:

Python
 
import matplotlib.pyplot as plt

# Create values and categories to plot
categories = ['A', 'B', 'C', 'D']
values = [25, 50, 75, 100]

# Create the bar plot
plt.bar(categories, values, color='blue', alpha=0.75)

# Label the plot
plt.xlabel('Categories')
plt.ylabel('Values')
plt.title('Bar Plot')

# Show the plot
plt.show()


And we get:

Bar plot

Image by author

So even here we have full control of the parameters, specifying the color of the bars and their alpha parameter, which sets the opacity of the bars on a scale from 0 (fully transparent) to 1 (fully opaque).
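The effect of alpha is easiest to see when shapes overlap. The following sketch, which uses made-up random samples and histograms rather than the bar plot above, draws two half-transparent histograms on the same axes so that the overlapping region stays visible.

```python
import matplotlib.pyplot as plt
import numpy as np

# Two overlapping samples (illustrative data)
rng = np.random.default_rng(0)
a = rng.normal(0, 1, 500)
b = rng.normal(1, 1, 500)

fig, ax = plt.subplots()

# alpha=0.5 makes each histogram half-transparent,
# so neither one completely hides the other
_, _, patches_a = ax.hist(a, bins=30, color='blue', alpha=0.5, label='a')
_, _, patches_b = ax.hist(b, bins=30, color='red', alpha=0.5, label='b')
ax.legend()
```

Calling plt.show() displays the figure. With alpha=1 (the default), the second histogram would cover the first wherever they overlap.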

Comparing Variables With Matplotlib

A useful feature of Matplotlib is that we can draw several variables on the same axes to compare them. For example, suppose we want to compare the sin() and cos() functions:

Python
 
import matplotlib.pyplot as plt
import numpy as np

# Generate data
x = np.linspace(0, 10, 100)
y1 = np.sin(x) # sin function
y2 = np.cos(x) # cos function

# Create the plot
plt.plot(x, y1, label='Sin(x)')
plt.plot(x, y2, label='Cos(x)')

# Set labels and title
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.title('Plot of Sin(x) and Cos(x)')

# Add legend
plt.legend()

# Display the plot
plt.show()


And we get:

Plot of Sin(x) and Cos(x)

Image by author

Here, each call to plt.plot() receives a label, and plt.legend() displays those labels in a legend, making the plot easier to read.

Plotting Linear Regression Lines With Matplotlib

Another interesting feature of Matplotlib is the possibility to use it for “machine learning purposes” such as, for example, displaying a linear regression line. Let’s see how to do so:

Python
 
import matplotlib.pyplot as plt
import numpy as np

# Generate linearly dependent data
np.random.seed(42)
x = np.linspace(0, 10, 50)
y = 2 * x + np.random.normal(0, 1, 50)

# Perform linear regression
coefficients = np.polyfit(x, y, 1)
m, b = coefficients

# Create the plot
plt.scatter(x, y, color='blue', label='Data')
plt.plot(x, m * x + b, color='red', label='Linear Regression')

# Set labels and title
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.title('Linear Regression Plot')
# Add a legend
plt.legend()

# Display the plot
plt.show()


And we get:

Linear regression plot

Image by author

So, here we’ve created linearly dependent data, fitted a straight line to it with np.polyfit() (a degree of 1 returns the slope m and the intercept b of the least-squares line), and displayed the data and the fitted regression line.
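To check how well such a line fits, we can compute the coefficient of determination (R²) from the fitted coefficients. This is a minimal sketch on similar synthetic data; the variable names and the use of np.polyval() are choices made for this illustration.

```python
import numpy as np

# Synthetic data of the same kind as above: y = 2x + noise
rng = np.random.default_rng(42)
x = np.linspace(0, 10, 50)
y = 2 * x + rng.normal(0, 1, 50)

# Degree-1 fit: coefficients are returned highest power first
m, b = np.polyfit(x, y, 1)

# Evaluate the fitted line and compute R^2
y_hat = np.polyval([m, b], x)
ss_res = np.sum((y - y_hat) ** 2)        # residual sum of squares
ss_tot = np.sum((y - np.mean(y)) ** 2)   # total sum of squares
r_squared = 1 - ss_res / ss_tot

print(f"slope={m:.2f}, intercept={b:.2f}, R^2={r_squared:.3f}")
```

The slope should come out close to 2, the value used to generate the data, and R² close to 1, since the noise is small compared to the trend.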

Seaborn

Seaborn is a library built on top of Matplotlib which has a high-level interface that is specialized in offering us beautiful statistical visualizations. It simplifies the process of creating complex plots by providing ready-to-use functions for tasks like scatter plots, bar plots, heatmaps, and many more.

Seaborn focuses on enhancing the visual appeal and readability of plots, making it the perfect choice for exploratory data analysis and presenting results.

Examples Using Seaborn

Let’s see some code examples using Seaborn. We’ll also make some comparisons with the results obtained with Matplotlib, where possible.

Scatterplots With Seaborn

Let’s create a scatterplot with Seaborn and compare it with the one we’ve created in Matplotlib:

Python
 
import seaborn as sns
import matplotlib.pyplot as plt
import random

# Generate random data
num_points = 100
x = [random.random() for _ in range(num_points)]
y = [random.random() for _ in range(num_points)]

# Create scatterplot
sns.scatterplot(x=x, y=y)  # recent Seaborn versions require x and y as keyword arguments

# Set labels and title
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.title('Scatterplot with Random Data')

# Show the plot
plt.show()


And we get:

Scatterplot with random data

Image by Author

The first thing we can notice is the fact that, as we’ve said, Seaborn is built on top of Matplotlib. In fact, as the code shows, we used some Matplotlib code to create the plot. Also, the graphical part is not too different from the one we obtained in Matplotlib. So, let’s see some other possibilities.
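The link between the two libraries can be made explicit: axes-level Seaborn functions such as sns.scatterplot() return the Matplotlib Axes object they draw on. The sketch below, on random data, uses that return value to label the plot and add a grid.

```python
import random

import matplotlib.axes
import seaborn as sns

# Generate random data
random.seed(0)
x = [random.random() for _ in range(100)]
y = [random.random() for _ in range(100)]

# Axes-level Seaborn functions return the Matplotlib Axes they draw on
ax = sns.scatterplot(x=x, y=y)

# ...so the usual Matplotlib Axes methods apply
ax.set(xlabel='X-axis', ylabel='Y-axis', title='Scatterplot with Random Data')
ax.grid(True)

print(isinstance(ax, matplotlib.axes.Axes))
```

This means anything we know about customizing Matplotlib plots carries over to Seaborn. As before, plt.show() (after importing matplotlib.pyplot as plt) displays the figure.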

Barplots With Seaborn

Let’s create a barplot with Seaborn:

Python
 
import seaborn as sns
import matplotlib.pyplot as plt

# Sample data
categories = ['A', 'B', 'C', 'D']
values = [10, 25, 15, 30]

# Set a shiny color palette
shiny_colors = ['#FFC300', '#FF5733', '#C70039', '#900C3F']

# Create a barplot with shiny colors
sns.barplot(x=categories, y=values, palette=shiny_colors)

# Set labels and title
plt.xlabel('Categories')
plt.ylabel('Values')
plt.title('Shiny Barplot')

# Show the plot
plt.show()


And we get:

Shiny barplot

Image by author

So, here we’ve seen another customization. In fact, we’ve declared the colors we wanted with the variable shiny_colors and passed it to the method sns.barplot() in the palette parameter to apply it.
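Instead of listing hex codes by hand, we can also ask Seaborn for colors from one of its named palettes. This is a small sketch; the choice of "viridis" and of four colors is only an assumption made to match the four categories above.

```python
import seaborn as sns

# Ask Seaborn for 4 colors from a named palette
palette = sns.color_palette("viridis", 4)

# Each entry is an (r, g, b) tuple with values between 0 and 1
print(len(palette))

# The same colors as hex strings, like the ones in shiny_colors
print(palette.as_hex())
```

The resulting palette can be passed to the palette parameter in the same way as the shiny_colors list.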

Plotting Heatmaps With Seaborn

Another typical plot we can create with Seaborn is a heatmap, which is a representation of data where the values are shown within a matrix as colors. This kind of visualization typically allows us to explore patterns, correlations, and distributions in the data.

Here’s how we can create one, using the “flights” dataset provided by Seaborn itself:

Python
 
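import seaborn as sns
import matplotlib.pyplot as plt

# Load the "flights" dataset
flights = sns.load_dataset("flights")
# Reshape to a month-by-year table (recent pandas versions require keyword arguments)
flights = flights.pivot(index="month", columns="year", values="passengers")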

# Create the heatmap
sns.heatmap(flights, annot=True, fmt="d", cmap="YlGnBu")

# Show the plot
plt.show()


And we get:

Heatmap

Image by author

So, here we can see how a heatmap helps us better visualize the data, making it possible to immediately find the highest values (622 and 606) and to see which months and year they belong to (July and August 1960, respectively).
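The reshaping step is what makes the heatmap possible: pivot() turns a long table (one row per observation) into a matrix (one row per month, one column per year). Here is a minimal sketch on a tiny hand-made table, with values chosen for illustration, that also locates the largest cell programmatically.

```python
import pandas as pd

# Small long-format table (illustrative values)
long_df = pd.DataFrame({
    "year": [1959, 1959, 1960, 1960],
    "month": ["Jul", "Aug", "Jul", "Aug"],
    "passengers": [548, 559, 622, 606],
})

# Reshape to wide format: one row per month, one column per year
wide = long_df.pivot(index="month", columns="year", values="passengers")

# Locate the largest cell
peak_year = wide.max().idxmax()          # column holding the largest value
peak_month = wide[peak_year].idxmax()    # row of that value within the column

print(wide)
print(peak_month, peak_year, wide.loc[peak_month, peak_year])
```

A table shaped like wide is what sns.heatmap() expects: the index becomes the y-axis, the columns become the x-axis, and each cell is colored by its value.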

Showing Seaborn Superpowers

The superpower of Seaborn is that, being built on top of Matplotlib, it lets us achieve beautiful results with very little code.

Also, Seaborn has a vast tutorial section on its website where we can see its superpowers. It’s also equipped with some datasets so that we can plot and analyze them, to improve our skills with it.

For example, suppose we want to analyze the data related to the tips left to waiters at the restaurant. We can compare the tips at dinner and at lunch, and we can see if the customers were smokers or not.

The dataset, called “tips”, is provided by Seaborn, and we can use it to practice. Here’s how we can do so. (Note: the following code is taken from the Seaborn tutorials web page.)

Python
 
# Import seaborn
import seaborn as sns

# Apply the default theme
sns.set_theme()

# Load the “tips” dataset
tips = sns.load_dataset("tips")

# Create the visualization
sns.relplot(
    data=tips,
    x="total_bill", y="tip", col="time",
    hue="smoker", style="smoker", size="size",
)


And we get:

Seaborn - Time (Lunch and Dinner)

Image by author

So, with a single function call, we get a beautiful and meaningful plot that shows us exactly what we wanted.

Plotly

Plotly is a versatile library that provides interactive visualizations. It offers a range of chart types, including line charts, scatter plots, bar plots, and many more.

Plotly excels in creating interactive plots with hover effects, zooming, and panning capabilities, making it ideal for building interactive dashboards and web applications. 

Additionally, Plotly provides an online platform for sharing and collaborating on visualizations.

Examples Using Plotly

Let’s look at some examples of how to use Plotly, whose main strength is creating interactive plots.

But first of all, you may need to install Plotly. You can do it via the terminal like so:

Shell
 
$ pip install plotly


Scatterplots With Plotly

First of all, let’s create a scatterplot with Plotly using the module plotly.graph_objects:

Python
 
import plotly.graph_objects as go
import numpy as np

# Generate random data
np.random.seed(0)
x = np.random.rand(100)
y = np.random.rand(100)

# Create a scatter plot
fig = go.Figure(data=go.Scatter(x=x, y=y, mode='markers'))

# Add labels and title
fig.update_layout(
    title='Scatter with random data',
    xaxis_title='X-axis',
    yaxis_title='Y-axis'
)

# Show the plot
fig.show()


And we get:

Scatter with Random Data

Image by author

The first thing we can notice is that, for labeling the axes and setting the title, Plotly requires just one call to fig.update_layout(), whereas Matplotlib needs a separate call for each.

Also, we can see how graphically clear the plot is: it shows a grid by default, without us explicitly coding it, as opposed to Matplotlib and Seaborn.

We can also appreciate its interactivity, as it:

  • Shows us the values of x and y when we move the cursor on a spot.
  • Provides some features in the top-right corner to zoom, save the image, and perform other actions.

Bubble Plots With Plotly

Interactivity is particularly useful in bubble plots, especially when the bubbles overlap.

To make such a plot, we’ll use the module plotly.express provided by the Plotly library, like so:

Python
 
import plotly.express as px
import numpy as np

# Generate random data
np.random.seed(0)
x = np.random.rand(50)
y = np.random.rand(50)
sizes = np.random.rand(50) * 30  # Random sizes for the bubbles

# Create a bubble plot using Plotly express
fig = px.scatter(x=x, y=y, size=sizes, color=sizes,
                 size_max=30, color_continuous_scale='Viridis')

# Add labels and title
fig.update_layout(
    title='Bubble Plot Example',
    xaxis_title='X-axis',
    yaxis_title='Y-axis'
)

# Show the plot
fig.show()


And we get:

Bubble plot example

Image by author

So, in such cases, we can really appreciate the interactive features provided by Plotly.

Interactive Plots for Machine Learning With Plotly

If you’re familiar with machine learning, you may benefit from the interactivity that Plotly gives us in machine learning plots.

For example, we can fit the train set with a Random Forest classifier and create an interactive ROC curve, with the AUC score in its title, using Plotly. Let’s see how:

Python
 
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import roc_curve, roc_auc_score
import plotly.express as px

# Generate a dataset for classification
X, y = make_classification(n_samples=1000, n_features=10, n_informative=5, random_state=0)

# Split the data into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Scale the features, fitting the scaler on the train set only to avoid data leakage
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Fit train set with a Random Forest Classifier
rf_classifier = RandomForestClassifier(random_state=0)
rf_classifier.fit(X_train, y_train)

# Get the predicted probabilities for the positive class (class 1)
y_pred_proba = rf_classifier.predict_proba(X_test)[:, 1]

# Compute the ROC curve values
fpr, tpr, thresholds = roc_curve(y_test, y_pred_proba)

# Compute the AUC score
auc_score = roc_auc_score(y_test, y_pred_proba)

# Create an ROC curve using Plotly Express
roc_df = pd.DataFrame({'False Positive Rate': fpr, 'True Positive Rate': tpr})
fig = px.line(roc_df, x='False Positive Rate', y='True Positive Rate', title=f'ROC Curve (AUC = {auc_score:.2f})')

# Show plot
fig.show()


And we get:

ROC Curve

Image by author

Conclusions

In this article, we’ve seen an overview of some plots we can create with three popular Python libraries for data visualization: Matplotlib, Seaborn, and Plotly.

Although there is no absolute right or wrong, we can synthesize their usage like so:

  • Matplotlib generally requires a lot of code to create aesthetically pleasing plots, so it’s more suitable in cases where we need quick, raw plots that give a sense of the data and its distribution.
  • Seaborn provides aesthetically pleasing statistical plots, so it’s particularly suitable for the data exploration phase. Also, it requires little code to create complex visualizations, so it can be suitable even for presenting data.
  • Plotly creates interactive plots with few lines of code, so it’s particularly suitable for presenting data, or for the exploratory phase, where its interactivity helps us better understand the data (for example, in a bubble plot where the bubbles overlap).