Self-Service Analytics Using Dremio

Learn about performing data transformation and data analysis using Dremio and performing data visualization using Tableau.

by Rathnadevi Manivannan · Sep. 27, 17 · Tutorial

Dremio, a self-service data platform, helps data analysts and data scientists discover, organize, accelerate, and share data at any time, irrespective of volume, velocity, location, or structure. It lets business users access data from a variety of sources without relying on developers.

In this blog, let's discuss data transformation and data analysis using Dremio and data visualization using Tableau.

Prerequisites

Download and install Dremio from here.

Data Description

Online retail data with different product types, product prices, and quantities sold from December 2010 to December 2011 is used as a data source.

Sample data source:

sample_data_source1

Synopsis:

  • Connect different data sources with Dremio
  • Perform data transformation
  • Create virtual datasets in Dremio
  • Connect virtual datasets with BI tools
  • Visualize results in Tableau

Connecting Different Data Sources With Dremio

The types of data sources available for data transformation are shown in the screenshot below:

connecting_different_data_sources_with_dremio

To connect the data sources with Dremio, perform the following:

On the Data Source Types page, select the Amazon S3 data source.

Connect to the Amazon S3 location as shown in the screenshot below:

Image title

Connect to the MySQL database and provide the required credentials as shown in the screenshot below:

Image title

Connect to Network Attached Storage (NAS) as shown in the screenshot below:

Image title

Performing Data Transformation

To transform data, perform the following:

Use the UNION operator to merge data from the three data sources (S3, MySQL, and NAS) and load the result as a virtual dataset, as shown in the screenshot below:

performing_data_transformation

As the price values are per unit, the total price needs to be calculated from the quantity.

Add Total_Price as a new field and calculate it as the unit price multiplied by the quantity, as shown in the diagram below:

performing_data_transformation1
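The merge and the Total_Price calculation can be sketched in SQL outside Dremio. The example below is a minimal illustration, not the article's actual query: an in-memory SQLite database stands in for Dremio, and the table names (s3_sales, mysql_sales, nas_sales), the column names, and the sample rows are all assumptions made for the sketch.

```python
import sqlite3

# In-memory SQLite stands in for Dremio; the three tables are hypothetical
# stand-ins for the S3, MySQL, and NAS sources.
conn = sqlite3.connect(":memory:")
for table in ("s3_sales", "mysql_sales", "nas_sales"):
    conn.execute(
        f"CREATE TABLE {table} (product TEXT, quantity INTEGER, unit_price REAL)"
    )

# Made-up sample rows, one per source.
conn.execute("INSERT INTO s3_sales VALUES ('LANTERN', 6, 3.5)")
conn.execute("INSERT INTO mysql_sales VALUES ('MUG', 2, 1.25)")
conn.execute("INSERT INTO nas_sales VALUES ('CLOCK', 4, 3.75)")

# UNION ALL keeps every row; plain UNION would also drop exact duplicates.
merged_sql = """
    SELECT product, quantity, unit_price FROM s3_sales
    UNION ALL
    SELECT product, quantity, unit_price FROM mysql_sales
    UNION ALL
    SELECT product, quantity, unit_price FROM nas_sales
"""

# Total_Price is the per-unit price multiplied by the quantity sold.
rows = conn.execute(
    "SELECT product, quantity, quantity * unit_price AS Total_Price "
    f"FROM ({merged_sql}) ORDER BY product"
).fetchall()

for row in rows:
    print(row)
```

Whether UNION or UNION ALL is the right choice depends on whether identical rows from different sources should be treated as duplicates or as separate sales.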

Aggregate the stock quantity and stock price by product in the source data, as shown in the diagram below:

performing_data_transformation2
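The aggregation step corresponds to a GROUP BY over the merged data. The following sketch again uses in-memory SQLite as a stand-in for Dremio; the merged_sales table, its columns, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical merged dataset with the Total_Price field already added.
conn.execute(
    "CREATE TABLE merged_sales (product TEXT, quantity INTEGER, total_price REAL)"
)
conn.executemany(
    "INSERT INTO merged_sales VALUES (?, ?, ?)",
    [("CLOCK", 4, 15.0), ("MUG", 2, 2.5), ("CLOCK", 1, 3.75), ("MUG", 6, 7.5)],
)

# Sum quantity and price per product.
totals = conn.execute(
    """
    SELECT product,
           SUM(quantity)    AS stock_quantity,
           SUM(total_price) AS stock_price
    FROM merged_sales
    GROUP BY product
    ORDER BY product
    """
).fetchall()

print(totals)
```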

Round the total price values to two decimal places, as shown in the diagram below:

performing_data_transformation3
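Rounding can be expressed with the SQL ROUND function. This is a small sketch using SQLite in place of Dremio, with a hypothetical product_totals table and made-up values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical per-product totals with more than two decimal digits.
conn.execute("CREATE TABLE product_totals (product TEXT, total_price REAL)")
conn.executemany(
    "INSERT INTO product_totals VALUES (?, ?)",
    [("LANTERN", 20.3399), ("MUG", 7.456)],
)

# ROUND(value, 2) keeps two digits after the decimal point.
rounded = conn.execute(
    "SELECT product, ROUND(total_price, 2) AS total_price "
    "FROM product_totals ORDER BY product"
).fetchall()

print(rounded)
```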

Creating Virtual Datasets in Dremio

After transforming the data, create virtual datasets (views) in Dremio spaces to hold the results based on the source.

The virtual dataset for purchases made by each customer is shown below:

creating_virtual_datasets_in_dremio

The virtual dataset for the highest quantity sold by product is shown in the diagram below:

creating_virtual_datasets_in_dremio1
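A virtual dataset behaves like a SQL view: it saves a query rather than a copy of the data. The sketch below shows the idea with SQLite views; the view names (purchases_by_customer, quantity_by_product), the merged_sales table, its columns, and the sample rows are assumptions for illustration and are not taken from the article's Dremio setup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical merged dataset; customer IDs and values are made up.
conn.execute(
    "CREATE TABLE merged_sales "
    "(customer_id INTEGER, product TEXT, quantity INTEGER, total_price REAL)"
)
conn.executemany(
    "INSERT INTO merged_sales VALUES (?, ?, ?, ?)",
    [(17850, "CLOCK", 4, 15.0), (17850, "MUG", 2, 2.5), (13047, "MUG", 6, 7.5)],
)

# Each view saves a query, not data, much like a virtual dataset.
conn.execute(
    """
    CREATE VIEW purchases_by_customer AS
    SELECT customer_id, SUM(total_price) AS total_spent
    FROM merged_sales GROUP BY customer_id
    """
)
conn.execute(
    """
    CREATE VIEW quantity_by_product AS
    SELECT product, SUM(quantity) AS quantity_sold
    FROM merged_sales GROUP BY product
    """
)

top_customers = conn.execute(
    "SELECT * FROM purchases_by_customer ORDER BY total_spent DESC"
).fetchall()
top_products = conn.execute(
    "SELECT * FROM quantity_by_product ORDER BY quantity_sold DESC"
).fetchall()

print(top_customers)
print(top_products)
```

Because the views are evaluated at query time, rows added to the underlying table later show up in both results without recreating the views.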

Connecting Virtual Datasets With BI Tools

Virtual datasets can be used with BI tools such as Tableau, Qlik Sense, and Power BI. To connect a virtual dataset with Tableau, export it in .tds format as shown in the diagrams below:

connecting_virtual_datasets_with_bi_tools

connecting_virtual_datasets_with_bi_tools1

Visualizing Results in Tableau

On clicking the .tds file, you will be redirected to Tableau to visualize the data.

Most purchases by customers:

most_purchases_by_customers

Maximum number of products sold:

maximum_number_of_products_sold

And that's it!

Published at DZone with permission of Rathnadevi Manivannan. See the original article here.

Opinions expressed by DZone contributors are their own.
