Getting Started With Data Analysis: Cupcake Data

Learn how a small cupcake company achieved its analytical goals using really simple web reporting tools for data analysis.

Today, I’d like to share a really simple tutorial on how to analyze your data online. To bring it to life, I’ll tell the story of how a small company can achieve its business analytics goals using really simple web reporting tools.

Let’s imagine we have a company selling cupcakes. Our confectionery operates in a couple of cities, and we usually take orders for birthdays, office parties, and similar events.

Recently, our sample company went through a series of improvements and then started expanding. That brought us to the point where we needed to analyze sales data to answer several questions:

  • In which cities should we open new offices?
  • How do our sales managers compare in productivity?
  • Which cupcakes are the best-selling?

To answer these questions, we started exploring free web reporting tools for data analysis. To choose the right tool, we first had to clarify what data we were dealing with. All our data is collected automatically and saved in JSON format. Each order record captures when, where, for how much, and by whom our cupcakes were bought, and which sales manager communicated with the customer. So basically, it is an array of JSON objects.
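To make the data shape concrete, here is a minimal Python sketch of what such an array of order records might look like and how it could be loaded. The field names (`order_date`, `city`, `product_name`, `quantity`, `order_price`, `customer`, `sales_manager`) and all the values are assumptions made up for illustration, not the company's actual schema.

```python
import json

# Hypothetical sample data: field names and values are illustrative assumptions.
RAW_ORDERS = """
[
  {"order_date": "2018-03-01", "city": "Boston", "product_name": "Red Velvet",
   "quantity": 12, "order_price": 36.0, "customer": "Acme Corp",
   "sales_manager": "Alice"},
  {"order_date": "2018-03-02", "city": "Austin", "product_name": "Lemon",
   "quantity": 6, "order_price": 18.0, "customer": "Jane Doe",
   "sales_manager": "Bob"},
  {"order_date": "2018-03-02", "city": "Boston", "product_name": "Lemon",
   "quantity": 24, "order_price": 72.0, "customer": "Globex",
   "sales_manager": "Alice"}
]
"""

# Parse the JSON text into a list of dictionaries, one per order.
orders = json.loads(RAW_ORDERS)

# Every record is a flat object with the same keys, which is what makes
# the data easy to feed into a spreadsheet or a pivot table later on.
for order in orders:
    print(order["city"], order["product_name"], order["order_price"])
```

Because every record is flat and shares the same keys, it maps directly onto table rows, which is the shape a pivot table expects as input.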

At first, we converted the data to CSV and loaded it into Google Sheets to analyze it there. Many values repeated across rows, and all in all, the flat sheet was not convenient to work with. Pretty soon, we discovered that a pivot table would be much more useful. A pivot table takes a flat table, groups its rows, applies an aggregation such as a sum or a count, and shows the aggregated data in a new table. We looked for free web-based pivot table tools, and our shortlist contained two options:

  1. WebDataRocks: Free web reporting tool for data analysis and visualization.

  2. PivotTable.js: JavaScript pivot table library with drag-and-drop functionality.

Both components can take your data and display it in a pivot table. We created two demos with our data to see how each tool can be embedded into a web page and to compare their appearance.

In both demos, we put Product Name in rows and City in columns. Each cell shows the sum of Order Prices for the respective product and city. Both components allow pre-configuration of the data and offer a number of aggregations. We added number formatting to suit our needs. Also, we can easily rearrange the data by choosing different fields for rows, columns, and values, and we can use sorting to surface the most relevant results on the fly.
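To show what this configuration computes under the hood, here is a minimal sketch in plain Python of the same aggregation: product names in rows, cities in columns, and the sum of order prices in each cell. This is not how WebDataRocks or PivotTable.js are implemented; the helper `pivot_sum`, the field names, and the sample records are all hypothetical and only illustrate the idea.

```python
from collections import defaultdict

# Hypothetical order records: field names and values are illustrative assumptions.
orders = [
    {"product_name": "Red Velvet", "city": "Boston", "order_price": 36.0},
    {"product_name": "Red Velvet", "city": "Austin", "order_price": 24.0},
    {"product_name": "Lemon", "city": "Boston", "order_price": 18.0},
    {"product_name": "Red Velvet", "city": "Boston", "order_price": 12.0},
]


def pivot_sum(records, row_key, col_key, value_key):
    """Return {row: {column: sum of values}} for the given flat records."""
    table = defaultdict(lambda: defaultdict(float))
    for record in records:
        table[record[row_key]][record[col_key]] += record[value_key]
    # Convert the nested defaultdicts to plain dicts for easier inspection.
    return {row: dict(cols) for row, cols in table.items()}


pivot = pivot_sum(orders, "product_name", "city", "order_price")
# pivot["Red Velvet"]["Boston"] is 48.0 (36.0 + 12.0)

# Print the result as a small text grid: one row per product, one column per city.
cities = sorted({order["city"] for order in orders})
print("Product".ljust(12) + "".join(city.rjust(10) for city in cities))
for product in sorted(pivot):
    cells = "".join(f"{pivot[product].get(city, 0.0):10.2f}" for city in cities)
    print(product.ljust(12) + cells)
```

Swapping the arguments, for example passing a sales manager field as `row_key`, is the code equivalent of dragging a different field into rows or columns in the pivot table UI.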

So far, we have covered the functionality that both tools share. Here are the features unique to each:

  • With WebDataRocks, we can connect to new files or open previously saved reports. The tool provides the ability to export to the most popular formats, such as Excel or PDF. Also, conditional formatting can be added via the user interface.

  • With PivotTable.js, we can switch between renderers such as table, table bar chart, and heatmap. The data stays the same, but the representation changes.

Both tools are great, and which one to use is up to you. I hope you guys will find this tutorial useful for starting your own data analysis. Let me know your thoughts below in the comments!
