
Tableau and Treasure Data: From Logging to Visualization



Previously, we looked at how to log events from our very simple Python “Rock, Paper, Scissors” app. While there are many types of logs (error logs, sensor logs, event logs, payment and CRM logs, to name a few), the principle is effectively the same for any sort of log, regardless of how the log is invoked or what purpose it actually serves.

To glean insights from our data collection efforts, we’ll need to look at ways to visualize our logs. There are many ways to do this, including RStudio and D3.js (topics that merit coverage in future posts), as well as out-of-the-box solutions such as Chartio and Tableau.

We’ll look at the last of these: how we’re collecting the data from our app, how we’re querying it in Treasure Data, and finally how we’re connecting to Tableau for dashboards and visualizations.

Along the way, our visualizations will teach us an interesting fact about Python’s random library. But first things first…

To recap the last post, we looked at ways to get logging events out of the app we created. At the simplest level, it involves 1) importing a sender; 2) importing events; 3) structuring the events (which determines the schema of the time-series database table where those events will end up); and lastly, 4) sending the events. Although we’ve used Python 2.7 in our related examples, it’s more or less the same process regardless of the language or SDK we end up using.

All of the above is shown in this example, but to sum up, it looks like this in Python:
Importing sender and events, and setting up our sender:

from fluent import sender
from fluent import event
…
sender.setup('td.rsp_db', host='localhost', port=24224)

Structuring and sending our events:

event.Event('game_data', {'verdict': verdict})
event.Event('game_data', {
    'player': 'Player 1',
    'choice': player1
})

It’s worth noting that languages like Python and Ruby require td-agent to be running in order to send events, while other SDKs, such as our JavaScript SDK, don’t.
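As a quick sanity check before sending events (a minimal sketch, not part of the original app), you can verify from Python that a local td-agent is actually listening on its default forward port; the port number 24224 matches the sender.setup() call above:

```python
import socket


def td_agent_listening(host='localhost', port=24224, timeout=1.0):
    """Return True if something is accepting TCP connections on host:port."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except (socket.error, socket.timeout):
        return False
    sock.close()
    return True


if __name__ == '__main__':
    print('td-agent reachable:', td_agent_listening())
```

If this prints False, events sent via the fluent logger will be silently buffered or dropped rather than reaching Treasure Data, so it's worth checking early.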

Once we have our events being sent to Treasure Data, it’s a simple enough matter to query them out.
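For example, assuming the game_data table created by the logger above (with columns player, choice, and verdict — names from our example app, so adjust them to your own schema), a query along these lines produces the per-player counts our visualization will need:

```sql
SELECT player, choice, COUNT(1) AS num_records
FROM game_data
WHERE player IS NOT NULL
GROUP BY player, choice
ORDER BY player, choice
```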

There is, however, an additional step required if you want to export the data to Tableau. Assuming you’ve already signed up for Tableau Server, you need to ensure that your Treasure Data query engine is connecting to your server instance. To do this:

  1. In Treasure Data, just above the query window, select “Add” across from “Result Export”.
  2. From the “Export to:” dropdown menu, select “Tableau Server”.
  3. Fill in the “host” field (it will be “online.tableausoftware.com” if you are using Tableau Online for your server).
  4. Fill in “username” and “password” with the values you received when you signed up for Tableau, and enter a name for your Tableau Data Source.
  5. Keep the other defaults and hit “Use”.
  6. Save and run your query.

Barring any difficulties, you should be able to pull up your data on your Tableau Desktop instance for visualization.

  1. On your Tableau Desktop main screen, under “Connect”, select “Tableau Server”.
  2. When prompted to sign in, enter the username and password you registered with Tableau and click “Sign in”.
  3. Search for the Data Source you entered in step 4, above, and select it. Make sure the data is refreshed.
  4. Next click “Sheet 1” at the bottom.
  5. We’ll create a simple visualization (a bar graph) based on our players and the frequency of their choices. To do this, drag the dimensions “choice” and “player” into “Columns” and the measure “Number of Records” into “Rows”. You should end up with the following:



In our example, we’ve been writing our game verdict (computer wins, you win) to the same time-series database table as our players’ choices, and this has skewed our results to the point where the choices attributed to player “Null” far outpace those of “Player 1” and “Player 2”, making the distinction between them difficult. We want to exclude player “Null” from our graph.

To do this, hover your mouse cursor over “Null” at the top of the graph. Right click the text and select “Exclude” from the pop-up dialog.

Now that we’ve excluded “Null” results from our visualization, we can more easily see the differences between the players and their choices during the game.

However, the graph actually tells us something interesting about our example. Additionally, it gives us some suggestions for further exploration.

First, there is some slight variation between the frequencies of “paper”, “rock”, and “scissors”, with scissors appearing the most (Player 2, at 575 choices) and paper appearing the least (Player 2, at 521 choices).

However, relatively speaking, that variation is quite small. The differences between the players are even smaller. (There’s an identical difference of 7 between the number of times Players 1 and 2 chose “paper” and “scissors”, and no difference at all in the number of times each player chose “rock”.)

Originally, this app was intended to be a game where a human (using Python’s raw_input()) plays against the computer (which uses random.choice([‘rock’, ‘paper’, ‘scissors’]) to make its selection).

However, to generate enough data to make the visualization relevant for this blog post, I ended up using random.choice() for both players, which brings me to my point: given a finite set of choices, Python’s random.choice() seems to want to distribute picks fairly evenly across them. So, as our graph shows, it’s really not very random at all.
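You can see the same near-uniform split without any logging pipeline at all. This minimal sketch (not from the original app) draws from random.choice() many times and tallies the results with collections.Counter; the seed is fixed only so the numbers are reproducible:

```python
import random
from collections import Counter

random.seed(42)  # fixed seed so the tallies are reproducible

choices = ['rock', 'paper', 'scissors']

# Draw 3,000 times and count how often each option comes up.
counts = Counter(random.choice(choices) for _ in range(3000))

for option in choices:
    print(option, counts[option])
```

Each option lands close to the expected 1,000 picks, mirroring the even bars we saw in Tableau.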

To be sure, this data set is far too small to draw firm conclusions from. But what if we wanted to benchmark different ways to randomize selection?
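As a starting point (a rough sketch, not a rigorous benchmark), the standard library already offers two candidates to compare: the default Mersenne Twister generator behind random.choice() and the OS-entropy-backed random.SystemRandom:

```python
import random
import timeit

choices = ['rock', 'paper', 'scissors']
sys_rand = random.SystemRandom()  # draws entropy from the OS (e.g. /dev/urandom)

# Time 20,000 picks with each generator.
mt_time = timeit.timeit(lambda: random.choice(choices), number=20000)
os_time = timeit.timeit(lambda: sys_rand.choice(choices), number=20000)

print('random.choice:       %.4fs' % mt_time)
print('SystemRandom.choice: %.4fs' % os_time)
```

SystemRandom is typically slower since each pick hits the operating system's entropy source, which is one trade-off such a benchmark would surface.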

Join us for our future segment! We’ll also highlight more advanced ways to connect Treasure Data and Tableau, for example by setting up ODBC drivers (it’s not as complicated as it sounds!). Stay tuned!




Published at DZone with permission of Sadayuki Furuhashi, DZone MVB. See the original article here.

