Visualizing ECL and Sharing Your Results

Learn about an open-source platform that can help you visualize results generated by your ECL code rather than relying on JavaScript.

Wherever you look these days, analysts are providing visual representations of the data they mine to help businesses make decisions. The HPCC Systems Visualizer Bundle allows you to visualize results generated by your ECL code rather than relying on JavaScript.

It is open source just like the HPCC Systems platform, ECL IDE, Machine Learning Library, and embedded language and data store support.

Four types of visualizations are included in the new HPCC Systems Visualizer bundle:

  1. Two-dimensional charts such as chart, pie, and bubble.
  2. Multi-series and dimensional charts such as bar and column.
  3. Geospatial visualizations such as choropleths.
  4. General tables/grids that can display any data rather than data-specific shapes.

The bundle also includes an internal self-test (Visualizer.ecl), which, when run, provides a minimal example of its use including some examples of these different visualization types.

Also included is a Demos folder that contains some more complete examples, including:

  • Field mappings.
  • Filtering (dashboarding).
  • Dermatology properties (look and feel).

The Visualizer bundle can be installed with any IDE, provided you have already installed the HPCC Systems Client Tools. Since the installation of the bundle requires the use of the ECL command line tool, users may find it easier to set their local PATH to include the specific version of the ECL Client Tools that they use with the following command:

set PATH=%PATH%;"c:\Program Files (x86)\HPCCSystems\%version%\clienttools\bin"

%version% is the client tools version you have installed.
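If you script your setup, the same folder can be derived and checked programmatically. The following is a minimal Python sketch, not part of the bundle or the Client Tools: the install root is copied from the command above, and the function names are hypothetical helpers introduced for illustration.

```python
import ntpath
import shutil

# Install root copied from the PATH command above; adjust it if you
# installed the Client Tools somewhere else.
HPCC_ROOT = r"c:\Program Files (x86)\HPCCSystems"


def client_tools_bin(version: str) -> str:
    """Return the Client Tools bin folder for the given version string."""
    # ntpath produces Windows-style paths even when run on another OS.
    return ntpath.join(HPCC_ROOT, version, "clienttools", "bin")


def ecl_on_path() -> bool:
    """Return True if an 'ecl' executable is found on the current PATH."""
    return shutil.which("ecl") is not None
```

Calling client_tools_bin with your installed version gives the folder to append to PATH, and ecl_on_path() tells you whether the change has taken effect in the current session.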

Download the bundle ZIP file from the HPCC Systems Visualizer GitHub repository and extract it onto your computer. The installation instructions recommend that you extract into a folder called Visualizer in your downloads folder.

Note: On Windows, the default “extract” option tends to unzip the files into an additional nested folder, which can cause the install to fail. Simply move the files to the correct folder one level up to work around this.
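The workaround in the note can also be automated. Here is a hedged Python sketch that assumes the problem looks like a target folder containing exactly one sub-folder and nothing else; flatten_single_nested_folder is a hypothetical helper, not something shipped with the bundle.

```python
import shutil
from pathlib import Path


def flatten_single_nested_folder(target: Path) -> bool:
    """Move the contents of a lone nested folder up into target.

    Returns True if files were moved, False if the layout did not match
    the "one extra nested folder" pattern and nothing was changed.
    """
    entries = list(target.iterdir())
    if len(entries) != 1 or not entries[0].is_dir():
        return False

    # Rename the nested folder first so that a child with the same name
    # as the nested folder cannot collide with it during the move.
    staging = target / "__nested_tmp__"
    entries[0].rename(staging)
    for item in list(staging.iterdir()):
        shutil.move(str(item), str(target / item.name))
    staging.rmdir()  # empty by now
    return True
```

Run it against the folder you extracted into before installing; it leaves a correctly extracted folder untouched.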

You can then use the ECL command line to install the Visualizer using the following command:

ecl bundle install %USERPROFILE%\Downloads\Visualizer

You can also install the Visualizer directly from GitHub, but to do this both the HPCC Systems Client Tools and Git must be installed and available on the PATH:

ecl bundle install https://github.com/hpcc-systems/Visualizer.git
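Both install routes run the same command with a different source, so they are easy to wrap in a script. The sketch below is one possible Python wrapper, written under the assumption that a source starting with http:// or https:// is the GitHub route; the function names are hypothetical, and the only command it runs is the ecl bundle install command shown above.

```python
import shutil
import subprocess
from typing import List

GITHUB_URL = "https://github.com/hpcc-systems/Visualizer.git"


def build_install_command(source: str) -> List[str]:
    """Return the argument list for 'ecl bundle install <source>'."""
    return ["ecl", "bundle", "install", source]


def missing_tools(source: str) -> List[str]:
    """List the executables this install route needs but PATH lacks."""
    needed = ["ecl"]
    if source.startswith(("http://", "https://")):
        needed.append("git")  # installing from GitHub also needs Git
    return [tool for tool in needed if shutil.which(tool) is None]


def install_bundle(source: str) -> int:
    """Run the install and return the command's exit code."""
    missing = missing_tools(source)
    if missing:
        raise RuntimeError("Not found on PATH: " + ", ".join(missing))
    return subprocess.run(build_install_command(source), check=False).returncode
```

Checking for the tools first gives a clearer message than letting the command fail because ecl or git cannot be found.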

For the purposes of this blog, I have chosen to use our ECL IDE and view the visualizations using ECL Watch in a browser. Once the Visualizer is installed, the self-test files are automatically available in the Visualizer/Demos repository.

There are a few simple visualization jobs you can use to view the ECL code and see the Visualizer in action. helloWorld.ecl is a self-test that uses an inline 2D dataset to create a 2D bubble chart. Either double-click on the helloWorld.ecl file, or right-click it and choose Open in Builder Window.

Submitting the job to the target cluster creates a unique workunit, which you can view in ECL Watch from within the ECL IDE by clicking on the workunit name and ID.

You can also use ECL Watch from within your browser (http://<esp IP Address>:8010), which gives you more screen real estate for viewing the visualizations (as shown below). In the ECL area, click on the workunit ID and use the Resources tab to view the charts, which are displayed in HTML.

There are also examples of simple 2D column and pie chart visualizations:

In the previous examples, the data was exactly the shape that the visualizations expected, but often you will have many columns in your results and only want to visualize a specific set of them. The areaChart-mappings.ecl self-test illustrates how to extract only the columns you want from a larger dataset and how to map those columns to the visualization by specifying a mappings dataset in your ECL code.
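To make the idea of a mapping concrete, here is a conceptual sketch in Python rather than ECL. It is not the Visualizer API and does not reproduce the contents of areaChart-mappings.ecl: the column names, the sample row, and the direction of the mapping (chart field name to source column name) are all assumptions made for illustration.

```python
from typing import Dict, List

Row = Dict[str, object]


def apply_mappings(rows: List[Row], mappings: Dict[str, str]) -> List[Row]:
    """Keep only the mapped columns, renamed to the chart's field names.

    mappings: chart field name -> source column name (assumed direction).
    """
    return [
        {chart_field: row[source_col] for chart_field, source_col in mappings.items()}
        for row in rows
    ]


# Hypothetical wide result with more columns than the chart needs.
wide_rows = [{"location": "A", "males": 1, "females": 2, "total": 3, "notes": "x"}]

# Only two of the five columns are passed on to the chart.
chart_rows = apply_mappings(wide_rows, {"Location": "location", "Total": "total"})
```

The point is that the wide result stays unchanged; the mapping only decides which columns the visualization sees and under which names.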

This is the corresponding visualization:

Look and Feel: Dermatology

When viewing a chart (especially from the previous example), you may not be 100% happy with its appearance. This is where the dermatology Properties come in. The dermatology (the skin) of the chart can be tweaked in two different ways:

  1. By pressing the Properties button while viewing the chart on the Resources tab in ECL Watch and clicking Save when done.
  2. By setting the properties directly in the ECL code.

I started by editing the look and feel visually using the first method, and once I was happy, I was able to move those new settings into the ECL code. The next self-test, areaChart-mapping-properties.ecl, is an example of how this works.

The areaChart-mapping-properties.ecl self-test illustrates not only how you can set the mappings for the data you want to view but also how to declare in advance the properties you want to use to display the results by specifying a properties dataset.

The resulting chart does a better job of displaying the data in a more meaningful way:

Now let’s look at a dashboard showing a number of visualizations displaying different columns of data from the larger dataset. Note that the area line chart, which is shown without results, includes a filter parameter that is driven by the other charts on the dashboard.

Clicking on the results shown in the column, bar, and pie charts causes the line chart to change to show only the results you have asked to display. Selected results are indicated by a red border, which is toggled on and off by a click.
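The click-to-toggle behavior can be modeled in a few lines. This Python sketch is only an illustration of the interaction described above, not how the Visualizer implements it; the column name, the sample rows, and the rule that an empty selection yields no rows (mirroring the line chart shown without results) are assumptions.

```python
from typing import Dict, FrozenSet, List

Row = Dict[str, object]


def toggle(selection: FrozenSet[str], value: str) -> FrozenSet[str]:
    """Add value if it is not selected, remove it if it is (one click)."""
    return selection ^ frozenset([value])


def filter_rows(rows: List[Row], column: str, selection: FrozenSet[str]) -> List[Row]:
    """Return only the rows whose column value is currently selected."""
    return [row for row in rows if row[column] in selection]


# Hypothetical data behind the filtered line chart.
rows = [
    {"region": "North", "total": 10},
    {"region": "South", "total": 20},
]

selection = toggle(frozenset(), "North")              # first click selects
after_click = filter_rows(rows, "region", selection)
selection = toggle(selection, "North")                # second click deselects
after_second_click = filter_rows(rows, "region", selection)
```

Using an immutable set for the selection makes each click produce a new state, which keeps the before and after views easy to compare.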

By arranging the charts on the page as you wish, you can create a dashboard view of all the charts you want to see at the same time. I thought it might make more sense to display the line chart across the full width of the page with the others below:

Once saved, this dashboard showing the visualizations with the properties you have chosen is preserved for future viewing.

Sharing Your Visualization

Chances are that at some stage, you will want to share your visualizations with others. There are two common approaches to doing this:

  1. Provide a direct URL to the embedded chart. The simplest way to do this is to click on the Open in new tab button in the top right-hand corner of the Resources page and then to share that URL. But you should note that this will only work if you are sharing with others who also have access rights to ECL Watch.
  2. Download a “canned” version of the dashboard. To do this, use the Download button, which allows you to preselect a few items to be included. You can then email the resulting HTML file (including the dashboard definition) and the recipients can see your charts exactly as you viewed them. While the download feature also works with “dynamic” dashboards, it’s worth noting that if your recipient clicks on filter combinations not included in the downloaded HTML file, they will need access to the HPCC Platform that hosts the data and may be prompted for their login details to see the new results they have just requested.

Data Sources

All the previous examples have assumed that the source data is self-contained in the current workunit, but it is worth noting that you can specify the following data sources:

  • Current WU + Result Name.
  • Other WU + Result Name.
  • Logical File.
  • Roxie Query.

As well as using the Visualizer bundle to carry out data analytics, you might add a visualization to a large job you run regularly so you can see how that job ran. This has the additional benefit of allowing you to compare the results of subsequent runs of the same job.

Final Notes

  1. ECL Watch is the web-based UI supplied with the HPCC Systems platform that provides an interface to the whole system, including monitoring the health of all components, running and completed workunits and queries, access to results and user permissions, etc. It is accessed using your browser, specifying the IP address of the ESP component followed by the port 8010 (http://<esp IP Address>:8010).
  2. More information is available in the Visualizing ECL Results manual.
  3. The test script and sample data can be found in the Visualizer GitHub repository.
  4. You can still visualize results from your data using JavaScript. For more information, see the earlier blog, Visualizing Your Data Using HPCC Systems®.

Published at DZone with permission of Lorraine Chapman, DZone MVB. See the original article here.