Reporting and Analysis With Elasticsearch

A software developer gives an overview of Elasticsearch and the Elastic Stack, while diving into her experiences with the big data platform and search engine.

Since the popularity of NoSQL and Big Data exploded in recent years, keeping up with the latest trends in databases, search engines, and business analytics is vital for developers.

And it’s hard not to be overwhelmed by the number of solutions available on the market: Amazon CloudSearch, Elasticsearch, Swiftype, Algolia, Searchify, Solr, and others.

Each of the above-mentioned solutions has its pros and cons. This time I’d like to focus on one of them, Elasticsearch, and cover its main benefits, the kinds of search it supports, and its use cases. I’ll also share my own experience of analyzing Elasticsearch data with web reporting and visualization tools. So, let’s start!

What Elasticsearch Is

Elasticsearch is acknowledged as one of the best full-text search engines, capable of dealing with both structured and unstructured data. It’s built on top of Apache Lucene, a full-text search library written in Java. It’s also an open-source product, so it has been supported and developed by a community of programmers and engineers.

Getting Advanced With Elasticsearch

Elasticsearch is the heart of the Elastic Stack, and each project of the stack deserves special attention: Beats collects data on your servers and ships it to Logstash or directly to Elasticsearch; Logstash, in turn, parses and transforms the incoming data from multiple sources and sends it to Elasticsearch. As the last step, the stored data is visualized as charts with the help of Kibana. As you can see, storing and aggregating structured and unstructured data is pretty easy with the Elastic Stack.

So let’s dive deeper into the reasons to use Elasticsearch.

Features of Elasticsearch

Developers appreciate Elasticsearch for:

  • The capability of scaling out to hundreds of servers thanks to its distributed storage and processing of data, which spreads the CPU and RAM load across the nodes.
  • Fast retrieval of data thanks to inverted indices and the caching of the most frequently used structured queries.
  • Creation of indices at runtime.
  • Mappings, which define the data types of document fields and how they are indexed.
  • Balancing of the load between the nodes in a cluster and replication of the data.
  • RESTful APIs for Create, Read, Update, Delete, and search operations against your indices, as well as for checking cluster, node, and index health.

And more.
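
To make the inverted index idea from the list above more concrete, here is a minimal sketch in plain Python. It is not how Lucene or Elasticsearch implement it internally (real engines add analysis, scoring, and compressed on-disk structures); the function names and the sample documents are made up for illustration.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase term to the sorted IDs of the documents containing it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def search(index, query):
    """Return the IDs of documents that contain every term of the query."""
    terms = query.lower().split()
    if not terms:
        return []
    result = set(index.get(terms[0], []))
    for term in terms[1:]:
        # Intersect posting lists instead of scanning the documents themselves.
        result &= set(index.get(term, []))
    return sorted(result)


# Hypothetical sample documents.
docs = {
    1: "Elasticsearch is a search engine",
    2: "Kibana visualizes Elasticsearch data",
    3: "Logstash ships data",
}
index = build_inverted_index(docs)
print(search(index, "elasticsearch data"))  # [2]
```

The lookup touches only the posting lists of the query terms, which is why retrieval stays fast as the number of documents grows.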

All these features help in developing real-time platforms, as well as search and business intelligence applications. Moreover, Elasticsearch works well for searching and analyzing logs, which helps in identifying problems with web servers or applications.

Elasticsearch supports the following types of dynamic search:

  • Structured
  • Full-text
  • Multifield
  • Proximity matching (unlike a plain full-text query, which treats a document as a ‘bag of words’, it takes into account how close the matching words are to each other)
  • Partial matching

Furthermore, you can combine them to match your query more precisely.
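
As a rough illustration of combining these search types, the sketch below builds the body of a single `bool` query as a Python dictionary. The clause types (`term`, `multi_match`, `match_phrase` with `slop`) come from the Elasticsearch Query DSL; the field names `category`, `title`, and `body` and the function name are assumptions made up for this example.

```python
import json


def build_combined_query(text, category, phrase, slop=2):
    """Build a bool query that mixes structured, multifield and proximity clauses."""
    return {
        "query": {
            "bool": {
                # Structured search: exact match on a keyword field, not scored.
                "filter": [{"term": {"category": category}}],
                # Multifield full-text search; "title^2" boosts matches in the title.
                "must": [
                    {"multi_match": {"query": text, "fields": ["title^2", "body"]}}
                ],
                # Proximity matching: documents where the phrase words sit close
                # together (within `slop` position moves) score higher.
                "should": [
                    {"match_phrase": {"body": {"query": phrase, "slop": slop}}}
                ],
            }
        }
    }


query = build_combined_query("pivot table", "reporting", "quick analysis")
print(json.dumps(query, indent=2))
```

The resulting JSON could be sent as the body of a search request against an index; that step is left out here because it needs a running cluster.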

Analysis of Elasticsearch Data

The first steps of the analysis process consist of defining the underlying objectives, collecting and aggregating the data from indices, and importing it into the analysis tool.
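
For the "collecting and aggregating" step, a request body along the following lines is one possible starting point. It is only a sketch: `terms` and `sum` are standard Elasticsearch aggregations, but the field names `country` and `revenue`, the aggregation names, and the helper function are hypothetical.

```python
def build_summary_request(group_field, metric_field, max_groups=10):
    """Build a search body that returns only aggregated totals per group."""
    return {
        # size 0: skip the individual hits, return just the aggregation results.
        "size": 0,
        "aggs": {
            "by_group": {
                # One bucket per distinct value of the (keyword) grouping field.
                "terms": {"field": group_field, "size": max_groups},
                # Inside each bucket, sum up the numeric field.
                "aggs": {"total": {"sum": {"field": metric_field}}},
            }
        },
    }


# Hypothetical field names.
body = build_summary_request("country", "revenue")
print(body)
```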

My goal was to make an interactive report based on the Elasticsearch data. One main requirement was the ease of connecting to my index for further summarizing of the data. Of course, the challenge of Big Data visualization can be handled with Kibana. But I needed a flexible approach to exploring indices that would give me constant access to the analytics right from my web application.

Fortunately, I found a pivoting tool that has a built-in connector to Elasticsearch.

Flexmonster is what helped me out. It’s a client-side pivot table component for data visualization. It allows for the aggregating, filtering, and sorting of data, and visualizes it through charts. The results can be shared in Excel, PDF, and other formats.

Getting Started With Elasticsearch and Pivot Table

To start gaining insights from the data, I needed to get a full understanding of how to work with Elasticsearch and set up the connection to it from Flexmonster Pivot Table and Charts.

Setting Up the Configuration

With the help of the official documentation, I easily installed Elasticsearch on my machine and connected to its server.

Then I proceeded to configure the Elasticsearch connection in the client-side part of my application. Following the steps described in the guide from the package, I enabled CORS, connected to the index, and imported the data into the pivot table. The whole configuration process took me about 15 minutes. The overall roadmap consisted of the following parts:

  • Embedding the component into the application.
  • Configuring the Elasticsearch server.
  • Establishing the connection from Flexmonster to the server.
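
The server configuration step mostly comes down to enabling CORS, so that a browser-based component is allowed to query Elasticsearch. The sketch below prints the lines one might add to `elasticsearch.yml`. The `http.cors.*` setting names are real Elasticsearch settings; the origin `http://localhost:4200` (a typical Angular dev server address) and the helper function are assumptions, and the exact set of headers your setup needs may differ.

```python
def cors_settings(allowed_origin):
    """Return elasticsearch.yml lines that enable CORS for a single origin."""
    settings = {
        "http.cors.enabled": "true",
        # Restrict access to one origin rather than allowing every site.
        "http.cors.allow-origin": f'"{allowed_origin}"',
        "http.cors.allow-headers": (
            '"X-Requested-With, Content-Type, Content-Length, Authorization"'
        ),
    }
    return [f"{key}: {value}" for key, value in settings.items()]


# Assumed origin of the web application.
print("\n".join(cors_settings("http://localhost:4200")))
```

Elasticsearch has to be restarted after the file is changed for the settings to take effect.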

I managed to integrate the pivot table with my Angular application and connect it to an Elasticsearch instance. I also discovered that it’s compatible with the following Elasticsearch data types: String (keyword, since among the string types only this one can be used for aggregating and sorting), Date, Numeric, Boolean, Object, and Nested Object. When dealing with business data, the Numeric and Date types are especially useful. Numeric fields can be put into measures, while date fields may serve both as dimensions and as measures.
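
The type rules above can be written down as a small lookup. The sketch below pairs a hypothetical index mapping with a helper that says which role each field could play in a report. The treatment of numeric and date fields follows the description above; treating keyword and boolean fields as dimensions only, as well as all field names, are my own assumptions.

```python
NUMERIC_TYPES = {
    "long", "integer", "short", "byte",
    "double", "float", "half_float", "scaled_float",
}


def pivot_roles(es_type):
    """Return the roles a field of the given Elasticsearch type can play."""
    if es_type in NUMERIC_TYPES:
        return ["measure"]
    if es_type == "date":
        return ["dimension", "measure"]
    if es_type in {"keyword", "boolean"}:
        return ["dimension"]  # assumption, not stated in the text above
    # Analyzed "text" fields cannot be aggregated or sorted by default.
    return []


# Hypothetical mapping in the typeless format used by Elasticsearch 7 and later.
mapping = {
    "mappings": {
        "properties": {
            "customer": {"type": "keyword"},
            "comment": {"type": "text"},
            "order_date": {"type": "date"},
            "amount": {"type": "double"},
            "paid": {"type": "boolean"},
        }
    }
}

roles = {
    name: pivot_roles(spec["type"])
    for name, spec in mapping["mappings"]["properties"].items()
}
print(roles)
```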

After connecting to the data source, I defined a slice for my report: I put the string and date fields into rows and columns and the numeric fields into measures, and then filtered the report to see only the records I needed.
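
Conceptually, a slice like this boils down to grouping records by a row field and a column field, aggregating a numeric measure in each cell, and applying a filter first. The plain-Python sketch below mimics that idea; it is not Flexmonster's API, and the records and field names are invented for illustration.

```python
from collections import defaultdict


def pivot_sum(records, row_field, column_field, measure_field,
              keep=lambda record: True):
    """Sum the measure for every (row, column) pair among records passing `keep`."""
    cells = defaultdict(float)
    for record in records:
        if keep(record):
            cells[(record[row_field], record[column_field])] += record[measure_field]
    return dict(cells)


# Hypothetical flight records.
records = [
    {"carrier": "A", "month": "2019-01", "price": 100.0, "cancelled": False},
    {"carrier": "A", "month": "2019-01", "price": 50.0, "cancelled": False},
    {"carrier": "B", "month": "2019-02", "price": 70.0, "cancelled": False},
    {"carrier": "B", "month": "2019-02", "price": 30.0, "cancelled": True},
]

# Rows: carrier, columns: month, measure: sum of price, filter: not cancelled.
table = pivot_sum(records, "carrier", "month", "price",
                  keep=lambda r: not r["cancelled"])
print(table)  # {('A', '2019-01'): 150.0, ('B', '2019-02'): 70.0}
```

With Elasticsearch as the data source, the grouping and summing would normally be pushed down to the cluster as aggregations rather than done over raw records on the client.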

As an illustration, you can have a look at the Kibana flights sample data in a live demo.

Summary

Altogether, the Elasticsearch functionality lets you perform complex queries against your data. To me, the Elastic Stack and Flexmonster make a powerful combination for building dashboards and carrying out ad hoc analysis. Together, they may help you bring web reporting and data visualization to a new level.

Topics:
big data, elasticsearch, data visualization, data analysis, search engine

Opinions expressed by DZone contributors are their own.
