A couple of years ago, I read an article published by a medical insurance company from the US. The article was about a solution that they had implemented to cater to the growing need of accessing customer information in a consistent manner.
Given that the organization had grown over the years, organically and inorganically, they had many systems in place to deal with the various insurance products that they had sold. As stated in the article, the company was facing issues dealing with customer queries.
Typically, when a customer calls their help desk, the help desk person goes through the insurance policy that the customer has with the company. But as the company had many systems that stored product information, it was a chore to navigate through the various systems to finally bring up the correct insurance product information. Added to this was the complexity that customer information was stored in multiple ways by the various systems that the company had. Thus, getting the correct information in a timely manner was becoming a challenge.
To help improve the customer experience, the company decided to implement a Big Data solution. The approach that first comes to mind is to throw all customer information into a Hadoop data lake and rewrite every application to use information from this common store. But the company chose a different method. They used a document store, MongoDB in this case, to implement their solution. The name of the solution was "the Wall."
The solution, as described in the article, collected all customer information from various systems into a MongoDB store. Then, a new user interface was developed to interact with the new solution. The most important aspect of the solution was that none of the original solutions was modified in any manner. The Wall pulled all relevant information into the data store and also maintained references back to the main system the information came from.
The solution implemented by the insurance company is commonly called a "Customer 360" solution. The solution, primarily, is a data store where all customer information from various legacy systems is aggregated to create a master data store.
The beauty of the solution is that there is no immediate need to modify any application to use a Big Data platform. This is because only the most essential customer information is pulled from the source systems and duplicated in the data store of the Wall. If any customer-related action needs to take place, the customer 360 front-end directs control to the solution that owns the customer record that is being referred to at present.
Due to my recent exposure to Elasticsearch, a thought came to mind when I heard about the solution once again. If I were to implement a solution like the Wall, would Elasticsearch be a suitable candidate for storing the data? That set me thinking. So, let's conduct a thought experiment in this article. Will we get a workable solution if we use Elasticsearch instead of MongoDB to implement a solution similar to the Wall?
As is commonly known, Elasticsearch is the 'E' of the now-famous ELK (Elasticsearch, Logstash, Kibana) set of tools. Elasticsearch is a search solution built on top of Apache Lucene that provides a search facility over the data stored in one or more indices of an Elasticsearch instance (or cluster). Elasticsearch is popular not only because it is open source, but also because of its simplicity.
Customer 360 Using Elasticsearch
Considering the popularity of Elasticsearch and the fact that it is also a document store, I am convinced that a customer 360 solution could be implemented using Elasticsearch.
To implement a solution using Elasticsearch, we need to define an index to hold customer information. For each customer, we will need to communicate with each existing product management system and collect relevant information.
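As a rough illustration, the index definition could look like the sketch below. The index name and every field name (customer_id, full_name, source_records, and so on) are assumptions made for this example; the original article does not describe the Wall's schema. The body follows the mappings layout used by recent Elasticsearch versions and is only built as a Python dictionary here, not sent to a cluster.

```python
import json

# Hypothetical index name, chosen only for this example.
CUSTOMER_INDEX = "customer360"


def customer_index_body():
    """Return a request body that could be used to create the customer index.

    All field names are illustrative assumptions, not taken from the Wall.
    """
    return {
        "mappings": {
            "properties": {
                # Identifier assigned by the Customer 360 solution itself.
                "customer_id": {"type": "keyword"},
                # Analyzed for full-text search, with an exact-match sub-field.
                "full_name": {
                    "type": "text",
                    "fields": {"raw": {"type": "keyword"}},
                },
                "date_of_birth": {"type": "date"},
                # One entry per legacy system that knows this customer.
                "source_records": {
                    "type": "nested",
                    "properties": {
                        "system": {"type": "keyword"},
                        "record_id": {"type": "keyword"},
                        "product": {"type": "keyword"},
                    },
                },
            }
        }
    }


print(json.dumps(customer_index_body(), indent=2))
```

Only the most essential fields are listed, in keeping with the idea that the central store duplicates just enough information to find the customer.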
Then, we will have to map the collected information to the fields of the Elasticsearch index in such a way that each customer record can still be uniquely identified in the product management system it came from. We do not wish to duplicate every piece of functionality, so for some actions, like editing information, we would need to transfer control to the original system. Keeping this mapping information will be very helpful for that purpose.
The task of building the index and collecting information for each customer (from various systems) is quite tedious and will definitely be time-consuming. Additionally, some information, like the name of the customer, may be written differently in different systems. For example, one system may store the customer name in the 'First name, Last name' format, while another system may prefer the 'Last name, First name' convention. Thus, it is important not only to harmonize the data in Elasticsearch (for better search performance), but also to maintain a mapping back to the original system that the information came from.
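One way to picture this mapping is a document that carries, for every product, a reference to the system and record that own it. The sketch below is a minimal illustration under assumed names: SourceRef, build_customer_document, owning_system, and the system identifiers in the usage lines are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceRef:
    """Pointer back to a record in a legacy system (hypothetical structure)."""
    system: str      # name of the product management system
    record_id: str   # identifier of the customer record inside that system
    product: str     # insurance product the record belongs to


def build_customer_document(customer_id, full_name, refs):
    """Assemble the document to be indexed, keeping references to the sources."""
    return {
        "customer_id": customer_id,
        "full_name": full_name,
        "source_records": [
            {"system": r.system, "record_id": r.record_id, "product": r.product}
            for r in refs
        ],
    }


def owning_system(document, product):
    """Return (system, record_id) for the given product, or None if unknown.

    A front-end could use this to hand control to the system that owns
    the record, for example when the customer's details must be edited.
    """
    for ref in document["source_records"]:
        if ref["product"] == product:
            return ref["system"], ref["record_id"]
    return None


doc = build_customer_document(
    "C-1",
    "Jane Doe",
    [SourceRef("policy_admin_a", "A-77", "health"),
     SourceRef("claims_b", "B-12", "dental")],
)
print(owning_system(doc, "dental"))
```

Storing the reference next to the aggregated data means the legacy systems remain untouched while the front-end still knows where to send an edit request.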
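The name example can be sketched as a small normalization step that runs before indexing. This is only an illustration: it assumes that a comma in the stored value signals the 'Last name, First name' convention, which real data may not honor, and the function names and output layout are hypothetical.

```python
def harmonize_name(raw_name):
    """Return the name as 'First Last' with extra whitespace collapsed.

    Assumption: a comma means the source stored the name as 'Last, First'.
    """
    cleaned = " ".join(raw_name.split())
    if "," in cleaned:
        last, _, first = cleaned.partition(",")
        cleaned = f"{first.strip()} {last.strip()}".strip()
    return cleaned


def harmonized_field(raw_name, system):
    """Keep the harmonized value together with what the source system stored."""
    return {
        "full_name": harmonize_name(raw_name),
        "original": {"system": system, "value": raw_name},
    }


print(harmonized_field("Doe, Jane", "claims_b"))
```

Keeping the original value alongside the harmonized one preserves the link to the source system, so a mismatch can still be traced back and corrected there.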
Once all the relevant data has been collected in an index, we will need to provide access to the stored information using a suitable user interface.
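Behind such a user interface, a name lookup could be expressed as a search request body along the following lines. This is a hedged sketch: the full_name field is an assumed name, and the use of the match query's fuzziness option to tolerate small spelling differences is a choice made for this example, not something the original article prescribes.

```python
def name_search_body(name, size=10):
    """Build a search request body that looks up customers by name.

    'full_name' is a hypothetical field; 'fuzziness' is optional and is
    included here to tolerate minor typos in the help desk's input.
    """
    return {
        "size": size,
        "query": {
            "match": {
                "full_name": {
                    "query": name,
                    "operator": "and",   # every term of the name must match
                    "fuzziness": "AUTO",
                }
            }
        },
    }


print(name_search_body("Jane Doe"))
```

The function only builds the request; sending it to a cluster is left to whichever client the user interface is built on.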
The biggest advantage of using Elasticsearch is that it is fast. Because Elasticsearch is a search engine rather than a full data store, it does not carry many of the overheads associated with data store management, which helps its performance. Another advantage is that Elasticsearch supports stemming, which matches a search term against words that share the same root. This can turn out to be a very important feature when we wish to search for a customer using their name.
1. Search Engine vs. Data Store
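It is important to note that Elasticsearch is a document-oriented search engine, while MongoDB is a document store. Thus, many data store features, like record locking and multi-document transactions, are not supported; Elasticsearch offers only optimistic concurrency control on individual documents.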
2. Information Security and Privacy
Being primarily a search engine, Elasticsearch does not have strong security and privacy features built into the core of the system. Though security and privacy features can be purchased as additional plugins, their absence from the core may be a major deterrent to adoption in many customer solutions.
3. In-Place Modification
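Though data can be removed from an index in Elasticsearch, it is not possible to perform 'in place' editing of any of the documents stored in the index, because stored documents are immutable. To 'edit' a document, all its attributes have to be read, the fields copied into a new document, the new document inserted, and the existing record deleted. Elasticsearch's update API hides these steps behind a single call, but the cost of re-indexing the whole document remains.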
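The read, copy, insert, and delete cycle described above can be illustrated with a toy in-memory index. InMemoryIndex and edit_document are hypothetical stand-ins written for this example; they are not the Elasticsearch client API and only mimic the sequence of steps.

```python
class InMemoryIndex:
    """Toy stand-in for an index; NOT the Elasticsearch client API."""

    def __init__(self):
        self._docs = {}
        self._next_id = 1

    def insert(self, document):
        """Store a copy of the document and return its new identifier."""
        doc_id = str(self._next_id)
        self._next_id += 1
        self._docs[doc_id] = dict(document)
        return doc_id

    def get(self, doc_id):
        """Return a copy of the stored document."""
        return dict(self._docs[doc_id])

    def delete(self, doc_id):
        del self._docs[doc_id]

    def ids(self):
        return sorted(self._docs)


def edit_document(index, doc_id, changes):
    """'Edit' a document without modifying the stored one in place."""
    current = index.get(doc_id)        # 1. read all attributes
    updated = {**current, **changes}   # 2. copy the fields, applying changes
    new_id = index.insert(updated)     # 3. insert the new document
    index.delete(doc_id)               # 4. delete the existing record
    return new_id


index = InMemoryIndex()
old_id = index.insert({"customer_id": "C-1", "full_name": "Jane Doe"})
new_id = edit_document(index, old_id, {"full_name": "Jane Smith"})
print(index.get(new_id))
```

Every edit rewrites the whole document, which is why frequent small updates are more expensive here than in a store that can modify a record in place.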
As Elasticsearch provides the ability to store documents, much like a NoSQL data store, it is a good candidate to help implement a Customer 360 solution. The primary aim of a Customer 360 solution is to search for information that lies in many legacy systems. The solution provides this facility by aggregating critical information elements from various systems into a central data store and providing a search facility over the data.
Elasticsearch, being a search solution, is quite capable of meeting the needs of the solution. While Elasticsearch does provide an advantage when implementing a customer 360 solution, we need to keep in mind that, by default, it does not provide robust security and privacy features. Hence, unless access is restricted, all the data will be accessible to everyone across the organization. To mitigate this, we will need to purchase an additional security layer using plugins and/or implement our own custom solution.