Elasticsearch Index v7.6
In this article, we discuss how to work with indexes in Elasticsearch and search and visualize results.
Elasticsearch, which is based on Lucene, is a distributed document store. It is a highly effective way of indexing your information for correlation, quick querying, and analysis. In this blog, I will walk you through the steps required to create an index, search it, and visualize the results.
What Is an Index?
In the context of ES, an index is a collection of documents.
The basic elements of an index are documents and fields.
Some points to remember for an index are:
- An index is a logical grouping of physical shards.
- The documents in an index are distributed across multiple shards.
- Shards can be primaries or replicas.
- Each document belongs to one primary shard.
- The number of primary shards is fixed at the time of index creation.
- The number of replica shards can be changed at any time.
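The last two points can be sketched as a pair of requests. This is a minimal sketch, not the article's original commands: the host, the port, and the index name "my-index" are assumptions, and the requests are only built here, not sent.

```python
import json
import urllib.request

# Assumption: a local node on the default port; replace with your own host.
BASE_URL = "http://localhost:9200"


def es_request(method, path, body=None):
    """Build (but do not send) a JSON request for Elasticsearch."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    return urllib.request.Request(
        BASE_URL + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )


# The primary shard count is set when the index is created;
# it cannot be changed afterwards with a settings update.
create_index = es_request(
    "PUT", "/my-index",
    {"settings": {"number_of_shards": 3, "number_of_replicas": 1}},
)

# The replica count is a dynamic setting, so it can be updated on a live index.
update_replicas = es_request(
    "PUT", "/my-index/_settings",
    {"index": {"number_of_replicas": 2}},
)
```

Passing either object to urllib.request.urlopen would send it to the cluster.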
What Is Elasticsearch?
ES is a distributed document store for complex data structures serialized as JSON.
In typical dev or prod environments, ES is deployed as a cluster of nodes (a collection of master and data nodes).
- Based on Apache Lucene.
- Accessible from any of the member nodes.
- It uses an inverted index to store text fields.
- Indexes all data in every field.
- Every field has a dedicated, optimized data store.
- Numeric and geo-point fields are stored in BKD trees.
- Can execute structured, full text, and complex queries.
You can query the "myserver.elastic.com" server for the health of the cluster; the result is published in the browser (or on the console if invoked via curl).
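One way to ask for the cluster's health from code is the cluster health API. The sketch below assumes plain HTTP on port 9200 (the Elasticsearch default) and the "_cluster/health" endpoint; the article itself only names the host, so treat the full URL as an assumption.

```python
import json
import urllib.request

# Assumptions: plain HTTP, default port 9200, and the cluster health endpoint.
HEALTH_URL = "http://myserver.elastic.com:9200/_cluster/health"


def summarize_health(payload):
    """Pick the fields most people look at first from a health response."""
    return {
        "cluster": payload.get("cluster_name"),
        "status": payload.get("status"),  # "green", "yellow", or "red"
        "nodes": payload.get("number_of_nodes"),
    }


def fetch_health(url=HEALTH_URL):
    """Call the cluster and return the summarized health (needs network access)."""
    with urllib.request.urlopen(url) as response:
        return summarize_health(json.load(response))
```

Calling fetch_health() against a reachable cluster returns a small dictionary; a "yellow" status typically means some replica shards are not yet allocated.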
Time for Action
Let’s add a document into an index.
The most powerful aspect of Elasticsearch is Dynamic Mapping (more details below) which allows you to explore your data as quickly as possible.
To index a document, you don’t have to first create an index, define a mapping type, or define your fields. You can just index a document, and the index, type, and fields will spring to life automatically.
First, I issue a request for a document with "GET /profile/_doc/1".
This GET command requests the index "profile" and, within that index, the document with ID "1".
Note: The index does not exist yet, so we are issuing a GET against a missing index.
As expected, an error is returned: an "index_not_found_exception" with HTTP status 404.
The point is, you can create an index on the fly by adding the first document, but to search or get a document, the index must already exist.
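A client usually wants to treat "not there yet" as a normal outcome rather than a crash. Here is a minimal sketch of that idea, assuming a local node on the default port; the helper names are mine, not part of any Elasticsearch client.

```python
import json
import urllib.error
import urllib.request

BASE_URL = "http://localhost:9200"  # assumption: local node, default port


def doc_url(index, doc_id):
    """URL of a single document, e.g. /profile/_doc/1."""
    return f"{BASE_URL}/{index}/_doc/{doc_id}"


def get_document(index, doc_id, opener=urllib.request.urlopen):
    """Return the document source, or None when the index or document is missing."""
    try:
        with opener(doc_url(index, doc_id)) as response:
            return json.load(response).get("_source")
    except urllib.error.HTTPError as err:
        if err.code == 404:  # missing index or missing document
            return None
        raise  # anything else is a real problem
```

The opener parameter exists only so the function can be exercised without a running cluster.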
Add the First Element
I will issue a PUT on the same path, "/profile/_doc/1", with the document as a JSON body.
The response in my case indicates the successful indexing of document "1".
What if you do not specify a body? You can give it a quick try; in all probability it will result in a "parse_exception".
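As a rough sketch, the PUT can be built like this. The host and the document fields are hypothetical; the original body is not reproduced here. The age is deliberately sent as a quoted string, which dynamic mapping will treat as text.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # assumption: local node, default port


def index_request(index, doc_id, document):
    """Build a PUT that stores `document` under /<index>/_doc/<doc_id>."""
    if document is None:
        # Elasticsearch rejects an index request that has no body.
        raise ValueError("a document body is required")
    return urllib.request.Request(
        f"{BASE_URL}/{index}/_doc/{doc_id}",
        data=json.dumps(document).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical document; note that "age" is a string, not a number.
first_doc = {"name": "Jane", "age": "30"}
put_first_doc = index_request("profile", 1, first_doc)
```

Sending put_first_doc with urllib.request.urlopen creates the index on the fly if it does not exist.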
"The automatic detection and addition of new fields is called dynamic mapping." (Elasticsearch documentation, Dynamic Mapping)
Now if you issue the GET again, it returns the newly indexed document.
If you think ingesting one document at a time is tedious, there are other options like the Bulk API.
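The Bulk API takes newline-delimited JSON: one action line followed by one source line per document. The sketch below only builds that body; the documents are hypothetical.

```python
import json


def bulk_body(index, documents):
    """Render {doc_id: source} pairs as the newline-delimited JSON the Bulk API expects."""
    lines = []
    for doc_id, source in documents.items():
        # Action line: index this document under the given ID.
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        # Source line: the document itself.
        lines.append(json.dumps(source))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical documents.
body = bulk_body("profile", {
    "2": {"name": "Ravi", "age": 41},
    "3": {"name": "Mei", "age": 27},
})
```

The resulting string would be sent as a POST to "/_bulk" with the header "Content-Type: application/x-ndjson".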
I repeated the add-document step for a few more data items, and then it was time for the next step: querying the ingested data.
A simple search command, adapted from the official Elastic documentation, queries the index "profile" to return all matches (basically a "SELECT *") and expects the results to be sorted by age in ascending order.
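A search like the one described can be sketched as follows. The body is my reconstruction, modeled on the "match_all" example in the Elastic getting-started guide, and the host is an assumption. The "_search" endpoint accepts both GET and POST; POST is used here because some HTTP clients drop the body of a GET.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # assumption: local node, default port

# Match every document and sort the hits by age, lowest first.
search_body = {
    "query": {"match_all": {}},
    "sort": [{"age": "asc"}],
}

search_request = urllib.request.Request(
    f"{BASE_URL}/profile/_search",
    data=json.dumps(search_body).encode("utf-8"),
    method="POST",
    headers={"Content-Type": "application/json"},
)
```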
Instead of a result, I got an exception.
Mistake I Made
The error message is self-explanatory, but to be precise, I made a mistake while adding the documents: I specified the age as a string, and now, while executing the sort, I expect it to be numeric.
Two things I could have done:
- Specified it as numeric.
- Defined the mapping explicitly.
If you issue the GET _mappings for the index, it should show the mapping that was deduced for the index at the time of creation.
You can see that age is mapped as "text" here. Sorting on a text field is not supported by default, hence the exception.
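To see why a quoted age ends up as text, here is a simplified sketch of the default dynamic mapping rules for scalar values. It is an illustration only: it ignores date detection, objects, arrays, and nulls, and the function names are mine.

```python
def infer_field_mapping(value):
    """Roughly mimic Elasticsearch's default dynamic mapping for one JSON value."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return {"type": "boolean"}
    if isinstance(value, int):
        return {"type": "long"}
    if isinstance(value, float):
        return {"type": "float"}
    if isinstance(value, str):
        # Strings become full-text fields with a keyword sub-field.
        return {
            "type": "text",
            "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
        }
    raise TypeError(f"not covered by this sketch: {type(value).__name__}")


def infer_mapping(document):
    """Apply the per-field rule to every top-level field of a document."""
    return {"properties": {name: infer_field_mapping(v) for name, v in document.items()}}
```

With this sketch, infer_mapping({"age": "30"}) maps age to text, while infer_mapping({"age": 30}) maps it to long, which is sortable.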
A PUT mapping request for "profile" that changes the type of age will not work, because the index already exists and age is already mapped; the type of an existing field cannot be changed.
If you try it, the request throws an exception.
The easiest fix is either to delete the index and re-create the documents, or to sort and aggregate on "age.keyword", the keyword sub-field shown in the mapping.
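One caveat with the "age.keyword" route is that keyword values are compared as strings, not as numbers. The sketch below shows the sort clause (my assumption of what such a request would look like) and uses plain Python sorting to illustrate the difference in ordering.

```python
# Search body that sorts on the keyword sub-field instead of the text field.
sort_on_keyword = {
    "query": {"match_all": {}},
    "sort": [{"age.keyword": "asc"}],
}

# Hypothetical ages, stored as strings.
ages_as_strings = ["25", "100", "9"]

# String order: compared character by character, so "100" comes before "25".
keyword_order = sorted(ages_as_strings)

# Numeric order: what you would get from a numeric field.
numeric_order = sorted(ages_as_strings, key=int)

print(keyword_order)  # ['100', '25', '9']
print(numeric_order)  # ['9', '25', '100']
```

This is why re-creating the index with a numeric age is usually the cleaner option when you need numeric sorting.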
I chose the delete option and then created the index again.
Deleting an index removes it along with all of its documents.
Re-creating the index, this time with a numeric age, should take care of the mistake I made earlier.
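The delete-and-recreate step can be sketched as two requests. The host and the choice of "integer" for age are assumptions; any numeric type would allow the sort. The requests are built but not sent.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # assumption: local node, default port

# Deletes the index and every document in it.
delete_index = urllib.request.Request(f"{BASE_URL}/profile", method="DELETE")

# Re-create the index with an explicit mapping so that age is numeric.
explicit_mapping = {
    "mappings": {
        "properties": {
            "age": {"type": "integer"},  # assumption: integer is precise enough
        }
    }
}

create_index = urllib.request.Request(
    f"{BASE_URL}/profile",
    data=json.dumps(explicit_mapping).encode("utf-8"),
    method="PUT",
    headers={"Content-Type": "application/json"},
)
```

Fields that are not listed in the explicit mapping are still picked up by dynamic mapping when documents are indexed.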
Time to fire the search again.
This time the response is a successful search.
What Is Mapping?
It would take a separate blog to explain mappings in detail, but the short answer is:
Mapping defines how a document and the fields it contains are stored and indexed.
It helps to identify at a high level:
- Which fields are date fields (should be stored like a date).
- Which string fields are to be used for full-text searching.
- Which fields are numeric (for operations like sort and aggregate).
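The three kinds of fields above could look like this in an explicit mapping. The field names are hypothetical; only the type values are Elasticsearch field types.

```python
# Hypothetical field names; "date", "text", and "integer" are Elasticsearch field types.
profile_mapping = {
    "mappings": {
        "properties": {
            "joined_on": {"type": "date"},  # stored and queried as a date
            "bio": {"type": "text"},        # analyzed for full-text search
            "age": {"type": "integer"},     # numeric: sortable and aggregatable
        }
    }
}


def fields_of_type(mapping, field_type):
    """List the field names that the mapping declares with the given type."""
    properties = mapping["mappings"]["properties"]
    return [name for name, spec in properties.items() if spec["type"] == field_type]
```

A small helper like fields_of_type is handy for checking, before a sort or aggregation, that the field you are about to use is not a text field.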