Elasticsearch is a search server based on Lucene. It provides a distributed, multitenant-capable full-text search engine with a RESTful web interface and schema-free JSON documents. Elasticsearch is developed in Java and is released as open source under the terms of the Apache License. (Source: Wikipedia)
Things to consider
Based on my experiences with Elasticsearch cluster deployments, I think the following considerations should be kept in mind.
Keep the JVM Version Consistent
If the JVM versions differ between nodes, or between the cluster and its Elasticsearch clients, you will start seeing serialization exceptions in the server-side logs or in the clients.
The typical exceptions have patterns similar to the following:
org.elasticsearch.transport.RemoteTransportException: Failed to deserialize exception response from stream
It will usually be a ClassNotFoundException with one of the SerializationException classes.
According to Shay Banon, the creator of Elasticsearch,
Tip: Try to keep the JVM version consistent across the cluster.
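One way to spot a mismatch is to group the nodes by the JVM version they report. The following is a minimal sketch, not an official tool. It assumes you have already fetched and parsed a nodes-info style response in which each node carries a "name" and a "jvm" section with a "version" field. The sample data and node names are made up for illustration.

```python
from collections import defaultdict


def group_nodes_by_jvm(nodes_info):
    """Group node names by the JVM version each node reports.

    Assumed input shape (an assumption, check it against your cluster):
    {"nodes": {node_id: {"name": ..., "jvm": {"version": ...}}}}
    """
    groups = defaultdict(list)
    for node_id, info in nodes_info.get("nodes", {}).items():
        version = info.get("jvm", {}).get("version", "unknown")
        groups[version].append(info.get("name", node_id))
    return dict(groups)


def jvm_versions_consistent(nodes_info):
    """Return True when every node reports the same JVM version."""
    return len(group_nodes_by_jvm(nodes_info)) <= 1


# Hypothetical sample data: two nodes on one JVM, one node on another.
sample = {
    "nodes": {
        "a1": {"name": "es-data-1", "jvm": {"version": "1.7.0_55"}},
        "b2": {"name": "es-data-2", "jvm": {"version": "1.7.0_55"}},
        "c3": {"name": "es-client-1", "jvm": {"version": "1.7.0_25"}},
    }
}

if not jvm_versions_consistent(sample):
    for version, names in sorted(group_nodes_by_jvm(sample).items()):
        print(version, "->", ", ".join(names))
```

Any version that appears with only a handful of nodes is a good candidate for the one that needs upgrading or downgrading.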
Virtual Memory Map Settings
Apache Lucene, the underlying indexing project that drives Elasticsearch, may throw an exception that looks something like:
Caused by: java.lang.OutOfMemoryError: Map failed
at sun.nio.ch.FileChannelImpl.map0(Native Method)
... 20 more
Notice that the failure occurs in the native code of the JVM (FileChannelImpl.map0), so the place to look is the OS settings.
As the issue reported at https://github.com/elasticsearch/elasticsearch/issues/4547 shows,
it is important to check your virtual memory settings.
Check whether your vm.max_map_count is still at the default of 65530. If it is, raise it (the guide linked below uses 262144).
See http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/_file_descriptors_and_mmap.html for additional guidance.
Tip: Check your virtual memory settings for the OS.
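The check above can be scripted. This is a small sketch that assumes a Linux host, where the current value is exposed at /proc/sys/vm/max_map_count. The helper names are my own, and the 262144 used in the example is only an illustrative raised value.

```python
from pathlib import Path

# Default value mentioned in the text above.
DEFAULT_MAX_MAP_COUNT = 65530


def parse_max_map_count(text):
    """Parse the contents of /proc/sys/vm/max_map_count into an int."""
    return int(text.strip())


def needs_raising(current, default=DEFAULT_MAX_MAP_COUNT):
    """Return True when the value is still at (or below) the default."""
    return current <= default


def read_max_map_count(path="/proc/sys/vm/max_map_count"):
    """Read the live value on Linux; return None if the file is absent."""
    proc_file = Path(path)
    if not proc_file.exists():
        return None
    return parse_max_map_count(proc_file.read_text())


# Example with literal values, so it runs on any OS.
print(needs_raising(parse_max_map_count("65530\n")))   # still at the default
print(needs_raising(262144))                            # already raised
```

On a real node you would call read_max_map_count() and pass the result to needs_raising(), after handling the None case for non-Linux systems.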
Understand the 3 modes of an Elasticsearch Node
An Elasticsearch node can act in 3 modes:
- Elasticsearch Master Node
- Elasticsearch Data Node
- Elasticsearch Client Node
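As a rough illustration, the three modes can be expressed as combinations of two boolean settings. This sketch assumes the node.master and node.data settings used in the Elasticsearch 1.x configuration file. The dictionary and helper function are hypothetical names of my own, and the "master" entry describes a dedicated master that holds no data.

```python
# Assumed mapping from node mode to Elasticsearch 1.x style settings.
NODE_MODES = {
    "master": {"node.master": True, "node.data": False},   # dedicated master
    "data": {"node.master": False, "node.data": True},     # holds shards
    "client": {"node.master": False, "node.data": False},  # routes requests
}


def settings_for(mode):
    """Return a copy of the settings for the given mode, or raise ValueError."""
    try:
        return dict(NODE_MODES[mode])
    except KeyError:
        raise ValueError("unknown node mode: %r" % (mode,))


for mode in ("master", "data", "client"):
    print(mode, settings_for(mode))
```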
Understand the Heap Settings and Sizing
I consider the following page in the Elasticsearch documentation to be very important when designing and sizing your cluster.
Tip: For a data node, never allocate more than 32GB to the Java heap, and leave at least half of the machine's memory to the OS, where Lucene uses it through the file system cache.
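The tip boils down to simple arithmetic: take half of the machine's memory, but cap the result at 32GB. Here is a minimal sketch of that rule. The function name is my own, and it encodes only the two limits stated in the tip, nothing more.

```python
# Upper bound on the Java heap for a data node, per the tip above.
MAX_HEAP_GB = 32


def recommended_heap_gb(total_ram_gb):
    """Return half of the RAM, capped at MAX_HEAP_GB."""
    if total_ram_gb <= 0:
        raise ValueError("total_ram_gb must be positive")
    return min(total_ram_gb / 2, MAX_HEAP_GB)


for ram in (16, 64, 128):
    print("%d GB RAM -> %s GB heap" % (ram, recommended_heap_gb(ram)))
```

For a 128GB machine this yields a 32GB heap, which leaves the remaining 96GB to the OS.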
The Elasticsearch documentation on thread pools (http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-threadpool.html) gives information on the queue sizes.
For performance, it makes sense to use bulk requests and set your bulk queue size appropriately.
Tip 9 from https://www.loggly.com/blog/nine-tips-configuring-elasticsearch-for-high-performance/ gives a hint on one possible case.
It is very important to manage the bulk queue: when it fills up, clients start seeing exceptions for the rejected requests, and data can be lost if those requests are not retried.
Tip: Size the threadpool.bulk.queue_size property to match your bulk indexing load.
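On the client side, one way to keep pressure on the bulk queue predictable is to send documents in fixed-size batches. The sketch below only builds the request bodies and does not send anything. It assumes the newline-delimited JSON format of the bulk API (an action line followed by the document source). The index name "logs", the batch size, and the helper names are placeholders of my own.

```python
import json


def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def bulk_body(index, docs):
    """Build a bulk request body from (doc_id, source) pairs.

    Each document becomes two lines: an "index" action line and the
    document itself. The body must end with a newline.
    """
    lines = []
    for doc_id, source in docs:
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps(source))
    return "\n".join(lines) + "\n"


# Hypothetical documents, split into batches of two.
docs = [(str(i), {"n": i}) for i in range(5)]
batches = list(chunked(docs, 2))
bodies = [bulk_body("logs", batch) for batch in batches]
print(len(bodies), "bulk requests prepared")
```

Fewer, larger requests put fewer entries on the queue, so the batch size and the queue size are best tuned together.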
Setting the minimum number of master nodes correctly should be a no-brainer for serious Elasticsearch administrators and developers.
Refer to http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/_important_configuration_changes.html for details on this.
Tip: Set discovery.zen.minimum_master_nodes to (number of master-eligible nodes / 2) + 1, rounding the division down.
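To make the formula concrete, here is a tiny sketch that computes the quorum with integer division. The function name is my own.

```python
def minimum_master_nodes(master_eligible):
    """Quorum of master-eligible nodes: floor(n / 2) + 1."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


for n in (1, 2, 3, 4, 5):
    print("%d master-eligible -> minimum_master_nodes = %d"
          % (n, minimum_master_nodes(n)))
```

Note that with two master-eligible nodes the quorum is also two, so losing either node leaves the cluster without a master. This is one reason three master-eligible nodes is a common starting point.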
Disable Multicast and Use Unicast
Refer to the discussion in
Disable multicast in production and use unicast. This prevents stray nodes from accidentally joining your production cluster.
Tip: Ensure that the "discovery.zen.ping.multicast.enabled" property is set to "false".
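As a sketch of what the resulting configuration might look like, the snippet below renders the multicast setting together with a unicast host list. It assumes the discovery.zen.ping.unicast.hosts setting from Elasticsearch 1.x. The host names and the helper functions are hypothetical.

```python
import json


def unicast_settings(hosts):
    """Return discovery settings with multicast off and explicit hosts."""
    if not hosts:
        raise ValueError("unicast needs at least one host")
    return {
        "discovery.zen.ping.multicast.enabled": False,
        "discovery.zen.ping.unicast.hosts": list(hosts),
    }


def render(settings):
    """Render settings as 'key: value' lines (JSON values are valid YAML)."""
    return "\n".join(
        "%s: %s" % (key, json.dumps(value)) for key, value in settings.items()
    )


# Hypothetical host names for illustration only.
config = render(unicast_settings(["es-master-1:9300", "es-master-2:9300"]))
print(config)
```

Listing only the master-eligible nodes is usually enough, since a new node discovers the rest of the cluster once it reaches one of them.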
It is very important to understand the Elasticsearch architecture, concepts and configuration before deploying it in production. Your DevOps team will be grateful when you do the right things.
I hope you find this article useful. Please inform me if I have made any mistakes or if there are any additional tips that you would like me to add.