Elasticsearch Throttling Indexing

If segment merging falls behind indexing, Elasticsearch will throttle indexing requests to a single thread.

By James Carr · Jun. 10, 16 · Analysis

Recent adventures in Zapierland had me in a somewhat scary predicament. Starting at some point last week I’d get an alert that the Elasticsearch cluster we run Graylog against had hit very high CPU, load and memory usage. I was confused… we had more than enough horsepower to handle queries and normally the cluster only uses about 10% of CPU on each node. I paused message processing (that’s a nice feature of Graylog there, it just sticks the messages into a local journal) and began looking around. Nothing seemed off. Nothing stuck out in the logs. I noticed that the CPU and load eventually dropped, shrugged my shoulders and resumed message processing. Maybe a rogue query? Who knows.

A couple days later it happened again. Pause, resume. Everything began operating normally. While I had some hunches I had no solid evidence as to the cause and assumed that maybe it was time to expand the cluster. Since it wasn’t a fire (we just use Graylog for internal log searching of API requests) I added a card to our Trello board to just go ahead and kill two birds with one stone by upgrading Graylog and ES and ensuring we up the cluster size in the rebuilt cluster. I can handle the pause/unpause cycle in the short term.

As it continued happening daily, I got very curious. It only happened during peak load time and when there were a lot of staff members using it. And it always became 100% fine with a pause/unpause messaging cycle. I looked around some more and realized that we should tune the logging level a bit and see if anything stood out the next time it hit.

Sure enough, when our engineering team did a quick support blitz it happened again. This time, the Elasticsearch logs were filled to the brim with entries like the two below.

[2016-06-06 18:20:34,880][INFO ][index.engine ] [ip-10-0-0-2.ec2.internal] [graylog2_761][0] now throttling indexing: numMergesInFlight=5, maxNumMerges=4
[2016-06-06 18:20:40,804][INFO ][index.engine ] [ip-10-0-0-2.ec2.internal] [graylog2_761][0] stop throttling indexing: numMergesInFlight=3, maxNumMerges=4
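The throttle entries come in "now throttling" / "stop throttling" pairs, so it can help to measure how long each shard actually spent throttled. Here is a minimal Python sketch of that idea. The regular expression is written against the two sample entries above only; the names THROTTLE_RE, SAMPLE and throttle_windows are made up for this illustration, and other Elasticsearch versions may format these messages differently.

```python
import re
from datetime import datetime

# Pattern written against the two sample log entries only (an assumption;
# other Elasticsearch versions may format these messages differently).
THROTTLE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\[INFO\s*\]\[index\.engine\s*\]\s*"
    r"\[(?P<node>[^\]]+)\]\s*\[(?P<index>[^\]]+)\]\[(?P<shard>\d+)\]\s*"
    r"(?P<action>now|stop) throttling indexing: "
    r"numMergesInFlight=(?P<in_flight>\d+), maxNumMerges=(?P<max_merges>\d+)"
)

# The two entries from the article, each joined onto one line.
SAMPLE = [
    "[2016-06-06 18:20:34,880][INFO ][index.engine ] [ip-10-0-0-2.ec2.internal] "
    "[graylog2_761][0] now throttling indexing: numMergesInFlight=5, maxNumMerges=4",
    "[2016-06-06 18:20:40,804][INFO ][index.engine ] [ip-10-0-0-2.ec2.internal] "
    "[graylog2_761][0] stop throttling indexing: numMergesInFlight=3, maxNumMerges=4",
]


def throttle_windows(lines):
    """Return (index, shard, seconds_throttled) for each now/stop pair."""
    open_since = {}  # (index, shard) -> time throttling started
    windows = []
    for line in lines:
        match = THROTTLE_RE.match(line.strip())
        if not match:
            continue  # not a throttle entry
        key = (match["index"], int(match["shard"]))
        ts = datetime.strptime(match["ts"], "%Y-%m-%d %H:%M:%S,%f")
        if match["action"] == "now":
            # Keep the earliest start if "now" is logged twice in a row.
            open_since.setdefault(key, ts)
        elif key in open_since:
            start = open_since.pop(key)
            windows.append((key[0], key[1], (ts - start).total_seconds()))
    return windows


if __name__ == "__main__":
    for index, shard, seconds in throttle_windows(SAMPLE):
        print(f"{index}[{shard}] throttled for {seconds:.3f}s")
```

For the two entries above this reports a single window of just under six seconds on graylog2_761 shard 0. Run against a full log file, many long windows clustered around peak load would point the same way the rest of this post does.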

Well, that’s a bit curious. Off to Google. After a little searching, I came across this entry in the Elasticsearch reference.

Segment merging is computationally expensive, and can eat up a lot of disk I/O. Merges are scheduled to operate in the background because they can take a long time to finish, especially large segments. This is normally fine, because the rate of large segment merges is relatively rare.

But sometimes merging falls behind the ingestion rate. If this happens, Elasticsearch will automatically throttle indexing requests to a single thread. This prevents a segment explosion problem, in which hundreds of segments are generated before they can be merged. Elasticsearch will log INFO-level messages stating now throttling indexing when it detects merging falling behind indexing.

Elasticsearch defaults here are conservative: you don’t want search performance to be impacted by background merging. But sometimes (especially on SSD, or logging scenarios), the throttle limit is too low.

Right after that passage, the documentation spells out what the default rate is.

The default is 20 MB/s, which is a good setting for spinning disks. If you have SSDs, you might consider increasing this to 100–200 MB/s.

Well, this seems promising because we DO use SSDs. Following the documentation, I issued a curl call to raise the indices.store.throttle.max_bytes_per_sec rate.

PUT /_cluster/settings
{
    "persistent" : {
        "indices.store.throttle.max_bytes_per_sec" : "100mb"
    }
}
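If you would rather script the change than type it by hand, the same request can be sketched with Python's standard library. This is only an illustration under assumptions: the helper names build_throttle_request and apply_throttle are made up here, http://localhost:9200 is a placeholder for your own cluster address, and the setting name is the one from the request above, which belongs to the Elasticsearch version this post was written against and may not be accepted by other versions.

```python
import json
import urllib.request


def build_throttle_request(base_url, max_bytes_per_sec="100mb"):
    """Build (but do not send) the PUT /_cluster/settings request."""
    body = {
        "persistent": {
            # Setting name taken from the article; availability depends
            # on the Elasticsearch version (an assumption to verify).
            "indices.store.throttle.max_bytes_per_sec": max_bytes_per_sec
        }
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def apply_throttle(base_url, max_bytes_per_sec="100mb", timeout=10):
    """Send the request and return the cluster's decoded JSON response."""
    request = build_throttle_request(base_url, max_bytes_per_sec)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling apply_throttle("http://localhost:9200") would send the request. Building the request separately makes it easy to inspect the URL and body before anything touches a live cluster.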


With these settings in place I had a few hours until the next support blitz. Suffice it to say, it has been several days and we haven’t had to pause message processing since. If you run on SSDs or have a logging-heavy workload, this is definitely a setting to review when you tune the defaults on your cluster.


Published at DZone with permission of James Carr, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.
