How to Triage a Busy Thread Count Alert in 14 Minutes
This is a real example of troubleshooting a production application issue, provided by an AppDynamics customer. What you are about to see is a combination of runtime analytics, adaptive data collection, intelligent alerting, and a proven problem-solving workflow. Getting from first alert to DBA handoff took only 14 minutes.
5:26 p.m. – Operations receives an email alert about Busy Threads breaching a threshold. AppDynamics detected the incident automatically and sent the alert when the Busy Threads JMX metric shot up to 182.
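To make the idea of a threshold breach concrete, here is a minimal sketch of a check that fires once a busy thread count stays above a limit for a number of consecutive readings. It is only an illustration and not how AppDynamics evaluates its alerts. The function name `first_sustained_breach`, the threshold of 100, and every sample value except the 182 peak are assumptions made for this example.

```python
from typing import Optional, Sequence


def first_sustained_breach(
    samples: Sequence[int], threshold: int, consecutive: int = 1
) -> Optional[int]:
    """Return the index of the reading that completes the first run of
    `consecutive` readings strictly above `threshold`, or None if no such
    run exists."""
    run = 0
    for i, value in enumerate(samples):
        # Extend the current run on a breach, otherwise start over.
        run = run + 1 if value > threshold else 0
        if run >= consecutive:
            return i
    return None


# Hypothetical busy thread readings; only the 182 peak comes from the incident.
busy_threads = [35, 41, 38, 120, 182]
HYPOTHETICAL_THRESHOLD = 100  # assumed value, not stated in the article

alert_index = first_sustained_breach(
    busy_threads, HYPOTHETICAL_THRESHOLD, consecutive=2
)
if alert_index is not None:
    print(f"Busy Threads alert: {busy_threads[alert_index]} > {HYPOTHETICAL_THRESHOLD}")
```

Requiring more than one consecutive breach is a common way to avoid alerting on a single momentary spike, at the cost of a slightly later notification.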
AppDynamics sends notifications detailing busy thread counts
5:34 p.m. – Details from AppDynamics show that call volume is down, response time is up, errors are up and network I/O is down. Initial suspicion is that the load balancer may be throttling traffic due to poor performance.
5:38 p.m. – Following company procedure, the server is taken out of the load balancer rotation so that it will not receive any more traffic. Recycling the application server is considered as a possible temporary resolution to the issue.
5:40 p.m. – Details from AppDynamics show that transactions are backing up because of a database issue. There is no need to recycle the application server. The issue is handed off to the DBA team with full application context for resolution.
Screenshot showing problematic JDBC call as the culprit.
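The reasoning behind the handoff is that most of the time in each slow transaction is spent in one place. A rough sketch of that kind of breakdown is shown below. It assumes a transaction has already been split into timed segments tagged by tier; the tier names, the timings, and the helper `time_share_by_tier` are all hypothetical and do not reflect AppDynamics data structures.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def time_share_by_tier(segments: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Sum the milliseconds spent in each tier and return each tier's
    fraction of the total time (an empty dict if there is no time at all)."""
    totals: Dict[str, float] = defaultdict(float)
    for tier, millis in segments:
        totals[tier] += millis
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {tier: millis / grand_total for tier, millis in totals.items()}


# Hypothetical segments for one slow transaction: (tier, milliseconds).
segments = [
    ("servlet", 40.0),
    ("jdbc", 900.0),
    ("jdbc", 1000.0),
    ("web_service", 60.0),
]

shares = time_share_by_tier(segments)
slowest_tier = max(shares, key=shares.get)
print(f"{slowest_tier} accounts for {shares[slowest_tier]:.0%} of the time")
```

When one tier dominates like this, restarting the application server would not address the cause, which is why the issue goes to the team that owns that tier.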
Later that day: The DBA team fixes the issue and application response time returns to normal. All nodes are restored to the load balancer rotation.
This is an example of a scenario that IT Operations teams deal with regularly. Without AppDynamics in place to provide fault domain isolation, this type of problem usually ends up in a long conference call where all support personnel for the application must participate until service has been restored. There is no need to waste significant company resources anymore. Stop the “all hands on deck” madness and see how AppDynamics can help your company today.
Published at DZone with permission of Jim Hirschauer, DZone MVB.