Solving OutOfMemoryErrors - APM tools as a solution?
APM solutions are positioned as the Holy Grail in the quest to solve your production environment’s performance problems. You just set up an APM tool of your choice, let it monitor your whole cluster from the front-end load balancer - or even the end user's browser - down to your Oracle or Neo4j database, and then relax in that Aeron chair of yours. The APM tool provides all the information your operations team or developers need in order to achieve the desired level of customer satisfaction.
If you thought there was a grain of sarcasm in what I have just said, you are right. I have always been suspicious of jacks-of-all-trades. But let us cast all doubts aside and just try them out.
Our first list of APMs to try consisted of five names: AppDynamics, CA APM (formerly Introscope), dynaTrace, HPjmeter and New Relic.
HPjmeter dropped out at once, as it is available for HP-UX platforms only. New Relic has no memory leak tracking capabilities yet, so there we lost a second contestant. For CA APM and dynaTrace we were unable to obtain a free evaluation version. This left us with AppDynamics alone. Kudos to the AppDynamics team, who gladly gave us the opportunity to try the solution!
The AppDynamics installation provides the AppDynamics Controller, a central web-based dashboard that collects and processes data from Java Application Server Agents. The agents are JVM agents, just like Plumbr. They can be attached to different servers and applications, and once attached they send runtime information to the Controller. With our sample application, the AppDynamics Controller dashboard looks like this:
If we dive a little deeper, we can see memory-related data per node:
The screenshot above shows only a subset of the information provided on the screen. You can scroll all the way down for many more graphs, trends and data. This is all nice, but not exactly what we were looking for.
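To give a rough idea of the kind of per-node memory data an in-JVM agent could sample, here is a minimal sketch built on the standard `java.lang.management` API. It is only an illustration: the class name `HeapSampler` and its methods are hypothetical, and nothing here is taken from how AppDynamics (or Plumbr) actually collects its data.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper for illustration; not part of any APM product.
final class HeapSampler {

    // Returns the number of heap bytes currently in use, as reported by the JVM.
    static long usedHeapBytes() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        return heap.getUsed();
    }

    // Returns the maximum heap size in bytes, or -1 if the JVM does not define one.
    static long maxHeapBytes() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getMax();
    }

    public static void main(String[] args) throws InterruptedException {
        // Print a handful of samples, one per second.
        for (int i = 0; i < 5; i++) {
            System.out.printf("heap used: %d of %d bytes%n",
                    usedHeapBytes(), maxHeapBytes());
            Thread.sleep(1000);
        }
    }
}
```

Numbers like these tell you that the heap is filling up, which is the sort of trend the graphs display. They do not tell you which objects are responsible, and that is the question we were really trying to answer.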
As our goal was to let AppDynamics help us find the root cause of the memory leak we had suffered from for so long, we switched to the “Automatic Leak Detection” tab and activated it (it is switched off by default). Then we started an “On Demand Capture Session” and let the application run under the load of our stress tests. The results were... a bit disappointing. Even after several retries with different values for the parameters that AppDynamics lets you configure, the result was always the same:
All this happened while my application was crashing with an OutOfMemoryError. No luck here today.
It was very difficult for me to write this post. First of all, I had only one tool to test, and getting even that one was not an easy task. Secondly, I wasted several hours before I saw the first bit of information in the AppDynamics dashboard (do not ask, I will not write about it). And thirdly, I have no results to show to my readers. But here we are, so what can I conclude from this experience?
- APM tools require planning. If your house is on fire, it is too late to go fetching them. As with life insurance, you should have thought about them long before the moment you really need them.
- APM tools give you tons of information about the inner workings of your application, performance-wise. But I was hoping for something that would answer my question “Why do I have a memory leak and what should I do?” more directly.
- All in all, APM solutions can be great tools for monitoring your application and for proactively planning its performance-related maintenance. I cannot recommend them for problem solving, though.