Over a decade of being responsible for delivering various IT projects, I have involuntarily developed a reflex - whenever I hear someone say "Our application has a memory leak", I immediately get shivers running down my spine.
The thing is - you can never estimate the amount of time it takes to solve the leak. Moreover - if the problem is in production, it means immediately allocating the best developer(s) in the team for an impossible-to-estimate period of time, and dropping everything else they were doing.
You're a bit more in luck if the leak is not in production, but the prospect of having to face the unknowns that always accompany memory leaks (How can we reproduce it? What is leaking? Where is it leaking? How long will it take to fix the leak? How can we verify the fix?) is still not very pleasant - to say the least.
However, in the last few months I've had several positive encounters with memory leaks, and as a result the reflex has started to fade. It turns out that it is possible to find the cause of a memory leak - and also fix it! - in half an hour.
What? Fix a memory leak in half an hour?
Well, it is actually possible to do even faster :)
As an example, let me describe an adventure I had last week. I normally write to everyone who tries our memory leak detector tool, and ask for feedback. That day I got a reply from a guy from Stanford University (let's call him Mr. A) who used it with one of his Java applications. He said that the app had an "extremely slow memory leak" that Plumbr couldn't detect. With that he certainly caught my attention, as we are always on the lookout for memory leaks that Plumbr is not able to recognize (we haven't found any in the last few months... if you have one, do let me know!).
Anyway, I immediately replied to the guy, and we agreed to have a Skype meeting the next day. The following is a chronological overview of our Skype session:
* Mr. A. said that he had run his app with Plumbr several times, and hadn't received a memory leak report.
* After some double checking questions we finally realized that Plumbr actually had detected the leak. We opened the report at 12:18. This is what we saw there:
* 39k objects of javax.management.ObjectName - this looked fishy. Unfortunately, Mr. A. reported he didn't know much of the internals of the library the HttpRequestHandler object belonged to. Luckily though he had the source code of the library at hand, and immediately showed me the HttpRequestHandler source code. I received the file at 12:28.
* Now, my own Java knowledge ended here, but fortunately I had the technical two thirds of our team in another Skype window. It took Nikita 3 minutes to read the leak report and analyze the source code, and come up with a workaround that would avoid leaking the object.
* This is the Skype log of what happened next:
[12:31:09] Priit Potter: do you use JMX in [your application]?
[12:32:59] Priit Potter: if not, you could just use false in the constructor's input parameters, the last one here:
...that should work around creating those leaking objects
[12:33:13] Mr. A.: huh, interesting
[12:34:17] Mr. A. : alright, I made that change to disable use of jmx
[12:34:30] Mr. A. : I guess we should re-run it and see if that fixes that issue
[12:34:59] Priit Potter: yup, try it
[12:35:02] Priit Potter: keep Plumbr attached
[12:35:59] Mr. A. : ok
And that was it. We had solved a memory leak, and it took us 16 minutes from learning that we had a leak in the application to actually deploying the fix. And this despite all the communication barriers (a 10-hour time difference, using multiple Skype chats in parallel, etc).
A couple of days ago I also received confirmation from Mr. A. that our fix had indeed solved the problem. Great!
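For readers wondering how a JMX flag can decide whether javax.management.ObjectName instances pile up, here is a minimal sketch of the general pattern. It is not the library's actual code: the Handler class, its useJmx constructor parameter, the sketch:type=Handler naming scheme and the helper methods are all made up for illustration. The assumption is simply that each handler registers itself with the platform MBean server and nobody ever unregisters it.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class JmxLeakSketch {

    // Standard MBean convention: the management interface is named <Class>MBean.
    public interface HandlerMBean {
        int getHandledCount();
    }

    // Hypothetical stand-in for a request handler; NOT the real library class.
    public static class Handler implements HandlerMBean {
        private static int nextId = 0; // single-threaded sketch, no synchronization
        private int handled = 0;

        public Handler(boolean useJmx) throws Exception {
            if (useJmx) {
                MBeanServer server = ManagementFactory.getPlatformMBeanServer();
                ObjectName name = new ObjectName("sketch:type=Handler,id=" + (nextId++));
                // The server holds on to the name and to this handler
                // until someone calls unregisterMBean(name). Nobody does here.
                server.registerMBean(this, name);
            }
        }

        public void handle() {
            handled++;
        }

        @Override
        public int getHandledCount() {
            return handled;
        }
    }

    // Counts the handler MBeans currently known to the platform MBean server.
    static int registeredHandlers() throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        return server.queryNames(new ObjectName("sketch:type=Handler,*"), null).size();
    }

    // Creates one short-lived handler per "request" and returns how many
    // registrations were left behind afterwards.
    static int simulateRequests(int count, boolean useJmx) throws Exception {
        int before = registeredHandlers();
        for (int i = 0; i < count; i++) {
            new Handler(useJmx).handle();
        }
        return registeredHandlers() - before;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("left behind with JMX on:  " + simulateRequests(1000, true));
        System.out.println("left behind with JMX off: " + simulateRequests(1000, false));
    }
}
```

With the flag on, every handler leaves a registration behind even though the calling code drops its own reference right away; with the flag off, nothing is registered at all. A longer-term fix in code like this would be to unregister the MBean when the handler is done, but switching the flag off is the kind of quick workaround that was applied in the session above.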
What allows for such a difference?
The tools that existed before Plumbr (we've covered many of them in our Solving OutOfMemoryError series - a couple more posts still to come) concentrate on giving you as much information as possible about the heap contents, and organizing it for somewhat easier reading. Some do it better, some have more room for improvement, but at the end of the day you still need to interpret that data to find the cause of the memory leak. If you have a big application that is not entirely written by you, and you don't have much experience with solving memory leaks or no time to concentrate, it can take you days or even weeks to find that needle in the haystack.
Plumbr does the interpretation for you. It gets to know your application and learns its memory usage patterns. With some help from the sweat we've poured into it, it detects memory leaks well before these affect your end users' experience, and also reports the exact cause of the leak.
And if you have the name of the leaking class, the line in the source code where the objects of the class are created, and the reference stack, any decent Java developer will solve the leak in minutes.
This is what Plumbr does in a nutshell - it reduces the pile of unknown situations to an easily solvable task that can almost always be solved in half an hour.
It also helps project managers get rid of their fear of memory leaks. I wish Plumbr had existed 10 years ago; it would have saved me some nerve cells from tense customer meetings :)