Is Your JVM Leaking File Descriptors — Like Mine?


Quick! Go check!


Is your JVM leaking?

The two issues described here were discovered and fixed more than a year ago. This article only serves as a historical record and a beginners' guide to tackling file descriptor leaks in Java.

In Ultra ESB we use an in-memory RAM disk file cache for fast and garbage-free payload handling. Some time back, we faced an issue on our shared SaaS AS2 Gateway where this cache was leaking file descriptors over time, eventually leading to "too many open files" errors when the system limit was hit.

The Legion of the Bouncy Castle: Leftovers From Your Stream-Backed MIME Parts?
One culprit, we found, was Bouncy Castle — the famous security provider that had been our profound love since the Ultra ESB Legacy days.

Bouncy Castle FTW!

With a simple, home-made toolkit, we found that BC had the habit of calling getContent() on MIME parts in order to determine their type (say, for instanceof checks). True, this wasn't a crime in itself; but most of our MIME parts were file-backed, with a file-cache file on the other end, meaning that each getContent() call opened a new stream to the file. The result was stray streams (and hence file descriptors) pointing to our file cache.

Enough of these, and we would exhaust the file descriptor quota allocated to the Ultra ESB (Java) process.

Solution? Make 'em Lazy!

We didn't want to mess with the BC codebase. So we found a simple solution: create all file-backed MIME parts with "lazy" streams. Our (former) colleague Rajind wrote a LazyFileInputStream — inspired by LazyInputStream from jboss-vfs — that opens the actual file only when a read is attempted.
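The original LazyFileInputStream is not reproduced in this article, so the following is only a minimal sketch of the idea, assuming a plain java.io.File as the backing store. The isOpened() helper is hypothetical and exists only to make the laziness observable.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Sketch only: an InputStream that acquires its file descriptor on the first read. */
class LazyFileInputStream extends InputStream {
    private final File file;
    private InputStream delegate; // stays null until something is actually read
    private boolean closed;

    LazyFileInputStream(File file) {
        this.file = file; // no descriptor is opened here
    }

    /** Hypothetical helper: true while a real file stream is held open. */
    synchronized boolean isOpened() {
        return delegate != null;
    }

    private synchronized InputStream delegate() throws IOException {
        if (closed) {
            throw new IOException("Stream closed");
        }
        if (delegate == null) {
            delegate = new FileInputStream(file); // the descriptor is acquired here
        }
        return delegate;
    }

    @Override
    public int read() throws IOException {
        return delegate().read();
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        return delegate().read(b, off, len);
    }

    @Override
    public synchronized int available() throws IOException {
        // Answer without touching the file if nothing has been read yet.
        return delegate == null ? 0 : delegate.available();
    }

    @Override
    public synchronized void close() throws IOException {
        closed = true;
        if (delegate != null) {
            delegate.close();
            delegate = null;
        }
    }
}
```

With a wrapper along these lines, code that merely obtains the stream (for a type check, say) and then drops it never costs a descriptor; only callers that read from it do.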

Yeah, lazy.

BC was happy, and so was the file cache, but we were the happiest.

Hibernate JPA: Cleaning up After Supper, A.K.A Closing Consumed Streams

Another bug we spotted was that some database operations were leaving behind unclosed file handles. Apparently this was only when we were feeding stream-backed blobs to Hibernate, where the streams were often coming from file cache entries.

Hibernate: hassle-free ORM, but with leaks?

After some digging, we came up with a theory that Hibernate was not closing the underlying streams of these blob entries. (It made sense because the java.sql.Blob interface does not expose any methods that Hibernate could use to manipulate the underlying data sources.) This was a problem, though, because the discarded streams (and the associated file handles) would not get released until the next GC.

This would have been fine for a short-lived app, but a long-running one like ours could easily run out of file descriptors, for example during a sudden and persistent spike.

Solution? Make 'em Self-Closing!

We didn't want to lose the benefits of streaming, but we didn't have control over our streams either. You might say we should have placed our streams in auto-closeable constructs (say, try-with-resources). Nice try; but sadly, Hibernate was reading them outside of our execution scope (especially in @Transactional flows). As soon as we started closing the streams within our code scope, our database operations started to fail miserably — screaming "stream already closed!"
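To illustrate why try-with-resources did not help, here is a small sketch of the failure mode. DeferredWriter is a hypothetical stand-in for any framework that accepts a stream now and reads it later; it is not Hibernate's API, and the temp file merely plays the role of a file-cache entry.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;

final class DeferredReadDemo {

    /** Hypothetical stand-in for a framework that consumes the stream later. */
    static final class DeferredWriter {
        private InputStream pending;

        void stage(InputStream in) {
            pending = in; // nothing is read at this point
        }

        int flush() throws IOException {
            return pending.read(); // the actual read happens here, possibly much later
        }
    }

    public static void main(String[] args) throws IOException {
        File payload = File.createTempFile("payload", ".bin");
        payload.deleteOnExit();
        Files.write(payload.toPath(), new byte[] {42});

        DeferredWriter writer = new DeferredWriter();
        try (InputStream in = new FileInputStream(payload)) {
            writer.stage(in);
        } // the stream is closed here, before anyone has read it

        try {
            writer.flush();
        } catch (IOException e) {
            System.out.println("deferred read failed: " + e.getMessage());
        }
    }
}
```

The stream's lifetime has to extend to the point where the consumer finishes reading, which is outside the scope that created it.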

When in Rome, do as Romans do, they say.

So, instead of messing with Hibernate, we decided we would take care of the streams ourselves.

Rajind (yeah, him again) hacked together a SelfClosingInputStream wrapper. This would keep track of the amount of data read from the underlying stream, and close it up as soon as the last byte was read.

Self-closing. Takes care of itself!

(We did consider using existing options like AutoCloseInputStream from Apache commons-io, but it turned out that we needed some customizations here and there, like detailed trace logging.)
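The actual SelfClosingInputStream is not shown here, so what follows is a rough sketch of the technique under two assumptions: the expected payload length is known when the wrapper is created, and reads after self-closing should look like end-of-stream rather than an error. The isClosed() helper is hypothetical, and the trace logging mentioned above is left out.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Sketch only: closes the wrapped stream as soon as the last expected byte is consumed. */
class SelfClosingInputStream extends FilterInputStream {
    private final long expectedLength; // assumption: the caller knows the payload size
    private long bytesRead;
    private boolean closed;

    SelfClosingInputStream(InputStream in, long expectedLength) {
        super(in);
        this.expectedLength = expectedLength;
    }

    /** Hypothetical helper for inspection and tests. */
    boolean isClosed() {
        return closed;
    }

    @Override
    public int read() throws IOException {
        if (closed) {
            return -1; // behave like an exhausted stream after self-closing
        }
        int b = super.read();
        account(b < 0 ? -1 : 1);
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        if (closed) {
            return -1;
        }
        int n = super.read(buf, off, len);
        account(n);
        return n;
    }

    @Override
    public long skip(long n) throws IOException {
        if (closed) {
            return 0;
        }
        long skipped = super.skip(n);
        account(skipped);
        return skipped;
    }

    /** Updates the running total and closes on end-of-stream or once everything was consumed. */
    private void account(long n) throws IOException {
        if (n > 0) {
            bytesRead += n;
        }
        if (n < 0 || bytesRead >= expectedLength) {
            close();
        }
    }

    @Override
    public void close() throws IOException {
        if (!closed) {
            closed = true;
            super.close(); // releases the underlying file descriptor
        }
    }
}
```

Because the wrapper closes itself, the consumer can read at its own pace and never needs to know that a file handle sits underneath. The trade-off is that a consumer which abandons the stream halfway still leaves the descriptor open until GC.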

The Bottom Line

When it comes to resource management in Java, it is quite easy to over-focus on memory and CPU (processing) and forget about the rest. Virtual resources, like ephemeral ports and per-process file descriptors, can be just as important, if not more so.

Especially on long-running processes like our AS2 Gateway SaaS application, they can literally become silent killers.

You can detect this type of "leak" in two main ways:

  • "Single-cycle" resource analysis — Run a single, complete processing cycle, comparing resource usage before and after.
  • Long-term monitoring — Continuously record and analyze resource metrics to identify trends and anomalies.
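The "single-cycle" check can be sketched inside the JVM itself. This is not the home-made toolkit mentioned earlier, just an illustration: it assumes a HotSpot-style JVM on a Unix-like OS, where the platform MXBean implements com.sun.management.UnixOperatingSystemMXBean. The FdLeakCheck class, its method names and the deliberately leaky cycle are all made up for the example.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import java.util.ArrayList;
import java.util.List;

final class FdLeakCheck {

    /** Open descriptor count of this JVM, or -1 where the platform bean does not expose it. */
    static long openFdCount() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            return ((com.sun.management.UnixOperatingSystemMXBean) os).getOpenFileDescriptorCount();
        }
        return -1; // e.g. on Windows
    }

    /** Descriptor growth across one run of the cycle, measured after a warm-up run. */
    static long fdGrowth(Runnable cycle) {
        cycle.run(); // warm-up, so one-off costs such as class loading are not counted
        long before = openFdCount();
        cycle.run();
        return openFdCount() - before;
    }

    public static void main(String[] args) throws IOException {
        File payload = File.createTempFile("payload", ".bin");
        payload.deleteOnExit();
        List<InputStream> forgotten = new ArrayList<>(); // keeps leaked streams reachable

        long growth = fdGrowth(() -> {
            try {
                forgotten.add(new FileInputStream(payload)); // opened, never closed
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
        System.out.println("descriptors gained per cycle: " + growth);
    }
}
```

For long-term monitoring, the same openFdCount() value could be sampled periodically and fed to whatever metrics system is already in place; a count that keeps climbing between restarts is the trend to look for.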

In any case, fixing the leak is not too difficult once you have a clear picture of what you are dealing with.

Good luck with hunting down your resource-hog d(a)emons!

Published at DZone with permission of Janaka Bandara, DZone MVB. See the original article here.
