Lock Detection: Eating Our Own Dog Food

In our labs, we are working hard on new performance optimization solutions. One of the new features taking shape is a lock contention detector. In just a few months, we plan to give wider access to the lock detection functionality that is currently in private beta.

During the beta program, we trace JVM internals to detect locks and to determine how long threads wait for a given lock to be granted. Private beta members are notified about each detected lock, which lets us verify whether we have indeed found a performance bottleneck and whether our solution guidelines are correct.
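To make "how long the threads are waiting for a lock" concrete, here is a minimal sketch using the JVM's standard `java.lang.management.ThreadMXBean` contention monitoring. It is not how Plumbr traces locks; the class name `ContentionProbe` and the method `measure` are made up for illustration. One thread holds a monitor for a fixed time while a second thread blocks on it, and the JVM then reports the waiter's blocked count and blocked time.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

// Illustrative only: a hypothetical helper, not part of Plumbr.
class ContentionProbe {
    /**
     * Makes one thread wait for a monitor that another thread holds for
     * holdMillis, and returns {blockedCount, blockedTimeMillis} as reported
     * by the JVM for the waiting thread. Returns {-1, -1} if this JVM does
     * not support thread contention monitoring.
     */
    static long[] measure(long holdMillis) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        if (!mx.isThreadContentionMonitoringSupported()) {
            return new long[] {-1, -1};
        }
        // Blocked time is only recorded once monitoring is switched on.
        mx.setThreadContentionMonitoringEnabled(true);

        final Object lock = new Object();
        final CountDownLatch lockTaken = new CountDownLatch(1);
        final long[] result = new long[2];

        Thread holder = new Thread(() -> {
            synchronized (lock) {
                lockTaken.countDown();          // signal: the lock is now held
                try {
                    Thread.sleep(holdMillis);   // simulate slow work under the lock
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        Thread waiter = new Thread(() -> {
            try {
                lockTaken.await();              // make sure the holder got there first
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            synchronized (lock) {               // blocks until the holder releases
                ThreadInfo info = mx.getThreadInfo(Thread.currentThread().getId());
                result[0] = info.getBlockedCount();
                result[1] = info.getBlockedTime();
            }
        });

        holder.start();
        waiter.start();
        try {
            holder.join();
            waiter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while measuring", e);
        }
        return result;
    }
}
```

Calling `ContentionProbe.measure(200)` should report at least one blocked entry and a blocked time close to the hold time. These counters are per thread, so attributing waits to a specific lock or method, as in the statistics shown in this post, takes more than this API alone.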

Based on this data, we build further improvements into the detection algorithms. To give you a hint of what to expect from us soon, here is an example from the “eat your own dog food” category: one of the applications we monitor for lock contention issues is Plumbr itself. In last week's lock contention statistics, we were suddenly facing the following:

| Lock identifier | # locks | # long locks | Long wait (ms) |
|---|---|---|---|
| eu.plumbr.SocketRegistry.getUserSessions() | 22,440 | 24 | 280,040 |
| eu.plumbr.SocketRegistry.addSocket() | 19,024 | 1,422 | 22,520,822 |
| eu.plumbr.SocketRegistry.removeSocket() | 12,886 | 610 | 7,982,992 |

The easiest way to interpret the data above is to read it as follows: invocations of eu.plumbr.SocketRegistry.addSocket() had to wait behind a lock 19,024 times. Of those 19,024 accesses, 1,422 waited for a long time (defined as longer than 10,000 ms) to get access to the synchronized content. In total, these long waits added up to more than six hours (22,520,822 ms) of waiting for our end users. Oops.

Checking the code base, we were able to confirm the problem. SocketRegistry is the class responsible for keeping track of connections from Plumbr agents sending data to the server. These connections are also linked to the web interface to provide real-time updates in the UI.

Internally, sockets were kept in several maps to enable quick lookup of a session by different keys. Modifications of those maps inside these methods were guarded by standard Java synchronization on the maps themselves. All this was done to ensure that connecting sockets, disconnecting sockets and the web interface querying existing sockets all see the same picture.

As usual, initial assumptions change as software evolves, and some of the nitty-gritty details slip away during further development. So, while adding, removing and accessing sockets were initially fast operations, new features added a couple of database calls within those synchronized methods. The added database round-trips increased the time the lock was held. Longer lock holding meant that, as the number of clients grew, so did the number of threads waiting concurrently for the lock. Combined with the increased method execution times, this snowballed the lock wait times to tens of seconds in the worst cases.
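The pattern described above can be sketched roughly as follows. This is a hypothetical reconstruction, not the actual Plumbr code: the class `ContendedSocketRegistry`, the single map and the `accountLookup` function (standing in for a database round-trip) are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Illustrative only: a hypothetical registry, not the actual Plumbr class.
class ContendedSocketRegistry {
    private final Map<String, String> accountBySession = new HashMap<>();
    // Stands in for a database round-trip (hypothetical).
    private final Function<String, String> accountLookup;

    ContendedSocketRegistry(Function<String, String> accountLookup) {
        this.accountLookup = accountLookup;
    }

    void addSocket(String sessionId) {
        synchronized (accountBySession) {
            // The slow call runs while the lock is held, so every other
            // thread that needs the map queues up behind it.
            String account = accountLookup.apply(sessionId);
            accountBySession.put(sessionId, account);
        }
    }

    int size() {
        synchronized (accountBySession) {
            return accountBySession.size();
        }
    }

    // Exposed only so a demo can check Thread.holdsLock(...) on the monitor.
    Object monitor() {
        return accountBySession;
    }
}
```

With this shape, the time a caller spends waiting grows with both the duration of the lookup and the number of threads already queued, which matches the snowballing effect described above.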

We immediately took action and fixed the contention issue. We no longer perform any database calls from inside synchronized blocks, which keeps the synchronized sections as short as possible. But the moral of the story is a bit different. As it played out, eating your own dog food can sometimes be surprisingly pleasurable.
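A minimal sketch of the fix mentioned above, keeping database calls out of the synchronized section, might look like this. Again, the class `ShortLockSocketRegistry` and the `accountLookup` function (standing in for a database round-trip) are hypothetical names for illustration and not the actual Plumbr code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Illustrative only: a hypothetical registry, not the actual Plumbr class.
class ShortLockSocketRegistry {
    private final Map<String, String> accountBySession = new HashMap<>();
    // Stands in for a database round-trip (hypothetical).
    private final Function<String, String> accountLookup;

    ShortLockSocketRegistry(Function<String, String> accountLookup) {
        this.accountLookup = accountLookup;
    }

    void addSocket(String sessionId) {
        // Do the slow work first, without holding any lock.
        String account = accountLookup.apply(sessionId);
        synchronized (accountBySession) {
            // Only the in-memory map update is guarded.
            accountBySession.put(sessionId, account);
        }
    }

    int size() {
        synchronized (accountBySession) {
            return accountBySession.size();
        }
    }

    // Exposed only so a demo can check Thread.holdsLock(...) on the monitor.
    Object monitor() {
        return accountBySession;
    }
}
```

The trade-off is that the looked-up value may be slightly out of date by the time it is stored, so this shape only fits when the lookup result does not have to be consistent with the map contents at the moment of the update.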

And as another takeaway: we are more and more certain that lock contention problems are truly difficult to track down and resolve. We can and will provide an easy way to tackle this complex problem. If you are interested in getting early access to this functionality, let us know and we can add you to our private beta program.

Published at DZone with permission of Nikita Salnikov-tarnovski, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.
