
Try Optimizing Memory Consumption First


Want to increase your performance? Memory is key. Optimizing your memory consumption should be your first step. Check out these tips to better allocate memory.


[This article by Peter Lawrey comes to you from the DZone Guide to Performance & Monitoring -- 2015 Edition.]

When you want your application to run faster, you might start off by doing some CPU profiling. However, when I'm looking for quick wins in optimization, it's the memory profiler I target first.

Allocating Memory Is Cheap

Allocating memory has never been cheaper. You can buy 16 GB for less than $200. There are affordable machines with hundreds of GBs of memory. The memory allocation operation is also cheaper than it has been in the past, and it’s multi-threaded, so it scales reasonably well. However, memory allocation is not free.

Your CPU cache is a precious resource, especially if you are trying to use multiple threads. While you can buy 16 GB of main memory easily, you might only have around 2 MB of cache per logical CPU. If you want these CPUs to run independently, you want each of them to spend as much time as possible within its 256 KB L2 cache.
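The cost of falling out of cache is easy to demonstrate. The sketch below is illustrative (the class and constants are mine, not from the article): it sums the same array twice, once sequentially and once with a large stride, so the strided walk touches a new cache line on almost every access and defeats the hardware prefetcher. On typical hardware the strided pass is several times slower, even though both read exactly the same data.

```java
// Illustrative sketch: sequential vs. strided reads over the same data.
// Both loops read every element exactly once and produce the same sum,
// but the strided walk misses cache on nearly every access.
public class CacheWalk {
    static final int SIZE = 1 << 24;          // 16M ints = 64 MB, far larger than L2

    static long sequentialSum(int[] data) {
        long sum = 0;
        for (int i = 0; i < data.length; i++) sum += data[i];
        return sum;
    }

    static long stridedSum(int[] data, int stride) {
        long sum = 0;
        // Visit every element exactly once, but jump `stride` ints at a time.
        for (int start = 0; start < stride; start++)
            for (int i = start; i < data.length; i += stride)
                sum += data[i];
        return sum;
    }

    public static void main(String[] args) {
        int[] data = new int[SIZE];
        for (int i = 0; i < data.length; i++) data[i] = i & 0xFF;

        long t0 = System.nanoTime();
        long a = sequentialSum(data);
        long t1 = System.nanoTime();
        long b = stridedSum(data, 4096);      // jump 16 KB between reads
        long t2 = System.nanoTime();

        // Same answer, very different cost on typical hardware.
        System.out.printf("sequential: %d ms, strided: %d ms, equal: %b%n",
                (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000, a == b);
    }
}
```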

Allocating Memory Is Not Linear

Allocating memory on the heap is not linear. The CPU is very good at doing things in parallel. This means that if memory bandwidth is not your main bottleneck, the rate you produce garbage has less impact than whatever is your bottleneck. However, if the allocation rate is high enough (and in most Java systems it is high), it will be a serious bottleneck. You can tell that the allocation rate is a bottleneck if:

  • You are close to the maximum allocation rate of the machine. Write a small test that creates a lot of garbage and measure the allocation rate. If you are close to the max allocation rate, you have a problem.
  • When you reduce the garbage produced by, say, 10%, the 99th percentile latency of your application improves by 10%, and yet the allocation rate hardly drops. This means your application sped up until it hit the same bottleneck again.
  • You have very long GC pause times (e.g. into the seconds). At this point, your memory consumption is having a very high impact on your performance, so reducing the memory consumption and allocation rate can improve scalability (how many requests you can process concurrently) and reduce the amount of time during which the application freezes. 
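For the first check, a minimal allocation-rate test might look like the sketch below (the class name and constants are my own, not from the article). It creates garbage as fast as it can and reports bytes allocated per second; compare what your application achieves against this rough machine maximum.

```java
// Minimal sketch of an allocation-rate test: create small short-lived
// objects in a tight loop and report bytes allocated per second.
public class AllocationRateTest {
    static volatile Object sink;              // keep a reference so the JIT cannot eliminate the allocation

    static double allocationRateBytesPerSec(long targetBytes) {
        long allocated = 0;
        long start = System.nanoTime();
        while (allocated < targetBytes) {
            sink = new byte[1024];            // 1 KB of garbage per iteration
            allocated += 1024;
        }
        double seconds = (System.nanoTime() - start) / 1e9;
        return allocated / seconds;
    }

    public static void main(String[] args) {
        // Churn through ~1 GB of garbage and report the rate.
        double rate = allocationRateBytesPerSec(1L << 30);
        System.out.printf("~%.2f GB/s allocation rate%n", rate / 1e9);
    }
}
```

Run it a few times so the JIT warms up; the steady-state number is the one to compare against.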

Combining the CPU and Memory Views 

After reducing the memory allocation rate, I look at the CPU consumption with memory tracing turned on. This gives more weight to the memory allocations and will provide an alternative to just looking at the CPU alone. When this CPU and memory view shows you that the application is spending most of its time doing essential work, and there are no more easy performance gains to be made, I then look at CPU profiling alone. Using these techniques as a starting point, my aim typically is to reduce the 99th percentile latency (the worst 1%) by a factor of 10. This approach can also increase the throughput of each thread and allow you to run more threads concurrently in an efficient manner. 
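To track that 99th percentile, I need the percentile taken from actual latency samples rather than an average. A minimal nearest-rank sketch (class and method names are my own, for illustration):

```java
import java.util.Arrays;

public class Percentile {
    // Nearest-rank percentile: the smallest sample such that at least
    // p% of all samples are less than or equal to it.
    static long percentile(long[] samplesNanos, double p) {
        long[] sorted = samplesNanos.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(rank - 1, 0)];
    }

    public static void main(String[] args) {
        long[] latencies = new long[1000];
        for (int i = 0; i < latencies.length; i++) latencies[i] = i + 1;
        // With samples 1..1000, the 99th percentile is 990.
        System.out.println(percentile(latencies, 99.0));   // prints 990
    }
}
```

The point of watching the 99th percentile rather than the mean is that GC pauses and allocation stalls show up almost entirely in the tail.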



