DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Zones

Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

Discover how Kubernetes continues to shape the industry as developers drive innovation and prepare for the future of K8s.

Observability and performance monitoring: DZone's final 2024 Trend Report survey is open! We'd love to hear about your experience.

Threat Detection: Learn core practices for managing security risks and vulnerabilities in your organization — don't regret those threats!

Celebrate a decade of Kubernetes. Explore why K8s continues to be one of the most prolific open-source systems in the SDLC.

Related

  • The Best Way to Completely Purge a Varnish Cache
  • Duplicate Objects in Java: Not Just Strings
  • Supporting Offline Mode: Key Strategies and Benefits for iOS Mobile Apps
  • The Ultimate Database Scaling Cheatsheet: Strategies for Optimizing Performance and Scalability

Trending

  • Monolithic vs Microservices Architecture: Key Differences
  • Unit Integration Testing With Testcontainers Docker Compose
  • MariaDB Vector Edition: Designed for AI
  • Creating MVPs on a Budget: A Guide for Tech Startups
  1. DZone
  2. Data Engineering
  3. Data
  4. False Sharing

False Sharing

By 
Martin Thompson user avatar
Martin Thompson
·
Aug. 31, 11 · Interview
Like (10)
Save
Tweet
Share
38.0K Views

Join the DZone community and get the full member experience.

Join For Free

Memory is stored within the cache system in units know as cache lines.  Cache lines are a power of 2 of contiguous bytes which are typically 32-256 in size.  The most common cache line size is 64 bytes.   False sharing is a term which applies when threads unwittingly impact the performance of each other while modifying independent variables sharing the same cache line.  Write contention on cache lines is the single most limiting factor on achieving scalability for parallel threads of execution in an SMP system.  I’ve heard false sharing described as the silent performance killer because it is far from obvious when looking at code.

To achieve linear scalability with number of threads, we must ensure no two threads write to the same variable or cache line.  Two threads writing to the same variable can be tracked down at a code level.   To be able to know if independent variables share the same cache line we need to know the memory layout, or we can get a tool to tell us.  Intel VTune is such a profiling tool.  In this article I’ll explain how memory is laid out for Java objects and how we can pad out our cache lines to avoid false sharing.

Figure 1.

Figure 1. above illustrates the issue of false sharing.  A thread running on core 1 wants to update variable X while a thread on core 2 wants to update variable Y.  Unfortunately these two hot variables reside in the same cache line.  Each thread will race for ownership of the cache line so they can update it.  If core 1 gets ownership then the cache sub-system will need to invalidate the corresponding cache line for core 2.  When Core 2 gets ownership and performs its update, then core 1 will be told to invalidate its copy of the cache line.  This will ping pong back and forth via the L3 cache greatly impacting performance.  The issue would be further exacerbated if competing cores are on different sockets and additionally have to cross the socket interconnect.

Java Memory Layout

For the Hotspot JVM, all objects have a 2-word header.  First is the “mark” word which is made up of 24-bits for the hash code and 8-bits for flags such as the lock state, or it can be swapped for lock objects.  The second is a reference to the class of the object.  Arrays have an additional word for the size of the array.  Every object is aligned to an 8-byte granularity boundary for performance.  Therefore to be efficient when packing, the object fields are re-ordered from declaration order to the following order based on size in bytes:
  1. doubles (8) and longs (8)
  2. ints (4) and floats (4)
  3. shorts (2) and chars (2)
  4. booleans (1) and bytes (1)
  5. references (4/8)
With this knowledge we can pad a cache line between any fields with 7 longs.  Within the Disruptor we pad cache lines around the RingBuffer cursor and BatchEventProcessor sequences.

To show the performance impact let’s take a few threads each updating their own independent counters.  These counters will be volatile longs so the world can see their progress.

public final class FalseSharing
implements Runnable
{
public final static int NUM_THREADS = 4; // change
public final static long ITERATIONS = 500L * 1000L * 1000L;
private final int arrayIndex;

private static VolatileLong[] longs = new VolatileLong[NUM_THREADS];
static
{
for (int i = 0; i < longs.length; i++)
{
longs[i] = new VolatileLong();
}
}

public FalseSharing(final int arrayIndex)
{
this.arrayIndex = arrayIndex;
}

public static void main(final String[] args) throws Exception
{
final long start = System.nanoTime();
runTest();
System.out.println("duration = " + (System.nanoTime() - start));
}

private static void runTest() throws InterruptedException
{
Thread[] threads = new Thread[NUM_THREADS];

for (int i = 0; i < threads.length; i++)
{
threads[i] = new Thread(new FalseSharing(i));
}

for (Thread t : threads)
{
t.start();
}

for (Thread t : threads)
{
t.join();
}
}

public void run()
{
long i = ITERATIONS + 1;
while (0 != --i)
{
longs[arrayIndex].value = i;
}
}

public final static class VolatileLong
{
public volatile long value = 0L;
public long p1, p2, p3, p4, p5, p6; // comment out
}
}


Results

Running the above code while ramping the number of threads and adding/removing the cache line padding,  I get the results depicted in Figure 2. below.  This is measuring the duration of test runs on my 4-core Nehalem at home.

Figure 2.

The impact of false sharing can clearly be seen by the increased execution time required to complete the test.  Without the cache line contention we achieve near linear scale up with threads.

This is not a perfect test because we cannot be sure where the VolatileLongs will be laid out in memory.  They are independent objects.  However experience shows that objects allocated at the same time tend to be co-located.

So there you have it.  False sharing can be a silent performance killer.

 

From http://mechanical-sympathy.blogspot.com/2011/07/false-sharing.html

Cache (computing) False sharing Threading

Opinions expressed by DZone contributors are their own.

Related

  • The Best Way to Completely Purge a Varnish Cache
  • Duplicate Objects in Java: Not Just Strings
  • Supporting Offline Mode: Key Strategies and Benefits for iOS Mobile Apps
  • The Ultimate Database Scaling Cheatsheet: Strategies for Optimizing Performance and Scalability

Partner Resources


Comments

ABOUT US

  • About DZone
  • Support and feedback
  • Community research
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • support@dzone.com

Let's be friends: