Over a million developers have joined DZone.

Testing the Lock-Free Stack

Recently a reader was having some issues with my lock-free queue implementation. In investigating the problem, I decided to throw out my previous “stack and queue” multithreaded test program as being way too hard to modify and started out afresh. In doing so, I also cleaned up the implementations of the lock-free stack and queue, making extensive use of CodeRush’s code issues functionality.

For details on the lock-free stack, read here. For details on the lock-free queue, see here.

In this post. I’ll talk about testing the lock-free stack in a multithreaded application.

First of all, here’s the latest version of the lock-free stack code. There’s not that much change from the original:

using System;
using JmBucknall.Threading;

namespace JmBucknall.Structures {

public class LockFreeStack<T> {
SingleLinkNode<T> head;

public LockFreeStack() {
head = new SingleLinkNode<T>();

public void Push(T item) {
SingleLinkNode<T> newNode = new SingleLinkNode<T> { Item = item };
do {
newNode.Next = head.Next;
} while (!SyncMethods.CAS<SingleLinkNode<T>>(ref head.Next, newNode.Next, newNode));

public bool Pop(out T item) {
SingleLinkNode<T> node;
do {
node = head.Next;
if (node == null) {
item = default(T);
return false;
} while (!SyncMethods.CAS<SingleLinkNode<T>>(ref head.Next, node, node.Next));
item = node.Item;
return true;

public T Pop() {
T result;
Pop(out result);
return result;

First of all, I tested the stack to make sure it did the standard push and pop properly in a single-threaded environment. All that’s needed is a few unit tests and, to be frank, these tests are boring and tedious and I won’t show them here. Of more interest is the use of the stack in a multithreaded application.

Back in the old days, I tested on a dual core machine, but now I have a quad-core desktop, so I wanted to test the code using all four cores. In essence I wanted to have one or more “pusher” threads and one or more “popper” threads. With four cores, to avoid as much CPU contention as possible, I could have one pusher and three poppers, two pushers and poppers each, or three pushers and one popper. (Of course I could increase those numbers as far as I wanted to go, but I would run into thread scheduling on each CPU and I rather wanted to stress the concurrency code as much as possible. Having each thread on its own core would certainly do that.)

The flow of the test app would work like this. I created the shared lock-free stack; for ease of testing, making it a stack of ints. To make sure that the pop code didn’t (a) miss an item or (b) pop the same item twice due to any unforeseen multithreaded issues, I decided to check off each item as it was popped. For that I needed an array of results (essentially booleans) to show which values had been popped or not. Since I wanted to use the thread pool in .NET, I needed a way to know in my main thread when each subsidiary thread, be it a pusher or a popper, had finished. The simplest way to do that is to use a manual reset event (ManualResetEvent) for each thread and then signal it (that is, call Set())  at the end of the thread’s method.

For the pushers, I decided to push all the numbers from 1 to some large value (here, 10 million). To make sure that I didn’t push a value twice, I used the Interlocked.Increment() method ensuring that each pusher got its own values.

  static class Globals {
public const int TopValue = 10000000;
public const int PusherCount = 1;
public const int PopperCount = 3;
public static int PushValue;
public static bool NoMoreData;

public class ThreadData {
public LockFreeStack<int> stack;
public ManualResetEvent completion;
public int[] results;

public ThreadData(LockFreeStack<int> stack, ManualResetEvent completion, int[] results) {
this.stack = stack;
this.completion = completion;
this.results = results;

static class PusherEngine {
static public void Execute(Object stateInfo) {
ThreadData data = stateInfo as ThreadData;
int pushCount = 0;

int nextValue = Interlocked.Increment(ref Globals.PushValue);
while (nextValue <= Globals.TopValue) {
nextValue = Interlocked.Increment(ref Globals.PushValue);

Console.WriteLine(String.Format("Push count: {0}", pushCount));

The Globals class is just a holder for some constants and global values. The ThreadData class is merely a simple record-type class to hold the data for the threads. In essence, the only thread-specific member of that class is the completion reset event, so the other members could have been put in the Globals class. Hey, ho.

For the poppers, I decided to let them run, just popping values off the stack until they got a special “close-down” value. After that they just terminated.

  static class PopperEngine {
static public void Execute(Object stateInfo) {
ThreadData data = stateInfo as ThreadData;

int popCount = 0;
while (true) {
int value = data.stack.Pop();
if (value == 0) {
if (Globals.NoMoreData)
Thread.Sleep(1); //minor wait when stack is empty
else {
int oldValue = Interlocked.CompareExchange(ref data.results[value], 1, 0);
if (oldValue != 0) {
Console.WriteLine(String.Format("Error: already popped {0}", value));

Console.WriteLine(String.Format("Pop count: {0}", popCount));

Notice here the checking to see whether a value has already been popped or not. A value of 0 popped off the stack means that the stack is temporarily empty, and so the popper just waits a smidgeon before going round and trying again. If the stack was empty and the Globvals.NoMoreData item was set, it means that all the pushers have finished adding items to the stack and so it’s time for the popper to terminate.


Now the actual console application:

  class Program {
private static ManualResetEvent[] QueueWorkItems(int count, LockFreeStack<int> stack, int[] results, WaitCallback callback) {
ManualResetEvent[] events = new ManualResetEvent[count];
for (int i = 0; i < count; i++) {
events[i] = new ManualResetEvent(false);
ThreadPool.QueueUserWorkItem(callback, new ThreadData(stack, events[i], results));
return events;

static void Main() {
DateTime start = DateTime.Now;

Console.WriteLine("create the shared stack");
LockFreeStack<int> stack = new LockFreeStack<int>();

Console.WriteLine("create the shared results array");
int[] results = new int[Globals.TopValue + 1];

Console.WriteLine("create the completion events");
var pushersDone = QueueWorkItems(Globals.PusherCount, stack, results, PusherEngine.Execute);
var poppersDone = QueueWorkItems(Globals.PopperCount, stack, results, PopperEngine.Execute);

Console.WriteLine("wait for the pushers to be done");

Console.WriteLine("signal the poppers to stop, and wait");
Globals.NoMoreData = true;

Console.WriteLine("check that all values were popped");
for (int i = 1; i <= Globals.TopValue; i++) {
if (results[i] != 1)
Console.WriteLine(String.Format("Error: {0} was never popped.", i));

DateTime end = DateTime.Now;
Console.WriteLine(end - start);


The QueueWorkItems method creates the manual reset event for each thread and queues the thread up in the thread pool. It returns an array of the manual reset events created so that the main thread can wait on them all to be signaled with WaitAll(). Firstly the main thread waits for all the pushers to finish, sets the “no more data” boolean, and then waits for all the popper threads to complete. Finally it makes sure that all the integer values were popped.

Using this application over several runs with differing numbers of pushers and poppers, I checked that the stack functioned properly. And it did. All four cores were pegged at the max for the few seconds of the test run.

Next time, the queue.



Published at DZone with permission of Julian Bucknall, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

The best of DZone straight to your inbox.

Please provide a valid email address.

Thanks for subscribing!

Awesome! Check your inbox to verify your email so you can start receiving the latest in tech news and resources.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}