
Testing the Lock-Free Queue


Having discussed a test app for exercising the multithreaded use of the lock-free stack, it’s now time to do the same for the lock-free queue.

First of all, I decided to play around with the lock-free queue more than I had done with the lock-free stack. The queue works by enqueuing at the tail and dequeuing at the head of the internal linked list. Because an enqueue requires two updates to the linked list (adding the new node, and moving the tail pointer to the new node), and it’s hard to make both happen atomically during the enqueue operation, Michael and Scott, the inventors of this particular technique, decided that updating the tail pointer wasn’t ultra-important straightaway. If it happened, great; if not, they deferred it to the next thread performing an enqueue or a dequeue. In fact, the enqueue operation requires the tail to be pointing at the last node before the new node can be linked in. In other words, the tail can never become adrift by more than one position from the end: if a particular enqueue doesn’t update the tail pointer, the very next one will force it to happen first. And since the queue implementation keeps a dummy node holding no data, the tail can never point ‘outside’ the linked list; it’s not like we can get some weird ABA problem where the tail is pointing at an old node that’s no longer part of the list.

Michael and Scott went even further: they made the dequeue operation update the tail position if it was required. This, frankly, made the dequeue code harder to understand and harder to test. And, in reality, it’s just not required. Again: the tail can’t become adrift by more than one node since the next enqueue will update it before the new node can be added (in particular, the tail’s next pointer must be null before the new node is added).

So I decided to remove the tail update code from the Dequeue() method altogether. That, plus some more tidying up advised by CodeRush’s code issues feature, led to this latest code for the lock-free queue:

using System;
using JmBucknall.Threading;

namespace JmBucknall.Structures {

  public class LockFreeQueue&lt;T&gt; {
    SingleLinkNode&lt;T&gt; head;
    SingleLinkNode&lt;T&gt; tail;

    public LockFreeQueue() {
      head = new SingleLinkNode&lt;T&gt;();
      tail = head;
    }

    public void Enqueue(T item) {
      SingleLinkNode&lt;T&gt; newNode = new SingleLinkNode&lt;T&gt; { Item = item };

      while (true) {
        SingleLinkNode&lt;T&gt; oldTail = tail;
        SingleLinkNode&lt;T&gt; oldTailNext = oldTail.Next;

        if (tail == oldTail) {
          if (oldTailNext != null) {
            // the tail has fallen behind: try to advance it
            SyncMethods.CAS&lt;SingleLinkNode&lt;T&gt;&gt;(ref tail, oldTail, oldTailNext);
          }
          else {
            // try to link the new node after the current last node
            if (SyncMethods.CAS&lt;SingleLinkNode&lt;T&gt;&gt;(ref tail.Next, null, newNode)) {
              // linked successfully: try to swing the tail to the new node
              SyncMethods.CAS&lt;SingleLinkNode&lt;T&gt;&gt;(ref tail, oldTail, newNode);
              return;
            }
          }
        }
      }
    }

    public bool Dequeue(out T item) {
      while (true) {
        SingleLinkNode&lt;T&gt; oldHead = head;
        SingleLinkNode&lt;T&gt; oldHeadNext = oldHead.Next;

        if (oldHead == head) {
          if (oldHeadNext == null) {
            // queue is empty: only the dummy node is present
            item = default(T);
            return false;
          }
          if (SyncMethods.CAS&lt;SingleLinkNode&lt;T&gt;&gt;(ref head, oldHead, oldHeadNext)) {
            // the old first node is the new dummy; return its item
            item = oldHeadNext.Item;
            return true;
          }
        }
      }
    }

    public T Dequeue() {
      T result;
      Dequeue(out result);
      return result;
    }
  }
}
For the Enqueue() method, a new node is allocated and initialized. We then enter an infinite loop (we’ll exit it once the new node has been properly added at the end of the linked list) and take copies of the tail and the tail’s next node. If the tail hasn’t changed and its next node is not null, the tail has fallen behind and we (try to) update its position. If the next node is null, we try to link in the new node; the CAS guarantees this can only succeed while the tail’s next node is still null. If that was successful, we (try to) update the tail and return.

Dequeue() is much simpler than before. No longer do we worry about the tail; let Enqueue() deal with that. We take a copy of the head and its next node. If the head hasn’t changed and the next node is null, we can return the default value and false (to indicate the queue was empty at that point in time). If the head hasn’t changed and the next node is not null, we try to advance the head to its next node. If we’re successful, the old head’s next node is now the dummy head node and we can return its item with impunity and true to indicate a successful dequeue.
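Before getting to the multithreaded test, here’s a quick single-threaded sketch of the semantics just described (assuming the LockFreeQueue&lt;T&gt; listing above is available). Note in particular that the value-returning Dequeue() hands back default(int), that is 0, when the queue is empty; the test program below leans on that, which is why it only enqueues values starting at 1:

```csharp
using System;
using JmBucknall.Structures;

class QuickCheck {
  static void Main() {
    LockFreeQueue<int> queue = new LockFreeQueue<int>();
    queue.Enqueue(1);
    queue.Enqueue(2);

    int value;
    Console.WriteLine(queue.Dequeue(out value)); // True
    Console.WriteLine(value);                    // 1 (FIFO: first in, first out)
    queue.Dequeue(out value);
    Console.WriteLine(value);                    // 2
    Console.WriteLine(queue.Dequeue(out value)); // False: queue is empty
    Console.WriteLine(value);                    // 0, i.e. default(int)
  }
}
```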

Now we can look at the test program. It works in exactly the same way as in the stack case, so I’ll leave the description to that post.

using System;
using System.Threading;
using JmBucknall.Structures;

namespace MultithreadedQueueTester {

  static class Globals {
    public const int TopValue = 10000000;
    public const int EnqueuerCount = 2;
    public const int DequeuerCount = 5;
    public static int EnqueueValue;
    public static bool NoMoreData;
  }

  public class ThreadData {
    public LockFreeQueue&lt;int&gt; queue;
    public ManualResetEvent completion;
    public int[] results;
    public ThreadData(LockFreeQueue&lt;int&gt; queue, ManualResetEvent completion, int[] results) {
      this.queue = queue;
      this.completion = completion;
      this.results = results;
    }
  }

  static class EnqueueEngine {
    static public void Execute(Object stateInfo) {
      ThreadData data = stateInfo as ThreadData;
      int enqueueCount = 0;

      int nextValue = Interlocked.Increment(ref Globals.EnqueueValue);
      while (nextValue &lt;= Globals.TopValue) {
        data.queue.Enqueue(nextValue);
        enqueueCount++;
        nextValue = Interlocked.Increment(ref Globals.EnqueueValue);
      }

      Console.WriteLine(String.Format("Enqueue count: {0}", enqueueCount));
      data.completion.Set();
    }
  }

  static class DequeueEngine {
    static public void Execute(Object stateInfo) {
      ThreadData data = stateInfo as ThreadData;

      int dequeueCount = 0;
      while (true) {
        int value = data.queue.Dequeue();
        if (value == 0) {
          // queue was empty: stop if the enqueuers are done, else back off
          if (Globals.NoMoreData)
            break;
          Thread.Sleep(1); //minor wait when queue is empty
        }
        else {
          dequeueCount++;
          int oldValue = Interlocked.CompareExchange(ref data.results[value], 1, 0);
          if (oldValue != 0) {
            Console.WriteLine(String.Format("Error: already dequeued {0}", value));
          }
        }
      }

      Console.WriteLine(String.Format("Dequeue count: {0}", dequeueCount));
      data.completion.Set();
    }
  }

  class Program {
    private static ManualResetEvent[] QueueWorkItems(int count, LockFreeQueue&lt;int&gt; queue, int[] results, WaitCallback callback) {
      ManualResetEvent[] events = new ManualResetEvent[count];
      for (int i = 0; i &lt; count; i++) {
        events[i] = new ManualResetEvent(false);
        ThreadPool.QueueUserWorkItem(callback, new ThreadData(queue, events[i], results));
      }
      return events;
    }

    static void Main(string[] args) {
      DateTime start = DateTime.Now;

      Console.WriteLine("create the shared queue");
      LockFreeQueue&lt;int&gt; queue = new LockFreeQueue&lt;int&gt;();

      Console.WriteLine("create the shared results array");
      int[] results = new int[Globals.TopValue + 1];

      Console.WriteLine("create the completion events");
      var enqueuersDone = QueueWorkItems(Globals.EnqueuerCount, queue, results, EnqueueEngine.Execute);
      var dequeuersDone = QueueWorkItems(Globals.DequeuerCount, queue, results, DequeueEngine.Execute);

      Console.WriteLine("wait for the enqueuers to be done");
      WaitHandle.WaitAll(enqueuersDone);

      Console.WriteLine("signal the dequeuers to stop, and wait");
      Globals.NoMoreData = true;
      WaitHandle.WaitAll(dequeuersDone);

      Console.WriteLine("check that all values were dequeued");
      for (int i = 1; i &lt;= Globals.TopValue; i++) {
        if (results[i] != 1)
          Console.WriteLine(String.Format("Error: {0} was never dequeued.", i));
      }

      DateTime end = DateTime.Now;
      Console.WriteLine(end - start);
    }
  }
}

Running this test app shows the lock-free queue works well and correctly in a multithreaded environment on a four-core machine.


  • Maged M. Michael &amp; Michael L. Scott, “Simple, Fast, and Practical Non-Blocking and Blocking Concurrent Queue Algorithms.” (PDF)
  • Joe Duffy, Concurrent Programming on Windows. (Amazon)

Published at DZone with permission of Julian Bucknall, DZone MVB. See the original article here.

