This article is taken from the book Azure in Action. Message queues are the third part of the Azure storage system (blobs and tables are the other two). The concept of queues has been around a long time, and you’ve likely worked with some technology related to queues already. A common architectural goal during design is to produce a system that’s tightly integrated but also loosely coupled. The easiest way to provide a loosely coupled system is to provide a way for the components to talk to each other through messages. We want these messages to follow a “Tell, don’t ask” approach. We shouldn’t ask an object for a bunch of data, do some work, and then give the results back to the object for recording. We should tell the object what we want it to do. We should do that at the component and system levels as well. This approach helps us to create code that’s well abstracted and compartmentalized. This is the second part of a two-part article that discusses basic queue and message concepts.
As you learned in part 1 of this two-part article, queues in the cloud are much easier to work with and require no grooming or maintenance. The real power of queues, however, is the messages that flow through them. The queue also becomes a pivot point for scaling. In this article, we’ll look at the basic message operations and then discuss how the queue manages the messages it holds.
Working with basic message operations
If you read part 1 of this article, you now know how to work with queues; let’s look at how you can work with messages. Table 1 lists the methods you’ll use when working with messages.
Table 1 Basic message methods
|Method|Purpose|
|---|---|
|AddMessage()|Puts a message on the queue|
|PeekMessage()|Looks at messages without “reading” them|
|GetMessage()|Pulls a message off the queue|
|DeleteMessage()|Deletes a message and takes it off the queue|
You can see how these methods are used by looking at the Simple Queue Browser shown in figure 1.
Download the code
You can download the code at my blog, http://brianhprince.com/blog/downloads. You’ll need to have VS2010 Beta 2 (http://www.microsoft.com/visualstudio) and the Azure SDK (http://www.microsoft.com/windowsazure/tools/) installed.
Figure 1 We’ll use the Simple Queue Browser to work with queue and message methods. Please note that the authors, while charming, are not graphic designers or user interface specialists. This tool will act as a vehicle for understanding the basics of working with queues.
As we mentioned in part 1 of this article, queues are a FIFO structure, similar to a line at the movie theater. The first message in becomes the first message out. Each message waits its turn in the queue until it’s consumed. Let’s start our discussion with how to put a message on the queue.
Putting a message on the queue
When we put a message on the queue, the new message is placed at the bottom or end of the queue. When we get a message from the queue, it’s from the top or front of the queue. The code to create a new message and then add it to a queue would look like this:
CloudQueue q = Qsvc.GetQueueReference("newordersqueue");
CloudQueueMessage theNewMessage = new CloudQueueMessage("cart:31415"); // #1
q.AddMessage(theNewMessage); // #2
#1 Creates CloudQueueMessage object and sets string
#2 Adds message to end of queue
To add a message to the queue we need a reference to the queue, as we did in earlier code examples. Once you have the queue reference, you can call the AddMessage method #2 and pass in a CloudQueueMessage object. Creating the message is a simple affair; in this case #1, we’re passing in text that’ll be the content of the message. You can also pass in a byte array if you’re passing some serialized binary data. Remember that the content of each message is limited to 8 KB in size.
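Because of that size limit, a producer may want to check a payload before calling AddMessage and, when the payload is too large, store the data elsewhere and send only a reference to it, much as the cart:31415 message does. The following is a minimal sketch of such a check. It is not part of the storage client library: MessageSizeGuard, MaxMessageBytes, and FitsInMessage are hypothetical names, and measuring the UTF-8 byte count is an assumption, because any encoding applied to the content before it is stored may lower the usable size.

```csharp
using System;
using System.Text;

// Hypothetical helper; not part of the Azure storage client library.
public static class MessageSizeGuard
{
    // The 8 KB limit stated in the text, expressed in bytes.
    public const int MaxMessageBytes = 8 * 1024;

    // Assumption: the limit is compared against the UTF-8 size of the text.
    public static bool FitsInMessage(string content)
    {
        if (content == null) throw new ArgumentNullException("content");
        return Encoding.UTF8.GetByteCount(content) <= MaxMessageBytes;
    }
}

public static class MessageSizeGuardDemo
{
    public static void Main()
    {
        Console.WriteLine(MessageSizeGuard.FitsInMessage("cart:31415"));          // True
        Console.WriteLine(MessageSizeGuard.FitsInMessage(new string('x', 9000))); // False
    }
}
```

A producer would call FitsInMessage before AddMessage and fall back to the reference-only approach when it returns false.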
The code to put a message onto a queue with REST looks like this:
POST /my-special-queue/messages?timeout=30 HTTP/1.1
x-ms-date: Fri, 07 Aug 2009 01:49:25 GMT
In this example, we’re adding a message with the order number that can be found in the related Azure table. The consumer will pick up the message, unwrap the content, and process the cart numbered 31415. I bet this shopping cart is filled with pie and pie-related accessories.
Before we show you how to get a message, we want to talk about peeking.
Peeking at a message
Peeking is a way to get the content of a message from the queue, without taking the message off the queue. This leaves the message on the queue so someone else can grab it. Some consumers peek at a message to determine whether or how they want to process the message. This is common if you have different types of consumers polling the same queue. Each consumer might be looking for a different type of message. Perhaps you have two groups of consumers; one consumer group processes orders for normal customers, and another consumer group, with higher priority and more hardware, processes orders for unobtanium priority–level customers.
The methods for peeking are relatively simple:
CloudQueueMessage m = q.PeekMessage(); // #1
IEnumerable<CloudQueueMessage> mList = q.PeekMessages(10); // #2
#1 Returns a message but leaves it in queue
#2 Peeks at more than one message
At #1, you can see how easy it is to peek at a message. Calling the PeekMessage method returns a single message, the one at the front of the queue. You can peek at more than one message by calling PeekMessages(). You’ll need to provide the number of messages you want returned. In this example, we asked for 10 messages #2.
Now that we’ve peeked at the messages, we’re ready to get them.
Getting a message
You don’t have to peek at a message before getting it, and many times you won’t use peek at all. If you already have a reference to your queue, then it’s as simple as calling
private CloudQueueMessage currentMsg;
currentMsg = q.GetMessage();
One overload of GetMessage lets you set the visibility timeout of the get. We’ll discuss the lifecycle of a message later in this article.
string s = currentMsg.AsString;
Once you have a message, you can use the AsString or AsBytes property to access the contents of the message. This is the meat of the message and the part you’re most likely interested in. Once you’ve processed a message, you’ll want to delete it. This takes it off the queue.
Deleting a message
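Deleting a message is as easy as getting it. To delete a message, you pass the message back into the DeleteMessage method of the queue object, as in q.DeleteMessage(currentMsg). You can also do it with REST, by sending a DELETE request for the message that identifies it by its ID and by the pop receipt it was handed out with (pop receipts are described later in this article).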
You can delete only one message at a time.
Regardless of how you delete the message, through REST or the API, be prepared to handle any exceptions that might be thrown.
One aspect of messages that we didn’t dive into yet is their lifecycle. What really happens when a message is retrieved? How does the queue keep the same message from being picked up by several consumers at the same time? Is a message self-aware? These are important questions for a queue service. Never losing a message (known as durability) is critical to a queue.
Understanding message visibility
A key aspect of a queue is how it manages its messages and their visibility. This is how the queue implements the message durability developers are looking for. The goal is to protect against a consumer “getting” a message and then failing to process and then delete that message. If that happened, the message would be lost, and that would be bad news for any processing system.
Visibility timeouts and idempotency are the two best tools for making sure your messages are never lost. Understanding how these concepts relate to the queue and understanding the lifecycle of a message are important to the success of your code.
About message visibility and invisibility
A property that’s part of every message is the visibility timeout. When a message is pulled from the queue, it isn’t really deleted; rather it’s temporarily marked as invisible. The consumer is given a receipt (called the pop receipt) that’s unique to that GetMessage() operation. The duration of invisibility can be controlled by the consumer and can be as long as two hours. If not explicitly set, the duration will default to 30 seconds.
While a message is invisible, it won’t be given out in response to new GetMessage() operations. For example, let’s say a producer has placed four messages in the queue, as shown in figure 2, and two consumers will be reading messages out of the queue.
Figure 2 Two consumers are getting messages from a queue. A message is marked as invisible when a GetMessage() operation is performed. Unlike a cloak of invisibility, however, this effect does time out after a period of time.
Consumer 1 gets a message (Msg 1), and that message is marked as invisible #1. Seconds later, Consumer 2 performs a get operation as well. Since the first message (Msg 1) is invisible, the queue responds with message 2 (Msg 2) #2.
Not long thereafter, Consumer 1 finishes processing Msg 1 and performs a delete operation on the message. As part of the delete operation, the queue checks the pop receipt Consumer 1 provides when it passes in the message. This is to make sure Consumer 1 is the most recent “reader” of the message in question. The receipt matches, in this case, and the message is deleted.
Consumer 1 then does an immediate read and gets Msg 3. Consumer 1 fails to complete processing within the invisibility window and fails to delete Msg 3 in time. It becomes visible again.
Just at that time, Consumer 2 deletes Msg 2 and does a get. The queue responds with Msg 3, because it’s visible again. While Consumer 2 is processing Msg 3, Consumer 1 does finally finish processing Msg 3 and tries to delete. But this time, the pop receipt, which Consumer 1 has, doesn’t match the most recently provided pop receipt, which was given to Consumer 2 when Msg 3 was handed out for a second time.
Because the pop receipt doesn’t match, an error is thrown, and the message isn’t deleted. You’ll likely see a 400 - Bad Request error when this happens. The inner exception details will explain that there are no messages with a matching pop receipt available in the queue to be deleted.
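The scenario above can be replayed with a small in-memory model. This is only a sketch for reasoning about visibility and pop receipts: ModelQueue, ModelMessage, and their members are hypothetical names, not the Azure storage client, and the model takes the current time as a parameter so that the run is deterministic. It also returns false on a mismatched receipt, where the real service reports an error.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory model; every name here is invented for illustration.
public sealed class ModelMessage
{
    public string Id;
    public string Body;
    public string PopReceipt;        // receipt issued by the most recent get
    public DateTime NextVisibleTime; // the message is hidden until this time
}

public sealed class ModelQueue
{
    private readonly List<ModelMessage> messages = new List<ModelMessage>();

    public void Add(string body)
    {
        messages.Add(new ModelMessage
        {
            Id = Guid.NewGuid().ToString(),
            Body = body,
            NextVisibleTime = DateTime.MinValue
        });
    }

    // Returns the first visible message and hides it for the timeout, or null.
    public ModelMessage Get(DateTime now, TimeSpan visibilityTimeout)
    {
        foreach (ModelMessage m in messages)
        {
            if (m.NextVisibleTime <= now)
            {
                m.PopReceipt = Guid.NewGuid().ToString();
                m.NextVisibleTime = now + visibilityTimeout;
                // Return a copy so the caller keeps the receipt it was given.
                return new ModelMessage
                {
                    Id = m.Id,
                    Body = m.Body,
                    PopReceipt = m.PopReceipt,
                    NextVisibleTime = m.NextVisibleTime
                };
            }
        }
        return null;
    }

    // Deletes only when the caller holds the most recent pop receipt.
    public bool TryDelete(string id, string popReceipt)
    {
        int index = messages.FindIndex(m => m.Id == id && m.PopReceipt == popReceipt);
        if (index < 0) return false;
        messages.RemoveAt(index);
        return true;
    }
}

public static class VisibilityDemo
{
    public static void Main()
    {
        ModelQueue q = new ModelQueue();
        q.Add("Msg 1"); q.Add("Msg 2"); q.Add("Msg 3"); q.Add("Msg 4");

        DateTime t0 = new DateTime(2010, 1, 1);
        TimeSpan timeout = TimeSpan.FromSeconds(30);

        ModelMessage first = q.Get(t0, timeout);                 // Consumer 1 gets Msg 1
        ModelMessage second = q.Get(t0.AddSeconds(5), timeout);  // Consumer 2 gets Msg 2
        Console.WriteLine(q.TryDelete(first.Id, first.PopReceipt));   // True

        ModelMessage slow = q.Get(t0.AddSeconds(10), timeout);   // Consumer 1 gets Msg 3
        q.TryDelete(second.Id, second.PopReceipt);               // Consumer 2 deletes Msg 2

        // Msg 3 timed out at t0 + 40 seconds, so it is handed to Consumer 2.
        ModelMessage again = q.Get(t0.AddSeconds(45), timeout);
        Console.WriteLine(again.Body);                                // Msg 3

        // Consumer 1's receipt is now stale, so its delete is rejected.
        Console.WriteLine(q.TryDelete(slow.Id, slow.PopReceipt));     // False
        Console.WriteLine(q.TryDelete(again.Id, again.PopReceipt));   // True
    }
}
```

Handing out a fresh receipt on every get is what lets the queue tell the most recent reader apart from an earlier one that ran out of time.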
Setting visibility timeout
You can set the length of the visibility timeout when you get the message. This lets you determine the proper length of the timeout for each individual message. When you specify visibility timeout, you want to balance the expected processing time with how long it’ll take the system to recover from an error in processing. If the timeout is too long, it’ll take a long time for the system to recover a lost message, slowing down the system’s throughput. If the timeout is too short, too many of your messages will be repeatedly reprocessed.
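One way to make that balance explicit is to pad the expected processing time by a safety factor and cap the result at the two-hour maximum mentioned earlier. The helper below is a sketch under those assumptions: VisibilityTimeoutPlanner, Choose, and the safety-factor heuristic are hypothetical and are not features of the queue service or its client library.

```csharp
using System;

// Hypothetical helper for choosing a visibility timeout; the names and the
// safety-factor heuristic are illustrative assumptions, not SDK features.
public static class VisibilityTimeoutPlanner
{
    // Maximum invisibility period stated in the text.
    public static readonly TimeSpan Maximum = TimeSpan.FromHours(2);

    // Pads the expected processing time, then caps it at the maximum.
    public static TimeSpan Choose(TimeSpan expectedProcessingTime, double safetyFactor)
    {
        if (safetyFactor < 1.0) throw new ArgumentOutOfRangeException("safetyFactor");
        long ticks = (long)(expectedProcessingTime.Ticks * safetyFactor);
        TimeSpan padded = TimeSpan.FromTicks(ticks);
        return padded > Maximum ? Maximum : padded;
    }
}

public static class VisibilityTimeoutPlannerDemo
{
    public static void Main()
    {
        // A 20-second job with 50 percent headroom.
        Console.WriteLine(VisibilityTimeoutPlanner.Choose(TimeSpan.FromSeconds(20), 1.5)); // 00:00:30
        // A 3-hour job cannot be covered; the result is capped.
        Console.WriteLine(VisibilityTimeoutPlanner.Choose(TimeSpan.FromHours(3), 2.0));    // 02:00:00
    }
}
```

The chosen value would then be passed to the GetMessage overload that accepts a visibility timeout. A larger safety factor means fewer duplicate deliveries but slower recovery when a consumer really has failed.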
This leads us to an important aspect of queues in general, but specifically the Azure queue system.
Planning on failure
The service guarantee is worded as “promising that every message will be processed, at least once.” You can see this “at least once” business in the previous scenario. Because Consumer 1 failed to delete the message in time, the queue handed it out to another consumer. The queue has to assume the original consumer has failed in some way. This is useful because it provides a way for your system to take a hit (a server going down) and keep on going. Cloud architecture has this concept of being able to plan on failure and make that central to the structure of the system. Queues provide that capability quite nicely.
The downside is that a consumer might not crash at all but simply take longer to process the message than intended. In this case, you need to make sure that your processing code is either idempotent (able to process the same message more than once and leave the system in the same state) or checks before processing each message to make sure it isn’t a duplicate. Because the message being reprocessed is actually the same message, its ID property is the same. This makes it easy to check a history table or perhaps the status of a related order before processing starts.
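The duplicate check can be sketched as a small guard keyed on the message ID. DuplicateGuard and ProcessOnce are hypothetical names, and the in-memory HashSet is an assumption made to keep the example runnable; in a real system the history would have to live in durable storage, such as the history table mentioned above, so that it survives a restart and is shared by all consumers.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical duplicate guard: remembers the IDs of messages already processed.
public sealed class DuplicateGuard
{
    // Assumption: an in-memory set stands in for a durable history table.
    private readonly HashSet<string> processedIds = new HashSet<string>();

    // Runs the handler only the first time a given message ID is seen.
    // Returns true if the handler ran, false if the message was a duplicate.
    public bool ProcessOnce(string messageId, Action handler)
    {
        if (processedIds.Contains(messageId)) return false;
        handler();
        // Record the ID only after the work succeeded, so that a failed
        // attempt is retried when the message becomes visible again.
        processedIds.Add(messageId);
        return true;
    }
}

public static class DuplicateGuardDemo
{
    public static void Main()
    {
        DuplicateGuard guard = new DuplicateGuard();
        int ordersShipped = 0;

        Console.WriteLine(guard.ProcessOnce("msg-42", () => ordersShipped++)); // True
        Console.WriteLine(guard.ProcessOnce("msg-42", () => ordersShipped++)); // False
        Console.WriteLine(ordersShipped);                                      // 1
    }
}
```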
This little complexity might make you think about deleting a message as soon as you receive it. This is very dangerous and unwise, because there will be failure along the way, and when that happens, that message will be lost forever.
Using idempotent processing code
The goal of our messaging systems is to make sure we never lose a message. No matter how small or large, you never want to lose an order, or a set of instructions, or anything else you might be processing.
To avoid complexity, it’s best to make sure your processing code is idempotent. Idempotent means that the process can be executed several times, and the system will end up in the same state each time. Perhaps we’re working with a piece of software that tracks dog food delivery. When the food is actually delivered to the physical address, the handheld computer sends a message to our queue in the cloud. The software uploads the physical signature of the recipient to blob storage and submits an order-delivered message to the queue. The message contains the time of delivery and the actual order number, which happens to also be the filename of the signature in blob storage.
When this message is processed, the consumer copies the signature image to permanent storage, with the proper filename, and marks the order as delivered in the package-tracking system database.
If this message were to be processed several times, there’d be no detriment to the system. The signature file would be overwritten with the same file. The order status is already set to “delivered,” so we’d just be setting its status to “delivered” again. Using the same delivery time doesn’t change the overall state of the system. This is the best way to handle the processing of queue messages, but it’s not always possible.
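The delivery example can be sketched as a handler in which every step is an overwrite rather than an increment or an append. DeliveryTracker and its members are hypothetical, and the dictionaries are assumed stand-ins for blob storage and the package-tracking database, used only so the sketch runs on its own.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical handler for the order-delivered message.
public sealed class DeliveryTracker
{
    // Assumed stand-ins for permanent blob storage and the tracking database.
    public readonly Dictionary<string, byte[]> PermanentSignatures = new Dictionary<string, byte[]>();
    public readonly Dictionary<string, string> OrderStatus = new Dictionary<string, string>();
    public readonly Dictionary<string, DateTime> DeliveredAt = new Dictionary<string, DateTime>();

    // Each assignment replaces any earlier value for the same order,
    // so handling the same message again leaves the state unchanged.
    public void HandleDelivered(string orderNumber, DateTime deliveryTime, byte[] signature)
    {
        PermanentSignatures[orderNumber] = signature;
        OrderStatus[orderNumber] = "delivered";
        DeliveredAt[orderNumber] = deliveryTime;
    }
}

public static class DeliveryTrackerDemo
{
    public static void Main()
    {
        DeliveryTracker tracker = new DeliveryTracker();
        DateTime deliveredAt = new DateTime(2010, 3, 1, 14, 30, 0);
        byte[] signature = new byte[] { 1, 2, 3 };

        // The same message is processed twice.
        tracker.HandleDelivered("order-1001", deliveredAt, signature);
        tracker.HandleDelivered("order-1001", deliveredAt, signature);

        Console.WriteLine(tracker.OrderStatus.Count);         // 1
        Console.WriteLine(tracker.OrderStatus["order-1001"]); // delivered
    }
}
```

A handler that instead incremented a delivery counter or appended a row for each message would not be idempotent, and would need a duplicate check before processing.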