
Tarantool Queues (Part 1): Make Things Easy on Yourself




Why Queues?

Computer science usually answers this question with a transactional approach: it is believed that transactions can improve everything. Bank transfers are often cited as proof. They are complex and based on distributed transactions: first, third-party changes to an account in one bank are blocked, then the same happens to an account in another bank, then the debit and the credit take place, and finally access to the accounts involved is restored. (In practice, of course, it is more complicated: third-party intermediary services connect the banks, and a whole chain of transactions takes place.)

All of this is quite reliable but, alas, not suitable for every case. If there are many entities, the transactions are complicated, the system is highly distributed, and its subsystems process data at different rates, then huge delays, subsystem failures, and other issues are inevitable. Some operations (the idempotent ones) can be safely repeated after a failure, but in other cases one cannot “turn a hamburger back into a cow.” Such cases require other tools, and one of them is a queue, which records and tracks the status of every job and effectively turns non-idempotent participants into idempotent ones.
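To illustrate how tracking job status can make a non-idempotent operation safe to retry, here is a minimal Python sketch. The names `JobTracker`, `run_once`, and `charge` are hypothetical and are not part of Tarantool; a real queue would also store the status durably rather than in a dictionary.

```python
class JobTracker:
    """Hypothetical in-memory registry of completed jobs (illustration only)."""

    def __init__(self):
        self._results = {}  # job_id -> result of the completed job

    def run_once(self, job_id, action):
        # A job that already completed is not executed again;
        # the stored result is returned instead.
        if job_id in self._results:
            return self._results[job_id]
        result = action()
        self._results[job_id] = result
        return result


balance = {"alice": 100}


def charge():
    # Non-idempotent on its own: every call subtracts another 30.
    balance["alice"] -= 30
    return balance["alice"]


tracker = JobTracker()
first = tracker.run_once("payment-1", charge)
# A retry with the same job id (for example, after a lost response) is skipped.
second = tracker.run_once("payment-1", charge)
```

Because the status is keyed by job id, the retry returns the stored result and the account is charged only once.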

A queue is a FIFO (First In, First Out) list: we write elements to it, then read, execute, and delete them. Queues are used for deferred processing of user data, statistics transmission, load smoothing for relatively slow subsystems, and a variety of periodic tasks. They decouple the client and the server, so that performing tasks on one does not depend on the availability of the other. The client puts a task into the queue (the server may be unavailable at that moment), and the server takes and performs it when it becomes available. Multiple servers can handle tasks from a single queue, and multiple clients can reach the same server through the queue (again, even if the server is unavailable at the moment). The client can always ask the queue for a task's status, and there can be different queues for different tasks. All in all, it is a beautiful picture. There are drawbacks, though: the queue is an extra entity (or several) that acts as a buffer, which adds hops for the data and complicates the system.
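The decoupling described above can be sketched with a plain in-process FIFO. This is only an illustration: `client_put` and `server_drain` are hypothetical names, and a real queue server sits between separate processes or machines.

```python
from collections import deque

tasks = deque()  # the FIFO buffer between client and server


def client_put(payload):
    # The client only needs the queue to be reachable, not the server.
    tasks.append(payload)


def server_drain(handler):
    # When the server becomes available, it processes tasks oldest-first.
    results = []
    while tasks:
        results.append(handler(tasks.popleft()))
    return results


# The client enqueues work while the server is presumed to be down.
client_put("resize image 1")
client_put("send email 2")
client_put("update stats 3")

# Later, the server comes up and handles everything in arrival order.
processed = server_drain(str.upper)
```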

The Strategy of Choice

So, let's switch to web services: what does this look like in practice? The simplest model is a handful of web and database servers; as the load increases, sooner or later a denial of service occurs. We then need to change the system’s architecture by identifying the critical elements of its logic and deferring non-critical calculations. If the load is high enough, we inevitably end up with a complex distributed system with a pile of components responsible for different tasks. Alternatively, we could choose a microservice architecture at the initial design stage.

We also have to implement the interaction between the system's components. This is not easy when there are many of them: a lot of communication issues can arise between different subsystems.

If we want to solve the problem with a queue, we could create our own or use a ready-made solution like RabbitMQ. Considered one of the best open-source queue servers, RabbitMQ provides good performance, has extensive functionality (expandable via plug-ins), and can persist messages to disk. However, a high-load system needs more: with message logging to disk enabled, RabbitMQ can fail, yet you still have to write the data if you do not want to lose anything. Also, its performance decreases as message size grows, there are no timeouts on message processing, and so on.

Creating your own queue server from scratch is really time-consuming and is practical only for very specific and very serious problems. Let’s take the middle ground, i.e. use a DBMS. This raises the problem of choice. Relational DBs are not the best option here. They will bring us well-known performance problems including high latency.

A good choice is a fast (preferably in-memory) database that can write data to disk (reliability is crucial), with neither loss of performance nor cold start problems, and that will allow us to implement a queue server without a lot of effort.

Making Queues With Tarantool: First Steps

For our practical examples, we will choose Tarantool. It’s fast and reliable enough and has replication out of the box, sharding (an optional package), a Lua application server that allows logic of any complexity to be implemented, and, to top it all off, a queue package with which you can easily build an advanced queue server. There is a drawback, though, typical of any in-memory DBMS: it needs sufficient RAM, and the data must not outgrow it (thus, we need to add a queue length monitoring tool).
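As a rough idea of what a queue length monitor might check, here is a small Python sketch. The function name `queue_health` and both thresholds are hypothetical; how the current length is obtained from the queue is left out, and the limits would have to be derived from the RAM actually available.

```python
def queue_health(length, warn_at=10_000, max_len=50_000):
    """Classify a queue length reading against assumed thresholds."""
    if length >= max_len:
        return "critical"  # the queue is close to exhausting memory
    if length >= warn_at:
        return "warning"   # consumers are falling behind
    return "ok"


# Sample readings, as a monitoring job might collect them over time.
readings = [120, 15_000, 60_000]
statuses = [queue_health(n) for n in readings]
```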

Tarantool Queue can withstand heavy loads. In one Tarantool Queue instance, you can create multiple logical queues of different types (so-called “tubes”) via a queue.create_tube call. There is also a "take/ack" mechanism: "take" marks the task as “in process,” while "ack" deletes the task from the queue, thus confirming its successful completion. If the worker never reaches the "ack" command, the task goes back to the queue, and another process will “pick it up” with its own "take."
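The take/ack cycle can be modeled with a toy in-memory class. This is a sketch of the idea only, not the Tarantool Queue implementation: `MiniTube` and its methods are hypothetical, and the explicit `release` call stands in for whatever the queue does when a worker disappears without acknowledging a task.

```python
from collections import deque


class MiniTube:
    """Toy model of take/ack semantics (illustration only)."""

    def __init__(self):
        self._ready = deque()  # tasks waiting for a worker
        self._taken = {}       # tasks marked as "in process"
        self._next_id = 0

    def put(self, data):
        task_id = self._next_id
        self._next_id += 1
        self._ready.append((task_id, data))
        return task_id

    def take(self):
        # Move the oldest ready task to the "in process" state.
        if not self._ready:
            return None
        task_id, data = self._ready.popleft()
        self._taken[task_id] = data
        return task_id, data

    def ack(self, task_id):
        # Successful completion: the task is deleted for good.
        del self._taken[task_id]

    def release(self, task_id):
        # No ack arrived: put the task back at the head of the queue.
        data = self._taken.pop(task_id)
        self._ready.appendleft((task_id, data))


tube = MiniTube()
tube.put("job A")
tube.put("job B")

first_id, first_data = tube.take()  # worker 1 takes "job A", then crashes
tube.release(first_id)              # the task returns to the queue

retry_id, retry_data = tube.take()  # worker 2 picks up the same task
tube.ack(retry_id)                  # and confirms its completion
```

Returning the released task to the head of the queue keeps the FIFO order intact for the remaining tasks.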

To start making queues with Tarantool, let’s take the already-mentioned Queue package and put it into the directory /usr/local/lua. Then, we’ll write our instance in Lua (let’s name it “test”) as follows (this assumes that you’ve already gone through the nuances of Tarantool installation and setup):

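-- Make the queue package visible to require()
package.path = package.path .. ';/usr/local/lua/tarantool-queue/?.lua'
-- Accept client connections on port 3301
box.cfg{listen = 3301}
-- Load and start the queue
queue = require 'queue'
queue.start()
-- Expose the queue through the box namespace
box.queue = queue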

Next, run the instance and then either enter the Tarantool console or use a Lua script to execute the following commands, which allow guest access and create a logical queue (tube) named tube_name:

box.schema.user.grant('guest','read,write,execute','universe')
queue.create_tube('tube_name', 'fifo')

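A worker that processes the queue looks like this (Python-style pseudocode: Tarantool.Queue stands for a client wrapper, and process() for your own task handler):

# Connect to the instance configured above
queue = Tarantool.Queue(host="localhost", port=3301)

while True:
    # Wait for a task and mark it as "in process"
    task = queue.take(tube="tube_name")
    process(task)
    # Confirm success; the task is deleted from the queue
    task.ack()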

Simple, isn’t it? However, there are many more practical nuances to discuss in our future articles. We will examine them by creating a basic but really useful queue server. 


