
Real-Time Email Tracking With Go, Redis, and Lua


A developer gives a tutorial on working with Lua and Go to create an application that uses Redis as its main database.


Real-time data provides a goldmine of information.

At Pepipost, we handle around 20k–50k email events per second. That’s huge! All of these events are processed in real time, every single second. We use the combination of Redis and Perl to deal with these complex data structures, and it has worked well for us thanks to Perl’s highly efficient text processing, powerful regular expressions, fast development cycle, and easy-to-learn syntax.

Redis is an open source (BSD licensed), in-memory data structure store, used as a database, cache, and message broker. It supports data structures such as lists, hashes, and sets.

We are getting satisfactory performance with Perl and Redis.

But Can We Optimize Further?

As developers, we are always looking for new ways to improve performance and functionality. One of the trickiest problems we face is choosing the right algorithms and data structures to build speed into the project and optimize performance.

Since every use case calls for a different approach and not all technologies fit every use case, we decided to evaluate newer technologies and languages, and found that the Golang + Redis + Lua combination gives great performance. We benchmarked a small part of the whole system.

Let’s have a look at the flow below.

Old Flow:

1. A Perl daemon continuously fetches data using BLPOP from a Redis queue (named ‘pepi_queue’).
2. It does some data processing.
3. It inserts the records into a MySQL database.

Benchmark 1: Perl + Redis

Process: 10
Number of Events: 1 million
Time: 50 seconds

Above is the old flow with the old benchmark. After that, we tried something new, based on our learnings.

Part 1

1. Why Lua?

Lua is a lightweight, multi-paradigm programming language designed primarily for embedded use in applications. You can find more info at Lua.org.

Why did we choose Lua? It lets you create your own scripted extensions to the Redis database, meaning Redis can execute Lua scripts. You can use Redis’s EVALSHA command to run a cached Lua script.

2. Build a Redis-Lua Script Extension

We have created our own scripted extension to Redis using a Lua script.

File: lrange.lua

This is a sample Lua script. Here, we wrote two commands, ‘LRANGE’ and ‘LTRIM’, which are just Redis queries.

-- Read the first KEYS[2] records from the front of the list KEYS[1]...
local result = redis.call('lrange',KEYS[1],0,KEYS[2]-1)
-- ...then trim those same records off the list, and return them.
redis.call('ltrim',KEYS[1],KEYS[2],-1)
return result

Here, the redis.call() function is used to call a Redis command.

Lua scripts receive two tables, KEYS and ARGV; we are going to use KEYS. You can explore Lua KEYS further here: https://redis.io/commands/eval.

Evaluate/Compile Lua Script With the EVAL Command:

redis-cli --eval lrange.lua pepi_queue 100

This will return records from pepi_queue, or throw an error if something’s wrong. Here, pepi_queue is KEYS[1] and 100 is KEYS[2]; note that the Lua KEYS index starts at 1, not 0. When we use the Lua extension, we don’t need to run EVAL every time. You just need to store your script in Redis once, and then you can invoke it with the EVALSHA command.

Cache/Store Scripts in Redis With the SCRIPT LOAD Command:

Load your Lua script into Redis; Redis will cache it in memory. Note that SCRIPT LOAD takes the script text itself, not a filename, so pass the file’s contents:

redis-cli script load "$(cat lrange.lua)"

It will return the SHA1 digest of the script, for example: 785c5ff1ad6fcc0c3f5387e0a951765d2d644a22.

The script stays in the script cache until the server restarts or the cache is cleared explicitly. You can remove cached scripts by calling SCRIPT FLUSH, which flushes all scripts stored in Redis.
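As a side note, the digest that SCRIPT LOAD returns is simply the SHA1 hex digest of the script’s bytes, so you can compute it client-side too. Here is a quick Go sketch (the script string below is our transcription of lrange.lua; reproducing the exact digest above depends on matching the original file byte for byte, whitespace included):

```go
package main

import (
	"crypto/sha1"
	"fmt"
)

// scriptDigest computes the SHA1 hex digest of a script body, which is
// exactly what SCRIPT LOAD returns and what EVALSHA expects as its
// first argument.
func scriptDigest(script string) string {
	return fmt.Sprintf("%x", sha1.Sum([]byte(script)))
}

func main() {
	script := "local result = redis.call('lrange',KEYS[1],0,KEYS[2]-1)\n" +
		"redis.call('ltrim',KEYS[1],KEYS[2],-1)\n" +
		"return result\n"
	fmt.Println(scriptDigest(script))
}
```

This is handy for verifying that the script you are about to EVALSHA is the same one the server has cached.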

Use the EVALSHA command:

redis-cli evalsha 785c5ff1ad6fcc0c3f5387e0a951765d2d644a22 2 'pepi_queue' 100

Here, the number of keys we are providing is two: pepi_queue and 100. pepi_queue is the list name, and 100 is the batch size — how many records to extract at a time.

With the above command, we extract 100 records at a time from pepi_queue. Looking back at the lrange.lua file: it performs LRANGE first, which returns a list of 100 records stored in the ‘result’ variable; it then performs LTRIM, which trims those same 100 records from the list; finally it returns ‘result’ with the 100 records.
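To make those semantics concrete, here is a minimal in-process Go analogy of the batch pop (popBatch is our own illustrative name; the real atomicity guarantee comes from Redis executing the Lua script, not from this sketch):

```go
package main

import "fmt"

// popBatch mimics what lrange.lua does to a Redis list, but on an
// in-memory slice: return up to n records from the front of the queue
// (LRANGE 0 n-1) and trim those records off the queue (LTRIM n -1).
func popBatch(queue []string, n int) (batch []string, rest []string) {
	if n > len(queue) {
		n = len(queue)
	}
	return queue[:n], queue[n:]
}

func main() {
	queue := []string{"e1", "e2", "e3", "e4", "e5"}
	batch, queue := popBatch(queue, 3)
	fmt.Println(batch) // [e1 e2 e3]
	fmt.Println(queue) // [e4 e5]
}
```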

How the New Approach Helps

Redis guarantees that a script is executed atomically: no other script or Redis command runs while the script is being executed, so there’s no chance of data loss between the LRANGE and the LTRIM. In our scenario, we get 100 records in a single Redis command, whereas in the old approach we needed 100 BLPOP commands to fetch the same data!

You can refer to https://redis.io/commands/eval for more info.

We have successfully created our own scripted extension. Now for the next part.

Part 2: Using Go

1. Why Go?

Golang’s performance is well known, and it offers efficient concurrency alongside performance comparable to Java, C, and C++. Concurrency in Go is explained well by Rob Pike (watch the video here). It is syntactically simple: we can start a goroutine with the keyword “go”. Go also performs automatic memory management and provides a built-in testing and profiling framework.
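As a minimal illustration, a goroutine is started with the go keyword and can hand its result back over a channel:

```go
package main

import "fmt"

// work simulates some computation we want to run concurrently.
func work() int {
	return 21 * 2
}

func main() {
	results := make(chan int)
	// The go keyword starts the function in a new goroutine;
	// main continues immediately instead of waiting for it.
	go func() {
		results <- work()
	}()
	// Receiving from the channel blocks until the goroutine sends.
	fmt.Println(<-results) // 42
}
```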

Wish to know more advantages of Go?

2. Goroutine Approach

We converted the old flow to a goroutine-based approach in Golang; here’s how.

1. We used go-redis as the client for Redis, and fetched data in batches with the EVALSHA command.

var qbatch = []string{"pepi_queue", "100"} // KEYS[1] = list name, KEYS[2] = batch size
records := rd.EvalSha("785c5ff1ad6fcc0c3f5387e0a951765d2d644a22", qbatch).Val()

Here, rd is the Redis connection made with the go-redis client. You can read more about making a go-redis connection in the client’s documentation.

2. We opened a goroutine which does the data processing and sends the built batch query to the batchInsert channel. Check out more on goroutines and channels.

go func(redis_batch_records []string) {
	// some data processing
	q := "insert into pepi_event (id, event_name) values (1,'sent'),(2,'clicked'),(3,'open'),(4,'bounce')"
	batchInsert <- q
}(redis_batch_records)

The whole process runs asynchronously.

3. We opened one goroutine dedicated to listening on the batchInsert channel; it simply executes each query, inserting the data into a MySQL database.

func BatchInsertTagData(db *sql.DB, batchInsert chan string) {
	for {
		q := <-batchInsert
		db.Exec(q)
	}
}

All three of the above steps run in parallel, without any interdependencies and without waiting for each other.

So in Part 2 of this article, we used goroutines and all three processes are running concurrently.
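The three steps above can be sketched end to end as one self-contained program. In this sketch, the Redis fetch and the MySQL insert are replaced by in-memory stand-ins (a string slice and a print), and buildQuery is a name of our own; the real code uses EVALSHA and db.Exec as shown earlier:

```go
package main

import (
	"fmt"
	"sync"
)

// buildQuery turns a batch of event names into a single multi-row
// insert statement, mirroring step 2 of the flow. (Production code
// would use placeholders rather than string concatenation.)
func buildQuery(batch []string) string {
	q := "insert into pepi_event (event_name) values"
	for i, e := range batch {
		if i > 0 {
			q += ","
		}
		q += fmt.Sprintf(" ('%s')", e)
	}
	return q
}

func main() {
	// Step 1: stand-in for a batch fetched from Redis via EVALSHA.
	batch := []string{"sent", "clicked", "open", "bounce"}

	batchInsert := make(chan string)
	var wg sync.WaitGroup

	// Step 3: a dedicated goroutine drains batchInsert; in production
	// it would run db.Exec(q) against MySQL instead of printing.
	wg.Add(1)
	go func() {
		defer wg.Done()
		for q := range batchInsert {
			fmt.Println(q)
		}
	}()

	// Step 2: process the batch in its own goroutine and hand the
	// built query to the inserter over the channel.
	go func(b []string) {
		batchInsert <- buildQuery(b)
		close(batchInsert) // safe here: this is the only producer
	}(batch)

	wg.Wait()
}
```

With more than one producer goroutine, the channel close would instead be coordinated by whoever knows all producers are done (e.g. a second WaitGroup).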

Benchmark 2: Go + Redis + Lua

Goroutines: 10
Number of Events: 1 million
Time: 10 seconds

Conclusion

This is just one example of how we’ve handled huge incoming request volumes efficiently and improved performance. There is a long workflow behind the curtain. While we’ve used it for processing email events, this combination finds wide application in other areas as well. However, please note that every application has different behavior and requires a different approach. The above approach proved perfect for our use case, but that doesn’t necessarily mean it will fit all cases; also note that this is not about Go versus Perl. Every technology is built for specific use cases; you need to evaluate which technology is right for your application.

Have you worked on any such interesting approaches and technologies? I am eager to know, so please share in the comments section.


