In this post I am going to go through the process of constructing a workflow
using payments as an example. As you work more with Redis, you soon
find yourself building out workflows, i.e. small pieces of
code that talk to each other via Redis. For someone familiar with a
Service Oriented approach to building systems this should feel like
déjà vu, except instead of using a protocol (HTTP, TCP, UDP, AMQP,
ZeroMQ) we are going back to CS101 and using a good old queue data structure.
It might strike many as crazy to build a payment processing workflow
using anything other than a traditional RDBMS. I would like to argue,
though, that many of the strengths a traditional RDBMS is perceived to
provide in terms of transactionality can be constructed quite easily
with Redis.
There is the concern that redis keeps its data in volatile memory,
so if the machine running the redis server crashes, everything
since the last bgsave is lost. But the very same rules apply to a
regular database. You could up the frequency with which redis
does an fsync and have it write to disk on every write, much like an
RDBMS. What if the disk gets wiped? In that case you could take the very
same precaution you would with an RDBMS, i.e. replication. I
personally fall in the camp of favoring replication and think it
removes the need for paranoid disk writes (which kill the
performance redis gives you). Having gotten that out of the way, let's
proceed with building out a fast, correct, reliable and robust payment
processor using the atomic tools that redis gives you.
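For reference, the durability trade-off above boils down to a couple of redis settings. Here is a rough sketch using the redis-rb CONFIG command (the same knobs can live in redis.conf); this is illustrative, not the configuration assumed in the rest of this post:

# Turn on the append-only file and choose how often redis fsyncs it.
# Equivalent redis.conf directives: "appendonly yes" / "appendfsync always".
redis.config(:set, "appendonly", "yes")
redis.config(:set, "appendfsync", "always")    # fsync on every write, RDBMS-style paranoia
# redis.config(:set, "appendfsync", "everysec") # the usual middle ground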
Modus operandi
We want to keep each of the players in our workflow light, simple and
preferably singularly focused. This will help us scale our system,
identify problems more easily and most importantly be far easier to
maintain than a monolithic beast. With this in mind, our workflow can be
thought of as follows:
- An HTTP API allowing users to submit payments into our workflow.
This generates a unit of work that gets handled by the workers.
- Workers that act on the work generated from our API
Our plan is going to be:
- Expose an HTTP API that lets users submit payments
- Keep our controller minimal, have it just create a model and call it a day
- Have the model process the payment asynchronously
- Think about handling cases for when things don't go according to plan
- Have a way to figure out which payments are taking unduly long to process, etc.
The HTTP API
To keep things simple, let’s assume we have the following code
running in Ruby-on-Rails that exposes our payment processor via HTTP:
class PaymentController < ApplicationController
  def create
    Payment.create! params[:payment]
    render :json => {:status => "OK"}
  end
end
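One assumption worth calling out: the post never shows the route wiring, so something along these lines in config/routes.rb (the path is my guess) is taken for granted to point POST /payments at the controller above:

# config/routes.rb (assumed, not part of the original snippet)
post "payments" => "payment#create"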
The Model layer
In keeping with the MVC goodness, here’s our model layer:
class Payment
  def self.create!(params)
    # Bail out early if any of the required fields are missing
    raise ArgumentError unless params[:payer_id] && params[:recipient_id] && params[:amount]
    # Tag the payment with a tracking id and enqueue it for asynchronous processing
    redis.rpush 'payments_to_be_processed',
      params.merge(:tracking_id => redis.incr('tracking_ids')).to_json
  end
end
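As a quick sanity check, what lands on the queue is just a JSON blob with the tracking id folded in. Peeking at it with the same redis client looks roughly like this (the values shown are made up):

# Inspect the queue without popping anything off it
redis.lrange 'payments_to_be_processed', 0, -1
# => ["{\"payer_id\":42,\"recipient_id\":7,\"amount\":100,\"tracking_id\":1}"]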
Looking back at the list of bullet points, it looks like we have
accomplished the first three: we have a super simple HTTP API,
both our controller and model code are minimal and we have paved the
road for processing payments asynchronously with the payments_to_be_processed
list. Pretty awesome stuff; now all that's left is to figure
out the back-end. How do we keep track of payments as they flow through
the various states of being processed? What metrics do we think we are
going to need, and how do we store them? What race conditions do we
need to guard against?
Payment Processing stub
For the purposes of this discussion let’s not worry about actually
processing a payment (there are several well documented services out
there such as braintree, recurly
etc. that make it straightforward). To get the ball rolling I’m going
to be assuming that we have the following piece of code that we are
going to be calling to process our payment:
def process_payment(payer_id, recipient_id, amount)
  # Stub that randomly picks one of the three possible outcomes
  rand_val = (rand * 10).to_i
  if rand_val > 3
    return :status => :success,
           :txn_id => redis.incr("txn_ids"), :processed_on => Time.new.getutc.to_i
  elsif rand_val > 0
    return :status => :insufficient_funds,
           :txn_id => redis.incr("txn_ids"), :processed_on => Time.new.getutc.to_i
  else
    return :status => :api_error, :processed_on => Time.new.getutc.to_i
  end
end
As you can see, this method can have one of three possible outcomes:
- success
- insufficient funds
- api error (we were not able to connect with our payment service)
This might strike many as naive, but this is by no means attempting
to be an exhaustive monograph on what can go wrong when processing
payments. Instead, what I'd like to focus on is this: given a finite list of
possible outcomes when processing a payment, how do I use redis to
process the payment accurately and recover gracefully when things
go bad?
(Note: In this method I work with time as integer UTC timestamps. I highly recommend this when working with redis, since plain integers make natural sorted set scores and range bounds.)
Payment Processing Workers
Given the list of three possible outcomes, it’s a no-brainer that in
our payment processing workers we are going to need to handle these
three conditions. With that in mind, here’s a first stab at it:
loop do
  # Block until a payment shows up on the queue; brpop returns [list_name, value]
  payment = JSON.parse(redis.brpop("payments_to_be_processed")[1])
  tracking_id  = payment["tracking_id"]
  payer_id     = payment["payer_id"]
  recipient_id = payment["recipient_id"]
  amount       = payment["amount"]

  results = process_payment payer_id, recipient_id, amount
  if results[:status] == :success
    redis.zadd "successful_txns", results[:processed_on], results[:txn_id]
    redis.hmset "txns", results[:txn_id], payment.to_json
    redis.zadd "payments_made_by|#{payer_id}", results[:processed_on], results[:txn_id]
    redis.zadd "payments_received_by|#{recipient_id}", results[:processed_on], results[:txn_id]
  elsif results[:status] == :insufficient_funds
    redis.zadd "insufficient_funds_txns", results[:processed_on], results[:txn_id]
    redis.hmset "txns", results[:txn_id], payment.to_json
    redis.zadd "insufficient_funds_for|#{payer_id}", results[:processed_on], results[:txn_id]
    redis.zadd "insufficient_funds_to|#{recipient_id}", results[:processed_on], results[:txn_id]
  else
    # No txn_id when the payment API is unreachable, so track by our own tracking_id
    redis.zadd "api_errors", results[:processed_on], {:tracking_id => tracking_id}.to_json
  end
end
This looks like a pretty impressive first stab at the problem. We have:
1. Handled (maybe not completely) our three cases when processing a payment.
2. A way to figure out the status of a payment by looking in the sorted sets:
- successful_txns
- insufficient_funds_txns
- api_errors
Each worker pulls a JSON'ified hash off the queue that contains details on who
is paying whom and the amount. The worker then tries processing the
payment and, depending on whether it succeeded or failed, adds it to
further redis datastructures. One thing I'd like to point out here is
that whenever possible I lean towards using a sorted set instead of a
set, with a UTC timestamp as the score. This lets me perform range
queries such as: how many successful transactions were performed today in
total, how many payments has a given user made or received in a given
time-frame, etc. Anytime you see yourself needing a set, think a
little deeper to see if a sorted set may be a better fit.
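To make that concrete, here is a rough sketch of the kind of range queries the timestamp scores buy you (the window variables are made up for illustration):

# How many transactions succeeded today? zcount counts members whose score
# (our UTC timestamp) falls within the range.
start_of_day = Time.new.getutc.to_i / 86_400 * 86_400
redis.zcount "successful_txns", start_of_day, "+inf"

# Which payments did a given payer make during an arbitrary window?
redis.zrangebyscore "payments_made_by|#{payer_id}", window_start, window_end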
Coming back to the code above, one thing we'd like to guard against is recording the status only partially: irrespective of the outcome of the payment, when we note down the status across several of our datastructures we want to do it in one fell swoop. To be a little clearer, if a transaction was successful we want to ensure that it either gets added to all of successful_txns, txns, payments_made_by and payments_received_by, or to none of them.
Transactionality using multi-exec
To do this we use redis' built-in transactionality primitives, multi and exec. The updated code is as follows:
loop do
  ...
  if results[:status] == :success
    redis.multi
    redis.zadd "successful_txns", results[:processed_on], results[:txn_id]
    ...
    redis.exec
  elsif results[:status] == :insufficient_funds
    redis.multi
    ...
    redis.exec
  else
    redis.zadd "api_errors", results[:processed_on], {:tracking_id => tracking_id}.to_json
  end
end
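As a side note, the redis-rb gem also has a block form of multi that queues the commands and sends EXEC for you when the block finishes. A sketch of the success branch written that way:

if results[:status] == :success
  redis.multi do |multi|
    multi.zadd "successful_txns", results[:processed_on], results[:txn_id]
    multi.hmset "txns", results[:txn_id], payment.to_json
    multi.zadd "payments_made_by|#{payer_id}", results[:processed_on], results[:txn_id]
    multi.zadd "payments_received_by|#{recipient_id}", results[:processed_on], results[:txn_id]
  end
end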
Keeping track of Queue size
Here’s a super simple queue size tracker:
loop do
  if redis.llen("payments_to_be_processed") > 100_000
    send_pager(:to => "ops", :msg => "queue is getting backed up")
  end
  sleep 1*60 # for a minute
end
Here 100_000 is totally a number I pulled out of thin air; you can and
should make it configurable. You also needn't worry about this
tracker bringing down your redis server. Believe me, redis can handle an O(1) llen call every 60 seconds! :)
Scaling
Let’s say your HTTP API is pumping more payments than you are capable
of processing and you would like to process them a little faster.
Simple — just increase the number of workers you have running and you
will horizontally scale.
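If you want to see what that looks like in code, here is a minimal sketch; it assumes the worker loop above is wrapped in a run_worker method, and the WORKER_COUNT variable is made up for illustration:

# Spawn a handful of worker processes, each running the same brpop loop.
WORKER_COUNT = Integer(ENV.fetch("WORKER_COUNT", 4))
WORKER_COUNT.times do
  fork { run_worker }
end
Process.waitall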
Conclusion
The big takeaway I’d like for you to have from reading this post is a
feel for working with redis. This post is not about building a payment
system (even though the title says that it is). It is about building
tiny services that have a singular purpose and that talk to each other
using Redis. Some of the code in this post might be wrong and some of
the assumptions I make may be wrong as well. But the general gist of
building a workflow consisting of small services that talk with each
other via Redis is right.