Chronicle and the Micro-Cloud
The Cloud Zone is brought to you in partnership with Mendix. Better understand the aPaaS landscape and how the right platform can accelerate your software delivery cadence and capacity with the Gartner 2015 Magic Quadrant for Enterprise Application Platform as a Service.
OverviewA common question I face is; how do you scale a Chronicle based system if it is single writer, multiple readers.
While there are solutions to this problem, it is far more likely it won't be a problem at all.
The Micro-CloudThis is the term I have been using to describe a single thread to do the work currently being done by multiple servers. (Or the opposite of the trend of deploying a single application to multiple machines)
There is an assumption that the only way to scale a system is to partition or have multiple copies. This doesn't always scale that well unless you have multiple systems as well. All the adds complexity to the development, deployment and maintenance of the system. A common problem I see is that developers are no longer able to test the end-to-end system on their workstations or in unit/functional tests
Chronicle based processing engines have a different approach. They are typically designed to handle as much load as you can give it. i.e. the bottleneck is elsewhere such as the touch points with external systems, gateways and databases. A Chronicle based processing engine can handle 100,000 to 1,000,000 inbound and out bound events per second with latencies of between 1 and 10 micro-seconds. This is more than the rate than gateways can feed data in/out of the engine so there is never a good use case for partitioning. It is an option, but it is one you should not assume you need.
The benefit of this is you have a deterministic system which is very simple architecturally. The system is deterministic because you have a record of every inbound and outbound message which can have micro-second timings and the behaviour of the system is completely reproduce-able (and restart-able). Note: one of the consequences of this is, just restarting a failed service should not help. If it failed once a particular way based on particular message, it should fail on restart exactly the same way, every time. This makes reproducing issues and testing easier.
Why is it so fast?A single thread processing engine in one thread has
- No locking
- No TCP delays
- Can be locked to a core or CPU improving caching behaviour.
i.e. it never context switches or gives up the CPU.
- Reusable mutable objects are more practical so you can avoid churning your caches with garbage so you memory accesses can be 5x faster. (With the bonus you no longer have GC pauses)
The real reason to avoid producing garbage
ConclusionFor this reason, I have started using the term micro-cloud. i.e. do the work of a whole cloud in one thread. I have a couple clients who are doing just that. What was done by many JVMs across multiple systems can be done by just one, very fast thread.
Obviously, one thread is much easier to test on one workstation and takes up far less rack space in your data centre in production.