KeyDB and the Tao of the Unikernel
This article takes a look at how KeyDB is architecturally better than Redis. Find out why.
KeyDB is architecturally better than Redis. Why? Because it makes use of many threads.
I could end the article now and make this a tweet instead, but let's dive into it.
The KeyDB repo includes two interesting-looking benchmark graphs comparing it to Redis: both latency and QPS are much better. Those numbers prompted me to see if Nanos supported it yet.
➜  ~ ops load keydb_5.0.2 -p 6379 [keydb-server /redis.conf]
booting /Users/eyberg/.ops/images/keydb-server.img ...
assigned: 10.0.2.15
2:C 19 Nov 2019 17:17:26.212 # oO0OoO0OoO0Oo KeyDB is starting oO0OoO0OoO0Oo
2:C 19 Nov 2019 17:17:26.215 # KeyDB version=0.0.0, bits=64, commit=cedea7e4, modified=0, pid=2, just started
2:C 19 Nov 2019 17:17:26.215 # Configuration loaded
2:M 19 Nov 2019 17:17:26.251 # Not listening to IPv6: unsupported
2:M 19 Nov 2019 17:17:26.254 * Running mode=standalone, port=6379.
2:M 19 Nov 2019 17:17:26.255 # Server initialized
2:M 19 Nov 2019 17:17:26.255 * Ready to accept connections
2:M 19 Nov 2019 17:17:26.257 Thread 0 alive.
Looks like it does:
➜  ~ telnet 127.0.0.1 6379
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
set mykey bob
+OK
get mykey
$3
bob
set another keyval
+OK
get another
$6
keyval
^]
telnet> quit
Connection closed.
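That telnet session is speaking RESP, the Redis serialization protocol, which KeyDB implements unchanged (replies like `$3` followed by `bob` are bulk strings). As a sketch, here is how a client would encode those same commands in the stricter array form real client libraries send; `encode_resp` is a hypothetical helper for illustration, not part of any library:

```python
def encode_resp(*args):
    """Encode a command as a RESP array of bulk strings (what real clients send)."""
    parts = [b"*%d\r\n" % len(args)]          # array header: element count
    for arg in args:
        data = arg.encode() if isinstance(arg, str) else arg
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))  # bulk string: length, then bytes
    return b"".join(parts)

print(encode_resp("SET", "mykey", "bob"))
# → b'*3\r\n$3\r\nSET\r\n$5\r\nmykey\r\n$3\r\nbob\r\n'
```

Writing those bytes to a socket on port 6379 and reading back `+OK\r\n` is all a client does; the telnet session works because the server also accepts the looser "inline" command form.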
The single-threaded architecture of Redis isn't exactly a new topic, and the characterization isn't even entirely correct. Redis actually does have multiple threads for a few background tasks, but threading isn't an ingrained part of the model, and the codebase takes pride in keeping it that way.
There are some really crazy arguments as to why Redis shouldn't take advantage of multiple threads, considering the machines it typically runs on. Some people claim that the "network is the bottleneck." Okay, then why are you running a cluster of Redis instances?
Then there are the arguments saying that having a single-threaded single-process model is "easier and eliminates entire classes of bugs" — puhhhleaaze. We'll address this insanity later.
Then there is the problem of 'fit' for AWS or Google Cloud instances. You can't just pick an arbitrary amount of RAM and an arbitrary number of cores. All instances are sized according to the public cloud gods' wisdom. (There are strong financial reasons for this, of course, so maybe I should say their CFO/CRO's wisdom.)
Instance Sizing vs Workload Sizing
On Google Cloud, if I want to use more than 3.75GB of RAM, I can't simply request a 1-vCPU, 20GB instance. No, I need to upgrade, and with the 20GB instance I now have 8 vCPUs that I can't use if I'm running something like Redis. It's a complete and utter waste.
The people who advocate single threads and event loops often seem to have no grounding in the underlying hardware and forget about things like instruction pipelining or memory bandwidth per core.
Linux has had 'threads' as we know them since around 2003. We had threads before that, but they weren't very performant, and SMP servers weren't being mass-produced at that time either. The traditional way for Unix programs to scale was to fork a new process. It's also how you could run all the different software on a real computer, not a virtual cloud server.
Fork has many, many problems, one of which is performance. I was three years old when this paper decrying its performance came out in 1986. Here's a nice quote from it:
It has been clear for some time that the UNIX process abstraction is insufficient to meet the needs of modern applications. The definition of a UNIX process results in high overhead on the part of the operating system. Typical server applications, which use the fork operation to create a server for each client, tend to use far more system resources than are required.
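The overhead that paper describes is still easy to observe today. Here's a rough sketch in Python (chosen for brevity; it assumes a POSIX system where `os.fork` is available, and the constant 200 is arbitrary) that times creating and reaping short-lived processes against creating and joining short-lived threads:

```python
import os
import threading
import time

N = 200  # arbitrary number of children / threads to create

def spawn_processes(n):
    """Fork n children that exit immediately; the parent reaps each one."""
    start = time.perf_counter()
    for _ in range(n):
        pid = os.fork()
        if pid == 0:
            os._exit(0)        # child: exit without running cleanup handlers
        os.waitpid(pid, 0)     # parent: reap the child
    return time.perf_counter() - start

def spawn_threads(n):
    """Create and join n threads that do nothing."""
    start = time.perf_counter()
    for _ in range(n):
        t = threading.Thread(target=lambda: None)
        t.start()
        t.join()
    return time.perf_counter() - start

if __name__ == "__main__":
    print(f"{N} fork/wait cycles:   {spawn_processes(N):.3f}s")
    print(f"{N} thread create/join: {spawn_threads(N):.3f}s")
```

Absolute numbers will vary by machine, and this measures only creation cost; the gap widens further once the parent carries real state that fork has to account for.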
Fork is a hack. It was a hack when it was introduced in the 1970s. I argue that the only reason it still exists is that we have not fully thrown off the shackles of the tyranny of the single-server operating system paradigm. It's fairly integral for anyone who wants an interactive system, which is all of us, at least on development machines. However, on the server side, in production, it is time for us to move on from fork.
It always makes me cry when I find a new project that seems interesting on the surface, but when I look into it, there are three or four long-lived daemons that need to run to make the program work, as if one of them dying isn't going to take down the others on the same instance or keep the rest of the software from working. It's dead, Jim.
The Single Thread Is Faster?
There is one very illogical argument that many developers state as their reason for not implementing threading. They like to claim that something with a single thread will achieve better performance when compared apples to apples. It's the same bad logic used by some of the interpreted languages. That might be true if it really were apples to apples. Unfortunately, it never is, not on a production instance anyway.
If you are truly only going to use one vCPU, sure, adding even user-land threads might not buy you much, but as soon as you go beyond that, you are sacrificing time, money, and resources. Oh, and that one vCPU? That's only a t2.micro. Anything above a t2.small and you are now most definitely wasting resources.
Those 'workers' you have running behind a proxy? Why are they separate processes to begin with? Now you have to manage that as well. That's more ops cognitive overhead, more observability surface, and more things to blow up in your face at 3 A.M. on a Saturday morning. Why isn't your software managing that for you? Why is it not taking advantage of the performance that native threads give you? Why introduce extra security and availability challenges?
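To make the contrast concrete, here is a minimal sketch of the in-process alternative: worker threads pulling from a shared queue inside one process, instead of separate worker processes behind a proxy. The `run_workers` helper and the squaring "work" are hypothetical stand-ins, not taken from any real system:

```python
import os
import queue
import threading

def run_workers(tasks, num_workers=None):
    """Fan work out to in-process worker threads instead of separate processes."""
    num_workers = num_workers or os.cpu_count() or 2
    q = queue.Queue()
    results = []
    lock = threading.Lock()           # protects the shared results list

    def worker():
        while True:
            item = q.get()
            if item is None:          # sentinel: shut this worker down
                q.task_done()
                return
            square = item * item      # stand-in for real work
            with lock:
                results.append(square)
            q.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for task in tasks:
        q.put(task)
    for _ in threads:                 # one shutdown sentinel per worker
        q.put(None)
    q.join()
    for t in threads:
        t.join()
    return sorted(results)

print(run_workers(range(5)))  # → [0, 1, 4, 9, 16]
```

One process, one thing to deploy and observe; a dead worker is a bug inside your program rather than a fleet-management event.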
Rob Pike famously declared that one of the reasons for creating the Go programming language was that developers at Google were simply incapable of working with threads correctly in Java and C++. I won't disagree with that statement, but the problem I find with it is the inherent defeatism within it. He is also not the only famous hacker who has expressed these feelings. Guido has stated similar sentiments before.
Python, like other interpreted languages, long ago decided against implementing proper threads, with but a slight nod toward user-land threading. This decision was made all the way back in the 90s. Big problem: Linux didn't have threads as we know them at that time, nor was SMP widespread.
Two interesting quotes I pulled from one of the many discussions on this subject:
"The difference is, for an OS kernel, there really isn't any other way to benefit from multiple CPUs. But for Python, there is -- run multiple processes instead of threads!"
He goes on to elaborate in the same email:
"I think you're overestimating the sophistication of the average extension developer, and the hardware to which they have access."
Again, I won't debate the nature of this statement, other than to note that there are pro NBA players who get paid millions of dollars and millions of high school students who are forced to go to the gym once or twice a week. Having said that, that is not the argument I'd like to pick apart, and it's not a statement that I think a lot of people like to hear either.
Here, Guido is trapped in time, and so are his expectations for the future. However, fast forward to today, when he has recently retired from Dropbox: he must now admit that all (or at least close to all) hardware has multiple cores, multiple sockets, and multiple threads, and that the scripting languages have had a much larger impact than anyone could have predicted. My damn phone is octa-core, for crying out loud.
After all, a scripting language designed in the 90s would not have had access to modern Linux threads, nor to servers with multiple processors and hyper-threading. So it wasn't as if he was necessarily wrong back in the day.
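Guido's "run multiple processes instead of threads" advice still describes how CPython behaves today: the GIL serializes CPU-bound threads, while processes actually run in parallel. A toy benchmark (timings vary by machine; the point is only that the threaded version should show no parallel speedup):

```python
import multiprocessing
import threading
import time

def burn(n):
    """CPU-bound busy work; the GIL serializes this across threads."""
    total = 0
    for i in range(n):
        total += i
    return total

def timed(workers):
    """Start and join a list of threads or processes, returning elapsed seconds."""
    start = time.perf_counter()
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return time.perf_counter() - start

if __name__ == "__main__":
    n, k = 2_000_000, 4
    t = timed([threading.Thread(target=burn, args=(n,)) for _ in range(k)])
    p = timed([multiprocessing.Process(target=burn, args=(n,)) for _ in range(k)])
    print(f"{k} CPU-bound threads:   {t:.2f}s")  # serialized by the GIL
    print(f"{k} CPU-bound processes: {p:.2f}s")  # run in parallel on multicore
```

The process version pays fork, copy, and IPC costs to get that parallelism back, which is exactly the trade this article is complaining about.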
Sometimes I feel crazy that I even have to argue this.
Spending Your Way out of Performance
The Information is one of the latest outlets to report how incredibly large the lie of "just spend your way out of performance" is.
The problem is not that engineering managers are necessarily wrong. The problem is that a large majority of the software still in use today was written before all of this happened in the mid-90s. Let's not be ambiguous; let's be ultra clear on this timeline for everyone: that was 25-30 years ago!
You know, the era of Windows 95.
The free/open-source/whatever-you-want-to-call-it movement that allowed all these software companies to flourish needs to acknowledge that we have been using broken software and broken architecture for far, far too long, and we need to do something about it. If you can laugh and joke about support for Windows 10, 7, NT, XP, or 98 ending so many years or decades ago, I think it's time we started laughing and joking about the abysmal state of our own ecosystem.
One only needs to look at what it takes to stand up a Spark cluster. It's not as simple as downloading one binary and init'ing it. No, you have to install Kafka, and that means installing Zookeeper. Fortunately, even this ecosystem is realizing that this is simply not tenable anymore and is looking to remove the bifurcation that exists between them.
The internet is rife with downright disinformation on sites that purport to know what they are talking about. I can't tell you how many times on Twitter I've seen people try to reduce this down to "the kernel just calls them tasks — it's just cloned underneath." There ought to be a law on the half-life and digital decay of technical content on the internet. A lot of it is really bad.
There are a lot of reasons why threading is superior to forking new processes, but one of the bigger issues that has been rearing its head over the past several years is the memory size and growth of most applications and the data that goes with them. Fork becomes progressively more expensive with large datasets (e.g., tens of gigabytes), since the child must duplicate the parent's page tables and copy-on-write later duplicates every page that gets dirtied. This is where our aforementioned Redis comes into play. The answer is not to just turn on huge pages.
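You can get a feel for this with a toy measurement (the real pain shows up at tens of gigabytes, not the megabytes used here, so treat this as a sketch on a POSIX system rather than a rigorous benchmark): a fork gets slower as the parent's resident memory grows, because the kernel has more page tables to copy.

```python
import os
import time

def fork_time_with_heap(mb):
    """Time one fork/wait cycle after making `mb` MB of heap resident."""
    ballast = bytearray(mb * 1024 * 1024)
    for i in range(0, len(ballast), 4096):   # touch each page so it's resident
        ballast[i] = 1
    start = time.perf_counter()
    pid = os.fork()
    if pid == 0:
        os._exit(0)            # child exits immediately; the fork cost was already paid
    os.waitpid(pid, 0)
    return time.perf_counter() - start

if __name__ == "__main__":
    print(f"fork with   1 MB resident: {fork_time_with_heap(1) * 1000:.2f} ms")
    print(f"fork with 256 MB resident: {fork_time_with_heap(256) * 1000:.2f} ms")
```

And this only measures the fork itself; in a write-heavy store like Redis doing a background save, every page the parent modifies afterward gets physically copied too.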
We need more multi-threaded software that is in tune with current-era compute environments, and we need to fix our idea of what the operating system is. We need to replace and upgrade the aging, decrepit software that has no place in the 2020s. It's literally going to be 2020 in a little over another month. Maybe it already is by the time you read this.
A large percentage of the software we consume on Linux today was written in the 90s or earlier. It's no exaggeration to say that some readers of this article weren't even born in the 90s. There have been four very large trends in the past twenty-some years that have changed the landscape:
Wide Spread Adoption of Software Companies Built on Linux
Cloud is simply virtualization with an API on top. (Please tell the cloud-native folks that, by the way.) SMP is not going anywhere, nor is SMT, despite what some security wonks believe. Lastly, Google, Facebook, Uber, and any of your favorite tech companies are all built on Linux. (Yes, yes, I know Netflix uses FreeBSD; if you want, %s/linux/nix for the purposes of this article.)
Heterogeneous Compute Scheduling and the Death of the General Purpose Operating System
The introduction of Kubernetes and now the service-mesh 'sidecar' has thrown another bucket of monkey wrenches into what could have been a rebirth of engineering performance. To start off, most Kubernetes installations get thrown into the public cloud, which is already virtualized. K8s then starts to exact a serious performance penalty on the applications "scheduled" on top of it. Furthermore, most Kubernetes installations are what we call heterogeneous, and as if the underlying operating system scheduler weren't already known to be complete trash, now we have another one on top, and in 2019 the 'service mesh' has gone full steam, complicating things further.
However, even the hardest k8s haters have to admit that k8s has done one thing, for good or bad: it has democratized access to distributed compute scheduling. This, in turn, has nailed the coffin shut for the general-purpose operating system on the server side. When I moved to SF in the 2000s, there already weren't enough engineers who knew Linux well; now there are entire legions of developers who expect an "operating system" to span multiple instances, with some installations measured in the thousands.
The thing is that we all collectively walked away from the general-purpose operating system when we went to the cloud. It broke all the abstractions and it's only now that people are starting to realize that we'll need something better in the future. This is a call to action and an opportunity for those of you reading between the lines.
The Great Second Awakening of Virtualization
I believe that if today's developers take up the challenge of writing better software that can fully utilize the hardware it runs on, we can have a much brighter future. Don't be told that multi-threading is too hard. Don't be scared of words like mutexes or phrases like wait-free and lockless. Crave the performance. Fine-tune that engine. Virtualization's greatest gift is that it has given us a new foundation on which to build some serious software with a clean slate.
Let's build a better future.