Node Is Great for Some Things...
Node has an event-driven architecture capable of asynchronous input and output. This design optimizes throughput and scalability in Web applications with many input and output operations. Node has been used successfully for chat programs, browser games, and other applications that need real-time communications.
When one of these long-running operations finishes, Node invokes a callback. The callbacks are dispatched by a single-threaded event loop, backed by a small and efficient thread pool for work such as file system access. In this manner, Node is very good at waiting for things to happen while using a bare minimum of resources. Other platforms might have to spawn a thread or even start a new process to handle the same kind of concurrent operation.
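To make the idea of cheap waiting concrete, here is a minimal sketch of many overlapping waits on a single thread. It uses Python's asyncio rather than Node itself, purely as an illustration of the same event-driven pattern. The names `slow_operation` and `run_many` and the simulated delay are assumptions made for this example.

```python
import asyncio
import threading
import time


async def slow_operation(op_id: int, delay: float) -> int:
    # Stand-in for a network or disk wait; the event loop is free while this sleeps.
    await asyncio.sleep(delay)
    return op_id


async def run_many(count: int, delay: float) -> tuple[list[int], float, int]:
    started = time.perf_counter()
    threads_before = threading.active_count()
    # All waits are started together and overlap on the one event-loop thread.
    results = await asyncio.gather(*(slow_operation(i, delay) for i in range(count)))
    elapsed = time.perf_counter() - started
    extra_threads = threading.active_count() - threads_before
    return list(results), elapsed, extra_threads


if __name__ == "__main__":
    results, elapsed, extra_threads = asyncio.run(run_many(count=1000, delay=0.1))
    print(f"{len(results)} waits finished in {elapsed:.2f}s using {extra_threads} extra threads")
```

The total time should stay close to a single delay rather than the sum of all of them, because the waits overlap instead of queuing up behind one another.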
Another smart use of Node is as an MQTT message broker. This software is used for Internet of Things (IoT) applications where many devices need to communicate through a hub-and-spoke network. Each device on the network might publish certain messages and subscribe to others. As you can imagine, most of the time the broker is just waiting for a message to be published, which is exactly the kind of workload that Node handles well.
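The publish and subscribe pattern described here can be sketched in a few lines. This is not MQTT and not a real broker: `TinyBroker` is a hypothetical in-memory hub written in Python for illustration, with exact-match topics only and no networking, wildcards, or quality-of-service levels.

```python
from collections import defaultdict
from typing import Callable

Handler = Callable[[str, str], None]


class TinyBroker:
    """Hypothetical in-memory hub; a real MQTT broker speaks the MQTT protocol over the network."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, payload: str) -> int:
        # Deliver to every handler subscribed to this exact topic (no wildcards here).
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(topic, payload)
        return len(handlers)


if __name__ == "__main__":
    broker = TinyBroker()
    received: list[tuple[str, str]] = []
    broker.subscribe("sensors/temperature", lambda t, p: received.append((t, p)))
    broker.publish("sensors/temperature", "21.5")
    broker.publish("sensors/humidity", "40")  # nobody subscribed, so nothing is delivered
    print(received)
```

Between calls to `publish` the hub does nothing at all, which is the idle-most-of-the-time behavior the paragraph above describes.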
... But Not Everything
If Node is so nifty, then why don’t more people use it to build ordinary websites? The answer is that websites follow a particular data access pattern. A request comes in, there are some database transactions, and a response goes out. The advantages of Node start to fade away when there is a database transaction for every request and response. The problem is that the database runs in a separate process, which is what allows many clients to use it at the same time. And so if a separate process is involved anyway, the fact that Node can efficiently wait for an asynchronous database transaction to finish doesn’t help conserve resources or improve speed very much.
As it turns out, a REST API platform follows the same basic data access pattern that a simple website does. A request comes in, there is a database transaction, and the response goes out. On a website the response is an HTML page, and on a REST API backend it is a JSON document, but the workflow is the same. So the advantages of Node are diminished when you are trying to build a REST API backend. In short, a REST API backend should be working all the time, not waiting for something to happen.
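The request, database transaction, response workflow can be sketched as a single synchronous function. This is an illustration in Python, not DreamFactory code: the in-memory SQLite database, the `contacts` table, the `/contacts` path, and the shape of the JSON body are all assumptions made for the example.

```python
import json
import sqlite3


def open_database() -> sqlite3.Connection:
    # In-memory SQLite stands in for the real database; the table and rows are made up.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE contacts (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    conn.executemany("INSERT INTO contacts (name) VALUES (?)", [("Ada",), ("Grace",)])
    conn.commit()
    return conn


def handle_request(conn: sqlite3.Connection, method: str, path: str) -> tuple[int, str]:
    """Request in, one database call, JSON response out."""
    if method == "GET" and path == "/contacts":
        rows = conn.execute("SELECT id, name FROM contacts ORDER BY id").fetchall()
        body = {"resource": [{"id": r[0], "name": r[1]} for r in rows]}
        return 200, json.dumps(body)
    return 404, json.dumps({"error": "not found"})


if __name__ == "__main__":
    conn = open_database()
    print(handle_request(conn, "GET", "/contacts"))
```

Every call does real work from start to finish. There is no idle period in which an event loop could be serving someone else, which is the point the paragraph above makes.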
A Better Way
So what is the best language for building a simple request and response website? Two of the world’s most popular content management systems are Drupal and WordPress, and they are both written in PHP. In fact, PHP powers roughly 80% of the websites whose server-side language is known, and it has been used at sites as large as Facebook and Yahoo. There is a long history of solving scalability problems with LAMP stacks running PHP. Web servers, load balancers, operating systems, databases, and the PHP language have been optimized for this basic request and response model.
Some people might think that PHP is old hat. But take a look at the cutting-edge Laravel framework and the new performance enhancements in PHP 7.0. We have written elsewhere about the advantages of using modular Laravel components for the DreamFactory platform. The popularity of PHP is another advantage, because third-party database drivers are usually up to date and more widely tested than those for other languages. This was a key need for our REST API platform.
The diagram below shows the architecture for the DreamFactory backend. A process is assigned to the incoming request, the database or external service is called synchronously, and the response goes out from the same process. If there is a lot of work to be done customizing, integrating, transforming, or formatting the JSON request or response, this can happen in parallel across processes, without a single-threaded bottleneck. You need a separate process, but that was going to happen anyway because of the database transaction. For enhanced scalability we recommend running on NGINX to reduce the overhead of handling multiple processes.
There are many debates on the Internet about how to scale Node applications. Vertical scaling seems possible, but horizontal scaling appears to be more difficult. In our case, we wanted to enable any system administrator to scale DreamFactory using techniques and technologies that they were already familiar with. Because the platform is configured as a LAMP stack, DreamFactory can be scaled vertically with additional processors or horizontally with a load balancer just like any website.
We adopted JSON Web Tokens (JWT) to ensure that the DreamFactory platform is completely stateless. Need to double performance? Double the number of instances. The only requirement is that all of your instances share the same default database and cache. DreamFactory does not need persistent local storage, and this enables our stack to run on cloud platforms like Bluemix, OpenShift, Heroku, and OpenStack, as well as Docker management solutions like Swarm, Fleet, Kubernetes, and Mesos.
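To show why signed tokens make this possible, here is a minimal sketch of issuing and verifying a JWT using only the Python standard library. It is not DreamFactory's implementation, which this post does not show. The HS256 algorithm, the shared secret, and the helper names `issue_token` and `verify_token` are assumptions for illustration.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(claims: dict, secret: bytes) -> str:
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"


def verify_token(token: str, secret: bytes, now: Optional[float] = None) -> Optional[dict]:
    """Return the claims if the signature and expiry check out, else None."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        signed = f"{header_b64}.{payload_b64}".encode("ascii")
        expected = hmac.new(secret, signed, hashlib.sha256).digest()
        if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
            return None
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(payload_b64))
    except ValueError:
        # Wrong number of segments, bad base64, or bad JSON.
        return None
    if not isinstance(header, dict) or header.get("alg") != "HS256":
        return None
    if not isinstance(claims, dict):
        return None
    current = time.time() if now is None else now
    exp = claims.get("exp")
    if isinstance(exp, (int, float)) and current >= exp:
        return None
    return claims


if __name__ == "__main__":
    shared_secret = b"hypothetical-secret-shared-by-all-instances"
    token = issue_token({"sub": "user-1", "exp": int(time.time()) + 3600}, shared_secret)
    # Any instance holding the same secret can verify the token without a session lookup.
    print(verify_token(token, shared_secret))
```

Because everything needed to validate the request travels inside the token, no instance has to remember who logged in where, so a load balancer is free to send each request to any instance.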
I have written this blog post to explore some of the technology and design decisions that we made along the way. Since DreamFactory is a REST API backend, the platform could always be rewritten in another language if there were a compelling reason to do so. But as things stand, we are happy with the current architecture and scalability characteristics. Reach out and let me know what you think about it.