So far, we've seen the problems that microservices and containers pose for logging, and how choosing the right aggregation architecture for your project can help. Now we're going to look at the tool that makes it happen: Fluentd.
Lock and Key: Docker + Fluentd
The need for a unified logging layer for microservices led Sadayuki Furuhashi, Treasure Data’s Chief Architect, to develop and open-source the Fluentd framework. Fluentd is a data collection system: a daemon, like syslogd, that listens for messages from services and routes them in various ways. But unlike syslogd, Fluentd was built from the ground up to unify log sources from microservices, so they can be used efficiently for production and analytics. The same performant code can run in either Collector or Aggregator mode with a single tweak to configuration, making it extremely easy to deploy across an entire system.
Because Fluentd is natively supported by Docker through a built-in logging driver, all container logs can be collected without running any “agent” inside individual containers. Just spin up Docker containers with the “--log-driver=fluentd” option, and make sure either the host machine or designated “logging” containers run Fluentd. This approach lets most containers run “lean” because no logging agent needs to be installed in the source containers.
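As a rough illustration of that option, here is a minimal Python sketch that assembles a “docker run” command line using the Fluentd logging driver. The helper name docker_run_command, the image, and the tag value are made up for this example; the address localhost:24224 is assumed to be where your Fluentd instance listens.

```python
import shlex


def docker_run_command(image, tag, fluentd_address="localhost:24224"):
    """Build a `docker run` argv that ships container output to Fluentd.

    Hypothetical helper for illustration; image, tag and address are assumptions.
    """
    return [
        "docker", "run", "-d",
        "--log-driver=fluentd",                              # use Docker's Fluentd logging driver
        "--log-opt", f"fluentd-address={fluentd_address}",   # where Fluentd is listening
        "--log-opt", f"tag={tag}",                           # tag Fluentd will use for routing
        image,
    ]


cmd = docker_run_command("nginx:latest", tag="docker.web")
print(shlex.join(cmd))  # prints a shell-ready command line
```

You could hand the resulting list to subprocess.run, or simply copy the printed command into a shell. Nothing needs to be installed inside the container image itself.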
Fluentd’s light weight and extensibility make it suitable for aggregating logs on both the source and destination sides, in either a “scaling up” or “scaling out” configuration. Again, which flavor is best for you depends on your present setup and your future needs. Let’s look at each in turn.
Simple Forwarding + Scaling Up
For easy setup, it’s hard to beat the simplicity of adding a few lines of code from a Fluentd logger library to your app, which instantly enables direct log forwarding with a single Fluentd instance per container. Because it’s nearly effortless, this can be a great boon to fledgling startups, which usually have a small number of services and data volumes low enough to store in a standard MySQL database with a few concurrent connections.
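To give a feel for what direct forwarding involves, here is a minimal standard-library sketch. In a real app you would more likely use one of the Fluentd logger libraries, which handle MessagePack encoding, buffering, and reconnects for you. This sketch assumes a Fluentd forward input listening on localhost:24224 and relies on that input accepting JSON-encoded [tag, time, record] messages as well as MessagePack. The function names and the example tag are hypothetical.

```python
import json
import socket
import time


def build_event(tag, record, timestamp=None):
    """Encode one event as a JSON [tag, time, record] message (assumed wire format)."""
    if timestamp is None:
        timestamp = int(time.time())
    return json.dumps([tag, timestamp, record]).encode("utf-8")


def send_event(tag, record, host="localhost", port=24224, timeout=3.0):
    """Open a TCP connection to Fluentd and send a single event.

    host and port are assumptions; adjust them to your own Fluentd instance.
    """
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(build_event(tag, record))


# Build (but do not send) an example event so the sketch runs without Fluentd.
payload = build_event("app.signup", {"user": "alice", "plan": "free"}, timestamp=1700000000)
print(payload.decode("utf-8"))
```

Calling send_event("app.signup", {...}) from your request handlers is all the “integration” this setup needs, which is exactly why it is attractive early on and why it offers little protection once volumes grow.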
But at the risk of beating a seriously dead horse, there are limits to how much a system like this can scale. What if your startup really takes off? Depending on how data-driven your business is, you might want to put in the implementation effort up front (or consider outsourcing the problem with a managed data infrastructure stack) to avoid panic attacks later on.
Source Aggregation + Scaling Up
Another possible configuration is to aggregate on the source with Fluentd and send the aggregated logs to a NoSQL data store using one of Fluentd’s 400+ community-contributed plugins. We’ll look at Elasticsearch for this example, because it’s popular. This configuration (using Kibana for visualization) is called the EFK stack, and it is a common choice for cluster logging on platforms such as Kubernetes. It’s reasonably straightforward, and it works great for medium data volumes. Usually.
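To make “aggregate on the source, then ship to Elasticsearch” a little more concrete, the sketch below turns a batch of buffered events into the newline-delimited body that Elasticsearch’s bulk API expects. In an EFK setup, Fluentd’s Elasticsearch output plugin does this work for you; the function name to_bulk_body, the index name, and the sample events are assumptions for illustration only, not the plugin’s actual implementation.

```python
import json


def to_bulk_body(events, index="app-logs"):
    """Turn buffered events into an Elasticsearch _bulk request body (NDJSON)."""
    lines = []
    for event in events:
        lines.append(json.dumps({"index": {"_index": index}}))  # action line
        lines.append(json.dumps(event))                          # document line
    return "\n".join(lines) + "\n"  # the bulk body must end with a newline


# Hypothetical events collected from two services on the same host.
buffered = [
    {"service": "checkout", "level": "error", "message": "payment timeout"},
    {"service": "search", "level": "info", "message": "query served"},
]
body = to_bulk_body(buffered)
print(body)
```

Batching like this is what makes source-side aggregation cheaper than one request per log line: the store sees a few large writes instead of many small ones.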
A caveat with Elasticsearch: while it is a great platform for search, it is less than optimal as a central component of your data infrastructure. This is especially true when you’re trying to load high volumes of important data. At production scale, Elasticsearch has been shown to have critical ingestion problems, including split brain, that result in data loss. In the EFK configuration, since Fluentd is aggregating on the source and not the destination, there’s nothing it can do if the store drops data.
For production-scale analytics, you might consider a more fault-tolerant platform, such as Hadoop or Cassandra, both of which are optimized for high-volume write loads.
Source/Destination Aggregation + Scaling Out
If you need to process massive amounts of complex data, your best bet is to set up both source and destination side aggregation nodes, leveraging the various configuration modes of Fluentd. With the Fluentd logging driver that comes bundled with Docker, your application can just write its logs to STDOUT. Docker will automatically forward them to the Fluentd instance at localhost, which in turn aggregates and forwards them on to destination-side Fluentd aggregators via TCP.
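On the application side, “just write to STDOUT” can be as small as the following sketch, which emits one JSON object per log line. The names JsonFormatter and make_logger and the chosen fields are assumptions for this example. Docker’s Fluentd driver forwards each output line as-is, so any parsing of the JSON would happen downstream in Fluentd.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })


def make_logger(name, stream=sys.stdout):
    """Create a logger that writes JSON lines to the given stream (STDOUT by default)."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers = [handler]   # replace any handlers from earlier setup
    logger.propagate = False      # avoid duplicate output via the root logger
    return logger


log = make_logger("orders")
log.info("order %s accepted", 1234)
```

The application knows nothing about Fluentd here. Where the logs end up is decided entirely by how the container is started and how the Fluentd nodes are configured.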
This is where the power and flexibility of Fluentd really come into their own. In this architecture, Fluentd, by default, enables round-robin load balancing with automatic failover. This lends itself to a scale-out architecture because each new node is load-balanced by the upstream instance feeding it. Additionally, the built-in buffer architecture gives it an automatic fail-safe against data loss at every stage of the transfer process. It even includes automatic corruption detection (which initiates upload retries until the complete dataset is transferred), as well as a deduplication API.
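The interplay of round-robin balancing, failover, and buffering can be sketched with a toy model. This is not Fluentd’s implementation (in Fluentd you get this behavior through configuration of the forward output, not code); the class RoundRobinForwarder and the fake nodes below are hypothetical and only meant to show the idea.

```python
from collections import deque


class RoundRobinForwarder:
    """Toy model of round-robin forwarding with failover and a retry buffer."""

    def __init__(self, senders):
        # Each sender is a callable that takes a chunk and raises
        # ConnectionError when its aggregator node is unreachable.
        self.senders = list(senders)
        self.next_index = 0
        self.buffer = deque()

    def enqueue(self, chunk):
        self.buffer.append(chunk)

    def _send_one(self, chunk):
        # Try every node once, starting with the next one in the rotation.
        for attempt in range(len(self.senders)):
            index = (self.next_index + attempt) % len(self.senders)
            try:
                self.senders[index](chunk)
            except ConnectionError:
                continue  # fail over to the next node
            self.next_index = (index + 1) % len(self.senders)
            return True
        return False

    def flush(self):
        """Send buffered chunks in order; return how many are still buffered."""
        while self.buffer:
            if not self._send_one(self.buffer[0]):
                break  # every node is down: keep the chunk for a later retry
            self.buffer.popleft()
        return len(self.buffer)


# Two fake aggregator nodes: one healthy, one down.
received = {"a": []}


def node_a(chunk):
    received["a"].append(chunk)


def node_b(chunk):
    raise ConnectionError("node b is down")


forwarder = RoundRobinForwarder([node_a, node_b])
for chunk in ["c1", "c2", "c3"]:
    forwarder.enqueue(chunk)
remaining = forwarder.flush()
print(received["a"], remaining)
```

A chunk leaves the buffer only after some node has accepted it, so an outage delays delivery rather than dropping data. Adding capacity means appending another sender to the list.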
What Configuration Is Right for You?
It depends on your budget and how fast you must move. Are you a resource-strapped startup processing small amounts of data? You may be able to get away with forwarding straight from your source into a single-node MySQL database. If your data volumes are moderate and you have no strong need for fail-safe data capture, the EFK stack may suffice.
As organizations of all sizes become more data-driven, however, it’s worth taking the time up front to think through your long-term goals. Do you need to make sure your data pipeline won’t choke when you start processing billions of events per day? Do you want maximum extensibility for whatever data sources you may want to add in the future? Then you may want to consider implementing both source and destination aggregation up front. Your future self (and colleagues) will thank you when your data volumes start exploding.
Whatever your configuration, the simplicity, reliability, and extensibility of Fluentd make it a great choice for data forwarding and aggregation. And the fact that Docker ships with a built-in Fluentd logging driver makes it a no-brainer for any microservices-based system.