Fluentd vs Logstash: The Ultimate Log Agent Battle
In this article, we compare the most popular log agents in the market today—Logstash and Fluentd. Both have compelling features and excel at managing logs.
Logging and log management are critical aspects of modern IT management, monitoring, and security. Regardless of whether your IT systems follow monolithic or microservices architecture, they are complex due to the number of moving parts and the challenges they unwittingly pose around management, deployment, and scaling. Log management systems provide DevOps teams and SREs a third eye that helps monitor and improve performance, investigate errors, visualize events as they occur, and help them look ahead.
Log collectors, or aggregators, are critical aspects of the log management infrastructure. They help collect logs from various systems and parse and groom them for ingestion into a monitoring or observability tool for further visualization and analysis. DevOps and SRE teams are quickly adding log collectors to their toolchain. With millions of users across domains, two log collectors have risen to the forefront of log collection—Fluentd and Logstash.
This article compares Fluentd and Logstash to help you decide which one is the better fit for your log management and analysis initiatives.
What Are Log Collectors?
Logs are semi-structured, descriptive data about a system’s behavior and all the events occurring within it. A log collector pulls, parses, normalizes, and enriches log data before forwarding it to a log management system, a monitoring tool, or a database. Log collectors unify data collection and make the data easier to use and analyze.
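To make the parse, normalize, and enrich steps concrete, here is a minimal Python sketch of what a collector does to each line. It is an illustration only, not how Fluentd or Logstash are implemented. The line format, the field names, and the function names (parse, normalize, enrich, collect) are all assumptions made for this example, and it expects timestamps that carry a UTC offset.

```python
import re
from datetime import datetime, timezone

# Hypothetical line format for illustration: "<ISO timestamp> <LEVEL> <message>"
LINE_PATTERN = re.compile(r"^(?P<ts>\S+)\s+(?P<level>[A-Za-z]+)\s+(?P<message>.*)$")


def parse(line):
    """Split a raw log line into named fields, or return None if it does not match."""
    match = LINE_PATTERN.match(line.strip())
    return match.groupdict() if match else None


def normalize(record):
    """Convert the timestamp to UTC and upper-case the level."""
    ts = datetime.fromisoformat(record["ts"]).astimezone(timezone.utc)
    return {
        "ts": ts.isoformat(),
        "level": record["level"].upper(),
        "message": record["message"],
    }


def enrich(record, host):
    """Attach metadata that the raw line did not carry (here, a host name)."""
    return {**record, "host": host}


def collect(lines, host):
    """Run each line through parse -> normalize -> enrich, skipping unparseable lines."""
    events = []
    for line in lines:
        parsed = parse(line)
        if parsed is None:
            continue
        events.append(enrich(normalize(parsed), host))
    return events
```

Calling collect(["2024-01-15T10:30:00+02:00 warn disk almost full"], "web-1") yields one structured event with a UTC timestamp, an upper-cased level, and the added host field, ready to forward to a downstream store.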
What Is a Log Management System?
Log collectors genuinely excel as a component of a more extensive system. This system is generally known as the log stack and consists of software that helps store and visualize logs forwarded from log collectors or aggregators. For example, log collectors such as Fluentd and Logstash combined with Elasticsearch and Kibana form the popular EFK and ELK stacks, respectively.
Elasticsearch is a text search and analysis engine that lets users store their data centrally for fast search and provides powerful analytics that scale easily. Kibana enables users to visualize and navigate Elasticsearch data quickly and intuitively. Together, they help monitor, troubleshoot, and secure large, distributed systems.
While system monitoring is a primary use case for log management systems like the ELK or EFK stacks, they’re also widely used to support business intelligence and web analytics use cases.
Fluentd vs Logstash: Key Differences
Both Fluentd and Logstash are open source. Fluentd is fully open-source software under the Apache 2.0 license, with its source code hosted on GitHub. Treasure Data builds, manages, and maintains Fluentd, and Fluentd is part of the CNCF.
Logstash is also fully open source under the Apache 2.0 license. Elastic builds, manages, and maintains Logstash and also developed Elasticsearch and Kibana.
Fluentd: Apache 2.0 licensed, fully open-source, part of CNCF.
Logstash: Apache 2.0 licensed, fully open-source.
Both Fluentd and Logstash run on Windows and Linux. Fluentd, written in Ruby and C, was not available on Windows until 2015. Today Fluentd is fully cross-platform. Logstash, written in JRuby, has a platform advantage because it runs on the JVM and is naturally cross-platform.
Fluentd: Written in Ruby and C, now fully cross-platform.
Logstash: Written in JRuby, runs on the JVM, naturally cross-platform.
Ecosystem and Plugins
Both Fluentd and Logstash have active and expansive plugin ecosystems. Plugins allow developers and DevOps teams to configure logging systems by input, parser, filter, output, formatter, storage, and buffer. Logstash has a centralized plugin ecosystem managed under a single GitHub repository, which gives you a simple, one-stop location for all plugins.
Fluentd has a decentralized plugin ecosystem. It does provide an official repository, but the vast majority of its plugins are hosted in individual repositories, which makes the Logstash plugin ecosystem the easier of the two to navigate.
Fluentd: Decentralized yet vast plugin ecosystem hosted in individual repositories.
Logstash: Vast plugin ecosystem in a single, easy-to-navigate repository.
The amount of memory Fluentd and Logstash use depends on your particular use case. However, Logstash is the more memory-intensive of the two because it’s written in JRuby and runs on the JVM.
Fluentd consumes fewer resources than Logstash because it’s written in CRuby. It also scales very well in comparison and is fast and efficient for most small to medium-sized deployments.
Both Logstash and Fluentd offer lighter-weight tools that have a smaller memory footprint. Logstash provides Elastic Beats, a lightweight data shipper for single-purpose data. While it’s great for shipping data, if you need more processing power, you’ll have to use it in conjunction with Logstash.
Fluentd offers Fluent Bit, which allows you to collect data from multiple sources, enrich it using filters, and ship it to multiple destinations. Fluent Bit is written in C and supports over 70 plugins and extensions for inputs, filters, and outputs. Fluent Bit works well in small and embedded applications and is fast becoming the preferred choice for containerized environments like Kubernetes.
Fluentd: Consumes fewer resources; provides Fluent Bit as a lightweight alternative.
Logstash: Is resource-hungry in comparison; provides Elastic Beats as a lightweight alternative.
Event routers in a log stack send messages and events between applications and systems. When evaluating a logging system, it is critical to consider how it handles event routing. Logstash and Fluentd differ considerably in this regard, and you should consider this when deciding which logging stack to choose.
Logstash routes events using if-else conditions—you’ll need to write if-then-else statements to define criteria for Logstash to perform actions on your data. Fluentd, on the other hand, routes events based on tags. With Fluentd, you’ll have to tag each of your data sources (inputs). Fluentd uses tags to match inputs against different outputs and then routes events to the corresponding output.
While writing if-else conditions might not seem too daunting, using tags is more straightforward, and this gives Fluentd a clear advantage. The difference in routing approaches is worth weighing carefully. Using if-then statements for event routing makes a Logstash configuration procedural: you spell out, step by step, how each event is checked. The tag system employed by Fluentd is more declarative: you state which tags go to which outputs. In many cases, the tag-based routing used by Fluentd is better for complex logic.
Fluentd: Routes events using the tag approach; more declarative.
Logstash: Routes events based on if-else conditions; more procedural.
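The contrast between the two routing styles can be sketched in a few lines of Python. This is a simplified model and not either tool's configuration language or matching algorithm: the tags, field names, and output names are hypothetical, and the glob matching from Python's fnmatch module only approximates tag pattern matching.

```python
from fnmatch import fnmatchcase

# Tag-based routing (declarative): an ordered table of (pattern, output name).
# The patterns and output names are hypothetical.
ROUTES = [
    ("app.payment.*", "alerts"),
    ("app.*", "search_index"),
    ("*", "archive"),
]


def route_by_tag(tag):
    """Return the output of the first route whose pattern matches the tag."""
    for pattern, output in ROUTES:
        if fnmatchcase(tag, pattern):
            return output
    return None


def route_by_condition(event):
    """Conditional routing (procedural): the same policy as if/else on event fields."""
    if event.get("service") == "payment":
        return "alerts"
    elif event.get("source") == "app":
        return "search_index"
    else:
        return "archive"
```

With the tag table, adding a destination means adding a row, and route_by_tag("app.payment.error") returns "alerts" because the first matching pattern wins. With conditions, the same change means editing the branching logic, which is where the procedural approach tends to grow harder to follow.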
Log systems use inputs such as files or data stores to get data into the log collector. Fluentd has a configurable in-memory message queue and has zero dependencies when ingesting data from multiple sources. Logstash lacks a robust in-memory message queue: its queue can only hold 20 events, which introduces a dependency on an external queue like Redis. For DevOps teams and developers, this means more configuration and another dependency to manage in order to ingest data into Logstash, which adds complexity to your system and increases the chances of failure.
Fluentd is the safer of the two for data transport because it has no external dependencies. It also handles resiliency better with its internal and configurable buffering system.
Fluentd: Has an internal and configurable buffering system, is more resilient and persistent, less complex to configure and manage.
Logstash: Has a restrictive buffering system and depends on an external queue. Has greater complexity and higher chances of failure.
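To show why an internal, configurable buffer helps with resiliency, here is a minimal Python sketch of a bounded in-memory buffer that flushes in chunks and keeps undelivered events for a later retry. It is a conceptual illustration, not Fluentd's buffer implementation; the class name BufferedForwarder and the parameters max_events and chunk_size are assumptions made for this example.

```python
from collections import deque


class BufferedForwarder:
    """Hold events in a bounded in-memory buffer and flush them in chunks."""

    def __init__(self, send, max_events=1000, chunk_size=100):
        self._send = send              # callable that delivers a list of events; may raise
        self._max_events = max_events  # hypothetical cap on buffered events
        self._chunk_size = chunk_size  # hypothetical number of events per delivery
        self._buffer = deque()

    def emit(self, event):
        """Queue an event; return False when the buffer is full so the caller can back off."""
        if len(self._buffer) >= self._max_events:
            return False
        self._buffer.append(event)
        return True

    def flush(self):
        """Send buffered events chunk by chunk; keep a chunk buffered if delivery fails."""
        sent = 0
        while self._buffer:
            size = min(self._chunk_size, len(self._buffer))
            chunk = [self._buffer[i] for i in range(size)]
            try:
                self._send(chunk)
            except Exception:
                break  # leave the chunk in the buffer for the next flush attempt
            for _ in chunk:
                self._buffer.popleft()
            sent += len(chunk)
        return sent

    def __len__(self):
        return len(self._buffer)
```

Because events are removed only after a successful delivery, a temporary outage at the destination does not lose data as long as the buffer has room. The buffer size is the knob that trades memory for tolerance of longer outages.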
When To Use Fluentd
Docker has native, built-in support for Fluentd, but not for Logstash. As a result, teams using Docker often prefer Fluentd.
Similarly, native Docker support means that Fluentd is often the best option when monitoring Kubernetes environments. When memory is critical—for example, in embedded software—Fluentd is more memory-efficient due to the lack of JVM and Java runtime dependencies. Finally, if your system does not involve the JVM, avoid adding a new dependency by choosing Fluentd.
When To Use Logstash
Logstash was built with Elasticsearch and Kibana in mind, so it is your best bet if you are looking for a log collector to work with a system involving Elasticsearch and Kibana. Generally speaking, when you prefer a more managed, supported system, Logstash wins out. For example, the managed plugin ecosystem and better enterprise support experience provided by Logstash are indicators of a more organized product.
Logging at Scale
The Fluentd vs. Logstash decision is only one of many challenges you'll face when monitoring logs at scale. Both Fluentd and Logstash have compelling feature sets, but their performance depends on where you plan to integrate them and how you plan to use them to manage and analyze your logs.
Published at DZone with permission of Ajit Chelat. See the original article here.