In the context of centralizing logs (say, to Logsene or your own Elasticsearch), we often get the question of whether one should log directly from the application (e.g., via an Elasticsearch or syslog appender) or use a dedicated log shipper.
In this post, we’ll look at the advantages of each approach so you’ll know when to use which.
Most programming languages have libraries to assist you with logging. Most commonly, they support local files or syslog, but more exotic destinations are often added to the list, such as Elasticsearch and Logsene. Here’s why you might want to use them:
You’ll want a logging library anyway, so why not go with it all the way without having to set up and manage a separate application for shipping? (Well, there are some reasons below, but you get the point.)
Fewer Moving Parts
Logging from the library means you don’t have to manage the communication between the application and the log shipper.
Logs serialized by your application can be consumed by Elasticsearch and Logsene directly, instead of having a log shipper in the middle deserialize and parse them and then serialize them again.
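To make the "serialize once" idea concrete, here is a minimal sketch using Python's standard `logging` module. The `JsonFormatter` class and the field names (`@timestamp`, `severity`, `logger`, `message`) are illustrative assumptions, not a schema that Elasticsearch or Logsene requires.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Serialize each log record as one JSON object (hypothetical field names)."""

    def format(self, record):
        doc = {
            # record.created is a Unix timestamp; emit it as ISO 8601 in UTC
            "@timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "severity": record.levelname,
            "logger": record.name,
            # getMessage() applies %-style arguments to the message template
            "message": record.getMessage(),
        }
        return json.dumps(doc)


# Usage: attach the formatter to whichever handler your library provides.
# handler = logging.StreamHandler()
# handler.setFormatter(JsonFormatter())
# logging.getLogger("app").addHandler(handler)
```

Because the application already emits structured JSON, nothing downstream has to parse a free-form line to recover the severity or timestamp.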
The alternative is a dedicated log shipper, which can be Logstash or one of its alternatives. A logging library is still needed to get logs out of your application, but you’ll only write locally, either to a file or to a socket. The log shipper then carries that raw log the rest of the way to Elasticsearch and Logsene:
Most log shippers have buffers of some form. Whether it tails a file and remembers where it left off, or keeps data in memory or on disk, a log shipper is more resilient to network issues or slowdowns. Buffering can be implemented by a logging library, too, but in practice, most either block the calling thread (or the whole application) or drop data.
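The "tails a file and remembers where it left off" behavior can be sketched in a few lines. This is a simplified illustration, not how any particular shipper is implemented: `read_new_lines` and the JSON state file are hypothetical, and a real shipper would typically also track file identity across rotation and persist the offset only after the destination acknowledges the data.

```python
import json
import os


def read_new_lines(log_path, state_path):
    """Return complete lines appended to log_path since the previous call."""
    offset = 0
    if os.path.exists(state_path):
        with open(state_path) as f:
            offset = json.load(f)["offset"]

    # If the file is now smaller than the saved offset, assume it was
    # truncated or rotated and start again from the beginning.
    if os.path.getsize(log_path) < offset:
        offset = 0

    lines = []
    with open(log_path, "rb") as f:
        f.seek(offset)
        for raw in f:
            if not raw.endswith(b"\n"):
                break  # partial line still being written; read it next time
            lines.append(raw.decode("utf-8", errors="replace").rstrip("\r\n"))
            offset += len(raw)

    # Persist the position so a restart resumes here instead of re-reading.
    with open(state_path, "w") as f:
        json.dump({"offset": offset}, f)
    return lines
```

The file itself acts as the buffer: if the network is down, the application keeps writing locally and the shipper catches up later from the saved offset.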
Buffering also means a shipper can process data and send it to Elasticsearch and Logsene in bulk, which supports higher throughput. Once again, logging libraries may have this functionality, too (only tightly integrated into your app), but most will just process logs one by one.
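As a rough illustration of bulk sending, the sketch below batches events and builds a request body in the newline-delimited format used by the Elasticsearch `_bulk` API (an action line followed by the document, each terminated by a newline). The `BulkBuffer` class, the `send` callback, the batch size, and the index name `logs` are all assumptions made for the example; the actual HTTP call is left to the caller.

```python
import json


def to_bulk_body(docs, index="logs"):
    """Build an NDJSON body for the Elasticsearch _bulk API."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))  # action line
        lines.append(json.dumps(doc))                            # document line
    return "\n".join(lines) + "\n"  # the body must end with a newline


class BulkBuffer:
    """Collect events in memory and hand them to `send` in batches."""

    def __init__(self, send, max_docs=500):
        self.send = send          # hypothetical callback that performs the POST
        self.max_docs = max_docs
        self.pending = []

    def add(self, doc):
        self.pending.append(doc)
        if len(self.pending) >= self.max_docs:
            self.flush()

    def flush(self):
        if not self.pending:
            return
        self.send(to_bulk_body(self.pending))
        # Cleared only after send() returns, so a raised error keeps the batch.
        self.pending = []
```

One request carrying hundreds of events amortizes connection and indexing overhead that would otherwise be paid per log line. A production shipper would also flush on a timer so that a quiet period does not leave events stuck in memory.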
Unlike most logging libraries, log shippers can often do additional processing, such as pulling the host name or tagging IPs with geo-information.
Logging to multiple destinations (e.g., local file plus Logsene) is normally easier with a shipper.
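Enrichment of this kind amounts to adding fields to each event before it is sent. The sketch below covers only the host name case, using the standard library; the `enrich` function and the `host` field name are hypothetical, and geo-tagging is omitted because it depends on an external IP database.

```python
import socket


def enrich(event, hostname=None):
    """Return a copy of `event` with a `host` field added if it is missing."""
    enriched = dict(event)  # copy, so the caller's event is not modified
    enriched.setdefault("host", hostname or socket.gethostname())
    return enriched
```

For example, `enrich({"message": "disk full"}, hostname="web-1")` returns `{"message": "disk full", "host": "web-1"}`, while an event that already carries a `host` value is left as it is.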
You can always change your log shipper to one that suits your use-case better. Changing the library you use for logging may be more involved.
Design-wise, the difference between the two approaches is simply tight versus loose coupling, but the way most libraries and shippers are actually implemented is more likely to influence your decision on sending data to Elasticsearch or Logsene.
Logging directly from the library might make sense for development; it’s easier to set up, especially if you’re not (yet) familiar with a log shipper. In production, you’ll likely want to use one of the available log shippers, mostly because of buffers. Blocking the application or dropping data (immediately) are often non-options in a production deployment.
If logging isn’t critical to your environment (i.e., you can tolerate the occasional loss of data), you may want to fire-and-forget your logs to Logsene’s UDP syslog endpoint. This takes reliability out of the equation, meaning you can use a shipper if you need enriching or support for other destinations, or a library if you just want to send the raw logs (which may well be JSON).
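For the library side of that fire-and-forget option, here is a minimal sketch using Python's standard `SysLogHandler` over UDP. The host and port in the usage comment are placeholders, not a real Logsene endpoint, and `build_udp_logger` is a hypothetical helper name.

```python
import logging
import logging.handlers
import socket


def build_udp_logger(host, port, name="app"):
    """Return a logger that sends each record as a UDP syslog datagram."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.handlers.SysLogHandler(
        address=(host, port),
        socktype=socket.SOCK_DGRAM,  # UDP: no connection, no delivery guarantee
    )
    # Prefix the message with the logger name, syslog-tag style.
    handler.setFormatter(logging.Formatter("%(name)s: %(message)s"))
    logger.addHandler(handler)
    return logger


# Usage (placeholder endpoint; substitute your own syslog receiver):
# logger = build_udp_logger("logs.example.com", 514)
# logger.info("user logged in")
```

Since UDP has no acknowledgements, a datagram lost on the network is never retried and the application is not told about it, which is exactly the trade-off described above.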