Defining Syslog: Daemons, Message Formats, and Protocols

Syslog is still used for a lot of the logging done today. Mostly because of its long history, syslog is quite a vague concept, referring to many things.

By Radu Gheorghe · Jan. 31, 17 · Opinion

Pretty much everyone’s heard about syslog. With its roots in the 80s, it’s still used for a lot of the logging done today. Mostly because of its long history, syslog is quite a vague concept, referring to many things. Which is why you’ve probably heard:

  • "Check syslog; maybe it says something about the problem," referring to /var/log/messages.
  • "Syslog doesn’t support messages longer than 1K," referring to message format restrictions.
  • "Syslog is unreliable," referring to the UDP protocol.

In this post, we’ll explain the different facets by being specific. Instead of saying “syslog,” you’ll read about syslog daemons, about syslog message formats, and about syslog protocols.

Note the plurals. There are multiple options for each. We’ll show the important ones here to shed some light on the vague (and surprisingly rich) concept. Along the way, we’ll debunk some of the myths surrounding syslog. For example, you can choose to limit messages to 1K and you can choose to send them via UDP, but you don’t have to. It’s not even a default in modern syslog daemons.

Syslog Daemons

A syslog daemon is a program that:

  • Can receive local syslog messages (traditionally /dev/log UNIX socket and kernel logs).
  • Can write them to a file (traditionally /var/log/messages or /var/log/syslog will receive everything, while some categories of messages go to specific files, like /var/log/mail).
  • Can forward them to the network or other destinations (traditionally via UDP; the daemon usually also implements the matching network listener, UDP in this case).

This is where "syslog" often refers to syslogd or sysklogd, the original BSD syslog daemon. Its development stopped on Linux in 2007 but continued for the BSDs and OS X. There are alternatives, most notably:

  • rsyslog. Originally a fork of syslogd, it still can be used as a drop-in replacement for it. Over the years, it evolved into a performance-oriented, multipurpose logging tool that can read data from multiple sources, parse and enrich logs in various ways, and ship to various destinations.

  • syslog-ng. Unlike rsyslog, it used a different configuration format from the start (rsyslog eventually came to the same conclusion, but still supports the BSD syslog config syntax as well, which can be confusing at times). You’d see a similar feature set to rsyslog, like parsing unstructured data and shipping it to Elasticsearch or Kafka. It’s still fast and light, and while it may not have the ultimate performance of rsyslog, it has better documentation and it’s more portable.

  • nxlog. Yet another syslog daemon which evolved into a multi-purpose log shipper, it sets itself apart by working well on Windows.

In essence, a modern syslog daemon is a log shipper that works with various syslog message formats and protocols. If you want to learn more about log shippers in general, we wrote a side-by-side comparison of Logstash and five other popular shippers, including rsyslog and syslog-ng.

Myths About Syslog Daemons

The one we come across most often is that syslog daemons are no good if you log to files or if you want to parse unstructured data. This used to be true years ago, but then so was Y2K. Things changed in the meantime. In the myth’s defense, some distributions ship with old versions of rsyslog and syslog-ng. Plus, the default configuration often only listens for /dev/log and kernel messages (it doesn’t need more), so it’s easy to generalize.

Syslog Message Formats

You’ll normally find syslog messages in two major formats:

  1. The original BSD format (RFC3164).
  2. The “new” format (RFC5424).

RFC3164 (“The Old Format”)

Although the “RFC” in its name suggests a standard, RFC3164 was more of a collection of what was found in the wild at the time (2001) than a spec that implementations would adhere to. As a result, you’ll find slight variations of it. That said, most messages will look like the RFC3164 example:

<34>Oct 11 22:14:15 mymachine su: 'su root' failed for lonvick on /dev/pts/8

This is how the application should log to /dev/log, and you can see some structure:

  • <34> is a priority number. It represents the facility number multiplied by 8, to which severity is added. In this case, facility=4 (Auth) and severity=2 (Critical).
  • Oct 11 22:14:15 is commonly known as the syslog timestamp. It lacks the year and the time zone, and it has no sub-second information. For those reasons, rsyslog also parses RFC3164-formatted messages that carry an ISO-8601 timestamp instead.
  • mymachine is the host name where the message was written.
  • su: is a tag. Typically this is the process name, sometimes with a PID (like su[1234]:). The tag typically ends in a colon, but it may end with just the square brackets or with a space.
  • The message (MSG) is everything after the tag. In this example, since we have the colon to separate the tag and the message, the message actually starts with a space. This tiny detail often causes a lot of headaches when parsing.
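To make the field breakdown above concrete, here is a minimal Python sketch that splits the RFC3164 example into its parts. The regular expression and the parse_rfc3164 name are our own illustration, not something the RFC or any daemon defines, and the pattern only covers the common layout, so the variations found in the wild will not all match.

```python
import re

# Rough pattern for the common RFC3164 layout: <PRI>TIMESTAMP HOST TAG[PID]:MSG
# Illustrative only: real-world messages vary more than this allows.
RFC3164_RE = re.compile(
    r"<(?P<pri>\d{1,3})>"
    r"(?P<timestamp>[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<tag>[^\s:\[]+)(?:\[(?P<pid>\d+)\])?:?"
    r"(?P<msg>.*)",
    re.DOTALL,
)


def parse_rfc3164(line):
    """Split one RFC3164-style line into a dict of fields."""
    match = RFC3164_RE.match(line)
    if match is None:
        raise ValueError("line is not in the expected RFC3164 layout")
    fields = match.groupdict()
    # PRI = facility * 8 + severity, so divmod recovers both numbers.
    fields["facility"], fields["severity"] = divmod(int(fields.pop("pri")), 8)
    return fields


example = "<34>Oct 11 22:14:15 mymachine su: 'su root' failed for lonvick on /dev/pts/8"
parsed = parse_rfc3164(example)
print(parsed)
```

Note that the msg field keeps its leading space, which is exactly the detail that tends to trip up parsers.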

In /var/log/messages, you’ll often see something like this:

Oct 11 22:14:15 su: 'su root' failed for lonvick on /dev/pts/8

This isn’t a syslog message format; it’s just how most syslog daemons write messages to files by default. Usually, you can choose how the output data looks. For example, rsyslog has templates.

RFC5424 (“The New Format”)

RFC5424 was published in 2009 to deal with the problems of RFC3164. First of all, it’s an actual standard that daemons and libraries chose to implement. Here’s an example message:

<34>1 2003-10-11T22:14:15.003Z mymachine.example.com su - - - 'su root' failed for lonvick on /dev/pts/8

Now, we get an ISO-8601 timestamp, amongst other improvements. We also get more structure; the three dashes you can see there are placeholders for the PID, the message ID, and any structured data you may have. That said, RFC5424 structured data never really took off, as people preferred to put JSON in the syslog message (whether it’s the old or the new format). Finally, the new format supports UTF-8 and other encodings, not only ASCII, and it’s easier to extend because it has a version number (in this example, the 1 after the priority number).
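As a rough illustration of the RFC5424 layout, the Python sketch below splits the example message into its header fields. The parse_rfc5424 function is a hypothetical helper of ours, and it deliberately assumes the structured data field is the nil value (a single dash), because bracketed structured data can contain spaces and needs a proper parser.

```python
from datetime import datetime


def parse_rfc5424(line):
    """Split an RFC5424 message whose STRUCTURED-DATA field is '-'."""
    # Header fields are space-separated; the eighth piece is the free-form MSG.
    pri_version, timestamp, host, app, procid, msgid, sd, msg = line.split(" ", 7)
    if sd != "-":
        raise ValueError("structured data present; this sketch only handles '-'")
    pri, version = pri_version.lstrip("<").split(">")
    facility, severity = divmod(int(pri), 8)

    def nil(value):
        # A single dash is the RFC5424 way of saying "no value".
        return None if value == "-" else value

    if timestamp == "-":
        when = None
    else:
        # Older Python versions do not accept a trailing "Z" in fromisoformat().
        when = datetime.fromisoformat(timestamp.replace("Z", "+00:00"))
    return {
        "facility": facility,
        "severity": severity,
        "version": int(version),
        "timestamp": when,
        "host": nil(host),
        "app": nil(app),
        "procid": nil(procid),
        "msgid": nil(msgid),
        "msg": msg,
    }


example = ("<34>1 2003-10-11T22:14:15.003Z mymachine.example.com su - - - "
           "'su root' failed for lonvick on /dev/pts/8")
parsed = parse_rfc5424(example)
print(parsed)
```

Unlike the old format, the timestamp here comes back with a year, a time zone, and milliseconds.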

Myths Around the Syslog Message Formats

The ones we see most often are:

  • You can’t send syslog messages over 1K. It is true that RFC3164 stated that messages shouldn’t go over 1K, but modern daemons don’t respect that – the message limit is configurable in both rsyslog and syslog-ng. RFC5424 made this official: it defines no upper limit on message length and leaves the limit to the transport.
  • Timestamps aren’t exact. That is true for RFC3164 timestamps, but not for the RFC5424 ones.
  • There’s no structure beyond predefined fields. In fact, any modern syslog daemon will happily parse JSON from the message field. They can parse any kind of message format (structured or not).
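To show what putting JSON in the message field amounts to, here is a small Python sketch of the receiving side. The extract_json helper and the sample payload are invented for illustration; actual daemons do this with their own parser modules and configuration.

```python
import json


def extract_json(msg):
    """Return the JSON object carried in a syslog MSG field, or None."""
    text = msg.strip()  # the MSG often starts with a space after the tag
    if not text.startswith("{"):
        return None
    try:
        value = json.loads(text)
    except json.JSONDecodeError:
        return None
    return value if isinstance(value, dict) else None


# Hypothetical MSG fields, as they might appear after the tag.
structured = extract_json(' {"user": "lonvick", "action": "su root", "ok": false}')
unstructured = extract_json(" 'su root' failed for lonvick on /dev/pts/8")
print(structured, unstructured)
```

Messages that are not JSON simply fall through as plain text, so structured and unstructured logs can share the same stream.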

Syslog Protocols

Originally, syslog messages were sent over the wire via UDP – which was also mentioned in RFC3164. It was later standardized in RFC5426, after the new message format (RFC5424) was published.
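For a feel of how simple UDP syslog is, the Python sketch below sends one message as a single datagram. The send_udp_syslog function is our own name, and the receiver is a plain local socket standing in for a daemon's UDP listener so that the example runs on its own.

```python
import socket


def send_udp_syslog(message, host, port):
    """Send one syslog message as a single UDP datagram (fire and forget)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode("utf-8"), (host, port))


# Stand-in for a syslog daemon's UDP listener, bound to a free local port.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)
port = receiver.getsockname()[1]

send_udp_syslog(
    "<34>Oct 11 22:14:15 mymachine su: 'su root' failed for lonvick on /dev/pts/8",
    "127.0.0.1",
    port,
)
data, _ = receiver.recvfrom(65535)
receiver.close()
received = data.decode("utf-8")
print(received)
```

The sender gets no confirmation that anything arrived, which is where the "syslog is unreliable" reputation comes from. In real applications, Python's standard logging.handlers.SysLogHandler covers this sending side.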

Modern syslog daemons support other protocols as well. Most notably:

  • TCP. Just like UDP, it was first used in the wild and then documented. That documentation finally came with RFC6587, which describes two flavors:
    1. Messages are delimited by a trailer character, typically a newline.
    2. Messages are framed based on an octet count.
  • TLS. Standardized in RFC5425, which allows for encryption and certificate-based authorization.
  • RELP. Unlike plain TCP, RELP adds application-level acknowledgments, which provides at-least-once guarantees on delivering messages. You can also get RELP with TLS if you need encryption and authorization.
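The two TCP framing flavors can be sketched in a few lines of Python. The function names below are ours, and the code works on complete byte strings rather than a live socket, so it ignores partial reads and other details a real receiver has to handle.

```python
def frame_newline(message):
    """Flavor 1: terminate each message with a trailer character (newline)."""
    return message + b"\n"


def frame_octet_count(message):
    """Flavor 2: prefix each message with its length in octets and a space."""
    return str(len(message)).encode("ascii") + b" " + message


def split_newline(stream):
    """Recover messages from a newline-delimited byte stream."""
    return [part for part in stream.split(b"\n") if part]


def split_octet_count(stream):
    """Recover messages from an octet-counted byte stream."""
    messages = []
    pos = 0
    while pos < len(stream):
        space = stream.index(b" ", pos)
        length = int(stream[pos:space])
        start = space + 1
        messages.append(stream[start:start + length])
        pos = start + length
    return messages


messages = [
    b"<34>1 2003-10-11T22:14:15.003Z mymachine.example.com su - - - first message",
    b"<34>1 2003-10-11T22:14:16.003Z mymachine.example.com su - - - second message",
]
newline_stream = b"".join(frame_newline(m) for m in messages)
octet_stream = b"".join(frame_octet_count(m) for m in messages)
print(split_newline(newline_stream) == messages)
print(split_octet_count(octet_stream) == messages)
```

With octet counting, the receiver never inspects the message content to find its end, so any byte, including a newline, can appear inside a message.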

Besides writing to files and communicating with each other, modern syslog daemons can also write to other destinations, for example datastores like MySQL or Elasticsearch, or queue systems such as Kafka and RabbitMQ. Each such destination often comes with its own protocol and message format. For example, Elasticsearch uses JSON over HTTP (though you can also secure it and send syslog messages over HTTPS).

Myths Around Syslog Protocols

The ones we hear most come from the assumption that UDP is the only option, implying there’s no reliability, authorization or encryption.

The other frequent one is that you can’t send multiline messages, like stack traces. This is only true for TCP syslog if newlines are used for delimiting. Then, a stack trace will end up as multiple messages at the destination, unless its newlines are escaped at the source and reverted at the destination. With UDP, multiline logs work out of the box, because you have one message per datagram. Other protocols (TLS, RELP, and octet-counted TCP) also handle multiline logs well, by framing messages.
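Here is a minimal Python sketch of the escape-and-revert idea for newline-delimited TCP. The backslash-based scheme and both function names are assumptions made for this example; real daemons have their own options for escaping control characters, and the exact scheme has to match on both ends.

```python
def escape_newlines(message):
    """Source side: make the message fit on one line, reversibly."""
    # Escape backslashes first so a literal "\n" in the text stays distinguishable.
    return message.replace("\\", "\\\\").replace("\n", "\\n")


def unescape_newlines(message):
    """Destination side: revert escape_newlines()."""
    out = []
    chars = iter(message)
    for ch in chars:
        if ch == "\\":
            following = next(chars, "")
            out.append("\n" if following == "n" else following)
        else:
            out.append(ch)
    return "".join(out)


trace = ('Traceback (most recent call last):\n'
         '  File "app.py", line 1, in <module>\n'
         'ValueError: bad path C:\\temp')

# Sent as-is over newline-delimited TCP, the trace arrives as three messages.
naive_pieces = trace.split("\n")

# Escaped, it travels as one line and can be restored at the destination.
escaped = escape_newlines(trace)
restored = unescape_newlines(escaped)
print(len(naive_pieces), "\n" in escaped, restored == trace)
```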

Published at DZone with permission of Radu Gheorghe. See the original article here.
