Implementing Structured Logging in Groovy

Walk through some example code for meeting requirements when it comes to structure logging, business analytics, system monitoring, and diagnostics.

Nick Southwell

May. 23, 17 · Tutorial

Likes (2)

Comment

Save

6.9K Views

In my last blog post, I introduced structured logging and how it can help with business analytics, system monitoring, and diagnostics. In this post, I’m going to walk through example code for meeting the requirements outlined in the last post. Since we are responsible software craftsmen, there’s also some sample code for testing the logging events required for production.

Configuring Logback to Write Logging Events in JSON

The logstash-logback-encoder will write logging events in JSON format for Logback. The fields and names produced are highly configurable. The documentation describes how to include it in Maven projects. This is what you need if you are using Gradle:

In build.gradle:

dependencies {
…
compile 'net.logstash.logback:logstash-logback-encoder:4.7'
…
}

You will also need to modify logback.groovy to use the new encoder (there are other options, see the documentation) wherever an encoder is specified. Logback includes support for a Groovy DSL to configure logging. That is what is being used here. There are two examples below. The first is easier to use but does not support adding structured arguments to the JSON output. The second example gives much better control of the resulting JSON and also includes adding structured arguments to the JSON as well as to the message.

In logback.groovy using the LogstashEncoder:

…
encoder(LogstashEncoder) {
  timeZone = 'UTC'
  shortenedLoggerNameLength = 35
  customFields = toJson(pid:"${new ApplicationPid()}", app:"XYZ")
  fieldNames(LogstashFieldNames) {
    timestamp = '@time'
    version = '[ignore]'
    message = 'msg'
    logger = 'logger'
    thread = 'thread'
    levelValue = '[ignore]'
  }
  stackTrace(StackTraceJsonProvider) {
    throwableConverter(ShortenedThrowableConverter) {
      maxDepthPerThrowable = 20
      maxLength = 8192
      shortenedClassNameLength = 35
      exclude = /sun\..*/
      exclude = /java\..*/
      exclude = /groovy\..*/
      exclude = /com\.sun\..*/
      rootCauseFirst = true
    }
  }
}
…

In logback.groovy using the LoggingEventCompositeJsonEncoder:

…
encoder(LoggingEventCompositeJsonEncoder) {
  providers(LoggingEventJsonProviders) {
    timestamp(LoggingEventFormattedTimestampJsonProvider) {
      fieldName = '@time'
      timeZone = 'UTC'
      pattern = 'yyyy-MM-dd HH:mm:ss.SSS'
    }
    logLevel(LogLevelJsonProvider)
    loggerName(LoggerNameJsonProvider) {
      fieldName = 'logger'
      shortenedLoggerNameLength = 35
    }
    message(MessageJsonProvider) {
      fieldName = 'msg'
    }
    globalCustomFields(GlobalCustomFieldsJsonProvider) {
      customFields = "${toJson(pid:"${new ApplicationPid()}", app:"XYZ")}"
    }
    threadName(ThreadNameJsonProvider) {
      fieldName = 'thread'
    }
    mdc(MdcJsonProvider)
    arguments(ArgumentsJsonProvider)
    stackTrace(StackTraceJsonProvider) {
      throwableConverter(ShortenedThrowableConverter) {
        maxDepthPerThrowable = 20
        maxLength = 8192
        shortenedClassNameLength = 35
        exclude = /sun\..*/
        exclude = /java\..*/
        exclude = /groovy\..*/
        exclude = /com\.sun\..*/
        rootCauseFirst = true
      }
    }
  }
}

These configurations do the following:

Changes the name of the timestamp field from @timestamp to @time.
Set the timezone to UTC instead of using the local timezone.
Removes the encoder version number from the output.
Only logs the level (INFO, EBUG…) but not the level value.
Shortens the logger name to at most 35 characters (by abbreviating the packages).
Changes the name of the message field from message to msg.
Adds custom fields pid and app to all logging events
Changes the name of the logger field from logger_name to logger.
Changes the name of the thread field from thread_name to thread.
Adds entries from the MDC map to the log event.
Configures the stack trace to log the root cause first and remove most of the stack frames that are not from our own code.

The LoggingEventCompositeJsonEncoder configuration also supports adding structured arguments to the JSON output (as well as to the message).

Groovy Support for Logging

Groovy provides some AST transformations to support logging that can be enabled using annotations. These all work in a similar way:

Adds a static final log field for the class.
Wraps each call to the logger in the appropriate isLevelEnabled guard.

This example uses the @Slf4j annotation:

@Slf4j
class Foo {
  void bar() {
    log.debug ‘Hello, World!’
  }
}

After compilation, this is equivalent to:

import org.slf4j.LoggerFactory
import org.slf4j.Logger


class Foo {
  private static final Logger log = LoggerFactory.getLogger(Foo)
  void bar() {
    if (log.isDebugEnabled()) {
      log.debug ‘Hello, World!’
    }
  }
}

If you use these annotations, then there is no need to worry about the expense of constructing arguments for the logger call when that log level is not enabled.

Adding Context to Logging Events

We need a way to propagate custom tokens through the system and add them to our logging events. There are at least a couple of way of doing this. The simplest one is to use the Mapped Diagnostic Context (MDC). A more laborious method is to make a map of values available for inclusion whenever a logging event is created. Of course, there’s no reason not to use a combination of both methods if that makes sense for your project.

Using the Mapped Diagnostic Context (MDC)

The Mapped Diagnostic Context (MDC) is a feature of both the Logback and log4j libraries — and slf4j which can use either implementation. The MDC is a set of key-value pairs that lives in thread-local storage. The logging system can be configured so that values from the MDC can be automatically added to all logging events.

This example shows how convenient this can be for web requests: you just add a filter that adds and removes value(s) for the life of each request. The example accesses spring security to add a userName value to every logging event made while the request is processed:

class MDCFilter implements Filter {


  ...


  @Override
  void doFilter(ServletRequest req, ServletResponse resp, FilterChain chain) {
    Authentication auth = SecurityContextHolder.context.authentication
    if (auth) {
      MDC.put('userName', auth.name)
    }
    try {
      chain.doFilter(req, resp)
    } finally {
      if (auth) {
        MDC.remove('userName')
      }
    }
  }
}

Using Parameterized Logging/Structured Arguments

Parameterized logging looks like this (example from the Logback documentation):

log.debug 'The new entry is {}. It replaces {}.', entry, oldEntry

The {}s serve as placeholders for argument substitution into the message. It turns out that these extra arguments get stored in the logging event, even if the placeholders are missing from the message body. The Logstash encoder can access these extra arguments and add them to the JSON for the event. The authors of the encoder refer to these extra arguments as structured arguments and provide a number of helper methods for constructing them.

This example adds a map using structured arguments. See the documentation for further examples.

log.info 'my message', entries(name:'Ford', age:42)

What About Testing?

If your production system is going to be logging at INFO, then logging events at INFO and above should be specified by the customer. It follows that you should write tests for these logging events.

Mocking out the logger is problematic, though, because the logger instance is static and not easily mocked without resorting to metaclass programming. Luckily, there’s another approach that works nicely. Instead of mocking the logger, add a new appender for the life of the test. Here’s an example Groovy trait that does that using Mockito to create a mock appender, assuming slf4j with a Logback implementation:

import ch.qos.logback.classic.Level
import ch.qos.logback.classic.Logger
import ch.qos.logback.classic.spi.LoggingEvent
import ch.qos.logback.core.Appender
import org.junit.After
import org.junit.Before
import org.mockito.ArgumentCaptor
import org.mockito.exceptions.verification.WantedButNotInvoked
import org.slf4j.LoggerFactory


import static org.mockito.ArgumentCaptor.forClass
import static org.mockito.Mockito.atLeastOnce
import static org.mockito.Mockito.mock
import static org.mockito.Mockito.verify
import static org.slf4j.Logger.ROOT_LOGGER_NAME


trait LoggingEventCapture {
    Appender mockAppender = mock(Appender)
    ArgumentCaptor loggingEventCaptor = forClass(LoggingEvent)


    @Before
    void setUpLogEventCapture() {
        Logger logger = (Logger) LoggerFactory.getLogger(ROOT_LOGGER_NAME)
        logger.addAppender(mockAppender)
    }


    @After
    void tearDownLogEventCapture() {
        Logger logger = (Logger) LoggerFactory.getLogger(ROOT_LOGGER_NAME)
        logger.detachAppender(mockAppender)
    }


    List getLoggingEvents() {
        try {
            verify(mockAppender, atLeastOnce())
                    .doAppend(loggingEventCaptor.capture())
            loggingEventCaptor.getAllValues().findAll {
                it.level.levelInt >= Level.INFO.levelInt
            }
        }
        catch (WantedButNotInvoked ignored) {
            []
        }
    }
}

Some Best Practices for Logging in Groovy

Don’t log and throw: this leads to logging multiple stack traces for the same error.
Log in a machine-readable format — name/value pairs or JSON.
Develop and use log level/category standards.
Use MDC for tokens that track business operations.
Log to a local file and forward elsewhere.
Rotate local logs.
Log application configuration file content on startup.
Use the Groovy @Slf4j annotation to automatically wrap logging statements in guard clauses.

In Closing

I remember working on a project that was effectively an OLTP system. Once it had been in production for a while, the marketing team started asking for information about how our customers were using the system in an effort to demonstrate that our system was superior to our competitors. We were unable to supply that information. It simply wasn’t possible to correlate the beginning and end of a user interaction and what had happened inside it without a significant redesign of the software. Structured logging could have solved this problem by injecting a token to identify the user into each transaction. Logging analysis tools could then have constructed the data that the marketing team wanted. Unfortunately, at the time structured logging was just not on our radar as a solution to the problem.

On a later project, we laboriously added name-value pairs to every single logging statement in the system. It was a lot of development overhead, but it paid off in greatly reduced support time once we had educated the operations team in the use of Splunk to run queries against the logs. This experience convinced me of the value of structured logging — even though the implementation was painful. I am now convinced that logging each event in JSON is an even better solution than name-value pairs. Making use of the MDC feature in logging frameworks greatly simplifies the process of adding logging statements that support structured logging to the software. In most cases, you don’t have to worry about adding all the context attributes; the logging framework just grabs them from the MDC for you.

Groovy (programming language) Event JSON

Published at DZone with permission of Nick Southwell. See the original article here.

Opinions expressed by DZone contributors are their own.

Related

Trending