Tracing in Distributed Systems With Spring Cloud Sleuth

Learn how tracing with Spring Cloud Sleuth can solve the various difficulties with traceability in distributed systems like microservices.

By Tom Hombergs · Sep. 19, 17 · Tutorial

In a distributed system, potentially many services are involved in creating a response to a single request. It's essential, and not only for debugging purposes, that the path of such a request can be traced through all involved services. This article gives an overview of the traceability problem in distributed systems and provides some tips on how to implement tracing with Spring Cloud Sleuth.

Traceability in Distributed Systems

Even in a monolithic system, tracing a bug can be hard enough. To find the root cause of an error, you search through the log files of the application servers around the point in time the error occurred and hope to find a stack trace that explains the error. Ideally, the error message contains a correlation ID that uniquely identifies the error, so that you can simply search for that correlation ID in the log files. It's another plus when the log files are aggregated in a central, searchable log server like Graylog.

In a distributed system, tracing gets even harder since many different services running on different machines may be involved in responding to a single request. Here, a central log server and a correlation ID are non-negotiable. But the correlation ID now spans different services and thus has to be shared across service boundaries.

As an example for this article, let's have a look at a distributed system with three services (see figure below). As the result of a getCustomerWithAddress request, we expect customer and address data as a response. The customer and address data each live in the responsibility of a different service, so the initial service (the "downstream" service) has to forward the request to those ("upstream") services and aggregate their responses.

[Figure: the downstream service forwards the request to two upstream services and aggregates their responses]

If an error occurs in one of those upstream services, the upstream service will probably log the error. Since the downstream service receives an error response, it will probably log an error as well. For a straightforward error analysis, both log entries should be connected by a shared correlation ID.

This correlation ID is usually called a trace ID. A trace ID is created when a request from outside of the system is directed at a downstream service. Within the system, the trace ID is then passed on to the upstream services so they can use it in their log entries.

Implement Tracing With Spring Cloud Sleuth

Spring Cloud Sleuth is a library that supports implementing such a trace ID. Sleuth is part of the Spring Cloud project, which provides solutions for cloud systems (which are distributed systems by definition). The following sections describe how to activate Sleuth in a Spring Boot application and provide a couple of getting-started tips.

If you would rather look at code than read words, most of the code examples in this article can be found in these GitHub repositories: downstream-service and upstream-service.

Activate Spring Cloud Sleuth

First of all, Spring Cloud Sleuth has to be added as a dependency in your build management tool. For Gradle:

compile('org.springframework.cloud:spring-cloud-starter-sleuth:1.2.4.RELEASE')
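If you build with Maven instead, the same artifact (identical coordinates to the Gradle snippet above) can be declared like this:

<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-sleuth</artifactId>
    <version>1.2.4.RELEASE</version>
</dependency>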

After this, Sleuth is activated by default and creates a trace ID for each incoming request. If the header x-b3-traceid of the incoming request already contains a trace ID, the existing trace ID will be used. In outgoing requests, the trace ID is put into the header so that the next upstream service can read it.

The name of the header field consists of the prefix x, which is used for custom headers that are not part of the HTTP specification, and b3 for "BigBrotherBird," which was once the name of the popular tracing tool Zipkin. Sleuth borrows its vocabulary from Zipkin.
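A side note on outgoing requests: in this 1.x version of Sleuth, HTTP clients like RestTemplate are only instrumented if they are Spring beans, since Sleuth injects its interceptors into the bean. A minimal sketch, assuming the services call each other via RestTemplate (the configuration class name is made up for illustration):

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.web.client.RestTemplate;

@Configuration
public class RestTemplateConfiguration {

  // Declared as a bean (instead of "new RestTemplate()" at the call site)
  // so that Sleuth's auto-configuration can add the interceptor that puts
  // the x-b3-traceid header into outgoing requests.
  @Bean
  public RestTemplate restTemplate() {
    return new RestTemplate();
  }

}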

Writing the Trace ID Into a Log

Now that a trace ID is available, it should be included in our log output. Sleuth supports this by writing the trace ID into the Mapped Diagnostic Context (MDC) of SLF4J. This way, it can be accessed in log pattern configurations using the name X-B3-TraceId. This is an example pattern for a Logback configuration:

%d{yyyy-MM-dd HH:mm:ss.SSS} %5p [%X{X-B3-TraceId:-}] %m%n
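If you don't want to maintain a full Logback configuration file yet, a lightweight alternative, assuming Spring Boot's default logging setup, is to set the same pattern via the logging.pattern.console property in application.properties:

logging.pattern.console=%d{yyyy-MM-dd HH:mm:ss.SSS} %5p [%X{X-B3-TraceId:-}] %m%n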

The log pattern above creates the log output below. The first row comes from the downstream service; the other rows each come from an upstream service.

2017-09-03 15:31:29.189  INFO [903c472a08e5cda0] COLLECTING CUSTOMER AND ADDRESS WITH ID 1 FROM UPSTREAM SERVICE
2017-09-03 15:31:29.193  INFO [903c472a08e5cda0] GETTING CUSTOMER WITH ID 1
2017-09-03 15:31:29.198  INFO [903c472a08e5cda0] GETTING ADDRESS FOR CUSTOMER WITH ID 1

Output the Service Name

The log output so far does not say which service it comes from. To add this information to the logs, you can use the logging features of Spring Boot. Spring Boot supports using environment properties in its logging configuration. If you use Logback, you simply create a file named logback-spring.xml and can then use <springProperty/> to access properties like spring.application.name:

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <include resource="org/springframework/boot/logging/logback/defaults.xml"/>
    <springProperty scope="context" name="springAppName" source="spring.application.name"/>
    <property name="CONSOLE_LOG_PATTERN"
              value="%d{yyyy-MM-dd HH:mm:ss.SSS} %5p [${springAppName},%X{X-B3-TraceId:-}] %m%n"/>
    <!-- appenders omitted ... -->
</configuration>

These properties are loaded from the Spring Boot configuration, which usually lies in the file application.properties. Since there can be log output before application.properties is loaded, Spring Cloud supports another file called bootstrap.properties, which is loaded earlier. All properties that are needed in log output should therefore be put into this file instead.
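For the downstream service from this article's example, bootstrap.properties would then contain at least the application name (the value matches the service name visible in the log output below):

spring.application.name=sleuth-downstream-service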

Log output including the service name then looks like this:

2017-09-03 15:31:29.189  INFO [sleuth-downstream-service,903c472a08e5cda0] COLLECTING CUSTOMER AND ADDRESS WITH ID 1 FROM UPSTREAM SERVICE
2017-09-03 15:31:29.193  INFO [sleuth-upstream-service,903c472a08e5cda0] GETTING CUSTOMER WITH ID 1
2017-09-03 15:31:29.198  INFO [sleuth-upstream-service,903c472a08e5cda0] GETTING ADDRESS FOR CUSTOMER WITH ID 1

Output the Trace ID in Exceptions

Exception logs should include the trace ID, since tracing errors is a main reason for introducing the trace ID in the first place. Sadly, exceptions that bubble up out of Spring Boot are not logged with the trace ID, since they are handled not by Spring Boot but by the application server.

However, this problem is easily solved by creating a custom exception handler that runs within the Spring Boot context. A custom exception handler is usually a good idea in any case. A simple exception handler in Spring MVC looks like this:

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.http.HttpStatus;
import org.springframework.web.bind.annotation.ControllerAdvice;
import org.springframework.web.bind.annotation.ExceptionHandler;
import org.springframework.web.bind.annotation.ResponseBody;
import org.springframework.web.bind.annotation.ResponseStatus;

@ControllerAdvice
public class ControllerExceptionHandler {

  private Logger logger = LoggerFactory.getLogger(ControllerExceptionHandler.class);

  @ResponseStatus(value = HttpStatus.INTERNAL_SERVER_ERROR)
  @ExceptionHandler(Exception.class)
  @ResponseBody
  public String handleInternalError(Exception e) {
    logger.error("internal server error", e);
    return "Internal Server Error";
  }

}

Pass the Trace ID to the Client

In some cases, it may be necessary to show the trace ID to the user so that they can provide it to a support hotline in case of an error. The support engineer can then search the log files for that trace ID to find the root cause. Since the trace ID is stored in the MDC, it can be read from there at any time and passed on to the client in case of an error. An adapted exception handler that does just that may look like this:

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.slf4j.MDC;
import org.springframework.http.HttpStatus;
import org.springframework.web.bind.annotation.ControllerAdvice;
import org.springframework.web.bind.annotation.ExceptionHandler;
import org.springframework.web.bind.annotation.ResponseBody;
import org.springframework.web.bind.annotation.ResponseStatus;

@ControllerAdvice
public class ErrorHandler {

  private Logger logger = LoggerFactory.getLogger(ErrorHandler.class);

  @ResponseStatus(value = HttpStatus.INTERNAL_SERVER_ERROR)
  @ExceptionHandler(Exception.class)
  @ResponseBody
  public String handleInternalError(Exception e) {
    logger.error("internal server error", e);
    // Read the trace ID from SLF4J's MDC, where Sleuth stored it.
    return String.format("Internal Server Error (traceId: %s)", MDC.get("X-B3-TraceId"));
  }

}

Instead of passing the trace ID as part of an error message, you can, of course, create a structured JSON response that includes the trace ID as a separate field for better processing.
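A minimal sketch of such a structured response, assuming a hand-rolled DTO (the class ErrorResponse and its field names are made up for illustration; Spring serializes the returned object to JSON via @ResponseBody):

// Hypothetical DTO for structured error responses.
public class ErrorResponse {

  private final String message;
  private final String traceId;

  public ErrorResponse(String message, String traceId) {
    this.message = message;
    this.traceId = traceId;
  }

  public String getMessage() { return message; }
  public String getTraceId() { return traceId; }
}

The handler method would then return the DTO instead of a plain string:

  @ResponseStatus(value = HttpStatus.INTERNAL_SERVER_ERROR)
  @ExceptionHandler(Exception.class)
  @ResponseBody
  public ErrorResponse handleInternalError(Exception e) {
    logger.error("internal server error", e);
    return new ErrorResponse("Internal Server Error", MDC.get("X-B3-TraceId"));
  }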

Send the Trace ID to a Central Log Server

When using a log server like Graylog, it is a good idea to send the trace ID to that log server not only as part of the log message string but also as a separate, searchable, and indexable field. A common format for passing log events to a log server is the Graylog Extended Log Format (GELF). The following Logback configuration uses the library me.moocar:logback-gelf to transform log events into GELF and send them to the log server:

<configuration>
    <include resource="org/springframework/boot/logging/logback/defaults.xml"/>
    <springProperty scope="context" name="springAppName" source="spring.application.name"/>
    <appender name="GELF" class="me.moocar.logbackgelf.GelfUDPAppender">
        <remoteHost>graylog.host</remoteHost>
        <port>12201</port>
        <encoder class="ch.qos.logback.core.encoder.LayoutWrappingEncoder">
            <layout class="me.moocar.logbackgelf.GelfLayout">
                <includeFullMDC>true</includeFullMDC>
                <staticField class="me.moocar.logbackgelf.Field">
                  <key>serviceName</key>
                  <value>${springAppName}</value>
                </staticField>
            </layout>
        </encoder>
    </appender>

    <root level="debug">
        <appender-ref ref="GELF" />
    </root>
</configuration>

The parameter includeFullMDC includes all MDC fields (including our trace ID) as separate fields in the GELF data structure so that the log server can store and index them in a structured manner.

The above configuration again uses the logging features provided by Spring Boot to send the value of the property spring.application.name to the log server as a separate field named serviceName.
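For the GELF appender to work, the me.moocar:logback-gelf library has to be on the runtime classpath as well, e.g. in Gradle notation (the version here is an assumption; check the project page for the current release):

compile('me.moocar:logback-gelf:0.12')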

Analyze Traces

If trace IDs in the log files are not enough, you can perform a more sophisticated trace analysis using Zipkin. Zipkin is an application that collects tracing data and displays it in a web UI, including, among many other things, the duration of the single steps between services.

Sleuth sends its tracing data to Zipkin by default if the following dependency is added to your project (Gradle notation):

compile('org.springframework.cloud:spring-cloud-starter-zipkin')

Additionally, the following property must be set in application.properties to point at the Zipkin server:

spring.zipkin.baseUrl=http://localhost:9411/
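If you don't have a Zipkin server running yet, a quick way to start one locally is the official Docker image (assuming Docker is available); the UI is then reachable at the baseUrl configured above:

docker run -d -p 9411:9411 openzipkin/zipkin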

Wrap Up

This article has shown that a trace-supporting logging configuration with Spring Boot, Logback, and Sleuth is not that much work if you know where to look. Even sending the tracing data to a central log server in a structured fashion is easily possible, and it's a must-have for distributed systems.

Example Projects

Projects containing the downstream and upstream services from this article's example can be found here: downstream-service and upstream-service. Both projects contain a README that explains how to test the application.


Published at DZone with permission of Tom Hombergs. See the original article here.
