Spring Boot App Distributed Tracing on Steroids With OpenJDK Flight Recorder
This tutorial shows how to set up distributed tracing and service analysis with Java Mission Control/Flight Recorder.
At Oracle CodeONE 2018, Marcus Hirt gave a very interesting presentation called "Diagnose Your Microservices: OpenTracing/Oracle Application Performance Monitoring Cloud [DEV5435]." This presentation gave the audience a quick, intensive inside view of large-scale cloud application tracing.
Although his talk was not recorded, he created a pretty nice example, Robotshop, which he open-sourced and described in a blog post. His post touched on Application Performance Management (APM) solutions that were formed before the OpenTracing standard was born.
All this intensive work crystallized into the previously mentioned OpenTracing standard, but what is it? As the website says, it's a vendor-neutral API and instrumentation standard for distributed tracing. Companies and projects like Datadog, SkyWalking, Jaeger, and LightStep are actively contributing to this standard, which is great news; it shows that the topic is pretty hot and worth thinking about.
Developers worldwide use the Spring Boot library stack to create Spring-based applications for different purposes, such as building microservices that communicate over HTTP.
In this post, you will learn how to connect some favorite tracers (Jaeger, Zipkin) with your project and how to connect Java Mission Control/Flight Recorder to the application for very deep service analysis. The post also stresses the question of how valuable it is to think about application logging strategies.
The source code in this post is available on my GitHub account, branch: [vehiclefactory-jfr-tracer].
Introduction
Img.1.
On my blog, I recently published a post where I show how to configure a Spring Boot project using OpenTracing, and later, I discussed Spring possibilities in this field. The first post was just a simple producer-consumer pattern-based example. In the following post, we created a VehicleShop project with four independent microservices:
VehicleShopService: represents a shop for selling and upgrading vehicles.
FactoryService: the shop service sends requests to the factory to produce a new Vehicle or Vehicle elements. The Factory delivers the desired entity.
StorageService: the Factory stores elements in the storage for later production or for providing them as upgrades. The Shop service may also send a request to check an element's availability.
CustomerService: represents customer behavior with a set of schedulers.
Img.2.
The example project uses the Gradle build system. It contains all services and you can start each of them separately in Docker, or you can configure the docker-compose file.
Monitoring is about "extracting meaningful stories" from the defined Metrics (four golden signals, RED method, etc.) or Logs (app events, stack traces, etc.). Tracing, on the other hand, is about Spans and Span analysis. Tracing answers questions about an individual request.
What is a Span? The Span is the primary building block of distributed tracing. An individual Span represents the work done within the distributed system. How do we define a Span?
By the OpenTracing standard, a Span has the following properties:
An operation name
A start and finish timestamp
A set of key:value Span Tags
A SpanContext, which carries data across process boundaries
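The properties listed above can be pictured as a small data class. The following is a minimal sketch that uses only the JDK; `SimpleSpan` and its nested `Context` are hypothetical names invented for illustration and are not the `io.opentracing.Span` interface.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, simplified model of a Span; NOT the io.opentracing.Span interface.
final class SimpleSpan {
    // Hypothetical stand-in for a SpanContext: the identifiers that cross process boundaries.
    static final class Context {
        final String traceId;
        final String spanId;

        Context(String traceId, String spanId) {
            this.traceId = traceId;
            this.spanId = spanId;
        }
    }

    private final String operationName;
    private final Context context;
    private final Instant start;
    private Instant finish;
    private final Map<String, String> tags = new LinkedHashMap<>();

    SimpleSpan(String operationName, Context context, Instant start) {
        this.operationName = operationName;
        this.context = context;
        this.start = start;
    }

    // Adds one key:value tag and returns the span to allow chaining.
    SimpleSpan setTag(String key, String value) {
        tags.put(key, value);
        return this;
    }

    // Records the finish timestamp.
    void finish(Instant finishTime) {
        this.finish = finishTime;
    }

    // Time between start and finish; only defined once the span is finished.
    Duration duration() {
        if (finish == null) {
            throw new IllegalStateException("span is not finished yet");
        }
        return Duration.between(start, finish);
    }

    String operationName() { return operationName; }
    Context context() { return context; }
    Map<String, String> tags() { return tags; }

    public static void main(String[] args) {
        Instant start = Instant.parse("2018-11-01T10:00:00Z");
        SimpleSpan span = new SimpleSpan("buy-vehicle", new Context("trace-1", "span-1"), start)
                .setTag("http.method", "POST");
        span.finish(start.plusMillis(120));
        // prints: buy-vehicle took 120 ms, tags={http.method=POST}
        System.out.println(span.operationName() + " took "
                + span.duration().toMillis() + " ms, tags=" + span.tags());
    }
}
```

A real tracer fills in these values for you; the sketch only shows which pieces of data travel together in one Span.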
Having defined the term Span, we now define where a Span lives. Let's consider one individual Span and call it the "Active Span." This Span is responsible for the work accomplished by the surrounding code. The Active Span has one very important property: there can be only one Active Span inside a thread at time T. This property is managed by the Scope, which formalizes the activation and deactivation of a Span.
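The "one Active Span per thread" rule can be sketched with a plain `ThreadLocal`. This is a hypothetical illustration built only on the JDK (the class and method names are invented, and spans are reduced to their names); real tracers ship their own scope management.

```java
// Hypothetical sketch of scope management; real tracers ship their own implementation.
final class ActiveSpanSketch {
    // One slot per thread: at most one active span name per thread at any time.
    private static final ThreadLocal<String> ACTIVE = new ThreadLocal<>();

    // A Scope activates a span on creation and restores the previous one on close.
    static final class Scope implements AutoCloseable {
        private final String previous;

        Scope(String spanName) {
            this.previous = ACTIVE.get();
            ACTIVE.set(spanName);
        }

        @Override
        public void close() {
            if (previous == null) {
                ACTIVE.remove();
            } else {
                ACTIVE.set(previous);
            }
        }
    }

    static String activeSpan() {
        return ACTIVE.get();
    }

    public static void main(String[] args) {
        try (Scope outer = new Scope("sell-vehicle")) {
            System.out.println(activeSpan());     // sell-vehicle
            try (Scope inner = new Scope("check-storage")) {
                System.out.println(activeSpan()); // check-storage
            }
            System.out.println(activeSpan());     // sell-vehicle
        }
        System.out.println(activeSpan());         // null
    }
}
```

Using try-with-resources keeps activation and deactivation paired, so a nested Scope always hands control back to the Span that was active before it.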
The OpenTracing standard also defines the term Tracer. A Tracer is the interface whose implementation creates the Spans.
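In the OpenTracing Java API, the interface that creates Spans (`io.opentracing.Tracer`) is also responsible for injecting a SpanContext into a carrier, such as HTTP headers, and extracting it on the receiving side. The sketch below imitates that idea with JDK classes only; `MiniTracer`, the header names, and the ID format are assumptions made up for this example, not part of any real tracer.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical mini tracer; header names and ID format are invented for illustration.
final class MiniTracer {
    static final String TRACE_HEADER = "x-sketch-trace-id";
    static final String SPAN_HEADER = "x-sketch-span-id";

    // Hypothetical stand-in for a SpanContext.
    static final class Context {
        final String traceId;
        final String spanId;

        Context(String traceId, String spanId) {
            this.traceId = traceId;
            this.spanId = spanId;
        }
    }

    private static String randomId() {
        return Long.toHexString(ThreadLocalRandom.current().nextLong());
    }

    // Starts a new trace when parent is null, otherwise continues the parent's trace.
    static Context startSpan(Context parent) {
        String traceId = (parent == null) ? randomId() : parent.traceId;
        return new Context(traceId, randomId());
    }

    // Writes the context into an outgoing carrier, e.g. HTTP request headers.
    static void inject(Context context, Map<String, String> carrier) {
        carrier.put(TRACE_HEADER, context.traceId);
        carrier.put(SPAN_HEADER, context.spanId);
    }

    // Reads the context back on the receiving side; returns null if no trace headers exist.
    static Context extract(Map<String, String> carrier) {
        String traceId = carrier.get(TRACE_HEADER);
        String spanId = carrier.get(SPAN_HEADER);
        if (traceId == null || spanId == null) {
            return null;
        }
        return new Context(traceId, spanId);
    }

    public static void main(String[] args) {
        // "Shop" side: start a trace and attach it to the outgoing request headers.
        Context shopSpan = startSpan(null);
        Map<String, String> headers = new HashMap<>();
        inject(shopSpan, headers);

        // "Factory" side: continue the same trace from the incoming headers.
        Context factorySpan = startSpan(extract(headers));
        System.out.println(shopSpan.traceId.equals(factorySpan.traceId)); // true
    }
}
```

Both services end up with Spans that share one trace ID, which is what lets a tracing backend stitch the individual Spans into a single request story.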
VehicleShop Example
As mentioned, the VehicleShop project is a collection of microservices. Those services exchange information among themselves according to the schema (Img.1.).
The Customer service represents a set of schedulers. Those schedulers generate traffic that simulates multiple customer behaviors, like buying, searching for, or upgrading a car. The VehicleShop service sends requests to the vehicle elements storage (Storage service) to get information about available pieces that are produced by the factory. The VehicleShop service also sends requests to the Factory service to check which cars are ready for sale. Before any car gets produced, the Factory service asks the Storage service whether the VehicleElements necessary to build the car are available (Img.2.). This is the very simple idea behind the example.
All services enable OpenTracing support through the following libraries:
opentracing-spring-jaeger-cloud-starter
opentracing-spring-zipkin-cloud-starter
When these libraries are on the project classpath, they automatically configure the default tracer settings. The tracer backends can be started in Docker containers like so:
$ docker run -d -p 6831:6831/udp -p 16686:16686 jaegertracing/all-in-one:latest
$ docker run -d -p 9411:9411 openzipkin/zipkin:latest
A custom tracer configuration can be used, but in that case, it is necessary to properly reconfigure the Spring @Configuration beans. Another alternative is to use environment variables.
When all services are up, you can open the Jaeger UI at http://localhost:16686/ (Img.3.).
Img.3.
The interface allows us to analyze the recorded spans and observe the real application communication between the microservices (Img.4.).
Img.4.
The Jaeger UI offers one very neat feature: the ability to compare two traces by their IDs. You can get the IDs from the available traces (Img.5.):
Img.5.
As you can see, the example architecture is very simple, but it already shows the value of tracing.
Let's connect the example with Java Mission Control/Flight Recorder and see the traces from a different perspective.
Tracing gives you a view into the application communication layer and helps you recognize potential issues. When JFR is attached to the microservice JVMs, you can then directly analyze the potentially suspicious code. This means that a trace may be a very good initial signal to start the investigation.
Pumping Some Steroids Into Flight Recorder
Marcus has recently published a very neat library, java-jfr-tracer. This library allows us to record Scopes and Spans into the OpenJDK Flight Recorder for a very deep analysis. To enable this feature, it is necessary to add the following library to the project:
build.gradle:
implementation "se.hirt.jmc:jfr-tracer:0.0.3"
After adding the library, we need to register a new JFR Tracer, which wraps the OpenTracing tracer, in the following way:
@Autowired
public CustomTracerConfig(Tracer tracer) {
    // Wrap the auto-configured tracer so that Spans and Scopes are also recorded as JFR events
    GlobalTracer.register(new DelegatingJfrTracer(tracer));
}
Now the information provided by the tracers is available in the Flight Recorder interface as recorded Events (Img.6.):
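To get a feeling for what such an Event looks like, here is a minimal sketch that uses only the JDK's `jdk.jfr` API (JDK 11 or later). The event class, its name `sketch.Span`, and its fields are hypothetical and were invented for this example; the events that java-jfr-tracer actually emits are defined by that library and may look different.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import jdk.jfr.Category;
import jdk.jfr.Event;
import jdk.jfr.Label;
import jdk.jfr.Name;
import jdk.jfr.Recording;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordingFile;

final class JfrSpanSketch {
    // Hypothetical event type; names are invented and differ from what jfr-tracer emits.
    @Name("sketch.Span")
    @Label("Span (sketch)")
    @Category("Tracing Sketch")
    static class SpanEvent extends Event {
        @Label("Trace Id") String traceId;
        @Label("Span Id") String spanId;
        @Label("Operation Name") String operationName;
    }

    // Records one span-like event, dumps the recording, and reads the event back.
    static List<String> recordAndRead(String traceId, String spanId, String operationName) {
        try {
            Path file = Files.createTempFile("span-sketch", ".jfr");
            try (Recording recording = new Recording()) {
                recording.enable(SpanEvent.class);
                recording.start();

                SpanEvent event = new SpanEvent();
                event.begin();                  // corresponds to the span start
                event.traceId = traceId;
                event.spanId = spanId;
                event.operationName = operationName;
                event.commit();                 // corresponds to the span finish

                recording.stop();
                recording.dump(file);
            }
            List<String> result = new ArrayList<>();
            for (RecordedEvent recorded : RecordingFile.readAllEvents(file)) {
                // Skip any other events that may be present in the recording.
                if ("sketch.Span".equals(recorded.getEventType().getName())) {
                    result.add(recorded.getString("traceId") + ":"
                            + recorded.getString("operationName"));
                }
            }
            Files.deleteIfExists(file);
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(recordAndRead("trace-1", "span-1", "buy-vehicle"));
    }
}
```

Because the trace ID is stored as an ordinary event field, it can be used in Java Mission Control to filter events and line them up with the other JVM events recorded in the same time window.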
Img.6.
The Flight Recorder gives you the chance to recognize suspicious trace IDs, which may bring significant value during issue investigation. Spring is not a single-threaded framework, and using it may bring potential challenges (Img.7.).
Img.7.
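One such challenge is easy to reproduce: when the active Span is tracked per thread, work handed over to another thread does not see it unless it is passed along explicitly. The following JDK-only sketch demonstrates this under the assumption of a thread-local active trace ID; all names are hypothetical, and real tracers and Spring integrations provide their own mechanisms for this hand-off.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: a thread-local "active trace" and what a worker thread sees of it.
final class ThreadHandoffSketch {
    private static final ThreadLocal<String> ACTIVE_TRACE = new ThreadLocal<>();

    private static String await(Future<String> future) {
        try {
            return future.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    // The worker thread has its own thread-local slot, so it sees no active trace.
    static String seenByWorkerWithoutPropagation(ExecutorService pool, String traceId) {
        ACTIVE_TRACE.set(traceId);
        try {
            Callable<String> task = ACTIVE_TRACE::get;
            return await(pool.submit(task));
        } finally {
            ACTIVE_TRACE.remove();
        }
    }

    // The caller captures the value and the task re-activates it on the worker thread.
    static String seenByWorkerWithPropagation(ExecutorService pool, String traceId) {
        ACTIVE_TRACE.set(traceId);
        try {
            final String captured = ACTIVE_TRACE.get();
            Callable<String> task = () -> {
                ACTIVE_TRACE.set(captured);
                try {
                    return ACTIVE_TRACE.get();
                } finally {
                    ACTIVE_TRACE.remove(); // do not leak the value into pooled threads
                }
            };
            return await(pool.submit(task));
        } finally {
            ACTIVE_TRACE.remove();
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            System.out.println(seenByWorkerWithoutPropagation(pool, "trace-42")); // null
            System.out.println(seenByWorkerWithPropagation(pool, "trace-42"));    // trace-42
        } finally {
            pool.shutdown();
        }
    }
}
```

When a trace seems to "lose" Spans in a multi-threaded service, a missing hand-off like the first case is one of the first things worth checking.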
A Final Thought
Distributed tracing is both great and challenging. As you can see from Img.7, there are a bunch of threads, and every new library added to the project may create new ones. Distributed tracing opens the way to understanding what is really happening across microservices.
Distributed tracing supports:
Distributed transaction monitoring
Performance and latency optimization
Root cause analysis
Service dependencies analysis
Distributed context propagation
An OpenTracing standard implementation alone may not be enough. Often, you need to know what is happening inside a specific service, and the trace may point you directly to the issue.
Using Flight Recorder for distributed tracing is like putting steroids on your analysis; it's not a silver bullet, but it may be very close.
Enjoy the demo and happy tracing!