Refcard #234

Microservices in Java

Microservices patterns to support building systems that are tolerant of failure.

Written by

Joshua Long Spring Developer Advocate, Pivotal @starbuxman

How quickly can you stand up a new Java microservice? You've either bought the idea of microservices – isn't this just the UNIX philosophy generalized? – or you're wondering how we're not just going to rehearse the migraines of CORBA and DCOM and EJB. Nice concept, but the devil is in the implementation details. This Refcard turns concepts into code and lets you jump on the design and runtime scalability train right away – complete with working Java snippets that run the twelve-factor gamut from config to service registration and discovery to load balancing, gateways, circuit breakers, cluster coordination, security, and more.

Brought to you by Oracle
Section 1

Survival Is Not Mandatory

It is not necessary to change. Survival is not mandatory.
-W. Edwards Deming

It's not controversial to suggest that startups iterate and innovate faster than larger organizations, but what about the larger organizations: the Netflixes, the Alibabas, the Amazons, etc.? How do they innovate so quickly? The secret to their agility lies in the size of their teams. These organizations deploy small teams that in turn build small, singly focused, independently deployable microservices. Microservices are easier to build because the implementation choices don't bleed over to other functionality; they minimize the scope and impact of a given change; they are easier to test; and they are easier to scale. Microservices give us a lot of benefits, but they also introduce complexities related to incepting new services and addressing the dynamics of distribution.

Section 2

Moving Beyond the Wiki Page: "500 Easy Steps to Production"

Microservices are APIs. How quickly can you stand up a new service? Microframeworks like Spring Boot, Grails (which builds on Spring Boot), JHipster (which builds on Spring Boot), Dropwizard, Lagom, and WildFly Swarm are, at a minimum, optimized for quickly standing up REST services with a minimum of fuss. Some of these options go further and address all production requirements, including security, observability, and monitoring. Cloud computing technologies like Cloud Foundry, OpenShift, Heroku, and Google App Engine provide higher-level abstractions for managing the lifecycle of software. At both layers, stronger opinions result in consistency, which results in velocity. These technologies let organizations move beyond the burdensome internal Wiki page, "500 Easy Steps to Production."

Section 3

You Can't Fix What You Can't Measure

One thing is common across all organizations, no matter what their technology stack: when the pager goes off, somebody is roused and sat in front of a computer in order to stop the system's bleeding. Root cause analysis will come later; the priority is to restore service. The more we can do upfront to support remediation in those waning and critical seconds, minutes, and hours after a production outage, the better.

Services should expose their own health endpoints, metadata about the service itself, logs, thread dumps, configuration and environment information, and whatever else could be operationally useful. Services should also track the progression of both business and operational metrics. Metrics can be overwhelming, so it makes sense to publish collected metrics to time-series databases that support analytics and visualization. There are many time-series databases, like Graphite, Atlas, InfluxDB, and OpenTSDB. Metrics are keys and values over time. Spring Boot supports the collection of metrics, and can delegate to projects like Dropwizard Metrics and Micrometer to publish them.

@SpringBootApplication
public class DemoApplication {

  public static void main(String[] args) {
    SpringApplication.run(DemoApplication.class, args);
  }

  // publishes the Dropwizard MetricRegistry to Graphite once per second
  @Bean
  GraphiteReporter graphite(@Value("${graphite.prefix}") String prefix,
            @Value("${graphite.url}") URL url,
            @Value("${graphite.port}") int port,
            MetricRegistry registry) {

    GraphiteReporter reporter = GraphiteReporter.forRegistry(registry)
      .prefixedWith(prefix)
      .build(new Graphite(url.getHost(), port));
    reporter.start(1, TimeUnit.SECONDS);
    return reporter;
  }
}

@RestController
class FulfillmentRestController {

  @Autowired
  private CounterService counterService;

  // the request path is illustrative; the source listing does not show it
  @RequestMapping("/customers/{customerId}/fulfillment")
  Fulfillment fulfill(@PathVariable long customerId) {
    // ..
    // ..
  }
}

Log multiplexers like Logstash or Cloud Foundry's Loggregator funnel the logs from application instances and ship them to downstream log analysis tools like Elasticsearch, Splunk, or Papertrail.

Getting all of this out of the box is a good start, but not enough. There is often much more to be done before a service can get to production. Spring Boot uses a mechanism called auto-configuration that lets developers codify things (identity provider integrations, connection pools, frameworks, auditing infrastructure, literally anything) and have them stood up as part of the Spring Boot application just by being on the CLASSPATH, provided all the conditions stipulated by the auto-configuration are met. These conditions can be anything, and Spring Boot ships with many common and reusable conditions: Is a library on the CLASSPATH? Is a bean of a certain type defined (or not defined)? Is an environment property specified? Starting a new service need not be more complex than a public static void main entry-point and a library on the CLASSPATH if you use the right technology.

Section 4

Centralized Configuration

The Twelve-Factor App methodology provides a set of guidelines for building applications with good, clean cloud hygiene. One tenet is that environment-specific configuration should live external to the application itself. It might live in environment variables, -D arguments, externalized .properties or .yml files, or any other place, so long as the application code itself need not be recompiled. Dropwizard, Spring Boot, Apache Commons Configuration, and others support this foundational requirement. However, this approach fails a few key use cases: How do you change configuration centrally and propagate those changes? How do you support symmetric encryption and decryption of things like connection credentials? How do you support feature flags which toggle configuration values at runtime, without restarting the process?
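
As a minimal sketch of the externalized-configuration tenet, assuming plain Java with no framework, the hypothetical `ExternalConfig` helper below resolves a key from a -D system property first, then from an environment variable, then from a default. The class name, the lookup order, and the rule that maps a key to an environment variable name are illustrative assumptions, not the API of any library named here.

```java
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper: resolves configuration from sources outside the compiled code.
final class ExternalConfig {

    private final Function<String, String> systemProperties;
    private final Map<String, String> environment;

    ExternalConfig(Function<String, String> systemProperties, Map<String, String> environment) {
        this.systemProperties = systemProperties;
        this.environment = environment;
    }

    // Uses the real JVM system properties (-D arguments) and process environment.
    static ExternalConfig fromJvm() {
        return new ExternalConfig(System::getProperty, System.getenv());
    }

    // Assumed naming rule: "graphite.port" maps to the variable GRAPHITE_PORT.
    static String toEnvName(String key) {
        return key.toUpperCase(Locale.ROOT).replace('.', '_').replace('-', '_');
    }

    // Assumed precedence: -D argument, then environment variable, then default.
    String get(String key, String defaultValue) {
        String fromProperty = systemProperties.apply(key);
        if (fromProperty != null) {
            return fromProperty;
        }
        String fromEnv = environment.get(toEnvName(key));
        return fromEnv != null ? fromEnv : defaultValue;
    }
}
```

A call such as `ExternalConfig.fromJvm().get("graphite.port", "2003")` changes its answer when the process is started with a different -D argument or environment, without recompiling. It does nothing for central propagation, encryption, or runtime toggles, which are the gaps raised by the questions above.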

Spring Cloud provides the Spring Cloud Config Server, which stands up a REST API in front of a version-controlled repository of configuration files, and Spring Cloud provides support for using Apache ZooKeeper and HashiCorp Consul as configuration sources. Spring Cloud provides various clients for all of these so that all properties, whether they come from the Config Server, Consul, a -D argument, or an environment variable, work the same way for a Spring client. Netflix provides a solution called Archaius that acts as a client to a pollable configuration source. This is a bit too low-level for many organizations and lacks a supported, open-source configuration source counterpart, but Spring Cloud bridges the Archaius properties with Spring's, too.

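The Config Server

# application.properties (the repository URI is assumed from the client-side comment)
spring.cloud.config.server.git.uri=https://github.com/joshlong/my-config

@EnableConfigServer
@SpringBootApplication
public class ConfigServiceApplication {

    public static void main(String[] args) {
        SpringApplication.run(ConfigServiceApplication.class, args);
    }
}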

The Config Client

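# application.properties (standard Spring Cloud Config keys; the values are assumed)
spring.cloud.config.uri=http://localhost:8888
spring.application.name=message-client
# will read https://github.com/joshlong/my-config/message-client.properties

@SpringBootApplication
public class ConfigClientApplication {

    public static void main(String[] args) {
        SpringApplication.run(ConfigClientApplication.class, args);
    }
}

// supports dynamic re-configuration:
// curl -d{} http://localhost:8000/refresh
@RestController
@RefreshScope
class MessageRestController {

    // the property key `message` is assumed; the source listing does not show it
    @Value("${message}")
    private String message;

    @RequestMapping("/message") // illustrative path
    String read() {
        return this.message;
    }
}
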
Section 5

Service Registration and Discovery

DNS is sometimes a poor fit for intra-service communication. DNS benefits from layers of caching and time-to-live (TTL) settings that work against services in a dynamic cloud environment. In most cloud environments, DNS resolution requires a trip out of the platform to the router and then back again, introducing latency. DNS doesn't provide a way to answer the question: is the service I am trying to call still alive and responding? It can only tell us where something is supposed to be. A request to such a fallen service will block until the service responds, unless the client specifies a timeout (which it should!). DNS is often paired with load balancers, but third-party load balancers are not sophisticated things: they may support round-robin load balancing, or even availability zone-aware load balancing, but may not be able to accommodate business-logic-specific routing, like routing a request with an OAuth token to a specific node, or routing requests to nodes collocated with data. It's important to decouple the client from the location of the service, but DNS might be a poor fit. A little bit of indirection is required. A service registry provides that indirection.

A service registry is a phonebook, letting clients look up services by their logical names. There are many such service registries out there. Netflix's Eureka, Apache ZooKeeper, and HashiCorp Consul are three good examples. Spring Cloud's DiscoveryClient abstraction provides a convenient client-side API for working with service registries. Here, we inject the DiscoveryClient to interrogate the registered services:

public void enumerateServiceInstances(DiscoveryClient client){
  client.getInstances("reservation-service") // service ID assumed for illustration
      .forEach( si -> System.out.println( si.getHost() + ":" + si.getPort() ));
}

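
To make the phonebook idea concrete, here is a minimal in-memory sketch. It assumes that instances renew a lease with periodic heartbeats and drop out of lookups once the lease expires. `SimpleRegistry`, its methods, and the lease length are hypothetical names for illustration; real registries such as Eureka or Consul add replication, health checks, and a network API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory registry: logical service name -> live "host:port" entries.
final class SimpleRegistry {

    private final long leaseMillis;
    // service name -> (instance address -> time of the last heartbeat)
    private final Map<String, Map<String, Long>> lastHeartbeat = new ConcurrentHashMap<>();

    SimpleRegistry(long leaseMillis) {
        this.leaseMillis = leaseMillis;
    }

    // Registers the instance or renews its lease; the caller supplies the current time.
    void heartbeat(String service, String hostAndPort, long nowMillis) {
        lastHeartbeat.computeIfAbsent(service, s -> new ConcurrentHashMap<>())
                     .put(hostAndPort, nowMillis);
    }

    // Returns, in sorted order, only the instances whose lease has not expired.
    List<String> lookup(String service, long nowMillis) {
        Map<String, Long> known = lastHeartbeat.getOrDefault(service, Collections.emptyMap());
        List<String> live = new ArrayList<>();
        for (Map.Entry<String, Long> entry : known.entrySet()) {
            if (nowMillis - entry.getValue() <= leaseMillis) {
                live.add(entry.getKey());
            }
        }
        Collections.sort(live);
        return live;
    }
}
```

Unlike a DNS record, an entry here disappears shortly after its instance stops sending heartbeats, so the registry can say which instances were recently alive rather than where something is supposed to be.
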
Section 6

Client-side Load Balancing

A big benefit of using a service registry is client-side load balancing. Client-side load balancing lets the client discover all the registered instances of a given service through the registry, whether there are ten or a thousand, and then choose which of the candidate instances to route requests to. The client can programmatically decide to which node a request should be sent based on whatever criteria it likes: capacity, least recently used, cloud-provider availability-zone awareness, multi-tenancy, etc. Netflix provides a great client-side load balancer called Ribbon that Spring Cloud integrates with. Ribbon is automatically in play at all layers of the framework, whether you're using the RestTemplate, the reactive WebFlux WebClient, declarative REST clients powered by Netflix's Feign, the Zuul microproxy, or Spring Cloud Gateway.

@EnableDiscoveryClient
@SpringBootApplication
public class ReservationClientApplication {

    @Bean
    @LoadBalanced  // lets us use service registry service IDs as hosts
    RestTemplate restTemplate() {
        return new RestTemplate();
    }

    public static void main(String[] args) {
        SpringApplication.run(ReservationClientApplication.class, args);
    }
}

@RestController
class ApiClientRestController {

  @Autowired
  private RestTemplate restTemplate;

  @RequestMapping(method = RequestMethod.GET, value = "/reservations/names")
  public Collection<String> names() {

    // the service ID and path in the URL are assumed for illustration
    ResponseEntity<JsonNode> responseEntity =
      restTemplate.exchange("http://reservation-service/reservations",
        HttpMethod.GET, null, JsonNode.class);
    // ...
  }
}

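
The selection step itself can be sketched in a few lines. This is a minimal round-robin chooser, assuming the candidate list comes from a registry lookup; `RoundRobinChooser` is a hypothetical name and is not how Ribbon is implemented.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical round-robin chooser over instances obtained from a registry.
final class RoundRobinChooser {

    private final AtomicInteger next = new AtomicInteger();

    // Picks the next instance in rotation; the list may change between calls.
    String choose(List<String> instances) {
        if (instances.isEmpty()) {
            throw new IllegalStateException("no instances registered");
        }
        // floorMod keeps the index non-negative even after the counter overflows
        int index = Math.floorMod(next.getAndIncrement(), instances.size());
        return instances.get(index);
    }
}
```

Because the choice is made in the client, replacing the rotation with a zone-aware or tenant-aware rule is a local code change rather than a load balancer reconfiguration.
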
Section 7

Edge Services: API Gateways and API Adapters

Client-side load balancing works for intra-service communication, usually behind a firewall. External clients (iPhones, HTML5 clients, Android clients, etc.) have client-specific security, payload, and protocol requirements. An edge service is the first port of call for requests coming from these external clients. You address client-specific concerns at the edge service and then forward the requests to downstream services. An API gateway (sometimes called a backend for a frontend) supports declarative, cross-cutting concerns like rate limiting, authentication, compression, and routing. API adapters have more insight into the semantics of the downstream services; they might expose a synthetic view of the responses of downstream services, combining, filtering, or enriching them. There are many API gateways, some hosted and some not, like Apigee, WSO2, Nginx, Kong, Netflix Zuul, and Spring Cloud Gateway. I like to think of Netflix Zuul and Spring Cloud Gateway as microproxies. They're small and embeddable. API gateways need to be as fast as possible and able to absorb as much incoming traffic as possible. Here, non-blocking, reactive APIs (using technologies like Netflix's RxJava 2, Spring WebFlux, Red Hat's Vert.x, and Lightbend's Akka Streams) are a very good choice.

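@EnableDiscoveryClient
@SpringBootApplication
public class EdgeServiceApplication {

    // configure reactive, Ribbon aware `WebClient`
    @Bean
    WebClient client(LoadBalancerExchangeFilterFunction lb) {
        return WebClient.builder().filter(lb).build();
    }

    // API Adapter using reactive Spring WebFlux WebClient
    @Bean
    RouterFunction<?> endpoints(WebClient client) {
        List<String> badCars = Arrays.asList("Pinto", "Gremlin");
        return route(GET("/good-cars"), req -> {
            Publisher<Car> cars = client
                    .get()
                    .uri("http://car-catalog-service/cars") // hypothetical service ID and path
                    .retrieve()
                    .bodyToFlux(Car.class)
                    // keep only the cars that are not on the bad list
                    .filter(x -> !badCars.contains(x.getName()));
            Publisher<Car> circuit = HystrixCommands
                    .from(cars)
                    .commandName("cars") // assumed command name
                    .fallback(Flux.empty())
                    .eager()
                    .build();
            return ServerResponse.ok().body(circuit, Car.class);
        });
    }

    // API gateway with Spring Cloud Gateway
    @Bean
    RouteLocator gateway() {
        return Routes.locator()
                // custom paths to a load-balanced service
                // ... (route definition missing from the source listing)
                // rewrites
                // ... (route id and predicate missing from the source listing)
                .filter(rewritePath("/foo/(?<segment>.*)", "/${segment}"))
                // circuit breaker
                // ... (route definition missing from the source listing)
                .build();
    }

    public static void main(String[] args) {
        SpringApplication.run(EdgeServiceApplication.class, args);
    }
}
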
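
The rewrite rule in the gateway listing uses a named capture group. As a sketch of what such a rule does to a request path, the hypothetical `PathRewrite` helper below applies the same pattern and replacement with `java.util.regex`. It illustrates the regular expression only and makes no claim about how Spring Cloud Gateway applies its filter internally.

```java
import java.util.regex.Pattern;

// Hypothetical helper showing the effect of a rewrite-path rule on a request path.
final class PathRewrite {

    private final Pattern pattern;
    private final String replacement;

    PathRewrite(String regex, String replacement) {
        this.pattern = Pattern.compile(regex);
        this.replacement = replacement;
    }

    // Applies the rewrite; paths that do not match are returned unchanged.
    String apply(String path) {
        // ${segment} in the replacement refers to the named group (?<segment>...)
        return pattern.matcher(path).replaceAll(replacement);
    }
}
```

With the pattern `/foo/(?<segment>.*)` and the replacement `/${segment}`, the `/foo` prefix that external clients see is stripped before the request reaches the downstream service.
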
Section 8

Clustering Primitives

In a complex distributed system, there are many actors with many roles to play. Cluster coordination and cluster consensus are among the most difficult problems to solve. How do you handle leadership election, active/passive handoff, or global locks? Thankfully, many technologies provide the primitives required to support this sort of coordination, including Apache ZooKeeper, Redis, and Hazelcast. Spring Integration supports a clean integration with these kinds of technologies. In the following example, we've configured a component to change its state whenever an OnGrantedEvent or an OnRevokedEvent is emitted, which happens when the underlying coordination technology promotes or demotes a leader node.

@Component
class LeadershipApplicationListener {

  @EventListener
  public void leadershipGranted(OnGrantedEvent evt){
    // ..
  }

  @EventListener
  public void leadershipRevoked(OnRevokedEvent evt){
    // ..
  }
}

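
The grant and revoke cycle can be illustrated without a coordination service. The sketch below is a single-process stand-in, assuming at most one node holds leadership at a time and that callbacks fire on each transition. `LeaderLock` and its methods are hypothetical; a real deployment would delegate the lock to ZooKeeper, Redis, or Hazelcast.

```java
import java.util.function.Consumer;

// Hypothetical single-process stand-in for a coordination service's leader lock.
final class LeaderLock {

    private final Consumer<String> onGranted;
    private final Consumer<String> onRevoked;
    private String leader; // null while nobody holds leadership

    LeaderLock(Consumer<String> onGranted, Consumer<String> onRevoked) {
        this.onGranted = onGranted;
        this.onRevoked = onRevoked;
    }

    // Grants leadership only if no node currently holds it.
    synchronized boolean tryAcquire(String nodeId) {
        if (leader != null) {
            return false;
        }
        leader = nodeId;
        onGranted.accept(nodeId);
        return true;
    }

    // Revokes leadership only when requested by the current leader.
    synchronized boolean release(String nodeId) {
        if (!nodeId.equals(leader)) {
            return false;
        }
        leader = null;
        onRevoked.accept(nodeId);
        return true;
    }
}
```

The two callbacks play the role of the granted and revoked events: a component starts its leader-only work in the first and stops it in the second.
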
Section 9

Messaging, CQRS, and Stream Processing

When you move into the world of microservices, state synchronization becomes more difficult. The reflex of the experienced architect might be to reach for distributed transactions, à la JTA. Ignore this impulse at all costs. Distributed transactions are a stop-the-world approach to state synchronization; that is the worst possible outcome in a distributed system. And, indeed, their use is moot since your microservices will not speak the X/Open XA protocol, for which JTA is middleware, as your RDBMS or JMS message queue might. One of the main reasons to move to microservices is to retain autonomy over your services' implementation, so that you don't need to constantly synchronize with other parts of the organization when making changes. So allowing access to your JTA-capable resources isn't an option, anyway.

Instead, services today use eventual consistency through messaging to ensure that state eventually reflects the correct system world-view. REST is a fine technology for reading data, but it doesn't provide any guarantees about the propagation and eventual processing of a transaction. Actor systems like Lightbend Akka and message brokers like Apache ActiveMQ, Apache Kafka, RabbitMQ, or even Redis have become the norm. Akka provides a supervisory system that guarantees a message will be processed at least once. If you're using messaging, there are many APIs that can simplify the chore, including Spring Integration, Apache Camel, and, at a higher abstraction level, Spring Cloud Stream.

Using messaging for writes and REST for reads optimizes reads separately from writes. The Command Query Responsibility Segregation (CQRS) design pattern specifically espouses this approach, though it does so separately from any discussion of a particular protocol or technology.

In the example below, Spring Cloud Stream connects a client to three services described in terms of MessageChannel definitions in the CrmChannels interface. Messages sent into the channels are communicated to other nodes through a Spring Cloud Stream binder implementation that in turn talks to a messaging technology like RabbitMQ or Apache Kafka. The configuration that binds a MessageChannel to a destination in a messaging technology is external to the code.

// producer side
@EnableBinding(CrmChannels.class)
@SpringBootApplication
public class ProductsEdgeService {

    public static void main(String[] args) {
        SpringApplication.run(ProductsEdgeService.class, args);
    }
}

interface CrmChannels {

  @Output
  MessageChannel orders();

  @Output
  MessageChannel customers();

  @Output
  MessageChannel products();
}

@RestController
class ProductsApiGatewayRestController {

  @Autowired
  private MessageChannel products;

  @RequestMapping(method = RequestMethod.POST)
  public void write(@RequestBody Product p) {
    Message<Product> msg = MessageBuilder.withPayload(p).build();
    products.send(msg);
  }
}

On the consumer side you might consume incoming messages using a @StreamListener:

// on the consumer side
@EnableBinding(Sink.class) // contains a definition for an `input` channel
@SpringBootApplication
public class MessageConsumerService {

    public static void main(String[] args) {
        SpringApplication.run(MessageConsumerService.class, args);
    }
}

@MessageEndpoint
public class ProductHandler {

  @Autowired
  private ProductRepository products;

  @StreamListener(Sink.INPUT)
  public void handle(Product p) {
    products.save(p); // assumes a Spring Data-style repository
  }
}

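
The split between a message-driven write path and a direct read path can be shown without a broker. The following is a minimal in-process sketch, assuming a queue stands in for the messaging technology and a map stands in for the read model. `ProductCatalog` and `AddProduct` are hypothetical names for illustration.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical in-process stand-in for a broker-backed write path and a read model.
final class ProductCatalog {

    // Command: a request to change state, sent as a message.
    static final class AddProduct {
        final String sku;
        final String name;

        AddProduct(String sku, String name) {
            this.sku = sku;
            this.name = name;
        }
    }

    private final BlockingQueue<AddProduct> commands = new LinkedBlockingQueue<>();
    private final Map<String, String> readModel = new ConcurrentHashMap<>();

    // Write side: accept the command and return immediately.
    void submit(AddProduct command) {
        commands.add(command);
    }

    // Consumer: applies queued commands to the read model and returns how many it applied.
    int drain() {
        int applied = 0;
        AddProduct command;
        while ((command = commands.poll()) != null) {
            readModel.put(command.sku, command.name);
            applied++;
        }
        return applied;
    }

    // Read side: a query never touches the command queue.
    String nameOf(String sku) {
        return readModel.get(sku);
    }
}
```

Between `submit` and `drain`, a query still returns the old state. That window is the eventual consistency the text describes: the write is accepted first and becomes visible to readers once the consumer has processed it.
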
Section 10

Retries and Circuit Breakers

You can't take for granted that a downstream service will be available. If you attempt to invoke a service and a failure occurs, then you should be ready to retry the call. Services fail for all sorts of often ephemeral reasons. Spring Retry supports automatically retrying a failed action. You can specify what exceptions should trigger a retry, how many retries should be attempted, and when they should happen.
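
The three knobs mentioned here (which exceptions trigger a retry, how many attempts, and when they happen) fit in a small helper. This is a minimal sketch in plain Java, assuming exponential backoff between attempts. `Retry.call` is a hypothetical method and not the Spring Retry API.

```java
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical retry helper with exponential backoff between attempts.
final class Retry {

    private Retry() { }

    static <T> T call(Supplier<T> action, Predicate<RuntimeException> retryOn,
                      int maxAttempts, long initialDelayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long delay = initialDelayMillis;
        for (int attempt = 1; ; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                if (attempt >= maxAttempts || !retryOn.test(e)) {
                    throw e; // out of attempts, or not a retryable failure
                }
                sleep(delay);
                delay *= 2; // wait twice as long before the next attempt
            }
        }
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException interrupted) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting to retry", interrupted);
        }
    }
}
```

A caller might write `Retry.call(() -> fetchNames(), e -> e instanceof IllegalStateException, 3, 100)`, where `fetchNames` is a placeholder for the remote call. The predicate keeps permanent failures, such as a bad request, from being retried pointlessly.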

Circuit breakers, like Spring Retry, Netflix's Hystrix, or JRugged, go one step beyond basic retry functionality: they are stateful and can determine that a call is not going to succeed and that the fallback behavior should be attempted directly. The effect is that downstream services, which would otherwise be overwhelmed, are given an opportunity to recover while shortening the time to recovery for the client. Some circuit breakers can even execute some behaviors in an isolated thread so that, if something goes wrong, the main processing can continue unimpeded. Spring Cloud supports both Netflix Hystrix and Spring Retry. WildFly Swarm also supports Netflix Hystrix. The Play Framework provides support for circuit breakers.
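
The stateful part is what separates a circuit breaker from a retry loop. Below is a minimal sketch, assuming the circuit opens after a fixed number of consecutive failures and allows a trial call once a cool-down period has passed. `SimpleCircuitBreaker` is hypothetical and much simpler than Hystrix, which tracks error rates over a rolling window.

```java
import java.util.function.Supplier;

// Hypothetical count-based circuit breaker; not the Hystrix or Spring Retry API.
final class SimpleCircuitBreaker {

    private final int failureThreshold;
    private final long openMillis;
    private int consecutiveFailures;
    private long openedAt = -1; // -1 means the circuit is closed

    SimpleCircuitBreaker(int failureThreshold, long openMillis) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
    }

    synchronized boolean isOpen(long nowMillis) {
        return openedAt >= 0 && nowMillis - openedAt < openMillis;
    }

    // Runs the action unless the circuit is open, in which case the fallback is used directly.
    synchronized <T> T call(Supplier<T> action, Supplier<T> fallback, long nowMillis) {
        if (isOpen(nowMillis)) {
            return fallback.get(); // open: spare the downstream service
        }
        try {
            T result = action.get(); // closed, or a trial call after the cool-down
            consecutiveFailures = 0;
            openedAt = -1;
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (consecutiveFailures >= failureThreshold) {
                openedAt = nowMillis; // trip (or re-trip) the circuit
            }
            return fallback.get();
        }
    }
}
```

While the circuit is open, the action is never invoked. That is what gives an overwhelmed downstream service room to recover while the client still gets an immediate answer from the fallback.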

Circuit breakers represent connections between services in a system; it is important to monitor them. Hystrix provides a dashboard for its circuits. Spring Cloud also has deep support for Hystrix and its dashboard. Each node that uses Netflix Hystrix exposes a server-sent event-based stream of status information about the Hystrix circuits on that node. Spring Cloud Turbine supports multiplexing those various streams into a single stream to federate the view of all the circuits across all the nodes in the whole system.

@RestController
class EdgeService {

  public Collection<String> fallback(){
    // this will be invoked if the `names` method throws an exception
    return Collections.emptyList(); // assumed fallback value
  }

  // the dashboard will show a circuit named 'reservation-service'
  @HystrixCommand(fallbackMethod = "fallback")
  @RequestMapping(method = RequestMethod.GET, value = "/names")
  public Collection<String> names() {
    // ..
  }
}

Section 11

Distributed Tracing

It is difficult to reason about a microservice system with REST-based, messaging-based, and proxy-based egress and ingress points. How do you trace (correlate) requests across a series of services and understand where something has failed? This is a difficult challenge without a sufficient upfront investment in a tracing strategy. Google introduced their distributed tracing strategy in their Dapper paper. Apache HTrace is a Dapper-inspired alternative. Twitter's Zipkin is another Dapper-inspired tracing system. It provides the trace collection infrastructure and a UI in which you can view waterfall graphs of calls across services, along with their timings and trace-specific information. Spring Cloud Sleuth provides an abstraction around the concepts of distributed tracing. Spring Cloud Sleuth automatically traces common ingress and egress points in the system. Spring Cloud Zipkin integrates Twitter Zipkin in terms of the Spring Cloud Sleuth abstraction.
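
Correlation rests on one idea: every hop in a request carries the same trace ID, plus its own span ID. The sketch below shows that propagation in plain Java, assuming IDs travel as HTTP headers under the B3 header names that Zipkin uses. `TraceContext` is a hypothetical class and not the Spring Cloud Sleuth API.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical trace context: one trace ID shared by every hop, a fresh span ID per hop.
final class TraceContext {

    final String traceId;
    final String spanId;
    final String parentSpanId; // null for the root span

    private TraceContext(String traceId, String spanId, String parentSpanId) {
        this.traceId = traceId;
        this.spanId = spanId;
        this.parentSpanId = parentSpanId;
    }

    // 64-bit random ID rendered as 16 lower-case hex characters.
    private static String newId() {
        return String.format("%016x", ThreadLocalRandom.current().nextLong());
    }

    // Called at the edge, when a request arrives without trace headers.
    static TraceContext newTrace() {
        return new TraceContext(newId(), newId(), null);
    }

    // Called before each outbound call: same trace, new span, current span becomes the parent.
    TraceContext childSpan() {
        return new TraceContext(traceId, newId(), spanId);
    }

    // Headers to attach to the outbound request.
    Map<String, String> toHeaders() {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("X-B3-TraceId", traceId);
        headers.put("X-B3-SpanId", spanId);
        if (parentSpanId != null) {
            headers.put("X-B3-ParentSpanId", parentSpanId);
        }
        return headers;
    }

    // Rebuilds the context on the receiving side; starts a new trace if the headers are absent.
    static TraceContext fromHeaders(Map<String, String> headers) {
        String traceId = headers.get("X-B3-TraceId");
        String spanId = headers.get("X-B3-SpanId");
        if (traceId == null || spanId == null) {
            return newTrace();
        }
        return new TraceContext(traceId, spanId, headers.get("X-B3-ParentSpanId"));
    }
}
```

Logging the trace ID with every log line and reporting each span's timing to a collector is what lets a tool draw the waterfall graph of a single request across services.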

Section 12

Single Sign-On and Security

Security describes authentication, authorization, and, often, which client is being used to make a request. OAuth and OpenID Connect are very popular on the open web, and SAML rules the enterprise. OAuth 2.0 provides explicit integration with SAML. API gateway tools like Apigee and SaaS identity providers like Okta can act as a secure meta-directory, exposing OAuth endpoints (for example) and connecting the backend to more traditional identity providers like Active Directory, Office 365, Salesforce, and LDAP. Spring Security OAuth and Red Hat's Keycloak are open-source OAuth and OpenID Connect servers. Whatever your choice of identity provider, it should be trivial to authenticate and authorize clients. Spring Security, Apache Shiro, Nimbus, and Pac4J all provide convenient OAuth clients. Spring Cloud Security can lock down microservices, rejecting unauthenticated requests. It can also propagate authentication contexts from one microservice to another. Frameworks like JHipster integrate Spring's OAuth support.

Section 13

A Cloud Native Architecture is an Agile Architecture

Systems must optimize for time-to-remediation: when a service goes down, how quickly can the system replace it? If time-to-remediation is 0 seconds, then the system is (effectively) 100% highly available. From the outside, such a system looks the same as a single-node service that is 100% available, but getting there has profound impacts on the architecture of the system. The patterns we've looked at in this Refcard support building systems that are tolerant of failure and of the service topology changes common in a dynamic cloud environment. Remember: the goal here is to achieve velocity, and to waste as little time as possible on non-functional requirements. Automation at the platform and application tiers supports this velocity. Embracing one without the other only invites undifferentiating complexity into an architecture and defeats the purpose of moving to this architecture in the first place.

