DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Zones

Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

How does AI transform chaos engineering from an experiment into a critical capability? Learn how to effectively operationalize the chaos.

Data quality isn't just a technical issue: It impacts an organization's compliance, operational efficiency, and customer satisfaction.

Are you a front-end or full-stack developer frustrated by front-end distractions? Learn to move forward with tooling and clear boundaries.

Developer Experience: Demand to support engineering teams has risen, and there is a shift from traditional DevOps to workflow improvements.

Related

  • Zero Trust for AWS NLBs: Why It Matters and How to Do It
  • A Guide to Microservices Deployment: Elastic Beanstalk vs Manual Setup
  • A Deep Dive on Read Your Own Writes Consistency
  • Mastering Load Balancers: Optimizing Traffic for High Availability and Performance

Trending

  • How I Built an AI Portal for Document Q and A, Summarization, Transcription, Translation, and Extraction
  • How Node.js Works Behind the Scenes (HTTP, Libuv, and Event Emitters)
  • AI-Native Platforms: The Unstoppable Alliance of GenAI and Platform Engineering
  • Operationalizing Data Quality in Cloud ETL Workflows: Automated Validation and Anomaly Detection
  1. DZone
  2. Software Design and Architecture
  3. Performance
  4. Reactive Service-To-Service Communication With RSocket (Part 2): Load Balancing and Resumability

Reactive Service-To-Service Communication With RSocket (Part 2): Load Balancing and Resumability

More information about RSockets and how to use them.

By 
Rafał Kowalski user avatar
Rafał Kowalski
·
Aug. 14, 19 · Tutorial
Likes (8)
Comment
Save
Tweet
Share
11.9K Views

Join the DZone community and get the full member experience.

Join For Free

Image title

This article is the second of the mini-series which will help you to get familiar with RSocket — a new binary protocol which may revolutionize machine to machine communication in distributed systems. In the following paragraphs, we will discuss the load balancing problem in the cloud as well as present the resumability feature which helps to deal with network issues, especially in the IoT systems.

If you are not familiar with RSocket basics, please see the previous article available here.

High Availability and Load Balancing as a Crucial Part of Enterprise-Grade Systems

Applications availability and reliability are crucial parts of many business areas like banking and insurance. In these demanding industries, the services have to be operational 24/7 even during high traffic, periods of increased network latency or natural disasters. To ensure that the software is always available to the end-users it is usually deployed in redundantly, across the multiple availability zones.

In such a scenario, at least two instances of each microservice are deployed in at least two availability zones. This technique helps our system become resilient and increase its capacity — multiple instances of the microservices are able to handle a significantly higher load. So where is the trick? The redundancy introduces extra complexity. As engineers, we have to ensure that the incoming traffic is spread across all available instances. There are two major techniques that address this problem: server load balancing and client load balancing.

The first approach is based on the assumption that the requester does not know the IP addresses of the responders. Instead of that, the requester communicates with the load balancer, which is responsible for spreading the requests across the microservices connected to it. This design is fairly easy to adopt in the cloud era. IaaS providers usually have built-in, reliable solutions, like Elastic Load Balancer available in Amazon Web Services. Moreover, such a design helps develop routing strategy more sophisticated than plain round ribbon (e.g. adaptive load balancing or chained failover). The major drawback of this technique is the fact that we have to configure and deploy extra resources, which may be painful if our system consists of hundreds of the microservices. Furthermore, it may affect the latency — each request has extra “network hop” on the load balancer.

The second technique inverts the relation. Instead of a central point used to connect to responders, the requester knows IP addresses of each and every instance of the given microservice. Having such knowledge, the client can choose the responder instance to which it sends the request or opens the connection with. This strategy does not require any extra resources, but we have to ensure that the requester has the IP addresses of all instances of the responder (see how to deal with it using service discovery pattern). The main benefit of the client load balancing pattern is its performance — by reduction of one extra “network hop”, we may significantly decrease the latency. This is one of the key reasons why RSocket implements the client load balancing pattern.

Image title

Client Load Balancing in RSocket

On the code level, the implementation of the client load balancing in RSocket is pretty straightforward. The mechanism relies on the LoadBalancedRSocketMonoobject which works as a bag of available RSocket instances, provided by RSocket supplier. To access RSockets, we have to subscribe to the LoadBalancedRSocketMono which onNext signal emits fully-fledged RSocket instance. Moreover, it calculates statistics for each RSocket, so that it is able to estimate the load of each instance and based on that choose the one with the best performance at the given point of time.

The algorithm takes into account multiple parameters like latency, number of maintained connections as well as a number of pending requests. The health of each RSocket is reflected by the availability parameter – which takes values from zero to one, where zero indicates that the given instance cannot handle any requests and one is assigned to a fully operational socket. The code snippet below shows the very basic example of the load-balanced RSocket, which connects to three different instances of the responder and executes 100 requests. Each time it picks up RSocket from the LoadBalancedRSocketMono object.

@Slf4j
public class LoadBalancedClient {

    static final int[] PORTS = new int[]{7000, 7001, 7002};

    public static void main(String[] args) {

        List rsocketSuppliers = Arrays.stream(PORTS)
                .mapToObj(port -> new RSocketSupplier(() -> RSocketFactory.connect()
                        .transport(TcpClientTransport.create(HOST, port))
                        .start()))
                .collect(Collectors.toList());

        LoadBalancedRSocketMono balancer = LoadBalancedRSocketMono.create((Publisher>) s -> {
            s.onNext(rsocketSuppliers);
            s.onComplete();
        });

        Flux.range(0, 100)
                .flatMap(i -> balancer)
                .doOnNext(rSocket -> rSocket.requestResponse(DefaultPayload.create("test-request")).block())
                .blockLast();
    }
}













It is worth noting, that the client load balancer in RSocket deals with dead connections as well. If any of the RSocket instances registered in the LoadBalancedRSocketMono stop responding, the mechanism will automatically try to reconnect. By default, it will execute five attempts, in 25 seconds. If it does not succeed, the given RSocket will be removed from the pool of available connections. Such design combines the advantages of the server-side load balancing with low latency and reduction of “network hops” of the client load balancing.

Dead Connections and Resumabilty Mechanism

In case of the machine-to-the machine communication within the cloud, streaming data in real-time is not a big deal, but if we consider IoT devices located in areas without access to a stable, reliable internet connection, the problem becomes more complex. We can easily identify two major issues we may face in such a system: the network latency and connection stability. From a software perspective, there is not much we can do with the first one, but we can try to deal with the latter. Let’s tackle the problem with RSocket, starting with picking up the proper interaction model. The most suitable in this case is therequest stream method, where the microservice deployed in the cloud is the requester and the temperature sensor is the responder. After choosing the interaction model we apply the resumability mechanism. In RSocket, we do it by method resume() invoked on the RSocketFactory, as shown in the examples below:

@Slf4j
public class ResumableRequester {

    private static final int CLIENT_PORT = 7001;

    public static void main(String[] args) {
        RSocket socket = RSocketFactory.connect()
                .resume()
                .resumeSessionDuration(RESUME_SESSION_DURATION)
                .transport(TcpClientTransport.create(HOST, CLIENT_PORT))
                .start()
                .block();
        socket.requestStream(DefaultPayload.create("dummy"))
                .map(payload -> {
                    log.info("Received data: [{}]", payload.getDataUtf8());
                    return payload;
                })
                .blockLast();
    }
}











@Slf4j
public class ResumableResponder {

    private static final int SERVER_PORT = 7000;
    static final String HOST = "localhost";
    static final Duration RESUME_SESSION_DURATION = Duration.ofSeconds(60);


    public static void main(String[] args) throws InterruptedException {

        RSocketFactory.receive()
                .resume()
                .resumeSessionDuration(RESUME_SESSION_DURATION)
                .acceptor((setup, sendingSocket) -> Mono.just(new AbstractRSocket() {
                    @Override
                    public Flux requestStream(Payload payload) {
                        log.info("Received 'requestStream' request with payload: [{}]", payload.getDataUtf8());
                        return Flux.interval(Duration.ofMillis(1000))
                                .map(t -> DefaultPayload.create(t.toString()));
                    }
                }))
                .transport(TcpServerTransport.create(HOST, SERVER_PORT))
                .start()
                .subscribe();
        log.info("Server running");

        Thread.currentThread().join();
    }
}















Please notice that to run provided examples, you need ‘socat’ installed on your machine, please see README file for more details

The mechanism on the requester and responder side works similarly, it is based on a few components. First of all, there is a ResumableFramesStore which works as a buffer for the frames. By default, it stores them in memory, but we can easily adjust it to our needs by implementing the ResumableFramesStore interface (e.g. store the frames in the distributed cache, like Redis). The store saves the data emitted between keep-alive frames, which are sent back and forth periodically and indicates if the connection between the peers is stable; moreover, the keep-alive frame contains the token, which determines Last received position for the requester and the responder. When the peer wants to resume the connection, it sends the resume frame with an implied position. The implied position is calculated from last received position (is the same value we have seen in the Keep-Alive frame) plus the length of the frames received from that moment. This algorithm is applied to both parties of the communication, in the resume frame is it reflected by last received server position and first client available position tokens. The whole flow for resume operation is shown in the diagram below:

Image title

By adopting the resumability mechanism built in the RSocket protocol, with the relatively low effort we can reduce the impact of the network issues. Like shown in the example above, the resumability might be extremely useful in the data streaming applications, especially in the case of the device to the cloud communication.

Summary

In this article, we discussed more advanced features of the RSocket protocol, which are helpful in reducing the impact of the network on the system operationality. We covered the implementation of the client load balancing pattern and resumability mechanism. These features, combined with the robust interaction model constitutes the core of the protocol.

In the last article of this mini-series, we will cover available abstraction layers on top of the RSocket.

Please notice that fully working examples are provided here

Load balancing (computing) RSocket

Published at DZone with permission of Rafał Kowalski. See the original article here.

Opinions expressed by DZone contributors are their own.

Related

  • Zero Trust for AWS NLBs: Why It Matters and How to Do It
  • A Guide to Microservices Deployment: Elastic Beanstalk vs Manual Setup
  • A Deep Dive on Read Your Own Writes Consistency
  • Mastering Load Balancers: Optimizing Traffic for High Availability and Performance

Partner Resources

×

Comments

The likes didn't load as expected. Please refresh the page and try again.

ABOUT US

  • About DZone
  • Support and feedback
  • Community research
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • [email protected]

Let's be friends: