
Load Balancing of WebSocket Connections

The problem of WebSocket load balancing has always been a hot issue when managing large systems. This article takes a look at some possible solutions to that problem.


The problem of load balancing has always been a hot issue when managing large systems. Load balancing aims to optimize resource use, maximize throughput, minimize response time, and avoid overload of any single resource, so solving this problem is crucial for performance. In this article we’ll take a look at the possible solutions to the problem.

For a better understanding of WS load balancing, let's dive a bit deeper into the TCP socket background. By default, a single server can handle 65,536 socket connections per IP pair, simply because that is the maximum number of TCP ports available. Since WS connections have a TCP nature and each WS client takes up one port, we can definitely say that the number of WebSocket connections is also limited.

Actually, it’s a half-truth. The server can handle 65,536 sockets per single IP address. So the quantity can be easily extended by adding additional network interfaces to a server. Meanwhile, it’s extremely important to track how many connections present on a server. Once the limit is exceeded, you can have a lot of issues with other TCP connections (e.g. it’s not possible to connect to a server via ssh). So it’s a good idea to limit WS connections per node inside your application’s code. We do the same in our apps when we deal with WebSockets.

Once we understand the major limitation and the way to overcome it, let's proceed to load balancing. Below I will describe three approaches we tried in one of our projects. Please note that all the system parts were deployed to AWS, so some of the tips and hints only apply to Amazon configurations.

ELB Approach

The easiest way to implement load balancing is to use the Elastic Load Balancer that AWS provides. It's possible to switch ELB to TCP mode, which enables load balancing of any type of TCP connection, including WebSockets. This approach gives:

  • Automatic failover of LB
  • Autoscaling of load balanced nodes
  • Extremely easy setup

Basically, it’s a good solution for most common cases until you have a splash growth of load. In this case, ELB becomes too slow to establish new connections. It’s possible to contact Amazon and ask them to “pre-warm” ELB, but it was not an option for us due to load-testing purposes when we need quick establishment of thousands of WS connections and for our customers due to usability of the system.

Software Load Balancers

We have tried HAProxy as a load balancer. But to make HAProxy work correctly, one should keep in mind the port limitation issue we talked about above. To make HAProxy handle more than 65k connections, we went through the following steps:

  1. Create a bunch of private IP addresses. To do so, choose your Amazon instance -> Actions -> Networking -> Manage Private IP Addresses. We added 3 IP addresses; just remember that they should be in the same sub-network as your real application server.
  2. Connect to your HAProxy instance via SSH and run the following commands (the addresses are placeholders for the private IPs created in step 1):

     $> ifconfig eth0:1 <private-ip-1>
     $> ifconfig eth0:2 <private-ip-2>
     $> ifconfig eth0:3 <private-ip-3>

     This will add 3 virtual network interfaces to the instance.
  3. Configure HAProxy. Here is a section from the haproxy.cfg file for 3 Erlang nodes accepting WS connections; the backend addresses and source IPs are placeholders matching the interfaces above.

     listen erlang_front :8888
         mode        http
         balance     roundrobin
         timeout connect 1s
         timeout queue 5s
         timeout server 3600s
         option httpclose
         option forwardfor
         server      erlang-1  <erlang-node-1>:8888 source <private-ip-1>
         server      erlang-2  <erlang-node-2>:8888 source <private-ip-2>
         server      erlang-3  <erlang-node-3>:8888 source <private-ip-3>

Now HAProxy can handle more than 65,536 WebSocket connections, and the limit of connections can be easily increased by adding virtual network interfaces. Also, it can establish new connections rather fast.
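As a quick back-of-the-envelope check on that claim (the usable port range is an assumption; the exact range HAProxy may use depends on system configuration):

```javascript
// Rough capacity estimate: each source IP contributes its own pool of
// ephemeral ports toward the backends, so outbound capacity scales
// linearly with the number of IPs bound to the proxy.
const portsPerIp = 65536 - 1024; // skip the well-known ports: 64,512 usable
const sourceIps = 4;             // eth0 plus the 3 virtual interfaces
const capacity = portsPerIp * sourceIps;
console.log(capacity); // prints 258048
```

So even with conservative assumptions, three extra interfaces roughly quadruple the proxy's connection budget.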

This approach seemed to be viable despite the following drawbacks:

  • The failover HAProxy instance should be set up manually using tools like Keepalived;
  • Something has to be done to reconfigure HAProxy whenever you add a new Erlang node;
  • As the number of connections grows, there is no way to scale HAProxy horizontally. Only vertical scaling is available, so as we gain more and more active users, we have to buy a more and more expensive instance for HAProxy (and for its mirror node).

We could have lived with these drawbacks, but we ended up implementing a much simpler solution. That was possible because we already had some code in place, and our system design allowed us to use a custom approach.

Custom Approach

To move forward let’s review the following diagram showing our system architecture.

[Diagram: initial system architecture]

We have a JavaScript client application, an auth application responsible for user authorization, and an Erlang application with main application functionality. And the flow is as follows:

  1. The client makes an HTTP request with credentials to the Auth Application.
  2. The Auth Application checks the credentials, generates a token, and sends it via an HTTP request to the Erlang cluster.
  3. The Erlang app confirms that the token was received and sends an HTTP response with the confirmation to the Auth app.
  4. The Auth app sends an HTTP response to the client application. This response includes the generated token.
  5. The client uses the token to establish a WebSocket connection with the Erlang app through our HAProxy load balancer.

This basic flow was then modified slightly. We added a simple module to our Erlang application to track the number of WebSocket connections on each Erlang node. Due to the distributed nature of Erlang, each node knows about the other nodes' connections, so we can choose the node with the fewest connections. We take the public IP address of that node and send it back to the auth application. The auth application then sends this IP along with the token back to the client, and the client establishes the WS connection using the IP address and token it received. So the final diagram looks like this:

[Diagram: final system architecture]
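The node-choosing step can be sketched as follows. This is a simplified JavaScript model of the logic (in our system it lives in the Erlang cluster), and the node list here is entirely hypothetical:

```javascript
// Pick the node currently holding the fewest WebSocket connections.
// Each entry mirrors what the cluster knows: a public IP and a counter.
function pickNode(nodes) {
  return nodes.reduce((best, node) =>
    node.connections < best.connections ? node : best
  );
}

// Example with made-up documentation-range IPs: the auth app would
// hand "203.0.113.11" back to the client along with the token.
const chosen = pickNode([
  { ip: "203.0.113.10", connections: 1200 },
  { ip: "203.0.113.11", connections: 450 },
  { ip: "203.0.113.12", connections: 980 },
]);
```

Because every node sees the whole cluster's counters, any node can answer the "who is least loaded?" question without a central coordinator.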

Now we can:

  • Get rid of the WS load balancer, which makes our system less complicated
  • Add Erlang nodes without any reconfiguration of other parts of the system

In addition:

  • WS connections are now distributed evenly between the Erlang nodes
  • The system easily scales horizontally
  • We don’t have to use Elastic IPs for Erlang nodes



Published at DZone with permission of Konstantin Shamko. See the original article here.

