
The Simple Scalability Equation

by Vlad Mihalcea · May. 29, 14 · Performance Zone · Interview

Queueing theory

Queueing theory allows us to predict queue lengths and waiting times, which is of paramount importance for capacity planning. For an architect this is a very handy tool, since queues are not exclusive to messaging systems.

To avoid overloading the system we use throttling. Whenever the number of incoming requests surpasses the available resources, we basically have two options:

  • discarding all overflowing traffic, therefore decreasing availability
  • queuing requests and waiting (for up to a timeout threshold) for busy resources to become available

This behaviour applies to thread-per-request web servers, batch processors or connection pools.
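
The two overflow options can be illustrated with a minimal Python sketch. The `Throttle` class and its parameters are hypothetical names introduced only for this illustration; it relies on nothing beyond the standard `threading.BoundedSemaphore`.

```python
import threading


class Throttle:
    """Hypothetical helper: at most `capacity` tasks may run concurrently."""

    def __init__(self, capacity, acquire_timeout=None):
        self._permits = threading.BoundedSemaphore(capacity)
        # None: discard overflow immediately.
        # A number: queue the caller for up to that many seconds.
        self._acquire_timeout = acquire_timeout

    def try_run(self, task):
        """Return (accepted, result); accepted is False when overflow is rejected."""
        if self._acquire_timeout is None:
            acquired = self._permits.acquire(blocking=False)
        else:
            acquired = self._permits.acquire(timeout=self._acquire_timeout)
        if not acquired:
            return False, None
        try:
            return True, task()
        finally:
            self._permits.release()
```

With `acquire_timeout=None` the overflow is dropped right away (lower availability); with a timeout the caller waits for a busy resource, which is the queueing variant discussed in the rest of the article.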

What’s in it for us?

Agner Krarup Erlang is the father of queueing theory and traffic engineering, being the first to postulate the mathematical models required for provisioning telecommunication networks.

The Erlang formulas are modelled for M/M/k queues, meaning the system is characterized by:

  • arrivals following a Poisson process with rate λ
  • service times following an exponential distribution
  • FIFO request queueing

The Erlang formulas give us the servicing probability for:

  • discarding overflow systems (Erlang B)
  • queueing overflow systems (Erlang C)

This is not strictly applicable to thread pools, as requests are not always serviced fairly and servicing times do not always follow an exponential distribution.
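
For systems that do come close to the M/M/k assumptions, the two probabilities can be computed with the standard Erlang B (overflow discarded) and Erlang C (overflow queued) formulas. The sketch below is only an illustration: the function names are made up for this example, and `offered_load` is assumed to be the arrival rate multiplied by the mean service time (in erlangs).

```python
def erlang_b(servers, offered_load):
    """Probability that a request is discarded when overflow is dropped."""
    blocking = 1.0  # with zero servers every request is blocked
    for k in range(1, servers + 1):
        # Standard recurrence: B(k) = a * B(k-1) / (k + a * B(k-1))
        blocking = offered_load * blocking / (k + offered_load * blocking)
    return blocking


def erlang_c(servers, offered_load):
    """Probability that a request has to wait when overflow is queued."""
    if offered_load >= servers:
        return 1.0  # unstable system: the queue grows without bound
    b = erlang_b(servers, offered_load)
    return servers * b / (servers - offered_load * (1.0 - b))


# Illustrative numbers (assumptions): 40 requests/s, 0.1 s mean service time, 5 servers.
load = 40 * 0.1
print(f"discard probability: {erlang_b(5, load):.3f}")
print(f"wait probability:    {erlang_c(5, load):.3f}")
```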

A general-purpose formula, applicable to any stable system (a system where the arrival rate is not greater than the departure rate), is Little’s Law.

L = \lambda W

where

L – average number of customers in the system
λ – long-term average arrival rate
W – average time a request spends in the system


You can apply it almost everywhere, from shoppers’ queues to web request traffic analysis.

This can be regarded as a simple scalability formula, because to accommodate twice the incoming traffic we have two options:

  1. reduce the response time by half (therefore increasing performance)
  2. double the available servers (therefore adding more capacity)
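
The two options above can be checked numerically. This is a small sketch with illustrative numbers (50 requests per second and a 0.1 s response time are assumptions chosen for the example); `in_flight` is a hypothetical helper that simply evaluates Little's Law.

```python
def in_flight(arrival_rate, time_in_system):
    """Little's Law: L = lambda * W (average number of requests in the system)."""
    return arrival_rate * time_in_system


# Baseline: 50 requests/s, each spending 0.1 s in the system.
baseline = in_flight(50, 0.1)

# Twice the traffic at the same response time needs twice the concurrency
# (option 2: double the servers).
doubled_servers = in_flight(100, 0.1)

# Twice the traffic with half the response time fits in the original
# concurrency (option 1: improve performance).
halved_latency = in_flight(100, 0.05)

print(baseline, doubled_servers, halved_latency)
```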

A real-life example

A simple example is a supermarket waiting line. When you arrive at the line, you must pay attention to the arrival rate (e.g. λ = 2 persons / minute) and the queue length (e.g. L = 6 persons) to find out the amount of time you are going to spend waiting to be served (e.g. W = L / λ = 3 minutes).


A provisioning example

Let’s say we want to configure a connection pool to support a given traffic demand.
The connection pool system is characterized by the following variables:

Ws = service time (the connection acquire and hold time) = 100 ms = 0.1 s
Ls = in-service requests (pool size) = 5

Assuming there is no queueing (Wq = 0):

\lambda = \frac{L_s}{W_s} = \frac{5}{0.1\ s} = 50\ \frac{requests}{s}

Our connection pool can deliver up to 50 requests per second without ever queueing any incoming connection request.

Whenever there are traffic spikes we need to rely on a queue, and since we impose a fixed connection acquire timeout, the queue length will be limited.


Since the system is considered stable, the arrival rate applies both to the queue and to the actual service:

\lambda = \frac{L_s}{W_s} = \frac{5}{0.1} = \frac{L_q}{W_q} = \frac{100}{2}

This queuing configuration still delivers 50 requests per second, but it may also queue up to 100 requests for 2 seconds.
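
The same sizing can be written as two small helpers. This is a sketch under the article's assumptions (a stable system with a constant service time); the function names are hypothetical and not part of any pool library.

```python
def max_throughput(pool_size, service_time_s):
    """Little's Law solved for the arrival rate: lambda = Ls / Ws."""
    return pool_size / service_time_s


def queue_capacity(throughput, acquire_timeout_s):
    """Requests that can wait without exceeding the timeout: Lq = lambda * Wq."""
    return throughput * acquire_timeout_s


throughput = max_throughput(pool_size=5, service_time_s=0.1)
queue_size = queue_capacity(throughput, acquire_timeout_s=2)
print(f"throughput: {throughput:.0f} requests/s, queue capacity: {queue_size:.0f} requests")
```

Plugging in the pool above (5 connections, 0.1 s service time, 2 s acquire timeout) reproduces the 50 requests per second and the 100-request queue.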

A one-second traffic burst of 150 requests would be handled, since:

  • 50 requests can be served in the first second
  • the other 100 are going to be queued and served in the next two seconds

The timeout equation is:

L_{spike} = \lambda_{spike} \times T_{spike}
T = \frac{L_{spike}}{\lambda} = \frac{\lambda_{spike} \times T_{spike}}{\lambda}
L_q = L_{spike} - L_1
T_q = T - 1

where L1 = λ × 1 s is the number of requests served during the first second (50 for our pool); the remainder has to wait in the queue.

So for a 3-second spike of 250 requests per second:

λspike = 250 requests/s
Tspike = 3 s

the number of requests to be served is:

L_{spike} = 250\ \frac{requests}{s} \times 3\ s = 750\ requests
T = \frac{750\ requests}{50\ \frac{requests}{s}} = 15\ s
L_q = L_{spike} - L_1 = 750 - 50 = 700\ requests
T_q = T - 1 = 14\ s

This spike would require 15 seconds to be fully processed, meaning a queue buffer of 700 requests that takes another 14 seconds to drain.
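
The spike calculation can be sketched as a small function. It follows the simplified model used above, in which the requests served during the first second are subtracted from the spike total; `spike_plan` and its parameter names are hypothetical, introduced only for this illustration.

```python
def spike_plan(spike_rate, spike_duration_s, throughput):
    """Return (total requests, drain time, queued requests, queue time)."""
    total = spike_rate * spike_duration_s               # L_spike
    drain_time = total / throughput                     # T
    served_first_second = min(total, throughput * 1.0)  # served without queueing
    queued = total - served_first_second                # L_q
    queue_time = max(0.0, drain_time - 1.0)             # T_q
    return total, drain_time, queued, queue_time


# The 3-second spike of 250 requests/s against a 50 requests/s pool.
total, drain_time, queued, queue_time = spike_plan(250, 3, 50)
print(f"{total} requests, drained in {drain_time:.0f} s, "
      f"{queued:.0f} queued for up to {queue_time:.0f} s")
```

The same function also covers the earlier one-second burst of 150 requests: `spike_plan(150, 1, 50)` gives 100 queued requests and 2 more seconds of queue time. Comparing the returned queue time with the configured acquire timeout tells whether the spike would cause timeouts.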

Conclusion

Little’s Law operates with long-term averages, and it might not suit every traffic burst pattern. That’s why metrics are very important when doing resource provisioning.

The queue is valuable because it buys us more time. It doesn’t affect the throughput. The throughput is only sensitive to performance improvements or more servers.

But if the throughput is constant, then queuing is going to level traffic bursts at the cost of delaying the processing of overflowing requests.

FlexyPool allows you to analyse all traffic data so you’ll have the best insight into your connection pool’s inner workings. The fail-over strategies are safety mechanisms for when the initial configuration assumptions no longer hold.


Published at DZone with permission of Vlad Mihalcea. See the original article here.

Opinions expressed by DZone contributors are their own.
