Over a million developers have joined DZone.

On Infinite Scalability

· Performance Zone

Download Forrester’s “Vendor Landscape, Application Performance Management” report that examines the evolving role of APM as a key driver of customer satisfaction and business success, brought to you in partnership with BMC.

Udi Dahan posted about the myth of infinite scalability. It is a good post, and I recommend reading it. I have my own 2 cents to add, though.


When building a system, I am always assuming an order of magnitude increase in the work the system has to do (usually, we are talking about # of requests / time period).

In other words, if we have 50,000 users, and we have roughly 5,000 active users during work hours, we typically have 100,000 requests per hour. My job is to make sure that we can manage to get up to 1,000,000 requests per hour. (Just to give you the numbers, we are talking about moving from ~30 requests per second to ~300 requests per second).

Why do we need that buffer?

There are several reasons to want to do that. First, we assume that the system is going to be in production for a relatively long time, so the # of user or their activity is going to grow. Second, in most systems, we are usually talking about some % of the users being active, but there are usually times when you have a significantly more users being active. For example, at end of tax year, an accounting system can be expected to see a greater utilization, as well as at every end of the month.

And within that buffer, we don’t need to change the system architecture, we just need to add more resources to the system.

And why a full order of magnitude?

Put simply, because it is relatively easy to get there. Let us assume end of year again, and now we have 15,000 active users (vs. the “normal” 5,000). But the amount of work we do isn’t directly related to the number of users. It is related to how active they are and what operations they are doing. In such a scenario, it is more likely that the users will be more active and stress the system more.

Finally, there is another aspect. Usually, moving a single order of magnitude up is no problem. In fact, I feel comfortable running on a single server (maybe replicate for HA) from 1 request per second to 50 or so. That is 3,600 request per hour to 180,000 requests per hour. But beyond that, we are going to start talking about multiple servers. We can handle a millions requests per hour on a web farm without much problems, but moving to the next level is likely to require more changes, and when you get beyond that, you require more changes still.

Note, those changes are not changes to the code, they are architectural and system changes. Where before you had a single database, now you have many. Where before you could use ACID, now you have to use BASE. You need to push a lot more tasks to the background, the user interaction changes, etc.

Of course, you can probably start your application design with a 10,000,000 requests per hour. That is big enough that you can move to a hundred million requests per hour easily enough. But, and this is important, is this something that is valuable? Is this really something that you can justify to the guys writing the checks.

Scalability is a cost function, as Udi has mentioned. And you need to be aware that this cost incurs during development and during production. If your expected usage can be handled (including the buffer) with a simpler architecture, go for it. If you grow beyond an order of magnitude, there are three things that will happen:

  1. You have the money to deploy a new system for the new load.
  2. You now have a better idea on the actual usage, rather than just guessing about the future.
  3. You are going to change how the system work anyway, from the UI to the internal works.

The last one is important. I am guessing that many people would say “but we can do it right the first time”. And the problem is that you really can’t. You don’t know what users will do, what they will like, etc. Whatever you are guessing now, you are bound to be wrong. I have had that happen to me on multiple systems.

More than that, remember time to market is a big feature as well.

See Forrester’s Report, “Vendor Landscape, Application Performance Management” to identify the right vendor to help IT deliver better service at a lower cost, brought to you in partnership with BMC.


Published at DZone with permission of Ayende Rahien, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

The best of DZone straight to your inbox.

Please provide a valid email address.

Thanks for subscribing!

Awesome! Check your inbox to verify your email so you can start receiving the latest in tech news and resources.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}