Recently I visited Toronto in Ontario, Canada, to speak at a gathering of web developers and DevOps practitioners. (You can see the slides from my talk here.) Afterwards, I had the opportunity to take a ride up the CN Tower, which literally towers over the city at a total height of 1815 feet (or 553 meters, as the locals say). The observation area has windows all around to view Toronto. It also has a glass floor through which I could look down past my feet, 1122 feet to the ground.
While I was admiring this feat of engineering, I had a revelation…
The CN Tower’s Glass Floor is Like Your Site
Sounds far-fetched? Here’s what I mean.
One of the big lessons learned from 2014 is the need to load test your site. While I don’t expect that the CN Tower tested their glass floor with 3.5 orcas or 35 moose or 41 polar bears, etc., I do expect that at some time, long before it was installed, there was a strength test of the glass, and it passed as more than sufficient for the load it would have to bear with lots of tourists standing on it. In fact, I learned that each panel is load tested annually to ensure safety!
When you load test with simulated visitors, you know you have enough capacity to serve all of the expected visitors to your site. If your site is like the CN Tower, the number of visitors is limited by the space in the visitor area: only so many people can fit there, so you know what the site capacity has to be. More likely, the number of visitors to your site is not limited and any number of guests can come to your site at the same time, seeking the business value you deliver, such as:
- Filing their taxes before the deadline
- Submitting insurance plan choices during open enrollment period
- Responding to a commercial they just saw during the Super Bowl
- Saving with special discount offers during a limited-time flash sale
- Shopping for Black Friday specials before Christmas
- Getting up-to-date news about a late-breaking situation
When you have high simultaneous demand, your site can be overwhelmed by too many visitors.
How Much Load Can Your Site Handle?
There are two simple ways to find out how much load your site can handle:
- You can test it with load testing software in advance of your peak time, or
- You can watch customers come to your site at what might be the most important time of the year for your company and see if the site can stay up.
The second option is probably not the way you want to discover the upper limits of your site, though that’s what has happened to many sites this year and in years past. Every holiday season, retail sites go down due to heavy demand. A few April 15ths ago, a tax filing site went down due to the last-minute filing load. And we all remember HealthCare.gov’s epic fail in 2013, when the site couldn’t handle the load from all of the visitors who wanted to sign up for healthcare.
Once you decide that you do want to do a load test of your site, you have some things to figure out and choices to make:
- You have to figure out what to test: it might not make sense to load test everything, as that could take a very long time.
- You also have to choose how to write your tests.
- And you need to choose when and where to run your tests.
What to Test
If you have site access logs — or even better, if you have real user analytics data — you can analyze the data to get the ranking of all of your site’s pages. The homepage is probably number one. After that, the organization of your site and what the users are trying to do drives the flow.
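As a rough illustration of ranking pages from access logs, here is a minimal Python sketch. It assumes your logs follow the Common Log Format, and the `rank_pages` function and the sample lines are made up for this example; your own log format and tooling may differ.

```python
import re
from collections import Counter

# Matches the request portion of a Common Log Format line,
# e.g. "GET /cart HTTP/1.1". Assumes that format; adjust for your logs.
REQUEST_RE = re.compile(r'"(?:GET|POST|HEAD) (\S+) HTTP/[\d.]+"')

def rank_pages(log_lines, top_n=10):
    """Return the top_n (path, hits) pairs, most requested first."""
    counts = Counter()
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match:
            # Drop the query string so /sale?ref=home counts as /sale.
            path = match.group(1).split("?", 1)[0]
            counts[path] += 1
    return counts.most_common(top_n)

# Hypothetical sample data, for illustration only.
sample_log = [
    '203.0.113.5 - - [10/Nov/2015:13:55:36 +0000] "GET / HTTP/1.1" 200 2326',
    '203.0.113.5 - - [10/Nov/2015:13:55:40 +0000] "GET /sale?ref=home HTTP/1.1" 200 1804',
    '203.0.113.9 - - [10/Nov/2015:13:56:01 +0000] "GET / HTTP/1.1" 200 2326',
    '203.0.113.9 - - [10/Nov/2015:13:56:10 +0000] "GET /sale HTTP/1.1" 200 1804',
    '203.0.113.9 - - [10/Nov/2015:13:56:30 +0000] "GET /cart HTTP/1.1" 200 950',
    '203.0.113.7 - - [10/Nov/2015:13:57:02 +0000] "GET / HTTP/1.1" 200 2326',
]

for path, hits in rank_pages(sample_log):
    print(f"{hits:5d}  {path}")
```

In practice you would read the lines from your real log files and probably filter out requests for images, scripts, and other static assets before ranking.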
If your site is retail and the homepage is filled with attractive promotions, users may be guided by those to specific sale or section pages, and further to product pages, a shopping cart page, and on to checkout.
Other sites might have menu bars at the top of the page that users browse to find the product of interest, leading to a free trial or purchase page.
Or maybe users come to your site to search for information so they quickly go from the homepage to a search results page and then deeper into the site based on the top results returned.
If you’ve been collecting real user data about your site visitors, your RUM tool should be able to show you the top paths through your site, just as this visualization from SOASTA’s Data Science Workbench (DSWB) shows the page paths that lead to a purchase on one retailer’s website.
You have to decide which pages and flows through your site to test at the expected capacity — and higher — to know if your site will handle the load of customers doing what they are expected to do on your site. If you don’t know how users go through your site, adding real user monitoring (RUM) software to the site will record all your site’s user activity and make it available for analysis. Some tools in this space are Google Analytics, Adobe Analytics, and, of course, SOASTA mPulse.
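If you can export per-session page sequences from your analytics tool, finding the most common flows can be sketched in a few lines of Python. The `sessions` structure and the `top_flows` function below are hypothetical, not the export format of any particular RUM product.

```python
from collections import Counter

def top_flows(sessions, depth=3, top_n=5):
    """Count the most common opening page sequences across sessions.

    sessions: list of page-path lists, one list per visitor session.
    depth:    how many pages from the start of each session to compare.
    """
    counts = Counter(tuple(pages[:depth]) for pages in sessions if pages)
    return counts.most_common(top_n)

# Hypothetical sessions, for illustration only.
sessions = [
    ["/", "/sale", "/product", "/cart", "/checkout"],
    ["/", "/sale", "/product"],
    ["/", "/search", "/product"],
    ["/", "/sale", "/cart"],
]

for flow, visits in top_flows(sessions):
    print(visits, " -> ".join(flow))
```

The flows with the highest counts are reasonable first candidates for load test scenarios.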
How to Create Your Load Tests
When and Where to Run Your Load Tests
During development of your app, you want to run performance tests, which are just light load tests using a small number of virtual users, against your code with each build. In this way you can measure the performance of the new code, and track whether it is remaining steady, getting better, or getting worse and needs attention.
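To make the idea of a light, per-build load test concrete, here is a minimal Python sketch using only the standard library. Everything in it is an assumption for illustration: `request_fn` stands in for whatever call exercises your app, and the 10% tolerance is an arbitrary choice, not a recommendation from any tool.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor

def run_light_load_test(request_fn, virtual_users=5, requests_per_user=10):
    """Run request_fn from several threads and return all response times."""
    def one_user(user_id):
        timings = []
        for _ in range(requests_per_user):
            start = time.perf_counter()
            request_fn()
            timings.append(time.perf_counter() - start)
        return timings

    # Each worker thread plays the role of one virtual user.
    with ThreadPoolExecutor(max_workers=virtual_users) as pool:
        per_user = list(pool.map(one_user, range(virtual_users)))
    return [t for timings in per_user for t in timings]

def compare_to_baseline(timings, baseline_median, tolerance=0.10):
    """Classify this build as 'better', 'worse', or 'steady' vs. a baseline."""
    median = statistics.median(timings)
    if median > baseline_median * (1 + tolerance):
        return "worse"
    if median < baseline_median * (1 - tolerance):
        return "better"
    return "steady"

def fake_request():
    # Stand-in for a real request to your app (hypothetical).
    time.sleep(0.001)

timings = run_light_load_test(fake_request, virtual_users=3, requests_per_user=5)
print(len(timings), "requests, median", statistics.median(timings), "seconds")
```

Storing the median from each build and feeding the previous value in as `baseline_median` gives a simple trend signal; a dedicated load testing tool will offer far more than this sketch does.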
Once your app is in production, you’re not done with testing. Now is when you need load testing at scale — equal to or even much greater than the expected load from your customers. Some companies are not comfortable with load testing the production systems but this is the only true method to determine if the production systems, and the internal and external systems upon which they depend, can handle the load.
Without a production load test, how could you determine if the complete system — app servers, web servers, load balancers, content delivery network (CDN) partners, third-party plug-ins, etc. — will perform well with thousands to millions of simultaneous users accessing it? If it is not tested in advance, it will be tested by the users when they come… and it might not pass.
Scheduling your load test is something that your testing and operations teams can do. Look for times when the number of customers is lower, such as on weekends and late at night, maybe when most of your site customers are offline and sleeping. This way if the load test does expose problems in the production systems, fewer customers are affected. By monitoring the production systems and response time and errors during the tests, you can choose to abort a test that is causing the production system to be stressed, before it fails.
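The abort decision described above can be expressed as a simple rule over recent measurements. The Python sketch below is one possible way to do it; the `should_abort` function and its thresholds (5% errors, 3-second 95th percentile) are illustrative assumptions that you would replace with limits appropriate to your own site.

```python
import math

def should_abort(samples, max_error_rate=0.05, max_p95_seconds=3.0):
    """Decide whether to stop a load test based on a recent window of samples.

    samples: list of (response_seconds, ok) tuples, where ok is False
             for a failed or errored request.
    """
    if not samples:
        return False

    errors = sum(1 for _, ok in samples if not ok)
    error_rate = errors / len(samples)

    # Nearest-rank 95th percentile of response times.
    durations = sorted(seconds for seconds, _ in samples)
    rank = max(1, math.ceil(0.95 * len(durations)))
    p95 = durations[rank - 1]

    return error_rate > max_error_rate or p95 > max_p95_seconds

# Hypothetical measurement windows, for illustration only.
healthy = [(0.2, True)] * 100
failing = [(0.2, True)] * 90 + [(0.2, False)] * 10
print(should_abort(healthy), should_abort(failing))
```

Checking a rule like this every few seconds during the test lets you stop before a stressed production system actually fails.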
Doing your production load testing long before the anticipated customer load is vital, so that you have time to address any problems exposed in the first load test, apply fixes, and test again. You may need more than two rounds, and it is better to have extra time for deeper stress testing than to have just enough time.
When to Get Started (Hint: The Answer is Now!)
Don’t be like HealthCare.gov, which “just didn’t have time to test”. Even testing just once per year (like the CN Tower team) before your busy season might not be enough. If you leave time for just one test, and then during that test find failures that need to be addressed, you might not have time to address those issues and test again to verify that they were resolved.
Better to start testing now, before the busy season, and leave extra time to test and fix and test again, so you won’t be like that tax site or those retailers whose sites went down just when the customers wanted and needed them to be up. You would hate to lose those customers to another site that isn’t down. As they say in marketing, it’s cheaper to keep a customer than to acquire a new one.