Many performance testers and engineers underestimate the importance of analyzing historical end-user access patterns when developing the workload model for performance testing or application capacity planning. In the majority of my audits involving root cause analysis (RCA) of production performance issues, the culprit turns out to be the wrong workload. The performance test strategy talks about the various types of tests that will be planned, the infrastructure components that will be monitored, the type of analysis that will be performed, and so on, but when it comes to the workload, it is always expected to be provided by the customer or the business analysts. Our customers and business analysts definitely know who the end users are and which business flows are used most frequently, but it is the sole responsibility of the performance tester or engineer to understand (and sometimes educate customers about) the additional detailed analysis required to increase the accuracy of our performance tests.
We need to remember that if the workload selected for running the performance tests does not reflect a realistic end-user access pattern, the entire set of test results will be misleading, and everything that depends on those results will be in jeopardy. Keep the following points in mind during your workload analysis:
- Analyze your end-user access patterns. Try to understand your users’ behavior.
- Identify your average and peak point workloads in your historical trends.
- Define your peak point workload both quantitatively and qualitatively.
- During peak point pattern analysis, take care to exclude outliers.
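As a rough illustration of the points above, here is a minimal Python sketch that derives an average and a peak hourly load from historical counts while setting aside high outliers. The hourly figures are made up for the example, and the Tukey IQR fence is just one common outlier rule that I am assuming here, not the only valid choice.

```python
import statistics


def summarize_workload(hourly_requests, iqr_factor=1.5):
    """Return average and peak hourly load after dropping high outliers.

    Outliers are values above Q3 + iqr_factor * IQR (Tukey's rule).
    This rule is an assumption for the sketch, not a universal standard.
    """
    q1, _, q3 = statistics.quantiles(hourly_requests, n=4)
    upper_fence = q3 + iqr_factor * (q3 - q1)
    kept = [x for x in hourly_requests if x <= upper_fence]
    outliers = [x for x in hourly_requests if x > upper_fence]
    return {
        "average": statistics.mean(kept),
        "peak": max(kept),
        "outliers": outliers,
    }


# Hypothetical hourly request counts taken from access logs.
# The last value stands for a one-off spike (for example, a crawler run).
sample = [1200, 1350, 1500, 1800, 2100, 2400,
          2300, 2200, 1900, 1600, 1400, 9800]

summary = summarize_workload(sample)
print(summary)
```

A purely statistical rule cannot tell a one-off spike from a genuine seasonal peak, so review every flagged value qualitatively before discarding it.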
Now create the workload model to be used for your performance tests. You can have more than one workload model: the workload used for your load test can be different from that of your endurance test or stress test. It all depends on your end-user access patterns. Remember, knowing the basics of queuing theory (the operational laws) can help you validate the correctness of your workload model and even, to an extent, whether your peak hour SLA is valid.
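One way to run such a sanity check is with Little's Law in its interactive form, N = X × (R + Z), where N is the number of concurrent users, X the throughput, R the response time and Z the think time. The sketch below assumes a closed workload with a fixed user population; all the numbers are illustrative.

```python
def expected_throughput(users, response_time_s, think_time_s):
    """Throughput (requests/s) implied by Little's Law: X = N / (R + Z)."""
    return users / (response_time_s + think_time_s)


def required_users(throughput, response_time_s, think_time_s):
    """Concurrent users needed to drive a target throughput: N = X * (R + Z)."""
    return throughput * (response_time_s + think_time_s)


# Hypothetical workload model: 1000 users, 2 s response time SLA,
# 18 s average think time between requests.
x = expected_throughput(users=1000, response_time_s=2.0, think_time_s=18.0)
n = required_users(throughput=50.0, response_time_s=2.0, think_time_s=18.0)
print(f"implied throughput: {x:.1f} req/s, users for 50 req/s: {n:.0f}")
```

If the throughput in your workload model differs widely from what the user count, think time and SLA imply, at least one of those numbers is wrong and should be revisited before testing.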
If you are dealing with a business-critical, high-availability application where the effort is really worth it, spend time understanding the underlying statistical distribution of your peak traffic hour workloads. Web applications accessed by independent, geographically distributed users usually follow a Poisson distribution or a self-similar distribution. In simple terms, asking which distribution your application workload belongs to means analyzing how many bursts and spikes your peak hour workload has. Representing the burstiness of your traffic with a metric called the Hurst exponent, and employing one of the various techniques for estimating its value, will indicate which statistical distribution your application falls into and how much your peak hour workload can vary in the future.
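To give a feel for what estimating the Hurst exponent involves, here is a minimal sketch of one of the simpler estimators, the aggregated variance method. It averages the series over blocks of growing size m and uses the fact that, for a self-similar process, the variance of those block means scales roughly as m^(2H - 2). A value of H near 0.5 points to Poisson-like traffic with no long-range dependence, while values approaching 1 point to bursty, self-similar traffic. The function name, the block sizes and the synthetic input are my own assumptions, and serious work should cross-check several estimators.

```python
import math
import random
import statistics


def hurst_aggregated_variance(series, block_sizes=(1, 2, 4, 8, 16, 32)):
    """Estimate the Hurst exponent H with the aggregated variance method."""
    log_m, log_var = [], []
    for m in block_sizes:
        n_blocks = len(series) // m
        # Mean of each non-overlapping block of m consecutive samples.
        means = [statistics.fmean(series[i * m:(i + 1) * m])
                 for i in range(n_blocks)]
        log_m.append(math.log(m))
        log_var.append(math.log(statistics.pvariance(means)))
    # Least-squares slope of log(variance) against log(block size).
    mean_x = statistics.fmean(log_m)
    mean_y = statistics.fmean(log_var)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(log_m, log_var))
             / sum((x - mean_x) ** 2 for x in log_m))
    # Variance scales as m^(2H - 2), so H = 1 + slope / 2.
    return 1 + slope / 2


# Synthetic requests-per-second counts drawn independently of each other,
# standing in for non-bursty traffic; real input would come from your logs.
random.seed(7)
counts = [random.randint(80, 120) for _ in range(4096)]
h = hurst_aggregated_variance(counts)
print(f"estimated H = {h:.2f}")
```

Because the synthetic counts are independent, the estimate should land close to 0.5. Peak hour data from a bursty application would be expected to give a noticeably higher value.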
For application capacity planning, choosing the right workload peak points is essential. Unless you choose a series of peak point workloads from your historical statistics and understand, both quantitatively and qualitatively, what the workload really comprises, you will not succeed in accurately forecasting hardware demands for your application. A carefully selected workload is what makes it possible to apply analytical modeling techniques to the what-if scenarios the business asks about. Without doing this basic homework, you cannot correctly size your infrastructure for the projected business loads.
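As a small example of such a what-if analysis, the sketch below uses the simplest analytical model I can think of, a single open queue (M/M/1), where the response time is R = D / (1 - U) and the utilization is U = X × D. The service demand of 8 ms, the baseline of 50 requests per second and the growth factors are all hypothetical, and a real system with several tiers would need a richer model.

```python
def mm1_response_time(service_demand_s, arrival_rate):
    """Response time of a single M/M/1 queue: R = D / (1 - U), with U = X * D."""
    utilization = arrival_rate * service_demand_s
    if utilization >= 1.0:
        return float("inf")  # the resource is saturated, the queue grows without bound
    return service_demand_s / (1.0 - utilization)


# Hypothetical inputs: 8 ms of CPU per request, 50 req/s at today's peak.
demand_s = 0.008
baseline_rate = 50.0

for growth in (1.0, 1.5, 2.0, 2.4):
    rate = baseline_rate * growth
    r = mm1_response_time(demand_s, rate)
    print(f"{growth:.1f}x load ({rate:.0f} req/s): "
          f"U = {rate * demand_s:.0%}, R = {r * 1000:.1f} ms")
```

The point of the exercise is the shape of the curve: response time degrades slowly at first and then very sharply as utilization approaches 100%, which is why the chosen peak workload matters so much.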
Also, most capacity planning techniques require actual application performance benchmarks for careful extrapolation and forecasting. Performance benchmarking is very important in capacity planning for understanding the hardware resource requirements (represented as service demands) and other performance characteristics of your application. Using the right workload to carry out performance benchmarking is the first step towards successful application capacity planning.
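Service demands can be derived from a benchmark run with the Service Demand Law, D = U / X, where U is the measured utilization of a resource and X the measured throughput. Here is a minimal sketch that assumes a single CPU resource and made-up benchmark figures.

```python
def service_demand(utilization, throughput):
    """Service Demand Law: D = U / X (seconds of resource time per request)."""
    return utilization / throughput


def forecast_utilization(demand_s, projected_throughput, servers=1):
    """Utilization Law applied to a projected load, spread evenly over servers."""
    return projected_throughput * demand_s / servers


# Hypothetical benchmark result: 40% CPU utilization at 50 req/s.
cpu_demand = service_demand(utilization=0.40, throughput=50.0)

# Forecast for a projected business load of 100 req/s on the same server.
projected_u = forecast_utilization(cpu_demand, projected_throughput=100.0)

# Upper bound on throughput before this resource saturates: X_max = 1 / D.
max_throughput = 1.0 / cpu_demand

print(f"CPU demand: {cpu_demand * 1000:.1f} ms/request")
print(f"projected utilization at 100 req/s: {projected_u:.0%}")
print(f"saturation throughput: {max_throughput:.0f} req/s")
```

These numbers are only as good as the benchmark behind them: if the benchmark workload mix differs from production, the measured service demand and every forecast built on it will be off.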
Happy workload analysis and modeling!!