The 6 Most Common Performance Testing Mistakes, and How to Fix Them
Straight from our new Performance: Testing and Tuning Guide, here's a look at some of the most frequent testing errors devs make.
As a performance testing consultant for the last 10 years, I've found that looking for performance patterns has become second nature. One of the patterns I have observed over my career is that, regardless of the project size, company, or technology being used, the same types of performance testing mistakes get made over and over and over.
This is fundamentally unsurprising, as human nature is the same regardless of the company, and every project and business environment has deadlines. Sometimes those deadlines mean testing simply must get done, making it easy to cut corners to get the results "over the line." Unfortunately, however, those shortcuts can lead to costly performance testing mistakes and oversights.
With a bit of self-awareness and some helpful tools, however, you can often mitigate these mistakes quite easily.
1. Inadequate User Think Time in Scripts
Hitting your application with hundreds or thousands of requests per second without any think time should be reserved for the rare cases where you need to simulate that type of behavior, such as a Denial of Service attack. 99% of people, however, are not testing for this scenario; they are just trying to verify that their site can handle a target load. In that case, user think time is very important.
I've seen many people run tests with thousands of users, no think time, and a single load injection node, then ask:
Why are my response times slow? When I manually visit the site using my browser, it seems to be responding just fine.
The cause is almost always the sheer volume of requests the load injection node is tasked with generating, which quickly maxes out the machine's resources and triggers local exceptions.
You can address this by simply adding some sort of think/wait time in between your virtual user's steps, effectively slowing the user down to a more realistic pace. A real-world user would never make back-to-back page requests within 1 second. Think time allows you to add a human pause, mimicking a real person who would wait a few seconds before further interacting with the site. An easy solution is to use a Gaussian Random Timer when designing your tests, which allows your users to interact in a random fashion just as they would in a real-world scenario.
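As a rough illustration of what a timer like this does, here is a minimal Java sketch, assuming a 5-second base delay with a 1-second deviation (the class and method names are my own, not JMeter's):

import java.util.Random;

public class GaussianThinkTime {
    private static final Random RANDOM = new Random();

    // Pause for a constant base delay plus a normally distributed offset,
    // roughly how a Gaussian Random Timer is configured in JMeter.
    static void pause(long constantDelayMs, long deviationMs) throws InterruptedException {
        long delay = constantDelayMs + (long) (RANDOM.nextGaussian() * deviationMs);
        Thread.sleep(Math.max(0, delay));
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("Requesting page 1...");
        pause(5000, 1000); // a human-like pause of roughly 4-6 seconds
        System.out.println("Requesting page 2...");
    }
}

In JMeter itself, you would simply attach a Gaussian Random Timer element to your samplers rather than coding this by hand.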
2. Using an Inaccurate Workload Model
A workload model is the detailed plan that you should be writing your scripts against. This workload model should outline your business processes, the steps involved, number of users, number of transactions per user, and calculated pacing for each user.
Having an accurate workload model is critical to the overall success of your testing. Often this is easier said than done; there have been many projects where business requirements simply didn't exist because no one considered them beyond "the system should be fast."
For existing applications, it's always worthwhile to have access to production statistics, which will show a wealth of information such as:
- What are the most common/popular transactions?
- How many of each transaction happens on a typical business day?
- How many of each transaction happens on a peak day?
- What are the general times or periods that these transactions tend to occur?
- What are the transactions that have a high business cost if they were to fail under load?
From these statistics, you can assemble a picture of the workload model you need to simulate.
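For example, the pacing figure mentioned above falls straight out of those numbers. Here is a minimal sketch, using invented figures, of how per-user pacing is typically derived from a peak-hour transaction target:

public class PacingCalculator {
    public static void main(String[] args) {
        // Hypothetical figures taken from production statistics.
        int concurrentUsers = 50;
        int ordersPerPeakHour = 1200; // target throughput for a "place order" transaction

        // Each user must start (ordersPerPeakHour / concurrentUsers) iterations per hour,
        // so the gap between iteration starts -- the pacing -- is:
        double pacingSeconds = 3600.0 * concurrentUsers / ordersPerPeakHour;

        System.out.printf("Each of the %d users starts an iteration every %.0f seconds%n",
                concurrentUsers, pacingSeconds); // => every 150 seconds
    }
}

Note that the pacing interval must be longer than the iteration itself takes to run, otherwise the model is not achievable with that user count.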
For new applications, the general rule is to work as closely as possible with the business representatives to ascertain realistic, agreed-upon figures that are well defined, simple, and testable.
After the first testing cycle and once the application is released, ongoing monitoring of usage in the production environment will help provide feedback for the next round of testing, where the workload model can be tuned further.
3. Setting Up Inadequate Infrastructure Monitoring
Your execution results like throughput, transaction response times, and error information aren’t overly helpful unless you can see how your target infrastructure is coping with the scenario.
It's a common problem: I have heard many testers ask why their response times are taking minutes instead of seconds. The problem can lie either in the load generation setup or in the target application infrastructure.
How do you solve this problem? The ideal solution is to have custom monitoring dashboards for all your on-demand load injection infrastructure. Some load testing platforms, like Tricentis Flood, provide custom monitoring dashboards upon request. This enables you to view system resource utilization while running your tests, ensuring that no bottlenecks are present on the load generation side.
On the target application side, monitoring can be implemented to help diagnose bottlenecks or sluggish performance during a load test. There are well over 100 application monitoring services available right now, all with differing price points and features. I have used many of the more popular services such as New Relic, AppDynamics, and Splunk, but these can get quite expensive as a full-stack monitoring option. There are also a few free, open-source alternatives such as Nagios and InspectIT. They are not as polished, but with a little know-how and time spent setting them up, they can be worthwhile and very cost-effective.
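If your platform doesn't provide injector-side dashboards, you can approximate a basic one yourself. Here is a minimal sketch that polls CPU and memory on a load injection node using the JDK's com.sun.management extension of OperatingSystemMXBean (available on HotSpot JVMs; the 5-second interval and sample count are arbitrary choices):

import java.lang.management.ManagementFactory;
import com.sun.management.OperatingSystemMXBean;

public class InjectorMonitor {
    public static void main(String[] args) throws InterruptedException {
        OperatingSystemMXBean os = (OperatingSystemMXBean)
                ManagementFactory.getOperatingSystemMXBean();

        // Sample system CPU load and free memory every 5 seconds during a test run.
        for (int i = 0; i < 12; i++) {
            double cpuPct = os.getSystemCpuLoad() * 100; // negative if the value is unavailable
            long freeMb = os.getFreePhysicalMemorySize() / (1024 * 1024);
            System.out.printf("cpu=%.1f%% freeMem=%dMB%n", cpuPct, freeMb);
            Thread.sleep(5000);
        }
    }
}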
4. Using Hard Coded Data in Every Request
Using the same data in your HTTP request for every user is not a realistic usage scenario. Often, smarter applications and existing database technology will recognize identical requests and automatically cache them, making the overall system appear faster than it is. This leads to an invalid performance test.
Let's take the simple act of registering a new user for a typical online shopping website. Most (if not all) sites won’t let the same user be registered more than once without some unique information being specified.
Here’s an example JSON payload that can be used within a JMeter request for adding a new customer:
{
"firstName": "Teddy",
"lastName": "Smith",
"mobile": "0470123766",
"email": "user@domain.ext",
"currencyCode": "AUD",
"currentMembershipStatus": "LVL 1 Member"
}
You would not be able to keep submitting this request into a shopping website, as there would be some verification required, particularly on the mobile and email fields.
Instead, we can make these two fields unique so we can run this request successfully each time. With JMeter, we can easily use built-in random number generators to ensure that fields like phone numbers and email addresses are unique.
{
"firstName": "Teddy",
"lastName": "Smith",
"mobile": "${__Random(0000000000,9999999999)}",
"email": "${__RandomString(10,abcdefghijklmnopqrstuvwxyz1234567890)}@domain.ext",
"currencyCode": "AUD",
"currentMembershipStatus": "LVL 1 Member"
}
Here, we have made two simple code changes:
1. We replaced the mobile field with JMeter's __Random function, which generates a ten-digit random number for the mobile phone number field.
2. We replaced the email field with JMeter's __RandomString function, which generates a ten-character email username followed by @domain.ext.
When we run this payload within an HTTP request in a load test, every single request will have a different mobile number and email address. This simple example of using dynamic data in all your requests can save you from running a lot of invalid tests.
5. Ignoring System or Script Errors Because Response Times and Throughput Look Fine
It is quite easy to gloss over details that could have a huge impact on your test's validity. Here's a common scenario:
A load test runs with a target number of users, and the tester sees that response times and error rates are within acceptable ranges. Throughput, however, is lower than anticipated. How can this be when your load testing platform is reporting very few transaction-related errors?
It could be that you are encountering system-related errors that are not reported in the transaction pass/fail counts, which is why everything looks OK to the untrained eye. A lower-than-anticipated transactions-per-minute or requests-per-minute (RPM) value is the telltale sign of this issue.
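A quick sanity check is to compare the throughput your workload model predicts with what the platform actually reports; a large gap points to silent failures. A minimal sketch, with invented figures and an arbitrary 10% tolerance:

public class ThroughputCheck {
    public static void main(String[] args) {
        int users = 200;
        double pacingSeconds = 60;  // from the workload model
        double reportedTpm = 120;   // from the load test report

        // Each user starts one iteration per pacing interval,
        // so the model predicts users * (60 / pacing) transactions per minute.
        double expectedTpm = users * 60 / pacingSeconds; // = 200 TPM here

        if (reportedTpm < 0.9 * expectedTpm) {
            System.out.printf("Only %.0f of %.0f expected TPM; check the runtime logs "
                    + "for unreported script or system errors%n", reportedTpm, expectedTpm);
        }
    }
}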
Delving into runtime logs will reveal the issue on the scripting side, which can then be fixed. Once you have a test with transaction response times, error rates, and throughput rates all in the expected zone, are you in the clear?
Not just yet. There is almost always an overlooked area that can still impact your load tests: verification of actual system or application logs during the test.
This can be easily monitored if you are using a purpose-built APM (Application Performance Monitoring) tool such as New Relic or AppDynamics, but a lot of the time, you will need to do this manually.
Exceptions can be caused by different factors, so you should carefully analyze why your load test scenario triggered them and what impact they have on your system's performance and stability.
Exceptions often carry an enormous performance overhead due to exception handling routines and any related logging to disk or memory operations. This might not be much of an issue with a single transaction, but multiply this over 1,000 concurrent users and it can severely impact your system's responsiveness.
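To get a feel for that overhead, here is a deliberately naive Java sketch contrasting a quiet code path with one that throws and inspects an exception on every iteration. It is an illustration only, not a rigorous benchmark (a harness like JMH would be needed for that):

public class ExceptionOverhead {
    public static void main(String[] args) {
        int iterations = 100_000;
        long sink = 0; // accumulated so the JIT cannot discard the loops entirely

        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += Math.max(i, 0); // happy path: no exception raised
        }
        long plainMs = (System.nanoTime() - start) / 1_000_000;

        start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            try {
                throw new IllegalStateException("simulated failure");
            } catch (IllegalStateException e) {
                sink += e.getStackTrace().length; // filling in the stack trace is the costly part
            }
        }
        long throwingMs = (System.nanoTime() - start) / 1_000_000;

        System.out.printf("plain=%dms throwing=%dms (sink=%d)%n", plainMs, throwingMs, sink);
    }
}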
6. Overloading Load Generators
The last mistake is overloading load generators due to one or more of the following:
- Too many concurrent users on a single load injection node
- The target site is very image-heavy or CSS-heavy, which impacts the number of concurrent users you can fit on a load injection node
- Load injection node hardware limitations
A site heavy in CSS and images will have a much larger resource utilization footprint than a simple text-only site or plain API calls. This affects the number of threads/users that can be supported per node. So how do you know how many threads/users each node on your load testing platform can comfortably support?
I would generally recommend running initial tests with a low number of users (1-100) as a scaling test. You can use a node resource dashboard that shows CPU and memory metrics to plan for node capacity.
To pinpoint whether you are overloading a load injector, look at error logging, out-of-memory exceptions, and CPU and network utilization stats. CPU usage in the 80%+ region for a sustained period is a sign that the load injection node is being saturated. Watch the network Rx throughput figure as well: each load injection node has a limit, so find out what that limit is and treat anything above it as network saturation.
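On a Linux injection node, you can watch Rx throughput directly from /proc/net/dev. A rough sketch (the interface name eth0 and the 5-second window are assumptions; adjust them for your nodes):

import java.nio.file.Files;
import java.nio.file.Paths;

public class RxMonitor {
    // Returns total received bytes for the given interface, parsed from /proc/net/dev.
    static long rxBytes(String iface) throws Exception {
        for (String line : Files.readAllLines(Paths.get("/proc/net/dev"))) {
            line = line.trim();
            if (line.startsWith(iface + ":")) {
                // Line format: "eth0: <rx bytes> <rx packets> ..."
                String[] fields = line.substring(line.indexOf(':') + 1).trim().split("\\s+");
                return Long.parseLong(fields[0]);
            }
        }
        throw new IllegalArgumentException("interface not found: " + iface);
    }

    public static void main(String[] args) throws Exception {
        long before = rxBytes("eth0");
        Thread.sleep(5000);
        long after = rxBytes("eth0");
        System.out.printf("Rx throughput: %.2f MB/s%n", (after - before) / 5.0 / (1024 * 1024));
    }
}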