We decided to write this post as a reaction to a great comparison of JMeter response data extractors by Vinoth Selvaraj aka Test Automation Guru.
We repeated his experiment with SmartMeter.io instead of JMeter and added two more extractors: the Boundary Body extractor and the JSON extractor.
As Vinoth shows in his experiment, the Regular Expression Extractor performs best among the extractors he tested. It is also the most powerful one, because it searches the text with regular expressions.
The Boundary Body extractor is not as powerful as the regex extractor. It is, however, much easier to use, especially if you are not a regex fan. Because it is so simple, it should also be faster than the Regular Expression Extractor.
We use a test plan similar to Vinoth's. It contains one Thread Group with one Dummy Sampler in it. The response is a dummy XML document, with the Response Data taken from W3Schools. As in the original test, Latency and Response times are turned off and there are no timers. You can see the whole stack in the GitHub repository.
Our XML looks like this:
We will extract the first TITLE in the CD list, so the expected value is “Empire Burlesque.”
The test plan is the same for all tests; the only difference is the extractor applied.
For this testing we used an Amazon t2.micro instance running Ubuntu (AMI ID ami-af455dc9). That way we got the environment quickly. It also allows us to repeat the test at any time under the same conditions and eliminates outside influences such as desktop applications.
We installed SmartMeter.io version 1.4 on this machine, together with all the plugins needed for the Dummy Sampler and the extractors.
After each finished test, we generated a report. These data were collected for comparison:
| Name of extractor | Count | Throughput | % |
This graph captures the Count values.
Again, as in Vinoth’s article, we focus on Count (the number of samples sent in 60 seconds) and Rate (throughput). You can clearly see that the choice of extractor affects the performance of the test.
The test with no post-processors produced roughly 3 million requests. Adding post-processors increases CPU and memory utilization, which lowers that number.
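To make the relation between the two metrics concrete, here is a minimal Python sketch. It assumes that throughput is simply the sample count divided by the 60-second test duration; the function name is ours and is not part of SmartMeter.io, and the only figure used is the approximate baseline mentioned above.

```python
def throughput(sample_count: int, duration_s: float) -> float:
    """Return samples per second for a test of the given duration."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return sample_count / duration_s


# Approximate baseline from the run without post-processors (about 3 million samples).
baseline_count = 3_000_000
print(throughput(baseline_count, 60))  # 50000.0
```

So the baseline run corresponds to roughly 50,000 samples per second, and any extractor that lowers Count lowers the rate proportionally.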
According to the results, we should use the Boundary Body extractor as often as possible. Not only is it the fastest extractor, it is also very simple to use. Just find the strings that surround your “needle” in the “haystack”. For example, if we have the XML from our test:
<CATALOG> <CD> <TITLE>Empire Burlesque</TITLE> <ARTIST>Bob Dylan</ARTIST> <COUNTRY>USA</COUNTRY> <COMPANY>Columbia</COMPANY> <PRICE>10.90</PRICE> <YEAR>1985</YEAR> </CD> </CATALOG>
and we want to find the contents of the <TITLE> tag, just define the Boundary Body extractor with <TITLE> as the left boundary and </TITLE> as the right boundary.
For more information on how to use Boundary Body extractor, see the documentation.
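The idea behind boundary extraction can be sketched in a few lines of Python. This is only an illustration of the technique, not SmartMeter.io's actual implementation; the function name `extract_between` is hypothetical, and the XML is a shortened version of the sample above.

```python
from typing import Optional


def extract_between(text: str, left: str, right: str) -> Optional[str]:
    """Return the text between the first `left` and the next `right`, or None."""
    start = text.find(left)
    if start == -1:
        return None  # left boundary not present
    start += len(left)
    end = text.find(right, start)
    if end == -1:
        return None  # right boundary not present after the left one
    return text[start:end]


xml = (
    "<CATALOG> <CD> <TITLE>Empire Burlesque</TITLE> "
    "<ARTIST>Bob Dylan</ARTIST> </CD> </CATALOG>"
)
print(extract_between(xml, "<TITLE>", "</TITLE>"))  # Empire Burlesque
```

Two plain substring searches are all that is needed, which is one plausible reason why this kind of extractor can be cheaper than a regex match.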
Even though the Boundary Body extractor is fast and efficient, it is unfortunately not omnipotent. One of its problems is that it does not support non-ASCII characters.
It is also not always possible to locate the value with two fixed strings. Sometimes nothing beats a good old regex.
A suitable strategy is therefore to try the Boundary Body extractor first and to fall back to the Regular Expression Extractor only when necessary.
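One way to picture this strategy is the following Python sketch. It is an illustration under our own assumptions, not how SmartMeter.io chains extractors: all function names are hypothetical, and the input uses a TITLE tag with an attribute as an example of a case where fixed boundaries no longer match.

```python
import re
from typing import Optional


def by_boundaries(text: str, left: str, right: str) -> Optional[str]:
    """Cheap attempt: plain substring search between two fixed boundaries."""
    start = text.find(left)
    if start == -1:
        return None
    start += len(left)
    end = text.find(right, start)
    return text[start:end] if end != -1 else None


def by_regex(text: str, pattern: str) -> Optional[str]:
    """More flexible attempt: return the first capture group of the first match."""
    match = re.search(pattern, text)
    return match.group(1) if match else None


def extract(text: str, left: str, right: str, fallback_pattern: str) -> Optional[str]:
    """Try the boundaries first and use the regex only if they find nothing."""
    value = by_boundaries(text, left, right)
    if value is None:
        value = by_regex(text, fallback_pattern)
    return value


# The attribute means the fixed boundary "<TITLE>" does not occur in the text.
xml = '<CD> <TITLE lang="en">Empire Burlesque</TITLE> </CD>'
print(extract(xml, "<TITLE>", "</TITLE>", r"<TITLE[^>]*>([^<]*)</TITLE>"))
# Empire Burlesque
```

In a real test plan the choice is usually made once, when the plan is designed, rather than at run time; the sketch only shows why the regex remains useful as the second option.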