In Search of Quality: QA Must Be Engaged in Search Engine Development
In Search of Quality: QA Must Be Engaged in Search Engine Development
Find out which team is developing your organization’s search technologies and ask them how they’re testing them!
Join the DZone community and get the full member experience.Join For Free
If you’re reading this, you’re likely already well aware of the value of watertight QA practice and have a good understanding of what it entails. Yet, there is possibly a team delivering business-critical software at your organization that has thus far escaped the forensic focus of your testing. You need to talk to them, and this blog is a primer to help you do just that.
So, if you want to do one thing today to increase the measurable impact of QA at your organization, do this: find out which team is developing your organization’s search technologies and ask them how they’re testing them. There’s a fair chance that it’s a third-party search specialist, working with their own set of cutting-edge tools. In this case, verifying the quality of their testing practices becomes even more pertinent.
Search Matters – Fortunately, There’s a Community for That!
Search technologies today are delivered by the thriving and innovative search community, leveraging best-of-breed development methods like containerized deployment. They also build and use cutting edge open-source technologies like Apache Lucene, Solr and Kafka.
The search engines this skilled community build are high visibility and high-impact, ranging from search engines on E-Commerce sites, to internal internets, content sharing platforms, and archives. In essence, any technology responsible for providing customers or employees with easy access to the information that organizations want them to see. For a sense of just how many organizations develop their search technologies, see this list of organizations that use Solr to power their online search engines.
These search technologies are a pillar of a smooth and satisfying user experience. They hold the keys to customer acquisition and retention, and therefore also to revenue. E-Commerce provides a clear example: if a user lands on a web store via Google, they are unlikely to trawl through the website for the items they want; however, they are likely to use the search engine to find the items they seek, as well as similar items recommended to them by relevant search results.
Search developers are experts in navigating this “Fuzzy Logic” of search relevancy, striving to make sure that users find satisfaction in their search engine and not a rival process or organization.
Business-Critical Systems Require Industry-Grade Testing
Search engines matter a lot; however, there is often a disconnect between this value and the techniques used to validate that search engines deliver the desired results. This creates a significant amount of negative risk: key components of business-critical, revenue-driving systems are not being sufficiently tested and optimized.
That’s not to say that search specialists are not fine-tuning their systems. They are, for example by validating search results manually with subject matter experts. There are also some great tools out there, like the open-source Quepid.
The challenge is that search engine technologies contain a lot of logic that needs testing. The systems integrate and transform data from a wide range of sources, producing results for an infinite set of possible terms.
This is a familiar challenge for QA: a vast web of interrelated components creates a vast number of data journeys through the system logic. This journey must be tested using a comprehensive range of distinct data inputs to validate that the right output is produced.
Executing such a vast, complex set of data and measuring the results requires a systematic, scientific approach to testing. This science must encompass test case design, test data creation, and data-driven test execution. The volume of tests further requires a high degree of automation in creating and executing the tests, and that means that both expected and actual results must be formulated in a concrete, computer-readable format.
By this measure, the testing work performed by search developers is largely manual and unsystematic. It’s good for smoke testing and verifying that code delivers some results. It’s also good for tuning as you code. However, it’s not a science to match the science of search development. Nor is it capable of rigorously regression testing search engines each time something changes.
This is not the fault of the search specialists. They are experts in the science of search and relevancy, and their time is focused on developing and focusing on highly complex systems. They cannot be expected also to maintain a comprehensive understanding of the equally complex and fast-evolving world of QA. This is where those of us in QA need to reach out and get stuck in, delivering valuable test results to search developers
A Match Made in Heaven?
Search technologies, therefore, present a familiar problem for testers. That’s good news. It’s also a reason why QA should work more closely with search developers, introducing the science of cutting-edge testing to match the cutting-edge development techniques and tools used by the search community.
Further good news comes in the precise skills and knowledge already possessed by search engineers, many of which can be leveraged in QA. As mentioned, search technologies integrate and transform data from numerous sources. Search engineers are therefore adept, for example, in subsetting data and moving it between Solr cores and collections.
What’s missing is coverage-driven testing, and particularly automation in testing. For this, QA should collaborate with search experts to make explicit their subject matter expertise. They should work to understand and explicate the “Fuzzy Logic” of which searches are relevant, building tests on this understanding. By making this sometimes-subjective information concrete, testing can be automated. There are also now some technologies to help with too.
Delivering Results Quickly and Reliably: A Question of Test Data and Test Data Automation
I’ve dedicated much of my time recently to developing testing solutions for Apache Solr and Kafka, particularly automated solutions for data-driven search engine testing. These technologies focus on rapidly finding, securing and creating data, and firing it quickly into Solr and Kafka. The goal is to enable scientific, coverage-based test design, as well as the automation needed to quickly and repeatedly collect test results from evolving search technologies.
Combined with the extensive knowledge of search developers, the following technologies facilitate rapid and rigorous testing for Solr and Kafka.
Automated Data Provisioning
Automated testing for search engines built on Solr and Kafka is data-driven, and efficient testing must be capable of feeding high volumes of data into test environments. Tests can then validate that the actual results match the well-formulated expected result for each test.
However, moving data into Solr and Kafka can be a complex procedure. The data must be drawn from the wide-ranging data sources that the search engine integrates and must be fed consistently into the containers used today to deploy search engine technologies.
To move data, search developers today frequently rely on a complex collection of SQL scripts, or use somewhat dated and now-unreliable technologies. For testers, these approaches are not only too slow, but they risk feeding in misaligned data. That data, in turn, undermines the reliability of test results, producing both false negatives and false positives.
I've therefore recently worked up a high-speed utility for moving data into Solr and Kafka, providing a consistent and repeatable approach to data provisioning.
The automated data provisioning is parallelized to maximize performance. It splits the WHERE clauses used to move data, breaking data into chunks that can be moved in parallel. Last time I used it, the multi-threaded provisioning moved 231,000 rows of data in seconds and with a couple of clicks. The provisions can also be versioned by data source, providing the right configuration for the system under test.
The repeatable, standardized provisioning is furthermore exposed and made available to parallel test teams from an intuitive web portal. That means that QA no longer needs to waste any time waiting for data provisioning. They can simply complete “fill-in-the-blank” style web forms to spin up Docker containers and move data in. Clicking “play” provides on-tap access to data for testing systems built-in Solr and Kafka:
Moving production data to less secure test and development environments present a compliance risk, a subject that you can read about in detail here. In the world of GDPR and CCPA that might apply to data generated by search technologies, which should be kept away from testing environments where possible.
One way to mitigate the risk of a data breach and costly non-compliance is to anonymize data before it is moved to the test. However, this must not create a delay in data provisioning, while the masking must retain the referential integrity of data drawn from multiple sources.
This is the value of integrating reliable data masking with Solr and Kafka data provisioning, securing data seamlessly as it moves to test environments. In this integrated approach, Data Protection Officers have peace of mind that sensitive information is being scrubbed before it reaches test environments. Meanwhile, testers still have on-demand access to the data they need.
However, there are few circumstances where testers want or need a full set of production data in test environments. These large, repetitious data sets will typically run slowly and produce vast, hard-to-analyze results. Instead, testers require a data set to fulfill their exact test scenarios. Subsetting data on-demand helps achieve just this.
The following two types of subsetting are useful during Kafka or Solr data provisioning. First, “Criteria” subsetting, which extracts the data combinations needed to fulfill a specific test scenario. This allows testers to feed a controlled and compact data set into Solr and Kafka. They can then verify that a specific expected result has been achieved.
A “Covered” subset is used when a maximally rich spread of test data is needed. These subsets combine the dimensions of existing data, to boost the number of scenarios exercised during test execution. They create richer test data on-demand, enabling rapid and rigorous testing.
(Model-Based) Data Generation
Another drawback of testing with production data that you can read about here is the quality of that data. Put simply, it lacks the values needed to rigorously test complex systems. This is particularly true for the outliers and unexpected results needed for rigorous negative testing. Even if you recombine the dimensions in historical production data, you will be left lacking. This is a problem that should be familiar to anyone who has done ETL or Business Intelligence testing using production data alone.
Rigorously testing Solr and Kafka systems, therefore, requires data scenarios not found in production, ready to be pushed to test environments on-demand. In other words, an automated approach to data provisioning should, therefore, weave in synthetic test data generation, augmenting data with missing combinations as it moves to test environments. In other words, an on-demand approach should perform data “Find and Makes” on demand. A “Find and Make” first searches in existing data sources for data to satisfy a given test suite, before making any missing combinations needed for the test suite.
This test data generation might include model-based test data generation, which is where search engine testing truly becomes the science discussed earlier. Model-based test generation builds a model of data journeys through a system, in turn, finding and making the data needed to exercise a set of those journeys:
Model coverage algorithms, designed to identify the smallest number of “paths” through the model. The paths are equivalent to test cases, and test generation, in turn, creates the smallest set of tests needed to execute each logically distinct path through the model.
When testing Solr and Kafka, the distinction of the test case and test data is blurred. The model-based testing, therefore, generates every logically distinct data journey through the model, creating a data set with which to test every distinct combination of an equivalence class. It thereby introduces a rapid, repeatable and structured approach to test design.
The rigorous test generation leverages the power of computer processing to identify an executable number of tests for validating vastly complex data processing. The ability to combine modeled processes into master models makes rigorous integrated test design for Solr and Kafka possible, regardless of how many data sources are involved.
The sheer complexity of processing data from multiple, integrated sources makes Solr and Kafka testing perfect for integrated model-based testing and synthetic test generation. Multiple approaches might be used for integrating data generation with model-based test design:
- Static data definition: This specifies static values at the model level. It defines value for nodes in the model, which are then combined during test generation to create complete data journeys.
- Dynamic data generation: Another approach overlays synthetic data functions at the model level, each of which resolves dynamically during test case generation. This creates a more varied and therefore more rigorous set of data journeys for testing.
- Automated test data allocation: The most powerful approach to model-based data generation incorporates automated “Find and Makes” at the model level – these are the blue nodes in the above diagram. These re-usable processes resolve dynamically, either during test generation or execution. They search from back-end systems to find data combinations to satisfy the generated test cases, applying test data generation to create any missing combinations.
The combination of model-based test design and test data automation creates a complete set of data journeys for rigorous test execution. These complete and readily executable data journeys are produced on-demand, enabling complete and truly continuous test execution for search technologies built with Solr and Kafka.
As discussed at the start of this article, search engine development leverage best-of-breed technologies and development techniques like containerization. What’s missing is the science of testing to match the science of a search. In particular, search development requires a structured approach to data-driven test design and execution.
The technologies set out in this article enable systematic and automated test data design for Solr and Kafka. They furthermore provide the automation needed to fire them off to complex test environments on-demand. This, therefore, provides a solid set of first moves when introducing test automation to search engine development.
I’m proud of what Curiosity has achieved so far with Solr and Kafka and believe it fills a key gap at many organizations. My mind is now turning to consider techniques for defining sophisticated expected results for search engine testing. The end goal is to harness the “Fuzzy Logic” currently deployed by search engine developers when evaluating the quality of searches, making these usable in automated tests.
This will likely involve capturing data from manual search testing, in turn explicating the criteria used to evaluate a given search. Analyzing and making this data concrete will make these complex judgments usable in automated tests, facilitating sophisticated regression testing for search engines. The world’s our oyster here, with possibilities in machine learning, Expert Systems, and more.
If you’d like to get involved with this ongoing project, I’d love to hear your thoughts in the comments below!
Published at DZone with permission of huw price . See the original article here.
Opinions expressed by DZone contributors are their own.