
Test Data Management and the Cloud – Keeping All the Plates Spinning

Learn all about test data management and the cloud.


See Gartner’s latest research on the application performance monitoring landscape and how APM suites are becoming more and more critical to the business, brought to you in partnership with AppDynamics.

I was recently in Boston at Faneuil Hall Marketplace, and with the long-awaited warm weather, all the street entertainers were out in full force. Singers, musicians, and a variety of juggling acts filled the street, with crowds surrounding them. One act in particular struck a chord with me: the classic spinning plates. We’ve all seen it at various times in our lives. The entertainer started spinning plates, balanced precariously on top of wooden sticks. More and more plates started spinning, with the entertainer frantically running back and forth. Whenever one slowed down and almost fell, he got it spinning and balanced again just in time. Then the entertainer reached his limit: he added one more plate, and as he tried to keep them all up, one lone plate down on the end started to wobble, its stick tilting. Before the entertainer could reach it, the plate went crashing to the ground, taking several of the other plates with it.

Anyone who has been responsible for test data management in a large, complex, integrated environment can probably relate to the spinning plates challenge. So can any Quality Assurance tester or developer who has chased down a Severity 1 blocking bug, only to find the issue was not the code but a flaw in the test data. Identifying, configuring, deploying, and maintaining a valid set of test data remains the bane of many a technologist. How does the cloud impact this? Does it make it better, worse, or more of the same?

Why is Test Data Management So Hard?

The goal of any good test data management process is to provide consistent, repeatable test data across your systems and environments, whether it be development, QA, or performance. Ideally, it would be wonderful to have reusable test data sets to leverage across all environments. This would provide consistency, as well as resource and time savings. Sounds basic enough, so what makes it so hard?

There are multiple challenges:

  • Avoiding data collisions: For complex systems that integrate with other systems, test environments and systems tend to be shared due to cost and resource constraints. Other applications are testing against the same system you are integrating with. Coordinating data sets so that no other application under test accidentally uses and overwrites your data is challenging, and it must be addressed. There is nothing worse than chasing what appears to be a bug that actually turns out to be someone else overwriting your test data.
  • Enforcing privacy rules: A common and useful practice is to mine and extract test data from production systems. The key consideration here is compliance with any privacy rules that apply (such as HIPAA). This may require masking the test data, and masking itself may then introduce other challenges. A simple example: part of your test data is a customer’s name and address, which you need to mask. What if your system validates addresses to ensure they are all valid? You could easily create an address that now fails basic validation.
  • Ensuring relational integrity across systems: If you are integrating data sets across multiple systems, you may need to maintain the relational integrity of your data across those systems. The masking mentioned above can complicate this, because the same record must be masked the same way in every system that holds it.
  • Resetting the data set to a clean starting point: This means you need to understand every change your testing made across all the integrated systems, so that those changes can be backed out and the data returned to a known starting point. Changes propagated across environments can be a key source of unintended consequences in a test environment.

How Does the Cloud Impact All This?

All of the challenges discussed above still exist when you move to a cloud environment. One of my favorite mantras is ‘no technology negates the need for good design and planning.’ Cloud doesn’t provide any magic; it’s just a tool. It can help in standing up standard, repeatable test environments, but the data setup process is still subject to the challenges already discussed.

Additionally, going to the cloud may introduce other challenges that must be considered:

  • SaaS solutions: In SaaS environments, you may not have direct access to the database layer. You are constrained to the mechanisms provided by the SaaS vendors for the extraction and the import of your user data, content, and configuration information. Your test data management process needs to take this into account.
  • Network bandwidth: If part or all of your environments reside in the cloud, you need to take into account the network when doing data loads, especially if you are dealing with large volumes of data either for performance testing or analytics. Bandwidth is usually well thought out for daily operational traffic, but frequently forgotten for initial and test data loads.
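To make the bandwidth point concrete, here is a rough back-of-the-envelope estimate of how long a test data load might take over a given link. It is a sketch under stated assumptions: the function name is hypothetical, sizes use decimal units (1 GB = 10^9 bytes, 1 Mbps = 10^6 bits per second), and the default efficiency of 0.7 is only a placeholder for protocol overhead and contention that you would replace with a measured figure.

```python
def estimate_transfer_hours(data_gb: float, link_mbps: float,
                            efficiency: float = 0.7) -> float:
    """Estimate the hours needed to move data_gb over a link_mbps link.

    efficiency is the assumed fraction of nominal bandwidth actually
    available to the load (0.7 is a placeholder, not a measured value).
    """
    if data_gb < 0:
        raise ValueError("data_gb must not be negative")
    if link_mbps <= 0:
        raise ValueError("link_mbps must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in the range (0, 1]")

    total_bits = data_gb * 8 * 1000 ** 3            # GB -> bits
    effective_bps = link_mbps * 1000 ** 2 * efficiency
    return total_bits / effective_bps / 3600        # seconds -> hours


# Example: a 500 GB performance-test data set over a 100 Mbps link.
hours = estimate_transfer_hours(500, 100)
print(f"Estimated load time: {hours:.1f} hours")  # about 15.9 hours
```

Even this crude estimate shows that a load sized for performance testing can take most of a day on a link that is perfectly adequate for daily operational traffic, which is why the load window belongs in the test plan.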
Keeping All Those Plates Spinning is No Easy Task

Test Data Management has always been a challenge. Going to the cloud does not make it any easier. In fact, it adds some additional plates you need to keep spinning in order to ensure successful testing of your applications. As technologists, it’s important to be sure we know which plates we need, and keep them close so we can keep them spinning. With good design and planning, there is no reason to think the test data management plates are going to come crashing to the ground.

This post is brought to you by The CIO Agenda.

KPMG LLP is a Delaware limited liability partnership and is the U.S. member firm of the KPMG network of independent member firms affiliated with KPMG International Cooperative (“KPMG International”), a Swiss entity. The KPMG name, logo and “cutting through complexity” are registered trademarks or trademarks of KPMG International. The views and opinions expressed herein are those of the authors and do not necessarily represent the views and opinions of KPMG LLP.


