Test Data Management and Its Role in DevOps

Learn how to use test data management to make sure your code is ready for production as part of CI/CD.

By Christian Melendez (DZone Core) · RJ Williams · Jun. 14, 18 · Analysis
In my career, there have been many times when I've felt the false joy of a code change being ready for release to production. I say false joy because everything worked as expected on my computer, in dev, in testing, and in staging. But in production, that same change caused intermittent problems.

You know the types of problems. It's always something little, like data being longer than expected for certain fields. No matter how careful I was when testing my change, there was a scenario that I forgot or didn't know existed. If only I'd had good data to test with! My joy wouldn't have been diluted by those errors.

Having good-quality data for testing is key. And that's where test data management (TDM) comes into play. But what's its role in DevOps? Is it possible to integrate TDM into a DevOps workflow? And how would we go about it?

Let's find out.

What Is Test Data Management?

Missing a use case is a common problem in every type of organization that develops software. No matter how we think our app will be used, users tend to exceed our expectations with their creativity. And production is production. It reminds us that our test cases aren't invincible. If only we could test our changes in production.

But what if we could have production-like environments? And what if these could be production-like environments not just in terms of infrastructure, but also in terms of data?

TDM is the process of creating production-like data for testing purposes. In some cases, the test data can be a straight copy of production data. But when there's sensitive data, things change. That sensitive data needs to be masked so that, if a test environment is compromised, the impact will be low.
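To make the masking idea concrete, here is a minimal sketch in Python. It assumes records are plain dictionaries and that the team has agreed on a list of sensitive fields; the field names, the salt, and both function names are hypothetical. Hashing with a fixed salt is only an illustration and is not strong anonymization on its own. The masked value keeps the original length so that length-related bugs can still surface in tests.

```python
import hashlib

# Hypothetical list of sensitive fields; a real list would come from
# whatever the team agrees on.
SENSITIVE_FIELDS = {"email", "phone"}


def mask_value(value: str, salt: str = "test-env-salt") -> str:
    """Replace a value with a stable hashed token of the same length."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    # Repeat the digest if needed so unusually long values keep their length.
    repeats = len(value) // len(digest) + 1
    return (digest * repeats)[: len(value)]


def mask_record(record: dict) -> dict:
    """Return a copy of the record with the sensitive string fields masked."""
    return {
        key: mask_value(val)
        if key in SENSITIVE_FIELDS and isinstance(val, str)
        else val
        for key, val in record.items()
    }
```

Because the same input always produces the same token, masked values can still be matched across tables that share a field such as an email address.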

Tests are important in DevOps. Without tests, it's easy to lose confidence in our applications, and deployments tend to be scary. You need data for good tests. TDM won't prevent you from introducing bugs, but it will reduce the chances by giving you the ability to build good-quality data. That's because if you're able to reproduce a production error in a test environment, you'll be able to fix it and make sure it won't happen again. Bugs will continue emerging, but they won't be the same ones over and over. Thus, it's important that anyone on the team can access the data they need when they need it.

Test Data Management Should Be Self-Service For Everyone

Well, we know now that there's a process to prepare data and there are tools for the job. The next obvious thought then will be to automate it and give access to the team. That way, operations and DBAs stop being the bottleneck.

When you're building this process, everyone should be involved. You should agree on what data can and should be used, what data will be masked, and how much data will be needed, much like a data discovery phase, if you will. But after there's an agreement, the team should work on automating this process so that everyone can create, update, or duplicate data for testing. DBAs will appreciate it; it's one less thing to worry about.

You don't have to reinvent the wheel. There are already some tools for this job.

People shouldn't have to wait too long to get the data they need, and in fact, they won't wait. They'll find ways to get around things that take time, and testing will shift to the right. That's why it's key that you choose the proper tool and plan ahead what data will be needed.

Test Data Management Should Help You Keep Healthy Data

It's also important to keep in mind that we're talking about having production-like data in other environments. Initially, that might not be a problem. But as the data grows, it could be costly. This process will also force you to keep your data healthy, not bloated.

My recommendation is that you start by having all the data you need, but as you grow, start generating the data for each test case. Or you can even have a mix of pre-populated data and static data that you generate in the code for each test case. Aim to have more static data over time because it will be cheaper and you'll have more control over it!
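Generating static data in code for each test case could look something like the following sketch. The `make_customer` helper and its fields are hypothetical, not part of any particular tool. The idea is that each test builds only the records it needs and overrides only the fields it cares about.

```python
import itertools

# Simple counter so every generated record gets a unique id.
_ids = itertools.count(1)


def make_customer(**overrides) -> dict:
    """Build one customer record with defaults; tests override what matters."""
    customer_id = next(_ids)
    customer = {
        "id": customer_id,
        "name": f"Customer {customer_id}",
        "email": f"customer{customer_id}@example.com",
        "notes": "",
    }
    customer.update(overrides)
    return customer


def test_long_notes_are_kept_intact():
    # The kind of edge case that tends to show up only in production:
    # a field that is much longer than anyone expected.
    customer = make_customer(notes="x" * 5000)
    assert len(customer["notes"]) == 5000
```

Since the data lives next to the test, it is versioned with the code and costs nothing to store between runs.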

It will always depend on the use case, but I've seen databases with tables that hold years of data, not months or days. That affects not only cost, because it uses more storage, but also write performance. When this happens, why don't we consider keeping just the data that's needed and moving historical data somewhere else for reporting? Or what about working with tables per day, week, or month? It's complex, and as with everything, there are tradeoffs you need to consider.
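As one possible illustration of moving historical data elsewhere, the sketch below uses Python's built-in `sqlite3` module. The `orders` and `orders_archive` tables and the `created_at` column are assumptions made for the example; a real setup would likely archive to a separate reporting database.

```python
import sqlite3


def archive_old_orders(conn: sqlite3.Connection, cutoff: str) -> int:
    """Move orders created before `cutoff` (ISO date) into orders_archive.

    Assumes both tables exist with identical columns. Returns the number
    of rows removed from the live table.
    """
    with conn:  # one transaction: both statements apply, or neither does
        conn.execute(
            "INSERT INTO orders_archive "
            "SELECT * FROM orders WHERE created_at < ?",
            (cutoff,),
        )
        cur = conn.execute(
            "DELETE FROM orders WHERE created_at < ?", (cutoff,)
        )
    return cur.rowcount
```

Calling `archive_old_orders(conn, "2018-01-01")` would leave only orders from 2018 onward in the live table, which keeps copies for test environments smaller.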

The plan is to include TDM in your delivery pipeline. So always keep an eye on the time it takes to prepare data, and make sure to optimize. It's key to reduce lead time to deliver your software, so the less time it takes for TDM, the better.
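One simple way to keep an eye on preparation time is to measure it and compare it against a budget the team has agreed on. The following is only a sketch; the `timed` and `within_budget` helpers and any budget value are hypothetical.

```python
import time
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def timed(prepare: Callable[[], T]) -> Tuple[T, float]:
    """Run a data preparation step and return its result and elapsed seconds."""
    start = time.perf_counter()
    result = prepare()
    return result, time.perf_counter() - start


def within_budget(elapsed_seconds: float, budget_seconds: float) -> bool:
    """Tell whether the preparation step stayed within the agreed budget."""
    return elapsed_seconds <= budget_seconds
```

A pipeline step could call `timed` around its data preparation function and fail, or at least warn, when `within_budget` returns `False`, so that slow preparation is noticed before it drags down lead time.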

It Should Be Part of CI/CD Processes

Once TDM is a self-service process that testers and developers can use when they need it (and once it runs fast), it's time to implement it in your continuous integration and delivery process.

In DevOps, every process or task that increases silos is shifted to the left. Shifting to the left means that you do it at the very beginning of the workflow. We've seen this happen with deployments, security, testing, and basically everything we need to develop and deliver software.

If we're talking about shifting to the left, it means that we should start including TDM even on a developer's machine. The same goes for testers and any other team member involved in the process. Some argue that developers and testers should generate static data for testing and that they should invest heavily in unit testing, mocks, and stubs. Some even say you should use containers. It's not just cheaper to go this route; it's also faster. I get it, and actually, I'm an advocate for that. But I won't lie; doing that is not an easy task.

So while you work on increasing test coverage with unit tests, the easiest way to start is by integrating TDM into your workflow. It's better that your process become reliable first. Then you can optimize and improve it.
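As a rough sketch of what TDM on a developer's machine might look like, the helper below creates a fresh, seeded in-memory SQLite database for each test. The `customers` table, the seed rows, and the `seeded_database` name are all hypothetical; a real pipeline would load whatever masked or generated data the team has agreed on.

```python
import sqlite3
from contextlib import contextmanager

# Hypothetical seed data; a real project would load agreed, masked data.
SEED_ROWS = [
    (1, "Ana", "ana@example.com"),
    (2, "Luis", "luis@example.com"),
]


@contextmanager
def seeded_database(rows=SEED_ROWS):
    """Yield a throwaway in-memory database pre-populated with test data."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE customers "
            "(id INTEGER PRIMARY KEY, name TEXT, email TEXT)"
        )
        conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", rows)
        conn.commit()
        yield conn
    finally:
        conn.close()
```

Each `with seeded_database() as conn:` block starts from the same known state, so one test cannot leak data changes into the next, and the same helper can run unchanged on a laptop or in a CI job.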

Good Testing Data Increases Reliability

Want to increase deployment reliability and at the same time reduce the lead time? Invest in TDM and make it part of your process. Shift it to the left. Don't let testing become an afterthought. A sign that you're not shifting to the left enough is when development is finished and you still need to wait for tests to validate those changes.

Automate your tests. It's OK that you started testing manually, but try to take the time to automate as much as possible, even the process of preparing the data. After that, include it in your DevOps implementation. Make it part of your delivery process. But you also need to pay attention to the time it takes to generate testing data. This will force you, in some way, to think constantly about your data architecture.

And it's good to be thinking. If you do so, you'll always improve, and you'll decouple things that are hard not just to change but also to test.

Check out our Fundamentals of Test Data Management course to learn the principles, best practices, and tools used for test data management through lecture and hands-on exercises.

Test data Data management DevOps IT Continuous Integration/Deployment

Published at DZone with permission of Christian Melendez, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.
