
Holistic Test Data Management: Beyond ETL

The traditional ETL approach to test data management simply isn’t good enough. A holistic test data management framework contextualizes the broader aspects of TDM.

by Niall Crawford · Apr. 26, 18 · Big Data Zone · Opinion

So, you’ve got a team responsible for test data management.

Your project puts in a request and they grab copies from production. Then, they mask, subset, and deploy them to your test environments.


Easy peasy...? Well, probably not — probably a lot of paperwork, engineering, and provisioning effort.

And then the issues really start.

  • The data lacks end-to-end integrity (health), i.e. the data is broken.

  • The developers and testers can’t easily find the data they are looking for.

  • When the teams do find the correct data points they can use, they all use it, causing contention and data-related test defects.

And it all starts to grind to a halt.

Suddenly, development and test cycles are being blown out.

And then, to add insult to injury, an honest test analyst notices that not all the data has been masked. That's a serious concern when you realize that non-production environments are where your project teams spend 95% of their time, and the opportunity for information to be misplaced or stolen there is high.


This is a suboptimal situation that exposes the customer to identity theft and fraud and exposes your own organization to:

  • Compliance penalties

  • Industry sanctions

  • Brand damage

  • Consequent lawsuits

Not exactly ideal, particularly with data compliance legislation like GDPR, which can sting you for up to 4% of annual global turnover.

Yet, I can virtually guarantee, sadly, that the above scenarios describe most organizations today.

The reason why data is such a problem is sixfold:

  1. Enterprise architectures are typically diverse and distributed.

  2. Environment and data footprints are under constant change.

  3. Individual databases are often large and poorly defined or understood.

  4. System and data documentation often suffers from technical debt.

  5. It's easy to make mistakes during data subsetting (causing integrity health issues).

  6. It's easy to make mistakes during data obfuscation exercises (causing PII leakage).
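The last two failure modes lend themselves to automated checks. Here is a minimal sketch, assuming tables are available as lists of dictionaries: one function flags child rows orphaned by subsetting, the other flags values that still look like PII after masking. The table names, column names, and the two regular expressions are hypothetical and purely illustrative; real PII detection needs a far broader rule set.

```python
import re

# Hypothetical patterns for illustration only; not an exhaustive PII rule set.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "card_number": re.compile(r"\b\d{4}[ -]?\d{4}[ -]?\d{4}[ -]?\d{4}\b"),
}


def find_orphans(child_rows, parent_rows, fk, pk):
    """Return child rows whose foreign key has no matching parent row."""
    parent_keys = {row[pk] for row in parent_rows}
    return [row for row in child_rows if row[fk] not in parent_keys]


def find_pii(rows):
    """Return (row_index, column, pattern_name) for each value that looks like PII."""
    hits = []
    for i, row in enumerate(rows):
        for column, value in row.items():
            if not isinstance(value, str):
                continue
            for name, pattern in PII_PATTERNS.items():
                if pattern.search(value):
                    hits.append((i, column, name))
    return hits


# Hypothetical subset: customer 3 was dropped, and one free-text note was never masked.
customers = [{"id": 1, "name": "XXXX"}, {"id": 2, "name": "XXXX"}]
orders = [
    {"order_id": 10, "customer_id": 1, "note": "ok"},
    {"order_id": 11, "customer_id": 3, "note": "contact jane.doe@corp.com"},
]

orphans = find_orphans(orders, customers, fk="customer_id", pk="id")
pii_hits = find_pii(orders)
print(orphans)
print(pii_hits)
```

Running checks like these after every subset or mask gives an early warning before the data reaches a test environment, although they only catch what the rules anticipate.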

The traditional ETL approach to test data management simply isn’t good enough.

  • It is too slow.
  • It is too manual.
  • It is too error-prone.
  • It is neither customer- nor user-centric.

There is a fundamental need to recognize that successful test data management can’t rely on ETL alone. Instead, organizations must start looking at data a little more broadly and leverage more automation to ensure accuracy, quality, compliance, and ease of end-user consumption.

A holistic test data management (HTDM) framework is used to contextualize the broader aspects of test data management: a set of LEGO blocks that call out the broader considerations and needs of an automated test data solution.

Holistic Test Data Management

Built around traditional ETL, an HTDM framework promotes the adoption of supporting TDM capabilities like:

  • Data Requirements Capture so you have a clear understanding of your consumers' (testers' and projects') needs.

  • Automated Data Profiling to rapidly understand data structures and PII risks (pre-ETL).

  • Automated Data Validation to rapidly determine if created data (post-ETL) is free of production patterns and healthy (i.e. has integrity).

  • Test Data Mining so testers can visualize, understand, and find end-to-end (cross-system) data without the need to continually build (and rebuild) complex queries and scripts.

  • Test Data Bookings so that test data can be assigned to test cases or teams and avoid the risk of overwriting.  
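To make the bookings idea concrete, here is a minimal in-memory sketch of a registry that refuses a second booking of the same record. The class name, record keys, and team names are hypothetical, and a real solution would need persistence, expiry, and concurrency control that this sketch leaves out.

```python
class BookingConflict(Exception):
    """Raised when a data record is already booked by a different owner."""


class DataBookingRegistry:
    """Hypothetical in-memory registry mapping test data records to their owners."""

    def __init__(self):
        self._bookings = {}  # record key -> owner

    def book(self, record_key, owner):
        """Reserve a record for an owner; fail if someone else already holds it."""
        current = self._bookings.get(record_key)
        if current is not None and current != owner:
            raise BookingConflict(f"{record_key} is already booked by {current}")
        self._bookings[record_key] = owner

    def release(self, record_key, owner):
        """Free a record, but only if the caller is the current owner."""
        if self._bookings.get(record_key) == owner:
            del self._bookings[record_key]

    def owner_of(self, record_key):
        """Return the current owner of a record, or None if it is free."""
        return self._bookings.get(record_key)


bookings = DataBookingRegistry()
bookings.book("customer:1001", "team-payments")

try:
    bookings.book("customer:1001", "team-billing")
    conflict = False
except BookingConflict:
    conflict = True  # the second team is told the record is taken

bookings.release("customer:1001", "team-payments")
bookings.book("customer:1001", "team-billing")
print(conflict, bookings.owner_of("customer:1001"))
```

Failing loudly at booking time moves the contention problem from a confusing test defect to an explicit message the second team can act on.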

Key benefits of creating an HTDM framework include:

  • Understanding your data

  • Improving compliance

  • Ensuring data health

  • Making consumption easier

...all of which lead to happy testers and streamlined project delivery.

