Data Quality Testing Skills Needed For Data Integration Projects
Data integration projects fail for many reasons. Risks can be mitigated when well-trained testers deliver support. Here are some recommended testing skills.
Join the DZone community and get the full member experience.Join For Free
The impulse to cut project costs is often strong, especially in the final delivery phase of data integration and data migration projects. At this late phase of the project, a common mistake is to delegate testing responsibilities to resources with limited business and data testing skills.
Data integrations are at the core of data warehousing, data migration, data synchronization, and data consolidation projects.
In the past, most data integration projects involved data stored in databases. Today, it’s essential for organizations to also integrate their database or structured data with data from documents, e-mails, log files, websites, social media, audio, and video files.
Using data warehousing as an example, Figure 1 illustrates the primary checkpoints (testing points) in an end-to-end data quality testing process. Shown are points at which data (as it’s extracted, transformed, aggregated, consolidated, etc.) should be verified – that is, extracting source data, transforming source data for loads into target databases, aggregating data for loads into data marts, and more.
Only after data owners and all other stakeholders confirm that data integration was successful can the whole process be considered complete and ready for production.
Figure 1: Checkpoints that are necessary to audit and verify data quality in data integration projects. A data warehouse data integration example is shown in this figure. (©Tricentis)
The Skills and Experiences of Project Testers Are Paramount For Project Success
The data integration/migration testing lead and other hands-on testers are expected to demonstrate extensive experience in their ability to plan, design, and execute data integration source and target testing – strategies and tactics to ensure data quality throughout all stages of a data extract, transform, and extract (ETL) life cycle.
In recent years, there has been an evolving trend toward data analysts, data engineers, business analysts, ETL developers, and even business users to plan and conduct much of the data integration, data migration, and data warehouse tests. Doing so may be risky.
Among the required skills for data integration testers are the following:
- A firm understanding of data warehouse and database concepts
- The ability to develop strategies, test plans, and test cases specific to data integration and the enterprise’s business
- The ability to create effective ETL test cases and scenarios based on ETL database loading technology and business requirements
- Advanced skills with SQL queries, stored procedures, and test scripting to delete, define, load, and merge data for tests
- In-depth understanding of ETL development tools
- Advanced knowledge of project business data and metadata (data sources, data tables, data dictionary, business terminology)
- Innovative skills with data profiling and associated methods and tools
- Understanding of data models, data mapping documents, ETL design, and ETL coding
- Ability to communicate effectively with data engineers, DB designers, and developers
- Testing experiences with multiple DB systems, (e.g., Oracle, SQL Server, DB2, Postgresql)
- Troubleshooting of ETL (e.g., Informatica/DataStage) sessions, workflows, and logs
- Ability to deploy database code to test environments
- Use of Microsoft Excel, and a variety of data quality verification tools
- Implementation of automated testing for ETL processes
- Effective use of defect management and tools
Data integration projects can fail for many reasons: Poor data architecture, inconsistently defined data, inability to combine data from different data sources, missing and inaccurate data values, inconsistent use of data fields, unacceptable query performance, and so forth.
Project risks are mitigated when well-trained and motivated testers deliver ongoing support from the earliest phases of data integration projects. We hope that this article helps you take the first steps towards that end.
Opinions expressed by DZone contributors are their own.