Data Quality Assurance and Testing

Case Study: Bank Data Warehouse Migration and Integration Project | Industry: Investment Bank

Investment Bank – Data Quality Assurance & Testing

Project Highlights:

  • Project completed on time.
  • Data quality risk was effectively managed.
  • Reduced time for ETL and data testing cycles.
  • Global team worked collaboratively.
  • Documented proof of testing and sign-off for compliance and audit.

“The moment we realized that the project would be data centric, we quickly decided to use iQA
for our Data Quality and ETL Testing.” – Nomura Securities Project Manager

Background

After 9 months of effort, Nomura Securities celebrated the successful implementation of a new Fixed Income processing system, and integration of its data into a data warehouse.

“For a successful data-centric project special attention has to be placed
on data quality management.” – Nomura Securities IT Manager

The project was a major effort that touched almost all areas of the US business and impacted operations in other regions. Its size and scope are illustrated by the more than 400 people who were involved, either directly or peripherally. New security reference data platforms, new interfaces for the Fixed Income front office, separation and partitioning of the existing system for the Equities business, major re-plumbing of finance, regulatory, and general ledger feeds, new interfaces for risk, a new suite of reports for compliance, and significant changes to trade confirmations and settlement were just some of the requirements of the project.

One of the key tasks was to route data from the new source system into the existing Data Warehouse. Downstream systems, such as the front office and trading desks, relied on the Data Warehouse because it consolidated information from the Fixed Income and Equities systems. Because this task was data-centric, it was understood that data quality would be the key to the success of the overall project. That is why the bank turned to iceDQ Soft for its data quality and monitoring solution.

The integration of Fixed Income (FI) data from the new system with
the existing Equities (EQ) data, and switching off the legacy FI data feed, had significant challenges.

The Data Testing Challenges and Solution

For this multi-million-dollar project, the key question remaining to be answered was: “How can we test hundreds of new data feeds and thousands of existing ETL processes and, at the same time, monitor progress in the most effective way?”

(a) A Need for Formal Methodology for ETL and Data Testing

In a data-centric project, data quality management is as important as project management. Data, not process, is the focus of such projects. Testing has to prove that the quality profile of the production data is maintained across all components of the project. This means that a dedicated framework for testing data-centric projects is a must. The iQA solution provides a multi-step action plan. The first step is to operationalize the iQA Data Quality Assurance and Testing Framework.

“Just quality assurance software is not enough. You need proper
methodology and a tool that supports it.” – S. Gawande

The iQA Data Testing Framework has:

  • Clearly defined activities
  • Roles and responsibilities
  • Workflow and communication protocol that ties it all together
  • Clear guidelines for each activity
  • Categorization of data into multiple subject areas based on functional areas, so that respective SMEs can be assigned

 

Nomura Securities implemented the Testing Framework. Next, iQA was configured to support the framework and achieve the goals of communication, accountability, global collaboration, and visibility to management.

Key Technical Challenges in Data Testing without iQA:

  • Manual comparison of huge amounts of data
  • Running data quality rules across two databases to compare source and target data
  • Absence of a rule checklist that Business Analysts and users could use to sign off on ETL processes
  • Creating rules once and reusing them across DEV, QA, UAT, and PROD environments
  • Visibility to management
  • Global collaboration of the team
  • Automation and scheduling of rules in QA and UAT regions

“The beauty of iQA is in its capability to rapidly test for maximum number of test cases and data volumes.”
– Nomura Securities Data Analyst

(b) Data and ETL testing with iQA

Once the actual testing began, iQA quickly proved to be up to the job. Over a period of nine months, testers, developers, and SMEs implemented roughly five thousand data testing rules of different types, including:

 

  • Source data values compared to target data values
  • Lists of value comparisons based on set theory
  • Expected data from transformational business rules
  • Predictive testing
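
As an illustration of the first two rule types, the following is a minimal sketch of a source-to-target comparison based on set differences. It is not iQA's implementation or API. The table names, column names, sample rows, and the helper functions `fetch_set` and `compare_source_to_target` are all hypothetical, and two in-memory SQLite databases stand in for the real source system and Data Warehouse.

```python
import sqlite3


def fetch_set(conn, sql):
    """Run a query and return its rows as a set (duplicate rows collapse)."""
    return set(conn.execute(sql).fetchall())


def compare_source_to_target(source_conn, target_conn, source_sql, target_sql):
    """Compare two result sets that live in two different databases."""
    source_rows = fetch_set(source_conn, source_sql)
    target_rows = fetch_set(target_conn, target_sql)
    return {
        # Rows the source produced that never arrived in the target.
        "missing_in_target": sorted(source_rows - target_rows),
        # Rows in the target that the source cannot account for.
        "unexpected_in_target": sorted(target_rows - source_rows),
    }


# Hypothetical stand-ins for the source system and the Data Warehouse.
source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")
source.execute("CREATE TABLE trades (trade_id INTEGER, notional REAL)")
target.execute("CREATE TABLE dw_trades (trade_id INTEGER, notional REAL)")
source.executemany(
    "INSERT INTO trades VALUES (?, ?)", [(1, 100.0), (2, 250.0), (3, 75.5)]
)
target.executemany(
    "INSERT INTO dw_trades VALUES (?, ?)", [(1, 100.0), (2, 999.0)]
)

result = compare_source_to_target(
    source,
    target,
    "SELECT trade_id, notional FROM trades",
    "SELECT trade_id, notional FROM dw_trades",
)
print(result["missing_in_target"])     # [(2, 250.0), (3, 75.5)]
print(result["unexpected_in_target"])  # [(2, 999.0)]
```

Because the comparison uses sets, it does not detect duplicated rows; a real rule would typically pair it with a row-count check.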

 

On each release cycle, many data issues were discovered. iQA routed those data issues and reports to developers, SMEs, and even source system users.

Initial runs quickly discovered thousands of critical and non-critical data quality defects. As the testing cycles proceeded, these issues were resolved and users could sign off on the ETL processes.

As new code was developed, previous rules were recombined to provide regression tests. Also, the same rules were reused in different environments such as UAT and integration.
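
The idea of defining a rule once and running it unchanged in several environments can be sketched as follows. This is an illustrative assumption about how such reuse might look, not a description of iQA itself. The `Rule` class, the rule names, the `dw_trades` table, and the `make_env` helper are hypothetical, and in-memory SQLite databases stand in for the QA and UAT environments.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    sql: str  # query that returns violating rows; an empty result means "pass"


def run_rules(conn, rules):
    """Run every rule against one environment and map rule name to pass/fail."""
    return {
        rule.name: len(conn.execute(rule.sql).fetchall()) == 0 for rule in rules
    }


# Rules are defined once, independently of any environment.
RULES = [
    Rule("no_null_trade_id", "SELECT * FROM dw_trades WHERE trade_id IS NULL"),
    Rule("no_negative_notional", "SELECT * FROM dw_trades WHERE notional < 0"),
]


def make_env(rows):
    """Build a hypothetical environment holding the given warehouse rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE dw_trades (trade_id INTEGER, notional REAL)")
    conn.executemany("INSERT INTO dw_trades VALUES (?, ?)", rows)
    return conn


environments = {
    "QA": make_env([(1, 100.0), (None, 50.0)]),  # contains a NULL trade_id
    "UAT": make_env([(1, 100.0), (2, 250.0)]),
}

# The same rule set doubles as a regression suite for each environment.
results = {env: run_rules(conn, RULES) for env, conn in environments.items()}
print(results["QA"])   # {'no_null_trade_id': False, 'no_negative_notional': True}
print(results["UAT"])  # {'no_null_trade_id': True, 'no_negative_notional': True}
```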

iQA’s web-based interface was effective for global collaboration. It gave the project access to lower-cost resources, which kept testing costs down. The simplicity and centralized control made it easy to onboard new resources and outsource testing.

iQA’s automated data comparison and testing permitted testing on complete datasets instead of restricted sample test data.

Integration of iQA with HP Quality Center, which Nomura Securities had already acquired, allowed the project to use that existing infrastructure and its accompanying methodology.

The Rules Knowledge Repository stored all the rules discovered during data testing so that they remained accessible for future use. There was no need to maintain separate testing documents, as both the descriptive metadata and the actual testing results were stored in the repository.

The testing results Dashboard provided visibility into the status of testing at any time. Rapid feedback, in the form of reports on data with material errors together with drill-down capabilities, enabled quick decisions and responses.

For business users, it was easy to sign off on the ETL process based on success of predetermined lists of rules.
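
A sign-off gate of this kind can be sketched as a check of rule outcomes against a predetermined checklist. The function `ready_for_sign_off`, the rule names, and the sample results below are hypothetical and only illustrate the idea; they do not come from iQA.

```python
def ready_for_sign_off(checklist, results):
    """Return (ok, outstanding): ok is True only if every listed rule passed.

    A rule that has not been run yet counts as outstanding, the same as a
    rule that ran and failed.
    """
    outstanding = [name for name in checklist if results.get(name) is not True]
    return (not outstanding, outstanding)


# Hypothetical checklist agreed for one ETL process.
checklist = ["row_count_matches", "notional_matches", "no_orphan_trades"]
# Hypothetical latest results; "no_orphan_trades" has not been run.
results = {"row_count_matches": True, "notional_matches": False}

ok, outstanding = ready_for_sign_off(checklist, results)
print(ok)           # False
print(outstanding)  # ['notional_matches', 'no_orphan_trades']
```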

“iQA gave us the capabilities to reach both ends of the pipe and validate
the data flow.” – Nomura Securities Offshore Data Quality Rules Analyst

It is a typical characteristic of data-centric projects that information requirements are not clear at the beginning. Many of them are discovered as more data is reviewed. This gradual crystallization of information requirements is aided by iQA, with its Rules Knowledge Repository. Another characteristic of data-centric projects is that data tests must be carried into production. There is no guarantee that production data will remain stable into the distant future. Again, iQA has this capability.

How the Bank Benefited from the iQA Implementation for Data Testing

The following lessons were learned from the success of the project:

  • iQA’s Data Testing framework provided a unified strategy and visibility for the developers, testers, SMEs, business users, and management.
  • iQA supported test-driven development, since in data-centric projects not all business and data transformation rules are predefined; many are discovered during development.
  • Up to 20% fewer resources were required for testing over a period of nine months.
  • The ability to utilize offshore resources provided up to 33% in direct cost saving for testing resources.
  • iQA enabled continuous, consistent, and automated testing of all the data passing through the ETL processes.
  • Quick, automated feedback to SMEs and the ability to drill down into the problem dataset led to higher efficiency.

The downstream data was used by trading, and any bad data could have had serious financial impact. Because of this, the key factor in the project's success was obtaining sign-off on data quality. With iQA, the Bank was able to demonstrate that data quality within the given parameters of time, money, and resources.

About iceDQ

iceDQ empowers organizations to ensure data trust and reliability throughout the data life cycle.

Our comprehensive platform combines data testing, data monitoring, and data observability into a single solution, enabling data engineers to proactively manage data quality and eliminate data issues before they impact business decisions.

Leading companies across industries, including prominent players in banking, insurance, and healthcare, rely on iceDQ to continuously test, monitor, and observe their data-driven systems. This ensures trustworthy data that fuels informed decision-making and drives business success.

iceDQ Use Cases

  • Data Testing
  • ETL & Data Warehouse Testing
  • Cloud Data Migration Testing
  • BI Report Testing
  • Big Data Lake Testing
  • System Migration Testing
  • Data Monitoring
  • Data Observability

About the author

Sandesh Gawande

Sandesh Gawande is the Founder and CEO of iceDQ, a unified Data Reliability Platform for automated data testing, monitoring, and observability. With over 25 years of experience in data engineering and architecture, Sandesh has led large-scale data initiatives for Fortune 500 companies across banking, insurance, and healthcare, including Deutsche Bank, JPMorgan Chase, and MetLife.

