Daily Customer Data Monitoring – for Business Readiness

Case Study

Introduction

The discount brokerage firm used customer data to provide essential services, including email communications, electronic document delivery, annual tax statements, general correspondence, and promotional outreach. Delivering these services relied on customer contact details, marketing preferences, demographics, addresses, domicile, nominee details, and transaction history. This customer data was collected by various applications and stored in multiple DB2 databases.

Because it was difficult for downstream applications to access data spread across multiple DB2 databases, the scattered data was consolidated into a centralized Snowflake cloud data hub via a daily ETL batch.

Over time, the firm realized that the data in Snowflake was often out of sync with DB2, which negatively affected business activity. Since the DB2-to-Snowflake pipeline was tied to this ETL process, any errors during the data movement could have severe business consequences.

The sync issues were due to:

1. Timing Issues: Customer profiles in the DB2 databases were updated continuously by users, while the sync to Snowflake ran only in daily batches.
2. Schema Changes: Occasional schema changes caused inconsistencies between the source and the Snowflake data hub (see the sketch below).
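As a minimal sketch of how such schema drift can be surfaced, the snippet below compares column metadata from each system's catalog. The `db2_query` and `sf_query` helpers are hypothetical stand-ins for parameterized query functions (real code would use drivers such as ibm_db and snowflake-connector-python), and the schema/table names are illustrative:

```python
# Sketch: compare column names between a DB2 table and its Snowflake copy.
# db2_query / sf_query are assumed helpers: (sql, params) -> list of tuples.

def db2_columns(db2_query, schema, table):
    # SYSCAT.COLUMNS is the DB2 LUW catalog view for column metadata.
    rows = db2_query(
        "SELECT colname FROM syscat.columns "
        "WHERE tabschema = ? AND tabname = ?",
        (schema.upper(), table.upper()),
    )
    return {name for (name,) in rows}

def snowflake_columns(sf_query, schema, table):
    # INFORMATION_SCHEMA.COLUMNS is Snowflake's equivalent catalog view.
    rows = sf_query(
        "SELECT column_name FROM information_schema.columns "
        "WHERE table_schema = %s AND table_name = %s",
        (schema.upper(), table.upper()),
    )
    return {name for (name,) in rows}

def schema_drift(db2_query, sf_query, schema, table):
    src = db2_columns(db2_query, schema, table)
    tgt = snowflake_columns(sf_query, schema, table)
    return {
        "missing_in_snowflake": sorted(src - tgt),
        "unexpected_in_snowflake": sorted(tgt - src),
    }
```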

These issues made daily data monitoring for business readiness a necessity.

Business Challenge

Tax Calculation Mistakes: Errors in key details, such as addresses and state domicile information, caused:
a. Incorrect tax calculations.
b. Mistakes in mailing financial statements.

Marketing, CCAR & GDPR Compliance Issues: Incorrect user data and marketing preferences led to:
a. Multiple “do not contact” violations.
b. Promotional material being sent to the wrong recipients.
c. Customers being targeted with multiple campaigns.

Regulatory Non-Compliance Risk: Financial firms operate under strict regulations (e.g., GDPR, FINRA, BCBS 239), so incorrect data handling could result in:
a. Regulatory fines.
b. Increased scrutiny and damage to the firm’s reputation.

Poor Customer Experience & Erosion of Trust: Errors in customer data resulted in:
a. Incorrect transaction history.
b. Delayed services, requiring customers to reach out to the brokerage firm.

Financial Risks: Data discrepancies led to financial losses:
a. Lost revenue from missed marketing and cross-selling opportunities, reducing campaign effectiveness.
b. Regulatory fines.

The First Failed Mitigation Attempt

Operations suffered from the excessive coordination and communication required to fix issues, from reworking tax calculations to handling the many customer calls those issues generated. The initial mitigation was to compare data manually or write one-off scripts to reconcile the operational systems with Snowflake (a sketch of this approach follows the list below). This introduced new inefficiencies: the data remained inaccurate or incomplete, manual intervention was still required, and the resulting operational delays disrupted business processes.

1. Cost: Manual comparison was time-consuming and prone to human error.
2. Delayed Go-to-Market: Lengthy technical sprint cycles slowed the implementation of new monitoring rules, delaying marketing campaigns and other business initiatives.
3. Resource Overload: Manual processes placed a heavy burden on internal teams, including IT and the Application Security Group, to implement new rules and ensure data consistency.
4. Increased Costs for Production Data Fixes: The need to fix data issues post-production resulted in higher operational costs and resource strain.
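For context, the hand-rolled scripts looked roughly like the sketch below: a hard-coded table list with row-count comparisons. The `db2_query` and `sf_query` helpers and the table names are assumptions for illustration; the point is that every new check meant another code change, test cycle, and deployment.

```python
# Sketch of the brittle, hand-maintained reconciliation script: compare row
# counts table by table and report mismatches. Table names are illustrative.

TABLES = ["CUSTOMER_PROFILE", "CONTACT_PREFERENCES", "TRANSACTION_HISTORY"]

def row_count(query, table):
    # query is an assumed helper: sql -> list of tuples
    (count,) = query(f"SELECT COUNT(*) FROM {table}")[0]
    return count

def reconcile_counts(db2_query, sf_query):
    mismatches = []
    for table in TABLES:
        src = row_count(db2_query, table)
        tgt = row_count(sf_query, table)
        if src != tgt:
            mismatches.append((table, src, tgt))
    return mismatches  # someone still had to read and act on this by hand
```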

Automated Daily Data Readiness Checks with iceDQ

iceDQ was implemented to automate data integrity checks and certify the readiness of data for business use. Over 200 rules were applied across approximately 50 tables to ensure data accuracy and consistency.

1. Source (DB2) Data Validation: Verified that the data in the source system was correct before reconciliation.
2. Source (DB2) vs. Target (Snowflake) Reconciliation: Ensured data was transferred accurately and remained in sync between DB2 and Snowflake before taking further action.
3. Target (Snowflake) Business Validation: Confirmed that the data was not only loaded correctly from the source but also valid from a business perspective.
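The three categories can be illustrated with plain SQL. The table and column names below are assumptions for illustration; in the actual implementation these were authored as iceDQ rules rather than hand-written queries:

```python
# Illustrative checks for each category; a non-zero result signals a failure.

# 1. Source (DB2) data validation: core fields must be populated.
SOURCE_CHECK = """
SELECT COUNT(*) FROM CUSTOMER_PROFILE
WHERE DOMICILE_STATE IS NULL OR EMAIL IS NULL
"""

# 2. Source vs. target reconciliation: run the same query on both systems
#    and compare the results.
RECON_CHECK = "SELECT COUNT(*) FROM CUSTOMER_PROFILE"

# 3. Target (Snowflake) business validation: values must make business sense.
BUSINESS_CHECK = """
SELECT COUNT(*) FROM CUSTOMER_PROFILE
WHERE MARKETING_PREF NOT IN ('OPT_IN', 'OPT_OUT', 'DO_NOT_CONTACT')
"""
```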

Scheduling: The data integrity checks were scheduled at the following frequencies:

1. Daily Count Checks: Scheduled to run daily at 9 PM.
2. Weekly Detailed Checks: Scheduled to run every Sunday at 10 PM.
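As an illustration only (the actual runs were triggered from the iceDQ scheduler), the two cadences map to something like the following, using the third-party Python `schedule` library; the job functions are placeholders:

```python
import time
import schedule  # third-party: pip install schedule

def daily_count_checks():
    pass  # placeholder: run the count/null-check rules

def weekly_detailed_checks():
    pass  # placeholder: run the detailed comparison rules

schedule.every().day.at("21:00").do(daily_count_checks)         # 9 PM daily
schedule.every().sunday.at("22:00").do(weekly_detailed_checks)  # 10 PM Sundays

while True:
    schedule.run_pending()
    time.sleep(60)
```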

Source Data Validation

1. Validation rules were created for each table (refer to the screenshot in Figure 3):
a. One for count checks and null checks.
b. One for data comparison and duplicate checks.

Source to Target Reconciliation

1. Connection Setup: Connections to both DB2 and Snowflake were established in iceDQ.
2. Rule Creation: ID columns were specifically checked for null values.
3. Regression Pack: All rules were added to a regression pack for automated execution.
4. Reconciliation Process (refer to Figure 4; a sketch follows this section):
a. Row counts between DB2 (source) and Snowflake (target) are reconciled for over 50 tables.
b. A checksum rule is applied to each table to confirm that the data, not just the row count, matches.

Business Validation

Detailed data comparisons on key elements (address, phone number, email, transactions, ITIN, SSN) are conducted weekly.

Workflow Batch

A workflow batch is created with all the rules included, allowing a specific execution sequence to be assigned. This workflow batch (regression suite) is then added to the scheduler for automated runs (refer to Figure 5).
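A minimal sketch of the per-table reconciliation idea follows, assuming the same hypothetical `db2_query`/`sf_query` helpers and illustrative table and key names; iceDQ expresses this logic as reusable reconciliation rules rather than code:

```python
# Per-table reconciliation: first compare row counts, then a cheap aggregate
# checksum (here a SUM over the numeric key) to catch content differences.

TABLES = {"CUSTOMER_PROFILE": "CUSTOMER_ID", "TRANSACTION_HISTORY": "TXN_ID"}

def scalar(query, sql):
    (value,) = query(sql)[0]
    return value if value is not None else 0

def reconcile(db2_query, sf_query):
    failures = []
    for table, key in TABLES.items():
        count_sql = f"SELECT COUNT(*) FROM {table}"
        checksum_sql = f"SELECT SUM({key}) FROM {table}"
        if scalar(db2_query, count_sql) != scalar(sf_query, count_sql):
            failures.append(f"{table}: row count mismatch")
        elif scalar(db2_query, checksum_sql) != scalar(sf_query, checksum_sql):
            failures.append(f"{table}: checksum mismatch")
    return failures
```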

Figure 3: Validation Rule Example

Figure 4: Reconciliation Rule Example

Figure 5: Workflow Batch Example

Key Features of iceDQ Implementation

1. Supported Connections: Native support to connect to both DB2 and Snowflake databases.
2. Flexible Rule Creation: Easily create and modify rules for count checks, null checks, and data comparisons.
3. Comprehensive Validation: Ability to perform both daily count checks and weekly detailed data comparisons.
4. Scheduled Execution: Automated running of the regression pack at specified times.
5. Scalability: Capable of handling over 50 tables in the reconciliation process.

Conclusion

The implementation of iceDQ in the brokerage firm’s data reconciliation process resulted in substantial benefits:

Enhanced Efficiency: The automated reconciliation process reduced manual checks and accelerated data validation.
Improved Accuracy: Systematic checks minimized human errors in the reconciliation process.
Faster Rule Implementation: New monitoring rules were quickly implemented without lengthy technical cycles.
Increased Data Integrity: Regular, thorough checks ensured consistency between source and target databases.
Better Resource Utilization: Reduced reliance on the Application Security Group for implementing new rules.
Timely Marketing Initiatives: Faster data validation allowed more responsive marketing campaigns and customer communications.
Comprehensive Reporting: Detailed reports on count mismatches and data discrepancies enabled swift issue resolution.

By integrating iceDQ, the brokerage firm greatly improved data flow reliability between DB2 and Snowflake, enhancing data integrity and streamlining the implementation of new monitoring rules. This enabled faster go-to-market for data-driven initiatives and demonstrated the importance of automated data testing in ensuring smooth operations between systems.

About iceDQ

iceDQ empowers organizations to ensure data trust and reliability throughout the data life cycle.

Our comprehensive platform combines data testing, data monitoring, and data observability into a single solution, enabling data engineers to proactively manage data quality and eliminate data issues before they impact business decisions.

Leading companies across industries, including prominent players in banking, insurance, and healthcare, rely on iceDQ to continuously test, monitor, and observe their data-driven systems. This ensures trustworthy data that fuels informed decision-making and drives business success.

iceDQ Use Cases

  • Data Testing
  • ETL & Data Warehouse Testing
  • Cloud Data Migration Testing
  • BI Report Testing
  • Big Data Lake Testing
  • System Migration Testing
  • Data Monitoring
  • Data Observability

About the author

Sandesh Gawande

Sandesh Gawande is the Founder and CEO of iceDQ, a unified Data Reliability Platform for automated data testing, monitoring, and observability. With over 25 years of experience in data engineering and architecture, Sandesh has led large-scale data initiatives for Fortune 500 companies across banking, insurance, and healthcare, including Deutsche Bank, JPMorgan Chase, and MetLife.
