iCEDQ Manifesto: Why we built iceDQ Platform

“Eliminate 99.99% of the Data issues before
they appear in Production.”

Everyone agrees that data is the new oil. Then why do so many CIOs and CDOs wait for the data to fail in production? This fatalistic approach to data integrity is beyond comprehension.

Many would argue that they have implemented a data quality solution in their production environment. But if it really worked, why did CISQ (the Consortium for Information & Software Quality) report, “For the year 2020, we determined the total Cost of Poor Software Quality (CPSQ) in the US is $2.08 trillion (T)”?

Why is production data bad in the first place?

Lack of Data Testing in Development: Companies test their applications during development, but they rarely test their data, so bad data seeps into production.

Lack of Data Pipeline Testing in Development: Bad data processes result in bad data. Companies often skip testing their data pipelines in development. They fail to understand that the quality of the data pipelines is as much responsible for data quality as the data itself, if not more.


Siloed Teams: The problem is compounded by the siloed approach of data development and operations teams. We have observed that data quality checks are implemented only after the data is live in the production environment, with very little data testing done in development. It is a classic case of too little, too late.


Lack of Data Monitoring in Production Environment: Companies don’t proactively monitor their data in production and the net result is that the data issues are not discovered until the users start complaining.

Organizations run thousands of data pipelines. They monitor whether each process succeeded or failed, but they don’t check whether the process transformed the data correctly.

No wonder so many data warehouse, data migration and big data projects fail to deliver on both quality and time.

The net result of lack of data testing and monitoring:

  • Data issues are detected in the operational environment instead of in development.
  • Data pipelines often have to be pulled back into development for re-engineering.
  • It is very difficult and expensive to undo the layers of data processing in production due to bad data.
  • When the issues are found in operations, damage to the downstream users and systems is already done.

iCEDQ’s DataOps approach to Data Quality:


iCEDQ adopts DataOps to integrate data testing in development with data monitoring in operations, shifting as much work as possible to the left of the data development life cycle.

1

Shift-Left

Focus on data testing and auditing during the development phase; don’t wait for the data to go live.

2

Fix The Process, Not The Problem

Bad data processes will give bad data. Testing data processes ensures that data pipelines do not introduce any more data errors.

3

Automate Data Testing

It is impossible to test millions of records manually. Invest in a purpose-built automated data testing platform such as iceDQ.

4

TDD - Test Driven Development

As the data mapping and data transformation requirements are collected, gather the data audit requirements as well. This allows development and testing to proceed in parallel.

5

Data Reconciliation

Some data issues cannot be detected without reconciling the data against its source, because the correctness of a data value is often relative to another database.
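To make the idea of reconciliation concrete, here is a minimal sketch in Python. It is not iceDQ's implementation or API; the function `reconcile`, the `order_id` key and the sample rows are all hypothetical, and the source and target are plain in-memory lists rather than real databases.

```python
from typing import Dict, List

Row = Dict[str, object]


def reconcile(source: List[Row], target: List[Row], key: str) -> Dict[str, list]:
    """Compare target rows against source rows, matched on `key`."""
    src = {row[key]: row for row in source}
    tgt = {row[key]: row for row in target}
    # Keys present in the source but never loaded into the target.
    missing = sorted(k for k in src if k not in tgt)
    # Keys in the target that have no counterpart in the source.
    unexpected = sorted(k for k in tgt if k not in src)
    # Keys present on both sides whose values differ.
    mismatched = sorted(k for k in src if k in tgt and src[k] != tgt[k])
    return {"missing": missing, "unexpected": unexpected, "mismatched": mismatched}


# Hypothetical sample data (assumption, for illustration only).
source_rows = [
    {"order_id": 1, "amount": 100.0},
    {"order_id": 2, "amount": 250.0},
    {"order_id": 3, "amount": 75.5},
]
target_rows = [
    {"order_id": 1, "amount": 100.0},
    {"order_id": 2, "amount": 205.0},  # looks valid on its own, wrong versus source
    {"order_id": 4, "amount": 10.0},
]

result = reconcile(source_rows, target_rows, key="order_id")
print(result)
```

The amount 205.0 for order 2 would pass any check that looks only at the target table; it is flagged only because it is compared with the source.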

6

Whitebox Data Monitoring

The integration of DEV and OPS teams will allow incorporation of data checks as part of the code so that they will be reused in production data monitoring.

7

Business Rules Based Audit Data

Monitor data based on auditing principles so that data errors are captured before users or the business are impacted.

8

Monitor Data Pipelines

One of the key aspects is to ensure that the data processes are not introducing data errors. A data process has completed successfully only when the process has finished and the data has been transformed correctly.
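The difference between "the job finished" and "the data is right" can be sketched as follows. This is an illustrative example under stated assumptions, not iceDQ functionality: the mapping rule `cents_to_dollars`, the `RunResult` structure, the `"SUCCESS"` status string and the column names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class RunResult:
    status: str                  # status as a scheduler might report it (assumed)
    rows_in: int
    rows_out: int
    transform_errors: List[int]  # ids whose output disagrees with the mapping rule


def cents_to_dollars(cents: int) -> float:
    """Hypothetical mapping rule the pipeline is supposed to apply."""
    return cents / 100


def verify_run(status: str, source: List[Dict], target: List[Dict]) -> RunResult:
    """Recompute the expected output from the source and compare it with the target."""
    expected = {r["id"]: cents_to_dollars(r["amount_cents"]) for r in source}
    actual = {r["id"]: r["amount_usd"] for r in target}
    bad = sorted(
        i for i in expected
        if i not in actual or abs(actual[i] - expected[i]) > 1e-9
    )
    return RunResult(status, len(source), len(target), bad)


def run_is_healthy(result: RunResult) -> bool:
    """Healthy means: process completed AND counts match AND values are correct."""
    return (
        result.status == "SUCCESS"
        and result.rows_in == result.rows_out
        and not result.transform_errors
    )


source = [{"id": 1, "amount_cents": 1999}, {"id": 2, "amount_cents": 500}]
# The job "succeeded", but row 2 was loaded without applying the conversion.
target = [{"id": 1, "amount_usd": 19.99}, {"id": 2, "amount_usd": 500.0}]

result = verify_run("SUCCESS", source, target)
print(result, run_is_healthy(result))
```

A monitor that looked only at `status` would report this run as fine; the value comparison is what exposes row 2.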

9

Establish Checks And Controls

Processes often introduce data errors. If a failing process is not stopped immediately, the errors cascade into downstream data and compound, sometimes irreversibly. Hence it is necessary to establish checks and controls in the process executions.
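One way to picture such a control is a pipeline runner that validates the output of each step and halts before the next step can consume bad data. The sketch below is an assumption-laden illustration, not a description of how iceDQ works; `run_pipeline`, `DataCheckFailed` and the step functions are hypothetical names.

```python
from typing import Callable, Dict, List, Tuple

Rows = List[Dict[str, object]]
# A step is (name, transform, check); check returns a list of problem descriptions.
Step = Tuple[str, Callable[[Rows], Rows], Callable[[Rows], List[str]]]


class DataCheckFailed(Exception):
    """Raised to halt the pipeline when a control detects bad data."""


def run_pipeline(rows: Rows, steps: List[Step], executed: List[str]) -> Rows:
    for name, transform, check in steps:
        rows = transform(rows)
        executed.append(name)
        problems = check(rows)
        if problems:
            # Stop here so later steps never process the bad rows.
            raise DataCheckFailed(f"step '{name}' failed: {problems}")
    return rows


def stage(rows: Rows) -> Rows:
    return [dict(r) for r in rows]  # placeholder for a real staging load


def no_null_customer(rows: Rows) -> List[str]:
    return [
        f"row {i}: customer_id is null"
        for i, r in enumerate(rows)
        if r["customer_id"] is None
    ]


def total_by_customer(rows: Rows) -> Rows:
    totals: Dict[object, float] = {}
    for r in rows:
        totals[r["customer_id"]] = totals.get(r["customer_id"], 0) + r["amount"]
    return [{"customer_id": c, "amount": a} for c, a in totals.items()]


def no_check(rows: Rows) -> List[str]:
    return []


steps = [
    ("stage", stage, no_null_customer),
    ("aggregate", total_by_customer, no_check),
]
executed: List[str] = []
halted = False
try:
    run_pipeline(
        [{"customer_id": "A", "amount": 10}, {"customer_id": None, "amount": 5}],
        steps,
        executed,
    )
except DataCheckFailed as exc:
    halted = True
    print(exc)
```

Because the check after `stage` fails, the `aggregate` step never runs, so the null customer is not folded into totals that would be hard to unwind later.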

10

Link Business Processes To Data Audits

Simply knowing about the data issues is not enough. Link each data issue to the business process it affects so that the actual impact on that process can be gauged.

11

Pinpoint Data Exceptions

A data audit must pinpoint the actual location of the data issue so that root cause analysis can be conducted.
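As a rough illustration of what "pinpointing" can mean in practice, the sketch below reports each failure with the record key, the column, the offending value and the rule that was violated, rather than a bare pass/fail. The `audit` function, the `DataException` record, the rules and the trade data are hypothetical and are not taken from iceDQ.

```python
from typing import Callable, Dict, List, NamedTuple, Tuple


class DataException(NamedTuple):
    key: object    # identifier of the failing record
    column: str    # column in which the issue was found
    value: object  # offending value
    rule: str      # description of the violated rule


Rules = Dict[str, Tuple[str, Callable[[object], bool]]]


def audit(rows: List[Dict[str, object]], key: str, rules: Rules) -> List[DataException]:
    """Apply each column rule to each row and record exactly where it fails."""
    exceptions = []
    for row in rows:
        for column, (rule_name, is_valid) in rules.items():
            if not is_valid(row[column]):
                exceptions.append(
                    DataException(row[key], column, row[column], rule_name)
                )
    return exceptions


# Hypothetical rules and data (assumptions, for illustration only).
rules: Rules = {
    "quantity": ("quantity must be positive", lambda v: isinstance(v, int) and v > 0),
    "currency": ("currency must be a known code", lambda v: v in {"USD", "EUR", "INR"}),
}
trades = [
    {"trade_id": "T1", "quantity": 10, "currency": "USD"},
    {"trade_id": "T2", "quantity": -5, "currency": "USD"},
    {"trade_id": "T3", "quantity": 3, "currency": "usd"},
]

found = audit(trades, "trade_id", rules)
for exception in found:
    print(exception)
```

With the key and column in hand, whoever investigates can go straight to the record and trace it back through the pipeline instead of searching the whole table.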

Why must you act now?

Has your company invested in data testing? If application development requires testing, so do data-centric projects. The Consortium for Information & Software Quality’s CPSQ-2020 Report further states that “defects that need to be corrected would be $1.31 Trillion”.

It is no longer enough to simply develop data pipelines and dump them in production. Data Architects and Data Engineers must fully incorporate data testing in their development practices. Likewise, CDOs, Data Stewards and Compliance Officers, along with their business users, should not wait for the data to reach production; they should get actively involved in development to ensure that the data processes are audited and certified before they are deployed.

iCEDQ Engine

“We believe unified data testing and monitoring
not only reduces the development cost
but also eliminates data defects in production,
and that’s why we built iceDQ,
a fully integrated data testing and monitoring platform.”

iceDQ stands for Integrity Check Engine For Data Quality

Our Company

Headquartered in Stamford, CT, Torana Inc was established in 2005 by a team of Data Architects to solve the challenges organizations face in data-centric projects and systems. In 2008 Torana established a software R&D center in Nagpur, India.

We have a team of 120 developers, architects, analysts and consultants in the USA and India.

We are deeply committed to and invested in the success of our customers and partners. Torana has a deep understanding of the workings of data-centric systems such as Data Warehouse, ETL, MDM (Master Data Management), RDM (Reference Data Management), CRM (Customer Relationship Management), CDP (Customer Data Platform), MDW (Marketing Data Warehouse) and RPA (Robotic Process Automation).

Our Story


Customers

Partners

  • Snowflake
  • Yellowbrick
  • Manta
  • HP Vertica

Achievements

Best Value Software by SoftwareSuggest

Best Business Intelligence Software by FinanceOnline