
Data quality issues (causes & consequences)

July 29, 2024 11 min. read
[Illustration: common data quality issues, their causes, and consequences]

When you work with data (or “in data”), you inevitably run into data quality issues that create inefficiencies and process flaws that are hard to overlook. Sometimes, though, it’s hard to put those problems into words and make the case to leadership or decision-makers. In this article, we try to do exactly that.

If you are responsible for data or are trying to build a data quality management and data governance program in your organization, this guide will explain data quality issues, what causes them, and how they affect business initiatives.

Below are the top causes and consequences of poor data quality. If any of these seem familiar, it might be time to start addressing or improving your data quality management program.

If you already know the causes and consequences of low-quality data in your organization, you can download our Data Trust Playbook, a comprehensive guide to building data trust so that you can deliver trusted data that fuels business initiatives.

What we’ll discuss in this article

  • What are data quality issues?
  • 5 causes of data quality issues
  • 8 consequences of bad data quality

What are data quality issues?

Data quality issues arise when your organization’s data is inaccurate, inconsistent, incomplete, or outdated. Put simply, a data quality issue exists whenever data does not reflect reality. It can happen because of missing values, duplicate records, differences between data sources, and many other reasons.

If data quality issues are not addressed, low-quality data circulates and gets used throughout the organization, leading to poor business decisions that can have disastrous consequences.
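To make the issue types above concrete, here is a minimal sketch in pandas that profiles a made-up customer extract for missing values, duplicate records, and outdated rows. The column names and cutoff date are illustrative assumptions, not a prescription:

```python
import pandas as pd

# Hypothetical customer extract, purely for illustration.
customers = pd.DataFrame({
    "customer_id": [1, 2, 2, 3],
    "email":       ["a@example.com", None, None, "c@example.com"],
    "updated_at":  pd.to_datetime(["2024-06-01", "2023-01-15", "2023-01-15", "2024-07-01"]),
})

missing_emails = customers["email"].isna().sum()                   # incomplete records
duplicate_ids  = customers.duplicated(subset="customer_id").sum()  # duplicate records
stale_rows     = (customers["updated_at"] < "2024-01-01").sum()    # outdated records

print(f"missing emails: {missing_emails}, duplicate ids: {duplicate_ids}, stale rows: {stale_rows}")
```

Even a lightweight profile like this turns a vague sense that the data is off into numbers you can track and escalate.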

What causes data quality issues? 5 common causes

Data quality issues occur for various reasons, and they are often made worse by poor data governance. Here are 5 common causes of poor data quality:

1. Unclear ownership of data

Data ownership means assigning an owner to a data domain or data source. It is critical because a lack of accountability is one of the most common sources of data quality problems.

  • Without clearly defined ownership, there is no accountability for the data that is produced.
  • When a change is about to affect a particular data source, or when something goes wrong, you need to know who to alert about the change or who to contact to fix the issue.

Not having documented data owners makes it hard to implement data quality improvement initiatives. Ownership is important because it imposes accountability for data quality.
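One simple way to make ownership actionable is to record it somewhere machine-readable so that alerts can be routed automatically. The sketch below is a hypothetical illustration (the dataset names, owners, and contact details are invented); in practice such a registry would live in a data catalog rather than in code:

```python
# Hypothetical ownership registry; in a real setup this lives in a data catalog.
DATA_OWNERS = {
    "crm.customers": {"owner": "Jane Doe",   "team": "Sales Ops", "contact": "jane.doe@example.com"},
    "erp.invoices":  {"owner": "Max Miller", "team": "Finance",   "contact": "max.miller@example.com"},
}

def notify_owner(dataset: str, issue: str) -> None:
    """Route a data issue, or a heads-up about an upcoming change, to the accountable person."""
    entry = DATA_OWNERS.get(dataset)
    if entry is None:
        raise LookupError(f"No documented owner for {dataset}")
    print(f"Alerting {entry['owner']} ({entry['contact']}): {issue}")

notify_owner("crm.customers", "duplicate customer_id values detected in last night's load")
```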

2. Siloed operations

Teams often operate in silos. Business teams do not communicate effectively with each other, technical teams work individually, and business and technical teams are not connected. They may collaborate on data initiatives on a one-off basis, but the outcomes they produce are not documented or shared with others. This means the same issues are likely to come up again on the next data project.

For example, data scientists typically spend 80% of their time collecting and cleansing data held by different teams in different locations. But if data owners, data stewards, data engineers, and data scientists all work together to create standards and expectations for critical data assets used for models and reporting, they can improve the quality of data. And with clean data more readily available, data scientists are able to spend less time preparing data and more time delivering strategic projects.

3. No data quality program in place

To conquer data quality issues, you need to look at them strategically. Data quality needs to become an enterprise-wide program, with shared tools, data quality rules, enablement, and reporting both on data quality metrics and the impact of improved data on business initiatives.

Data quality can emerge as a bottom-up push, but it absolutely needs to become a top-down motion, where the whole organization understands the importance and gets on board.
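To illustrate what shared rules and metric reporting can look like at the smallest possible scale, here is a hedged sketch: a few declarative rules evaluated against an invented orders table, each published as a pass rate. A real program would rely on dedicated data quality tooling rather than ad hoc scripts:

```python
import pandas as pd

# Hypothetical orders extract and rule set, for illustration only.
orders = pd.DataFrame({
    "order_id": [100, 101, 102, 103],
    "amount":   [59.90, -5.00, 120.00, None],
    "country":  ["DE", "US", "XX", "FR"],
})

rules = {
    "amount_is_present":  lambda df: df["amount"].notna(),
    "amount_is_positive": lambda df: df["amount"] > 0,
    "country_is_known":   lambda df: df["country"].isin(["DE", "US", "FR"]),
}

# Evaluate each rule and publish a pass rate: the kind of data quality metric
# an enterprise-wide program would track and report over time.
for name, rule in rules.items():
    print(f"{name}: {rule(orders).mean():.0%} of rows pass")
```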

4. No visibility into the state of data and data flows

In many enterprises, teams don’t have an easy way to see what data is available and how it flows through systems. Without visibility into the data that exists in their organization, teams can’t assess its quality, which leads to mistrust in the data.

This issue affects data scientists, data engineers, business analysts, business subject matter experts, and IT. However, proper tooling can solve this issue. A data catalog with data lineage and data quality capabilities provides visibility into data stored across systems and its quality, enabling teams to better understand and trust their data.
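To show why lineage matters for visibility, here is a minimal sketch built on an invented lineage graph: given a dataset with a quality problem, it lists every downstream asset that inherits that problem. The dataset names are assumptions for illustration only:

```python
from collections import deque

# Hypothetical lineage graph: each dataset maps to its direct downstream consumers.
LINEAGE = {
    "crm.customers":    ["dwh.dim_customer"],
    "erp.invoices":     ["dwh.fact_revenue"],
    "dwh.dim_customer": ["reporting.customer_360", "ml.churn_features"],
    "dwh.fact_revenue": ["reporting.revenue_dashboard"],
}

def downstream(dataset: str) -> set[str]:
    """Return everything that inherits a quality issue from the given dataset."""
    seen, queue = set(), deque([dataset])
    while queue:
        for child in LINEAGE.get(queue.popleft(), []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen

print(downstream("crm.customers"))
# {'dwh.dim_customer', 'reporting.customer_360', 'ml.churn_features'}
```

Knowing that a bad customer table feeds a Customer 360 report and a churn model is exactly the context teams need to prioritize a fix.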

5. Manual data management processes

Many organizations manage their data manually. The following data processes are big time-wasters when done manually:

  • Data collection
  • Issue correction
  • Data classification
  • Data validation
  • Data cleansing

Organizations without dedicated data quality tools, controls, and workflows for these processes end up repeating them. This can drain budgets and leave room for human errors to occur, leading to a higher chance of data quality issues and poor data quality.
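For contrast, here is a minimal sketch of what automating two of those steps, cleansing and validation, can look like. The column names and rules are assumptions for illustration, not a prescription:

```python
import pandas as pd

def cleanse(df: pd.DataFrame) -> pd.DataFrame:
    """Apply the same standardization on every load instead of fixing records by hand."""
    out = df.copy()
    out["email"] = out["email"].str.strip().str.lower()
    out["signup_date"] = pd.to_datetime(out["signup_date"], errors="coerce")
    return out.drop_duplicates(subset="customer_id", keep="first")

def validate(df: pd.DataFrame) -> list[str]:
    """Return human-readable findings instead of silently passing bad data downstream."""
    findings = []
    if df["email"].isna().any():
        findings.append("missing email addresses")
    if df["signup_date"].isna().any():
        findings.append("unparseable signup dates")
    return findings

raw = pd.DataFrame({
    "customer_id": [1, 1, 2],
    "email":       [" Ann@Example.com", " Ann@Example.com", None],
    "signup_date": ["2024-05-02", "2024-05-02", "not a date"],
})
print(validate(cleanse(raw)))  # ['missing email addresses', 'unparseable signup dates']
```

Once checks like these run on every load, the findings can be routed to the documented data owner instead of being rediscovered by hand each time.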

What are the consequences of bad data quality?

If data quality concerns go unaddressed, poor data quality compounds and causes greater problems for your business, now and in the future. Here are a few significant consequences of unaddressed data quality issues:

1. The data engineering team is flooded with requests to fix data

Data engineers are usually responsible for fixing data quality concerns in organizations with complex data pipelines. They have to do this repeatedly, and it often takes a lot of time to find the root cause of the issue. While they hunt for the answer, more data quality issues can build up, causing poor overall data quality for the business.

Fixing all these issues leaves data engineers with less time to write code or maintain quality checks, and the backlog of problems only grows with the workload, which clearly shows the impact of poor data quality.

Data scientists spend about 60% of their time verifying, cleaning up, correcting, or even wholly scrapping and reworking data. They also spend about 19% of their time hunting and chasing the information they need.

A salesperson trying to call potential customers can waste up to 27% of their time due to contact details that are incorrect or incomplete.

2. Data-dependent teams don’t trust the data

It’s hard to be data-driven when you don’t trust your data. If your teams believe you have bad data quality, they won’t be able to do their work without double- and triple-checking results. Here are signs that you have data trust issues:

  • Business leaders don’t trust the reports
  • Data scientists spend too much time validating and cleansing data
  • Product teams are reluctant to use data to make decisions on new product development
  • Teams are reluctant to use data from other business units

Since analyzing data is one of the primary ways to extract value, poor data quality leads to poor business decisions and failed projects.

Marketers waste 21 cents of every media dollar because of low-quality data. Missing or invalid customer contact information can cost an average large-sized company up to $238,000 per campaign. This is a clear example of the broader impact of poor data quality on business performance.

3. Long lead time for deriving value from data

If it takes you weeks or months to access data and create reports, something needs to change.

Unfortunately, organizations with low data maturity operate in this fashion. When someone needs access to data, they go through a convoluted process of figuring out where that data is stored and who owns it. Then, they wait for approval to get access to that data. When they finally have access, they notice the data is inaccurate, and they either look for new data or try to fix it — on their own or by involving someone technical. By the time they’re finished, they have spent a lot of time correcting data inaccuracies and have likely used up a significant amount of other people’s time as well.

This complex issue stems from the lack of the right systems, data governance processes, and tools to manage data.

4. Your M&As didn’t go as intended

Mergers and acquisitions (M&As) are data-intensive activities, and 70-90% of them fail, with integration being one of the top reasons.

Indeed, M&As achieve little without integrating systems and data. That is why master data management best practices are so important.

If your company has been through one or more M&As, look for the following signs that data quality concerns undermined the deal:

  • The integration timeline was extended
  • Fewer systems were integrated or migrated than expected
  • Organizations use inconsistent business language
  • No single view of customers, employees, or other data domains exists

Learn more about data management for M&As here.

5. AI models have questionable ROI and performance

It’s a cliché at this point, but “garbage in, garbage out” holds true for AI and machine learning (ML) models built on bad data.

Data quality is one of the top factors influencing model performance, deployment speed, and long-term reliability. Top performers in AI generate up to 20% of their EBIT from AI models, but getting to that level requires solid investment in data management foundations, such as:

  • Data & AI governance
  • Automated data quality management
  • Monitoring models for data drift (a simple check is sketched below)
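As one concrete example of the last point, a drift check can be as simple as comparing a feature’s production distribution against its training baseline. The sketch below computes a population stability index (PSI) with NumPy on synthetic data; the 0.2 threshold is a common rule of thumb, not a universal standard:

```python
import numpy as np

def population_stability_index(baseline: np.ndarray, current: np.ndarray, bins: int = 10) -> float:
    """Quantify how far a feature's current distribution has moved from its baseline."""
    edges = np.histogram_bin_edges(baseline, bins=bins)
    base_frac = np.histogram(baseline, bins=edges)[0] / len(baseline)
    curr_frac = np.histogram(current, bins=edges)[0] / len(current)
    # Clip to avoid division by zero in sparsely populated bins.
    base_frac = np.clip(base_frac, 1e-6, None)
    curr_frac = np.clip(curr_frac, 1e-6, None)
    return float(np.sum((curr_frac - base_frac) * np.log(curr_frac / base_frac)))

rng = np.random.default_rng(0)
training   = rng.normal(loc=50, scale=10, size=10_000)  # feature at training time
production = rng.normal(loc=57, scale=12, size=10_000)  # same feature today
psi = population_stability_index(training, production)
print(f"PSI = {psi:.2f}", "-> investigate" if psi > 0.2 else "-> looks stable")
```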

If you don’t have these basics in place, you’re likely dealing with one or several of the following issues:

  • Frequent reports of data drift and long investigation times (days to weeks)
  • Fewer models deployed than expected on a consistent basis
  • AI projects that do not deliver the expected results (e.g., model accuracy)
  • The Head of AI or chief data scientists regularly raise data quality issues


6. System modernization projects go over time and budget

Data and system modernization are on the agenda of every data-driven and innovative organization. Modernization projects simplify the IT system landscape and data flows, consolidate billing and operations, and accelerate data-related activities. Some examples are ERP and core systems consolidation, CRM migrations, Customer 360 projects, and data consumption modernization, like moving from an on-premises data warehouse to a data lakehouse architecture in the cloud.

All of these projects depend on the state of your data. When data inaccuracies accumulate and information is not consistent or valid, projects eventually grind to a halt. You definitely have a larger problem with data if:

  • Modernization projects go over time and budget
  • Projects are often scrapped or put on hold

88% of all data integration projects fail entirely or significantly overrun their budgets because of poor data quality. 33% of organizations have delayed or canceled new IT systems for the same reason.

Poor data quality erodes about 10% of the savings from IT initiatives: if an initiative saves $21.7 million in labor productivity, roughly $2.2 million of that will be lost.

7. Reporting is manual, ad hoc, and unreliable

Accurate reporting is the bedrock of any data-driven organization. Companies in regulated industries, such as banking, insurance, and life sciences, must submit regulatory reports to authorities, which sets an even higher standard.

Here are some common pains that these organizations experience when dealing with data quality concerns:

  • Reporting periods end, and the responsible teams must work overtime to manually compile data for reporting.
  • Teams aggregate spreadsheets in a data mart manually.
  • Authorities reject reports, and teams have to fix the data quality issues manually and prepare the reports again.

Companies whose data does not meet GDPR standards risk fines of up to €20 million or 4% of annual turnover.

Noncompliant or inaccurate data can also result in costly business impacts, including:

  • Business disruption: $5.1M
  • Productivity loss: $3.8M
  • Revenue loss: $4M
  • Fines and penalties: $2M

8. Customer acquisition and retention metrics are degrading

Customer-centricity is the single most important factor for successful businesses. Companies built around and for their customers are 60% more profitable than others. They are also more likely to receive more information from their customers. It’s a virtuous cycle.

However, what happens when bad data quality causes customer data to degrade over time? Here is a non-exhaustive list of signs that your customer data needs attention:

  • Declining marketing ROI.
  • Lack of agility and long lead times to prepare data for marketing campaigns.
  • Marketing leadership questioning reporting and analytics.
  • Frequent customer complaints around preferred methods of communication.
  • Delayed billing and reconciliation.

Customer data typically degenerates at 2% monthly and 25% annually.

Individuals will receive offers they aren’t interested in, sometimes two or three times if a duplicate record exists.
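Catching those duplicates is usually a matter of normalizing the identifying fields before matching. A minimal sketch, using invented CRM records:

```python
import pandas as pd

# Hypothetical CRM extract; duplicates often differ only in formatting.
crm = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "name":        ["Ann Smith", "ann smith", "Bob Jones"],
    "email":       ["Ann.Smith@Example.com ", "ann.smith@example.com", "bob@example.com"],
})

# Normalize before matching; otherwise trivial formatting differences hide duplicates.
crm["email_key"] = crm["email"].str.strip().str.lower()
duplicates = crm[crm.duplicated(subset="email_key", keep=False)]
print(duplicates[["customer_id", "name", "email"]])
```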

Low-quality geolocation or inventory data can lead to delayed deliveries, scanning errors, and phantom stock-outs.


Overcome data quality concerns today with our data trust playbook

Data quality concerns and bad data quality aren’t just a nuisance. They can significantly hinder your organization’s ability to thrive.

From operational inefficiencies and missed opportunities to failed projects and damaged customer relationships, the cost of neglecting data quality can be substantial, making it vital to explore data quality solutions for your organization. To ensure your organization’s data is a valuable asset rather than a liability, it’s crucial to proactively address data quality issues.

Download our free ebook, “The Data Trust Playbook”, and discover a comprehensive guide that explores data quality issues and solutions to help you build a robust program that improves data trust.

Author

David Gregory

David is passionate about all things data, cutting through the mundane "new oil" narratives to extract real-world value from this indispensable resource.

Published at 29.07.2024
Updated at 30.09.2025
