
What Is Data Quality and Why Is It Important?

11 minutes read

It is tempting to believe that data and the management of its quality are something new, brought about by the advent of regulations such as ePrivacy and the EU GDPR. They are not. Data, its management, and its quality have been around since information was first recorded: when we started writing things down.

Data Quality Definition

“Data Quality is the planning, implementation, and control of activities that apply quality management techniques to data to ensure it is fit for consumption and meets the needs of data consumers.”

Data Management Body of Knowledge

We could go further and describe data quality as a process: making data operational so that individuals and organizations can draw insights from it to inform their decision-making.

The reason we describe DQ as a process rather than a single item is that it comprises various elements that all contribute to making data “fit for purpose”. Sometimes people use the term Data Preparation to refer to these elements, though data prep should be considered separate for now.

What are the dimensions of data quality?

Sitting underneath the umbrella term of Data Management, DQ takes a holistic view of an entire dataset, combining these elements – often called the dimensions of Data Quality – to provide a snapshot of the quality of data held.

Completeness

Are there gaps in the data and if so, where? Some gaps are worse than others and what is considered a gap depends on the process where the data is used. For example, if the billing department requires both phone number and email address, then no record missing one or the other can be considered complete. You can also measure completeness for any particular column. Profiling your data will uncover these gaps.
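
As a minimal sketch of the two measurements described above (per column and per record), the snippet below computes completeness over a handful of made-up customer records. The field names and the rule that blank strings count as gaps are assumptions for illustration, not part of any particular tool.

```python
from typing import Optional

# Hypothetical customer records; None or a blank string marks a gap.
records = [
    {"name": "Ada", "phone": "555-0100", "email": "ada@example.com"},
    {"name": "Ben", "phone": None, "email": "ben@example.com"},
    {"name": "Cy", "phone": "555-0102", "email": ""},
    {"name": "Di", "phone": "555-0103", "email": "di@example.com"},
]

def is_filled(value: Optional[str]) -> bool:
    """Treat None and blank strings as missing."""
    return value is not None and value.strip() != ""

def column_completeness(rows, column):
    """Share of rows with a non-empty value in one column."""
    return sum(is_filled(r.get(column)) for r in rows) / len(rows)

def record_completeness(rows, required):
    """Share of rows in which every required column is filled."""
    return sum(all(is_filled(r.get(c)) for c in required) for r in rows) / len(rows)

print(column_completeness(records, "phone"))             # 0.75
print(record_completeness(records, ["phone", "email"]))  # 0.5
```

The second number reflects the billing example: only records with both a phone number and an email address count as complete.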

Validity

Are the postcode records you hold in a valid format? How confident are you that the email and postal addresses in your database are capable of receiving mail? Validity checks verify that the data conforms to a particular format, data type, and range of values.

Since data-driven automation is so important nowadays, data has to be valid to be accepted by processes and systems that expect it.
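
The three kinds of validity check mentioned above (format, data type, and range) can be sketched roughly as follows. The patterns are deliberately simplified assumptions (a loose email shape and US-style ZIP codes); real validation rules are usually stricter and country-specific.

```python
import re

# Simplified, illustrative patterns; not complete validation rules.
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")
US_ZIP_RE = re.compile(r"\d{5}(-\d{4})?")

def is_valid_email(value: str) -> bool:
    """Format check: something@domain.tld with no spaces."""
    return EMAIL_RE.fullmatch(value) is not None

def is_valid_zip(value: str) -> bool:
    """Format check: five digits, optionally followed by -dddd."""
    return US_ZIP_RE.fullmatch(value) is not None

def is_valid_age(value) -> bool:
    """Type and range check: an integer between 0 and 130 (assumed bounds)."""
    return isinstance(value, int) and 0 <= value <= 130

print(is_valid_email("ada@example.com"))  # True
print(is_valid_zip("1234"))               # False
print(is_valid_age(200))                  # False
```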

Timeliness

Is new information entering your CRM every day in real-time or are you manually importing it? How often is the data “refreshed”? Timeliness is a crucial dimension because of the increasing need for up-to-date data.

Similar to other dimensions, timeliness is user-defined. One kind of data needs to be available on a quarterly basis for financial reporting. Other data must not be older than 5 minutes for real-time analytics.
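
Because the threshold is user-defined, a timeliness check is essentially a comparison between the age of the data and a maximum age chosen per use case. The sketch below assumes each dataset carries a last-refreshed timestamp; the timestamps and thresholds are made up.

```python
from datetime import datetime, timedelta, timezone

def is_timely(last_refreshed, max_age, now=None):
    """True if the data is no older than the allowed maximum age."""
    now = now or datetime.now(timezone.utc)
    return now - last_refreshed <= max_age

# Fixed "current time" so the example is reproducible.
now = datetime(2022, 7, 1, 12, 0, tzinfo=timezone.utc)
stream_ts = datetime(2022, 7, 1, 11, 58, tzinfo=timezone.utc)
report_ts = datetime(2022, 1, 15, tzinfo=timezone.utc)

# Real-time analytics: data must not be older than 5 minutes.
print(is_timely(stream_ts, timedelta(minutes=5), now))  # True
# Quarterly reporting: allow roughly one quarter (92 days, an assumption).
print(is_timely(report_ts, timedelta(days=92), now))    # False
```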

Uniqueness

Do you have the same customer recorded twice in your data set or data catalog? Uniqueness measures how much duplicate data there is in a given data set, either within any particular column or as whole records. For example, in the orders table, each order should have just one row. If, on the other hand, you encounter two records with the same order id, you have a duplicate. How did it get there? Someone could have mistyped the order number. This brings us to the next dimension: accuracy.
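
A minimal sketch of the orders example: count how often each order id appears and report the ones that occur more than once. The table contents and the `order_id` field name are hypothetical.

```python
from collections import Counter

# Hypothetical orders table; A-1002 appears twice.
orders = [
    {"order_id": "A-1001", "amount": 25.0},
    {"order_id": "A-1002", "amount": 40.0},
    {"order_id": "A-1002", "amount": 15.5},
    {"order_id": "A-1003", "amount": 9.9},
]

def duplicate_keys(rows, key):
    """Return each key value that occurs more than once, with its count."""
    counts = Counter(r[key] for r in rows)
    return {k: n for k, n in counts.items() if n > 1}

def uniqueness(rows, key):
    """Share of distinct key values among all rows."""
    return len({r[key] for r in rows}) / len(rows)

print(duplicate_keys(orders, "order_id"))  # {'A-1002': 2}
print(uniqueness(orders, "order_id"))      # 0.75
```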

Accuracy

Perhaps the most important dimension, accuracy refers to the number of errors in the data. In other words, it measures to what extent recorded data represents the truth. Accuracy is tricky because data might be valid, timely, unique, complete, but inaccurate.

100% accuracy is an aspirational goal for many data managers, and once achieved, the principles of data governance can be combined with DQ to ensure the data does not degrade and become inaccurate ever again.
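
Accuracy can only be measured against something known to be true. The sketch below assumes such a trusted reference exists (for example, values verified with the customer) and reports the share of recorded values that match it; the data is invented for illustration.

```python
# Recorded values versus a trusted reference (both hypothetical).
recorded = {"c1": "Berlin", "c2": "Munich", "c3": "Hamburg", "c4": "Cologne"}
verified = {"c1": "Berlin", "c2": "Munich", "c3": "Bremen", "c4": "Cologne"}

def accuracy(recorded_values, reference_values):
    """Share of records whose recorded value equals the reference value."""
    shared = recorded_values.keys() & reference_values.keys()
    if not shared:
        raise ValueError("no records in common with the reference")
    matches = sum(recorded_values[k] == reference_values[k] for k in shared)
    return matches / len(shared)

print(accuracy(recorded, verified))  # 0.75
```

Note that "Hamburg" would pass a validity check for a city name and still be inaccurate, which is the point made above.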

Consistency

Do you have conflicting information about the same customer in two different systems? That means the data is inconsistent, which might lead to inconsistent reporting and poor customer service.
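
One way to surface such conflicts is to compare the same fields for the same customer across systems. This is a sketch under the assumption that both systems share a customer id; the system names and records are hypothetical.

```python
def find_conflicts(system_a, system_b, fields):
    """List (customer_id, field, value_a, value_b) where the systems disagree."""
    conflicts = []
    for customer_id in sorted(system_a.keys() & system_b.keys()):
        for field in fields:
            a = system_a[customer_id].get(field)
            b = system_b[customer_id].get(field)
            if a != b:
                conflicts.append((customer_id, field, a, b))
    return conflicts

crm = {
    "c1": {"email": "ada@example.com", "city": "Berlin"},
    "c2": {"email": "ben@example.com", "city": "Munich"},
}
billing = {
    "c1": {"email": "ada@example.com", "city": "Berlin"},
    "c2": {"email": "ben@old-mail.example", "city": "Munich"},
}

print(find_conflicts(crm, billing, ["email", "city"]))
# [('c2', 'email', 'ben@example.com', 'ben@old-mail.example')]
```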

The Importance of Data Quality and its Value

Of course, everyone wants to know "why is data quality important?" However, we believe an even more important dimension to data needs to be discussed here: value.

Our definition of data quality's value is this: what are the business, risk, and financial values assigned to any piece of information? In this manner, data analysts and other practitioners of data management can quickly assign priorities to different data sources or specific data domains when they do data quality projects.

We recommend using a tool to assign literal values to your data such as:

Business - how valuable is, for instance, Employee salary data to marketing? Chances are, it has a much higher business value to the HR department, whereas customer emails are more useful for marketing.

Risk - are you holding Personally Identifiable Information (PII)? This means you could be exposed to the risk of GDPR fines if this data is not adequately protected to ensure the individual’s privacy.

Financial - eCommerce companies are the best example of the financial value of data: typically, an email address and a credit card number are all that is needed to transact with customers. Profiling this data, keeping its quality high, and reporting on it over time can therefore help eCommerce businesses understand the average value of a customer and of an accurate email address.
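
To illustrate how assigned values can turn into priorities, the sketch below scores a few data domains on the three value types and ranks them. The 1 to 5 scale, the scores themselves, and the simple sum are all assumptions made for the example; a real program would choose its own scale and weighting.

```python
from dataclasses import dataclass

@dataclass
class DataDomain:
    name: str
    business: int   # 1 (low) to 5 (high), assumed scale
    risk: int
    financial: int

    @property
    def priority(self) -> int:
        """Unweighted sum of the three values (an illustrative choice)."""
        return self.business + self.risk + self.financial

# Hypothetical scores.
domains = [
    DataDomain("employee salaries", business=2, risk=5, financial=1),
    DataDomain("customer emails", business=5, risk=4, financial=5),
    DataDomain("web server logs", business=2, risk=1, financial=1),
]

ranked = sorted(domains, key=lambda d: d.priority, reverse=True)
for d in ranked:
    print(d.name, d.priority)
```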

As you can see from these examples, Data Quality tools can quickly become mission-critical for your business, depending on the quality of the data you need for day-to-day operations. So, why is data quality important? Because it adds value.

What are the business costs and risks of poor data quality?

Data quality maturity curves are becoming more prevalent, and organizations can quickly ascertain whether they’re reactive or optimized and governed in their approach to data management.

An example of an organization that is immature in its capture and management of data is one that does not use validation fields or uses free-form capture fields on the contact forms of its website, allowing anyone to enter whatever they like.

Bad data should not be taken lightly as it poses significant risks and business costs. Below are several examples:

  • Wasted marketing budget: if your organization is sending physical mail to your customers and marketing leads, but those addresses are out of date or invalid, you’ll be wasting precious marketing dollars and time.
  • Non-compliant data: regulations such as the EU General Data Protection Regulation (GDPR) set a standard (Article 5) for maintaining Data Quality in relation to the accuracy and integrity of data. If an organization’s data is found to be non-compliant, it can be fined up to 20 million euros or 4% of annual turnover, whichever is higher!
  • Hindered IT modernization projects: when data moves from source to target system, without correct mapping and data quality tools, old dirty data can wreak havoc on the new system.
  • Poor customer experience: If contact information is of poor quality, you cannot provide customers with a tailored customer experience and serve them via their preferred channel.
  • Fines: In regulated industries such as healthcare and banking, enterprises risk miscalculating key statistics for regulatory reports and getting fined.
  • Unreliable analytics and machine learning: Inaccurate or invalid data will provide inaccurate analytics and unreliable machine learning models.
  • Strategic operational mistakes: Building a warehouse at the wrong location, not catching fraud, producing the wrong alloy are all examples of using bad data for business decision-making.
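
As a worked example of the GDPR ceiling mentioned in the list above (20 million euros or 4% of annual turnover, whichever is higher), the upper bound can be computed as follows. The turnover figures are made up, and the function gives only the maximum, not the fine an authority would actually set.

```python
GDPR_FLAT_CAP_EUR = 20_000_000
GDPR_TURNOVER_PERCENT = 4

def max_gdpr_fine(annual_turnover_eur: int) -> int:
    """Upper bound of the fine in whole euros: the higher of the two limits."""
    turnover_based = annual_turnover_eur * GDPR_TURNOVER_PERCENT // 100
    return max(GDPR_FLAT_CAP_EUR, turnover_based)

print(max_gdpr_fine(100_000_000))    # 20000000 (4% would be only 4 million)
print(max_gdpr_fine(1_000_000_000))  # 40000000 (4% exceeds the flat cap)
```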

    And yes, you can put a number on data quality.

    Bad data costs companies 10-30% of their revenue and correcting mistakes in data costs $1-10 per record.
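
Taking the ranges quoted above at face value, a rough back-of-the-envelope estimate looks like this. The revenue and record counts are hypothetical inputs.

```python
def revenue_at_risk(annual_revenue):
    """Low and high estimate: 10% and 30% of revenue."""
    return annual_revenue * 10 // 100, annual_revenue * 30 // 100

def correction_cost(bad_records):
    """Low and high estimate: $1 and $10 per record."""
    return bad_records * 1, bad_records * 10

print(revenue_at_risk(50_000_000))  # (5000000, 15000000)
print(correction_cost(200_000))     # (200000, 2000000)
```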

    What are the benefits of better data quality?

    There are so many benefits to improving the quality of your information that it is impossible to list them all out, but some of the common ones include:

    • Increased return on investment for marketing activity thanks to improved email and postal deliverability and more reliable targeting
    • Less time spent fixing dirty data. This will save you $1-10 per record.
    • Increased ability to personalize your service or product offerings
    • Improved, faster decision-making
    • Compliance with new and existing regulations and the creation of a consumer-centric data-driven culture

    And many more. Ultimately, your business is unique, and therefore how you benefit from improved DQ is also unique.

    What are must-have features to ensure data quality?

    If you'd like to learn about all the essential capabilities of data quality, you can read the full article here.

    Data Profiling

    Before you do any data quality checks, it’s important to examine your data at its source to better interpret and understand it. Data profiling does this faster and more efficiently than writing SQL queries by hand. It helps with defining what transformations are necessary for the data and what problems to track in the future.
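
To give a feel for what a profile contains, here is a small sketch that summarizes a single column: row count, missing values, distinct values, and the most common value patterns (digits shown as 9, letters as A). This is a simplified illustration, not how any particular profiling tool works.

```python
from collections import Counter

def pattern_of(value: str) -> str:
    """Replace digits with 9 and letters with A, keep other characters."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append("9")
        elif ch.isalpha():
            out.append("A")
        else:
            out.append(ch)
    return "".join(out)

def profile_column(values):
    """Basic profile of one column given as a list of strings or None."""
    present = [v for v in values if v not in (None, "")]
    patterns = Counter(pattern_of(v) for v in present)
    return {
        "count": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
        "top_patterns": patterns.most_common(3),
    }

# Hypothetical phone column.
phones = ["555-0100", "555-0101", None, "5550102", "555-0100", ""]
profile = profile_column(phones)
print(profile)
```

Here the profile reveals two gaps, one repeated value, and one phone number in a different format, which hints at the standardization work needed later.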

    Data cleansing and transformation

    Very often you need to transform data to improve its quality. This includes:

    • Format standardization
    • Parsing data and breaking it down into separate attributes (e.g., full name into first name and last name)
    • Data enrichment: bringing additional data from external sources
    • Data deduplication: removing duplicates from data
    • Data masking: sometimes you need to obfuscate data for security reasons
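
Several of the transformations listed above can be sketched as small functions. The rules used here (digits-only phone numbers, splitting a name at the first space, masking all but the first character of an email's local part) are simplifying assumptions; enrichment is left out because it depends on an external source.

```python
import re

def standardize_phone(raw: str) -> str:
    """Format standardization: keep digits only."""
    return re.sub(r"\D", "", raw)

def split_full_name(full_name: str):
    """Naive parsing: first token is the first name, the rest the last name."""
    first, _, last = full_name.strip().partition(" ")
    return first, last.strip()

def deduplicate(rows, key):
    """Keep the first row seen for each key value, preserving order."""
    seen = set()
    unique = []
    for row in rows:
        if row[key] not in seen:
            seen.add(row[key])
            unique.append(row)
    return unique

def mask_email(email: str) -> str:
    """Masking: hide all but the first character of the local part."""
    local, _, domain = email.partition("@")
    return local[:1] + "*" * (len(local) - 1) + "@" + domain

print(standardize_phone("(555) 010-0100"))  # 5550100100
print(split_full_name("Ada Lovelace"))      # ('Ada', 'Lovelace')
print(mask_email("ada@example.com"))        # a**@example.com
```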

    It’s important to note that these processes need to run automatically on any new data before it travels to other systems, reaches data analysts, and is used for business decision-making.

    That being said, it's even more beneficial and smart to establish processes that validate and “treat data” before it enters any IT system. This is called a data quality firewall. An example is an algorithm that checks data entered into a web form, such as an email address or birth date, against a required format and alerts the user to fix it. But DQ firewalls can be embedded into complex enterprise applications as well.
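
A data quality firewall for a web form might look roughly like the sketch below: the input is checked before it is stored, and the user gets a message for each field that needs fixing. The field names, the ISO date format, and the messages are assumptions for illustration, and the form values are assumed to be strings.

```python
import re
from datetime import date

# Loose email shape; a simplification, not a complete rule.
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")

def validate_signup(form, today=None):
    """Return a dict of field -> message; an empty dict means the input passes."""
    today = today or date.today()
    errors = {}
    if not EMAIL_RE.fullmatch(form.get("email", "")):
        errors["email"] = "Enter an email address like name@example.com."
    try:
        born = date.fromisoformat(form.get("birth_date", ""))
    except ValueError:
        errors["birth_date"] = "Use the format YYYY-MM-DD."
    else:
        if born > today:
            errors["birth_date"] = "Birth date cannot be in the future."
    return errors

today = date(2022, 7, 1)
print(validate_signup({"email": "ada@example.com", "birth_date": "1990-05-17"}, today))
print(validate_signup({"email": "ada@", "birth_date": "17/05/1990"}, today))
```

Only input that returns no errors would be passed on to downstream systems.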

    Monitoring and reporting

    Peter Drucker said it best: “If you can’t measure it, you can’t improve it.” It’s as valid for data quality as it is for business in general. Tracking changes and improvements to data over time is crucial and is usually done through data quality dashboards.

    First, it shows you whether you are moving in the right direction, i.e., whether the data quality metrics that you have defined are improving or not. Second, monitoring data quality helps catch unexpected influxes of bad data and track it to its source. And third, it helps with tracking compliance with regulatory requirements and more.
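
The second point, catching an unexpected influx of bad data, can be sketched as a comparison of a metric between consecutive measurements. The daily validity scores and the 5-point alert threshold below are invented for the example.

```python
# Hypothetical daily validity scores for one dataset.
history = [
    ("2022-06-27", 0.97),
    ("2022-06-28", 0.96),
    ("2022-06-29", 0.97),
    ("2022-06-30", 0.81),
    ("2022-07-01", 0.95),
]

def find_drops(measurements, max_drop=0.05):
    """Return (day, previous, current) wherever the metric fell by more than max_drop."""
    alerts = []
    for (_, prev), (day, curr) in zip(measurements, measurements[1:]):
        if prev - curr > max_drop:
            alerts.append((day, prev, curr))
    return alerts

print(find_drops(history))  # [('2022-06-30', 0.97, 0.81)]
```

An alert like this tells you which day to investigate when tracing bad data back to its source.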

    FAQ

    If you want to know more, here are some frequently asked questions about data quality.

    Can the Data Catalog and Data Quality work together?

    Yes! Monitoring your data quality is much more efficient and accessible when integrating it with your data catalog. More specifically, you can automate data quality workflows using the metadata from the data catalog. Here are other ways the data catalog and data quality benefit each other:

    • Automating data quality monitoring
    • Improving data discovery
    • Streamlining on-demand DQ evaluation
    • Simplifying data preparation
    • Helping discover root causes of quality issues
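
As a rough illustration of metadata-driven automation, the sketch below attaches check names to columns in a tiny stand-in "catalog" and runs whatever checks the catalog lists for a column. The catalog structure, tag names, and checks are all hypothetical and do not represent any product's API.

```python
import re

# Hypothetical library of reusable checks, keyed by a catalog tag.
CHECKS = {
    "not_empty": lambda v: v.strip() != "",
    "email": lambda v: re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", v) is not None,
}

# Hypothetical catalog metadata: which checks apply to which column.
catalog = {
    "customers.email": ["not_empty", "email"],
    "customers.name": ["not_empty"],
}

def evaluate(column, values):
    """Pass rate for every check the catalog attaches to the column."""
    results = {}
    for tag in catalog.get(column, []):
        check = CHECKS[tag]
        results[tag] = sum(check(v) for v in values) / len(values)
    return results

emails = ["ada@example.com", "", "ben@example"]
print(evaluate("customers.email", emails))
```

Tagging a new column in the catalog is then enough to have the matching checks applied to it.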

    What is a real-world example of bad data quality affecting analytics?

    One of the most common places we find data quality issues is census analysis. Many censuses are taken in both paper and digital formats, leading to quality discrepancies like unreadable inputs and duplicate entries for the same respondent. Most census data undergoes data profiling, standardization, enrichment, matching and consolidation, and relationship discovery before it’s considered fit for analysis.

    How to get started with data quality?

    Data quality management can seem like a bit of a daunting task. In our opinion, the first steps of any data quality improvement are:

    1. Determine your current goals and scope (help with a specific business problem dependent on data or focus on a specific critical data element).
    2. Profile your data.
    3. Fix the most urgent issues as soon as possible.
    4. Come up with metrics and methods for measuring data quality.
    5. Monitor data quality problems.
    6. Scale your program to other teams, departments, source systems, and critical data elements.

    Following this process will ensure you find the relevant strategy for your organization and won’t embark on a task that is overwhelming or inadequate.

    How important is data quality for successful AI implementations?

    Data quality is essential for successful AI implementations. Spending too much time preparing data is one of the main reasons AI is so expensive and time-consuming. You can ensure more successful AI implementations if you:

    • Profile your data
    • Perform DQ evaluations
    • Have regular DQ monitoring

    Otherwise, you’ll be building machine learning models on the wrong data sets, inevitably leading to errors or more work for your AI architects.

    Where is Data Quality headed in the future?

    Data quality is undoubtedly here to stay, but what kind of innovations can we expect? Well, you can expect the following improvements in the next few years:

    • Further automation will enable greater adoption of new architectures like the data fabric and data mesh.
    • The term is growing to encompass other aspects of data management like reference and master data management.
    • Data will be deliverable to any user at the company, regardless of skill set.
    • Data quality tools are becoming singular solutions instead of fragmented features that can cause conflict.
    • More systems than people are consuming data.
    • Much more!

    If you’d like to learn more about the future of data quality and how we got here, you can find it all here.

    Improve DQ with Ataccama

    An important first step is to profile your data to understand just what state it is in. There are several data management tools that you can use to do this, many of which offer free versions.
