What Is Data Quality and Why Is It Important?

11 minutes read

It is tempting to believe that data and the management of its quality are something new, brought about by the advent of regulations such as E-Privacy and the EU GDPR. They are not. Data, its management, and its quality have been around since information was first created: when we started writing things down.

Data Quality Definition

“Data Quality is the planning, implementation, and control of activities that apply quality management techniques to data to ensure it is fit for consumption and meets the needs of data consumers.”

Data Management Body of Knowledge

We could go further and describe data quality (DQ) as a process: one that makes data operational and enables individuals and organizations to draw insights that inform their decision-making.

The reason we describe DQ as a process rather than a single item is that it comprises various elements that all contribute to making data “fit for purpose”. Sometimes people use the term Data Preparation to refer to these elements, though data prep should be considered separate for now.

What are the dimensions of data quality?

Sitting underneath the umbrella term of Data Management, DQ takes a holistic view of an entire dataset, combining these elements – often called the dimensions of Data Quality – to provide a snapshot of the quality of data held.

Completeness

Are there gaps in the data and if so, where? Some gaps are worse than others and what is considered a gap depends on the process where the data is used. For example, if the billing department requires both phone number and email address, then no record missing one or the other can be considered complete. You can also measure completeness for any particular column. Profiling your data will uncover these gaps.
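
As a rough illustration of the billing example above, the sketch below measures completeness both for a single column and for whole records. The sample records, the field names, and the rule that a blank string counts as a gap are all assumptions made for the example.

```python
# Hypothetical customer records; None or "" counts as a gap.
records = [
    {"id": 1, "phone": "555-0100", "email": "ana@example.com"},
    {"id": 2, "phone": None, "email": "ben@example.com"},
    {"id": 3, "phone": "555-0102", "email": ""},
    {"id": 4, "phone": "555-0103", "email": "dee@example.com"},
]


def is_filled(value):
    """Treat None and blank strings as missing."""
    return value is not None and str(value).strip() != ""


def column_completeness(rows, column):
    """Share of rows in which `column` holds a value."""
    if not rows:
        return 0.0
    return sum(is_filled(r.get(column)) for r in rows) / len(rows)


def record_completeness(rows, required):
    """Share of rows in which every required column holds a value."""
    if not rows:
        return 0.0
    complete = sum(all(is_filled(r.get(c)) for c in required) for r in rows)
    return complete / len(rows)


phone_ratio = column_completeness(records, "phone")
billing_ratio = record_completeness(records, ["phone", "email"])
print(f"phone: {phone_ratio:.0%}, billing-complete records: {billing_ratio:.0%}")
```

Each column here is 75% complete on its own, yet only half of the records satisfy the billing requirement, which is why the definition of a gap depends on the process that uses the data.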

Validity

Are the postcode records you hold in a valid format? How confident are you that the email and postal addresses in your database are capable of receiving mail? Validity checks verify that the data conforms to a particular format, data type, and range of values.

Since data-driven automation is so important nowadays, data has to be valid to be accepted by processes and systems that expect it.
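
The three kinds of validity check named above (format, data type, and range) can be sketched as follows. The patterns are deliberately simple, and the five-digit postcode rule, the age limits, and the sample record are assumptions for illustration only.

```python
import re
from datetime import date

# Deliberately simple patterns for illustration; production rules are stricter.
EMAIL_PATTERN = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")
POSTCODE_PATTERN = re.compile(r"\d{5}(-\d{4})?")  # assumed US-style postcode


def valid_format(value, pattern):
    """Format check: the whole string must match the pattern."""
    return isinstance(value, str) and pattern.fullmatch(value) is not None


def valid_type(value, expected_type):
    """Data type check."""
    return isinstance(value, expected_type)


def valid_range(value, low, high):
    """Range check, inclusive on both ends."""
    return low <= value <= high


record = {
    "email": "ana@example",      # missing top-level domain
    "postcode": "90210",
    "age": 212,                  # outside any plausible range
    "signup": date(2022, 7, 20),
}

checks = {
    "email format": valid_format(record["email"], EMAIL_PATTERN),
    "postcode format": valid_format(record["postcode"], POSTCODE_PATTERN),
    "signup type": valid_type(record["signup"], date),
    "age range": valid_type(record["age"], int) and valid_range(record["age"], 0, 130),
}
failed = [name for name, ok in checks.items() if not ok]
print(failed)
```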

Timeliness

Is new information entering your CRM every day in real-time or are you manually importing it? How often is the data “refreshed”? Timeliness is a crucial dimension because of the increasing need for up-to-date data.

Similar to other dimensions, timeliness is user-defined. One kind of data needs to be available on a quarterly basis for financial reporting. Other data must not be older than 5 minutes for real-time analytics.
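
Because timeliness is user-defined, one way to check it is to give each data set its own maximum age. The sketch below assumes two hypothetical data sets with the quarterly and five-minute requirements mentioned above; the names and the 92-day approximation of a quarter are illustrative.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical freshness requirements per data set.
MAX_AGE = {
    "financial_report": timedelta(days=92),  # roughly one quarter
    "clickstream": timedelta(minutes=5),     # real-time analytics
}


def is_timely(dataset, last_refreshed, now):
    """True if the data set was refreshed within its allowed age."""
    return now - last_refreshed <= MAX_AGE[dataset]


now = datetime(2022, 7, 20, 12, 0, tzinfo=timezone.utc)
report_ok = is_timely("financial_report", now - timedelta(days=30), now)
clicks_ok = is_timely("clickstream", now - timedelta(minutes=12), now)
print(report_ok, clicks_ok)
```

The same 12-minute delay that makes the clickstream data stale would be irrelevant for the quarterly report.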

Uniqueness

Do you have the same customer recorded twice in your data set or data catalog? Uniqueness measures how much duplicate data there is in a given data set, either within any particular column or as whole records. For example, in the orders table, each order should have just one row. If, on the other hand, you encounter two records with the same order id, you have a duplicate. How did it get there? Someone could have mistyped the order number. This brings us to the next dimension: accuracy.
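
For the orders example, a minimal uniqueness check might count how often each order id occurs. The table contents and the `order_id` column name below are assumptions.

```python
from collections import Counter

# Hypothetical orders table; "A-1002" appears twice.
orders = [
    {"order_id": "A-1001", "customer": "ana"},
    {"order_id": "A-1002", "customer": "ben"},
    {"order_id": "A-1002", "customer": "cyd"},
    {"order_id": "A-1003", "customer": "dee"},
]


def duplicate_keys(rows, key):
    """Key values that appear on more than one row."""
    counts = Counter(r[key] for r in rows)
    return sorted(k for k, n in counts.items() if n > 1)


def uniqueness(rows, key):
    """Distinct key values divided by row count (1.0 means no duplicates)."""
    if not rows:
        return 1.0
    return len({r[key] for r in rows}) / len(rows)


print(duplicate_keys(orders, "order_id"), uniqueness(orders, "order_id"))
```

Note that the two rows sharing an id name different customers, so at least one of them is probably a mistyped order number rather than a true copy.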

Accuracy

Perhaps the most important dimension, accuracy refers to the number of errors in the data. In other words, it measures to what extent recorded data represents the truth. Accuracy is tricky because data might be valid, timely, unique, and complete, yet still inaccurate.

100% accuracy is an aspirational goal for many data managers, and once achieved, the principles of data governance can be combined with DQ to ensure the data does not degrade and become inaccurate ever again.

Consistency

Do you have conflicting information about the same customer in two different systems? That means the data is inconsistent, which might lead to inconsistent reporting and poor customer service.
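
A simple consistency check compares the same customers across two systems, field by field. The two dictionaries below stand in for hypothetical CRM and billing extracts keyed by customer id.

```python
# Hypothetical extracts from two systems, keyed by customer id.
crm = {
    "C-1": {"email": "ana@example.com", "city": "Prague"},
    "C-2": {"email": "ben@example.com", "city": "Boston"},
}
billing = {
    "C-1": {"email": "ana@example.com", "city": "Prague"},
    "C-2": {"email": "ben@old-mail.example", "city": "Boston"},
}


def find_conflicts(system_a, system_b, fields):
    """List (customer_id, field, value_a, value_b) where shared customers disagree."""
    conflicts = []
    for customer_id in sorted(system_a.keys() & system_b.keys()):
        for field in fields:
            a = system_a[customer_id].get(field)
            b = system_b[customer_id].get(field)
            if a != b:
                conflicts.append((customer_id, field, a, b))
    return conflicts


conflicts = find_conflicts(crm, billing, ["email", "city"])
print(conflicts)
```

The check only reports that the two systems disagree; deciding which value is the accurate one is a separate step.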

The Importance of Data Quality and its Value

Of course, everyone wants to know "why is data quality important?" However, we believe an even more important dimension to data needs to be discussed here: value.

Our definition of data quality's value is this: what are the business, risk, and financial values assigned to any piece of information? In this manner, data analysts and other practitioners of data management can quickly assign priorities to different data sources or specific data domains when they do data quality projects.

We recommend using a tool to assign literal values to your data such as:

Business - how valuable is, for instance, employee salary data to marketing? Chances are it has a much higher business value to the HR department, whereas customer emails are more useful for marketing.

Risk - are you holding Personally Identifiable Information (PII)? If so, you could be exposed to the risk of GDPR fines if this data is not adequately protected to ensure the individual’s privacy.

Financial - eCommerce companies are the best example of the financial value of data: typically an email address and a credit card number are all that is needed to transact with a customer. Profiling this data, keeping its quality high, and reporting on it over time can help eCommerce businesses understand the average value of a customer and of an accurate email address.

As these examples show, data quality tools can quickly become mission-critical, depending on how heavily your day-to-day operations rely on the data you hold. So, why is data quality important? Because it adds value.

What are the business costs and risks of poor data quality?

Data quality maturity curves are becoming more prevalent, and organizations can quickly ascertain whether they’re reactive or optimized and governed in their approach to data management.

An example of an organization that is immature in its capture and management of data is one that does not use validation fields or uses free-form capture fields on the contact forms of its website, allowing anyone to enter whatever they like.

Bad data should not be taken lightly as it poses significant risks and business costs. Below are several examples:

  • Wasted marketing budget: if your organization is sending physical mail to your customers and marketing leads, but those addresses are out of date or invalid, you’ll be wasting precious marketing dollars and time.
  • Non-compliant data: regulations such as the EU General Data Protection Regulation (GDPR) set a standard (Article 5) for maintaining data quality, including the accuracy and integrity of data. An organization whose data is found to be non-compliant can be fined up to 20 million euros or 4% of annual turnover, whichever is higher.
  • Hindered IT modernization projects: when data moves from source to target system, without correct mapping and data quality tools, old dirty data can wreak havoc on the new system.
  • Poor customer experience: If contact information is of poor quality, you cannot provide customers with a tailored customer experience and serve them via their preferred channel.
  • Fines: In regulated industries such as healthcare and banking, enterprises risk miscalculating key statistics for regulatory reports and getting fined.
  • Unreliable analytics and machine learning: Inaccurate or invalid data will provide inaccurate analytics and unreliable machine learning models.
  • Strategic operational mistakes: building a warehouse at the wrong location, not catching fraud, and producing the wrong alloy are all examples of business decisions made on bad data.

    And yes, you can put a number on data quality.

    Bad data costs companies 10-30% of their revenue and correcting mistakes in data costs $1-10 per record.

    What are the benefits of better data quality?

    There are so many benefits to improving the quality of your information that it is impossible to list them all out, but some of the common ones include:

    • Increased return on investment for marketing activity thanks to improved email and postal deliverability and more reliable targeting
    • Less time spent fixing dirty data. This will save you $1-10 per record.
    • Increased ability to personalize your service or product offerings
    • Improved, faster decision-making
    • Compliance with new and existing regulations and the creation of a consumer-centric data-driven culture

    And many more. Ultimately, your business is unique, and therefore how you benefit from improved DQ is also unique.

    What are must-have features to ensure data quality?

    If you'd like to learn about all the essential capabilities of data quality, you can read the full article, Essential Data Quality Capabilities.

    Data Profiling

    Before you do any data quality checks, it’s important to examine your data at its source to better interpret and understand it. A data profiling tool does this faster and more efficiently than hand-written SQL queries. Profiling helps define what transformations the data needs and what problems to track in the future.
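
To give a sense of what a profile contains, here is a minimal sketch that summarizes one column. It is not how any particular profiling tool works; the statistics chosen and the sample `country` column are assumptions.

```python
from collections import Counter


def profile_column(values):
    """Basic column profile: row count, nulls, distinct values, min, max, most common."""
    non_null = [v for v in values if v is not None and v != ""]
    counts = Counter(non_null)
    return {
        "rows": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(counts),
        "min": min(non_null) if non_null else None,
        "max": max(non_null) if non_null else None,
        "top": counts.most_common(1)[0] if counts else None,
    }


# Hypothetical country-code column with gaps and inconsistent casing.
country = ["CZ", "US", "US", None, "us", "DE", ""]
profile = profile_column(country)
print(profile)
```

Even this small profile points to follow-up work: two missing values to track, and a lowercase "us" that suggests the column needs format standardization.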

    Data cleansing and transformation

    Very often you need to transform data to improve its quality. This includes:

    • Format standardization
    • Parsing: breaking data down into separate attributes (e.g., full name into first name and last name)
    • Data enrichment: bringing in additional data from external sources
    • Data deduplication: removing duplicate records
    • Data masking: obfuscating data for security reasons
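
Several of the transformations listed above can be sketched in a few lines each. The helpers below are naive on purpose (for example, the name parser assumes the first word is the first name), the sample rows are invented, and enrichment is left out because it depends on an external source.

```python
import re


def standardize_phone(raw):
    """Format standardization: keep digits only."""
    return re.sub(r"\D", "", raw)


def parse_full_name(full_name):
    """Naive parsing: first word is the first name, the rest is the last name."""
    first, _, last = full_name.strip().partition(" ")
    return {"first_name": first, "last_name": last.strip()}


def mask_email(email):
    """Data masking: hide the local part except its first character."""
    local, _, domain = email.partition("@")
    return f"{local[:1]}***@{domain}"


def deduplicate(rows, key):
    """Keep the first row seen for each key value."""
    seen, unique = set(), []
    for row in rows:
        if row[key] not in seen:
            seen.add(row[key])
            unique.append(row)
    return unique


# Hypothetical raw rows: the same person entered twice in different formats.
raw = [
    {"name": "Ana  Novak", "phone": "(555) 010-0100", "email": "ana@example.com"},
    {"name": "Ana Novak", "phone": "555.010.0100", "email": "ana@example.com"},
]
cleaned = [
    {**parse_full_name(r["name"]), "phone": standardize_phone(r["phone"]), "email": r["email"]}
    for r in raw
]
cleaned = deduplicate(cleaned, "phone")
print(cleaned, mask_email(cleaned[0]["email"]))
```

The order matters here: the duplicate only becomes detectable after the phone numbers have been standardized.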

    It’s important to note that these processes need to run automatically on any new data before it travels to other systems, reaches data analysts, and is used for business decision-making.

    That being said, it's even better to establish processes that validate and “treat” data before it enters any IT system. This is called a data quality firewall. An example is an algorithm that checks data entered into a web form, such as an email address or birth date, against a required format and alerts the user to fix it. But DQ firewalls can be embedded into complex enterprise applications as well.
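
The web form example can be sketched as a function that returns one message per rejected field, so the caller can alert the user before anything is stored. The field names, the ISO date format, and the messages are assumptions for illustration.

```python
import re
from datetime import date

# Deliberately simple email pattern for illustration.
EMAIL_PATTERN = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def validate_form(form, today):
    """Return a dict of field -> message for every value that should be rejected."""
    errors = {}
    if not EMAIL_PATTERN.fullmatch(form.get("email", "")):
        errors["email"] = "Enter an email address such as name@example.com."
    try:
        born = date.fromisoformat(form.get("birth_date", ""))
        if born > today:
            errors["birth_date"] = "Birth date cannot be in the future."
    except ValueError:
        errors["birth_date"] = "Enter the birth date as YYYY-MM-DD."
    return errors


today = date(2022, 7, 20)
ok = validate_form({"email": "ana@example.com", "birth_date": "1990-04-12"}, today)
bad = validate_form({"email": "ana(at)example.com", "birth_date": "12/04/1990"}, today)
print(ok, bad)
```

An empty result means the submission may pass through; anything else is sent back to the user instead of reaching downstream systems.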

    Monitoring and reporting

    Peter Drucker said it best: “If you can’t measure it, you can’t improve it.” That is as true for data quality as it is for business in general. Tracking changes and improvements to data over time is crucial and is usually done through data quality dashboards.

    First, it shows you whether you are moving in the right direction, i.e., whether the data quality metrics that you have defined are improving or not. Second, monitoring data quality helps catch unexpected influxes of bad data and track it to its source. And third, it helps with tracking compliance with regulatory requirements and more.
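
The first two uses of monitoring, following the direction of a metric and catching a sudden influx of bad data, can be sketched with a short history of scores. The scores, the five-point window, and the 0.05 tolerance are hypothetical values chosen for the example.

```python
from statistics import mean

# Hypothetical daily scores for one metric, e.g. share of valid email addresses.
history = [0.96, 0.97, 0.96, 0.98, 0.97, 0.81]


def trend(scores):
    """Difference between the latest score and the first one."""
    return scores[-1] - scores[0]


def sudden_drop(scores, window=5, tolerance=0.05):
    """True if the latest score falls more than `tolerance` below the
    average of up to `window` preceding scores."""
    if len(scores) < 2:
        return False
    baseline = mean(scores[-(window + 1):-1])
    return baseline - scores[-1] > tolerance


print(trend(history) < 0, sudden_drop(history), sudden_drop(history[:-1]))
```

A dashboard would plot the same history; the point of the check is that the last load gets flagged automatically so it can be traced back to its source.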

    FAQ

    If you want to know more, here are some frequently asked questions about data quality.

    Can the Data Catalog and Data Quality work together?

    Yes! Monitoring your data quality is much more efficient and accessible when integrating it with your data catalog. More specifically, you can automate data quality workflows using the metadata from the data catalog. Here are other ways the data catalog and data quality benefit each other:

    • Automating data quality monitoring
    • Improving data discovery
    • Streamlining on-demand DQ evaluation
    • Simplifying data preparation
    • Helping discover root causes of quality issues
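
One way to picture metadata-driven automation: if the catalog records which business term a column represents, a DQ rule can be attached to the term once and then applied to every column carrying it. The catalog entries, column names, and rule library below are entirely hypothetical and do not reflect any specific product's API.

```python
import re

# Hypothetical catalog metadata: column -> business term assigned in the catalog.
catalog = {
    "crm.customers.contact_email": "email",
    "billing.accounts.mail": "email",
    "crm.customers.birth_year": "year",
    "crm.customers.notes": None,  # no term assigned, so no rule applies
}

# Hypothetical rule library keyed by business term.
rules = {
    "email": lambda v: re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", str(v)) is not None,
    "year": lambda v: isinstance(v, int) and 1900 <= v <= 2100,
}


def rule_for(column):
    """Look up the rule implied by the column's catalog term, if any."""
    return rules.get(catalog.get(column))


def evaluate(column, values):
    """Share of values passing the column's rule, or None if no rule applies."""
    rule = rule_for(column)
    if rule is None or not values:
        return None
    return sum(1 for v in values if rule(v)) / len(values)


print(evaluate("billing.accounts.mail", ["a@b.com", "bad"]))
print(evaluate("crm.customers.notes", ["anything"]))
```

In this arrangement, a newly cataloged column tagged as "email" is evaluated without anyone writing a rule for it.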

    What is a real-world example of bad data quality affecting analytics?

    One of the most common places we find data quality problems is census analysis. Many censuses are taken in both paper and digital formats, leading to quality discrepancies like unreadable inputs and duplicate entries for the same respondent. Most census data undergoes data profiling, standardization, enrichment, matching and consolidation, and relationship discovery before it’s considered fit for analysis.

    How to get started with data quality?

    Data quality management can seem like a bit of a daunting task. In our opinion, the first steps of any data quality improvement are:

    1. Determine your current goals and scope (help with a specific business problem dependent on data or focus on a specific critical data element).
    2. Profile your data.
    3. Fix the most urgent issues as soon as possible.
    4. Come up with metrics and methods for measuring data quality.
    5. Monitor data quality problems.
    6. Scale your program to other teams, departments, source systems, and critical data elements.

    Following this process will ensure you find the relevant strategy for your organization and won’t embark on a task that is overwhelming or inadequate.

    How important is data quality for successful AI implementations?

    Data quality is essential for successful AI implementations. Spending too much time preparing data is one of the main reasons AI is so expensive and time-consuming. You can ensure more successful AI implementations if you:

    • Profile your data
    • Perform DQ evaluations
    • Have regular DQ monitoring

    Otherwise, you’ll be building machine learning models on the wrong data sets, inevitably leading to errors or more work for your AI architects.

    Where is Data Quality headed in the future?

    Data quality is undoubtedly here to stay, but what kind of innovations can we expect? Well, you can expect the following improvements in the next few years:

    • Further automation will enable greater adoption of new architectures like the data fabric and data mesh.
    • The term will grow to encompass other aspects of data management, like reference and master data management.
    • Data will be deliverable to any user at the company, regardless of skill set.
    • Data quality tools will become single solutions instead of fragmented features that can conflict with one another.
    • More systems than people will consume data.
    • Much more!

    If you’d like to learn more about the future of data quality and how we got here, see The Evolution and Future of Data Quality.

    Improve DQ with Ataccama

    An important first step is to profile your data to understand just what state it is in. There are several data management tools that you can use to do this, many of which offer free versions.

    Related articles

    • 2021 State of Data Quality (ebook)
    • The Cost of Poor Data Quality (blog)
    • Essential Data Quality Capabilities (blog)
    • How to Get Started with Data Quality: The 3 Steps You Should Take First (blog)
    • The Evolution and Future of Data Quality (blog)