Augmented Data Lineage: What It Is, and Why It Matters

5-minute read

In recent years, data lineage has been a highly sought-after capability for data management and data governance teams. By now, it has become a critical feature of data catalogs and metadata management solutions, offering a wide range of benefits and applications. These include regulatory compliance, impact analysis, and a faster understanding of the enterprise data landscape.

Typically, data lineage is associated with technical roles, such as ETL developers and data engineers. However, when data lineage is enriched with business metadata, it can become a particularly useful and practical capability for business users. 

In this post, we’ll introduce the concept of augmented data lineage as a tool for business users. We will explore how business and analytical roles within enterprises can use it to find data and perform root cause analyses faster while avoiding corporate red tape.

What Is Augmented Data Lineage?

Augmented data lineage is “regular” data lineage enriched with information from a data catalog: metadata such as real-time data quality, business terms & categories, and anomalies detected in data loads.

Enhanced with this information, data lineage can speed up the process of locating the right data or support analytical activities, such as root cause analysis or data quality analysis. The visual presentation of augmented data lineage alone makes a big difference in a user's ability to draw conclusions, as opposed to just viewing a list of data sets on the catalog's search results page.
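To make the idea concrete, here is a minimal sketch of how lineage edges and catalog metadata could sit side by side in one structure. It is an illustration only, not how any particular catalog stores lineage: the class names, field names, data set names, and scores are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Asset:
    """A data set in the lineage graph, annotated with catalog metadata."""
    name: str
    business_terms: set = field(default_factory=set)
    dq_score: Optional[float] = None  # assumed: share of valid values, 0.0 to 1.0
    has_anomaly: bool = False         # assumed: flagged on the latest data load


@dataclass
class AugmentedLineage:
    """Plain lineage (edges) plus per-asset catalog metadata (assets)."""
    assets: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)  # (source, target) pairs

    def add_asset(self, asset: Asset) -> None:
        self.assets[asset.name] = asset

    def add_flow(self, source: str, target: str) -> None:
        self.edges.append((source, target))

    def downstream(self, name: str) -> list:
        """Assets that are loaded directly from `name`."""
        return [self.assets[t] for s, t in self.edges if s == name]


# Hypothetical data sets and scores, for illustration only.
lineage = AugmentedLineage()
lineage.add_asset(Asset("crm.customers", {"PII", "birthdate"}, dq_score=0.90))
lineage.add_asset(Asset("dwh.customers_clean", {"PII", "birthdate"}, dq_score=0.99))
lineage.add_flow("crm.customers", "dwh.customers_clean")

for asset in lineage.downstream("crm.customers"):
    print(asset.name, sorted(asset.business_terms), asset.dq_score)
```

The point of the sketch is that each node carries its own business terms and quality indicators, so a lineage view can show them without a second lookup in the catalog.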

Data lineage enhanced with business terms

This enriched data lineage can help answer many questions that are typically addressed with a data catalog search query or by consulting standard data lineage: 

  • Is this the best data I can use for my data science project or analytic assignment?
  • Has this report been generated from valid and timely data?
  • Why does a metric in a report contain an unexpectedly large or small value? 
  • Which data sets contain PII data, and in which systems do they originate? 

Let’s examine how these questions can be answered by using augmented data lineage.

Finding Data with Lineage

Imagine a scenario where you need to complete an analysis involving customer birthdates in order to predict purchases of a product by age bucket. You have a data catalog in place, so you search for “customer birthdates” and find a seemingly relevant data set. You get an overview of the whole data set, including data quality assessments such as validity, consistency, and completeness. However, looking at the frequency analysis, a data quality issue catches your eye: around 10% of the values are empty or obviously invalid (“NULL” and “N/A” values).
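As a rough illustration of what such a frequency analysis measures, the sketch below counts values in a column and computes the share that is empty or a placeholder. The sample column is made up, and the list of placeholder markers is an assumption; a real profiling tool would apply its own validity rules.

```python
from collections import Counter

# Placeholder strings treated as invalid; this list is an assumption.
INVALID_MARKERS = {"", "NULL", "N/A"}


def invalid_share(values):
    """Return the fraction of values that are missing or placeholder text."""
    if not values:
        return 0.0
    invalid = sum(
        1 for v in values
        if v is None or str(v).strip().upper() in INVALID_MARKERS
    )
    return invalid / len(values)


# Made-up sample column: 18 real dates and 2 placeholders.
birthdates = ["1984-03-12"] * 18 + ["NULL", "N/A"]

frequencies = Counter(birthdates)
print(frequencies.most_common(3))
print(f"invalid share: {invalid_share(birthdates):.0%}")
```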

Data asset detail in the data catalog

What do you do now? Go back to the search results and look at another data set with birthdates? How many times will you have to go back and forth like that? Suddenly, what seemed like a straightforward step in the process has become a time-consuming and frustrating endeavor. This is where augmented data lineage can be a game changer.

In the augmented business lineage view, you will immediately see all previous and subsequent transformations of the data set. If someone has already cleansed or prepared a better version of this data set, enriched with data from a different source, it will pop up on your screen in an easy-to-read and easy-to-trace format.

Thanks to business terms and DQ indicators, you will see to what extent the related assets are relevant and usable. Then, you can make a more informed decision about whether to use the current data set or a transformed data set with better quality, or even combinations of data sets from different points in the transformation lifecycle.
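The decision described above can be sketched as a small graph walk: collect everything upstream and downstream of the data set you found, keep the assets that carry the business term you care about, and rank them by data quality. All data set names, terms, and scores below are hypothetical, and the single `dq` number stands in for whatever quality indicator a catalog actually exposes.

```python
from collections import deque

# Hypothetical lineage: edges point from a source data set to its derivative.
EDGES = [
    ("crm.customers", "staging.customers"),
    ("staging.customers", "dwh.customers_clean"),
    ("loyalty.members", "dwh.customers_clean"),
]
# Hypothetical catalog metadata per data set.
METADATA = {
    "crm.customers": {"terms": {"birthdate"}, "dq": 0.90},
    "staging.customers": {"terms": {"birthdate"}, "dq": 0.92},
    "dwh.customers_clean": {"terms": {"birthdate"}, "dq": 0.99},
    "loyalty.members": {"terms": {"email"}, "dq": 0.95},
}


def reachable(start, edges):
    """Breadth-first walk along directed edges, excluding the start node."""
    adjacency = {}
    for src, dst in edges:
        adjacency.setdefault(src, []).append(dst)
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, []):
            if nxt not in seen and nxt != start:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def best_candidates(start, term):
    """Rank lineage relatives of `start` that carry `term` by DQ score."""
    downstream = reachable(start, EDGES)
    upstream = reachable(start, [(dst, src) for src, dst in EDGES])
    candidates = [
        name for name in downstream | upstream
        if term in METADATA[name]["terms"]
    ]
    return sorted(candidates, key=lambda n: METADATA[n]["dq"], reverse=True)


print(best_candidates("staging.customers", "birthdate"))
```

Walking upstream and downstream separately keeps sibling sources (here, `loyalty.members`) out of the result unless they actually lie on the path of the data set you started from.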

Data lineage enhanced with business terms and data quality information

However, what if the lineage for this particular data set does not reveal any other promising data sets? Read on to learn about an alternative lineage view.

Broadening the Context with Business Term Lineage

In the example above, the user is interested in very specific data: customer birthdates. Such columns will have a proper business term—something like “birthdate” or “date of birth”—assigned. With Ataccama, you can see the lineage for that particular glossary term and find the best quality assets in every system. The example below shows lineage for PII data.

Business term lineage for the ‘PII’ term

Essentially, you get a map of all data sets that carry the chosen term (in the birthdate scenario, every data set that contains birthdates), tracing their origin all the way to the source systems. This high-level view provides the full context around specific data and enables users such as analysts and data scientists to pick and choose from the relevant data sets.
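One way to picture business term lineage is as a filter over the full lineage graph: keep only the data sets tagged with a term, keep the flows between them, and report the systems where the term first appears. The sketch below does this over a tiny, hypothetical catalog; the structure, names, and tags are assumptions for illustration and do not reflect any product's internal model.

```python
# Hypothetical catalog: each data set has a system, terms, and upstream parents.
CATALOG = {
    "crm.customers": {
        "system": "CRM", "terms": {"PII", "birthdate"}, "parents": [],
    },
    "web.orders": {
        "system": "Webshop", "terms": {"order"}, "parents": [],
    },
    "dwh.customers_clean": {
        "system": "DWH", "terms": {"PII", "birthdate"},
        "parents": ["crm.customers"],
    },
    "dwh.sales_report": {
        "system": "DWH", "terms": {"order"},
        "parents": ["web.orders", "dwh.customers_clean"],
    },
}


def tagged_with(term):
    """Names of all data sets that carry `term`."""
    return {name for name, meta in CATALOG.items() if term in meta["terms"]}


def term_lineage(term):
    """Flows (parent, child) where both data sets carry `term`."""
    tagged = tagged_with(term)
    return sorted(
        (parent, name)
        for name in tagged
        for parent in CATALOG[name]["parents"]
        if parent in tagged
    )


def origin_systems(term):
    """Systems holding tagged data sets that have no tagged parent."""
    tagged = tagged_with(term)
    return sorted({
        CATALOG[name]["system"] for name in tagged
        if not any(parent in tagged for parent in CATALOG[name]["parents"])
    })


print(term_lineage("PII"))
print(origin_systems("PII"))
```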

Rapid Root Cause Analysis with Anomalies

One of the most well-known benefits of data lineage is that it allows users to perform root cause analyses. The story usually goes like this: the VP of Sales (or someone in a similar role) thinks that the numbers in the new quarterly report do not make sense. Your task is to find out why. Some would say that all you need to complete this task is access to data lineage, at which point you can immediately identify the data sets that caused the problem.

While it's true that having accurate lineage decreases the time needed to diagnose a problem, an even faster method is to use AI to automatically detect anomalies whenever new data is loaded.

Data lineage enhanced with anomaly detection

Anomaly detection works by comparing the previous and current versions of a data set and flagging notable changes in data characteristics, such as value frequency distribution, minimum and maximum values, unexpected record counts, or inconsistent data formatting. As a result, anyone investigating an issue can immediately see where it first appeared, which SQL procedure or ETL transformation caused it, and which data sets were affected. Based on that knowledge, they can confirm that the data set used to produce a report was corrupted and take measures to prevent the issue from recurring.
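The comparison of consecutive loads can be illustrated with a deliberately simple stand-in. The sketch below profiles two loads of a numeric column and flags any metric that moved by more than a fixed relative tolerance, then walks a pipeline from source to report to find where the problem first shows up. This is a fixed-threshold rule, not the machine learning approach the article refers to, and every name, number, and the `tolerance` default is an assumption.

```python
def profile(values):
    """Summarise one load of a numeric column."""
    return {"count": len(values), "min": min(values), "max": max(values)}


def detect_anomalies(previous, current, tolerance=0.5):
    """Flag profile metrics whose relative change exceeds `tolerance`."""
    findings = []
    for metric, old in previous.items():
        new = current[metric]
        baseline = abs(old) or 1  # avoid division by zero
        if abs(new - old) / baseline > tolerance:
            findings.append(f"{metric}: {old} -> {new}")
    return findings


def first_anomalous(pipeline, loads):
    """Walk the pipeline in order; return the first flagged data set."""
    for name in pipeline:
        before, after = loads[name]
        if detect_anomalies(profile(before), profile(after)):
            return name
    return None


# Made-up order amounts for two consecutive loads of each data set.
loads = {
    "crm.orders": ([100, 120, 110], [105, 115, 112]),
    "staging.orders": ([100, 120, 110], [105, 115, 11200]),  # bad transformation
    "dwh.sales": ([330], [11420]),
}
# Hypothetical pipeline order, from source system to reporting layer.
pipeline = ["crm.orders", "staging.orders", "dwh.sales"]

print(first_anomalous(pipeline, loads))
```

Because the source load looks normal and the staging load does not, the walk points at the transformation between the two, which is the kind of shortcut anomaly markers on a lineage view provide.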

A key feature of this anomaly detection is that it runs automatically and needs no configuration, because it is powered by machine learning.

Conclusion

Augmented data lineage is technical data lineage enriched with business metadata, such as business terms, data quality, and AI-detected anomalies. With its help, data scientists, data analysts, data stewards, and other users can find the data they need more quickly, perform root cause analyses when data quality declines, or analyze how data quality changes from data sources to data consumption points (data lakes or data warehouses). In these ways, augmented data lineage extends the utility of data lineage to a wider circle of users, giving them an alternative, and often faster, way of solving known problems.

© Ataccama 2022