
What is a Data Fabric and Why You Should Care

11 minute read

Gartner featured data fabric on both of its latest lists of top trends in the data community:

  • Top Technology Trends
  • Top 10 Data & Analytics Trends for 2021

It has also been moving up Gartner’s hype cycle for emerging technologies. Allied Market Research expects the data fabric market to more than triple in size between now and 2026, citing a “rise in need for business agility and data accessibility” as one of the top impacting factors.

So what’s all the hype about?

What is a data fabric in simple words?

A data fabric is a data management solution design that connects all data sources and data management components to provide frictionless access to enterprise data to all consumers.

Within the data fabric, you’ll find all the fundamental functions of any data management framework, such as data quality tools and a data catalog. However, they are stitched together by metadata, creating synergies that yield a user-friendly, largely autonomous interface to data across the enterprise.

Data fabric definition

Since data fabric is a bit of a complex concept, providing a more formal definition might go a long way towards understanding it in greater detail.

First, let’s look at two of our favorite formal definitions from other resources.

"A data fabric is an emerging data management and data integration design concept for attaining flexible, reusable and augmented data integration pipelines, services and semantics, in support of various operational and analytics use cases delivered across multiple deployment and orchestration platforms."

From Gartner’s Demystifying the Data Fabric

"Conceptually, a big data fabric is essentially a metadata-driven way of connecting a disparate collection of data tools that address key pain points in big data projects in a cohesive and self-service manner."

From Data Mesh Vs. Data Fabric: Understanding the Differences by Alex Woodie

A data fabric connects different sources and types of data together with methods for accessing them. It is an integrated, semi-autonomous layer that spans all of your data platforms to perform quality checks, map data, run continuous analysis, and handle several other processes. All of this is driven by metadata, which the fabric uses to recognize patterns, make autonomous decisions, and construct data flows.

What makes a data fabric a data fabric is how these components interact with each other and exchange metadata, which is a primary driver of automation.

A data fabric can also analyze how your organization uses and accesses data and streamline processes for future inquiries, even predicting what a user wants to do before they enter a request. This learning mechanism also helps the fabric surface data you might not have been aware of, or that was of too low quality to use, and present it to users ready for business exploration.

Why do we need a data fabric?

Now that we understand the concept of a data fabric, you’re probably wondering why it’s necessary. Organizing your data systems in this way allows for several advantages and benefits that having separate data management systems won’t provide you. Let’s look at all the reasons why you need a data fabric.

Manual and lengthy processes to get data

One of the best features of data fabric is that it delivers all of your data to you on a silver platter. Combining the capabilities of a data catalog, data integration, and data profiling creates an easy mechanism to find and access high-quality data.

Without it, you have to dig through your systems to find data manually. Once you find it, you can’t guarantee that it’s high quality, or even the exact data you’re looking for, without several extra steps and plenty of legwork, such as writing code to locate the data or profiling the dataset before you can use it.

Catalogs are great but don’t deliver data

If your company already has a data catalog, you might think you don’t need a data fabric. Data catalogs are connected to your data sources and discover relevant metadata from them. However, while data catalogs can deliver that metadata in an easily consumable way, they cannot deliver data the way a fabric can.

That being said, data catalogs are an important part of a data fabric architecture. Read on to learn more.

More and more data sources

As companies expand their data collection networks and tap into more and more data sources, the job of integrating data and managing metadata grows exponentially more complicated. Eventually, companies realize these tasks are impossible to perform manually.

Data fabric components

Since a data fabric is a design concept, learning about its components might help you better understand the framework as a whole. Here are the six components of the formalized data fabric architecture.

Data catalog

The data catalog connects to all of the important data sources in your organization and captures metadata from them. It’s arguably the most critical part of the data fabric because metadata powers much of the automation that the data fabric delivers.

It’s important to note that you need a modern, self-driving data catalog that automates metadata discovery and ingestion. In other words, when you connect a new data source to your data catalog, the AI will reuse the knowledge it has about the existing data sources to infer metadata about the new source. For example, it will suggest business terms to label technical attributes.
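As an illustration only (not Ataccama’s actual algorithm), business-term suggestion can be approximated by fuzzy-matching a new source’s attribute names against aliases already linked to glossary terms. The glossary, aliases, and threshold below are all invented for the sketch:

```python
from difflib import SequenceMatcher

# Hypothetical business glossary: term -> technical aliases seen in prior sources.
GLOSSARY = {
    "Customer Name": ["cust_name", "customer_nm", "client_name"],
    "Email Address": ["email", "email_addr", "e_mail"],
    "Date of Birth": ["dob", "birth_date", "date_of_birth"],
}

def suggest_terms(attribute: str, threshold: float = 0.6) -> list[str]:
    """Suggest glossary terms for a technical attribute by name similarity."""
    suggestions = []
    for term, aliases in GLOSSARY.items():
        # Score the attribute against each known alias; keep the best match.
        score = max(
            SequenceMatcher(None, attribute.lower(), a).ratio() for a in aliases
        )
        if score >= threshold:
            suggestions.append((score, term))
    # Best matches first.
    return [term for _, term in sorted(suggestions, reverse=True)]

print(suggest_terms("cust_nm"))    # → ['Customer Name']
print(suggest_terms("birthdate"))  # → ['Date of Birth']
```

A production catalog would combine many more signals (data profiles, value patterns, lineage), but the principle is the same: knowledge from existing sources drives suggestions for new ones.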

Knowledge graph

The knowledge graph stores all of your metadata and the relationships between metadata entities, not just metadata about data sources (which is stored in the data catalog). Users can take advantage of this to better understand data and metadata. It is also used by the recommendation engine (more on that below). The knowledge graph allows both users and machines (i.e., the recommendation engine) to consistently explore relationships between all metadata entities, regardless of their source.
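A minimal sketch of the idea (assuming nothing about Ataccama’s implementation): a knowledge graph can be modeled as typed edges between metadata entities, which both users and machines can traverse. All entity names and edge types below are made up:

```python
from collections import defaultdict

class KnowledgeGraph:
    """Toy knowledge graph: subject -> list of (predicate, object) edges."""

    def __init__(self):
        self.edges = defaultdict(list)

    def add(self, subject, predicate, obj):
        self.edges[subject].append((predicate, obj))

    def related(self, subject, predicate=None):
        """All objects linked to `subject`, optionally filtered by edge type."""
        return [o for p, o in self.edges[subject] if predicate in (None, p)]

kg = KnowledgeGraph()
kg.add("crm.customers.email", "labeled_as", "Email Address")     # catalog metadata
kg.add("crm.customers.email", "checked_by", "rule:valid_email")  # DQ metadata
kg.add("report:churn", "reads_from", "crm.customers.email")      # lineage metadata

print(kg.related("crm.customers.email"))
# → ['Email Address', 'rule:valid_email']
```

The point is that one structure holds catalog, quality, and lineage metadata side by side, so any consumer can follow the connections between them.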

Metadata activation

Metadata activation means using existing metadata and inferring new metadata from it. Some examples are profiling data, generating statistics, evaluating data quality, and performing data classification. Activated metadata is saved back to the knowledge graph, further extending previously captured information.
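For illustration, a toy version of activation might profile a column’s values, infer a classification, and return the new metadata; the regex-based email check and the profile fields are assumptions for the sketch, not the product’s logic:

```python
import re

def activate(values):
    """Profile a column and infer new metadata from its values (toy sketch)."""
    non_null = [v for v in values if v is not None]
    profile = {
        "completeness": len(non_null) / len(values),
        "distinct": len(set(non_null)),
    }
    # Naive inferred classification: does every non-null value look like an email?
    if non_null and all(
        re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", str(v)) for v in non_null
    ):
        profile["classification"] = "email"
    return profile  # in a real fabric, written back to the knowledge graph

print(activate(["a@x.com", "b@y.org", None]))
# e.g. completeness ≈ 0.67, distinct 2, classification 'email'
```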

Recommendation engine

The recommendation engine uses all the metadata from the knowledge graph (including the activated metadata, technical metadata, catalog metadata, etc.) to infer more metadata or recommend how to process your data.

The recommendation engine performs three main types of tasks:

  • Delivery optimization: it will suggest delivery models, optimize scheduling, and suggest data transformations.
  • Metadata inference: it will find new relationships, perform data classification, and apply data quality rules — all as suggestions for business users.
  • Anomaly detection: it will detect anomalies in data quality, data structure, or data delivery and alert stakeholders.
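As a hedged sketch of the third task type, anomaly detection over data quality scores can be as simple as flagging a value that deviates from its history by several standard deviations; the threshold and scores below are illustrative only:

```python
import statistics

def detect_anomaly(history, latest, threshold=3.0):
    """Flag `latest` if it deviates from history by more than `threshold` sigmas."""
    mean = statistics.mean(history)
    stdev = statistics.pstdev(history)
    if stdev == 0:
        return latest != mean
    return abs(latest - mean) / stdev > threshold

# Daily data quality scores for one table; a sudden drop should alert stewards.
history = [0.97, 0.96, 0.98, 0.97, 0.96, 0.97]
print(detect_anomaly(history, 0.95))  # → False (within normal variation)
print(detect_anomaly(history, 0.70))  # → True  (alert stakeholders)
```

Real engines use richer models (seasonality, learned baselines), but the alert-on-deviation pattern is the core idea.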

Data preparation and data delivery

The data fabric enables users and machines to consume data and metadata. Users can find and use data assets in the data catalog and transform (prepare) data in a self-service way. Machines can request and receive data via APIs.

The data fabric understands the structure of data (through metadata in the knowledge graph) and the intent of the data consumer. This enables the fabric to apply or suggest different data preparation or delivery types based on all the metadata and intent available. For example, it might suggest denormalized data for a report but normalized data for MDM.

The fabric should also simplify data delivery (or providing) by pre-configuring data output endpoints:

  • Automatically generating APIs
  • Letting users re-use existing data providing pipelines
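A hypothetical sketch of intent-based delivery: the fabric could map a consumer’s declared intent to a preparation shape and transport, along the lines of the report-vs-MDM example above. The style table below is invented:

```python
# Made-up mapping from consumer intent to a delivery plan.
DELIVERY_STYLES = {
    "reporting": {"shape": "denormalized", "transport": "file_export"},
    "mdm":       {"shape": "normalized",   "transport": "api"},
    "ml":        {"shape": "feature_table", "transport": "api"},
}

def plan_delivery(intent: str, default: str = "reporting") -> dict:
    """Suggest a delivery plan for a data request based on the consumer's intent."""
    return DELIVERY_STYLES.get(intent, DELIVERY_STYLES[default])

print(plan_delivery("mdm"))  # → {'shape': 'normalized', 'transport': 'api'}
```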

Orchestration and data ops

Data fabric architecture requires components to optimize data delivery. This means having robust data processing engines close to data sources that can deliver data in the fastest way possible. One other requirement is adherence to Data Ops principles, such as reusability of data pipelines.

How does the data fabric work? The components in action

The section above describing the components of the data fabric might have already given you a good idea about how it works. However, you can have all these components and still not have a data fabric. You might find the illustration below helpful in understanding how these components interoperate.

[Illustration: How the data fabric works]

Something better than a data fabric?

Data Quality Fabric embeds data quality services at all stages of the data life cycle.

Read the primer

What are the benefits of the data fabric?

Given these components and the unprecedented, metadata-powered automation they enable, the data fabric provides businesses with several benefits that make it one of the most appealing design concepts for a data system.

Faster and easier access to data

As we already know, data scientists and other data consumers spend an alarming portion of their time gathering and preparing data for analysis. A data fabric enables self-service data consumption for anyone at your company who needs it, regardless of their skill set. It creates a single point of access connected to all of the company’s source systems, so users don’t need to hunt down the data they need and can more easily understand the data and its origins.

Simplified data privacy and data protection

While a data fabric provides faster access to your data, that speed comes with new risks: it can expose you to data leaks and unnecessary exposure of PII. However, the self-maintaining metadata collected by your fabric can help prevent these issues through automatic policy mapping and enforcement, allowing you to implement protocols and policies that protect your data. It can even mask or redact data, or deny access to certain data attributes, rows, or even some metadata, ensuring only the right people have access.
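One way to picture metadata-driven protection (a toy sketch with made-up PII tags and masking styles, not the product’s policy engine):

```python
import re

# Hypothetical catalog tags: attribute name -> masking style.
PII_TAGS = {"email": "partial", "ssn": "full"}

def mask(record: dict, tags: dict = PII_TAGS) -> dict:
    """Redact tagged attributes before a record is delivered to a consumer."""
    out = {}
    for attr, value in record.items():
        style = tags.get(attr)
        if style == "full":
            out[attr] = "***"                               # hide entirely
        elif style == "partial":
            out[attr] = re.sub(r"^[^@]+", "***", value)     # keep the domain
        else:
            out[attr] = value                               # not tagged as PII
    return out

print(mask({"name": "Ada", "email": "ada@example.com", "ssn": "123-45-6789"}))
# → {'name': 'Ada', 'email': '***@example.com', 'ssn': '***'}
```

Because the tags live in shared metadata, every delivery path can enforce the same policy without per-pipeline configuration.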

Massive maintenance and configuration time savings

By automating metadata management, data integration, and other processes, the data fabric dramatically reduces the time needed to configure and maintain your data platform. Think about how much time self-maintaining metadata and reusable data pipelines will save for your data engineers, data scientists, data stewards, and other SMEs.

How to implement a data fabric?

Data fabric design includes many different components, and you might be wondering where you should start. As you might have noticed above, metadata is the main resource that enables the automation of the data fabric. With that in mind, we suggest that you start with metadata management. If you don’t have a metadata management solution, set one up. Modern data catalogs give you a user-friendly way to implement it.

Some next steps:

  1. Implement a data quality solution that connects to the knowledge graph and uses its metadata to create data quality jobs.
  2. If necessary, implement a data integration component.
  3. Implement a self-learning recommendation engine that monitors the whole data fabric.
  4. Implement a metadata-driven data preparation and provisioning solution.
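Step 1 can be sketched as mapping catalog classifications to quality checks; the classifications and check functions below are hypothetical, chosen only to show the metadata-to-job flow:

```python
# Made-up registry: classification (from the knowledge graph) -> check function.
CHECKS_BY_CLASSIFICATION = {
    "email": lambda v: "@" in v,
    "not_null": lambda v: v is not None,
}

def build_dq_job(columns: dict) -> dict:
    """Assemble a DQ job by pairing each classified column with its check."""
    return {
        col: CHECKS_BY_CLASSIFICATION[cls]
        for col, cls in columns.items()
        if cls in CHECKS_BY_CLASSIFICATION
    }

# Column classifications would come from the catalog, not be typed by hand.
job = build_dq_job({"customer_email": "email", "customer_id": "not_null"})
print(sorted(job))  # → ['customer_email', 'customer_id']
```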

Adding more details is beyond the scope of this article. Each organization is different, so we suggest you talk to an expert.


Frequently asked questions

The section below should clarify any confusion about the differences between Data Fabric and other essential terms in data management.

Data fabric vs data virtualization

Data virtualization is one of the styles in which data fabric can deliver data. Data fabric goes beyond data virtualization in its ability to integrate and provide data in other styles and its ability to change and construct data views based on new metadata with minimal configuration.

For example, if a new data source is cataloged, its metadata will be automatically considered when providing data. Data fabrics can also use the information they learned from previous sources to accommodate new ones better as they’re introduced into the system.

In short, data virtualization provides a singular view of a company’s data sources, aiding in their analysis. Data fabrics are capable of this and much more, especially in terms of automating these processes, meaning virtualization can be seen as one supported data delivery mode of a data fabric rather than an alternative framework.

Data fabric vs data integration

Unlike data virtualization, data integration actually combines data residing in different sources to provide users with a unified view of their datasets. Since data fabric connects multiple data sources into a single metadata-driven data providing layer, data integration is one of the main features of the data fabric.

Data fabric automates data integration by dynamically generating data integration pipelines whenever they are necessary to deliver the correct data to a requesting data consumer.
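A toy illustration of that dynamic generation, with an invented catalog structure: the pipeline’s steps are assembled on demand from the request and the source’s metadata:

```python
def build_pipeline(request: dict, catalog: dict) -> list:
    """Assemble extract -> validate -> convert -> deliver steps for a request."""
    source = catalog[request["dataset"]]              # resolved via the catalog
    steps = [("extract", source["location"])]
    if source.get("quality_rules"):                   # reuse DQ metadata
        steps.append(("validate", source["quality_rules"]))
    if request["format"] != source["format"]:         # convert only when needed
        steps.append(("convert", request["format"]))
    steps.append(("deliver", request["consumer"]))
    return steps

catalog = {
    "customers": {
        "location": "s3://lake/customers",
        "format": "parquet",
        "quality_rules": ["valid_email"],
    }
}
print(build_pipeline(
    {"dataset": "customers", "format": "csv", "consumer": "crm_api"}, catalog
))
# → [('extract', 's3://lake/customers'), ('validate', ['valid_email']),
#    ('convert', 'csv'), ('deliver', 'crm_api')]
```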

Data fabric vs data lake

A data lake is just a place to store data. A data fabric can access data in these data lakes and connect it to the rest of your data management and data analytics systems.

Data fabric vs data mesh

Data mesh is a fundamentally different approach to connecting data systems within one company. While data fabric builds a single layer of data across all your systems, a data mesh involves distributed groups or teams handling different parts of the data framework and collaborating through common data governance principles.

There are also ways to combine the two approaches, connecting systems through a fabric while still distributing responsibilities as in the mesh.

Build your data fabric with Ataccama

We have built data fabrics for clients like T-Mobile, Canadian Tire, Fiserv, and others.

Contact us

Related articles

  • Data Quality Fabric Primer
  • The Evolution and Future of Data Quality
  • What Is Data Quality and Why Is It Important?
