What is Master Data Management and Why is it Important?

The following is an introductory guide to Master Data Management (MDM), designed to explain a number of terms, concepts, and MDM best practices.

Read on to learn what sets MDM apart from other data management disciplines, and why it’s uniquely suited to improving the quality and consistency of your business data.

But first, let’s answer the following question:

What is master data?

Master data is the most valuable operational data a business holds: the core records that its day-to-day processes depend on.

Broadly speaking, we can put master data into three main categories:

  • People: customers, employees, vendors, suppliers, patients, citizens
  • Things: accounts, products, assets
  • Locations: addresses, routings

Their sub-categories are called domains, and each is defined by its own set of attributes or unique characteristics. Here are some examples:

  • Customer (both B2B and B2C): first name; last name; address; postal code; buying preferences; service-level agreements
  • Product: product code; color; size; design; storage (SKU)
  • Location: addresses; type of location (e.g., consumer, warehouse, retail location); logistic coordinates (geocoding)

We can all agree that having accurate and available information about customers, products, and locations is essential for efficient operations in many departments throughout the enterprise: sales, marketing, finance, back office, engineering, etc.

Now that we’ve defined master data and its importance, let’s take a look at what might go wrong with it and why master data management is so important.

Why master data management is important

Even within a single application or database, you’re likely to find exact duplicates of records. Near-duplicate master data is more confusing still. For example, two customer records may have matching names but different addresses, or the same address stored in different formats.

A given U.S. cell number, for instance, could be expressed as 211-232-1221, or without the dashes as 2112321221. Even though the two numbers are exactly the same, a given system might misread one of them because it doesn’t recognize the format.
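To make the format problem concrete, here is a minimal Python sketch that reduces both spellings of the number to one canonical form. The function name and the rule "keep ten digits, drop a leading country code 1" are assumptions made for illustration, not part of any particular MDM product.

```python
import re
from typing import Optional


def normalize_us_phone(raw: str) -> Optional[str]:
    """Return the 10 digits of a U.S. number, or None if it does not look like one."""
    digits = re.sub(r"\D", "", raw)  # drop dashes, spaces, parentheses, dots
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # strip a leading country code
    return digits if len(digits) == 10 else None


# Both spellings from the text reduce to the same canonical value.
same = normalize_us_phone("211-232-1221") == normalize_us_phone("2112321221")
print(same)
```

Once every system stores or compares the canonical form, the two spellings no longer look like two different phone numbers.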

But what happens when a particular customer is created, recreated, and modified in multiple unsynchronized systems, which is very often the case?

How bad can it get?

Consider the missing data, inconsistencies in format, and misspellings that are possible when multiple, siloed systems create their own unique versions of one customer.

In this example, four records of Jane Smith exist in three systems. From a human perspective, we can judge there’s probably only one customer named Jane Smith born on October 12, 1987.

Or could the other record in the CRM system represent a John Smith born on December 10, 1987? We can’t be sure because one of the records has just one letter for the first name, the other record is missing the phone number, both are missing an ID card number, and their emails don’t match. This could still be Jane Smith with a new email address.

The cost of unmanaged master data

From a business perspective, this classic and systemic data management problem can seriously impact the customer’s relationship with your company, creating:

  • Inaccurate sales reporting about Jane Smith whose customer record is duplicated in different systems.
  • Inadequate customer service provided to Jane Smith because support agents might have wrong and incomplete contact and product information.
  • Increased marketing costs when sales materials are repeatedly and mistakenly sent to Jane Smith.
  • Missed opportunities to upsell Jane Smith because her buying history is siloed and not linked to her customer record.

To summarize:

  1. Master data is created and stored in different systems.
  2. Mistakes and duplicates occur in these systems.
  3. By failing to address inconsistent and inaccurate system versions of customer, product, and location data, companies will continuously create unreliable and flawed representations of key business information.

We can automate a data quality solution to fix issues in each individual system. We can even reconcile the format issues. But a siloed approach to data quality does nothing to consolidate or align the multiple and misleading versions of Jane Smith.

How master data management works

Having established the typical problems that plague master data, let’s examine some core MDM practices designed to create a single representation, or golden record, of Jane Smith.

  1. The MDM data model lays the foundation for creating the golden record. The model is designed to standardize data input (or import) from all participating enterprise systems. It imposes structural rules and formats that enable MDM to precisely map and interchange data between external source systems and the MDM solution.
  2. Data standardization and cleansing: Once data from multiple source systems has been consolidated within the MDM database or hub, MDM reconciles formatting inconsistencies, corrects typos, and standardizes the data in preparation for matching.
  3. Matching determines whether records from individual systems represent the same business entity, or customer, such as Jane Smith. Rule-based matching works by evaluating attribute-based conditions. For example, a logical condition might be “if the first name, last name, and social security number are the same, then this is the same customer.” Multiple rules like this are prepared and evaluated in sequence until one of them is satisfied.
  4. Merging helps finalize the creation of a golden record: a consolidated view of Jane Smith across all systems. Like matching, merging is a rule-based process that works by taking the best pieces of information from each system. For example, ERP might be designated as the best source for addresses. Conversely, another rule could be “always take the phone number from CRM” (because that’s where the up-to-date contact information is stored). In some cases (and domains), merging is not required.
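The standardize, match, and merge steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any specific MDM tool implements them: the two sample records, the field names, the second matching rule, and the helper functions are all hypothetical. Only the "name plus social security number" rule and the "address from ERP, phone from CRM" preferences come from the examples in the list.

```python
import re

# Hypothetical source records; field names and values are illustrative only.
records = [
    {"system": "CRM", "first": "Jane", "last": "Smith", "ssn": "123-45-6789",
     "phone": "211-232-1221", "address": "12 Oak St"},
    {"system": "ERP", "first": "JANE ", "last": "smith", "ssn": "123456789",
     "phone": None, "address": "12 Oak Street, Springfield"},
]


def standardize(rec):
    """Step 2: trim and lowercase names, keep digits only in ssn and phone."""
    out = dict(rec)
    out["first"] = rec["first"].strip().lower()
    out["last"] = rec["last"].strip().lower()
    for field in ("ssn", "phone"):
        out[field] = re.sub(r"\D", "", rec[field]) if rec[field] else None
    return out


# Step 3: rules are evaluated in sequence until one of them is satisfied.
MATCH_RULES = [
    ("name+ssn", ("first", "last", "ssn")),
    ("name+phone", ("first", "last", "phone")),  # assumed second rule
]


def match(a, b):
    """Return the name of the first satisfied rule, or None if no rule applies."""
    for name, fields in MATCH_RULES:
        if all(a[f] is not None and a[f] == b[f] for f in fields):
            return name
    return None


# Step 4: per-attribute source preferences used when merging.
PREFERRED_SOURCE = {"address": "ERP", "phone": "CRM"}


def merge(group):
    """Build a golden record, taking each attribute from its preferred system."""
    golden = {}
    for field in ("first", "last", "ssn", "phone", "address"):
        preferred = PREFERRED_SOURCE.get(field)
        # Preferred system first; otherwise fall back to any non-empty value.
        ordered = sorted(group, key=lambda r: r["system"] != preferred)
        golden[field] = next((r[field] for r in ordered if r[field]), None)
    return golden


clean = [standardize(r) for r in records]
matched_by = match(clean[0], clean[1])
golden = merge(clean) if matched_by else None
print(matched_by, golden)
```

Note that the names in the golden record stay lowercased here because the sketch reuses the comparison form. A real implementation would typically keep a display form alongside it.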

The use of the above MDM capabilities differs depending on data domains and business use cases. We cover this in more detail later on in this article where we talk about MDM implementation styles.

But before we move there, we need to address another critical data type that complements and supports master data: reference data.

What is reference data management and why do we need it?

Reference data are codesets and their descriptions used to classify or categorize other data.

Like master data, reference data or codesets need to be represented by a single version of the truth and shared accurately and reliably across business systems. Importantly, codesets must be accurately linked with master data.

Misaligned or inaccurate codesets can seriously disrupt business functions by blocking access to what are otherwise accurate customer or product records. Using the wrong healthcare code on a patient’s record, for example, could have dire consequences, including a misidentified disease, incorrect treatment, or the wrong medication being prescribed.

Other examples of reference data include standardized codes for currencies, countries, gender, and customer types. These are linked with master data such as customer or product records to enable accurate segmentation and various types of reporting.
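As a rough illustration of linking codesets to master data, the Python sketch below checks a customer record's coded attributes against a shared set of codes and swaps codes for descriptions for reporting. The codesets, attribute names, and both helper functions are assumptions made for this example. In practice the codesets would come from a governed reference data source rather than being hard-coded.

```python
# Hypothetical codesets, hard-coded here only for illustration.
CODESETS = {
    "country": {"US": "United States", "CZ": "Czech Republic", "CA": "Canada"},
    "currency": {"USD": "US dollar", "CZK": "Czech koruna", "CAD": "Canadian dollar"},
    "customer_type": {"B2B": "Business", "B2C": "Consumer"},
}


def invalid_codes(record: dict) -> dict:
    """Return {attribute: value} for every coded attribute not found in its codeset."""
    return {
        attr: record[attr]
        for attr in CODESETS
        if attr in record and record[attr] not in CODESETS[attr]
    }


def describe(record: dict) -> dict:
    """Replace known codes with their descriptions; leave everything else as is."""
    return {
        attr: CODESETS[attr].get(value, value) if attr in CODESETS else value
        for attr, value in record.items()
    }


customer = {"name": "Jane Smith", "country": "US", "currency": "USD", "customer_type": "B2C"}
bad = {"name": "Jane Smith", "country": "USA", "currency": "USD", "customer_type": "B2C"}

print(invalid_codes(customer))  # no violations expected
print(invalid_codes(bad))       # "USA" is not in the country codeset
```

The second record is accurate in every other respect, yet a single non-standard country code is enough to drop it out of any report that segments by country.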

For an in-depth discussion about RDM, read our blog post on the Fundamentals of Reference Data Management.

MDM architectural & implementation styles

There are four MDM architectural styles designed to ensure successful MDM implementations.

While the four styles essentially share the same master data management goal (i.e., a single enterprise representation of master data), there are different implementation considerations based on system environments, business process improvement initiatives, and data domains.

MDM styles at a glance:

  • Consolidation: Pulls source data into the MDM hub, consolidates it, and provides the best version of data (usually golden records) to subscribing systems and users. Source data stays as is.
  • Centralized: Master data is created in the hub and is shared with subscribing systems.
  • Co-existence: Pulls source data into the hub, consolidates it, and provides the best version of data (usually golden records) to subscribing systems, users, and selected source systems (that’s what makes it different from the consolidation style).
  • Mixed/Hybrid: Combines consolidation and centralized styles to flexibly consolidate and author master data in the MDM hub, and provide data back to source systems or any other consumers.
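One way to compare the four styles side by side is to reduce each to a few yes/no properties. The Python sketch below does that. The three flags and their values are one reading of the descriptions above, not an official classification, and the class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MdmStyle:
    name: str
    authored_in_hub: bool         # master data can be created in the hub
    consolidates_sources: bool    # hub pulls in and consolidates source data
    writes_back_to_sources: bool  # best-version data flows back to source systems


# Flag values are an interpretation of the bullet list, labeled as an assumption.
STYLES = [
    MdmStyle("Consolidation", False, True, False),
    MdmStyle("Centralized", True, False, False),
    MdmStyle("Co-existence", False, True, True),
    MdmStyle("Mixed/Hybrid", True, True, True),
]

write_back = [s.name for s in STYLES if s.writes_back_to_sources]
authoring = [s.name for s in STYLES if s.authored_in_hub]
print(write_back, authoring)
```

Viewed this way, co-existence differs from consolidation only in the write-back flag, and the hybrid style is the one that turns all three on.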

The business benefits of MDM

MDM has a broader mission that goes beyond governing master data and reference data. Its ability to centrally merge and govern data from any enterprise system brings powerful business benefits.

Correct business information means trusted financial, sales, and regulatory reporting. Accurate inventory information enhances control over the selling process. And a cleansed and reliable customer database enables accurate segmentation and increased ROI for marketing campaigns.

Just consider all the use cases for business addresses and their correct GPS coordinates in various industries:

  • Logistics for customer deliveries or supplier replenishment
  • SKU storage location
  • Emergency vehicle routing
  • Kiosk or device maintenance for vending machines

Strategically, MDM also plays a process-driven role in addressing data dependencies and the quality of the key domains behind major business initiatives:

Business initiative: key domains

  • Customer 360: customer, account, product, location
  • Merging product catalogs (M&A): product type, product codes, product location
  • Supply chain optimization: supplier, product (buy-side) components, raw materials, location

Master data management provides a broad-based governance framework designed to manage any kind of enterprise master and reference data. It is the go-to data management solution for creating a single version of the truth from multiple, conflicting datasets and systems.

MDM not only governs multi-domain data such as customer, product, and location data, but can also logically interrelate those domains to create a practical business view.

To summarize, MDM provides its value through its ability to:

  1. Create and maintain the best version of business information.
  2. Connect data from different domains into useful business views.
  3. Make key business data accessible to users, systems, and processes in a secure and governed manner.

Ultimately, this means organizations can use reliable data for a wide range of tactical activities as well as ongoing core business processes.

Learn more about the basics of MDM by viewing our MDM 101 webinar. It expands on this blog’s content and additionally focuses on the symbiotic relationships between MDM, data quality, and a data catalog.