Any organization that stores, migrates, or merges data knows that some plan for data quality must be in place to keep its investment, the data itself, from losing integrity. When such a plan is in place and working as intended, most issues are corrected by the DQ rules defined within the system, but not all of them. In fact, no present-day DQ tool can correct every issue that arises from data inconsistencies: the sheer variety of data quality problems and exceptions makes it impossible to design rules that cover them all.
Missing key data elements, ambiguities, and regulatory or legal constraints are the typical limits of automated data correction, so manual intervention in the DQ process is necessary to counteract the undesired decay of data.
The overall concept of bad data detection and manual resolution begins with the system's DQ tool, whose rules establish the guidelines to which system data must conform. In this automated process, each record that fails the established DQ rules is segregated and sent to the DQIT application. Data stewards then track each issue and move it through a series of states until it is resolved in the source system and closed. Stewards can either correct the source data in its original system (and affiliated systems) or adjust the DQ rules themselves, but in the vast majority of cases they correct the faulty data in the source system(s).
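The detection-and-segregation step can be sketched in a few lines. This is a minimal illustration, not Ataccama's actual API: the rule names, fields, and queue structure below are assumptions made for the example.

```python
# Hypothetical DQ rules check each record; records failing any rule are
# segregated into an issue queue for manual review by data stewards.

def not_empty(field):
    """Rule: the given field must be present and non-blank."""
    return lambda rec: bool(str(rec.get(field, "")).strip())

def valid_email(rec):
    """Rule: a crude well-formedness check on the email field."""
    email = rec.get("email", "")
    return "@" in email and "." in email.split("@")[-1]

RULES = {
    "missing_name": not_empty("name"),
    "invalid_email": valid_email,
}

def detect_issues(records):
    """Return (clean, issues): records failing any rule are segregated."""
    clean, issues = [], []
    for rec in records:
        failed = [name for name, rule in RULES.items() if not rule(rec)]
        if failed:
            issues.append({"record": rec, "failed_rules": failed})
        else:
            clean.append(rec)
    return clean, issues

customers = [
    {"id": 1, "name": "Alice", "email": "alice@example.com"},
    {"id": 2, "name": "", "email": "bob@example"},
]
clean, issues = detect_issues(customers)
# record 1 passes; record 2 fails both rules and is queued for a steward
```

In a real deployment the issue queue would feed the tracking application rather than an in-memory list, but the separation of conforming and non-conforming records is the same.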
The whole process consists of the following:
- Issue Detection. Reported either by a DQ tool based on the defined rules or by individual business users.
- Issue Resolution. Manual Cleansing (resolution of ambiguities, addition of missing values from other sources), Manual Match (selection of records to be merged/unmerged, suggestions can be provided by the DQ tool), Manual Merge (creation of Golden Record to reflect business requirements).
- Automated Data Cleansing & Matching Inputs. Repeated resolution of the same issues may result in suggesting improvements for business rules implemented in the DQ tool.
- Backward Propagation. Reflection of decisions/resolutions in the source systems.
- Monitoring & Reporting.
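The "series of states" each issue passes through can be modeled as a small state machine. The state names and transitions below are illustrative assumptions, not DQ Issue Tracker's actual workflow; the point is that every decision is recorded and only valid transitions are allowed.

```python
# Allowed transitions for a tracked DQ issue (hypothetical states).
ALLOWED = {
    "detected": {"assigned"},
    "assigned": {"in_resolution"},
    "in_resolution": {"resolved", "assigned"},  # may be handed back
    "resolved": {"closed", "in_resolution"},    # closed once fixed in source
    "closed": set(),
}

class Issue:
    def __init__(self, record_id, failed_rule):
        self.record_id = record_id
        self.failed_rule = failed_rule
        self.state = "detected"
        self.history = ["detected"]  # audit trail of decisions

    def transition(self, new_state):
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"cannot move from {self.state} to {new_state}")
        self.state = new_state
        self.history.append(new_state)

issue = Issue(record_id=2, failed_rule="missing_name")
issue.transition("assigned")
issue.transition("in_resolution")
issue.transition("resolved")  # data corrected in the source system
issue.transition("closed")
```

The recorded `history` is what makes the later monitoring and reporting step possible: it preserves who-did-what ordering without querying the source systems again.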
Manual data quality processes are often the first step organizations take when uncovering issues hidden in their data, though they soon realize that manual handling alone is not the most effective way to resolve them all.
Ataccama supports the solution
Ataccama provides a fully integrated platform for both automated and manual data quality processes. DQC/MDC provides the means for issue detection, while DQ Issue Tracker—a powerful web-based application for data stewards—allows its users to monitor and manually resolve issues, track their history and related decisions, and generate comprehensive reports. Most importantly, it integrates with the DQC/MDC DQ Firewall, preventing imprecise user resolutions of data quality issues.
- Effective combination of automated and manual data quality processes—the detection of issues, provision of proposals using the DQ tool, and manual resolution (enrichment, cleansing, match, merge)
- Manual data enrichment from other sources (documentation, interaction with the customer, etc.)
- Ability to correct missing or obsolete values
- Ability to improve the final view of the customer data using manual merge (Golden Record creation) in cases where the correction of source data is impossible
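Golden Record creation through manual merge can be illustrated with a simple survivorship sketch: for each field, keep the best available value across duplicate source records. The source names, priority order, and tie-breaking rule below are assumptions for the example, not a prescribed merge policy.

```python
from datetime import date

# Hypothetical source trust ranking: lower number wins.
SOURCE_PRIORITY = {"crm": 0, "billing": 1, "legacy": 2}

def golden_record(duplicates, fields):
    """For each field, keep the first non-empty value from the most
    trusted, most recently updated source record."""
    ranked = sorted(
        duplicates,
        key=lambda r: (SOURCE_PRIORITY[r["source"]], -r["updated"].toordinal()),
    )
    golden = {}
    for field in fields:
        for rec in ranked:
            if rec.get(field):
                golden[field] = rec[field]
                break
    return golden

dups = [
    {"source": "legacy", "updated": date(2019, 1, 5),
     "name": "A. Smith", "phone": "555-0100", "email": ""},
    {"source": "crm", "updated": date(2021, 6, 1),
     "name": "Alice Smith", "phone": "", "email": "alice@example.com"},
]
merged = golden_record(dups, ["name", "phone", "email"])
# the CRM record wins where it has values; the legacy phone survives
```

In practice a steward would override such automated survivorship where business knowledge says otherwise; the point of manual merge is exactly that the final Golden Record reflects business requirements, not just a mechanical precedence rule.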