Boost Data Quality with Soft Validation Rules
by Anna Li | January 08, 2019
Consider the enricher and the validation rule. If your goal is to increase data quality, it might seem obvious to use a validation rule to block poor-quality data from entering your trusted MDM system.
However, we’ve seen more and more enrichers deployed to perform data quality checks that are traditionally reserved for validation rules.
Let’s explore this idea: in what situations would an enricher better serve business requirements than a validation rule?
Consider a Semarchy customer who manages business-to-business (B2B) customer data in Semarchy xDM. This organization purchases data from external sources. The data contains B2B customer lists that include address information.
In terms of data quality, the external sources run the gamut: some provide complete and accurate addresses, while others are missing fundamental address information, which makes the address unusable.
Data Set Example
Let’s look at three records of a fictional financial services corporation with accompanying address data. These records are based on real examples from three different external data sources:
1. ABC Bank Corp. (from BSO)
101 Montgomery St San Francisco, CA 94104
2. ABC Bank Corp. (from DFS)
101 Montgomery St Fl 5, San Francisco, CA 94104
3. ABC Bank Corp. (from BDR)
San Francisco, CA 94104
The organization sends promotional mail to these B2B customers, so address accuracy and completeness are important.
Record #2 with the address, 101 Montgomery St Fl 5, San Francisco, CA 94104, is the most accurate and complete address.
However, consider records #1 and #3. The 101 Montgomery building has 28 floors, and many organizations occupy its floors and suites, including Chase Bank, Yosemite Conservancy, cafes, law practices, and dental offices. Without a floor or suite number, mail sent to the address in record #1 may not reach ABC Bank Corp., and record #3 has no street address at all.
Let’s look at how this Semarchy customer managed its validation rules and enrichers to handle records #1 and #3.
Applying Validation Rules
In Semarchy xDM, the validation rule returns a Boolean value (true or false). If the record is valid, xDM will allow the record to go through the integration process and become a golden record. If the record fails validation, the record is put in the error queue.
For example, the case study organization has a validation rule in place to check that all customer records have a street number and street name in the address. That means that record #3 with the address, San Francisco, CA 94104, fails the validation rule and will be placed in the error queue. Record #3 will not make it downstream because most organizations will not consume records from the error queue.
Furthermore, the business determined that it is impractical for a data steward to research and fix this address, because ABC Bank Corp. has multiple branches in San Francisco.
What should happen to record #1 with the address, 101 Montgomery St San Francisco, CA 94104? The organization doesn’t want to place this record into the error queue because it contains useful information that should flow to downstream systems like the CRM. However, a data steward should, and more importantly can, fix the address with some review and research.
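To make the hard-validation behavior concrete, here is a minimal sketch in plain Python. It is not SemQL or any Semarchy xDM API. The `CustomerRecord` class, the `has_street` rule, the `route` helper, and the regular expression are all hypothetical names and heuristics invented for illustration. The sketch assumes that a usable street line starts with a house number followed by a street name.

```python
import re
from dataclasses import dataclass


# Hypothetical record shape for illustration; not a Semarchy xDM entity.
@dataclass
class CustomerRecord:
    name: str
    source: str
    address: str


# Assumed heuristic: the address starts with a house number
# followed by at least one letter of a street name.
STREET_PATTERN = re.compile(r"^\s*\d+\s+[A-Za-z]")


def has_street(record: CustomerRecord) -> bool:
    """Validation rule: returns True if valid, False if the record is rejected."""
    return bool(STREET_PATTERN.match(record.address))


def route(records):
    """Split records into those that proceed and those sent to the error queue."""
    accepted, error_queue = [], []
    for record in records:
        (accepted if has_street(record) else error_queue).append(record)
    return accepted, error_queue


records = [
    CustomerRecord("ABC Bank Corp.", "BSO", "101 Montgomery St San Francisco, CA 94104"),
    CustomerRecord("ABC Bank Corp.", "DFS", "101 Montgomery St Fl 5, San Francisco, CA 94104"),
    CustomerRecord("ABC Bank Corp.", "BDR", "San Francisco, CA 94104"),
]

accepted, error_queue = route(records)
```

With the three sample records, the BSO and DFS records pass the rule and only the BDR record lands in `error_queue`. The rule is all or nothing: it has no way to say "accept this record, but have someone look at it."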
To address the use case of record #1, the organization uses an enricher. Let’s see how an enricher can flag the record for review and still allow xDM to persist the record as a golden record.
In contrast to the validation rule, an enricher never rejects records as errors. Normally, enrichers are used to normalize, standardize, and augment existing data. But Semarchy xDM customers can also use enrichers as “soft validation” rules.
For example, an enricher can be configured to filter on records missing a suite or floor number and write a message into a comment field, such as “REVIEW: This customer may be missing a suite or floor number”.
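The soft-validation idea can be sketched in plain Python as well. Again, this is an illustration under stated assumptions, not Semarchy's enricher syntax. The `review_comment` function, the list of unit designators, and the assumption that every address ends with a ", STATE ZIP" segment are ours.

```python
import re
from typing import Optional

REVIEW_MESSAGE = "REVIEW: This customer may be missing a suite or floor number"

# Assumed heuristic: a common unit designator followed by an identifier,
# such as "Fl 5", "Suite 200", or "Ste. #4B".
UNIT_PATTERN = re.compile(
    r"\b(?:fl|floor|ste|suite|unit|rm|room)\b\.?\s*#?\s*\w+",
    re.IGNORECASE,
)


def review_comment(address: str) -> Optional[str]:
    """Soft validation: never rejects a record, only returns a comment value.

    Returns None when a suite or floor is present, else the review message.
    """
    # Assumption: the last comma-separated segment is "STATE ZIP". Dropping it
    # keeps a state code such as "FL 33101" from being read as a floor number.
    street_and_city = address.rsplit(",", 1)[0]
    if UNIT_PATTERN.search(street_and_city):
        return None
    return REVIEW_MESSAGE


comment_1 = review_comment("101 Montgomery St San Francisco, CA 94104")
comment_2 = review_comment("101 Montgomery St Fl 5, San Francisco, CA 94104")
```

Here `comment_1` holds the review message and `comment_2` is `None`. The function returns a value to store instead of a pass/fail verdict, so the record itself is never blocked.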
In the application, a custom business view displays all golden records where the comment field has a review message, so data stewards can log in to xDM and see at a glance all the records they need to take action on.
This soft validation allows records like record #1, which lack a suite or floor number, to become golden records while still flagging the issue so that data stewards are prompted to fix it. In other words, it allows the organization to clean up records to improve data quality without cutting off nourishment to downstream systems.
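The filter behind such a view is simple. The sketch below shows the idea in plain Python; in xDM the view would be configured in the application, not coded this way. The `GoldenRecord` class, the `needs_review` and `steward_worklist` functions, and the convention that review messages start with "REVIEW:" are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical golden record shape; not a Semarchy xDM entity.
@dataclass
class GoldenRecord:
    name: str
    address: str
    comment: Optional[str] = None


def needs_review(record: GoldenRecord) -> bool:
    """True when the comment field carries a review message."""
    # Assumed convention: review messages start with "REVIEW:".
    return record.comment is not None and record.comment.startswith("REVIEW:")


def steward_worklist(golden_records):
    """Return only the golden records a data steward should act on."""
    return [record for record in golden_records if needs_review(record)]


golden_records = [
    GoldenRecord(
        "ABC Bank Corp.",
        "101 Montgomery St San Francisco, CA 94104",
        "REVIEW: This customer may be missing a suite or floor number",
    ),
    GoldenRecord("ABC Bank Corp.", "101 Montgomery St Fl 5, San Francisco, CA 94104"),
]

worklist = steward_worklist(golden_records)
```

Both records remain golden records. The worklist only narrows what the steward sees, so in this example it contains the single record without a floor number.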
It is useful to understand the difference between an enricher and a validation rule because it gives you control over which tool to use for boosting data quality.
Consider when you may want to reject a record outright with a validation rule and when an enricher would serve you better as a “soft” validation so that records with minor issues can continue to flow downstream.