Organizations’ data typically degrades about 20 to 30 percent annually, but this number is about to increase at a much faster rate due to the recent business downturn and subsequent unemployment trends. As a result, many organizations are looking to third-party data vendors — in some cases more than one — to fill gaps in their data. Bringing in new data from new data sources, however, can introduce its own unique set of challenges.
Many of our clients, for example, choose to store the same data points in separate fields within the same tables. Let’s say you contract with ZoomInfo to pull in their Number of Employees data on your Account records. But you want to protect the Number of Employees data you already have, so you store it in a new field called “ZI Number of Employees.” The challenge this approach creates is that now your admins have to write a second set of rules for segmentation, scoring, assignment, and reporting to look at two fields — assuming they have been made aware of the second one — and the values in each may not correlate.
This approach might seem unusual to some, but, having performed many Data Audits for our clients, I see it quite frequently and to varying degrees. One recent client had nine fields related to “Number of Employees” in one database! That’s probably why, according to recent research, about 40% of a data science project’s duration is devoted to gathering and cleaning data — before the analysts can even start to visualize it.
You already have a lot of data stored within your organization. How will your organization’s data governance strategy measure up now under even greater pressure?
The following three steps provide a structured approach to reinforcing your data management strategy to accommodate new data sources while maintaining or even improving overall data quality:
1. Identify your master data fields
Master or “golden” data typically describes your customers, products, locations, and customer contracts. It tends to be a small subset of all of your data, but it’s the most valuable and must be well maintained. Master data is often shared across multiple users or groups within an organization, and it might be stored in different databases, which can easily create conflicts and inaccuracies. But only one database, one record, and/or one field should be your master to avoid any confusion.
Your first exercise is to define which pieces of business-critical information comprise your master data set. These might include names, Ideal Customer Profile descriptors, physical and digital contact information, persona descriptors, product name lists, etc. The data points on your list will be those most frequently used for business-critical use cases and should be highly focused.
In the ZoomInfo scenario described above, there is no longer a single master field for “Number of Employees.” To restore one, you’ll need to gain agreement across your data governance stakeholders on a single master field for each data point.
2. List and prioritize the sources of data for master fields
Once you’ve selected your master fields, you will need to identify all of the sources that might be populating them. Possible data sources might include:
- Form submissions
- Manual entry
- List uploads or imports from various sources
- Integrations, possibly multiple, from both internal and external databases
In some cases, you may need to consider different external tables importing into the same table in your database, such as both Leads and Contacts that import from your CRM into your single Person table in your Marketing Automation platform.
You will then need to prioritize your list of sources. An example of a prioritized list might be:
- Manual entry
- In-system data management
- Form submissions
- Bulk API
- CRM Accounts
- CRM Contacts
- CRM Leads
- CRM Report
- ZoomInfo Synch
- List import
- Remote file over SFTP
Depending on the native capabilities of your master database, you may also need to set up a custom configuration to capture the source of the value in the master field (e.g., “ZoomInfo Synch”), and the date last updated by that source. You will need this for your update audit trail as well as for applying your master data rules when a new value arrives for each data point.
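If your platform lacks this capability natively, the configuration is straightforward to model yourself. The sketch below is illustrative only — the field names, the priority ranks, and the `MasterField` structure are assumptions, not features of any particular platform — but it shows how a prioritized source list and the per-field audit trail (source plus date last updated) fit together:

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

# Illustrative priority ranks for the example source list above.
# Lower number = higher priority. Your own rankings will differ.
SOURCE_PRIORITY = {
    "Manual entry": 1,
    "In-system data management": 2,
    "Form submissions": 3,
    "Bulk API": 4,
    "CRM Accounts": 5,
    "CRM Contacts": 6,
    "CRM Leads": 7,
    "CRM Report": 8,
    "ZoomInfo Synch": 9,
    "List import": 10,
    "Remote file over SFTP": 11,
}

@dataclass
class MasterField:
    """One master data point plus the audit trail described above:
    the current value, the source that last set it, and when."""
    value: Optional[str] = None
    source: Optional[str] = None          # e.g. "ZoomInfo Synch"
    last_updated: Optional[datetime] = None
```

Keeping the source and timestamp alongside each value is what makes the update rules in step 3 enforceable, since every incoming change can be compared against where the current value came from.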
3. Define update handling rules
Beyond prioritizing data sources, you’ll also want to overlay rules for updating the incoming values. A simple set of configuration options here might include:
- Always overwrite
- Update only if existing field is blank
- Append new data to existing
- Do nothing
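These four options can be expressed as a single dispatch function. The mode names and the semicolon used for appending below are illustrative choices, not a standard:

```python
def apply_update(existing, incoming, mode):
    """Apply one of the four simple update-handling options."""
    if mode == "always_overwrite":
        return incoming
    if mode == "update_if_blank":
        # Only fill the field when nothing is there yet.
        return incoming if not existing else existing
    if mode == "append":
        # Keep both values; delimiter is an arbitrary choice here.
        if existing and incoming:
            return f"{existing}; {incoming}"
        return existing or incoming
    if mode == "do_nothing":
        return existing
    raise ValueError(f"Unknown mode: {mode}")
```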
However, you may opt for more sophisticated rules that compare both the existing and incoming values and incorporate source prioritization. This is a sample set of such update rules for a single field:
| Existing source vs. incoming source | Existing field value is blank? | Incoming field value is blank? | Field value after the update |
| --- | --- | --- | --- |
| Same | No | No | Field is updated with the incoming value. |
| Same | No | Yes | Field is updated with the incoming value. |
| Same | Yes | No | Field is updated with the incoming value. |
| Same | Yes | Yes | Field remains blank. |
| Different | No | No | If the incoming source is given higher priority than the existing source, the field is updated with the incoming value. |
| Different | No | Yes | Field retains the existing value. However, if the incoming source is given higher priority than the existing source, the field is updated to blank. |
| Different | Yes | No | If the incoming source is given higher priority than the existing source, the field is updated with the incoming value. |
| Different | Yes | Yes | Field remains blank. |
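The entire rule table reduces to two comparisons: whether the sources match, and if not, which one outranks the other. The sketch below is one possible implementation under the assumption that `priority` is a dict mapping source names to ranks where a lower number means higher priority; note that, per the table, a higher-priority source wins even when its incoming value is blank:

```python
def resolve(existing_value, existing_source,
            incoming_value, incoming_source, priority):
    """Apply the update rules from the table above.

    priority: dict mapping source name -> rank (lower = higher priority).
    """
    if existing_source == incoming_source:
        # Same source: the incoming value always wins,
        # even when it is blank (rows 1-4 of the table).
        return incoming_value
    # Different sources: the incoming value wins only if its source
    # outranks the existing one (rows 5-8). An unknown/missing existing
    # source is treated as lowest priority, so the first real source wins.
    if priority[incoming_source] < priority.get(existing_source, float("inf")):
        return incoming_value
    return existing_value
```

One design note: the same-source branch deliberately allows blanking a field, which matches row 2 of the table and keeps the master field honest when the original source itself loses the value.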
Lastly, for some types of data, such as privacy settings, you may want rules that simply overrule all others. For example, if you receive an opt-out value of “true” from any source, regardless of priority, consider the value, rather than the source, as the master.
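A value-is-master rule like this is simpler than the source-priority logic above, because the answer depends only on the values themselves. For a boolean opt-out flag, one possible sketch:

```python
def resolve_opt_out(existing: bool, incoming: bool) -> bool:
    """Privacy-style override: once any source reports an opt-out,
    the opt-out sticks, regardless of source priority."""
    return existing or incoming
```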
Most importantly, document your master data, prioritized sources, and handling rules and share this out to relevant stakeholders. Be prepared to periodically revisit your strategy — after all, the only constant in life is change!
Other considerations for maintaining your data
Make sure you’re already following these best practices for maintaining your Person record data:
- Have a “No Longer at Org” checkbox field on your person records – Don’t let your users put this information into other fields like Job Title.
- Capture “Lead Source” from a picklist – Require this question to be answered, and answered consistently: “How did this record get into my database?”
- Audit your data – Find fields that have limited population or recent usage, unknown need or ownership, and start the process of cleanup.
- Deduplicate your database – Be confident that you have a single master record that best represents each entity in your database.
- Define a Data Retention Policy – As records expire or age, you’ll need to consider a retention plan to ensure you’re paying only for what you can use.
After reading through this article, you might be feeling a little overwhelmed by the recommendations. Most of us are struggling with the many changes going on in our lives and work right now. You don’t need to start out of the gate with a full-blown master data management strategy. DemandGen’s Data Services experts can help you with a thoughtful, phased approach to improving your organization’s data health. Prefer not to become a data expert overnight? You can also take advantage of our certified data specialists to implement that strategy on your behalf with our DataMD managed service.
Gaea Connary, Consultant at DemandGen, focuses on helping organizations strengthen their lead management processes, lead scoring, nurturing strategy, and reporting and analysis to get the best return on their technology investment and meet their marketing objectives.