[Explained by a Database Company] What Is Data Cleansing? A Comprehensive Guide to Organizing and Managing Customer Data!
Last Updated: June 19, 2023
If you are reading this article, you have probably either tried to cleanse your internal data before and failed, or are unsure where to begin with your data cleansing efforts.
In this article, we explain how to cleanse data scattered across your organization, such as in SFA or CRM systems, in five steps. We recommend starting from the lowest level.
Table of Contents
1-1 The Importance of Corporate Data Cleansing for Database Maintenance
2 Level 0: Define Your Data Cleansing Objectives
3 Level 1: Define Target Data and Rules for Data Cleansing
4 Level 2: Perform Data Cleansing Using Standard CRM and SFA Features
4-1 Data Cleansing Features in Salesforce (Sales Cloud)
5 Level 3: Export Data and Perform Manual Cleansing Using Excel or Similar Tools
5-1 Representative Excel Functions
6 Level 4: Perform Data Cleansing Using Specialized Tools
Data consolidation, or "nayose" in Japanese, originated as a financial term referring to the integration of multiple accounts held at a single financial institution into one.
Today, the term has evolved to also mean the integration of identical data existing within a database, beyond just financial accounts.
Data Consolidation = Integrating identical data existing within a database into a single record.
While there are various types of data consolidation, in the context of "consolidating databases within a company," it can be broadly classified into two categories: individual consolidation and corporate consolidation.
Individual consolidation refers to integrating identical individuals within a database into a single record. Typical examples include data at the individual level, such as business contacts or leads. Since SFA and MA systems are fundamentally designed to manage data on a per-person basis, individual consolidation is essential for accurately managing sales contact data.
Corporate consolidation refers to integrating identical companies within a database into a single record. Typical examples include data at the corporate level, such as clients or suppliers. This is crucial for customer data management, preventing sales collisions, and implementing ABM (account-based marketing).
Individual consolidation is relatively easy to implement (though by no means simple), because you can define consolidation keys that are less prone to notation variance, such as Name + Email Address (or Phone Number).
On the other hand, corporate consolidation is prone to notation variance in company names and addresses, and it becomes harder to set unified keys for small and medium-sized enterprises, micro-enterprises, government agencies, or individual business locations.
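To make the notation-variance problem concrete, here is a minimal Python sketch of a normalized company-name key. The function name `normalize_company_name` and both marker lists are hypothetical and deliberately short; they are assumptions for illustration, not a complete rule set.

```python
import re
import unicodedata

# Hypothetical, incomplete lists of legal-entity markers; extend them for your own data.
JAPANESE_MARKERS = ("株式会社", "有限会社", "合同会社", "(株)", "(有)")
LATIN_ENTITY_TOKENS = {"co", "ltd", "inc", "corp", "llc"}


def normalize_company_name(name: str) -> str:
    """Build a comparison key that ignores common notation variance."""
    # NFKC folds full-width letters to half-width, half-width katakana to
    # full-width, and "㈱" to "(株)", so visually similar spellings converge.
    text = unicodedata.normalize("NFKC", name).casefold()
    for marker in JAPANESE_MARKERS:
        text = text.replace(marker, "")
    # Split on punctuation and whitespace, then drop legal-entity tokens.
    # Caveat: this also drops a genuine name part that happens to be "co" etc.
    tokens = re.findall(r"\w+", text)
    return "".join(t for t in tokens if t not in LATIN_ENTITY_TOKENS)


if __name__ == "__main__":
    for raw in ("Ａｃｍｅ Co., Ltd.", "ACME Inc", "㈱アクメ", "株式会社ｱｸﾒ"):
        print(raw, "->", normalize_company_name(raw))
```

A key like this only reduces variance. Company name changes, branch offices, and unrelated companies that share a name still need an external identifier or manual review.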
This article focuses on "Corporate Data Consolidation," which is more complex and carries a higher level of importance.
As mentioned above, there are various types of data consolidation. Since the target data and the rules for consolidation change depending on the objective, it is most important to clearly define "why you are performing data consolidation" and establish a common understanding among team members before you begin.
Think of your objectives by working backward from the benefits. Typical benefits include:
● Preventing sales collisions
● Improving data accuracy to enable appropriate marketing and sales activities
● Identifying appropriate new business opportunities through the analysis of existing customers
(You may also need to consider methods for data enrichment beyond just consolidation.)
Determine the data to be consolidated according to your objectives. If your internal data is scattered across CRM, SFA, MA, and business card management systems, consider using a data warehouse (a central repository for data).
Once you have determined the target data, establish your consolidation rules.
There are two main components to define for consolidation rules: "Consolidation Keys" and "Consolidation Logic."
Key: Information used as the condition for determining duplicates.
Logic: Rules for handling scenarios not covered by the key, such as the priority of records when consolidation keys are identical.
Ideally, consolidation keys should be symbols or codes that are uniquely linked to the actual data, such as an email address for individuals or a corporate registration number for companies.
Define logic for scenarios that cannot be covered by the consolidation key, such as when the key is blank or when identical data exists.
For example, if an email address is used as the key, you might set a rule to prioritize records where the name is not blank for data with the same email address.
Also, since it is difficult to completely eliminate all data duplication, it is important to set a boundary line for how far to consolidate and where to keep the data as is.
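The key-plus-logic idea can be sketched in a few lines of Python. This example assumes records are plain dictionaries with hypothetical `email` and `name` fields. It uses the email address as the consolidation key, and its logic matches the example above: prefer a record whose name is not blank. Records with a blank key are set aside rather than merged.

```python
from collections import defaultdict


def consolidate_by_email(records):
    """Return (survivors, unmatched) using email as the consolidation key."""
    groups = defaultdict(list)
    unmatched = []
    for record in records:
        key = (record.get("email") or "").strip().lower()
        if key:
            groups[key].append(record)
        else:
            # Blank key: the key cannot decide, so keep the record as is.
            unmatched.append(record)

    survivors = []
    for group in groups.values():
        # Logic: prefer a record with a non-blank name; on a tie, max()
        # keeps the first record encountered.
        best = max(group, key=lambda r: bool((r.get("name") or "").strip()))
        survivors.append(best)
    return survivors, unmatched


if __name__ == "__main__":
    sample = [
        {"id": 1, "email": "a@example.com", "name": ""},
        {"id": 2, "email": "A@Example.com ", "name": "Alice"},
        {"id": 3, "email": "", "name": "Bob"},
    ]
    kept, review = consolidate_by_email(sample)
    print([r["id"] for r in kept], [r["id"] for r in review])
```

Returning the blank-key records separately is one way to express the boundary between what is consolidated automatically and what is left as is.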
This is the step for actual consolidation processing. If your company uses sales support systems like CRM or SFA, many tools come with built-in standard consolidation features that allow you to manage duplicate data.
Here, we will explain using Salesforce's Sales Cloud, a representative CRM/SFA tool, as an example.
Salesforce has "Matching Rules" and "Duplicate Rules" as standard features for duplicate management.
Matching rules compare field values to determine if a record is sufficiently similar to an existing record to be considered a duplicate.
For example, a matching rule can specify that a record is a duplicate if the email and phone values of two records match exactly.
Source: Sales Cloud, "What Are Matching Rules?"
Duplicate rules work in conjunction with matching rules to prevent users from creating duplicate records.
While matching rules determine if a record being created or updated is similar enough to be considered a duplicate of another, duplicate rules instruct Salesforce on what action to take when a duplicate is identified.
For example, duplicate rules can block users from saving a record identified as a potential duplicate, or simply notify the user with an alert while still allowing the record to be saved.
Source: Sales Cloud, "What Are Duplicate Rules?"
Matching rules and duplicate rules can be configured in Salesforce under [Setup] > [Data] > [Duplicate Management].
You can set field configurations for each object and define exact or fuzzy matching, allowing you to detect duplicates and consolidate data when creating or editing records based on your specific objectives.
For more details, please check the Salesforce Guidelines.
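The split between "deciding whether two records match" and "deciding what to do about it" can be illustrated generically. The Python sketch below is not Salesforce's implementation or API. The function names, the field names, and the 0.85 threshold are all assumptions, and the fuzzy comparison uses the standard library's `difflib.SequenceMatcher` as a stand-in for whatever algorithm a real product uses.

```python
from difflib import SequenceMatcher


def is_exact_match(a, b, fields):
    """Matching logic: every listed field is non-blank and identical."""
    return all(a.get(f) and a.get(f) == b.get(f) for f in fields)


def is_fuzzy_match(a, b, field, threshold=0.85):
    """Matching logic: one field is similar enough (threshold is arbitrary)."""
    left = (a.get(field) or "").lower()
    right = (b.get(field) or "").lower()
    if not left or not right:
        return False
    return SequenceMatcher(None, left, right).ratio() >= threshold


def check_new_record(new, existing, action="alert"):
    """Duplicate logic: decide what happens when a match is found."""
    duplicates = [
        r for r in existing
        if is_exact_match(new, r, ("email", "phone"))
        or is_fuzzy_match(new, r, "company")
    ]
    if not duplicates:
        return "saved"
    return "blocked" if action == "block" else "saved_with_alert"


if __name__ == "__main__":
    existing = [{"email": "a@example.com", "phone": "123",
                 "company": "Acme Corporation"}]
    typo = {"email": "b@example.com", "phone": "999",
            "company": "Acme Corporatoin"}
    print(check_new_record(typo, existing, action="alert"))
    print(check_new_record(typo, existing, action="block"))
```

Keeping the two concerns in separate functions mirrors the separation described above: the same matching logic can back either a blocking or an alert-only policy.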
If the standard features of your CRM or SFA have too many limitations to perform satisfactory consolidation, another method is to export the data from the target system and perform manual consolidation using spreadsheet software like Excel.
The advantage of spreadsheet software like Excel lies in its flexibility. By utilizing appropriate functions, more complex condition settings and processing are possible. Representative functions include the following.
| Function | What it does |
| --- | --- |
| JIS Function | Converts half-width characters to full-width characters |
| ASC Function | Converts full-width characters to half-width characters |
| TRIM Function | Removes extra spaces |
| CLEAN Function | Removes non-printable characters, such as line breaks |
| CONCATENATE Function | Joins text strings |
| VLOOKUP Function | Looks up a value in the leftmost column of a range and returns the value from a specified column in the same row |
| XLOOKUP Function | Looks up a value in one range and returns the matching value from another range; can return several columns at once |
| IF Function | Creates conditional branches |
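If you prefer a script to spreadsheet formulas, the same kind of cleanup can be approximated in Python. The sketch below is a rough analogue, not an exact reimplementation of the Excel functions: `clean_cell` and `lookup` are hypothetical names, and Unicode NFKC normalization behaves like ASC only for alphanumerics and symbols (it converts half-width katakana to full-width, the opposite of ASC).

```python
import re
import unicodedata


def clean_cell(value: str) -> str:
    """Approximate ASC + CLEAN + TRIM on a single cell value."""
    # Roughly like ASC for letters, digits, and the full-width space.
    text = unicodedata.normalize("NFKC", value)
    # Roughly like CLEAN: drop non-printable characters such as line breaks.
    text = "".join(ch for ch in text if ch.isprintable())
    # Roughly like TRIM: collapse runs of spaces and strip both ends.
    return re.sub(r" +", " ", text).strip()


def lookup(key: str, master: dict, default: str = "") -> str:
    """Roughly like an exact-match VLOOKUP against a key -> value table."""
    return master.get(clean_cell(key), default)


if __name__ == "__main__":
    master = {"ABC 123": "C-0001"}  # hypothetical cleaned name -> company code
    print(repr(clean_cell("  ＡＢＣ　　１２３\n ")))
    print(lookup("ＡＢＣ　１２３", master, default="not found"))
```

Cleaning the lookup key with the same function used to build the master table is what makes the exact-match lookup tolerant of spacing and width differences.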
It is important to manage items that are prone to multiple interpretations separately. For example, if you only have a "Company Name" field, problems often arise such as the inclusion or exclusion of legal entity types or business location names. By pre-defining fields such as "Legal Entity Type," "Company Name," and "Business Location," you can reduce data entry errors. Another major benefit of separating items is that it becomes easier to set data validation rules.
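As an illustration of why separate fields make validation easier, here is a small Python sketch. The class `CompanyRecord`, its field names, and the allowed list of legal entity types are all hypothetical; a real list would depend on the jurisdictions you handle.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical allowed values; an empty string means "no legal entity type".
ALLOWED_ENTITY_TYPES = {"Co., Ltd.", "Inc.", "LLC", ""}


@dataclass
class CompanyRecord:
    legal_entity_type: str
    company_name: str
    business_location: str = ""

    def validate(self) -> list[str]:
        """Return a list of rule violations (empty when the record is valid)."""
        errors = []
        if self.legal_entity_type not in ALLOWED_ENTITY_TYPES:
            errors.append("legal_entity_type is not an allowed value")
        if not self.company_name.strip():
            errors.append("company_name must not be blank")
        # The entity type has its own field, so it must not leak into the name.
        if any(t and t in self.company_name for t in ALLOWED_ENTITY_TYPES):
            errors.append("company_name must not contain a legal entity type")
        return errors


if __name__ == "__main__":
    print(CompanyRecord("Inc.", "Acme", "Osaka Branch").validate())
    print(CompanyRecord("Inc.", "Acme Inc.").validate())
```

With a single free-text "Company Name" field, none of these checks could be expressed as simply.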
Once you have finished consolidating the exported data, import it back into the source system. When you do, always match records on the unique ID stored in the system. If you import without an ID, many of the rows may be created as new records, which can produce a large number of duplicates.
If you have created new fields, do not forget to create the corresponding fields in your CRM or SFA in advance.
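The effect of importing with and without the system ID can be shown with a toy model. In this Python sketch the "system" is just a dictionary keyed by record ID, and `import_records` is a hypothetical function; real CRM and SFA import tools have their own options and behavior.

```python
def import_records(system, rows, id_field="id"):
    """Update rows whose ID exists in `system`; create the rest as new records.

    `system` maps an integer record ID to a record dict and is modified in place.
    Returns (updated_count, created_count).
    """
    next_id = max(system, default=0) + 1
    updated = created = 0
    for row in rows:
        record_id = row.get(id_field)
        if record_id in system:
            system[record_id].update(row)
            updated += 1
        else:
            # No usable ID: the row cannot be matched, so it becomes a new record.
            new_record = dict(row)
            new_record[id_field] = next_id
            system[next_id] = new_record
            next_id += 1
            created += 1
    return updated, created


if __name__ == "__main__":
    system = {1: {"id": 1, "company": "Acme inc"},
              2: {"id": 2, "company": "globex"}}
    with_ids = [{"id": 1, "company": "Acme"}, {"id": 2, "company": "Globex"}]
    without_ids = [{"company": "Acme"}, {"company": "Globex"}]
    print(import_records(system, with_ids), len(system))
    print(import_records(system, without_ids), len(system))
```

The second call shows the failure mode described above: the cleaned rows carry no ID, so every one of them is added as a duplicate.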
The types and causes of notation variance and data deficiencies are diverse. Corporate information, in particular, is updated daily due to company name changes, relocations, mergers, and bankruptcies.
It is unrealistic to capture all these changes and maintain the data using only internal resources.
A professional consolidation tool captures changes in corporate information and keeps your records up to date automatically. Entering data through such a tool also helps keep your data clean.
Furthermore, by utilizing the unique codes assigned by the dedicated tool, you can achieve centralized management of information that was previously managed separately by system, department, and business location.
Missing information is supplemented from the built-in corporate database, enabling the optimization of sales activities using data.
In this article, we explained how to consolidate data existing within a company in five steps.
The deeper you go into data consolidation, the more complex it becomes. If your internal resources have reached their limit, consider introducing a professional tool.
Author
uSonar Editorial Department
MX Group, Editor-in-Chief
We are the uSonar Editorial Department.
We provide information on data utilization and digital technologies useful for B2B companies to rethink their future business operations.
