[Explained by a Database Company] What Is Data Cleansing? A Comprehensive Guide to Organizing and Managing Customer Data!
Last Updated: June 19, 2023
If you have found this article, you have probably tried to cleanse your internal data before without success, or you are unsure where to begin.
In this article, we explain how to cleanse data scattered across your organization, such as in SFA or CRM systems, in five steps (Level 0 to Level 4). We recommend starting at the lowest level and working up.
Table of Contents
1-1 The Importance of Corporate Data Cleansing for Database Maintenance
2 Level 0: Define Your Data Cleansing Objectives
3 Level 1: Define Target Data and Rules for Data Cleansing
4 Level 2: Perform Data Cleansing Using Standard CRM and SFA Features
4-1 Data Cleansing Features in Salesforce (Sales Cloud)
5 Level 3: Export Data and Perform Manual Cleansing Using Excel or Similar Tools
5-1 Representative Excel Functions
6 Level 4: Perform Data Cleansing Using Specialized Tools
Data consolidation, or "nayose" in Japanese, originated as a financial term referring to the integration of multiple accounts held at a single financial institution into one.
Today, the term has evolved to also mean the integration of identical data existing within a database, beyond just financial accounts.
Data Consolidation = Integrating identical data existing within a database into a single record
While there are various types of data consolidation, in the context of "consolidating databases within a company," it can be broadly classified into two categories: individual consolidation and corporate consolidation.
Individual consolidation refers to integrating records of the same person within a database into a single entry. Typical examples include data at the individual level, such as contact persons at client companies or leads. Since SFA and MA systems are fundamentally designed to manage data on a per-person basis, individual consolidation is essential for accurately managing sales contact data.
Corporate consolidation refers to integrating records of the same company within a database into a single entry. Typical examples include data at the corporate level, such as clients or suppliers. It is crucial for customer data management, for preventing sales overlaps, and for implementing account-based marketing (ABM).
Individual consolidation is relatively easy to implement (though by no means simple), because you can set consolidation keys that are less prone to notation variations, such as full name + email address (or phone number).
On the other hand, corporate consolidation is prone to notation variations in company names and addresses, and it becomes harder to set unified keys for small and medium-sized enterprises, micro-enterprises, government agencies, or individual business locations.
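To make the idea of a consolidation key for individuals concrete, here is a minimal Python sketch that builds a "full name + email address" key. The helper name `individual_key` and the normalization choices (Unicode NFKC folding, removing whitespace, ignoring case) are assumptions for illustration, not a prescribed method.

```python
import unicodedata

def individual_key(full_name: str, email: str) -> tuple[str, str]:
    """Build a consolidation key from full name + email (hypothetical helper)."""
    # NFKC folds full-width letters, digits, and the ideographic space to ASCII forms.
    name = unicodedata.normalize("NFKC", full_name)
    # Remove all whitespace and ignore case so spacing differences do not matter.
    name = "".join(name.split()).casefold()
    mail = unicodedata.normalize("NFKC", email).strip().casefold()
    return (name, mail)

a = individual_key("Taro Yamada", "Taro.Yamada@example.com")
b = individual_key("Ｔａｒｏ　Ｙａｍａｄａ", " taro.yamada@example.com ")
print(a == b)  # True: both notations produce the same key
```

A comparable key for companies is much harder to build, because legal entity types, abbreviations, and business location names all vary within the name itself.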
In this article, we will focus on "Corporate Data Consolidation," which is more complex and carries higher importance.
As mentioned above, there are various types of data consolidation. Since the target data and the rules for consolidation change depending on the purpose, it is most important to clearly define "why you are performing data consolidation" and establish a common understanding among team members before you begin.
Think about the purpose by working backward from the benefits. Typical benefits include:
● Preventing sales overlaps
● Improving data accuracy to enable appropriate marketing and sales activities
● Identifying appropriate new business opportunities through the analysis of existing customers
(You may also need to consider methods for data enrichment in addition to consolidation.)
Decide which data to consolidate based on your objectives. If internal data is scattered across CRM, SFA, MA, and business card management systems, consider using a data warehouse (a central repository for data).
Once you have determined the target data, decide on the rules for consolidation.
There are two main components to define: "Consolidation Keys" and "Consolidation Logic."
Key: Information used as the condition for determining duplicates.
Logic: Rules for handling scenarios not covered by the keys, such as the priority order of data when keys match.
Ideally, consolidation keys should be symbols or codes that are uniquely linked to the actual data, such as email addresses for individuals or corporate registration numbers for companies.
Define logic for scenarios that cannot be covered by keys, such as when a key is blank or when multiple records share the same key.
For example, if using an email address as a key, you can set a rule to prioritize data where the name is not blank for records with the same email address.
Also, since it is difficult to completely eliminate all data duplication, it is important to define the boundary of how far to consolidate and where to keep the data as is.
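The key and logic described above can be sketched in a few lines of Python. The records and field names below are hypothetical; the sketch uses the email address as the key, prefers a record whose name is not blank when keys match, and leaves records with a blank key untouched.

```python
from collections import defaultdict

# Hypothetical records; the field names are illustrative only.
records = [
    {"id": 1, "email": "sato@example.com", "name": ""},
    {"id": 2, "email": "sato@example.com", "name": "Hanako Sato"},
    {"id": 3, "email": "", "name": "Unknown Lead"},
    {"id": 4, "email": "suzuki@example.com", "name": "Ken Suzuki"},
]

def consolidate(rows):
    groups = defaultdict(list)
    unkeyed = []
    for row in rows:
        key = row["email"].strip().lower()
        if key:
            groups[key].append(row)
        else:
            # Logic for a blank key: keep the record as is rather than guess.
            unkeyed.append(row)
    survivors = []
    for rows_with_key in groups.values():
        # Logic for matching keys: prefer a record whose name is not blank.
        # sorted() is stable, so ties keep their original order.
        best = sorted(rows_with_key, key=lambda r: r["name"].strip() == "")[0]
        survivors.append(best)
    return survivors, unkeyed

survivors, unkeyed = consolidate(records)
print([r["id"] for r in survivors])  # [2, 4]
print([r["id"] for r in unkeyed])    # [3]
```

Returning the unkeyed records separately is one way to express the boundary between data that is consolidated and data that is kept as is.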
This is the step for the actual consolidation process. If your company uses sales support systems like CRM or SFA, many tools come with built-in standard consolidation features that allow you to manage duplicate data.
Here, we will explain using Salesforce's Sales Cloud, a representative CRM/SFA tool, as an example.
Salesforce has "Matching Rules" and "Duplicate Rules" as standard features for duplicate management.
Matching rules compare field values to determine if a record is sufficiently similar to an existing record to be considered a duplicate.
For example, a matching rule can specify that a record is a duplicate if the email and phone values of two records match exactly. (Source: Sales Cloud, "What Are Matching Rules?")
Duplicate rules work in conjunction with matching rules to prevent users from creating duplicate records.
While matching rules determine if a record being created or updated is similar enough to be considered a duplicate, duplicate rules instruct Salesforce on what action to take when a duplicate is identified.
For example, duplicate rules can block users from saving a record identified as a potential duplicate, or simply notify the user with an alert while still allowing the record to be saved. (Source: Sales Cloud, "What Are Duplicate Rules?")
Matching and duplicate rules can be configured in Salesforce via [Setup] > [Data] > [Duplicate Management].
You can set field configurations for each object and define exact or fuzzy matching, allowing you to detect duplicates and consolidate data when creating or editing records based on your specific needs.
For more details, please check the Salesforce Guidelines.
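The division of labor between the two kinds of rules can be illustrated with a small Python sketch. This is not Salesforce's API or its matching algorithm: the function names, the 0.85 similarity threshold, and the use of the standard library's `difflib` for fuzzy matching are all assumptions made for illustration.

```python
from difflib import SequenceMatcher

def exact_match(a: dict, b: dict) -> bool:
    """Matching rule sketch: email and phone must both match exactly."""
    return a["email"] == b["email"] and a["phone"] == b["phone"]

def fuzzy_match(a: dict, b: dict, threshold: float = 0.85) -> bool:
    """Matching rule sketch: company names are similar enough (threshold is assumed)."""
    ratio = SequenceMatcher(None, a["company"].lower(), b["company"].lower()).ratio()
    return ratio >= threshold

def apply_duplicate_rule(new: dict, existing: list, matcher, action: str) -> str:
    """Duplicate rule sketch: decide what happens when the matcher finds a duplicate."""
    if any(matcher(new, old) for old in existing):
        return "blocked" if action == "block" else "saved with alert"
    return "saved"

existing = [
    {"email": "info@example.com", "phone": "03-0000-0000", "company": "Example Trading Co., Ltd."},
]
new = {"email": "info@example.com", "phone": "03-0000-0000", "company": "Example Trading Co.,Ltd"}

print(apply_duplicate_rule(new, existing, exact_match, "block"))  # blocked
print(apply_duplicate_rule(new, existing, fuzzy_match, "alert"))  # saved with alert
```

The matcher only answers "is this a duplicate?", while the duplicate rule decides whether to block or alert, which mirrors the separation described above.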
If you find that standard CRM or SFA features are too limited to perform satisfactory consolidation, another method is to export the data from the target system and perform manual consolidation using spreadsheet software like Excel.
The advantage of spreadsheet software like Excel lies in its flexibility. By utilizing appropriate functions, more complex condition settings and processing are possible. Representative functions include the following:
| Function | What it does |
| --- | --- |
| JIS Function | Converts half-width characters to full-width characters |
| ASC Function | Converts full-width characters to half-width characters |
| TRIM Function | Removes leading, trailing, and repeated spaces |
| CLEAN Function | Removes non-printable characters such as line breaks |
| CONCATENATE Function | Joins text strings |
| VLOOKUP Function | Looks up a value in the first column of a range and returns a value from the same row |
| XLOOKUP Function | Looks up a value in one range and returns the corresponding value or values from another range |
| IF Function | Creates conditional branches |
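If you prefer a script to spreadsheet formulas, rough Python counterparts of these functions look like the sketch below. The equivalences are approximate: for example, Unicode NFKC normalization folds full-width letters and digits to half-width like ASC, but it converts half-width katakana to full-width, which is the opposite of ASC. The master data and field names are hypothetical.

```python
import unicodedata

def clean_cell(value: str) -> str:
    # Roughly ASC: fold full-width letters, digits, and spaces to half-width.
    text = unicodedata.normalize("NFKC", value)
    # Roughly CLEAN: Excel deletes control characters; here they become spaces
    # so that words separated only by a line break are not glued together.
    text = "".join(" " if unicodedata.category(ch) == "Cc" else ch for ch in text)
    # Roughly TRIM: strip the ends and collapse runs of whitespace.
    return " ".join(text.split())

# Roughly VLOOKUP: a dictionary keyed by the lookup column (hypothetical master data).
industry_by_code = {"A001": "Manufacturing", "B002": "Retail"}

row = {"code": "Ａ００１", "company": "  Example　Trading\nCo.  "}
code = clean_cell(row["code"])
company = clean_cell(row["company"])
# Roughly IF: fall back to a default when the lookup finds nothing.
industry = industry_by_code[code] if code in industry_by_code else "Unknown"
# Roughly CONCATENATE: join the cleaned values into one string.
print(" / ".join([code, company, industry]))  # A001 / Example Trading Co. / Manufacturing
```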
It is important to manage items that are prone to multiple interpretations separately. For example, if you only have a "Company Name" field, problems often arise such as the inclusion or exclusion of legal entity types or business location names. By pre-defining fields such as "Legal Entity Type," "Company Name," and "Business Location," you can reduce data entry errors. Another major benefit of separating items is that it becomes easier to set data validation rules.
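As a rough illustration of separating one "Company Name" value into the three fields, the following Python sketch splits on a known legal entity type. The list of entity types is assumed and far from exhaustive, and the sketch assumes the entity type follows the company name; names where it comes first would need an additional rule.

```python
# Assumed, non-exhaustive list of legal entity types; extend it for your own data.
ENTITY_TYPES = ["Co., Ltd.", "K.K.", "Inc.", "LLC"]

def split_company_field(raw: str) -> dict:
    """Split a single company-name value into three fields (hypothetical helper)."""
    text = " ".join(raw.split())  # collapse stray whitespace first
    for entity in ENTITY_TYPES:
        idx = text.find(entity)
        if idx != -1:
            return {
                "legal_entity_type": entity,
                "company_name": text[:idx].strip(),
                "business_location": text[idx + len(entity):].strip(),
            }
    # No known entity type: keep the whole value as the name for manual review.
    return {"legal_entity_type": "", "company_name": text, "business_location": ""}

print(split_company_field("Example Trading Co., Ltd. Osaka Branch"))
```

Once the parts live in separate fields, each one can have its own validation rule, for example restricting the legal entity type to a fixed list.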
Once you have finished consolidating the extracted data, import it back into the system. When you do, make sure every imported row carries the unique ID stored in the system. Without IDs, much of the imported data may be created as new records, which can produce a massive amount of duplicate data.
If you have created new fields, do not forget to create the corresponding fields in your CRM or SFA in advance.
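The effect of importing with and without system IDs can be seen in a small Python sketch. The dictionary below is a hypothetical in-memory stand-in for a CRM or SFA, and the ID scheme for new records is made up; real import tools behave according to their own settings.

```python
# Hypothetical in-memory stand-in for the CRM/SFA: record ID -> fields.
system = {
    "001": {"company": "Example Trading Co.,Ltd"},
    "002": {"company": "Sample Foods Inc"},
}

cleaned_rows = [
    {"id": "001", "company": "Example Trading Co., Ltd."},  # kept its system ID
    {"id": "", "company": "Sample Foods Inc."},             # ID lost during editing
]

def import_rows(store: dict, rows: list) -> dict:
    counts = {"updated": 0, "created": 0}
    for row in rows:
        fields = {"company": row["company"]}
        if row["id"] in store:
            store[row["id"]] = fields          # update the existing record in place
            counts["updated"] += 1
        else:
            new_id = f"new-{len(store) + 1}"   # made-up ID scheme for this sketch
            store[new_id] = fields             # no ID match: a new record appears
            counts["created"] += 1
    return counts

result = import_rows(system, cleaned_rows)
print(result)       # {'updated': 1, 'created': 1}
print(len(system))  # 3: record "002" now has a duplicate
```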
The types and causes of notation variations and data deficiencies are diverse. Corporate information, in particular, is updated daily due to company name changes, relocations, mergers, and bankruptcies.
It is unrealistic to capture all these changes and maintain the data using only internal resources.
A specialized consolidation tool captures changes in corporate information and maintains the data automatically. Entering information through such a tool also keeps your data clean.
Furthermore, by utilizing the unique codes assigned by the specialized tool, you can achieve centralized management of information that was previously managed separately by different systems, departments, and business locations.
The tool's built-in corporate database fills in missing information, which supports data-driven sales activities.
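Conceptually, centralizing records by a shared code and filling gaps from a reference database might look like the Python sketch below. It does not describe how any particular tool works: the `corporate_code` field, the reference data, and the "prefer the reference value" policy are all assumptions.

```python
from collections import defaultdict

# Hypothetical reference data keyed by a tool-assigned corporate code.
reference = {"C-100": {"name": "Example Trading Co., Ltd.", "address": "Tokyo"}}

# Hypothetical exports from two separately managed systems.
crm = [
    {"corporate_code": "C-100", "name": "Example Trading", "address": ""},
    {"corporate_code": "C-200", "name": "Sample Foods", "address": ""},
]
ma = [
    {"corporate_code": "C-100", "name": "EXAMPLE TRADING CO LTD", "address": "Tokyo"},
    {"corporate_code": "C-200", "name": "", "address": "Osaka"},
]

def centralize(*systems):
    by_code = defaultdict(list)
    for rows in systems:
        for row in rows:
            by_code[row["corporate_code"]].append(row)
    master = {}
    for code, rows in by_code.items():
        ref = reference.get(code, {})
        master[code] = {
            # Prefer the reference value; otherwise take the first non-blank value seen.
            field: ref.get(field) or next((r[field] for r in rows if r[field]), "")
            for field in ("name", "address")
        }
    return master

master = centralize(crm, ma)
print(master["C-100"])  # {'name': 'Example Trading Co., Ltd.', 'address': 'Tokyo'}
print(master["C-200"])  # {'name': 'Sample Foods', 'address': 'Osaka'}
```

Because every system refers to the same code, name variations such as "EXAMPLE TRADING CO LTD" no longer need to be matched by text comparison.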
In this article, we explained the methods for consolidating data existing within a company in five steps.
Data consolidation becomes deeper and more complex the further you delve into it. If your internal resources are reaching their limits, consider introducing a specialized tool.
Author
uSonar Editorial Department
MX Group, Editor-in-Chief
We are the uSonar Editorial Department.
We provide information on data utilization and digital technologies useful for B2B companies to rethink their future business operations.
