How to Clean Salesforce Data: The Complete Guide

Your Salesforce instance has years of accumulated problems. Duplicates, bad emails, outdated contacts, inconsistent formatting. Here's how to fix it.

January 2026 · 12 min read

Your Salesforce org has been running for years. Maybe a decade. In that time, thousands of records have piled up from form submissions, list imports, manual entry, and integrations that push data in whether you want it or not.

Now you're looking at a database where 15% of emails bounce, phone numbers are formatted six different ways, and the same company appears as "Acme Corp," "ACME Corporation," and "acme inc." Your reps don't trust the data. Marketing automation fails because records are incomplete. Reporting is useless because the data beneath it is garbage.

This guide covers how to systematically clean your Salesforce data, from assessment to execution. Some of this you can do with native tools. Some requires third-party apps or manual work. And some of it, honestly, might be worth outsourcing.

Before You Start: The Assessment

Cleaning data without understanding what's broken is like organizing a closet blindfolded. You need to know the scope of the problem before you can fix it.

Run a Data Quality Report

Start with the basics. For your Contact and Lead objects, pull counts for:

  • Records with no email address
  • Records with no phone number
  • Records with blank key fields (Title, Company, etc.)
  • Records that haven't been touched in 12+ months
  • Records with bounced emails (if you track this)

For Accounts, check:

  • Accounts with no associated Contacts
  • Accounts with blank Industry or Employee Count
  • Accounts that share the same (or nearly the same) Account Name

This gives you a baseline. You might find that 8% of records are missing emails, 23% have no phone, and 12% haven't been updated since 2021. Now you know what you're dealing with.
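A baseline like this can be computed from an exported report. The sketch below is one minimal way to do it, assuming the export has been loaded as a list of dictionaries keyed by the standard field names Email, Phone, Title, and LastModifiedDate. The function names and the sample rows are illustrative, and "untouched" is approximated here by LastModifiedDate, which may not match how your org defines activity.

```python
from datetime import date, timedelta

def is_blank(value):
    """Treat None, empty strings, and whitespace-only strings as missing."""
    return value is None or str(value).strip() == ""

def quality_baseline(records, today, stale_after_days=365):
    """Return the percentage of records that fail each baseline check."""
    total = len(records)
    if total == 0:
        return {}
    cutoff = today - timedelta(days=stale_after_days)
    counts = {
        "no_email": sum(is_blank(r.get("Email")) for r in records),
        "no_phone": sum(is_blank(r.get("Phone")) for r in records),
        "no_title": sum(is_blank(r.get("Title")) for r in records),
        # Assumption: LastModifiedDate is an ISO 8601 string such as 2025-11-02T10:00:00Z.
        "untouched_12mo": sum(
            date.fromisoformat(r["LastModifiedDate"][:10]) < cutoff for r in records
        ),
    }
    return {name: round(100 * n / total, 1) for name, n in counts.items()}

# Made-up sample rows standing in for a Contact export.
sample = [
    {"Email": "a@example.com", "Phone": "", "Title": "CEO", "LastModifiedDate": "2025-11-02T10:00:00Z"},
    {"Email": "", "Phone": "555-123-4567", "Title": "", "LastModifiedDate": "2021-03-15T08:30:00Z"},
    {"Email": "c@example.com", "Phone": None, "Title": "VP Sales", "LastModifiedDate": "2025-06-01T12:00:00Z"},
    {"Email": "d@example.com", "Phone": "5551234567", "Title": "CFO", "LastModifiedDate": "2023-01-20T09:00:00Z"},
]

report = quality_baseline(sample, today=date(2026, 1, 15))
print(report)
# {'no_email': 25.0, 'no_phone': 50.0, 'no_title': 25.0, 'untouched_12mo': 50.0}
```

Saving the output each time you run it gives you the trend line you need for later audits.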

Identify Your Problem Categories

Salesforce data problems generally fall into five buckets:

1. Duplicates. The same person or company appearing multiple times, often with slight variations. "John Smith" at "Acme" exists three times because one record has his work email, one has his personal email, and one was imported from a list with no email at all.

2. Invalid Data. Emails that bounce. Phone numbers that don't dial. Addresses that don't exist. This data isn't just useless; it actively hurts you when you try to use it.

3. Incomplete Data. Records missing key fields. A Contact with no phone number. An Account with no industry. A Lead with no company name. You can't route, score, or segment what you can't see.

4. Inconsistent Data. The same value expressed different ways. "VP of Sales" versus "Vice President, Sales" versus "Sales VP." "California" versus "CA" versus "Calif." This breaks reporting and automation logic.

5. Stale Data. Information that was accurate once but isn't anymore. The VP of Sales who left 18 months ago. The company that was acquired. The email that worked until they changed domains. Data decays at roughly 30% per year. If that loss compounds, a two-year-old record has about a coin-flip chance of being wrong, and a three-year-old record is more likely wrong than right.
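To see what a 30% annual decay rate implies, here is a small sketch that assumes the loss compounds year over year. That is a simplifying assumption, since real decay varies by field and industry, and the function name is illustrative.

```python
def share_still_accurate(years, annual_decay=0.30):
    """Fraction of records expected to still be accurate after `years`,
    assuming the same share of the remaining good records goes bad each year."""
    return (1 - annual_decay) ** years

for years in (1, 2, 3):
    print(years, round(share_still_accurate(years), 2))
# 1 0.7
# 2 0.49
# 3 0.34
```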

Step 1: Handle Duplicates First

Duplicates are the foundation problem. Cleaning everything else is pointless if you're going to merge records later and lose your work. Start here.

Use Salesforce Duplicate Management

Salesforce has built-in duplicate rules and matching rules. They're not perfect, but they're a starting point.

Go to Setup > Duplicate Management > Duplicate Rules. Create rules for Contacts, Leads, and Accounts. The standard matching rules work on exact email match, exact name match, or fuzzy name matching.

The limitation: Salesforce's native matching is weak on fuzzy logic. It won't catch "Acme Corp" and "ACME Corporation" as duplicates unless you configure custom matching rules. For companies with messy data, you'll need more.

Consider Third-Party Tools

Apps like Cloudingo, DemandTools, or Duplicate Check offer more sophisticated matching. They can:

  • Match on normalized company names (stripping "Inc," "LLC," etc.)
  • Match on fuzzy name variations (Robert/Bob, William/Bill)
  • Match across objects (finding the Lead that's already a Contact)
  • Batch process thousands of records at once

If you have more than 10,000 records and serious duplicate problems, a tool will save you weeks of manual work.
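If you want to gauge the size of the problem before buying a tool, normalized-name matching is easy to approximate on an export. The sketch below is a rough illustration, not how any of the tools above work internally: the suffix list is a small assumed sample, the record IDs are made up, and it only catches names that become identical after normalization.

```python
import re
from collections import defaultdict

# Assumed, incomplete list of legal suffixes to ignore when comparing names.
LEGAL_SUFFIXES = {"inc", "incorporated", "corp", "corporation", "co",
                  "company", "llc", "ltd", "limited"}

def normalize_company(name):
    """Lowercase, drop punctuation, and strip trailing legal suffixes."""
    words = re.sub(r"[^a-z0-9\s]", " ", name.lower()).split()
    while len(words) > 1 and words[-1] in LEGAL_SUFFIXES:
        words.pop()
    return " ".join(words)

def group_duplicates(accounts):
    """Group account IDs whose names normalize to the same key."""
    groups = defaultdict(list)
    for account in accounts:
        groups[normalize_company(account["Name"])].append(account["Id"])
    return {key: ids for key, ids in groups.items() if len(ids) > 1}

# Hypothetical export rows.
accounts = [
    {"Id": "001A", "Name": "Acme Corp"},
    {"Id": "001B", "Name": "ACME Corporation"},
    {"Id": "001C", "Name": "acme inc."},
    {"Id": "001D", "Name": "Globex LLC"},
]

print(group_duplicates(accounts))
# {'acme': ['001A', '001B', '001C']}
```

Treat the groups as candidates for review rather than confirmed duplicates, since two different companies can share a normalized name.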

The Merge Process

Once you've identified duplicates, you need a merge strategy. The key question: which record survives, and what data gets preserved?

Best practice: Keep the record with the most complete data, the most recent activity, or the oldest creation date (depending on your business). Merge the other record's data into it, preserving anything the survivor is missing.
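One way to make that rule concrete is to plan each merge offline before touching Salesforce. The sketch below assumes "most complete" means the most populated fields, with ties going to the oldest CreatedDate. Your business may weigh these differently, and the sample records are invented. It only produces a plan and does not call Salesforce.

```python
def completeness(record):
    """Count the fields that actually hold a value."""
    return sum(1 for value in record.values() if value not in (None, ""))

def plan_merge(duplicates):
    """Pick a survivor and fill its blanks from the other records.

    Returns the merged field values and the IDs of the records that
    would be merged away.
    """
    # Most populated fields wins; ISO dates sort correctly as strings,
    # so ties go to the earliest CreatedDate.
    survivor = min(duplicates, key=lambda r: (-completeness(r), r["CreatedDate"]))
    merged = dict(survivor)
    for other in duplicates:
        if other is survivor:
            continue
        for field, value in other.items():
            if merged.get(field) in (None, "") and value not in (None, ""):
                merged[field] = value
    merged_away = [r["Id"] for r in duplicates if r is not survivor]
    return merged, merged_away

# Hypothetical duplicate pair.
dupes = [
    {"Id": "003A", "Email": "john@acme.example", "Phone": "", "Title": "VP Sales", "CreatedDate": "2019-04-01"},
    {"Id": "003B", "Email": "", "Phone": "+15551234567", "Title": "", "CreatedDate": "2022-08-15"},
]

merged, merged_away = plan_merge(dupes)
print(merged["Id"], merged["Phone"], merged_away)
# 003A +15551234567 ['003B']
```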

Watch Out

Merging deletes the records that don't survive. This affects workflows, automation history, and reporting. Always export a backup before bulk merging, and test with a small batch first.

Step 2: Validate and Fix Invalid Data

Once duplicates are handled, clean up the data that's actively wrong.

Email Validation

Bad emails hurt in multiple ways. Bounces damage your sender reputation. Invalid addresses waste marketing spend. And reps lose credibility when they send to dead addresses.

You can validate emails with tools like NeverBounce, ZeroBounce, or Kickbox. They check whether addresses are:

  • Valid (deliverable)
  • Invalid (hard bounce)
  • Risky (catch-all domains, temporary addresses)
  • Unknown (server didn't respond)

Export your email list, run it through a validation service, then update Salesforce with the results. Most companies find 5-15% of their emails are invalid. Some find 30% or more.
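Before you pay a validation service per address, it can help to drop repeats and obviously malformed entries from the export. The sketch below is a cheap pre-filter under loose assumptions: the pattern only checks for a "something@something.tld" shape, it is not a full RFC check, and it says nothing about deliverability. Lowercasing is a practical simplification for deduplication.

```python
import re

# Loose shape check only: one "@", no spaces, and a dot in the domain part.
EMAIL_SHAPE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def split_by_syntax(emails):
    """Deduplicate addresses and separate plausible ones from malformed ones."""
    plausible, malformed = [], []
    seen = set()
    for raw in emails:
        email = (raw or "").strip().lower()
        if not email or email in seen:
            continue
        seen.add(email)
        (plausible if EMAIL_SHAPE.match(email) else malformed).append(email)
    return plausible, malformed

plausible, malformed = split_by_syntax(
    ["Jane@Example.com ", "jane@example.com", "bob@acme", "no-at-sign.com", "x@y.co"]
)
print(plausible)  # ['jane@example.com', 'x@y.co']
print(malformed)  # ['bob@acme', 'no-at-sign.com']
```

Only the plausible list needs to go to the validation service. The malformed list can be flagged in Salesforce right away.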

Phone Standardization

Phone numbers get entered in every format imaginable: (555) 123-4567, 555.123.4567, 5551234567, +1-555-123-4567. This matters for:

  • Click-to-dial integrations (which expect consistent formatting)
  • Deduplication (the same number formatted differently looks like different data)
  • International calling (missing country codes break everything)

Standardize to a single format. E.164 (+15551234567) is the international standard, but (555) 123-4567 is more human-readable for US numbers. Pick one and apply it everywhere.
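As an illustration, the sketch below converts the US formats listed above to E.164. It assumes every number is a 10-digit US or Canadian number, optionally prefixed with a 1. Anything else, including numbers with extensions, returns None so it can be reviewed by hand. For international data, a dedicated library such as phonenumbers is likely a better fit.

```python
import re

def to_e164_us(raw):
    """Return a US number in E.164 form, or None if it doesn't look like one."""
    digits = re.sub(r"\D", "", raw or "")
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # drop the leading country code
    if len(digits) != 10:
        return None  # too short, too long, or has an extension: review manually
    return "+1" + digits

for raw in ["(555) 123-4567", "555.123.4567", "5551234567", "+1-555-123-4567", "123-4567"]:
    print(raw, "->", to_e164_us(raw))
# (555) 123-4567 -> +15551234567
# 555.123.4567 -> +15551234567
# 5551234567 -> +15551234567
# +1-555-123-4567 -> +15551234567
# 123-4567 -> None
```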

Address Verification

If you use address data for territory assignment, shipping, or compliance, verify it. Services like Smarty (formerly SmartyStreets) or Melissa can standardize addresses to USPS format and flag addresses that don't exist.

Step 3: Fill in Missing Data

Incomplete records are almost as bad as wrong records. You can't route leads by company size if the field is blank. You can't personalize emails by job title if half your contacts have no title.

Identify Critical Gaps

Not all fields matter equally. Focus on the fields your business actually uses for:

  • Lead routing and assignment
  • Lead scoring
  • Marketing segmentation
  • Sales prioritization
  • Reporting and forecasting

For most B2B companies, the critical fields are: Email, Phone, Title, Company, Industry, Employee Count, and Location. If these are incomplete, everything downstream breaks.

Enrichment Options

You have a few ways to fill gaps:

Manual research. Slow but accurate. Works for high-value accounts where you need perfect data. Doesn't scale past a few hundred records.

Data enrichment tools. Services like ZoomInfo, Clearbit, Apollo, or Cognism can append firmographic and contact data. They work by matching your records against their databases. Match rates vary (typically 60-90% for US B2B data), and accuracy varies by vendor.

Data enrichment services. If you don't want to buy a platform subscription, some companies (including us) will enrich your data as a one-time project. You get the clean data without the ongoing contract.

Step 4: Standardize Inconsistent Data

Standardization is the tedious middle child of data cleaning. It's less dramatic than finding duplicates or invalid emails, but inconsistent data quietly breaks everything from reports to automation.

Job Titles

People enter job titles however they want. Your database might have:

  • VP Sales
  • VP of Sales
  • Vice President Sales
  • Vice President of Sales
  • Vice President, Sales
  • Sales VP

These are all the same title, but your automation doesn't know that. Your "target VP and above" campaign misses half its audience.

Create a standardization map that converts variations to a canonical form. Then use Data Loader or a cleaning tool to apply it across your database.
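A standardization map can be as simple as a lookup table keyed on a cleaned-up version of the title. The sketch below covers only the variations listed above, and the canonical form "VP of Sales" is an arbitrary choice for illustration. Titles that aren't in the map pass through unchanged so nothing is overwritten by accident.

```python
import re

# Keys are lowercase with punctuation removed; values are the canonical form.
TITLE_MAP = {
    "vp sales": "VP of Sales",
    "vp of sales": "VP of Sales",
    "vice president sales": "VP of Sales",
    "vice president of sales": "VP of Sales",
    "sales vp": "VP of Sales",
}

def title_key(title):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    return " ".join(re.sub(r"[^a-z0-9\s]", " ", title.lower()).split())

def standardize_title(title):
    """Return the canonical title, or the original if no mapping exists."""
    return TITLE_MAP.get(title_key(title), title)

print(standardize_title("Vice President, Sales"))  # VP of Sales
print(standardize_title("Sales VP"))               # VP of Sales
print(standardize_title("Chief Revenue Officer"))  # Chief Revenue Officer
```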

Industry Values

Industry is even messier, especially if you've imported lists from multiple sources. You might have NAICS codes, SIC codes, free-text industry names, and Salesforce's default picklist values all mixed together.

Pick a standard (Salesforce's default picklist is fine for most companies) and map everything to it. This is manual work, but you only have to do it once per unique value.

State and Country

Enable Salesforce's State and Country Picklists if you haven't already. They force standardization at entry time, preventing "California" versus "CA" versus "calif" problems in the future.

Existing values need to be cleaned up first so they match the picklist. A simple find-and-replace can handle most variations.
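For a larger export, the find-and-replace can be scripted. The sketch below assumes a small alias table (only two states are shown, so extend it for real data) and collects anything it can't map so a person can decide what to do with it, rather than guessing.

```python
from collections import Counter

# Assumed, deliberately tiny alias table: lowercase, periods removed.
STATE_ALIASES = {
    "california": "CA", "calif": "CA", "ca": "CA",
    "new york": "NY", "ny": "NY",
}

def normalize_states(values):
    """Map free-text state values to two-letter codes.

    Returns the cleaned list (None where no mapping exists) and a count
    of the unmapped originals for manual review.
    """
    cleaned, unmapped = [], Counter()
    for value in values:
        key = (value or "").strip().lower().replace(".", "")
        code = STATE_ALIASES.get(key)
        if code is None:
            unmapped[value] += 1
        cleaned.append(code)
    return cleaned, unmapped

cleaned, unmapped = normalize_states(["California", "CA", "calif.", "N.Y.", "Cali", "Cali"])
print(cleaned)         # ['CA', 'CA', 'CA', 'NY', None, None]
print(dict(unmapped))  # {'Cali': 2}
```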

Step 5: Archive or Delete Stale Records

The final step: deal with records that are too old to trust.

This is where companies get nervous. Nobody wants to delete data that might be useful someday. But keeping dead records has real costs:

  • They inflate your record counts and storage
  • They pollute reports and dashboards
  • Sales might waste time on leads who left their company years ago
  • Marketing emails to dead addresses hurt deliverability

Define Your Criteria

What makes a record "stale"? Common criteria:

  • No activity in 24+ months
  • Email bounced and no phone number
  • Company no longer exists (acquired, closed)
  • Contact confirmed to have left the company
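Criteria like these are easier to apply consistently when they are written down as a rule. The sketch below is one possible encoding for Contacts. LastActivityDate, CreatedDate, IsEmailBounced, and Phone are standard fields, while Left_Company__c is a hypothetical custom checkbox. Falling back to CreatedDate when a record has no activity at all is an assumption you may want to change.

```python
from datetime import date

def stale_reasons(record, today, max_inactive_days=730):
    """Return the list of reasons a record counts as stale (empty if it doesn't)."""
    reasons = []
    # Assumption: records with no logged activity are judged by their creation date.
    last_touch = record.get("LastActivityDate") or record["CreatedDate"]
    if (today - date.fromisoformat(last_touch[:10])).days > max_inactive_days:
        reasons.append("no activity in 24+ months")
    if record.get("IsEmailBounced") and not record.get("Phone"):
        reasons.append("email bounced and no phone")
    if record.get("Left_Company__c"):  # hypothetical custom field
        reasons.append("left the company")
    return reasons

today = date(2026, 1, 15)
old_contact = {"LastActivityDate": "2023-06-01", "CreatedDate": "2020-01-01",
               "IsEmailBounced": True, "Phone": "", "Left_Company__c": False}
active_contact = {"LastActivityDate": "2025-10-10", "CreatedDate": "2024-02-01",
                  "IsEmailBounced": False, "Phone": "+15551234567", "Left_Company__c": False}

print(stale_reasons(old_contact, today))
# ['no activity in 24+ months', 'email bounced and no phone']
print(stale_reasons(active_contact, today))
# []
```

Keeping the reasons alongside each record makes the archive easier to audit later.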

Archive, Don't Delete

Unless you have compliance reasons to delete, archive stale records instead. Export them to a CSV (with all related data), then either delete them or move them to a "Stale" status that excludes them from active campaigns and lists.

This way, you can restore them if needed without losing historical context.

Maintaining Clean Data Going Forward

Cleaning data is a project. Keeping it clean is a process.

Validation Rules

Salesforce validation rules can enforce data quality at the point of entry. Require email format validation. Require certain fields on record creation. Block obviously bad data (phone numbers with wrong digit counts, emails from personal domains if you're B2B-only).

Don't go overboard. Too many required fields slow down data entry and lead to reps putting in junk data to bypass the rules.

Regular Audits

Run your data quality report monthly or quarterly. Track the metrics over time. If duplicate rates are climbing, something is broken in your processes. If email validity is dropping, you're overdue for a re-validation.

Integration Hygiene

Every system that pushes data into Salesforce is a potential source of garbage. Review your integrations periodically. Make sure marketing automation, forms, and third-party apps are mapping fields correctly and not creating duplicates.

When to DIY vs. Outsource

Some of this you can do yourself with Salesforce's native tools, spreadsheets, and some patience. Some of it requires specialized tools or expertise.

Do it yourself if:

  • You have fewer than 10,000 records
  • Your problems are mostly straightforward (obvious duplicates, simple formatting)
  • You have someone with time to learn the tools and do the work

Consider outsourcing if:

  • You have 50,000+ records with complex problems
  • You need enrichment from multiple sources
  • You've tried cleaning before and the problems came back
  • You don't have weeks to dedicate to a cleanup project

We clean Salesforce data for a living. If you want help, get in touch. If you want to do it yourself, everything above should get you started.

Common Questions

How often should I clean my Salesforce data?

Most companies benefit from quarterly cleanings. Contact data decays at roughly 30% per year, so even data that was perfect 12 months ago is significantly degraded now.

Should I use Salesforce's built-in duplicate management or a third-party tool?

Salesforce's native tools work for preventing new duplicates but struggle with fuzzy matching on existing records. For serious cleanup, you'll need third-party tools or manual review.

How do I clean data without breaking automations?

Export your data with the Salesforce record IDs included, clean it externally, then load it back as an update matched on those IDs rather than as new records. This preserves relationships and automation references. Test with a small batch first.

What's the ROI of cleaning Salesforce data?

Sales reps waste an average of 13 hours per week on data problems. Email deliverability improves immediately after removing bounces. Automation that was failing starts working. The ROI is usually obvious within a quarter.
