How to Audit Your Salesforce Data Quality

Everyone knows their CRM data has problems. Nobody knows exactly how bad those problems are. "We probably have some duplicates" and "I think a lot of emails are outdated" aren't actionable statements.

A data quality audit gives you specific numbers: your duplicate rate is 14%, 23% of emails are invalid, 67% of accounts are missing industry data. With numbers, you can prioritize. You can calculate ROI on cleanup. You can make the case for investment.

Here's how to run a comprehensive Salesforce data quality audit.

The Five Dimensions of Data Quality

Data quality isn't a single metric. It breaks down into measurable dimensions:

Completeness: What percentage of records have values for required fields? If 40% of your Contacts are missing phone numbers, that's a completeness problem.

Accuracy: Are the values that exist correct? An email that's formatted properly but bounces is an accuracy problem. A company listed as "Tech" when it's actually healthcare is an accuracy problem.

Consistency: Are similar values formatted the same way? "United States", "US", "USA", and "U.S.A." all in your Country field is a consistency problem.

Uniqueness: How many duplicate records exist? If the same company appears as three different Accounts, that's a uniqueness problem.

Timeliness: How current is the data? A contact's job title from three years ago might be wrong now. A commonly cited estimate is that contact data decays by roughly 30% per year.

Account-Level Audit

Start with Accounts since they're the foundation of your B2B data model.

Completeness Metrics

Create a report that counts records with null/empty values for each key field:

Field | Why It Matters | Target Fill Rate
Account Name | Basic identification | 100%
Website/Domain | Deduplication, enrichment matching | >95%
Industry | Segmentation, ICP filtering | >85%
Employee Count | Company size for routing/scoring | >80%
Annual Revenue | Budget qualification | >60%
Billing Address | Territory assignment | >90%
Account Owner | Sales coverage | 100%

Uniqueness Metrics

Duplicate Accounts split history across records and distort forecasting. Here's how to measure your duplicate rate:

  1. Exact domain matches: How many Accounts share the same website/domain?
  2. Fuzzy name matches: How many Account names are similar enough to be duplicates? (Requires fuzzy matching logic)
  3. Orphan Accounts: How many Accounts have zero Contacts? (May indicate data entry errors or abandoned records)

Benchmarks:
Good: <5% duplicates
Needs work: 5-10% duplicates
Problem: >10% duplicates
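
As a rough illustration of the exact-domain check and the benchmark bands above, the Python sketch below reduces each Account's Website to a bare domain, groups Accounts by it, and reports a duplicate rate. The record shape (dicts with Id and Website keys, as you might get from a CSV export), the helper names, and the choice to count every record beyond the first in a group as a duplicate are assumptions for the example, not a standard definition.

```python
from collections import defaultdict
from urllib.parse import urlparse


def normalize_domain(website):
    """Reduce a Website value to a bare lowercase domain, or None if blank."""
    if not website or not website.strip():
        return None
    value = website.strip().lower()
    # urlparse only fills in the host when a scheme is present.
    if "://" not in value:
        value = "http://" + value
    host = urlparse(value).netloc.split(":")[0]
    if host.startswith("www."):
        host = host[4:]
    return host or None


def domain_duplicate_rate(accounts):
    """Return (rate, groups); groups maps a shared domain to its Account Ids."""
    by_domain = defaultdict(list)
    for account in accounts:
        domain = normalize_domain(account.get("Website"))
        if domain:
            by_domain[domain].append(account["Id"])
    groups = {d: ids for d, ids in by_domain.items() if len(ids) > 1}
    # Assumption: every record beyond the first in a group is a duplicate.
    extra = sum(len(ids) - 1 for ids in groups.values())
    rate = extra / len(accounts) if accounts else 0.0
    return rate, groups


def benchmark(rate):
    """Map a duplicate rate to the bands listed above."""
    if rate < 0.05:
        return "Good"
    if rate <= 0.10:
        return "Needs work"
    return "Problem"


# Made-up sample records for illustration only.
accounts = [
    {"Id": "001A", "Website": "https://www.acme.com"},
    {"Id": "001B", "Website": "acme.com/about"},
    {"Id": "001C", "Website": "http://globex.example"},
    {"Id": "001D", "Website": ""},
]
rate, groups = domain_duplicate_rate(accounts)
print(f"{rate:.1%}", benchmark(rate), groups)
```

Accounts with a blank Website are left out of the grouping but still count in the denominator, so a low Website fill rate will understate the true duplicate rate.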

Consistency Metrics

Check categorical fields for variant values:

  • How many unique values exist in the Industry field? (20 is reasonable; 200 suggests inconsistency)
  • How many Country variations exist? (Should map to a standard list)
  • Are account names standardized? (Check for "Inc." vs "Inc" vs "Incorporated" variations)
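
One way to approach the distinct-value checks above is to count raw values and then see which ones collapse together once case, spacing, and punctuation are ignored. The sketch below does that with the standard library; the function name and sample values are made up for illustration. It will catch "USA" vs "U.S.A." but not "US" vs "United States", which needs an explicit mapping to your standard list.

```python
from collections import Counter, defaultdict


def variant_report(values):
    """Return (distinct_count, variants) for one categorical field.

    variants maps a normalized key to the raw values that collapse onto it,
    keeping only keys with more than one raw spelling.
    """
    raw_counts = Counter(v for v in values if v and v.strip())
    by_key = defaultdict(set)
    for raw in raw_counts:
        # Keep letters and digits only, so case, spaces, and dots are ignored.
        key = "".join(ch for ch in raw.lower() if ch.isalnum())
        by_key[key].add(raw)
    variants = {k: sorted(v) for k, v in by_key.items() if len(v) > 1}
    return len(raw_counts), variants


# Made-up Country values for illustration only.
countries = ["United States", "US", "USA", "U.S.A.", "usa", "Canada", "canada ", None]
distinct, variants = variant_report(countries)
print(distinct, variants)
```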

Contact-Level Audit

Contacts are where most data quality issues live, and where the impact is most directly felt.

Completeness Metrics

Field | Why It Matters | Target Fill Rate
Email | Primary communication channel | >98%
First Name | Personalization | 100%
Last Name | Identification | 100%
Title | Routing, scoring, personalization | >90%
Phone | Outbound calling | >60%
Account (lookup) | Company relationship | 100%
Mailing Address | Direct mail, events | >50%

Email Quality Metrics

Email completeness isn't enough. You need to assess email quality:

  • Format validity: Does the email follow proper syntax?
  • Domain validity: Does the domain exist and accept mail?
  • Mailbox validity: Does the specific mailbox exist?
  • Role-based rate: What percentage are info@, sales@, support@ addresses?
  • Disposable rate: What percentage are throwaway email services?

Run your email list through a validation service to get these metrics.
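
Some of these metrics can be estimated locally before you pay for validation. The sketch below covers only format validity, role-based rate, and disposable rate; domain and mailbox validity need network lookups, which is what a validation service is for. The regular expression is deliberately loose, and the role prefixes and the one-entry disposable domain list are placeholders, not a complete reference.

```python
import re

# Loose syntax check: something@something.tld with no spaces (an assumption,
# not full RFC-level validation).
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
# Role prefixes taken from the examples above; extend for your own data.
ROLE_PREFIXES = {"info", "sales", "support"}
# Placeholder; a real audit would use a maintained disposable-domain list.
DISPOSABLE_DOMAINS = {"mailinator.com"}


def email_metrics(emails):
    """Return format-valid, role-based, and disposable rates over populated emails."""
    populated = [e.strip().lower() for e in emails if e and e.strip()]
    valid = [e for e in populated if EMAIL_RE.match(e)]
    role = [e for e in valid if e.split("@")[0] in ROLE_PREFIXES]
    disposable = [e for e in valid if e.split("@")[1] in DISPOSABLE_DOMAINS]
    total = len(populated)

    def rate(count):
        return count / total if total else 0.0

    return {
        "format_valid": rate(len(valid)),
        "role_based": rate(len(role)),
        "disposable": rate(len(disposable)),
    }


# Made-up sample values for illustration only.
sample = ["jane.doe@acme.com", "info@acme.com", "not-an-email", "bob@mailinator.com", "", None]
metrics = email_metrics(sample)
print(metrics)
```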

Email quality benchmarks:
Good: >85% valid
Needs work: 70-85% valid
Problem: <70% valid

Uniqueness Metrics

Contact duplicates are tricky because the same person might have multiple legitimate records (changed companies, different email addresses).

Check for duplicates using:

  • Exact email matches: Same email on multiple Contact records
  • Same name + same Account: Strong duplicate indicator
  • Same name + same domain: Likely duplicate even if different email address
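
The three checks above can be sketched as three grouping keys over the same export. This assumes Contact rows as dicts keyed by the usual API field names (Id, FirstName, LastName, Email, AccountId); the helper names are hypothetical. Treat the output as candidates for review, not confirmed duplicates.

```python
from collections import defaultdict


def norm(value):
    return (value or "").strip().lower()


def full_name(contact):
    first, last = norm(contact.get("FirstName")), norm(contact.get("LastName"))
    return f"{first} {last}" if first and last else None


def email_key(contact):
    return norm(contact.get("Email")) or None


def name_account_key(contact):
    name, account = full_name(contact), contact.get("AccountId")
    return (name, account) if name and account else None


def name_domain_key(contact):
    name = full_name(contact)
    email = norm(contact.get("Email"))
    domain = email.split("@", 1)[1] if "@" in email else ""
    return (name, domain) if name and domain else None


def duplicate_groups(contacts, key_fn):
    """Group Contact Ids by key and keep only keys shared by 2+ records."""
    groups = defaultdict(list)
    for contact in contacts:
        key = key_fn(contact)
        if key is not None:
            groups[key].append(contact["Id"])
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


# Made-up sample records for illustration only.
contacts = [
    {"Id": "003A", "FirstName": "Jane", "LastName": "Doe", "Email": "jane.doe@acme.com", "AccountId": "001A"},
    {"Id": "003B", "FirstName": "jane", "LastName": "doe", "Email": "jdoe@acme.com", "AccountId": "001A"},
    {"Id": "003C", "FirstName": "Jane", "LastName": "Doe", "Email": "Jane.Doe@acme.com", "AccountId": "001B"},
    {"Id": "003D", "FirstName": "Raj", "LastName": "Patel", "Email": "raj@globex.example", "AccountId": "001C"},
]
by_email = duplicate_groups(contacts, email_key)
by_name_account = duplicate_groups(contacts, name_account_key)
by_name_domain = duplicate_groups(contacts, name_domain_key)
print(by_email, by_name_account, by_name_domain, sep="\n")
```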

Timeliness Metrics

How stale is your Contact data?

  • Last Activity Date distribution: What percentage of Contacts have had activity in the last 6 months? 12 months? 24 months?
  • Last Modified Date distribution: When was the record data last updated?
  • Created Date age: Records created 3+ years ago without updates are likely outdated

Contacts with no activity in 24+ months and no updates in 12+ months are high-risk for data decay.
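
A minimal sketch of the staleness checks, assuming the date fields have already been parsed into Python date objects. Months are approximated in days, and a blank Last Activity Date is treated as "no activity"; both are simplifying assumptions you may want to change.

```python
from datetime import date

# Month thresholds approximated in days (an assumption for simplicity).
SIX_MONTHS, TWELVE_MONTHS, TWENTY_FOUR_MONTHS = 183, 365, 730


def days_since(value, today):
    """Days between a date and today, or None when the field is blank."""
    return None if value is None else (today - value).days


def activity_distribution(contacts, today):
    """Share of Contacts with activity inside each lookback window."""
    total = len(contacts)
    windows = (("6m", SIX_MONTHS), ("12m", TWELVE_MONTHS), ("24m", TWENTY_FOUR_MONTHS))
    result = {}
    for label, limit in windows:
        active = 0
        for contact in contacts:
            age = days_since(contact.get("LastActivityDate"), today)
            if age is not None and age <= limit:
                active += 1
        result[label] = active / total if total else 0.0
    return result


def is_high_risk(contact, today):
    """No activity in 24+ months and no update in 12+ months."""
    activity_age = days_since(contact.get("LastActivityDate"), today)
    modified_age = days_since(contact.get("LastModifiedDate"), today)
    # A blank date counts as older than the threshold (an assumption).
    stale_activity = activity_age is None or activity_age >= TWENTY_FOUR_MONTHS
    stale_record = modified_age is None or modified_age >= TWELVE_MONTHS
    return stale_activity and stale_record


# Made-up sample records for illustration only.
today = date(2025, 1, 1)
contacts = [
    {"Id": "003A", "LastActivityDate": date(2024, 10, 1), "LastModifiedDate": date(2024, 11, 1)},
    {"Id": "003B", "LastActivityDate": date(2022, 6, 1), "LastModifiedDate": date(2023, 5, 1)},
    {"Id": "003C", "LastActivityDate": None, "LastModifiedDate": date(2024, 9, 1)},
]
dist = activity_distribution(contacts, today)
high_risk = [c["Id"] for c in contacts if is_high_risk(c, today)]
print(dist, high_risk)
```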

Lead-Level Audit

If you use Leads, apply similar metrics:

  • Completeness: Fill rates for key fields (email, company, title)
  • Email validity: Same validation as Contacts
  • Lead-to-Contact duplicates: How many Leads already exist as Contacts?
  • Lead aging: How long do Leads sit before conversion or disqualification?
  • Source quality: Which Lead Sources have the worst data quality?
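
Two of these checks are easy to sketch from exports: Leads whose email already exists on a Contact, and a per-source quality measure. The example below uses blank-email rate as a stand-in for "data quality" by source, which is an assumption; you could swap in any metric from this audit. Function names and sample rows are hypothetical.

```python
from collections import defaultdict


def norm_email(value):
    return (value or "").strip().lower()


def leads_already_contacts(leads, contacts):
    """Ids of Leads whose email matches an existing Contact email."""
    contact_emails = {norm_email(c.get("Email")) for c in contacts}
    contact_emails.discard("")
    return [lead["Id"] for lead in leads if norm_email(lead.get("Email")) in contact_emails]


def blank_email_rate_by_source(leads):
    """Share of Leads with no email, grouped by LeadSource."""
    totals = defaultdict(int)
    blanks = defaultdict(int)
    for lead in leads:
        source = lead.get("LeadSource") or "(none)"
        totals[source] += 1
        if not norm_email(lead.get("Email")):
            blanks[source] += 1
    return {source: blanks[source] / totals[source] for source in totals}


# Made-up sample records for illustration only.
contacts = [{"Id": "003A", "Email": "Jane.Doe@acme.com"}]
leads = [
    {"Id": "00QA", "Email": "jane.doe@acme.com", "LeadSource": "Web"},
    {"Id": "00QB", "Email": "", "LeadSource": "Trade Show"},
    {"Id": "00QC", "Email": "new@initech.example", "LeadSource": "Web"},
    {"Id": "00QD", "Email": None, "LeadSource": "Trade Show"},
]
overlap = leads_already_contacts(leads, contacts)
by_source = blank_email_rate_by_source(leads)
print(overlap, by_source)
```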

Opportunity-Level Audit

Opportunity data quality directly affects forecasting and reporting:

  • Stage consistency: Are stages being used correctly or skipped?
  • Close Date accuracy: How often do Close Dates slip past the original date?
  • Amount fill rate: What percentage of Opportunities have amounts?
  • Contact Roles: What percentage of Opportunities have associated Contacts?
  • Orphan Opportunities: Opportunities without Accounts or with deleted Accounts
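
Stage consistency is the least obvious of these to measure. One possible approach, assuming you can export each Opportunity's stage changes in chronological order, is to compare that sequence against your expected stage order and list the stages that were jumped over. The stage names below are hypothetical; replace them with your own sales process.

```python
# Hypothetical stage order; replace with the stages from your sales process.
STAGE_ORDER = ["Prospecting", "Qualification", "Proposal", "Negotiation", "Closed Won"]


def skipped_stages(history, stage_order=STAGE_ORDER):
    """Return stages jumped over in a chronological list of stage names.

    Only forward jumps between consecutive entries are checked; an
    Opportunity that starts at a late stage is not flagged (an assumption).
    """
    position = {stage: index for index, stage in enumerate(stage_order)}
    seen = [position[stage] for stage in history if stage in position]
    skipped = set()
    for previous, current in zip(seen, seen[1:]):
        if current > previous + 1:
            skipped.update(stage_order[previous + 1:current])
    # Ignore stages that were visited at some other point in the history.
    return [s for s in stage_order if s in skipped and s not in history]


# Made-up sample histories for illustration only.
histories = {
    "006A": ["Prospecting", "Qualification", "Proposal", "Negotiation", "Closed Won"],
    "006B": ["Prospecting", "Proposal", "Closed Won"],
}
report = {opp_id: skipped_stages(h) for opp_id, h in histories.items()}
skip_rate = sum(1 for s in report.values() if s) / len(report)
print(report, skip_rate)
```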

Running the Audit: Practical Steps

Step 1: Export Your Data

Pull reports for each object you're auditing. Include all fields you want to measure. Export to CSV for analysis.

For large databases, you may need to use Data Loader or the Bulk API instead of standard reports.

Step 2: Calculate Completeness

For each field, count:

  • Total records
  • Records with blank/null values
  • Fill rate = (Total - Blank) / Total

This can be done in Excel with COUNTBLANK functions, or programmatically with Python/SQL.
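
For the programmatic route, here is a minimal Python sketch of the fill-rate formula above using only the standard library. It reads an in-memory CSV so the example is self-contained; in practice you would open your exported file instead. Whitespace-only values are treated as blank, which is an assumption.

```python
import csv
import io


def fill_rates(rows, fields):
    """Fill rate = (Total - Blank) / Total for each requested field."""
    total = len(rows)
    rates = {}
    for field in fields:
        blank = sum(1 for row in rows if not (row.get(field) or "").strip())
        rates[field] = (total - blank) / total if total else 0.0
    return rates


# Made-up export for illustration only.
sample_csv = """Id,Name,Industry,Website
001A,Acme,Manufacturing,acme.com
001B,Globex,,globex.example
001C,Initech,Software,
001D,Umbrella,,
"""
rows = list(csv.DictReader(io.StringIO(sample_csv)))
rates = fill_rates(rows, ["Name", "Industry", "Website"])
for field, rate in rates.items():
    print(f"{field}: {rate:.0%}")
```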

Step 3: Identify Duplicates

For exact matches (same email, same domain), simple sorting and comparison works.

For fuzzy matches (similar names, minor spelling variations), you'll need:

  • Fuzzy matching algorithms (Levenshtein distance, Jaro-Winkler)
  • Specialized tools (Cloudingo, DemandTools, RingLead)
  • Or manual review of sorted lists
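
To make the first option concrete, here is a small pure-Python Levenshtein distance with a length-normalized similarity score. The 0.85 threshold and the function names are assumptions to tune against your own data. The pairwise loop compares every name with every other name, so it only suits small lists; larger datasets usually need to be split into blocks first (for example by first letter or domain).

```python
def levenshtein(a, b):
    """Edit distance: minimum inserts, deletes, and substitutions to turn a into b."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def similarity(a, b):
    """1.0 for identical strings, approaching 0.0 as they diverge."""
    a, b = a.strip().lower(), b.strip().lower()
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest


def fuzzy_pairs(names, threshold=0.85):
    """Return (name, name, score) for every pair at or above the threshold."""
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            score = similarity(names[i], names[j])
            if score >= threshold:
                pairs.append((names[i], names[j], round(score, 2)))
    return pairs


# Made-up sample names for illustration only.
pairs = fuzzy_pairs(["Acme Corp", "Acme Corp.", "Globex"])
print(pairs)
```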

Step 4: Validate Emails

Upload your email list to a validation service. It will return results categorized by validity status.

Step 5: Assess Consistency

For picklist fields, pull the distinct value counts. High counts suggest inconsistency:

  • Country field with 50+ values probably has inconsistent entries
  • Industry field with 200+ values definitely needs cleanup

For free-text fields like company names, look for obvious variations (Inc vs Inc. vs Incorporated).
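
A quick way to surface those variations is to strip punctuation and trailing legal suffixes, then group names that end up identical. The suffix list below is illustrative and incomplete, and the function name is hypothetical; use the result to find candidates for review, not to overwrite Account names automatically.

```python
import re
from collections import defaultdict

# Illustrative suffix list; extend it for the legal forms in your data.
SUFFIXES = {"inc", "incorporated", "llc", "ltd", "corp", "corporation", "co"}


def normalize_company_name(name):
    """Lowercase, drop punctuation, and strip trailing legal suffixes."""
    tokens = re.sub(r"[^\w\s]", " ", name.lower()).split()
    # Keep at least one token so a name like "Inc" does not become empty.
    while len(tokens) > 1 and tokens[-1] in SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


# Made-up sample names for illustration only.
names = ["Acme Inc", "Acme, Inc.", "Acme Incorporated", "Globex Corp."]
groups = defaultdict(list)
for name in names:
    groups[normalize_company_name(name)].append(name)
variants = {key: found for key, found in groups.items() if len(found) > 1}
print(variants)
```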

Step 6: Compile Results

Create a summary scorecard with all metrics:

Metric | Current | Target | Status
Account duplicate rate | 8.3% | <5% | Needs work
Contact email fill rate | 94.2% | >98% | Needs work
Contact email validity | 78.5% | >85% | Needs work
Account industry fill rate | 89.1% | >85% | Good

Prioritizing Cleanup

You can't fix everything at once. Prioritize based on:

Business impact: Email validity directly affects campaigns. Duplicate Accounts affect forecasting. Missing industry data affects segmentation. Rank by what's costing you most.

Effort to fix: Some issues (email validation) are straightforward. Others (duplicate merging) require more careful work. Balance quick wins with important fixes.

Compounding effect: Duplicates get worse over time as people add to both records. Bad data entry processes create ongoing problems. Fix root causes early.

A typical prioritization:

  1. Merge duplicate Accounts (affects everything else)
  2. Validate and flag invalid emails (highest immediate ROI)
  3. Merge duplicate Contacts (reduces confusion)
  4. Fill missing critical fields through enrichment
  5. Standardize categorical fields
  6. Archive or delete truly dead records

Ongoing Monitoring

A data quality audit isn't a one-time event. Set up ongoing monitoring:

Scheduled reports: Create Salesforce reports that surface data quality issues weekly or monthly.

Dashboard: Build a data quality dashboard with key metrics visible to RevOps and leadership.

Trend tracking: Are metrics improving or degrading over time? A declining email validity rate suggests a process problem.

Source analysis: Which Lead Sources, import files, or entry points create the worst data? Fix the sources, not just the symptoms.

When to Get Help

Running a comprehensive data quality audit takes time and expertise. Consider getting help if:

  • You've never done one: First-time audits benefit from an experienced perspective
  • Your database is large: 100K+ records means significant analysis work
  • You need to justify investment: An external audit carries more weight with leadership
  • You need benchmarks: How do you compare to similar companies?

At Verum, data audits are often the first step in our engagements. We assess your current state, identify priorities, and give you a clear picture of what cleanup will involve. If you just want the audit without the cleanup, that works too.
