How to Audit Your Salesforce Data Quality
Everyone knows their CRM data has problems. Nobody knows exactly how bad those problems are. "We probably have some duplicates" and "I think a lot of emails are outdated" aren't actionable statements.
A data quality audit gives you specific numbers: your duplicate rate is 14%, 23% of emails are invalid, 67% of accounts are missing industry data. With numbers, you can prioritize. You can calculate ROI on cleanup. You can make the case for investment.
Here's how to run a comprehensive Salesforce data quality audit.
The Five Dimensions of Data Quality
Data quality isn't a single metric. It breaks down into measurable dimensions:
Completeness: What percentage of records have values for required fields? If 40% of your Contacts are missing phone numbers, that's a completeness problem.
Accuracy: Are the values that exist correct? An email that's formatted properly but bounces is an accuracy problem. A company listed as "Tech" when it's actually healthcare is an accuracy problem.
Consistency: Are similar values formatted the same way? Finding "United States", "US", "USA", and "U.S.A." in the same Country field is a consistency problem.
Uniqueness: How many duplicate records exist? If the same company appears as three different Accounts, that's a uniqueness problem.
Timeliness: How current is the data? A contact's job title from three years ago might be wrong now. By a commonly cited estimate, contact data decays at roughly 30% per year.
Account-Level Audit
Start with Accounts since they're the foundation of your B2B data model.
Completeness Metrics
Create a report that counts records with null/empty values for each key field:
| Field | Why It Matters | Target Fill Rate |
|---|---|---|
| Account Name | Basic identification | 100% |
| Website/Domain | Deduplication, enrichment matching | >95% |
| Industry | Segmentation, ICP filtering | >85% |
| Employee Count | Company size for routing/scoring | >80% |
| Annual Revenue | Budget qualification | >60% |
| Billing Address | Territory assignment | >90% |
| Account Owner | Sales coverage | 100% |
Uniqueness Metrics
Duplicate Accounts cause significant problems. Here's how to measure your duplicate rate:
- Exact domain matches: How many Accounts share the same website/domain?
- Fuzzy name matches: How many Account names are similar enough to be duplicates? (Requires fuzzy matching logic)
- Orphan Accounts: How many Accounts have zero Contacts? (May indicate data entry errors or abandoned records)
Benchmarks:
Good: <5% duplicates
Needs work: 5-10% duplicates
Problem: >10% duplicates
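As a rough illustration of the exact-domain check and the benchmark bands above, the sketch below reduces website values to a bare domain, counts records that repeat a domain, and classifies the resulting rate. The function names and sample values are hypothetical, and counting every record beyond the first in a group as a duplicate is an assumption, not a standard definition.

```python
from collections import Counter
from urllib.parse import urlparse


def normalize_domain(website: str) -> str:
    """Reduce a website value to a bare lowercase domain (hypothetical helper)."""
    value = website.strip().lower()
    if not value:
        return ""
    # urlparse only fills netloc when a scheme is present, so add one if missing.
    if "://" not in value:
        value = "http://" + value
    host = urlparse(value).netloc.split(":")[0]
    return host[4:] if host.startswith("www.") else host


def duplicate_rate(websites: list[str]) -> float:
    """Share of records that repeat a domain already used by another record."""
    domains = [d for d in (normalize_domain(w) for w in websites) if d]
    if not domains:
        return 0.0
    counts = Counter(domains)
    extras = sum(n - 1 for n in counts.values())
    return extras / len(domains)


def classify(rate: float) -> str:
    """Map a duplicate rate to the Good / Needs work / Problem bands."""
    if rate < 0.05:
        return "Good"
    if rate <= 0.10:
        return "Needs work"
    return "Problem"


websites = ["https://www.acme.com", "acme.com/", "ACME.com", "globex.io", ""]
rate = duplicate_rate(websites)
print(f"{rate:.1%} -> {classify(rate)}")
```

Blank websites are excluded from the denominator here, so the rate describes only Accounts that have a domain to compare.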
Consistency Metrics
Check categorical fields for variant values:
- How many unique values exist in the Industry field? (20 is reasonable; 200 suggests inconsistency)
- How many Country variations exist? (Should map to a standard list)
- Are account names standardized? (Check for "Inc." vs "Inc" vs "Incorporated" variations)
Contact-Level Audit
Contacts are where most data quality issues live, and where the impact is most directly felt.
Completeness Metrics
| Field | Why It Matters | Target Fill Rate |
|---|---|---|
| Email | Primary communication channel | >98% |
| First Name | Personalization | 100% |
| Last Name | Identification | 100% |
| Title | Routing, scoring, personalization | >90% |
| Phone | Outbound calling | >60% |
| Account (lookup) | Company relationship | 100% |
| Mailing Address | Direct mail, events | >50% |
Email Quality Metrics
Email completeness isn't enough. You need to assess email quality:
- Format validity: Does the email follow proper syntax?
- Domain validity: Does the domain exist and accept mail?
- Mailbox validity: Does the specific mailbox exist?
- Role-based rate: What percentage are info@, sales@, support@ addresses?
- Disposable rate: What percentage are throwaway email services?
Run your email list through a validation service to get these metrics.
Email quality benchmarks:
Good: >85% valid
Needs work: 70-85% valid
Problem: <70% valid
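Domain and mailbox checks need a validation service, but format validity and the role-based rate can be approximated locally first. The sketch below is one way to do that. The regular expression is deliberately simple and does not implement the full email syntax rules, and the prefix list beyond info@, sales@, and support@ is an assumption.

```python
import re

# Deliberately simple pattern: one "@", no whitespace, and a dot in the domain.
# It is an assumption for illustration and does not implement full RFC 5322.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s.]+$")

# The three prefixes named in the text plus two common ones (assumed list).
ROLE_PREFIXES = {"info", "sales", "support", "admin", "contact"}


def is_well_formed(email: str) -> bool:
    return bool(EMAIL_PATTERN.match(email.strip()))


def is_role_based(email: str) -> bool:
    local_part = email.strip().lower().split("@")[0]
    return local_part in ROLE_PREFIXES


def email_metrics(emails: list[str]) -> dict[str, float]:
    """Return format-validity and role-based rates over non-blank emails."""
    present = [e for e in emails if e and e.strip()]
    if not present:
        return {"format_valid_rate": 0.0, "role_based_rate": 0.0}
    well_formed = [e for e in present if is_well_formed(e)]
    role_based = [e for e in well_formed if is_role_based(e)]
    return {
        "format_valid_rate": len(well_formed) / len(present),
        "role_based_rate": len(role_based) / len(present),
    }


sample = ["jane.doe@acme.com", "info@globex.io", "not-an-email", "bob@localhost", ""]
print(email_metrics(sample))
```

A well-formed address can still bounce, so these numbers are an upper bound on validity rather than a substitute for the service's results.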
Uniqueness Metrics
Contact duplicates are tricky because the same person might have multiple legitimate records (changed companies, different email addresses).
Check for duplicates using:
- Exact email matches: Same email on multiple Contact records
- Same name + same Account: Strong duplicate indicator
- Same name + same domain: Likely duplicate even if different email address
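The three checks above amount to grouping Contacts by three different match keys. A minimal sketch follows, assuming records exported as dictionaries keyed by the standard Contact field names (Id, FirstName, LastName, Email, AccountId). The helper names and sample records are hypothetical.

```python
from collections import defaultdict


def find_duplicate_groups(contacts, key_func):
    """Group Contact Ids by a match key and keep groups with 2+ members."""
    groups = defaultdict(list)
    for contact in contacts:
        key = key_func(contact)
        if key is not None:
            groups[key].append(contact["Id"])
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


def full_name(contact):
    return f'{contact["FirstName"]} {contact["LastName"]}'.strip().lower()


def email_key(contact):
    """Exact email match, ignoring case and surrounding whitespace."""
    return contact["Email"].strip().lower() or None


def name_account_key(contact):
    """Same name on the same Account."""
    name = full_name(contact)
    return (name, contact["AccountId"]) if name and contact["AccountId"] else None


def name_domain_key(contact):
    """Same name at the same email domain, even if the mailbox differs."""
    name = full_name(contact)
    email = contact["Email"].strip().lower()
    if not name or "@" not in email:
        return None
    return (name, email.rsplit("@", 1)[1])


contacts = [
    {"Id": "C1", "FirstName": "Jane", "LastName": "Doe", "Email": "jane.doe@acme.com", "AccountId": "A1"},
    {"Id": "C2", "FirstName": "Jane", "LastName": "Doe", "Email": "JANE.DOE@acme.com", "AccountId": "A1"},
    {"Id": "C3", "FirstName": "jane", "LastName": "doe", "Email": "jdoe@acme.com", "AccountId": "A2"},
    {"Id": "C4", "FirstName": "Raj", "LastName": "Patel", "Email": "raj@globex.io", "AccountId": "A3"},
]
for key_func in (email_key, name_account_key, name_domain_key):
    print(key_func.__name__, find_duplicate_groups(contacts, key_func))
```

The groups are candidates for review, not confirmed duplicates: two different people can share a name at a large company.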
Timeliness Metrics
How stale is your Contact data?
- Last Activity Date distribution: What percentage of Contacts have had activity in the last 6 months? 12 months? 24 months?
- Last Modified Date distribution: When was the record data last updated?
- Created Date age: Records created 3+ years ago without updates are likely outdated
Contacts with no activity in 24+ months and no updates in 12+ months are high-risk for data decay.
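The high-risk rule and the activity distribution can be sketched as below. Approximating 6, 12, and 24 months as 183, 365, and 730 days is an assumption, as is treating a Contact with no recorded activity as stale. The function names and sample dates are hypothetical.

```python
from datetime import date, timedelta
from typing import Optional


def is_high_risk(last_activity: Optional[date], last_modified: date, today: date) -> bool:
    """No activity in 24+ months and no updates in 12+ months."""
    activity_cutoff = today - timedelta(days=730)  # ~24 months (approximation)
    modified_cutoff = today - timedelta(days=365)  # ~12 months (approximation)
    # A Contact with no recorded activity at all is treated as stale (assumption).
    no_recent_activity = last_activity is None or last_activity <= activity_cutoff
    no_recent_update = last_modified <= modified_cutoff
    return no_recent_activity and no_recent_update


def activity_distribution(last_activity_dates, today):
    """Share of Contacts with activity in the last 6, 12, and 24 months."""
    total = len(last_activity_dates)
    windows = {"6 months": 183, "12 months": 365, "24 months": 730}
    result = {}
    for label, days in windows.items():
        cutoff = today - timedelta(days=days)
        recent = sum(1 for d in last_activity_dates if d is not None and d > cutoff)
        result[label] = recent / total if total else 0.0
    return result


today = date(2025, 1, 1)
dates = [date(2024, 11, 1), date(2024, 3, 1), date(2022, 6, 1), None]
print(activity_distribution(dates, today))
print(is_high_risk(date(2022, 6, 1), date(2023, 5, 1), today))
```

Passing the audit date in as `today` keeps results reproducible when the same export is re-analyzed later.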
Lead-Level Audit
If you use Leads, apply similar metrics:
- Completeness: Fill rates for key fields (email, company, title)
- Email validity: Same validation as Contacts
- Lead-to-Contact duplicates: How many Leads already exist as Contacts?
- Lead aging: How long do Leads sit before conversion or disqualification?
- Source quality: Which Lead Sources have the worst data quality?
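Lead-to-Contact duplicates and source quality can be measured together by checking each Lead's email against the set of Contact emails and grouping the result by Lead Source. This is a sketch under the assumption that email is the only match key. The function name and sample data are hypothetical.

```python
from collections import defaultdict


def lead_contact_overlap(leads, contact_emails):
    """Share of Leads per LeadSource whose email already exists on a Contact."""
    known = {e.strip().lower() for e in contact_emails if e and e.strip()}
    totals = defaultdict(int)
    dupes = defaultdict(int)
    for lead in leads:
        source = lead.get("LeadSource") or "(none)"
        totals[source] += 1
        email = (lead.get("Email") or "").strip().lower()
        if email and email in known:
            dupes[source] += 1
    return {source: dupes[source] / totals[source] for source in totals}


leads = [
    {"Email": "jane@acme.com", "LeadSource": "Web"},
    {"Email": "new@initech.com", "LeadSource": "Web"},
    {"Email": "RAJ@globex.io", "LeadSource": "Trade Show"},
    {"Email": "", "LeadSource": None},
]
contact_emails = ["Jane@acme.com", "raj@globex.io"]
print(lead_contact_overlap(leads, contact_emails))
```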
Opportunity-Level Audit
Opportunity data quality directly affects forecasting and reporting:
- Stage consistency: Are Opportunities moving through stages in order, or are stages being skipped?
- Close Date accuracy: How often do Close Dates slip past the original date?
- Amount fill rate: What percentage of Opportunities have amounts?
- Contact Roles: What percentage of Opportunities have associated Contacts?
- Orphan Opportunities: Opportunities without Accounts or with deleted Accounts
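Three of these checks reduce to simple counts over an export. The sketch below assumes Opportunity rows with Id, AccountId, and Amount, plus OpportunityContactRole rows with OpportunityId. It only catches Opportunities with a blank Account, because spotting deleted Accounts would require comparing against an Account export. The function name and sample data are hypothetical.

```python
def opportunity_metrics(opportunities, contact_roles):
    """Amount fill rate, Contact Role coverage, and orphan rate for Opportunities."""
    total = len(opportunities)
    if total == 0:
        return {"amount_fill_rate": 0.0, "contact_role_rate": 0.0, "orphan_rate": 0.0}
    with_roles = {role["OpportunityId"] for role in contact_roles}
    # An Amount of 0 counts as filled; only None or an empty string is blank.
    has_amount = sum(1 for o in opportunities if o.get("Amount") not in (None, ""))
    has_role = sum(1 for o in opportunities if o["Id"] in with_roles)
    orphans = sum(1 for o in opportunities if not o.get("AccountId"))
    return {
        "amount_fill_rate": has_amount / total,
        "contact_role_rate": has_role / total,
        "orphan_rate": orphans / total,
    }


opportunities = [
    {"Id": "O1", "AccountId": "A1", "Amount": 5000},
    {"Id": "O2", "AccountId": "A1", "Amount": None},
    {"Id": "O3", "AccountId": "", "Amount": 1200},
    {"Id": "O4", "AccountId": "A3", "Amount": 800},
]
contact_roles = [
    {"OpportunityId": "O1"},
    {"OpportunityId": "O1"},
    {"OpportunityId": "O4"},
]
print(opportunity_metrics(opportunities, contact_roles))
```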
Running the Audit: Practical Steps
Step 1: Export Your Data
Pull reports for each object you're auditing. Include all fields you want to measure. Export to CSV for analysis.
For large databases, you may need to use Data Loader or the Bulk API instead of standard reports.
Step 2: Calculate Completeness
For each field, count:
- Total records
- Records with blank/null values
- Fill rate = (Total - Blank) / Total
This can be done in Excel with COUNTBLANK functions, or programmatically with Python/SQL.
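For the programmatic route, a minimal Python sketch of the fill rate formula might look like this. The inline CSV stands in for an exported report, and the column names and `fill_rates` helper are hypothetical.

```python
import csv
import io


def fill_rates(rows, fields):
    """Fill rate = (total - blank) / total for each field."""
    total = len(rows)
    rates = {}
    for field in fields:
        # Treat missing keys, None, and whitespace-only values as blank.
        blank = sum(1 for row in rows if not (row.get(field) or "").strip())
        rates[field] = (total - blank) / total if total else 0.0
    return rates


# Stand-in for an exported report; in practice, open the CSV file instead.
export = io.StringIO(
    "Name,Website,Industry\n"
    "Acme,acme.com,Manufacturing\n"
    "Globex,,Technology\n"
    "Initech,initech.com,\n"
    "Umbrella,,\n"
)
rows = list(csv.DictReader(export))
print(fill_rates(rows, ["Name", "Website", "Industry"]))
```

Whitespace-only values count as blank here, which catches fields that look filled in a report but contain nothing useful.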
Step 3: Identify Duplicates
For exact matches (same email, same domain), simple sorting and comparison work.
For fuzzy matches (similar names, minor spelling variations), you'll need:
- Fuzzy matching algorithms (Levenshtein distance, Jaro-Winkler)
- Specialized tools (Cloudingo, DemandTools, RingLead)
- Or manual review of sorted lists
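To give a sense of what the algorithmic option involves, here is a small Levenshtein-based sketch that flags similar Account names. The 0.85 similarity threshold and the sample names are assumptions. Comparing every pair is only practical on a sample, since the number of comparisons grows quadratically with the record count.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Scale distance to 0..1, where 1.0 means identical after lowercasing."""
    a, b = a.strip().lower(), b.strip().lower()
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest


def candidate_pairs(names, threshold=0.85):
    """Return every pair of names whose similarity meets the threshold."""
    return [
        (a, b) for a, b in combinations(names, 2)
        if similarity(a, b) >= threshold
    ]


names = ["Acme Corporation", "Acme Corporaton", "Globex", "Initech"]
print(candidate_pairs(names))
```

The right threshold depends on your data, so review a sample of flagged pairs before trusting any cutoff.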
Step 4: Validate Emails
Upload your email list to a validation service. They'll return results categorized by validity status.
Step 5: Assess Consistency
For picklist fields, pull the distinct value counts. High counts suggest inconsistency:
- Country field with 50+ values probably has inconsistent entries
- Industry field with 200+ values definitely needs cleanup
For free-text fields like company names, look for obvious variations (Inc vs Inc. vs Incorporated).
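A quick way to see how much of a high distinct count is just formatting is to count values before and after a simple standardization pass. The sketch below does that for the Country and company suffix examples. The alias table and suffix pattern are hypothetical and far from complete; a real audit would map to a full standard country list.

```python
import re
from collections import Counter

# Hypothetical mapping; a real audit would map to a full standard country list.
COUNTRY_ALIASES = {
    "us": "United States",
    "usa": "United States",
    "u.s.a.": "United States",
    "united states": "United States",
}

# Matches a trailing "Inc", "Inc.", or "Incorporated", with an optional comma before it.
SUFFIX_PATTERN = re.compile(r",?\s+(inc\.?|incorporated)$", re.IGNORECASE)


def distinct_values(values):
    """Count each non-blank value exactly as entered."""
    return Counter(v.strip() for v in values if v and v.strip())


def standardize_country(value: str) -> str:
    cleaned = value.strip()
    return COUNTRY_ALIASES.get(cleaned.lower(), cleaned)


def strip_suffix(name: str) -> str:
    return SUFFIX_PATTERN.sub("", name.strip())


countries = ["United States", "US", "USA", "U.S.A.", "Canada"]
print(len(distinct_values(countries)), "distinct values before")
print(len(distinct_values(standardize_country(c) for c in countries)), "distinct values after")
print({strip_suffix(n) for n in ["Acme Inc", "Acme Inc.", "Acme, Incorporated"]})
```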
Step 6: Compile Results
Create a summary scorecard with all metrics:
| Metric | Current | Target | Status |
|---|---|---|---|
| Account duplicate rate | 8.3% | <5% | Needs work |
| Contact email fill rate | 94.2% | >98% | Needs work |
| Contact email validity | 78.5% | >85% | Needs work |
| Account industry fill rate | 89.1% | >85% | Good |
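Assigning the Status column by rule rather than by eye keeps the scorecard consistent from one audit to the next. The sketch below applies the duplicate and email validity bands stated earlier. For fill rates, which have a target but no stated Problem band, it assumes anything at or below target is "Needs work". The `rate_status` helper is hypothetical.

```python
def rate_status(value, target, problem=None, higher_is_better=True):
    """Classify a percentage as Good, Needs work, or Problem.

    `target` is the boundary for Good; `problem` is the optional boundary
    beyond which the metric counts as a Problem.
    """
    if not higher_is_better:
        # Flip signs so the same comparisons work for "lower is better" metrics.
        value, target = -value, -target
        problem = -problem if problem is not None else None
    if value > target:
        return "Good"
    if problem is not None and value < problem:
        return "Problem"
    return "Needs work"


# (metric, current %, target %, problem boundary %, higher_is_better)
metrics = [
    ("Account duplicate rate", 8.3, 5.0, 10.0, False),
    ("Contact email fill rate", 94.2, 98.0, None, True),
    ("Contact email validity", 78.5, 85.0, 70.0, True),
    ("Account industry fill rate", 89.1, 85.0, None, True),
]
for name, current, target, problem, higher in metrics:
    print(f"{name}: {current}% -> {rate_status(current, target, problem, higher)}")
```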
Prioritizing Cleanup
You can't fix everything at once. Prioritize based on:
Business impact: Email validity directly affects campaigns. Duplicate Accounts affect forecasting. Missing industry data affects segmentation. Rank by what's costing you most.
Effort to fix: Some issues (email validation) are straightforward. Others (duplicate merging) require more careful work. Balance quick wins with important fixes.
Compounding effect: Duplicates get worse over time as people add to both records. Bad data entry processes create ongoing problems. Fix root causes early.
A typical prioritization:
- Merge duplicate Accounts (affects everything else)
- Validate and flag invalid emails (highest immediate ROI)
- Merge duplicate Contacts (reduces confusion)
- Fill missing critical fields through enrichment
- Standardize categorical fields
- Archive or delete truly dead records
Ongoing Monitoring
A data quality audit isn't a one-time event. Set up ongoing monitoring:
Scheduled reports: Create Salesforce reports that surface data quality issues weekly or monthly.
Dashboard: Build a data quality dashboard with key metrics visible to RevOps and leadership.
Trend tracking: Are metrics improving or degrading over time? A declining email validity rate suggests a process problem.
Source analysis: Which Lead Sources, import files, or entry points create the worst data? Fix the sources, not just the symptoms.
When to Get Help
Running a comprehensive data quality audit takes time and expertise. Consider getting help if:
- You've never done one: First-time audits benefit from an experienced perspective
- Your database is large: 100K+ records means significant analysis work
- You need to justify investment: An external audit carries more weight with leadership
- You need benchmarks: How do you compare to similar companies?
At Verum, data audits are often the first step in our engagements. We assess your current state, identify priorities, and give you a clear picture of what cleanup will involve. If you just want the audit without the cleanup, that works too.