Salesforce Data Quality Audit: A Step-by-Step Checklist
Your VP of Sales just asked you how many active accounts are in the pipeline. You pull a report, and the number looks wrong. Too high. You dig in and find accounts with no activity in two years, contacts with bounced emails still marked as valid, and a handful of records where the Industry field says "Other" because nobody bothered to fill it in properly. You don't have a pipeline problem. You have a Salesforce data quality problem, and you don't know how deep it goes.
A Salesforce data audit gives you that answer. Not a vague sense of "our data could be better," but actual numbers. What percentage of your records are complete. How many duplicates are hiding behind name variations. How fast your data is decaying. And most importantly, where to focus your cleanup effort for the biggest impact.
This guide walks you through a full audit with specific SOQL queries, a field-level checklist, a scoring framework, and an action plan template you can start using today.
What You're Measuring
Data quality isn't one thing. It's four dimensions, and you need to measure each separately because they have different causes and different fixes.
1. Completeness
What percentage of records have values in key fields? This is the easiest dimension to measure and usually the worst offender. An Account record without Industry, AnnualRevenue, or NumberOfEmployees is essentially invisible to your lead scoring, routing, and segmentation.
2. Accuracy
Are the values that do exist actually correct? A Contact with an email that bounces has a value in the Email field, but it's useless. An Account with "Technology" as the industry when they're actually a healthcare company is worse than blank, because it's actively misleading your sales team.
3. Consistency
Are the same things represented the same way? "US" vs "United States" vs "USA" in the Country field. "VP Sales" vs "Vice President of Sales" vs "VP, Sales" in Title. Inconsistent values break filters, reports, and automation rules.
4. Decay Rate
How fast is your data going stale? B2B contact data decays at roughly 2-3% per month. That means about 25-30% of your database becomes inaccurate every year through job changes, company moves, email changes, and phone number turnover. If you haven't touched your data in 18 months, nearly half of it could be wrong.
Step 1: Measure Completeness with SOQL
Open Developer Console or use a tool like Workbench to run these queries. They'll give you hard numbers on where your gaps are.
Contact completeness
SELECT COUNT(Id) FROM Contact WHERE Email = null
SELECT COUNT(Id) FROM Contact WHERE Phone = null
SELECT COUNT(Id) FROM Contact WHERE Title = null
SELECT COUNT(Id) FROM Contact WHERE MailingCity = null
Compare each result against your total Contact count. If 40% of your Contacts don't have a Title, that's a problem for any routing or scoring that depends on seniority.
Account completeness
SELECT COUNT(Id) FROM Account WHERE Industry = null
SELECT COUNT(Id) FROM Account WHERE AnnualRevenue = null
SELECT COUNT(Id) FROM Account WHERE NumberOfEmployees = null
SELECT COUNT(Id) FROM Account WHERE BillingState = null
SELECT COUNT(Id) FROM Account WHERE Website = null
AnnualRevenue and NumberOfEmployees are usually the worst. Most orgs we audit have these fields filled on fewer than 30% of their Account records.
Lead completeness
SELECT COUNT(Id) FROM Lead WHERE Company = null OR Company = ''
SELECT COUNT(Id) FROM Lead WHERE Email = null
SELECT COUNT(Id) FROM Lead WHERE Phone = null
SELECT COUNT(Id) FROM Lead WHERE Industry = null
Benchmark: Calculate your field fill rate for each critical field. Above 85% is good. Below 60% means that field is effectively unusable for segmentation or routing.
Step 2: Identify Duplicates
Duplicates are the most expensive data quality problem because they cause split pipelines, double outreach, and inaccurate forecasting. They're also the hardest to find because they hide behind name variations.
Find potential duplicate Contacts by email
SELECT Email, COUNT(Id) cnt
FROM Contact
WHERE Email != null
GROUP BY Email
HAVING COUNT(Id) > 1
ORDER BY COUNT(Id) DESC
Find potential duplicate Accounts by name
SELECT Name, COUNT(Id) cnt
FROM Account
GROUP BY Name
HAVING COUNT(Id) > 1
ORDER BY COUNT(Id) DESC
That second query only catches exact name matches. It won't find "Acme Corp" and "Acme Corporation" as duplicates. For that, you need either a normalized name field (see our company name normalization guide) or Salesforce's built-in duplicate management with fuzzy matching enabled.
Find orphaned Contacts
SELECT COUNT(Id) FROM Contact WHERE AccountId = null
Contacts not linked to any Account are often leftovers from bad imports or Lead conversions that went sideways. They clutter your database and confuse reps.
Step 3: Assess Accuracy
Accuracy is harder to measure than completeness because you can't just check for null values. You need to look for signals that data has gone stale or was wrong from the start.
Email bounce indicators
If you're using Salesforce with a marketing automation tool (Pardot, HubSpot, Marketo), check for hard bounces. Contacts with bounced emails should be flagged or removed from active outreach lists.
SELECT COUNT(Id) FROM Contact
WHERE HasOptedOutOfEmail = true
Also look for email patterns that suggest fake or low-quality addresses: noreply@, info@, test@, and personal domains (gmail, yahoo) for B2B records where you'd expect a corporate domain.
Stale records
SELECT COUNT(Id) FROM Contact
WHERE LastActivityDate < LAST_N_DAYS:365
SELECT COUNT(Id) FROM Account
WHERE LastActivityDate < LAST_N_DAYS:365
Records with no activity in over a year are likely stale. The contacts may have changed roles, the accounts may have been acquired. These records inflate your database size without contributing value.
Suspicious patterns
Look for fields where the same value appears suspiciously often. If 60% of your Accounts have Industry set to "Technology," that's probably a default value or lazy data entry rather than an accurate reflection of your market.
SELECT Industry, COUNT(Id) cnt
FROM Account
WHERE Industry != null
GROUP BY Industry
ORDER BY COUNT(Id) DESC
Step 4: Check Consistency
Inconsistent data breaks automation. A Flow that routes leads based on Country won't work if your data has "US," "USA," "United States," "U.S.," and "United States of America" all in the same field.
Country field variations
SELECT BillingCountry, COUNT(Id) cnt
FROM Account
WHERE BillingCountry != null
GROUP BY BillingCountry
ORDER BY COUNT(Id) DESC
State field variations
SELECT BillingState, COUNT(Id) cnt
FROM Account
WHERE BillingState != null
GROUP BY BillingState
ORDER BY COUNT(Id) DESC
If you see "California" and "CA" and "Calif" all appearing, you've got a consistency problem. Same goes for Title, Industry, and any picklist field where users can type free text.
Step 5: Calculate Your Data Quality Score
Now turn your findings into a single score so you can track improvement over time. Weight each dimension based on what matters most for your business.
| Dimension | Weight | How to Calculate |
|---|---|---|
| Completeness | 40% | Average fill rate across critical fields |
| Accuracy | 25% | % of records with valid, non-bounced email + recent activity |
| Uniqueness | 20% | 100% minus duplicate rate |
| Consistency | 15% | % of records using standard picklist values |
Example: Your Contacts have 72% average completeness across key fields, 65% have valid emails with activity in the last year, your duplicate rate is 8% (so uniqueness is 92%), and 80% of records use standardized picklist values.
Score: (0.72 x 40) + (0.65 x 25) + (0.92 x 20) + (0.80 x 15) = 28.8 + 16.25 + 18.4 + 12 = 75.45
That's a B. Not terrible, but the accuracy number is dragging you down.
Step 6: Build Your Action Plan
An audit without a cleanup plan is just depressing. Prioritize based on impact and effort.
Quick wins (do this week)
- Standardize picklist values. Country, State, Industry. Write a Flow or use Data Loader to bulk-update the variations to standard values.
- Delete or archive obviously dead records. No email, no phone, no activity in 2+ years, no open Opportunities. These are dead weight.
- Flag bounced emails. Mark hard bounces as invalid so reps stop wasting time on them.
Medium effort (do this month)
- Merge duplicates. Start with exact-match email duplicates since those are unambiguous. Then move to fuzzy name matching. See our duplicate management guide for the full process.
- Fill critical gaps on active records. Focus on records with open Opportunities or recent activity. Enrichment tools can backfill Industry, employee count, and revenue.
- Set up validation rules. Prevent bad data from entering in the first place. Required fields on Lead and Contact creation, picklist enforcement, email format validation.
Longer-term (this quarter)
- Implement duplicate rules. Configure matching rules and duplicate rules to catch duplicates at point of entry.
- Set up enrichment automation. Auto-enrich new Leads and Contacts with firmographic data on creation.
- Build a data quality dashboard. Track completeness, duplicate rate, and bounce rate monthly so you can see trends.
- Establish data governance ownership. Someone has to own this, even if it's a shared responsibility with clear rules.
Field-Level Audit Checklist
Use this checklist to audit each critical field. Print it, stick it in a spreadsheet, whatever works.
| Object | Field | Check |
|---|---|---|
| Contact | Fill rate, bounce rate, personal vs corporate domain ratio | |
| Contact | Phone | Fill rate, format consistency, disconnected numbers |
| Contact | Title | Fill rate, standardization (VP vs Vice President), junk values |
| Contact | MailingAddress | Fill rate, consistency (state abbreviations), completeness (zip without city) |
| Account | Industry | Fill rate, "Other" percentage, accuracy spot-check |
| Account | AnnualRevenue | Fill rate, outliers (suspiciously round numbers, $1 entries) |
| Account | NumberOfEmployees | Fill rate, outliers, staleness |
| Account | Website | Fill rate, broken URLs, http vs https |
| Lead | Company | Fill rate, junk values ("test," "asdf," single characters) |
| Lead | LeadSource | Fill rate, value distribution, "Other" percentage |
Common Mistakes When Auditing
Auditing everything at once. Your Salesforce org probably has hundreds of fields. Don't try to audit them all. Focus on the 10-15 fields that actually drive routing, scoring, segmentation, and reporting. The rest can wait.
Counting records instead of active records. A database with 200,000 Contacts sounds impressive. But if 120,000 of them have no activity in two years and bounced emails, your "real" database is 80,000. Audit the records that matter, not the total count.
Not establishing a baseline. If you don't write down your current scores, you can't prove improvement later. Save your audit results. You'll need them when someone asks whether the cleanup initiative was worth the investment.
Treating the audit as a one-time thing. Data quality isn't a project. It's a process. Run this audit quarterly, or at minimum before any major initiative: a new marketing campaign, a territory realignment, a CRM migration, or an enrichment project.
Ignoring custom fields. Standard fields get all the attention, but your custom fields often hold the data that's most critical to your specific business. Audit your ICP fields, your product interest fields, your custom scoring inputs.
When to DIY vs. Get Help
You can absolutely run this audit yourself. The SOQL queries above work in any Salesforce org, and the scoring framework doesn't require any special tools.
Where it gets harder is the cleanup. Merging 3,000 duplicate pairs without losing data. Enriching 50,000 Accounts with missing firmographics. Normalizing 15 variations of every state name across 200,000 records. That's where the volume makes DIY impractical.
If you're dealing with a large database and the audit reveals significant issues, we can help. We clean Salesforce data for a living, and we typically start with a free assessment that gives you the same numbers this guide describes, but faster.
Common Questions About Salesforce Data Quality Audits
How often should I audit Salesforce data quality?
Quarterly for a full audit. Monthly for key metrics (completeness on critical fields, duplicate rate, email bounce rate). If you're about to launch a major campaign or migrate CRMs, run an audit right before regardless of your regular schedule.
What's a good data quality score for Salesforce?
Most orgs score between 40-60% on their first audit. Above 80% is strong. Above 90% is exceptional and usually requires active data governance. The trend matters more than the absolute number. Going from 45% to 65% in a quarter is significant progress.
Can I automate a Salesforce data quality audit?
You can automate the measurement with scheduled reports and dashboards that track completeness, duplicate rates, and bounce rates over time. Some teams build custom Apex to calculate scores weekly. The interpretation and action planning still requires human judgment. Tools like Validity DemandTools or Cloudingo can help with ongoing monitoring.
What SOQL queries are most useful for a data quality audit?
Start with completeness queries that count null values on critical fields. Then run GROUP BY queries on fields like Industry, Country, and State to spot inconsistencies. Finally, use GROUP BY on Email to find duplicate Contacts. These three query types cover 80% of what you need.
Want someone to run this audit for you? We'll assess your Salesforce data and show you exactly where the problems are, with a cleanup plan attached.
Get a Free Data AssessmentRelated: All Salesforce Guides | Duplicate Management Guide | Data Enrichment | Data Quality Metrics
About the Author
Rome Thorndike is the founder of Verum. Before starting Verum, Rome spent years at Salesforce working on data quality and CRM implementation challenges. He now helps B2B companies clean, enrich, and maintain their Salesforce data.