Salesforce Data Quality Audit: A Step-by-Step Checklist

Your VP of Sales just asked you how many active accounts are in the pipeline. You pull a report, and the number looks wrong. Too high. You dig in and find accounts with no activity in two years, contacts with bounced emails still marked as valid, and a handful of records where the Industry field says "Other" because nobody bothered to fill it in properly. You don't have a pipeline problem. You have a Salesforce data quality problem, and you don't know how deep it goes.

A Salesforce data audit gives you that answer. Not a vague sense of "our data could be better," but actual numbers. What percentage of your records are complete. How many duplicates are hiding behind name variations. How fast your data is decaying. And most importantly, where to focus your cleanup effort for the biggest impact.

This guide walks you through a full audit with specific SOQL queries, a field-level checklist, a scoring framework, and an action plan template you can start using today.

What You're Measuring

Data quality isn't one thing. It's four dimensions, and you need to measure each separately because they have different causes and different fixes.

1. Completeness

What percentage of records have values in key fields? This is the easiest dimension to measure and usually the worst offender. An Account record without Industry, AnnualRevenue, or NumberOfEmployees is essentially invisible to your lead scoring, routing, and segmentation.

2. Accuracy

Are the values that do exist actually correct? A Contact with an email that bounces has a value in the Email field, but it's useless. An Account with "Technology" as the industry when they're actually a healthcare company is worse than blank, because it's actively misleading your sales team.

3. Consistency

Are the same things represented the same way? "US" vs "United States" vs "USA" in the Country field. "VP Sales" vs "Vice President of Sales" vs "VP, Sales" in Title. Inconsistent values break filters, reports, and automation rules.

4. Decay Rate

How fast is your data going stale? B2B contact data decays at roughly 2-3% per month. That means about 25-30% of your database becomes inaccurate every year through job changes, company moves, email changes, and phone number turnover. If you haven't touched your data in 18 months, nearly half of it could be wrong.

Step 1: Measure Completeness with SOQL

Open Developer Console or use a tool like Workbench to run these queries. They'll give you hard numbers on where your gaps are.

Contact completeness

SELECT COUNT(Id) FROM Contact WHERE Email = null
SELECT COUNT(Id) FROM Contact WHERE Phone = null
SELECT COUNT(Id) FROM Contact WHERE Title = null
SELECT COUNT(Id) FROM Contact WHERE MailingCity = null

Compare each result against your total Contact count. If 40% of your Contacts don't have a Title, that's a problem for any routing or scoring that depends on seniority.

Account completeness

SELECT COUNT(Id) FROM Account WHERE Industry = null
SELECT COUNT(Id) FROM Account WHERE AnnualRevenue = null
SELECT COUNT(Id) FROM Account WHERE NumberOfEmployees = null
SELECT COUNT(Id) FROM Account WHERE BillingState = null
SELECT COUNT(Id) FROM Account WHERE Website = null

AnnualRevenue and NumberOfEmployees are usually the worst. Most orgs we audit have these fields filled on fewer than 30% of their Account records.

Lead completeness

SELECT COUNT(Id) FROM Lead WHERE Company = null OR Company = ''
SELECT COUNT(Id) FROM Lead WHERE Email = null
SELECT COUNT(Id) FROM Lead WHERE Phone = null
SELECT COUNT(Id) FROM Lead WHERE Industry = null

Benchmark: Calculate your field fill rate for each critical field. Above 85% is good. Below 60% means that field is effectively unusable for segmentation or routing.

Step 2: Identify Duplicates

Duplicates are the most expensive data quality problem because they cause split pipelines, double outreach, and inaccurate forecasting. They're also the hardest to find because they hide behind name variations.

Find potential duplicate Contacts by email

SELECT Email, COUNT(Id) cnt
FROM Contact
WHERE Email != null
GROUP BY Email
HAVING COUNT(Id) > 1
ORDER BY COUNT(Id) DESC

Find potential duplicate Accounts by name

SELECT Name, COUNT(Id) cnt
FROM Account
GROUP BY Name
HAVING COUNT(Id) > 1
ORDER BY COUNT(Id) DESC

That second query only catches exact name matches. It won't find "Acme Corp" and "Acme Corporation" as duplicates. For that, you need either a normalized name field (see our company name normalization guide) or Salesforce's built-in duplicate management with fuzzy matching enabled.

Find orphaned Contacts

SELECT COUNT(Id) FROM Contact WHERE AccountId = null

Contacts not linked to any Account are often leftovers from bad imports or Lead conversions that went sideways. They clutter your database and confuse reps.

Step 3: Assess Accuracy

Accuracy is harder to measure than completeness because you can't just check for null values. You need to look for signals that data has gone stale or was wrong from the start.

Email bounce indicators

If you're using Salesforce with a marketing automation tool (Pardot, HubSpot, Marketo), check for hard bounces. Contacts with bounced emails should be flagged or removed from active outreach lists.

SELECT COUNT(Id) FROM Contact
WHERE HasOptedOutOfEmail = true

Also look for email patterns that suggest fake or low-quality addresses: noreply@, info@, test@, and personal domains (gmail, yahoo) for B2B records where you'd expect a corporate domain.

Stale records

SELECT COUNT(Id) FROM Contact
WHERE LastActivityDate < LAST_N_DAYS:365

SELECT COUNT(Id) FROM Account
WHERE LastActivityDate < LAST_N_DAYS:365

Records with no activity in over a year are likely stale. The contacts may have changed roles, the accounts may have been acquired. These records inflate your database size without contributing value.

Suspicious patterns

Look for fields where the same value appears suspiciously often. If 60% of your Accounts have Industry set to "Technology," that's probably a default value or lazy data entry rather than an accurate reflection of your market.

SELECT Industry, COUNT(Id) cnt
FROM Account
WHERE Industry != null
GROUP BY Industry
ORDER BY COUNT(Id) DESC

Step 4: Check Consistency

Inconsistent data breaks automation. A Flow that routes leads based on Country won't work if your data has "US," "USA," "United States," "U.S.," and "United States of America" all in the same field.

Country field variations

SELECT BillingCountry, COUNT(Id) cnt
FROM Account
WHERE BillingCountry != null
GROUP BY BillingCountry
ORDER BY COUNT(Id) DESC

State field variations

SELECT BillingState, COUNT(Id) cnt
FROM Account
WHERE BillingState != null
GROUP BY BillingState
ORDER BY COUNT(Id) DESC

If you see "California" and "CA" and "Calif" all appearing, you've got a consistency problem. Same goes for Title, Industry, and any picklist field where users can type free text.

Step 5: Calculate Your Data Quality Score

Now turn your findings into a single score so you can track improvement over time. Weight each dimension based on what matters most for your business.

Dimension Weight How to Calculate
Completeness 40% Average fill rate across critical fields
Accuracy 25% % of records with valid, non-bounced email + recent activity
Uniqueness 20% 100% minus duplicate rate
Consistency 15% % of records using standard picklist values

Example: Your Contacts have 72% average completeness across key fields, 65% have valid emails with activity in the last year, your duplicate rate is 8% (so uniqueness is 92%), and 80% of records use standardized picklist values.

Score: (0.72 x 40) + (0.65 x 25) + (0.92 x 20) + (0.80 x 15) = 28.8 + 16.25 + 18.4 + 12 = 75.45

That's a B. Not terrible, but the accuracy number is dragging you down.

Step 6: Build Your Action Plan

An audit without a cleanup plan is just depressing. Prioritize based on impact and effort.

Quick wins (do this week)

  • Standardize picklist values. Country, State, Industry. Write a Flow or use Data Loader to bulk-update the variations to standard values.
  • Delete or archive obviously dead records. No email, no phone, no activity in 2+ years, no open Opportunities. These are dead weight.
  • Flag bounced emails. Mark hard bounces as invalid so reps stop wasting time on them.

Medium effort (do this month)

  • Merge duplicates. Start with exact-match email duplicates since those are unambiguous. Then move to fuzzy name matching. See our duplicate management guide for the full process.
  • Fill critical gaps on active records. Focus on records with open Opportunities or recent activity. Enrichment tools can backfill Industry, employee count, and revenue.
  • Set up validation rules. Prevent bad data from entering in the first place. Required fields on Lead and Contact creation, picklist enforcement, email format validation.

Longer-term (this quarter)

  • Implement duplicate rules. Configure matching rules and duplicate rules to catch duplicates at point of entry.
  • Set up enrichment automation. Auto-enrich new Leads and Contacts with firmographic data on creation.
  • Build a data quality dashboard. Track completeness, duplicate rate, and bounce rate monthly so you can see trends.
  • Establish data governance ownership. Someone has to own this, even if it's a shared responsibility with clear rules.

Field-Level Audit Checklist

Use this checklist to audit each critical field. Print it, stick it in a spreadsheet, whatever works.

Object Field Check
Contact Email Fill rate, bounce rate, personal vs corporate domain ratio
Contact Phone Fill rate, format consistency, disconnected numbers
Contact Title Fill rate, standardization (VP vs Vice President), junk values
Contact MailingAddress Fill rate, consistency (state abbreviations), completeness (zip without city)
Account Industry Fill rate, "Other" percentage, accuracy spot-check
Account AnnualRevenue Fill rate, outliers (suspiciously round numbers, $1 entries)
Account NumberOfEmployees Fill rate, outliers, staleness
Account Website Fill rate, broken URLs, http vs https
Lead Company Fill rate, junk values ("test," "asdf," single characters)
Lead LeadSource Fill rate, value distribution, "Other" percentage

Common Mistakes When Auditing

Auditing everything at once. Your Salesforce org probably has hundreds of fields. Don't try to audit them all. Focus on the 10-15 fields that actually drive routing, scoring, segmentation, and reporting. The rest can wait.

Counting records instead of active records. A database with 200,000 Contacts sounds impressive. But if 120,000 of them have no activity in two years and bounced emails, your "real" database is 80,000. Audit the records that matter, not the total count.

Not establishing a baseline. If you don't write down your current scores, you can't prove improvement later. Save your audit results. You'll need them when someone asks whether the cleanup initiative was worth the investment.

Treating the audit as a one-time thing. Data quality isn't a project. It's a process. Run this audit quarterly, or at minimum before any major initiative: a new marketing campaign, a territory realignment, a CRM migration, or an enrichment project.

Ignoring custom fields. Standard fields get all the attention, but your custom fields often hold the data that's most critical to your specific business. Audit your ICP fields, your product interest fields, your custom scoring inputs.

When to DIY vs. Get Help

You can absolutely run this audit yourself. The SOQL queries above work in any Salesforce org, and the scoring framework doesn't require any special tools.

Where it gets harder is the cleanup. Merging 3,000 duplicate pairs without losing data. Enriching 50,000 Accounts with missing firmographics. Normalizing 15 variations of every state name across 200,000 records. That's where the volume makes DIY impractical.

If you're dealing with a large database and the audit reveals significant issues, we can help. We clean Salesforce data for a living, and we typically start with a free assessment that gives you the same numbers this guide describes, but faster.

Common Questions About Salesforce Data Quality Audits

How often should I audit Salesforce data quality?

Quarterly for a full audit. Monthly for key metrics (completeness on critical fields, duplicate rate, email bounce rate). If you're about to launch a major campaign or migrate CRMs, run an audit right before regardless of your regular schedule.

What's a good data quality score for Salesforce?

Most orgs score between 40-60% on their first audit. Above 80% is strong. Above 90% is exceptional and usually requires active data governance. The trend matters more than the absolute number. Going from 45% to 65% in a quarter is significant progress.

Can I automate a Salesforce data quality audit?

You can automate the measurement with scheduled reports and dashboards that track completeness, duplicate rates, and bounce rates over time. Some teams build custom Apex to calculate scores weekly. The interpretation and action planning still requires human judgment. Tools like Validity DemandTools or Cloudingo can help with ongoing monitoring.

What SOQL queries are most useful for a data quality audit?

Start with completeness queries that count null values on critical fields. Then run GROUP BY queries on fields like Industry, Country, and State to spot inconsistencies. Finally, use GROUP BY on Email to find duplicate Contacts. These three query types cover 80% of what you need.

Want someone to run this audit for you? We'll assess your Salesforce data and show you exactly where the problems are, with a cleanup plan attached.

Get a Free Data Assessment

Related: All Salesforce Guides | Duplicate Management Guide | Data Enrichment | Data Quality Metrics

About the Author

Rome Thorndike is the founder of Verum. Before starting Verum, Rome spent years at Salesforce working on data quality and CRM implementation challenges. He now helps B2B companies clean, enrich, and maintain their Salesforce data.