Fuzzy matching identifies records that are similar but not exactly the same. While exact matching only catches identical values ("John Smith" = "John Smith"), fuzzy matching catches variations: "John Smith" matches "Jon Smith," "J. Smith," "Jonathan Smith," and "John P. Smith." It uses algorithms that measure how similar two strings are and returns a confidence score for each potential match.
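As a rough illustration of what a similarity score looks like, the sketch below uses `difflib.SequenceMatcher` from Python's standard library. It is one convenient scorer picked for the example, not necessarily the algorithm any particular matching tool uses, and the helper names `similarity` and `exact_match` are hypothetical.

```python
from difflib import SequenceMatcher


def exact_match(a: str, b: str) -> bool:
    """Exact matching: True only when the two values are identical."""
    return a == b


def similarity(a: str, b: str) -> float:
    """Return a 0.0-1.0 similarity ratio, ignoring case and surrounding whitespace."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


if __name__ == "__main__":
    reference = "John Smith"
    for candidate in ("John Smith", "Jon Smith", "John P. Smith"):
        print(
            f"{candidate!r}: exact={exact_match(reference, candidate)}, "
            f"score={similarity(reference, candidate):.2f}"
        )
```

Only the identical string passes the exact check, while the variants still receive a score that can be compared against a threshold.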
Why It Matters
Exact matching catches less than half of real duplicates. People abbreviate names, misspell company names, and enter email addresses and phone numbers in inconsistent formats. If you only match on exact values, "Acme Corporation" and "Acme Corp" look like two different companies. Fuzzy matching closes this gap by catching the near-duplicates that exact matching misses entirely.
Common Fuzzy Matching Algorithms
- Levenshtein distance: Counts the minimum number of single-character edits (insertions, deletions, or substitutions) needed to turn one string into another. "Smith" to "Smyth" is a distance of 1.
- Jaro-Winkler: Measures character-level similarity with extra weight on matching prefixes. Good for names, where the beginning is usually the most reliable part.
- Soundex/Metaphone: Encodes names by how they sound, not how they are spelled. Catches "Smith" and "Smythe" as phonetic matches.
- Token-based matching: Compares sets of words regardless of order. "Acme Inc" and "Inc Acme" get a high score because the same words are present.
- Composite scoring: Combines multiple algorithms and multiple fields into an overall match confidence score.
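Three of the approaches above can be sketched in a few lines of plain Python. This is a minimal illustration, not a production implementation: `levenshtein` is the standard dynamic-programming edit distance, `soundex` is a simplified version of American Soundex, and `token_set_similarity` assumes Jaccard overlap of lowercase word sets as the token-based measure. All function names are hypothetical, and Jaro-Winkler is left out for brevity.

```python
import re


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or substitutions."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep when equal)
            ))
        previous = current
    return previous[-1]


# Consonant groups used by Soundex; vowels, h, w, and y have no code.
_SOUNDEX_CODES = {
    ch: digit
    for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                           ("l", "4"), ("mn", "5"), ("r", "6"))
    for ch in letters
}


def soundex(name: str) -> str:
    """Simplified American Soundex: first letter plus three digits."""
    letters = [c for c in name.lower() if c.isalpha()]
    if not letters:
        return ""
    digits = []
    previous_code = _SOUNDEX_CODES.get(letters[0], "")
    for ch in letters[1:]:
        code = _SOUNDEX_CODES.get(ch, "")
        if code and code != previous_code:
            digits.append(code)
        if ch not in "hw":  # h and w do not separate repeated codes
            previous_code = code
    return (letters[0].upper() + "".join(digits) + "000")[:4]


def token_set_similarity(a: str, b: str) -> float:
    """Jaccard overlap of lowercase word sets, so word order is ignored."""
    tokens_a = set(re.findall(r"\w+", a.lower()))
    tokens_b = set(re.findall(r"\w+", b.lower()))
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


if __name__ == "__main__":
    print(levenshtein("Smith", "Smyth"))
    print(soundex("Smith"), soundex("Smythe"))
    print(token_set_similarity("Acme Inc", "Inc Acme"))
```

In practice a maintained library is usually a better choice than hand-rolled versions; the point here is only to show what each measure computes.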
Example
A database has "Robert Johnson, Acme Corp" and "Bob Johnson, Acme Corporation" with different email addresses. Exact matching finds nothing. Fuzzy matching detects a name similarity of 78% (Bob is a common nickname for Robert), a company similarity of 92% (Corp/Corporation), and the same area code on both phone numbers. The composite score is an 87% match probability, so the pair is flagged for review.
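One way such a composite score could be computed is a weighted average of the per-field scores. The sketch below is an assumption for illustration only: the example does not say how the 87% was derived, so the weights (0.5 for name, 0.3 for company, 0.2 for phone area code), the review and auto-merge thresholds, and the function names are all hypothetical. The weights were chosen so that the example's field scores land near the stated composite.

```python
def composite_score(field_scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of per-field similarity scores (each in the range 0.0-1.0)."""
    total_weight = sum(weights.values())
    return sum(field_scores[field] * weight for field, weight in weights.items()) / total_weight


def classify(score: float, review_threshold: float = 0.80,
             auto_merge_threshold: float = 0.95) -> str:
    """Map a composite score to an action; both thresholds are hypothetical."""
    if score >= auto_merge_threshold:
        return "auto-merge"
    if score >= review_threshold:
        return "review"
    return "no match"


# Field scores taken from the example; a shared area code is treated as 1.0.
field_scores = {"name": 0.78, "company": 0.92, "phone_area_code": 1.0}

# Hypothetical weights, not taken from the source.
weights = {"name": 0.5, "company": 0.3, "phone_area_code": 0.2}

score = composite_score(field_scores, weights)

if __name__ == "__main__":
    print(f"{score:.1%} -> {classify(score)}")
```

Sending mid-range scores to human review rather than merging them automatically keeps false positives from silently combining records that belong to different people or companies.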