No single data provider has complete coverage. Based on Verum's testing across multiple B2B segments, individual providers typically cover 60-75% of target accounts—but the gaps differ by provider. When you combine them strategically in a waterfall, you can reach 85-95% coverage.

This is the waterfall strategy: query multiple providers in sequence until you get a match. When Provider A doesn't have the data, fall through to Provider B, then Provider C. Like water cascading down, each level catches what the one above missed.

This guide covers how to build, optimize, and operate a contact data waterfall for maximum coverage at reasonable cost.

Why Use a Waterfall?

The math is simple but compelling:

| Approach | Typical Coverage | Relative Cost |
|---|---|---|
| Single provider (premium) | 60-70% | $$ |
| 2-provider waterfall | 75-85% | $$ + $ |
| 3-provider waterfall | 85-92% | $$ + $ + $ |
| 4+ provider waterfall | 90-95% | $$$ (diminishing returns) |

The incremental cost of Provider B only applies to records where Provider A failed. If your primary provider covers 70% of your records, you're only paying Provider B for the remaining 30%. This makes waterfalls surprisingly cost-effective.
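
The coverage arithmetic can be sketched in a few lines. This is a minimal illustration, assuming each provider's hit rate is measured on the records that every earlier provider missed; the function name `cumulativeCoverage` and the sample hit rates are hypothetical, not measured figures.

```javascript
// Cumulative coverage of a waterfall, assuming each stage's hit rate applies
// only to the records that all earlier stages failed to match.
function cumulativeCoverage(hitRates) {
  let missed = 1; // fraction of records still unmatched
  const byStage = [];
  for (const rate of hitRates) {
    missed *= 1 - rate;
    byStage.push(1 - missed);
  }
  return byStage;
}

// Illustrative hit rates only: 70% primary, then 50% and 40% of what remains.
const stages = cumulativeCoverage([0.7, 0.5, 0.4]);
console.log(stages.map((c) => `${(c * 100).toFixed(0)}%`).join(' -> '));
```

Because each later stage only sees the leftovers, its hit rate on your full list will look much higher than its contribution to the waterfall. Measure secondary providers on the records your primary missed.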

How Waterfalls Work

Step 1: Input Record (name, company, LinkedIn URL)

Step 2: Provider A (Primary). Best quality, highest cost.

↓ No match?

Step 3: Provider B (Secondary). Good coverage, moderate cost.

↓ No match?

Step 4: Provider C (Tertiary). Wide coverage, lower accuracy.

Step 5: Validation & Output. Verify email, standardize format.

Waterfall Logic

  1. Input: Start with what you know—name, company, LinkedIn URL, domain
  2. Primary query: Hit your highest-quality (usually highest-cost) provider first
  3. Evaluate result: Did you get a match? Is it complete? Does it meet quality thresholds?
  4. Fall through: If no match or incomplete data, query the next provider
  5. Validate: Regardless of source, validate emails and phones before use
  6. Tag source: Track which provider supplied each data point for quality monitoring

What Triggers a Fallthrough?

Define clear rules for when to query the next provider:

  • No match: Provider returns no results for the input
  • Missing critical fields: Match found but missing email or phone
  • Low confidence: Provider returns data but with low confidence score
  • Stale data: Data found but last verified date is too old

Decision point: Should you always query the next provider if a field is missing? Not necessarily. If you need both email and phone, fall through when either one is missing. But if phone is optional, you might accept an email-only result from Provider A rather than paying for Provider B.
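
The fallthrough rules above can be expressed as one function that returns a reason, or `null` when the result is good enough to keep. This is a sketch under assumptions: the result shape (`email`, `phone`, `confidence`, `lastVerified`), the `rules` object, and the name `fallthroughReason` are all hypothetical, and the thresholds are examples rather than recommendations.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Returns a string describing why to query the next provider,
// or null if the result should be accepted.
function fallthroughReason(result, rules, now = Date.now()) {
  // Trigger 1: no match at all
  if (!result) return 'no_match';

  // Trigger 2: a required field is missing (optional fields are ignored)
  const missing = rules.requiredFields.filter((field) => !result[field]);
  if (missing.length > 0) return `missing:${missing.join(',')}`;

  // Trigger 3: low confidence (skipped if the provider reports no score)
  if (typeof result.confidence === 'number' && result.confidence < rules.minConfidence) {
    return 'low_confidence';
  }

  // Trigger 4: stale data (skipped if the provider reports no verification date)
  if (result.lastVerified) {
    const ageDays = (now - new Date(result.lastVerified).getTime()) / MS_PER_DAY;
    if (ageDays > rules.maxAgeDays) return 'stale';
  }

  return null;
}

// Example rules: email is required, phone is optional.
const emailOnlyRules = { requiredFields: ['email'], minConfidence: 0.7, maxAgeDays: 180 };
console.log(fallthroughReason({ email: 'jane@example.com', confidence: 0.9 }, emailOnlyRules));
```

Returning the reason rather than a boolean makes it easy to log why each record fell through, which feeds directly into the monitoring described later.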

Provider Selection

Choosing Your Primary Provider

Your primary provider should optimize for:

  • Data quality: Highest accuracy, most current data
  • Coverage for your ICP: Best match rates for your target segments
  • Integration ease: Native CRM connectors, API quality

Cost matters less for the primary—you're paying for most records anyway.

Choosing Secondary/Tertiary Providers

Secondary providers should optimize for:

  • Complementary coverage: Strong where your primary is weak
  • Cost efficiency: Lower per-record cost since you're paying for fewer records
  • Specific strengths: Maybe better phone data, or better EMEA coverage
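
One way to test for complementary coverage is to run the same sample through your primary and each candidate, then count how many records each candidate matches that the primary missed. The sketch below assumes you have already collected matched record IDs per provider; `incrementalCoverage` and the input shapes are hypothetical.

```javascript
// primaryMatches: array of record IDs the primary provider matched.
// candidateMatches: { providerName: array of record IDs that provider matched }.
// Returns candidates sorted by how many records they add beyond the primary.
function incrementalCoverage(primaryMatches, candidateMatches) {
  const covered = new Set(primaryMatches);
  return Object.entries(candidateMatches)
    .map(([provider, ids]) => ({
      provider,
      added: new Set(ids.filter((id) => !covered.has(id))).size,
    }))
    .sort((a, b) => b.added - a.added);
}

// Made-up sample: candidate_y overlaps less with the primary, so it adds more.
console.log(
  incrementalCoverage([1, 2, 3], {
    candidate_x: [1, 2, 4],
    candidate_y: [4, 5, 6],
  })
);
```

A candidate with a high raw match rate can still add very little if it matches the same records as your primary, so rank secondaries by `added`, not by overall match rate.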

Provider Comparison by Segment

| Segment | Strong Providers | Notes |
|---|---|---|
| Enterprise (Fortune 1000) | ZoomInfo, Dun & Bradstreet | Most providers have good coverage here |
| Mid-Market | ZoomInfo, Apollo, Clearbit | Coverage varies more; test specific segments |
| SMB | Apollo, Lusha, RocketReach | Premium providers often weak here |
| Tech Companies | Clearbit, BuiltWith, HG Insights | Tech stack data valuable for targeting |
| EMEA | Cognism, Lusha | GDPR-compliant providers essential |
| APAC | ZoomInfo, LeadIQ | Coverage generally weaker; verify carefully |

Provider Comparison by Data Type

| Data Type | Strong Providers | Notes |
|---|---|---|
| Work Email | ZoomInfo, Clearbit, Apollo | Highest-value data point; validate regardless |
| Personal Email | ContactOut, RocketReach | Use carefully; higher privacy concerns |
| Direct Dial Phone | ZoomInfo, Cognism | Most difficult data to source accurately |
| Mobile Phone | Lusha, Cognism | Valuable but verify connectivity |
| Firmographics | Clearbit, ZoomInfo, D&B | Relatively commoditized; most providers good |
| Tech Stack | BuiltWith, HG Insights, Clearbit | Specialized providers more accurate |

Implementation Architecture

Option 1: Sequential API Calls

Simplest approach—query providers one at a time:

```javascript
async function enrichContact(input) {
  // Try Provider A first
  let result = await providerA.enrich(input);
  if (isComplete(result)) {
    return { ...result, source: 'provider_a' };
  }

  // Fall through to Provider B
  result = await providerB.enrich(input);
  if (isComplete(result)) {
    return { ...result, source: 'provider_b' };
  }

  // Fall through to Provider C
  result = await providerC.enrich(input);
  return { ...result, source: result ? 'provider_c' : 'no_match' };
}

function isComplete(result) {
  return result && result.email && result.confidence >= 0.7;
}
```

Pros: Simple to implement and debug
Cons: Slower (sequential calls), doesn't optimize for cost

Option 2: Parallel with Priority

Query all providers simultaneously, use results by priority:

```javascript
async function enrichContactParallel(input) {
  // Query all providers in parallel
  const [resultA, resultB, resultC] = await Promise.all([
    providerA.enrich(input).catch(() => null),
    providerB.enrich(input).catch(() => null),
    providerC.enrich(input).catch(() => null)
  ]);

  // Use results by priority
  if (isComplete(resultA)) {
    return { ...resultA, source: 'provider_a' };
  }
  if (isComplete(resultB)) {
    return { ...resultB, source: 'provider_b' };
  }
  if (isComplete(resultC)) {
    return { ...resultC, source: 'provider_c' };
  }

  // Return best partial result
  return mergePartialResults(resultA, resultB, resultC);
}
```

Pros: Fastest (parallel execution), gets all available data
Cons: Most expensive (pays all providers for every record)

Option 3: Smart Waterfall with Routing

Route to specific providers based on input characteristics:

```javascript
async function enrichContactSmart(input) {
  // Determine which providers to try based on input
  const providers = selectProviders(input);

  for (const provider of providers) {
    const result = await provider.enrich(input);
    if (isComplete(result)) {
      return { ...result, source: provider.name };
    }
  }

  return { source: 'no_match' };
}

function selectProviders(input) {
  const providers = [];

  // Route EMEA contacts to Cognism first
  if (isEMEA(input.company)) {
    providers.push(cognism, zoomInfo, apollo);
  }
  // Route SMB to Apollo first (better coverage, lower cost)
  else if (isSMB(input.company)) {
    providers.push(apollo, lusha, zoomInfo);
  }
  // Enterprise goes to ZoomInfo first
  else {
    providers.push(zoomInfo, clearbit, apollo);
  }

  return providers;
}
```

Pros: Optimizes cost and coverage based on segment
Cons: Requires segment detection, more complex to maintain
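
The routing code calls `isEMEA` and `isSMB` without defining them. A minimal sketch of what they might look like follows, assuming the company object carries a `countryCode` (ISO 3166-1 alpha-2) and an `employeeCount`. Both field names, the partial country list, and the 200-employee cutoff are assumptions for illustration; definitions of SMB vary, so set the cutoff to match your own segmentation.

```javascript
// Partial, illustrative list of EMEA country codes; extend for your markets.
const EMEA_COUNTRIES = new Set(['GB', 'DE', 'FR', 'NL', 'ES', 'IT', 'SE', 'IE', 'AE', 'ZA']);

// Assumed cutoff for this example only.
const SMB_MAX_EMPLOYEES = 200;

function isEMEA(company) {
  const code = (company?.countryCode || '').toUpperCase();
  return EMEA_COUNTRIES.has(code);
}

function isSMB(company) {
  const count = company?.employeeCount;
  return Number.isFinite(count) && count <= SMB_MAX_EMPLOYEES;
}

console.log(isEMEA({ countryCode: 'de' }), isSMB({ employeeCount: 50 }));
```

When the country or headcount is unknown, both helpers return `false`, so the record falls into the default (enterprise) branch of `selectProviders`. If most of your unknowns are small companies, you may want a different default.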

Handling Data Conflicts

When multiple providers return data, you'll encounter conflicts. Provider A says the person is VP of Sales; Provider B says Director of Business Development. Which do you trust?

Strategy 1: Strict Hierarchy

Primary provider always wins. Simple but may miss better data from secondary sources.

Strategy 2: Recency Wins

Use the most recently updated data, regardless of provider. Requires tracking data freshness.
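
A recency rule might look like the sketch below. It assumes each candidate value carries a `lastVerified` date string that `Date.parse` can read; the candidate shape and the name `pickMostRecent` are hypothetical. Undated candidates are skipped, and if none has a date the first candidate wins, which falls back to your provider order.

```javascript
// candidates: [{ provider, value, lastVerified }], listed in provider priority order.
function pickMostRecent(candidates) {
  let best = null;
  let bestTime = -Infinity;
  for (const candidate of candidates) {
    const time = Date.parse(candidate.lastVerified);
    if (Number.isNaN(time)) continue; // no usable date, cannot compare
    if (time > bestTime) {
      best = candidate;
      bestTime = time;
    }
  }
  return best ?? candidates[0] ?? null;
}

const title = pickMostRecent([
  { provider: 'provider_a', value: 'VP of Sales', lastVerified: '2023-01-15' },
  { provider: 'provider_b', value: 'Director of Business Development', lastVerified: '2024-03-01' },
]);
console.log(title.provider, title.value);
```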

Strategy 3: Confidence Scoring

Build a composite score based on:

  • Provider reliability (based on your historical accuracy)
  • Data freshness (when was it last verified?)
  • Verification status (was it validated?)
  • Source type (scraped vs. self-reported vs. verified)
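
```javascript
function scoreDataPoint(data, providerName) {
  const providerWeight = {
    zoominfo: 0.9,
    clearbit: 0.85,
    apollo: 0.75
  };
  // Unknown providers get a neutral default weight
  let score = providerWeight[providerName] || 0.5;

  // Boost recently verified data; penalize stale data
  const daysSinceVerified = daysSince(data.lastVerified);
  if (daysSinceVerified < 30) score *= 1.2;
  else if (daysSinceVerified > 180) score *= 0.7;

  // Boost for verified data
  if (data.isVerified) score *= 1.1;

  return Math.min(score, 1.0);
}

// Whole days elapsed since a date. Returns NaN when the date is undefined or
// unparseable, in which case no freshness adjustment is applied above.
function daysSince(date) {
  return Math.floor((Date.now() - new Date(date).getTime()) / 86400000);
}
```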

Strategy 4: Field-Level Merging

Take the best data for each field from different providers:

  • Email from Provider A (highest deliverability)
  • Phone from Provider B (best phone coverage)
  • Title from Provider A (most recent)
  • Company data from Provider C (most complete)
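
Field-level merging can be driven by a per-field priority list. The sketch below is one possible shape, not a definitive implementation: `mergeByField`, the `fieldPriority` map, and the provider keys are all hypothetical. It also records which provider supplied each field, so source tagging survives the merge.

```javascript
// resultsByProvider: { providerName: record or null }
// fieldPriority: { fieldName: [providerName, ...] } in order of preference
function mergeByField(resultsByProvider, fieldPriority) {
  const merged = {};
  const sources = {};
  for (const [field, providers] of Object.entries(fieldPriority)) {
    for (const name of providers) {
      const value = resultsByProvider[name]?.[field];
      if (value !== undefined && value !== null && value !== '') {
        merged[field] = value;
        sources[field] = name; // remember where this field came from
        break;
      }
    }
  }
  return { ...merged, sources };
}

const contact = mergeByField(
  {
    provider_a: { email: 'jane@example.com', title: 'VP of Sales' },
    provider_b: { email: 'j.doe@example.com', phone: '+1 555 0100' },
    provider_c: null,
  },
  {
    email: ['provider_a', 'provider_b'],
    phone: ['provider_b', 'provider_a'],
    title: ['provider_a', 'provider_b'],
  }
);
console.log(contact);
```

A function like this could also serve as the `mergePartialResults` step in the parallel approach, with the priority lists derived from your own accuracy tracking.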

Always validate email: Regardless of which provider supplied the email, run it through a validation service before use. A bounced email costs more than the validation fee.

Cost Optimization

Pay-Per-Match vs. Pay-Per-Query

Understand your contract structure:

| Model | How It Works | Waterfall Implication |
|---|---|---|
| Pay-per-match | Only charged when data is returned | Low risk to query; can try multiple providers |
| Pay-per-query | Charged for every API call | Expensive waterfall; optimize routing |
| Credit bucket | Monthly credit allocation | Monitor usage; may need to throttle end of month |
| Unlimited | Flat monthly fee | Query freely; maximize usage |

Optimization Strategies

  • Cache results: Don't re-enrich the same contact multiple times. Cache for 30-90 days.
  • Batch during off-peak: Some providers offer lower rates for batch processing
  • Segment routing: Route SMB to lower-cost providers that have good SMB coverage
  • Skip validation for known-good: If you have a recently validated email, don't re-validate
  • Pre-filter impossible matches: Don't query for contacts at companies too small to have the role
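
As an illustration of the caching strategy, here is a minimal in-memory cache with a time-to-live. It is a sketch only: the class name `EnrichmentCache`, the key fields (`email`, `linkedinUrl`, `name`, `company`), and the 60-day default are assumptions, and a production setup would more likely use a shared store such as a database or Redis rather than a process-local `Map`.

```javascript
class EnrichmentCache {
  // now is injectable so expiry can be tested without waiting
  constructor(ttlDays = 60, now = () => Date.now()) {
    this.ttlMs = ttlDays * 24 * 60 * 60 * 1000;
    this.now = now;
    this.entries = new Map();
  }

  // Normalize the key so "Jane@Acme.com " and "jane@acme.com" share an entry.
  static keyFor(input) {
    return (input.email || input.linkedinUrl || `${input.name}|${input.company}`)
      .trim()
      .toLowerCase();
  }

  get(input) {
    const key = EnrichmentCache.keyFor(input);
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() - entry.storedAt > this.ttlMs) {
      this.entries.delete(key); // expired: force a fresh enrichment
      return undefined;
    }
    return entry.result;
  }

  set(input, result) {
    this.entries.set(EnrichmentCache.keyFor(input), { result, storedAt: this.now() });
  }
}

const cache = new EnrichmentCache(60);
cache.set({ email: 'Jane@Acme.com ' }, { email: 'jane@acme.com', source: 'provider_a' });
console.log(cache.get({ email: 'jane@acme.com' }));
```

Check the cache before the first provider call and write to it after a successful enrichment; that way a repeat request costs nothing until the entry expires.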

Sample Cost Calculation

For 10,000 contact enrichments with a 3-provider waterfall:

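| Provider | Coverage | Records Queried | Cost/Record | Total |
|---|---|---|---|---|
| Provider A (Primary) | 70% | 10,000 | $0.25 | $2,500 |
| Provider B (Secondary) | 60% of remaining | 3,000 | $0.15 | $450 |
| Provider C (Tertiary) | 50% of remaining | 1,200 | $0.10 | $120 |
| Total | 94% coverage | - | - | $3,070 |

The table charges each provider for every record it is queried on. Provider A matches 7,000 records, Provider B matches 1,800 of the remaining 3,000, and Provider C matches 600 of the remaining 1,200, for 9,400 matches in total.

Effective cost: about $0.31 per input record for 94% coverage vs. $0.25 per input record for 70% coverage with a single provider. Per enriched contact, that is roughly $0.33 for the waterfall vs. $0.36 for the single provider.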
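
To run this calculation for your own numbers, the sketch below recomputes the totals from the per-stage inputs. The function name `waterfallCost`, the stage shape, and the `billing` switch are assumptions added for illustration; `'per_query'` charges for every record sent to a provider (as the table does), while `'per_match'` charges only for records that return data.

```javascript
// stages: [{ name, hitRate, costPerRecord }], where hitRate applies to the
// records that reach that stage.
function waterfallCost(totalRecords, stages, billing = 'per_query') {
  let remaining = totalRecords;
  let totalCost = 0;
  const rows = [];
  for (const stage of stages) {
    const queried = remaining;
    const matched = Math.round(queried * stage.hitRate);
    const billed = billing === 'per_match' ? matched : queried;
    const cost = billed * stage.costPerRecord;
    rows.push({ name: stage.name, queried, matched, cost });
    totalCost += cost;
    remaining -= matched;
  }
  const enriched = totalRecords - remaining;
  return {
    rows,
    totalCost,
    coverage: enriched / totalRecords,
    costPerInput: totalCost / totalRecords,
    costPerEnriched: enriched > 0 ? totalCost / enriched : null,
  };
}

// Inputs taken from the table above.
const sampleStages = [
  { name: 'Provider A', hitRate: 0.7, costPerRecord: 0.25 },
  { name: 'Provider B', hitRate: 0.6, costPerRecord: 0.15 },
  { name: 'Provider C', hitRate: 0.5, costPerRecord: 0.1 },
];
console.log(waterfallCost(10000, sampleStages));
console.log(waterfallCost(10000, sampleStages, 'per_match').totalCost);
```

Comparing `costPerInput` with `costPerEnriched` is worth the extra line: the waterfall costs more per record you send in, but it can cost less per usable contact you get back.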

Monitoring and Quality Control

Key Metrics to Track

  • Coverage rate by provider: What % of queries return matches?
  • Fallthrough rate: How often does each provider fail, triggering the next?
  • Accuracy by provider: Track bounce rates and connection rates by source
  • Cost per enriched record: Total cost / successful enrichments
  • Time to enrich: Latency matters for real-time use cases
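
Most of these metrics fall out of a simple call log. The sketch below assumes you record one entry per provider call with a `recordId`, `provider`, `matched` flag, and `cost`; that log shape and the name `summarizeLog` are hypothetical. Accuracy and latency need extra inputs (bounce data, timings) and are left out.

```javascript
// log: [{ recordId, provider, matched, cost }], one entry per provider call
function summarizeLog(log) {
  const byProvider = {};
  const enriched = new Set();
  let totalCost = 0;

  for (const { recordId, provider, matched, cost } of log) {
    if (!byProvider[provider]) byProvider[provider] = { queries: 0, matches: 0 };
    byProvider[provider].queries += 1;
    if (matched) {
      byProvider[provider].matches += 1;
      enriched.add(recordId);
    }
    totalCost += cost;
  }

  for (const stats of Object.values(byProvider)) {
    stats.matchRate = stats.matches / stats.queries;
    stats.fallthroughRate = 1 - stats.matchRate; // share of calls passed down
  }

  return {
    byProvider,
    totalCost,
    costPerEnriched: enriched.size > 0 ? totalCost / enriched.size : null,
  };
}

console.log(
  summarizeLog([
    { recordId: 'r1', provider: 'provider_a', matched: true, cost: 0.25 },
    { recordId: 'r2', provider: 'provider_a', matched: false, cost: 0.25 },
    { recordId: 'r2', provider: 'provider_b', matched: true, cost: 0.15 },
  ])
);
```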

Quality Monitoring Dashboard

| Metric | Provider A | Provider B | Provider C |
|---|---|---|---|
| Match rate | 72% | 58% | 45% |
| Email bounce rate | 3% | 7% | 12% |
| Phone connect rate | 65% | 48% | 35% |
| Avg. confidence score | 0.85 | 0.72 | 0.61 |

Review monthly. If a provider's quality drops significantly, consider reordering or replacing them.

Alerting

Set up alerts for:

  • Match rate drops >10% week-over-week
  • Bounce rate exceeds threshold (5% is typical)
  • API errors or timeouts spike
  • Cost per record increases significantly
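
The first two alert rules can be checked with a small comparison of weekly snapshots. This sketch makes two assumptions worth stating: it reads "drops >10%" as a relative drop (0.72 to 0.60 is a 16.7% drop), and it takes weekly `matchRate` and `bounceRate` as fractions. The function name `checkAlerts` and the snapshot shape are hypothetical.

```javascript
const DEFAULT_THRESHOLDS = {
  maxMatchRateDrop: 0.1, // relative week-over-week drop
  maxBounceRate: 0.05,
};

// lastWeek / thisWeek: { matchRate, bounceRate } as fractions between 0 and 1
function checkAlerts(lastWeek, thisWeek, thresholds = DEFAULT_THRESHOLDS) {
  const alerts = [];

  if (lastWeek.matchRate > 0) {
    const drop = (lastWeek.matchRate - thisWeek.matchRate) / lastWeek.matchRate;
    if (drop > thresholds.maxMatchRateDrop) {
      alerts.push(`match rate down ${(drop * 100).toFixed(1)}% week-over-week`);
    }
  }

  if (thisWeek.bounceRate > thresholds.maxBounceRate) {
    alerts.push(`bounce rate ${(thisWeek.bounceRate * 100).toFixed(1)}% exceeds threshold`);
  }

  return alerts;
}

console.log(
  checkAlerts({ matchRate: 0.72, bounceRate: 0.03 }, { matchRate: 0.6, bounceRate: 0.07 })
);
```

Run the check per provider rather than on blended totals; a quality drop in a tertiary provider is easy to miss when it is averaged with the primary.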

Waterfall Platforms

You can build your own waterfall or use platforms that handle orchestration:

DIY Approach

  • Pros: Full control, can optimize for your specific needs
  • Cons: Engineering effort, need to manage multiple API integrations
  • Best for: Teams with engineering resources and complex requirements

Orchestration Platforms

| Platform | Providers Supported | Best For |
|---|---|---|
| Clay | 50+ providers | Outbound sales teams, flexible workflows |
| Clearbit (Breeze) | Clearbit + others | HubSpot users, B2B marketing |
| LeadGenius | Multiple + custom | Enterprise, custom research needs |
| Openprise | Most major providers | Enterprise RevOps, complex routing |

Implementation Checklist

Planning Phase

  • Define coverage requirements (what % is acceptable?)
  • Identify target segments and their characteristics
  • Audit current provider coverage on sample data
  • Calculate cost scenarios for different waterfall configurations
  • Define quality thresholds (confidence scores, freshness requirements)

Build Phase

  • Set up API integrations with selected providers
  • Implement waterfall logic with fallthrough rules
  • Add caching layer to avoid duplicate enrichments
  • Build validation layer (email, phone verification)
  • Tag data with source provider for tracking

Launch Phase

  • Run pilot on sample data (1,000-5,000 records)
  • Measure coverage and quality metrics
  • Adjust provider order and fallthrough rules based on results
  • Set up monitoring and alerting
  • Document procedures for handling quality issues

Frequently Asked Questions

What is a contact data waterfall?

A contact data waterfall is a strategy that queries multiple data providers in sequence to maximize contact data coverage. When the first provider doesn't have data for a contact, the request falls through to the second provider, then the third, and so on—like water cascading down a waterfall.

How many providers should be in a waterfall?

Most waterfalls use 2-4 providers. Beyond 4, you see diminishing returns—each additional provider adds maybe 5-10% more coverage at significant cost and complexity. Start with 2 providers, measure your coverage gap, and add more only if needed.

What order should providers be in the waterfall?

Order providers by: (1) data quality/accuracy, (2) coverage for your target segments, and (3) cost per match. Put your highest-quality provider first, then backfill with providers that have good coverage in segments where your primary is weak. Always validate results before use.

How do you handle conflicting data from different providers?

Establish a provider hierarchy where your most trusted source wins conflicts. Alternatively, implement confidence scoring based on data freshness, source type, and historical accuracy. For critical data like email, validate with a third-party tool regardless of source.

About the Author

Rome Thorndike is the founder of Verum, where he helps B2B companies clean, enrich, and maintain their CRM data. With over 10 years of experience in data at Microsoft, Databricks, and Salesforce, Rome has seen firsthand how data quality impacts revenue operations.