Waterfall enrich HubSpot contacts across Apollo, Clearbit, and PDL using a Claude Code skill

Compatible agents

This skill works with any agent that supports the Claude Code skills standard, including Claude Code, Claude Cowork, OpenAI Codex, and Google Antigravity.

Prerequisites

Claude Code or another AI coding agent installed
HubSpot private app with crm.objects.contacts.read and crm.objects.contacts.write scopes
Apollo.io account with API access (required)
Clearbit account with API access (optional — second provider in the waterfall)
People Data Labs account with API access (optional — third provider in the waterfall)

Environment Variables

# HubSpot private app token with contacts read/write scopes

HUBSPOT_ACCESS_TOKEN=your_value_here

# Apollo.io API key — required, used as the first enrichment provider

APOLLO_API_KEY=your_value_here

# Clearbit API key — optional, used as the second provider

CLEARBIT_API_KEY=your_value_here

# People Data Labs API key — optional, used as the final fallback

PDL_API_KEY=your_value_here

Why a Claude Code skill?

Waterfall enrichment involves complex branching logic across three APIs with different response formats, error codes, and rate limits. A Claude Code skill wraps all of this into a conversation:

"Enrich contacts missing job titles — use all three providers"
"Run the waterfall on contacts created this week"
"Try just Apollo and Clearbit, skip PDL"
"Which contacts needed all three providers to get a complete profile?"

The agent reads API reference files for each provider, builds the cascade logic, handles provider-specific errors (Clearbit 404s, PDL 402s), and reports which providers filled which fields. You can change the provider order or add new providers in natural language.

How it works

The skill has three parts:

SKILL.md — instructions defining the waterfall workflow and merge rules
references/ — API documentation for Apollo People Match, Clearbit Person Find, PDL Person Enrich, and HubSpot endpoints
templates/ — field mapping rules showing how each provider's response maps to HubSpot properties, plus the "never overwrite" merge logic

When you invoke the skill, the agent reads these files, writes a script that cascades through providers, executes it, and reports which providers contributed to each contact's profile.
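
The script the agent writes will differ from run to run, but the cascade at its core can be sketched as follows. This is a minimal illustration, not the skill's actual output: each provider is modelled as a plain function that takes an email and returns already-mapped HubSpot fields, and the names `waterfall_enrich`, `Provider`, and `TARGET_FIELDS` are assumptions made for this example.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Assumption: a provider is a function email -> dict of HubSpot-style fields.
# A real provider would wrap the HTTP calls described in references/.
Provider = Callable[[str], Dict[str, Optional[str]]]

TARGET_FIELDS = ["jobtitle", "company", "phone", "linkedin_url", "industry"]


def waterfall_enrich(
    email: str,
    providers: List[Tuple[str, Provider]],
    target_fields: List[str] = TARGET_FIELDS,
) -> Dict[str, str]:
    """Cascade through providers, filling only fields that are still empty."""
    merged: Dict[str, str] = {}
    contributors: List[str] = []

    for name, lookup in providers:
        missing = [f for f in target_fields if f not in merged]
        if not missing:
            break  # every target field is filled, so skip remaining providers

        result = lookup(email) or {}
        contributed = False
        for field in missing:
            value = result.get(field)
            if value:  # first non-empty value wins and is never overwritten
                merged[field] = value
                contributed = True
        if contributed:
            contributors.append(name)

    if contributors:
        # e.g. "apollo", "apollo+clearbit", "apollo+clearbit+pdl"
        merged["enrichment_source"] = "+".join(contributors)
    return merged
```

Because the provider order is just the order of the `providers` list, reordering the cascade amounts to passing the same tuples in a different sequence.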

What is a Claude Code skill?

A Claude Code skill is a directory of reference files and instructions that teach an AI coding agent how to complete a specific task. Unlike traditional automation that runs the same code every time, a skill lets the agent adapt: it can modify search filters, reorder providers, handle errors, and explain what it did, all based on your natural-language request. The agent generates and runs code on the fly using the API patterns in the reference files.

Step 1: Create the skill directory

mkdir -p .claude/skills/waterfall-enrich/{templates,references}

Step 2: Write the SKILL.md file

Create .claude/skills/waterfall-enrich/SKILL.md:

---
name: waterfall-enrich
description: Waterfall enriches HubSpot contacts missing key fields. Tries Apollo first, then Clearbit for gaps, then People Data Labs as a final fallback. Writes enriched data and source attribution back to HubSpot.
disable-model-invocation: true
allowed-tools: Bash, Read
---
 
## Workflow
 
1. Read all reference files in `references/` for API patterns
2. Read `templates/field-mapping.md` for merge rules
3. Search HubSpot for contacts missing the requested field (default: jobtitle)
4. Filter to business emails only (exclude gmail.com, yahoo.com, etc.)
5. For each contact, cascade through providers in order: Apollo → Clearbit → PDL
6. After each provider call, check which target fields are still empty
7. Only call the next provider if there are still gaps
8. Merge results: each provider only fills fields that previous providers left empty
9. PATCH each contact in HubSpot with merged fields plus enrichment_source attribution
10. Print a summary showing provider coverage stats
 
## Rules
 
- Provider order: Apollo first (broadest coverage), then Clearbit, then PDL
- NEVER overwrite a field that a previous provider already filled
- Skip providers whose API key is not set (Clearbit and PDL are optional)
- Handle each provider's error codes: Clearbit returns 404 for unknown emails, PDL returns 402 for insufficient credits
- Rate limit: 500ms between provider calls, 200ms between HubSpot search pages
- Track enrichment_source as "apollo", "apollo+clearbit", "apollo+clearbit+pdl", etc.
- Use environment variables: HUBSPOT_ACCESS_TOKEN, APOLLO_API_KEY, CLEARBIT_API_KEY, PDL_API_KEY

Step 3: Add reference and template files

Create references/apollo-people-match.md:

# Apollo People Match API Reference
 
```
POST https://api.apollo.io/api/v1/people/match
x-api-key: {APOLLO_API_KEY}
Content-Type: application/json
```
 
Request body:
```json
{ "email": "jane@acme.com" }
```
 
Response:
```json
{
  "person": {
    "title": "Head of Sales",
    "organization": {
      "name": "Acme Inc",
      "industry": "Computer Software"
    },
    "phone_numbers": [
      { "sanitized_number": "+16505550177", "type": "work_direct" }
    ],
    "linkedin_url": "https://www.linkedin.com/in/janesmith",
    "seniority": "director"
  }
}
```
 
If no match: `{"person": null}`. Rate limit: 5 req/sec on Basic plan.

Create references/clearbit-person-find.md:

# Clearbit Person Find API Reference
 
```
GET https://person.clearbit.com/v2/people/find?email={email}
Authorization: Bearer {CLEARBIT_API_KEY}
```
 
Response:
```json
{
  "employment": {
    "title": "Head of Sales",
    "name": "Acme Inc",
    "seniority": "director"
  },
  "linkedin": {
    "handle": "janesmith"
  }
}
```
 
Error codes:
- 404: Person not found (expected — not an error)
- 401: Invalid API key
- 402: Insufficient credits
 
Note: Clearbit uses Bearer token auth (not Basic auth). LinkedIn URL must be constructed: `https://linkedin.com/in/{handle}`

Create references/pdl-person-enrich.md:

# People Data Labs Person Enrich API Reference
 
```
POST https://api.peopledatalabs.com/v5/person/enrich
x-api-key: {PDL_API_KEY}
Content-Type: application/json
```
 
Request body:
```json
{ "email": "jane@acme.com" }
```
 
Response:
```json
{
  "data": {
    "job_title": "Head of Sales",
    "job_company_name": "Acme Inc",
    "phone_numbers": ["+16505550177"],
    "linkedin_url": "https://linkedin.com/in/janesmith",
    "industry": "computer software"
  }
}
```
 
Error codes:
- 404: Person not found
- 402: Insufficient credits or plan doesn't include enrichment
 
Note: Person data is nested under a `data` key. Always check for `response.data` first.
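
The error codes listed in the Clearbit and PDL references can be turned into a small decision table. The sketch below is an illustration only and is not part of either reference file: the `Outcome` enum and `classify_status` function are hypothetical names, and it covers only the status codes documented above.

```python
from enum import Enum


class Outcome(Enum):
    OK = "ok"                        # parse the body and map fields
    NO_MATCH = "no_match"            # expected miss, try the next provider
    STOP_PROVIDER = "stop_provider"  # bad key or no credits, stop calling this provider
    UNEXPECTED = "unexpected"        # anything else, log and move on


def classify_status(status: int) -> Outcome:
    """Map an HTTP status from Clearbit or PDL to a cascade decision."""
    if status == 200:
        return Outcome.OK
    if status == 404:
        return Outcome.NO_MATCH
    if status in (401, 402):
        return Outcome.STOP_PROVIDER
    return Outcome.UNEXPECTED
```

Treating 404 as a normal miss keeps the cascade moving, while 401 and 402 are worth surfacing in the summary because every later call to that provider would fail the same way.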

Create references/hubspot-contacts-api.md:

# HubSpot Contacts API Reference
 
## Search for contacts missing fields
 
```
POST https://api.hubapi.com/crm/v3/objects/contacts/search
Authorization: Bearer {HUBSPOT_ACCESS_TOKEN}
Content-Type: application/json
```
 
Request body:
```json
{
  "filterGroups": [{
    "filters": [{
      "propertyName": "jobtitle",
      "operator": "NOT_HAS_PROPERTY"
    }]
  }],
  "properties": ["email", "jobtitle", "company", "phone", "linkedin_url", "industry"],
  "limit": 50,
  "after": 0
}
```
 
Pagination: pass the `paging.next.after` value from each response as `after` in the next request, and stop when `paging.next` is absent.
 
## Update a contact
 
```
PATCH https://api.hubapi.com/crm/v3/objects/contacts/{contactId}
Authorization: Bearer {HUBSPOT_ACCESS_TOKEN}
Content-Type: application/json
```
 
Request body:
```json
{
  "properties": {
    "jobtitle": "Head of Sales",
    "enrichment_source": "apollo+clearbit"
  }
}
```
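
The pagination rule in this reference could be implemented along these lines. This is a sketch outside the reference file itself: `fetch_page` is a hypothetical callable that POSTs the search body with the given `after` cursor (or `None` for the first page) and returns the parsed JSON, and the default pause mirrors the 200ms rule in SKILL.md.

```python
import time
from typing import Any, Callable, Dict, Iterator, Optional


def iter_contacts(
    fetch_page: Callable[[Optional[str]], Dict[str, Any]],
    pause_seconds: float = 0.2,
) -> Iterator[Dict[str, Any]]:
    """Yield contacts from every page of a HubSpot search.

    `fetch_page` is an assumed helper, not a HubSpot SDK function.
    """
    after: Optional[str] = None
    while True:
        page = fetch_page(after)
        yield from page.get("results", [])

        next_page = (page.get("paging") or {}).get("next")
        if not next_page:
            return  # no `paging.next` means this was the last page
        after = next_page["after"]
        time.sleep(pause_seconds)  # pause between search pages
```

Injecting `fetch_page` keeps the loop independent of any HTTP library, so it can be exercised with canned pages before it touches a real portal.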

Create templates/field-mapping.md:

# Waterfall Field Mapping
 
## Provider → HubSpot field mapping
 
| HubSpot property | Apollo field | Clearbit field | PDL field |
|---|---|---|---|
| jobtitle | person.title | employment.title | data.job_title |
| company | person.organization.name | employment.name | data.job_company_name |
| phone | person.phone_numbers[0].sanitized_number | (not available) | data.phone_numbers[0] |
| linkedin_url | person.linkedin_url | "https://linkedin.com/in/" + linkedin.handle | data.linkedin_url |
| industry | person.organization.industry | (not available) | data.industry |
 
## Merge rules
 
1. Process providers in order: Apollo → Clearbit → PDL
2. For each field, keep the FIRST non-null value from any provider
3. Never overwrite a field that already has a value from a previous provider
4. If all target fields are filled after any provider, skip remaining providers
5. Track which providers contributed data in enrichment_source (e.g., "apollo+clearbit")
 
## Personal email domains to skip
 
gmail.com, yahoo.com, hotmail.com, outlook.com, aol.com, protonmail.com, icloud.com, me.com, live.com
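
To illustrate how the mapping table and skip list in this template might translate into code, here is a minimal Python sketch. It is not part of field-mapping.md: the function names (`is_business_email`, `map_apollo`, `map_clearbit`, `map_pdl`) are assumptions, and the response shapes are taken from the sample payloads in the reference files.

```python
from typing import Any, Dict, Optional

PERSONAL_DOMAINS = {
    "gmail.com", "yahoo.com", "hotmail.com", "outlook.com", "aol.com",
    "protonmail.com", "icloud.com", "me.com", "live.com",
}

Fields = Dict[str, Optional[str]]


def is_business_email(email: str) -> bool:
    """Return True unless the address is malformed or uses a personal domain."""
    if "@" not in email:
        return False
    domain = email.rsplit("@", 1)[1].strip().lower()
    return bool(domain) and domain not in PERSONAL_DOMAINS


def map_apollo(resp: Dict[str, Any]) -> Fields:
    person = resp.get("person") or {}  # {"person": null} means no match
    org = person.get("organization") or {}
    phones = person.get("phone_numbers") or []
    return {
        "jobtitle": person.get("title"),
        "company": org.get("name"),
        "phone": phones[0].get("sanitized_number") if phones else None,
        "linkedin_url": person.get("linkedin_url"),
        "industry": org.get("industry"),
    }


def map_clearbit(resp: Dict[str, Any]) -> Fields:
    employment = resp.get("employment") or {}
    handle = (resp.get("linkedin") or {}).get("handle")
    return {
        "jobtitle": employment.get("title"),
        "company": employment.get("name"),
        # Clearbit returns only the handle, so the URL is constructed
        "linkedin_url": f"https://linkedin.com/in/{handle}" if handle else None,
    }


def map_pdl(resp: Dict[str, Any]) -> Fields:
    data = resp.get("data") or {}  # person data is nested under `data`
    phones = data.get("phone_numbers") or []
    return {
        "jobtitle": data.get("job_title"),
        "company": data.get("job_company_name"),
        "phone": phones[0] if phones else None,
        "linkedin_url": data.get("linkedin_url"),
        "industry": data.get("industry"),
    }
```

Each mapper returns `None` for anything the provider did not supply, which lets the merge step treat "missing" uniformly across providers.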

Step 4: Test the skill

# In Claude Code
/waterfall-enrich

Start with a single contact to verify the cascade:

"Enrich one contact missing a job title. Show me what each provider returns before writing to HubSpot."

The agent will call each provider, show the merge results, and wait for your approval.

Step 5: Schedule it (optional)

Option A: Cron + CLI

# Run every 6 hours
0 */6 * * * cd /path/to/project && claude -p "Run /waterfall-enrich for contacts missing job titles. Limit to 50." --allowedTools 'Bash,Read' >> /var/log/waterfall-enrich.log 2>&1

Option B: GitHub Actions

name: Waterfall Enrichment
on:
  schedule:
    - cron: '0 */6 * * *'  # Every 6 hours
  workflow_dispatch: {}
jobs:
  enrich:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: anthropics/claude-code-action@v1
        with:
          prompt: "Run /waterfall-enrich for contacts missing job titles. Limit to 50."
          allowed_tools: "Bash,Read"
        env:
          HUBSPOT_ACCESS_TOKEN: ${{ secrets.HUBSPOT_ACCESS_TOKEN }}
          APOLLO_API_KEY: ${{ secrets.APOLLO_API_KEY }}
          CLEARBIT_API_KEY: ${{ secrets.CLEARBIT_API_KEY }}
          PDL_API_KEY: ${{ secrets.PDL_API_KEY }}

Option C: Cowork Scheduled Tasks

Claude Desktop's Cowork supports built-in scheduled tasks. Open a Cowork session, type /schedule, and configure the cadence — hourly, daily, weekly, or weekdays only. Each scheduled run has full access to your connected tools, plugins, and MCP servers.

Scheduled tasks only run while your computer is awake and Claude Desktop is open. If a run is missed, Cowork executes it automatically when the app reopens. For always-on scheduling, use GitHub Actions (Option B) instead. Available on all paid plans (Pro, Max, Team, Enterprise).

Troubleshooting

When to use this approach

You want to test the waterfall pattern before committing to a platform
You want full control over provider order and field-merging logic
You need ad-hoc enrichment ("enrich the contacts we imported today using all three providers")
You want to run tasks in the background via Claude Cowork while focusing on other work
You want to compare provider coverage on your specific contact list

When to switch to a dedicated tool

You need real-time enrichment on contact creation (not batch)
Multiple team members need to modify provider settings without talking to an agent
You want visual monitoring of provider call rates, fill rates, and error rates
You want to chain enrichment with scoring and routing in one platform

Common questions

Can I change the provider order by just asking?

Yes. Tell the agent: "Try Clearbit first, then Apollo, then PDL." It will reorder the cascade without modifying any files. This is useful for A/B testing which provider gives the best coverage for your specific ICP.

How many API credits does a typical run use?

For 50 contacts: 50 Apollo credits (always called first). If 30% need Clearbit, that's 15 Clearbit credits. If 10% need PDL, that's 5 PDL credits. Total: 50 + 15 + 5 = 70 credits across all three providers.
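
The same arithmetic can be written as a small helper for other batch sizes. This is a rough sketch: the function name `estimate_credits` is made up for this example, and the fallback rates are inputs you would measure on your own contact list rather than fixed properties of the providers.

```python
def estimate_credits(contacts: int, clearbit_rate: float, pdl_rate: float) -> dict:
    """Estimate per-provider credit usage for one waterfall run.

    Apollo is called for every contact; the two rates are the assumed
    fractions of contacts that still have gaps and fall through.
    """
    apollo = contacts
    clearbit = round(contacts * clearbit_rate)
    pdl = round(contacts * pdl_rate)
    return {
        "apollo": apollo,
        "clearbit": clearbit,
        "pdl": pdl,
        "total": apollo + clearbit + pdl,
    }


print(estimate_credits(50, 0.30, 0.10))
# {'apollo': 50, 'clearbit': 15, 'pdl': 5, 'total': 70}
```

Credits are billed separately by each provider, so the total is a count of lookups and not a single price.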

Can I start with just Apollo and add providers later?

Yes. The agent handles missing API keys gracefully: if CLEARBIT_API_KEY or PDL_API_KEY isn't set, that provider is skipped automatically. Start with Apollo alone, measure fill rates, then add Clearbit when you see specific gaps.
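
A generated script might decide which providers to call with a check like the one below. This is a hedged sketch rather than the agent's actual code: `active_providers` and `PROVIDER_KEYS` are hypothetical names, and only the environment variable names come from this guide.

```python
import os
from typing import List, Mapping

# Default cascade order paired with the environment variable for each provider.
PROVIDER_KEYS = [
    ("apollo", "APOLLO_API_KEY"),
    ("clearbit", "CLEARBIT_API_KEY"),
    ("pdl", "PDL_API_KEY"),
]


def active_providers(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return providers whose API key is set, in cascade order."""
    if not env.get("APOLLO_API_KEY"):
        # Apollo is listed as required in the prerequisites
        raise RuntimeError("APOLLO_API_KEY is required")
    return [name for name, key in PROVIDER_KEYS if env.get(key)]
```

With only `APOLLO_API_KEY` and `PDL_API_KEY` set, this returns `["apollo", "pdl"]`, so the cascade goes straight from Apollo to PDL.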

What's the difference between this and the code approach?

The code approach gives you a static script with a fixed provider array. The Claude Code skill lets you adjust the cascade in conversation — reorder providers, change which fields to target, or add a fourth provider — without editing files.

Cost

Apollo: 1 credit/enrichment — called for every contact. $49/mo Basic = 900 credits.
Clearbit: Volume-based starting at $99/mo — called for ~30% of contacts.
People Data Labs: $0.03-0.10/enrichment — called for ~10-15% of contacts.
Claude Code: Usage-based pricing per conversation.
The waterfall saves money: for 100 contacts, you make roughly 100 Apollo, 30 Clearbit, and 10 PDL lookups (about 140 in total) instead of 100 at each provider (300 in total).