Auto-enrich new HubSpot contacts with Apollo using code

Prerequisites

Node.js 18+ or Python 3.9+
HubSpot private app token with crm.objects.contacts.read and crm.objects.contacts.write scopes
Apollo API key from Settings → Integrations → API
A scheduling environment: cron, GitHub Actions, or a cloud function

Why code?

A script gives you maximum control with zero platform cost. You can use Apollo's bulk_match endpoint (10 contacts per request), add custom filtering logic (skip personal emails, only enrich ICP-matching domains), and version-control everything in your repo. GitHub Actions provides free scheduling with 2,000 minutes/month.

The trade-off is maintenance. You're responsible for error handling, rate limiting, pagination, and monitoring. There's no visual execution history — you debug by reading logs. If you want a visual workflow builder or non-technical team members need to modify the logic, use n8n or Make instead.

Step 1: Set up the project

# Test your Apollo API key
curl -X POST "https://api.apollo.io/api/v1/people/match" \
  -H "x-api-key: $APOLLO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"email": "test@example.com"}'
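
Beyond the key check above, the script itself needs both credentials at startup. A minimal sketch of a fail-fast loader, assuming the environment variable names used elsewhere in this guide (HUBSPOT_ACCESS_TOKEN and APOLLO_API_KEY); the load_config helper is a hypothetical name, not part of either API:

```python
import os

# Variable names match the ones used in the curl examples and the workflow file
REQUIRED_VARS = ("HUBSPOT_ACCESS_TOKEN", "APOLLO_API_KEY")

def load_config(env=None):
    """Return the required credentials, failing fast if any are missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {name: env[name] for name in REQUIRED_VARS}
```

Failing at startup means a misconfigured secret shows up as one clear error in the logs rather than as a 401 halfway through a run.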

Step 2: Fetch recently created contacts from HubSpot

Poll for contacts created since the last run. Use the HubSpot Search API with a createdate filter:

# Fetch contacts created in the last hour
SINCE=$(date -v-1H +%s000 2>/dev/null || date -d '1 hour ago' +%s000)
curl -s -X POST "https://api.hubapi.com/crm/v3/objects/contacts/search" \
  -H "Authorization: Bearer $HUBSPOT_ACCESS_TOKEN" \
  -H "Content-Type: application/json" \
  -d "{
    \"filterGroups\": [{
      \"filters\": [{
        \"propertyName\": \"createdate\",
        \"operator\": \"GTE\",
        \"value\": \"$SINCE\"
      }]
    }],
    \"properties\": [\"email\", \"firstname\", \"lastname\", \"jobtitle\", \"company\"],
    \"limit\": 100
  }"
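
The same search can be written as the get_new_contacts function that the script in Step 5 calls. This is a sketch rather than a definitive implementation: it uses only the standard library so it can be tried without installing anything, the post_json helper and the token, post, and now parameters are hypothetical conveniences, and it assumes the Search API returns its paging cursor under paging.next.after.

```python
import json
import os
import time
import urllib.request

SEARCH_URL = "https://api.hubapi.com/crm/v3/objects/contacts/search"

def post_json(url, token, body):
    """POST a JSON body with a bearer token and return the decoded response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)

def get_new_contacts(since_minutes=60, token=None, post=post_json, now=None):
    """Return contacts created in the last `since_minutes`, following paging cursors."""
    token = token or os.environ.get("HUBSPOT_ACCESS_TOKEN", "")
    now = time.time() if now is None else now
    since_ms = int((now - since_minutes * 60) * 1000)  # createdate is compared in epoch ms
    body = {
        "filterGroups": [{"filters": [
            {"propertyName": "createdate", "operator": "GTE", "value": str(since_ms)}
        ]}],
        "properties": ["email", "firstname", "lastname", "jobtitle", "company"],
        "limit": 100,
    }
    contacts = []
    while True:
        data = post(SEARCH_URL, token, body)
        contacts.extend(data.get("results", []))
        # ASSUMPTION: the next-page cursor lives at paging.next.after
        after = data.get("paging", {}).get("next", {}).get("after")
        if not after:
            return contacts
        body["after"] = after
        time.sleep(0.2)  # pause between pages to stay under the Search API limit
```

Passing post and now as parameters keeps the paging logic testable without network access; in production the defaults are enough.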

Step 3: Enrich each contact via Apollo

Call Apollo's People Match endpoint for each contact. Skip contacts that already have a job title (avoid wasting credits on already-enriched records):

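import os
import time

import requests

# Credentials come from the environment (see the workflow file in Step 5)
APOLLO_API_KEY = os.environ["APOLLO_API_KEY"]
HS_HEADERS = {
    "Authorization": f"Bearer {os.environ['HUBSPOT_ACCESS_TOKEN']}",
    "Content-Type": "application/json",
}

def enrich_contact(email):
    """Look up one person by email; returns Apollo's person dict or None."""
    resp = requests.post(
        "https://api.apollo.io/api/v1/people/match",
        headers={
            "x-api-key": APOLLO_API_KEY,
            "Content-Type": "application/json"
        },
        json={"email": email},
        timeout=30,  # avoid hanging a scheduled run on a stalled connection
    )
    resp.raise_for_status()
    return resp.json().get("person")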
 
def enrich_all(contacts):
    results = []
    for contact in contacts:
        email = contact["properties"].get("email")
        existing_title = contact["properties"].get("jobtitle")
 
        if not email or existing_title:
            continue
 
        person = enrich_contact(email)
        if person:
            results.append({"contact_id": contact["id"], "person": person})
 
        time.sleep(0.5)  # ~2 req/sec, under the 5 req/sec limit quoted for the Basic plan
 
    return results
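
For larger volumes, the per-contact loop above could be swapped for Apollo's bulk_match endpoint, which the introduction notes takes 10 contacts per request. A minimal sketch of the batching step only, with no HTTP call. Both helper names are hypothetical, and the "details" payload key is an assumption about the bulk request format that should be checked against Apollo's current API reference:

```python
BULK_SIZE = 10  # contacts per bulk_match request, per the figure quoted earlier

def chunk(items, size=BULK_SIZE):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

def build_bulk_payloads(contacts):
    """Group contacts that still need enrichment into bulk request bodies."""
    emails = [
        c["properties"]["email"]
        for c in contacts
        # Same skip rule as enrich_all: need an email, and no existing job title
        if c["properties"].get("email") and not c["properties"].get("jobtitle")
    ]
    # ASSUMPTION: the endpoint expects {"details": [{"email": ...}, ...]}
    return [{"details": [{"email": e} for e in batch]} for batch in chunk(emails)]
```

Each returned body would then be sent as one request, cutting the number of Apollo calls by up to a factor of ten.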

Apollo rate limits

Apollo's rate limit varies by plan: 5 requests/second on Basic, 10/sec on Professional. A 429 response includes a Retry-After header. A fixed time.sleep(0.5) between requests keeps the script at roughly 2 requests/second, under both limits; add exponential backoff on top if you still see 429 responses.
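
One way to handle 429 responses is a small retry wrapper. The sketch below assumes the Retry-After header, when present, holds a whole number of seconds; call_with_backoff and its parameters are hypothetical names, and the wrapper accepts any callable so it is not tied to a particular HTTP library:

```python
import random
import time

def call_with_backoff(send, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call `send()` until it returns a non-429 response or retries run out.

    `send` is any zero-argument callable returning an object with
    `status_code` and `headers` attributes (a requests.Response qualifies).
    """
    for attempt in range(max_retries + 1):
        resp = send()
        if resp.status_code != 429 or attempt == max_retries:
            return resp
        retry_after = resp.headers.get("Retry-After")
        if retry_after and retry_after.isdigit():
            delay = float(retry_after)             # server-provided wait, in seconds
        else:
            delay = base_delay * 2 ** attempt      # 1s, 2s, 4s, ...
            delay += random.uniform(0, delay / 2)  # jitter to avoid synchronized retries
        sleep(delay)
```

With requests, a call would look like call_with_backoff(lambda: requests.post(url, headers=headers, json=body)). The last 429 is returned rather than raised, so the caller's raise_for_status() still reports the failure.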

Step 4: Write enriched data back to HubSpot

Update each contact with the data Apollo returned. Only write fields that have values — never overwrite existing HubSpot data with empty strings:

def update_hubspot_contact(contact_id, person):
    properties = {}
 
    if person.get("title"):
        properties["jobtitle"] = person["title"]
    if person.get("organization", {}).get("name"):
        properties["company"] = person["organization"]["name"]
    if person.get("phone_numbers") and person["phone_numbers"][0].get("sanitized_number"):
        properties["phone"] = person["phone_numbers"][0]["sanitized_number"]
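    if person.get("linkedin_url"):
        # HubSpot rejects unknown property names, so a contact property with
        # the internal name linkedin_url must exist in your account
        properties["linkedin_url"] = person["linkedin_url"]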
    if person.get("organization", {}).get("industry"):
        properties["industry"] = person["organization"]["industry"]
 
    if not properties:
        return
 
    resp = requests.patch(
        f"https://api.hubapi.com/crm/v3/objects/contacts/{contact_id}",
        headers=HS_HEADERS,
        json={"properties": properties}
    )
    resp.raise_for_status()
    print(f"Updated contact {contact_id}: {list(properties.keys())}")

Step 5: Tie it together and schedule

def main():
    print("Fetching new contacts...")
    contacts = get_new_contacts(since_minutes=65)  # slight overlap to avoid gaps
    print(f"Found {len(contacts)} new contacts")
 
    enriched = enrich_all(contacts)
    print(f"Enriched {len(enriched)} contacts via Apollo")
 
    for item in enriched:
        update_hubspot_contact(item["contact_id"], item["person"])
 
    print("Done.")
 
if __name__ == "__main__":
    main()

Schedule it with cron or GitHub Actions:

# .github/workflows/enrich-contacts.yml
name: Enrich New HubSpot Contacts
on:
  schedule:
    - cron: '0 * * * *'  # Every hour
  workflow_dispatch: {}
jobs:
  enrich:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.12'
      - run: pip install requests
      - run: python enrich.py
        env:
          HUBSPOT_ACCESS_TOKEN: ${{ secrets.HUBSPOT_ACCESS_TOKEN }}
          APOLLO_API_KEY: ${{ secrets.APOLLO_API_KEY }}

Rate limits

API	Limit	Notes
HubSpot Search	5 req/sec	Add 200ms delay between paginated calls
HubSpot PATCH	150 req/10 sec	Plenty for most volumes
Apollo People Match	5 req/sec (Basic)	Add 500ms delay between calls

Cost

Apollo: 1 credit per enrichment. Basic plan ($49/mo) = 900 credits. Professional ($79/mo) = 2,400 credits.
HubSpot: Free within API rate limits.
GitHub Actions: Free tier includes 2,000 minutes/month. Each run takes ~1-2 minutes.
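
To sanity-check these figures for an hourly schedule, here is a quick back-of-the-envelope calculation. It uses only the numbers above, and assumes the upper estimate of 2 minutes per run and a 30-day month:

```python
RUNS_PER_DAY = 24        # hourly schedule
MINUTES_PER_RUN = 2      # upper end of the ~1-2 minute estimate
DAYS_PER_MONTH = 30      # assumed month length
FREE_MINUTES = 2000      # GitHub Actions free tier
BASIC_CREDITS = 900      # Apollo Basic plan, 1 credit per enrichment

actions_minutes = RUNS_PER_DAY * MINUTES_PER_RUN * DAYS_PER_MONTH
credits_per_day = BASIC_CREDITS // DAYS_PER_MONTH

print(f"Actions minutes/month: {actions_minutes} of {FREE_MINUTES}")
print(f"Enrichments/day on Basic: {credits_per_day}")
```

Under those assumptions the schedule uses 1,440 of the 2,000 free minutes, and the Basic plan covers about 30 enrichments per day, so Apollo credits are likely to run out before Actions minutes do.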

Common questions

How much does this cost to run?

Zero platform cost. Apollo credits are the only variable expense: 1 credit per enrichment on the Basic plan ($49/mo for 900 credits). GitHub Actions free tier includes 2,000 minutes/month — each hourly run takes 1-2 minutes, so monthly usage is well within limits.

Should I use Python or Node.js?

Both work equally well. Python tends to be slightly more concise for API scripting, and the examples in this guide use it. Node.js is a better fit if your team already maintains JavaScript projects. The logic and API calls are identical.

How do I avoid re-enriching the same contacts?

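The script uses a createdate filter with a time window (default: 65 minutes for hourly runs), so contacts created in the 5-minute overlap are fetched twice. The job-title check in Step 3 skips any that were enriched on the first pass. For extra safety, set a custom enrichment_date property on each processed contact and add a NOT_HAS_PROPERTY filter on enrichment_date to the search query.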
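
The marker approach can be sketched as two small helpers: one builds the search body with both filters, the other builds the property update that stamps a processed contact. This assumes enrichment_date has been created in HubSpot as a date property that accepts YYYY-MM-DD values; both function names are hypothetical.

```python
from datetime import datetime, timezone

def build_search_body(since_ms, marker_property="enrichment_date"):
    """Search body for contacts created since `since_ms` that lack the marker."""
    return {
        "filterGroups": [{
            # Filters inside one group are combined with AND
            "filters": [
                {"propertyName": "createdate", "operator": "GTE", "value": str(since_ms)},
                {"propertyName": marker_property, "operator": "NOT_HAS_PROPERTY"},
            ]
        }],
        "properties": ["email", "firstname", "lastname", "jobtitle", "company"],
        "limit": 100,
    }

def enrichment_marker(now=None, marker_property="enrichment_date"):
    """Property update that stamps a contact as processed (UTC date)."""
    now = now or datetime.now(timezone.utc)
    # ASSUMPTION: the property is a date type that accepts YYYY-MM-DD strings
    return {marker_property: now.strftime("%Y-%m-%d")}
```

Merging enrichment_marker() into the properties sent in Step 4 stamps every contact the script updates. To also stop retrying contacts Apollo could not match, the marker would need to be written for those as well.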

Next steps

Skip personal emails — add a domain check to skip gmail.com, yahoo.com, hotmail.com before calling Apollo
Add logging — write enrichment results to a CSV or database for audit trails
Deduplicate — check enrichment_date custom property to avoid re-enriching contacts on overlapping poll windows
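
The first item above can be sketched as a simple domain check. The domain set contains only the three providers named in the list and would need extending for real use; is_personal_email is a hypothetical helper name.

```python
# Free-mail domains to skip; extend this set to suit your own lead sources
PERSONAL_DOMAINS = {"gmail.com", "yahoo.com", "hotmail.com"}

def is_personal_email(email):
    """True when the address belongs to one of the listed free-mail domains."""
    domain = email.rsplit("@", 1)[-1].strip().lower()
    return domain in PERSONAL_DOMAINS
```

Calling it in enrich_all just before enrich_contact, and continuing when it returns True, avoids spending a credit on those addresses.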