Auto-enrich new HubSpot contacts with Apollo using code
Prerequisites
- Node.js 18+ or Python 3.9+
- HubSpot private app token with crm.objects.contacts.read and crm.objects.contacts.write scopes
- Apollo API key from Settings → Integrations → API
- A scheduling environment: cron, GitHub Actions, or a cloud function
Why code?
A script gives you maximum control with zero platform cost. You can use Apollo's bulk_match endpoint (10 contacts per request), add custom filtering logic (skip personal emails, only enrich ICP-matching domains), and version-control everything in your repo. GitHub Actions provides free scheduling with 2,000 minutes/month.
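As a rough illustration of the batching mentioned above, the sketch below splits a list of emails into groups of 10 and builds one request body per group. The helper names (chunked, build_bulk_match_payloads) are hypothetical, and the {"details": [...]} body shape is an assumption about Apollo's bulk_match endpoint that should be checked against Apollo's API reference before use.

```python
from typing import Iterator

BATCH_SIZE = 10  # per the text above: bulk_match takes 10 contacts per request


def chunked(items: list, size: int = BATCH_SIZE) -> Iterator[list]:
    """Yield consecutive slices of `items` with at most `size` elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_bulk_match_payloads(emails: list) -> list:
    """Build one request body per batch (the body shape is an assumption)."""
    return [
        {"details": [{"email": email} for email in batch]}
        for batch in chunked(emails)
    ]


if __name__ == "__main__":
    emails = [f"user{i}@example.com" for i in range(23)]
    payloads = build_bulk_match_payloads(emails)
    print([len(p["details"]) for p in payloads])  # [10, 10, 3]
```

Each payload would then be sent in a single POST, so 23 contacts cost three requests instead of 23.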
The trade-off is maintenance. You're responsible for error handling, rate limiting, pagination, and monitoring. There's no visual execution history — you debug by reading logs. If you want a visual workflow builder or non-technical team members need to modify the logic, use n8n or Make instead.
Step 1: Set up the project
# Test your Apollo API key
curl -X POST "https://api.apollo.io/api/v1/people/match" \
-H "x-api-key: $APOLLO_API_KEY" \
-H "Content-Type: application/json" \
-d '{"email": "test@example.com"}'

Step 2: Fetch recently created contacts from HubSpot
Poll for contacts created since the last run. Use the HubSpot Search API with a createdate filter:
# Fetch contacts created in the last hour
SINCE=$(date -v-1H +%s000 2>/dev/null || date -d '1 hour ago' +%s000)
curl -s -X POST "https://api.hubapi.com/crm/v3/objects/contacts/search" \
-H "Authorization: Bearer $HUBSPOT_ACCESS_TOKEN" \
-H "Content-Type: application/json" \
-d "{
\"filterGroups\": [{
\"filters\": [{
\"propertyName\": \"createdate\",
\"operator\": \"GTE\",
\"value\": \"$SINCE\"
}]
}],
\"properties\": [\"email\", \"firstname\", \"lastname\", \"jobtitle\", \"company\"],
\"limit\": 100
}"

Step 3: Enrich each contact via Apollo
Call Apollo's People Match endpoint for each contact. Skip contacts that already have a job title (avoid wasting credits on already-enriched records):
import os
import time

import requests

APOLLO_API_KEY = os.environ["APOLLO_API_KEY"]

def enrich_contact(email):
    """Return Apollo's person record for an email, or None if there is no match."""
    resp = requests.post(
        "https://api.apollo.io/api/v1/people/match",
        headers={
            "x-api-key": APOLLO_API_KEY,
            "Content-Type": "application/json",
        },
        json={"email": email},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json().get("person")

def enrich_all(contacts):
    results = []
    for contact in contacts:
        email = contact["properties"].get("email")
        existing_title = contact["properties"].get("jobtitle")
        # Skip contacts with no email or with a job title already set
        if not email or existing_title:
            continue
        person = enrich_contact(email)
        if person:
            results.append({"contact_id": contact["id"], "person": person})
        time.sleep(0.5)  # at most 2 req/sec, under Apollo's 5 req/sec Basic limit
    return results

Apollo's rate limit varies by plan: 5 requests/second on Basic, 10/second on Professional. A 429 response includes a Retry-After header. A fixed sleep(0.5) between requests keeps you under either limit; add exponential backoff if you want the script to recover from any 429 it still receives.
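One way to handle a 429 is a small retry wrapper. The sketch below is an illustration under stated assumptions, not Apollo's recommended client: call_with_backoff, its parameters, and the retry cap are hypothetical. It retries only on HTTP 429, waits for the Retry-After value when the header is present and numeric, and otherwise doubles the delay on each attempt.

```python
import time
from typing import Callable


def call_with_backoff(send: Callable[[], object], max_retries: int = 5,
                      base_delay: float = 1.0,
                      sleep: Callable[[float], None] = time.sleep):
    """Call send() until it returns a non-429 response or retries run out.

    `send` is any zero-argument callable returning an object with
    `status_code` and `headers` attributes (for example a requests.Response).
    """
    for attempt in range(max_retries + 1):
        resp = send()
        if resp.status_code != 429 or attempt == max_retries:
            return resp
        retry_after = resp.headers.get("Retry-After", "")
        try:
            delay = float(retry_after)  # server-provided wait, in seconds
        except ValueError:
            # Header missing or not numeric: fall back to 1s, 2s, 4s, ...
            delay = base_delay * (2 ** attempt)
        sleep(delay)
```

In enrich_contact, the requests.post(...) call could be wrapped as call_with_backoff(lambda: requests.post(...)) before calling raise_for_status(), so that only a 429 that persists past the retry cap raises.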
Step 4: Write enriched data back to HubSpot
Update each contact with the data Apollo returned. Only write fields that have values — never overwrite existing HubSpot data with empty strings:
import os

import requests

HS_HEADERS = {
    "Authorization": f"Bearer {os.environ['HUBSPOT_ACCESS_TOKEN']}",
    "Content-Type": "application/json",
}

def update_hubspot_contact(contact_id, person):
    # "or" fallbacks guard against Apollo returning null for these fields
    org = person.get("organization") or {}
    phones = person.get("phone_numbers") or []
    properties = {}
    if person.get("title"):
        properties["jobtitle"] = person["title"]
    if org.get("name"):
        properties["company"] = org["name"]
    if phones and phones[0].get("sanitized_number"):
        properties["phone"] = phones[0]["sanitized_number"]
    if person.get("linkedin_url"):
        # Assumes a linkedin_url contact property exists in your HubSpot portal
        properties["linkedin_url"] = person["linkedin_url"]
    if org.get("industry"):
        properties["industry"] = org["industry"]
    if not properties:
        return  # nothing to write
    resp = requests.patch(
        f"https://api.hubapi.com/crm/v3/objects/contacts/{contact_id}",
        headers=HS_HEADERS,
        json={"properties": properties},
        timeout=30,
    )
    resp.raise_for_status()
    print(f"Updated contact {contact_id}: {list(properties.keys())}")

Step 5: Tie it together and schedule
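The main() function in this step calls get_new_contacts, which Step 2 shows only as a curl request. A minimal Python sketch of that helper, assuming the same Search API request and following the paging.next.after cursor in each response, might look like this. The post parameter is a hypothetical hook added so the paging logic can be exercised without network access; by default it uses requests.post.

```python
import os
import time
from datetime import datetime, timedelta, timezone

SEARCH_URL = "https://api.hubapi.com/crm/v3/objects/contacts/search"
PROPERTIES = ["email", "firstname", "lastname", "jobtitle", "company"]


def _default_post(url, json):
    """Send the search request with the token from the environment."""
    import requests  # imported lazily so the paging logic can run offline

    resp = requests.post(
        url,
        headers={"Authorization": f"Bearer {os.environ['HUBSPOT_ACCESS_TOKEN']}"},
        json=json,
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()


def get_new_contacts(since_minutes=65, post=_default_post, pause=0.2):
    """Return all contacts created in the last `since_minutes` minutes."""
    since = datetime.now(timezone.utc) - timedelta(minutes=since_minutes)
    body = {
        "filterGroups": [{"filters": [{
            "propertyName": "createdate",
            "operator": "GTE",
            "value": str(int(since.timestamp() * 1000)),  # epoch milliseconds
        }]}],
        "properties": PROPERTIES,
        "limit": 100,
    }
    contacts = []
    while True:
        page = post(SEARCH_URL, json=body)
        contacts.extend(page.get("results", []))
        after = page.get("paging", {}).get("next", {}).get("after")
        if not after:
            return contacts
        body["after"] = after  # cursor for the next page
        time.sleep(pause)  # 200 ms between paginated calls (see Rate limits)
```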
def main():
    print("Fetching new contacts...")
    contacts = get_new_contacts(since_minutes=65)  # slight overlap to avoid gaps
    print(f"Found {len(contacts)} new contacts")
    enriched = enrich_all(contacts)
    print(f"Enriched {len(enriched)} contacts via Apollo")
    for item in enriched:
        update_hubspot_contact(item["contact_id"], item["person"])
    print("Done.")

if __name__ == "__main__":
    main()

Schedule it with cron or GitHub Actions:
# .github/workflows/enrich-contacts.yml
name: Enrich New HubSpot Contacts
on:
  schedule:
    - cron: '0 * * * *'  # Every hour
  workflow_dispatch: {}
jobs:
  enrich:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.12'
      - run: pip install requests
      - run: python enrich.py
        env:
          HUBSPOT_ACCESS_TOKEN: ${{ secrets.HUBSPOT_ACCESS_TOKEN }}
          APOLLO_API_KEY: ${{ secrets.APOLLO_API_KEY }}

Rate limits
| API | Limit | Notes |
|---|---|---|
| HubSpot Search | 5 req/sec | Add 200ms delay between paginated calls |
| HubSpot PATCH | 150 req/10 sec | Plenty for most volumes |
| Apollo People Match | 5 req/sec (Basic) | Add 500ms delay between calls |
Cost
- Apollo: 1 credit per enrichment. Basic plan ($49/mo) = 900 credits. Professional ($79/mo) = 2,400 credits.
- HubSpot: Free within API rate limits.
- GitHub Actions: Free tier includes 2,000 minutes/month. Each run takes ~1-2 minutes.
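A quick back-of-the-envelope check of these figures, using only the numbers quoted above and assuming the worst case of 2 minutes per run in a 31-day month:

```python
RUNS_PER_DAY = 24      # hourly schedule
MINUTES_PER_RUN = 2    # upper end of the ~1-2 minute estimate
DAYS_PER_MONTH = 31    # worst case
FREE_MINUTES = 2000    # GitHub Actions free tier quoted above

actions_minutes = RUNS_PER_DAY * MINUTES_PER_RUN * DAYS_PER_MONTH
print(actions_minutes, actions_minutes <= FREE_MINUTES)  # 1488 True

# Apollo: cost per enrichment and average daily budget for each plan
for plan, price, credits in [("Basic", 49, 900), ("Professional", 79, 2400)]:
    per_credit = price / credits
    per_day = credits // DAYS_PER_MONTH
    print(f"{plan}: ${per_credit:.3f}/enrichment, about {per_day} contacts/day")
# Basic: $0.054/enrichment, about 29 contacts/day
# Professional: $0.033/enrichment, about 77 contacts/day
```

If you create more new contacts per day than the daily budget, credits run out before the month ends, so the skip rules in Step 3 matter more than the scheduling cost.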
Common questions
How much does this cost to run?
Zero platform cost. Apollo credits are the only variable expense: 1 credit per enrichment on the Basic plan ($49/mo for 900 credits). GitHub Actions free tier includes 2,000 minutes/month — each hourly run takes 1-2 minutes, so monthly usage is well within limits.
Should I use Python or Node.js?
Both work equally well. Python tends to be slightly more concise for API scripting; Node.js is the better choice if your team already maintains JavaScript projects. The logic and API calls are identical in either language.
How do I avoid re-enriching the same contacts?
The script uses a createdate filter with a time window (default: 65 minutes for hourly runs). For extra safety, set a custom enrichment_date property on each processed contact and add a NOT_HAS_PROPERTY filter on enrichment_date to the search query.
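As a sketch of that approach, the helpers below build a search body that excludes already-stamped contacts and produce the property to write once a contact has been processed. This assumes enrichment_date has been created in HubSpot as a custom date property and that it accepts a YYYY-MM-DD value; the function names are hypothetical.

```python
from datetime import date
from typing import Optional


def build_search_body(since_ms: int) -> dict:
    """Search body: created since `since_ms` AND enrichment_date not set."""
    return {
        "filterGroups": [{
            # Filters inside one group are combined with AND
            "filters": [
                {"propertyName": "createdate", "operator": "GTE",
                 "value": str(since_ms)},
                {"propertyName": "enrichment_date",
                 "operator": "NOT_HAS_PROPERTY"},
            ]
        }],
        "properties": ["email", "firstname", "lastname", "jobtitle", "company"],
        "limit": 100,
    }


def enrichment_stamp(today: Optional[date] = None) -> dict:
    """Property to merge into the PATCH payload once a contact is processed."""
    # Assumption: the date property accepts an ISO YYYY-MM-DD string
    return {"enrichment_date": (today or date.today()).isoformat()}
```

Stamping contacts even when Apollo returns no match keeps unmatched records from being retried, and charged for, on every overlapping window.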
Next steps
- Skip personal emails: add a domain check to skip gmail.com, yahoo.com, hotmail.com before calling Apollo
- Add logging: write enrichment results to a CSV or database for audit trails
- Deduplicate: check the enrichment_date custom property to avoid re-enriching contacts on overlapping poll windows
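The first item can be sketched as a small domain check. The domain set below contains only the three providers named above and is meant to be extended; is_personal_email is a hypothetical helper name.

```python
PERSONAL_DOMAINS = {"gmail.com", "yahoo.com", "hotmail.com"}


def is_personal_email(email: str) -> bool:
    """True if the address belongs to one of the listed free-mail domains."""
    domain = email.rsplit("@", 1)[-1].strip().lower()
    return domain in PERSONAL_DOMAINS


if __name__ == "__main__":
    for address in ["jane@gmail.com", "Bob@Yahoo.com", "cto@acme.io"]:
        print(address, is_personal_email(address))
```

In enrich_all, the skip condition could then become: if not email or existing_title or is_personal_email(email).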