Snapshot your HubSpot pipeline weekly in Google Sheets using code and cron

Medium complexity · Cost: $0

Prerequisites

  • Node.js 18+ or Python 3.9+
  • HubSpot private app token (scopes: crm.objects.deals.read, crm.schemas.deals.read)
  • Google Cloud project with the Google Sheets API enabled
  • Google service account with a JSON key file, or OAuth2 credentials
  • The service account's email address shared as an Editor on the target Google Sheet
  • A scheduling environment: cron, GitHub Actions, or a cloud function

Why code?

A script gives you full control over snapshot logic, column ordering, and metric calculation with zero ongoing cost. The pipeline snapshot runs once a week and takes seconds — perfect for cron or GitHub Actions. The trade-off is you handle errors, monitoring, and Google credential management yourself. Best for teams with a developer who can maintain the script and wants version-controlled infrastructure.

How it works

  • Cron or GitHub Actions triggers the script every Monday morning
  • HubSpot Pipelines API fetches stage definitions for ID-to-label mapping
  • HubSpot Search API pulls all active deals with pagination
  • Aggregation logic computes total value, deal count, and per-stage breakdowns
  • Google Sheets API v4 appends a single snapshot row via service account auth

Step 1: Set up the project

# Verify your HubSpot token works
curl -s "https://api.hubapi.com/crm/v3/pipelines/deals" \
  -H "Authorization: Bearer $HUBSPOT_ACCESS_TOKEN" | head -c 200
Google service account setup

Create a service account in the Google Cloud Console, download the JSON key file, and share your Google Sheet with the service account email (e.g., snapshot@your-project.iam.gserviceaccount.com) as an Editor. Store the key file path in the GOOGLE_SERVICE_ACCOUNT_KEY environment variable.
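As a quick sanity check, a short script can confirm the key file parses and surface the exact email address to share the sheet with. This is a hypothetical helper, not part of any Google library; the field names it checks (client_email, private_key, project_id) are part of Google's standard service account key format:

```python
import json
import os

def check_service_account_key(path):
    """Load a service account key file and return the email to share the sheet with."""
    with open(path) as f:
        key = json.load(f)
    # Standard service account key files carry these fields.
    for field in ("client_email", "private_key", "project_id"):
        if field not in key:
            raise ValueError(f"Key file missing expected field: {field}")
    return key["client_email"]

if "GOOGLE_SERVICE_ACCOUNT_KEY" in os.environ:
    email = check_service_account_key(os.environ["GOOGLE_SERVICE_ACCOUNT_KEY"])
    print(f"Share your sheet with: {email}")
```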

Step 2: Fetch pipeline stages

Get a mapping of stage IDs to human-readable names.

curl -s "https://api.hubapi.com/crm/v3/pipelines/deals" \
  -H "Authorization: Bearer $HUBSPOT_ACCESS_TOKEN" \
  | jq '.results[0].stages[] | {id: .id, label: .label}'
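In Python, the same response flattens into an ID-to-label dict plus a display order. The sample payload below is a hypothetical, trimmed stand-in mirroring the response shape the jq command prints:

```python
def build_stage_map(pipelines_response):
    """Flatten a pipelines API response into {stage_id: label} plus stage display order."""
    stage_map = {}
    stage_order = []
    for pipeline in pipelines_response["results"]:
        for stage in pipeline["stages"]:
            stage_map[stage["id"]] = stage["label"]
            stage_order.append(stage["label"])
    return stage_map, stage_order

# Hypothetical sample mirroring the API response shape
sample = {"results": [{"stages": [
    {"id": "appointmentscheduled", "label": "Appointment Scheduled"},
    {"id": "closedwon", "label": "Closed Won"},
]}]}
stage_map, stage_order = build_stage_map(sample)
print(stage_map["closedwon"])  # → Closed Won
```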

Step 3: Search for active deals

Use the HubSpot Search API to pull all deals with pagination.

from hubspot.crm.deals import PublicObjectSearchRequest, Filter, FilterGroup
 
def get_all_deals(client):
    pipeline_filter = Filter(
        property_name="pipeline",
        operator="EQ",
        value="default"
    )
    filter_group = FilterGroup(filters=[pipeline_filter])
 
    all_deals = []
    after = 0
 
    while True:
        request = PublicObjectSearchRequest(
            filter_groups=[filter_group],
            properties=[
                "dealname", "amount", "dealstage",
                "closedate", "createdate"
            ],
            sorts=[{"propertyName": "amount", "direction": "DESCENDING"}],
            limit=100,
            after=after
        )
        response = client.crm.deals.search_api.do_search(
            public_object_search_request=request
        )
        all_deals.extend(response.results)
 
        if response.paging and response.paging.next:
            after = response.paging.next.after
        else:
            break
 
    return all_deals
Property values are strings

All HubSpot property values come back as strings — even numbers. amount returns "50000" not 50000. Always parse with float() / parseFloat() and handle None/null values.
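A small helper keeps that parsing in one place (parse_amount is a hypothetical name, not part of the HubSpot client):

```python
def parse_amount(value):
    """Convert a HubSpot string property to a float, treating None/empty/garbage as 0."""
    try:
        return float(value) if value not in (None, "") else 0.0
    except (TypeError, ValueError):
        return 0.0

print(parse_amount("50000"))  # → 50000.0
print(parse_amount(None))     # → 0.0
```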

Step 4: Calculate snapshot metrics

Aggregate deals into total value, count, and per-stage breakdowns.

from datetime import datetime, timezone
 
def calculate_snapshot(deals, stage_map, stage_order):
    total_value = 0
    by_stage = {}
 
    for deal in deals:
        props = deal.properties
        amount = float(props.get("amount") or 0)
        total_value += amount
 
        stage_id = props.get("dealstage", "unknown")
        stage_name = stage_map.get(stage_id, stage_id)
 
        if stage_name not in by_stage:
            by_stage[stage_name] = {"count": 0, "value": 0}
        by_stage[stage_name]["count"] += 1
        by_stage[stage_name]["value"] += amount
 
    # Build the row: date, total value, deal count, then per-stage count/value pairs
    today = datetime.now(timezone.utc).strftime("%Y-%m-%d")
    row = [today, total_value, len(deals)]
 
    for stage_name in stage_order:
        data = by_stage.get(stage_name, {"count": 0, "value": 0})
        row.append(data["count"])
        row.append(data["value"])
 
    return row
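
Because the row layout is positional, the sheet's header row must follow the same stage order. A hypothetical helper to generate it once, mirroring calculate_snapshot's column layout:

```python
def build_header(stage_order):
    """Header row matching calculate_snapshot: date, totals, then per-stage count/value pairs."""
    header = ["Date", "Total Value", "Deal Count"]
    for stage_name in stage_order:
        header.append(f"{stage_name} Count")
        header.append(f"{stage_name} Value")
    return header

print(build_header(["Qualified", "Closed Won"]))
# → ['Date', 'Total Value', 'Deal Count', 'Qualified Count', 'Qualified Value', 'Closed Won Count', 'Closed Won Value']
```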

Step 5: Append to Google Sheets

Write the snapshot row to Google Sheets using the Sheets API v4 append method.

import os

from google.oauth2 import service_account
from googleapiclient.discovery import build
 
def append_to_sheet(row, spreadsheet_id, range_name="Sheet1!A1"):
    creds = service_account.Credentials.from_service_account_file(
        os.environ["GOOGLE_SERVICE_ACCOUNT_KEY"],
        scopes=["https://www.googleapis.com/auth/spreadsheets"]
    )
    service = build("sheets", "v4", credentials=creds)
 
    body = {"values": [row]}
    result = service.spreadsheets().values().append(
        spreadsheetId=spreadsheet_id,
        range=range_name,
        valueInputOption="USER_ENTERED",
        insertDataOption="INSERT_ROWS",
        body=body,
    ).execute()
 
    print(f"Appended {result.get('updates', {}).get('updatedRows', 0)} row(s)")
    return result
Append behavior

The Sheets API append method finds the last row with data in the specified range and inserts below it. valueInputOption: "USER_ENTERED" means Google Sheets will parse values the same way it would if you typed them — numbers stay as numbers, dates as dates.

Service account sharing

The service account is a separate Google identity. It can't see your sheets by default. You must share the spreadsheet with the service account email address (xxx@yyy.iam.gserviceaccount.com) and give it Editor access. Without this, you'll get a 403 error.

Step 6: Wire it all together

#!/usr/bin/env python3
"""Pipeline snapshot: HubSpot → Google Sheets"""
import os
import sys
from datetime import datetime, timezone
 
try:
    from hubspot import HubSpot
    from google.oauth2 import service_account
    from googleapiclient.discovery import build
except ImportError:
    os.system(f"{sys.executable} -m pip install hubspot-api-client google-api-python-client google-auth -q")
    from hubspot import HubSpot
    from google.oauth2 import service_account
    from googleapiclient.discovery import build
 
HUBSPOT_ACCESS_TOKEN = os.environ.get("HUBSPOT_ACCESS_TOKEN")
SPREADSHEET_ID = os.environ.get("GOOGLE_SPREADSHEET_ID")
GOOGLE_KEY_FILE = os.environ.get("GOOGLE_SERVICE_ACCOUNT_KEY")
 
if not all([HUBSPOT_ACCESS_TOKEN, SPREADSHEET_ID, GOOGLE_KEY_FILE]):
    print("ERROR: Missing env vars: HUBSPOT_ACCESS_TOKEN, GOOGLE_SPREADSHEET_ID, GOOGLE_SERVICE_ACCOUNT_KEY")
    sys.exit(1)
 
# 1. Get stages
client = HubSpot(access_token=HUBSPOT_ACCESS_TOKEN)
pipelines = client.crm.pipelines.pipelines_api.get_all(object_type="deals")
stage_map = {}
stage_order = []
for pipeline in pipelines.results:
    for stage in pipeline.stages:
        stage_map[stage.id] = stage.label
        stage_order.append(stage.label)
 
print(f"Found {len(stage_map)} stages: {', '.join(stage_order)}")
 
# 2. Get deals (paginated)
from hubspot.crm.deals import PublicObjectSearchRequest, Filter, FilterGroup
 
pipeline_filter = Filter(property_name="pipeline", operator="EQ", value="default")
all_deals = []
after = 0
while True:
    request = PublicObjectSearchRequest(
        filter_groups=[FilterGroup(filters=[pipeline_filter])],
        properties=["dealname", "amount", "dealstage", "closedate", "createdate"],
        sorts=[{"propertyName": "amount", "direction": "DESCENDING"}],
        limit=100,
        after=after,
    )
    response = client.crm.deals.search_api.do_search(public_object_search_request=request)
    all_deals.extend(response.results)
    if response.paging and response.paging.next:
        after = response.paging.next.after
    else:
        break
 
print(f"Found {len(all_deals)} deals")
 
# 3. Calculate snapshot
total_value = 0
by_stage = {}
for deal in all_deals:
    props = deal.properties
    amount = float(props.get("amount") or 0)
    total_value += amount
    stage_name = stage_map.get(props.get("dealstage", ""), props.get("dealstage", "unknown"))
    if stage_name not in by_stage:
        by_stage[stage_name] = {"count": 0, "value": 0}
    by_stage[stage_name]["count"] += 1
    by_stage[stage_name]["value"] += amount
 
today = datetime.now(timezone.utc).strftime("%Y-%m-%d")
row = [today, total_value, len(all_deals)]
for stage_name in stage_order:
    data = by_stage.get(stage_name, {"count": 0, "value": 0})
    row.append(data["count"])
    row.append(data["value"])
 
print(f"Snapshot: {today} | ${total_value:,.0f} | {len(all_deals)} deals")
 
# 4. Append to Google Sheets
creds = service_account.Credentials.from_service_account_file(
    GOOGLE_KEY_FILE, scopes=["https://www.googleapis.com/auth/spreadsheets"]
)
service = build("sheets", "v4", credentials=creds)
result = service.spreadsheets().values().append(
    spreadsheetId=SPREADSHEET_ID,
    range="Sheet1!A1",
    valueInputOption="USER_ENTERED",
    insertDataOption="INSERT_ROWS",
    body={"values": [row]},
).execute()
 
print(f"Appended to Google Sheets: {result.get('updates', {}).get('updatedRows', 0)} row(s)")

Step 7: Schedule with cron or GitHub Actions

Cron (server-based):

# crontab -e
0 8 * * 1 cd /path/to/pipeline-snapshot && python snapshot.py >> /var/log/pipeline-snapshot.log 2>&1

GitHub Actions (serverless):

# .github/workflows/pipeline-snapshot.yml
name: Weekly Pipeline Snapshot
on:
  schedule:
    - cron: '0 13 * * 1'  # 13:00 UTC = 8 AM EDT (Actions cron runs in UTC; no DST adjustment)
  workflow_dispatch: {}     # Allow manual runs
jobs:
  snapshot:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.12'
      - run: pip install hubspot-api-client google-api-python-client google-auth
      - run: python snapshot.py
        env:
          HUBSPOT_ACCESS_TOKEN: ${{ secrets.HUBSPOT_ACCESS_TOKEN }}
          GOOGLE_SPREADSHEET_ID: ${{ secrets.GOOGLE_SPREADSHEET_ID }}
          GOOGLE_SERVICE_ACCOUNT_KEY: ${{ secrets.GOOGLE_SERVICE_ACCOUNT_KEY }}
Service account key in CI

For GitHub Actions, store the service account JSON key as a secret. You have two options: (1) save the entire JSON content as a secret and write it to a temp file at runtime, or (2) base64-encode the key, store that as a secret, and decode it in the workflow. Option 1 is simpler — just use echo "$KEY_JSON" > /tmp/sa-key.json and set GOOGLE_SERVICE_ACCOUNT_KEY=/tmp/sa-key.json.
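To keep the script agnostic, a small shim can accept either a file path or raw JSON content in GOOGLE_SERVICE_ACCOUNT_KEY and always hand a file path to the Google client. This is a hypothetical helper (resolve_key_path is not part of any library), shown as one way to implement option 1:

```python
import json
import os
import tempfile

def resolve_key_path(value):
    """Return a key file path whether `value` is already a path or raw JSON content."""
    if os.path.isfile(value):
        return value
    # Assume raw JSON content (e.g. pasted into a CI secret); write it to a temp file.
    key = json.loads(value)
    fd, path = tempfile.mkstemp(suffix=".json")
    with os.fdopen(fd, "w") as f:
        json.dump(key, f)
    return path
```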

Troubleshooting

Common questions

What if the Google service account key rotates?

Service account keys don't expire automatically, but you should rotate them periodically. Update the JSON key file and the corresponding secret in GitHub Actions. The Google Cloud Console lets you create a new key while the old one is still active, so there's no downtime.

Can I add week-over-week deltas to the snapshot?

Yes. Read the last row from Google Sheets before appending the new one, then calculate percentage changes for each metric. The Sheets API get method with a range like Sheet1!A:A lets you find the last row.
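The delta itself is simple arithmetic once you have the previous row (pct_change is a hypothetical helper; fetching the previous row via the Sheets API is left out here):

```python
def pct_change(previous, current):
    """Week-over-week percentage change; None when there's no usable baseline."""
    if previous in (None, 0):
        return None
    return (current - previous) / previous * 100

print(pct_change(100000, 125000))  # → 25.0
```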

How do I handle pipelines with 200+ deals?

The script already handles pagination: it loops through 100-deal pages using the after cursor. Add a ~250ms delay between pages to stay within HubSpot's Search API limit of four requests per second. Note that the Search API returns at most 10,000 results per query, so very large pipelines would need to be split across multiple filtered searches.
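The throttle is one sleep per page. Sketched here with a stubbed fetch_page so the pattern runs without live API calls (fetch_page and its return shape are assumptions, not the HubSpot client's interface):

```python
import time

def paginate(fetch_page, delay=0.25):
    """Collect results across pages, sleeping between requests to respect the search rate limit."""
    results, after = [], None
    while True:
        page = fetch_page(after)
        results.extend(page["results"])
        after = page.get("after")
        if after is None:
            return results
        time.sleep(delay)  # stay under the per-second search limit

# Hypothetical two-page stub standing in for the HubSpot search call
pages = {None: {"results": [1, 2], "after": "100"}, "100": {"results": [3], "after": None}}
print(paginate(lambda after: pages[after], delay=0))  # → [1, 2, 3]
```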

What's the total runtime for a typical snapshot?

Under 10 seconds for a 100-deal pipeline. HubSpot API calls take 200-500ms each, and the Google Sheets append is under 300ms. Even with pagination, a 500-deal pipeline finishes in under 30 seconds.

Rate limits

API                Limit                                Impact
HubSpot general    150 req / 10 sec per account         No concern for a weekly snapshot
HubSpot Search     4 req / sec                          Matters with pagination (add a short delay between pages)
Google Sheets API  300 req / min per project            No concern for a single append
Google Sheets API  60 req / min per user per project    No concern

Trade-offs

  • Full control over data transformation, column ordering, and snapshot logic
  • Version controlled — track changes in git, review in PRs
  • No per-execution cost — runs on existing infrastructure (GitHub Actions free tier includes 2,000 min/month)
  • Google Sheets API is free — standard quotas are generous for weekly snapshots
  • Maintenance overhead — you handle errors, monitoring, and credential rotation
  • No visual debugging — logs, not a visual flow builder

Looking to scale your AI operations?

We build and optimize automation systems for mid-market businesses. Let's discuss the right approach for your team.