Data Engineering · 7 min read · 11 May 2026

Change Data Capture (CDC) for CRM Syncs: Why Polling Breaks and How to Fix It

A practical guide to change data capture — why timestamp-based sync fails, the patterns that actually work, and how to implement CDC in your automation stack.


Haroon Mohamed

AI Automation & Lead Generation

The problem: polling-based sync breaks

Most cross-system CRM syncs work like this:

  1. Every 15 minutes, a workflow runs
  2. Query System A: "give me contacts updated since 15 minutes ago"
  3. Update each one in System B
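
Sketched in Python, with hypothetical `fetch_updated_contacts` / `upsert_contact` stand-ins for the real CRM API calls, the loop looks roughly like this:

```python
# Minimal sketch of the polling loop above. The two API functions are
# illustrative placeholders, not a real CRM client.
from datetime import datetime, timedelta, timezone

SYNC_INTERVAL = timedelta(minutes=15)

def fetch_updated_contacts(since):
    """Placeholder for: GET /contacts?updated_after=<since> on System A."""
    return []  # real code would page through the CRM API here

def upsert_contact(contact):
    """Placeholder for the write to System B."""
    pass

def run_sync(now=None):
    now = now or datetime.now(timezone.utc)
    since = now - SYNC_INTERVAL  # fragile: assumes both clocks agree on "now"
    for contact in fetch_updated_contacts(since):
        upsert_contact(contact)
    return since  # anything updated before this cutoff is never revisited
```

Every fragility listed below lives in that one `since` line: it trusts two clocks, one time zone, and a window that closes exactly when the next one opens.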

It feels reasonable. It's also fragile.

Problems:

  • Clock drift: What's "15 minutes ago" in your automation tool vs. the CRM? They may differ by seconds.
  • Race conditions: A contact updated during the sync window might be missed.
  • Time zone issues: CRM reports UTC. Your automation tool runs in local time. One-hour gaps.
  • Duplicate processing: If a sync fails and retries, the same records may process twice.
  • Bulk updates miss data: A batch update might touch 1,000 records in a second; the sync might query the API before the update is visible.

Run a polling-based sync for 3-6 months and you will find drift — records that are inconsistent between the two systems. Some records never synced. Some synced twice.


What CDC is

Change Data Capture (CDC) means reacting to specific changes as they happen, not polling for them.

Two ways to do CDC:

1. Webhook-based CDC

The source system pushes changes to your automation as they happen.

  • GoHighLevel → webhook on "Contact Updated" → your flow processes the change
  • HubSpot → webhook on "Contact Property Changed" → your flow processes

No polling. No time window. Just "when X changes, do Y."

2. Log-based CDC

The source system publishes a stream of changes that downstream consumers read.

  • Postgres → logical replication → Debezium → Kafka → your consumers
  • Databases → binlog → CDC tools

This is enterprise-scale. Overkill for most automation stacks, but worth knowing it exists.


Why webhook-based CDC is the right fit for most automation

For a CRM → CRM sync, database → CRM, or CRM → custom app, webhook-based CDC solves the problems of polling:

  • No clock drift: the system tells you when a change happened.
  • No race conditions: the webhook fires immediately on change.
  • No duplicates: the webhook fires once per change (usually — more on this below).
  • Real-time: changes propagate in seconds, not minutes.

CDC patterns

Pattern 1: Direct webhook → sync

System A changes → webhook to Make/n8n → update System B

Simplest case. Works for:

  • New contact in GHL → create in HubSpot
  • Deal stage change in HubSpot → update in custom dashboard

Works when: change volume is low to moderate, and the receiving side can handle each event synchronously.

Pattern 2: Webhook → queue → process

System A changes → webhook → message queue → workers process

Used when: high volume or slow downstream processing.

Implementation in automation stack:

  • Webhook writes to Supabase/SQS/Redis queue
  • Scheduled workflow processes items from queue
  • Retries on failure
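
A sketch of Pattern 2, with an in-memory queue standing in for Supabase/SQS/Redis — `enqueue` is what the webhook endpoint calls, `process_batch` is what the scheduled worker runs. The names are illustrative, not a real queue API:

```python
# Webhook -> queue -> worker, with bounded retries and a dead-letter list.
from collections import deque

queue = deque()   # stand-in for a durable queue table
dead_letter = []  # events that exhausted their retries
MAX_ATTEMPTS = 3

def enqueue(event: dict):
    """Webhook handler: persist the raw event and return immediately."""
    queue.append({"payload": event, "attempts": 0})

def process_batch(handler):
    """Scheduled worker: drain the queue, retrying failures up to MAX_ATTEMPTS."""
    processed = 0
    while queue:
        item = queue.popleft()
        try:
            handler(item["payload"])
            processed += 1
        except Exception:
            item["attempts"] += 1
            if item["attempts"] < MAX_ATTEMPTS:
                queue.append(item)        # retry later in the same or next batch
            else:
                dead_letter.append(item)  # park for manual inspection
    return processed
```

The key design choice: the webhook endpoint only writes and acknowledges, so it stays fast no matter how slow the downstream system is.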

Pattern 3: Webhook → event sourcing

System A changes → webhook → append to event log → projections rebuild state

Enterprise-grade. Keeps a full history of every change. Can rebuild any state from the log.

Usually overkill for small automation stacks. But worth noting: if compliance requires a full audit trail, event sourcing is the pattern.


Implementation: CRM → custom dashboard CDC

Scenario: you want a custom dashboard that stays in sync with your CRM in real-time.

With GoHighLevel

  1. Create a GHL workflow: "Contact Updated" → Webhook Out
  2. Configure webhook URL: your Supabase edge function or n8n webhook endpoint
  3. Send the contact's key fields in the payload
  4. Receiver (Supabase edge function, Make, or n8n):
    • Parse payload
    • UPSERT contact in your Supabase contacts table
    • Update any dependent aggregate tables
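
Step 4 can be sketched like this, with a dict standing in for the Supabase contacts table. The payload field names (`contact_id`, `full_name`, `date_updated`) are assumptions — check what your GHL workflow actually sends:

```python
# Receiver sketch: parse the webhook payload and upsert, guarding against
# stale (out-of-order) events with a timestamp comparison.
contacts = {}  # stand-in for the Supabase contacts table, keyed by external_id

def handle_contact_updated(payload: dict):
    external_id = payload["contact_id"]
    record = {
        "email": payload.get("email"),
        "name": payload.get("full_name"),
        "phone": payload.get("phone"),
        # ISO-8601 UTC strings compare correctly as plain strings
        "last_updated_at": payload["date_updated"],
    }
    existing = contacts.get(external_id)
    # Upsert, but only if this event is newer than what we already hold
    if existing is None or existing["last_updated_at"] < record["last_updated_at"]:
        contacts[external_id] = record
    return contacts[external_id]
```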

With HubSpot

  1. Create a HubSpot workflow: property-change trigger
  2. Webhook action → POST to your endpoint
  3. Same receiver logic

With Stripe

Stripe webhooks are always CDC. Subscribe to events like customer.updated, invoice.paid — they fire automatically.


Handling webhook reliability

Webhooks aren't perfectly reliable. Key issues and mitigations:

Issue: webhook delivery fails

Your endpoint is down → webhook fails to deliver.

Mitigation: most providers retry with exponential backoff (Stripe retries for 3 days). But some providers (GoHighLevel) have limited retry. Missed webhooks = data drift.

Fallback: run a nightly polling sync as a safety net. Catches any missed webhooks without the timing issues of polling-only.
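
That nightly safety net boils down to a reconciliation diff: compare the two systems by external ID and re-sync anything missing or stale. A sketch (in practice both maps would be fetched via the respective APIs):

```python
# Nightly reconciliation: find records the webhooks missed.
def find_drift(source: dict, target: dict):
    """Both args map external_id -> {'last_updated_at': iso_string, ...}.
    Returns (missing_ids, stale_ids) in target relative to source."""
    missing = [k for k in source if k not in target]
    stale = [k for k in source
             if k in target
             and target[k]["last_updated_at"] < source[k]["last_updated_at"]]
    return missing, stale
```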

Issue: duplicate webhook delivery

Same event delivered twice (retry logic, network hiccups).

Mitigation: idempotency. Every webhook has an event ID. Track which event IDs you've processed. Skip duplicates.

-- Idempotency table
CREATE TABLE processed_events (
  event_id TEXT PRIMARY KEY,
  processed_at TIMESTAMP DEFAULT NOW()
);

Before processing: check if event_id exists. If yes, skip. If no, process and insert event_id.
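
In application code that check looks like this (a set stands in for the `processed_events` table). Note that in Postgres you would instead do `INSERT ... ON CONFLICT DO NOTHING` first and check the row count — that closes the race between "check" and "insert" when two deliveries arrive concurrently:

```python
# Idempotent event handling: process each event_id at most once.
processed_events = set()  # stand-in for the processed_events table

def handle_once(event_id: str, process) -> bool:
    """Returns True if the event was processed, False if skipped as a duplicate."""
    if event_id in processed_events:
        return False   # duplicate delivery: skip
    process()
    processed_events.add(event_id)
    return True
```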

Issue: out-of-order delivery

Event B fires at 10:00:01 and event A at 10:00:00. You receive B first.

Mitigation: use source timestamps. If the incoming event's timestamp is older than the last-processed event for that record, skip or merge carefully.

Issue: webhook spoofing

Malicious actor POSTs fake data to your webhook URL.

Mitigation: verify webhook signatures (Stripe, Shopify, GitHub support HMAC signatures). For providers without signatures (GHL), use shared secret in URL or headers.
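
Most providers' signature schemes build on HMAC-SHA256 over the raw request body. The exact header name and encoding vary per provider (Stripe, for instance, also folds a timestamp into its scheme), so treat this as a generic sketch rather than any vendor's exact spec:

```python
# Generic HMAC-SHA256 webhook signature verification.
import hashlib
import hmac

def verify_signature(secret: str, body: bytes, received_sig: str) -> bool:
    expected = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking the signature via timing differences
    return hmac.compare_digest(expected, received_sig)
```

Reject the request (HTTP 401) before parsing the body whenever this returns False.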


Implementation: idempotent upsert pattern

This is the workhorse pattern for CDC ingestion:

INSERT INTO contacts (
  external_id, email, name, phone, source, last_updated_at
)
VALUES (
  'ghl_contact_abc123', 
  'jane@example.com', 
  'Jane Doe', 
  '+15551234567', 
  'facebook', 
  '2026-04-24T15:30:45Z'
)
ON CONFLICT (external_id) 
DO UPDATE SET
  email = EXCLUDED.email,
  name = EXCLUDED.name,
  phone = EXCLUDED.phone,
  source = EXCLUDED.source,
  last_updated_at = EXCLUDED.last_updated_at
WHERE contacts.last_updated_at < EXCLUDED.last_updated_at;

Key elements:

  • ON CONFLICT handles duplicates gracefully (no error on second insert)
  • WHERE contacts.last_updated_at < EXCLUDED.last_updated_at ensures out-of-order events don't overwrite newer data

Bidirectional sync is harder

If changes can happen on both sides (CRM A ↔ CRM B), CDC is trickier:

Problem: infinite loops

Change in A → webhook to B → update in B → webhook to A → update in A → back to B...

Mitigation: mark updates as "sourced from sync." When webhook fires for an update marked as sync-sourced, skip it.
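
The marker check itself is trivial — the work is making sure your sync writes the marker (e.g. into a custom "updated by" field) on every update it makes. A sketch with an illustrative marker value and field name:

```python
# Loop prevention: drop webhook events that our own sync produced.
SYNC_MARKER = "synced-by-hmx"  # hypothetical value written into a custom field

def should_process(event: dict) -> bool:
    """Process genuine user edits; skip echoes of our own sync writes."""
    return event.get("updated_by") != SYNC_MARKER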

Problem: conflict resolution

Same contact updated in both A and B within seconds. Which wins?

Mitigation: define a source of truth per field. E.g., CRM A owns contact fields, CRM B owns deal fields. Or last-write-wins with timestamp comparison.
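
Per-field ownership plus a last-write-wins fallback can be expressed as a small resolver. The field names and the A/B split here are illustrative:

```python
# Conflict resolution: owned fields always follow their owner; unowned
# fields fall back to last-write-wins on the source timestamps.
OWNERSHIP = {
    "email": "A", "name": "A", "phone": "A",  # contact fields: CRM A owns
    "deal_stage": "B", "deal_value": "B",     # deal fields: CRM B owns
}

def resolve(field, value_a, ts_a, value_b, ts_b):
    owner = OWNERSHIP.get(field)
    if owner == "A":
        return value_a
    if owner == "B":
        return value_b
    return value_a if ts_a >= ts_b else value_b  # last-write-wins fallback
```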

Bidirectional sync is complex enough that many teams avoid it and run unidirectional syncs with clear ownership.


When NOT to use CDC

  • Batch workloads: processing 50,000 records nightly is fine with a scheduled polling job.
  • Cross-organization sync: if the source system can't push webhooks to you, polling is your only option.
  • Analytics data: real-time CDC for dashboards is often unnecessary. 15-minute freshness is usually fine.

Tools for CDC in the automation stack

For small stacks

  • Make.com / n8n / Zapier with webhook triggers: covers 90% of cases
  • Supabase edge functions: HTTP endpoints that can receive webhooks and write to Postgres

For larger stacks

  • Fivetran / Airbyte: managed CDC platforms that connect CRMs to data warehouses
  • Segment: customer data platform with CDC-like event routing
  • PostHog / Mixpanel: event streaming platforms

For custom apps

  • Supabase realtime: Postgres → websocket updates for client apps
  • Hasura: GraphQL subscriptions backed by Postgres
  • Kafka / RabbitMQ: enterprise message queues

Migration from polling to CDC

If you're already running polling:

  1. Build the webhook-based sync alongside polling
  2. Run both in parallel for 2-4 weeks
  3. Compare results — ensure webhook version catches everything polling does
  4. Switch primary to webhook
  5. Keep polling as nightly fallback for missed webhooks
  6. After 3 months of clean operation, retire polling (or keep as insurance)

Sources

Change Data Capture concepts are industry-standard, documented in database replication literature (Postgres logical replication docs at postgresql.org, Kafka Streams documentation, Debezium documentation). Webhook reliability patterns reference Stripe's webhook best practices (stripe.com/docs/webhooks/best-practices), GitHub's webhook guide, and similar provider docs. Implementation examples are standard patterns for Supabase / Postgres deployments.

Running into drift between your CRM and your custom database? Let's talk — migrating from polling to CDC is usually a 1-2 week engagement with dramatic reliability improvements.

Need This Built?

Ready to implement this for your business?

Everything in this article reflects real systems I've built and operated. Let's talk about yours.


Haroon Mohamed

Full-stack automation, AI, and lead generation specialist. 2+ years running 13+ concurrent client campaigns using GoHighLevel, multiple AI voice providers, Zapier, APIs, and custom data pipelines. Founder of HMX Zone.
