Data Engineering · 7 min read · 23 April 2026

Webhook Data Validation: How to Stop Bad Data from Polluting Your CRM

Practical patterns for validating webhook payloads before they hit your CRM — required fields, format checks, deduplication, rate limiting — with real examples in Make.com and n8n.


Haroon Mohamed

AI Automation & Lead Generation

Why webhook validation matters

Webhooks are the connective tissue of modern automation stacks. A form submits → webhook fires → data flows to your CRM. A call ends → webhook fires → outcome updates your pipeline.

Without validation, every webhook becomes a potential vector for bad data:

  • A form with no email address creates a ghost contact
  • A VAPI webhook with malformed fields breaks your workflow
  • A Stripe webhook received twice creates a duplicate deal
  • A malicious POST fills your CRM with junk

Validation is the checkpoint between "data exists" and "data enters your system."


The 5 validation layers

Layer 1: Structure validation

Does the payload match the expected shape?

Check:

  • Required fields are present (email, phone, or whatever your minimum is)
  • Fields are of expected types (string, number, boolean)
  • Nested objects/arrays exist if expected

Fail behavior: Reject with a 400 status code. Don't log as success.
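
A structure check like this can be sketched in plain JavaScript. The helper name checkStructure, the schema shape, and the field names are assumptions made for illustration; they are not part of Make.com, n8n, or any CRM API.

```javascript
// Hypothetical helper: verifies required fields and basic types.
// `schema` maps a field name to { type, required }.
function checkStructure(payload, schema) {
  if (payload === null || typeof payload !== 'object' || Array.isArray(payload)) {
    return ['Payload must be a JSON object'];
  }
  const errors = [];
  for (const [field, rule] of Object.entries(schema)) {
    const value = payload[field];
    if (value === undefined || value === null) {
      if (rule.required) errors.push(`Missing required field: ${field}`);
      continue;
    }
    // typeof reports arrays as "object", so detect them separately
    const actual = Array.isArray(value) ? 'array' : typeof value;
    if (actual !== rule.type) {
      errors.push(`Field ${field} should be ${rule.type}, got ${actual}`);
    }
  }
  return errors;
}

// Example schema (field names are assumptions)
const leadSchema = {
  email: { type: 'string', required: true },
  budget: { type: 'number', required: false },
  tags: { type: 'array', required: false },
};

const structureErrors = checkStructure({ email: 'a@b.co', budget: '5000' }, leadSchema);
console.log(structureErrors);
```

A non-empty result is the signal to respond with 400 and stop before any CRM call is made.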

Layer 2: Format validation

Do fields have valid values?

Check:

  • Email matches regex (/^[^\s@]+@[^\s@]+\.[^\s@]+$/)
  • Phone is parseable as E.164
  • Dates are valid ISO 8601
  • URLs are well-formed

Fail behavior: Route to an error queue. Log for human review.
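
The four checks above might look like this in JavaScript. This is a sketch under assumptions: the field names submittedAt and website are hypothetical, the ISO 8601 pattern covers only the common date and date-time shapes, and the URL check accepts only http and https.

```javascript
const EMAIL_RE = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
// E.164: "+", then 2 to 15 digits, first digit non-zero
const E164_RE = /^\+[1-9]\d{1,14}$/;
// ISO 8601 shape: YYYY-MM-DD, optionally with a time and a Z or +hh:mm / -hh:mm offset
const ISO_RE = /^\d{4}-\d{2}-\d{2}(T\d{2}:\d{2}(:\d{2}(\.\d+)?)?(Z|[+-]\d{2}:\d{2})?)?$/;

function isIsoDate(value) {
  return typeof value === 'string' && ISO_RE.test(value) && !Number.isNaN(Date.parse(value));
}

function isHttpUrl(value) {
  try {
    const { protocol } = new URL(value);
    return protocol === 'http:' || protocol === 'https:';
  } catch {
    return false; // the URL constructor throws on malformed input
  }
}

// Returns a list of format problems; only fields that are present are checked.
function checkFormats({ email, phone, submittedAt, website }) {
  const errors = [];
  if (email !== undefined && !EMAIL_RE.test(email)) errors.push('Invalid email format');
  if (phone !== undefined && !E164_RE.test(phone)) errors.push('Invalid phone format');
  if (submittedAt !== undefined && !isIsoDate(submittedAt)) errors.push('Invalid date');
  if (website !== undefined && !isHttpUrl(website)) errors.push('Invalid URL');
  return errors;
}
```

Payloads that return errors here would go to the error queue rather than being rejected outright, since the sender is known and the data may be fixable by a human.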

Layer 3: Business logic validation

Does the data make sense for your business?

Check:

  • Budget value is within realistic range
  • Deal amount isn't negative
  • Timeline values are from expected set
  • Source tag is from canonical list

Fail behavior: Accept but flag with a "review" tag.
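
As a rough illustration, business rules can be expressed as a function that returns flags instead of errors, so the record is still accepted. The allowed values, the budget range, and the field names below are placeholders; substitute your own canonical lists.

```javascript
// Placeholder canonical lists and limits (assumptions for illustration)
const ALLOWED_SOURCES = new Set(['website-form', 'facebook-ads', 'referral']);
const ALLOWED_TIMELINES = new Set(['asap', '1-3 months', '3-6 months']);
const BUDGET_RANGE = { min: 500, max: 1000000 };

// Collects reasons a lead should get a human look; an empty list means it passed.
function collectReviewFlags(lead) {
  const flags = [];
  if (typeof lead.budget === 'number' &&
      (lead.budget < BUDGET_RANGE.min || lead.budget > BUDGET_RANGE.max)) {
    flags.push('budget-out-of-range');
  }
  if (typeof lead.dealAmount === 'number' && lead.dealAmount < 0) {
    flags.push('negative-deal-amount');
  }
  if (lead.timeline !== undefined && !ALLOWED_TIMELINES.has(lead.timeline)) {
    flags.push('unknown-timeline');
  }
  if (lead.source !== undefined && !ALLOWED_SOURCES.has(lead.source)) {
    flags.push('unknown-source');
  }
  return flags;
}

// Accepts the lead either way, adding a "review" tag when any rule fired.
function applyReviewTag(lead) {
  const flags = collectReviewFlags(lead);
  const tags = [...(lead.tags || [])];
  if (flags.length > 0) tags.push('review');
  return { ...lead, tags, reviewFlags: flags };
}
```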

Layer 4: Deduplication

Is this data already in the system?

Check:

  • Normalized email or phone matches existing contact
  • Same event ID already processed (idempotency)
  • Same form submission within dedup window (e.g., 5 minutes)

Fail behavior: Update existing record instead of creating duplicate.

Layer 5: Security validation

Is the request actually from the expected source?

Check:

  • Signature header matches expected HMAC (for providers that sign webhooks — Stripe, Shopify, GitHub)
  • Source IP is on the provider's allowlist (if the provider publishes its IP ranges)
  • Shared secret in header or query param

Fail behavior: Reject with 401 Unauthorized. Log attempt.
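
A generic HMAC check can be sketched with Node's built-in crypto module. The helper name verifyHmac is hypothetical, and the sketch assumes a hex-encoded HMAC-SHA256 over the raw request body. Real providers differ in the details (GitHub prefixes the hex digest with "sha256=", Shopify sends base64, Stripe includes a timestamp in the signed string), so check each provider's documentation before relying on it.

```javascript
const crypto = require('crypto');

// Hypothetical helper: verifies a hex-encoded HMAC-SHA256 signature
// computed over the raw (unparsed) request body.
function verifyHmac(rawBody, signatureHex, secret) {
  if (typeof signatureHex !== 'string') return false;
  const expected = crypto.createHmac('sha256', secret).update(rawBody, 'utf8').digest();
  const received = Buffer.from(signatureHex, 'hex');
  // timingSafeEqual throws on length mismatch, so compare lengths first
  return received.length === expected.length && crypto.timingSafeEqual(received, expected);
}
```

The constant-time comparison matters: a plain string comparison can leak how many leading characters matched. A false result maps to the 401 response described above.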


Implementation: Make.com

Basic structure validation

At the top of the webhook scenario, add a filter module that checks required fields:

email is not empty AND email contains @ AND phone is not empty

If false, route to error branch.

Email regex validation

Use a filter with:

email matches pattern ^[^\s@]+@[^\s@]+\.[^\s@]+$

Deduplication via HubSpot/GHL lookup

Before creating a contact:

  1. Search existing contacts by email (or normalized phone)
  2. If found → update instead of create
  3. If not found → create

Security via shared secret

Many webhook senders let you append a secret to the webhook URL as a query parameter. In Make:

  1. Add a filter: query.secret equals "YOUR-SECRET-HERE"
  2. If it doesn't match, stop the scenario and reject the request

For signed webhooks (Stripe, Shopify):

  1. Extract signature from header
  2. Compute HMAC using your secret
  3. Compare — if mismatch, reject

Implementation: n8n

Webhook node validation

Start with a Webhook trigger node. Immediately after, add a Code node (in "Run Once for Each Item" mode, so that $input.item is available) with validation logic:

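// The Webhook node nests the parsed request body under json.body;
// fall back to json itself if an earlier node has already unwrapped it.
const { email, phone, name } = $input.item.json.body ?? $input.item.json;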

const errors = [];

// Required fields
if (!email && !phone) {
  errors.push('Must have email or phone');
}

// Email format
if (email && !/^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(email)) {
  errors.push('Invalid email format');
}

// Phone format (strict E.164: "+", then at most 15 digits, no leading zero)
if (phone && !/^\+[1-9]\d{1,14}$/.test(phone)) {
  errors.push('Invalid phone format');
}

if (errors.length > 0) {
  return { json: { valid: false, errors } };
}

return { json: { valid: true, data: { email, phone, name } } };

Then an IF node to branch on valid === true.

Signature verification (Stripe example)

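const crypto = require('crypto'); // self-hosted n8n must allow this built-in module

const { headers, body } = $input.item.json;
const secret = 'your-stripe-webhook-secret'; // keep this in credentials or an env var

// Header format: "t=<timestamp>,v1=<hex signature>"
const parts = Object.fromEntries(
  (headers['stripe-signature'] || '').split(',').map((p) => p.split('='))
);

// Stripe signs "<timestamp>.<raw request body>". JSON.stringify(body) only matches
// when it is byte-identical to what Stripe sent, so use the raw body if you can.
const payload = `${parts.t}.${JSON.stringify(body)}`;

const expectedSig = crypto
  .createHmac('sha256', secret)
  .update(payload, 'utf8')
  .digest('hex');

// Constant-time comparison instead of a substring match
const expected = Buffer.from(expectedSig);
const received = Buffer.from(parts.v1 || '');
const valid = expected.length === received.length && crypto.timingSafeEqual(expected, received);

return { json: { valid } };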

Deduplication patterns

Pattern 1: Email/phone lookup before create

For every incoming lead:

  1. Normalize email (lowercase, trim)
  2. Normalize phone (E.164)
  3. Look up contact by normalized email, falling back to normalized phone
  4. If found → update fields (merge strategy: newer wins)
  5. If not found → create new contact
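
The lookup-then-upsert flow above can be sketched with an in-memory Map standing in for the CRM. Everything here is illustrative: a real workflow would call the CRM's search and update endpoints, the phone normalizer is deliberately naive (it assumes a default country code of 1 when no "+" is present), and the sketch keys on email only.

```javascript
// In-memory stand-in for the CRM, keyed by normalized email
const crm = new Map();

function normalizeEmail(email) {
  return email.trim().toLowerCase();
}

// Naive normalizer: strips formatting and assumes a default country code.
// A production flow should use a proper phone-number library.
function normalizePhone(phone, defaultCountryCode = '1') {
  const digits = phone.replace(/\D/g, '');
  return phone.trim().startsWith('+') ? `+${digits}` : `+${defaultCountryCode}${digits}`;
}

function upsertContact(lead) {
  const incoming = { ...lead, email: normalizeEmail(lead.email) };
  if (lead.phone) incoming.phone = normalizePhone(lead.phone);

  const existing = crm.get(incoming.email);
  if (!existing) {
    crm.set(incoming.email, incoming);
    return { action: 'created', contact: incoming };
  }

  // Merge strategy: newer wins, but empty values never overwrite existing data
  const merged = { ...existing };
  for (const [field, value] of Object.entries(incoming)) {
    if (value !== undefined && value !== null && value !== '') merged[field] = value;
  }
  crm.set(incoming.email, merged);
  return { action: 'updated', contact: merged };
}
```

Skipping empty values during the merge is a design choice: it keeps a second, sparser submission from wiping out fields captured by the first.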

Pattern 2: Event ID idempotency

For events that might be delivered more than once (Stripe and Shopify both retry failed deliveries):

  1. Extract event ID from payload
  2. Check if we've processed this event ID before (store in Data Store or database)
  3. If yes → skip (return 200 OK to prevent retry)
  4. If no → process and record event ID
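
A minimal sketch of those four steps, using a Set where a real workflow would use a Data Store or database table. The function name handleEvent and the returned shape are assumptions.

```javascript
// Stand-in for persistent storage of processed event IDs
const processedEventIds = new Set();

// Runs processFn at most once per event ID and reports the HTTP status to send back.
function handleEvent(event, processFn) {
  if (!event || typeof event.id !== 'string') {
    return { status: 400, processed: false };
  }
  if (processedEventIds.has(event.id)) {
    // Already handled: acknowledge with 200 so the sender stops retrying
    return { status: 200, processed: false };
  }
  processFn(event);
  // Record only after processing succeeds, so a thrown error leaves the
  // event eligible for the provider's next retry
  processedEventIds.add(event.id);
  return { status: 200, processed: true };
}
```

With shared storage and concurrent deliveries, the check and the record need to be atomic (for example, an insert guarded by a unique constraint); this single-process sketch does not show that.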

Pattern 3: Time-window dedup

For forms where users might accidentally double-submit:

  1. Check if same email submitted in last 5 minutes
  2. If yes → treat as duplicate, don't create new contact or trigger new workflow
  3. If no → process
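
One way to sketch the time window, again with an in-memory Map as a stand-in for real storage. The `now` parameter exists only to make the function testable; the names are hypothetical.

```javascript
const DEDUP_WINDOW_MS = 5 * 60 * 1000; // 5 minutes
const lastSeenByEmail = new Map(); // normalized email -> timestamp of last accepted submission

// Returns true when the same email was accepted within the window.
function isDuplicateSubmission(email, now = Date.now()) {
  const key = email.trim().toLowerCase();
  const last = lastSeenByEmail.get(key);
  if (last !== undefined && now - last < DEDUP_WINDOW_MS) {
    return true; // duplicates do not extend the window
  }
  lastSeenByEmail.set(key, now);
  return false;
}
```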

Rate limiting

If a webhook endpoint is public, it can be abused. Rate limiting prevents flooding.

Simple IP rate limit

Track request count per IP in a Data Store / Redis / database. If >N requests in time window (e.g., 10 requests/minute), reject with 429.
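
A fixed-window counter is one simple way to implement this. The sketch below keeps counts in memory, which only works for a single process; the factory name createRateLimiter is an assumption, and a shared store such as Redis would replace the Map in a real deployment.

```javascript
// Returns a checker that allows at most `limit` requests per key per window.
function createRateLimiter(limit, windowMs) {
  const windows = new Map(); // key -> { start, count }
  return function check(key, now = Date.now()) {
    const entry = windows.get(key);
    if (!entry || now - entry.start >= windowMs) {
      // First request, or the previous window has expired: start a new one
      windows.set(key, { start: now, count: 1 });
      return { allowed: true, status: 200 };
    }
    entry.count += 1;
    return entry.count <= limit
      ? { allowed: true, status: 200 }
      : { allowed: false, status: 429 };
  };
}

// 10 requests per minute per IP, matching the example figure above
const checkIp = createRateLimiter(10, 60 * 1000);
```

The same factory works for the per-contact case by keying on normalized email instead of IP.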

In-app rate limiting (if supported)

Some destinations, including GoHighLevel and HubSpot, apply their own rate limits to inbound requests. Where the platform exposes these settings, configure limits at the destination level as well.

Per-contact rate limit

Prevent the same email from triggering 20 workflows in an hour. Use a tag like "processed-today" with 24-hour expiry — skip if present.


Error handling

When validation fails:

Option 1: Reject with HTTP status

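Return 400/401/403 to the sender. Providers such as Stripe and Shopify treat any non-2xx response as a failed delivery and retry automatically with exponential backoff, so a payload that will never pass validation keeps coming back until the provider gives up.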

Option 2: Accept but log to error queue

Return 200 OK but route the payload to an error-handling workflow:

  • Store payload in a "Review" table
  • Notify admin via Slack/email
  • Don't create bad data in CRM

Option 2 is better when you can't risk upstream retries creating worse problems.

Option 3: Partial acceptance with flags

Accept the data, create the contact, but tag it "needs-review." Human reviews and cleans up. Not ideal but sometimes necessary for critical flows that can't miss any data.
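
To show how the three options combine with the validation layers, here is a hypothetical decision function. The input shape (authentic, structureErrors, formatErrors, reviewFlags) and the route names are assumptions for this sketch, not a convention from Make.com or n8n.

```javascript
// Maps validation results to an HTTP status and a routing decision.
function decideResponse({ authentic, structureErrors = [], formatErrors = [], reviewFlags = [] }) {
  // Option 1: reject outright when the sender can't be trusted or the shape is wrong
  if (!authentic) return { status: 401, route: 'reject', tags: [] };
  if (structureErrors.length > 0) return { status: 400, route: 'reject', tags: [] };

  // Option 2: acknowledge, but park the payload for human review
  if (formatErrors.length > 0) return { status: 200, route: 'error-queue', tags: [] };

  // Option 3: accept into the CRM, flagged for cleanup
  if (reviewFlags.length > 0) return { status: 200, route: 'crm', tags: ['needs-review'] };

  return { status: 200, route: 'crm', tags: [] };
}
```

The ordering is the design choice here: cheaper and more severe checks run first, so an unauthenticated request never reaches the error queue or the CRM.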


Real webhook examples

Form submission webhook

Expected payload:

{
  "email": "user@example.com",
  "phone": "+15551234567",
  "name": "Jane Doe",
  "business_type": "solar",
  "budget": "10k-25k"
}

Validation:

  • email OR phone required
  • email format valid
  • business_type in canonical list
  • budget from dropdown values
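
Those four rules could be sketched as follows. Only "solar" and "10k-25k" come from the sample payload; the other allowed values are placeholders, and treating unknown dropdown values as flags rather than errors is an assumption based on the business-logic layer described earlier.

```javascript
// Placeholder canonical values; replace with the options your form actually offers
const BUSINESS_TYPES = new Set(['solar', 'roofing', 'hvac']);
const BUDGET_OPTIONS = new Set(['under-10k', '10k-25k', '25k-plus']);
const FORM_EMAIL_RE = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;

function validateFormSubmission(payload) {
  const { email, phone, business_type, budget } = payload;
  const errors = []; // block the submission
  const flags = [];  // accept, but mark for review

  if (!email && !phone) errors.push('Must have email or phone');
  if (email && !FORM_EMAIL_RE.test(email)) errors.push('Invalid email format');
  if (!BUSINESS_TYPES.has(business_type)) flags.push('unknown-business-type');
  if (!BUDGET_OPTIONS.has(budget)) flags.push('unknown-budget');

  return { valid: errors.length === 0, errors, flags };
}
```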

VAPI call webhook

Expected payload:

{
  "call_id": "uuid",
  "status": "completed",
  "duration": 180,
  "transcript": "...",
  "outcome": "qualified",
  "contact_phone": "+15551234567"
}

Validation:

  • call_id present (for idempotency)
  • status from expected enum
  • duration is positive integer
  • contact_phone is valid E.164
  • Dedup by call_id to prevent double-processing
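
A sketch of the same checks for the call payload. Only "completed" is taken from the sample; the other status values are placeholders, and the Set used for deduplication stands in for persistent storage.

```javascript
// Placeholder enum; only "completed" appears in the sample payload
const CALL_STATUSES = new Set(['completed', 'failed', 'no-answer']);
const seenCallIds = new Set(); // stand-in for a Data Store or database

function validateCallWebhook(payload) {
  const { call_id, status, duration, contact_phone } = payload;
  const errors = [];

  if (typeof call_id !== 'string' || call_id === '') errors.push('Missing call_id');
  if (!CALL_STATUSES.has(status)) errors.push('Unexpected status');
  if (!Number.isInteger(duration) || duration <= 0) {
    errors.push('duration must be a positive integer');
  }
  if (!/^\+[1-9]\d{1,14}$/.test(contact_phone)) errors.push('Invalid contact_phone');

  if (errors.length > 0) return { valid: false, duplicate: false, errors };

  // Only valid payloads are recorded, so a corrected resend is not treated as a duplicate
  const duplicate = seenCallIds.has(call_id);
  seenCallIds.add(call_id);
  return { valid: true, duplicate, errors };
}
```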

Stripe payment webhook

Expected payload: Standard Stripe event object.

Validation:

  • Signature verification (critical — prevents webhook spoofing)
  • Event type in expected list
  • Idempotency by event ID

What NOT to do

1. Trust all incoming data. "It came from a webhook, so it must be fine." Webhooks can be malformed, malicious, or accidentally duplicated.

2. Build happy-path only. Your workflow works for valid data. What about invalid? What about missing fields? What about duplicates? Design for the failure cases.

3. Skip logging. When something goes wrong, you need a trail. Log every webhook receipt, even invalid ones, with enough context to debug.

4. Validate too strictly. If validation rejects 30% of real submissions because of a too-tight regex, you're losing leads. Validate what matters, accept what's valid.

5. Rely only on application-layer validation. If possible, validate at the database layer too. Unique constraints prevent dupes even if application logic has bugs.


Sources

Patterns in this article are industry-standard data validation practices, adapted from published standards (RFC 5321 for email addresses, ITU-T E.164 for phone numbers) and service documentation (Stripe, Shopify, and GitHub webhooks, all of which document signature verification). Implementation examples were tested across Make.com and n8n deployments.

Need help designing webhook validation for a specific integration? Let's talk — I can audit your current webhook endpoints and harden them.



Haroon Mohamed

Full-stack automation, AI, and lead generation specialist. 2+ years running 13+ concurrent client campaigns using GoHighLevel, multiple AI voice providers, Zapier, APIs, and custom data pipelines. Founder of HMX Zone.
