Webhook Data Validation: How to Stop Bad Data from Polluting Your CRM
Practical patterns for validating webhook payloads before they hit your CRM — required fields, format checks, deduplication, rate limiting — with real examples in Make.com and n8n.
Haroon Mohamed
AI Automation & Lead Generation
Why webhook validation matters
Webhooks are the connective tissue of modern automation stacks. A form submits → webhook fires → data flows to your CRM. A call ends → webhook fires → outcome updates your pipeline.
Without validation, every webhook becomes a potential vector for bad data:
- A form with no email address creates a ghost contact
- A VAPI webhook with malformed fields breaks your workflow
- A Stripe webhook received twice creates a duplicate deal
- A malicious POST fills your CRM with junk
Validation is the checkpoint between "data exists" and "data enters your system."
The 5 validation layers
Layer 1: Structure validation
Does the payload match the expected shape?
Check:
- Required fields are present (email, phone, or whatever's minimum)
- Fields are of expected types (string, number, boolean)
- Nested objects/arrays exist if expected
Fail behavior: Reject with a 400 status code. Don't log as success.
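As a rough illustration of Layer 1, here is a minimal structure check in plain JavaScript (the language the n8n Code node uses). The `schema` object and the `checkStructure` name are assumptions made for this sketch, not part of any library.

```javascript
// Hypothetical schema: which fields are mandatory and what type each field should have.
const schema = {
  required: ['email'],
  types: { email: 'string', phone: 'string', name: 'string', tags: 'array' },
};

function checkStructure(payload, { required, types }) {
  if (payload === null || typeof payload !== 'object' || Array.isArray(payload)) {
    return ['Payload must be a JSON object'];
  }
  const errors = [];
  // Required fields must be present and non-empty
  for (const field of required) {
    const value = payload[field];
    if (value === undefined || value === null || value === '') {
      errors.push(`Missing required field: ${field}`);
    }
  }
  // Fields that are present must have the expected type
  for (const [field, expected] of Object.entries(types)) {
    const value = payload[field];
    if (value === undefined || value === null) continue; // absence is handled above
    const actual = Array.isArray(value) ? 'array' : typeof value;
    if (actual !== expected) {
      errors.push(`Field ${field} should be ${expected}, got ${actual}`);
    }
  }
  return errors; // an empty array means the shape is acceptable
}
```

If the returned array is non-empty, the caller would respond with a 400 and include the messages in its log entry.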
Layer 2: Format validation
Do fields have valid values?
Check:
- Email matches regex (/^[^\s@]+@[^\s@]+\.[^\s@]+$/)
- Phone is parseable as E.164
- Dates are valid ISO 8601
- URLs are well-formed
Fail behavior: Route to an error queue. Log for human review.
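The four format checks could be sketched as small predicate functions. The function names are illustrative, the ISO 8601 pattern deliberately accepts only full timestamps with a timezone, and the URL check assumes you only want http and https links.

```javascript
const EMAIL_RE = /^[^\s@]+@[^\s@]+\.[^\s@]+$/; // same pattern as in the checklist
const E164_RE = /^\+[1-9]\d{1,14}$/; // "+", no leading zero, at most 15 digits

function isValidEmail(value) {
  return typeof value === 'string' && EMAIL_RE.test(value);
}

function isValidE164(value) {
  return typeof value === 'string' && E164_RE.test(value);
}

// Narrow subset of ISO 8601: full timestamps such as 2024-05-01T12:30:00Z
const ISO_RE = /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})$/;

function isValidIsoDate(value) {
  return typeof value === 'string' && ISO_RE.test(value) && !Number.isNaN(Date.parse(value));
}

function isValidHttpUrl(value) {
  try {
    const url = new URL(value); // throws on malformed input
    return url.protocol === 'http:' || url.protocol === 'https:';
  } catch {
    return false;
  }
}
```

Keeping each check as a separate predicate makes it easy to loosen one rule (for example, accepting date-only values) without touching the others.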
Layer 3: Business logic validation
Does the data make sense for your business?
Check:
- Budget value is within realistic range
- Deal amount isn't negative
- Timeline values are from expected set
- Source tag is from canonical list
Fail behavior: Accept but flag with a "review" tag.
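One way to express "accept but flag" in code is to collect reasons and attach a review tag only when any exist. The canonical lists, the budget ceiling, and the field names (`budget`, `dealAmount`, `timeline`, `source`) below are placeholder assumptions; substitute your own.

```javascript
// Hypothetical canonical values and limits; replace with your own.
const RULES = {
  timelines: new Set(['asap', '1-3 months', '3-6 months', '6+ months']),
  sources: new Set(['website-form', 'facebook-ads', 'referral']),
  maxBudget: 1000000,
};

// Returns the list of reasons this lead needs a human look (empty if none)
function reviewFlags(lead, rules = RULES) {
  const flags = [];
  if (lead.budget !== undefined && (lead.budget < 0 || lead.budget > rules.maxBudget)) {
    flags.push('budget-out-of-range');
  }
  if (lead.dealAmount !== undefined && lead.dealAmount < 0) {
    flags.push('negative-deal-amount');
  }
  if (lead.timeline !== undefined && !rules.timelines.has(lead.timeline)) {
    flags.push('unknown-timeline');
  }
  if (lead.source !== undefined && !rules.sources.has(lead.source)) {
    flags.push('unknown-source');
  }
  return flags;
}

// Accepts the lead either way, adding a "review" tag when something looks off
function applyReview(lead) {
  const flags = reviewFlags(lead);
  if (flags.length === 0) return lead;
  return { ...lead, tags: [...(lead.tags || []), 'review'], reviewReasons: flags };
}
```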
Layer 4: Deduplication
Is this data already in the system?
Check:
- Normalized email or phone matches existing contact
- Same event ID already processed (idempotency)
- Same form submission within dedup window (e.g., 5 minutes)
Fail behavior: Update existing record instead of creating duplicate.
Layer 5: Security validation
Is the request actually from the expected source?
Check:
- Signature header matches expected HMAC (for providers that sign webhooks — Stripe, Shopify, GitHub)
- IP whitelist (if provider publishes allowed IPs)
- Shared secret in header or query param
Fail behavior: Reject with 401 Unauthorized. Log attempt.
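A generic sketch of the signature and shared-secret checks in Node.js follows. It assumes a hex-encoded HMAC-SHA256 over the raw request body; real providers differ in header name, encoding, and what exactly is signed, so check the provider's documentation before relying on it. The function names are made up for this example.

```javascript
const crypto = require('crypto');

// Constant-time string comparison. timingSafeEqual throws on unequal lengths, so check first.
function safeEqual(a, b) {
  const bufA = Buffer.from(String(a), 'utf8');
  const bufB = Buffer.from(String(b), 'utf8');
  return bufA.length === bufB.length && crypto.timingSafeEqual(bufA, bufB);
}

// rawBody must be the exact bytes received, not re-serialized JSON.
function verifyHmac(rawBody, signatureHex, secret) {
  if (typeof signatureHex !== 'string' || signatureHex === '') return false;
  const expected = crypto.createHmac('sha256', secret).update(rawBody).digest('hex');
  return safeEqual(signatureHex.toLowerCase(), expected);
}

// Shared secret sent in a header or query parameter
function verifySharedSecret(provided, expected) {
  return typeof provided === 'string' && safeEqual(provided, expected);
}
```

The constant-time comparison avoids leaking, through response timing, how many leading characters of a guessed signature were correct.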
Implementation: Make.com
Basic structure validation
At the top of the webhook scenario, add a filter module that checks required fields:
email is not empty AND email contains @ AND phone is not empty
If false, route to error branch.
Email regex validation
Use a filter with:
email matches pattern ^[^\s@]+@[^\s@]+\.[^\s@]+$
Deduplication via HubSpot/GHL lookup
Before creating a contact:
- Search existing contacts by email (or normalized phone)
- If found → update instead of create
- If not found → create
Security via shared secret
Most webhook providers let you set a secret query parameter. In Make:
- Add a filter: {query.secret} equals "YOUR-SECRET-HERE"
- If not, reject
For signed webhooks (Stripe, Shopify):
- Extract signature from header
- Compute HMAC using your secret
- Compare — if mismatch, reject
Implementation: n8n
Webhook node validation
Start with a Webhook trigger node. Immediately after, add a Code node with validation logic:
// The Webhook node usually nests the POSTed fields under `body`; fall back to the item itself
const payload = $input.item.json.body ?? $input.item.json;
const { email, phone, name } = payload;
const errors = [];
// Required fields: at least one way to reach the contact
if (!email && !phone) {
  errors.push('Must have email or phone');
}
// Email format
if (email && !/^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(email)) {
  errors.push('Invalid email format');
}
// Phone format (strict E.164: "+", no leading zero, at most 15 digits)
if (phone && !/^\+[1-9]\d{1,14}$/.test(phone)) {
  errors.push('Invalid phone format');
}
if (errors.length > 0) {
  return { json: { valid: false, errors } };
}
return { json: { valid: true, data: { email, phone, name } } };
Then an IF node to branch on valid === true.
Signature verification (Stripe example)
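const crypto = require('crypto');
// The Stripe-Signature header has the form "t=<timestamp>,v1=<hex signature>"
const header = $input.item.json.headers['stripe-signature'] || '';
const parts = Object.fromEntries(header.split(',').map((p) => p.trim().split('=')));
// Stripe signs the raw request body. Re-serialized JSON only verifies if it matches
// the sent bytes exactly, so prefer the raw body if your Webhook node exposes it.
const payload = JSON.stringify($input.item.json.body);
const secret = 'your-stripe-webhook-secret'; // the endpoint's signing secret
// The signed string is "<timestamp>.<payload>", not the payload alone
const expectedSig = crypto
  .createHmac('sha256', secret)
  .update(`${parts.t}.${payload}`, 'utf8')
  .digest('hex');
// Constant-time comparison; timingSafeEqual throws on unequal lengths, so check first
const received = Buffer.from(parts.v1 || '', 'utf8');
const expected = Buffer.from(expectedSig, 'utf8');
const valid = received.length === expected.length && crypto.timingSafeEqual(received, expected);
return { json: { valid } };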
Deduplication patterns
Pattern 1: Email/phone lookup before create
For every incoming lead:
- Normalize email (lowercase, trim)
- Normalize phone (E.164)
- Lookup contact by normalized email
- If found → update fields (merge strategy: newer wins)
- If not found → create new contact
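The lookup-then-upsert flow can be sketched as follows. The `contacts` Map stands in for your CRM's search and update API, and `normalizePhone` is a naive helper that assumes a default country code for 10-digit numbers; both are assumptions for illustration, and a dedicated phone-parsing library would handle far more cases.

```javascript
function normalizeEmail(email) {
  return typeof email === 'string' ? email.trim().toLowerCase() : '';
}

// Naive normalization: strips formatting, assumes a country code for 10-digit numbers
function normalizePhone(phone, defaultCountryCode = '1') {
  if (typeof phone !== 'string') return '';
  const digits = phone.replace(/\D/g, '');
  if (!digits) return '';
  if (phone.trim().startsWith('+')) return `+${digits}`;
  return digits.length === 10 ? `+${defaultCountryCode}${digits}` : `+${digits}`;
}

// `contacts` is a Map keyed by normalized email (or phone when there is no email)
function upsertContact(contacts, incoming) {
  const email = normalizeEmail(incoming.email);
  const phone = normalizePhone(incoming.phone);
  const key = email || phone;
  if (!key) throw new Error('Lead has neither email nor phone');
  const existing = contacts.get(key);
  // Drop empty values so a sparse update does not erase fields already on file
  const fresh = Object.fromEntries(
    Object.entries({ ...incoming, email, phone })
      .filter(([, value]) => value !== undefined && value !== null && value !== '')
  );
  const record = { ...existing, ...fresh }; // newer wins for fields that are present
  contacts.set(key, record);
  return { action: existing ? 'updated' : 'created', record };
}
```

Filtering out empty values before merging is a design choice: "newer wins" then applies only to fields the new submission actually filled in.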
Pattern 2: Event ID idempotency
For events that might be delivered twice (Stripe, Shopify retry failed webhooks):
- Extract event ID from payload
- Check if we've processed this event ID before (store in Data Store or database)
- If yes → skip (return 200 OK to prevent retry)
- If no → process and record event ID
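A minimal sketch of these steps, using an in-memory Set as a stand-in for the Data Store or database table. `handleOnce` and `processFn` are hypothetical names, and the function is synchronous only to keep the example short.

```javascript
// In-memory stand-in for a persistent store of processed event IDs
const processedEvents = new Set();

function handleOnce(event, processFn, store = processedEvents) {
  if (!event || !event.id) {
    return { status: 400, body: 'Missing event ID' };
  }
  if (store.has(event.id)) {
    // Acknowledge with 200 so the sender stops retrying, but do no work
    return { status: 200, body: 'Already processed' };
  }
  processFn(event);
  // Record the ID only after processing succeeds, so a failure can be retried
  store.add(event.id);
  return { status: 200, body: 'Processed' };
}
```

With a real database, the check and the insert should be one atomic operation (for example, an insert guarded by a unique constraint on the event ID), otherwise two concurrent deliveries can both pass the check.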
Pattern 3: Time-window dedup
For forms where users might accidentally double-submit:
- Check if same email submitted in last 5 minutes
- If yes → treat as duplicate, don't create new contact or trigger new workflow
- If no → process
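The time-window check might look like this, assuming an in-memory Map of last-seen timestamps (in production this would live in a Data Store, Redis, or a database). The function name and parameters are illustrative.

```javascript
const WINDOW_MS = 5 * 60 * 1000; // 5-minute dedup window
const lastSeen = new Map(); // normalized email -> time of last accepted submission

function isDuplicateSubmission(email, now = Date.now(), seen = lastSeen, windowMs = WINDOW_MS) {
  const key = email.trim().toLowerCase();
  const previous = seen.get(key);
  if (previous !== undefined && now - previous < windowMs) {
    return true; // same email inside the window: treat as a double-submit
  }
  seen.set(key, now); // first submission, or the window has elapsed
  return false;
}
```

Passing `now` as a parameter keeps the function deterministic, which makes the window logic easy to test.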
Rate limiting
If a webhook endpoint is public, it can be abused. Rate limiting prevents flooding.
Simple IP rate limit
Track the request count per IP in a Data Store, Redis, or a database. If an IP exceeds N requests within the time window (e.g., 10 requests/minute), reject with 429.
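As a sketch, a fixed-window counter kept in memory could look like the following. `createRateLimiter` is a made-up name, and a single-process Map is an assumption that only holds if one instance handles all requests; with several workers the counts belong in a shared store.

```javascript
function createRateLimiter({ limit = 10, windowMs = 60000 } = {}) {
  const buckets = new Map(); // ip -> { windowStart, count }
  return function check(ip, now = Date.now()) {
    const bucket = buckets.get(ip);
    // Start a new window on first sight of the IP or once the old window has elapsed
    if (!bucket || now - bucket.windowStart >= windowMs) {
      buckets.set(ip, { windowStart: now, count: 1 });
      return { allowed: true, status: 200 };
    }
    bucket.count += 1;
    if (bucket.count > limit) {
      const retryAfterSeconds = Math.ceil((bucket.windowStart + windowMs - now) / 1000);
      return { allowed: false, status: 429, retryAfterSeconds };
    }
    return { allowed: true, status: 200 };
  };
}
```

A caller would create one limiter at startup and run `check(ip)` before any other validation, so flooded requests are rejected as cheaply as possible.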
In-app rate limiting (if supported)
GoHighLevel, HubSpot, and others rate-limit incoming webhooks by workflow. Configure at the destination level.
Per-contact rate limit
Prevent the same email from triggering 20 workflows in an hour. Use a tag like "processed-today" with 24-hour expiry — skip if present.
Error handling
When validation fails:
Option 1: Reject with HTTP status
Return 400/401/403 to the sender. Providers such as Stripe and Shopify treat any non-2xx response as a failed delivery and retry automatically, so expect the same invalid payload to arrive again.
Option 2: Accept but log to error queue
Return 200 OK but route the payload to an error-handling workflow:
- Store payload in a "Review" table
- Notify admin via Slack/email
- Don't create bad data in CRM
Option 2 is better when you can't risk upstream retries creating worse problems.
Option 3: Partial acceptance with flags
Accept the data and create the contact, but tag it "needs-review" so a human can review and clean it up. Not ideal, but sometimes necessary for critical flows that can't miss any data.
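To show how the three options can coexist in one workflow, here is a small routing function that maps validation results to a response. The input shape (`authenticated`, `structureErrors`, `formatErrors`, `flags`) and the action names are assumptions for this sketch, loosely following the fail behaviors of the five layers.

```javascript
function decideResponse({ authenticated = true, structureErrors = [], formatErrors = [], flags = [] }) {
  // Option 1: reject outright when the sender is unverified or the shape is wrong
  if (!authenticated) return { status: 401, action: 'reject' };
  if (structureErrors.length > 0) return { status: 400, action: 'reject' };
  // Option 2: acknowledge, but park the payload for human review instead of the CRM
  if (formatErrors.length > 0) return { status: 200, action: 'error-queue' };
  // Option 3: create the record, tagged for cleanup
  if (flags.length > 0) return { status: 200, action: 'create-with-tag', tag: 'needs-review' };
  return { status: 200, action: 'create' };
}
```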
Real webhook examples
Form submission webhook
Expected payload:
{
"email": "user@example.com",
"phone": "+15551234567",
"name": "Jane Doe",
"business_type": "solar",
"budget": "10k-25k"
}
Validation:
- email OR phone required
- email format valid
- business_type in canonical list
- budget from dropdown values
VAPI call webhook
Expected payload:
{
"call_id": "uuid",
"status": "completed",
"duration": 180,
"transcript": "...",
"outcome": "qualified",
"contact_phone": "+15551234567"
}
Validation:
- call_id present (for idempotency)
- status from expected enum
- duration is positive integer
- contact_phone is valid E.164
- Dedup by call_id to prevent double-processing
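Applied to the call payload shown here, the checks might be written like this. The list of allowed statuses is a placeholder assumption (only "completed" appears in the example payload), so replace it with the values your voice provider actually sends.

```javascript
// Hypothetical status values; only "completed" comes from the example payload
const CALL_STATUSES = new Set(['completed', 'failed', 'no-answer']);
const E164 = /^\+[1-9]\d{1,14}$/;

function validateCallPayload(payload) {
  const errors = [];
  if (typeof payload.call_id !== 'string' || payload.call_id === '') {
    errors.push('call_id is required');
  }
  if (!CALL_STATUSES.has(payload.status)) {
    errors.push('status is not an expected value');
  }
  if (!Number.isInteger(payload.duration) || payload.duration <= 0) {
    errors.push('duration must be a positive integer');
  }
  if (typeof payload.contact_phone !== 'string' || !E164.test(payload.contact_phone)) {
    errors.push('contact_phone must be valid E.164');
  }
  return errors; // empty array means the payload passed every check
}
```

Once the payload passes, `call_id` can serve as the idempotency key for the dedup step.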
Stripe payment webhook
Expected payload: Standard Stripe event object.
Validation:
- Signature verification (critical — prevents webhook spoofing)
- Event type in expected list
- Idempotency by event ID
What NOT to do
1. Trust all incoming data. "It came from a webhook, so it must be fine." Webhooks can be malformed, malicious, or accidentally duplicated.
2. Build happy-path only. Your workflow works for valid data. What about invalid? What about missing fields? What about duplicates? Design for the failure cases.
3. Skip logging. When something goes wrong, you need a trail. Log every webhook receipt, even invalid ones, with enough context to debug.
4. Validate too strictly. If validation rejects 30% of real submissions because of a too-tight regex, you're losing leads. Validate what matters, accept what's valid.
5. Rely only on application-layer validation. If possible, validate at the database layer too. Unique constraints prevent dupes even if application logic has bugs.
Sources
Patterns in this article are industry-standard data validation practices, adapted from published standards (RFC 5321 for email addresses, ITU-T E.164 for phone numbers) and service documentation (Stripe, Shopify, and GitHub webhooks, all of which document signature verification patterns). Implementation examples were tested across Make.com and n8n deployments.
Need help designing webhook validation for a specific integration? Let's talk — I can audit your current webhook endpoints and harden them.
Haroon Mohamed
Full-stack automation, AI, and lead generation specialist. 2+ years running 13+ concurrent client campaigns using GoHighLevel, multiple AI voice providers, Zapier, APIs, and custom data pipelines. Founder of HMX Zone.