Time-Series Data for Marketing Analytics: When PostgreSQL Beats a Real TSDB
Marketing analytics is fundamentally time-series data. Most service businesses can run their analytics on PostgreSQL without needing a dedicated time-series database. Here's when that's true and when it isn't.
Haroon Mohamed
AI Automation & Lead Generation
What time-series data is, in plain terms
Time-series data is data with a timestamp where the timestamp matters. Every event has a "when," and you analyze across the time dimension constantly.
For marketing analytics, this is most of the data:
- Lead volume per day, per source
- Conversion rates over weeks
- Campaign performance over months
- Customer cohort behavior over time
- Revenue trends across quarters
The fact that your data is time-series shapes how you store and query it. Make the wrong choice and queries become slow, storage explodes, or you can't answer the questions you actually need to answer.
The conventional wisdom in tech circles is "use a dedicated time-series database for time-series data." Tools like InfluxDB, TimescaleDB, ClickHouse, and managed services like AWS Timestream exist precisely for this.
But for most service business marketing analytics, you don't need any of those. PostgreSQL handles the use case fine — and using PostgreSQL is dramatically simpler operationally than running a dedicated TSDB.
What dedicated TSDBs offer
Real time-series databases are optimized for specific patterns:
- Very high write volumes (millions of events per second)
- Time-based partitioning (data automatically split by time ranges for efficient querying)
- Compression for old data (years of data compressed to small disk footprint)
- Aggregation built-in (compute hourly/daily averages efficiently)
- Retention policies (auto-drop old data)
- Specialized query languages (Flux, PromQL, etc.)
These are real benefits at the right scale. The question is whether your scale needs them.
When PostgreSQL is enough
For most service business marketing analytics, the relevant volumes are modest:
- A high-volume lead generation operation: 10,000 leads/month = ~330/day
- An active CRM with all activity logged: 50,000 events/month
- Even an aggressive AI calling operation: 100,000 calls/month = ~3,300/day
PostgreSQL handles these volumes effortlessly. With basic indexing and reasonable query patterns, queries return in milliseconds even on millions of rows.
The tipping point where you'd actually need a dedicated TSDB:
- 1+ million events per day (typically high-traffic SaaS, IoT, or massive ad operations)
- Sub-second query requirements over years of data
- Need for complex windowed aggregations over high-cardinality data
- Real-time dashboards with thousands of concurrent users
If you're not in that range, PostgreSQL is the right tool — and saves enormous operational complexity.
How to structure time-series data in Postgres
The patterns that produce good performance:
1. A clean events table with timestamp + dimensions.
CREATE TABLE events (
    id BIGSERIAL PRIMARY KEY,
    occurred_at TIMESTAMPTZ NOT NULL,
    event_type TEXT NOT NULL,
    contact_id UUID,
    source TEXT,
    campaign TEXT,
    amount NUMERIC,
    metadata JSONB
);
Each event row has a timestamp and the dimensions you'll filter/group by. JSONB for flexible metadata.
2. Index on timestamp + frequently-filtered columns.
CREATE INDEX idx_events_time ON events (occurred_at DESC);
CREATE INDEX idx_events_type_time ON events (event_type, occurred_at DESC);
CREATE INDEX idx_events_source_time ON events (source, occurred_at DESC);
These indexes make time-range queries fast.
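A quick way to verify the indexes are actually used: EXPLAIN the kind of query your dashboard runs. A minimal check against the (event_type, occurred_at) index from above:

EXPLAIN ANALYZE
SELECT count(*)
FROM events
WHERE event_type = 'lead_created'
  AND occurred_at >= now() - interval '30 days';
-- Expect an Index Scan or Bitmap Index Scan on idx_events_type_time;
-- a Seq Scan on a large table means the index isn't being used.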
3. Partition by time range if volume is high.
For tens of millions of rows, native PostgreSQL partitioning splits the table by month or quarter. Queries against recent time ranges only touch recent partitions, making them fast.
CREATE TABLE events (...) PARTITION BY RANGE (occurred_at);
CREATE TABLE events_2026_01 PARTITION OF events FOR VALUES FROM ('2026-01-01') TO ('2026-02-01');
For most operations under 10M rows, partitioning is not needed yet.
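One caveat if you do go this route: in PostgreSQL, a partitioned table's primary key (or any unique constraint) must include the partition column, so the table definition from step 1 shifts slightly. A sketch:

-- Partitioned variant of the events table; the PK must include occurred_at
CREATE TABLE events (
    id BIGSERIAL,
    occurred_at TIMESTAMPTZ NOT NULL,
    event_type TEXT NOT NULL,
    contact_id UUID,
    source TEXT,
    campaign TEXT,
    amount NUMERIC,
    metadata JSONB,
    PRIMARY KEY (id, occurred_at)
) PARTITION BY RANGE (occurred_at);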
4. Materialized views for common aggregations.
If you query "leads per source per day for the last 90 days" frequently, don't recompute it every time. Create a materialized view that aggregates this and refreshes hourly:
CREATE MATERIALIZED VIEW leads_per_source_daily AS
SELECT date_trunc('day', occurred_at) as day,
       source,
       count(*) as lead_count
FROM events
WHERE event_type = 'lead_created'
GROUP BY 1, 2;

-- REFRESH ... CONCURRENTLY requires a unique index on the view
CREATE UNIQUE INDEX ON leads_per_source_daily (day, source);

REFRESH MATERIALIZED VIEW CONCURRENTLY leads_per_source_daily;
The dashboard queries this materialized view instead of the raw events. Much faster.
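The dashboard read then becomes trivial — for example:

-- Reads hit the pre-aggregated rows, not the raw events table
SELECT day, source, lead_count
FROM leads_per_source_daily
WHERE day >= now() - interval '90 days'
ORDER BY day, source;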
The TimescaleDB middle ground
If PostgreSQL is straining at your scale but you want to keep the ecosystem, TimescaleDB is a Postgres extension that adds time-series-specific features:
- Automatic partitioning ("hypertables")
- Continuous aggregates (auto-refreshing materialized views)
- Compression for older data (often 90%+ size reduction)
- Time-bucketing functions
Crucially, TimescaleDB still speaks SQL. Your existing tools (Postgres clients, ORMs, BI tools) work without changes. You're not adopting a new query language or new operational stack — just adding a Postgres extension.
This is the right path when you've outgrown vanilla Postgres but don't want to rebuild on a fully different database.
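A sketch of what adopting it looks like — create_hypertable, continuous aggregates, and compression policies are TimescaleDB's documented features, but the intervals and the view name here are illustrative placeholders:

-- Convert the events table into a hypertable chunked on occurred_at
SELECT create_hypertable('events', 'occurred_at');

-- Continuous aggregate: the auto-refreshing equivalent of the earlier
-- materialized view
CREATE MATERIALIZED VIEW leads_per_source_daily_ca
WITH (timescaledb.continuous) AS
SELECT time_bucket('1 day', occurred_at) as day,
       source,
       count(*) as lead_count
FROM events
WHERE event_type = 'lead_created'
GROUP BY 1, 2;

-- Keep it refreshed automatically, e.g. hourly
SELECT add_continuous_aggregate_policy('leads_per_source_daily_ca',
    start_offset => INTERVAL '3 days',
    end_offset => INTERVAL '1 hour',
    schedule_interval => INTERVAL '1 hour');

-- Compress chunks older than 30 days
ALTER TABLE events SET (timescaledb.compress);
SELECT add_compression_policy('events', INTERVAL '30 days');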
Common analytics queries and how to write them efficiently
A few patterns worth knowing:
Daily lead volume by source over 90 days:
SELECT date_trunc('day', occurred_at)::date as day,
       source,
       count(*) as leads
FROM events
WHERE event_type = 'lead_created'
  AND occurred_at >= now() - interval '90 days'
GROUP BY 1, 2
ORDER BY 1, 2;
With proper indexing, this returns in tens of milliseconds even on multi-million-row tables.
Cohort retention (of the customers who signed up in week W, what % were still active in subsequent weeks):
WITH cohorts AS (
    SELECT contact_id,
           date_trunc('week', min(occurred_at)) as cohort_week
    FROM events
    WHERE event_type = 'signup'
    GROUP BY 1
),
weekly_activity AS (
    SELECT contact_id,
           date_trunc('week', occurred_at) as activity_week
    FROM events
    WHERE event_type = 'login'
    GROUP BY 1, 2
)
SELECT c.cohort_week,
       a.activity_week,
       count(DISTINCT a.contact_id) as active_users
FROM cohorts c
LEFT JOIN weekly_activity a
       ON a.contact_id = c.contact_id
      AND a.activity_week >= c.cohort_week  -- only count activity at or after signup
GROUP BY 1, 2
ORDER BY 1, 2;
This is the kind of query that benefits enormously from TimescaleDB but runs adequately in vanilla Postgres for moderate volumes.
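The query above returns raw counts; to get the percentages the cohort question actually asks for, divide by cohort size. A sketch extending the same CTEs (cohort_sizes is a new helper):

WITH cohorts AS (
    SELECT contact_id,
           date_trunc('week', min(occurred_at)) as cohort_week
    FROM events
    WHERE event_type = 'signup'
    GROUP BY 1
),
weekly_activity AS (
    SELECT contact_id,
           date_trunc('week', occurred_at) as activity_week
    FROM events
    WHERE event_type = 'login'
    GROUP BY 1, 2
),
cohort_sizes AS (
    SELECT cohort_week, count(*) as cohort_size
    FROM cohorts
    GROUP BY 1
)
SELECT c.cohort_week,
       a.activity_week,
       round(100.0 * count(DISTINCT a.contact_id) / s.cohort_size, 1) as pct_active
FROM cohorts c
JOIN cohort_sizes s USING (cohort_week)
LEFT JOIN weekly_activity a
       ON a.contact_id = c.contact_id
      AND a.activity_week >= c.cohort_week
GROUP BY 1, 2, s.cohort_size
ORDER BY 1, 2;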
Funnel progression (how many leads moved from stage A to B in the last 30 days):
WITH stage_events AS (
    SELECT contact_id,
           occurred_at,
           metadata->>'new_stage' as new_stage,
           metadata->>'old_stage' as old_stage
    FROM events
    WHERE event_type = 'stage_transition'
      AND occurred_at >= now() - interval '30 days'
)
SELECT old_stage, new_stage, count(*) as transitions
FROM stage_events
GROUP BY 1, 2
ORDER BY 3 DESC;
The composite (event_type, occurred_at) index from earlier does the filtering here; the JSONB values are only extracted from the rows that match, so this stays fast. If you filter on metadata keys directly, add an expression index — a sketch follows.
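For example, if dashboards frequently filter on the destination stage, a hypothetical expression index on that key keeps the lookup indexed (the index name is illustrative; the key matches the schema above):

-- Expression index on the JSONB key used in WHERE clauses
CREATE INDEX idx_events_new_stage ON events ((metadata->>'new_stage'));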
Where things get hard
Some patterns push Postgres harder:
Very high cardinality. If you have millions of distinct contacts and want per-contact daily aggregations, the result set explodes. This is where TSDB compression starts mattering.
Long retention with active queries. If you need to query 5 years of detailed event data interactively, partitioning and aggregation pre-computation become essential.
Real-time dashboards with sub-second latency. Materialized views that refresh hourly aren't enough. You'd need streaming aggregations.
Complex windowed analytics. Some statistical computations are awkward in SQL and feel more natural in TSDB-specific query languages.
For most marketing analytics use cases, none of these apply. When they do, PostgreSQL plus TimescaleDB still handles it for a long time before you'd actually need to move.
What about ClickHouse, BigQuery, Snowflake?
For analytics specifically, you have options beyond Postgres and dedicated TSDBs:
BigQuery, Snowflake, Redshift: OLAP data warehouses. Excellent for complex analytics queries on large datasets. The common pattern is to keep operational data in a transactional database (Postgres) and run analytics in the warehouse. Good fit when you have data engineers and substantial query volume.
ClickHouse: Columnar database optimized for analytical queries. Very fast for aggregations. Good middle ground between Postgres and a full warehouse.
DuckDB: In-process analytical SQL database. Increasingly popular for analytics workloads that fit on a single machine. Worth knowing for ad-hoc analysis.
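For a flavor of that ad-hoc pattern: DuckDB can attach a Postgres database directly via its postgres extension and run columnar aggregations against it. A minimal sketch (the connection string is a placeholder):

-- DuckDB: attach a Postgres database and query the events table directly
INSTALL postgres;
LOAD postgres;
ATTACH 'dbname=analytics host=localhost' AS pg (TYPE postgres);

SELECT date_trunc('day', occurred_at) AS day,
       count(*) AS leads
FROM pg.public.events
WHERE event_type = 'lead_created'
GROUP BY 1
ORDER BY 1;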
For most service businesses, the right pattern is:
- Operational data (customers, orders, real-time stuff): Postgres
- Analytics queries: same Postgres for low/medium scale; warehouse for higher scale
Don't pick a warehouse for a low-volume operation. The complexity isn't justified.
When to actually invest in real time-series infrastructure
Three signals that you've outgrown Postgres-as-TSDB:
1. Queries are taking minutes instead of seconds. Even with proper indexing and materialized views, you're hitting limits.
2. Storage cost is becoming material. Multi-TB Postgres is expensive; compressed TSDB storage is much cheaper.
3. You need real-time analytics with low latency. Postgres can do this for moderate volumes; high-volume real-time analytics is what TSDBs are built for.
If none of these are true, stay on Postgres. The ecosystem, tooling, and operational simplicity are worth a lot.
A reasonable starting point
For a service business setting up marketing analytics today, my recommended starting stack:
- Postgres (Supabase or self-hosted) for both operational and analytical data
- Materialized views for common aggregations
- A BI tool (Metabase, Looker Studio, Hex) that queries Postgres directly
- CDC or webhook-based ingestion from your CRM to Postgres so analytics data stays current
This stack handles the data and query patterns of probably 90% of service businesses. The cost is low, the operational overhead is minimal, and the scaling path (add TimescaleDB extension; later add a real warehouse if needed) is gradual.
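The ingestion piece is deliberately boring: each webhook from the CRM becomes one row in events. A minimal sketch of the write (all the values here are hypothetical; the field mapping from your CRM's payload is the part you'd customize):

-- One webhook payload -> one event row
INSERT INTO events (occurred_at, event_type, contact_id, source, campaign, metadata)
VALUES (now(),
        'lead_created',
        'a3d1c2b4-5e6f-4a7b-8c9d-0e1f2a3b4c5d',
        'facebook_ads',
        'spring_promo',
        '{"form": "quote_request"}'::jsonb);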
The mistake is starting with a complex analytics stack before you have the data volume to justify it. Most service businesses optimize for problems they don't yet have.
If you want help designing a marketing analytics stack that fits your actual scale, let's talk.