If you’ve got rich data sitting in your product or marketing database - subscriptions, payments, churn events, or internal metrics - and you’re using PostHog for analytics, you’ve probably wondered:
“How do I get this data into PostHog so I can actually use it?”
This guide walks through how to sync data from any SQL database (like Redshift, Postgres, MySQL, or BigQuery) into PostHog - step by step.
Why You’d Want to Do This
Your database is where all data is recorded. PostHog is where you can visualize that data.
When you connect the two, you bridge the gap between your backend systems and your analytics - so you can answer questions like:
- How many paying users actually used the feature they subscribed for?
- Which accounts churned after a billing error?
- Are certain payment types linked to higher retention?
Once your database rows turn into PostHog events, you can start analyzing them alongside product behavior. No more CSV exports, no manual joins, no guessing.
Step 1: Identify What to Sync
Start by looking at your database tables and asking: which rows represent meaningful actions?
Examples:
| Table | Example Row | Event to Send to PostHog |
|---|---|---|
| subscription_events | user_id=123, event='renewed', timestamp='2025-10-30' | subscription_renewed |
| payments | payment_id=abc123, amount=49.99, status='successful' | payment_successful |
| refunds | refund_id=xyz789, reason='user_cancelled' | refund_issued |
If your table describes something that happened, it’s a perfect candidate for an event in PostHog.
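One way to keep that decision explicit is a small mapping in your sync script. The sketch below is purely illustrative and assumes the three example tables above; in practice the event name may also be derived from a column on the row, as the subscription_events example shows.

```python
# Illustrative mapping of source tables to the PostHog events they produce.
# Table and event names come from the examples above - adapt them to your schema.
TABLE_TO_EVENT = {
    "subscription_events": "subscription_renewed",
    "payments": "payment_successful",
    "refunds": "refund_issued",
}
```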
Step 2: Pull Data from Your Database
Here’s an example query to fetch new events from a table:
```sql
SELECT
    user_id,
    event_type,
    event_timestamp,
    amount,
    country,
    plan_id
FROM subscription_events
WHERE event_timestamp >= NOW() - INTERVAL '1 day';
```
This pulls every event from the last 24 hours, which is perfect for a daily sync job.
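If your source is Postgres, a minimal way to run that query from Python looks like the sketch below. It assumes the psycopg2 driver and a connection string in a DATABASE_URL environment variable; swap in the client library for your own warehouse.

```python
import os

import psycopg2
import psycopg2.extras

QUERY = """
    SELECT user_id, event_type, event_timestamp, amount, country, plan_id
    FROM subscription_events
    WHERE event_timestamp >= NOW() - INTERVAL '1 day';
"""

def fetch_recent_rows():
    # DATABASE_URL is an assumed environment variable, e.g. postgres://user:pass@host:5432/db
    conn = psycopg2.connect(os.environ["DATABASE_URL"])
    try:
        # RealDictCursor returns each row as a dict, which keeps the mapping in Step 3 readable.
        with conn.cursor(cursor_factory=psycopg2.extras.RealDictCursor) as cur:
            cur.execute(QUERY)
            return cur.fetchall()
    finally:
        conn.close()
```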
Step 3: Map Rows to PostHog Events
Each database row becomes one PostHog event.
Here’s what that looks like in Python:
```python
event = {
    "event": row["event_type"],
    "distinct_id": row["user_id"],  # use your user_id or equivalent
    "timestamp": row["event_timestamp"].isoformat(),
    "properties": {
        "amount": row["amount"],
        "country": row["country"],
        "plan_id": row["plan_id"],
        "source": "database_sync",
    },
}
```
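The batch call below expects a list named events, so it helps to wrap that mapping in a small helper and apply it to every row from Step 2. This is a sketch assuming the rows come back as dictionaries (e.g. from the RealDictCursor above):

```python
def row_to_event(row):
    # Mirrors the mapping above; "source" tags events that came from this sync job.
    return {
        "event": row["event_type"],
        "distinct_id": row["user_id"],
        "timestamp": row["event_timestamp"].isoformat(),
        "properties": {
            "amount": float(row["amount"]) if row["amount"] is not None else None,  # Decimal -> float for JSON
            "country": row["country"],
            "plan_id": row["plan_id"],
            "source": "database_sync",
        },
    }

events = [row_to_event(row) for row in fetch_recent_rows()]
```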
You can then send these events using PostHog’s /batch API, which supports up to 500 events per request:
```python
import gzip
import json

import requests

payload = {
    "api_key": "<POSTHOG_API_KEY>",   # your PostHog project API key
    "batch": events,                  # the list of event dicts built above
}

# Gzip the JSON body to keep large batches small on the wire.
gz = gzip.compress(json.dumps(payload).encode("utf-8"))

response = requests.post(
    "https://app.posthog.com/batch/",
    headers={"Content-Type": "application/json", "Content-Encoding": "gzip"},
    data=gz,
)
response.raise_for_status()
```
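If one run can produce more than 500 events, split the list before posting. Here is a minimal sketch of the same call, chunked:

```python
import gzip
import json

import requests

POSTHOG_BATCH_URL = "https://app.posthog.com/batch/"
BATCH_SIZE = 500  # per-request limit mentioned above

def send_in_batches(events, api_key):
    """Post events to PostHog's /batch endpoint in chunks."""
    for start in range(0, len(events), BATCH_SIZE):
        chunk = events[start:start + BATCH_SIZE]
        payload = {"api_key": api_key, "batch": chunk}
        # default=str covers values like Decimal or date that json can't serialise natively.
        gz = gzip.compress(json.dumps(payload, default=str).encode("utf-8"))
        resp = requests.post(
            POSTHOG_BATCH_URL,
            headers={"Content-Type": "application/json", "Content-Encoding": "gzip"},
            data=gz,
            timeout=30,
        )
        resp.raise_for_status()  # fail loudly so a scheduler can retry the run
```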
Step 4: Prevent Duplicates with a Tracking Table
If you run this sync regularly (and you should), you’ll want to avoid sending the same data twice.
Create a tracking table in your database:
```sql
CREATE TABLE IF NOT EXISTS public.synced_events (
    row_id VARCHAR(255),
    event_type VARCHAR(255),
    synced_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (row_id, event_type)
);
```
Before sending an event, check if (row_id, event_type) already exists.
If it does, skip it.
If not, send it and insert it into this table.
This makes your pipeline safe to re-run any time.
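In Python, the check-and-record logic might look like the sketch below. It assumes psycopg2 again, row_id is whatever unique key you derive from the source row, and ON CONFLICT is Postgres syntax, so Redshift or MySQL users will need their own upsert idiom.

```python
def already_synced(conn, row_id, event_type):
    # True if this (row_id, event_type) pair was sent in a previous run.
    with conn.cursor() as cur:
        cur.execute(
            "SELECT 1 FROM public.synced_events WHERE row_id = %s AND event_type = %s",
            (row_id, event_type),
        )
        return cur.fetchone() is not None

def mark_synced(conn, row_id, event_type):
    # Record the pair so re-runs skip it; ON CONFLICT keeps the insert idempotent.
    with conn.cursor() as cur:
        cur.execute(
            "INSERT INTO public.synced_events (row_id, event_type) VALUES (%s, %s) "
            "ON CONFLICT DO NOTHING",
            (row_id, event_type),
        )
    conn.commit()
```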
Step 5: Make It Incremental
Instead of scanning the whole table every time, use timestamps or IDs:
```sql
SELECT *
FROM subscription_events
WHERE event_timestamp > (SELECT MAX(synced_at) FROM synced_events);
```
That way, you only pull fresh data.
You can even keep this value in a small control file or state table if you prefer.
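If you go the state-table route, the sketch below shows one way to read and advance the watermark. It assumes a hypothetical public.sync_state table with source_table and last_synced_at columns, plus the psycopg2 connection used earlier.

```python
def get_watermark(conn, source_table):
    # Last successfully synced timestamp for a source, or None on the first run.
    with conn.cursor() as cur:
        cur.execute(
            "SELECT last_synced_at FROM public.sync_state WHERE source_table = %s",
            (source_table,),
        )
        row = cur.fetchone()
        return row[0] if row else None

def set_watermark(conn, source_table, last_synced_at):
    # Advance the watermark only after the batch has been sent successfully.
    with conn.cursor() as cur:
        cur.execute(
            "UPDATE public.sync_state SET last_synced_at = %s WHERE source_table = %s",
            (last_synced_at, source_table),
        )
    conn.commit()
```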
Step 6: Automate It
You can run this pipeline anywhere, but the most common options are:
- AWS Glue: fully managed ETL, runs daily, integrates with Redshift easily.
- Airflow: flexible and good for multi-source syncs.
- Cron + Python Script: simple, fast, and works fine for small volumes.
Whatever you choose, the process is the same:
- Pull new rows.
- Transform to PostHog format.
- Skip duplicates.
- Send events in batches to PostHog.
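As a concrete sketch of that loop, here is how the earlier helpers might be wired together into a single script that cron (or a Glue job, or an Airflow task) can run on a schedule. The helper names, the composite row_id, and the environment variables are assumptions carried over from the sketches above, not a prescribed layout.

```python
import os

import psycopg2

def run_sync():
    # Pull new rows, drop anything already synced, send the rest in batches, then record them.
    conn = psycopg2.connect(os.environ["DATABASE_URL"])
    try:
        pending = []
        for row in fetch_recent_rows():                                  # Step 2
            # Hypothetical composite key; prefer your table's real primary key if it has one.
            row_id = f"{row['user_id']}-{row['event_timestamp'].isoformat()}"
            if not already_synced(conn, row_id, row["event_type"]):      # Step 4
                pending.append((row_id, row))
        events = [row_to_event(row) for _, row in pending]               # Step 3
        send_in_batches(events, os.environ["POSTHOG_API_KEY"])
        for row_id, row in pending:
            mark_synced(conn, row_id, row["event_type"])
    finally:
        conn.close()

if __name__ == "__main__":
    run_sync()
```

Scheduling it is then just a matter of pointing cron, Glue, or Airflow at the script once a day.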
Step 7: Analyze Your Data in PostHog
Once the data lands in PostHog, you can:
- Build funnels that combine backend and product behavior, e.g. Subscription Started → Feature Used → Payment Successful.
- Segment users by billing plan, payment type, or lifecycle stage.
- Trigger feature flags or experiments for specific account groups.
This is where PostHog shines. You can now turn backend data into behavioral insights.
The Takeaway
Your database is your source of truth.
PostHog is your source of analytics.
By syncing them, you connect what users do with what your system knows - giving you the power to analyze, optimize, and automate with confidence.