← ~/logs LOG-010

>Partitioning Very Large Tables with Django

How Kraken Tech partitions billion-row PostgreSQL tables with Django 5.2 composite primary keys.

Tim Bell’s talk at DjangoCon Europe 2026 on partitioning in Django. Tim works at Kraken Tech, where a workflow-steps table hit 9 billion rows, 3 TB, and 20-hour vacuum cycles.

Why partition

When a single table reaches billions of rows, the abstraction leaks. Vacuums take hours, block schema changes, and limit what you can do. Partitioning splits one logical table into many physical tables on a partition key, so each partition is vacuumed, indexed, and queried independently.

Three methods: RANGE (time-series), LIST (fixed set of values), HASH (no natural grouping, just want parallelism).

The two things that actually matter

1. The partition key must be part of every unique constraint — including the primary key. Postgres enforces uniqueness per-partition, so partitioning on created_at forces a composite (id, created_at) PK.

2. Partition pruning only kicks in when the query’s WHERE includes the partition key. Without that filter, Postgres scans every partition. The key should be present in 90%+ of your hot queries.

Django 5.2 makes it possible

Create the partitioned table with RunSQL (Django can’t emit PARTITION BY yet):

migrations.RunSQL(sql="""
    CREATE TABLE workflows_workflowstep (
        id          BIGINT      GENERATED BY DEFAULT AS IDENTITY,
        workflow_id BIGINT      NOT NULL,
        name        VARCHAR(100) NOT NULL,
        status      VARCHAR(20) NOT NULL,
        created_at  TIMESTAMPTZ NOT NULL,
        PRIMARY KEY (id, created_at)
    ) PARTITION BY RANGE (created_at);
""")

Django 5.2’s CompositePrimaryKey unlocks the composite-PK requirement. The model uses managed = False so Django doesn’t fight you:

class WorkflowStep(models.Model):
    pk = models.CompositePrimaryKey("id", "created_at")
    id = models.BigIntegerField()
    created_at = models.DateTimeField()
    workflow_id = models.BigIntegerField()
    name = models.CharField(max_length=100)
    status = models.CharField(max_length=20)

    class Meta:
        managed = False
        db_table = "workflows_workflowstep"

Reads through the ORM work transparently. Partition pruning kicks in when created_at is in the filter:

WorkflowStep.objects.filter(
    created_at__gte=date(2026, 4, 1),
    created_at__lt=date(2026, 5, 1),
    workflow_id=7,
)
# EXPLAIN: scans only workflows_workflowstep_2026_04

Key takeaways

  • Partition when a table is >100 GB or >100 M rows
  • Partition key matters more than partitioning method — it determines which queries can prune
  • Partitioning is transparent to Django reads. filter(), aggregates, ordering all work
  • Converting an existing large table online is its own project — look at pg_partman or psycopack

Slides | Experiment code