Case Study 1.2: How Postgres Handles OpenAI's 800 Million Users
Concept. OpenAI scaled ChatGPT to ~800M users on Postgres by keeping all writes on a single primary and routing reads to dozens of read replicas.
Intuition. Imagine Listens with 800 million users, almost all of them reading and only a few writing. Reads are the bottleneck. Postgres lets you keep one writable primary that holds the truth and spin up dozens of read-only replicas to absorb the queries.
The Read-Heavy Constraint
Scaling to 800 million users looks different for every application. For an email service like Gmail, the dominant load is a massive volume of continuous inbound writes. But for ChatGPT, the primary operational bottleneck is read throughput. As detailed in their 2026 engineering post "Scaling PostgreSQL to power 800 million ChatGPT users", OpenAI sustains this load using standard PostgreSQL.
Isolating Read and Write Traffic
If we look at ChatGPT's traffic, we see a severe read/write imbalance. Why? Because every time you ask the AI a question (one write), the system must fetch your entire conversation history from the database to give the LLM its context (hundreds of reads). On top of that, users constantly browse their sidebars, loading past chats.
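To make the imbalance concrete, here is a toy back-of-the-envelope calculation. The specific numbers are illustrative assumptions, not OpenAI's figures:

```python
# Toy model of ChatGPT-style read amplification.
# All numbers are illustrative assumptions, not OpenAI's actual figures.

def reads_per_write(history_messages: int, sidebar_loads: int) -> int:
    """One user question = 1 write, but the system must read the whole
    conversation history for LLM context, plus any sidebar-browsing reads."""
    context_reads = history_messages  # one row fetched per prior message
    return context_reads + sidebar_loads

# A user with a 200-message conversation who opened the sidebar 3 times
# generates 203 reads for a single write:
print(reads_per_write(history_messages=200, sidebar_loads=3))  # 203
```

Even this crude model shows why read capacity, not write capacity, is the first thing to run out.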
OpenAI deployed a single primary database dedicated exclusively to accepting writes. To handle the massive read volume generated by the LLM context window and the user interface, they provisioned 50 globally distributed read replicas. This architecture physically separates read traffic from write traffic: it preserves the ACID guarantees of standard SQL for new data while horizontally expanding read capacity.
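The routing decision this architecture implies can be sketched at the application layer. This is a hypothetical helper, not OpenAI's actual code; the hostnames and the write-verb heuristic are assumptions for illustration:

```python
import itertools

# Hypothetical hosts: one writable primary, a fleet of read replicas.
PRIMARY = "primary.db.internal"
REPLICAS = [f"replica-{i:02d}.db.internal" for i in range(50)]
_replica_cycle = itertools.cycle(REPLICAS)  # round-robin over replicas

# Crude heuristic: classify a statement as a write by its leading verb.
WRITE_VERBS = ("INSERT", "UPDATE", "DELETE", "CREATE", "ALTER", "DROP")

def route(sql: str) -> str:
    """Return the host this statement should be sent to."""
    verb = sql.lstrip().split(None, 1)[0].upper()
    if verb in WRITE_VERBS:
        return PRIMARY           # every mutation hits the one writable node
    return next(_replica_cycle)  # reads are spread across the replica fleet

print(route("INSERT INTO messages (chat_id, body) VALUES (1, 'hi')"))
print(route("SELECT body FROM messages WHERE chat_id = 1"))
```

In production this logic usually lives in a connection pooler or proxy rather than in every application, but the shape of the decision is the same: writes converge on one node, reads fan out.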
(Note: We will dive much deeper into configuring sharding and replication in Module 5, and into how strict safety guarantees work under the hood in Module 4.)
The Bottom Line
Relational architecture scales horizontally for read-heavy workloads. Optimization at this level requires carefully structured data models and clear query patterns, not custom database engines.
However, when a system exhausts horizontal read replication or faces massive volumes of unstructured parallel writes, this architecture degrades. We cover working around these hard limits with Key-Value stores and Data Lakes in Module 6.