UberEats: Deconstructing a Big Service

UberEats: one database, three sides, and the disconnected edge.

Concept. UberEats keeps diners, drivers, and restaurants in a single relational database so each order updates all three records atomically, while the same SQL language, running on an embedded engine, works offline on the driver's phone.

Intuition. A burrito order has three sides (diner, driver, restaurant), and they all have to agree the moment it is placed. Transactional guarantees are what stop two drivers from claiming the same order, both on the server and on a phone in a tunnel.

Case Study Reading Time: 6 mins

Figure 1. One database, three sides. The diner, driver, and restaurant apps hold volatile state (red); the relational database (green) is the single shared truth. One order is a single atomic transaction, which is why two drivers can never claim the same order.

One order, three sides

If you are building a three-sided marketplace like UberEats (diner, driver, restaurant), your primary constraint is state synchronization. A mobile client holds volatile, in-memory state, so your phone cannot safely orchestrate a group order or dispatch a driver on its own. To do that, the application queries global state that lives outside its own process: the relational database.

Placing an order is one transaction: charge the diner, assign exactly one driver, decrement the restaurant's inventory. All of it, or none of it. That all-or-nothing guarantee is what stops two drivers from claiming the same order, and what makes a sold-out salad disappear for thousands of browsing clients at once. If the JOIN that maps an order to an available driver is slow or wrong, the cost shows up in the physical world: drivers sit idle and food gets cold.
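The all-or-nothing behavior can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not UberEats' implementation: the tables, columns, and the functions `make_db`, `place_order`, and `claim_order` are all hypothetical, and driver assignment is shown as a separate guarded UPDATE so the "only one driver wins" rule is easy to see.

```python
import sqlite3

# Hypothetical schema for illustration; none of these names come from UberEats.
SCHEMA = """
CREATE TABLE diners    (id INTEGER PRIMARY KEY,
                        balance_cents INTEGER NOT NULL CHECK (balance_cents >= 0));
CREATE TABLE inventory (item TEXT PRIMARY KEY,
                        qty INTEGER NOT NULL CHECK (qty >= 0));
CREATE TABLE orders    (id INTEGER PRIMARY KEY, diner_id INTEGER,
                        item TEXT, driver_id INTEGER);  -- driver_id NULL = unclaimed
"""

def make_db():
    # isolation_level=None puts the connection in autocommit mode,
    # so transactions start and end only where we say BEGIN / COMMIT.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO diners VALUES (1, 2000)")
    conn.execute("INSERT INTO inventory VALUES ('burrito', 1)")
    return conn

def place_order(conn, order_id, diner_id, item, price_cents):
    """Charge the diner, decrement stock, record the order: all or nothing."""
    conn.execute("BEGIN IMMEDIATE")
    try:
        conn.execute("UPDATE diners SET balance_cents = balance_cents - ? WHERE id = ?",
                     (price_cents, diner_id))
        conn.execute("UPDATE inventory SET qty = qty - 1 WHERE item = ?", (item,))
        conn.execute("INSERT INTO orders (id, diner_id, item, driver_id) "
                     "VALUES (?, ?, ?, NULL)", (order_id, diner_id, item))
        conn.execute("COMMIT")
        return True
    except sqlite3.IntegrityError:
        # A CHECK failed (no funds or no stock): undo the charge as well.
        conn.execute("ROLLBACK")
        return False

def claim_order(conn, order_id, driver_id):
    """Assign a driver only if the order is still unclaimed."""
    cur = conn.execute(
        "UPDATE orders SET driver_id = ? WHERE id = ? AND driver_id IS NULL",
        (driver_id, order_id))
    return cur.rowcount == 1  # exactly one row changed means this driver won

conn = make_db()
print(place_order(conn, 1, 1, "burrito", 1200))  # first order goes through
print(place_order(conn, 2, 1, "burrito", 500))   # sold out: the charge is rolled back too
print(claim_order(conn, 1, 7), claim_order(conn, 1, 8))  # only the first claim succeeds
```

The second order fails on the inventory CHECK after the diner has already been charged inside the transaction, and the ROLLBACK removes that charge. The guarded UPDATE in `claim_order` is one common way to let the database, rather than application code, decide which driver gets the order.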

App state versus global state

The split is the same one you saw in the agent teardown, at a different scale. The client's in-memory state is short-lived and local, like a context window or a Pandas DataFrame: fast, bounded, gone when the process ends. The relational database is the long-lived global truth that every diner, driver, and restaurant reads and writes against at once.

During a rainstorm, UberEats recalculates incentive pricing across an entire city. Distributed SQL engines run those heavy aggregations over petabytes of data, and the results feed straight back into operational thresholds. Same language, laptop to cluster.
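To make "same language, laptop to cluster" concrete, here is the kind of aggregation such a recalculation might rest on, run on a laptop with sqlite3. Everything in it is an assumption for illustration: the `orders` and `drivers` tables, the `zone` and `status` columns, the function `busy_zones`, and the idea of using open orders per idle driver as the signal. The source does not describe how the real pricing query works.

```python
import sqlite3

def busy_zones(conn, threshold=2.0):
    """Return {zone: open orders per idle driver} for zones above the threshold.

    Zones with no idle drivers produce no row in the inner join and are skipped;
    a real system would need to handle them explicitly.
    """
    rows = conn.execute("""
        SELECT o.zone,
               CAST(o.open_orders AS REAL) / d.idle_drivers AS ratio
        FROM (SELECT zone, COUNT(*) AS open_orders
              FROM orders WHERE status = 'open' GROUP BY zone) AS o
        JOIN (SELECT zone, COUNT(*) AS idle_drivers
              FROM drivers WHERE status = 'idle' GROUP BY zone) AS d
          ON d.zone = o.zone
        WHERE CAST(o.open_orders AS REAL) / d.idle_drivers > ?
        ORDER BY o.zone
    """, (threshold,)).fetchall()
    return dict(rows)

# Tiny hypothetical data set: zone A is swamped, zone B is balanced.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders  (id INTEGER PRIMARY KEY, zone TEXT, status TEXT)")
conn.execute("CREATE TABLE drivers (id INTEGER PRIMARY KEY, zone TEXT, status TEXT)")
conn.executemany("INSERT INTO orders (zone, status) VALUES (?, 'open')",
                 [("A",)] * 6 + [("B",)] * 2)
conn.executemany("INSERT INTO drivers (zone, status) VALUES (?, 'idle')",
                 [("A",)] * 2 + [("B",)] * 2)
print(busy_zones(conn))
```

The GROUP BY and JOIN are ordinary declarative SQL, so the same query shape could in principle be handed to a distributed engine. What changes there is the execution plan and the hardware, not the language.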

The disconnected edge

Figure 2. The disconnected edge. The driver's phone runs an embedded SQLite database, so it works offline and stays transactional even if the phone dies mid-write, then syncs to the cloud truth (green) on reconnect. Same SQL, from the cluster to the pocket.

There is one more place the database lives, and it is the smallest one: the driver's phone itself.

A driver in a parking garage or a dead-zone freeway still needs to accept an order, mark a pickup, update a status. The app cannot stall waiting for a network round-trip. So the phone carries its own embedded SQL database: SQLite, the same engine that already ships on the phone in your pocket.

Already on every phone. SQLite ships preinstalled on every Android device and every iPhone ever made, several billion devices in total. Nobody installs it; it is just there. It is the default embedded database the whole industry already builds on, which is why reaching for it on the edge is the obvious move, not an exotic one.
Serverless and embedded. No cluster to reach. The database sits inside the app, on the device.
Transactional offline. It obeys the same all-or-nothing rules even if the phone dies mid-write. A half-accepted order never exists.
Syncs on reconnect. When signal returns, the local changes reconcile against the cloud source of truth.
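The "transactional offline" property can be sketched with an in-memory SQLite database standing in for the phone. This is an illustration under stated assumptions: the `PendingSync` table and the functions `open_local_db` and `accept_order` are hypothetical, and a raised exception stands in for the phone dying. After a real power loss, SQLite discards the unfinished transaction from its journal the next time the file is opened, and the outcome is the same: no half-written order.

```python
import sqlite3

def open_local_db():
    """Create the on-device database (in memory here, a file on a real phone)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Orders (id INTEGER PRIMARY KEY, status TEXT, "
                 "created_at TIMESTAMP, updated_at TIMESTAMP)")
    # Hypothetical queue of changes waiting to be sent once signal returns.
    conn.execute("CREATE TABLE PendingSync (order_id INTEGER, event TEXT)")
    conn.commit()
    return conn

def accept_order(conn, order_id, fail_midway=False):
    """Write the order and its pending-sync entry as one transaction."""
    with conn:  # commits if the block finishes, rolls back if it raises
        conn.execute(
            "INSERT INTO Orders (id, status, created_at, updated_at) "
            "VALUES (?, 'accepted', datetime('now'), datetime('now'))",
            (order_id,))
        if fail_midway:
            # Stand-in for the phone dying between the two writes.
            raise RuntimeError("simulated crash mid-write")
        conn.execute(
            "INSERT INTO PendingSync (order_id, event) VALUES (?, 'accepted')",
            (order_id,))

conn = open_local_db()
try:
    accept_order(conn, 1, fail_midway=True)
except RuntimeError:
    pass
# The first INSERT was rolled back along with everything else.
print(conn.execute("SELECT COUNT(*) FROM Orders").fetchone()[0])
```

No network is involved at any point. The guarantee comes from the embedded engine itself, which is why it still holds in a parking garage.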

The lesson is the punchline of the whole kickoff: it is the same SQL language and the same transactional guarantees on a planet-scale server cluster and on a phone with no signal. You do not learn two systems. You learn one, and point it at different hardware.

Key Takeaways

One database, one transaction. Three sides agree because placing an order is a single atomic transaction. Two drivers can never claim the same order.
The same primitive at every size. You have now seen it small (a Pandas DataFrame), as an agent's memory (one SQLite file), at planet scale (UberEats' shared relational database), and at the disconnected edge (SQLite on the driver's phone). One language, laptop to cluster to pocket.
The rest of the course builds it. You have seen what a database does at every scale. Modules 1 through 6 are how you build and operate one yourself.

Going deeper

The same teardown, one level down: the history that explains the design, and the mechanics of the offline edge.

"But I read a blog, isn't SQL slow?"

This is a misconception born of the early-2000s "scale wall." In the 1990s and 2000s, SQL databases ran on a single massive server; when data exceeded one machine, scaling out was extremely hard. The industry temporarily pivoted to the NoSQL movement, sacrificing strict data safety and JOINs just to partition data across thousands of cheap machines.

Google's infrastructure shows exactly why the split happened:

Web Search and Gmail did not need strict transactional safety, just massive unstructured scale. Google abandoned relational constraints here, inventing MapReduce and Bigtable, which fueled the wider NoSQL movement.
Ads data systems, the engine generating Google's revenue, had to be perfectly accurate. They stayed anchored to custom MySQL infrastructure, proving SQL's fit for financial data.

Modern distributed SQL closes the loop: 2020s engines deliver horizontal scale (NoSQL's original draw) without giving up safety or the declarative model. One tool on your laptop and on massive-scale compute. Module 6 revisits the SQL-versus-NoSQL question once you have the systems vocabulary to answer it.

Offline sync, concretely

The phone's schema mirrors the cloud with mobile tweaks. Lines starting with -- are SQL comments:

-- Orders the driver accepts and edits offline.
CREATE TABLE Orders (
    id INTEGER PRIMARY KEY,    -- unique order id
    status TEXT,               -- 'accepted' | 'picked-up' | 'delivered'
    created_at TIMESTAMP,      -- when the order was accepted
    updated_at TIMESTAMP       -- when it last changed; used to decide what to sync
);

On reconnect, a Common Table Expression (CTE) selects the rows that changed inside the sync window, here the last 600 seconds, so that only those rows are shipped to the cloud:

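-- Rows changed in the last 600 seconds (the sync window).
-- Assumes updated_at holds UTC text such as '2024-01-01 10:20:00',
-- the format datetime('now') produces, so the comparison sorts correctly.
WITH RecentOrders AS (
    SELECT * FROM Orders
    WHERE updated_at > datetime('now', '-600 seconds')
)
-- Only these rows need to be sent to the cloud.
SELECT * FROM RecentOrders;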
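A fuller round trip might look like the sketch below, with two in-memory SQLite databases standing in for the phone and the cloud. It departs from the query above in one labeled way: instead of a fixed 600-second window it filters on a stored "last synced" timestamp, an assumption made here so that an outage longer than ten minutes does not drop changes. The functions `changed_since` and `sync`, and the last-writer-wins conflict rule, are likewise illustrative rather than anything the source specifies. The upsert syntax requires SQLite 3.24 or newer.

```python
import sqlite3

SCHEMA = """CREATE TABLE Orders (
    id INTEGER PRIMARY KEY, status TEXT,
    created_at TIMESTAMP, updated_at TIMESTAMP)"""

def changed_since(phone, last_synced_at):
    """Rows on the phone modified after the last successful sync."""
    return phone.execute("""
        WITH RecentOrders AS (
            SELECT id, status, created_at, updated_at
            FROM Orders
            WHERE updated_at > ?
        )
        SELECT * FROM RecentOrders ORDER BY updated_at
    """, (last_synced_at,)).fetchall()

def sync(phone, cloud, last_synced_at):
    """Ship changed rows to the cloud; return the new 'last synced' timestamp."""
    rows = changed_since(phone, last_synced_at)
    with cloud:  # one transaction: every changed row lands, or none do
        cloud.executemany("""
            INSERT INTO Orders (id, status, created_at, updated_at)
            VALUES (?, ?, ?, ?)
            ON CONFLICT(id) DO UPDATE
                SET status = excluded.status,
                    updated_at = excluded.updated_at
                WHERE excluded.updated_at > Orders.updated_at  -- last writer wins
        """, rows)
    # Timestamps are UTC text in 'YYYY-MM-DD HH:MM:SS' form, so max() is chronological.
    return max((row[3] for row in rows), default=last_synced_at)

phone, cloud = sqlite3.connect(":memory:"), sqlite3.connect(":memory:")
phone.execute(SCHEMA)
cloud.execute(SCHEMA)
cloud.execute("INSERT INTO Orders VALUES "
              "(2, 'accepted', '2024-01-01 10:00:00', '2024-01-01 10:00:00')")
phone.execute("INSERT INTO Orders VALUES "
              "(2, 'picked-up', '2024-01-01 10:00:00', '2024-01-01 10:20:00')")
watermark = sync(phone, cloud, "2024-01-01 10:05:00")
print(cloud.execute("SELECT status FROM Orders WHERE id = 2").fetchone()[0], watermark)
```

Last-writer-wins is the simplest reconciliation rule and is not always the right one. A production system would also have to handle clock skew between devices and conflicting edits made on both sides.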

Because the edge speaks the same declarative language as the cloud, complex state logic extends to the isolated device without overhauling the core data architecture.

Optional reading: real-world data architecture decisions

How Uber Serves Over 40 Million Reads Per Second. Uber's journey from MySQL to BigQuery, Spanner, and PrestoSQL.
Spotify's Data Platform Explained. Spotify's MySQL and BigQuery infrastructure at billions of events daily.
Building and Scaling Notion's Data Lake. Scaling PostgreSQL for a collaborative workspace.