Why Memory Matters

Concept. CS145 is about data and how it persists over time.

Intuition. Your other classes teach the actions: algorithms, model inference, systems. Actions are short-term. The data those actions read and write lives in a database, on disk, long-term. ("Memory" here means that durable database, not the volatile RAM chip in your laptop.)

What this course teaches

Figure 1. Each module adds one capability the database needs at the next scale. M1 is already a complete, declarative, parallel database; the rest of the course is what happens around it. M2 decides how rows live on disk. M3 finds rows in milliseconds. M4 keeps state correct when many users hit it at once. M5 spreads one logical database across many machines. M6 is where databases live in the wild: in your phone, inside an LLM, at the edge. See the course map for how these connect to the projects and the careers they lead to.

Where we start: three teardowns

We learn how the database is built by taking three real systems apart and finding the same primitive inside each one. In teaching order, from a laptop to the planet:

Figure 2. The whole kickoff in one map. The same database in three settings, left to right from one machine to many. The numbers under each card are the working scale: a 16 GB laptop, then one user growing to 800 million, then billions of orders across the planet. Modules 1 through 6 build it.

0.1 Pandas: Deconstructing Small Analytics. Pandas and Polars on a laptop, the world you already know. What happens when the data no longer fits in RAM?

0.2 Claude & OpenAI: Deconstructing Agent Memory. One AI coding agent, the whole course in one example. Why does it hallucinate with no memory? Why is the context window the same kind of memory limit as the laptop's RAM? How does a local database fix it on one machine? How do you scale it to millions of users, each with their own memory?

0.3 UberEats: Deconstructing a Big Service. UberEats at planet scale, plus the offline edge. How does one database serve billions? How do the driver apps keep working on a phone that loses its connection?

The short version of all three: the context window, the DataFrame, and the phone in a tunnel are all short-term memory. Anything that has to outlast them lives in a database. Every serious product (ChatGPT, Stripe, Epic, Spotify) is built that way.
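A minimal sketch of that split, assuming Python's built-in sqlite3 module stands in for "the database" and a plain dict stands in for short-term memory. The file name agent_memory.db, the memory table, and the remember/recall helpers are hypothetical names chosen for illustration; they are not taken from the course projects.

```python
import os
import sqlite3
import tempfile


def remember(db_path, key, value):
    """Write one fact to a SQLite file so it outlives this connection."""
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS memory (key TEXT PRIMARY KEY, value TEXT)"
            )
            conn.execute(
                "INSERT OR REPLACE INTO memory (key, value) VALUES (?, ?)",
                (key, value),
            )
    finally:
        conn.close()


def recall(db_path, key):
    """Read one fact back through a brand-new connection."""
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS memory (key TEXT PRIMARY KEY, value TEXT)"
        )
        row = conn.execute(
            "SELECT value FROM memory WHERE key = ?", (key,)
        ).fetchone()
    finally:
        conn.close()
    return row[0] if row else None


# Hypothetical demo data: one fact held in both kinds of memory.
db_path = os.path.join(tempfile.mkdtemp(), "agent_memory.db")
short_term = {"favorite_editor": "vim"}        # lives only in this process
remember(db_path, "favorite_editor", "vim")    # lives in a file on disk

# Stand-in for a process exit or a context window reset.
short_term.clear()

print(short_term.get("favorite_editor"))       # prints None
print(recall(db_path, "favorite_editor"))      # prints vim
```

The dict loses the fact as soon as it is cleared, while the SQLite file still returns it through a fresh connection. That is the whole argument of the three teardowns at toy scale.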

Every product you use has a database underneath. By the end of this course, you will be able to design one. Start with 0.1 Pandas: Deconstructing Small Analytics.