Caching, Explained
You've heard "we should cache that" in a hundred meetings, and it always sounds like the easy win, the magic word that makes slow things fast. Then one day a user swears they updated their profile but the old name keeps showing, and suddenly caching is the reason something is wrong rather than the reason something is fast. Both of those moments come from the same small idea, and once you hold that idea clearly, caching stops being a black box.
A cache is one thing: a copy of an expensive answer, kept somewhere fast, so you don't have to produce that answer again. Everything else — CDNs, Redis, browser caches, TTLs, the famous invalidation jokes — is a variation on that single move. This guide builds the idea up cleanly so you can reason about any cache you meet, instead of memorizing rules.
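To make that single move concrete, here is a minimal sketch in Python. It is an illustration only: the names slow_lookup and get_profile are hypothetical, the "expensive answer" is faked with a short sleep, and the "somewhere fast" is assumed to be a plain in-process dictionary rather than any particular caching product.

```python
import time

# The "somewhere fast": a plain dict living in this process (an assumption
# for illustration; a real system might use Redis, a CDN, or the browser).
cache = {}
stats = {"hits": 0, "misses": 0}


def slow_lookup(user_id):
    """Hypothetical stand-in for expensive work (a query, an API call)."""
    time.sleep(0.01)  # pretend this is the slow part
    return f"profile-for-{user_id}"


def get_profile(user_id):
    """Return the stored copy if there is one; otherwise do the work once."""
    if user_id in cache:
        stats["hits"] += 1       # hit: the copy was already there
        return cache[user_id]
    stats["misses"] += 1         # miss: produce the answer the slow way
    value = slow_lookup(user_id)
    cache[user_id] = value       # keep a copy for next time
    return value


first = get_profile(42)   # miss: pays for slow_lookup
second = get_profile(42)  # hit: served from the dict
print(stats)
```

Notice that nothing in this sketch updates the stored copy when the real profile changes. That gap is where the stale-name story from the opening comes from.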
How to read this
- Want it to finally make sense? Read in order. Phase 1 installs the mental model, Phase 2 shows you where caches actually live, and Phase 3 covers the part everyone gets bitten by.
- Already comfortable and here for the hard part? Jump to Phase 3: The Hard Part — Invalidation & Staleness — that's where stale data, TTLs, and eviction live.
The phases
- Phase 1: What a Cache Actually Is. A copy of an expensive answer kept somewhere fast; hits, misses, and the notepad mental model.
- Phase 2: Where Caches Live. The browser, the CDN at the edge, your application cache (in-memory / Redis), and the database's own caches; a request traveling through all of them.
- Phase 3: The Hard Part — Invalidation & Staleness. Why the cached copy and the truth drift apart, TTLs, eviction (LRU), write-through vs. cache-aside, and when not to cache.
This guide stays at the "reason about it" level. Cache stampedes, distributed cache coherence, and tuning Redis for production are deliberately left for a follow-up — you want the mental model solid before any of that helps.
Related: Why Is My Query Slow? · Designing for Scale