Optimizing Real Systems
You've measured something. Maybe you profiled a slow endpoint, or you finally wired up tracing and now you can see where requests spend their time. Good. But there's a gap nobody talks about between having data and making the system faster - and it's a gap people fall into for weeks. They tweak, they tune, they rewrite a clever loop, and the graph that matters doesn't move. Or it moves, and a week later nobody can say why, or whether it was worth it.
This guide is the capstone. It assumes you can already take a measurement (if you can't yet, start with Profiling 101 and Observability: Logs, Metrics, Traces) and it teaches the harder thing: how to turn measurement into durable speed. The disciplined loop that keeps you honest. Where the time actually goes in real-world systems, ranked, so you look in the right place first. And how to make a system faster in production - under real traffic, watching the right numbers - without trading away the correctness and readability you'll regret losing.
The thread running through all of it: the fastest code is the work you never do. Most real wins come not from making the work faster but from doing less of it.
How to read this
- Already deep in an optimization that isn't paying off? Jump to Phase 1: The Optimization Loop - you're probably missing a baseline or a target, and that's the whole problem.
- Don't know where to look first? Phase 2: Where the Time Actually Goes ranks the usual suspects so you start at the top, not the bottom.
- Want it to finally make sense? Read in order - each phase builds on the last.
The phases
- The Optimization Loop - measure, find the bottleneck, form one hypothesis, change one thing, re-measure, repeat - against a target you set in advance, and stop when you hit it.
- Where the Time Actually Goes - the real-world bottlenecks ranked: the database, the network, I/O and serialization, then CPU and algorithms - and the biggest lever of all, doing less work.
- Optimizing Safely in Production - verify with real traffic and observability, watch percentiles not averages, and steer clear of the classic traps: micro-optimizing a cold path, optimizing the wrong layer, and trading correctness for speed you didn't need.
Deliberately out of scope: the mechanics of taking a measurement. This guide is about what to do with the measurement once you have it. For flame graphs, sampling profilers, and reading a trace, see Profiling 101 and Observability: Logs, Metrics, Traces.