Part 1 · Core Building Blocks
Almost every large system you will ever read about — a social network, a payments backend, a video platform — looks bespoke from the outside. From the inside, it is the same dozen parts, wired together in different proportions. There is no secret part. There is a standard toolkit, and the engineering skill is in choosing which blocks to use, where to place them, and what each one costs you.
This part is that toolkit. We build each block from first principles: what problem does it solve, why does it work, and — the question we ask on every page — what does it buy us, and what does it cost? Nothing is free. Every block you add buys a property (lower latency, more throughput, isolation) and charges a price (staleness, complexity, a new failure mode). Master the prices and you can design.
The shape of a request
To see where the blocks live, follow a single user request from a browser to a database and back. Almost everything in this part sits somewhere on this path:
Browser
   │  (1) "where is example.com?"
   ▼
DNS ──────────────► returns an IP, routed to a nearby/healthy edge
   │  (2) HTTPS request to that IP
   ▼
CDN edge ──────────► serves cached static assets near the user
   │  (3) cache miss → go to origin
   ▼
Load balancer / reverse proxy / API gateway
   │  (4) pick a healthy server; terminate TLS; auth; rate-limit
   ▼
App servers (many identical, stateless copies)
   │  (5) read-through cache?           ┌────────────┐
   ├───────────────────────────────────►│   Cache    │  (hot data, ms reads)
   │  (6) durable read/write            └────────────┘
   ▼
Database(s) ◄── chosen by data shape: relational, KV, document, …
   │  (7) slow / async work? hand it off
   ▼
Message queue ──────► workers process jobs later, decoupled

That single diagram contains all seven blocks of this part. Each page below takes one box and asks how it actually works.
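The path in the diagram can be walked in a few lines of code. What follows is a toy, in-memory sketch, not a real deployment: every name and value (`DNS_TABLE`, `handle`, the `app-1`/`app-2` servers, the sample records) is hypothetical, and each block is reduced to a dictionary or a queue. The app fills the cache itself on a miss, which is a simplification of step (5).

```python
from collections import deque
from itertools import cycle

# Every name and value below is hypothetical, chosen only to mirror the diagram.
DNS_TABLE = {"example.com": "203.0.113.10"}   # (1) name -> edge IP
CDN_CACHE = {"/logo.png": "<image bytes>"}    # (2) static assets held at the edge
APP_SERVERS = cycle(["app-1", "app-2"])       # (4) identical, stateless copies
CACHE = {}                                    # (5) hot data, filled on demand
DATABASE = {"user:42": {"name": "Ada"}}       # (6) durable source of truth
QUEUE = deque()                               # (7) work deferred to workers


def handle(host, path):
    """Serve one request; return (body, trace), where trace lists the blocks touched."""
    edge_ip = DNS_TABLE[host]                 # (1) resolve the name
    trace = [f"dns:{edge_ip}", "cdn"]         # (2) the request always reaches the edge
    if path in CDN_CACHE:
        return CDN_CACHE[path], trace         # edge hit: the origin is never contacted
    trace.append("load_balancer")             # (3) edge miss, go to origin
    trace.append(next(APP_SERVERS))           # (4) round-robin pick
    key = path.strip("/").replace("/", ":")   # "/user/42" -> "user:42"
    trace.append("cache")
    if key not in CACHE:                      # (5) miss: fall through to the database
        trace.append("database")
        CACHE[key] = DATABASE[key]            # (6) durable read, then fill the cache
    QUEUE.append(("record_view", key))        # (7) hand slow work to a queue
    trace.append("queue")
    return CACHE[key], trace
```

Calling `handle("example.com", "/logo.png")` stops at the CDN edge, while `handle("example.com", "/user/42")` travels the full path on the first call and skips the database on the second, because the cache now holds the answer.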
The seven blocks
A short roadmap. Read in order, or jump to what you need:
- DNS & Request Routing — how a human-readable name becomes an IP address, and how anycast and GeoDNS steer each user to the nearest healthy endpoint. This is step (1): the first decision in every request.
- Load Balancers — how one virtual address spreads traffic across a fleet of identical servers, the L4-vs-L7 split, balancing algorithms, and health checks. This is what lets you scale out instead of up.
- Caching — storing the answer so you don’t recompute it. The single highest-leverage performance technique, and the one with the sharpest hidden cost: staleness. We cover the whole hierarchy, from browser to database.
- Content Delivery Networks — caching pushed all the way out to the network edge, physically close to users. What it buys (latency, origin offload) and what it costs (staleness, invalidation pain).
- Databases: A Field Guide — the six broad families (relational, key-value, document, wide-column, graph, time-series) and the data shapes each one is built to serve. Choosing wrong here is the most expensive mistake on the list.
- Message Queues — decoupling producers from consumers so slow or spiky work happens asynchronously. The queue-vs-log distinction, delivery guarantees, and backpressure.
- Reverse Proxies & API Gateways — the front door. What a gateway centralizes (auth, rate limiting, TLS, routing, aggregation) and the single-point-of-failure tax that centralization charges.
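The buy-and-pay pattern in the roadmap above is easiest to see with caching. The sketch below is a minimal time-to-live cache, assuming a hypothetical `TTLCache` class and a plain dictionary standing in for the slow source of truth. The clock is injected so the behavior is deterministic rather than dependent on real time.

```python
import time


class TTLCache:
    """Serve a stored answer until it is older than ttl seconds, then reload it."""

    def __init__(self, load, ttl, clock=time.monotonic):
        self._load = load        # slow function that fetches the true value
        self._ttl = ttl
        self._clock = clock      # injectable so the example is deterministic
        self._entries = {}       # key -> (value, stored_at)
        self.loads = 0           # how many times we paid for the slow path

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[1] < self._ttl:
            return entry[0]      # hit: fast, but possibly stale
        value = self._load(key)  # miss or expired: go to the source of truth
        self.loads += 1
        self._entries[key] = (value, now)
        return value


source = {"price": 100}          # hypothetical source of truth
now = [0.0]                      # fake clock, advanced by hand
cache = TTLCache(load=source.__getitem__, ttl=30, clock=lambda: now[0])

first = cache.get("price")       # loads 100 from the source
source["price"] = 120            # the truth changes behind the cache
stale = cache.get("price")       # still 100: this is the staleness cost
now[0] = 31.0                    # move past the TTL
fresh = cache.get("price")       # entry expired, reloads 120
```

Three reads cost only two loads, which is what the cache buys. The middle read returned an answer that was already wrong, which is what it charges. Choosing `ttl` is choosing how much of each you accept.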
A word on composition
The blocks are not independent. A CDN is caching plus DNS routing. An API gateway is a reverse proxy plus auth plus rate limiting. A load balancer needs health checks, and it can only pull a failing server out of rotation without losing user state if your app servers are stateless, which is why we keep pointing at Statelessness & Sessions. The toolkit is a vocabulary; real systems are sentences built from it. Once you know the words, the rest of this book is grammar: how to combine them under constraints like consistency, latency, and cost.
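Composition can be made literal in code. The sketch below treats a request handler as a plain function and builds a gateway by stacking three pieces. All names (`reverse_proxy`, `with_auth`, `with_rate_limit`, the `secret` token) are hypothetical, and the rate limiter counts requests with no time window, a deliberate simplification.

```python
def reverse_proxy(backends):
    """Forward each request to the next backend in turn."""
    state = {"next": 0}

    def handler(request):
        backend = backends[state["next"] % len(backends)]
        state["next"] += 1
        return 200, f"{backend} served {request['path']}"

    return handler


def with_auth(handler, valid_tokens):
    """Reject requests whose token is missing or unknown."""
    def wrapped(request):
        if request.get("token") not in valid_tokens:
            return 401, "unauthorized"
        return handler(request)

    return wrapped


def with_rate_limit(handler, max_requests):
    """Allow at most max_requests per token (no time window, for brevity)."""
    seen = {}

    def wrapped(request):
        client = request.get("token")
        seen[client] = seen.get(client, 0) + 1
        if seen[client] > max_requests:
            return 429, "too many requests"
        return handler(request)

    return wrapped


# The "API gateway" is nothing more than the three pieces stacked.
gateway = with_auth(
    with_rate_limit(reverse_proxy(["app-1", "app-2"]), max_requests=2),
    valid_tokens={"secret"},
)
```

Because auth wraps the rate limiter, unauthenticated requests are rejected before they count against any quota. Swapping the nesting order changes that behavior, which is the point: the same words in a different order make a different sentence.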
Check your understanding
- Trace a single user request through the seven blocks. Which block is touched first, and which is touched only for slow or asynchronous work?
- The book asks one recurring question on every page. What is it, and why is “free” never the answer?
- Why is a CDN best understood as a composition of two other blocks rather than a primitive?
- Why does a load balancer’s usefulness depend on your app servers being stateless?
- Pick any two blocks and name one property each one buys and one cost each one charges.