Design a URL Shortener
A URL shortener (think TinyURL, bit.ly) takes https://example.com/some/very/long/path?utm=... and
hands back https://sho.rt/aZ3kQ, then redirects anyone who visits the short link to the original.
It looks trivial — a hashmap with a web server in front — and that’s exactly why it’s the perfect
first case study. The interesting engineering is hidden in three places: generating short codes
that never collide, serving redirects with single-digit-millisecond latency, and doing both
read-heavy at massive scale. We’ll walk the six-step framework.
1. Requirements
Section titled “1. Requirements”Functional
- Given a long URL, return a short URL.
- Given a short URL, redirect to the original.
- (Optional) custom aliases (
sho.rt/my-brand) and link expiry. - (Optional) click analytics.
Non-functional
- Read-heavy, roughly 100:1 reads to writes — almost everyone clicks links; few create them.
- Redirect latency < 50ms at the edge.
- High availability (a dead shortener breaks every link ever issued).
- Short codes must be unique and ideally short (6–7 characters).
- Links are effectively permanent — durability matters more than write throughput.
2. Back-of-envelope estimation
Section titled “2. Back-of-envelope estimation”Assume 100M new URLs per day.
Writes: 100M / 86,400s ≈ 1,160 writes/sec (~3,500/sec at 3× peak)Reads: 100:1 ratio ≈ 116,000 reads/sec (~350,000/sec at peak)Storage: 500 bytes/record × 100M/day × 365 × 5yr ≈ 90 TB over 5 yearsKeyspace: how many 7-char codes do we need?The keyspace question decides the code length. Using base62 (a–z, A–Z, 0–9):
62^6 ≈ 56.8 billion ← enough for ~1.5 years at 100M/day62^7 ≈ 3.5 trillion ← ~95 years of headroom — pick 7 charactersSo: a write-light, read-heavy, storage-modest system. That profile screams cache the reads, keep writes simple, never lose a record. See Back-of-Envelope Estimation.
3. API sketch
Section titled “3. API sketch”POST /api/v1/shorten body: { "long_url": "...", "custom_alias?": "...", "expire_at?": "..." } returns:{ "short_url": "https://sho.rt/aZ3kQ7" }
GET /{short_code} returns: 301/302 redirect to long_url ← the hot path, 99% of trafficA subtle choice hides in that redirect status code. 301 (permanent) lets browsers and proxies cache the mapping, slashing your read load — but it means you never see those clicks again, so analytics break and you can’t change the target. 302 (temporary) forces every click through your server, preserving analytics and editability at the cost of more traffic. Most shorteners pick 302 for exactly this reason. What does it buy us, and what does it cost? — 302 buys observability and control; it costs the free CDN-and-browser caching that 301 would have handed you.
4. Data model
Section titled “4. Data model”A single table, accessed only by primary key — a textbook key-value workload.
url_mappings short_code STRING (primary key, e.g. "aZ3kQ7") long_url TEXT created_at TIMESTAMP expire_at TIMESTAMP (nullable) creator_id STRING (nullable)We never query by long_url, never join, and the read is always “give me the row for this code.”
That’s a KV store (DynamoDB, Cassandra, or even a partitioned SQL table) keyed on short_code — which
also gives us a clean partition key with naturally even
distribution, since codes are pseudo-random.
5. Short-code generation — the core decision
Section titled “5. Short-code generation — the core decision”Two families of approaches, and the choice defines the system.
Option A: counter + base62 encoding
Section titled “Option A: counter + base62 encoding”Keep a global monotonic counter. Each new URL grabs the next integer and base62-encodes it into a
short string (1000000000 → "15ftgG"). Guaranteed collision-free, and codes are compact.
The catch is the distributed counter: a single counter is a bottleneck and a single point of
failure. The fix is to hand each app server a range of IDs (e.g. via a ticket server or
ZooKeeper) — server A owns 1–1,000,000, server B owns 1,000,001–2,000,000 — so servers mint codes
locally with no per-request coordination. The downside: codes are sequential and guessable, so
anyone can enumerate every link (/15ftgG, /15ftgH, …). Mitigate by base62-encoding a shuffled
or offset counter.
Option B: hash the URL
Section titled “Option B: hash the URL”Compute MD5(long_url) and take the first 7 base62 characters. Stateless, no counter — but hashes
collide. When the truncated hash already exists for a different URL, you must detect it (the write
becomes “insert if not present, else perturb and retry”) and append a salt or rehash. Idempotency is a
bonus: shortening the same URL twice yields the same code, deduping storage.
Counter + base62 Hash + truncatecollisions impossible possible → retry logiccoordination needs ID-range allocation statelessguessable codes yes (sequential) no (random-looking)same URL twice new code each time same code (dedup)A common production blend: a distributed unique ID generator (Snowflake-style — timestamp + machine id + sequence) base62-encoded. No central counter, no collisions, no enumeration. Buys both properties; costs slightly longer codes and clock-coordination discipline.
6. High-level design and the read path
Section titled “6. High-level design and the read path” ┌─────────────┐ create ───POST───► │ App tier │──► ID generator (Snowflake/range) │ (stateless) │──► KV store (durable, replicated) └─────────────┘ ▲ click ───GET /aZ3kQ7──► Load Balancer │ ┌──────┴──────┐ │ App tier │──► Redis cache ──(hit: 99%)──► 302 redirect │ │ │ miss │ │ ▼ │ │ KV store ──► backfill cache ──► redirect └─────────────┘The redirect path is the entire performance story. With a 100:1 read ratio, you put a Redis cache in front of the KV store holding the hottest codes. Links follow a brutal power law — a handful of viral links serve the vast majority of clicks — so a modest cache absorbs nearly all reads. Use a read-through or cache-aside pattern with an LRU eviction: on a miss, fetch from the store and backfill. For globally popular links, a CDN at the edge can answer redirects without ever touching your origin.
Scaling the deep dives
Section titled “Scaling the deep dives”- Read scaling: cache first, then read replicas behind the cache for misses. The app tier is stateless, so scale it horizontally behind a load balancer.
- Analytics: don’t write to the DB synchronously on every click — that would re-introduce the write load you avoided. Emit a click event to a message queue and aggregate asynchronously in a separate analytics store.
- Custom aliases: check-and-set against the KV store (the alias is the key); reject on conflict.
- Expiry / cleanup: a TTL field plus a background sweeper, or rely on the KV store’s native TTL.
- Abuse: rate-limit the
shortenendpoint per IP/user — see Rate Limiting — and scan submitted URLs against malware/phishing lists.
Key trade-offs
Section titled “Key trade-offs”- 301 vs 302: caching-and-speed vs analytics-and-control. Pick 302 unless analytics don’t matter.
- Counter vs hash codes: guaranteed-unique-but-guessable vs random-but-collision-prone. Snowflake IDs get most of both.
- Cache size vs hit rate: a bigger cache costs RAM but, given the power law, even a small one wins.
- Durability vs cost: links are forever, so replicate the store across zones; this costs storage but losing a mapping breaks a link permanently.
Check your understanding
Section titled “Check your understanding”- Why is a URL shortener described as read-heavy, and how does that single fact shape the entire design?
- Work out why 7 base62 characters are enough but 6 might not be at 100M URLs/day.
- Compare counter-based and hash-based code generation. What problem does each have, and how does a Snowflake-style ID dodge both?
- Why do most shorteners return a 302 rather than a 301? What does each choice buy and cost?
- Why is caching not merely an optimization but the core of the redirect architecture? Estimate the origin load with and without a 99%-hit cache.