Design a URL Shortener

A URL shortener (think TinyURL, bit.ly) takes https://example.com/some/very/long/path?utm=... and hands back https://sho.rt/aZ3kQ, then redirects anyone who visits the short link to the original. It looks trivial — a hashmap with a web server in front — and that’s exactly why it’s the perfect first case study. The interesting engineering is hidden in three places: generating short codes that never collide, serving redirects with single-digit-millisecond latency, and doing both read-heavy at massive scale. We’ll walk the six-step framework.

1. Requirements

Functional

Given a long URL, return a short URL.
Given a short URL, redirect to the original.
(Optional) custom aliases (sho.rt/my-brand) and link expiry.
(Optional) click analytics.

Non-functional

Read-heavy, roughly 100:1 reads to writes — almost everyone clicks links; few create them.
Redirect latency < 50ms at the edge.
High availability (a dead shortener breaks every link ever issued).
Short codes must be unique and ideally short (6–7 characters).
Links are effectively permanent — durability matters more than write throughput.

2. Back-of-envelope estimation

Assume 100M new URLs per day.

Writes:  100M / 86,400s  ≈ 1,160 writes/sec   (~3,500/sec at 3× peak)
Reads:   100:1 ratio     ≈ 116,000 reads/sec  (~350,000/sec at peak)
Storage: 500 bytes/record × 100M/day × 365 × 5yr ≈ 90 TB over 5 years
Keyspace: how many 7-char codes do we need?

The keyspace question decides the code length. Using base62 (a–z, A–Z, 0–9):

62^6 ≈ 56.8 billion   ← enough for ~1.5 years at 100M/day
62^7 ≈ 3.5 trillion   ← ~95 years of headroom — pick 7 characters

So: a write-light, read-heavy, storage-modest system. That profile screams cache the reads, keep writes simple, never lose a record. See Back-of-Envelope Estimation.

3. API sketch

POST /api/v1/shorten
  body:   { "long_url": "...", "custom_alias?": "...", "expire_at?": "..." }
  returns:{ "short_url": "https://sho.rt/aZ3kQ7" }

GET /{short_code}
  returns: 301/302 redirect to long_url   ← the hot path, 99% of traffic

A subtle choice hides in that redirect status code. 301 (permanent) lets browsers and proxies cache the mapping, slashing your read load — but it means you never see those clicks again, so analytics break and you can’t change the target. 302 (temporary) forces every click through your server, preserving analytics and editability at the cost of more traffic. Most shorteners pick 302 for exactly this reason. What does it buy us, and what does it cost? — 302 buys observability and control; it costs the free CDN-and-browser caching that 301 would have handed you.

4. Data model

A single table, accessed only by primary key — a textbook key-value workload.

url_mappings
  short_code   STRING  (primary key, e.g. "aZ3kQ7")
  long_url     TEXT
  created_at   TIMESTAMP
  expire_at    TIMESTAMP  (nullable)
  creator_id   STRING     (nullable)

We never query by long_url, never join, and the read is always “give me the row for this code.” That’s a KV store (DynamoDB, Cassandra, or even a partitioned SQL table) keyed on short_code — which also gives us a clean partition key with naturally even distribution, since codes are pseudo-random.

5. Short-code generation — the core decision

Two families of approaches, and the choice defines the system.

Option A: counter + base62 encoding

Keep a global monotonic counter. Each new URL grabs the next integer and base62-encodes it into a short string (1000000000 → "15ftgG"). Guaranteed collision-free, and codes are compact.

The catch is the distributed counter: a single counter is a bottleneck and a single point of failure. The fix is to hand each app server a range of IDs (e.g. via a ticket server or ZooKeeper) — server A owns 1–1,000,000, server B owns 1,000,001–2,000,000 — so servers mint codes locally with no per-request coordination. The downside: codes are sequential and guessable, so anyone can enumerate every link (/15ftgG, /15ftgH, …). Mitigate by base62-encoding a shuffled or offset counter.

Option B: hash the URL

Compute MD5(long_url) and take the first 7 base62 characters. Stateless, no counter — but hashes collide. When the truncated hash already exists for a different URL, you must detect it (the write becomes “insert if not present, else perturb and retry”) and append a salt or rehash. Idempotency is a bonus: shortening the same URL twice yields the same code, deduping storage.

                  Counter + base62              Hash + truncate
collisions        impossible                    possible → retry logic
coordination      needs ID-range allocation     stateless
guessable codes   yes (sequential)              no (random-looking)
same URL twice    new code each time            same code (dedup)

A common production blend: a distributed unique ID generator (Snowflake-style — timestamp + machine id + sequence) base62-encoded. No central counter, no collisions, no enumeration. Buys both properties; costs slightly longer codes and clock-coordination discipline.

6. High-level design and the read path

                         ┌─────────────┐
   create  ───POST───►   │  App tier   │──► ID generator (Snowflake/range)
                         │ (stateless) │──► KV store (durable, replicated)
                         └─────────────┘
                                ▲
   click  ───GET /aZ3kQ7──►  Load Balancer
                                │
                         ┌──────┴──────┐
                         │  App tier   │──► Redis cache ──(hit: 99%)──► 302 redirect
                         │             │        │ miss
                         │             │        ▼
                         │             │     KV store ──► backfill cache ──► redirect
                         └─────────────┘

The redirect path is the entire performance story. With a 100:1 read ratio, you put a Redis cache in front of the KV store holding the hottest codes. Links follow a brutal power law — a handful of viral links serve the vast majority of clicks — so a modest cache absorbs nearly all reads. Use a read-through or cache-aside pattern with an LRU eviction: on a miss, fetch from the store and backfill. For globally popular links, a CDN at the edge can answer redirects without ever touching your origin.

Scaling the deep dives

Read scaling: cache first, then read replicas behind the cache for misses. The app tier is stateless, so scale it horizontally behind a load balancer.
Analytics: don’t write to the DB synchronously on every click — that would re-introduce the write load you avoided. Emit a click event to a message queue and aggregate asynchronously in a separate analytics store.
Custom aliases: check-and-set against the KV store (the alias is the key); reject on conflict.
Expiry / cleanup: a TTL field plus a background sweeper, or rely on the KV store’s native TTL.
Abuse: rate-limit the shorten endpoint per IP/user — see Rate Limiting — and scan submitted URLs against malware/phishing lists.

Key trade-offs

301 vs 302: caching-and-speed vs analytics-and-control. Pick 302 unless analytics don’t matter.
Counter vs hash codes: guaranteed-unique-but-guessable vs random-but-collision-prone. Snowflake IDs get most of both.
Cache size vs hit rate: a bigger cache costs RAM but, given the power law, even a small one wins.
Durability vs cost: links are forever, so replicate the store across zones; this costs storage but losing a mapping breaks a link permanently.

Check your understanding

Why is a URL shortener described as read-heavy, and how does that single fact shape the entire design?
Work out why 7 base62 characters are enough but 6 might not be at 100M URLs/day.
Compare counter-based and hash-based code generation. What problem does each have, and how does a Snowflake-style ID dodge both?
Why do most shorteners return a 302 rather than a 301? What does each choice buy and cost?
Why is caching not merely an optimization but the core of the redirect architecture? Estimate the origin load with and without a 99%-hit cache.