Skip to content

Design a URL Shortener

A URL shortener (think TinyURL, bit.ly) takes https://example.com/some/very/long/path?utm=... and hands back https://sho.rt/aZ3kQ, then redirects anyone who visits the short link to the original. It looks trivial — a hashmap with a web server in front — and that’s exactly why it’s the perfect first case study. The interesting engineering is hidden in three places: generating short codes that never collide, serving redirects with single-digit-millisecond latency, and doing both read-heavy at massive scale. We’ll walk the six-step framework.

Functional

  • Given a long URL, return a short URL.
  • Given a short URL, redirect to the original.
  • (Optional) custom aliases (sho.rt/my-brand) and link expiry.
  • (Optional) click analytics.

Non-functional

  • Read-heavy, roughly 100:1 reads to writes — almost everyone clicks links; few create them.
  • Redirect latency < 50ms at the edge.
  • High availability (a dead shortener breaks every link ever issued).
  • Short codes must be unique and ideally short (6–7 characters).
  • Links are effectively permanent — durability matters more than write throughput.

Assume 100M new URLs per day.

Writes: 100M / 86,400s ≈ 1,160 writes/sec (~3,500/sec at 3× peak)
Reads: 100:1 ratio ≈ 116,000 reads/sec (~350,000/sec at peak)
Storage: 500 bytes/record × 100M/day × 365 × 5yr ≈ 90 TB over 5 years
Keyspace: how many 7-char codes do we need?

The keyspace question decides the code length. Using base62 (a–z, A–Z, 0–9):

62^6 ≈ 56.8 billion ← enough for ~1.5 years at 100M/day
62^7 ≈ 3.5 trillion ← ~95 years of headroom — pick 7 characters

So: a write-light, read-heavy, storage-modest system. That profile screams cache the reads, keep writes simple, never lose a record. See Back-of-Envelope Estimation.

POST /api/v1/shorten
body: { "long_url": "...", "custom_alias?": "...", "expire_at?": "..." }
returns:{ "short_url": "https://sho.rt/aZ3kQ7" }
GET /{short_code}
returns: 301/302 redirect to long_url ← the hot path, 99% of traffic

A subtle choice hides in that redirect status code. 301 (permanent) lets browsers and proxies cache the mapping, slashing your read load — but it means you never see those clicks again, so analytics break and you can’t change the target. 302 (temporary) forces every click through your server, preserving analytics and editability at the cost of more traffic. Most shorteners pick 302 for exactly this reason. What does it buy us, and what does it cost? — 302 buys observability and control; it costs the free CDN-and-browser caching that 301 would have handed you.

A single table, accessed only by primary key — a textbook key-value workload.

url_mappings
short_code STRING (primary key, e.g. "aZ3kQ7")
long_url TEXT
created_at TIMESTAMP
expire_at TIMESTAMP (nullable)
creator_id STRING (nullable)

We never query by long_url, never join, and the read is always “give me the row for this code.” That’s a KV store (DynamoDB, Cassandra, or even a partitioned SQL table) keyed on short_code — which also gives us a clean partition key with naturally even distribution, since codes are pseudo-random.

5. Short-code generation — the core decision

Section titled “5. Short-code generation — the core decision”

Two families of approaches, and the choice defines the system.

Keep a global monotonic counter. Each new URL grabs the next integer and base62-encodes it into a short string (1000000000 → "15ftgG"). Guaranteed collision-free, and codes are compact.

The catch is the distributed counter: a single counter is a bottleneck and a single point of failure. The fix is to hand each app server a range of IDs (e.g. via a ticket server or ZooKeeper) — server A owns 1–1,000,000, server B owns 1,000,001–2,000,000 — so servers mint codes locally with no per-request coordination. The downside: codes are sequential and guessable, so anyone can enumerate every link (/15ftgG, /15ftgH, …). Mitigate by base62-encoding a shuffled or offset counter.

Compute MD5(long_url) and take the first 7 base62 characters. Stateless, no counter — but hashes collide. When the truncated hash already exists for a different URL, you must detect it (the write becomes “insert if not present, else perturb and retry”) and append a salt or rehash. Idempotency is a bonus: shortening the same URL twice yields the same code, deduping storage.

Counter + base62 Hash + truncate
collisions impossible possible → retry logic
coordination needs ID-range allocation stateless
guessable codes yes (sequential) no (random-looking)
same URL twice new code each time same code (dedup)

A common production blend: a distributed unique ID generator (Snowflake-style — timestamp + machine id + sequence) base62-encoded. No central counter, no collisions, no enumeration. Buys both properties; costs slightly longer codes and clock-coordination discipline.

┌─────────────┐
create ───POST───► │ App tier │──► ID generator (Snowflake/range)
│ (stateless) │──► KV store (durable, replicated)
└─────────────┘
click ───GET /aZ3kQ7──► Load Balancer
┌──────┴──────┐
│ App tier │──► Redis cache ──(hit: 99%)──► 302 redirect
│ │ │ miss
│ │ ▼
│ │ KV store ──► backfill cache ──► redirect
└─────────────┘

The redirect path is the entire performance story. With a 100:1 read ratio, you put a Redis cache in front of the KV store holding the hottest codes. Links follow a brutal power law — a handful of viral links serve the vast majority of clicks — so a modest cache absorbs nearly all reads. Use a read-through or cache-aside pattern with an LRU eviction: on a miss, fetch from the store and backfill. For globally popular links, a CDN at the edge can answer redirects without ever touching your origin.

  • Read scaling: cache first, then read replicas behind the cache for misses. The app tier is stateless, so scale it horizontally behind a load balancer.
  • Analytics: don’t write to the DB synchronously on every click — that would re-introduce the write load you avoided. Emit a click event to a message queue and aggregate asynchronously in a separate analytics store.
  • Custom aliases: check-and-set against the KV store (the alias is the key); reject on conflict.
  • Expiry / cleanup: a TTL field plus a background sweeper, or rely on the KV store’s native TTL.
  • Abuse: rate-limit the shorten endpoint per IP/user — see Rate Limiting — and scan submitted URLs against malware/phishing lists.
  • 301 vs 302: caching-and-speed vs analytics-and-control. Pick 302 unless analytics don’t matter.
  • Counter vs hash codes: guaranteed-unique-but-guessable vs random-but-collision-prone. Snowflake IDs get most of both.
  • Cache size vs hit rate: a bigger cache costs RAM but, given the power law, even a small one wins.
  • Durability vs cost: links are forever, so replicate the store across zones; this costs storage but losing a mapping breaks a link permanently.
  1. Why is a URL shortener described as read-heavy, and how does that single fact shape the entire design?
  2. Work out why 7 base62 characters are enough but 6 might not be at 100M URLs/day.
  3. Compare counter-based and hash-based code generation. What problem does each have, and how does a Snowflake-style ID dodge both?
  4. Why do most shorteners return a 302 rather than a 301? What does each choice buy and cost?
  5. Why is caching not merely an optimization but the core of the redirect architecture? Estimate the origin load with and without a 99%-hit cache.