Design a News Feed

A news feed (Twitter/X timeline, Facebook feed, Instagram home) shows each user a stream of recent posts from the accounts they follow, usually ranked rather than strictly chronological. The deceptively simple requirement — “show me what the people I follow just posted” — hides the single most famous trade-off in system design: do you do the work when someone posts (fan-out on write) or when someone reads (fan-out on read)? Get this wrong for users with millions of followers and your system melts. We’ll follow the framework.

1. Requirements

Functional

Publish a post.
Fetch a user’s home feed — recent posts from accounts they follow.
Feed is ranked (relevance/recency), paginated, infinitely scrollable.

Non-functional

Read-heavy — users scroll far more than they post (roughly 100:1).
Feed load latency < 200ms for a good scroll experience.
High availability; eventual consistency is acceptable — a post appearing a few seconds late is fine.
Handle the fat-tailed follower distribution: most users have hundreds of followers, a few have tens of millions.

2. Back-of-envelope estimation

Assume 300M daily active users, each posting twice a day and reading their feed ~10 times.

Posts (writes):   300M × 2  / 86,400  ≈ 7,000 posts/sec   (~20,000 peak)
Feed reads:       300M × 10 / 86,400  ≈ 35,000 reads/sec  (~100,000 peak)
Avg fan-out:      ~500 followers/post (mean)
Fan-out writes:   7,000 posts/sec × 500 ≈ 3.5M feed inserts/sec on write-path

That last number is the whole problem. 3.5 million feed-cache writes per second is enormous — and it’s the average. A celebrity post fans out to tens of millions of feeds at once. This is where hot partitions and the celebrity problem live.

3. API sketch

POST /api/v1/posts
  body:   { "author_id": "...", "content": "...", "media?": [...] }
  returns:{ "post_id": "..." }

GET /api/v1/feed?cursor={opaque}&limit=20
  returns:{ "posts": [ {post_id, author, content, ts, score}, ... ],
            "next_cursor": "..." }

Note the cursor-based pagination rather than offset/page numbers. Offsets break when new posts shift everything down; an opaque cursor (encoding the last-seen post’s score+id) gives stable, duplicate-free infinite scroll.

4. Data model

posts                         follows                    feed_cache (per user)
  post_id   (PK)               follower_id (PK part)      user_id   (PK part)
  author_id (indexed)          followee_id (PK part)      post_id   (sort key, by score/ts)
  content                      created_at                 author_id
  created_at                                              score
  media_urls

posts is the source of truth (durable, sharded by post_id). follows is the social graph. feed_cache is the precomputed per-user timeline — the thing fan-out-on-write populates. The feed cache is keyed by user_id, which makes each user’s feed a single partition you can fetch in one hop. Store it in Redis (sorted sets are a perfect fit for a ranked, capped list).

5. The core decision: fan-out on write vs read

Fan-out on write (push)

When a user posts, immediately push the post id into the feed cache of every follower. Reads are then trivial — a feed fetch is just “read my precomputed list.”

PUBLISH:  post → look up author's followers → insert post_id into each follower's feed_cache
READ:     GET feed_cache[user_id]   ← O(1), already assembled, sub-millisecond

Buys: blazing-fast reads (the expensive work is pre-done). Costs: writes are catastrophic for high-follower accounts — one celebrity post triggers millions of cache writes, and most of those feeds may never be read.

Fan-out on read (pull)

Store nothing precomputed. When a user opens their feed, gather posts on demand from everyone they follow, merge, rank, return.

PUBLISH:  post → write to posts table.  Done. O(1).
READ:     for each followee → fetch recent posts → merge + rank → return   ← expensive

Buys: cheap writes, no wasted work, always fresh. Costs: reads are slow and hit the database hard — fetching from hundreds of followees on every scroll, at 100,000 reads/sec, doesn’t scale.

                  Fan-out on WRITE (push)      Fan-out on READ (pull)
write cost        high (×followers)            low (×1)
read cost         low (precomputed)            high (gather + merge)
freshness         slight delay                 instant
wasted work       feeds nobody reads           none
celebrity         💥 millions of writes        💥 huge gather on read

The hybrid — and why it’s the real answer

Neither pure approach survives both inactive users and celebrities. Production systems use a hybrid:

Push for ordinary accounts (hundreds/thousands of followers): cheap to fan out, fast to read.
Pull for celebrity accounts (millions of followers): do not fan out their posts on write — that’s the celebrity problem. Instead, at read time, fetch the precomputed feed and merge in recent posts from the handful of celebrities the user follows.

read feed(user):
    base = feed_cache[user]                    # pushed posts from normal accounts
    celeb = recent_posts(celebrities_followed) # pulled at read time
    return rank(merge(base, celeb))[:limit]

6. High-level design

                            ┌──────────────┐
  POST ──► LB ──► App ──────►│ posts store  │ (durable, sharded)
                    │        └──────────────┘
                    │ enqueue fan-out job
                    ▼
            ┌───────────────┐   for non-celebrity authors:
            │ Message Queue │──► Fan-out workers ──► write post_id into
            └───────────────┘                       followers' feed_cache (Redis)

  GET feed ──► LB ──► App ──► read feed_cache (push)  ──┐
                              + pull recent celeb posts ─┴─► rank ──► return

The fan-out is done asynchronously through a message queue: the poster gets an instant 200 while workers populate follower feeds in the background. This decouples write latency from fan-out cost and lets you absorb spikes by buffering. Feed caches live in Redis; the cache holds only a capped window (e.g. the latest 800 post ids per user), with older history reconstructed from the posts store on demand.

Deep dives

Ranking: beyond recency, score posts by engagement signals, affinity to the author, and recency decay. Ranking can run at write time (store the score) or read time (rerank the candidate set). Most systems generate a candidate set cheaply, then apply an ML ranker on the top N at read time.
Hot partition on the social graph: a celebrity’s follower list is itself a hot read. Cache it and shard the follows table carefully — see hot partitions.
Eventual consistency: a freshly published post may take seconds to appear in all feeds. That’s an accepted trade for write throughput — feeds don’t need to be transactional.
Pagination: cursor-based, as above, so infinite scroll stays stable as new posts arrive.

Key trade-offs

Push vs pull vs hybrid: fast reads vs cheap writes vs operational complexity. The hybrid wins by routing per follower count.
Precompute vs compute-on-demand: precomputed feeds cost storage and wasted work for inactive users; on-demand costs read latency. Cap the precompute window to bound storage.
Ranking quality vs latency: richer ML ranking improves relevance but adds read-time cost; rank only the top candidate set.
Freshness vs throughput: async fan-out buys throughput at the cost of seconds of staleness — almost always the right call for a feed.

Check your understanding

Explain fan-out on write vs fan-out on read. What does each buy and cost?
Why does neither pure approach survive a user with 50 million followers, and how does the hybrid fix it?
Why is the fan-out performed asynchronously through a message queue rather than inline with the post request?
Why cursor-based pagination instead of page numbers for an infinite-scroll feed?
Where does eventual consistency show up in this design, and why is it an acceptable trade-off?