Real-Time: Polling, WebSockets & SSE

HTTP has a built-in bias: the client asks, the server answers. That works beautifully for “give me this page.” It works terribly for “tell me the moment something changes” — a new chat message, a price tick, a notification, a live score. The server knows something happened, but in plain request-response it has no way to speak unless spoken to. This page is about the techniques for getting fresh data to clients with low latency, and how to choose among them by direction and scale.

The simplest answer to “how do I know when something changed?” is to keep asking. The client sends a request every few seconds: anything new? anything new? anything new?

client → GET /messages?since=... server → "nothing"
(wait 3s)
client → GET /messages?since=... server → "nothing"
(wait 3s)
client → GET /messages?since=... server → "here's 1 new message"

It is trivial to build on ordinary HTTP. But the cost is brutal at scale: most requests return nothing, yet each one still pays the full price of a round trip, headers, and (often) a fresh connection. With 100,000 clients polling every 3 seconds, that’s 100,000 / 3 ≈ 33,000 requests per second, the vast majority of which are wasted. And your data is still up to 3 seconds stale. Short polling trades near-zero implementation effort for enormous waste and mediocre freshness.
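To make the waste concrete, here is a minimal sketch that reproduces the arithmetic above and simulates one polling client on a timeline. The function names (`fleet_request_rate`, `simulate_short_polling`) and the example timeline are illustrative assumptions, not part of any real API.

```python
def fleet_request_rate(clients, interval_s):
    """Requests per second when `clients` each poll once every `interval_s` seconds."""
    return clients / interval_s


def simulate_short_polling(event_times, interval_s, duration_s):
    """Simulate one client polling at t = interval_s, 2 * interval_s, ... up to duration_s.

    Returns (total_requests, empty_requests, staleness), where `staleness`
    lists how long each event sat on the server before a poll picked it up.
    """
    pending = sorted(event_times)
    total = 0
    empty = 0
    staleness = []
    for n in range(1, int(duration_s // interval_s) + 1):
        t = n * interval_s
        total += 1
        delivered = [e for e in pending if e <= t]
        if delivered:
            staleness.extend(t - e for e in delivered)
            pending = [e for e in pending if e > t]
        else:
            empty += 1  # this request paid a full round trip for "nothing"
    return total, empty, staleness


if __name__ == "__main__":
    print(round(fleet_request_rate(100_000, 3)))   # fleet-wide requests per second
    print(simulate_short_polling([7], 3, 60))      # one event at t=7s in a 60s window
```

In the hypothetical one-minute window, a single event costs twenty requests, nineteen of them empty, and the event still waits until the next poll before the client sees it.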

Long polling is a clever hack on the same machinery. The client makes a request, and instead of answering “nothing” immediately, the server holds the connection open until it actually has something to say (or a timeout fires). The instant news arrives, the server responds; the client immediately reconnects to wait for the next event.

client → GET /messages (server holds it open... waiting...)
── new message arrives ──► server responds instantly
client → GET /messages (reconnect, hold open again...)

This cuts the wasted empty responses and delivers events with near-real-time latency, all over ordinary HTTP, which is why it was the workhorse of the real-time web before better transports existed. But it still costs a full request-response cycle per delivery, ties up a server connection per waiting client, and is fiddly around timeouts and reconnection. It’s a bridge technology: better than short polling, clumsier than what came next.
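The hold-then-respond cycle can be sketched in-process with `asyncio`, leaving out the HTTP layer entirely. `MessageBoard`, `long_poll`, and `demo` are hypothetical names for illustration; a real server would wrap the same wait in a request handler.

```python
import asyncio


class MessageBoard:
    """In-memory stand-in for the server side of a long-polling endpoint."""

    def __init__(self):
        self._messages = []
        self._changed = asyncio.Condition()

    async def publish(self, text):
        async with self._changed:
            self._messages.append(text)
            self._changed.notify_all()  # wake every held-open poll

    async def long_poll(self, since, timeout):
        """Return messages after index `since`, holding the call open up to `timeout` seconds."""

        async def wait_for_news():
            async with self._changed:
                await self._changed.wait_for(lambda: len(self._messages) > since)

        try:
            await asyncio.wait_for(wait_for_news(), timeout)
        except asyncio.TimeoutError:
            return []  # timeout fired: answer "nothing" so the client can reconnect
        return self._messages[since:]


async def demo():
    board = MessageBoard()

    async def client():
        received, cursor, empty_polls = [], 0, 0
        while len(received) < 2:
            batch = await board.long_poll(cursor, timeout=0.05)
            if not batch:
                empty_polls += 1
            cursor += len(batch)
            received.extend(batch)  # then loop: reconnect and wait again
        return received, empty_polls

    task = asyncio.create_task(client())
    await asyncio.sleep(0.12)  # nothing happens for a while; polls time out
    await board.publish("hello")
    await board.publish("world")
    return await task


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The client-side `cursor` plays the role of a `since` parameter: it is what lets a reconnecting client pick up anything published between two polls.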

A WebSocket abandons the request-response model entirely. It starts as a normal HTTP request that asks to “upgrade” the connection, and once the server agrees, the same TCP connection becomes a persistent, full-duplex channel: either side can send a message at any time, with very little per-message overhead.

client → HTTP GET ... Upgrade: websocket
server → 101 Switching Protocols
═══════════ persistent bidirectional connection ═══════════
client ⇄ server ← either side sends, anytime, low overhead →

This is the right tool when communication is genuinely two-way and chatty: multiplayer games, collaborative editing (think cursors and edits flying both directions), chat, live trading. What does this buy us, and what does it cost? It buys true bidirectional, low-latency, low-overhead messaging. It costs a persistent stateful connection: each open socket consumes server memory and a connection slot, which complicates load balancing and cuts against statelessness. It’s also a different protocol (ws://, or wss:// over TLS), so some proxies and infrastructure need special handling, and you must manage reconnection yourself.
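The upgrade step described above can be made concrete. Per RFC 6455, the client sends a random `Sec-WebSocket-Key` header, and the server proves it understood the upgrade by returning a `Sec-WebSocket-Accept` value derived from that key and a fixed GUID. The sketch below covers only that handshake response, not message framing; the helper names `websocket_accept` and `upgrade_response` are illustrative.

```python
import base64
import hashlib

# Fixed GUID that RFC 6455 defines for the opening handshake.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept(sec_websocket_key):
    """Compute the Sec-WebSocket-Accept value for a client's Sec-WebSocket-Key."""
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def upgrade_response(sec_websocket_key):
    """Build the raw '101 Switching Protocols' response that completes the upgrade."""
    return (
        "HTTP/1.1 101 Switching Protocols\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Accept: {websocket_accept(sec_websocket_key)}\r\n"
        "\r\n"
    )


if __name__ == "__main__":
    # Sample key taken from the example handshake in RFC 6455.
    print(upgrade_response("dGhlIHNhbXBsZSBub25jZQ=="))
```

In practice a WebSocket library performs this handshake for you; the point is only that after this single HTTP exchange the connection stops speaking HTTP.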

Server-Sent Events (SSE): one-way, the easy way

Often you don’t need two-way. A news feed, a notification stream, a progress bar, a live dashboard — the data flows server → client only. For that, a WebSocket is overkill. Server-Sent Events is a much simpler, often-overlooked tool: a single long-lived HTTP response that the server keeps writing to, streaming events down to the client as they happen.

client → GET /stream (Accept: text/event-stream)
server → (keeps the response open, writes events as they occur)
data: price 101
data: price 102
data: price 103 ...

Because it’s just a long HTTP response, SSE inherits HTTP’s virtues: it works over plain HTTP, usually passes through proxies and firewalls cleanly (provided nothing in the path buffers the response), and, a genuinely nice touch, the browser’s EventSource automatically reconnects, sending the last event ID it saw in a Last-Event-ID header so the server can resume from there. You give up the upstream channel: the client can’t push back over the same connection (it just makes ordinary requests for that).
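On the wire, each event in a `text/event-stream` response is a few `field: value` lines terminated by a blank line, and an optional `id:` field is what makes resumption possible. The sketch below formats such frames and skips events a reconnecting client has already seen. `format_sse` and `stream_prices` are hypothetical helpers, and using the list index as the event ID is an assumption made for the example.

```python
def format_sse(data, event_id=None, event=None):
    """Serialize one event in the text/event-stream format."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    # Multi-line payloads become several data: lines within the same event.
    for part in data.split("\n"):
        lines.append(f"data: {part}")
    return "\n".join(lines) + "\n\n"  # the blank line ends the event


def stream_prices(prices, last_event_id=None):
    """Yield frames for `prices`, resuming after `last_event_id` if the client sent one.

    Assumption for this sketch: an event's ID is its index in `prices`.
    """
    start = 0 if last_event_id is None else int(last_event_id) + 1
    for index in range(start, len(prices)):
        yield format_sse(f"price {prices[index]}", event_id=index)


if __name__ == "__main__":
    # A fresh client gets everything; one reconnecting with ID 0 skips the first event.
    print("".join(stream_prices([101, 102, 103])))
    print("".join(stream_prices([101, 102, 103], last_event_id="0")))
```

A server would write these strings to the open response as events occur; the resume logic runs once per reconnect, using whatever ID the client reports.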

| Technique | Direction | Connection cost | Latency | Best for |
| --- | --- | --- | --- | --- |
| Short polling | client → server | wasteful (many empty hits) | poor (stale) | rare updates, dead-simple needs |
| Long polling | client → server | one conn per waiting client | near-real-time | fallback, legacy infra |
| SSE | server → client | one open response/client | real-time | feeds, notifications, dashboards |
| WebSockets | bidirectional | persistent socket/client | real-time | chat, games, collaboration |

The scale axis matters because every option except short polling holds a connection open per client. A million concurrent users means a million live connections — a real capacity and cost problem that pushes you toward connection-efficient servers, careful load balancing, and sometimes a dedicated real-time tier. Persistent connections are not free; they are state, and state is the thing horizontal scaling works hardest to avoid.
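The comparison above can be condensed into a small decision helper. This is only a sketch of the "Best for" column: the function name, its three boolean inputs, and the fallback rule for infrastructure that cannot stream are simplifying assumptions, not a complete selection procedure.

```python
def choose_transport(client_pushes_often, needs_low_latency, infra_supports_streaming=True):
    """Pick the simplest technique that satisfies the stated direction and freshness needs.

    All three parameters are hypothetical simplifications of the comparison:
    - client_pushes_often: traffic is genuinely two-way and chatty
    - needs_low_latency: updates must arrive as they happen
    - infra_supports_streaming: proxies/servers allow long-lived streamed responses
    """
    if client_pushes_often and needs_low_latency:
        return "websocket"      # chat, games, collaboration
    if needs_low_latency:
        if infra_supports_streaming:
            return "sse"        # feeds, notifications, dashboards
        return "long polling"   # fallback for legacy infrastructure
    return "short polling"      # rare updates, dead-simple needs


if __name__ == "__main__":
    print(choose_transport(client_pushes_often=False, needs_low_latency=True))
    print(choose_transport(client_pushes_often=True, needs_low_latency=True))
```

Scale is deliberately left out of the inputs: every result except short polling implies one held connection per client, so that cost has to be weighed separately.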

How does a server tell a client about something the instant it happens, when HTTP only lets clients ask? The answers form a ladder: polling fakes it by asking repeatedly; long polling holds the question open; SSE turns one response into a one-way stream; WebSockets open a true two-way pipe. Climbing the ladder buys you freshness and bidirectionality, and costs you persistent stateful connections — so you pick the lowest rung that satisfies your direction and your scale.

  1. Why does plain request-response HTTP make server-initiated updates awkward in the first place?
  2. How does long polling reduce the waste of short polling, and what does it still cost?
  3. What does the WebSocket “upgrade” accomplish, and what new operational cost does a persistent socket introduce?
  4. When is SSE the better choice than a WebSocket, and what capability are you giving up by choosing it?
  5. Why does the scale of concurrent clients push back against every option except short polling?