Backpressure & Flow Control
Every pipeline has a producer and a consumer, and they rarely run at the same speed. When the producer is faster — a firehose of events feeding a slower processor — something has to give. The naive answer is “buffer the difference.” It works, right up until it doesn’t, and when it fails it fails catastrophically: out-of-memory, GC death spirals, and a system that’s slower under recovery than it ever was under load. Backpressure is the discipline of letting a slow consumer tell a fast producer to slow down, instead of silently drowning.
The fundamental imbalance
Picture the pipeline as a pipe with a tank in the middle:
producer ──▶ [ queue / buffer ] ──▶ consumer
  100/s           fills up            40/s
                  at 60/s   ←── the gap has to go somewhere

If producer rate > consumer rate for any sustained period, the buffer between them grows without
bound. There are only ever three things you can do with the overflow, and a system that doesn’t
choose one explicitly has chosen the worst one by accident:
- Buffer it — store the excess and hope the producer slows down later.
- Drop it — discard work you can’t handle (load shedding).
- Block it — refuse to accept more until the consumer catches up (backpressure).
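The three choices can be compared with a toy tick-based simulation. This is a minimal sketch, not a model of any real queue: the 100/s and 40/s rates come from the diagram above, while the `simulate` function, the `Outcome` record, and the capacity of 200 are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    depth: int    # items still sitting in the buffer at the end
    dropped: int  # items discarded (load shedding)
    stalled: int  # items the producer was not allowed to emit (backpressure)


def simulate(policy: str, seconds: int, produce: int = 100,
             consume: int = 40, capacity: int = 200) -> Outcome:
    """Run one-second ticks under one of: 'buffer', 'drop', 'block'."""
    if policy not in ("buffer", "drop", "block"):
        raise ValueError(f"unknown policy: {policy}")
    depth = dropped = stalled = 0
    for _ in range(seconds):
        if policy == "buffer":
            # Unbounded: accept everything, whatever the current depth.
            depth += produce
        else:
            # Bounded: accept only what fits under the capacity.
            accepted = min(produce, capacity - depth)
            depth += accepted
            overflow = produce - accepted
            if policy == "drop":
                dropped += overflow
            else:
                stalled += overflow
        # The consumer removes up to `consume` items per tick.
        depth -= min(depth, consume)
    return Outcome(depth, dropped, stalled)


for name in ("buffer", "drop", "block"):
    print(name, simulate(name, seconds=10))
```

After ten simulated seconds the unbounded buffer holds 600 items and is still growing by 60 per second. The two bounded policies both level off at a depth of 160; they differ only in whether the 440 excess items were discarded or never produced in the first place.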
Why unbounded buffers are a trap
The seductive option is the first one, buffering, with an unbounded queue: “just keep everything, we’ll catch up.” This is one of the most common reliability mistakes in distributed systems. An unbounded buffer doesn’t solve the rate mismatch; it hides it while quietly converting a throughput problem into a memory problem.
queue depth
  │                                    ╱  OOM / crash
  │                                  ╱
  │                                ╱
  │                              ╱
  │                        ____╱
  │            ______╱
  └──────────────────────────────────────── time
    (looks fine)   (latency climbs)   (dead)

The damage compounds:
- Latency explodes. A request entering a 10-million-item queue waits behind all 10 million. The buffer that was supposed to protect you is now the source of your tail latency.
- Memory dies. The queue grows until the process is OOM-killed — at which point you lose everything in the buffer, not just the overflow.
- Recovery is impossible. Once behind, the system must drain the backlog and serve new load. It’s slower exactly when it most needs to be fast — a death spiral.
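The latency and recovery claims above reduce to simple arithmetic. The sketch below assumes a single FIFO queue drained at a steady rate; the function names are hypothetical, and the 40/s consumer rate is borrowed from the earlier diagram.

```python
def queue_wait_seconds(depth: int, consume_rate: float) -> float:
    """Time a newly enqueued item waits behind `depth` items in a FIFO queue."""
    return depth / consume_rate


def drain_seconds(backlog: int, consume_rate: float, arrival_rate: float) -> float:
    """Time to clear `backlog` while new work keeps arriving.

    Only the spare capacity (consume_rate - arrival_rate) eats into the
    backlog, so with no spare capacity the backlog never clears.
    """
    spare = consume_rate - arrival_rate
    if spare <= 0:
        return float("inf")
    return backlog / spare


# A 10-million-item queue drained at 40 items/s:
print(queue_wait_seconds(10_000_000, 40) / 3600)  # about 69.4 hours of wait

# A 36,000-item backlog (ten minutes of a 60/s gap), consumer at 40/s:
print(drain_seconds(36_000, 40, 30))  # 3600.0 seconds if arrivals fall to 30/s
print(drain_seconds(36_000, 40, 40))  # inf if arrivals merely match the consumer
```

The second function is why recovery is so hard: unless arrivals fall well below the consumer's rate, the backlog barely shrinks.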
Flow control: pull instead of push
The fix is to invert who controls the rate. In a push model, the producer decides when to send, and the consumer must cope. In a pull (demand-driven) model, the consumer requests the next batch only when it has capacity. The producer physically cannot get ahead, because it isn’t allowed to send what wasn’t requested.
PUSH (producer-driven)            PULL (consumer-driven)

producer ──▶ consumer             producer ◀── "give me 10" ── consumer
  sends whenever                    sends only what's asked for
  ⇒ consumer drowns                 ⇒ rate self-limits

This is the core idea behind Reactive Streams and the credit-based flow control in message queues: the consumer grants the producer “credits” for N items, and the producer may only send up to its outstanding credit. Pull-based flow control makes backpressure the default rather than something you bolt on.
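A credit scheme can be sketched in a few lines. The class below is an illustration only: `CreditedProducer` and its methods are hypothetical names, loosely modeled on the idea of a consumer calling `request(n)`, and are not the API of any particular library.

```python
from itertools import islice
from typing import Iterator, List


class CreditedProducer:
    """A producer that may only emit items the consumer has asked for."""

    def __init__(self, source: Iterator[int]) -> None:
        self._source = source
        self._credit = 0  # items the consumer has requested but not yet received

    def request(self, n: int) -> None:
        """Called by the consumer: grant permission for n more items."""
        if n <= 0:
            raise ValueError("credit must be positive")
        self._credit += n

    def emit(self) -> List[int]:
        """Send at most the outstanding credit, then stop."""
        batch = list(islice(self._source, self._credit))
        self._credit -= len(batch)
        return batch


producer = CreditedProducer(iter(range(1_000_000)))
print(producer.emit())  # [] because nothing has been requested yet
producer.request(3)
print(producer.emit())  # [0, 1, 2]
print(producer.emit())  # [] because the credit is spent
producer.request(2)
print(producer.emit())  # [3, 4]
```

However large the source is, the producer's output is capped by what the consumer has granted, so the rate limit is structural rather than a matter of tuning.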
The mitigation toolkit
Bounded queues
Cap the buffer. When a bounded queue is full, put() either blocks (propagating backpressure
upstream) or rejects (triggering shedding). The bound is not a limitation — it’s the safety
valve. A bounded queue forces you to decide what to do when full, which is exactly the decision the
unbounded queue let you dodge.
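Both behaviors are available in Python's standard `queue.Queue`, which serves here as one concrete example of a bounded queue. The helper names `offer_or_reject` and `put_or_block`, and the bound of 2, are hypothetical and chosen only to keep the sketch short.

```python
import queue


def offer_or_reject(q: queue.Queue, item: object) -> bool:
    """Reject when full: the caller decides what to shed."""
    try:
        q.put_nowait(item)
        return True
    except queue.Full:
        return False


def put_or_block(q: queue.Queue, item: object, timeout: float) -> None:
    """Block when full: the caller is held until space frees up.

    Raises queue.Full if no slot opens within `timeout` seconds.
    """
    q.put(item, block=True, timeout=timeout)


buf: queue.Queue = queue.Queue(maxsize=2)  # the bound is the safety valve
print(offer_or_reject(buf, "a"))  # True
print(offer_or_reject(buf, "b"))  # True
print(offer_or_reject(buf, "c"))  # False: full, so the item is rejected

buf.get()                         # the consumer makes room
put_or_block(buf, "c", timeout=1.0)
print(buf.qsize())                # 2
```

Calling `put` with no timeout blocks indefinitely, which is the purest form of backpressure. A timeout is a common compromise that turns a stuck consumer into a visible error.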
Load shedding
When you genuinely can’t slow the producer (it’s the open internet), drop work deliberately:
reject excess requests with a 429 Too Many Requests, sample the firehose, or drop the lowest-value
items. Shedding load early is graceful; collapsing under load later is not. A system that sheds 10%
stays up for the other 90%; a system that buffers everything until it crashes serves 0%.
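One simple way to shed load is to cap the number of requests in flight and answer the rest with a 429 immediately. The sketch below is framework-agnostic; `AdmissionGate`, `handle`, and the limit are hypothetical names and values, and a real service would wire the same idea into its HTTP layer.

```python
import threading
from typing import Callable, Tuple


class AdmissionGate:
    """Admits at most `max_in_flight` concurrent requests."""

    def __init__(self, max_in_flight: int) -> None:
        self._max = max_in_flight
        self._in_flight = 0
        self._lock = threading.Lock()

    def try_enter(self) -> bool:
        with self._lock:
            if self._in_flight >= self._max:
                return False  # over capacity: shed instead of queueing
            self._in_flight += 1
            return True

    def leave(self) -> None:
        with self._lock:
            self._in_flight -= 1


def handle(gate: AdmissionGate, work: Callable[[], str]) -> Tuple[int, str]:
    """Return (status, body); reject early rather than buffer the request."""
    if not gate.try_enter():
        return 429, "Too Many Requests"
    try:
        return 200, work()
    finally:
        gate.leave()  # always release the slot, even if work() raises


gate = AdmissionGate(max_in_flight=1)
print(handle(gate, lambda: "ok"))  # (200, 'ok')
```

The rejection path does almost no work, which is the point: saying no has to stay cheap when the system is already overloaded.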
Propagate backpressure end-to-end
Backpressure is only useful if it travels. A blocked consumer must slow its upstream, which slows
its upstream, all the way back to the original source — TCP’s flow control, a client receiving
429s and backing off, a request rejected at the edge. A pipeline where backpressure stops halfway
just relocates the unbounded buffer to wherever the chain breaks.
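Within a single process, chaining bounded queues is enough to make backpressure travel. The following sketch assumes a hypothetical three-part pipeline (source, forwarding stage, slow sink) built on Python's `queue.Queue` and threads; the bound, the delay, and all function names are illustrative.

```python
import queue
import threading
import time
from typing import List, Tuple

SENTINEL = None  # marks the end of the stream


def stage(inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Middle stage: forwards items downstream."""
    while True:
        item = inbox.get()
        outbox.put(item)  # blocks while downstream is full
        if item is SENTINEL:
            return


def slow_sink(inbox: queue.Queue, results: List[int], delay: float) -> None:
    """Final consumer: deliberately slower than the source."""
    while True:
        item = inbox.get()
        if item is SENTINEL:
            return
        time.sleep(delay)
        results.append(item)


def run_pipeline(n_items: int, bound: int = 2,
                 delay: float = 0.001) -> Tuple[List[int], int]:
    q1: queue.Queue = queue.Queue(maxsize=bound)
    q2: queue.Queue = queue.Queue(maxsize=bound)
    results: List[int] = []
    threads = [
        threading.Thread(target=stage, args=(q1, q2)),
        threading.Thread(target=slow_sink, args=(q2, results, delay)),
    ]
    for t in threads:
        t.start()
    max_depth = 0
    for i in range(n_items):
        q1.put(i)  # the source itself blocks here once both queues are full
        max_depth = max(max_depth, q1.qsize(), q2.qsize())
    q1.put(SENTINEL)
    for t in threads:
        t.join()
    return results, max_depth


results, max_depth = run_pipeline(50)
print(len(results), max_depth <= 2)  # 50 True
```

Because every hop is bounded, the slow sink ends up pacing the source. Making either queue unbounded would let the backlog pile up there instead.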
What does this buy us, and what does it cost?
Backpressure buys survival under overload: the system degrades predictably instead of crashing, latency stays bounded, and memory stays safe. The cost is that you must now say no, whether by slowing producers, dropping messages, or rejecting requests. That’s uncomfortable; “we lost some data” feels worse than “we kept everything” right until the unbounded buffer kills the whole process and you lose it all anyway. The mature trade is to accept bounded, visible, controlled loss in exchange for a system that stays alive.
Check your understanding
- What are the only three things a system can do when producers outpace consumers?
- Why does an unbounded buffer convert a throughput problem into a worse (memory + latency) problem?
- How does a pull-based flow model make it structurally impossible for the producer to get ahead?
- Why is a bounded queue’s “full” state a feature rather than a failure?
- What goes wrong if backpressure is applied at the consumer but not propagated back to the source?