I have to admit that I find the existence of fountain codes genuinely shocking. The idea that you can take an arbitrary file, turn it into a stream of fungible blobs, receive those blobs in literally any order, and have each new one help you reconstruct the original already seems pretty impressive. But then you learn that the total overhead for all of this is under 5%, and that the receiver often needs just two extra symbols beyond the bare minimum to decode with near-certainty. That seems both magical and frankly improbable.
To see why this matters, think about how we normally move data around. TCP is fundamentally a conversation: "I sent packet 4." "I didn’t get packet 4." "Okay, resending packet 4." "Got it." That works fine for loading a webpage, but it falls apart when latency is high (try a 40-minute round trip to Mars) or when you’re broadcasting to a million receivers at once over lossy cellular. TCP requires a feedback loop. The sender has to know exactly what the receiver is missing. Scale that to a million receivers, each losing different packets, all sending retransmission requests at once. That’s a feedback implosion. The sender drowns.
RaptorQ does something completely different. You turn your file into a mathematical liquid and just spray packets at the receiver. The receiver is basically just a bucket. It doesn’t matter which drops land in it, and it doesn’t matter if half the spray blows away in the wind. As soon as the bucket has roughly K drops (not any particular drops, just enough of them), the receiver reconstructs the original data.
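The bucket idea can be sketched in a few dozen lines. What follows is a toy fountain over GF(2), not RaptorQ's actual construction: each "drop" is the XOR of a random subset of the K source blocks, and the receiver decodes by Gaussian elimination once any K linearly independent drops have landed, in any order. (Real LT/Raptor codes use a carefully tuned degree distribution instead of dense random masks, precisely so they can avoid full Gaussian elimination; the function names here are my own.)

```python
import random

def xor_into(dst, src):
    for j in range(len(dst)):
        dst[j] ^= src[j]

def encode_symbol(blocks, rng):
    """One 'drop': (mask, payload) where payload = XOR of the masked blocks."""
    k = len(blocks)
    mask = 0
    while mask == 0:                       # dense random mask; real LT codes
        mask = rng.getrandbits(k)          # draw from a degree distribution
    out = bytearray(len(blocks[0]))
    for i in range(k):
        if (mask >> i) & 1:
            xor_into(out, blocks[i])
    return mask, out

def decode(symbols, k):
    """Gauss-Jordan over GF(2); returns the k source blocks, or None if the
    received symbols do not yet span the whole space."""
    pivot = [None] * k                     # pivot[i]: row whose lowest bit is i
    for mask, payload in symbols:
        payload = bytearray(payload)       # work on a copy
        for i in range(k):                 # eliminate against known pivots
            if (mask >> i) & 1 and pivot[i] is not None:
                pmask, ppay = pivot[i]
                mask ^= pmask
                xor_into(payload, ppay)
        if mask:                           # independent row: keep it
            pivot[(mask & -mask).bit_length() - 1] = (mask, payload)
    if any(p is None for p in pivot):
        return None                        # the bucket isn't full yet
    for i in reversed(range(k)):           # back-substitute to isolate blocks
        mask, payload = pivot[i]
        for j in range(i + 1, k):
            if (mask >> j) & 1:
                jmask, jpay = pivot[j]
                mask ^= jmask
                xor_into(payload, jpay)
        pivot[i] = (mask, payload)
    return [bytes(p) for _, p in pivot]

# Spray drops until the bucket fills; which drops arrive is irrelevant.
rng = random.Random(6330)
data = (b"turn a file into a mathematical liquid and spray " * 2)[:64]
K = 8
blocks = [data[i * 8:(i + 1) * 8] for i in range(K)]
received, out = [], None
while out is None:
    received.append(encode_symbol(blocks, rng))
    out = decode(received, K)
assert b"".join(out) == data
print(f"decoded from {len(received)} symbols (K = {K})")
```

Note how the decoder never asks for anything specific: it just keeps collecting drops until the rank reaches K.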
How Good Is It, Really?
This is all codified in RaptorQ (RFC 6330). The RFC actually has a SHALL-level decoder requirement: if you receive encoding symbols whose IDs are chosen uniformly at random, the average decode failure rate must be at most 1 in 100 when receiving K symbols, 1 in 10,000 at K + 1, and 1 in 1,000,000 at K + 2. The receiver almost never needs more than K + 2 symbols to decode perfectly.
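Those three data points fit a simple geometric pattern: receiving K + o symbols fails with probability about 10^(-2(o+1)). Treating that pattern as a model (an extrapolation on my part; the RFC only pins down o = 0, 1, 2), the expected reception overhead works out to roughly a hundredth of a symbol:

```python
# Model (extrapolated from RFC 6330's three data points, not a guarantee):
# receiving K + o random symbols fails with probability 10^(-2*(o+1)).
def p_fail(o):
    return 10.0 ** (-2 * (o + 1))

for o in range(3):
    print(f"K + {o} symbols: failure rate {p_fail(o):.0e}")

# E[extra symbols] = sum over o >= 0 of P(still undecoded with o extras),
# a geometric series: 1/100 + 1/10_000 + ... = 1/99.
expected_extra = sum(p_fail(o) for o in range(50))
print(f"expected extra symbols beyond K: {expected_extra:.4f}")  # ~0.0101
```

In other words, under this model the average receiver needs about 0.01 symbols beyond K, which is why "+2" already buys near-certainty.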
But "+2 symbols" is only the reception overhead—the extra packets the receiver must collect. The full picture includes the precode’s internal expansion from source symbols to intermediate symbols. That ~2.5% structural redundancy is what makes the "+2 symbols" trick possible. Combined with the LT layer, total system overhead is under 5%—still remarkably small.
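To make that accounting concrete, here is the split for a hypothetical K = 1000, using the article's own ballpark figures (the exact precode expansion in RFC 6330 varies with K):

```python
K = 1000                             # source symbols (hypothetical size)
precode_overhead = 0.025             # ~2.5% structural expansion (ballpark)
intermediate = round(K * (1 + precode_overhead))  # symbols the encoder derives
received = K + 2                     # what the receiver typically collects
print(f"{intermediate} intermediate symbols at the encoder")
print(f"{received} received symbols -> "
      f"reception overhead {(received - K) / K:.1%}")
```

The structural cost is paid once at the encoder; what travels over the wire only needs to exceed K by a couple of symbols.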