A key idea behind network coding—that sending evidence about a message can be more useful at times than sending the message itself—originated with Claude Elwood Shannon in the late 1940s. At the time, Shannon worked for Bell Telephone Laboratories and was concerned about improving communications along copper telephone lines.

Communication is about reliably transferring information from one location to another. In Shannon's day, as today, achieving reliability was difficult because communications media were noisy. Voltage applied at one end of a copper phone line or a radio signal transmitted through air could be corrupted by noise in the environment as well as interference from other transmissions. As a result, the signal received usually differed from the one transmitted. Today, the problem often plagues cell phones, for example.

Communications system design therefore focuses largely on protecting transmitted signals from noise. Early systems were pretty simple. To increase the reliability with which a message was received, it was sent with more power (the equivalent of speaking louder in a noisy room) or the transmission was repeated. Increasing reliability required transmitting with ever more power or sending less and less new information per transmission.

Shannon, a mathematician and engineer, changed that in 1948, when he published a paper, "A Mathematical Theory of Communication," in the Bell System Technical Journal. In it he made a startling claim: increasing reliability does not require increasing power or decreasing the amount of new information sent per transmission. He proved that every channel has a fixed "capacity" for any given power budget. That capacity can be defined as the fraction of bits (0s and 1s) in the full data stream that can be used to convey new information reliably. The remainder of the bit stream is used to convey just enough evidence to allow us to correct any errors caused by noise in the environment. In other words, provided we remain below channel capacity, we can achieve greater reliability without increasing the fraction of the bit stream used to send evidence about the new information content.

If you encode a message in a sequence of 0s and 1s and then tell a friend that your transmission contains an even number of 1s, your friend has a simple test for detecting a change in your transmission. If you tell your friend to expect four bits and then give your friend similar evidence about several subsets of the bits (the total number of 1s in positions one, two and three is even; the total number of 1s in positions one, two and four is odd; the total number of 1s in positions one, three and four is even; and so on), your friend can use that evidence to determine likely causes and corrections for observed inconsistencies. For example, there are only two strings of four-bit length that satisfy all of the above properties: 1010 and 1101. Thus, if your friend receives the string 1110, she knows that the message has been corrupted. The given string has a total of three 1s in positions 1, 2 and 3 when she expected an even number of 1s in those positions. Because the string 1010 differs from the received string in only one position, whereas 1101 differs from the received string in two positions, it is likely that the second bit in the first string was flipped during transmission. Your friend can then "correct" the string to 1010, because that is probably the message that was transmitted. Because describing whether the number of 1s in a subset of positions is even or odd requires only one bit (you send a 0 if the number of bits is even and a 1 if odd), the code requires little overhead. This type of "error correction code" is key to achieving capacity.

Hundreds of researchers have spent the last 50 years developing low-complexity techniques to realize Shannon's vision. Error correction codes now play a critical role in everything from supermarket bar codes to cell phones to the undersea fiber-optic cables that make the Internet possible.