Smartphones, tablets and other network-connected gadgets will outnumber humans by the end of the year. Perhaps more significantly, the faster and more powerful mobile devices hitting the market are producing and consuming content at unprecedented levels. Global mobile data traffic grew 70 percent in 2012, according to a recent report from Cisco, which makes much of the gear that runs the Internet. Yet the capacity of the world's networking infrastructure is finite, leaving many to wonder when we will reach the upper limit and what we will do when that happens.
There are ways to boost capacity, of course, such as adding cables, packing those cables with more data-carrying optical fibers and off-loading traffic onto smaller satellite networks, but these steps simply delay the inevitable. The solution is to make the overall infrastructure smarter. Two main components are needed: computers and other devices that can preprocess and possibly filter or aggregate their content before tossing it onto the network, along with a network that better understands what to do with this content, rather than numbly perceiving it as an endless, undifferentiated stream of bits and bytes.
To find out how these major advances could be accomplished, Scientific American spoke with Markus Hofmann, head of Bell Labs Research in Holmdel, N.J., the research and development arm of Alcatel-Lucent that, in its various guises, is credited with developing the transistor, the laser, the charge-coupled device and a litany of other groundbreaking 20th-century technologies. Hofmann—who joined Bell Labs in 1998, after earning his Ph.D. at the University of Karlsruhe in Germany—and his team see “information networking” as the way forward, an approach that promises to extend the Internet's capacity by raising its IQ. Excerpts follow.
Hofmann: The signs are subtle, but they are there. A personal example: When I use Skype to send my parents in Germany live video of my kids playing hockey, the video sometimes freezes at the most exciting moments. In all, this doesn't happen too often, but it happens more frequently lately—a sign that networks are becoming stressed by the amount of data they're asked to carry.
We know that Mother Nature gives us certain limits: there is only so much information you can transmit over a given communications channel. That phenomenon is called the nonlinear Shannon limit [named after former Bell Telephone Laboratories mathematician Claude Shannon], and it tells us how far we can push with today's technologies. We are already very, very close to this limit, within a factor of two roughly. Put another way, when the amount of network traffic we have today doubles, something that could happen within the next four or five years, demand will run up against the Shannon limit. That tells us there's a fundamental roadblock here. There is no way we can stretch this limit, just as we cannot increase the speed of light. So we need to work with these limits and still find ways to continue the needed growth.
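To give a rough feel for why such a limit bites, here is a minimal sketch using the classical Shannon-Hartley formula for a linear channel with Gaussian noise, C = B log2(1 + S/N). This is a simplification: the nonlinear limit for optical fiber that Hofmann refers to is more involved, and the bandwidth and signal-to-noise figures below are illustrative assumptions, not measurements from any real link.

```python
import math


def shannon_capacity_bps(bandwidth_hz: float, snr_linear: float) -> float:
    """Classical Shannon-Hartley capacity in bits per second.

    snr_linear is the signal-to-noise power ratio as a plain ratio, not in dB.
    """
    return bandwidth_hz * math.log2(1.0 + snr_linear)


def snr_needed(bandwidth_hz: float, target_bps: float) -> float:
    """Signal-to-noise ratio required to reach target_bps in a fixed bandwidth."""
    return 2.0 ** (target_bps / bandwidth_hz) - 1.0


if __name__ == "__main__":
    bandwidth = 1e6   # 1 MHz, an arbitrary illustrative value
    snr = 1023.0      # about 30 dB, also illustrative
    capacity = shannon_capacity_bps(bandwidth, snr)
    print(f"capacity: {capacity / 1e6:.1f} Mbit/s")
    print(f"SNR needed to double it: {snr_needed(bandwidth, 2 * capacity):.0f}")
```

In this toy channel, 1 MHz of bandwidth at a signal-to-noise ratio of 1,023 carries 10 megabits per second. Doubling that to 20 megabits per second in the same bandwidth would take a ratio of roughly one million, which is why simply turning up the power does not get you far.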
The most obvious way is to increase bandwidth by laying more fiber. Instead of having just one transatlantic fiber-optic cable, for example, you have two, or five, or 10. That's the brute-force approach, but it's very expensive—you need to dig up the ground and lay the fiber, you need multiple optical amplifiers, transmitters and receivers, and so on. To make this economically feasible, we need to not only integrate multiple channels into a single optical fiber but also collapse multiple transmitters and receivers using new technologies such as photonic integration. This approach is referred to as spatial division multiplexing.
Still, boosting the existing infrastructure alone won't be sufficient to meet growing communications needs. What's needed is an infrastructure that no longer looks at raw data as only bits and bytes but rather as pieces of information relevant to a person using a computer or smartphone. On a given day do you want to know the temperature, wind speed and air pressure, or do you simply want to know how you should dress? This is referred to as information networking.
Many people refer to the Internet as a “dumb” network, although I don't like that term. What drove the Internet initially was non-real-time sharing of documents and data. The system's biggest requirement was resiliency—it had to be able to continue operating even if one or more nodes [computers, servers, and so on] stopped functioning. And the network was designed to see data simply as digital traffic, not to interpret the significance of those data.
Today we use the Internet in ways that require real-time performance, whether that is watching streaming video or making phone calls. At the same time, we're generating much more data. The network has to become more aware of the information it's carrying so it can better prioritize delivery and operate more efficiently. For example, if I'm doing a video conference in my office and turn my head away from the screen to chat with someone who has just entered my office, the conference setup should know to stop transmitting video until my attention returns to the screen. The system would recognize that I am no longer paying attention and not waste bandwidth while I'm speaking with the person in my office.
There are different approaches. If you want to know more about the data crossing a network—for example, to send a user's request for a Web page to the closest Web server—then you use software to peek into the data packet, something called deep-packet inspection. Think of a physical letter you send through the normal postal service wrapped in an envelope with an address on it. The postal service doesn't care what the letter says; it's only interested in the address. This is how the Internet functions today with regard to data. With deep-packet inspection, software tells the network to open the data envelope and read at least part of what's inside. But you can get only a limited amount of information about the data this way, and it requires a lot of processing power. Plus, if the data inside the packet are encrypted, deep-packet inspection won't work.
A better option would be to tag data and give the network instructions for handling different types of data. There might be a policy that states that a video stream should get priority over an e-mail, although you don't have to reveal exactly what is in that video stream or e-mail. The network simply takes these data tags into account when making routing decisions.
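The tagging idea can be sketched as a small priority queue. This is only an illustration under stated assumptions: the tag names, the policy table and the `TaggedQueue` class are hypothetical, and a real router would schedule traffic very differently. The point it shows is that the queue orders packets by tag alone and never reads the payload, which could just as well be encrypted.

```python
import heapq
from itertools import count

# Hypothetical policy: a lower rank is delivered first.
POLICY = {"video": 0, "email": 1}


class TaggedQueue:
    """Orders packets by their tag; the payload stays opaque."""

    def __init__(self, policy):
        self._policy = dict(policy)
        # Tags the policy does not know about are served last.
        self._lowest = max(self._policy.values()) + 1
        self._heap = []
        self._arrival = count()  # preserves first-in, first-out within a tag

    def enqueue(self, tag: str, payload: bytes) -> None:
        rank = self._policy.get(tag, self._lowest)
        heapq.heappush(self._heap, (rank, next(self._arrival), tag, payload))

    def dequeue(self):
        _rank, _order, tag, payload = heapq.heappop(self._heap)
        return tag, payload

    def __len__(self) -> int:
        return len(self._heap)


if __name__ == "__main__":
    queue = TaggedQueue(POLICY)
    queue.enqueue("email", b"message 1")
    queue.enqueue("video", b"frame 1")
    queue.enqueue("email", b"message 2")
    queue.enqueue("video", b"frame 2")
    while len(queue):
        print(queue.dequeue())
```

Both video frames leave the queue before either e-mail, and packets carrying the same tag keep their arrival order.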
It all depends on the level at which these tags are being used. For example, data packets that use Internet protocol have a header that includes the source and destination address. These could be considered “tags,” but they provide very limited information. They don't indicate what Web site a user is requesting. They don't tell if the data belong to a [real-time] video stream or if they can be processed in batches. I'm talking about richer, higher-level tags or metadata that can in parts be mapped onto these lower-level tags.
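One way to picture richer tags being mapped onto lower-level ones is the Differentiated Services field that already exists in the IP header. The sketch below is an assumption about how such a mapping might look, not something described in the interview: the high-level tag names and the `TAG_TO_DSCP` table are hypothetical, while the three code-point values are the standard ones for Expedited Forwarding, AF41 and best effort.

```python
# Standard Differentiated Services code points (six-bit values).
DSCP_EF = 46        # Expedited Forwarding, commonly used for voice
DSCP_AF41 = 34      # Assured Forwarding class 4, low drop precedence
DSCP_DEFAULT = 0    # best effort

# Hypothetical mapping from higher-level tags to header code points.
TAG_TO_DSCP = {
    "voice-call": DSCP_EF,
    "realtime-video": DSCP_AF41,
    "batch": DSCP_DEFAULT,
}


def tos_byte(tag: str) -> int:
    """Return the header byte that carries the code point for a tag.

    The code point occupies the upper six bits of the byte; the lower two
    bits are reserved for congestion notification and are left at zero here.
    Unknown tags fall back to best effort.
    """
    return TAG_TO_DSCP.get(tag, DSCP_DEFAULT) << 2


if __name__ == "__main__":
    for tag in ("voice-call", "realtime-video", "batch", "something-else"):
        print(f"{tag}: 0x{tos_byte(tag):02X}")
```

The header field has room for only 64 distinct values, which illustrates why the lower-level tags can carry just part of what a richer tag would express.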
It should be no different from what we see already, for example, on our roads and streets. When we hear an emergency vehicle with sirens on, we are all expected to pull to the side, clearing the street to let the vehicle pass as smoothly and quickly as possible, maybe saving a person's life. The tag in this case is the siren—as long as we recognize there is an emergency, we don't need to know who is in the vehicle or what the problem is, and we behave accordingly. Should we also give certain Internet packets priority in case of an emergency? It's all about transparency and agreed-on behaviors—on the roads and also on the Net.
Our smartphones, computers and other gadgets generate a good deal of raw data that we then send to data centers for processing and storage. Sending all these data around the globe for processing in a centralized center will not scale in the future. Rather we might move to a model where decisions are made about data before they are placed on the network. For example, if you have a security camera at an airport, you would program it or a small computer server controlling multiple cameras to perform facial recognition locally, based on a database stored in a camera or server, before putting any information out on the network.
At the moment, privacy is binary—either you keep your privacy, or you have to give it up almost entirely to obtain certain personalized services, such as music recommendations or online coupons. There has to be something in between that puts users in control of their information.
The biggest problem is that it has to be simple for the user. Look at how complicated it is to manage your privacy on social networks. You end up having your photos in the photo stream of people you don't even know. There should be the digital equivalent of a knob that lets you trade off privacy with personalization. The more I reveal about myself, the more personalized the services I receive. But I can also dial it back—if I'm willing to provide less detailed information, I can still receive some personalized, albeit less targeted, offers.
The information-networking approach provides the overall infrastructure with more awareness about the network traffic, which might be helpful in identifying and mitigating certain types of cyberattacks. Other factors could complicate this, however. I would expect, and hope, that data traffic will increasingly be encrypted to help provide true security and privacy. Of course, once data are encrypted, it becomes difficult to extract any information from them. This is a research challenge that will require new encryption schemes that maintain secrecy while permitting certain mathematical operations on the encrypted information.
Imagine, for example, that the income of each household in an area is encrypted and stored on a server in the cloud, so no one—except the authorized owner—can read the actual numbers. It might be helpful if the numbers were encrypted in a way that would allow software running in the cloud to calculate the average household income in the area—without identifying any of the actual households, purely by operating on the encrypted numbers.
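The household-income example can be sketched with an additively homomorphic scheme. The code below uses a toy version of the Paillier cryptosystem, which is a swapped-in illustration and not necessarily the kind of scheme Hofmann has in mind. The key is built from two small, well-known primes and is far too weak for real use, the income figures are invented, and the code assumes Python 3.8 or later for the modular inverse. In this sketch the server only produces an encrypted sum; the key holder decrypts it and divides by the number of households.

```python
import math
import secrets

# Toy key from two Mersenne primes. Real keys are vastly larger.
P = 2**31 - 1
Q = 2**61 - 1
N = P * Q
N_SQ = N * N
LAMBDA = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P - 1, Q - 1)
MU = pow(LAMBDA, -1, N)  # modular inverse, Python 3.8+


def encrypt(m: int) -> int:
    """Encrypt an integer 0 <= m < N using the generator g = N + 1."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    # (N + 1)**m mod N**2 simplifies to 1 + m*N.
    return ((1 + m * N) * pow(r, N, N_SQ)) % N_SQ


def decrypt(c: int) -> int:
    """Recover the plaintext; requires the private values LAMBDA and MU."""
    x = pow(c, LAMBDA, N_SQ)
    return ((x - 1) // N * MU) % N


def add_encrypted(ciphertexts) -> int:
    """Multiply ciphertexts, which adds the hidden plaintexts.

    This step needs only the public modulus, so an untrusted server can run it.
    """
    total = 1
    for c in ciphertexts:
        total = (total * c) % N_SQ
    return total


if __name__ == "__main__":
    incomes = [52000, 61000, 48000, 75000]        # invented figures
    stored = [encrypt(x) for x in incomes]        # what the cloud server holds
    encrypted_sum = add_encrypted(stored)         # computed without any key
    average = decrypt(encrypted_sum) / len(stored)
    print(average)
```

The server never sees an individual income or even the total in the clear; only the holder of the private key learns the average.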
Another approach might be to develop clever ways of managing encryption keys so they can be shared without compromising security. If done right, none of this should put any more burden on the user. That's the key and the challenge. Just think of how many users are actually encrypting their e-mails today—almost none, because it's extra work.