When you think about the number of phone calls one can make in a year, there's an upper bound to that. Technology doesn't let me speak any faster—we're not making a much higher magnitude of calls than we were a few years ago, but the storage capacity has gone up many times. Our storage is outpacing our ability to produce information. We are sending more emails than we were years ago, but at a certain point, I can only bang out emails so fast. But the ability to store those emails? That continues to increase exponentially.
Do you see a future where entities store all the information we produce?
It's not the future. It's the present, and it’s called Google, it's called Yahoo, it's called Facebook. Facebook already has every IM you've ever sent [through Facebook]. Google has saved all those emails you've been sending [through Gmail]. They have it, they've indexed it, and they’ve generated models on you. This isn't the future; this is the last few years.
Is that data collection for advertising purposes?
Definitely. In advertising, retailers build certain models. If every week I'm buying beer and chips, and then suddenly they see me buying a pregnancy test, and then they see me buying diapers, they can say, "Oh, okay. Single life is over. We know what's going on, we're going to send this guy info about baby products." Everyone is doing predictive modeling.
On that note, do you see a privacy issue here?
There's a huge privacy issue. There's this great video from the ACLU about ordering a pizza in the future that sums it up. There aren't really any data privacy laws in place. In terms of what a website or retailer can track about you, I'm not aware of any laws on that subject.
We as individuals generally value that information at zero. In studies, researchers have asked people: "For this particular website or service, we'll give you two choices: you can either pay for it, or you get it free, but it comes with ads." Everyone took it for free with ads. What they don't realize, or what they know and don't care about, is that those ads come with tracking cookies. They're collecting these massive data sets on us, and we as Americans just don't mind. Both culturally and legally, we don't seem to care. I think that's very unfortunate.
Is there anything unquestionably positive we can do with data collection beyond counterterrorism and advertising?
Big data is a tool, and like any tool, it can be used for good or bad. The Internet can be used to disseminate information on an unimaginable scale, and it can be used for child porn. So it’s really in the hands of the people who use it.
We can build models like we never have before. In crime, New York is famous for the Compstat system, where police look at what crimes occurred and when and where they happened. They allocate the police force based on that. A more efficient police force is wonderful for society.
Again, these models can be used for good or bad. Those police could be used to stop criminals, or, in an extreme police state, those officers could be used to crack down on dissenters.
In the end, companies are collecting an obscene amount of data on us, and I think that’s just as much a threat to individuals as government data collection might be.