Social Scientists Might Gain Access to Facebook's Data on User Behavior

The social network's move could quell complaints that it blocks verification of internal research results

From Nature magazine

Social scientists hungry for Facebook’s data may be about to get a taste of it. Nature has learned that the social-networking website is considering giving researchers limited access to the petabytes of data that it has amassed on the preferences and behaviour of its almost one billion users.

Outsiders will not get a free run of the data, but the move could quell criticism from social scientists who have complained that the company’s own research on its users cannot be verified. Facebook's in-house scientists have been involved in publishing more than 30 papers since 2009, covering topics from what drives the spread of information and ideas to the relationship between social-networking activity and loneliness. However, because the company fears breaching its users’ privacy, it does not release the underlying raw data.


Facebook is now exploring a plan that could allow external researchers to check its work in future by inspecting the data sets and methods used to produce a particular study. A paper currently submitted to a journal could prove to be a test case, after the journal said that allowing third-party academics the opportunity to verify the findings was a condition of publication.

“We want to participate in the scientific process and we believe that there should be a way to have other researchers validate [our studies] without infringing on the policies that we have set with our users,” says Cameron Marlow, head of Facebook's data-science team.

Restricted access
If the scheme were to go ahead, it would apply to papers after publication. Scholars would have to travel to the company’s headquarters in Menlo Park, California, because Facebook would not risk sending the data electronically, and they would have access only to aggregated data, with no personally identifiable information. The company would also allow access for only a limited period, and only to researchers who sign a non-disclosure agreement. Marlow says, however, that these conditions should not keep researchers from being openly critical about matters related to the published paper, such as technique or data processing.

External scholars would not be allowed to conduct their own studies on the data sets.

The alternative — publicly releasing anonymized raw data sets — is not likely to be an option, says Facebook. Internet company AOL, based in New York, and film rental and streaming firm Netflix, based in Los Gatos, California, have both done this in the past, only for researchers to show that individuals could be identified in the anonymized data. “It is hard to really guarantee that it is anonymous,” says Marlow.

Facebook's proposals are a step in the right direction, say researchers. “Their intentions are very good,” agrees Bernardo Huberman, director of the social-computing group at Hewlett-Packard Laboratories in Palo Alto, California. Huberman has voiced concerns in Nature about the lack of researcher access to 'big data' at private companies. Facebook “wants to get closer to something that is the scientific method”, he says.

But Huberman and others have practical concerns. The requirement for on-site visits will hinder many researchers, with few likely to receive funding to travel merely to validate a completed study. Furthermore, it is unclear whether Facebook will allow researchers to validate research by running their own programs on the data. If scientists are restricted to repeating Facebook researchers’ own analyses, says Anatoliy Gruzd, director of the social-media lab at Dalhousie University in Halifax, Canada, “they may be unknowingly repeating the same errors inherent in a technique”.

This article is reproduced with permission from the magazine Nature. The article was first published on July 26, 2012.

First published in 1869, Nature is the world's leading multidisciplinary science journal. Nature publishes the finest peer-reviewed research that drives ground-breaking discovery, and is read by thought-leaders and decision-makers around the world.
