Sep Kamvar is a consulting professor of computational mathematics at Stanford University and computer scientist who specializes in data mining. He is also the co-founder, with artist Jonathan Harris, of the popular website “We Feel Fine,” which combs blogs for expressions of emotion, and then displays the results in swarms of vivid color. The site provides a way to explore the emotional contours of our shared virtual world, and has also attracted the interest of psychologists and other scientists. Now the duo have compiled some of the site’s most interesting -- and visually arresting -- findings into a book, “We Feel Fine: An Almanac of Human Emotion.” Kamvar spoke with Mind Matters Editor Gareth Cook about the project and the potential it holds for psychology.
COOK: Please describe what We Feel Fine, the website, looks like.
KAMVAR: The We Feel Fine website is an interactive visualization of a continuously updated database of more than 13 million expressions of emotion on blogs. The website itself has six different movements, each of which is built around a series of colored dots. Each colored dot is the visual representation of a feeling sentence. The dots are colored based on the feeling, and when a user clicks on a dot, it explodes and shows the sentence.
COOK: Can you please give an example?
KAMVAR: The opening movement is a playful visualization that we call Madness. In Madness, all the colored dots swarm frantically around the screen, and when you click on any one of the dots, it explodes to reveal the feeling sentence behind it. Another movement is called Mobs, where the dots coalesce to make a bar graph to show some basic statistics about the data set, for example, what are the top feelings?
COOK: How does it work?
KAMVAR: The core of We Feel Fine is a crawler that scours the blogosphere every few minutes and scans for the words “I feel” or “I am feeling.” This data comes from a variety of sources, including LiveJournal, MySpace, Blogger, Flickr, Twitter, and Google.
Once the words “I feel” or “I am feeling” are found, the system looks back to the beginning of the sentence, and forward to the end of the sentence, and stores the full sentence in a database.
Since blogs are structured in largely standard ways, the crawler can use the “profile” section of the blog to get demographic information (age, gender, location) about the blogger who wrote the sentence, and this is stored in the database alongside the feeling.
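The extraction step described above can be sketched in a few lines of Python. This is only an illustration, not the site's actual code: the sentence-boundary rule (a sentence ends at ".", "!" or "?"), the `FeelingRecord` type, and the `profile` dictionary are all assumptions made for the example.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Optional

# The two trigger phrases named in the interview.
TRIGGER = re.compile(r"\bI (?:feel|am feeling)\b", re.IGNORECASE)
# Assumption: a sentence ends at '.', '!' or '?'.
SENTENCE_END = re.compile(r"[.!?]")


@dataclass
class FeelingRecord:
    """Hypothetical row: one feeling sentence plus blogger demographics."""
    sentence: str
    age: Optional[int] = None
    gender: Optional[str] = None
    location: Optional[str] = None


def extract_feeling_sentences(text: str) -> List[str]:
    """Return each full sentence that contains a trigger phrase."""
    sentences = []
    last_end = -1
    for match in TRIGGER.finditer(text):
        if match.start() < last_end:
            continue  # this sentence was already captured
        # Look back to the beginning of the sentence.
        start = 0
        for boundary in SENTENCE_END.finditer(text, 0, match.start()):
            start = boundary.end()
        # Look forward to the end of the sentence.
        boundary = SENTENCE_END.search(text, match.end())
        end = boundary.end() if boundary else len(text)
        sentences.append(text[start:end].strip())
        last_end = end
    return sentences


def build_records(text: str, profile: Dict[str, object]) -> List[FeelingRecord]:
    """Attach (assumed) profile fields to every extracted sentence."""
    return [
        FeelingRecord(
            sentence=s,
            age=profile.get("age"),
            gender=profile.get("gender"),
            location=profile.get("location"),
        )
        for s in extract_feeling_sentences(text)
    ]
```

For instance, `extract_feeling_sentences("Today was long. I feel tired but happy! See you soon.")` keeps only the middle sentence. A real crawler would need a more careful sentence splitter, since abbreviations and ellipses break this simple rule.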
COOK: When did it first become clear to you that this data might have some scientific value?
KAMVAR: Once we had a million feelings in the database, we realized that this was probably the biggest database of human emotion in existence. It allowed us to do "surveys" of several hundred thousand people in a matter of seconds. We thought that this would be a great tool for hypothesis generation, giving us hypotheses around questions like: Do people feel sadder in the winter? How do men feel differently from women?
COOK: Can you please describe one of the scientific projects that has used this data?
KAMVAR: We did some interesting joint work with Cassie Mogilner at UPenn and Jennifer Aaker at Stanford University. Cassie and Jennifer’s research focuses on happiness, and in this study, we were interested in how people define happiness. We used We Feel Fine and other, more traditional methods, to show that the meaning of happiness changes in very specific ways as people get older.
COOK: What did they discover?
KAMVAR: In the We Feel Fine database, there was a distinct change in feelings that co-occur in the same sentence with the feeling “happy.” For younger people, the feeling that co-occurred most often with happiness was excitement, whereas for older people, the feeling that co-occurred most often with happiness was peacefulness.
Cassie then led a number of experiments where we used different methods to study this phenomenon in different ways. For example, we influenced people to feel excited or peaceful by playing them either a slow acoustic or a fast electric version of the same song. We then gave them a survey where one of the questions asked them to rate their happiness.
The older people who were influenced to feel peaceful were happier than the older people who were influenced to feel excited, while the younger people who were influenced to feel excited were happier than the younger people who were influenced to feel peaceful.
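The co-occurrence comparison Kamvar describes for the database (which feeling most often shares a sentence with "happy" in each age group) might be computed roughly as follows. This is a minimal sketch under stated assumptions: the `FEELINGS` vocabulary, the age-group labels, and the function name are hypothetical, and the study's actual procedure is not given in the interview.

```python
import re
from collections import Counter
from typing import Dict, Iterable, Tuple

# Hypothetical, deliberately tiny feeling vocabulary for illustration.
FEELINGS = {"happy", "excited", "peaceful", "sad", "anxious", "grateful"}


def top_cooccurring(
    records: Iterable[Tuple[str, str]], target: str = "happy"
) -> Dict[str, str]:
    """For each age group, return the feeling that most often appears
    in the same sentence as `target`.

    `records` yields (age_group, sentence) pairs.
    """
    counts: Dict[str, Counter] = {}
    for age_group, sentence in records:
        words = set(re.findall(r"[a-z]+", sentence.lower()))
        found = words & FEELINGS
        if target in found:
            # Count every other feeling word in this sentence once.
            counts.setdefault(age_group, Counter()).update(found - {target})
    return {
        group: counter.most_common(1)[0][0]
        for group, counter in counts.items()
        if counter
    }
```

Fed with (age group, sentence) pairs drawn from the stored records, this returns one top co-occurring feeling per group. It is a purely correlational count, which is why the lab experiments described above were needed to test the pattern.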
COOK: Who else has approached you about using the data for scientific research?
KAMVAR: Peter Dodds and Chris Danforth are two professors of applied math at the University of Vermont. They got in touch with us a couple years back to use the data set, and recently published a paper in the Journal of Happiness Studies called "Measuring the Happiness of Large-Scale Written Expression."
COOK: What did they find?
KAMVAR: Christopher and Peter took standardized valence scores for words from a database called Affective Norms for English Words (ANEW). They used these to assign a happiness score to each sentence in the We Feel Fine database. They then used these scores to identify trends in the data, for example, that the day of Michael Jackson's death was one of the saddest days in the blogosphere in the past few years.
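One simple way to turn word-level valence scores into sentence and daily happiness scores is to average them, as sketched below. The details are assumptions: `TOY_VALENCE` holds placeholder numbers on an assumed 1 to 9 scale (not real ANEW values), the day labels are made up, and plain averaging is a simplification that may differ from the published method.

```python
import re
from typing import Dict, Iterable, Optional, Tuple

# Placeholder scores on an assumed 1-9 scale. These are NOT real ANEW values.
TOY_VALENCE = {"happy": 8.0, "love": 8.5, "sad": 2.0, "lonely": 2.5, "tired": 3.5}


def sentence_valence(
    sentence: str, lexicon: Dict[str, float] = TOY_VALENCE
) -> Optional[float]:
    """Mean valence of the lexicon words in a sentence, or None if none occur."""
    words = re.findall(r"[a-z']+", sentence.lower())
    scores = [lexicon[w] for w in words if w in lexicon]
    if not scores:
        return None
    return sum(scores) / len(scores)


def daily_average(records: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Average sentence valence per day; `records` yields (day, sentence)."""
    by_day: Dict[str, list] = {}
    for day, sentence in records:
        score = sentence_valence(sentence)
        if score is not None:  # skip sentences with no scored words
            by_day.setdefault(day, []).append(score)
    return {day: sum(v) / len(v) for day, v in by_day.items()}


def saddest_day(records: Iterable[Tuple[str, str]]) -> Optional[str]:
    """Day with the lowest average valence, or None if nothing was scored."""
    averages = daily_average(records)
    if not averages:
        return None
    return min(averages, key=averages.get)
```

With the toy lexicon, `sentence_valence("I feel happy and sad")` averages 8.0 and 2.0. Ranking days by their average then surfaces unusually low days of the kind Kamvar mentions.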
COOK: Please tell me a little bit about the book. Do you think there is anything in the book that might be interesting for researchers to follow up?
KAMVAR: The book tells two types of stories that we found in the data set, which we call micro stories and macro stories. The micro stories are individual stories about people and their emotions. The macro stories are large-scale statistics (for example, what is the gender breakdown of happiness, or the most common feelings that co-occur with excitement).
The macro stories are all very interesting seeds for further research. For example, we find that there is a strong link between gratitude and happiness, and on the other hand, there is a strong link between excitement and anxiety. Or we find that people get happier as they get older, and there is a big spike in happiness after one's teenage years.
We chronicle the differences in people's emotions in different weather conditions, around different topics (like relationships or family), in different age groups, in different geographic locations, etc. All of this is fodder for further research.
COOK: This seems like a new way of doing psychology research. I wonder what you see as the potential strengths and weaknesses of the approach?
KAMVAR: Scale and cost are probably the two biggest strengths of the methodology. With a dataset like this, you can do an experiment with 2 million people in less than a minute. The drawbacks are that there does exist a population bias, and there are only certain types of experiments you can run. For example, we can't do experiments where we change environmental conditions, measure the emotional response, and compare it to a control condition, as one could in the lab. The types of experiments one can do with We Feel Fine are mostly large-scale correlational experiments.
I see methods in computational psychology becoming a useful complement to traditional psychology methods. They are good at rapid hypothesis generation, and those hypotheses can be tested with more time-tested methodologies in experimental psychology.
COOK: Are you surprised to find yourself working as a part-time mass psychologist?
KAMVAR: I am a computer scientist by training, and by day I teach and do research at Stanford. I got into this because I thought that one of the most interesting shifts on the web was not the technology shift, but the cultural shift that ensued, where people now feel comfortable sharing their whole life online. This cultural shift opened up many opportunities, for scientists, for artists, and for technologists.
In retrospect it is surprising that my work has taken the direction it has -- but it's been a very interesting ride!