Using a dataset of more than 58,000 U.S. Facebook users, University of Cambridge researchers predicted race, age, IQ, sexuality, personality, substance use and political views using Likes alone.
If you “Like” lots of people, places and things on Facebook, you may get rewarded with discounts and special offers. But new research out today shows that these public Likes reveal more about you than you may think.
Using a dataset of more than 58,000 Facebook users in the USA collected between 2007 and 2012, researchers at the University of Cambridge in the United Kingdom were able to accurately predict certain qualities and traits, such as race, age, IQ, sexuality, personality, substance use and political views using Facebook Likes alone.
The Likes include photos, friends’ status updates, Facebook pages of products, sports, musicians, books, restaurants or popular websites.
“Likes represent a very generic class of digital records, similar to Web search queries, Web browsing histories, and credit card purchases,” says the study in Proceedings of the National Academy of Sciences.
The participants gave researchers access to their Facebook pages and completed a variety of online tests, including personality and IQ tests. Their Likes were fed into algorithms, and researchers built statistical models that predicted the personal details from the Likes alone. Results were corroborated with information from the Facebook profiles and personality tests.
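The article does not say which algorithms the researchers used. As a rough illustration of the general idea only, here is a minimal sketch, assuming a plain logistic regression over a table of ones and zeros that records which user liked which page. The page names, the trait labels and every function name below are hypothetical and do not come from the study.

```python
import math

# Hypothetical toy data: each user is a set of liked page names plus a
# binary trait label (1 or 0). None of this comes from the study.
USERS = [
    ({"page_a", "page_b"}, 1),
    ({"page_a", "page_c"}, 1),
    ({"page_a", "page_b", "page_e"}, 1),
    ({"page_d", "page_e"}, 0),
    ({"page_d", "page_f"}, 0),
    ({"page_c", "page_e", "page_f"}, 0),
]


def build_vocabulary(users):
    """Map every distinct Like to a column index."""
    likes = sorted({like for liked, _ in users for like in liked})
    return {like: i for i, like in enumerate(likes)}


def to_row(liked, vocab):
    """Encode one user's Likes as a 0/1 vector; unknown Likes are ignored."""
    row = [0.0] * len(vocab)
    for like in liked:
        if like in vocab:
            row[vocab[like]] = 1.0
    return row


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(users, vocab, epochs=500, lr=0.5):
    """Fit logistic regression weights with plain batch gradient descent."""
    weights = [0.0] * len(vocab)
    bias = 0.0
    rows = [(to_row(liked, vocab), label) for liked, label in users]
    for _ in range(epochs):
        grad_w = [0.0] * len(vocab)
        grad_b = 0.0
        for row, label in rows:
            score = bias + sum(w * x for w, x in zip(weights, row))
            error = sigmoid(score) - label
            grad_b += error
            for i, x in enumerate(row):
                grad_w[i] += error * x
        bias -= lr * grad_b / len(rows)
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
    return weights, bias


def predict_probability(liked, vocab, weights, bias):
    """Estimated probability that a user with these Likes has the trait."""
    row = to_row(liked, vocab)
    return sigmoid(bias + sum(w * x for w, x in zip(weights, row)))


vocab = build_vocabulary(USERS)
weights, bias = train(USERS, vocab)
```

Calling `predict_probability({"page_a", "page_b"}, vocab, weights, bias)` returns the model's estimate for a user with those two Likes. A real analysis would need far more users, a held-out test set and some way to compress a very wide table of Likes, none of which this sketch attempts.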
“Each person, on average, liked 170 things,” says psychologist Michal Kosinski, the study’s lead author. “Some liked only one thing and there were people who liked thousands of things. We removed those. We looked at people who liked between one and 700 different things.”
Sam Gosling, a psychologist at the University of Texas at Austin, calls it a “landmark study” because it illustrates “how things are no longer ephemeral.” He has been studying Facebook behavior since 2006 and has seen the new study.
“You ‘Like’ something. You leave a comment on somebody’s wall. They are now recorded in a way that machines can calibrate and measure them with great accuracy,” he says. “Together, they add up to substantially more information from which you can make quite reasonably accurate predictions.”
Fred Wolens, a Facebook spokesman at its headquarters in Menlo Park, Calif., says the predictions are “hardly surprising.”
“No matter the vehicle for information — a bumper sticker, yard sign, logos on clothing, or other data found online — it has already been proven that it is possible for social scientists to draw conclusions about personal attributes based on these characteristics,” he says.
Rebecca Lieb, a digital media analyst at the Altimeter Group, a consulting firm in New York City, agrees.
“Advertising and marketing focus on this, but it’s important not to isolate this as only an online issue or a social network issue,” she says. “Data is being collected at every stage of our lives. If you’re using a credit card, you’re opening yourself up to as much data collection as if you’re using Facebook or searching online and getting cookies collected in your browser.”
The study found the highest accuracy for ethnic origin and gender, with African Americans and Caucasians correctly classified in 95% of cases. Males and females were correctly classified in 93% of cases; Christians and Muslims in 82% of cases. Sexual orientation was easier to distinguish among males (88%) than females (75%).
The study notes that the “best predictors of high intelligence include ‘Thunderstorms,’ ‘The Colbert Report,’ ‘Science’ and ‘Curly Fries.’ Low intelligence was indicated by liking (Facebook pages for) ‘Sephora,’ ‘I Love Being A Mom,’ ‘Harley Davidson’ and ‘Lady Antebellum.’ ” Researchers gave no further explanation of these findings.
The study also suggests that the findings may have “negative implications for personal privacy.”
David Jacobs, consumer privacy counsel for the Electronic Privacy Information Center, a public interest research center in Washington that focuses on civil liberties and privacy, says this study aligns with others involving predictions based on social networking information.
“This is not unique to Facebook and is not even unique to social networking in general,” Jacobs says. “It’s one of the implications of Big Data and in this case Big Data in a social networking context. Lots of information makes for certain inferences and sensitive predictions.”
“It’s the current state of the digital world,” adds Kosinski.