In a study published last week in PLOS ONE, scientists at the University of Pennsylvania examined the language used in 75,000 Facebook profiles. By analyzing how people talk on social media and what they talk about, the researchers were able to predict users' age, gender, and certain personality traits.
For example, the researchers found that they could predict a user’s gender with 92 percent accuracy. They could also guess a user’s age within three years more than half of the time.
One of the study’s novel discoveries was that introverts were more likely to talk about Japanese media like anime and manga.
We analyzed 700 million words, phrases, and topic instances collected from the Facebook messages of 75,000 volunteers, who also took standard personality tests, and found striking variations in language with personality, gender, and age. In our open-vocabulary technique, the data itself drives a comprehensive exploration of language that distinguishes people, finding connections that are not captured with traditional closed-vocabulary word-category analyses. Our analyses shed new light on psychosocial processes yielding results that are face valid (e.g., subjects living in high elevations talk about the mountains), tie in with other research (e.g., neurotic people disproportionately use the phrase ‘sick of’ and the word ‘depressed’), suggest new hypotheses (e.g., an active life implies emotional stability), and give detailed insights (males use the possessive ‘my’ when mentioning their ‘wife’ or ‘girlfriend’ more often than females use ‘my’ with ‘husband’ or ‘boyfriend’). To date, this represents the largest study, by an order of magnitude, of language and personality.
- Our open-vocabulary analysis yields further insights into the behavioral residue of personality types beyond those from a priori word-category based approaches, giving unanticipated results (correlations between language and personality, gender, or age). For example, we make the novel discoveries that mentions of an assortment of social sports and life activities (such as basketball, snowboarding, church, meetings) correlate with emotional stability, and that introverts show an interest in Japanese media (such as anime, pokemon, manga and Japanese emoticons: ˆ_ˆ). Our inclusion of phrases in addition to words provided further insights (e.g. that males prefer to precede ‘girlfriend’ or ‘wife’ with the possessive ‘my’ significantly more than females do for ‘boyfriend’ or ‘husband’). Such correlations provide quantitative evidence for strong links between behavior, as revealed in language use, and psychosocial variables. In turn, these results suggest undertaking studies, such as directly measuring participation in activities in order to verify the link with emotional stability.
Words, phrases, and topics most distinguishing extraversion from introversion and neuroticism from emotional stability.
- While many of our results confirm previous research, demonstrating the instrument's face validity, our word clouds also suggest new hypotheses. For example, Figure 6 (bottom-right) shows language related to emotional stability (low neuroticism). Emotionally stable individuals wrote about enjoyable social activities that may foster greater emotional stability, such as ‘sports’, ‘vacation’, ‘beach’, ‘church’, ‘team’, and a family time topic. Additionally, results suggest that introverts are interested in Japanese media (e.g. ‘anime’, ‘manga’, ‘japanese’, Japanese style emoticons: ˆ_ˆ, and an anime topic) and that those low in openness drive the use of shorthands in social media (e.g. ‘2day’, ‘ur’, ‘every 1’). Although these are only language correlations, they show how open-vocabulary analyses can illuminate areas to explore further.
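For readers curious what an "open-vocabulary" language correlation might look like in practice, here is a minimal sketch in Python. It is not the researchers' pipeline: the function names, the toy messages, and the trait scores are all hypothetical, and a plain Pearson correlation stands in for whatever statistics the study actually used. The idea is only to show the general shape of the approach: no word list is fixed in advance, every word that appears in the data is scored against a personality measure, and the ranking falls out of the data.

```python
from collections import Counter
from math import sqrt


def relative_frequencies(text):
    """Map each word in `text` to its share of that user's total word count."""
    words = text.lower().split()
    total = len(words)
    return {word: count / total for word, count in Counter(words).items()}


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_words_by_correlation(users):
    """Rank every observed word by how its usage correlates with a trait score.

    `users` is a list of (text, trait_score) pairs. The vocabulary is not
    chosen beforehand; it is simply every word seen in the data.
    """
    freqs = [relative_frequencies(text) for text, _ in users]
    scores = [score for _, score in users]
    vocabulary = set().union(*freqs)
    correlations = {
        word: pearson([f.get(word, 0.0) for f in freqs], scores)
        for word in vocabulary
    }
    return sorted(correlations.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: (status updates, extraversion score). Invented for illustration.
toy_users = [
    ("party tonight with the team", 4.5),
    ("great party great team", 4.0),
    ("watching anime and reading manga", 1.5),
    ("anime marathon tonight", 2.0),
]

ranking = rank_words_by_correlation(toy_users)
for word, r in ranking:
    print(f"{word:10s} {r:+.2f}")
```

With only four invented users the numbers mean nothing, but words such as "party" land at the extraverted end of the ranking and "anime" at the introverted end, which is the kind of pattern the study's word clouds visualize at a vastly larger scale.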
-------
Scott Green is editor and reporter for anime and manga at geek entertainment site Ain't It Cool News. Follow him on Twitter at @aicnanime.