From Like IDs

Linear Traits

  • The scores are expressed as percentiles (ranging from 0 to 1) relative to the general population. A percentile indicates the position of a given score relative to other scores. For example, an Extraversion score of .2 (the 20th percentile) indicates that 20% of people in the general population have a lower score and 80% have a higher score. A score of .5 (the 50th percentile) indicates an average score.
  • Prediction accuracy is expressed as the correlation between the AMS prediction and the actual score. An accuracy of 1 indicates perfect prediction, whereas an accuracy of 0 indicates a random guess.
  • If a group contains several traits, using the group name as a Trait ID calculates predictions for all traits in the group, e.g. BIG5, Religion, Politics.
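
The two quantities described in the notes above can be illustrated with a minimal sketch. It assumes that a percentile is the share of population scores strictly below a given score and that accuracy is the Pearson correlation. The function names and all sample numbers are hypothetical and are not part of the AMS API.

```python
from math import sqrt


def percentile_rank(score, population):
    """Fraction of population scores that fall strictly below `score`."""
    below = sum(1 for s in population if s < score)
    return below / len(population)


def pearson(xs, ys):
    """Pearson correlation between predicted and actual scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical raw questionnaire scores for ten people (made-up data).
population = [12, 15, 18, 20, 22, 25, 27, 30, 33, 36]
print(percentile_rank(18, population))  # 0.2: two of the ten scores are lower

# Hypothetical predicted vs. actual percentile scores (made-up data).
predicted = [0.2, 0.4, 0.5, 0.7, 0.9]
actual = [0.1, 0.5, 0.4, 0.8, 0.7]
print(round(pearson(predicted, actual), 2))
```

A correlation near 1 means the predictions rank and space users almost exactly as the questionnaire does; a value near 0 means the predictions carry no information about the actual scores.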
Trait ID Description Prediction Accuracy (correlation)
  • BIG5
  • BIG5_Openness
  • BIG5_Conscientiousness
  • BIG5_Extraversion
  • BIG5_Agreeableness
  • BIG5_Neuroticism

Original scores estimated using the 100-item International Personality Item Pool (IPIP) Five Factor Model questionnaire, arguably the most popular personality questionnaire in use at the moment.

The model was built (and the accuracy validated) using a sample of 260,000 participants.

Between .35 and .50

(comparable with a short BIG 5 personality questionnaire).

  • Satisfaction_Life

Original scores estimated using Diener's Satisfaction With Life Scale. The model was built (and the accuracy validated) using a sample of 55,000 people.

.17

The test-retest reliability of Diener’s SWL scale is .44

  • Intelligence

Original scores estimated using our proxy for Raven's Standard Progressive Matrices, one of the most popular general intelligence questionnaires. The model was built (and the accuracy validated) using a sample of 39,000 participants.

Note that the percentile scores can easily be translated into the IQ scale. For example, a score of .5 equals IQ 100, a score of .84 equals IQ 115, and a score of .16 equals IQ 85.

.47
  • Age

Reported as an actual age and not a percentile. We aim to predict the user's real age, but if they have mature or immature Likes on Facebook, then we may be out by a few years.

.75
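
The percentile-to-IQ translation mentioned for the Intelligence trait can be sketched as follows. This assumes IQ is normally distributed with a mean of 100 and a standard deviation of 15, which is consistent with the examples given for that trait but is not stated explicitly in this document. The function name `percentile_to_iq` is hypothetical.

```python
from statistics import NormalDist


def percentile_to_iq(percentile, mean=100.0, sd=15.0):
    """Map a percentile in (0, 1) to the IQ scale via the inverse normal CDF.

    Assumes a normal IQ distribution with the given mean and standard deviation.
    """
    return NormalDist(mu=mean, sigma=sd).inv_cdf(percentile)


for p in (0.16, 0.5, 0.84):
    print(p, round(percentile_to_iq(p)))
# 0.16 85
# 0.5 100
# 0.84 115
```

Percentiles of exactly 0 or 1 have no finite IQ equivalent under this assumption, and `inv_cdf` raises an error for them.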

Categories

  • The scores are expressed as probabilities (ranging from 0 to 1). For example, a score of .2 for the trait Female indicates that the given user is female with a probability of 20% and male with a probability of 80%.
  • Prediction accuracy is expressed as the Area Under the Receiver Operating Characteristic Curve (AUC), which is equivalent to the probability of correctly classifying two randomly selected users, one from each class (such as males and females). An AUC of 1 indicates perfect accuracy, whereas an AUC of .5 indicates a random guess.
  • If a group contains several traits, using the group name as a Trait ID calculates predictions for all traits in the group, e.g. BIG5, Religion, Politics.
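
The pairwise reading of AUC given in the notes above can be computed directly by comparing every user of one class with every user of the other. The sketch below does exactly that; counting ties as half a correct ordering is a common convention assumed here, and the scores are made-up illustrations rather than AMS output.

```python
def auc(positive_scores, negative_scores):
    """Probability that a random positive outscores a random negative.

    Ties are counted as half a correct ordering (an assumed convention).
    """
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical Female scores, grouped by self-reported gender (made-up data).
female_scores = [0.9, 0.8, 0.6, 0.4]
male_scores = [0.7, 0.3, 0.2, 0.1]
print(auc(female_scores, male_scores))  # 0.875: 14 of 16 pairs ordered correctly
```

This quadratic-time version is intended only to make the definition concrete; rank-based formulas give the same value more efficiently on large samples.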
Trait ID Description Prediction Accuracy (AUC)
  • Female

Probability of being female. Importantly, it is the complement of the probability of being male: the probability of being male equals (1-Female). The model was built (and the accuracy validated) using a sample of 243,000 people.

For example, a score of .2 indicates that the given user has a 20% probability of being female and an 80% probability of being male.

.93

This means that in 93 out of 100 cases the prediction matches the individual’s self-reported gender

  • Gay

Probability of being gay (only for male users). Note that the probability of being straight equals (1-Gay). Expressed as a percentage, the score gives the number of people out of 100 with Likes similar to the user’s who are gay. The model was built (and the accuracy validated) using a sample of 98,000 people.

.88

This means that given two males, one homosexual and one heterosexual, the algorithm accurately distinguishes between them 88% of the time

  • Lesbian

Probability of being lesbian (only for female users). Expressed as a percentage, the score gives the number of people out of 100 with Likes similar to the user’s who are lesbian. The model was built (and the accuracy validated) using a sample of 78,000 people.

.75

This means that given two females, one homosexual and one heterosexual, the algorithm accurately distinguishes between them 75% of the time

  • Concentration
  • Concentration_Art
  • Concentration_Biology
  • Concentration_Business
  • Concentration_IT
  • Concentration_Education
  • Concentration_Engineering
  • Concentration_Journalism
  • Concentration_Finance
  • Concentration_History
  • Concentration_Law
  • Concentration_Nursing
  • Concentration_Psychology

Probability of having an interest in a given area. For a given user the predictions sum up to 1.

Note that the 12 concentrations are among the most popular and have differing prevalence in the general population. The model was built (and the accuracy validated) using a sample of 33,000 people and the concentrations reported on their Facebook profiles. Note that a higher accuracy score can be achieved when considering the 2 or more most probable concentrations predicted for a given user.

.72
  • Politics
  • Politics_Conservative
  • Politics_Liberal
  • Politics_Uninvolved
  • Politics_Libertanian

Probability of exhibiting a preference for a given political view. For a given user the predictions sum up to 1. The model was built (and the accuracy validated) using a sample of 73,000 people and the political views reported on their Facebook profiles.

.79
  • Religion
  • Religion_None
  • Religion_Christian_Other
  • Religion_Catholic
  • Religion_Jewish
  • Religion_Lutheran
  • Religion_Mormon

Probability of having a given religious view. For a given user the predictions sum up to 1. The range of religious views for which predictions are made is based on the sample used to train the algorithm.

The category "None" includes atheists, agnostics, Pastafarians and Jedi. "Christian Other" includes several Christian denominations. Note that a higher accuracy score can be achieved when considering the 2 or more most probable religious views predicted for a given user.

.76
  • Relationship
  • Relationship_None
  • Relationship_Yes
  • Relationship_Married

Probability of being in a relationship. For a given user the predictions sum up to 1.

.67
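
For groups whose predictions sum to 1 (Concentration, Politics, Religion, Relationship), the notes above mention that considering the 2 or more most probable categories can raise accuracy. A minimal sketch of that idea follows. The helper names `top_k` and `hit_at_k` are hypothetical, and the probabilities are made up for illustration.

```python
def top_k(predictions, k=2):
    """Return the k trait IDs with the highest predicted probability."""
    ranked = sorted(predictions.items(), key=lambda item: item[1], reverse=True)
    return [trait_id for trait_id, _ in ranked[:k]]


def hit_at_k(predictions, true_trait_id, k=2):
    """True if the self-reported category is among the k most probable ones."""
    return true_trait_id in top_k(predictions, k)


# Hypothetical predictions for one user (made-up values that sum to 1).
religion = {
    "Religion_None": 0.40,
    "Religion_Christian_Other": 0.25,
    "Religion_Catholic": 0.20,
    "Religion_Jewish": 0.05,
    "Religion_Lutheran": 0.05,
    "Religion_Mormon": 0.05,
}
assert abs(sum(religion.values()) - 1.0) < 1e-9

print(top_k(religion))  # ['Religion_None', 'Religion_Christian_Other']
print(hit_at_k(religion, "Religion_Catholic", k=2))  # False
print(hit_at_k(religion, "Religion_Catholic", k=3))  # True
```

Widening k trades precision for coverage: the true category is more likely to be included, but each user is assigned more candidate labels.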

From text

Linear Traits

  • The scores are expressed as percentiles (ranging from 0 to 1) relative to the general population. A percentile indicates the position of a given score relative to other scores. For example, an Extraversion score of .2 (the 20th percentile) indicates that 20% of people in the general population have a lower score and 80% have a higher score. A score of .5 (the 50th percentile) indicates an average score.
  • If a group contains several traits, using the group name as a Trait ID calculates predictions for all traits in the group, e.g. BIG5, Religion, Politics.
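
The group-name shorthand described above can be pictured as a simple expansion step. The sketch below is purely illustrative: the mapping is copied from the trait lists in this document, while the helper `expand_trait_ids` is hypothetical and does not describe how the AMS service is implemented.

```python
# Trait groups as listed in this document; the mapping and helper are illustrative only.
TRAIT_GROUPS = {
    "BIG5": [
        "BIG5_Openness",
        "BIG5_Conscientiousness",
        "BIG5_Extraversion",
        "BIG5_Agreeableness",
        "BIG5_Neuroticism",
    ],
    "Politics": [
        "Politics_Conservative",
        "Politics_Liberal",
        "Politics_Uninvolved",
        "Politics_Libertanian",  # spelled as in the Trait ID list
    ],
    "Relationship": [
        "Relationship_None",
        "Relationship_Yes",
        "Relationship_Married",
    ],
}


def expand_trait_ids(requested):
    """Replace each group name with its member traits, keeping order and dropping duplicates."""
    expanded = []
    for trait_id in requested:
        # IDs that are not group names pass through unchanged.
        for member in TRAIT_GROUPS.get(trait_id, [trait_id]):
            if member not in expanded:
                expanded.append(member)
    return expanded


print(expand_trait_ids(["BIG5", "Age"]))
# ['BIG5_Openness', 'BIG5_Conscientiousness', 'BIG5_Extraversion',
#  'BIG5_Agreeableness', 'BIG5_Neuroticism', 'Age']
```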
Trait ID Description Prediction Accuracy (correlation)
  • BIG5
  • BIG5_Openness
  • BIG5_Conscientiousness
  • BIG5_Extraversion
  • BIG5_Agreeableness
  • BIG5_Neuroticism

Original scores estimated using the 100-item International Personality Item Pool (IPIP) Five Factor Model questionnaire or the 336-item IPIP proxies for Costa and McCrae’s NEO-PI-R domains.

The model was built (and the accuracy validated) using a sample of 14 million status updates from 69,000 Facebook users.

Openness = .41

Conscientiousness = .4

Extraversion = .3

Agreeableness = .23

Neuroticism = .31

  • Age

Reported as an actual age and not a percentile.

.76
  • Female

Probability of being female. Importantly, it is the complement of the probability of being male: the probability of being male equals (1-Female).

For example, a score of .2 indicates that the given user has a 20% probability of being female and an 80% probability of being male.

.78

This means that in 78 out of 100 cases the prediction matches the individual’s self-reported gender

References

1. Goldberg, L. R. (1999). A broad-bandwidth, public domain, personality inventory measuring the lower-level facets of several five-factor models. In I. Mervielde, I. Deary, F. De Fruyt, & F. Ostendorf (Eds.), Personality Psychology in Europe, Vol. 7 (pp. 7-28). Tilburg, The Netherlands: Tilburg University Press.

2. Diener, E., et al. (1985). The satisfaction with life scale. Journal of Personality Assessment, 49(1), 71-75.