Method

At OkCupid, we learn about our members’ personalities by asking multiple-choice questions. These questions show us who you are, including your beliefs and interests, and what sort of partner you’re looking for. I limited this analysis to questions that were answered more than 1 million times between 2014 and 2016. This left 626 questions covering a wide range of topics (e.g., sex/relationships, politics, religion), with answers from over 19 million distinct members in the United States.

I then counted how often members in each of 945 Census-defined Metropolitan Statistical Areas (MSAs) gave a particular answer to each question. These counts let me build a profile of the typical member in each MSA: the likelihood that they would give each possible answer to each of the questions.
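To make the counting step concrete, here is a minimal sketch of turning raw answers into a per-MSA profile. The record layout, the MSA names, and the `msa_profiles` helper are all made up for illustration; they are not OkCupid's actual data or pipeline.

```python
from collections import Counter, defaultdict

# Hypothetical toy records: (msa, question_id, answer). Not real OkCupid data.
responses = [
    ("Austin", "q1", "Yes"),
    ("Austin", "q1", "Yes"),
    ("Austin", "q1", "No"),
    ("Boise", "q1", "No"),
    ("Boise", "q1", "No"),
    ("Boise", "q1", "Yes"),
    ("Boise", "q1", "No"),
]


def msa_profiles(records):
    """Return {msa: {(question, answer): share of that MSA's answers to the question}}."""
    answer_counts = defaultdict(Counter)    # msa -> counts per (question, answer)
    question_totals = defaultdict(Counter)  # msa -> counts per question
    for msa, question, answer in records:
        answer_counts[msa][(question, answer)] += 1
        question_totals[msa][question] += 1
    return {
        msa: {
            (question, answer): count / question_totals[msa][question]
            for (question, answer), count in counts.items()
        }
        for msa, counts in answer_counts.items()
    }


profiles = msa_profiles(responses)
```

Each entry in `profiles` is the share of an MSA's answers to a question that went to one particular option, which is one simple way to read "the likelihood that the typical member gives that answer."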

Instead of making thousands of maps to show how frequently different MSAs gave every answer to each of the 626 questions, I used a statistical method called factor analysis to find patterns in how questions were answered. Factor analysis takes a dataset with a huge number of variables and automatically looks for patterns. These patterns then help you (in this case, me) summarize what’s happening across this huge set of variables in a more comprehensible, less tedium-filled way.

In this case, our big dataset is the list of how a “typical member” of each MSA responded to each question. The factor analysis finds patterns based on which responses tend to co-occur in different cities. We can interpret these patterns as personality scales, or traits. For example, when the factor analysis recognized that in cities where members tended to answer “Yes” to “Could you date someone who does drugs?” members also tended to answer “Yes” to “Are you okay with people who grow marijuana for their own personal use?” it grouped both of these answers together on a personality trait we’ve creatively labeled Drugs. So thanks to the factor analysis, we only need to make one map per personality scale (e.g., Drugs), showing how strongly each region expressed it, rather than a map for each possible response to every question.
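For a feel of how "answers that co-occur across cities" become a single pattern, here is a rough sketch. It is not the factor analysis used for this post. It swaps in a simpler relative, extracting the first principal component of the correlation matrix by power iteration, and runs it on invented numbers. The column labels, the `rows` values, and all function names are assumptions for illustration.

```python
import math

# Hypothetical answer shares for five made-up MSAs (rows) and three answers (columns).
columns = ["date_drug_user=Yes", "grow_marijuana_ok=Yes", "hypothetical_q3=Yes"]
rows = [
    [0.70, 0.80, 0.20],
    [0.60, 0.75, 0.30],
    [0.40, 0.50, 0.50],
    [0.30, 0.45, 0.60],
    [0.20, 0.30, 0.70],
]


def standardize(matrix):
    """Rescale every column to mean 0 and standard deviation 1."""
    n = len(matrix)
    cols = list(zip(*matrix))
    means = [sum(col) / n for col in cols]
    sds = [math.sqrt(sum((x - m) ** 2 for x in col) / n) for col, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in matrix]


def correlation_matrix(matrix):
    """Correlation between every pair of answer columns, measured across MSAs."""
    z = standardize(matrix)
    n, k = len(z), len(z[0])
    return [[sum(z[r][i] * z[r][j] for r in range(n)) / n for j in range(k)]
            for i in range(k)]


def first_pattern(corr, iterations=200):
    """Dominant eigenvector of the correlation matrix, found by power iteration."""
    k = len(corr)
    v = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # The sign of an eigenvector is arbitrary; fix it so the first weight is positive.
    return v if v[0] >= 0 else [-x for x in v]


weights = first_pattern(correlation_matrix(rows))
for name, weight in zip(columns, weights):
    print(f"{name}: {weight:+.2f}")
```

In this toy data the two drug-related answers rise and fall together across MSAs, so they receive weights of the same sign, while the third answer moves the opposite way and lands at the other end of the scale.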

After running the analysis, we get a set of personality traits that describe how the residents of US cities differ from one another. The factor analysis doesn’t automatically tell us what each personality trait really “means”; it just gives us a list of point values associated with answers to different questions. To work out what each trait might signify, we looked at the answers with the largest positive and negative point values and figured out what word or concept best described the pattern we saw. This part’s a bit subjective, but the answers that cluster together usually make the underlying trait fairly clear.
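The labeling step can be pictured as sorting answers by their point values and reading off both ends of the list. The numbers below are invented, the "hypothetical question" entries are placeholders, and `strongest_answers` is a helper name made up for this sketch.

```python
# Hypothetical point values for one trait; every number here is invented.
point_values = {
    "Could you date someone who does drugs? = Yes": 0.8,
    "Are you okay with people who grow marijuana for their own personal use? = Yes": 0.7,
    "Hypothetical question A = Yes": 0.1,
    "Hypothetical question B = Yes": -0.2,
    "Hypothetical question C = Yes": -0.9,
}


def strongest_answers(values, n=2):
    """Return the n most positive and the n most negative answers, strongest first."""
    ranked = sorted(values.items(), key=lambda item: item[1])
    most_negative = ranked[:n]
    most_positive = ranked[-n:][::-1]
    return most_positive, most_negative


most_positive, most_negative = strongest_answers(point_values)
```

Reading `most_positive` and `most_negative` side by side is where the human judgment comes in: the code only surfaces the extremes, and naming the trait is still up to whoever is looking at them.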

The factor analysis found that US cities differ along 13 primary personality scales. These scales are ordered so that the first one explains the biggest differences between cities, the second explains a little less, and so on. By looking at where the typical member in each MSA fell on the spectrum of each trait, we were able to determine the 8 types of people you’ll find across the US. For example, residents of the Deep South generally share a personality type that’s politically conservative, religious, and not very outdoorsy (go figure). But again, some findings are a bit more surprising.
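One simple way to place an MSA on a trait's spectrum is a weighted sum: multiply each answer's point value by how far the MSA's share of that answer sits from a baseline, then add everything up. This is an assumed scoring rule for illustration, not necessarily the one used here, and the answer labels, point values, baseline, and MSA names are all invented.

```python
# Hypothetical point values for one trait and a hypothetical national baseline.
point_values = {"answer_1": 0.8, "answer_2": -0.6}
baseline = {"answer_1": 0.5, "answer_2": 0.5}

# Hypothetical share of members in each made-up MSA giving each answer.
msa_shares = {
    "MSA A": {"answer_1": 0.7, "answer_2": 0.3},
    "MSA B": {"answer_1": 0.4, "answer_2": 0.6},
}


def trait_score(shares, values, reference):
    """Weighted sum of how far an MSA's answer shares sit from the reference shares."""
    return sum(weight * (shares[answer] - reference[answer])
               for answer, weight in values.items())


scores = {msa: trait_score(shares, point_values, baseline)
          for msa, shares in msa_shares.items()}

# MSAs ordered from the high end of the trait to the low end.
ranking = sorted(scores, key=scores.get, reverse=True)
```

A positive score means the MSA leans toward the answers with positive point values, and a negative score means it leans the other way. Mapping these scores region by region gives one map per trait.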