I scraped the user-submitted information from 800 male and 800 female profiles on a popular online dating site. The users were aged 25-35 and lived within 500 miles of Los Angeles, CA. I chose that age range because I figured it would include plenty of people in a similar stage of life, with similar things to say. I chose the geographical range for the same reason: lots of people with things in common. I limited each cloud to either 200 or 250 words, depending on what fit well.

The data was scraped with Python, and the word clouds were made in R using the wordcloud, tm, and RColorBrewer packages. I basically followed this blog’s instructions on making a word cloud in R.
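For anyone curious what the word-limiting step boils down to, here is a rough Python sketch of the idea: count word frequencies across the profile text, drop common stopwords, and keep only the top N words. This is not the code I used (the clouds themselves were built in R), and the `top_words` function, the tiny `STOPWORDS` set, and the sample profiles are all made up for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption for this sketch;
# real text-mining packages ship much longer lists).
STOPWORDS = {
    "a", "an", "and", "the", "i", "to", "of", "in", "my", "is",
    "for", "with", "that", "you", "it", "on", "am", "me",
}


def top_words(texts, max_words=200):
    """Return the max_words most frequent non-stopword tokens across texts."""
    counts = Counter()
    for text in texts:
        # Lowercase, then pull out runs of letters and apostrophes.
        tokens = re.findall(r"[a-z']+", text.lower())
        # Skip stopwords and one-character leftovers such as stray apostrophes.
        counts.update(t for t in tokens if t not in STOPWORDS and len(t) > 1)
    return counts.most_common(max_words)


# Hypothetical profile snippets, not real scraped data.
profiles = [
    "I love hiking and good food.",
    "Love to travel, love music and hiking.",
]

for word, count in top_words(profiles, max_words=3):
    print(word, count)
```

Changing `max_words` to 200 or 250 mirrors the cap I put on each cloud; the resulting word and count pairs are the kind of frequency table a word cloud is drawn from.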