This project is based on a unique dataset we compiled by analysing tens of thousands of images from each city, both through automatic image analysis and human judgement.

How we collected and filtered the data

To locate selfie photos, we randomly selected 120,000 photos (20,000-30,000 photos per city) from a total of 656,000 images we collected on Instagram. Each photo was then tagged by 2-4 Amazon Mechanical Turk workers, who were asked a simple question: "Does this photo show a single selfie?"
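The per-city random sampling step can be sketched as follows. The city names, pool sizes, and photo IDs below are illustrative placeholders, not the actual dataset:

```python
import random

# Hypothetical pool: photo IDs grouped by city. Counts are illustrative;
# the actual collection totalled about 656,000 Instagram images.
pool = {
    "Bangkok": [f"bkk_{i}" for i in range(30_000)],
    "Berlin": [f"ber_{i}" for i in range(25_000)],
}

def sample_for_tagging(pool, per_city=20_000, seed=0):
    """Randomly draw a fixed number of photos from each city's pool."""
    rng = random.Random(seed)
    return {city: rng.sample(ids, per_city) for city, ids in pool.items()}

sampled = sample_for_tagging(pool)
```

Each sampled photo would then be posted to Mechanical Turk as a tagging task.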

We then selected the top 1,000 photos for each city (i.e., photos that at least two workers tagged as a single-person selfie).
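This agreement filter amounts to a simple vote count over worker answers. A minimal sketch, with made-up photo IDs and answers:

```python
def filter_selfies(tags, min_yes=2):
    """Keep photos that at least `min_yes` workers tagged as a
    single-person selfie. `tags` maps photo_id -> list of yes/no
    answers from the 2-4 workers who saw that photo."""
    return [pid for pid, answers in tags.items()
            if sum(a == "yes" for a in answers) >= min_yes]

# Illustrative worker answers (not real data):
tags = {
    "p1": ["yes", "yes", "no"],
    "p2": ["no", "yes"],
    "p3": ["yes", "yes", "yes", "yes"],
}
selected = filter_selfies(tags)  # -> ["p1", "p3"]
```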

We submitted these photos to Mechanical Turk again, asking three "master workers" (i.e., more experienced workers) not only to verify that each photo shows a single selfie, but also to estimate the age and gender of the person.

On the resulting set of selfie images, we ran automatic face analysis, which supplied algorithmic estimates of eye, nose, and mouth positions, the degree of different emotional expressions, and so on.
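The output of this stage can be thought of as one record per photo. The sketch below shows one plausible shape for such a record; the field names and values are our invention for illustration, not the actual tool's schema:

```python
# Hypothetical face-analysis record. Field names and numbers are
# illustrative, not the actual tool's output format.
face_record = {
    "photo_id": "p1",
    "eye_left": (312, 208),   # estimated pixel coordinates
    "eye_right": (388, 210),
    "nose": (350, 260),
    "mouth": (351, 310),
    "emotions": {"happiness": 0.82, "surprise": 0.05, "anger": 0.01},
}

def dominant_emotion(record):
    """Return the emotion with the highest estimated score."""
    return max(record["emotions"], key=record["emotions"].get)
```

Records like these are what the project's visualizations aggregate over.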

As the final step, one or two members of the project team examined all of these photos manually. While most photos were tagged correctly, we found some mistakes. To keep the sample size identical across cities (making the visualizations comparable), our final set contains 640 selfie photos for every city.
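Trimming each city to the same fixed size can be sketched as below. The city names and counts are placeholders; this assumes every city has at least 640 verified photos after manual review:

```python
import random

def balance_cities(verified, target=640, seed=0):
    """Trim each city's verified selfies to the same fixed size so the
    per-city visualizations stay directly comparable.

    `verified` maps city -> list of photo IDs that survived manual review;
    every list is assumed to contain at least `target` photos."""
    rng = random.Random(seed)
    return {city: rng.sample(ids, target) for city, ids in verified.items()}

# Illustrative counts (not real data):
verified = {"Moscow": [f"m_{i}" for i in range(700)],
            "Sao Paulo": [f"s_{i}" for i in range(812)]}
final_set = balance_cities(verified)
```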