If you use social media, you've probably noticed a trend across Facebook, Instagram, and Twitter of people posting their then-and-now profile pictures, mostly from 10 years ago and this year.

Instead of joining in, I posted a semi-sarcastic tweet suggesting that all these photos could be used to train facial recognition algorithms.

My flippant tweet began to pick up traction. My intent wasn't to claim that the meme is inherently dangerous. But I knew the facial recognition scenario was broadly plausible and indicative of a trend that people should be aware of. It’s worth considering the depth and breadth of the personal data we share without reservations.

Among those who were critical of my thesis, the most common rebuttal was: “That data is already available. Facebook's already got all the profile pictures.”

Of course they do. In various versions of the meme, people were instructed to post their first profile picture alongside their current profile picture, or a picture from 10 years ago alongside their current profile picture. So, yes: These profile pictures exist, they’ve got upload time stamps, many people have a lot of them, and for the most part they’re publicly accessible.

But let's play out this idea.

Imagine that you wanted to train a facial recognition algorithm on age-related characteristics and, more specifically, on age progression (that is, how people are likely to look as they get older). Ideally, you'd want a broad and rigorous dataset with lots of people's pictures. It would help if you knew the pictures of each person were taken a fixed number of years apart, say, 10 years.

Sure, you could mine Facebook for profile pictures and look at posting dates or EXIF data. But that whole set of profile pictures could end up generating a lot of useless noise. People don’t reliably upload pictures in chronological order, and it’s not uncommon for users to choose something other than a picture of themselves as a profile picture. A quick glance through my Facebook friends’ profile pictures shows a friend’s dog who just died, several cartoons, word images, abstract patterns, and more.

In other words, it would help if you had a clean, simple, helpfully labeled set of then-and-now photos.

What's more, for the profile pictures on Facebook, the photo posting date wouldn’t necessarily match the date the picture was taken. Even the EXIF metadata on the photo wouldn't always be reliable for assessing that date.

Why? People could have scanned offline photos. They might have uploaded pictures multiple times over years. Some people resort to uploading screenshots of pictures found elsewhere online. Some platforms strip EXIF data for privacy.

Through the Facebook meme, most people have been helpfully adding that context back in (“me in 2008 and me in 2018”) as well as further info, in many cases, about where and how the pic was taken (“2008 at University of Whatever, taken by Joe; 2018 visiting New City for this year’s such-and-such event”).