
The new field of social data analytics provides us with more resources than ever before to investigate social phenomena. One of my favorite examples of how it is being applied is a beautiful 2013 study from Microsoft Research[1]. By analyzing the Twitter feeds of women, the researchers could predict with 80% accuracy whether a woman would go on to develop postpartum depression.

They began by recruiting close to 400 women to participate. From Twitter, they collected data, including the texts of the women's tweets, their favorites, and replies. They then measured four types of behavior:

Engagement. This included how often someone tweeted, how many of those tweets were replies to other people, and how many links, retweets, and questions they posted.

Social Networks. On Twitter, this is simply the number of people followed and number of followers someone has.

Emotion. Using tools that analyzed the types of words people use, they measured posts for words that expressed emotions such as sadness.

Language Style. Getting down into the linguistic details, these measures looked at things like articles, helper verbs, pronouns, and prepositions in tweets. These words are interesting because they do not carry the core meaning of what we are saying, but they tend to vary between people based on their mood or other traits. They are also very hard for us to control because we choose them mostly unconsciously.
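To make the four kinds of measures concrete, here is a minimal sketch in Python of how they might be computed from a handful of tweets. It is not the researchers' code: the function name `tweet_features`, the input format, and the tiny word lists are all assumptions made up for illustration, and the study's actual lexicons and tools are not described here.

```python
import re

# Hypothetical word lists, for illustration only.
SADNESS_WORDS = {"sad", "cry", "lonely", "miss"}
ARTICLES = {"a", "an", "the"}
PRONOUNS_2ND_3RD = {"you", "he", "she", "they"}


def tweet_features(tweets, followers, following):
    """Compute simple measures from a list of tweets.

    tweets: list of dicts with keys 'text' (str) and 'is_reply' (bool).
    followers / following: counts taken from the account profile.
    """
    n = max(len(tweets), 1)  # avoid dividing by zero for an empty feed
    words = [
        w
        for t in tweets
        for w in re.findall(r"[a-z']+", t["text"].lower())
    ]
    total_words = max(len(words), 1)

    def rate(vocabulary):
        # Share of all words that belong to the given word list.
        return sum(w in vocabulary for w in words) / total_words

    return {
        # Engagement
        "tweet_count": len(tweets),
        "reply_fraction": sum(t["is_reply"] for t in tweets) / n,
        "question_fraction": sum("?" in t["text"] for t in tweets) / n,
        # Social network
        "followers": followers,
        "following": following,
        # Emotion
        "sadness_rate": rate(SADNESS_WORDS),
        # Language style
        "article_rate": rate(ARTICLES),
        "pronoun_rate": rate(PRONOUNS_2ND_3RD),
    }


sample = [
    {"text": "Are you coming to the park?", "is_reply": True},
    {"text": "I feel sad today", "is_reply": False},
]
features = tweet_features(sample, followers=120, following=80)
```

Each measure is a count or a rate, so the same function can be run on different time windows of one person's feed and the results compared.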

After building up a list of attributes, the researchers monitored the women for signs of postpartum depression (PPD). While all of the women's behavior changed over the course of their pregnancies, women who went on to have PPD changed in different ways. The researchers built computer models that exploited these small differences. Those computer models could then look at a person's Twitter feed and guess whether or not she would go on to develop PPD.

Using only data from before the women gave birth, their models could classify women as likely to develop PPD or not with about 70% accuracy. However, PPD typically develops about a month after giving birth. When the researchers added in the first few weeks postpartum, before PPD symptoms would begin to develop, the algorithms got even better, reaching 80% accuracy or higher.

In what way did the women's Twitter behavior change? Women who went on to develop PPD tended to decrease their tweet frequency and number of followers, as well as their use of 2nd- and 3rd-person personal pronouns ("he", "they", "you"), while those who did not develop PPD actually increased in all categories.

On the other hand, women developing PPD tended to ask more questions, while women who did not develop PPD asked fewer.
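The directions of change described above can be written down as a small sketch. This is only an illustration of comparing two time periods and scoring predictions; it is not the study's model, and the feature names, the `PPD_PATTERN` table, and all numbers below are made up for the example.

```python
def behavior_change(before, after):
    """Per-measure difference between two periods (after minus before)."""
    return {name: after[name] - before[name] for name in before}


# Direction of change reported for women who went on to develop PPD:
# -1 means the measure tended to go down, +1 means it tended to go up.
PPD_PATTERN = {
    "tweet_count": -1,
    "followers": -1,
    "pronoun_rate": -1,       # 2nd- and 3rd-person pronouns
    "question_fraction": +1,
}


def pattern_matches(change):
    """Count how many measures moved in the direction listed in PPD_PATTERN."""
    return sum(1 for name, sign in PPD_PATTERN.items() if change[name] * sign > 0)


def accuracy(predictions, labels):
    """Fraction of predictions that agree with the true labels."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Made-up numbers for one hypothetical person, early vs. late in pregnancy.
before = {"tweet_count": 100, "followers": 200, "pronoun_rate": 0.05, "question_fraction": 0.10}
after = {"tweet_count": 80, "followers": 190, "pronoun_rate": 0.03, "question_fraction": 0.15}

matches = pattern_matches(behavior_change(before, after))

# Accuracy on four made-up predictions: three of the four agree with the labels.
score = accuracy([True, False, True, True], [True, False, False, True])
```

A real model would weigh many such measures together rather than count matching directions, but the sketch shows why the signal lies in how behavior shifts over time rather than in any single tweet.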

The interesting scientific insight is that these are all subtle cues that are not direct expressions of PPD. It means that even if women tried to hide their potential condition, they would be unlikely to succeed, at least as far as the algorithm is concerned.

As a diagnostic tool for physicians, this technique holds great promise. It is non-invasive and, with such high accuracy, could be a great help in signaling which new mothers might benefit from extra monitoring.

[1] De Choudhury, Munmun, Scott Counts, and Eric Horvitz. "Predicting postpartum changes in emotion and behavior via social media." Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM, 2013.