
Every Sunday morning, millions of people in India tune in to watch Bollywood star Aamir Khan host one of the country’s highest-rated television shows, Satyamev Jayate. But unlike so many popular programs, Satyamev Jayate doesn’t involve a singing competition or a collection of volatile strangers living under the same roof. It’s a documentary program tackling some of the country’s most sensitive topics, and it has the whole country, indeed the whole world, talking. In order to funnel millions of messages a week into something valuable, the show’s producers have turned to big data.

Aside from Khan’s star power, the show is so popular because of the types of issues it tackles — female feticide, caste discrimination, dowry deaths, child abuse and medical practice among them. According to one of the show’s producers, the amount of engagement and the number of responses from viewers is “completely unprecedented.” Here’s a sample of what we’re talking about, just 13 episodes into the show’s existence:

400 million viewers on Indian television and across the world on YouTube.

More than 1.2 billion people have connected with Satyamev Jayate across its website, Facebook, Twitter, YouTube and mobile devices.

More than 8 million people have contributed a total of more than 14 million responses to the show’s content via Facebook, web comments, text-message votes and a telephone hotline. More than 100,000 new people respond each week.

The responses take all sorts of forms, from votes on a weekly poll question to long, heartfelt letters explaining a viewer’s experience with an issue or how the show has changed their thinking on an issue. And although 95 percent of responses come from India, the show has received them from 5,000 locations in 165 countries, including as far away as northern Canada and Alaska. The show’s topics regularly rank among the top trends on Twitter shortly after each episode airs.

Surprisingly, the producer said, the India-created Satyamev Jayate has not received a single piece of hate mail from bitter geopolitical rival Pakistan. In fact, there have been numerous requests for an episode on India-Pakistan unity.

Parsing through millions of messages

In order to keep up with all the messages, Satyamev Jayate turned to Persistent Systems, an Indian IT consultancy with offices around the world, which created a system for automating their analysis. Here’s how the process works.

About a day and a half before each show, Satyamev Jayate’s production company tells Persistent what the issue will be, and the two groups come up with a taxonomy that will help the system sort through messages based on what topics will be brought up during Sunday’s show. But it’s not by any means the definitive list. As activity ramps up on Twitter while the show airs (tweet rates are highest during commercials and immediately after it ends, by the way), the team gets a sense of what topics are resonating with viewers and what themes they can expect in the nearly 1 million responses that will follow.

When the responses actually do start pouring in after lunch, they hit a system designed by Persistent to automatically tag them and score them based on interest level and sentiment. So, as Mukund Deshpande, head of business intelligence and analytics at Persistent, told me, a long message with an interesting story will be marked as higher quality, while a short, congratulatory note will be scored lower. Because so many viewers write in “Hinglish,” a combination of Hindi and English, an off-the-shelf system wouldn’t have been as accurate for processing these messages.
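To make the tagging-and-scoring idea concrete, here is a minimal Python sketch. It is not Persistent's system, whose internals weren't described to me. The taxonomy, the keyword lists (including the romanized Hindi terms), the weights and every function name are assumptions chosen purely for illustration, and it leaves out sentiment scoring altogether. It tags a message against a per-episode taxonomy, then scores longer, on-topic messages above short congratulatory notes.

```python
from dataclasses import dataclass

# Hypothetical per-episode taxonomy: topic -> keywords. Romanized Hindi
# terms sit alongside English ones because many viewers write in "Hinglish".
TAXONOMY = {
    "dowry": {"dowry", "dahej"},
    "caste": {"caste", "jati"},
}

# Hypothetical markers of a short congratulatory note.
CONGRATS = {"congrats", "congratulations", "great", "thanks", "badhai"}


@dataclass
class ScoredMessage:
    text: str
    tags: list
    score: float


def tokenize(text):
    """Lowercase whitespace tokens with surrounding punctuation stripped."""
    tokens = (w.strip(".,!?;:\"'()").lower() for w in text.split())
    return [t for t in tokens if t]


def tag_message(text, taxonomy=TAXONOMY):
    """Return the sorted taxonomy topics whose keywords appear in the text."""
    words = set(tokenize(text))
    return sorted(topic for topic, kws in taxonomy.items() if words & kws)


def score_message(text, taxonomy=TAXONOMY):
    """Score a message between 0 and 1 using assumed, illustrative weights."""
    words = tokenize(text)
    tags = tag_message(text, taxonomy)
    length_score = min(len(words) / 50.0, 1.0)  # longer stories rank higher
    topic_score = 1.0 if tags else 0.0          # on-topic messages rank higher
    # Penalize short notes that are mostly congratulation.
    is_short_congrats = len(words) <= 10 and any(w in CONGRATS for w in words)
    penalty = 0.5 if is_short_congrats else 0.0
    score = max(0.0, 0.6 * length_score + 0.4 * topic_score - penalty)
    return ScoredMessage(text, tags, round(score, 3))
```

Calling `score_message` on a two-sentence personal story that mentions "dahej" would tag it with the dowry topic and rank it above a note like "Congrats, great show!" A production system would presumably swap these hand-written rules for trained models, which is where the Hinglish problem really bites.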

In the future, he’d like to train the system to recognize various gradients of emotion, too, beyond just simple sentiment. That means not just “positive” or “negative,” but also “happy,” “sad,” “angry” and any other way a viewer might be feeling.

The best messages are then sent to a team of trained analysts (often college students and graduates, along with some Persistent employees) who decide which ones are worth following up on for a Friday radio show Khan does, and for placement on Satyamev Jayate’s web site. These analysts try to ensure that the stories shared are truthful and that the messages don’t contain personal information that could get viewers in trouble or affect their privacy. Data visualizations showing how many people have responded and where they come from are available on the Impact section of the show’s site, as well as on separate Impact pages for each episode.

Making a difference with data

All this feedback has an impact, both on the show itself and on India. Satyamev Jayate’s voting process, in particular, has yielded some impressive results. After the first episode about female feticide, or the selective abortion of female fetuses, 99.8 percent of viewers said they agreed with the idea of a fast-track court to prosecute doctors who perform such operations. When Khan presented the results to the Indian government, officials agreed almost immediately to amend the court system accordingly, the producer told me.

Sometimes, though, the results simply present an interesting, if troubling, view into the Indian subconscious. Almost 32 percent of respondents, for example, voted in favor of the right of families to use force to prevent the marriage of two willing adults (subsequent analysis uncovered some reasons why, including continuing opposition to inter-caste marriage), while almost 14 percent of respondents one week said that beating a woman is a sign of masculinity. And although women make up only about 32 percent of the show’s audience, they have accounted for the majority of responses on shows addressing issues important to them.

The producer said his team also uses the data to inspire ideas for future shows and to populate a weekly radio show that Khan does with a local journalist. The Satyamev Jayate team analyzes the week’s messages in order to pick the most powerful and determine trends in viewers’ feelings, and Khan shares them during the interview. The second season, he said, will be shaped in part by how viewers responded to the format during the first season and the issues they want covered next.

Beyond just the next season, though — and the occasional political victory — the hope is that all the data Satyamev Jayate generates will have continuing utility. Deshpande said he’d like to see it used for ethnographic and social science research, because the dataset is larger than most academic studies could generate (something that’s already happening with crowdsourced medical research) and it’s very high quality because of the demographic and geographic information attached to it.

However, the producer with whom I spoke seems perfectly content right now with the way Satyamev Jayate is resonating with the public. For example, he said, viewers are reporting crimes they previously might not have considered too big a deal and are reaching out to disabled citizens. This is the first time many people are speaking openly about these issues, he said, and they’re able to track the effects because they’re able to ensure no message is left behind.