Published on March 26 by Ryan Whitacker

A couple of months ago I posted an analysis of media bias in the 2016 election. The analysis compared Google searches with mainstream TV mentions and found a very strong correlation between the two. Most surprisingly, there was a huge discrepancy between TV coverage of Bernie Sanders and of Hillary Clinton: at the time, Clinton was getting 10 times more media mentions per search than Sanders.
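To make the "mentions per search" metric concrete, here is a minimal Python sketch. The counts are made-up placeholders rather than figures from the original analysis, and the function names are my own, chosen only for illustration.

```python
def mentions_per_search(tv_mentions, searches):
    """TV mentions divided by Google searches for one candidate."""
    if searches <= 0:
        raise ValueError("searches must be positive")
    return tv_mentions / searches


def coverage_gap(a_mentions, a_searches, b_mentions, b_searches):
    """How many times more mentions per search candidate A gets than candidate B."""
    return (mentions_per_search(a_mentions, a_searches)
            / mentions_per_search(b_mentions, b_searches))


# Hypothetical placeholder counts, chosen only so the gap comes out near 10x.
clinton_gap = coverage_gap(5_000, 1_000_000, 1_000, 2_000_000)
print(f"{clinton_gap:.1f}x")
```

Dividing by searches is what lets two candidates with very different levels of public interest be compared on the same scale.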

Checking on Clinton Supporters' Explanations

The post was, of course, picked apart by people who didn't like the implications of the findings. Clinton and Rubio supporters, for example, disagreed that the media was covering their candidate unevenly. Lots of reasons were offered, but many of them were either nonsense or already covered in the post. For example, several people suggested that poll numbers could explain the difference in TV coverage, but the post had already shown that poll numbers don't even come close to explaining the count of TV mentions. The numbers are barely even correlated, and an analysis of variance found poll numbers and changes in poll numbers to be bad explanations for network TV coverage.

More thoughtful commenters suggested some plausible explanations that I thought were worth checking on:

1. People watching the news are older, and older people are more interested in hearing about Clinton.
2. People using Google to do research are younger, thus inflating the Google searches for Sanders.
3. Clinton has been in the news for all kinds of things as secretary of state, thus inflating coverage.

I don't have a strong investment in these claims being true or false, so I thought about how to test them.

Possible explanations 1 & 2: The news isn't biased; you're just looking at different audiences.

Even if explanation 1 is true, biasing coverage toward an audience's preferred candidate is still media bias. I don't know the hearts of network executives, but ultimately it doesn't matter whether they're biasing coverage for viewership, profit, or to intentionally influence the public towards their views. When I speak of bias, I simply mean that one candidate gets more coverage and that the difference is not explained by poll numbers.

More importantly, a follow-up analysis of Clinton and Sanders seems to cast a lot of doubt on both explanations 1 and 2.

This is one of the most fascinating graphs I've ever seen, simply because it shows a clear and unexplained trend. I'm going to assume that the national TV audience and the demographics of Google searchers haven't changed much in the last two months. Why, then, has Sanders gone from a 1:10 media mention deficit per Google search to a 1:2 ratio? In case you're curious, the correlation between absolute coverage and Google searches is even stronger than before (94% for Sanders, 84% for Clinton).
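For readers who want to run this kind of check on their own numbers, here is a rough sketch of a correlation between weekly searches and weekly TV mentions. It assumes Pearson's r, which may not be the exact measure behind the percentages quoted here, and the weekly counts are hypothetical placeholders, not the data behind the graph.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical weekly counts for one candidate (placeholders, not real data).
weekly_searches = [120_000, 150_000, 310_000, 280_000, 400_000]
weekly_mentions = [900, 1_100, 2_600, 2_000, 3_300]

r = pearson_r(weekly_searches, weekly_mentions)
print(f"r = {r:.2f}")
```

A high r only says the two series rise and fall together. It says nothing about the level of coverage per search, which is why the ratio has to be looked at separately.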

A cynical person might also look at the graph above and think, “Of course the media is covering Sanders fairly now that he's no longer a threat.” I try not to infer human intent from data analysis, so I have no stance on whether this is proof of intentional media bias. I'd recommend all readers do the same. That said, it's clear this is further proof of media bias beyond demographic interest, and I honestly don't have a way to explain the data above. Go ahead and light up the comments section again if you think you have an answer.

We all love some data-centered mystery around here, but be careful about jumping to unjustified conclusions.

Possible explanation 3: Clinton was in the news for Benghazi, emails, and more. Without analyzing context, you can't prove media bias.

This is a good point, and one I should have considered and discussed. I do truly love the Internet keeping analysts and authors like myself honest. To test this we can simply look at Google Trends and search volumes to see whether this was an influential factor. If one of the big headlines Clinton's been involved with was influencing Google searches, it should show up under terms like “Clinton emails” or “Clinton Benghazi.” Let's take a look. If you want to see dates you can follow along here.

There are a few gems here when compared to the earlier graph. The spike in search interest over Benghazi corresponds nicely with an October spike in her media coverage per Google search a week after the first debate. But what about November and December? Why didn't coverage even out while searches for this term dropped to a mere 2,400 and related searches also fell?

I have a hard time believing that tens of thousands of searches were making waves in the data in a month where over 1 million people looked for her name alone (1 million in November and 1.2 million in December).
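The scale argument can be checked with a few lines of arithmetic. This is only a sketch: it assumes the 2,400 figure applies to both months, and it uses 50,000 as a hypothetical stand-in for "tens of thousands" of related searches.

```python
def share_of_name_searches(term_searches, name_searches):
    """Term searches as a percentage of searches for the candidate's name."""
    return 100 * term_searches / name_searches


# Monthly name-search totals quoted in the text.
name_searches = {"November": 1_000_000, "December": 1_200_000}

# 2,400 is the low point quoted for the Benghazi term; 50,000 is a
# hypothetical upper stand-in for "tens of thousands" of related searches.
for month, total in name_searches.items():
    low = share_of_name_searches(2_400, total)
    high = share_of_name_searches(50_000, total)
    print(f"{month}: {low:.2f}% to {high:.1f}% of name searches")
```

Under these assumptions the scandal terms account for a few percent of name searches at most, which is small next to a coverage gap measured in multiples.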

The next interesting thing is that the late January and early March surges in Google interest for both scandals come at about the same time. Both spikes correspond with major email releases from the State Department. The impact is fully reversed this time: a spike in searches results in a lower coverage-to-search ratio, meaning the media didn't seem as interested in the emails as online users were. For comparison, here is the February 28 – March 5 point enlarged:

This isn't hugely surprising given that users could go online to actually read the emails. Many of those who watched the news would turn to looking for the actual documents through a Google search.

Ultimately, it does appear that Clinton's past as Secretary of State influenced both her TV coverage and the number of people searching on Google at various points throughout the campaign. This throws a wrench in the data, but it's not a big enough wrench to break the conclusion. It does not appear to have been a major factor in most months, and it certainly wasn't responsible for Clinton's historical 4-8x advantage in coverage per Google search. Simply consider that the gap in coverage is still quite high for months where the scandal was playing out at a slow simmer.

The narrowing discrepancy between Sanders and Clinton coverage is a bit of a mystery. Maybe network execs felt the Bern? Maybe it was just too hard to ignore him when he was winning states. Regardless, the longitudinal data implies that neither young people Googling Sanders nor old people demanding news on Clinton is a sufficient explanation for the historical 800% bias in Clinton's favor.