Published on January 12 by Ryan Whitacker

In my last post I talked about correlations between media coverage and poll numbers for Republican presidential candidates. The most common response from non-Trump supporters was a claim that their candidate is being treated unfairly by the media. Claims of biased media are nothing new:

Blaming the mainstream media is now a mainstream political tactic.

But what if there is a media bias? Luckily this is a question we can answer with data. We looked at popularity vs. mainstream media coverage by comparing poll numbers, Google searches per week, and mentions on mainstream media broadcasts.

If CNN's Chris Cuomo is correct in his challenge to Sanders, the correlation should be strong between poll numbers and media coverage. However, if you subscribe to the belief that mainstream media is covering what people are interested in, then we should see a strong correlation between Google searches and news stories. We'll find which explanation is more correct, and then compare candidates.

Our analysis shows some candidates are being ignored and some candidates are being inexplicably preferred by the mainstream media.


Data Sources

It'll be a good idea to show where I'm getting the data from. This will also help if you want to run your own correlations on a candidate that we don't post.

Count of Stories: Archive.org

Because most candidates are claiming a “mainstream” media bias, we looked at the top television networks. Archive.org sifts through huge amounts of closed-captioning to find key phrases. The data is then compiled into the 2016 Campaign Television Tracker, which is a daily representation of how often a candidate is mentioned. We looked at all national networks since June of 2015, when the campaign really started to heat up.

Search Interest: Google Trends and Google Keyword Tool

The folks on /r/dataisbeautiful took me to task last time for Google Trends figures and graphs. Their criticisms were valid, because Google Trends by itself uses an arbitrary “100” as a peak for any given term. Everything else is just a percentage of the max.

This time we took that data (which is weekly) and combined it with Google's keyword data from AdWords.

If we know that December had 4,090,000 searches we can then apply the weekly Google Trends number to determine (more or less) the absolute number of searches each week. This is no problem for weeks that fall in the middle of the month, but it does get tricky when a week is split across two months. We did the analysis by breaking it down into days and rebuilding the weeks with search volume rather than a Google Trends number.

If you want to do this analysis yourself, you must apply the search trend weekly, but the search volume monthly. Since no daily data exists, we have to make our own. Divide the number of searches in the month by the number of days in the month. This is your “un-adjusted” daily figure, which we'll adjust by the Google Trends weekly figure. To build searches adjusted “by day,” multiply the un-adjusted daily figure by a multiplier: the Google Trends value for that day (taken from its week) divided by the average of the daily Google Trends values across the month. The daily search count is really just an average, but it doesn't matter since we can only take the analysis to the week level anyway. Make sure that the adjusted daily figures sum back up to the month's total searches.

Finally we re-build the week's number of searches with an estimated number of searches rather than a number of 1 to 100. There may be an easier way to do this, but I'm fairly certain there's no way to get a more accurate number. Weeks that pass from one month to the next get a little rounded, but the resulting weekly data is far closer to absolute searches by week and the search count far more useful.
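For anyone who wants to try this, here is a minimal sketch of the day-by-day rebuild described above. It is not our actual spreadsheet: the function names and the weekly trend values are made up for illustration, and it assumes Google Trends weeks start on Sunday. Only the 4,090,000 December figure comes from this post.

```python
import calendar
from datetime import date, timedelta

# Hypothetical weekly Google Trends values (0-100), keyed by the Sunday that
# starts each week. These numbers are invented for illustration only.
WEEKLY_TREND = {
    date(2015, 11, 29): 60,
    date(2015, 12, 6): 70,
    date(2015, 12, 13): 100,
    date(2015, 12, 20): 55,
    date(2015, 12, 27): 50,
}


def week_start(day):
    """Return the Sunday on or before `day` (assumes Sunday-based weeks)."""
    return day - timedelta(days=(day.weekday() + 1) % 7)


def estimate_daily_searches(year, month, monthly_total, weekly_trend):
    """Spread one month's search total across its days, weighted by weekly trend."""
    days_in_month = calendar.monthrange(year, month)[1]
    days = [date(year, month, d) for d in range(1, days_in_month + 1)]
    trends = [weekly_trend[week_start(d)] for d in days]
    mean_trend = sum(trends) / len(trends)
    unadjusted = monthly_total / days_in_month  # flat daily average
    # Multiplier: the day's trend value relative to the month's average.
    return {d: unadjusted * t / mean_trend for d, t in zip(days, trends)}


def rebuild_weeks(daily, starts):
    """Sum daily estimates into 7-day weeks beginning on each start date."""
    return {
        s: sum(daily.get(s + timedelta(days=i), 0.0) for i in range(7))
        for s in starts
    }


# December's 4,090,000 searches come from the post; the trend values do not.
daily = estimate_daily_searches(2015, 12, 4_090_000, WEEKLY_TREND)
weekly = rebuild_weeks(daily, sorted(WEEKLY_TREND))

for start, searches in weekly.items():
    print(start, round(searches))
```

With only one month loaded, the first and last weeks here hold just their December days. To handle weeks that straddle two months, build the daily dictionary for each month, merge them, and only then call `rebuild_weeks`.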

Poll Figures: RealClearPolitics.com

There are lots of poll numbers, but RCP's poll section does a great job collecting and averaging them. I'd prefer a Nate Silver kind of number, but this is the best we can get going back in history. Note that I pulled these numbers from the graph for the middle of the week where current polls were averaged.

Analysis

We ran this analysis for Clinton, Sanders, O'Malley, Trump, Cruz, Rubio, Carson, and Paul. Across these candidates, news coverage correlated with both poll numbers (25%) and search interest (84%), but the correlation with search volume was far stronger, and it was stronger for every candidate. The fact that polls don't predict coverage isn't surprising. Leading in the polls isn't necessarily news unless that lead is recent. In that respect we disagree with Chris Cuomo: news coverage has more to do with how many people are interested in the topic than with poll figures.
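If you'd like to run the same kind of comparison on your own weekly series, the sketch below shows one way to do it. It assumes a plain Pearson correlation, which this post doesn't spell out, and every number in it is invented for illustration rather than taken from our data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of the same length (at least 2 points)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical weekly figures for a single candidate (not real data).
tv_mentions = [1200, 1500, 900, 2100, 1800, 1600]
searches = [300_000, 380_000, 250_000, 520_000, 430_000, 410_000]
poll_average = [24.0, 25.5, 27.0, 26.0, 28.5, 29.0]

r_search = pearson(tv_mentions, searches)
r_polls = pearson(tv_mentions, poll_average)

print(f"mentions vs. searches: {r_search:.0%}")
print(f"mentions vs. polls:    {r_polls:.0%}")
```

The three lists need to cover the same weeks in the same order, which is why the weekly search rebuild and the mid-week poll readings matter.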

In essence:

The news isn't there to tell you what happened. It's there to tell you what it wants you to hear or what it thinks you want to hear. -Joss Whedon

You can see the news-to-Google searches correlation in action when you look at individual candidates. The graphs below show the number of people looking for a candidate's exact name on the red line and right axis. The number of national TV mentions is on the black line and left axis. Note that the numbers of searches are for the candidate's name only. There are more searches (that generally follow the same trend) for all types of modified search terms.

Do not be fooled by the relative height of each line. They're measuring different things, and the axis max can make one line look smaller or bigger on a whim. Pay attention primarily to the trend they follow.

Donald Trump

Trump is the most-searched-for candidate in 2016, and so far he's also the most covered. In fact, he's been covered in the mainstream news about twice as many times as the next-closest candidate: Hillary Clinton.

Ted Cruz

Marco Rubio

Ben Carson

Rand Paul

Hillary Clinton

Bernie Sanders

Martin O'Malley

Conclusions

The only reason we've included the relative historical trends and correlations above is so that you can see that we are measuring something strongly correlated and re-validated by every candidate. These trends indicate an interaction that is obvious (more searches = more coverage, more coverage = more searches), but we needed to demonstrate the validity of the numbers for what comes next. Things get really interesting when you compare the raw numbers side-by-side.

Candidate    Google Searches (Jun-Present)    Mainstream Press Mentions (Jun-Present)
Clinton      9,235,231                        87,737
Sanders      21,536,032                       29,525
O'Malley     1,158,270                        3,996
Trump        37,046,010                       183,903
Cruz         5,373,797                        14,465
Rubio        4,428,465                        26,463
Carson       8,728,769                        33,794
Paul         3,258,219                        6,566

In theory these two numbers would be proportional to some extent, at least within a party. Whether the news causes people to be interested or people's interest causes more news stories (or both) we should see roughly the same ratio. That idea is even further reinforced by the strong correlations seen above. Here's the problem, though: candidates don't get proportional coverage based on who is interested.

Holy crap. When I first had this idea I thought I might kill some conspiracy theories about the media. What we found is strong evidence of media bias.

Our analysis shows Bernie Sanders is being ignored by the mainstream media to a shocking degree. If covered at the average rate we'd have seen about 61,500 more stories including Sanders in the last 6 months: 91,094 mentions instead of 29,525.
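Here is a rough sketch of how an "average rate" figure like this can be computed from the totals in the table above. It assumes the average rate is the unweighted mean of each candidate's mentions-per-search ratio. That is one reading that lands very close to the 91,094 figure, not a confirmed description of the exact calculation, and the variable names are ours for illustration.

```python
# (searches, mentions) per candidate, June to present, from the table above.
TOTALS = {
    "Clinton": (9_235_231, 87_737),
    "Sanders": (21_536_032, 29_525),
    "O'Malley": (1_158_270, 3_996),
    "Trump": (37_046_010, 183_903),
    "Cruz": (5_373_797, 14_465),
    "Rubio": (4_428_465, 26_463),
    "Carson": (8_728_769, 33_794),
    "Paul": (3_258_219, 6_566),
}

# Mentions per search for each candidate.
rates = {name: mentions / searches for name, (searches, mentions) in TOTALS.items()}

# Assumption: "average rate" = simple mean of the per-candidate rates.
average_rate = sum(rates.values()) / len(rates)

# Mentions each candidate would have at that average rate.
expected = {name: searches * average_rate for name, (searches, _) in TOTALS.items()}

for name, (searches, mentions) in TOTALS.items():
    gap = expected[name] - mentions
    print(f"{name:<10} actual {mentions:>8,}  expected {expected[name]:>10,.0f}  gap {gap:>+10,.0f}")
```

A positive gap means a candidate got fewer mentions than the average rate would predict; a negative gap means more. Pooling all searches and mentions into a single ratio instead of averaging per-candidate ratios gives somewhat different numbers, so the choice of average matters.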

Clinton receives a high amount of coverage, despite no dramatic changes in polls and lower search interest.

Candidates like Rand Paul also appear to be locked out of the mainstream press. Paul isn't the most popular candidate, but if the average held he'd have been in about twice as many stories. Rubio, despite drawing only 36% more searches than Paul, received about four times as many mentions (403% of Paul's total).

Again, the poll numbers don't explain the difference in coverage: Clinton's poll-to-media-mention correlation, for example, is actually negative 48%. That means that news coverage goes up a little when her poll numbers drop. Sanders, on the other hand, sees no large benefit when his poll numbers rise (correlation = 11%).

For both Clinton and Sanders there's a strong correlation between online search interest and news coverage: 90% and 77% respectively. All that means is that the lines in the graphs above follow the same trend. Search interest goes up, and so does the number of TV mentions. If Sanders received the same volume of mainstream press coverage per search that Clinton did, the correlation could remain unchanged. The line for “national news mentions” would have the same ups and downs, but going by the totals in the table above it would be roughly 7 times higher across the board.
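The point that a correlation says nothing about the overall level of coverage is easy to check. The sketch below assumes a Pearson correlation and uses invented weekly numbers: multiplying every week's mentions by the same positive factor leaves the correlation with searches unchanged.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented weekly figures, for illustration only.
searches = [300_000, 380_000, 250_000, 520_000, 430_000, 410_000]
mentions = [400, 520, 310, 700, 560, 590]

baseline = pearson(searches, mentions)

# Scale the mentions line up by a few arbitrary factors and compare.
for factor in (2, 7, 10):
    scaled = [factor * m for m in mentions]
    same = abs(pearson(searches, scaled) - baseline) < 1e-9
    print(f"x{factor}: correlation unchanged = {same}")
```

This is why the side-by-side totals are needed in addition to the correlations: the correlation captures whether the two lines move together, while the totals capture how much coverage each search is "worth."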

Remember that correlation and causality are two different things. It's unclear whether news coverage causes interest or whether interest creates incentive to cover; most likely both are partly true. What we can say is that some candidates receive far more coverage than is justified by either poll figures or search interest.

Ryan Whitacker is one of DecisionData’s founders. Ryan is a data freak, news junkie, and open data fanatic. He’s worked for political organizations and nonprofits as a data analyst, developer, and consultant.

Please feel free to contact us to obtain the data. We'd love to see someone do a per-channel analysis using our method. This may lead to further proof of bias or at least explain the discrepancy.