I recently read a Washington Post article titled “Ranking the media from liberal to conservative, based on their audiences,” which inspired me to rank news sites by their subjectivity and polarity on a given subject, in this case Donald Trump.


I used Python to search the following news sites for their 30 most recent articles containing the keyword “Trump” (ordered from the liberal end of the Washington Post ranking to the conservative end):

The New Yorker, NPR, CNN, Fox News, Drudge Report, Breitbart

I then ran a text analysis on each article’s description to measure how subjective (or opinionated) the article was and its polarity (whether the author felt positively or negatively about Trump). This gave me a (very basic) ranking of the news sites by how biased they are about our President and what their opinions of him are. I could then compare my results with the Post article to see which political affiliation each news source is most associated with and which source is the most biased.

Imports

To search Google, I used this library (imported as google), along with TextBlob for text analysis and texttable for displaying the results in the terminal.

from google import google
from textblob import TextBlob
import texttable as tt
from time import sleep

Search

To get and analyze the articles from the various websites, I used Google Dork searches, which let you search a specific website for keywords (among many other things). For example, you can type

inurl:medium.com intext:python

to return search results only from the Medium website that mention Python.
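As a minimal sketch of how such a query string can be assembled in Python (build_dork_query is a hypothetical helper name used here for illustration, not part of any library):

```python
def build_dork_query(site, keyword):
    """Build a Google Dork query limited to one site and one keyword."""
    # inurl: keeps results whose URL contains the site name;
    # intext: keeps results whose page text contains the keyword.
    return "inurl:" + site + " intext:" + keyword


print(build_dork_query("medium.com", "python"))
# inurl:medium.com intext:python
```

Keeping the string building in one small function makes it easy to check the query by eye before sending it to Google.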

I created a function called search that takes two arguments: the site to search and the keyword to search for. It sets the variable search_results to the Google Dork search built from the site and keyword parameters, which collects the articles.

I then created search_results_list, subjectivity_list, and polarity_list to append the results to later. The num list simply numbers the articles as they appear in the text table.

def search(site, search):
    # Number of Google result pages to fetch
    num_page = 3
    # Google Dork query: match the site in the URL and the keyword in the page text
    search_results = google.search("inurl:" + site + " intext:" + search, num_page)
    search_results_list = []
    subjectivity_list = []
    polarity_list = []
    num = []
    number = 1

Sentiment Analysis

The next step is to determine the subjectivity and polarity of each article. For each search result, read its result.description attribute and append it to search_results_list.

Passing that description to TextBlob and storing the result in a variable called analysis lets you perform basic sentiment analysis on the article. Read analysis.sentiment.subjectivity to get the subjectivity and analysis.sentiment.polarity to get the polarity, then append these values to their respective lists.

    for result in search_results:
        # Description text of the search result
        description = result.description
        search_results_list.append(description)
        analysis = TextBlob(description)
        subjectivity = analysis.sentiment.subjectivity
        subjectivity_list.append(subjectivity)
        polarity = analysis.sentiment.polarity
        polarity_list.append(polarity)
        # Record the article number before incrementing so numbering starts at 1
        num.append(number)
        number = number + 1
        sleep(5)
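The scoring step in the loop above can also be pulled out into a standalone helper, which makes it easier to try on a few strings without running a Google search. This is only a sketch: analyze_descriptions and its scorer parameter are hypothetical names introduced here, and the default scorer assumes TextBlob is installed.

```python
def analyze_descriptions(descriptions, scorer=None):
    """Return (subjectivity_list, polarity_list) for a list of text snippets."""
    if scorer is None:
        # Imported lazily so the helper can also be tried with a stand-in scorer.
        from textblob import TextBlob

        def scorer(text):
            sentiment = TextBlob(text).sentiment
            return sentiment.polarity, sentiment.subjectivity

    subjectivity_list = []
    polarity_list = []
    for text in descriptions:
        # Each scorer call returns a (polarity, subjectivity) pair.
        polarity, subjectivity = scorer(text)
        subjectivity_list.append(subjectivity)
        polarity_list.append(polarity)
    return subjectivity_list, polarity_list
```

With TextBlob installed, calling analyze_descriptions(search_results_list) should give the same two lists as the loop, minus the numbering and the pause.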

Text Table

To create the table, make a new variable tab and set it equal to tt.Texttable(). Then write out your headings; I used Number, Results, Subjectivity, and Polarity.

    tab = tt.Texttable()
    headings = ['Number', 'Results', 'Subjectivity', 'Polarity']
    tab.header(headings)

Then run a for loop to add each element of the lists as a row to your table.

    for row in zip(num, search_results_list, subjectivity_list, polarity_list):
        tab.add_row(row)

Using the subjectivity and polarity lists, we can find the averages for each news source, which we then print along with the site, the search keyword, and the table.

    # Average the per-article scores for this site
    avg_subjectivity = sum(subjectivity_list) / len(subjectivity_list)
    avg_polarity = sum(polarity_list) / len(polarity_list)
    table = tab.draw()
    print(site)
    print(search)
    print(table)
    print(site + " average subjectivity: " + str(avg_subjectivity))
    print(site + " average polarity: " + str(avg_polarity))
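One thing to watch in the averaging step above: if a search returns no results, both lists are empty and the division fails. A small guard avoids that. The average helper below is a hypothetical addition for illustration, not part of the original script.

```python
def average(values):
    """Return the arithmetic mean of values, or None for an empty list."""
    if not values:
        # A search with no results would otherwise divide by zero.
        return None
    return sum(values) / float(len(values))


print(average([0.2, 0.4, 0.6]))
print(average([]))
```

Returning None rather than 0 keeps "no data" distinct from a genuinely neutral score.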

Calling the Function

Finally, call the search function once for each news site.

search("newyorker", "trump")
search("npr", "trump")
search("cnn", "trump")
search("foxnews", "trump")
search("drudgereport", "trump")
search("breitbart", "trump")

And that’s it!

Results

When you run the script, the terminal output shows the site, the number of each article, the article description, the subjectivity and polarity for each article, and then the averages for the site.

The sites’ average subjectivity scores are as follows (in order from most objective to most subjective):

NPR (0.21), Fox News (0.23), CNN (0.25), The New Yorker (0.27), Breitbart (0.34), Drudge Report (0.36)

It is no surprise that NPR is the most objective news source and that Drudge Report and Breitbart are the most opinionated on the subject of Trump. However, the fact that Fox News ranked second in objectivity surprised me.

The polarity rankings were equally surprising. The closer the value is to -1, the more negative the sentiment; the closer to 1, the more positive.

Fox News (0.04), NPR (0.05), CNN (0.07), The New Yorker (0.07), Drudge Report (0.11), Breitbart (0.12)
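The two rankings above can be reproduced from the reported averages with a short sort. This is just a sketch: the averages dictionary and the rank_by helper are names introduced here, and the numbers are the rounded values quoted in this article.

```python
# Site averages as reported above: (subjectivity, polarity)
averages = {
    "NPR": (0.21, 0.05),
    "Fox News": (0.23, 0.04),
    "CNN": (0.25, 0.07),
    "The New Yorker": (0.27, 0.07),
    "Breitbart": (0.34, 0.12),
    "Drudge Report": (0.36, 0.11),
}


def rank_by(index):
    """Return site names sorted in ascending order of the score at index."""
    return sorted(averages, key=lambda site: averages[site][index])


by_subjectivity = rank_by(0)  # most objective first
by_polarity = rank_by(1)      # least positive first

print(by_subjectivity)
print(by_polarity)
```

CNN and The New Yorker tie on polarity at this level of rounding, so their relative order in that ranking is not meaningful.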

Once again, Fox News surprised me, as it had the least positive average sentiment towards Trump over 30 articles. After that, the news sources fall in line as expected. It was very interesting to see that the highest positive average was barely above zero! I expected Drudge Report and Breitbart to be at least in the 0.6 range.

Breitbart recorded the most positive sentiment on a single article, 0.8; however, it also had the most negative, at -0.4. This leads me to believe that Breitbart articles use far more connotative language than their counterparts, which is supported by the fact that Breitbart ranks second highest in subjectivity.

Final Thoughts

Obviously, this is a very basic ranking and shouldn’t be taken as truth: the data set is extremely small and limited (it uses only each article’s description, not the full-length text).

I saw the Post’s article and figured this would be a fun experiment to run and see the results. Hope you enjoyed!

Here is the full source code on GitHub
