I’ll be collecting fanbase statistics throughout the entire 2018 NFL season - click here for a Google Sheet of every team’s week-by-week metrics, and click here for an interactive Shiny application of team data.

This past weekend, the American sports world turned its eyes to the first matchups of the 2018 NFL regular season. From Thursday night to Monday night, we watched Sam Darnold and Patrick Mahomes impress in their debuts, and watched as a one-legged Aaron Rodgers engineered a twenty-point comeback against Chicago. We saw teams that were expected to lose – the Bills and Raiders, for example – get blown out, and teams that were supposed to win – most notably, the Saints and Lions – upset by their presumed weaker opponents. We even watched as Cleveland finally ended their seventeen-game losing streak in the most quintessentially Browns way possible – by playing their AFC North rival Steelers to a draw.

After the Bills-Ravens game mercifully concluded, with Baltimore walloping Buffalo by a final score of 47-3, morbid curiosity brought me to r/buffalobills, the Bills-specific forum (or “subreddit”) of Reddit, where Bills fans’ early reviews of the team’s first game were less than stellar:





Intuition might suggest that since Buffalo had lost by the largest margin of any team in the league, their fanbase would have been the most negatively impacted by the loss. On the other hand, the Bills were largely expected to struggle this season; after all, they traded starting quarterback Tyrod Taylor in March, and were starting a quarterback who had, less than a year ago, thrown five interceptions in only one half of play. A fun exercise, therefore, would be quantifying Bills fans’ negativity – was the Buffalo subreddit demonstrably the most negative of all thirty-two teams? And if not, which fanbase had the worst Week 1?

First, I scraped thousands of rows of data for each NFL team, with each row representing a comment for any post left on that team’s subreddit between five minutes before the end of their Week 1 game and twenty-four hours after the start of their Week 1 game. A Python library, PRAW, exists for the express purpose of scraping Reddit submissions and comments, and although PRAW can collect only the thousand most recent posts in a subreddit, this proved a non-issue, as no team’s subreddit had even close to that many posts in such a short time span.
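To make the collection window concrete, here is a minimal sketch rather than the project's actual code. It assumes the window applies to each comment's timestamp, that `reddit` is an already-authenticated `praw.Reddit` instance, and that kickoff and final-whistle times are supplied by hand; the helper names (`comment_window`, `in_window`, `collect_comments`) are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def comment_window(game_start, game_end):
    """Window described above: five minutes before the end of the game
    through twenty-four hours after its start."""
    return game_end - timedelta(minutes=5), game_start + timedelta(hours=24)


def in_window(created_utc, window):
    """True if a Unix timestamp (such as PRAW's comment.created_utc)
    falls inside the (start, end) window."""
    start, end = window
    created = datetime.fromtimestamp(created_utc, tz=timezone.utc)
    return start <= created <= end


def collect_comments(reddit, subreddit_name, window):
    """Yield (post_id, body, score) for every in-window comment.

    `reddit` is assumed to be an authenticated praw.Reddit instance.
    Reddit listings return at most roughly the 1,000 newest posts.
    """
    for submission in reddit.subreddit(subreddit_name).new(limit=None):
        # Expand every "load more comments" stub (slow on large threads).
        submission.comments.replace_more(limit=None)
        for comment in submission.comments.list():
            if in_window(comment.created_utc, window):
                yield submission.id, comment.body, comment.score
```

Calling `list(collect_comments(reddit, "buffalobills", window))` would then give one row per comment, which can be written out to a CSV or loaded into a data frame.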

After data collection, I used the VADER SentimentIntensityAnalyzer algorithm within Python’s nltk library to perform sentiment analysis on each individual comment. Sentiment analysis determines the overall positivity or negativity in a given piece of text, as well as the sentiment’s overall intensity (or “polarity”), while also accounting for slang, negation, and strength of language. Every subreddit comment is given four scores: positivity, negativity, neutrality (i.e., neither positive nor negative), and “compound,” which is an amalgam of the others. While a comment can be partly positive, partly negative, and partly neutral, all three scores must sum to 1 (see the attached link for examples). While all individual positivity, negativity, and neutrality scores can take values in the range [0, 1], I will be multiplying every score by a hundred for ease of reading.
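The scoring step might look roughly like the sketch below. `SentimentIntensityAnalyzer` and its `polarity_scores` method are part of nltk; the helper names are hypothetical, and leaving the compound score unscaled (it ranges from -1 to 1 rather than 0 to 1) is an assumption, not something stated above.

```python
def make_analyzer():
    """Build a VADER analyzer; nltk is imported lazily so the rest of
    this sketch can be used without it."""
    import nltk
    from nltk.sentiment.vader import SentimentIntensityAnalyzer

    nltk.download("vader_lexicon", quiet=True)  # one-time lexicon download
    return SentimentIntensityAnalyzer()


def scale_scores(raw):
    """Multiply pos/neg/neu by 100 for readability.

    The compound score is passed through unchanged (an assumption).
    """
    scaled = {key: raw[key] * 100 for key in ("pos", "neg", "neu")}
    scaled["compound"] = raw["compound"]
    return scaled


def score_comment(text, analyzer):
    """Return scaled VADER scores for one comment."""
    return scale_scores(analyzer.polarity_scores(text))
```

With an analyzer from `make_analyzer()`, `score_comment("What a terrible game", analyzer)` returns a dictionary with `pos`, `neg`, `neu`, and `compound` keys, where the first three sum to roughly 100.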

However, before performing sentiment analysis, I first had to separate all contractions (e.g., “don’t” or “hasn’t”) and remove all Reddit-style HTML from each comment, as well as additional unnecessary punctuation. I also deleted every mention of “Luck” from the Colts’ subreddit, as VADER cannot distinguish between the dictionary definition of luck (which it considers a “positive” word) and the word in the context of Colts quarterback Andrew Luck. The word “super” was also subject to removal - this time league-wide. Because teams with 2018 Super Bowl aspirations are more likely to use the word “super” (which, again, VADER considers positive), inclusion of the word would artificially inflate that team’s positivity scores.
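As an illustration of that cleanup, here is a small standard-library sketch. The contraction table is a tiny hypothetical subset, the set of punctuation kept is a guess, and `clean_comment` is an invented name; it is not the code actually used for the project.

```python
import html
import re

# Illustrative subset only; a real mapping would be much longer.
CONTRACTIONS = {
    "don't": "do not",
    "hasn't": "has not",
    "can't": "cannot",
    "won't": "will not",
}


def clean_comment(text, drop_words=("super",)):
    """Strip HTML, expand contractions, and remove problem words."""
    text = html.unescape(text)              # "&amp;" -> "&"
    text = re.sub(r"<[^>]+>", " ", text)    # drop HTML tags
    text = text.replace("\u2019", "'")      # curly -> straight apostrophe
    for short, full in CONTRACTIONS.items():
        text = re.sub(r"\b" + re.escape(short) + r"\b", full,
                      text, flags=re.IGNORECASE)
    for word in drop_words:
        # Whole-word match, so "superb" survives when "super" is dropped.
        text = re.sub(r"\b" + re.escape(word) + r"\b", " ",
                      text, flags=re.IGNORECASE)
    # Keep only basic punctuation (an assumption about what is "unnecessary").
    text = re.sub(r"[^\w\s.,!?']", " ", text)
    return " ".join(text.split())           # collapse leftover whitespace
```

For r/Colts, the call would pass `drop_words=("super", "luck")`; every other subreddit would use the default. Case is left alone outside the replaced words because VADER treats capitalization as a signal of intensity.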

But enough of the boring technical details – on to the results!

Teams

Below is a scatterplot of the positivity and negativity of each subreddit’s average comment between Thursday and Tuesday nights. The happiest fanbases (i.e., the subreddits with the highest average positivity and lowest average negativity) appear in the lower right-hand corner of the plot, and the least happy fanbases appear in the upper-left. Each data point has also been color-coded in accordance with the outcome of that team’s game.





While Buffalo may have suffered the worst loss of Week 1, their fanbase actually had nowhere near the highest mean negativity in the league. That honor instead goes to the New Orleans Saints, who squeaked by the Tennessee Titans (by less than one-hundredth of a percentage point!) to be titled the league’s most negative Week 1 fanbase. Tennessee, however, also has a significantly lower mean positivity score than New Orleans, and is the only team in the league with a lower average positivity than average negativity. Sitting through the longest game in NFL history clearly soured the mood of Titans fans, and losing their starting quarterback and tight end to injuries - as well as the game itself - certainly didn’t help matters.

We can also observe the weather delay’s effect on Miami fans. Although the Dolphins won the game, their mean positivity and negativity scores land them firmly among Week 1 losers Atlanta and Houston. For the most part, however, the graph cleanly splits between winning and losing teams; r/Colts and r/49ers both buck this trend, with the former subreddit actually finishing the weekend with the highest positivity score in the league. The Colts, in fact, finish with both higher positivity and lower negativity than the Bengals - even though Cincinnati won their matchup!

The Redskins and Vikings clearly have the two happiest fanbases of Week 1, as both teams emerged victorious in their opening games - the latter, incidentally, over the (curiously positive) 49ers. Also of note is that while New York Jets fans had the lowest negativity score in the league – which makes sense, given the fantastic Week 1 play of rookie quarterback Sam Darnold – they still finished with a relatively moderate positivity score. Perhaps years of disappointment have simply left the Jets fanbase incapable of happiness.

So how do we definitively decide which team’s fans had the best and worst Week 1 in the league? By subtracting each subreddit’s mean negativity from its mean positivity, we can rank teams by their net positivity scores. Here are the results, ranked from best to worst:





By net positivity, the Minnesota Vikings’ fanbase had the happiest Week 1, and the Tennessee Titans’ fanbase had the worst. Looking at the Game Result column, we can still observe a fairly clean split between winning teams and losing teams, showing that fanbases of winning teams generally exhibit more positivity than fanbases of losing ones. Winning margin, however, appears to play little part in a team’s ranking; the #1-ranked Vikings won their matchup by eight points, while the #10 Ravens beat the Bills by over forty.
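The net positivity ranking is simple enough to sketch in a few lines. The function name and the input layout (team name mapped to a pair of mean positivity and mean negativity) are assumptions made for illustration.

```python
def rank_by_net_positivity(team_means):
    """Rank teams from happiest to least happy.

    `team_means` maps team name -> (mean_positivity, mean_negativity).
    Net positivity is mean positivity minus mean negativity.
    """
    net = {team: pos - neg for team, (pos, neg) in team_means.items()}
    return sorted(net.items(), key=lambda item: item[1], reverse=True)
```

A team whose mean negativity exceeds its mean positivity ends up with a negative net score and falls to the bottom of the list.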

Posts

Below are the individual posts which resulted in comments with the highest mean positivity and negativity. I have excluded posts with fewer than twenty-five comments from this section of results so that posts with very few – but very polarized – comments do not dominate the list. I have also excluded posts whose inclusion on the list was only due to one user’s repetition of the same word or phrase.
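A rough sketch of the post-level filter follows, assuming each comment is a dictionary carrying a `post_id` and its scaled scores (a hypothetical layout). The manual exclusion of posts dominated by one user's repeated phrase is not modeled here.

```python
from collections import defaultdict


def top_posts(comments, key="pos", min_comments=25, n=3):
    """Return the n posts with the highest mean `key` score.

    Posts with fewer than `min_comments` comments are skipped so that
    a handful of polarized comments cannot dominate the list.
    """
    scores_by_post = defaultdict(list)
    for comment in comments:
        scores_by_post[comment["post_id"]].append(comment[key])
    means = {
        post_id: sum(scores) / len(scores)
        for post_id, scores in scores_by_post.items()
        if len(scores) >= min_comments
    }
    return sorted(means.items(), key=lambda item: item[1], reverse=True)[:n]
```

Passing `key="neg"` gives the most negative posts instead.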

Here are the three most positive posts, along with each post’s mean positivity score and relevant team:

And here are the three most negative posts of Week 1. If you look closely, you might spot another explanation for the Dolphins’ low net positivity score:

Comments

Just as we did on a post-by-post level, we can also identify the overall most positive and negative comments of the weekend. For each subreddit, I found the 67th percentile of the length (in characters) of that subreddit’s comments, and, for purposes of this list, am only including comments between the 67th and 100th percentiles. Additionally, the following lists comprise only comments with at least twenty-five upvotes.
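The comment filter can be sketched as below. The percentile method (nearest rank) is an assumption, since the exact calculation is not specified above, and the `body` and `score` keys are a hypothetical layout for one subreddit's comments.

```python
import math


def nearest_rank_percentile(values, pct):
    """Nearest-rank percentile: the smallest value with at least
    pct percent of the data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def eligible_comments(comments, length_pct=67, min_upvotes=25):
    """Keep long, well-upvoted comments from a single subreddit."""
    cutoff = nearest_rank_percentile(
        [len(c["body"]) for c in comments], length_pct
    )
    return [
        c for c in comments
        if len(c["body"]) >= cutoff and c["score"] >= min_upvotes
    ]
```

Because the length cutoff is computed per subreddit, the function would be called once for each team's comments before the results are pooled and ranked.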

Here are the three most positive comments of Week 1:

And the three most negative:

Words

Finally, we can use nltk’s FreqDist function to identify common words contained in each subreddit’s most negative comments. This way, we can achieve a rough idea of what specific topics have inspired the most negativity for each fanbase. I have removed all curse words from the following results, but rest assured that every single team’s list of most common negatively-associated words included at least four or five curses. Results have also been lightly edited for brevity.

49ers (r/49ers): jimmy (5), cancer (4), injury (4)
Bears (r/chibears): packers (10), loss (9), collinsworth (8)
Bengals (r/bengals): penalty (11), nfl (9), steelers (8), dirty (4), refs (4)
Bills (r/buffalobills): tyrod (6), benjamin (5), terrible (4), peterman (4)
Broncos (r/denverbroncos): raiders (6), von (6), offense (4)
Browns (r/browns): hue (6), sucks (6), offense (6)
Buccaneers (r/buccaneers): fire (10), saints (10), smith (6), cannons (5)
Cardinals (r/azcardinals): rosen (7), bradford (7), riot (6), mccoy (5)
Chargers (r/chargers): cut (5), hate (5), us (5)
Chiefs (r/kansascitychiefs): defense (6), terrible (3), crazy (3), killing (3)
Colts (r/colts): mad (4), doyle (4), hit (4)
Cowboys (r/cowboys): fire (15), offense (13), romo (9), dak (9)
Dolphins (r/miamidolphins): jets (119), bills (29), patriots (17), delay (5)
Eagles (r/eagles): penalty (11), collinsworth (7), sullivan (5), brutal (4)
Falcons (r/falcons): sark (23), offense (11), fire (10), refs (9), ryan (8)
Giants (r/nygiants): flowers (17), cut (7), dallas (6), offensive (4)
Jaguars (r/jaguars): penalty (7), offense (6), dumb (6), cam (5)
Jets (r/nyjets): hate (5), announcers (3), sad (3), suck (3)
Lions (r/detroitlions): die (18), fire (14), sorry (11)
Packers (r/greenbaypackers): bears (15), suck (15), miss (9)
Panthers (r/panthers): hate (5), lawrence (4), saints (4)
Patriots (r/patriots): hate (5), hill (5), post (4), pissed (4), brady (4)
Raiders (r/oaklandraiders): carr (18), talib (17), gruden (9), hate (8)
Rams (r/losangelesrams): hold (12), cook (5), gruden (5), dirt (5), jared (4)
Ravens (r/ravens): steelers (8), fans (6), bills (5)
Redskins (r/redskins): dallas (4), peterson (3), viking (3), giants (3)
Saints (r/saints): ryan (7), rob (6), pizza (6), offense (5)
Seahawks (r/seahawks): announcers (12), ifedi (7), refs (5)
Steelers (r/steelers): ben (11), offense (9), penalty (7), haley (7)
Texans (r/texans): johnson (8), kevin (7), watson (5)
Titans (r/tennesseetitans): worst (8), refs (7), dolphins (7)
Vikings (r/minnesotavikings): rodgers (13), rule (8), bears (5)
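Word counts like those above can be approximated with the sketch below. It swaps in the standard library's `collections.Counter` for nltk's `FreqDist` (which is itself a `Counter` subclass with the same `most_common` method). The negativity threshold, the stopword list, and the function name are all assumptions, since the cutoff for "most negative" is not stated above.

```python
import re
from collections import Counter

# Hypothetical stopword list; a real one (and a curse-word list) would be longer.
STOPWORDS = {"the", "a", "an", "and", "is", "are", "to", "of", "it", "this"}


def common_negative_words(comments, min_neg=50.0, n=5, stopwords=STOPWORDS):
    """Count words in comments whose negativity score is at least
    `min_neg` (on the 0-100 scale) and return the n most frequent."""
    counts = Counter()
    for comment in comments:
        if comment["neg"] >= min_neg:
            words = re.findall(r"[a-z']+", comment["body"].lower())
            counts.update(w for w in words if w not in stopwords)
    return counts.most_common(n)
```

Running this once per subreddit, with each team's own comments, yields lists in the same `word (count)` shape as the results above.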

A few takeaways:

Dolphins fans exhibit negativity towards their entire division, but especially towards the Jets (as we learned in the Posts section).

Bills fans still miss Tyrod Taylor, and are less than thrilled with wideout Kelvin Benjamin.

Atlanta’s fanbase is fed up with their offensive coordinator.

Giants fans have seen enough of offensive tackle Ereck Flowers.

r/oaklandraiders is concerned with the recent play of Derek Carr. They also associate new coach Jon Gruden and opposing cornerback Aqib Talib with negativity.

Seahawks and Jets fans were both evidently displeased with the announcers of their respective games. The Bears and Eagles fanbases both seemed particularly upset with NBC commentator Cris Collinsworth.

Thank you for reading - I hope you enjoyed the project! For anyone interested in using PRAW to scrape Reddit, this tutorial can get you started.

All code is available on my GitHub: https://github.com/saisenberg