This paper presents a novel approach to analyzing discrepancies between actual calls and the league’s judgment of these foul situations. For the first time, differences in the assessment of game situations by the employer (NBA) and employees (referees) are compared on a call-by-call basis. This has the advantage of a very precise measurement of calls that can be broken down to the individual player committing a foul. Given that player characteristics like origin or star status are common knowledge, it opens the possibility of attributing biased decision making by referees to these player specifics. Previous investigations relied on the statistical frequency of calls and attributed peculiarities to biased decision making, potentially confounding biased decision making with actual differences in behavior by the players or teams.

To lessen favoritism, leagues have various instruments to promote unbiased decision making, of which the most powerful and presumably most costly is a monitoring system to supervise referees. Here, leagues evaluate the performance of their referees and tie chances for reappointment and promotion to the proper and impartial calling of games. Hence, financial incentives set by the NBA aim at unbiased game calling by the referees. What remains is a potential trade-off for the referees, who face contradicting expectations from their employer, fans, and third parties, each with an individual interest in the game outcome. Given the literature on referee bias in sports, impartial decision making by referees should not be taken for granted. The literature on referee biases reports favoritism towards home teams, players of the referees’ ethnicity, losing teams, and others. 2

Referees in the National Basketball Association (NBA) are hired by the league to judge games impartially. They evaluate in-game situations subjectively and are potentially prone to biases that are not in line with the league’s interest. 1 These biases of judgment can stem from personal preferences towards certain players or teams. Social payoffs in the form of home fans applauding calls in their team’s favor can serve as another kind of non-monetary reward. In recent years, cases of bribery made the press, like the 2007 NBA betting scandal surrounding former referee Tim Donaghy or the 2005 Bundesliga soccer scandal centered on former referee Robert Hoyzer. Both the Donaghy case and the Hoyzer case resulted in criminal proceedings, evidencing the overlap between referee bias and the legal issues that may result.

3 Data, descriptive statistics and results

Studies mentioned in the previous section share the approach of analyzing the statistical frequency of calls rather than analyzing call by call. Conclusions in the literature have been drawn without knowing whether referee decisions were correct, simply from how often they occur. The data at hand for this paper add significant value, as they contain detailed information on the correctness of foul calls as well as of material non-calls. Information is available for crucial game situations at the end of NBA games. Every call is reviewed by a senior referee manager or basketball operations manager and published online the day after the game is played. Information is available on the official website of the NBA at www.official.nba.com.

In this paper, calls are analyzed for every close game played during the 2014-15 regular season after March 1st, 2015. The NBA refers to games as close when no team is ahead by more than five points with two minutes or less to play, or in overtime (Deutscher, Frick, & Prinz, 2013). Out of the 356 regular season games during the period under observation, 113 fit this criterion. The data include 1229 calls and material non-calls that can be classified as displayed in Table 1. 3
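The sample selection rule can be sketched as a small predicate. This is only an illustration under a literal reading of the definition above: the function name, its parameters, and the treatment of the whole overtime period as qualifying are assumptions, not the league's documented procedure.

```python
# Thresholds taken from the definition in the text.
REGULATION_PERIODS = 4    # an NBA game has four regulation quarters
LATE_GAME_SECONDS = 120   # "two minutes or less to play"
MAX_MARGIN = 5            # "no team is ahead by more than five points"


def is_crucial_situation(period: int, seconds_left_in_period: int, margin: int) -> bool:
    """Return True if a game moment falls under the close-game definition.

    Assumption: any moment in overtime qualifies, following a literal
    reading of "two minutes or less to play, or overtime".
    """
    in_overtime = period > REGULATION_PERIODS
    late_regulation = (
        period == REGULATION_PERIODS
        and seconds_left_in_period <= LATE_GAME_SECONDS
    )
    return abs(margin) <= MAX_MARGIN and (late_regulation or in_overtime)


# Share of games in the observation period that meet the criterion.
close_games = 113
games_observed = 356
close_share = close_games / games_observed
print(f"share of close games: {close_share:.1%}")
```

Roughly a third of the games in the observation period therefore enter the sample.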

Table 1 reports the frequencies of actual decisions by the referee (foul called and no foul called) as well as the assessment by the league (foul committed and no foul committed). 496 out of 619 fouls identified by the league are correctly called by the referees (80.1 percent), while 593 of 610 no-foul situations identified by the league are correctly not called by the referees (97.2 percent). Because the share of incorrect calls in no-foul situations is so small, these cases are dismissed from the analysis, as the variation between players is too low. The following analysis focuses on fouls as assessed by the league that are either called or not called by the referee. Here, the dependent dummy variable correct call indicates whether the referee called the foul.
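The shares quoted above follow directly from the reported counts. The short check below recomputes them; the variable and function names are chosen here for illustration only.

```python
# Counts as reported in the text for Table 1
# (league assessment versus referee decision).
fouls_called = 496           # league: foul committed, referee: foul called
fouls_total = 619            # all fouls identified by the league
no_fouls_not_called = 593    # league: no foul, referee: no call
no_fouls_total = 610         # all no-foul situations identified by the league


def share(part: int, whole: int) -> float:
    """Return part / whole expressed in percent."""
    return 100.0 * part / whole


correct_foul_rate = share(fouls_called, fouls_total)
correct_no_foul_rate = share(no_fouls_not_called, no_fouls_total)
missed_foul_rate = share(fouls_total - fouls_called, fouls_total)
total_incidents = fouls_total + no_fouls_total

print(f"fouls correctly called:        {correct_foul_rate:.1f} percent")
print(f"no-fouls correctly not called: {correct_no_foul_rate:.1f} percent")
print(f"fouls missed:                  {missed_foul_rate:.1f} percent")
print(f"incidents in total:            {total_incidents}")
```

The two row totals add up to the 1229 calls and material non-calls in the data, and the missed-foul share is the complement of the correct-call share on fouls.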

To analyze potential biases described in the literature review, additional data from the website www.basketball-reference.com has been added for every incident documented by the league. The aim here is to systematically control for biases known from the literature, with the virtue of not having to rely on statistical frequencies of calls, but analyzing call-by-call decision making. Additionally, this is the first paper that combines different biases into one comprehensive approach. Control variables for frequently identified biases include whether the call is against the home (home) or away team (e.g. Anderson & Pierce, 2009), and whether the committing and fouled players can be referred to as a superstar (Star) or not (NonStar) (e.g. Caudill et al., 2014). To control for a potential own-nationality bias (Pope & Pope, 2015), this paper distinguishes whether the players’ origin is within the United States (US) or not (NonUS). It also captures the underdog/favorite (favorite) status of their teams (e.g. Dawson et al., 2007). I define players as superstars if their number of appearances in NBA all-star games is at least one standard deviation above the average value for the full sample (Frick, 2001). For this paper, a player needs to appear in at least three NBA all-star games to be referred to as a superstar, a requirement 6.6 percent of the players in the sample meet. Teams are classified as underdogs if their probability of winning was determined to be below 50 percent by the bookmaker prior to the game. Betting odds were drawn from the website betexplorer.com.
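The two classification rules can be sketched as follows. Several details are assumptions made for illustration and are not stated in the paper: the use of the population standard deviation, decimal odds as the odds format, and the normalization of inverse odds to strip the bookmaker margin. The all-star counts and odds in the usage lines are invented toy values.

```python
from statistics import mean, pstdev


def superstar_threshold(all_star_counts):
    """Mean plus one standard deviation of all-star appearances.

    Assumption: population standard deviation; the paper does not say
    whether the sample or population formula was used.
    """
    return mean(all_star_counts) + pstdev(all_star_counts)


def is_superstar(appearances, threshold):
    """A player is a superstar if appearances reach the threshold."""
    return appearances >= threshold


def implied_win_probability(decimal_odds_team, decimal_odds_opponent):
    """Win probability implied by decimal odds.

    Assumption: inverse odds are normalized to sum to one, which removes
    the bookmaker margin in a two-outcome market.
    """
    inv_team = 1.0 / decimal_odds_team
    inv_opponent = 1.0 / decimal_odds_opponent
    return inv_team / (inv_team + inv_opponent)


def is_underdog(decimal_odds_team, decimal_odds_opponent):
    """A team is an underdog if its implied win probability is below 50 percent."""
    return implied_win_probability(decimal_odds_team, decimal_odds_opponent) < 0.5


# Toy data (hypothetical, not from the paper).
toy_counts = [0, 0, 0, 0, 0, 0, 0, 1, 2, 12]
threshold = superstar_threshold(toy_counts)
print(is_superstar(12, threshold), is_superstar(2, threshold))
print(is_underdog(2.50, 1.55), is_underdog(1.55, 2.50))
```

In the paper's sample the same rule yields a cutoff of three all-star appearances; the toy data above produce a different cutoff and only demonstrate the mechanics.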

Further control variables include the number of seconds left in the game as well as crowd presence in the arena. Seconds left to play in the game (secondsleft) serve as an indicator of the importance of a situation and of the pressure on the referee’s decision. Crowd presence (crowd presence) is measured as the percentage of tickets sold and captures the possible social payoff to decisions in favor of the home team (Dohmen, 2008). As NBA arenas are very similar in their architecture, the necessity of further control variables is limited (Deutscher, 2011).

While every bias towards the home team would mean a negative bias against the away team, the introduction of superstar status and origin of players is more complex, as, for example, both the player committing a foul and the player being fouled could be a superstar. For superstar status as well as for the origin of the players, four constellations are possible for each foul, as the player committing the foul as well as the player being fouled can fit or not fit the criterion superstar or US origin. All possible constellations are displayed in Tables 2a and 2b, which label the respective dummy variables and give the number of observations to be included in the empirical analysis. This labeling serves as a novel approach, since this paper is the first to distinguish between players committing fouls and players being fouled (foul committed and no foul committed).
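The four constellations per characteristic translate into a set of dummy variables with one omitted reference category. The sketch below shows one way to build them. The label format ("Star vs NonStar") and the helper names are hypothetical and need not match the labels used in Tables 2a and 2b.

```python
from itertools import product


def constellation_label(committer_in_group, fouled_in_group, group, non_group):
    """Label a foul by the status of the committing and the fouled player."""
    left = group if committer_in_group else non_group
    right = group if fouled_in_group else non_group
    return f"{left} vs {right}"


def dummy_row(label, all_labels, baseline):
    """One-hot encode a constellation, omitting the baseline category."""
    return {name: int(name == label) for name in all_labels if name != baseline}


# All four constellations for superstar status and for origin.
STAR_LABELS = [
    constellation_label(c, f, "Star", "NonStar")
    for c, f in product([True, False], repeat=2)
]
ORIGIN_LABELS = [
    constellation_label(c, f, "US", "NonUS")
    for c, f in product([True, False], repeat=2)
]

# Example: a superstar fouls a non-star; non-star fouls non-star is the baseline.
label = constellation_label(True, False, "Star", "NonStar")
print(label)
print(dummy_row(label, STAR_LABELS, baseline="NonStar vs NonStar"))
```

Omitting one constellation per characteristic avoids perfect collinearity with the intercept, so each remaining coefficient is read relative to the omitted "neutral" case.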

Given this classification of fouls, Table 3 offers descriptive statistics for the 619 fouls that were either correct calls or incorrect non-calls. 19.7 percent of the fouls involved at least one superstar, while 38.1 percent of the fouls involved at least one foreign player.

To test foul calls for referee biases, attention turns to the dependent variable correct call. Its nature as a dummy variable suggests a logit approach (Cox, 1958). The independent variables account for potential referee biases towards home teams, superstar players, players with US origin and favorite teams, while seconds left to play and attendance serve as further control variables. Table 4 displays the logit estimations for correct calls in crucial game situations, where no team is ahead by more than five points and there are two minutes or less to play, or the game is in overtime. While Model 1 estimates the impact of the most common referee bias (home bias), subsequent models include further referee biases described in the previous sections.
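For orientation, the most comprehensive specification can be written roughly as follows. This is a sketch assembled from the variables named in the text, not a reproduction of Table 4. The index notation, the coefficient symbols, and the grouping of the constellation dummies into sums are assumptions, with one constellation per characteristic omitted as the reference category.

```latex
% Logit model for the probability that foul i is called correctly.
\Pr(\text{correct call}_i = 1 \mid x_i)
  = \Lambda(x_i'\beta)
  = \frac{\exp(x_i'\beta)}{1 + \exp(x_i'\beta)}

% Linear index: bias indicators plus the two further controls.
x_i'\beta = \beta_0
  + \beta_1\,\text{home}_i
  + \sum_{k=1}^{3} \gamma_k\,\text{StarConstellation}_{k,i}
  + \sum_{k=1}^{3} \delta_k\,\text{OriginConstellation}_{k,i}
  + \beta_2\,\text{favorite}_i
  + \beta_3\,\text{secondsleft}_i
  + \beta_4\,\text{crowd presence}_i
```

Under this reading, a referee bias of a given type would show up as a coefficient on the corresponding indicator that differs significantly from zero.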

Results are very robust throughout all models. The data provide no support for home bias, contrary to the vast majority of the literature. Against an average of 19.9 percent missed calls, no subgroup except underdogs exhibits a value that differs significantly at the 90 percent confidence level. Concerning favoritism towards superstars (which would be expected in “Star vs. Non-Star” or “Non-Star vs. Star” situations), no bias can be found compared to “neutral” foul situations (where a non-star fouls a non-star). Fouls in which a US player fouls, or is fouled by, a non-US player also provide no evidence of biased referee decision making. Compared to fouls without any player from the US, no systematic bias is detected by the estimations. Lastly, NBA referees show a weak preference towards underdog teams. Control variables capturing the attendance and time left to play are not significant in any model.

Throughout all models, referee bias of the type tested in this paper appears to be largely non-existent in the NBA in crucial game situations. Reasons can be manifold. First, for referees, the financial incentives attached to future reappointments can serve as an explanation for the results. If the league punishes biased decision making, referees have an incentive for impartial behavior. Second, the NBA could fear bad press in case biased referee decision making becomes publicly known. Referee bias as documented in academia by Price and Wolfers (2010) is not supported for later seasons (Pope, Price, & Wolfers, 2013). While no official statement by the NBA documents changes related to the results published by Price and Wolfers (2010), Pope, Price, and Wolfers (2013) is at least suggestive of an improvement in referee training or monitoring. Furthermore, the sample size is a potential problem for the estimation.

Using information on the assessment of calls by the league itself entails the potential problem of biased judgment. If the person judging the call ex post is biased in the same way as the referees, the estimations above would provide no support for biased judgment by the referees even if it exists. In economic terms, the question “Who monitors the monitor?” remains. The data have further limitations that should be mentioned. First, there is only information on calls in crucial situations of close games. Referee bias could prevail in foul calling earlier in games or in games decided early. Second, monitoring by the league would fail if referee decision making is only evaluated by the league for predictable game situations. Third, the NBA allows for video reviews late in games to reduce the probability of bad referee decisions. This again reduces the probability of bad calls, as certain calls can be revised.