Data Science, Politics, and Police

The intersection of science, politics, personal opinion, and social policy can be rather complex. This junction of ideas and disciplines is often rife with controversies, strongly held viewpoints, and agendas that are often more based on belief than on empirical evidence. Data science is particularly important in this area since it provides a methodology for examining the world in a pragmatic fact-first manner, and is capable of providing insight into some of the most important issues that we face today.

The recent high-profile police shootings of unarmed black men, such as Michael Brown (2014), Tamir Rice (2014), Anton Sterling (2016), and Philando Castile (2016), have triggered a divisive national dialog on the issue of racial bias in policing.

These shootings have spurred the growth of large social movements seeking to raise awareness of what is viewed as the systemic targeting of people-of-color by police forces across the country. On the other side of the political spectrum, many hold a view that the unbalanced targeting of non-white citizens is a myth created by the media based on a handful of extreme cases, and that these highly-publicized stories are not representative of the national norm.

In June 2017, a team of researchers at Stanford University collected and released an open-source data set of 60 million state police patrol stops from 20 states across the US. In this tutorial, we will walk through how to analyze and visualize this data using Python.

The source code and figures for this analysis can be found in the companion Github repository - https://github.com/triestpa/Police-Analysis-Python

To preview the completed IPython notebook, visit the page here.

This tutorial and analysis would not be possible without the work performed by The Stanford Open Policing Project. Much of the analysis performed in this tutorial is based on the work that has already performed by this team. A short tutorial for working with the data using the R programming language is provided on the official project website.

The Data

In the United States there are more than 50,000 traffic stops on a typical day. The potential number of data points for each stop is huge, from the demographics (age, race, gender) of the driver, to the location, time of day, stop reason, stop outcome, car model, and much more. Unfortunately, not every state makes this data available, and those that do often have different standards for which information is reported. Different counties and districts within each state can also be inconstant in how each traffic stop is recorded. The research team at Stanford has managed to gather traffic-stop data from twenty states, and has worked to regularize the reporting standards for 11 fields.

Stop Date

Stop Time

Stop Location

Driver Race

Driver Gender

Driver Age

Stop Reason

Search Conducted

Search Type

Contraband Found

Stop Outcome

Most states do not have data available for every field, but there is enough overlap between the data sets to provide a solid foundation for some very interesting analysis.

0 - Getting Started

We'll start with analyzing the data set for Vermont. We're looking at Vermont first for a few reasons.

The Vermont dataset is small enough to be very manageable and quick to operate on, with only 283,285 traffic stops (compared to the Texas data set, for instance, which contains almost 24 million records). There is not much missing data, as all eleven fields mentioned above are covered. Vermont is 94% white, but is also in a part of the country known for being very liberal (disclaimer - I grew up in the Boston area, and I've spent a quite a bit of time in Vermont). Many in this area consider this state to be very progressive and might like to believe that their state institutions are not as prone to systemic racism as the institutions in other parts of the country. It will be interesting to determine if the data validates this view.

0.0 - Download Datset

First, download the Vermont traffic stop data - https://stacks.stanford.edu/file/druid:py883nd2578/VT-clean.csv.gz

0.1 - Setup Project

Create a new directory for the project, say police-data-analysis , and move the downloaded file into a /data directory within the project.

0.2 - Optional: Create new virtualenv (or Anaconda) environment

If you want to keep your Python dependencies neat and separated between projects, now would be the time to create and activate a new environment for this analysis, using either virtualenv or Anaconda.

Here are some tutorials to help you get set up.

virtualenv - https://virtualenv.pypa.io/en/stable/

Anaconda - https://conda.io/docs/user-guide/install/index.html

0.3 - Install dependencies

We'll need to install a few Python packages to perform our analysis.

On the command line, run the following command to install the required libraries.

pip install numpy pandas matplotlib ipython jupyter

If you're using Anaconda, you can replace the pip command here with conda . Also, depending on your installation, you might need to use pip3 instead of pip in order to install the Python 3 versions of the packages.

0.4 - Start Jupyter Notebook

Start a new local Jupyter notebook server from the command line.

jupyter notebook

Open your browser to the specified URL (probably localhost:8888 , unless you have a special configuration) and create a new notebook.

I used Python 3.6 for writing this tutorial. If you want to use another Python version, that's fine, most of the code that we'll cover should work on any Python 2.x or 3.x distribution.

0.5 - Load Dependencies

In the first cell of the notebook, import our dependencies.

import pandas as pd import numpy as np import matplotlib.pyplot as plt %matplotlib inline figsize = (16,8)

We're also setting a shared variable figsize that we'll reuse later on in our data visualization logic.

0.6 - Load Dataset

In the next cell, load Vermont police stop data set into a Pandas dataframe.

df_vt = pd.read_csv('./data/VT-clean.csv.gz', compression='gzip', low_memory=False)

This command assumes that you are storing the data set in the data directory of the project. If you are not, you can adjust the data file path accordingly.

1 - Vermont Data Exploration

Now begins the fun part.

1.0 - Preview the Available Data

We can get a quick preview of the first ten rows of the data set with the head() method.

df_vt.head()

id state stop_date stop_time location_raw county_name county_fips fine_grained_location police_department driver_gender driver_age_raw driver_age driver_race_raw driver_race violation_raw violation search_conducted search_type_raw search_type contraband_found stop_outcome is_arrested officer_id is_white 0 VT-2010-00001 VT 2010-07-01 00:10 East Montpelier Washington County 50023.0 COUNTY RD MIDDLESEX VSP M 22.0 22.0 White White Moving Violation Moving violation False No Search Conducted N/A False Citation False -1.562157e+09 True 3 VT-2010-00004 VT 2010-07-01 00:11 Whiting Addison County 50001.0 N MAIN ST NEW HAVEN VSP F 18.0 18.0 White White Moving Violation Moving violation False No Search Conducted N/A False Arrest for Violation True -3.126844e+08 True 4 VT-2010-00005 VT 2010-07-01 00:35 Hardwick Caledonia County 50005.0 i91 nb mm 62 ROYALTON VSP M 18.0 18.0 White White Moving Violation Moving violation False No Search Conducted N/A False Written Warning False 9.225661e+08 True 5 VT-2010-00006 VT 2010-07-01 00:44 Hardwick Caledonia County 50005.0 64000 I 91 N; MM64 I 91 N ROYALTON VSP F 20.0 20.0 White White Vehicle Equipment Equipment False No Search Conducted N/A False Written Warning False -6.032327e+08 True 8 VT-2010-00009 VT 2010-07-01 01:10 Rochester Windsor County 50027.0 36000 I 91 S; MM36 I 91 S ROCKINGHAM VSP M 24.0 24.0 Black Black Moving Violation Moving violation False No Search Conducted N/A False Written Warning False 2.939526e+08 False

We can also list the available fields by reading the columns property.

df_vt.columns

Index(['id', 'state', 'stop_date', 'stop_time', 'location_raw', 'county_name', 'county_fips', 'fine_grained_location', 'police_department', 'driver_gender', 'driver_age_raw', 'driver_age', 'driver_race_raw', 'driver_race', 'violation_raw', 'violation', 'search_conducted', 'search_type_raw', 'search_type', 'contraband_found', 'stop_outcome', 'is_arrested', 'officer_id'], dtype='object')

1.1 - Drop Missing Values

Let's do a quick count of each column to determine how consistently populated the data is.

df_vt.count()

id 283285 state 283285 stop_date 283285 stop_time 283285 location_raw 282591 county_name 282580 county_fips 282580 fine_grained_location 282938 police_department 283285 driver_gender 281573 driver_age_raw 282114 driver_age 281999 driver_race_raw 279301 driver_race 278468 violation_raw 281107 violation 281107 search_conducted 283285 search_type_raw 281045 search_type 3419 contraband_found 283251 stop_outcome 280960 is_arrested 283285 officer_id 283273 dtype: int64

We can see that most columns have similar numbers of values besides search_type , which is not present for most of the rows, likely because most stops do not result in a search.

For our analysis, it will be best to have the exact same number of values for each field. We'll go ahead now and make sure that every single cell has a value.

# Fill missing search type values with placeholder df_vt['search_type'].fillna('N/A', inplace=True) # Drop rows with missing values df_vt.dropna(inplace=True) df_vt.count()

When we count the values again, we'll see that each column has the exact same number of entries.

id 273181 state 273181 stop_date 273181 stop_time 273181 location_raw 273181 county_name 273181 county_fips 273181 fine_grained_location 273181 police_department 273181 driver_gender 273181 driver_age_raw 273181 driver_age 273181 driver_race_raw 273181 driver_race 273181 violation_raw 273181 violation 273181 search_conducted 273181 search_type_raw 273181 search_type 273181 contraband_found 273181 stop_outcome 273181 is_arrested 273181 officer_id 273181 dtype: int64

1.2 - Stops By County

Let's get a list of all counties in the data set, along with how many traffic stops happened in each.

df_vt['county_name'].value_counts()

Windham County 37715 Windsor County 36464 Chittenden County 24815 Orange County 24679 Washington County 24633 Rutland County 22885 Addison County 22813 Bennington County 22250 Franklin County 19715 Caledonia County 16505 Orleans County 10344 Lamoille County 8604 Essex County 1239 Grand Isle County 520 Name: county_name, dtype: int64

If you're familiar with Vermont's geography, you'll notice that the police stops seem to be more concentrated in counties in the southern-half of the state. The southern-half of the state is also where much of the cross-state traffic flows in transit to and from New Hampshire, Massachusetts, and New York. Since the traffic stop data is from the state troopers, this interstate highway traffic could potentially explain why we see more traffic stops in these counties.

Here's a quick map generated with Tableau to visualize this regional distribution.

1.3 - Violations

We can also check out the distribution of traffic stop reasons.

df_vt['violation'].value_counts()

Moving violation 212100 Equipment 50600 Other 9768 DUI 711 Other (non-mapped) 2 Name: violation, dtype: int64

Unsurprisingly, the top reason for a traffic stop is Moving Violation (speeding, reckless driving, etc.), followed by Equipment (faulty lights, illegal modifications, etc.).

By using the violation_raw fields as reference, we can see that the Other category includes "Investigatory Stop" (the police have reason to suspect that the driver of the vehicle has committed a crime) and "Externally Generated Stop" (possibly as a result of a 911 call, or a referral from municipal police departments).

DUI ("driving under the influence", i.e. drunk driving) is surprisingly the least prevalent, with only 711 total recorded stops for this reason over the five year period (2010-2015) that the dataset covers. This seems low, since Vermont had 2,647 DUI arrests in 2015, so I suspect that a large proportion of these arrests were performed by municipal police departments, and/or began with a Moving Violation stop, instead of a more specific DUI stop.

1.4 - Outcomes

We can also examine the traffic stop outcomes.

df_vt['stop_outcome'].value_counts()

Written Warning 166488 Citation 103401 Arrest for Violation 3206 Warrant Arrest 76 Verbal Warning 10 Name: stop_outcome, dtype: int64

A majority of stops result in a written warning - which goes on the record but carries no direct penalty. A bit over 1/3 of the stops result in a citation (commonly known as a ticket), which comes with a direct fine and can carry other negative side-effects such as raising a driver's auto insurance premiums.

The decision to give a warning or a citation is often at the discretion of the police officer, so this could be a good source for studying bias.

1.5 - Stops By Gender

Let's break down the traffic stops by gender.

df_vt['driver_gender'].value_counts()

M 179678 F 101895 Name: driver_gender, dtype: int64

We can see that approximately 36% of the stops are of women drivers, and 64% are of men.

1.6 - Stops By Race

Let's also examine the distribution by race.

df_vt['driver_race'].value_counts()

White 266216 Black 5741 Asian 3607 Hispanic 2625 Other 279 Name: driver_race, dtype: int64

Most traffic stops are of white drivers, which is to be expected since Vermont is around 94% white (making it the 2nd-least diverse state in the nation, behind Maine). Since white drivers make up approximately 94% of the traffic stops, there's no obvious bias here for pulling over non-white drivers vs white drivers. Using the same methodology, however, we can also see that while black drivers make up roughly 2% of all traffic stops, only 1.3% of Vermont's population is black.

Let's keep on analyzing the data to see what else we can learn.

1.7 - Police Stop Frequency by Race and Age

It would be interesting to visualize how the frequency of police stops breaks down by both race and age.

fig, ax = plt.subplots() ax.set_xlim(15, 70) for race in df_vt['driver_race'].unique(): s = df_vt[df_vt['driver_race'] == race]['driver_age'] s.plot.kde(ax=ax, label=race) ax.legend()

We can see that young drivers in their late teens and early twenties are the most likely to be pulled over. Between ages 25 and 35, the stop rate of each demographic drops off quickly. As far as the racial comparison goes, the most interesting disparity is that for white drivers between the ages of 35 and 50 the pull-over rate stays mostly flat, whereas for other races it continues to drop steadily.

2 - Violation and Outcome Analysis

Now that we've got a feel for the dataset, we can start getting into some more advanced analysis.

One interesting topic that we touched on earlier is the fact that the decision to penalize a driver with a ticket or a citation is often at the discretion of the police officer. With this in mind, let's see if there are any discernable patterns in driver demographics and stop outcome.

2.0 - Analysis Helper Function

In order to assist in this analysis, we'll define a helper function to aggregate a few important statistics from our dataset.

citations_per_warning - The ratio of citations to warnings. A higher number signifies a greater likelihood of being ticketed instead of getting off with a warning.

- The ratio of citations to warnings. A higher number signifies a greater likelihood of being ticketed instead of getting off with a warning. arrest_rate - The percentage of stops that end in an arrest.

def compute_outcome_stats(df): """Compute statistics regarding the relative quanties of arrests, warnings, and citations""" n_total = len(df) n_warnings = len(df[df['stop_outcome'] == 'Written Warning']) n_citations = len(df[df['stop_outcome'] == 'Citation']) n_arrests = len(df[df['stop_outcome'] == 'Arrest for Violation']) citations_per_warning = n_citations / n_warnings arrest_rate = n_arrests / n_total return(pd.Series(data = { 'n_total': n_total, 'n_warnings': n_warnings, 'n_citations': n_citations, 'n_arrests': n_arrests, 'citations_per_warning': citations_per_warning, 'arrest_rate': arrest_rate }))

Let's test out this helper function by applying it to the entire dataframe.

compute_outcome_stats(df_vt)

arrest_rate 0.011721 citations_per_warning 0.620751 n_arrests 3199.000000 n_citations 103270.000000 n_total 272918.000000 n_warnings 166363.000000 dtype: float64

In the above result, we can see that about 1.17% of traffic stops result in an arrest, and there are on-average 0.62 citations (tickets) issued per warning. This data passes the sanity check, but it's too coarse to provide many interesting insights. Let's dig deeper.

2.1 - Breakdown By Gender

Using our helper function, along with the Pandas dataframe groupby method, we can easily compare these stats for male and female drivers.

df_vt.groupby('driver_gender').apply(compute_outcome_stats)

arrest_rate citations_per_warning n_arrests n_citations n_total n_warnings driver_gender F 0.007038 0.548033 697.0 34805.0 99036.0 63509.0 M 0.014389 0.665652 2502.0 68465.0 173882.0 102854.0

This is a simple example of the common split-apply-combine technique. We'll be building on this pattern for the remainder of the tutorial, so make sure that you understand how this comparison table is generated before continuing.

We can see here that men are, on average, twice as likely to be arrested during a traffic stop, and are also slightly more likely to be given a citation than women. It is, of course, not clear from the data whether this is indicative of any bias by the police officers, or if it reflects that men are being pulled over for more serious offenses than women on average.

2.2 - Breakdown By Race

Let's now compute the same comparison, grouping by race.

df_vt.groupby('driver_race').apply(compute_outcome_stats)

arrest_rate citations_per_warning n_arrests n_citations n_total n_warnings driver_race Asian 0.006384 1.002339 22.0 1714.0 3446.0 1710.0 Black 0.019925 0.802379 111.0 2428.0 5571.0 3026.0 Hispanic 0.016393 0.865827 42.0 1168.0 2562.0 1349.0 White 0.011571 0.611188 3024.0 97960.0 261339.0 160278.0

Ok, this is interesting. We can see that Asian drivers are arrested at the lowest rate, but receive tickets at the highest rate (roughly 1 ticket per warning). Black and Hispanic drivers are both arrested at a higher rate and ticketed at a higher rate than white drivers.

Let's visualize these results.

race_agg = df_vt.groupby(['driver_race']).apply(compute_outcome_stats) fig, axes = plt.subplots(nrows=2, ncols=1, figsize=figsize) race_agg['citations_per_warning'].plot.barh(ax=axes[0], figsize=figsize, title="Citation Rate By Race") race_agg['arrest_rate'].plot.barh(ax=axes[1], figsize=figsize, title='Arrest Rate By Race')

2.3 - Group By Outcome and Violation

We'll deepen our analysis by grouping each statistic by the violation that triggered the traffic stop.

df_vt.groupby(['driver_race','violation']).apply(compute_outcome_stats)

arrest_rate citations_per_warning n_arrests n_citations n_total n_warnings driver_race violation Asian DUI 0.200000 0.333333 2.0 2.0 10.0 6.0 Equipment 0.006270 0.132143 2.0 37.0 319.0 280.0 Moving violation 0.005563 1.183190 17.0 1647.0 3056.0 1392.0 Other 0.016393 0.875000 1.0 28.0 61.0 32.0 Black DUI 0.200000 0.142857 2.0 1.0 10.0 7.0 Equipment 0.029181 0.220651 26.0 156.0 891.0 707.0 Moving violation 0.016052 0.942385 71.0 2110.0 4423.0 2239.0 Other 0.048583 2.205479 12.0 161.0 247.0 73.0 Hispanic DUI 0.200000 3.000000 2.0 6.0 10.0 2.0 Equipment 0.023560 0.187898 9.0 59.0 382.0 314.0 Moving violation 0.012422 1.058824 26.0 1062.0 2093.0 1003.0 Other 0.064935 1.366667 5.0 41.0 77.0 30.0 White DUI 0.192364 0.455026 131.0 172.0 681.0 378.0 Equipment 0.012233 0.190486 599.0 7736.0 48965.0 40612.0 Moving violation 0.008635 0.732720 1747.0 84797.0 202321.0 115729.0 Other 0.058378 1.476672 547.0 5254.0 9370.0 3558.0 Other (non-mapped) 0.000000 1.000000 0.0 1.0 2.0 1.0

Ok, well this table looks interesting, but it's rather large and visually overwhelming. Let's trim down that dataset in order to retrieve a more focused subset of information.

# Create new column to represent whether the driver is white df_vt['is_white'] = df_vt['driver_race'] == 'White' # Remove violation with too few data points df_vt_filtered = df_vt[~df_vt['violation'].isin(['Other (non-mapped)', 'DUI'])]

We're generating a new column to represent whether or not the driver is white. We are also generating a filtered version of the dataframe that strips out the two violation types with the fewest datapoints.

We not assigning the filtered dataframe to df_vt since we'll want to keep using the complete unfiltered dataset in the next sections.

Let's redo our race + violation aggregation now, using our filtered dataset.

df_vt_filtered.groupby(['is_white','violation']).apply(compute_outcome_stats)

arrest_rate citations_per_warning n_arrests n_citations n_total n_warnings is_white violation False Equipment 0.023241 0.193697 37.0 252.0 1592.0 1301.0 Moving violation 0.011910 1.039922 114.0 4819.0 9572.0 4634.0 Other 0.046753 1.703704 18.0 230.0 385.0 135.0 True Equipment 0.012233 0.190486 599.0 7736.0 48965.0 40612.0 Moving violation 0.008635 0.732720 1747.0 84797.0 202321.0 115729.0 Other 0.058378 1.476672 547.0 5254.0 9370.0 3558.0

Ok great, this is much easier to read.

In the above table, we can see that non-white drivers are more likely to be arrested during a stop that was initiated due to an equipment or moving violation, but white drivers are more likely to be arrested for a traffic stop resulting from "Other" reasons. Non-white drivers are more likely than white drivers to be given tickets for each violation.

2.4 - Visualize Stop Outcome and Violation Results

Let's generate a bar chart now in order to visualize this data broken down by race.

race_stats = df_vt_filtered.groupby(['violation', 'driver_race']).apply(compute_outcome_stats).unstack() fig, axes = plt.subplots(nrows=1, ncols=2, figsize=figsize) race_stats.plot.bar(y='arrest_rate', ax=axes[0], title='Arrest Rate By Race and Violation') race_stats.plot.bar(y='citations_per_warning', ax=axes[1], title='Citations Per Warning By Race and Violation')

We can see in these charts that Hispanic and Black drivers are generally arrested at a higher rate than white drivers (with the exception of the rather ambiguous "Other" category). and that Black drivers are more likely, across the board, to be issued a citation than white drivers. Asian drivers are arrested at very low rates, and their citation rates are highly variable.

These results are compelling, and are suggestive of potential racial bias, but they are too inconsistent across violation types to provide any definitive answers. Let's dig deeper to see what else we can find.

3 - Search Outcome Analysis

Two of the more interesting fields available to us are search_conducted and contraband_found .

In the analysis by the "Stanford Open Policing Project", they use these two fields to perform what is known as an "outcome test".

On the project website, the "outcome test" is summarized clearly.

In the 1950s, the Nobel prize-winning economist Gary Becker proposed an elegant method to test for bias in search decisions: the outcome test. Becker proposed looking at search outcomes. If officers don’t discriminate, he argued, they should find contraband — like illegal drugs or weapons — on searched minorities at the same rate as on searched whites. If searches of minorities turn up contraband at lower rates than searches of whites, the outcome test suggests officers are applying a double standard, searching minorities on the basis of less evidence."

Findings, Stanford Open Policing Project

The authors of the project also make the point that only using the "hit rate", or the rate of searches where contraband is found, can be misleading. For this reason, we'll also need to use the "search rate" in our analysis - the rate at which a traffic stop results in a search.

We'll now use the available data to perform our own outcome test, in order to determine whether minorities in Vermont are routinely searched on the basis of less evidence than white drivers.

3.0 Compute Search Rate and Hit Rate

We'll define a new function to compute the search rate and hit rate for the traffic stops in our dataframe.

Search Rate - The rate at which a traffic stop results in a search. A search rate of 0.20 would signify that out of 100 traffic stops, 20 resulted in a search.

- The rate at which a traffic stop results in a search. A search rate of would signify that out of 100 traffic stops, 20 resulted in a search. Hit Rate - The rate at which contraband is found in a search. A hit rate of 0.80 would signify that out of 100 searches, 80 searches resulted in contraband (drugs, unregistered weapons, etc.) being found.

def compute_search_stats(df): """Compute the search rate and hit rate""" search_conducted = df['search_conducted'] contraband_found = df['contraband_found'] n_stops = len(search_conducted) n_searches = sum(search_conducted) n_hits = sum(contraband_found) # Filter out counties with too few stops if (n_stops) < 50: search_rate = None else: search_rate = n_searches / n_stops # Filter out counties with too few searches if (n_searches) < 5: hit_rate = None else: hit_rate = n_hits / n_searches return(pd.Series(data = { 'n_stops': n_stops, 'n_searches': n_searches, 'n_hits': n_hits, 'search_rate': search_rate, 'hit_rate': hit_rate }))

3.1 - Compute Search Stats For Entire Dataset

We can test our new function to determine the search rate and hit rate for the entire state.

compute_search_stats(df_vt)

hit_rate 0.796865 n_hits 2593.000000 n_searches 3254.000000 n_stops 272918.000000 search_rate 0.011923 dtype: float64

Here we can see that each traffic stop had a 1.2% change of resulting in a search, and each search had an 80% chance of yielding contraband.

3.2 - Compare Search Stats By Driver Gender

Using the Pandas groupby method, we can compute how the search stats differ by gender.

df_vt.groupby('driver_gender').apply(compute_search_stats)

hit_rate n_hits n_searches n_stops search_rate driver_gender F 0.789392 506.0 641.0 99036.0 0.006472 M 0.798699 2087.0 2613.0 173882.0 0.015027

We can see here that men are three times as likely to be searched as women, and that 80% of searches for both genders resulted in contraband being found. The data shows that men are searched and caught with contraband more often than women, but it is unclear whether there is any gender discrimination in deciding who to search since the hit rate is equal.

3.3 - Compare Search Stats By Age

We can split the dataset into age buckets and perform the same analysis.

age_groups = pd.cut(df_vt["driver_age"], np.arange(15, 70, 5)) df_vt.groupby(age_groups).apply(compute_search_stats)

hit_rate n_hits n_searches n_stops search_rate driver_age (15, 20] 0.847988 569.0 671.0 27418.0 0.024473 (20, 25] 0.838000 838.0 1000.0 43275.0 0.023108 (25, 30] 0.788462 492.0 624.0 34759.0 0.017952 (30, 35] 0.766756 286.0 373.0 27746.0 0.013443 (35, 40] 0.742991 159.0 214.0 23203.0 0.009223 (40, 45] 0.692913 88.0 127.0 24055.0 0.005280 (45, 50] 0.575472 61.0 106.0 24103.0 0.004398 (50, 55] 0.706667 53.0 75.0 22517.0 0.003331 (55, 60] 0.833333 30.0 36.0 17502.0 0.002057 (60, 65] 0.500000 6.0 12.0 12514.0 0.000959

We can see here that the search rate steadily declines as drivers get older, and that the hit rate also declines rapidly for older drivers.

3.4 - Compare Search Stats By Race

Now for the most interesting part - comparing search data by race.

df_vt.groupby('driver_race').apply(compute_search_stats)

hit_rate n_hits n_searches n_stops search_rate driver_race Asian 0.785714 22.0 28.0 3446.0 0.008125 Black 0.686620 195.0 284.0 5571.0 0.050978 Hispanic 0.644231 67.0 104.0 2562.0 0.040593 White 0.813601 2309.0 2838.0 261339.0 0.010859

Black and Hispanic drivers are searched at much higher rates than White drivers (5% and 4% of traffic stops respectively, versus 1% for white drivers), but the searches of these drivers only yield contraband 60-70% of the time, compared to 80% of the time for White drivers.

Let's rephrase these results.

Black drivers are 500% more likely to be searched than white drivers during a traffic stop, but are 13% less likely to be caught with contraband in the event of a search.

Hispanic drivers are 400% more likely to be searched than white drivers during a traffic stop, but are 17% less likely to be caught with contraband in the event of a search.

3.5 - Compare Search Stats By Race and Location

Let's add in location as another factor. It's possible that some counties (such as those with larger towns or with interstate highways where opioid trafficking is prevalent) have a much higher search rate / lower hit rates for both white and non-white drivers, but also have greater racial diversity, leading to distortion in the overall stats. By controlling for location, we can determine if this is the case.

We'll define three new helper functions to generate the visualizations.

def generate_comparison_scatter(df, ax, state, race, field, color): """Generate scatter plot comparing field for white drivers with minority drivers""" race_location_agg = df.groupby(['county_fips','driver_race']).apply(compute_search_stats).reset_index().dropna() race_location_agg = race_location_agg.pivot(index='county_fips', columns='driver_race', values=field) ax = race_location_agg.plot.scatter(ax=ax, x='White', y=race, s=150, label=race, color=color) return ax def format_scatter_chart(ax, state, field): """Format and label to scatter chart""" ax.set_xlabel('{} - White'.format(field)) ax.set_ylabel('{} - Non-White'.format(field, race)) ax.set_title("{} By County - {}".format(field, state)) lim = max(ax.get_xlim()[1], ax.get_ylim()[1]) ax.set_xlim(0, lim) ax.set_ylim(0, lim) diag_line, = ax.plot(ax.get_xlim(), ax.get_ylim(), ls="--", c=".3") ax.legend() return ax def generate_comparison_scatters(df, state): """Generate scatter plots comparing search rates of white drivers with black and hispanic drivers""" fig, axes = plt.subplots(nrows=1, ncols=2, figsize=figsize) generate_comparison_scatter(df, axes[0], state, 'Black', 'search_rate', 'red') generate_comparison_scatter(df, axes[0], state, 'Hispanic', 'search_rate', 'orange') generate_comparison_scatter(df, axes[0], state, 'Asian', 'search_rate', 'green') format_scatter_chart(axes[0], state, 'Search Rate') generate_comparison_scatter(df, axes[1], state, 'Black', 'hit_rate', 'red') generate_comparison_scatter(df, axes[1], state, 'Hispanic', 'hit_rate', 'orange') generate_comparison_scatter(df, axes[1], state, 'Asian', 'hit_rate', 'green') format_scatter_chart(axes[1], state, 'Hit Rate') return fig

We can now generate the scatter plots using the generate_comparison_scatters function.

generate_comparison_scatters(df_vt, 'VT')

The plots above are comparing search_rate (left) and hit_rate (right) for minority drivers compared with white drivers in each county. If all of the dots (each of which represents the stats for a single county and race) followed the diagonal center line, the implication would be that white drivers and non-white drivers are searched at the exact same rate with the exact same standard of evidence.

Unfortunately, this is not the case. In the above charts, we can see that, for every county, the search rate is higher for Black and Hispanic drivers even though the hit rate is lower.

Let's define one more visualization helper function, to show all of these results on a single scatter plot.

def generate_county_search_stats_scatter(df, state): """Generate a scatter plot of search rate vs. hit rate by race and county""" race_location_agg = df.groupby(['county_fips','driver_race']).apply(compute_search_stats) colors = ['blue','orange','red', 'green'] fig, ax = plt.subplots(figsize=figsize) for c, frame in race_location_agg.groupby(level='driver_race'): ax.scatter(x=frame['hit_rate'], y=frame['search_rate'], s=150, label=c, color=colors.pop()) ax.legend(loc='upper center', bbox_to_anchor=(0.5, 1.2), ncol=4, fancybox=True) ax.set_xlabel('Hit Rate') ax.set_ylabel('Search Rate') ax.set_title("Search Stats By County and Race - {}".format(state)) return fig

generate_county_search_stats_scatter(df_vt, "VT")

As the old idiom goes - a picture is worth a thousand words. The above chart is one of those pictures - and the name of the picture is "Systemic Racism".

The search rates and hit rates for white drivers in most counties are consistently clustered around 80% and 1% respectively. We can see, however, that nearly every county searches Black and Hispanic drivers at a higher rate, and that these searches uniformly have a lower hit rate than those on White drivers.

This state-wide pattern of a higher search rate combined with a lower hit rate suggests that a lower standard of evidence is used when deciding to search Black and Hispanic drivers compared to when searching White drivers.

You might notice that only one county is represented by Asian drivers - this is due to the lack of data for searches of Asian drivers in other counties.

4 - Analyzing Other States

Vermont is a great state to test out our analysis on, but the dataset size is relatively small. Let's now perform the same analysis on other states to determine if this pattern persists across state lines.

4.0 - Massachusetts

First we'll generate the analysis for my home state, Massachusetts. This time we'll have more data to work with - roughly 3.4 million traffic stops.

Download the dataset to your project's /data directory - https://stacks.stanford.edu/file/druid:py883nd2578/MA-clean.csv.gz

We've developed a solid reusable formula for reading and visualizing each state's dataset, so let's wrap the entire recipe in a new helper function.

fields = ['county_fips', 'driver_race', 'search_conducted', 'contraband_found'] types = { 'contraband_found': bool, 'county_fips': float, 'driver_race': object, 'search_conducted': bool } def analyze_state_data(state): df = pd.read_csv('./data/{}-clean.csv.gz'.format(state), compression='gzip', low_memory=True, dtype=types, usecols=fields) df.dropna(inplace=True) df = df[df['driver_race'] != 'Other'] generate_comparison_scatters(df, state) generate_county_search_stats_scatter(df, state) return df.groupby('driver_race').apply(compute_search_stats)

We're making a few optimizations here in order to make the analysis a bit more streamlined and computationally efficient. By only reading the four columns that we're interested in, and by specifying the datatypes ahead of time, we'll be able to read larger datasets into memory more quickly.

analyze_state_data('MA')

The first output is a statewide table of search rate and hit rate by race.

hit_rate n_hits n_searches n_stops search_rate driver_race Asian 0.331169 357.0 1078.0 101942.0 0.010575 Black 0.487150 4170.0 8560.0 350498.0 0.024422 Hispanic 0.449502 5007.0 11139.0 337782.0 0.032977 White 0.523037 18220.0 34835.0 2527393.0 0.013783

We can see here again that Black and Hispanic drivers are searched at significantly higher rates than white drivers. The differences in hit rates are not as extreme as in Vermont, but they are still noticeably lower for Black and Hispanic drivers than for White drivers. Asian drivers, interestingly, are the least likely to be searched and also the least likely to have contraband if they are searched.





If we compare the stats for MA to VT, we'll also notice that police in MA seem to use a much lower standard of evidence when searching a vehicle, with their searches averaging around a 50% hit rate, compared to 80% in VT.

The trend here is much less obvious than in Vermont, but it is still clear that traffic stops of Black and Hispanic drivers are more likely to result in a search, despite the fact the searches of White drivers are more likely to result in contraband being found.

4.1 - Wisconsin & Connecticut

Wisconsin and Connecticut have been named as some of the worst states in America for racial disparities. Let's see how their police stats stack up.

Again, you'll need to download the Wisconsin and Connecticut dataset to your project's /data directory.

We can call our analyze_state_data function for Wisconsin once the dataset has been downloaded.

analyze_state_data('WI')

hit_rate n_hits n_searches n_stops search_rate driver_race Asian 0.470817 121.0 257.0 24577.0 0.010457 Black 0.477574 1299.0 2720.0 56050.0 0.048528 Hispanic 0.415741 449.0 1080.0 35210.0 0.030673 White 0.526300 5103.0 9696.0 778227.0 0.012459

The trends here are starting to look familiar. White drivers in Wisconsin are much less likely to be searched than non-white drivers (aside from Asians, who tend to be searched at around the same rates as whites). Searches of non-white drivers are, again, less likely to yield contraband than searches on white drivers.





We can see here, yet again, that the standard of evidence for searching Black and Hispanic drivers is lower in virtually every county than for White drivers. In one outlying county, almost 25% (!) of traffic stops for Black drivers resulted in a search, even though only half of those searches yielded contraband.

Let's do the same analysis for Connecticut

analyze_state_data('CT')

hit_rate n_hits n_searches n_stops search_rate driver_race Asian 0.384615 10.0 26.0 5949.0 0.004370 Black 0.284072 346.0 1218.0 37460.0 0.032515 Hispanic 0.291925 282.0 966.0 31154.0 0.031007 White 0.379344 1179.0 3108.0 242314.0 0.012826





Again, the pattern persists.

4.2 - Arizona

We can generate each result rather quickly for each state (with available data), once we've downloaded each dataset.

analyze_state_data('AZ')

hit_rate n_hits n_searches n_stops search_rate driver_race Asian 0.196664 224.0 1139.0 48177.0 0.023642 Black 0.255548 2188.0 8562.0 116795.0 0.073308 Hispanic 0.160930 5943.0 36929.0 501619.0 0.073620 White 0.242564 9288.0 38291.0 1212652.0 0.031576





4.3 - Colorado

analyze_state_data('CO')

hit_rate n_hits n_searches n_stops search_rate driver_race Asian 0.537634 50.0 93.0 32471.0 0.002864 Black 0.481283 270.0 561.0 71965.0 0.007795 Hispanic 0.450454 1041.0 2311.0 308499.0 0.007491 White 0.651388 3638.0 5585.0 1767804.0 0.003159





4.4 - Washington

analyze_state_data('WA')

hit_rate n_hits n_searches n_stops search_rate driver_race Asian 0.087143 608.0 6977.0 352063.0 0.019817 Black 0.130799 1717.0 13127.0 254577.0 0.051564 Hispanic 0.103366 2128.0 20587.0 502254.0 0.040989 White 0.156008 15768.0 101072.0 4279273.0 0.023619





4.5 - North Carolina

analyze_state_data('NC')

hit_rate n_hits n_searches n_stops search_rate driver_race Asian 0.104377 31.0 297.0 46287.0 0.006416 Black 0.182489 1955.0 10713.0 1222533.0 0.008763 Hispanic 0.119330 776.0 6503.0 368878.0 0.017629 White 0.153850 3387.0 22015.0 3146302.0 0.006997





4.6 - Texas

You might want to let this one run while you go fix yourself a cup of coffee or tea. At almost 24 million traffic stops, the Texas dataset takes a rather long time to process.

analyze_state_data('TX')

hit_rate n_hits n_searches n_stops search_rate driver_race Asian 0.289271 976.0 3374.0 349105.0 0.009665 Black 0.345983 27588.0 79738.0 2300427.0 0.034662 Hispanic 0.219449 37080.0 168969.0 6525365.0 0.025894 White 0.335098 83157.0 248157.0 13576726.0 0.018278





4.7 - Even more data visualizations

I highly recommend that you visit the Stanford Open Policing Project results page for more visualizations of this data. Here you can browse the search outcome results for all available states, and explore additional analysis that the researchers have performed such as stop rate by race (using county population demographics data) as well as the effects of recreational marijuana legalization on search rates.

5 - What next?

Do these results imply that all police officers are overtly racist? No.

Do they show that Black and Hispanic drivers are searched much more frequently than white drivers, often with a lower standard of evidence? Yes.

What we are observing here appears to be a pattern of systemic racism. The racial disparities revealed in this analysis are a reflection of an entrenched mistrust of certain minorities in the United States. The data and accompanying analysis are indicative of social trends that are certainly not limited to police officers. Racial discrimination is present at all levels of society from retail stores to the tech industry to academia.

We are able to empirically identify these trends only because state police deparments (and the Open Policing team at Stanford) have made this data available to the public; no similar datasets exist for most other professions and industries. Releasing datasets about these issues is commendable (but sadly still somewhat uncommon, especially in the private sector) and will help to further identify where these disparities exist, and to influence policies in order to provide a fair, effective way to counteract these biases.

To see the full official analysis for all 20 available states, check out the official findings paper here - https://5harad.com/papers/traffic-stops.pdf.

I hope that this tutorial has provided the tools you might need to take this analysis further. There's a lot more that you can do with the data than what we've covered here.

Analyze police stops for your home state and county (if the data is available). If the data is not available, submit a formal request to your local representatives and institutions that the data be made public.

Combine your analysis with US census data on the demographic, social, and economic stats about each county.

Create a web app to display the county trends on an interactive map.

Build a mobile app to warn drivers when they're entering an area that appears to be more distrusting of drivers of a certain race.

Open-source your own analysis, spread your findings, seek out peer review, maybe even write an explanatory blog post.

The source code and figures for this analysis can be found in the companion Github repository - https://github.com/triestpa/Police-Analysis-Python

To view the completed IPython notebook, visit the page here.

The code for this project is 100% open source (MIT license), so feel free to use it however you see fit in your own projects.

As always, please feel free to comment below with any questions, comments, or criticisms.