An increasing number of minority youth experience contact with the criminal justice system. But how does the expansion of police presence in poor urban communities affect educational outcomes? Previous research points at multiple mechanisms with opposing effects. This article presents the first causal evidence of the impact of aggressive policing on minority youths’ educational performance. Under Operation Impact, the New York Police Department (NYPD) saturated high-crime areas with additional police officers with the mission to engage in aggressive, order-maintenance policing. To estimate the effect of this policing program, we use administrative data from more than 250,000 adolescents age 9 to 15 and a difference-in-differences approach based on variation in the timing of police surges across neighborhoods. We find that exposure to police surges significantly reduced test scores for African American boys, consistent with their greater exposure to policing. The size of the effect increases with age, but there is no discernible effect for African American girls and Hispanic students. Aggressive policing can thus lower educational performance for some minority groups. These findings provide evidence that the consequences of policing extend into key domains of social life, with implications for the educational trajectories of minority youth and social inequality more broadly.

Over the past three decades, cities across the United States have adopted strategies known as proactive or broken-windows policing, with a focus on strict enforcement of low-level crimes and extensive use of pedestrian stops (Fagan et al. 2016; Greene 2000; Kohler-Hausmann 2013; Kubrin et al. 2010; Weisburd and Majmundar 2018). As a consequence of these changes in the strategies and tactics of street policing, an increasing number of minority youth have involuntary contact with the criminal justice system (Brame et al. 2012; Hagan, Shedd, and Payne 2005). By age 18, between 15.9 and 26.8 percent of the U.S. population has been arrested at least once (Brame et al. 2012). In a recent representative survey of 15-year-old urban youth, 39 percent of African American boys, compared to 23 percent of White boys, reported they were stopped by the police at least once (Geller 2018). In New York City, the police conducted more than 4 million pedestrian stops between 2004 and 2012, with more than half concentrated among persons younger than 25 years of age (Fagan et al. 2010). Similar practices exist in major cities across the country (Weisburd and Majmundar 2018).

Investments in policing, including some forms of proactive policing, are credited with reductions in crime, but we know less about the social costs of policing (Weisburd and Majmundar 2018). What are the consequences of the increasing presence of police in minority communities for minority youths’ educational performance? Previous research points to multiple mechanisms with opposing effects. First, policing may have a positive effect by reducing the level of neighborhood crime and violence, which in turn increases school performance. Second, aggressive, broken-windows policing may have negative effects by undermining trust in authorities, including schools and teachers, and by leading to withdrawal and system avoidance. High rates of direct or indirect contact with police may also create stress and other health and emotional responses that undermine cognitive performance. Despite minority youths’ increasing exposure to the police and contradictory previous research, there is no convincing causal evidence about the effects of proactive policing on minority youths’ educational performance.

To address this question, we focus on New York Police Department’s (NYPD) Operation Impact, a policing program in New York City that substantially increased the intensity of broken-windows policing in selected neighborhoods. Our design exploits the staggered implementation of Operation Impact, which quickly increased the number of police officers in high-crime areas designated as impact zones at different points in time (Golden and Almo 2004). Starting in January 2004, the NYPD deployed around 1,500 recent police academy graduates to impact zones with the mission to engage in aggressive order-maintenance policing. These officers targeted disorderly behaviors through strict enforcement of low-level crimes and extensive use of pedestrian stops. The high concentration of officers in impact zones produced a substantial increase in policing activity and a modest decrease in violent crime. Between 2004 and 2012, the NYPD continuously modified the program over 15 phases by expanding, moving, removing, or adding impact zones roughly every six months. Over the duration of the program, 18.3 percent of African American, 14.6 percent of Hispanic, and .7 percent of White elementary and middle public-school students in New York City were exposed to impact zones at least once.

To estimate the effect of Operation Impact on educational performance, we link information on impact zones with administrative data from the New York City Department of Education (NYCDOE) on public-school students from the school years 2003/2004 to 2011/2012. We use a difference-in-differences (DD) approach that exploits the longitudinal structure of the data and variation in the timing of police surges across neighborhoods (Meyer 1995). Focusing on students’ residential context, our approach compares changes in test scores before, during, and after Operation Impact for areas affected by the intervention to the same differences for areas designated as impact zones at a different point in time. The analysis conditions on the level of prior crime, because it was the most important criterion for the selection of impact zones.

The findings show that Operation Impact lowered the educational performance of African American boys, which has implications for child development, economic mobility, and racial inequality. The effect size varies by race, gender, and age. It is substantial for African American boys age 13 to 15, and small and statistically insignificant for other groups. A series of supplementary analyses support the plausibility of the design and rule out violent crime as an alternative explanation. Additional analyses provide initial evidence on the underlying mechanisms but are limited by the lack of data on student health. They show that Operation Impact reduced crime, providing evidence for a positive channel through lower crime rates; and they show that Operation Impact reduced school attendance, indicating that system avoidance is a possible mechanism. Considering the significant racial disparities in police contact (Fagan et al. 2010; Hagan et al. 2005; Legewie 2016), these findings suggest aggressive policing strategies and tactics can lower educational performance and perpetuate racial inequalities in educational outcomes. They reveal consequences of policing that extend into key domains of social life.

NYPD’s Operation Impact

In 2004, the NYPD launched Operation Impact, a tactic designed to maximize police investigative stops in areas designated as “impact zones” (Golden and Almo 2004). Operation Impact was a second-generation enforcement tactic that replaced the Street Crime Unit, or SCU. SCU was created in 1994 and expanded citywide in 1997 (White and Fradella 2016). SCU officers roamed the city and conducted intensive stop activity under the NYPD stop-and-frisk program, targeting small “high-crime areas” identified through a combination of police intelligence and data analytics. Under Operation Impact, these activities were rationalized through crime analysis to focus on specific locations or “hot spots,” as well as days of the week and times of day when criminal activity was highest. Impact zones ranged in size from very small areas, such as residential buildings or public housing sites, to areas as large as entire precincts. Impact zones were located in areas predominantly populated by Black and Latino New Yorkers. Over the years, the NYPD implemented the program through 15 consecutive phases by expanding, moving, removing, or adding impact zones roughly every six months (see Figure S1 in the online supplement for the rollout of impact zones over time). Between 2004 and 2012, 75 of the 76 New York City Police precincts had one or more impact zones (MacDonald et al. 2016). On average, areas remained designated as impact zones for 12.3 months, but the duration ranged from 5.3 months to 7.5 years.

Additional officers beyond regular precinct deployments were assigned to impact zones. From the outset, roughly two-thirds of graduating classes from the police academy were assigned to impact zones (Golden and Almo 2004), while the overall number of sworn officers declined slightly between 2003 and 2013. Supervisors encouraged these rookie officers to conduct high volumes of investigative stops.
In addition to suspicion-based stops, officers were encouraged to make arrests for low-level offenses, issue warrants for minor non-criminal infractions (e.g., open containers of alcohol), and conduct other stops as pretexts to search for persons with outstanding warrants (Barrett 1998). The high concentration of officers in impact zones produced a substantial increase in policing activity, which quickly returned to previous levels after areas were removed from Operation Impact (see the online supplement for a detailed analysis on the effect of Operation Impact on police activity). The number of pedestrian stops increased by 33.2 percent. Arrests for low-level offenses rose by 11.0 percent for misdemeanors and by 29.7 percent for violations; felony arrests remained largely the same. This increase in police activity was uneven by race. The number of pedestrian stops increased by 35.1 percent for African Americans, 25.2 percent for Hispanics, and 22.2 percent for Whites, with a similar pattern for misdemeanors and violation arrests.

Data and Methods

Our analyses rely on two sources of information. The first is administrative school district records from the New York City Department of Education (NYCDOE) assembled by the Research Alliance for New York City Schools. The database consists of administrative student-level records for all public-school students in New York City in grades K to 8 from the school year 2003/2004 to 2011/2012. Records include the school and grade identifier for the fall and spring terms and a limited number of standard demographic characteristics, such as race/ethnicity, gender, date of birth, eligibility for free lunch as a measure of parental socioeconomic background, limited English-learner status as a measure of immigrant status, yearly test-score measures for language and math in grades 3 through 8, students’ residential neighborhood, and, starting in 2007, survey data from the NYC Learning Environment Survey.

The second data source includes pedestrian stops, crime complaints, arrests, and information on Operation Impact from the New York Police Department. Pedestrian stops are based on the “Stop, Question, and Frisk” program and include records on 4.6 million time- and geocoded police stops of pedestrians in New York City between 2004 and 2012. Stops are recorded by the officer on the “Stop, Question, and Frisk Report Worksheet” (UF-250 form). Each record includes information on the exact timing, geographic location, circumstances that led to the stop, details about the stopped person, the suspected crime, and events during the stop itself, such as an arrest or use of physical force by the police officer. The incident-level arrest data include 3.3 million arrests in New York City between 2004 and 2012, with information on date and time, geocoded location, offense charge, and race, age, and gender of the arrested person. Offense charges were coded as violent felony, property felony, other felony, misdemeanor, or violation.
Crime complaints include 4.8 million geocoded, incident-level felony, misdemeanor, and violation crimes reported to the NYPD from 2004 to 2012. The format of the data is similar to the arrest data. It includes information on date and time; geocoded location; offense type, including violent felony, property felony, misdemeanor, or violation; and (if available) the suspect’s race, age, and gender. Finally, our information from the NYPD includes digital boundary maps from Operation Impact (shapefiles) showing the exact geographic location and shape of impact zones from phase III to XVII together with information on the timing of the different phases. The NYPD was unable to provide comparable information for phases I and II in 2003. This initial period was smaller in scale, and we ignore it in our analysis.

Estimation Strategy

Estimating the effect of policing on educational performance is challenging considering that police activity is closely linked to crime and other neighborhood characteristics. Indeed, the selection of impact zones was based on a two-step process (Golden and Almo 2004). First, police commanders nominated high-crime areas within their precincts. Second, nominated areas were then discussed with officials and analysts at police headquarters to make the final selection. According to the NYPD, this selection process was based on crime patterns and history. As a result, impact zones differ from other areas in many confounding ways, such as crime or poverty rates, that might also influence educational outcomes. This nonrandom selection of impact zones makes it difficult for typical observational studies to estimate the causal effect of Operation Impact and policing more broadly.

To overcome this challenge, we use student-level data and a difference-in-differences (DD) approach (Angrist and Pischke 2008; Legewie 2012; Meyer 1995) with additional control variables for prior crime and in some models student fixed-effect terms. This approach exploits the longitudinal nature of the data and variation in the timing of police surges across neighborhoods together with some of the same data on crime the NYPD used to select impact zones. It focuses on students’ residential context and relies on the fact that Operation Impact was rolled out over 15 phases by expanding, moving, removing, or adding impact zones roughly every six months. We restrict the sample to areas designated as impact zones at some point over the almost 10-year duration of the program to ensure a comparison between similar neighborhoods. As a result, our approach compares changes in test scores before, during, and after Operation Impact for students in areas affected by the intervention to the same difference for students in areas designated as impact zones at a different point in time.
The DD model adjusts for all time-constant differences across neighborhoods. It adjusts for stable differences in crime, policing, and test performance (but not important changes), as well as crime history, population characteristics such as the poverty rate and population size (but not population change), and housing structure, including the presence of public housing. Crime declined significantly in New York City, but differences across neighborhoods remained relatively stable and historic crime patterns might still be important for contemporary perceptions of neighborhoods. In addition, the models control for prior crime defined as the number of violent and property crimes in the six months before the selection of impact zones. The measures are based on the same crime data used by the NYPD and focus on the period during which decisions about changes to impact zones were made. They are temporally prior to the treatment, ensuring they are unaffected by the treatment itself. This neighborhood-level, pre-treatment control variable is important because the selection of impact zones was largely based on crime rates. Officially, selection of impact zones was solely based on crime patterns and history, but population characteristics and housing structure are potentially relevant factors as well, so the neighborhood fixed-effect term and controls for prior crime mitigate confounding bias.
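
The logic of differencing out time-constant neighborhood differences can be illustrated with a minimal two-area, two-period sketch. The scores, area labels, and the function name `diff_in_diff` below are hypothetical and chosen only for illustration; the study itself estimates regression models with many neighborhoods, staggered timing, and controls for prior crime.

```python
from statistics import mean

# Hypothetical standardized test scores (toy numbers, not from the study).
# Neighborhood "A" is an impact zone in period 1 only; neighborhood "B" is
# designated at a different point in time and serves as the comparison here.
scores = {
    ("A", 0): [-0.30, -0.20, -0.40],
    ("A", 1): [-0.45, -0.35, -0.55],
    ("B", 0): [-0.10, 0.00, -0.20],
    ("B", 1): [-0.15, -0.05, -0.25],
}


def diff_in_diff(scores, treated, control):
    """Change in the treated area's mean score minus the change in the control area's."""
    change_treated = mean(scores[(treated, 1)]) - mean(scores[(treated, 0)])
    change_control = mean(scores[(control, 1)]) - mean(scores[(control, 0)])
    return change_treated - change_control


estimate = diff_in_diff(scores, "A", "B")

# Shift every score in neighborhood A by a constant to mimic a stable
# neighborhood difference; the estimate is unchanged up to rounding error.
shifted = {
    key: [s - 0.5 for s in values] if key[0] == "A" else values
    for key, values in scores.items()
}
estimate_shifted = diff_in_diff(shifted, "A", "B")
```

In this toy setup the stable gap between the two areas drops out of the estimate, whereas a neighborhood-specific change that coincides with the program would not, which is why the models additionally condition on prior crime.
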
Formally, we estimate two regression models separately by race and gender with standard errors clustered at the neighborhood level to address potential serial correlation problems. The first model is a group-level difference-in-differences model without any covariates on the student level (aside from the dependent variable):

$$y_{ijtg} = \pi_j + \eta_{tg} + \delta_1 D_{jt} + \beta_2 U_{jt} + \varepsilon_{ijtg} \tag{1}$$

The second is a student-level DD estimator that adds a student fixed-effect term $\alpha_i$ and time-specific, student-level control variables $\beta_1 X_{it}$:

$$y_{ijtg} = \alpha_i + \pi_j + \eta_{tg} + \delta_1 D_{jt} + \beta_1 X_{it} + \beta_2 U_{jt} + \varepsilon_{ijtg} \tag{2}$$

where the dependent variables are English Language Arts (ELA) and Mathematics Test scores for student $i$, in neighborhood $j$, at school year $t$, and grade $g$. The treatment variable $D_{jt}$ is on the neighborhood-year level and measures the number of days a student lived in an impact zone during the school year scaled to one year. The corresponding coefficient $\delta_1$ estimates the effect of Operation Impact, $D_{jt}$. To obtain age-specific estimates, we either extend the model with a series of interaction terms, $\delta_2 D_{jt} \text{Age10}_{it} + \cdots + \delta_7 D_{jt} \text{Age15}_{it}$ (main results), or run separate regressions for specific age groups (some additional analyses). In addition to the treatment indicator for Operation Impact, these models include a stable neighborhood effect, $\pi_j$, that controls for mean differences in test scores across neighborhoods, and a grade-by-year effect, $\eta_{tg}$, that captures test-score differences across years and grades that are constant across all students, such as characteristics of a particular test. The student-level DD estimator in Equation 2 also includes a student fixed-effect term, $\alpha_i$. The term adjusts for the selection of students into impact zones based on stable, observed, and unobserved student characteristics.
The individual-level fixed-effect term means all estimates are based on within-student variation over the years and reflect changes relative to the individual-level mean. The additional specification safeguards our analyses against other types of bias, reaffirms our findings based on different specifications and assumptions, and improves precision of the estimates. Both models include the same time-varying covariates on the neighborhood level, $\beta_2 U_{jt}$, for the number of violent and property crimes in the six months before the selection of impact zones. The student-level DD estimator in Equation 2 includes individual-level covariates, $\beta_1 X_{it}$, for free or reduced lunch as a measure of parental background and English learner status. The within-student variation in both variables is small but might capture important changes in family income and improvements in English ability for non-native speakers. The online supplement presents results from additional specifications, including a model with a school fixed-effect term; a neighborhood-specific, linear time trend, $\gamma_j \, \text{year}$; additional control variables for the prior level of police activity; and a model without controls for prior crime (see the online supplement for details).

We later extend these models with two lead and lag terms, $\delta_{t \pm x} D_{j,t \pm x}$, that estimate changes in test scores before areas were designated as impact zones and after they were removed from the program (Angrist and Pischke 2008). The lead and lag terms are equal to one only in the relevant year. For example, for students in a neighborhood designated as an impact zone from July 2006 to July 2008, the lead term $D_{j,t+2}$ is coded as one for the 2004/2005 school year and the lead term $D_{j,t+1}$ for the 2005/2006 school year. The treatment indicator $D_{jt}$ is coded as one for the school years 2006/2007 and 2007/2008 because students are exposed to Operation Impact for the entire school year.
Finally, the lagged terms $D_{j,t-1}$ and $D_{j,t-2}$ are coded as one in the 2008/2009 and 2009/2010 school years, respectively. The variables are defined at the neighborhood level and assigned to students based on their current neighborhood even if they are not part of the sample in previous or future years. This approach assumes that students did not move but ensures the sample size is sufficient to support the analysis. This specification allows us to estimate the effect of Operation Impact before (lead), during (treatment indicator), and after (lag) areas are designated as impact zones. The core assumption of our difference-in-differences approach is that in the absence of Operation Impact and conditional on prior crime, changes in test scores of students exposed to the police surge would have been the same as changes in test scores of students in control areas (common trend assumption). The Results section further discusses the plausibility of our approach and presents additional evidence to support the credibility of our design.

Examining the Underlying Mechanisms

As a second step of our analysis, we examine some of the underlying mechanisms that might explain the effect of Operation Impact on educational outcomes. These analyses focus on changes in crime, school-related attitudes, and school attendance. The measures are more proximate causes of educational performance related to our theoretical argument about a positive effect based on crime reduction, and a negative effect based on trust in schools and system avoidance. First, we explore the possibility of a positive effect through the reduction of neighborhood crime and violence, which in turn increases school performance.
The analysis is based on a similar difference-in-differences approach as our main analysis for student outcomes, but it uses data on the neighborhood-quarter level, so each observation (row) represents a specific neighborhood j in quarter q, where quarter ranges from Q1 in 2004 to Q4 in 2012 (36 quarters in total). The dependent variables are the number of violent and property crimes in neighborhood j and quarter q. The treatment indicator is coded as one when neighborhood j and quarter q are part of Operation Impact. Additional variables include four lead terms and four lag terms to estimate changes in crime before areas are designated impact zones and after they are removed from the program (Angrist and Pischke 2008). We restrict the sample to neighborhoods designated as impact zones at some point over the duration of the program and areas with at least one student. This sample restriction ensures the analysis focuses on the same areas as the main analysis discussed earlier. To model the number of violent and property crime incidents, we use negative binomial regressions, which are a common approach in research on crime (Osgood 2000). The models assess the causal impact of Operation Impact on crime by comparing changes in crime before, during, and after Operation Impact for areas affected by the intervention to the same difference for areas designated as impact zones at a different point in time. The online supplement includes a detailed discussion of this approach. Second, we estimate the effect of Operation Impact on school-related attitudes and school attendance as possible negative channels, based on trust in schools and teachers as well as system avoidance. For this purpose, we turn to the NYC Learning Environment Survey and administrative data on school attendance.
Formally, we estimate the difference-in-differences models described in Equations 1 and 2 using school-related attitudes and attendance rates as outcome variables (see variable descriptions in the next section). The models include two lead and lag terms for the treatment indicator that allow us to examine changes in attitudes and attendance before, during, and after areas were designated impact zones. The analysis for school-related attitudes restricts the sample to grades 6 to 8 and the years 2007 to 2012 because the NYC Learning Environment Survey did not collect data for earlier periods and other grade levels.
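
The coding of the treatment indicator and its two lead and lag terms can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name `code_event_terms`, the variable names, and the convention of indexing each school year by its starting calendar year (2006 for 2006/2007) are assumptions made here, and exposure is treated as binary for simplicity.

```python
def code_event_terms(first_treated, last_treated, years, n_leads=2, n_lags=2):
    """Code D, lead, and lag dummies for one neighborhood.

    first_treated and last_treated are the starting years of the first and
    last school years spent in an impact zone (hypothetical convention).
    Each lead or lag term equals one only in its own relevant year.
    """
    rows = {}
    for year in years:
        row = {"D": int(first_treated <= year <= last_treated)}
        for k in range(1, n_leads + 1):
            # lead k: k school years before the area becomes an impact zone
            row[f"lead{k}"] = int(year == first_treated - k)
        for k in range(1, n_lags + 1):
            # lag k: k school years after the area leaves the program
            row[f"lag{k}"] = int(year == last_treated + k)
        rows[year] = row
    return rows


# Example from the text: an impact zone from July 2006 to July 2008 covers
# the 2006/2007 and 2007/2008 school years.
terms = code_event_terms(2006, 2007, range(2003, 2012))
for year in sorted(terms):
    print(f"{year}/{year + 1}", terms[year])
```

With this coding, the 2004/2005 and 2005/2006 school years carry the second and first lead, and 2008/2009 and 2009/2010 carry the first and second lag, matching the worked example in the estimation strategy.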

Coding of Variables

The main outcome variables are test scores from the NYS English Language Arts (ELA) and Mathematics Test taken by students every spring in grades 3 through 8. The statewide tests are mandated by the No Child Left Behind law and were developed by McGraw-Hill and Pearson in 2012. These high-stakes exams are administered in the spring term. All public-school students who are not excused for medical reasons or because of severe disabilities are required to take the ELA and mathematics exams. The ELA exam assesses three learning standards: information and understanding, literary response and expression, and critical analysis and evaluation. It includes reading and listening sections as well as a short editing task with multiple-choice items and short-response answers. Depending on the grade level, the Mathematics Test consists of questions on number sense and operations, algebra, geometry, measurement, and statistics. The test scores are measured on a common scale using item response theory. To adjust for variations in the test across years and grades, we standardize the ELA and Mathematics test scores to have mean zero and standard deviation one by year and grade across the entire New York City sample.

The treatment indicator is defined on the neighborhood school-year level and measures exposure to Operation Impact during the school year. It is a continuous variable defined as the number of days an area was part of Operation Impact during the school year scaled to one year, so it ranges from 0 (not exposed) to 1 (exposed for the entire school year). We link this neighborhood-level indicator to student records based on geocoded student addresses for the spring term of each school year. The definition focuses on current exposure.
Students who were exposed in the past, either because they moved or their residential neighborhood was removed from the program, are coded as “not exposed.” In an additional analysis, we extend our model with lagged terms to estimate the effect of previous exposure and assess the temporal duration of the effect. For the purpose of this study, neighborhoods are created by splitting 76 police precincts into 1,257 distinct areas with at least one student based on the boundaries of impact zones. As a comparison, there are 2,166 census tracts in New York City. Splitting police precincts by impact zones ensures areas are aligned with impact zones, the central level of the intervention studied here. The modified police precincts provide a closer approximation to Operation Impact, policing, and crime compared to other geographic units, such as census tracts. The control variables include student- and neighborhood-level covariates. On the student level, we control for student age, an indicator for free or reduced lunch, and English learner status. Most analyses are conducted separately by race and gender. On the neighborhood level, we control for the number of violent and property crimes in the six months before the selection of impact zones. Prior crime is an essential covariate because, officially, the selection of impact zones was solely based on crime patterns and history. Additional analyses focus on crime, school-related attitudes, and school attendance as possible mechanisms. The three measures are defined in the following way. First, we estimate the impact of Operation Impact on the number of violent and property crimes in neighborhood j during quarter q. The level of analysis and crime measures are distinct from the control variables discussed earlier. The measures are aggregated from incident-level crime data reported by the NYPD.
Although police-reported crime data are limited in many ways (Lynch and Addington 2006), the measures afford an important opportunity to examine changes in crime related to Operation Impact as a possible mechanism. Second, we estimate the effect of Operation Impact on school-related attitudes and trust using survey data from the NYC Learning Environment Survey. The student questionnaire from the NYC Learning Environment Survey includes a range of questions on students’ attitudes toward their school, teachers, and other school staff. We use exploratory maximum likelihood factor analysis to construct the measure “positive school attitudes and trust” from five items. The five questions are measured on a four-point Likert scale ranging from strongly disagree to strongly agree or from uncomfortable to comfortable. They include: “I feel welcome in my school” (factor loading .61), “Discipline in my school is fair” (factor loading .52), “My teachers inspire me to learn” (factor loading .60), “The adults at my school look out for me” (factor loading .64), and “The adults at my school help me understand what I need to do to succeed in school” (factor loading .62). The variables cover different aspects of school-related attitudes and trust. Our results suggest the five items belong to the same factor. The factor score from the exploratory maximum likelihood factor analysis can be understood as an index for positive attitudes and trust toward school and teachers. Third, we use school attendance to measure system avoidance based on administrative data from the NYCDOE. Our measure is defined as the attendance rate for a specific school year using days on the roll as the denominator. We scale the variable from 0 to 100 percent. On average, the attendance rate is 92 percent with a standard deviation of 7.6.
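
Two of the variable constructions in this section, the continuous treatment indicator and the test scores standardized by year and grade, can be sketched in a few lines. The sketch rests on assumptions not stated in the text: the school-year window (July 1 to the following July 1), the use of the population standard deviation, the toy scale scores, and all function and field names are hypothetical.

```python
from collections import defaultdict
from datetime import date
from statistics import mean, pstdev


def exposure_share(zone_start, zone_end, year_start, year_end):
    """Share of the school year [year_start, year_end) spent as an impact zone (0 to 1)."""
    overlap_start = max(zone_start, year_start)
    overlap_end = min(zone_end, year_end)
    overlap_days = max((overlap_end - overlap_start).days, 0)
    return overlap_days / (year_end - year_start).days


def standardize_by_year_grade(records):
    """Add a z-score with mean 0 and SD 1 within each (year, grade) cell."""
    cells = defaultdict(list)
    for r in records:
        cells[(r["year"], r["grade"])].append(r["score"])
    # Population SD is an assumption; each cell needs some variation in scores.
    stats = {cell: (mean(v), pstdev(v)) for cell, v in cells.items()}
    out = []
    for r in records:
        mu, sd = stats[(r["year"], r["grade"])]
        out.append({**r, "z": (r["score"] - mu) / sd})
    return out


# Assumed school-year window for 2006/2007.
sy_start, sy_end = date(2006, 7, 1), date(2007, 7, 1)
full = exposure_share(date(2006, 7, 1), date(2008, 7, 1), sy_start, sy_end)
partial = exposure_share(date(2007, 1, 1), date(2007, 7, 1), sy_start, sy_end)
none = exposure_share(date(2009, 1, 1), date(2009, 7, 1), sy_start, sy_end)

# Toy scale scores for two year-by-grade cells.
records = [
    {"year": 2006, "grade": 5, "score": 640},
    {"year": 2006, "grade": 5, "score": 660},
    {"year": 2006, "grade": 5, "score": 680},
    {"year": 2007, "grade": 6, "score": 650},
    {"year": 2007, "grade": 6, "score": 670},
]
standardized = standardize_by_year_grade(records)
```

Here an area that is an impact zone for the whole window gets a value of 1, one designated only from January to July 2007 gets 181/365, and one designated outside the window gets 0.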

Sample and Summary Statistics

We restrict student data in several ways to obtain our primary analysis sample of 285,439 students who were exposed to Operation Impact, with 827,922 student-year observations. First, we restrict our sample to African American and Hispanic students, because the sample size of White and Asian students living in impact zones is too small to support our analysis. Second, we exclude students who did not participate in the yearly state test in grades 3 to 8. Third, we limit our analysis to students age 9 to 15. The number of cases for younger and older students is insufficient to obtain age-specific estimates. Fourth, as part of our estimation strategy, we restrict our sample to areas designated as impact zones at least once. Finally, we exclude 2.6 percent of observations with missing data on any of the relevant variables. Appendix Table A1 reports student-level summary statistics. It compares all students age 9 to 15 who participated in the yearly state exam with our analytic sample restricted to areas that were part of Operation Impact at some point in time. The proportion of White students is far lower among students in impact zones, whereas the share of African American and Hispanic students is higher compared to the general student population. Students in impact zones are more likely to receive free or reduced lunch and to score lower on the English Language Arts and Mathematics Tests. Students in impact zones live in neighborhoods with a higher poverty rate, a smaller proportion of White residents, a higher share of minority groups, and substantially higher crime rates.

Understanding the Effect of Operation Impact

To better understand the mechanisms that explain the effect of Operation Impact on educational outcomes, we examine changes in crime, school-related attitudes, and school attendance. These measures are more proximate causes of educational performance related to our theoretical argument about a positive effect based on crime reduction, and a negative effect based on trust in schools and teachers and system avoidance. Data limitations do not allow us to examine health-related mechanisms such as stress, fear, and anxiety.

Violent and Property Crime

First, we explore the possibility of a positive effect through the reduction of neighborhood crime and violence, which in turn increases school performance. For this purpose, we examine changes in violent and property crime before, during, and after Operation Impact (for a similar analysis, see MacDonald et al. 2016). The analysis is based on a similar difference-in-differences approach as our main analysis for student outcomes, but it uses data on the neighborhood-quarter level and negative binomial regressions to model the number of crime incidents (Osgood 2000). The dependent variables are the number of police-reported violent and property crimes in neighborhood j and quarter q. Police-reported crime data are limited in important ways, including potential changes in citizen reporting of crime related to Operation Impact and increased crime reporting due to additional police officers deployed to impact zones. The analysis nonetheless offers important insights into changes in reported crime before, during, and after Operation Impact. Figure 6 presents results for the effect of Operation Impact on violent and property crimes together with estimated leads and lags, running from four quarters before to four quarters after a neighborhood was part of Operation Impact.
The estimates show that the number of violent crimes was almost 10 percent higher in the two quarters before areas were designated impact zones compared to control areas. This finding confirms that the NYPD selected impact zones based on crime levels in the period leading up to the selection of impact zones. Our main analysis adjusts for this selection process by controlling for crime levels in the six months before selection of impact zones. After the implementation of Operation Impact in a particular area, with the corresponding sharp increase in police activity, the number of violent crimes decreased to about 5 percent below the level in control areas. This effect refers to the entire duration of the policing program in a particular area, which ranged from about two quarters to 7.5 years and is, on average, about one year. The reduction in crime dissipates quickly after areas are removed from Operation Impact, with violent crimes returning to the same level as in control areas in the second quarter after an area was removed from Operation Impact. In contrast to violent crimes, property crimes were largely unaffected by Operation Impact and remained at the same level as control areas before, during, and after the program. These findings indicate that Operation Impact reduced violent crime, although the effect was limited to the duration of the program. Together with substantial evidence that violent crime in the residential environment has a negative impact on cognitive development, school performance, mental health, and long-term physical health (for an overview, see Sharkey 2018a), these findings provide evidence for a potential positive channel. They suggest Operation Impact might improve the educational prospects of children in high-crime areas by reducing developmentally disruptive violent crime in the local environment. 
However, the main analysis clearly indicates that any positive effect through the reduction in crime is far exceeded by the negative consequences of aggressive, broken-windows policing.

School-Related Attitudes and School Attendance

Second, we estimate the effect of Operation Impact on school-related attitudes and school attendance as possible negative channels. Broken-windows policing programs, such as Operation Impact, might influence educational outcomes by undermining trust in schools and teachers, or by leading to system avoidance and withdrawal from institutions of social control such as schools. The analysis is based on the same difference-in-differences approach described earlier, but it uses a measure of positive attitudes toward school and the attendance rate as outcome variables. Figure 7 reports results for African American boys age 13 to 15 years. The findings show no evidence that Operation Impact influenced school-related attitudes. The effect estimates are small and statistically insignificant, and the direction of the point estimates is inconsistent with our theoretical expectation. This result indicates that the negative consequences of Operation Impact are not driven by changes in school-related attitudes. However, Figure 7 provides evidence for an effect of Operation Impact on school attendance. The results show no effect before areas were designated as impact zones but a modest decrease after the introduction of Operation Impact, which flattens out in the years after the program ends. Figure S6 in the online supplement shows the same results by race and gender, indicating that reduced school attendance is confined to African American boys. This pattern is consistent with a causal interpretation of the results.
The size of the effect indicates that Operation Impact reduced the attendance rate of African American boys, but not other groups, by .46 to .84 depending on the estimate, which corresponds to about .1 standard deviations or 1.35 school days in a 180-day school year. Previous research consistently shows that lower attendance is related to performance on standardized tests, dropout, and other educational outcomes (for research on NYC, see Durán-Narucki 2008). Although the effect size is modest or even small, the finding indicates that system avoidance is a possible mechanism by which Operation Impact reduced test scores for African American boys.

Together, these analyses present the first evidence about three possible mechanisms that might explain the effect of Operation Impact on educational outcomes. They show that Operation Impact decreased neighborhood crime, which might improve educational outcomes, did not influence school-related attitudes, and had a negative effect on school attendance. These findings indicate that system avoidance might partly explain the overall negative effect for African American boys. The lack of health-related measures, however, prevents us from examining stress, fear, and anxiety as plausible explanations of the negative effect.
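The attendance magnitudes reported above can be related to school days with simple arithmetic. The sketch below assumes the estimates are expressed in percentage points of the attendance rate, which the text does not state explicitly; the function names are hypothetical. Under that assumption, 1.35 days in a 180-day year corresponds to a reduction of .75 percentage points, which lies inside the reported range of .46 to .84.

```python
SCHOOL_YEAR_DAYS = 180  # school-year length stated in the text


def attendance_points_to_days(points, year_days=SCHOOL_YEAR_DAYS):
    """Convert a change in attendance rate (percentage points, assumed unit) to school days."""
    return points / 100 * year_days


def days_to_attendance_points(days, year_days=SCHOOL_YEAR_DAYS):
    """Convert a number of school days to percentage points of the attendance rate."""
    return days / year_days * 100


# Bounds of the reported range and the reported number of school days.
for points in (0.46, 0.84):
    print(f"{points:.2f} points -> {attendance_points_to_days(points):.2f} days")
print(f"1.35 days -> {days_to_attendance_points(1.35):.2f} points")
```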

Conclusions

In response to rising crime rates, police departments around the country implemented aggressive policing strategies and tactics often inspired by the broken-windows theory of crime popularized by Wilson and Kelling (1982). Under Mayor Rudy Giuliani, the New York Police Department, the nation’s largest municipal police force, pioneered these reforms in the early 1990s. Investments in policing, including some forms of proactive policing, are credited with reductions in crime, but systematic research and empirical evidence on the social costs of policing are scarce (Weisburd and Majmundar 2018). We know little about the potential negative consequences of aggressive, broken-windows policing for generations of African American and Latino youth.

This article focuses on the consequences of aggressive, broken-windows policing for educational performance. The theoretical argument suggests aggressive policing can either exacerbate racial inequality in educational attainment by disproportionately targeting youth of color in high-crime neighborhoods, or it can indirectly reduce educational inequality if it reduces violence and crime in risky communities. Exploiting the staggered implementation of Operation Impact in New York City and using a difference-in-differences approach, we present the first causal evidence suggesting that this widely applied police model, which emphasizes extensive police contact at low levels of suspicious behavior, can lower the educational performance of African American boys, with implications for child development and racial inequality. The effect size varies by students’ race, gender, and age: it is substantial for African American boys age 13 to 15 and small and/or statistically insignificant for other groups. These race, ethnicity, and gender gaps require further research on the consequences of policing for female, Hispanic, and White students.
The findings advance our understanding about the role of the criminal justice system for youth development and racial/ethnic inequality. They complement recent research on how different forms of criminal justice contact, such as arrest, conviction, and incarceration, influence mental and physical health, employment, and other important outcomes. Most research on the link between the criminal justice system and child development focuses on parental incarceration, even though law enforcement and policing are a central, and the most visible, part of the criminal justice system. Indeed, police are the face of the state and criminal justice system, with an educative function for many youths, especially minority youths in the most intensively policed neighborhoods (Justice and Meares 2014; Soss and Weaver 2017). The findings document how direct and indirect contact with police can have spillover effects to behaviors in other contexts of state authority and control. The focus on neighborhood-level exposure is particularly important. It shows that the consequences of the criminal justice system are not confined to those who are incarcerated, arrested, or even stopped by the police and instead highlights that the consequences extend to entire communities. Considering the significant racial disparities in individual- and neighborhood-level police exposure (Fagan et al. 2010; Hagan et al. 2005; Legewie 2016), the findings suggest aggressive policing strategies and tactics may perpetuate racial inequalities in educational outcomes. They provide evidence that the consequences of policing extend into key domains of social life, with implications for the educational trajectories of minority youth and social inequality more broadly. These findings should encourage police reformers, policymakers, and researchers to consider the broader implications and social costs of policing strategies and tactics.
Although investments in policing are credited with reductions in crime that might benefit students in high-crime areas (Sharkey 2018b), the findings from this study and emerging evidence from other research indicate that aggressive policing influences a range of different outcomes and might harm students as well. Understanding the social costs of programs like Operation Impact is important for the design and implementation of policing programs that attempt to reduce crime and mitigate any negative consequences for minority youth. The effectiveness of policing programs is regularly assessed based on crime rates. Our findings could inform new ways to assess the effectiveness of policing practices that are relevant for all policing programs. By combining large-scale administrative or big data from different government agencies with a rigorous research design, our work lays out an agenda for how to measure the long-term social consequences of policing across key domains of social life (Law and Legewie 2018). It suggests that a better understanding and regular assessment of the social consequences of policing should play a key role in evaluation of police programs and police accountability. Our research also points at a general set of processes that highlight how frequent and negative interactions with authority figures and the experience of discrimination can undermine educational and potentially other outcomes. We focus on students’ residential context and ignore exposure to policing and other forms of social control in other contexts, such as schools, non-residential neighborhoods, or shopping malls. Indeed, many students attend school in a different neighborhood or spend significant time in other areas. Increasing surveillance and social control in schools or other settings might have similar adverse consequences and shape students’ responses in important ways. From this perspective, our work lays out an agenda for how to test for these mechanisms.
It encourages other researchers to examine the social costs of social control and surveillance across different settings.

However, we acknowledge several limitations. First, our data only allow us to test some of the underlying mechanisms. We are unable to examine health effects related to stress, fear, trauma, and anxiety, and our attitudinal measures are only a first step. Future research should address this limitation and examine how neighborhood-level exposure to policing and the experience of police discrimination are related to child development, particularly youths’ mental and physical health. Second, we evaluate the effect of aggressive policing based on New York City and Operation Impact alone. The advantage of our design is that it overcomes several of the challenges in isolating the causal effect of policing. As in any study, external validity is not ensured. The policing strategies and tactics at the core of Operation Impact, however, are fairly representative of police reforms in major cities across the country (Fagan et al. 2016; Greene 2000; Weisburd and Majmundar 2018). Several studies suggest that adolescents in suburban areas experience policing in much the same way as do urban teenagers (Beck 2019; Boyles 2015; Brunson and Weitzer 2009). There is less research on rural policing, where police contact is less frequent and crime rates may be lower, and where policing may be more embedded in the community compared to urban areas (Christensen and Crank 2001). Similarly, the setting in other countries with different racial histories, policing strategies, and educational systems is distinct, limiting the potential to generalize the findings. These similarities and differences are not sufficient to draw conclusions, but the findings warrant future research on how crime and policing shape youth development in non-urban contexts and across the world.

Appendix

Table A1. Sample Characteristics

Table A2. Difference-in-Differences Model for the Effect of Operation Impact on English Language Arts (ELA) Test Scores by Race and Gender

Table A3. Difference-in-Differences Model with Student Fixed-Effect Term for the Effect of Impact Zones on English Language Arts (ELA) Test Scores by Race and Gender

Acknowledgments, IRB, and Replication Note

For helpful comments and advice, we thank Anouk Lloren, Andy Papachristos, Olav Sorenson, Desmond Ang, Jim Kemple, and other members of the Research Alliance for NYC Schools (RANYCS). The study was approved by the Institutional Review Board at Yale University (IRB Protocol ID 2000022096) and Harvard University (IRB Protocol ID IRB18-1584). Replication code is available at https://osf.io/5mzue/.

Notes

1. The terms order-maintenance, zero-tolerance, quality-of-life, proactive, and broken-windows policing are used somewhat interchangeably. They generally refer to a style of policing that strictly enforces low-level crimes, targets minor forms of disorderly behavior, and proactively engages citizens through pedestrian stops and searches to prevent crime.

2. Stop, Question, and Frisk (SQF) operations are regulated under federal (Terry v. Ohio) and state (People v. DeBour) standards. SQF operations allow police officers who reasonably suspect a person has committed, is committing, or is about to commit a felony or a penal law misdemeanor to stop and question that person. Frisks are permitted if officers suspect the presence of a weapon or if officers suspect they are in danger of physical injury. Searches are permitted if officers have reasonable suspicion to believe a crime has taken place. About 5 percent of stops resulted in arrests during this period, and another 5 percent resulted in issuance of a citation for a non-criminal violation.

3. Beginning in January 2006, New York City started using a citywide records management system to collect SQF data. Earlier records are not geocoded and instead include the address or street intersection of the stop. We geocoded these records using ArcGIS.

4. We also considered two alternative sample definitions that would imply different control groups. First, we considered no sample restrictions so that the trend in treatment areas would be compared to all other areas. Second, we considered restricting the sample to all areas with a high crime rate prior to the introduction of Operation Impact. This approach would use high-crime areas not designated as impact zones as the control trend. However, neither alternative definition of the control group performs well in sensitivity analyses (i.e., the trend in test scores already diverges before areas were designated as impact zones).

5. Controlling for lagged crime might be endogenous in later periods of Operation Impact. To address this concern, the online supplement presents results without prior crime as a control variable.

6. Accounting for serial correlation is important in many difference-in-differences settings to ensure the resulting standard errors are consistent (Bertrand, Duflo, and Mullainathan 2004). A simple approach to address this problem in settings with a large number of groups is to cluster standard errors at the group (in our context, neighborhood) level instead of the group-by-year level (Angrist and Pischke 2008). We adopt this approach for the two difference-in-differences models described here. In general, the results are consistent across different methods of calculating standard errors.

7. All models are estimated using a generalization of the within (fixed-effect) estimator for multiple high-dimensional categorical variables using the lfe package in R (Gaure 2013). In particular, we use the within transformation for the neighborhood and grade-by-year fixed-effect terms and in some models for the student fixed-effect term.

8. Geocoding of student addresses was done by the Research Alliance for New York City Schools, which provided us with access to neighborhood identifiers and x- and y-coordinates with random noise to ensure privacy. Geographic information is missing for the 2004 school year (October 2003 and June 2004). These data are unrecoverable by the NYCDOE. To address this problem, we replace the missing 2004 data with the closest available information from either June 2003 or October 2004.

9. Free or reduced lunch and English learner status are the two variables with the highest proportion of missing cases in our analytic sample. Many of these cases are for single years, with complete student information for other years. We impute missing information for free or reduced lunch and English learner status with the lagged or lead value.

10. In 2007, the NYCDOE started to conduct the NYC Learning Environment Survey, which is a yearly student, parent, and teacher survey in grades 6 to 12. The survey focuses on the learning environment at each school and covers four categories used for reporting as part of the yearly school-level progress report. The four areas are academic expectations, communication, engagement, and safety and respect. The NYCDOE asks all parents and students in grades 6 to 12, and all teachers, to participate in the survey. The response rate among students was 65 percent in 2007 and increased to 82 percent in 2012.

11. Only one factor has an eigenvalue above 1 (Kaiser criterion), all the variables have factor loadings above .5 (with most above .6), and common metrics such as the RMSEA index (.056) and the Tucker Lewis Index (.995) are below and above the standard thresholds, respectively.

12. Appendix Tables A2 and A3 do show results for White students. Some of the point estimates are comparable in size to the main findings reported here. However, the standard errors are extremely large, the estimates are inconsistent across different model specifications, there is no systematic pattern across age and gender (as expected by theory and other findings), and the sensitivity analysis indicates the common trend assumption is violated for the sample of White students. Together, these findings challenge the validity of the estimates for White students, presumably because the number of White students exposed to impact zones is too small.

13. The lack of any impact on Hispanic boys is surprising, particularly for older students. However, Operation Impact increased the number of pedestrian stops by 35.1 percent for African Americans compared to 25.2 percent for Hispanics, with a similar pattern for misdemeanor and violation arrests (see the online supplement for details). Accordingly, Hispanic students experienced Operation Impact at least somewhat differently than did African American students.

14. Note that the negative effect is not confined to groups of students who are regularly stopped and arrested by the police. Indeed, the stop and arrest rates of African American boys age 12 and 13 are relatively low, yet they begin to experience a negative effect of Operation Impact.

15. The analysis also omits the pre-impact-zone crime measures.