He has been called "Stanford's star of statistical inference," and he is responsible for creating a swathe of modern statistics, including the bootstrap. But when Brad Efron first heard about a project to tackle the horrific prevalence of rape in Nairobi's poorest schools, his first thought was that it required more than statistical help—it required "angelic" intervention. "I put it in the category of great ideas that couldn't be carried out," he says. "How do you actually execute a highly interactive interventional plan within a disparate public-school system—and then how do you show whether the interventions are working?"

Those were precisely the questions the founder of No Means No Worldwide wanted answered. The San Francisco-based nongovernmental organization (NGO), which runs a project to train girls, approached Stanford for help after its founder, Lee Paiva, decided it wasn't enough to tell funders how many children had been through the program. "What did those kids actually get?" she asked in a Stanford Medicine (https://stanmed.stanford.edu/2016summer/standing-up-to-sexual-assault.html) article. "What is that money really going to do? And in that moment, I knew: I'm not doing this anymore until I absolutely know what that child got out of this."

Figuring that out requires more than taking a few stats courses. It requires pushing the boundaries of statistical modeling itself. "And then I did one smart thing," says Efron. "I involved Mike Baiocchi in the project."

Baiocchi's passion for causal inference had led him to hold positions in both the statistics department and the Stanford Prevention Research Center in the university's medical school; he quickly became absorbed by the technical difficulty of the challenge and by the impact a solution could have. "If we can measure difficult concepts like empowerment and gender norms," he says, "then we can use statistical analyses to map out the causal pathways to assess what is essential, and what is not, in these interventions."

With an evidence base for what works, funders, policymakers and NGOs can direct their limited resources to where they will have the largest impact, and do so in highly quantifiable, defensible ways.

Initially, Baiocchi started looking at the observational data No Means No Worldwide had collected from their training in Nairobi's schools to see whether it could be used in a rigorous way to draw inferences about what was working. But, as he notes, observational data often tempt researchers into overstating the benefits of an intervention. At the same time, No Means No Worldwide wanted to expand their program. "So, we were like, 'Hey, this is a perfect opportunity to do a randomized study!'" By randomly assigning some schools to training and others to being delayed in receiving the training, they would have a much better sense of the overall impact of the program.
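
Assigning whole schools, rather than individual girls, to immediate or delayed training is commonly described as a cluster-randomized, waitlist-style design. The Python sketch below illustrates only the assignment step. The helper name, the school identifiers, the even split and the fixed seed are all assumptions made for illustration, not details of how the Stanford team actually ran the randomization.

```python
import random


def assign_schools(schools, seed=2014):
    """Randomly split schools into 'immediate' and 'delayed' training arms.

    Hypothetical helper: the even split and the seed are illustrative
    assumptions, not details reported for the Nairobi study.
    """
    rng = random.Random(seed)   # seeded so the assignment can be reproduced
    shuffled = list(schools)    # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        "immediate": sorted(shuffled[:half]),
        "delayed": sorted(shuffled[half:]),
    }


# Placeholder identifiers standing in for real schools.
schools = [f"school_{i:02d}" for i in range(1, 29)]
arms = assign_schools(schools)
print(len(arms["immediate"]), "schools trained now;",
      len(arms["delayed"]), "schools trained later")
```

Because chance alone decides which schools go first, systematic differences between the two arms become unlikely, which is what allows a later difference in outcomes to be attributed to the training.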

Baiocchi partnered with Clea Sarnquist, a senior research scholar in pediatrics who had done pioneering work in assessing interventions to prevent gender violence in sub-Saharan Africa. In 2014, they both went to Kenya.

The project's location—Nairobi's slums—presented a huge statistical challenge. "It's a dynamic environment," says Baiocchi. "Buildings are being erected and demolished; people are constantly moving in and trying to reach escape velocity. So, if you run a training program in a school and then the school disappears, you'll have to deal with student diffusion."

"We also heard anecdotally that the girls loved the training and were teaching their friends, sisters and moms," continues Baiocchi. "That's wonderful—and researchers are thinking about how to use that. But, from a statistical perspective, there was a risk that our intervention group would contaminate into our control group, and that would prevent us from seeing whether the intervention was working. It just wasn't clear that we could empirically establish that you could prevent sexual assaults in these kinds of environments."

The first trial involved 5,000 girls in 28 schools and was put together with, as Baiocchi put it, the experimental equivalent of shoestring, chewing gum and duct tape. But the result was striking, and it mirrored what they had seen in the observational study: the training cut the rate of sexual assault in half.

At the same time, a group of Canadian researchers published the results (www.nejm.org/doi/full/10.1056/NEJMsa1411131) of their multiyear randomized trial of a similar sexual assault intervention program for college-age women at the University of Windsor and found rates of rape had been cut in half. That similar intervention programs had similar outcomes in two very different populations and environments was a sign that the theory behind the training held up.

But the Stanford team was only getting started. "We wanted to unpack why things were happening, and not just focus on changing outcomes," says Baiocchi. "We wanted to make sure we really understand what's going on."

The theory behind the training posited four causal pathways to prevent assault. The first is situational awareness, which involves training girls to recognize the common scenarios perpetrators use and to know the physical and social resources available to them. The second is empowerment, which involves training the girls to understand that they are "worth it" and that they have the right to have their will heard and registered in such situations. The third is training in verbal skills: how to be heard, the words that are most effective and the tone and social constructs to use. And the fourth is physical defense, a mix of Krav Maga and Brazilian jiu-jitsu.

"Our first randomized trial showed strong evidence that the empowerment pathway seems to work as the theory proposes," says Baiocchi. "But they needed bigger and more complex trial to drill down, and that would be expensive." To continue with Efron's angel metaphor—at least in terms of investment—enter the U.K.'s Department for International Development (DfID) and its global program focused on girls and women, What Works to Prevent Violence (www.whatworks.co.za).

With DfID's support, they are now working with more than 100 schools and following the students over two years to assess their experience over time, tease out causal pathways and look at the durability of the training's effect. "If the girls are trained once, does that training last for years?" asks Baiocchi. "Or is a 'booster' program needed to keep the skills effective?" They will also measure academic outcomes to see whether the program has additional benefits outside of sexual assault—something that would help make the case for the Kenyan Ministry of Education to scale up the intervention.

Baiocchi is quick to note success has many authors—from the collaborating researchers in Kenya to those running the training to the Canadians doing their study. But as a statistician, he lingers on the contribution of one group whose impact is easy to miss: the Stanford Ph.D. students in statistics who signed on to help with the project. They worked on the exhausting task of establishing the baseline demographic data from which to measure future results—everything from the girls' experience of gender-based violence to what materials their houses are made of. And they took it upon themselves to create tutorials with synthetic data sets so others could learn how to do similar studies.

"There isn't a ton of statistical expertise in the global gender-based violence field," says Rina Friedberg, a third-year Ph.D. student in statistics. "So, we've been creating training programs for other groups who want to do similar studies. They can go through and read our software code and our explanations of what it does and why, and hopefully they will be able to go out and replicate our statistics."

It is difficult to overstate the importance of this, according to Baiocchi. "The academic discipline of sexual assault prevention is quite young, so its researchers were trained in many different disciplines," he says. "This means when we get together to talk, it's a bit like the Tower of Babel in terms of statistics, because everybody has a different statistical language they use; it's hard to communicate. We're trying to provide a Rosetta stone for everybody working in sexual assault prevention, to make it easy to have a consistent and rigorous statistical methodology relatively quickly."

"Mike and his colleagues were able, with what seemed to me heroic persistence, to both carry out the intervention and to demonstrate its massively good effects," says Efron. "These guys are going to win major humanitarian awards."

Last year, Baiocchi won Stanford's Rosenkranz Prize for Health Care Research in Developing Countries. The team is using the financial support from this award to explore the impact of the fraught 2017 Kenyan elections on rates of violence against children in the slums. This year, Friedberg won the Marjorie Lozoff Prize for scholarship that furthers women's development. In the spring of 2018, Baiocchi's team partnered with the researchers who developed the training program for college-age women in Canada to start a pilot program for Stanford undergraduates. "We're excited to be bringing this approach to combating sexual assault home," says Baiocchi. "And a little nervous about all the new complexities we'll encounter here on campus."


More information: IMP: Interference Manipulating Permutations

Thursday, August 2, 2018, 10:30 a.m. - 12:20 p.m.

ww2.amstat.org/meetings/jsm/2018/onlineprogram/AbstractDetails.cfm?abstractid=326760