The original point of looking at talent geography was to understand how the geographic distribution of lacrosse players around the country (and world) impacts the various D1 programs. To put it another way: to what extent does a school’s geography lock in a talent advantage or disadvantage? It’s an especially interesting question with Utah joining D1 shortly as the westernmost D1 men’s program.

This is the fourth post in a series on geography and talent distribution in lacrosse.

Note: The data set for this analysis was all players active in 2017. We excluded players who graduated prior to 2017 because our data only starts in 2014, so older players would have a big gap in their history. This is important for the parts of the analysis that rely on estimated player talent.

The one where he encounters some general thorniness

The question of geographical advantage turns out to be a thorny one. As we have noted in previous posts, there really isn’t a lot of comprehensive data on lacrosse recruit rankings, so you are left using actual game stats to divine some measure of recruit pedigree. Of course, the pedigree of a recruit before they step on campus is not going to be perfectly correlated with actual performance on the field. This means making some assumptions about whether a particular player choosing a particular school should be considered a coup or not.

Outside of data availability, there are the basic data challenges associated with trying to mash up a data set of player stats, team locations, and hometowns. In our case, that involved some heavy cleansing of the roster data to get credible hometowns for each player. It also involved some nifty Python to get the latitude and longitude of each player’s hometown and each school’s campus. But we love data engineering, so it was a weird sort of nerds-only blessing.
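For the curious, here is a minimal sketch of the distance half of that work. It assumes the hometowns and campuses have already been geocoded to latitude/longitude, and it uses the standard haversine formula; the function name and the rounded coordinates are illustrative, not the code behind this post.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))

# Approximate city coordinates, for illustration only
baltimore = (39.29, -76.61)
denver = (39.74, -104.99)
print(round(haversine_miles(*baltimore, *denver)), "miles from hometown to campus")
```

Great-circle distance ignores roads and flight routes, which is fine for comparing rosters but shouldn’t be read as travel distance.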

But even with those challenges, there are still some meaningful outcomes to discuss. And of course, more things to add to the to-do list.

A good chunk of D1 players stay close to home

Let’s start with the most basic question: do players tend to go to school closer to their hometown? In short: yes.

31% of the 2,315 players in our data set attended a school less than 100 miles from their hometown. (Flashing Sign/Lightbulb!!!) Obviously, talent and teams concentrate in the Northeast and Mid-Atlantic, so this is not a huge surprise. Limited recruiting budgets probably play a part too: I have to believe that the rational program is going to focus its recruiting efforts closer to home.

But to put 100 miles in context, a kid from Baltimore who stays within 100 miles of home can’t get to Penn State, Rutgers, or Richmond. They can barely make it to the Philly schools. So despite a relatively concentrated talent base, 100 miles is not nothing.

But again, the point is not to really say whether 31% is high or low. It just is. 31%.

Talent Travels

I suspected that what we would see next is that the schools with the pedigree, the lacrosse blue-bloods, would have players coming from all over. As a result, the average distance that their players traveled from home would be higher. It was not the case.

I could still do some work to normalize program pedigree and distance from the player base. Denver is a great example: they have a strong program pedigree, but the average distance traveled by their roster is high because they are out west. Maryland (feel free to argue) has a comparable pedigree, but their average player distance traveled is going to be lower because they are centrally located. To say that program pedigree doesn’t attract players from far away just because Maryland is so close to the talent base doesn’t make sense. Still, looking at the visualization as it stands, I’m not sure normalizing is going to make much of a difference. (At least it didn’t make sense to hold up this post to figure it out.)

So with one hypothesis out the window, another came into the light. Maybe it’s not the program that attracts players from far and wide, but the talent level of the players themselves. Maybe instead of top programs pulling players from all over the map, it’s the top players who have their choice of program who are going farther away. And we do see some evidence of that.

Of the players who end up having forgettable careers (0-50 expected goals added), 33% attend a school within 100 miles of their hometown. For the cream of the crop, the players who have upwards of 150 expected goals added for their career, that number is much, much lower. And players with careers in the middle end up travelling to schools that are, you guessed it, somewhere in the middle.
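The tier comparison can be sketched roughly like this. The 50 and 150 cut points come from the text above; how the boundaries are handled, the middle tier’s label, and the sample players are all made up for illustration.

```python
from collections import defaultdict

def tier(ega):
    """Bucket a career expected-goals-added total (boundary handling is an assumption)."""
    if ega < 50:
        return "0-50"
    if ega < 150:
        return "50-150"
    return "150+"

def local_share_by_tier(players, threshold_miles=100):
    """players: iterable of (career_ega, miles_from_hometown_to_campus)."""
    counts = defaultdict(lambda: [0, 0])  # tier -> [players within threshold, all players]
    for ega, miles in players:
        bucket = counts[tier(ega)]
        bucket[0] += miles < threshold_miles
        bucket[1] += 1
    return {name: local / total for name, (local, total) in counts.items()}

# Hypothetical players, not real data
sample = [(12, 40), (30, 250), (45, 80), (90, 120), (110, 60), (170, 900), (210, 1500)]
shares = local_share_by_tier(sample)
for name, share in shares.items():
    print(f"{name}: {share:.0%} within 100 miles")
```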

I struggled a bit to interpret this chart. On the one hand, programs that attract the more generic players (i.e. the bottom tier of the lacrosse world) probably have less budget to recruit, which means that their coaches end up recruiting closer to home, which means that they probably aren’t going to get the star from out west. Instead, they are going to fill their roster from the available local stock.

That’s one way to interpret this. The other way to interpret this is that the top players are more willing to pursue their lacrosse ambitions at the best school, no matter the location. In other words, when making their decision, proximity to home is less of a factor for them than it is for the less highly touted players. In this telling, the players are the ones putting in the effort to get on the radar of their ideal school, no matter where it is.

Schools are not uniformly going after the best players

Honestly, there really isn’t a way to resolve which it is with the data we’ve got. But we can at least add some more context. Via graphs!!!

The chart below shows the make-up of each team’s roster relative to their proximity to the overall talent pool. Teams farther to the right are closer to the average talent pool. Teams toward the top of the chart have filled their rosters with players from farther away.

It is not surprising that there is a strong correlation here. A team that recruits nationally, and is far away from most top-tier players (i.e. Air Force or Denver), is naturally going to have a longer average distance traveled.
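To make the two axes concrete, here is one way the numbers could be computed. The proximity score isn’t defined in this post, so the sketch assumes a stand-in: the negative of the mean distance from campus to every player’s hometown, so that a bigger number means closer to the talent pool. The team names and distances are invented.

```python
from statistics import mean, median

def team_metrics(roster_miles, pool_miles):
    """Return (proximity, median_roster_miles) for one team.

    roster_miles: hometown-to-campus distance for each player on the roster.
    pool_miles:   campus-to-hometown distance for every player in the data set.
    Proximity is the negative mean pool distance (an assumed definition).
    """
    return -mean(pool_miles), median(roster_miles)

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical teams: (roster distances, distances to the whole player pool)
teams = {
    "East A": ([20, 60, 150], [100, 200, 300]),
    "East B": ([30, 90, 400], [150, 250, 350]),
    "West C": ([500, 1200, 1600], [1400, 1500, 1600]),
}
metrics = {name: team_metrics(r, p) for name, (r, p) in teams.items()}
r = pearson([m[0] for m in metrics.values()], [m[1] for m in metrics.values()])
print(metrics, round(r, 3))
```

With proximity defined this way, the relationship shows up as a negative correlation: the closer a team sits to the pool, the shorter its median player’s trip.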

The interesting things to take away from this chart are situations where teams with the same proximity score have different median player distances. Take Princeton and Manhattan as an example. Both are roughly equidistant from the overall talent pool, but Princeton’s median player is from about 200 miles farther from campus than Manhattan’s. Is their program pedigree attracting players from farther away? Or do they have a recruiting budget that Manhattan doesn’t?

Another interesting one was Air Force/Denver. Air Force has a much higher median player distance than Denver, but that is probably to be expected given that they are a service academy with a different value proposition. While Denver still attracts or recruits players from far away, their median player is from much closer to campus than Air Force’s.

Detroit is an example of the phenomenon of lower-tier programs staying close to home. Given their relatively low proximity to the overall talent base, you’d expect that they’d need to fill their roster with players from farther away. But they actually have a much more local roster than you’d expect, given the curve above.

And getting back to the question of program pedigree, (while not listed on the chart) Towson vs UMBC is an interesting comparison. Because they are so close to one another, they have equivalent proximity scores. But the median player on Towson’s roster is from only 4 miles farther away than the median player on UMBC’s roster. In other words, despite having the pedigree, Towson is not attracting players from any farther away than UMBC is.

Would Towson’s workman-like approach to the game suffer with a roster of all stars from all over? Does this cause them to focus their recruiting on the under-appreciated local kids? Or are they underselling their chances of a title by filling their roster with locals? As with a lot of this stuff, there’s no correct answer; just some food for thought.

What have we learned?

In short, not much.

But we’ve established a few things. Players tend to go to schools that are close to home. Teams that are farther away from the talent base do tend to have rosters of players from farther away. That said, the correlation is not perfect, which means that some combination of program pedigree and recruiting strategy is affecting how different teams build their rosters.

I think there are two directions that would be interesting to take this analysis.

The first is to try to incorporate program pedigree more directly into the math here. We’ve established that there is a strong relationship between how far a team is from the talent pool and the distance makeup of their roster. This is to be expected. But the team-by-team deviations from that relationship suggest that something is missing in that model. It could be that finding a way to incorporate program pedigree explains the rest of the uncertainty.
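As a starting point, the "something missing" can be measured as each team’s residual from a simple trend. The sketch below fits a straight line by least squares (a simplification; the real relationship may well be curved) and reports how far each team’s median player distance sits from what its location alone would predict. The team names and numbers are placeholders.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx

def residuals(points):
    """points: {team: (miles_to_talent_pool, median_roster_miles)} -> {team: actual - expected}."""
    names = list(points)
    xs = [points[name][0] for name in names]
    ys = [points[name][1] for name in names]
    slope, intercept = fit_line(xs, ys)
    return {name: y - (slope * x + intercept) for name, x, y in zip(names, xs, ys)}

# Hypothetical teams, not real data
points = {"Team A": (200, 100), "Team B": (400, 300), "Team C": (600, 200), "Team D": (800, 600)}
res = residuals(points)
most_local = min(res, key=res.get)  # roster most local relative to its location
print(most_local, round(res[most_local]))
```

A strongly negative residual is the Detroit pattern: a roster more local than the team’s location would suggest. Whether a pedigree measure shrinks those residuals is exactly the open question.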

The other aspect is to try to see if we can divine anything about a program’s recruiting strategy based on this data. In our Towson/UMBC example, you’ve got two programs where proximity to the talent pool is held constant. That leaves us with program pedigree as the only real distinguishing factor. Since we see that they have very similar roster make-up, can we infer that Towson is eschewing players farther afield that their pedigree suggests they should have a shot at? What does it look like if we are able to compare pedigree/roster make-up for all teams? Who knows what clusters would emerge.

At the end of the day, this is a good start in visualizing the trends in how players of varying skill levels choose their school. What’s next is to try and come up with some actionable insights that teams could actually use to refine their recruiting approach.