I typically only dabble in NFL Draft data. I’m a scientist, nerd, and huge/passionate/obsessive fan of the NFL Draft. Over the years, I’ve become less interested in who my favorite team will draft, and more interested in what makes some players great while others never make a team. From May until September, I usually have to find other sports to follow (NBA, MMA, and track and field). During this year’s NBA Draft, I developed a minor man-crush on Ben Simmons. First off, he attended LSU, and I am a die-hard Tiger fan. Secondly, his game mirrors that of one of my favorite players in the NBA: LeBron James. But it was bothering me that I couldn’t rationalize my man-crush through statistics. Although I don’t know the game of basketball as well as football, I was still convinced that filling a stat sheet with points, rebounds, assists, and steals was very important. So my curiosity prevailed, and I started looking further into NBA advanced metrics and the correlations between those numbers and collegiate production.

I began my analysis with a group of 40 Top-10 draft picks from 2009 to 2012. I don’t have a real reason why I chose Top-10 vs. Lottery vs. every pick, except that I figured blowing Top-10 picks will cost you a job. I compiled per-game statistics and advanced metrics for each player’s NBA and NCAA careers. This group of 40 players was reduced to 34 because 6 players had no NCAA data. I then ran an analysis called Recursive Partitioning, which basically partitions groups into smaller cohorts based on different variables (more can be read about this analysis here: https://cran.r-project.org/web/packages/rpart/vignettes/longintro.pdf). I chose this method because, when multivariate data interact in complicated, nonlinear ways, building a single model can be very confusing, if not impossible. Typical quantitative approaches to talent evaluation (in my experience reading about NFL analytics) involve a regression model and some expected value. An alternative to fitting (or over-fitting) regression models is partitioning the data into smaller cohorts, making the interactions more manageable. One method to achieve this is called Classification and Regression Tree (CART), which is what I used.
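For anyone curious what a single partitioning step looks like, here is a minimal sketch in plain Python. It is not the rpart code I ran; it only illustrates the core idea of a regression tree, which is to try every candidate threshold on one variable and keep the one that minimizes the squared error of the two resulting cohorts. The function names (`sse`, `best_split`) and the six data points are made up for illustration and are not from my data set.

```python
from statistics import mean


def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(xs, ys, min_leaf=1):
    """Find the threshold on xs that best separates ys into two cohorts.

    Returns (threshold, mean_below, mean_above), or None if no split exists.
    """
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(min_leaf, len(pairs) - min_leaf + 1):
        lo_x, hi_x = pairs[i - 1][0], pairs[i][0]
        if lo_x == hi_x:
            continue  # cannot place a threshold between identical values
        below = [y for _, y in pairs[:i]]
        above = [y for _, y in pairs[i:]]
        cost = sse(below) + sse(above)
        if best is None or cost < best[0]:
            # Put the threshold halfway between the two neighboring values
            best = (cost, (lo_x + hi_x) / 2, mean(below), mean(above))
    return None if best is None else best[1:]


# Made-up toy values: college eFG% and career PER for six imaginary players
toy_efg = [0.48, 0.50, 0.52, 0.55, 0.58, 0.60]
toy_per = [12.0, 13.0, 14.0, 18.0, 19.0, 20.0]

threshold, mean_below, mean_above = best_split(toy_efg, toy_per)
print(f"split at eFG% = {threshold:.3f}: "
      f"mean PER {mean_below:.1f} below, {mean_above:.1f} above")
```

A full CART implementation repeats this search over every input variable, recurses into each cohort, and then prunes the tree; the sketch stops at one split on one variable.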

Initially, I thought this would be an endlessly confusing and complicated model, given all the input. But to my surprise, predicting career Player Efficiency Rating (PER) for these Top-10 picks boiled down to one metric: Effective Field Goal Percentage in college. The group of 34 players was split into two cohorts (15 and 19 players) based on an Effective Field Goal Percentage threshold of 0.537. Players who shot below 53.7% in college had a mean career PER of 13.7 in the NBA. Players who shot above 53.7% had a mean career PER of 19.3 in the NBA. Additionally, 87% of the players who shot better than 53.7% in college had better-than-average NBA PERs (>15), while only 26% of the other group did.
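To make the rule concrete, here is a small sketch that computes Effective Field Goal Percentage from a shooting line and looks up the cohort mean reported above. The formula, (FGM + 0.5 × 3PM) / FGA, is the standard definition of the stat. The function names and the example shooting line are hypothetical, and treating a player who lands exactly on 0.537 as "above" is my own assumption, since the split only distinguishes below from above.

```python
EFG_THRESHOLD = 0.537   # split point reported by the tree
MEAN_PER_BELOW = 13.7   # mean career PER of the cohort below the threshold
MEAN_PER_ABOVE = 19.3   # mean career PER of the cohort above the threshold


def effective_fg_pct(fgm, fg3m, fga):
    """Standard eFG%: a made three counts as 1.5 made field goals."""
    if fga <= 0:
        raise ValueError("field goal attempts must be positive")
    return (fgm + 0.5 * fg3m) / fga


def cohort_mean_per(college_efg):
    """Return the mean career PER of the cohort a player falls into.

    Assumption: a value exactly at the threshold counts as 'above'.
    """
    if college_efg >= EFG_THRESHOLD:
        return MEAN_PER_ABOVE
    return MEAN_PER_BELOW


# Hypothetical college career: 200 makes (40 of them threes) on 400 attempts
efg = effective_fg_pct(fgm=200, fg3m=40, fga=400)
print(f"eFG% = {efg:.3f}, cohort mean career PER = {cohort_mean_per(efg)}")
```

Keep in mind that the cohort mean is an average over a small group of players, not a prediction for any one of them.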

Looking at the 2016 Top-10 draft picks, Ben Simmons, Buddy Hield, Jamal Murray, Marquese Chriss, and Jakob Poeltl all exceeded an Effective Field Goal Percentage of 53.7% in college.

My first foray into the NBA Draft has been really interesting and fun. It has also shown me how little I know about the game of basketball (which is also fun).