The models here perform slightly worse than the k-NN approach. Taking a look at the confusion matrix for the guards, we can see something interesting.

OOS Confusion Matrix for Guards

Compared to the k-NN approach, this model tends to under-predict NBA talent: the false-positive rate is low, while the false-negative rate is high.
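As a quick sketch, the two rates above can be read straight off a confusion matrix. The labels below are made up for illustration and only mimic the pattern in the text (few false positives, many false negatives):

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical labels: 1 = made the NBA, 0 = did not (illustrative data only)
y_true = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1])
y_pred = np.array([0, 0, 0, 0, 0, 1, 0, 0, 1, 1])

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
fpr = fp / (fp + tn)  # how often non-NBA players get flagged as NBA talent
fnr = fn / (fn + tp)  # how often actual NBA players get missed
print(fpr, fnr)  # low FPR, high FNR
```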

Since we wanted interpretable features, we can look at the coefficients of each position's model. Keep in mind that increasing a feature with a positive coefficient increases the log-odds, pushing the predicted probability towards 1, while increasing a feature with a negative coefficient pushes the predicted probability towards 0.
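A minimal sketch of this kind of coefficient inspection, using random data and hypothetical feature names (the real models use the college stat line described in this post):

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Hypothetical feature names standing in for the real college stats
features = ["WS_off", "WS_def", "PER", "USG_pct", "MPG", "height", "PPR", "PPS"]
rng = np.random.default_rng(0)
X = rng.normal(size=(200, len(features)))        # placeholder standardized stats
y = rng.integers(0, 2, size=200)                 # placeholder NBA / non-NBA labels

model = LogisticRegression(max_iter=1000).fit(X, y)

# Positive coefficients push the log-odds (and predicted probability) up;
# negative coefficients push them down. Sorting makes the extremes easy to read.
coefs = pd.Series(model.coef_[0], index=features).sort_values()
print(coefs)
```

Note that comparing coefficient magnitudes across features is only meaningful when the features are on a common scale, so standardizing first matters here.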

We can immediately notice some trends. It should come as no surprise that Offensive/Defensive Win Shares, PER, Usage %, minutes per game, and height are the biggest influencers. Within position groups, however, we can see that PPR (pure point rating) and PPS (points per shot) are predictive of NBA potential for guards. For big men, wingspan and defensive statistics such as BLK% and STL% are important.

Some of the coefficients are counterintuitive (for example, higher TOV% is favored for guards), but this is likely due to confounding effects. A high TOV% could mean the player has the ball in his hands frequently and is expected to make plays (i.e. a valued player). A high TOV% is also expected from freshman phenoms as they adjust to the rigors of the college game, so the model could be picking up the one-and-done signal.

Random Forest Classifier Approach

Given the difficulty of the classification task, I figured it would be beneficial to use a non-linear model. In this method, I don't train separate models by position; instead, I ordinally encode position as a single categorical feature, which performed better than one-hot encoding (likely thanks to the reduced feature space). I once again used the same weighted loss function.
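The setup described above could look roughly like the sketch below. The column names and the position ordering are assumptions for illustration, and `class_weight="balanced"` stands in for the weighted loss by up-weighting the scarce NBA-positive class:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Hypothetical frame: position kept as ONE ordinally encoded column, not one-hot
df = pd.DataFrame({
    "position": ["PG", "SG", "SF", "PF", "C"] * 40,
    "per": np.random.default_rng(1).normal(20, 5, 200),  # placeholder stat
})
pos_order = {"PG": 0, "SG": 1, "SF": 2, "PF": 3, "C": 4}
df["position"] = df["position"].map(pos_order)

y = np.random.default_rng(2).integers(0, 2, 200)  # placeholder labels

# class_weight="balanced" reweights classes inversely to their frequency,
# playing the role of the weighted loss function used earlier
rf = RandomForestClassifier(n_estimators=100, class_weight="balanced",
                            random_state=0)
rf.fit(df[["position", "per"]], y)
```

Keeping position as one integer column means the forest can still split it into arbitrary groups, while the model sees two features here instead of six.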

Here, the area under the curve looks promising. If we take a look at the confusion matrix, we see that we actually do pretty well.

OOS RFC Conf. Matrix

We can perform an analogous analysis here, although with tree ensembles it works a little differently since the model is non-linear. The feature importances indicate how much the model relies on splitting on each feature, rather than the direction of its effect.