In my time on Streaking the Lawn, I've used advanced sports statistics to analyze and provide an additional perspective on Virginia Cavalier Football and Basketball. Advanced stats, though, aren't limited to a collection of formulas and acronyms spanning these two major sports and the even more well-known baseball sabermetrics.[1] Advanced stats are best understood as a conceptual approach to understanding sports, asking questions such as: what do classic statistics fail to measure, and which statistics measure too much[2]?

Chris Birch, Director of Volleyball Operations for the Virginia Volleyball team, recently contacted me to help tweak his raw data into useful statistics. He hopes to better evaluate players in a sport as yet undisturbed by the steady progress of nerdy sports analysis. This is where you[3] come in.

In a series of undetermined length, Chris and I will attempt to successively develop better stats for various aspects of the sport via email exchanges, and I'll post our progress here on Streaking the Lawn. We're counting on you invaluable readers to provide us with additional perspectives. Think of these posts as less of a report and more of a prompt for ideas.[4] Comment with whatever comes to mind, whether you've suddenly discovered you're Volleyball's version of Bill James or simply noticed a typo. Every opinion is helpful.

In today's first part, we'll take a look at passing.

[Past Method of Evaluating Passing]

Chris uses a program that allows him to analyze match video by coding each contact of the ball and giving it a grade. Passes in particular are scored on a scale of 0 to 3, where a 3-pass is optimal and a 0-pass directly results in a point for the opponent. Most teams average a player's pass scores to determine pass proficiency. In the table below, pass history is written as [#3s/#2s/#1s/#0s].





Chris noticed a problem. These players are almost universally considered equal passers, but the first player at least kept her team in the point on every pass, while the second player directly contributed to opponent points on 1/3 of her passes. Poor passes were being underemphasized.
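To make the problem concrete, here is a minimal Python sketch of the averaging method. The language is my choice, not something Chris's program produces. The [10/0/0/5] history is the one this post discusses; the [0/15/0/0] history is a hypothetical stand-in for a passer who never gives up a point directly.

```python
def average_pass_score(history):
    """Average grade for a pass history given as (threes, twos, ones, zeros)."""
    threes, twos, ones, zeros = history
    total = threes + twos + ones + zeros
    if total == 0:
        raise ValueError("no passes recorded")
    # Each pass contributes its grade (3, 2, 1, or 0) to the average.
    return (3 * threes + 2 * twos + 1 * ones) / total


steady = (0, 15, 0, 0)    # hypothetical passer: no 0-passes at all
streaky = (10, 0, 0, 5)   # one-third of her passes are 0-passes

print(average_pass_score(steady))   # 2.0
print(average_pass_score(streaky))  # 2.0
```

Both histories average exactly 2.0, so the classic stat cannot tell them apart.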

[Options for an Improved Passing Stat]

---Option 1---

Chris' first approach wisely incorporated scoring probability into the analysis; he charted the number of times the team scored as a result of each pass grade level, and determined a probability of scoring from this data. Among 30 games from last season, the team scored 45.2% of the time after a 3-pass, 33.9% after a 2-pass, and 21% of the time after a 1-pass.[5] He then broke down a player's passes into percentages, instead of raw numbers, for the various grades. Our [10/0/0/5] passer from the previous table would have about 67% 3-passes and 33% 0-passes. Multiplying the respective probabilities by the player's pass percentages and summing the results gives an idea of the scoring probability of an individual player's passing, and resulted in the formula and comparison in the chart below:





The first player has appropriately overtaken the second player as "best passer." Furthermore, we're directly relating passing proficiency to the most important stat of all: points. This seems like a great way to approach the problem, but different issues now arise. Chris noted that there's a lot of statistical noise[6] in this analysis.
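The Option 1 calculation can be sketched in Python as follows. This is my reading of the description (pass percentages times scoring probabilities, summed), not necessarily the exact formula in Chris's chart. The probabilities are the ones quoted from last season; the function name and the [0/15/0/0] history are hypothetical.

```python
# Probability that the team scores after each pass grade (from the 30-game sample).
SCORE_PROB = {3: 0.452, 2: 0.339, 1: 0.21, 0: 0.0}


def scoring_probability(history):
    """Expected scoring probability for a history (threes, twos, ones, zeros)."""
    threes, twos, ones, zeros = history
    counts = {3: threes, 2: twos, 1: ones, 0: zeros}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no passes recorded")
    # Weight each grade's scoring probability by the share of passes at that grade.
    return sum(SCORE_PROB[grade] * n for grade, n in counts.items()) / total


print(round(scoring_probability((10, 0, 0, 5)), 3))  # about 0.301
print(round(scoring_probability((0, 15, 0, 0)), 3))  # about 0.339 (hypothetical passer)
```

Under this stat, the passer with no 0-passes comes out ahead even though both histories average a 2.0 pass score.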

---Option 2---

Under my limited understanding of volleyball, much of the noise comes from the fact that points are generally produced directly by the attack; passes clearly influence the attack, but the ensuing play of the attacker and the play of the defending team have an even bigger impact on whether the passing team scores. This reminded me a lot of basketball. In basketball, ball-handling is critically important to the success of the possession, but it also occurs after the start of the possession and before the scoring attempt, so its points produced are also heavily influenced by outside forces. Ball-handling is essentially divided into three categories: turnovers, passes, and assists. Stat-wise, it's kind of an all-or-nothing aspect of the game: a notable pass either directly leads to a scoring opportunity or directly results in a turnover, and the vast majority of passes are not recorded at all. Noise is largely eliminated.

Basketball's assist/turnover ratio might analogize well to volleyball. As shown in Chris' probabilities, the biggest jump in probability comes from just getting a pass off. A 0-pass could be like a turnover, and a 1-pass could be ignored as a "regular" pass. Depending on how 2- and 3-passes influence the game, we have a few options: we could consider either both of them or only a 3-pass as similar to an assist.
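Here is a minimal Python sketch of that assist/turnover idea, covering both variants (2s and 3s as "assists," or 3s only). The function name, the count_twos flag, and the handling of a zero denominator are my assumptions for illustration.

```python
import math


def pass_quality_ratio(history, count_twos=True):
    """Assist/turnover-style ratio for a history (threes, twos, ones, zeros).

    "Assists" are 3-passes (plus 2-passes if count_twos is True),
    "turnovers" are 0-passes, and 1-passes are ignored.
    """
    threes, twos, ones, zeros = history
    good = threes + (twos if count_twos else 0)
    if zeros == 0:
        # No 0-passes: the ratio is unbounded, or undefined if there are no good passes either.
        return math.inf if good > 0 else math.nan
    return good / zeros


print(pass_quality_ratio((10, 0, 0, 5)))   # 2.0
print(pass_quality_ratio((0, 15, 0, 0)))   # inf (hypothetical passer with no 0-passes)
```

A passer with no 0-passes gets an infinite ratio, which is the same quirk the assist/turnover ratio has for a player with no turnovers.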





This dramatically simplifies the analysis, removes much of the noise, and rates player 1 significantly better than player 2. The infinity is a little weird, but it's a small sample and the same issue occurs in basketball.[7] But my inner math nerd thought, maybe it's a little too simple ...

---Option 3---

With Chris' interesting probability work, we could multiply the 3s by a factor representing their point probability "worth" over 2s[8] to better weight the pass quality.





This slightly closes the gap between these players by returning some emphasis to highly rated passes, while still staying within the assist/turnover framework and avoiding some of the noise.
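A sketch of the weighted version in Python, under my reading of the description: 3s are multiplied by .452/.339, 2s count once, and 0s stay in the denominator. The exact formula in the chart may differ, and the function and constant names are hypothetical.

```python
import math

P3, P2 = 0.452, 0.339
THREE_WEIGHT = P3 / P2  # about 1.33: a 3-pass is "worth" a third more than a 2-pass


def weighted_pqr(history):
    """Weighted pass quality ratio for a history (threes, twos, ones, zeros)."""
    threes, twos, ones, zeros = history
    good = THREE_WEIGHT * threes + twos  # 1-passes are still ignored
    if zeros == 0:
        return math.inf if good > 0 else math.nan
    return good / zeros


print(round(weighted_pqr((10, 0, 0, 5)), 2))  # about 2.67
```

For the [10/0/0/5] passer, the unweighted ratio of 10/5 = 2.0 rises to about 2.67, because all of her good passes are 3s.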

---Option 4---

Chris felt that we should include all the pass ratings in the wPQR (the weighted Pass Quality Ratio from Option 3), so I standardized it to a 1-pass. This reintroduces some of the noise but still gives us another option to consider.
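One way to read "standardized to a 1-pass" is that every grade is weighted by its scoring probability relative to a 1-pass, so a 1-pass counts as exactly 1. The Python sketch below assumes that reading; the weights and the function name are my illustration, not necessarily the formula in the chart.

```python
import math

SCORE_PROB = {3: 0.452, 2: 0.339, 1: 0.21}

# Weight of each grade relative to a 1-pass (assumed interpretation).
WEIGHTS = {grade: p / SCORE_PROB[1] for grade, p in SCORE_PROB.items()}


def full_weighted_pqr(history):
    """Ratio using all non-zero grades for a history (threes, twos, ones, zeros)."""
    threes, twos, ones, zeros = history
    good = WEIGHTS[3] * threes + WEIGHTS[2] * twos + WEIGHTS[1] * ones
    if zeros == 0:
        return math.inf if good > 0 else math.nan
    return good / zeros


print(round(full_weighted_pqr((10, 0, 0, 5)), 2))  # about 4.3
```

Because 1-passes now add to the numerator, a passer who produces many mediocre passes gets some credit for them, which is where the noise creeps back in.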





[Conclusion]

So let us know which options you like and which ones you hate, preferably with the reasoning behind your opinions. Also feel free to suggest a completely different approach. Discuss.





______________________________________________________________________

[1] Gaining widespread popularity with Moneyball (which I highly recommend). The influence of sabermetrics is so mainstream that some refer to advanced statistical analysis in a variety of sports as "sabermetrics." "Sabermetrics" is a mash-up of "SABR" and, obviously, "metrics," wherein SABR stands for the Society for American Baseball Research. Those who analyze advanced basketball stats are not advancing baseball research. I find this particularly bothersome.

[2] Like the completely useless Wins for a baseball pitcher. "Yes, he pitched well, which is his only job, but let's put an equal emphasis on something that he has no control over, like his team's ability to score runs." See Cliff Lee 2012.

[3] Yes, YOU. Well, and me. All of us, really ... you'll see.

[4] I hesitate to use the word "discussion" because you aren't being locked in a room for an hour and forced to participate against your will. Well, sort of.

[5] 0% of the time after a 0-pass, by definition.

[6] For those unfamiliar with "noise," it's like saying that too many other things are affecting the scoring probability to confidently say, for example, that a 3-pass is responsible for scoring points 45% of the time. I'll expand on this in a second.

[7] If someone can finish a season with an infinity Pass Quality Ratio, then good for them.

[8] .452/.339, or about 1.33