I had the pleasure of sitting down with Dr. Stephanie Kovalchik, a statistician with the RAND Corporation, to discuss the work she recently presented at the Joint Statistical Meetings in Seattle. Our conversation also covered broader topics, from the need for a scientific approach in tennis to the divide between industry and academia, and how that divide might be bridged. We both shared the same ‘amusement’ with the IBM Keys to the Match. She recommended the end of this podcast to me: listen to the last 4–5 minutes of Hang Up and Listen as Josh Levin pretty much attacks the IBM Keys to the Match.

Listen at 1:10:00 for the ridiculous nature of IBM Keys to the Match.

In the end, our conversation pretty much turned into a massive brainstorming session, but the central themes of the discussion were clear: 1) there is a lack of proper tennis analytics, 2) detailed point-by-point data would be valuable [not just binary outcome data], and 3) visualization is key. For now, we both agree that it is better to work with what we have and prove how powerful numbers can be, with hopes for improvement in the future.

One example of this power is Kovalchik’s recent presentation entitled, “Are Women Professional Tennis Players Really Less Consistent Than Male Players?”. In some sense, her conclusion is very logical: if you give the mature, established, top-ranked player more points to play, they will have a higher probability of winning. Yet, if you search through the history of tennis media coverage, you will often find the words ‘women’ and ‘inconsistent’ or ‘unpredictable’ together. For example, after the upset of Williams in 2012, The Economist stated:

Another reason not to be overly shocked by Ms Williams’ exit is that it fits with an overall pattern of inconsistency in the upper echelons of women’s tennis — particularly when compared with the far greater predictability of the men’s game.

However, later they added an addendum, acknowledging a simple comment from a fellow reader:

Men’s grand slams are 5 sets, women’s 3 sets. Playing more points should increase the chance that the best player will win the match. This should go some way to explaining the lesser dominance of the top players in the women’s game.

The fact that the original statement was written in the first place is not unique to this piece, and it makes the quantitative analysis in Kovalchik’s work even more important. Thus, I thought I would explain her current work in a little more detail, but if you would like to see all of her figures, look out for an article this October in Significance.

Six Measures of Consistency

1. Upsets

When I go through a bracket like in Tennis Note #9, it is really easy to just pick the seeded players, aka the higher ranked players. In fact, there is an expectation for these higher ranked players to perform at a superior level. What happens if you examine how frequently these higher ranked players lose at the hands of lower ranked players?

If you look at the difference between the winner rank and loser rank, an upset will have a positive difference. The real question: how frequently do the differences occur in ATP vs. WTA?

This is the first parameter that found a difference between the women and men BUT only at the Grand Slams.
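To make the rank-difference idea concrete, here is a minimal sketch. It assumes each match is stored as a hypothetical `(winner_rank, loser_rank)` pair, where a smaller number means a better ranking. The function name and the sample data are mine, purely for illustration, and are not taken from Kovalchik’s analysis.

```python
def upset_rate(matches):
    """Share of matches where the winner was the lower ranked player.

    Each match is a (winner_rank, loser_rank) pair. Rank 1 is the best,
    so an upset has winner_rank - loser_rank > 0.
    """
    if not matches:
        return 0.0
    upsets = sum(1 for winner_rank, loser_rank in matches
                 if winner_rank - loser_rank > 0)
    return upsets / len(matches)


# Made-up example: two of these four matches are upsets.
sample_matches = [(1, 30), (45, 3), (2, 10), (60, 8)]
print(upset_rate(sample_matches))
```

Running the same function separately on ATP and WTA matches, and separately on Grand Slam and non-Grand Slam events, would give the kind of comparison described above.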

2. Streaks

Remember Tennis Note #7? Everyone was freaking out about Rafael Nadal and his decline on clay, and I decided to visualize this decline by illustrating the number of consecutive wins before a loss, specifically for clay court tournaments.

Note: This figure is outdated but if you want to imagine this year, just add four more bars that represent 4, 2, 4, and 5 wins. His consistency on clay has clearly declined.

Nadal’s 81 consecutive wins on clay from 2005 to 2007 are the mark of incredible consistency but, as you can see, his ability to reproduce even a third of that result has disappeared. I did this analysis for one player, but Kovalchik looked at the exact same parameter, regardless of surface and over half a decade, for both the WTA and ATP. She found no differences between the men and women in terms of average win streaks for each rank.
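Counting consecutive wins before a loss is simple enough to sketch. The version below assumes a player’s results arrive as a chronological sequence of `"W"` and `"L"` characters, which is my own hypothetical encoding and not the format of any real data source. It also counts a streak that is still running at the end of the sequence.

```python
def win_streaks(results):
    """Return the lengths of consecutive-win runs in a W/L sequence.

    A run ends at a loss. Back-to-back losses do not create
    zero-length streaks. A run still open at the end is included.
    """
    streaks = []
    current = 0
    for result in results:
        if result == "W":
            current += 1
        else:
            if current > 0:
                streaks.append(current)
            current = 0
    if current > 0:
        streaks.append(current)
    return streaks


# Made-up sequence of results, oldest match first.
streaks = win_streaks("WWLWLLWWW")
print(streaks)
print(sum(streaks) / len(streaks))  # average streak length
```

Averaging these streak lengths for the players at each rank is one plausible way to arrive at an "average win streak for each rank".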

3. Letdowns

Raise your hand if you were disappointed the first time Rafael Nadal had an early exit at Wimbledon after winning Roland Garros. I know the first time it happened, I was certainly disappointed. This is a letdown: a first or second round loss after being a finalist or winner. When Kovalchik compiled all the letdowns from the past five years for the WTA and ATP, she found statistically significant differences, but again only at Grand Slam events.
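The letdown definition above translates into a short counting rule. This sketch assumes a player’s tournament results are listed in chronological order using hypothetical round codes of my own choosing (`"R1"`, `"R2"`, `"QF"`, `"SF"`, `"F"` for a losing finalist, `"W"` for the title). It also assumes "after" means the very next tournament played, which may or may not match the definition in Kovalchik’s work.

```python
DEEP_RUN = {"F", "W"}        # finalist or winner (assumed codes)
EARLY_EXIT = {"R1", "R2"}    # first or second round loss (assumed codes)


def count_letdowns(results):
    """Count early exits that immediately follow a final or a title."""
    return sum(
        1
        for previous, current in zip(results, results[1:])
        if previous in DEEP_RUN and current in EARLY_EXIT
    )


# Made-up season: a title followed by R2, and a final followed by R1.
season = ["W", "R2", "QF", "F", "R1", "W", "SF"]
print(count_letdowns(season))
```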

4. Variations in Match Win Percentage

Animated GIF to explain how the data was compared for this parameter of consistency.

In the end, it does not matter how close the sets are or how hard you fought because there are only two outcomes: win or lose.

Thus, Kovalchik determined the win stability for each rank based on the variation in win percentages across five seasons. You expect the higher ranked player to be more stable at these tournaments. Again, differences were only found at Grand Slams.

5. Variations in Serve/Return Ability

In a similar manner, you can look at the stability of higher ranked players in serve and return. As stated by Kovalchik,

These mixed findings indicate men’s serve and return skills are not universally more stable than women’s.

6. Reversals in Game Spread

Forget the first statement I made in #4 for a second. Even if a player gets the win, there is a difference between straight sets and completely lopsided results. Kovalchik’s sixth parameter examines the frequency of these up-and-down matches. Hopefully, the example I provided helps you interpret the results: the median number of game spread reversals for each tour was 2, and the ATP actually had 0.1 more spread reversals at Grand Slam tournaments, suggesting a small but greater level of inconsistency.

Match Format, NOT Gender

Thus far, three of these six parameters of consistency show a difference between the WTA and ATP at the Grand Slam level. We already discussed how reasonable this seems, since a longer match gives the veteran player more time to turn a match around, but how do you illustrate this quantitatively?

Consider a game of tennis as a tree with several point outcomes, similar to Tennis Note #8. Now, a game tree like this gives you the probabilities of winning the point on serve (or return) based on the past result. It can further inform people about the probability of a player winning a game, set, and match. However, before a match begins, how can you determine these probabilities?

Kovalchik assumed set outcomes are independent and identically distributed [IID]. In simple terms, treat each set outcome like a coin toss, except that the probability of winning a set is constant for a particular player and based on that player’s ability. She is not the first person to use this assumption successfully in tennis. Some even do this analysis at the point level, which sounds a little less believable, but the analysis at the set level is much more plausible. I will not include all the math involved, but if you really want to buckle down, I will link the article once it is released or you can do your own research. I will tell you the result: if a player has a high probability of winning a set, then their chances of winning a best of 5 match increase by up to 10% compared to a best of 3 match. But this is all very theoretical. I mean, do you really expect Novak Djokovic to perform under the same probability for each set? (*whispers yes*) Thus, Kovalchik took the last five seasons to get the actual match win probabilities for each rank and determined the match advantage for ATP vs. WTA. This is the data that has been publicized already, but hopefully you now have a better understanding of what it means.
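If you want to play with the coin-toss idea yourself, here is a minimal sketch of the set-level IID calculation. It is my own illustration using the standard binomial counting argument, not Kovalchik’s code, and the function name and the set win probabilities in the loop are arbitrary assumptions. Because it is a toy version, its numbers will not necessarily match the figures in her analysis.

```python
from math import comb


def match_win_probability(p_set, best_of):
    """Probability of winning a best-of-N match under IID sets.

    p_set is the (constant) probability of winning any single set.
    The player must win `sets_needed` sets; we sum over the number
    of sets lost before the final, match-clinching set.
    """
    sets_needed = best_of // 2 + 1
    q_set = 1.0 - p_set
    return sum(
        comb(sets_needed - 1 + lost, lost) * p_set**sets_needed * q_set**lost
        for lost in range(sets_needed)
    )


# Compare the two formats for a few assumed set win probabilities.
for p in (0.5, 0.6, 0.7, 0.8):
    bo3 = match_win_probability(p, 3)
    bo5 = match_win_probability(p, 5)
    print(f"p_set={p:.1f}  best of 3: {bo3:.3f}  best of 5: {bo5:.3f}")
```

The pattern to look for is that an evenly matched player (0.5 per set) gains nothing from the longer format, while a player who is favored in each set wins more often over five sets than over three.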