The statistical component of sport has always provided a fascinating way to analyze performance and success. This might simply be the final score, but for some sports, such as football, baseball, cricket, golf and tennis, meaningful analysis of every facet of the game and of a player’s or team’s actions is part of the essence of the game itself. It is as common to see statistics and graphical summaries of the action reported as it is to see the action itself, and these provide insight into strategy as well as an explanation of the outcome. In this blog entry we explore the London Olympics Gold Medal tennis match between Roger Federer and Andy Murray to show how you can use GIS to identify patterns within a match that traditional non-geographical analysis and display techniques may not expose.

Created using ArcGIS, figure 1 shows where each player played a winning shot and how each player moved during every point of the gold medal match.

Figure 1: An infographic showing the player movement and winning shot positions from the Olympic Gold Medal Match between Roger Federer and Andy Murray.

Whilst figure 1 certainly carries a lot of visual impact, it doesn’t actually tell us a whole lot. The player movement lines overlap one another, making it hard to distinguish which line relates to which point. In many cases we cannot tell the direction of movement because there are no directional arrows. The infographic doesn’t show where the winning stroke landed or the direction of the shot, and it fails to show the temporal component of the match.

Figure 2: The complete data set from the Olympic Gold Medal Match. 1,708 point locations were collected from the three-set match.

Capturing the data

For the study we captured the tennis match data using ArcScene 10.1 and video footage of the match (see figure 3). We built a court at a scale of 1:1 in its correct geographic location (Centre Court at Wimbledon) and were able to quickly capture the location of each player’s stroke and the corresponding ball bounce entirely from the video footage. At each location we collected a set of key attributes: who played the stroke, the type of stroke, the stroke number, point number, game number and set number, and who was serving. The captured data provides a statistical summary of every shot in the match.

Figure 3: Video footage of the match in ArcScene. The red dots represent the player’s stroke position and ball bounce. The green lines represent the direction of ball travel for each shot.

By using ArcScene we were able to plot the players’ positions and ball bounces to within ±20 cm using the 3D editing tools. We approximated the camera angle of the video footage and set our data view to match. This made data capture faster and more accurate than in a 2D environment, because we were able to continuously match the changing camera view in the video by using the Navigate Scene control in ArcScene. It also helped us counter the scale distortion in the camera view when capturing points at the end of the court furthest from the camera.

Once all of the point data was captured, we used the XY To Line tool to connect the points, using the shot, point, game and set number attributes to identify which points belong together. The resulting lines are instrumental in allowing us to visualize stroke patterns (as you will see later in the blog entry). We ran the same XY To Line process to create player movement lines.
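To make the idea of connecting points by their attributes concrete, here is a minimal sketch in plain Python. It is not the XY To Line tool and not the workflow we ran in ArcGIS; it only illustrates the principle of pairing a stroke location with its ball bounce using the set, game, point and stroke numbers. The class, the field names, the local court coordinates in metres and the sample values are all assumptions made up for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CourtPoint:
    """One captured location (field names are hypothetical)."""
    set_no: int
    game_no: int
    point_no: int
    stroke_no: int
    kind: str  # "stroke" (where the ball was hit) or "bounce"
    x: float   # assumed local court coordinates, in metres
    y: float


def build_shot_lines(points):
    """Pair each stroke with its bounce, keyed by (set, game, point, stroke)."""
    grouped = defaultdict(dict)
    for p in points:
        key = (p.set_no, p.game_no, p.point_no, p.stroke_no)
        grouped[key][p.kind] = (p.x, p.y)

    lines = {}
    for key in sorted(grouped):
        ends = grouped[key]
        # Only build a line when both ends of the shot were captured.
        if "stroke" in ends and "bounce" in ends:
            lines[key] = (ends["stroke"], ends["bounce"])
    return lines


# Made-up sample coordinates for illustration only, not match data.
sample = [
    CourtPoint(1, 1, 1, 1, "stroke", 1.0, -12.0),
    CourtPoint(1, 1, 1, 1, "bounce", -2.5, 5.5),
    CourtPoint(1, 1, 1, 2, "stroke", -3.0, 12.5),
    CourtPoint(1, 1, 1, 2, "bounce", 3.5, -9.0),
    CourtPoint(1, 1, 1, 3, "stroke", 4.0, -12.5),  # no bounce recorded
]
lines = build_shot_lines(sample)
```

Each entry in `lines` maps a shot key to a start and end coordinate, which is the same kind of start/end pairing that a line-building tool needs as input. Shots with only one captured end are skipped rather than guessed.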

Visualising the data

Statistics from the match tell us that Andy Murray made a total of 18 winners to Roger Federer’s 13. What these statistics don’t tell us is where those winners occurred, which stroke produced each winner, when each winner occurred and what led up to the winning shot. They also fail to show us any potential stroke patterns during the match. By capturing and storing all of the match data in a file geodatabase (figure 4) we are able to take advantage of the geo-location of these winners and create visualizations that tell a far richer story than single snapshots allow.
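As a rough illustration of why per-shot attributes matter, the sketch below tallies winners by any combination of attributes rather than by player alone. It assumes each winner is stored as a simple record; the field names and the five sample records are hypothetical and are not data from the match.

```python
from collections import Counter

# Made-up sample records for illustration only, not match data.
winners = [
    {"player": "Murray", "stroke": "forehand", "set_no": 1},
    {"player": "Murray", "stroke": "backhand", "set_no": 2},
    {"player": "Murray", "stroke": "forehand", "set_no": 3},
    {"player": "Federer", "stroke": "forehand", "set_no": 1},
    {"player": "Federer", "stroke": "serve", "set_no": 3},
]


def tally(records, *fields):
    """Count records by the given combination of attribute names."""
    return Counter(tuple(r[f] for f in fields) for r in records)


by_player = tally(winners, "player")                   # the headline statistic
by_player_stroke = tally(winners, "player", "stroke")  # which stroke won the point
by_player_set = tally(winners, "player", "set_no")     # when in the match it happened
```

The first tally reproduces the kind of headline figure quoted above; the other two answer the "which stroke" and "when" questions that a single total cannot.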

Figure 4: Using a file geodatabase to store sports data in ArcGIS

One of the challenges in dealing with sports data is that there are many instances of similar events occurring at the same or similar locations over relatively small periods of time. This often results in very tight clusters of points over very small areas of your court, pitch or field. If your data has an element of connectivity, you will additionally have overlapping lines along similar bearings and distances, or lines that run in completely random directions, depending on the type of sport you are analyzing. The challenge is how to represent and compare this information meaningfully.

One way to make sense of so many overlapping points and lines is to use a visualization technique (often promoted by Edward Tufte) called Small Multiples (see figure 5). Small multiples use a series of common basemaps (in our case a tennis court) with different slices of data on top of each map. The maps are arranged in a logical sequence, much like animated movie frames. Small multiples are useful to disaggregate your data, reducing the visual complexity and quantity of information so that it can more easily be seen and interpreted.
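The data preparation behind small multiples can be sketched in a few lines of Python: split the records into slices by one attribute, then assign each slice a cell in a grid that is read in sequence. This is only an outline under stated assumptions, not how the maps in this entry were produced; the function names, the `set_no` field and the sample records are hypothetical, and the drawing itself (the same court basemap repeated in every cell) is left out.

```python
from collections import defaultdict


def split_into_panels(records, key):
    """Group records into one slice per distinct value of `key`, in sorted order."""
    panels = defaultdict(list)
    for record in records:
        panels[record[key]].append(record)
    return dict(sorted(panels.items()))


def grid_positions(n_panels, n_cols):
    """Return a (row, col) cell for each panel, filling the grid in reading order."""
    return [divmod(i, n_cols) for i in range(n_panels)]


# Made-up sample records for illustration only, not match data.
shots = [
    {"set_no": 2, "player": "Murray"},
    {"set_no": 1, "player": "Federer"},
    {"set_no": 3, "player": "Murray"},
    {"set_no": 1, "player": "Murray"},
]

panels = split_into_panels(shots, "set_no")
cells = grid_positions(len(panels), n_cols=2)
```

Here `panels` holds one slice per set and `cells` places them left to right, top to bottom, so every panel can share the same court extent and be compared at a glance.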