This post was auto-generated from an IPython notebook. By default only the output cells are shown.

This is part of a series of posts analyzing about 1 million games of Dungeon Crawl Stone Soup. Check out the crawl-coroner project on GitHub for more information.

In [1]:
    %matplotlib inline
    from __future__ import division
    from matplotlib import pyplot as plt
    import pandas as pd
    import numpy as np
    from IPython.display import display_javascript
    from vis_common import load_store, load_games

    FS = (10, 6)  # Reasonable default figsize
    DECILES = list(np.arange(0.1, 1.0, 0.1))
    store = load_store()
    g = games = load_games()

In this post, I'll be taking a step back and doing some analysis at the level of players of DCSS, rather than individual games. How many players do we have?

In [2]:
    print "{:,} distinct players among {:,} games".format(
        g['pid'].unique().shape[0], len(g)
    )
    print
    print "Distribution of games played per player..."
    print
    print g.groupby('pid').size().describe()

    26,004 distinct players among 1,326,162 games

    Distribution of games played per player...

    count    26004.000000
    mean        50.998385
    std        201.641369
    min          1.000000
    25%          1.000000
    50%          3.000000
    75%         20.000000
    max       8170.000000
    dtype: float64

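The "average" player has about 51 games, though this statistic is pretty misleading: the mean is dragged up by a handful of extremely prolific players (the maximum is 8,170 games). The median is just 3, so a majority of players can count their games on one hand.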

In [3]:
    # For each distinct number of games, count how many players played exactly that many
    count_counts = g.groupby('pid').size().value_counts().to_frame('nplayers')
    count_counts['ngames'] = count_counts.index
    fig, ax = plt.subplots(figsize=FS)
    ax.scatter(count_counts['ngames'], count_counts['nplayers'],
               marker='.', s=25, alpha=.5)
    onegame = count_counts.loc[1]['nplayers']
    ax.annotate(
        "1 game: {} players".format(onegame),
        xy=(1.1, onegame*.98), xytext=(2, onegame*.75),
        bbox=dict(boxstyle='round,pad=0.5', fc='yellow', alpha=0.5),
        arrowprops=dict(arrowstyle='->', connectionstyle='arc3,rad=0'),
    )
    ax.set_xlabel('ngames')
    ax.set_ylabel('nplayers')
    ax.set_title("Games played per player")
    ax.loglog();

A straight line on a log-log plot. Looks like a pretty heavy-tailed distribution.
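A straight line on log-log axes means log(nplayers) is roughly linear in log(ngames), i.e. nplayers is proportional to ngames raised to some power. Here's a minimal sketch (plain Python 3, made-up numbers rather than the DCSS data) of how one might put a number on that slope. The helper name `loglog_slope` is my own, and a least-squares fit on logged values is only a rough diagnostic, not a rigorous test for a power law.

```python
import math

def loglog_slope(xs, ys):
    """Least-squares slope of log10(y) against log10(x)."""
    lx = [math.log10(x) for x in xs]
    ly = [math.log10(y) for y in ys]
    mx = sum(lx) / len(lx)
    my = sum(ly) / len(ly)
    cov = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    var = sum((a - mx) ** 2 for a in lx)
    return cov / var

# Synthetic illustration only: an exact power law, nplayers = 1000 * ngames ** -2.
ngames = [1, 2, 5, 10, 20, 50, 100]
nplayers = [1000.0 * n ** -2 for n in ngames]

slope = loglog_slope(ngames, nplayers)
print("estimated exponent: {:.2f}".format(slope))
```

The more negative the slope, the faster the number of players drops off as the game count grows.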

In [4]:
    fig, ax = plt.subplots(figsize=FS)
    counts = g.groupby('pid').size().values
    # Cumulative, normalized histogram with one bin per game count
    n, bins, patches = ax.hist(
        counts,
        np.arange(1, 10**4),  # np.logspace(0, 4, 150),
        normed=1, histtype='step', cumulative=1,
    )
    # First bin at which the cumulative share of players reaches 50%
    midpoint = (n >= .5).argmax()
    midpct = 100 * n[midpoint]
    ax.set_xscale('log');
    ax.axhline(y=.5, color='black', linestyle=':', linewidth=.5);
    ax.set_xlabel('n games')
    ax.set_ylabel('P(games played <= n)');

In [5]:
    print '{:.1f}% of players have <= {} games'.format(midpct, midpoint + 1)

    51.4% of players have <= 3 games
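Because the histogram bins start at 1 and are one game wide, bin index i holds the cumulative share of players with at most i + 1 games, which is why the cell prints `midpoint + 1`. The same lookup can be sketched without a histogram. This is plain Python 3 on a made-up list of per-player game counts, and `smallest_n_covering` is a name I've invented for the illustration.

```python
from collections import Counter

def smallest_n_covering(counts, fraction=0.5):
    """Return (n, share): the smallest n such that at least `fraction` of
    players have <= n games, plus the actual share of players with <= n."""
    if not counts:
        raise ValueError("need at least one player")
    tally = Counter(counts)
    total = len(counts)
    seen = 0
    for n in sorted(tally):
        seen += tally[n]
        if seen / total >= fraction:
            return n, seen / total
    raise ValueError("fraction must be at most 1")

# Toy per-player game counts (not the real data).
toy_counts = [1, 1, 1, 2, 3, 3, 8, 20, 150, 400]
n, share = smallest_n_covering(toy_counts)
print("{:.1f}% of players have <= {} games".format(100 * share, n))
```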

What about the distribution of wins across players?

In [6]:
    gpp = games_per_pid = g.groupby('pid').size()
    wpp = wins_per_pid = g.groupby('pid')['won'].sum()
    print "{:,} / {:,} players have won at least one game".format(
        (wpp > 0).sum(), len(wpp)
    )

    2,111 / 26,004 players have won at least one game

That statistic certainly speaks to DCSS's difficulty. About 92% of players (23,893 of 26,004) have never won a game.

In [7]:
    print "Do winners account for more games?"
    # pids of players with at least one win
    winpids = wpp.where(lambda wins: wins > 0).dropna().index
    print "Avg. number of games played by...\n\twinners: {:.1f}\n\tnon-winners: {:.1f}".format(
        g[g['pid'].isin(winpids)].groupby('pid').size().mean(),
        g[~g['pid'].isin(winpids)].groupby('pid').size().mean(),
    )
    winnergames = g['pid'].isin(winpids).sum()
    print "{:,} out of {:,} games ({:.1f}%) were played by players who have won a game".format(
        winnergames, len(g), 100 * winnergames / len(g),
    )

    Do winners account for more games?
    Avg. number of games played by...
            winners: 305.3
            non-winners: 28.5
    644,434 out of 1,326,162 games (48.6%) were played by players who have won a game
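So roughly 8% of players (the 2,111 winners) account for nearly half of all games. To make the bookkeeping behind those numbers explicit, here's a small sketch of the same split in plain Python 3. The records are made up and the variable names are only for illustration.

```python
from collections import defaultdict

# Toy (pid, won) records; made-up values, not the real dataset.
games = [
    (1, False), (1, False), (1, True), (1, False),
    (2, False), (2, False),
    (3, False),
    (4, True), (4, True),
]

games_per_pid = defaultdict(int)
wins_per_pid = defaultdict(int)
for pid, won in games:
    games_per_pid[pid] += 1
    wins_per_pid[pid] += int(won)

# Players with at least one win, and the games that belong to them.
winners = {pid for pid, wins in wins_per_pid.items() if wins > 0}
winner_games = sum(n for pid, n in games_per_pid.items() if pid in winners)

avg_winner = winner_games / len(winners)
avg_nonwinner = (len(games) - winner_games) / (len(games_per_pid) - len(winners))
share = winner_games / len(games)

print("winners: {:.1f}, non-winners: {:.1f}".format(avg_winner, avg_nonwinner))
print("{:.1f}% of games were played by winners".format(100 * share))
```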

In [8]:
    gameticks = [1, 2, 5, 10, 25, 50, 100, 250, 500, 1000, 2000, 4000, 10000]
    winticks = [1, 2, 5, 10, 25, 50, 100]

    def scatter_games_v_wins(minwins=0, jitter=False):
        fig, ax = plt.subplots(figsize=FS)
        wins = wpp[wpp >= minwins]
        winjitter = 0
        if jitter:
            # Uniform noise of at most +/- 0.275 wins per point
            winjitter = (np.random.rand(len(wins)) - .5) * .55
        ax.scatter(gpp.loc[wins.index], wins.values + winjitter, alpha=.1)
        # symlog lets the y axis include 0 wins
        ax.set_yscale('symlog' if minwins == 0 else 'log')
        ax.set_xscale('log')
        ax.set_xticks(gameticks)
        ax.set_yticks(winticks);
        ax.set_xticklabels([str(i) for i in gameticks])
        ax.set_yticklabels([str(i) for i in winticks]);
        ax.set_xlabel('Games')
        ax.set_ylabel('Wins')
        ax.set_title("Games vs. wins per player")
        return ax

In [9]:
    scatter_games_v_wins();

Here I'm placing a dot for each player, according to their number of games played and number of wins (still on a log-log scale). It's hard to get a sense of what's going on in those stripes that form at 0/1/2 wins. We can maybe get a better sense of the density by randomly jittering each point in the y direction:

In [10]:
    scatter_games_v_wins(jitter=1);
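The jitter in `scatter_games_v_wins` is uniform noise of width 0.55 centred on zero, so each point moves by at most 0.275 wins and the stripes for neighbouring win counts spread out without overlapping. Here's a standalone sketch of that idea in plain Python 3. The function name `jitter_wins` and the toy values are mine.

```python
import random

def jitter_wins(wins, width=0.55, seed=None):
    """Add uniform noise in [-width/2, width/2) to each win count."""
    rng = random.Random(seed)
    return [w + (rng.random() - 0.5) * width for w in wins]

# Toy win counts forming stripes at 0, 1 and 2 wins.
wins = [0, 0, 0, 1, 1, 2]
jittered = jitter_wins(wins, seed=0)

for original, moved in zip(wins, jittered):
    print("{} -> {:.3f}".format(original, moved))
```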

We already knew there were a lot of players with 0 wins (more than 90% of all players). But it's interesting to see that this isn't just a result of having lots of players who have played a tiny number of games. There are many people who have played 100+ games without a win. Let's turn our attention to just the players with at least 1 win.

In [11]:
    ax = scatter_games_v_wins(minwins=1, jitter=1)
    ax.grid(axis='y', lw=.5);
    ymin, ymax = ax.get_ylim()
    xmin, xmax = ax.get_xlim()
    ymin = 1
    cmap = plt.get_cmap('gist_heat')
    winrate_lines = [.001, .01, .1, 1.0]
    # Draw one diagonal guide per win rate, clipped to the plot area
    for i, wr in enumerate(reversed(winrate_lines)):
        p1 = [ymin / wr, ymin]
        if xmax * wr <= ymax:
            p2 = [xmax, xmax * wr]
        else:
            p2 = [ymax / wr, ymax]
        xs, ys = zip(p1, p2)
        ax.plot(
            xs, ys,
            label=('{}% win rate'.format(int(wr * 100)) if wr >= .01
                   else '{:.1f}% win rate'.format(wr * 100)),
            color=cmap(i / len(winrate_lines)),
            linewidth=2, linestyle='-',
        )
    ax.legend();

Eyeballing the diagonal guides, it seems like the greatest density is around the 1% win rate, with the vast majority falling somewhere between 0.1% and 10%. There appear to be a fair number of players with a 100% win rate over 1 game.
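Each guide is the line wins = rate × games, which on log-log axes is a diagonal of slope 1 shifted by log(rate). Rather than eyeballing, one could count how many players land between adjacent guides. Here's a rough sketch in plain Python 3. The helper `winrate_band` and the toy records are invented for illustration (apart from the 60-of-83 record that shows up later in this post), so the counts say nothing about the real data.

```python
from collections import Counter

GUIDES = (0.001, 0.01, 0.1, 1.0)  # the win rates drawn as guide lines

def winrate_band(wins, games, guides=GUIDES):
    """Label the band between adjacent guide lines that a win rate falls in."""
    rate = wins / games
    lower = "0"
    for guide in guides:
        if rate < guide:
            return "{} to {}".format(lower, guide)
        lower = str(guide)
    return ">= {}".format(guides[-1])

# Toy (wins, games) records for players with at least one win.
players = [(1, 2000), (1, 100), (3, 40), (60, 83), (1, 1)]
band_counts = Counter(winrate_band(w, n) for w, n in players)

for band, count in sorted(band_counts.items()):
    print("{:>14}: {}".format(band, count))
```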

In [12]:
    perfect = gpp == wpp
    print "{} players have a perfect record (avg. number of games played = {})".format(
        perfect.sum(), gpp[perfect].mean(),
    )

    37 players have a perfect record (avg. number of games played = 1.0)

Far more impressive are those few dots with a >50% win rate over 50+ games.

In [13]:
    # Win rate per player, restricted to players with at least 50 games
    wrpp = g[g['pid'].isin(gpp[gpp >= 50].index)].groupby('pid')['won'].mean().rename('winrate')
    tops = wrpp.sort_values(ascending=0).head(3)
    players = store['players']
    print pd.concat([
        players.loc[tops.index].rename('name'),
        wpp.loc[tops.index].rename('wins').astype(int),
        gpp.loc[tops.index].rename('games'),
        tops.round(2),
    ], axis=1)

                      name  wins  games  winrate
    pid
    17003       jumbajumba    60     83     0.72
    6203   makemeasandwich    75    115     0.65
    8250          hellmonk    30     52     0.58

You go, jumbajumba!

Winrates

In [14]:
    import winrate
    # Attach each player's overall win rate to every one of their games
    g['raw winrate'] = g.groupby('pid')['won'].mean().loc[g['pid']].values
    g['wr'] = winrate.smoothed_winrate(g)
    #print "Distribution of raw/smoothed winrates over players...\n"
    #print (100*g.groupby('pid')[['raw winrate', 'wr']].mean()).rename(columns={'wr': 'smoothed winrate'})\
    #    .describe([.1, .25, .5, .75, .9, .95, .975, .99, .995])
    rwrpp = g.groupby('pid')['raw winrate'].first()
    wrpp = g.groupby('pid')['wr'].first()

In [15]:
    fig, ax = plt.subplots(figsize=FS)
    order = -1
    #ax.step(np.sort(wrpp.values)[::order], np.arange(len(wrpp))+1, label='smoothed', alpha=.7, lw=0, marker='.');
    #ax.step(np.sort(rwrpp.values)[::order], np.arange(len(rwrpp))+1, label='raw');
    ax.plot(
        np.sort(rwrpp.values)[::order], np.arange(len(rwrpp)) + 1,
        label='raw', alpha=.7, lw=0, marker='.', ms=1,
    );
    xlin = 10**-2
    ax.set_xscale('symlog', linthreshx=xlin);
    ax.set_xlabel('win rate')
    ax.set_ylabel('num players with win rate >= x')
    ax.set_ylim(bottom=0);

We already know >90% of players have a 0% winrate. We might get a less L-shaped curve if we map each game to the winrate of the corresponding player, and look at the distribution of win rates across games. (This is motivated by questions we may want to ask like "Are deep dwarves favoured by skilled players?", or "Do good players clear the Elven Halls more often than newbs?". A natural way to answer these kinds of questions is to select the games where foo happened, and look at the posterior distribution of winrates over those games.)
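To make the "map each game to the winrate of its player" step concrete, here's a minimal sketch in plain Python 3. The records, the `species` field name and the species labels are all made up for illustration, and `mean_player_rate` is a hypothetical helper rather than anything from the crawl-coroner code.

```python
from collections import defaultdict

# Made-up toy records: (pid, species, won).
games = [
    (1, "DD", True), (1, "DD", False), (1, "Mi", True), (1, "DD", True),
    (2, "Mi", False), (2, "Mi", False), (2, "DD", False), (2, "Mi", False),
    (3, "Mi", False),
]

played = defaultdict(int)
won_count = defaultdict(int)
for pid, _, won in games:
    played[pid] += 1
    won_count[pid] += int(won)
rate_by_pid = {pid: won_count[pid] / played[pid] for pid in played}

# One value per game: the overall win rate of whoever played it.
per_game_rate = [rate_by_pid[pid] for pid, _, _ in games]

def mean_player_rate(species):
    """Average player win rate over the games played as `species`."""
    rates = [rate_by_pid[pid] for pid, sp, _ in games if sp == species]
    return sum(rates) / len(rates)

print("DD: {:.4f}, Mi: {:.4f}".format(mean_player_rate("DD"), mean_player_rate("Mi")))
```

Note that a player with many games contributes many values to `per_game_rate`, so prolific players weigh more heavily in the per-game distribution than in the per-player one.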

In [16]:
    fig, ax = plt.subplots(figsize=FS)
    order = -1
    #ax.step(np.sort(g['wr'])[::order], np.arange(len(g))+1, label='smoothed', alpha=.7);
    #ax.step(np.sort(g['raw winrate'])[::order], np.arange(len(g))+1, label='raw', alpha=.7);
    ax.plot(
        np.sort(g['raw winrate'])[::order], np.arange(len(g)) + 1,
        label='raw', alpha=.7, lw=0, marker='.', ms=1,
    );
    xlin = rwrpp[rwrpp > 0].min()
    xlin = 10**-2
    ax.set_xscale('symlog', linthreshx=xlin)
    ax.set_xlabel('win rate')
    ax.set_ylabel('num games with win rate >= x')
    #ax.legend()
    ax.set_ylim(bottom=-10000);

This is definitely smoother than the previous curve, but using raw winrate to measure skill leads to some silly outcomes, particularly at the extremes.

First, the curve extends all the way to 100%, because of those 37 players with records of 1 win and 0 losses. We shouldn't be so naive as to believe that if we had them play another 10 games each, they'd win them all.

Second, we still map about half of all games to the same win rate of 0%. It'd be nice to be able to differentiate among them. Intuitively, it seems like a player with 0 wins and 1 loss should be ranked ahead of one with 0 wins and 300 losses.

This problem is involved enough that I devoted a whole separate post to it: Estimating player skill in DCSS. Here's a sneak peek of the smoothed estimate of win rate I came up with, compared to raw win rate.

In [17]:
    fig, ax = plt.subplots(figsize=(12, 8))
    order = -1
    plot_kwargs = dict(lw=0, alpha=.6)
    n = 1000
    seed = 1337
    if n:
        # Plot a random sample of n games to keep the figure readable
        ax.plot(np.sort(g['wr'].sample(n, random_state=seed))[::order], np.arange(n) + 1,
                label='smoothed', marker='d', ms=4, **plot_kwargs);
        ax.plot(np.sort(g['raw winrate'].sample(n, random_state=seed))[::order], np.arange(n) + 1,
                label='raw', marker='.', ms=5, **plot_kwargs);
    else:
        # Override alpha in a copy; passing alpha=1 alongside **plot_kwargs
        # would raise a duplicate-keyword TypeError
        full_kwargs = dict(plot_kwargs, alpha=1)
        ax.plot(np.sort(g['wr'])[::order], np.arange(len(g)) + 1,
                label='smoothed', ms=40, **full_kwargs);
        ax.plot(np.sort(g['raw winrate'])[::order], np.arange(len(g)) + 1,
                label='raw', ms=5, **full_kwargs);
    xlin = g['wr'].min()
    #xlin = rwrpp[rwrpp > 0].min()
    ax.set_xscale('symlog', linthreshx=xlin)
    ax.set_xlabel('win rate')
    ax.set_ylabel('num games with win rate >= x')
    ax.legend();
    #ax.set_ylim(bottom=-10000);

The most drastic effect is that all the zeroes are gone. Even if a player has lost 500 games in a row, we're going to optimistically guess that, if they played another 9500, they would probably win one. Being able to distinguish the mediocre from the truly abysmal will turn out to be useful - in a future post, I'll be using these smoothed winrates to mythbust some Crawl newbie traps.
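The actual estimator lives in the separate post and isn't reproduced here. To give a flavour of how any smoothing of this kind removes the zeroes and the 100% records, here's a generic additive-smoothing sketch in plain Python 3. To be clear, this is not the method behind `winrate.smoothed_winrate`; the function name and both prior values are arbitrary choices of mine.

```python
def additive_smoothed_rate(wins, games, prior_wins=1.0, prior_games=100.0):
    """Win-rate estimate that acts as if every player had already played
    `prior_games` extra games and won `prior_wins` of them.
    Both prior values are arbitrary, illustrative assumptions."""
    return (wins + prior_wins) / (games + prior_games)

# (wins, games) records mentioned in this post.
records = [(0, 1), (0, 300), (1, 1), (60, 83)]
for wins, games in records:
    print("{:>2} wins / {:>3} games: raw {:.3f}, smoothed {:.4f}".format(
        wins, games, wins / games, additive_smoothed_rate(wins, games)))
```

With these made-up priors, a 0-for-1 player ends up ranked ahead of a 0-for-300 player, and a 1-for-1 player is pulled far below 100%, which is the qualitative behaviour described above.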