Introduction

I started a scene at my university this past year, and one of the problems we had was that there was no way for us to seed our tournaments. After much consideration (and a few failed systems), we decided to go with Glicko2, which is also used by the Free Internet Chess Server, Chess.com, the Australian Chess Federation, and Guild Wars 2. The software in this post gives you an easy way to process Challonges and .txt files, extract the results, calculate ratings, and from those ratings produce proper seeding.



About Glicko2





Many of you have heard of the Elo system, used for FIDE chess ratings. Glicko2 is an improvement on that system: it tracks how uncertain each player's rating is and takes that uncertainty into account when results update the ratings. As put by Mark Glickman, the creator of Glicko and Glicko2:

``The problem with the Elo system that the Glicko system addresses has to do with the reliability of a player’s rating. Suppose two players, both rated 1700, played a tournament game with the first player defeating the second. Under the US Chess Federation’s version of the Elo system, the first player would gain 16 rating points and the second player would lose 16 points. But suppose that the first player had just returned to tournament play after many years, while the second player plays every weekend. In this situation, the first player’s rating of 1700 is not a very reliable measure of his strength, while the second player’s rating of 1700 is much more trustworthy. My intuition tells me that (1) the first player’s rating should increase by a large amount (more than 16 points) because his rating of 1700 is not believable in the first place, and that defeating a player with a fairly precise rating of 1700 is reasonable evidence that his strength is probably much higher than 1700, and (2) the second player’s rating should decrease by a small amount (less than 16 points) because his rating is already precisely measured to be near 1700, and that he loses to a player whose rating cannot be trusted, so that very little information about his own playing strength has been learned.''

Full information about Glicko and Glicko2 may be found here: http://www.glicko.net/glicko.html
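To make the quote concrete, here is a minimal sketch of a single-game update using the original Glicko formulas (not Glicko2, which adds a volatility term on top, and not the code in the download). The RD values of 300 for the returning player and 50 for the weekly regular are my own illustrative assumptions, and the sketch skips the step that inflates RD between rating periods.

```python
import math

Q = math.log(10) / 400  # scaling constant from the Glicko paper


def g(rd):
    """Discount factor: the less certain the opponent's rating, the smaller g."""
    return 1 / math.sqrt(1 + 3 * Q**2 * rd**2 / math.pi**2)


def glicko_update(rating, rd, opp_rating, opp_rd, score):
    """Return (new_rating, new_rd) after one game; score is 1, 0.5, or 0."""
    g_opp = g(opp_rd)
    expected = 1 / (1 + 10 ** (-g_opp * (rating - opp_rating) / 400))
    d_squared = 1 / (Q**2 * g_opp**2 * expected * (1 - expected))
    precision = 1 / rd**2 + 1 / d_squared
    new_rating = rating + (Q / precision) * g_opp * (score - expected)
    new_rd = math.sqrt(1 / precision)
    return new_rating, new_rd


# Returning player (assumed RD 300) beats the weekly regular (assumed RD 50).
# Both start at 1700, as in Glickman's example.
returning, _ = glicko_update(1700, 300, 1700, 50, score=1)
regular, _ = glicko_update(1700, 50, 1700, 300, score=0)
print(round(returning - 1700, 1), round(regular - 1700, 1))
```

With these assumed RDs, the returning player gains far more than 16 points and the regular loses far fewer than 16, which is exactly the asymmetry the quote argues for.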

Implementation



First, you'll need Python. Go to https://www.python.org/downloads/ and download Version 3.x.y (currently 3.4.3). Then, download the .zip folder here: https://www.dropbox.com/sh/1eagl4tknq3657p/AADi_-fojg0FcJcMZ943YGMza?dl=0 and extract the contents. The main things you'll need are SmashRankingsCalculator and RankingSettings. Unless you really know what you're doing, don't touch anything else. (To ``open'' a file, right click it and then left click ``Edit with IDLE'' (or similar). To run it, do Run --> Run Module (F5 on my machine).) An ELI5 .png of the basic functionality can be found in the dropbox folder, but more thorough instructions are listed below. Note that the ELI5 doesn't show seeding, writing result files, etc., so I'd recommend reading the full instructions and perusing the UsefulFunctions stuff by uncommenting it.

SmashRankingsCalculator

This is where the magic happens. The file as given is an example of how to run things. Note that # initiates comments, so feel free to make notes to yourself as you use it. You'll notice a bunch of WriteTxtFromChallonge('ChallongeURL', 'TournamentName') calls. This is how you pull data from Challonges. When these are run, a .txt file named after the tournament (``TournamentName.txt'') will be written wherever you've put all the Python documents. You only need to run this once per Challonge; after that, you can comment those lines out or delete them. Next, you'll notice a bunch of ProcessRankings(['FirstEventName', 'SecondEventName', ...], 'Title') calls. In each of these, put whatever results you have for each ranking period (which should be two months), along with the title (SSB, Melee, Brawl, PM, or Sm4sh) the results are associated with. Finally, ShowAllRankings() is run to...show all the rankings.



RankingSettings

This is pretty self-explanatory for the most part. TagDict lets you associate names with tags if you so desire. Any tag with numbers in it MUST go in NumFixes. If you have frequent typos, or someone changes their tag, put that in ReplacementList. The syntax should be pretty obvious from the example in the file.



Exporting Data

Use the WriteTxtAllRankings, WriteMobileAllRankings, and WriteCSVAllRankings functions to export rankings to .txt, .txt (in mobile form), and .csv (spreadsheet) documents, respectively. Example syntax:

WriteTxtAllRankings('Rankings', TitleMin = 2, SortedBy = 'Low', SortedByTie = 'Middle', LinesBetween = 2) and similar for the other functions.



Other Stuff

In SmashRankingsCalculator, at the bottom, you'll see UsefulFunctions() commented out. Run this to see what other functions are available, such as seeding participants of a tournament, putting participants in properly seeded pools, showing rankings for just one game, etc., as well as additional options for those.
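For a rough idea of what ``properly seeded pools'' can mean, here is a small sketch of snake seeding: sort players by rating, then deal them out across pools back and forth so the top seeds are spread apart. This is only an illustration of the general technique; I'm not claiming the functions in the download work this way, and snake_pools and the ratings below are made-up names and numbers.

```python
def snake_pools(players_by_rating, num_pools):
    """Distribute players (best first) across pools in snake order."""
    pools = [[] for _ in range(num_pools)]
    for index, player in enumerate(players_by_rating):
        row, col = divmod(index, num_pools)
        # Even rows go left-to-right, odd rows go right-to-left.
        pool = col if row % 2 == 0 else num_pools - 1 - col
        pools[pool].append(player)
    return pools


# Hypothetical ratings, purely for illustration.
ratings = {"Alice": 1850, "Bob": 1720, "Cara": 1690,
           "Dev": 1600, "Eli": 1540, "Fay": 1500}
seeded = sorted(ratings, key=ratings.get, reverse=True)
pools = snake_pools(seeded, 2)
print(pools)
```

The back-and-forth order keeps the pools balanced: the pool that gets the top seed also gets the weaker pick in the next round of dealing.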



FAQ

From the examples, why is ________ listed lower than ___________? Clearly __________ is better.



You're probably right. The system works best on relatively closed groups of people, that is, when there aren't many new players showing up at each and every tournament. Since the examples draw on tournaments that are happening across the country, there are a ton of ``newcomers'' each time, who have no ranking information. In particular, since the average strength of the newcomers at each tournament varies wildly, it makes the information much less reliable. In more closed settings (locals, university scenes, etc.), the Glicko2 system is much more stable and accurate.


What if I don't use Challonge to do tournaments?



That's totally fine. If you would still like to process your rankings, simply put your results in a .txt file with the format P1 P1wins P2 P2wins on each line, with each line representing a different match. This is also useful if you want matches that occur outside of tournaments to count toward the rankings. For example, members of my club challenge each other to ``power matches'', and the results are put in a .txt file for each two-month ranking period to be processed along with the results from tournaments.
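As an illustration of that format, here is a minimal sketch of how a results file with P1 P1wins P2 P2wins on each line could be read. This is not the parser from the download; the function names are made up, and it assumes tags contain no spaces.

```python
def parse_match_line(line):
    """Parse 'P1 P1wins P2 P2wins' into (p1, p1_wins, p2, p2_wins)."""
    p1, p1_wins, p2, p2_wins = line.split()
    return p1, int(p1_wins), p2, int(p2_wins)


def parse_results(text):
    """Parse every non-blank line of a results file into a match tuple."""
    matches = []
    for line in text.splitlines():
        if line.strip():
            matches.append(parse_match_line(line))
    return matches


# Hypothetical file contents: Alice beat Bob 2-1, Dev beat Cara 2-0.
sample = """Alice 2 Bob 1
Cara 0 Dev 2
"""
matches = parse_results(sample)
print(matches)
```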



Additionally, just for pool results, there is WriteTxtFromPools(PoolList, TxtName). Challonge is awful at pools, so I (and, I assume, a lot of other people) just run pools on my own. This function guides you through all the matches from a list of pools, so you only have to enter 0's, 1's, and 2's (and maybe 3's if Bo5 in pools is your thing) corresponding to the wins in each match.
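To show the idea behind walking through pool matches, here is a small sketch that lists every pairing in each pool and formats a result line in the P1 P1wins P2 P2wins layout. It assumes round-robin pools (everyone plays everyone in their pool), and pool_matches and format_result are made-up helpers, not functions from the download.

```python
from itertools import combinations


def pool_matches(pool_list):
    """Return every pairing within each pool, assuming round-robin pools."""
    return [pair for pool in pool_list for pair in combinations(pool, 2)]


def format_result(p1, p1_wins, p2, p2_wins):
    """Format one match as a 'P1 P1wins P2 P2wins' line."""
    return "{} {} {} {}".format(p1, p1_wins, p2, p2_wins)


# Hypothetical pools, purely for illustration.
example_pools = [["Alice", "Dev", "Eli"], ["Bob", "Cara"]]
pairings = pool_matches(example_pools)
print(pairings)
print(format_result("Alice", 2, "Dev", 0))
```

A pool of n players produces n(n-1)/2 matches, so the three-player pool above yields three pairings and the two-player pool yields one.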

Limitations

Obviously, the system can't draw on data that isn't there. If you just put 2-0 and 3-0 for everything, then the results will be read as such. Additionally, matches where the victory was recorded as a "check" instead of a score will simply not be processed. Finally, although there is support for seeding teams based simply on the ratings of the individual players, ranking teams themselves is not well supported.



Edits:

4/24/15: Added how to actually run Python stuff.

4/26/15: Added what to do if you don't use Challonge.

4/27/15: Added the WriteTxtFromPools function, as well as a bit about it in the FAQ.

5/5/15: Added WriteTxtAllRankings, WriteMobileAllRankings, and WriteCSVAllRankings functions (see the ``Exporting Data'' portion of the post), as well as a few minor bug fixes.

5/28/15: Added some extra functionality to a couple of existing functions. Added an ELI5 to the folder.