A journey from FGC player ranking to data science reaching thousands of people
Bavo, Dec 4, 2019

A retrospective on rank.shoryuken.com by Bavo Bruylandt

… or millions of users if you believe Google

The site could reach up to 38,000 users a month, with 160,000 pages rendered.

Origin story

The rank.shoryuken.com as you know it originated in 2014 from a simple question: ‘which SF4 player can currently be considered the best?’

This was not easy to answer at a time when there were no Pro Tours as we know them. There was no overarching system of tournaments and points that produced player rankings. The closest one could come was winning EVO, which would result in that one player being called the best in the world. And then long debates would ensue as to how representative a single win could be.

A member at Eventhubs posted a list of 86 tournaments with their Top 8 results, triggering me to find a way to weigh this data and distribute points to the players in them. The closest tried and proven system I could find was the Tennis Pro Tour system: it ranked tournaments into categories and then distributed player points per position in that category.

This did not require the head-to-head tracking that a lot of scoring systems use, which needs vastly more data.

original scoring used as base

Because one does not want to go into discussions about which tournament gets which type of ranking (and thus weight), the system was designed to be self-balancing. Tournaments do not get any rank to start with; the players get an ‘initial estimated weight’, which forms the tournament weight (a simple sum of the best 8 players’ weights is enough). When all tournaments are weighted, they are auto-ranked into a finite number of types (matching the Tennis types), and then finally players are given scores and re-weighted. This can go on for a few cycles to reach a final balance, and rebalances on every new tournament entered into the system.

Fine-tuning

simplified explanation of the circular scoring system

This ended up as a dynamic system that could start with only a limited dataset and would improve gradually over time. Moderators may have an influence on the initial player weights, but after a few cycles it is the system that decides, via back-propagation.
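As a rough illustration of that cycle, here is a minimal Python sketch. The point tables, tier names and thresholds are invented for illustration; the real system uses the Tennis categories and more tournament types.

```python
# Hypothetical sketch of the circular scoring loop described above.
# Point tables and tier thresholds are illustrative, not the site's values.
POINTS = {
    "premier": [1000, 600, 360, 180, 180, 90, 90, 90],
    "major":   [500, 300, 180, 90, 90, 45, 45, 45],
    "regular": [250, 150, 90, 45, 45, 20, 20, 20],
}

def tier_for(weight, thresholds=(4000, 1500)):
    """Auto-rank a tournament into a tier based on its total weight."""
    if weight >= thresholds[0]:
        return "premier"
    if weight >= thresholds[1]:
        return "major"
    return "regular"

def rebalance(tournaments, initial_weights, cycles=5):
    """tournaments: list of top-8 finisher name lists, in placing order."""
    player_weight = dict(initial_weights)
    for _ in range(cycles):
        scores = {}
        for top8 in tournaments:
            # tournament weight = sum of its top-8 players' current weights
            t_weight = sum(player_weight.get(p, 0) for p in top8)
            table = POINTS[tier_for(t_weight)]
            for place, player in enumerate(top8):
                scores[player] = scores.get(player, 0) + table[place]
        # the new scores become the weights for the next cycle
        player_weight = scores
    return player_weight
```

Starting from identical initial weights, a player who keeps placing first pulls ahead after the first cycle, and the tournaments they attend gain weight in the next one.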

Although the idea was simple enough, the fine-tuning evolved over the years as further issues popped up.

How many tournaments do you consider? (we considered all except locals) And for how long? (we limited it to 18 months)

How many top players do you take into consideration per tournament for the weighting? (we went from 8 to 16 for the games that allowed it)

How do you avoid regional bias towards over-reported regions? (we gave bonuses to tournaments with 2+ countries attending, as these were typically traveled to and thus indicative of importance, even though players could not join as many)

What do you do with players that have been inactive, but are known to be good? (we ended up taking the best of the actual and all-time rankings as their weight)

When do you consider a game to be ‘dead’ and thus the rankings to be final?

How much data do you need to publish a ranking that causes no big upsets? (we held back for a few months on new games)

How do you prevent over-represented players from getting bloated scores? (we ended up taking each player’s best 12 results)
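Two of those rules, the 18-month window and the best-of-12 cap, can be sketched together. The field layout and the window arithmetic here are assumptions, not the site's actual code.

```python
# Illustrative sketch: only a player's best 12 results inside an
# 18-month window count towards the score.
from datetime import date, timedelta

WINDOW = timedelta(days=int(18 * 30.44))  # roughly 18 months
BEST_OF = 12

def counted_score(results, today):
    """results: list of (date, points) tuples for one player."""
    recent = [pts for d, pts in results if today - d <= WINDOW]
    return sum(sorted(recent, reverse=True)[:BEST_OF])
```

With this rule, a player with 15 recent results only counts the best 12, and results from outside the window drop off no matter how large they were.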

The result appears to work very well. For all games listed on the site, I honestly feel the rankings do reflect the competitiveness of the players. One can discuss minor points and relative rankings of players, and one will find players who are blatantly under-rated. In this ‘fair’ system with no moderator influence, one cannot simply go in and fix up a player who theoretically beats all others but joins few to no tournaments. If a player did not reach the limit of 12 counted results in the tracked window, he will not have maximized his scoring potential. That is the trade-off: players need to travel quite a bit to reach their best possible ranking.

Luckily, most games went on long enough and had enough travelling top players to get a good ranking up to at least top 50, and in large games like Street Fighter even up to top 200.

Shoryuken support

first version of the site that only supported SF4:AE2012

The initial site, based only on a simple table that ranked the top players and linked to their results and tournaments, caught the attention of people at shoryuken.com. Terry Kineda as site designer and Ian Walker as editor saw the potential for better tracking of tournament data, which up until then was only fragmentarily tracked by wikis, forums and news posts. They contacted me to link it on their domain, but let me keep ownership of the hosting, code and content. They diligently advised me on how to improve the layout and content, and greatly helped me promote the site. Just having shoryuken.com backing you as a brand was huge as a form of legitimation for your data.

A single voice is just one out of many, but a community brings in real weight to the discussion. This would have not been such a success if it was not for them.

Evolution into data science

editorial on launch of new games to be tracked — by Ian Walker

While we started with SF4 Arcade Edition 2012, we also wanted to apply it to other hot games, like Marvel vs Capcom 3. This required a redesign of the homepage: no longer just a single table of top SF4 players, but an overview that linked into deeper results.

A more complex data model was built, supporting any game or version to be listed. Soon we could also answer who the best players were in older versions of SF4, for which a lot of historical data was already available.

Sites like Shoryuken and Eventhubs also published the character used by every player. Linking that info not only gave insight into what character a player used for his ranking, but also let us rank the characters themselves. If a ranking produces points for a player, it also does for a character!

Character usage and performance, each character has detailed historical pages of results and best players

deep dive into details per character in a game

The first question that comes to mind is: what character is most effective at this point? Easy to answer! More elaborate questions then are: how balanced is a game — and for SF4, per release version of it — and how does a character evolve over time in the same game, perhaps due to balance updates or a changing meta?

statistical balance comparison of versions in same game

This received its own section under ‘character balance and tiers’, which aimed to bring raw data to the tier discussions. Not only by counting usage or victories, but by actually applying weighting to those as well: a win at EVO for Akuma weighs a lot more than a win by a decent Sakura player at your local. Common statistics can also be applied, like how many characters it takes to cover 50%, or 90%, of the usage distribution. Balanced games tend to reach a high character variety in, for example, their top 100 players.
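One such statistic can be sketched as follows: how few characters account for a given share of all weighted character points. The character data below is invented for illustration.

```python
# Concentration sketch: the smallest number of characters whose combined
# weighted points reach a given fraction of the total. Fewer characters
# needed = more concentrated (less balanced) usage.
def coverage_count(char_points, fraction):
    total = sum(char_points.values())
    running, count = 0.0, 0
    for pts in sorted(char_points.values(), reverse=True):
        running += pts
        count += 1
        if running >= fraction * total:
            return count
    return count
```

In a perfectly balanced cast, reaching 50% of the points would take roughly half the characters; in a top-heavy meta, one or two characters are enough.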

We also expanded the player profiles, adding more metadata like real name, gaming team, country, controller, description, etc. This is not only informative, but also allowed advanced statistics to be made. The tournament pages were likewise expanded with not only info on weight and eventual ranking, but also on the event they were featured in.

All this opens great possibilities for data mining. Questions like:

what gaming team is getting the best results?

what event has the strongest competition?

who are the best players for this country?

which world region delivers the most top players?

Team performance YTD, showing the teams that booked the best players

As it is hard to render all of these in the site, the Twitter account https://twitter.com/SRKRanking was created to share these statistics.

One of the many balance patch analyses done, where characters are scored per season to see how they evolve.

Eventually we evolved from a pure player ranking to a general FGC tournament data aggregation and analysis source.

Analysis of EVO 2019 that checks how many of the top players are registered. That year all 32 best players attended.

Game overlap charts are made by checking who registered for which games at the same tournament, signalling crossover between certain types of games.

As all eyes are on EVO, the total number of entrants and the countries they come from are a barometer of community health.

community viewer stats per tournament, indicating interest over time

We even ventured into tracking viewer numbers of tournaments to keep a pulse on how the community was evolving.

Community building

To open this up to a bigger scale we started looking for help from community experts. It is easy enough to add Street Fighter data when you follow all majors live on Twitch yourself, but a lot harder for the many other games that were continuously being added. Tournament coverage is also very focused on the North American and Japanese regions, which makes it hard to balance out players from Europe, Latin America, the Middle East or other Asian countries. And they sure do matter, as was proven by the unexpected wins of Luffy from France, Xian from Singapore and Arslan Ash from Pakistan.

The site allowed users to be added and made data input as easy as possible. This evolved over many iterations; in its current form, one can paste a top 128 from the typical major news sites straight into the site and it will recognize the format and link or create the players, their teams and the characters used, just like that.

The challenge in chaotic systems like these is that players register under any name they desire, and people report on interpretations of that. So more often than not, names and characters do not match. With the help of Apache Lucene as a text matcher, plus extra algorithmic guessing, we can suggest the correct registered name in our database for what has been reported. The moderator just has to click the right version to fix it. This makes the database one of the best sources of historical data, as fragmentation is avoided as much as possible. Moderators also have access to tools everywhere to edit data, merge/split entries or use them as templates for new ones, minimizing the cost of data aggregation and maintenance.
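The site uses Apache Lucene for this; as a rough stand-in, Python's difflib can illustrate the idea of suggesting registered names for whatever was reported. The names and the cutoff value here are just examples.

```python
# Stand-in for the Lucene-based matcher: suggest the closest registered
# player names for a reported entrant name.
import difflib

REGISTERED = ["Daigo Umehara", "Tokido", "Justin Wong", "Luffy"]

def suggest(reported, candidates=REGISTERED, n=3, cutoff=0.5):
    """Return up to n registered names similar enough to the reported one."""
    return difflib.get_close_matches(reported, candidates, n=n, cutoff=cutoff)
```

The moderator would then pick the right suggestion from this shortlist instead of typing the canonical name by hand.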

We also enabled third-party contributions via a form, although that attracted mostly low-quality data (too local, too few known players, too little character info) and was seldom used and approved.

To give back to the community, we open-sourced both the code and the data on GitHub (https://github.com/bavobbr/sf4ranking), which holds the code and JSON backup data (used to import all data locally for testing, even if the database format changes), and created an API for people to use freely at http://rank.shoryuken.com/api/

This exposes our most crucial data:

player/tournament wildcard search

player/tournament data

rankings per game (SRK and CPT versions)

seedings by SRK for your smash tournament
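A minimal sketch of consuming such an API from Python follows. The endpoint path and response shape are assumptions for illustration; check the API root above for the real routes.

```python
# Hypothetical client sketch; the /ranking/<game> path is an assumption.
import json
from urllib.request import urlopen

API_BASE = "http://rank.shoryuken.com/api"

def ranking_url(game):
    """Build the (assumed) ranking endpoint URL for a game slug."""
    return "%s/ranking/%s" % (API_BASE, game)

def fetch_ranking(game):
    """Fetch and decode a ranking; requires network access."""
    with urlopen(ranking_url(game)) as resp:
        return json.load(resp)
```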

A lot of the data has been contributed by volunteering moderators, and I can never be thankful enough that they invested their time. The work was meticulous and added an enormous amount of value. We currently cover over 11,000 players in more than 3,100 tournaments thanks to this. A long way from the initial 86 tournaments it started with!

tournaments contributed per moderator

A huge thanks goes to:

Tomakh aka https://twitter.com/FGCTomakh for very good coverage of SF5 and especially Tekken. His Japanese connections really helped.

Arnaud aka https://twitter.com/Achorawl for big contributions to smaller games, but especially to the European scene. He is from France and helped put the community on the international map.

gdnalk aka https://twitter.com/GravediggerNALK for almost singlehandedly covering the anime fighters, which were my weak spot.

Third-party ecosystem

Capcom Pro Tour

When the Capcom Pro Tour started in 2015, it launched an official way of ranking players by earning points at approved tournaments. Initially, the quality of Capcom's tracking and the speed of their processing were very low. We enhanced our data by tracking the CPT type as well, creating a secondary CPT ranking that easily outpaced the official CPT data until 2018, thanks to the realtime updates we already did for majors. We also added the usual statistics, API and projections, creating value above what the official site offered. As of 2018, Capcom invested in their site and there is no longer a need for us to do this. The difficulty with these Pro Tours is that you need to get it 100% right or you are wrong. It requires us to follow up on the exact changes in the CPT rules, track down every tournament played (even the often unreported online ones) and link every player correctly to their score (again, even online, where player names differ).

The CPT section was a big success in the first years and it was often fun beating them at their own game in publishing results ;)

Maxoplata

A guy with a story of his own, Maxamilian Demian, created Maxoplata at the time (now hosted at https://www.fgcbattles.com/), a site that tracked tournament results in very good detail. He covered even more types of games, and tracked them with head-to-head results of all majors, mostly in the US.

Maxoplata however did not analyze any results, which was our main goal. We were very complementary, so we started adding links from our player pages to their database, so people could find those head-to-head results we did not have. We also integrated part of this data in our player-vs-player comparison. He stopped updating — possibly the best data source available — in 2018.

Smash.gg

There is no going around this now: smash.gg started out helping to organize Smash Bros tournaments and grew to encompass the whole FGC. They gradually convinced people to switch from Challonge to them for bracket hosting, added a lot of features and eventually started overlapping with a lot of what we did. On our end, we used smash.gg a lot to parse results from live and finished tournaments and to analyze their brackets as a pre-tournament service. We download their bracket, match the players (by linking the smash.gg player id with the SRK player id; quite some work, as again no official naming format is defined and players changed smash.gg accounts all the time) and apply our weights to their brackets. On major tournaments this directly gives a view of who the top players are, how many are represented and how well they are distributed among the brackets. This info is then shared on Twitter and Reddit, the latter because it can be updated in real time with bracket progression and carry more detail. Ironically not on Shoryuken.com itself, as that requires a separate editor to approve, review and process.
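The bracket-strength step can be sketched like this, assuming the smash.gg entrants have already been matched to our player records. Pool names, weights and the top-player cutoff are invented for illustration.

```python
# Sketch of pre-tournament bracket analysis: given pools of matched
# players and our weights, report the tracked top players per pool and
# the pool's total weight, to judge how evenly talent is distributed.
def pool_strength(pools, weights, top_cutoff=1000):
    """pools: {pool_name: [player, ...]}; weights: {player: weight}."""
    report = {}
    for name, players in pools.items():
        tops = [p for p in players if weights.get(p, 0) >= top_cutoff]
        report[name] = {
            "top_players": sorted(tops, key=weights.get, reverse=True),
            "total_weight": sum(weights.get(p, 0) for p in players),
        }
    return report
```

A pool with a much higher total weight than its neighbours signals a lopsided seeding, which is exactly the kind of finding that got shared before majors.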

These days smash.gg has so much growth behind it that it became the de facto standard and is replacing Shoryuken as a tournament result data source. I feel that in the future, our analytics may run directly on their data instead of ours, with the minor risk of creating a data monopoly.

The single reason why Shoryuken is still a much better search engine than smash.gg is that Shoryuken focuses on players, not tournaments. Today it is not even possible to link to a player on smash.gg (except via an event he appears in), even though they force their players to register and thus have a very player-centric model.

Reddit

Reddit is great as a publishing platform for detailed data that does not fit into a tweet and would be too hard to put into code on the site.

During tournaments we can track the top players in the brackets in real time, automatically sending updates to the Reddit page via the Reddit API. This makes it easy to spot upsets and advances without having to go through the brackets or rely on reporters.
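A sketch of how such an update could be assembled; the actual posting via the Reddit API is omitted here, and the Markdown table format is just an example of what gets generated.

```python
# Build a Markdown progress update for a round, flagging upsets where
# the lower-weighted player won according to our ranking weights.
def format_update(round_name, results, weights):
    """results: list of (winner, loser) name pairs."""
    lines = ["## %s" % round_name, "", "| Winner | Loser | Note |", "|---|---|---|"]
    for winner, loser in results:
        upset = weights.get(winner, 0) < weights.get(loser, 0)
        lines.append("| %s | %s | %s |" % (winner, loser, "UPSET" if upset else ""))
    return "\n".join(lines)
```

The resulting Markdown can then be pushed to the Reddit thread each time the bracket advances.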