The following blog post, unless otherwise noted, was written by a member of Gamasutra's community.

The thoughts and opinions expressed are those of the writer and not Gamasutra or its parent company.

Sometimes, as a game analyst, you get to do work that has no immediately useful application but is simply fun, basic, explorative research.



Such exercises sometimes begin when people get together socially after a long day and start speculating (glasses with variously colored liquids sometimes enter the picture as well). This was the case one day almost two years ago when, at a conference, a couple of analysts from industry and academia started the customary ritual of bragging about the accomplishments of their characters in various MMOs. In an "artist's interpretation", it went as follows:

“Yea, HealBot was among the first 10 to hit the level cap on [insert name of vastly popular and elite server in equally popular MMO],” one might say, with a colleague chiming in: “Ha! As if that is an accomplishment. NecroZeus has 45,000 hit points [a lot] and is top-ranked in [insert suitably impressive guild name in major commercial MMO].” “Your characters have barely matured from their diapers, gentlemen – ZonkarTheMighty was the first player ever to get the [insert name of impressive epic loot requiring dozens of players to coordinate their efforts]” – and so forth.

This lively debate unexpectedly led to a discussion about character names, and to speculation that there might actually be some sort of pattern in how we as players name our characters. For example, people wondered why, in World of Warcraft, Paladins always seemed to be called something along the lines of “Healbot”, Warlocks had cool-sounding names like “Bloodmaster”, and Blood Elves – irrespective of class – had fancy names such as “Moonlight”. Tauren, it was argued, usually had names that deserved a gravitational force of their own, like “Mortar” or “EarthStomper”. It was decided to investigate whether these hypotheses had any merit (and eventually a paper was written).

Enough exposition, get on with it ...

The next step involved getting hold of a ridiculous amount of character data from World of Warcraft: about 18 million characters in total. This is a lot, and it made analysis cumbersome, so the dataset was cut down to about 8 million characters, subsampled from the 50 biggest guilds on US and EU servers. Some pre-processing was done, e.g. removing special characters so that "Gand'alf" would be counted along with "Gandalf".
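As a rough illustration of that pre-processing step, here is a minimal sketch in Python. The function name `normalize_name` and the toy names are made up for this example, and folding the casing is an extra assumption on my part: the post only mentions removing special characters.

```python
from collections import Counter


def normalize_name(raw: str) -> str:
    """Keep only letters so that "Gand'alf" and "Gandalf" count as one name.

    Assumption: casing is also folded (first letter upper, rest lower).
    Accented letters are kept, since str.isalpha() accepts them.
    """
    letters_only = "".join(ch for ch in raw if ch.isalpha())
    return letters_only.capitalize()


# Toy data, invented for illustration only.
raw_names = ["Gand'alf", "Gandalf", "gandalf", "Healbot"]
name_counts = Counter(normalize_name(name) for name in raw_names)

for name, count in name_counts.most_common():
    print(name, count)
```

Counting on the normalized form rather than the raw string is what lets spelling variants of the same name pile up in one bucket.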

The first result that became apparent was that players in World of Warcraft are incredibly talented at creating unique names. Characters with the same name can exist on different servers, so the mind-boggling 3.8 million unique names found in the dataset were not expected. This is more diverse than real-world names, despite the naming restrictions in World of Warcraft. Why this diversity occurs we can only speculate about, but it may relate to a desire to build a unique character – and the name is the only truly unique feature in World of Warcraft. On average, 58% of the names were unique, but on the Role-Playing servers 83% were unique. Whether that speaks to the added creativity of RP’ers compared with those who favor PvP is unknown.

Moving on, it is also evident that there is a relationship between the class and race of a character and the name chosen for it. Looking at the most popular names for the different races and classes in World of Warcraft, there is virtually no overlap. For example, Mages have different names than Warlocks, and Tauren are named differently than Blood Elves. Perhaps unsurprisingly, name frequencies follow power laws (a very interesting set of models that keeps cropping up in internet and games research).
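For readers who want to eyeball the power-law claim on their own name counts, one crude check is to regress log frequency on log rank and look at the slope. The sketch below does just that with a hypothetical helper, `rank_frequency_slope`, on invented data; it is not the method used in the study, and a log-log regression is only a quick diagnostic (maximum-likelihood fitting is the more rigorous route).

```python
import math
from collections import Counter


def rank_frequency_slope(names):
    """Least-squares slope of log(frequency) against log(rank).

    A slope near -1 is the classic Zipf-like pattern. This is a rough
    diagnostic only, not a statistical test for a power law.
    """
    freqs = sorted(Counter(names).values(), reverse=True)
    if len(freqs) < 2:
        raise ValueError("need at least two distinct names")
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(freq) for freq in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Toy data built so that frequency = 60 / rank (invented for illustration).
toy_names = ["A"] * 60 + ["B"] * 30 + ["C"] * 20 + ["D"] * 15 + ["E"] * 12
slope = rank_frequency_slope(toy_names)
print(round(slope, 3))
```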



For reasons unknown, Mages have a much higher variety of names than any other class. Any speculations as to why are welcome (are people who play Mages more creative? More independent-minded?).

Histogram of most popular Blood Elf race names

Histogram of the most popular Mage class names





Histogram of the most popular Human race names

In fact, isomap projection revealed that the “pretty” races (humans, blood elves, draenei, etc.) and the “bestial” races (orcs, tauren, undead, trolls – and gnomes and dwarves) formed two distinct groups. In other words, people name pretty characters very differently from bestial ones (or however you want to define these races). Strangely, the otherwise non-bestial Gnomes and Dwarves map with the bestial races. This result leads to speculation about some uncomfortable aspects of human psychology that we will leave alone for now.

Isomap projection also revealed that character names on US and EU servers mapped differently, and that characters on RP realms mapped differently than those on PvP realms (with PvE, RP-PvP and other hybrids mapping with either parent group or in the space between). Interestingly, names for RP servers overlap across the US-EU divide.

Histogram of most popular names on the EU servers

Histogram of most popular names on the US servers

You mentioned prediction?

Prediction is one of the most common goals of data mining player behavior. Somewhat surprisingly to the analysts involved, estimating the conditional probability of a given class, race, server type, level, etc. given a particular character name revealed that some names are very good predictors. Class and race emerged as the overall best predictors of names. This is especially the case when names are inspired by popular media, books, film, mythology etc. – more on this below.

Conditional probability of a given class / race / realm, given a particular character name
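To make the idea concrete, here is a small sketch of how such conditional probabilities can be estimated by simple counting: P(class | name) is the share of characters carrying a name that belong to a given class. The function `class_given_name` and the records are invented for illustration and are not taken from the actual dataset.

```python
from collections import Counter, defaultdict


def class_given_name(records):
    """Return {name: {class: P(class | name)}} from (name, class) pairs."""
    by_name = defaultdict(Counter)
    for name, char_class in records:
        by_name[name][char_class] += 1
    return {
        name: {cls: n / sum(counts.values()) for cls, n in counts.items()}
        for name, counts in by_name.items()
    }


# Toy records, invented for illustration only.
records = [
    ("Healbot", "Paladin"),
    ("Healbot", "Paladin"),
    ("Healbot", "Priest"),
    ("Healbot", "Paladin"),
    ("Moonlight", "Mage"),
    ("Moonlight", "Hunter"),
]
probs = class_given_name(records)
print(probs["Healbot"])
```

A name is a good predictor when one class takes most of the probability mass, as with the made-up "Healbot" here; a name spread evenly over classes, like the made-up "Moonlight", tells you little. The same counting works for race or realm type by swapping the second field.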

So, how do people come up with these names?

Not satisfied with patterns alone, we took the 1000 most common character names in World of Warcraft (roughly 138,000 characters total) through a painstaking process of manual examination to identify their sources of inspiration. The method used was manual semantic coding, which is a fancy way of saying the categories were made up as the names were investigated. What was found was that real-world, vanilla names such as Sara, Mia, Daniel and so forth were the most common (186 names). Mythology, notably Greek, accounted for 164 names. Popular culture (notably Japanese manga and the characters invented by a certain Mr. Whedon) accounted for 174. Fantasy literature, with Tolkien ruling supreme, accounted for 39 names. A lot of the most popular names were in breach of the terms of use of World of Warcraft. Many names were basically copies of important NPCs. About 300 of the 1000 most popular names could not be classified. These were names consisting of verbs or nouns of unspecified nature; however, they can be loosely categorized based on semantic content. For example:

“Negative”: Nightmare, Sin, Fear, Requiem

“Positive”: Hope, Love, Pure

“Neutral”: Who, Moonlight, Magic, Snow

The names with negative semantic connotations were six times more common than those with positive connotations. Does this mean gamers are depressed? Or do “dark” names just sound cooler?

What about other games?

The World of Warcraft results were intriguing because we honestly did not expect to find any patterns in the character name data. But they encouraged us to look beyond World of Warcraft. Turning to another genre, the FPS, we decided to try our luck at investigating whether there is any relationship between how a person plays a game, their playstyle so to speak, and their gamer tag.



Gamer tags in shooters like the Crysis, Medal of Honor and Battlefield series are at face value very different from World of Warcraft character names. The use of numbers and special characters is much more frequent, as are combinations of letters and numbers that bear no direct resemblance to words. A quick look over on the P-stats network throws up examples like “MaliciousMaulr”, “x6naca6x”, “Ankur”, “HackJake0025”, “InSaNe_x_ChAoZz”, and “Acid_Snake”.



In order to investigate whether there is a relationship between gamer tag and playstyle, different analyses were run. The first ran a sample of 10,000 gamer tags from Battlefield: Bad Company 2 through a variety of clustering algorithms and distance measures suited to string clustering, to see whether the resulting clusters of gamer tags correlated with the class (e.g. Assault, Medic) the player in question favored. The result, however, revealed only a small degree of non-randomness, i.e. gamer tags relate only to a limited degree to a player's most favored class. Moving on, we tried the same thing with seven playstyle profiles built from behavioral telemetry from the game. The profiles were the result of an earlier cluster analysis of 11 behavioral measures related to the game's core mechanics. This time the result was a lot better, indicating that there is a relationship between gamer tag and behavioral profile in Battlefield: Bad Company 2 for a pretty significant chunk of the players (cluster purity measures reached 61%).
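For those unfamiliar with cluster purity, the usual textbook definition is: for each cluster, count the members that carry the cluster's most common label, add those counts up, and divide by the total number of items. The sketch below implements that standard definition on invented data; the post does not spell out the exact purity variant used in the study, so treat `cluster_purity` and the toy labels as assumptions.

```python
from collections import Counter, defaultdict


def cluster_purity(cluster_ids, labels):
    """Fraction of items whose label matches their cluster's majority label."""
    if len(cluster_ids) != len(labels):
        raise ValueError("cluster_ids and labels must have the same length")
    if not labels:
        raise ValueError("need at least one item")
    members = defaultdict(Counter)
    for cluster_id, label in zip(cluster_ids, labels):
        members[cluster_id][label] += 1
    majority_total = sum(max(counts.values()) for counts in members.values())
    return majority_total / len(labels)


# Toy data: ten gamer tags assigned to three string clusters, each with the
# class or profile its player favors (all invented for illustration).
tag_clusters = [0, 0, 0, 1, 1, 1, 1, 2, 2, 2]
favored = ["Assault", "Assault", "Medic",
           "Medic", "Medic", "Medic", "Recon",
           "Recon", "Recon", "Assault"]
purity = cluster_purity(tag_clusters, favored)
print(purity)
```

Purity always lies between 0 and 1, and it rises as the number of clusters grows, so a figure such as 61% is best read against what random assignment would give for the same number of clusters and labels.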

Okay, but is this useful?

It cannot be claimed that the analyses provided direct insights into how to improve the design of World of Warcraft, nor stunning insights for the players of Battlefield. They do provide insight into the players, though. Irrespective, this kind of work falls squarely into the explorative and non-goal-driven category of game analytics processes discussed here. However, without curiosity-driven and basic research, we would not have the laser, the microwave or Teflon. Sometimes this kind of approach pays off. The new game analytics book includes examples of curiosity-driven game analysis leading to important conclusions. So this is definitely not an argument against being explorative and creative in analytics.

Perhaps the most intriguing part of the work we did is the hint that it may be possible to predict some aspects of play behavior, or perhaps even the personality of people, based on their character names. Player/behavioral profiling is another area where including character/profile names or gamer tags could be of interest, notably in sparse-data situations.

(note: a shorter version of this post also appears on the GA research blog).