The following blog post, unless otherwise noted, was written by a member of Gamasutra's community.

The thoughts and opinions expressed are those of the writer and not Gamasutra or its parent company.

Given a huge and varied game like Destiny, it is of interest to see if there are any patterns in how people play the game.

There can be a variety of reasons for this – for developers the focus can range from monitoring the players and checking that no group emerges that has issues progressing, through gaining insights into how to change the game to improve it, all the way to actively detecting cheaters, bots and similar. For players, such analyses can be used to show how we can improve or point to new strategies, or it can simply be cool to see what the patterns look like and where we fit in them.

There are many ways to find these kinds of patterns, but arguably one of the key analytical methods that has emerged in the past few years in game analytics is behavioral profiling. Profiling can be done in a variety of ways, e.g. focusing on the player base as it is right now, or as it has operated historically via time-series analysis, or even predictively, how we expect behavioral profiles to develop in the future. Here we focus on profiling Destiny players as they behave within a specific moment in time, also referred to as snapshot profiling. Snapshot profiling is pretty useful for understanding how people are playing a game in its current version.

In this post we present player profiles from Destiny – generated just before the Rise of Iron expansion was released. The profiles are focused on performance across a few dozen indicators, and it turns out that the weapons we use and the degree of efficiency with which we use them are the most powerful characterization indicators in Destiny.

The profiles are generated using cluster models. Clustering is a machine learning technique for finding patterns in data. When you have a big and varied dataset from games, clustering can be an effective technique for figuring out which metrics are important for characterizing the behavior of the players, and how players are organized in the variance space provided by those metrics. Clustering for game analytics is introduced in more detail here (introduction) and here (detailed).

To check out the profiles now, continue reading. For a bit of background and method on how they were developed, see further down; for discussion about what it all means, check even further down. The details of the work can be found here.

We extend our warmest regards and thanks to Bungie for their help with making data available, advice and support.

Destiny profiles

Given the differences in Destiny across PvE and PvP modes, we wanted to build profiles for each of these modes. PvP in Destiny is accessed via the Crucible, which is a hub for a number of different PvP activities. It turns out the way we play Destiny is fairly similar across PvP and PvE, at least when it comes to weapon choice and weapon efficiency, as most of the profiles are similar across PvP and PvE, typically focusing on either close combat, long range, or a mixture of weapons.

Starting with PvE, the data we worked with is basic character information covering metrics for performance, engagement, progression etc. Running these through a cluster model called Archetype Analysis (one of the four cluster models we used; for more on method, see below), we obtained five clusters, from which profiles can be developed. The metrics that characterize the behavior of players in the sample are primarily defined by damage output, preferred distance from enemies and the weapons used. Perhaps unsurprisingly, a lot of other behavioral metrics are less important in characterizing Destiny playstyles (e.g. light level, class). Summarizing the key features of each profile:

High DPS (23.5% of the players): The largest cluster of players, these focus on using weapons with a high damage per second (DPS) output and use their specials a lot.

Guerilla Warriors (16.7% of the players): These use a great variety of weapons and are highly adaptable to changing situations in the game. They change weapons often and are characterized mainly by their flexibility and consistent high performance with all kinds of weapons.

Close Combatants (22.9% of the players): These players are characterized by their almost exclusive reliance on close combat weaponry (melee and short-range guns), with which they perform very well, better than any other profile.

Sitting Duck Snipers (18.8% of the players): These players prefer to kill their opponents at long ranges, and tend to operate from stationary positions, utilizing sniper rifles first and then switching to other weapons as enemies close the distance.

Mobile Marksmen (18.1% of the players): Unlike the Sitting Duck Snipers, the Mobile Marksmen stay mobile and engage the enemy at closer ranges, but surprisingly stick to a single weapon type, most commonly a long-distance rifle such as a scout, pulse or sniper rifle.

For the PvP dataset, our Archetype Analysis indicated four clusters, which are again characterized by their use of weapons.

Close Range (35.1% of the players): These make very effective use of shotguns, their absolute favorite weapon, and complement it with melee attacks to dismantle PvP opponents very effectively, with K/D ratios, survival times etc. at the top of the range.

Marksmen (28.9% of the players): Unlike the close range players, Marksmen almost exclusively use sniper rifles (and to a lesser degree scout rifles) at long ranges, switching to hand cannons at close and medium range. They have top performance ranks for these weapon categories. These players spend the most time in PvP.

Objective Killers (20.1% of the players): These players play a majority of Control games, where the match focuses on holding various bases, defending them and attacking enemy-controlled bases. They are proficient with a range of weapons.

Casual PvPers (15.9% of the players): These tend not to play much PvP, preferring PvE, and are capable of using any weapon type, although with less skill and lower performance indicators for any one weapon class than the other profiles.

Behavioral profiling

This kind of profiling analysis in Destiny is possible because when we play the game, massive amounts of behavioral data are generated as characters interact with the game. Millions of players generate even more millions of touchpoints every day, and all that data is being turned into metrics resting in Bungie's enormous metrics servers. Developers and academics have for the past few years been paying increasing attention to behavioral data, not only in mobile games but across the sector. In online games, and notably esports, there is a strong tradition of giving the players access to such data, which has led to a widespread adoption of analytics in the community. For Destiny, services such as destinytracker allow players to view their own statistics, and similar services exist across most if not all major online or esports titles, e.g. the p-stats network or opendota.

Behavioral profiling is a technique known from a variety of data and information science domains, including web analytics and finance, and serves as a means for considering users or consumers in a non-abstract and quantifiable way. Behavioral profiling in digital games seeks to condense the often high-dimensional, high-volume and volatile behavioral datasets that are typical for major commercial multi-player titles into a subset of well-described profiles that encapsulate player behavior and inform game developers and researchers about how people are playing the game under investigation. This piece describes profiling in games in detail.

For Destiny alone, thousands of behavioral variables (or features) are captured from every character. Given the millions of players, running sometimes complex machine learning algorithms across the entire dataset can lead to trouble. If we can identify the most important metrics to work with, we are much better off in terms of not only efficiency, but also in terms of translating the results of an analysis into something that can be acted upon.

Generating Destiny profiles

For this post we are going to take a bit of time to talk about the data and the analysis being run, because it helps anyone interested in running a similar analysis for their own game. Along the way we will also discuss some common considerations you encounter when developing behavioral profiles using clustering techniques (clustering is described in detail here). Perhaps more importantly, describing how you run an analysis provides others with a way to verify that the work is done properly and to level critique where needed – we will abbreviate the details here but read this for a thorough description. Finally, even simple machine learning techniques, like the clustering algorithms we employed to build the Destiny profiles, are by no means objective. Human decisions have to be made at many steps in the process. This introduces the chance of errors, or simply closes off other potentially interesting pathways. In summary, being open about how results were generated is pretty crucial in analytics.

Rather than work with data from every single Destiny player, a random sample of 12,000 players was drawn from players who had played the game for at least two hours. This yielded about 34,000 characters. Because the sample is randomly generated, it is large enough to draw inferences about the population of Destiny players at a reasonable confidence level (given a maximum player population of 100 million, a 95% confidence level and a confidence interval of 1) – but please note that the algorithms used for calculating minimum sample sizes vary quite a lot. Destiny has two main modes of play: PvE and PvP. We wanted to work with both of them – and there is a lot of information on Bungie's back end, 1200+ metrics for PvE alone.
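As a rough check on the sample-size reasoning above, here is a minimal sketch using Cochran's formula with a finite population correction. This is one common approach and not necessarily the calculation the authors used. It assumes that "a confidence interval of 1" means a margin of error of plus or minus one percentage point, and it uses the most conservative proportion p = 0.5. The function name min_sample_size is hypothetical.

```python
import math
from statistics import NormalDist


def min_sample_size(population, confidence=0.95, margin=0.01, p=0.5):
    """Minimum sample size via Cochran's formula (an assumed method).

    margin is the half-width of the confidence interval as a proportion,
    so 0.01 stands for plus or minus one percentage point.
    """
    # Two-sided critical value of the standard normal, about 1.96 for 95%.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    # Sample size for an effectively infinite population.
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    # Finite population correction; negligible for very large populations.
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)


required = min_sample_size(100_000_000)
print(required, required < 12_000)
```

Under these assumptions the requirement comes out at roughly 9,600 players, which the 12,000-player sample exceeds. Other formulas and other choices of p give different thresholds, which is the caveat raised above.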

We wanted to build profiles that focused on the performance of Destiny players, so a large chunk of the features could be removed either because they did not relate directly to performance, because they correlated very well with other features and thus were redundant in terms of the analysis process, or because they would drag the analysis into a weird feature space that would not yield interpretable profiles. For example, including the number of times a character has been killed by every type of enemy in the game would be interesting, but given the number of enemy types in the game, it would be hard to decipher the influence of any other type of metric. Rather, if performance as it relates specifically to player death events is of interest, analyses should be performed on those metrics specifically. In practice, in game analytics we have to make choices about which behavioral variables to include or not, and a lot of trial-and-error is often necessary. This is certainly also the case here.

Mean and standard deviation of some of the primary features in the PvP and PvE datasets from Destiny.

Importantly, Destiny features class levels. A level one character has far fewer abilities than a level 40 character, and running a profiling exercise on their performance would therefore inevitably lead to a result that profiled low-level vs. high-level characters. This is not what we wanted, so we limited our focus to level 40 characters. This is by far the most common level in the game, capturing over 60% of the sample (still above the minimum sample size needed for inferential work if anyone should be interested in doing that).

Player level distribution in the Destiny player sample. Note the peak at lvl 34, which may indicate a group of players who had not bought the expansion expanding the level range to lvl 40.

Example of a skewed distribution, the distribution of sniper kills across the Destiny characters in the sample. The vast majority of the characters have zero registered kills with sniper rifles.

Skew (as in the figure above, i.e. where frequencies are bunched along one end of a graph) consistently emerged as an issue when generating univariate frequency plots of all the features. We performed logarithmic transformations on the data to mitigate these issues. For the remaining set of transformed features, we checked for correlations with total play time. The intent was to avoid placing greater weight on features for some characters simply because they had been played for a larger amount of time. For example, the total count of in-game participants that a player had interacted with exhibited a correlation of .97 with total play time. After dividing the quantity by the play time to convert it to a ratio, the correlation decreased to .003.
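The two preprocessing steps described above can be sketched on synthetic data. This is a minimal illustration, not the authors' pipeline: the numbers are made up, the variable names are hypothetical, and log1p (the logarithm of 1 + x) is an assumed choice made here so that zero counts remain defined. The post does not say which logarithm or offset was used.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


random.seed(0)

# Synthetic characters: hours played, and a count that accumulates with hours.
hours = [random.uniform(2, 500) for _ in range(1000)]
rate = [random.uniform(8, 12) for _ in range(1000)]  # participants per hour
participants = [h * r for h, r in zip(hours, rate)]

# The raw count mostly measures how long a character has been played.
raw_corr = pearson(participants, hours)

# Dividing by play time turns the count into a rate that is independent of it.
per_hour = [p / h for p, h in zip(participants, hours)]
ratio_corr = pearson(per_hour, hours)

# Log transform for a skewed count feature with many zeros (made-up values).
sniper_kills = [0, 0, 0, 1, 3, 12, 250, 4000]
log_kills = [math.log1p(k) for k in sniper_kills]

print(round(raw_corr, 2), round(ratio_corr, 2))
print([round(v, 2) for v in log_kills])
```

In this toy setup the raw count correlates strongly with play time while the per-hour rate does not, which mirrors the pattern reported above. The log transform compresses the long right tail while preserving the ordering of the values.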

We were also curious about how different cluster models performed. There have been a number of research publications in recent years using cluster analysis on game behavioral data, but not a lot of comparisons between different models. Given the over 100 cluster models in existence, such comparisons would be useful for identifying which models work best in which kinds of situations. Here we selected four models, covering two fundamentally different clustering strategies (for more on clustering strategies see here, here and here). Centroid-based methods define clusters based on the central tendencies of the data; our analysis includes K-means and Gaussian Mixture Models (GMM). Extrema-based methods define clusters based on points along the convex hull surrounding the data, resulting in clusters with definitions that are more distinguishable from other points; our analysis includes K-maxoids and Archetype Analysis (AA). The AA-generated profiles were described above; details of the results from the other models can be found here.
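To make the centroid-based idea concrete, here is a minimal k-means sketch in plain Python. It is an illustration only and not the implementation used for the analysis: the toy features (average kill distance and share of shotgun kills) are made up, and the deterministic farthest-point initialization is a choice made here for reproducibility. Extrema-based methods such as Archetype Analysis would instead place their prototypes on the convex hull of the data, which this sketch does not attempt.

```python
import math


def kmeans(points, k, iters=50):
    """Plain Lloyd-style k-means on a list of equal-length tuples."""
    # Deterministic farthest-point initialization (an assumption made here).
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    labels = None
    for _ in range(iters):
        # Assignment step: each point joins its nearest center.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centers are too
        labels = new_labels
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centers


# Toy characters: (average kill distance, share of kills with shotguns).
# Real features would normally be standardized first so that no single
# feature dominates the distance computation.
points = [(5.0, 0.8), (6.0, 0.7), (4.0, 0.9),
          (40.0, 0.1), (42.0, 0.05), (38.0, 0.15)]
labels, centers = kmeans(points, 2)
print(labels, centers)
```

Each resulting center is the average of its members, i.e. a "typical" player of that group, which is why centroid-based profiles tend to describe general rather than extreme behavior.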

For our cross-model analysis, we elected to use adjusted mutual information to assess the similarity of the cluster results. Below are the adjusted mutual information values between the 4, 5, and 6 cluster solutions for all considered approaches. Overall, the results ran contrary to our expectations, namely that the centroid-based methods would yield cluster assignments similar to those of other centroid-based methods, and likewise for the extrema-based methods. Instead, clustering solutions for a given approach tend to be similar to themselves, and otherwise solutions have only a slightly higher similarity than what would be expected due to randomness, i.e. an adjusted mutual information score of 0. The relative lack of strong similarity between the clustering solutions suggests that there is no ideal technical approach to take when defining behavioral profiles using dense game telemetry data. Rather, the interpretability and actionability of the method outputs are still the key determinants of a method's usefulness in the context of game design. This conclusion aligns with previous work performed in this area.
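For readers unfamiliar with the measure, the following is a minimal pure-Python sketch of adjusted mutual information between two cluster assignments. It follows the standard definition: mutual information minus its expected value under random assignments with the same cluster sizes, divided by a normalizer minus that same expectation. Using the arithmetic mean of the two entropies as the normalizer is an assumption here, since the post does not state which variant was used. In practice a library routine such as scikit-learn's adjusted_mutual_info_score would normally be used instead.

```python
import math
from collections import Counter


def log_factorial(x):
    """Natural log of x! via the log-gamma function."""
    return math.lgamma(x + 1)


def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_info(u, v):
    n = len(u)
    a, b = Counter(u), Counter(v)
    return sum(
        (nij / n) * math.log(n * nij / (a[i] * b[j]))
        for (i, j), nij in Counter(zip(u, v)).items()
    )


def expected_mutual_info(u, v):
    """Expected MI when cluster sizes are fixed but members are shuffled."""
    n = len(u)
    emi = 0.0
    for ai in Counter(u).values():
        for bj in Counter(v).values():
            for nij in range(max(1, ai + bj - n), min(ai, bj) + 1):
                # Hypergeometric probability of an overlap of size nij.
                log_prob = (
                    log_factorial(ai) + log_factorial(bj)
                    + log_factorial(n - ai) + log_factorial(n - bj)
                    - log_factorial(n) - log_factorial(nij)
                    - log_factorial(ai - nij) - log_factorial(bj - nij)
                    - log_factorial(n - ai - bj + nij)
                )
                emi += (nij / n) * math.log(n * nij / (ai * bj)) * math.exp(log_prob)
    return emi


def adjusted_mutual_info(u, v):
    mi = mutual_info(u, v)
    emi = expected_mutual_info(u, v)
    denom = (entropy(u) + entropy(v)) / 2 - emi  # arithmetic-mean normalizer
    return (mi - emi) / denom if abs(denom) > 1e-12 else 1.0


# Same grouping under different label names: perfect agreement.
same = adjusted_mutual_info([0, 0, 0, 1, 1, 1, 2, 2, 2], list("aaabbbccc"))
# Two groupings that share no information at all.
unrelated = adjusted_mutual_info([0, 0, 1, 1], [0, 1, 0, 1])
print(same, unrelated)
```

Identical groupings score 1 regardless of how the clusters are named, agreement at chance level scores around 0, and small samples can produce negative values, as in the second toy case.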

Cross-model analysis: choosing between four different clustering models.

Discussion

At the heart of Destiny's gameplay lie weapons – lots and lots of weapons – and it turns out that playstyles in the game are heavily influenced by the weapons we equip our characters with. It is hardly surprising – some people like to snipe from afar, others to get up close and personal with melee weapons. Perhaps more than anything, the guns we choose and how we use them define how we play Destiny.

The primary differentiators of character behavior (given the variables in our analysis) fall into three dimensions: a) the usage frequency for different weapons, b) the average kill distance, and c) the time spent playing either PvP or PvE.

The first two of these typically align with each other; players getting more frequent kills with typically longer range weapons tend to have larger average kill distances. Given that Destiny's main gameplay revolves around the collection and upgrading of weapons, the importance of kill frequency by weapon type is understandable, as players may latch onto certain weapon archetypes early in the game and develop a signature loadout. Some players, however, display a variety of weapon usage, suggesting that a portion of the playerbase is willing to adapt to various situations in different game modes by changing their weaponry.

The game mode dimension groups the cluster results into either PvP focused or PvE focused, with few players spending equal time in both – rather there is a marked preference for one of them. Within each game mode, other features serve as proxy measures for activity preferences. For example, offensive and defensive kills are exclusive to the control gametype, in which teams guard territory to earn points, so clusters with large values for these features may correspond to players that prefer objective-based gameplay.

Players tend to focus on either only a few, or a variety of weapon types. Regardless of game mode, clusters of players emerge that prefer either extreme close range or long range playstyles. Long range players use scout and pulse rifles for primaries, and sniper rifles for secondaries. Short range players specialize in melee attacks and point-blank shotgun blasts. Players that vary their weapon choice also tend to include melee attacks, and special abilities such as grenades and super abilities.

Player preference for PvE or PvP varies between two extremes, and within each game mode preferences for specific activities are revealed through average playtime and types of kills. In addition to the three main dimensions described above, the results also demonstrate variability in features that are either more subtle or secondary to Destiny's main gameplay goals of collecting items and defeating enemies. For example, the ratio of player resurrections performed to received was significantly above average for some clusters. In these cases, the values for the remaining features did not seem to follow any identifiable pattern. This could mean that some players are inherently more attuned to supportive roles, regardless of their preferences for certain weapon types. Another feature – the average time remaining in a PvP activity when a player quits – allows for inferences to be made about more nuanced player behavior. Some clusters show high values for this feature, suggesting that some players may be more likely to leave early if a match is not going their way.

Regarding the use of different cluster models, from the matrix and clusters we can conclude that the cluster models give differing results. If we know specifically what kind of behavior needs to be analyzed, it is crucial that the appropriate method is used during the exercise, e.g. k-means for more general behavior or archetype analysis for more extreme behavior. Otherwise, it is important for a clustering project to include a variety of methods in order to evaluate a range of behaviors and scenarios, encompassing both general or typical behavior and more extreme behavior. The results indicate that there is no single best method for examining how players form clusters, but that the choice should be determined by the goal of the analysis and should include multiple models. In essence, different clustering models are more or less suited for specific circumstances or for providing specific views on the data. The choice of clustering algorithm is important.

The developers of Destiny have designed an apparently well-balanced game when it comes to performance; a digital experience where players with different preferences can adopt diverging strategies in order to hit the level-cap and to continue the experience beyond that point. The aforementioned balance is indicated in the cluster analysis results, with varying styles colliding at the top-levels of the game. For developers, an analysis with similar results would serve as an evaluation of the design intent in delivering an experience that can be played in a variety of ways.

For a complete breakdown of the experimental work, see this report.

Please do not hesitate to add insights in the comments section or contact us directly with any questions.

Co-authored by Anders Drachen, James Green, Chester Gray, Elie Harik, Patty Lu, Rafet Sifa and Diego Klabjan.