Researchers ranging from psychologists to epidemiologists have wondered for some time whether online, multiplayer games might provide some ways to test concepts that are otherwise difficult to track in the real world. A Saturday morning session at the meeting of the American Association for the Advancement of Science described what might be the most likely way of finding out. With the cooperation of Sony, a collaborative group of academic researchers at a number of institutions has obtained the complete server logs from the company's Everquest 2 MMORPG.

As the researchers who are dealing with this new resource describe it, it's one of those "be careful what you wish for" situations—with nearly 60TB of data, the standard procedures for tackling social data sets just aren't up to the job.

Dmitri Williams introduced the project and described how researchers have been approaching various game developers over the years. He paraphrased the conversation with Sony as:

"What do you collect?"

"Well, everything—what do you want?"

"Can we have it all?"

"Sure."

The end result is a log that includes four years of data for over 400,000 players who took part in the game, which was followed up with demographic surveys of the users. All told, it makes for a massive data set with distinct challenges but plenty of opportunities.

Computer science challenges

Jaideep Srivastava is a computer scientist doing work on machine learning and data mining—in the past, he has studied shopping cart abandonment at Amazon.com, a virtual event without a real-world parallel. He spent a little time talking about the challenges of working with the Everquest II dataset, which on its own doesn't lend itself to processing by common algorithms. For some studies, he has imported the data into a specialized database, one with a large and complex structure. Regardless of format, many one-pass, exhaustive algorithms simply choke on a dataset this large, which is forcing his group to use some incremental analysis methods or to work with subsets of the data.
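Which incremental methods the group uses isn't specified here. As a minimal sketch of the general idea, the Python below keeps a running mean and variance that is updated one record at a time, so the full dataset never has to sit in memory. The input (a stream of session lengths in minutes) and the names RunningStats and summarize are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class RunningStats:
    """Welford's online mean/variance: constant memory per stream."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean

    def update(self, value: float) -> None:
        # Fold one new observation into the running summary.
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)

    @property
    def variance(self) -> float:
        # Sample variance; reported as 0.0 when fewer than two values were seen.
        return self.m2 / (self.count - 1) if self.count > 1 else 0.0


def summarize(session_minutes: Iterable[float]) -> RunningStats:
    # Hypothetical input: any iterable (e.g. a generator reading a log file)
    # that yields one session length at a time.
    stats = RunningStats()
    for minutes in session_minutes:
        stats.update(minutes)
    return stats
```

Because summarize accepts any iterable, it can be fed by a generator that reads the logs line by line, or by a sampled subset of the records.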

Srivastava then gave a short tour of the sorts of items the team is trying to extract from the raw logs. He apparently has graduate students working on non-traditional figures like the "monster composite difficulty index" and an "experience rate measure."

But many of the other measures that researchers in the social sciences want (trust, performance, expertise) are fairly subjective. To estimate them, the team is experimenting with tracking physical proximity and direct interactions, such as when characters share experience from an in-game victory.

To give a concrete example of the data's utility, Srivastava described how he could explore the phenomenon of customer churn, something that's significant for any sort of subscription-based service, like cell phones or cable TV. With the full dataset, the team can now track how individual customers dropping out of the game influenced others they typically played or interacted with. From this data, the team could then calculate the spreading rate and influence factor, providing hard measures to work with.
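How the team defines "spreading rate" and "influence factor" isn't given here. Purely as an illustration, the sketch below assumes one simple definition: among the regular partners who were still active when a player quit, the fraction who quit themselves within a fixed window. The data layout, the 30-day window, and the function name churn_spread_rate are all assumptions, not the researchers' method.

```python
from typing import Dict, Optional, Set


def churn_spread_rate(
    partners: Dict[str, Set[str]],
    quit_day: Dict[str, Optional[int]],
    window_days: int = 30,
) -> float:
    """Share of still-active partners who quit within window_days of a quitter.

    partners: hypothetical map from player to the players they regularly group with.
    quit_day: day number on which each player left, or None if still subscribed.
    """
    exposed = 0   # partners who were still active when someone they knew quit
    followed = 0  # of those, partners who quit within the window
    for player, day in quit_day.items():
        if day is None:
            continue  # this player never quit, so nobody was "exposed" by them
        for other in partners.get(player, set()):
            other_day = quit_day.get(other)
            if other_day is not None and other_day <= day:
                continue  # partner had already left, or left the same day
            exposed += 1
            if other_day is not None and other_day - day <= window_days:
                followed += 1
    return followed / exposed if exposed else 0.0
```

A real analysis would also need a baseline, since some partners would have quit anyway; comparing this rate against the quit rate of players with no departing partners is one way to approach that.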

Getting social

Noshir Contractor described how the data was allowing him to explore social network dynamics within the game. He described a variety of factors that are thought to influence the growth and extent of social networks, such as collective action, social exchange, the search for similar people, physical proximity, friend-of-a-friend (FoaF) interactions, and so on. Because these are well-developed concepts, statistical tools exist that can extract their signature from the raw data by looking at interactions like instant messaging, partnerships, and trade.

Contractor described the results of running these tests on a week's worth of data from a server that saw over 3,000 North American players during that span.

In that week, his team could detect over 2,000 players who became involved in partner relationships and about 2,500 who took part in trade interactions. The IM network had fewer participants; in the question-and-answer session afterwards, Williams suggested that many players rely on VoIP for their interactions: "It's easier to say 'look out' than take your hands off the controls and type it," he said.

Nevertheless, signatures of popularity and FoaF relationships were apparent in the IM data. FoaF relationships were also the most common signature in the other types of interaction.
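The statistical tools used to detect these signatures aren't named here. As a much cruder illustration of what a FoaF signature looks like, the sketch below counts how many newly formed ties connect two players who already shared a contact. The edge lists and the function names are hypothetical, and a plain count like this is descriptive only; it does not control for the other factors the way a proper network model would.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

Edge = Tuple[str, str]  # an undirected tie between two player names


def build_neighbors(edges: Iterable[Edge]) -> Dict[str, Set[str]]:
    # Map each player to the set of players they are already tied to.
    neighbors: Dict[str, Set[str]] = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    return neighbors


def foaf_share(existing: Iterable[Edge], new: Iterable[Edge]) -> float:
    """Fraction of new ties whose endpoints already had a contact in common."""
    neighbors = build_neighbors(existing)
    new_edges = list(new)
    if not new_edges:
        return 0.0
    closed = sum(
        1
        for a, b in new_edges
        if neighbors.get(a, set()) & neighbors.get(b, set())
    )
    return closed / len(new_edges)
```

In this toy framing, "existing" might be last week's partnerships and "new" this week's; a share well above what random pairing would produce hints at friend-of-a-friend effects.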

Mixing in the demographic information produced a few surprises. Gender turned out to be a negative influence on interactions: even after their low numbers were taken into account, female players avoided interacting with each other. Time zones had some influence; players in the same time zone were 1.25 times more likely to partner than players even one time zone apart.

But distance had a much larger effect; players within 10 kilometers of each other were five times more likely to interact. Contractor concluded that, for the typical player, the game simply offered a way of continuing their real-world social interactions in a virtual setting.

Links between the real and virtual world

In addition to introducing the EQ2 logs as a resource, Dmitri Williams described some of the efforts involved in exploring how much of the real world spilled over into the virtual.

The average age of players turned out to be 31. "These aren't just pasty white teenage boys in a basement—to be sure, they're there, but they're not typical," he said. The older players tended to play more than the kids and, although the total hours played seem large, he said that the time mostly displaced either TV watching or movie going. And the surveys showed that those who viewed TV news in the first place continued to do so, suggesting that gamers really slotted EQ2 into their entertainment time.

Mostly, the gamers seemed healthy; their body mass index was better than the US average and, although they were slightly more depressed than average, they were also less anxious.

Buried among those happy, average players was a small subset of the population, about five percent, who used the game for serious role playing. According to Williams, "They are psychologically much worse off than the regular players." They tended to belong to marginalized groups, like ethnic and religious minorities and non-heterosexuals, and to use the game as a coping mechanism.

Implications for gaming and science

Williams pointed out one case where having access to the server logs allowed the researchers to identify some serious skewing in the responses to the demographic surveys. Older women turned out to be some of the most committed players but significantly under-reported the amount of time they spent in the game by three hours per week (men under-reported as well, but only by one hour). The example highlights the risk of using self-reporting for behavioral studies and the potential of the virtual world data.
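The comparison behind that finding is simple arithmetic: subtract each player's self-reported weekly hours from the hours the server actually logged, then average the gap within each group. A minimal sketch follows; the record layout and the function name mean_underreport are assumptions made for illustration, not the researchers' code.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Tuple

# (group label, self-reported hours per week, logged hours per week)
Record = Tuple[str, float, float]


def mean_underreport(records: Iterable[Record]) -> Dict[str, float]:
    """Average of (logged - reported) hours per group; positive means under-reporting."""
    gaps: Dict[str, List[float]] = defaultdict(list)
    for group, reported, logged in records:
        gaps[group].append(logged - reported)
    return {group: mean(values) for group, values in gaps.items()}
```

A positive value for a group means its members played more than they said they did, which is the pattern Williams described.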

Saying, "I'm not tenured yet, and I don't want to tick off that many people at once," Williams wouldn't get into the significance of this finding, but Srivastava was happy to do so. ("I'm tenured and I'm not in the social sciences," he said, "so I can tick as many people off as I like.")

In his view, the data suggests that many studies reporting marginal male-female differences in gaming, if they relied only on self-reported figures, were most likely built on unreliable numbers. It's entirely possible that a number of other sanity checks on past studies are lurking in this virtual data trove.

There was also talk about the potential for a symbiotic relationship between game designers and researchers. Srivastava's work on customer churn, for example, could prove highly valuable for developers that rely on retaining subscribers, and many of the studies that the speakers were interested in doing could provide valuable feedback on how users were actually interacting with various features of the game.

For the most part, the companies that the researchers have approached either haven't been interested in sharing their logs or keep logs that don't contain the sort of data that would make for fruitful research. In several cases, Williams has been told that he should ignore entire classes of events in the logs, because they were put in purely for debugging purposes.

But he argues that this isn't just about researchers losing out. "There are a lot of things we can show them about their bottom line, but these industries are deadline focused," Williams said. "They're not far enough beyond the garage-shop mentality."