Can I Use Mechanical Turk (MTurk) for a Research Study?

Amazon Mechanical Turk (MTurk) has quickly become a highly visible source of participants for human subjects research. Psychologists, in particular, have begun to use MTurk as a major source of quick, cheap data. Studies with hundreds or thousands of participants can be completed in mere days, or sometimes even a few hours. When it takes a full semester to get a couple hundred undergraduate research participants, the attractiveness of MTurk is obvious. But is it safe to use MTurk this way? Won’t our research be biased toward MTurk workers?

New work by Landers and Behrend explores the suitability of MTurk and other convenience sampling techniques for research purposes within the field of industrial/organizational psychology (I/O). I/O is a particularly interesting area in which to explore this problem because I/O research focuses upon employee behavior, and the field has long been concerned with sampling questions like: Under what conditions does sampling undergraduates lead to valid conclusions about employees?

Traditionally, opinions here have been quite extreme. Because I/O research is concerned with organizations, some researchers say the only valid research is research conducted within real organizations. Unfortunately, this preference is rooted mostly in tradition and a superficial understanding of sampling.

Sampling in the social sciences works by choosing a population you’re interested in and then choosing people at random from that population. For example, you might say, “I’m interested in Walmart employees” (that’s your population), so you send letters to a randomly selected subset of all Walmart employees asking them to respond to a survey. This is called probability sampling.
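To make the idea concrete, here is a minimal sketch of probability sampling in Python. It assumes you already hold a complete roster of the population, which is exactly the part that is hard in practice. The roster, the function name draw_probability_sample, and the sample size are all hypothetical and chosen only for illustration.

```python
import random


def draw_probability_sample(population, n, seed=None):
    """Draw a simple random sample of n members without replacement.

    Every member of `population` has the same chance of being selected,
    which is what makes this a probability sample.
    """
    rng = random.Random(seed)  # seeded generator so the draw is reproducible
    return rng.sample(population, n)


# Hypothetical roster standing in for "all Walmart employees".
employees = [f"employee_{i}" for i in range(10_000)]

# Randomly select 200 people to receive the survey letter.
survey_recipients = draw_probability_sample(employees, 200, seed=42)
print(len(survey_recipients), "employees selected")
```

The important ingredient is not the random draw itself but the complete roster it draws from. Without a list of everyone in the population, there is nothing to pass in as the first argument.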

The key issue is that probability sampling is effectively impossible in the real world for most psychological questions (including those in I/O). Modern social science research is generally concerned with global relationships between variables. For example, I might want to know, “In general, are people who are highly satisfied with their jobs also high performers?” To sample appropriately, I would need to randomly select employees from every organization on Earth.

Having access to a convenient organization does not solve this problem. Employees in a particular company are a convenience sample just as college students and MTurk workers are. What researchers should pay attention to is not the simple fact that these are convenience samples, but instead what that convenience does to the sample.

For example, if we use a convenient organization, we’re also pulling with it all the hiring procedures that the company has ever used. We’re grabbing organizational culture. We’re grabbing attrition practices. We’re grabbing all sorts of other sample characteristics that are part of this organization. As long as none of our research questions have anything to do with all these extra characteristics we’ve grabbed, there’s no problem. The use of convenience sampling in such a case will introduce unsystematic error – in other words, it doesn’t bias our results and instead just adds noise.

The problems occur only when what we grab is related to our research questions. For example, what if we want to know the relationship between personality and job performance? If our target organization hired on the basis of personality, any statistics we calculate based upon data from this organization will potentially be biased. Fortunately, there are ways to address this statistically (for the statistically inclined: corrections for range restriction), but you must consider all of this before you conduct your study.
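For readers curious what a range restriction correction looks like, here is a minimal sketch of one common version, usually called Thorndike's Case II correction for direct range restriction. It is not necessarily the correction the article has in mind, and it assumes the organization selected directly on the predictor (here, a personality score) and that you know the predictor's standard deviation in the unrestricted applicant pool. The function name and the example numbers are hypothetical.

```python
import math


def correct_range_restriction(r_restricted, sd_unrestricted, sd_restricted):
    """Estimate the unrestricted correlation (Thorndike Case II).

    r_restricted    -- correlation observed in the selected (hired) sample
    sd_unrestricted -- predictor SD in the full applicant pool
    sd_restricted   -- predictor SD among those actually hired
    """
    u = sd_unrestricted / sd_restricted  # how much selection shrank the spread
    return (r_restricted * u) / math.sqrt(1 + r_restricted**2 * (u**2 - 1))


# Made-up values: an observed correlation of .20 among hires, with a
# personality-score SD of 15 among applicants but only 10 among hires.
corrected = correct_range_restriction(0.20, sd_unrestricted=15, sd_restricted=10)
print(f"observed r = 0.20, corrected r = {corrected:.2f}")
```

When the two standard deviations are equal, the ratio is 1 and the correlation comes back unchanged, which matches the intuition that no selection means no correction. The more the hiring process narrowed the range of scores, the larger the upward adjustment.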

MTurk brings the same concerns. People use MTurk for many reasons. Maybe they need a little extra money. Maybe they’re just bored. As long as the reasons people are on MTurk and the differences between MTurkers and your target population aren’t related to your research question, there’s no problem. MTurk away! But you need to explicitly set aside some time to reason through it. If you’re interested in investigating how people respond to cash payments, MTurk probably isn’t a good choice (at least as long as MTurk workers aren’t your population!).

As the article states: