As a UX researcher for a social media operation, Ute considers different interface designs that might allow users to make more social contacts. Ute gets a radical idea to test her hunches: What if we manipulated some of our current users’ profile pictures and measured the impact of those changes on their friends list? If successful, her research would provide valuable insight into the social media design elements most likely to result in sociability online. Of course, a successful study would also diminish the experiences of thousands already using her company’s service. In Ute’s mind, this is a simple A/B test, yet in the wake of recent controversy surrounding social media research, she’s starting to wonder if she should be concerned about the ethics of her work.

As a research scientist and professor at two different universities, I work to better understand the social and psychological impact of technology on human communication. Our experiments have tested the limits of accepted research design practice, ranging from manipulating romantic jealousy using social networks to studying the impact of induced stress and boredom on video game experiences, along with a host of other experiments and observations. Yet these studies all share a common element: each was subject to intensive internal and external ethical review to ensure that participants were both informed (either before or after the study concluded) and unharmed.

On these two points, recent debates surrounding the Facebook “emotional contagion” study have centered on notions of informed consent (Did Facebook users know they were in a study?) and minimizing harm (Were any Facebook users hurt by this study?). Yet, to the majority of UX researchers who have not undergone the same required extensive ethics training as biomedical and social scientists, some of these issues appear more abstract than useful. To this end, I offer below an “insider’s perspective” into the mechanics of research ethics, along with some issues that UX researchers might consider in their daily practice.

So, UX research isn’t research!?!

First, a quick primer on how we define research. As would be suggested in the job title, UX researchers are often tasked with gathering and analyzing user data, usually drawing comparisons between different interface designs to see which ones result in the most desired behaviors among particular users.

However, such activity does not usually fall under the legal definition of research. According to U.S. Department of Health and Human Services regulations (45 CFR §46.102), research is defined as “systematic investigation, including research development, testing and evaluation, designed to develop or contribute to generalizable knowledge.”

That last clause, “… generalizable knowledge,” is key, as the vast majority of A/B testing is not intended to contribute to the larger body of knowledge on UX. Indeed, much of this work is proprietary to the companies conducting it and never released to the public. Ironically, Facebook might well have avoided the controversy by never publishing the study in the first place, an idea that led to a bit of confusion on Twitter as to why it’s okay to do research so long as it isn’t published.

What that means for us UX researchers is that, technically, most of our work is “allowed” because it doesn’t legally qualify as research. However, in order to make ethical decisions that we are comfortable with as human beings, it’s worth digging deeper to understand why UX research isn’t subject to the same ethics reviews as other research.

Legally ethical research

One common reason that internal corporate research—such as product testing—is not often subject to ethics review is that most UX research is done on anonymous data, or data without any personal information.

Regarding the Facebook study, one university exempted the study from internal review because the researchers were never given direct access to any individual Facebook user data. In general, research on big data tends to be exempt from ethics review so long as the data is aggregated and not focused on individual persons, and many social and behavioral scientists have subscribed to this ethical perspective.

However, even when data is anonymous, this doesn’t mean that people aren’t affected. In most research ethics reviews, the main concern is balancing the risks and rewards of a given study. The research team must prepare an argument that the societal benefits of the study’s potential outcomes substantially outweigh any risks to people participating in the study.

As a dramatic example, a team of biomedical researchers might approach terminal cancer patients with an opportunity to participate in a randomized controlled trial in which they are randomly assigned to receive either (a) a proprietary and experimental cancer medication or (b) a placebo. In this case, the societal benefits (a potential cure for a particular cancer) are thought to outweigh the risks (the eventual death of terminal cancer patients not receiving the experimental medication).

Likely, the risks of most technology research (including my own) are far less extreme – perhaps influencing a user to spend more time reading a particular advertisement or sharing a story element with their social media followers. However, UX researchers should still ask the question: “Would participants in this study be exposed to risks that are greater than those encountered in everyday life?” If the researchers can honestly answer “no,” then their studies are usually fine. In the case of the Facebook study, most have argued that the purposeful manipulation of emotions exposed participants to unnecessary psychological risk (such as depression or other negative emotional states). Moreover, while the Facebook study’s observed effects turned out to be statistically minute, many have counter-argued that the authors had no way to fully understand the potential effects of their emotion manipulations in such a way that they could have meaningfully worked to mitigate harm.

A great example of ethically-sound and effective industry A/B testing was performed by Dr. Jeffrey Lin, a research scientist with Riot Games trying to better understand reports of “toxic chat” in the video game League of Legends. His team of scientists manipulated several features of the game’s chat system without (initial) player knowledge, eventually finding that one of the best ways to protect players from salty talk was to simply disable in-game chat features by default. The end result was a dramatic drop in offensive language, obscenity, and negative affect, even while the actual chat activity remained stable.
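Evaluating an A/B test like Lin’s comes down to comparing outcome rates between the two interface conditions. A minimal sketch, using a two-proportion z-test with entirely hypothetical counts (the actual Riot Games data and analysis are not public):

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_ztest(success_a, n_a, success_b, n_b):
    """Two-sided two-proportion z-test: is the rate in group A
    different from the rate in group B?"""
    p_a, p_b = success_a / n_a, success_b / n_b
    # Pooled proportion under the null hypothesis of equal rates
    p_pool = (success_a + success_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical counts: matches containing offensive chat, per condition
z, p = two_proportion_ztest(success_a=420, n_a=5000,   # chat on by default
                            success_b=260, n_b=5000)   # chat off by default
print(f"z = {z:.2f}, p = {p:.4f}")
```

The same comparison could, of course, be run on any binary outcome measure, such as whether a session included a report for verbal abuse.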

Why did their UX research get so much praise, while Facebook’s got so much poison? Similar to the Facebook study, data was collected and analyzed anonymously (raw chat data) and participants were not informed about the study. Similar to the Facebook study, Lin’s team was interested in emotions from technology usage (in fact, both studies dealt with the same “emotional contagion” effect). However, unlike the Facebook study, Lin’s work did not expose participants to negative effects beyond those already existing in the game (i.e., “toxic talk”) but instead, randomly assigned some gamers to the “chat off” interface as a potential treatment for an observed problem in their product: negative play experiences.

For a UX research analog, consider how many A/B studies are done on the impact of color scheme on interface behaviors. UX researchers are often tasked with designing interfaces that might be more emotionally stimulating to users so that they might engage in a desired behavior. Many are inspired by color psychology, with recent work applying the theory to algorithms able to retrieve images based on the emotional content of a web page.

Fitting a hypothetical question back into Ute’s original research model, we might wonder about the ethics of an A/B study that intentionally presents users with a frustrating, stressful, or otherwise emotionally negative interface. Some might argue that testing both “good” and “bad” experiences is necessary in order to have a complete understanding of UX, but I would contend that purposeful exposure to a negative experience does little to advance UX, while it does a lot to frustrate users who might not be in a state of mind to handle it.

How can we be more ethical?

What can the active UX researcher take away from all of this? A long breath of relief. It is unlikely that any eventual fallout of the Facebook study (including a potential Federal Trade Commission investigation) will result in a death knell for corporate and organizational A/B testing.

However, this breath of relief – as with any contemplative effort – should be followed by a deep inhalation and a consideration of the “real” units of analysis in any UX research: individual people.

Let’s reconsider Ute’s dilemma from our introduction, but this time through the lens of a few questions that I recommend all UX researchers ask themselves when considering the ethics of their own work. Indeed, these are essentially the same questions I ask myself (and my institutions’ ethics boards ask of me) at the start of any research:

1. Is the manipulation theoretically or logically justified?

In scientific research, a research team often has to prepare a short literature review to explain the theory and logic behind their proposed manipulation. This is an essential step in the research process, as it provides the potential explanation for any observed effects. After all, what good is a positive A/B test if the researcher can’t give an explanation for the observed results? If Ute can’t produce a sound theoretical or logical explanation as to why she thinks visuals will be more engaging (although there is some data on the topic), then I might suggest that she needs to do more homework before conducting her study.

2. Is a manipulation necessary for my research?

As mentioned above, a key “tipping point” in the ethics debate around the Facebook study was the active manipulation of users’ news feeds. While experiments are often considered the “gold standard” of research, it is important to remember that they are not the only way to establish causality. In a famous example from 1968, scholars Donald Shaw and Maxwell McCombs demonstrated that the mass media’s coverage of election topics in July of that year (a U.S. presidential election year) heavily influenced public opinion about the importance of those topics in November of the same year. They did so using a cross-lagged correlational design, a simple design in which researchers take multiple measurements over time and compare their influence on one another. One way that Ute could get around the ethical dilemma of actively manipulating user profiles is to use a similar design: watching users’ natural behavior over a set period of time and looking for changes in user behavior associated with (in Ute’s case) using more or fewer photos in profile posts.

3. Could the manipulation be potentially harmful in any way?

Once a manipulation has been logically justified and considered necessary for addressing a UX researcher’s burning question, the project still isn’t ready for the green light until it can arguably pass the most important scrutiny: could the manipulation reasonably expose participants to any risks beyond what they would encounter in their normal usage of a site or platform? For Ute’s question, it might seem harmless enough to add or hide a few selfies on randomly selected user profiles. However, media psychologists suggest that selfies are a key component of identity expression, and we might question the extent to which Ute’s research proposal would disrupt these users’ online experiences. To some extent, the minimization of harm is very much related to having a clear understanding of the mechanisms behind a study (the first question on our list).

4. How might our users feel about being studied?

The first three questions deal with planning and implementing a UX research project, but there is a final important ethical consideration: the user experience in the study itself. Oftentimes in psychology experiments, researchers will conduct an exit survey in which they (a) explain to study participants the purpose of the study, (b) debrief them about the mechanics of the study manipulations, (c) provide participants a chance to comment on the study, and (d) ask them to offer oral or written consent, allowing the user’s data to be included in the final research report. While not always practical, such a practice can go a long way toward making users feel included in the research process. In addition, these interviews can provide qualitative data that might explain larger data abnormalities (in the business, we refer to this as mixed methods research). In general, chances are that if a UX research team doesn’t feel comfortable informing users about their role in a study, then they shouldn’t be conducting the study in the first place.

While intensive ethics training might not be practical, it wouldn’t hurt to at least consider the impact of the research beyond the data. Taking a more critical eye to the possible impact of A/B testing on users will not only result in more compassionate studies, but more compelling and effective results to boot.