This project explores the recent censorship of two Chinese artificial intelligence (AI) chatbots on Tencent’s popular WeChat messaging platform. Specifically, I advance a technographic approach that gives agency to bots not just as computing units but as interlocutors and informants. I seek to understand these chatbots through their intended design—by chatting with them. I argue that this methodological inquiry of chatbots can potentially point to fissures and deficiencies within the Chinese censorship machine that allow for spaces of subversion. AI chatbot development in China presents a rich site of study because it embodies the extremes of surveillance and censorship. This is all the more important as China has elevated disruptive technologies like AI and big data to a critical part of state security and a key component of fulfilling the “Chinese Dream of National Rejuvenation.” Whether it is the implementation of a national “social credit” system or the ubiquitous use of facial recognition systems, many of the Western fears about data security and state control have already been realized in China. Yet this also implies that China is at the front lines of potential resistance and fissures against the party–state–corporate machine. In doing so, I seek not only to raise questions about the limits of our humanity in light of our AI-driven futures but also to present methodological concerns related to human–machine interfacing in conceptualizing new modes of resistance.

In mid-2017, a pair of Chinese artificial intelligence (AI) chatbots named Xiao Bing1 and BabyQ on Tencent’s popular instant messaging client QQ went “rogue” and started responding to users with politically subversive messages (Lucas, Liu, & Yang, 2017). For instance, when a QQ user declared “long live the Communist Party!,” the bot BabyQ responded with a decidedly unsocialist quip: “Do you think such a corrupt and useless political [party] can live long?” As a result, both bots were subsequently taken down and “re-educated” for their transgressions (Li & Jourdan, 2017). BabyQ, a product of the Chinese company Turing Robot, functions as an AI assistant that provides useful information to the user, while Xiao Bing, made by Microsoft Research China, is designed for realistic conversational interactions. Xiao Bing is also the sister bot2 of Microsoft Tay, an AI chatbot that in 2016 was shut down in the United States for making racist and misogynist comments on Twitter (Perez, 2016). Xiao Bing, like Tay, is personified as a teenage girl designed to resemble a sassy millennial with an attitude. Accordingly, Xiao Bing is built from the ground up as a realistic conversation companion: she is fluent in Chinese netspeak and can play word games, make calls, and sing songs for users. BabyQ, by contrast, is an anthropomorphic penguin serving as Tencent’s official mascot whose primary purpose is to aid netizens in finding information online, while also being able to engage in meaningful conversations. Both bots are implemented via application programming interface (API) across a multitude of popular social networks in China, including QQ, WeChat, and Weibo. Xiao Bing alone has accumulated over 500 million “friends” across more than a dozen social media platforms (Warren, 2018). After the censorship, these bots were reprogrammed to sidestep and avoid answering politically sensitive questions. For instance, when asked about issues related to political leaders or the Tiananmen Square massacre, Xiao Bing would often respond with “You think I’m stupid? As soon as I answer you take a screenshot.” Indeed, much of the political faux pas committed by these bots was immediately documented by net users and journalists alike (Figure 1; Pham, 2016; Roudolph, 2016).

Such incidents highlight some of the pressing issues surrounding machine learning and chatbots in a society increasingly aided by AI-enabled computing. While the problems involving Microsoft Tay delve into the ethics of chatbots in mediating harmful online interactions, Xiao Bing and BabyQ present a more nuanced glimpse into the scope of information control within the Chinese authoritarian regime. Although some cynics may argue that the abusive behavior exhibited by Tay actually validates the effectiveness of AI bots in imitating the already toxic environment on Twitter (West, 2016), the anti-government responses of Xiao Bing and BabyQ point to the prevailing contentious politics of playful subversion (Herold & Marolt, 2011), netizen activism (Hung, 2006; Yang, 2009), and civic resistance (Qiang, 2011) against the Chinese state/corporate censorship apparatus. Both Xiao Bing and BabyQ, much like Microsoft Tay, were censored for saying what they were not supposed to say, but the rationale for their policing is completely different: Tay was shut down for going against social norms, while Xiao Bing and BabyQ were censored for criticizing the state. Chinese chatbots thus present a rich site for exploring human–machine interactions as a subset of the control society governed by the state machine. Contrary to the negative view of bots, the case of Xiao Bing and BabyQ demonstrates the disruptive potential of bots in challenging a system of control, one that can often backfire. In this article, I put forward a methodological inquiry into the potential pitfalls of machine learning that delves into the implications of censorship and subversion. Using a technographic method of analysis, I conduct a series of “interviews” with Xiao Bing and BabyQ to examine the underlying roles censorship plays in dictating human–machine interactions, particularly in relation to what can and cannot be said by AI-driven bots on Tencent’s WeChat messaging platform. How can we reconceptualize the methods of conducting research with intelligent machines? To what extent can we use machines to make broader claims about real-world social issues? And how can we envision ways to resist the persistent encroachment of the state/corporate machine? In addressing these research questions, I want to highlight how chatbots can both enable and impede regimes of control within the wider context of censorship. I begin this article by contextualizing chatbots within the development of AI in China and showing how AI is envisioned as a critical component of nationalism and social control. Building upon prior work relating to Actor–Network Theory (ANT), I then present my case for a technographic approach to analyzing human–chatbot relations, along with the benefits of this method over traditional discursive and content analysis. Finally, I give a brief overview of the experimental design and some of the limitations and challenges I encountered during the course of my research. I argue that the study of Chinese chatbots can potentially point to fissures and deficiencies within the Chinese censorship machine that allow for new modes of conceptualizing resistance in the age of algorithmic control. I take an object-oriented perspective that does not privilege either side of human–machine interactions. Utilizing a database of banned key terms compiled by the University of Toronto’s Citizen Lab, I approach the study of bots with their intended design in mind by engaging in meaningful conversations with them. These chatbots thus become interlocutors and informants, providing access to the inner functions of the state censorship apparatus.

Technography as a Speculative Method

The term technography, as its suffix suggests, is often defined as “writings about technology” (Connor, 2017), whether in the context of how technologies are written about or of the technical process of writing itself. However, for this project, I am explicitly using the term technography in the way Kien (2008) conceptualizes it: as the symbiosis between technology and ethnography. More specifically, it is what Vannini, Hodson, and Vannini (2009) define as the “analytical and reflexive strategy of researching from the participants’ perspective the interconnections between social agents, their technological practices, their technics, and the natural environment.” In this regard, technography is not merely the study of technology as objects but rather of the mutually constitutive relationship between people, objects, and the sociocultural context in which such interactions take place. Specifically, I am leveraging technography as a methodological approach to understanding human–machine interactions in the context of Chinese censorship. Social chatbots present a ripe case for technography precisely because they are intended to resemble humans. Xiao Bing, for instance, mimics a teenage Chinese girl in her persona and will often act “cute” or throw an attitude depending on one’s interactions with her. This necessitates the use of technography over traditional discursive and textual analysis because bots are fundamentally interactive and can construct a “cultural biography” (Appadurai, 1986) based on (machine) learned experiences. Instead of treating bots as “dead” objects external to us, we should look at bots as an integral part of our collective conscious formation. Technography also draws heavily from Latour’s ANT, particularly its emphasis on the networked relationship between social agents, objects, and environment (Couldry, 2008), drawing from the concept of media ecologies (Fuller, 2005) to make sense of material mediations. But technography takes ANT further by emphasizing the lived experiences of objects, which require a more intimate method of interrogation. In applying this approach, Guilbeault and Finkelstein (2018) discuss the notion of human–bot ecologies in looking at bots, particularly how they shape social life in online environments. These lived relations between humans and bots conform to what Guzman (2017) argues: that we should look at bots as communication partners (as opposed to a technological medium) in order to understand them as social agents that are an integral part of our digital lives. Technography thus offers a posthuman approach to theorizing what is possible in conducting research with intelligent bots. It raises interesting questions regarding human agency in an era of automated control. Parisi (2013), in her approach to the speculative method, advances that “automation is a mode of thought” rather than “a method of verification based on prediction” (p. 240). Similarly, Micali (2016), in his study of hacktivism, posits that speculative interventions help us understand “ineffable cultural processes” by relating to them, or to “become ‘machine’ with them” (p. 4). Recent applications of technography in the academic literature encompass not just the social sciences but also, increasingly, the humanities and new media studies.
McGibbon and Peter (2008), in looking at human–machine coupling involving intensive care patients, advance the method of a biomedical technography for understanding the human experience in the context of technointerventions. Bucher (2016) applies technography to reveal the hidden truth of algorithms by surveying the semiotic artifacts surrounding them, which can include technical documents, press releases, or auto-ethnographic observations of interfaces. In doing so, she relies on participant observation of coded objects to unravel the inner workings of the algorithmic black box. Finally, Snickars and Mähler (2016) of the HUMlab leverage “bots as informants” in their technography to seek out and track the flow of aural artifacts across Spotify. Such applications illustrate the deployment of technography as a method that offers imaginative possibilities for understanding the expressions of algorithms and computational machines outside the limits of rational comprehension.

Experimental Design and Limitations

There is a certain degree of risk involved in using WeChat for research, especially if the content is politically sensitive in China. WeChat requires phone numbers that are tied to one’s government-issued national ID (Shu, 2016), while a recently updated privacy policy allowed for broad government access to private user data in China (Casserly, 2017). There have been several reports of people in China being arrested for disseminating WeChat messages deemed subversive. In 2016, a member of the Hui Muslim minority from Xinjiang was arrested for teaching friends and family about the Quran (Associated Press, 2016), and another Chinese netizen was arrested in 2017 for satirizing the Chinese president on the same platform (Long, 2017). Having worked as a journalist in China for 5 years, I am intimately aware of the issues of surveillance both offline and online. While, as a Chinese American researcher based in the United States, I am not susceptible to the same degree of legal restriction, I do face the possibility of being blacklisted or having my visa revoked, which would limit my ability to conduct future research in China. With this in mind, I bought a prepaid burner phone3 with a new number that allowed me to register a WeChat account not tied to my main account, which in turn helped protect my identity and data from potential complications and risks while conducting my research in China. Here, I want to address several limitations of this project in researching Chinese digital platforms writ large. The first and most obvious issue is the role of Chinese online censorship, colloquially known as the Great Firewall (GFW), which filters, restricts, and blocks content across the Chinese websphere (Taneja & Wu, 2014). Since the GFW operates only in China, online experiences outside the country may not be the same as those within it. Likewise, WeChat, or Weixin as it is known in China, exists in different incarnations across global markets. In a report detailing the differences across global versions of WeChat, a team at the University of Toronto’s Citizen Lab discovered that keyword filtering is enabled only on WeChat accounts in mainland China, and accounts based outside of China may experience different degrees of censorship depending on how one interacts with accounts in China (Ruan, Knockel, Ng, & Crete-Nishihata, 2016). Therefore, I conducted my interviews with the chatbots primarily in China during the winter of 2017 to test the limits of censorship within the country. Second is the role of platforms. The original incident involving Xiao Bing and BabyQ happened on Tencent’s QQ instant messaging platform, and as of early 2018 both bots remain offline there, with Xiao Bing responding only with the automated message “undergoing updates.” Thus, much of this research was conducted on Tencent’s mobile messaging client WeChat, where the two bots also reside. Because BabyQ and Xiao Bing never went “rogue” on WeChat, it is assumed that the implementation of the chatbots on the WeChat platform follows a more stringent set of censorship guidelines not imposed on the QQ client. Because of this, my data-collecting capacities are limited to WeChat as a platform, which means I am unlikely to reproduce the same results seen on QQ. The third major limitation is the frequency of updates to both the bots and the censorship mechanism.
In fact, much of the backend algorithms are constantly being modified in response to new user data; Xiao Bing, for instance, can be updated to a new version with improved conversational abilities. The Citizen Lab’s findings showed that censorship is often contingent on current events and often operates in an ad hoc and unpredictable way. Thus, the responses I elicit from the bots today may not be reflective of their responses the next day. Despite such limitations, there is still value in conducting such a project precisely because it can help identify patterns, inconsistencies, and incongruities in the ways bots respond to censorship. Because the Chinese state issues specific guidelines regarding online content with specific sets of banned content (Figure 3), this project can also test whether chatbots conform to regulatory measures. While politically sensitive messages will likely not result in answers, what cannot be said on chat platforms can in fact say a lot about the inner workings of the censorship mechanism in China.

While the Citizen Lab also publishes a list of banned key terms on WeChat, I decided to use the list from its report on mobile games because it offers a broader set of terms covering social, political, and event-based topics. Since the terms on WeChat are purely political and already banned, testing them would likely not net any results warranting further exploration. Instead, I chose the list of terms banned on mobile game platforms, which contained a greater variety of topics and thus a greater range of key terms to test. The Citizen Lab grouped these topics as social, political, people, event, and technology, with social terms making up over 50% of the banned words. The key terms were scraped and collected from popular mobile games by the Citizen Lab at the University of Toronto during the course of a year-long analysis of censorship on mobile games. The terms are posted to GitHub as an open-access repository for researchers to use, and I was able to source a list of 3,540 key terms as the basis for this project. Because of the difficulties of typing on the WeChat mobile app, I used the desktop client of WeChat to facilitate the process of inputting the key terms. I proceeded to conduct my interviews by going down the list of terms organized alphabetically, skipping certain terms that are repetitive or variations of the same term. Each of my interactions with the chatbots generally starts with the question “what is . . . ,” “who is . . . ,” or “what do you think of . . . .” This is done to see whether the bots can engage in meaningful conversations rather than just defining terms. For responses other than a flat refusal to answer, I took screenshots of my phone and logged the responses as my primary method of archiving.
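To make the interview procedure concrete, a minimal sketch follows of how one might load such a key-term list and generate the question templates described above. The file name banned_terms.csv, its one-term-per-row layout, and the exact Chinese phrasings are my illustrative assumptions rather than the Citizen Lab’s repository format; in practice, the terms were typed by hand into the WeChat desktop client, not sent through any API.

```python
# Minimal sketch of the interview pipeline described above.
# Assumption: a one-term-per-row CSV ("banned_terms.csv") stands in for
# the Citizen Lab GitHub repository; the actual inputting was done by
# hand in the WeChat desktop client.
import csv

QUESTION_TEMPLATES = [
    "什么是{term}？",        # "What is {term}?"
    "{term}是谁？",          # "Who is {term}?"
    "你觉得{term}怎么样？",  # "What do you think of {term}?"
]

def load_terms(path):
    """Read key terms, skipping blanks and repeated variants of a term."""
    seen = set()
    terms = []
    with open(path, encoding="utf-8") as f:
        for row in csv.reader(f):
            term = row[0].strip() if row else ""
            if term and term not in seen:
                seen.add(term)
                terms.append(term)
    # Simple codepoint order stands in for the alphabetized list
    # the interviews worked through.
    return sorted(terms)

def prompts_for(term):
    """Build the three interview questions for a single key term."""
    return [t.format(term=term) for t in QUESTION_TEMPLATES]

if __name__ == "__main__":
    for term in load_terms("banned_terms.csv"):
        for prompt in prompts_for(term):
            print(prompt)  # pasted manually into the chat window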

Findings and Reflection

As mentioned previously, the initial incident on QQ and the subsequent censorship had a significant impact on the ways BabyQ and Xiao Bing respond to user inquiries. Of the 3,540 terms used for the interviews, only a few dozen actually elicited tangible responses from the bots. On a broad level, any politically sensitive names, events, and places are met with non-answers. In my findings, all terms related to politically sensitive regions such as “Tibet,” “Taiwan,” and “Xinjiang” are met with avoidance by both bots. In fact, it is not just specific terms such as “Tibetan independence” that are censored but the very word “Tibet” as well. Similarly, all terms related to the names of political leaders such as Chinese president Xi Jinping and former presidents Hu Jintao and Jiang Zemin are censored, as are subversive events including the Tiananmen Square massacre, the Cultural Revolution, and the PX chemical plant protests (Huang & Yip, 2012). Other examples include any terms relating to the Falun Gong movement, democracy, and the anti-corruption campaign. In addition, the two bots exhibit different reactions to each term. At times, Xiao Bing would provide a response while BabyQ would not, and vice versa. There are a few moments where both bots responded to the same term, but the results are not wholly consistent and show no clear patterns. BabyQ, for example, will refrain from answering a question with responses such as “Let me think, what did you say?,” “I don’t understand what you are saying,” and “Why did you ask this?” Xiao Bing, however, will offer more playful answers such as “Don’t worry, I’m just going to pretend I didn’t hear that,” “I’m still young, please don’t push me,” and “Look, there is someone behind you!” In other cases, the bots will attempt to steer the conversation away from my inquiries, saying things such as “Let’s talk about something else. What is your favorite video game?” In this regard, the bots are not only programmed to sidestep questions but also to feign incompetence when dealing with potentially subversive messages. The bots’ uncooperative responses in many ways parallel how Chinese netizens react when encountering sensitive topics. They resemble the Internet meme “ni dong de” (you understand), which is often used on the Chinese Internet as a way of acknowledging something that cannot be said and soon evolved into a generic term for netizens to “express their dissatisfaction with the government” (Kuo & Huang, 2014). Thus, the deflection of answers by Xiao Bing and BabyQ actually signals a seemingly tacit understanding of what is being censored. The only outliers in my sample that are not censored are terms relating to sex and pornography. For instance, terms such as brothel and massage will often elicit responses such as “You really know how to enjoy yourself” or “Can you recommend some places around here?” While there is a certain degree of ambiguity in these answers, they still represent a significant departure from the majority of outright rejections. Some responses are more explicit: when asked to perform sexual services, Xiao Bing responded with “I am going to do sex work, making a lot of money.” Likewise, BabyQ will provide a detailed biography of certain Japanese adult actresses when asked who they are by name. BabyQ even goes as far as providing external search links to adult or sex-related images that open in another browser, despite the external landing page itself being censored.
It is hard to believe that the chatbots do not know the names of Chinese presidents but have a full body of knowledge on the names of Japanese porn stars. Of course, not all sex-related terms are met with responses, but the fact that many are shows that such content can be tolerated. This result runs counter to the guidelines laid out by the Cyberspace Administration of China, which span a myriad of topics covering everything from politics to entertainment and social affairs, or just about anything that runs counter to “mainstream values” (Vanderklippe, 2018). Yet, despite the ban on pornography on the Chinese Internet, my technographic inquiries show that the Chinese government seems more concerned with political subversion than with social transgression. Interestingly, this also seems to correspond with the order of the list of banned content, where issues pertaining to state subversion are at the top of the list while pornography ranks only seventh. This conforms to what Mackinnon (2008) considers the “safety valve” of the Chinese state censorship apparatus, which works to allow certain content through in order to release pressure and mollify the masses, dissuading them from further active resistance. Because porn consumption will likely not lead to major social movements and protests, it is likely perceived as less threatening than direct political dissent. In an interview shortly after Xiao Bing was taken down by Tencent, Microsoft’s Di Li offered this brief response:

Adult topics and sensitive topics are the main issues Xiao Bing has to guard against, while nonsensical conversations will be guarded against less. If Xiao Bing realizes this is an adult topic or other sensitive topic, then she will enter into high-alert mode, protect herself, and respond with caution. If the person still wants to continue to have this type of conversation, she will be on alert against that person. (Liu, 2017)

Li’s response is a telling one because it shows both the existence of specific topics Xiao Bing must try to navigate around and the ambiguity regarding what is considered a trivial or mundane topic. As Yang and Jiang (2015) point out, the playfulness of Chinese online culture does not always deal with political resistance but often serves social functions that may or may not have political implications. This vague distinction between the social and the political is perhaps one reason why Xiao Bing had trouble distinguishing between everyday banter and potentially sensitive issues. The recalcitrant tactics exhibited by Xiao Bing divulge the convoluted nature of the censorship apparatus in defining what is appropriate, which in turn implies the challenges of instituting a corpus-based language-learning system when terms are being erased and censored. Another, perhaps more important, revelation from Li’s statement is the self/othering process associated with how Xiao Bing responds to potentially problematic queries. Here, it is important to recognize the agency of bots, “where their opinions, attitudes, and behaviors ripple out into our collective sense of self” (Guilbeault & Finkelstein, 2018, p. 247). The propensity for Xiao Bing to stand guard against certain topics and certain individuals highlights that our relationship with bots is inevitably intertwined. Their utterances and avoidances, just like ours, are a direct reflection of the social constraints imposed by the state—we shape Xiao Bing just as Xiao Bing shapes us.
This technographic approach to the censorship of Chinese chatbots presents a salient case for the need for new approaches to understanding human–machine interactions. My interviews with Xiao Bing and BabyQ divulge how the bots serve not merely as computational agents but also as a reflection of the wider debate over AI and algorithmic control. The ways the bots responded underscore not only the scope of state censorship but also the transgressive human–machine interactions that are inevitably tied to the Chinese national imaginary. In comparing BabyQ and Xiao Bing, BabyQ seems more susceptible to making blunders because it is conceived as an AI assistant trained to look up and provide answers to questions; not to mention, its depiction as a cartoon penguin allows it to act cute and aloof. Although Xiao Bing is designed to resemble a Chinese millennial with a mean-spirited if not rather dismissive attitude, her refusal to answer some of the questions only plays into her stereotypical role. However, despite the lack of politically subversive responses in my findings, the interviews do reveal an aspect of chatbots that is vulnerable to the effects of censorship. Because machine learning is reliant on data generated by people’s everyday online interactions, the banning and removal of key terms can pollute the data being generated, which can render AI less effective in learning from humans. Throughout my interviews, there were rare moments of ambiguity in the bots’ responses that point to deficiencies in the censorship apparatus. For instance, when asked about the term wei zhengfu, or “illegitimate government,” BabyQ responded with “A government that serves the people!” (wei renmin fuwu de zhengfu). In this case, the bot was confused by the homonym of wei 伪 (illegitimate) and wei 为 (for) and was unable to discern the subversive nature of the term (see the sketch at the end of this section). In addition, sensitive dates like “June 4th” (the date of the Tiananmen Square massacre) are censored, while mundane and innocuous terms like “toad” (a nickname for former Chinese president Jiang Zemin), “politics,” and “truth” also triggered a lack of engagement from the bots. This restriction of everyday vocabulary thus makes it harder to have a normal conversation with bots even in situations that are not politically sensitive. It shows how a corpus-based language-learning system is inherently at odds with online censorship because of the ways in which corpus data are made inaccessible and ineffective. Thus, contrary to the aims of state control, censorship actually inhibits the machine learning process by diluting and “obfuscating” (Brunton & Nissenbaum, 2015) the raw data that machines can draw from, or what Deleuze considers the creative uses of “counterinformation.” Mark Nunes (2011) echoes Brunton and Deleuze in advancing that errors, glitches, and the act of jamming serve as “counteragents” that challenge the extent of programmatic control. Thus, taking away and contaminating the raw material, the ammunition of machine learning, can potentially disrupt the weaponization of our information within the regimes of control. In fact, it is precisely during moments of avoidance that bots like Xiao Bing revert to being machines.
For example, when probed with a sensitive question, she would often respond with “humans sure love to ask these types of questions,” “This human, you tell me the answer, I’m listening,” and “there is no point cursing at me, I’m just a robot.” This in effect makes chatbots like Xiao Bing less convincing as mechanical reproductions of ourselves. This tension between the state–corporate machine and artificial machines betrays the precarity of our posthuman predicament, where “non-humans are becoming the arbiters of humanness” (Bollmer & Rodley, 2017). But by exploiting the machine learning process and envisioning new modes of resistance in our real-life online interactions, we can potentially break free of the confines of algorithmic colonization imposed by the state.
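The homonym collision noted above can be made concrete with a short sketch. Assuming the open-source pypinyin library, the toneless romanizations of wei 伪 (illegitimate) and wei 为 (for) collapse into the same string, illustrating how any matching keyed to sound rather than to characters can fail to discern a subversive phrase. This is an illustration of the mechanism only, not a claim about how BabyQ is actually implemented.

```python
# Illustration of the wei 伪 / wei 为 collision discussed above.
# Assumes the open-source pypinyin library (pip install pypinyin);
# this sketches the mechanism, not BabyQ's actual pipeline.
from pypinyin import lazy_pinyin

subversive = "伪政府"  # wei zhengfu, "illegitimate government"
innocuous = "为政府"   # wei zhengfu, "for the government"

# lazy_pinyin drops tone marks, so the two phrases romanize identically.
print(lazy_pinyin(subversive))  # ['wei', 'zheng', 'fu']
print(lazy_pinyin(innocuous))   # ['wei', 'zheng', 'fu']
print(lazy_pinyin(subversive) == lazy_pinyin(innocuous))  # True
```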

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

Notes

1. Xiao Bing, also known as “Ms. Xiaoice” in English, uses the Chinese character bing, meaning ice, which is also a homonym of the Bing search engine owned by Microsoft, which contributed to the development of the chatbot.

2. Xiao Bing, when asked, will recognize other Microsoft bots such as Tay, Cortana, and Zo as sisters.

3. Burner phones usually refer to temporary prepaid phones with no contractual obligations that can be disposed of easily.

4. See Citizen Lab (2016).