The Chinese government fabricates an estimated 448 million social media posts a year in a strategy that seeks to create the appearance of “viral” outbursts of Web activity, according to a new study by Harvard data scientists.

The posts appear under the names of apparently ordinary people, and aim to distract from topics related to actual or potential collective action, said Gary King, the Albert J. Weatherhead III University Professor, who carried out the research with two of his former graduate students: Jennifer Pan, now an assistant professor at Stanford University, and Margaret Roberts, an assistant professor at the University of California at San Diego.

The research builds on an analysis of a 2014 leak of emails sent to one county’s propaganda department. The team used this information to extrapolate to the country as a whole and to understand the content and purpose of the social media posts and the Chinese government’s strategy.

The research shows that assumptions about the Chinese government’s tactics in this area are wrong, King said. The prevailing belief among journalists, academics, and activists, he said, has been that the government maintains an aggressive social media strategy that actively rebuts anti-government posts and tries to cast opponents, whether domestic or foreign, institutional or individual, in a negative light.

In fact, such posts make up a tiny minority, the researchers found. Most qualify as “cheerleading”: praise for the government and items on revolutionary history, national holidays, and other patriotic themes. In short, King said, the government is trying to distract people and defuse tension over fraught issues.

The strategy makes sense, he said. The chance of changing minds through argument is remote; changing the subject is usually a more effective tack. “It’s the same strategy we use with our kids. We distract them: ‘Look at this shiny thing’ or ‘You have a good argument. Now let’s go out for ice cream,’” King said.

The research, supported by the Institute for Quantitative Social Science, which King directs, grew out of an earlier project to evaluate a tool built to automatically analyze text from blogs, websites, and social media sites. After success in developing the methodology and analyzing English-language social media posts, King and his colleagues wanted to stress-test the tool’s limits. Because it had been built for English, they decided to target a notoriously difficult language: Chinese.

When the group examined the results, King realized that they somehow had been able to scrape social media posts that were subsequently censored by the Chinese government. They examined which posts were censored, finding that the government didn’t censor all critical posts, only those calling for protests and other collective action. In fact, it even censored posts calling for rallies in the government’s favor.

The new work probes deeper into a sophisticated strategy that doesn’t seek to stifle all discontent, but instead focuses on discouraging on-the-ground action against the government.

The findings also shed light on who’s behind the posts. Speculation has often centered on the so-called 50-cent party, thought to be a cadre of dedicated posters paid 50 Chinese cents (about 8 U.S. cents) per post. But the research suggests that the 50-centers are regular government workers who author the posts without extra compensation, perhaps as an add-on to their everyday duties.

The 2014 leak, by an anonymous blogger, released an archive of all 2013 and 2014 emails sent to the account of Zhanggong District’s Internet Propaganda Office. The archive included numerous emails from workers claiming credit for completing their 50-cent posting assignments, as well as other communications.

The archive’s size and the complexity of its content (it included screenshots, numerous attachments, different document formats, multiple email storage formats, and links to outside information) had been an obstacle to systematic analysis. King and colleagues devised a variety of procedures, some automated and some manual, such as hand coding, to categorize the content of the leaked messages.

The researchers identified 43,797 fake social media posts as well as the accounts behind them. They expanded their analysis to all posts from those accounts, some 167,971, and then used the accounts’ characteristics to identify similar accounts nationwide. At each step, they developed new data science methods to examine the content of the posts and found that the dominant category was cheerleading, alongside some factual reporting and innocuous praise and suggestions.

Overall, the research estimates that the government fakes some 448 million social media posts a year, all written by humans without automation. While that may seem like a lot, about half are posted on official government sites; those posted on commercial sites amount to just one in every 178 posts there. Still, King said, the government’s strategy of coordinated bursts in response to specific calls for collective action probably gives the posts an outsized effect.
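To put those figures in proportion, here is a rough back-of-the-envelope calculation in Python. It is only a sketch, not part of the study. It assumes that “about half” means exactly 50 percent, and it reads the one-in-178 rate as applying to all posts on commercial sites. The function names are hypothetical, and the implied total is derived from the numbers quoted above rather than reported by the researchers.

```python
# Back-of-the-envelope arithmetic on the figures quoted in the article.
TOTAL_FAKE_POSTS_PER_YEAR = 448_000_000  # estimate reported by the study
GOV_SITE_SHARE = 0.5   # assumption: "about half" treated as exactly 0.5
COMMERCIAL_ONE_IN = 178  # one fabricated post in every 178 on commercial sites


def split_posts(total, gov_share):
    """Split the yearly total into (government-site, commercial-site) posts."""
    gov = round(total * gov_share)
    return gov, total - gov


def implied_volume(fake_posts, one_in):
    """Total number of posts implied by a 'one in N' fabrication rate."""
    return fake_posts * one_in


gov_posts, commercial_posts = split_posts(TOTAL_FAKE_POSTS_PER_YEAR, GOV_SITE_SHARE)
implied_total = implied_volume(commercial_posts, COMMERCIAL_ONE_IN)

print(f"Fabricated posts on government sites: {gov_posts:,}")
print(f"Fabricated posts on commercial sites: {commercial_posts:,}")
print(f"Implied yearly volume on commercial sites: {implied_total:,}")
```

Under these assumptions, roughly 224 million fabricated posts a year land on commercial sites, set against an implied volume of about 40 billion posts there. That is why the raw total can look large while the share stays small.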

The research identified significant spikes in posts after the late June 2013 Shanshan riots; after the April 2014 Urumqi Railway Explosion; and around significant official events and holidays, such as Martyrs’ Day, Tomb-Sweeping Day, and the 18th Party Congress’s third plenary session.

King said the ability to rise above background noise is what makes a viral strategy effective. Getting clued in to what others are talking about is a major reason people use social media in the first place, he pointed out.

“That’s why you go to social media,” King said. “That’s what social media is. It’s likely this activity has a big effect. … What we do not know is the trajectory or ultimate outcome of the ongoing arms race between government efforts at information control and popular efforts at expression and collective action.”