Online abuse can be cruel – but for some tech companies it is an existential threat. Can giants such as Facebook use behavioural psychology and persuasive design to tame the trolls?

It was a book on the curious decline of the murder rate that gave Aja Bogdanoff her idea.

Her job back then stopped short of actual bloodshed, but it left her no stranger to the dark side of human nature. A software engineer who built and moderated online comment platforms, Bogdanoff spent her days wading through insults, and her spare time firefighting more urgent incidents. She could see platforms being overwhelmed by the sheer volume of antisocial users, but couldn’t figure out how to get one jump ahead of them.

Then she read Steven Pinker’s Better Angels of Our Nature, which charts the evolution of human violence. “The book says people always believe that their actions are justified. No matter what they’re doing, they think there’s a valid reason,” she recalls, from the office of her startup in Oregon. “So, I realised we had to get in there and interrupt that process; make people think about it, that these are real people.” Instead of constantly running to catch up with “bad” posters, could she design better behaviour in from the start?

The idea of a “nicer” net sounds a bit twee, guaranteed to enrage libertarians who fear the creation of bland, beige safe spaces where free speech goes to die. But it’s an idea with some big guns behind it, and what they are advocating isn’t censorship, but smarter design. This month at the Sundance film festival, the web pioneer Tim Berners-Lee called on platforms to start building “systems that tend to produce constructive criticism and harmony, as opposed to negativity and bullying”.

In Britain, Yvette Cooper, who endured some grim misogynistic online abuse when she was campaigning to become Labour leader, is planning a “reclaim the net” summit, examining ways of building a kinder, more civilised culture. “When they’re talking to politicians, people should be able to disagree and say so strongly,” she says. “But when it’s about bombarding people who are not in public life with aggressive abuse, that’s where it becomes a problem, if it starts to silence people. You hear teenage girls saying, ‘Right, I’m going to stop saying anything about feminism on Facebook because it’s just so hostile.’” This isn’t about censoring free speech, but protecting it, she argues, for those now being bullied out of the conversation.

For idealists such as Berners-Lee, the fact that the net has become an exhausting place to spend time is an affront to its founding values. Technology was supposed to make the world a better place, not a bitchier one. And for the big corporate players – Twitter, Instagram, online publishers and other businesses reliant on us spending more and more time online – it’s a genuine commercial threat. Few users and fewer advertisers enjoy hanging out in a room full of furious people spoiling for a fight.

“If Facebook wasn’t a safe place and people didn’t feel they could have a conversation that’s civil and respectful, why would anyone want to advertise in that place?” says Simon Milner, Facebook’s director of policy for the UK, Middle East and Europe. “The two things go together. It’s an important part of the business model.”

This is where Civil Comments, the startup Bogdanoff founded with Christa Mrgan, comes in.

The idea is simple (although the software is so complex it took a year to build): before posting a comment in a forum or below an article, users must rate two randomly selected comments from others for quality of argument and civility (defined as an absence of personal attacks or abuse). Ratings are crunched to build up a picture of what users of any given site will tolerate, which is then useful for flagging potentially offensive material.

Crucially, users must then rate their own comment for civility, and can rewrite it if they want (in testing, about 5% did).
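The mechanics described above can be sketched in a few lines of Python. This is purely illustrative: the class, method names and thresholds are invented for the sketch, not Civil Comments' actual code, which the company says took a year to build.

```python
import random
from statistics import mean

class CommentQueue:
    """Toy model of a peer-reviewed comment queue: each new commenter
    rates two random pending comments, and comments whose average
    civility rating falls below a site-tuned threshold get flagged."""

    def __init__(self, civility_threshold=0.5):
        self.threshold = civility_threshold  # in reality, learned from a site's rating history
        self.pending = {}                    # comment_id -> list of peer ratings (0.0 to 1.0)

    def submit(self, comment_id):
        self.pending[comment_id] = []

    def peer_review_task(self):
        """Before posting, a user is handed up to two random comments to rate."""
        return random.sample(list(self.pending), k=min(2, len(self.pending)))

    def rate(self, comment_id, civility):
        self.pending[comment_id].append(civility)

    def flagged(self, min_ratings=3):
        """Comments the community has judged below the site's tolerance."""
        return [cid for cid, scores in self.pending.items()
                if len(scores) >= min_ratings and mean(scores) < self.threshold]
```

The point of the design is that the flagging signal comes from readers themselves, so the threshold reflects what each site's own audience will tolerate rather than a universal standard.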

It may not deter hardened trolls but, says Bogdanoff, the process reminds ordinary users that they’re not just “shouting into a void” – other people are judging them, too. It evokes the slight sense of social inhibition we feel in real life when asked to speak before an audience. “If somebody handed you a microphone and gave you a few moments in the spotlight, you would think: ‘Is this what I really wanted to say?’”

It’s early days – the system is being piloted by two local Oregon newspapers. But Mrgan insists it hasn’t led to sterile, sanitised comments. “People are still making jokes, being a little bit snarky, getting really opinionated,” she says. “The personal attacks, name-calling and abuse have gone, but the same feel has really stayed.”

Asking humans to judge each other can be a surprisingly powerful thing. Take the Uber taxi app’s rating system, which asks passengers and drivers to give each other star-ratings – essentially a way of creating a reciprocal relationship between two strangers, where each has a reputation to lose. The company doesn’t spell out the consequences for passengers who get bad reviews because, frankly, it doesn’t need to; passengers go to surprising lengths to keep a good rating without really understanding why it matters. But invoking a sense of being watched isn’t the only way platforms subliminally encourage social behaviour.

A few years ago, Facebook managers noticed a rush of complaints from users about friends posting photographs of them that they didn’t like. The pictures weren’t explicit; they just reminded users of something they would rather forget, or made them look stupid. These complaints were invariably rejected because no rules had been broken, yet friendships were being strained as a result. “We tried saying, ‘Why don’t you just message the person?’, but people didn’t quite know what to say,” says Milner, adding tactfully that not everybody “has the social skills” to resolve such petty squabbles.

So, Facebook introduced social reporting, which works like a teacher gently helping kids in a playground dispute to resolve things between themselves. Complainants get a template message to send to their friend, explaining how the picture makes them feel and asking politely for its removal. Usually, that’s all it takes – it is, says Milner, “helping you have an empathetic response” that leaves everyone feeling good. “We set up our systems to encourage people to be nice – to think about things before you post.”

It’s a classic example of what BJ Fogg, a Stanford-based behavioural scientist who specialises in the psychology of Facebook, calls persuasive design: if you want people to do something, don’t explain why, just show them how. Humans learn by imitation, which means modelling nice behaviour beats lecturing people to be nice.

Lately, Facebook’s plans for promoting harmony have become significantly more ambitious. Last month, its chief operating officer, Sheryl Sandberg, provoked raised eyebrows in Davos by suggesting users could help undermine jihadi propaganda with a concerted counter-offensive of what can only be described as organised niceness. She cited a recent “like attack” staged by German users, who swamped a neo-Nazi group’s Facebook page with messages of inclusivity and tolerance.

It’s hard to imagine hardened fascists being lovebombed into repentance overnight. But Sandberg was reflecting a growing belief that while playing Whac-a-Mole with extremists – shutting offensive accounts only for them to resurface under new identities – is necessary, it’s not enough. (Twitter has shut 125,000 accounts linked to Isis since mid-2015, which indicates the scale of the problem.) Why not mobilise the vast majority of reasonable human beings to marginalise what is really a tiny but disproportionately noisy minority of extremists?

Last autumn, Facebook launched a project with the thinktank Demos on what it calls “counterspeech”, providing alternatives to extremist narratives, and it has recently begun working with academics at King’s College London who specialise in jihadi propaganda. It doesn’t have all the answers yet, says Milner, but “what does work is the kind of thing Sheryl was talking about: humour and warmth”. If extremists seek to spread fear and shock, counterspeech might aim to make them look small and ridiculous. Facebook now plans to build a network of NGOs across Europe and beyond, cultivating a grassroots counter-narrative to jihadi propaganda. The question that it’s grappling with, says Milner, is almost “how do we enable empathy in a crowd as opposed to individuals?”

It may all sound far-fetched. But another senior tech executive privately compares the problem of hate speech online with racism in football decades ago. There was only so much clubs could do to change crowd behaviour, and the real shift came when the Kick It Out campaign encouraged ordinary football fans to see it as their responsibility. Cracking down on antisocial behaviour from on high works best alongside changing the culture from the bottom up.

Recently, Cooper was watching football with a friend who had just joined Twitter. “It was a big game and they were interviewing the managers afterwards, and his immediate response was fascinating,” she says. “He was looking through Twitter, looking at what people’s responses were, working out ‘What should I post? Oh, shall I say what an idiot I think that guy is?’ Because it looks like that’s the sort of thing you’re supposed to say.”

Humans take their cue in novel situations from what everyone else does, which means that on social media we subconsciously try to match the general tone. At Sundance, Berners-Lee singled Twitter out for creating a world where “people tend to retweet stuff that really gets them going”.

Twitter staff plaintively reject the criticism, insisting that the hashtags that go viral are often warmly upbeat. “It’s things like #lovewins or ‘you ain’t no Muslim, bruv’, things where people are challenging society with a positive message,” says Nick Pickles, head of public policy at Twitter, who argues that its great strength is empowering people to denounce prejudice collectively.

But with user numbers falling, Twitter is now taking a harder look at its culture.

Psychologist turned web design consultant Susan Weinschenk, author of 100 Things Every Designer Needs to Know About People, compares the speed at which aggressive behaviour colonises platforms with the infamous “broken windows” thesis of how neighbourhoods decline. One small incident, left unchallenged, quickly creates the impression that anything goes and thus encourages more serious problems. Gentler souls move out, and troublemakers move in.

“Once you let any of it in, it really is easy for it to escalate,” she says. “It’s the broken windows theory, just online.” And on large platforms, too many windows break too quickly to keep track of.

Several platforms have experimented with building algorithms that can detect threatening patterns of speech, perhaps even intervening automatically to cool heated exchanges – say, by sending an “are you sure you want to say this?” message to those involved.

But so far, says Bogdanoff, the idea of computers being able to judge offensiveness “really doesn’t work outside of the movies”. Software struggles with sarcasm, irony and the sheer range of human annoyance (the most reported post in YouTube’s history, says Pickles, was a Justin Bieber video).

Another initially promising idea is software allowing users to block messages containing words offensive to them personally, such as racial epithets. Dr Claire Hardaker, a lecturer in corpus linguistics at Lancaster University, recently carried out a project with Twitter analysing messages sent to the much-trolled feminist campaigner Caroline Criado-Perez. She identified several keywords characteristically used by people making threats – ranging from “bitch” and “rape” to the more surprising “LOL” – but unfortunately, she says, none is a perfectly reliable indicator of trouble. “Very few words are inherently and unwaveringly bad. Even the N-word is perfectly acceptable when used by certain group members in certain contexts.” Then there’s what she calls “the Scunthorpe problem”, where scanning for a banned group of letters produces false positives.
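The Scunthorpe problem is easy to demonstrate. A naive filter that scans for a banned string anywhere in the text will block the Lincolnshire town's name; matching whole words only avoids that particular false positive, though, as Hardaker notes, no word list catches context. (The word list here is a single illustrative entry, not drawn from her research.)

```python
import re

BANNED = ["cunt"]  # illustrative single entry

def naive_filter(text):
    """Substring scan: trips on innocent words containing a banned string."""
    return any(bad in text.lower() for bad in BANNED)

def word_boundary_filter(text):
    """Match whole words only, sidestepping the Scunthorpe false positive."""
    return any(re.search(rf"\b{re.escape(bad)}\b", text.lower()) for bad in BANNED)
```

Even the word-boundary version, of course, still blocks reclaimed in-group uses and misses threats phrased without any banned word at all, which is why Hardaker found no keyword to be a reliable indicator on its own.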


Twitter’s big problem, meanwhile, is that every move towards a sunnier, more positive culture – appointing a “safety council” to tackle harassment, offering an optional algorithmic feed that promotes tweets you might like, or introducing an animated heart symbol for content users like – prompts furious complaints that it’s becoming like Facebook. Whereas Facebook revolves around friendships, Twitter’s raison d’être is connecting strangers, exposing people to challenging new ideas and perspectives. That’s what makes it intellectually stimulating. But consider what happens when Corbynistas meet Blairites on the site, and it may also be what makes some users mad.

The emerging answer is what Pickles calls “allowing users to control their own experience”, for example by using the mute button to screen out people they don’t like without that person knowing. Essentially it’s a way of allowing some tweeters to express divisive opinions without bothering others who won’t like it. One person’s right to free speech is matched by another’s right not to listen, as it might be at a party where you try to escape when trapped by the bore in the corner.

To some, even this sounds like a retreat into some vacuous bubble, where you need never meet an idea you didn’t like. But if the pioneers of prosocial behaviour are right, the tide may be slowly turning, even in a community that once prized freedom of expression above everything. “When I started out, I’d tell people what I did for work and they’d be like, ‘Oh, you’re a censor,’” says Bogdanoff. “Now you tell people you’re working on civil comment and they’re like, ‘Oh, thank you!’”