For David Butler, it began with a knock on the door early one November morning, seven years ago. When he opened it, officers from the Merseyside police were standing on his doorstep. The retired taxi driver was being arrested for murder.

The police said they had evidence connecting Butler to the death of Anne Marie Foy, a 46-year-old sex worker who had been battered and strangled in Liverpool in 2005.

Butler’s DNA, it turned out, had been logged into the UK national database after a 1998 investigation into a break-in at the home he shared with his mother. A partial match had been made to DNA found on Foy’s fingernail clippings and cardigan buttons. This, combined with CCTV evidence of a distinctive taxi seen near the scene, led the prosecutor to tell the jury in Butler’s trial that the DNA information “provides compelling evidence that the defendant was in contact with Anne Marie Foy at the time immediately before she died”.

The case seemed conclusive. Yet Butler was adamant: he had not met Foy.

“You do see an assumption being made that a DNA profile is evidence of contact – case closed – whereas it is actually a lot more complicated than that,” says Ruth Morgan, the director of the Centre for the Forensic Sciences at University College London. “We are only beginning to realise quite how complex it is.”

Since DNA was first used in a police investigation 31 years ago, to solve the murder of Dawn Ashworth, a 15-year-old schoolgirl who was raped and strangled in Leicestershire, the technique has attained an aura of being bulletproof. Certainly, in some cases, evidence of a DNA match to a suspect can be powerful. “There will be times when you get a really clear [DNA] profile, and it is very clear how that material got to the crime scene. And it is [also] very clear that it was during the course of an illegal activity,” says Morgan. “The classic [example] would be semen on the clothing of someone who is underage.”

But Butler’s case is just one of many that highlight growing questions in the world of forensic science: what exactly are fingermarks, DNA or gunshot residue actually evidence of – particularly now that even tiny traces can be detected?

It’s a riddle whose answer may have profound consequences. According to research published by Morgan and her colleagues, rulings for 218 successful appeal cases in England and Wales between 2010 and 2016 argued that DNA evidence had been misleading, with the main issues being its relevance, validity or usefulness in proving an important point in a trial.

It is not the first time forensic science has come under scrutiny. In 2015, the FBI and co-authors released a report that put the final nail in the coffin of hair analysis, while the matching of fingermarks (found at crime scenes) and fingerprints (taken from suspects) has also been in the spotlight.

In a seminal paper from 2005, the neuroscientist Itiel Dror and colleagues revealed that, in the case of ambiguous marks, those examining the evidence could be swayed in their conclusions by the context of a case, with a match more likely to be made when the crime had been depicted as harrowing.

After initial resistance, the impact of such work has been dramatic. “Fingermarks are now presented in court in a completely different way – it is really, really rare that you get someone saying unequivocally: ‘This is an identification,’” says Morgan.

But with technology now allowing the recovery of minute traces of DNA, new challenges have arisen. Not only is it often unclear whether trace DNA is from skin cells, saliva or some other body fluid, but such DNA samples often contain material from multiple individuals, which is difficult to tease apart.

What’s more, working out when the DNA was deposited, and for how long it might have been present, is an enormous problem. “If you get a mixed profile on an item of clothing, is the major profile the last person who wore it, or is it somebody who regularly wore it?” Morgan asks.

And it gets more complicated. “In different scenarios, some people leave DNA and some people don’t,” she says. Indeed, studies from several groups have examined the factors that affect how much DNA is left behind, such as how long it has been since someone washed their hands and which hand they used to touch an object. And some people simply shed more. “We’ve had some experiments where the person whose DNA we were looking for left either a partial profile or not really a viable profile – but there was other DNA [from a person] who we were able to identify as a close partner who hadn’t touched the item; they hadn’t been in the lab.”

For Butler, such issues proved pivotal. The DNA samples from Foy’s nails were a complex mixture of profiles and only a partial match was found with Butler’s DNA. Further analysis of the initial examination notes also revealed that Foy had been wearing glittery nail varnish. “That is going to retain more DNA for a longer time because there is more opportunity; more things for it to stick to,” says Sue Pope, a DNA expert who worked on the case and is now co-director of Principal Forensic Services Ltd.

And there was another significant factor: Butler had a condition that gave him flaky skin. “He was depositing a lot more cells than you might expect from a single touch,” says Pope. The findings, argued the defence, meant that Butler’s DNA could have found its way on to Foy’s hands and hence her clothing by entirely innocent means – for example by Foy handling coins that had previously been touched by Butler. After eight months on remand, Butler was acquitted.

The case exemplifies the puzzles Morgan and her colleagues are hoping to tackle by means of a host of experiments, from looking at how DNA can be transferred between individuals to how long particles such as quartz grains can cling to footwear – an important consideration given that the shape and texture of such grains can be linked to specific environments. “One poor student had to wear the same pair of shoes most days for four months,” says Morgan.

The results were intriguing. The outsides of the shoes showed a rise in particular types of quartz grain as the student visited each of five known locations, with the quantity of each type dropping off over time. At the end of the study, grains from just two locations were found on each shoe.

But there was a surprise. “Inside [the shoe] we had every single location,” said Morgan. That, counterintuitively, means the inside of a pair of shoes could offer up more clues than the outside when it comes to tracing a suspect’s movements.

Meanwhile, research by the team carried out after the Rotherham abuse scandal not only revealed that DNA from semen could be found on clothes laundered several months after the fluid was deposited, but also threw up another result. “We found the ‘suspect’s’ DNA on other items that had never had any of that bodily fluid on them, indicating you are getting transfer in a washing machine,” says Morgan.

The question of how and when DNA can be transferred, and its implications for the justice system, was thrown into sharp relief by the murder of Meredith Kercher in November 2007. Among the evidence was the fact that DNA from Raffaele Sollecito – the boyfriend of Kercher’s flatmate, Amanda Knox – was found on the clasp of Kercher’s bra.

While it was argued that the DNA cropped up as a result of contamination, Morgan points out that when people are under the same roof there are multiple opportunities for transfer, from handling each other’s laundry to touching the same objects. Yet just how much DNA is transferred, and in what circumstances, remains unknown.

The upshot is that, although the technology is more powerful than ever, the presence of trace DNA is far from a magic bullet. Indeed, the 2015 annual report from the UK government’s chief scientific adviser into forensic science warned that for many substances now detectable at trace levels “our ability to analyse may outstrip our ability to interpret”.

But funding, says Morgan, is largely directed towards inventing new gadgets and miniaturising existing technology, adding that UCL has had to turn to crowdfunding to raise money for a centre dedicated to the interpretation of forensic evidence.

Georgina Meakin, an expert in DNA analysis, also based at UCL, says that public understanding is another thing lagging behind advances in technology. One potential area of confusion is just what DNA analysis involves. Rather than sequencing the whole genome, only certain areas of the DNA are examined. Since 2014, in England, it has generally been at 16 sites, plus an additional marker that indicates whether the sample is from a man or woman. “These [sites] consist of repeating sequences of DNA, and we are interested in the number of repeats that are present; it is the number of those repeats that can differ between individuals,” says Meakin.
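To make Meakin’s description concrete, a profile can be pictured as a pair of repeat counts at each site examined, one inherited from each parent. The Python sketch below is illustrative only: the site names, the repeat counts and the `compare_profiles` helper are hypothetical, and it assumes a clean sample from a single person. Real casework must also deal with mixtures, degraded material and statistical weighting, none of which is attempted here.

```python
from typing import Dict, List, Tuple

# A profile maps each examined site to the two repeat counts found there
# (one inherited from each parent). Site names and counts are made up.
Profile = Dict[str, Tuple[int, int]]


def compare_profiles(
    reference: Profile, sample: Profile
) -> Tuple[List[str], List[str], List[str]]:
    """Return (matching, conflicting, missing) sites, in reference order."""
    matching: List[str] = []
    conflicting: List[str] = []
    missing: List[str] = []
    for site, ref_counts in reference.items():
        if site not in sample:
            # Trace samples often yield no result at some sites.
            missing.append(site)
        elif sorted(sample[site]) == sorted(ref_counts):
            # The order of the two counts carries no meaning.
            matching.append(site)
        else:
            conflicting.append(site)
    return matching, conflicting, missing


# Hypothetical data: a reference profile and an incomplete crime-scene sample.
REFERENCE: Profile = {"site_1": (14, 16), "site_2": (9, 9), "site_3": (21, 24)}
TRACE: Profile = {"site_1": (16, 14), "site_3": (21, 22)}

if __name__ == "__main__":
    print(compare_profiles(REFERENCE, TRACE))
```

In this toy case the sample agrees with the reference at one site, conflicts at another and gives no result at a third. That is the sort of incomplete picture that makes a “partial match” hard to interpret.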

But, she stresses, trace DNA is often far from conclusive, with analysts often having to turn to statisticians to unpick mixed profiles. It’s a situation that some have sought to commercialise, among them Cybergenetics – the company behind an algorithm-based technology known as TrueAllele which claims to be able to untangle mixed profiles and “produce accurate results on previously unsolvable DNA evidence”.

It has been used in hundreds of cases in the US. But there is a hitch: experts have argued that neither they, nor the defendants, have been allowed access to the system’s source code – meaning, among other things, that it is difficult to know what assumptions are built into the technology. “There is a lot of concern,” says Morgan. “People aren’t happy that it is essentially a black box.”

But the company puts the case that both defence and prosecution are welcome to test the software on their own data, adding that the maths behind the system has been disclosed.

Nonetheless, Pope argues that independent validation of software for DNA analysis is crucial. “A courtroom setting is not the best place for looking at really detailed questions about how statistics have been done.” And, even if the technology is accepted as being reliable, questions remain. It “tells you something about the potential source of the DNA, but nothing at all about the activity involved in the DNA coming to be where it was found”, she says.

That became apparent in the case of the Massereene barracks murders – the shooting of two British soldiers in Antrim, Northern Ireland, in March 2009. Among the evidence were findings from TrueAllele, which included a match between mixed-profile DNA taken from a mobile phone found in the partly burnt-out getaway car and one of the suspects, Brian Shivers. As Mark Perlin, founder of Cybergenetics, testified, the DNA on the phone was six billion times more likely to be that of Shivers than to match him by coincidence.
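Figures of this kind are likelihood ratios: how much more probable the DNA findings are if the suspect is the source than if an unrelated person is. The sketch below shows the textbook calculation for the simplest possible case, a clean profile from one person, and is not the method used by TrueAllele, which deals with mixtures and is not described in detail here. The allele frequencies are invented for illustration, and the function names are hypothetical.

```python
from math import prod
from typing import List, Optional


def genotype_frequency(p: float, q: Optional[float] = None) -> float:
    """Expected population frequency of a genotype at one site.

    Assumes Hardy-Weinberg proportions: p*p when both repeat counts are
    the same, 2*p*q when they differ.
    """
    if q is None:
        return p * p
    return 2 * p * q


def likelihood_ratio(genotype_frequencies: List[float]) -> float:
    """Likelihood ratio for a single-source profile.

    Assumes the sites are independent, so the chance that a random,
    unrelated person matches at every site is the product of the
    per-site frequencies. The ratio is 1 divided by that chance.
    """
    return 1.0 / prod(genotype_frequencies)


# Invented allele frequencies at three hypothetical sites.
EXAMPLE_FREQUENCIES = [
    genotype_frequency(0.1, 0.2),  # two different repeat counts
    genotype_frequency(0.1),       # the same repeat count twice
    genotype_frequency(0.25, 0.2), # two different repeat counts
]

if __name__ == "__main__":
    print(likelihood_ratio(EXAMPLE_FREQUENCIES))
```

Because the per-site frequencies multiply, the ratio grows very quickly as more sites are included, which is how headline numbers in the billions arise. As Pope notes, though, even a very large ratio speaks only to whose DNA it might be, not to how it got there.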

Together with other DNA evidence, the finding proved pivotal in the outcome of the trial. Shivers was found guilty and sentenced to at least 25 years in jail, with his poor health making it likely he would die in prison.

Yet in 2013, there was a retrial. The reasoning hung not on the evidence, but on its interpretation. Shivers’ DNA, the judge concluded, might have turned up on the phone and on other evidence from an innocent touch, or even a handshake.

“Have the prosecution eliminated other possibilities than the guilt of the accused? Am I satisfied beyond reasonable doubt of the guilt of the accused?” he asked.

The answer was clear. No. Shivers was acquitted.
