Programs do more and more scientific work - but you need to be able to check them as well as the original data, as the recent row over climate change documentation shows

One of the spinoffs from the emails and documents that were leaked from the Climate Research Unit at the University of East Anglia is the light that was shone on the role of program code in climate research. There is a particularly revealing set of "README" documents that were produced by a programmer at UEA apparently known as "Harry". The documents indicate someone struggling with undocumented, baroque code and missing data – this, in something which forms part of one of the three major climate databases used by researchers throughout the world.

Many climate scientists have refused to publish their computer programs. I suggest that this is both unscientific behaviour and, equally importantly, ignores a major problem: that scientific software has a reputation for being error-prone.

There is enough evidence for us to regard a lot of scientific software with worry. For example, Professor Les Hatton, an international expert in software testing based at the Universities of Kent and Kingston, carried out an extensive analysis of several million lines of scientific code. He showed that the software had an unacceptably high level of detectable inconsistencies.

In particular, inconsistencies in the interfaces through which software modules pass data from one part of a program to another occurred at the rate of one in every seven interfaces on average in the programming language Fortran, and one in every 37 interfaces in the language C. This is hugely worrying when you realise that just one error, a single one, will usually invalidate a computer program. What he also discovered, even more worryingly, is that the accuracy of results declined from six significant figures to one significant figure during the running of programs.
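To see how accuracy can drain away inside a perfectly plausible-looking program, consider the minimal sketch below, written in Python. It is not taken from Hatton's study: the function names and the four sample readings are invented purely for illustration. Both functions compute the same statistic, the sample variance, whose exact value for these readings is 30. The first uses the one-pass "textbook" formula, which subtracts two enormous and nearly equal numbers; the second subtracts the mean first.

```python
def naive_variance(values):
    """One-pass formula: (sum of squares - (sum squared) / n) / (n - 1).

    The two terms being subtracted are huge and almost equal, so most of
    the significant figures cancel and rounding error dominates the result.
    """
    n = len(values)
    total = 0.0
    total_sq = 0.0
    for x in values:
        total += x
        total_sq += x * x
    return (total_sq - total * total / n) / (n - 1)


def two_pass_variance(values):
    """Two-pass formula: compute the mean first, then sum squared deviations."""
    n = len(values)
    total = 0.0
    for x in values:
        total += x
    mean = total / n
    sum_sq_dev = 0.0
    for x in values:
        sum_sq_dev += (x - mean) ** 2
    return sum_sq_dev / (n - 1)


# Hypothetical readings: small variations on top of a very large baseline.
readings = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]

print("exact answer      : 30")
print("naive one-pass    :", naive_variance(readings))
print("two-pass          :", two_pass_variance(readings))
```

Neither function contains a typing error or a logical slip, yet with standard double-precision arithmetic the one-pass version returns a figure that bears no relation to the true answer, while the two-pass version recovers it. Nothing in the program's output warns the user which of the two they are looking at.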

The work of Hatton and other researchers indicates that scientific software is often of poor quality. What is staggering about this research is that it examined commercial scientific software, produced by software engineers who have to undergo a regime of thorough testing, quality assurance and a change control discipline known as configuration management.

By contrast, scientific software developed in our universities and research institutes is often produced by scientists with no training in software engineering and with no quality mechanisms in place, and so, no doubt, the occurrence of errors will be even higher. The Climate Research Unit's "Harry ReadMe" files are a graphic indication of such working conditions, containing as they do the outpouring of a programmer's frustrations in trying to get sets of data to conform to a specification.

Computer code is also at the heart of a scientific issue. One of the key features of science is falsifiability: if you erect a theory and someone produces evidence that it is wrong, then it falls. This is how science works: by openness, by publishing the minute details of an experiment, some mathematical equations or a simulation; by doing this you embrace falsifiability. This does not seem to have happened in climate research. Many researchers have refused to release their computer programs, even though they are still in existence and not subject to commercial agreements. An example is Professor Mann's initial refusal to give up the code that was used to construct the 1999 "hockey stick" model that demonstrated that human-made global warming is a unique artefact of the last few decades. (He did finally release it in 2005.)

The situation is by no means bad everywhere in academia. A number of journals, for example those in the area of economics and econometrics, insist on an author lodging both the data and the programs with the journal before publication. There's also an object lesson in a landmark piece of mathematics: the proof of the four colour conjecture by Appel and Haken. They proved a longstanding conjecture, which had been proposed but never proved and so never elevated to a theorem, that in any map the regions can be coloured using at most four colours so that no two adjacent regions have the same colour. Their proof was controversial in that, instead of an elegant mathematical exposition, they partly used a computer program. Their work was criticised for inelegance, but it was correct and the computer program was published for checking.

The problem of large-scale scientific computing and the publication of data is being addressed by organisations and individuals that have signed up to the idea of the fourth paradigm. This was the idea of Jim Gray, a senior researcher at Microsoft, who identified the problem well before Climategate. There is now a lot of research and development work going into mechanisms whereby the web can be used as a repository for scientific publications and, more importantly, the computer programs and the huge amount of data that they use and generate. A number of workers are even devising systems that show the progress of a scientific idea from first thoughts to the final published papers. The problems with climate research will no doubt provide an impetus for this work to be accelerated.

So, if you are publishing research articles that use computer programs, if you want to claim that you are engaging in science, and if the programs are in your possession and you will not release them, then I would not regard you as a scientist; I would also regard any papers based on the software as null and void.

I find it sobering to realise that a slip on a keyboard could create an error in programs that will be used to make financial decisions which involve billions of pounds and, moreover, that the probability of such errors is quite high. But of course the algorithms (known as Gaussian copula functions) that the banks used to assume that they could create risk-free bonds from sub-prime loans have now been published (http://www.wired.com/techbiz/it/magazine/17-03/wp_quant?currentPage=all). That was pretty expensive. Climate change is expensive too. We really do need to be sure that we're not getting any of our sums wrong, whether too big or too small, there as well.

Darrel Ince is professor of computing at the Open University