In this article I'm going to show you some examples explaining why physicists developing software products to be used in their field should also use static code analysis tools. I would be glad to see PVS-Studio in this role, but any other analyzer would do as well, of course. A code analyzer can significantly reduce the debugging time and headache from silly mistakes. Isn't it better when you can focus on physics rather than waste time seeking and fixing bugs in C++ applications?

A sad preface

This article actually happened to be a "missed shot", for even those whose job is to seek others' mistakes sometimes make mistakes themselves. :)

It was my fault that I didn't see to it. I asked a young newcomer employee to prepare the Geant4 project for the check. He was supposed to download the source codes, generate a project for Visual Studio and do other necessary preparations so that I could just take a ready project and check it. He did it all alright, but he downloaded the first version he had come across, which turned out to be an ancient version Geant4_9_4 we had already described in one of the earlier articles about building a project under Windows. Alas, I found it out only after this article had been ready!

On the other hand, there are some positive aspects to this situation too. Having figured out our mistake, I downloaded the latest program version (10.0-beta), checked it and wrote another article titled Going On with the Check of Geant4. Now we can compare examples from these two articles to see which of the errors have been fixed in the new version - and, consequently, could have been found much earlier and easier with the aid of static analysis - and which are still lurking in the code.

Among sixteen bugs mentioned in this article:

6 are fixed in the new version

10 are still there

So, even though this article is not quite to the point, it very well shows the diagnostic abilities of the PVS-Studio code analyzer. After all, it's not the project version we checked that matters; it's the opportunity to show you how many bugs could have been avoided already at the code writing stage.

Introduction

This article is a continuation of a series of articles about static analysis of code used in science-related fields. The previous articles are:

This time we're dealing with the Geant4 project. Here's a description from Wikipedia:

Geant4 (for GEometry ANd Tracking) is a platform for "the simulation of the passage of particles through matter", using Monte Carlo methods. It is the successor of the GEANT series of software toolkits developed by CERN, and the first to use object oriented programming (in C++). Its development, maintenance and user support are taken care by the international Geant4 Collaboration. Application areas include high energy physics and nuclear experiments, medical, accelerator and space physics studies. The software is used by a number of research projects around the world.

The project website: http://geant4.org. The project code is of a medium size, 76 Mbytes. Compare it with the following projects:

VirtualDub, 13 Mbytes;

Apache HTTP Server, 26 Mbytes;

Chromium (including additional libraries), 710 Mbytes.

Analysis was performed by the PVS-Studio static code analyzer. Since the Geant4 project is rather big, there was also a big chance to find some interesting bugs in it. No bugs at all can only be found in small projects (see the post about non-linear density of errors). Sometimes we come across large projects where PVS-Studio doesn't find anything too, but this is unfortunately an exception.

I want to apologize right away for any silly thing related to physics I could have written due to lack of knowledge of this subject. But please note that I found genuine bugs in this software without understanding what partons are or almost everything else about nuclear reactions!

Note. In this article I've mentioned only some of the bugs I found. For a complete list of the warnings that attracted my attention, refer to this file: geant4_old.txt.

Let's see what interesting bugs we can find in Geant4.

Copy-Paste and muons

About muons I only know that it's a type of elementary particles. But I do know very well what Copy-Paste is. Here's a nice example of a mistake when one code line was copied several times, the clones then edited but some left unchanged:

void G4QMessenger::SetNewValue(G4UIcommand* aComm, G4String aS) { if(photoDir) { if (aComm==theSynchR) thePhoto->SetSynchRadOnOff(aS); else if(aComm==minGamSR) thePhoto->SetMinGammaSR(.... else if(aComm==theGamN) thePhoto->SetGammaNuclearOnOff(.... else if(aComm==theMuoN) thePhoto->SetElPosNuclearOnOff(.... else if(aComm==theMuoN) thePhoto->SetMuonNuclearOnOff(aS); else if(aComm==theMuoN) thePhoto->SetTauNuclearOnOff(aS); else if(aComm==biasPhotoN)thePhoto->SetPhotoNucBias(.... } .... }

PVS-Studio's diagnostic message: V517 The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a probability of logical error presence. Check lines: 195, 196. G4phys_builders g4qmessenger.cc 195

Notice the check (aComm==theMuoN) repeated three times.

Note. The bug is either fixed in the new version of Geant4 or this code is removed.

Baryon decay

It's not an easy task studying radioactive decay or trying to detect proton decay. It's especially difficult when your program has bugs.

void G4QEnvironment::DecayBaryon(G4QHadron* qH) { .... else if(qM<mSzPi) // Only Lambda+PiM is possible { fQPDG=lQPDG; // Baryon is Lambda fMass=mLamb; sQPDG=pimQPDG; // Meson is Pi- sMass=mPi; } else if(qM<mSzPi) // Both Lambda+PiM & Sigma0+PiM are possible { if(G4UniformRand()<.6) { .... }

PVS-Studio's diagnostic message: V517 The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a probability of logical error presence. Check lines: 8373, 8380. G4hadronic_body_ci g4qenvironment.cc 8373

One and the same condition (qM<mSzPi) is used twice. It means that we will never get into the branch the comment "// Both Lambda+PiM & Sigma0+PiM are possible" refers to.

Note. The bug is either fixed in the new version of Geant4 or this code is removed.

Counting partons

In particle physics, the parton model was proposed at Cambridge University by Richard Feynman in 1969 as the vibrational energy required to accelerate one quark at a speed very close to the speed of light. It was later recognized that partons describe the same objects now more commonly referred to as quarks and gluons.

Unfortunately, one may have a hard time counting partons:

G4bool G4CollisionMesonBaryonElastic:: IsInCharge(const G4KineticTrack& trk1, const G4KineticTrack& trk2) const { G4bool result = false; G4ParticleDefinition * p1 = trk1.GetDefinition(); G4ParticleDefinition * p2 = trk2.GetDefinition(); if( (GetNumberOfPartons(p1) != 2 || GetNumberOfPartons(p2) != 3) ||(GetNumberOfPartons(p1) != 3 || GetNumberOfPartons(p2) != 2) ) { result = false; } .... }

PVS-Studio's diagnostic message: V547 Expression is always true. Probably the '&&' operator should be used here. G4had_im_r_matrix g4collisionmesonbaryonelastic.cc 53

The error might be not clearly visible at first, so I'll simplify the expression for you:

A = GetNumberOfPartons(p1); B = GetNumberOfPartons(p2); if ( (A != 2 || B != 3) || (A != 3 || B != 2) )

We may discard the braces too:

if ( A != 2 || B != 3 || A != 3 || B != 2 )

This condition is always true. The variable 'A' is always either not equal to 2 or not equal to 3. The same trouble is with the variable 'B'. I guess something is messed up somewhere. Most likely, the '&&' operator is missing in this code.

Note. The bug is either fixed in the new version of Geant4 or this code is removed.

Coulomb blockade and array index out of bounds error

Coulomb blockade is the increased resistance at small bias voltages of an electronic device comprising at least one low-capacitance tunnel junction. Because of the CB, the resistances of devices are not constant at low bias voltages, but increase to infinity for zero bias (i.e. no current flows). When few electrons are involved and an external static magnetic field is applied, Coulomb blockade provides the ground for spin blockade (also called Pauli blockade) which includes quantum mechanical effects due to spin interactions between the electrons.

Something is wrong with the function SetCoulombEffects(). The pointer array 'sig' receives addresses of two non-existent subarrays. Working with 'sig' will cause undefined behavior. At best, the program will crash; at worst, it will go on running and chaotically write to and read from memory occupied by other arrays and variables.

enum { NENERGY=22, NANGLE=180 }; class G4LEpp : public G4HadronicInteraction { .... G4float * sig[NANGLE]; static G4float SigCoul[NENERGY][NANGLE]; .... }; G4LEpp::SetCoulombEffects(G4int State) { if (State) { for(G4int i=0; i<NANGLE; i++) { sig[i] = SigCoul[i]; } elab = ElabCoul; } .... }

PVS-Studio's diagnostic message: V557 Array overrun is possible. The value of 'i' index could reach 179. g4lepp.cc 62

The array 'sig' contains 180 pointers supposed to point to different rows of the two-dimensional array 'SigCoul'. But 'SigCoul' contains only 22 rows, so most pointers of the 'sig' array will point to God knows where.

I can't say for sure where in particular the mistake was made. I guess something is wrong with the declaration of the 'SigCoul' array; perhaps the numbers of rows and columns should be swapped:

SigCoul[NENERGY][NANGLE] -->> SigCoul[NANGLE][NENERGY]

Note. This bug is still present in the new version of Geant4.

A typo and twist surfaces

void G4VTwistSurface::GetBoundaryLimit(G4int areacode, G4double limit[]) const { .... if (areacode & sC0Min1Max) { limit[0] = fAxisMin[0]; limit[1] = fAxisMin[1]; } else if (areacode & sC0Max1Min) { limit[0] = fAxisMax[0]; limit[1] = fAxisMin[1]; } else if (areacode & sC0Max1Max) { limit[0] = fAxisMax[0]; limit[1] = fAxisMax[1]; } else if (areacode & sC0Min1Max) { limit[0] = fAxisMin[0]; limit[1] = fAxisMax[1]; } .... }

PVS-Studio's diagnostic message: V517 The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a probability of logical error presence. Check lines: 793, 802. G4specsolids g4vtwistsurface.cc 793

We have 4 variables in this code:

sC0Min1Max

sC0Max1Min

sC0Min1Min

sC0Max1Max

When working with bounds, only 3 of them are used. Besides, the check (areacode & sC0Min1Max) is executed twice. If you look close, you'll notice that minimums are selected after the first check: fAxisMin[0], fAxisMin[1]. Most likely, this check should have looked like this:

if (areacode & sC0Min1Min) { limit[0] = fAxisMin[0]; limit[1] = fAxisMin[1]; }

Note. The bug is either fixed in the new version of Geant4 or this code is removed.

X-Ray and incorrect IF

X-Ray is a form of electromagnetic radiation. Most X-rays have a wavelength in the range of 0.01 to 10 nanometers, corresponding to frequencies in the range 30 petahertz to 30 exahertz (3×1016 Hz to 3×1019 Hz) and energies in the range 100 eV to 100 keV.

In the following sample, the class G4ForwardXrayTR is related to X-Ray, if I'm right. A mistake is made in the comparison of indexes.

G4VParticleChange* G4ForwardXrayTR::PostStepDoIt(....) { .... if (iMat == jMat || ( (fMatIndex1 >= 0 && fMatIndex1 >= 0) && ( iMat != fMatIndex1 && iMat != fMatIndex2 ) && ( jMat != fMatIndex1 && jMat != fMatIndex2 ) ) .... }

PVS-Studio's diagnostic message: V501 There are identical sub-expressions to the left and to the right of the '&&' operator: fMatIndex1 >= 0 && fMatIndex1 >= 0 G4xrays g4forwardxraytr.cc 620

The index 'fMatIndex1' is checked twice, whereas 'fMatIndex2' is ignored. I guess the fixed code should look like this:

(fMatIndex1 >= 0 && fMatIndex2 >= 0)

Note. This bug is still present in the new version of Geant4.

A typo and neutrons

G4double G4MesonAbsorption::GetTimeToAbsorption( const G4KineticTrack& trk1, const G4KineticTrack& trk2) { .... if(( trk1.GetDefinition() == G4Neutron::Neutron() || trk1.GetDefinition() == G4Neutron::Neutron() ) && sqrtS>1.91*GeV && pi*distance>maxChargedCrossSection) return time; .... }

PVS-Studio's diagnostic message: V501 There are identical sub-expressions 'trk1.GetDefinition() == G4Neutron::Neutron()' to the left and to the right of the '||' operator. G4had_im_r_matrix g4mesonabsorption.cc 285

I really don't know what this function does, but as far as I understand, it receives two particle trajectories as input. The function must in a special way process the case when at least one particle is a neutron. But actually only the first particle is checked.

The programmer must have intended the following:

trk1.GetDefinition() == G4Neutron::Neutron() || trk2.GetDefinition() == G4Neutron::Neutron()

A similar typo can be found in the following fragment: g4scatterer.cc 138

Note. The bug is either fixed in the new version of Geant4 or this code is removed.

Inserting a parton

There is the function InsertParton() whose duty is to insert a parton into a container. The user can specify after which parton the new item must be inserted. If the insertion location is not specified, I guess it can be done anywhere. But this particular case appeared to be implemented incorrectly.

typedef std::vector<G4Parton *> G4PartonVector; inline void G4ExcitedString::InsertParton( G4Parton *aParton, const G4Parton * addafter) { G4PartonVector::iterator insert_index; .... if ( addafter != NULL ) { insert_index=std::find(thePartons.begin(), thePartons.end(), addafter); .... } thePartons.insert(insert_index+1, aParton); }

PVS-Studio's diagnostic message: V614 Potentially uninitialized iterator 'insert_index' used. g4excitedstring.hh 193

If the pointer 'addafter' equals zero, the iterator "insert_index" remains uninitialized. As a result, inserting a new item may cause unpredictable effects.

Note. This bug is still present in the new version of Geant4.

Not all nucleons processed

A nucleon is one of the particles that makes up the atomic nucleus. Each atomic nucleus consists of one or more nucleons, and each atom in turn consists of a cluster of nucleons surrounded by one or more electrons. There are two known kinds of nucleon: the neutron and the proton.

The function packNucleons() in the sample below doesn't processes all the items it is supposed to because the loop terminates right after the first iteration. There is the 'break' operator at the end of the loop body, but the 'continue' operator is missing.

void G4QMDGroundStateNucleus::packNucleons() { .... while ( nmTry < maxTrial ) { nmTry++; G4int i = 0; for ( i = 1 ; i < GetMassNumber() ; i++ ) { .... } if ( i == GetMassNumber() ) { areTheseMsOK = true; } break; } .... }

PVS-Studio's diagnostic message: V612 An unconditional 'break' within a loop. g4qmdgroundstatenucleus.cc 274

I guess the 'break' operator at the end is extraneous and was written by mistake.

Note. This bug is still present in the new version of Geant4.

Lund string model and a typo in the index

In particle physics, the Lund string model is a phenomenological model of hadronization.

When you have to deal with array items individually, it's very easy to mistype. This is what happened in the constructor of the class G4LundStringFragmentation. In the code sample below, the mistake is clearly visible: one and the same cell is assigned two values. But this function is very large, and a lot of array items are initialized inside it, so it's pretty difficult to notice a mistake while examining the function. This is the case where static code analysis is absolutely necessary.

G4LundStringFragmentation::G4LundStringFragmentation() { .... BaryonWeight[0][1][2][2]=pspin_barion*0.5; .... BaryonWeight[0][1][2][2]=(1.-pspin_barion); .... }

PVS-Studio's diagnostic message: V519 The 'BaryonWeight[0][1][2][2]' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 205, 208. g4lundstringfragmentation.cc 208

Note. I saw many code fragments in this project where a variable is assigned two different values on end. Many of these cases are harmless: for example, a variable is first assigned 0 and then the needed value. But many of such fragments may contain an error indeed. That's why I recommend that Geant4's authors study close all the V519 diagnostic messages. I myself have just quickly scanned through them.

By the way, I don't quite understand this practice of initializing a variable with a default value first and only then with the value you need. What's the point of doing so? Isn't it easier to declare a variable right where you need it and initialize it with the number you need.

Note. This bug is still present in the new version of Geant4.

Some other V519 warnings

I don't like the copying operator in the class G4KineticTrack, something is not right:

const G4KineticTrack& G4KineticTrack::operator=( const G4KineticTrack& right) { .... the4Momentum = right.the4Momentum; the4Momentum = right.GetTrackingMomentum(); .... }

PVS-Studio's diagnostic message: V519 The 'the4Momentum' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 451, 452. g4kinetictrack.cc 452

Note. This bug is still present in the new version of Geant4

By the way, there were many V519 warnings for constructors. Perhaps those code fragments do have some meaning - for debugging purposes, for example? I don't know. Here are a few more examples like that:

void G4IonisParamMat::ComputeDensityEffect() { .... fX0density = 0.326*fCdensity-2.5 ; fX1density = 5.0 ; fMdensity = 3. ; while((icase > 0)&&(fCdensity < ClimiG[icase])) icase-- ; fX0density = X0valG[icase]; fX1density = X1valG[icase]; .... }

PVS-Studio's diagnostic messages: V519 The 'fX0density' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 245, 247. g4ionisparammat.cc 247

V519 The 'fX1density' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 245, 247. g4ionisparammat.cc 247

Note. This bug is still present in the new version of Geant4.

void G4AdjointPhotoElectricModel::SampleSecondaries(....) { .... pre_step_AdjointCS = totAdjointCS; post_step_AdjointCS = AdjointCrossSection(aCouple, electronEnergy,IsScatProjToProjCase); post_step_AdjointCS = totAdjointCS; .... }

PVS-Studio's diagnostic message: V519 The 'post_step_AdjointCS' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 76, 77. g4adjointphotoelectricmodel.cc 77

Note. This bug is still present in the new version of Geant4.

And the last suspicious fragment I've noticed. Note the item 'erecrem'.

void G4Incl::processEventIncl(G4InclInput *input) { .... varntp->mzini = izrem; varntp->exini = esrem; varntp->pxrem = pxrem; varntp->pyrem = pyrem; varntp->pzrem = pzrem; varntp->mcorem = mcorem; varntp->erecrem = pcorem; varntp->erecrem = erecrem; .... }

PVS-Studio's diagnostic message: V519 The 'varntp->erecrem' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 896, 897. g4incl.cc 897

Note. The bug is either fixed in the new version of Geant4 or this code is removed.

Counting array items starting with 1

The programmer must have forgotten that array items in C++ are counted starting with zero. If this rule is violated, an array overrun may occur. Besides, the comparison with the value 1.4 in the very beginning of the array is missing.

void G4HEInelastic::MediumEnergyClusterProduction(....) { .... G4double alem[] = {1.40, 2.30, 2.70, 3.00, 3.40, 4.60, 7.00}; .... for (j = 1; j < 8; j++) { if (alekw < alem[j]) { jmax = j; break; } } .... }

PVS-Studio's diagnostic message: V557 Array overrun is possible. The value of 'j' index could reach 7. g4heinelastic.cc 4682

Note. This bug is still present in the new version of Geant4.

Physics and undefined behavior

C++ is a cruel language. Let yourself relax a bit - and have your foot shot off by a proton. You may even not notice it at first. Here's an example of an incorrectly implemented addition operator:

template <typename T> GMocrenDataPrimitive<T> & GMocrenDataPrimitive<T>::operator + (const GMocrenDataPrimitive<T> & _right) { GMocrenDataPrimitive<T> rprim; .... return rprim; }

PVS-Studio's diagnostic message: V558 Function returns the reference to temporary local object: rprim. G4GMocren g4gmocrenio.cc 131

The function returns a reference to a local object. Trying to work with this reference will cause undefined behavior.

Note. This bug is still present in the new version of Geant4.

Time to stop

Unfortunately, we have to finish our excursion around the world of physics. It's just an article, not a multipage report. It's a pity that I can't tell you about many other bugs, so if you want to learn more about the suspicious code fragments I noticed while examining PVS-Studio's diagnostic messages, see this file: geant4_old.txt.

But please don't think that these are all the bugs PVS-Studio has managed to find. I only glanced through the report and could have missed much. That's why I suggest that the project's developers check their code with PVS-Studio themselves. Write to us, and we'll grant you a free registration key for some time.

And, as usual, let me remind you that static analysis should be used regularly, not on rare occasions. For you to grasp the idea why regular use is so much necessary, do read this and this.

As I've already said, the file contains a much bigger list of suspicious fragments than mentioned in this article. All the cases must be pretty clear; but if not, look for the error codes in the documentation to see a detailed description with examples.

The only thing I'd like to explain specifically is the diagnostics V636 and V624. Sometimes they may signal inaccuracy in calculations. I believe these diagnostics are very important when dealing with computing software.

An example of the V636 diagnostic:

G4double G4XAqmTotal::CrossSection( const G4KineticTrack& trk1, const G4KineticTrack& trk2) const { .... G4int sTrk1 = ....; G4int qTrk1 = ....; G4double sRatio1 = 0.; if (qTrk1 != 0) sRatio1 = sTrk1 / qTrk1; .... }

PVS-Studio's diagnostic message: V636 The 'sTrk1 / qTrk1' expression was implicitly casted from 'int' type to 'double' type. Consider utilizing an explicit type cast to avoid the loss of a fractional part. An example: double A = (double)(X) / Y;. g4xaqmtotal.cc 103

The result of the division operation "double X = 3/2" is 1, not 1.5 as you may mistakenly think at first. At first integer division is performed, and only then the result is cast to 'double'. While examining unfamiliar code, you may find it difficult to determine if integer division is correct or incorrect in every particular case. Such fragments in Geant4 are worth a closer examination.

Note. I recommend you to add special comments in those fragments where you really need integer division.

An example of the V624 diagnostic:

dSigPodT = HadrTot*HadrTot*(1+HadrReIm*HadrReIm)* (....)/16/3.1416*2.568;

PVS-Studio's diagnostic message: V624 The constant 3.1416 is being utilized. The resulting value could be inaccurate. Consider using the M_PI constant from <math.h>. g4elastichadrnucleushe.cc 750

I don't see the reason why strict constants are used for Pi, Pi/2, etc. No doubt, they are accurate enough, but it still doesn't explain why they should be used. Strict constants are just another chance to get more typos and defects, so it is better to replace them with predefined constants such as M_PI, M_PI_4, M_LN2. PVS-Studio gives necessary recommendations about that.

And one more thing. In the file geant4_old.txt, also included are the messages referring to microoptimizations. For example, here's one regarding iterators' increment:

class G4PhysicsTable : public std::vector<G4PhysicsVector*> { .... }; typedef G4PhysicsTable::iterator G4PhysicsTableIterator; inline void G4PhysicsTable::insertAt (....) { G4PhysicsTableIterator itr=begin(); for (size_t i=0; i<idx; ++i) { itr++; } .... }

PVS-Studio's diagnostic message: V803 Decreased performance. In case 'itr' is iterator it's more effective to use prefix form of increment. Replace iterator++ with ++iterator. g4physicstable.icc 83

To find out why these changes are preferable, see the article: Is it reasonable to use the prefix increment operator ++it instead of postfix operator it++ for iterators?.

Conclusion

You should take easy the fact that everyone makes mistakes and typos - and you, dear readers, make them too. It's inevitable. Static code analysis tools can help you fix a lot of errors at the very early stages, allowing you to focus on the technical or scientific task you're solving rather than catching defects.

References