The code fragments were provided by Ilya Ivanov and Sergei Vasilyev from the PVS-Studio team.

Unreal Engine continues to develop as new code is added and previously written code is changed. What is the inevitable consequence of ongoing development in a project? The emergence of new bugs in the code that a programmer wants to identify as early as possible. One of the ways to reduce the number of errors is the use of a static analyzer like PVS-Studio. Moreover, the analyzer is not only evolving, but also constantly learning to look for new error patterns, some of which we will discuss in this article. If you care about code quality, this article is for you.

Static code analysis, theoretical reference

Static code analysis is the process of detecting errors and flaws in the source code of programs. It can be seen as a process of automated code review.

Code review is one of the oldest, and most useful, methods of detecting defects. It involves joint reading of the source code and giving recommendations on how to make improvements. This process helps detect errors or code fragments that may become erroneous in the future. Also, there is an unwritten rule that the author of the code should not give any explanations as to how a certain part of the program works. The algorithm should be clear just by looking at the text of the program and comments in the code. If this is not so, the code should be modified.

Generally, code review works quite well, as programmers notice errors in someone else's code much easier than in their own. You can find more details concerning the methodology of code review in a great book by Steve McConnell, "Code Complete".

The methodology of code review has two disadvantages:

Extremely high cost. It requires distracting several programmers from their main tasks to review newly written code or rewritten code after the recommended modifications are made. At the same time, programmers should regularly take breaks to rest while working. If a person tries looking through large code fragments, there is a danger of quickly losing attention and negating the purpose of the review. It's hard to detect errors that aren't directly related to the new/modified code. Looking at a fresh code fragment, it's not easy to assume that the malloc function works incorrectly, because the header file stdlib.h is not included. You may find more details about this situation in the article "A nice 64-bit error in C". One more example: changing the function type or a variable in a header file. Ideally, a programmer should review all the code where this function or variable is used after such changes. In practice, this is too time-consuming, and the review is limited only to those fragments where a programmer has changed something.

On one hand, there is a wish to perform code review regularly. On the other hand, it is too expensive. The compromise is static analysis. Static analysis tools check the source texts of programs and give recommendations to the programmers about reviewing certain code fragments. The analyzers don't get tired and check all code affected by the changes in the header files. Of course, a program won't substitute a full-fledged code review, done by a team of developers. However, the benefits/price ratio makes the static analysis quite a useful method, adopted by many companies.

As with any other methodology of error detection, static analysis has its strengths and weaknesses. There is no ideal method of testing programs and the best results can be achieved using a combination of various approaches, such as: a good coding style, static code analysis, dynamic code analysis, unit-testing, regression testing, and so on.

An important advantage of static analysis is in the ability to detect a lot of the errors right after they appear in the code, meaning that fixing them won't cost much. The earlier an error is detected, the less expensive it is to correct it. Thus, according to the book "Code Complete" by McConnell, correction of an error at the stage of testing the code is ten times more expensive than at the stage of writing the code:

Average costs of correcting defects depending upon the time of their detection (the data presented in the table are taken from the book 'Code Complete' by S. McConnell)

Static analysis tools allow the detection of a large amount of errors, typical during the initial writing of any given code, which significantly reduces the cost of the development of the whole project.

The prevalence of static analyzers will grow over the time due to the constant growth of the codebase of modern applications. Programs are becoming larger and more complicated. At the same time, the density of errors depends on code size nonlinearly.

The larger the project, the more errors per 1000 lines of code it contains. Take a look at this chart:

The size of the project and typical density of errors. Source: "Program Quality and Programmer Productivity" (Jones, 1977), "Estimating Software Costs" (Jones, 1998).

This data is shown in the graph below for better visualization.

Typical density of errors in the project. Blue - maximum quantity. Red - the average number. Green - the smallest amount of errors.

The graph shows that with the growth of the project, programmers are forced to use more tools to maintain the desired quality of code. It's impossible to create high quality code in the same way it was, let's say, eight years ago. This can be an unpleasant discovery for a team: it seems that they write the code as usual, but the condition of the code gets worse.

It is necessary to explore new methodologies and tools as old methods may not be sustainable as these technologies grow and change. One of the most useful methods that is worth using is static analysis.

If the reader was not familiar with the methodology of static analysis, I hope I was able to raise interest in that. Here are several links with suggested reading for more details:

Now it is time to move from theory to practice and see how static analysis helps a project such as Unreal Engine 4.

Unreal Engine

Our team was again honored to work with the code of Unreal Engine! Although, we did it two years ago, since that time we got more work to do regarding code editing and improvement. It is always useful and interesting to look at the project’s codebase after a two-year break. There are several reasons for this.

First, we were interested to look at false positives from the analyzer. This work helped us improve our tool as well, reducing the number of unnecessary messages. Fighting false positives is a constant task for any developer of code analyzers. To those who are willing to read more, I suggest having a look at the article "The way static analyzers fight against false positives, and why they do it".

The codebase of Unreal Engine 4 has significantly changed over the last two years. Some fragments were added, some were removed, sometimes entire folders disappeared. That's why not all the parts of the code got sufficient attention, which means that there is plenty of work for PVS-Studio.

I would like to compliment Epic Games for taking good care of their code and using such tools as PVS-Studio. A reader could take that with a smile: "Of course your team should be praising Epic Games, because they are your customer". To be honest, we do have a motive to leave positive feedback about the developers from Epic Games. However, I am saying the words of praise with absolute sincerity. The fact that the company uses static analysis tools shows the maturity of the project development cycle and the care given to ensuring the reliability and safety of the code.

Why am I sure that using PVS-Studio can greatly improve the quality of the code? It is one of the most powerful static analyzers and easily detects errors, even in projects such as:

Using PVS-Studio brings the quality of the code to the next level. By doing this, Epic Games shows they also care about all those who use Unreal Engine 4 for their projects. Every detected bug lessens somebody's headache.

Interesting errors

I won't be talking about all the errors that we found and fixed. I'll highlight only those that deserve attention, to my mind. For those who are interested, the other errors are viewable in the pull request on Github. To access the source code, and a specified pull request, you must have access to the Unreal Engine repository on GitHub. This blog outlines how to get access if you don't already have it.

The development of PVS-Studio analyzer is not only in the creation of new diagnostics, but also improvement of the existing ones. For example, the algorithms for evaluating possible values of variables are improving all the time. Due to this, the analyzer started detecting errors of this kind over a year ago:

uint8* Data = (uint8*)PointerVal; if (Data != nullptr || DataLen == 0) { NUTDebug::LogHexDump(Data, DataLen); } else if (Data == nullptr) { Ar.Logf(TEXT("Invalid Data parameter.")); } else // if (DataLen == 0) { Ar.Logf(TEXT("Invalid DataLen parameter.")); }

PVS-Studio warning: V547 Expression 'Data == nullptr' is always true. unittestmanager.cpp 1924

If the condition (Data != nullptr || DataLen == 0) is not true, it means that the pointer Data is definitely equal to nullptr. Therefore, to further check (Data == nullptr) makes no sense.

Correct variant of the code:

if (Data != nullptr && DataLen > 0)

The diagnostic V547 was written in 2010. However, the mechanism of evaluating the values of variables wasn't perfect, and it didn't allow the finding of this error. The analyzer was confused by the check of the variable value DataLen. It couldn't figure out what the variable values are equal to in various conditions. It is probably not a problem for a person to analyze such code, but it's not that simple when it comes to writing algorithms to look for such errors.

So, this was a demonstration of the improvement of internal mechanisms of PVS-Studio, which helped detect a new error. These were inner improvements, with the help of which, the analyzer started working more accurately.

We also make "external" improvements by supporting new constructions appearing in the new versions of the C++ language. Still, it's not enough to learn to parse C++11, C++14, and so on. It's equally important to refine old diagnostics and to implement new diagnostics that will find bugs in new language constructs. As an example, let's consider diagnostic V714 that looks for incorrect range-based loops. In Unreal Engine the V714 diagnostic points to the following loop:

for (TSharedPtr<SWidget> SlateWidget : SlateWidgets) { SlateWidget = nullptr; }

PVS-Studio warning: V714 Variable is not passed into for-each loop by a reference, but its value is changed inside of the loop. vreditorradialfloatingui.cpp 170

A programmer wanted to assign the value nullptr to all the elements in the container SlateWidgets. The error is that SlateWidget is a usual local variable that is created during every new iteration of the loop. Assigning a value to this variable does not lead to the changes of the element in the container. We should use a reference so that the code works correctly:

for (TSharedPtr<SWidget> &SlateWidget : SlateWidgets) { SlateWidget = nullptr; }

Of course we also add diagnostics which aren't related to the language. For example, the diagnostic V767 didn't exist in 2015 when our team wrote the previous article about Unreal Engine 4. This diagnostic appeared in PVS-Studio in version 6.07 (August 8, 2016). Thanks to this diagnostic we detected such an error:

for(int i = 0; i < SelectedObjects.Num(); ++i) { UObject* Obj = SelectedObjects[0].Get(); EdObj = Cast<UEditorSkeletonNotifyObj>(Obj); if(EdObj) { break; } }

PVS-Studio warning: V767 Suspicious access to element of 'SelectedObjects' array by a constant index inside a loop. skeletonnotifydetails.cpp 38

The loop should contain a search of the element that has UEditorSkeletonNotifyObj type. Due to a typo, a numeric literal 0 was written instead of the i variable during the choice of the element.

Correct variant of the code:

UObject* Obj = SelectedObjects[i].Get();

Let's take a look at another diagnostic, V763 that also appeared in the PVS-Studio 6.07. This bug is quite amusing, but I'll have to cite quite a long body of the RunTest function:

bool FCreateBPTemplateProjectAutomationTests::RunTest( const FString& Parameters) { TSharedPtr<SNewProjectWizard> NewProjectWizard; NewProjectWizard = SNew(SNewProjectWizard); TMap<FName, TArray<TSharedPtr<FTemplateItem>> >& Templates = NewProjectWizard->FindTemplateProjects(); int32 OutMatchedProjectsDesk = 0; int32 OutCreatedProjectsDesk = 0; GameProjectAutomationUtils::CreateProjectSet(Templates, EHardwareClass::Desktop, EGraphicsPreset::Maximum, EContentSourceCategory::BlueprintFeature, false, OutMatchedProjectsDesk, OutCreatedProjectsDesk); int32 OutMatchedProjectsMob = 0; int32 OutCreatedProjectsMob = 0; GameProjectAutomationUtils::CreateProjectSet(Templates, EHardwareClass::Mobile, EGraphicsPreset::Maximum, EContentSourceCategory::BlueprintFeature, false, OutMatchedProjectsMob, OutCreatedProjectsMob); return ( OutMatchedProjectsDesk == OutCreatedProjectsDesk ) && ( OutMatchedProjectsMob == OutCreatedProjectsMob ); }

The following part is the most important:

A programmer tries to initialize the variables OutMatchedProjectsDesk and OutCreatedProjectsDesk with the help of the first call of the CreateProjectSet function.

Using the second call of the CreateProjectSet function, there is an attempt to initialize the variables OutMatchedProjectsMob and OutCreatedProjectsMob.

Then there is a check that the values of these variables meet the condition:

return ( OutMatchedProjectsDesk == OutCreatedProjectsDesk ) && ( OutMatchedProjectsMob == OutCreatedProjectsMob );

Don't look for the errors in the body of the reviewed function; they aren't there. I have given this code to show that the function CreateProjectSet is expected to write the values into two variables, passed as two last factual arguments

The error lurks in the function CreateProjectSet:

static void CreateProjectSet(.... int32 OutCreatedProjects, int32 OutMatchedProjects) { .... OutCreatedProjects = 0; OutMatchedProjects = 0; .... OutMatchedProjects++; .... OutCreatedProjects++; .... }

PVS-Studio will issue two warnings here:

V763 Parameter 'OutCreatedProjects' is always rewritten in function body before being used. gameprojectautomationtests.cpp 88

V763 Parameter 'OutMatchedProjects' is always rewritten in function body before being used. gameprojectautomationtests.cpp 89

The analyzer is absolutely right when it warns that the values of the arguments OutCreatedProjects and OutMatchedProjects are not used in any way, but are immediately overwritten with 0.

The error is simple: a programmer forgot to pass parameters by reference. Correct variant of the code:

static void CreateProjectSet(.... int32 &OutCreatedProjects, int32 &OutMatchedProjects)

I've given errors that require at least some attentiveness for detection. However, there are a lot more simple and banal errors. For example, missing break statements:

{ case EWidgetBlendMode::Opaque: ActualBackgroundColor.A = 1.0f; case EWidgetBlendMode::Masked: ActualBackgroundColor.A = 0.0f; }

Or incorrect comparison of several variables for equality:

checkf(GPixelFormats[PixelFormat].BlockSizeX == GPixelFormats[PixelFormat].BlockSizeY == GPixelFormats[PixelFormat].BlockSizeZ == 1, TEXT("Tried to use compressed format?"));

If someone is new to C++ and doesn't get why this comparison is incorrect, I suggest looking at the description of V709 diagnostic.

These errors are the most numerous among those detected by PVS-Studio. But if they look so simple, why are they still unnoticed? They are so trivial if they are highlighted in the article for a reader. However, it's really hard to find them in the code of real applications. Even during a code review, one may look at the following code block and not see any errors.

{ case EWidgetBlendMode::Opaque: ActualBackgroundColor.A = 1.0f; case EWidgetBlendMode::Masked: ActualBackgroundColor.A = 0.0f; }

The code looks so simple that a programmer doesn't even try to read it carefully, thinking that it is completely correct.

Now, let's discuss a question: Can we reduce the number of errors in any way?

Recommendation

The errors described in the article were found using PVS-Studio, and most likely a reader would expect that I would recommend using static analysis tools. Yes, I do recommend integrating PVS-Studio static analyzer to the development process. There is no need to refuse the possibility of finding several bugs right after writing the code. However, I would like to discuss a very important point that is usually not mentioned in articles related to code quality.

It is impossible to achieve high quality in a project until a team of programmers admit that they make mistakes, and sometimes very simple ones. This phrase sounds very trivial, but it's very important. Until a programmer realizes that this statement refers not to an abstract programmer, but to them personally, no tool or methodology will be useful. In other words, programmers are most often too proud to admit that they need additional tools and methods to write quality code.

All programmers know that there are errors in all programs. Still, they suppose that the rules, recommendations, and tools aren't for them, as they are great professional developers who write bugless code. This is a problem of level overestimation. An article "The Problem With 'Above Average Programmers'" gives a nice explanation of this effect. I'll quote an excerpt:

"How would you rate your programming skills? (Below Average, Average, or Above Average)?"

Based on psychological studies across many different groups, about 90% of all programmers will answer "Above Average". Of course, that can't possibly be true. In a group of 100 people, 50 are above average, 50 are below average. This effect is known as illusory superiority. It is described in many spheres, but even if you haven't heard about this, you will most likely answer "above average".

This is a problem that prevents programmers from learning new technology and methodology. My main recommendation is to try reconsidering the attitude towards the work of the team and individuals. The position "I/we write great code" is counterproductive. It's a common thing for people to make mistakes; the same is true for programmers. By thinking through this, a person can make the biggest step in the direction of high quality software. I also suggest to the project managers to read this article.

I would like to warn about another reasoning error. Static and dynamic analyzers mainly detect simple bugs and typos. They won't find high-level logic errors, because artificial intelligence is not invented yet. However, a simple error can cause great harm and take a lot of time/money/effort to repair. You can read more on the topic here: "If the coding bug is banal, it doesn't mean it's not crucial".

And one more thing: Don't look for a silver bullet. Instead, use a combination of various elements such as:

Forget "our team is above average".

Have a coding standard, which is shared by all developers within the team.

Perform code reviews (at least of the most important fragments and code written by juniors).

Static code analysis.

Dynamic code analysis.

Regression testing, smoke testing.

Using unit tests, TDD.

and so on.

I'm not suggesting that you start using all the methods listed above at once. In different projects, some approaches will be more useful, some less. The point is not to hope for one alone to work but instead use a rational combination of techniques. Only this will improve the quality and reliability of the code.

Conclusion

The developers of Unreal Engine 4 care about the quality of their code, and the PVS-Studio team does its best to help them in their endeavors.

The PVS-Studio team is ready to work with the code of your projects as well. Besides providing the license for the tool and further support, we perform code auditing, migration of code, and so on.

I wish you as few bugs in the programs as possible.