We'd like to present the series of articles dealing with the recommendations on writing code of high quality using the examples of errors found in the Chromium project. This is the fifth part, which deals with the use of unchecked or incorrectly checked data. A very large number of vulnerabilities exist thanks just to the use of unchecked data that makes the this topic exciting and actual.

In fact, almost any type of error can become a vulnerability, even an ordinary typo. Actually, if a found error is classified according to the Common Weakness Enumeration, this means that it is a potential vulnerability.

PVS-Studio analyzer, starting with version 6.21, learned to classify bugs according to Common Weakness Enumeration and assign them the appropriate CWE ID.

Readers may have already noticed that in previous articles, in addition to the warning number Vxxx I cited CWE ID as well. This means that the errors considered earlier, in theory may cause vulnerabilities. The probability is low, but it takes place. What is interesting, we were able to match a CWE ID almost with every warning issued by PVS-Studio. This means that even though we haven't planned, we created the analyzer that is able to detect a large number of weaknesses :).

Conclusion. PVS-Studio analyzer helps you to prevent many types of vulnerabilities in advance. Publication on this topic: How Can PVS-Studio Help in the Detection of Vulnerabilities?

In this article I collected the bugs, which can potentially lead to security problems. I'd like to notify that the choice of errors is quite relative and subjective. It may be that a vulnerability is disguised as an error, which I called a trivial typo in one of the previous articles.

So, let's see what security defects I noticed during analyzing the report issued by PVS-Studio for the Chromium project. As I wrote in the introductory article, I skimmed through the report quite fluently, so there may be other, unnoticed errors. The main objective of the article is to outline the way some errors make the program handle incorrect or unchecked data. I haven't decided how to define such data yet, and for now I will use the term "untrusted data".

Examples of Errors

Chromium Project.

InstallUtil::ConditionalDeleteResult InstallUtil::DeleteRegistryValueIf(....) { .... ConditionalDeleteResult delete_result = NOT_FOUND; .... if (....) { LONG result = key.DeleteValue(value_name); if (result != ERROR_SUCCESS) { .... delete_result = DELETE_FAILED; } delete_result = DELETED; } return delete_result; }

PVS-Studio warning: V519 CWE-563 The 'delete_result' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 381, 383. install_util.cc 383

The function returns an incorrect status. As a result, other parts of the program will take that the function has successfully removed some value. The bug is that the status DELETE_FAILED is always replaced with a status DELETED.

The error can be corrected by adding the else keyword:

if (result != ERROR_SUCCESS) { .... delete_result = DELETE_FAILED; } else { delete_result = DELETED; }

Perhaps, the described error doesn't reflect well the essence of the untrusted data. In this function, creation of false data occurs, but not its checking or using. So let's look at another, more appropriate error.

PDFium library (used in Chromium).

CPVT_WordRange Intersect(const CPVT_WordRange& that) const { if (that.EndPos < BeginPos || that.BeginPos > EndPos || EndPos < that.BeginPos || BeginPos > that.EndPos) { return CPVT_WordRange(); } return CPVT_WordRange(std::max(BeginPos, that.BeginPos), std::min(EndPos, that.EndPos)); }

PVS-Studio warnings:

V501 CWE-570 There are identical sub-expressions 'that.BeginPos > EndPos' to the left and to the right of the '||' operator. cpvt_wordrange.h 46

V501 CWE-570 There are identical sub-expressions 'that.EndPos < BeginPos' to the left and to the right of the '||' operator. cpvt_wordrange.h 46

The condition is spelled wrong.

Let's reduce the condition so that it was easier to notice an error:

if (E2 < B1 || B2 > E1 || E1 < B2 || B1 > E2)

Note, that (E2 < B1) and (B1 > E2) mean the same thing. Similarly, (B2 > E1) is the same thing as (E1 < B2).

It turns out that not all the necessary checks are carried out. So, further an incorrect range may be generated, which, in turns, will affect the functioning of the program.

Now let's look at the large and complex fragment of code from a library of regular expressions RE2 (used in Chromium). Honestly, I don't even understand what's going on here, but the code definitely contains the anomalous check.

First, it has to be shown how some types are declared. If you don't do that, then the code is not very clear.

typedef signed int Rune; enum { UTFmax = 4, Runesync = 0x80, Runeself = 0x80, Runeerror = 0xFFFD, Runemax = 0x10FFFF, };

And now the function with an anomaly.

char* utfrune(const char *s, Rune c) { long c1; Rune r; int n; if(c < Runesync) /* not part of utf sequence */ return strchr((char*)s, c); for(;;) { c1 = *(unsigned char*)s; if(c1 < Runeself) { /* one byte rune */ if(c1 == 0) return 0; if(c1 == c) // <= return (char*)s; s++; continue; } n = chartorune(&r, s); if(r == c) return (char*)s; s += n; } return 0; }

PVS-Studio analyzer generates a warning for the string, which I noted with the comment "// <=". Message: V547 CWE-570 Expression 'c1 == c' is always false. rune.cc 247

Let's try to understand why the condition is always false. First, look carefully at these lines:

if(c < Runesync) return strchr((char*)s, c);

If the variable c < 0x80, the function ends its work. If the function doesn't end its work, and will continue it, you can say for sure that the variable c >= 0x80.

Now look at the condition:

if(c1 < Runeself)

A condition (c1 == c) marked by the comment "// <=", is executed only if c1 < 0x80.

So here's what we know about the values of the variables:

c >= 0x80

c1 < 0x80

It follows that the condition c1 == c is always false. It is very suspicious. It turns out that the function utfrune in the library of regular expressions is not working as planned. The consequences of such an error are unpredictable.

Video codec LibVPX (used in Chromium).

#define VP9_LEVELS 14 extern const Vp9LevelSpec vp9_level_defs[VP9_LEVELS]; typedef enum { .... LEVEL_MAX = 255 } VP9_LEVEL; static INLINE int log_tile_cols_from_picsize_level( uint32_t width, uint32_t height) { int i; const uint32_t pic_size = width * height; const uint32_t pic_breadth = VPXMAX(width, height); for (i = LEVEL_1; i < LEVEL_MAX; ++i) { if (vp9_level_defs[i].max_luma_picture_size >= pic_size && vp9_level_defs[i].max_luma_picture_breadth >= pic_breadth) { return get_msb(vp9_level_defs[i].max_col_tiles); } } return INT_MAX; }

PVS-Studio warnings:

V557 CWE-119 Array overrun is possible. The value of 'i' index could reach 254. vp9_encoder.h 931

V557 CWE-119 Array overrun is possible. The value of 'i' index could reach 254. vp9_encoder.h 932

V557 CWE-119 Array overrun is possible. The value of 'i' index could reach 254. vp9_encoder.h 933

An array vp9_level_defs consists of 14 elements. In the loop, the variable i used as an array index varies from 0 to 254. Here is the result: an array index out of bounds.

It's good if this code leads to Access Violation. But in practice, most likely, some random data located near the array vp9_level_defs will be handled.

I came across another similar error of using data out of array bounds in the SQLite library (used in Chromium).

First note that the array yy_shift_ofst contains 455 items.

static const short yy_shift_ofst[] = { /* 0 */ 355, 888, 1021, 909, 1063, 1063, 1063, 1063, 20, -19, .... /* 450 */ 1440, 1443, 1538, 1542, 1562, }

These two macros are of interest for us as well:

#define YY_SHIFT_COUNT (454) #define YY_MIN_REDUCE 993

The macro YY_SHIFT_COUNT defines the maximum index that can be used to access the elements in the array yy_shift_ofst. It is not 455, but 454, because the numbering of elements starts from 0.

The macro YY_MIN_REDUCE, equal to 993, has no relation to the size of the array yy_shift_ofst.

The function containing a weak check:

static unsigned int yy_find_shift_action(....) { int i; int stateno = pParser->yytos->stateno; if( stateno>=YY_MIN_REDUCE ) return stateno; // <= assert( stateno <= YY_SHIFT_COUNT ); do { i = yy_shift_ofst[stateno]; // <= .... }

PVS-Studio warning: V557 CWE-125 Array overrun is possible. The value of 'stateno' index could reach 992. sqlite3.c 138802

In this case, the protection is made in the way that when accessing to this array the index must not be greater than a certain value. Due to typo errors, or for any other reason, an incorrect constant is used. The constant equal to 454 should have been used, but instead of this the value of the index is compared with 993.

As a result, the array overrun and reading of random untrusted data is possible.

Note. Below there is a correct assert, but it will not help in the Release-version.

Most likely, the check should be rewritten as follows:

if (stateno > YY_SHIFT_COUNT) { assert(false); return stateno; }

ICU Project (used in Chromium).

UVector* ZoneMeta::createMetazoneMappings(const UnicodeString &tzid) { UVector *mzMappings = NULL; .... if (U_SUCCESS(status)) { .... if (U_SUCCESS(status)) { .... while (ures_hasNext(rb)) { .... if (mzMappings == NULL) { mzMappings = new UVector( deleteOlsonToMetaMappingEntry, NULL, status); if (U_FAILURE(status)) { delete mzMappings; uprv_free(entry); break; } } .... } .... } } ures_close(rb); return mzMappings; }

PVS-Studio warning: V774 CWE-416 The 'mzMappings' pointer was used after the memory was released. zonemeta.cpp 713

Code is complicated and I find it difficult to say exactly, if there is a bug or not. However, as far as I understood, it is possible that this function will return a pointer to the memory block being freed. A correct handler of incorrect status must reset the pointer:

if (U_FAILURE(status)) { delete mzMappings; mzMappings = nullptr; uprv_free(entry); break; }

But now it turns out that the function has returned a pointer to the released block of memory. In this memory anything can be and the use of invalid pointer will result in undefined behavior.

Negative values protection is implemented improperly in the following function of the Chromium project.

void AXPlatformNodeWin::HandleSpecialTextOffset(LONG* offset) { if (*offset == IA2_TEXT_OFFSET_LENGTH) { *offset = static_cast<LONG>(GetText().length()); } else if (*offset == IA2_TEXT_OFFSET_CARET) { int selection_start, selection_end; GetSelectionOffsets(&selection_start, &selection_end); if (selection_end < 0) *offset = 0; *offset = static_cast<LONG>(selection_end); } }

PVS-Studio warning: V519 CWE-563 The '* offset' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 3543, 3544. ax_platform_node_win.cc 3544

If the value of the variable selection_end is negative, the function must return 0. However, due to typo, 0 is written not in the right place. Correct code should be like this:

if (selection_end < 0) selection_end = 0; *offset = static_cast<LONG>(selection_end);

Because of this error, the function can return a negative number, although it must not. It is a negative number, which can "leak" through the check and there is untrusted data.

Other Errors

Honestly, I don't really like the examples I gave in the previous section of this article. There are few of them and they don't reflect very well the essence of the bugs related to the use of untrusted data. I think eventually I'll write a separate article where I'll show more vivid examples of errors, having collected them from various open source projects.

By the way, the article might include more examples of errors, but I have "wasted" them when writing the preceding articles, and I don't want to repeat myself. For example, in the article "Chromium: Typos" there was such a fragment:

if(!posX->hasDirtyContents() || !posY->hasDirtyContents() || !posZ->hasDirtyContents() || !negX->hasDirtyContents() || !negY->hasDirtyContents() || // <= !negY->hasDirtyContents()) // <=

Because of this typo, the object referenced by the pointer negZ is not checked. As a result, the program will work with untrusted data.

Also in this article, I did not consider the situations where the untrusted (tainted) data appear due to the lack of the check of the pointer, which a malloc function returns. If the malloc function returned NULL, this does not mean that the only error of null pointer dereference is possible. There are more insidious situations. Schematically, they look like this:

int *ptr = (int *)malloc(100 * sizeof(int)); ptr[1234567] = 42;

There will be no null pointer dereference. Here data recording and destruction of some data will occur.

It's an interesting story and I will dedicate it the following separate article.

Recommendations

Various errors lead to the use of untrusted (unchecked, tainted) data. Some kind of universal piece of advice cannot be given here. Of course you may write: don't make bugs in your code! But there is no use in such a recommendation :).

So why did I write this article and highlight this type of errors?

For you to know about them. Awareness that a problem exists - this is the factor that helps to prevent it. If one doesn't know that the problem exists it doesn't mean that there is no problem. Nice illustration:

What can we still advise:

Update libraries used in your project. Various errors can be corrected in new versions, which might be vulnerabilities. However, it must be recognized that a vulnerability can appear right in the new version, and be absent the old one. But anyway, a better solution would be to update the libraries. Much more people know about the old vulnerabilities rather than about the new ones.

Thoroughly check all input data, especially coming from somewhere outside. For example, all the data coming from somewhere by the network should be checked very carefully.

Use a variety of tools to check the code. For example, the Chromium project clearly lack the PVS-Studio static analyzer using :).

Explain to your colleagues that 'If the coding bug is banal, it doesn't mean it's not crucial'. If your team develops crucial applications, then you should focus on the quality of the code and deleting everything, even the innocent-looking error.

Note about PVS-Studio

As I have already said, PVS-Studio analyzer is already helping prevent vulnerabilities by detecting errors even at the stage of writing code. But we want more and soon we'll significantly improve PVS-Studio by introducing the concept "use of unchecked data" in Data Flow analysis.

We even have already reserved a special number for this important diagnostic: V1010. Diagnostic will detect errors when the data was obtained from unreliable source (for example, sent by the network), and is used without proper verification. The absence of all necessary checks of input data often cause detection of vulnerabilities in applications. Recently we've written about this in the article "PVS-Studio 2018: CWE, Java, RPG, macOS, Keil, IAR, MISRA" (see the section "Potential vulnerabilities, CWE").

New diagnostic will significantly strengthen the analyzer in the identification of potential vulnerabilities. Most likely V1010 diagnostic will match the CWE-20 identifier (Improper Input Validation).

Conclusion

I suggest you and your colleagues read our article "42 recommendations" on our website. A developer will not become a security expert but will find out a lot of interesting and useful material. These articles will be especially useful for developers, who have just started writing in C or C++ languages and who has no idea how deep is the rabbit-hole into which they fell.

I'm planning to update the "42 recommendations" and update them into "50 recommendations". So I invite you to subscribe to my Twitter @Code_Analysis and our RSS channel not to miss this and other interesting articles in our blog.