NetXMS is a software product designed to monitor computer systems and networks. It can be used to monitor an entire IT infrastructure, from SNMP-compatible devices to server software. Naturally, I am going to monitor the code of this project with the PVS-Studio analyzer.

About NetXMS in brief

Links:

Description in Wikipedia: NetXMS

Website: http://www.netxms.org/

NetXMS is an open-source project distributed under the GNU General Public License v2. The code is written in C, C++ and Java.

The project depends on a number of third-party libraries. To be honest, I was too lazy to download some of them just to get the project built, so it was not checked in full. That doesn't prevent me from writing this post, though: my analysis is superficial anyway. It would be much better if the project's authors checked it themselves. They are welcome to write to our support service: I will generate a temporary registration key for the PVS-Studio analyzer so that they can analyze the project more thoroughly.

64-bit errors

In the articles describing checks of open-source projects, I tend to get carried away with citing general errors. But 64-bit errors have not disappeared; they can be found everywhere. They are just not that interesting to discuss. When you show a null pointer dereference, the bug is obvious. When you say that a 32-bit variable can overflow in a 64-bit application, it's less impressive. Certain circumstances must coincide for such an error to occur, so you have to speak of it as a "potential error".

Moreover, 64-bit bugs are much more difficult to detect. The rule set designed for 64-bit error detection produces a lot of false positives. The analyzer doesn't know the permissible range of input values and flags everything that looks at least a bit suspicious. To find the really dangerous fragments, you have to review a lot of messages; this is the only way to make sure that the program has been correctly ported to a 64-bit platform. This is especially true for applications that use more than 4 Gbytes of memory.

So, to be brief, writing articles about catching common bugs is much easier than writing about catching 64-bit ones. But this time I overcame my laziness and found several dangerous fragments of that kind. Let's start with them.

64-bit error N1

BOOL SortItems(...., _In_ DWORD_PTR dwData);
void CLastValuesView::OnListViewColumnClick(....)
{
  ....
  m_wndListCtrl.SortItems(CompareItems, (DWORD)this);
  ....
}

V220 Suspicious sequence of types castings: memsize -> 32-bit integer -> memsize. The value being casted: 'this'. lastvaluesview.cpp 716

Earlier, in 32-bit systems, the pointer's size was 4 bytes. When you needed to save or pass a pointer as an integer type, you used the types DWORD, UINT and so on. In 64-bit systems the pointer's size has grown to 8 bytes. To store them in integer variables the types DWORD_PTR, UINT_PTR and some others were created. Function interfaces have changed accordingly. Note the way the SortItems() function is declared in the first line of the sample.

Unfortunately, the program still contains a conversion of a pointer to the 32-bit DWORD type. The program compiles successfully. The pointer is explicitly cast to the 32-bit DWORD type and then implicitly extended to DWORD_PTR. The worst thing is that the program works well in most cases.

It will work as long as the instances of the CLastValuesView class are created within the low-order 4 Gbytes of the address space - that is, almost always. But the program might need more memory, or memory might become fragmented after a long run. The object will then be created outside the first 4 Gbytes, and the error will reveal itself: the pointer will lose its 32 high-order bits, and the program's behavior will become undefined.
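The truncation can be reproduced without any Windows headers. The sketch below is only an illustration: it models an address as a plain 64-bit number, and the function names are made up for this example rather than taken from NetXMS.

```cpp
#include <cstdint>

// Models "(DWORD)this" followed by the implicit widening to DWORD_PTR:
// the value is squeezed into 32 bits and then extended back to 64 bits.
std::uint64_t round_trip_through_dword(std::uint64_t address) {
    const std::uint32_t as_dword = static_cast<std::uint32_t>(address);
    return static_cast<std::uint64_t>(as_dword);
}

// True when the address is unchanged by the cast sequence,
// i.e. when the object lives in the low-order 4 Gbytes.
bool survives_dword_cast(std::uint64_t address) {
    return round_trip_through_dword(address) == address;
}
```

An address such as 0x7FFF0000 comes through intact, while 0x140001000 comes back as 0x40001000, which is a different (and most likely invalid) pointer.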

The bug is very easy to fix:

m_wndListCtrl.SortItems(CompareItems, (DWORD_PTR)this);

There are some other fragments with similar type conversions:

mibbrowserdlg.cpp 160

lastvaluesview.cpp 232

graphdatapage.cpp 370

graphdatapage.cpp 330

graphdatapage.cpp 268

graphdatapage.cpp 172

controlpanel.cpp 126

Each of these is a very sly bug; they are often very hard to reproduce. As a result, you get VERY RARE crashes after a long run.

64-bit error N2

The next error seems less critical. A poorly calculated hash code, however, can slow down search algorithms.

static int hash_void_ptr(void *ptr)
{
  int hash;
  int i;
  /* I took this hash function just off the top of my head,
     I have no idea whether it is bad or very bad. */
  hash = 0;
  for (i = 0; i < (int)sizeof(ptr)*8 / TABLE_BITS; i++)
  {
    hash ^= (unsigned long)ptr >> i*8;
    hash += i * 17;
    hash &= TABLE_MASK;
  }
  return hash;
}

V205 Explicit conversion of pointer type to 32-bit integer type: (unsigned long) ptr xmalloc.c 85

The author writes in the comment that he is not sure whether the function works well. And he's right. At the very least, there is a bug in casting the pointer to the 'unsigned long' type.

The data models used in Windows and Linux are different. 64-bit Linux uses the LP64 data model, in which the 'long' type is 64 bits wide. Thus, this code works as intended under Linux.

Win64 uses the LLP64 data model, in which 'unsigned long' is 32 bits wide. As a result, the high-order part of the pointer gets lost, and the hash is calculated less well than it could be.
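To see the effect on the hash, here is a rough sketch that mirrors the shape of the loop above. It is not the project's code: the table size (kTableBits) is an assumed value, and the address is passed as a 64-bit number so that the example behaves the same on any platform. In real code the pointer would be converted with uintptr_t instead of 'unsigned long'.

```cpp
#include <cstdint>

constexpr int kTableBits = 8;                      // assumed: 256 buckets
constexpr int kTableMask = (1 << kTableBits) - 1;

// Folds every byte of the 64-bit address into the hash.
int hash_address(std::uint64_t address) {
    int hash = 0;
    for (unsigned i = 0; i < sizeof(address); ++i) {
        hash ^= static_cast<int>((address >> (i * 8)) & 0xFF);
        hash += static_cast<int>(i) * 17;
        hash &= kTableMask;
    }
    return hash;
}

// Models the Win64 behavior: the address is first cut down to
// 32 bits, so the high-order half never reaches the hash.
int hash_address_truncated(std::uint64_t address) {
    return hash_address(static_cast<std::uint32_t>(address));
}
```

Two addresses that differ only in their high-order 32 bits always land in the same bucket with the truncated version, whereas the full-width version tells them apart.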

64-bit error N3

64-bit errors are not caused solely by explicit type conversions. But errors of this type are much easier to detect - for me as well. So let's have a look at one more bad type conversion.

static int ipfix_print_newmsg(....)
{
  ....
  strftime(timebuf, 40, "%Y-%m-%d %T %Z",
    localtime( (const time_t *) &(hdr->u.nf9.unixtime) ));
  ....
}

V114 Dangerous explicit type pointer conversion: (const time_t *) & (hdr->u.nf9.unixtime) ipfix_print.c 68

This is how the 'unixtime' member is declared:

uint32_t unixtime; /* seconds since 1970 */

And this is how the type 'time_t' is declared:

#ifdef _USE_32BIT_TIME_T
typedef __time32_t time_t;
#else
typedef __time64_t time_t;
#endif

As far as I can tell, the _USE_32BIT_TIME_T macro is not defined anywhere in the project. I didn't manage to find it, at least. This means that the localtime() function expects a pointer to a 64-bit time value, while the sample passes it the address of a 32-bit variable. That's no good: localtime() will read 4 extra bytes of adjacent memory and process garbage.

I suppose the readers can see now why I'm not fond of writing about 64-bit errors. They look too plain and unconvincing. I don't feel like searching for more samples to show you. Let's study some general bugs instead. They look much more impressive and dangerous.
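One possible way to avoid the problem is to copy the 32-bit value into a real time_t variable and pass the address of that variable. The sketch below shows the idea; the function names are invented for this example, and gmtime() is used instead of localtime() only to make the output independent of the machine's time zone.

```cpp
#include <cstddef>
#include <cstdint>
#include <ctime>
#include <string>

// Converts by value, so the width of time_t no longer matters.
std::time_t to_time_t(std::uint32_t unix_seconds) {
    return static_cast<std::time_t>(unix_seconds);
}

// Formats a 32-bit UNIX timestamp as a UTC date and time.
std::string format_utc(std::uint32_t unix_seconds) {
    const std::time_t t = to_time_t(unix_seconds);
    const std::tm* parts = std::gmtime(&t);   // receives a genuine time_t*
    if (parts == nullptr) return std::string();
    char buffer[40];
    const std::size_t n =
        std::strftime(buffer, sizeof(buffer), "%Y-%m-%d %H:%M:%S", parts);
    return std::string(buffer, n);
}
```

For example, format_utc(86400) yields "1970-01-02 00:00:00".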

Nevertheless, 64-bit errors still exist, and if you care about the quality of your 64-bit code, I advise you to keep the Viva64 diagnostic rule set at hand. These errors stay hidden for much longer than common bugs.

Errors when handling the SOCKET type

In Linux, a socket descriptor is a signed 'int', so portable code usually declares the SOCKET type as signed there. In Windows, this type is unsigned:

typedef UINT_PTR SOCKET;

This difference often causes bugs in Windows programs.

static int DoRadiusAuth(....)
{
  SOCKET sockfd;
  ....
  // Open a socket.
  sockfd = socket(AF_INET, SOCK_DGRAM, 0);
  if (sockfd < 0)
  {
    DbgPrintf(3, _T("RADIUS: Cannot create socket"));
    pairfree(req);
    return 5;
  }
  ....
}

V547 Expression 'sockfd < 0' is always false. Unsigned type value is never < 0. radius.cpp 682

The 'sockfd' variable has the UINT_PTR type. As a result, the 'sockfd < 0' condition never holds when the program runs under Windows, and the program will try in vain to work with a socket that has not been opened.

You should fight your laziness and use the special constants. In Windows, the socket() function reports a failure by returning INVALID_SOCKET, so this is what the check should look like:

if (sockfd == INVALID_SOCKET)
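The difference between the two kinds of check can be shown without Winsock. In the sketch below, socket_handle and kInvalidSocket are hypothetical stand-ins for the Windows SOCKET type and for INVALID_SOCKET, the all-bits-set value that socket() returns on failure in Windows.

```cpp
#include <cstdint>

// Stand-ins for "typedef UINT_PTR SOCKET" and INVALID_SOCKET.
using socket_handle = std::uintptr_t;
constexpr socket_handle kInvalidSocket = ~static_cast<socket_handle>(0);

// The check from the sample: an unsigned value is never below zero,
// so this function returns false for every argument.
bool failed_by_sign_check(socket_handle s) {
    return s < 0;
}

// Comparing against the dedicated constant detects the failure.
bool failed_by_constant(socket_handle s) {
    return s == kInvalidSocket;
}
```

A compiler with extra warnings enabled will usually point out that the comparison in failed_by_sign_check() is always false, which is the same observation the V547 diagnostic makes.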

Similar incorrect checks can be found in the following fragments:

ipfix.c 845

ipfix.c 962

ipfix.c 1013

ipfix.c 1143

ipfix.c 1169

ipfix_col.c 1404

ipfix_col.c 2025

A potential array overrun

int ipfix_snprint_string(....)
{
  size_t i;
  uint8_t *in = (uint8_t*) data;
  for( i=len-1; i>=0; i-- ) {
    if ( in[i] == '\0' ) {
      return snprintf( str, size, "%s", in );
    }
  }
  ....
}

V547 Expression 'i >= 0' is always true. Unsigned type value is always >= 0. ipfix.c 488

The 'i' variable has the size_t type, which is unsigned, so the check "i>=0" is pointless. If no zero byte is found in the buffer, 'i' wraps around to the maximum size_t value after reaching zero, and the function starts reading memory far outside the array's boundaries. The consequences can be very diverse.
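A backward loop over an unsigned index can be written so that the termination test happens before the decrement takes effect. The following is a minimal sketch of that idiom; the function name is made up, and it only answers whether the buffer contains a zero byte.

```cpp
#include <cstddef>
#include <cstdint>

// Scans the buffer from the last byte to the first.
// "i-- > 0" tests the old value of i, so the loop stops
// cleanly at index 0 and also handles len == 0.
bool contains_nul(const std::uint8_t* data, std::size_t len) {
    for (std::size_t i = len; i-- > 0; ) {
        if (data[i] == '\0') {
            return true;
        }
    }
    return false;
}
```

Note that starting from 'len' rather than 'len-1' also removes the wraparound that the original loop would hit for an empty buffer.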

One more error when handling unsigned types

bool CatalystDriver::isDeviceSupported(....)
{
  DWORD value = 0;
  if (SnmpGet(snmp->getSnmpVersion(), snmp,
              _T(".1.3.6.1.4.1.9.5.1.2.14.0"),
              NULL, 0, &value, sizeof(DWORD), 0)
      != SNMP_ERR_SUCCESS)
    return false;
  // Catalyst 3550 can return 0 as number of slots
  return value >= 0;
}

V547 Expression 'value >= 0' is always true. Unsigned type value is always >= 0. catalyst.cpp 71

Half-cleared buffers

One of the most common error patterns is confusion of WCHAR strings' sizes. You can find quite a number of examples in our bug database.

typedef WCHAR TCHAR, *PTCHAR;
static BOOL MatchProcess(....)
{
  ....
  TCHAR commandLine[MAX_PATH];
  ....
  memset(commandLine, 0, MAX_PATH);
  ....
}

V512 A call of the 'memset' function will lead to underflow of the buffer 'commandLine'. procinfo.cpp 278

TCHAR expands to WCHAR here. The 'commandLine' array contains MAX_PATH characters, so its size in bytes is 'MAX_PATH * sizeof(TCHAR)'. The 'memset' function works with bytes, so only the first half of the buffer gets cleared. The correct way to clear the buffer looks like this:

memset(commandLine, 0, MAX_PATH * sizeof(TCHAR));

An even better way is to make it like this:

memset(commandLine, 0, sizeof(commandLine));
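The effect is easy to measure. The sketch below fills a wchar_t buffer with non-zero characters, clears a given number of bytes and counts the characters that are still dirty. kMaxPath is an assumed stand-in for MAX_PATH, and the exact count depends on the platform because wchar_t is 2 bytes wide in Windows and typically 4 bytes wide in Linux.

```cpp
#include <cstddef>
#include <cstring>

constexpr std::size_t kMaxPath = 260;   // stand-in for MAX_PATH

// Returns how many characters remain non-zero after memset()
// has cleared 'bytes_to_clear' bytes of a kMaxPath-character buffer.
std::size_t dirty_after_clear(std::size_t bytes_to_clear) {
    wchar_t buffer[kMaxPath];
    for (wchar_t& c : buffer) {
        c = L'x';
    }
    std::memset(buffer, 0, bytes_to_clear);
    std::size_t dirty = 0;
    for (wchar_t c : buffer) {
        if (c != 0) {
            ++dirty;
        }
    }
    return dirty;
}
```

Calling dirty_after_clear(kMaxPath) leaves part of the buffer untouched, while dirty_after_clear(kMaxPath * sizeof(wchar_t)) returns zero.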

The CToolBox class suffers from the same problem:

typedef WCHAR TCHAR, *PTCHAR;
#define MAX_TOOLBOX_TITLE 64
TCHAR m_szTitle[MAX_TOOLBOX_TITLE];
CToolBox::CToolBox()
{
  memset(m_szTitle, 0, MAX_TOOLBOX_TITLE);
}

V512 A call of the 'memset' function will lead to underflow of the buffer 'm_szTitle'. toolbox.cpp 28

Copy-paste

In the findIpAddress() function, a null pointer may get dereferenced. The reason is a copied-and-pasted line.

void ClientSession::findIpAddress(CSCPMessage *request)
{
  ....
  if (subnet != NULL)
  {
    debugPrintf(5, _T("findIpAddress(%s): found subnet %s"),
                ipAddrText, subnet->Name());
    found = subnet->findMacAddress(ipAddr, macAddr);
  }
  else
  {
    debugPrintf(5, _T("findIpAddress(%s): subnet not found"),
                ipAddrText, subnet->Name());
  }
  ....
}

V522 Dereferencing of the null pointer 'subnet' might take place. session.cpp 10823

The debugPrintf() call was obviously copied, but the copy in the 'else' branch is incorrect: the 'subnet' pointer equals NULL there, so you cannot write "subnet->Name()". The format string in that branch has only one "%s" anyway, so the extra argument is not needed at all.

A misprint

#define CF_AUTO_UNBIND 0x00000002
bool isAutoUnbindEnabled()
{
  return ((m_flags & (CF_AUTO_UNBIND | CF_AUTO_UNBIND)) ==
          (CF_AUTO_UNBIND | CF_AUTO_UNBIND)) ? true : false;
}

V578 An odd bitwise operation detected: m_flags & (0x00000002 | 0x00000002). Consider verifying it. nms_objects.h 1410

The expression (CF_AUTO_UNBIND | CF_AUTO_UNBIND) is very strange. It seems that two different constants should be used here.

Unexpected optimization

void I_SHA1Final(....)
{
  unsigned char finalcount[8];
  ....
  memset(finalcount, 0, 8);
  SHA1Transform(context->state, context->buffer);
}

V597 The compiler could delete the 'memset' function call, which is used to flush 'finalcount' buffer. The RtlSecureZeroMemory() function should be used to erase the private data. sha1.cpp 233

In functions related to cryptography, it is an accepted practice to clear temporary buffers. If you don't do that, consequences may be interesting: for instance, a fragment of classified information may be unintentionally sent to the network. Read the article "Overwriting memory - why?" to find out the details.

The memset() function is often used to clear such buffers, which is incorrect. If the array is not used after being cleared, the compiler may remove the memset() call as an optimization. To prevent this, use the RtlSecureZeroMemory() function.
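RtlSecureZeroMemory() is available only in Windows. Where it is not, a commonly used workaround is to write through a pointer to volatile, which compilers do not optimize away in practice. The sketch below shows this workaround under that assumption; secure_zero is a name invented for the example, and a platform facility (for instance memset_s from the optional Annex K of C11, where implemented) is preferable when one exists.

```cpp
#include <cstddef>

// Overwrites 'size' bytes with zeros. Each store goes through a
// volatile-qualified pointer, so it counts as an observable access
// that the optimizer is expected to keep.
void secure_zero(void* memory, std::size_t size) {
    volatile unsigned char* bytes =
        static_cast<volatile unsigned char*>(memory);
    while (size-- > 0) {
        *bytes++ = 0;
    }
}
```

In the sample above, the call would be secure_zero(finalcount, sizeof(finalcount)) in place of memset(finalcount, 0, 8).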

Using uninitialized variables

Many programmers are convinced that the use of uninitialized variables is the most annoying and frequent bug. Judging by my experience of checking various projects, I don't believe that's true. This bug is discussed a lot in books and articles. Thanks to that, everybody knows what uninitialized variables are, why they are dangerous, and how to avoid and find them. Personally, I feel that many more errors are caused by, say, Copy-Paste. But, of course, this doesn't mean that uninitialized variables have been defeated. Here they are.

int OdbcDisconnect(void* pvSqlCtx)
{
  ....
  SQLRETURN nSqlRet;
  ....
  if (nRet == SUCCESS)
  {
    ....
    nSqlRet = SQLDisconnect(pSqlCtx->hDbc);
    ....
  }
  if (SQLRET_FAIL(nSqlRet))
  ....
}

V614 Potentially uninitialized variable 'nSqlRet' used. odbcsapi.cpp 220

The nSqlRet variable is initialized only if execution enters the body of the 'if' statement, but it is checked afterwards in every case. As a result, the variable sometimes holds a random value.

Here are some other places where variables may be left uninitialized:
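The simplest remedy is to give the variable a defined value at the point of declaration. The sketch below illustrates this with hypothetical stand-ins for the ODBC types and calls; which initial value is appropriate (success or failure) is an assumption here and depends on what the function is supposed to report when the disconnect is never attempted.

```cpp
// Hypothetical stand-ins for SQLRETURN, SQL_SUCCESS and SQLRET_FAIL.
using SqlReturn = int;
constexpr SqlReturn kSqlSuccess = 0;
constexpr SqlReturn kSqlError = -1;

bool sql_failed(SqlReturn code) { return code < 0; }

// Pretends to call the driver; 'driver_ok' selects the outcome.
SqlReturn fake_disconnect(bool driver_ok) {
    return driver_ok ? kSqlSuccess : kSqlError;
}

// Mirrors the control flow of the sample, but the result variable
// has a defined value even when the 'if' body is skipped.
bool disconnect_reports_failure(bool precondition_ok, bool driver_ok) {
    SqlReturn result = kSqlSuccess;   // assumed default: nothing failed
    if (precondition_ok) {
        result = fake_disconnect(driver_ok);
    }
    return sql_failed(result);
}
```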

session.cpp 2112

session.cpp 7525

session.cpp 7659

functions.cpp 386

unlock.cpp 63

alarmbrowser.cpp 539

A pointer is first used and then checked for being a null pointer

It is a very common situation that due to refactoring a pointer check is put after a pointer dereferencing operation in the program text. A lot of examples can be found here.

To detect this error pattern the V595 diagnostic is used. The number of such defects found in code often reaches many dozens. To NetXMS's credit, however, I noticed only one code fragment of that kind:

DWORD SNMP_PDU::encodeV3SecurityParameters(....,
  SNMP_SecurityContext *securityContext)
{
  ....
  DWORD engineBoots =
    securityContext->getAuthoritativeEngine().getBoots();
  DWORD engineTime =
    securityContext->getAuthoritativeEngine().getTime();
  if ((securityContext != NULL) &&
      (securityContext->getSecurityModel() ==
       SNMP_SECURITY_MODEL_USM))
  {
    ....
  }
  ....
}

V595 The 'securityContext' pointer was utilized before it was verified against nullptr. Check lines: 1159, 1162. pdu.cpp 1159
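The usual way to fix this pattern is to move the check above the first dereference. Here is a minimal sketch of that ordering; the structure, the constant and the function are simplified, hypothetical stand-ins and do not reproduce the NetXMS classes.

```cpp
// Simplified stand-in for SNMP_SecurityContext.
struct SecurityContext {
    int model;
    unsigned boots;
};

constexpr int kModelUsm = 3;   // assumed value, for illustration only

// The pointer is verified first and dereferenced only afterwards.
unsigned engine_boots_or_zero(const SecurityContext* context) {
    if (context == nullptr || context->model != kModelUsm) {
        return 0;
    }
    return context->boots;
}
```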

There were some other V595 warnings, but I found them too unconvincing to mention in the article. Those are probably just redundant checks.

A bug when using variadic functions

Errors in calls to printf() and similar functions are classic. The reason is that variadic functions don't check the types of the arguments being passed.

#define _ftprintf fwprintf
static __inline char * __CRTDECL ctime(const time_t * _Time);
BOOL LIBNETXMS_EXPORTABLE SEHServiceExceptionHandler(....)
{
  ....
  _ftprintf(m_pExInfoFile,
            _T("%s CRASH DUMP\n%s\n"),
            szProcNameUppercase,
            ctime(&t));
  ....
}

V576 Incorrect format. Consider checking the fourth actual argument of the 'fwprintf' function. The pointer to string of wchar_t type symbols is expected. seh.cpp 292

The _ftprintf() macro expands to the fwprintf() function. The format string specifies that strings of the 'wchar_t *' type must be passed to the function, but the ctime() function returns a string consisting of 'char' characters. This bug has probably gone unnoticed because it sits inside the error handler.
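One way to make the argument types match is to convert the narrow string to a wide one before formatting and to use the "%ls" specifier, which means a wide string in both the Microsoft and the standard C library. The sketch below assumes that the timestamp contains only 7-bit ASCII characters, which holds for the output of ctime(); the function names are invented for the example.

```cpp
#include <cstddef>
#include <cwchar>
#include <string>

// Widens a plain ASCII string character by character.
std::wstring widen_ascii(const char* text) {
    std::wstring wide;
    for (; *text != '\0'; ++text) {
        wide.push_back(
            static_cast<wchar_t>(static_cast<unsigned char>(*text)));
    }
    return wide;
}

// Builds "<name> CRASH DUMP\n<timestamp>\n" from a wide process
// name and a narrow timestamp; returns an empty string on error.
std::wstring crash_header(const wchar_t* process_name,
                          const char* narrow_time) {
    const std::wstring wide_time = widen_ascii(narrow_time);
    wchar_t buffer[256];
    const int written = std::swprintf(buffer, 256,
                                      L"%ls CRASH DUMP\n%ls\n",
                                      process_name, wide_time.c_str());
    if (written < 0) {
        return std::wstring();
    }
    return std::wstring(buffer, static_cast<std::size_t>(written));
}
```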

Here are two more errors of that kind:

nxpush.cpp 193

nxpush.cpp 235

It is not taken into account that the 'new' operator throws exceptions when there is memory shortage

The 'new' operator used to return 'NULL' when it failed to allocate memory. Now it throws an exception. Many programs don't take this change into account. Sometimes it doesn't matter, but in some cases it may cause failures. Take a look at the following code fragment from the NetXMS project:

PRectangle CallTip::CallTipStart(....)
{
  ....
  val = new char[strlen(defn) + 1];
  if (!val)
    return PRectangle();
  ....
}

V668 There is no sense in testing the 'val' pointer against null, as the memory was allocated using the 'new' operator. The exception will be generated in the case of memory allocation error. calltip.cpp 260

Previously, an empty 'PRectangle' object was returned if memory couldn't be allocated. Now an exception is thrown instead. I don't know whether this change in behavior is critical. In any case, checking the pointer for null no longer makes sense.

We should either remove the checks or use the form of the 'new' operator that doesn't throw exceptions and returns a null pointer:

val = new (std::nothrow) char[strlen(defn) + 1];
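For completeness, here is a small self-contained sketch of the non-throwing form in use. The function name is hypothetical, and the buffer is wrapped in std::unique_ptr only so that the example does not leak; the null check is meaningful here because std::nothrow makes 'new' return a null pointer on failure.

```cpp
#include <cstddef>
#include <cstring>
#include <memory>
#include <new>

// Copies a C string into freshly allocated memory.
// Returns an empty pointer when the allocation fails.
std::unique_ptr<char[]> duplicate_string(const char* text) {
    const std::size_t size = std::strlen(text) + 1;
    std::unique_ptr<char[]> copy(new (std::nothrow) char[size]);
    if (!copy) {
        return nullptr;   // allocation failed, no exception thrown
    }
    std::memcpy(copy.get(), text, size);
    return copy;
}
```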

The PVS-Studio analyzer generates a great many V668 warnings on the NetXMS project, so I won't overload the article with examples. Let's leave it to the authors to check the project.

A strange loop

static bool MatchStringEngine(....)
{
  ....
  // Handle "*?" case
  while(*MPtr == _T('?'))
  {
    if (*SPtr != 0)
      SPtr++;
    else
      return false;
    MPtr++;
    break;
  }
  ....
}

V612 An unconditional 'break' within a loop. tools.cpp 280

The loop body is executed at most once, so the 'while' behaves like an 'if'. The 'break' keyword inside it is probably unnecessary.

Instead of the conclusion

I haven't drawn any new conclusions from the check of the NetXMS project. Errors are everywhere; some of them can be found with static analysis - the earlier, the better.

I'll just give you some interesting and useful links instead of the conclusion: