The Microsoft Corporation has recently published, for free, access the source code of the CoreCLR engine, which is a key component of .NET Core. We couldn't help but pay attention to this event. The wider a project's audience is, the worse defects found in the code will seem, won't they? Despite Microsoft themselves being the authors of the product, there are still some issues to examine and think over in their code - just like in any other large project.

Introduction

CoreCLR is a runtime environment of .NET Core performing such functions as garbage collection, or compilation into target machine code. .Net Core is a modular implementation of .Net, which can be used as the base stack for a wide variety of scenarios.

The source code has recently been uploaded to GitHub, and was analyzed by PVS-Studio 5.23. Just like me, anyone interested can get the complete analysis log through Microsoft Visual Studio Community Edition, the release of which is another recent event by Microsoft.

Typos

The custom is that I start my reports with the typos section. Errors of this type have to do with duplicated variables, constants, macros or structure/class fields in conditional expressions. Whether or not there is a real error is subject for debate. Nevertheless, we found a couple of such fragments in the project, and they do look strange.

V501 There are identical sub-expressions 'tree->gtOper == GT_CLS_VAR' to the left and to the right of the '||' operator. ClrJit lsra.cpp 3140

// register variable GTNODE(GT_REG_VAR , "regVar" ,0,GTK_LEAF|GTK_LOCAL) // static data member GTNODE(GT_CLS_VAR , "clsVar" ,0,GTK_LEAF) // static data member address GTNODE(GT_CLS_VAR_ADDR , "&clsVar" ,0,GTK_LEAF) .... void LinearScan::buildRefPositionsForNode(GenTree *tree, ....) { .... if ((tree->gtOper == GT_CLS_VAR || tree->gtOper == GT_CLS_VAR) && i == 1) { registerType = TYP_PTR; currCandidates = allRegs(TYP_PTR); } .... }

Although the 'GenTree' structure has a field with a similar name "tree->gtType", this field has a different type than "tree->gtOper". I guess the mistake was made by copying the constant. That is, there should be another constant besides GT_CLS_VAR in the expression.

V501 There are identical sub-expressions 'DECODE_PSP_SYM' to the left and to the right of the '|' operator. daccess 264

enum GcInfoDecoderFlags { DECODE_SECURITY_OBJECT = 0x01, DECODE_CODE_LENGTH = 0x02, DECODE_VARARG = 0x04, DECODE_INTERRUPTIBILITY = 0x08, DECODE_GC_LIFETIMES = 0x10, DECODE_NO_VALIDATION = 0x20, DECODE_PSP_SYM = 0x40, DECODE_GENERICS_INST_CONTEXT = 0x80, DECODE_GS_COOKIE = 0x100, DECODE_FOR_RANGES_CALLBACK = 0x200, DECODE_PROLOG_LENGTH = 0x400, DECODE_EDIT_AND_CONTINUE = 0x800, }; size_t GCDump::DumpGCTable(PTR_CBYTE table, ....) { GcInfoDecoder hdrdecoder(table, (GcInfoDecoderFlags)( DECODE_SECURITY_OBJECT | DECODE_GS_COOKIE | DECODE_CODE_LENGTH | DECODE_PSP_SYM // <=1 | DECODE_VARARG | DECODE_PSP_SYM // <=1 | DECODE_GENERICS_INST_CONTEXT // <=2 | DECODE_GC_LIFETIMES | DECODE_GENERICS_INST_CONTEXT // <=2 | DECODE_PROLOG_LENGTH), 0); .... }

Here we have even two duplicated constants, although the "GcInfoDecoderFlags" enumeration includes other constants which are not used in the condition.

Other similar fragments:

V501 There are identical sub-expressions 'varLoc1.vlStk2.vls2BaseReg' to the left and to the right of the '==' operator. cee_wks util.cpp 657

V501 There are identical sub-expressions 'varLoc1.vlStk2.vls2Offset' to the left and to the right of the '==' operator. cee_wks util.cpp 658

V501 There are identical sub-expressions 'varLoc1.vlFPstk.vlfReg' to the left and to the right of the '==' operator. cee_wks util.cpp 661

V700 Consider inspecting the 'T foo = foo = ...' expression. It is odd that variable is initialized through itself. cee_wks zapsig.cpp 172

BOOL ZapSig::GetSignatureForTypeHandle(....) { .... CorElementType elemType = elemType = TryEncodeUsingShortcut(pMT); .... }

Seems just like an excessive assignment, but errors like this are often made when copying code, the programmer forgetting to rename some entity. Anyway, the code doesn't make sense that way.

V523 The 'then' statement is equivalent to the 'else' statement. cee_wks threadsuspend.cpp 2468

enum __MIDL___MIDL_itf_mscoree_0000_0004_0001 { OPR_ThreadAbort = 0, OPR_ThreadRudeAbortInNonCriticalRegion = .... , OPR_ThreadRudeAbortInCriticalRegion = ....) , OPR_AppDomainUnload = .... , OPR_AppDomainRudeUnload = ( OPR_AppDomainUnload + 1 ) , OPR_ProcessExit = ( OPR_AppDomainRudeUnload + 1 ) , OPR_FinalizerRun = ( OPR_ProcessExit + 1 ) , MaxClrOperation = ( OPR_FinalizerRun + 1 ) } EClrOperation; void Thread::SetRudeAbortEndTimeFromEEPolicy() { LIMITED_METHOD_CONTRACT; DWORD timeout; if (HasLockInCurrentDomain()) { timeout = GetEEPolicy()-> GetTimeout(OPR_ThreadRudeAbortInCriticalRegion); // <= } else { timeout = GetEEPolicy()-> GetTimeout(OPR_ThreadRudeAbortInCriticalRegion); // <= } .... }

This diagnostic detects identical blocks in if/else constructs. And here, we are also dealing with what seems to be a typo in a constant. In the first case, as suggested by the logic of the code, it is "OPR_ThreadRudeAbortInNonCriticalRegion" that fits best here.

Other similar fragments:

V523 The 'then' statement is equivalent to the 'else' statement. ClrJit instr.cpp 3427

V523 The 'then' statement is equivalent to the 'else' statement. ClrJit flowgraph.cpp 18815

V523 The 'then' statement is equivalent to the 'else' statement. daccess dacdbiimpl.cpp 6374

Constructor initialization list

V670 The uninitialized class member 'gcInfo' is used to initialize the 'regSet' member. Remember that members are initialized in the order of their declarations inside a class. ClrJit codegencommon.cpp 92

CodeGenInterface *getCodeGenerator(Compiler *comp); class CodeGenInterface { friend class emitter; public: .... RegSet regSet; // <= line 91 .... public: GCInfo gcInfo; // <= line 322 .... }; // CodeGen constructor CodeGenInterface::CodeGenInterface(Compiler* theCompiler) : compiler(theCompiler), gcInfo(theCompiler), regSet(theCompiler, gcInfo) { }

Under the standard, the class members are initialized in the constructor in the same order as they are declared in the class. To fix the error, we should move the declaration of the 'gcInfo' class member above that of 'regSet'.

A false, yet useful warning

V705 It is possible that 'else' block was forgotten or commented out, thus altering the program's operation logics. daccess daccess.cpp 2979

HRESULT Initialize() { if (hdr.dwSig == sig) { m_rw = eRO; m_MiniMetaDataBuffSizeMax = hdr.dwTotalSize; hr = S_OK; } else // when the DAC initializes this for the case where the target is // (a) a live process, or (b) a full dump, buff will point to a // zero initialized memory region (allocated w/ VirtualAlloc) if (hdr.dwSig == 0 && hdr.dwTotalSize == 0 && hdr.dwCntStreams == 0) { hr = S_OK; } // otherwise we may have some memory corruption. treat this as // a liveprocess/full dump else { hr = S_FALSE; } .... }

The analyzer has detected a suspicious code fragment. You can see that the code is commented ON, and everything works fine. But errors like this are very frequent when the code after 'else' is commented OUT, the operator following it becoming a part of the condition. There is no error in this particular case but it might well appear there, when editing this fragment in the future.

A 64-bit error

V673 The '0xefefefef << 28' expression evaluates to 1080581331517177856. 60 bits are required to store the value, but the expression evaluates to the 'unsigned' type which can only hold '32' bits. cee_dac _dac object.inl 95

inline void Object::EnumMemoryRegions(void) { .... SIZE_T size = sizeof(ObjHeader) + sizeof(Object); .... size |= 0xefefefef << 28; .... }

For the definition of the term "64-bit error", please see this note. In the example above, after the shift, the "size |= 0xf0000000" operation will be executed in the 32-bit program and "size |= 0x00000000f0000000" in the 64-bit one. The programmer most likely wanted the following calculations to be done in the 64-bit program: "size |= 0x0efefefef0000000". But where have we lost the most significant part of the number?

The number "0xefefefef" has the 'unsigned' type, as it doesn't fit into the 'int' type. A shift of a 32-bit number occurs, which results in unsigned 0xf0000000. Then this unsigned number is extended to SIZE_T and we get 0x00000000f0000000.

To have the code to work right, we need to execute an explicit type conversion first. This is the fixed code:

inline void Object::EnumMemoryRegions(void) { .... SIZE_T size = sizeof(ObjHeader) + sizeof(Object); .... size |= SIZE_T(0xefefefef) << 28; .... }

Another issue of the same kind:

V673 The '0xefefefef << 28' expression evaluates to 1080581331517177856. 60 bits are required to store the value, but the expression evaluates to the 'unsigned' type which can only hold '32' bits. cee_dac dynamicmethod.cpp 807

"Retired" code

Sometimes you may find conditions that literally contradict each other.

V637 Two opposite conditions were encountered. The second condition is always false. Check lines: 31825, 31827. cee_wks gc.cpp 31825

void gc_heap::verify_heap (BOOL begin_gc_p) { .... if (brick_table [curr_brick] < 0) { if (brick_table [curr_brick] == 0) { dprintf(3, ("curr_brick %Ix for object %Ix set to 0", curr_brick, (size_t)curr_object)); FATAL_GC_ERROR(); } .... } .... }

This code never gets control, but it doesn't look that critical, as in the following example:

V517 The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a probability of logical error presence. Check lines: 2353, 2391. utilcode util.cpp 2353

void PutIA64Imm22(UINT64 * pBundle, UINT32 slot, INT32 imm22) { if (slot == 0) { const UINT64 mask0 = UI64(0xFFFFFC000603FFFF); /* Clear all bits used as part of the imm22 */ pBundle[0] &= mask0; UINT64 temp0; temp0 = (UINT64) (imm22 & 0x200000) << 20; // 1 s temp0 |= (UINT64) (imm22 & 0x1F0000) << 11; // 5 imm5c temp0 |= (UINT64) (imm22 & 0x00FF80) << 25; // 9 imm9d temp0 |= (UINT64) (imm22 & 0x00007F) << 18; // 7 imm7b /* Or in the new bits used in the imm22 */ pBundle[0] |= temp0; } else if (slot == 1) { .... } else if (slot == 0) // <= { const UINT64 mask1 = UI64(0xF000180FFFFFFFFF); /* Clear all bits used as part of the imm22 */ pBundle[1] &= mask1; UINT64 temp1; temp1 = (UINT64) (imm22 & 0x200000) << 37; // 1 s temp1 |= (UINT64) (imm22 & 0x1F0000) << 32; // 5 imm5c temp1 |= (UINT64) (imm22 & 0x00FF80) << 43; // 9 imm9d temp1 |= (UINT64) (imm22 & 0x00007F) << 36; // 7 imm7b /* Or in the new bits used in the imm22 */ pBundle[1] |= temp1; } FlushInstructionCache(GetCurrentProcess(),pBundle,16); }

Perhaps it is a very important piece of code that never gets control, because of a bug in the cascade of conditional operators.

Other suspicious fragments:

V637 Two opposite conditions were encountered. The second condition is always false. Check lines: 2898, 2900. daccess nidump.cpp 2898

V637 Two opposite conditions were encountered. The second condition is always false. Check lines: 337, 339. utilcode prettyprintsig.cpp 337

V637 Two opposite conditions were encountered. The second condition is always false. Check lines: 774, 776. utilcode prettyprintsig.cpp 774

Undefined behavior

V610 Undefined behavior. Check the shift operator '<<'. The left operand '-1' is negative. bcltype metamodel.h 532

inline static mdToken decodeToken(....) { //<TODO>@FUTURE: make compile-time calculation</TODO> ULONG32 ix = (ULONG32)(val & ~(-1 << m_cb[cTokens])); if (ix >= cTokens) return rTokens[0]; return TokenFromRid(val >> m_cb[cTokens], rTokens[ix]); }

The analyzer has detected a negative-number shift causing undefined behavior.

V610 Undefined behavior. Check the shift operator '<<'. The left operand '(~0)' is negative. cee_dac decodemd.cpp 456

#define bits_generation 2 #define generation_mask (~(~0 << bits_generation)) #define MASK(len) (~((~0)<<len)) #define MASK64(len) ((~((~((unsigned __int64)0))<<len))) void Encoder::Add(unsigned value, unsigned length) { .... value = (value & MASK(length)); .... }

Thanks to the V610 message, I discovered a number of incorrect macros. '~0' is cast to a signed negative number of the int type, after which a shift is executed. Just like in one of the macros, an explicit conversion to the unsigned type is necessary:

#define bits_generation 2 #define generation_mask (~(~((unsigned int)0) << bits_generation)) #define MASK(len) (~((~((unsigned int)0))<<len)) #define MASK64(len) ((~((~((unsigned __int64)0))<<len)))

Incorrect sizeof(xx)

V579 The DacReadAll function receives the pointer and its size as arguments. It is possibly a mistake. Inspect the third argument. daccess dacimpl.h 1688

template<class T> inline bool MisalignedRead(CORDB_ADDRESS addr, T *t) { return SUCCEEDED(DacReadAll(TO_TADDR(addr), t, sizeof(t), false)); }

Here's just a small function which always takes the pointer size. The programmer most likely intended to write it like "sizeof(*t)", or maybe "sizeof(T)".

Another good example:

V579 The Read function receives the pointer and its size as arguments. It is possibly a mistake. Inspect the third argument. util.cpp 4943

HRESULT GetMTOfObject(TADDR obj, TADDR *mt) { if (!mt) return E_POINTER; HRESULT hr = rvCache->Read(obj, mt, sizeof(mt), NULL); if (SUCCEEDED(hr)) *mt &= ~3; return hr; }

The family of "memFAIL" functions

When using memXXX-functions, one risks making a variety of mistakes. The analyzer provides a number of diagnostic rules to detect such fragments.

V512 A call of the 'memset' function will lead to underflow of the buffer 'pAddExpression'. sos strike.cpp 11973

DECLARE_API(Watch) { .... if(addExpression.data != NULL || aExpression.data != NULL) { WCHAR pAddExpression[MAX_EXPRESSION]; memset(pAddExpression, 0, MAX_EXPRESSION); swprintf_s(pAddExpression, MAX_EXPRESSION, L"%S", ....); Status = g_watchCmd.Add(pAddExpression); } .... }

A very common bug when programmers forget to allow for the type size:

WCHAR pAddExpression[MAX_EXPRESSION]; memset(pAddExpression, 0, sizeof(WCHAR)*MAX_EXPRESSION);

Other similar fragments:

V512 A call of the 'memset' function will lead to underflow of the buffer 'pSaveName'. sos strike.cpp 11997

V512 A call of the 'memset' function will lead to underflow of the buffer 'pOldName'. sos strike.cpp 12013

V512 A call of the 'memset' function will lead to underflow of the buffer 'pNewName'. sos strike.cpp 12016

V512 A call of the 'memset' function will lead to underflow of the buffer 'pExpression'. sos strike.cpp 12024

V512 A call of the 'memset' function will lead to underflow of the buffer 'pFilterName'. sos strike.cpp 12039

V598 The 'memcpy' function is used to copy the fields of 'GenTree' class. Virtual table pointer will be damaged by this. ClrJit compiler.hpp 1344

struct GenTree { .... #if DEBUGGABLE_GENTREE virtual void DummyVirt() {} #endif // DEBUGGABLE_GENTREE .... }; void GenTree::CopyFrom(const GenTree* src, Compiler* comp) { .... memcpy(this, src, src->GetNodeSize()); .... }

When the preprocessor variable 'DEBUGGABLE_GENTREE' is declared, a virtual function is defined. The class then contains a pointer to the virtual method table, and cannot be copied that freely.

V598 The 'memcpy' function is used to copy the fields of 'GCStatistics' class. Virtual table pointer will be damaged by this. cee_wks gc.cpp 287

struct GCStatistics : public StatisticsBase { .... virtual void Initialize(); virtual void DisplayAndUpdate(); .... }; GCStatistics g_LastGCStatistics; void GCStatistics::DisplayAndUpdate() { .... memcpy(&g_LastGCStatistics, this, sizeof(g_LastGCStatistics)); .... }

In this fragment, incorrect copying is done in any mode, not only the debugging one.

V698 Expression 'memcmp(....) == -1' is incorrect. This function can return not only the value '-1', but any negative value. Consider using 'memcmp(....) < 0' instead. sos util.cpp 142

bool operator( )(const GUID& _Key1, const GUID& _Key2) const { return memcmp(&_Key1, &_Key2, sizeof(GUID)) == -1; }

It's incorrect to compare the result of the 'memcmp' function to 1 or -1. Whether or not such constructs will work depends on the libraries, compiler and its settings, the operating system and its bitness, and so on. In cases like this, you should check one of the three states: '< 0', '0', or '> 0'.

One more issue of the same kind:

V698 Expression 'wcscmp(....) == -1' is incorrect. This function can return not only the value '-1', but any negative value. Consider using 'wcscmp(....) < 0' instead. sos strike.cpp 3855

Pointers

V522 Dereferencing of the null pointer 'hp' might take place. cee_wks gc.cpp 4488

heap_segment* gc_heap::get_segment_for_loh (size_t size #ifdef MULTIPLE_HEAPS , gc_heap* hp #endif //MULTIPLE_HEAPS ) { #ifndef MULTIPLE_HEAPS gc_heap* hp = 0; #endif //MULTIPLE_HEAPS heap_segment* res = hp->get_segment (size, TRUE); .... }

When 'MULTIPLE_HEAPS' is not defined, it's no good, because the pointer will equal zero.

V595 The 'tree' pointer was utilized before it was verified against nullptr. Check lines: 6970, 6976. ClrJit gentree.cpp 6970

void Compiler::gtDispNode(GenTreePtr tree, ....) { .... if (tree->gtOper >= GT_COUNT) { printf(" **** ILLEGAL NODE ****"); return; } if (tree && printFlags) { /* First print the flags associated with the node */ switch (tree->gtOper) { .... } .... } .... }

There are many fragments in the project's source code where pointers are checked for being valid - but only after they have been dereferenced.

Here's a complete list of all the fragments of that kind: CoreCLR_V595.txt.

Excessive checks

Even if excessive code doesn't do any harm, its mere presence may distract the programmers' attention from working on more important things.

V503 This is a nonsensical comparison: pointer >= 0. cee_wks gc.cpp 21707

void gc_heap::make_free_list_in_brick (BYTE* tree, make_free_args* args) { assert ((tree >= 0)); .... }

A nice pointer check, huh? Two more examples:

V503 This is a nonsensical comparison: pointer >= 0. cee_wks gc.cpp 23204

V503 This is a nonsensical comparison: pointer >= 0. cee_wks gc.cpp 27683

V547 Expression 'maxCpuId >= 0' is always true. Unsigned type value is always >= 0. cee_wks codeman.cpp 1219

void EEJitManager::SetCpuInfo() { .... unsigned char buffer[16]; DWORD maxCpuId = getcpuid(0, buffer); if (maxCpuId >= 0) { .... }

A similar example, but with the DWORD type.

V590 Consider inspecting the 'wzPath[0] != L'\0' && wzPath[0] == L'\\'' expression. The expression is excessive or contains a misprint. cee_wks path.h 62

static inline bool HasUncPrefix(LPCWSTR wzPath) { _ASSERTE(!clr::str::IsNullOrEmpty(wzPath)); return wzPath[0] != W('\0') && wzPath[0] == W('\\') && wzPath[1] != W('\0') && wzPath[1] == W('\\') && wzPath[2] != W('\0') && wzPath[2] != W('?'); }

This function can be simplified to the following code:

static inline bool HasUncPrefix(LPCWSTR wzPath) { _ASSERTE(!clr::str::IsNullOrEmpty(wzPath)); return wzPath[0] == W('\\') && wzPath[1] == W('\\') && wzPath[2] != W('\0') && wzPath[2] != W('?'); }

Another fragment:

V590 Consider inspecting this expression. The expression is excessive or contains a misprint. cee_wks path.h 72

V571 Recurring check. The 'if (moduleInfo[MSCORWKS].baseAddr == 0)' condition was already verified in line 749. sos util.cpp 751

struct ModuleInfo { ULONG64 baseAddr; ULONG64 size; BOOL hasPdb; }; HRESULT CheckEEDll() { .... // Do we have clr.dll if (moduleInfo[MSCORWKS].baseAddr == 0) // <= { if (moduleInfo[MSCORWKS].baseAddr == 0) // <= g_ExtSymbols->GetModuleByModuleName ( MAIN_CLR_MODULE_NAME_A,0,NULL, &moduleInfo[MSCORWKS].baseAddr); if (moduleInfo[MSCORWKS].baseAddr != 0 && // <= moduleInfo[MSCORWKS].hasPdb == FALSE) { .... } .... } .... }

There's no need to check 'baseAddr' in the second case.

V704 'this == nullptr' expression should be avoided - this expression is always false on newer compilers, because 'this' pointer can never be NULL. ClrJit gentree.cpp 12731

bool FieldSeqNode::IsFirstElemFieldSeq() { if (this == nullptr) return false; return m_fieldHnd == FieldSeqStore::FirstElemPseudoField; }

Under the C++ standard, the 'this' pointer can never be null. For details about possible consequences of code such as the sample above, see the description of the V704 diagnostic. Such code working correctly after having been compiled by the Visual C++ compiler is mere luck, and one can't really rely on that.

The complete list of all other fragments of this kind: CoreCLR_V704.txt.

V668 There is no sense in testing the 'newChunk' pointer against null, as the memory was allocated using the 'new' operator. The exception will be generated in the case of a memory allocation error. ClrJit stresslog.h 552

FORCEINLINE BOOL GrowChunkList () { .... StressLogChunk * newChunk = new StressLogChunk (....); if (newChunk == NULL) { return FALSE; } .... }

If the 'new' operator has failed to allocate memory, then it must throw the std::bad_alloc() exception, as required by the C++ language standard. Therefore, checking the pointer for being null doesn't make any sense here.

I advise reviewing all the fragments of this kind. Here's a complete list: CoreCLR_V668.txt.

Conclusion

The recently published CoreCLR project is a nice example of what a proprietary software product's code may look like. There are unceasing debates on this subject, so here's another topic for you to think about, and argue on.

What is personally important for us, is the fact that there are always some bugs to be found in any large project, and that the best way to use a static analyzer is to use it regularly. Don't be lazy, go download PVS-Studio, and check your project.