This year PVS-Studio static analyzer turned 10. Although, we should clarify the point that 10 years ago it was called Viva64. Another interesting point: it's been 5 years since the previous check of the Notepad++ project. During this period of time the analyzer was significantly improved: about 190 new diagnostics were added and the old ones got refined. However, we cannot expect to see a large number of errors in Notepad++. It is quite a small project that has only 123 files with the source code. Nevertheless, there are still errors that are worth fixing.

Introduction

Notepad++ - a free open source text editor for Windows with syntax highlighting for a large number of programming languages and markup. It is based on the Scintilla component, written in C++ using STL and Windows API and is distributed under GNU General Public License.

To my mind, Notepad++ is a great text editor. I myself use it for everything besides writing the code. To analyze the source code I used PVS-Studio 6.15. Notepad++ project was already checked in 2010 and 2012. Now we found 84 warnings of the High level, 124 warnings of the Medium level and 548 Low level warnings. The levels denote the degree of certainty in the detected errors. So, out of 84 most certain warnings (High level), 81 point to real issues in the code - they need to be fixed right away without digging deep in the logic of the program, as the flaws are really obvious.

Note. Besides reviewing the results of the static analyzer, it would be useful to improve the code by deciding: to use spaces or tabs for indentation. Absolutely whole code looks like this:

Figure 1 - various indentations in the code.

Let's take a look at some errors that seemed the most interesting to me.

Inheritance problems

V599 The virtual destructor is not present, although the 'FunctionParser' class contains virtual functions. functionparser.cpp 39

class FunctionParser { friend class FunctionParsersManager; public: FunctionParser(....): ....{}; virtual void parse(....) = 0; void funcParse(....); bool isInZones(....); protected: generic_string _id; generic_string _displayName; generic_string _commentExpr; generic_string _functionExpr; std::vector<generic_string> _functionNameExprArray; std::vector<generic_string> _classNameExprArray; void getCommentZones(....); void getInvertZones(....); generic_string parseSubLevel(....); }; std::vector<FunctionParser *> _parsers; FunctionParsersManager::~FunctionParsersManager() { for (size_t i = 0, len = _parsers.size(); i < len; ++i) { delete _parsers[i]; // <= } if (_pXmlFuncListDoc) delete _pXmlFuncListDoc; }

The analyzer found a serious error that led to incomplete destruction of objects. The base class FunctionParser has a virtual function parse(), but it doesn't have a virtual destructor. In the inheritance hierarchy of this class there are such classes as FunctionZoneParser, FunctionUnitParser and FunctionMixParser:

class FunctionZoneParser : public FunctionParser { public: FunctionZoneParser(....): FunctionParser(....) {}; void parse(....); protected: void classParse(....); private: generic_string _rangeExpr; generic_string _openSymbole; generic_string _closeSymbole; size_t getBodyClosePos(....); }; class FunctionUnitParser : public FunctionParser { public: FunctionUnitParser(....): FunctionParser(....) {} void parse(....); }; class FunctionMixParser : public FunctionZoneParser { public: FunctionMixParser(....): FunctionZoneParser(....), ....{}; ~FunctionMixParser() { delete _funcUnitPaser; } void parse(....); private: FunctionUnitParser* _funcUnitPaser = nullptr; };

I made an inheritance scheme for these classes:

Figure 2 - Scheme of inheritance from the FunctionParser class

Thus, the created objects will not be completely destroyed. This will result in undefined behavior. We cannot say for sure how the program will work after the UB, but in practice in this case we will have a memory leak as a minimum, as the code "delete _funcUnitPaser" won't be executed.

Let's consider the following error:

V762 It is possible a virtual function was overridden incorrectly. See first argument of function 'redraw' in derived class 'SplitterContainer' and base class 'Window'. splittercontainer.h 61

class Window { .... virtual void display(bool toShow = true) const { ::ShowWindow(_hSelf, toShow ? SW_SHOW : SW_HIDE); } virtual void redraw(bool forceUpdate = false) const { ::InvalidateRect(_hSelf, nullptr, TRUE); if (forceUpdate) ::UpdateWindow(_hSelf); } .... } class SplitterContainer : public Window { .... virtual void display(bool toShow = true) const; // <= good virtual void redraw() const; // <= error .... }

There were several issues with function overloading in Notepad++. In the class SplitterContainer, inherited from the Window class, the display() method is overloaded correctly, but a programmer made a mistake when overloading the redraw() method.

Several more incorrect fragments:

V762 It is possible a virtual function was overridden incorrectly. See third argument of function 'create' in derived class 'UserDefineDialog' and base class 'StaticDialog'. userdefinedialog.h 332

V762 It is possible a virtual function was overridden incorrectly. See third argument of function 'create' in derived class 'FindReplaceDlg' and base class 'StaticDialog'. findreplacedlg.h 245

V762 It is possible a virtual function was overridden incorrectly. See third argument of function 'create' in derived class 'GoToLineDlg' and base class 'StaticDialog'. gotolinedlg.h 45

V762 It is possible a virtual function was overridden incorrectly. See third argument of function 'create' in derived class 'FindCharsInRangeDlg' and base class 'StaticDialog'. findcharsinrange.h 52

V762 It is possible a virtual function was overridden incorrectly. See third argument of function 'create' in derived class 'ColumnEditorDlg' and base class 'StaticDialog'. columneditor.h 45

V762 It is possible a virtual function was overridden incorrectly. See third argument of function 'create' in derived class 'WordStyleDlg' and base class 'StaticDialog'. wordstyledlg.h 77

V762 It is possible a virtual function was overridden incorrectly. See first argument of function 'redraw' in derived class 'WordStyleDlg' and base class 'Window'. wordstyledlg.h 99

V762 It is possible a virtual function was overridden incorrectly. See third argument of function 'create' in derived class 'PluginsAdminDlg' and base class 'StaticDialog'. pluginsadmin.h 107

Memory leak

V773 The function was exited without releasing the 'pXmlDocProject' pointer. A memory leak is possible. projectpanel.cpp 326

bool ProjectPanel::openWorkSpace(const TCHAR *projectFileName) { TiXmlDocument *pXmlDocProject = new TiXmlDocument(....); bool loadOkay = pXmlDocProject->LoadFile(); if (!loadOkay) return false; // <= TiXmlNode *root = pXmlDocProject->FirstChild(TEXT("Note....")); if (!root) return false; // <= TiXmlNode *childNode = root->FirstChildElement(TEXT("Pr....")); if (!childNode) return false; // <= if (!::PathFileExists(projectFileName)) return false; // <= .... delete pXmlDocProject; // <= free pointer return loadOkay; }

This function is also an interesting example of a memory leak. Dynamic memory is allocated for the pointer pXmlDocProject, but it is freed only when the function is executed till the end. Which is, most likely, a flaw, leading to memory leaks.

V773 Visibility scope of the 'pTextFind' pointer was exited without releasing the memory. A memory leak is possible. findreplacedlg.cpp 1577

bool FindReplaceDlg::processReplace(....) { .... TCHAR *pTextFind = new TCHAR[stringSizeFind + 1]; TCHAR *pTextReplace = new TCHAR[stringSizeReplace + 1]; lstrcpy(pTextFind, txt2find); lstrcpy(pTextReplace, txt2replace); .... }

The function processReplace() is called upon every replacement of a substring in a document. The memory is allocated for two buffers: pTextFind and pTextReplace. The search string is copied into one buffer, to the other - a replacement string. There are several errors here which may cause a memory leak:

The buffer pTextFind doesn't get cleared and is not used in the function at all. A search buffer txt2find is used for the replacement.

The pTextReplace buffer is not used further on, but the memory doesn't get freed.

Conclusion: every text-replace operation leads to the leak of several bytes. The bigger is the search string and the more matches, the more memory leaks.

Errors with pointers

V595 The 'pScint' pointer was utilized before it was verified against nullptr. Check lines: 347, 353. scintillaeditview.cpp 347

LRESULT CALLBACK ScintillaEditView::scintillaStatic_Proc(....) { ScintillaEditView *pScint = (ScintillaEditView *)(....); if (Message == WM_MOUSEWHEEL || Message == WM_MOUSEHWHEEL) { .... if (isSynpnatic || makeTouchPadCompetible) return (pScint->scintillaNew_Proc(....); // <= .... } if (pScint) return (pScint->scintillaNew_Proc(....)); else return ::DefWindowProc(hwnd, Message, wParam, lParam); }

In one fragment a programmer missed a check of the pScint pointer for validity.

V713 The pointer _langList[i] was utilized in the logical expression before it was verified against nullptr in the same logical expression. parameters.h 1286

Lang * getLangFromID(LangType langID) const { for (int i = 0 ; i < _nbLang ; ++i) { if ((_langList[i]->_langID == langID) || (!_langList[i])) return _langList[i]; } return nullptr; }

The author of the code made a mistake when writing a conditional statement. First he addresses the field _langID, using a pointer _langList[i], and then compares this pointer with null.

Most likely, the correct code should be like this:

Lang * getLangFromID(LangType langID) const { for (int i = 0 ; i < _nbLang ; ++i) { if ( _langList[i] && _langList[i]->_langID == langID ) return _langList[i]; } return nullptr; }

Miscellaneous errors

V501 There are identical sub-expressions to the left and to the right of the '!=' operator: subject != subject verifysignedfile.cpp 250

bool VerifySignedLibrary(...., const wstring& cert_subject, ....) { wstring subject; .... if ( status && !cert_subject.empty() && subject != subject) { status = false; OutputDebugString( TEXT("VerifyLibrary: Invalid certificate subject

")); } .... }

I remember that in Notepad++ there was found a vulnerability allowing replacing the editor components with modified ones. There were integrity checks added. I am not quite sure, if this code was written to fix the vulnerability, but judging by the function name, we can say that it serves for an important check.

The check

subject != subject

looks extremely suspicious, and most likely, it should be like this:

if ( status && !cert_subject.empty() && cert_subject != subject) { .... }

V560 A part of conditional expression is always true: 0xff. babygrid.cpp 711

TCHAR GetASCII(WPARAM wParam, LPARAM lParam) { int returnvalue; TCHAR mbuffer[100]; int result; BYTE keys[256]; WORD dwReturnedValue; GetKeyboardState(keys); result = ToAscii(static_cast<UINT>(wParam), (lParam >> 16) && 0xff, keys, &dwReturnedValue, 0); // <= returnvalue = (TCHAR) dwReturnedValue; if(returnvalue < 0){returnvalue = 0;} wsprintf(mbuffer, TEXT("return value = %d"), returnvalue); if(result!=1){returnvalue = 0;} return (TCHAR)returnvalue; }

Statements that are always true or always false look very suspicious. The constant 0xff is always true. Perhaps, there was a typo in the operator and the parameter of the function ToAscii() should be like this:

(lParam >> 16) & 0xff

V746 Type slicing. An exception should be caught by reference rather than by value. filedialog.cpp 183

TCHAR* FileDialog::doOpenSingleFileDlg() { .... try { fn = ::GetOpenFileName(&_ofn)?_fileName:NULL; if (params->getNppGUI()._openSaveDir == dir_last) { ::GetCurrentDirectory(MAX_PATH, dir); params->setWorkingDir(dir); } } catch(std::exception e) { // <= ::MessageBoxA(NULL, e.what(), "Exception", MB_OK); } catch(...) { ::MessageBox(NULL, TEXT("....!!!"), TEXT(""), MB_OK); } ::SetCurrentDirectory(dir); return (fn); }

It is better to catch exceptions by reference. The problem of such code is that a new object will be created, which will lead to the loss of the information about the exception during the catching. Everything that was stored in the classes inherited from Exception, will be lost.

V519 The 'lpcs' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 3116, 3117. babygrid.cpp 3117

LRESULT CALLBACK GridProc(HWND hWnd, UINT message, WPARAM wParam, LPARAM lParam) { .... case WM_CREATE: lpcs = &cs; lpcs = (LPCREATESTRUCT)lParam; .... }

An old value was instantly overridden with a new one. It looks like an error. If everything works correctly now, then we should leave only the second string with the assignment and remove the first one.

V601 The 'false' value becomes a class object. treeview.cpp 121

typedef std::basic_string<TCHAR> generic_string; generic_string TreeView::getItemDisplayName(....) const { if (not Item2Set) return false; // <= TCHAR textBuffer[MAX_PATH]; TVITEM tvItem; tvItem.hItem = Item2Set; tvItem.mask = TVIF_TEXT; tvItem.pszText = textBuffer; tvItem.cchTextMax = MAX_PATH; SendMessage(...., reinterpret_cast<LPARAM>(&tvItem)); return tvItem.pszText; }

The return value of the function is a string, but someone decided to make "return false" instead of an empty string.

Code cleaning

There is no point in doing refactoring in sake of refactoring, there are way more interesting and useful tasks in any project. What we should do is get rid of useless code.

V668 There is no sense in testing the 'source' pointer against null, as the memory was allocated using the 'new' operator. The exception will be generated in the case of memory allocation error. notepad_plus.cpp 1149

void Notepad_plus::wsTabConvert(spaceTab whichWay) { .... char * source = new char[docLength]; if (source == NULL) return; .... }

Why is this check necessary here in general? According to the modern C++ standard, the new operator throws an exception upon the lack of memory, it doesn't return nullptr.

This function is called upon the replacement of all the tab symbols with spaces in the whole document. Having taken a large text document, I saw that the lack of memory really leads to the program crash.

If the check is corrected, then the operation of symbol correction will be canceled and it will be possible to use the editor further on. All of these fragments need correction, besides that they are so many, that I had to make separate list of them in a file.

V713 The pointer commentLineSymbol was utilized in the logical expression before it was verified against nullptr in the same logical expression. notepad_plus.cpp 3928

bool Notepad_plus::doBlockComment(comment_mode currCommentMode) { .... if ((!commentLineSymbol) || // <= (!commentLineSymbol[0]) || (commentLineSymbol == NULL)) // <= WTF? { .... } .... }

There were ten such strange and useless checks:

V713 The pointer commentLineSymbol was utilized in the logical expression before it was verified against nullptr in the same logical expression. notepad_plus.cpp 3928

V713 The pointer commentStart was utilized in the logical expression before it was verified against nullptr in the same logical expression. notepad_plus.cpp 3931

V713 The pointer commentEnd was utilized in the logical expression before it was verified against nullptr in the same logical expression. notepad_plus.cpp 3931

V713 The pointer commentStart was utilized in the logical expression before it was verified against nullptr in the same logical expression. notepad_plus.cpp 4228

V713 The pointer commentEnd was utilized in the logical expression before it was verified against nullptr in the same logical expression. notepad_plus.cpp 4228

V713 The pointer commentLineSymbol was utilized in the logical expression before it was verified against nullptr in the same logical expression. notepad_plus.cpp 4229

V713 The pointer commentStart was utilized in the logical expression before it was verified against nullptr in the same logical expression. notepad_plus.cpp 6554

V713 The pointer commentEnd was utilized in the logical expression before it was verified against nullptr in the same logical expression. notepad_plus.cpp 6554

V713 The pointer commentLineSymbol was utilized in the logical expression before it was verified against nullptr in the same logical expression. notepad_plus.cpp 6555

V601 The 'true' value is implicitly cast to the integer type. pluginsadmin.cpp 603

INT_PTR CALLBACK PluginsAdminDlg::run_dlgProc(UINT message, ....) { switch (message) { case WM_INITDIALOG : { return TRUE; } .... case IDC_PLUGINADM_RESEARCH_NEXT: searchInPlugins(true); return true; case IDC_PLUGINADM_INSTALL: installPlugins(); return true; .... } .... }

The function run_dlgProc() is returning a value not of a logical type, even more so, the code returns either true/false, or TRUE/FALSE. First I wanted to write that at least all the indents are of the same kind, but it's not so: there is still a mixture of tabs and spaces in one of the 90 lines of the function. All the other lines have tabs. Yes, it's not critical, but the code looks to me as an observer, quite sloppy.

V704 '!this' expression in conditional statements should be avoided - this expression is always false on newer compilers, because 'this' pointer can never be NULL. notepad_plus.cpp 4980

void Notepad_plus::notifyBufferChanged(Buffer * buffer, int mask) { // To avoid to crash while MS-DOS style is set as default // language, // Checking the validity of current instance is necessary. if (!this) return; .... }

I would also classify such checks as useless code as well. As you can see in the comment, there used to be a problem with the dereference of the null this. According to the modern standard of the C++ language, such check is unnecessary.

Here is a list of all such fragments:

V704 'this && type == DOCUMENT' expression should be avoided: 'this' pointer can never be NULL on newer compilers. tinyxmla.h 505

V704 'this && type == ELEMENT' expression should be avoided: 'this' pointer can never be NULL on newer compilers. tinyxmla.h 506

V704 'this && type == COMMENT' expression should be avoided: 'this' pointer can never be NULL on newer compilers. tinyxmla.h 507

V704 'this && type == UNKNOWN' expression should be avoided: 'this' pointer can never be NULL on newer compilers. tinyxmla.h 508

V704 'this && type == TEXT' expression should be avoided: 'this' pointer can never be NULL on newer compilers. tinyxmla.h 509

V704 'this && type == DECLARATION' expression should be avoided: 'this' pointer can never be NULL on newer compilers. tinyxmla.h 510

V704 'this && type == DOCUMENT' expression should be avoided: 'this' pointer can never be NULL on newer compilers. tinyxml.h 505

V704 'this && type == ELEMENT' expression should be avoided: 'this' pointer can never be NULL on newer compilers. tinyxml.h 506

V704 'this && type == COMMENT' expression should be avoided: 'this' pointer can never be NULL on newer compilers. tinyxml.h 507

V704 'this && type == UNKNOWN' expression should be avoided: 'this' pointer can never be NULL on newer compilers. tinyxml.h 508

V704 'this && type == TEXT' expression should be avoided: 'this' pointer can never be NULL on newer compilers. tinyxml.h 509

V704 'this && type == DECLARATION' expression should be avoided: 'this' pointer can never be NULL on newer compilers. tinyxml.h 510

V704 'this' expression in conditional statements should be avoided - this expression is always true on newer compilers, because 'this' pointer can never be NULL. nppbigswitch.cpp 119

Conclusion

There were other errors found, where were not covered in the article. If desired, authors of Notepad++ can check the project themselves and examine the warnings. We are ready to provide a temporary license for this.

Of course, a simple user won't see such issues. RAM modules are quite large and cheap now. Nevertheless, the project is still developing and the quality of the code, as well as the convenience of its support may be greatly improved by fixing the detected errors and removing the layers of old code.

My evaluations are that PVS-Studio analyzer detected 2 real errors per 1000 lines of code. Of course, these aren't all errors. I think there would actually be 5-10 bugs per 1000 lines of code, which is a fairly low density of errors. The size of Notepad++ is 95 KLoc, which means that typical density of errors for projects of this kind is: 0-40 errors per 1000 lines of code. However, the source of these data about the average error density is quite old, I think that the code quality became much better.

I would like to thank the authors of Notepad++ for their work on this useful tool and wish them all the success further on.