Searching the source code is a popular feature of every IDE, and developers use it widely to find the places where a specific keyword appears. Some IDEs provide advanced search options such as regular expressions. But what about searching for a specific behavior in your source code?

By behavior I mean a treatment in which, within a method, some methods are called, some fields are assigned, and others are read. Let’s take as an example this code snippet from a financial library:

Suppose we want to find where in the code a euribor3m is instantiated, its methods fixingCalendar and fixingDays are called, and the advance method of the Calendar class is also called.

Cases where detecting behaviors is valuable:

1- The code is not well documented and we want to know where a specific treatment is done.

In my experience as a developer, during my first weeks on a project I always struggle to find where certain treatments are done, especially when the code is not well documented.

2- Detect duplicate treatment

The most common cause of duplicate code is copy/paste, and in that case the source code is identical in two or more places. Many tools detect this kind of cloned code; CCFinderX is one of the interesting open source tools available.

Here we are more interested in cases where the cloned code is not trivial to detect.

Case 1: Modified copy/pasted code

The major problem with copy/pasted code is that when one instance of the duplicate code is changed, its counterparts have to be changed at the same time. Unfortunately this is not always done, and the duplicate instances drift apart.

To avoid this kind of hidden duplicate code, don’t hesitate to use a tool like CCFinderX to discover the duplicate code instances, and at least tag them with comments if you don’t have time to refactor your code. This is very useful when a developer changes one instance of the duplicate code: the comment tells them that other places contain the same code. A developer who is not informed will change only one place, and the modified duplicate will be very difficult to detect in the future.

Case 2: Similar functionality

Copy/paste is not the only origin of duplicate behavior; another is the independent implementation of a similar functionality.

Here is a brief description of this second origin of duplicate code, from Wikipedia:

Functionality that is very similar to that in another part of a program is required and a developer independently writes code that is very similar to what exists elsewhere. Studies suggest, that such independently rewritten code is typically not syntactically similar.
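As a small illustration of this point, here is a sketch with two independently written routines that compute the same total. Both functions and their names are hypothetical and are not taken from any library. A token-based clone detector would likely find little in common between them, although their behavior is the same.

```python
# Two hypothetical functions with the same behavior but different syntax.
# They are illustrative only and do not come from a real code base.

def total_interest_loop(amounts, rate):
    """Sum the interest of each amount with an explicit loop."""
    total = 0.0
    for amount in amounts:
        total += amount * rate
    return total


def total_interest_expr(amounts, rate):
    """Compute the same total with a single expression."""
    return rate * sum(amounts)


if __name__ == "__main__":
    amounts = [100.0, 250.0, 50.0]
    # Both routines agree on the result even though their text differs.
    print(total_interest_loop(amounts, 0.5) == total_interest_expr(amounts, 0.5))
```

This is the kind of duplication that a search by behavior has a better chance of revealing than a search by text.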

3- Detect not recommended behaviors

In some cases technical managers set rules for using libraries, such as avoiding certain classes or a certain sequence of method calls.

How to search for behaviors in the source code?

When the duplicate code is not exactly the same, no tool can give you fully reliable results. A tool can only report suspicious duplicates, and it is the developers’ responsibility to check whether each one is really cloned code or just a false positive.

Each tool uses a specific algorithm to track this kind of duplicate behavior. We didn’t test any of these tools, but I think most of them are worth trying at least once: they could give you some interesting results that help you improve the design and implementation of your code, as we will discover later in this post.

Here we will talk about an algorithm introduced by the NDepend tool. It consists of defining sets of methods that use the same members, i.e. calling the same methods, reading the same fields, and writing the same fields. We call these sets suspect-sets. Suspect-sets are sorted by the number of shared members.
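The idea can be sketched in a few lines of Python. This is only a simplified illustration, not the actual Power-Tool implementation: it compares methods pairwise instead of building larger sets, and the usage data, the method names, and the min_shared threshold are all assumptions made for the example.

```python
from itertools import combinations

# Hypothetical usage data: for each method, the members it calls, reads, or writes.
# A real tool would extract this from the code model; these names are made up.
USAGE = {
    "Invoice::print": {"Order::total", "Order::customer", "Printer::write", "Invoice::m_count"},
    "Receipt::print": {"Order::total", "Order::customer", "Printer::write"},
    "Order::validate": {"Order::total", "Order::customer"},
    "Logger::flush": {"Printer::write"},
}


def suspect_pairs(usage, min_shared=2):
    """Return (method_a, method_b, shared_members) tuples, most shared members first."""
    result = []
    for a, b in combinations(sorted(usage), 2):
        shared = usage[a] & usage[b]  # members used by both methods
        if len(shared) >= min_shared:
            result.append((a, b, shared))
    # Sort by the number of shared members, as the suspect-sets are.
    result.sort(key=lambda item: len(item[2]), reverse=True)
    return result


if __name__ == "__main__":
    for a, b, shared in suspect_pairs(USAGE):
        print(f"{a} / {b}: {len(shared)} shared members")
```

With this sample data the pair Invoice::print and Receipt::print comes first, because the two methods share three members. Whether such a pair is real duplication still has to be checked by a developer.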

CppDepend also implements this algorithm as a CppDepend Power-Tool. Power-Tools are a set of open-source tools based on CppDepend.API. The source code of the Power-Tools can be found in $CppDependInstallPath$\CppDepend.PowerTools.SourceCode\CppDepend.PowerTools.sln.

To use it, run the Power-Tools and choose the command “Search for duplicate code”.

Another alternative is to use the CQLinq language.

A recurrent CQLinq pattern to detect behaviors could be defined like this:

from m in Methods where m.CreateA("Class1") && m.AssignField("Field1") && m.IsUsing("Method1") && m.IsUsing("Field3") && m.IsUsing("Field4") select m
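To show what such a query does outside of CQLinq, here is a minimal Python sketch of the same kind of filter over a toy in-memory model. The MethodFacts class, the find_behavior function, and the sample data are all hypothetical. They are not part of CppDepend or its API and only mirror the conditions of the query above.

```python
from dataclasses import dataclass, field


@dataclass
class MethodFacts:
    """Hypothetical record of what a method does (not a CppDepend type)."""
    name: str
    creates: set = field(default_factory=set)   # classes instantiated
    assigns: set = field(default_factory=set)   # fields written
    uses: set = field(default_factory=set)      # methods called or fields read


def find_behavior(methods, creates=(), assigns=(), uses=()):
    """Return the names of methods whose facts include every requested member."""
    want_creates, want_assigns, want_uses = set(creates), set(assigns), set(uses)
    return [
        m.name
        for m in methods
        if want_creates <= m.creates
        and want_assigns <= m.assigns
        and want_uses <= m.uses
    ]


if __name__ == "__main__":
    model = [
        MethodFacts("A::run", {"Class1"}, {"Field1"}, {"Method1", "Field3", "Field4"}),
        MethodFacts("B::run", {"Class1"}, set(), {"Method1", "Field3"}),
    ]
    # Same conditions as the query: create Class1, assign Field1, use three members.
    print(find_behavior(model, ["Class1"], ["Field1"], ["Method1", "Field3", "Field4"]))
```

Only A::run satisfies every condition here; B::run creates Class1 but never assigns Field1, so a search on the class name alone would have returned both.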

When searching for keywords we treat the source code as a set of tokens, and the logic of the code is not reflected. When searching for behaviors, however, we take into account the logic of the treatment, which can be helpful in many cases.