Today, I am happy to announce a guest post by Andrey Karpov about the evil of macros. Andrey is a Microsoft MVP in the “Developer Technologies” category and one of the founders of the PVS-Studio project. He is the author of a large number of articles, dedicated to code quality and telling about different patterns of errors that C++ programmers make. Andrey promotes methodologies of static and dynamic code analysis. You can find him online on Twitter and Facebook



The C++ language opens extensive opportunities to go without macros. So let’s try to use macros as seldom as possible!

It should be noted, however, that I’m not a fanatic and I don’t appeal to abandoning macros out of idealistic considerations. For example, when it comes to the manual generation of similar code, I can recognize the benefits of macros and deal with them. For example, I put things lightly to the macros in old programs written with usage of MFC. It makes no sense to fight against something like this:

BEGIN_MESSAGE_MAP(efcDialog, EFCDIALOG_PARENT ) //{{AFX_MSG_MAP(efcDialog) ON_WM_CREATE() ON_WM_DESTROY() //}}AFX_MSG_MAP END_MESSAGE_MAP()

There are such macros, let it be. Indeed, they have been created to simplify programming.

I’m talking about other macros, which developers use to avoid implementing a full function or try to reduce the function size. Let’s see some motives to avoid such macros.

First: Code with Macros Attracts Bugs

I don’t know how to explain the reasons for this phenomenon from a philosophical point of view, but it is so. Moreover, bugs related to macros, are often very difficult to notice when reviewing code.

I’m continually describing such cases in my articles. For example, the substitution of the isspace function with the following macro:

#define isspace(c) ((c)==' ' || (c) == '\t')

The developer, who was using the isspace thought that he was using the actual function, which considers not only spaces and tabs as space characters, but also LF, CR and some others. As a result, it turned out that one of the conditions was always true and the code worked not as intended. This error from Midnight Commander is described here.

How about such a reduction in writing the std::printf function?

#define sprintf std::printf

I think the reader realizes that it was quite an inappropriate macro. By the way, it was detected in the StarEngine project. You can read here in detail about it.

It could be argued that developers are to blame for these errors, not macros. Yes, it is so. Of course, developers are always bad guys when it comes to errors :).

What is important, is that macros provoke errors. It turns out that macros ought to be either used with greater concern or not used at all.

I could tell a long story of defects’ examples related to macros’ usage, which would turn this cute little note into a heavy multipage document. Of course, I’m not going to do it, but let me show you a couple of cases to drive the point home.

The ATL library provides such macros, as A2W, T2W and so on for string conversion. However, few people know that it is very dangerous to use these macros inside loops. Within the macro, a call to the alloca function occurs, which will repeatedly allocate memory on each loop iteration on the stack. A program makes show that it works correctly. Once a program starts to handle longer strings and the number of loop iterations increases, the stack can just end at the most unexpected moment. More details about this are available in this mini-book (see the chapter “Do not call the alloca() function inside loops”).

Such macros, as A2W, hide evil inside. They look like functions, but, in fact, have side effects that are difficult to notice.

I also cannot just pass by these attempts of reducing the code by using macros:

void initialize_sanitizer_builtins (void) { .... #define DEF_SANITIZER_BUILTIN(ENUM, NAME, TYPE, ATTRS) \ decl = add_builtin_function ("__builtin_" NAME, TYPE, ENUM, \ BUILT_IN_NORMAL, NAME, NULL_TREE); \ set_call_expr_flags (decl, ATTRS); \ set_builtin_decl (ENUM, decl, true); #include "sanitizer.def" if ((flag_sanitize & SANITIZE_OBJECT_SIZE) && !builtin_decl_implicit_p (BUILT_IN_OBJECT_SIZE)) DEF_SANITIZER_BUILTIN (BUILT_IN_OBJECT_SIZE, "object_size", BT_FN_SIZE_CONST_PTR_INT, ATTR_PURE_NOTHROW_LEAF_LIST) .... }

Only the first line of the macro is related to if operator. The rest will be executed regardless of the condition. We can say that this error is from the world of C, as it was found by me using the diagnostic V640 inside the GCC compiler. The GCC code is written basically in the C language, and in this language, it’s difficult to do without macros. However, I think you will agree that this is not the case. Here a developer could have written a real function.

Second: Complicated Code Reading

If you ever happened to face a project peppered with macros, consisting of other macros, then you are aware of how infernal it is to deal with such a project. If you haven’t, then, accept my word, it is very frustrating. An example of barely readable code is the GCC compiler already mentioned above.

According to the legend, Apple invested in the LLVM project as an alternative to GCC owing right to the great complexity of the GCC code due to these macros. I don’t remember where I read about it, so no proofs for it.

Third: It’s Difficult to Write Macros

Well, it’s easy to write a bad macro. I face them everywhere along with their related consequences. Whereas it’s often more difficult to write a good reliable macro than to write a similar function.

It’s a challenge to write a good macro because unlike a function, it cannot be considered as a separate entity. It is required to consider a macro right in the context of all possible options of its usage, otherwise one will likely get an extra headache such this one:

#define MIN(X, Y) (((X) < (Y)) ? (X) : (Y)) m = MIN(ArrayA[i++], ArrayB[j++]);

Sure, some workarounds have been invented long ago, and the macro can be safely implemented:

#define MAX(a,b) \ ({ __typeof__ (a) _a = (a); \ __typeof__ (b) _b = (b); \ _a > _b ? _a : _b; })

But here is a question – do we need all this in C++? No, in C++ there are templates and other ways to build efficient code. So why on earth do I still come across such macros in C++ programs?

Forth: Complicated Debugging

It is thought that debugging is for wimps :). It is certainly an interesting question for discussion, but from a practical point of view, debugging is useful and helps to find bugs. Macros complicate this process and definitely slow down the search for errors.

Fifth: False Positives of Static Analyzers

Many macros cause multiple false positives of static code analyzers due to their specific configuration. I can safely say that when checking C and C++ code most false positives relate right to macros.

The hitch with macros is that analyzers just cannot differentiate correct sly code from the erroneous code. In the article on Chromium check, there is a description of one of such macros.

What shall we do?

Let’s give up using macros in C++ programs unless absolutely necessary!

C++ provides a wealth of tools, such as templated functions, automatic type inference (auto, decltype) constexpr functions.

Almost always you can write an ordinary function instead of a macro. People often don’t do it because of plain laziness. This sloth is harmful, and we have to fight against it. A little extra time spent on writing a full function will be repaid with interest. It will be easier to read and maintain the code. The likelihood of shooting yourself in the foot will be less, compilers and static analyzers will issue fewer false positives.

Someone might argue that the code with a function is less efficient. This is also only the “excuse”.

Today compilers are good at inlining code even if you haven’t written the inline key word.

If we are talking about evaluating expressions at compile-time, macros are not needed and are even harmful. For the same purposes, it is much better and safer to use constexpr.

Let me explain it using an example: Here we have a classic error in a macro, which I’ve swiped from the FreeBSD Kernel code.

#define ICB2400_VPOPT_WRITE_SIZE 20 #define ICB2400_VPINFO_PORT_OFF(chan) \ (ICB2400_VPINFO_OFF + \ sizeof (isp_icb_2400_vpinfo_t) + \ (chan * ICB2400_VPOPT_WRITE_SIZE)) // <= static void isp_fibre_init_2400(ispsoftc_t *isp) { .... if (ISP_CAP_VP0(isp)) off += ICB2400_VPINFO_PORT_OFF(chan); else off += ICB2400_VPINFO_PORT_OFF(chan - 1); // <= .... }

The chan argument is used in the macro without wrapping in parentheses. As a result, not the (chan – 1) expression is multiplied by the constant ICB2400_VPOPT_WRITE_SIZE, but only the 1.

No error would have appeared if, instead of a macro, an ordinary function had been used.

size_t ICB2400_VPINFO_PORT_OFF(size_t chan) { return ICB2400_VPINFO_OFF + sizeof(isp_icb_2400_vpinfo_t) + chan * ICB2400_VPOPT_WRITE_SIZE; }

Most likely, a contemporary C and C++ compiler will independently perform a function inlining and the code will be just as effective, as in the case of a macro.

In addition, the code became more readable and correct.

If it is known that an input value is always a constant, then you can add constexpr and be sure, that all calculations will happen at compile time. Let’s imagine that it is written in the C++ language and chan is a constant. Then the function is better to be declared in the following way:

constexpr size_t ICB2400_VPINFO_PORT_OFF(size_t chan) { return ICB2400_VPINFO_OFF + sizeof(isp_icb_2400_vpinfo_t) + chan * ICB2400_VPOPT_WRITE_SIZE; }

Profit!

I hope I managed to convince you. I wish you good luck and fewer macros in code!