The preprocessor of the future past

While the previous advice applies to any C++ version, there is an increasing number of ways to reduce your daily intake of macros if you have access to a fresh enough compiler.

1. Prefer modules over includes

While modules should improve compile times, they also offer a barrier through which macros cannot leak. As of early 2018 there is no production-ready compiler with that feature, but GCC, MSVC and Clang have implemented it or are in the process of doing so.

While there is a collective lack of experience with them, it's reasonable to hope that modules will make tooling easier and enable features such as automatically including the module corresponding to a missing symbol, cleaning up unneeded modules, and so on.

2. Use if constexpr over #ifdef whenever possible

When the disabled code path is well-formed (it does not refer to unknown symbols), if constexpr is a better alternative to #ifdef, since the disabled path remains part of the AST and is checked by the compiler and your tools, including your static analyzer and refactoring programs.

3. Even in a postmodern world you may need to resort to an #ifdef, so consider using a postmodern one.

While they don’t help solve the issue at hand at all, a set of macros is being standardized to detect the standard facilities offered by your compiler. Use them if you need to. My advice is to stick to the features offered by each and every compiler you target: choose a baseline and stick with it. Consider that it might be easier to back-port a modern compiler to your target system than to write an application in C++98.

4. Use std::source_location rather than __LINE__ and __FILE__

Everybody likes to write their own logger. And now you can do that with fewer macros, or none at all, using std::source_location.

The long road towards macro-free applications

A few facilities offer better alternatives to some macro usages, but realistically you will still have to resort to the preprocessor sooner rather than later. Fortunately, there is still a lot we can do.

1. Replace -D with compiler-defined variables

One of the most frequent use cases for defines is to query the build environment: debug/release, target architecture, operating system, optimizations…

We can imagine having a set of constants exposed through a std::compiler to expose some of these build environment variables.

if constexpr(std::compiler.is_debug_build()) { }

In the same vein, we can imagine some kind of extern compiler constexpr variable, declared in the source code but defined or overridden by the compiler. That would only have a real benefit over constexpr int x = SOME_DEFINE; if there were a way to constrain the values these variables can hold.

Maybe something like this:

enum class OS {
    Linux,
    Windows,
    MacOsX
};

[[compilation_variable(OS::Linux, OS::Windows, OS::MacOsX)]] extern constexpr OS os;

My hope is that giving the compiler more information about what the various configuration variables are, and maybe even which combinations of variables are valid, would lead to better modeling (and therefore tooling and static analysis) of the source code.

2. More attributes

C++ attributes are great and we should have more of them. [[visibility]] would be a great place to start. It could take a constexpr variable as an argument to switch between import and export.

3. Taking a page from Rust’s book

The Rust community never misses an occasion to fiercely promote the merits of the Rust language. And indeed, Rust does a lot of things really well, and compile-time configuration is one of them.

The Rust book demonstrates this with the #[cfg] attribute, which conditionally compiles the item it is attached to, for example #[cfg(target_os = "linux")] placed on a function definition.

Using an attribute system to conditionally include a symbol in the compilation unit is a very interesting idea indeed.

First, it’s really readable and self-documenting. Second, even if a symbol is not to be included in the build, we can still attempt to parse it; more importantly, the declaration alone gives the compiler sufficient information about the entity to enable powerful tooling, static analysis and refactoring.

Consider the following code:

[[static_if(std::compiler.arch() == "arm")]]
void f() {}

void foo() {
    if constexpr (std::compiler.arch() == "arm") {
        f();
    }
}

It has an amazing property: it’s well-formed. Because the compiler knows that f is a valid entity and that it is a function name, it can unambiguously parse the body of the discarded if constexpr statement.

You can apply the same syntax to any kind of C++ declaration and the compiler will be able to make sense of it.

[[static_if(std::compiler.arch() == "arm")]]
int x = /*...*/;

Here the compiler only needs to parse the left-hand side, since the rest is not needed for static analysis or tooling.

[[static_if(std::compiler.is_debug_build())]]
class X {
};

For static analysis purposes we only need to index the class name and its public members.

Of course, referencing a discarded declaration from an active code path would be ill-formed, but the compiler could check that this never happens for any valid configuration. Sure, it wouldn’t be computationally free, but you would have a strong guarantee that all of your code is well-formed. Breaking the Windows build because you wrote your code on a Linux machine would become much harder.

It’s, however, not as easy as it sounds. What if the body of a discarded entity contains syntax the current compiler doesn’t know about? Maybe a vendor extension, or some newer C++ feature? I think it’s reasonable for parsing to happen on a best-effort basis: when a parsing failure occurs, the compiler can skip the current statement and warn about the parts of the source it doesn’t understand. “I haven’t been able to rename Foo between lines 110 and 130” is miles better than “I have renamed some instances of Foo. Maybe not all; good luck skimming through the whole project by hand. Really, don’t bother with a compiler, just use grep”.

4. constexpr all the things.

Maybe we need a constexpr std::chrono::system_clock::now() to replace __TIME__.

We may also want a compile-time random number generator. Why not? Who cares about reproducible builds anyway?

5. Generate code and symbols with reflection

The metaclasses proposal is the best thing since sliced bread, modules and concepts. In particular, P0712 is an amazing paper in many regards.

One of the many constructs introduced is the declname keyword, which creates an identifier from an arbitrary sequence of strings and digits.

int declname("foo", 42) = 0; creates a variable foo42 . Given that string concatenation to form new identifiers is one of the most frequent use case for macros, this is very interesting indeed. Hopefully the compiler would have enough information on the symbols created ( or refereed to ) this way to still index them properly.

The infamous X macro should also become a thing of the past in the coming years.

6. To get rid of macros, we need a new kind of macros

Since macros are just text replacement, their arguments are lazily evaluated. And while we can use lambdas to emulate that behavior, it’s rather cumbersome. So, could we benefit from lazy evaluation in functions?

This is a topic I thought about last year

My idea is to use the facilities offered by code injection to create a new kind of “macro” which I call a “syntactic macro”, for lack of a better name. Fundamentally, if you give a name to a code fragment (a piece of code that you can inject at a given point of your program) and allow it to take a number of parameters, you’ve got yourself a macro. But a macro which is checked at the syntax level (rather than the token soup the preprocessor offers).

How would it work?

Definition and usage of an S-Macro:
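The original example is not reproduced here, so the following sketch is reconstructed from the description below; none of this is valid C++ today, and all the names (log, print, std::meta::expression) are speculative:

```cpp
// Entirely hypothetical syntax.
constexpr {
    // Define a syntactic macro `log`. It is not a function: it expands
    // to code at the point of invocation.
    log->(std::meta::expression<char*> c, auto... args) {
        // Injection statement: create a fragment and inject it at the
        // expansion site. The -> prefix turns a reflection back into
        // the original expression.
        -> {
            print(->c, ->(args)...);
        }
    }
}

void f() {
    log->("Hello %", "World");  // looks like a call, expands like a macro
}
```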

OK, what’s happening here?

We first create a constexpr block with constexpr { }. This is part of the metaclasses proposal. A constexpr block is a compound statement in which all the variables are constexpr and free of side effects. The only purpose of such a block is to create injection fragments and to modify the properties of the entity in which it is declared, at compile time. (Metaclasses are syntactic sugar on top of constexpr blocks, and I’d argue that we don’t actually need metaclasses.)

Within the constexpr block we define a macro log. Notice that macros are not functions: they expand to code, they don’t return anything, nor do they exist on the stack. log is an identifier that can be qualified and cannot be the name of any other entity in the same scope. Syntactic macros obey the same lookup rules as all other identifiers.

They use the injection operator ->, which can be used to describe all code-injection-related operations without conflicting with its current uses. In our case, since log is a syntactic macro, which is a form of code injection, we define the macro with log->(){...}.

The body of the syntactic macro is itself a constexpr block which may contain any C++ expression that can be evaluated in a constexpr context.

It may contain zero or more injection statements, denoted by -> {}. An injection statement creates a code fragment and immediately injects it at the point of invocation, which, in the case of a syntactic macro, is the location the macro is expanded from.

A macro can inject either an expression or zero or more statements. A macro that injects an expression can only be expanded where an expression is expected, and vice versa.

So while a macro has no type, it has a nature (expression or statement) which is determined by the compiler.

You can pass any arguments to a syntactic macro that you could pass to a function. Arguments are evaluated before expansion, and are strongly typed.

However, you can also pass reflections of expressions. That supposes being able to take the reflection of an arbitrary expression. A reflection of an expression e has a type corresponding to decltype(e).

In terms of implementation, in the example above std::meta::expression&lt;char*&gt; is a concept matching any reflection of an expression whose type is char*.

The last piece of magic when evaluating a macro is that expressions are implicitly converted to their reflection before expansion.

At a basic level, we are moving AST nodes around, which is consistent with the current approaches on reflection and code injections.

Lastly, when we inject print(->c, ->(args)...), notice the -> tokens. They transform the reflections back into the original expressions, which can then be evaluated.

From the call site, log->("Hello %", "World"); looks like a regular void function call, except that the -> indicates the presence of a macro expansion.

Lastly, the ability to pass as argument an identifier before evaluation may alleviate the need for new keywords:

std::reflexpr->(x) could expand to __std_reflexpr_intrinsics(x) before x is evaluated.

Do S-Macros replace preprocessor macros completely?

They don’t, but they don’t intend to. Notably, because they must be valid C++ and are checked at multiple points (at definition time, and before, during and after expansion), they actively prohibit token soup. They are valid C++, inject valid C++, and take valid C++ as parameters.

That means they can’t inject partial statements, manipulate partial statements, or take arbitrary statements as parameters. For example, you cannot implement foreach with them, since for(;;) is not a complete statement (for(;;); and for(;;){} are, but they are not very useful).

They do, however, solve the issues of lazy evaluation and conditional execution.

There are a lot of questions regarding name lookup. Should a macro “see” the context it’s expanded in? Should an argument be aware of the internals of the macro? Of its declaration context?

I think limitations are a good thing. If you really need to invent new constructs, maybe the language is lacking, in which case, write a proposal. Or maybe you need a code generator. Or just more abstractions, or more actual code.

Is this real life ?

It’s very much fantasy and absolutely not part of any current proposal, but I do think it would be a logical evolution of the code injection feature.

It resembles Rust macros a bit (except that it does not allow arbitrary statements as arguments), while (I hope) feeling like part of C++, rather than like another language with a separate grammar.