MSVC Preprocessor Progress towards Conformance

Ulzii

July 6th, 2018

Why re-write the preprocessor?

Recently, we published a blog post on C++ conformance completion. As mentioned in the blog post, the preprocessor in MSVC is currently getting an overhaul. We are doing this to improve its language conformance, address some of the longstanding bugs that were difficult to fix due to its design and improve its usability and diagnostics. In addition to that, there are places in the standard where the preprocessor behavior is undefined or unspecified and our traditional behavior diverged from other major compilers. In some of those cases, we want to move closer to the ecosystem to make it easier for cross platform libraries to come to MSVC.

If there is an old library you use or maintain that depends on the non-conformant traditional behavior of the MSVC preprocessor, you don’t need to worry about these breaking changes as we will still support this behavior. The updated preprocessor is currently under the switch /experimental:preprocessor until it is fully implemented and production-ready, at which point it will be moved to a /Zc: switch which is enabled by default under /permissive- mode.

How do I use it?

The behavior of the traditional preprocessor is maintained and continues to be the default behavior of the compiler. The conformant preprocessor can be enabled by using the /experimental:preprocessor switch on the command line starting with the Visual Studio 2017 15.8 Preview 3 release.

We have introduced a new predefined macro in the compiler called “_MSVC_TRADITIONAL” to indicate the traditional preprocessor is being used. This macro is set unconditionally, independent of which preprocessor is invoked. Its value is “1” for the traditional preprocessor, and “0” for the conformant experimental preprocessor.

#if defined(_MSVC_TRADITIONAL) && _MSVC_TRADITIONAL
// Logic using the traditional preprocessor
#else
// Logic using cross-platform compatible preprocessor
#endif
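As a minimal sketch of how the check above might be wrapped for use in ordinary code, the block below maps it onto a project-level macro and a small helper. The names APP_TRADITIONAL_PP and preprocessor_mode are hypothetical and not part of MSVC. The sketch also assumes that an undefined _MSVC_TRADITIONAL means a conforming preprocessor, which holds for other compilers but not for older MSVC releases that predate the macro.

```cpp
#include <cassert>
#include <string>

// _MSVC_TRADITIONAL is only predefined by MSVC toolsets that ship the new
// preprocessor. Treating "undefined" as "conforming" is an assumption of this
// sketch; it is right for other compilers but not for older MSVC releases.
#if defined(_MSVC_TRADITIONAL) && _MSVC_TRADITIONAL
#define APP_TRADITIONAL_PP 1  // hypothetical project-level macro
#else
#define APP_TRADITIONAL_PP 0
#endif

// Hypothetical helper that reports which branch was compiled in.
inline std::string preprocessor_mode()
{
    return APP_TRADITIONAL_PP ? "traditional" : "conforming";
}
```

Funnelling the check through one macro keeps the compiler-specific test in a single place if the switch is later renamed.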

What are the behavior changes?

This first experimental release is focused on getting conformant macro expansions done to maximize adoption of new libraries by the MSVC compiler. Below is a list of some of the more common breaking changes that were run into when testing the updated preprocessor with real world projects.

Behavior 1 [macro comments]

The traditional preprocessor is based on character buffers rather than preprocessor tokens. This allows unusual behavior such as this preprocessor comment trick which will not work under the conforming preprocessor:

#if DISAPPEAR
#define DISAPPEARING_TYPE /##/
#else
#define DISAPPEARING_TYPE int
#endif

// myVal disappears when DISAPPEARING_TYPE is turned into a comment
// To make standard compliant wrap the following line with the appropriate
// #if/#endif
DISAPPEARING_TYPE myVal;
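The #if/#endif wrapping suggested in the comment could look roughly like the sketch below. Defaulting DISAPPEAR to 0 and the helper my_val_is_declared are assumptions added only so the fragment compiles on its own; in a real project DISAPPEAR would come from the build configuration.

```cpp
#include <cassert>

// DISAPPEAR is normally supplied by the build; defaulting it to 0 here is an
// assumption made so the sketch compiles on its own.
#ifndef DISAPPEAR
#define DISAPPEAR 0
#endif

// Instead of turning the type into a comment, guard the declaration itself.
#if !DISAPPEAR
int myVal = 0;
#endif

// Hypothetical helper: true when the declaration above was compiled in.
constexpr bool my_val_is_declared()
{
    return !DISAPPEAR;
}
```

This form does not rely on pasting two slashes into a comment marker, so it behaves the same under both preprocessors.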

Behavior 2 [L#val]

The traditional preprocessor incorrectly combines a string prefix with the result of the # operator:

#define DEBUG_INFO(val) L"debug prefix:" L#val
//                                       ^
//                                       this prefix

const wchar_t *info = DEBUG_INFO(hello world);

In this case the L prefix is unnecessary because the adjacent string literals get combined after macro expansion anyway. The backward compatible fix is to change the definition to:

#define DEBUG_INFO(val) L"debug prefix:" #val
//                                       ^
//                                       no prefix

This issue is also found in convenience macros that ‘stringize’ the argument to a wide string literal:

// The traditional preprocessor creates a single wide string literal token
#define STRING(str) L#str

// Potential fixes:
// Use string concatenation of L"" and #str to add prefix
// This works because adjacent string literals are combined after macro expansion
#define STRING1(str) L""#str

// Add the prefix after #str is stringized with additional macro expansion
#define WIDE(str) L##str
#define STRING2(str) WIDE(#str)

// Use concatenation operator ## to combine the tokens.
// The order of operations for ## and # is unspecified, although all compilers
// I checked perform the # operator before ## in this case.
#define STRING3(str) L## #str
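To check that the first two fixes and the corrected DEBUG_INFO really yield wide string literals, a small test along the following lines can be compiled with any conforming preprocessor. The helper same_text is hypothetical and only there to compare the results. STRING3 is left out because it depends on unspecified evaluation order.

```cpp
#include <cassert>
#include <cwchar>

// Corrected definitions from the text above.
#define DEBUG_INFO(val) L"debug prefix:" #val
#define STRING1(str) L""#str
#define WIDE(str) L##str
#define STRING2(str) WIDE(#str)

// Hypothetical helper for comparing wide C strings.
inline bool same_text(const wchar_t* a, const wchar_t* b)
{
    return std::wcscmp(a, b) == 0;
}

// Each macro should produce a const wchar_t* with the expected contents.
inline void check_wide_strings()
{
    assert(same_text(DEBUG_INFO(hello world), L"debug prefix:hello world"));
    assert(same_text(STRING1(hello world), L"hello world"));
    assert(same_text(STRING2(hello world), L"hello world"));
}
```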

Behavior 3 [warning on invalid ##]

When the ## operator does not result in a single valid preprocessing token, the behavior is undefined. The traditional preprocessor will silently fail to combine the tokens. The new preprocessor will match the behavior of most other compilers and emit a diagnostic.

// The ## is unnecessary and does not result in a single preprocessing token.
#define ADD_STD(x) std::##x

// Declare a std::string
ADD_STD(string) s;
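Since the ## is unnecessary here, the likely fix is simply to drop it. The sketch below shows that form; the function greeting is a hypothetical wrapper added so the expansion can be exercised.

```cpp
#include <cassert>
#include <string>

// std:: and the argument are already separate tokens, so no ## is needed.
#define ADD_STD(x) std::x

// Hypothetical function that uses the macro to declare a std::string.
inline std::string greeting()
{
    ADD_STD(string) s = "hello";  // expands to: std::string s = "hello";
    return s;
}
```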

Behavior 4 [comma elision in variadic macros]

Consider the following example:

void func(int, int = 2, int = 3);
// This macro replacement list has a comma followed by __VA_ARGS__
#define FUNC(a, ...) func(a, __VA_ARGS__)
int main()
{
    // The following macro is replaced with:
    // func(10,20,30)
    FUNC(10, 20, 30);

    // A conforming preprocessor will replace the following macro with: func(1, );
    // Which will result in a syntax error.
    FUNC(1, );
}

All major compilers have a preprocessor extension that helps address this issue. The traditional MSVC preprocessor always removes commas before empty __VA_ARGS__ replacements. In the updated preprocessor we have decided to more closely follow the behavior of other popular cross platform compilers. For the comma to be removed, the variadic argument must be missing (not just empty) and it must be marked with a ## operator.

#define FUNC2(a, ...) func(a , ## __VA_ARGS__)
int main()
{
    // The variadic argument is missing in the macro being invoked
    // Comma will be removed and replaced with:
    // func(1)
    FUNC2(1);

    // The variadic argument is empty, but not missing (notice the
    // comma in the argument list). The comma will not be removed
    // when the macro is replaced.
    // func(1, )
    FUNC2(1, );
}

In the upcoming C++2a standard this issue has been addressed by adding __VA_OPT__, which is not yet implemented.
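Where neither the ## extension nor __VA_OPT__ can be relied on, one portable alternative (a different technique from the one described above) is to drop the named parameter so that the replacement list contains no comma at all. In this sketch FUNC3 is a hypothetical macro, and the body of func is invented so that the result can be inspected.

```cpp
#include <cassert>

// Same signature as the declaration in the example above; the body is
// invented here so the result can be inspected.
inline int func(int a, int b = 2, int c = 3)
{
    return a * 100 + b * 10 + c;
}

// Hypothetical alternative: no named parameter, so there is no comma in the
// replacement list that could be left dangling.
#define FUNC3(...) func(__VA_ARGS__)

// The default arguments of func fill in whatever the caller leaves out.
inline void check_func3()
{
    assert(FUNC3(1) == 123);
    assert(FUNC3(4, 5, 6) == 456);
}
```

The trade-off is that the macro no longer enforces a first argument: FUNC3() expands to func(), which fails at compile time in the function call rather than in the macro.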

Behavior 5 [macro arguments are ‘unpacked’]

In the traditional preprocessor, if a macro forwards one of its arguments to another dependent macro then the argument does not get “unpacked” when it is substituted. Usually this optimization goes unnoticed, but it can lead to unusual behavior:

// Create a string out of the first argument, and the rest of the arguments.
#define TWO_STRINGS( first, ... ) #first, #__VA_ARGS__
#define A( ... ) TWO_STRINGS(__VA_ARGS__)

const char* c[2] = { A(1, 2) };
// Conformant preprocessor results:
// const char* c[2] = { "1", "2" };
// Traditional preprocessor results, all arguments are in the first string:
// const char* c[2] = { "1, 2", };

When expanding A(), the traditional preprocessor forwards all of the arguments packaged in __VA_ARGS__ to the first argument of TWO_STRINGS, which leaves the variadic argument of TWO_STRINGS empty. This causes the result of #first to be “1, 2” rather than just “1”. If you are following along closely, then you may be wondering what happened to the result of #__VA_ARGS__ in the traditional preprocessor expansion: if the variadic parameter is empty it should result in an empty string literal “”. Due to a separate issue, the empty string literal token was not generated.
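A code base that depends on the conformant result could guard against the traditional behavior with a small check such as the sketch below. The function arguments_are_unpacked is hypothetical. It returns true when the forwarded arguments were split into two strings, and false for the traditional result, where the second array element is left null.

```cpp
#include <cassert>
#include <cstring>

// Macros from the example above.
#define TWO_STRINGS(first, ...) #first, #__VA_ARGS__
#define A(...) TWO_STRINGS(__VA_ARGS__)

// Hypothetical check: true when A(1, 2) was unpacked into "1" and "2".
inline bool arguments_are_unpacked()
{
    const char* c[2] = { A(1, 2) };
    return c[1] != nullptr
        && std::strcmp(c[0], "1") == 0
        && std::strcmp(c[1], "2") == 0;
}
```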

Behavior 6 [rescanning replacement list for macros]

After a macro is replaced, the resulting tokens are rescanned for additional macro identifiers that need to be replaced. The algorithm used by the traditional preprocessor for doing the rescan is not conformant as shown in this example based on actual code:

#define CAT(a,b) a ## b
#define ECHO(...) __VA_ARGS__

// IMPL1 and IMPL2 are implementation details
#define IMPL1(prefix,value) do_thing_one( prefix, value)
#define IMPL2(prefix,value) do_thing_two( prefix, value)
// DO_THING chooses the expansion behavior based on the value passed to macro_switch
#define DO_THING(macro_switch, b) CAT(IMPL, macro_switch) ECHO(( "Hello", b))

DO_THING(1, "World");
// Traditional preprocessor:
// do_thing_one( "Hello", "World");
// Conformant preprocessor:
// IMPL1 ( "Hello","World");

Although this example is a bit contrived, we have run into this issue a few times when testing the preprocessor changes against real world code. To see what is going on we can break down the expansion starting with DO_THING:

DO_THING(1, "World") --> CAT(IMPL, 1) ECHO(("Hello", "World"))

Second, CAT is expanded:

CAT(IMPL, 1) --> IMPL ## 1 --> IMPL1

Which puts the tokens into this state:

IMPL1 ECHO(("Hello", "World"))

The preprocessor finds the function-like macro identifier IMPL1, but it is not followed by a "(", so it is not considered a function-like macro invocation. It moves on to the following tokens and finds the function-like macro ECHO being invoked:

ECHO(("Hello", "World")) --> ("Hello", "World")

IMPL1 is never considered again for expansion, so the full result of the expansions is:

IMPL1("Hello", "World");

The macro can be modified to behave the same under the experimental preprocessor and the traditional preprocessor by adding in another layer of indirection:

#define CAT(a,b) a##b
#define ECHO(...) __VA_ARGS__

// IMPL1 and IMPL2 are implementation details
#define IMPL1(prefix,value) do_thing_one( prefix, value)
#define IMPL2(prefix,value) do_thing_two( prefix, value)

#define CALL(macroName, args) macroName args
#define DO_THING_FIXED(a,b) CALL( CAT(IMPL, a), ECHO(( "Hello",b)))

DO_THING_FIXED(1, "World");
// macro expanded to:
// do_thing_one( "Hello", "World");
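To see the fixed macro select an implementation in compiled code, the sketch below supplies hypothetical bodies for do_thing_one and do_thing_two. The original post does not define them, so their signatures and return values are assumptions. With a conforming preprocessor the extra CALL layer lets IMPL1 or IMPL2 be rescanned next to its argument list.

```cpp
#include <cassert>
#include <string>

#define CAT(a,b) a##b
#define ECHO(...) __VA_ARGS__

#define IMPL1(prefix,value) do_thing_one(prefix, value)
#define IMPL2(prefix,value) do_thing_two(prefix, value)

// CALL places the macro name and its parenthesized arguments side by side,
// so the rescan sees a complete function-like macro invocation.
#define CALL(macroName, args) macroName args
#define DO_THING_FIXED(a,b) CALL(CAT(IMPL, a), ECHO(("Hello", b)))

// Hypothetical implementations, invented so the chosen branch is observable.
inline std::string do_thing_one(const char* prefix, const char* value)
{
    return std::string("one:") + prefix + " " + value;
}

inline std::string do_thing_two(const char* prefix, const char* value)
{
    return std::string("two:") + prefix + " " + value;
}
```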

What’s next…

The preprocessor overhaul is not yet complete; we will continue to make changes under the experimental mode and fix bugs from early adopters.

Some preprocessor directive logic needs completion rather than falling back to the traditional behavior

Support for _Pragma

C++20 features

Additional diagnostic improvements

New switches to control the output under /E and /P

Boost blocking bug: Logical operators in preprocessor constant expressions are not fully implemented in the new preprocessor, so on #if directives the new preprocessor can fall back to the traditional preprocessor. This is only noticeable when macros not compatible with the traditional preprocessor are expanded, which can be the case when building Boost preprocessor slots.



In closing

We’d love for you to download Visual Studio 2017 version 15.8 preview and try out all the new experimental features. As always, we welcome your feedback. We can be reached via the comments below or via email (visualcpp@microsoft.com). If you encounter other problems with MSVC in Visual Studio 2017 please let us know through Help > Report A Problem in the product, or via Developer Community. Let us know your suggestions through UserVoice. You can also find us on Twitter (@VisualC) and Facebook (msftvisualcpp).

Thank you,

Phil Christensen, Ulzii Luvsanbat