What's new in GCC 4.5?

Benefits for LWN subscribers The primary benefit from subscribing to LWN is helping to keep us publishing, but, beyond that, subscribers get immediate access to all site content and access to a number of extra site features. Please sign up today!

Version 4.5 of the GNU Compiler Collection was released in mid-April with many changes under-the-hood, as well as a few important user-visible features. GCC 4.5 promises faster programs using the new link-time optimization (LTO) option, easier implementation of compiler extensions thanks to the controversial plugin infrastructure, stricter standards-conformance for floating-point computations, and better debugging information when compiling with optimizations.

The GNU Compiler Collection is one of the oldest free software projects still around. Version 1.0 of GCC was released in 1987. More than twenty years later, GCC is still under active development and each new version is adding important features. Supporting these new features in such an old codebase often requires major rewriting of substantial parts of GCC. GCC 4.0 was an important milestone in this regard, and GCC internals are still evolving at a rapid pace. However, these core improvements are sometimes not clearly visible as improvements for users. This is not the case in GCC 4.5. This article describes four new features in GCC 4.5, and also looks at an internal feature that may radically change how GCC is developed in the future.

Link-Time Optimization

Perhaps the most visible of the new features in GCC 4.5 is the Link-Time Optimization option: -flto . When source files are compiled and linked using -flto , GCC applies optimizations as if all the source code were in a single file. This allows GCC to perform more aggressive optimizations across files, such as inlining the body of a function from one file that is called from a different file, and propagating constants across files. In general, the LTO framework enables all the usual optimizations that work at a higher level than a single function to also work across files that are independently compiled.

The LTO option works almost like any other optimization flag. First, one needs to use optimization (using one of the -O{1,2,3,s} options). In cases where compilation and linking are done in a single step, adding the option -flto is sufficient

gcc -o myprog -flto -O2 foo.c bar.c

This effectively deprecates the old -combine option, which was too slow in practice and only supported for C.

With independent compilation steps, the option -flto must be specified at all steps of the process:

gcc -c -O2 -flto foo.c gcc -c -O2 -flto bar.c gcc -o myprog -flto -O2 foo.o bar.o

An interesting possibility is to combine the options -flto and -fwhole-program . The latter assumes that the current compilation unit represents the whole program being compiled. This means that most functions and variables are optimized more aggressively. Adding -fwhole-program in the final link step in the example above, makes LTO even more powerful.

When using multiple steps, it is strongly recommended to use exactly the same optimization and machine-dependent options in all commands, because conflicting options during compilation and link-time may lead to strange errors. In the best case, the options used during compilation will be silently overridden by those used at link-time. In the worst case, the different options may introduce subtle inconsistencies leading to unpredictable results at runtime. This, of course, is far from ideal, and, hence, in the next minor release, GCC will identify such conflicting options and provide appropriate diagnostics. Meanwhile, some extra care should be taken when using LTO.

The current implementation of LTO is only available for ELF targets, and, hence, LTO is not available in Windows or Darwin in GCC 4.5. However, the LTO framework is flexible enough to support those targets and, in fact, Dave Korn has recently proposed a patch that adds LTO support for Windows to GCC 4.5.1 and 4.6, and Steven Bosscher has done the same for Darwin.

Finally, another interesting ongoing project, called whole program optimization [PDF], aims to make LTO much more scalable for very large programs (on the order of millions of functions). Currently, when compiling and linking with LTO, the final step stores information from all files involved in the compilation in memory. This approach does not scale well if there are many large files. In practice, there may be little interaction between some files and the information required could be partitioned and optimized independently, with little performance loss, or at least gracefully degrading the effectiveness of LTO depending on existing resources. The experimental -fwhopr option is a first step in this direction, but this feature is still under development and even the name of the option is likely to change. Therefore, GCC 4.6 will probably bring further improvements in this area.

Plugins

Another long-awaited feature is the ability to load user code as plugins that modify the behaviour of GCC. A substantial amount of controversy surrounded the implementation of plugins. The possibility of proprietary plugins was probably the main factor stalling the development of this feature. However, the FSF recently reworked the Runtime Library Exception in order to prevent proprietary plugins. With the new Runtime Library Exception in place, the development of the plugins framework progressed rapidly. This, however, did not completely end the controversy surrounding plugins, and while some developers think that plugins are essential for the future of GCC and for attracting new users and contributors, others fear that plugins may divert efforts from improving GCC itself.

The plugin framework of GCC can work in principle on any system that supports dynamic libraries. In GCC 4.5, however, plugins are only supported on ELF-based platforms, that is, most Unix-like systems, but not Windows or Darwin. A plugin is loaded with the new option -fplugin=/path/to/file.so . GCC makes available a series of events for which the plugin code can register its own callback functions. The events already implemented in GCC 4.5 allow plugins to interact with the pass manager to add, reorder and remove optimization passes dynamically, modify the low level representation used by C and C++ front-ends, add new custom attributes and compiler pragmas, and other possibilities described in the internal documentation.

Despite plugins being a new feature in GCC 4.5, several projects are already making use of the plugins support. Among these projects is Dehydra, the static analysis tool for C++ developed by Mozilla; and MELT, a framework for writing optimization passes in a dialect of LISP. Also, the ICI/MILEPOST research project strongly relies on the new plugins framework in GCC 4.5.

Variable Tracking at Assignments

The Variable Tracking at Assignments (VTA) project aims to improve debug information when optimizations are enabled. When GCC compiles some code with optimizations enabled, variables are renamed, moved around, or even completely removed. When debugging such code and trying to inspect the value of some variable, the debugger would often report that the variable has been optimized out. With VTA enabled, the optimized code is internally annotated in such a way that optimization passes transparently keep track of the value of each variable, even if the variable is moved around or removed.

A small example of the differences between debug information in GCC 4.5 and previous releases is the following program:

typedef struct list { struct list *n; int v; } *node; node find_prev (node c, node w) { while (c) { node opt = c; c = c->n; if (c == w) return opt; } return NULL; }

Variable opt is removed when compiling with optimization. Hence, in previous GCC versions, or when compiling without VTA, one cannot inspect the value of opt even at the highest debugging level. In GCC 4.5, however, VTA enables inspection of the value of all variables at all points of the function.

The effect of VTA is even more noticeable for inlined functions. Before VTA, optimizations would often completely remove some arguments of an inlined function, making it impossible to inspect their values when debugging. With VTA, these optimizations still take place, however, appropriate debug information is generated for the missing arguments.

Finally, the VTA project has brought another feature, the new -fcompare-debug option, which tests that the code generated by GCC with and without debug information is identical. This option is mainly used by GCC developers to test the compiler, but it may be useful for users to check that their program is not affected by a bug in GCC, though at a significant cost in compilation time.

Standard conforming excess precision

Perhaps the most reported bug in GCC is bug 323. The symptoms appear when different optimization levels produce different results in floating-point computations, and when two ways of performing the same calculation do not produce the same result. Although this is an inherent limitation of floating-point numbers, users are still surprised that different optimization levels lead to highly different results. One of the main culprits of the problem is the excess precision arising from the use of the x87 floating-point unit (FPU). That is, operations performed in the FPU have more precision than double precision numbers stored in memory. Hence, the final result of a computation may significantly depend on whether intermediate operations are stored in the FPU or in memory.

This leads to some unexpected and counter-intuitive results. For example, the same piece of code may produce different results using the same compilation flags and the same machine depending on changes of seemingly unrelated code, because the unrelated code forces the compiler to save some intermediate result in memory instead of keeping it in a FPU register. One workaround to this behavior is the option -ffloat-store , which stores every floating-point variable in memory. This has, however, a significant cost in computation time. A more fine-grained workaround is to use the volatile qualifier in variables suffering from this problem.

While this problem will never be solved in computers with inexact representation of floating-point numbers, GCC 4.5 helps improve the situation by adding a new option -fexcess-precision=standard , currently only available for C, that handles floating-point excess precision in a way that conforms to ISO C99. This option is also enabled with standards conformance options such as -std=c99 . However, standards-conforming precision incurs an extra cost in computation time. Therefore, users more interested in speed may wish to disable this behavior using the option -fexcess-precision=fast .

C++ compatible

GCC 4.5 is the first release of GCC that can be compiled with a C++ compiler. This may not seem very interesting or useful at the moment (but take a look at the much improved -Wc++-compat option). However, this is only the first step of an ongoing project to use C++ as the implementation language of GCC. Except for some front-end bits written in other languages, notably Ada, most of GCC is implemented in C. The internal structures of GCC are under a continuous improvement and modularization aimed at creating cleaner interfaces, and many GCC developers think that this work would be easier using C++ than C. However, this proposal is not free of controversy, and it is not clear whether the switch would occur in GCC 4.6, later, or ever.

Other improvements

The above are only some examples of the many improvements and new features in GCC 4.5. A few other features that are worth mentioning:

Conclusion

We are living interesting times on the compiler front, and GCC 4.5 is an indication that we can still expect new developments in the future. The release of GCC 4.5 brings to its users several important, and somewhat controversial, features. It also includes the typical long list of small fixes and improvements, where most will be able to find at least one thing to their liking. GCC 4.5 may well be a transition point, where the foundational work that has been done during the 4.x release series is starting to show up in user-visible features that would have been impossible in the GCC 3.x release series. It is difficult to say at this moment what GCC 4.6 will bring us in a year from now, as it will depend on what the contributors decide. Anyone can contribute to the future of GCC. This is free software after all.

Acknowledgments

I would like to thank in general the community of GCC developers, and in particular, Ian Lance Taylor, Diego Novillo, and Alexandre Oliva, for their helpful comments and suggestions when writing this article.