LLVM 2.3 Release Notes

Written by the LLVM Team

This document contains the release notes for the LLVM compiler infrastructure, release 2.3. Here we describe the status of LLVM, including major improvements from the previous release and any known problems. All LLVM releases may be downloaded from the LLVM releases web site. For more information about LLVM, including information about the latest release, please check out the main LLVM web site. If you have questions or comments, the LLVM developer's mailing list is a good place to send them. Note that if you are reading this file from a Subversion checkout or the main LLVM web page, this document applies to the next release, not the current one. To see the release notes for a specific release, please see the releases page.

This is the fourteenth public release of the LLVM Compiler Infrastructure. It includes a large number of features and refinements from LLVM 2.2.

LLVM 2.3 no longer supports llvm-gcc 4.0; it has been replaced with llvm-gcc 4.2.

LLVM 2.3 no longer includes the llvm-upgrade tool. It was useful for upgrading LLVM 1.9 files to LLVM 2.x syntax, but you can always use a previous LLVM release to do this. One nice impact of this is that the LLVM regression test suite no longer depends on llvm-upgrade, which makes it run faster.

The llvm2cpp tool has been folded into llc; use llc -march=cpp instead of llvm2cpp.

LLVM API Changes:

Several core LLVM IR classes have migrated to use the 'FOOCLASS::Create(...)' pattern instead of 'new FOOCLASS(...)' (e.g. where FOOCLASS=BasicBlock). We hope to standardize on FOOCLASS::Create for all IR classes in the future, but not all of them have been moved over yet.

LLVM 2.3 renames the LLVMBuilder and LLVMFoldingBuilder classes to IRBuilder.

MRegisterInfo was renamed to TargetRegisterInfo.

The MappedFile class is gone; please use MemoryBuffer instead.

The '-enable-eh' flag to llc has been removed. Now code should encode whether it is safe to omit unwind information for a function by tagging the Function object with the 'nounwind' attribute.

The ConstantFP::get method that uses APFloat now takes one argument instead of two. The type argument has been removed, and the type is now inferred from the size of the given APFloat value.
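As a rough illustration of the FOOCLASS::Create(...) pattern mentioned in the API changes above, the sketch below uses a hypothetical Block class. It is not LLVM's BasicBlock and uses no LLVM headers; the names Block, Create and getName are assumptions made for the example. It only shows the general shape of a named constructor, where the real constructor is private so that callers must go through Create.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for an IR class; this is NOT LLVM's BasicBlock.
class Block {
public:
  // Named constructor: callers write Block::Create("entry")
  // instead of new Block("entry").
  static Block *Create(const std::string &Name) {
    assert(!Name.empty() && "this sketch requires a non-empty name");
    return new Block(Name);
  }

  const std::string &getName() const { return Name; }

private:
  // A private constructor forces all allocation through Create.
  explicit Block(std::string N) : Name(std::move(N)) {}

  std::string Name;
};
```

One possible motivation for this style is that a static factory can change how objects are allocated later without touching every call site, though these notes do not state the reason for the migration.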

The core LLVM 2.3 distribution currently consists of code from the core LLVM repository (which roughly contains the LLVM optimizer, code generators and supporting tools) and the llvm-gcc repository. In addition to this code, the LLVM Project includes other sub-projects that are in development. The two which are the most actively developed are the new vmkit Project and the Clang Project.

The "vmkit" project is a new addition to the LLVM family. It is an implementation of JVM and CLI virtual machines (Microsoft .NET is an implementation of the CLI) using the Just-In-Time compiler of LLVM. The JVM, called JnJVM, executes real-world applications such as Apache projects (e.g. Felix and Tomcat) and the SpecJVM98 benchmark. It uses the GNU Classpath project for the base classes. The CLI implementation, called N3, is in its early stages but can execute simple applications and the "pnetmark" benchmark. It uses the pnetlib project as its core library. The 'vmkit' VMs compare in performance with industrial and top open-source VMs on scientific applications. Besides the JIT, the VMs use many features of the LLVM framework, including the standard set of optimizations, atomic operations, custom function provider and memory manager for JITed methods, and specific virtual machine optimizations. vmkit is not an official part of the LLVM 2.3 release. It is publicly available under the LLVM license and can be downloaded from:

    svn co http://llvm.org/svn/llvm-project/vmkit/trunk vmkit

The Clang project is an effort to build a set of new 'LLVM native' front-end technologies for the LLVM optimizer and code generator. Clang is continuing to make major strides forward in all areas. Its C and Objective-C parsing support is very solid, and the code generation support is far enough along to build many C applications. While not yet production quality, it is progressing very nicely. In addition, C++ front-end work has started to make significant progress. At this point, Clang is most useful if you are interested in source-to-source transformations (such as refactoring) and other source-level tools for C and Objective-C. Clang now also includes tools for turning C code into pretty HTML, and includes a new static analysis tool in development. This tool focuses on automatically finding bugs in C and Objective-C code.

LLVM 2.3 includes a huge number of bug fixes, performance tweaks and minor improvements. Some of the major improvements and new features are listed in this section.

LLVM 2.3 includes several major new capabilities:

The biggest change in LLVM 2.3 is Multiple Return Value (MRV) support. MRVs allow LLVM IR to directly represent functions that return multiple values without having to pass them "by reference" in the LLVM IR. This allows a front-end to generate more efficient code, as MRVs are generally returned in registers if a target supports them. See the LLVM IR Reference for more details. MRVs are fully supported in the LLVM IR, but are not yet fully supported on all targets. However, it is generally safe to return up to 2 values from a function: most targets should be able to handle at least that. MRV support is a critical requirement for X86-64 ABI support, as X86-64 requires the ability to return multiple registers from functions, and we use MRVs to accomplish this in a direct way.
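To make the contrast with "by reference" returns concrete, here is a small source-level sketch in plain C++. The names DivResult, divide and divideByRef are hypothetical, and the code shows no LLVM IR. Whether a particular front-end lowers the two-field return into an MRV is an assumption; these notes only say that the IR can now represent such returns directly.

```cpp
#include <cassert>

// Two results returned together by value (hypothetical example type).
struct DivResult {
  int Quotient;
  int Remainder;
};

// Returns both results directly. With MRV-style lowering, two small
// values like these could come back in registers.
DivResult divide(int N, int D) {
  assert(D != 0 && "division by zero");
  DivResult R = {N / D, N % D};
  return R;
}

// The "by reference" alternative: the caller supplies memory for the
// results, and the callee stores through the pointers.
void divideByRef(int N, int D, int *Quotient, int *Remainder) {
  assert(D != 0 && "division by zero");
  *Quotient = N / D;
  *Remainder = N % D;
}
```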

LLVM 2.3 includes a complete reimplementation of the "llvmc" tool. It is designed to overcome several problems with the original llvmc and to provide a superset of the features of the 'gcc' driver. The main features of llvmc2 are: Extended handling of command line options and smart rules for dispatching them to different tools. Flexible (and extensible) rules for defining different tools. The different intermediate steps performed by tools are represented as edges in the abstract graph. The 'language' for driver behavior definition is tablegen, and thus it's relatively easy to add new features. The definition of the driver is transformed into a set of C++ classes, thus no runtime interpretation is needed.

LLVM 2.3 includes a completely rewritten interface for Link Time Optimization. This interface is written in C, which allows for easier integration with C code bases, and incorporates improvements we learned about from the first incarnation of the interface.

The Kaleidoscope tutorial now includes a "port" of the tutorial that uses the Ocaml bindings to implement the Kaleidoscope language.

LLVM 2.3 fully supports the llvm-gcc 4.2 front-end, and includes support for the C, C++, Objective-C, Ada, and Fortran front-ends. llvm-gcc 4.2 includes numerous fixes to better support the Objective-C front-end. Objective-C now works very well on Mac OS X.

Fortran EQUIVALENCEs are now supported by the gfortran front-end.

llvm-gcc 4.2 includes many other fixes which improve conformance with the relevant parts of the GCC testsuite.

New features include: LLVM IR now directly represents "common" linkage, instead of representing it as a form of weak linkage.

LLVM IR now has support for atomic operations, and this functionality can be accessed through the llvm-gcc "__sync_synchronize", "__sync_val_compare_and_swap", and related builtins. Support for atomics is available in the Alpha, X86, X86-64, and PowerPC backends.
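A minimal sketch of how the two builtins named above are typically used from source code follows. The helper names atomicIncrement and publish are hypothetical; only __sync_val_compare_and_swap and __sync_synchronize come from the text. The sketch assumes a GCC-compatible compiler and says nothing about the code a particular backend emits.

```cpp
#include <cassert>

// Atomically increment *Counter with a compare-and-swap retry loop.
// Returns the value the counter held just before the increment.
int atomicIncrement(int *Counter) {
  assert(Counter && "null counter");
  int Old = *Counter;
  for (;;) {
    // The builtin returns the value found in *Counter; the swap took
    // place only if that value was equal to Old.
    int Seen = __sync_val_compare_and_swap(Counter, Old, Old + 1);
    if (Seen == Old)
      return Old;
    Old = Seen; // Someone else changed the counter; retry with its value.
  }
}

// Store a payload, issue a full memory barrier, then raise a flag, so
// that the payload store is ordered before the flag store.
void publish(int *Slot, int *Ready, int Value) {
  assert(Slot && Ready && "null pointer");
  *Slot = Value;
  __sync_synchronize();
  *Ready = 1;
}
```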

The C and Ocaml bindings have been extended to cover pass managers, several transformation passes, iteration over the LLVM IR, target data, and parameter attribute lists.

In addition to a huge array of bug fixes and minor performance tweaks, the LLVM 2.3 optimizers support a few major enhancements:

Loop index set splitting is on by default. This transformation hoists conditions from loop bodies and reduces a loop's iteration space to improve performance. For example,

    for (i = LB; i < UB; ++i)
      if (i <= NV)
        LOOP_BODY

is transformed into:

    NUB = min(NV+1, UB)
    for (i = LB; i < NUB; ++i)
      LOOP_BODY
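The equivalence of the two loop forms in the loop index set splitting example can be checked with a small hand-written sketch. Here the loop body is assumed, for illustration only, to add i to a running sum; sumGuarded, sumSplit and checkEquivalent are hypothetical names, and the second function is written by hand rather than produced by the optimizer. The bound is computed in a wider type so that NV+1 cannot overflow.

```cpp
#include <algorithm>
#include <cassert>

// Original form: the condition is evaluated on every iteration.
long long sumGuarded(int LB, int UB, int NV) {
  long long Sum = 0;
  for (int i = LB; i < UB; ++i)
    if (i <= NV)
      Sum += i; // stands in for LOOP_BODY
  return Sum;
}

// Split form: the condition is folded into the upper bound, so the
// loop body runs unconditionally over a smaller iteration space.
long long sumSplit(int LB, int UB, int NV) {
  long long NUB = std::min<long long>((long long)NV + 1, UB);
  long long Sum = 0;
  for (long long i = LB; i < NUB; ++i)
    Sum += i; // stands in for LOOP_BODY
  return Sum;
}

// Asserts that both forms agree for one choice of bounds.
void checkEquivalent(int LB, int UB, int NV) {
  assert(sumGuarded(LB, UB, NV) == sumSplit(LB, UB, NV));
}
```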

LLVM now includes a new memcpy optimization pass which removes dead memcpy calls, unneeded copies of aggregates, and performs return slot optimization. The LLVM optimizer now notices long sequences of consecutive stores and merges them into memcpys where profitable.

Alignment detection for vector memory references and for memcpy and memset is now more aggressive.

The Aggressive Dead Code Elimination (ADCE) optimization has been rewritten to make it both faster and safer in the presence of code containing infinite loops. Some of its prior functionality has been factored out into the loop deletion pass, which is safe for infinite loops. The new ADCE pass is no longer based on control dependence, making it run faster.

The 'SimplifyLibCalls' pass, which optimizes calls to libc and libm functions for C-based languages, has been rewritten to be a FunctionPass instead of a ModulePass. This allows it to be run more often and to be included at -O1 in llvm-gcc. It was also extended to include more optimizations, and several corner case bugs were fixed.

LLVM now includes a simple 'Jump Threading' pass, which attempts to simplify conditional branches using information about predecessor blocks, simplifying the control flow graph. This pass is pretty basic at this point, but catches some important cases and provides a foundation to build on.
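The general idea behind jump threading can be sketched at the source level. In the hypothetical function before below, the value of Flag at the second branch is fully determined by which predecessor path was taken, so each path can jump straight to the successor it would eventually reach. The function after is a hand-written picture of that result; it illustrates the technique in general and is not output from the LLVM pass.

```cpp
#include <cassert>

// Before: the second branch re-tests a value that each predecessor
// path already fixed to a constant.
int before(bool Cond, int X) {
  bool Flag;
  if (Cond) {
    X += 1;
    Flag = true;
  } else {
    X -= 1;
    Flag = false;
  }
  if (Flag)
    return X * 2;
  return X * 3;
}

// After: each path is "threaded" directly to its final destination,
// and the redundant test of Flag disappears.
int after(bool Cond, int X) {
  if (Cond)
    return (X + 1) * 2;
  return (X - 1) * 3;
}

// Asserts that both forms compute the same result for one input.
void checkSame(bool Cond, int X) {
  assert(before(Cond, X) == after(Cond, X));
}
```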

Several corner case bugs which could lead to deleting volatile memory accesses have been fixed.

Several optimizations have been sped up, leading to faster code generation with the same code quality.

We put a significant amount of work into the code generator infrastructure, which allows us to implement more aggressive algorithms and make it run faster: The code generator now has support for carrying information about memory references throughout the entire code generation process, via the MachineMemOperand class. In the future this will be used to improve both pre-pass and post-pass scheduling, and to improve compiler-debugging output.

The target-independent code generator infrastructure now uses LLVM's APInt class to handle integer values, which allows it to support integer types larger than 64 bits (for example i128). Note that support for such types is also dependent on target-specific support. Use of APInt is also a step toward support for non-power-of-2 integer sizes.

LLVM 2.3 includes several compile time speedups for code with large basic blocks, particularly in the instruction selection phase, register allocation, scheduling, and tail merging/jump threading.

LLVM 2.3 includes several improvements which make llc's --view-sunit-dags visualization of scheduling dependency graphs easier to understand.

The code generator allows targets to write patterns that generate subreg references directly in .td files now.

memcpy lowering in the backend is more aggressive, particularly for memcpy calls introduced by the code generator when handling pass-by-value structure argument copies.

Inline assembly with multiple register results now returns those results directly in the appropriate registers, rather than going through memory. Inline assembly that uses constraints like "ir" with immediates now uses the 'i' form when possible instead of always loading the value in a register. This saves an instruction and reduces register use.
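For readers unfamiliar with the two inline assembly shapes mentioned above, the sketch below shows one statement with two register outputs and one statement with an "ir" constraint fed by an immediate. It assumes a GCC-compatible compiler using AT&T syntax on X86; on other targets it falls back to plain C++ so that it still compiles. The function names divmod and addFive are hypothetical.

```cpp
#include <cassert>

// Divide N by D, producing quotient and remainder via two register
// outputs of a single inline assembly statement.
void divmod(unsigned N, unsigned D, unsigned *Q, unsigned *R) {
  assert(D != 0 && "division by zero");
#if defined(__i386__) || defined(__x86_64__)
  unsigned Quot, Rem;
  // DIV divides EDX:EAX by its operand; the quotient is left in EAX
  // ("=a") and the remainder in EDX ("=d"). Inputs "0" and "1" reuse
  // those same registers for the low and high halves of the dividend.
  __asm__("divl %4"
          : "=a"(Quot), "=d"(Rem)
          : "0"(N), "1"(0u), "r"(D)
          : "cc");
  *Q = Quot;
  *R = Rem;
#else
  *Q = N / D;
  *R = N % D;
#endif
}

// Add a constant using an "ir" operand: the compiler may encode the 5
// as an immediate instead of first loading it into a register.
int addFive(int X) {
#if defined(__i386__) || defined(__x86_64__)
  __asm__("addl %1, %0" : "+r"(X) : "ir"(5) : "cc");
#else
  X += 5;
#endif
  return X;
}
```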

Added support for PIC/GOT style tail calls on X86/32 and initial support for tail calls on PowerPC 32 (it may also work on PowerPC 64 but is not thoroughly tested).

New target-specific features include: llvm-gcc's X86-64 ABI conformance is far improved, particularly in the area of passing and returning structures by value. llvm-gcc compiled code now interoperates very well on X86-64 systems with other compilers.

Support for Win64 was added. This includes code generation itself, JIT support, and necessary changes to llvm-gcc.

The LLVM X86 backend now supports the SSE 4.1 instruction set, and the llvm-gcc 4.2 front-end supports the SSE 4.1 compiler builtins. Various generic vector operations (insert/extract/shuffle) are much more efficient when SSE 4.1 is enabled. The JIT automatically takes advantage of these instructions, but llvm-gcc must be explicitly told to use them, e.g. with -march=penryn.

The X86 backend now does a number of optimizations that aim to avoid converting numbers back and forth from SSE registers to the X87 floating point stack. This is important because most X86 ABIs require return values to be on the X87 Floating Point stack, but most CPUs prefer computation in the SSE units.

The X86 backend supports stack realignment, which is particularly useful for vector code on OS's without 16-byte aligned stacks, such as Linux and Windows.

The X86 backend now supports the "sseregparm" options in GCC, which allow functions to be tagged as passing floating point values in SSE registers.

Trampolines (taking the address of a nested function) now work on Linux/X86-64.

__builtin_prefetch is now compiled into the appropriate prefetch instructions instead of being ignored.

128-bit integers are now supported on X86-64 targets. This can be used through __attribute__((TImode)) in llvm-gcc.
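The sketch below shows how __builtin_prefetch and a 128-bit integer type might appear in user code. It assumes a GCC-compatible compiler on a 64-bit target. Note that GCC's documented spelling of the mode attribute is __attribute__((mode(TI))), which is what the example uses. The names u128, mulHigh and sumWithPrefetch, and the prefetch distance of 16 elements, are hypothetical choices for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <stdint.h>

// 128-bit unsigned integer declared through the GCC mode attribute.
typedef unsigned int u128 __attribute__((mode(TI)));

// Full 64x64 -> 128-bit multiply; returns the high 64 bits.
uint64_t mulHigh(uint64_t A, uint64_t B) {
  u128 Product = (u128)A * (u128)B;
  return (uint64_t)(Product >> 64);
}

// Sum an array while hinting that data a few elements ahead will be
// needed soon. The prefetch is only a hint; it never changes results.
long long sumWithPrefetch(const int *Data, size_t N) {
  assert((Data != 0 || N == 0) && "null data with nonzero length");
  long long Sum = 0;
  for (size_t i = 0; i < N; ++i) {
    if (i + 16 < N)
      __builtin_prefetch(&Data[i + 16]);
    Sum += Data[i];
  }
  return Sum;
}
```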

The register allocator can now rematerialize PIC-base computations, which is an important optimization for register use.

The "t" and "f" inline assembly constraints for the X87 floating point stack now work. However, the "u" constraint is still not fully supported.

New target-specific features include: The LLVM C backend now supports vector code.

The Cell SPU backend includes a number of improvements. It generates better code and its stability/completeness is improving.

New features include: LLVM now builds with GCC 4.3.

Bugpoint now supports running custom scripts (with the -run-custom option) to determine how to execute the command and whether it is making forward progress.

LLVM is known to work on the following platforms: Intel and AMD machines (IA32) running Red Hat Linux, Fedora Core and FreeBSD (and probably other unix-like systems).

PowerPC and X86-based Mac OS X systems, running 10.3 and above in 32-bit and 64-bit modes.

Intel and AMD machines running on Win32 using MinGW libraries (native).

Intel and AMD machines running on Win32 with the Cygwin libraries (limited support is available for native builds with Visual C++).

Sun UltraSPARC workstations running Solaris 10.

Alpha-based machines running Debian GNU/Linux.

Itanium-based (IA64) machines running Linux and HP-UX.

The core LLVM infrastructure uses GNU autoconf to adapt itself to the machine and operating system on which it is built. However, minor porting may be required to get LLVM to work on new platforms. We welcome your portability patches and reports of successful builds or error messages.

This section contains all known problems with the LLVM system, listed by component. As new problems are discovered, they will be added to these sections. If you run into a problem, please check the LLVM bug database and submit a bug if there isn't already one.

The following components of this LLVM release are either untested, known to be broken or unreliable, or are in early development. These components should not be relied on, and bugs should not be filed against them, but they may be useful to some people. In particular, if you would like to work on one of these components, please contact us on the LLVMdev list.

The MSIL, IA64, Alpha, SPU, and MIPS backends are experimental.

For llc, "-filetype=asm" (the default) is the only supported value for the -filetype option.

The X86 backend does not yet support all inline assembly that uses the X86 floating point stack. It supports the 'f' and 't' constraints, but not 'u'.

The X86 backend generates inefficient floating point code when configured to generate code for systems that don't have SSE2.

Win64 code generation wasn't widely tested. Everything should work, but we expect small issues to happen. Also, llvm-gcc currently cannot build the mingw64 runtime because of several bugs caused by the lack of support for the 'u' inline assembly constraint and for X87 floating point inline assembly.

The X86-64 backend does not yet support position-independent code (PIC) generation on Linux targets.

The X86-64 backend does not yet support the LLVM IR instruction va_arg. Currently, the llvm-gcc front-end supports variadic argument constructs on X86-64 by lowering them manually.

The Linux PPC32/ABI support needs testing for the interpreter and static compilation, and lacks support for debug information.

Thumb mode works only on ARMv6 or higher processors. On sub-ARMv6 processors, thumb programs can crash or produce wrong results (PR1388).

Compilation for ARM Linux OABI (old ABI) is supported, but not fully tested.

There is a bug in QEMU-ARM (<= 0.9.0) which causes it to incorrectly execute programs compiled with LLVM. Please use more recent versions of QEMU.

The SPARC backend only supports the 32-bit SPARC ABI (-m32); it does not support the 64-bit SPARC ABI (-m64).

On 21164s, some rare FP arithmetic sequences which may trap do not have the appropriate nops inserted to ensure restartability.

The Itanium backend is highly experimental, and has a number of known issues. We are looking for a maintainer for the Itanium backend. If you are interested, please contact the llvmdev mailing list.

The C backend has only basic support for inline assembly code.

The C backend violates the ABI of common C++ programs, preventing intermixing between C++ compiled by the CBE and C++ code compiled with llc or native compilers.

The C backend does not support all exception handling constructs.

llvm-gcc does not currently support Link-Time Optimization on most platforms "out-of-the-box". Please inquire on the llvmdev mailing list if you are interested. The only major language feature of GCC not supported by llvm-gcc is the __builtin_apply family of builtins. However, some extensions are only supported on some targets. For example, trampolines are only supported on some targets (these are used when you take the address of a nested function). If you run into GCC extensions which are not supported, please let us know.

The C++ front-end is considered to be fully tested and works for a number of non-trivial programs, including LLVM itself, Qt, Mozilla, etc. Exception handling works well on the X86 and PowerPC targets, including X86-64 Darwin. This works when linking to a libstdc++ compiled by GCC. It is supported on X86-64 Linux, but that is disabled by default in this release.