Function Multiversioning

Created November 12, 2012

Description

This support is available in GCC 4.8 and later. Support is only available in C++ for i386 targets.

Frequently executed functions in applications are sometimes built in several versions to take advantage of specific features of the hardware that executes the application. For example, a function may be compiled to use SSE4 instructions if the hardware supports them. This, however, leaves the developer with the burden of writing the dispatching mechanism that executes the right version at runtime. The aim of this project is to make it easy for the developer to specify multiple versions of a function, each tailored to a specific target ISA feature. GCC then takes care of creating the dispatching code necessary to execute the right function version. With this support, here is a simple example of how to create function versions:

    #include <cassert>

    __attribute__ ((target ("default")))
    int foo ()
    {
      // The default version of foo.
      return 0;
    }

    __attribute__ ((target ("sse4.2")))
    int foo ()
    {
      // foo version for SSE4.2.
      return 1;
    }

    __attribute__ ((target ("arch=atom")))
    int foo ()
    {
      // foo version for arch=atom.
      return 2;
    }

    __attribute__ ((target ("arch=amdfam10")))
    int foo ()
    {
      // foo version for arch=amdfam10.
      return 3;
    }

    int main ()
    {
      // Both the pointer and the direct call go through the dispatcher.
      int (*p)() = &foo;
      assert ((*p) () == foo ());
      return 0;
    }

In the above example, four versions of function foo are created. The first version of foo, with the target attribute "default", is the default version. This version gets executed when no other target-specific version qualifies for execution on a particular platform. A new version of foo is created by using the same function signature but with a different target string. Function foo is called, or a pointer to it is taken, just like a regular function. With the new support, GCC takes care of doing the dispatching to call the right version at runtime.

GCC supports FunctionSpecificOpt, which makes it possible to version functions. Each function can be compiled with customized target options, and this is used to create function versions.

Only the "default" target attribute specifies the default function version

In the above example, the default version must be tagged with the target attribute string "default". A function declaration with no target attributes does not declare a new version. This has been done to support existing code, with no function versions, which looks like the following example:

    int foo ();

    __attribute__ ((target ("sse4.2")))
    int foo ()
    {
      ...
      return 0;
    }

In this example, the declaration has no target attributes but the definition does. However, they are regarded as the same function: GCC supports attribute merging, where the target attributes of the definition are merged with the declaration when the two decls are merged. Treating the declaration as a separate function version would have broken such code.

C++ front end support

With the new support, the front end does the following:

Determine if two function decls with the same signature are versions.

Determine the assembler name of a function version.

Process a call/pointer to a function version.

Are two function decls with the same signature versions?

Two function decls with the same signature are versions if and only if both are tagged with the function attribute "target" and the target attribute strings differ.
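The rule above can be sketched as a small predicate. This is only an illustration: the Decl struct and the are_versions function are hypothetical stand-ins invented for this example and are not GCC's internal types or functions.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a function declaration (not a GCC type).
struct Decl {
  std::string signature;  // e.g. "int foo()"
  bool has_target;        // true if the decl carries a target attribute
  std::string target;     // the target attribute string, e.g. "sse4.2"
};

// Two decls are versions if and only if they have the same signature,
// both carry a target attribute, and the attribute strings differ.
bool are_versions(const Decl &a, const Decl &b) {
  // Invariant of this sketch: no attribute means an empty string.
  assert(a.has_target || a.target.empty());
  assert(b.has_target || b.target.empty());
  return a.signature == b.signature
         && a.has_target && b.has_target
         && a.target != b.target;
}
```

Under this rule, a plain declaration "int foo ();" paired with an "sse4.2" definition is not a version pair, which matches the attribute-merging case described earlier.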

What is the assembler name of a function version?

The assembler name of a function version is the default assembler name suffixed with the target attribute string. For example, the function version

void foo () __attribute__ ((target ("ssse3")));

gets the assembler name:

_Z3foov.ssse3

The only exception to this is the default version tagged with target attribute string "default". The default version retains the original assembler name and is not changed.
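As a minimal sketch of the naming rule described here, the helper below appends the target string to the default assembler name and leaves the default version unchanged. The function name version_assembler_name is hypothetical, and the sketch assumes the target string is used verbatim; any normalization GCC may apply to the string before using it as a suffix is not modeled.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: derive the assembler name of a function version
// from the default (mangled) name and the target attribute string.
std::string version_assembler_name(const std::string &default_name,
                                   const std::string &target) {
  assert(!default_name.empty());
  // The "default" version keeps the original assembler name.
  if (target == "default")
    return default_name;
  // Every other version gets ".<target string>" appended.
  return default_name + "." + target;
}
```

For the example in the text, passing "_Z3foov" and "ssse3" yields "_Z3foov.ssse3".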

How to process a call/pointer to a function version?

When the front-end sees a call (pointer) to a function version, it generates a new dispatcher function decl and replaces the existing call (pointer) to be a call (pointer) to the dispatcher function. The body of the dispatcher function is later generated when building the call graph. The dispatcher function has the logic to determine the right function version at runtime. At run-time, calling the function version, either directly or indirectly, will invoke the dispatcher logic which will execute the right function version.

CGRAPH changes

The call graph data structures maintain the following information regarding multiversioned functions:

Is this function a version?

For this function version, what are the other semantically equivalent function versions?

Is this function a dispatcher of a set of function versions?

The cgraph_function_version_info structure is used to maintain the function version information. It has the following fields:

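    struct GTY(()) cgraph_function_version_info {
      /* The cgraph node for which this version information is stored.  */
      struct cgraph_node *this_node;

      /* Previous entry in the chain of semantically identical function
         versions.  The first entry in the chain is the default version.  */
      struct cgraph_function_version_info *prev;

      /* Next entry in the chain of semantically identical function
         versions.  */
      struct cgraph_function_version_info *next;

      /* Decl that ties a function version to its dispatcher function.  */
      tree dispatcher_resolver;
    };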

A map is created from a cgraph node to a cgraph_function_version_info struct for every function version and the associated dispatcher function. The cgraph_function_version_info structs of all the semantically identical function versions are chained as a doubly-linked list. The first version in this list is the default function. A dispatcher function has a pointer to the chain of function versions it dispatches.
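The chaining described above can be illustrated with a simplified, standalone model. The Node and VersionInfo types and both helper functions are assumptions made for this sketch; they mirror the prev/next fields of cgraph_function_version_info but are not GCC code.

```cpp
#include <cassert>
#include <string>

// Simplified stand-in for a cgraph node (not the GCC type).
struct Node {
  std::string asm_name;
};

// Simplified model of the version info record: one per function version,
// chained as a doubly-linked list whose first entry is the default version.
struct VersionInfo {
  Node *this_node;
  VersionInfo *prev;
  VersionInfo *next;
};

// Append version v to the end of the chain that starts at head.
void append_version(VersionInfo *head, VersionInfo *v) {
  assert(head != nullptr && v != nullptr);
  VersionInfo *tail = head;
  while (tail->next != nullptr)
    tail = tail->next;
  tail->next = v;
  v->prev = tail;
  v->next = nullptr;
}

// Walk back to the first entry in the chain, i.e. the default version.
VersionInfo *default_version(VersionInfo *v) {
  assert(v != nullptr);
  while (v->prev != nullptr)
    v = v->prev;
  return v;
}
```

Because the default version is always first, any version can reach it by following prev, and a dispatcher only needs a pointer to the head to visit every version it dispatches.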

Dispatcher Function Body

The body of the dispatcher function for function versions is generated when the cgraph is analyzed for needed functions. The dispatcher function body contains code that calls the runtime CPU check builtins to check whether the CPU supports the target features needed to execute a particular function version. The default function version is executed if no other function version is appropriate on a particular platform. To keep the cost of dispatching low, the IFUNC mechanism is used for dispatching. This makes the call to the dispatcher a one-time cost during startup, after which a call to a function version is a single indirect jump instruction.
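To make the dispatch logic concrete, here is a hand-written approximation of what a dispatcher does. It is a sketch under stated assumptions, not the code GCC generates: foo_default, foo_sse42, resolve_foo and foo_dispatch are hypothetical names, and a cached function pointer stands in for IFUNC, which in reality is resolved by the dynamic loader. The builtins __builtin_cpu_init and __builtin_cpu_supports are the runtime CPU check builtins GCC provides on x86; on other targets the sketch simply falls back to the default version.

```cpp
#include <cassert>

// Hand-written stand-ins for two versions of the same function.
static int foo_default() { return 0; }
static int foo_sse42() { return 1; }

using foo_fn = int (*)();

// Resolver: runs the CPU checks and returns the version to execute.
// Versions are checked from highest to lowest dispatch priority; the
// default version is used when nothing else qualifies.
static foo_fn resolve_foo() {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
  __builtin_cpu_init();
  if (__builtin_cpu_supports("sse4.2"))
    return foo_sse42;
#endif
  return foo_default;
}

// Dispatcher entry point: the resolver runs once, on the first call,
// and every later call is an indirect call through the cached pointer.
int foo_dispatch() {
  static const foo_fn selected = resolve_foo();
  assert(selected != nullptr);
  return selected();
}
```

The one-time resolution is the point of the design: the feature checks are paid for once rather than on every call.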

Dispatch Priority

Given a set of function versions, what is the order in which the versions should be dispatched? This is answered by assigning a dispatch priority to each function version based on the target attributes. For example, a function version targeted for SSE4.2 will have a higher dispatch priority than a version targeted for SSE2, that is, function versions with more advanced features get higher priority. The priority of the target features is determined by the target.

x86 Priority

The following is the priority (increasing order) of ISA features for the x86 architecture:

MMX

SSE

SSE2

SSE3

SSSE3

SSE4.1

SSE4.2

POPCNT

AVX

AVX2

When a function version with target attribute "arch=<processor X>" is compared with a function version with attribute "<ISA Y>", the highest priority of any ISA supported by processor X is compared to the priority of ISA Y to decide which version should be dispatched first. If the priorities are the same, the version with "arch=" is given precedence. For instance, when comparing "arch=corei7" and "popcnt", "arch=corei7" wins: the highest-priority ISA supported by corei7 is popcnt, so the priorities tie and the "arch=" version takes precedence.
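The comparison rule can be sketched as follows. The enum values follow the increasing priority order listed above; the VersionTarget struct and the dispatch_before function are hypothetical names introduced for this illustration, and the caller is assumed to supply, for an "arch=" version, the priority of the highest ISA that processor supports.

```cpp
#include <cassert>

// ISA features in increasing dispatch priority, as listed for x86.
enum IsaPriority {
  P_MMX = 1, P_SSE, P_SSE2, P_SSE3, P_SSSE3,
  P_SSE4_1, P_SSE4_2, P_POPCNT, P_AVX, P_AVX2
};

// Hypothetical description of a version's target for comparison purposes.
struct VersionTarget {
  bool is_arch;          // true for "arch=<processor>" versions
  IsaPriority priority;  // for "arch=": highest ISA the processor supports
};

// Returns true if version a should be dispatched before version b.
bool dispatch_before(const VersionTarget &a, const VersionTarget &b) {
  assert(a.priority >= P_MMX && a.priority <= P_AVX2);
  assert(b.priority >= P_MMX && b.priority <= P_AVX2);
  // Higher priority wins outright.
  if (a.priority != b.priority)
    return a.priority > b.priority;
  // On a tie, the "arch=" version takes precedence.
  return a.is_arch && !b.is_arch;
}
```

With "arch=corei7" modeled as {true, P_POPCNT} and "popcnt" as {false, P_POPCNT}, the priorities tie and the "arch=" version is dispatched first, as in the example in the text.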