From Hybridthreads Wiki

About LLVM

LLVM (Low-Level Virtual Machine) can refer to three things:

The open-source compiler project started at UIUC and now funded by Apple.

The "LLVM suite," comprising what might be called the middle- and back-end of the LLVM compiler and contrasted with "llvm-gcc," which is the front-end of the compiler.

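The intermediate representation (IR) language used by the LLVM compiler. This IR has three equivalent forms: a human-readable text file (.ll), a binary bitcode file (.bc), and an in-memory representation used by the compiler itself. Tools are provided to translate between these forms; for example, llvm-as assembles the text form into bitcode and llvm-dis converts bitcode back to text.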

To get LLVM up and running on one of the lab workstations, see the LLVM install instructions.

OpenMP in LLVM: An Example

Here's a vector add in C:

    void vector_add(int *, int *, int *, int);

    int main()
    {
        int *a, *b, *c, n;
        vector_add(a, b, c, n);
        return 0;
    }

    void vector_add(int *a, int *b, int *c, int num)
    {
        int i;
        #pragma omp parallel shared(a,b,c) private(i)
        {
            #pragma omp for schedule(dynamic) nowait
            for (i = 0; i < num; i++) {
                a[i] = b[i] + c[i];
            }
        }
        return;
    }

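This is the simplest possible implementation, and it is sub-optimal. The for pragma divides the loop iterations among the threads of the enclosing parallel region. With schedule(dynamic) and no chunk size given (as above), the chunk size defaults to 1, so the example is split into num chunks of one iteration each, and each idle thread asks the runtime for the next chunk. It does not spawn num threads: the size of the thread team is set by the parallel region (for example, through the OMP_NUM_THREADS environment variable). Handing out work one iteration at a time makes the scheduling overhead large compared with the single addition each chunk performs.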
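To make the effect of the chunk size concrete, here is a minimal sketch of the same loop with an explicit chunk size. The names vector_add_chunked and chunk_count are hypothetical and do not appear in adder.c; chunk_count only illustrates how many chunks a dynamic schedule would hand out, it is not part of OpenMP. If the file is compiled without -fopenmp, the pragma is ignored and the loop runs serially with the same result.

```c
#include <assert.h>

/* Hypothetical helper: number of chunks a loop of `num` iterations is
   split into when each chunk holds `chunk` iterations (the last chunk
   may be shorter). */
int chunk_count(int num, int chunk)
{
    assert(chunk > 0);
    if (num <= 0)
        return 0;
    return (num + chunk - 1) / chunk;   /* ceiling division */
}

/* Hypothetical variant of vector_add with an explicit chunk size.
   The loop variable of an OpenMP for loop is private automatically. */
void vector_add_chunked(int *a, const int *b, const int *c, int num, int chunk)
{
    assert(chunk > 0);
    #pragma omp parallel for schedule(dynamic, chunk)
    for (int i = 0; i < num; i++) {
        a[i] = b[i] + c[i];
    }
}
```

For a loop of 1000 iterations, a chunk size of 1 gives 1000 requests to the scheduler, while a chunk size of 64 (an arbitrary value chosen for illustration) gives 16. A suitable chunk size depends on the cost of one iteration and on the number of threads, so it is best found by measurement.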

Compiling adder.c without OpenMP enabled (llvm-gcc adder.c -emit-llvm -S -o adder.ll) results in the following LLVM IR for the vector_add function:

    define void @vector_add(i32* %a, i32* %b, i32* %c, i32 %num) nounwind {
    entry:
      %a_addr = alloca i32*  ; <i32**> [#uses=2]
      %b_addr = alloca i32*  ; <i32**> [#uses=2]
      %c_addr = alloca i32*  ; <i32**> [#uses=2]
      %num_addr = alloca i32  ; <i32*> [#uses=2]
      %i = alloca i32  ; <i32*> [#uses=7]
      %"alloca point" = bitcast i32 0 to i32  ; <i32> [#uses=0]
      store i32* %a, i32** %a_addr
      store i32* %b, i32** %b_addr
      store i32* %c, i32** %c_addr
      store i32 %num, i32* %num_addr
      store i32 0, i32* %i, align 4
      br label %bb1

    bb:  ; preds = %bb1
      %0 = load i32** %b_addr, align 8  ; <i32*> [#uses=1]
      %1 = load i32* %i, align 4  ; <i32> [#uses=1]
      %2 = sext i32 %1 to i64  ; <i64> [#uses=1]
      %3 = getelementptr i32* %0, i64 %2  ; <i32*> [#uses=1]
      %4 = load i32* %3, align 4  ; <i32> [#uses=1]
      %5 = load i32** %c_addr, align 8  ; <i32*> [#uses=1]
      %6 = load i32* %i, align 4  ; <i32> [#uses=1]
      %7 = sext i32 %6 to i64  ; <i64> [#uses=1]
      %8 = getelementptr i32* %5, i64 %7  ; <i32*> [#uses=1]
      %9 = load i32* %8, align 4  ; <i32> [#uses=1]
      %10 = add i32 %4, %9  ; <i32> [#uses=1]
      %11 = load i32** %a_addr, align 8  ; <i32*> [#uses=1]
      %12 = load i32* %i, align 4  ; <i32> [#uses=1]
      %13 = sext i32 %12 to i64  ; <i64> [#uses=1]
      %14 = getelementptr i32* %11, i64 %13  ; <i32*> [#uses=1]
      store i32 %10, i32* %14, align 4
      %15 = load i32* %i, align 4  ; <i32> [#uses=1]
      %16 = add i32 %15, 1  ; <i32> [#uses=1]
      store i32 %16, i32* %i, align 4
      br label %bb1

    bb1:  ; preds = %bb, %entry
      %17 = load i32* %i, align 4  ; <i32> [#uses=1]
      %18 = load i32* %num_addr, align 4  ; <i32> [#uses=1]
      %19 = icmp slt i32 %17, %18  ; <i1> [#uses=1]
      br i1 %19, label %bb, label %bb2

    bb2:  ; preds = %bb1
      br label %return

    return:  ; preds = %bb2
      ret void
    }
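One way to read the unoptimized IR above is to translate it back into C by hand, one label per basic block. The following is a sketch of that reading, not compiler output: the function name vector_add_cfg and the temporaries t4 and t9 (named after the IR values %4 and %9) are made up for illustration, and the IR block "return" is renamed ret_block because return is a C keyword.

```c
/* Hand-written C mirror of the unoptimized IR for vector_add. */
void vector_add_cfg(int *a, int *b, int *c, int num)
{
    /* entry: each parameter and local is copied into its own stack
       slot (the alloca and store instructions), then control jumps
       to the loop header. */
    int *a_addr = a;
    int *b_addr = b;
    int *c_addr = c;
    int num_addr = num;
    int i = 0;
    goto bb1;

bb: /* loop body: pointers and i are reloaded from their slots */
    {
        int t4 = b_addr[i];     /* %0 .. %4: load b[i] */
        int t9 = c_addr[i];     /* %5 .. %9: load c[i] */
        a_addr[i] = t4 + t9;    /* %10 .. %14 and the store to a[i] */
        i = i + 1;              /* %15, %16 and the store to i */
    }
    goto bb1;

bb1: /* loop header: icmp slt on i and num, then a conditional branch */
    if (i < num_addr)
        goto bb;
    else
        goto bb2;

bb2:
    goto ret_block;

ret_block:
    return;
}
```

The repeated loads of i and of the array pointers are typical of code emitted without optimization, where every variable lives in memory. Optimization passes would normally be expected to promote these stack slots to registers and remove the empty bb2 block.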

With OpenMP enabled (llvm-gcc adder.c -fopenmp -emit-llvm -S -o adder-omp.ll), the generated LLVM IR is more complex: