The OpenMP ARB is pleased to announce OpenMP 4.5, a major upgrade of the OpenMP standard language specifications. This release substantially improves support for programming accelerator and GPU devices, and now also supports the parallelization of loops with well-structured dependences. Implementation is underway in GCC and Clang. The new specification is available for download.

Standard for parallel programming extends its reach

With this release, OpenMP, the de facto standard for parallel programming on shared-memory systems, continues to extend its reach beyond pure HPC to include DSPs, real-time systems, and accelerators. OpenMP aims to provide high-level parallel language support for a wide range of applications, from biotech and automotive to aeronautics, automation, robotics and financial analysis.

“OpenMP 4.5 is a significant achievement that demonstrates the industry-wide collaboration and the hard work and dedication within the OpenMP community,” says Michael Wong, CEO of the OpenMP ARB. “It is more than a minor release, representing the road towards OpenMP 5.0 while we continue on a cadence that delivers Technical Reports and/or Ratified Specifications annually, in keeping pace with the marketplace.”

Many new features

Significantly improved support for devices. OpenMP now provides mechanisms for unstructured data mapping and asynchronous execution and also runtime routines for device memory management. These routines allow for allocating, copying and freeing.

Support for doacross loops. A natural mechanism to parallelize loops with well-structured dependences is provided.

New taskloop construct. Support to divide loops into tasks, avoiding the requirement that all threads execute the loop.

Reductions for C/C++ arrays. This often requested feature is now available by building on support for array sections.

New hint mechanisms. Hint mechanisms can provide guidance on the relative priority of tasks and on preferred synchronization implementations.

Thread affinity support. It is now possible to use runtime functions to determine the effect of thread affinity clauses.

Improved support for Fortran 2003. Users can now parallelize many Fortran 2003 programs.

SIMD extensions. These extensions include the ability to specify exact SIMD width and additional data-sharing attributes.
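
As a rough illustration of three of the features listed above, the following C sketch shows how an array-section reduction, a taskloop and a doacross loop might be written. It is not taken from the specification: the function names, the NBINS constant and the grainsize value are hypothetical and chosen only for this example.

```c
#include <assert.h>

#define NBINS 4  /* hypothetical bin count, chosen for illustration */

/* Reduction over a C array section: each thread accumulates into a
   private copy of hist[0..NBINS-1]; the copies are summed at the end. */
void histogram(const int *data, int n, long *hist)
{
    for (int b = 0; b < NBINS; ++b)
        hist[b] = 0;

    #pragma omp parallel for reduction(+ : hist[0:NBINS])
    for (int i = 0; i < n; ++i) {
        assert(data[i] >= 0);        /* keeps the bin index in range */
        hist[data[i] % NBINS] += 1;
    }
}

/* taskloop: a single thread splits the loop into tasks, and any thread
   in the team may execute them. */
void scale(double *x, int n, double a)
{
    #pragma omp parallel
    #pragma omp single
    #pragma omp taskloop grainsize(64)
    for (int i = 0; i < n; ++i)
        x[i] *= a;
}

/* Doacross loop: iteration i waits for iteration i-1 only at the point
   where it needs out[i-1]; the squaring before that can overlap. */
void running_sum_of_squares(const double *in, double *out, int n)
{
    if (n <= 0)
        return;
    out[0] = in[0] * in[0];

    #pragma omp parallel for ordered(1)
    for (int i = 1; i < n; ++i) {
        double t = in[i] * in[i];    /* independent of other iterations */
        #pragma omp ordered depend(sink : i - 1)
        out[i] = out[i - 1] + t;
        #pragma omp ordered depend(source)
    }
}
```

With GCC, the pragmas take effect when the file is compiled with -fopenmp; without that flag they are ignored and the functions run serially with the same results.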

Implementations

Implementation is already almost complete in GCC version 6.0. It is starting in the current trunk of Clang 3.8. Other vendor compilers are following.

OpenMP 4.5 Specifications