This is the mail archive of the libstdc++@gcc.gnu.org mailing list for the libstdc++ project.

[RFC] Proposal to contribute Intel’s implementation of C++17 parallel algorithms

From: "Kukanov, Alexey" <Alexey dot Kukanov at intel dot com>

To: "libstdc++ at gcc dot gnu dot org" <libstdc++ at gcc dot gnu dot org>

Date: Wed, 29 Nov 2017 12:29:52 +0000

Subject: [RFC] Proposal to contribute Intel’s implementation of C++17 parallel algorithms

Hello all,

At Intel, we have developed an implementation of C++17 execution policies for algorithms (often referred to as Parallel STL). We hope to contribute it to libstdc++/GCC, so we would like to ask the community for comments on this.

The code is already published on GitHub (https://github.com/intel/parallelstl). It supports the C++17 standard execution policies (seq, par, par_unseq) as well as the experimental unsequenced policy (unseq) for SIMD execution. At the moment, about half of the C++17 standard algorithms that must support execution policies are implemented; a few more will be ready soon, and the work continues. The tests that we use are also available on GitHub; needless to say, we will contribute those as well.

The implementation is not specific to Intel’s hardware. For thread-level parallelism it uses TBB* (https://www.threadingbuildingblocks.org/) but abstracts it behind an internal API which can be implemented on top of other threading/parallel solutions, so it is for the community to decide which ones to use. For SIMD parallelism (unseq, par_unseq) we use #pragma omp simd directives; these are vendor-neutral and do not require any OpenMP runtime support.

The current implementation meets the spirit but not always the letter of the standard, because it has to be separate from but also coexist with implementations of standard C++ libraries. While preparing the contribution, we will address inconsistencies, adjust the code to meet community standards, and better integrate it into the standard library code.

We are also proposing that our implementation be included in libc++/LLVM. Compatibility between the implementations seems useful, as it can potentially reduce the amount of work for everyone. We hope to keep the code mostly identical, and would like to know if you think this is too optimistic to expect. Obviously we plan to use appropriate open source licenses to meet the different projects’ requirements.
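To illustrate the SIMD approach mentioned above, here is a minimal sketch of a loop annotated with #pragma omp simd. It is not taken from the Parallel STL sources; the names simd_transform and doubled are hypothetical and exist only for this example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Parallel STL): applies `op` to each of the
// `n` elements of `in` and writes the results to `out`.
template <typename T, typename UnaryOp>
void simd_transform(const T* in, T* out, std::size_t n, UnaryOp op) {
    // Tells the compiler that the iterations are independent and may be
    // vectorized. If the compiler does not enable OpenMP SIMD support,
    // the pragma is ignored and the loop runs as ordinary scalar code.
#pragma omp simd
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = op(in[i]);
    }
}

// Hypothetical usage wrapper: returns a copy of `v` with every element doubled.
inline std::vector<float> doubled(const std::vector<float>& v) {
    std::vector<float> result(v.size());
    simd_transform(v.data(), result.data(), v.size(),
                   [](float x) { return x * 2.0f; });
    return result;
}
```

With GCC, the -fopenmp-simd option enables the directive without linking an OpenMP runtime, which matches the "no runtime support" property described in the message.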
We expect to keep developing the code and will take responsibility for maintaining it (with community contributions, of course). If there are other community efforts to implement parallel algorithms, we are willing to collaborate.

We look forward to your feedback, both on the overall idea and, if it is supported, on the next steps we should take.

Regards,
- Alexey Kukanov

* Note that TBB itself is highly portable (and ported by the community to Power and ARM architectures) and permissively licensed, so it could be the base for the threading infrastructure. But the Parallel STL implementation itself does not require TBB.
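As a rough illustration of the idea of hiding the threading library behind an internal API, the sketch below lets an algorithm be written once against a small backend interface and then run on different backends. This is an assumption about what such an abstraction could look like, not the actual Parallel STL internal API; serial_backend, thread_backend, parallel_for and for_each_index are all hypothetical names. The threaded backend uses std::thread rather than TBB, and error handling is omitted.

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical backend interface: a backend provides a static
// parallel_for(first, last, body), where body(begin, end) processes the
// half-open index range [begin, end). Callers must pass first <= last.
struct serial_backend {
    template <typename Body>
    static void parallel_for(std::size_t first, std::size_t last, Body body) {
        body(first, last);  // run the whole range on the calling thread
    }
};

struct thread_backend {
    template <typename Body>
    static void parallel_for(std::size_t first, std::size_t last, Body body) {
        const std::size_t n = last - first;
        std::size_t workers = std::thread::hardware_concurrency();
        if (workers == 0) workers = 2;  // value unknown: pick a small default
        workers = std::min(workers, std::max<std::size_t>(n, 1));
        const std::size_t chunk = (n + workers - 1) / workers;  // ceiling division

        // One thread per contiguous chunk; no thread is started for an empty range.
        std::vector<std::thread> pool;
        for (std::size_t begin = first; begin < last; begin += chunk) {
            const std::size_t end = std::min(begin + chunk, last);
            pool.emplace_back([=] { body(begin, end); });
        }
        for (auto& t : pool) t.join();
    }
};

// Algorithm written once against the backend interface: applies `op` to
// every element of `data`, using whichever backend is selected.
template <typename Backend, typename T, typename Op>
void for_each_index(std::vector<T>& data, Op op) {
    Backend::parallel_for(0, data.size(),
                          [&data, op](std::size_t begin, std::size_t end) {
                              for (std::size_t i = begin; i < end; ++i) op(data[i]);
                          });
}
```

Swapping the Backend template argument changes how the work is scheduled without touching the algorithm, which is the property that would let the community choose the threading solution. On some platforms the std::thread variant needs to be built with -pthread.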