A look at C++14: Papers Part 2

published at 01.04.2013 14:30 by Jens Weller

This is the second part of my C++ Standardization Papers series. The first part was received quite well, with more than 5k views in the first two days. isocpp.org, Phoronix, lwn.net, a lot of Russian blogs and others have linked to it, and there was a nice discussion on reddit. As in Part 1, I want to emphasize that I only cover part of all papers in this blog post. Also, not all of these papers are meant for C++14: modules and concepts, for example, are not going to be part of C++14 (at least this is highly unlikely). Still, I will cover those papers too, as some of them will be discussed in Bristol for sure. All papers can be found here.

Some words on C++14. Unlike C++11, C++14 is not going to change the language a lot. It is meant more to enhance the language with libraries, and to improve or provide bug fixes for C++11. That's why you could call C++14 a minor Standard, with C++17 as the next major C++ Standard; at least you could see this as the current plan and road map for C++. But let's have a look at the papers:

N3551 - Random Number Generation in C++11

For a long time there has been std::rand(), srand() and RAND_MAX for random number generation. C++11 improved the support for random number generation with the header <random>. The C++11 random library is inspired by boost::random, and separates the generation from the distribution: you have a set of generator classes, which you can use with a set of distribution classes. This paper can be seen as a really good and complete tutorial on <random>. It also aims at improving <random> and, like N3547, proposes the introduction of four new random-related functions:

global_urng() - returns an implementation defined global Universal Random Number Generator.

randomize() - sets the above global URNG object to an (ideally) unpredictable state

int pick_a_number(int from, int thru) - returns an int number in the closed range [from, thru]

double pick_a_number(double from, double upto) - returns a double number in the half-open range [from, upto)
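To make the four functions a bit more concrete, here is a minimal sketch of how they could be built on top of the C++11 <random> header. The namespace sketch, the choice of std::default_random_engine and the seeding via std::random_device are my assumptions for illustration; the paper leaves the engine type implementation defined.

```cpp
#include <cassert>
#include <random>

// Hypothetical sketch, not the wording or implementation from N3551.
namespace sketch {

// One shared engine; the concrete engine type is an assumption.
inline std::default_random_engine& global_urng() {
    static std::default_random_engine engine{};
    return engine;
}

// Put the shared engine into an (ideally) unpredictable state.
inline void randomize() {
    std::random_device device;
    global_urng().seed(device());
}

// Integer in the closed range [from, thru].
inline int pick_a_number(int from, int thru) {
    assert(from <= thru);
    std::uniform_int_distribution<int> dist(from, thru);
    return dist(global_urng());
}

// Double in the half-open range [from, upto).
inline double pick_a_number(double from, double upto) {
    assert(from < upto);
    std::uniform_real_distribution<double> dist(from, upto);
    return dist(global_urng());
}

} // namespace sketch
```

With this in place, rolling a die would be sketch::randomize() once, followed by sketch::pick_a_number(1, 6) for each roll. The generator and the distribution stay separate, the helpers only hide that separation for the simple cases.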

N3552 - Introducing Object Aliases

An Object Alias could help to adjust a constant to the right value in the right context. The paper uses the example of pi, where pi can have different precision requirements depending on the context (float, double, long double). The authors show some techniques to resolve this, and discuss how object aliases could be implemented in C++.
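To illustrate the pi problem, here is a small sketch of one possible technique: a variable template, which needs a C++14 compiler. This is not the object alias syntax proposed in N3552, and I don't claim it is one of the techniques from the paper; the names pi and circle_area in the namespace sketch are made up for this example.

```cpp
#include <cassert>

// Hypothetical sketch using a C++14 variable template, not an object alias.
namespace sketch {

// One definition of pi, converted to the precision of the requested type.
template <typename T>
constexpr T pi = T(3.141592653589793238462643383279502884L);

// Area of a circle, computed in the precision of the argument type.
template <typename T>
constexpr T circle_area(T radius) {
    return pi<T> * radius * radius;
}

} // namespace sketch
```

Here sketch::pi<float> and sketch::pi<long double> pick the precision explicitly, while sketch::circle_area(2.0f) picks it from the argument type.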

N3553 - Proposing a C++1y Swap Operator

The term C++1y is mostly used the way C++0x was used before. The paper proposes to introduce a swap operator to C++. This new operator shall be treated as a special member function, and enable the programmer to provide an alternative swap operator implementation to the traditional member-wise swapping. The authors propose this syntax for the swap operator implementation:

//non class-types

inline T& operator :=: (T& x, T&& y) {see below; return x; }
inline T& operator :=: (T& x, T& y) { return x :=: std::move(y); }

//class types

inline C& C::operator:=:(C&& y) & {see below; return *this; }
inline C& C::operator:=:(C &y) & { return *this :=: std::move(y); }

Please read the paper for further details, which simply don't fit in here.

The authors conclude:

"This paper has proposed a swap operator, operator:=: , for addition to C++1Y and has further proposed its application, where viable, as an alternative implementation technique for defaulted class assignment operators. We invite feedback from WG21 participants and other knowledgeable parties, and especially invite implementors to collaborate with us in order to experiment and gain experience with this proposed new language feature."

N3554 - A Parallel Algorithms Library for C++

Very nice. And it's a combined proposal from Microsoft, Intel, and Nvidia. The idea is to provide a parallel version of the <algorithm> header. This goes far beyond running std::sort on multiple threads. Maybe you want to do your sort on the GPU? Maybe do it in a vectorized fashion? At C++Now 2012 there was a very good keynote by Sean Parent (Adobe), mentioning that with the current standard, even with threads, you would not be able to reach the full performance of a machine that has vector units or a GPU. This approach might be an answer to how platform parallelism can be integrated into the C++ Standard. Quoting the authors:

"We introduce three parallel execution policies for parallel algorithm execution: std::seq, std::par, and std::vec , as well as a facility for vendors to provide non-standard execution policies as extensions."

A short example of what is proposed:

std::vector<int> vec = fill_my_vec_with_random_numbers(1024);

// legacy sequential sort
std::sort(vec.begin(), vec.end());

// explicit sequential sort
std::sort(std::seq, vec.begin(), vec.end());

// parallel sort
std::sort(std::par, vec.begin(), vec.end());

// vectorized sort
std::sort(std::vec, vec.begin(), vec.end());

// sort with dynamically-selected execution
size_t threshold = 512;
std::execution_policy exec = std::seq;
if(vec.size() > threshold)
{
    exec = std::par;
}
std::sort(exec, vec.begin(), vec.end());

// parallel sort with non-standard implementation-provided execution policies:
std::sort(vectorize_in_this_thread, vec.begin(), vec.end());
std::sort(submit_to_my_thread_pool, vec.begin(), vec.end());
std::sort(execute_on_that_gpu, vec.begin(), vec.end());
std::sort(offload_to_my_fpga, vec.begin(), vec.end());
std::sort(send_this_computation_to_the_cloud, vec.begin(), vec.end());

This approach would enhance the Standard Library with algorithms that choose their execution target based on a policy argument. The authors state further:

"This proposal is motivated by a strong desire to provide a standard model of parallelism enabling performance portability across the broadest possible range of parallel architectures."

I think it's a very interesting approach, and it's already backed by some of the most important compiler vendors. Still, it's hard to say which improvements to parallelism and threading will end up in C++14, and which will carry on to C++17. There are many proposals about parallelism, which will need to be aligned and unified into a fitting concept of standardization for C++. The C++ Committee Meeting in Bristol will probably give some insight into which proposals will be considered for further standardization of parallelism.

N3555 - A URI Library for C++

This paper is not linked, and you can't see it in the ISO listing at open-std.org. It's commented out in the HTML code, yet it's visible in the listing at isocpp.org. I think it's worth mentioning that this paper is part of the cpp-net library approach, which aims at bringing Network/HTTP support to C++. As the paper is not linked, and officially not visible, I'll link to its predecessor N3407.

N3556 - Thread Local Storage in X-Parallel Computations

This paper deals with ways to standardize Thread Local Storage. As there are different approaches to parallelism, the authors refer to them as X-Parallel, where X could be threads, vectorization, GPU, thread pools, task-based or any other parallelism.

"The purpose of this paper is to develop terminology so that the impact of any X-parallel model on TLS can be described clearly and evaluated effectively."

And this is exactly what this paper does: it deals with Thread Local Storage (TLS) in great detail, and tries to define how to translate this into the C++ Standard. This is a very complex topic, and as such the authors have not reached the point of offering std::thread_local_storage or other approaches. They focus on developing the terminology, so that further work in this field can be done. One of the conclusions the authors draw is that "When discussing any parallel extension to C++, regardless of the X-parallel model, its interaction with TLS must be considered and specified."

For any discussion of such a parallel extension to C++ the authors specify 5 TLS related questions:

Does the X-parallel model meet the minimum concordance guarantee that a TLS access after an X-parallel computation refers to the same object as an access before the X-parallel computation?

What level of thread concordance does the X-parallel model offer for TLS?

What restrictions does the X-parallel model impose on TLS accesses? For example, the model might forbid writing to TLS in parallel.

If races are possible on TLS variables, how can they be resolved or avoided?

If logical and practical, are there new types of X-local storage that should be introduced to support new X-parallelism models?

N3557 - Considering a Fork-Join Parallelism Library

Can fork-join parallelism be brought into the C++ Standard as a library-only solution, without adding new keywords or changing other parts of the C++ language? This is the key question in this proposal. As an example of fork-join parallelism the author names the CilkPlus framework. He was asked by the committee whether it would be possible to add this approach to parallelism to the C++ Standard as a library. There has been a proposal to add Cilk-like features to the language, which was rejected at the Portland Meeting in 2012, because a library solution would have the following advantages:

it does not change the language itself; changes to the language that serve only one purpose are opposed by some committee members

library changes are easier to move through the standardization process than core language changes

library features might be easier to deprecate, once the standard moves on

library features are easier for vendors to implement, and hence reach the market faster

The paper suggests creating a std::task_group interface, which is able to spawn parallel tasks, and can wait with sync() until all tasks have ended. The destructor ~task_group calls sync(), and hence waits until all tasks are finished. In a simple example, this approach can look quite attractive, but the author sees several problems with it, where a language-based solution would be superior:

Enforce strictness

Exception handling

Simple and transparent syntax in more complex situations such as complex parameter expressions and return values.

The author presents a few situations where the library solution falls short of the Cilk solution presented as the language model, and derives possible changes to overcome them. Those library shortcomings, solvable by a language solution, are:

better parameter passing (avoid race conditions)

simpler return value handling

better overload resolution and template instantiation

constructs to enforce strictness

manipulation of exceptions

user-defined control constructs for improved syntax

Each of those points is explained in a short paragraph, please refer to the paper for details. The author also looks at ways to handle this in C, and points out that, due to missing templates and lambdas, a language solution for C is more likely to happen. The conclusion of the author is that a language-based approach will offer programmers easier access to fork-join parallelism than a library-based approach.

N3558 - A Standardized Representation of Asynchronous Operations

The main concern of this paper is std::future and std::shared_future. You can spawn an asynchronous operation with std::future in C++11, but you can't wait for it asynchronously, as std::future::get is blocking. In C++11 there is no way to install a handler for the result of a std::future. This proposal wants to add std::future::then to the standard, which takes such a handler as an argument. Other additions to std::future/std::shared_future are proposed as well:

then - install a handler for the returning future.

unwrap - unwrap the future returned from another future.

ready - a nonblocking test if the future has returned.

when_any/when_all - compose multiple futures, and wait for the first one or for all of them to complete.

make_ready_future - construct a future that is already ready from a value.

All suggested features only have an impact on the Standard Library, no changes to the core language are required. The authors also show a detailed design rationale for each of these proposed functions. IMHO this proposal makes std::future/std::shared_future a lot more useful and usable for asynchronous operations.

N3559 - Proposal for Generic (Polymorphic) Lambda Expressions

C++11 lambdas are implemented as a class with a non-template call operator. When the parameters of a lambda function are of type auto, the anonymous class representing the lambda could contain a templated call operator() as its implementation. The authors propose to:

allow auto type-specifier to indicate a generic lambda parameter

allow conversion from a capture-less generic lambda to an appropriate pointer-to-function

This proposal builds on the Portland proposal for generic lambdas.
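Generic lambdas in this form did end up in C++14, so both points can be tried with a C++14 compiler. The following is a small sketch; add, add_closure and apply are names I made up, and add_closure only approximates the class a compiler might generate for the lambda.

```cpp
#include <cassert>
#include <string>

// Hypothetical sketch; needs a C++14 compiler.
namespace sketch {

// A generic lambda: each auto parameter becomes a template parameter
// of the call operator of the closure type.
const auto add = [](auto a, auto b) { return a + b; };

// Roughly what the closure type looks like when written by hand.
struct add_closure {
    template <typename A, typename B>
    auto operator()(A a, B b) const { return a + b; }
};

// Takes a plain function pointer; a capture-less generic lambda
// converts to it for the matching parameter types.
inline int apply(int (*fn)(int, int), int x, int y) {
    return fn(x, y);
}

} // namespace sketch
```

The same sketch::add works for ints and for std::string, and sketch::apply(sketch::add, 2, 3) shows the conversion to a pointer-to-function.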

N3560 - Proposal for Assorted Extensions to Lambda Expressions

This proposal aims at making lambdas fully callable 'objects'. This paper proposes generic and non-generic extensions to lambda expressions. It builds on the previous N3559 paper, and also references N3418, the Portland proposal for generic lambdas. This paper proposes these 4 new extensions to lambdas:

allow the use of familiar template syntax in lambda expressions

auto LastElement = [](const std::array<T,N>& a) { return N ? a[N-1] : throw "index error"; };

permit lambda body to be an expression

for_each(begin(v), end(v), [](auto &e) e += 42 );

allow auto forms in the trailing return type

auto L = [=](auto f, auto n) -> auto& { return f(n); };

allow generic lambdas with variadic auto parameters

//Example
auto PrinterCurrier = [](auto printer) {
    return [=](auto&& ... a) {
        printer(a ...);
    };
};
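Of these four extensions, the variadic auto parameters can be tried with a C++14 compiler. Here is a small sketch in the spirit of the PrinterCurrier example; printer_currier and count_args are hypothetical names, and instead of printing, the stand-in "printer" just counts its arguments so the result is easy to check.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical sketch; needs a C++14 compiler.
namespace sketch {

// Returns a closure that forwards any number of arguments to `printer`.
const auto printer_currier = [](auto printer) {
    return [=](auto&&... args) {
        return printer(std::forward<decltype(args)>(args)...);
    };
};

// Stand-in for a printer: reports how many arguments it received.
struct count_args {
    template <typename... Ts>
    std::size_t operator()(const Ts&...) const { return sizeof...(Ts); }
};

} // namespace sketch
```

A call like sketch::printer_currier(sketch::count_args{})(1, "two", 3.0) passes three arguments of different types through the inner lambda.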



This paper gives an overview of which things would still be worth changing in the lambda area. Maybe the meeting in Bristol will give further guidance on whether those changes are accepted into C++14.

N3561 - Semantics of Vector Loops

This paper proposes vector loops for C++. It builds on earlier proposals in this area, and states that it is not entirely self-contained. One of the things proposed is simd_for and simd_for_chunk(N). This would make C++ able to use SIMD directly, in this case applied to loops. In short:

"This paper summarizes the set of capabilities we propose for vector programming as part of parallel programming."

N3562 - Executors and Schedulers (revision 1)

A proposal for Executors, objects that can execute units of work packaged as function objects. So this is another possible approach to task-based parallelism, where the executor object is used as a reusable thread that can handle a queue of tasks. One possible implementation of an executor is a thread pool, but other implementations are possible. The paper is based on code that is heavily used internally at Google and Microsoft.

So, what exactly is an executor?

Conceptually, an executor puts closures on a queue and at some point executes them. The queue is always unbounded, so adding a closure to an executor never blocks.

The paper defines a closure to be std::function<void()>. This limits the executor to a simple interface, which has its advantages, but also its limitations. The authors favor a template-less approach to implementing the executor library, and base the implementation on polymorphism and inheritance.
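To show what such a polymorphic, template-less design could look like, here is a minimal single-threaded sketch. The class and member names are my assumptions and only loosely follow the idea described above; they are not quoted from N3562, and a real executor such as a thread pool would of course run the closures on other threads.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>

// Hypothetical sketch, not the interface from the paper.
namespace sketch {

using closure = std::function<void()>;

// Abstract interface: polymorphism and inheritance instead of templates.
class executor {
public:
    virtual ~executor() = default;
    virtual void add(closure work) = 0;  // never blocks, queue is unbounded
    virtual std::size_t num_pending_closures() const = 0;
};

// Runs queued closures on the calling thread when asked to.
class loop_executor : public executor {
public:
    void add(closure work) override { queue_.push(std::move(work)); }

    std::size_t num_pending_closures() const override { return queue_.size(); }

    // Executes the closures that were queued when the call started.
    void run_queued_closures() {
        std::size_t remaining = queue_.size();
        while (remaining-- > 0) {
            closure work = std::move(queue_.front());
            queue_.pop();
            work();
        }
    }

private:
    std::queue<closure> queue_;
};

} // namespace sketch
```

Code that only knows the executor base class can hand over work with add() and does not care whether a loop, a thread pool or something else is behind it.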

N3563 - C++ Mapreduce

The map-reduce algorithm has become a modern workhorse, heavily used by Google and by frameworks like Hadoop that build upon it. This paper aims at adding a C++ mapreduce library to the C++ standard. The paper proposes a couple of interfaces, which are used to implement mapreduce:

mapper_trait< input_type, key_type, value_type >

reduce_trait< key_type, value_type, output_type >

map_reduce_options< mapper, reducer, OutIter >

map_reduce< Iter, OutIter, Mapper, Reducer, Combiner = ..., shard_fn = ..., input_splitter=... >

A previous version of this paper (N3446) was discussed at Portland.
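For readers new to the pattern, here is a tiny sequential word count that walks through the map, group and reduce steps. It is only meant to illustrate the algorithm: it does not use the mapper_trait or map_reduce interfaces from the paper, it runs on a single thread, and all names in it are made up.

```cpp
#include <cassert>
#include <map>
#include <numeric>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sequential sketch of the map-reduce pattern.
namespace sketch {

using key_value = std::pair<std::string, int>;

// Map step: emit (word, 1) for every whitespace-separated word of a line.
inline std::vector<key_value> map_words(const std::string& line) {
    std::vector<key_value> out;
    std::istringstream words(line);
    std::string word;
    while (words >> word) out.emplace_back(word, 1);
    return out;
}

// Reduce step: combine all values emitted for one key.
inline int reduce_counts(const std::vector<int>& values) {
    return std::accumulate(values.begin(), values.end(), 0);
}

// Driver: map every input, group the values by key, reduce each group.
inline std::map<std::string, int> word_count(const std::vector<std::string>& lines) {
    std::map<std::string, std::vector<int>> grouped;
    for (const auto& line : lines)
        for (const auto& kv : map_words(line))
            grouped[kv.first].push_back(kv.second);

    std::map<std::string, int> result;
    for (const auto& entry : grouped)
        result[entry.first] = reduce_counts(entry.second);
    return result;
}

} // namespace sketch
```

In a real map-reduce library the map calls and the reduce calls run in parallel on shards of the input, the grouping step in the middle is what connects them.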

N3564 - Resumable Functions

This paper is related to N3558, which handles extensions for std::future/std::shared_future. This proposal concentrates on resumable functions. While N3558 focuses on extending the asynchronous functions of the standard library, this paper also considers adding language features. It proposes to add the keyword await for resumable functions to C++, which accepts functions returning a std::(shared_)future<T>. A short example:

future<int> f(stream str) resumable
{
    shared_ptr< vector<char> > buf = ...;
    int count = await str.read(512, buf);
    return count + 11;
}

future<void> g() resumable
{
    stream s = ...;
    int pls11 = await f(s);
    s.close();
}

This example could also be implemented with only the changes proposed in N3558, but the authors claim it would be much more complicated, need more code and be harder to debug. Therefore a language-based solution could improve the readability and usability of C++ code using asynchronous functions.

N3565 - IP Address Design Constraints

There is a new Networking Working Group at the Standard Committee, aiming to bring Networking and HTTP to the C++ Standard. This is one of the few papers they have published for Bristol. This paper focuses on discussing the class design for covering IPv4 and IPv6 addresses. There are three concerns that shape the design of an IP address class:

simplicity of use (one class for all)

space concern (two classes)

performance concern (addressed by three or two class design)

"This paper describes different approaches with enumerated pros and cons and a quantitative score for each."

The paper continues by describing the detailed design options for each version. There is no clear winner, all options score between -1 and 1. Each positive point counts as +1 and each negative point as -1, and the score is the sum of both.
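To make the space trade-off between the one-class and the two-class design tangible, here is a rough sketch. The type names and members are my assumptions and not taken from the paper, and the sketch ignores parsing, formatting and everything else a real address class needs.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical sketch of the design options, not the classes from N3565.
namespace sketch {

// Two-class design: each type stores only the bytes it needs.
struct ipv4_address { std::array<std::uint8_t, 4> bytes; };
struct ipv6_address { std::array<std::uint8_t, 16> bytes; };

// One-class design: simple to use, but every object reserves room for IPv6.
class ip_address {
public:
    ip_address(const ipv4_address& a) : is_v4_(true), bytes_{} {
        for (std::size_t i = 0; i < 4; ++i) bytes_[i] = a.bytes[i];
    }
    ip_address(const ipv6_address& a) : is_v4_(false), bytes_(a.bytes) {}

    bool is_v4() const { return is_v4_; }
    std::size_t size() const { return is_v4_ ? 4 : 16; }

    std::uint8_t operator[](std::size_t i) const {
        assert(i < size());
        return bytes_[i];
    }

private:
    bool is_v4_;
    std::array<std::uint8_t, 16> bytes_;
};

} // namespace sketch
```

Code that takes a sketch::ip_address does not have to care about the address family, but it pays with a larger object and a runtime check, which is the kind of pro and con the paper puts a score on.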

N3568 - Shared Locking in C++

This is a new version of N3427, presented at Portland last fall. This paper wants to add easy support for the multiple-reader/single-writer locking pattern. It proposes to add seven constructors to unique_lock and to introduce a new header <shared_mutex> containing:

shared_mutex

upgrade_mutex

shared_lock<T>

upgrade_lock<T>

a few other classes

Interestingly this proposal is almost 6 years old, and includes a few patterns originally designed in coherence with the already existing mutexes. The original plan was to include these mutexes and locks in C++0x, but in 2007 the need to limit the scope of C++0x arose, which led to only the first half of the planned mutexes being introduced to the Standard Library. The goal of the authors is to bring the original set of mutexes and locks to C++.

N3570 - Quoted Strings Library Proposal

No, this is not a new string class for C++. This proposal wants to deal with the issue that strings written to and read back from streams might not be read as they were written, if they contain spaces. The best way to understand this is the example from the paper:

std::stringstream ss;
std::string original = "foolish me";
std::string round_trip;

ss << original;
ss >> round_trip;

std::cout << original;   // outputs: foolish me
std::cout << round_trip; // outputs: foolish

assert(original == round_trip); // assert will fire

This is the current situation. The paper suggests adding a manipulator for strings to <iomanip>: quoted(my_string). The manipulator quoted shall add quotes ('"' by default) to the string when it is written to the stream, and when reading, read the content within the quotes and strip the quote signs. This proposal is based on a boost component.
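A manipulator of this kind is available as std::quoted in <iomanip> since C++14, so the round trip from the example above can be tried directly. The two helper functions in this sketch are hypothetical names of mine; they just wrap the plain and the quoted round trip so that both can be compared.

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helpers around std::quoted; needs a C++14 standard library.
namespace sketch {

// Plain round trip: reading stops at the first whitespace.
inline std::string round_trip_plain(const std::string& original) {
    std::stringstream ss;
    std::string result;
    ss << original;
    ss >> result;
    return result;
}

// Quoted round trip: the string is written in quotes and read back whole.
inline std::string round_trip_quoted(const std::string& original) {
    std::stringstream ss;
    std::string result;
    ss << std::quoted(original);
    ss >> std::quoted(result);
    return result;
}

} // namespace sketch
```

With "foolish me" as input the plain version returns only "foolish", while the quoted version returns the full string. Quotes inside the string are escaped on output and unescaped again on input.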

N3571 - Add SIMD Computation to the Library

This proposal aims at adding SIMD (Single Instruction Multiple Data) support to C++. The authors propose a library solution, which allows adding SIMD support to a C++ program via a header-only library. The authors base this paper on the work for a boost.SIMD library. The paper shows in detail which advantages the implementation has, and how it could be integrated into the C++ Standard Library.

N3572 - Unicode Support in the Standard Library

This paper wants to add better Unicode support to the standard library, and it also summarizes the current state of Unicode support in the library. One of the current shortcomings of the standard library with Unicode is, for example, that exceptions can't hold Unicode text. The authors propose a new header <unicode>, containing a state-of-the-art Unicode implementation for C++.

N3573 - Heterogenous extension to unordered containers

This paper aims at extending std::unordered_map and std::unordered_set. One of its goals is to allow the use of alternative types as keys. A simple example:

std::unordered_set<std::unique_ptr<T> > set;

Currently it is impossible to look up by a type other than the key type. You can only insert into this set; it's impossible to erase an element or to test whether it is already contained in the set, as this would require constructing a second unique_ptr<T>. Allowing a lookup with a different type t, as long as hash(t) == hash(k), could solve this. The authors also aim at overriding the hash or equality methods, which could be used for caching:

map.find(value, [&](std::string& val) {
    if (!dirty) return hash_cache;
    else return std::hash<>()(val);
});

The paper also contains a few changes to std::hash, and wants to add std::hash to the list of function objects.



And this is again the end of Part 2.

But, there is Part 3!
