This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

[C++ coroutines] Initial implementation pushed to master.

From: Iain Sandoe <iain at sandoe dot co dot uk>

To: GCC Patches <gcc-patches at gcc dot gnu dot org>

Cc: Nathan Sidwell <nathan at acm dot org>, Richard Biener <richard dot guenther at gmail dot com>, Jonathan Wakely <jwakely at redhat dot com>, Jeff Law <law at redhat dot com>, libstdc++ <libstdc++ at gcc dot gnu dot org>

Date: Sat, 18 Jan 2020 12:53:48 +0000

Subject: [C++ coroutines] Initial implementation pushed to master.

Hi, Thanks to: * the reviewers, the code was definitely improved by your reviews. * those folks who tested the branch and/or compiler explorer instance and reported problems with reproducers. * WG21 colleagues, especially Lewis and Gor for valuable input and discussions on the design. ===== TL;DR: * This is not enabled by default (even for -std=c++2a), it needs -fcoroutines. * Like all the C++20 support, it is experimental, perhaps more experimental than some other pieces because wording is still being amended. * The FE/ME tests are run for ALL targets; in principle this should be target- agnostic, if we see fails then that is probably interesting input for the ABI panel. * I regstrapped on 64b LE and BE platforms and a 32b LE host with no observed issues or regressions. * it’s just slightly too big to send uncompressed so attached as a bz2. * commit is r10-6063-g49789fd08 thanks again to all those who helped, Iain ====== The full covering note: This is the squashed version of the first 6 patches that were split to facilitate review. The changes to libiberty (7th patch) to support demangling the co_await operator stand alone and are applied separately. The patch series is an initial implementation of a coroutine feature, expected to be standardised in C++20. Standardisation status (and potential impact on this implementation) -------------------------------------------------------------------- The facility was accepted into the working draft for C++20 by WG21 in February 2019. During following WG21 meetings, design and national body comments have been reviewed, with no significant change resulting. The current GCC implementation is against n4835 [1]. At this stage, the remaining potential for change comes from: * Areas of national body comments that were not resolved in the version we have worked to: (a) handling of the situation where aligned allocation is available. (b) handling of the situation where a user wants coroutines, but does not want exceptions (e.g. a GPU). * Agreed changes that have not yet been worded in a draft standard that we have worked to. It is not expected that the resolution to these can produce any major change at this phase of the standardisation process. Such changes should be limited to the coroutine-specific code. ABI --- The various compiler developers 'vendors' have discussed a minimal ABI to allow one implementation to call coroutines compiled by another. This amounts to: 1. The layout of a public portion of the coroutine frame. Coroutines need to preserve state across suspension points, the storage for this is called a "coroutine frame". The ABI mandates that pointers into the coroutine frame point to an area begining with two function pointers (to the resume and destroy functions described below); these are immediately followed by the "promise object" described in the standard. This is sufficient that the builtins can take a coroutine frame pointer and determine the address of the promise (or call the resume/destroy functions). 2. A number of compiler builtins that the standard library might use. These are implemented by this patch series. 3. This introduces a new operator 'co_await' the mangling for which is also agreed between vendors (and has an issue filed for that against the upstream c++abi). Demangling for this is added to libiberty in a separate patch. The ABI has currently no target-specific content (a given psABI might elect to mandate alignment, but the common ABI does not do this). Standard Library impact ----------------------- The current implementations require addition of only a single header to the standard library (no change to the runtime). This header is part of the patch. GCC Implementation outline -------------------------- The standard's design for coroutines does not decorate the definition of a coroutine in any way, so that a function is only known to be a coroutine when one of the keywords (co_await, co_yield, co_return) is encountered. This means that we cannot special-case such functions from the outset, but must process them differently when they are finalised - which we do from "finish_function ()". At a high level, this design of coroutine produces four pieces from the original user's function: 1. A coroutine state frame (taking the logical place of the activation record for a regular function). One item stored in that state is the index of the current suspend point. 2. A "ramp" function This is what the user calls to construct the coroutine frame and start the coroutine execution. This will return some object representing the coroutine's eventual return value (or means to continue it when it it suspended). 3. A "resume" function. This is what gets called when a the coroutine is resumed when suspended. 4. A "destroy" function. This is what gets called when the coroutine state should be destroyed and its memory released. The standard's coroutines involve cooperation of the user's authored function with a provided "promise" class, which includes mandatory methods for handling the state transitions and providing output values. Most realistic coroutines will also have one or more 'awaiter' classes that implement the user's actions for each suspend point. As we parse (or during template expansion) the types of the promise and awaiter classes become known, and can then be verified against the signatures expected by the standard. Once the function is parsed (and templates expanded) we are able to make the transformation into the four pieces noted above. The implementation here takes the approach of a series of AST transforms. The state machine suspend points are encoded in three internal functions (one of which represents an exit from scope without cleanups). These three IFNs are lowered early in the middle end, such that the majority of GCC's optimisers can be run on the resulting output. As a design choice, we have carried out the outlining of the user's function in the front end, and taken advantage of the existing middle end's abilities to inline and DCE where that is profitable. Since the state machine is actually common to both resumer and destroyer functions, we make only a single function "actor" that contains both the resume and destroy paths. The destroy function is represented by a small stub that sets a value to signal the use of the destroy path and calls the actor. The idea is that optimisation of the state machine need only be done once - and then the resume and destroy paths can be identified allowing the middle end's inline and DCE machinery to optimise as profitable as noted above. The middle end components for this implementation are: A pass that: 1. Lowers the coroutine builtins that allow the standard library header to interact with the coroutine frame (these fairly simple logical or numerical substitution of values, given a coroutine frame pointer). 2. Lowers the IFN that represents the exit from state without cleanup. Essentially, this becomes a gimple goto. 3. Sets the final size of the coroutine frame at this stage. A second pass (that requires the revised CFG that results from the lowering of the scope exit IFNs in the first). 1. Lower the IFNs that represent the state machine paths for the resume and destroy cases. Patches squashed into this commit: [C++ coroutines 1] Common code and base definitions. This part of the patch series provides the gating flag, the keywords, cpp defines etc. [C++ coroutines 2] Define builtins and internal functions. This part of the patch series provides the builtin functions used by the standard library code and the internal functions used to implement lowering of the coroutine state machine. [C++ coroutines 3] Front end parsing and transforms. There are two parts to this. 1. Parsing, template instantiation and diagnostics for the standard- mandated class entries. The user authors a function that becomes a coroutine (lazily) by making use of any of the co_await, co_yield or co_return keywords. Unlike a regular function, where the activation record is placed on the stack, and is destroyed on function exit, a coroutine has some state that persists between calls - the 'coroutine frame' (thus analogous to a stack frame). We transform the user's function into three pieces: 1. A so-called ramp function, that establishes the coroutine frame and begins execution of the coroutine. 2. An actor function that contains the state machine corresponding to the user's suspend/resume structure. 3. A stub function that calls the actor function in 'destroy' mode. The actor function is executed: * from "resume point 0" by the ramp. * from resume point N ( > 0 ) for handle.resume() calls. * from the destroy stub for destroy point N for handle.destroy() calls. The C++ coroutine design described in the standard makes use of some helper methods that are authored in a so-called "promise" class provided by the user. At parse time (or post substitution) the type of the coroutine promise will be determined. At that point, we can look up the required promise class methods and issue diagnostics if they are missing or incorrect. To avoid repeating these actions at code-gen time, we make use of temporary 'proxy' variables for the coroutine handle and the promise - which will eventually be instantiated in the coroutine frame. Each of the keywords will expand to a code sequence (although co_yield is just syntactic sugar for a co_await). We defer the analysis and transformatin until template expansion is complete so that we have complete types at that time. 2. AST analysis and transformation which performs the code-gen for the outlined state machine. The entry point here is morph_fn_to_coro () which is called from finish_function () when we have completed any template expansion. This is preceded by helper functions that implement the phases below. The process proceeds in four phases. A Initial framing. The user's function body is wrapped in the initial and final suspend points and we begin building the coroutine frame. We build empty decls for the actor and destroyer functions at this time too. When exceptions are enabled, the user's function body will also be wrapped in a try-catch block with the catch invoking the promise class 'unhandled_exception' method. B Analysis. The user's function body is analysed to determine the suspend points, if any, and to capture local variables that might persist across such suspensions. In most cases, it is not necessary to capture compiler temporaries, since the tree-lowering nests the suspensions correctly. However, in the case of a captured reference, there is a lifetime extension to the end of the full expression - which can mean across a suspend point in which case it must be promoted to a frame variable. At the conclusion of analysis, we have a conservative frame layout and maps of the local variables to their frame entry points. C Build the ramp function. Carry out the allocation for the coroutine frame (NOTE; the actual size computation is deferred until late in the middle end to allow for future optimisations that will be allowed to elide unused frame entries). We build the return object. D Build and expand the actor and destroyer function bodies. The destroyer is a trivial shim that sets a bit to indicate that the destroy dispatcher should be used and then calls into the actor. The actor function is the implementation of the user's state machine. The current suspend point is noted in an index. Each suspend point is encoded as a pair of internal functions, one in the relevant dispatcher, and one representing the suspend point. During this process, the user's local variables and the proxies for the self-handle and the promise class instanceare re-written to their coroutine frame equivalents. The complete bodies for the ramp, actor and destroy function are passed back to finish_function for folding and gimplification. [C++ coroutines 4] Middle end expanders and transforms. The first part of this is a pass that provides: * expansion of the library support builtins, these are simple boolean or numerical substitutions. * The functionality of implementing an exit from scope without cleanup is performed here by lowering an IFN to a gimple goto. This pass has to run for non-coroutine functions, since functions calling the builtins are not necessarily coroutines (i.e. they are implementing the library interfaces which may be called from anywhere). The second part is the expansion of the coroutine IFNs that describe the state machine connections to the dispatchers. This only has to be run for functions that are coroutine components. The work done by this pass is: In the front end we construct a single actor function that contains the coroutine state machine. The actor function has three entry conditions: 1. from the ramp, resume point 0 - to initial-suspend. 2. when resume () is executed (resume point N). 3. from the destroy () shim when that is executed. The actor function begins with two dispatchers; one for resume and one for destroy (where the initial entry from the ramp is a special- case of resume point 0). Each suspend point and each dispatch entry is marked with an IFN such that we can connect the relevant dispatchers to their target labels. So, if we have: CO_YIELD (NUM, FINAL, RES_LAB, DEST_LAB, FRAME_PTR) This is await point NUM, and is the final await if FINAL is non-zero. The resume point is RES_LAB, and the destroy point is DEST_LAB. We expect to find a CO_ACTOR (NUM) in the resume dispatcher and a CO_ACTOR (NUM+1) in the destroy dispatcher. Initially, the intent of keeping the resume and destroy paths together is that the conditionals controlling them are identical, and thus there would be duplication of any optimisation of those paths if the split were earlier. Subsequent inlining of the actor (and DCE) is then able to extract the resume and destroy paths as separate functions if that is found profitable by the optimisers. Once we have remade the connections to their correct postions, we elide the labels that the front end inserted. [C++ coroutines 5] Standard library header. This provides the interfaces mandated by the standard and implements the interaction with the coroutine frame by means of inline use of builtins expanded at compile-time. There should be a 1:1 correspondence with the standard sections which are cross-referenced. There is no runtime content. At this stage, we have the content in an inline namespace "__n4835" for the CD we worked to. [C++ coroutines 6] Testsuite. There are two categories of test: 1. Checks for correctly formed source code and the error reporting. 2. Checks for transformation and code-gen. The second set are run as 'torture' tests for the standard options set, including LTO. These are also intentionally run with no options provided (from the coroutines.exp script). gcc/ChangeLog: 2020-01-18 Iain Sandoe <iain@sandoe.co.uk> * Makefile.in: Add coroutine-passes.o. * builtin-types.def (BT_CONST_SIZE): New. (BT_FN_BOOL_PTR): New. (BT_FN_PTR_PTR_CONST_SIZE_BOOL): New. * builtins.def (DEF_COROUTINE_BUILTIN): New. * coroutine-builtins.def: New file. * coroutine-passes.cc: New file. * function.h (struct GTY function): Add a bit to indicate that the function is a coroutine component. * internal-fn.c (expand_CO_FRAME): New. (expand_CO_YIELD): New. (expand_CO_SUSPN): New. (expand_CO_ACTOR): New. * internal-fn.def (CO_ACTOR): New. (CO_YIELD): New. (CO_SUSPN): New. (CO_FRAME): New. * passes.def: Add pass_coroutine_lower_builtins, pass_coroutine_early_expand_ifns. * tree-pass.h (make_pass_coroutine_lower_builtins): New. (make_pass_coroutine_early_expand_ifns): New. * doc/invoke.texi: Document the fcoroutines command line switch. gcc/c-family/ChangeLog: 2020-01-18 Iain Sandoe <iain@sandoe.co.uk> * c-common.c (co_await, co_yield, co_return): New. * c-common.h (RID_CO_AWAIT, RID_CO_YIELD, RID_CO_RETURN): New enumeration values. (D_CXX_COROUTINES): Bit to identify coroutines are active. (D_CXX_COROUTINES_FLAGS): Guard for coroutine keywords. * c-cppbuiltin.c (__cpp_coroutines): New cpp define. * c.opt (fcoroutines): New command-line switch. gcc/cp/ChangeLog: 2020-01-18 Iain Sandoe <iain@sandoe.co.uk> * Make-lang.in: Add coroutines.o. * cp-tree.h (lang_decl-fn): coroutine_p, new bit. (DECL_COROUTINE_P): New. * lex.c (init_reswords): Enable keywords when the coroutine flag is set, * operators.def (co_await): New operator. * call.c (add_builtin_candidates): Handle CO_AWAIT_EXPR. (op_error): Likewise. (build_new_op_1): Likewise. (build_new_function_call): Validate coroutine builtin arguments. * constexpr.c (potential_constant_expression_1): Handle CO_AWAIT_EXPR, CO_YIELD_EXPR, CO_RETURN_EXPR. * coroutines.cc: New file. * cp-objcp-common.c (cp_common_init_ts): Add CO_AWAIT_EXPR, CO_YIELD_EXPR, CO_RETRN_EXPR as TS expressions. * cp-tree.def (CO_AWAIT_EXPR, CO_YIELD_EXPR, (CO_RETURN_EXPR): New. * cp-tree.h (coro_validate_builtin_call): New. * decl.c (emit_coro_helper): New. (finish_function): Handle the case when a function is found to be a coroutine, perform the outlining and emit the outlined functions. Set a bit to signal that this is a coroutine component. * parser.c (enum required_token): New enumeration RT_CO_YIELD. (cp_parser_unary_expression): Handle co_await. (cp_parser_assignment_expression): Handle co_yield. (cp_parser_statement): Handle RID_CO_RETURN. (cp_parser_jump_statement): Handle co_return. (cp_parser_operator): Handle co_await operator. (cp_parser_yield_expression): New. (cp_parser_required_error): Handle RT_CO_YIELD. * pt.c (tsubst_copy): Handle CO_AWAIT_EXPR. (tsubst_expr): Handle CO_AWAIT_EXPR, CO_YIELD_EXPR and CO_RETURN_EXPRs. * tree.c (cp_walk_subtrees): Likewise. libstdc++-v3/ChangeLog: 2020-01-18 Iain Sandoe <iain@sandoe.co.uk> * include/Makefile.am: Add coroutine to the std set. * include/Makefile.in: Regenerated. * include/std/coroutine: New file. gcc/testsuite/ChangeLog: 2020-01-18 Iain Sandoe <iain@sandoe.co.uk> * g++.dg/coroutines/co-await-syntax-00-needs-expr.C: New test. * g++.dg/coroutines/co-await-syntax-01-outside-fn.C: New test. * g++.dg/coroutines/co-await-syntax-02-outside-fn.C: New test. * g++.dg/coroutines/co-await-syntax-03-auto.C: New test. * g++.dg/coroutines/co-await-syntax-04-ctor-dtor.C: New test. * g++.dg/coroutines/co-await-syntax-05-constexpr.C: New test. * g++.dg/coroutines/co-await-syntax-06-main.C: New test. * g++.dg/coroutines/co-await-syntax-07-varargs.C: New test. * g++.dg/coroutines/co-await-syntax-08-lambda-auto.C: New test. * g++.dg/coroutines/co-return-syntax-01-outside-fn.C: New test. * g++.dg/coroutines/co-return-syntax-02-outside-fn.C: New test. * g++.dg/coroutines/co-return-syntax-03-auto.C: New test. * g++.dg/coroutines/co-return-syntax-04-ctor-dtor.C: New test. * g++.dg/coroutines/co-return-syntax-05-constexpr-fn.C: New test. * g++.dg/coroutines/co-return-syntax-06-main.C: New test. * g++.dg/coroutines/co-return-syntax-07-vararg.C: New test. * g++.dg/coroutines/co-return-syntax-08-bad-return.C: New test. * g++.dg/coroutines/co-return-syntax-09-lambda-auto.C: New test. * g++.dg/coroutines/co-yield-syntax-00-needs-expr.C: New test. * g++.dg/coroutines/co-yield-syntax-01-outside-fn.C: New test. * g++.dg/coroutines/co-yield-syntax-02-outside-fn.C: New test. * g++.dg/coroutines/co-yield-syntax-03-auto.C: New test. * g++.dg/coroutines/co-yield-syntax-04-ctor-dtor.C: New test. * g++.dg/coroutines/co-yield-syntax-05-constexpr.C: New test. * g++.dg/coroutines/co-yield-syntax-06-main.C: New test. * g++.dg/coroutines/co-yield-syntax-07-varargs.C: New test. * g++.dg/coroutines/co-yield-syntax-08-needs-expr.C: New test. * g++.dg/coroutines/co-yield-syntax-09-lambda-auto.C: New test. * g++.dg/coroutines/coro-builtins.C: New test. * g++.dg/coroutines/coro-missing-gro.C: New test. * g++.dg/coroutines/coro-missing-promise-yield.C: New test. * g++.dg/coroutines/coro-missing-ret-value.C: New test. * g++.dg/coroutines/coro-missing-ret-void.C: New test. * g++.dg/coroutines/coro-missing-ueh-1.C: New test. * g++.dg/coroutines/coro-missing-ueh-2.C: New test. * g++.dg/coroutines/coro-missing-ueh-3.C: New test. * g++.dg/coroutines/coro-missing-ueh.h: New test. * g++.dg/coroutines/coro-pre-proc.C: New test. * g++.dg/coroutines/coro.h: New file. * g++.dg/coroutines/coro1-ret-int-yield-int.h: New file. * g++.dg/coroutines/coroutines.exp: New file. * g++.dg/coroutines/torture/alloc-00-gro-on-alloc-fail.C: New test. * g++.dg/coroutines/torture/alloc-01-overload-newdel.C: New test. * g++.dg/coroutines/torture/call-00-co-aw-arg.C: New test. * g++.dg/coroutines/torture/call-01-multiple-co-aw.C: New test. * g++.dg/coroutines/torture/call-02-temp-co-aw.C: New test. * g++.dg/coroutines/torture/call-03-temp-ref-co-aw.C: New test. * g++.dg/coroutines/torture/class-00-co-ret.C: New test. * g++.dg/coroutines/torture/class-01-co-ret-parm.C: New test. * g++.dg/coroutines/torture/class-02-templ-parm.C: New test. * g++.dg/coroutines/torture/class-03-operator-templ-parm.C: New test. * g++.dg/coroutines/torture/class-04-lambda-1.C: New test. * g++.dg/coroutines/torture/class-05-lambda-capture-copy-local.C: New test. * g++.dg/coroutines/torture/class-06-lambda-capture-ref.C: New test. * g++.dg/coroutines/torture/co-await-00-trivial.C: New test. * g++.dg/coroutines/torture/co-await-01-with-value.C: New test. * g++.dg/coroutines/torture/co-await-02-xform.C: New test. * g++.dg/coroutines/torture/co-await-03-rhs-op.C: New test. * g++.dg/coroutines/torture/co-await-04-control-flow.C: New test. * g++.dg/coroutines/torture/co-await-05-loop.C: New test. * g++.dg/coroutines/torture/co-await-06-ovl.C: New test. * g++.dg/coroutines/torture/co-await-07-tmpl.C: New test. * g++.dg/coroutines/torture/co-await-08-cascade.C: New test. * g++.dg/coroutines/torture/co-await-09-pair.C: New test. * g++.dg/coroutines/torture/co-await-10-template-fn-arg.C: New test. * g++.dg/coroutines/torture/co-await-11-forwarding.C: New test. * g++.dg/coroutines/torture/co-await-12-operator-2.C: New test. * g++.dg/coroutines/torture/co-await-13-return-ref.C: New test. * g++.dg/coroutines/torture/co-ret-00-void-return-is-ready.C: New test. * g++.dg/coroutines/torture/co-ret-01-void-return-is-suspend.C: New test. * g++.dg/coroutines/torture/co-ret-03-different-GRO-type.C: New test. * g++.dg/coroutines/torture/co-ret-04-GRO-nontriv.C: New test. * g++.dg/coroutines/torture/co-ret-05-return-value.C: New test. * g++.dg/coroutines/torture/co-ret-06-template-promise-val-1.C: New test. * g++.dg/coroutines/torture/co-ret-07-void-cast-expr.C: New test. * g++.dg/coroutines/torture/co-ret-08-template-cast-ret.C: New test. * g++.dg/coroutines/torture/co-ret-09-bool-await-susp.C: New test. * g++.dg/coroutines/torture/co-ret-10-expression-evaluates-once.C: New test. * g++.dg/coroutines/torture/co-ret-11-co-ret-co-await.C: New test. * g++.dg/coroutines/torture/co-ret-12-co-ret-fun-co-await.C: New test. * g++.dg/coroutines/torture/co-ret-13-template-2.C: New test. * g++.dg/coroutines/torture/co-ret-14-template-3.C: New test. * g++.dg/coroutines/torture/co-yield-00-triv.C: New test. * g++.dg/coroutines/torture/co-yield-01-multi.C: New test. * g++.dg/coroutines/torture/co-yield-02-loop.C: New test. * g++.dg/coroutines/torture/co-yield-03-tmpl.C: New test. * g++.dg/coroutines/torture/co-yield-04-complex-local-state.C: New test. * g++.dg/coroutines/torture/co-yield-05-co-aw.C: New test. * g++.dg/coroutines/torture/co-yield-06-fun-parm.C: New test. * g++.dg/coroutines/torture/co-yield-07-template-fn-param.C: New test. * g++.dg/coroutines/torture/co-yield-08-more-refs.C: New test. * g++.dg/coroutines/torture/co-yield-09-more-templ-refs.C: New test. * g++.dg/coroutines/torture/coro-torture.exp: New file. * g++.dg/coroutines/torture/exceptions-test-0.C: New test. * g++.dg/coroutines/torture/func-params-00.C: New test. * g++.dg/coroutines/torture/func-params-01.C: New test. * g++.dg/coroutines/torture/func-params-02.C: New test. * g++.dg/coroutines/torture/func-params-03.C: New test. * g++.dg/coroutines/torture/func-params-04.C: New test. * g++.dg/coroutines/torture/func-params-05.C: New test. * g++.dg/coroutines/torture/func-params-06.C: New test. * g++.dg/coroutines/torture/lambda-00-co-ret.C: New test. * g++.dg/coroutines/torture/lambda-01-co-ret-parm.C: New test. * g++.dg/coroutines/torture/lambda-02-co-yield-values.C: New test. * g++.dg/coroutines/torture/lambda-03-auto-parm-1.C: New test. * g++.dg/coroutines/torture/lambda-04-templ-parm.C: New test. * g++.dg/coroutines/torture/lambda-05-capture-copy-local.C: New test. * g++.dg/coroutines/torture/lambda-06-multi-capture.C: New test. * g++.dg/coroutines/torture/lambda-07-multi-yield.C: New test. * g++.dg/coroutines/torture/lambda-08-co-ret-parm-ref.C: New test. * g++.dg/coroutines/torture/local-var-0.C: New test. * g++.dg/coroutines/torture/local-var-1.C: New test. * g++.dg/coroutines/torture/local-var-2.C: New test. * g++.dg/coroutines/torture/local-var-3.C: New test. * g++.dg/coroutines/torture/local-var-4.C: New test. * g++.dg/coroutines/torture/mid-suspend-destruction-0.C: New test. * g++.dg/coroutines/torture/pr92933.C: New test.