From e5e23c44a43dd69728502ca78e2c3fddf9e825f3 Mon Sep 17 00:00:00 2001
From: ericniebler
Date: Fri, 22 Mar 2024 01:20:20 +0000
Subject: [PATCH] Publish: Merge pull request #209 from cplusplus/member-customization-points remove `tag_invoke` 85a62a692d2017a11f4c18e80031eb7d94f47c89
---
 execution.html | 3808 +++++++++++++++++++++++++++---------------------
 1 file changed, 2113 insertions(+), 1695 deletions(-)

diff --git a/execution.html b/execution.html
index b2cf3eb..3ace999 100644
--- a/execution.html
+++ b/execution.html
@@ -2347,7 +2347,7 @@

D2300R9
std::execution

-

Draft Proposal,

+

Draft Proposal,

Authors: @@ -2417,43 +2417,44 @@

Table of Contents

  • 1.7.1 Composability with execution::let_*
  • 1.7.2 Moving between execution resources with execution::on and execution::transfer -
  • 1.8 What this proposal is not -
  • 1.9 Design changes from P0443 +
  • 1.8 Design changes from P0443
  • - 1.10 Prior art + 1.9 Prior art
      -
    1. 1.10.1 Futures -
    2. 1.10.2 Coroutines -
    3. 1.10.3 Callbacks +
    4. 1.9.1 Futures +
    5. 1.9.2 Coroutines +
    6. 1.9.3 Callbacks
  • - 1.11 Field experience + 1.10 Field experience
      -
    1. 1.11.1 libunifex -
    2. 1.11.2 Other implementations -
    3. 1.11.3 Inspirations +
    4. 1.10.1 libunifex +
    5. 1.10.2 stdexec +
    6. 1.10.3 Other implementations +
    7. 1.10.4 Inspirations
  • 2 Revision history
      -
    1. 2.1 R8 -
    2. 2.2 R7 +
    3. 2.1 R9 +
    4. 2.2 R8 +
    5. 2.3 R7
    6. - 2.3 R6 + 2.4 R6
        -
      1. 2.3.1 Environments and attributes +
      2. 2.4.1 Environments and attributes
      -
    7. 2.4 R5 +
    8. 2.5 R5
    9. - 2.5 R4 + 2.6 R4
        -
      1. 2.5.1 Dependently-typed senders +
      2. 2.6.1 Dependently-typed senders
      -
    10. 2.6 R3 -
    11. 2.7 R2 -
    12. 2.8 R1 -
    13. 2.9 R0 +
    14. 2.7 R3 +
    15. 2.8 R2 +
    16. 2.9 R1 +
    17. 2.10 R0
  • 3 Design - introduction @@ -2544,7 +2545,7 @@

    Table of Contents

  • 5.6 Lazy senders provide optimization opportunities
  • 5.7 Execution resource transitions are two-step
  • 5.8 All senders are typed -
  • 5.9 Ranges-style CPOs vs tag_invoke +
  • 5.9 Customization points
  • 6 Specification
  • @@ -2568,7 +2569,6 @@

    Table of Contents

    9.1 Function objects [function.objects]
    1. 9.1.1 Header <functional> synopsis [functional.syn] -
    2. 9.1.2 tag_invoke [func.tag_invoke]
  • @@ -2703,13 +2703,8 @@

    Table of Contents

  • 11.11 Sender/receiver utilities [exec.utils]
      -
    1. - 11.11.1 execution::receiver_adaptor [exec.utils.rcvr.adptr] -
        -
      1. 11.11.1.1 Non-member functions [exec.utils.rcvr.adptr.nonmembers] -
      -
    2. 11.11.2 execution::completion_signatures [exec.utils.cmplsigs] -
    3. 11.11.3 execution::transform_completion_signatures [exec.utils.tfxcmplsigs] +
    4. 11.11.1 execution::completion_signatures [exec.utils.cmplsigs] +
    5. 11.11.2 execution::transform_completion_signatures [exec.utils.tfxcmplsigs]
  • 11.12 Execution contexts [exec.ctx] @@ -2745,24 +2740,42 @@

    Table of Contents

    1. Introduction

    This paper proposes a self-contained design for a Standard C++ framework for managing asynchronous execution on generic execution resources. It is based on the ideas in A Unified Executors Proposal for C++ and its companion papers.

    1.1. Motivation

    -

    Today, C++ software is increasingly asynchronous and parallel, a trend that is likely to only continue going forward. -Asynchrony and parallelism appears everywhere, from processor hardware interfaces, to networking, to file I/O, to GUIs, to accelerators. -Every C++ domain and every platform needs to deal with asynchrony and parallelism, from scientific computing to video games to financial services, from the smallest mobile devices to your laptop to GPUs in the world’s fastest supercomputer.

    -

    While the C++ Standard Library has a rich set of concurrency primitives (std::atomic, std::mutex, std::counting_semaphore, etc) and lower level building blocks (std::thread, etc), we lack a Standard vocabulary and framework for asynchrony and parallelism that C++ programmers desperately need. std::async/std::future/std::promise, C++11’s intended exposure for asynchrony, is inefficient, hard to use correctly, and severely lacking in genericity, making it unusable in many contexts. -We introduced parallel algorithms to the C++ Standard Library in C++17, and while they are an excellent start, they are all inherently synchronous and not composable.

    -

    This paper proposes a Standard C++ model for asynchrony, based around three key abstractions: schedulers, senders, and receivers, and a set of customizable asynchronous algorithms.

    +

    Today, C++ software is increasingly asynchronous and parallel, a trend that is +likely to only continue going forward. Asynchrony and parallelism appear +everywhere, from processor hardware interfaces, to networking, to file I/O, to +GUIs, to accelerators. Every C++ domain and every platform needs to deal with +asynchrony and parallelism, from scientific computing to video games to +financial services, from the smallest mobile devices to your laptop to GPUs in +the world’s fastest supercomputer.

    +

    While the C++ Standard Library has a rich set of concurrency primitives +(std::atomic, std::mutex, std::counting_semaphore, etc) and lower level +building blocks (std::thread, etc), we lack a Standard vocabulary and +framework for asynchrony and parallelism that C++ programmers desperately need. std::async/std::future/std::promise, C++11’s intended exposure for +asynchrony, is inefficient, hard to use correctly, and severely lacking in +genericity, making it unusable in many contexts. We introduced parallel +algorithms to the C++ Standard Library in C++17, and while they are an excellent +start, they are all inherently synchronous and not composable.

    +

    This paper proposes a Standard C++ model for asynchrony based around three key +abstractions: schedulers, senders, and receivers, and a set of customizable +asynchronous algorithms.

    1.2. Priorities

    • -

      Be composable and generic, allowing users to write code that can be used with many different types of execution resources.

      +

      Be composable and generic, allowing users to write code that can be used with +many different types of execution resources.

    • -

      Encapsulate common asynchronous patterns in customizable and reusable algorithms, so users don’t have to invent things themselves.

      +

      Encapsulate common asynchronous patterns in customizable and reusable +algorithms, so users don’t have to invent things themselves.

    • Make it easy to be correct by construction.

    • -

      Support the diversity of execution resources and execution agents, because not all execution agents are created equal; some are less capable than others, but not less important.

      +

      Support the diversity of execution resources and execution agents, because not +all execution agents are created equal; some are less capable than others, +but not less important.

    • -

      Allow everything to be customized by an execution resource, including transfer to other execution resources, but don’t require that execution resources customize everything.

      +

      Allow everything to be customized by an execution resource, including transfer +to other execution resources, but don’t require that execution resources +customize everything.

    • Care about all reasonable use cases, domains and platforms.

    • @@ -2775,7 +2788,9 @@

      Be able to manage and terminate the lifetimes of objects asynchronously.

    1.3. Examples: End User

    -

    In this section we demonstrate the end-user experience of asynchronous programming directly with the sender algorithms presented in this paper. See § 4.19 User-facing sender factories, § 4.20 User-facing sender adaptors, and § 4.21 User-facing sender consumers for short explanations of the algorithms used in these code examples.

    +

    In this section we demonstrate the end-user experience of asynchronous +programming directly with the sender algorithms presented in this paper. See § 4.19 User-facing sender factories, § 4.20 User-facing sender adaptors, and § 4.21 User-facing sender consumers for short explanations of the algorithms used in +these code examples.

    1.3.1. Hello world

    using namespace std::execution;
     
    @@ -2793,15 +2808,27 @@ 

    This example demonstrates the basics of schedulers, senders, and receivers:

    1. -

      First we need to get a scheduler from somewhere, such as a thread pool. A scheduler is a lightweight handle to an execution resource.

      +

      First we need to get a scheduler from somewhere, such as a thread pool. A +scheduler is a lightweight handle to an execution resource.

    2. -

      To start a chain of work on a scheduler, we call § 4.19.1 execution::schedule, which returns a sender that completes on the scheduler. A sender describes asynchronous work and sends a signal (value, error, or stopped) to some recipient(s) when that work completes.

      +

      To start a chain of work on a scheduler, we call § 4.19.1 execution::schedule, which returns a sender that completes on +the scheduler. A sender describes asynchronous work and sends a signal +(value, error, or stopped) to some recipient(s) when that work completes.

    3. -

      We use sender algorithms to produce senders and compose asynchronous work. § 4.20.2 execution::then is a sender adaptor that takes an input sender and a std::invocable, and calls the std::invocable on the signal sent by the input sender. The sender returned by then sends the result of that invocation. In this case, the input sender came from schedule, so its void, meaning it won’t send us a value, so our std::invocable takes no parameters. But we return an int, which will be sent to the next recipient.

      +

      We use sender algorithms to produce senders and compose asynchronous work. § 4.20.2 execution::then is a sender adaptor that takes an input +sender and a std::invocable, and calls the std::invocable on the signal +sent by the input sender. The sender returned by then sends the result of +that invocation. In this case, the input sender came from schedule, so it’s void, meaning it won’t send us a value, so our std::invocable takes no +parameters. But we return an int, which will be sent to the next recipient.

    4. Now, we add another operation to the chain, again using § 4.20.2 execution::then. This time, we get sent a value - the int from the previous step. We add 42 to it, and then return the result.

    5. -

      Finally, we’re ready to submit the entire asynchronous pipeline and wait for its completion. Everything up until this point has been completely asynchronous; the work may not have even started yet. To ensure the work has started and then block pending its completion, we use § 4.21.2 this_thread::sync_wait, which will either return a std::optional<std::tuple<...>> with the value sent by the last sender, or an empty std::optional if the last sender sent a stopped signal, or it throws an exception if the last sender sent an error.

      +

      Finally, we’re ready to submit the entire asynchronous pipeline and wait for +its completion. Everything up until this point has been completely +asynchronous; the work may not have even started yet. To ensure the work has +started and then block pending its completion, we use § 4.21.2 this_thread::sync_wait, which will either return a std::optional<std::tuple<...>> with the value sent by the last sender, or +an empty std::optional if the last sender sent a stopped signal, or it +throws an exception if the last sender sent an error.
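The three outcomes of the final step can be sketched without the library itself. The block below is a minimal illustration, not the real API: `fake_sync_wait` and `outcome` are hypothetical stand-ins that only mimic the result shape described above for a sender that sends one `int`, and the value 13 is an arbitrary choice for the first step of the chain.

```cpp
#include <optional>
#include <stdexcept>
#include <string>
#include <tuple>

// Hypothetical stand-in for this_thread::sync_wait on a sender that sends a
// single int. It only mimics the three outcomes; it runs no asynchronous work.
enum class outcome { value, stopped, error };

std::optional<std::tuple<int>> fake_sync_wait(outcome o) {
  switch (o) {
    case outcome::value:   return std::tuple<int>{13 + 42};  // value signal
    case outcome::stopped: return std::nullopt;              // stopped signal
    case outcome::error:   break;                            // error signal
  }
  throw std::runtime_error("work failed");  // errors surface as exceptions
}

// Shows how a caller distinguishes the three outcomes.
std::string describe(outcome o) {
  try {
    auto result = fake_sync_wait(o);
    if (!result) return "stopped";  // empty optional: the work was stopped
    auto [i] = *result;             // unpack the single sent value
    return "value " + std::to_string(i);
  } catch (const std::exception& e) {
    return std::string("error: ") + e.what();
  }
}
```

A caller therefore needs one branch per completion signal: a `try`/`catch` for errors, an emptiness check for stopped, and a tuple unpack for the value.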

    1.3.2. Asynchronous inclusive scan

    using namespace std::execution;
    @@ -2850,35 +2877,60 @@ 

    This example builds an asynchronous computation of an inclusive scan:

    1. -

      It scans a sequence of doubles (represented as the std::span<const double> input) and stores the result in another sequence of doubles (represented as std::span<double> output).

      +

      It scans a sequence of doubles (represented as the std::span<const double> input) and stores the result in another sequence of doubles +(represented as std::span<double> output).

    2. -

      It takes a scheduler, which specifies what execution resource the scan should be launched on.

      +

      It takes a scheduler, which specifies what execution resource the scan should +be launched on.

    3. -

      It also takes a tile_count parameter that controls the number of execution agents that will be spawned.

      +

      It also takes a tile_count parameter that controls the number of execution +agents that will be spawned.

    4. -

      First we need to allocate temporary storage needed for the algorithm, which we’ll do with a std::vector, partials. We need one double of temporary storage for each execution agent we create.

      +

      First we need to allocate temporary storage needed for the algorithm, which +we’ll do with a std::vector, partials. We need one double of temporary +storage for each execution agent we create.

    5. -

      Next we’ll create our initial sender with § 4.19.2 execution::just and § 4.20.1 execution::transfer. These senders will send the temporary storage, which we’ve moved into the sender. The sender has a completion scheduler of sch, which means the next item in the chain will use sch.

      +

      Next we’ll create our initial sender with § 4.19.2 execution::just and § 4.20.1 execution::transfer. These senders will send the temporary +storage, which we’ve moved into the sender. The sender has a completion +scheduler of sch, which means the next item in the chain will use sch.

    6. -

      Senders and sender adaptors support composition via operator|, similar to C++ ranges. We’ll use operator| to attach the next piece of work, which will spawn tile_count execution agents using § 4.20.9 execution::bulk (see § 4.12 Most sender adaptors are pipeable for details).

      +

      Senders and sender adaptors support composition via operator|, similar to +C++ ranges. We’ll use operator| to attach the next piece of work, which +will spawn tile_count execution agents using § 4.20.9 execution::bulk (see § 4.12 Most sender adaptors are pipeable for details).

    7. -

      Each agent will call a std::invocable, passing it two arguments. The first is the agent’s index (i) in the § 4.20.9 execution::bulk operation, in this case a unique integer in [0, tile_count). The second argument is what the input sender sent - the temporary storage.

      +

      Each agent will call a std::invocable, passing it two arguments. The first +is the agent’s index (i) in the § 4.20.9 execution::bulk operation, +in this case a unique integer in [0, tile_count). The second argument is +what the input sender sent - the temporary storage.

    8. -

      We start by computing the start and end of the range of input and output elements that this agent is responsible for, based on our agent index.

      +

      We start by computing the start and end of the range of input and output +elements that this agent is responsible for, based on our agent index.

    9. -

      Then we do a sequential std::inclusive_scan over our elements. We store the scan result for our last element, which is the sum of all of our elements, in our temporary storage partials.

      +

      Then we do a sequential std::inclusive_scan over our elements. We store the +scan result for our last element, which is the sum of all of our elements, +in our temporary storage partials.

    10. -

      After all computation in that initial § 4.20.9 execution::bulk pass has completed, every one of the spawned execution agents will have written the sum of its elements into its slot in partials.

      +

      After all computation in that initial § 4.20.9 execution::bulk pass + has completed, every one of the spawned execution agents will have written + the sum of its elements into its slot in partials.

    11. -

      Now we need to scan all of the values in partials. We’ll do that with a single execution agent which will execute after the § 4.20.9 execution::bulk completes. We create that execution agent with § 4.20.2 execution::then.

      +

      Now we need to scan all of the values in partials. We’ll do that with a + single execution agent which will execute after the § 4.20.9 execution::bulk completes. We create that execution agent + with § 4.20.2 execution::then.

    12. -

      § 4.20.2 execution::then takes an input sender and an std::invocable and calls the std::invocable with the value sent by the input sender. Inside our std::invocable, we call std::inclusive_scan on partials, which the input senders will send to us.

      +

      § 4.20.2 execution::then takes an input sender and a std::invocable and calls the std::invocable with the value sent by the + input sender. Inside our std::invocable, we call std::inclusive_scan on partials, which the input sender will send to us.

    13. Then we return partials, which the next phase will need.

    14. -

      Finally we do another § 4.20.9 execution::bulk of the same shape as before. In this § 4.20.9 execution::bulk, we will use the scanned values in partials to integrate the sums from other tiles into our elements, completing the inclusive scan.

      +

      Finally we do another § 4.20.9 execution::bulk of the same shape as + before. In this § 4.20.9 execution::bulk, we will use the scanned + values in partials to integrate the sums from other tiles into our + elements, completing the inclusive scan.

    15. -

      async_inclusive_scan returns a sender that sends the output std::span<double>. A consumer of the algorithm can chain additional work that uses the scan result. At the point at which async_inclusive_scan returns, the computation may not have completed. In fact, it may not have even started.

      +

      async_inclusive_scan returns a sender that sends the output std::span<double>. A consumer of the algorithm can chain additional work + that uses the scan result. At the point at which async_inclusive_scan returns, the computation may not have completed. In fact, it may not have + even started.

    1.3.3. Asynchronous dynamically-sized read

    using namespace std::execution;
    @@ -2912,22 +2964,38 @@ 

    async_read is a pipeable sender adaptor. It’s a customization point object, but this is what it’s call signature looks like. It takes a sender parameter which must send an input buffer in the form of a std::span<std::byte>, and a handle to an I/O context. It will asynchronously read into the input buffer, up to the size of the std::span. It returns a sender which will send the number of bytes read once the read completes.

    +

    async_read is a pipeable sender adaptor. It’s a customization point object, +but this is what its call signature looks like. It takes a sender parameter +which must send an input buffer in the form of a std::span<std::byte>, and +a handle to an I/O context. It will asynchronously read into the input +buffer, up to the size of the std::span. It returns a sender which will +send the number of bytes read once the read completes.

  • -

    async_read_array takes an I/O handle and reads a size from it, and then a buffer of that many bytes. It returns a sender that sends a dynamic_buffer object that owns the data that was sent.

    +

    async_read_array takes an I/O handle and reads a size from it, and then a +buffer of that many bytes. It returns a sender that sends a dynamic_buffer object that owns the data that was sent.

  • dynamic_buffer is an aggregate struct that contains a std::unique_ptr<std::byte[]> and a size.

  • -

    The first thing we do inside of async_read_array is create a sender that will send a new, empty dynamic_array object using § 4.19.2 execution::just. We can attach more work to the pipeline using operator| composition (see § 4.12 Most sender adaptors are pipeable for details).

    +

    The first thing we do inside of async_read_array is create a sender that +will send a new, empty dynamic_buffer object using § 4.19.2 execution::just. We can attach more work to the pipeline +using operator| composition (see § 4.12 Most sender adaptors are pipeable for details).

  • -

    We need the lifetime of this dynamic_array object to last for the entire pipeline. So, we use let_value, which takes an input sender and a std::invocable that must return a sender itself (see § 4.20.4 execution::let_* for details). let_value sends the value from the input sender to the std::invocable. Critically, the lifetime of the sent object will last until the sender returned by the std::invocable completes.

    +

    We need the lifetime of this dynamic_buffer object to last for the entire +pipeline. So, we use let_value, which takes an input sender and a std::invocable that must return a sender itself (see § 4.20.4 execution::let_* for details). let_value sends the value +from the input sender to the std::invocable. Critically, the lifetime of +the sent object will last until the sender returned by the std::invocable completes.

  • -

    Inside of the let_value std::invocable, we have the rest of our logic. First, we want to initiate an async_read of the buffer size. To do that, we need to send a std::span pointing to buf.size. We can do that with § 4.19.2 execution::just.

    +

    Inside of the let_value std::invocable, we have the rest of our logic. +First, we want to initiate an async_read of the buffer size. To do that, +we need to send a std::span pointing to buf.size. We can do that with § 4.19.2 execution::just.

  • -

    We chain the async_read onto the § 4.19.2 execution::just sender with operator|.

    +

    We chain the async_read onto the § 4.19.2 execution::just sender +with operator|.

  • Next, we pipe a std::invocable that will be invoked after the async_read completes using § 4.20.2 execution::then.

  • @@ -2941,13 +3009,16 @@

    async_read, which will read the data.

  • -

    Once the data has been read, in another § 4.20.2 execution::then, we confirm that we read the right number of bytes.

    +

    Once the data has been read, in another § 4.20.2 execution::then, we + confirm that we read the right number of bytes.

  • -

    Finally, we move out of and return our dynamic_buffer object. It will get sent by the sender returned by async_read_array. We can attach more things to that sender to use the data in the buffer.

    +

    Finally, we move out of and return our dynamic_buffer object. It will get + sent by the sender returned by async_read_array. We can attach more + things to that sender to use the data in the buffer.

    1.4. Asynchronous Windows socket recv

    -

    To get a better feel for how this interface might be used by low-level operations see this example implementation -of a cancellable async_recv() operation for a Windows Socket.

    +

    To get a better feel for how this interface might be used by low-level +operations see this example implementation of a cancellable async_recv() operation for a Windows Socket.

    struct operation_base : WSAOVERLAPPED {
         using completion_fn = void(operation_base* op, DWORD bytesTransferred, int errorCode) noexcept;
     
    @@ -2957,6 +3028,8 @@ 

    1.4.1. More end-user examples

    1.4.1.1. Sudoku solver
    -

    This example comes from Kirk Shoop, who ported an example from TBB’s documentation to sender/receiver in his fork of the libunifex repo. It is a Sudoku solver that uses a configurable number of threads to explore the search space for solutions.

    -

    The sender/receiver-based Sudoku solver can be found here. Some things that are worth noting about Kirk’s solution:

    +

    This example comes from Kirk Shoop, who ported an example from TBB’s +documentation to sender/receiver in his fork of the libunifex repo. It is a +Sudoku solver that uses a configurable number of threads to explore the search +space for solutions.

    +

    The sender/receiver-based Sudoku solver can be found here. +Some things that are worth noting about Kirk’s solution:

    1. -

      Although it schedules asychronous work onto a thread pool, and each unit of work will schedule more work, its use of structured concurrency patterns make reference counting unnecessary. The solution does not make use of shared_ptr.

      +

      Although it schedules asynchronous work onto a thread pool, and each unit of +work will schedule more work, its use of structured concurrency patterns +makes reference counting unnecessary. The solution does not make use of shared_ptr.

    2. -

      In addition to eliminating the need for reference counting, the use of structured concurrency makes it easy to ensure that resources are cleaned up on all code paths. In contrast, the TBB example that inspired this one leaks memory.

      +

      In addition to eliminating the need for reference counting, the use of +structured concurrency makes it easy to ensure that resources are cleaned up +on all code paths. In contrast, the TBB example that inspired this one leaks memory.

    For comparison, the TBB-based Sudoku solver can be found here.

    1.4.1.2. File copy
    -

    This example also comes from Kirk Shoop which uses sender/receiver to recursively copy the files a directory tree. It demonstrates how sender/receiver can be used to do IO, using a scheduler that schedules work on Linux’s io_uring.

    -

    As with the Sudoku example, this example obviates the need for reference counting by employing structured concurrency. It uses iteration with an upper limit to avoid having too many open file handles.

    +

    This example, which also comes from Kirk Shoop, uses sender/receiver to +recursively copy the files in a directory tree. It demonstrates how sender/receiver +can be used to do IO, using a scheduler that schedules work on Linux’s io_uring.

    +

    As with the Sudoku example, this example obviates the need for reference +counting by employing structured concurrency. It uses iteration with an upper +limit to avoid having too many open file handles.

    You can find the example here.

    1.4.1.3. Echo server
    -

    Dietmar Kuehl has a hobby project that implements networking APIs on top of sender/receiver. He recently implemented an echo server as a demo. His echo server code can be found here.

    -

    Below, I show the part of the echo server code. This code is executed for each client that connects to the echo server. In a loop, it reads input from a socket and echos the input back to the same socket. All of this, including the loop, is implemented with generic async algorithms.

    +

    Dietmar Kuehl has proposed networking APIs that use the sender/receiver +abstraction (see P2762). He has implemented an echo +server as a demo. His echo server code can be found here.

    +

    Below, I show part of the echo server code. This code is executed for each +client that connects to the echo server. In a loop, it reads input from a socket +and echoes the input back to the same socket. All of this, including the loop, is +implemented with generic async algorithms.

    outstanding.start(
         EX::repeat_effect_until(
               EX::let_value(
    @@ -3122,274 +3210,318 @@ 
    ) );
    -

    In this code, NN::async_read_some and NN::async_write_some are asynchronous socket-based networking APIs that return senders. EX::repeat_effect_until, EX::let_value, and EX::then are fully generic sender adaptor algorithms that accept and return senders.

    -

    This is a good example of seamless composition of async IO functions with non-IO operations. And by composing the senders in this structured way, all the state for the composite operation -- the repeat_effect_until expression and all its child operations -- is stored altogether in a single object.

    +

    In this code, NN::async_read_some and NN::async_write_some are asynchronous +socket-based networking APIs that return senders. EX::repeat_effect_until, EX::let_value, and EX::then are fully generic sender adaptor algorithms that +accept and return senders.

    +

    This is a good example of seamless composition of async IO functions with non-IO +operations. And by composing the senders in this structured way, all the state +for the composite operation -- the repeat_effect_until expression and all its +child operations -- is stored altogether in a single object.

    1.5. Examples: Algorithms

    -

    In this section we show a few simple sender/receiver-based algorithm implementations.

    +

    In this section we show a few simple sender/receiver-based algorithm +implementations.

    1.5.1. then

    -
    namespace exec = std::execution;
    +
    namespace stdexec = std::execution;
     
    -template<class R, class F>
    -class _then_receiver
    -    : exec::receiver_adaptor<_then_receiver<R, F>, R> {
    -  friend exec::receiver_adaptor<_then_receiver, R>;
    +template <class R, class F>
    +class _then_receiver : public R {
       F f_;
     
    -  // Customize set_value by invoking the callable and passing the result to the inner receiver
    -  template<class... As>
    -  void set_value(As&&... as) && noexcept try {
    -    exec::set_value(std::move(*this).base(), std::invoke((F&&) f_, (As&&) as...));
    -  } catch(...) {
    -    exec::set_error(std::move(*this).base(), std::current_exception());
    -  }
    -
      public:
    -  _then_receiver(R r, F f)
    -   : exec::receiver_adaptor<_then_receiver, R>{std::move(r)}
    -   , f_(std::move(f)) {}
    +  _then_receiver(R r, F f) : R(std::move(r)), f_(std::move(f)) {}
    +
    +  // Customize set_value by invoking the callable and passing the result to
    +  // the inner receiver
    +  template <class... As>
    +    requires std::invocable<F, As...>
    +  void set_value(As&&... as) && noexcept {
    +    try {
    +      // _then_receiver derives from R, so cast to R&& to reach the inner receiver
    +      stdexec::set_value(static_cast<R&&>(*this), std::invoke((F&&) f_, (As&&) as...));
    +    } catch(...) {
    +      stdexec::set_error(static_cast<R&&>(*this), std::current_exception());
    +    }
    +  }
     };
     
    -template<exec::sender S, class F>
    +template <stdexec::sender S, class F>
     struct _then_sender {
    -  using sender_concept = exec::sender_t;
    +  using sender_concept = stdexec::sender_t;
       S s_;
       F f_;
     
       template <class... Args>
    -    using _set_value_t = exec::completion_signatures<
    -      exec::set_value_t(std::invoke_result_t<F, Args...>)>;
    +    using _set_value_t = stdexec::completion_signatures<
    +      stdexec::set_value_t(std::invoke_result_t<F, Args...>)>;
    +
    +  using _except_ptr_sig =
    +    stdexec::completion_signatures<stdexec::set_error_t(std::exception_ptr)>;
     
       // Compute the completion signatures
    -  template<class Env>
    -  friend auto tag_invoke(exec::get_completion_signatures_t, _then_sender&&, Env)
    -    -> exec::transform_completion_signatures_of<S, Env,
    -        exec::completion_signatures<exec::set_error_t(std::exception_ptr)>,
    -        _set_value_t>;
    +  template <class Env>
    +  auto get_completion_signatures(Env&& env) && noexcept
    +    -> stdexec::transform_completion_signatures_of<
    +        S, Env, _except_ptr_sig, _set_value_t> {
    +    return {};
    +  }
     
       // Connect:
    -  template<exec::receiver R>
    -  friend auto tag_invoke(exec::connect_t, _then_sender&& self, R r)
    -    -> exec::connect_result_t<S, _then_receiver<R, F>> {
    -      return exec::connect(
    -        (S&&) self.s_, _then_receiver<R, F>{(R&&) r, (F&&) self.f_});
    +  template <stdexec::receiver R>
    +  auto connect(R r) && -> stdexec::connect_result_t<S, _then_receiver<R, F>> {
    +    return stdexec::connect(
    +      (S&&) s_, _then_receiver{(R&&) r, (F&&) f_});
       }
     
    -  friend decltype(auto) tag_invoke(get_env_t, const _then_sender& self) noexcept {
    -    return get_env(self.s_);
    +  decltype(auto) get_env() const noexcept {
    +    return stdexec::get_env(s_);
       }
     };
     
    -template<exec::sender S, class F>
    -exec::sender auto then(S s, F f) {
    +template <stdexec::sender S, class F>
    +stdexec::sender auto then(S s, F f) {
       return _then_sender<S, F>{(S&&) s, (F&&) f};
     }
     
    -

    This code builds a then algorithm that transforms the value(s) from the input sender -with a transformation function. The result of the transformation becomes the new value. -The other receiver functions (set_error and set_stopped), as well as all receiver queries, -are passed through unchanged.

    +

    This code builds a then algorithm that transforms the value(s) from the input +sender with a transformation function. The result of the transformation becomes +the new value. The other receiver functions (set_error and set_stopped), as +well as all receiver queries, are passed through unchanged.

    In detail, it does the following:

    1. -

      Defines a receiver in terms of execution::receiver_adaptor that aggregates -another receiver and an invocable that:

      +

      Defines a receiver that aggregates another receiver and an invocable that:

      • -

        Defines a constrained tag_invoke overload for transforming the value -channel.

        +

        Defines a constrained set_value member function for transforming the +value channel.

      • -

        Defines another constrained overload of tag_invoke that passes all other -customizations through unchanged.

        +

        Delegates set_error and set_stopped to the inner receiver.

      -

      The tag_invoke overloads are actually implemented by execution::receiver_adaptor; they dispatch either to named members, as -shown above with _then_receiver::set_value, or to the adapted receiver.

    2. -

      Defines a sender that aggregates another sender and the invocable, which defines a tag_invoke customization for std::execution::connect that wraps the incoming receiver in the receiver from (1) and passes it and the incoming sender to std::execution::connect, returning the result. It also defines a tag_invoke customization of get_completion_signatures that declares the sender’s completion signatures when executed within a particular environment.

      +

      Defines a sender that aggregates another sender and the invocable, which +defines a connect member function that wraps the incoming receiver in the +receiver from (1) and passes it and the incoming sender to std::execution::connect, returning the result. It also defines a get_completion_signatures member function that declares the sender’s +completion signatures when executed within a particular environment.
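The receiver half of the steps above can be sketched without std::execution at all. The following is a minimal, standalone illustration, not the proposal's implementation: `recording_receiver`, `toy_then_receiver`, `outcome`, `double_or_throw` and the two driver functions are hypothetical names invented for this sketch, and the receivers are plain structs rather than types that model the `stdexec::receiver` concept. It shows only the channel logic: `set_value` runs the invocable and forwards the result (or the thrown exception), while `set_error` and `set_stopped` are delegated unchanged.

```cpp
#include <cassert>
#include <exception>
#include <functional>
#include <stdexcept>
#include <type_traits>
#include <utility>

// Hypothetical bookkeeping types; nothing here comes from std::execution.
enum class completion { none, value, error, stopped };

struct outcome {
  completion how = completion::none;
  int value = 0;
};

// Stand-in for the inner receiver: it records which channel completed.
struct recording_receiver {
  outcome* out_;

  void set_value(int v) && noexcept {
    out_->how = completion::value;
    out_->value = v;
  }
  void set_error(std::exception_ptr) && noexcept { out_->how = completion::error; }
  void set_stopped() && noexcept { out_->how = completion::stopped; }
};

// Wraps an inner receiver r_ and an invocable f_.
template <class R, class F>
struct toy_then_receiver {
  R r_;
  F f_;

  // Value channel: only available when f_ accepts the values. The result of
  // f_ becomes the new value; an exception is rerouted to the error channel.
  template <class... As,
            class = std::enable_if_t<std::is_invocable_v<F, As...>>>
  void set_value(As&&... as) && noexcept {
    try {
      std::move(r_).set_value(
          std::invoke(std::move(f_), std::forward<As>(as)...));
    } catch (...) {
      std::move(r_).set_error(std::current_exception());
    }
  }

  // Error and stopped channels: passed through to the inner receiver.
  template <class E>
  void set_error(E&& e) && noexcept {
    std::move(r_).set_error(std::forward<E>(e));
  }

  void set_stopped() && noexcept { std::move(r_).set_stopped(); }
};

inline int double_or_throw(int x) {
  if (x < 0) throw std::invalid_argument("negative input");
  return 2 * x;
}

// Drives the value channel with `input` and reports what the inner receiver saw.
outcome complete_with_value(int input) {
  outcome out;
  toy_then_receiver<recording_receiver, int (*)(int)> rcvr{
      recording_receiver{&out}, &double_or_throw};
  std::move(rcvr).set_value(input);
  return out;
}

// Drives the stopped channel and reports what the inner receiver saw.
outcome complete_with_stopped() {
  outcome out;
  toy_then_receiver<recording_receiver, int (*)(int)> rcvr{
      recording_receiver{&out}, &double_or_throw};
  std::move(rcvr).set_stopped();
  return out;
}
```

Calling `complete_with_value` with a non-negative input reaches the inner receiver on the value channel with the doubled value. A negative input makes the invocable throw, so the inner receiver is completed on the error channel instead.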

    1.5.2. retry

    using namespace std;
    -namespace exec = execution;
    +namespace stdexec = execution;
     
    -template <class From, class To>
    +template<class From, class To>
     concept _decays_to = same_as<decay_t<From>, To>;
     
     // _conv needed so we can emplace construct non-movable types into
     // a std::optional.
     template<invocable F>
    -  requires is_nothrow_move_constructible_v<F>
     struct _conv {
       F f_;
    +
    +  static_assert(is_nothrow_move_constructible_v<F>);
       explicit _conv(F f) noexcept : f_((F&&) f) {}
    +
       operator invoke_result_t<F>() && {
         return ((F&&) f_)();
       }
     };
     
     template<class S, class R>
    -struct _op;
    +struct _retry_op;
     
    -// pass through all customizations except set_error, which retries the operation.
    +// pass through all customizations except set_error, which retries
    +// the operation.
     template<class S, class R>
    -struct _retry_receiver
    -  : exec::receiver_adaptor<_retry_receiver<S, R>> {
    -  _op<S, R>* o_;
    +struct _retry_receiver {
    +  _retry_op<S, R>* o_;
     
    -  R&& base() && noexcept { return (R&&) o_->r_; }
    -  const R& base() const & noexcept { return o_->r_; }
    -
    -  explicit _retry_receiver(_op<S, R>* o) : o_(o) {}
    +  void set_value(auto&&... as) && noexcept {
    +    stdexec::set_value(std::move(o_->r_), (decltype(as)&&) as...);
    +  }
     
       void set_error(auto&&) && noexcept {
         o_->_retry(); // This causes the op to be retried
       }
    +
    +  void set_stopped() && noexcept {
    +    stdexec::set_stopped(std::move(o_->r_));
    +  }
    +
    +  decltype(auto) get_env() const noexcept {
    +    return stdexec::get_env(o_->r_);
    +  }
     };
     
     // Hold the nested operation state in an optional so we can
     // re-construct and re-start it if the operation fails.
     template<class S, class R>
    -struct _op {
    +struct _retry_op {
    +  using operation_state_concept = stdexec::operation_state_t;
    +  using _child_op_t =
    +    stdexec::connect_result_t<S&, _retry_receiver<S, R>>;
    +
       S s_;
       R r_;
    -  optional<
    -      exec::connect_result_t<S&, _retry_receiver<S, R>>> o_;
    +  optional<_child_op_t> o_;
     
    -  _op(S s, R r): s_((S&&)s), r_((R&&)r), o_{_connect()} {}
       _retry_op(_retry_op&&) = delete;
    +  _retry_op(S s, R r)
    +    : s_(std::move(s)), r_(std::move(r)), o_{_connect()} {}
     
       auto _connect() noexcept {
         return _conv{[this] {
    -      return exec::connect(s_, _retry_receiver<S, R>{this});
    +      return stdexec::connect(s_, _retry_receiver<S, R>{this});
         }};
       }
    -  void _retry() noexcept try {
    -    o_.emplace(_connect()); // potentially-throwing
    -    exec::start(*o_);
    -  } catch(...) {
    -    exec::set_error((R&&) r_, std::current_exception());
    +
    +  void _retry() noexcept {
    +    try {
    +      o_.emplace(_connect()); // potentially-throwing
    +      stdexec::start(*o_);
    +    } catch(...) {
    +      stdexec::set_error(std::move(r_), std::current_exception());
    +    }
       }
    -  friend void tag_invoke(exec::start_t, _op& o) noexcept {
    -    exec::start(*o.o_);
    +
    +  void start() & noexcept {
    +    stdexec::start(*o_);
       }
     };
     
    +// Helpers for computing the `retry` sender's completion signatures:
    +template <class... Ts>
    +  using _value_t =
    +    stdexec::completion_signatures<stdexec::set_value_t(Ts...)>;
    +
    +template <class>
    +  using _error_t = stdexec::completion_signatures<>;
    +
    +using _except_sig =
    +  stdexec::completion_signatures<stdexec::set_error_t(std::exception_ptr)>;
    +
     template<class S>
     struct _retry_sender {
    -  using sender_concept = exec::sender_t;
    +  using sender_concept = stdexec::sender_t;
       S s_;
    -  explicit _retry_sender(S s) : s_((S&&) s) {}
    -
    -  template <class... Ts>
    -    using _value_t =
    -      exec::completion_signatures<exec::set_value_t(Ts...)>;
    -  template <class>
    -    using _error_t = exec::completion_signatures<>;
    +  explicit _retry_sender(S s) : s_(std::move(s)) {}
     
       // Declare the signatures with which this sender can complete
       template <class Env>
    -  friend auto tag_invoke(exec::get_completion_signatures_t, const _retry_sender&, Env)
    -    -> exec::transform_completion_signatures_of<S&, Env,
    -        exec::completion_signatures<exec::set_error_t(std::exception_ptr)>,
    -        _value_t, _error_t>;
    -
    -  template<exec::receiver R>
    -  friend _op<S, R> tag_invoke(exec::connect_t, _retry_sender&& self, R r) {
    -    return {(S&&) self.s_, (R&&) r};
    +    using _compl_sigs =
    +      stdexec::transform_completion_signatures_of<
    +        S&, Env, _except_sig, _value_t, _error_t>;
    +
    +  template <class Env>
    +  auto get_completion_signatures(Env&&) const noexcept -> _compl_sigs<Env> {
    +    return {};
       }
     
    -  friend decltype(auto) tag_invoke(exec::get_env_t, const _retry_sender& self) noexcept {
    -    return get_env(self.s_);
    +  template <stdexec::receiver R>
    +    requires stdexec::sender_to<S&, _retry_receiver<S, R>>
    +  _retry_op<S, R> connect(R r) && {
    +    return {std::move(s_), std::move(r)};
    +  }
    +
    +  decltype(auto) get_env() const noexcept {
    +    return stdexec::get_env(s_);
       }
     };
     
    -template<exec::sender S>
    -exec::sender auto retry(S s) {
    -  return _retry_sender{(S&&) s};
    +template <stdexec::sender S>
    +stdexec::sender auto retry(S s) {
    +  return _retry_sender{std::move(s)};
     }
     
    -

    The retry algorithm takes a multi-shot sender and causes it to repeat on error, passing -through values and stopped signals. Each time the input sender is restarted, a new receiver -is connected and the resulting operation state is stored in an optional, which allows us -to reinitialize it multiple times.

    +

    The retry algorithm takes a multi-shot sender and causes it to repeat on +error, passing through values and stopped signals. Each time the input sender is +restarted, a new receiver is connected and the resulting operation state is +stored in an optional, which allows us to reinitialize it multiple times.

    This example does the following:

    1. -

      Defines a _conv utility that takes advantage of C++17’s guaranteed copy elision to -emplace a non-movable type in a std::optional.

      +

      Defines a _conv utility that takes advantage of C++17’s guaranteed copy +elision to emplace a non-movable type in a std::optional.

    2. -

      Defines a _retry_receiver that holds a pointer back to the operation state. It passes -all customizations through unmodified to the inner receiver owned by the operation state -except for set_error, which causes a _retry() function to be called instead.

      +

      Defines a _retry_receiver that holds a pointer back to the operation state. +It passes all customizations through unmodified to the inner receiver owned +by the operation state except for set_error, which causes a _retry() function to be called instead.

    3. -

      Defines an operation state that aggregates the input sender and receiver, and declares -storage for the nested operation state in an optional. Constructing the operation -state constructs a _retry_receiver with a pointer to the (under construction) operation -state and uses it to connect to the aggregated sender.

      +

      Defines an operation state that aggregates the input sender and receiver, and +declares storage for the nested operation state in an optional. +Constructing the operation state constructs a _retry_receiver with a +pointer to the (under construction) operation state and uses it to connect +to the input sender.

    4. -

      Starting the operation state dispatches to start on the inner operation state.

      +

      Starting the operation state dispatches to start on the inner operation +state.

    5. -

      The _retry() function reinitializes the inner operation state by connecting the sender -to a new receiver, holding a pointer back to the outer operation state as before.

      +

      The _retry() function reinitializes the inner operation state by connecting +the sender to a new receiver, holding a pointer back to the outer operation +state as before.

    6. -

      After reinitializing the inner operation state, _retry() calls start on it, causing -the failed operation to be rescheduled.

      +

      After reinitializing the inner operation state, _retry() calls start on +it, causing the failed operation to be rescheduled.

    7. -

      Defines a _retry_sender that implements the connect customization point to return -an operation state constructed from the passed-in sender and receiver.

      +

      Defines a _retry_sender that implements a connect member function to +return an operation state constructed from the passed-in sender and +receiver.

    8. -

      _retry_sender also implements the get_completion_signatures customization point to describe the ways this sender may complete when executed in a particular execution resource.

      +

      _retry_sender also implements a get_completion_signatures member function +to describe the ways this sender may complete when executed in a particular +execution resource.
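Step 1 above is the least obvious part of the example, so here is a minimal sketch of the conversion trick in isolation. It uses only the standard library. The names `pinned`, `conv` and `emplace_twice` are hypothetical and chosen for this illustration; `pinned` merely stands in for a non-movable operation state. The sketch relies on the same behavior the paper's `_conv` relies on: the prvalue returned by the conversion operator initializes the object inside the `std::optional` directly, without a move.

```cpp
#include <cassert>
#include <optional>
#include <type_traits>
#include <utility>

// Hypothetical non-movable type standing in for an operation state.
struct pinned {
  int id_;
  explicit pinned(int id) noexcept : id_(id) {}
  pinned(pinned&&) = delete;  // deleting the move also suppresses copying
};

// Same shape as _conv: an object that converts to whatever f_ returns.
template <class F>
struct conv {
  F f_;
  explicit conv(F f) noexcept : f_(std::move(f)) {}

  operator std::invoke_result_t<F>() && {
    return std::move(f_)();  // returns a prvalue, so nothing is moved
  }
};

// Emplaces a non-movable object, then replaces it with a fresh one,
// mirroring how the retry operation state re-creates its child operation.
int emplace_twice() {
  std::optional<pinned> slot;
  slot.emplace(conv{[] { return pinned{1}; }});
  slot.emplace(conv{[] { return pinned{2}; }});  // destroys #1, builds #2 in place
  return slot->id_;
}
```

`emplace` forwards the `conv` object to the constructor of `pinned`, and the conversion operator produces the new object in the optional's storage. That is why `_retry()` can call `o_.emplace(_connect())` even though operation states are not movable.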

    1.6. Examples: Schedulers

    In this section we look at some schedulers of varying complexity.

    1.6.1. Inline scheduler

    -
    class inline_scheduler {
    +
    namespace stdexec = std::execution;
    +
    +class inline_scheduler {
       template <class R>
    -    struct _op {
    -      [[no_unique_address]] R rec_;
    -      friend void tag_invoke(std::execution::start_t, _op& op) noexcept {
    -        std::execution::set_value((R&&) op.rec_);
    -      }
    -    };
    +  struct _op {
    +    using operation_state_concept = stdexec::operation_state_t;
    +    R rec_;
    +
    +    void start() & noexcept {
    +      stdexec::set_value(std::move(rec_));
    +    }
    +  };
     
       struct _env {
         template <class Tag>
    -      friend inline_scheduler tag_invoke(
    -          std::execution::get_completion_scheduler_t<Tag>, _env) noexcept {
    -        return {};
    -      }
    +    inline_scheduler query(stdexec::get_completion_scheduler_t<Tag>) const noexcept {
    +      return {};
    +    }
       };
     
       struct _sender {
    -    using sender_concept = std::execution::sender_t;
    -    using completion_signatures =
    -      std::execution::completion_signatures<std::execution::set_value_t()>;
    -
    -    template <class R>
    -      friend auto tag_invoke(std::execution::connect_t, _sender, R&& rec)
    -        noexcept(std::is_nothrow_constructible_v<std::remove_cvref_t<R>, R>)
    -        -> _op<std::remove_cvref_t<R>> {
    -        return {(R&&) rec};
    -      }
    +    using sender_concept = stdexec::sender_t;
    +    using _compl_sigs = stdexec::completion_signatures<stdexec::set_value_t()>;
    +    using completion_signatures = _compl_sigs;
     
    -    friend _env tag_invoke(exec::get_env_t, _sender) noexcept {
    +    template <stdexec::receiver_of<_compl_sigs> R>
    +    _op<R> connect(R rec) noexcept(std::is_nothrow_move_constructible_v<R>) {
    +      return {std::move(rec)};
    +    }
    +
    +    _env get_env() const noexcept {
           return {};
         }
       };
     
    -  friend _sender tag_invoke(std::execution::schedule_t, const inline_scheduler&) noexcept {
    + public:
    +  inline_scheduler() = default;
    +
    +  _sender schedule() const noexcept {
         return {};
       }
     
    - public:
    -  inline_scheduler() = default;
       bool operator==(const inline_scheduler&) const noexcept = default;
     };
     
    -

    The inline scheduler is a trivial scheduler that completes immediately and synchronously on -the thread that calls std::execution::start on the operation state produced by its sender. -In other words, start(connect(schedule(inline-scheduler), receiver)) is -just a fancy way of saying set_value(receiver), with the exception of the fact that start wants to be passed an lvalue.

    -

    Although not a particularly useful scheduler, it serves to illustrate the basics of -implementing one. The inline_scheduler:

    +

    The inline scheduler is a trivial scheduler that completes immediately and +synchronously on the thread that calls std::execution::start on the operation +state produced by its sender. In other words, start(connect(schedule(inline_scheduler()), receiver)) is just a fancy way of +saying set_value(receiver), with the exception of the fact that start wants +to be passed an lvalue.

    +

    Although not a particularly useful scheduler, it serves to illustrate the basics +of implementing one. The inline_scheduler:

    1. Customizes execution::schedule to return an instance of the sender type _sender.

      @@ -3406,8 +3538,9 @@

      The operation state customizes std::execution::start to call std::execution::set_value on the receiver.
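The claim that starting the inline scheduler's operation is "just a fancy way of saying set_value(receiver)" can be checked with a standalone toy. This is a sketch under stated assumptions, not the paper's `inline_scheduler`: `toy_inline_scheduler`, `flag_receiver`, `observation` and `observe_inline_completion` are hypothetical names, and the members are called directly rather than through the `std::execution` customization points.

```cpp
#include <cassert>
#include <utility>

// Hypothetical receiver that flips a flag when it is completed.
struct flag_receiver {
  bool* done_;
  void set_value() && noexcept { *done_ = true; }
};

// Toy scheduler with the same three layers as the example above:
// schedule() -> sender, connect() -> operation state, start() -> completion.
class toy_inline_scheduler {
  template <class R>
  struct op {
    R rec_;
    // Completes the receiver immediately, on the calling thread.
    void start() & noexcept { std::move(rec_).set_value(); }
  };

  struct sender {
    template <class R>
    op<R> connect(R rec) && noexcept {
      return {std::move(rec)};
    }
  };

 public:
  sender schedule() const noexcept { return {}; }
};

struct observation {
  bool done_before_start;
  bool done_after_start;
};

// Connects a receiver and samples its flag on both sides of start().
observation observe_inline_completion() {
  bool done = false;
  auto state = toy_inline_scheduler{}.schedule().connect(flag_receiver{&done});
  observation seen{};
  seen.done_before_start = done;  // connect() alone does no work
  state.start();                  // start() needs an lvalue operation state
  seen.done_after_start = done;   // completion happened inside start()
  return seen;
}
```

Nothing happens when the sender is created or connected. The receiver is completed only inside `start()`, before `start()` returns, which is what makes the scheduler "inline".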

    1.6.2. Single thread scheduler

    -

    This example shows how to create a scheduler for an execution resource that consists of a single -thread. It is implemented in terms of a lower-level execution resource called std::execution::run_loop.

    +

    This example shows how to create a scheduler for an execution resource that +consists of a single thread. It is implemented in terms of a lower-level +execution resource called std::execution::run_loop.

    class single_thread_context {
       std::execution::run_loop loop_;
       std::thread thread_;
    @@ -3417,6 +3550,7 @@ 

    : loop_() , thread_([this] { loop_.run(); }) {} + single_thread_context(single_thread_context&&) = delete; ~single_thread_context() { loop_.finish(); @@ -3432,10 +3566,11 @@

    } };

    -

    The single_thread_context owns an event loop and a thread to drive it. In the destructor, it tells the event -loop to finish up what it’s doing and then joins the thread, blocking for the event loop to drain.

    +

    The single_thread_context owns an event loop and a thread to drive it. In the +destructor, it tells the event loop to finish up what it’s doing and then joins +the thread, blocking for the event loop to drain.

    The interesting bits are in the execution::run_loop context implementation. It -is slightly too long to include here, so we only provide a reference to +is slightly too long to include here, so we only provide a reference to it, but there is one noteworthy detail about its implementation: It uses space in its operation states to build an intrusive linked list of work items. In @@ -3445,7 +3580,10 @@

    1.7. Examples: Server theme

    -

    In this section we look at some examples of how one would use senders to implement an HTTP server. The examples ignore the low-level details of the HTTP server and looks at how senders can be combined to achieve the goals of the project.

    +

    In this section we look at some examples of how one would use senders to +implement an HTTP server. The examples ignore the low-level details of the HTTP +server and look at how senders can be combined to achieve the goals of the +project.

    General application context:

    • @@ -3471,80 +3609,98 @@

      Example context:

      • -

        we are looking at the flow of processing an HTTP request and sending back the response

        +

        we are looking at the flow of processing an HTTP request and sending back the +response.

      • -

        show how one can break the (slightly complex) flow into steps with execution::let_* functions

        +

        show how one can break the (slightly complex) flow into steps with execution::let_* functions.

      • -

        different phases of processing HTTP requests are broken down into separate concerns

        +

        different phases of processing HTTP requests are broken down into separate +concerns.

      • -

        each part of the processing might use different execution resources (details not shown in this example)

        +

        each part of the processing might use different execution resources (details +not shown in this example).

      • -

        error handling is generic, regardless which component fails; we always send the right response to the clients

        +

        error handling is generic, regardless of which component fails; we always send +the right response to the clients.

      Goals:

      • -

        show how one can break more complex flows into steps with let_* functions

        +

        show how one can break more complex flows into steps with let_* functions.

      • -

        exemplify the use of let_value, let_error, let_stopped, and just algorithms

        +

        exemplify the use of let_value, let_error, let_stopped, and just algorithms.

      -
      namespace ex = std::execution;
      +
      namespace stdexec = std::execution;
       
       // Returns a sender that yields an http_request object for an incoming request
      -ex::sender auto schedule_request_start(read_requests_ctx ctx) {...}
      +stdexec::sender auto schedule_request_start(read_requests_ctx ctx) {...}
      +
       // Sends a response back to the client; yields a void signal on success
      -ex::sender auto send_response(const http_response& resp) {...}
      +stdexec::sender auto send_response(const http_response& resp) {...}
      +
       // Validate that the HTTP request is well-formed; forwards the request on success
      -ex::sender auto validate_request(const http_request& req) {...}
      +stdexec::sender auto validate_request(const http_request& req) {...}
       
       // Handle the request; main application logic
      -ex::sender auto handle_request(const http_request& req) {
      +stdexec::sender auto handle_request(const http_request& req) {
         //...
      -  return ex::just(http_response{200, result_body});
      +  return stdexec::just(http_response{200, result_body});
       }
       
       // Transforms server errors into responses to be sent to the client
      -ex::sender auto error_to_response(std::exception_ptr err) {
      +stdexec::sender auto error_to_response(std::exception_ptr err) {
         try {
           std::rethrow_exception(err);
         } catch (const std::invalid_argument& e) {
      -    return ex::just(http_response{404, e.what()});
      +    return stdexec::just(http_response{404, e.what()});
         } catch (const std::exception& e) {
      -    return ex::just(http_response{500, e.what()});
      +    return stdexec::just(http_response{500, e.what()});
         } catch (...) {
      -    return ex::just(http_response{500, "Unknown server error"});
      +    return stdexec::just(http_response{500, "Unknown server error"});
         }
       }
      +
       // Transforms cancellation of the server into responses to be sent to the client
      -ex::sender auto stopped_to_response() {
      -  return ex::just(http_response{503, "Service temporarily unavailable"});
      +stdexec::sender auto stopped_to_response() {
      +  return stdexec::just(http_response{503, "Service temporarily unavailable"});
       }
      +
       //...
      +
       // The whole flow for transforming incoming requests into responses
      -ex::sender auto snd =
      +stdexec::sender auto snd =
           // get a sender when a new request comes
           schedule_request_start(the_read_requests_ctx)
           // make sure the request is valid; throw if not
      -    | ex::let_value(validate_request)
      +    | stdexec::let_value(validate_request)
           // process the request in a function that may be using a different execution resource
      -    | ex::let_value(handle_request)
      +    | stdexec::let_value(handle_request)
           // If there are errors transform them into proper responses
      -    | ex::let_error(error_to_response)
      +    | stdexec::let_error(error_to_response)
           // If the flow is cancelled, send back a proper response
      -    | ex::let_stopped(stopped_to_response)
      +    | stdexec::let_stopped(stopped_to_response)
           // write the result back to the client
      -    | ex::let_value(send_response)
      +    | stdexec::let_value(send_response)
           // done
           ;
      +
       // execute the whole flow asynchronously
      -ex::start_detached(std::move(snd));
      -
      -

      The example shows how one can separate out the concerns for interpreting requests, validating requests, running the main logic for handling the request, generating error responses, handling cancellation and sending the response back to the client. -They are all different phases in the application, and can be joined together with the let_* functions.

      -

      All our functions return execution::sender objects, so that they can all generate success, failure and cancellation paths. -For example, regardless where an error is generated (reading request, validating request or handling the response), we would have one common block to handle the error, and following error flows is easy.

      -

      Also, because of using execution::sender objects at any step, we might expect any of these steps to be completely asynchronous; the overall flow doesn’t care. -Regardless of the execution resource in which the steps, or part of the steps are executed in, the flow is still the same.

      +stdexec::start_detached(std::move(snd)); +
      +

      The example shows how one can separate out the concerns for interpreting +requests, validating requests, running the main logic for handling the request, +generating error responses, handling cancellation and sending the response back +to the client. They are all different phases in the application, and can be +joined together with the let_* functions.

      +

      All our functions return execution::sender objects, so that they can all +generate success, failure and cancellation paths. For example, regardless of where +an error is generated (reading the request, validating the request or handling the +response), we would have one common block to handle the error, and following +error flows is easy.

      +

      Also, because execution::sender objects are used at every step, any of these +steps may be completely asynchronous; the overall flow doesn’t care. +Regardless of the execution resource on which the steps, or parts of the steps, +are executed, the flow is still the same.

      1.7.2. Moving between execution resources with execution::on and execution::transfer

      Example context:

        @@ -3562,45 +3718,48 @@

        exemplify the use of on and transfer algorithms

      -
      namespace ex = std::execution;
      +
      namespace stdexec = std::execution;
       
      -size_t legacy_read_from_socket(int sock, char* buffer, size_t buffer_len) {}
      -void process_read_data(const char* read_data, size_t read_len) {}
      +size_t legacy_read_from_socket(int sock, char* buffer, size_t buffer_len);
      +void process_read_data(const char* read_data, size_t read_len);
       //...
       
       // A sender that just calls the legacy read function
      -auto snd_read = ex::just(sock, buf, buf_len) | ex::then(legacy_read_from_socket);
      +auto snd_read = stdexec::just(sock, buf, buf_len)
      +              | stdexec::then(legacy_read_from_socket);
      +
       // The entire flow
       auto snd =
           // start by reading data on the I/O thread
      -    ex::on(io_sched, std::move(snd_read))
      +    stdexec::on(io_sched, std::move(snd_read))
           // do the processing on the worker threads pool
      -    | ex::transfer(work_sched)
      +    | stdexec::transfer(work_sched)
           // process the incoming data (on worker threads)
      -    | ex::then([buf](int read_len) { process_read_data(buf, read_len); })
      +    | stdexec::then([buf](size_t read_len) { process_read_data(buf, read_len); })
           // done
           ;
      +
       // execute the whole flow asynchronously
      -ex::start_detached(std::move(snd));
      -
      -

      The example assume that we need to wrap some legacy code of reading sockets, and handle execution resource switching. -(This style of reading from socket may not be the most efficient one, but it’s working for our purposes.) -For performance reasons, the reading from the socket needs to be done on the I/O thread, and all the processing needs to happen on a work-specific execution resource (i.e., thread pool).

      -

      Calling execution::on will ensure that the given sender will be started on the given scheduler. -In our example, snd_read is going to be started on the I/O scheduler. -This sender will just call the legacy code.

      -

      The completion-signal will be issued in the I/O execution resource, so we have to move it to the work thread pool. -This is achieved with the help of the execution::transfer algorithm. -The rest of the processing (in our case, the last call to then) will happen in the work thread pool.

      -

      The reader should notice the difference between execution::on and execution::transfer. -The execution::on algorithm will ensure that the given sender will start in the specified context, and doesn’t care where the completion-signal for that sender is sent. -The execution::transfer algorithm will not care where the given sender is going to be started, but will ensure that the completion-signal of will be transferred to the given context.

      -

      1.8. What this proposal is not

      -

      This paper is not a patch on top of A Unified Executors Proposal for C++; we are not asking to update the existing paper, we are asking to retire it in favor of this paper, which is already self-contained; any example code within this paper can be written in Standard C++, without the need -to standardize any further facilities.

      -

      This paper is not an alternative design to A Unified Executors Proposal for C++; rather, we have taken the design in the current executors paper, and applied targeted fixes to allow it to fulfill the promises of the sender/receiver model, as well as provide all the facilities we consider -essential when writing user code using standard execution concepts; we have also applied the guidance of removing one-way executors from the paper entirely, and instead provided an algorithm based around senders that serves the same purpose.

      -

      1.9. Design changes from P0443

      +stdexec::start_detached(std::move(snd)); +
      +

      The example assumes that we need to wrap some legacy code for reading sockets, and +handle execution resource switching. (This style of reading from a socket may not +be the most efficient one, but it works for our purposes.) For performance +reasons, the reading from the socket needs to be done on the I/O thread, and all +the processing needs to happen on a work-specific execution resource (i.e., +thread pool).

      +

      Calling execution::on will ensure that the given sender will be started on the +given scheduler. In our example, snd_read is going to be started on the I/O +scheduler. This sender will just call the legacy code.

      +

      The completion-signal will be issued in the I/O execution resource, so we have +to move it to the work thread pool. This is achieved with the help of the execution::transfer algorithm. The rest of the processing (in our case, the +last call to then) will happen in the work thread pool.

      +

      The reader should notice the difference between execution::on and execution::transfer. The execution::on algorithm will ensure that the given +sender will start in the specified context, and doesn’t care where the +completion-signal for that sender is sent. The execution::transfer algorithm +will not care where the given sender is going to be started, but will ensure +that the completion-signal of that sender will be transferred to the given context.

      +

      1.8. Design changes from P0443

      1. The executor concept has been removed and all of its proposed functionality @@ -3627,7 +3786,7 @@

        typed_sender concept is renamed sender; sender_traits is replaced with completion_signatures_of_t.

      2. Specific type erasure facilities are omitted, as per LEWG direction. Type -erasure facilities can be built on top of this proposal, as discussed in § 5.9 Ranges-style CPOs vs tag_invoke.

        +erasure facilities can be built on top of this proposal, as discussed in § 5.9 Customization points.

      3. A specific thread pool implementation is omitted, as per LEWG direction.

      4. @@ -3636,103 +3795,209 @@

        run_loop: An execution resource that provides a multi-producer, single-consumer, first-in-first-out work queue.

        -
      5. -

        receiver_adaptor: A utility for algorithm authors for defining one -receiver type in terms of another.

      6. completion_signatures and transform_completion_signatures: Utilities for describing the ways in which a sender can complete in a declarative syntax.

    -

    1.10. Prior art

    -

    This proposal builds upon and learns from years of prior art with asynchronous and parallel programming frameworks in C++. In this section, we discuss async abstractions that have previously been suggested as a possible basis for asynchronous algorithms and why they fall short.

    -

    1.10.1. Futures

    -

    A future is a handle to work that has already been scheduled for execution. It is one end of a communication channel; the other end is a promise, used to receive the result from the concurrent operation and to communicate it to the future.

    -

    Futures, as traditionally realized, require the dynamic allocation and management of a shared state, synchronization, and typically type-erasure of work and continuation. Many of these costs are inherent in the nature of "future" as a handle to work that is already scheduled for execution. These expenses rule out the future abstraction for many uses and make it a poor choice for a basis of a generic mechanism.

    -

    1.10.2. Coroutines

    -

    C++20 coroutines are frequently suggested as a basis for asynchronous algorithms. It’s fair to ask why, if we added coroutines to C++, are we suggesting the addition of a library-based abstraction for asynchrony. Certainly, coroutines come with huge syntactic and semantic advantages over the alternatives.

    -

    Although coroutines are lighter weight than futures, coroutines suffer many of the same problems. Since they typically start suspended, they can avoid synchronizing the chaining of dependent work. However in many cases, coroutine frames require an unavoidable dynamic allocation and indirect function calls. This is done to hide the layout of the coroutine frame from the C++ type system, which in turn makes possible the separate compilation of coroutines and certain compiler optimizations, such as optimization of the coroutine frame size.

    -

    Those advantages come at a cost, though. Because of the dynamic allocation of coroutine frames, coroutines in embedded or heterogeneous environments, which often lack support for dynamic allocation, require great attention to detail. And the allocations and indirections tend to complicate the job of the inliner, often resulting in sub-optimal codegen.

    -

    The coroutine language feature mitigates these shortcomings somewhat with the HALO optimization (Halo: coroutine Heap Allocation eLision Optimization: the joint response), which leverages existing compiler optimizations such as allocation elision and devirtualization to inline the coroutine, completely eliminating the runtime overhead. However, HALO requires a sophisticated compiler, and a fair number of stars need to align for the optimization to kick in. In our experience, more often than not in real-world code today’s compilers are not able to inline the coroutine, resulting in allocations and indirections in the generated code.

    -

    In a suite of generic async algorithms that are expected to be callable from hot code paths, the extra allocations and indirections are a deal-breaker. It is for these reasons that we consider coroutines a poor choice for a basis of all standard async.

    -

    1.10.3. Callbacks

    -

    Callbacks are the oldest, simplest, most powerful, and most efficient mechanism for creating chains of work, but suffer problems of their own. Callbacks must propagate either errors or values. This simple requirement yields many different interface possibilities. The lack of a standard callback shape obstructs generic design.

    -

    Additionally, few of these possibilities accommodate cancellation signals when the user requests upstream work to stop and clean up.

    -

    1.11. Field experience

    -

    1.11.1. libunifex

    -

    This proposal draws heavily from our field experience with libunifex. Libunifex implements all of the concepts and customization points defined in this paper (with slight variations -- the design of P2300 has evolved due to LEWG feedback), many of this paper’s algorithms (some under different names), and much more besides.

    -

    Libunifex has several concrete schedulers in addition to the run_loop suggested here (where it is called manual_event_loop). It has schedulers that dispatch efficiently to epoll and io_uring on Linux and the Windows Thread Pool on Windows.

    -

    In addition to the proposed interfaces and the additional schedulers, it has several important extensions to the facilities described in this paper, which demonstrate directions in which these abstractions may be evolved over time, including:

    +

    1.9. Prior art

    +

    This proposal builds upon and learns from years of prior art with asynchronous +and parallel programming frameworks in C++. In this section, we discuss async +abstractions that have previously been suggested as a possible basis for +asynchronous algorithms and why they fall short.

    +

    1.9.1. Futures

    +

    A future is a handle to work that has already been scheduled for execution. It +is one end of a communication channel; the other end is a promise, used to +receive the result from the concurrent operation and to communicate it to the +future.

    +

    Futures, as traditionally realized, require the dynamic allocation and +management of a shared state, synchronization, and typically type-erasure of +work and continuation. Many of these costs are inherent in the nature of +"future" as a handle to work that is already scheduled for execution. These +expenses rule out the future abstraction for many uses and make it a poor +choice for a basis of a generic mechanism.
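As a minimal sketch of where these costs arise, the following uses the existing standard `std::promise` and `std::future` pair. The function name `eager_answer` is hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <future>
#include <thread>
#include <utility>

// Hypothetical helper: starts work eagerly and hands the result back through a future.
int eager_answer() {
    std::promise<int> p;                  // producer end of the channel
    std::future<int> f = p.get_future();  // consumer end; both refer to one shared state
    assert(f.valid());

    // The work is already running by the time the caller holds the future.
    std::thread worker([pr = std::move(p)]() mutable { pr.set_value(42); });

    int result = f.get();  // blocks, synchronizing through the shared state
    worker.join();
    return result;
}
```

The shared state has to outlive both ends and be safe to touch from two threads, which is why implementations typically allocate it dynamically and synchronize access to it.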

    +

    1.9.2. Coroutines

    +

    C++20 coroutines are frequently suggested as a basis for asynchronous +algorithms. It’s fair to ask why, if we added coroutines to C++, are we +suggesting the addition of a library-based abstraction for asynchrony. +Certainly, coroutines come with huge syntactic and semantic advantages over the +alternatives.

    +

    Although coroutines are lighter weight than futures, coroutines suffer many of +the same problems. Since they typically start suspended, they can avoid +synchronizing the chaining of dependent work. However in many cases, coroutine +frames require an unavoidable dynamic allocation and indirect function calls. +This is done to hide the layout of the coroutine frame from the C++ type system, +which in turn makes possible the separate compilation of coroutines and certain +compiler optimizations, such as optimization of the coroutine frame size.

    +

    Those advantages come at a cost, though. Because of the dynamic allocation of +coroutine frames, coroutines in embedded or heterogeneous environments, which +often lack support for dynamic allocation, require great attention to detail. +And the allocations and indirections tend to complicate the job of the inliner, +often resulting in sub-optimal codegen.

    +

    The coroutine language feature mitigates these shortcomings somewhat with the +HALO optimization (Halo: coroutine Heap Allocation eLision Optimization: the joint response), which leverages existing compiler optimizations +such as allocation elision and devirtualization to inline the coroutine, +completely eliminating the runtime overhead. However, HALO requires a +sophisticated compiler, and a fair number of stars need to align for the +optimization to kick in. In our experience, more often than not in real-world +code today’s compilers are not able to inline the coroutine, resulting in +allocations and indirections in the generated code.

    +

    In a suite of generic async algorithms that are expected to be callable from hot +code paths, the extra allocations and indirections are a deal-breaker. It is for +these reasons that we consider coroutines a poor choice for a basis of all +standard async.

    +

    1.9.3. Callbacks

    +

    Callbacks are the oldest, simplest, most powerful, and most efficient mechanism +for creating chains of work, but suffer problems of their own. Callbacks must +propagate either errors or values. This simple requirement yields many different +interface possibilities. The lack of a standard callback shape obstructs generic +design.

    +

    Additionally, few of these possibilities accommodate cancellation signals when +the user requests upstream work to stop and clean up.
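One fixed callback shape that covers values, errors, and cancellation can be sketched as follows. This is an illustration under assumptions, not the specification: the member names `set_value`, `set_error`, and `set_stopped` mirror the three completion channels used by receivers, while `recording_receiver`, `parse_positive`, and `outcome` are hypothetical names invented here.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical receiver-like callback: one fixed shape with three channels.
struct recording_receiver {
    std::string* log;  // where the outcome is recorded

    void set_value(int v) { *log = "value:" + std::to_string(v); }
    void set_error(std::exception_ptr) { *log = "error"; }
    void set_stopped() { *log = "stopped"; }
};

// Hypothetical producer: reports its outcome through exactly one channel.
template <class Receiver>
void parse_positive(const std::string& text, bool stop_requested, Receiver r) {
    if (stop_requested) {
        r.set_stopped();  // cancellation has its own channel
        return;
    }
    int n = 0;
    try {
        n = std::stoi(text);  // throws std::invalid_argument on non-numeric input
        if (n <= 0) throw std::out_of_range("not positive");
    } catch (...) {
        r.set_error(std::current_exception());
        return;
    }
    r.set_value(n);
}

// Runs the producer and returns which channel fired.
std::string outcome(const std::string& text, bool stop_requested = false) {
    std::string log;
    parse_positive(text, stop_requested, recording_receiver{&log});
    assert(!log.empty());  // exactly one channel has been signalled
    return log;
}
```

Because every producer targets the same three member functions, generic code can be written against that shape without knowing which producer or consumer it is connecting.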

    +

    1.10. Field experience

    +

    1.10.1. libunifex

    +

    This proposal draws heavily from our field experience with libunifex. Libunifex +implements all of the concepts and customization points defined in this paper +(with slight variations -- the design of P2300 has evolved due to LEWG +feedback), many of this paper’s algorithms (some under different names), and +much more besides.

    +

    Libunifex has several concrete schedulers in addition to the run_loop suggested here (where it is called manual_event_loop). It has schedulers that +dispatch efficiently to epoll and io_uring on Linux and the Windows Thread Pool +on Windows.

    +

    In addition to the proposed interfaces and the additional schedulers, it has +several important extensions to the facilities described in this paper, which +demonstrate directions in which these abstractions may be evolved over time, +including:

    • -

      Timed schedulers, which permit scheduling work on an execution resource at a particular time or after a particular duration has elapsed. In addition, it provides time-based algorithms.

      +

      Timed schedulers, which permit scheduling work on an execution resource at a +particular time or after a particular duration has elapsed. In addition, it +provides time-based algorithms.

    • File I/O schedulers, which permit filesystem I/O to be scheduled.

    • -

      Two complementary abstractions for streams (asynchronous ranges), and a set of stream-based algorithms.

      -
    -

    Libunifex has seen heavy production use at Facebook. As of October 2021, it is currently used in production within the following applications and platforms:

    -
      -
    • -

      Facebook Messenger on iOS, Android, Windows, and macOS

      -
    • -

      Instagram on iOS and Android

      -
    • -

      Facebook on iOS and Android

      -
    • -

      Portal

      -
    • -

      An internal Facebook product that runs on Linux

      +

      Two complementary abstractions for streams (asynchronous ranges), and a set of +stream-based algorithms.

    -

    All of these applications are making direct use of the sender/receiver abstraction as presented in this paper. One product (Instagram on iOS) is making use of the sender/coroutine integration as presented. The monthly active users of these products number in the billions.

    -

    1.11.2. Other implementations

    -

    The authors are aware of a number of other implementations of sender/receiver from this paper. These are presented here in perceived order of maturity and field experience.

    +

    Libunifex has seen heavy production use at Meta. An employee summarizes it +as follows:

    +
    +

    As of June, 2023, Unifex is still used in production at Meta. It’s used to +express the asynchrony in rsys, and is +therefore serving video calling to billions of people every month on Meta’s +social networking apps on iOS, Android, Windows, and macOS. It’s also serving +the Virtual Desktop experience on Oculus Quest devices, and some internal uses +that run on Linux.

    +

    One team at Meta has migrated from folly::Future to unifex::task and seen +significant developer efficiency improvements. Coroutines are easier to +understand than chained futures so the team was able to meet requirements for +certain constrained environments that would have been too complicated to +maintain with futures.

    +

    In all the cases mentioned above, developers mix-and-match between the sender +algorithms in Unifex and Unifex’s coroutine type, unifex::task. We also rely +on unifex::task’s scheduler affinity to minimize surprise when programming +with coroutines.

    +
    +

    1.10.2. stdexec

    +

    stdexec is the reference implementation of +this proposal. It is a complete implementation, written from the specification +in this paper, and is current with R8.

    +

    The original purpose of stdexec was to help find specification bugs and to +harden the wording of the proposal, but it has since become one of NVIDIA’s core +C++ libraries for high-performance computing. In addition to the facilities +proposed in this paper, stdexec has schedulers for CUDA, Intel TBB, and macOS. +Like libunifex, its scope has also expanded to include a streaming abstraction +and stream algorithms, and time-based schedulers and algorithms.

    +

    The stdexec project has seen lots of community interest and contributions. At the +time of writing (March, 2024), the GitHub repository has 1.2k stars, 130 forks, +and 50 contributors.

    +

    stdexec is fit for broad use and for ultimate contribution to libc++.

    +

    1.10.3. Other implementations

    +

    The authors are aware of a number of other implementations of sender/receiver +from this paper. These are presented here in perceived order of maturity and +field experience.

    • HPX - The C++ Standard Library for Parallelism and Concurrency

      -

      HPX is a general purpose C++ runtime system for parallel and distributed applications that has been under active development since 2007. HPX exposes a uniform, standards-oriented API, and keeps abreast of the latest standards and proposals. It is used in a wide variety of high-performance applications.

      -

      The sender/receiver implementation in HPX has been under active development since May 2020. It is used to erase the overhead of futures and to make it possible to write efficient generic asynchronous algorithms that are agnostic to their execution resource. In HPX, algorithms can migrate execution between execution resources, even to GPUs and back, using a uniform standard interface with sender/receiver.

      -

      Far and away, the HPX team has the greatest usage experience outside Facebook. Mikael Simberg summarizes the experience as follows:

      +

      HPX is a general purpose C++ runtime system for parallel and distributed +applications that has been under active development since 2007. HPX exposes +a uniform, standards-oriented API, and keeps abreast of the latest standards +and proposals. It is used in a wide variety of high-performance +applications.

      +

      The sender/receiver implementation in HPX has been under active development +since May 2020. It is used to erase the overhead of futures and to make it +possible to write efficient generic asynchronous algorithms that are +agnostic to their execution resource. In HPX, algorithms can migrate +execution between execution resources, even to GPUs and back, using a +uniform standard interface with sender/receiver.

      +

      Far and away, the HPX team has the greatest usage experience outside +Facebook. Mikael Simberg summarizes the experience as follows:

      -

      Summarizing, for us the major benefits of sender/receiver compared to the old model are:

      +

      Summarizing, for us the major benefits of sender/receiver compared to the +old model are:

      1. Proper hooks for transitioning between execution resources.

      2. The adaptors. Things like let_value are really nice additions.

      3. -

        Separation of the error channel from the value channel (also cancellation, but we don’t have much use for it at the moment). Even from a teaching perspective having to explain that the future f2 in the continuation will always be ready here f1.then([](future<T> f2) {...}) is enough of a reason to separate the channels. All the other obvious reasons apply as well of course.

        +

        Separation of the error channel from the value channel (also +cancellation, but we don’t have much use for it at the moment). Even +from a teaching perspective having to explain that the future f2 in +the continuation will always be ready here f1.then([](future<T> f2) {...}) is enough of a reason to separate the channels. All the other +obvious reasons apply as well of course.

      4. -

        For futures we have a thing called hpx::dataflow which is an optimized version of when_all(...).then(...) which avoids intermediate allocations. With the sender/receiver when_all(...) | then(...) we get that "for free".

        +

        For futures we have a thing called hpx::dataflow which is an +optimized version of when_all(...).then(...) which avoids +intermediate allocations. With the sender/receiver when_all(...) | then(...) we get that "for free".

    • kuhllib by Dietmar Kuehl

      -

      This is a prototype Standard Template Library with an implementation of sender/receiver that has been under development since May, 2021. It is significant mostly for its support for sender/receiver-based networking interfaces.

      -

      Here, Dietmar Kuehl speaks about the perceived complexity of sender/receiver:

      +

      This is a prototype Standard Template Library with an implementation of +sender/receiver that has been under development since May, 2021. It is +significant mostly for its support for sender/receiver-based networking +interfaces.

      +

      Here, Dietmar Kuehl speaks about the perceived complexity of +sender/receiver:

      -

      ... and, also similar to STL: as I had tried to do things in that space before I recognize sender/receivers as being maybe complicated in one way but a huge simplification in another one: like with STL I think those who use it will benefit - if not from the algorithm from the clarity of abstraction: the separation of concerns of STL (the algorithm being detached from the details of the sequence representation) is a major leap. Here it is rather similar: the separation of the asynchronous algorithm from the details of execution. Sure, there is some glue to tie things back together but each of them is simpler than the combined result.

      +

      ... and, also similar to STL: as I had tried to do things in that space +before I recognize sender/receivers as being maybe complicated in one way +but a huge simplification in another one: like with STL I think those who +use it will benefit - if not from the algorithm from the clarity of +abstraction: the separation of concerns of STL (the algorithm being +detached from the details of the sequence representation) is a major leap. +Here it is rather similar: the separation of the asynchronous algorithm +from the details of execution. Sure, there is some glue to tie things back +together but each of them is simpler than the combined result.

      Elsewhere, he said:

      -

      ... to me it feels like sender/receivers are like iterators when STL emerged: they are different from what everybody did in that space. However, everything people are already doing in that space isn’t right.

      +

      ... to me it feels like sender/receivers are like iterators when STL +emerged: they are different from what everybody did in that space. +However, everything people are already doing in that space isn’t right.

      -

      Kuehl also has experience teaching sender/receiver at Bloomberg. About that experience he says:

      +

      Kuehl also has experience teaching sender/receiver at Bloomberg. About that +experience he says:

      -

      When I asked [my students] specifically about how complex they consider the sender/receiver stuff the feedback was quite unanimous that the sender/receiver parts aren’t trivial but not what contributes to the complexity.

      +

      When I asked [my students] specifically about how complex they consider +the sender/receiver stuff the feedback was quite unanimous that the +sender/receiver parts aren’t trivial but not what contributes to the +complexity.

    • -

      The reference implementation

      -

      This is a complete implementation written from the specification in this paper. Its primary purpose is to help find specification bugs and to harden the wording of the proposal. It is -fit for broad use and for contribution to libc++.

      -

      It is current with R8 of this paper.

      -
    • -

      Reference implementation for the Microsoft STL by Michael Schellenberger Costa

      -

      This is another reference implementation of this proposal, this time in a fork of the Microsoft STL implementation. Michael Schellenberger Costa is not affiliated with Microsoft. He intends to contribute this implementation upstream when it is complete.

      +

      C++ Bare Metal Senders and Receivers from Intel

      +

      This is a prototype implementation of sender/receiver by Intel that has been +under development since August, 2023. It is significant mostly for its +support for bare metal (no operating system) and embedded systems, a domain +for which senders are particularly well-suited due to their very low dynamic +memory requirements.

    -

    1.11.3. Inspirations

    -

    This proposal also draws heavily from our experience with Thrust and Agency. It is also inspired by the needs of countless other C++ frameworks for asynchrony, parallelism, and concurrency, including:

    +

    1.10.4. Inspirations

    +

    This proposal also draws heavily from our experience with Thrust and Agency. It is also inspired by the +needs of countless other C++ frameworks for asynchrony, parallelism, and +concurrency, including:

    • HPX

      @@ -3742,7 +4007,26 @@

      stlab

    2. Revision history

    -

    2.1. R8

    +

    2.1. R9

    +

    The changes since R8 are as follows:

    +

    Fixes:

    +
      +
    • +

      The tag_invoke mechanism has been replaced with member functions + for customizations as per P2855.

      +
    • +

      Per guidance from LWG and LEWG, receiver_adaptor has been removed.

      +
    • +

      The receiver concept is tweaked to require that receiver types are not final. Without receiver_adaptor and tag_invoke, receiver adaptors + are easily written using implementation inheritance.

      +
    +
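The last fix can be illustrated with a small sketch of a receiver adaptor written by implementation inheritance. It is a simplified model under assumptions, not the specified interface: `int_sink`, `transform_receiver`, and `adapted_value` are hypothetical names, and the completion functions are plain members taking an `int`.

```cpp
#include <cassert>
#include <exception>
#include <utility>

// Hypothetical downstream receiver. It is not `final`, so it can serve as a base class.
struct int_sink {
    int* out;
    bool* stopped;

    void set_value(int v) { *out = v; }
    void set_error(std::exception_ptr) { *out = -1; }
    void set_stopped() { *stopped = true; }
};

// Hypothetical adaptor: inherits set_error and set_stopped unchanged and
// replaces only set_value, applying `fn` before forwarding to the base.
template <class Base, class Fn>
struct transform_receiver : Base {
    Fn fn;

    transform_receiver(Base base, Fn f) : Base(std::move(base)), fn(std::move(f)) {}

    void set_value(int v) { Base::set_value(fn(v)); }
};

// Sends `input` through an adaptor that doubles the value before it reaches the sink.
int adapted_value(int input) {
    int out = 0;
    bool stopped = false;
    auto twice = [](int x) { return 2 * x; };
    transform_receiver<int_sink, decltype(twice)> r{int_sink{&out, &stopped}, twice};
    r.set_value(input);
    assert(!stopped);  // only the value channel was used
    return out;
}
```

The adaptor writes only the one member it changes; everything else is inherited, which is why the adapted receiver type must not be `final`.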

    Enhancements:

    +
      +
    • +

      The specification of the sync_wait algorithm has been updated + for clarity.

      +
    +

    2.2. R8

    The changes since R7 are as follows:

    Fixes:

      @@ -3774,7 +4058,7 @@

      2.1. enable_sender and enable_receiver traits now have default implementations that look for nested sender_concept and receiver_concept types, respectively.

    -

    2.2. R7

    +

    2.3. R7

    The changes since R6 are as follows:

    Fixes:

    -

    2.3. R6

    +

    2.4. R6

    The changes since R5 are as follows:

    Fixes:

      @@ -3845,15 +4129,16 @@

      2.3. ensure_started and split are changed to persist the result of calling get_attrs() on the input sender.

    • -

      Reorder constraints of the scheduler and receiver concepts to avoid constraint recursion -when used in tandem with poorly-constrained, implicitly convertible types.

      +

      Reorder constraints of the scheduler and receiver concepts to avoid +constraint recursion when used in tandem with poorly-constrained, implicitly +convertible types.

    • Re-express the sender_of concept to be more ergonomic and general.

    • Make the specification of the alias templates value_types_of_t and error_types_of_t, and the variable template sends_done more concise by expressing them in terms of a new exposition-only alias template gather-signatures.

    -

    2.3.1. Environments and attributes

    +

    2.4.1. Environments and attributes

    In earlier revisions, receivers, senders, and schedulers all were directly queryable. In R4, receiver queries were moved into a separate "environment" object, obtainable from a receiver with a get_env accessor. In R6, the @@ -3869,7 +4154,7 @@

    forwardable operation state queries. The authors chose to make opstates directly queryable since the opstate object is itself required to be kept alive for the duration of asynchronous operation.

    -

    2.4. R5

    +

    2.5. R5

    The changes since R4 are as follows:

    Fixes:

      @@ -3896,7 +4181,7 @@

      2.4. just, just_error, just_stopped, and into_variant have been respecified as customization point objects instead of functions, following LEWG guidance.

    -

    2.5. R4

    +

    2.6. R4

    The changes since R3 are as follows:

    Fixes:

      @@ -3959,7 +4244,7 @@

      2.5.

      tag_invoke respecified to improve diagnostics.

    -

    2.5.1. Dependently-typed senders

    +

    2.6.1. Dependently-typed senders

    Background:

    In the sender/receiver model, as with coroutines, contextual information about the current execution is most naturally propagated from the consumer to the @@ -4018,11 +4303,12 @@

    A further refinement of this design would be to separate the receiver and the environment entirely, passing them as separate arguments along with the sender to connect. This paper does not propose that change.

    Impact:

    -

    This change, apart from increasing the expressive power of the sender/receiver abstraction, has the following impact:

    +

    This change, apart from increasing the expressive power of the sender/receiver +abstraction, has the following impact:

    • -

      Typed senders become moderately more challenging to write. (The new completion_signatures and transform_completion_signatures utilities are added -to ease this extra burden.)

      +

      Typed senders become moderately more challenging to write. (The new completion_signatures and transform_completion_signatures utilities are +added to ease this extra burden.)

    • Sender adaptor algorithms that previously constrained their sender arguments to satisfy the typed_sender concept can no longer do so as the receiver is @@ -4033,12 +4319,18 @@

    "Has it been implemented?"

    -

    Yes, the reference implementation, which can be found at -https://github.com/NVIDIA/stdexec, has implemented this -design as well as some dependently-typed senders to confirm that it works.

    +

    Yes, the reference implementation, which can be found at https://github.com/NVIDIA/stdexec, has +implemented this design as well as some dependently-typed senders to confirm +that it works.

    Implementation experience

    -

    Although this change has not yet been made in libunifex, the most widely adopted sender/receiver implementation, a similar design can be found in Folly’s coroutine support library. In Folly.Coro, it is possible to await a special awaitable to obtain the current coroutine’s associated scheduler (called an executor in Folly).

    -

    For instance, the following Folly code grabs the current executor, schedules a task for execution on that executor, and starts the resulting (scheduled) task by enqueueing it for execution.

    +

    Although this change has not yet been made in libunifex, the most widely adopted +sender/receiver implementation, a similar design can be found in Folly’s +coroutine support library. In Folly.Coro, it is possible to await a special +awaitable to obtain the current coroutine’s associated scheduler (called an +executor in Folly).

    +

    For instance, the following Folly code grabs the current executor, schedules a +task for execution on that executor, and starts the resulting (scheduled) task +by enqueueing it for execution.

    // From Facebook's Folly open source library:
     template <class T>
     folly::coro::Task<void> CancellableAsyncScope::co_schedule(folly::coro::Task<T>&& task) {
    @@ -4074,21 +4366,20 @@ 

  • Receivers have an associated environment. The new get_env CPO retrieves a -receiver’s environment. If a receiver doesn’t implement get_env, it returns -an unspecified "empty" environment -- an empty struct.

    +receiver’s environment. If a receiver doesn’t implement get_env, it +returns an unspecified "empty" environment -- an empty struct.

  • sender_traits now takes an optional Env parameter that is used to determine the error/value types.

  • -

    The primary sender_traits template is replaced with a completion_signatures_of_t alias implemented in terms of a new get_completion_signatures CPO that dispatches -with tag_invoke. get_completion_signatures takes a sender and an optional -environment. A sender can customize this to specify its value/error types.

    +

    The primary sender_traits template is replaced with a completion_signatures_of_t alias implemented in terms of a new get_completion_signatures CPO that dispatches with tag_invoke. get_completion_signatures takes a sender and an optional environment. A +sender can customize this to specify its value/error types.

  • Support for untyped senders is dropped. The typed_sender concept has been renamed to sender and now takes an optional environment.

  • -

    The environment argument to the sender concept and the get_completion_signatures CPO defaults to no_env. All environment queries fail (are ill-formed) when -passed an instance of no_env.

    +

    The environment argument to the sender concept and the get_completion_signatures CPO defaults to no_env. All environment +queries fail (are ill-formed) when passed an instance of no_env.

  • A type S is required to satisfy sender<S> to be considered a sender. If it doesn’t know what types it will complete with @@ -4102,13 +4393,13 @@

    All of the algorithms and examples have been updated to work with dependently-typed senders.

    -

    2.6. R3

    +

    2.7. R3

    The changes since R2 are as follows:

    Fixes:

    • -

      Fix specification of the on algorithm to clarify lifetimes of -intermediate operation states and properly scope the get_scheduler query.

      +

      Fix specification of the on algorithm to clarify lifetimes of intermediate +operation states and properly scope the get_scheduler query.

    • Fix a memory safety bug in the implementation of connect-awaitable.

    • @@ -4121,13 +4412,15 @@

      2.6.

      Add receiver_adaptor utility to simplify writing receivers.

    • -

      Require a scheduler’s sender to model sender_of and provide a completion scheduler.

      +

      Require a scheduler’s sender to model sender_of and provide a completion +scheduler.

    • Specify the cancellation scope of the when_all algorithm.

    • Make as_awaitable a customization point.

    • -

      Change connect’s handling of awaitables to consider those types that are awaitable owing to customization of as_awaitable.

      +

      Change connect’s handling of awaitables to consider those types that are +awaitable owing to customization of as_awaitable.

    • Add value_types_of_t and error_types_of_t alias templates; rename stop_token_type_t to stop_token_of_t.

    • @@ -4135,7 +4428,7 @@

      2.6.

      Expand the section on field experience.

    -

    2.7. R2

    +

    2.8. R2

    The changes since R1 are as follows:

    • @@ -4143,21 +4436,25 @@

      2.7.

      Extend the execution::connect customization point and the sender_traits<> template to recognize awaitables as typed_senders.

    • -

      Add utilities as_awaitable() and with_awaitable_senders<> so a coroutine type can trivially make senders awaitable with a coroutine.

      +

      Add utilities as_awaitable() and with_awaitable_senders<> so a coroutine +type can trivially make senders awaitable with a coroutine.

    • Add a section describing the design of the sender/awaitable interactions.

    • -

      Add a section describing the design of the cancellation support in sender/receiver.

      +

      Add a section describing the design of the cancellation support in +sender/receiver.

    • Add a section showing examples of simple sender adaptor algorithms.

    • Add a section showing examples of simple schedulers.

    • -

      Add a few more examples: a sudoku solver, a parallel recursive file copy, and an echo server.

      +

      Add a few more examples: a sudoku solver, a parallel recursive file copy, and +an echo server.

    • Refined the forward progress guarantees on the bulk algorithm.

    • -

      Add a section describing how to use a range of senders to represent async sequences.

      +

      Add a section describing how to use a range of senders to represent async +sequences.

    • Add a section showing how to use senders to represent partial success.

    • @@ -4169,7 +4466,7 @@

      2.7.

      Various fixes of typos and bugs.

    -

    2.8. R1

    +

    2.9. R1

    The changes since R0 are as follows:

    • @@ -4183,15 +4480,15 @@

      2.8.

      Various fixes of typos and bugs.

    -

    2.9. R0

    +

    2.10. R0

    Initial revision.

    3. Design - introduction

    The following three sections describe the entirety of the proposed design.

    • § 3 Design - introduction describes the conventions used through the rest of the -design sections, as well as an example illustrating how we envision code will -be written using this proposal.

      +design sections, as well as an example illustrating how we envision code +will be written using this proposal.

    • § 4 Design - user side describes all the functionality from the perspective we intend for users: it describes the various concepts they will interact with, @@ -4199,23 +4496,24 @@

      § 5 Design - implementer side describes the machinery that allows for that programming model to function, and the information contained there is -necessary for people implementing senders and sender algorithms (including the -standard library ones) - but is not necessary to use senders productively.

      +necessary for people implementing senders and sender algorithms (including +the standard library ones) - but is not necessary to use senders +productively.

    3.1. Conventions

    The following conventions are used throughout the design section:

    1. The namespace proposed in this paper is the same as in A Unified Executors Proposal for C++: std::execution; however, for brevity, the std:: part of this name is - omitted. When you see execution::foo, treat that as std::execution::foo.

      +omitted. When you see execution::foo, treat that as std::execution::foo.

    2. Universal references and explicit calls to std::move/std::forward are - omitted in code samples and signatures for simplicity; assume universal - references and perfect forwarding unless stated otherwise.

      +omitted in code samples and signatures for simplicity; assume universal +references and perfect forwarding unless stated otherwise.

    3. None of the names proposed here are names that we are particularly attached - to; consider the names to be reasonable placeholders that can freely be - changed, should the committee want to do so.

      +to; consider the names to be reasonable placeholders that can freely be +changed, should the committee want to do so.

    3.2. Queries and algorithms

    A query is a callable that takes some set of objects (usually one) as @@ -4245,9 +4543,9 @@

    // snd is a sender (see below) describing the creation of a new execution resource // on the execution resource associated with sch

  • -

    Note that a particular scheduler type may provide other kinds of scheduling operations -which are supported by its associated execution resource. It is not limited to scheduling -purely using the execution::schedule API.

    +

    Note that a particular scheduler type may provide other kinds of scheduling +operations which are supported by its associated execution resource. It is not +limited to scheduling purely using the execution::schedule API.

    Future papers will propose additional scheduler concepts that extend scheduler to add other capabilities. For example:

    • @@ -4286,11 +4584,13 @@

      4.4. Senders are composable through sender algorithms

      -

      Asynchronous programming often departs from traditional code structure and control flow that we are familiar with. -A successful asynchronous framework must provide an intuitive story for composition of asynchronous work: expressing dependencies, passing objects, managing object lifetimes, etc.

      -

      The true power and utility of senders is in their composability. -With senders, users can describe generic execution pipelines and graphs, and then run them on and across a variety of different schedulers. -Senders are composed using sender algorithms:

      +

      Asynchronous programming often departs from traditional code structure and +control flow that we are familiar with. A successful asynchronous framework must +provide an intuitive story for composition of asynchronous work: expressing +dependencies, passing objects, managing object lifetimes, etc.

      +

      The true power and utility of senders is in their composability. With senders, +users can describe generic execution pipelines and graphs, and then run them on +and across a variety of different schedulers. Senders are composed using sender algorithms:

      • sender factories, algorithms that take no senders and return a sender.

        @@ -4300,29 +4600,55 @@

        sender consumers, algorithms that take (and potentially execution::connect) senders and do not return a sender.

      4.5. Senders can propagate completion schedulers

      -

      One of the goals of executors is to support a diverse set of execution resources, including traditional thread pools, task and fiber frameworks (like HPX and Legion), and GPUs and other accelerators (managed by runtimes such as CUDA or SYCL). -On many of these systems, not all execution agents are created equal and not all functions can be run on all execution agents. -Having precise control over the execution resource used for any given function call being submitted is important on such systems, and the users of standard execution facilities will expect to be able to express such requirements.

      -

      A Unified Executors Proposal for C++ was not always clear about the place of execution of any given piece of code. -Precise control was present in the two-way execution API present in earlier executor designs, but it has so far been missing from the senders design. There has been a proposal (Towards C++23 executors: A proposal for an initial set of algorithms) to provide a number of sender algorithms that would enforce certain rules on the places of execution -of the work described by a sender, but we have found those sender algorithms to be insufficient for achieving the best performance on all platforms that are of interest to us. The implementation strategies that we are aware of result in one of the following situations:

      +

      One of the goals of executors is to support a diverse set of execution +resources, including traditional thread pools, task and fiber frameworks (like HPX and Legion), and GPUs and other +accelerators (managed by runtimes such as CUDA or SYCL). On many of these +systems, not all execution agents are created equal and not all functions can be +run on all execution agents. Having precise control over the execution resource +used for any given function call being submitted is important on such systems, +and the users of standard execution facilities will expect to be able to express +such requirements.

      +

      A Unified Executors Proposal for C++ was not always clear about the place of execution of any +given piece of code. Precise control was present in the two-way execution API +present in earlier executor designs, but it has so far been missing from the +senders design. There has been a proposal (Towards C++23 executors: A proposal for an initial set of algorithms) to provide a number of +sender algorithms that would enforce certain rules on the places of execution of +the work described by a sender, but we have found those sender algorithms to be +insufficient for achieving the best performance on all platforms that are of +interest to us. The implementation strategies that we are aware of result in one +of the following situations:

      1. -

        trying to submit work to one execution resource (such as a CPU thread pool) from another execution resource (such as a GPU or a task framework), which assumes that all execution agents are as capable as a std::thread (which they aren’t).

        +

        trying to submit work to one execution resource (such as a CPU thread pool) + from another execution resource (such as a GPU or a task framework), which + assumes that all execution agents are as capable as a std::thread (which + they aren’t).

      2. -

        forcibly interleaving two adjacent execution graph nodes that are both executing on one execution resource (such as a GPU) with glue code that runs on another execution resource (such as a CPU), which is prohibitively expensive for some execution resources (such as CUDA or SYCL).

        +

        forcibly interleaving two adjacent execution graph nodes that are both + executing on one execution resource (such as a GPU) with glue code that + runs on another execution resource (such as a CPU), which is prohibitively + expensive for some execution resources (such as CUDA or SYCL).

      3. -

        having to customise most or all sender algorithms to support an execution resource, so that you can avoid problems described in 1. and 2, which we believe is impractical and brittle based on months of field experience attempting this in Agency.

        +

        having to customise most or all sender algorithms to support an execution + resource, so that you can avoid problems described in 1. and 2., which we + believe is impractical and brittle based on months of field experience + attempting this in Agency.

      -

      None of these implementation strategies are acceptable for many classes of parallel runtimes, such as task frameworks (like HPX) or accelerator runtimes (like CUDA or SYCL).

      -

      Therefore, in addition to the on sender algorithm from Towards C++23 executors: A proposal for an initial set of algorithms, we are proposing a way for senders to advertise what scheduler (and by extension what execution resource) they will complete on. -Any given sender may have completion schedulers for some or all of the signals (value, error, or stopped) it completes with (for more detail on the completion-signals, see § 5.1 Receivers serve as glue between senders). -When further work is attached to that sender by invoking sender algorithms, that work will also complete on an appropriate completion scheduler.

      +

      None of these implementation strategies are acceptable for many classes of +parallel runtimes, such as task frameworks (like HPX) or accelerator runtimes (like CUDA +or SYCL).

      +

      Therefore, in addition to the on sender algorithm from Towards C++23 executors: A proposal for an initial set of algorithms, we are +proposing a way for senders to advertise what scheduler (and by extension what +execution resource) they will complete on. Any given sender may have completion schedulers for some or all of the signals (value, error, or +stopped) it completes with (for more detail on the completion-signals, see § 5.1 Receivers serve as glue between senders). When further work is attached to that sender by invoking +sender algorithms, that work will also complete on an appropriate completion +scheduler.

      4.5.1. execution::get_completion_scheduler

      -

      get_completion_scheduler is a query that retrieves the completion scheduler for a specific completion-signal from a sender’s environment. -For a sender that lacks a completion scheduler query for a given signal, calling get_completion_scheduler is ill-formed. -If a sender advertises a completion scheduler for a signal in this way, that sender must ensure that it sends that signal on an execution agent belonging to an execution resource represented by a scheduler returned from this function. -See § 4.5 Senders can propagate completion schedulers for more details.

      +

      get_completion_scheduler is a query that retrieves the completion scheduler +for a specific completion-signal from a sender’s environment. For a sender that +lacks a completion scheduler query for a given signal, calling get_completion_scheduler is ill-formed. If a sender advertises a completion +scheduler for a signal in this way, that sender must ensure that it sends that signal on an execution agent belonging to an execution +resource represented by a scheduler returned from this function. See § 4.5 Senders can propagate completion schedulers for more details.

      execution::scheduler auto cpu_sched = new_thread_scheduler{};
       execution::scheduler auto gpu_sched = cuda::scheduler();
       
      @@ -4347,10 +4673,18 @@ 

      4.6. Execution resource transitions are explicit

      -

      A Unified Executors Proposal for C++ does not contain any mechanisms for performing an execution resource transition. The only sender algorithm that can create a sender that will move execution to a specific execution resource is execution::schedule, which does not take an input sender. -That means that there’s no way to construct sender chains that traverse different execution resources. This is necessary to fulfill the promise of senders being able to replace two-way executors, which had this capability.

      -

      We propose that, for senders advertising their completion scheduler, all execution resource transitions must be explicit; running user code anywhere but where they defined it to run must be considered a bug.

      -

      The execution::transfer sender adaptor performs a transition from one execution resource to another:

      +

      A Unified Executors Proposal for C++ does not contain any mechanisms for performing an execution +resource transition. The only sender algorithm that can create a sender that +will move execution to a specific execution resource is execution::schedule, +which does not take an input sender. That means that there’s no way to construct +sender chains that traverse different execution resources. This is necessary to +fulfill the promise of senders being able to replace two-way executors, which +had this capability.

      +

      We propose that, for senders advertising their completion scheduler, all +execution resource transitions must be explicit; running user code +anywhere but where they defined it to run must be considered a bug.

      +

      The execution::transfer sender adaptor performs a transition from one +execution resource to another:

      execution::scheduler auto sch1 = ...;
       execution::scheduler auto sch2 = ...;
       
      @@ -4384,21 +4718,34 @@ 

      execution::connect by passing an lvalue reference to the sender to call these overloads. Multi-shot senders should also define overloads of execution::connect that accept rvalue-qualified senders to allow the sender to be also used in places where only a single-shot sender is required.

      -

      If the user of a sender does not require the sender to remain valid after connecting it to a -receiver then it can pass an rvalue-reference to the sender to the call to execution::connect. -Such usages should be able to accept either single-shot or multi-shot senders.

      -

      If the caller does wish for the sender to remain valid after the call then it can pass an lvalue-qualified sender -to the call to execution::connect. Such usages will only accept multi-shot senders.

      -

      Algorithms that accept senders will typically either decay-copy an input sender and store it somewhere -for later usage (for example as a data-member of the returned sender) or will immediately call execution::connect on the input sender, such as in this_thread::sync_wait or execution::start_detached.

      -

      Some multi-use sender algorithms may require that an input sender be copy-constructible but will only call execution::connect on an rvalue of each copy, which still results in effectively executing the operation multiple times. -Other multi-use sender algorithms may require that the sender is move-constructible but will invoke execution::connect on an lvalue reference to the sender.

      -

      For a sender to be usable in both multi-use scenarios, it will generally be required to be both copy-constructible and lvalue-connectable.

      +

      If the user of a sender does not require the sender to remain valid after +connecting it to a receiver then it can pass an rvalue-reference to the sender +to the call to execution::connect. Such usages should be able to accept either +single-shot or multi-shot senders.

      +

      If the caller does wish for the sender to remain valid after the call then it +can pass an lvalue-qualified sender to the call to execution::connect. Such +usages will only accept multi-shot senders.

      +

      Algorithms that accept senders will typically either decay-copy an input sender +and store it somewhere for later usage (for example as a data-member of the +returned sender) or will immediately call execution::connect on the input +sender, such as in this_thread::sync_wait or execution::start_detached.

      +

      Some multi-use sender algorithms may require that an input sender be +copy-constructible but will only call execution::connect on an rvalue of each +copy, which still results in effectively executing the operation multiple times. +Other multi-use sender algorithms may require that the sender is +move-constructible but will invoke execution::connect on an lvalue reference +to the sender.

      +

      For a sender to be usable in both multi-use scenarios, it will generally be +required to be both copy-constructible and lvalue-connectable.

      4.8. Senders are forkable

      -

      Any non-trivial program will eventually want to fork a chain of senders into independent streams of work, regardless of whether they are single-shot or multi-shot. -For instance, an incoming event to a middleware system may be required to trigger events on more than one downstream system. -This requires that we provide well defined mechanisms for making sure that connecting a sender multiple times is possible and correct.

      -

      The split sender adaptor facilitates connecting to a sender multiple times, regardless of whether it is single-shot or multi-shot:

      +

      Any non-trivial program will eventually want to fork a chain of senders into +independent streams of work, regardless of whether they are single-shot or +multi-shot. For instance, an incoming event to a middleware system may be +required to trigger events on more than one downstream system. This requires +that we provide well defined mechanisms for making sure that connecting a sender +multiple times is possible and correct.

      +

      The split sender adaptor facilitates connecting to a sender multiple times, +regardless of whether it is single-shot or multi-shot:

      auto some_algorithm(execution::sender auto&& input) {
           execution::sender auto multi_shot = split(input);
           // "multi_shot" is guaranteed to be multi-shot,
      @@ -4411,40 +4758,46 @@ 

      4.9. Senders support cancellation

      -

      Senders are often used in scenarios where the application may be concurrently executing -multiple strategies for achieving some program goal. When one of these strategies succeeds -(or fails) it may not make sense to continue pursuing the other strategies as their results -are no longer useful.

      -

      For example, we may want to try to simultaneously connect to multiple network servers and use -whichever server responds first. Once the first server responds we no longer need to continue -trying to connect to the other servers.

      -

      Ideally, in these scenarios, we would somehow be able to request that those other strategies -stop executing promptly so that their resources (e.g. cpu, memory, I/O bandwidth) can be -released and used for other work.

      -

      While the design of senders has support for cancelling an operation before it starts -by simply destroying the sender or the operation-state returned from execution::connect() before calling execution::start(), there also needs to be a standard, generic mechanism -to ask for an already-started operation to complete early.

      -

      The ability to be able to cancel in-flight operations is fundamental to supporting some kinds -of generic concurrency algorithms.

      +

      Senders are often used in scenarios where the application may be concurrently +executing multiple strategies for achieving some program goal. When one of these +strategies succeeds (or fails) it may not make sense to continue pursuing the +other strategies as their results are no longer useful.

      +

      For example, we may want to try to simultaneously connect to multiple network +servers and use whichever server responds first. Once the first server responds +we no longer need to continue trying to connect to the other servers.

      +

      Ideally, in these scenarios, we would somehow be able to request that those +other strategies stop executing promptly so that their resources (e.g. cpu, +memory, I/O bandwidth) can be released and used for other work.

      +

      While the design of senders has support for cancelling an operation before it +starts by simply destroying the sender or the operation-state returned from execution::connect() before calling execution::start(), there also needs to +be a standard, generic mechanism to ask for an already-started operation to +complete early.

      +

      The ability to cancel in-flight operations is fundamental to +supporting some kinds of generic concurrency algorithms.

      For example:

      • -

        a when_all(ops...) algorithm should cancel other operations as soon as one operation fails

        +

        a when_all(ops...) algorithm should cancel other operations as soon as one +operation fails

      • -

        a first_successful(ops...) algorithm should cancel the other operations as soon as one operation completes successfuly

        +

        a first_successful(ops...) algorithm should cancel the other operations as +soon as one operation completes successfully

      • a generic timeout(src, duration) algorithm needs to be able to cancel the src operation after the timeout duration has elapsed.

      • a stop_when(src, trigger) algorithm should cancel src if trigger completes first and cancel trigger if src completes first

      -

      The mechanism used for communcating cancellation-requests, or stop-requests, needs to have a uniform interface -so that generic algorithms that compose sender-based operations, such as the ones listed above, are able to -communicate these cancellation requests to senders that they don’t know anything about.

      -

      The design is intended to be composable so that cancellation of higher-level operations can propagate -those cancellation requests through intermediate layers to lower-level operations that need to actually -respond to the cancellation requests.

      -

      For example, we can compose the algorithms mentioned above so that child operations -are cancelled when any one of the multiple cancellation conditions occurs:

      +

      The mechanism used for communicating cancellation-requests, or stop-requests, +needs to have a uniform interface so that generic algorithms that compose +sender-based operations, such as the ones listed above, are able to communicate +these cancellation requests to senders that they don’t know anything about.

      +

      The design is intended to be composable so that cancellation of higher-level +operations can propagate those cancellation requests through intermediate layers +to lower-level operations that need to actually respond to the cancellation +requests.

      +

      For example, we can compose the algorithms mentioned above so that child +operations are cancelled when any one of the multiple cancellation conditions +occurs:

      sender auto composed_cancellation_example(auto query) {
         return stop_when(
           timeout(
      @@ -4457,117 +4810,156 @@ 

      cancelButton.on_click()); }

      -

      In this example, if we take the operation returned by query_server_b(query), this operation will -receive a stop-request when any of the following happens:

      +

      In this example, if we take the operation returned by query_server_b(query), +this operation will receive a stop-request when any of the following happens:

      • first_successful algorithm will send a stop-request if query_server_a(query) completes successfully

      • -

        when_all algorithm will send a stop-request if the load_file("some_file.jpg") operation completes with an error or stopped result.

        +

        when_all algorithm will send a stop-request if the load_file("some_file.jpg") operation completes with an error or stopped +result.

      • -

        timeout algorithm will send a stop-request if the operation does not complete within 5 seconds.

        +

        timeout algorithm will send a stop-request if the operation does not +complete within 5 seconds.

      • -

        stop_when algorithm will send a stop-request if the user clicks on the "Cancel" button in the user-interface.

        +

        stop_when algorithm will send a stop-request if the user clicks on the +"Cancel" button in the user-interface.

      • -

        The parent operation consuming the composed_cancellation_example() sends a stop-request

        +

        The parent operation consuming the composed_cancellation_example() sends a +stop-request

      -

      Note that within this code there is no explicit mention of cancellation, stop-tokens, callbacks, etc. -yet the example fully supports and responds to the various cancellation sources.

      -

      The intent of the design is that the common usage of cancellation in sender/receiver-based code is -primarily through use of concurrency algorithms that manage the detailed plumbing of cancellation -for you. Much like algorithms that compose senders relieve the user from having to write their own -receiver types, algorithms that introduce concurrency and provide higher-level cancellation semantics -relieve the user from having to deal with low-level details of cancellation.

      +

      Note that within this code there is no explicit mention of cancellation, +stop-tokens, callbacks, etc. yet the example fully supports and responds to the +various cancellation sources.

      +

      The intent of the design is that the common usage of cancellation in +sender/receiver-based code is primarily through use of concurrency algorithms +that manage the detailed plumbing of cancellation for you. Much like algorithms +that compose senders relieve the user from having to write their own receiver +types, algorithms that introduce concurrency and provide higher-level +cancellation semantics relieve the user from having to deal with low-level +details of cancellation.

      4.9.1. Cancellation design summary

      -

      The design of cancellation described in this paper is built on top of and extends the std::stop_token-based -cancellation facilities added in C++20, first proposed in Composable cancellation for sender-based async operations.

      -

      At a high-level, the facilities proposed by this paper for supporting cancellation include:

      +

      The design of cancellation described in this paper is built on top of and +extends the std::stop_token-based cancellation facilities added in C++20, +first proposed in Composable cancellation for sender-based async operations.

      +

      At a high-level, the facilities proposed by this paper for supporting +cancellation include:

      • -

        Add std::stoppable_token and std::stoppable_token_for concepts that generalise the interface of std::stop_token type to allow other types with different implementation strategies.

        +

        Add std::stoppable_token and std::stoppable_token_for concepts that +generalise the interface of std::stop_token type to allow other types with +different implementation strategies.

      • Add std::unstoppable_token concept for detecting whether a stoppable_token can never receive a stop-request.

      • -

        Add std::in_place_stop_token, std::in_place_stop_source and std::in_place_stop_callback<CB> types that provide a more efficient implementation of a stop-token for use in structured concurrency situations.

        +

        Add std::in_place_stop_token, std::in_place_stop_source and std::in_place_stop_callback<CB> types that provide a more efficient +implementation of a stop-token for use in structured concurrency situations.

      • -

        Add std::never_stop_token for use in places where you never want to issue a stop-request

        +

        Add std::never_stop_token for use in places where you never want to issue a +stop-request.

      • -

        Add std::execution::get_stop_token() CPO for querying the stop-token to use for an operation from its receiver’s execution environment.

        +

        Add std::execution::get_stop_token() CPO for querying the stop-token to use +for an operation from its receiver’s execution environment.

      • -

        Add std::execution::stop_token_of_t<T> for querying the type of a stop-token returned from get_stop_token()

        +

        Add std::execution::stop_token_of_t<T> for querying the type of a stop-token +returned from get_stop_token().

      -

      In addition, there are requirements added to some of the algorithms to specify what their cancellation -behaviour is and what the requirements of customisations of those algorithms are with respect to -cancellation.

      -

      The key component that enables generic cancellation within sender-based operations is the execution::get_stop_token() CPO. -This CPO takes a single parameter, which is the execution environment of the receiver passed to execution::connect, and returns a std::stoppable_token that the operation can use to check for stop-requests for that operation.

      +

      In addition, there are requirements added to some of the algorithms to specify +what their cancellation behaviour is and what the requirements of customisations +of those algorithms are with respect to cancellation.

      +

      The key component that enables generic cancellation within sender-based +operations is the execution::get_stop_token() CPO. This CPO takes a single +parameter, which is the execution environment of the receiver passed to execution::connect, and returns a std::stoppable_token that the operation +can use to check for stop-requests for that operation.

      As the caller of execution::connect typically has control over the receiver -type it passes, it is able to customise the std::execution::get_env() CPO for that -receiver to return an execution environment that hooks the execution::get_stop_token() CPO to return a stop-token that the receiver has +type it passes, it is able to customise the std::execution::get_env() CPO for +that receiver to return an execution environment that hooks the execution::get_stop_token() CPO to return a stop-token that the receiver has control over and that it can use to communicate a stop-request to the operation once it has started.

      4.9.2. Support for cancellation is optional

      -

      Support for cancellation is optional, both on part of the author of the receiver and on part of the author of the sender.

      +

      Support for cancellation is optional, both on part of the author of the receiver +and on part of the author of the sender.

      If the receiver’s execution environment does not customise the execution::get_stop_token() CPO then invoking the CPO on that receiver’s environment will invoke the default implementation which returns std::never_stop_token. This is a special stoppable_token type that is statically known to always return false from the stop_possible() method.

      -

      Sender code that tries to use this stop-token will in general result in code that handles stop-requests being -compiled out and having little to no run-time overhead.

      -

      If the sender doesn’t call execution::get_stop_token(), for example because the operation does not support -cancellation, then it will simply not respond to stop-requests from the caller.

      -

      Note that stop-requests are generally racy in nature as there is often a race betwen an operation completing -naturally and the stop-request being made. If the operation has already completed or past the point at which -it can be cancelled when the stop-request is sent then the stop-request may just be ignored. An application -will typically need to be able to cope with senders that might ignore a stop-request anyway.

      +

      Sender code that tries to use this stop-token will in general result in code +that handles stop-requests being compiled out and having little to no run-time +overhead.

      +

      If the sender doesn’t call execution::get_stop_token(), for example because +the operation does not support cancellation, then it will simply not respond to +stop-requests from the caller.

      +

      Note that stop-requests are generally racy in nature as there is often a race +between an operation completing naturally and the stop-request being made. If the +operation has already completed or is past the point at which it can be cancelled +when the stop-request is sent then the stop-request may just be ignored. An +application will typically need to be able to cope with senders that might +ignore a stop-request anyway.

      4.9.3. Cancellation is inherently racy

      -

      Usually, an operation will attach a stop-callback at some point inside the call to execution::start() so that -a subsequent stop-request will interrupt the logic.

      -

      A stop-request can be issued concurrently from another thread. This means the implementation of execution::start() needs to be careful to ensure that, once a stop-callback has been registered, that there are no data-races between -a potentially concurrently-executing stop-callback and the rest of the execution::start() implementation.

      -

      An implementation of execution::start() that supports cancellation will generally need to perform (at least) -two separate steps: launch the operation, subscribe a stop-callback to the receiver’s stop-token. Care needs -to be taken depending on the order in which these two steps are performed.

      -

      If the stop-callback is subscribed first and then the operation is launched, care needs to be taken to ensure -that a stop-request that invokes the stop-callback on another thread after the stop-callback is registered -but before the operation finishes launching does not either result in a missed cancellation request or a -data-race. e.g. by performing an atomic write after the launch has finished executing

      -

      If the operation is launched first and then the stop-callback is subscribed, care needs to be taken to ensure -that if the launched operation completes concurrently on another thread that it does not destroy the operation-state -until after the stop-callback has been registered. e.g. by having the execution::start implementation write to -an atomic variable once it has finished registering the stop-callback and having the concurrent completion handler -check that variable and either call the completion-signalling operation or store the result and defer calling the -receiver’s completion-signalling operation to the execution::start() call (which is still executing).

      +

      Usually, an operation will attach a stop-callback at some point inside the call +to execution::start() so that a subsequent stop-request will interrupt the +logic.

      +

      A stop-request can be issued concurrently from another thread. This means the +implementation of execution::start() needs to be careful to ensure that, once +a stop-callback has been registered, there are no data-races between a +potentially concurrently-executing stop-callback and the rest of the execution::start() implementation.

      +

      An implementation of execution::start() that supports cancellation will +generally need to perform (at least) two separate steps: launch the operation, +subscribe a stop-callback to the receiver’s stop-token. Care needs to be taken +depending on the order in which these two steps are performed.

      +

      If the stop-callback is subscribed first and then the operation is launched, +care needs to be taken to ensure that a stop-request that invokes the +stop-callback on another thread after the stop-callback is registered but before +the operation finishes launching does not result in either a missed cancellation +request or a data-race, e.g. by performing an atomic write after the launch has +finished executing.

      +

      If the operation is launched first and then the stop-callback is subscribed, +care needs to be taken to ensure that, if the launched operation completes +concurrently on another thread, it does not destroy the operation-state +until after the stop-callback has been registered, e.g. by having the execution::start implementation write to an atomic variable once it has +finished registering the stop-callback and having the concurrent completion +handler check that variable and either call the completion-signalling operation +or store the result and defer calling the receiver’s completion-signalling +operation to the execution::start() call (which is still executing).

      For an example of an implementation strategy for solving these data-races see § 1.4 Asynchronous Windows socket recv.

      4.9.4. Cancellation design status

      This paper currently includes the design for cancellation as proposed in Composable cancellation for sender-based async operations - "Composable cancellation for sender-based async operations". -P2175R0 contains more details on the background motivation and prior-art and design rationale of this design.

      -

      It is important to note, however, that initial review of this design in the SG1 concurrency subgroup raised some concerns -related to runtime overhead of the design in single-threaded scenarios and these concerns are still being investigated.

      -

      The design of P2175R0 has been included in this paper for now, despite its potential to change, as we believe that -support for cancellation is a fundamental requirement for an async model and is required in some form to be able to -talk about the semantics of some of the algorithms proposed in this paper.

      -

      This paper will be updated in the future with any changes that arise from the investigations into P2175R0.

      +P2175R0 contains more details on the background motivation, prior art, and +design rationale of this design.

      +

      It is important to note, however, that initial review of this design in the SG1 +concurrency subgroup raised some concerns related to runtime overhead of the +design in single-threaded scenarios and these concerns are still being +investigated.

      +

      The design of P2175R0 has been included in this paper for now, despite its +potential to change, as we believe that support for cancellation is a +fundamental requirement for an async model and is required in some form to be +able to talk about the semantics of some of the algorithms proposed in this +paper.

      +

      This paper will be updated in the future with any changes that arise from the +investigations into P2175R0.

      4.10. Sender factories and adaptors are lazy

      In an earlier revision of this paper, some of the proposed algorithms supported executing their logic eagerly; i.e., before the returned sender has been connected to a receiver and started. These algorithms were removed because eager execution has a number of negative semantic and performance implications.

      -

      We have originally included this functionality in the paper because of a long-standing -belief that eager execution is a mandatory feature to be included in the standard Executors -facility for that facility to be acceptable for accelerator vendors. A particular concern -was that we must be able to write generic algorithms that can run either eagerly or lazily, -depending on the kind of an input sender or scheduler that have been passed into them as -arguments. We considered this a requirement, because the _latency_ of launching work on an +

      We originally included this functionality in the paper because of a +long-standing belief that eager execution is a mandatory feature to be included +in the standard Executors facility for that facility to be acceptable for +accelerator vendors. A particular concern was that we must be able to write +generic algorithms that can run either eagerly or lazily, depending on the kind +of input sender or scheduler that has been passed into them as arguments. We +considered this a requirement, because the _latency_ of launching work on an accelerator can sometimes be considerable.

      -

      However, in the process of working on this paper and implementations of the features -proposed within, our set of requirements has shifted, as we understood the different -implementation strategies that are available for the feature set of this paper better, -and, after weighting the earlier concerns against the points presented below, we -have arrived at the conclusion that a purely lazy model is enough for most algorithms, -and users who intend to launch work earlier may use an algorithm such as ensure_started to achieve that goal. We have also come to deeply appreciate the fact that a purely -lazy model allows both the implementation and the compiler to have a much better -understanding of what the complete graph of tasks looks like, allowing them to better -optimize the code - also when targetting accelerators.

      +

      However, in the process of working on this paper and implementations of the +features proposed within, our set of requirements has shifted, as we better understood +the different implementation strategies that are available for the feature set +of this paper, and, after weighing the earlier concerns against the +points presented below, we have arrived at the conclusion that a purely lazy +model is enough for most algorithms, and users who intend to launch work earlier +may use an algorithm such as ensure_started to achieve that goal. We have also +come to deeply appreciate the fact that a purely lazy model allows both the +implementation and the compiler to have a much better understanding of what the +complete graph of tasks looks like, allowing them to better optimize the code, +including when targeting accelerators.

      4.10.1. Eager execution leads to detached work or worse

      One of the questions that arises with APIs that can potentially return eagerly-executing senders is "What happens when those senders are destructed @@ -4693,30 +5085,40 @@

      4.11. Schedulers advertise their forward progress guarantees

      -

      To decide whether a scheduler (and its associated execution resource) is sufficient for a specific task, it may be necessary to know what kind of forward progress guarantees it provides for the execution agents it creates. The C++ Standard defines the following -forward progress guarantees:

      +

      To decide whether a scheduler (and its associated execution resource) is +sufficient for a specific task, it may be necessary to know what kind of forward +progress guarantees it provides for the execution agents it creates. The C++ +Standard defines the following forward progress guarantees:

      • concurrent, which requires that a thread makes progress eventually;

      • -

        parallel, which requires that a thread makes progress once it executes a step; and

        +

        parallel, which requires that a thread makes progress once it executes +a step; and

      • weakly parallel, which does not require that the thread makes progress.

      This paper introduces a scheduler query function, get_forward_progress_guarantee, which returns one of the enumerators of a new enum type, forward_progress_guarantee. Each enumerator of forward_progress_guarantee corresponds to one of the aforementioned guarantees.
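
      To make the intended use of such a query concrete, here is a minimal sketch of checking whether a scheduler's guarantee is strong enough for a task. Everything in it is hypothetical: `forward_progress_guarantee_sketch` only mirrors the three enumerators named above, the member query `forward_progress()` stands in for `get_forward_progress_guarantee`, and ordering the enumerators from strongest to weakest is an assumption made for this example.

```cpp
#include <cassert>

// Mirrors the three guarantees named in the text. Declaring them from
// strongest to weakest is an assumption of this sketch.
enum class forward_progress_guarantee_sketch {
  concurrent,
  parallel,
  weakly_parallel
};

// Hypothetical scheduler that advertises its guarantee via a member query.
struct pool_scheduler_sketch {
  constexpr forward_progress_guarantee_sketch forward_progress() const noexcept {
    return forward_progress_guarantee_sketch::parallel;
  }
};

// True if the provided guarantee is at least as strong as the required one.
constexpr bool is_sufficient(forward_progress_guarantee_sketch provided,
                             forward_progress_guarantee_sketch required) {
  return static_cast<int>(provided) <= static_cast<int>(required);
}
```

      A generic algorithm that blocks while waiting on other execution agents could, for instance, require `concurrent` and reject `pool_scheduler_sketch`, while work that never blocks could accept any of the three.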

      4.12. Most sender adaptors are pipeable

      -

      To facilitate an intuitive syntax for composition, most sender adaptors are pipeable; they can be composed (piped) together with operator|. -This mechanism is similar to the operator| composition that C++ range adaptors support and draws inspiration from piping in *nix shells. -Pipeable sender adaptors take a sender as their first parameter and have no other sender parameters.

      -

      a | b will pass the sender a as the first argument to the pipeable sender adaptor b. Pipeable sender adaptors support partial application of the parameters after the first. For example, all of the following are equivalent:

      +

      To facilitate an intuitive syntax for composition, most sender adaptors are pipeable; they can be composed (piped) +together with operator|. This mechanism is similar to the operator| composition that C++ range adaptors support and draws inspiration from piping in +*nix shells. +Pipeable sender adaptors take a sender as their first parameter and have no +other sender parameters.

      +

      a | b will pass the sender a as the first argument to the pipeable sender +adaptor b. Pipeable sender adaptors support partial application of the +parameters after the first. For example, all of the following are equivalent:

      execution::bulk(snd, N, [] (std::size_t i, auto d) {});
       execution::bulk(N, [] (std::size_t i, auto d) {})(snd);
       snd | execution::bulk(N, [] (std::size_t i, auto d) {});
       
      -

      Piping enables you to compose together senders with a linear syntax. -Without it, you’d have to use either nested function call syntax, which would cause a syntactic inversion of the direction of control flow, or you’d have to introduce a temporary variable for each stage of the pipeline. -Consider the following example where we want to execute first on a CPU thread pool, then on a CUDA GPU, then back on the CPU thread pool:

      +

      Piping enables you to compose together senders with a linear syntax. Without it, +you’d have to use either nested function call syntax, which would cause a +syntactic inversion of the direction of control flow, or you’d have to introduce +a temporary variable for each stage of the pipeline. Consider the following +example where we want to execute first on a CPU thread pool, then on a CUDA GPU, +then back on the CPU thread pool:

      @@ -4767,19 +5169,34 @@

      execution::when_all and execution::when_all_with_variant: Since this sender adaptor takes a variadic pack of senders, a partially applied form would be ambiguous with a non partially applied form with an arity of one less.

      +

      execution::when_all and execution::when_all_with_variant: Since this +sender adaptor takes a variadic pack of senders, a partially applied form +would be ambiguous with a non partially applied form with an arity of one +less.

    • -

      execution::on: This sender adaptor changes how the sender passed to it is executed, not what happens to its result, but allowing it in a pipeline makes it read as if it performed a function more similar to transfer.

      +

      execution::on: This sender adaptor changes how the sender passed to it is +executed, not what happens to its result, but allowing it in a pipeline makes +it read as if it performed a function more similar to transfer.

      Sender consumers could be made pipeable, but we have chosen to not do so. -However, since these are terminal nodes in a pipeline and nothing can be piped after them, we believe a pipe syntax may be confusing as well as unnecessary, as consumers cannot be chained. -We believe sender consumers read better with function call syntax.

      +Since these are terminal nodes in a pipeline and nothing can be piped +after them, we believe a pipe syntax may be confusing as well as unnecessary, as +consumers cannot be chained. We believe sender consumers read better with +function call syntax.

      4.13. A range of senders represents an async sequence of data

      -

      Senders represent a single unit of asynchronous work. In many cases though, what is being modelled is a sequence of data arriving asynchronously, and you want computation to happen on demand, when each element arrives. This requires nothing more than what is in this paper and the range support in C++20. A range of senders would allow you to model such input as keystrikes, mouse movements, sensor readings, or network requests.

      -

      Given some expression R that is a range of senders, consider the following in a coroutine that returns an async generator type:

      +

      Senders represent a single unit of asynchronous work. In many cases though, what +is being modelled is a sequence of data arriving asynchronously, and you want +computation to happen on demand, when each element arrives. This requires +nothing more than what is in this paper and the range support in C++20. A range +of senders would allow you to model such input as keystrokes, mouse movements, +sensor readings, or network requests.

      +

      Given some expression R that is a range of senders, consider +the following in a coroutine that returns an async generator type:

      for (auto snd : R) {
         if (auto opt = co_await execution::stopped_as_optional(std::move(snd)))
           co_yield fn(*std::move(opt));
      @@ -4787,12 +5204,28 @@ 

      break; }

      -

      This transforms each element of the asynchronous sequence R with the function fn on demand, as the data arrives. The result is a new asynchronous sequence of the transformed values.

      -

      Now imagine that R is the simple expression views::iota(0) | views::transform(execution::just). This creates a lazy range of senders, each of which completes immediately with monotonically increasing integers. The above code churns through the range, generating a new infine asynchronous range of values [fn(0), fn(1), fn(2), ...].

      -

      Far more interesting would be if R were a range of senders representing, say, user actions in a UI. The above code gives a simple way to respond to user actions on demand.

      +

      This transforms each element of the asynchronous sequence R with the function fn on demand, as the data arrives. The result is a new +asynchronous sequence of the transformed values.

      +

      Now imagine that R is the simple expression views::iota(0) | views::transform(execution::just). This creates a lazy range of senders, each +of which completes immediately with monotonically increasing integers. The above +code churns through the range, generating a new infinite asynchronous range of +values [fn(0), fn(1), fn(2), ...].

      +

      Far more interesting would be if R were a range of senders +representing, say, user actions in a UI. The above code gives a simple way to +respond to user actions on demand.

      4.14. Senders can represent partial success

      -

      Receivers have three ways they can complete: with success, failure, or cancellation. This begs the question of how they can be used to represent async operations that partially succeed. For example, consider an API that reads from a socket. The connection could drop after the API has filled in some of the buffer. In cases like that, it makes sense to want to report both that the connection dropped and that some data has been successfully read.

      -

      Often in the case of partial success, the error condition is not fatal nor does it mean the API has failed to satisfy its post-conditions. It is merely an extra piece of information about the nature of the completion. In those cases, "partial success" is another way of saying "success". As a result, it is sensible to pass both the error code and the result (if any) through the value channel, as shown below:

      +

      Receivers have three ways they can complete: with success, failure, or +cancellation. This raises the question of how they can be used to represent async +operations that partially succeed. For example, consider an API that reads +from a socket. The connection could drop after the API has filled in some of the +buffer. In cases like that, it makes sense to want to report both that the +connection dropped and that some data has been successfully read.

      +

      Often in the case of partial success, the error condition is not fatal nor does +it mean the API has failed to satisfy its post-conditions. It is merely an extra +piece of information about the nature of the completion. In those cases, +"partial success" is another way of saying "success". As a result, it is +sensible to pass both the error code and the result (if any) through the value +channel, as shown below:

      // Capture a buffer for read_socket_async to fill in
       execution::just(array<byte, 1024>{})
         | execution::let_value([socket](array<byte, 1024>& buff) {
      @@ -4809,8 +5242,20 @@ 

      }); })

      -

      In other cases, the partial success is more of a partial failure. That happens when the error condition indicates that in some way the function failed to satisfy its post-conditions. In those cases, sending the error through the value channel loses valuable contextual information. It’s possible that bundling the error and the incomplete results into an object and passing it through the error channel makes more sense. In that way, generic algorithms will not miss the fact that a post-condition has not been met and react inappropriately.

      -

      Another possibility is for an async API to return a range of senders: if the API completes with full success, full error, or cancellation, the returned range contains just one sender with the result. Otherwise, if the API partially fails (doesn’t satisfy its post-conditions, but some incomplete result is available), the returned range would have two senders: the first containing the partial result, and the second containing the error. Such an API might be used in a coroutine as follows:

      +

      In other cases, the partial success is more of a partial failure. That happens +when the error condition indicates that in some way the function failed to +satisfy its post-conditions. In those cases, sending the error through the value +channel loses valuable contextual information. It’s possible that bundling the +error and the incomplete results into an object and passing it through the error +channel makes more sense. In that way, generic algorithms will not miss the fact +that a post-condition has not been met and react inappropriately.

      +

      Another possibility is for an async API to return a range of senders: if the +API completes with full success, full error, or cancellation, the returned range +contains just one sender with the result. Otherwise, if the API partially fails +(doesn’t satisfy its post-conditions, but some incomplete result is available), +the returned range would have two senders: the first containing the partial +result, and the second containing the error. Such an API might be used in a +coroutine as follows:

      // Declare a buffer for read_socket_async to fill in
       array<byte, 1024> buff;
       
      @@ -4829,9 +5274,16 @@ 

      } }

      -

      Finally, it’s possible to combine these two approaches when the API can both partially succeed (meeting its post-conditions) and partially fail (not meeting its post-conditions).

      +

      Finally, it’s possible to combine these two approaches when the API can both +partially succeed (meeting its post-conditions) and partially fail (not meeting +its post-conditions).

      4.15. All awaitables are senders

      -

      Since C++20 added coroutines to the standard, we expect that coroutines and awaitables will be how a great many will choose to express their asynchronous code. However, in this paper, we are proposing to add a suite of asynchronous algorithms that accept senders, not awaitables. One might wonder whether and how these algorithms will be accessible to those who choose coroutines instead of senders.

      +

      Since C++20 added coroutines to the standard, we expect that coroutines and +awaitables will be how a great many will choose to express their asynchronous +code. However, in this paper, we are proposing to add a suite of asynchronous +algorithms that accept senders, not awaitables. One might wonder whether and how +these algorithms will be accessible to those who choose coroutines instead of +senders.

      In truth there will be no problem because all generally awaitable types automatically model the sender concept. The adaptation is transparent and happens in the sender customization points, which are aware of awaitables. (By @@ -4847,9 +5299,17 @@

      auto o = this_thread::sync_wait(doSomeAsyncWork()); } -

      Since awaitables are senders, writing a sender-based asynchronous algorithm is trivial if you have a coroutine task type: implement the algorithm as a coroutine. If you are not bothered by the possibility of allocations and indirections as a result of using coroutines, then there is no need to ever write a sender, a receiver, or an operation state.

      +

      Since awaitables are senders, writing a sender-based asynchronous algorithm is +trivial if you have a coroutine task type: implement the algorithm as a +coroutine. If you are not bothered by the possibility of allocations and +indirections as a result of using coroutines, then there is no need to ever +write a sender, a receiver, or an operation state.

      4.16. Many senders can be trivially made awaitable

      -

      If you choose to implement your sender-based algorithms as coroutines, you’ll run into the issue of how to retrieve results from a passed-in sender. This is not a problem. If the coroutine type opts in to sender support -- trivial with the execution::with_awaitable_senders utility -- then a large class of senders are transparently awaitable from within the coroutine.

      +

      If you choose to implement your sender-based algorithms as coroutines, you’ll +run into the issue of how to retrieve results from a passed-in sender. This is +not a problem. If the coroutine type opts in to sender support -- trivial with +the execution::with_awaitable_senders utility -- then a large class of senders +are transparently awaitable from within the coroutine.

      For example, consider the following trivial implementation of the sender-based retry algorithm:

      template<class S>
         requires single-sender<S&> // see [exec.as.awaitable]
      @@ -4862,42 +5322,93 @@ 

      } }

      -

      Only some senders can be made awaitable directly because of the fact that callbacks are more expressive than coroutines. An awaitable expression has a single type: the result value of the async operation. In contrast, a callback can accept multiple arguments as the result of an operation. What’s more, the callback can have overloaded function call signatures that take different sets of arguments. There is no way to automatically map such senders into awaitables. The with_awaitable_senders utility recognizes as awaitables those senders that send a single value of a single type. To await another kind of sender, a user would have to first map its value channel into a single value of a single type -- say, with the into_variant sender algorithm -- before co_await-ing that sender.

      +

      Only some senders can be made awaitable directly because +callbacks are more expressive than coroutines. An awaitable expression has a +single type: the result value of the async operation. In contrast, a callback +can accept multiple arguments as the result of an operation. What’s more, the +callback can have overloaded function call signatures that take different sets +of arguments. There is no way to automatically map such senders into awaitables. +The with_awaitable_senders utility recognizes as awaitables those senders that +send a single value of a single type. To await another kind of sender, a user +would have to first map its value channel into a single value of a single type +-- say, with the into_variant sender algorithm -- before co_await-ing that +sender.

      4.17. Cancellation of a sender can unwind a stack of coroutines

      -

      When looking at the sender-based retry algorithm in the previous section, we can see that the value and error cases are correctly handled. But what about cancellation? What happens to a coroutine that is suspended awaiting a sender that completes by calling execution::set_stopped?

      -

      When your task type’s promise inherits from with_awaitable_senders, what happens is this: the coroutine behaves as if an uncatchable exception had been thrown from the co_await expression. (It is not really an exception, but it’s helpful to think of it that way.) Provided that the promise types of the calling coroutines also inherit from with_awaitable_senders, or more generally implement a member function called unhandled_stopped, the exception unwinds the chain of coroutines as if an exception were thrown except that it bypasses catch(...) clauses.

      -

      In order to "catch" this uncatchable stopped exception, one of the calling coroutines in the stack would have to await a sender that maps the stopped channel into either a value or an error. That is achievable with the execution::let_stopped, execution::upon_stopped, execution::stopped_as_optional, or execution::stopped_as_error sender adaptors. For instance, we can use execution::stopped_as_optional to "catch" the stopped signal and map it into an empty optional as shown below:

      +

      When looking at the sender-based retry algorithm in the previous section, we +can see that the value and error cases are correctly handled. But what about +cancellation? What happens to a coroutine that is suspended awaiting a sender +that completes by calling execution::set_stopped?

      +

      When your task type’s promise inherits from with_awaitable_senders, what +happens is this: the coroutine behaves as if an uncatchable exception had been +thrown from the co_await expression. (It is not really an exception, but it’s +helpful to think of it that way.) Provided that the promise types of the calling +coroutines also inherit from with_awaitable_senders, or more generally +implement a member function called unhandled_stopped, the exception unwinds +the chain of coroutines as if an exception were thrown except that it bypasses catch(...) clauses.

      +

      In order to "catch" this uncatchable stopped exception, one of the calling +coroutines in the stack would have to await a sender that maps the stopped +channel into either a value or an error. That is achievable with the execution::let_stopped, execution::upon_stopped, execution::stopped_as_optional, or execution::stopped_as_error sender +adaptors. For instance, we can use execution::stopped_as_optional to "catch" +the stopped signal and map it into an empty optional as shown below:

      if (auto opt = co_await execution::stopped_as_optional(some_sender)) {
         // OK, some_sender completed successfully, and opt contains the result.
       } else {
         // some_sender completed with a cancellation signal.
       }
       
      -

      As described in the section "All awaitables are senders", the sender customization points recognize awaitables and adapt them transparently to model the sender concept. When connect-ing an awaitable and a receiver, the adaptation layer awaits the awaitable within a coroutine that implements unhandled_stopped in its promise type. The effect of this is that an "uncatchable" stopped exception propagates seamlessly out of awaitables, causing execution::set_stopped to be called on the receiver.

      -

      Obviously, unhandled_stopped is a library extension of the coroutine promise interface. Many promise types will not implement unhandled_stopped. When an uncatchable stopped exception tries to propagate through such a coroutine, it is treated as an unhandled exception and terminate is called. The solution, as described above, is to use a sender adaptor to handle the stopped exception before awaiting it. It goes without saying that any future Standard Library coroutine types ought to implement unhandled_stopped. The author of Add lazy coroutine (coroutine task) type, which proposes a standard coroutine task type, is in agreement.

      +

      As described in the section "All +awaitables are senders", the sender customization points recognize +awaitables and adapt them transparently to model the sender concept. When connect-ing an awaitable and a receiver, the adaptation layer awaits the +awaitable within a coroutine that implements unhandled_stopped in its promise +type. The effect of this is that an "uncatchable" stopped exception propagates +seamlessly out of awaitables, causing execution::set_stopped to be called on +the receiver.

      +

      Obviously, unhandled_stopped is a library extension of the coroutine promise +interface. Many promise types will not implement unhandled_stopped. When an +uncatchable stopped exception tries to propagate through such a coroutine, it is +treated as an unhandled exception and terminate is called. The solution, as +described above, is to use a sender adaptor to handle the stopped exception +before awaiting it. It goes without saying that any future Standard Library +coroutine types ought to implement unhandled_stopped. The author of Add lazy coroutine (coroutine task) type, which proposes a standard coroutine task type, is in agreement.

      4.18. Composition with parallel algorithms

      -

      The C++ Standard Library provides a large number of algorithms that offer the potential for non-sequential execution via the use of execution policies. The set of algorithms with execution policy overloads are often referred to as "parallel algorithms", although -additional policies are available.

      -

      Existing policies, such as execution::par, give the implementation permission to execute the algorithm in parallel. However, the choice of execution resources used to perform the work is left to the implementation.

      -

      We will propose a customization point for combining schedulers with policies in order to provide control over where work will execute.

      +

      The C++ Standard Library provides a large number of algorithms that offer the +potential for non-sequential execution via the use of execution policies. The +set of algorithms with execution policy overloads is often referred to as +"parallel algorithms", although additional policies are available.

      +

      Existing policies, such as execution::par, give the implementation permission +to execute the algorithm in parallel. However, the choice of execution resources +used to perform the work is left to the implementation.

      +

      We will propose a customization point for combining schedulers with policies in +order to provide control over where work will execute.

      template<class ExecutionPolicy>
       unspecified executing_on(
           execution::scheduler auto scheduler,
           ExecutionPolicy && policy
       );
       
      -

      This function would return an object of an unspecified type which can be used in place of an execution policy as the first argument to one of the parallel algorithms. The overload selected by that object should execute its computation as requested by policy while using scheduler to create any work to be run. The expression may be ill-formed if scheduler is not able to support the given policy.

      -

      The existing parallel algorithms are synchronous; all of the effects performed by the computation are complete before the algorithm returns to its caller. This remains unchanged with the executing_on customization point.

      -

      In the future, we expect additional papers will propose asynchronous forms of the parallel algorithms which (1) return senders rather than values or void and (2) where a customization point pairing a sender with an execution policy would similarly be used to -obtain an object of unspecified type to be provided as the first argument to the algorithm.

      +

      This function would return an object of an unspecified type which can be used in +place of an execution policy as the first argument to one of the parallel +algorithms. The overload selected by that object should execute its computation +as requested by policy while using scheduler to create any work to be run. +The expression may be ill-formed if scheduler is not able to support the given +policy.

      +

      The existing parallel algorithms are synchronous; all of the effects performed +by the computation are complete before the algorithm returns to its caller. This +remains unchanged with the executing_on customization point.

      +

      In the future, we expect additional papers will propose asynchronous forms of +the parallel algorithms which (1) return senders rather than values or void and (2) where a customization point pairing a sender with an execution policy +would similarly be used to obtain an object of unspecified type to be provided +as the first argument to the algorithm.

      4.19. User-facing sender factories

      -

      A sender factory is an algorithm that takes no senders as parameters and returns a sender.

      +

      A sender factory is an algorithm that takes no senders as parameters and +returns a sender.

      4.19.1. execution::schedule

      execution::sender auto schedule(
           execution::scheduler auto scheduler
       );
       
      -

      Returns a sender describing the start of a task graph on the provided scheduler. See § 4.2 Schedulers represent execution resources.

      +

      Returns a sender describing the start of a task graph on the provided scheduler. +See § 4.2 Schedulers represent execution resources.

      execution::scheduler auto sch1 = get_system_thread_pool().scheduler();
       
       execution::sender auto snd1 = execution::schedule(sch1);
      @@ -4908,7 +5419,11 @@ 

      auto ...&& values );

      -

      Returns a sender with no completion schedulers, which sends the provided values. The input values are decay-copied into the returned sender. When the returned sender is connected to a receiver, the values are moved into the operation state if the sender is an rvalue; otherwise, they are copied. Then xvalues referencing the values in the operation state are passed to the receiver’s set_value.

      +

      Returns a sender with no completion schedulers, which sends the provided values. The input values are decay-copied into the +returned sender. When the returned sender is connected to a receiver, the values +are moved into the operation state if the sender is an rvalue; otherwise, they +are copied. Then xvalues referencing the values in the operation state are +passed to the receiver’s set_value.

      execution::sender auto snd1 = execution::just(3.14);
       execution::sender auto then1 = execution::then(snd1, [] (double d) {
         std::cout << d << "\n";
      @@ -4941,11 +5456,17 @@ 

      4.19.4. execution::just_stopped

      execution::sender auto just_stopped();
       
      -

      Returns a sender with no completion schedulers, which completes immediately by calling the receiver’s set_stopped.

      +

      Returns a sender with no completion schedulers, which +completes immediately by calling the receiver’s set_stopped.

      4.19.5. execution::read

      execution::sender auto read(auto tag);
       
      @@ -4962,8 +5483,14 @@ 

      return read(execution::get_stop_token); }

      -

      Returns a sender that reaches into a receiver’s environment and pulls out the current value associated with the customization point denoted by Tag. It then sends the value read back to the receiver through the value channel. For instance, get_scheduler() (with no arguments) is a sender that asks the receiver for the currently suggested scheduler and passes it to the receiver’s set_value completion-signal.

      -

      This can be useful when scheduling nested dependent work. The following sender pulls the current schduler into the value channel and then schedules more work onto it.

      +

      Returns a sender that reaches into a receiver’s environment and pulls out the +current value associated with the customization point denoted by Tag. It then +sends the value read back to the receiver through the value channel. For +instance, get_scheduler() (with no arguments) is a sender that asks the +receiver for the currently suggested scheduler and passes it to the receiver’s set_value completion-signal.

      +

      This can be useful when scheduling nested dependent work. The following sender +pulls the current scheduler into the value channel and then schedules more work +onto it.

      execution::sender auto task =
         execution::get_scheduler()
           | execution::let_value([](auto sched) {
      @@ -4972,11 +5499,19 @@ 

      this_thread::sync_wait( std::move(task) ); // wait for it to finish

      -

      This code uses the fact that sync_wait associates a scheduler with the receiver that it connects with task. get_scheduler() reads that scheduler out of the receiver, and passes it to let_value’s receiver’s set_value function, which in turn passes it to the lambda. That lambda returns a new sender that uses the scheduler to schedule some nested work onto sync_wait’s scheduler.

      +

      This code uses the fact that sync_wait associates a scheduler with the +receiver that it connects with task. get_scheduler() reads that scheduler +out of the receiver, and passes it to let_value’s receiver’s set_value function, which in turn passes it to the lambda. That lambda returns a new +sender that uses the scheduler to schedule some nested work onto sync_wait’s +scheduler.

      4.20. User-facing sender adaptors

      -

      A sender adaptor is an algorithm that takes one or more senders, which it may execution::connect, as parameters, and returns a sender, whose completion is related to the sender arguments it has received.

      -

      Sender adaptors are lazy, that is, they are never allowed to submit any work for execution prior to the returned sender being started later on, and are also guaranteed to not start any input senders passed into them. Sender consumers -such as § 4.21.1 execution::start_detached and § 4.21.2 this_thread::sync_wait start senders.

      +

      A sender adaptor is an algorithm that takes one or more senders, which it +may execution::connect, as parameters, and returns a sender, whose completion +is related to the sender arguments it has received.

      +

      Sender adaptors are lazy, that is, they are never allowed to submit any +work for execution prior to the returned sender being started later on, and +are also guaranteed to not start any input senders passed into them. Sender +consumers such as § 4.21.1 execution::start_detached and § 4.21.2 this_thread::sync_wait start senders.

      For a more implementer-centric description of starting senders, see § 5.5 Sender adaptors are lazy.
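The laziness guarantee can be illustrated with a small, self-contained toy model. This sketch does not use the proposed `std::execution` API: `just_t`, `then_t`, `then`, and the `start(callback)` member are hypothetical stand-ins invented for illustration, and the model ignores the error and stopped channels entirely.

```cpp
#include <cassert>
#include <utility>

// Toy model only (not the proposed API): a "sender" is an object whose
// start(k) member delivers its value to the continuation k.
template <class T>
struct just_t {
    T value;
    template <class K>
    void start(K k) const { k(value); }
};

// Toy adaptor: stores the input sender and the function, and does nothing
// else until start() is called on the result.
template <class S, class F>
struct then_t {
    S input;
    F fn;
    template <class K>
    void start(K k) const {
        input.start([&](auto&&... vs) {
            k(fn(std::forward<decltype(vs)>(vs)...));
        });
    }
};

template <class S, class F>
then_t<S, F> then(S s, F f) {
    return {std::move(s), std::move(f)};
}

// Returns true if fn was not invoked when the adaptor was applied and was
// invoked exactly once after the resulting sender was started.
inline bool demo_then_is_lazy() {
    int calls = 0;
    auto snd = then(just_t<int>{20}, [&calls](int x) { ++calls; return x + 1; });
    bool not_run_on_construction = (calls == 0);
    int result = 0;
    snd.start([&result](int v) { result = v; });
    return not_run_on_construction && calls == 1 && result == 21;
}
```

Calling `demo_then_is_lazy()` checks both halves of the guarantee in this toy setting: composing the adaptor performs no work, and starting the result performs it once.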

      4.20.1. execution::transfer

      execution::sender auto transfer(
      @@ -4984,7 +5519,8 @@ 

      § 4.6 Execution resource transitions are explicit.

      +

      Returns a sender describing the transition from the execution agent of the input +sender to the execution agent of the target scheduler. See § 4.6 Execution resource transitions are explicit.

      execution::scheduler auto cpu_sched = get_system_thread_pool().scheduler();
       execution::scheduler auto gpu_sched = cuda::scheduler();
       
      @@ -5000,8 +5536,10 @@ 

      std::invocable<values-sent-by(input)...> function );

      -

      then returns a sender describing the task graph described by the input sender, with an added node of invoking the provided function with the values sent by the input sender as arguments.

      -

      then is guaranteed to not begin executing function until the returned sender is started.

      +

      then returns a sender describing the task graph described by the input sender, +with an added node of invoking the provided function with the values sent by the input sender as arguments.

      +

      then is guaranteed to not begin executing function until the returned +sender is started.

      execution::sender auto input = get_input();
       execution::sender auto snd = execution::then(input, [](auto... args) {
           std::print(args...);
      @@ -5009,7 +5547,8 @@ 

      // snd describes the work described by input
       // followed by printing all of the values sent by input

      -

      This adaptor is included as it is necessary for writing any sender code that actually performs a useful function.

      +

      This adaptor is included as it is necessary for writing any sender code that +actually performs a useful function.

      4.20.3. execution::upon_*

      execution::sender auto upon_error(
           execution::sender auto input,
      @@ -5021,7 +5560,8 @@ 

      std::invocable auto function );

      -

      upon_error and upon_stopped are similar to then, but where then works with values sent by the input sender, upon_error works with errors, and upon_stopped is invoked when the "stopped" signal is sent.

      +

      upon_error and upon_stopped are similar to then, but where then works +with values sent by the input sender, upon_error works with errors, and upon_stopped is invoked when the "stopped" signal is sent.

      4.20.4. execution::let_*

      execution::sender auto let_value(
           execution::sender auto input,
      @@ -5038,8 +5578,15 @@ 

      std::invocable auto function );

      -

      let_value is very similar to then: when it is started, it invokes the provided function with the values sent by the input sender as arguments. However, where the sender returned from then sends exactly what that function ends up returning - let_value requires that the function return a sender, and the sender returned by let_value sends the values sent by the sender returned from the callback. This is similar to the notion of "future unwrapping" in future/promise-based frameworks.

      -

      let_value is guaranteed to not begin executing function until the returned sender is started.

      +

      let_value is very similar to then: when it is started, it invokes the +provided function with the values sent by the input sender as +arguments. However, where the sender returned from then sends exactly what +that function ends up returning - let_value requires that the function return a sender, and the sender returned +by let_value sends the values sent by the sender returned from the callback. +This is similar to the notion of "future unwrapping" in future/promise-based +frameworks.

      +

      let_value is guaranteed to not begin executing function until the +returned sender is started.

      let_error and let_stopped are similar to let_value, but where let_value works with values sent by the input sender, let_error works with errors, and let_stopped is invoked when the "stopped" signal is sent.
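The "unwrapping" behaviour of let_value can be sketched with a toy model. Everything below is an assumption made for illustration: `just_t`, `let_value_t`, and the `start(callback)` member are hypothetical and are not the proposed API, and only the value channel is modelled.

```cpp
#include <cassert>
#include <utility>

// Toy model only (not the proposed API): a "sender" delivers its value to
// the continuation passed to start().
template <class T>
struct just_t {
    T value;
    template <class K>
    void start(K k) const { k(value); }
};

// Toy let_value: fn must itself return a sender. That inner sender is
// started, and its value (not the sender object) reaches the continuation.
template <class S, class F>
struct let_value_t {
    S input;
    F fn;
    template <class K>
    void start(K k) const {
        input.start([&](auto v) { fn(v).start(k); });
    }
};

template <class S, class F>
let_value_t<S, F> let_value(S s, F f) {
    return {std::move(s), std::move(f)};
}

// The callback returns a sender of n * n; the continuation sees the int.
inline int demo_let_value() {
    auto snd = let_value(just_t<int>{3}, [](int n) { return just_t<int>{n * n}; });
    int result = 0;
    snd.start([&result](int v) { result = v; });
    return result;
}
```

In this sketch the final continuation receives a plain `int`. With a `then`-style adaptor, the same callback would instead hand the inner `just_t<int>` object itself to the continuation.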

      4.20.5. execution::on

      execution::sender auto on(
      @@ -5047,13 +5594,19 @@ 

      execution::sender auto snd );

      -

      Returns a sender which, when started, will start the provided sender on an execution agent belonging to the execution resource associated with the provided scheduler. This returned sender has no completion schedulers.

      +

      Returns a sender which, when started, will start the provided sender on an +execution agent belonging to the execution resource associated with the provided +scheduler. This returned sender has no completion +schedulers.

      4.20.6. execution::into_variant

      execution::sender auto into_variant(
           execution::sender auto snd
       );
       
      -

      Returns a sender which sends a variant of tuples of all the possible sets of types sent by the input sender. Senders can send multiple sets of values depending on runtime conditions; this is a helper function that turns them into a single variant value.

      +

      Returns a sender which sends a variant of tuples of all the possible sets of +types sent by the input sender. Senders can send multiple sets of values +depending on runtime conditions; this is a helper function that turns them into +a single variant value.

      4.20.7. execution::stopped_as_optional

      execution::sender auto stopped_as_optional(
           single-sender auto snd
      @@ -5075,16 +5628,27 @@ 

      invocable<decltype(size), values-sent-by(input)...> function );

      -

      Returns a sender describing the task of invoking the provided function with every index in the provided shape along with the values sent by the input sender. The returned sender completes once all invocations have completed, or an error has occurred. If it completes -by sending values, they are equivalent to those sent by the input sender.

      -

      No instance of function will begin executing until the returned sender is started. Each invocation of function runs in an execution agent whose forward progress guarantees are determined by the scheduler on which they are run. All agents created by a single use -of bulk execute with the same guarantee. The number of execution agents used by bulk is not specified. This allows a scheduler to execute some invocations of the function in parallel.

      -

      In this proposal, only integral types are used to specify the shape of the bulk section. We expect that future papers may wish to explore extensions of the interface to explore additional kinds of shapes, such as multi-dimensional grids, that are commonly used for -parallel computing tasks.

      +

      Returns a sender describing the task of invoking the provided function with +every index in the provided shape along with the values sent by the input +sender. The returned sender completes once all invocations have completed, or an +error has occurred. If it completes by sending values, they are equivalent to +those sent by the input sender.

      +

      No instance of function will begin executing until the returned sender is +started. Each invocation of function runs in an execution agent whose forward +progress guarantees are determined by the scheduler on which they are run. All +agents created by a single use of bulk execute with the same guarantee. The +number of execution agents used by bulk is not specified. This allows a +scheduler to execute some invocations of the function in parallel.

      +

      In this proposal, only integral types are used to specify the shape of the bulk +section. We expect that future papers may wish to explore extensions of the +interface that support additional kinds of shapes, such as multi-dimensional +grids, that are commonly used for parallel computing tasks.
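As a rough sketch of the semantics described above, the following toy function invokes a callable once per index of an integral shape and then passes the input values through. `toy_bulk` is a hypothetical name; the loop is sequential, whereas a real scheduler is free to run the invocations in parallel, and errors are not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only (not the proposed API): call fn(i, data) for every index i
// in [0, shape), then forward the (possibly modified) input values.
template <class F>
std::vector<int> toy_bulk(std::vector<int> data, std::size_t shape, F fn) {
    assert(shape <= data.size());  // assumption of this sketch, not of bulk
    for (std::size_t i = 0; i < shape; ++i) {
        fn(i, data);
    }
    return data;
}

// Multiplies each element by 10, one invocation per index.
inline std::vector<int> demo_bulk() {
    return toy_bulk({1, 2, 3}, 3,
                    [](std::size_t i, std::vector<int>& d) { d[i] *= 10; });
}
```

Each invocation touches only its own index, which is the property that would make the invocations safe to run concurrently on a parallel scheduler.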

      4.20.10. execution::split

      execution::sender auto split(execution::sender auto sender);
       
      -

      If the provided sender is a multi-shot sender, returns that sender. Otherwise, returns a multi-shot sender which sends values equivalent to the values sent by the provided sender. See § 4.7 Senders can be either multi-shot or single-shot.

      +

      If the provided sender is a multi-shot sender, returns that sender. Otherwise, +returns a multi-shot sender which sends values equivalent to the values sent by +the provided sender. See § 4.7 Senders can be either multi-shot or single-shot.

      4.20.11. execution::when_all

      execution::sender auto when_all(
           execution::sender auto ...inputs
      @@ -5094,7 +5658,14 @@ 

      when_all returns a sender that completes once all of the input senders have completed. It is constrained to only accept senders that can complete with a single set of values (_i.e._, it only calls one overload of set_value on its receiver). The values sent by this sender are the values sent by each of the input senders, in order of the arguments passed to when_all. It completes inline on the execution resource on which the last input sender completes, unless stop is requested before when_all is started, in which case it completes inline within the call to start.

      +

      when_all returns a sender that completes once all of the input senders have +completed. It is constrained to only accept senders that can complete with a +single set of values (_i.e._, it only calls one overload of set_value on its +receiver). The values sent by this sender are the values sent by each of the +input senders, in order of the arguments passed to when_all. It completes +inline on the execution resource on which the last input sender completes, +unless stop is requested before when_all is started, in which case it +completes inline within the call to start.

      when_all_with_variant does the same, but it adapts all the input senders using into_variant, and so it does not constrain the input arguments as when_all does.

      The returned sender has no completion schedulers.

      execution::scheduler auto sched = thread_pool.scheduler();
      @@ -5117,40 +5688,63 @@ 

      ensure_started returns, it is known that the provided sender has been connected and start has been called on the resulting operation state (see § 5.2 Operation states represent work); in other words, the work described by the provided sender has been submitted -for execution on the appropriate execution resources. Returns a sender which completes when the provided sender completes and sends values equivalent to those of the provided sender.

      -

      If the returned sender is destroyed before execution::connect() is called, or if execution::connect() is called but the -returned operation-state is destroyed before execution::start() is called, then a stop-request is sent to the eagerly launched -operation and the operation is detached and will run to completion in the background. Its result will be discarded when it -eventually completes.

      -

      Note that the application will need to make sure that resources are kept alive in the case that the operation detaches. -e.g. by holding a std::shared_ptr to those resources or otherwise having some out-of-band way to signal completion of +

      Once ensure_started returns, it is known that the provided sender has been connected and start has been called on the resulting operation +state (see § 5.2 Operation states represent work); in other words, the work described by the +provided sender has been submitted +for execution on the appropriate execution resources. Returns a sender which +completes when the provided sender completes and sends values equivalent to +those of the provided sender.

      +

      If the returned sender is destroyed before execution::connect() is called, or +if execution::connect() is called but the returned operation-state is +destroyed before execution::start() is called, then a stop-request is sent to +the eagerly launched operation and the operation is detached and will run to +completion in the background. Its result will be discarded when it eventually +completes.

      +

      Note that the application will need to make sure that resources are kept alive +in the case that the operation detaches, e.g. by holding a std::shared_ptr to +those resources or otherwise having some out-of-band way to signal completion of the operation so that resource release can be sequenced after the completion.

      4.21. User-facing sender consumers

      -

      A sender consumer is an algorithm that takes one or more senders, which it may execution::connect, as parameters, and does not return a sender.

      +

      A sender consumer is an algorithm that takes one or more senders, which it +may execution::connect, as parameters, and does not return a sender.

      4.21.1. execution::start_detached

      void start_detached(
           execution::sender auto sender
       );
       
      -

      Like ensure_started, but does not return a value; if the provided sender sends an error instead of a value, std::terminate is called.

      +

      Like ensure_started, but does not return a value; if the provided sender sends +an error instead of a value, std::terminate is called.

      4.21.2. this_thread::sync_wait

      auto sync_wait(
           execution::sender auto sender
       ) requires (always-sends-same-values(sender))
           -> std::optional<std::tuple<values-sent-by(sender)>>;
       
      -

      this_thread::sync_wait is a sender consumer that submits the work described by the provided sender for execution, similarly to ensure_started, except that it blocks the current std::thread or thread of main until the work is completed, and returns -an optional tuple of values that were sent by the provided sender on its completion of work. Where § 4.19.1 execution::schedule and § 4.19.2 execution::just are meant to enter the domain of senders, sync_wait is meant to exit the domain of -senders, retrieving the result of the task graph.

      -

      If the provided sender sends an error instead of values, sync_wait throws that error as an exception, or rethrows the original exception if the error is of type std::exception_ptr.

      +

      this_thread::sync_wait is a sender consumer that submits the work described by +the provided sender for execution, similarly to ensure_started, except that it +blocks the current std::thread or thread of main until the work is +completed, and returns an optional tuple of values that were sent by the +provided sender on its completion of work. Where § 4.19.1 execution::schedule and § 4.19.2 execution::just are +meant to enter the domain of senders, sync_wait is meant to exit the domain of senders, retrieving the result of the task graph.

      +

      If the provided sender sends an error instead of values, sync_wait throws that +error as an exception, or rethrows the original exception if the error is of +type std::exception_ptr.

      If the provided sender sends the "stopped" signal instead of values, sync_wait returns an empty optional.

      -

      For an explanation of the requires clause, see § 5.8 All senders are typed. That clause also explains another sender consumer, built on top of sync_wait: sync_wait_with_variant.

      -

      Note: This function is specified inside std::this_thread, and not inside execution. This is because sync_wait has to block the current execution agent, but determining what the current execution agent is is not reliable. Since the standard -does not specify any functions on the current execution agent other than those in std::this_thread, this is the flavor of this function that is being proposed. If C++ ever obtains fibers, for instance, we expect that a variant of this function called std::this_fiber::sync_wait would be provided. We also expect that runtimes with execution agents that use different synchronization mechanisms than std::thread’s will provide their own flavors of sync_wait as well (assuming their execution agents have the means +

      For an explanation of the requires clause, see § 5.8 All senders are typed. That clause +also explains another sender consumer, built on top of sync_wait: sync_wait_with_variant.

      +

      Note: This function is specified inside std::this_thread, and not inside execution. This is because sync_wait has to block the current execution agent, but determining what the current execution agent is is not +reliable. Since the standard does not specify any functions on the current +execution agent other than those in std::this_thread, this is the flavor of +this function that is being proposed. If C++ ever obtains fibers, for instance, +we expect that a variant of this function called std::this_fiber::sync_wait would be provided. We also expect that runtimes with execution agents that use +different synchronization mechanisms than std::thread’s will provide their own +flavors of sync_wait as well (assuming their execution agents have the means to block in a non-deadlock manner).
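The three outcomes of sync_wait described in this section can be summarised with a toy model. The names `outcome`, `stopped_t`, and `toy_sync_wait` are hypothetical; the sketch represents an already-completed operation as a variant, returns `std::optional<int>` rather than an optional tuple, and does no blocking at all.

```cpp
#include <cassert>
#include <exception>
#include <optional>
#include <stdexcept>
#include <variant>

// Toy model only (not the proposed API): one alternative per completion
// channel of an operation that has already finished.
struct stopped_t {};
using outcome = std::variant<int, std::exception_ptr, stopped_t>;

// value   -> engaged optional
// error   -> the stored exception is rethrown
// stopped -> empty optional
inline std::optional<int> toy_sync_wait(const outcome& o) {
    if (const int* v = std::get_if<int>(&o)) {
        return *v;
    }
    if (const auto* e = std::get_if<std::exception_ptr>(&o)) {
        std::rethrow_exception(*e);
    }
    return std::nullopt;
}

// Helper for checking the error path without letting the exception escape.
inline bool toy_sync_wait_throws(const outcome& o) {
    try {
        toy_sync_wait(o);
    } catch (const std::runtime_error&) {
        return true;
    }
    return false;
}
```

A caller can therefore distinguish cancellation (an empty optional) from failure (an exception) without inspecting the sender itself.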

      4.22. execution::execute

      -

      In addition to the three categories of functions presented above, we also propose to include a convenience function for fire-and-forget eager one-way submission of an invocable to a scheduler, to fulfil the role of one-way executors from P0443.

      +

      In addition to the three categories of functions presented above, we also +propose to include a convenience function for fire-and-forget eager one-way +submission of an invocable to a scheduler, to fulfil the role of one-way +executors from P0443.

      void execution::execute(
           execution::scheduler auto sched,
           std::invocable auto fn
      @@ -5163,7 +5757,8 @@ 

      5. Design - implementer side

      5.1. Receivers serve as glue between senders

      -

      A receiver is a callback that supports more than one channel. In fact, it supports three of them:

      +

      A receiver is a callback that supports more than one channel. In fact, it +supports three of them:

      • set_value, which is the moral equivalent of an operator() or a function @@ -5179,21 +5774,33 @@

        std::promise, which provides the first two signals as set_value and set_exception, and it’s possible to emulate the third channel with lifetime management of the promise.

        -

        Receivers are not a part of the end-user-facing API of this proposal; they are necessary to allow unrelated senders communicate with each other, but the only users who will interact with receivers directly are authors of senders.

        +

        Receivers are not a part of the end-user-facing API of this proposal; they are +necessary to allow unrelated senders to communicate with each other, but the only +users who will interact with receivers directly are authors of senders.

        Receivers are what is passed as the second argument to § 5.3 execution::connect.
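To make the three channels concrete, here is a toy receiver together with a producer that completes exactly one of them. This is a sketch only: `recording_receiver`, `complete`, and `run` are hypothetical names, the member-function form is an illustrative choice, and the type is not claimed to satisfy the proposal's receiver concept.

```cpp
#include <cassert>
#include <exception>
#include <string>
#include <utility>

// Toy model only: a callback with three channels that records which one
// was used.
struct recording_receiver {
    std::string* log;  // where the completion is recorded

    void set_value(int v) && { *log = "value:" + std::to_string(v); }
    void set_error(std::exception_ptr) && noexcept { *log = "error"; }
    void set_stopped() && noexcept { *log = "stopped"; }
};

// A hypothetical producer: it calls exactly one completion function.
inline void complete(recording_receiver r, int input, bool stop_requested) {
    if (stop_requested) {
        std::move(r).set_stopped();
        return;
    }
    if (input < 0) {
        std::move(r).set_error(std::make_exception_ptr(std::string("negative")));
        return;
    }
    std::move(r).set_value(input * 2);
}

// Runs the producer and reports which channel completed.
inline std::string run(int input, bool stop_requested) {
    std::string log;
    complete(recording_receiver{&log}, input, stop_requested);
    return log;
}
```

The rvalue-qualified completion functions reflect the idea that a receiver is consumed by its single completion.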

        5.2. Operation states represent work

        -

        An operation state is an object that represents work. Unlike senders, it is not a chaining mechanism; instead, it is a concrete object that packages the work described by a full sender chain, ready to be executed. An operation state is neither movable nor -copyable, and its interface consists of a single algorithm: start, which serves as the submission point of the work represented by a given operation state.

        -

        Operation states are not a part of the user-facing API of this proposal; they are necessary for implementing sender consumers like execution::ensure_started and this_thread::sync_wait, and the knowledge of them is necessary to implement senders, so the only users who will -interact with operation states directly are authors of senders and authors of sender algorithms.

        -

        The return value of § 5.3 execution::connect must satisfy the operation state concept.

        +

        An operation state is an object that represents work. Unlike senders, it is +not a chaining mechanism; instead, it is a concrete object that packages the +work described by a full sender chain, ready to be executed. An operation state +is neither movable nor copyable, and its interface consists of a single +algorithm: start, which serves as the submission point of the work represented +by a given operation state.

        +

        Operation states are not a part of the user-facing API of this proposal; they
        +are necessary for implementing sender consumers like execution::ensure_started and this_thread::sync_wait, and the knowledge of them is necessary to
        +implement senders, so the only users who will interact with operation states
        +directly are authors of senders and authors of sender algorithms.

        +

        The return value of § 5.3 execution::connect must satisfy the operation state
        +concept.

        5.3. execution::connect

        -

        execution::connect is a customization point which connects senders with receivers, resulting in an operation state that will ensure that if start is called that one of the completion operations will be called on the receiver passed to connect.

        +

        execution::connect is a customization point which connects senders with
        +receivers, resulting in an operation state which guarantees that, if start is
        +called, one of the completion operations will be called on the receiver
        +passed to connect.

        execution::sender auto snd = /* some input sender */;
         execution::receiver auto rcv = /* some receiver */;
         execution::operation_state auto state = execution::connect(snd, rcv);
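
The relationship between sender, receiver, and operation state can be illustrated with a minimal, self-contained sketch. It does not use the proposed std::execution facilities; every name in the `toy_connect` namespace (`just_sender`, `store_receiver`, `just_operation`, `connect_then_start_demo`) is hypothetical and exists only for illustration. The sketch assumes C++17, since it relies on guaranteed copy elision to return a non-movable operation state.

```cpp
#include <cassert>
#include <utility>

namespace toy_connect {  // hypothetical names, not the proposed API

// A receiver: the callback object that is notified on completion.
struct store_receiver {
  int* out;
  void set_value(int v) && noexcept { *out = v; }
};

// The operation state packages the sender's data together with the receiver.
// It is neither movable nor copyable and exposes a single operation: start().
template <class Receiver>
struct just_operation {
  int value;
  Receiver rcvr;

  just_operation(int v, Receiver r) : value(v), rcvr(std::move(r)) {}
  just_operation(just_operation&&) = delete;

  // Submission point: completes the receiver with the stored value.
  void start() noexcept { std::move(rcvr).set_value(value); }
};

// A sender that completes with a single int.
struct just_sender {
  int value;

  template <class Receiver>
  just_operation<Receiver> connect(Receiver r) const {
    // Returned as a prvalue, so no move of the operation state is needed.
    return just_operation<Receiver>(value, std::move(r));
  }
};

// Returns true if nothing was delivered before start() and 42 arrived after.
inline bool connect_then_start_demo() {
  int result = 0;
  auto op = just_sender{42}.connect(store_receiver{&result});
  bool idle_before_start = (result == 0);
  op.start();
  return idle_before_start && result == 42;
}

}  // namespace toy_connect
```

Calling `toy_connect::connect_then_start_demo()` exercises the whole path: connecting only builds the operation state, and the receiver's completion operation runs once `start()` is called.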
        @@ -5207,20 +5814,31 @@ 

        5.4. Sender algorithms are customizable

        -

        Senders being able to advertise what their completion schedulers are fulfills one of the promises of senders: that of being able to customize an implementation of a sender algorithm based on what scheduler any work it depends on will complete on.

        -

        The simple way to provide customizations for functions like then, that is for sender adaptors and sender consumers, is to follow the customization scheme that has been adopted for C++20 ranges library; to do that, we would define -the expression execution::then(sender, invocable) to be equivalent to:

        +

        Senders being able to advertise what their completion schedulers are
        +fulfills one of the promises of senders: that of being able to customize an
        +implementation of a sender algorithm based on what scheduler any work it depends
        +on will complete on.

        +

        The simple way to provide customizations for functions like then, that is for sender adaptors and sender consumers, is to follow the customization
        +scheme that has been adopted for the C++20 ranges library; to do that, we would
        +define the expression execution::then(sender, invocable) to be equivalent to:

        1. sender.then(invocable), if that expression is well-formed; otherwise

        2. -

          then(sender, invocable), performed in a context where this call always performs ADL, if that expression is well-formed; otherwise

          +

          then(sender, invocable), performed in a context where this call always
          + performs ADL, if that expression is well-formed; otherwise

        3. -

          a default implementation of then, which returns a sender adaptor, and then define the exact semantics of said adaptor.

          +

          a default implementation of then, which returns a sender adaptor, and
          + then define the exact semantics of said adaptor.

        -

        However, this definition is problematic. Imagine another sender adaptor, bulk, which is a structured abstraction for a loop over an index space. Its default implementation is just a for loop. However, for accelerator runtimes like CUDA, we would like sender algorithms -like bulk to have specialized behavior, which invokes a kernel of more than one thread (with its size defined by the call to bulk); therefore, we would like to customize bulk for CUDA senders to achieve this. However, there’s no reason for CUDA kernels to -necessarily customize the then sender adaptor, as the generic implementation is perfectly sufficient. This creates a problem, though; consider the following snippet:

        +

        However, this definition is problematic. Imagine another sender adaptor, bulk,
        +which is a structured abstraction for a loop over an index space. Its default
        +implementation is just a for loop. However, for accelerator runtimes like CUDA,
        +we would like sender algorithms like bulk to have specialized behavior, which
        +invokes a kernel of more than one thread (with its size defined by the call to bulk); therefore, we would like to customize bulk for CUDA senders to
        +achieve this. However, there’s no reason for CUDA kernels to necessarily
        +customize the then sender adaptor, as the generic implementation is perfectly
        +sufficient. This creates a problem, though; consider the following snippet:

        execution::scheduler auto cuda_sch = cuda_scheduler{};
         
         execution::sender auto initial = execution::schedule(cuda_sch);
        @@ -5234,8 +5852,9 @@ 

        execution::sender auto kernel_sender = execution::bulk(next, shape, [](int i){ ... });

        -

        How can we specialize the bulk sender adaptor for our wrapped schedule_sender? Well, here’s one possible approach, taking advantage of ADL (and the fact that the definition of "associated namespace" also recursively enumerates the associated namespaces of all template -parameters of a type):

        +

        How can we specialize the bulk sender adaptor for our wrapped schedule_sender? Well, here’s one possible approach, taking advantage of ADL
        +(and the fact that the definition of "associated namespace" also recursively
        +enumerates the associated namespaces of all template parameters of a type):

        namespace cuda::for_adl_purposes {
         template<typename... SentValues>
         class schedule_sender {
        @@ -5252,178 +5871,195 @@ 

        } } // namespace cuda::for_adl_purposes

        -

        However, if the input sender is not just a then_sender_adaptor like in the example above, but another sender that overrides bulk by itself, as a member function, because its author believes they know an optimization for bulk - the specialization above will no -longer be selected, because a member function of the first argument is a better match than the ADL-found overload.

        -

        This means that well-meant specialization of sender algorithms that are entirely scheduler-agnostic can have negative consequences. -The scheduler-specific specialization - which is essential for good performance on platforms providing specialized ways to launch certain sender algorithms - would not be selected in such cases. -But it’s really the scheduler that should control the behavior of sender algorithms when a non-default implementation exists, not the sender. Senders merely describe work; schedulers, however, are the handle to the -runtime that will eventually execute said work, and should thus have the final say in how the work is going to be executed.

        -

        Therefore, we are proposing the following customization scheme (also modified to take § 5.9 Ranges-style CPOs vs tag_invoke into account): the expression execution::<sender-algorithm>(sender, args...), for any given sender algorithm that accepts a sender as its first argument, should be -equivalent to:

        +

        However, if the input sender is not just a then_sender_adaptor like in the
        +example above, but another sender that overrides bulk by itself, as a member
        +function, because its author believes they know an optimization for bulk, then the
        +specialization above will no longer be selected, because a member function of
        +the first argument is a better match than the ADL-found overload.

        +

        This means that well-meant specialization of sender algorithms that are entirely
        +scheduler-agnostic can have negative consequences. The scheduler-specific
        +specialization - which is essential for good performance on platforms providing
        +specialized ways to launch certain sender algorithms - would not be selected in
        +such cases. But it’s really the scheduler that should control the behavior of
        +sender algorithms when a non-default implementation exists, not the sender.
        +Senders merely describe work; schedulers, however, are the handle to the runtime
        +that will eventually execute said work, and should thus have the final say in how the work is going to be executed.

        +

        Therefore, we are proposing the following customization scheme: the expression execution::<sender-algorithm>(sender, args...), for any given sender algorithm
        +that accepts a sender as its first argument, should do the following:

        1. -

          tag_invoke(<sender-algorithm>, get_completion_scheduler<Tag>(get_env(sender)), sender, args...), if that expression is well-formed; otherwise

          +

          Create a sender that implements the default implementation of the sender
          + algorithm. That sender is tuple-like; it can be destructured into its
          + constituent parts: algorithm tag, data, and child sender(s).

        2. -

          tag_invoke(<sender-algorithm>, sender, args...), if that expression is well-formed; otherwise

          +

          We query the child sender for its domain. A domain is a tag type
          + associated with the scheduler that the child sender will complete on.
          + If there are multiple child senders, we query all of them for their
          + domains and require that they all be the same.

        3. -

          a default implementation, if there exists a default implementation of the given sender algorithm.

          +

          We use the domain to dispatch to a transform_sender customization, which
          + accepts the sender and optionally performs a domain-specific
          + transformation on it. This customization is expected to return a new
          + sender, which will be returned from <sender-algorithm> in place of the
          + original sender.

        -

        where Tag is one of set_value, set_error, or set_stopped. For most sender algorithms, the completion scheduler for set_value would be used, but for some (like upon_error or let_stopped), one of the others would be used.

        -

        For sender algorithms which accept concepts other than sender as their first argument, we propose that the customization scheme remains as it has been in A Unified Executors Proposal for C++ so far, except it should also use tag_invoke.
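
The three steps of the new scheme (build the default sender, query the child for its domain, dispatch to a transform_sender customization) can be modelled in a few lines. The following is only a sketch under simplifying assumptions: the domain is exposed as a nested `domain` typedef rather than through an environment query, there is a single child sender, and all names in `toy_domain` are hypothetical rather than part of the proposal. It shows that wrapping a CUDA sender in a generic adaptor does not hide the CUDA-specific bulk customization, because the wrapper propagates its child's domain.

```cpp
#include <cassert>
#include <string>
#include <utility>

namespace toy_domain {  // hypothetical model, not the proposed API

struct bulk_tag {};

// Step 1: the default bulk sender, made of an algorithm tag, data, and a child.
template <class Child>
struct bulk_sender {
  bulk_tag tag;
  int shape;
  Child child;
  std::string describe() const { return "generic loop over " + std::to_string(shape); }
};

// A domain-specific replacement for the default bulk sender.
template <class Child>
struct cuda_bulk_sender {
  int shape;
  Child child;
  std::string describe() const { return "kernel launch over " + std::to_string(shape); }
};

// Fallback domain: every sender is returned unchanged.
struct default_domain {
  template <class Sender>
  static Sender transform_sender(Sender s) { return s; }
};

// CUDA domain: rewrites bulk senders and leaves everything else alone.
struct cuda_domain {
  template <class Child>
  static cuda_bulk_sender<Child> transform_sender(bulk_sender<Child> s) {
    return {s.shape, std::move(s.child)};
  }
  template <class Sender>
  static Sender transform_sender(Sender s) { return s; }
};

struct cpu_schedule_sender { using domain = default_domain; };
struct cuda_schedule_sender { using domain = cuda_domain; };

// A generic adaptor that does not customize anything but forwards the domain.
template <class Child>
struct then_sender {
  using domain = typename Child::domain;
  Child child;
};

template <class Child>
auto bulk(Child child, int shape) {
  bulk_sender<Child> dflt{bulk_tag{}, shape, std::move(child)};  // step 1
  using domain = typename Child::domain;                         // step 2
  return domain::transform_sender(std::move(dflt));              // step 3
}

inline std::string cuda_demo() {
  return bulk(then_sender<cuda_schedule_sender>{}, 1024).describe();
}
inline std::string cpu_demo() {
  return bulk(then_sender<cpu_schedule_sender>{}, 8).describe();
}

}  // namespace toy_domain
```

In this model the decision about how bulk executes rests with the domain associated with the scheduler, not with whichever sender happens to be the first argument.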

        5.5. Sender adaptors are lazy

        -

        Contrary to early revisions of this paper, we propose to make all sender adaptors perform strictly lazy submission, unless specified otherwise (the one notable exception in this paper is § 4.20.12 execution::ensure_started, whose sole purpose is to start an -input sender).

        -

        Strictly lazy submission means that there is a guarantee that no work is submitted to an execution resource before a receiver is connected to a sender, and execution::start is called on the resulting operation state.

        +

        Contrary to early revisions of this paper, we propose to make all sender
        +adaptors perform strictly lazy submission, unless specified otherwise (the one
        +notable exception in this paper is § 4.20.12 execution::ensure_started,
        +whose sole purpose is to start an input sender).

        +

        Strictly lazy submission means that there is a guarantee
        +that no work is submitted to an execution resource before a receiver is
        +connected to a sender, and execution::start is called on the resulting
        +operation state.
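
To make the guarantee concrete, here is a small sketch of a lazy `then`-style adaptor. It is a toy model that assumes senders carry a single int and expose a `connect` member; the names in `toy_lazy` are hypothetical and are not the proposed API. Composing the chain and connecting a receiver run no user code; the callable is invoked only once `start()` is called.

```cpp
#include <cassert>
#include <utility>

namespace toy_lazy {  // hypothetical names, not the proposed API

// Source sender: completes with a stored int when its operation is started.
struct just_sender {
  int value;

  template <class Receiver>
  struct operation {
    int value;
    Receiver rcvr;
    void start() { std::move(rcvr).set_value(value); }
  };

  template <class Receiver>
  operation<Receiver> connect(Receiver r) const { return {value, std::move(r)}; }
};

// Adaptor: describes "run fn on the child's result" without running anything.
template <class Child, class Fn>
struct then_sender {
  Child child;
  Fn fn;

  // Receiver handed to the child; applies fn and forwards the result.
  template <class Receiver>
  struct then_receiver {
    Fn fn;
    Receiver rcvr;
    void set_value(int v) && { std::move(rcvr).set_value(fn(v)); }
  };

  template <class Receiver>
  auto connect(Receiver r) const {
    return child.connect(then_receiver<Receiver>{fn, std::move(r)});
  }
};

template <class Child, class Fn>
then_sender<Child, Fn> then(Child c, Fn f) { return {std::move(c), std::move(f)}; }

struct store_receiver {
  int* out;
  void set_value(int v) && { *out = v; }
};

// Returns true if fn did not run before start() and ran exactly once after.
inline bool lazy_demo() {
  int calls = 0;
  int result = 0;
  auto snd = then(just_sender{20}, [&calls](int v) { ++calls; return v + 1; });
  auto op = snd.connect(store_receiver{&result});
  bool nothing_ran = (calls == 0 && result == 0);
  op.start();
  return nothing_ran && calls == 1 && result == 21;
}

}  // namespace toy_lazy
```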

        5.6. Lazy senders provide optimization opportunities

        -

        Because lazy senders fundamentally describe work, instead of describing or representing the submission of said work to an execution resource, and thanks to the flexibility of the customization of most sender algorithms, they provide an opportunity for fusing -multiple algorithms in a sender chain together, into a single function that can later be submitted for execution by an execution resource. There are two ways this can happen.

        -

        The first (and most common) way for such optimizations to happen is thanks to the structure of the implementation: because all the work is done within callbacks invoked on the completion of an earlier sender, recursively up to the original source of computation, -the compiler is able to see a chain of work described using senders as a tree of tail calls, allowing for inlining and removal of most of the sender machinery. In fact, when work is not submitted to execution resources outside of the current thread of execution, -compilers are capable of removing the senders abstraction entirely, while still allowing for composition of functions across different parts of a program.

        -

        The second way for this to occur is when a sender algorithm is specialized for a specific set of arguments. For instance, we expect that, for senders which are known to have been started already, § 4.20.12 execution::ensure_started will be an identity transformation, -because the sender algorithm will be specialized for such senders. Similarly, an implementation could recognize two subsequent § 4.20.9 execution::bulks of compatible shapes, and merge them together into a single submission of a GPU kernel.

        +

        Because lazy senders fundamentally describe work, instead of describing or
        +representing the submission of said work to an execution resource, and thanks to
        +the flexibility of the customization of most sender algorithms, they provide an
        +opportunity for fusing multiple algorithms in a sender chain together, into a
        +single function that can later be submitted for execution by an execution
        +resource. There are two ways this can happen.

        +

        The first (and most common) way for such optimizations to happen is thanks to
        +the structure of the implementation: because all the work is done within
        +callbacks invoked on the completion of an earlier sender, recursively up to the
        +original source of computation, the compiler is able to see a chain of work
        +described using senders as a tree of tail calls, allowing for inlining and
        +removal of most of the sender machinery. In fact, when work is not submitted to
        +execution resources outside of the current thread of execution, compilers are
        +capable of removing the senders abstraction entirely, while still allowing for
        +composition of functions across different parts of a program.

        +

        The second way for this to occur is when a sender algorithm is specialized for a
        +specific set of arguments. For instance, we expect that, for senders which are
        +known to have been started already, § 4.20.12 execution::ensure_started will be an identity transformation, because the sender algorithm will be
        +specialized for such senders. Similarly, an implementation could recognize two
        +subsequent § 4.20.9 execution::bulks of compatible shapes, and merge them
        +together into a single submission of a GPU kernel.

        5.7. Execution resource transitions are two-step

        -

        Because execution::transfer takes a sender as its first argument, it is not actually directly customizable by the target scheduler. This is by design: the target scheduler may not know how to transition from a scheduler such as a CUDA scheduler; -transitioning away from a GPU in an efficient manner requires making runtime calls that are specific to the GPU in question, and the same is usually true for other kinds of accelerators too (or for scheduler running on remote systems). To avoid this problem, -specialized schedulers like the ones mentioned here can still hook into the transition mechanism, and inject a sender which will perform a transition to the regular CPU execution resource, so that any sender can be attached to it.

        -

        This, however, is a problem: because customization of sender algorithms must be controlled by the scheduler they will run on (see § 5.4 Sender algorithms are customizable), the type of the sender returned from transfer must be controllable by the target scheduler. Besides, the target -scheduler may itself represent a specialized execution resource, which requires additional work to be performed to transition to it. GPUs and remote node schedulers are once again good examples of such schedulers: executing code on their execution resources -requires making runtime API calls for work submission, and quite possibly for the data movement of the values being sent by the input sender passed into transfer.

        -

        To allow for such customization from both ends, we propose the inclusion of a secondary transitioning sender adaptor, called schedule_from. This adaptor is a form of schedule, but takes an additional, second argument: the input sender. This adaptor is not -meant to be invoked manually by the end users; they are always supposed to invoke transfer, to ensure that both schedulers have a say in how the transitions are made. Any scheduler that specializes transfer(snd, sch) shall ensure that the -return value of their customization is equivalent to schedule_from(sch, snd2), where snd2 is a successor of snd that sends values equivalent to those sent by snd.

        +

        Because execution::transfer takes a sender as its first argument, it is not
        +actually directly customizable by the target scheduler. This is by design: the
        +target scheduler may not know how to transition from a scheduler such as
        +a CUDA scheduler; transitioning away from a GPU in an efficient manner requires
        +making runtime calls that are specific to the GPU in question, and the same is
        +usually true for other kinds of accelerators too (or for schedulers running on
        +remote systems). To avoid this problem, specialized schedulers like the ones
        +mentioned here can still hook into the transition mechanism, and inject a sender
        +which will perform a transition to the regular CPU execution resource, so that
        +any sender can be attached to it.

        +

        This, however, is a problem: because customization of sender algorithms must be
        +controlled by the scheduler they will run on (see § 5.4 Sender algorithms are customizable),
        +the type of the sender returned from transfer must be controllable by the
        +target scheduler. Besides, the target scheduler may itself represent a
        +specialized execution resource, which requires additional work to be performed
        +to transition to it. GPUs and remote node schedulers are once again good
        +examples of such schedulers: executing code on their execution resources
        +requires making runtime API calls for work submission, and quite possibly for
        +the data movement of the values being sent by the input sender passed into transfer.

        +

        To allow for such customization from both ends, we propose the inclusion of a
        +secondary transitioning sender adaptor, called schedule_from. This adaptor is
        +a form of schedule, but takes an additional, second argument: the input
        +sender. This adaptor is not meant to be invoked manually by the end users; they
        +are always supposed to invoke transfer, to ensure that both schedulers have a
        +say in how the transitions are made. Any scheduler that specializes transfer(snd, sch) shall ensure that the return value of their customization
        +is equivalent to schedule_from(sch, snd2), where snd2 is a successor of snd that sends values equivalent to those sent by snd.

        The default implementation of transfer(snd, sched) is schedule_from(sched, snd).

        5.8. All senders are typed

        -

        All senders must advertise the types they will send when they complete. -This is necessary for a number of features, and writing code in a way that’s -agnostic of whether an input sender is typed or not in common sender adaptors -such as execution::then is hard.

        -

        The mechanism for this advertisement is similar to the one in A Unified Executors Proposal for C++; the -way to query the types is through completion_signatures_of_t<S, [Env]>::value_types<tuple_like, variant_like>.

        -

        completion_signatures_of_t::value_types is a template that takes two -arguments: one is a tuple-like template, the other is a variant-like template. -The tuple-like argument is required to represent senders sending more than one -value (such as when_all). The variant-like argument is required to represent -senders that choose which specific values to send at runtime.

        -

        There’s a choice made in the specification of § 4.21.2 this_thread::sync_wait: it returns a tuple of values sent by the -sender passed to it, wrapped in std::optional to handle the set_stopped signal. However, this assumes that those values can be represented as a tuple, -like here:

        -
        execution::sender auto sends_1 = ...;
        -execution::sender auto sends_2 = ...;
        -execution::sender auto sends_3 = ...;
        -
        -auto [a, b, c] = this_thread::sync_wait(
        -    execution::when_all(
        -        sends_1,
        -        sends_2,
        -        sends_3)
        -    | execution::transfer(
        -        execution::get_completion_scheduler<execution::set_value_t>(get_env(sends_1))),
        -    ).value();
        -// a == 1
        -// b == 2
        -// c == 3
        -
        -

        This works well for senders that always send the same set of arguments. If we ignore the possibility of having a sender that sends different sets of arguments into a receiver, we can specify the "canonical" (i.e. required to be followed by all senders) form of value_types of a sender which sends Types... to be as follows:

        -
        template<template<typename ...> typename TupleLike>
        -using value_types = TupleLike;
        -
        -

        If senders could only ever send one specific set of values, this would probably need to be the required form of value_types for all senders; defining it otherwise would cause very weird results and should be considered a bug.

        -

        This matter is somewhat complicated by the fact that (1) set_value for receivers can be overloaded and accept different sets of arguments, and (2) senders are allowed to send multiple different sets of values, depending on runtime conditions, the data they -consumed, and so on. To accomodate this, A Unified Executors Proposal for C++ also includes a second template parameter to value_types, one that represents a variant-like type. If we permit such senders, we would almost certainly need to require that the canonical form of value_types for all senders (to ensure consistency in how they are handled, and to avoid accidentally interpreting a user-provided variant as a sender-provided one) sending the different sets of arguments Types1..., Types2..., ..., TypesN... to be as follows:

        -
        template<
        -    template<typename ...> typename TupleLike,
        -    template<typename ...> typename VariantLike
        ->
        -using value_types = VariantLike<
        -    TupleLike<Types1...>,
        -    TupleLike<Types2...>,
        -    ...,
        -    TupleLike<Types3...>
        ->;
        -
        -

        This, however, introduces a couple of complications:

        -
          -
        1. -

          A just(1) sender would also need to follow this structure, so the correct type for storing the value sent by it would be std::variant<std::tuple<int>> or some such. This introduces a lot of compile time overhead for the simplest senders, and this overhead -effectively exists in all places in the code where value_types is queried, regardless of the tuple-like and variant-like templates passed to it. Such overhead does exist if only the tuple-like parameter exists, but is made much worse by adding this second -wrapping layer.

          -
        2. -

          As a consequence of (1): because sync_wait needs to store the above type, it can no longer return just a std::tuple<int> for just(1); it has to return std::variant<std::tuple<int>>. C++ currently does not have an easy way to destructure this; it may get -less awkward with pattern matching, but even then it seems extremely heavyweight to involve variants in this API, and for the purpose of generic code, the kind of the return type of sync_wait must be the same across all sender types.

          -
        -

        One possible solution to (2) above is to place a requirement on sync_wait that it can only accept senders which send only a single set of values, therefore removing the need for std::variant to appear in its API; because of this, we propose to expose both sync_wait, which is a simple, user-friendly version of the sender consumer, but requires that value_types have only one possible variant, and sync_wait_with_variant, which accepts any sender, but returns an optional whose value type is the variant of all the -possible tuples sent by the input sender:

        -
        auto sync_wait_with_variant(
        -    execution::sender auto sender
        -) -> std::optional<std::variant<
        -        std::tuple<values0-sent-by(sender)>,
        -        std::tuple<values1-sent-by(sender)>,
        -        ...,
        -        std::tuple<valuesn-sent-by(sender)>
        -    >>;
        -
        -auto sync_wait(
        -    execution::sender auto sender
        -) requires (always-sends-same-values(sender))
        -    -> std::optional<std::tuple<values-sent-by(sender)>>;
        +   

        All senders must advertise the types they will send when they complete. There
        +are many sender adaptors that need this information. Even just transitioning
        +from one execution context to another requires temporarily storing the async
        +result data so it can be propagated in the new execution context. Doing that
        +efficiently requires knowing the type of the data.

        +

        The mechanism a sender uses to advertise its completions is the get_completion_signatures customization point, which takes an environment and
        +must return a specialization of the execution::completion_signatures class
        +template. The template parameters of execution::completion_signatures are a
        +list of function types that represent the completion operations of the sender.
        +For example, the type execution::set_value_t(size_t, const char*) indicates
        +that the sender can complete successfully by passing a size_t and a const char* to the receiver’s set_value function.

        +

        This proposal includes utilities for parsing and manipulating the list of a
        +sender’s completion signatures. For instance, value_types_of_t is a template alias
        +for accessing a sender’s value completions. It takes a sender, an environment,
        +and two variadic template template parameters: a tuple-like template and a
        +variant-like template. You can get the value completions of S and Env with value_types_of_t<S, Env, tuple-like, variant-like>. For example, for a sender that can complete
        +successfully with either Ts... or Us..., value_types_of_t<S, Env, std::tuple, std::variant> would name the type std::variant<std::tuple<Ts...>, std::tuple<Us...>>.
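
The mapping from a list of completion signatures to a variant of tuples can be reproduced with ordinary template metaprogramming. The sketch below is not the proposed implementation: it works directly on a toy `completion_signatures` list instead of a sender and an environment, and the helpers `type_list`, `push_back`, `collect_values`, `apply_list`, and `value_types` are hypothetical names introduced for illustration. It assumes C++17 for `std::variant`.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <type_traits>
#include <variant>

namespace toy_sigs {  // hypothetical model, not the proposed API

struct set_value_t {};
struct set_error_t {};
struct set_stopped_t {};

// Each template argument is a function type such as set_value_t(int).
template <class... Sigs>
struct completion_signatures {};

template <class... Ts>
struct type_list {};

template <class List, class T>
struct push_back;
template <class... Ts, class T>
struct push_back<type_list<Ts...>, T> { using type = type_list<Ts..., T>; };

// Walk the signatures, keeping Tuple<Args...> for every set_value_t(Args...).
template <template <class...> class Tuple, class Acc, class... Sigs>
struct collect_values { using type = Acc; };

template <template <class...> class Tuple, class Acc, class... Args, class... Rest>
struct collect_values<Tuple, Acc, set_value_t(Args...), Rest...>
    : collect_values<Tuple, typename push_back<Acc, Tuple<Args...>>::type, Rest...> {};

// Any other signature (error or stopped) is skipped.
template <template <class...> class Tuple, class Acc, class Other, class... Rest>
struct collect_values<Tuple, Acc, Other, Rest...> : collect_values<Tuple, Acc, Rest...> {};

template <template <class...> class Variant, class List>
struct apply_list;
template <template <class...> class Variant, class... Ts>
struct apply_list<Variant, type_list<Ts...>> { using type = Variant<Ts...>; };

template <class Sigs, template <class...> class Tuple, template <class...> class Variant>
struct value_types;
template <class... Sigs, template <class...> class Tuple, template <class...> class Variant>
struct value_types<completion_signatures<Sigs...>, Tuple, Variant> {
  using type = typename apply_list<
      Variant, typename collect_values<Tuple, type_list<>, Sigs...>::type>::type;
};

// Example: two value completions, one error completion, one stopped completion.
using read_sigs = completion_signatures<set_value_t(std::size_t, const char*),
                                        set_value_t(int),
                                        set_error_t(int),
                                        set_stopped_t()>;

using values = value_types<read_sigs, std::tuple, std::variant>::type;

static_assert(
    std::is_same<values, std::variant<std::tuple<std::size_t, const char*>,
                                      std::tuple<int>>>::value,
    "only the set_value_t signatures contribute alternatives");

inline constexpr std::size_t value_alternatives = std::variant_size<values>::value;

}  // namespace toy_sigs
```

Only the `set_value_t` signatures contribute alternatives; the error and stopped signatures are skipped, which is why the resulting variant has two alternatives rather than four.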

        +

        5.9. Customization points

        +

        Earlier versions of this paper used a dispatching technique known as tag_invoke (see tag_invoke: A general pattern for supporting customisable functions) to allow for customization of basis operations
        +and sender algorithms. This technique used private friend functions named
        +"tag_invoke" that are found by argument-dependent look-up. The tag_invoke overloads are distinguished from each other by their first argument, which is
        +the type of the customization point object being customized. For instance, to
        +customize the execution::set_value operation, a receiver type might do the
        +following:

        +
        struct my_receiver {
        +  friend void tag_invoke(execution::set_value_t, my_receiver&& self, int value) noexcept {
        +    std::cout << "received value: " << value;
        +  }
        +  //...
        +};
         
        -

        5.9. Ranges-style CPOs vs tag_invoke

        -

        The contemporary technique for customization in the Standard Library is customization point objects. A customization point object, will it look for member functions and then for nonmember functions with the same name as the customization point, and calls those if -they match. This is the technique used by the C++20 ranges library, and previous executors proposals (A Unified Executors Proposal for C++ and Towards C++23 executors: A proposal for an initial set of algorithms) intended to use it as well. However, it has several unfortunate consequences:

        -
          -
        1. -

          It does not allow for easy propagation of customization points unknown to the adaptor to a wrapped object, which makes writing universal adapter types much harder - and this proposal uses quite a lot of those.

          -
        2. -

          It effectively reserves names globally. Because neither member names nor ADL-found functions can be qualified with a namespace, every customization point object that uses the ranges scheme reserves the name for all types in all namespaces. This is unfortunate -due to the sheer number of customization points already in the paper, but also ones that we are envisioning in the future. It’s also a big problem for one of the operations being proposed already: sync_wait. We imagine that if, in the future, C++ was to -gain fibers support, we would want to also have std::this_fiber::sync_wait, in addition to std::this_thread::sync_wait. However, because we would want the names to be the same in both cases, we would need to make the names of the customizations not match the -names of the customization points. This is undesirable.

          -
        -

        This paper proposes to instead use the mechanism described in tag_invoke: A general pattern for supporting customisable functions: tag_invoke; the wording for tag_invoke has been incorporated into the proposed specification in this paper.

        -

        In short, instead of using globally reserved names, tag_invoke uses the type of the customization point object itself as the mechanism to find customizations. It globally reserves only a single name - tag_invoke - which itself is used the same way that -ranges-style customization points are used. All other customization points are defined in terms of tag_invoke. For example, the customization for std::this_thread::sync_wait(s) will call tag_invoke(std::this_thread::sync_wait, s), instead of attempting -to invoke s.sync_wait(), and then sync_wait(s) if the member call is not valid.

        -

        Using tag_invoke has the following benefits:

        -
          -
        1. -

          It reserves only a single global name, instead of reserving a global name for every customization point object we define.

          -
        2. -

          It is possible to propagate customizations to a subobject, because the information of which customization point is being resolved is in the type of an argument, and not in the name of the function:

          -
          // forward most customizations to a subobject
          -template<typename Tag, typename ...Args>
          -friend auto tag_invoke(Tag && tag, wrapper & self, Args &&... args) {
          -    return std::forward<Tag>(tag)(self.subobject, std::forward<Args>(args)...);
          -}
          +   

          The tag_invoke technique, although it had its strengths, has been replaced
          +with a new (or rather, a very old) technique that uses explicit concept opt-ins
          +and named member functions. For instance, the execution::set_value operation
          +is now customized by defining a member function named set_value in the
          +receiver type. This technique is more explicit and easier to understand than tag_invoke. This is what a receiver author would do to customize execution::set_value now:

          +
          struct my_receiver {
          +  using receiver_concept = execution::receiver_t;
           
          -// but override one of them with a specific value
          -friend auto tag_invoke(specific_customization_point_t, wrapper & self) {
          -    return self.some_value;
          -}
          +  void set_value(int value) && noexcept {
          +    std::cout << "received value: " << value;
          +  }
          +  //...
          +};
           
          -
        3. -

          It is possible to pass those as template arguments to types, because the information of which customization point is being resolved is in the type. Similarly to how A Unified Executors Proposal for C++ defines a polymorphic executor wrapper which accepts a list of properties it -supports, we can imagine scheduler and sender wrappers that accept a list of queries and operations they support. That list can contain the types of the customization point objects, and the polymorphic wrappers can then specialize those customization points on -themselves using tag_invoke, dispatching to manually constructed vtables containing pointers to specialized implementations for the wrapped objects. For an example of such a polymorphic wrapper, see unifex::any_unique (example).

          -
        +

        The only exception to this is the customization of queries. There is a need to +build queryable adaptors that can forward an open and unknowable set of queries +to some wrapped object. This is done by defining a member function named query in the adaptor type that takes the query CPO object as its first +(and usually only) argument. A queryable adaptor might look like this:

        +
        template <class Query, class Queryable, class... Args>
        +concept query_for =
        +  execution::queryable<Queryable> &&
        +  requires (const Queryable& o, Args&&... args) {
        +    o.query(Query(), (Args&&) args...);
        +  };
        +
        +template<class Allocator = std::allocator<std::byte>,
        +         execution::queryable Base = execution::empty_env>
        +struct with_allocator {
        +  Allocator alloc{};
        +  Base base{};
        +
        +  // Forward unknown queries to the wrapped object:
        +  template<query_for<Base> Query>
        +  decltype(auto) query(Query q) const {
        +    return base.query(q);
        +  }
        +
        +  // Specialize the query for the allocator:
        +  Allocator query(execution::get_allocator_t) const {
        +    return alloc;
        +  }
        +};
        +
        +

        Customization of sender algorithms such as execution::then and execution::bulk is handled differently because they must dispatch based on +where the sender is executing. See the section on § 5.4 Sender algorithms are customizable for +more information.

        6. Specification

        Much of this wording follows the wording of A Unified Executors Proposal for C++.

        -

        § 8 Library introduction [library] is meant to be a diff relative to the wording of the [library] clause of Working Draft, Standard for Programming Language C++.

        -

        § 9 General utilities library [utilities] is meant to be a diff relative to the wording of the [utilities] clause of Working Draft, Standard for Programming Language C++. This diff applies changes from tag_invoke: A general pattern for supporting customisable functions.

        +

        § 9 General utilities library [utilities] is meant to be a diff relative to the wording of the [utilities] clause of Working Draft, Standard for Programming Language C++.

        § 10 Thread support library [thread] is meant to be a diff relative to the wording of the [thread] clause of Working Draft, Standard for Programming Language C++. This diff applies changes from Composable cancellation for sender-based async operations.

        -

        § 11 Execution control library [exec] is meant to be added as a new library clause to the working draft of C++.

        +

        § 11 Execution control library [exec] is meant to be added as a new library clause to the working +draft of C++.

        7. Exception handling [except]

        7.1. Special functions [except.special]

        7.1.1. General [except.special.general]

        7.1.1.1. The std::terminate function [except.terminate]
        -
        At the end of the bulleted list in the Note in paragraph 1, add a new bullet as follows:
        +
        At the end of the bulleted list in the Note in paragraph 1, add +a new bullet as follows:
        • -

          when a callback invocation exits via an exception when requesting stop on a std::stop_source or a std::in_place_stop_source ([stopsource.mem], [stopsource.inplace.mem]), or in -the constructor of std::stop_callback or std::in_place_stop_callback ([stopcallback.cons], [stopcallback.inplace.cons]) when a callback invocation exits -via an exception.

          +

          when a callback invocation exits via an exception when requesting stop on a std::stop_source or a std::in_place_stop_source ([stopsource.mem], +[stopsource.inplace.mem]), or in the constructor of std::stop_callback or std::in_place_stop_callback ([stopcallback.cons], +[stopcallback.inplace.cons]) when a callback invocation exits via an +exception.

        @@ -5441,141 +6077,32 @@

        -
        In subclause [conforming], after [lib.types.movedfrom], -add the following new subclause with suggested stable name [lib.tmpl-heads].
        - -
        - 16.4.6.17 Class template-heads -
          -
        1. -

          If a class template’s template-head is marked with "arguments are not -associated entities"", any template arguments do not contribute to the -associated entities ([basic.lookup.argdep]) of a function call where a -specialization of the class template is an associated entity. In such a case, -the class template can be implemented as an alias template referring to a -templated class, or as a class template where the template arguments -themselves are templated classes.

          -
        2. -

          [Example:

          -
          template<class T> // arguments are not associated entities
          -struct S {};
          -
          -namespace N {
          -  int f(auto);
          -  struct A {};
          -}
          -
          -int x = f(S<N::A>{});  // error: N::f not a candidate
          -
          -

          The template S specified above can be implemented as

          -
          template<class T>
          -struct s-impl {
          -  struct type { };
          -};
          -
          -template<class T>
          -using S = s-impl<T>::type;
          -
          -

          or as

          -
          template<class T>
          -struct hidden {
          -  using type = struct _ {
          -    using type = T;
          -  };
          -};
          -
          -template<class HiddenT>
          -struct s-impl {
          -  using T = HiddenT::type;
          -};
          -
          -template<class T>
          -using S = s-impl<typename hidden<T>::type>;
          -
          -

          -- end example]

          -
        -
        -

        9. General utilities library [utilities]

        9.1. Function objects [function.objects]

        9.1.1. Header <functional> synopsis [functional.syn]

        At the end of this subclause, insert the following declarations into the synopsis within namespace std:

        -
        // expositon only:
        -template<class Fn, class... Args>
        -  concept callable =
        +
        template<class Fn, class... Args>
        +  concept callable =  // exposition only
             requires (Fn&& fn, Args&&... args) {
               std::forward<Fn>(fn)(std::forward<Args>(args)...);
             };
         template<class Fn, class... Args>
        -  concept nothrow-callable =
        +  concept nothrow-callable =   // exposition only
             callable<Fn, Args...> &&
             requires (Fn&& fn, Args&&... args) {
               { std::forward<Fn>(fn)(std::forward<Args>(args)...) } noexcept;
             };
        +// exposition only:
         template<class Fn, class... Args>
           using call-result-t = decltype(declval<Fn>()(declval<Args>()...));
         
        -// [func.tag_invoke], tag_invoke
        -namespace tag-invoke { // exposition only
        -  void tag_invoke();
        -
        -  template<class Tag, class... Args>
        -    concept tag_invocable =
        -      requires (Tag&& tag, Args&&... args) {
        -        tag_invoke(std::forward<Tag>(tag), std::forward<Args>(args)...);
        -      };
        -
        -  template<class Tag, class... Args>
        -    concept nothrow_tag_invocable =
        -      tag_invocable<Tag, Args...> &&
        -      requires (Tag&& tag, Args&&... args) {
        -        { tag_invoke(std::forward<Tag>(tag), std::forward<Args>(args)...) } noexcept;
        -      };
        -
        -  template<class Tag, class... Args>
        -    using tag_invoke_result_t =
        -      decltype(tag_invoke(declval<Tag>(), declval<Args>()...));
        -
        -  template<class Tag, class... Args>
        -    struct tag_invoke_result<Tag, Args...> {
        -      using type =
        -        tag_invoke_result_t<Tag, Args...>; // present if and only if tag_invocable<Tag, Args...> is true
        -    };
        -
        -  struct tag; // exposition only
        -}
        -inline constexpr tag-invoke::tag tag_invoke {};
        -using tag-invoke::tag_invocable;
        -using tag-invoke::nothrow_tag_invocable;
        -using tag-invoke::tag_invoke_result_t;
        -using tag-invoke::tag_invoke_result;
        -
         template<auto& Tag>
           using tag_t = decltype(auto(Tag));
         
        -

        9.1.2. tag_invoke [func.tag_invoke]

        -

        Insert this subclause as a new subclause, between Searchers [func.search] and Class template hash [unord.hash].

        - -
        -
          -
        1. -

          Given a subexpression E, let REIFY(E) be expression-equivalent to -a glvalue with the same type and value as E as if by identity()(E).

          -
        2. -

          The name std::tag_invoke denotes a customization point object [customization.point.object]. -Given subexpressions T and A..., the expression std::tag_invoke(T, A...) is -expression-equivalent [defns.expression-equivalent] to tag_invoke(REIFY(T), REIFY(A)...) with overload resolution performed in a context in which unqualified lookup for tag_invoke finds only the declaration

          -
          void tag_invoke();
          -
          -
        3. -

          [Note: Diagnosable ill-formed cases above result in substitution failure when std::tag_invoke(T, A...) appears in the immediate context of a template instantiation. —end note]

          -
        -
        -

        10. Thread support library [thread]

        10.1. Stop tokens [thread.stoptoken]

        10.1.1. Header <stop_token> synopsis [thread.stoptoken.syn]

        @@ -5658,11 +6185,11 @@

        - LWG directed me to replace T::stop_possible() with t.stop_possible() because -of the recent constexpr changes in P2280R2. However, even with those changes, a nested -requirement like requires (!t.stop_possible()), where t is an argument in the requirement-parameter-list, is ill-formed according to [expr.prim.req.nested/p2]: + LWG directed me to replace T::stop_possible() with t.stop_possible() because of the recent constexpr changes in P2280R2. However, even with those changes, a nested requirement like requires (!t.stop_possible()), where t is an argument in the +requirement-parameter-list, is ill-formed according to [expr.prim.req.nested/p2]:
        -

        A local parameter shall only appear as an unevaluated operand within the constraint-expression.

        +

        A local parameter shall only appear as an unevaluated operand within the +constraint-expression.

        This is the subject of core issue 2517.

        @@ -5671,9 +6198,13 @@

        t and u be distinct, valid objects of type T. The type T models stoppable_token only if:

        1. -

          If t.stop_possible() evaluates to false then, if t and u reference the same logical shared stop state, u.stop_possible() shall also subsequently evaluate to false and u.stop_requested() shall also subsequently evaluate to false.

          +

          If t.stop_possible() evaluates to false then, if t and u reference the same logical shared stop state, u.stop_possible() shall +also subsequently evaluate to false and u.stop_requested() shall also +subsequently evaluate to false.

        2. -

          If t.stop_requested() evaluates to true then, if t and u reference the same logical shared stop state, u.stop_requested() shall also subsequently evaluate to true and u.stop_possible() shall also subsequently evaluate to true.

          +

          If t.stop_requested() evaluates to true then, if t and u reference the same logical shared stop state, u.stop_requested() shall +also subsequently evaluate to true and u.stop_possible() shall also +subsequently evaluate to true.

      • Let t and u be distinct, valid objects of type T and let init be an @@ -5695,20 +6226,27 @@

        t.stop_requested() evaluates to true at the time callback is registered then callback can be invoked on the thread executing cb’s constructor.

      • -

        If callback is invoked then, if t and u reference the same shared stop -state, an evaluation of u.stop_requested() will be true if the beginning of the invocation of callback strongly-happens-before the evaluation of u.stop_requested().

        +

        If callback is invoked then, if t and u reference the same +shared stop state, an evaluation of u.stop_requested() will be true if the beginning of the invocation of callback strongly-happens-before the evaluation of u.stop_requested().

      • -

        [Note: If t.stop_possible() evaluates to false then the construction of cb is not required to construct and initialize callback. --end note]

        +

        If t.stop_possible() evaluates to false then the construction of cb is not required to construct and +initialize callback.

      • -

        Construction of a T::callback_type<CB> instance shall only throw exceptions thrown by the initialization of the CB instance from the value of type Initializer.

        +

        Construction of a T::callback_type<CB> instance shall only throw +exceptions thrown by the initialization of the CB instance from the +value of type Initializer.

      • -

        Destruction of the T::callback_type<CB> object, cb, removes callback from the shared stop state such that callback will not be invoked after the destructor returns.

        +

        Destruction of the T::callback_type<CB> object, cb, removes callback from the shared stop state such that callback will not be +invoked after the destructor returns.

        1. -

          If callback is currently being invoked on another thread then the destructor of cb will block until the invocation of callback returns such that the return from the invocation of callback strongly-happens-before the destruction of callback.

          +

          If callback is currently being invoked on another thread then the +destructor of cb will block until the invocation of callback returns such that the return from the invocation of callback strongly-happens-before the destruction of callback.

        2. -

          Destruction of a callback cb shall not block on the completion of the invocation of some other callback registered with the same shared stop state.

          +

          Destruction of a callback cb shall not block on the completion of +the invocation of some other callback registered with the same shared +stop state.

        @@ -5729,11 +6267,14 @@
        // ...
      • 10.1.4. Class never_stop_token [stoptoken.never]

        -

        Insert a new subclause, Class never_stop_token [stoptoken.never], after subclause Class template stop_callback [stopcallback], as a new subclause of Stop tokens [thread.stoptoken].

        +

        Insert a new subclause, Class never_stop_token [stoptoken.never], after +subclause Class template stop_callback [stopcallback], as a new +subclause of Stop tokens [thread.stoptoken].

        10.1.4.1. General [stoptoken.never.general]
        1. -

          The class never_stop_token provides an implementation of the unstoppable_token concept. It provides a stop token interface, but also provides static information that a stop is never possible nor requested.

          +

          The class never_stop_token provides an implementation of the unstoppable_token concept. It provides a stop token interface, but also +provides static information that a stop is never possible nor requested.

        namespace std
         {
        @@ -5754,12 +6295,16 @@ 
        10.1.5. Class in_place_stop_token [stoptoken.inplace]
        -

        Insert a new subclause, Class in_place_stop_token [stoptoken.inplace], after the subclause added above, as a new subclause of Stop tokens [thread.stoptoken].

        +

        Insert a new subclause, Class in_place_stop_token [stoptoken.inplace], +after the subclause added above, as a new subclause of Stop tokens [thread.stoptoken].

        10.1.5.1. General [stoptoken.inplace.general]
        1. -

          The class in_place_stop_token provides an interface for querying whether a stop request has been made (stop_requested) or can ever be made (stop_possible) using an associated in_place_stop_source object ([stopsource.inplace]). -An in_place_stop_token can also be passed to an in_place_stop_callback ([stopcallback.inplace]) constructor to register a callback to be called when a stop request has been made from an associated in_place_stop_source.

          +

          The class in_place_stop_token provides an interface for querying whether a +stop request has been made (stop_requested) or can ever be made +(stop_possible) using an associated in_place_stop_source object +([stopsource.inplace]). An in_place_stop_token can also be passed to an in_place_stop_callback ([stopcallback.inplace]) constructor to register a +callback to be called when a stop request has been made from an associated in_place_stop_source.

        namespace std {
           class in_place_stop_token {
        @@ -5804,8 +6349,9 @@ 
        return source_ != nullptr && source_->stop_requested();

      • -

        [Note: The behavior of stop_requested() is undefined unless the call -strongly happens before the start of the destructor of the associated in_place_stop_source, if any ([basic.life]). --end note]

        +

        The behavior of stop_requested() is undefined unless +the call strongly happens before the start of the destructor of the +associated in_place_stop_source, if any ([basic.life]).

        [[nodiscard]] bool stop_possible() const noexcept;
         
        @@ -5813,9 +6359,10 @@
        return source_ != nullptr;

      • -

        [Note: The behavior of stop_possible() is implementation-defined unless -the call strongly happens before the end of the storage duration of the -associated in_place_stop_source object, if any ([basic.stc.general]). --end note]

        +

        The behavior of stop_possible() is +implementation-defined unless the call strongly happens before the end of +the storage duration of the associated in_place_stop_source object, if any +([basic.stc.general]).

        10.1.5.4. Non-member functions [stoptoken.inplace.nonmembers]
        friend void swap(in_place_stop_token& x, in_place_stop_token& y) noexcept;
        @@ -5825,14 +6372,17 @@ 
        x.swap(y).

        10.1.6. Class in_place_stop_source [stopsource.inplace]

        -

        Insert a new subclause, Class in_place_stop_source [stopsource.inplace], after the subclause added above, as a new subclause of Stop tokens [thread.stoptoken].

        +

        Insert a new subclause, Class in_place_stop_source [stopsource.inplace], after the subclause added above, as a new subclause +of Stop tokens [thread.stoptoken].

        10.1.6.1. General [stopsource.inplace.general]
        1. -

          The class in_place_stop_source implements the semantics of making a stop request, without the need for a dynamic allocation of a shared state. -A stop request made on a in_place_stop_source object is visible to all associated in_place_stop_token ([stoptoken.inplace]) objects. -Once a stop request has been made it cannot be withdrawn (a subsequent stop request has no effect). -All uses of in_place_stop_token objects associated with a given in_place_stop_source object must happen before the start of the destructor of that in_place_stop_source object.

          +

          The class in_place_stop_source implements the semantics of making a stop +request, without the need for a dynamic allocation of a shared state. A stop +request made on an in_place_stop_source object is visible to all associated in_place_stop_token ([stoptoken.inplace]) objects. Once a stop request has +been made it cannot be withdrawn (a subsequent stop request has no effect). +All uses of in_place_stop_token objects associated with a given in_place_stop_source object must happen before the start of the destructor +of that in_place_stop_source object.

        namespace std {
           class in_place_stop_source {
        @@ -5853,28 +6403,32 @@ 
      • -

        An instance of in_place_stop_source maintains a list of registered callback invocations. -The registration of a callback invocation either succeeds or fails. When an invocation -of a callback is registered, the following happens atomically:

        +

        An instance of in_place_stop_source maintains a list of registered callback +invocations. The registration of a callback invocation either succeeds or +fails. When an invocation of a callback is registered, the following happens +atomically:

        • -

          The stop state is checked. If stop has not been requested, the callback invocation is -added to the list of registered callback invocations, and registration has succeeded.

          +

          The stop state is checked. If stop has not been requested, the callback + invocation is added to the list of registered callback invocations, + and registration has succeeded.

        • Otherwise, registration has failed.

        -

        When an invocation of a callback is unregistered, the invocation is atomically removed -from the list of registered callback invocations. The removal is not blocked by the concurrent -execution of another callback invocation in the list. If the callback invocation -being unregistered is currently executing, then:

        +

        When an invocation of a callback is unregistered, the invocation is +atomically removed from the list of registered callback invocations. The +removal is not blocked by the concurrent execution of another callback +invocation in the list. If the callback invocation being unregistered is +currently executing, then:

        • -

          If the execution of the callback invocation is happening concurrently on another thread, -the completion of the execution strongly happens before ([intro.races]) the end of the -callback’s lifetime.

          +

          If the execution of the callback invocation is happening concurrently on + another thread, the completion of the execution strongly happens + before ([intro.races]) the end of the callback’s lifetime.

        • -

          Otherwise, the execution is happening on the current thread. Removal of the -callback invocation does not block waiting for the execution to complete.

          +

          Otherwise, the execution is happening on the current thread. Removal of + the callback invocation does not block waiting for the execution to + complete.

        10.1.6.2. Constructors, copy, and assignment [stopsource.inplace.cons]
        @@ -5897,23 +6451,27 @@
      • -

        Returns: true if the stop state inside *this has received a stop request; otherwise, false.

        +

        Returns: true if the stop state inside *this has received a stop +request; otherwise, false.

        bool request_stop() noexcept;
         
        1. -

          Effects: Atomically determines whether the stop state inside *this has received a stop request, and if not, makes a stop request. -The determination and making of the stop request are an atomic read-modify-write operation ([intro.races]). -If the request was made, the registered invocations are executed and the evaluations of the invocations are indeterminately sequenced. -If an invocation of a callback exits via an exception then terminate is invoked ([except.terminate]).

          +

          Effects: Atomically determines whether the stop state inside *this has +received a stop request, and if not, makes a stop request. The determination +and making of the stop request are an atomic read-modify-write operation +([intro.races]). If the request was made, the registered invocations are +executed and the evaluations of the invocations are indeterminately +sequenced. If an invocation of a callback exits via an exception then terminate is invoked ([except.terminate]).

        2. Postconditions: stop_requested() is true.

        3. Returns: true if this call made a stop request; otherwise false.

        10.1.7. Class template in_place_stop_callback [stopcallback.inplace]

        -

        Insert a new subclause, Class template in_place_stop_callback [stopcallback.inplace], after the subclause added above, as a new subclause of Stop tokens [thread.stoptoken].

        +

        Insert a new subclause, Class template in_place_stop_callback [stopcallback.inplace], after the subclause added above, as a new +subclause of Stop tokens [thread.stoptoken].

        10.1.7.1. General [stopcallback.inplace.general]
        1. @@ -5941,11 +6499,14 @@
          in_place_stop_callback is instantiated with an argument for the template parameter Callback that satisfies both invocable and destructible.

          +

          Mandates: in_place_stop_callback is instantiated with an argument for the +template parameter Callback that satisfies both invocable and destructible.

        2. -

          Preconditions: in_place_stop_callback is instantiated with an argument for the template parameter Callback that models both invocable and destructible.

          +

          Preconditions: in_place_stop_callback is instantiated with an argument +for the template parameter Callback that models both invocable and destructible.

        3. -

          Recommended practice: Implementations should use the storage of the in_place_stop_callback objects to store the state necessary for their association with an in_place_stop_source object.

          +

          Recommended practice: Implementations should use the storage of the in_place_stop_callback objects to store the state necessary for their +association with an in_place_stop_source object.

        10.1.7.2. Constructors and destructor [stopcallback.inplace.cons]
        template<class C>
        @@ -5958,14 +6519,15 @@ 
        Callback and C model constructible_from<Callback, C>.

      • -

        Effects: Initializes callback_ with std::forward<C>(cb). -Any in_place_stop_source associated with st becomes associated with *this. -Registers ([stopsource.inplace.general]) the callback invocation std::forward<Callback>(callback_)() with the associated in_place_stop_source, if any. If the registration fails, evaluates -the callback invocation.

        +

        Effects: Initializes callback_ with std::forward<C>(cb). Any in_place_stop_source associated with st becomes associated with *this. Registers ([stopsource.inplace.general]) +the callback invocation std::forward<Callback>(callback_)() with the +associated in_place_stop_source, if any. If the registration fails, +evaluates the callback invocation.

      • Throws: Any exception thrown by the initialization of callback_.

      • -

        Remarks: If evaluating std::forward<Callback>(callback_)() exits via an exception, then terminate is invoked ([except.terminate]).

        +

        Remarks: If evaluating std::forward<Callback>(callback_)() exits via an +exception, then terminate is invoked ([except.terminate]).

        ~in_place_stop_callback();
         
        @@ -6064,16 +6626,10 @@

        This clause makes use of the following exposition-only entities:

        1. -
          template<class Fn, class... Args>
          -    requires callable<Fn, Args...>
          -  constexpr auto mandate-nothrow-call(Fn&& fn, Args&&... args) noexcept
          -    -> call-result-t<Fn, Args...> {
          -    return std::forward<Fn>(fn)(std::forward<Args>(args)...);
          -  }
          -
          +

          For a subexpression expr, let MANDATE-NOTHROW(expr) be expression-equivalent to expr.

          • -

            Mandates: nothrow-callable<Fn, Args...> is true.

            +

            Mandates: noexcept(expr) is true.

        2. template<class T>
          @@ -6121,7 +6677,8 @@ 

          tag_invoke(q, env, args...) is well-formed, then q(env, args...) is expression-equivalent to tag_invoke(q, env, args...).

          +

          If the expression env.query(q, args...) is well-formed, +then it is expression-equivalent to q(env, args...).

        3. Unless otherwise specified, the value returned by the expression q(env, args...) is valid as long as env is valid.

        @@ -6134,9 +6691,9 @@

        queryable concept specifies the constraints on the types of queryable objects.

      • -

        Let env be an object of type Env. The type Env models queryable if for each -callable object q and a pack of subexpressions args, -if requires { q(env, args...) } is true then q(env, args...) meets any semantic requirements imposed by q.

        +

        Let env be an object of type Env. The type Env models queryable if +for each callable object q and a pack of subexpressions args, if requires { q(env, args...) } is true then q(env, args...) meets any semantic requirements imposed +by q.

        11.3. Asynchronous operations [async.ops]

          @@ -6310,17 +6867,13 @@

          // [exec.queryable], queryable objects template<class T> - concept queryable = destructible; + concept queryable = destructible<T>; // [exec.queries], queries - namespace queries { // exposition only - struct forwarding_query_t; - struct get_allocator_t; - struct get_stop_token_t; - } - using queries::forwarding_query_t; - using queries::get_allocator_t; - using queries::get_stop_token_t; + struct forwarding_query_t; + struct get_allocator_t; + struct get_stop_token_t; + inline constexpr forwarding_query_t forwarding_query{}; inline constexpr get_allocator_t get_allocator{}; inline constexpr get_stop_token_t get_stop_token{}; @@ -6337,19 +6890,13 @@

          namespace std::execution { // [exec.queries], queries enum class forward_progress_guarantee; - namespace queries { // exposition only - struct get_domain_t; - struct get_scheduler_t; - struct get_delegatee_scheduler_t; - struct get_forward_progress_guarantee_t; - template<class CPO> - struct get_completion_scheduler_t; - } - using queries::get_domain_t; - using queries::get_scheduler_t; - using queries::get_delegatee_scheduler_t; - using queries::get_forward_progress_guarantee_t; - using queries::get_completion_scheduler_t; + struct get_domain_t; + struct get_scheduler_t; + struct get_delegatee_scheduler_t; + struct get_forward_progress_guarantee_t; + template<class CPO> + struct get_completion_scheduler_t; + inline constexpr get_domain_t get_domain{}; inline constexpr get_scheduler_t get_scheduler{}; inline constexpr get_delegatee_scheduler_t get_delegatee_scheduler{}; @@ -6357,12 +6904,8 @@

          template<class CPO> inline constexpr get_completion_scheduler_t<CPO> get_completion_scheduler{}; - namespace exec-envs { // exposition only - struct empty_env {}; - struct get_env_t; - } - using envs-envs::empty_env; - using envs-envs::get_env_t; + struct empty_env {}; + struct get_env_t; inline constexpr get_env_t get_env {}; template<class T> @@ -6372,49 +6915,40 @@

          struct default_domain; // [exec.sched], schedulers + struct scheduler_t {}; + template<class Sch> concept scheduler = see below; // [exec.recv], receivers struct receiver_t {}; - template<class Rcvr> - inline constexpr bool enable_receiver = see below; - template<class Rcvr> concept receiver = see below; template<class Rcvr, class Completions> concept receiver_of = see below; - namespace receivers { // exposition only - struct set_value_t; - struct set_error_t; - struct set_stopped_t; - } - using receivers::set_value_t; - using receivers::set_error_t; - using receivers::set_stopped_t; + struct set_value_t; + struct set_error_t; + struct set_stopped_t; + inline constexpr set_value_t set_value{}; inline constexpr set_error_t set_error{}; inline constexpr set_stopped_t set_stopped{}; // [exec.opstate], operation states + struct operation_state_t {}; + template<class O> concept operation_state = see below; - namespace op-state { // exposition only - struct start_t; - } - using op-state::start_t; + struct start_t; inline constexpr start_t start{}; // [exec.snd], senders struct sender_t {}; - template<class Sndr> - inline constexpr bool enable_sender = see below; - template<class Sndr> concept sender = see below; @@ -6434,10 +6968,7 @@

          concept single-sender = see below; // exposition only // [exec.getcomplsigs], completion signatures - namespace completion-signatures { // exposition only - struct get_completion_signatures_t; - } - using completion-signatures::get_completion_signatures_t; + struct get_completion_signatures_t; inline constexpr get_completion_signatures_t get_completion_signatures {}; template<class Sndr, class Env = empty_env> @@ -6471,40 +7002,38 @@

          using tag_of_t = see below; // [exec.snd.transform], sender transformations - template<class Domain, sender Sndr> - constexpr sender decltype(auto) transform_sender(Domain dom, Sndr&& sndrv); - - template<class Domain, sender Sndr, queryable Env> - constexpr sender decltype(auto) transform_sender(Domain dom, Sndr&& sndr, const Env& env); + template<class Domain, sender Sndr, queryable... Env> + requires (sizeof...(Env) <= 1) + constexpr sender decltype(auto) transform_sender( + Domain dom, Sndr&& sndr, const Env&... env) noexcept(see below); + // [exec.snd.transform.env], environment transformations template<class Domain, sender Sndr, queryable Env> - constexpr decltype(auto) transform_env(Domain dom, Sndr&& sndr, Env&& env) noexcept; + constexpr queryable decltype(auto) transform_env( + Domain dom, Sndr&& sndr, Env&& env) noexcept; // [exec.snd.apply], sender algorithm application template<class Domain, class Tag, sender Sndr, class... Args> - constexpr decltype(auto) apply_sender(Domain dom, Tag, Sndr&& sndr, Args&&... args) noexcept(see below); + constexpr decltype(auto) apply_sender( + Domain dom, Tag, Sndr&& sndr, Args&&... 
args) noexcept(see below); // [exec.connect], the connect sender algorithm - namespace senders-connect { // exposition only - struct connect_t; - } - using senders-connect::connect_t; + struct connect_t; inline constexpr connect_t connect{}; template<class Sndr, class Rcvr> - using connect_result_t = decltype(connect(declval<Sndr>(), declval<Rcvr>())); + using connect_result_t = + decltype(connect(declval<Sndr>(), declval<Rcvr>())); // [exec.factories], sender factories - namespace sender-factories { // exposition only - struct just_t; - struct just_error_t; - struct just_stopped_t; - struct schedule_t; - } - inline constexpr just just{}; + struct just_t; + struct just_error_t; + struct just_stopped_t; + struct schedule_t; + + inline constexpr just_t just{}; inline constexpr just_error_t just_error{}; inline constexpr just_stopped_t just_stopped{}; - using sender-factories::schedule_t; inline constexpr schedule_t schedule{}; inline constexpr unspecified read{}; @@ -6512,109 +7041,62 @@

          using schedule_result_t = decltype(schedule(declval<Sndr>())); // [exec.adapt], sender adaptors - namespace sender-adaptor-closure { // exposition only - template<class-type D> - struct sender_adaptor_closure { }; - } - using sender-adaptor-closure::sender_adaptor_closure; - - namespace sender-adaptors { // exposition only - struct on_t; - struct transfer_t; - struct schedule_from_t; - struct then_t; - struct upon_error_t; - struct upon_stopped_t; - struct let_value_t; - struct let_error_t; - struct let_stopped_t; - struct bulk_t; - struct split_t; - struct when_all_t; - struct when_all_with_variant_t; - struct into_variant_t; - struct stopped_as_optional_t; - struct stopped_as_error_t; - struct ensure_started_t; - } - using sender-adaptors::on_t; - using sender-adaptors::transfer_t; - using sender-adaptors::schedule_from_t; - using sender-adaptors::then_t; - using sender-adaptors::upon_error_t; - using sender-adaptors::upon_stopped_t; - using sender-adaptors::let_value_t; - using sender-adaptors::let_error_t; - using sender-adaptors::let_stopped_t; - using sender-adaptors::bulk_t; - using sender-adaptors::split_t; - using sender-adaptors::when_all_t; - using sender-adaptors::when_all_with_variant_t; - using sender-adaptors::into_variant_t; - using sender-adaptors::stopped_as_optional_t; - using sender-adaptors::stopped_as_error_t; - using sender-adaptors::ensure_started_t; + template<class-type D> + struct sender_adaptor_closure { }; + + struct on_t; + struct transfer_t; + struct schedule_from_t; + struct then_t; + struct upon_error_t; + struct upon_stopped_t; + struct let_value_t; + struct let_error_t; + struct let_stopped_t; + struct bulk_t; + struct split_t; + struct ensure_started_t; + struct when_all_t; + struct when_all_with_variant_t; + struct into_variant_t; + struct stopped_as_optional_t; + struct stopped_as_error_t; inline constexpr on_t on{}; inline constexpr transfer_t transfer{}; inline constexpr schedule_from_t schedule_from{}; - inline 
constexpr then_t then{}; inline constexpr upon_error_t upon_error{}; inline constexpr upon_stopped_t upon_stopped{}; - inline constexpr let_value_t let_value{}; inline constexpr let_error_t let_error{}; inline constexpr let_stopped_t let_stopped{}; - inline constexpr bulk_t bulk{}; - inline constexpr split_t split{}; + inline constexpr ensure_started_t ensure_started{}; inline constexpr when_all_t when_all{}; inline constexpr when_all_with_variant_t when_all_with_variant{}; - inline constexpr into_variant_t into_variant{}; - inline constexpr stopped_as_optional_t stopped_as_optional; - inline constexpr stopped_as_error_t stopped_as_error; - inline constexpr ensure_started_t ensure_started{}; - // [exec.consumers], sender consumers - namespace sender-consumers { // exposition only - struct start_detached_t; - } - using sender-consumers::start_detached_t; + struct start_detached_t; inline constexpr start_detached_t start_detached{}; // [exec.utils], sender and receiver utilities - // [exec.utils.rcvr.adptr] - template< - class-type Derived, - receiver Base = unspecified> // arguments are not associated entities ([lib.tmpl-heads]) - class receiver_adaptor; - + // [exec.utils.cmplsigs] template<class Fn> concept completion-signature = // exposition only see below; - // [exec.utils.cmplsigs] template<completion-signature... Fns> struct completion_signatures {}; - template<class... Args> // exposition only - using default-set-value = - completion_signatures<set_value_t(Args...)>; - - template<class Err> // exposition only - using default-set-error = - completion_signatures<set_error_t(Err)>; - template<class Sigs> // exposition only concept valid-completion-signatures = see below; - // [exec.utils.mkcmplsigs] + // [exec.utils.tfxcmplsigs] template< valid-completion-signatures InputSignatures, valid-completion-signatures AdditionalSignatures = completion_signatures<>, @@ -6633,57 +7115,40 @@

          requires sender_in<Sndr, Env> using transform_completion_signatures_of = transform_completion_signatures< - completion_signatures_of_t<Sndr, Env>, AdditionalSignatures, SetValue, SetError, SetStopped>; + completion_signatures_of_t<Sndr, Env>, + AdditionalSignatures, SetValue, SetError, SetStopped>; // [exec.ctx], execution resources + // [exec.run.loop], run_loop class run_loop; } namespace std::this_thread { // [exec.queries], queries - namespace queries { // exposition only - struct execute_may_block_caller_t; - } - using queries::execute_may_block_caller_t; + struct execute_may_block_caller_t; inline constexpr execute_may_block_caller_t execute_may_block_caller{}; - namespace this-thread { // exposition only - struct sync-wait-env; // exposition only - template<class Sndr> - requires sender_in<Sndr, sync-wait-env> - using sync-wait-result-type = see below; // exposition only - template<class Sndr> - using sync-wait-with-variant-result-type = see below; // exposition only + struct sync_wait_t; + struct sync_wait_with_variant_t; - struct sync_wait_t; - struct sync_wait_with_variant_t; - } - using this-thread::sync_wait_t; - using this-thread::sync_wait_with_variant_t; inline constexpr sync_wait_t sync_wait{}; inline constexpr sync_wait_with_variant_t sync_wait_with_variant{}; } namespace std::execution { // [exec.execute], one-way execution - namespace execute { // exposition only - struct execute_t; - } - using execute::execute_t; + struct execute_t; inline constexpr execute_t execute{}; // [exec.as.awaitable] - namespace coro-utils { // exposition only - struct as_awaitable_t; - } - using coro-utils::as_awaitable_t; + struct as_awaitable_t; inline constexpr as_awaitable_t as_awaitable; // [exec.with.awaitable.senders] template<class-type Promise> struct with_awaitable_senders; } -

      • +
        1. The exposition-only type variant-or-empty<Ts...> is @@ -6713,7 +7178,8 @@

          mandate-nothrow-call(tag_invoke, forwarding_query, q) if that expression is well-formed.

          +

          MANDATE-NOTHROW(q.query(forwarding_query)) if that +expression is well-formed.

          • Mandates: The expression above has type bool and is a core @@ -6730,14 +7196,15 @@

            get_allocator asks an object for its associated allocator.

          • -

            The name get_allocator denotes a query object. For some subexpression env, get_allocator(env) is expression-equivalent to mandate-nothrow-call(tag_invoke, get_allocator, as_const(env)).

            +

            The name get_allocator denotes a query object. For a subexpression env, get_allocator(env) is expression-equivalent to MANDATE-NOTHROW(as_const(env).query(get_allocator)).

            • -

              Mandates: The type of the expression above -satisfies Allocator.

              +

              Mandates: If the expression above is well-formed, its type + satisfies Allocator.

          • -

            forwarding_query(get_allocator) is true.

            +

            forwarding_query(get_allocator) is a core constant +expression and has value true.

          • get_allocator() (with no arguments) is expression-equivalent to execution::read(get_allocator) ([exec.read]).

        @@ -6746,10 +7213,10 @@

        get_stop_token asks an object for an associated stop token.

      • -

        The name get_stop_token denotes a query object. For some subexpression env, get_stop_token(env) is expression-equivalent to:

        +

        The name get_stop_token denotes a query object. For a subexpression env, get_stop_token(env) is expression-equivalent to:

        1. -

          mandate-nothrow-call(tag_invoke, get_stop_token, as_const(env)), if this expression is well-formed.

          +

          MANDATE-NOTHROW(as_const(env).query(get_stop_token)) if that expression is well-formed.

          • Mandates: The type of the expression above satisfies stoppable_token.

            @@ -6766,10 +7233,10 @@

            11.5.4. execution::get_env [exec.get.env]

            1. -

              get_env is a customization point object. For some subexpression o of type O, get_env(o) is expression-equivalent to

              +

              execution::get_env is a customization point object. For a subexpression o, execution::get_env(o) is expression-equivalent to:

              1. -

                tag_invoke(get_env, const_cast<const O&>(o)) if that expression is +

                as_const(o).get_env() if that expression is well-formed.

                • @@ -6782,16 +7249,16 @@

                  get_env(o) shall be valid while o is valid.

                • -

                  When passed a sender object, get_env returns the sender’s attributes. When -passed a receiver, get_env returns the receiver’s environment.

                  +

                  When passed a sender object, get_env returns the +sender’s attributes. When passed a receiver, get_env returns the +receiver’s environment.

              11.5.5. execution::get_domain [exec.get.domain]

              1. get_domain asks an object for an associated execution domain tag.

              2. -

                The name get_domain denotes a query object. For some subexpression env, get_domain(env) is expression-equivalent to mandate-nothrow-call(tag_invoke, get_domain, as_const(env)), -if this expression is well-formed.

                +

                The name get_domain denotes a query object. For a subexpression env, get_domain(env) is expression-equivalent to MANDATE-NOTHROW(as_const(env).query(get_domain)).

              3. forwarding_query(execution::get_domain) is a core constant expression and has value true.

                @@ -6803,11 +7270,12 @@

                get_scheduler asks an object for its associated scheduler.

              4. -

                The name get_scheduler denotes a query object. For some -subexpression env, get_scheduler(env) is expression-equivalent to mandate-nothrow-call(tag_invoke, get_scheduler, as_const(env)).

                +

                The name get_scheduler denotes a query object. For a +subexpression env, get_scheduler(env) is expression-equivalent to MANDATE-NOTHROW(as_const(env).query(get_scheduler)).

                • -

                  Mandates: The type of the expression above satisfies scheduler.

                  +

                  Mandates: If the expression above is well-formed, its type + satisfies scheduler.

              5. forwarding_query(execution::get_scheduler) is a core constant @@ -6818,13 +7286,15 @@

                11.5.7. execution::get_delegatee_scheduler [exec.get.delegatee.scheduler]

                1. -

                  get_delegatee_scheduler asks an object for a scheduler that can be used to delegate work to for the purpose of forward progress delegation.

                  +

                  get_delegatee_scheduler asks an object for a scheduler to which work can be +delegated for the purpose of forward progress delegation.

                2. -

                  The name get_delegatee_scheduler denotes a query object. For some -subexpression env, get_delegatee_scheduler(env) is expression-equivalent to mandate-nothrow-call(tag_invoke, get_delegatee_scheduler, as_const(env)).

                  +

                  The name get_delegatee_scheduler denotes a query object. For a +subexpression env, get_delegatee_scheduler(env) is expression-equivalent to MANDATE-NOTHROW(as_const(env).query(get_delegatee_scheduler)).

                  • -

                    Mandates: The type of the expression above is satisfies scheduler.

                    +

                    Mandates: If the expression above is well-formed, its type + satisfies scheduler.

                3. forwarding_query(execution::get_delegatee_scheduler) is a core @@ -6841,13 +7311,18 @@

                  get_forward_progress_guarantee asks a scheduler about the forward progress guarantees of execution agents created by that scheduler.

                  +

                  get_forward_progress_guarantee asks a scheduler about the forward progress +guarantee of execution agents created by that scheduler.

                4. -

                  The name get_forward_progress_guarantee denotes a query object. For some subexpression sch, let Sch be decltype((sch)). If Sch does not satisfy scheduler, get_forward_progress_guarantee is ill-formed. -Otherwise, get_forward_progress_guarantee(sch) is expression-equivalent to:

                  +

                  The name get_forward_progress_guarantee denotes a query object. For a +subexpression sch, let Sch be decltype((sch)). If Sch does not +satisfy scheduler, get_forward_progress_guarantee is ill-formed. +Otherwise, get_forward_progress_guarantee(sch) is expression-equivalent +to:

                  1. -

                    mandate-nothrow-call(tag_invoke, get_forward_progress_guarantee, as_const(sch)), if this expression is well-formed.

                    +

                    MANDATE-NOTHROW(as_const(sch).query(get_forward_progress_guarantee)), +if this expression is well-formed.

                    • Mandates: The type of the expression above is forward_progress_guarantee.

                      @@ -6856,17 +7331,27 @@

                      forward_progress_guarantee::weakly_parallel.

                5. -

                  If get_forward_progress_guarantee(sch) for some scheduler sch returns forward_progress_guarantee::concurrent, all execution agents created by that scheduler shall provide the concurrent forward progress guarantee. If it returns forward_progress_guarantee::parallel, all execution agents created by that scheduler shall provide at least the parallel forward progress guarantee.

                  +

                  If get_forward_progress_guarantee(sch) for some scheduler sch returns forward_progress_guarantee::concurrent, all execution agents created by +that scheduler shall provide the concurrent forward progress guarantee. If +it returns forward_progress_guarantee::parallel, all execution agents +created by that scheduler shall provide at least the parallel forward +progress guarantee.

                11.5.9. this_thread::execute_may_block_caller [exec.execute.may.block.caller]

                1. -

                  this_thread::execute_may_block_caller asks a scheduler sch whether a call execute(sch, f) with any invocable f may block the thread where such a call occurs.

                  +

                  this_thread::execute_may_block_caller asks a scheduler sch whether a call execute(sch, f) with any invocable f may block the thread where such a +call occurs.

                2. -

                  The name this_thread::execute_may_block_caller denotes a query object. For some subexpression sch, let Sch be decltype((sch)). If Sch does not satisfy scheduler, this_thread::execute_may_block_caller is ill-formed. Otherwise, this_thread::execute_may_block_caller(sch) is expression-equivalent to:

                  +

                  The name this_thread::execute_may_block_caller denotes a query object. For +a subexpression sch, let Sch be decltype((sch)). If Sch does not +satisfy scheduler, this_thread::execute_may_block_caller is ill-formed. +Otherwise, this_thread::execute_may_block_caller(sch) is +expression-equivalent to:

                  1. -

                    mandate-nothrow-call(tag_invoke, this_thread::execute_may_block_caller, as_const(sch)), if this expression is well-formed.

                    +

                    MANDATE-NOTHROW(as_const(sch).query(this_thread::execute_may_block_caller)), +if this expression is well-formed.

                    • Mandates: The type of the expression above is bool.

                      @@ -6875,7 +7360,8 @@

                      true.

                3. -

                  If this_thread::execute_may_block_caller(sch) for some scheduler sch returns false, no execute(sch, f) call with some invocable f shall block the calling thread.

                  +

                  If this_thread::execute_may_block_caller(sch) for some scheduler sch returns false, no execute(sch, f) call with some invocable f shall +block the calling thread.

                11.5.10. execution::get_completion_scheduler [exec.completion.scheduler]

                  @@ -6884,14 +7370,14 @@

                  get_completion_scheduler denotes a query object template. For some +

                  The name get_completion_scheduler denotes a query object template. For a subexpression q, let Q be decltype((q)). If the template argument Tag in get_completion_scheduler<Tag>(q) is not one of set_value_t, set_error_t, or set_stopped_t, get_completion_scheduler<Tag>(q) is ill-formed. Otherwise, get_completion_scheduler<Tag>(q) is -expression-equivalent to mandate-nothrow-call(tag_invoke, get_completion_scheduler<Tag>, as_const(q)) if this expression is -well-formed.

                  +expression-equivalent to MANDATE-NOTHROW(as_const(q).query(get_completion_scheduler<Tag>)).

                  • -

                    Mandates: The type of the expression above satisfies scheduler.

                    +

                    Mandates: If the expression above is well-formed, its type + satisfies scheduler.

                1. If, for some sender sndr and completion function C that has an associated @@ -6910,12 +7396,20 @@

                  schedule is a customization point object that accepts a scheduler. A valid invocation of schedule is a schedule-expression.

                  template<class Sch>
                  +  concept enable-scheduler = // exposition only
                  +    requires {
                  +      requires derived_from<typename Sch::scheduler_concept, scheduler_t>;
                  +    };
                  +
                  +template<class Sch>
                     concept scheduler =
                  +    enable-scheduler<remove_cvref_t<Sch>> &&
                       queryable<Sch> &&
                  -    requires(Sch&& sch, const get_completion_scheduler_t<set_value_t> tag) {
                  +    requires(Sch&& sch) {
                         { schedule(std::forward<Sch>(sch)) } -> sender;
                  -      { tag_invoke(tag, get_env(
                  -          schedule(std::forward<Sch>(sch)))) } -> same_as<remove_cvref_t<Sch>>;
                  +      { get_completion_scheduler<set_value_t>(
                  +          get_env(schedule(std::forward<Sch>(sch)))) }
                  +            -> same_as<remove_cvref_t<Sch>>;
                       } &&
                       equality_comparable<remove_cvref_t<Sch>> &&
                       copy_constructible<remove_cvref_t<Sch>>;
                  @@ -6952,15 +7446,14 @@ 

                  get_env customization point is used to access a receiver’s associated environment.

                  template<class Rcvr>
                  -  concept is-receiver = // exposition only
                  -    derived_from<typename Rcvr::receiver_concept, receiver_t>;
                  -
                  -template<class Rcvr>
                  -  inline constexpr bool enable_receiver = is-receiver<Rcvr>;
                  +  concept enable-receiver = // exposition only
                  +    requires {
                  +      requires derived_from<typename Rcvr::receiver_concept, receiver_t>;
                  +    };
                   
                   template<class Rcvr>
                     concept receiver =
                  -    enable_receiver<remove_cvref_t<Rcvr>> &&
                  +    enable-receiver<remove_cvref_t<Rcvr>> &&
                       requires(const remove_cvref_t<Rcvr>& rcvr) {
                         { get_env(rcvr) } -> queryable;
                       } &&
                  @@ -6976,16 +7469,18 @@ 

                  enable_receiver to true for cv-unqualified program-defined types that model receiver, and false for types that do not. Such specializations shall be usable in constant -expressions ([expr.const]) and have type const bool.

                  +

                  Class types that are final do not model the receiver concept.

                2. Let rcvr be a receiver and let op_state be an operation state associated with an asynchronous operation created by connecting rcvr with a sender. Let token be a stop token equal to get_stop_token(get_env(rcvr)). token shall @@ -7001,28 +7496,35 @@

                  set_value is a value completion function ([async.ops]). Its associated completion tag is set_value_t. The expression set_value(rcvr, vs...) for -some subexpression rcvr and pack of subexpressions vs is ill-formed if rcvr is an lvalue or a const rvalue. Otherwise, it is expression-equivalent to mandate-nothrow-call(tag_invoke, set_value, rcvr, vs...).

                  +a subexpression rcvr and pack of subexpressions vs is ill-formed if rcvr is an lvalue or a const rvalue. Otherwise, it is expression-equivalent to MANDATE-NOTHROW(rcvr.set_value(vs...)).

                11.7.3. execution::set_error [exec.set.error]

                1. set_error is an error completion function. Its associated completion tag is set_error_t. The expression set_error(rcvr, err) for some subexpressions rcvr and err is ill-formed if rcvr is an lvalue or a const rvalue. Otherwise, it is -expression-equivalent to mandate-nothrow-call(tag_invoke, set_error, rcvr, err).

                  +expression-equivalent to MANDATE-NOTHROW(rcvr.set_error(err)).

                11.7.4. execution::set_stopped [exec.set.stopped]

                1. set_stopped is a stopped completion function. Its associated completion tag -is set_stopped_t. The expression set_stopped(rcvr) for some subexpression rcvr is ill-formed if rcvr is an lvalue or a const rvalue. Otherwise, it is -expression-equivalent to mandate-nothrow-call(tag_invoke, set_stopped, rcvr).

                  +is set_stopped_t. The expression set_stopped(rcvr) for a subexpression rcvr is ill-formed if rcvr is an lvalue or a const rvalue. Otherwise, it is +expression-equivalent to MANDATE-NOTHROW(rcvr.set_stopped()).

                11.8. Operation states [exec.opstate]

                1. The operation_state concept defines the requirements of an operation state type ([async.ops]).

                  -
                  template<class O>
                  +
                  template<class O>
                  +  concept enable-opstate = // exposition only
                  +    requires {
                  +      requires derived_from<typename O::operation_state_concept, operation_state_t>;
                  +    };
                  +
                  +template<class O>
                     concept operation_state =
                  +    enable-opstate<remove_cvref_t<O>> &&
                       queryable<O> &&
                       is_object_v<O> &&
                       requires (O& o) {
                  @@ -7040,13 +7542,13 @@ 

                  start denotes a customization point object that starts ([async.ops]) the asynchronous operation associated with the operation state -object. The expression start(O) for some subexpression O is ill-formed -if O is an rvalue. Otherwise, it is expression-equivalent to:

                  -
                  mandate-nothrow-call(tag_invoke, start, O)
                  +object. For a subexpression op, the expression start(op) is ill-formed
                  +if op is an rvalue. Otherwise, it is expression-equivalent to:

                  +
                  MANDATE-NOTHROW(op.start())
                   
                2. -

                  If the function selected by tag_invoke does not start the asynchronous -operation associated with the operation state O, the behavior of calling start(O) is undefined.

                  +

                  If op.start() does not start the asynchronous operation associated with the +operation state op, the behavior of calling start(op) is undefined.

                11.9. Senders [exec.snd]

                11.9.1. General [exec.snd.general]

                @@ -7076,22 +7578,22 @@

                env, let FWD-ENV(env) be a queryable object such that for a query object q and a pack of -subexpressions as, the expression tag_invoke(q, FWD-ENV(env), as...) is ill-formed if forwarding_query(q) is false; -otherwise, it is expression-equivalent to tag_invoke(q, env, as...).

                +subexpressions as, the expression FWD-ENV(env).query(q, as...) is ill-formed if forwarding_query(q) is false; +otherwise, it is expression-equivalent to env.query(q, as...).

              6. For a query object q and a subexpression v, let MAKE-ENV(q, v) be a queryable object env such that -the result of tag_invoke(q, env) has a value equal to v ([concepts.equality]). Unless otherwise stated, the object to which tag_invoke(q, env) refers remains valid while env remains valid.

                +the result of env.query(q) has a value equal to v ([concepts.equality]). Unless otherwise stated, the object to which env.query(q) refers remains valid while env remains valid.

              7. For two queryable objects env1 and env2, a query object q and a -pack of subexpressions as, let JOIN-ENV(env1, env2) be a queryable object env3 such that tag_invoke(q, env3, as...) is expression-equivalent to:

                +pack of subexpressions as, let JOIN-ENV(env1, env2) be a queryable object env3 such that env3.query(q, as...) is expression-equivalent to:

                • -

                  tag_invoke(q, env1, as...) if that expression is well-formed,

                  +

                  env1.query(q, as...) if that expression is well-formed,

                • -

                  otherwise, tag_invoke(q, env2, as...) if that expression is +

                  otherwise, env2.query(q, as...) if that expression is well-formed,

                • -

                  otherwise, tag_invoke(q, env3, as...) is ill-formed.

                  +

                  otherwise, env3.query(q, as...) is ill-formed.

              8. The expansions of FWD-ENV, MAKE-ENV, and JOIN-ENV can be context-dependent; i.e., they can expand to @@ -7099,10 +7601,10 @@

                sch, let SCHED-ATTRS(sch) be a -queryable object o1 such that tag_invoke(get_completion_scheduler<Tag>, o1) is a +queryable object o1 such that o1.query(get_completion_scheduler<Tag>) is a prvalue with the same type and value as sch where Tag is one -of set_value_t or set_stopped_t; and let tag_invoke(get_domain, o1) be expression-equivalent to tag_invoke(get_domain, sch). Let SCHED-ENV(sch) be a queryable object o2 such that tag_invoke(get_scheduler, o2) is a prvalue with the same -type and value as sch, and let tag_invoke(get_domain, o2) be expression-equivalent to tag_invoke(get_domain, sch).

                +of set_value_t or set_stopped_t; and let o1.query(get_domain) be expression-equivalent to sch.query(get_domain). Let SCHED-ENV(sch) be a queryable object o2 such that o2.query(get_scheduler) is a prvalue with the same +type and value as sch, and let o2.query(get_domain) be expression-equivalent to sch.query(get_domain).

              9. For two subexpressions rcvr and expr, let SET-VALUE(rcvr, expr) be (expr, set_value(rcvr)) if the type of expr is void; otherwise, it is set_value(rcvr, expr). Let TRY-EVAL(rcvr, expr) be:

                @@ -7115,44 +7617,42 @@

                expr is potentially-throwing; otherwise, expr. Let TRY-SET-VALUE(rcvr, expr) be TRY-EVAL(rcvr, SET-VALUE(rcvr, expr)) except that rcvr is evaluated only once.

              10. template<class Default = default_domain, class Sndr>
                -constexpr auto completion-domain(const Sndr& sndr) noexcept;
                +  constexpr auto completion-domain(const Sndr& sndr) noexcept;
                 
                1. -

                  Effects: Let COMPL-DOMAIN(T) be the type of the expression get_domain(get_completion_scheduler<T>(get_env(sndr))). If COMPL-DOMAIN(set_value_t), COMPL-DOMAIN(set_error_t), and COMPL-DOMAIN(set_stopped_t) all share a common type -[meta.trans.other] (ignoring those types that are ill-formed), then completion-domain<Default>(sndr) is a default-constructed -prvalue of that type. -Otherwise, if all of those types are ill-formed, completion-domain<Default>(sndr) is a default-constructed -prvalue of type Default. -Otherwise, completion-domain<Default>(sndr) is ill-formed.

                  +

                  Effects: Let COMPL-DOMAIN(T) be the type of the +expression get_domain(get_completion_scheduler<T>(get_env(sndr))). +If COMPL-DOMAIN(set_value_t), COMPL-DOMAIN(set_error_t), and COMPL-DOMAIN(set_stopped_t) all share a common +type [meta.trans.other] (ignoring those types that are ill-formed), +then completion-domain<Default>(sndr) is a +default-constructed prvalue of that type. Otherwise, if all of those +types are ill-formed, completion-domain<Default>(sndr) is a +default-constructed prvalue of type Default. Otherwise, completion-domain<Default>(sndr) is +ill-formed.

              11. template<class Tag, class Env, class Default>
                -constexpr decltype(auto) query-with-default(Tag, const Env& env, Default&& value) noexcept(see below);
                +  constexpr decltype(auto) query-with-default(
                +    Tag, const Env& env, Default&& value) noexcept(see below);
                 
                1. -

                  Effects: Equivalent to:

                  -
                    -
                  • -

                    return Tag()(env); if that expression is well-formed,

                    -
                  • -

                    return static_cast<Default>(std::forward<Default>(value)); otherwise.

                    -
                  +

                  Let e be the expression Tag()(env) if that +expression is well-formed; otherwise, it is static_cast<Default>(std::forward<Default>(value)).

                2. -

                  Remarks: The expression in the noexcept clause is:

                  -
                  is_invocable_v<Tag, const Env&> ? is_nothrow_invocable_v<Tag, const Env&>
                  -                                : is_nothrow_constructible_v<Default, Default>
                  -
                  +

                  Returns: e.

                  +
                3. +

                  Remarks: The expression in the noexcept clause is noexcept(e).
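The behavior specified for query-with-default can be sketched in ordinary C++20 as follows. This is only an illustration under stated assumptions: `get_answer_t`, `env_with_answer`, and `plain_env` are hypothetical names invented for the example, and the query tag forwards to a member `query` in the style this patch adopts. It is not the exposition-only function itself.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

namespace query_sketch {

// Hypothetical query tag: callable only when the environment has a
// matching member `query` (SFINAE on the trailing return type).
struct get_answer_t {
  template <class Env>
  constexpr auto operator()(const Env& env) const noexcept
      -> decltype(env.query(*this)) {
    return env.query(*this);
  }
};

// Mirrors "noexcept(e)": pick the exception specification of whichever
// expression would be selected.
template <class Tag, class Env, class Default>
constexpr bool nothrow_query_with_default() {
  if constexpr (std::is_invocable_v<Tag, const Env&>)
    return std::is_nothrow_invocable_v<Tag, const Env&>;
  else
    return std::is_nothrow_constructible_v<Default, Default>;
}

// Returns Tag()(env) when well-formed, otherwise the forwarded default.
template <class Tag, class Env, class Default>
constexpr decltype(auto) query_with_default(Tag, const Env& env, Default&& value)
    noexcept(nothrow_query_with_default<Tag, Env, Default>()) {
  if constexpr (std::is_invocable_v<Tag, const Env&>)
    return Tag()(env);
  else
    return static_cast<Default>(std::forward<Default>(value));
}

// Hypothetical environments: one answers the query, one does not.
struct env_with_answer {
  constexpr int query(get_answer_t) const noexcept { return 42; }
};
struct plain_env {};

inline void example() {
  assert(query_with_default(get_answer_t{}, env_with_answer{}, 7) == 42);
  assert(query_with_default(get_answer_t{}, plain_env{}, 7) == 7);
}

}  // namespace query_sketch
```

The `if constexpr` branches stand in for the "if that expression is well-formed; otherwise" wording, so the default is never evaluated when the query succeeds.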

              12. template<class Sndr>
                -constexpr auto get-domain-early(const Sndr& sndr) noexcept;
                +  constexpr auto get-domain-early(const Sndr& sndr) noexcept;
                 
                1. -

                  Effects: Equivalent to return Domain(); where Domain is the decayed type of the first of the following -expressions that is well-formed:

                  +

                  Effects: Equivalent to return Domain(); where Domain is the decayed type of the first of the +following expressions that is well-formed:

                  • get_domain(get_env(sndr))

                    @@ -7164,7 +7664,7 @@

                    template<class Sndr, class Env> -constexpr auto get-domain-late(const Sndr& sndr, const Env& env) noexcept; + constexpr auto get-domain-late(const Sndr& sndr, const Env& env) noexcept;

        1. @@ -7196,9 +7696,10 @@

          default_domain().

      -

      The transfer algorithm is unique in that it ignores the -execution domain of its predecessor, using only the domain of its -destination scheduler to select a customization.

      +

      The transfer algorithm is unique in that it +ignores the execution domain of its predecessor, using only the +domain of its destination scheduler to select a +customization.

    • template<callable Fun>
      @@ -7222,14 +7723,14 @@ 

      tuple, optional, and variant.

    • -
      struct on-stop-request {
      -  in_place_stop_source& stop_src;
      -  void operator()() noexcept { stop_src.request_stop(); }
      +
      struct on-stop-request { // exposition only
      +  in_place_stop_source& stop-src; // exposition only
      +  void operator()() noexcept { stop-src.request_stop(); }
       };
       
    • template<class... T>
      -struct product-type {
      +struct product-type {  // exposition only
         using type0 = T0;      // exposition only
         using type1 = T1;      // exposition only
           ...
      @@ -7248,16 +7749,13 @@ 

      template <semiregular Tag, movable-value Data = see below, sender... Child> -constexpr auto make-sender(Tag, Data&& data, Child&&... child); + constexpr auto make-sender(Tag, Data&& data, Child&&... child);

      1. -

        Remarks: The default template argument for the Data template parameter -denotes an unspecified empty trivial class type.

        -
      2. -

        Returns: A prvalue of type basic-sender<Tag, decay_t<Data>, decay_t<Child>...> where the tag member has been default-initialized and the data and childn... members have -been direct initialized from their respective forwarded arguments, where basic-sender is the following exposition-only class template -except as noted below:

        +

        Returns: A prvalue of type basic-sender<Tag, decay_t<Data>, decay_t<Child>...> where the tag member has been default-initialized and the data and childn... members have been direct +initialized from their respective forwarded arguments, where basic-sender is the following exposition-only +class template except as noted below:

        template<class T, class... Us>
         concept one-of = (same_as<T, Us> ||...); // exposition only
         
        @@ -7294,25 +7792,38 @@ 

        Data template parameter +denotes an unspecified empty trivial class type.

      3. It is unspecified whether instances of basic-sender can be aggregate initialized.

      4. -

        An expression of type basic-sender is usable as the -initializer of a structured binding declaration -[dcl.struct.bind].

        +

        An expression of type basic-sender is usable as the initializer of a +structured binding declaration [dcl.struct.bind].

      5. -

        The member default-impls::get-attrs is initialized -with a callable object equivalent to the following lambda:

        +

        The member default-impls::get-attrs is +initialized with a callable object equivalent to the following +lambda:

        [](const auto& data, const auto&... child) noexcept -> decltype(auto) {
           if constexpr (sizeof...(child) == 1)
             return FWD-ENV(get_env(child...)); //
        @@ -7475,17 +7986,18 @@ 

        template<class Sndr> concept is-sender = // exposition only - derived_from<typename Sndr::sender_concept, sender_t>; + requires { + requires derived_from<typename Sndr::sender_concept, sender_t>; + }; template<class Sndr> - inline constexpr bool enable_sender = is-sender<Sndr>; - -template<is-awaitable<env-promise<empty_env>> Sndr> // [exec.awaitables] - inline constexpr bool enable_sender<Sndr> = true; + concept enable-sender = // exposition only + is-sender<Sndr> || + is-awaitable<Sndr, env-promise<empty_env>>; // [exec.awaitables] template<class Sndr> concept sender = - enable_sender<remove_cvref_t<Sndr>> && + bool(enable-sender<remove_cvref_t<Sndr>>) && // atomic constraint requires (const remove_cvref_t<Sndr>& sndr) { { get_env(sndr) } -> queryable; } && @@ -7497,8 +8009,8 @@

        sender<Sndr> && queryable<Env> && requires (Sndr&& sndr, Env&& env) { - { get_completion_signatures(std::forward<Sndr>(sndr), std::forward<Env>(env)) } -> - valid-completion-signatures; + { get_completion_signatures(std::forward<Sndr>(sndr), std::forward<Env>(env)) } + -> valid-completion-signatures; }; template<class Sndr, class Rcvr> @@ -7519,9 +8031,6 @@

        A type Sigs satisfies and models the exposition-only concept valid-completion-signatures if it denotes a specialization of the completion_signatures class template.

        -
      6. -

        Remarks: Pursuant to [namespace.std], users can specialize enable_sender to true for cv-unqualified program-defined types that model sender, and false for types that do not. Such specializations shall be usable in constant -expressions ([expr.const]) and have type const bool.

      7. The exposition-only concepts sender-of and sender-of-in define the requirements for a sender type that completes with a given unique set of value result types.

        @@ -7565,10 +8074,11 @@

        Library-provided sender types:

        • -

          Always expose an overload of a customization of connect that accepts an rvalue sender.

          +

          Always expose an overload of a member connect that accepts an rvalue + sender.

        • -

          Only expose an overload of a customization of connect that - accepts an lvalue sender if they model copy_constructible.

          +

          Only expose an overload of a member connect that accepts an lvalue + sender if they model copy_constructible.

        • Model copy_constructible if they satisfy copy_constructible.

        @@ -7623,21 +8133,26 @@

        p of type Promise, await-result-type<C, Promise> denotes the type decltype(GET-AWAITER(c, p).await_resume()).

      8. Let with-await-transform be the exposition-only class template:

        -
        template<class Derived>
        -struct with-await-transform {
        -  template<class T>
        -  T&& await_transform(T&& value) noexcept {
        -    return std::forward<T>(value);
        -  }
        +
        template<class T, class Promise>
        +  concept has-as-awaitable = // exposition only
        +    requires (T&& t, Promise& p) {
        +      { std::forward<T>(t).as_awaitable(p) } -> is-awaitable<Promise&>;
        +    };
         
        -  template<class T>
        -    requires tag_invocable<as_awaitable_t, T, Derived&>
        -  auto await_transform(T&& value)
        -    noexcept(nothrow_tag_invocable<as_awaitable_t, T, Derived&>)
        -    -> tag_invoke_result_t<as_awaitable_t, T, Derived&> {
        -    return tag_invoke(as_awaitable, std::forward<T>(value), static_cast<Derived&>(*this));
        -  }
        -};
        +template<class Derived>
        +  struct with-await-transform {
        +    template<class T>
        +      T&& await_transform(T&& value) noexcept {
        +        return std::forward<T>(value);
        +      }
        +
        +    template<has-as-awaitable<Derived> T>
        +      auto await_transform(T&& value)
        +        noexcept(noexcept(std::forward<T>(value).as_awaitable(declval<Derived&>())))
        +        -> decltype(std::forward<T>(value).as_awaitable(declval<Derived&>())) {
        +        return std::forward<T>(value).as_awaitable(static_cast<Derived&>(*this));
        +      }
        +  };
         
      9. Let env-promise be the exposition-only class template:

        @@ -7650,7 +8165,7 @@

        void return_void() noexcept; coroutine_handle<> unhandled_stopped() noexcept; - friend const Env& tag_invoke(get_env_t, const env-promise&) noexcept; + const Env& get_env() const noexcept; };

      10. Specializations of env-promise are only used for the purpose of type computation; its members need not be @@ -7660,19 +8175,22 @@

        struct default_domain { template <sender Sndr, queryable... Env> requires (sizeof...(Env) <= 1) - static constexpr sender decltype(auto) transform_sender(Sndr&& sndr, const Env&... env) noexcept(see below); + static constexpr sender decltype(auto) transform_sender(Sndr&& sndr, const Env&... env) + noexcept(see below); template <sender Sndr, queryable Env> static constexpr queryable decltype(auto) transform_env(Sndr&& sndr, Env&& env) noexcept; template<class Tag, sender Sndr, class... Args> - static constexpr decltype(auto) apply_sender(Tag, Sndr&& sndr, Args&&... args) noexcept(see below); + static constexpr decltype(auto) apply_sender(Tag, Sndr&& sndr, Args&&... args) + noexcept(see below); };

      11. 11.9.4.1. Static members [exec.domain.default.statics]
        template <sender Sndr, queryable... Env>
             requires (sizeof...(Env) <= 1)
        -  constexpr sender decltype(auto) transform_sender(Sndr&& sndr, const Env&... env) noexcept(see below);
        +  constexpr sender decltype(auto) transform_sender(Sndr&& sndr, const Env&... env)
        +    noexcept(see below);
         
        1. @@ -7695,7 +8213,8 @@
          e.

        template<class Tag, sender Sndr, class... Args>
        -  constexpr decltype(auto) apply_sender(Tag, Sndr&& sndr, Args&&... args) noexcept(see below);
        +  constexpr decltype(auto) apply_sender(Tag, Sndr&& sndr, Args&&... args)
        +    noexcept(see below);
         
        1. @@ -7710,7 +8229,8 @@
          11.9.5. execution::transform_sender [exec.snd.transform]
          template<class Domain, sender Sndr, queryable... Env>
               requires (sizeof...(Env) <= 1)
          -  constexpr sender decltype(auto) transform_sender(Domain dom, Sndr&& sndr, const Env&... env) noexcept(see below);
          +  constexpr sender decltype(auto) transform_sender(Domain dom, Sndr&& sndr, const Env&... env)
          +    noexcept(see below);
           
          1. @@ -7737,7 +8257,8 @@

            11.9.7. execution::apply_sender [exec.snd.apply]

            template<class Domain, class Tag, sender Sndr, class... Args>
            -  constexpr decltype(auto) apply_sender(Domain dom, Tag, Sndr&& sndr, Args&&... args) noexcept(see below);
            +  constexpr decltype(auto) apply_sender(Domain dom, Tag, Sndr&& sndr, Args&&... args)
            +    noexcept(see below);
             
            1. @@ -7754,19 +8275,19 @@

            2. get_completion_signatures is a customization point object. Let sndr be an -expression such that decltype((sndr)) is Sndr, and let env be an expression -such that decltype((env)) is Env. Then get_completion_signatures(sndr, env) is -expression-equivalent to:

              +expression such that decltype((sndr)) is Sndr, and let env be an +expression such that decltype((env)) is Env. Then get_completion_signatures(sndr, env) is expression-equivalent to:

              1. -

                tag_invoke_result_t<get_completion_signatures_t, Sndr, Env>{} if that +

                decltype(sndr.get_completion_signatures(env)){} if that expression is well-formed,

              2. Otherwise, remove_cvref_t<Sndr>::completion_signatures{} if that expression is well-formed,

              3. Otherwise, if is-awaitable<Sndr, env-promise<Env>> is true, then:

                completion_signatures<
                -  SET-VALUE-SIG(await-result-type<Sndr, env-promise<Env>>), // see [exec.snd.concepts]
                +  SET-VALUE-SIG(await-result-type<Sndr,
                +                env-promise<Env>>), // see [exec.snd.concepts]
                   set_error_t(exception_ptr),
                   set_stopped_t()>{}
                 
                @@ -7790,7 +8311,8 @@

                connect denotes a customization point object. For subexpressions sndr and rcvr, let Sndr be decltype((sndr)) and Rcvr be decltype((rcvr)), and let DS and DR be the decayed types of Sndr and Rcvr, respectively.

              4. Let connect-awaitable-promise be the following class:

                -
                struct connect-awaitable-promise : with-await-transform<connect-awaitable-promise> {
                +
                struct connect-awaitable-promise
                +  : with-await-transform<connect-awaitable-promise> {
                   DR& rcvr; // exposition only
                 
                   connect-awaitable-promise(DS&, DR& rcvr) noexcept : rcvr(rcvr) {}
                @@ -7810,14 +8332,15 @@ 

                operation-state-task be the following class:

                struct operation-state-task {
                +  using operation_state_concept = operation_state_t;
                   using promise_type = connect-awaitable-promise;
                   coroutine_handle<> coro; // exposition only
                 
                @@ -7826,8 +8349,8 @@ 

                tag_invoke(connect, sndr, rcvr) if connectable-with-tag-invoke<Sndr, Rcvr> is modeled.

                +

                sndr.connect(rcvr) if that expression is well-formed.

                • -

                  Mandates: The type of the tag_invoke expression above -satisfies operation_state.

                  +

                  Mandates: The type of the expression above satisfies operation_state.

              5. Otherwise, connect-awaitable(sndr, rcvr) if that expression is @@ -7893,16 +8415,16 @@

                schedule obtains a schedule-sender ([async.ops]) from a scheduler.

              6. -

                The name schedule denotes a customization point object. For some +

                The name schedule denotes a customization point object. For a subexpression sch, the expression schedule(sch) is expression-equivalent to:

                1. -

                  tag_invoke(schedule, sch), if that expression is valid. If the function -selected by tag_invoke does not return a sender whose set_value completion scheduler is equivalent to sch, the behavior of calling schedule(sch) is undefined.

                  +

                  sch.schedule() if that expression is valid. If sch.schedule() does +not return a sender whose set_value completion scheduler is equal +to sch, the behavior of calling schedule(sch) is undefined.

                  • -

                    Mandates: The type of the tag_invoke expression above -satisfies sender.

                    +

                    Mandates: The type of sch.schedule() satisfies sender.

                2. Otherwise, schedule(sch) is ill-formed.

                  @@ -7989,30 +8511,6 @@
                  FWD-ENV(get_env(rcvr)). This requirement applies to any sender returned from a function that is selected by the implementation of such sender adaptor.

                  -
                3. -

                  For any sender type, receiver type, operation state type, queryable type, or -coroutine promise type that is part of the implementation of any sender -adaptor in this subclause and that is a class template, the template -arguments do not contribute to the associated entities -([basic.lookup.argdep]) of a function call where a specialization of the -class template is an associated entity.

                  -

                  [Example:

                  -
                  namespace sender-adaptors { // exposition only
                  -  template<class Sch, class Sndr> // arguments are not associated entities ([lib.tmpl-heads])
                  -  class on-sender {
                  -    // ...
                  -  };
                  -
                  -  struct on_t {
                  -    template<scheduler Sch, sender Sndr>
                  -    on-sender<Sch, Sndr> operator()(Sch&& sch, Sndr&& sndr) const {
                  -      // ...
                  -    }
                  -  };
                  -}
                  -inline constexpr sender-adaptors::on_t on{};
                  -
                  -

                  -- end example]

                4. If a sender returned from a sender adaptor specified in this subclause is specified to include set_error_t(Err) among its set of completion signatures @@ -8029,42 +8527,56 @@

                  c(sndr) sndr | c
              7. -

                Given an additional pipeable sender adaptor closure object d, the expression c | d produces another pipeable sender adaptor closure object e:

                -

                e is a perfect forwarding call wrapper ([func.require]) with the following properties:

                +

                Given an additional pipeable sender adaptor closure object d, the +expression c | d produces another pipeable sender adaptor closure object e:

                +

                e is a perfect forwarding call wrapper ([func.require]) with the following +properties:

                • Its target object is an object d2 of type decay_t<decltype((d))> direct-non-list-initialized with d.

                • It has one bound argument entity, an object c2 of type decay_t<decltype((c))> direct-non-list-initialized with c.

                • -

                  Its call pattern is d2(c2(arg)), where arg is the argument used in a function call expression of e.

                  +

                  Its call pattern is d2(c2(arg)), where arg is the argument used in a +function call expression of e.

                -

                The expression c | d is well-formed if and only if the initializations of the state entities of e are all well-formed.

                +

              +

              The expression c | d is well-formed if and only if the initializations of + the state entities of e are all well-formed.

              +
              1. An object t of type T is a pipeable sender adaptor closure object if T models derived_from<sender_adaptor_closure<T>>, T has no other base classes of type sender_adaptor_closure<U> for any other type U, and T does not model sender.

              2. -

                The template parameter D for sender_adaptor_closure can be an incomplete type. Before any expression of type cv D appears as -an operand to the | operator, D shall be complete and model derived_from<sender_adaptor_closure<D>>. The behavior of an expression involving an -object of type cv D as an operand to the | operator is undefined if overload resolution selects a program-defined operator| function.

                +

                The template parameter D for sender_adaptor_closure can be an incomplete +type. Before any expression of type cv D appears as an +operand to the | operator, D shall be complete and model derived_from<sender_adaptor_closure<D>>. The behavior of an expression +involving an object of type cv D as an operand to the | operator is undefined if overload resolution selects a program-defined operator| function.

              3. -

                A pipeable sender adaptor object is a customization point object that accepts a sender as its first argument and returns a sender.

                +

                A pipeable sender adaptor object is a customization point object that +accepts a sender as its first argument and returns a sender.

              4. -

                If a pipeable sender adaptor object accepts only one argument, then it is a pipeable sender adaptor closure object.

                +

                If a pipeable sender adaptor object accepts only one argument, then it is a +pipeable sender adaptor closure object.

              5. -

                If a pipeable sender adaptor object adaptor accepts more than one argument, then let sndr be an expression such that decltype((sndr)) models sender, -let args... be arguments such that adaptor(sndr, args...) is a well-formed expression as specified in the rest of this subclause -([exec.adapt.objects]), and let BoundArgs be a pack that denotes decay_t<decltype((args))>.... The expression adaptor(args...) produces a pipeable sender adaptor closure object f that is a perfect forwarding call wrapper with the following properties:

                +

                If a pipeable sender adaptor object adaptor accepts more than one argument, +then let sndr be an expression such that decltype((sndr)) models sender, let args... be arguments such that adaptor(sndr, args...) is a +well-formed expression as specified in the rest of this subclause +([exec.adapt.objects]), and let BoundArgs be a pack that denotes decay_t<decltype((args))>.... The expression adaptor(args...) produces a +pipeable sender adaptor closure object f that is a perfect forwarding call +wrapper with the following properties:

                • Its target object is a copy of adaptor.

                • Its bound argument entities bound_args consist of objects of types BoundArgs... direct-non-list-initialized with std::forward<decltype((args))>(args)..., respectively.

                • -

                  Its call pattern is adaptor(rcvr, bound_args...), where rcvr is the argument used in a function call expression of f.

                  +

                  Its call pattern is adaptor(rcvr, bound_args...), where rcvr is the +argument used in a function call expression of f.

                -

                The expression adaptor(args...) is well-formed if and only if the initializations of the bound argument entities of the result, as specified above, - are all well-formed.

                +

                The expression adaptor(args...) is well-formed if and only if the +initializations of the bound argument entities of the result, as specified +above, are all well-formed.

              11.9.11.3. execution::on [exec.on]
                @@ -8216,7 +8728,8 @@
                receiver-type denote the following class:

                -
                struct receiver-type : receiver_adaptor<receiver-type> {
                +
                struct receiver-type {
                +  using receiver_concept = receiver_t;
                   state-type* state; // exposition only
                 
                   Rcvr&& base() && noexcept { return std::move(state->rcvr); }
                @@ -8232,6 +8745,19 @@ 
                receiver2 denote the following exposition-only class template:

                template<class Rcvr, class Env>
                -struct receiver2 : receiver_adaptor<receiver2<Rcvr, Env>, Rcvr> {
                +struct receiver2 : Rcvr {
                   explicit receiver2(Rcvr rcvr, Env env)
                -    : receiver2::receiver_adaptor{std::move(rcvr)}, env(std::move(env)) {}
                +    : Rcvr(std::move(rcvr)), env(std::move(env)) {}
                 
                   auto get_env() const noexcept {
                -    return JOIN-ENV(env, FWD-ENV(execution::get_env(this->base())));
                +    const Rcvr& rcvr = *this;
                +    return JOIN-ENV(env, FWD-ENV(execution::get_env(rcvr)));
                   }
                 
                   Env env; // exposition only
                @@ -8614,20 +9141,42 @@ 
                stopped_as_optional maps an input sender’s stopped completion operation into the value completion operation as an empty optional. The input sender’s value completion operation is also converted into an optional. The result is a sender that never completes with stopped, reporting cancellation by completing with an empty optional.

              1. -

                The name stopped_as_optional denotes a customization point object. For some subexpression sndr, let Sndr be decltype((sndr)). +

                The name stopped_as_optional denotes a customization point object. For a subexpression sndr, let Sndr be decltype((sndr)). The expression stopped_as_optional(sndr) is expression-equivalent to:

                transform_sender(
                   get-domain-early(sndr),
                @@ -9146,7 +9695,7 @@ 
                start_detached eagerly starts a sender without the caller needing to manage the lifetimes of any objects.

              2. -

                The name start_detached denotes a customization point object. For some +

                The name start_detached denotes a customization point object. For a subexpression sndr, let Sndr be decltype((sndr)). If sender_in<Sndr, empty_env> is false, start_detached is ill-formed. Otherwise, the expression start_detached(sndr) is expression-equivalent to the following except that sndr is evaluated only once:

                @@ -9162,29 +9711,31 @@
                start_detached(sndr) is undefined.

              3. Let sndr be a subexpression such that Sndr is decltype((sndr)), and let detached-receiver and detached-operation be the following exposition-only -class types:

                -
                struct detached-receiver {
                +class templates:

                +
                template<class Sndr>
                +struct detached-receiver {
                   using receiver_concept = receiver_t;
                -  detached-operation* op; // exposition only
                +  detached-operation<Sndr>* op; // exposition only
                 
                -  friend void tag_invoke(set_value_t, detached-receiver&& self) noexcept { delete self.op; }
                -  friend void tag_invoke(set_error_t, detached-receiver&&, auto&&) noexcept { terminate(); }
                -  friend void tag_invoke(set_stopped_t, detached-receiver&& self) noexcept { delete self.op; }
                -  friend empty_env tag_invoke(get_env_t, const detached-receiver&) noexcept { return {}; }
                +  void set_value() && noexcept { delete op; }
                +  void set_error(auto&&) && noexcept { terminate(); }
                +  void set_stopped() && noexcept { delete op; }
                +  empty_env get_env() const noexcept { return {}; }
                 };
                 
                +template<class Sndr>
                 struct detached-operation {
                -  connect_result_t<Sndr, detached-receiver> op; // exposition only
                +  connect_result_t<Sndr, detached-receiver<Sndr>> op; // exposition only
                 
                   explicit detached-operation(Sndr&& sndr)
                -    : op(connect(std::forward<Sndr>(sndr), detached-receiver{this}))
                +    : op(connect(std::forward<Sndr>(sndr), detached-receiver<Sndr>{this}))
                   {}
                 };
                 
              4. -

                If sender_to<Sndr, detached-receiver> is false, the +

                If sender_to<Sndr, detached-receiver<Sndr>> is false, the expression start_detached.apply_sender(sndr) is ill-formed; otherwise, it is -expression-equivalent to start((new detached-operation(sndr))->op).

                +expression-equivalent to start((new detached-operation<Sndr>(sndr))->op).
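The ownership scheme behind detached-receiver and detached-operation, a heap-allocated operation that its own receiver deletes on completion, can be sketched without the rest of the library. The sketch assumes member-function `connect`, `start`, and `set_value` in the style of this patch; `counting_sender` and its counters are hypothetical and exist only to make the lifetime observable.

```cpp
#include <cassert>
#include <exception>
#include <utility>

namespace detach_sketch {

template <class Sndr>
struct detached_operation;

template <class Sndr>
struct detached_receiver {
  detached_operation<Sndr>* op;

  void set_value() && noexcept { delete op; }
  template <class Error>
  void set_error(Error&&) && noexcept { std::terminate(); }
  void set_stopped() && noexcept { delete op; }
};

template <class Sndr>
struct detached_operation {
  using inner_t = decltype(std::declval<Sndr>().connect(
      std::declval<detached_receiver<Sndr>>()));
  inner_t op;

  // The inner operation is constructed in place from the prvalue
  // returned by connect (guaranteed copy elision).
  explicit detached_operation(Sndr sndr)
      : op(std::move(sndr).connect(detached_receiver<Sndr>{this})) {}
};

// Allocate the operation and start it; the receiver frees it later.
template <class Sndr>
void start_detached(Sndr sndr) {
  (new detached_operation<Sndr>(std::move(sndr)))->op.start();
}

// Hypothetical sender that completes inline and counts lifetime events.
struct counting_sender {
  int* started;
  int* destroyed;

  template <class Rcvr>
  struct operation {
    Rcvr rcvr;
    int* started;
    int* destroyed;

    ~operation() { ++*destroyed; }

    void start() & noexcept {
      ++*started;
      // May delete the enclosing operation; nothing is touched afterwards.
      std::move(rcvr).set_value();
    }
  };

  template <class Rcvr>
  operation<Rcvr> connect(Rcvr rcvr) const {
    return operation<Rcvr>{std::move(rcvr), started, destroyed};
  }
};

inline bool run_example() {
  int started = 0;
  int destroyed = 0;
  start_detached(counting_sender{&started, &destroyed});
  return started == 1 && destroyed == 1;
}

inline void example() { assert(run_example()); }

}  // namespace detach_sketch
```

The caller never sees the operation state: it is created by `new`, and the single completion signal is what releases it.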

              11.9.12.2. this_thread::sync_wait [exec.sync.wait]
                @@ -9196,16 +9747,11 @@
                sync-wait-env be the following exposition-only class type:

                -
                template<class Tag>
                -concept get-sched-query = // exposition only
                -  one-of<Tag, execution::get_scheduler_t, execution::get_delegatee_scheduler_t>;
                -
                -struct sync-wait-env {
                +
                struct sync-wait-env {
                   execution::run_loop* loop; // exposition only
                 
                -  friend auto tag_invoke(get-sched-query auto, sync-wait-env self) noexcept {
                -    return self.loop->get_scheduler();
                -  }
                +  auto query(execution::get_scheduler_t) const noexcept { return loop->get_scheduler(); }
                +  auto query(execution::get_delegatee_scheduler_t) const noexcept { return loop->get_scheduler(); }
                 };
                 
              1. @@ -9249,38 +9795,51 @@
                sync-wait-receiver::complete behaves as follows:

                +
                template<class... Args>
                +void sync-wait-receiver::set_value(Args&&... args) && noexcept;
                +
                1. -

                  If Tag is set_value_t, evaluates:

                  +

                  Effects: Equivalent to:

                  try {
                  -  state->result.emplace(std::forward<Ts>(ts)...);
                  +  state->result.emplace(std::forward<Args>(args)...);
                   } catch (...) {
                     state->error = current_exception();
                   }
                  +state->loop.finish();
                   
                  +
                +
              2. +
                template<class Error>
                +void sync-wait-receiver::set_error(Error&& err) && noexcept;
                +
                +
                1. -

                  Otherwise, if Tag is set_error_t, evaluates:

                  -
                  state->error = AS-EXCEPT-PTR(std::forward(ts)...); // see [exec.general]
                  -
                  +

                  Effects: Equivalent to:

                  +
                  state->error = AS-EXCEPT-PTR(std::forward<Error>(err)); // see [exec.general]
                  +state->loop.finish();
                  +
                  +
                +
              3. +
                +void sync-wait-receiver::set_stopped() && noexcept;
                +
                +
                1. -

                  Otherwise, does nothing.

                  +

                  Effects: Equivalent to state->loop.finish().
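The three completion functions above can be sketched as rvalue-qualified members of a small receiver. This is an illustration under stated assumptions rather than the specified `sync-wait-receiver`: `toy_loop`, `toy_state`, and `toy_receiver` are hypothetical names, the value channel is fixed to a single `int`, and the error conversion is a simplified stand-in for `AS-EXCEPT-PTR`.

```cpp
#include <cassert>
#include <exception>
#include <optional>
#include <stdexcept>
#include <tuple>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for run_loop: it only records that finish() was called.
struct toy_loop {
  bool finished = false;
  void finish() noexcept { finished = true; }
};

// Shared state for a sender that completes with one int (an assumption).
struct toy_state {
  toy_loop loop;
  std::exception_ptr error;
  std::optional<std::tuple<int>> result;
};

// Receiver whose completion functions are rvalue-qualified member functions.
struct toy_receiver {
  toy_state* state;

  template <class... Args>
  void set_value(Args&&... args) && noexcept {
    try {
      state->result.emplace(std::forward<Args>(args)...);
    } catch (...) {
      state->error = std::current_exception();
    }
    state->loop.finish();
  }

  template <class Error>
  void set_error(Error&& err) && noexcept {
    // Simplified conversion: keep an exception_ptr as is, wrap anything else.
    if constexpr (std::is_same_v<std::decay_t<Error>, std::exception_ptr>) {
      state->error = std::forward<Error>(err);
    } else {
      state->error = std::make_exception_ptr(std::forward<Error>(err));
    }
    state->loop.finish();
  }

  void set_stopped() && noexcept { state->loop.finish(); }
};

// Drives the value channel and returns the stored int.
inline int complete_with_value(int v) {
  toy_state st;
  toy_receiver{&st}.set_value(v);
  assert(st.loop.finished && !st.error);
  return std::get<0>(*st.result);
}

// Drives the error channel; true when an error was stored and no value was.
inline bool complete_with_error() {
  toy_state st;
  toy_receiver{&st}.set_error(std::runtime_error("boom"));
  return st.loop.finished && static_cast<bool>(st.error) && !st.result.has_value();
}
```

Every channel ends by calling `finish()` on the loop, which is what lets the waiting thread leave `run()` regardless of how the sender completed.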

            3. @@ -9384,186 +9943,7 @@

              start_detached(then(sndr, f)).

            11.11. Sender/receiver utilities [exec.utils]

            -
              -
            1. -

              This subclause makes use of the following exposition-only entities:

              -
              // [Editorial note: copy_cvref_t as in [[P1450R3]] -- end note]
              -// Mandates: is_base_of_v<T, remove_reference_t<U>> is true
              -template<class T, class U>
              -  copy_cvref_t<U&&, T> c-style-cast(U&& u) noexcept requires decays-to<T, T> {
              -    return (copy_cvref_t<U&&, T>) std::forward<U>(u);
              -  }
              -
              -
            2. -

              - [Note: The C-style cast in - c-style-cast - is to disable accessibility checks. -- end note] -

              -
            -

            11.11.1. execution::receiver_adaptor [exec.utils.rcvr.adptr]

            -
            template<
            -    class-type Derived,
            -    receiver Base = unspecified> // arguments are not associated entities ([lib.tmpl-heads])
            -  class receiver_adaptor;
            -
            -
              -
            1. -

              receiver_adaptor simplifies the implementation of one receiver type in terms of another. It defines tag_invoke overloads that forward to named members if they exist, and to the adapted receiver otherwise.

              -
            2. -

              If Base is an alias for the unspecified default template argument, then:

              -
                -
              • -

                Let HAS-BASE be false, and

                -
              • -

                Let GET-BASE(d) be d.base().

                -
              -

              otherwise, let:

              -
                -
              • -

                Let HAS-BASE be true, and

                -
              • -

                Let GET-BASE(d) be c-style-cast<receiver_adaptor<Derived, Base>>(d).base().

                -
              -

              Let BASE-TYPE(D) be the type of GET-BASE(declval<D>()).

              -
            3. -

              receiver_adaptor<Derived, Base> is equivalent to the following:

              -
              template<
              -  class-type Derived,
              -  receiver Base = unspecified> // arguments are not associated entities ([lib.tmpl-heads])
              -class receiver_adaptor {
              -  friend Derived;
              - public:
              -  using receiver_concept = receiver_t;
              -
              -  // Constructors
              -  receiver_adaptor() = default;
              -  template<class B>
              -      requires HAS-BASE && constructible_from<Base, B>
              -    explicit receiver_adaptor(B&& base) : base_(std::forward<B>(base)) {}
              -
              - private:
              -  using set_value = unspecified;
              -  using set_error = unspecified;
              -  using set_stopped = unspecified;
              -  using get_env = unspecified;
              -
              -  // Member functions
              -  template<class Self>
              -    requires HAS-BASE
              -  decltype(auto) base(this Self&& self) noexcept {
              -    return (std::forward<Self>(self).base_);
              -  }
              -
              -  // [exec.utils.rcvr.adptr.nonmembers] Non-member functions
              -  template<class... As>
              -    friend void tag_invoke(set_value_t, Derived&& self, As&&... as) noexcept;
              -
              -  template<class Err>
              -    friend void tag_invoke(set_error_t, Derived&& self, Err&& err) noexcept;
              -
              -  friend void tag_invoke(set_stopped_t, Derived&& self) noexcept;
              -
              -  friend decltype(auto) tag_invoke(get_env_t, const Derived& self) noexcept;
              -
              -  [[no_unique_address]] Base base_; // present if and only if HAS-BASE is true
              -};
              -
              -
            4. -

              [Note: receiver_adaptor provides tag_invoke overloads on behalf of -the derived class Derived, which is incomplete when receiver_adaptor is -instantiated.]

              -
            5. -

              [Example:

              -
              using _int_completion =
              -  completion_signatures<set_value_t(int)>;
              -
              -template<receiver_of<_int_completion> Rcvr>
              -  class my_receiver : receiver_adaptor<my_receiver<Rcvr>, Rcvr> {
              -    friend receiver_adaptor<my_receiver, Rcvr>;
              -    void set_value() && {
              -      set_value(std::move(*this).base(), 42);
              -    }
              -   public:
              -    using receiver_adaptor<my_receiver, Rcvr>::receiver_adaptor;
              -  };
              -
              -

              -- end example]

              -
            -
            11.11.1.1. Non-member functions [exec.utils.rcvr.adptr.nonmembers]
            -
            template<class... As>
            -  friend void tag_invoke(set_value_t, Derived&& self, As&&... as) noexcept;
            -
            -
              -
            1. -

              Let SET-VALUE-MBR be the expression std::move(self).set_value(std::forward<As>(as)...).

              -
            2. -

              Constraints: Either SET-VALUE-MBR is a valid expression or typename Derived::set_value denotes a type and callable<set_value_t, BASE-TYPE(Derived), As...> is true.

              -
            3. -

              Mandates: SET-VALUE-MBR, if that expression is valid, is not potentially-throwing.

              -
            4. -

              Effects: Equivalent to:

              -
                -
              • -

                If SET-VALUE-MBR is a valid expression, SET-VALUE-MBR;

                -
              • -

                Otherwise, set_value(GET-BASE(std::move(self)), std::forward<As>(as)...).

                -
              -
            -
            template<class Err>
            -  friend void tag_invoke(set_error_t, Derived&& self, Err&& err) noexcept;
            -
            -
              -
            1. -

              Let SET-ERROR-MBR be the expression std::move(self).set_error(std::forward<Err>(err)).

              -
            2. -

              Constraints: Either SET-ERROR-MBR is a valid expression or typename Derived::set_error denotes a type and callable<set_error_t, BASE-TYPE(Derived), Err> is true.

              -
            3. -

              Mandates: SET-ERROR-MBR, if that expression is valid, is not potentially-throwing.

              -
            4. -

              Effects: Equivalent to:

              -
                -
              • -

                If SET-ERROR-MBR is a valid expression, SET-ERROR-MBR;

                -
              • -

                Otherwise, set_error(GET-BASE(std::move(self)), std::forward<Err>(err)).

                -
              -
            -
            friend void tag_invoke(set_stopped_t, Derived&& self) noexcept;
            -
            -
              -
            1. -

              Let SET-STOPPED-MBR be the expression std::move(self).set_stopped().

              -
            2. -

              Constraints: Either SET-STOPPED-MBR is a valid expression or typename Derived::set_stopped denotes a type and callable<set_stopped_t, BASE-TYPE(Derived)> is true.

              -
            3. -

              Mandates: SET-STOPPED-MBR, if that expression is valid, is not potentially-throwing.

              -
            4. -

              Effects: Equivalent to:

              -
                -
              • -

                If SET-STOPPED-MBR is a valid expression, SET-STOPPED-MBR;

                -
              • -

                Otherwise, set_stopped(GET-BASE(std::move(self))).

                -
              -
            -
            friend decltype(auto) tag_invoke(get_env_t, const Derived& self) noexcept;
            -
            -
              -
            1. -

              Constraints: Either self.get_env() is a valid expression or typename Derived::get_env denotes a type and callable<get_env_t, BASE-TYPE(const Derived&)> is true.

              -
            2. -

              Mandates: noexcept(self.get_env()) is true if it is a valid expression.

              -
            3. -

              Effects: Equivalent to:

              -
                -
              • -

                If self.get_env() is a valid expression, self.get_env();

                -
              • -

                Otherwise, get_env(GET-BASE(self)).

                -
              -
            -

            11.11.2. execution::completion_signatures [exec.utils.cmplsigs]

            +

            11.11.1. execution::completion_signatures [exec.utils.cmplsigs]

            1. completion_signatures is a type that encodes a set of completion signatures @@ -9605,12 +9985,14 @@

              Fn satisfies completion-signature if and only if it is a function type with one of the following forms:

              +

              A type Fn satisfies completion-signature if and +only if it is a function type with one of the following forms:

              • set_value_t(Vs...), where Vs is an arbitrary parameter pack.

              • -

                set_error_t(Err), where Err is an arbitrary type.

                +

                set_error_t(Err), where Err is +an arbitrary type.

              • set_stopped_t()

              @@ -9623,17 +10005,14 @@

            2. -

              Let Fns... be a template parameter pack of the arguments of the completion_signatures specialization named by Completions, let TagFns be a -template parameter pack of the function types in Fns whose return types -are Tag, and let Tsn be a template parameter -pack of the function argument types in the n-th type -in TagFns. Then, given two variadic templates Tuple and Variant, the type gather-signatures<Tag, Completions, Tuple, Variant> names the type META-APPLY(Variant, META-APPLY(Tuple, Ts0...), META-APPLY(Tuple, Ts1...), ... META-APPLY(Tuple, Tsm-1...)), where m is the size of the parameter pack TagFns and META-APPLY(T, As...) is -equivalent to:

              +

              Let Fns... be a template parameter pack of the arguments of the completion_signatures specialization named by Completions, let TagFns be a template parameter pack of the function +types in Fns whose return types are Tag, and let Tsn be a template parameter pack +of the function argument types in the n-th type in TagFns. Then, given two variadic templates Tuple and Variant, the type gather-signatures<Tag, Completions, Tuple, Variant> names the type META-APPLY(Variant, META-APPLY(Tuple, Ts0...), META-APPLY(Tuple, Ts1...), ... META-APPLY(Tuple, Tsm-1...)), where m is the size of the parameter pack TagFns and META-APPLY(T, As...) is equivalent to:

              typename indirect-meta-apply<always-true<As...>>::template meta-apply<T, As...>;
               
            3. -

              The purpose of META-APPLY is to make it -valid to use non-variadic templates as Variant and Tuple arguments to gather-signatures.

              +

              The purpose of META-APPLY is +to make it valid to use non-variadic templates as Variant and Tuple arguments to gather-signatures.
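The selection performed by gather-signatures can be sketched with ordinary partial specializations. The code below is a simplified illustration, not the specified machinery: the tag types and `completion_signatures` are local stand-ins, `type_list`, `concat`, `select_sig`, and `apply_list` are hypothetical helpers, and META-APPLY is omitted, so only variadic `Tuple` and `Variant` templates are supported.

```cpp
#include <tuple>
#include <type_traits>
#include <variant>

// Local stand-ins for the completion tags and completion_signatures.
struct set_value_t {};
struct set_error_t {};
struct set_stopped_t {};

template <class... Fns>
struct completion_signatures {};

template <class... Ts>
struct type_list {};

// Concatenates any number of type_lists into one.
template <class... Lists>
struct concat;
template <>
struct concat<> { using type = type_list<>; };
template <class... Ts>
struct concat<type_list<Ts...>> { using type = type_list<Ts...>; };
template <class... Ts, class... Us, class... Rest>
struct concat<type_list<Ts...>, type_list<Us...>, Rest...>
    : concat<type_list<Ts..., Us...>, Rest...> {};

// Maps a signature to a one-element list holding Tuple<Args...> when its
// return type is Tag, and to an empty list otherwise.
template <class Tag, class Fn, template <class...> class Tuple>
struct select_sig { using type = type_list<>; };
template <class Tag, template <class...> class Tuple, class... Args>
struct select_sig<Tag, Tag(Args...), Tuple> {
  using type = type_list<Tuple<Args...>>;
};

// Expands a type_list into the arguments of Variant.
template <template <class...> class Variant, class List>
struct apply_list;
template <template <class...> class Variant, class... Ts>
struct apply_list<Variant, type_list<Ts...>> { using type = Variant<Ts...>; };

// Variant<Tuple<Ts0...>, Tuple<Ts1...>, ...> over the signatures tagged Tag.
template <class Tag, class Completions, template <class...> class Tuple,
          template <class...> class Variant>
struct gather_signatures;
template <class Tag, template <class...> class Tuple,
          template <class...> class Variant, class... Fns>
struct gather_signatures<Tag, completion_signatures<Fns...>, Tuple, Variant> {
  using type = typename apply_list<
      Variant,
      typename concat<typename select_sig<Tag, Fns, Tuple>::type...>::type>::type;
};

// Usage: only the set_value_t signatures survive, in declaration order.
using demo_sigs = completion_signatures<set_value_t(int), set_error_t(int),
                                        set_value_t(float, char), set_stopped_t()>;
static_assert(
    std::is_same_v<
        gather_signatures<set_value_t, demo_sigs, std::tuple, std::variant>::type,
        std::variant<std::tuple<int>, std::tuple<float, char>>>,
    "value signatures are gathered into variant<tuple<...>...>");
```

Expanding a pack directly into `Tuple<Args...>` is exactly the step that fails for non-variadic alias templates, which is the gap META-APPLY is described as closing.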

          2. template<completion-signature... Fns>
            @@ -9662,7 +10041,7 @@ 

            11.11.3. execution::transform_completion_signatures [exec.utils.tfxcmplsigs]

            +

            11.11.2. execution::transform_completion_signatures [exec.utils.tfxcmplsigs]

            1. transform_completion_signatures is an alias template used to transform one @@ -9708,49 +10087,56 @@

              SetValue shall name an alias template such that for any template parameter pack As..., the type SetValue<As...> is either ill-formed or else valid-completion-signatures<SetValue<As...>> is satisfied.

            2. -

              SetError shall name an alias template such that for any type Err, SetError<Err> is either ill-formed or else valid-completion-signatures<SetError<Err>> is satisfied.

              - +

              SetError shall name an alias template such that for any type Err, SetError<Err> is either ill-formed or else valid-completion-signatures<SetError<Err>> is +satisfied.

              +

            Then:

            -
              +
              1. -

                Let Vs... be a pack of the types in the type-list named -by gather-signatures<set_value_t, InputSignatures, SetValue, type-list>.

                +

                Let Vs... be a pack of the types in the type-list named by gather-signatures<set_value_t, InputSignatures, SetValue, type-list>.

              2. -

                Let Es... be a pack of the types in the type-list named by gather-signatures<set_error_t, InputSignatures, type_identity_t, error-list>, where error-list is an -alias template such that error-list<Ts...> names type-list<SetError<Ts>...>.

                +

                Let Es... be a pack of the types in the type-list named by gather-signatures<set_error_t, InputSignatures, type_identity_t, error-list>, where error-list is an alias template such that error-list<Ts...> names type-list<SetError<Ts>...>.

              3. Let Ss name the type completion_signatures<> if gather-signatures<set_stopped_t, InputSignatures, type-list, type-list> is an alias for the type type-list<>; otherwise, SetStopped.

                -
            +

          Then:

          -
            +
            1. If any of the above types are ill-formed, then transform_completion_signatures<InputSignatures, AdditionalSignatures, SetValue, SetError, SetStopped> is ill-formed,

            2. -

              Otherwise, transform_completion_signatures<InputSignatures, AdditionalSignatures, SetValue, SetError, SetStopped> names the type completion_signatures<Sigs...> where Sigs... is the unique set of types in all the template arguments -of all the completion_signatures specializations in [AdditionalSignatures, Vs..., Es..., Ss].

              +

              Otherwise, transform_completion_signatures<InputSignatures, AdditionalSignatures, SetValue, SetError, SetStopped> names the type completion_signatures<Sigs...> where Sigs... is the unique set of +types in all the template arguments of all the completion_signatures specializations in [AdditionalSignatures, Vs..., Es..., Ss].

          11.12. Execution contexts [exec.ctx]

          1. -

            This subclause specifies some execution resources on which work can be scheduled.

            +

            This subclause specifies some execution resources on which work can be +scheduled.

          11.12.1. run_loop [exec.run.loop]

          1. -

            A run_loop is an execution resource on which work can be scheduled. It maintains a simple, thread-safe first-in-first-out queue of work. Its run() member function removes elements from the queue and executes them in a loop on whatever thread of execution calls run().

            +

            A run_loop is an execution resource on which work can be scheduled. It +maintains a simple, thread-safe first-in-first-out queue of work. Its run() member function removes elements from the queue and executes them in a loop +on whatever thread of execution calls run().

          2. -

            A run_loop instance has an associated count that corresponds to the number of work items that are in its queue. Additionally, a run_loop has an associated state that can be one of starting, running, or finishing.

            +

            A run_loop instance has an associated count that corresponds to the +number of work items that are in its queue. Additionally, a run_loop has an +associated state that can be one of starting, running, +or finishing.

          3. Concurrent invocations of the member functions of run_loop, other than run and its destructor, do not introduce data races. The member functions pop_front, push_back, and finish execute atomically.

          4. -

            [Note: Implementations are encouraged to use an intrusive queue of operation states to hold the work units to make scheduling allocation-free. — end note]

            +

            Implementations are encouraged to use an intrusive +queue of operation states to hold the work units to make scheduling +allocation-free.

            class run_loop {
               // [exec.run.loop.types] Associated types
               class run-loop-scheduler; // exposition only
            @@ -9785,11 +10171,15 @@ 
            run-loop-scheduler is an unspecified type that models the scheduler concept.

            +

            run-loop-scheduler is an unspecified type that models +the scheduler concept.

          5. -

            Instances of run-loop-scheduler remain valid until the end of the lifetime of the run_loop instance from which they were obtained.

            +

            Instances of run-loop-scheduler remain valid until the +end of the lifetime of the run_loop instance from which they were +obtained.

          6. -

            Two instances of run-loop-scheduler compare equal if and only if they were obtained from the same run_loop instance.

            +

            Two instances of run-loop-scheduler compare equal if +and only if they were obtained from the same run_loop instance.

          7. Let sch be an expression of type run-loop-scheduler. The expression schedule(sch) is not potentially-throwing and has type run-loop-sender.

          @@ -9798,31 +10188,36 @@
          run-loop-sender is an unspecified type such that sender-of<run-loop-sender> is true. - Additionally, the type reported by its error_types associated type is exception_ptr, and the value of its sends_stopped trait is true.

          +Additionally, the type reported by its error_types associated type is exception_ptr, and the value of its sends_stopped trait is true.

        2. An instance of run-loop-sender remains valid until the - end of the lifetime of its associated run_loop instance.

          +end of the lifetime of its associated run_loop instance.

        3. Let sndr be an expression of type run-loop-sender, let rcvr be an - expression such that decltype(rcvr) models the receiver_of concept, and let C be either set_value_t or set_stopped_t. Then:

          +expression such that decltype(rcvr) models the receiver_of concept, and let C be either set_value_t or set_stopped_t. Then:

          • The expression connect(sndr, rcvr) has type run-loop-opstate<decay_t<decltype(rcvr)>> and is potentially-throwing if and only if the initialization of decay_t<decltype(rcvr)> from rcvr is potentially-throwing.

          • -

            The expression get_completion_scheduler<C>(get_env(sndr)) is not potentially-throwing, has type run-loop-scheduler, and compares equal to the run-loop-scheduler instance from which sndr was obtained.

            +

            The expression get_completion_scheduler<C>(get_env(sndr)) is +not potentially-throwing, has type run-loop-scheduler, and compares equal to the run-loop-scheduler instance from which sndr was obtained.

        -
        template<receiver_of<completion_signatures<set_value_t()>> Rcvr> // arguments are not associated entities ([lib.tmpl-heads])
        +
        template<receiver_of<completion_signatures<set_value_t()>> Rcvr>
           struct run-loop-opstate;
         
        1. -

          run-loop-opstate<Rcvr> inherits unambiguously from run-loop-opstate-base.

          +

          run-loop-opstate<Rcvr> inherits unambiguously +from run-loop-opstate-base.

        2. -

          Let o be a non-const lvalue of type run-loop-opstate<Rcvr>, and let REC(o) be a non-const lvalue reference to an instance of type Rcvr that was initialized with the expression rcvr passed to the invocation of connect that returned o. Then:

          +

          Let o be a non-const lvalue of type run-loop-opstate<Rcvr>, and let REC(o) be a non-const lvalue reference to an +instance of type Rcvr that was initialized with the +expression rcvr passed to the invocation of connect that returned o. Then:

          • -

            The object to which REC(o) refers remains valid for the lifetime of the object to which o refers.

            +

            The object to which REC(o) refers remains +valid for the lifetime of the object to which o refers.

          • The type run-loop-opstate<Rcvr> overrides run-loop-opstate-base::execute() such that o.execute() is equivalent to the following:

            if (get_stop_token(REC(o)).stop_requested()) {
            @@ -9832,7 +10227,8 @@ 
            start(o) is equivalent to the following:

            +

            The expression start(o) is equivalent to the +following:

            try {
               o.loop_->push_back(&o);
             } catch(...) {
            @@ -9859,12 +10255,15 @@ 
            true:

            +

            Effects: Blocks ([defns.block]) until one of the following conditions +is true:

            • count is 0 and state is finishing, in which case pop_front returns nullptr; or

            • -

              count is greater than 0, in which case an item is removed from the front of the queue, count is decremented by 1, and the removed item is returned.

              +

              count is greater than 0, in which case an item is removed from +the front of the queue, count is decremented by 1, and the +removed item is returned.
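The blocking rule for pop_front, together with the intrusive queue the note encourages, can be sketched with a mutex and a condition variable. This is a minimal illustration under stated assumptions, not the specified `run_loop`: `toy_task`, `toy_run_loop`, and `record_task` are hypothetical names, the three-valued state is collapsed into a single `finishing_` flag, and no schedulers or senders are modeled.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <vector>

// Hypothetical work item; the intrusive `next` pointer means queuing never allocates.
struct toy_task {
  toy_task* next = nullptr;
  virtual void execute() = 0;
  virtual ~toy_task() = default;
};

class toy_run_loop {
  std::mutex mtx_;
  std::condition_variable cv_;
  toy_task* head_ = nullptr;
  toy_task* tail_ = nullptr;
  bool finishing_ = false;

  // Blocks until an item is available, or the queue is empty and the loop is finishing.
  toy_task* pop_front() {
    std::unique_lock<std::mutex> lock(mtx_);
    cv_.wait(lock, [this] { return head_ != nullptr || finishing_; });
    if (head_ == nullptr) return nullptr;  // empty and finishing
    toy_task* item = head_;
    head_ = item->next;
    if (head_ == nullptr) tail_ = nullptr;
    item->next = nullptr;
    return item;
  }

 public:
  // Appends an item to the back of the queue and wakes one waiter.
  void push_back(toy_task* item) {
    {
      std::lock_guard<std::mutex> lock(mtx_);
      item->next = nullptr;
      if (tail_ != nullptr) tail_->next = item; else head_ = item;
      tail_ = item;
    }
    cv_.notify_one();
  }

  // Marks the loop as finishing; items already queued are still executed.
  void finish() {
    {
      std::lock_guard<std::mutex> lock(mtx_);
      finishing_ = true;
    }
    cv_.notify_all();
  }

  // Executes items in FIFO order until pop_front reports completion.
  void run() {
    while (toy_task* item = pop_front()) item->execute();
  }
};

// Task that records its id when executed.
struct record_task : toy_task {
  std::vector<int>* out;
  int id;
  record_task(std::vector<int>* o, int i) : out(o), id(i) {}
  void execute() override { out->push_back(id); }
};

// Queues three tasks, requests finish, then drains the queue on this thread.
inline std::vector<int> run_three_in_order() {
  std::vector<int> order;
  record_task a{&order, 1}, b{&order, 2}, c{&order, 3};
  toy_run_loop loop;
  loop.push_back(&a);
  loop.push_back(&b);
  loop.push_back(&c);
  loop.finish();
  loop.run();
  return order;
}
```

Because `pop_front` returns `nullptr` only when the queue is empty and finish has been requested, calling `finish()` before `run()` still drains everything that was queued.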

        void run_loop::push_back(run-loop-opstate-base* item);
        @@ -9879,7 +10278,8 @@ 
        run-loop-scheduler that can be used to schedule work onto this run_loop instance.

        +

        Returns: an instance of run-loop-scheduler that +can be used to schedule work onto this run_loop instance.

      void run_loop::run();
       
      @@ -9895,7 +10295,8 @@
      void run_loop::finish();
    • @@ -9909,7 +10310,9 @@

      11.13.1. execution::as_awaitable [exec.as.awaitable]

      1. -

        as_awaitable transforms an object into one that is awaitable within a particular coroutine. This subclause makes use of the following exposition-only entities:

        +

        as_awaitable transforms an object into one that is awaitable within a +particular coroutine. This subclause makes use of the following +exposition-only entities:

        template<class Sndr, class Env>
           using single-sender-value-type = see below;
         
        @@ -9934,17 +10337,24 @@ 

        value_types_of_t<Sndr, Env, Tuple, Variant> would have the form Variant<Tuple<T>>, then single-sender-value-type<Sndr, Env> is an alias for type decay_t<T>.

        +

        If value_types_of_t<Sndr, Env, Tuple, Variant> would have the form Variant<Tuple<T>>, then single-sender-value-type<Sndr, Env> is an +alias for type decay_t<T>.

      2. -

        Otherwise, if value_types_of_t<Sndr, Env, Tuple, Variant> would have the form Variant<Tuple<>> or Variant<>, then single-sender-value-type<Sndr, Env> is an alias for type void.

        +

        Otherwise, if value_types_of_t<Sndr, Env, Tuple, Variant> would +have the form Variant<Tuple<>> or Variant<>, then single-sender-value-type<Sndr, Env> is an +alias for type void.

      3. -

        Otherwise, if value_types_of_t<Sndr, Env, Tuple, Variant> would have the form Variant<Tuple<Ts...>> where Ts is a parameter pack, then single-sender-value-type<Sndr, Env> is an alias for type std::tuple<decay_t<Ts>...>.

        +

        Otherwise, if value_types_of_t<Sndr, Env, Tuple, Variant> would +have the form Variant<Tuple<Ts...>> where Ts is a parameter pack, +then single-sender-value-type<Sndr, Env> is an +alias for type std::tuple<decay_t<Ts>...>.

      4. Otherwise, single-sender-value-type<Sndr, Env> is ill-formed.
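The four cases above map naturally onto template specializations. The sketch below is an illustration only: `Tuple`, `Variant`, and `single_value` are hypothetical stand-ins that operate on an already computed `Variant<Tuple<...>...>` shape instead of calling `value_types_of_t`.

```cpp
#include <tuple>
#include <type_traits>

// Hypothetical marker templates standing in for the Tuple and Variant arguments.
template <class... Ts> struct Tuple {};
template <class... Ts> struct Variant {};

// Primary template: left undefined, so any other shape is ill-formed (case 4).
template <class V>
struct single_value;

// Case 3: Variant<Tuple<Ts...>> yields std::tuple<decay_t<Ts>...>.
template <class... Ts>
struct single_value<Variant<Tuple<Ts...>>> {
  using type = std::tuple<std::decay_t<Ts>...>;
};

// Case 1: Variant<Tuple<T>> yields decay_t<T>; more specialized than case 3.
template <class T>
struct single_value<Variant<Tuple<T>>> {
  using type = std::decay_t<T>;
};

// Case 2: Variant<Tuple<>> or Variant<> yields void.
template <>
struct single_value<Variant<Tuple<>>> { using type = void; };
template <>
struct single_value<Variant<>> { using type = void; };

static_assert(std::is_same_v<single_value<Variant<Tuple<const int&>>>::type, int>,
              "a single value is decayed");
static_assert(std::is_void_v<single_value<Variant<Tuple<>>>::type>,
              "an empty tuple maps to void");
static_assert(std::is_void_v<single_value<Variant<>>::type>,
              "an empty variant maps to void");
static_assert(std::is_same_v<single_value<Variant<Tuple<int, char&>>>::type,
                             std::tuple<int, char>>,
              "several values become a std::tuple of decayed types");
```

A sender with more than one value completion, such as `Variant<Tuple<int>, Tuple<char>>`, matches none of the specializations and therefore has no `type` member in this sketch.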

    • -

      The type sender-awaitable<Sndr, Promise> is equivalent to the following:

      -
      template<class Sndr, class Promise> // arguments are not associated entities ([lib.tmpl-heads])
      +       

      The type sender-awaitable<Sndr, Promise> is +equivalent to the following:

      +
      template<class Sndr, class Promise>
       class sender-awaitable {
         struct unit {};
         using value_t = single-sender-value-type<Sndr, env_of_t<Promise>>;
      @@ -9971,10 +10381,12 @@ 

      rcvr be an rvalue expression of type awaitable-receiver, let crcvr be a const lvalue that refers to rcvr, let vs be a parameter pack of types Vs..., and let err be an arbitrary expression of type Err. Then:

      +

      Let rcvr be an rvalue expression of type awaitable-receiver, let crcvr be a const lvalue that refers to rcvr, let vs be a parameter pack of types Vs..., and let err be an arbitrary expression of type Err. +Then:

      1. -

        If constructible_from<result_t, Vs...> is satisfied, the expression set_value(rcvr, vs...) is equivalent to:

        +

        If constructible_from<result_t, Vs...> is satisfied, the +expression set_value(rcvr, vs...) is equivalent to:

        try {
           rcvr.result_ptr_->emplace<1>(vs...);
         } catch(...) {
        @@ -9991,7 +10403,9 @@ 

        set_stopped(rcvr) is equivalent to static_cast<coroutine_handle<>>(rcvr.continuation_.promise().unhandled_stopped()).resume().

      2. -

        For any expression tag whose type satisfies forwarding-query and for any pack of subexpressions as, tag_invoke(tag, get_env(crcvr), as...) is expression-equivalent to tag(get_env(as_const(crcvr.continuation_.promise())), as...) when that expression is well-formed.

        +

        For any expression tag whose type satisfies forwarding-query and for any pack of +subexpressions as, get_env(crcvr).query(tag, as...) is +expression-equivalent to tag(get_env(as_const(crcvr.continuation_.promise())), as...) when that expression is well-formed.

    • sender-awaitable::sender-awaitable(Sndr&& sndr, Promise& p)

      @@ -10016,21 +10430,22 @@

      as_awaitable is a customization point object. For some subexpressions expr and p where p is an lvalue, Expr names the type decltype((expr)) and Promise names the type decltype((p)), as_awaitable(expr, p) is expression-equivalent to the following:

      1. -

        tag_invoke(as_awaitable, expr, p) if that expression is well-formed.

        +

        expr.as_awaitable(p) if that expression is well-formed.

        • -

          Mandates: is-awaitable<A, Promise> is true, where A is the type of the tag_invoke expression above.

          +

          Mandates: is-awaitable<A, Promise> is true, where A is the type of the expression above.

      2. Otherwise, expr if is-awaitable<Expr, U> is true, where U is an unspecified class type that lacks a member named await_transform. The -condition is not is-awaitable<Expr, Promise> as that -creates the potential for constraint recursion.

        +condition is not is-awaitable<Expr, Promise> as +that creates the potential for constraint recursion.

        • -

          Preconditions: is-awaitable<Expr, Promise> is true and the expression co_await expr in a coroutine with promise -type U is expression-equivalent to the same -expression in a coroutine with promise type Promise.

          +

          Preconditions: is-awaitable<Expr, Promise> is true and the expression co_await expr in a +coroutine with promise type U is +expression-equivalent to the same expression in a coroutine with +promise type Promise.

      3. Otherwise, sender-awaitable{expr, p} if awaitable-sender<Expr, Promise> is true.

        @@ -10041,8 +10456,11 @@

        11.13.2. execution::with_awaitable_senders [exec.with.awaitable.senders]

        1. -

          with_awaitable_senders, when used as the base class of a coroutine promise type, makes senders awaitable in that coroutine type.

          -

          In addition, it provides a default implementation of unhandled_stopped() such that if a sender completes by calling set_stopped, it is treated as if an uncatchable "stopped" exception were thrown from the await-expression. In practice, the coroutine is never resumed, and the unhandled_stopped of the coroutine caller’s promise type is called.

          +

          with_awaitable_senders, when used as the base class of a coroutine promise +type, makes senders awaitable in that coroutine type.

          +

          In addition, it provides a default implementation of unhandled_stopped() such that if a sender completes by calling set_stopped, it is treated as +if an uncatchable "stopped" exception were thrown from the await-expression. In practice, the coroutine is never resumed, and +the unhandled_stopped of the coroutine caller’s promise type is called.

          template<class-type Promise>
             struct with_awaitable_senders {
               template<class OtherPromise>
          @@ -10058,7 +10476,7 @@ 

          continuation_ = h; if constexpr ( requires(OtherPromise& other) { other.unhandled_stopped(); } ) { stopped_handler_ = [](void* p) noexcept -> coroutine_handle<> { @@ -10083,15 +10502,14 @@

          call-result-t<as_awaitable_t, Value, Promise&> await_transform(Value&& value)

          • Effects: Equivalent to:

            +
          return as_awaitable(std::forward<Value>(value), static_cast<Promise&>(*this));
           
          -