C++20 — Practical Coroutines

Šimon Tóth · Published in ITNEXT · Nov 30, 2021 · 5 min read

Writing custom coroutines is not a trivial task in C++20. In this article, I will guide you through three increasingly complex examples of coroutines.

While this article will explain each example in detail, I will not go into the basics of coroutines. For that, please read my C++20 Coroutines article. All the examples are from my Coroutines Epoll and Sockets library.

The synchronous coroutine

The simplest coroutine is one that behaves like a normal function. Implementing such a coroutine might, of course, seem pointless. Remember, however, that we cannot use coroutine keywords like co_await outside of a coroutine. So, for convenience, we can write a special coroutine type for an asynchronous main.

The main difference from a completely synchronous coroutine is that we return std::suspend_always from final_suspend(). Without this, the promise would be destroyed as soon as the coroutine finishes running; however, we use the promise to store the result. We set the result in return_value(), which is called on co_return, and read it back in the conversion-to-int operator.

We also need to write a destructor, since the coroutine must now be destroyed explicitly.
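
Here is a minimal sketch of such a coroutine type, assuming an int result and no exception propagation (the name sync_task and the member names are illustrative, not the library's exact code):

```cpp
#include <coroutine>
#include <exception>
#include <utility>

struct sync_task {
    struct promise_type {
        int result_ = 0;

        sync_task get_return_object() {
            return sync_task{std::coroutine_handle<promise_type>::from_promise(*this)};
        }
        // Run the body immediately, like a normal function call.
        std::suspend_never initial_suspend() noexcept { return {}; }
        // Suspend at the end so the promise (and the stored result) outlives
        // the coroutine body and can still be read by the caller.
        std::suspend_always final_suspend() noexcept { return {}; }
        // Called on co_return <value>; stores the result in the promise.
        void return_value(int value) noexcept { result_ = value; }
        void unhandled_exception() { std::terminate(); }
    };

    explicit sync_task(std::coroutine_handle<promise_type> handle) : handle_(handle) {}
    sync_task(sync_task&& other) noexcept : handle_(std::exchange(other.handle_, nullptr)) {}
    sync_task(const sync_task&) = delete;
    // Because final_suspend() suspends, the coroutine frame is never destroyed
    // implicitly; we have to destroy it explicitly here.
    ~sync_task() { if (handle_) handle_.destroy(); }

    // Read the stored result out of the promise.
    operator int() const { return handle_.promise().result_; }

private:
    std::coroutine_handle<promise_type> handle_;
};
```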

We can then use this coroutine as a replacement for main like this:
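
The sketch below reuses the hypothetical sync_task type from above; the CES library's actual async_main setup differs in detail.

```cpp
sync_task async_main(int argc, char** argv) {
    // ... co_await other coroutines, set up servers, etc. ...
    co_return 0;
}

int main(int argc, char** argv) {
    // The conversion-to-int operator reads the result stored by co_return.
    return async_main(argc, argv);
}
```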

Inside our async_main, we can now co_await other coroutines.

The chainable coroutine

One of the compelling features of coroutines is symmetric transfer. With synchronous code, we only rarely need to care about running out of stack space. However, it is not unusual in asynchronous code to only execute small chunks of code and then hand off control to another part of the program. Avoiding stack space issues then requires meticulous design. With coroutines, we can avoid this problem entirely by relying on the code generated by the compiler.
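
As an illustration, consider a caller that awaits another coroutine (a hypothetical sketch; chainable_task itself is shown further below, and demo() and async_op() follow the names used in the rest of this section):

```cpp
// A hypothetical caller-side example; chainable_task itself is sketched
// further below.
chainable_task async_op() {
    // ... do some work, possibly co_await other coroutines ...
    co_return 42;
}

chainable_task demo() {
    // Suspends demo(), hands control to async_op(), and resumes demo() with
    // the result once async_op() finishes.
    int value = co_await async_op();
    co_return value;
}
```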

We will look at chainable_task itself shortly; first, let's go through what is happening here. We call co_await on the result of async_op(). This requires the result type, chainable_task, to provide the await_ready(), await_suspend(), and await_resume() interface, with await_resume() providing the result value.

From the perspective of the caller, this behaves exactly like a synchronous call. The caller is suspended after the call and only resumed at the end to read the result value. However, in the background, we instead chain the coroutines without nesting them.

So let's look at what is happening in the background:
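
What follows is a simplified sketch of chainable_task, assuming an int result and no exception propagation; the CES library's version is more general.

```cpp
#include <coroutine>
#include <exception>
#include <utility>

// A simplified sketch of a chainable task, not the CES library's exact code.
struct chainable_task {
    struct promise_type {
        int result_ = 0;
        // Handle of the coroutine that co_awaited us (the caller).
        std::coroutine_handle<> continuation_ = std::noop_coroutine();

        chainable_task get_return_object() {
            return chainable_task{std::coroutine_handle<promise_type>::from_promise(*this)};
        }
        // Suspend immediately; the caller gets a chainable_task it can co_await.
        std::suspend_always initial_suspend() noexcept { return {}; }

        // Awaitable returned from final_suspend(): hand control back to the
        // stored continuation via symmetric transfer.
        struct final_awaitable {
            bool await_ready() noexcept { return false; }
            std::coroutine_handle<> await_suspend(
                std::coroutine_handle<promise_type> handle) noexcept {
                return handle.promise().continuation_;
            }
            void await_resume() noexcept {}
        };
        final_awaitable final_suspend() noexcept { return {}; }

        // Called on co_return <value>; stores the result for the awaiter.
        void return_value(int value) noexcept { result_ = value; }
        void unhandled_exception() { std::terminate(); }
    };

    explicit chainable_task(std::coroutine_handle<promise_type> handle) : handle_(handle) {}
    chainable_task(chainable_task&& other) noexcept
        : handle_(std::exchange(other.handle_, nullptr)) {}
    chainable_task(const chainable_task&) = delete;
    ~chainable_task() { if (handle_) handle_.destroy(); }

    // Awaitable interface seen by the coroutine that co_awaits this task.
    bool await_ready() noexcept { return false; }
    std::coroutine_handle<> await_suspend(std::coroutine_handle<> caller) noexcept {
        // Remember who awaited us, then hand control to this task's coroutine.
        handle_.promise().continuation_ = caller;
        return handle_;
    }
    int await_resume() noexcept { return handle_.promise().result_; }

private:
    std::coroutine_handle<promise_type> handle_;
};
```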

Let's focus on the critical parts. First, the awaitable interface (await_ready(), await_suspend(), await_resume()) is relatively straightforward:

  • we want the caller to suspend, so we return false in await_ready()
  • inside await_suspend(), we remember the caller's handle (the caller will be demo() in the earlier example) and return our own handle, which resumes this coroutine (async_op() in the earlier example)
  • inside await_resume() we return the stored result

Second, the promise type itself is where we set the main behaviour of the task:

  • we suspend in initial_suspend(), which returns control (and an instance of chainable_task) to the caller, who then performs the previously discussed co_await
  • in final_suspend(), we want to hand control back to the caller, so we return a special awaitable object whose await_suspend() returns the caller's handle that we stored earlier in the task's await_suspend()
  • lastly, the return_value() method is what stores the result that is then read in await_resume()

All this comes together to chain the execution instead of nesting it. This is a lot to take in, so I recommend grabbing the library code and adding debug prints to constructors and destructors if you are still struggling. It will allow you to observe changes in behaviour as you change your code.

Detached tasks

So far, we have only discussed linear execution models. By that, I mean that the code we write still behaves like completely synchronous code. When we write co_await some_coro(), the following line will only be executed once some_coro() finishes running.

Imagine code like this:
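
The sketch below reuses the hypothetical chainable_task from earlier; server and its accept() method are stand-ins for the CES socket API, and you should pretend that accept() suspends until a connection actually arrives.

```cpp
// server and accept() are hypothetical stand-ins for the CES socket API; this
// toy accept() completes immediately, but pretend it suspends until a
// connection arrives.
struct server {
    chainable_task accept() { co_return /*connection id*/ 0; }
};

chainable_task handle_servers(server& server1, server& server2) {
    // If no connection ever arrives on server1, we never get past this line,
    // so server2 never gets a chance to accept anything.
    int connection1 = co_await server1.accept();
    int connection2 = co_await server2.accept();
    // ... handle both connections ...
    co_return connection1 + connection2;
}
```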

If we never receive a connection on server1, this code will prevent server2 from accepting connections, even if we have plenty of them. Of course, we could spawn these on separate threads, but we can avoid this and still run everything on a single thread with coroutines.

Conceptually, we want to keep coroutines blocked on external events (connection arrived, data to read, timer expired, etc.) in a suspended state and only resume them when we know for sure that they have work to do.

To achieve this, we will need to introduce a global component: a scheduler of coroutines. All we want from the scheduler is to be a generator-style coroutine that keeps looping and yielding control to coroutines that can run:
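
A rough sketch of the idea, written as a plain loop for brevity; in the CES library the scheduler is itself a generator-style coroutine and the events come from epoll-backed emitters (event, post() and run() are illustrative names):

```cpp
#include <coroutine>
#include <deque>

// The event carries the handle of the coroutine that is blocked on it.
struct event {
    std::coroutine_handle<> blocked;
};

struct scheduler {
    std::deque<event> pending_;  // events reported by the emitters

    void post(event ev) { pending_.push_back(ev); }

    void run() {
        // Keep looping and yielding control to coroutines that can make progress.
        while (!pending_.empty()) {
            event ev = pending_.front();
            pending_.pop_front();
            ev.blocked.resume();  // hand control to the blocked coroutine
        }
    }
};
```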

Emitters are external sources of events (connection arrived, socket ready for writing, condition evaluated to true, timeout expired, etc.). The scheduler then maps the received event to the corresponding coroutine blocked by it and resumes it:
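
If the scheduler is itself a coroutine, the resume step can be expressed as an awaitable that performs a symmetric transfer; a hypothetical sketch:

```cpp
// A hypothetical handoff awaitable; the blocked coroutine's handle is assumed
// to travel inside the event, as described below.
struct handoff {
    std::coroutine_handle<> target;  // handle carried by the emitted event

    bool await_ready() noexcept { return false; }
    std::coroutine_handle<> await_suspend(std::coroutine_handle<>) noexcept {
        // Symmetric transfer: resume the blocked coroutine instead of
        // returning control to whoever resumed the scheduler.
        return target;
    }
    void await_resume() noexcept {}
};

// Inside a coroutine-based scheduler loop:
//     co_await handoff{ev.blocked};
```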

In the CES library, the coroutine handle blocked by an event is part of the emitted event. So we simply hand off to that coroutine using a special awaitable type.

We now have a way to resume suspended coroutines. However, we still need the other side, which is a way to suspend a coroutine and register it in the scheduler:
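
A hypothetical sketch of such an awaitable; the std::function members below stand in for the concrete event description and the emitter plumbing of the CES library:

```cpp
#include <coroutine>
#include <functional>

// A sketch only: the std::function members are placeholders for the real
// event information and notify_emitters machinery of the CES library.
struct wait_for_event {
    std::function<bool()> already_satisfied;               // early-return check
    std::function<void(std::coroutine_handle<>)> notify_emitters;
    std::coroutine_handle<> continuation;                  // the blocked coroutine

    // If the awaited condition already holds, do not suspend at all.
    bool await_ready() { return already_satisfied(); }

    void await_suspend(std::coroutine_handle<> handle) {
        // Remember who is blocked and register the event with the emitters so
        // that the scheduler can resume this coroutine once the event fires.
        continuation = handle;
        notify_emitters(handle);
    }

    void await_resume() noexcept {}
};
```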

This awaitable type stores the continuation and information about the event that is blocking the coroutine. In await_ready(), we also handle the early-return case where a coroutine co_awaits a condition that is already true. The notify_emitters call is the flip side of the library's notify_departure call: notifying the emitters registers the event to be watched, while departure unregisters it, to mitigate phantom events.

The last part of the puzzle is a way for calling coroutines to spawn multiple detached coroutines. On top of that, we also need a way to add event emitters to the scheduler.
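
A sketch of what that interface might look like, restating the hypothetical scheduler from above with the missing pieces (declarations only; enqueue() and add_emitter() are illustrative names, not the CES API):

```cpp
// Interface only, with illustrative names; the CES scheduler differs.
struct emitter;  // an external source of events (sockets, timers, conditions, ...)

struct scheduler {
    // Register an external source of events with the scheduler.
    void add_emitter(emitter& source);

    // Remember a detached coroutine so that it is resumed once run() is called.
    void enqueue(std::coroutine_handle<> coroutine);

    // Keep looping: collect events from the emitters and resume the coroutines
    // blocked on them, until there is no more work left.
    void run();
};
```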

With this interface, multiple coroutines can be enqueued, and the scheduler then resumed with a call to run.

Links and technical notes

All the code examples are either taken directly from, or are simplified versions of code from, the CES: Coroutines, Epoll and Sockets library. The library is tested and working with a trunk build of GCC 12.

Thank you for reading

Thank you for reading this article. Did you enjoy it?

I also publish videos on YouTube. Do you have questions? Hit me up on Twitter or LinkedIn.
