Adding Coroutine Feature To C Language
In this article, we discuss how a coroutine is a function/sub-routine (a co-operative sub-routine, to be precise) and how it can be suspended and resumed.
If you are an absolute beginner, you can go through all of the prerequisites below. If you are not a beginner, you know best what to skip!
What Is a Coroutine?
- A coroutine is a function/sub-routine (a co-operative sub-routine, to be precise) that can be suspended and resumed.
- In other words, you can think of a coroutine as an in-between solution between a normal function and a thread. Once a function/sub-routine is called, it executes until the end. A thread, on the other hand, can be blocked by synchronization primitives (like mutexes, semaphores, etc.) or suspended by the OS scheduler, but you cannot decide when it is suspended and resumed (the OS scheduler does that).
- A coroutine, by contrast, can be suspended at a pre-defined point and resumed later, whenever the programmer needs it. So the programmer has complete control of the execution flow, with minimal overhead compared to a thread.
- Coroutines are closely related to what other environments call fibers (on Windows), lightweight threads, or green threads (in Java).
Why do we need coroutines? Before learning anything new, you should ask yourself this question. But let me answer it:
- Coroutines can provide a very high level of concurrency with very little overhead, as they don't need OS intervention for scheduling. In a threaded environment, you have to bear the OS scheduling overhead.
- A coroutine can only suspend at pre-determined points, so you can often avoid locking shared data structures, because you would never tell your code to switch to another coroutine in the middle of a critical section. (This holds as long as all coroutines sharing the data run on the same thread.)
- Each thread needs its own stack, thread-local storage, and other bookkeeping, so memory usage grows linearly with the number of threads. Coroutines are usually much cheaper per instance, although a stackful coroutine like the one built in this article still carries its own (typically small) stack.
- For workloads that switch often, a coroutine is usually the faster choice compared to a thread, because switching between coroutines does not involve the OS scheduler.
- And if you are still not convinced then wait for my C++ Coroutine post.
Before we dive into coroutines, we need to understand the foundation functions/APIs for context switching below. Of course, as usual, with short, to-the-point theory and more code examples.
- If you are already familiar with `setjmp`/`longjmp`, then you might find these functions easy to understand. You can consider them an advanced version of `setjmp`/`longjmp`.
- The only difference is that `setjmp`/`longjmp` allows only a single non-local jump up the stack, whereas these APIs allow the creation of multiple cooperative threads of control, each with its own stack or entry point.
- The `ucontext_t` structure, declared in `<ucontext.h>`, is used to store the execution context.
- All four control flow functions (`getcontext`, `setcontext`, `makecontext`, and `swapcontext`) operate on this structure.
- `uc_link` points to the context which will be resumed when the current context exits, if the context was created with `makecontext` (a secondary context).
- `uc_stack` is the stack used by the context.
- `uc_mcontext` stores the execution state, including all registers and CPU flags, the frame/base pointer (which indicates the current execution frame), the instruction pointer (i.e. the program counter), the link register (which stores the return address), and the stack pointer (which indicates the current stack limit, or the end of the current frame). `mcontext_t` is an opaque type.
- `uc_sigmask` stores the set of signals blocked in the context, which isn't the focus for today.
- `setcontext` transfers control to the context in `ucp`. Execution continues from the point at which the context was stored in `ucp`. On success, `setcontext` does not return.
- `getcontext` saves the current context into `ucp`. This function returns in two possible cases:
- after the initial call,
- or when a thread switches to the context in `ucp` via `setcontext` or `swapcontext`.
- The `getcontext` function does not provide a return value to distinguish the cases (its return value is used solely to signal an error), so the programmer must use an explicit flag variable, which must not be a register variable and must be declared `volatile` to avoid constant propagation or other compiler optimizations.
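As a minimal sketch of the explicit-flag idiom described above, the following function counts how many times a single `getcontext` call returns. It assumes Linux with Glibc; the function name `count_getcontext_returns` and its variable names are hypothetical and do not come from the original article.

```c
#include <ucontext.h>

/* Hypothetical helper: returns how many times getcontext() returned,
 * or -1 if getcontext() itself failed. */
int count_getcontext_returns(void) {
    ucontext_t ctx;
    volatile int returns = 0;  /* volatile: the value must survive the jump back */
    volatile int jumped = 0;   /* explicit flag that distinguishes the two cases */

    if (getcontext(&ctx) == -1) {
        return -1;             /* the return value only signals an error */
    }
    returns++;                 /* executed once per return of getcontext */

    if (!jumped) {
        jumped = 1;            /* make sure we jump back only once */
        setcontext(&ctx);      /* resumes right after the getcontext call above */
    }
    return returns;
}
```

Calling the function should give 2: one return from the initial call and one after `setcontext` switches back to the saved context. Without the `jumped` flag there would be no way to tell the two returns apart, and the code would jump back forever.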
- The `makecontext` function sets up an alternate thread of control in `ucp`, which has previously been initialized using `getcontext`.
- The `ucp.uc_stack` member should be pointed to an appropriately sized stack; the constant `SIGSTKSZ` or `MINSIGSTKSZ` is commonly used.
- When `ucp` is jumped to using `setcontext` or `swapcontext`, execution will begin at the entry point of the function pointed to by `func`, with `argc` arguments as specified. When `func` terminates, control is returned to the context specified in `ucp.uc_link`.
- `swapcontext` saves the current execution state into `oucp` and then transfers the execution control to `ucp`.
Context Switching API Examples
- Now that we have read a lot of theory, let's create some meaning out of it.
- Consider a program that implements a plain infinite loop printing "Hello world" every second, using only `getcontext` and `setcontext` instead of a loop statement.
- In such a program, `getcontext` returns in both of the cases mentioned earlier, i.e.:
- after the initial call,
- when a thread switches to the context via `setcontext`.
- The rest is self-explanatory.
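A minimal sketch of such a loop, assuming Linux with Glibc. Unlike the program described above, this version stops after `limit` iterations and leaves out the one-second sleep so that it terminates quickly. `hello_loop` and `limit` are hypothetical names introduced here for illustration.

```c
#include <stdio.h>
#include <ucontext.h>

/* Hypothetical helper: prints "Hello world" `limit` times without any loop
 * statement, by jumping back to a saved context. Returns the number of
 * lines printed. */
int hello_loop(int limit) {
    ucontext_t context;
    volatile int iterations = 0;   /* volatile: must survive the non-local jump */

    getcontext(&context);          /* returns here initially and after every setcontext */
    puts("Hello world");
    iterations++;

    if (iterations < limit) {
        setcontext(&context);      /* jump back to the getcontext call above */
    }
    return iterations;
}
```

`hello_loop(3)` prints the line three times. Removing the `if` around `setcontext` and adding a `sleep(1)` call (from `<unistd.h>`) turns it into the infinite, once-a-second variant.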
- Next, consider an example in which the `makecontext` function sets up an alternate thread of control in a context `ctx`. When the jump is made with `swapcontext`, execution will begin at a function `assign`, with the respective arguments as specified.
- When `assign` terminates, control will be switched to `ctx.uc_link`, which points to a second context `back` that is populated by `swapcontext` at the moment of the switch.
- If `ctx.uc_link` is set to 0, then the current execution context is considered the main context, and the thread will exit when the `assign` context finishes.
- Before a call is made to `makecontext`, the application/developer needs to ensure that the context being modified has a pre-allocated stack, and that `argc` matches the number of arguments of type `int` passed to `func`. Otherwise, the behavior is undefined.
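The points above can be sketched roughly as follows, assuming Linux with Glibc. The names `ctx`, `back`, and `assign` follow the text; `run_assign`, `target`, and `STACK_SIZE` are hypothetical. Two choices here are assumptions of this sketch: the stack is a 64 KiB heap block, and the result is written to a file-scope variable so that only an `int` has to be passed through `makecontext`.

```c
#include <stdint.h>
#include <stdlib.h>
#include <ucontext.h>

#define STACK_SIZE (64 * 1024)     /* assumed stack size for this sketch */

static uint32_t target;            /* written by the secondary context */

/* Entry function of the secondary context; makecontext passes int arguments. */
static void assign(int val) {
    target = (uint32_t)val;
}

/* Hypothetical helper: runs assign(val) on its own stack and returns the
 * value it stored, or 0 if the context could not be set up. */
uint32_t run_assign(int val) {
    ucontext_t ctx, back;
    void *stack = malloc(STACK_SIZE);

    if (stack == NULL || getcontext(&ctx) == -1) {
        free(stack);
        return 0;
    }

    ctx.uc_stack.ss_sp = stack;            /* pre-allocated stack, as required */
    ctx.uc_stack.ss_size = STACK_SIZE;
    ctx.uc_stack.ss_flags = 0;
    ctx.uc_link = &back;                   /* resume `back` when assign terminates */

    makecontext(&ctx, (void (*)(void))assign, 1, val);   /* argc == 1 int argument */
    swapcontext(&back, &ctx);              /* save this point in `back`, run assign */

    free(stack);                           /* reached only after assign has returned */
    return target;
}
```

With `ctx.uc_link` left as `NULL` instead of `&back`, the thread would exit when `assign` returns, and the code after `swapcontext` would never run.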
- Initially, I created a single-file example. But then I realized it's too much for a single file. Hence, I split the implementation and the usage example into different files, which makes the example easier to understand.
## Coroutine Implementation
- So, here is the simplest coroutine in the C language:
- Just ignore the coroutine APIs for now.
- The main thing to focus on here is the coroutine handler, which has the following fields:
- `function`: holds the address of the actual coroutine function supplied by the user.
- `suspend_context`: used to suspend the coroutine function.
- `resume_context`: holds the context of the actual coroutine function.
- `yield_value`: stores the value returned at an intermediate suspension point, and also the final return value.
- `is_coro_finished`: an indicator to check the status of the coroutine's lifetime.
- The most used coroutine APIs are `coro_resume` and `coro_yield`, which do the actual work of suspension and resumption.
- If you have carefully gone through the Context Switching API Examples above, then I don't think there is much to explain for `coro_resume` and `coro_yield`. It's just that `coro_yield` jumps to `coro_resume` and vice versa, except for the first call to `coro_resume`, which jumps to `_coro_entry_point`.
- The `coro_new` function allocates memory for the handler as well as the stack, and then populates the handler members. Again, `makecontext` should be clear by this point. If not, then please re-read the above section on Context Switching API Examples.
- If you genuinely understand the above coroutine API implementation, then the obvious question is: why do we even need `_coro_entry_point`? Why can't we jump directly to the actual coroutine function?
- But then my argument will be: "How do you ensure the lifetime of the coroutine?"
- Technically, this means that the number of valid calls to `coro_resume` should equal the number of calls to `coro_yield` plus one (for the actual return).
- Otherwise, you cannot keep track of the yields, and the behavior becomes undefined.
- That is why the `_coro_entry_point` function is needed. Without it, there is no way to deduce that the coroutine has finished executing completely and that any subsequent call to `coro_resume` is no longer valid.
- With the above implementation, a coroutine handler lets you execute the coroutine function to completion only once in the life of the program/application.
- If you want to call the coroutine function again, you need to create a new coroutine handler. The rest of the process remains the same.
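Putting the pieces above together, a minimal single-file sketch of such a coroutine could look like the following. It assumes Linux with Glibc. The handler fields and API names follow the description above, but the bodies are an illustration under stated assumptions, not the author's original files. The assumptions of this sketch are: the entry point is named `coro_entry_point` (without the leading underscore), the handler has an extra `stack` field, `CORO_STACK_SIZE` is an arbitrary 64 KiB, and the handler pointer is handed to `makecontext` as two `int`-sized halves because `makecontext` only passes `int` arguments. `hello_world` is a sample coroutine with one suspension point.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <ucontext.h>

#define CORO_STACK_SIZE (64 * 1024)   /* assumed stack size for this sketch */

typedef struct coro coro_t;
typedef int (*coro_function_t)(coro_t *coro);

struct coro {
    coro_function_t function;     /* user-supplied coroutine function */
    ucontext_t suspend_context;   /* resumer's context, saved by coro_resume */
    ucontext_t resume_context;    /* coroutine's context, saved by coro_yield */
    int yield_value;              /* last yielded value, or the final return value */
    bool is_coro_finished;        /* set once `function` has returned */
    void *stack;                  /* assumption: kept separately so it can be freed */
};

/* Suspend the coroutine and hand `value` to whoever called coro_resume. */
void coro_yield(coro_t *coro, int value) {
    coro->yield_value = value;
    swapcontext(&coro->resume_context, &coro->suspend_context);
}

/* Run the coroutine until its next yield or return; -1 once it has finished. */
int coro_resume(coro_t *coro) {
    if (coro->is_coro_finished) return -1;
    swapcontext(&coro->suspend_context, &coro->resume_context);
    return coro->yield_value;
}

/* First jump target: runs the user function, then marks the handler finished. */
static void coro_entry_point(unsigned int lo, unsigned int hi) {
    coro_t *coro = (coro_t *)(uintptr_t)(((uint64_t)hi << 32) | lo);
    int return_value = coro->function(coro);
    coro->is_coro_finished = true;
    coro_yield(coro, return_value);   /* deliver the final value; never resumed again */
}

coro_t *coro_new(coro_function_t function) {
    coro_t *coro = calloc(1, sizeof *coro);
    void *stack = malloc(CORO_STACK_SIZE);
    if (coro == NULL || stack == NULL || getcontext(&coro->resume_context) == -1) {
        free(stack);
        free(coro);
        return NULL;
    }
    coro->function = function;
    coro->stack = stack;
    coro->resume_context.uc_stack.ss_sp = stack;
    coro->resume_context.uc_stack.ss_size = CORO_STACK_SIZE;
    coro->resume_context.uc_link = NULL;   /* the entry point never returns normally */

    uint64_t bits = (uint64_t)(uintptr_t)coro;   /* split the pointer into two halves */
    makecontext(&coro->resume_context, (void (*)(void))coro_entry_point, 2,
                (unsigned int)(bits & 0xffffffffu), (unsigned int)(bits >> 32));
    return coro;
}

void coro_free(coro_t *coro) {
    free(coro->stack);
    free(coro);
}

/* Sample coroutine: yields 1 at its suspension point, then returns 2. */
int hello_world(coro_t *coro) {
    puts("Hello");
    coro_yield(coro, 1);
    puts("World");
    return 2;
}
```

With a handler created by `coro_new(hello_world)`, the first `coro_resume` prints "Hello" and returns 1, the second prints "World" and returns 2, and any further call returns -1 until the handler is released with `coro_free`.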
- The use case is pretty straightforward:
- First, you create a coroutine handler.
- Then, you start/resume the actual coroutine function with the help of the same coroutine handler.
- Whenever your actual coroutine function encounters a call to `coro_yield`, it will suspend the execution and return the value passed as the 2nd argument of `coro_yield`.
- And when the actual coroutine function has finished executing completely, the next call to `coro_resume` will return `-1` to indicate that the coroutine handler object is no longer valid and its lifetime has expired.
- So, you see, `coro_resume` is a wrapper around our coroutine function that executes `hello_world` in parts (by context switching, obviously).
- I have tested this example in WSL with GCC 9.3.0 & Glibc 2.31.
$ gcc -I./ coroutine_example.c coroutine.c -o myapp && ./myapp
You see, there is no magic once you understand how the CPU executes code, given that Glibc provides a rich set of context switching APIs. And, from the perspective of a low-level developer, a coroutine is merely a well-arranged set of context switching function calls that would be difficult to organize and maintain if used raw.
My intention here was to lay the foundation for C++20 coroutines, because I believe that if you see the code from the CPU's and the compiler's point of view, then everything becomes easy to reason about in C++.
See you next time with my C++20 Coroutine post!
Published at DZone with permission of Vishal Chovatiya. See the original article here.