Ballerina Concurrency Model and Non-Blocking I/O
Take an in-depth look at Ballerina's concurrency model and how it provides the foundation for the inherent non-blocking I/O support available in the language.
Introduction
The Ballerina programming language has a unique concurrency model that promotes efficient resource usage and provides an intuitive programming model for users. Its concurrency model is also critical to the non-blocking I/O support provided with the communication protocols. In this article, we will take an in-depth look into Ballerina’s concurrency support, and see how the non-blocking I/O operations are implemented on top of this.
Let’s first take a look at the general concurrency constructs provided by an operating system and how they work, and then move on to the concurrency primitives provided by Ballerina.
Multitasking With Processes and Threads
The operating system (OS) provides us with the general constructs of processes and threads. A single program, or process, can contain one or more threads. A thread of execution is scheduled on a specific CPU core by the OS scheduler, so from the OS’s point of view, a thread is the most primitive execution construct available to our programs. There can, of course, be more threads than CPU cores in a machine.
The execution of these threads happens in a preemptive manner using the OS scheduler, where the threads currently running on CPU cores are interrupted at time intervals to time-share with the other threads in the system. This provides us with the illusion that all processes and threads are executed concurrently. We humans do not detect this pausing and resuming of execution; especially in interactive applications, we see everything executing at the same time.
A thread pausing its work on a CPU core and another thread resuming its execution there is called a context switch. This is a somewhat expensive operation, since the current thread’s full state needs to be stored in memory, and the other thread’s earlier stored state has to be read in and restored before it can start executing. A high number of unwanted context switches is bad for performance, so generally, we try to keep context switches low. The best way to do this is to keep the number of CPU-bound threads equal to the number of CPU cores. This way, the busiest threads will mostly stay scheduled on a CPU core and will rarely be preempted in favor of other, lower-priority threads in the system.
So if the ideal scenario is limiting the number of threads to the CPU core count, how can we scale our workloads for concurrent execution? This can be done by having a user-space scheduler switch between concurrent executions. A user-space scheduler generally has lower overhead than the OS scheduler’s context switching. Ballerina takes this approach when implementing its concurrency primitives.
Ballerina Concurrency With Strands
Ballerina implements a user-space scheduler and uses an execution construct known as a strand. A strand is a lightweight thread of execution, which is cooperatively multitasked. Cooperative multitasking is different from preemptive multitasking done in the OS scheduler as it relies on the running execution itself to yield its CPU time to another execution. So rather than it being forcefully stopped by the scheduler, the execution itself signals to the scheduler that it's ready to give up its execution time and willingly hands over its OS-level thread to a different concurrent execution in the application. In Ballerina, there are certain implicit yield points that trigger this, such as communication between concurrent executions, sleep operations, and other library function calls that may signal a yield. This concurrency pattern is also known as coroutines.
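For example, a sleep call from the ballerina/lang.runtime module is one such yield point: a sleeping worker hands its thread over to other strands in its group instead of blocking it. The following is a minimal sketch of this behavior (the exact interleaving of the printed lines is up to the scheduler, so no particular output order is guaranteed):

```ballerina
import ballerina/io;
import ballerina/lang.runtime;

public function main() {
    worker w1 returns int {
        io:println("w1: working");
        // Yield point: the strand willingly gives up its thread here,
        // allowing another strand in the same group to run.
        runtime:sleep(0.1);
        io:println("w1: resumed");
        return 1;
    }

    worker w2 returns int {
        io:println("w2: working");
        runtime:sleep(0.1);
        io:println("w2: resumed");
        return 2;
    }

    int r1 = wait w1;
    int r2 = wait w2;
    io:println("Total: ", r1 + r2);
}
```

Both workers make progress even if they share a single OS thread, because each sleep yields the thread rather than holding it.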
A group of strands belongs to an OS thread. At any given moment, a single strand in its group is actively executing in the thread it belongs to. The strands in the same group cooperatively multitask in sharing the execution thread. In Ballerina code, a worker represents a single strand of execution in a function execution. Every function in Ballerina has one or more workers. The function will always have a default worker, and optionally more named workers. If there are no explicitly defined workers, the code in the function belongs to the default worker. All the workers in a function are concurrently executed, where each worker is assigned its own strand. The default worker in a function is executed in the same strand as the worker of the function it was called from.
Let’s take a look at some sample Ballerina code and see how its execution is modeled using workers and how strands are mapped in the runtime:
import ballerina/io;

public function main() returns error? {
    int x = 2 * 5;
    printNumber(x);
    io:println("Done");
}

public function printNumber(int x) {
    io:println(x);
}
Listing 1
The Ballerina program execution above starts in the main function. A new strand is created, and the main function’s default worker starts executing; the default worker of the main function is represented by its full function body. When the worker invokes the printNumber function, the current worker in the main function is suspended, and its strand moves on to executing the default worker of the printNumber function. After the printNumber function finishes its instructions, its worker terminates, the main function’s default worker is made active again, and the strand resumes execution at the statement following the call.
So this is a simple case where only a single strand is executing the workers in our Ballerina functions. Let’s see a scenario of having explicitly named workers in a function.
import ballerina/io;

public function main() {
    int a = 2;
    int b = 5;
    process(a, b);
    io:println("Done");
}

public function process(int a, int b) {
    io:println("A: ", a, " B: ", b);

    worker w1 returns int {
        int result = a + b;
        return result;
    }

    worker w2 returns int {
        int result = a * b;
        return result;
    }

    int addResult = wait w1;
    int mulResult = wait w2;
    io:println("Add result: ", addResult);
    io:println("Multiplication result: ", mulResult);
}
Listing 2
In this scenario, the function process has three workers. The default worker consists of the statements outside the named worker blocks: the initial io:println call and the wait and print statements at the end. There are two additional named workers, w1 and w2. A worker in Ballerina optionally returns a value; if a return type is not mentioned, it implicitly returns nil (the () value). The return type of a function is the return type of its default worker. In this execution, the strand of the default worker of the process function creates new strands for the execution of workers w1 and w2. When a new strand is created, under the default policy, it is added to the same group as its parent strand (the strand that created it). So by the time the wait actions are reached, we have three active strands: one executing the default worker of the process function, and two others executing workers w1 and w2.
A function returns when its default worker returns a value and terminates its execution. The state of any other named workers does not affect this behavior; these workers may already be terminated, or can still be running, when the function returns. In this case, we explicitly wait for the workers to terminate and retrieve their return values using the wait action.
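Besides waiting on a single worker, the wait action can also wait for several workers at once, collecting the results into a record keyed by worker name; an alternative form, wait w1 | w2, returns whichever result becomes available first. A brief sketch of the wait-for-all form, reusing the add/multiply workers from Listing 2:

```ballerina
import ballerina/io;

public function main() {
    int a = 2;
    int b = 5;

    worker w1 returns int {
        return a + b;
    }

    worker w2 returns int {
        return a * b;
    }

    // Wait for both workers in one action; each field of the record
    // holds the return value of the correspondingly named worker.
    record {int w1; int w2;} results = wait {w1, w2};
    io:println("Add result: ", results.w1);
    io:println("Multiplication result: ", results.w2);
}
```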
In a multiple worker scenario, communication between them can be done using a message passing mechanism. This is done using worker send (->) and receive (<-) actions.
import ballerina/io;

public function main() {
    int a = 2;
    int b = 5;
    process(a, b);
    io:println("Done");
}

public function process(int a, int b) {
    io:println("A: ", a, " B: ", b);

    worker w1 returns int {
        int result = a + b;
        result -> w2;
        return result;
    }

    worker w2 returns int {
        int input = <- w1;
        int result = input * input;
        return result;
    }

    int r1 = wait w1;
    int r2 = wait w2;
    io:println("R1: ", r1);
    io:println("R2: ", r2);
}
Listing 3
Here, worker w1 performs a worker send action (result -> w2), which sends the value result to worker w2. Worker w2 contains the corresponding worker receive action (<- w1) to read in the value sent by worker w1. To reduce the chance of deadlocks, the compiler statically analyzes the code to verify that every worker send is matched with a worker receive in the recipient worker.
Non-Blocking I/O
Ballerina’s concurrency model shines in its support for non-blocking I/O operations. The coroutine-based execution allows I/O operations to avoid blocking execution threads: instead, the function execution is suspended and resumed once its I/O operation completes. All of this is possible without resorting to an obscure callback-based or reactive programming model. Let’s use an example to demonstrate this behavior.
import ballerina/http;
import ballerina/io;

public function main() returns error? {
    http:Client httpClient = check new ("https://api.mathjs.org");
    string response = <string> check httpClient->get(
        "/v4/?expr=7*8", targetType = string);
    io:println("Result: ", response);
}
Listing 4
Here, we only have the default worker in the main function, so there is only one strand executing. In this scenario, we use an HTTP client to make a network request to a remote endpoint. When we invoke the get remote method, it performs a non-blocking I/O request over HTTP. The pending I/O operation is handed over to the OS, and the executing strand yields its thread. With this yielding, we effectively pause/suspend our worker execution, and the execution thread is released to be used by other strands. This behavior makes sure we use OS threads optimally by keeping fewer threads active, resulting in fewer context switches and better efficiency.
Ballerina keeps a separate thread pool for I/O operations that cannot be implemented in a non-blocking manner (some database connectors, etc.), since these will block the calling thread.
The non-blocking I/O scenario will be more apparent when we use multiple workers in our code. This is demonstrated below.
import ballerina/http;
import ballerina/io;

public function main() returns error? {
    http:Client httpClient = check new ("https://api.mathjs.org");

    worker w1 returns string|error {
        string response = <string> check httpClient->get(
            "/v4/?expr=7*8", targetType = string);
        return response;
    }

    worker w2 returns string|error {
        string response = <string> check httpClient->get(
            "/v4/?expr=2*5", targetType = string);
        return response;
    }

    string r1 = check wait w1;
    string r2 = check wait w2;
    io:println("R1: ", r1);
    io:println("R2: ", r2);
}
Listing 5
Here, in the main function, we have the default worker and two named workers, which results in three execution strands. By default, all the strands belong to the same group, thus a single execution thread is assigned to execute the strands. Let’s take a look at the ordering of instruction execution in this scenario.
- Create strand s1 for the default worker.
- Create strand s2 for worker w1.
- Create strand s3 for worker w2.
- Strand s1 is active and executes instructions in the default worker.
- Workers w1 and w2 are still suspended, since their strands do not have a free thread to execute their instructions; the only thread in the group is occupied by strand s1.
- When the wait action for w1 is executed, s1 yields its execution, thus suspending the default worker.
- The execution thread is now free to schedule any other pending strand. Let’s assume strand s2 is scheduled, thus executing the operations in worker w1.
- Worker w1 executes its non-blocking I/O operation (the get remote method call), which results in its strand yielding. This allows strand s3 to execute and perform its non-blocking I/O operation, which in turn yields s3 and suspends its worker.
- At this point, all the workers are suspended, and the execution thread of the strand group is released. However, both network calls have now been made concurrently to the remote endpoints, and both workers are waiting at the same time for their results to be communicated back.
- When one of the I/O operations returns, the corresponding strand is again scheduled on a free execution thread, and its code is executed to return from the worker.
As we can see from the flow above, the I/O operations are executed concurrently without a single I/O operation blocking the execution thread until its result is received.
As we now know, by default, all the strands created from a calling function (the parent) belong to the same group, and are thus all executed on a single thread. However, if the workers perform CPU-bound operations rather than I/O operations, and these need to run concurrently, then their strands need to execute on their own threads. This can be done by directing the runtime to create a strand with its own strand group instead of inheriting the group from its parent strand. Let’s take a look at some sample code that does this.
import ballerina/io;

public function main() {
    @strand {thread: "any"}
    worker w1 returns int {
        int result = 0;
        foreach var i in 1...10000000 {
            result += i;
        }
        io:println("R1: ", result);
        return result;
    }

    @strand {thread: "any"}
    worker w2 returns int {
        int result = 0;
        foreach var i in 10000001...20000000 {
            result += i;
        }
        io:println("R2: ", result);
        return result;
    }

    int r1 = wait w1;
    int r2 = wait w2;
    io:println("R1 + R2: ", r1 + r2);
}
Listing 6
$ bal run demo.bal
Compiling source
demo.bal
Running executable
R1: 50000005000000
R2: 150000005000000
R1 + R2: 200000010000000
In the code above, we have used the @strand annotation to set the property that allows the runtime to schedule each worker’s strand on any available thread. The default value of this property signals the strand to be scheduled on the same thread as its parent strand (that is, in the same group as the parent strand). On a multi-core CPU, with this annotation, our execution is faster, since the runtime allocates two execution threads to run workers w1 and w2, thus executing them in parallel. Without it, the workers would be executed one after the other, and for a non-I/O scenario, it wouldn’t make much sense to do the execution in this manner.
In Ballerina, the start action provides a convenient way to perform a function invocation on its own dedicated strand, returning its default worker’s eventual result as a future. The following shows an example of this scenario.
import ballerina/io;

public function main() {
    future<int> f1 = start calc(0, 10000000);
    future<int> f2 = start calc(10000001, 20000000);
    int r1 = wait f1;
    int r2 = wait f2;
    io:println("R1 + R2: ", r1 + r2);
}

function calc(int a, int b) returns int {
    int result = 0;
    foreach var i in a...b {
        result += i;
    }
    io:println("R: ", result);
    return result;
}
Listing 7
$ bal run demo.bal
Compiling source
demo.bal
Running executable
R: 150000005000000
R: 50000005000000
R1 + R2: 200000010000000
In the example above, we implement a scenario similar to Listing 6. However, all the strands created for the workers belong to a single group and thus share a single thread. This behavior can again be customized by using the @strand annotation in the following manner:
future<int> f1 = @strand {thread: "any"} start calc(0, 10000000);
future<int> f2 = @strand {thread: "any"} start calc(10000001, 20000000);
This makes sure the strands generated for the default worker executions of the calc function invocations are created in their own groups, each with its own execution thread.
Conclusion
In this article, we went through the concurrency model of Ballerina. We explored some of the aspects that make it unique, and saw how it provides a user-friendly programming experience along with efficient execution of both computational and I/O-bound operations.
For more information on the Ballerina language and its features, refer to the resources on the Ballerina Learn Page.
Published at DZone with permission of Anjana Fernando. See the original article here.