Erlang: Concurrency


Erlang's concurrency model is more opinionated than that of other languages, where constructs are added in newer versions and many models can coexist. For example, the C language and its libraries support everything from threading to forking new processes.

Erlang's opinions on concurrency are:

  • Multiple processes may run in parallel at all times, even thousands of them. These processes may fail, but they fail independently.
  • There is no shared data: each process is a little computer with its own resources. With no shared data, there are no race conditions over shared memory.
  • Processes communicate via explicit messages as their only link. It's a bit like sending packets on a network, and in fact Erlang is widespread in the telecommunications world.
  • As a consequence, communication is asynchronous: even what is known as a remote procedure call must be implemented via a pair of messages, one for the call and one for the return value.
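The last point can be sketched in a few lines. This is only an illustration under stated assumptions: the module name rpc_sketch, the doubling server and the {call, ...} / {reply, ...} message shapes are hypothetical and not part of the article's code.

```erlang
-module(rpc_sketch).
-export([start/0, double/2]).

%% Start a hypothetical server process that doubles numbers.
start() -> spawn(fun loop/0).

%% Server side: wait for a request message, answer with a reply message.
loop() ->
    receive
        {call, From, Ref, X} ->
            From ! {reply, Ref, 2 * X},
            loop()
    end.

%% Client side: the "procedure call" is one message out and one message back.
%% The unique reference lets the caller recognize the reply to this request.
double(Server, X) ->
    Ref = make_ref(),
    Server ! {call, self(), Ref, X},
    receive
        {reply, Ref, Result} -> Result
    end.
```

From the caller's point of view double/2 looks synchronous, but underneath it is just a send followed by a blocking receive.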

Note that there's very little overhead for those thousands of processes, since they are not provided by the OS but by the Erlang runtime system.
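To get a feeling for how cheap these processes are, here is a minimal sketch that spawns N of them and waits for each to report back. The module name many_sketch and the {alive, Index} message are assumptions made for this illustration only.

```erlang
-module(many_sketch).
-export([spawn_many/1]).

%% Spawn N processes; each sends one message to the parent and exits.
%% Returns the number of replies collected, which should equal N.
spawn_many(N) ->
    Parent = self(),
    lists:foreach(
      fun(I) -> spawn(fun() -> Parent ! {alive, I} end) end,
      lists:seq(1, N)),
    count_replies(N, 0).

%% Block until all the expected replies have arrived.
count_replies(0, Count) -> Count;
count_replies(Remaining, Count) ->
    receive
        {alive, _Index} -> count_replies(Remaining - 1, Count + 1)
    end.
```

Calling many_sketch:spawn_many(10000) from the shell is a quick way to try it; the value 10000 is arbitrary, chosen to stay well below the runtime's default process limit.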

Erlang thus follows the actor model. In this model, everything is an actor. The saying is similar to "everything is an object", but execution is primarily concurrent, instead of sequential by default as in OO applications.

An actor can, at any time:

  • Send messages to other actors whose names it knows.
  • Receive a message and act on it.
  • Create new actors.

Compare this model with objects that send messages to each other, and create new objects.
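The three capabilities can be shown in a single small actor. This is a sketch, not code from the series: the module name actor_sketch and the echo, new_actor and stop messages are invented for the illustration.

```erlang
-module(actor_sketch).
-export([start/0]).

%% Create a new actor running the loop below and return its pid.
start() -> spawn(fun loop/0).

loop() ->
    receive                                 % receive a message and act on it
        {echo, From, Text} ->
            From ! {echoed, Text},          % send a message to a known actor
            loop();
        {new_actor, From} ->
            Child = spawn(fun loop/0),      % create a new actor
            From ! {created, Child},
            loop();
        stop ->
            ok
    end.
```

Unlike an object, the actor does nothing until a message arrives, and whoever sent the message is free to continue with other work in the meantime.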

An example

Concurrency in OO languages like Java is full of synchronized keywords guarding access to shared data structures, but in the actor model the communication channels between lines of code belonging to different processes are explicit. Even when the processes are local, any data passing from one to the other is forced through a construct with a special syntax.

In this example, we want to calculate the distance between two points, each represented as a list of numbers. To speed things up (or just to experiment with concurrency primitives), we can parallelize along the number of dimensions involved. The number of dimensions is just 2 or 3 in 2D or 3D graphics, but it is considerable in other applications like machine learning, where it is common to represent entities as points in a space of thousands of dimensions.

A specification for framing the problem

Let's start with our test case. We want to expose a synchronous behavior, to simplify writing tests:

distance_parallel_test() ->
    ?assertEqual(0.0, distance_parallel([1], [1])),
    ?assertEqual(1.0, distance_parallel([2], [3])),
    ?assertEqual(5.0, distance_parallel([0, 0], [3, 4])),
    ?assertEqual(2.0, distance_parallel([1, 2, 3, 4], [2, 3, 4, 5])).

The Euclidean distance is the length of the difference vector, which is obtained with Pythagoras's theorem: the square root of the sum of the squared differences along each dimension.
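Written out, with p and q as the two points and N as the number of dimensions (symbols chosen here for illustration; the article itself does not name them):

```latex
d(p, q) = \sqrt{\sum_{i=1}^{N} (p_i - q_i)^2}
```

For the last test case above, each of the four differences is 1, so the sum of squares is 4 and the distance is 2.0.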

We can implement distance_parallel/2 in three phases:

  • First we spawn N new processes, one for each dimension the points have.
  • Then we collect N responses from these processes, each representing the square of the difference between the two values given to it.
  • Finally we sum the responses and extract the square root.

We could also parallelize the third step, but this is our first attempt at concurrency in Erlang and that would make the example more obscure.

This is distance_parallel/2's skeleton:

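distance_parallel(FirstPoint, SecondPoint) ->
    parallelize_dimensions(FirstPoint, SecondPoint),
    Squares = collect(length(FirstPoint)),
    %% third phase: sum the squared differences and take the square root
    math:sqrt(lists:sum(Squares)).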


parallelize_dimensions/2 takes the first remaining coordinate of each of the two points, spawns a new process to work on that pair, and recurses on the remaining dimensions.

parallelize_dimensions([], []) -> ok;
parallelize_dimensions([HeadFirst|TailFirst], [HeadSecond|TailSecond]) ->
    Pid = self(),
    spawn(fun() -> difference_squared(HeadFirst, HeadSecond, Pid) end),
    parallelize_dimensions(TailFirst, TailSecond).

spawn() is passed an anonymous function consisting of a call to difference_squared, which is passed the two relevant numbers and the id of the main process.

difference_squared(X1, X2, Destination) ->
    Difference = X1 - X2,
    Result = math:pow(Difference, 2),
    Destination ! {dimension, Result}.

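The process id (pid) is a value uniquely identifying a process; the current process obtains its own with self() and can then pass it to the child process, like we would do with an object reference. Note that self() is called before spawn and stored in Pid: evaluated inside the anonymous function, it would return the child's own pid instead. The child could instead look the parent up under a globally registered name, but it's easier to avoid singleton thinking from the start.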

A message is sent to another process with the ! operator, applied to a process id. Messages are usually tuples whose first element is an atom acting as a tag, so that different types of messages can be distinguished just by matching on the first element.
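As a small illustration of tagging, the sketch below dispatches on the first element of the incoming tuple. The module name tags_sketch, the {error, Reason} message and the one-second timeout are assumptions for this example; only the {dimension, Result} shape comes from the code above.

```erlang
-module(tags_sketch).
-export([describe_next/0]).

%% Wait for the next matching message and dispatch on its tag.
describe_next() ->
    receive
        {dimension, Value} -> {got_dimension, Value};
        {error, Reason}    -> {got_error, Reason}
    after 1000 ->
        %% give up if nothing matching arrives within one second
        timeout
    end.
```

A process can send a message to itself to try this from the shell, for example self() ! {dimension, 9.0} followed by tags_sketch:describe_next().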

Collection of results

We also need to receive the messages on the main process. A tail-recursive function can do this, but the focus is on the receive construct.

collect(N) ->
    collect(N, []).
collect(0, SquaresAccumulator) -> SquaresAccumulator;
collect(N, SquaresAccumulator) ->
    %% block until the next {dimension, ...} message arrives
    receive
        {dimension, Result} -> done
    end,
    collect(N - 1, lists:append(SquaresAccumulator, [Result])).

The receive construct works via pattern matching, and in this case it only accepts tuples tagged with the dimension atom; other messages are not consumed and stay in the process mailbox. Here we are just binding Result to part of the content of the message, but instead of returning done we could execute code inside the construct itself.
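A possible variant, sketched here under assumptions rather than taken from the series repository, moves the recursive call inside the receive clause and adds a timeout so that a missing reply cannot block the caller forever. The module name collect_sketch, the TimeoutMs parameter and the {error, timeout} return value are hypothetical.

```erlang
-module(collect_sketch).
-export([collect/2]).

%% Collect N {dimension, Result} messages, waiting at most TimeoutMs
%% milliseconds for each one. Results are returned in arrival order.
collect(N, TimeoutMs) ->
    collect(N, TimeoutMs, []).

collect(0, _TimeoutMs, Acc) ->
    lists:reverse(Acc);
collect(N, TimeoutMs, Acc) ->
    receive
        {dimension, Result} ->
            %% the work happens inside the clause; prepending and reversing
            %% at the end avoids copying the accumulator at every step
            collect(N - 1, TimeoutMs, [Result | Acc])
    after TimeoutMs ->
        {error, timeout}
    end.
```

Messages that do not match the {dimension, Result} pattern are left untouched in the mailbox and can be picked up later by another receive.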


That's it: creating basic, local processes in Erlang is a matter of combining spawn/1, self(), the ! operator and the receive construct. Once you get the code working it's easy to marvel at, though these pieces may be too heterogeneous to digest immediately.

All the code is available for your tinkering at the repository for this series.

