First Look at Akka Streams: A Powerful Library for Composable Data Streams

Akka creator Jonas Bonér sat down with DZone to give our readers a download on the general availability of Akka Streams.

A year ago, Typesafe announced an early preview of Akka Streams, an open source implementation of the Reactive Streams specification using Akka Actors under the hood. Akka creator Jonas Bonér sat down with DZone to give our readers a download on the general availability of Akka Streams, and the features and functionality that he believes are going to be most exciting to developers creating applications that participate in streaming architectures.

A DSL for Data Streaming 

Akka Streams features a visually appealing DSL for describing data flow. It lets developers wire up their stream endpoints and transformation stages as a graph-processing “blueprint”: a staged “executable specification” that (as in the example below) takes streaming inputs and outputs and, in two lines of code, captures a complex graph as a reusable and composable object. The DSL allows streams to be introspected, passed around, composed and reused.

As an example, a “whiteboard sketch” of a processing graph (the image is not reproduced here) can be mapped to the following “executable blueprint”:

in ~> f1 ~> bcast ~> f2 ~> merge ~> f3 ~> out
            bcast ~> f4 ~> merge

Here is the full sample, including setting up the context, the Source and Sink, and a set of simple transformations:

import akka.stream.scaladsl._

// set up the context (the builder's type is inferred)
val g = FlowGraph.closed() { implicit builder =>
  import FlowGraph.Implicits._

  // create the Source and Sink
  val in = Source(1 to 10)
  val out = Sink.ignore

  // create the fan out and fan in stages
  val bcast = builder.add(Broadcast[Int](2))
  val merge = builder.add(Merge[Int](2))

  // create a set of transformations
  val f1, f2, f3, f4 = Flow[Int].map(_ + 10)

  // define the graph/stream processing blueprint
  in ~> f1 ~> bcast ~> f2 ~> merge ~> f3 ~> out
              bcast ~> f4 ~> merge
}
// g is only a blueprint: nothing flows until a Materializer runs it
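
To make the data flow concrete, the following plain-Scala sketch traces what each element would look like as it passes through the graph above. It uses only the standard library and is not Akka Streams code. The object name `GraphTraceSketch` and the `trace` helper are made up for illustration. The real graph discards its results in `Sink.ignore`, and the order in which a merge stage interleaves its two inputs is not fixed, so the ordering here is just one possibility.

```scala
object GraphTraceSketch {
  // Stand-ins for the article's f1..f4, which all add 10 to an Int.
  val addTen: Int => Int = _ + 10
  val f1, f2, f3, f4 = addTen

  // Follows one element at a time through:
  // f1 -> broadcast -> (f2 | f4) -> merge -> f3
  def trace(inputs: Seq[Int]): Seq[Int] =
    inputs.flatMap { n =>
      val afterF1 = f1(n)
      // Broadcast: both branches receive the same element.
      val branches = Seq(f2(afterF1), f4(afterF1))
      // Merge: both branch outputs continue downstream through f3.
      branches.map(f3)
    }
}
```

Under these assumptions, `GraphTraceSketch.trace(1 to 10)` yields twenty values: each input `n` shows up twice as `n + 30`, once per branch.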

Higher Level than Actors

Snippets in the DSL are workflow pieces that allow easy definition of what should happen through different stages. Actors are extremely powerful (a model of general computation), but are fairly low-level and suffer from a lack of type safety, making them difficult to compose into workflows. Akka Streams is designed to provide a higher-level abstraction on top of actors for defining your workflows in a type-safe and resource-safe manner. Developers working on streaming data into applications need to move data from point A to point B and do transformations in between, or capture data from A and B and then merge, enrich and pipe it to C. Akka Streams lets you do all this in a very clean, intuitive and fully type-safe way, allowing you to focus on the high-level workflow and business rules of the application without getting distracted by low-level details and protocol design.

Decoupling Definition from Execution

Bonér says one of the design goals for Akka Streams was to decouple what is being executed from how and where it is being executed. To that end, Akka Streams features what are called Materializers (engines for execution, if you will). The default Materializer maps each transformation stage to an Actor, does asynchronous handoffs between the stages, and allows each stage (Actor) to run concurrently, possibly even on a different core. What is important to understand is that this decoupling enables different pluggable Materializer implementations. For example, transformation stages can be fused together (using “Stream Fusion”) to run on a single thread of execution, or you can take it in a different direction and distribute a graph across multiple machines, where you get the benefit of Akka Persistence and Akka Cluster for durability and scalability.
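
The idea of separating a description from its execution can be illustrated with a toy model in plain Scala. This is a minimal sketch using only the standard library; `Blueprint`, `ToyMaterializer`, `FusedMaterializer` and `AsyncMaterializer` are hypothetical names invented here and are not Akka's Materializer API. The same blueprint is handed to two different execution engines, one running all stages fused on the calling thread and one handing each stage off asynchronously to a thread pool.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object BlueprintSketch {
  // A blueprint is plain data: an ordered list of stages.
  // Building it runs nothing.
  final case class Blueprint(stages: List[Int => Int]) {
    def via(stage: Int => Int): Blueprint = Blueprint(stages :+ stage)
  }

  // A toy "materializer" decides how and where the stages execute.
  trait ToyMaterializer {
    def run(blueprint: Blueprint, input: List[Int]): List[Int]
  }

  // Fused: all stages are composed into one function on the caller's thread.
  object FusedMaterializer extends ToyMaterializer {
    def run(blueprint: Blueprint, input: List[Int]): List[Int] = {
      val fused = blueprint.stages.foldLeft((x: Int) => x)(_ andThen _)
      input.map(fused)
    }
  }

  // Asynchronous: each stage is scheduled on the global thread pool.
  object AsyncMaterializer extends ToyMaterializer {
    def run(blueprint: Blueprint, input: List[Int]): List[Int] = {
      val results = input.map { element =>
        blueprint.stages.foldLeft(Future.successful(element)) { (acc, stage) =>
          acc.map(stage) // asynchronous handoff to the next stage
        }
      }
      Await.result(Future.sequence(results), 10.seconds)
    }
  }

  // Defined once, executable by either engine.
  val blueprint: Blueprint = Blueprint(Nil).via(_ + 10).via(_ * 2)
}
```

Both engines produce the same result for the same blueprint; only the execution strategy differs, which is the property the pluggable Materializer design relies on.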

Built with Type Safety

Actors are extremely powerful and flexible, but one drawback is that they suffer from a lack of type safety, making them hard to compose. One benefit of using the higher-level Akka Streams library is that it is fully type-safe. Its processing stages are like typed functions and can be reused and composed in a similar fashion, just like Lego blocks, giving you fully type-safe graph-processing “blueprints”.
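
The “typed functions” analogy can be sketched in a few lines of plain Scala. This is an illustrative model only: `Stage`, `via` and `runOn` are hypothetical names defined here with the standard library, not Akka Streams types, although Akka Streams' own `Flow` carries its input and output element types in a comparable way. The point is that a mismatched connection is rejected by the compiler rather than discovered at runtime.

```scala
object TypedStagesSketch {
  // A stage is a typed function from In to Out.
  final case class Stage[In, Out](f: In => Out) {
    // Only a stage whose input type matches this stage's output type fits.
    def via[Next](next: Stage[Out, Next]): Stage[In, Next] =
      Stage[In, Next](f andThen next.f)

    def runOn(elements: List[In]): List[Out] = elements.map(f)
  }

  val parse  = Stage[String, Int](_.trim.toInt)
  val addTen = Stage[Int, Int](_ + 10)
  val render = Stage[Int, String](n => s"value=$n")

  // String -> Int -> Int -> String: every joint is checked at compile time.
  val pipeline: Stage[String, String] = parse.via(addTen).via(render)

  // render.via(addTen) would not compile: render produces String,
  // but addTen expects Int.
}
```

For instance, `TypedStagesSketch.pipeline.runOn(List(" 1", "32 "))` evaluates to `List("value=11", "value=42")`.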

Stream Transformation Building Blocks

Out of the box, Akka Streams offers a whole suite of predefined, so-called stream “combinators” (transformation stages) for splitting, transforming and merging streams, and more.

Built on the Reactive Streams Specification, for Resilience and Interoperability

Akka Streams is an implementation of the Reactive Streams specification, which deals with some of the main challenges for developers working with streaming data, both in terms of real-time content delivery and bulk data transfers. The difficulty in these scenarios is to keep the data flowing while limiting the resources that are consumed on the systems through which the stream passes. The crucial ingredient in getting this right is applying flow control. Reactive Streams protects each consumer of data from being overwhelmed by its producer by propagating “back pressure”, which is key for resilience and graceful degradation. Akka Streams’ support for Reactive Streams also means interoperability with any other streaming technology built on top of the Reactive Streams spec.
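
Back pressure is easier to picture with a small model of demand signalling. The sketch below is a simplified, synchronous illustration in plain Scala; the `Publisher`, `Subscriber` and `BoundedConsumer` types are hypothetical stand-ins written for this example and are not the Reactive Streams interfaces, which are asynchronous and carry additional signals. What it does show is the core rule: the producer emits only as many elements as the consumer has asked for.

```scala
import scala.collection.mutable

object BackPressureSketch {
  trait Subscriber {
    def onNext(element: Int): Unit
  }

  // Emits only as many elements as have been requested, never more.
  final class Publisher(source: Iterator[Int], subscriber: Subscriber) {
    private var emitted = 0

    def request(n: Int): Unit = {
      var remaining = n
      while (remaining > 0 && source.hasNext) {
        subscriber.onNext(source.next())
        emitted += 1
        remaining -= 1
      }
    }

    def emittedSoFar: Int = emitted
  }

  // A consumer with a bounded buffer; it signals demand only for free slots.
  final class BoundedConsumer(capacity: Int) extends Subscriber {
    val buffer: mutable.Queue[Int] = mutable.Queue.empty[Int]

    def onNext(element: Int): Unit = {
      require(buffer.size < capacity, "publisher emitted more than was requested")
      buffer.enqueue(element)
    }

    def freeSlots: Int = capacity - buffer.size

    // Simulates processing one buffered element, which frees one slot.
    def processOne(): Option[Int] =
      if (buffer.nonEmpty) Some(buffer.dequeue()) else None
  }
}
```

With a `BoundedConsumer` of capacity 2 and a `Publisher` over the unbounded `Iterator.from(1)`, calling `publisher.request(consumer.freeSlots)` delivers exactly two elements. A further request while the buffer is full asks for zero and delivers nothing, and the stream only advances again after `processOne()` frees a slot. The fast producer is paced by the slow consumer instead of overrunning it.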

Bonér’s long-term vision for Akka Streams is to extend its capabilities as an integration “bus” for “all things streaming”. He cites Apache Camel and Mule as great technologies, but because they are synchronous they face challenges leveraging concurrency: one flawed endpoint can stall the whole data pipeline or, in the worst case, take down the whole bus. Akka Streams does things in a fully asynchronous, back-pressured, loosely coupled fashion, and Bonér believes that, since it is built on the firm ground that Akka provides, it will give users the performance and resilience needed, even under heavy load, to put the technology at the center of integrating streaming systems in the longer term.
