Stateful Functions: An Open Source Framework for Lightweight, Stateful Applications at Scale

In this article, we discuss Stateful Functions, a recent open source framework designed to reduce the complexity of stateful applications at scale.

At Flink Forward Europe 2019, Stephan Ewen from Ververica announced the release of Stateful Functions, an open-source framework that reduces the complexity of building and orchestrating distributed stateful applications at scale. Stateful Functions brings together the benefits of stream processing with Apache Flink and Function-as-a-Service (FaaS) to provide a powerful abstraction for the next generation of event-driven architectures.

In this article, we will explain the motivation behind building Stateful Functions, and why we proposed the project to the Apache Flink community as an open-source contribution.

The Problem: Stateful Applications Are Still Hard

While orchestration for stateless compute has come a long way — driven by technologies like Kubernetes and FaaS — most offerings still fall short when it comes to stateful distributed applications by focusing primarily on computation, not state. Additionally, the interaction between functions still poses challenges to the overall ease of development and data consistency.

Stateful Functions is purpose-built to overcome these limitations, allowing users to define loosely-coupled, independent functions with a small footprint that can interact consistently and reliably in a shared pool of resources. The framework is composed of an API that implements the “stateful function” abstraction (Fig. 1) and a runtime based on Apache Flink for distributed coordination, communication, and state management.

Stateful Functions API

The API is based on, as the name suggests, stateful functions: small snippets of functionality that encapsulate business logic, somewhat similar to actors. These functions exist as virtual instances, typically one per entity in the application (e.g. per user or stock item), and are distributed across shards, making it possible to scale applications horizontally out of the box. Each function keeps persistent, user-defined state in local variables and can arbitrarily message other functions (including itself!) with exactly-once guarantees.
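To make the abstraction more concrete, here is a minimal single-process sketch in Python. It is not the Stateful Functions API: `Address`, `ToyRuntime`, `greeter`, and `printer` are hypothetical names invented for illustration, and the sketch does not model distribution, sharding, or exactly-once delivery. It only shows the shape of the idea: each (function type, entity id) pair has its own private state, and functions communicate by sending messages to addresses.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Address:
    """Logical address of one virtual function instance (hypothetical type)."""
    function_type: str
    entity_id: str


class ToyRuntime:
    """Single-process stand-in for a runtime; not the real Stateful Functions API."""

    def __init__(self):
        self.functions = {}             # function type -> handler
        self.state = defaultdict(dict)  # Address -> that instance's private state
        self.mailbox = deque()          # pending (target, message) pairs

    def register(self, function_type, handler):
        self.functions[function_type] = handler

    def send(self, target, message):
        self.mailbox.append((target, message))

    def run(self):
        # Deliver messages one at a time until no messages are pending.
        while self.mailbox:
            target, message = self.mailbox.popleft()
            handler = self.functions[target.function_type]
            handler(self, target, self.state[target], message)


def greeter(runtime, self_address, state, message):
    """Counts visits per user and messages another function with a greeting."""
    state["seen"] = state.get("seen", 0) + 1
    text = f"Hello {self_address.entity_id}, visit #{state['seen']}"
    runtime.send(Address("printer", "stdout"), text)


def printer(runtime, self_address, state, message):
    """Collects the greetings it receives in its own state."""
    state.setdefault("lines", []).append(message)


runtime = ToyRuntime()
runtime.register("greeter", greeter)
runtime.register("printer", printer)
for user in ["alice", "bob", "alice"]:
    runtime.send(Address("greeter", user), "login")
runtime.run()

lines = runtime.state[Address("printer", "stdout")]["lines"]
print(lines)
```

Note that the two `alice` messages hit the same virtual instance and see the same counter, while `bob` gets a separate one; no instance is created explicitly anywhere in the program.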


The runtime that powers Stateful Functions is based on stream processing with Apache Flink. With Stateful Functions, state and computation are co-located in the stream processing engine, giving you fast and consistent state access. State durability and fault tolerance build on Flink’s robust distributed snapshots model.

Fig. 1

Computing Over State, not From State

The framework is not meant as a replacement for FaaS or (even) serverless; rather, it is designed to bring a set of properties similar to those that characterize serverless compute to state-centric problems.


Stateful Functions primarily scales state and the interactions between different states and events; the logic that facilitates these interactions is the main focus of computation. Event-driven applications that juggle interacting state machines and need to remember contextual information are a good fit for the state-centric paradigm.


On the other hand, FaaS and serverless application frameworks excel at elastically scaling dedicated resources for computation. Interacting with state and other functions is not as well integrated and is not their core strength. A good example of a use case that fits this compute-centric model is the classical “Image Resizing with AWS Lambda”.

Fig. 2

To achieve this, the runtime underneath the Stateful Functions API relies on stream processing with Apache Flink and extends its powerful model for state management and fault tolerance. The major advantage of this model is that state and computation are co-located on the same side of the network — which means you don’t need a round-trip per record to fetch state from an external storage system (e.g. Cassandra, DynamoDB) or a specific state management pattern for consistency (e.g. event sourcing, CQRS). Other advantages include:

  • No need to manage in-flight messages and maintain complex replication or repartition strategies, as persistence is as simple as having an object store for state snapshots.

  • High throughput for both stream (fast real-time) and batch (offline) processing, allowing you to blur the boundaries between event-driven applications and generic data processing.

Stateful Functions splits compute and storage differently than the classical two-tier architecture, maintaining one ephemeral state/compute tier (Apache Flink) and a simple persistent blob storage tier (Fig. 2). Programmatically, persistence is based on the concept of persisted values, which enable each function instance to maintain and track fault-tolerant state independently.
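The split between an ephemeral state/compute tier and a plain blob storage tier can be sketched as follows. This is a toy model under stated assumptions, not Flink's snapshot mechanism or the framework's persisted-value API: `PersistedValue`, `BlobStore`, and `ComputeTier` are hypothetical classes, and the sketch deliberately omits input replay, so events processed after the last snapshot are simply lost on restore.

```python
import copy


class PersistedValue:
    """Hypothetical per-instance state cell, loosely modeled on the persisted-value idea."""

    def __init__(self, default):
        self.default = default
        self.by_instance = {}  # instance id -> current value

    def get(self, instance_id):
        return self.by_instance.get(instance_id, self.default)

    def set(self, instance_id, value):
        self.by_instance[instance_id] = value


class BlobStore:
    """Stand-in for the persistent tier: it only stores and returns whole snapshots."""

    def __init__(self):
        self.latest = None

    def put(self, snapshot):
        self.latest = copy.deepcopy(snapshot)

    def get(self):
        return copy.deepcopy(self.latest)


class ComputeTier:
    """Ephemeral tier: keeps state in memory, right next to the function logic."""

    def __init__(self, store):
        self.store = store
        self.count = PersistedValue(default=0)

    def on_event(self, instance_id):
        # State access is a local lookup, not a round trip to a database.
        self.count.set(instance_id, self.count.get(instance_id) + 1)

    def snapshot(self):
        self.store.put(self.count.by_instance)

    def restore(self):
        self.count.by_instance = self.store.get() or {}


store = BlobStore()
tier = ComputeTier(store)
tier.on_event("user-1")
tier.on_event("user-1")
tier.on_event("user-2")
tier.snapshot()
tier.on_event("user-1")  # processed after the snapshot

recovered = ComputeTier(store)  # simulates a fresh process after a failure
recovered.restore()
print(recovered.count.get("user-1"), recovered.count.get("user-2"))
```

The storage tier never needs to understand individual records or keys; it only holds opaque snapshots, which is what keeps the persistence side simple in this model.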

Stream Processing, Extended

Although the Stateful Functions API is independent of Flink, the runtime is built on top of Flink’s DataStream API and uses a lightweight version of process functions (i.e. low-level functions accessing state) to materialize this abstraction under the hood. The core advantage with Stateful Functions, compared to vanilla Flink, is that each function can arbitrarily send events to all other functions, rather than only downstream in a DAG.
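The difference from a DAG can be illustrated with two functions that message each other back and forth, a cycle that a strictly downstream topology cannot express directly. The sketch below is a plain Python toy, not Flink or Stateful Functions code; `run`, `ping`, and `pong` are hypothetical names, and each handler simply returns the messages it wants to send.

```python
from collections import deque


def run(handlers, first_message, max_steps=100):
    """Deliver messages until none are pending; handlers return outgoing messages."""
    queue = deque([first_message])
    log = []
    while queue and len(log) < max_steps:
        target, payload = queue.popleft()
        log.append((target, payload))
        queue.extend(handlers[target](payload))
    return log


def ping(n):
    # Sends "backwards" to pong, which in turn sends to ping again: a cycle.
    return [("pong", n - 1)] if n > 0 else []


def pong(n):
    return [("ping", n - 1)] if n > 0 else []


log = run({"ping": ping, "pong": pong}, ("ping", 3))
print(log)
```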

Fig. 3

Stateful Functions applications are typically modular, containing multiple bundles of functions that can interact consistently and reliably, multiplexed into a single Flink application (Fig. 3). This allows many small jobs to share the same pool of resources and draw on them as needed, instead of each job reserving upfront the resources it might need. At any point in time, the vast majority of virtual instances are idle and consume no compute resources.
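One way to picture why idle virtual instances cost nothing is that an instance is only a key until a message addresses it. The sketch below assumes a simple stable-hash sharding scheme chosen for illustration; it is not how Flink assigns keys to parallel tasks, and `shard_for` and `ShardedInstances` are hypothetical names.

```python
import zlib


def shard_for(entity_id, num_shards):
    """Map an entity id to a shard with a stable hash (an assumed scheme)."""
    return zlib.crc32(entity_id.encode("utf-8")) % num_shards


class ShardedInstances:
    """A fixed pool of shards shared by all virtual instances."""

    def __init__(self, num_shards):
        self.shards = [dict() for _ in range(num_shards)]

    def invoke(self, entity_id, amount):
        # The instance's state entry is created on first use, not upfront.
        shard = self.shards[shard_for(entity_id, len(self.shards))]
        shard[entity_id] = shard.get(entity_id, 0) + amount
        return shard[entity_id]

    def materialized(self):
        """Number of instances that currently hold any state."""
        return sum(len(shard) for shard in self.shards)


pool = ShardedInstances(num_shards=4)
pool.invoke("item-1", 5)
total = pool.invoke("item-1", 2)
pool.invoke("item-2", 1)
print(total, pool.materialized())
```

However many entities the application could address in principle, only the ones that have actually received messages occupy memory here, and none of them holds a thread or a process of its own.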

Get Started With Stateful Functions

If you find this project interesting, we invite you to give Stateful Functions a try! To get started, have a look through the documentation and follow one of the introduction walkthroughs, going from a simple stateful “Hello World!” (Fig. 4) to a more complex Ride Sharing App.

Fig. 4

If you find a bug or have an idea about how to improve the project, we strongly encourage you to file an issue or open a pull request on GitHub! At any time, you can ask us questions on Stack Overflow, using the tag #statefun.

Upcoming Work

Stateful Functions is a work in progress, heading in what we believe is a promising direction. Our team will continue to introduce improvements that establish and amplify the value of the project, such as support for non-JVM languages, fine-grained observability, and stricter recovery times. Opportunities to enhance the runtime and operations will also grow as the capabilities of Apache Flink evolve.

Stateful Functions has been accepted as a contribution to the Apache Software Foundation, as part of Apache Flink, with the kickoff release happening soon. Stay tuned!

Published at DZone with permission of Marta Paes . See the original article here.
