Long-Running Durable Agents With Spring AI and Dapr Workflows

Spring AI agentic patterns show how to coordinate multiple ChatClient calls to LLMs. We look at how Dapr Workflows can make these interactions durable and resilient.

Mauricio Salatino

Oct. 10, 25 · Analysis


Over the last year, we have seen a rise in patterns that combine popular frameworks, such as Spring AI, with LLM interactions. In January this year, Christian from the Spring AI team published Building Effective Agents with Spring AI, covering the common agentic patterns described in the Anthropic paper titled Building Effective Agents.

I strongly recommend both of these blog posts to gain a good understanding of how these concepts are shaping up and the tools needed to implement the patterns suggested in these two articles.

One key thing that I will reiterate from these two articles is the distinction between agents and workflows:

The research publication makes an important architectural distinction between two types of agentic systems:

Workflows: Systems where LLMs and tools are orchestrated through predefined code paths (e.g., prescriptive system)
Agents: Systems where LLMs dynamically direct their own processes and tool usage

The key insight is that while fully autonomous agents might seem appealing, workflows often provide better predictability and consistency for well-defined tasks. This aligns perfectly with enterprise requirements where reliability and maintainability are crucial.

Having worked in the workflow “engine” industry for at least 10 years (Drools workflows, jBPM, Activiti Cloud, Camunda/Zeebe, and now Dapr Workflows), I find the patterns described in the Spring AI article very familiar: workflows that require human intervention need interactions similar to those required by LLMs.

So, what is this blog post about? First, we will discuss durable executions and how using a tool like Dapr Workflows can enhance these patterns to become durable, fault-tolerant, and observable. Next, we will look into a simple example that you can run on your laptop to demonstrate how these patterns can be combined to create long-running agents.

Agentic Workflows, Durable Execution, and Dapr Workflows

To set up the context, let’s go back to the Spring AI article about agentic patterns, where five patterns are covered:

Chain workflow: The Chain Workflow pattern exemplifies the principle of breaking down complex tasks into simpler, more manageable steps.
Parallelization workflow: LLMs can work simultaneously on tasks and have their outputs aggregated programmatically. The parallelization workflow manifests in two key variations:
- Sectioning: Breaking tasks into independent subtasks for parallel processing
- Voting: Running multiple instances of the same task for consensus
Routing workflow: The Routing pattern implements intelligent task distribution, enabling specialized handling for different types of input.
Orchestrator/workers: This pattern demonstrates how to implement more complex agent-like behavior while maintaining control:
- A central LLM orchestrates task decomposition
- Specialized workers handle specific subtasks
- Clear boundaries maintain system reliability
Evaluator optimizer: The Evaluator-Optimizer pattern implements a dual-LLM process where one model generates responses while another provides evaluation and feedback in an iterative loop, similar to a human writer's refinement process.
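
To make one of these patterns concrete, the evaluator-optimizer loop can be sketched in plain Java. This is a minimal sketch, not code from the Spring AI examples: the generator and evaluator functions are hypothetical stand-ins for the two ChatClient calls, and the "PASS" verdict and the maxRounds cap are assumptions made for illustration.

```java
import java.util.function.BiFunction;
import java.util.function.Function;

/** Evaluator-optimizer loop with both model calls stubbed as plain functions. */
final class EvaluatorOptimizerSketch {

    /** The accepted (or last) draft and the number of rounds that ran. */
    record Outcome(String draft, int rounds) {}

    /**
     * @param generator hypothetical stand-in for the generating LLM call: (task, feedback) -> draft
     * @param evaluator hypothetical stand-in for the evaluating LLM call: draft -> "PASS" or feedback
     * @param maxRounds assumed safety cap so the loop always terminates
     */
    static Outcome refine(String task,
                          BiFunction<String, String, String> generator,
                          Function<String, String> evaluator,
                          int maxRounds) {
        if (maxRounds < 1) {
            throw new IllegalArgumentException("maxRounds must be at least 1");
        }
        String feedback = "";
        String draft = "";
        for (int round = 1; round <= maxRounds; round++) {
            draft = generator.apply(task, feedback);   // generate using the latest feedback
            String verdict = evaluator.apply(draft);   // evaluate the new draft
            if ("PASS".equals(verdict)) {
                return new Outcome(draft, round);      // accepted: stop iterating
            }
            feedback = verdict;                        // feed the critique into the next round
        }
        return new Outcome(draft, maxRounds);          // cap reached: return the last draft
    }
}
```

Note that the draft and the feedback live only in local variables, so every round is lost if the process stops halfway through the loop.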

For all these scenarios, we can find examples in the spring-projects/spring-ai-examples repository. The beauty of this approach is that we only need to add a single dependency to our Spring Boot applications to start integrating these agentic patterns.

So, what’s missing? If you look into the sample code provided, these examples demonstrate how LLM calls can be orchestrated with plain Java constructs. For example, the Parallelization code that processes specialized prompts:

Java

public List<String> parallel(String prompt, List<String> inputs, int nWorkers) {
    Assert.notNull(prompt, "Prompt cannot be null");
    Assert.notEmpty(inputs, "Inputs list cannot be empty");
    Assert.isTrue(nWorkers > 0, "Number of workers must be greater than 0");

    ExecutorService executor = Executors.newFixedThreadPool(nWorkers);
    try {
        List<CompletableFuture<String>> futures = inputs.stream()
                .map(input -> CompletableFuture.supplyAsync(() -> {
                    try {
                        return chatClient.prompt(prompt + "\nInput: " + input).call().content();
                    } catch (Exception e) {
                        throw new RuntimeException("Failed to process input: " + input, e);
                    }
                }, executor))
                .collect(Collectors.toList());

        // Wait for all tasks to complete
        CompletableFuture<Void> allFutures = CompletableFuture.allOf(
                futures.toArray(CompletableFuture[]::new));
        allFutures.join();

        return futures.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList());

    } finally {
        executor.shutdown();
    }
}

This example demonstrates how to use a fixed thread pool of nWorkers threads to process prompts in parallel and then join the results once all tasks have completed.

While this gets the job done, the example relies on a single JVM to process all the tasks (prompts) in parallel. If the JVM (our application) goes down after X tasks have already completed, their results are lost and we need to send all the prompts again for reprocessing.

Similarly, the “chaining workflow” example:

Prompt Chaining workflow pattern for LLMs using Spring AI (image source)

If our application crashes after “LLM Call 2,” we have to start again from the first call. In Java, this looks like a for loop with some if statements that check the conditions (the “Gate”) to decide whether further calls to the LLM are needed. For reference, see the GitHub repo spring-projects/spring-ai-examples.
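
The shape of that loop can be sketched as follows. This is a simplified illustration under stated assumptions, not the code from the repository: each step is a hypothetical stand-in for one ChatClient call, and the gate is reduced to a single predicate applied after every step.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.function.UnaryOperator;

/** Prompt chain with a gate; each step stands in for one LLM call. */
final class ChainSketch {

    /**
     * Feeds each step the previous step's output and stops early when the gate rejects.
     *
     * @param steps hypothetical stand-ins for the chained LLM calls
     * @param gate  assumed check that decides whether the chain should continue
     */
    static String run(String input, List<UnaryOperator<String>> steps, Predicate<String> gate) {
        String current = input;
        for (UnaryOperator<String> step : steps) {
            current = step.apply(current);   // "LLM Call N"
            if (!gate.test(current)) {
                break;                       // gate failed: exit the chain early
            }
        }
        return current;
    }
}
```

All intermediate results are held in the local variable current, which is why a crash between two steps leaves nothing to resume from.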

Supercharging our Agentic Patterns With Dapr Workflows

How can we make sure that if our application fails, we do not repeat the steps that we have already executed, saving time, money, and resources? Enter durable execution.

Durable execution is not a new concept. Without going deep into how durable execution frameworks work, our goal is to make these patterns resilient, durable, and scalable beyond a single JVM. Check out our presentation Durable Exec. in Serverless Arch.: Cloudflare, SpringBoot & Dapr with Nele at JNation, where we provide an example of how these frameworks work under the hood.

For practical purposes, I’ve forked the Spring AI examples repository and migrated the examples one-to-one to Dapr Workflows, which provides workflow as code with durable execution. Check the official Dapr Workflow documentation for a complete list of features.

You can find the migrated examples on GitHub at salaboy/spring-ai-examples.

How are these examples different? These examples combine Spring AI with the Dapr Workflows runtime to implement the same patterns but now with a durable and resilient approach. What does this mean?

Let’s look at the code and compare it with the plain Java version.

The original example from the ParallelizationWorkflow goes over the input array and, for each item in the array, executes a prompt using the chatClient. For each prompt executed, it collects a CompletableFuture that can be used to track the execution in an asynchronous fashion. This means that these prompts are being performed in parallel while the CompletableFuture serves as a callback hook to get the results whenever they are returned by the call.

Calling allFutures.join(); allows the application to block until all the futures are completed, meaning that we get all the results from all the prompts that we submitted. Finally, we iterate over all the results and return them all as a list to the caller. See ParallelizationlWorkflow.java.

Java

List<CompletableFuture<String>> futures = inputs.stream()
        .map(input -> CompletableFuture.supplyAsync(() -> {
            try {
                return chatClient.prompt(prompt + "\nInput: " + input).call().content();
            } catch (Exception e) {
                throw new RuntimeException("Failed to process input: " + input, e);
            }
        }, executor))
        .collect(Collectors.toList());

// Wait for all tasks to complete
CompletableFuture<Void> allFutures = CompletableFuture.allOf(
        futures.toArray(CompletableFuture[]::new));

allFutures.join();

return futures.stream()
        .map(CompletableFuture::join)
        .collect(Collectors.toList());

With the Dapr Workflows approach, we use the same constructs as before. We iterate over inputs().stream() and, for each input, call the PromptActivity, which returns a Task<String> that serves the same purpose as the CompletableFuture we saw before.

We then use ctx.allOf(processTasks).await() to wait for all the tasks to complete, so we can fetch the results and collect them in a list to return to the user.

See ParallelWorkflow.java.

By implementing the Workflow interface, you are defining a durable orchestration. Within the Workflow implementation, you define which tasks need to be executed and how they are related to one another. In other words, you can define any pattern you want by just coding it inside the workflow implementation, but to keep things safe, you can break apart your logic into tasks that are defined by implementing the WorkflowActivity interface.

You can see below how the Parallelization workflow is quite straightforward:

Java

@Component
public class ParallelWorkflow implements Workflow {
  @Override
  public WorkflowStub create() {
    return ctx -> {
      ctx.getLogger().info("Starting Workflow: {}", ctx.getName());

      WorkflowInput workflowInput = ctx.getInput(WorkflowInput.class);

      List<Task<String>> processTasks = workflowInput.inputs()
              .stream()
              .map(input -> ctx.callActivity(PromptActivity.class.getName(), new PromptInput(workflowInput.prompt(), input), String.class))
              .collect(Collectors.toList());

      List<String> workerResponses = ctx.allOf(processTasks).await();

      ctx.complete(workerResponses);
    };
  }
}

This code looks extremely similar to the original Spring AI example. It iterates through a list of inputs to create asynchronous tasks for processing, then collects all the responses once all the tasks are completed.

The big difference here is that each prompt is encapsulated in a PromptActivity:

Java

@Component
public class PromptActivity implements WorkflowActivity {

  private final ChatClient chatClient;

  public PromptActivity(ChatClient.Builder chatClientBuilder) {
    this.chatClient = chatClientBuilder.build();
  }

  @Override
  public Object run(WorkflowActivityContext workflowActivityContext) {
    ParallelWorkflow.PromptInput promptInput = workflowActivityContext.getInput(ParallelWorkflow.PromptInput.class);
    try {
      return chatClient.prompt(promptInput.prompt() + "\nInput: " + promptInput.input()).call().content();
    } catch (Exception e) {
      throw new RuntimeException("Failed to process input: " + promptInput.input(), e);
    }
  }
}

By registering WorkflowActivities to Dapr Workflows, the execution of these tasks is monitored by the orchestrator; hence, if WorkflowActivities are completed and something goes wrong with the application, as soon as we restart the application, our workflow execution will continue from where it left off.
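
The idea of continuing from where the workflow left off can be illustrated with a toy journal in plain Java. This is only a conceptual sketch and not how Dapr implements it: the in-memory map is a hypothetical stand-in for the history that a workflow engine stores durably outside the application, and the activity function stands in for a PromptActivity call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Toy illustration of skipping already-completed activities on a re-run. */
final class ReplaySketch {

    /**
     * Runs one activity per input, consulting the journal first.
     *
     * @param activity hypothetical stand-in for an LLM-calling activity
     * @param journal  stand-in for durable history, keyed by the input's position
     */
    static List<String> runAll(List<String> inputs,
                               Function<String, String> activity,
                               Map<Integer, String> journal) {
        List<String> results = new ArrayList<>();
        for (int i = 0; i < inputs.size(); i++) {
            String result = journal.get(i);
            if (result == null) {
                result = activity.apply(inputs.get(i)); // not recorded yet: execute it
                journal.put(i, result);                 // record before moving on
            }
            results.add(result);                        // recorded results are reused as-is
        }
        return results;
    }
}
```

If a run fails on the third input, calling runAll again with the same journal executes only the activities that have no recorded result, so the two completed LLM calls are not paid for twice.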

The second advantage of this approach is that WorkflowActivities can be hosted in different instances of the application or even different applications. This will enable us to scale up our executions across multiple JVMs, where we can truly execute tons of activities in parallel.

Check the Multi-App Workflow documentation for more information about this.

Ok, but what do I need to add to my application to use Dapr Workflows?

Adding Dapr Workflow to Your Spring Boot Applications

Similar to Spring AI, we just need to add one dependency to our Spring Boot application to start using Dapr Workflows.

XML

<dependency>
    <groupId>io.dapr.spring</groupId>
    <artifactId>dapr-spring-boot-starter</artifactId>
    <version>${dapr.version}</version>
</dependency>

Note: We just released version 1.16.0 of the SDK.

By adding the Dapr Spring Boot Starter, you have access to all the Dapr APIs, including Workflows. However, to run your application, you must first bootstrap the Dapr runtime, which includes the Workflow orchestrator that operates independently of your application. Luckily, you can rely on Testcontainers to set up the runtime for your application. By adding the following dependency, Testcontainers brings the Dapr module, which will start the Dapr runtime whenever you start your application for local testing.

XML

<dependency>
    <groupId>io.dapr.spring</groupId>
    <artifactId>dapr-spring-boot-starter-test</artifactId>
    <version>${dapr.version}</version>
    <scope>test</scope>
</dependency>

With the “Dapr Spring Boot Starter Test” dependency, you can start your Spring Boot application using the test context (mvn spring-boot:test-run), which will automatically bootstrap the Dapr runtime and connect your application to it. See the Testcontainers configuration in DaprTestContainersConfig.java.

Let's Sum This Up

In this blog post, we have seen how to extend the Spring AI examples with durable workflow executions. This helps make our agents production-ready, as they are now scalable and resilient to failure. We also looked at how to add Dapr Workflows to existing Spring Boot applications (dapr-spring-boot-starter and dapr-spring-boot-starter-test) and how the Testcontainers integration enables a local development experience that doesn’t require you to run Dapr in a Kubernetes cluster or download any tool that you are not already using with your Spring Boot applications.
