Building a Sentiment Analysis Pipeline With Apache Camel and Deep Java Library (DJL)
This tutorial shows how to build a sentiment analysis pipeline entirely in Java using Apache Camel and Deep Java Library (DJL).
Sentiment analysis is now a key part of many applications. Whether you’re processing customer feedback, sorting support tickets, or tracking social media, knowing how users feel can be just as important as knowing what they say.
For Java developers, the main challenge isn’t finding machine learning models, but applying them inside existing or new Java applications without relying on Python. Most NLP models are demonstrated in Python notebooks, while real systems need file pipelines, routing, retries, fallbacks, and monitoring. Many teams find it hard to connect these pieces smoothly.
This tutorial shows you how to build a sentiment analysis pipeline in pure Java with Apache Camel and the Deep Java Library (DJL). We will cover ingesting text files, running sentiment inference with a pre-trained DistilBERT model, formatting the output, and handling failures gracefully, all built on the Enterprise Integration Patterns you are probably already familiar with.
What You'll Learn
By the time you're done here, you'll be able to:
- Build a file-based NLP pipeline using Apache Camel.
- Run a pre-trained sentiment analysis model via Camel’s DJL component.
- Understand the djl: URI syntax and model configuration in Apache Camel.
- Structure routes with fallback handling and clean output formatting.
- Run sentiment analysis locally on the JVM, without external APIs or Python services.
Frameworks Used
Apache Camel
Apache Camel is an open-source integration framework built on Enterprise Integration Patterns. It provides a large library of components for connecting systems, moving data, and orchestrating workflows using declarative routes.
In this example, we use it for file ingestion, message transformation, routing, conditional logic, error handling, and output persistence.
Deep Java Library (DJL)
DJL is an engine-agnostic deep learning framework for Java: the same API can run models on engines such as PyTorch, TensorFlow, or ONNX Runtime. That makes it a good fit for running sentiment inference directly inside the JVM without bogging the application down.
We use the Camel-DJL component to download and cache a pre-trained DistilBERT model, run sentiment inference inside the JVM, and return structured classification results.
DistilBERT for Sentiment Analysis
DistilBERT is a smaller, faster transformer model distilled from BERT. It offers a good trade-off between speed and accuracy for sentiment analysis tasks.
Project Structure
Let's look at the project structure below:
camel-sentiment-analysis/
├── src/main/java/com/example/sentimentanalysis/
│   ├── MainApp.java                          # Application entry point
│   ├── routes/
│   │   └── SentimentAnalysisRoutes.java      # Camel route for text processing
│   └── processor/
│       ├── ClassificationsFormatter.java     # Formats sentiment results
│       └── FallbackFormatter.java            # Handles unexpected outputs
├── src/main/resources/
│   └── application.properties                # Camel configuration
├── data/
│   ├── input/                                # Input text files
│   │   ├── sample-positive.txt               # Sample positive text
│   │   └── sample-negative.txt               # Sample negative text
│   ├── output/                               # Sentiment analysis results
│   └── analyzed/                             # Processed files archive
├── gradle/wrapper/                           # Gradle wrapper files
├── build.gradle                              # Project dependencies
├── settings.gradle                           # Gradle settings
├── gradlew.bat                               # Gradle wrapper script
└── README.md                                 # Main documentation
Gradle Dependencies
build.gradle
plugins {
    id 'java'
    id 'application'
}

group = 'com.example'
version = '1.0.0'
description = 'Sentiment Analysis with Apache Camel and DJL'

java {
    toolchain {
        languageVersion = JavaLanguageVersion.of(21)
    }
}

application {
    mainClass = 'com.example.sentimentanalysis.MainApp'
}

repositories {
    mavenCentral()
}

dependencies {
    // Apache Camel
    implementation 'org.apache.camel:camel-core:4.17.0'
    implementation 'org.apache.camel:camel-main:4.17.0'
    implementation 'org.apache.camel:camel-file:4.17.0'
    implementation 'org.apache.camel:camel-djl:4.17.0'

    // DJL (Deep Java Library) for sentiment analysis
    implementation platform('ai.djl:bom:0.36.0')
    implementation 'ai.djl:api'

    // PyTorch engine for DistilBERT sentiment analysis
    implementation 'ai.djl.pytorch:pytorch-engine'
    implementation 'ai.djl.pytorch:pytorch-model-zoo'

    // Use CPU-only PyTorch runtime for Windows
    runtimeOnly 'ai.djl.pytorch:pytorch-native-cpu:2.1.1:win-x86_64'

    // Logging
    implementation 'org.slf4j:slf4j-simple:2.0.16'
}
The DJL BOM ensures that versions stay consistent across engines and models.
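Note that the build above pins the Windows CPU runtime. On Linux or macOS you would swap the native-library classifier; the exact coordinates are worth verifying against the DJL documentation for your DJL version, but they follow this pattern (illustrative):

```groovy
// Linux x86_64 (illustrative classifier; verify against the DJL docs)
runtimeOnly 'ai.djl.pytorch:pytorch-native-cpu:2.1.1:linux-x86_64'
// macOS on Apple Silicon
runtimeOnly 'ai.djl.pytorch:pytorch-native-cpu:2.1.1:osx-aarch64'
```

Alternatively, DJL can download matching native libraries automatically on first run if no native artifact is pinned, at the cost of a slower first startup.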
Application Entry Point
The MainApp class is the application entry point; it bootstraps Camel using Main:
package com.example.sentimentanalysis;

import com.example.sentimentanalysis.routes.SentimentAnalysisRoutes;
import org.apache.camel.main.Main;

public class MainApp {

    public static void main(String[] args) throws Exception {
        // Create and configure Camel Main
        Main main = new Main();

        // Add routes
        main.configure().addRoutesBuilder(new SentimentAnalysisRoutes());

        // Run (blocks until shutdown)
        main.run();
    }
}
Sentiment Analysis Route
SentimentAnalysisRoutes.java is where the core logic lives. The "from" endpoint handles file ingestion: it watches the input folder, processes files one at a time to extract the text, and archives each file with a timestamp. A single "to" endpoint then runs sentiment analysis through the DJL component URI with a lazy producer, which lets the route start even if the endpoint cannot yet be created; the producer starts on the first message, so errors surface during routing, where they can be handled.
SentimentAnalysisRoutes.java
package com.example.sentimentanalysis.routes;

import com.example.sentimentanalysis.processor.ClassificationsFormatter;
import com.example.sentimentanalysis.processor.FallbackFormatter;
import org.apache.camel.builder.RouteBuilder;

import java.io.File;
import java.nio.file.Files;

/**
 * Apache Camel routes for sentiment analysis.
 * Processes text files from the input folder, runs sentiment analysis using DJL,
 * and writes results to the output folder.
 */
public class SentimentAnalysisRoutes extends RouteBuilder {

    @Override
    public void configure() throws Exception {
        // Route to process text files from the input folder
        from("file:data/input?include=.*\\.txt&noop=false&move=../analyzed/${date:now:yyyyMMdd-HHmmss}-${file:name}")
            .routeId("sentiment-analysis-route")
            .log("Processing text file: ${file:name}")
            // Read file content as a String
            .process(exchange -> {
                File textFile = exchange.getIn().getBody(File.class);
                String content = Files.readString(textFile.toPath());
                exchange.getIn().setBody(content);
            })
            .to("djl:nlp/sentiment_analysis?artifactId=ai.djl.pytorch:distilbert:0.0.1&lazyStartProducer=true")
            // Convert the output to a text report using Camel choice/bean components
            .choice()
                .when(body().isInstanceOf(ai.djl.modality.Classifications.class))
                    .bean(new ClassificationsFormatter(), "format")
                .otherwise()
                    .bean(new FallbackFormatter(), "format")
            .end()
            .log("Sentiment analysis done for ${file:name}")
            // Write results to the output folder
            .to("file:data/output?fileName=${date:now:yyyyMMdd-HHmmss}-${file:name.noext}-sentiment.txt")
            .log("Results saved to output folder");
    }
}
Let’s break this down:
.to("djl:nlp/sentiment_analysis?artifactId=ai.djl.pytorch:distilbert:0.0.1&lazyStartProducer=true")
- djl – Camel DJL component
- nlp/sentiment_analysis – NLP task type
- artifactId – Identifies the model from DJL’s Model Zoo
- lazyStartProducer=true – Defers model loading until first use
This single line replaces what would otherwise be a substantial amount of hand-written ML plumbing.
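The file endpoints on either side of the DJL call also use Camel's `${date:now:yyyyMMdd-HHmmss}` expression to timestamp archived and output files. A quick JDK-only sketch of the equivalent pattern, just to illustrate what the resulting file names look like (the class and method names here are for illustration only, not part of the project):

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class ArchiveName {
    // Mirrors Camel's ${date:now:yyyyMMdd-HHmmss} expression using the JDK's
    // DateTimeFormatter, to show how archived/output files are named.
    static String stamped(LocalDateTime now, String fileName) {
        return now.format(DateTimeFormatter.ofPattern("yyyyMMdd-HHmmss")) + "-" + fileName;
    }

    public static void main(String[] args) {
        // prints 20250115-093000-sample-positive.txt
        System.out.println(stamped(LocalDateTime.of(2025, 1, 15, 9, 30, 0), "sample-positive.txt"));
    }
}
```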
How to Run the Application
Build and run using Gradle:
./gradlew clean run
(On Windows, use gradlew.bat clean run.)
Then drop text files into data/input/, for example:
This is an amazing product! I absolutely love it and would highly recommend it.
The analyzed sentiment is written to data/output/, and the original file is archived.
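Instead of copying files by hand, you can create a sample input with a few lines of JDK code. This is just a convenience sketch (the class name is hypothetical, and the path assumes you run from the project root):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class CreateSample {
    // Writes a sample input file into the folder watched by the Camel route.
    static Path writeSample(Path inputDir) throws IOException {
        Files.createDirectories(inputDir);
        Path file = inputDir.resolve("sample-positive.txt");
        Files.writeString(file,
                "This is an amazing product! I absolutely love it and would highly recommend it.");
        return file;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("Wrote " + writeSample(Path.of("data/input")));
    }
}
```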
ClassificationsFormatter.java
package com.example.sentimentanalysis.processor;

import ai.djl.modality.Classifications;
import org.apache.camel.Exchange;

import java.util.List;

/**
 * Bean to format DJL's Classifications object into a text report for sentiment
 * analysis.
 */
public class ClassificationsFormatter {

    public String format(Classifications classifications, Exchange exchange) {
        StringBuilder sb = new StringBuilder();
        String fileName = exchange.getIn().getHeader("CamelFileName", String.class);
        sb.append("Text File: ").append(fileName).append('\n');
        sb.append("Sentiment Analysis Results\n");
        sb.append("==========================\n\n");

        List<Classifications.Classification> allResults = classifications.topK();
        if (!allResults.isEmpty()) {
            Classifications.Classification top = allResults.get(0);
            sb.append("Overall Sentiment: ").append(top.getClassName().toUpperCase())
                    .append(" (Confidence: ")
                    .append(String.format("%.2f%%", top.getProbability() * 100)).append(")\n\n");
        }

        sb.append("Sentiment Breakdown:\n");
        for (Classifications.Classification c : allResults) {
            sb.append(String.format("  %s: %.2f%%\n", c.getClassName(), c.getProbability() * 100));
        }
        return sb.toString();
    }
}
On a successful classification, ClassificationsFormatter converts DJL’s output into a clean, human-readable report like the one below:
Text File: sample-positive.txt
Sentiment Analysis Results
==========================
Overall Sentiment: POSITIVE (Confidence: 99.98%)
Sentiment Breakdown:
Positive: 99.98%
Negative: 0.02%
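One caveat about the report formatting: String.format without an explicit Locale uses the JVM default, so on some systems the confidence values would print with a comma as the decimal separator. A minimal locale-pinned sketch (the helper name is illustrative):

```java
import java.util.Locale;

public class ConfidenceFormat {
    // Formats a probability as a percentage string, independent of the JVM's
    // default locale (always a '.' decimal separator).
    static String confidence(double probability) {
        return String.format(Locale.US, "%.2f%%", probability * 100);
    }

    public static void main(String[] args) {
        System.out.println(confidence(0.9998)); // prints 99.98%
        System.out.println(confidence(0.0002)); // prints 0.02%
    }
}
```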
If inference fails or returns an unexpected result type, FallbackFormatter ensures the pipeline still produces meaningful output rather than crashing, following a critical production pattern: fail softly.
FallbackFormatter.java
package com.example.sentimentanalysis.processor;

import org.apache.camel.Exchange;

/**
 * Bean to format unexpected result types into a text report.
 */
public class FallbackFormatter {

    public String format(Object result, Exchange exchange) {
        StringBuilder sb = new StringBuilder();
        String fileName = exchange.getIn().getHeader("CamelFileName", String.class);
        sb.append("Text File: ").append(fileName).append('\n');
        sb.append("Raw result type: ").append(result == null ? "null" : result.getClass().getName()).append('\n');
        sb.append("Result:\n").append(String.valueOf(result)).append('\n');
        return sb.toString();
    }
}
DJL Behind the Scenes
On first execution, the DistilBERT model is downloaded automatically and the native PyTorch libraries are initialized. The model is cached locally, so subsequent runs load from the cache and start significantly faster.
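Per DJL's documentation, the cache location can be overridden through the DJL_CACHE_DIR environment variable or system property, which is useful for shared or read-only deployments. A small sketch, assuming the default cache-resolution behavior (the path here is an example):

```java
public class CacheConfig {
    public static void main(String[] args) {
        // DJL resolves its cache directory from the DJL_CACHE_DIR system
        // property (or environment variable); set it before any model loads.
        System.setProperty("DJL_CACHE_DIR", "/opt/models/djl-cache");

        // ... start Camel / load models afterwards
        System.out.println("DJL cache: " + System.getProperty("DJL_CACHE_DIR"));
    }
}
```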
Production Considerations
For performance, warm up the model on startup if latency matters, and allocate sufficient JVM heap (models are memory-intensive).
Scale horizontally with multiple Camel instances, or vertically with GPU-enabled DJL engines. Use lazyStartProducer=true for ML endpoints so that route startup is not blocked by model loading.
Conclusion
This tutorial demonstrates that machine learning does not need to be a separate system. With Apache Camel and DJL, sentiment analysis becomes just another step in your integration flow: composable, observable, and production-ready. Compared with external ML APIs, there is no per-request cost, data never leaves your infrastructure, and you retain full control over routing and error handling.
Compared with Python pipelines, you get native integration with enterprise Java systems and first-class support for integration patterns.
If you already use Camel, adding NLP capabilities is no longer a leap. It is a small, well-structured step.