DZone Spotlight

Wednesday, February 5 View All Articles »

Building a Full-Stack Resume Screening Application With AI

By Anujkumarsinh Donvir

CORE

The release of the DeepSeek open-source AI model has created a lot of excitement in the technology community. It allows developers to build applications entirely locally without needing to connect to online AI models such as Claude, ChatGPT, and more. Open-source models have opened doors to new opportunities when building enterprise applications that integrate with generative AI. In this article, you will learn how to run such a model locally on your personal machine and build a full-stack React and NodeJS-powered application that is not just another chatbot. You will be able to use this application to analyze resumes faster and make smarter hiring decisions. Before you build the application, it is important to understand the benefits of open-source LLMs. Benefits of Open-Source Large Language Models Open-source models provide several key benefits over using proprietary models : Cost-Effective and License-Free Open-source LLMs are cost-effective and don't need a special license. For example, as of the date of writing this, OpenAI’s o1 costs $60 per million output tokens, and open-source DeepSeek R1 costs $2.19. Customizable and Fine-Tunable Open-source models can be fine-tuned easily to meet unique business cases - allowing for more domain-specific use cases to be built. This leads to optimized performance in the enterprise applications. Enhanced Data Security and Privacy Open source makes applications more secure as precious personal data doesn't need to be uploaded to third-party servers and stays on a local machine or a company's network only stay in local machine or a companies network only. This provides a high level of data security. Furthermore, open-source models can be fine-tuned to remove any biases of data. Community-Driven and No Vendor Lock-In Open-source models enjoy large community support and benefit from the rapid pace of feature development. On the other hand, using property models makes the application vendor-locked and reliant on vendor companies to provide feature updates. With this information in hand, you are ready to build a real-world application using the DeepSeek R1 open-source model, Node.JS, and React. Project and Architecture Overview You will be building a resume analyzer application — which will help you learn the benefits and shortcomings of the uploaded resume. DeepSeek R1 LLM will analyze the uploaded resume and provide feedback. You can learn about the architecture of the application through the illustration below. Architecture Diagram The React-based user interface communicates with the NodeJS-based backend using REST APIs. NodeJS backend then sends the user request to DeepSeek R1 hosted using Ollama. This entire tech stack can be run on a single machine, as you will do throughout the article, or it can be hosted across multiple containers in more complex use cases. Prerequisites To run the project, you will need a machine with some compute power, preferably one that has an NVIDIA graphics card. The project has been developed and tested on NVIDIA 4090RTX based Windows machine and M2 MacBook Pro. You will need to have NodeJS installed on the machine. This project has been built on NodeJS version 22.3.0. You can verify NodeJS installation using the node -v command. You will also need an editor of your choice to work through the code. Visual Studio Code has been used while building the application and is generally recommended. Setting Up and Running DeepSeek Locally To run DeepSeek R1 locally, follow the steps below: 1. Install Ollama from its official website. 2. After installation is complete, you will be able to run models using the ollama run command from your machine's terminal. 3. Run the DeepSeek model of your choice. This tutorial was built using the DeepSeek R1 8-Billon parameter model. You can run it by using the command ollama run deepseek-r1:8b. 4. If you have a lower-specification machine than the one mentioned in the prerequisites section, the 7B and 1.5B parameter models will work as well, but the generated output quality may be lower. 5. It may take some time for models to run the first time as they will need to get downloaded. Once the model is running, you should be able to ask it a question right in the terminal and get an output. You can refer to the illustration below to view the DeepSeek R1 8B model in action. Ollama DeepSeek R1 6. DeepSeek R1 is a reasoning model, and therefore, it thinks before giving the first answer it can generate. As highlighted in the illustration above, it is thinking before giving the answer to our prompt. This thinking can be seen in tags <think> </think>. Cloning and Running NodeJS Backend The Ollama service can be accessed via an API as well. You are going to leverage this API and build a NodeJS-based backend layer. This layer will take the uploaded PDF from the user and extract text from it. After the text extraction, the backend will feed the text to the DeepSeek R1 model via the Ollama API and get a response back. This response will be sent to the client to display to the user. 1. Clone the backend project from GitHub using this URL. Ideally, you should fork the project and then clone your own local copy. 2. After cloning, to run the project, go to the project root directory using cd deepseek-ollama-backend. 3. Once inside the project root, install dependencies by giving npm install command. Once the installation completes, the project can be run using the npm start command. The core of the project is the app.js file. Examine its code, which is provided below. JavaScript const express = require('express'); const multer = require('multer'); const pdfParse = require('pdf-parse'); const axios = require('axios'); const fs = require('fs'); const cors = require('cors'); const app = express(); app.use(cors()); app.use(express.json()); const upload = multer({ dest: 'uploads/', fileFilter: (req, file, cb) => { file.mimetype === 'application/pdf' ? cb(null, true) : cb(new Error('Only PDF files are allowed!')); } }).single('pdfFile'); app.post('/analyze-pdf', (req, res) => { upload(req, res, async function(err) { if (err) { return res.status(400).json({ error: 'Upload error', details: err.message }); } try { if (!req.file) { return res.status(400).json({ error: 'No PDF file uploaded' }); } const dataBuffer = fs.readFileSync(req.file.path); const data = await pdfParse(dataBuffer); const pdfText = data.text; fs.unlinkSync(req.file.path); const response = await axios.post('http://127.0.0.1:11434/api/generate', { model: "deepseek-r1:8b", prompt: `Analyze this resume. Resume text is between two --- given ahead: ---${pdfText}---`, stream: false }); res.json({ success: true, message: 'Successfully connected to Ollama', ollamaResponse: response.data }); } catch (error) { if (req.file && fs.existsSync(req.file.path)) { fs.unlinkSync(req.file.path); } res.status(500).json({ error: 'Error processing PDF', details: error.message }); } }); }); if (!fs.existsSync('uploads')) { fs.mkdirSync('uploads'); } const PORT = process.env.PORT || 3000; app.listen(PORT, () => { console.log(`Server is running on port ${PORT}`); }); 4. The client interacts with the backend by invoking /analyze-pdf API endpoint, which is of type POST. The client sends the user-uploaded pdf file as a payload to this API. 5. The server stores this file in uploads directory temporarily, and extracts the text in the file. 6. The server then prompts DeepSeek R1 using Ollama's localhost API endpoint. 7. DeepSeek R1 analyzes the content of the resume and provides its feedback. The server then responds to the client with this analysis using res.json(). Cloning and Running the React User Interface The user interface of the project will allow users to upload the resume, send this resume to the backend and then display the result of DeepSeek R1's analysis of the resume to the user. It will also display an internal chain of thoughts or thinking of DeepSeek R1 as well. 1. To get started, fork and then clone the project from this GitHub URL. You can simply clone it as well if you don't intend to do many customizations. 2. Once the project is cloned, go to the root project directory using the command cd deepseek-ollama-frontend. 3. Inside the project root, install the necessary dependencies using the npm install command. After the installation completes, start the project using the npm run dev command. 4. The main component of this React application is ResumeAnalyzer. Open it in your editor of choice and analyze it. JSX import './ResumeAnalyzer.css'; import React, { useState } from 'react'; import { Upload, Loader2 } from 'lucide-react'; import AnalysisSection from './AnalysisSection'; const ResumeAnalyzer = () => { const [file, setFile] = useState(null); const [loading, setLoading] = useState(false); const [feedback, setFeedback] = useState(null); const [error, setError] = useState(null); const handleFileChange = (event) => { const selectedFile = event.target.files?.[0]; if (selectedFile && selectedFile.type === 'application/pdf') { setFile(selectedFile); setError(null); } else { setError('Please upload a PDF file'); setFile(null); } }; const analyzePDF = async () => { if (!file) return; setLoading(true); setError(null); try { const formData = new FormData(); formData.append('pdfFile', file); const response = await fetch('http://localhost:3000/analyze-pdf', { method: 'POST', body: formData, }); if (!response.ok) { const errorData = await response.json(); throw new Error(errorData.details || 'Failed to analyze PDF'); } const data = await response.json(); setFeedback(data); } catch (err) { setError(err.message || 'An error occurred'); } finally { setLoading(false); } }; return ( <div className="max-w-4xl mx-auto p-4"> <div className="bg-gray-50 rounded-lg shadow-lg p-6"> <h1 className="text-3xl font-bold mb-6 text-gray-800">Resume Analyzer</h1> <div className="bg-white rounded-lg shadow-sm p-8"> <div className="border-2 border-dashed border-gray-300 rounded-lg p-8 text-center"> <Upload className="w-12 h-12 text-gray-400 mx-auto mb-4" /> <input type="file" accept=".pdf" onChange={handleFileChange} className="hidden" id="file-upload" /> <label htmlFor="file-upload" className="cursor-pointer text-blue-600 hover:text-blue-800 font-medium" > Upload Resume (PDF) </label> {file && ( <p className="mt-2 text-sm text-gray-600"> Selected: {file.name} </p> )} </div> </div> {error && ( <div className="mt-4 p-4 bg-red-50 text-red-700 rounded-lg border border-red-200"> {error} </div> )} <button onClick={analyzePDF} disabled={!file || loading} className="mt-6 w-full bg-blue-600 text-white py-3 px-4 rounded-lg hover:bg-blue-700 disabled:opacity-50 disabled:cursor-not-allowed flex items-center justify-center font-medium transition-colors" > {loading ? ( <> <Loader2 className="mr-2 h-5 w-5 animate-spin" /> Analyzing Resume... </> ) : ( 'Analyze Resume' )} </button> {feedback && !loading && ( <div className="mt-8"> <h2 className="text-2xl font-bold mb-6 text-gray-800">Analysis Results</h2> {feedback.ollamaResponse && <AnalysisSection ollamaResponse={feedback.ollamaResponse} /> } </div> )} </div> </div> ); }; export default ResumeAnalyzer; 5. This component provides an input field for the user to upload the file. 6. The uploaded file is sent to the server using the API endpoint. 7. The response of the server is divided into two parts- Internal thinking of the model and actual response of the model. 8. The AnalysisSection component displays the response of the model as well as houses the ExpandableSection component, which is used for displaying DeepSeek R1's internal thinking. 9. Navigate to the URL in your browser to load the application. Upload your resume (or any sample resume) and observe the analysis received by DeepSeek R1. Resume Analyzer Conclusion DeepSeek R1 provides a unique opportunity to build GenAI-powered applications completely in-house, and customize them as per your needs. In this article, you have learned about the benefits of using open-source GenAI models. Furthermore, you have set up a real application using DeepSeek R1, Node.js, and React. This setup allows you to perform resume analysis using AI completely offline. You can use this tool to hire smart at your organization, and I advise you to continue building on the knowledge gained from this article and explore more use cases and applications. More

Building RAG Apps With Apache Cassandra, Python, and Ollama

By Varun Setia

Retrieval-augmented generation (RAG) is the most popular approach for obtaining real-time data or updated data from a data source based on text input by users. Thus empowering all our search applications with state-of-the-art neural search. In RAG search systems, each user request is converted into a vector representation by embedding model, and this vector comparison is performed using various algorithms such as cosine similarity, longest common sub-sequence, etc., with existing vector representations stored in our vector-supporting database. The existing vectors stored in the vector database are also generated or updated asynchronously by a separate background process. This diagram provides a conceptual overview of vector comparison To use RAG, we need at least an embedding model and a vector storage database to be used by the application. Contributions from community and open-source projects provide us with an amazing set of tools that help us build effective and efficient RAG applications. In this article, we will implement the usage of a vector database and embedding generation model in a Python application. If you are reading this concept for the first time or nth time, you only need tools to work, and no subscription is needed for any tool. You can simply download tools and get started. Our tech stack consists of the following open-source and free-to-use tools: Operating system – Ubuntu LinuxVector database – Apache CassandraEmbedding model – nomic-embed-textProgramming language – Python Key Benefits of this Stack Open-sourceIsolated data to meet data compliance standards This diagram provides a high-level dependency architecture of the system Implementation Walkthrough You may implement and follow along if prerequisites are fulfilled; otherwise, read to the end to understand the concepts. Prerequisites Linux (In my case, it's Ubuntu 24.04.1 LTS)Java Setup (OpenJDK 17.0.2)Python (3.11.11)Ollama Ollama Model Setup Ollama is an open-source middleware server that acts as an abstraction between generative AI and applications by installing all the necessary tools to make generative AI models available to consume as CLI and API in a machine. It has most of the openly available models like llama, phi, mistral, snowflake-arctic-embed, etc. It is cross-platform and can be easily configured in OS. In Ollama, we will pull the nomic-embed-text model to generate embeddings. Run in command line: Plain Text ollama pull nomic-embed-text This model generates embeddings of size 768 vectors. Apache Cassandra Setup and Scripts Cassandra is an open-source NoSQL database designed to work with a high amount of workloads that require high scaling as per industry needs. Recently, it has added support for Vector search in version 5.0 that will facilitate our RAG use case. Note: Cassandra requires Linux OS to work; it can also be installed as a docker image. Installation Download Apache Cassandra from https://cassandra.apache.org/_/download.html. Configure Cassandra in your PATH. Start the server by running the following command in the command line: Plain Text cassandra Table Open a new Linux terminal and write cqlsh; this will open the shell for Cassandra Query Language. Now, execute the below scripts to create the embeddings keyspace, document_vectors table, and necessary index edv_ann_index to perform a vector search. SQL CREATE KEYSPACE IF NOT EXISTS embeddings WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : '1' }; USE embeddings; CREATE TABLE IF NOT EXISTS embeddings.document_vectors ( record_id timeuuid, id uuid, content_chunk text, content_vector VECTOR <FLOAT, 768>, created_at timestamp, PRIMARY KEY (id, created_at) ) WITH CLUSTERING ORDER BY (created_at DESC); CREATE INDEX IF NOT EXISTS edv_ann_index ON embeddings.document_vectors(content_vector) USING 'sai'; Note: content_vector VECTOR <FLOAT, 768> is responsible for storing vectors of 768 length that are generated by the model. Milestone 1: We are ready with database setup to store vectors. Python Code This programming language certainly needs no introduction; it is easy to use and loved by the industry with strong community support. Virtual Environment Set up virtual environment: Plain Text sudo apt install python3-virtualenv && python3 -m venv myvenv Activate virtual environment: Plain Text source /media/setia/Data/Tutorials/cassandra-ollama-app/myvenv/bin/activate Packages Download Datastax Cassandra package: Plain Text pip install cassandra-driver Download requests package: Plain Text pip install requests File Create a file named app.py. Now, write the code below to insert sample documents in Cassandra. This is the first step always to insert data in the database; it can be done by a separate process asynchronously. For demo purposes, I have written a method that will insert documents first in the database. Later on, we can comment on this method once the insertion of documents is successful. Python from cassandra.cluster import Cluster from cassandra.query import PreparedStatement, BoundStatement import uuid import datetime import requests cluster = Cluster(['127.0.0.1'],port=9042) session = cluster.connect() def generate_embedding(text): embedding_url = 'http://localhost:11434/api/embed' body = { "model": "nomic-embed-text", "input": text } response = requests.post(embedding_url, json = body) return response.json()['embeddings'][0] def insert_chunk(content, vector): id = uuid.uuid4() content_chunk = content content_vector = vector created_at = datetime.datetime.now() insert_query = """ INSERT INTO embeddings.document_vectors (record_id, id, content_chunk, content_vector, created_at) VALUES (now(), ?, ?, ?, ?) """ prepared_stmt = session.prepare(insert_query) session.execute(prepared_stmt, [ id, content_chunk, content_vector, created_at ]) def insert_sample_data_in_cassandra(): sentences = [ "The aroma of freshly baked bread wafted through the quaint bakery nestled in the cobblestone streets of Paris, making Varun feel like time stood still.", "Sipping a spicy masala chai in a bustling tea stall in Mumbai, Varun felt he was tasting the very soul of the city.", "The sushi in a small Tokyo diner was so fresh, it felt like Varun was on a culinary journey to the sea itself.", "Under the starry desert sky in Morocco, Varun enjoyed a lamb tagine that tasted like a dream cooked slowly over a fire.", "The cozy Italian trattoria served the creamiest risotto, perfectly capturing the heart of Tuscany on a plate, which Varun savored with delight.", "Enjoying fish tacos on a sunny beach in Mexico, with the waves crashing nearby, made the flavors unforgettable for Varun.", "The crispy waffles drizzled with syrup at a Belgian café were worth every minute of waiting, as Varun indulged in the decadent treat.", "A bowl of warm pho in a roadside eatery in Hanoi felt like comfort wrapped in a broth of herbs and spices, giving Varun a sense of warmth.", "Sampling chocolate truffles in a Swiss chocolate shop, Varun found himself in a moment of pure bliss amidst snow-capped mountains.", "The street food stalls in Bangkok served fiery pad Thai that left Varun with a tangy memory of the city’s vibrant energy." ] for sentence in sentences: vector = generate_embedding(sentence) insert_chunk(sentence, vector) insert_sample_data_in_cassandra() Now, run this file using the commandline in the virtual environment: Plain Text python app.py Once the file is executed and documents are inserted, this can be verified by querying the Cassandra database from the cqlsh console. For this, open cqlsh and execute: SQL SELECT content_chunk FROM embeddings.document_vectors; This will return 10 documents inserted in the database, as seen in the screenshot below. Milestone 2: We are done with data setup in our vector database. Now, we will write code to query documents based on cosine similarity. Cosine similarity is the dot product of two vector values. Its formula is A.B / |A||B|. This cosine similarity is internally supported by Apache Cassandra, helping us to compute everything in the database and handle large data efficiently. The code below is self-explanatory; it fetches the top three results based on cosine similarity using ORDER BY <column name> ANN OF <text_vector> and also returns cosine similarity values. To execute this code, we need to ensure that indexing is applied to this vector column. Python def query_rag(text): text_embeddings = generate_embedding(text) select_query = """ SELECT content_chunk,similarity_cosine(content_vector, ?) FROM embeddings.document_vectors ORDER BY content_vector ANN OF ? LIMIT 3 """ prepared_stmt = session.prepare(select_query) result_rows = session.execute(prepared_stmt, [ text_embeddings, text_embeddings ]) for row in result_rows: print(row[0], row[1]) query_rag('Tell about my Bangkok experiences') Remember to comment insertion code: Python #insert_sample_data_in_cassandra() Now, execute the Python code by using python app.py. We will get the output below: Plain Text (myvenv) setia@setia-Lenovo-IdeaPad-S340-15IIL:/media/setia/Data/Tutorials/cassandra-ollama-app$ python app.py The street food stalls in Bangkok served fiery pad Thai that left Varun with a tangy memory of the city’s vibrant energy. 0.8205469250679016 Sipping a spicy masala chai in a bustling tea stall in Mumbai, Varun felt he was tasting the very soul of the city. 0.7719690799713135 A bowl of warm pho in a roadside eatery in Hanoi felt like comfort wrapped in a broth of herbs and spices, giving Varun a sense of warmth. 0.7495554089546204 You can see the cosine similarity of "The street food stalls in Bangkok served fiery pad Thai that left Varun with a tangy memory of the city’s vibrant energy." is 0.8205469250679016, which is the closest match. Final Milestone: We have implemented the RAG search. Enterprise Applications Apache Cassandra For enterprises, we can use Apache Cassandra 5.0 from popular cloud vendors such as Microsoft Azure, AWS, GCP, etc. Ollama This middleware requires a VM compatible with Nvidia-powered GPU for running high-performance models, but we don't need high-end VMs for models used for generating vectors. Depending upon traffic requirements, multiple VMs can be used, or any generative AI service like Open AI, Anthropy, etc, whichever Total Cost of Ownership is lower for scaling needs or Data Governance needs. Linux VM Apache Cassandra and Ollama can be combined and hosted in a single Linux VM if the use case doesn't require high usage to lower the Total Cost of Ownership or to address Data Governance needs. Conclusion We can easily build RAG applications by using Linux OS, Apache Cassandra, embedding models (nomic-embed-text) used via Ollama, and Python with good performance without needing any additional cloud subscription or services in the comfort of our machines/servers. However, hosting a VM in server(s) or opt for a cloud subscription for scaling as an enterprise application compliant with scalable architectures is recommended. In this Apache, Cassandra is a key component to do the heavy lifting of our vector storage and vector comparison and Ollama server for generating vector embeddings. That's it! Thanks for reading 'til the end. More

Trend Report

Observability and Performance

The dawn of observability across the software ecosystem has fully disrupted standard performance monitoring and management. Enhancing these approaches with sophisticated, data-driven, and automated insights allows your organization to better identify anomalies and incidents across applications and wider systems. While monitoring and standard performance practices are still necessary, they now serve to complement organizations' comprehensive observability strategies. This year's Observability and Performance Trend Report moves beyond metrics, logs, and traces — we dive into essential topics around full-stack observability, like security considerations, AIOps, the future of hybrid and cloud-native observability, and much more.

Refcard #401

Getting Started With Agentic AI

By Lahiru Fernando

Refcard #400

Java Application Containerization and Deployment

By Mark Heckler

Managing Distributed System Locks With Azure Storage

Distributed systems have been there for a while now and there are well-known patterns already established when designing them. Today, we will discuss one of the popular patterns: "locks." Simply put, locks are how processes gain exclusive access to a resource to perform a certain action. For example, imagine there are a bunch of Blobs in a storage account, and you need one instance of your service to process each blob to avoid duplicate processing. The way to do it would be to acquire a lock on the blob, complete processing, and release it. However, a potential issue arises if a process fails before releasing the lock, either because the process died or due to a network partition, leaving the resource locked indefinitely. This can lead to deadlocks and resource contention. To prevent deadlocks, one strategy that can be employed is to use timeouts or leased-based locks. Timeout Lock In this case, there is a predefined timeout the process requests the lock for. If the lock is not released before the timeout, the system ensures the lock is eventually released. Lease Lock For lease-based locks, a renew lease API is provided alongside the timeout mechanism. The process holding the lock must call this API before the lease expires to maintain exclusive access to the resource. If the process fails to renew the lease in time, the lock is automatically released, allowing other processes to acquire it. Pros and Cons of Timeout and Lease-Based Locks ProsConsTimeout based lockSimple to implementRequires careful selection of the timeoutPrevent permanent locksIf the processing is not complete, then there is no way to renew the leaseLease based lockReduces risk of premature lock expirationRequires mechanism for lease renewalProcess can continue to request the lease until work is complete. Both the above strategies are a way to quickly recover from process failures or network partitions in distributed systems. Lease Lock Strategy With Azure Storage Let's look at how to use the Lease Lock strategy with Azure Storage. This also covers the Timeout lock strategy. Step 1: Import the Storage Blob Nuget "12.23.0" is the latest version at the time of authoring this article. The latest versions can be found at Azure Storage Blobs. XML <ItemGroup> <PackageReference Include="Azure.Storage.Blobs" Version="12.23.0" /> </ItemGroup> Step 2: Acquire the Lease Below is the code to acquire the lease. C# public async Task<string> TryAcquireLeaseAsync(string blobName, TimeSpan durationInSeconds, string leaseId = default) { BlobContainerClient blobContainerClient = new BlobContainerClient(new Uri($"https://{storageName}.blob.core.windows.net/processors"), tokenCredential, blobClientOptions); BlobLeaseClient blobLeaseClient = blobContainerClient.GetBlobClient(blobName).GetBlobLeaseClient(leaseId); try { BlobLease lease = await blobLeaseClient.AcquireAsync(durationInSeconds).ConfigureAwait(false); return lease.LeaseId; } catch (RequestFailedException ex) when (ex.Status == 409) { return default; } } First, we create a Blob Container Client and retrieve the Blob Client for the specific blob we want to acquire a lease on.Second, the "Acquire Async" method tries to acquire the lease for a specific duration. If the acquisition was successful, a lease Id is returned if not a 409 (Status code for conflict) is thrown.The "Acquire Async" is the key method here. The rest of the code can be tailored/edited as per your needs. Step 3: Renew the Lease "Renew Async" is the method in the Storage .NET SDK used for renewing the lease.If the renewal is unsuccessful an exception is thrown along with the reason for the cause of failure. C# public async Task ReleaseLeaseAsync(string blobName, string leaseId) { BlobLeaseClient blobLeaseClient = this.blobContainerClient.GetBlobClient(blobName).GetBlobLeaseClient(leaseId); await blobLeaseClient.RenewAsync().ConfigureAwait(false); } Step 4: Orchestrate the Acquire and Renew Lease Methods Initially, we call the "Try Acquire Lease Async" to fetch the lease identifier from Step 2. Once it is successful, a background task is kicked off that calls the "Renew Lease Async" from Step 3 every X seconds. Just make sure there is enough time between the timeout and when the renew lease method is called. C# string leaseId = await this.blobReadProcessor.TryAcquireLeaseAsync(blobName, TimeSpan.FromSeconds(60)).ConfigureAwait(false); Task leaseRenwerTask = this.taskFactory.StartNew( async () => { while (leaseId != default && !cancellationToken.IsCancellationRequested) { await Task.Delay(renewLeaseMillis).ConfigureAwait(false); await this.blobReadProcessor.RenewLeaseAsync(blobName, leaseId).ConfigureAwait(false); } }, CancellationToken.None, TaskCreationOptions.LongRunning, TaskScheduler.Default); The cancellation token is used to gracefully stop the lease renewal task when it's no longer needed. Step 5: Cancel the Lease Renewal When the "Cancel Async" method is called, the "IsCancellationRequested" in Step 4 becomes true, because of which we no longer enter the while loop and request for lease renewal. C# await cancellationTokenSource.CancelAsync().ConfigureAwait(false); await leaseRenwerTask.WaitAsync(Timeout.InfiniteTimeSpan).ConfigureAwait(false); Step 6: Release the Lease Finally, to release the lease just call the "Release Async" method. C# public async Task ReleaseLeaseAsync(string blobName, string leaseId) { BlobLeaseClient blobLeaseClient = this.blobContainerClient.GetBlobClient(blobName).GetBlobLeaseClient(leaseId); await blobLeaseClient.ReleaseAsync().ConfigureAwait(false); } Conclusion Locks are among the fundamental patterns in distributed systems to gain exclusive access to resources. It is necessary to keep the pitfalls in mind while dealing with them for the smooth running of operations. By using Azure Storage, we can implement these efficient locking mechanisms that can prevent indefinite blocking and, at the same time, provide elasticity in how the locks are maintained.

By Siri Varma Vegiraju

CORE

MuleSoft OAuth 2.0 Provider: Password Grant Type

OAuth 2.0 is a widely used authorization framework that allows third-party applications to access user resources on a resource server without sharing the user's credentials. The Password Grant type, also known as Resource Owner Password Credentials Grant, is a specific authorization grant defined in the OAuth 2.0 specification. It's particularly useful in scenarios where the client application is highly trusted and has a direct relationship with the user (e.g., a native mobile app or a first-party web application). This grant type allows the client to request an access token by directly providing the user's username and password to the authorization server. While convenient, it's crucial to implement this grant type securely, as it involves handling sensitive user credentials. This article details how to configure MuleSoft as an OAuth 2.0 provider using the Password Grant type, providing a step-by-step guide and emphasizing security best practices. Implementing this in MuleSoft allows you to centralize your authentication and authorization logic, securing your APIs and resources. Use Cases and Benefits Native mobile apps: Suitable for mobile applications where the user interacts directly with the app to provide their credentials.Trusted web applications: Appropriate for first-party web applications where the application itself is trusted to handle user credentials securely.API security: Enhances API security by requiring clients to obtain an access token before accessing protected resources.Centralized authentication: Allows for centralized management of user authentication within your MuleSoft environment. Prerequisites MuleSoft Anypoint Studio (latest version recommended)Basic understanding of OAuth 2.0 conceptsFamiliarity with Spring SecurityA tool for generating bcrypt hashes (or a library within your Mule application) Steps 1. Enable Spring Security Module Create a Mule Project Start by creating a new Mule project in Anypoint Studio. Add Spring Module Add the "Spring Module" from the Mule palette to your project. Drag and drop it into the canvas. Configure Spring Security Manager In the "Global Elements" tab, add a "Spring Config" and a "Spring Security Manager." These will appear as global elements. Configure the "Spring Security Manager" Set the "Name" to resourceOwnerSecurityProvider. This is a logical name for your security manager. Set the "Delegate Reference" to resourceOwnerAuthenticationManager. This links the security manager to the authentication manager defined in your Spring configuration. Configure Spring Config Set the "Path" of the "Spring Config" to your beans.xml file (e.g., src/main/resources/beans.xml). This tells Mule where to find your Spring configuration. Create the beans.xml file in the specified location (src/main/resources/beans.xml). This file defines the Spring beans, including the authentication manager. Add the following configuration: XML <?xml version="1.0" encoding="UTF-8"?> <beans xmlns="http://www.springframework.org/schema/beans" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ss="http://www.springframework.org/schema/security" xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-5.3.x.xsd http://www.springframework.org/schema/security http://www.springframework.org/schema/security/spring-security-5.3.x.xsd"> <ss:authentication-manager alias="resourceOwnerAuthenticationManager"> <ss:authentication-provider> <ss:user-service> <ss:user name="john" password="{bcrypt}$2a$10$somehashedpassword" authorities="READ_PROFILES"/> </ss:user-service> </ss:authentication-provider> </ss:authentication-manager> </beans> Critical Security Password hashing: The most important security practice is to never store passwords in plain text. The example above uses bcrypt, a strong hashing algorithm. You must replace $2a$10$somehashedpassword with an actual bcrypt hash of the user's password. Use a tool or library to generate this hash. The {bcrypt} prefix tells Spring Security to use the bcrypt password encoder.Spring security version: Ensure your beans.xml uses a current, supported version of the Spring Security schema. Older versions have known vulnerabilities. The provided example uses 5.3.x; adjust as needed. 2. Configure OAuth 2.0 Provider Add OAuth Provider Module Add the "OAuth Provider" module from the Mule palette to your project. Add OAuth2 Provider Config Add an "OAuth2 Provider Config" in the Global Configuration. This is where you'll configure the core OAuth settings. Configure OAuth Provider Token store: Choose a persistent token store. "In-Memory" is suitable only for development and testing. For production, use a database-backed store (e.g., using the Database Connector) or a distributed cache like Redis for better performance and scalability.Client store: Similar to the token store, use a persistent store for production (database or Redis recommended). This store holds information about registered client applications.Authorization endpoint: The URL where clients can request authorization. The default is usually /oauth2/authorize.Token endpoint: The URL where clients exchange authorization codes (or user credentials in the Password Grant case) for access tokens. The default is usually /oauth2/token.Authentication manager: Set this to resourceOwnerSecurityProvider (the name of your Spring Security Manager). This tells the OAuth provider to use your Spring Security configuration for user authentication. 3. Client Registration Flow You need a mechanism to register client applications. Create a separate Mule flow (or API endpoint) for this purpose. This flow should: Accept client details (e.g., client name, redirect URIs, allowed grant types).Generate a unique client ID and client secret.Store the client information (including the generated ID and secret) in the Client Store you configured in the OAuth Provider. Never expose client secrets in logs or API responses unless absolutely necessary, and you understand the security implications. Hash the client secret before storing it. 4. Validate Token Flow Create a flow to validate access tokens. This flow will be used by your protected resources to verify the validity of access tokens presented by clients. Use the "Validate Token" operation from the OAuth Provider module in this flow. This operation will check the token's signature, expiry, and other attributes against the Token Store. 5. Protected Resource Create the API endpoints or flows that you want to protect with OAuth 2.0. At the beginning of these protected flows, call the "Validate Token" flow you created. If the token is valid, the flow continues; otherwise, it returns an error (e.g., HTTP 401 Unauthorized). Testing 1. Register a Client Use Postman or a similar tool to register a client application, obtaining a client ID and client secret. If you implemented a client registration flow, use that. 2. Get Access Token (Password Grant) Send a POST request to the /oauth2/token endpoint with the following parameters: grant_type: passwordusername: johnpassword: test (use the bcrypt hashed password if you're storing passwords securely)client_id: Your client IDclient_secret: Your client secret 3. Access Protected Resource Send a request to your protected resource, including the access token in the Authorization header (e.g., Authorization: Bearer <access_token>). 4. Validate Token (Optional) You can also test the validation flow directly by sending a request with a token to the endpoint that triggers the "Validate Token" flow. Conclusion This document has provided a comprehensive guide to configuring MuleSoft as an OAuth 2.0 provider using the Password Grant type. By following these steps and paying close attention to the security considerations, you can effectively secure your APIs and resources. Remember that the Password Grant type should be used only when the client application is highly trusted. For other scenarios, explore other OAuth 2.0 grant types like the Authorization Code Grant, which offers better security for less trusted clients. Always consult the official MuleSoft and Spring Security documentation for the latest information and advanced configuration options. Properly securing your OAuth implementation is paramount to protecting user data and your systems. Relevant Links Spring cache annotations @cacheable, @CacheEvict, @CachePut useTennisScraper

By Nikhil Chawla

Magic of Aspects: How AOP Works in Spring

It is from modern applications that one expects a clean and maintainable codebase in order to be able to manage the growing complexity. This is where Aspect Oriented Programming (AOP) comes in. AOP is a paradigm that enables the developers to separate the cross-cutting concerns (such as logging, metrics, and security) from the business logic of the application, making the code both modular and easy to maintain. Why Is It Important to Know AOP? I’ll begin with a simple analogy: There are some things that you should do when building a house: you should think about the design of the house, about the rooms and the decor of the rooms. There are some other things, however, that you should be able to do across all the rooms of the house: smoke detectors should be easy to install, and security cameras should be easy to wire. Wouldn’t it be inefficient to do each of these in every single room? AOP has an invisible team that applies these systems wherever they are required without having to modify your primary construction plan. What Is AOP and Why Does It Matter? Aspect Oriented Programming is a programming paradigm that is complementary to Object Oriented Programming (OOP). While OOP arranges code into objects and methods, AOP is concerned with aspects or parts of the program that are not dependent on the objects. AOP is useful in tackling cross-cutting concerns, which are needs that are fully incorporated in the development of an application but cannot be attributed to a specific class. For instance: Logging We all know what logging is and why we need it, right? And I’m sure every one of you had that experience writing similar log lines, for example, at the beginning of your method. Something like: Plain Text log.info("Executing method someMethodName with following params {} {} {}".......) If you need to capture details about method execution across various classes, or, let’s say you are building an e-commerce application and you want to log a purchase. This is where AOP comes to help you. Instead of copy-pasting your log lines across all the methods, you may just create one aspect and apply it to all methods you need across the whole application. Security Ensuring sensitive methods are accessible only to authorized users. Are you developing a banking and don’t want all users to be able to see each other's account balances and all other data? Don’t I have to put in role checking in every method? I can implement role-based access control and then forget about it with AOP. Performance Metrics AOP might be useful for tracking the execution time of methods to identify bottlenecks or build dashboards. For applications that deal with many requests, for example, streaming services or real-time analytics dashboards, it is beneficial to know how long each request took to be processed. AOP can help measure and log the execution time of all methods so that optimizations can be done quickly. Therefore, These tasks are abstracted away from the developers by AOP so that they can focus on developing business logic without having to develop boilerplate code. Without AOP, you would have to copy and paste millions of lines of repetitive and intrusive code in every single method. On the flip side, with AOP, you can extract these concerns into one place and apply them dynamically where you need them just by adding a couple of annotations (I think annotations are the most convenient way, but they’re, of course, not the only option.) A Peek Behind the Magic I’ve just mentioned annotations, so let’s talk about them a little. In Spring, AOP often works hand-in-hand with annotations, making it almost magical. For example: @Transactional annotation. Oh yeah, every single time you add this annotation, you actually add a new aspect to your code.@Aspect annotation. Can you guess what it does? Exactly! It lets you define custom aspects. And you may ask — how does Spring achieve it? It uses proxies under the hood — dynamic classes that intercept method calls and inject additional behaviors, whatever you need aspects for. For example, when you annotate a method with @Transactional, Spring’s AOP intercepts the method call, begins a transaction, calls the method, and commits or rolls back the transaction depending on the outcome. This process is seamless to the developer but incredibly powerful. Just one word and so many actions under the hood. You may say @Transactional is great. But what if I want to create my own Aspect, what should I know? So, I think it’s time to dive into the foundational concepts of AOP, and understand what the usual concept consists of. Key Terms Aspect Let’s sum up: An aspect is a module that is a collection of cross-cut concerns such as logging, security transactions, or management. An aspect can be thought of as some helper module that would contain code that you don’t want in your business logic. It is like a cleaning robot that makes sure that your house is always clean. The robot works independently and enhances the quality of your life; in the same way, an aspect enhances your code. In Spring, aspects are represented as classes that are annotated with @Aspect annotation. Advice Advice is the "what" of AOP; it is used to define what action should be performed at which point in the application. There are several types of advice in Spring, and I first let you guess when it is usually executed: BeforeAfterAfterReturningAfterThrowingAround I’m sure you've managed to do it, but I still have to (just in case) add my clarifications on all the advice types: Before: It is executed before a method.After: It is executed after the method completion, irrespective of the method’s result.AfterReturning: It is executed after the method has completed successfully.AfterThrowing: It is executed after the method has thrown an exception.Around: It surrounds the method execution and can be the most versatile (e.g., timing how long a method takes to run). Plain Text @Aspect // This annotation defines a class as an Aspect, and will add this class in spring context public class LoggingAspect { @Before("execution(* dev.temnikov.service.*.*(..))") // Executes before any method in 'service' package public void logBeforeMethodExecution() { System.out.println("Logging BEFORE the method execution!"); } } Explanation @Before indicates that this advice will run before the method execution.execution(* dev.temnikov.service.*.*(..)) is a pointcut expression defining where this advice applies. (We’ll discuss pointcuts soon!) Join Point A join point in your application is any point at which an aspect may be applied. Join points in a Spring include method calls, object initialization, and field assignments. In Spring AOP, join points are all limited to method execution, so it’s pretty straightforward. Just to sum it up -> in Spring AOP, you are not allowed to apply aspects on Object Initialization or field assignment, and all your aspects should be associated with executions of methods. Pointcut A pointcut is a rule or expression that defines at which point advice should be applied. It is a filter that helps your code to select specific join points depending on different criteria. Such criteria might be method names, annotations, or parameters. Plain Text @Aspect public class PerformanceAspect { @Pointcut("execution(* dev.temnikov.service.*.*(..))") // Matches all methods in the 'service' package public void serviceLayerMethods() { // This method is just a marker; you should remain body empty } @Around("serviceLayerMethods()") // Applies advice to the defined pointcut public Object logExecutionTime(ProceedingJoinPoint joinPoint) throws Throwable { long start = System.currentTimeMillis(); Object result = joinPoint.proceed(); // Executes the target method long duration = System.currentTimeMillis() - start; System.out.println(joinPoint.getSignature() + " executed in " + duration + "ms"); return result; } } Explanation @Pointcut: Declares a reusable expression to match join points. You may think about pointcuts like filters.@Around: Wraps the method execution. As we mentioned in the Advice section, this advice executes your logic before and after target method execution.ProceedingJoinPoint: Represents the method being intercepted by Aspect. It allows us to control its execution. joinPoint.proceed() is the line of code you will use in more than 90% of your aspects because it executes the target method with all the parameters. How AOP Works in Spring Spring relies on runtime weaving via proxies. For example, if you have a UserService class, Spring creates a proxy object that applies the configured aspects to method calls on that proxy object. Proxies are a critical part of AOP, so let’s take a deeper look at proxies. So, let’s dig into the inner workings of Spring AOP: dynamic proxies and dependency injection in the Spring container. Proxies and Spring AOP Spring AOP is built around proxies that sit between objects and intercept method calls to apply the aspects. Spring uses two mechanisms to create these proxies: JDK Dynamic Proxies and CGLib. Let’s take a closer look at both of them: JDK Dynamic Proxies The JDK Proxy class is used when the target object implements at least one interface. The proxy acts as an intermediary, intercepting calls to the interface methods and applying aspects. How it works: A proxy class is generated at runtime by Spring.The proxy class implements the same interfaces as the target class.Method calls are made through the proxy, which applies aspects both before and after the execution of the target method. Plain Text public interface UserService { void performAction(); } public class UserServiceImpl implements UserService { @Override public void performAction() { System.out.println("Executing business logic..."); } } Let’s imagine we would like to create some Around aspect. If this aspect is applied to UserService interface, Spring will generate a proxy class like the code below. Pay attention to the way a proxy is created. It actually creates a new class using the Proxy class. Plain Text UserService proxy = (UserService) Proxy.newProxyInstance( UserService.class.getClassLoader(), new Class[]{UserService.class}, (proxyObj, method, args) -> { System.out.println("Aspect: Before method call"); Object result = method.invoke(new UserServiceImpl(), args); System.out.println("Aspect: After method call"); return result; } ); CGLIB Proxies When the target class doesn’t implement any interfaces, then Spring uses CGLIB (Code Generation Library) to create a subclass proxy. This proxy simply overrides the target class methods and applies aspects around method calls. Limitations: The target class cannot be final because then it will not be possible to extend it to create a proxy.As methods cannot be overridden, then they cannot be advised. Let’s create a simple class called OrderService with one void method: Plain Text public class OrderService { public void placeOrder() { System.out.println("Placing an order..."); } } If we add some aspect to it using Spring AOP — Spring would generate a proxy like this, just overriding methods. Plain Text public class OrderService$$EnhancerBySpring extends OrderService { @Override public void placeOrder() { System.out.println("Aspect: Before placing the order"); super.placeOrder(); System.out.println("Aspect: After placing the order"); } } Note: For explanatory purposes, I did not add any annotations in our base services: UserService and OrderService. My goal here was just to demonstrate the approach Spring will create proxy IF we add an aspect. As we can see, using CGLib allows us to create a proxy by just extending the base class and overriding methods we need to proxy. That’s why the limitation we mentioned earlier arose: If the method we want to use as jointPoint or the class itself is marked as Final, we can not override it or extend it because of Java limitations. Spring AOP Limitations AOP comes with its own set of challenges, despite AOP bringing numerous benefits such as separating concerns and reducing boilerplate code. In this chapter, we’ll cover situations where AOP may not be the ideal solution, along with some limitations of Spring AOP, and best practices to improve the readability and testability of your code with AOP. Methods on Beans Spring AOP only applies to Spring beans managed by the Spring container. For example, if a class is not a Spring bean (so it is not registered in the application context), then AOP won’t work. Moreover, Spring AOP doesn’t apply aspects to methods of beans called from within the same class since the proxy is created per bean. To understand why it happens you need to realize the stack trace of method execution. So, let’s consider the following code: Plain Text public class TestService { @Transactional public void A() { //Sorry for Upper Case naming System.out.println("Hello from method A"); B() } @Transactional public void B() {//Sorry for Upper Case naming System.out.println("Hello from method B"); } } So what will happen when somebody executes testService.A()? First of all, method A() will be executed as a method of Spring bean, that’s why transactional annotation will be applied and a new transaction opens (or maybe not if you configured it another way, nevertheless @Transactional will work as it should according to your configuration). But will the Transactional aspect be applied to the second method execution? When method A() calls method B()? The answer is NO. Why? As I mentioned before, we need to understand the stack trace of execution. Method A(), as I mentioned a couple of lines above, is executed via Spring Bean, so it is a proxied method. Actually, when we execute testService.A() our program executes testServiceProxy.A() where testServiceProxy is a Spring-generated proxy with aspect applied.But as we can see in the code, method B() is executed from method A(). So, in method A(), we are actually executing this.B(). This means method B() is executed directly from TestService rather than through the proxy, so it is not extended with aspect code. That’s why you should be careful when executing methods in one class. If you expect aspects to be implemented, you should think about some workaround or code refactoring. Tip When you need internal method calls to be advised, use a proxy-targeted approach, like putting the internal method calls through Spring-managed beans (i.e., bail out the logic into other beans). Or, refactor the method into a separate service.You can also use @PostConstruct and @PreDestroy annotations for initializing and cleaning up beans in a way that AOP can manage. Private Methods and Constructors If we think one way forward, we will be able to realize why Spring AOP only intercepts method calls via public or protected methods and does not support aspects applied to private methods. To execute the proxied method, we need to execute it from somewhere else: class from another package, or at least children class. So, there is no sense in creating any proxies/aspects for private methods. Tip Consider refactoring private methods into public methods if possible (not always a good idea for encapsulation reasons).If you still need to add some aspect for constructors, use @PostConstruct. It will add some initialization logic, and as you can see in the name, it will be executed right after the constructor is initiated.For advanced needs, you might have to switch to AspectJ, which provides compile-time or load-time weaving, and it may help you handle private methods and constructors, but this is out of the scope of this article. Summary Spring AOP is a powerful tool that introduces a lot of "magic" into your application. Every Spring developer uses this magic, sometimes without even realizing it. I hope this article has provided you with more clarity on how it works and how Spring creates and manages aspects internally.

By Danil Temnikov

Creating a Service for Sensitive Data With Spring and Redis

Many companies work with user-sensitive data that can’t be stored permanently due to legal restrictions. Usually, this can happen in fintech companies. The data must not be stored for longer than a predefined time period and should preferably be deleted after it has been used for service purposes. There are multiple possible options to solve this problem. In this post, I would like to present a simplified example of an application that handles sensitive data leveraging Spring and Redis. Redis is a high-performance NoSQL database. Usually, it is used as an in-memory caching solution because of its speed. However, in this example, we will be using it as the primary datastore. It perfectly fits our problem’s needs and has a good integration with Spring Data. We will create an application that manages a user's full name and card details (as an example of sensitive data). Card details will be passed (POST request) to the application as an encrypted string (just a normal string for simplicity). The data will be stored in the DB for five minutes only. After the data is read (GET request), it will be automatically deleted. The app is designed as an internal microservice of the company without public access. The user’s data can be passed from a user-facing service. Card details can then be requested by other internal microservices, ensuring sensitive data is kept secure and inaccessible from external services. Initialize Spring Boot Project Let’s start creating the project with Spring Initializr. We will need Spring Web, Spring Data Redis, and Lombok. I also added Spring Boot Actuator as it would definitely be useful in a real microservice. After initializing the service, we should add other dependencies. To be able to delete the data automatically after it has been read we will be using AspectJ. I also added some other dependencies that are helpful for the service and make it look more realistic (for a real-world service, you would definitely add some validation, for example). The final build.gradle would look like this: Groovy plugins { id 'java' id 'org.springframework.boot' version '3.3.3' id 'io.spring.dependency-management' version '1.1.6' id "io.freefair.lombok" version "8.10.2" } java { toolchain { languageVersion = JavaLanguageVersion.of(22) } } repositories { mavenCentral() } ext { springBootVersion = '3.3.3' springCloudVersion = '2023.0.3' dependencyManagementVersion = '1.1.6' aopVersion = "1.9.19" hibernateValidatorVersion = '8.0.1.Final' testcontainersVersion = '1.20.2' jacksonVersion = '2.18.0' javaxValidationVersion = '3.1.0' } dependencyManagement { imports { mavenBom "org.springframework.boot:spring-boot-dependencies:${springBootVersion}" mavenBom "org.springframework.cloud:spring-cloud-dependencies:${springCloudVersion}" } } dependencies { implementation 'org.springframework.boot:spring-boot-starter-data-redis' implementation 'org.springframework.boot:spring-boot-starter-web' implementation 'org.springframework.boot:spring-boot-starter-actuator' implementation "org.aspectj:aspectjweaver:${aopVersion}" implementation "com.fasterxml.jackson.core:jackson-core:${jacksonVersion}" implementation "com.fasterxml.jackson.core:jackson-databind:${jacksonVersion}" implementation "com.fasterxml.jackson.core:jackson-annotations:${jacksonVersion}" implementation "jakarta.validation:jakarta.validation-api:${javaxValidationVersion}" implementation "org.hibernate:hibernate-validator:${hibernateValidatorVersion}" testImplementation('org.springframework.boot:spring-boot-starter-test') { exclude group: 'org.junit.vintage' } testImplementation "org.testcontainers:testcontainers:${testcontainersVersion}" testImplementation 'org.junit.jupiter:junit-jupiter' } tasks.named('test') { useJUnitPlatform() } We need to set up a connection to Redis. Spring Data Redis properties in application.yml: YAML spring: data: redis: host: localhost port: 6379 Domain CardInfo is the data object that we will be working with. To make it more realistic let’s make card details to be passed in the service as encrypted data. We need to decrypt, validate, and then store incoming data. There will be three layers in the domain: DTO: request-level, used in controllersModel: service-level, used in business logicEntity: persistent-level, used in repositories DTO is converted to Model and vice versa in CardInfoConverter. Model is converted to Entity and vice versa in CardInfoEntityMapper. We use Lombok for convenience. DTO Java @Builder @Getter @ToString(exclude = "cardDetails") @NoArgsConstructor @AllArgsConstructor @JsonIgnoreProperties(ignoreUnknown = true) public class CardInfoRequestDto { @NotBlank private String id; @Valid private UserNameDto fullName; @NotNull private String cardDetails; } Where UserNameDto Java @Builder @Getter @ToString @NoArgsConstructor @AllArgsConstructor @JsonIgnoreProperties(ignoreUnknown = true) public class UserNameDto { @NotBlank private String firstName; @NotBlank private String lastName; } Card details here represent an encrypted string, and fullName is a separate object that is passed as it is. Notice how the cardDetails field is excluded from the toString() method. Since the data is sensitive, it shouldn’t be accidentally logged. Model Java @Data @Builder public class CardInfo { @NotBlank private String id; @Valid private UserName userName; @Valid private CardDetails cardDetails; } Java @Data @Builder public class UserName { private String firstName; private String lastName; } CardInfo is the same as CardInfoRequestDto except cardDetails (converted in CardInfoEntityMapper). CardDetails now is a decrypted object that has two sensitive fields: pan (card number) and CVV (security number): Java @Data @Builder @NoArgsConstructor @AllArgsConstructor @ToString(exclude = {"pan", "cvv"}) public class CardDetails { @NotBlank private String pan; private String cvv; } See again that we excluded sensitive pan and CVV fields from toString() method. Entity Java @Getter @Setter @ToString(exclude = "cardDetails") @NoArgsConstructor @AllArgsConstructor @Builder @RedisHash public class CardInfoEntity { @Id private String id; private String cardDetails; private String firstName; private String lastName; } In order for Redis to create a hash key of an entity, one needs to add @RedisHash annotation along with @Id annotation. This is how dto -> model conversion happens: Java public CardInfo toModel(@NonNull CardInfoRequestDto dto) { final UserNameDto userName = dto.getFullName(); return CardInfo.builder() .id(dto.getId()) .userName(UserName.builder() .firstName(ofNullable(userName).map(UserNameDto::getFirstName).orElse(null)) .lastName(ofNullable(userName).map(UserNameDto::getLastName).orElse(null)) .build()) .cardDetails(getDecryptedCardDetails(dto.getCardDetails())) .build(); } private CardDetails getDecryptedCardDetails(@NonNull String cardDetails) { try { return objectMapper.readValue(cardDetails, CardDetails.class); } catch (IOException e) { throw new IllegalArgumentException("Card details string cannot be transformed to Json object", e); } } In this case, the getDecryptedCardDetails method just maps a string to a CardDetails object. In a real application, the decryption logic would be implemented within this method. Repository Spring Data is used to create a repository. The CardInfo in the service is retrieved by its ID, so there is no need to define custom methods, and the code looks like this: Java @Repository public interface CardInfoRepository extends CrudRepository<CardInfoEntity, String> { } Redis Configuration We need the entity to be stored only for five minutes. To achieve this, we have to set up TTL (time-to-live). We can do it by introducing a field in CardInfoEntity and adding the annotation @TimeToLive on top. It can also be achieved by adding the value to @RedisHash: @RedisHash(timeToLive = 5*60). Both ways have some flaws. In the first case, we have to introduce a field that doesn’t relate to business logic. In the second case, the value is hardcoded. There is another option: implement KeyspaceConfiguration. With this approach, we can use property in application.yml to set TTL and, if needed, other Redis properties. Java @Configuration @RequiredArgsConstructor @EnableRedisRepositories(enableKeyspaceEvents = RedisKeyValueAdapter.EnableKeyspaceEvents.ON_STARTUP) public class RedisConfiguration { private final RedisKeysProperties properties; @Bean public RedisMappingContext keyValueMappingContext() { return new RedisMappingContext( new MappingConfiguration(new IndexConfiguration(), new CustomKeyspaceConfiguration())); } public class CustomKeyspaceConfiguration extends KeyspaceConfiguration { @Override protected Iterable<KeyspaceSettings> initialConfiguration() { return Collections.singleton(customKeyspaceSettings(CardInfoEntity.class, CacheName.CARD_INFO)); } private <T> KeyspaceSettings customKeyspaceSettings(Class<T> type, String keyspace) { final KeyspaceSettings keyspaceSettings = new KeyspaceSettings(type, keyspace); keyspaceSettings.setTimeToLive(properties.getCardInfo().getTimeToLive().toSeconds()); return keyspaceSettings; } } @NoArgsConstructor(access = AccessLevel.PRIVATE) public static class CacheName { public static final String CARD_INFO = "cardInfo"; } } To make Redis delete entities with TTL, one has to add enableKeyspaceEvents = RedisKeyValueAdapter.EnableKeyspaceEvents.ON_STARTUP to @EnableRedisRepositories annotation. I introduced the CacheName class to use constants as entity names and to reflect that there can be multiple entities that can be configured differently if needed. TTL value is taken from RedisKeysProperties object: Java @Data @Component @ConfigurationProperties("redis.keys") @Validated public class RedisKeysProperties { @NotNull private KeyParameters cardInfo; @Data @Validated public static class KeyParameters { @NotNull private Duration timeToLive; } } Here, there is only cardInfo, but there can be other entities. TTL properties in application.yml: YAML redis: keys: cardInfo: timeToLive: PT5M Controller Let’s add API to the service to be able to store and access the data by HTTP. Java @RestController @RequiredArgsConstructor @RequestMapping( "/api/cards") public class CardController { private final CardService cardService; private final CardInfoConverter cardInfoConverter; @PostMapping @ResponseStatus(CREATED) public void createCard(@Valid @RequestBody CardInfoRequestDto cardInfoRequest) { cardService.createCard(cardInfoConverter.toModel(cardInfoRequest)); } @GetMapping("/{id}") public ResponseEntity<CardInfoResponseDto> getCard(@PathVariable("id") String id) { return ResponseEntity.ok(cardInfoConverter.toDto(cardService.getCard(id))); } } Auto Deletion With AOP We want the entity to be deleted right after it was successfully read with a GET request. It can be done with AOP and AspectJ. We need to create Spring Bean and annotate it with @Aspect. Java @Aspect @Component @RequiredArgsConstructor @ConditionalOnExpression("${aspect.cardRemove.enabled:false}") public class CardRemoveAspect { private final CardInfoRepository repository; @Pointcut("execution(* com.cards.manager.controllers.CardController.getCard(..)) && args(id)") public void cardController(String id) { } @AfterReturning(value = "cardController(id)", argNames = "id") public void deleteCard(String id) { repository.deleteById(id); } } A @Pointcut defines the place where the logic is applied. Or, in other words, what triggers the logic to execute. The deleteCard method is where the logic is defined. It deletes the cardInfo entity by ID using CardInfoRepository. The @AfterReturning annotation means that the method should run after a successful return from the method that is defined in the value attribute. I also annotated the class with @ConditionalOnExpression to be able to switch on/off this functionality from properties. Testing We will write web tests using MockMvc and Testcontainers. Testcontainers Initializer for Redis Java public abstract class RedisContainerInitializer { private static final int PORT = 6379; private static final String DOCKER_IMAGE = "redis:6.2.6"; private static final GenericContainer REDIS_CONTAINER = new GenericContainer(DockerImageName.parse(DOCKER_IMAGE)) .withExposedPorts(PORT) .withReuse(true); static { REDIS_CONTAINER.start(); } @DynamicPropertySource static void properties(DynamicPropertyRegistry registry) { registry.add("spring.data.redis.host", REDIS_CONTAINER::getHost); registry.add("spring.data.redis.port", () -> REDIS_CONTAINER.getMappedPort(PORT)); } } With @DynamicPropertySource, we can set properties from the started Redis Docker container. Afterwards, the properties will be read by the app to set up a connection to Redis. Here are basic tests for POST and GET requests: Java public class CardControllerTest extends BaseTest { private static final String CARDS_URL = "/api/cards"; private static final String CARDS_ID_URL = CARDS_URL + "/{id}"; @Autowired private CardInfoRepository repository; @BeforeEach public void setUp() { repository.deleteAll(); } @Test public void createCard_success() throws Exception { final CardInfoRequestDto request = aCardInfoRequestDto().build(); mockMvc.perform(post(CARDS_URL) .contentType(APPLICATION_JSON) .content(objectMapper.writeValueAsBytes(request))) .andExpect(status().isCreated()) ; assertCardInfoEntitySaved(request); } @Test public void getCard_success() throws Exception { final CardInfoEntity entity = aCardInfoEntityBuilder().build(); prepareCardInfoEntity(entity); mockMvc.perform(get(CARDS_ID_URL, entity.getId())) .andExpect(status().isOk()) .andExpect(jsonPath("$.id", is(entity.getId()))) .andExpect(jsonPath("$.cardDetails", notNullValue())) .andExpect(jsonPath("$.cardDetails.cvv", is(CVV))) ; } } And the test to check auto deletion with AOP: Java @Test @EnabledIf( expression = "${aspect.cardRemove.enabled}", loadContext = true ) public void getCard_deletedAfterRead() throws Exception { final CardInfoEntity entity = aCardInfoEntityBuilder().build(); prepareCardInfoEntity(entity); mockMvc.perform(get(CARDS_ID_URL, entity.getId())) .andExpect(status().isOk()); mockMvc.perform(get(CARDS_ID_URL, entity.getId())) .andExpect(status().isNotFound()) ; } I annotated this test with @EnabledIf as AOP logic can be switched off in properties, and the annotation determines whether the test should be run. Links The source code of the full version of this service is available on GitHub.

By Alexander Rumyantsev

Building Custom Tools With Model Context Protocol

Model Context Protocol (MCP) is becoming increasingly important in the AI development landscape, enabling seamless integration between AI models and external tools. In this guide, we'll explore how to create an MCP server that enhances AI capabilities through custom tool implementations. What Is Model Context Protocol? MCP is a protocol that allows AI models to interact with external tools and services in a standardized way. It enables AI assistants like Claude to execute custom functions, process data, and interact with external services while maintaining a consistent interface. Getting Started With MCP Server Development To begin creating an MCP server, you'll need a basic understanding of Python and async programming. Let's walk through the process of setting up and implementing a custom MCP server. Setting Up Your Project The easiest way to start is by using the official MCP server creation tool. You have two options: Plain Text # Using uvx (recommended) uvx create-mcp-server # Or using pip pip install create-mcp-server create-mcp-server This creates a basic project structure: Plain Text my-server/ ├── README.md ├── pyproject.toml └── src/ └── my_server/ ├── __init__.py ├── __main__.py └── server.py Implementing Your First MCP Server Let's create a practical example: an arXiv paper search tool that AI models can use to fetch academic papers. Here's how to implement it: Plain Text import asyncio from mcp.server.models import InitializationOptions import mcp.types as types from mcp.server import NotificationOptions, Server import mcp.server.stdio import arxiv server = Server("mcp-scholarly") client = arxiv.Client() @server.list_tools() async def handle_list_tools() -> list[types.Tool]: """ List available tools. Each tool specifies its arguments using JSON Schema validation. """ return [ types.Tool( name="search-arxiv", description="Search arxiv for articles related to the given keyword.", inputSchema={ "type": "object", "properties": { "keyword": {"type": "string"}, }, "required": ["keyword"], }, ) ] @server.call_tool() async def handle_call_tool( name: str, arguments: dict | None ) -> list[types.TextContent | types.ImageContent | types.EmbeddedResource]: """ Handle tool execution requests. Tools can modify server state and notify clients of changes. """ if name != "search-arxiv": raise ValueError(f"Unknown tool: {name}") if not arguments: raise ValueError("Missing arguments") keyword = arguments.get("keyword") if not keyword: raise ValueError("Missing keyword") # Search arXiv papers search = arxiv.Search( query=keyword, max_results=10, sort_by=arxiv.SortCriterion.SubmittedDate ) results = client.results(search) # Format results formatted_results = [] for result in results: article_data = "\n".join([ f"Title: {result.title}", f"Summary: {result.summary}", f"Links: {'||'.join([link.href for link in result.links])}", f"PDF URL: {result.pdf_url}", ]) formatted_results.append(article_data) return [ types.TextContent( type="text", text=f"Search articles for {keyword}:\n" + "\n\n\n".join(formatted_results) ), ] Key Components Explained Server initialization. The server is initialized with a unique name that identifies your MCP service.Tool registration. The @server.list_tools() decorator registers available tools and their specifications using JSON Schema.Tool implementation. The @server.call_tool() decorator handles the actual execution of the tool when called by an AI model.Response formatting. Tools return structured responses that can include text, images, or other embedded resources. Best Practices for MCP Server Development Input validation. Always validate input parameters thoroughly using JSON Schema.Error handling. Implement comprehensive error handling to provide meaningful feedback.Resource management. Properly manage external resources and connections.Documentation. Provide clear descriptions of your tools and their parameters.Type safety. Use Python's type hints to ensure type safety throughout your code. Testing Your MCP Server There are two main ways to test your MCP server: 1. Using MCP Inspector For development and debugging, the MCP Inspector provides a great interface to test your server: Plain Text npx @modelcontextprotocol/inspector uv --directory /your/project/path run your-server-name The Inspector will display a URL that you can access in your browser to begin debugging. 2. Integration With Claude Desktop To test your MCP server with Claude Desktop: Locate your Claude Desktop configuration file: MacOS: ~/Library/Application Support/Claude/claude_desktop_config.jsonWindows: %APPDATA%/Claude/claude_desktop_config.jsonAdd your MCP server configuration: Plain Text { "mcpServers": { "mcp-scholarly": { "command": "uv", "args": [ "--directory", "/path/to/your/mcp-scholarly", "run", "mcp-scholarly" ] } } } For published servers, you can use a simpler configuration: Plain Text { "mcpServers": { "mcp-scholarly": { "command": "uvx", "args": [ "mcp-scholarly" ] } } } Start Claude Desktop — you should now see your tool (e.g., "search-arxiv") available in the tools list: Testing checklist: Verify tool registration and discoveryTest input validationCheck error handlingValidate response formattingEnsure proper resource cleanup Integration With AI Models Once your MCP server is ready, it can be integrated with AI models that support the Model Context Protocol. The integration enables AI models to: Discover available tools through the list_tools endpointCall specific tools with appropriate parametersProcess the responses and incorporate them into their interactions For example, when integrated with Claude Desktop, your MCP tools appear in the "Available MCP Tools" list, making them directly accessible during conversations. The AI can then use these tools to enhance its capabilities — in our arXiv example, Claude can search and reference academic papers in real time during discussions. Common Challenges and Solutions Async operations. Ensure proper handling of asynchronous operations to prevent blocking.Resource limits. Implement appropriate timeouts and resource limits.Error recovery. Design robust error recovery mechanisms.State management. Handle server state carefully in concurrent operations. Conclusion Building an MCP server opens up new possibilities for extending AI capabilities. By following this guide and best practices, you can create robust tools that integrate seamlessly with AI models. The example arXiv search implementation demonstrates how to create practical, useful tools that enhance AI functionality. Whether you're building research tools, data processing services, or other AI-enhanced capabilities, the Model Context Protocol provides a standardized way to extend AI model functionality. Start building your own MCP server today and contribute to the growing ecosystem of AI tools and services. My official MCP Scholarly server has been accepted as a community server in the MCP repository. You can find it under the community section here. Resources Model Context Protocol DocumentationMCP Official RepositoryMCP Python SDKMCP Python Server CreatorMCP Server ExamplesarXiv API DocumentationExample arXiv Search MCP Server For a deeper understanding of MCP and its capabilities, you can explore the official MCP documentation, which provides comprehensive information about the protocol specification and implementation details.

By Aditya Karnam Gururaj Rao

How to Build a Data Dashboard Prototype With Generative AI

This article is a tutorial that shows how to build a data dashboard to visualize book reading data taken from Goodreads. It uses a low-code approach to prototype the dashboard using natural language prompts to an open-source tool, Vizro-AI, which generates Plotly charts that can be added to a template dashboard. You'll see how to iterate prompts to build three charts then add the prompts to a Notebook to generate an interactive dashboard. Finally, the generated dashboard code is added to a shared project that can be tweaked to improve the prototype. It's still not complete and can definitely be extended and improved upon. Let me know in the comments if you try it out! The dataset for this project was reading data from my personal Goodreads account; it can be downloaded from my GitHub repo. If you use Goodreads, you can export your data in CSV format, substitute it for the dataset provided, and explore it with the code for this tutorial. Before I started the project, I filtered the dataset to retain only books with an ISBN since that can be used with Google Books API to retrieve additional data about a book, such as the cover graphics and detailed category information. This project doesn't take advantage of the Books API, but by including ISBN data in the dataset, there is scope to extend the prototype project in the future, which is one of the potential extensions I mentioned above. OpenAI Parts of this tutorial use OpenAI models through Vizro-AI. To run through those steps yourself, you must have an OpenAI account with paid-for credits available. None of the free accounts will suffice. You may check the OpenAI models and pricing on their website. The latter parts of the tutorial take the generated code and work with that, and you don't need a key for those. Note: Before using a generative AI model, please review OpenAI's guidelines on risk mitigation to understand potential model limitations and best practices. See the OpenAI site for more details on responsible usage. Chart Generation With Vizro-AI In the first step, I use a hosted version of Vizro-AI, found at https://py.cafe/app/vizro-official/vizro-ai-charts. To see it for yourself, navigate to the site, which looks as follows: Settings The link will open with a settings pane where you can set the API key for your chosen vendor. At the time of writing, you can use OpenAI, Anthropic, Mistral, or xAI: To return to these settings at any time, you'll notice a cog icon at the top right-hand corner to access them. Once the API key is set, return to the main screen and upload the data for the project, which is stored here. Now, you can use Vizro-AI to build some charts by iterating text to form effective prompts. Chart 1: Books Timeline To ask Vizro-AI to build a chart, describe what you want to see. The first chart should show an ordered horizontal timeline to illustrate the sequence of reading the books. Plot a chart with the title "Sequence of reading" . It is a scatter chart. Use the x axis to show the date a book was read. Plot it at y=1. The plot looks as follows: It's not perfect! Hovering over each point gave the date the book was read, but not the title of the book, although this could be achieved by tweaking the prompt to ask explicitly for particular fields in the hover text. You'll also notice that the points are spaced evenly rather than proportionately. The code to generate the plot is shown underneath the prompt. Here it is for easy access (also as a gist): Python import plotly.graph_objects as go from vizro.models.types import capture @capture("graph") def custom_chart(data_frame): fig = go.Figure() fig.add_trace(go.Scatter(x=data_frame["Date Read"], y=[1] * len(data_frame), mode="markers")) fig.update_layout(title="Sequence of reading", xaxis_title="Date Read", yaxis_title="Y=1") return fig Chart 2: Reading Velocity The second chart should plot the cumulative total of pages read and the cumulative total of books read per year. The prompt took a few iterations to refine: Plot a chart with the title "Pages and Book totals" . It shows the cumulative total number of pages read by summing the Number of Pages of each book read in each year, using the Date Read data. Plot date on the x axis and the number of pages on the y axis using a scale on the left hand side of the chart. Superimpose a bar chart showing the total books read for each year, taking data from the Date Read column. Show the total books read using the right hand side of the chart, which can be a different scale to the y axis shown on the left hand side. The Plotly code generated with gpt-4-turbo was as follows (also as a gist): Python import pandas as pd import plotly.graph_objects as go from plotly.subplots import make_subplots from vizro.models.types import capture @capture("graph") def custom_chart(data_frame): # Convert Date Read to datetime data_frame["Date Read"] = pd.to_datetime(data_frame["Date Read"], dayfirst=True) # Group by year and sum pages pages_per_year = data_frame.groupby(data_frame["Date Read"].dt.year)["Number of Pages"].sum().cumsum() # Count books per year books_per_year = data_frame.groupby(data_frame["Date Read"].dt.year).size() # Create subplot fig = make_subplots(specs=[[{"secondary_y": True}]]) # Add line for cumulative pages fig.add_trace( go.Scatter( x=pages_per_year.index, y=pages_per_year, mode="lines", name="Cumulative Pages", ), secondary_y=False, ) # Add bar for books count fig.add_trace( go.Bar(x=books_per_year.index, y=books_per_year, name="Total Books"), secondary_y=True, ) # Set y-axes titles fig.update_yaxes(title_text="Cumulative Pages", secondary_y=False) fig.update_yaxes(title_text="Total Books", secondary_y=True) # Set layout fig.update_layout(title="Pages and Book totals", xaxis_title="Year", showlegend=True) return fig The chart could be improved by bringing the line graph on top of the bar chart, but varying the prompt to make this explicit did not have the desired results: Chart 3: Reviews Comparison The third chart should illustrate the difference between the rating the Goodreads reader assigned a book and the average rating across the Goodreads community. This prompt took a degree of iteration and needed me to specify how to draw the lines between the points, which is a key learning when using generative AI: your results will vary from run to run. The type of chart ("dumbbell") was given explicitly to guide the chart creation. For each row, create a dumbbell chart to show the difference between My Rating and Average Rating for each book. Use shapes to add the horizontal lines between markers. Omit the legend. Don't show any row where My Rating is 0. Here's the code generated (also as a gist): Python import plotly.graph_objects as go from vizro.models.types import capture @capture("graph") def custom_chart(data_frame): # Filter out rows where 'My Rating' is 0 df_filtered = data_frame[data_frame["My Rating"] != 0] # Create a blank figure fig = go.Figure() # Add dumbbell lines and markers for each book for index, row in df_filtered.iterrows(): fig.add_trace( go.Scatter( x=[row["My Rating"], row["Average Rating"]], y=[index, index], mode="markers+lines", marker=dict(size=10), line=dict(width=2), name=row["Title"], showlegend=False, ) ) # Update layout fig.update_layout( title="Comparison of My Rating vs Average Rating", xaxis_title="Rating", yaxis_title="Books", yaxis=dict( tickmode="array", tickvals=list(df_filtered.index), ticktext=df_filtered["Title"], ), ) return fig The plot looks as follows: Dashboard Generation With Vizro-AI Set Up a Jupyter Notebook At this point, I have prototypes for three Plotly charts for the Goodreads data. To display these as an interactive dashboard, I need some additional code, and Vizro-AI can generate this for me, but not through the application hosted on PyCafe (at the time of writing). I'll use a Jupyter Notebook instead. Before running the Notebook code, set up Vizro-AI inside a virtual environment with Python 3.10 or later. Install the package with pip install vizro_ai. You need to give Vizro-AI your API key to access OpenAI by adding it to your environment so the code you write in the next step can access it to successfully call OpenAI. There are some straightforward instructions in the OpenAI docs, and the process is also covered in Vizro's LLM setup guide. Build a Dashboard Now open a Jupyter Notebook to submit a single prompt that combines the three prompts listed above, with some small edits to ask for a dashboard that has three pages: one for each chart. The following shows the code (also available as a gist) to make the request to Vizro-AI to build and display the dashboard. The data manipulation has been omitted, but the full Notebook is available for download from my repo: Python user_question = """ Create a dashboard with 3 pages, one for each chart. On the first page, plot a chart with the title "Sequence of reading" . It is a scatter chart. Use the x axis to show the date a book was read. Plot it at y=1. On the second page, lot a chart with the title "Pages and Book totals" . It shows the cumulative total number of pages read by summing the Number of Pages of each book read in each year, using the Date Read data. Plot date on the x axis and the number of pages on the y axis using a scale on the left hand side of the chart. Superimpose a bar chart showing the total books read for each year, taking data from the Date Read column. Show the total books read using the right hand side of the chart which can be a different scale to the y axis shown on the left hand side. On the third page, for each row, create a dumbbell chart to show the difference between My Rating and Average Rating for each book. Use shapes to add the horizontal lines between markers. Omit the legend. Don't show any row where My Rating is 0. """ result = vizro_ai.dashboard([df_cleaned], user_question, return_elements=True) Vizro().build(result.dashboard).run(port=8006) print(result.code) Using gpt-4-turbo, Vizro-AI generates a set of Plotly chart codes and the necessary Vizro support code to build a dashboard. The generated code is displayed as output in the Notebook with the dashboard, although the dashboard is better viewed at http://localhost:8006/. Add Dashboard Interactivity To make the Vizro dashboards more interactive, I'll ask Vizro-AI to add the code for a control. As a simple example, let's extend the prompt to ask for a date picker control to modify the time period displayed for the Date Read column and change the scale on the x-axis of the first chart. diff user_question = """ Create a dashboard with 3 pages, one for each chart. On the first page, plot a chart with the title "Sequence of reading" . It is a scatter chart. Use the x axis to show the date a book was read. Plot it at y=1. + Add a date picker filter so the user can adjust the range of dates for the Date Read on the x axis. On the second page, plot a chart with the title "Pages and Book totals" . It shows the cumulative total number of pages read by summing the Number of Pages of each book read in each year, using the Date Read data. Plot date on the x axis and the number of pages on the y axis using a scale on the left hand side of the chart. Superimpose a bar chart showing the total books read for each year, taking data from the Date Read column. Show the total books read using the right hand side of the chart which can be a different scale to the y axis shown on the left hand side. On the third page, for each row, create a dumbbell chart to show the difference between My Rating and Average Rating for each book. Use shapes to add the horizontal lines between markers. Omit the legend. Don't show any row where My Rating is 0. """ Get the Notebook You can see the code output in the Notebook stored on GitHub, and I'll reproduce it in the next section below. You can also generate similar output by running it yourself, although it will not necessarily be identical because of the variability of results returned from generative AI. The charts Vizro-AI generated were similar to those created by the PyCafe host above, although the first chart was improved. The books are spaced proportionately to the date I read them, and the hover text includes the book title as well as the date read without an explicit request to do so. Now I have a Notebook with code to call Vizro-AI to build a prototype Vizro dashboard with a set of three pages and three charts, plus a control to filter the view. Because the code generated by Vizro-AI can vary from run to run, and calling OpenAI each time a dashboard is needed can get costly, it makes sense to convert the generated code into its own project. So, I have transferred the generated code from the output of Vizro-AI in the Notebook into a PyCafe project. There are three changes to the Notebook code needed for it to run on PyCafe: Add from vizro import Vizro to the imports listAdd Vizro().build(model).run() at the end of the code blockUncomment the data manager code and replace it with the code needed to access the filtered_books.csv dataset. More About PyCafe If you've not used it before, PyCafe is a free platform to create and share Python web applications, like Vizro dashboards, as well as Streamlit and Dash applications, through a web browser. It grew out of a need to share Solara code snippets, and communicate with users, and was launched in June this year. It's based on the open-source Pyodide project. Prototype Summary To get to this point, I used Vizro-AI to generate a set of charts by iterating prompts. I then converted the successful prompts to build a Vizro dashboard using Vizro-AI in a Notebook using a few lines of support code. Finally, I converted the generated Python to a PyCafe project with a few additional lines of support code, removing the dependency on OpenAI in doing so, and making the project easier to share. It's true that the code generated by Vizro-AI does not make a perfect dashboard, but it has been very easy to get it to a reasonable prototype. Let's look at a few improvements to the Plotly code to further improve the charts, as shown below in a separate PyCafe project. Improvements to Generated Code In this version of the dashboard, the first chart that shows the sequence of books read has been modified to improve the information supplied when hovering over a point, and the opacity of the points has been altered to make it more attractive. The control has been changed to a slider for the Date Read field. In the second chart to show the cumulative total of pages and books read, the line chart has been explicitly plotted on top of the bar chart. In the third chart that shows the rating comparison, the color scheme has been updated to make it clearer which is the My Rating value compared to the Average Rating. Follow this link to try out the dashboard in full app mode.

By Jo Stichbury

CORE

Using Spring AI to Generate Images With OpenAI's DALL-E 3

Hi, community! This is my first article in a series of introductions to Spring AI. Today, we will see how we can easily generate pictures using text prompts. To achieve this, we will leverage the OpenAI API and the DALL-E 3 model. In this article, I'll skip the explanation of some basic Spring concepts like bean management, starters, etc, as the main goal of this article is to discover Spring AI capabilities. For the same reason, I'll not create detailed instructions on how to generate an OpenAI API key. Prerequisite If you don't have an active OpenAI API key, do the following steps: Create an account on OpenAI.Generate the token on the API Keys page. Step 1: Set Up a Project To quickly generate a project template with all necessary dependencies, you may use https://start.spring.io/. In my example, I'll use Java 17 and Spring Boot 3.4.1. We also need to include the following dependencies: Spring WEB: This dependency will allow us to create a web server and expose REST endpoints as entry points to our applicationOpenAI: This dependency provides us with smooth integration with Open AI just by writing a couple lines of code and a few lines of configurations. After clicking generate, open downloaded files in the IDE you are working on and validate that all necessary dependencies exist in pom.xml. XML <dependency> <groupId>org.springframework.boot</groupId> <artifactId>spring-boot-starter-web</artifactId> </dependency> <dependency> <groupId>org.springframework.ai</groupId> <artifactId>spring-ai-openai-spring-boot-starter</artifactId> </dependency> Step 2: Set Up a Configuration File As a next step, we need to configure our property file. By default, Spring uses application.yaml or application.properties file. In this example, I'm going to use yaml format. You may reformat the code into .properties if you feel more comfortable working with this format. Here are all the configs we need to add to the application.yaml file: YAML spring: ai: openai: api-key: [your OpenAI api key] image: options: model: dall-e-3 size: 1024x1024 style: vivid quality: standard response-format: url Model: We are going to use the dall-e-3 model as the only available model in Spring AI at the moment of writing this article.Size: Configures the size of the generated image. It must be one of 1024x1024, 1792x1024, or 1024x1792 for dall-e-3 modelStyle: The vivid style generates more hyper-realistic images. If you want your pictures to look more real, set value natural.Quality: Might be one out of two options: standard or HD.Response-format: Might be one out of two options: url and b64_json. I'll be using the URL for demo purposes and simplicity. The image will be available by URL one hour after generation. Step 3: Create ImageGenerationService Let's create a service that will be responsible for generating images. Java @Service public class ImageGenerationService { @Autowired ImageModel imageModel; } We created a new class and annotated it as a Service. We also autowired the ImageModel bean. ImageModel is the main interface used to generate pictures. As we provided all the necessary configurations in Step 2, Spring Boot Starter will automatically generate an implementation of this interface called OpenAiImageModel for us. When our class is configured, we may start implementing a method that calls an OpenAI API to generate pictures using our prompts. And this is where the real magic of Spring AI will happen. Let's take a look at it. Java public String generateImage(String prompt) { ImagePrompt imagePrompt = new ImagePrompt(prompt); ImageResponse imageResponse = imageModel.call(imagePrompt); return imageResponse.getResult().getOutput().getUrl(); } That's it! We just need three lines of code to actually generate an image with Spring AI. Isn't that amazing? In the first step, we created a new ImagePrompt just by providing a string prompt. Next, we made an API call using imageModel.call(imagePrompt) and stored the response in the ImageResponse variable. In the last step, we returned the URL of the generated image. Remember, the image is only available for one hour; after that, the link will not be available anymore. So don't forget to save your masterpiece! Step 4: Create ImageGenerationController to Run Our Code We need to create a last file to allow users to execute our integration. It may look like this: Java @RestController() @RequestMapping("/image") public class ImageGenerationController { @Autowired ImageGenerationService imageService; @GetMapping("/generate") public ResponseEntity<String> generateImage(@RequestParam String prompt) { return ResponseEntity.ok(imageService.generateImage(prompt)); } } As you can see, we just created a simple controller with just one GET endpoint inside. This endpoint will be available at localhost:8080/image/generate. Step 5. Run Our Application To start our application, we need to run the following command: Plain Text mvn spring-boot:run When the application is running, we may check the result by executing the following curl with any prompt you want. I used the following: Cute cat playing chess. Don't forget to add %20 instead of whitespaces if you are using the command line for calling your endpoint: Shell curl -X GET "http://localhost:8080/image/generate?prompt=Cute%20cat%20playing%20chess" After executing, wait a few seconds as it takes some time for OpenAI to generate your image and voila: Congratulations! You've just created and tested your first Spring AI application, which generates images using custom prompts! Step 6: Give More Flexibility in Generating Images (Optional) In the second step, we configured the default behavior to our model and provided all the necessary configurations in the application.yaml file. But can we give more flexibility to our users and let them provide their configurations? The answer is yes! To do this, we need to use the ImageOptions interface. Here is an example: Java public String generateImage(GenerateImageRequest imageRequest) { ImageOptions options = OpenAiImageOptions.builder() .withQuality("standard") .withStyle("vivid") .withHeight(1024) .withWidth(1024) .withResponseFormat("url") .build(); ImagePrompt imagePrompt = new ImagePrompt(imageRequest.getPrompt(), options); ImageResponse imageResponse = imageModel.call(imagePrompt); return imageResponse.getResult().getOutput().getUrl(); } To achieve this, we need to build options programmatically using all the configs we set up in application.yaml and provide these options when creating an ImagePrompt object. You may find more configuration options in Spring AI docs. Conclusion Spring AI is a great tool that helps developers smoothly integrate with different AI models. As of writing this article, Spring AI supports five image models, including but not limited to Azure AI and Stability. I hope you found this article helpful and that it will inspire you to explore Spring AI more deeply.

By Danil Temnikov

Stop Shipping Waste: Fix Your Product Backlog

TL; DR: Stop Shipping Waste When product teams fail to establish stakeholder alignment and implement rigorous Product Backlog management, they get caught in an endless cycle of competing priorities, reactive delivery, and shipping waste. The result? Wasted resources, frustrated teams, and missed business opportunities. Success in 2025 requires turning your Product Backlog from a chaotic wish list into a strategic tool that connects vision to value delivery. Learn how to do so. Two Systemic Failures Leading to Shipping Waste Product management is a balancing act. Teams must manage customer needs, stakeholder expectations, technical constraints, and business goals while delivering measurable outcomes. Yet, despite their best intentions, many product teams fall short. Why? Two pervasive issues often lie at the root of this failure: A lack of alignment and a broken Product Backlog Management process. Let’s unpack why these failures matter — and how overcoming them can transform your team’s impact. (And, possibly, your career!) Failure #1: The Alignment Gap Imagine a scenario: Developers build features stakeholders think customers want, only to discover post-launch that the solution misses the mark. Sales teams push for one priority, product leadership has a different idea, engineering advocates for another, and executives demand faster timelines. The result? Often wasted efforts, frustrated teams, disappointed customers, and missed business objectives. Misalignment isn’t just inconvenient — it’s costly. When stakeholders operate in silos, ignoring the benefits of product leadership, product teams lose sight of the “why” behind their work. Product roadmaps become wish lists, product strategy feels disconnected from execution, and collaboration dissolves into competing agendas. Without shared ownership of priorities, even the most talented product teams struggle to deliver meaningful outcomes. The Fix Alignment isn’t about enforcing consensus — it’s about creating clarity. Teams need frameworks to connect product vision to daily work. Tools like user story mapping, outcome-focused roadmaps, and structured stakeholder workshops can bridge gaps. Moreover, by frequently integrating customer insights and data with business objectives, product teams foster collaboration, ensuring everyone rallies behind the same objectives. Failure #2: The Backlog Black Hole The Product Backlog is meant to be a strategic asset. Yet, for many teams, it’s an overwhelming, chaotic list of tasks — a “black hole” where ideas go to die. Common symptoms of dysfunctional Product Backlogs include: Endless, low-value items drowning critical priorities.Stakeholders bypassing processes to demand urgent work.Teams stuck in reactive mode, shipping outputs without measurable impact. A poorly managed backlog erodes trust. Stakeholders see delays and confusion; product teams feel overwhelmed by shifting demands. Worse, without transparency, the backlog becomes a source of conflict rather than a tool for value delivery. The Fix Effective Product Backlog management requires rigor and strategy. Teams need processes to prioritize ruthlessly, validate assumptions, and align backlog items with customer and business outcomes. Techniques like weighted scoring, value vs. effort analysis, and anti-pattern identification can transform Product Backlogs into dynamic, transparent tools. The Cost of Ignoring These Failures When alignment and Product Backlog Management break down, the consequences ripple across organizations: Lost opportunities: Teams waste precious capacity on low-impact work while competitors innovate.Stagnant careers: Product leaders lose credibility when they can’t articulate progress or outcomes.Cultural erosion: Misalignment breeds frustration, burnout, and attrition. But teams that address these challenges unlock transformative results. They ship solutions customers love (and contribute to the bottom line), build stakeholder trust, and create cultures where collaboration thrives. Why This Matters for Your Career Let me be blunt: The market doesn’t need more Product Owners who “manage” Product Backlogs. It requires product leaders who wield them strategically. When you master alignment and backlog rigor, you stop being seen as a “task coordinator” and become the person who delivers results. The product teams that thrive in 2025 and beyond will: Ship solutions customers love, not just tolerate.Turn stakeholders into collaborators, not critics.Use the backlog to drive decisions, not document them. This isn’t about process — it’s about impact. Conclusion Product teams often struggle with misalignment and chaotic Product Backlogs, leading to wasted effort, frustrated teams, and missed opportunities. By addressing these issues, teams can turn their Product Backlog into a strategic tool that drives value and aligns everyone around a shared vision. Success comes from fostering clarity and collaboration, prioritizing customer-centric decisions, and implementing rigorous Product Backlog management. Teams that embrace these principles will ship solutions customers love, build trust, and create a culture of accountability. For product leaders, this is a chance to elevate your career. Master alignment and backlog management to become a strategic leader who delivers measurable outcomes. Stop shipping waste and start delivering value.

By Stefan Wolpers

CORE

Why You Don’t Need That New JavaScript Library

Libraries can rise to stardom in months, only to crash and fade into obscurity within months. We’ve all seen this happen in the software development world, and my own journey has been filled with “must-have” JavaScript libraries, each claiming to be more revolutionary than the one before. But over the years, I’ve come to realize that the tools we need have been with us all along, and in this article, I’ll explain why it’s worth sticking to the fundamentals, how new libraries can become liabilities, and why stable, proven solutions usually serve us best in the long run. The Allure of New Libraries I’ll be the first to admit that I’ve been seduced by shiny new libraries before. Back in 2018, I led a team overhaul of our front-end architecture. We added a number of trendy state management tools and UI component frameworks, certain that they would streamline our workflow. Our package.json ballooned with dependencies, each seemingly indispensable. At first, it felt like we were riding a wave of innovation. Then, about six months in, a pattern emerged. A few libraries became outdated; some were abandoned by their maintainers. Every time we audited our dependencies, it seemed we were juggling security patches and version conflicts far more often than we shipped new features. The headache of maintenance made one thing crystal clear: every new dependency is a promise you make to maintain and update someone else’s code. The True Cost of Dependencies When we adopt a new library, we’re not just adding functionality; we’re also taking on significant risks. Here are just some of the hidden costs that frequently go overlooked: Maintenance Overhead New libraries don’t just drop into your project and remain stable forever. They require patching for security vulnerabilities, updating for compatibility with other tools, and diligence when major releases introduce breaking changes. If you’re not on top of these updates, you risk shipping insecure or buggy code to production. Version Conflicts Even robust tools like npm and yarn can’t guarantee complete harmony among your dependencies. One library might require a specific version of a package that conflicts with another library’s requirements. Resolving these inconsistencies can be a maddening, time-consuming process. Performance Implications The size of the bundle increases a lot because of front-end libraries. One specialized library may add tens or hundreds of kilobytes to your final JavaScript payload, making it heavier, which means slower load times and worse user experiences. Security Vulnerabilities In one audit for a client recently, 60% of their app’s vulnerabilities came from third-party packages, often many layers deep in the dependency tree. Sometimes, to patch one library, multiple interdependent packages need to be updated, which is rarely an easy process. A colleague and I once had a need for a date picker for a project. The hip thing to do would have been to install some feature-rich library and quickly drop it in. Instead, we polyfilled our own lightweight date picker in vanilla JavaScript, using the native Date object. It was a fraction of the size, had zero external dependencies, and was completely ours to modify. That tiny decision spared us from possible library update headaches, conflicts, or abandonment issues months later. The Power of Vanilla JavaScript Modern JavaScript is almost unrecognizable from what it was ten years ago. Many features that previously required libraries like Lodash or Moment are now part of the language — or can be replicated with a few lines of code. For example: JavaScript // Instead of installing Lodash to remove duplicates: const uniqueItems = [...new Set(items)]; // Instead of using a library for deep cloning: const clonedObject = structuredClone(complexObject); A deep familiarity with the standard library can frequently replace entire suites of utility functions. These days, JavaScript’s built-in methods handle most common tasks elegantly, making large chunks of external code unnecessary. When to Use External Libraries None of this is to say you should never install a third-party package. The key lies in discernment — knowing when a problem is big enough or specialized enough to benefit from a well-tested, well-maintained library. For instance: Critical complexity: Frameworks like React have proven their mettle for managing complex UI states in large-scale applications.Time-to-market: Sometimes, a short-term deliverable calls for a robust, out-of-the-box solution, and it makes sense to bring in a trusted library rather than build everything from scratch.Community and maintenance: Popular libraries with long track records and active contributor communities — like D3.js for data visualization — can be safer bets, especially if they’re solving well-understood problems. The key is to evaluate the cost-benefit ratio: Can this be done with native APIs or a small custom script?Do I trust this library’s maintainer track record?Is it solving a core problem or offering only minor convenience?Will my team actually use enough of its features to justify the extra weight? Strategies for Avoiding Unnecessary Dependencies To keep your projects lean and maintainable, here are a few best practices: 1. Evaluate Built-In Methods First You’d be surprised how many tasks modern JavaScript can handle without third-party code. Spend time exploring the newer ES features, such as array methods, Map/Set, async/await, and the Intl API for localization. 2. Document Your Choices If you do bring in a new library, record your reasoning in a few sentences. State the problem it solves, the alternatives you considered, and any trade-offs. Future maintainers (including your future self) will appreciate the context if questions arise later. 3. Regular Dependency Audits Re-scan your package.json every quarter or so. Is this library still maintained? Are you really using their features? Do a small cleanup of the project for removing dead weights that would reduce the potential for security flaws. 4. Aggressive Dependency vs. DevDependency Separation Throw build tooling, testing frameworks, other non-production packages into your devDependencies. Keep your production dependency listing lean in terms of just the things that you really need to function at runtime. The Case for Core Libraries A team I recently worked with had some advanced charting and visualization requirements. Although a newer charting library promised flashy animations and out-of-the-box UI components, we decided to use D3.js, a stalwart in the data visualization space. The maturity of the library, thorough documentation, and huge community made it a stable foundation for our custom charts. By building directly on top of D3’s fundamentals, we had full control over our final visualizations, avoiding the limitations of less established abstractions. That mindset — paying off in performance, maintainability, and peace of mind for embracing a core, proven library rather than chasing every new offering — means instead of spending time adapting our data to a proprietary system or debugging half-baked features, we have to focus on real product needs, confident that D3 would remain stable and well-supported. Performance Gains Libraries aren’t just maintenance overhead, they affect your app’s performance too. In one recent project, we reduced the initial bundle size by 60% simply by removing niche libraries and replacing them with native code. The numbers told the story. Load time dropped from 3.2s to 1.4s.Time to interact improved by nearly half.Memory usage fell by roughly 30%. These results didn’t come from advanced optimizations but from the simpler act of removing unnecessary dependencies. In an age of ever-growing user expectations, the performance benefits alone can justify a more minimal approach. Building for the Long Term Software is never static. Today’s must-have library may turn out to be tomorrow’s orphaned repository. Reliable, stable code tends to come from developers who favor well-understood, minimal solutions over ones that rely too heavily on external, fast-moving packages. Take authentication, for example: with the hundreds of packages that exist to handle user login flows, rolling a simple system with few dependencies may result in something easier to audit, more transparent, and less subject to churn from external libraries. The code might be a bit more verbose, but it’s also explicit, predictable, and directly under your control. Teaching and Team Growth One of the underrated benefits of using fewer libraries is how it fosters stronger problem-solving skills within your team. Having to implement features themselves forces the developers to have a deep understanding of core concepts-which pays dividends when debugging, performance tuning, or even evaluating new technologies in the future. Relying too much on abstractions from someone else can stunt that growth and transform capable coders into “framework operators.” Conclusion The next time you think about installing yet another trending package, reflect on whether it solves your pressing need or just for novelty. As experience has drummed into my head, each new dependency is for life. This is the way to ensure that one gets light solutions that are secure and easier to maintain by using built-in capabilities, well-tried libraries, and a deep understanding of fundamentals. Ultimately, “boring” but reliable libraries — and sometimes just vanilla JavaScript — tend to stand the test of time better than flashy newcomers. Balancing innovation with pragmatism is the hallmark of a seasoned developer. In an era of endless frameworks and packages, recognizing when you can simply reach for the tools you already have may be the most valuable skill of all.

By Denis Ermakov

How to Split PDF Files into Separate Documents Using Java

Asking our Java file-processing applications to manipulate PDF documents can only increase their value in the long run. PDF is by far the most popular, widely used file type in the world today, and that’s unlikely to change any time soon. Introduction In this article, we’ll specifically learn how to divide PDF files into a series of separate PDF documents in Java — resulting in exactly one new PDF per page of the original file — and we’ll discuss open-source and third-party web API options to facilitate implementing that programmatic workflow into our code. We’ll start with a high-level overview of how PDF files are structured to make this type of workflow possible. Distinguishing PDF from Open-Office XML File Types I’ve written a lot about MS Open-Office XML (OOXML) files (e.g., DOCX, XLSX, etc.) in recent months — and it’s worth noting right away that PDF files are extremely different. Where OOXML files are structured as a zip-compressed package of XML files containing document formatting instructions, PDF files use a binary file format that prioritizes layout fidelity over structured data representation and editability. In other words, PDF files care more about the visual appearance of content than its accessibility; we might’ve noticed this for ourselves if we’ve tried to copy and paste information directly from PDF files into another document. Understanding How PDF Files Manage Individual Pages Each individual page within a PDF document is organized in a hierarchical section called the Page Tree. Within this Tree, each page is represented as its own independent object, and each page object references its own content streams (i.e., how the file should render the page when it’s opened) and resources (i.e., which fonts, images, or other objects the file should use on each page). Each resource found on any given PDF page contains a specific byte offset reference in the PDF directory (called a cross-reference table), which directs the object to load in a specific page location. If we’ve spent time looking at any document file structures in the past, this should all sound pretty familiar. What might be less familiar is the path to building a series of new, independent PDF documents using each page object found within a PDF Page Tree. Creating New PDF Files From Split PDF Pages The latter stage of our process involves extracting and subsequently cloning PDF page content — which includes retaining the necessary resources (page rendering instructions) and maintaining the right object references (content location instructions) for each PDF page. The API we use to handle this stage of the process will often duplicate shared resources from the original PDF document to avoid issues in the subsequent standalone documents. Handling this part correctly is crucial to ensure the resulting independent PDF documents contain the correct content; this consideration is one of the many reasons why we (probably) wouldn’t enjoy writing a program to handle this workflow from scratch. Once each page is successfully cloned, a new PDF document must be created for each page object with a Page Tree that defines only one page, and the result of this process must be serialized. The original PDF metadata object (which includes information like the document title, author, creation date, etc.) may be retained or deleted, depending on the API. Splitting PDFs With an Open-Source Library If we’re heading in an open-source API direction for our project, we might’ve already guessed that we’d land on an Apache library. Like most Apache APIs, the Apache PDFBox library is extremely popular thanks to its frequent updates, extensive support, and exhaustive documentation. Apache PDFBox has a utility called PDFSplit which conveniently facilitates the PDF splitting process. More specifically, the PDFSplit utility is represented by the Splitter class from the Apache PDFBox library. After we create a Splitter instance in our code (this configures logic for splitting a PDF document), we can call the split() method that breaks our loaded PDF into a series of independent PDF document objects. Each new PDF document can then be stored independently with the save() method, and when our process is finished, we can invoke the close() method to prevent memory leaks from occurring in our program. Like any library, we can add Apache PDFBox to our Java project by adding the required dependencies to our pom.xml (for Maven projects) or to our build.gradle (for Gradle projects). Splitting PDFs With a Web API One of the challenges we often encounter using open-source APIs for complex file operations is the overhead incurred from local memory usage (i.e., on the machine running the code). When we split larger PDF files, for example, we consume a significant amount of local RAM, CPU, and disk space on our server. Sometimes, it’s best to itemize our file processing action as a web request and ask it to take place on another server entirely. This offloads the bulk of our file processing overhead to another server, distributing the workload more effectively. We could deploy a new server on our own, or we could lean on third-party web API servers with easy accessibility and robust features. This depends entirely on the scope and requirements of our project; we may not have permission to provision a new server or leverage a third-party service. We’ll now look at one example of a simple web API request that can offload PDF splitting and document generation on our behalf. Demonstration The below solution is free to use, requiring an API key in the configuration step. For a Maven project, we can install it by first adding the below reference to our pom.xml repository: XML <repositories> <repository> <id>jitpack.io</id> <url>https://jitpack.io</url> </repository> </repositories> And then adding the below reference to our pom.xml dependency: XML <dependencies> <dependency> <groupId>com.github.Cloudmersive</groupId> <artifactId>Cloudmersive.APIClient.Java</artifactId> <version>v4.25</version> </dependency> </dependencies> Alternatively, for a Gradle project, we’ll add the below in our root build.gradle (at the end of repositories): Groovy allprojects { repositories { ... maven { url 'https://jitpack.io' } } } We’ll then add the following dependency in build.gradle: Groovy dependencies { implementation 'com.github.Cloudmersive:Cloudmersive.APIClient.Java:v4.25' } Next, we’ll place the import classes at the top of our file: Java // Import classes: //import com.cloudmersive.client.invoker.ApiClient; //import com.cloudmersive.client.invoker.ApiException; //import com.cloudmersive.client.invoker.Configuration; //import com.cloudmersive.client.invoker.auth.*; //import com.cloudmersive.client.SplitDocumentApi; Then, we’ll add our API key configuration directly after: Java ApiClient defaultClient = Configuration.getDefaultApiClient(); // Configure API key authorization: Apikey ApiKeyAuth Apikey = (ApiKeyAuth) defaultClient.getAuthentication("Apikey"); Apikey.setApiKey("YOUR API KEY"); // Uncomment the following line to set a prefix for the API key, e.g. "Token" (defaults to null) //Apikey.setApiKeyPrefix("Token"); Finally, we’ll create an instance of the SplitDocumentAPI and call the apiInstance.splitDocumentPdfByPage() method with our input PDF file: Java SplitDocumentApi apiInstance = new SplitDocumentApi(); File inputFile = new File("/path/to/inputfile"); // File | Input file to perform the operation on. Boolean returnDocumentContents = true; // Boolean | Set to true to directly return all of the document contents in the DocumentContents field; set to false to return contents as temporary URLs (more efficient for large operations). Default is false. try { SplitPdfResult result = apiInstance.splitDocumentPdfByPage(inputFile, returnDocumentContents); System.out.println(result); } catch (ApiException e) { System.err.println("Exception when calling SplitDocumentApi#splitDocumentPdfByPage"); e.printStackTrace(); } We'll most likely want to keep returnDocumentContents set to true in our code, just like the above example. This specifies that the API will return file byte strings in our response array rather than temporary URLs (which are used to "chain" edits together by referencing modified file content in a cache on the endpoint server). Our try/catch block will print errors (with stack trace) to the console for easy debugging. In our API response, we can expect an array of new PDF documents. Here's a JSON response model for reference: JSON { "Successful": true, "Documents": [ { "PageNumber": 0, "URL": "string", "DocumentContents": "string" } ] } And an XML version of the same, if that's more helpful: XML <?xml version="1.0" encoding="UTF-8"?> <SplitPdfResult> <Successful>true</Successful> <Documents> <PageNumber>0</PageNumber> <URL>string</URL> <DocumentContents>string</DocumentContents> </Documents> </SplitPdfResult> Conclusion In this article, we learned about how PDF files are structured, and we focused our attention on the way PDF pages are organized within PDF file structure. We learned about the high-level steps involved in splitting a PDF file into a series of separate documents, and we then explored two Java libraries — one open-source library and one third-party web API — to facilitate adding this workflow into our own Java project.

By Brian O'Neill

CORE