How to Add a Jenkins Agent With Docker Compose
By Faisal Khatri
In the previous article on installing Jenkins with Docker Compose, we learned how to install and set up Jenkins using Docker Compose. It is now time to learn how to add a Jenkins agent using Docker Compose. From that installation, we are already familiar with Docker Compose and the contents of its related files. Before we talk about adding the Jenkins agent using Docker Compose, let's first understand what Jenkins agents are and what they are used for.

What Is a Jenkins Agent?

An agent is usually a machine or container that connects to a Jenkins controller and carries out tasks as instructed by the controller. Running builds directly on the Jenkins controller is not recommended; from both a security and a performance perspective, it is best practice to execute your tasks on a Jenkins agent.

Setting Up a Jenkins Agent With Docker Compose

The following installations are prerequisites for successfully setting up the agent on the machine with Docker Compose:

- Java
- Docker
- Jenkins

I have broken the Jenkins agent configuration into the clear, easy-to-follow steps below.

Step 1: Generating an SSH Key Pair

The SSH key pair can be generated by running the following command in the terminal:

Plain Text
ssh-keygen -t ed25519 -f Jenkins-agent

Explanation of the above command:

- ssh-keygen is the command to generate the SSH key pair. It will generate the public and private keys in two separate files.
- -t ed25519 states the type of key to create. Here, ed25519 is the recommended modern secure algorithm.
- -f Jenkins-agent specifies the name of the generated key pair. The private key will be saved as Jenkins-agent, and the public key will be saved as Jenkins-agent.pub.

After running the command, it will ask you to enter a passphrase. This step is optional and can be skipped by pressing the Enter key. If you do choose to set a passphrase, make sure to remember it, as you'll need to use the same passphrase in Jenkins later. The key files are generated in the directory where the command is executed.

Step 2: Add Credentials Using the SSH Key in Jenkins

Navigate to Jenkins and open the Manage Jenkins menu. Next, open the Credentials page by clicking on the "Credentials" menu on the right-hand side of the page. On the Credentials page, click on "(global)" under the "Domains" column, then click on the "Add Credentials" option. Select or update the following options in the "New Credentials" screen:

1. Kind: Select "SSH Username with private key."
2. Scope: System (Jenkins and nodes only). With this option, the key can be used for Jenkins and nodes only and cannot be used for jobs.
3. Provide the ID and Description.
4. Enter the Username as "jenkins."
5. Select "Enter directly" for the Private Key option and paste the content of the private key file "Jenkins-agent" that we generated in Step 1.
6. If you used a passphrase while generating the SSH key, enter it in the "Passphrase" field.
7. Click on the "Create" button.

The Jenkins-agent credentials should now be added successfully.

Step 3: Update the Docker Compose File and Add a New Service for the Jenkins Agent

The Docker Compose file must be updated with a new service for the Jenkins agent.
YAML
agent:
  image: jenkins/ssh-agent:latest-jdk21
  privileged: true
  user: root
  container_name: agent
  expose:
    - 22
  environment:
    - JENKINS_AGENT_SSH_PUBKEY=ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIEnvpBuTsaCCG3zts8liJY/h6J7gKI6y6q0pc7xxcGZI [email protected]

Understanding the Agent Service

The "agent" service block contains the details of the image, user, port, and SSH key. The "image" specifies the Docker image used for this service; here we use "jenkins/ssh-agent:latest-jdk21", a pre-built Jenkins SSH agent image with JDK 21 support. "privileged" is set to true, which gives the container extended privileges; this is sometimes necessary to build Docker containers within Docker. To provide full administrative access within the container, "user" is set to "root". The "container_name" is set to "agent". "expose: 22" exposes port 22 (the SSH port) from the container, which allows Jenkins to communicate with the agent over SSH. Under the "environment" attribute, JENKINS_AGENT_SSH_PUBKEY is the environment variable whose value (after "=") is the content of the public key "Jenkins-agent.pub" that we generated in Step 1. It allows the Jenkins controller to connect securely to this agent via SSH.

With this service block added to the "docker-compose.yaml" file, we are all set to use the Jenkins controller and the agent via a single Docker Compose file. The following is the full "docker-compose.yaml" file for your reference:

YAML
# docker-compose.yaml
version: '3.8'
services:
  jenkins:
    image: jenkins/jenkins:lts
    privileged: true
    user: root
    ports:
      - 8080:8080
      - 50000:50000
    container_name: jenkins
    volumes:
      - /Users/faisalkhatri/jenkins-demo/jenkins-configuration:/var/jenkins_home
      - /var/run/docker.sock:/var/run/docker.sock
  agent:
    image: jenkins/ssh-agent:latest-jdk21
    privileged: true
    user: root
    container_name: agent
    expose:
      - 22
    environment:
      - JENKINS_AGENT_SSH_PUBKEY=ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIO5oiJGVDyuW+09Runc87mpZDJBtMe8e4TQwWLHzXqvB [email protected]

Step 4: Restart Jenkins Using Docker Compose

First, stop the already running Jenkins instance by running the following command in the terminal:

Plain Text
docker compose down

Next, start Docker Compose again by running the following command in the terminal:

Plain Text
docker compose up -d

Step 5: Create a New Node in Jenkins

Open the browser and navigate to Jenkins (http://localhost:8080). Open the "Nodes" window by navigating to the "Manage Jenkins" menu. Click on the "New Node" button to add a new node. Add the "Node name", select "Type: Permanent Agent", and click on the "Create" button to proceed. In the next screen, update the values as provided below:

1. Description: This is an optional field; add any meaningful value describing the agent, or leave it blank.
2. Number of Executors: 1
3. Remote root directory: "/home/jenkins/agent"
4. Usage: Use this node as much as possible
5. Launch method: Launch agents via SSH
6. Host: agent. The hostname is the container name set in the "agent" service in the Docker Compose file; each container can reach the others by using its container name as a hostname.
7. Credentials: jenkins (jenkins-agent key)
8. Host Key Verification Strategy: "Non-verifying Verification Strategy"
9. Click on the "Advanced" button to update a few advanced settings:
- Port: 22. There is no need to add the Java path here, as the agent image in the Docker Compose file already ships with JDK 21.
- Connection Timeout in Seconds: 60
- Maximum Number of Retries: 10
- Seconds To Wait Between Retries: 15

The rest of the settings can be left at their defaults. Click on the Save button. With these settings, the configuration of the Jenkins agent is complete. Perform the following steps to check that the agent is up and running:

1. Navigate to Manage Jenkins > Nodes.
2. Click on the "jenkins-agent" node.
3. On the left-hand side, click on the Log menu to check the agent logs. The logs should contain the message "Agent successfully connected and online".

Congratulations! You have successfully added a Jenkins agent to Jenkins. This agent can now be used for executing Jenkins jobs.

Summary

A Jenkins agent is a machine or container that connects to the Jenkins controller to execute tasks. With Docker Compose, the configuration was fast and easy. Creating the Jenkins node is a one-time activity; once it is set up, we can run any task on the agent. Since we used a Docker volume to set up Jenkins, all data related to the Jenkins instance is preserved across restarts, enabling smooth and uninterrupted usage.
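As a quick follow-up, one way to see the new agent in action is to point a pipeline at it. Below is a minimal declarative Jenkinsfile sketch; it assumes the node created in Step 5 was named (and therefore labeled) "jenkins-agent", so adjust the label to whatever name you chose.

Groovy
// Minimal pipeline sketch that targets the newly added agent.
// Assumption: the node from Step 5 is named/labeled "jenkins-agent".
pipeline {
    agent { label 'jenkins-agent' }
    stages {
        stage('Verify agent') {
            steps {
                // Confirms the build runs on the agent container and JDK 21 is available.
                sh 'whoami && java -version'
            }
        }
    }
}

If the job runs and the log shows the agent container's user and Java version, the controller-to-agent SSH connection is working end to end.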
Lessons Learned in Test-Driven Development
By Arun Vishwanathan
When I began my career as a test engineer about a decade ago, fresh out of school, I was not aware of formal approaches to testing. Then, as I worked with developers on teams of various sizes, I learned about several different approaches, including test-driven development (TDD). I hope to share some insights into when I've found TDD to be effective. I'll also share my experience with situations where traditional testing or a hybrid approach worked better than using TDD alone.

A Great Experience With Test-Driven Development

First Impressions

At first, TDD seemed counterintuitive to me: a reversal of the traditional approach of writing code first and testing later. One of my first development teams was pretty small and flexible, so I suggested that we give TDD a try. Right off the bat, I could see that we could adopt TDD, thanks to the team's willingness to engage in supportive practices.

Advantages for Test Planning and Strategy

The team engaged in test planning and test strategy early in the release cycle. We discussed in detail the potential positive and negative test cases that could come out of a feature. Each test case included the expected behavior of the feature when exercised and the potential value of the test. For us testers, this was a nice gateway to drive development design early by bringing team members into discussions upfront. This sort of planning also facilitated the Red-Green-Refactor cycle, which in TDD is:

- Red: Write a failing test that defines a desired behavior.
- Green: Write just enough code to make the test pass.
- Refactor: Improve the code while keeping all tests passing.

Time and Clarity

We had the time and clarity to engage thoughtfully with the design process instead of rushing to implementation. Writing tests upfront helped surface design questions early, creating a natural pause for discussion before any major code was finalized. This shifted the tone of the project from reactive to responsive. We were not simply reacting to last-minute feature changes; instead, we actively shaped the system with clear, testable outcomes in mind.

Solid Documentation Helps

TDD encourages documenting code with its expected behaviors, so we had comprehensive internal and external user-level documentation, not just an API spec. Developers linked their code examples to such tests. The internal documentation for features was detailed and explanatory, and was updated regularly.

Opportunities for Healthy Collaboration

TDD requires healthy collaboration, and our team enthusiastically interacted and discussed important issues, fostering a shared understanding of design and quality objectives. We were able to share the workload, especially when the technical understanding was sound among the members of our team. The developers did NOT have an attitude of "I type all the code and testers can take the time to test later." Quite the contrary.

Challenges of Test-Driven Development in High-Pressure Environments

Fast forward to my experience at my current job at a FAANG company: here, the focus is on responding to competition and delivering marketable products fast. In this environment, I have observed that although TDD as a concept could have been incorporated, it presented several challenges.

Feature Churn and Speed Hinder TDD Adoption

The feature churn in our team is very fast, and people are pushed to get features moving. Developers openly resisted the adoption of TDD: working with testers on test-driven feature design was perceived as "slowing down" development.
The effort-to-value ratio was questioned by the team. Instead, developers simply write a few unit tests to validate their changes before they merge them, which keeps the pipeline moving quickly. As it turned out, about 80 percent of the product's features could in fact be tested after the feature was developed, and this was considered sufficient.

Features in Flux and Volatile Requirements

One challenge we faced with TDD was when feature requirements changed mid-development. Initially, one of the designs assumed all clients would use a specific execution environment for the team's machine learning models. But midway through development, stakeholders asked us to support a newer environment for certain clients while preserving the legacy behavior for others. Since we had written tests early based on the original assumption, many of them became outdated and had to be rewritten to handle both cases. This made it clear that while TDD encourages clarity early on, it can also require substantial test refactoring when assumptions shift.

Our models also relied on artifacts outside of the model itself, such as weights and other pre-processing data. (Weights are the core parameters that a machine learning model learns during its training process.) These details became clear only after the team strove for ever-higher efficiency over the course of the release. The resulting fluctuations made it difficult to go back and update behavioral tests. While frequent updates are not unique to TDD, they are amplified by it, and handling them requires an iterative process. The development team was not in favor of creating behavioral tests for volatile projects only to have to go back and rework them later. In general, TDD is better suited to stable code blocks. The early authoring of tests in the situation I've described did not appear to be as beneficial as authoring tests on the fly, so the traditional code-then-test approach was chosen.

Frequent Changes Due to Complex Dependencies

With several dependencies spread over multiple layers of the software stack, it was difficult to pursue meaningful test design consistently. I noticed that not all teams whose work was cross-dependent communicated clearly and well, so we caught defects mostly during full system tests. Tests for machine learning features required mocking or simulating dependencies such as devices, datasets, or external APIs. Changes to these dependencies over the course of feature development made the tests shaky, and mocking code with underlying dependencies could lead to fragile tests. So, in our case, TDD appeared to work best for modular and isolated units of code.

Integration Testing Demands

TDD largely focuses on unit tests, which may not adequately cover integration and system-level behavior, leading to gaps in overall test coverage. It can also get too tightly coupled with implementation details rather than focusing on the broader behavior or business logic. Many teams relied on us testers as the assessors of the overall state of product quality, since we were high up in the stack. The demand for regulated integration testing took up a big chunk of the team's energy and time. We had to present results to our sponsors and stakeholders every few weeks, since the focus was on overall stack quality. Developers across teams also largely looked to the results of our integration test suite to catch bugs they might have introduced.
It was mainly through our wide system coverage that multiple regressions were caught across the stack and across hardware, and action was taken.

Developers Did Not Fully Understand TDD Processes

Though developers did author unit-level tests, they wrote their code first, the traditional way. The time to learn and use TDD effectively was seen as an obstacle, and developers were understandably reluctant to risk valuable time. When developers are unfamiliar with TDD, they may misunderstand its core Red-Green-Refactor process, and skipping or incorrectly implementing any stage can lead to ineffective tests. This was the case with our team. Instead of creating tests that defined expected outcomes for certain edge cases, the attempts focused heavily on overly simplistic scenarios that did not cover real-world data issues.

Balancing TDD and Traditional Testing

In situations like my company's FAANG product, it seems natural and obvious to fall back to the traditional testing approach of coding first and then testing. While this is a pragmatic approach, it has its challenges. For example, the testing schedule and matrix have to be closely aligned with the feature churn to ensure issues are caught right away, in development, not by the customer in production.

So, Is It Possible to Achieve the Best of Both Worlds?

The answer, as with any computer science question, is that it depends. But I say it is possible, depending on how closely you work with the engineers on your team and what the team culture is. Though TDD might not give you a quality coverage sign-off, it does help you think from a user's perspective and build from the ground up.

Start Planning and Talking Early in the Process

During the initial stages of feature planning, discussions around TDD principles can significantly influence design quality. This requires strong collaboration between developers and testers. There has to be a cooperative mindset and a willingness to explore the practice effectively.

Leverage Hybrid Approaches

TDD works well for unit tests, offering clarity and precision during the development phase. Writing tests before code forces developers to clarify edge cases and expected outcomes early. TDD appears to be better suited to stable, modular components. TDD can also help to test interactions between dependencies, as opposed to testing components that are independent. Meanwhile, traditional pipelines are better suited for comprehensive system-level testing. One could delay writing tests for volatile or experimental features until requirements stabilize.

Recognize the Value of Traditional Integration Testing Pipelines

As release deadlines approach, traditional testing methods become critical. Establishing nightly, weekly, and monthly pipelines spanning unit, system, and integration testing provides a robust safety net. There is a lot of churn, which requires a close watch to catch regressions in the system and their impact. Especially during a code freeze, traditional integration testing is the final line of defense.

Automate Testing as Much as Possible

I have found it indispensable to design and use automated system-level tools to make sign-off on projects easier. These tools can leverage artificial intelligence (AI) as well. Traditional testing usually becomes a bottleneck when tests explode combinatorially over models and hardware, but with the advent of generative AI, test case generation can help here, taken with a pinch of salt. A lot of my test tooling is based on code ideas obtained using AI.
AI-based TDD is picking up steam, but we are still not close to reliable, widespread use of artificial general intelligence (AGI) for testing.

To Wrap Up

For testers navigating the balance between TDD and traditional testing, the key takeaway is quite simple: adapt the principles to fit your team's workflow, do not hesitate to try out new things and learn from the experience, and never underestimate the power of early behavioral testing (TDD) in delivering high-quality software.

More Articles

How to Marry MDC With Spring Integration

In modern enterprise applications, effective logging and traceability are critical for debugging and monitoring business processes. Mapped Diagnostic Context (MDC) provides a mechanism to enrich logging statements with contextual information, making it easier to trace requests across different components. This article explores the challenges of MDC propagation in Spring Integration and presents strategies to ensure that the diagnostic context remains intact as messages traverse its channels.

Let's start with a very brief overview of both technologies. If you are already familiar with them, you can go straight to the "Marry Spring Integration With MDC" section.

Mapped Diagnostic Context

Mapped Diagnostic Context (MDC) plays a crucial role in logging by providing a way to enrich log statements with contextual information specific to a request, transaction, or process. This enhances traceability, making it easier to correlate logs across different components in a distributed system.

Java
{
    MDC.put("SOMEID", "xxxx");
    runSomeProcess();
    MDC.clear();
}

All the logging calls invoked inside runSomeProcess will have "SOMEID" in the context, and it can be added to log messages with the appropriate pattern in the logger configuration. I will use Log4j2, but SLF4J also supports MDC.

XML
pattern="%d{HH:mm:ss} %-5p [%X{SOMEID}] [%X{TRC_ID}] - %m%n"

The %X placeholder in Log4j2 outputs MDC values (in this case, SOMEID and TRC_ID).

Output:

Plain Text
18:09:19 DEBUG [SOMEIDVALUE] [] SomClass:XX - log message text

Here we can see that TRC_ID was substituted with an empty string, as it was not set in the MDC context (so it does not affect operations running outside that context).

And here are logs that are a terrible mess of interleaved threads:

Plain Text
19:54:03 49 DEBUG Service1:17 - process1.
src length: 2 19:54:04 52 DEBUG Service2:22 - result: [77, 81, 61, 61] 19:54:04 52 DEBUG DirectChannel:191 - preSend on channel 'bean 'demoWorkflow.channel#4'; from source: 'com.fbytes.mdcspringintegration.integration.IntegrationConfiguration.demoWorkflow()'', message: GenericMessage [payload=MQ==, headers={SOMEID=30, id=abbff9b1-1273-9fc8-127d-ca78ffaae07a, timestamp=1747500844111}] 19:54:04 52 INFO IntegrationConfiguration:81 - Result: MQ== 19:54:04 52 DEBUG DirectChannel:191 - postSend (sent=true) on channel 'bean 'demoWorkflow.channel#4'; from source: 'com.fbytes.mdcspringintegration.integration.IntegrationConfiguration.demoWorkflow()'', message: GenericMessage [payload=MQ==, headers={SOMEID=30, id=abbff9b1-1273-9fc8-127d-ca78ffaae07a, timestamp=1747500844111}] 19:54:04 52 DEBUG QueueChannel:191 - postReceive on channel 'bean 'queueChannel-Q'; defined in: 'class path resource [com/fbytes/mdcspringintegration/integration/IntegrationConfiguration.class]'; from source: 'com.fbytes.mdcspringintegration.integration.IntegrationConfiguration.queueChannelQ()'', message: GenericMessage [payload=1, headers={SOMEID=31, id=d0b6c58d-457e-876c-a240-c36d36f7e4f5, timestamp=1747500838034}] 19:54:04 52 DEBUG PollingConsumer:313 - Poll resulted in Message: GenericMessage [payload=1, headers={SOMEID=31, id=d0b6c58d-457e-876c-a240-c36d36f7e4f5, timestamp=1747500838034}] 19:54:04 52 DEBUG ServiceActivatingHandler:313 - ServiceActivator for [org.springframework.integration.handler.MethodInvokingMessageProcessor@1907874b] (demoWorkflow.org.springframework.integration.config.ConsumerEndpointFactoryBean#4) received message: GenericMessage [payload=1, headers={SOMEID=31, id=d0b6c58d-457e-876c-a240-c36d36f7e4f5, timestamp=1747500838034}] 19:54:04 52 DEBUG Service2:16 - encoding 1 19:54:04 49 DEBUG Service1:24 - words processed: 1 19:54:04 49 DEBUG QueueChannel:191 - preSend on channel 'bean 'queueChannel-Q'; defined in: 'class path resource [com/fbytes/mdcspringintegration/integration/IntegrationConfiguration.class]'; from source: 'com.fbytes.mdcspringintegration.integration.IntegrationConfiguration.queueChannelQ()'', message: GenericMessage [payload=1, headers={id=6a67a5b4-724b-6f54-4e9f-acdeb2a7a235, timestamp=1747500844114}] 19:54:04 49 DEBUG QueueChannel:191 - postSend (sent=true) on channel 'bean 'queueChannel-Q'; defined in: 'class path resource [com/fbytes/mdcspringintegration/integration/IntegrationConfiguration.class]'; from source: 'com.fbytes.mdcspringintegration.integration.IntegrationConfiguration.queueChannelQ()'', message: GenericMessage [payload=1, headers={SOMEID=37, id=07cf749d-741e-640c-eb4f-f9bcd293dbcd, timestamp=1747500844114}] 19:54:04 49 DEBUG DirectChannel:191 - postSend (sent=true) on channel 'bean 'demoWorkflow.channel#3'; from source: 'com.fbytes.mdcspringintegration.integration.IntegrationConfiguration.demoWorkflow()'', message: GenericMessage [payload=gd, headers={id=e7aedd50-8075-fa2a-9dd3-c11956e0d296, timestamp=1747500843637}] 19:54:04 49 DEBUG DirectChannel:191 - postSend (sent=true) on channel 'bean 'demoWorkflow.channel#2'; from source: 'com.fbytes.mdcspringintegration.integration.IntegrationConfiguration.demoWorkflow()'', message: GenericMessage [payload=gd, headers={id=e7aedd50-8075-fa2a-9dd3-c11956e0d296, timestamp=1747500843637}] 19:54:04 49 DEBUG DirectChannel:191 - postSend (sent=true) on channel 'bean 'demoWorkflow.channel#1'; from source: 'com.fbytes.mdcspringintegration.integration.IntegrationConfiguration.demoWorkflow()'', message: GenericMessage [payload=(37,gd), 
headers={id=3048a04c-ff44-e2ce-98a4-c4a84daa0656, timestamp=1747500843636}] 19:54:04 49 DEBUG DirectChannel:191 - postSend (sent=true) on channel 'bean 'demoWorkflow.channel#0'; from source: 'com.fbytes.mdcspringintegration.integration.IntegrationConfiguration.demoWorkflow()'', message: GenericMessage [payload=(37,gd), headers={id=d76dff34-3de5-e830-1f6b-48b337e0c658, timestamp=1747500843636}] 19:54:04 49 DEBUG SourcePollingChannelAdapter:313 - Poll resulted in Message: GenericMessage [payload=(38,g), headers={id=495fe122-df04-2d57-dde2-7fc045e8998f, timestamp=1747500844114}] 19:54:04 49 DEBUG DirectChannel:191 - preSend on channel 'bean 'demoWorkflow.channel#0'; from source: 'com.fbytes.mdcspringintegration.integration.IntegrationConfiguration.demoWorkflow()'', message: GenericMessage [payload=(38,g), headers={id=495fe122-df04-2d57-dde2-7fc045e8998f, timestamp=1747500844114}] 19:54:04 49 DEBUG ServiceActivatingHandler:313 - ServiceActivator for [org.springframework.integration.handler.LambdaMessageProcessor@7efd28bd] (demoWorkflow.org.springframework.integration.config.ConsumerEndpointFactoryBean#0) received message: GenericMessage [payload=(38,g), headers={id=495fe122-df04-2d57-dde2-7fc045e8998f, timestamp=1747500844114}] 19:54:04 49 DEBUG DirectChannel:191 - preSend on channel 'bean 'demoWorkflow.channel#1'; from source: 'com.fbytes.mdcspringintegration.integration.IntegrationConfiguration.demoWorkflow()'', message: GenericMessage [payload=(38,g), headers={id=1790d3d8-9501-f479-c5ee-6b9232295313, timestamp=1747500844114}] 19:54:04 49 DEBUG MessageTransformingHandler:313 - bean 'demoWorkflow.transformer#0' for component 'demoWorkflow.org.springframework.integration.config.ConsumerEndpointFactoryBean#1'; from source: 'com.fbytes.mdcspringintegration.integration.IntegrationConfiguration.demoWorkflow()' received message: GenericMessage [payload=(38,g), headers={id=1790d3d8-9501-f479-c5ee-6b9232295313, timestamp=1747500844114}] 19:54:04 49 DEBUG DirectChannel:191 - preSend on channel 'bean 'demoWorkflow.channel#2'; from source: 'com.fbytes.mdcspringintegration.integration.IntegrationConfiguration.demoWorkflow()'', message: GenericMessage [payload=g, headers={id=e2f69d41-f760-2f4d-87c2-4e990beefdaa, timestamp=1747500844114}] 19:54:04 49 DEBUG MessageFilter:313 - bean 'demoWorkflow.filter#0' for component 'demoWorkflow.org.springframework.integration.config.ConsumerEndpointFactoryBean#2'; from source: 'com.fbytes.mdcspringintegration.integration.IntegrationConfiguration.demoWorkflow()' received message: GenericMessage [payload=g, headers={id=e2f69d41-f760-2f4d-87c2-4e990beefdaa, timestamp=1747500844114}] 19:54:04 49 DEBUG DirectChannel:191 - preSend on channel 'bean 'demoWorkflow.channel#3'; from source: 'com.fbytes.mdcspringintegration.integration.IntegrationConfiguration.demoWorkflow()'', message: GenericMessage [payload=g, headers={id=e2f69d41-f760-2f4d-87c2-4e990beefdaa, timestamp=1747500844114}] 19:54:04 49 DEBUG ServiceActivatingHandler:313 - ServiceActivator for [org.springframework.integration.handler.MethodInvokingMessageProcessor@1e469dfd] (demoWorkflow.org.springframework.integration.config.ConsumerEndpointFactoryBean#3) received message: GenericMessage [payload=g, headers={id=e2f69d41-f760-2f4d-87c2-4e990beefdaa, timestamp=1747500844114}] 19:54:04 49 DEBUG Service1:17 - process1. src length: 1 19:54:04 49 DEBUG Service1:24 - words processed: 1 It will become readable, and even the internal Spring Integration messages are attached to specific SOMEID processing. 
Plain Text 19:59:44 49 DEBUG [19] [] Service1:17 - process1. src length: 3 19:59:45 52 DEBUG [6] [] Service2:22 - result: [77, 119, 61, 61] 19:59:45 52 DEBUG [6] [] DirectChannel:191 - preSend on channel 'bean 'demoWorkflow.channel#4'; from source: 'com.fbytes.mdcspringintegration.integration.IntegrationConfiguration.demoWorkflow()'', message: GenericMessage [payload=Mw==, headers={SOMEID=6, id=b19eb8b6-7c5b-aa5a-31d0-dc9b940e4cd9, timestamp=1747501185064}] 19:59:45 52 INFO [6] [] IntegrationConfiguration:81 - Result: Mw== 19:59:45 52 DEBUG [6] [] DirectChannel:191 - postSend (sent=true) on channel 'bean 'demoWorkflow.channel#4'; from source: 'com.fbytes.mdcspringintegration.integration.IntegrationConfiguration.demoWorkflow()'', message: GenericMessage [payload=Mw==, headers={SOMEID=6, id=b19eb8b6-7c5b-aa5a-31d0-dc9b940e4cd9, timestamp=1747501185064}] 19:59:45 52 DEBUG [6] [] QueueChannel:191 - postReceive on channel 'bean 'queueChannel-Q'; defined in: 'class path resource [com/fbytes/mdcspringintegration/integration/IntegrationConfiguration.class]'; from source: 'com.fbytes.mdcspringintegration.integration.IntegrationConfiguration.queueChannelQ()'', message: GenericMessage [payload=2, headers={SOMEID=7, id=5e4f9113-6520-c20c-afc8-f8e1520bf9e9, timestamp=1747501177082}] 19:59:45 52 DEBUG [7] [] PollingConsumer:313 - Poll resulted in Message: GenericMessage [payload=2, headers={SOMEID=7, id=5e4f9113-6520-c20c-afc8-f8e1520bf9e9, timestamp=1747501177082}] 19:59:45 52 DEBUG [7] [] ServiceActivatingHandler:313 - ServiceActivator for [org.springframework.integration.handler.MethodInvokingMessageProcessor@5d21202d] (demoWorkflow.org.springframework.integration.config.ConsumerEndpointFactoryBean#4) received message: GenericMessage [payload=2, headers={SOMEID=7, id=5e4f9113-6520-c20c-afc8-f8e1520bf9e9, timestamp=1747501177082}] 19:59:45 52 DEBUG [7] [] Service2:16 - encoding 2 19:59:45 53 DEBUG [] [] QueueChannel:191 - postReceive on channel 'bean 'queueChannel-Q'; defined in: 'class path resource [com/fbytes/mdcspringintegration/integration/IntegrationConfiguration.class]'; from source: 'com.fbytes.mdcspringintegration.integration.IntegrationConfiguration.queueChannelQ()'', message: GenericMessage [payload=2, headers={SOMEID=8, id=37400675-0f79-8a89-de36-dacf2feb106e, timestamp=1747501177343}] 19:59:45 53 DEBUG [8] [] PollingConsumer:313 - Poll resulted in Message: GenericMessage [payload=2, headers={SOMEID=8, id=37400675-0f79-8a89-de36-dacf2feb106e, timestamp=1747501177343}] 19:59:45 53 DEBUG [8] [] ServiceActivatingHandler:313 - ServiceActivator for [org.springframework.integration.handler.MethodInvokingMessageProcessor@5d21202d] (demoWorkflow.org.springframework.integration.config.ConsumerEndpointFactoryBean#4) received message: GenericMessage [payload=2, headers={SOMEID=8, id=37400675-0f79-8a89-de36-dacf2feb106e, timestamp=1747501177343}] 19:59:45 53 DEBUG [8] [] Service2:16 - encoding 2 19:59:45 52 DEBUG [7] [] Service2:22 - result: [77, 103, 61, 61] 19:59:45 52 DEBUG [7] [] DirectChannel:191 - preSend on channel 'bean 'demoWorkflow.channel#4'; from source: 'com.fbytes.mdcspringintegration.integration.IntegrationConfiguration.demoWorkflow()'', message: GenericMessage [payload=Mg==, headers={SOMEID=7, id=bbb9f71f-37d8-8bc4-90c3-bfb813430e4a, timestamp=1747501185469}] 19:59:45 52 INFO [7] [] IntegrationConfiguration:81 - Result: Mg== Under the hood, MDC uses ThreadLocal storage, tying the context to the current thread. 
This works seamlessly in single-threaded flows but requires special handling in multi-threaded scenarios, such as Spring Integration's queue channels.

Spring Integration

Spring Integration is a great part of Spring that allows a new level of service decoupling: you build the application workflow so that data is passed between services as messages, defining which method of which service to invoke for data processing, rather than making direct service-to-service calls.

Java
IntegrationFlow flow = IntegrationFlow.from("sourceChannel")
        .handle("service1", "runSomeProcess")
        .filter(....)
        .transform(...)
        .split()
        .channel("serviceInterconnect")
        .handle("service2", "runSomeProcess")
        .get();

Here we:

- Get data from "sourceChannel" (assuming a bean with such a name is already registered);
- Invoke service1.runSomeProcess, passing the data (unwrapped from Spring Integration's Message<?>);
- The returned result (whatever it is) is wrapped back into a Message and undergoes some filtering and transformation;
- The result (assuming it is some array or Stream) is split for per-entry processing;
- The entries (wrapped in Messages) are passed to the "serviceInterconnect" channel;
- The entries are processed by service2.runSomeProcess.

Spring Integration provides message channels of several types. What is important here is that some of them run the consumer process on the producer's thread, while others (e.g., the queue channel) delegate the processing to other consumer threads. There, the thread-local MDC context will be lost, so we need to find a way to propagate it down the workflow.

Marry Spring Integration With MDC

While micrometer-tracing propagates MDC between microservices, it doesn't handle Spring Integration's queue channels, where thread switches occur. To maintain the MDC context, it must be stored in message headers on the producer side and restored on the consumer side. Below are three methods to achieve this:

1. Use a Spring Integration Advice;
2. Use a Spring AOP @Aspect;
3. Use a Spring Integration ChannelInterceptor.

1. Using Spring Integration Advice

Java
@Service
class MdcAdvice implements MethodInterceptor {

    @Autowired
    IMDCService mdcService;

    @Override
    public Object invoke(MethodInvocation invocation) throws Throwable {
        Message<?> message = (Message<?>) invocation.getArguments()[0];
        Map<String, String> mdcMap = (Map<String, String>) message.getHeaders().entrySet().stream()
                .filter(...)
                .collect(Collectors.toMap(Map.Entry::getKey, entry -> String.valueOf(entry.getValue())));
        mdcService.set(mdcMap);
        try {
            return invocation.proceed();
        } finally {
            mdcService.clear(mdcMap);
        }
    }
}

It should be specified directly for the handler in the workflow, e.g.:

Java
.handle("service1", "runSomeProcess", epConfig -> epConfig.advice(mdcAdvice))

Disadvantages

- It covers only the handler. The context is cleared right after it, so logging of the processing between handlers will have no context.
- It must be added manually to all handlers.

2. Using Spring AOP @Aspect

Java
@Aspect
@Component
public class MdcAspect {

    @Autowired
    IMDCService mdcService;

    @Around("execution(* org.springframework.messaging.MessageHandler.handleMessage(..))")
    public Object aroundHandleMessage(ProceedingJoinPoint joinPoint) throws Throwable {
        Message<?> message = (Message<?>) joinPoint.getArgs()[0];
        Map<String, String> mdcMap = (Map<String, String>) message.getHeaders().entrySet().stream()
                .filter(...)
                .collect(Collectors.toMap(Map.Entry::getKey, entry -> (String) entry.getValue()));
        mdcService.setContextMap(mdcMap);
        try {
            return joinPoint.proceed();
        } finally {
            mdcService.clear(mdcMap);
        }
    }
}

Disadvantages

- It should be invoked automatically, but only for "stand-alone" MessageHandlers. For those defined inline, as below, it won't work, because the handler is not a proxied bean in this case:

Java
.handle((msg, headers) -> {
    return service1.runSomeProcess();
})

- It, too, covers only the handlers.

3. Using Spring Integration ChannelInterceptor

First, we need to clear the context at the end of the processing. This can be done by defining a custom TaskDecorator:

Java
@Service
public class MdcClearingTaskDecorator implements TaskDecorator {

    private static final Logger logger = LogManager.getLogger(MdcClearingTaskDecorator.class);

    private final MDCService mdcService;

    public MdcClearingTaskDecorator(MDCService mdcService) {
        this.mdcService = mdcService;
    }

    @Override
    public Runnable decorate(Runnable runnable) {
        return () -> {
            try {
                runnable.run();
            } finally {
                logger.debug("Cleaning the MDC context");
                mdcService.clearMDC();
            }
        };
    }
}

And set it for all TaskExecutors:

Java
@Bean(name = "someTaskExecutor")
public TaskExecutor someTaskExecutor() {
    ThreadPoolTaskExecutor executor = newThreadPoolExecutor(mdcService);
    executor.setTaskDecorator(mdcClearingTaskDecorator);
    executor.initialize();
    return executor;
}

Used by pollers:

Java
@Bean(name = "somePoller")
public PollerMetadata somePoller() {
    return Pollers.fixedDelay(Duration.ofSeconds(30))
            .taskExecutor(someTaskExecutor())
            .getObject();
}

Or inline:

Java
.from(consoleMessageSource, c -> c.poller(p -> p.fixedDelay(1000).taskExecutor(someTaskExecutor())))

Now, we need to save and restore the context as it passes through the pollable channels.

Java
@Service
@GlobalChannelInterceptor(patterns = {"*-Q"})
public class MdcChannelInterceptor implements ChannelInterceptor {

    private static final Logger logger = LogManager.getLogger(MdcChannelInterceptor.class);

    @Value("${mdcspringintegration.mdc_header}")
    private String mdcHeader;

    @Autowired
    private MDCService mdcService;

    @Override
    public Message<?> preSend(Message<?> message, MessageChannel channel) {
        if (!message.getHeaders().containsKey(mdcHeader)) {
            return MessageBuilder.fromMessage(message)
                    .setHeader(mdcHeader, mdcService.fetch(mdcHeader)) // Add a new header
                    .build();
        }
        if (channel instanceof PollableChannel) {
            logger.trace("Cleaning the MDC context for PollableChannel");
            mdcService.clearMDC(); // clear MDC in producer's thread
        }
        return message;
    }

    @Override
    public Message<?> postReceive(Message<?> message, MessageChannel channel) {
        if (channel instanceof PollableChannel) {
            logger.trace("Setting MDC context for PollableChannel");
            Map<String, String> mdcMap = message.getHeaders().entrySet().stream()
                    .filter(entry -> entry.getKey().equals(mdcHeader))
                    .collect(Collectors.toMap(Map.Entry::getKey, entry -> (String) entry.getValue()));
            mdcService.setMDC(mdcMap);
        }
        return message;
    }
}

- preSend is invoked on the producer thread before the message is added to the queue and clears the context of the producer's thread.
- postReceive is invoked on the consumer thread before the message is processed by the consumer.

This approach covers not only the handlers but also the whole workflow (interrupted on queues only). @GlobalChannelInterceptor(patterns = {"*-Q"}) automatically attaches the interceptor to all channels that match the pattern(s).

A few words about the cleaning section of preSend.
At first sight, it could look unnecessary, but let's follow the thread's path when it encounters the split: the thread iterates over the items and thus keeps the context after sending a document to the queue. The red arrows in the original diagram show the places where the context would leak from doc1 processing into doc2 processing, and from doc2 into doc3.

That's it. We get an MDC context end-to-end in the Spring Integration workflow. Do you know a better way? Please share in the comments.

Example Code

https://github.com/Sevick/MdcSpringIntegrationDemo
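The snippets above delegate to an MDCService helper that the article does not list; the real implementation lives in the linked repository. Purely as orientation, here is a hypothetical minimal sketch of such a wrapper around SLF4J's MDC. The method names (fetch, setMDC, clearMDC) simply mirror the calls used above and are assumptions, not the author's actual code.

Java
// Hypothetical sketch of an MDCService helper; the repository's version may differ.
import java.util.Map;
import org.slf4j.MDC;
import org.springframework.stereotype.Service;

@Service
public class MDCService {

    // Return the current value of a single MDC key (may be null).
    public String fetch(String key) {
        return MDC.get(key);
    }

    // Copy the given entries into the current thread's MDC.
    public void setMDC(Map<String, String> entries) {
        entries.forEach(MDC::put);
    }

    // Remove everything from the current thread's MDC.
    public void clearMDC() {
        MDC.clear();
    }
}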

By Vsevolod Vasilyev
Building Smarter Chatbots: Using AI to Generate Reflective and Personalized Responses

With the advent of artificial intelligence-based tools, chatbots have become integral to user interactions. However, in most cases, chatbots give users a generic response to their queries. This lack of personalization fails to capture any behavioral signals from the user, which leads not only to a suboptimal user experience but also to many missed opportunities for organizations to turn these users into return customers. This is a critical bottleneck for many organizations, as it prevents free users from becoming paid customers. The first interaction users have with these chatbots is more often than not the only opportunity to capture their attention. For example, for a user visiting a mental health support application and interacting with its chatbot, it is vital to capture the user's emotional state and potential needs and to provide an empathetic response that makes the user feel heard. Missing this window of opportunity not only hurts the organization's customer acquisition but also means users are not guided to a care pathway that could improve their lives significantly. Solving this problem would allow organizations to improve user conversion by providing more personalized experiences that improve user satisfaction. Aligning intervention cues with unique user contexts can help provide support that feels individualized rather than generic.

Every user query can be thought of as input data to the chatbot algorithm. On the basis of that input query, we want to formulate a response that is not only reflective of the original input but also adds description and empathy. This requires the machine learning model to be trained on extensive query-response datasets so that it can form adequate responses. A large set of such curated datasets can be found at this GitHub repository. Figure 1 is a high-level description of the process used to generate a more personalized response to a user query.

Figure 1: High-Level Process Description

For example, a user asks: "How may I deal with feeling depressed?" This will be our query. Based on it, our trained machine learning model may generate a response saying: "I would like to find a problem that might be bothering you and try to understand how that makes you feel." Then we move to step 3 and convert the query to a reflection: "So, you want to understand how you may deal with feeling depressed," and prepend it to the model response. This leads us to our final response: "So, you want to understand how you may deal with feeling depressed. I would like to find a problem that might be bothering you and try to understand how that makes you feel."

Typical conversational systems like the one we are discussing are language modeling problems. The main goal of such a model is to learn the likelihood that a word will occur given the preceding sequence of words. We can use an n-gram-level language model and a bi-directional long short-term memory (LSTM) network to predict a natural-language response. A typical training dataset includes a question and its response. Figure 2 describes the data pre-processing we would apply to the question and response.

Figure 2: Data Pre-Processing

Following the data pre-processing, we would perform tokenization to produce sequences of tokens. This process is typically known as n-gram sequenced token generation.
A machine learning model works with a uniform input shape, so we would perform sequence padding to make all the token sequences a uniform length. A model also needs predictors and their corresponding labels to learn from, so we generate n-gram sequences that act as predictors, and the next word of each sequence is treated as the label. A simple model architecture to begin with is described in Figure 3.

Figure 3: Simple Bi-Directional LSTM-Based Architecture

The first layer can be a bi-directional LSTM with ReLU activation and 256 units, followed by a Dropout layer and finally a Dense layer with softmax activation. For the loss, we can use a cross-entropy loss function with an Adam optimizer to train the model. This is an introductory architecture that can be enhanced further depending on the data and requirements.

Developing such a system does come with significant challenges, the lack of annotated data being the most prevalent one. It is typically painstaking to find datasets annotated with responses marked as reflective. Proper punctuation and ensuring the generated response is grammatically accurate can also be challenging. Finally, adding empathy and emotion to the response when required needs additional model development and training, which falls under sentiment-enhanced text generation and may be constrained by the availability of annotated data.

To further improve such a system, we can refer to MISC-coded data, which gives us annotated data comprising simple and complex reflections, known as RES and REC respectively. We can use this to annotate our data with simple and complex reflections. An intriguing work by Can D, Marín RA, et al., titled "It sounds like...: A natural language processing approach to detecting counselor reflections in motivational interviewing," would be a great one to implement given the availability of the right annotated data, and it can help us make significant advances in generating reflective responses.
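To make the architecture described above concrete, here is a minimal sketch of the next-word model, assuming TensorFlow/Keras is available. The tiny corpus, embedding size, dropout rate, and epoch count are illustrative placeholders rather than values from the article; only the 256-unit bi-directional LSTM with ReLU, Dropout, softmax Dense layer, cross-entropy loss, and Adam optimizer follow the description in Figure 3.

Python
# Minimal sketch of the n-gram next-word model; corpus and hyperparameters are placeholders.
import numpy as np
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Bidirectional, LSTM, Dropout, Dense

corpus = [
    "so you want to understand how you may deal with feeling depressed",
    "i would like to find a problem that might be bothering you",
]

tokenizer = Tokenizer()
tokenizer.fit_on_texts(corpus)
vocab_size = len(tokenizer.word_index) + 1

# n-gram sequences: every prefix of each sentence becomes one training sample.
sequences = []
for line in corpus:
    tokens = tokenizer.texts_to_sequences([line])[0]
    for i in range(2, len(tokens) + 1):
        sequences.append(tokens[:i])

max_len = max(len(s) for s in sequences)
padded = pad_sequences(sequences, maxlen=max_len, padding="pre")
X, y = padded[:, :-1], padded[:, -1]  # predictors and next-word labels

model = Sequential([
    Embedding(vocab_size, 64, input_length=max_len - 1),
    Bidirectional(LSTM(256, activation="relu")),  # 256 units, ReLU, as in Figure 3
    Dropout(0.2),
    Dense(vocab_size, activation="softmax"),
])
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")
model.fit(X, y, epochs=100, verbose=0)

At inference time, the query would be tokenized and padded the same way, the model's predicted words appended step by step, and the reflection ("So, you want to...") prepended to the generated text as described earlier.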

By Surabhi Sinha
The Scrum Guide Expansion Pack

TL; DR: Is There a Need for the Scrum Guide Expansion Pack?

The Scrum Guide Expansion Pack represents a fascinating contradiction in the agile world. While attempting to cure Scrum's reputation crisis, it may actually amplify the very problems it seeks to solve. Let me explain what this means for practitioners dealing with the aftermath of failed Scrum implementations.

The Philosophical Shift: From Lightweight to Academic

The 2020 Scrum Guide deliberately embraced minimalism. Ken Schwaber and Jeff Sutherland designed it as a "lightweight framework" that was "purposefully incomplete," thus forcing teams to fill in the gaps with their intelligence and context. This approach was brilliant in its simplicity compared to competitors such as SAFe.

The Expansion Pack abandons this philosophy entirely. Instead of lightweight guidance, we get a comprehensive academic treatise covering Complexity Thinking, Systems Theory, OODA loops, and Beyond Budgeting. While intellectually rich, this transformation creates a fundamental problem: Scrum may no longer be accessible to the average practitioner struggling with fundamental implementation challenges.

The Double-Edged Sword of New Concepts

What Actually Works

The Expansion Pack introduces several genuinely valuable concepts:

- Definition of Outcome Done vs. Definition of Output Done: This distinction directly addresses the "feature factory" anti-pattern I've written extensively about. Too many organizations measure success by features shipped rather than customer problems solved. This framework forces conversations about actual value delivery.
- Stakeholder Recognition: Formally acknowledging stakeholders as part of the equation addresses a glaring omission in traditional Scrum implementations. Most failures I've observed stem from poor stakeholder management or systemic organizational issues, not team dysfunction.
- Professional Standards: The emphasis on professionalism and accountability provides ammunition against the "Scrum as an excuse for chaos and shooting from the hip" narrative that has damaged Scrum's credibility.

What Creates New Problems

- Weaponized Complexity: Prescriptive statements like "[…] Scrum Master who is neither willing, ready, nor able to be an agent of change should step down as a Scrum Master" are anti-pattern factories waiting to happen. In dysfunctional organizations, these become tools for blame and political maneuvering rather than the intended catalysts for improvement.
- Barrier to Entry: Organizations already struggling with basic Scrum concepts now face a mountain of additional theories to master. This fact perfectly feeds the "agile is too complicated" narrative that drives teams back to other approaches.
- Jargon Proliferation: New terminology like "Product Developer" versus "Developers" and multiple definitions of "done" create confusion rather than clarity, especially for teams still wrestling with fundamental concepts.

The Market Reality Check

Here's the uncomfortable truth: most organizations implementing Scrum aren't ready for graduate-level agile theory. They still struggle with basic concepts like cross-functional teams, empirical process control, and stakeholder collaboration. The Expansion Pack serves two distinct audiences with conflicting needs:

- Mature Organizations: Those with psychological safety, genuine improvement mindsets, and experienced practitioners will find this invaluable.
It provides the theoretical framework to address systemic impediments and evolve beyond mechanical Scrum implementation.
- Struggling Organizations: Those already experiencing "Scrum fatigue" or "agile PTSD" will likely see this as confirmation that agile approaches are overly complex and theoretical. It reinforces the perception that successful Scrum requires extensive training and expertise rather than common sense and empirical learning.

The Anti-Pattern Risk

The Expansion Pack creates several new anti-pattern opportunities, for example:

- Theoretical Perfectionism: Teams may become paralyzed trying to implement every concept perfectly rather than focusing on empirical improvement. In large organizations, this manifests as endless "readiness assessments" where teams spend months studying complexity theory and systems thinking before being "allowed" to start working differently. Product development grinds to a halt while teams debate whether they've properly understood emergence and first principles.
- Consultant Dependency: The complexity provides ammunition for consultants to extend engagements indefinitely, promising mastery of increasingly esoteric concepts. Large organizations may become trapped in multi-year "transformation programs" where consultants introduce new theoretical frameworks every quarter, ensuring teams never achieve competence independently.
- Process Weaponization: In politically dysfunctional organizations, the Expansion Pack's prescriptive language becomes ammunition for territorial warfare. Statements like "A Scrum Master who isn't ready to be an agent of change should step down" get wielded by ambitious middle managers to eliminate rivals or by threatened executives to undermine agile adoption entirely.
- Expert Dependency Trap: Organizations become dependent on a small group of people who claim to understand advanced concepts. These "Scrum Philosophers" become bottlenecks for any decision, and their departure leaves teams unable to operate independently. Knowledge becomes centralized rather than distributed.
- Certification Theater: The theoretical depth perfectly covers expanding certification programs. Organizations mandate expensive "Advanced Scrum Professional" training focusing on memorizing complexity theory rather than improving team performance. HR departments then use these credentials to justify hiring decisions while ignoring practical experience.
- Analysis Paralysis at Scale: Large organizations use complexity as justification for endless analysis phases. Before "transforming to advanced Scrum," they conduct six-month organization-wide assessments of systems thinking maturity, complexity readiness, and stakeholder relationship maps. The analysis becomes the goal, not the improvement.
- Compliance Corruption: In regulated industries, the language of professional standards gets twisted into new compliance requirements. "Definition of Output Done" becomes a documentation mandate requiring traceability matrices, and "stakeholder collaboration" requires formal sign-off processes that slow delivery to a crawl.
Food for Thought on the Scrum Guide Expansion Pack

- How might organizations already suffering from Scrum fatigue distinguish between genuinely helpful concepts from the Expansion Pack and unnecessary complexity amplifiers that will worsen their situation?
- Could the Expansion Pack's emphasis on theoretical grounding make Scrum more vulnerable to replacement by simpler alternatives like Shape Up or Kanban-based approaches?
- What specific anti-patterns do you predict will emerge as teams attempt to simultaneously implement "Definition of Output Done" and "Definition of Outcome Done"?

Conclusion: A More Pragmatic Path Forward

Instead of wholesale adoption, consider treating the Expansion Pack as precisely what it claims to be: a resource collection. Cherry-pick concepts that address your specific impediments. Use the Output/Outcome distinction to combat feature factory behavior, leverage stakeholder guidance to improve product discovery, and apply systems thinking only when team-level improvements plateau.

The goal remains unchanged: solve customer problems sustainably and profitably. Scrum, in any form, is merely a tool toward that end. The moment the tool becomes more important than the outcome, you're practicing Scrum Theater, regardless of which guide you follow. Now, given that the Scrum Guide Expansion Pack mainly aggregates long-known and practiced approaches to solving those customer problems, the elephant in the room is obvious: is there a need for the Scrum Guide Expansion Pack to begin with?

By Stefan Wolpers
Complete Guide: Managing Multiple GitHub Accounts on One System

This comprehensive guide helps developers configure and manage multiple GitHub accounts on the same system, enabling seamless switching between personal, work, and client accounts without authentication conflicts.

Overview

Managing multiple GitHub accounts on a single development machine is a common requirement for modern developers. Whether you're maintaining separate personal and professional identities, working with multiple clients, or contributing to different organizations, proper account management ensures secure access control and maintains clear commit attribution. This guide provides three distinct approaches to handle multiple GitHub accounts, from basic manual switching to advanced automated configuration. Each method has its own advantages and is suitable for different workflow requirements.

Key Benefits:

- Secure isolation between different GitHub accounts
- Automatic switching based on project directory
- Prevention of accidental commits to wrong repositories
- Streamlined authentication across multiple accounts
- Clear audit trail for commit attribution

Understanding the Challenge

When working with multiple GitHub accounts, developers face several technical challenges:

Authentication Conflicts: Git and SSH typically use default credentials, leading to access denials when switching between accounts. The system may authenticate with the wrong account, resulting in permission errors.

Identity Management: Each commit should be attributed to the correct developer identity. Using the wrong name or email in commits can cause confusion in project history and violate organizational policies.

Repository Access: Different repositories may belong to different accounts or organizations, requiring specific authentication credentials for each context.

Workflow Complexity: Manual switching between accounts is error-prone and time-consuming, especially when working on multiple projects simultaneously.

Security Concerns: Improper configuration can lead to credential leakage or unauthorized access to repositories.

Prerequisites

Before implementing any of the solutions in this guide, ensure you have:

- Git installed (version 2.20 or later recommended)
- SSH client available on your system
- Multiple GitHub accounts with appropriate access permissions
- Command line access with the ability to edit configuration files
- Basic understanding of Git workflows and SSH key concepts
- Administrative rights to modify SSH and Git configurations

Step-by-Step Setup

1. Generate SSH Keys for Each Account

Shell
# For your primary account (if you don't have one already)
ssh-keygen -t ed25519 -C "[email protected]" -f ~/.ssh/id_ed25519

# For your secondary account
ssh-keygen -t ed25519 -C "[email protected]" -f ~/.ssh/id_ed25519_secondary

Note: When prompted for a passphrase, you can either:

- Leave it empty (press Enter) for convenience
- Set a passphrase for additional security

2. Add SSH Keys to GitHub Accounts

Copy your public keys:

Shell
# Primary account public key
cat ~/.ssh/id_ed25519.pub

# Secondary account public key
cat ~/.ssh/id_ed25519_secondary.pub

Add to GitHub:

- Log into each GitHub account
- Go to Settings → SSH and GPG keys
- Click New SSH key
- Paste the respective public key
- Give it a descriptive title (e.g., "Work Laptop - Primary Account")
3. Configure SSH Config File

Create or edit ~/.ssh/config:

Shell
# Edit the SSH config file
nano ~/.ssh/config

Add the following configuration:

Shell
# Primary GitHub account (default)
Host github.com
  HostName github.com
  User git
  IdentityFile ~/.ssh/id_ed25519

# Secondary GitHub account
Host github-secondary
  HostName github.com
  User git
  IdentityFile ~/.ssh/id_ed25519_secondary

Important:
The User must always be git for GitHub
Replace github-secondary with any alias you prefer (e.g., github-work, github-personal)
Do NOT add .com to your custom host aliases

4. Set Proper Permissions

Shell
chmod 600 ~/.ssh/config
chmod 700 ~/.ssh
chmod 600 ~/.ssh/id_ed25519*

5. Test SSH Connections

Shell
# Test primary account
ssh -T git@github.com
# Test secondary account
ssh -T git@github-secondary

Expected output: Hi username! You've successfully authenticated, but GitHub does not provide shell access.

Usage Examples

Cloning Repositories

Shell
# Clone with primary account (default)
git clone git@github.com:username/repo-name.git
# Clone with secondary account
git clone git@github-secondary:username/repo-name.git

Setting Up Existing Repositories

Shell
# Check current remote
git remote -v
# Switch to secondary account
git remote set-url origin git@github-secondary:username/repo-name.git
# Verify the change
git remote -v

Configure Git Identity Per Repository

Shell
# Set identity for current repository
git config user.name "Your Work Name"
git config user.email "[email protected]"
# Or for personal projects
git config user.name "Your Personal Name"
git config user.email "[email protected]"

Troubleshooting

Common Issues and Solutions

1. "Could not resolve hostname github-secondary"

Problem: Using .com with your SSH alias, or SSH config not properly set up.
Solution:
Use git@github-secondary, NOT git@github-secondary.com
Verify your SSH config file exists and has correct permissions

2. "Permission denied (publickey)"

Problem: SSH key not added to GitHub or wrong key being used.
Solutions:

Shell
# Test specific key
ssh -i ~/.ssh/id_ed25519_secondary -T git@github.com
# Add key to SSH agent
ssh-add ~/.ssh/id_ed25519_secondary
# Verify key is added to GitHub account

3. "Push rejected" or "Access denied"

Problem: Repository remote URL doesn't match your SSH configuration.
Solution:

Shell
# Check remote URL
git remote -v
# Update remote URL to use correct SSH alias
git remote set-url origin git@github-secondary:username/repo.git

4. Wrong author in commits

Problem: Git using wrong user identity.
Solution:

Shell
# Check current identity
git config user.name
git config user.email
# Set correct identity for this repository
git config user.name "Correct Name"
git config user.email "[email protected]"

Debug Commands

Shell
# Test SSH connection with verbose output
ssh -vT git@github-secondary
# Check which SSH key is being used
ssh-add -l
# Verify SSH config parsing
ssh -F ~/.ssh/config -G github-secondary
# Check Git configuration hierarchy
git config --list --show-origin | grep user

Quick Reference Commands

Shell
# Generate SSH key
ssh-keygen -t ed25519 -C "[email protected]" -f ~/.ssh/id_ed25519_name
# Test SSH connection
ssh -T git@github-alias
# Clone with specific account
git clone git@github-alias:username/repo.git
# Set repository identity
git config user.name "Name"
git config user.email "[email protected]"
# Change remote URL
git remote set-url origin git@github-alias:username/repo.git
# Check configuration
git config --list --show-origin | grep user
ssh-add -l
git remote -v

Best Practices

Naming Conventions:
Use descriptive SSH host aliases: github-work, github-personal
Use consistent email patterns: [email protected]
Name SSH keys descriptively: id_ed25519_work, id_ed25519_personal

Security:
Use strong passphrases for SSH keys in production environments
Regularly rotate SSH keys (annually)
Keep separate SSH keys for different purposes
Never share private keys

Backup and Recovery:
Backup your SSH keys securely
Document your SSH config setup
Keep a record of which public keys are associated with which GitHub accounts
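Optional: Automatic Identity Switching by Directory

The "automatic switching based on project directory" benefit mentioned in the overview can be achieved with Git's conditional includes, which the steps above do not cover. The sketch below is illustrative only: the ~/work/ and ~/personal/ folders, the id_ed25519_work key name, and the email placeholder are assumptions to adapt to your own layout (includeIf needs Git 2.13+ and core.sshCommand needs Git 2.10+, both within this guide's 2.20+ prerequisite).

Shell
# ~/.gitconfig: choose an identity file based on where the repository lives
[includeIf "gitdir:~/work/"]
    path = ~/.gitconfig-work
[includeIf "gitdir:~/personal/"]
    path = ~/.gitconfig-personal

# ~/.gitconfig-work: applied automatically to any repo under ~/work/
[user]
    name = Your Work Name
    email = your.work.email@example.com
[core]
    # Pin the SSH key so pushes from work repos always use the work identity
    sshCommand = ssh -i ~/.ssh/id_ed25519_work

With this in place, cloning into ~/work/ or ~/personal/ picks up the right commit identity and SSH key without editing each repository's configuration by hand.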

By Prathap Reddy M
How Docker Desktop Enhances Developer Workflows and Observability

Docker made it easier to build, ship, and run applications consistently using lightweight containers. While Docker Engine handles the core functionality, Docker Desktop brings those capabilities into a more accessible environment for everyday development tasks. Though it may not attract as much attention as container orchestration tools or microservices frameworks, Docker Desktop serves a practical purpose in local development. It offers a straightforward interface for managing containers directly on a developer’s machine. In this article, we’ll look at how Docker Desktop supports container management, use screenshots to illustrate its features, and examine its relevance in modern software development workflows. What is Docker Desktop? Docker Desktop is an all-in-one GUI and CLI tool that provides: Docker Engine (the core container runtime)Docker CLI + Docker ComposeIntegrated Kubernetes (optional)Graphical dashboard for managing containers, images, volumes, and networksA growing library of extensions (Docker Scout, Portainer, etc.) It runs natively on macOS, Windows, and recently even Linux, making it platform-agnostic and accessible to almost everyone. How Developers Use Docker Desktop Here's a closer look at how developers are using Docker Desktop in practice. 1. Visual Dashboard for Containers and Resources Docker Desktop provides a graphical interface that complements the command-line experience. It allows users to view and manage running containers, stop or restart them, inspect logs, and open terminals, all through the UI, without needing to run commands like docker ps or docker logs or look up container IDs. Docker Desktop Dashboard showing running containers with logs and CPU usage This is incredibly useful for debugging, especially in development environments with many moving parts (e.g., microservices, databases, local proxies). 2. See Space Usage at a Glance Tracking how much space containers, images, volumes, and caches consume can be tedious when relying solely on CLI commands. Docker Desktop’s Disk Usage panel offers a visual summary of storage use, including: Total space used by imagesContainers (running and stopped)VolumesBuilder cache It also provides a cleanup option to remove unused resources, helping manage disk space without manually running commands like docker system prune -a. Disk usage panel showing 4GB of images, 100MB of volumes, and a “Clean up” button This feature alone has saved developers gigabytes of disk space—and lots of confusion. Key Features That Support Local Development Extensions Marketplace Docker Desktop supports one-click extensions like: Docker Scout – for scanning image vulnerabilitiesPortainer – advanced container UIVolume Management Extension – for browsing, editing, and deleting volume content This allows teams to customize Docker Desktop for their needs without bloating their workflow. Extensions tab showing enabled extensions and quick install buttons. Real-Time Volume Sync File system performance has traditionally been a challenge when mounting source code into containers on macOS and Windows. Docker Desktop addresses this by integrating Mutagen, a file synchronization tool that significantly improves sync speed. This allows developers to edit code on the host machine and see near real-time updates inside the container, an important improvement for efficient local development workflows. Dev Environments (Beta) Docker Desktop introduced Dev Environments—pre-configured workspaces that team members can spin up with a single command. 
This is great for:
Onboarding new developers
Standardizing dev tools
Creating shareable demos or bug reproductions

It connects directly to Git repos, Dockerfiles, and VS Code.

My Real Experience: Docker Desktop at Work

At Wayfair, we were working on migrating a monolithic app into a set of microservices. Docker Desktop played a pivotal role. Here’s how:

Containerizing the Monolith – We created a Dockerfile for the old system and tested it locally using Docker Desktop. This gave us consistency across dev machines.
Docker Compose for Local Dev – We broke the monolith into smaller services and used Docker Compose to spin up all dependencies: API, DB, Redis, and more. One docker-compose up command replaced hours of setup.
Testing Helm Charts with Docker Desktop Kubernetes – Before deploying to our main cluster, we validated Kubernetes deployments locally. Docker Desktop’s single-node cluster made this safe and fast.
Space Management – We noticed builds getting slower. Turns out, old volumes and builder cache were taking up 20+ GB. Docker Desktop’s disk usage view helped us clean things up easily.

The GUI helped teammates who weren’t comfortable with the Docker CLI, while power users still had full command-line access.

Visual Tour: What Docker Desktop Shows You

While I can’t embed images directly here, here’s what you’d see if you open Docker Desktop:

Feature | What You See
Dashboard | List of running containers with status & logs
Volumes Tab | Mount paths, volume sizes, create/delete buttons
Disk Usage Panel | Total disk used by images, containers, cache
Kubernetes Tab | Toggle to enable/disable k8s, kubeconfig info
Extensions | Browse, install, and configure extensions

Docker Desktop vs. Alternatives

Different tools offer overlapping functionality, but they vary in terms of usability, integration, and platform support. Here's a quick comparison:

Tool | GUI Support | Kubernetes Built-In | Disk Usage Insights | Cross-Platform | Plugin/Extension Support
Docker Desktop | Yes | Yes | Yes | Yes | Yes
Podman | Limited (via add-ons) | No | No | Mostly | No
Minikube | No | Yes | No | Yes | No

Each tool has its own focus. The best choice depends on your workflow and priorities.

Learning and Team Enablement

For teams getting started with containers, Docker Desktop can lower the initial learning curve:
Teams don’t have to learn the CLI first
Clear feedback via logs and visuals
“Works on my machine” problems almost vanish
Clean separation of dev environments using containers

This can be especially useful when mentoring junior developers or collaborating across distributed teams, where ease of setup and clarity can streamline development workflows.

Final Thoughts: Where Docker Desktop Fits In

Docker—the engine, CLI, and broader ecosystem—transformed how applications are built and deployed. Docker Desktop builds on that powerful low-level technology by adding a more accessible, visual interface tailored for everyday development tasks. Today, if you’re working on:
APIs
Full-stack apps
Microservices
DevOps pipelines
Cloud-native deployments

Docker Desktop probably powers your local dev environment.

Pro Tips
Turn on Kubernetes only when needed—saves resources
Use Resource Usage to spot memory leaks in dev
Try Docker Scout to keep your images secure
Clean up builder cache regularly to save space

Conclusion

Docker Desktop can act as a bridge between container technology and day-to-day development workflows.
It simplifies:
Space management
Resource monitoring
Kubernetes integration
Developer onboarding
Extension-based customization

From development to operations, Docker Desktop helps streamline common tasks in container-based projects.
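To ground the "one docker-compose up command" workflow described above, here is a minimal sketch of the kind of Compose file that stands up an API with its database and cache locally. Service names, images, and credentials are illustrative placeholders, not the configuration referenced in the article.

YAML
# docker-compose.yml: hypothetical local stack (API + Postgres + Redis)
services:
  api:
    build: .                  # build the application image from the local Dockerfile
    ports:
      - "8080:8080"
    environment:
      DATABASE_URL: postgres://app:app@db:5432/app
      REDIS_URL: redis://cache:6379
    depends_on:
      - db
      - cache
  db:
    image: postgres:16
    environment:
      POSTGRES_USER: app
      POSTGRES_PASSWORD: app
      POSTGRES_DB: app
    volumes:
      - db-data:/var/lib/postgresql/data   # keep data across restarts
  cache:
    image: redis:7
volumes:
  db-data:

Running docker compose up then gives every developer the same API, database, and cache with a single command, which is the setup-time win described in the Wayfair example.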

By Ravi Teja Thutari
From OCR Bottlenecks to Structured Understanding

Abstract When we talk about making AI systems better at finding and using information from documents, we often focus on fancy algorithms and cutting-edge language models. But here's the thing: if your text extraction is garbage, everything else falls apart. This paper looks at how OCR quality impacts retrieval-augmented generation (RAG) systems, particularly when dealing with scanned documents and PDFs. We explore the cascading effects of OCR errors through the RAG pipeline and present a modern solution using SmolDocling, an ultra-compact vision-language model that processes documents end-to-end. The recent OHRBench study (Zhang et al., 2024) provides compelling evidence that even modern OCR solutions struggle with real-world documents. We demonstrate how SmolDocling (Nassar et al., 2025), with just 256M parameters, offers a practical path forward by understanding documents holistically rather than character-by-character, outputting structured data that dramatically improves downstream RAG performance. Introduction The "Garbage in, garbage out" principle isn't just a catchy phrase — it's the reality of document-based RAG systems. While the AI community gets excited about the latest embedding models and retrieval algorithms, many are overlooking a fundamental bottleneck: the quality of text extraction from real-world documents. Recent research is starting to shine a light on this issue. Zhang et al. (2024) introduced OHRBench, showing that none of the current OCR solutions are competent for constructing high-quality knowledge bases for RAG systems. That's a pretty damning assessment of where we stand today. The State of OCR: It's Complicated 1. The Good News and the Bad News Modern OCR has come a long way. Google's Tesseract, now in version 4.0+, uses LSTM neural networks and can achieve impressive accuracy on clean, printed text (Patel et al., 2020). But here's where it gets messy: According to recent benchmarking studies, OCR error rates of 20% or higher are still common in historical documents (Bazzo et al., 2020). Rigaud et al. (2021) documented similar issues across digital libraries and specialized document types. A benchmarking study by Hamdi et al. (2022) comparing Tesseract, Amazon Textract, and Google Document AI found that Document AI delivered the best results, and the server-based processors (Textract and Document AI) performed substantially better than Tesseract, especially on noisy documents. But even the best performers struggled with complex layouts and historical documents. 2. Why OCR Still Struggles The challenges aren't just about old, faded documents (though those are definitely problematic). Modern OCR faces several persistent issues: Complex layouts: Multi-column formats, tables, and mixed text/image content confuse most OCR systemsVariable quality: Even documents from the same source can have wildly different scan qualityLanguage and font diversity: Non-Latin scripts and unusual fonts significantly degrade performanceReal-world noise: Coffee stains, handwritten annotations, stamps - the stuff that makes documents real also makes them hard to read As noted in the OHRBench paper (Zhang et al., 2024), two primary OCR noise types, Semantic Noise and Formatting Noise, were identified as the main culprits affecting downstream RAG performance. How OCR Errors Cascade Through RAG 1. 
The Domino Effect Here's what happens when OCR errors enter your RAG pipeline - it's not pretty: Chunking goes haywire: Your sophisticated semantic chunking algorithm tries to find sentence boundaries in text like "Thepatient presentedwith severesymptoms" and either creates tiny meaningless chunks or massive walls of text.Embeddings get confused: When your embedding model sees "diabetus" instead of "diabetes," it might place that chunk in a completely different semantic space. Multiply this by thousands of documents, and your vector space becomes chaos.Retrieval fails: A user searches for "diabetes treatment" but the relevant chunks are indexed under "diabetus" or "dialbetes" - no matches found.Generation hallucinates: With poor or missing context, your LLM starts making things up to fill the gaps. 2. Real Impact on RAG Performance The OHRBench study provides sobering data. They found that OCR noise significantly impacts RAG systems, with performance loss across all tested configurations. This isn't just about a few percentage points — we're talking about systems becoming effectively unusable for critical applications. Bazzo et al. (2020) found in their detailed investigation that while OCR errors may seem to have insignificant degradation on average, individual queries can be greatly impacted. They showed that significant impacts are noticed starting at a 5% error rate, and reported an impressive increase in the number of index terms in the presence of errors — essentially, OCR errors create phantom vocabulary that bloats your index. What We Propose: A Modern Solution With SmolDocling 1. Moving Beyond Traditional OCR After dealing with the frustrations of traditional OCR pipelines, we've adopted a fundamentally different approach using SmolDocling, an ultra-compact vision-language model released in March 2025 by IBM Research and HuggingFace (Nassar et al., 2025). Here's why this changes everything: instead of the traditional pipeline of OCR → post-processing → chunking → embedding, SmolDocling processes document images directly into structured output in a single pass. At just 256 million parameters, it's small enough to run on consumer GPUs while delivering results that compete with models 27 times larger. 2. The SmolDocling Architecture The model uses a clever architecture that combines: A visual encoder (SigLIP with 93M parameters) that directly processes document imagesA language model (SmolLM-2 variant with 135M parameters) that generates structured outputAn aggressive pixel shuffle strategy to compress visual features efficiently What makes this special is that SmolDocling doesn't just extract text - it understands document structure holistically. Tables stay as tables, code blocks maintain their indentation, formulas are preserved, and the spatial relationships between elements are captured. 3. DocTags: Structured Output That Actually Works One of SmolDocling's key innovations is DocTags, a markup format designed specifically for document representation. Instead of dumping unstructured text, you get structured output with precise location information: XML <picture><loc_77><loc_45><loc_423><loc_135> <other> <caption><loc_58><loc_150><loc_441><loc_177> Figure 1: SmolDocling/SmolVLM architecture. SmolDocling converts images of document pages to DocTags sequences. 
</caption> </picture> <text><loc_58><loc_191><loc_441><loc_211>In this work, we outline how we close the gaps left by publicly available datasets and establish a training approach to achieve end-to-end, full-featured document conversion through a vision-language model. </text> <unordered_list> <list_item><loc_80><loc_218><loc_441><loc_259>· SmolDocling: An ultra-compact VLM for end-to-end document conversion </list_item> <list_item><loc_80><loc_263><loc_441><loc_297>· We augment existing document pre-training datasets with additional feature annotations </list_item> </unordered_list> <table> <table_row> <table_cell><loc_50><loc_320><loc_150><loc_340>Test Name</table_cell> <table_cell><loc_151><loc_320><loc_250><loc_340>Result</table_cell> <table_cell><loc_251><loc_320><loc_350><loc_340>Normal Range</table_cell> </table_row> <table_row> <table_cell><loc_50><loc_341><loc_150><loc_361>Glucose</table_cell> <table_cell><loc_151><loc_341><loc_250><loc_361>126 mg/dL</table_cell> <table_cell><loc_251><loc_341><loc_350><loc_361>70-100 mg/dL</table_cell> </table_row> </table> Notice how each element includes <loc_X> tags that specify exact bounding box coordinates (x1, y1, x2, y2). This means: Your RAG system knows exactly where each piece of information appears on the page (automatic image extraction is very easy)Tables maintain their structure with proper cell boundariesLists, captions, and different text types are clearly distinguishedComplex layouts are preserved, not flattened into a text stream This structured format with spatial information means your RAG system can intelligently chunk based on actual document structure and location rather than arbitrary character counts. The difference is dramatic - where traditional OCR might produce a jumbled mess of text with lost formatting, SmolDocling maintains both the semantic structure and spatial relationships that make documents meaningful. 4.4 Real-World Performance The numbers from the SmolDocling paper (Nassar et al., 2025) tell a compelling story. Let's visualize how this 256M parameter model stacks up against much larger alternatives: Text recognition (OCR) metrics Layout understanding (mAP) Model characteristics The Bottom Line: SmolDocling achieves better accuracy than models 27x its size while using 28x less memory and processing pages in just 0.35 seconds (Average of 0.35 secs per page on A100 GPU). For RAG applications, this means you can process documents faster, more accurately, and on much more modest hardware - all while preserving the document structure that makes intelligent chunking possible. 5. Implementing SmolDocling in Your RAG Pipeline Here's the crucial insight that many teams miss: the quality of your data preparation determines everything that follows in your RAG pipeline. SmolDocling isn't just another OCR tool — it's a foundation that fundamentally changes how you can approach document processing. Why Structured Extraction Changes Everything Traditional OCR gives you a wall of text. SmolDocling gives you a semantic map of your document. This difference cascades through your entire pipeline: Intelligent chunking becomes possible: With DocTags providing element types and boundaries, you can chunk based on actual document structure. A table stays together as a semantic unit. A code block maintains its integrity. Multi-paragraph sections can be kept coherent. You're no longer blindly cutting at character counts.Context-aware embeddings: When your chunks have structure, your embeddings become more meaningful. 
A chunk containing a table with its caption creates a different embedding than the same text jumbled together. The semantic relationships are preserved, making retrieval more accurate.Hierarchical indexing: The location tags (<loc_x1><loc_y1><loc_x2><loc_y2>) aren't just coordinates — they represent document hierarchy. Headers, subheaders, and their associated content maintain their relationships. This enables sophisticated retrieval strategies where you can prioritize based on document structure. The Preparation Process That Matters When implementing SmolDocling, think about your data preparation in layers: Document ingestion: Process documents at appropriate resolution (144 DPI is the sweet spot)Structured extraction: Let SmolDocling create the DocTags representationSemantic chunking: Parse the DocTags to create meaningful chunks based on element typesMetadata enrichment: Use the structural information to add rich metadata to each chunkVector generation: Create embeddings that benefit from the preserved structure Real Impact on RAG Quality The difference is dramatic. In traditional pipelines, a search for "quarterly revenue figures" might return random text fragments that happen to contain those words. With SmolDocling-prepared data, you get the actual table containing the figures, with its caption and surrounding context intact. This isn't theoretical — our teams report 30-50% improvements in retrieval precision when switching from traditional OCR to structure-preserving extraction. The investment in proper data preparation pays off exponentially in RAG performance. 6. Why This Solves the OCR Problem Remember all those cascading errors we talked about? Here's how SmolDocling addresses them: No OCR errors to propagate: Since it's not doing character-by-character recognition but understanding the document holistically, many traditional OCR errors simply don't occur.Structure-aware from the start: Tables, lists, and formatting are preserved in the initial extraction, so your chunking strategy has rich information to work with.Unified processing: One model handles text, tables, formulas, and code - no need to stitch together outputs from multiple specialized tools.Built for modern documents: While traditional OCR struggles with complex layouts, SmolDocling was trained on diverse document types, including technical reports, patents, and forms. The shift from traditional OCR to vision-language models, such as SmolDocling, represents a fundamental change in how we approach document processing for RAG. Instead of fighting with OCR errors and trying to reconstruct document structure after the fact, we can work with clean, structured data from the start. Practical Implementation Considerations 1. When to Use SmolDocling vs Traditional OCR Let's be practical about this. SmolDocling is fantastic, but it's not always the right tool: Use SmolDocling when: You're processing diverse document types (reports, forms, technical docs)Document structure is important for your use caseYou need to handle tables, formulas, or code blocksYou have access to GPUs (even consumer-grade ones work)You want a single solution instead of juggling multiple tools Stick with traditional OCR when: You only need plain text from simple documentsYou're working with massive volumes where 0.35 seconds/page is too slowYou have specialized needs (like historical manuscript processing)You're constrained to a CPU-only environment 2. 
Monitoring and Quality Assurance Even with SmolDocling's improvements, you still need quality checks: Validation against known patterns: If processing invoices, check that you're extracting standard fieldsCross-referencing: For critical data, consider processing with both SmolDocling and traditional OCR, then comparingUser feedback loops: Build mechanisms for users to report issues Conclusion: The Future Is Multi-Modal Here's the bottom line: the days of treating OCR as a separate preprocessing step are numbered. Vision-language models, such as SmolDocling, show us a future where document understanding occurs holistically, rather than through fragmented pipelines. For organizations building RAG systems today, this presents both an opportunity and a challenge. The opportunity is clear: better document understanding leads to improved RAG performance. The challenge is that we're in a transition period where both approaches have their place. My recommendation? Start experimenting with SmolDocling on your most problematic documents — the ones where traditional OCR consistently fails. Measure the improvement not just in character accuracy, but in your end-to-end RAG performance. You might be surprised by how much better your system performs when it actually understands document structure rather than just extracting characters. The research is moving fast. Zhang et al. (2024) showed us how badly current OCR affects RAG. Nassar et al. (2025) gave us SmolDocling as a solution. What comes next will likely be even better. But don't wait for perfection. A RAG system that handles 90% of your documents well with SmolDocling is infinitely more valuable than one that theoretically could handle 100% but fails on real-world complexity. Because at the end of the day, our users don't care about our technical challenges. They just want accurate answers from their documents. And with approaches like SmolDocling, we're finally getting closer to delivering on that promise. References Bazzo, G.T., Lorentz, G.A., Vargas, D.S., & Moreira, V.P. (2020). "Assessing the Impact of OCR Errors in Information Retrieval." In Advances in Information Retrieval. ECIR 2020. Lecture Notes in Computer Science, vol 12036. Springer, Cham.Chen, K., et al. (2023). "LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking." Proceedings of the 31st ACM International Conference on Multimedia.Hamdi, A., et al. (2022). "OCR with Tesseract, Amazon Textract, and Google Document AI: a benchmarking experiment." Journal of Computational Social Science, 5(1), 861-882.Lewis, P., et al. (2020). "Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks." Advances in Neural Information Processing Systems, 33, 9459-9474.Nassar, A., et al. (2025). "SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion." arXiv preprint arXiv:2503.11576.Neudecker, C., et al. (2021). "A Survey of OCR Evaluation Tools and Metrics." Proceedings of the 6th International Workshop on Historical Document Imaging and Processing, 13-20.Patel, D., et al. (2020). "Improving the Accuracy of Tesseract 4.0 OCR Engine Using Convolution-Based Preprocessing." Symmetry, 12(5), 715.Rigaud, C., et al. (2021). "What Do We Expect from Comic Panel Text Detection and Recognition?" Multimedia Tools and Applications, 80(14), 22199-22225.Shen, Z., et al. (2021). "LayoutParser: A Unified Toolkit for Deep Learning Based Document Image Analysis." 
Proceedings of the 16th International Conference on Document Analysis and Recognition (ICDAR).Zhang, J., et al. (2024). "OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation." arXiv preprint arXiv:2412.02592.
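As a closing illustration of the "semantic chunking" layer described in the implementation section, here is a small, self-contained sketch that groups a DocTags string into structure-aware chunks. It is written only against the DocTags fragment shown earlier; the tag list and regular expressions are assumptions for illustration, not the official Docling parser.

Python
import re

# Illustrative only: split a DocTags string into structure-aware chunks,
# keeping each picture, paragraph, list, or table together as one unit.
BLOCK_PATTERN = re.compile(
    r"<(picture|text|unordered_list|table)>(.*?)</\1>", re.DOTALL
)

def chunk_doctags(doctags: str) -> list[dict]:
    chunks = []
    for match in BLOCK_PATTERN.finditer(doctags):
        tag, body = match.group(1), match.group(2)
        locs = re.findall(r"<loc_(\d+)>", body)      # bounding-box hints
        text = re.sub(r"<[^>]+>", " ", body)         # drop nested tags
        text = re.sub(r"\s+", " ", text).strip()
        if text:
            chunks.append({"type": tag, "text": text, "locs": locs[:4]})
    return chunks

# Each chunk (type, text, location metadata) can then be embedded on its own,
# instead of slicing the page into fixed-size character windows.

Parsing this way keeps a table or captioned figure together as one retrieval unit, which is the property the article argues makes downstream retrieval more precise.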

By Pier-Jean MALANDRINO DZone Core CORE
Debunking LLM Intelligence: What's Really Happening Under the Hood?

Large language models (LLMs) possess an impressive ability to generate text, poetry, code, and even hold complex conversations. Yet, a fundamental question arises: do these systems truly understand what they are saying, or do they merely imitate a form of thought? Is it a simple illusion, an elaborate statistical performance, or are LLMs developing a form of understanding, or even reasoning? This question is at the heart of current debates on artificial intelligence. On one hand, the achievements of LLMs are undeniable: they can translate languages, summarize articles, draft emails, and even answer complex questions with surprising accuracy. This ability to manipulate language with such ease could suggest genuine understanding. On the other hand, analysts emphasize that LLMs are first and foremost statistical machines, trained on enormous quantities of textual data. They learn to identify patterns and associations between words, but this does not necessarily mean they understand the deep meaning of what they produce. Don’t they simply reproduce patterns and structures they have already encountered, without true awareness of what they are saying? The question remains open and divides researchers. Some believe that LLMs are on the path to genuine understanding, while others think they will always remain sophisticated simulators, incapable of true thought. Regardless, the question of LLM comprehension raises philosophical, ethical, and practical issues that translate into how we can use them. Also, it appears more useful than ever today to demystify the human "thinking" capabilities sometimes wrongly attributed to them, due to excessive enthusiasm or simply a lack of knowledge about the underlying technology. This is the very point a team of researchers at Apple recently demonstrated in their study "The Illusion of Thinking." They observed that despite LLMs' undeniable progress in performance, their fundamental limitations remained poorly understood. Critical questions persisted, particularly regarding their ability to generalize reasoning or handle increasingly complex problems. "This finding strengthens evidence that the limitation is not just in problem-solving and solution strategy discovery but also in consistent logical verification and step execution limitation throughout the generated reasoning chains" - Example of Prescribed Algorithm for Tower of Hanoi - “The Illusion of Thinking” - Parshin Shojaee, Iman Mirzadeh, Keivan Alizadeh, Maxwell Horton, Samy Bengio, Mehrdad Farajtabar - APPLE To better get the essence of LLMs, let’s explore their internal workings and establish fundamental distinctions with human thought. To do this, let’s use the concrete example of this meme ("WHAT HAPPENED TO HIM? - P > 0.05") to illustrate both the technological prowess of LLMs and the fundamentally computational nature of their operation, which is essentially distinct from human consciousness. The 'P > 0.05' Meme Explained Simply by an LLM I asked an LLM to explain this meme to me simply, and here is its response: The LLM Facing the Meme: A Demonstration of Power If we look closely, for a human, understanding the humor of this meme requires knowledge of the Harry Potter saga, basic statistics, and the ability to get the irony of the funny juxtaposition. Now, when the LLM was confronted with this meme, it demonstrated an impressive ability to decipher it. 
It managed to identify the visual and textual elements, recognize the cultural context (the Harry Potter scene and the characters), understand an abstract scientific concept (the p-value in statistics and its meaning), and synthesize all this information to explain the meme's humor. Let's agree that the LLM's performance in doing the job was quite remarkable. It could, at first glance, suggest a deep "understanding," or even a form of intelligence similar to ours, capable of reasoning and interpreting the world. The Mechanisms of 'Reasoning': A Computational Process However, this performance does not result from 'reflection' in the human sense. The LLM does not 'think,' has no consciousness, no introspection, and even less subjective experience. What we perceive as reasoning is, in reality, a sophisticated analysis process, based on algorithms and a colossal amount of data. The Scale of Training Data An LLM like Gemini or ChatGPT is trained on massive volumes of data, reaching hundreds of terabytes, including billions of text documents (books, articles, web pages) and billions of multimodal elements (captioned images, videos, audio), containing billions of parameters. This knowledge base is comparable to a gigantic, digitized, and indexed library. It includes an encyclopedic knowledge of the world, entire segments of popular culture (like the Harry Potter saga), scientific articles, movie scripts, online discussions, and much more. It’s this massive and diverse exposure to information that allows it to recognize patterns, correlations, and contexts. The Algorithms at Work To analyze the meme, several types of algorithms come into play: Natural language processing (NLP): It’s the core of interaction with text. NLP allows the model to understand the semantics of phrases ('WHAT HAPPENED TO HIM?') and to process textual information.Visual recognition / OCR (Optical Character Recognition): For image-based memes, the system uses OCR algorithms to extract and 'read' the text present in the image ('P > 0.05'). Concurrently, visual recognition allows for the identification of graphic elements: the characters' faces, the specific scene from the movie, and even the creature's frail nature.Transformer neural networks: These are the main architectures of LLMs. They are particularly effective at identifying complex patterns and long-term relationships in data. They allow the model to link 'Harry Potter' to specific scenes and to understand that 'P > 0.05' is a statistical concept. The Meme Analysis Process, Step-by-Step: When faced with the meme, the LLM carries out a precise computational process: Extraction and recognition: The system identifies keywords, faces, the scene, and technical text.Activation of relevant knowledge: Based on these extracted elements, the model 'activates' and weighs the most relevant segments of its knowledge. It establishes connections with its data on Harry Potter (the 'limbo,' Voldemort's soul fragment), statistics (the definition of the p-value and the 0.05 threshold), and humor patterns related to juxtaposition.Response synthesis: The model then generates a text that articulates the humorous contrast. It explains that the joke comes from Dumbledore's cold and statistical response to a very emotional and existential question. This highlights the absence of 'statistical significance' of the creature's state. This explanation is constructed by identifying the most probable and relevant semantic associations, learned during its training. 
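To make "identifying the most probable and relevant semantic associations" a little more concrete, here is a deliberately tiny toy sketch of the underlying idea: prediction as statistics over observed text. It is nothing like a real transformer; it only illustrates the principle that the continuation chosen is the one seen most often, not one that is understood.

Python
from collections import Counter, defaultdict

# Toy illustration: count which word follows which in a tiny corpus,
# then "predict" by picking the most frequent continuation.
corpus = "the model predicts the next word the model predicts the most probable word".split()

following = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    following[current][nxt] += 1

def most_probable_next(word: str) -> str:
    counts = following[word]
    return counts.most_common(1)[0][0] if counts else "<unknown>"

print(most_probable_next("the"))       # "model": the statistically likeliest continuation
print(most_probable_next("predicts"))  # "the"

Scaled up by many orders of magnitude, with context windows and learned representations instead of raw counts, this remains selection by probability rather than comprehension, which is the distinction the next section draws.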
The Fundamental Difference: Statistics, Data, and Absence of Consciousness This LLM's 'reasoning,' or rather, its mode of operation, therefore results from a series of complex statistical inferences based on correlations observed in massive quantities of data. The model does not 'understand' the abstract meaning, emotional implications, or moral nuances of the Harry Potter scene. It just predicts the most probable sequence, the most relevant associations, based on the billions of parameters it has processed. This fundamentally contrasts with human thought. Indeed, humans possess consciousness, lived experience, and emotions. It’s with these that we create new meaning rather than simply recombining existing knowledge. We apprehend causes and effects beyond simple statistical correlations. It’s this that allows us to understand Voldemort's state, the profound implications of the scene, and the symbolic meaning of the meme. And above all, unlike LLMs, humans act with intentions, desires, and beliefs. LLMs merely execute a task based on a set of rules and probabilities. While LLMs are very good at manipulating very large volumes of symbols and representations, they lack the understanding of the real world, common sense, and consciousness inherent in human intelligence, not to mention the biases, unexpected behaviors, or 'hallucinations' they can generate. Conclusion Language models are tools that possess huge computational power, capable of performing tasks that mimic human understanding in an impressive way. However, their operation relies on statistical analysis and pattern recognition within vast datasets, and not on consciousness, reflection, or an inherently human understanding of the world. Understanding this distinction is important when the technological ecosystem exaggerates supposed reasoning capabilities. In this context, adopting a realistic view allows us to fully leverage the capabilities of these systems without attributing qualities to them that they don't possess. Personally, I’m convinced that the future of AI lies in intelligent collaboration between humans and machines, where each brings its unique strengths: consciousness, creativity, and critical thinking on one side; computational power, speed of analysis, and access to information on the other.

By Frederic Jacquet DZone Core CORE
Preventing Downtime in Public Safety Systems: DevOps Lessons from Production

Public safety systems can’t afford to fail silently. An unnoticed deployment bug, delayed API response, or logging blind spot can derail operations across city agencies. In environments like these, DevOps isn’t a workflow; it’s operational survival. With over two decades in software engineering and more than a decade leading municipal cloud platforms, I’ve built systems for cities that can't afford latency or silence. This article shares lessons we’ve gathered over years of working in high-stakes environments, where preparation, not luck, determines stability. The technical decisions described here emerged not from theory but from repeated trials, long nights, and the obligation to keep city services functional under load.

Incident: When a Feature Deployed and Alerts Went Quiet

In one rollout, a vehicle release notification module passed integration and staging tests. The CI pipeline triggered a green build, the version deployed, and nothing flagged. Hours later, city desk agents began reporting citizen complaints: alerts weren’t firing for a specific condition involving early-hour vehicle releases. The root cause? A misconfigured conditional in the notification service logic that silently failed when a timestamp edge case was encountered. Worse, no alert fired because the logging layer lacked contextual flags to differentiate a silent skip from a processed success. Recovery required a hotfix pushed mid-day with temporary logic patching and full log reindexing. The aftermath helped us reevaluate how we handle exception tracking, how we monitor non-events, and how we treat “no news” not as good news, but as something to investigate by default.

Lesson 1: Don’t Deploy What You Can’t Roll Back

After reevaluating our deployment strategy, we didn’t stop at staging improvements. We moved quickly to enforce safeguards that could protect the system in production. We re-architected our Azure DevOps pipeline with staged gates, rollback triggers, and dark launch toggles. Deployments now use feature flags via LaunchDarkly, isolating new features behind runtime switches. When anomalies appear, such as spikes in failed notifications, API response drift, or event lag, the toggle rolls the feature out of traffic. Each deploy attaches a build hash and environment tag. If a regression is reported, we can roll back based on hash lineage and revert to the last known-good state without rebuilding the pipeline. The following YAML template outlines the CI/CD flow used to manage controlled rollouts and rollback gating:

YAML
trigger:
  branches:
    include:
      - main
jobs:
  - job: DeployApp
    steps:
      - task: AzureWebApp@1
        inputs:
          appName: 'vehicle-location services-service'
          package: '$(System.ArtifactsDirectory)/release.zip'
      - task: ManualValidation@0
        inputs:
          instructions: 'Verify rollback readiness before production push'

This flow is paired with a rollback sequence that includes automatic traffic redirection to a green-stable instance, a cache warm-up verification, and a post-revert log streaming process with delta diff tagging. These steps reduce deployment anxiety and allow us to mitigate failures within minutes. Since implementing this approach, we've seen improved confidence during high-traffic deploy windows, particularly during agencies' enforcement seasons.

Lesson 2: Logging Is for Action, Not Just Audit

We knew better visibility was next. The same incident revealed that while the notification service logged outputs, it didn’t emit semantic failure markers.
Now, every service operation logs a set of structured, machine-readable fields: a unique job identifier, UTC-normalized timestamp, result tags, failure codes, and retry attempt metadata. Here's an example: INFO [release-notify] job=VRN_2398745 | ts=2024-11-10T04:32:10Z | result=FAIL | code=E103 | attempts=3 These logs are indexed and aggregated using Azure Monitor. We use queries like the following to track exception rate deltas across time: AppTraces | where Timestamp > ago(10m) | summarize Count = count() by ResultCode, bin(Timestamp, 1m) | where Count > 5 and ResultCode startswith "E" When retry rates exceed 3% in any 10-minute window, automated alerts are dispatched to Teams channels and escalated via PagerDuty. This kind of observability ensures we’re responding to faults long before users experience them. In a few cases, we've even detected upstream vendor slowdowns before our partners formally acknowledged issues. "A silent failure is still a failure, we just don’t catch it until it costs us." Lesson 3: Every Pipeline Should Contain a Kill Switch With observability in place, we still needed the ability to act quickly. To address this, we integrated dry-run validators into every deployment pipeline. These simulate the configuration delta before release. If a change introduces untracked environment variables, API version mismatches, or broken migration chains, the pipeline exits with a non-zero status and immediately alerts the on-call team. In addition, gateway-level kill switches let us unbind problematic services within seconds. For example: HTTP POST /admin/service/v1/kill Content-Type: application/json JSON POST /admin/service/v1/kill Body: { "service": "notification-notify", "reason": "spike-anomaly" } This immediately takes the target service offline, returning a controlled HTTP 503 with a fallback message. It's an emergency brake, but one that saved us more than once. We've added lightweight kill switch verification as part of post-deploy smoke tests to ensure the route binding reacts properly. Lesson 4: Failures Are Normal. Ignoring Them Isn't. None of this matters if teams panic during an incident. We conduct chaos drills every month. These include message queue overloads, DNS lag, and cold database cache scenarios. For each simulation, the system must surface exceptions within 15 seconds, trigger alerts within 20 seconds, and either retry or activate a fallback depending on severity. In one exercise, we injected malformed GPS coordinate records for location service. The system detected the malformed payload, tagged the source batch ID, rerouted it to a dead-letter queue, and preserved processing continuity for all other jobs. It’s not about perfection; it’s about graceful degradation and fast containment. We’ve also learned that how teams respond, not just whether systems recover, affects long-term product reliability and on-call culture. Final Words: Engineer for Failure, Operate for Trust What these lessons have reinforced is that uptime isn’t a metric; it’s a reflection of operational integrity. Systems that matter most need to be built to fail without collapsing. Don’t deploy without a rollback plan. Reversibility is insurance.Observability only works if your logs are readable and relevant.Build in controls that let you shut down safely when needed.Simulate failure regularly. Incident response starts before the outage. These principles haven’t made our systems perfect, but they’ve made them resilient. And in public infrastructure, resilience isn’t optional. 
It’s the baseline. You can’t promise availability unless you architect for failure. And you can’t recover fast unless your pipelines are built to react, not just deploy.
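The kill-switch verification mentioned in Lesson 3 can be as small as a post-deploy smoke test. The sketch below is illustrative: the staging host, authentication, and health route are placeholders, and only the /admin/service/v1/kill route comes from the article.

Shell
#!/usr/bin/env bash
# Post-deploy smoke test (sketch): confirm the kill switch is bound and that a
# disabled service answers with the controlled 503 fallback. Re-enable the
# service afterwards; that route is not shown in the article.
set -euo pipefail
HOST="https://staging.example.gov"   # placeholder host

curl -sf -X POST "$HOST/admin/service/v1/kill" \
  -H "Content-Type: application/json" \
  -d '{"service": "notification-notify", "reason": "smoke-test"}'

status=$(curl -s -o /dev/null -w "%{http_code}" "$HOST/api/notification-notify/health")
[ "$status" = "503" ] || { echo "Kill switch did not engage (got $status)"; exit 1; }
echo "Kill switch verified"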

By Naga Srinivasa Rao Balajepally
Effective Exception Handling in Java and Spring Boot Applications

When you're building Java applications, especially with frameworks like Spring Boot, it’s easy to overlook proper exception handling. However, poorly managed exceptions can make your application harder to debug, more difficult to maintain, and a nightmare for anyone dealing with production issues. In this post, we’ll explore how to handle exceptions effectively by avoiding generic catch blocks, using custom exceptions, and centralizing exception handling with Spring Boot’s @ControllerAdvice. Let’s dive in. The Problem With Generic Catch Blocks We've all seen this before: Java try { // risky code } catch (Exception e) { e.printStackTrace(); // Not helpful in production! } It might seem like a simple catch-all approach, but it's actually a terrible practice. Here’s why: It Swallows Real Issues: Catching Exception can hide bugs or other critical issues that need attention. If every exception is treated the same way, you might miss an important problem Difficult to Debug: Simply printing the stack trace isn’t helpful in a production environment. You need to log the error with enough context to figure out what went wrong. No Context: A generic catch(Exception e) doesn't tell you anything about the actual problem. You lose valuable information that could help you troubleshoot. How We Fix It Instead of catching Exception, always catch more specific exceptions. This gives you a clearer understanding of what went wrong and allows you to handle the error appropriately. Java try { // Risky operation } catch (IOException e) { log.error("File reading error: " + e.getMessage()); // Handle file reading error here } By catching IOException, you’re clearly indicating that the issue is related to input/output operations. It’s easier to identify and fix. Creating and Using Custom Exceptions Sometimes, the exceptions provided by Java or Spring aren't specific enough for your domain logic. This is where custom exceptions come in. When your application hits certain business rules or specific conditions, custom exceptions can provide clarity and better control over how to handle the problem. Domain-Specific: When you create exceptions that reflect your application’s business logic, you make your code more readable.Better Error Handling: Custom exceptions allow you to fine-tune how you handle specific errors in different parts of your application. How to create custom exception? Let’s say you're building an application that handles users, and you need to handle cases where a user can’t be found. Java public class UserNotFoundException extends RuntimeException { public UserNotFoundException(String userId) { super("User not found: " + userId); } } When you try to fetch a user from the database and they don’t exist, you can throw this custom exception: Java User user = userRepository.findById(id) .orElseThrow(() -> new UserNotFoundException(id)); Centralized Exception Handling With @ControllerAdvice Now that we have specific exceptions, how do we handle them consistently across our entire application? The answer is @ControllerAdvice. What is this annotation? Spring Boot’s @ControllerAdvice annotation allows you to define global exception handlers that can catch exceptions thrown by any controller and return a standardized response. This helps you avoid redundant exception handling code in every controller and keeps error responses consistent. 
For example: Create a global exception handler class using @ControllerAdvice: Java @RestControllerAdvice public class GlobalExceptionHandler { @ExceptionHandler(UserNotFoundException.class) public ResponseEntity<String> handleUserNotFound(UserNotFoundException ex) { return new ResponseEntity<>(ex.getMessage(), HttpStatus.NOT_FOUND); } @ExceptionHandler(Exception.class) public ResponseEntity<String> handleAll(Exception ex) { return new ResponseEntity<>("An unexpected error occurred", HttpStatus.INTERNAL_SERVER_ERROR); } } The @ExceptionHandler annotation tells Spring Boot that the method should handle specific exceptions. In this case, if a UserNotFoundException is thrown anywhere in the app, the handleUserNotFound() method will handle it.The handleAll() method catches any generic exception that’s not handled by other methods. This ensures that unexpected errors still get caught and return a meaningful response. With @ControllerAdvice, we now have a centralized place to handle exceptions and return proper HTTP status codes and messages. This keeps our controller code clean and focused only on its primary responsibility—handling the business logic. Use Structured Error Responses When an error occurs, it's important to give the client (or anyone debugging) enough information to understand what went wrong. Returning just a generic error message isn't very helpful. How to create structured error response? By creating a standard error response format, you can ensure consistency in error responses across your application. Here’s how you might define an error response DTO: Java public class ErrorResponse { private String message; private int status; private LocalDateTime timestamp; public ErrorResponse(String message, int status, LocalDateTime timestamp) { this.message = message; this.status = status; this.timestamp = timestamp; } // Getters and Setters } Then in @ControllerAdvice Java @ExceptionHandler(UserNotFoundException.class) public ResponseEntity<ErrorResponse> handleUserNotFound(UserNotFoundException ex) { ErrorResponse error = new ErrorResponse(ex.getMessage(), HttpStatus.NOT_FOUND.value(), LocalDateTime.now()); return new ResponseEntity<>(error, HttpStatus.NOT_FOUND); } What is the advantage ? Now, instead of returning just a plain string, you’re returning a structured response that includes: Error message, HTTP status code and Timestamp This structure makes it much easier for the client (or even developers) to understand the error context. Extra Tips for Better Exception Handling Log with Context: Always log errors with relevant context—such as user ID, request ID, or any other relevant details. This makes troubleshooting easier.Don’t Leak Sensitive Information: Avoid exposing sensitive information like stack traces or internal error details in production. You don’t want to expose your database schema or other internal workings to the outside world.Carefully Choose HTTP Status Codes: Return the right HTTP status codes for different types of errors. For example, use 404 for "not found" errors, 400 for validation errors, and 500 for internal server errors. Conclusion Effective exception handling is crucial for building robust and maintainable applications. By avoiding generic catch blocks, using custom exceptions for clarity, and centralizing exception handling with @ControllerAdvice, you can greatly improve both the quality and the maintainability of your code. Plus, with structured error responses, you provide a better experience for the developers and consumers of your API. 
So, the next time you find yourself writing a catch (Exception e), take a step back and consider whether there’s a more specific way to handle the error.
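One natural extension of this pattern, not covered above, is returning the same structured ErrorResponse for bean-validation failures so that 400 responses look like every other error. A sketch, assuming the ErrorResponse DTO from earlier and standard Spring validation; the field-error formatting is illustrative:

Java
// Add inside GlobalExceptionHandler; requires java.util.stream.Collectors.
@ExceptionHandler(MethodArgumentNotValidException.class)
public ResponseEntity<ErrorResponse> handleValidation(MethodArgumentNotValidException ex) {
    // Collect "field: message" pairs so the client sees exactly what failed.
    String details = ex.getBindingResult().getFieldErrors().stream()
            .map(error -> error.getField() + ": " + error.getDefaultMessage())
            .collect(Collectors.joining("; "));
    ErrorResponse body = new ErrorResponse(details, HttpStatus.BAD_REQUEST.value(), LocalDateTime.now());
    return new ResponseEntity<>(body, HttpStatus.BAD_REQUEST);
}

This keeps validation failures on 400, as recommended in the tips above, while reusing the response shape clients already parse.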

By Arunkumar Kallyodan
Why Whole-Document Sentiment Analysis Fails and How Section-Level Scoring Fixes It

Have you ever tried to analyze the sentiment of a long-form document like a financial report, technical whitepaper or regulatory filing? You probably noticed that the sentiment score often feels way off. That’s because most sentiment analysis tools return a single, aggregated sentiment score—usually positive, negative, or neutral—for the entire document. This approach completely misses the complexity and nuance embedded in long-form content. I encountered this challenge while analyzing annual reports in the finance industry. These documents are rarely uniform in tone. The CEO’s message may sound upbeat, while the “Risk Factors” section could be steeped in caution. A single sentiment score doesn’t do justice to this mix. To solve this, I developed an open-source Python package that breaks down the sentiment of each section in a PDF. This gives you a high-resolution view of how tone fluctuates across a document, which is especially useful for analysts, editors and anyone who needs accurate sentiment cues. What the Package Does The pdf-section-sentiment package is built around a simple but powerful idea: break the document into meaningful sections, and score sentiment for each one individually. It has two core capabilities: PDF Section Extraction — Uses IBM’s Docling to convert the PDF into Markdown and retain section headings like “Executive Summary,” “Introduction,” or “Financial Outlook.”Section-Level Sentiment Analysis — Applies sentiment scoring per section using a lightweight model such as TextBlob. Instead of returning a vague label for the whole document, the tool returns a JSON structure that maps each section to a sentiment score and its associated label. This modular architecture makes the tool flexible, transparent, and easy to extend in the future. Get Started Install: pip install pdf-section-sentimentAnalyze: pdf-sentiment --input your.pdf --output result.jsonLearn more: GitHub RepositoryShare feedback or contribute via GitHub Why This Matters: The Case for Section-Level Analysis Imagine trying to judge the emotional tone of a movie by only watching the last five minutes. That’s what traditional document-level sentiment analysis is doing. Documents, especially professional ones like policy reports, contracts, or business filings are structured. They have introductions, problem statements, findings, and conclusions. The tone and intent of each of these sections can differ dramatically. A single score cannot capture that. With section-level scoring you get: A granular view of how tone changes throughout the documentThe ability to isolate and examine only the negative or positive sectionsBetter alignment with how people read and interpret informationThe foundation for downstream applications like summarization, risk analysis or flagging emotional tone shifts Industries like finance, law, research, and media rely on precision and traceability. This tool brings both to sentiment analysis. How It Works, Step by Step Here’s how the package operates behind the scenes: Convert PDF to Markdown It uses IBM Docling to convert complex PDFs into Markdown format.This ensures paragraph-level fidelity and captures section headers. Split Markdown into Sections LangChain’s MarkdownHeaderTextSplitter detects headers and organizes the document into logical sections.These sections are stored as key-value pairs with headers as keys. Run Sentiment Analysis Each section is fed to a model like TextBlob.It computes a polarity score (between -1 and +1) and assigns a label: positive, neutral, or negative. 
Output the Results It generates a structured JSON file as the final output.Each entry includes the section title, sentiment score, and label. Example output: JSON { "section": "Financial Outlook", "sentiment_score": -0.27, "sentiment": "negative" } This makes it easy to integrate the results into dashboards, reports or further processing pipelines. How to Use It the package offers two command-line interfaces (CLIs) for ease of use: 1. Extracting Sections Only pdf-extract --input myfile.pdf --output sections.json This extracts and saves all sections in a structured JSON file. 2. Extract + Sentiment Analysis pdf-sentiment --input myfile.pdf --output sentiment.json This performs both section extraction and sentiment scoring in one shot. Installation You can install the package directly from PyPI: pip install pdf-section-sentiment It requires Python 3.9 or later. Dependencies like docling, langchain-text-splitters, and textblob are included. When to Use This Tool Here are some concrete use cases: Finance professionals: Identify tone shifts in annual or quarterly earnings reports.Legal teams: Review long legal texts or contracts for section-specific tone.Policy analysts: Examine sentiment trends in whitepapers, proposals or legislation drafts.Content editors: Ensure consistent tone across reports, blog posts or thought leadership. If your workflow involves making sense of long documents where tone matters, this tool is built for you. How Sentiment is Calculated The sentiment score is a float ranging from -1 (strongly negative) to +1 (strongly positive). The corresponding label is determined based on configurable thresholds: score >= 0.1: Positive-0.1 < score < 0.1: Neutralscore <= -0.1: Negative These thresholds work well in most use cases and can be easily adjusted. Limitations and Future Work As with any tool, there are trade-offs: Layout challenges: Some PDFs may have non-standard formatting that hinders clean extraction.Lexicon-based models: TextBlob is fast but not always semantically aware.Scalability: Processing very large PDFs may require batching or optimization. Coming Soon: LLM-based sentiment models (OpenAI, Cohere, etc.)Multilingual supportSection tagging and classificationA web-based interface for visualizing results Conclusion: Bringing Precision to Document Sentiment Whole-document sentiment scoring is like painting with a broom. It’s broad and fast—but it’s also messy and imprecise. In contrast, this package acts like a fine brush. It captures tone at the level where decisions are actually made: section by section. By using structure-aware parsing and per-section sentiment scoring, this tool gives you insights that actually align with how humans read and interpret documents. Whether you’re scanning for red flags, comparing revisions or trying to summarize, this approach gives you the fidelity and context you need.
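The thresholds above are simple to express in code. The helper below is a small illustration of that mapping and of the JSON shape shown earlier; the package's internal implementation may differ.

Python
def label_sentiment(score: float, threshold: float = 0.1) -> str:
    """Map a polarity score in [-1, 1] to a label using the thresholds above."""
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"

# Example: turn per-section scores into the output structure shown earlier.
sections = {"Financial Outlook": -0.27, "Executive Summary": 0.42}
results = [
    {"section": name, "sentiment_score": score, "sentiment": label_sentiment(score)}
    for name, score in sections.items()
]
print(results)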

By Sanjay Krishnegowda
