Software Supply Chain Security
Gone are the days of fragmented security checkpoints and analyzing small pieces of the larger software security puzzle. Today, we are managing our systems for security end to end. Thanks to this shift, software teams have access to a more holistic view — a "full-picture moment" — of our entire software security environment. In the house that DevSecOps built, software supply chains are on the rise as security continues to flourish and evolve across modern software systems. With the growth of zero-trust architecture and AI-driven threat protection strategies, our security systems are more intelligent and resilient than ever before. DZone's Software Supply Chain Security Trend Report unpacks the software supply chain, every touchpoint and security decision, through its most critical parts. Topics covered include AI-powered security, maximizing ROI when securing supply chains, regulations from a DevSecOps perspective, a dive into SBOMs, and more. Now, more than ever, is the time to strengthen resilience and enhance your organization's software supply chains.
Hey, DZone Community! We have an exciting year of research ahead for our beloved Trend Reports. And once again, we are asking for your insights and expertise (anonymously if you choose) — readers just like you drive the content we cover in our Trend Reports. Check out the details for our research survey below. Data Engineering Research: Across the globe, companies are leveling up their data capabilities and analytics maturity. While organizations have become increasingly aware of the copious new technologies at our disposal, it's now about how we can use them in a thoughtful, efficient, and strategic way. Take our short research survey (~10 minutes) to contribute to our upcoming Trend Report. Did we mention that anyone who takes the survey will be entered into a raffle to win an e-gift card of their choosing? We're exploring key topics such as driving a data-centric culture, data storage and architecture, building a robust AI strategy, streaming and real-time data, DataOps trends and takeaways, and the future of data pipelines. Join the Data Engineering Research survey today. Over the coming month, we will compile and analyze data from hundreds of respondents; results and observations will be featured in the "Key Research Findings" section of our upcoming Trend Report. Your responses help inform the narrative of our Trend Reports, so we truly cannot do this without you. Stay tuned for each report's launch and see how your insights align with the larger DZone Community. We thank you in advance for your help! —The DZone Content and Community team
A grand gala was being held at the Jade Palace. The Furious Five were preparing, and Po was helping his father, Mr. Ping, in the kitchen. But as always, Po had questions. Po (curious): "Dad, how do you always make the perfect noodle soup no matter what the ingredients are?" Mr. Ping (smiling wisely): "Ah, my boy, that’s because I follow the secret recipe—a fixed template!" Mr. Ping Reveals the Template Method Pattern Mr. Ping: "Po, the Template Method Pattern is like my noodle recipe. The skeleton of the cooking steps stays the same, but the ingredients and spices can vary!" Po: "Wait, you mean like... every dish has a beginning, middle, and end—but I can change what goes inside?" Mr. Ping: "Exactly! The fixed steps are defined in a base class, but subclasses—or in our case, specific dishes—override the variable parts." Traditional Template Method in Java (Classic OOP) Java public abstract class DishRecipe { // Template method public final void cookDish() { boilWater(); addIngredients(); addSpices(); serve(); } private void boilWater() { System.out.println("Boiling water..."); } protected abstract void addIngredients(); protected abstract void addSpices(); private void serve() { System.out.println("Serving hot!"); } } class NoodleSoup extends DishRecipe { protected void addIngredients() { System.out.println("Adding noodles, veggies, and tofu."); } protected void addSpices() { System.out.println("Adding soy sauce and pepper."); } } class DumplingSoup extends DishRecipe { protected void addIngredients() { System.out.println("Adding dumplings and bok choy."); } protected void addSpices() { System.out.println("Adding garlic and sesame oil."); } } public class TraditionalCookingMain { public static void main(String[] args) { DishRecipe noodle = new NoodleSoup(); noodle.cookDish(); System.out.println("\n---\n"); DishRecipe dumpling = new DumplingSoup(); dumpling.cookDish(); } } //Output Boiling water... Adding noodles, veggies, and tofu. Adding soy sauce and pepper. Serving hot! --- Boiling water... Adding dumplings and bok choy. Adding garlic and sesame oil. Serving hot! Po: "Whoa! So each dish keeps the boiling and serving, but mixes up the center part. Just like kung fu forms!" Functional Template Method Style Po: "Dad, can I make it more... functional?" Mr. Ping: "Yes, my son. We now wield the power of higher-order functions." Java import java.util.function.Consumer; public class FunctionalTemplate { public static <T> void prepareDish(T dishName, Runnable boil, Consumer<T> addIngredients, Consumer<T> addSpices, Runnable serve) { boil.run(); addIngredients.accept(dishName); addSpices.accept(dishName); serve.run(); } public static void main(String[] args) { prepareDish("Noodle Soup", () -> System.out.println("Boiling water..."), dish -> System.out.println("Adding noodles, veggies, and tofu to " + dish), dish -> System.out.println("Adding soy sauce and pepper to " + dish), () -> System.out.println("Serving hot!") ); prepareDish("Dumpling Soup", () -> System.out.println("Boiling water..."), dish -> System.out.println("Adding dumplings and bok choy to " + dish), dish -> System.out.println("Adding garlic and sesame oil to " + dish), () -> System.out.println("Serving hot!") ); } } Po: "Look, dad! Now we can cook anything, as long as we plug in the steps! It's like building recipes with Lego blocks!" Mr. Ping (beaming): "Ah, my son. You are now a chef who understands both structure and flavor." 
Real-World Use Case – Coffee Brewing Machines Po: “Dad, Now I want to build the perfect coffee-making machine, just like our noodle soup recipe!” Mr. Ping: “Ah, coffee, the elixir of monks and night-coders! Use the same template method wisdom, my son.” Step-by-Step Template – Java OOP Coffee Brewer Java abstract class CoffeeMachine { // Template Method public final void brewCoffee() { boilWater(); addCoffeeBeans(); brew(); pourInCup(); } private void boilWater() { System.out.println("Boiling water..."); } protected abstract void addCoffeeBeans(); protected abstract void brew(); private void pourInCup() { System.out.println("Pouring into cup."); } } class EspressoMachine extends CoffeeMachine { protected void addCoffeeBeans() { System.out.println("Adding finely ground espresso beans."); } protected void brew() { System.out.println("Brewing espresso under high pressure."); } } class DripCoffeeMachine extends CoffeeMachine { protected void addCoffeeBeans() { System.out.println("Adding medium ground coffee."); } protected void brew() { System.out.println("Dripping hot water through the grounds."); } } public class CoffeeMain { public static void main(String[] args) { CoffeeMachine espresso = new EspressoMachine(); espresso.brewCoffee(); System.out.println("\n---\n"); CoffeeMachine drip = new DripCoffeeMachine(); drip.brewCoffee(); } } //Ouput Boiling water... Adding finely ground espresso beans. Brewing espresso under high pressure. Pouring into cup. --- Boiling water... Adding medium ground coffee. Dripping hot water through the grounds. Pouring into cup. Functional and Generic Coffee Brewing (Higher-Order Zen) Po, feeling enlightened, says: Po: “Dad! What if I want to make Green Tea or Hot Chocolate, too?” Mr. Ping (smirking): “Ahhh... Time to use the Generic Template of Harmony™!” Functional Java Template for Any Beverage Java import java.util.function.Consumer; public class BeverageBrewer { public static <T> void brew(T name, Runnable boil, Consumer<T> addIngredients, Consumer<T> brewMethod, Runnable pour) { boil.run(); addIngredients.accept(name); brewMethod.accept(name); pour.run(); } public static void main(String[] args) { brew("Espresso", () -> System.out.println("Boiling water..."), drink -> System.out.println("Adding espresso grounds to " + drink), drink -> System.out.println("Brewing under pressure for " + drink), () -> System.out.println("Pouring into espresso cup.") ); System.out.println("\n---\n"); brew("Green Tea", () -> System.out.println("Boiling water..."), drink -> System.out.println("Adding green tea leaves to " + drink), drink -> System.out.println("Steeping " + drink + " gently."), () -> System.out.println("Pouring into tea cup.") ); } } //Output Boiling water... Adding espresso grounds to Espresso Brewing under pressure for Espresso Pouring into espresso cup. --- Boiling water... Adding green tea leaves to Green Tea Steeping Green Tea gently. Pouring into tea cup. Mr. Ping’s Brewing Wisdom “In code as in cooking, keep your recipe fixed… but let your ingredients dance.” Template Pattern gives you structure.Higher-order functions give you flexibility.Use both, and your code becomes as tasty as dumplings dipped in wisdom! Mr. Ping: "Po, a great chef doesn't just follow steps. He defines the structure—but lets each ingredient bring its own soul." Po: "And I shall pass down the Template protocol to my children’s children’s children!" Also read... 
Part 1 – Kung Fu Code: Master Shifu Teaches Strategy Pattern to Po – The Functional Way
Part 2 – Code of Shadows: Master Shifu and Po Use Functional Java to Solve the Decorator Pattern Mystery
Part 3 – Kung Fu Commands: Shifu Teaches Po the Command Pattern with Java Functional Interfaces
Mobile development presents unique challenges in delivering new features and UI changes to users. We often find ourselves waiting on App Store or Play Store review cycles for even minor UI updates. Even after an update is approved, not all users install the latest version right away. This lag means some portion of our audience might be stuck on older UIs, leading to inconsistent user experiences across app versions. In traditional native development, any change to the interface — from a simple text tweak to a full layout overhaul — requires releasing a new app version. Combined with lengthy QA and release processes, this slows down our ability to respond to feedback or run timely experiments. Teams have explored workarounds to make apps more flexible. Some have tried loading portions of the UI in a web view, essentially embedding web pages in the app to avoid full releases. Cross-platform frameworks like React Native and Flutter reduce duplicated effort across iOS and Android, but they still package a fixed UI that requires redeployment for changes. In short, mobile UIs have historically been locked in code at build time. This rigidity clashes with the fast pace of modern product iterations. We need a way to change app interfaces on the fly — one that doesn’t sacrifice native performance or user experience. This is where server-driven UI (SDUI) enters the picture. The Concept of Server-Driven UI Server-driven UI (SDUI) is an architectural pattern that shifts much of the UI definition from the app to the server. Instead of baking every screen layout and widget into the mobile app binary, with SDUI, we allow the server to determine what UI components to display and how to arrange them. The client application (the app) becomes a rendering engine that interprets UI descriptions sent from the backend. In practice, this often means the server delivers a JSON payload that describes screens or components. This JSON defines what elements should appear (text, images, buttons, lists, etc.), their content, and possibly style attributes or layout hints. The mobile app is built with a library of native UI components (views) that correspond to possible element types in the JSON. When the app fetches the JSON, it maps each element to a native view and renders the interface accordingly. The server effectively controls both the data and presentation. This approach is reminiscent of how web browsers work — HTML from the server tells the browser what to render — except here the “HTML” is a custom JSON (or similar format) and the “browser” is our native app. How does this help? It decouples releasing new features from app releases. We can introduce a new promotion banner, change a screen layout, or run A/B tests by adjusting the server response. The next time users open the app (with internet connectivity), they’ll see the updated UI instantly, without updating the app. The client code remains simpler, focusing mainly on rendering logic and native integrations (like navigation or accessing device features), while the server takes on the responsibility of deciding what the UI should contain for each context or user. Let's illustrate with a simple example. Imagine a retail app’s home screen needs to show a “Featured Item Card” as part of a promotional campaign. Traditionally, we would implement this card natively in the app code and release a new version. 
With SDUI, we can define the card on the server and send it to the app: JSON { "type": "card", "id": "featured-item-123", "elements": [ { "type": "image", "url": "https://example.com/images/promo123.png", "aspectRatio": "16:9" }, { "type": "text", "style": "headline", "content": "Limited Edition Sneakers" }, { "type": "text", "style": "body", "content": "Exclusive launch! Available this week only." }, { "type": "button", "label": "Shop Now", "action": { "type": "deep_link", "target": "myapp://shop/featured/123" } } ], "style": { "padding": 16, "backgroundColor": "#FFF8E1", "cornerRadius": 8 } } In this JSON, the server describes a card component with an image, two text blocks, and a button. The client app knows how to render each element type (image, text, button) using native UI widgets. The layout and content (including text and image URL) come from the server. We could change the title or swap out the image URL in the JSON tomorrow, and every user’s app would reflect the new content immediately. All without a new binary release. Comparing SDUI With Native Development How does server-driven UI compare to the traditional native development approach? In native development, each screen and component is implemented in platform-specific code (SwiftUI/UIKit on iOS, Jetpack Compose/View system on Android, etc.). The client is in full control of the UI, and the server typically only supplies raw data (e.g., a list of products in JSON, but not how they should be displayed). Any change in presentation requires modifying the app’s code. This gives us maximum flexibility to use platform features and fine-tune performance, but it also means any significant UI update goes through the full development and deployment cycle. With SDUI, we trade some of that granular control for agility. The server defines what the UI should look like, but the rendering still happens with native components. That means users get a native look and feel — our JSON example above would result in actual native image views, text views, and buttons, not a web page. Performance can remain close to native since we’re not introducing a heavy abstraction layer; the app is essentially assembling native UI at runtime. However, the client must be generic enough to handle various combinations of components. We often implement a UI engine or framework within the app that knows how to take a JSON like the one above and inflate it into native views. This adds complexity to the app architecture — essentially, part of our app becomes an interpreter of a UI description language. Another consideration is version compatibility. In purely native development, the client and server have a fixed contract (e.g., the server sends data X, the client renders it in pre-defined UI Y). In SDUI, that contract is more fluid, and we must ensure the app can gracefully handle JSON meant for newer versions. For example, if we introduce a new element type "carousel" in the JSON but some users have an older app that doesn’t know about carousels, we need a strategy — perhaps the server avoids sending unsupported components to older app versions, or we build backward compatibility into the client’s SDUI engine. 
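To make the client-side rendering step concrete, below is a minimal, framework-agnostic sketch of a component-renderer registry. It is written in Python purely for illustration (a real iOS or Android client would express the same idea in Swift or Kotlin), the returned strings stand in for native widgets, and the element fields mirror the JSON above. Unknown component types are simply skipped, which is one simple way for older app versions to degrade gracefully when the server introduces new components.
Python
# Minimal, framework-agnostic sketch of an SDUI renderer registry.
# The returned strings stand in for real native views (ImageView, TextView, Button).
from typing import Callable, Dict, List

def render_image(element: dict) -> str:
    return f"ImageView(url={element['url']}, aspectRatio={element.get('aspectRatio', 'auto')})"

def render_text(element: dict) -> str:
    return f"TextView(style={element.get('style', 'body')}, content={element['content']!r})"

def render_button(element: dict) -> str:
    action = element.get("action", {})
    return f"Button(label={element['label']!r}, target={action.get('target')})"

# Registry keyed by the "type" field used in the server's JSON.
RENDERERS: Dict[str, Callable[[dict], str]] = {
    "image": render_image,
    "text": render_text,
    "button": render_button,
}

def render_card(card: dict) -> List[str]:
    views = []
    for element in card.get("elements", []):
        renderer = RENDERERS.get(element["type"])
        if renderer is None:
            # Unknown type (e.g., a future "carousel"): skip it so an
            # older client degrades gracefully instead of crashing.
            continue
        views.append(renderer(element))
    return views

if __name__ == "__main__":
    card = {"type": "card", "elements": [
        {"type": "text", "style": "headline", "content": "Limited Edition Sneakers"}]}
    print(render_card(card))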
In summary, compared to native development, SDUI offers: Fast UI iterations: We can deploy UI changes like we deploy backend changes, without waiting on app store releases.Consistency: The server can ensure all users (on compatible app versions) see the updated design, reducing the divergence caused by slow adopters of app updates.Reduced client complexity in business logic: The app focuses on rendering and basic interactions, while business logic (what to show when) lives on the server. This can simplify client code for complex, dynamic content. On the flip side, native development still has the edge in certain areas. If an app’s UI rarely changes or demands pixel-perfect, platform-specific design tailored for each OS, the overhead of SDUI might not be worth it. Native apps also work offline out-of-the-box with their built-in UI since the interface is packaged with the app, whereas SDUI apps need to plan for offline behavior (more on that soon). There’s also the matter of tooling and debugging: a native developer can use interface builders and preview tools to craft screens, but debugging an SDUI layout might involve inspecting JSON and logs to understand what went wrong. We have to invest in developer experience for building and testing UI configurations on the server side. Comparing SDUI With Cross-Platform Frameworks Cross-platform frameworks (like React Native, Flutter, and Kotlin Multiplatform Mobile) address a different problem: the duplication of effort when building separate apps for iOS and Android. These frameworks allow us to write common UI code that runs on multiple platforms. React Native uses JavaScript and React to describe UI, which is then bridged to native widgets (or, in some cases, uses its own rendering). Flutter uses its own rendering engine and Dart language to draw UI consistently across platforms. The benefit is a unified codebase and faster development for multiple targets. However, cross-platform apps still typically follow a client-driven UI approach — the app’s packaged code dictates the interface. Comparing cross-platform with SDUI, we find that they can solve complementary issues. Cross-platform is about “build once, run anywhere”, whereas SDUI is about “update anytime from the server”. You could even combine them: for instance, a React Native app could implement a server-driven UI system in JavaScript that takes JSON from the server and renders React Native components. In fact, many cross-platform apps also want the ability to update content dynamically. The key comparisons are: Deployment vs. code sharing: SDUI focuses on decoupling deployment from UI changes. Cross-platform focuses on sharing code between platforms. With SDUI, we still need to implement the rendering engine on each platform (unless using a cross-platform framework underneath), but each platform’s app stays in sync by receiving the same server-driven definitions. Cross-platform frameworks remove the need to implement features twice, but when you need to change the UI, you still have to ship new code (unless the framework supports code push updates).Performance: Modern cross-platform frameworks are quite performant, but they introduce their own layers. For example, Flutter’s engine draws every pixel, and React Native uses a bridge between JavaScript and native. 
SDUI, by contrast, typically leans on truly native UI components for each platform (the JSON is interpreted by native code), so performance can be as good as native for the rendering, though there is some overhead in parsing JSON and the network round-trip.Flexibility: Cross-platform UIs may sometimes not feel perfectly “at home” on each platform, especially if the framework abstracts away platform-specific UI patterns. SDUI lets each platform render components in a way that matches its native guidelines (since the app code for rendering is platform-specific), while still coordinating the overall look via server. However, SDUI limits how much a local platform developer can tweak a screen’s behavior — the structure comes from server. In cross-platform, a developer could drop down to native code for tricky parts if needed; in SDUI, if a new interaction or component is needed, it likely requires server-side support and possibly updating the app to handle it. In practice, organizations choose one or the other (or both) based on their needs. For example, a team might stick with fully native iOS/Android development but add SDUI capabilities to handle highly dynamic parts of the app. Another team might build the bulk of the app in Flutter but reserve certain sections to be server-driven for marketing content. Importantly, SDUI is not a silver bullet replacement for cross-platform frameworks — it operates at a different layer. In fact, we can think of SDUI as introducing a cross-release contract (the JSON/DSL for UI) rather than a cross-platform codebase. We, as developers, still maintain code for each platform, but that code is largely generic. Cross-platform, conversely, minimizes platform-specific code but doesn’t inherently solve the deployment agility problem. A Concrete Example: "Featured Item Card" To make SDUI more tangible, let's walk through the Featured Item Card example in detail. Suppose our app has a home screen where we want to show a special promotional item to users. In a traditional app, we’d implement a new view class or fragment for this card, define its layout in XML (Android) or SwiftUI, and fetch the data to populate it. With SDUI, we instead add a description of this card to the server’s response for the home screen. When the app requests the home screen layout, the server responds with something like: JSON { "screen": "Home", "sections": [ { "type": "FeaturedItemCard", "data": { "itemId": 123, "title": "Limited Edition Sneakers", "description": "Exclusive launch! This week only.", "imageUrl": "https://example.com/images/promo123.png", "ctaText": "Shop Now", "ctaTarget": "myapp://shop/featured/123" } }, { "type": "ProductGrid", "data": { ... } } ] } In this hypothetical JSON, the home screen consists of multiple sections. The first section is of type "FeaturedItemCard" with associated data for the item to feature, and the second might be a product grid (the rest of the screen). The client app has a registry of component renderers keyed by type. When it sees "FeaturedItemCard", it knows to create a card UI: perhaps an image on top, text below it, and a button. The values for the image URL, text, and button label come from the data field. The ctaTarget might be a deep link or navigation instruction that the app will use to wire the button’s action. The beauty of this setup is that if next month Marketing wants to feature a different item or change the text, we just change the server’s JSON output. 
If we want to experiment with two different styles of featured cards (say, a larger image vs. a smaller image variant), we could have logic on the server that sends different JSON to different user segments. The app simply renders whatever it’s told, using its existing capabilities. Now, building this FeaturedItemCard in the app still requires some work upfront. We need to ensure the app knows how to render a section of type "FeaturedItemCard". That likely means in our SDUI framework on the client, we have a class or function that takes the data for a FeaturedItemCard and constructs the native UI elements, e.g., an ImageView with the image URL, a TextView or SwiftUI Text for the title, another for description, and a styled Button for the CTA. We might have done this already if FeaturedItemCard was a planned component; if not, adding a completely new type of component would require an app update to teach the client about it. This is a crucial point: SDUI excels when using a known set of components that can be reconfigured server-side, but introducing truly new UI paradigms can still require coordinated client-server releases. In our example, as long as “FeaturedItemCard” was built into version 1.0 of the app, we can send any number of different FeaturedItemCard instances with various data. But if we suddenly want a “FancyCarousel” and the app has no carousel component built-in, we have to update the app to add that capability. Our example also hints at nesting and composition. The JSON could describe entire screens (like the Home screen) composed of sections, or it could be more granular (maybe the server sends just the FeaturedItemCard JSON, and the app knows where to insert it). The design of the JSON schema for UI is a big part of SDUI implementation. Some teams use a layout tree structure (hierarchical JSON of containers and components), while others might use a flat list of components per screen with implicit layout logic. Regardless of the approach, the client needs to interpret it and transform it into a user interface. Handling Offline Support: Client-Side Caching Strategy One of the biggest challenges with a server-driven approach is offline support. If the app’s UI is coming from the server at runtime, what happens when the user has no internet connection or when the server is slow to respond? We certainly don’t want to show a blank screen or a loading spinner indefinitely. To mitigate this, we need a caching strategy on the client side. The idea is to cache the JSON (or whatever UI description) from the last successful fetch so that we can fall back to it when needed. A common approach is to use a small SQLite database on the device to store cached UI data. SQLite offers a lightweight, file-based database that’s perfect for mobile apps. By caching the server responses, the app can quickly load the last known UI state for a given screen without network access. For example, we might set up a SQLite table to store UI layouts by a key, such as the screen name or endpoint URL: SQLite CREATE TABLE IF NOT EXISTS ui_layout_cache ( screen_id TEXT PRIMARY KEY, layout_json TEXT NOT NULL, last_updated TIMESTAMP DEFAULT CURRENT_TIMESTAMP ); In this schema, whenever the app fetches a UI description from the server (say for the “Home” screen or for a “FeaturedItemCard” module), it will store the JSON text in the ui_layout_cache table, keyed by an identifier (like "Home"). The last_updated timestamp helps us know how fresh the cache is. 
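To sketch the read/write flow around this table, here is an illustrative example using Python's built-in sqlite3 module (a production app would use the platform's native SQLite APIs; the database path and screen identifiers are assumptions). It assumes the ui_layout_cache table defined above already exists.
Python
import json
import sqlite3
from typing import Optional

DB_PATH = "ui_cache.db"  # illustrative on-device database file

def save_layout(screen_id: str, layout: dict) -> None:
    """Store (or refresh) the latest server-driven layout for a screen."""
    with sqlite3.connect(DB_PATH) as conn:
        conn.execute(
            "INSERT INTO ui_layout_cache (screen_id, layout_json) VALUES (?, ?) "
            "ON CONFLICT(screen_id) DO UPDATE SET layout_json = excluded.layout_json, "
            "last_updated = CURRENT_TIMESTAMP",
            (screen_id, json.dumps(layout)),
        )

def load_cached_layout(screen_id: str) -> Optional[dict]:
    """Return the last cached layout for a screen, or None on a cache miss."""
    with sqlite3.connect(DB_PATH) as conn:
        row = conn.execute(
            "SELECT layout_json FROM ui_layout_cache WHERE screen_id = ?",
            (screen_id,),
        ).fetchone()
    return json.loads(row[0]) if row else None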
If the user opens the app offline, we can query this table for the layout of the Home screen. If found, we parse that JSON and render the UI as usual. This way, the user at least sees the last known interface (and possibly data content) instead of an error. The Future Potential: AI-Generated and AI-Optimized UIs Looking ahead, one exciting avenue for server-driven UI is the incorporation of AI in the UI generation process. Since SDUI already centralizes UI decisions on the server, it opens the door for using machine learning models to decide what UI to present, effectively AI-generated UIs. What could this look like in practice? Imagine a scenario where we want to personalize the app’s interface for each user to maximize engagement or accessibility. With SDUI, we have the flexibility to serve different layouts to different users. AI could be employed to analyze user behavior and preferences, and then dynamically select which components or layout variations to show. For example, if a user tends to ignore a certain type of banner but engages more with lists, an AI model might decide to send a more list-focused home screen JSON to that user. This goes beyond A/B testing into the realm of continuous optimization per user or per segment, driven by algorithms. Another possibility is using generative AI to actually create new UI component configurations. There are already early experiments in using AI to generate code or design assets. One could envision an AI system that, given high-level goals (like “promote item X to user Y in a non-intrusive way”), generates a JSON layout snippet for a new card or modal, which the app then renders. The AI could even iterate and test different generated layouts, measure user interactions (with proper caution to user experience), and gradually converge on optimal designs. In essence, the app’s UI becomes data-driven in real-time, not just server-driven but AI-driven on the server. AI-optimized UIs might also assist in responsive design across device types. If our app runs on phones and tablets, an AI model could adjust the layout JSON for the larger screen, maybe adding additional elements if space allows, or reordering content based on usage patterns. While traditional responsive design can handle simple resizing, an AI could learn from user engagement to decide, for instance, that tablet users prefer a two-column layout for a particular screen, and then automatically deliver that via SDUI. We should note that these ideas are still on the frontier. Implementing AI-driven UI decisions requires careful consideration of user trust and consistency. We wouldn’t want the app UI to become erratic or unpredictable. Likely, AI suggestions would be filtered through design constraints to ensure they align with the brand and usability standards. Server-driven UI provides the delivery mechanism for such dynamic changes, and AI provides the decision mechanism. Together, they could enable a level of UI personalization and optimization that was previously hard to achieve in native apps. In summary, the future may see SDUI and AI converge to produce apps that tailor themselves to each user and context, all without constant human micromanagement. We would still keep humans in the loop for oversight, but much of the heavy lifting in UI optimization could be offloaded to intelligent systems. This concept aligns well with the SDUI philosophy of the client being a flexible renderer, as that flexibility can now be exploited by advanced decision-making algorithms on the server.
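To ground the idea of serving different layouts to different users or segments, here is a deliberately simple, hypothetical sketch of server-side variant selection. The engagement signal, threshold, and layout variants are invented for illustration; a production system might replace the rule with a model's prediction while keeping the SDUI delivery mechanism unchanged.
Python
# Hypothetical server-side selection of a home-screen layout variant.
LIST_FOCUSED_HOME = {
    "screen": "Home",
    "sections": [{"type": "ProductList", "data": {}}],
}
BANNER_FOCUSED_HOME = {
    "screen": "Home",
    "sections": [
        {"type": "FeaturedItemCard", "data": {}},
        {"type": "ProductGrid", "data": {}},
    ],
}

def choose_home_layout(user_profile: dict) -> dict:
    """Pick a layout variant from a simple engagement signal.

    A production system could swap this rule for a model's prediction
    while keeping the SDUI delivery mechanism exactly the same.
    """
    if user_profile.get("banner_click_rate", 0.0) < 0.01:
        return LIST_FOCUSED_HOME
    return BANNER_FOCUSED_HOME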
After fifteen years of wrestling with SQL Server performance challenges in production environments, I can confidently say that indexed views remain one of the most underutilized yet powerful features for optimizing query performance. Introduced in SQL Server 2000 and significantly enhanced in subsequent versions, indexed views (also known as materialized views) allow you to physically store the result set of a view on disk with a clustered index, dramatically improving query performance for complex aggregations and joins. I've seen indexed views transform applications from sluggish report generators taking minutes to execute into responsive systems delivering results in seconds. However, they're not a silver bullet; they come with maintenance overhead and specific limitations that every DBA needs to understand before implementation. Today, where milliseconds matter and storage is relatively cheap compared to compute resources, indexed views offer a compelling solution for read-heavy workloads, data warehousing scenarios, and complex reporting requirements. This article will walk you through everything I've learned about implementing, maintaining, and optimizing indexed views in production SQL Server environments. Technical Deep Dive Indexed views work by physically materializing the view's result set and storing it with a unique clustered index. Unlike regular views, which are simply stored SELECT statements executed at query time, indexed views maintain their data on disk and are automatically updated when the underlying base tables change. The magic happens through SQL Server's query optimizer, which can automatically substitute indexed views for base table access when it determines the view can satisfy a query, even if the query doesn't directly reference the view. This automatic matching capability, available in Enterprise Edition, is what makes indexed views so powerful for performance optimization. Prerequisites and System Requirements From my experience, you'll need SQL Server Standard Edition or higher for basic indexed view creation, but Enterprise Edition for automatic view matching. The base tables must use two-part naming (schema.table), and several SET options must be configured correctly during creation: Plain Text SET NUMERIC_ROUNDABORT OFF SET ANSI_PADDING, ANSI_WARNINGS, CONCAT_NULL_YIELDS_NULL, ARITHABORT ON SET QUOTED_IDENTIFIER, ANSI_NULLS ON Critical Limitations I've Encountered The biggest gotcha I've faced is the restriction on deterministic functions only. No GETDATE(), NEWID(), or user-defined functions unless they're schema-bound and deterministic. Cross-database queries are forbidden, and you cannot use OUTER JOINs, subqueries, or certain aggregate functions like STDEV or VAR. Maintenance overhead is significant every INSERT, UPDATE, or DELETE on base tables triggers index maintenance on all related indexed views. I've seen poorly planned indexed views actually degrade performance on OLTP systems with heavy write activity. Comparison With Competing Solutions Oracle's materialized views offer more flexibility with refresh options (ON COMMIT, ON DEMAND), while SQL Server's indexed views are always automatically maintained. PostgreSQL's materialized views require manual refresh but offer more aggregation function support. SQL Server's approach trades flexibility for consistency. Your data is always current, but you pay the maintenance cost. The query optimizer's automatic matching in Enterprise Edition is SQL Server's killer feature here. 
I've migrated applications from Oracle, where we had to explicitly reference materialized views, to SQL Server, where the optimizer handles substitution automatically. Practical Implementation Let me walk you through implementing an indexed view using a real-world scenario I encountered at a manufacturing company. They needed fast access to monthly sales summaries that involved complex joins across orders, customers, and products. Step 1: Create the Base View Plain Text -- Ensure proper SET options SET NUMERIC_ROUNDABORT OFF; SET ANSI_PADDING, ANSI_WARNINGS, CONCAT_NULL_YIELDS_NULL, ARITHABORT ON; SET QUOTED_IDENTIFIER, ANSI_NULLS ON; -- Create the view with proper schema binding CREATE VIEW dbo.vw_MonthlySalesSummary WITH SCHEMABINDING AS SELECT YEAR(o.OrderDate) AS OrderYear, MONTH(o.OrderDate) AS OrderMonth, c.CustomerID, c.CompanyName, COUNT_BIG(*) AS OrderCount, SUM(od.Quantity * od.UnitPrice) AS TotalSales /* AVG is not allowed in an indexed view; derive AvgOrderValue as TotalSales / OrderCount when querying */ FROM dbo.Orders o INNER JOIN dbo.Customers c ON o.CustomerID = c.CustomerID INNER JOIN dbo.OrderDetails od ON o.OrderID = od.OrderID GROUP BY YEAR(o.OrderDate), MONTH(o.OrderDate), c.CustomerID, c.CompanyName; Step 2: Create the Clustered Index Plain Text -- Create unique clustered index CREATE UNIQUE CLUSTERED INDEX IX_MonthlySalesSummary_Main ON dbo.vw_MonthlySalesSummary (OrderYear, OrderMonth, CustomerID); -- Add supporting nonclustered indexes CREATE NONCLUSTERED INDEX IX_MonthlySalesSummary_Sales ON dbo.vw_MonthlySalesSummary (TotalSales DESC) INCLUDE (CompanyName, OrderCount); Error Handling and Common Issues The most frequent error I encounter is the "Cannot create index on view because it references imprecise or non-deterministic column" error. Always verify your functions are deterministic: Plain Text -- Check if functions are deterministic SELECT OBJECTPROPERTY(OBJECT_ID('dbo.MyFunction'), 'IsDeterministic'); For troubleshooting performance issues, use these diagnostic queries: Plain Text -- Check if indexed view is being used SELECT dm_db_index_usage_stats.object_id, OBJECT_NAME(dm_db_index_usage_stats.object_id) AS view_name, user_seeks, user_scans, user_lookups FROM sys.dm_db_index_usage_stats WHERE database_id = DB_ID() AND OBJECT_NAME(object_id) LIKE 'vw_%'; Hands-On Testing Section Let's create a comprehensive test scenario to demonstrate indexed view performance benefits. I'll use a simplified e-commerce schema that you can implement in any test environment.
Test Environment Setup Plain Text -- Create test database and tables CREATE DATABASE IndexedViewTest; USE IndexedViewTest; -- Orders table CREATE TABLE dbo.Orders ( OrderID INT IDENTITY(1,1) PRIMARY KEY, CustomerID INT NOT NULL, OrderDate DATETIME NOT NULL, OrderTotal DECIMAL(10,2) NOT NULL ); -- OrderDetails table CREATE TABLE dbo.OrderDetails ( OrderDetailID INT IDENTITY(1,1) PRIMARY KEY, OrderID INT NOT NULL, ProductID INT NOT NULL, Quantity INT NOT NULL, UnitPrice DECIMAL(8,2) NOT NULL, FOREIGN KEY (OrderID) REFERENCES dbo.Orders(OrderID) ); -- Customers table CREATE TABLE dbo.Customers ( CustomerID INT IDENTITY(1,1) PRIMARY KEY, CompanyName NVARCHAR(100) NOT NULL, City NVARCHAR(50), Country NVARCHAR(50) ); Sample Data Generation Plain Text -- Generate test data (100K orders) DECLARE @Counter INT = 1; WHILE @Counter <= 100000 BEGIN INSERT INTO dbo.Orders (CustomerID, OrderDate, OrderTotal) VALUES ( (@Counter % 1000) + 1, DATEADD(DAY, -(@Counter % 365), GETDATE()), RAND() * 1000 + 50 ); SET @Counter += 1; END; -- Generate 1,000 customers INSERT INTO dbo.Customers (CompanyName, City, Country) SELECT TOP (1000) 'Company ' + CAST(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS VARCHAR(10)), 'City ' + CAST((ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) % 100) AS VARCHAR(10)), 'Country ' + CAST((ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) % 20) AS VARCHAR(10)) FROM sys.all_columns a1 CROSS JOIN sys.all_columns a2; Performance Baseline Testing Plain Text -- Clear execution plan cache and buffer pool DBCC FREEPROCCACHE; DBCC DROPCLEANBUFFERS; -- Baseline query without indexed view SET STATISTICS IO ON; SET STATISTICS TIME ON; SELECT YEAR(o.OrderDate) AS OrderYear, MONTH(o.OrderDate) AS OrderMonth, COUNT(*) AS OrderCount, AVG(o.OrderTotal) AS AvgOrderValue, SUM(o.OrderTotal) AS TotalSales FROM dbo.Orders o WHERE YEAR(o.OrderDate) = 2024 GROUP BY YEAR(o.OrderDate), MONTH(o.OrderDate) ORDER BY OrderYear, OrderMonth; After Indexed View Implementation Plain Text -- Create indexed view CREATE VIEW dbo.vw_MonthlyOrderSummary WITH SCHEMABINDING AS SELECT YEAR(o.OrderDate) AS OrderYear, MONTH(o.OrderDate) AS OrderMonth, COUNT_BIG(*) AS OrderCount, SUM(o.OrderTotal) AS TotalSales /* AVG is not allowed in an indexed view; compute it as TotalSales / OrderCount at query time */ FROM dbo.Orders o GROUP BY YEAR(o.OrderDate), MONTH(o.OrderDate); CREATE UNIQUE CLUSTERED INDEX IX_MonthlyOrderSummary ON dbo.vw_MonthlyOrderSummary (OrderYear, OrderMonth); -- Test same query - should use indexed view automatically -- (Results show 80-95% reduction in logical reads in my tests) Verification Steps Plain Text -- Verify indexed view usage in execution plan SELECT plan_handle, query_plan, execution_count FROM sys.dm_exec_query_stats CROSS APPLY sys.dm_exec_query_plan(plan_handle) WHERE query_plan.exist('//Object[@Table="[vw_MonthlyOrderSummary]"]') = 1; In my testing, I consistently see 80-95% reduction in logical reads and 60-85% improvement in query execution time for aggregation queries that can leverage indexed views. Industry Application and Use Cases In my consulting work, I've successfully implemented indexed views across various industries with remarkable results. Financial services companies use them extensively for regulatory reporting — one client reduced their month-end risk calculation batch from 6 hours to 45 minutes by implementing indexed views on position aggregations. E-commerce platforms benefit tremendously from indexed views on customer behavior analytics.
I implemented indexed views for a major retailer's recommendation engine, aggregating purchase patterns and product affinities. The real-time dashboard queries that previously took 15-30 seconds now execute in under 2 seconds. Manufacturing companies with complex bill-of-materials calculations see enormous benefits. One client's production planning queries involving multi-level BOMs and inventory calculations improved from 45 seconds to 3 seconds after implementing strategic indexed views. Performance and Cost Implications From a cost perspective, indexed views trade storage space and maintenance overhead for query performance. In cloud environments like Azure SQL Database, this often results in net cost savings due to reduced DTU consumption, despite increased storage costs. I've seen clients reduce their Azure SQL Database service tier requirements by one or two levels after implementing indexed views strategically. Integration Considerations The biggest integration challenge is managing the maintenance overhead in high-transaction environments. I always recommend implementing indexed views during maintenance windows and monitoring transaction log growth carefully. For databases with heavy ETL processes, I often schedule indexed view rebuilds during off-peak hours to minimize impact on production workloads. Best Practices and Recommendations Based on years of production experience, here are my key recommendations for indexed view implementation: Production Deployment Considerations Always implement indexed views during scheduled maintenance windows. I've learned the hard way that creating an indexed view on a large table during business hours can cause blocking and performance degradation. Use SQL Server Agent jobs to automate index maintenance, especially for indexed views experiencing heavy update activity. Monitor transaction log space carefully – indexed view maintenance can generate significant log activity. I always ensure the transaction log backup frequency is appropriate for the maintenance overhead. Monitoring and Maintenance Establish baseline metrics before implementation and monitor continuously: Plain Text -- Monitor indexed view usage and maintenance costs SELECT OBJECT_NAME(i.object_id) AS view_name, i.name AS index_name, user_seeks + user_scans + user_lookups AS total_reads, user_updates AS maintenance_cost, avg_fragmentation_in_percent FROM sys.dm_db_index_usage_stats us INNER JOIN sys.indexes i ON us.object_id = i.object_id AND us.index_id = i.index_id CROSS APPLY sys.dm_db_index_physical_stats(DB_ID(), i.object_id, i.index_id, NULL, 'LIMITED') WHERE database_id = DB_ID() AND OBJECTPROPERTY(i.object_id, 'IsView') = 1; When to Use and When to Avoid Use indexed views for read-heavy workloads with complex aggregations, especially in data warehousing and reporting scenarios. Avoid them on tables with high update frequency unless the read benefits significantly outweigh maintenance costs. I never implement indexed views on tables receiving more than 1000 modifications per minute without extensive testing. Conclusion Indexed views represent one of SQL Server's most powerful performance optimization features when applied correctly. In my fifteen years of database administration, I've seen them transform application performance in scenarios where traditional indexing strategies fall short. The key to success lies in understanding their limitations and maintenance requirements. 
They're not suitable for every scenario, but when properly implemented in read-heavy environments with complex aggregation requirements, the performance benefits are substantial and immediate. Looking forward, I expect Microsoft to continue enhancing indexed view capabilities, potentially addressing some current limitations around function restrictions and cross-database scenarios. The automatic query substitution feature in Enterprise Edition continues to be a compelling reason for organizations to invest in higher SQL Server editions. My final recommendation: start small, test thoroughly, and monitor continuously. Begin with your most problematic reporting queries, implement indexed views in a controlled environment, and gradually expand based on proven results. The performance gains you'll achieve will make indexed views an indispensable tool in your database optimization arsenal.
LLMs may speak in words, but under the hood they think in tokens: compact numeric IDs representing character sequences. If you grasp why tokens exist, how they are formed, and where the real-world costs arise, you can trim your invoices, slash latency, and squeeze higher throughput from any model, whether you rent a commercial endpoint or serve one in-house. Why LLMs Don’t Generate Text One Character at a Time Imagine predicting “language” character by character. When decoding the very last “e,” the network must still replay the entire hidden state for the preceding seven characters. Multiply that overhead by thousands of characters in a long prompt and you get eye-watering compute. Sub-word tokenization offers a sweet spot between byte-level granularity and full words. Common fragments such as “lan,” “##gua,” and “##ge” (WordPiece notation, the # being a special attachment token) capture richer statistical signals than individual letters while keeping the vocabulary small enough for fast matrix multiplications on modern accelerators. Fewer time steps per sentence means shorter KV caches, smaller attention matrices, and crucially, fewer dollars spent. How Tokenizers Build Their Vocabulary Tokenizers are trained once, frozen, and shipped with every checkpoint. Three dominant families are worth knowing: Algorithm Starting Point Merge / Prune Strategy Famous Uses Byte-Pair Encoding (BPE) All possible bytes (256) Repeatedly merge the most frequent adjacent pair GPT-2, GPT-3, Llama-2 WordPiece Individual Unicode characters Merge pair that most reduces perplexity rather than raw count BERT, DistilBERT Unigram (SentencePiece) Extremely large seed vocabulary Iteratively remove tokens whose absence improves a Bayesian objective T5, ALBERT Byte-level BPE UTF-8 bytes Same as classic BPE but merges operate on raw bytes GPT-NeoX, GPT-3.5, GPT-4 Because byte-level BPE sees the world as 1-byte pieces, it can tokenize English, Chinese, emoji, and Markdown without language-specific hacks. The price is sometimes unintuitive splits: a single exotic Unicode symbol might expand into dozens of byte tokens. An End-to-End Example (GPT-3.5 Tokenizer) Input string: Python def greet(name: str) -> str: return f"Hello, {name}" Tokenized output: Token ID Text def 3913 def _g 184 space + g reet 13735 reet ( 25 ( … … … Eighteen tokens, but 55 visible characters. Every additional token will be part of your bill. Why Providers Charge Per Token A transformer layer applies the same weight matrices to every token position. Doubling token count roughly doubles FLOPs, SRAM traffic, and wall-clock time. Hardware vendors quote sustained TFLOPs/s assuming full utilization, so providers size their clusters and price their SKUs accordingly. Billing per word would misrepresent the reality that some emoji characters can explode into ten byte tokens, while the English word “the” costs only one. The token is the fairest atomic unit of compute. If an endpoint advertises a 128 k-token context, that means roughly 512 kB of text (in English prose) or a short novel. Pass that slice through a 70-billion-parameter model and you’ll crunch trillions of multiply-accumulates, hence the eye-popping price tag. Four Techniques to Shrink Your Token Budget 1. Fine-tuning & PEFT Shift recurring instructions (“You are a helpful assistant…”) into model weights. A one-time fine-tune cost can pay for itself after a few million calls by chopping 50–200 prompt tokens each request. 2. 
Prompt Caching KV (key–value) caches store attention projections for the shared prefix. Subsequent tokens reuse them, so the incremental cost is linear in new tokens only.OpenAI and Anthropic expose an cache=true parameter; vLLM auto-detects overlapping prefixes server-side and reports ~1.2–2× throughput gains at >256 concurrent streams. 3. Retrieval-Augmented Generation (RAG) Instead of injecting an entire knowledge base, embed it offline, retrieve only the top-k snippets, and feed the model a skinny prompt like the one shown below. RAG can replace a 10 k-token memory dump with a 1 k-token on-demand payload. Answer with citations. Context:\n\n<snippet 1>\n<snippet 2> 4. Vocabulary-Aware Writing Avoid fancy quotes, hairline spaces, and deep Unicode indentation which balloon into byte junk.Prefer ASCII tables to box-drawing characters.Batch similar calls (e.g., multiple Q&A pairs) to amortize overhead. Prompt Caching Under the Microscope Assume your backend supports prefix reuse. Two users ask: SYSTEM: You are a SQL expert. Provide optimized queries. USER: List the ten most-purchased products. and later SYSTEM: You are a SQL expert. Provide optimized queries. USER: Calculate monthly revenue growth. The second request shares a 14-token system prompt. With caching, the model skips those 14 tokens, runs attention only on the five fresh ones, and streams the answer twice as fast. Your bill likewise drops because providers charge only for non-cached tokens (input and output). Hidden Costs: Tokenization Mistakes Across Model Families Each checkpoint ships with its own merge table. A prompt engineered for GPT-4 may tokenize very differently on Mixtral or Gemini-Pro. For instance, the em-dash “—” is a single token (1572) for GPT-3.5 but splits into three on Llama-2. Rule of thumb: Whenever you migrate a workflow, log token counts before and after. What was cheap yesterday can triple in price overnight. Instrumentation: What to Measure and Alert On prompt_tokens – size of user + system + assistant context.completion_tokens – model’s output length.Cache hit ratio – percentage of tokens skipped.Cost per request – aggregate of (prompt + completion) × price rate.Latency variance – spikes often correlate with unusually long prompts that evaded cache. Streaming these metrics into Grafana or Datadog lets you spot runaway bills in real time. Advanced Tricks for Power Users Adaptive Chunking: For Llama-2 in vLLM, adding --max-prompt-feed 2048 breaks colossal prompts into GPU-friendly slices, enabling 8 × throughput on A100-40G cards.Speculative Decoding: Draft with a small model, validate with the big one. Providers like OpenAI (gpt-4o-mini + gpt-4o) surface this behind the scenes, slashing tail latency by ~50 %.Token Dropping at Generation Time: During beam search, discard beams diverging early; they would spend tokens on answers you’ll never show. Key Takeaways Tokens are the currency. Vocabulary design, not characters, defines cost.Measure relentlessly. Log every call’s token counts.Exploit repetition. Fine-tune or cache recurring scaffolding.Retrieval beats memorization. RAG turns 10 k-token dumps into 1 k curated bites.Re-benchmark after each model swap. Merge tables shift; your budget should shift with them. Whether you’re integrating language models into everyday applications or creating AI agents, understanding tokenization will keep your solutions fast, affordable, and reliable. Master the humble tokenizer and every other layer of the LLM stack (prompt engineering, retrieval, model selection, etc.) 
becomes much easier.
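As a small, practical companion to the instrumentation advice above, the sketch below counts prompt tokens with the tiktoken library before a request is sent; the model name, example prompt, and price figure are assumptions chosen for illustration.
Python
import tiktoken

def count_tokens(text: str, model: str = "gpt-3.5-turbo") -> int:
    """Count tokens the way the target model's tokenizer would."""
    try:
        encoding = tiktoken.encoding_for_model(model)
    except KeyError:
        # Unknown model name: fall back to a common byte-level BPE encoding.
        encoding = tiktoken.get_encoding("cl100k_base")
    return len(encoding.encode(text))

prompt = 'def greet(name: str) -> str: return f"Hello, {name}"'
n_tokens = count_tokens(prompt)

# Rough cost estimate, assuming an illustrative $0.50 per million input tokens.
price_per_million_tokens = 0.50
print(f"{n_tokens} tokens, estimated input cost: "
      f"${n_tokens * price_per_million_tokens / 1_000_000:.6f}")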
Ever thought about building your own AI-powered app that gives personalized nutrition tips, and even talks back to you? In this hands-on tutorial, you’ll learn how to create Nurture, your very own AI nutrition coach. We’ll use GPT-4 for natural, intelligent conversations, Gradio to build a simple, interactive web interface, and gTTS (Google Text-to-Speech) so your app can speak its responses aloud. Nurture will be able to chat with users, calculate their BMI, and provide helpful audio feedback, all wrapped in a clean, shareable web app. Whether you’re just getting started with Python or looking to sharpen your skills, this step-by-step guide will walk you through the entire process: setting up your environment, working with APIs, and deploying a fully functional AI assistant you can show off or expand further. Prerequisites Let’s get set up! Here’s what you’ll need to have in place before we dive in: Python 3.8+ installed on your system.Use your favorite code editor (VS Code or PyCharm if you need a suggestion).Basic knowledge of Python and familiarity with installing packages using pip.An internet connection to access APIs and deploy the app. Step 1: Setting Up the Environment To begin, let's set up a virtual environment and install the required libraries. 1. Set up a virtual environment Open your terminal and run the following command: Python python -m venv nurture_env Activate the virtual environment: On Windows: nurture_env\Scripts\activateOn macOS/Linux: source nurture_env/bin/activate 2. Install required libraries: With the virtual environment activated, install the necessary packages: Python pip install openai gradio gTTS openai: For interacting with OpenAI's GPT-4 model.gradio: For creating a web-based user interface.gTTS: For converting text responses to speech. Step 2: Obtaining an OpenAI API Key The application uses OpenAI's GPT-4 model to power the conversational AI. To use it, you need an API key. 1. Sign up for OpenAI: Visit platform.openai.com.If you’re new, sign up for an account—otherwise, just log in and you’re good to go. 2. Generate an API key: Go to the API Keys section in your OpenAI dashboard.Click Create New Secret Key, give it a name (e.g., "Nurture App"), and copy the key.Important: Store this key securely and never share it publicly. 3. Set up your API key: In your code, replace "Your_key" with your actual OpenAI API key. For security, you can store it in an environment variable: Python export OPENAI_API_KEY="your-api-key-here" Then, in your Python code, access it using: Python import os openai.api_key = os.getenv("OPENAI_API_KEY") Step 3: Understanding the Code Structure The application has three main components: Conversational AI: Uses OpenAI’s ChatCompletion API to deliver friendly, nutrition-focused guidance tailored to each user.Text-to-Speech: Converts AI responses to audio using gTTS.BMI Calculator: Computes the user’s Body Mass Index (BMI) from their height and weight inputs.Gradio Interface: Brings all the features together in a simple, web-based user interface. Let's break down each part and explain how to implement it. Step 4: Writing the Code Below is the complete code for the application, with detailed explanations for each section. Save this as nurture.py. 
Python import openai import gradio as gr from gtts import gTTS import tempfile import os OpenAI Setup Python openai.api_key = os.getenv("OPENAI_API_KEY") # Securely load API key Persistent Chat History Here's a short explanation of what the code does: chat_history initializes a conversation context with a system prompt that defines the assistant as a friendly nutritionist. ask_openai() sends the user's question to OpenAI's GPT-4 model, maintains the ongoing conversation by updating chat_history, and returns the model's response. speak() uses Google Text-to-Speech (gTTS) to convert the assistant's response into an audio file (MP3), which can be played back. calculate_bmi() takes a user's height and weight, calculates their Body Mass Index (BMI), and classifies it into categories like "Normal weight" or "Obese." This setup enables a conversational, voice-enabled nutrition coach with BMI feedback. Python # Initialize chat history with system prompt chat_history = [ { "role": "system", "content": "You are an empathetic, conversational nutritionist who gently guides users to share their name, fitness goals, lifestyle, allergies, dietary preferences, and typical meals. Start with a warm greeting and then ask follow-up questions to eventually create a personalized diet plan." } ] def ask_openai(question): """ Sends user input to OpenAI's GPT-4 model and returns the response. Maintains conversation context by appending to chat_history. """ try: chat_history.append({"role": "user", "content": question}) response = openai.ChatCompletion.create( model="gpt-4", messages=chat_history, temperature=0.5, # Controls randomness (0.5 for balanced responses) max_tokens=300 # Limits response length ) reply = response['choices'][0]['message']['content'] chat_history.append({"role": "assistant", "content": reply}) return reply except Exception as e: return f"Error: {str(e)}" def speak(text): """ Converts text to speech using gTTS and saves it to a temporary MP3 file. Returns the file path for playback. """ tts = gTTS(text) tmp_file = tempfile.NamedTemporaryFile(delete=False, suffix=".mp3") tts.save(tmp_file.name) return tmp_file.name def calculate_bmi(height_cm, weight_kg): """ Calculates BMI based on height (cm) and weight (kg). Returns BMI value and weight category. """ try: height_m = height_cm / 100 bmi = weight_kg / (height_m ** 2) category = ( "Underweight" if bmi < 18.5 else "Normal weight" if bmi < 25 else "Overweight" if bmi < 30 else "Obese" ) return f"Your BMI is {bmi:.2f} — {category}" except: return "Please enter valid height and weight." Gradio Interface Python with gr.Blocks(theme=gr.themes.Soft()) as demo: gr.Markdown("Nurture: Your AI Nutrition Coach") with gr.Row(): # Chat Section with gr.Column(scale=2): chatbot = gr.Chatbot(height=350, label="Chat with Nurture⚕️") message = gr.Textbox(placeholder="Ask something...", label="Your Message") send_btn = gr.Button("Send") audio_output = gr.Audio(label="AI Voice", type="filepath", interactive=False) def respond(user_message, chat_log): """ Handles user input, gets AI response, updates chat log, and generates audio. """ bot_reply = ask_openai(user_message) chat_log.append((user_message, bot_reply)) audio_path = speak(bot_reply) return "", chat_log, audio_path send_btn.click(respond, inputs=[message, chatbot], outputs=[message, chatbot, audio_output]) # BMI + Tools Section with gr.Column(scale=1): gr.Markdown("### Check Your BMI") height = gr.Number(label="Height (in cm)") weight = gr.Number(label="Weight (in kg)") bmi_output = gr.Textbox(label="Result") bmi_btn = gr.Button("Calculate BMI") bmi_btn.click(fn=calculate_bmi, inputs=[height, weight], outputs=bmi_output) demo.launch(share=True)

Let's review the integrated code. You're welcome to use a different model instead of GPT and add any additional features as needed. Python import openai import gradio as gr from gtts import gTTS import tempfile # OpenAI setup openai.api_key = "Your_key" # Persistent chat history chat_history = [ {"role": "system", "content": "You are an empathetic, conversational nutritionist who gently guides users to share their name, fitness goals, lifestyle, allergies, dietary preferences, and typical meals. Start with a warm greeting and then ask follow-up questions to eventually create a personalized diet plan."} ] def ask_openai(question): try: chat_history.append({"role": "user", "content": question}) response = openai.ChatCompletion.create( model="gpt-4", messages=chat_history, temperature=0.5, max_tokens=300 ) reply = response['choices'][0]['message']['content'] chat_history.append({"role": "assistant", "content": reply}) return reply except Exception as e: return f"Error: {str(e)}" def speak(text): tts = gTTS(text) tmp_file = tempfile.NamedTemporaryFile(delete=False, suffix=".mp3") tts.save(tmp_file.name) return tmp_file.name def calculate_bmi(height_cm, weight_kg): try: height_m = height_cm / 100 bmi = weight_kg / (height_m ** 2) category = ( "Underweight" if bmi < 18.5 else "Normal weight" if bmi < 25 else "Overweight" if bmi < 30 else "Obese" ) return f"Your BMI is {bmi:.2f} — {category}" except: return "Please enter valid height and weight." # Gradio interface with gr.Blocks(theme=gr.themes.Soft()) as demo: gr.Markdown("<h1 style='text-align: center;'>Nurture: Your AI Nutrition Coach</h1>") with gr.Row(): # Chat Section with gr.Column(scale=2): chatbot = gr.Chatbot(height=350, label="Chat with Nurture⚕️") message = gr.Textbox(placeholder="Ask something...", label="Your Message") send_btn = gr.Button("Send") audio_output = gr.Audio(label="AI Voice", type="filepath", interactive=False) def respond(user_message, chat_log): bot_reply = ask_openai(user_message) chat_log.append((user_message, bot_reply)) audio_path = speak(bot_reply) return "", chat_log, audio_path send_btn.click(respond, inputs=[message, chatbot], outputs=[message, chatbot, audio_output]) # BMI + Tools Section with gr.Column(scale=1): gr.Markdown("### Check Your BMI") height = gr.Number(label="Height (in cm)") weight = gr.Number(label="Weight (in kg)") bmi_output = gr.Textbox(label="Result") bmi_btn = gr.Button("Calculate BMI") bmi_btn.click(fn=calculate_bmi, inputs=[height, weight], outputs=bmi_output) demo.launch(share=True) Sample UI after running the code: Note: This application also includes text-to-speech functionality to deliver the AI's responses audibly to users. Happy Coding!
At noon, Xiao Wang was staring at his computer screen, looking worried. He is in charge of the company's data platform and recently received a task: to perform real-time analysis on data from three different databases—MySQL, PostgreSQL, and Oracle. "I have to write three sets of ETL programs to synchronize the data into Doris. What a workload..." Xiao Wang rubbed his sore eyes. "Why make it so complicated? Use JDBC Catalog!" his colleague Xiao Li's voice came from behind. Xiao Wang looked puzzled: "What is this magical JDBC Catalog?" Xiao Li smiled: "It's like a universal key in the data world, opening the doors to multiple databases with one key." The Magic Tool to Break Data Silos The core charm of Doris JDBC Catalog lies in its ability to connect to a variety of databases through the standard JDBC interface, including MySQL, PostgreSQL, Oracle, SQL Server, IBM Db2, ClickHouse, SAP HANA, and OceanBase. Let's look at a practical example: SQL CREATE CATALOG mysql_source PROPERTIES ( "type"="jdbc", "user"="root", "password"="secret", "jdbc_url" = "jdbc:mysql://example.net:3306", "driver_url" = "mysql-connector-j-8.3.0.jar", "driver_class" = "com.mysql.cj.jdbc.Driver" ); That's it! Once created, you can query tables in MySQL just like querying local Doris tables: SQL -- Show all databases SHOW DATABASES FROM mysql_source; -- Show tables in the test database SHOW TABLES FROM mysql_source.test; -- Directly query a table in MySQL SELECT * FROM mysql_source.test.users; For data analysts, this means no more writing complex ETL processes, no more worrying about data consistency, and no additional storage costs. The data remains stored in the original databases, but you can query them as if they were local tables. The Magic Formula of JDBC Catalog To unleash the full power of JDBC Catalog, you need to understand its key configuration options. Driver Configuration First, the driver package path can be specified in three ways: Filename only, such as mysql-connector-j-8.3.0.jar. The system will look for it in the jdbc_drivers/ directory. Local absolute path, such as file:///path/to/mysql-connector-j-8.3.0.jar. HTTP address. The system will automatically download the driver file. Tip: To better manage driver packages, you can use the jdbc_driver_secure_path FE configuration item to restrict the paths from which driver packages may be loaded, enhancing security. Connection Pool Tuning The connection pool has a significant impact on performance. Without it, establishing a new connection for each query is like having to apply for a new access card every time you enter the office; it's cumbersome! Key parameters include: connection_pool_min_size: Minimum number of connections (default is 1). connection_pool_max_size: Maximum number of connections (default is 30). connection_pool_max_wait_time: Maximum milliseconds to wait for a connection (default is 5000). connection_pool_max_life_time: Maximum lifetime of a connection (default is 1800000 milliseconds). SQL -- Adjust connection pool size based on load ALTER CATALOG mysql_source SET PROPERTIES ( 'connection_pool_max_size' = '50', 'connection_pool_max_wait_time' = '10000' ); Advanced Usage: Statement Pass-Through Xiao Wang curiously asked, "What if I want to perform some DDL or DML operations in MySQL?" Xiao Li smiled mysteriously: "Doris provides a statement pass-through feature that allows you to execute native SQL statements directly in the data source."
DDL and DML Pass-Through Currently, only DDL and DML statements are supported, and you must use the syntax corresponding to the data source. SQL -- Insert data CALL EXECUTE_STMT("mysql_source", "INSERT INTO users VALUES(1, 'Zhang San'), (2, 'Li Si')"); -- Delete data CALL EXECUTE_STMT("mysql_source", "DELETE FROM users WHERE id = 2"); -- Create a table CALL EXECUTE_STMT("mysql_source", "CREATE TABLE new_users (id INT, name VARCHAR(50))"); Query Pass-Through SQL -- Use the query table function to execute a native query SELECT * FROM query( "catalog" = "mysql_source", "query" = "SELECT id, name FROM users WHERE id > 10" ); The pass-through feature allows you to fully leverage the capabilities and syntax of the source database while maintaining unified management in Doris. Pitfall Guide: Common Issues and Solutions The world of database connections is not always sunny; sometimes you will encounter some pitfalls. Connection Timeout Issues One of the most common errors is: Connection is not available, request timed out after 5000ms Possible causes include: Cause 1: Network issues (e.g., the server is unreachable).Cause 2: Authentication issues, such as invalid usernames or passwords.Cause 3: High network latency, causing connection creation to exceed the 5-second timeout.Cause 4: Too many concurrent queries, exceeding the maximum number of connections configured in the connection pool. Solutions: 1. If you only see the error Connection is not available, request timed out after 5000ms, check Causes 3 and 4: First, check for high network latency or resource exhaustion. Increase the maximum number of connections and connection timeout time: SQL -- Increase the maximum number of connections ALTER CATALOG mysql_source SET PROPERTIES ('connection_pool_max_size' = '100'); -- Increase the wait time ALTER CATALOG mysql_source SET PROPERTIES ('connection_pool_max_wait_time' = '10000'); 2. If you see additional errors besides Connection is not available, request timed out after 5000ms, investigate these additional errors: Network issues (e.g., the server is unreachable) can cause connection failures. Check if the network connection is normal.Authentication issues (e.g., invalid usernames or passwords) can also cause connection failures. Verify the database credentials used in the configuration to ensure they are correct.Investigate issues related to the network, database, or authentication based on the specific error messages to identify the root cause. Conclusion Doris JDBC Catalog brings a revolutionary change to data analysis. It allows us to connect to multiple data sources in an elegant and efficient way, achieving query-as-you-go. No more complex ETL processes, no more data synchronization headaches, just a smooth analysis experience. As Xiao Wang later said to Xiao Li: "JDBC Catalog has shown me a new possibility in the world of data. I used to spend 80% of my time handling data synchronization, but now I can use that time for real analysis." Next time you face the challenge of analyzing multiple data sources, consider trying this universal key to the data world. It might change your data analysis approach just as it changed Xiao Wang's. Stay tuned for more interesting, useful, and valuable content in the next post!
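One more practical detail that often surprises newcomers: the Doris frontend speaks the MySQL wire protocol, so you can run these federated queries from application code with any ordinary MySQL client library, no Doris-specific driver required. A minimal sketch in Python, where the host, port, and credentials are placeholders for your own deployment and pymysql is just one possible client:

Python
import pymysql

# Connect to the Doris FE's MySQL-protocol port (9030 by default; adjust to your cluster).
conn = pymysql.connect(host="doris-fe.example.net", port=9030,
                       user="root", password="secret")

try:
    with conn.cursor() as cur:
        # JDBC Catalog tables are addressed as catalog.database.table,
        # exactly as in the SHOW/SELECT statements shown earlier.
        cur.execute("SELECT id, name FROM mysql_source.test.users WHERE id > 10")
        for row in cur.fetchall():
            print(row)
finally:
    conn.close()

The query text is identical to what you would type in a SQL console; the catalog does the cross-database work behind the scenes.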
Your application has an integration with another system. In your unit integration tests, you want to mock the other system's behaviour. WireMock is a testing library that helps you with mocking the APIs you depend on. In this blog, you will explore WireMock for testing a Spring Boot application. Enjoy! Introduction Almost every application has an integration with another system. This integration needs to be tested, of course. Testcontainers are a good choice for writing unit integration tests. This way, your application will talk to a real system in your tests. However, what do you do when no container image is available, or when the other system is difficult to configure for your tests? In that case, you would like to mock the other system. WireMock is a testing library that will help you with that. Sources used in this blog are available at GitHub. Prerequisites The prerequisites needed for reading this blog are: Basic Java knowledge;Basic Spring Boot knowledge;Basic LangChain4j knowledge;Basic LMStudio knowledge. Application Under Test As the application under test, a Spring Boot application is created using LangChain4j, which communicates with LMStudio. There is no official container image for LMStudio, so this is a good use case for WireMock. The communication between LangChain4j and LMStudio is based on the OpenAI OpenAPI specification. The LangChain4j tutorial for integrating with Spring Boot will be used as a starting point. Navigate to the Spring Initializr and add the Spring Web dependency. Additionally, add the following dependencies to the pom. XML <dependency> <groupId>dev.langchain4j</groupId> <artifactId>langchain4j-open-ai-spring-boot-starter</artifactId> <version>1.0.0-beta2</version> </dependency> <dependency> <groupId>dev.langchain4j</groupId> <artifactId>langchain4j-spring-boot-starter</artifactId> <version>1.0.0-beta2</version> </dependency> <dependency> <groupId>dev.langchain4j</groupId> <artifactId>langchain4j-reactor</artifactId> <version>1.0.0-beta2</version> </dependency> Create an Assistant which will allow you to send a chat message to LMStudio. Create one for non-streaming responses and one for streaming responses. Java @AiService public interface Assistant { @SystemMessage("You are a polite assistant") String chat(String userMessage); @SystemMessage("You are a polite assistant") Flux<String> stream(String userMessage); } Create an AssistantConfiguration where you define beans to create the language models to be used. The URL for LMStudio is configurable; the other options are hard-coded, just for the convenience of the demo. Java @Configuration public class AssistantConfiguration { @Bean public ChatLanguageModel languageModel(MyProperties myProperties) { return OpenAiChatModel.builder() .apiKey("dummy") .baseUrl(myProperties.lmStudioBaseUrl()) .modelName("llama-3.2-1b-instruct") .build(); } @Bean public StreamingChatLanguageModel streamingLanguageModel(MyProperties myProperties) { return OpenAiStreamingChatModel.builder() .apiKey("dummy") .baseUrl(myProperties.lmStudioBaseUrl()) .modelName("llama-3.2-1b-instruct") .build(); } } The last thing to do is to create an AssistantController. 
Java @RestController class AssistantController { Assistant assistant; public AssistantController(Assistant assistant) { this.assistant = assistant; } @GetMapping("/chat") public String chat(String message) { return assistant.chat(message); } @GetMapping("/stream") public Flux<String> stream(String message) { return assistant.stream(message); } } Start LMStudio, load the llama-3.2-1b-instruct model, and start the server. Start the Spring Boot application. Shell mvn spring-boot:run Send a chat message for the non-streaming API. Plain Text $ curl http://localhost:8080/chat?message=Tell%20me%20a%20joke Here's one: Why did the scarecrow win an award? Because he was outstanding in his field! I hope that made you smile! Do you want to hear another one? This works; you can do the same for the streaming API The response will be similar, but with a streaming response. Now, stop the Spring Boot application and the LMStudio server. Mock Assistant You can create a test using @WebMvcTest and inject the Assistant as a MockitoBean. This allows you to mock the response from the Assistant. However, you will only test up to the dashed line in the image below. Everything else will be out of scope for your test. The test itself is the following. Java @WebMvcTest(AssistantController.class) class ControllerWebMvcTest { @Autowired private MockMvc mockMvc; @MockitoBean private Assistant assistant; @Test void testChat() throws Exception { when(assistant.chat("Tell me a joke")).thenReturn("This is a joke"); mockMvc.perform(MockMvcRequestBuilders.get("/chat") .param("message", "Tell me a joke") ) .andExpect(status().is2xxSuccessful()) .andExpect(content().string("This is a joke")); } } This test might be ok, but actually, you are not testing a lot of functionality. When you upgrade LangChain4j, you might get surprised when breaking changes are introduced. This test will not reveal anything because the LangChain4j dependency is not part of your test. Mock HTTP Request A better approach is to mock the HTTP request/response between your application and LMStudio. And this is where WireMock is used. Your test is now extended up to the dashed line in the image below. In order to use WireMock, you need to add the wiremock-spring-boot dependency to the pom. XML <dependency> <groupId>org.wiremock.integrations</groupId> <artifactId>wiremock-spring-boot</artifactId> <version>3.6.0</version> <scope>test</scope> </dependency> The setup of the test is as follows: Add @SpringBootTest, this will spin up the Spring Boot application.Add @EnableWireMock in order to enable WireMock.Add @TestPropertySource in order to override the LMStudio URL. WireMock will run on a random port and this way the random port will be used.Add @AutoConfigureMockMvc because MockMvc will be used to send the HTTP request to the controller. Java @SpringBootTest(webEnvironment = SpringBootTest.WebEnvironment.RANDOM_PORT) @EnableWireMock @TestPropertySource(properties = { "my.properties.lm-studio-base-url=http://localhost:${wiremock.server.port}/v1" }) @AutoConfigureMockMvc class ControllerWireMockTest { @Autowired MockMvc mockMvc; ... } In order to mock the request and response, you need to know the API, or you need some examples. The logs of LMStudio are very convenient because the request and response are logged. 
JSON 2025-03-29 11:28:45 [INFO] Received POST request to /v1/chat/completions with body: { "model": "llama-3.2-1b-instruct", "messages": [ { "role": "system", "content": "You are a polite assistant" }, { "role": "user", "content": "Tell me a joke" } ], "stream": false } 2025-03-29 11:28:45 [INFO] [LM STUDIO SERVER] Running chat completion on conversation with 2 messages. 2025-03-29 11:28:46 [INFO] [LM STUDIO SERVER] Accumulating tokens ... (stream = false) 2025-03-29 11:28:48 [INFO] [LM STUDIO SERVER] [llama-3.2-1b-instruct] Generated prediction: { "id": "chatcmpl-p1731vusmgq3oh0xqnkay4", "object": "chat.completion", "created": 1743244125, "model": "llama-3.2-1b-instruct", "choices": [ { "index": 0, "message": { "role": "assistant", "content": "Here's one:\n\nWhy did the scarecrow win an award?\n\nBecause he was outstanding in his field!\n\nI hope that made you smile! Do you want to hear another one?" }, "logprobs": null, "finish_reason": "stop" } ], "usage": { "prompt_tokens": 24, "completion_tokens": 36, "total_tokens": 60 }, "system_fingerprint": "llama-3.2-1b-instruct" } Mocking the request with WireMock consists of a few steps: Stub the request with stubFor and indicate how the request should be matched. Many options are available here; in this example, it is only checked whether it is a POST request and the request matches a specific URL.Set the response; here, many options are available. In this example, the HTTP status is set and the body. After this, you send the request to your controller and verify its response. WireMock will mock the communication between LangChain4j and LMStudio for you. Java @Test void testChat() throws Exception { stubFor(post(urlEqualTo("/v1/chat/completions")) .willReturn(aResponse() .withStatus(200) .withBody(BODY))); mockMvc.perform(MockMvcRequestBuilders.get("/chat") .param("message", "Tell me a joke") ) .andExpect(status().is2xxSuccessful()) .andExpect(content().string("This works!")); } ... private static final String BODY = """ { "id": "chatcmpl-p1731vusmgq3oh0xqnkay4", "object": "chat.completion", "created": 1743244125, "model": "llama-3.2-1b-instruct", "choices": [ { "index": 0, "message": { "role": "assistant", "content": "This works!" }, "logprobs": null, "finish_reason": "stop" } ], "usage": { "prompt_tokens": 24, "completion_tokens": 36, "total_tokens": 60 }, "system_fingerprint": "llama-3.2-1b-instruct" } """; When you run this test, you see in the logs that the WireMock server is started. Shell Started WireMockServer with name 'wiremock':http://localhost:37369 You can also see the requests that are received, what has been matched, and which response has been sent. 
JSON Content-Type: [application/json] Host: [localhost:37369] Content-Length: [784] Connection: [keep-alive] User-Agent: [Apache-HttpClient/5.4.2 (Java/21)] { "id" : "8d734483-c2f5-4924-8e53-4e4bc4f3b848", "request" : { "url" : "/v1/chat/completions", "method" : "POST" }, "response" : { "status" : 200, "body" : "{\n \"id\": \"chatcmpl-p1731vusmgq3oh0xqnkay4\",\n \"object\": \"chat.completion\",\n \"created\": 1743244125,\n \"model\": \"llama-3.2-1b-instruct\",\n \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\": \"assistant\",\n \"content\": \"This works!\"\n },\n \"logprobs\": null,\n \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\": 24,\n \"completion_tokens\": 36,\n \"total_tokens\": 60\n },\n \"system_fingerprint\": \"llama-3.2-1b-instruct\"\n}\n" }, "uuid" : "8d734483-c2f5-4924-8e53-4e4bc4f3b848" } 2025-03-29T13:03:21.398+01:00 INFO 39405 --- [MyWiremockAiPlanet] [qtp507765539-35] WireMock wiremock : Request received: 127.0.0.1 - POST /v1/chat/completions Authorization: [Bearer dummy] User-Agent: [langchain4j-openai] Content-Type: [application/json] Accept-Encoding: [gzip, x-gzip, deflate] Host: [localhost:37369] Content-Length: [214] Connection: [keep-alive] { "model" : "llama-3.2-1b-instruct", "messages" : [ { "role" : "system", "content" : "You are a polite assistant" }, { "role" : "user", "content" : "Tell me a joke" } ], "stream" : false } Matched response definition: { "status" : 200, "body" : "{\n \"id\": \"chatcmpl-p1731vusmgq3oh0xqnkay4\",\n \"object\": \"chat.completion\",\n \"created\": 1743244125,\n \"model\": \"llama-3.2-1b-instruct\",\n \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\": \"assistant\",\n \"content\": \"This works!\"\n },\n \"logprobs\": null,\n \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\": 24,\n \"completion_tokens\": 36,\n \"total_tokens\": 60\n },\n \"system_fingerprint\": \"llama-3.2-1b-instruct\"\n}\n" } Stubbing There is a lot of functionality available for stubbing requests. In the example above, the response was created as follows. Java stubFor(post(urlEqualTo("/v1/chat/completions")) .willReturn(aResponse() .withStatus(200) .withBody(BODY))); However, this can be written much shorter by using okJson. Java stubFor(post(urlEqualTo("/v1/chat/completions")) .willReturn(okJson(BODY))); Response From File The body of the response is put in a local constant. This might be ok when you only have one response in your test, but when you have a lot of responses, it is more convenient to put them in files. Do note that you need to put the files in directory test/resources/__files/, otherwise the files will not be found by WireMock. The following error will then be shown. Shell com.github.tomakehurst.wiremock.admin.NotFoundException: Not found in blob store: stubs/jokestub.json The test can be rewritten as follows, assuming that jokestub.json is located in directory test/resources/__files/stubs. Shell stubFor(post(urlEqualTo("/v1/chat/completions")) .willReturn(aResponse().withBodyFile("stubs/jokestub.json"))); Request Matching Request matching is also possible in many ways. Let's assume that you want to match based on the content of the HTTP request. You write two stubs that match on a different body and will return a different response. 
Java @Test void testChatWithRequestBody() throws Exception { stubFor(post(urlEqualTo("/v1/chat/completions")) .withRequestBody(matchingJsonPath("$.messages[?(@.content == 'Tell me a joke')]")) .willReturn(aResponse().withBodyFile("stubs/jokestub.json"))); stubFor(post(urlEqualTo("/v1/chat/completions")) .withRequestBody(matchingJsonPath("$.messages[?(@.content == 'Tell me another joke')]")) .willReturn(aResponse().withBodyFile("stubs/anotherjokestub.json"))); mockMvc.perform(MockMvcRequestBuilders.get("/chat") .param("message", "Tell me a joke") ) .andExpect(status().is2xxSuccessful()) .andExpect(content().string("This works!")); mockMvc.perform(MockMvcRequestBuilders.get("/chat") .param("message", "Tell me another joke") ) .andExpect(status().is2xxSuccessful()) .andExpect(content().string("This works also!")); } Streaming Response Mocking a streaming response is a bit more complicated. LMStudio only logs the request and not the response. JSON 5-03-29 14:13:55 [INFO] Received POST request to /v1/chat/completions with body: { "model": "llama-3.2-1b-instruct", "messages": [ { "role": "system", "content": "You are a polite assistant" }, { "role": "user", "content": "Tell me a joke" } ], "stream": true, "stream_options": { "include_usage": true } } This caused some challenges. Eventually, LangChain4j was debugged in order to get a grasp of what the response should look like ( dev.langchain4j.http.client.sse.DefaultServerSentEventParser.parse). Only a small snippet of the entire response is used in the test; it is more about the idea. Another difference with the above tests is that you need to use WebTestClient instead of MockMvc. So, remove the @AutoConfigureMockMvc and inject a WebTestClient. Also, add the following dependencies to the pom. XML <!-- Needed for WebTestClient --> <dependency> <groupId>org.springframework.boot</groupId> <artifactId>spring-boot-starter-webflux</artifactId> <scope>test</scope> </dependency> <!-- Needed for StepVerifier --> <dependency> <groupId>io.projectreactor</groupId> <artifactId>reactor-test</artifactId> <version>3.5.8</version> <scope>test</scope> </dependency> The WebTestClient allows you to send and receive a streaming response. The StepVerifier is used to verify the streamed responses. 
Java @SpringBootTest(webEnvironment = SpringBootTest.WebEnvironment.RANDOM_PORT) @EnableWireMock @TestPropertySource(properties = { "my.properties.lm-studio-base-url=http://localhost:${wiremock.server.port}/v1" }) class ControllerStreamWireMockTest { @Autowired private WebTestClient webTestClient; @Test void testStreamFlux() { stubFor(post(WireMock.urlPathEqualTo("/v1/chat/completions")) .willReturn(aResponse() .withStatus(200) .withHeader("Content-Type", "text/event-stream") .withBody(""" data: {"id":"chatcmpl-tnh9pc0j6m91mm9duk4c4x","object":"chat.completion.chunk","created":1743325543,"model":"llama-3.2-1b-instruct","system_fingerprint":"llama-3.2-1b-instruct","choices":[{"index":0,"delta":{"role":"assistant","content":"Here"},"logprobs":null,"finish_reason":null}]} data: {"id":"chatcmpl-tnh9pc0j6m91mm9duk4c4x","object":"chat.completion.chunk","created":1743325543,"model":"llama-3.2-1b-instruct","system_fingerprint":"llama-3.2-1b-instruct","choices":[{"index":0,"delta":{"role":"assistant","content":"'s"},"logprobs":null,"finish_reason":null}]} data: {"id":"chatcmpl-tnh9pc0j6m91mm9duk4c4x","object":"chat.completion.chunk","created":1743325543,"model":"llama-3.2-1b-instruct","system_fingerprint":"llama-3.2-1b-instruct","choices":[{"index":0,"delta":{"role":"assistant","content":" one"},"logprobs":null,"finish_reason":null}]} data: {"id":"chatcmpl-tnh9pc0j6m91mm9duk4c4x","object":"chat.completion.chunk","created":1743325543,"model":"llama-3.2-1b-instruct","system_fingerprint":"llama-3.2-1b-instruct","choices":[{"index":0,"delta":{},"logprobs":null,"finish_reason":"stop"}]} data: [DONE]"""))); // Use WebClient to make a request to /stream endpoint Flux<String> response = webTestClient.get() .uri(uriBuilder -> uriBuilder.path("/stream").queryParam("message", "Tell me a joke").build()) .accept(MediaType.TEXT_EVENT_STREAM) .exchange() .expectStatus().isOk() .returnResult(String.class) .getResponseBody(); // Verify streamed data using StepVerifier StepVerifier.create(response) .expectNext("Here") .expectNext("'s") .expectNext("one") // spaces are stripped .verifyComplete(); } } Conclusion WireMock is an easy-to-use testing library that helps you with testing integrations with other systems. A lot of functionality is available, and it even works with streaming responses. WireMock is not only limited for use with Spring Boot, also when you want to test integrations from within a regular Java application, WireMock can be used.
Alright, welcome to the final post in this three-part series. Let's do a quick recap of the journey so far: In Part 1, I laid out the problem with monolithic AI "brains" and designed the architecture for a specialist team of agents to power my "InstaVibe Ally" feature.In Part 2, we did a deep dive into the Model Context Protocol (MCP), and I showed you exactly how I connected my Platform Interaction Agent to my application's existing REST APIs, turning them into reusable tools. But my agents are still living on isolated islands. My Social Profiling Agent has no way to give its insights to the Event Planner. My platform integrator can create post and event. I've built a team of specialists, but I haven't given them a way to collaborate. They're a team that can't talk. This is the final, critical piece of the puzzle. To make this a true multi-agent system, my agents need to communicate. This is where the Agent-to-Agent (A2A) protocol comes in. A2A: Giving Your Agents a Shared Language So, what exactly is A2A? At its core, it’s an open standard designed for one agent to discover, communicate with, and delegate tasks to another. If MCP is about an agent using a non-sentient tool, A2A is about an agent collaborating with a peer—another intelligent agent with its own reasoning capabilities. This isn't just about making a simple API call from one service to another. It’s about creating a standardized way for agents to understand each other's skills and work together to achieve complex goals. This was the key to unlocking true orchestration. It meant I could build my specialist agents as completely independent microservices, and as long as they all "spoke A2A," my Orchestrator could manage them as a cohesive team. The Big Question: A2A vs. MCP: What's the Difference? This is a point that can be confusing, so let me break down how I think about it. It’s all about who is talking to whom. MCP is for Agent-to-Tool communication. It’s the agent's key to the tool shed. My Platform Agent uses MCP to connect to my MCP Server, which is a simple gateway to a "dumb" tool—my InstaVibe REST API. The API can't reason or think; it just executes a specific function.A2A is for Agent-to-Agent communication. It’s the agent's phone number to call a colleague. My Orchestrator uses A2A to connect to my Planner Agent. The Planner Agent isn't just a simple function; it has its own LLM, its own instructions, and its own tools (like Google Search). I'm not just telling it to do something; I'm delegating a goal to it. Here’s the simplest way I can put it: Use MCP when you want your agent to use a specific, predefined capability (like create_post or run_sql_query).Use A2A when you want your agent to delegate a complex task to another agent that has its own intelligence. Making It Real: The a2a-python Library in Action Theory is great, but let's look at the code. To implement this, I used the a2a-python library, which made the whole process surprisingly straightforward. It breaks down into two parts: making an agent listen (the server) and making an agent talk (the client). First, I needed to take my specialist agents (Planner, Social, etc.) and wrap them in an A2A server so they could receive tasks. The most important part of this is creating an Agent Card. An Agent Card is exactly what it sounds like: a digital business card for your agent. It’s a standard, machine-readable JSON object that tells other agents: "Hi, I'm the Planner Agent. Here's what I'm good at (my skills), and here's the URL where you can reach me." 
Here’s a snippet from my planner/a2a_server.py showing how I defined its card and started the server. Python # Inside my PlannerAgent class... skill = AgentSkill( id="event_planner", name="Event planner", description="This agent generates fun plan suggestions tailored to your specified location, dates, and interests...", tags=["instavibe"], examples=["What about Boston MA this weekend?"] ) self.agent_card = AgentCard( name="Event Planner Agent", description="This agent generates fun plan suggestions...", url=f"{PUBLIC_URL}", # The public URL of this Cloud Run service skills=[skill] ) # And in the main execution block... request_handler = DefaultRequestHandler(...) server = A2AStarletteApplication( agent_card=plannerAgent.agent_card, http_handler=request_handler, ) uvicorn.run(server.build(), host='0.0.0.0', port=port) With just that little bit of boilerplate, my Planner Agent, running on Cloud Run, now has an endpoint (/.well-known/agent.json) that serves its Agent Card. It’s officially on the grid and ready to take requests. Now for the fun part: my Orchestrator agent. Its primary job is not to do work itself, but to delegate work to the others. This means it needs to be an A2A client. First, during its initialization, the Orchestrator fetches the Agent Cards from the URLs of all my specialist agents. This is how it "meets the team." The real magic is in how I equipped the Orchestrator. I gave it a single, powerful ADK tool called send_message. The sole purpose of this tool is to make an A2A call to another agent. The final piece was the Orchestrator's prompt. I gave it a detailed instruction set that told it, in no uncertain terms: "You are a manager. Your job is to understand the user's goal, look at the list of available agents and their skills, and use the send_message tool to delegate tasks to the correct specialist." Here's a key snippet from its instructions: Python You are an expert AI Orchestrator. Your primary responsibility is to...delegate each action to the most appropriate specialized remote agent using the send_message function. You do not perform the tasks yourself. Agents: {self.agents} <-- This is where I inject the list of discovered Agent Cards This instruction allows the LLM inside the Orchestrator to reason about the user's request, look at the skills listed on the Agent Cards, and make an intelligent decision about which agent to call. The Grand Finale: A Symphony of Agents Now, let's trace the full, end-to-end flow. A user in the InstaVibe app says: "Plan a fun weekend in Chicago for me and my friends, Ian and Nora."The app calls my Orchestrator Agent, now running on Vertex AI Agent Engine.The Orchestrator’s LLM reasons: "This is a two-step process. First, I need to understand Ian and Nora's interests. Then, I need to create a plan based on those interests."It consults its list of available agents, sees the "Social Profile Agent," and determines it's the right specialist for the first step.It uses its send_message tool to make an A2A call to the Social Agent, asking it to profile Ian and Nora.The Social Agent on Cloud Run receives the A2A request, does its work (querying my Spanner Graph Database), and returns a summary of their shared interests.The Orchestrator receives this summary. It now reasons: "Okay, step one is done. 
Time for step two."It consults its agent list again, sees the "Event Planner Agent," and makes a new A2A call, delegating the planning task and passing along the crucial context: "Plan an event in Chicago for people who enjoy [shared interests from step 1]."The Planner Agent on Cloud Run receives the request, uses its own tool (Google Search) to find relevant events and venues, and returns a structured JSON plan.The Orchestrator receives the final plan and presents it to the user. This is the power of a multi-agent system. Each component did what it does best, all coordinated through a standard communication protocol. Conclusion of the Series And there you have it. Over these three posts, we've gone from a simple idea to a fully functioning, distributed AI system. I started with a user problem, designed a modular team of agents to solve it, gave them access to my existing APIs with MCP, and finally, enabled them to collaborate as a team with A2A. My biggest takeaway from this whole process is this: building sophisticated AI systems requires us to think like software engineers, not just prompt engineers. By using open standards like MCP and A2A and frameworks like the ADK, I was able to build something that is robust, scalable, and—most importantly—maintainable. You've read the whole story. Now, it's your turn to build it. I've documented every single step of this process in a hands-on "InstaVibe Multi-Agent" Google Codelab. You'll get to build each agent, deploy the MCP server, and orchestrate the whole thing with A2A, all on Google Cloud. It's the best way to move from theory to practice. Thank you for following along with this series. I hope it's been helpful. Give the Codelab a try, and let me know what you build
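A quick postscript for the curious: you don't need any special tooling to see what an Agent Card looks like, because every A2A server publishes it over plain HTTP at the well-known path mentioned earlier. A minimal sketch, assuming a placeholder URL for wherever your agent is deployed (in my case, a Cloud Run service):

Python
import json
import requests

# Placeholder: substitute the public URL of your deployed agent.
AGENT_BASE_URL = "https://planner-agent.example.run.app"

# The A2A discovery endpoint described above.
resp = requests.get(f"{AGENT_BASE_URL}/.well-known/agent.json", timeout=10)
resp.raise_for_status()

card = resp.json()
print(json.dumps(card, indent=2))                                # the full Agent Card
print([skill.get("name") for skill in card.get("skills", [])])   # just the advertised skills

This is essentially what my Orchestrator does at startup when it "meets the team," before it ever calls send_message.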
TL; DR: The Agile Paradox Many companies adopt Agile practices like Scrum but fail to achieve true transformation. This “Agile Paradox” occurs because they implement tactical processes without changing their underlying command-and-control structure, culture, and leadership style. True agility requires profound systemic changes to organizational design, leadership, and technical practices, not just performing rituals. Without this fundamental shift from “doing” to “being” agile, transformations stall, and the promised benefits remain unrealized. The Fundamental Disconnect at the Heart of the Agile Paradox Two decades after the Agile Manifesto, we are in a puzzling situation. Agile practices, particularly Scrum, have achieved widespread adoption across industries. Yet many organizations report significant challenges, with transformations stalling, teams disengaging, and promised benefits unrealized. Studies suggest that a considerable percentage of Agile initiatives do not meet expectations. The evidence points to a fundamental paradox: Organizations adopt Agile tactically while attempting to preserve strategically incompatible systems. This approach isn’t merely an implementation challenge; it represents a profound category error in understanding what Agile actually is. The research report indicates that organizations frequently “implement Agile tactically, at the team or process level, without fundamentally rethinking or dismantling their existing command-and-control organizational structures and management paradigms” (Agile Scope, page 3). The Five Agile Adoption Fallacies 1. The Framing Error: “Doing Agile” vs. Transforming Work Most organizations approach Agile as a process replacement, swapping waterfall artifacts for Scrum events, expecting twice the work to be done in half the time. Teams now have Daily Scrums instead of status meetings, Product Backlogs instead of requirements documents, and Sprint Reviews instead of milestone presentations. This superficial adoption creates the illusion of transformation while preserving the underlying coordination logic. The Reality: Scrum isn’t primarily about events or artifacts. It’s about fundamentally reshaping how work is discovered, prioritized, and validated. When organizations “install” Scrum without changing how decisions are made, how feedback flows, or how learning occurs, they deny themselves its core benefits. As documented in the research report, many large-scale adoptions fail because organizations adopt “the visible artifacts and rituals of Agile [...] without truly understanding or internalizing the core values, principles, and mindset shifts required for genuine agility” (Agile Scope, page 12). The result is what we can call “Agile Theatre” or “Cargo Cult Agile,” essentially performing agility for the show, but without substance. 2. System Over Task: Local Optimization vs. Organizational Adaptation Organizations often optimize team-level practices: improving velocity, optimizing backlog management, and conducting efficient events. However, they frequently resist addressing the organizational systems that enforce handoffs, create dependencies, mandate annual budgeting cycles, or perpetuate fixed-scope initiatives. The Reality: A high-performing Scrum team embedded in a traditional organizational structure hits an effectiveness ceiling almost immediately. 
The Agile principles of responding to change and, thus, continuously delivering value collide with quarterly planning cycles, departmental silos, the incentivized urge for local optimization, and multi-layer approval processes. This tension manifests concretely: Product Owners who typically lack true ownership and empowerment, acting merely as backlog administrators or requirement scribes rather than value maximizers.” Additionally, teams are blocked by external dependencies, and rigid governance stifles innovation. In other words, traditional governance structures characterized by rigid stage gates, extensive documentation requirements, and centralized approval processes will likely clash with Agile’s need for speed, flexibility, and minimal viable documentation. The heart of this problem lies in complexity. Modern organizational environments, increasingly characterized by volatility, uncertainty, complexity, and ambiguity (VUCA), require adaptive approaches to align strategy to reality or avoid betting the farm on number 23. Yet many organizations still operate with structures designed for more stable, predictable, controllable environments, creating a fundamental mismatch between problem domain and solution approach. 3. The Illusion of Empowerment: “Self-Organizing” Teams in Disempowering Systems Perhaps the most insidious pattern is the contradiction of declared empowerment within controlling systems. Teams are told they’re self-organizing and empowered to make decisions, yet critical choices about roadmaps, staffing, architecture, and priorities typically remain elsewhere. This contradiction slowly but steadily undermines the whole idea of agility, with management saying one thing while the organizational system enforces another. The Reality: True empowerment requires structural changes. Decision authority must be explicitly delegated and supported: the people closest to the problem should make most of the calls. Instead, teams often lack genuine authority over their work, including scope, schedule, or process, and decisions remain centralized and top-down. When organizations claim to want autonomous teams while maintaining command-and-control structures, they create cynicism and disengagement. Additionally, a lack of management support seems to rank among the top causes of Agile failure. Often, this isn’t just a lack of verbal encouragement, but a failure by leadership to understand Agile principles, abandon command-and-control habits, or make the necessary structural changes that create the conditions for autonomy to succeed. 4. Technical Excellence and the Agile Paradox: The Missing Foundation Many Agile transformations focus exclusively on process changes while neglecting the technical practices that enable sustainable agility. Teams adopt iterations and user stories but skip practices like test automation, continuous integration, and refactoring. The Reality: This pattern of process adoption without technical practices is a major reason for failed Agile initiatives. Sustainable agility relies on strong technical foundations; without practices like automated testing and continuous integration, teams accumulate technical debt that eventually cripples their ability to deliver value quickly and reliably. Instead, Agile’s success requires congruence between the team-level agile practices and the organization’s overall ‘operating system,’ including technical practices. 5. The Operating System Fallacy: Scrum as Plugin vs. 
Platform. At its core, Scrum represents a fundamentally different operating model designed for complex, unpredictable environments where adaptation and learning outperform prediction and control. Yet organizations often treat it as a plugin to their existing operating system of command-and-control hierarchies. The Reality: This category error, trying to run an adaptive framework on top of a predictive operating system, creates irreconcilable conflicts. They stem from profoundly different philosophical foundations. Traditional management structures are rooted in Scientific Management (Taylorism), pioneered by Frederick Winslow Taylor in the early 20th century, emphasizing efficiency, standardization, and control. By contrast, Agile is founded on adaptation, collaboration, and distributed decision-making principles. While Taylorism operates on distinct assumptions, it views work as decomposable into simple, standardized, repeatable tasks. It assumes the existence of a single most efficient method for performing each task, which defies common experience in complex environments where we already struggle to identify necessary work. Consequently, when these two systems collide, the traditional structure typically dominates, leading to compromised implementations that retain the form of Agile but not its essence. Beyond the Paradox: Toward Authentic Transformation How do organizations break free from this paradox? There are several paths forward: 1. Structural Realignment Rather than merely implementing Agile within existing structures, successful organizations realign their structure around value streams. Moving beyond simply optimizing isolated team processes requires rethinking the entire organizational operating model. This essential step includes: Moving from functional departments to cross-functional product teams. Reducing organizational layers and approval gates. Creating persistent teams rather than project-based staffing. Aligning support functions (HR, Finance, etc.) with Agile principles. 2. Leadership Transformation Leadership must move beyond endorsement to embodiment, as in consistently playing a dual role: actively supporting the agile initiative while changing their own management style, including: Shifting from directive to servant leadership styles. Providing clear vision and boundaries rather than detailed instructions. Modeling a learning mindset that embraces experimentation and adaptation. Creating psychological safety for teams to surface problems, take risks, and fail in the process. Thus, leadership is pivotal for any organization striving to become "Agile." Yet, too often, the leadership undermines its role through a lack of executive understanding, sponsorship, active participation, and commitment; all of those are critical failure factors. 3. Systemic Changes to Enabling Functions True transformation requires reimagining core organizational functions. If these functions remain rooted in traditional, non-Agile paradigms, they can cause critical friction.
Therefore, successful organizations implement changes such as: Moving from project-based funding to persistent product teams with outcome-based metrics. Shifting from annual budgeting to more flexible, incremental funding models aligned with Agile's iterative, adaptive nature. Evolving HR practices from individual performance management to team capability building, overcoming practices like annual individual performance reviews, ranking systems, and reward structures designed to achieve personal objectives. Redesigning governance to focus on outcomes rather than adherence to plans. 4. Technical Excellence and Craft Organizations must invest equally in technical capabilities. Achieving true agility requires significant investment in technical training, mentoring, and DevOps infrastructure and practices, which is why successful engineering organizations focus on: Building technical agility through practices like test automation, continuous delivery, and clean code. Creating a culture of craftsmanship where quality is non-negotiable. Allowing teams time to reduce technical debt and improve their tools and practices regularly. Measuring not just delivery speed but sustainability and quality. Conclusion: From Installation to Transformation The Agile paradox persists because it's easier to change processes than paradigms. Installing Scrum events requires nothing more than training and schedule adjustments; transforming an organization requires questioning fundamental assumptions about control, hierarchy, and value creation. True Agile transformation isn't about doing Scrum; it's about becoming an organization that can continuously learn, adapt, and deliver value in complex environments. This approach requires not just new practices but new mental models. Organizations that break through the paradox recognize that Agile isn't something you "do" but something you become. The choice organizations face isn't whether to do Agile well or poorly. It's whether they genuinely want to become agile at all. The real failure isn't Agile itself but the fundamental mismatch of adopting adaptive practices while clinging to outdated management paradigms. Until organizations address this systemic conflict, the promise of true agility will remain a pipe dream.