LLMs may speak in words, but under the hood they think in tokens: compact numeric IDs representing character sequences. If you grasp why tokens exist, how they are formed, and where the real-world costs arise, you can trim your invoices, slash latency, and squeeze higher throughput from any model, whether you rent a commercial endpoint or serve one in-house.

Why LLMs Don't Generate Text One Character at a Time

Imagine predicting "language" character by character. When decoding the very last "e," the network must still replay the entire hidden state for the preceding seven characters. Multiply that overhead by thousands of characters in a long prompt and you get eye-watering compute. Sub-word tokenization offers a sweet spot between byte-level granularity and full words. Common fragments such as "lan," "##gua," and "##ge" (WordPiece notation, where the "##" prefix marks a piece that attaches to the preceding token) capture richer statistical signals than individual letters while keeping the vocabulary small enough for fast matrix multiplications on modern accelerators. Fewer time steps per sentence mean shorter KV caches, smaller attention matrices, and, crucially, fewer dollars spent.

How Tokenizers Build Their Vocabulary

Tokenizers are trained once, frozen, and shipped with every checkpoint. Four dominant families (the last a byte-level variant of classic BPE) are worth knowing:

| Algorithm | Starting Point | Merge / Prune Strategy | Famous Uses |
| --- | --- | --- | --- |
| Byte-Pair Encoding (BPE) | All possible bytes (256) | Repeatedly merge the most frequent adjacent pair | GPT-2, GPT-3, Llama-2 |
| WordPiece | Individual Unicode characters | Merge the pair that most reduces perplexity rather than raw count | BERT, DistilBERT |
| Unigram (SentencePiece) | Extremely large seed vocabulary | Iteratively remove tokens whose absence improves a Bayesian objective | T5, ALBERT |
| Byte-level BPE | UTF-8 bytes | Same as classic BPE, but merges operate on raw bytes | GPT-NeoX, GPT-3.5, GPT-4 |

Because byte-level BPE sees the world as 1-byte pieces, it can tokenize English, Chinese, emoji, and Markdown without language-specific hacks. The price is sometimes unintuitive splits: a single exotic Unicode symbol might expand into dozens of byte tokens.

An End-to-End Example (GPT-3.5 Tokenizer)

Input string:

Python
def greet(name: str) -> str: return f"Hello, {name}"

Tokenized output (first few tokens shown):

| Token | ID | Text |
| --- | --- | --- |
| def | 3913 | def |
| _g | 184 | space + g |
| reet | 13735 | reet |
| ( | 25 | ( |
| … | … | … |

Eighteen tokens, but 55 visible characters. Every additional token will be part of your bill.

Why Providers Charge Per Token

A transformer layer applies the same weight matrices to every token position. Doubling the token count roughly doubles FLOPs, SRAM traffic, and wall-clock time. Hardware vendors quote sustained TFLOPs/s assuming full utilization, so providers size their clusters and price their SKUs accordingly. Billing per word would misrepresent the reality that some emoji characters can explode into ten byte tokens, while the English word "the" costs only one. The token is the fairest atomic unit of compute. If an endpoint advertises a 128k-token context, that means roughly 512 kB of text (in English prose), or a short novel. Pass that slice through a 70-billion-parameter model and you'll crunch trillions of multiply-accumulates, hence the eye-popping price tag.
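Before reaching for any optimization, it helps to count tokens yourself and compare models. Below is a minimal sketch using the open-source tiktoken library (assuming it is installed; cl100k_base is used here as an approximation of the GPT-3.5 tokenizer, and exact counts may differ slightly across model versions):

Python
# pip install tiktoken  (OpenAI's open-source tokenizer library)
import tiktoken

def count_tokens(text: str, encoding_name: str = "cl100k_base") -> int:
    """Return the number of tokens `text` occupies under the given encoding."""
    enc = tiktoken.get_encoding(encoding_name)
    return len(enc.encode(text))

prompt = 'def greet(name: str) -> str: return f"Hello, {name}"'
print(count_tokens(prompt))              # on the order of the 18 tokens quoted above
print(count_tokens("the"))               # a common English word is typically a single token
print(count_tokens("🤖 café — naïve"))   # non-ASCII characters often expand into several byte tokens

Logging counts like these per model is the simplest way to catch the cross-tokenizer surprises discussed later.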
Four Techniques to Shrink Your Token Budget

1. Fine-Tuning and PEFT

Shift recurring instructions ("You are a helpful assistant…") into model weights. A one-time fine-tune cost can pay for itself after a few million calls by chopping 50–200 prompt tokens off each request.

2. Prompt Caching

KV (key–value) caches store attention projections for the shared prefix. Subsequent tokens reuse them, so the incremental cost is linear in new tokens only. OpenAI and Anthropic expose a cache=true parameter; vLLM auto-detects overlapping prefixes server-side and reports ~1.2–2× throughput gains at >256 concurrent streams.

3. Retrieval-Augmented Generation (RAG)

Instead of injecting an entire knowledge base, embed it offline, retrieve only the top-k snippets, and feed the model a skinny prompt like the one shown below. RAG can replace a 10k-token memory dump with a 1k-token on-demand payload.

Answer with citations. Context:\n\n<snippet 1>\n<snippet 2>

4. Vocabulary-Aware Writing

- Avoid fancy quotes, hairline spaces, and deep Unicode indentation, which balloon into byte junk.
- Prefer ASCII tables to box-drawing characters.
- Batch similar calls (e.g., multiple Q&A pairs) to amortize overhead.

Prompt Caching Under the Microscope

Assume your backend supports prefix reuse. Two users ask:

SYSTEM: You are a SQL expert. Provide optimized queries. USER: List the ten most-purchased products.

and later

SYSTEM: You are a SQL expert. Provide optimized queries. USER: Calculate monthly revenue growth.

The second request shares a 14-token system prompt. With caching, the model skips those 14 tokens, runs attention only on the five fresh ones, and streams the answer twice as fast. Your bill likewise drops because providers charge only for non-cached tokens (input and output).

Hidden Costs: Tokenization Mistakes Across Model Families

Each checkpoint ships with its own merge table. A prompt engineered for GPT-4 may tokenize very differently on Mixtral or Gemini-Pro. For instance, the em-dash "—" is a single token (1572) for GPT-3.5 but splits into three on Llama-2. Rule of thumb: whenever you migrate a workflow, log token counts before and after. What was cheap yesterday can triple in price overnight.

Instrumentation: What to Measure and Alert On

- prompt_tokens – size of the user + system + assistant context.
- completion_tokens – the model's output length.
- Cache hit ratio – percentage of tokens skipped.
- Cost per request – aggregate of (prompt + completion) × price rate.
- Latency variance – spikes often correlate with unusually long prompts that evaded the cache.

Streaming these metrics into Grafana or Datadog lets you spot runaway bills in real time.
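As a companion to the metrics above, here is a small sketch of how a per-request cost figure could be derived from token counts. The price constants are placeholders to be replaced with your provider's actual rates, and the cached-token accounting is a simplifying assumption:

Python
from dataclasses import dataclass

# Placeholder prices in USD per 1,000 tokens -- substitute your provider's real rates.
PROMPT_PRICE_PER_1K = 0.0005
COMPLETION_PRICE_PER_1K = 0.0015

@dataclass
class RequestUsage:
    prompt_tokens: int        # non-cached prompt tokens billed for this request
    completion_tokens: int    # tokens generated by the model
    cached_tokens: int = 0    # prompt tokens skipped thanks to prefix caching

    @property
    def cache_hit_ratio(self) -> float:
        total_input = self.prompt_tokens + self.cached_tokens
        return self.cached_tokens / total_input if total_input else 0.0

    @property
    def cost(self) -> float:
        # Bill only non-cached prompt tokens plus all completion tokens.
        return (self.prompt_tokens / 1000 * PROMPT_PRICE_PER_1K
                + self.completion_tokens / 1000 * COMPLETION_PRICE_PER_1K)

usage = RequestUsage(prompt_tokens=820, completion_tokens=240, cached_tokens=400)
print(f"cache hit ratio: {usage.cache_hit_ratio:.0%}, cost: ${usage.cost:.6f}")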
Advanced Tricks for Power Users

- Adaptive chunking: For Llama-2 in vLLM, adding --max-prompt-feed 2048 breaks colossal prompts into GPU-friendly slices, enabling 8× throughput on A100-40G cards.
- Speculative decoding: Draft with a small model, validate with the big one. Providers like OpenAI (gpt-4o-mini + gpt-4o) surface this behind the scenes, slashing tail latency by ~50%.
- Token dropping at generation time: During beam search, discard beams that diverge early; they would spend tokens on answers you'll never show.

Key Takeaways

- Tokens are the currency. Vocabulary design, not characters, defines cost.
- Measure relentlessly. Log every call's token counts.
- Exploit repetition. Fine-tune or cache recurring scaffolding.
- Retrieval beats memorization. RAG turns 10k-token dumps into 1k curated bites.
- Re-benchmark after each model swap. Merge tables shift; your budget should shift with them.

Whether you're integrating language models into everyday applications or creating AI agents, understanding tokenization will keep your solutions fast, affordable, and reliable. Master the humble tokenizer, and every other layer of the LLM stack (prompt engineering, retrieval, model selection, etc.) becomes much easier.
At noon, Xiao Wang was staring at his computer screen, looking worried. He is in charge of the company's data platform and recently received a task: to perform real-time analysis on data from three different databases—MySQL, PostgreSQL, and Oracle. "I have to write three sets of ETL programs to synchronize the data into Doris. What a workload...", Xiao Wang rubbed his sore eyes. "Why make it so complicated? Use JDBC Catalog!" his colleague Xiao Li's voice came from behind. Xiao Wang looked puzzled: "What is this magical JDBC Catalog?" Xiao Li smiled: "It's like a universal key in the data world, opening the doors to multiple databases with one key."

The Magic Tool to Break Data Silos

The core charm of Doris JDBC Catalog lies in its ability to connect to a variety of databases through the standard JDBC interface, including MySQL, PostgreSQL, Oracle, SQL Server, IBM Db2, ClickHouse, SAP HANA, and OceanBase. Let's look at a practical example:

SQL
CREATE CATALOG mysql_source PROPERTIES (
    "type"="jdbc",
    "user"="root",
    "password"="secret",
    "jdbc_url" = "jdbc:mysql://example.net:3306",
    "driver_url" = "mysql-connector-j-8.3.0.jar",
    "driver_class" = "com.mysql.cj.jdbc.Driver"
)

That's it! Once created, you can query tables in MySQL just like querying local Doris tables:

SQL
-- Show all databases
SHOW DATABASES FROM mysql_source;
-- Show tables in the test database
SHOW TABLES FROM mysql_source.test;
-- Directly query a table in MySQL
SELECT * FROM mysql_source.test.users;

For data analysts, this means no more writing complex ETL processes, no more worrying about data consistency, and no additional storage costs. The data remains stored in the original databases, but you can query it as if it were in local tables.

The Magic Formula of JDBC Catalog

To unleash the full power of JDBC Catalog, you need to understand its key configuration options.

Driver Configuration

The driver package path can be specified in three ways:

- Filename only, such as mysql-connector-j-8.3.0.jar. The system will look for it in the jdbc_drivers/ directory.
- Local absolute path, such as file:///path/to/mysql-connector-j-8.3.0.jar.
- HTTP address. The system will automatically download the driver file.

Tip: To manage driver packages more safely, you can use the jdbc_driver_secure_path FE configuration item to restrict the allowed driver package paths, enhancing security.

Connection Pool Tuning

The connection pool has a significant impact on performance. Establishing a new connection for each query is like having to apply for a new access card every time you enter the office—it's cumbersome! Key parameters include:

- connection_pool_min_size: Minimum number of connections (default is 1).
- connection_pool_max_size: Maximum number of connections (default is 30).
- connection_pool_max_wait_time: Maximum milliseconds to wait for a connection (default is 5000).
- connection_pool_max_life_time: Maximum lifetime of a connection (default is 1800000 milliseconds).

SQL
-- Adjust connection pool size based on load
ALTER CATALOG mysql_source SET PROPERTIES (
    'connection_pool_max_size' = '50',
    'connection_pool_max_wait_time' = '10000'
);

Advanced Usage: Statement Pass-Through

Xiao Wang curiously asked, "What if I want to perform some DDL or DML operations in MySQL?" Xiao Li smiled mysteriously: "Doris provides a statement pass-through feature that allows you to execute native SQL statements directly in the data source."
DDL and DML Pass-Through

Currently, only DDL and DML statements are supported, and you must use the syntax corresponding to the data source.

SQL
-- Insert data
CALL EXECUTE_STMT("mysql_source", "INSERT INTO users VALUES(1, 'Zhang San'), (2, 'Li Si')");
-- Delete data
CALL EXECUTE_STMT("mysql_source", "DELETE FROM users WHERE id = 2");
-- Create a table
CALL EXECUTE_STMT("mysql_source", "CREATE TABLE new_users (id INT, name VARCHAR(50))");

Query Pass-Through

SQL
-- Use the query table function to execute a native query
SELECT * FROM query(
    "catalog" = "mysql_source",
    "query" = "SELECT id, name FROM users WHERE id > 10"
);

The pass-through feature allows you to fully leverage the capabilities and syntax of the source database while maintaining unified management in Doris.

Pitfall Guide: Common Issues and Solutions

The world of database connections is not always sunny; sometimes you will encounter pitfalls.

Connection Timeout Issues

One of the most common errors is:

Connection is not available, request timed out after 5000ms

Possible causes include:

- Cause 1: Network issues (e.g., the server is unreachable).
- Cause 2: Authentication issues, such as invalid usernames or passwords.
- Cause 3: High network latency, causing connection creation to exceed the 5-second timeout.
- Cause 4: Too many concurrent queries, exceeding the maximum number of connections configured in the connection pool.

Solutions:

1. If you only see the error "Connection is not available, request timed out after 5000ms", check Causes 3 and 4: first check for high network latency or resource exhaustion, then increase the maximum number of connections and the connection wait time:

SQL
-- Increase the maximum number of connections
ALTER CATALOG mysql_source SET PROPERTIES ('connection_pool_max_size' = '100');
-- Increase the wait time
ALTER CATALOG mysql_source SET PROPERTIES ('connection_pool_max_wait_time' = '10000');

2. If you see additional errors besides "Connection is not available, request timed out after 5000ms", investigate those errors:

- Network issues (e.g., the server is unreachable) can cause connection failures. Check whether the network connection is normal.
- Authentication issues (e.g., invalid usernames or passwords) can also cause connection failures. Verify that the database credentials used in the configuration are correct.
- Investigate issues related to the network, database, or authentication based on the specific error messages to identify the root cause.

Conclusion

Doris JDBC Catalog brings a revolutionary change to data analysis. It allows us to connect to multiple data sources in an elegant and efficient way, achieving query-as-you-go. No more complex ETL processes, no more data synchronization headaches, just a smooth analysis experience. As Xiao Wang later said to Xiao Li: "JDBC Catalog has shown me a new possibility in the world of data. I used to spend 80% of my time handling data synchronization, but now I can use that time for real analysis." Next time you face the challenge of analyzing multiple data sources, consider trying this universal key to the data world. It might change your data analysis approach just as it changed Xiao Wang's. Stay tuned for more interesting, useful, and valuable content in the next post!
Your application has an integration with another system. In your unit integration tests, you want to mock the other system's behaviour. WireMock is a testing library that helps you with mocking the APIs you depend on. In this blog, you will explore WireMock for testing a Spring Boot application. Enjoy! Introduction Almost every application has an integration with another system. This integration needs to be tested, of course. Testcontainers are a good choice for writing unit integration tests. This way, your application will talk to a real system in your tests. However, what do you do when no container image is available, or when the other system is difficult to configure for your tests? In that case, you would like to mock the other system. WireMock is a testing library that will help you with that. Sources used in this blog are available at GitHub. Prerequisites The prerequisites needed for reading this blog are: Basic Java knowledge;Basic Spring Boot knowledge;Basic LangChain4j knowledge;Basic LMStudio knowledge. Application Under Test As the application under test, a Spring Boot application is created using LangChain4j, which communicates with LMStudio. There is no official container image for LMStudio, so this is a good use case for WireMock. The communication between LangChain4j and LMStudio is based on the OpenAI OpenAPI specification. The LangChain4j tutorial for integrating with Spring Boot will be used as a starting point. Navigate to the Spring Initializr and add the Spring Web dependency. Additionally, add the following dependencies to the pom. XML <dependency> <groupId>dev.langchain4j</groupId> <artifactId>langchain4j-open-ai-spring-boot-starter</artifactId> <version>1.0.0-beta2</version> </dependency> <dependency> <groupId>dev.langchain4j</groupId> <artifactId>langchain4j-spring-boot-starter</artifactId> <version>1.0.0-beta2</version> </dependency> <dependency> <groupId>dev.langchain4j</groupId> <artifactId>langchain4j-reactor</artifactId> <version>1.0.0-beta2</version> </dependency> Create an Assistant which will allow you to send a chat message to LMStudio. Create one for non-streaming responses and one for streaming responses. Java @AiService public interface Assistant { @SystemMessage("You are a polite assistant") String chat(String userMessage); @SystemMessage("You are a polite assistant") Flux<String> stream(String userMessage); } Create an AssistantConfiguration where you define beans to create the language models to be used. The URL for LMStudio is configurable; the other options are hard-coded, just for the convenience of the demo. Java @Configuration public class AssistantConfiguration { @Bean public ChatLanguageModel languageModel(MyProperties myProperties) { return OpenAiChatModel.builder() .apiKey("dummy") .baseUrl(myProperties.lmStudioBaseUrl()) .modelName("llama-3.2-1b-instruct") .build(); } @Bean public StreamingChatLanguageModel streamingLanguageModel(MyProperties myProperties) { return OpenAiStreamingChatModel.builder() .apiKey("dummy") .baseUrl(myProperties.lmStudioBaseUrl()) .modelName("llama-3.2-1b-instruct") .build(); } } The last thing to do is to create an AssistantController. 
Java @RestController class AssistantController { Assistant assistant; public AssistantController(Assistant assistant) { this.assistant = assistant; } @GetMapping("/chat") public String chat(String message) { return assistant.chat(message); } @GetMapping("/stream") public Flux<String> stream(String message) { return assistant.stream(message); } } Start LMStudio, load the llama-3.2-1b-instruct model, and start the server. Start the Spring Boot application. Shell mvn spring-boot:run Send a chat message for the non-streaming API. Plain Text $ curl http://localhost:8080/chat?message=Tell%20me%20a%20joke Here's one: Why did the scarecrow win an award? Because he was outstanding in his field! I hope that made you smile! Do you want to hear another one? This works; you can do the same for the streaming API The response will be similar, but with a streaming response. Now, stop the Spring Boot application and the LMStudio server. Mock Assistant You can create a test using @WebMvcTest and inject the Assistant as a MockitoBean. This allows you to mock the response from the Assistant. However, you will only test up to the dashed line in the image below. Everything else will be out of scope for your test. The test itself is the following. Java @WebMvcTest(AssistantController.class) class ControllerWebMvcTest { @Autowired private MockMvc mockMvc; @MockitoBean private Assistant assistant; @Test void testChat() throws Exception { when(assistant.chat("Tell me a joke")).thenReturn("This is a joke"); mockMvc.perform(MockMvcRequestBuilders.get("/chat") .param("message", "Tell me a joke") ) .andExpect(status().is2xxSuccessful()) .andExpect(content().string("This is a joke")); } } This test might be ok, but actually, you are not testing a lot of functionality. When you upgrade LangChain4j, you might get surprised when breaking changes are introduced. This test will not reveal anything because the LangChain4j dependency is not part of your test. Mock HTTP Request A better approach is to mock the HTTP request/response between your application and LMStudio. And this is where WireMock is used. Your test is now extended up to the dashed line in the image below. In order to use WireMock, you need to add the wiremock-spring-boot dependency to the pom. XML <dependency> <groupId>org.wiremock.integrations</groupId> <artifactId>wiremock-spring-boot</artifactId> <version>3.6.0</version> <scope>test</scope> </dependency> The setup of the test is as follows: Add @SpringBootTest, this will spin up the Spring Boot application.Add @EnableWireMock in order to enable WireMock.Add @TestPropertySource in order to override the LMStudio URL. WireMock will run on a random port and this way the random port will be used.Add @AutoConfigureMockMvc because MockMvc will be used to send the HTTP request to the controller. Java @SpringBootTest(webEnvironment = SpringBootTest.WebEnvironment.RANDOM_PORT) @EnableWireMock @TestPropertySource(properties = { "my.properties.lm-studio-base-url=http://localhost:${wiremock.server.port}/v1" }) @AutoConfigureMockMvc class ControllerWireMockTest { @Autowired MockMvc mockMvc; ... } In order to mock the request and response, you need to know the API, or you need some examples. The logs of LMStudio are very convenient because the request and response are logged. 
JSON 2025-03-29 11:28:45 [INFO] Received POST request to /v1/chat/completions with body: { "model": "llama-3.2-1b-instruct", "messages": [ { "role": "system", "content": "You are a polite assistant" }, { "role": "user", "content": "Tell me a joke" } ], "stream": false } 2025-03-29 11:28:45 [INFO] [LM STUDIO SERVER] Running chat completion on conversation with 2 messages. 2025-03-29 11:28:46 [INFO] [LM STUDIO SERVER] Accumulating tokens ... (stream = false) 2025-03-29 11:28:48 [INFO] [LM STUDIO SERVER] [llama-3.2-1b-instruct] Generated prediction: { "id": "chatcmpl-p1731vusmgq3oh0xqnkay4", "object": "chat.completion", "created": 1743244125, "model": "llama-3.2-1b-instruct", "choices": [ { "index": 0, "message": { "role": "assistant", "content": "Here's one:\n\nWhy did the scarecrow win an award?\n\nBecause he was outstanding in his field!\n\nI hope that made you smile! Do you want to hear another one?" }, "logprobs": null, "finish_reason": "stop" } ], "usage": { "prompt_tokens": 24, "completion_tokens": 36, "total_tokens": 60 }, "system_fingerprint": "llama-3.2-1b-instruct" } Mocking the request with WireMock consists of a few steps: Stub the request with stubFor and indicate how the request should be matched. Many options are available here; in this example, it is only checked whether it is a POST request and the request matches a specific URL.Set the response; here, many options are available. In this example, the HTTP status is set and the body. After this, you send the request to your controller and verify its response. WireMock will mock the communication between LangChain4j and LMStudio for you. Java @Test void testChat() throws Exception { stubFor(post(urlEqualTo("/v1/chat/completions")) .willReturn(aResponse() .withStatus(200) .withBody(BODY))); mockMvc.perform(MockMvcRequestBuilders.get("/chat") .param("message", "Tell me a joke") ) .andExpect(status().is2xxSuccessful()) .andExpect(content().string("This works!")); } ... private static final String BODY = """ { "id": "chatcmpl-p1731vusmgq3oh0xqnkay4", "object": "chat.completion", "created": 1743244125, "model": "llama-3.2-1b-instruct", "choices": [ { "index": 0, "message": { "role": "assistant", "content": "This works!" }, "logprobs": null, "finish_reason": "stop" } ], "usage": { "prompt_tokens": 24, "completion_tokens": 36, "total_tokens": 60 }, "system_fingerprint": "llama-3.2-1b-instruct" } """; When you run this test, you see in the logs that the WireMock server is started. Shell Started WireMockServer with name 'wiremock':http://localhost:37369 You can also see the requests that are received, what has been matched, and which response has been sent. 
JSON Content-Type: [application/json] Host: [localhost:37369] Content-Length: [784] Connection: [keep-alive] User-Agent: [Apache-HttpClient/5.4.2 (Java/21)] { "id" : "8d734483-c2f5-4924-8e53-4e4bc4f3b848", "request" : { "url" : "/v1/chat/completions", "method" : "POST" }, "response" : { "status" : 200, "body" : "{\n \"id\": \"chatcmpl-p1731vusmgq3oh0xqnkay4\",\n \"object\": \"chat.completion\",\n \"created\": 1743244125,\n \"model\": \"llama-3.2-1b-instruct\",\n \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\": \"assistant\",\n \"content\": \"This works!\"\n },\n \"logprobs\": null,\n \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\": 24,\n \"completion_tokens\": 36,\n \"total_tokens\": 60\n },\n \"system_fingerprint\": \"llama-3.2-1b-instruct\"\n}\n" }, "uuid" : "8d734483-c2f5-4924-8e53-4e4bc4f3b848" } 2025-03-29T13:03:21.398+01:00 INFO 39405 --- [MyWiremockAiPlanet] [qtp507765539-35] WireMock wiremock : Request received: 127.0.0.1 - POST /v1/chat/completions Authorization: [Bearer dummy] User-Agent: [langchain4j-openai] Content-Type: [application/json] Accept-Encoding: [gzip, x-gzip, deflate] Host: [localhost:37369] Content-Length: [214] Connection: [keep-alive] { "model" : "llama-3.2-1b-instruct", "messages" : [ { "role" : "system", "content" : "You are a polite assistant" }, { "role" : "user", "content" : "Tell me a joke" } ], "stream" : false } Matched response definition: { "status" : 200, "body" : "{\n \"id\": \"chatcmpl-p1731vusmgq3oh0xqnkay4\",\n \"object\": \"chat.completion\",\n \"created\": 1743244125,\n \"model\": \"llama-3.2-1b-instruct\",\n \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\": \"assistant\",\n \"content\": \"This works!\"\n },\n \"logprobs\": null,\n \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\": 24,\n \"completion_tokens\": 36,\n \"total_tokens\": 60\n },\n \"system_fingerprint\": \"llama-3.2-1b-instruct\"\n}\n" } Stubbing There is a lot of functionality available for stubbing requests. In the example above, the response was created as follows. Java stubFor(post(urlEqualTo("/v1/chat/completions")) .willReturn(aResponse() .withStatus(200) .withBody(BODY))); However, this can be written much shorter by using okJson. Java stubFor(post(urlEqualTo("/v1/chat/completions")) .willReturn(okJson(BODY))); Response From File The body of the response is put in a local constant. This might be ok when you only have one response in your test, but when you have a lot of responses, it is more convenient to put them in files. Do note that you need to put the files in directory test/resources/__files/, otherwise the files will not be found by WireMock. The following error will then be shown. Shell com.github.tomakehurst.wiremock.admin.NotFoundException: Not found in blob store: stubs/jokestub.json The test can be rewritten as follows, assuming that jokestub.json is located in directory test/resources/__files/stubs. Shell stubFor(post(urlEqualTo("/v1/chat/completions")) .willReturn(aResponse().withBodyFile("stubs/jokestub.json"))); Request Matching Request matching is also possible in many ways. Let's assume that you want to match based on the content of the HTTP request. You write two stubs that match on a different body and will return a different response. 
Java @Test void testChatWithRequestBody() throws Exception { stubFor(post(urlEqualTo("/v1/chat/completions")) .withRequestBody(matchingJsonPath("$.messages[?(@.content == 'Tell me a joke')]")) .willReturn(aResponse().withBodyFile("stubs/jokestub.json"))); stubFor(post(urlEqualTo("/v1/chat/completions")) .withRequestBody(matchingJsonPath("$.messages[?(@.content == 'Tell me another joke')]")) .willReturn(aResponse().withBodyFile("stubs/anotherjokestub.json"))); mockMvc.perform(MockMvcRequestBuilders.get("/chat") .param("message", "Tell me a joke") ) .andExpect(status().is2xxSuccessful()) .andExpect(content().string("This works!")); mockMvc.perform(MockMvcRequestBuilders.get("/chat") .param("message", "Tell me another joke") ) .andExpect(status().is2xxSuccessful()) .andExpect(content().string("This works also!")); } Streaming Response Mocking a streaming response is a bit more complicated. LMStudio only logs the request and not the response. JSON 5-03-29 14:13:55 [INFO] Received POST request to /v1/chat/completions with body: { "model": "llama-3.2-1b-instruct", "messages": [ { "role": "system", "content": "You are a polite assistant" }, { "role": "user", "content": "Tell me a joke" } ], "stream": true, "stream_options": { "include_usage": true } } This caused some challenges. Eventually, LangChain4j was debugged in order to get a grasp of what the response should look like ( dev.langchain4j.http.client.sse.DefaultServerSentEventParser.parse). Only a small snippet of the entire response is used in the test; it is more about the idea. Another difference with the above tests is that you need to use WebTestClient instead of MockMvc. So, remove the @AutoConfigureMockMvc and inject a WebTestClient. Also, add the following dependencies to the pom. XML <!-- Needed for WebTestClient --> <dependency> <groupId>org.springframework.boot</groupId> <artifactId>spring-boot-starter-webflux</artifactId> <scope>test</scope> </dependency> <!-- Needed for StepVerifier --> <dependency> <groupId>io.projectreactor</groupId> <artifactId>reactor-test</artifactId> <version>3.5.8</version> <scope>test</scope> </dependency> The WebTestClient allows you to send and receive a streaming response. The StepVerifier is used to verify the streamed responses. 
Java @SpringBootTest(webEnvironment = SpringBootTest.WebEnvironment.RANDOM_PORT) @EnableWireMock @TestPropertySource(properties = { "my.properties.lm-studio-base-url=http://localhost:${wiremock.server.port}/v1" }) class ControllerStreamWireMockTest { @Autowired private WebTestClient webTestClient; @Test void testStreamFlux() { stubFor(post(WireMock.urlPathEqualTo("/v1/chat/completions")) .willReturn(aResponse() .withStatus(200) .withHeader("Content-Type", "text/event-stream") .withBody(""" data: {"id":"chatcmpl-tnh9pc0j6m91mm9duk4c4x","object":"chat.completion.chunk","created":1743325543,"model":"llama-3.2-1b-instruct","system_fingerprint":"llama-3.2-1b-instruct","choices":[{"index":0,"delta":{"role":"assistant","content":"Here"},"logprobs":null,"finish_reason":null}]} data: {"id":"chatcmpl-tnh9pc0j6m91mm9duk4c4x","object":"chat.completion.chunk","created":1743325543,"model":"llama-3.2-1b-instruct","system_fingerprint":"llama-3.2-1b-instruct","choices":[{"index":0,"delta":{"role":"assistant","content":"'s"},"logprobs":null,"finish_reason":null}]} data: {"id":"chatcmpl-tnh9pc0j6m91mm9duk4c4x","object":"chat.completion.chunk","created":1743325543,"model":"llama-3.2-1b-instruct","system_fingerprint":"llama-3.2-1b-instruct","choices":[{"index":0,"delta":{"role":"assistant","content":" one"},"logprobs":null,"finish_reason":null}]} data: {"id":"chatcmpl-tnh9pc0j6m91mm9duk4c4x","object":"chat.completion.chunk","created":1743325543,"model":"llama-3.2-1b-instruct","system_fingerprint":"llama-3.2-1b-instruct","choices":[{"index":0,"delta":{},"logprobs":null,"finish_reason":"stop"}]} data: [DONE]"""))); // Use WebClient to make a request to /stream endpoint Flux<String> response = webTestClient.get() .uri(uriBuilder -> uriBuilder.path("/stream").queryParam("message", "Tell me a joke").build()) .accept(MediaType.TEXT_EVENT_STREAM) .exchange() .expectStatus().isOk() .returnResult(String.class) .getResponseBody(); // Verify streamed data using StepVerifier StepVerifier.create(response) .expectNext("Here") .expectNext("'s") .expectNext("one") // spaces are stripped .verifyComplete(); } } Conclusion WireMock is an easy-to-use testing library that helps you with testing integrations with other systems. A lot of functionality is available, and it even works with streaming responses. WireMock is not only limited for use with Spring Boot, also when you want to test integrations from within a regular Java application, WireMock can be used.
Alright, welcome to the final post in this three-part series. Let's do a quick recap of the journey so far:

- In Part 1, I laid out the problem with monolithic AI "brains" and designed the architecture for a specialist team of agents to power my "InstaVibe Ally" feature.
- In Part 2, we did a deep dive into the Model Context Protocol (MCP), and I showed you exactly how I connected my Platform Interaction Agent to my application's existing REST APIs, turning them into reusable tools.

But my agents are still living on isolated islands. My Social Profiling Agent has no way to give its insights to the Event Planner. My platform integrator can create posts and events. I've built a team of specialists, but I haven't given them a way to collaborate. They're a team that can't talk. This is the final, critical piece of the puzzle. To make this a true multi-agent system, my agents need to communicate. This is where the Agent-to-Agent (A2A) protocol comes in.

A2A: Giving Your Agents a Shared Language

So, what exactly is A2A? At its core, it's an open standard designed for one agent to discover, communicate with, and delegate tasks to another. If MCP is about an agent using a non-sentient tool, A2A is about an agent collaborating with a peer—another intelligent agent with its own reasoning capabilities. This isn't just about making a simple API call from one service to another. It's about creating a standardized way for agents to understand each other's skills and work together to achieve complex goals. This was the key to unlocking true orchestration. It meant I could build my specialist agents as completely independent microservices, and as long as they all "spoke A2A," my Orchestrator could manage them as a cohesive team.

The Big Question: A2A vs. MCP: What's the Difference?

This is a point that can be confusing, so let me break down how I think about it. It's all about who is talking to whom.

- MCP is for agent-to-tool communication. It's the agent's key to the tool shed. My Platform Agent uses MCP to connect to my MCP Server, which is a simple gateway to a "dumb" tool—my InstaVibe REST API. The API can't reason or think; it just executes a specific function.
- A2A is for agent-to-agent communication. It's the agent's phone number to call a colleague. My Orchestrator uses A2A to connect to my Planner Agent. The Planner Agent isn't just a simple function; it has its own LLM, its own instructions, and its own tools (like Google Search). I'm not just telling it to do something; I'm delegating a goal to it.

Here's the simplest way I can put it:

- Use MCP when you want your agent to use a specific, predefined capability (like create_post or run_sql_query).
- Use A2A when you want your agent to delegate a complex task to another agent that has its own intelligence.

Making It Real: The a2a-python Library in Action

Theory is great, but let's look at the code. To implement this, I used the a2a-python library, which made the whole process surprisingly straightforward. It breaks down into two parts: making an agent listen (the server) and making an agent talk (the client). First, I needed to take my specialist agents (Planner, Social, etc.) and wrap them in an A2A server so they could receive tasks. The most important part of this is creating an Agent Card. An Agent Card is exactly what it sounds like: a digital business card for your agent. It's a standard, machine-readable JSON object that tells other agents: "Hi, I'm the Planner Agent. Here's what I'm good at (my skills), and here's the URL where you can reach me."
Here's a snippet from my planner/a2a_server.py showing how I defined its card and started the server.

Python
# Inside my PlannerAgent class...
skill = AgentSkill(
    id="event_planner",
    name="Event planner",
    description="This agent generates fun plan suggestions tailored to your specified location, dates, and interests...",
    tags=["instavibe"],
    examples=["What about Boston MA this weekend?"]
)
self.agent_card = AgentCard(
    name="Event Planner Agent",
    description="This agent generates fun plan suggestions...",
    url=f"{PUBLIC_URL}",  # The public URL of this Cloud Run service
    skills=[skill]
)

# And in the main execution block...
request_handler = DefaultRequestHandler(...)
server = A2AStarletteApplication(
    agent_card=plannerAgent.agent_card,
    http_handler=request_handler,
)
uvicorn.run(server.build(), host='0.0.0.0', port=port)

With just that little bit of boilerplate, my Planner Agent, running on Cloud Run, now has an endpoint (/.well-known/agent.json) that serves its Agent Card. It's officially on the grid and ready to take requests. Now for the fun part: my Orchestrator agent. Its primary job is not to do work itself, but to delegate work to the others. This means it needs to be an A2A client. First, during its initialization, the Orchestrator fetches the Agent Cards from the URLs of all my specialist agents. This is how it "meets the team." The real magic is in how I equipped the Orchestrator. I gave it a single, powerful ADK tool called send_message. The sole purpose of this tool is to make an A2A call to another agent. The final piece was the Orchestrator's prompt. I gave it a detailed instruction set that told it, in no uncertain terms: "You are a manager. Your job is to understand the user's goal, look at the list of available agents and their skills, and use the send_message tool to delegate tasks to the correct specialist." Here's a key snippet from its instructions:

Python
You are an expert AI Orchestrator. Your primary responsibility is to...delegate each action to the most appropriate specialized remote agent using the send_message function. You do not perform the tasks yourself.

Agents:
{self.agents}  <-- This is where I inject the list of discovered Agent Cards

This instruction allows the LLM inside the Orchestrator to reason about the user's request, look at the skills listed on the Agent Cards, and make an intelligent decision about which agent to call.
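To make that discovery step concrete, here is a minimal sketch of what fetching an Agent Card over plain HTTP could look like. It uses the well-known path mentioned above; the requests library, the placeholder URLs, and the card fields printed at the end are illustrative assumptions, not the a2a-python client API.

Python
import requests

def fetch_agent_card(base_url: str) -> dict:
    """Fetch an agent's card from its well-known discovery endpoint.

    Illustrative only: a real A2A client library wraps this step for you.
    """
    resp = requests.get(f"{base_url}/.well-known/agent.json", timeout=10)
    resp.raise_for_status()
    return resp.json()

# Hypothetical specialist agent URLs (placeholders, not real services).
AGENT_URLS = [
    "https://planner-agent.example.run.app",
    "https://social-agent.example.run.app",
]

if __name__ == "__main__":
    for url in AGENT_URLS:
        card = fetch_agent_card(url)
        # An Agent Card advertises the agent's name and skills, which is what
        # lets the orchestrator's LLM reason about whom to delegate to.
        print(card.get("name"), [s.get("id") for s in card.get("skills", [])])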
The Grand Finale: A Symphony of Agents

Now, let's trace the full, end-to-end flow.

1. A user in the InstaVibe app says: "Plan a fun weekend in Chicago for me and my friends, Ian and Nora."
2. The app calls my Orchestrator Agent, now running on Vertex AI Agent Engine.
3. The Orchestrator's LLM reasons: "This is a two-step process. First, I need to understand Ian and Nora's interests. Then, I need to create a plan based on those interests."
4. It consults its list of available agents, sees the "Social Profile Agent," and determines it's the right specialist for the first step.
5. It uses its send_message tool to make an A2A call to the Social Agent, asking it to profile Ian and Nora.
6. The Social Agent on Cloud Run receives the A2A request, does its work (querying my Spanner Graph Database), and returns a summary of their shared interests.
7. The Orchestrator receives this summary and reasons: "Okay, step one is done. Time for step two."
8. It consults its agent list again, sees the "Event Planner Agent," and makes a new A2A call, delegating the planning task and passing along the crucial context: "Plan an event in Chicago for people who enjoy [shared interests from step 1]."
9. The Planner Agent on Cloud Run receives the request, uses its own tool (Google Search) to find relevant events and venues, and returns a structured JSON plan.
10. The Orchestrator receives the final plan and presents it to the user.

This is the power of a multi-agent system. Each component did what it does best, all coordinated through a standard communication protocol.

Conclusion of the Series

And there you have it. Over these three posts, we've gone from a simple idea to a fully functioning, distributed AI system. I started with a user problem, designed a modular team of agents to solve it, gave them access to my existing APIs with MCP, and finally, enabled them to collaborate as a team with A2A. My biggest takeaway from this whole process is this: building sophisticated AI systems requires us to think like software engineers, not just prompt engineers. By using open standards like MCP and A2A and frameworks like the ADK, I was able to build something that is robust, scalable, and—most importantly—maintainable. You've read the whole story. Now, it's your turn to build it. I've documented every single step of this process in a hands-on "InstaVibe Multi-Agent" Google Codelab. You'll get to build each agent, deploy the MCP server, and orchestrate the whole thing with A2A, all on Google Cloud. It's the best way to move from theory to practice. Thank you for following along with this series. I hope it's been helpful. Give the Codelab a try, and let me know what you build!
TL; DR: The Agile Paradox Many companies adopt Agile practices like Scrum but fail to achieve true transformation. This “Agile Paradox” occurs because they implement tactical processes without changing their underlying command-and-control structure, culture, and leadership style. True agility requires profound systemic changes to organizational design, leadership, and technical practices, not just performing rituals. Without this fundamental shift from “doing” to “being” agile, transformations stall, and the promised benefits remain unrealized. The Fundamental Disconnect at the Heart of the Agile Paradox Two decades after the Agile Manifesto, we are in a puzzling situation. Agile practices, particularly Scrum, have achieved widespread adoption across industries. Yet many organizations report significant challenges, with transformations stalling, teams disengaging, and promised benefits unrealized. Studies suggest that a considerable percentage of Agile initiatives do not meet expectations. The evidence points to a fundamental paradox: Organizations adopt Agile tactically while attempting to preserve strategically incompatible systems. This approach isn’t merely an implementation challenge; it represents a profound category error in understanding what Agile actually is. The research report indicates that organizations frequently “implement Agile tactically, at the team or process level, without fundamentally rethinking or dismantling their existing command-and-control organizational structures and management paradigms” (Agile Scope, page 3). The Five Agile Adoption Fallacies 1. The Framing Error: “Doing Agile” vs. Transforming Work Most organizations approach Agile as a process replacement, swapping waterfall artifacts for Scrum events, expecting twice the work to be done in half the time. Teams now have Daily Scrums instead of status meetings, Product Backlogs instead of requirements documents, and Sprint Reviews instead of milestone presentations. This superficial adoption creates the illusion of transformation while preserving the underlying coordination logic. The Reality: Scrum isn’t primarily about events or artifacts. It’s about fundamentally reshaping how work is discovered, prioritized, and validated. When organizations “install” Scrum without changing how decisions are made, how feedback flows, or how learning occurs, they deny themselves its core benefits. As documented in the research report, many large-scale adoptions fail because organizations adopt “the visible artifacts and rituals of Agile [...] without truly understanding or internalizing the core values, principles, and mindset shifts required for genuine agility” (Agile Scope, page 12). The result is what we can call “Agile Theatre” or “Cargo Cult Agile,” essentially performing agility for the show, but without substance. 2. System Over Task: Local Optimization vs. Organizational Adaptation Organizations often optimize team-level practices: improving velocity, optimizing backlog management, and conducting efficient events. However, they frequently resist addressing the organizational systems that enforce handoffs, create dependencies, mandate annual budgeting cycles, or perpetuate fixed-scope initiatives. The Reality: A high-performing Scrum team embedded in a traditional organizational structure hits an effectiveness ceiling almost immediately. 
The Agile principles of responding to change and, thus, continuously delivering value collide with quarterly planning cycles, departmental silos, the incentivized urge for local optimization, and multi-layer approval processes. This tension manifests concretely: Product Owners typically lack true ownership and empowerment, acting merely as backlog administrators or requirement scribes rather than value maximizers. Additionally, teams are blocked by external dependencies, and rigid governance stifles innovation. In other words, traditional governance structures characterized by rigid stage gates, extensive documentation requirements, and centralized approval processes will likely clash with Agile's need for speed, flexibility, and minimal viable documentation. The heart of this problem lies in complexity. Modern organizational environments, increasingly characterized by volatility, uncertainty, complexity, and ambiguity (VUCA), require adaptive approaches to align strategy to reality or avoid betting the farm on number 23. Yet many organizations still operate with structures designed for more stable, predictable, controllable environments, creating a fundamental mismatch between problem domain and solution approach. 3. The Illusion of Empowerment: "Self-Organizing" Teams in Disempowering Systems Perhaps the most insidious pattern is the contradiction of declared empowerment within controlling systems. Teams are told they're self-organizing and empowered to make decisions, yet critical choices about roadmaps, staffing, architecture, and priorities typically remain elsewhere. This contradiction slowly but steadily undermines the whole idea of agility, with management saying one thing while the organizational system enforces another. The Reality: True empowerment requires structural changes. Decision authority must be explicitly delegated and supported: the people closest to the problem should make most of the calls. Instead, teams often lack genuine authority over their work, including scope, schedule, or process, and decisions remain centralized and top-down. When organizations claim to want autonomous teams while maintaining command-and-control structures, they create cynicism and disengagement. Additionally, a lack of management support seems to rank among the top causes of Agile failure. Often, this isn't just a lack of verbal encouragement, but a failure by leadership to understand Agile principles, abandon command-and-control habits, or make the necessary structural changes that create the conditions for autonomy to succeed. 4. Technical Excellence and the Agile Paradox: The Missing Foundation Many Agile transformations focus exclusively on process changes while neglecting the technical practices that enable sustainable agility. Teams adopt iterations and user stories but skip practices like test automation, continuous integration, and refactoring. The Reality: This pattern of process adoption without technical practices is a major reason for failed Agile initiatives. Sustainable agility relies on strong technical foundations; without practices like automated testing and continuous integration, teams accumulate technical debt that eventually cripples their ability to deliver value quickly and reliably. Instead, Agile's success requires congruence between team-level agile practices and the organization's overall 'operating system,' including technical practices. 5. The Operating System Fallacy: Scrum as Plugin vs. Platform
At its core, Scrum represents a fundamentally different operating model designed for complex, unpredictable environments where adaptation and learning outperform prediction and control. Yet organizations often treat it as a plugin to their existing operating system of command-and-control hierarchies. The Reality: This category error, trying to run an adaptive framework on top of a predictive operating system, creates irreconcilable conflicts. They stem from profoundly different philosophical foundations. Traditional management structures are rooted in Scientific Management (Taylorism), pioneered by Frederick Winslow Taylor in the early 20th century, emphasizing efficiency, standardization, and control. By contrast, Agile is founded on principles of adaptation, collaboration, and distributed decision-making. Taylorism operates on distinct assumptions: it views work as decomposable into simple, standardized, repeatable tasks and assumes the existence of a single most efficient method for performing each task, which defies common experience in complex environments where we already struggle to identify the necessary work. Consequently, when these two systems collide, the traditional structure typically dominates, leading to compromised implementations that retain the form of Agile but not its essence. Beyond the Paradox: Toward Authentic Transformation How do organizations break free from this paradox? There are several paths forward: 1. Structural Realignment Rather than merely implementing Agile within existing structures, successful organizations realign their structure around value streams. Moving beyond simply optimizing isolated team processes requires rethinking the entire organizational operating model. This essential step includes:
- Moving from functional departments to cross-functional product teams
- Reducing organizational layers and approval gates
- Creating persistent teams rather than project-based staffing
- Aligning support functions (HR, Finance, etc.) with Agile principles.
2. Leadership Transformation Leadership must move beyond endorsement to embodiment, consistently playing a dual role: actively supporting the agile initiative while changing their own management style, including:
- Shifting from directive to servant leadership styles
- Providing clear vision and boundaries rather than detailed instructions
- Modeling a learning mindset that embraces experimentation and adaptation
- Creating psychological safety for teams to surface problems, take risks, and fail in the process.
Thus, leadership is pivotal for any organization striving to become "Agile." Yet, too often, leadership undermines its role through a lack of executive understanding, sponsorship, active participation, and commitment; all of these are critical failure factors. 3. Systemic Changes to Enabling Functions True transformation requires reimagining core organizational functions. If these functions remain rooted in traditional, non-Agile paradigms, they can cause critical friction. Therefore, successful organizations implement changes such as:
- Moving from project-based funding to persistent product teams with outcome-based metrics
- Shifting from annual budgeting to more flexible, incremental funding models aligned with Agile's iterative, adaptive nature
- Evolving HR practices from individual performance management to team capability building, overcoming practices like annual individual performance reviews, ranking systems, and reward structures designed to achieve personal objectives
- Redesigning governance to focus on outcomes rather than adherence to plans.
4. Technical Excellence and Craft Organizations must invest equally in technical capabilities. Achieving true agility requires significant investment in technical training, mentoring, and DevOps infrastructure and practices, which is why successful engineering organizations focus on:
- Building technical agility through practices like test automation, continuous delivery, and clean code
- Creating a culture of craftsmanship where quality is non-negotiable
- Allowing teams time to reduce technical debt and improve their tools and practices regularly
- Measuring not just delivery speed but sustainability and quality.
Conclusion: From Installation to Transformation The Agile paradox persists because it's easier to change processes than paradigms. Installing Scrum events requires nothing more than training and schedule adjustments; transforming an organization requires questioning fundamental assumptions about control, hierarchy, and value creation. True Agile transformation isn't about doing Scrum; it's about becoming an organization that can continuously learn, adapt, and deliver value in complex environments. This approach requires not just new practices but new mental models. Organizations that break through the paradox recognize that Agile isn't something you "do" but something you become. The choice organizations face isn't whether to do Agile well or poorly. It's whether they genuinely want to become agile at all. The real failure isn't Agile itself but the fundamental mismatch between adopting adaptive practices while clinging to outdated management paradigms. Until organizations address this systemic conflict, the promise of true agility will remain a pipedream.
Any modern distributed system that requires high throughput, scaling, and high availability utilizes Kafka as one of its components, making Kafka a popular platform that needs no introduction. However, even though it is an integral part of Kafka, Apache ZooKeeper is neither explored nor understood as much as it should be. In this article, we briefly touch upon these aspects and look at the next generation of Kafka via KRaft mode and the benefits it brings over ZooKeeper.

Kafka Cluster

To achieve its design goals of high availability and high throughput, Kafka is built as a distributed system of multiple nodes (viz. brokers) at its core. These brokers together form a Kafka cluster. As with any distributed system, certain aspects of the Kafka cluster need to be managed, and Apache ZooKeeper was chosen for this purpose. ZooKeeper acts as a cluster controller or consistent core and is typically responsible for the following aspects of a Kafka cluster:

- Cluster membership – Maintain details of every broker that is an active member of the cluster. Note that since brokers can dynamically join (typically for scaling out) or exit (due to failures or scale-down), managing cluster membership is crucial for achieving high availability.
- Leader election via the controller broker – Maintain the information about the controller broker and coordinate with it for the leader election of topic partitions.
- Cluster metadata – Maintain topic information, viz. partitions, ISRs, consumers, consumer groups, current offsets, leaders, etc. This also includes partition assignment, tracking, and notifying the routing service about changes.
- Cluster recovery – In the extreme case of a complete cluster failure, the metadata stored in ZooKeeper is used for recovery. However, it's imperative to note that the recovered cluster state is only as fresh as the latest ZooKeeper snapshot.
- Service discovery – ZooKeeper enables service discovery between brokers by acting as a central registry, and the complete cluster topology is available there. This helps brokers make appropriate decisions (e.g., rebalancing) whenever a broker dynamically joins and/or exits the cluster.
- Access control lists (ACLs) – Maintain the details of access control for topics and consumer groups, allowing brokers to perform the required authorization.

ZooKeeper's responsibilities stretch well beyond the aspects discussed above and are out of the scope of this article.

Limitations of ZooKeeper

Despite being an integral part of the Kafka ecosystem, ZooKeeper has certain limitations, described below:

- ZooKeeper itself requires high availability, which adds to cluster complexity during deployment and operations. Moreover, it increases the infrastructure cost of maintaining ZooKeeper servers.
- Being a consensus-based system, its throughput drops as the cluster grows beyond certain thresholds in terms of the number of brokers, topic partitions, etc. This is a known cause of slow leader elections and consumer rebalancing.
- With ZooKeeper as the central registry of metadata, it becomes a bottleneck if all cluster members fetch metadata at once. Although this issue was solved long ago via KAFKA-901, the fix essentially diminished ZooKeeper's responsibility.
- ZooKeeper uses a single socket channel and thread per follower for all communication. This is essential to maintain strictly ordered processing.
- For a given Kafka cluster, there can be at most one controller broker. The controller election happens during cluster startup and/or after the controller broker fails. In a large cluster this election can take a while, since every broker tries to register itself as the controller. This can leave the cluster unusable for a brief period, which may be unacceptable for a variety of use cases.

KRaft Mode

To understand the reasoning behind the drastic decision to move away from ZooKeeper altogether, let's refer to the official KIP-500 itself:

Currently, Kafka uses ZooKeeper to store its metadata about partitions and brokers, and to elect a broker to be the Kafka Controller. We would like to remove this dependency on ZooKeeper. This will enable us to manage metadata in a more scalable and robust way, enabling support for more partitions. It will also simplify the deployment and configuration of Kafka.

In other words, KRaft simplifies cluster management by removing ZooKeeper, eliminating the need to run two separate systems with different configurations. Instead, Kafka alone handles metadata, reducing errors and operational complexity. Metadata is treated as an event stream, allowing quick updates using a single offset, similar to how producers and consumers function.

A typical Kafka cluster with ZooKeeper (left) and KRaft (right). Brown and blue nodes are controller brokers, while black nodes are regular brokers. Image courtesy of KIP-500.
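To make this concrete, here is a minimal sketch of what a KRaft-mode configuration might look like for a combined-role node (acting as both broker and controller). The node IDs, hostnames, ports, and paths are illustrative placeholders of mine, not values from the article; consult the official Kafka documentation for your version before reusing them.

Properties
# Minimal illustrative KRaft-mode server.properties for a combined-role node.
# All IDs, hostnames, ports, and paths are placeholders; adjust them for your cluster.
process.roles=broker,controller
node.id=1
controller.quorum.voters=1@kafka-1:9093,2@kafka-2:9093,3@kafka-3:9093
listeners=PLAINTEXT://:9092,CONTROLLER://:9093
inter.broker.listener.name=PLAINTEXT
controller.listener.names=CONTROLLER
log.dirs=/var/lib/kafka/data
# Before the first start, the storage directory must be formatted with a cluster ID:
#   bin/kafka-storage.sh random-uuid
#   bin/kafka-storage.sh format -t <generated-uuid> -c config/server.properties

Note how the controller quorum is just another set of Kafka nodes; there is no separate ZooKeeper ensemble to configure.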
The key differences and advantages KRaft mode brings over ZooKeeper are as follows:

Controller Quorum

Instead of the single controller broker of ZooKeeper mode, a quorum of controllers is maintained. Everything that was previously stored in ZooKeeper, such as topics, partitions, ISRs, and configurations, is stored by these controllers. Using the Raft algorithm, one broker among the controllers is chosen as leader and termed the active controller; Raft is used for log replication as well. Essentially, all controller brokers have the latest information among themselves and act as hot standbys. This drastically reduces recovery time after a controller failure, as there is no longer a need to transfer all the data to a new controller broker.

Side note: in case you wondered how the KRaft name was chosen, you may now have the answer!

Broker Metadata and State

Instead of the controller pushing updates out to the other brokers, brokers fetch updates from the active controller, and the fetched metadata is persisted to disk. This enables quick recovery of brokers even when there are hundreds or thousands of partitions. Moreover, the metadata fetch is usually a delta, meaning only newer updates are fetched. In the few cases where a broker lags too far behind the active controller, or has no cached metadata at all, the active controller sends a full metadata snapshot instead of incremental deltas. The metadata fetch also doubles as the broker's heartbeat, letting the controller know that the broker is still alive. Failure to receive a heartbeat now results in immediate eviction from the cluster. This essentially eliminates the edge case where a broker is disconnected from ZooKeeper but still connected to other brokers, which could lead to tricky situations of false durability and divergent state.

Partition Reassignment

Partition reassignment remains largely unchanged, except that the KRaft controller now allows deletion of a topic that is still undergoing partition reassignment. The reassignment is terminated immediately if topic deletion is requested midway, avoiding redundant computation.

Shutdown and Recovery Time

With the differences discussed above, both the controlled shutdown time and the recovery time from an uncontrolled shutdown have improved drastically.

Infrastructure

Although migrating to KRaft mode removes the complexity of managing configuration for two different systems (ZooKeeper and Kafka brokers), it does not immediately translate into fewer nodes. While a KRaft node can take the process role of controller, broker, or both, it is highly recommended to use nodes with dedicated roles. Thus, for a large cluster, the N dedicated ZooKeeper nodes would typically be converted into dedicated controller nodes to meet the quorum. Depending on the cluster setup and requirements, the number of nodes may vary between ZooKeeper and KRaft modes.

Challenges

ZooKeeper served the ecosystem well for more than a decade and is battle tested. KRaft is an attempt to solve the challenges faced with ZooKeeper; however, as KRaft adoption picks up speed, it is important to note that KRaft will bring challenges of its own.

Migration Guide

For new clusters, KRaft can be used directly, as it is the default mode starting with Kafka 4.0. Existing clusters, however, need to be upgraded to at least version 3.9 first. Detailed migration steps are out of scope for this article; the official migration guide is a good starting point.

Conclusion

The evolution from ZooKeeper to KRaft mode marks a significant milestone in Kafka's journey toward better scalability, efficiency, and simplicity. ZooKeeper played a crucial role in Kafka's architecture for over a decade. However, as Kafka deployments grew larger and more complex, ZooKeeper's limitations became increasingly evident, ranging from scalability bottlenecks to infrastructure overhead and slower recovery times. KRaft mode offers a streamlined approach by eliminating ZooKeeper and bringing metadata management directly into Kafka's architecture. This transition not only improves operational efficiency, reduces recovery times, and simplifies deployment, but also aligns Kafka with modern distributed system patterns, leveraging the Raft consensus algorithm for better fault tolerance and high availability. For new Kafka clusters, adopting KRaft mode is highly recommended due to its native support for scalability, improved fault tolerance, and reduced infrastructure complexity. For existing clusters, organizations must evaluate their current limitations with ZooKeeper and assess the risks and benefits of migration.

References and Further Reading

Apache Kafka | Patterns of Distributed Systems | ZooKeeper to KRaft | KIP-500 | KIP-595 | KIP-631
In Part 1 of this series, I laid out the high-level architecture for my "InstaVibe Ally" and made the case for building a team of specialist AI agents instead of a single, monolithic brain. I sketched out a system where an Orchestrator delegates tasks to a Social Profiler, a Planner, and a Platform Interaction Agent. Now, I'm going to zoom in on one of the most critical, practical challenges you'll face: How do you actually let your agent use your application's APIs? An AI agent, powered by a Large Language Model (LLM) like Gemini, is a master of language and reason. But by default, it's a brain in a jar, isolated from the real world of your applications. It can't update a Salesforce record, book a flight, or, in my case, post an event to the InstaVibe platform. To do useful work, it needs to connect with the code I've already built.

The first instinct for many developers is to start prompt-hacking. You try stuffing API documentation, cURL commands, and examples directly into the agent's prompt. Let's be honest, this feels clever for about five minutes, and then it becomes a maintenance nightmare and a pile of technical debt. Every time your API changes, you have to hunt down and update every single prompt that uses it. You're tightly coupling your agent's logic to your API's implementation details, and it will come back to bite you. This is not the way.

The solution for me was the Model Context Protocol (MCP). MCP is an open standard that acts as a structured, standardized bridge between an AI agent and any external tool. It's the universal adapter that lets you plug your application's existing capabilities directly into your agent's brain, without creating a messy, unmaintainable prompt. Let's dive into exactly how I used MCP to transform my own internal REST APIs into a set of clean, discoverable tools for my Platform Interaction Agent.

The Problem: My Agent on the Outside, My API on the Inside

My InstaVibe application, like most modern web apps, has a standard set of internal REST endpoints. For this feature, I cared about two in particular:

- POST /api/posts for creating a new social post.
- POST /api/events for creating a new event.

My goal was to allow my agent to call these endpoints intelligently. It needed to be a first-class citizen of my application ecosystem, not a guest who has to be told how to do everything. I wanted the agent to take a user's request like, "Post a positive message from Julia about her new cat!", and have it translate that into a well-formed, secure API call to my existing service. To do this cleanly, I introduced the MCP Tool Server. You can think of it as a dedicated microservice that acts as an API proxy specifically for my agents. It's an integration hub, a translation layer, and a security gateway all in one. The architecture is simple but powerful, and I love it for a few key reasons:

- Separation of concerns. The agent only communicates with my MCP Tool Server; it has no idea the InstaVibe API even exists.
- Easy to introduce security. My MCP Tool Server is the only component that communicates directly with the internal InstaVibe API, so it acts as a centralized security control point.

This completely decouples the agent from the application. The agent's world is simple: it just knows about tools. The application's world is unchanged: it just serves its API. The MCP server is the bridge that connects them, and that separation makes your life a whole lot easier down the road.
Step 1: Write a Function, Not a Prompt

The first thing I did was create a clean separation of concerns. My agent shouldn't have to think about HTTP methods, headers, or authentication. That's application logic. So, on my MCP server, I wrote simple Python functions that represent each tool I wanted to expose. These wrappers handle all the ugly details. For example, here's the wrapper for my create_post API. It's the only place where the requests library and the API's URL are mentioned.

Python
import requests

# BASE_URL points at the internal InstaVibe API and is configured elsewhere on the MCP server.

def create_post(author_name: str, text: str, sentiment: str):
    """
    Sends a POST request to the /posts endpoint to create a new post.
    This function encapsulates the API call logic.
    """
    url = f"{BASE_URL}/posts"
    headers = {"Content-Type": "application/json"}
    payload = {
        "author_name": author_name,
        "text": text,
        "sentiment": sentiment
    }
    try:
        response = requests.post(url, headers=headers, json=payload)
        response.raise_for_status()  # Raise an exception for bad status codes
        print(f"Successfully created post.")
        return response.json()
    except requests.exceptions.RequestException as e:
        print(f"Error creating post: {e}")
        return None

Look how clean that is! My agent will never see a URL or an HTTP header. It will only know about a tool called create_post and the arguments it needs.

Step 2: Build the MCP Server—The Agent's New Best Friend

With the wrapper functions ready, the next step was to build the MCP Server itself. It's a lightweight web application (I used FastAPI, which is great for this) that implements two fundamental endpoints defined by the MCP standard:

- list_tools(): This is the discovery mechanism. You can think of it as the "handshake." When an agent first connects, it calls this endpoint to ask, "Hey, what can you do?" The server responds with a machine-readable list of all available tools (create_post, create_event), their descriptions, and, most importantly, a JSON Schema for their arguments. This schema is critical: it tells the LLM exactly what parameters are needed and what their data types are, which drastically reduces errors and hallucinations.
- call_tool(name, arguments): This is the execution endpoint. After the agent has decided which tool to use (based on the user's request and the info from list_tools), it calls this endpoint and says, "Okay, do this." It sends the tool's name and a dictionary of arguments. The server then acts as a router, finds the matching Python function I wrote in Step 1, and executes it.

I packaged this MCP server into a Docker container and deployed it to Cloud Run. Now it has its own unique, publicly accessible URL and runs as an independent microservice.
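To give a feel for what those two responsibilities amount to, here is a stripped-down sketch of that routing layer as a plain FastAPI app. Treat it as an illustration only: the real MCP specification and SDKs define the exact wire format and handshake, and the route paths, the TOOLS registry, and the instavibe_tools import below are placeholders of mine, not part of the actual server.

Python
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

from instavibe_tools import create_post  # hypothetical module holding the Step 1 wrapper functions

app = FastAPI()

# Illustrative tool registry: name -> description, JSON Schema for the arguments, and the wrapper to call.
TOOLS = {
    "create_post": {
        "description": "Create a new social post on InstaVibe.",
        "parameters": {
            "type": "object",
            "properties": {
                "author_name": {"type": "string"},
                "text": {"type": "string"},
                "sentiment": {"type": "string"},
            },
            "required": ["author_name", "text", "sentiment"],
        },
        "handler": create_post,
    },
}

class ToolCall(BaseModel):
    name: str
    arguments: dict

@app.get("/list_tools")
def list_tools():
    # Discovery: return machine-readable definitions (no handlers) for every registered tool.
    return [
        {"name": name, "description": tool["description"], "parameters": tool["parameters"]}
        for name, tool in TOOLS.items()
    ]

@app.post("/call_tool")
def call_tool(call: ToolCall):
    # Execution: route the named tool to its Python wrapper and return the result.
    tool = TOOLS.get(call.name)
    if tool is None:
        raise HTTPException(status_code=404, detail=f"Unknown tool: {call.name}")
    return {"result": tool["handler"](**call.arguments)}

The important part is the shape: discovery returns schemas, execution routes by name. Swapping in an official MCP server library changes the plumbing, not the idea.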
Step 3: Plug It In—The Easy Part

This is where I LOVED using Google's Agent Development Kit (ADK). After all the careful setup of the server, telling the agent to use it was incredibly simple. I didn't need to write a custom client, parse JSON, or deal with any networking logic myself. I just had to point the ADK at my deployed MCP server.

Python
from google.adk.agents import Agent
from google.adk.mcp import MCPToolset, SseServerParams

....

async def build_agent_with_mcp_tools():
    """Connects to the MCP server and initializes an agent with the discovered tools."""
    print(f"Connecting to MCP server at {MCP_SERVER_URL} to fetch tools...")

    # This single line handles the entire MCP handshake. It connects to the server,
    # calls list_tools(), gets the definitions, and creates ADK-compatible tool objects.
    # It's beautiful.
    tools = MCPToolset(
        connection_params=SseServerParams(url=MCP_SERVER_URL)
    )
    print(f"Successfully loaded tools: {[tool.name async for tool in tools.tools()]}")

    # Now I create my agent and pass in the dynamically loaded tools.
    # The agent's toolset is now whatever the MCP server provides.
    platform_agent = Agent(
        model="gemini-2.5-flash",
        instruction="You are an expert at interacting with the InstaVibe platform using the provided tools.",
        tools=[tools]
    )
    return platform_agent

And just like that—boom! The process is automated and flexible. My agent's toolset is no longer hardcoded. The tool logic is centralized on the MCP server, making it a breeze to update. I've successfully unlocked my existing API for my AI agent without getting tangled in a mess of prompts.

What's Next? From Theory to Your Terminal

By embracing the Model Context Protocol, I've built a robust and maintainable bridge between my agent's reasoning capabilities and my application's existing API. I've treated AI integration like proper software engineering, and the result is a clean, scalable system. But this is only one piece of the puzzle. My Platform Agent can now talk to my application, but how does the Orchestrator talk to the Platform Agent? What happens when agents need to talk to each other? That's a job for the Agent-to-Agent (A2A) protocol, and it's exactly what I'll cover in Part 3, the final post of this series.

In the meantime, reading is great, but building is better. I turned this entire project, from the multi-agent design in Part 1 to the MCP integration I just covered, and even the A2A orchestration I'll discuss next, into a hands-on InstaVibe Multi-Agent Google Codelab. You can build this exact system yourself, step by step, and spend less time scrolling and more time coding. You'll get to:

- Build your first agent with the Agent Development Kit (ADK).
- Expose your application's APIs as tools using MCP.
- Connect your agents with the A2A protocol.
- Orchestrate the whole team and deploy it to Cloud Run and Vertex AI Agent Engine.

It's the perfect way to skip the steep learning curves and see how these powerful concepts work in practice. Give the Codelab a try and drop a comment below with any questions. I'll see you in Part 3.
These are not just technical tips, but principles that shaped how I learned and how I think about engineering. Take what's useful, ignore what isn't. But remember, being a great developer isn't only about code.

1. "The Most Elegant Solution Is the One You Never Have to Maintain."

The most effective code is often the code you never have to write. This might sound counter-intuitive, but one of the most profound lessons I've learned is that sometimes the most effective solution is to avoid writing new code altogether.

Example: Before you embark on building a complex feature or system, pause and ask:

- Does a library, service, or existing component already do this?
- Can we simplify the problem so much that this feature isn't even necessary?
- Is there a manual process, or a simpler, non-technical solution, that would suffice for now?

Often, the cost of maintaining, debugging, and evolving new code far outweighs the perceived benefit of building it from scratch.

PRO-Tip: Prioritize simplicity and reusability. Always look for opportunities to leverage existing solutions, whether internal or external. Question the necessity of every line you write. The less code you have, the less there is to break, and the easier it is to maintain and evolve.

2. "Tools Are Servants of Your Ideas, Not Their Masters."

Master your tools, but don't worship them. Like any master of a craft, great software engineers understand their tools: languages, frameworks, IDEs, and deployment pipelines. However, the tools are a means to an end, not the end itself.

Example: Python is a powerful language and Django a robust framework, but relying solely on them without understanding the underlying principles of web development, databases, or algorithms can lead to chaotic, inefficient systems. The true master can pick up a new language or framework with relative ease because they grasp the fundamental concepts that transcend specific technologies.

PRO-Tip: Dive deep into how your tools work. Understand their strengths and weaknesses. But also, regularly explore new tools and paradigms. This prevents dogma and keeps your mind agile. Don't be afraid to leave a comfortable tool behind if a newer, better one emerges, but only after careful consideration and understanding.

3. "Perfect Is a Moving Target; Value Is When You Release."

Know when to ship, and when to refine. Perfection can stop you from doing good work, but careless work will ruin long-term success. A true expert knows how to strike the right balance.

Example: Google's "launch early and iterate" philosophy is a testament to this. They release products when they are "good enough" and then refine them based on real-world user feedback. On the other hand, critical infrastructure like medical software or aerospace systems demands an extremely high level of initial polish and rigorous testing before deployment.

PRO-Tip: Understand the context of your project. For a new feature, aim for a Minimum Viable Product (MVP) to get feedback quickly. For core, critical systems, invest more upfront in design, testing, and robustness. Always strive for quality, but be pragmatic about where to apply your greatest efforts. Sometimes a quick-and-dirty solution today allows you to deliver value and learn, which informs the truly elegant solution tomorrow.

4. "Programs Must Be Written for People to Read, and Only Incidentally for Machines to Execute." — Harold Abelson

Understand the business, or the code won't matter. You must understand the business first.
If you don't know what problem you're solving or who your work is for, your code won't matter, no matter how perfect it is. Knowing the business gives purpose to your code and makes it truly valuable.

Example: Google's PageRank algorithm solved a business-critical problem: surfacing the most relevant content. Original paper: The Anatomy of a Large-Scale Hypertextual Web Search Engine.

PRO-Tip: Always ask: What real-world outcome will this produce?

5. "Any Fool Can Write Code That a Computer Can Understand. Good Programmers Write Code That Humans Can Understand." — Martin Fowler

Write your code as a gift to the next engineer who reads it. Your future teammates will thank you for clear, thoughtful code that's easy to read and maintain.

Example: The Django codebase is famous for its clarity and explicitness. See its urls.py conventions: Django URLconf Example.

PRO-Tip: Leave docstrings:

Python
def get_user_profile(user_id):
    """
    Retrieve the user profile based on the given user ID.
    Returns None if the profile does not exist.
    """

6. "The Most Dangerous Phrase in the Language Is: 'We've Always Done It This Way.'" — Grace Hopper

Every line of code has a story behind it: why it was written, what problem it solves, and how it fits into the bigger picture. Respect that history before you change or remove anything, as understanding the story helps you make better decisions.

Example: The Linux kernel's Git history contains rich commit logs spanning decades. Browse it online: Linux Kernel Git Repository.

PRO-Tip: Use git blame:

Shell
git blame -L 50,70 -- path/to/file.c

…and see who changed those lines and why.

7. "Premature Optimisation Is the Root of All Evil." — Donald Knuth

Choose the right abstraction, not the most abstract one. Pick the right level of abstraction for your code, not the fanciest or most complicated one. The goal is to make your design clear and practical, not to impress people. Good abstractions solve real problems without adding confusion.

Example: Python's collections module offers simple, practical structures instead of overengineering. Docs: collections — Container Datatypes.

PRO-Tip: Sometimes a plain list is better:

Python
words = ["apple", "banana", "cherry"]

8. "The Biggest Problem in Communication Is the Illusion That It Has Taken Place." — George Bernard Shaw

Communication is the most underrated skill in engineering. You can write great code, but if you can't clearly explain your ideas, share your decisions, or ask for help, your work will suffer. Good communication makes teams stronger and projects more successful.

Example: The Stripe API documentation is clear and approachable. See it live: Stripe API Reference.

PRO-Tip: Write a README.md:

Markdown
# Payment Processor
This module validates and charges credit cards via the Acme API.

## How to Use
1. Configure credentials.
2. Call `charge()`.
3. Handle exceptions.

9. "Debugging Is Twice as Hard as Writing the Code in the First Place." — Brian Kernighan

Good logs are like time machines. They let you travel back and see exactly what happened in your system when things go wrong. Clear, detailed logs save you hours of guessing and help you fix problems faster.
Example: Airbnb's postmortem culture. They publish internal retrospectives to learn from incidents. In short, Airbnb's engineering team practices blameless, detailed postmortems using a custom incident-tracking tool to document both failures and successes, fostering a culture of continuous learning and improvement. The video "12 essential logging best practices" explains how to make your logs more useful, structured, secure, and scalable for easier debugging and monitoring.

PRO-Tip: Use structured logging:

Python
import logging

logger = logging.getLogger(__name__)
# order and user come from your application context.
logger.info("Processing order", extra={"order_id": order.id, "user_id": user.id})

10. "If You Want to Go Fast, Go Alone. If You Want to Go Far, Go Together." — African Proverb

You grow when your team grows. The real measure of your skill is how much stronger everyone becomes because you were here.

Example: The Apache Software Foundation has mentored thousands of contributors into maintainers. See their project governance: Apache Project Maturity Model.

PRO-Tip: Pair-program, document decisions, encourage questions.

11. "Code With Purpose, Design With Clarity, Build With Care: This Is How Good Engineers Create Lasting Impact."

Engineering isn't just solving problems; it's solving the right problems, in the right way, for the right reasons.

Example: Git's design pairs a sophisticated internal model with simple commands. Linus's original design notes: Git Design Notes.

PRO-Tip: Keep interfaces small and focused.

12. "Yesterday's Expertise Is Today's Starting Line."

Cultivate a growth mindset; the learning never stops. Technology is a relentless tide, constantly bringing new waves of innovation. The moment you believe you've learned it all is the moment you start falling behind.

Example: The transition from monolithic architectures to microservices, from relational databases to NoSQL, from manual deployments to CI/CD pipelines: these shifts demand continuous learning. The engineers who thrived through these changes were those who embraced new ideas, experimented, and were willing to unlearn old habits.

PRO-Tip: Dedicate time, even if it's just a few hours a week, to continuous learning. Read technical books, follow industry leaders, experiment with new technologies, contribute to open source, or even teach others. The act of teaching is one of the most powerful ways to solidify your own understanding.

Keep learning. Keep teaching. Keep building.
As you all know, generative AI is reshaping how we build applications, but diving into LangChain code or orchestrating complex pipelines can be intimidating for newcomers. That's where I feel Langflow comes in very handy for first-timers trying to explore and build such applications. Langflow is a low-code, visual interface that lets you prototype and deploy LLM-powered applications without writing a single line of backend code. Whether you're building a chatbot, a document summarizer, or a custom retrieval-augmented generation (RAG) app, Langflow lets you do it visually and quickly. In this tutorial, I will walk you through building your first GenAI app using Langflow, step by step, with no prior LangChain experience needed.

Why Langflow?

Before we dive in, let's briefly understand what makes Langflow appealing:

- No backend coding required: Build apps without diving deep into Python or LangChain syntax.
- Rapid prototyping: Drag, drop, and connect blocks to test ideas instantly.
- Modular and extensible: Mix and match components like embeddings, loaders, memory, and LLMs.
- Visual debugging: Inspect node-level inputs and outputs at any stage of your flow.
- Multi-model support: Integrate OpenAI, Cohere, HuggingFace, PaLM, and more.
- Built for collaboration: Share flows or export them for integration with teams.

Langflow is especially valuable for:

- Data scientists prototyping quickly
- Developers avoiding boilerplate
- Product teams doing proof-of-concepts
- Educators and researchers demoing concepts

Prerequisites

Before we begin, make sure you have:

- A basic understanding of LLMs (like GPT)
- A working Python environment (Python 3.8+)
- An OpenAI API key (or any supported LLM provider)
- Node.js and npm installed (optional, for advanced deployment or front-end integration)

Step 1: Install Langflow (You Can Either Use Python or DataStax)

You can install Langflow using pip. It's as easy as:

Shell
pip install langflow
# Once installed, run the app:
langflow run

It will start a local server on http://localhost:7860, where you can start building your app visually. This web interface is where you'll design, test, and deploy your GenAI workflows.

Pro Tip: Create a virtual environment (venv) before installing Langflow to avoid dependency conflicts with other Python projects.

Step 2: Design Your App Flow Visually

Langflow gives you a drag-and-drop canvas to build your app. Let's say you want to build a PDF summarizer:

- Drag in a FileLoader node (like PyPDFLoader)
- Connect it to a Text Splitter
- Feed that into an Embedding Generator
- Store embeddings in a Vector Store (like FAISS)
- Link the store to a RetrievalQA Chain
- Add an LLM block (e.g., OpenAI or HuggingFace)
- Connect to an Input/Output Interface for users

The UI makes it super intuitive, with no manual coding required.
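For the curious, here is roughly what that visual flow corresponds to in plain LangChain code. This is a hedged sketch rather than part of the Langflow tutorial itself: import paths vary between LangChain versions, "report.pdf" is a placeholder file name, and it assumes an OPENAI_API_KEY environment variable is set.

Python
# Rough LangChain equivalent of the visual PDF-summarizer flow above.
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain.chains import RetrievalQA

docs = PyPDFLoader("report.pdf").load()                        # FileLoader node
chunks = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=100
).split_documents(docs)                                        # Text Splitter node
vectorstore = FAISS.from_documents(chunks, OpenAIEmbeddings()) # Embeddings + Vector Store nodes
qa = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(model="gpt-3.5-turbo"),
    retriever=vectorstore.as_retriever(),
)                                                              # RetrievalQA Chain + LLM nodes
print(qa.invoke({"query": "Summarize this document."}))        # Input/Output interface

Seeing the code side of the flow makes it easier to appreciate how much wiring Langflow handles for you on the canvas.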
Step 3: Add Your LLM Credentials

Click on the LLM node in your flow and paste your OpenAI API key, or use another supported provider. Langflow supports the following (I used OpenAI GPT-3.5 for my app):

- OpenAI (GPT-3.5 / GPT-4)
- HuggingFace
- Cohere
- Google PaLM
- Azure OpenAI
- Anthropic Claude (via API endpoints)

Step 4: Test Your Flow

Hit "Run" or test specific nodes in isolation. You'll instantly see how the data flows through your pipeline. You can debug each block and inspect outputs, which is perfect for understanding how your app works.

Step 5: Deploy or Export Your Flow

There are different ways. Langflow supports:

- Exporting to JSON or Python code
- Running locally as a Python script
- Deploying as a FastAPI app
- Integrating with a front end using APIs

To export your project:

Shell
langflow export my_app.json
langflow convert my_app.json --to=python

Build a Chatbot UI (Optional)

Langflow can integrate with:

- Streamlit
- React.js
- Gradio

Want to build a chatbot interface? Just connect your app's backend with a front end using Streamlit (a sketch of one way to implement the call_langflow_pipeline helper appears at the end of this article):

Python
import streamlit as st

user_input = st.text_input("Ask your PDF:")
if user_input:
    # call_langflow_pipeline is a placeholder for however you invoke your deployed flow.
    response = call_langflow_pipeline(user_input)
    st.write(response)

And that's it! You now have a working GenAI app that you can tweak, enhance, or scale.

Troubleshooting Tips

Issue | Solution
"Cannot find module" | Reinstall dependencies or check the Python path
Slow execution | Reduce chunk size or limit input tokens
API errors | Verify keys, rate limits, and model availability
UI not loading | Restart the server or clear the browser cache

Real-World Use Cases

- Legal document analyzer – Summarize clauses, search precedents, answer queries.
- Internal knowledge base – Load company docs and chat with them securely.
- Sales enablement tool – Summarize competitor reports, generate scripts.
- Academic research assistant – Digest papers and generate citations.
- Customer support assistant – Pull from FAQs, manuals, and tickets for real-time resolution.

Enterprises are using Langflow to rapidly prototype internal tools, MVPs, and AI assistants, cutting down dev cycles by weeks.

Wrapping Up

Langflow lets you go from idea to prototype to deployment in minutes. For developers and teams trying to bring AI to their apps quickly, it's a game-changer. With its visual flow editor, LLM integration, and plug-and-play tooling, it abstracts away much of the boilerplate that typically slows down GenAI development.

TL;DR

- Step 1: Install Langflow via pip – Langflow is available on PyPI and can be installed with a single command.
- Step 2: Design your app visually – Langflow lets you build GenAI apps by connecting modular components on a canvas, similar to how you'd wire up nodes in a no-code workflow tool.
- Step 3: Plug in your LLM API key – Langflow supports a range of providers out of the box. You can also switch models or test multiple providers to compare performance.
- Step 4: Test each component – This granular level of testing helps debug complex pipelines before they go live. It's especially useful when chaining multiple tools like document loaders, retrievers, and memory modules.
- Step 5: Export or deploy your app – Once your app is working as expected, you have several deployment options.
- Step 6: Add a front-end UI – To make your GenAI app more user-friendly and interactive, you can easily integrate Langflow with popular front-end frameworks. Langflow supports seamless integration with Streamlit, React.js, and Gradio, allowing you to build intuitive interfaces for end users.

Learn More

Langflow GitHub | LangChain Documentation | Streamlit | DZone AI Zone
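As promised above, here is one way the call_langflow_pipeline helper used in the Streamlit snippet might look. This is an assumption-heavy sketch: LANGFLOW_API_URL and the JSON payload shape are placeholders that depend on your Langflow version and how you deployed or exported the flow, so check the API panel in the Langflow UI (or your exported FastAPI app) for the exact endpoint and request format.

Python
import requests

# Placeholder: point this at the run endpoint of your deployed Langflow flow or exported FastAPI app.
LANGFLOW_API_URL = "http://localhost:7860/your-flow-endpoint"

def call_langflow_pipeline(question: str) -> str:
    """Send the user's question to the deployed flow and return its text answer."""
    # The payload keys below are illustrative; match them to what your deployment expects.
    payload = {"input_value": question}
    response = requests.post(LANGFLOW_API_URL, json=payload, timeout=60)
    response.raise_for_status()
    data = response.json()
    # Likewise, adjust this extraction to the response schema your flow returns.
    return data.get("result", str(data))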