Generative AI
AI technology is now more accessible, more intelligent, and easier to use than ever before. Generative AI, in particular, has transformed nearly every industry exponentially, creating a lasting impact driven by its (delivered) promises of cost savings, manual task reduction, and a slew of other benefits that improve overall productivity and efficiency. The applications of GenAI are expansive, and thanks to the democratization of large language models, AI is reaching every industry worldwide.

Our focus for DZone's 2025 Generative AI Trend Report is on the trends surrounding GenAI models, algorithms, and implementation, paying special attention to GenAI's impacts on code generation and software development as a whole. Featured in this report are key findings from our research and thought-provoking content written by everyday practitioners from the DZone Community, with topics including organizations' AI adoption maturity, the role of LLMs, AI-driven intelligent applications, agentic AI, and much more.

We hope this report serves as a guide to help readers assess their own organization's AI capabilities and how they can better leverage those in 2025 and beyond.
Welcome back to the “Text to Action” series, where we build intelligent systems that transform natural language into real-world actionable outcomes using AI. In Part 1, we established our foundation by creating an Express.js backend that connects to Google Calendar’s API. This gave us the ability to programmatically create calendar events through an exposed API endpoint. Today, we’re adding the AI magic — enabling users to simply type natural language descriptions like “Schedule a team meeting tomorrow at 3 pm” and have our system intelligently transform such words into adding an actual calendar event action. What We’re Building We’re bridging the gap between natural human language and structured machine data. This ability to parse and transform natural language is at the heart of modern AI assistants and automation tools. The end goal remains the same: create a system where users can type or say “create a party event at 5 pm on March 20” and instantly see it appear in their Google Calendar. This tutorial will show you how to: Set up a local language model using OllamaDesign an effective prompt for text-to-json conversionBuild a natural language API endpointCreate a user-friendly interface for text inputHandle date, time, and timezone complexities The complete code is available on GitHub. Prerequisites Before starting, please make sure you have: Completed Part 1 setupNode.js and npm installedOllama installed locallyThe llama3.2:latest model pulled via Ollama Plain Text # Install Ollama from https://ollama.com/ # Then pull the model: ollama pull llama3.2:latest Architecture Overview Here’s how our complete system works: User enters natural language text through the UI or API call.System sends text to Ollama with a carefully designed prompt.Ollama extracts structured data (event details, time, date, etc.).System passes the structured data to Google Calendar API.Google Calendar creates the event.User receives confirmation with event details. The magic happens in the middle steps, where we convert unstructured text to structured data that APIs can understand. Step 1: Creating the NLP Service First, let’s install the axios package for making HTTP requests to our local Ollama instance: Plain Text npm install axios Now, create a new file nlpService.js to handle the natural language processing. 
Here are the key parts (full code available on GitHub): JavaScript const axios = require('axios'); // Main function to convert text to calendar event const textToCalendarEvent = async (text, timezone) => { try { const ollamaEndpoint = process.env.OLLAMA_ENDPOINT || 'http://localhost:11434/api/generate'; const ollamaModel = process.env.OLLAMA_MODEL || 'llama3.2:latest'; const { data } = await axios.post(ollamaEndpoint, { model: ollamaModel, prompt: buildPrompt(text, timezone), stream: false }); return parseResponse(data.response); } catch (error) { console.error('Error calling Ollama:', error.message); throw new Error('Failed to convert text to calendar event'); } }; // The core prompt engineering part const buildPrompt = (text, timezone) => { // Get current date in user's timezone const today = new Date(); const formattedDate = today.toISOString().split('T')[0]; // YYYY-MM-DD // Calculate tomorrow's date const tomorrow = new Date(today.getTime() + 24*60*60*1000); const tomorrowFormatted = tomorrow.toISOString().split('T')[0]; // Get timezone information const tzString = timezone || Intl.DateTimeFormat().resolvedOptions().timeZone; return ` You are a system that converts natural language text into JSON for calendar events. TODAY'S DATE IS: ${formattedDate} USER'S TIMEZONE IS: ${tzString} Given a text description of an event, extract the event information and return ONLY a valid JSON object with these fields: - summary: The event title - description: A brief description of the event - startDateTime: ISO 8601 formatted start time - endDateTime: ISO 8601 formatted end time Rules: - TODAY'S DATE IS ${formattedDate} - all relative dates must be calculated from this date - Use the user's timezone for all datetime calculations - "Tomorrow" means ${tomorrowFormatted} - For dates without specified times, assume 9:00 AM - If duration is not specified, assume 1 hour for meetings/calls and 2 hours for other events - Include timezone information in the ISO timestamp Examples: Input: "Schedule a team meeting tomorrow at 2pm for 45 minutes" Output: {"summary":"Team Meeting","description":"Team Meeting","startDateTime":"${tomorrowFormatted}T14:00:00${getTimezoneOffset(tzString)}","endDateTime":"${tomorrowFormatted}T14:45:00${getTimezoneOffset(tzString)}"} Now convert the following text to a calendar event JSON: "${text}" REMEMBER: RESPOND WITH RAW JSON ONLY. NO ADDITIONAL TEXT OR FORMATTING. `; }; // Helper functions for timezone handling and response parsing // ... (See GitHub repository for full implementation) module.exports = { textToCalendarEvent }; The key innovation here is our prompt design that: Anchors the model to today’s dateProvides timezone awarenessGives clear rules for handling ambiguous casesShows examples of desired output format Step 2: Calendar Utility Function Utility module for calendar operations. 
Here’s the simplified version (utils/calendarUtils.js): JavaScript const { google } = require('googleapis'); // Function to create a calendar event using the Google Calendar API const createCalendarEvent = async ({ auth, calendarId = 'primary', summary, description, startDateTime, endDateTime }) => { const calendar = google.calendar({ version: 'v3', auth }); const { data } = await calendar.events.insert({ calendarId, resource: { summary, description: description || summary, start: { dateTime: startDateTime }, end: { dateTime: endDateTime } } }); return { success: true, eventId: data.id, eventLink: data.htmlLink }; }; module.exports = { createCalendarEvent }; Step 3: Updating the Express App Now, let’s update our app.js file to include the new natural language endpoint: JavaScript // Import the new modules const { textToCalendarEvent } = require('./nlpService'); const { createCalendarEvent } = require('./utils/calendarUtils'); // Add this new endpoint after the existing /api/create-event endpoint app.post('/api/text-to-event', async (req, res) => { try { const { text } = req.body; if (!text) { return res.status(400).json({ error: 'Missing required field: text' }); } // Get user timezone from request headers or default to system timezone const timezone = req.get('X-Timezone') || Intl.DateTimeFormat().resolvedOptions().timeZone; // Convert the text to a structured event with timezone awareness const eventData = await textToCalendarEvent(text, timezone); const { summary, description = summary, startDateTime, endDateTime } = eventData; // Create the calendar event using the extracted data const result = await createCalendarEvent({ auth: oauth2Client, summary, description, startDateTime, endDateTime }); // Add the parsed data for reference res.status(201).json({ ...result, eventData }); } catch (error) { console.error('Error creating event from text:', error); res.status(error.code || 500).json({ error: error.message || 'Failed to create event from text' }); } }); Step 4: Building the User Interface We’ll create a dedicated HTML page for natural language input (public/text-to-event.html). 
Here's a simplified version showing the main components: HTML <!DOCTYPE html> <html lang="en"> <head> <meta charset="UTF-8"> <meta name="viewport" content="width=device-width, initial-scale=1.0"> <title>Text to Calendar Event</title> <!-- CSS styles omitted for brevity --> </head> <body> <div class="nav"> <a href="/">Standard Event Form</a> <a href="/text-to-event.html">Natural Language Events</a> </div> <h1>Text to Calendar Event</h1> <p>Simply describe your event in natural language, and we'll create it in your Google Calendar.</p> <div class="container"> <h2>Step 1: Authenticate with Google</h2> <a href="/auth/google"><button class="auth-btn">Connect to Google Calendar</button></a> </div> <div class="container"> <h2>Step 2: Describe Your Event</h2> <div> <div>Try these examples:</div> <div class="example">Schedule a team meeting tomorrow at 2pm for 45 minutes</div> <div class="example">Create a dentist appointment on April 15 from 10am to 11:30am</div> <div class="example">Set up a lunch with Sarah next Friday at noon</div> </div> <form id="event-form"> <label for="text-input">Event Description</label> <textarea id="text-input" required placeholder="Schedule a team meeting tomorrow at 2pm for 45 minutes"></textarea> <button type="submit">Create Event</button> <div class="loading" id="loading">Processing your request <span></span></div> </form> <div id="result"></div> </div> <script> // JavaScript code to handle the form submission and examples // See GitHub repository for full implementation </script> </body> </html> The interface provides clickable examples and a text input area and displays results with a loading indicator. Step 5: Creating a Testing Script For easy command-line testing to automatically detect your current timezone, here’s a simplified version of our shell script test-text-to-event.sh: Shell #!/bin/bash # Test the text-to-event endpoint with a natural language input # Usage: ./test-text-to-event.sh "Schedule a meeting tomorrow at 3pm for 1 hour" TEXT_INPUT=${1:-"Schedule a team meeting tomorrow at 2pm for 45 minutes"} # Try to detect system timezone TIMEZONE=$(timedatectl show --property=Timezone 2>/dev/null | cut -d= -f2) # Fallback to a popular timezone if detection fails TIMEZONE=${TIMEZONE:-"America/Chicago"} echo "Sending text input: \"$TEXT_INPUT\"" echo "Using timezone: $TIMEZONE" echo "" curl -X POST http://localhost:3000/api/text-to-event \ -H "Content-Type: application/json" \ -H "X-Timezone: $TIMEZONE" \ -d "{\"text\": \"$TEXT_INPUT\"}" | json_pp echo "" Don’t forget to make it executable: chmod +x test-text-to-event.sh If you know your timezone already, you could pass it directly like below Shell curl -X POST http://localhost:3000/api/text-to-event \ -H "Content-Type: application/json" \ -H "X-Timezone: America/New_York" \ -d '{"text": "Schedule a team meeting tomorrow at 3pm for 1 hour"}' Step 6: Updating Environment Variables Create or update your .env file to include the Ollama settings: Plain Text # Google Calendar API settings (from Part 1) GOOGLE_CLIENT_ID=your_client_id_here GOOGLE_CLIENT_SECRET=your_client_secret_here GOOGLE_REDIRECT_URI=http://localhost:3000/auth/google/callback PORT=3000 # Ollama LLM settings OLLAMA_ENDPOINT=http://localhost:11434/api/generate OLLAMA_MODEL=llama3.2:latest # Server configuration PORT=3000 The Magic of Prompt Engineering The heart of our system lies in the carefully designed prompt that we send to the language model. Let’s break down why this prompt works so well: Context setting. 
We tell the model exactly what we want it to do — convert text to a specific JSON format.
Date anchoring. By providing today's date, we ground all relative date references.
Timezone awareness. We explicitly tell the model what timezone to use.
Specific format. We clearly define the exact JSON structure we expect back.
Rules for ambiguities. We give explicit instructions for handling edge cases.
Examples. We show the model exactly what good outputs look like.

This structured approach to prompt engineering is what makes our text conversion reliable and accurate.

Testing the Complete System

Now that everything is set up, let's test our system:

1. Start the server: npm run start
2. Make sure Ollama is running with the llama3.2:latest model
3. Open your browser to http://localhost:3000/text-to-event.html
4. Authenticate with Google (if you haven't already)
5. Enter a natural language description like "Schedule a team meeting tomorrow at 2 pm"
6. Click "Create Event" and watch as it appears in your calendar!

You can also test from the command line:

Plain Text
./test-text-to-event.sh "Set up a project review on Friday at 11am"

To test if your Ollama is running as expected, try this test query:

Shell
curl -X POST http://localhost:11434/api/generate \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3.2:latest",
    "prompt": "What is the capital of France?",
    "stream": false
  }'

The Complete Pipeline

Let's review what happens when a user enters text like "Schedule a team meeting tomorrow at 2 pm for 45 minutes":

1. The text is sent to our /api/text-to-event endpoint, along with the user's timezone.
2. Our NLP service constructs a prompt that includes today's date (for reference), the user's timezone, clear instructions and examples, and the user's text.
3. Ollama processes this prompt and extracts the structured event data:

JSON
{
  "summary": "Team Meeting",
  "description": "Team Meeting",
  "startDateTime": "2025-03-09T14:00:00-05:00",
  "endDateTime": "2025-03-09T14:45:00-05:00"
}

4. Our app passes this structured data to the Google Calendar API.
5. The Calendar API creates the event and returns a success message with a link.
6. We display the confirmation to the user with all the details.

This demonstrates the core concept of our "Text to Action" series: transforming natural language into structured data that can trigger real-world actions.

Conclusion and Next Steps

In this tutorial, we've built a powerful natural language interface for calendar event creation. We've seen how:

- A well-crafted prompt can extract structured data from free-form text
- Timezone and date handling requires careful consideration
- Modern LLMs like llama3.2 can understand and process natural language reliably

But we've only scratched the surface of what's possible. In future episodes of the "Text to Action" series, we'll explore:

- Adding voice recognition for hands-free event creation
- Building agent-based decision-making for more complex tasks
- Connecting to multiple services beyond just calendars

The complete code for this project is available on GitHub. Stay tuned for Part 3, where we'll add voice recognition to our system and make it even more accessible!

Resources

- GitHub Repository
- Part 1: Calendar API Foundation
- Ollama Documentation
- Google Calendar API

Let me know in the comments what you'd like to see in the next episode of this series!
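For reference, the two helper functions that were elided from nlpService.js earlier (timezone offset formatting and response parsing) might look roughly like the sketch below. This is an illustration only, not the repository's exact code; the real implementations may differ.

JavaScript
// Hypothetical sketch of the elided helpers in nlpService.js
const getTimezoneOffset = (tzString) => {
  // Produce an offset string such as "-05:00" for the given IANA timezone.
  const parts = new Intl.DateTimeFormat('en-US', {
    timeZone: tzString,
    timeZoneName: 'longOffset'
  }).formatToParts(new Date());
  const name = parts.find((p) => p.type === 'timeZoneName')?.value || 'GMT';
  const offset = name.replace('GMT', '');
  return offset || '+00:00'; // a bare "GMT" means UTC
};

const parseResponse = (raw) => {
  // Strip any Markdown code fences the model may wrap around its output, then parse the JSON.
  const cleaned = raw.trim()
    .replace(/^```(json)?/i, '')
    .replace(/```$/, '')
    .trim();
  return JSON.parse(cleaned);
};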
In Playwright, you can run tests using headed and headless modes. In the earlier versions of Playwright, running tests in headed mode was somewhat challenging. To improve the headed mode experience, Playwright introduced UI mode in version 1.32. Playwright UI mode provides a visual interface for running and debugging tests. It allows you to inspect elements, step through tests interactively, and get real-time feedback, making troubleshooting more intuitive. In this blog, we look at how to use Playwright UI mode to run and debug tests. What Is Playwright UI Mode? Playwright UI mode is an interactive feature that allows you to run, inspect, and debug tests in a visually intuitive environment. With UI mode, users can see exactly what’s happening on their website during test execution, making it easier to identify issues and improve test reliability. This interface brings a visual and user-friendly approach to testing, which contrasts with the traditionally code-centric methods of automation testing. Key Benefits Enables you to view the test run in real time. When each step is executed, the associated actions are presented side by side to demonstrate how the elements are used. Such live feedback may be useful to determine if the test script performs correctly when interacting with the elements and if there are any deviations.Allows you to stop test execution at any stage and check the page elements for issues such as timing or selector issues.Comes with a trace viewer that provides details about every test run. The trace viewer records all the actions performed while a test is in progress, using screenshots and network traffic. Makes it easy to find reliable selectors for page elements. When you hover over various elements, Playwright shows the most stable selectors. This capability is crucial in creating resilient tests, as fragile tests may fail due to wrong or weak selectors caused by UI deviations.Lets you easily execute test steps and see the results if you change them. This streamlines the testing process by reducing the number of full test cycles needed. It helps you save time, especially when working on large-scale projects, where tests can take time to execute completely.Supports automated screenshots and video recordings of test runs. These visual aids clearly represent the testing journey step by step, helping teams understand exactly where issues occur. Components of Playwright UI Mode The screenshot shows the Playwright UI mode, which provides a graphical interface for running and debugging Playwright tests. Let's break down each component in detail: Test Explorer (Left Panel) The test explorer comes with the following features: Filters tests (through search box) by keywords, tags (e.g., @tau), or metadata, helping to narrow down tests in large suites.Displays tests of all statuses (passed, failed, skipped) by default, with the option to filter specific states.Filters tests based on the selected browser (e.g., Chromium, Firefox, WebKit).The above screenshot shows the following test files: api.spec.jsexample.spec.jsuiLT.spec.jsvisual.spec.js You can also click on test files to execute or debug them. Timeline (Top Section): It displays a graph tracking test execution over time. This helps visualize the duration of each test or step. It is also useful for identifying performance bottlenecks and debugging slow-running tests. Action Explorer (Center Panel) It records and displays test execution details step by step. 
The Action Explorer includes multiple tabs: Actions: Logs user interactions (clicks, inputs, navigation, etc.).Metadata: Displays additional test attributes (e.g., environment, priority).Action: Lists recorded actions performed during the test.Before/After: Shows test setup and teardown steps.Pick Locator: Helps identify element locators for test scripts. Browser Preview (Right Panel) It provides a real-time view of test execution. In the screenshot, the browser shows that the KaneAI link is opened. Developer Tools (Bottom Panel) It contains tabs similar to browser dev tools: Locator: Assists in finding elements.Source: Displays the test script source code.Call: Logs API calls, function executions, or internal Playwright actions.Log: Provides a detailed execution log.Errors: Lists runtime errors (e.g., element not found, network failures).Console: Displays logs and debug messages.Network: Logs network requests made during test execution.Attachments: Stores test artifacts like screenshots, videos, and downloaded files.Annotations: Highlights test case statuses (e.g., skipped, fixme, slow). How to Use Playwright UI Mode? Now, let’s look at how to set up UI mode in Playwright: Install Playwright v1.32 by running the below command: JavaScript npm install @playwright/test@1.32 In case you want to install the latest version of Playwright, run the below command: JavaScript npm init playwright@latest Open the Playwright in UI mode by running the below command: JavaScript npx playwright test --ui It will open the Playwright Test Runner in your browser, where you can interact with your tests and gain insights into each step of the test execution process. This command is particularly helpful during development when you need a more visual, hands-on approach to test automation. The --ui flag enables UI mode, launching an interactive interface in your browser. After installation, the folder structure looks like as shown below: Consider the test script below. It navigates to the KaneAI page, verifies the title, and clicks a button inside a specific section. Then, it fills the First Name and Last Name fields with "KaneAI" and "Automation," respectively. It ensures the page loads correctly and inputs test data. JavaScript // @ts-check const { test, expect } = require("@playwright/test"); test("Open the site , verify the title, and fill data in the fields", async ({ page, }) => { await page.goto("https://www.lambdatest.com/kane-ai"); await expect(page).toHaveTitle(/Kane AI - World's first AI E2E Testing Agent/); await page.locator('section').filter({ hasText: 'KaneAI - Testing' }).getByRole('button').click(); await page.getByRole('textbox', { name: 'First Name*' }).fill('KaneAI') await page.getByRole('textbox', { name: 'Last Name*' }).fill('Automation') }); To run the above test script, use the command: JavaScript npx playwright test --ui Output Features of Playwright UI Mode Now let’s look at different UI mode features offered by Playwright and how you can use them: Filtering Tests You can filter the executed test script by text or @tag. Also, With the status ‘passed,’ it will display only the passed test cases. Status with ‘failed’ will display all failed test cases, and the status ‘skipped’ will display if there are any skipped test cases during the execution. Additionally, you can filter by the project (Chromium, Firefox, and WebKit) and display test cases that are executed in respective browsers. 
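Tag-based filtering works because tags are simply part of the test title. Here is a small hypothetical example (the @smoke tag and target URL are placeholders, not from the article's project):

JavaScript
// tagged.spec.js -- the @smoke tag below is illustrative
const { test, expect } = require('@playwright/test');

test('home page loads @smoke', async ({ page }) => {
  await page.goto('https://example.com');
  await expect(page).toHaveTitle(/Example Domain/);
});

Typing @smoke into the UI mode search box (or running npx playwright test --grep @smoke from the CLI) narrows the run to just the tagged tests.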
Pick Locator The Pick Locator feature in Playwright UI mode assists in identifying locators for elements on a webpage. To capture the locator of a specific field, simply click on the circle icon (as shown in the screenshot) and hover over the desired field. This will highlight and display the corresponding locator for easy selection. Actions and Metadata In the Actions tab, you can see all the actions we have performed against a particular test. When you hover over each command, you can see the change in DOM on the right side. In the Metadata tab, you can see the details about the particular test case start time, browser, viewport, and count, which include pages, actions, and events. Timeline View At the top of the trace, you can see a timeline view of each action of your test. The timeline shows image snapshots after hovering over it, and double-clicking any action will show its time range. The slider adjustment lets you choose additional actions that directly filter the Actions tab while controlling log visibility in the console and network tabs. Watch Feature The Watch feature enhances the testing experience by automatically rerunning the test when changes are made to the test file. This eliminates the need for manually re-executing tests after modifications, making the development and debugging process more efficient. To enable the Watch in Playwright UI mode, simply click on the watch icon next to the specific test case. Once enabled, any changes made to that test file will automatically trigger the test to re-execute. Source, Call, and Log Source In the Source tab, you can see the source code of the selected test case. Call In the Call tab, you can view the details of a particular command, including its start time, the time it takes to complete and other parameters. Log The Log tab provides the execution of the particular command in detail. In the screenshot below, you can see the command is attempting to fill the "Last Name" field with the value "Automation." It first waits for the textbox with the role textbox and the name Last Name* to be located. Once the locator resolves to the correct <input> element, the script attempts to perform the fill action. However, before filling, it ensures that the element is visible, enabled, and editable. Errors The Error tab lets you know the reason for test failure. In case any test fails, the Timeline at the top is also highlighted with a red color where the error comes. Console The Console tab shows the log from the test and browser. The icons below indicate whether the console log came from the browser or the test file. Network In the Network tab, you can see all the requests made during the test execution. You can filter the network requests and sort them by different types of requests, methods, status codes, content types, sizes, and durations. Attachments The Attachments tab helps with visual image comparison by identifying the difference between the images. By clicking on Diff, you can see the difference between the actual and expected images. Annotations Annotations are special markers that you can add to test suites to change the behavior of the test execution. Any of the tests are marked as test.skip(), test.fail(), test.fixme(), test.slow() will show in the UI mode. Overall, Playwright UI mode makes debugging tests easier with its interactive interface, but running tests locally has limitations. Wrapping Up With Playwright UI mode, you can fully access automated test functionality without any hindrances. 
The real-time visual testing framework enhances debugging, improves test reliability, and accelerates test development. Integrated features such as live test execution monitoring, step-by-step debugging, trace viewer, and selector inspection make the test automation process more efficient. Leveraging Playwright UI mode can optimize test automation workflows, leading to higher test quality and reduced debugging time. This streamlined approach enables you to create more effective test scripts and quickly resolve test failures.
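As a closing reference, the annotations that UI mode surfaces (skipped, fixme, slow) are ordinary test modifiers. A minimal hypothetical spec showing them:

JavaScript
// annotations.spec.js -- illustrative only
const { test } = require('@playwright/test');

test.skip('not relevant on this branch', async ({ page }) => {
  // Skipped tests still appear in UI mode, marked accordingly.
});

test.fixme('known bug, do not run yet', async ({ page }) => {
  // fixme marks the test as expected to fail and skips execution.
});

test('large report download', async ({ page }) => {
  test.slow(); // triples the default timeout and flags the test as slow
});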
There is no doubt that Python is one of the most popular programming languages, and it offers frameworks like Django and Flask. Django is the most popular among Python developers because it supports rapid development and pragmatic design: its Object Relational Mapping (ORM) tool, routing, and templating features make everyday work easier. Despite these strengths, developers still make common mistakes, such as poor application structure, incorrect resource placement, and writing fat views with skinny models. These problems trip up not only newcomers but experienced Python developers as well. In this article, I have listed the most common mistakes developers make and how to avoid them.

1. Developers Often Use Python's Global Environment for Project Dependencies

This mistake is usually made by new Python developers who don't know about Python's environment isolation features. Don't install project dependencies into the global environment: it creates dependency conflicts, because Python cannot use multiple versions of the same package simultaneously, and different projects will inevitably need different, irreconcilable versions of the same package. You can avoid this problem by isolating your Python environment:

Use a Virtual Environment
The virtualenv module builds virtual environments that are separated from the system environment. Running virtualenv creates a folder containing the executables and packages your Python project needs.

Virtualenvwrapper
Virtualenvwrapper is a Python package that is installed globally and provides a toolset for creating, deleting, and activating virtual environments, keeping all of them in a single folder.

Virtual Machines (VM)
This is one of the strongest forms of isolation, since an entire virtual machine is dedicated to your application. You can choose from tools such as VirtualBox, Parallels, and Proxmox, and pairing them with a VM automation tool like Vagrant makes the workflow even smoother.

Containers
For container automation, you can use Docker, which has a large ecosystem of third-party tools. Docker also includes a caching feature that speeds up container rebuilds, and it is easy to adopt: once you understand how Docker works, you will find many useful prebuilt images such as Postgres, MongoDB, Redis, and PySpark. These are the main options for mastering project dependency isolation and management.

2. Developers Do Not Pin Project Dependencies in a requirements.txt File

A Python project should start with an isolated environment and a requirements.txt file, and whenever developers install packages via pip/easy_install, they should also add them to that file. This makes it much easier to deploy the project to a server. Different versions of a package expose different modules, parameters, and functions, and even a small change in a dependency can break your project. It is therefore very important to pin the exact versions of your dependencies in your requirements.txt file.
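For example, a pinned requirements.txt lists exact versions rather than open-ended ones. The packages and version numbers below are illustrative; use whatever pip freeze reports for your own environment:

Plain Text
# requirements.txt -- every dependency pinned to an exact version
Django==4.2.11
djangorestframework==3.15.1
requests==2.31.0
psycopg2-binary==2.9.9

# Capture your current environment's exact versions with:
#   pip freeze > requirements.txt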
Moreover, the pip-tools package provides command-line tools that make managing those dependencies easy. It can automatically produce a requirements.txt file that pins all of your dependencies, and even a complete dependency tree. Also, keep a copy of your dependencies file somewhere durable, such as the file system, an S3 folder, FTP, or SFTP.

3. You Don't Know the Advantages of Both Function-Based Views and Class-Based Views

Function-based views are the traditional approach to implementing views. They are written as normal Python functions that take an HTTP request as an argument and return an HTTP response.

Function-Based Views (FBVs)
Here are the benefits of using function-based views (FBVs):

FBVs Offer Flexibility
FBVs allow a great deal of flexibility, enabling Python developers to use any Python function as a view, including third-party libraries and custom functions.

They Are Simple to Understand
Function-based views are easy to follow, which makes FBVs a great choice for small projects and simple views.

Familiarity
Since FBVs use ordinary function syntax, Python developers are already comfortable with them.

Class-Based Views
Class-based views (CBVs), on the other hand, provide abstract classes that carry out common development tasks. Here are the benefits of CBVs:

Structured API
On top of object-oriented programming, you benefit from a structured API, so your code ends up clearer and more readable.

Code Reusability
CBVs are reusable, and you can extend and modify them easily through subclassing.

Consistency
CBVs offer a consistent interface for handling the various HTTP request methods.

Modularity
Because CBVs are modular, you can split complicated views into smaller, reusable components.

4. Django Developers Write the Application Logic in Views Instead of Models

Writing the logic in views makes your views "fat" and your models "skinny." Avoid this mistake: put the application logic in models rather than in views. Break the logic into small methods on the model and reuse them from multiple places (the front-end UI, the admin interface, API endpoints, and so on) in just a few lines of code, instead of copying and pasting the same logic everywhere. For example, when you send an email to a user, extend the model with an email method rather than writing that logic in a view. This also makes your code easier to unit-test, because the email logic lives in one place instead of being tested over and over in every controller. So, next time you work on your project, write fat models and skinny views, as in the sketch below.
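A minimal sketch of that fat-model, skinny-view idea follows. The model, field names, and email helper are hypothetical, not code from any specific project:

Python
# models.py -- the business logic lives on the model ("fat model")
from django.core.mail import send_mail
from django.db import models


class Customer(models.Model):
    name = models.CharField(max_length=100)
    email = models.EmailField()

    def send_welcome_email(self):
        # Reusable from views, admin actions, API endpoints, or management commands.
        send_mail(
            subject=f"Welcome, {self.name}!",
            message="Thanks for signing up.",
            from_email="noreply@example.com",
            recipient_list=[self.email],
        )


# views.py -- the view only orchestrates ("skinny view")
# (in a real project: from .models import Customer)
from django.http import JsonResponse
from django.shortcuts import get_object_or_404


def welcome_customer(request, customer_id):
    customer = get_object_or_404(Customer, pk=customer_id)
    customer.send_welcome_email()
    return JsonResponse({"status": "sent"})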
Welcome to the “Text to Action” series, where we build intelligent systems that transform natural language into real-world actionable outcomes using AI. To understand the concept better, let’s start simple by building a Smart Calendar AI Assistant. Soon, we’ll tackle more complex challenges — from smart home control to document automation — as we master the art of turning words into actions. Our goal for this project is to create a working AI system where you can simply type or say, “Create a party event at 5 pm on March 20” and watch it instantly appear in your Google Calendar. Part 1 focuses on building the backend foundation: an Express.js backend that connects to Google Calendar’s API. This will handle the actual event creation before we add natural language processing in future episodes. Let’s begin by creating our calendar integration! This tutorial shows you how to authenticate with OAuth2 and create a simple Express endpoint that adds events to your Google Calendar — perfect for integrating calendar functionality into any application. Create a simple REST API endpoint that adds events to Google Calendar with minimal code. Links The complete code is available on vivekvells/text-to-calendar-aiThis tutorial video What We're Building A lightweight Express.js API that exposes a single endpoint for creating Google Calendar events. This API will: Authenticate with Google using OAuth2Add events to your primary calendarReturn event details and links Prerequisites JavaScript # Install Node.js (v14+) and npm npm install express googleapis dotenv You'll need OAuth credentials from the Google Cloud Console. Project Structure JavaScript text-to-calendar/ ├── app.js # Our entire application ├── public/ # Static files │ └── index.html # Simple web interface └── .env # Environment variables The Code: Complete Express Application JavaScript // app.js - Google Calendar API with Express require('dotenv').config(); const express = require('express'); const { google } = require('googleapis'); const fs = require('fs'); const path = require('path'); const app = express(); app.use(express.json()); app.use(express.static('public')); // Configure OAuth const oauth2Client = new google.auth.OAuth2( process.env.GOOGLE_CLIENT_ID, process.env.GOOGLE_CLIENT_SECRET, process.env.GOOGLE_REDIRECT_URI || 'http://localhost:3000/auth/google/callback' ); // Load saved tokens if available try { const tokens = JSON.parse(fs.readFileSync('tokens.json')); oauth2Client.setCredentials(tokens); } catch (e) { /* No tokens yet */ } // Auth routes app.get('/auth/google', (req, res) => { const authUrl = oauth2Client.generateAuthUrl({ access_type: 'offline', scope: ['https://www.googleapis.com/auth/calendar'] }); res.redirect(authUrl); }); app.get('/auth/google/callback', async (req, res) => { const { tokens } = await oauth2Client.getToken(req.query.code); oauth2Client.setCredentials(tokens); fs.writeFileSync('tokens.json', JSON.stringify(tokens)); res.redirect('/'); }); // API endpoint to create calendar event app.post('/api/create-event', async (req, res) => { try { // Check if we have the required fields const { summary, description, startDateTime, endDateTime } = req.body; if (!summary || !startDateTime || !endDateTime) { return res.status(400).json({ error: 'Missing required fields: summary, startDateTime, endDateTime' }); } // Create the calendar event const calendar = google.calendar({ version: 'v3', auth: oauth2Client }); const response = await calendar.events.insert({ calendarId: 'primary', resource: { summary, description: 
description || summary, start: { dateTime: startDateTime }, end: { dateTime: endDateTime } } }); res.status(201).json({ success: true, eventId: response.data.id, eventLink: response.data.htmlLink }); } catch (error) { console.error('Error creating event:', error); res.status(error.code || 500).json({ error: error.message || 'Failed to create event' }); } }); // Start server const PORT = process.env.PORT || 3000; app.listen(PORT, () => console.log(`Server running at http://localhost:${PORT}`)); How to Use 1. Set up Environment Variables Create a .env file: Plain Text GOOGLE_CLIENT_ID=your_client_id GOOGLE_CLIENT_SECRET=your_client_secret GOOGLE_REDIRECT_URI=http://localhost:3000/auth/google/callback PORT=3000 2. Authenticate With Google Visit http://localhost:3000/auth/google in your browser to connect to Google Calendar. 3. Create an Event Use a POST request to the API endpoint: JavaScript curl -X POST http://localhost:3000/api/create-event \ -H "Content-Type: application/json" \ -d '{ "summary": "Team Meeting", "description": "Weekly team status update", "startDateTime": "2025-03-10T14:00:00-07:00", "endDateTime": "2025-03-10T15:00:00-07:00" }' Sample response: JavaScript { "success": true, "eventId": "e0ae1vv8gkop6bcbb5gqilotrs", "eventLink": "https://www.google.com/calendar/event?eid=ZTBhZTF2djhna29wNmJjYmI1Z3FpbG90cnMgdml2ZWtzbWFpQG0" } API Endpoint Details POST /api/create-event – Create a calendar event Request: summary (String, required) – Event titledescription (String) – Event detailsstartDateTime (ISO 8601) – When event startsendDateTime (ISO 8601) – When event ends Response format: JavaScript { "success": true, "eventId": "e0ae1vv8gkop6bcbb5gqilotrs", "eventLink": "https://www.google.com/calendar/event?eid=..." } OAuth2 Authentication Flow User visits /auth/google endpoint.User is redirected to the Google consent screen.After granting access, Google redirects to /auth/google/callback. The app stores OAuth tokens for future API calls.API is now ready to create events. Troubleshooting OAuth setup can be tricky. If you encounter issues, refer to the OAUTH_SETUP.md in the repository, which contains a detailed troubleshooting guide for common OAuth errors. Security Considerations Store OAuth tokens securely in production (not in a local file)Use HTTPS for all API endpoints in productionConsider rate limiting to prevent abuseImplement proper error handling and validation Conclusion With just under 60 lines of core code, we've created a functional API that connects to Google Calendar. This foundation can be extended to support more calendar operations like retrieving, updating, or deleting events. The complete code is available on text-to-calendar-ai. The same approach can be applied to integrate with other Google services or third-party APIs that use OAuth2 authentication.
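One small, practical refinement related to the security notes above: the googleapis OAuth2 client emits a 'tokens' event whenever it obtains new credentials (including refreshed access tokens), which is a convenient hook for persisting them somewhere safer than a local file. A minimal sketch; saveTokensToStore is a hypothetical storage function you would implement against your own secret store:

JavaScript
// Persist refreshed credentials whenever the client obtains new tokens.
oauth2Client.on('tokens', async (tokens) => {
  if (tokens.refresh_token) {
    // Only present on the first authorization; store it securely.
    await saveTokensToStore({ refresh_token: tokens.refresh_token });
  }
  await saveTokensToStore({
    access_token: tokens.access_token,
    expiry_date: tokens.expiry_date
  });
});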
I still remember the day our CTO walked into the engineering huddle and declared, "We're moving everything to Kubernetes." It was 2017, and like many teams caught in the container hype cycle, we dove in headfirst with more excitement than wisdom. What followed was a sobering 18-month journey of steep learning curves, 3 AM incident calls, and the gradual realization that we'd traded one set of operational headaches for another. Fast forward to today, I'm deploying containerized applications without managing a single node. No upgrades. No capacity planning. No security patching. Yet, I still have the full power of Kubernetes' declarative API at my fingertips. The serverless Kubernetes revolution is here, and it's changing everything about how we approach container orchestration. The Evolution I've Witnessed Firsthand Having worked with Kubernetes since its early days, I've lived through each phase of its management evolution: Phase 1: The DIY Era (2015-2018) Our first production Kubernetes cluster was a badge of honor — and an operational nightmare. We manually set up everything: etcd clusters, multiple master nodes for high availability, networking plugins that mysteriously failed, and storage integrations that tested the limits of our patience. We became experts by necessity, learning Kubernetes internals in painful detail. I filled three notebooks with command-line incantations, troubleshooting flows, and architecture diagrams. New team members took weeks to ramp up. We were doing cutting-edge work, but at a staggering operational cost. Phase 2: Managed Control Planes (2018-2020) When GKE, EKS, and AKS matured, it felt like a revelation. "You mean we don't have to manage etcd backups anymore?" The relief was immediate — until we realized we still had plenty of operational responsibilities. Our team still agonized over node sizing, Kubernetes version upgrades, and capacity management. I spent countless hours tuning autoscaling parameters and writing Terraform modules. We eliminated some pain, but our engineers were still spending 20-30% of their time on infrastructure rather than application logic. Phase 3: Advanced Management Tooling (2020-2022) As our company expanded to multiple clusters across different cloud providers, we invested heavily in management layers. Rancher became our control center, and we built automation for standardizing deployments. Tools improved, but complexity increased. Each new feature or integration point added cognitive load. Our platform team grew to five people — a significant investment for a mid-sized company. We were more sophisticated, but not necessarily more efficient. Phase 4: The Serverless Awakening (2022-Present) My epiphany came during a late-night production issue. After spending hours debugging a node-level problem, I asked myself: "Why are we still dealing with nodes in 2022?" That question led me down the path to serverless Kubernetes, and I haven't looked back. What Makes Kubernetes Truly "Serverless"? Through trial and error, I've developed a practical definition of what constitutes genuine serverless Kubernetes: You never think about nodes. Period. No sizing, scaling, patching, or troubleshooting. If you're SSHing into a node, it's not serverless.You pay only for what you use. Our bill now scales directly with actual workload usage. Last month, our dev environment cost dropped 78% because it scaled to zero overnight and on weekends.Standard Kubernetes API. The critical feature that separates this approach from traditional PaaS. 
My team uses the same YAML, kubectl commands, and CI/CD pipelines we've already mastered.Instant scalability. When our product hit the front page of Product Hunt, our API scaled from handling 10 requests per minute to 3,000 in seconds, without any manual intervention.Zero operational overhead. We deleted over 200 runbooks and automation scripts that were dedicated to cluster maintenance. Real Architectural Approaches I've Evaluated When exploring serverless Kubernetes options, I found four distinct approaches, each with unique strengths and limitations: 1. The Virtual Kubelet Approach We first experimented with Azure Container Instances (ACI) via Virtual Kubelet. The concept was elegant — a virtual node that connected our cluster to a serverless backend. This worked well for batch processing workloads but introduced frustrating latency when scaling from zero. Some of our Kubernetes manifests needed modifications, particularly those using DaemonSets or privileged containers. 2. Control Plane + Serverless Compute Our team later moved some workloads to Google Cloud Run for Anthos. I appreciated maintaining a dedicated control plane (for familiarity) while offloading the compute layer. This hybrid approach provided excellent Kubernetes compatibility. The downside? We still paid for the control plane even when idle, undermining the scale-to-zero economics. 3. On-Demand Kubernetes For our development environments, we've recently adopted an on-demand approach, where the entire Kubernetes environment — control plane included — spins up only when needed. The cost savings have been dramatic, but we've had to architect around cold start delays. We've implemented clever prewarming strategies for critical environments before high-traffic events. 4. Kubernetes-Compatible API Layers I briefly tested compatibility layers that provide Kubernetes-like APIs on top of other orchestrators. While conceptually interesting, we encountered too many edge cases where standard Kubernetes features behaved differently. Platform Experiences: What Actually Worked for Us Rather than providing generic platform overviews, let me share my team's real experiences with these technologies: AWS Fargate for EKS After running Fargate for 14 months, here's my honest assessment: What I loved: The seamless integration with existing EKS deployments lets us migrate workloads gradually. Our developers continued using familiar tools while we eliminated node management behind the scenes. The per-second billing granularity provided predictable costs.What caused headaches: Our monitoring stack relied heavily on DaemonSets, requiring significant rearchitecting. Storage limitations forced us to migrate several stateful services to managed alternatives. Cold starts occasionally impacted performance during low-traffic periods.Pro tip: Create separate Fargate profiles with appropriate sizing for different workload types — we reduced costs by 23% after segmenting our applications this way. Google Cloud Run for Anthos We deployed a new microservice architecture using this platform last year: What worked brilliantly: The sub-second scaling from zero consistently impressed us. The Knative Foundation provided an elegant developer experience, particularly for HTTP services. Traffic splitting for canary deployments became trivially easy.Where we struggled: Building effective CI/CD pipelines required additional work. Some of our batch processing workloads weren't ideal fits for the HTTP-centric model. 
Cost visibility was initially challenging.Real-world insight: Invest time in setting up detailed monitoring for Cloud Run services. We missed several performance issues until implementing custom metrics dashboards. Azure Container Apps For our .NET-based services, we evaluated Azure Container Apps: Standout features: The built-in KEDA-based autoscaling worked exceptionally well for event-driven workloads. The revisions concept for deployment management simplified our release process.Limitations we encountered: The partial Kubernetes API implementation meant we couldn't directly port all our existing manifests. Integration with legacy on-premises systems required additional networking configuration.Lesson learned: Start with greenfield applications rather than migrations to minimize friction with this platform. Implementation Lessons from the Trenches After transitioning multiple environments to serverless Kubernetes, here are the pragmatic lessons that don't typically make it into vendor documentation: Application Architecture Reality Check Not everything belongs in serverless Kubernetes. Our journey taught us to be selective: Perfect fits. Our API gateways, web frontends, and event processors thrived in serverless environments.Problematic workloads. Our ML training jobs, which needed GPU access and ran for hours, remained on traditional nodes. A database with specific storage performance requirements stayed on provisioned infrastructure.Practical adaptation. We created a "best of both worlds" architecture, using serverless for elastic workloads while maintaining traditional infrastructure for specialized needs. The Cost Model Shift That Surprised Us Serverless dramatically changed our cost structure: Before: Predictable but inefficient monthly expenses regardless of traffic.After: Highly efficient but initially less predictable costs that closely tracked usage.How we adapted: We implemented ceiling limits on autoscaling to prevent runaway costs. We developed resource request guidelines for teams to prevent over-provisioning. Most importantly, we built cost visibility tooling so teams could see the direct impact of their deployment decisions. Developer Experience Transformation Transitioning to serverless required workflow adjustments: Local development continuity. We standardized on kind (Kubernetes in Docker) for local development, ensuring compatibility with our serverless deployments.Troubleshooting changes. Without node access, we invested in enhanced logging and tracing. Distributed tracing, in particular, became essential rather than optional.Deployment pipeline adjustments. We built staging environments that closely mimicked production serverless configurations to catch compatibility issues early. Security Model Adaptation Security practices evolved significantly: Shared responsibility clarity. We documented clear boundaries between provider responsibilities and our security obligations.IAM integration. We moved away from Kubernetes RBAC for some scenarios, leveraging cloud provider identity systems instead.Network security evolution. Traditional network policies gave way to service mesh implementations for fine-grained control. Real-World Outcomes From Our Transition The impact of our serverless Kubernetes adoption went beyond technical architecture: Team Structure Transformation Our platform team of five shrunk to two people, with three engineers reallocated to product development. The remaining platform engineers focused on developer experience rather than firefighting. 
The on-call rotation, once dreaded for its 3 AM Kubernetes node issues, now primarily handles application-level concerns. Last quarter, we had zero incidents related to infrastructure. Business Agility Improvements Product features that once took weeks to deploy now go from concept to production in days. Our ability to rapidly scale during demand spikes allowed the marketing team to be more aggressive with promotions, knowing the platform would handle the traffic. Perhaps most significantly, we reduced our time-to-market for new initiatives by 40%, giving us an edge over competitors still managing their own Kubernetes infrastructure. Economic Impact After full adoption of serverless Kubernetes: Development environment costs decreased by 78%Overall infrastructure spend reduced by 32%Engineer productivity increased by approximately 25%Time spent on infrastructure maintenance dropped by over 90% Honest Challenges You'll Face No transformation is without its difficulties. These are the real challenges we encountered: Debugging complexity. Without node access, some troubleshooting scenarios became more difficult. We compensated with enhanced observability but still occasionally hit frustrating limitations.Ecosystem compatibility gaps. Several of our favorite Kubernetes tools didn't work as expected in serverless environments. We had to abandon some tooling and adapt others.The cold start compromise. We implemented creative solutions for cold start issues, including keepalive mechanisms for critical services and intelligent prewarming before anticipated traffic spikes.Migration complexity. Moving existing applications required more effort than we initially estimated. If I could do it again, I'd allocate 50% more time for the migration phase. Where Serverless Kubernetes Is Heading Based on industry connections and my own observations, here's where I see serverless Kubernetes evolving: Cost Optimization Tooling The next frontier is intelligent, automated cost management. My team is already experimenting with tools that automatically adjust resource requests based on actual usage patterns. Machine learning-driven resource optimization will likely become standard. Developer Experience Convergence The gap between local development and serverless production environments is narrowing. New tools emerging from both startups and established vendors are creating seamless development experiences that maintain parity across environments. Edge Computing Expansion I'm particularly excited about how serverless Kubernetes is enabling edge computing scenarios. Projects we're watching are bringing lightweight, serverless Kubernetes variants to edge locations with centralized management and zero operational overhead. Hybrid Architectures Standardization The most practical approach for many organizations will be hybrid deployments — mixing traditional and serverless Kubernetes. Emerging patterns and tools are making this hybrid approach more manageable and standardized. Final Thoughts When we started our Kubernetes journey years ago, we accepted operational complexity as the cost of admission for container orchestration benefits. Serverless Kubernetes has fundamentally changed that equation. Today, our team focuses on building products rather than maintaining infrastructure. We deploy with confidence to environments that scale automatically, cost-efficiently, and without operational burden. 
For us, serverless Kubernetes has delivered on the original promise of containers: greater focus on applications rather than infrastructure. Is serverless Kubernetes right for every workload? Absolutely not. Is it transforming how forward-thinking teams deploy applications? Without question.

References

- Kubernetes Virtual Kubelet documentation
- CNCF Serverless Landscape
- AWS Fargate for EKS
- Google Cloud Run for Anthos
- Azure Container Apps
- Knative documentation
When you only have a few data sources (e.g., PDFs, JSON) that are required in your generative AI application, building RAG might not be worth the time and effort. In this article, I'll show how you can use Google Gemini to retrieve context from three data sources. I'll also show how you can combine the context and ground results using Google search. This enables the end user to combine real-time information from Google Search with their internal data sources. Application Overview I'll only cover the code needed for Gemini and getting the data rather than building the entire application. Please note that this code is for demonstration purposes only. If you want to implement it, follow best practices such as using a key management service for API keys, error handling, etc. This application can answer any question related to events occurring in Philadelphia (I'm only using Philadelphia as an example because I found some good public data.) The data sources I used to send context to Gemini were a Looker report that has a few columns related to car crashes in Philadelphia for 2023, Ticketmaster events occurring for the following week, and weather for the following week. Parts of the code below were generated using Gemini 1.5 Pro and Anthropic Claude Sonnet 3.5. Data Sources I have all my code in three different functions for the API calls to get data in a file called api_handlers. App.py imports from api_handlers and sends the data to Gemini. Let's break down the sources in more detail. Application files Looker Looker is Google's enterprise BI capability. Looker is an API-first platform. Almost anything you can do in the UI can be achieved using the Looker SDK. In this example, I'm executing a Looker report and saving the results to JSON. Here's a screenshot of the report in Looker. Looker report Here's the code to get data from the report using the Looker SDK. Python def get_crash_data(): import looker_sdk from looker_sdk import models40 as models import os import json sdk = looker_sdk.init40("looker.ini") look_id = "Enter Look ID" try: response = sdk.run_look(look_id=look_id, result_format="json") print('looker done') return json.loads(response) except Exception as e: print(f"Error getting Looker data: {e}") return [] This code imports looker_sdk, which is required to interact with Looker reports, dashboards, and semantic models using the API. Looker.ini is a file where the Looker client ID and secret are stored. This document shows how to get API credentials from Looker. You get the look_id from the Looker's Look URL. A Look in Looker is a report with a single visual. After that, the run_look command executes the report and saves the data to JSON. The response is returned when this function is called. Ticketmaster Here's the API call to get events coming from Ticketmaster. 
Python def get_philly_events(): import requests from datetime import datetime, timedelta base_url = "https://app.ticketmaster.com/discovery/v2/events" start_date = datetime.now() end_date = start_date + timedelta(days=7) params = { "apikey": "enter", "city": "Philadelphia", "stateCode": "PA", "startDateTime": start_date.strftime("%Y-%m-%dT%H:%M:%SZ"), "endDateTime": end_date.strftime("%Y-%m-%dT%H:%M:%SZ"), "size": 50, "sort": "date,asc" } try: response = requests.get(base_url, params=params) if response.status_code != 200: return [] data = response.json() events = [] for event in data.get("_embedded", {}).get("events", []): venue = event["_embedded"]["venues"][0] event_info = { "name": event["name"], "date": event["dates"]["start"].get("dateTime", "TBA"), "venue": event["_embedded"]["venues"][0]["name"], "street": venue.get("address", {}).get("line1", "") } events.append(event_info) return events except Exception as e: print(f"Error getting events data: {e}") return [] I'm using the Ticketmaster Discovery API to get the name, date, venue, and street details for the next 7 days. Since this is an HTTP GET request, you can use the requests library to make the GET request. If the result is successful, the response gets saved as JSON to the data variable. After that, the code loops through the data, and puts the information in a dictionary called events_info, which gets appended to the events list. The final piece of data is weather. Weather data comes from NOAA weather API, which is also free to use. Python def get_philly_weather_forecast(): import requests from datetime import datetime, timedelta import json lat = "39.9526" lon = "-75.1652" url = f"https://api.weather.gov/points/{lat},{lon}" try: # Get API data response = requests.get(url, headers={'User-Agent': 'weatherapp/1.0'}) response.raise_for_status() grid_data = response.json() forecast_url = grid_data['properties']['forecast'] # Get forecast data forecast_response = requests.get(forecast_url) forecast_response.raise_for_status() forecast_data = forecast_response.json() weather_data = { "location": "Philadelphia, PA", "forecast_generated": datetime.now().strftime("%Y-%m-%d %H:%M:%S"), "data_source": "NOAA Weather API", "daily_forecasts": [] } # Process forecast data - take 14 periods to get 7 full days periods = forecast_data['properties']['periods'][:14] # Get 14 periods (7 days × 2 periods per day) # Group periods into days current_date = None daily_data = None for period in periods: period_date = period['startTime'][:10] # Get just the date part of period is_daytime = period['isDaytime'] # If we're starting a new day if period_date != current_date: # Save the previous day's data if it exists if daily_data is not None: weather_data["daily_forecasts"].append(daily_data) # Start a new daily record current_date = period_date daily_data = { "date": period_date, "forecast": { "day": None, "night": None, "high_temperature": None, "low_temperature": None, "conditions": None, "detailed_forecast": None } } # Update the daily data based on whether it's day or night period_data = { "temperature": { "value": period['temperature'], "unit": period['temperatureUnit'] }, "conditions": period['shortForecast'], "wind": { "speed": period['windSpeed'], "direction": period['windDirection'] }, "detailed_forecast": period['detailedForecast'] } if is_daytime: daily_data["forecast"]["day"] = period_data daily_data["forecast"]["high_temperature"] = period_data["temperature"] daily_data["forecast"]["conditions"] = period_data["conditions"] 
daily_data["forecast"]["detailed_forecast"] = period_data["detailed_forecast"] else: daily_data["forecast"]["night"] = period_data daily_data["forecast"]["low_temperature"] = period_data["temperature"] # Append the last day's data if daily_data is not None: weather_data["daily_forecasts"].append(daily_data) # Keep only 7 days of forecast weather_data["daily_forecasts"] = weather_data["daily_forecasts"][:7] return json.dumps(weather_data, indent=2) except Exception as e: print(f"Error with NOAA API: {e}") return json.dumps({ "error": str(e), "location": "Philadelphia, PA", "forecast_generated": datetime.now().strftime("%Y-%m-%d %H:%M:%S"), "daily_forecasts": [] }, indent=2) The API doesn't require a key but it does require latitude and longitude in the request. The API request is made and saved as JSON in forecast_data. The weather data is broken out by two periods in a day: day and night. The code loops through 14 times times and keeps only 7 days of forecast. I'm interested in temperature, forecast details, and wind speed. It also gets the high and low temperatures. Bringing It All Together Now that we have the necessary code to get our data, we will have to execute those functions and send them to Gemini as the initial context. You can get the Gemini API key from Google AI Studio. The code below adds the data to Gemini's chat history. Python from flask import Flask, render_template, request, jsonify import os from google import genai from google.genai import types from api_handlers import get_philly_events, get_crash_data, get_philly_weather_forecast from dotenv import load_dotenv # Load environment variables load_dotenv() app = Flask(__name__) # Initialize Gemini client client = genai.Client( api_key='Enter Key Here', ) # Global chat history chat_history = [] def initialize_context(): try: # Get API data events = get_philly_events() looker_data = get_crash_data() weather_data = get_philly_weather_forecast() # Format events data events_formatted = "\n".join([ f"- {event['name']} at {event['venue']} {event['street']} on {event['date']}" for event in events ]) # Create system context system_context = f"""You are a helpful AI assistant focused on Philadelphia. You have access to the following data that was loaded when you started: Current Philadelphia Events (Next 7 Days): {events_formatted} Crash Analysis Data: {looker_data} Instructions: 1. Use this event and crash data when answering relevant questions 2. For questions about events, reference the specific events listed above 3. For questions about crash data, use the analysis provided 4. For other questions about Philadelphia, you can provide general knowledge 5. Always maintain a natural, conversational tone 6. Use Google Search when needed for current information not in the provided data Remember: Your events and crash data is from system initialization and represents that point in time.""" # Add context to chat history chat_history.append(types.Content( role="user", parts=[types.Part.from_text(text=system_context)] )) print("Context initialized successfully") return True except Exception as e: print(f"Error initializing context: {e}") return False The final step is to get the message from the user and call Gemini's Flash 2.0 model. Notice how the model also takes a parameter called tools=[types.Tool(google_search=types.GoogleSearch())]. This is the parameter that uses Google search to ground results. If the answer isn't in one of the data sources provided, Gemini will do a Google search to find the answer. 
This is useful if you had information, such as events that weren't in Ticketmaster, but you wanted to know about them. I used Gemini to help get a better prompt to give during the initial context initialization. Python from flask import Flask, render_template, request, jsonify import os from google import genai from google.genai import types from api_handlers import get_philly_events, get_crash_data, get_philly_weather_forecast from dotenv import load_dotenv # Load environment variables load_dotenv() app = Flask(__name__) # Initialize Gemini client client = genai.Client( api_key='Enter Key Here', ) # Global chat history chat_history = [] def initialize_context(): """Initialize context with events and Looker data""" try: # Get initial data events = get_philly_events() looker_data = get_crash_data() weather_data = get_philly_weather_forecast() # Format events data to present better events_formatted = "\n".join([ f"- {event['name']} at {event['venue']} {event['street']} on {event['date']}" for event in events ]) # Create system context system_context = f"""You are a helpful AI assistant focused on Philadelphia. You have access to the following data that was loaded when you started: Philadelphia Events for the next 7 Days: {events_formatted} Weather forecast for Philadelphia: {weather_data} Crash Analysis Data: {looker_data} Instructions: 1. Use this events, weather, and crash data when answering relevant questions 2. For questions about events, reference the specific events listed above 3. For questions about crash data, use the analysis provided 4. For questions about weather, use the data provided 5. For other questions about Philadelphia, you can provide general knowledge 6. Use Google Search when needed for current information not in the provided data Remember: Your events and crash data is from system initialization and represents that point in time.""" # Add context to chat history chat_history.append(types.Content( role="user", parts=[types.Part.from_text(text=system_context)] )) print("Context initialized successfully") return True except Exception as e: print(f"Error initializing context: {e}") return False @app.route('/') def home(): return render_template('index.html') @app.route('/chat', methods=['POST']) def chat(): try: user_message = request.json.get('message', '') if not user_message: return jsonify({'error': 'Message required'}), 400 # Add user message to history chat_history.append(types.Content( role="user", parts=[types.Part.from_text(text=user_message)] )) # Configure generation settings generate_content_config = types.GenerateContentConfig( temperature=0.9, top_p=0.95, top_k=40, max_output_tokens=8192, tools=[types.Tool(google_search=types.GoogleSearch())], ) # Generate response using full chat history response = client.models.generate_content( model="gemini-2.0-flash", contents=chat_history, config=generate_content_config, ) # Add assistant response to history chat_history.append(types.Content( role="assistant", parts=[types.Part.from_text(text=response.text)] )) return jsonify({'response': response.text}) except Exception as e: print(f"Error in chat endpoint: {e}") return jsonify({'error': str(e)}), 500 if __name__ == '__main__': # Initialize context before starting print("Initializing context...") if initialize_context(): app.run(debug=True) else: print("Failed to initialize context") exit(1) Final Words I'm sure there are other ways to initialize context rather than using RAG. This is just one approach that also grounds Gemini using Google search.
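Once the Flask app is running, you can sanity-check the /chat endpoint without the HTML front end. The snippet below is a minimal sketch, assuming the app is listening on Flask's default development address (http://127.0.0.1:5000); adjust the URL and the question to suit your setup.
Python
import requests

# Ask the running Flask app a question; the /chat route shown above forwards the
# chat history to Gemini and returns the model's reply in the "response" field.
resp = requests.post(
    "http://127.0.0.1:5000/chat",  # assumption: default Flask dev server address
    json={"message": "What events are happening in Philadelphia this weekend?"},
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["response"])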
DZone events bring together industry leaders, innovators, and peers to explore the latest trends, share insights, and tackle industry challenges. From Virtual Roundtables to Fireside Chats, our events cover a wide range of topics, each tailored to provide you, our DZone audience, with practical knowledge, meaningful discussions, and support for your professional growth.
DZone Events Happening Soon
Below, you’ll find upcoming events that you won't want to miss.
Unpacking the 2025 Developer Experience Trends Report: Insights, Gaps, and Putting it into Action
Date: March 19, 2025; Time: 1:00 PM ET; Register for Free!
We’ve just seen the 2025 Developer Experience Trends Report from DZone, and while it shines a light on important themes like platform engineering, developer advocacy, and productivity metrics, there are some key gaps that deserve attention. Join Cortex Co-founders Anish Dhar and Ganesh Datta for a special webinar, hosted in partnership with DZone, where they’ll dive into what the report gets right and challenge the assumptions shaping the DevEx conversation. Their take? Developer experience is grounded in clear ownership. Without ownership clarity, teams face accountability challenges, cognitive overload, and inconsistent standards, ultimately hampering productivity. Don’t miss this deep dive into the trends shaping your team’s future.
Accelerating Software Delivery: Unifying Application and Database Changes in Modern CI/CD
Date: March 25, 2025; Time: 1:00 PM ET; Register for Free!
Want to speed up your software delivery? It’s time to unify your application and database changes. Join us for Accelerating Software Delivery: Unifying Application and Database Changes in Modern CI/CD, where we’ll teach you how to seamlessly integrate database updates into your CI/CD pipeline.
Petabyte Scale, Gigabyte Costs: Mezmo’s ElasticSearch to Quickwit Evolution
Date: March 27, 2025; Time: 1:00 PM ET; Register for Free!
For Mezmo, scaling their infrastructure meant facing significant challenges with ElasticSearch. That's when they made the decision to transition to Quickwit, an open-source, cloud-native search engine designed to handle large-scale data efficiently. This is a must-attend session for anyone looking for insights on improving search platform scalability and managing data growth.
Best Practices for Building Secure Data Pipelines with Apache Airflow®
Date: April 15, 2025; Time: 1:00 PM ET; Register for Free!
Security is a critical but often overlooked aspect of data pipelines. Effective security controls help teams protect sensitive data, meet compliance requirements with confidence, and ensure smooth, secure operations. Managing credentials, enforcing access controls, and ensuring data integrity across systems can become overwhelming, especially while trying to keep Airflow environments up-to-date and operations running smoothly. Whether you're working to improve access management, protect sensitive data, or build more resilient pipelines, this webinar will provide the knowledge and best practices to enhance security in Apache Airflow.
Generative AI: The Democratization of Intelligent Systems
Date: April 16, 2025; Time: 1:00 PM ET; Register for Free!
Join DZone, alongside industry experts from Cisco and Vertesia, for an exclusive virtual roundtable exploring the latest trends in GenAI.
This discussion will dive into key insights from DZone's 2025 Generative AI Trend Report, focusing on advancements in GenAI models and algorithms, their impact on code generation, and the evolving role of AI in software development. We’ll examine AI adoption maturity, intelligent search capabilities, and how organizations can optimize their AI strategies for 2025 and beyond. What's Next? DZone has more in store! Stay tuned for announcements about upcoming Webinars, Virtual Roundtables, Fireside Chats, and other developer-focused events. Whether you’re looking to sharpen your skills, explore new tools, or connect with industry leaders, there’s always something exciting on the horizon. Don’t miss out — save this article and check back often for updates!
Disclaimer The stock data used in this article is entirely fictitious. It is purely for demo purposes. Please do not use this data for making any financial decisions. In a previous article, we saw the benefits of using Ollama locally for a RAG application. In this article, we'll extend our evaluation of Ollama by testing natural language (NL) queries against a database system, using LangChain's SQLDatabaseToolkit. SQL will serve as the baseline system for comparison as we explore the quality of results provided by OpenAI and Ollama. The notebook files used in this article are available on GitHub. Introduction LangChain's SQLDatabaseToolkit is a powerful tool designed to integrate NL processing capabilities with relational database systems. It enables users to query databases using NL inputs, using the capabilities of large language models (LLMs) to generate SQL queries dynamically. This makes it especially useful for applications where non-technical users or automated systems need to interact with structured data. A number of LLMs are well supported by LangChain. LangChain also provides support for Ollama. In this article, we'll evaluate how well LangChain integrates with Ollama and the feasibility of using the SQLDatabaseToolkit in a local setup. Create a SingleStore Cloud Account A previous article showed the steps to create a free SingleStore Cloud account. We'll use the Free Shared Tier. Selecting the Starter Workspace > Connect > CLI Client will give us the details we need later, such as username, password, host, port and database. Create Database Tables For our test environment, we'll use SingleStore running in the Cloud as our target database system, and we'll connect securely to this environment using Jupyter notebooks running in a local system. From the left navigation pane in the SingleStore cloud portal, we'll select DEVELOP > Data Studio > Open SQL Editor. We'll create three tables, as follows: SQL CREATE TABLE IF NOT EXISTS tick ( symbol VARCHAR(10), ts DATETIME SERIES TIMESTAMP, open NUMERIC(18, 2), high NUMERIC(18, 2), low NUMERIC(18, 2), price NUMERIC(18, 2), volume INT, KEY(ts) ); CREATE TABLE IF NOT EXISTS portfolio ( symbol VARCHAR(10), shares_held INT, purchase_date DATE, purchase_price NUMERIC(18, 2) ); CREATE TABLE IF NOT EXISTS stock_sentiment ( headline VARCHAR(250), positive FLOAT, negative FLOAT, neutral FLOAT, url TEXT, publisher VARCHAR(30), ts DATETIME, symbol VARCHAR(10) ); We'll load the portfolio table with the following fictitious data: SQL INSERT INTO portfolio (symbol, shares_held, purchase_date, purchase_price) VALUES ('AAPL', 100, '2022-01-15', 150.25), ('MSFT', 50, '2021-12-10', 305.50), ('GOOGL', 25, '2021-11-05', 2800.75), ('AMZN', 10, '2020-07-20', 3200.00), ('TSLA', 40, '2022-02-18', 900.60), ('NFLX', 15, '2021-09-01', 550.00); For the stock_sentiment table, we'll download the stock_sentiment.sql.zip file and unpack it. We'll load the data into the table using a MySQL client, as follows: Shell mysql -u "<username>" -p"<password>" -h "<host>" -P <port> -D <database> < stock_sentiment.sql We'll use the values for <username>, <password>, <host>, <port> and <database> that we saved earlier. 
Finally, for the tick table, we'll create a pipeline: SQL CREATE PIPELINE tick AS LOAD DATA KAFKA 'public-kafka.memcompute.com:9092/stockticker' BATCH_INTERVAL 45000 INTO TABLE tick FIELDS TERMINATED BY ',' (symbol,ts,open,high,low,price,volume); We'll adjust to get the earliest data: SQL ALTER PIPELINE tick SET OFFSETS EARLIEST; And test the pipeline: SQL TEST PIPELINE tick LIMIT 1; Example output: Plain Text +--------+---------------------+--------+--------+--------+--------+--------+ | symbol | ts | open | high | low | price | volume | +--------+---------------------+--------+--------+--------+--------+--------+ | MMM | 2025-01-23 21:40:32 | 178.34 | 178.43 | 178.17 | 178.24 | 38299 | +--------+---------------------+--------+--------+--------+--------+--------+ And then we'll start the pipeline: SQL START PIPELINE tick; After a few minutes, we'll check the quantity of data loaded so far: SQL SELECT COUNT(*) FROM tick; Local Test Environment From a previous article, we'll follow the same steps to set up our local test environment as described in these sections: Introduction. Use a Virtual Machine or venv.Create a SingleStore Cloud account. This step was completed above.Create a database. The Free Shared Tier already provides a database and we just need to note down the database name.Install Jupyter. Plain Text pip install notebook Install Ollama. Plain Text curl -fsSL https://ollama.com/install.sh | sh Environment variables. Plain Text export SINGLESTOREDB_URL="<username>:<password>@<host>:<port>/<database>" Replace <username>, <password>, <host>, <port> and <database> with the values for your environment. Plain Text export OPENAI_API_KEY="<OpenAI API Key>" Replace <OpenAI API Key> with your key.Launch Jupyter. Plain Text jupyter notebook We'll use the Jupyter notebooks from GitHub. These notebooks are configured to use OpenAI and Ollama. For Ollama, we'll use one of the LLMs listed with Tools support. We'll test the following four queries. First Query SQL SQL SELECT symbol, (MAX(high) - MIN(low)) AS volatility FROM tick GROUP BY symbol ORDER BY volatility ASC LIMIT 1; Natural Language Plain Text "For each stock symbol, calculate the volatility as the difference\n" "between the highest recorded price and the lowest recorded price over time.\n" "Which stock symbol has the least volatility?" Results SQL Plain Text +--------+------------+ | symbol | volatility | +--------+------------+ | FTR | 0.55 | +--------+------------+ OpenAI Plain Text The stock symbol with the least volatility is FTR, with a volatility of 0.55. Ollama Plain Text To find the stock with the highest price, we need to compare the prices of all the given stocks and find the maximum value. However, I don't have real-time access to financial data or the ability to execute code that interacts with external APIs. Therefore, I can't directly calculate the highest price from this list. However, if you provide me with a specific date or time range, I can help you find the stock with the highest price during that period. For example, if you want to know the stock with the highest price on a particular day in the past, I can assist you with that. If you're looking for the current highest-priced stock, I recommend checking a financial news website or an API that provides real-time stock data, such as Yahoo Finance, Google Finance, or a service like Alpha Vantage or Finnhub.io. 
These platforms can give you the most up-to-date information on stock prices.` For troubleshooting, visit: https://python.langchain.com/docs/troubleshooting/errors/OUTPUT_PARSING_FAILURE Second Query SQL SQL SELECT COUNT(*) FROM tick; Natural Language Plain Text "How many rows are in the tick table?" Results SQL Plain Text +----------+ | COUNT(*) | +----------+ | 22367162 | +----------+ OpenAI Plain Text There are 22,367,162 rows in the tick table. Ollama Plain Text The "tick" table has 3 rows. Third Query SQL SQL -- Step 1: Get the latest price for each symbol WITH latest_prices AS ( SELECT symbol, price FROM tick t1 WHERE ts = ( SELECT MAX(ts) FROM tick t2 WHERE t2.symbol = t1.symbol ) ) -- Step 2: Calculate the total portfolio value SELECT SUM(p.shares_held * lp.price) AS total_portfolio_value FROM portfolio p, latest_prices lp WHERE p.symbol = lp.symbol; Natural Language Plain Text "Taking all the stock symbols from the portfolio table,\n" "and using the latest value for each stock symbol from the tick table,\n" "calculate the grand total value of all the shares listed in the portfolio table." Results SQL Plain Text +-----------------------+ | total_portfolio_value | +-----------------------+ | 44540.60 | +-----------------------+ OpenAI Plain Text $44,540.60 Ollama Plain Text I don't know. The provided SQL query does not contain any of the common mistakes listed. It uses the correct join conditions, functions, and data types. The query also properly quotes identifiers and uses the correct number of arguments for functions. Therefore, no changes are needed. However, without access to the actual database schema and table data, I cannot provide a specific answer to the question. Fourth Query SQL SQL SELECT ( SELECT ts FROM stock_sentiment WHERE symbol = 'AAPL' ORDER BY positive DESC LIMIT 1 ) AS sentiment_date, ( SELECT positive FROM stock_sentiment WHERE symbol = 'AAPL' ORDER BY positive DESC LIMIT 1 ) AS most_positive_sentiment, ( SELECT price FROM tick WHERE symbol = 'AAPL' ORDER BY ts DESC LIMIT 1 ) AS current_best_price; Natural Language Plain Text "Using the symbol AAPL, show me the date and the most\n" "positive sentiment in the stock sentiment table and the\n" "current best price for this symbol from the tick table." Results SQL Plain Text +---------------------+-------------------------+--------------------+ | sentiment_date | most_positive_sentiment | current_best_price | +---------------------+-------------------------+--------------------+ | 2020-05-28 00:00:00 | 0.331509 | 116.27 | +---------------------+-------------------------+--------------------+ OpenAI Plain Text On 2020-05-28, the most positive sentiment for AAPL was 0.331509, and the current best price for AAPL is 116.27. Ollama Plain Text The provided list contains decimal numbers, which appear to be the results of some kind of experiment or analysis. Without additional context, it's difficult to determine the exact nature of these results. However, we can observe that the majority of the numbers are between 116.85 and 117.27, with a few outliers at 115.99 and 117.30. The smallest number in the list is 115.99, and the largest is 117.30.` For troubleshooting, visit: https://python.langchain.com/docs/troubleshooting/errors/OUTPUT_PARSING_FAILURE Summary Analyzing the results, we see that SQL and OpenAI produce consistent outputs across all four queries. However, Ollama presents clear issues. 
A discussion thread on GitHub highlights that even when a model is listed as supporting tool calling, this functionality is not natively available through Ollama. If you are able to get this LangChain functionality working with Ollama using one of the supported LLMs, please send me a message, and I'll update the article and acknowledge your help.
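For readers who want to reproduce the setup, here is a simplified sketch of how the SQLDatabaseToolkit can be wired up. It is not the exact notebook code: the MySQL-protocol connection string (SingleStore is wire-compatible with MySQL, and the pymysql driver is assumed to be installed) and the model name are assumptions to adapt to your environment. Swapping ChatOpenAI for ChatOllama (from the langchain-ollama package) is the equivalent change for a local Ollama run.
Python
import os
from langchain_community.agent_toolkits import SQLDatabaseToolkit, create_sql_agent
from langchain_community.utilities import SQLDatabase
from langchain_openai import ChatOpenAI

# SINGLESTOREDB_URL is "<username>:<password>@<host>:<port>/<database>" as exported earlier.
# SingleStore speaks the MySQL wire protocol, so a standard SQLAlchemy URL works here.
db = SQLDatabase.from_uri("mysql+pymysql://" + os.environ["SINGLESTOREDB_URL"])

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)  # assumption: any tool-capable chat model
toolkit = SQLDatabaseToolkit(db=db, llm=llm)
agent = create_sql_agent(llm=llm, toolkit=toolkit, verbose=True)

result = agent.invoke({"input": "How many rows are in the tick table?"})
print(result["output"])
When the model's reply cannot be parsed into a tool call, you will see errors like the OUTPUT_PARSING_FAILURE messages shown in the Ollama results above.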
Java is a programming language with many language features, specifications, and APIs. Even among experienced Java developers, being aware of all of these is quite rare. If a study were conducted, we might come across Java developers who have never worked with Threads, never used JPA, or never developed custom annotations. However, is there a Java developer who has worked with Java 8 or later but has never used the Stream API? I highly doubt it. Gatherers is a powerful extension of the Stream API that introduces support for custom intermediate operations. Initially introduced as a preview feature in JDK 22, it became a standard feature in JDK 24.
What Are Gatherers?
Gatherers were developed to model intermediate operations in the Stream API. Just as a collector models a terminal operation, a gatherer is an object that models an intermediate operation. Gatherers support the characteristics of intermediate operations — they can push any number of elements to the stream they produce, maintain an internal mutable state, short-circuit a stream, delay consumption, be chained, and execute in parallel. For this reason, as stated in JEP 485: In fact every stream pipeline is, conceptually, equivalent to source.gather(…).gather(…).gather(…).collect(…)
Java public interface Gatherer<T, A, R> { … }
T represents the input element. A represents the potential mutable state object. R represents the output that will be pushed downstream.
A gatherer is built upon four key elements:
Java Supplier<A> initializer(); Integrator<A, T, R> integrator(); BinaryOperator<A> combiner(); BiConsumer<A, Downstream<? super R>> finisher();
Initializer – A function that produces an instance of the internal intermediate state. Integrator – Integrates a new element into the stream produced by the Gatherer. Combiner – A function that accepts two intermediate states and merges them into one, supporting parallel execution. Finisher – A function that allows performing a final action at the end of the input elements.
Among these four elements, only the integrator is mandatory because it has the role of integrating a new element into the stream produced by the Gatherer. The other elements may or may not be required, depending on the operation you intend to model, making them optional.
Creating a Gatherer
Gatherers are created using factory methods, or you can implement the Gatherer interface. Depending on the operation you want to model, you can use the overloaded variants of Gatherer.of and Gatherer.ofSequential.
Java var uppercaseGatherer = Gatherer.<String, String>of((state, element, downstream) -> downstream.push(element.toUpperCase()));
The example gatherer above calls toUpperCase on an input element of type String and pushes the result downstream. This gatherer is equivalent to the following map operation.
Java Stream.of("a", "b", "c", "d", "e", "f", "g") .map(String::toUpperCase) .forEach(System.out::print);
The Stream interface now includes a method called gather(), which accepts a Gatherer parameter. We can use it by passing the gatherer we created.
Java Stream.of("a", "b", "c", "d", "e", "f", "g") .gather(uppercaseGatherer) .forEach(System.out::print);
Built-In Gatherers
The java.util.stream.Gatherers class is a factory class that contains predefined implementations of the java.util.stream.Gatherer interface, defining five different gatherers. windowFixed. It is a many-to-many gatherer that groups input elements into lists of a supplied size, emitting the windows downstream when they are full. windowSliding.
It is a many-to-many gatherer that groups input elements into lists of a supplied size. After the first window, each subsequent window is created from a copy of its predecessor by dropping the first element and appending the next element from the input stream.fold. It is a many-to-one gatherer that constructs an aggregate incrementally and emits that aggregate when no more input elements exist.scan. It is a one-to-one gatherer that applies a supplied function to the current state and the current element to produce the next element, which it passes downstream.mapConcurrent. It is a one-to-one gatherer that invokes a supplied function for each input element concurrently, up to a supplied limit. The function executes in Virtual Thread. All of the above gatherers are stateful. Fold and Scan are very similar to the Stream reduce operation. The key difference is that both can take an input of type T and produce an output of type R, and their identity element is mandatory, not optional. Create Your Own Gatherer Let’s see how we can write our custom gatherer using a real-world scenario. Imagine you are processing a system’s log stream. Each log entry represents an event, and it is evaluated based on certain rules to determine whether it is anomalous. The rule and scenario are as follows: Rule. An event (log entry) is considered anomalous if it exceeds a certain threshold or contains an error.Scenario. If an error occurs and is immediately followed by several anomalous events (three in a row, e.g), they might be part of a failure chain. However, if a “normal” event appears in between, the chain is broken. In this case, we can write a gatherer that processes a log stream and returns only the uninterrupted anomalous events. INFO, ERROR, ERROR, INFO, WARNING, ERROR, ERROR, ERROR, INFO, DEBUG Let’s assume that the object in our log stream is structured as follows. Java class LogWrapper { enum Level{ INFO, DEBUG, WARNING, ERROR } private Level level; private String details; } The object has a level field representing the log level. The details field represents the content of the log entry. We need a stateful gatherer because we must retain information about past events to determine whether failures occur consecutively. To achieve this, the internal state of our gatherer can be a List<LogWrapper> Java static Supplier<List<LogWrapper>> initializer() { return ArrayList::new; } The object returned by the initializer() corresponds to the second parameter explained earlier in the type parameters of the Gatherer interface. Java static Integrator<List<LogWrapper>, LogWrapper, String> integrator(final int threshold) { return ((internalState, element, downstream) -> { if(downstream.isRejecting()){ return false; } if(element.getLevel().equals(LogWrapper.Level.ERROR)){ internalState.add(element); } else { if(internalState.size() >= threshold){ internalState.stream().map(LogWrapper::getDetails).forEach(downstream::push); } internalState.clear(); } return true; }); } The integrator will be responsible for integrating elements into the produced stream. The third parameter of the integrator represents the downstream object. We check whether more elements are needed by calling the isRejecting(), which determines if the next stage no longer wants to receive elements. If this condition is met, we return false. 
If the integrator returns false, it performs a short-circuit operation similar to intermediate operations like allMatch, anyMatch, and noneMatch in the Stream API, indicating that no more elements will be integrated into the stream. If isRejecting() returns false, we check whether the level value of our stream element, LogWrapper, is ERROR. If the level is ERROR, we add the object to our internal state. If the level is not ERROR, we then check the size of our internal state. If the size exceeds or is equal to the threshold, we push the LogWrapper objects stored in the internal state downstream. If not, we don’t. I want you to pay attention to two things here. Pushing an element downstream or not, as per the business rule, is similar to what filter() does. Accepting an input of type LogWrapper and producing an output of type String is similar to what map() does. After that, according to our business rule, we clear the internal state and return true to allow new elements to be integrated into the stream. Java static BinaryOperator<List<LogWrapper>> combiner() { return (_, _) -> { throw new UnsupportedOperationException("Cannot be parallelized"); }; } To prevent our gatherer from being used in a parallel stream, we define a combiner, even though it is not strictly required. This is because our gatherer is inherently designed to work as expected only in a sequential stream. Java static BiConsumer<List<LogWrapper>, Downstream<? super String>> finisher(final int threshold) { return (state, downstream) -> { if(!downstream.isRejecting() && state.size() >= threshold){ state.stream().map(LogWrapper::getDetails).forEach(downstream::push); } }; } Finally, we define a finisher to push any remaining stream elements that have not yet been emitted downstream. If isRejecting() returns false, and the size of the internal state is greater than or equal to the threshold, we push the LogWrapper objects stored in the internal state downstream. When we use this gatherer on data: Plain Text ERROR, Process ID: 191, event details ... INFO, Process ID: 216, event details ... DEBUG, Process ID: 279, event details ... ERROR, Process ID: 312, event details ... WARNING, Process ID: 340, event details ... ERROR, Process ID: 367, event details ... ERROR, Process ID: 389, event details ... INFO, Process ID: 401, event details ... ERROR, Process ID: 416, event details ... ERROR, Process ID: 417, event details ... ERROR, Process ID: 418, event details ... WARNING, Process ID: 432, event details ... ERROR, Process ID: 444, event details ... ERROR, Process ID: 445, event details ... ERROR, Process ID: 446, event details ... ERROR, Process ID: 447, event details ... Similar to the one above, we get the following result: Plain Text Process ID: 416, event details … Process ID: 417, event details … Process ID: 418, event details … Process ID: 444, event details … Process ID: 445, event details … Process ID: 446, event details … Process ID: 447, event details … The code example is accessible in the GitHub repository. Conclusion Gatherers is a new and powerful API that enhances the Stream API by modeling intermediate operations and allowing the definition of custom intermediate operations. A gatherer supports the features that intermediate operations have; it can push any number of elements to the resulting stream, maintain an internal mutable state, short-circuit a stream, delay consumption, be chained, and execute in parallel. References JEP 485cr.openjdk.org
In Couchbase, memory management in the Query Service is key to keeping the service efficient and responsive, especially as the service handles an increasing number of queries simultaneously. Without proper memory management, things can go awry — greedy queries can hog memory, and the combined memory usage of multiple concurrent queries can overwhelm the service, leading to degraded performance. Fortunately, the Query Service has several features that allow users to manage the memory usage of queries and the overall service. This blog will explore these features in detail: Per Request Memory QuotaSoft Memory LimitNode-wide Document Memory Quota Per Request Memory Quota A significant portion of memory usage of the Query Service comes from transient values, which can include documents or computed values. The memory used by these transient values will be referred to as "document memory" in the blog. The Query Service receives documents from the data service as an encoded byte stream. However, the memory used by the value associated with the document can be much larger than the size of the original stream. This is because the Query Service decodes the stream into a structure that can be large as it must store all fields, values, and any nested objects. The Query Service is optimized for performance and not for compactness. What happens if a resource-intensive query comes along and starts consuming a large amount of document memory? It can end up hogging memory and cause other queries to stall. How do we prevent a "greedy" query from affecting the execution of other active queries? This is exactly where the per-request memory quota feature comes in! Since Couchbase 7.0, the Query Service provides a setting called "memory quota" to limit the maximum amount of document memory that a query request can use at any given time during its execution. This per-request memory quota works by terminating a query if it exceeds its quota, while allowing all other active queries to continue execution. This ensures that only the greedy query is stopped, preventing it from affecting the performance of the other queries. The memory quota does not correspond to OS memory use. It only accounts for document memory usage and not for any memory used in the heap, stack, execution operators, etc. How Does Memory Quota Work? The per-request memory quota can be thought of as configuring a document memory pool for a query request. The size of the pool is determined by the value of the query’s memory quota. When the query requires a document/ value, it allocates the size of the document/value from this pool. When the value/ document is no longer needed, the allocated size is returned back to the pool for reuse by the request. At any given moment, the total amount of document memory being used by the query request cannot exceed the size of its pool, i.e., its memory quota. If the query tries to use more document memory than what is available in its pool, the request will be terminated, and an error will be returned. It is important to note that the Query Service is highly parallelized, and operators can run simultaneously. This means that whether a query exceeds its memory quota can vary between runs. This is because, depending on the specifics of each run, the amount of document memory that is being used ( and hence allocated from its request pool ) can vary, even at the same stage of execution. How to Configure the Memory Quota? The per-request memory quota can be set at a cluster, node, and request level. 
Unit: MiBDefault: 0, i.e., there is no limit on how much document memory a request can use Cluster Level Set the memory quota for every query node in the cluster with the queryMemoryQuota cluster-level setting. The value at the cluster level is persisted and when set, over-writes the node level setting for every query node. Learn how to set a cluster-level setting here. Node Level Set the memory quota for a particular query node with the memory-quota node-level setting. The value set at the node level is the default memory quota for all query requests executed on the node. The node level value is not persisted and is over-written when the cluster level setting is modified. Learn how to set a node-level setting here. Request Level Set the memory quota for a particular query request with the memory_quota parameter. The request level parameter overrides the value of the node-level setting. However, if the node level setting is greater than zero, the request level value is limited by the node level value. Learn how to set a request-level parameter here. Soft Memory Limit of the Query Service Now that we have explored how to limit the document memory usage of a query, you might be wondering, is there a way to limit the memory usage of the Query Service? The Query Service has no setting to enforce a hard limit on the memory usage of the service. This is because the programming language used to develop SQL++ does not provide a mechanism to enforce a hard limit on its runtime memory usage. But it does provide a mechanism to adjust the soft memory limit… Hence, in Couchbase 7.6.0, the "node quota" setting was introduced to adjust the soft memory limit of the Query Service! Since this is a soft limit, there is no guarantee that the memory usage of the Query Service will always strictly stay below it or that out-of-memory conditions will not occur. However, an effort is made to maintain the Query Service’s memory usage below this limit by running the garbage collector (GC) more frequently when this limit is crossed or approached closely. Important Note If the memory usage stays close to the soft limit, the GC runs aggressively, which can cause high CPU utilization. How to Configure the Node Quota? The node quota can be set at a cluster, node, and request level. Unit: MiBDefault: 0Minimum: 1 While the minimum value of the node quota is 1 MiB, please set the node quota to practical values depending on the workloads and the system’s capabilities. Cluster Level Set the node quota for every query node in the cluster with the queryNodeQuota cluster-level setting. The value at the cluster level is persisted and when set, overwrites the node level setting for every query node. Learn how to set a cluster-level setting here. Important Note One way of configuring this setting cluster wide is by using the Couchbase Web Console. In the Web Console, this can be configured under the "Memory Quota per server node" on the Settings page. This section is specifically for configuring the Query Service's cluster-level node quota and must not be confused with setting the cluster-level memory quota setting. Node Level Set the node quota for a particular query node with the node-quota node-level setting. The value set at the node level is the default memory quota for all query requests executed on the node. The node level value is not persisted and is over-written when the cluster level setting is modified. Learn how to set a node-level setting here. How to Configure the Soft Memory Limit? 
The soft memory limit of the Query Service is set using the value of the node quota. If not set, a default value is calculated. Node Quota If the node quota is set for a node, this is the soft memory limit. The soft memory limit will be capped at a maximum allowable value, which is calculated using these steps: 1. The difference between the total system RAM and 90% of the total system RAM is calculated. Plain Text Total System RAM - (0.9 * Total System RAM) 2. If the difference is greater than 8 GiB, the maximum soft memory limit will be: Plain Text Total System RAM - 8 GiB 3. If the difference is 8 GiB or less, the maximum soft memory limit will be set to 90% of the total system RAM. If the node quota exceeds the calculated maximum, then the soft memory limit is silently set to the maximum. Default If the node quota setting is not set for a node, a default value is calculated for the soft limit using the following steps: 1. The difference between the total system RAM and 90% of the total system RAM is calculated. Plain Text Total System RAM - (0.9 * Total System RAM) 2. If the difference is greater than 8 GiB, the default soft memory limit will be: Plain Text Total System RAM - 8 GiB 3. If the difference is 8 GiB or less, the default soft memory limit will be set to 90% of the total system RAM. Node-Wide Document Memory Quota What if a workload has a query that requires a large amount of memory to execute? Enforcing a per-request memory quota might not be ideal, as this query might frequently be terminated for exceeding its quota. How can this query successfully execute while still protecting the Query Service from excessive memory usage? Consider another scenario with multiple queries executing concurrently, each with a per-request memory quota set. In this scenario, the memory usage of the service has become very high. But the document memory use of the queries remains below their respective quotas. So, no query is terminated. As a result, the overall memory usage of the Query Service remains high, causing problems. How can this be addressed? Starting in Couchbase 7.6.0, the Query Service has a mechanism to limit the cumulative amount of document memory that active queries can use! The introduction of a node-wide document memory quota attempts to address these challenges. How Does Node-Wide Quota Work? The node-wide quota can be thought of as configuring a document memory pool for the entire Query Service on a node. When the node quota is set, a "memory session" is created for each request. By default, this session starts with an initial size of 1 MiB. When the node-wide quota is configured, 1 MiB is allocated for every servicer and subtracted from the node-wide pool. This default allocation guarantees that each servicer has at least a minimum amount of reserved space, ensuring that incoming requests can always be serviced.When a request requires a value/document, the size of the value/document is allocated from its session. If the session does not have enough memory for this allocation, it will grow in minimum increments of 1 MiB to accommodate the allocation request. The additional memory required for this growth is allocated from the node-wide pool. If an active request’s memory session attempts to grow beyond the available remaining memory in the node-wide pool, the request will be stopped, and an error will be returned.Once the request no longer needs the value/document, it returns the allocated size back to its session. 
The session’s memory ( excluding the 1 MiB of the initial servicer reservation ) is only returned to the node-wide pool once the request’s execution completes.At any time, the total size of all memory sessions cannot exceed the size of the node-wide quota. It is important to understand that this memory session is not to be confused with the per-request pool that is configured when the memory quota is set for a request. The two are not the same. Both a node-wide quota and a per-request memory quota can be configured. Read the "Configuring both Node-Wide Document Quota and Per-Request Memory Quota" section below to understand more. In this way, the node-wide quota places a limit on the amount of document memory that is being used by all active requests. The node-wide document quota can only be configured when the node quota setting is explicitly set for a node. The size of this quota is calculated using two Query settings, "node quota" ( explored in an earlier section ) and "node quota value percent." How to Configure Node Quota Value Percent? The node quota value percent is the percentage of the node quota dedicated to tracked value content memory/"document memory" across all active requests. The node quota value percent can be set at the cluster and the node level. Unit: MiBDefault: 67Minimum: 0Maximum: 100 Cluster Level Set the node quota value percentage for every query node in the cluster with the queryNodeQuotaValPercent cluster-level setting. The value at the cluster level is persisted and when set, over-writes the node level setting for every query node. Learn how to set a cluster-level setting here. Node Level Set the node quota value percentage for a particular query node with the node-quota-val-percent node-level setting. The value set at the node level is the default memory quota for all query requests executed on the node. The node level value is not persisted and is over-written when the cluster level setting is modified. Learn how to set a node-level setting here. How to Configure the Node-Wide Document Memory Quota? The size of the node-wide document memory quota is calculated relative to the node quota. The node quota must be set for the node-wide document memory quota to be configured. The size of the node-wide pool is calculated using the following steps: 1. Calculate the percentage of the node quota dedicated to tracking document memory across all active queries using the following formula: Plain Text node-quota * node-quota-val-percent / 100 2. Calculate the minimum allowable value for the node-wide document memory quota. The execution of SQL++ statements is handled by "servicers." When a query is to be executed, it is assigned to a servicer thread that is responsible for its execution. The Query Service is configured with a number of servicers to handle incoming requests. There are two types of servicers, "unbounded servicers" and "plus servicers." The Query engine reserves 1 MiB of document memory for each servicer. Hence, the default initial value of each request’s memory session is 1 MiB. This means that the baseline document memory usage will be the total number of unbounded and plus servicers, measured in MiB. Therefore, the size of the node-wide document memory quota must be at least equal to the number of servicers, measured in MiB. Formula 1 Plain Text Quota reserved for servicers = (number of unbounded servicers + number of plus servicers) MiB Learn more about unbounded servicers here and plus servicers here. 3. 
The size of the node-wide document memory quota is calculated using the following formula: Formula 2 Plain Text Size of node-wide document memory quota = MAX(node-quota * node-quota-val-percent / 100, Quota reserved for servicers) The quota reserved for the servicers is calculated using Formula 1. This is the maximum allowable size of all memory sessions across active requests and includes the initial reservation for each servicer. Calculating Available Quota in the Pool for Document Memory Growth The initial reservation for the servicers is deducted from the node-wide document memory quota for the node. Any remaining space in the node-wide memory pool can be used by each active request to grow its document memory usage beyond its initial 1 MiB reservation. This remaining quota available for document memory growth is calculated using the following formula: Formula 3 Plain Text Size of node-wide document memory quota available for memory sessions of active requests to grow = Size of node-wide document memory quota - Quota reserved for servicers The size of the node-wide document memory quota is calculated in Formula 2.The quota reserved for the servicers is calculated using Formula 1. It is important to set appropriate node-quota and node-quota-val-percent values that are practical and suitable for workloads. The next section explores an example to illustrate the importance of this. Example Consider a query node with 32 unbounded servicers and 128 plus servicers. The Administrator sets the node quota to 10 MiB. The node-quota-val-percent is the default value of 67. Using Formula 2 to calculate the size of the node-wide document memory quota: Plain Text Size of node-wide document memory quota = MAX(node-quota * node-quota-val-percent / 100, Quota reserved for servicers) = MAX( node-quota * node-quota-val-percent / 100, (number of unbounded servicers + number of plus servicers) MiB ) = MAX ( 10 * 67 / 100 MiB, (32+128) MiB ) = MAX ( 6.7 MiB, 160 MiB) = 160 MiB Using Formula 3 to calculate the amount of document memory available in the node-wide pool available for requests’ memory growth: Plain Text = Size of node-wide document memory quota - (number of unbounded servicers + number of plus servicers) MiB = 160 MiB - (32+128) MiB = 160 MiB - 160 MiB = 0 MiB This means that there is no room for document memory growth of requests beyond their 1 MiB initial reservation. In other words, each request is limited to using a maximum of 1 MiB of document memory. Additionally, the node quota of 10 MiB is very small, and garbage collection will likely be forced to run frequently, causing high CPU utilization. Reporting Document Memory Figures If the memory quota was set for a request or a node-wide document memory pool configured, information about the same will be reported in several SQL++ features which will be explored below. This information is helpful for debugging. 1. Response Output Metrics The usedMemory field in the metrics section of the query’s response reports the high-water mark (HWM) document memory usage of the query in bytes. The Query Service is highly parallelized, and operators can run simultaneously. As a result, the usedMemory figures can vary between runs for the same query. This is because, depending on the specifics of each run, the HWM document memory usage can be different. 
A sample metrics section of a query response: JSON "metrics": { "elapsedTime": "19.07875ms", "executionTime": "18.909916ms", "resultCount": 10000, "resultSize": 248890, "serviceLoad": 2, "usedMemory": 341420 } Controls Section If the controls Query setting is enabled, and the memory quota configured for the request, the memoryQuota field in the controls section of the query’s response reports the value of the memory quota set. A sample controls section of a query response: JSON "controls": { "scan_consistency": "unbounded", "use_cbo": "true", "memoryQuota": "25", "n1ql_feat_ctrl": "0x4c", "disabledFeatures":[ "(Reserved for future use) (0x40)", "Encoded plans (0x4)", "Golang UDFs (0x8)" ], "stmtType": "SELECT" } Learn more about the controls setting here. 2. System Keyspaces In a request’s entry in the system:completed_requests and system:active_requestssystem keyspaces: The usedMemory field is the HWM document memory usage of the query in bytes. The Query Service is highly parallelized, and operators can run simultaneously. As a result, the usedMemory figures can vary between runs for the same query. This is because, depending on the specifics of each run, the HWM document memory usage can be different.The memoryQuota field is the value of the memory quota set for the request Learn more about system:completed_requests here, and system:active_requests here. Configuring Both Per-Request Memory Quota and Node-Wide Document Quota As described in the "Per Request Memory Quota" section, if a request has a memory quota configured, the maximum amount of document memory that it can use at any given time during its execution is limited by the memory quota. Additionally, as explained in the "Node-Wide Document Memory Quota" section, when the node quota and a node-wide document memory quota are configured, each request gets its own "memory session." Any growth in the size of these sessions is allocated from the node-wide document memory quota. If a node-wide document memory quota is configured and a request has a memory quota set, the document memory usage of the query request is limited by both quotas. How Would a Document/Value Allocation be Performed? When the request requires a document/ value, the following steps are performed during the allocation process: 1. Memory Session Allocation The request first tries to allocate memory for the document from its memory session. If there is enough space in the session, the allocation is successful.If there is insufficient space in the session, the session attempts to grow its size by allocating from the node-wide document memory quota. (i.e., from the "node-wide document memory pool" ). If there is not enough space for the session’s growth in the node-wide pool, the request will be stopped, and an error will be returned. 2. Request Memory Quota Allocation If the session allocation is successful, the request will attempt to allocate memory for the document from its memory quota. (i.e., from its "request memory pool").If there is sufficient space left in its memory quota, the allocation succeeds, and the request proceeds.If there is not enough remaining space in the memory quota, the request will fail, and an error will be returned. Monitoring With system:vitals The system keyspace system:vitals contains important information about each query node in the cluster, including information related to memory and CPU usage, garbage collection, and much more. Users can use this system keyspace to monitor the health and vitals of the query nodes. 
There are two ways to access this information: 1. Query the system:vitals keyspace using SQL++. SQL SELECT * FROM system:vitals; 2. Accessing the vitals per node using the Query Service's /admin/vitals endpoint. Plain Text curl -u $USER:$PASSWORD $QUERY_NODE_URL/admin/vitals Below is a sample of a record in system:vitals for a query node. JSON { "bucket.IO.stats": { "travel-sample": { "reads": 52090 } }, "cores": 12, "cpu.sys.percent": 0.005, "cpu.user.percent": 0.0056, "ffdc.total": 0, "gc.num": 64352224, "gc.pause.percent": 0, "gc.pause.time": "4.479336ms", "healthy": true, "host.memory.free": 321028096, "host.memory.quota": 10485760000, "host.memory.total": 38654705664, "host.memory.value_quota": 7025459200, "load": 0, "loadfactor": 6, "local.time": "2024-12-05T17:21:25.609+05:30", "memory.system": 584662408, "memory.total": 3884613696, "memory.usage": 25302328, "node": "127.0.0.1:8091", "node.allocated.values": 613916, "node.memory.usage": 251658240, "process.memory.usage": 0, "process.percore.cpupercent": 0, "process.rss": 629309440, "process.service.usage": 0, "request.active.count": 1, "request.completed.count": 41, "request.per.sec.15min": 0.0221, "request.per.sec.1min": 0.0136, "request.per.sec.5min": 0.017, "request.prepared.percent": 0, "request.queued.count": 0, "request_time.80percentile": "63.969209ms", "request_time.95percentile": "74.865437ms", "request_time.99percentile": "150.904625ms", "request_time.mean": "37.019323ms", "request_time.median": "39.115791ms", "servicers.paused.count": 0, "servicers.paused.total": 0, "temp.hwm": 0, "temp.usage": 0, "total.threads": 411, "uptime": "12m48.431007375s", "version": "7.6.0-N1QL" } Learn more about the Vitals here and the system:vitals keyspace here. How Does the Query Service Trigger the Garbage Collector? Starting in 7.6.0, the Query Service routinely checks if the garbage collector (GC) has run in the last 30 seconds. If it has not, the GC is triggered to run. During this check, the amount of free system memory is also monitored. If the amount of free memory is less than 25%, an attempt is made to return as much memory to the OS as possible. Run Garbage Collector on Demand Starting in Couchbase 7.6.0, the Query Service provides a REST endpoint /admin/gc that can be invoked to run the garbage collector. This endpoint can be invoked to trigger a GC run in an attempt to reduce memory utilization. To force a GC run, issue a GET request to the API. Plain Text curl -u $USER:$PASSWORD $QUERY_NODE_URL/admin/gc To force a GC run and attempt to return as much memory to the OS as possible, issue a POST request to the API. Plain Text curl -X POST -u $USER:$PASSWORD $QUERY_NODE_URL/admin/gc Learn more about this endpoint here. Important Note Aggressively running the garbage collector can cause high CPU utilization. Helpful References Couchbase blog on per-request memory quotaCouchbase documentation for system keyspacesCouchbase documentation for configuring cluster, node, request level Query settings
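As a closing aid, the short Python sketch below applies Formulas 1, 2, and 3 from this article so you can experiment with different node-quota and node-quota-val-percent values. It simply reproduces the arithmetic of the worked example above (10 MiB node quota, the default 67 percent, 32 unbounded and 128 plus servicers); it is not an official Couchbase tool.
Python
def node_wide_document_quota(node_quota_mib, val_percent, unbounded_servicers, plus_servicers):
    """Apply Formulas 1-3 from this article. All sizes are in MiB."""
    servicer_reservation = unbounded_servicers + plus_servicers              # Formula 1
    quota = max(node_quota_mib * val_percent / 100, servicer_reservation)    # Formula 2
    growth_room = quota - servicer_reservation                               # Formula 3
    return quota, growth_room

# Worked example from the article: node quota 10 MiB, 67 percent,
# 32 unbounded servicers and 128 plus servicers.
quota, growth = node_wide_document_quota(10, 67, 32, 128)
print(f"Node-wide document memory quota: {quota} MiB")   # 160 MiB
print(f"Room left for session growth:    {growth} MiB")  # 0 MiB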