Artificial intelligence (AI) and machine learning (ML) are two fields that work together to create computer systems capable of perception, recognition, decision-making, and translation. Separately, AI is the ability of a computer system to mimic human intelligence through math and logic, while ML builds on AI by developing methods that "learn" from experience rather than from explicit instruction. In the AI/ML Zone, you'll find resources ranging from tutorials to use cases that will help you navigate this rapidly growing field.
With the advancements in artificial intelligence (AI), models are getting increasingly complex, resulting in larger sizes and higher latency and making it harder to ship models to production. Maintaining a balance between performance and efficiency is often challenging: the faster and more lightweight you make your models, the easier they are to deploy to production. Training models with over a billion parameters on massive datasets results in high latency and is impractical for real-world use. In this article, we will delve into techniques that can help make your model more efficient. These methods focus on reducing a model's size and latency and making it ready for deployment without any significant degradation in performance.

1. Pruning
The first method we will discuss is model pruning. More often than not, deep learning models are trained on extensive datasets, and as the neural networks keep getting trained, some connections within the network contribute little to the final result. Model pruning is a technique that reduces the size of a neural network by removing such less important connections. Doing this results in a sparse matrix, i.e., certain matrix values are set to 0. Model pruning helps reduce not only the size of the model but also its inference time. Pruning can be broadly classified into two types:

Structured pruning: In this method, we remove entire groups of weights from the neural network for acceleration and size reduction. The weights are removed based on their L-n norm or at random.
Unstructured pruning: In this method, we remove individual weight connections. We zero out the units in a tensor with the lowest L-n norm, or even at random.

Additionally, we also have magnitude pruning, wherein we remove a percentage of the weights with the smallest absolute values. To get an ideal balance between performance and efficiency, we often follow a strategy called iterative pruning, in which we alternate between pruning and fine-tuning over several rounds. It is important to note that sparse matrix multiplication algorithms are critical in order to maximize the benefits of pruning.

2. Quantization
Another method for model optimization is quantization. Deep learning networks often comprise billions of parameters, and by default, in machine learning frameworks such as PyTorch, these parameters are stored in 32-bit floating-point precision, which increases memory consumption and latency. Quantization lowers the precision of these parameters to fewer bits, such as 16-bit floating point or 8-bit integers. Doing this reduces the computational cost and memory footprint of the model, as an 8-bit integer takes a quarter of the space of an FP32 value. We can broadly classify quantization as follows:

Binary quantization: By representing weights and activations as binary numbers (that is, -1 or 1), you can significantly reduce both the memory and the computation required.
Fixed-point quantization: Decrease numerical precision to a predetermined bit count, such as 8-bit or 16-bit, facilitating efficient storage and processing at the expense of some degree of numerical accuracy.
Dynamic quantization: Modify numerical precision in real time during inference to balance model size and computing accuracy.

(Figure source: Qualcomm)
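To make the two techniques above concrete, here is a minimal, illustrative PyTorch sketch (not from the original article): it applies unstructured magnitude pruning to the linear layers of a toy model and then applies post-training dynamic quantization. The toy architecture and the 30% pruning amount are arbitrary assumptions for demonstration only.

Python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# A toy model standing in for a trained network (architecture is arbitrary).
model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))

# Unstructured magnitude pruning: zero out the 30% of weights with the
# smallest absolute values in each Linear layer, producing sparse tensors.
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.3)
        prune.remove(module, "weight")  # make the pruning permanent

zeros = sum((m.weight == 0).sum().item()
            for m in model.modules() if isinstance(m, nn.Linear))
total = sum(m.weight.numel()
            for m in model.modules() if isinstance(m, nn.Linear))
print(f"Global sparsity after pruning: {zeros / total:.1%}")

# Post-training dynamic quantization: store Linear weights as 8-bit integers
# and quantize activations on the fly at inference time.
quantized_model = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)
print(quantized_model)

In practice, an iterative prune-and-finetune loop (rather than the one-shot pruning shown here) and an evaluation pass after quantization are what keep accuracy close to the original model.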
3. Knowledge Distillation
In the domain of model optimization, another effective methodology is knowledge distillation. The basic idea behind knowledge distillation is how a student learns from a teacher. We have an original pre-trained model with the entire set of parameters, known as the teacher model. Then, we have a student model that learns directly from the teacher model's outputs rather than from any labeled data. This allows the student model to learn much faster, as it learns from a probability distribution over all possible labels known as soft targets. In knowledge distillation, the student model need not have the entire set of layers or parameters, making it significantly smaller and faster while providing performance similar to the teacher model. KD has been shown to reduce model size by 40% while maintaining ~97% of the teacher model's performance. Implementing knowledge distillation can still be resource-intensive: training a student model for a complex network such as BERT typically takes around 700 GPU hours, whereas training the teacher model from scratch would take around 2,400 GPU hours. However, given the student model's performance retention and the efficiency gains, knowledge distillation is a sought-after method for optimizing large models.

Conclusion
The advancement of deep neural networks has resulted in heightened complexity of the models employed in deep learning. Models may now possess millions or even billions of parameters, necessitating substantial computational resources for training and inference. Model optimization techniques aim to reduce the computational requirements of complex models while enhancing their overall efficiency. Numerous applications, especially those deployed on edge devices, have limited access to computational resources such as memory, computing power, and energy. Optimizing models for these resource-constrained environments is essential for efficient deployment and real-time inference. Pruning, quantization, and knowledge distillation are some of the model optimization methods that can help achieve it.
Personalization today is an essential part of any successful mobile app. With 89% of U.S. marketers confirming that personalization on websites and apps has led to revenue growth and 88% prioritizing it to improve customer experience, the data speaks for itself. Thus, generative AI (GenAI) is proving to be a success in the mobile application segment in terms of bringing personalized experiences that users demand. GenAI in Mobile App Personalization The baseline of generative AI includes generating relevant content, responses, or insights from massive datasets. It does so by identifying patterns and making accurate predictions about user needs. For mobile apps, GenAI means a more adaptive, user-centric experience — be it personalized recommendations, curated content, or even conversational interactions that are as close to humans as possible. With this technology, we can move beyond the "one-size-fits-all" approach and instead deliver dynamically customized experiences. Real-time personalization can make apps highly engaging. This makes users feel intuitive and natural and get an experience that uniquely suits them. How GenAI Is Embedded into Mobile Apps for Personalization 1. Personalized Content Recommendations GenAI thrives on understanding user preferences and behaviors to create user-specific experiences. Whether it’s a streaming app suggesting shows based on past views or an e-commerce app showcasing products aligned with browsing history, GenAI works to increase engagement. 2. Conversational AI and Chatbots Chatbots powered by GenAI are improving customer support within apps. These chatbots can respond instantly, answer user questions, and even detect sentiment. Imagine a financial app that not only assists with account inquiries but also provides user-specific investment advice based on the user’s risk profile. GenAI-powered chatbots create genuine, empathetic, and real-time connections that make users feel understood. 3. Custom Visual Asset Creation Tools like DALL-E 2 and DiffusionBee allow apps to generate visual elements based on user preferences or specific brand aesthetics. For instance, a gaming app could use GenAI to generate unique characters, while a retail app might create product images that feel customized to the individual shopper’s style. Such targeted visual engagement is both a differentiator and a driver of brand loyalty. 4. Personalized Voice and Video Platforms like Synthesia bring a new dimension to mobile app personalization by generating realistic video content tailored to the user. Imagine a fitness app that presents a virtual trainer speaking directly to the user, giving workout tips based on individual progress. It’s this level of personalization that deepens connections with users and improves app experiences to an entirely new level. AI Tools For Mobile App Development Today’s GenAI tools are built to accelerate and refine the development process. Below are some examples: GitHub Copilot: GitHub Copilot offers code suggestions, completes functions, and streamlines testing. With this, development becomes faster and more efficient, enabling teams to invest more time in perfecting personalization features.Amazon Code Whisperer: Similar to GitHub Copilot, this tool generates code suggestions that improve efficiency. 
With GenAI supporting code development, teams can bring complex ideas to life more easily and quickly than ever before.DALL-E 2 and DiffusionBee: As text-to-image models, these tools support design teams in creating custom visuals on demand, from app icons to user-centric in-app illustrations.Synthesia: This tool generates lifelike video content that can support personalized onboarding, tutorials, or demos within apps, giving users an immersive, guided experience. Challenges of GenAI in Mobile Applications While GenAI shows promise, it also comes with challenges that must be addressed: 1. Data Privacy and Security Users trust apps with personal data in exchange for convenience. However, a massive 87% of users say they would refuse to do business with companies that don’t protect their data, and 71% would stop using a service that misuses sensitive data. A strong commitment to data privacy, coupled with transparent data usage practices, is paramount. 2. Maintaining Balance in Personalization Having the right balance between personalization and user privacy is key. GenAI enables hyper-personalized experiences, but users must have control over how much data is used. Transparent opt-ins, straightforward privacy settings, and clear communication on data usage strengthen this trust. 3. Addressing Ethical Concerns and Avoiding Bias GenAI models can reproduce biases. This affects the user experience negatively. Developers need to implement strict testing and continuous audits to make sure that the personalized experiences are fair, unbiased, and reflective of the diverse user base. Best Practices for GenAI Implementation for Personalized Experiences Here are some measures that mobile app developers must keep in mind for a reliable user experience: Collect only essential data and anonymize it whenever possible to protect user privacy.Use differential privacy techniques and edge processing to keep data local to the device, reducing reliance on external servers and minimizing exposure of sensitive information.Add clear, in-app privacy controls, allowing users to manage their data preferences directly.Offer options for users to toggle on/off features such as personalized recommendations or chatbot responses, improving transparency and giving users active control over their data.Use robust encryption and other security protocols to protect data from unauthorized access.Conduct regular audits of GenAI algorithms to detect and mitigate any potential biases or ethical issues. These practices will help in maintaining privacy regulations like GDPR and CCPA, ensuring user trust by safeguarding their personal data. Real-World Success Stories of GenAI in Action Spotify Audio streaming app Spotify leverages generative AI to analyze user listening habits, moods, and preferences, crafting a highly personalized experience through features like Discover Weekly and Daily Mix playlists. This approach not only delights users by offering relevant content but also encourages deeper exploration, setting Spotify apart in customer engagement. Duolingo Similarly, Duolingo utilizes GenAI to create an adaptive learning environment, delivering personalized feedback and dynamically adjusting lessons based on individual progress. This user-centric experience boosts motivation by aligning with each learner’s pace and proficiency. Additionally, it makes language learning more enjoyable and effective. Snapchat Snapchat is known for its dynamic filters and visual effects. 
It does so by using AI to provide users with personalized content that resonates with their past interactions. By integrating tools like DALL-E 2, Snapchat constantly refreshes its offerings. This leads to unique, engaging, and highly customized user interactions each time they use the app. Together, these examples illustrate how GenAI can elevate user engagement and loyalty by delivering deeply personalized, context-aware experiences. Conclusion Using generative AI in mobile apps is an evolution of what customers expect from their digital experiences. GenAI equips app developers with the means to drive engagement and satisfaction through personalized interactions. Mobile application developers can leverage GenAI to truly delight their users by thoughtfully working around data challenges, adopting the right tools, and continuous iteration based on feedback.
In digital communication, email remains a primary tool for both personal and business correspondence. However, as email usage has grown, so has the prevalence of spam and malicious emails. Organizations like Spamhaus work tirelessly to maintain email security, protect users from spam, and set standards for email etiquette. By using machine learning (ML) and artificial intelligence (AI), Spamhaus can improve its email filtering accuracy, better identify malicious senders, and promote responsible emailing practices. This article explores how machine learning and AI can be leveraged for Spamhaus email etiquette and security, highlighting techniques used for spam detection, filtering, and upholding responsible emailing standards.

Section 1: The Role of Spamhaus in Email Etiquette and Security
Spamhaus is a non-profit organization that maintains several real-time databases used to identify and block spam sources. By analyzing IP addresses, domain reputations, and known malicious activities, Spamhaus helps internet service providers (ISPs) and organizations filter out unwanted emails. Beyond spam blocking, Spamhaus also establishes guidelines for email etiquette to help prevent legitimate messages from being flagged and to promote ethical practices in email marketing and communication.

Section 2: Machine Learning Techniques in Spam Detection and Filtering
1. Supervised Machine Learning for Email Classification
Spam vs. ham classification: Supervised learning models, such as decision trees, support vector machines, and logistic regression, can be trained on labeled datasets containing spam (unwanted emails) and ham (legitimate emails) examples. These models learn the distinguishing features between spam and non-spam emails based on keywords, sender reputation, frequency of certain terms, and more.
Feature extraction: Machine learning models rely on features such as email subject lines, sender metadata, URLs, and attachments. By identifying specific words, links, and patterns associated with spam, the models can classify emails more accurately.
2. Natural Language Processing (NLP) for Content Analysis
NLP techniques can analyze the content and language structure within emails. Spam messages often use certain phrases, misspellings, or urgent language to deceive users. NLP models, such as sentiment analysis and named entity recognition, can identify these patterns and flag potentially harmful emails. Using techniques like Word2Vec or TF-IDF, words and phrases in an email can be converted into numerical vectors that capture their contextual meaning. These vectors help the ML model understand the text better and identify suspicious language patterns.
3. Bayesian Filtering
Bayesian filtering is a probabilistic approach commonly used in spam detection. This method calculates the likelihood that an email is spam based on the frequency of certain words or features in the email. As the filter is trained with more spam and ham emails, it continually improves its accuracy. A brief, illustrative classifier along these lines appears below.
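As an illustration of the supervised and Bayesian-style filtering described above, here is a minimal scikit-learn sketch (not Spamhaus code) that converts email text into TF-IDF vectors and trains a Multinomial Naive Bayes spam/ham classifier. The tiny inline dataset is purely hypothetical.

Python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical labeled examples: 1 = spam, 0 = ham.
emails = [
    "WIN a FREE prize now, click this link",
    "Urgent: verify your account password immediately",
    "Meeting moved to 3pm, see updated agenda attached",
    "Lunch tomorrow? Let me know what works for you",
]
labels = [1, 1, 0, 0]

# TF-IDF turns each email into a weighted bag-of-words vector;
# Multinomial Naive Bayes then scores how "spam-like" those weights are.
classifier = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), MultinomialNB())
classifier.fit(emails, labels)

test = ["Click now to claim your free prize", "Agenda for tomorrow's meeting"]
print(classifier.predict(test))        # e.g. [1 0]
print(classifier.predict_proba(test))  # per-class probabilities

A production filter would, of course, train on a large labeled corpus and combine this text signal with sender reputation and metadata features.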
Section 3: AI-Powered Enhancements for Spamhaus Email Etiquette
1. Unsupervised Learning for Pattern Detection
Unlike supervised models, unsupervised learning does not rely on labeled data. Instead, it identifies patterns and anomalies in email data. Techniques like clustering and anomaly detection can be used to find unusual email patterns that may indicate spam or phishing attempts.
Clustering algorithms: By grouping similar emails together, clustering algorithms (e.g., K-means) can help Spamhaus identify patterns in spam emails that are evolving or changing over time, such as new phishing tactics or scams.
2. Deep Learning Models for Phishing Detection
Phishing attacks are one of the biggest email security challenges, as they are often sophisticated and hard to detect. Deep learning models, such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs), can analyze the entire structure of an email, including headers, content, and hyperlinks, to identify potential phishing attempts with high accuracy.
3. AI-Driven Domain and IP Reputation Scoring
By analyzing historical data on domains and IP addresses, AI models can assign reputation scores to various sources. These scores are based on factors like the frequency of spam reports, associations with known malicious activity, and unusual email-sending patterns. A low reputation score could result in an email being flagged as suspicious or blocked entirely.
4. Adaptive Learning With Reinforcement Techniques
Reinforcement learning can be used to create adaptive filters that continuously improve as they interact with new data. These filters adjust their responses based on feedback, refining their spam detection over time, adapting to new spam tactics, and evolving alongside email etiquette.
Section 4: Ensuring Responsible Emailing With AI
1. User Behavior Analytics
Machine learning models can analyze user behavior to detect anomalies, such as unusual sending patterns or spikes in email volume (see the sketch at the end of this section). By identifying these behaviors, Spamhaus can encourage responsible email usage and discourage practices associated with spam-like behavior, even among legitimate senders.
2. Sender Authentication Techniques
AI can help verify sender identities and enhance email authentication using protocols like SPF (Sender Policy Framework), DKIM (DomainKeys Identified Mail), and DMARC (Domain-based Message Authentication, Reporting, and Conformance). Machine learning models can cross-reference these authentication mechanisms to prevent email spoofing and ensure that emails are sent by verified sources.
3. Predictive Modeling for Engagement and Spam-Like Behavior
AI can analyze engagement metrics, such as open rates and click-through rates, to identify email campaigns that might be perceived as spammy by recipients. By offering insights into how recipients interact with emails, predictive models can help senders improve their practices, aligning with Spamhaus guidelines for responsible emailing.
4. Automated Feedback Loops for Continuous Improvement
AI-driven feedback loops can alert email marketers or organizations when their emails are flagged as spam or exhibit characteristics of poor etiquette. These insights can help senders refine their strategies to meet best practices, reducing the chances of legitimate emails being blocked.
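To illustrate the user-behavior analytics idea from Section 4, here is a minimal sketch (an assumption for illustration, not Spamhaus tooling) that flags anomalous sending patterns with an Isolation Forest. The feature columns (emails per hour, distinct recipients, bounce rate) and their values are hypothetical.

Python
import numpy as np
from sklearn.ensemble import IsolationForest

# Hypothetical per-sender features: [emails_per_hour, distinct_recipients, bounce_rate]
normal_senders = np.array([
    [12, 30, 0.01],
    [8, 22, 0.02],
    [15, 40, 0.01],
    [10, 25, 0.03],
])

# Train on behavior considered normal; contamination is the assumed outlier share.
detector = IsolationForest(contamination=0.1, random_state=42)
detector.fit(normal_senders)

new_senders = np.array([
    [11, 28, 0.02],      # looks like the usual pattern
    [900, 5000, 0.35],   # sudden spike in volume and bounces
])
print(detector.predict(new_senders))  # 1 = normal, -1 = anomalous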
Section 5: Benefits and Challenges of Using AI and ML in Email Etiquette Benefits Higher accuracy: AI models can identify nuanced patterns that are difficult for traditional filters to catch, improving accuracy in detecting spam and malicious emails.Real-time detection: Machine learning enables real-time analysis, allowing Spamhaus to block spam emails before they reach the inbox.Better user experience: By reducing false positives and promoting responsible emailing, AI improves the overall email experience for both senders and recipients. Challenges Privacy and data protection: AI models require extensive data, raising concerns about user privacy and data security. Organizations must adhere to data protection regulations and prioritize user privacy.Model bias and fairness: ML models can sometimes exhibit biases based on the data they’re trained on. It’s essential to monitor and correct these biases to avoid mistakenly flagging legitimate senders.Adaptability to evolving threats: Spam and phishing tactics are constantly evolving, requiring AI models to be updated and retrained regularly to stay effective. Conclusion Machine learning and AI have the potential to transform Spamhaus email etiquette and security, improving spam detection, reducing false positives, and enhancing the user experience. By leveraging techniques such as supervised learning, NLP, Bayesian filtering, and unsupervised learning, AI can provide more accurate and adaptive filtering solutions. Additionally, with the integration of user behavior analysis and predictive modeling, AI can support responsible emailing practices, encouraging a safer and more ethical email environment. As these technologies continue to advance, the collaboration between AI and organizations like Spamhaus will play a crucial role in keeping email communication secure, efficient, and courteous. By staying vigilant, continuously refining models, and promoting best practices, the future of email security and etiquette looks promising with the support of machine learning and AI.
Data governance refers to the policies and processes that ensure the management, integrity, and security of organizational data. Traditional frameworks like DAMA-DMBOK and COBIT focus on structured data management and standardizing processes (Otto, 2011). These frameworks are foundational in managing enterprise data but often lack the flexibility needed for AI applications that process unstructured data types (Khatri & Brown, 2010). Generative AI: An Overview Generative AI technologies, including models like GPT, DALL·E, and others, are becoming widespread in industries such as finance, healthcare, and e-commerce. These models generate text, images, and code based on large datasets (IBM, 2022). While the potential of these technologies is vast, they pose governance issues that are not addressed by traditional data management strategies, especially when handling vast, diverse, and unstructured datasets. The Intersection of Data Governance and Generative AI Studies show that generative AI impacts data governance by affecting how data is collected, processed, and utilized (Gartner, 2023). Managing unstructured data — such as media files and PDFs — which does not fit traditional data governance models due to its schema-less nature is crucial. Without effective management and governance, AI applications risk mishandling sensitive data, leading to security breaches and compliance failures. Key Challenges in Data Governance With Generative AI Data Privacy and Security Risks Generative AI systems process vast amounts of data, often including sensitive information. Without robust security measures, organizations face significant risks of data exposure and breaches. Legal frameworks like the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA) mandate stringent data privacy standards, necessitating advanced governance strategies to comply (European Union, 2018; CCPA, 2020). Ethical and Compliance Issues The use of generative AI raises ethical concerns, such as biases in AI outputs and manipulation of data. Compliance challenges arise as organizations attempt to align AI operations with existing regulatory frameworks, which were not designed for the complexities introduced by AI (IBM, 2022). New governance models must account for these issues by integrating ethical standards and compliance checks into AI development processes. Quality Control and Data Integrity Quality control is crucial in ensuring that AI-generated outputs are reliable. Tools such as AWS Glue, Google Cloud’s Data Quality features, and Microsoft Azure Data Factory are essential for maintaining data integrity in AI models. These platforms offer capabilities like data profiling and quality scoring, which help organizations monitor and enhance the quality of their data. Theoretical Framework Data Governance Frameworks Traditional frameworks like DAMA-DMBOK and COBIT emphasize structured data management, data quality assurance, and compliance (Khatri & Brown, 2010). However, these frameworks often fall short when applied to unstructured data, a common element in generative AI. The lack of schema-less data management capabilities poses a risk, as AI models rely heavily on diverse datasets (Otto, 2011). Generative AI Frameworks Generative AI demands new governance frameworks that accommodate its unique challenges. 
Integrating AI-specific considerations such as fine-grained access control, user role permissions, and unstructured data management tools like AWS Glue, AWS Lake Formation, Google Cloud Data Catalog, and Microsoft Azure Cognitive Services is essential. These platforms emphasize the need for robust strategies in AI data management, focusing on discoverability and privacy (Gartner, 2023; IBM, 2022).

Proposed Framework for Data Governance in Generative AI
The proposed framework incorporates elements from traditional governance models but extends them to include tools specifically designed for managing unstructured data and ensuring privacy. For instance, AWS services such as Amazon Textract and AWS Glue can automate data cataloging and metadata extraction, enhancing data governance efficiency in AI applications. This hybrid approach allows organizations to maintain traditional governance standards while integrating AI-specific tools for improved data management.

[Figure: Evolution of Gen AI Applications]

Strategies for Effective Data Governance in the Age of Generative AI
Policy and Framework Development
Organizations must develop AI-specific policies that integrate data privacy, security, and compliance considerations. Data privacy policies should cover masking personally identifiable information (PII) using hashing or redaction techniques, or applying field-level encryption (a brief illustrative sketch appears later in this section). Data should also be segregated by geography, with AI frameworks localized to each region and traffic routed to the appropriate framework based on its origin. Adapting traditional frameworks like DAMA-DMBOK with AI-focused tools can address these challenges, and modernized tools from cloud providers, such as AWS Glue and Amazon Macie, help with data privacy. Most AWS services are designed to be compliant with the geographical region where they are deployed, so choosing an appropriate service in your region helps you adhere to data residency requirements.

Technological Solutions
Using AI and ML technologies to automate governance processes is vital. AWS, Google Cloud, and Microsoft Azure offer advanced tools for managing AI data and ensuring compliance (Gartner, 2023). Implementing these solutions enhances the efficiency and security of data governance practices. Data quality and data enrichment solutions are also important components of the data governance process. When malformed data is ingested into generative AI frameworks, it can cause large language models to hallucinate. Data quality scores from tools like AWS Glue or Informatica can be ingested along with the data, giving the generative AI better context on which data to use. Data enrichment solutions can help avoid bias and toxicity through synthetic data generation, entity resolution, and modification of data points; the enriched data can then be used to train large language models (LLMs).

Continuous Monitoring and Auditing
AI-based monitoring tools can be used for real-time tracking of data usage and potential security threats, allowing organizations to respond swiftly to anomalies. Regular audits using automated tools, such as AWS Audit Manager or Azure Purview, ensure compliance with governance policies, promote transparency, and highlight areas for improvement to maintain effective data governance.
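Before moving on to integration concerns, here is a minimal, hypothetical Python sketch of two of the strategies above: masking PII via hashing and redaction, and attaching a data-quality score as metadata before a record reaches an LLM pipeline. The field names, regular expression, and scoring rule are invented for illustration and are not taken from the article or any specific tool.

Python
import hashlib
import re

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def hash_pii(value: str) -> str:
    # A one-way hash keeps records joinable without exposing the raw identifier.
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:16]

def redact_emails(text: str) -> str:
    # Redaction removes the identifier from free text entirely.
    return EMAIL_RE.sub("[REDACTED_EMAIL]", text)

def quality_score(record: dict) -> float:
    # Toy rule: penalize missing fields; real scores would come from a profiling tool.
    required = ("customer_id", "note", "country")
    present = sum(1 for field in required if record.get(field))
    return present / len(required)

record = {
    "customer_id": "C-1042",
    "note": "Customer jane.doe@example.com asked about a refund.",
    "country": "DE",
}

governed = {
    "customer_id": hash_pii(record["customer_id"]),
    "note": redact_emails(record["note"]),
    "country": record["country"],
    "quality_score": quality_score(record),
}
print(governed)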
Data Integration and Interoperability Solutions
Investing in a unified data management platform that consolidates various data sources — such as data lakes and warehouses — allows consistency and compliance across AI systems. The adoption of interoperability standards and open APIs facilitates secure data exchange between different systems, maintaining data integrity and security across AI platforms while supporting a cohesive governance environment. The industry has a proven track record of ingesting structured data, but ingesting unstructured data is just as vital in data integrations. Today, ingesting unstructured data involves separating the data from its metadata and normalizing the data by imposing a schema. Doing this lets you catalog the metadata of unstructured content, which gives you better discoverability. With a unified data cataloging system, you can discover data more easily and enable better integrations because the data is normalized. Data cataloging tools like AWS Glue Data Catalog, Azure Data Catalog, and Google Cloud Data Catalog provide this functionality, and AWS services like Amazon Textract, Amazon Comprehend, and Amazon Rekognition can extract metadata from unstructured data into these catalogs. Integration tools like AWS Glue and Informatica then help move and combine the data itself.

Cross-Functional Teams and Collaboration
Building cross-functional teams that include data scientists, IT specialists, compliance officers, and business leaders is crucial for aligning data governance strategies with business goals and regulatory requirements. Keeping external stakeholders, like regulators and industry experts, in the loop also helps organizations stay informed about newer regulations and best practices, ensuring proactive policy adjustments.

Conclusion
The successful implementation of data governance initiatives for generative AI establishes a robust, production-ready foundation for secure data management and machine learning. Solutions for building well-governed generative AI data platforms on a cloud such as AWS can be divided into two main workstreams to address the unique requirements of generative AI. In Workstream 1, an Amazon S3 data lake with AWS Lake Formation was set up to ensure secure access, with data pipelines and quality checks providing clean, labeled datasets for model training. Workstream 2 introduced an Amazon Bedrock environment for sophisticated data enrichment, including synthetic data generation and entity resolution to minimize bias and toxicity, and an Amazon SageMaker setup for deploying real-time classification models. Together, these workstreams create a scalable, adaptable framework that supports ongoing data-driven insights. This production-grade setup not only makes data accessible, secure, and organized for model training and operations but also highlights gaps in traditional data governance methods. Generative AI requires enhanced governance practices that exceed traditional frameworks, particularly around privacy, unstructured data management, and continuous monitoring. By integrating AI-specific policies, advanced management tools, and continuous monitoring, organizations can better safeguard data assets, ensuring both security and flexibility in production environments. Future research should build on this foundation by assessing AI governance frameworks across industries, helping organizations develop best practices that adapt to rapidly changing AI landscapes. This ongoing exploration will support the evolution of governance strategies, ensuring robust compliance, data integrity, and operational resilience at scale.
Managing expenses and keeping track of receipts can be cumbersome. Digitalizing receipts and extracting product information automatically can greatly enhance efficiency. In this blog, we’ll build a Receipt Scanner App where users can scan receipts using their phone, extract data from them using OCR (Optical Character Recognition), process the extracted data with OpenAI to identify products and prices, store the data in PostgreSQL, and analyze product prices across different stores. What Does the Receipt Scanner App do? This app allows users to: Scan receipts: Users can take pictures of their receipts with their phone.Extract text: The app will use OCR to recognize the text from the receipt images.Analyze product information: With OpenAI’s natural language processing capabilities, we can intelligently extract the product names and prices from the receipt text.Store data: The extracted data is stored in a PostgreSQL database.Track prices: Users can later retrieve price ranges for products across different stores, providing insights into spending patterns and price comparisons. Tech Stack Overview We'll be using the following technologies: Frontend (Mobile) Expo - React Native: For the mobile app that captures receipt images and uploads them to the backend. Backend Node.js with Express: For handling API requests and managing interactions between the frontend, Google Cloud Vision API, OpenAI, and PostgreSQL.Google Cloud Vision API: For Optical Character Recognition (OCR) to extract text from receipt images.OpenAI GPT-4: For processing and extracting meaningful information (product names, prices, etc.) from the raw receipt text.PostgreSQL: For storing receipt and product information in a structured way. Step 1: Setting Up the Backend with Node.js and PostgreSQL 1. Install the Required Dependencies Let’s start by setting up a Node.js project that will serve as the backend for processing and storing receipt data. Navigate to your project folder and run: Shell mkdir receipt-scanner-backend cd receipt-scanner-backend npm init -y npm install express multer @google-cloud/vision openai pg body-parser cors dotenv 2. Set Up PostgreSQL We need to create a PostgreSQL database that will store information about receipts and products. Create two tables: receipts: Stores metadata about each receipt.products: Stores individual product data, including names, prices, and receipt reference. SQL CREATE TABLE receipts ( id SERIAL PRIMARY KEY, store_name VARCHAR(255), receipt_date DATE ); CREATE TABLE products ( id SERIAL PRIMARY KEY, product_name VARCHAR(255), price DECIMAL(10, 2), receipt_id INTEGER REFERENCES receipts(id) ); 3. Set Up Google Cloud Vision API Go to the Google Cloud Console, create a project, and enable the Cloud Vision API.Download your API credentials as a JSON file and save it in your backend project directory. 4. Set Up OpenAI API Create an account at Open AI and obtain your API key.Store your OpenAI API key in a .envfile like this: Shell OPENAI_API_KEY=your-openai-api-key-here 5. Write the Backend Logic Google Vision API (vision.js) This script will use the Google Cloud Vision API to extract text from the receipt image. 
Google Vision for Text Extraction (vision.js) JavaScript const vision = require('@google-cloud/vision'); const client = new vision.ImageAnnotatorClient({ keyFilename: 'path-to-your-google-vision-api-key.json', }); async function extractTextFromImage(imagePath) { const [result] = await client.textDetection(imagePath); const detections = result.textAnnotations; return detections[0]?.description || ''; } module.exports = { extractTextFromImage }; OpenAI Text Processing (openaiService.js) This service will use OpenAI GPT-4 to analyze the extracted text and identify products and their prices. JavaScript const { Configuration, OpenAIApi } = require('openai'); const configuration = new Configuration({ apiKey: process.env.OPENAI_API_KEY, }); const openai = new OpenAIApi(configuration); async function processReceiptText(text) { const prompt = ` You are an AI that extracts product names and prices from receipt text. Here’s the receipt data: "${text}" Return the data as a JSON array of products with their prices, like this: [{"name": "Product1", "price": 9.99}, {"name": "Product2", "price": 4.50}] `; // GPT-4 is a chat model, so it must be called through the chat completions endpoint const response = await openai.createChatCompletion({ model: 'gpt-4', messages: [{ role: 'user', content: prompt }], max_tokens: 500, }); return response.data.choices[0].message.content.trim(); } module.exports = { processReceiptText }; Setting Up Express (app.js) Now, we’ll integrate the OCR and AI processing in our Express server. This server will handle image uploads, extract text using Google Vision API, process the text with OpenAI, and store the results in PostgreSQL. JavaScript require('dotenv').config(); const express = require('express'); const multer = require('multer'); const { Pool } = require('pg'); const { extractTextFromImage } = require('./vision'); const { processReceiptText } = require('./openaiService'); const app = express(); app.use(express.json()); const pool = new Pool({ user: 'your-db-user', host: 'localhost', database: 'your-db-name', password: 'your-db-password', port: 5432, }); const upload = multer({ dest: 'uploads/' }); app.get('/product-price-range/:productName', async (req, res) => { const { productName } = req.params; try { // Query to get product details, prices, and store names const productDetails = await pool.query( `SELECT p.product_name, p.price, r.store_name, r.receipt_date FROM products p JOIN receipts r ON p.receipt_id = r.id WHERE p.product_name ILIKE $1 ORDER BY p.price ASC`, [`%${productName}%`] ); if (productDetails.rows.length === 0) { return res.status(404).json({ message: 'Product not found' }); } res.json(productDetails.rows); } catch (error) { console.error(error); res.status(500).json({ error: 'Failed to retrieve product details.' }); } }); app.post('/upload-receipt', upload.single('receipt'), async (req, res) => { try { const imagePath = req.file.path; const extractedText = await extractTextFromImage(imagePath); const processedData = await processReceiptText(extractedText); const products = JSON.parse(processedData); const receiptResult = await pool.query( 'INSERT INTO receipts (store_name, receipt_date) VALUES ($1, $2) RETURNING id', ['StoreName', new Date()] ); const receiptId = receiptResult.rows[0].id; for (const product of products) { await pool.query( 'INSERT INTO products (product_name, price, receipt_id) VALUES ($1, $2, $3)', [product.name, product.price, receiptId] ); } res.json({ message: 'Receipt processed and stored successfully.' }); } catch (error) { console.error(error); res.status(500).json({ error: 'Failed to process receipt.'
}); } }); app.listen(5000, () => { console.log('Server running on port 5000'); }); Step 2: Building the React Native Frontend Now that our backend is ready, we’ll build the React Native app for capturing and uploading receipts. 1. Install React Native and Required Libraries Plain Text npx expo init receipt-scanner-app cd receipt-scanner-app npm install axios expo-image-picker 2. Create the Receipt Scanner Component This component will allow users to capture an image of a receipt and upload it to the backend for processing. App.js JavaScript import React from 'react'; import { NavigationContainer } from '@react-navigation/native'; import { createStackNavigator } from '@react-navigation/stack'; import ProductPriceSearch from './ProductPriceSearch'; // Import the product price search screen import ReceiptUpload from './ReceiptUpload'; // Import the receipt upload screen const Stack = createStackNavigator(); export default function App() { return ( <NavigationContainer> <Stack.Navigator initialRouteName="ReceiptUpload"> <Stack.Screen name="ReceiptUpload" component={ReceiptUpload} /> <Stack.Screen name="ProductPriceSearch" component={ProductPriceSearch} /> </Stack.Navigator> </NavigationContainer> ); } ProductPriceSearch.js JavaScript import React, { useState } from 'react'; import { View, Text, TextInput, Button, FlatList, StyleSheet } from 'react-native'; import axios from 'axios'; const ProductPriceSearch = () => { const [productName, setProductName] = useState(''); const [productDetails, setProductDetails] = useState([]); const [message, setMessage] = useState(''); // Function to search for a product and retrieve its details const handleSearch = async () => { try { const response = await axios.get(`http://localhost:5000/product-price-range/${productName}`); setProductDetails(response.data); setMessage(''); } catch (error) { console.error(error); setMessage('Product not found or error retrieving data.'); setProductDetails([]); // Clear previous search results if there was an error } }; const renderProductItem = ({ item }) => ( <View style={styles.item}> <Text style={styles.productName}>Product: {item.product_name}</Text> <Text style={styles.storeName}>Store: {item.store_name}</Text> <Text style={styles.price}>Price: ${item.price}</Text> </View> ); return ( <View style={styles.container}> <Text style={styles.title}>Search Product Price by Store</Text> <TextInput style={styles.input} placeholder="Enter product name" value={productName} onChangeText={setProductName} /> <Button title="Search" onPress={handleSearch} /> {message ? 
<Text style={styles.error}>{message}</Text> : null} <FlatList data={productDetails} keyExtractor={(item, index) => index.toString()} renderItem={renderProductItem} style={styles.list} /> </View> ); }; const styles = StyleSheet.create({ container: { flex: 1, justifyContent: 'center', padding: 20, }, title: { fontSize: 24, textAlign: 'center', marginBottom: 20, }, input: { height: 40, borderColor: '#ccc', borderWidth: 1, padding: 10, marginBottom: 20, }, list: { marginTop: 20, }, item: { padding: 10, backgroundColor: '#f9f9f9', borderBottomWidth: 1, borderBottomColor: '#eee', marginBottom: 10, }, productName: { fontSize: 18, fontWeight: 'bold', }, storeName: { fontSize: 16, marginTop: 5, }, price: { fontSize: 16, color: 'green', marginTop: 5, }, error: { color: 'red', marginTop: 10, textAlign: 'center', }, }); export default ProductPriceSearch; ReceiptUpload.js JavaScript import React, { useState } from 'react'; import { View, Button, Image, Text, StyleSheet } from 'react-native'; import * as ImagePicker from 'expo-image-picker'; import axios from 'axios'; const ReceiptUpload = () => { const [receiptImage, setReceiptImage] = useState(null); const [message, setMessage] = useState(''); // Function to open the camera and capture a receipt image const captureReceipt = async () => { const permissionResult = await ImagePicker.requestCameraPermissionsAsync(); if (permissionResult.granted === false) { alert('Permission to access camera is required!'); return; } const result = await ImagePicker.launchCameraAsync(); if (!result.cancelled) { setReceiptImage(result.uri); } }; // Function to upload the receipt image to the backend const handleUpload = async () => { if (!receiptImage) { alert('Please capture a receipt image first!'); return; } const formData = new FormData(); formData.append('receipt', { uri: receiptImage, type: 'image/jpeg', name: 'receipt.jpg', }); try { const response = await axios.post('http://localhost:5000/upload-receipt', formData, { headers: { 'Content-Type': 'multipart/form-data' }, }); setMessage(response.data.message); } catch (error) { console.error(error); setMessage('Failed to upload receipt.'); } }; return ( <View style={styles.container}> <Text style={styles.title}>Upload Receipt</Text> <Button title="Capture Receipt" onPress={captureReceipt} /> {receiptImage && ( <Image source={{ uri: receiptImage } style={styles.receiptImage} /> )} <Button title="Upload Receipt" onPress={handleUpload} /> {message ? <Text style={styles.message}>{message}</Text> : null} </View> ); }; const styles = StyleSheet.create({ container: { flex: 1, justifyContent: 'center', padding: 20, }, title: { fontSize: 24, textAlign: 'center', marginBottom: 20, }, receiptImage: { width: 300, height: 300, marginTop: 20, marginBottom: 20, }, message: { marginTop: 20, textAlign: 'center', color: 'green', }, }); export default ReceiptUpload; Explanation expo-image-picker is used to request permission to access the device's camera and to capture an image of the receipt.The captured image is displayed on the screen and then uploaded to the backend using axios. 3. Running the App To run the app: Start the Expo development server: Plain Text npx expo start Scan the QR code using the Expo Go app on your phone. The app will load, allowing you to capture and upload receipts. 
Step 3: Running the Application
Start the Backend
Run the backend on port 5000:
Plain Text node app.js
Run the React Native App
Open the iOS or Android emulator, or the Expo Go app on your phone, and start the app:
Plain Text cd receipt-scanner-app npx expo start
Once the app is running:
Capture a receipt image.
Upload the receipt to the backend.
The backend will extract the text, process it with OpenAI, and store the data in PostgreSQL.
Step 4: Next Steps
Enhancements
Authentication: Implement user authentication so that users can manage their personal receipts and data.
Price comparison: Provide analytics and price comparison across different stores for the same product.
Improve parsing: Enhance the receipt parsing logic to handle more complex receipt formats with OpenAI.
Conclusion
We built a Receipt Scanner App from scratch using:
Expo - React Native for the frontend.
Node.js, Google Cloud Vision API, and OpenAI for text extraction and data processing.
PostgreSQL for storing and querying receipt data.
The Receipt Scanner App we built provides users with a powerful tool to manage their receipts and gain valuable insights into their spending habits. By leveraging AI-powered text extraction and analysis, the app automates the process of capturing, extracting, and storing receipt data, saving users from the hassle of manual entry. This app allows users to:
Easily scan receipts: Using their mobile phone, users can capture receipts quickly and effortlessly without needing to manually input data.
Track spending automatically: Extracting product names, prices, and other details from receipts helps users keep a detailed log of their purchases, making expense tracking seamless.
Compare product prices: The app can provide price ranges for products across different stores, empowering users to make smarter shopping decisions and find the best deals.
Organize receipts efficiently: By storing receipts in a structured database, users can easily access and manage their purchase history. This is particularly useful for budgeting, tax purposes, or warranty claims.
Overall, the Receipt Scanner App is a valuable tool for anyone looking to streamline their receipt management, track their spending patterns, and make data-driven decisions when shopping. With features like AI-powered text processing, automatic product identification, and price comparison, users benefit from a more organized, efficient, and intelligent way of managing their personal finances and shopping habits. By automating these tasks, the app frees up time and reduces errors, allowing users to focus on more important things. Whether you're tracking business expenses, managing household finances, or simply looking for the best deals on products, this app simplifies the process and adds value to everyday tasks.
2024 was promised to be the year of generative AI. Instead, it has been the year of catastrophic software outages. Earlier this year, we saw outages affecting high-street shops, banks, and cloud vendors, whilst those of us in the UK saw the Post Office Horizon IT scandal reach new levels of public outrage. Having made a living in recent times investigating and helping resolve such scandals, I found myself amid a furor after leading a study that found that Agile wasn't all it was cracked up to be. After the international crises following the CrowdStrike outage helped underscore the point, I spoke to The Register about how catastrophic takes on Agile feed into failure. Broadly, I spoke out against interpretations of Agile that focus on shipping the latest features as quickly as possible, DevOps metrics that disregard issues so long as they're fixed quickly, and digital transformation strategies where the informed consent of those being 'transformed' is disregarded (despite the low success rates). However, in truth, at its core, the Agile failure research was speaking to a deeper factor: loss aversion. Back in 1979, now-Nobel-winner Daniel Kahneman and psychologist Amos Tversky coined the term to describe how humans feel the pain of a loss roughly twice as strongly as the pleasure of an equivalent gain. I first learned about this when studying cognitive psychology at the University of Cambridge, and with this understanding, it isn't hard to see why tech scandals can start with a technical cause but snowball into tragedy through cover-up attempts. Through investigating tech project failures and software catastrophes at scale, the pattern of minor problems snowballing into tragedy was blindingly obvious, and it was the underlying hypothesis of the Agile failure rate study. The solution? Yes, the psychological safety to discuss and address problems led to an 87% increase in project success, but the one factor that came above this was clear requirements, with a 97% increase in success rates — something we can all achieve with less pain than changing deep-rooted psychological factors. In other words, having a gate to discuss and address problems when loss aversion is at its lowest allows for the greatest improvement in success rates. However, this seemingly obvious proposition might not be universally obvious. Over the past six months, I've spoken about technology disasters to audiences of politicians at the UK Parliament, alongside engineers and lawyers. It may well be controversial to say this, but what I've found is that those equipped with the mental models to work across multiple disciplines have been able to engage with these problems and find solutions in a way that often goes over the heads of those who can only operate in one discipline. This interdisciplinary approach allowed the cybersecurity community to address severe problems of huge public interest in an incredibly effective way by adopting expertise spanning technical, social, and legal factors. Unfortunately, in my experience, there is still an aversion to doing the same in software engineering. Software engineering is stronger as a result of learning from the catastrophes of the past year, but the longer-term solution rests in us being less averse to having expertise spanning other disciplines alongside software development in our profession. That way, addressing the next 'Agile' won't hurt like hell.
Snowflake’s Snowpark brings machine learning (ML) closer to your data by enabling developers and data scientists to use Python for ML workflows directly within the Snowflake Data Cloud. Here are some of the advantages of using Snowpark for machine learning:
Process data and build models within Snowflake, reducing data movement and latency.
Scale ML tasks efficiently using Snowflake's elastic compute capabilities.
Centralize data pipelines, transformations, and ML workflows in one environment.
Write code in Python, Java, or Scala for seamless library integration.
Integrate Snowpark with tools like Jupyter and Streamlit for enhanced workflows.
In this tutorial, I'll walk you through setting up the Snowpark ML library, configuring your environment, and implementing a basic ML use case.

Step 1: Prerequisites
Before getting into Snowpark ML, ensure you have the following:
A Snowflake account.
SnowSQL CLI or any supported Snowflake IDE (e.g., Snowsight).
Python 3.8+ installed locally.
The necessary Python packages: snowflake-snowpark-python and scikit-learn.
Install the required packages using pip:
Shell pip install snowflake-snowpark-python scikit-learn

Step 2: Set Up Snowpark ML Library
1. Ensure your Snowflake account is Snowpark-enabled. You can verify or enable it via your Snowflake admin console.
2. Create a stage to hold your Python packages and model artifacts:
SQL CREATE STAGE my_python_lib;
3. Upload your required Python packages (like scikit-learn) to the stage. For example, use this command to upload a file:
Shell snowsql -q "PUT file://path/to/your/package.zip @my_python_lib AUTO_COMPRESS=TRUE;"
4. Grant permissions to the Snowpark role to use external libraries:
SQL GRANT USAGE ON STAGE my_python_lib TO ROLE my_role;

Step 3: Configure Snowflake Connection in Python
Set up your Python script to connect to Snowflake:
Python from snowflake.snowpark import Session # Define your Snowflake connection parameters connection_parameters = { "account": "your_account", "user": "your_username", "password": "your_password", "role": "your_role", "warehouse": "your_warehouse", "database": "your_database", "schema": "your_schema" } # Create a Snowpark session session = Session.builder.configs(connection_parameters).create() print("Connection successful!")

Step 4: A Simple ML Use Case – Predicting Customer Attrition
Data Preparation
1. Load a sample dataset into Snowflake:
SQL CREATE OR REPLACE TABLE cust_data ( cust_id INT, age INT, monthly_exp FLOAT, attrition INT ); INSERT INTO cust_data VALUES (1, 25, 50.5, 0), (2, 45, 80.3, 1), (3, 30, 60.2, 0), (4, 50, 90.7, 1);
2. Access the data in Snowpark:
Python df = session.table("cust_data") print(df.collect())
Building an Attrition Prediction Model
1. Extract features and labels:
Python from snowflake.snowpark.functions import col features = df.select(col("age"), col("monthly_exp")) labels = df.select(col("attrition"))
2. Locally train a logistic regression model using scikit-learn:
Python from sklearn.linear_model import LogisticRegression import numpy as np # Prepare data X = np.array(features.collect()) y = np.array(labels.collect()).ravel() # Train model model = LogisticRegression() model.fit(X, y) print("Model trained successfully!")
3. Locally save the model, then deploy it to Snowflake as a stage file (staged without auto-compression so the file keeps its name):
Python import pickle pickle.dump(model, open("attrition_model.pkl", "wb"))
Shell snowsql -q "PUT file://attrition_model.pkl @my_python_lib AUTO_COMPRESS=FALSE;"
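Note: the UDF in the next step opens "attrition_model.pkl" by its local filename, which works when testing locally but not inside Snowflake's execution sandbox. A common Snowpark pattern, shown here as a hedged sketch that assumes the stage and file names above, is to attach the staged file as an import and read it from the UDF's import directory:

Python
# Hypothetical sketch (not part of the original tutorial).
# Attach the staged model so Snowflake ships it to the UDF's sandbox,
# and declare the package the UDF needs at runtime.
session.add_import("@my_python_lib/attrition_model.pkl")
session.add_packages("scikit-learn")

def load_staged_model():
    import os, sys, pickle
    # Snowflake exposes imported files under this directory inside the UDF sandbox.
    import_dir = sys._xoptions["snowflake_import_directory"]
    with open(os.path.join(import_dir, "attrition_model.pkl"), "rb") as f:
        return pickle.load(f)

The predict_attrition function in the next step could then call load_staged_model() instead of opening a relative path.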
Predict Customer Attrition in Snowflake
1. Use Snowflake’s UDFs to load and use the model:
Python from snowflake.snowpark.types import IntType, FloatType import pickle # Define a UDF def predict_attrition(age, monthly_exp): # For a real deployment, load the staged model via the import directory (see the sketch above) model = pickle.load(open("attrition_model.pkl", "rb")) return model.predict([[age, monthly_exp]])[0] # Register the UDF and keep a handle for use in DataFrame expressions predict_attrition_udf = session.udf.register(predict_attrition, return_type=IntType(), input_types=[IntType(), FloatType()])
2. Apply the UDF to predict attrition:
Python result = df.select(col("cust_id"), predict_attrition_udf(col("age"), col("monthly_exp")).alias("attrition_prediction")) result.show()

Best Practices for Snowflake Snowpark in ML
Use Snowflake's SQL engine for preprocessing to boost performance.
Design efficient UDFs for non-native computations and limit the data passed to them.
Version and store models centrally for easy deployment and tracking.
Monitor resource usage with query profiling and optimize warehouse scaling.
Validate pipelines with sample data before running on full datasets.

Conclusion
You’ve successfully set up Snowpark ML, configured your environment, and implemented a basic attrition prediction model. Snowpark allows you to scale ML workflows directly within Snowflake, reducing data movement and improving operational efficiency.
In this blog post, we will explore how to implement CRUD (Create, Read, Update, Delete) operations using Natural Language Processing (NLP) with the Microsoft.Extensions.AI library in a .NET Web API application. We will utilize the power of NLP to interact with our application through natural language queries and perform CRUD operations on a light management system. Step-by-Step Guide 1. Create a .NET Web API Application First, let's create a new Web API project using the dotnet CLI: Plain Text dotnet new webapi -o lightsmeai This command generates a basic Web API project named "lightsmeai." 2. Add Required Packages Next, we need to add the necessary packages to our project. These packages include Azure.AI.OpenAI, Azure.Identity, DotNetEnv, Microsoft.AspNetCore.OpenApi, Microsoft.Extensions.AI, and more. Run the following commands to install the required packages: Plain Text dotnet add package Azure.AI.OpenAI --version 2.1.0-beta.2 dotnet add package Azure.Identity --version 1.13.1 dotnet add package DotNetEnv --version 3.1.1 dotnet add package Microsoft.AspNetCore.OpenApi --version 8.0.1 dotnet add package Microsoft.Extensions.AI --version 9.0.0-preview.9.24556.5 dotnet add package Microsoft.Extensions.AI.AzureAIInference --version 9.0.0-preview.9.24556.5 dotnet add package Microsoft.Extensions.AI.OpenAI --version 9.0.0-preview.9.24556.5 dotnet add package Swashbuckle.AspNetCore --version 6.4.0 3. Configure Program.cs In the Program.cs file, we set up the necessary configurations and services for our application. Here's the code snippet: Plain Text using Azure; using Azure.AI.Inference; using Azure.AI.OpenAI; using DotNetEnv; using Microsoft.Extensions.AI; // Get keys from configuration Env.Load(".env"); string githubKey = Env.GetString("GITHUB_KEY"); var builder = WebApplication.CreateBuilder(args); // Add services to the container. builder.Services.AddControllers(); builder.Services.AddEndpointsApiExplorer(); builder.Services.AddSwaggerGen(); // Add the chat client IChatClient innerChatClient = new ChatCompletionsClient( endpoint: new Uri("<https://models.inference.ai.azure.com>"), new AzureKeyCredential(githubKey)) .AsChatClient("gpt-4o-mini"); builder.Services.AddChatClient(chatClientBuilder => chatClientBuilder .UseFunctionInvocation() // .UseLogging() .Use(innerChatClient)); // Register embedding generator builder.Services.AddSingleton<IEmbeddingGenerator<string, Embedding<float>>>(sp => new AzureOpenAIClient(new Uri("<https://models.inference.ai.azure.com>"), new AzureKeyCredential(githubKey)) .AsEmbeddingGenerator(modelId: "text-embedding-3-large")); builder.Services.AddLogging(loggingBuilder => loggingBuilder.AddConsole().SetMinimumLevel(LogLevel.Trace)); var app = builder.Build(); // Configure the HTTP request pipeline. if (app.Environment.IsDevelopment()) { app.UseSwagger(); app.UseSwaggerUI(); } app.UseStaticFiles(); // Enable serving static files app.UseRouting(); // Must come before UseEndpoints app.UseAuthorization(); app.MapControllers(); // Serve index.html as the default page app.MapFallbackToFile("index.html"); app.Run(); 4. Add ChatController Let's create a ChatController to handle natural language queries and perform CRUD operations. 
Here's the code for the ChatController:

C#

using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Extensions.AI;

namespace lightsmeai.Controllers
{
    [ApiController]
    [Route("[controller]")]
    public class ChatController : ControllerBase
    {
        private readonly IChatClient _chatClient;
        private readonly IEmbeddingGenerator<string, Embedding<float>> _embeddingGenerator;
        private readonly ChatOptions _chatOptions;

        public ChatController(
            IChatClient chatClient,
            IEmbeddingGenerator<string, Embedding<float>> embeddingGenerator,
            ChatOptions chatOptions
        )
        {
            _chatClient = chatClient;
            _embeddingGenerator = embeddingGenerator;
            _chatOptions = chatOptions;
        }

        [HttpPost("chat")]
        public async Task<ActionResult<IEnumerable<string>>> Chat(string userMessage)
        {
            var messages = new List<ChatMessage>
            {
                new(Microsoft.Extensions.AI.ChatRole.System, """
                    You answer any question.
                    Hey there, I'm Lumina, your friendly lighting assistant!
                    I can help you with all your lighting needs. You can ask me to
                    turn on the light, get the status of the light, turn off all the
                    lights, add a new light, or delete the light.
                    For an update, you should create an object like the one below.
                    Sometimes the user will pass all key values, or only one or two.
                    {
                        "id": 6,
                        "name": "Chandelier",
                        "Switched": false
                    }
                    Just let me know what you need and I'll do my best to help!
                    """),
                new(Microsoft.Extensions.AI.ChatRole.User, userMessage)
            };

            var response = await _chatClient.CompleteAsync(messages, _chatOptions);
            return Ok(response.Message.Text);
        }
    }
}

5. Remove WeatherForecast-Related Code

We will remove the WeatherForecast-related code from the Program.cs file, as it is not relevant to our CRUD operations.

6. Add Microsoft.EntityFrameworkCore.InMemory

To manage our light data, we will use an in-memory database provided by Microsoft.EntityFrameworkCore.InMemory. Install the package using the following command:

Plain Text

dotnet add package Microsoft.EntityFrameworkCore.InMemory

7. Add the Model and Its DbContext

Let's define a Light model to represent our light entities and create a LightContext to manage the in-memory database. Here's the code:

C#

using System.Text.Json.Serialization;

namespace lightsmeai.Models
{
    public class Light
    {
        [JsonPropertyName("id")]
        public int Id { get; set; }

        [JsonPropertyName("name")]
        public string? Name { get; set; }

        [JsonPropertyName("Switched")]
        public bool? Switched { get; set; }
    }
}

C#

using lightsmeai.Models;
using Microsoft.AspNetCore.Mvc;
using Microsoft.EntityFrameworkCore;

namespace lightsmeai.Controllers
{
    [Route("api/[controller]")]
    [ApiController]
    public class LightController : ControllerBase
    {
        // Initialize the context within the constructor
        private readonly LightContext _context;

        public LightController()
        {
            var options = new DbContextOptionsBuilder<LightContext>()
                .UseInMemoryDatabase("LightList")
                .Options;
            _context = new LightContext(options);
        }

        // CRUD operations implementation...
    }
}

8. Check the LightController REST API Endpoints in Swagger

After setting up the LightController, we can check the REST API endpoints in Swagger to interact with our light management system.

9. Add Functions in Program.cs

In the Program.cs file, we will add functions to expose the CRUD operations as tools for the chat client.
Here's the updated code:

C#

using Azure;
using Azure.AI.Inference;
using Azure.AI.OpenAI;
using DotNetEnv;
using lightsmeai.Controllers;
using Microsoft.Extensions.AI;

// Get keys from configuration
Env.Load(".env");
string githubKey = Env.GetString("GITHUB_KEY");

var builder = WebApplication.CreateBuilder(args);

// Add services to the container.
builder.Services.AddControllers();
builder.Services.AddEndpointsApiExplorer();
builder.Services.AddSwaggerGen();

// Add the chat client
IChatClient innerChatClient = new ChatCompletionsClient(
        endpoint: new Uri("https://models.inference.ai.azure.com"),
        new AzureKeyCredential(githubKey))
    .AsChatClient("gpt-4o-mini");

builder.Services.AddChatClient(chatClientBuilder => chatClientBuilder
    .UseFunctionInvocation()
    .UseLogging()
    .Use(innerChatClient));

// Register embedding generator
builder.Services.AddSingleton<IEmbeddingGenerator<string, Embedding<float>>>(sp =>
    new AzureOpenAIClient(
            new Uri("https://models.inference.ai.azure.com"),
            new AzureKeyCredential(githubKey))
        .AsEmbeddingGenerator(modelId: "text-embedding-3-large"));

builder.Services.AddLogging(loggingBuilder =>
    loggingBuilder.AddConsole().SetMinimumLevel(LogLevel.Trace));

// Expose the LightController's CRUD operations as tools for the chat client
var light = new LightController();
var getAllLightsTool = AIFunctionFactory.Create(light.GetLights);
var getLightTool = AIFunctionFactory.Create(light.GetLight);
var createLightTool = AIFunctionFactory.Create(light.AddLight);
var updateLightTool = AIFunctionFactory.Create(light.UpdateLight);
var deleteLightTool = AIFunctionFactory.Create(light.DeleteLight);

var chatOptions = new ChatOptions
{
    Tools = new[]
    {
        getAllLightsTool,
        getLightTool,
        createLightTool,
        updateLightTool,
        deleteLightTool
    }
};

builder.Services.AddSingleton(light);
builder.Services.AddSingleton(chatOptions);

var app = builder.Build();

// Configure the HTTP request pipeline.
if (app.Environment.IsDevelopment())
{
    app.UseSwagger();
    app.UseSwaggerUI();
}

app.UseStaticFiles(); // Enable serving static files
app.UseRouting();     // Must come before UseEndpoints
app.UseAuthorization();
app.MapControllers();

// Serve index.html as the default page
app.MapFallbackToFile("index.html");

app.Run();

10. Make a Few Changes in the ChatController

We will make a few adjustments to the ChatController to utilize the tools we exposed in the previous step. Here's the updated code:
C#

using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Extensions.AI;

namespace lightsmeai.Controllers
{
    [ApiController]
    [Route("[controller]")]
    public class ChatController : ControllerBase
    {
        private readonly IChatClient _chatClient;
        private readonly IEmbeddingGenerator<string, Embedding<float>> _embeddingGenerator;
        private readonly ChatOptions _chatOptions;

        public ChatController(
            IChatClient chatClient,
            IEmbeddingGenerator<string, Embedding<float>> embeddingGenerator,
            ChatOptions chatOptions
        )
        {
            _chatClient = chatClient;
            _embeddingGenerator = embeddingGenerator;
            _chatOptions = chatOptions;
        }

        [HttpPost("chat")]
        public async Task<ActionResult<IEnumerable<string>>> Chat(string userMessage)
        {
            var messages = new List<ChatMessage>
            {
                new(Microsoft.Extensions.AI.ChatRole.System, """
                    You answer any question.
                    Hey there, I'm Lumina, your friendly lighting assistant!
                    I can help you with all your lighting needs. You can ask me to
                    turn on the light, get the status of the light, turn off all the
                    lights, add a new light, or delete the light.
                    For an update, you should create an object like the one below.
                    Sometimes the user will pass all key values, or only one or two.
                    {
                        "id": 6,
                        "name": "Chandelier",
                        "Switched": false
                    }
                    Just let me know what you need and I'll do my best to help!
                    """),
                new(Microsoft.Extensions.AI.ChatRole.User, userMessage)
            };

            var response = await _chatClient.CompleteAsync(messages, _chatOptions);
            return Ok(response.Message.Text);
        }
    }
}

11. Check the ChatController REST API Endpoints in Swagger

Finally, we can check the REST API endpoints in Swagger to interact with our chat controller and perform CRUD operations using natural language queries.

Wrapping Up

With this setup, users can interact with our light management system through natural language queries, and the application responds with appropriate actions based on the user's input. The Microsoft.Extensions.AI library and the power of NLP enable us to create a more intuitive and user-friendly interface for managing lights.
Since the first applications were brought to market, DevOps teams have faced increasing demands for speed, efficiency, and application reliability. To meet these needs, some are embracing artificial intelligence (AI) to bring more automation, business intelligence, and intelligent decision-making to cloud DevOps.

The Role of AI in DevOps

AI brings new capabilities that let DevOps teams improve the efficiency of their processes, strengthen security, and reduce the need for manual interventions that can be both slow and error-prone. AI's ability to process data and make instantaneous, intelligent decisions is valuable for operations that require analysis, anomaly monitoring, and predictive maintenance, all of which are highly relevant in today's cloud environments. Integrating AI with DevOps raises the level of automation and makes the value stream more adaptable, helping teams respond to new needs and manage new risks.

Enhanced Decision-Making and Predictive Analysis

DevOps teams can use AI to quickly analyze vast datasets from different sources and turn them into usable information. ML models can predict system behavior from historical data, anticipate system failures, and suggest preventive actions. This predictive capability enables proactive decision-making rather than reactive problem-solving and improves system dependability.

Continuous Improvement

AI can continuously learn from each DevOps cycle and identify the bottlenecks in each process. This feedback loop enables gradual enhancements, making it simpler for teams to keep improving the agility of their workflows as they deploy more of them.

Key Areas Where AI Enhances DevOps Automation

AI applications in DevOps span several operational areas. Key benefits include:

1. Automated Testing and Quality Assurance

AI tools enhance testing efficiency by analyzing data from previous test runs to prioritize critical tests, flag likely failure points, and detect defects. By reducing the amount of manual testing, this automation leads to faster and more effective releases.

2. Incident Management and Resolution

AI can improve incident management by quickly spotting abnormal behavior and recommending response measures (a brief anomaly-detection sketch follows this list). Using ML, tools such as AIOps (Artificial Intelligence for IT Operations) platforms identify problems while they are still emerging and recommend remediation steps, decreasing downtime and improving business continuity.

3. Resource Optimization

AI forecasts traffic patterns and adjusts cloud resources accordingly. This makes better use of cloud assets, prevents over-provisioning, cuts costs, and improves flexibility.

4. Security Enhancements

AI also aids security by scanning traffic patterns and identifying potential weak points. By learning from previous security breaches, AI models can spot irregularities, allowing the DevOps team to address threats before they escalate.
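To make the incident-management point above concrete, here is a minimal, hypothetical sketch of statistical anomaly detection over an operational metric. It uses a simple rolling z-score rather than any specific AIOps product, and the latency data, window size, and threshold are illustrative assumptions.

Python

# Hypothetical sketch: flag anomalous latency samples with a rolling z-score.
# The data, window size, and threshold are illustrative, not from a specific tool.
import numpy as np

def detect_anomalies(values, window=30, threshold=3.0):
    """Return indices of samples more than `threshold` std devs from the rolling mean."""
    values = np.asarray(values, dtype=float)
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mean, std = history.mean(), history.std()
        if std > 0 and abs(values[i] - mean) / std > threshold:
            anomalies.append(i)
    return anomalies

# Mostly ~120 ms latencies with one obvious spike at the end.
rng = np.random.default_rng(7)
latencies_ms = list(120 + rng.normal(0, 2, size=100)) + [450]
print(detect_anomalies(latencies_ms))  # the spike at index 100 is flagged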
Implementing AI-Driven DevOps in Cloud Environments

Adopting AI in DevOps entails identifying the right tools, setting up the appropriate infrastructure, and using the right data.

Selecting AI and ML Tools

Cloud providers such as AWS, Azure, Google Cloud, and others now offer AI/ML tools as part of their DevOps offerings, which help with anomaly detection, predictive analytics, and automated response tasks. These tools make it easier to implement AI and to get value from the investment more quickly.

Data Collection and Model Training

AI requires data to build models that can predict and respond to events. Cloud environments naturally generate a significant volume of data; focusing on the most important metrics helps produce useful and pertinent AI-based recommendations for DevOps.

Building a Feedback Loop

An ongoing loop that feeds insights from the deployment process back into the AI models not only facilitates progressive improvement of the processes themselves but also makes workflows more flexible in the face of changing demand.

Challenges and Considerations

The use of AI in DevOps also has drawbacks. Key considerations include:

Data Privacy and Security

AI-driven automation relies heavily on large volumes of data that often contain private information, so it is crucial to protect data privacy and comply with regulations such as the GDPR.

Skills and Expertise

Applying AI in DevOps requires AI expertise, and skill gaps can slow adoption. Hiring or training skilled practitioners helps teams use AI effectively.

Managing Algorithm Bias and Drift

Machine learning models often degrade over time as the data they see in production shifts away from the data they were trained on, a phenomenon known as data drift. Models therefore require validation, monitoring, and retraining to maintain high levels of accuracy (a brief drift-check sketch follows the conclusion below).

Conclusion

AI-powered automation in cloud DevOps has the potential to transform the way businesses deploy, run, and support their applications. By improving testing, incident management, resource optimization, and security, AI-empowered DevOps in the cloud equips every stakeholder to be more flexible, effective, and resilient. A thoughtfully devised AI adoption strategy offers numerous benefits despite some barriers, including data privacy, skill requirements, and model drift. Organizations embracing AI-driven DevOps today can sustain advantages over their competitors in application performance, dependability, and cost-effectiveness. AI gives cloud and DevOps professionals a strategic opportunity to innovate, reduce manual labor, and meet the demands of a complex, modern digital environment.
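Following up on the drift point above, here is a minimal, hypothetical sketch of a drift check that compares a feature's training distribution with recent production data using a two-sample Kolmogorov-Smirnov test. The synthetic data and the p-value threshold are illustrative assumptions.

Python

# Minimal drift check: compare a feature's training distribution with recent
# production values using a two-sample Kolmogorov-Smirnov test.
# The synthetic data and the 0.05 p-value threshold are illustrative assumptions.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(42)
training_feature = rng.normal(loc=0.0, scale=1.0, size=5_000)    # what the model was trained on
production_feature = rng.normal(loc=0.4, scale=1.2, size=1_000)  # what it sees today

statistic, p_value = ks_2samp(training_feature, production_feature)
if p_value < 0.05:
    print(f"Drift detected (KS={statistic:.3f}, p={p_value:.4f}) - consider retraining.")
else:
    print("No significant drift detected.")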
MLOps, or Machine Learning Operations, is a set of practices that combine machine learning (ML), data engineering, and DevOps to streamline and automate the end-to-end ML model lifecycle. MLOps is an essential aspect of current data science workflows. It is a foundational component of the contemporary information technology landscape, and its influence is expected to increase significantly in the coming years. It encompasses everything from data processing and model development to deployment, monitoring, and continuous improvement, making it a crucial discipline for integrating machine learning into production environments.

However, a significant challenge in MLOps lies in the demand for scalable and flexible infrastructure capable of handling the distinct requirements of machine learning workloads. While the development cycle is often experimental, typically using interactive tools like Jupyter notebooks, production deployment requires automation and scalability. Kubernetes, a container orchestration tool, offers the infrastructure essential to support MLOps at scale, ensuring flexibility, scalability, and efficient resource management for diverse ML workflows. To understand its significance further, let's break it down using simple, real-life examples.

1. Scalability and Resource Management

Kubernetes provides exceptional support for scaling machine learning workflows, which frequently demand substantial computational resources. Especially for deep learning models, dynamic scaling is crucial for managing fluctuating workloads during the training and inference phases. Kubernetes automates resource orchestration, enabling horizontal scaling of containerized services in response to real-time demand.

In MLOps pipelines, workloads typically involve large datasets, multiple feature engineering tasks, and resource-intensive model training. Kubernetes effectively distributes these tasks across nodes within a cluster, dynamically allocating CPU, GPU, and memory resources based on each task’s needs. This approach ensures optimal performance across ML workflows, regardless of infrastructure scale. Furthermore, Kubernetes’ auto-scaling capabilities enhance cost efficiency by reducing unused resources during low-demand periods. A minimal autoscaling sketch follows the examples below.

Example

For instance, a company running a recommendation system (like Netflix suggesting films) might see higher demand at certain times of the day. Kubernetes makes sure the system can handle more requests during peak hours and scale back when it's quieter. Similarly, Airbnb uses Kubernetes to manage its machine learning workloads for personalized search and recommendations. With fluctuating user traffic, Airbnb leverages Kubernetes to automatically scale its ML services. During peak travel seasons, for instance, Kubernetes dynamically allocates more resources to handle increased user requests, optimizing costs and ensuring high availability.
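As a rough illustration of the autoscaling behavior described above, the sketch below attaches a CPU-based HorizontalPodAutoscaler to a model-serving Deployment using the official Kubernetes Python client. The deployment name, namespace, and scaling thresholds are assumptions for illustration.

Python

# Hypothetical sketch: create a CPU-based HorizontalPodAutoscaler for a
# model-serving Deployment with the official Kubernetes Python client
# (pip install kubernetes). Names and thresholds are illustrative assumptions.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() when running inside a pod

hpa = client.V1HorizontalPodAutoscaler(
    metadata=client.V1ObjectMeta(name="recommender-hpa"),
    spec=client.V1HorizontalPodAutoscalerSpec(
        scale_target_ref=client.V1CrossVersionObjectReference(
            api_version="apps/v1", kind="Deployment", name="recommender-serving"),
        min_replicas=2,    # baseline capacity during quiet periods
        max_replicas=20,   # ceiling for peak-hour traffic
        target_cpu_utilization_percentage=70,
    ),
)

client.AutoscalingV1Api().create_namespaced_horizontal_pod_autoscaler(
    namespace="ml-serving", body=hpa)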
2. Consistency Across Environments

One of the core challenges in MLOps is ensuring the reproducibility of machine learning experiments and models. Imagine you're baking a cake and want it to turn out the same whether you're baking at home or in a commercial kitchen: you follow the same recipe to ensure consistency. Kubernetes does something similar by using containers. These containers package the machine learning model and all its dependencies (software, libraries, etc.), so it works the same way whether it's being tested on a developer's laptop or running in a large cloud environment. This is crucial for ML projects because even small differences in setup can lead to unexpected results.

Example

Spotify has adopted Kubernetes to containerize its machine learning models and ensure reproducibility across different environments. By packaging models with all their dependencies into containers, Spotify minimizes discrepancies that could arise during deployment. This practice has allowed Spotify to maintain consistency in how models perform across development, testing, and production environments, reducing the "works on my machine" problem.

3. Automating the Workflow

In a typical MLOps workflow, data scientists submit code and model updates to version control systems. These updates trigger automated CI pipelines that handle the building, testing, and validation of models within containerized environments. Kubernetes streamlines this process by orchestrating the containerized tasks, ensuring that each stage of model development and testing runs in a scalable, isolated environment. Once validated, models are deployed smoothly to production using Kubernetes' native deployment and scaling features, enabling continuous, reliable, and low-latency updates to machine learning models.

Example

For example, when a new ML model version is ready (like a spam filter in Gmail), Kubernetes can roll it out automatically, ensuring it performs well and replaces the old version without interruption. Likewise, Zalando, a major European fashion retailer, employs Kubernetes in its CI/CD pipeline for ML model updates.

4. Enhanced Monitoring and Model Governance

Monitoring machine learning models in production can be quite challenging due to the constantly changing nature of data inputs and the evolving behavior of models over time. Kubernetes greatly improves the observability of ML systems by integrating with monitoring tools like Prometheus and Grafana and by offering its own native logging capabilities. These tools allow data scientists and MLOps engineers to monitor essential system metrics, such as CPU, memory, and GPU usage, as well as model-specific metrics like prediction accuracy, response time, and drift.

Example

For instance, Kubernetes' capabilities help NVIDIA define custom metrics related to their machine learning models, such as model drift or changes in accuracy over time. They set up alerts to notify data scientists and MLOps engineers when these metrics fall outside acceptable thresholds. This proactive monitoring helps maintain model performance and ensures that models function as intended.

5. Orchestration of Distributed Training and Inference

Kubernetes has been essential for orchestrating distributed training and inference of large-scale machine learning models. Training intricate models, particularly deep neural networks, often requires distributing computational tasks across multiple machines or nodes, frequently utilizing specialized hardware like GPUs or TPUs. Kubernetes supports distributed computing frameworks such as TensorFlow, PyTorch, and Horovod, enabling machine learning engineers to efficiently scale model training across clusters. A minimal worker-initialization sketch follows the example below.

Example

Uber, for example, employs Kubernetes for distributed training of its machine learning models used in various services, including ride-sharing and food delivery. Additionally, Kubernetes serves models in real time to deliver estimated times of arrival (ETAs) and pricing to users with low latency, scaling based on demand during peak hours.
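To make the distributed-training orchestration above a little more concrete, here is a minimal, hypothetical sketch of a PyTorch worker script that initializes torch.distributed from the rendezvous environment variables a Kubernetes-based launcher (for example, a Kubeflow PyTorchJob) typically injects into each worker pod. The model and data are placeholders, not part of any cited example.

Python

# Hypothetical sketch: initialize torch.distributed from the environment variables
# (MASTER_ADDR, MASTER_PORT, RANK, WORLD_SIZE) that Kubernetes-based launchers
# typically inject into each worker pod. The model and data are placeholders.
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    rank = int(os.environ["RANK"])
    world_size = int(os.environ["WORLD_SIZE"])
    dist.init_process_group(backend="gloo", rank=rank, world_size=world_size)

    model = torch.nn.Linear(16, 1)            # stand-in for a real model
    ddp_model = DDP(model)
    optimizer = torch.optim.SGD(ddp_model.parameters(), lr=0.01)

    for step in range(100):
        x = torch.randn(32, 16)
        loss = ddp_model(x).pow(2).mean()      # dummy objective
        optimizer.zero_grad()
        loss.backward()                        # gradients are all-reduced across pods
        optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()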
6. Hybrid and Multi-Cloud Flexibility

In MLOps, organizations often deploy models across diverse environments, including on-premises data centers, public clouds, and edge devices. Kubernetes’ cloud-agnostic design enables seamless orchestration in hybrid and multi-cloud setups, providing flexibility that is critical for data sovereignty and low-latency needs. By abstracting the infrastructure, Kubernetes allows ML models to be deployed and scaled across regions and providers, supporting redundancy, disaster recovery, and compliance without vendor lock-in.

Example

For instance, Alibaba uses Kubernetes to run its machine learning workloads across both on-premises data centers and public cloud environments. This hybrid setup allows Alibaba to manage data sovereignty issues while providing the flexibility to scale workloads based on demand. By utilizing Kubernetes' cloud-agnostic capabilities, Alibaba can deploy and manage its models efficiently across different environments, optimizing performance and cost.

7. Fault Tolerance

Kubernetes' fault tolerance ensures that machine learning workloads can proceed even if individual nodes or containers fail. This is crucial for distributed training, where the loss of a node could otherwise force a restart of the entire training process, wasting both time and computational resources. The Kubernetes control plane continuously monitors the health of nodes and pods; when it detects a node failure, it marks the affected pods as unhealthy and reschedules their workloads onto healthy nodes in the cluster. If GPU nodes are available, Kubernetes will automatically select one, allowing the training to continue with minimal interruption.

Example

Uber leverages Kubernetes with Horovod for distributed deep learning model training. In this setup, Kubernetes provides fault tolerance; if a node running a Horovod worker fails, Kubernetes automatically restarts the worker on a different node. By incorporating checkpointing, Uber's training jobs can recover from such failures with minimal loss (a brief checkpoint-and-resume sketch follows the conclusion below). This setup enables Uber to train large-scale models more reliably, even in the face of occasional hardware or network issues.

Conclusion

Kubernetes has become essential in MLOps, providing a robust infrastructure to manage and scale machine learning workflows effectively. Its strengths in resource orchestration, containerization, continuous deployment, and monitoring streamline the entire ML model lifecycle, from development through to production. As machine learning models grow in complexity and importance within enterprise operations, Kubernetes will continue to be instrumental in enhancing the scalability, efficiency, and reliability of MLOps practices. Beyond supporting technical implementation, Kubernetes also drives innovation and operational excellence in AI-driven systems.
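To illustrate the checkpointing idea from the fault-tolerance example above, here is a minimal, hypothetical sketch of a training loop that periodically saves its state to shared storage so that a pod rescheduled by Kubernetes can resume where it left off. The checkpoint path, interval, and model are illustrative assumptions, not details from the Uber setup.

Python

# Hypothetical sketch: periodically save training state to shared storage so a
# rescheduled pod can resume instead of restarting from scratch.
# The checkpoint path, interval, and model are illustrative assumptions.
import os
import torch

CHECKPOINT_PATH = "/mnt/shared/checkpoint.pt"  # e.g., a PersistentVolume mount

def save_checkpoint(model, optimizer, step):
    torch.save({"model": model.state_dict(),
                "optimizer": optimizer.state_dict(),
                "step": step}, CHECKPOINT_PATH)

def load_checkpoint(model, optimizer):
    if not os.path.exists(CHECKPOINT_PATH):
        return 0                                # fresh start
    state = torch.load(CHECKPOINT_PATH)
    model.load_state_dict(state["model"])
    optimizer.load_state_dict(state["optimizer"])
    return state["step"] + 1                    # resume after the last saved step

model = torch.nn.Linear(16, 1)                  # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
start_step = load_checkpoint(model, optimizer)  # picks up where a failed pod left off

for step in range(start_step, 1_000):
    loss = model(torch.randn(32, 16)).pow(2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    if step % 100 == 0:
        save_checkpoint(model, optimizer, step)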