Snowflake Cortex for Developers: How Generative AI and SaaS Enable Self-Serve Data Analytics
This article explores how Snowflake Cortex, Snowflake’s generative AI solution, advances self-serve analytics for both structured and unstructured data.
In the era of low-code, no-code platforms, SaaS solutions, and the new trend called Agentic AI, the industry is focused on optimizing software development for greater efficiency. Text-to-SQL is one such area in software engineering where organizations aim to enable self-serve analytics and democratize SQL using AI. Snowflake Cortex AI, a generative AI offering from Snowflake, bundles this capability into a SaaS product that eliminates the complexity of building custom text-to-SQL solutions. The benefits go beyond reduced development effort. Cortex AI also delivers significantly higher accuracy in SQL query generation compared to custom-built solutions, thanks to its use of a semantic model.
Cortex AI comes in two versions:
- Cortex Analyst, which works against structured data stored in traditional tables
- Cortex Search, which works against unstructured data stored as documents (e.g., PDFs)
Background: The Rise of Self-Serve Analytics
Before we dive into Cortex, it is important to understand the challenges organizations face in the field of analytics. Data analytics is an essential part of enterprises across all sectors. Although analyzing data stored in a warehouse has been common practice for over two decades, the rise of big data and data lakes in the past ten years has accelerated the adoption of data analytics across industries.
At the heart of data analytics are queries that extract insights from databases to support business decision-making. To accomplish this, business intelligence tools like Tableau and Power BI have become widely used. Data engineers write analytical queries and build dashboards for business teams, who then use these tools to gain insights and drive growth.
While this approach works reasonably well, it presents several growing concerns as organizations seek to move faster. Due to the dynamic nature of data, the introduction of new data types, and the adoption of new business models, businesses frequently require updated queries to keep up with evolving demands. Analytical queries often involve complex SQL, and business users typically lack the knowledge of data models or the SQL skills needed to write them. As a result, they must depend on data engineers to build new queries.
This dependency becomes a bottleneck, as data engineers have their own workloads and may not be able to respond to ad hoc requests quickly. The process is inefficient and leads to delays in delivering business insights.
In the past few years, the emergence of generative AI and large language models (LLMs) has begun to change this landscape. Organizations have started building custom web applications that accept queries in plain English, translate them into SQL, run the SQL against a database, and return results. This shift aims to reduce reliance on technical teams and empower business users to access insights on their own.
As generative AI has evolved, so too has the architecture of text-to-SQL systems. Amazon's blog provides a detailed view of this architectural evolution. Although adoption of text-to-SQL continues to grow, building and maintaining these solutions still involves significant complexity. Accuracy remains a major challenge, as incorrect SQL translations lead to invalid or misleading results.
Snowflake Cortex addresses this challenge by offering a fully managed SaaS solution that eliminates implementation complexity. It delivers much higher accuracy through its semantic model and supports both structured and unstructured data, providing a significant advancement in making self-serve analytics both accessible and reliable.
Cortex Analyst
Cortex Analyst is a conversational self-serve data analytics tool powered by generative AI. Cortex Analyst exposes both an SDK and a REST API, which a front-end application running in any technology can use to query a database using a plain English prompt. Figure 1 shows the conceptual architecture of a Cortex Analyst–based solution.
Figure-1 Cortex Analyst Architecture
The front-end application could be a Snowflake-hosted Streamlit app, a custom Python or Node.js application, or another SaaS solution like Slack. An end user who wants to pull analytics information from the database, without any knowledge of the underlying data model, types their question into the front-end application as a prompt. The application sends the user prompt to the Cortex Analyst API, which performs the processing behind the scenes.
- It retrieves a previously configured semantic model and sends both the model and the prompt to a prompt-processor component. An organization typically maintains a set of semantic models in its Snowflake ecosystem, and the front-end application selects the one appropriate to the user prompt using an internal algorithm.
- The prompt processor enriches the user prompt with semantic information from the model, finds similar answerable questions, and sends the enriched information to an LLM hosted by Snowflake.
- The LLM generates the SQL, prepares an interpretation of the question, and sends both back to the API.
- In its final step, the API executes the query against the database and returns the results, along with the interpretation of the prompt, to the end user.
If Cortex Analyst cannot answer the user's question, it returns a set of possible matching questions instead. Returning suggested questions primarily prevents hallucination; it also gives the user an opportunity to select an alternative question that better matches their intent, especially when the original prompt does not yield a relevant result in Snowflake.
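To make the flow above concrete, the sketch below builds the JSON body a client might send to the Cortex Analyst REST API: a chat-style message list plus a pointer to the semantic model YAML staged in Snowflake. The account URL, stage path, and file name are placeholders, and the endpoint path and payload shape should be verified against Snowflake's current REST API reference.

```python
import json

# Placeholders -- replace with your account identifier and staged model.
ACCOUNT_URL = "https://<account_identifier>.snowflakecomputing.com"
ANALYST_ENDPOINT = "/api/v2/cortex/analyst/message"

def build_analyst_request(prompt: str, semantic_model_file: str) -> dict:
    """Build the request body for Cortex Analyst: the user prompt as a
    chat message plus the stage path of the semantic model YAML."""
    return {
        "messages": [
            {"role": "user",
             "content": [{"type": "text", "text": prompt}]}
        ],
        "semantic_model_file": semantic_model_file,
    }

body = build_analyst_request(
    "What was total revenue by region last quarter?",
    "@my_db.my_schema.my_stage/revenue_model.yaml",
)
print(json.dumps(body, indent=2))
```

A front-end application would POST this body to `ACCOUNT_URL + ANALYST_ENDPOINT` with an authorization header, then render the returned SQL, interpretation, and results.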
The Semantic Model
The core of the Cortex Analyst solution, which enables over 90% query accuracy, is the semantic model. The semantic model serves as the bridge where organization-specific terminology is incorporated, allowing business user vocabulary and raw schema to work together to generate accurate query results. It is the only component in the solution ecosystem that developers need to create and deploy within Snowflake. It is recommended to create a set of semantic models based on the organization’s data domains. This approach helps narrow the scope of queries to specific schemas or tables, thereby improving result accuracy.
Snowflake provides a tool for building semantic models, available through its GitHub repository. In the near future, support for creating semantic models directly within the Snowsight portal will be added, simplifying the development process.
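As an illustration of how business vocabulary maps onto a raw schema, here is a minimal, hypothetical semantic model sketch. All database, table, and column names are invented, and the exact field set should be checked against Snowflake's semantic model specification.

```yaml
# Minimal semantic model sketch -- all names are illustrative.
name: revenue_model
description: Revenue analytics for the sales domain
tables:
  - name: orders
    base_table:
      database: SALES_DB
      schema: PUBLIC
      table: ORDERS
    dimensions:
      - name: region
        expr: region
        data_type: varchar
        synonyms: ["territory", "market"]   # business-user vocabulary
    time_dimensions:
      - name: order_date
        expr: order_date
        data_type: date
    measures:
      - name: total_revenue
        expr: order_amount
        default_aggregation: sum
```

The `synonyms` entries are what let a prompt like "revenue by territory" resolve to the `region` column, which is where much of the accuracy gain comes from.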
A Pool of LLM Agents
Figure-1 shows a simplified architecture of Cortex Analyst. There are multiple agents involved in the process to produce an accurate query result:
- LLM-specific SQL generation agents – these typically include enterprise-grade LLMs such as Meta Llama and Mistral; Snowflake has also added support for DeepSeek-R1 and Anthropic Claude.
- Error correction agents – responsible for validating the syntactic correctness of the generated SQL using Snowflake's built-in SQL compiler.
- Synthesizer agent for the final query – refines the SQL query and delivers it as the final response, along with an interpretation of the user prompt.
Cortex Search
Cortex Search is another generative AI-powered solution from Snowflake that works with unstructured documents such as PDFs. It is not a text-to-SQL solution, but rather a fully managed RAG engine designed to deliver high-quality fuzzy search. Figure-2 shows a document-processing solution based on the Cortex Search API.
Figure-2: Cortex Search Architecture
This conceptual architecture involves two distinct flows:
- Bringing documents into Snowflake, then processing and storing them for future search
- Performing the actual search against the stored documents using the Cortex Search tool
Snowflake, as a product, offers various connectors to bring data from external systems, including those for unstructured data such as documents. Using the SharePoint connector or Snowflake Knowledge Extensions, Snowflake can ingest data from a SharePoint location or third-party systems, as shown in the diagram. The documents that are brought in are processed and stored in the database both as plain text and as vector embeddings. This dual storage approach allows for both vector-based search and text-based search, which helps improve the accuracy of search results.
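The dual-storage step can be sketched as SQL rendered from Python: keep each raw text chunk alongside a vector embedding so lexical and vector search run over the same rows. The table, column, and embedding model names are illustrative, and the `EMBED_TEXT_768` usage should be verified against Snowflake's Cortex function reference.

```python
# Sketch of dual storage: plain text plus its embedding in one table.
# Table/column names and the embedding model are illustrative.

def create_chunks_table_sql(table: str = "doc_chunks") -> str:
    """DDL for a table holding both chunk text and its vector embedding."""
    return (f"CREATE TABLE IF NOT EXISTS {table} ("
            "file_name VARCHAR, chunk VARCHAR, "
            "chunk_vec VECTOR(FLOAT, 768));")

def embed_chunks_sql(table: str = "doc_chunks") -> str:
    """Populate the vector column from the stored text using Cortex."""
    return (f"UPDATE {table} SET chunk_vec = "
            "SNOWFLAKE.CORTEX.EMBED_TEXT_768("
            "'snowflake-arctic-embed-m', chunk);")

print(create_chunks_table_sql())
print(embed_chunks_sql())
```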
The client-side search architecture is similar to the Cortex Analyst search discussed in the previous section. In this setup, whether the query originates from a Streamlit app, Slack integration, or a custom Node.js or Python-based front-end application, the user prompt is sent to the Cortex Search API. Cortex Search performs two types of search under the hood: vector search and lexical search. It then extracts matching results and re-ranks them. The search results, along with the user prompt, are passed to the LLM, which generates the final response and returns it to the front-end application.
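A minimal sketch of the client side of this search flow is below: it builds the endpoint path and request body for querying a Cortex Search service. The database, schema, and service names are placeholders, and the endpoint format is an assumption to verify against Snowflake's REST API documentation.

```python
import json

# Placeholder identifiers -- substitute your database, schema, and service.
DB, SCHEMA, SERVICE = "DOCS_DB", "PUBLIC", "support_docs_svc"

def search_endpoint(db: str, schema: str, service: str) -> str:
    """Assumed path of the Cortex Search query endpoint."""
    return (f"/api/v2/databases/{db}/schemas/{schema}"
            f"/cortex-search-services/{service}:query")

def build_search_request(query: str, columns: list, limit: int = 5) -> dict:
    """Request body: the user's query, the columns to return, a result cap."""
    return {"query": query, "columns": columns, "limit": limit}

print(search_endpoint(DB, SCHEMA, SERVICE))
print(json.dumps(build_search_request(
    "refund policy for damaged items", ["chunk", "file_name"])))
```

The service performs the vector and lexical searches and re-ranking described above; the client only supplies the query and receives the ranked matches.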
Agentic AI With Cortex
In 2023, the focus was on getting familiar with Generative AI. In 2024, it shifted to implementing RAG (Retrieval-Augmented Generation) applications. Now, 2025 is shaping up to be the year of building Agentic AI solutions. Early this year, Snowflake introduced its own Agentic AI: Cortex Agent.
Cortex Agent is designed to interpret user prompts and determine whether to query structured data using Cortex Analyst, unstructured data using Cortex Search, or both. Unlike traditional one-shot models, Cortex Agent works iteratively. It analyzes the initial results and, if needed, performs additional queries before delivering a complete and accurate response.
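A rough sketch of an agent request is below: the body registers both a text-to-SQL tool and a search tool, and the agent routes each prompt between them. The endpoint path, tool type strings, model name, and resource names here are assumptions and should be verified against Snowflake's Cortex Agents documentation.

```python
# Assumed agent endpoint -- verify against current Snowflake docs.
AGENT_ENDPOINT = "/api/v2/cortex/agent:run"

def build_agent_request(prompt: str, semantic_model_file: str,
                        search_service: str) -> dict:
    """Body registering a text-to-SQL tool (Cortex Analyst) and a search
    tool (Cortex Search); the agent decides which to invoke per prompt."""
    return {
        "model": "claude-3-5-sonnet",   # illustrative model name
        "messages": [{"role": "user",
                      "content": [{"type": "text", "text": prompt}]}],
        "tools": [
            {"tool_spec": {"type": "cortex_analyst_text_to_sql",
                           "name": "sales_analyst"}},
            {"tool_spec": {"type": "cortex_search",
                           "name": "docs_search"}},
        ],
        "tool_resources": {
            "sales_analyst": {"semantic_model_file": semantic_model_file},
            "docs_search": {"name": search_service},
        },
    }

req = build_agent_request("Why did returns spike last month?",
                          "@my_stage/revenue_model.yaml",
                          "DOCS_DB.PUBLIC.support_docs_svc")
print([t["tool_spec"]["type"] for t in req["tools"]])
```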
Getting Started With Cortex
Unlike a traditional programming language or software library, where getting started typically means installing a package and diving into a GitHub repo, a Cortex implementation involves a few additional steps.
Prerequisites
- Basic SQL and Python programming knowledge
- A Snowflake account
Implementation
As shown in the architecture diagrams above, developers need to build a front-end application. If you're familiar with Python, the easiest way to get started is by using Streamlit, which can be built directly within the Snowsight portal in Snowflake. Simply log in to your Snowflake account and follow the quick-start guide to create a data warehouse, load data, and build your Streamlit application.
Low Level AI Functions
From a developer standpoint, it is important to be aware of a few notable low-level functions Cortex AI offers for building solutions.
- COMPLETE – uses an LLM to produce the final output for the client or consumer
- CORTEX.PARSE_DOCUMENT – performs OCR-style extraction, with the added capability to retain layout (such as table headers) via the layout ON/OFF mode
- SPLIT_TEXT_RECURSIVE_CHARACTER – parses the extracted document and splits it into logical segments
- CORTEX.FINETUNE – fine-tunes out-of-the-box LLMs
- CLASSIFY_TEXT – classifies text
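The functions above are invoked from SQL; the sketch below renders example calls for two of them from Python. The model name and exact argument lists are assumptions to check against Snowflake's Cortex function reference.

```python
def complete_sql(model: str, prompt: str) -> str:
    """Render a SNOWFLAKE.CORTEX.COMPLETE call; single quotes in the
    prompt are doubled for SQL string literals."""
    escaped = prompt.replace("'", "''")
    return f"SELECT SNOWFLAKE.CORTEX.COMPLETE('{model}', '{escaped}') AS answer;"

def split_text_sql(text_col: str, chunk_size: int = 512,
                   overlap: int = 64) -> str:
    """Render a SPLIT_TEXT_RECURSIVE_CHARACTER call over a text column --
    the chunking step that precedes embedding. Argument order is assumed."""
    return (f"SELECT SNOWFLAKE.CORTEX.SPLIT_TEXT_RECURSIVE_CHARACTER"
            f"({text_col}, 'none', {chunk_size}, {overlap}) AS chunks;")

print(complete_sql("mistral-large", "Summarize last quarter's revenue"))
print(split_text_sql("doc_text"))
```

Running these statements through a Snowpark session or worksheet returns the LLM completion and the list of text chunks, respectively.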
Cortex Security
A discussion about AI can’t be complete without talking about security. Snowflake incorporates appropriate guardrails to protect customer data in its Cortex AI implementation.
- LLMs are deployed within the Snowflake boundary, so data never leaves Snowflake.
- The Cortex AI services run within the Snowflake boundary and are executable only under the appropriate RBAC (role-based access control) policies.
- Structured data and unstructured documents are all brought into Snowflake, and Cortex Analyst and Cortex Search run within Snowflake's datastore.
- Queries and prompts from an external service, such as Slack or a custom user interface, are sent through the Cortex REST API over HTTPS and processed within Snowflake.
Conclusion
In this article, we introduced Snowflake Cortex, a SaaS-based generative AI solution. This tool removes the need for custom text-to-SQL implementations and significantly improves query accuracy. It also offers advanced capabilities for ingesting documents via connectors and performing searches on unstructured data. Cortex reduces both implementation and infrastructure management overhead while enabling non-technical users to access self-serve analytics. The platform is user-friendly, feature-rich, and continues to improve with advancements in LLMs and AI. It is gaining traction and is expected to see increased adoption across the industry.