Automating Sentiment Analysis Using Snowflake Cortex
Learn how to use Snowflake Cortex to classify customer sentiment and categorize feedback by theme—then automate it weekly using nothing but SQL. No ML training required.
Join the DZone community and get the full member experience.
Join For FreeIn this hands-on tutorial, you'll learn how to automate sentiment analysis and categorize customer feedback using Snowflake Cortex, all through a simple SQL query without needing to build heavy and complex machine learning algorithms. No MLOps is required.
We'll work with sample data simulating real customer feedback comments about a fictional company, "DemoMart," and classify each customer feedback entry using Cortex's built-in function. We'll determine sentiment (positive, negative, neutral) and label the feedback into different categories.
The goal is to:
- Load a sample dataset of customer feedback into a Snowflake table.
- Use the built-in LLM-powered classification (
CLASSIFY_TEXT
) to tag each entry with a sentiment and classify the feedback into a specific category. - Automate this entire workflow to run weekly using Snowflake Task.
- Generate insights from the classified data.
Prerequisites
- A Snowflake account with access to Snowflake Cortex
- Role privileges to create tables, tasks, and procedures
- Basic SQL knowledge
Step 1: Create Sample Feedback Table
We'll use a sample dataset of customer feedback that covers products, delivery, customer support, and other areas. Let's create a table in Snowflake to store this data. Here is the SQL for creating the required table to hold customer feedback.
CREATE OR REPLACE TABLE customer.csat.feedback (
feedback_id INT,
feedback_ts DATE,
feedback_text STRING
);
Now, you can load the data into the table using Snowflake's Snowsight interface. The sample data "customer_feedback_demomart.csv" is available in the GitHub repo. You can download and use it.
Step 2: Use Cortex to Classify Sentiment and Category
Let's read and process each row from the feedback table.
Here's the magic. This single query classifies each piece of feedback for both sentiment and category:
SELECT
feedback_id,
feedback_ts,
feedback_text,
SNOWFLAKE.CORTEX.CLASSIFY_TEXT(feedback_text, ['positive', 'negative', 'neutral']):label::STRING AS sentiment,
SNOWFLAKE.CORTEX.CLASSIFY_TEXT(
feedback_text,
['Product', 'Customer Service', 'Delivery', 'Price', 'User Experience', 'Feature Request']
):label::STRING AS feedback_category
FROM
customer.csat.feedback
LIMIT 10;
I have used the CLASSIFY_TEXT
Function available within Snowflake's cortex to derive the sentiment based on the feedback_text
and further classify it into a specific category the feedback is associated with, such as 'Product', 'Customer Service', 'Delivery', and so on.
P.S.: You can change the categories based on your business needs.
Step 3: Store Classified Results
Let's store the classified results in a separate table for further reporting and analysis purposes. For this, I have created a table with the name feedback_classified
as shown below.
CREATE OR REPLACE TABLE customer.csat.feedback_classified (
feedback_id INT,
feedback_ts DATE,
feedback_text STRING,
sentiment STRING,
feedback_category STRING
);
Initial Bulk Load
Now, let's do an initial bulk classification for all existing data before moving on to the incremental processing of newly arriving data.
-- Initial Load
INSERT INTO customer.csat.feedback_classified
SELECT
feedback_id,
feedback_ts,
feedback_text,
SNOWFLAKE.CORTEX.CLASSIFY_TEXT(feedback_text, ['positive', 'negative', 'neutral']):label::STRING,
SNOWFLAKE.CORTEX.CLASSIFY_TEXT(
feedback_text,
['Product', 'Customer Service', 'Delivery', 'Price', 'User Experience', 'Feature Request']
):label::STRING AS feedback_label,
CURRENT_TIMESTAMP AS PROCESSED_TIMESTAMP
FROM customer.csat.feedback;
Once the initial load is completed successfully, let's build an SQL that fetches only incremental data based on the processed_ts
column value.
For the incremental load, we need fresh data with customer feedback. For that, let's insert ten new records into our raw table customer.csat.feedback
INSERT INTO customer.csat.feedback (feedback_id, feedback_ts, feedback_text)
VALUES
(5001, CURRENT_DATE, 'My DemoMart order was delivered to the wrong address again. Very disappointing.'),
(5002, CURRENT_DATE, 'I love the new packaging DemoMart is using. So eco-friendly!'),
(5003, CURRENT_DATE, 'The delivery speed was slower than promised. Hope this improves.'),
(5004, CURRENT_DATE, 'The product quality is excellent, I’m genuinely impressed with DemoMart.'),
(5005, CURRENT_DATE, 'Customer service helped me cancel and reorder with no issues.'),
(5006, CURRENT_DATE, 'DemoMart’s website was down when I tried to place my order.'),
(5007, CURRENT_DATE, 'Thanks DemoMart for the fast shipping and great support!'),
(5008, CURRENT_DATE, 'Received a damaged item. This is the second time with DemoMart.'),
(5009, CURRENT_DATE, 'DemoMart app is very user-friendly. Shopping is a breeze.'),
(5010, CURRENT_DATE, 'The feature I wanted is missing. Hope DemoMart adds it soon.');
Step 4: Automate Incremental Data Processing With TASK
Now that we have newly added (incremental) fresh data into our raw table, let's create a task to pick up only new data and classify it automatically. We will schedule this task to run every Sunday at midnight UTC.
--Creating task
CREATE OR REPLACE TASK CUSTOMER.CSAT.FEEDBACK_CLASSIFIED
WAREHOUSE = COMPUTE_WH
SCHEDULE = 'USING CRON 0 0 * * 0 UTC' -- Run evey Sunday at midnight UTC
AS
INSERT INTO customer.csat.feedback_classified
SELECT
feedback_id,
feedback_ts,
feedback_text,
SNOWFLAKE.CORTEX.CLASSIFY_TEXT(feedback_text, ['positive', 'negative', 'neutral']):label::STRING,
SNOWFLAKE.CORTEX.CLASSIFY_TEXT(
feedback_text,
['Product', 'Customer Service', 'Delivery', 'Price', 'User Experience', 'Feature Request']
):label::STRING AS feedback_label,
CURRENT_TIMESTAMP AS PROCESSED_TIMESTAMP
FROM customer.csat.feedback
WHERE
feedback_ts > (SELECT COALESCE(MAX(PROCESSED_TIMESTAMP),'1900-01-01') FROM CUSTOMER.CSAT.FEEDBACK_CLASSIFIED );
This will automatically run every Sunday at midnight UTC, process any newly arrived customer feedback, and classify it.
Step 5: Visualize Insights
You can now build dashboards in Snowsight to see weekly trends using a simple query like this:
SELECT feedback_category, sentiment, COUNT(*) AS total
FROM customer.csat.feedback_classified
GROUP BY feedback_category, sentiment
ORDER BY total DESC;
Conclusion
With just a few lines of SQL, you:
- Ingested raw feedback into a Snowflake table.
- Used Snowflake Cortex to classify customer feedback and derive sentiment and feedback categories
- Automated the process to run weekly
- Built insights into the classified feedback for business users/leadership team to act upon by category and sentiment
This approach is ideal for support teams, product teams, and leadership, as it allows them to continuously monitor customer experience without building or maintaining ML infrastructure.
GitHub
I have created a GitHub page with all the code and sample data. You can access it freely.
The whole dataset generator and SQL scripts are available on GitHub.
Opinions expressed by DZone contributors are their own.
Comments