How to Build a ChatGPT Super App

SingleStore's database integrates LLMs to enhance contextual insights, supporting advanced customer engagement in applications like chatbots.

By Akmal Chaudhri · Mar. 10, 25 · Tutorial

SingleStore is a powerful multi-model database system and platform designed to support a wide variety of business use cases. Its distinctive features allow businesses to unify multiple database systems into a single platform, reducing the Total Cost of Ownership (TCO) and simplifying developer workflows by eliminating the need for complex integration tools.

In this article, we'll explore how SingleStore can transform email campaigns for a web analytics company, enabling the creation of personalized and highly targeted email content.

The notebook file used in the article is available on GitHub.

Introduction

A web analytics company relies on email campaigns to engage with customers. However, a generic approach to targeting customers often misses opportunities to maximize business potential. A more effective solution would involve using a large language model (LLM) to craft personalized email messages.

Consider a scenario where user behavior data are stored in a NoSQL database like MongoDB, while valuable documentation resides in a vector database, such as Pinecone. Managing these multiple systems can become complex and resource-intensive, highlighting the need for a unified solution.

SingleStore, a versatile multi-model database, supports various data formats, including JSON, and offers built-in vector functions. It seamlessly integrates with LLMs, making it a powerful alternative to managing multiple database systems. In this article, we'll demonstrate how easily SingleStore can replace both MongoDB and Pinecone, simplifying operations without compromising functionality.

In our example application, we'll use an LLM to generate unique emails for our customers. To teach the LLM how to target them, we'll use a number of well-known analytics companies as reference material.

We'll further customize the content based on user behavior. Customer data are stored in MongoDB. Different stages of user behavior are stored in Pinecone. The user behavior will allow the LLM to generate personalized emails. Finally, we'll consolidate the data stored in MongoDB and Pinecone by using SingleStore.

Create a SingleStore Cloud Account

A previous article showed the steps to create a free SingleStore Cloud account. We'll use the Standard Tier and take the default names for the Workspace Group and Workspace. We'll also enable SingleStore Kai.

We'll store our OpenAI API Key and Pinecone API Key in the secrets vault using OPENAI_API_KEY and PINECONE_API_KEY, respectively.
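How vault secrets surface inside the notebook depends on the environment; one common pattern is reading them as environment variables. A minimal sketch (the environment-variable mechanism is an assumption; the names openai_api_key and pc_api_key match those used by the code later in the article):

```python
import os

# Assumption: the vault exposes the secrets as environment variables under
# the names above; adjust to however your workspace surfaces them.
openai_api_key = os.environ.get("OPENAI_API_KEY", "")
pc_api_key = os.environ.get("PINECONE_API_KEY", "")
```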

Import the Notebook

We'll download the notebook from GitHub.

From the left navigation pane in the SingleStore cloud portal, we'll select "DEVELOP" > "Data Studio."

In the top right of the web page, we'll select "New Notebook" > "Import From File." We'll use the wizard to locate and import the notebook we downloaded from GitHub.

Run the Notebook

1. Generic Email Template

We'll start by generating generic email templates and then use an LLM to transform them into personalized messages for each customer. This way, we can address each recipient by name and introduce them to the benefits of our web analytics platform.

We can generate a generic email as follows:

Python
 
people = ["Alice", "Bob", "Charlie", "David", "Emma"]

for person in people:
    message = (
        f"Hey {person},\n"
        "Check out our web analytics platform, it's Awesome!\n"
        "It's perfect for your needs. Buy it now!\n"
        "- Marketer John"
    )
    print(message)
    print("_" * 100)


For example, Alice would see the following message:

Plain Text
 
Hey Alice,
Check out our web analytics platform, it's Awesome!
It's perfect for your needs. Buy it now!
- Marketer John


Each of the other users would receive the same message, personalized with their own name.

2. Adding a Large Language Model (LLM)

We can easily bring an LLM into our application by providing it with a role and giving it some information, as follows:

Python
 
system_message = """
You are a helpful assistant.
My name is Marketer John.
You help write the body of an email for a fictitious company called 'Awesome Web Analytics'.
This is a web analytics company that is similar to the top 5 web analytics companies (perform a web search to determine the current top 5 web analytics companies).
The goal is to write a custom email to users to get them interested in our services.
The email should be less than 150 words.
Address the user by name.
End with my signature.
"""


We'll create a function to call the LLM:

Python
 
def chatgpt_generate_email(prompt, person):
    conversation = [
        {"role": "system", "content": prompt},
        {"role": "user", "content": person}
    ]

    response = openai_client.chat.completions.create(
        model = "gpt-4o-mini",
        messages = conversation,
        temperature = 1.0,
        max_tokens = 800,
        top_p = 1,
        frequency_penalty = 0,
        presence_penalty = 0
    )

    assistant_reply = response.choices[0].message.content
    return assistant_reply


Looping through the list of users and calling the LLM produces unique emails:

Python
 
from openai import OpenAI

openai_client = OpenAI()

# Define a list to store the responses
emails = []

# Loop through each person and generate the conversation
for person in people:
    email = chatgpt_generate_email(system_message, person)
    emails.append(
        {
            "person": person,
            "assistant_reply": email
        }
    )


For example, this is what Alice might see:

Plain Text
 
Person: Alice
Subject: Unlock Your Website's Potential with Awesome Web Analytics!

Hi Alice,

Are you ready to take your website to new heights? At Awesome Web Analytics, we provide cutting-edge insights that empower you to make informed decisions and drive growth. 

With our powerful analytics tools, you can understand user behavior, optimize performance, and boost conversions—all in real-time! Unlike other analytics platforms, we offer personalized support to guide you every step of the way.

Join countless satisfied customers who have transformed their online presence. Discover how we stack up against competitors like Google Analytics, Adobe Analytics, and Matomo, but with a focus on simplicity and usability.

Let us help you turn data into your greatest asset!

Best,  
Marketer John  
Awesome Web Analytics


Equally unique emails will be generated for the other users.

3. Customizing Email Content With User Behavior

By categorizing users based on their behavior stages, we can further customize email content to align with their specific needs. An LLM will assist in crafting emails that encourage users to progress through different stages, ultimately improving their understanding and usage of various services.

At present, user data are held in a MongoDB database with a record structure similar to the following:

JSON
 
{
    '_id': ObjectId('64afb3fda9295d8421e7a19f'),
    'first_name': 'James',
    'last_name': 'Villanueva',
    'company_name': 'Foley-Turner',
    'stage': 'generating a tracking code',
    'created_date': 1987-11-09T12:43:26.000+00:00
}
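
For a self-contained look at this record shape, we can build matching documents in plain Python before inserting them. Here, make_customer is a hypothetical helper (not part of the original notebook), and the second customer's surname and company are invented for illustration:

```python
from datetime import datetime, timezone

def make_customer(first_name, last_name, company_name, stage):
    # Hypothetical helper: builds a document matching the record shape above
    return {
        "first_name": first_name,
        "last_name": last_name,
        "company_name": company_name,
        "stage": stage,
        "created_date": datetime(1987, 11, 9, 12, 43, 26, tzinfo=timezone.utc),
    }

customers = [
    make_customer("James", "Villanueva", "Foley-Turner", "generating a tracking code"),
    make_customer("Melissa", "Smith", "Acme Corp", "getting started"),
]
# collection.insert_many(customers)  # once `collection` is connected
```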


We'll connect to MongoDB to get the data as follows:

Python
 
from pymongo import MongoClient

try:
    mongo_client = MongoClient("mongodb+srv://admin:<password>@<host>/?retryWrites=true&w=majority")
    mongo_db = mongo_client["mktg_email_demo"]
    collection = mongo_db["customers"]
    print("Connected successfully")
except Exception as e:
    print(e)


We'll replace <password> and <host> with the values from MongoDB Atlas.

We have a number of user behavior stages:

Python
 
stages = [
    "getting started",
    "generating a tracking code",
    "adding tracking to your website",
    "real-time analytics",
    "conversion tracking",
    "funnels",
    "user segmentation",
    "custom event tracking",
    "data export",
    "dashboard customization"
]

def find_next_stage(current_stage):
    current_index = stages.index(current_stage)
    if current_index < len(stages) - 1:
        return stages[current_index + 1]
    else:
        return stages[current_index]

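As a quick sanity check of the stage progression (the list and function are repeated here so the snippet runs standalone), note that the final stage simply maps to itself:

```python
stages = [
    "getting started",
    "generating a tracking code",
    "adding tracking to your website",
    "real-time analytics",
    "conversion tracking",
    "funnels",
    "user segmentation",
    "custom event tracking",
    "data export",
    "dashboard customization",
]

def find_next_stage(current_stage):
    # repeated from above so this check runs standalone
    current_index = stages.index(current_stage)
    if current_index < len(stages) - 1:
        return stages[current_index + 1]
    return stages[current_index]

print(find_next_stage("funnels"))                  # user segmentation
print(find_next_stage("dashboard customization"))  # dashboard customization
```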

Using the data about behavior stages, we'll ask the LLM to further customize the email as follows:

Python
 
limit = 5
emails = []

for record in collection.find(limit = limit):
    fname, stage = record.get("first_name"), record.get("stage")
    next_stage = find_next_stage(stage)

    system_message = f"""
    You are a helpful assistant, who works for me, Marketer John at Awesome Web Analytics.
    You help write the body of an email for a fictitious company called 'Awesome Web Analytics'.
    We are a web analytics company similar to the top 5 web analytics companies.
    We have users at various stages in our product's pipeline, and we want to send them helpful emails to encourage further usage of our product.
    Please write an email for {fname} who is on stage {stage} of the onboarding process.
    The next stage is {next_stage}.
    Ensure the email describes the benefits of moving to the next stage.
    Limit the email to 1 paragraph.
    End the email with my signature.
    """

    email = chatgpt_generate_email(system_message, fname)
    emails.append(
        {
            "fname": fname,
            "stage": stage,
            "next_stage": next_stage,
            "email": email
        }
    )


For example, here is an email generated for Michael:

Plain Text
 
First Name: Michael

Stage: funnels

Next Stage: user segmentation

Subject: Unlock Deeper Insights with User Segmentation!

Hi Michael,

Congratulations on successfully navigating the funnel stage of our onboarding process! As you move forward to user segmentation, you'll discover how this powerful tool will enable you to categorize your users based on their behaviors and demographics. By understanding your audience segments better, you can create tailored experiences that increase engagement and optimize conversions. This targeted approach not only enhances your marketing strategies but also drives meaningful results and growth for your business. We're excited to see how segmentation will elevate your analytics efforts!

Best,  
Marketer John  
Awesome Web Analytics


4. Further Customizing Email Content

To support user progress, we'll use vector embeddings stored in Pinecone, allowing us to direct users to relevant documentation for each stage. These embeddings make it effortless to guide users toward essential resources and further enhance their interactions with our product.

Python
 
from pinecone import Pinecone, ServerlessSpec

# text-embedding-3-small returns 1536-dimensional vectors
dimensions = 1536

pc = Pinecone(
    api_key = pc_api_key
)

index_name = "mktg-email-demo"

if any(index["name"] == index_name for index in pc.list_indexes()):
    pc.delete_index(index_name)

pc.create_index(
    name = index_name,
    dimension = dimensions,
    metric = "euclidean",
    spec = ServerlessSpec(
        cloud = "aws",
        region = "us-east-1"
    )
)

pc_index = pc.Index(index_name)

pc.list_indexes()


We'll create the embeddings as follows:

Python
 
def get_embeddings(text):
    text = text.replace("\n", " ")
    try:
        response = openai_client.embeddings.create(
            input = text,
            model = "text-embedding-3-small"
        )
        return response.data[0].embedding, response.usage.total_tokens, "success"
    except Exception as e:
        print(e)
        return "", 0, "failed"

id_counter = 1
ids_list = []
stages_w_embed = []  # keep each stage's embedding for reuse later

for stage in stages:
    embedding, tokens, status = get_embeddings(stage)
    stages_w_embed.append({"stage": stage, "embedding": embedding})

    parent = id_counter - 1

    pc_index.upsert([
        {
            "id": str(id_counter),
            "values": embedding,
            "metadata": {"content": stage, "parent": str(parent)}
        }
    ])

    ids_list.append(str(id_counter))

    id_counter += 1


We'll search Pinecone for matches as follows:

Python
 
def search_pinecone(embedding):
    match = pc_index.query(
        vector = [embedding],
        top_k = 1,
        include_metadata = True
    )["matches"][0]["metadata"]
    return match["content"], match["parent"]

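Conceptually, top_k = 1 with the euclidean metric returns the stored vector closest to the query embedding. A pure-Python equivalent over a toy two-dimensional index illustrates the idea (illustrative only, not the Pinecone API):

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def search_local(embedding, records):
    # records mimic the {"values": ..., "metadata": ...} shape upserted above
    best = min(records, key=lambda r: euclidean(embedding, r["values"]))
    return best["metadata"]["content"], best["metadata"]["parent"]

# toy 2-D "index" standing in for the real 1536-dimensional one
records = [
    {"values": [0.0, 1.0], "metadata": {"content": "getting started", "parent": "0"}},
    {"values": [1.0, 0.0], "metadata": {"content": "funnels", "parent": "5"}},
]

print(search_local([0.1, 0.9], records))  # ('getting started', '0')
```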

Using the data, we can ask the LLM to further customize the email, as follows:

Python
 
limit = 5
emails = []

for record in collection.find(limit = limit):
    fname, stage = record.get("first_name"), record.get("stage")

    # Get the current and next stages with their embedding
    this_stage = next((item for item in stages_w_embed if item["stage"] == stage), None)
    next_stage = next((item for item in stages_w_embed if item["stage"] == find_next_stage(stage)), None)

    if not this_stage or not next_stage:
        continue

    # Get content
    cur_content, cur_permalink = search_pinecone(this_stage["embedding"])
    next_content, next_permalink = search_pinecone(next_stage["embedding"])

    system_message = f"""
    You are a helpful assistant.
    I am Marketer John at Awesome Web Analytics.
    We are similar to the current top web analytics companies.
    We have users at various stages of using our product, and we want to send them helpful emails to encourage them to use our product more.
    Write an email for {fname}, who is on stage {stage} of the onboarding process.
    The next stage is {next_stage['stage']}.
    Ensure the email describes the benefits of moving to the next stage, and include this link: https://github.com/VeryFatBoy/mktg-email-flow/tree/main/docs/{next_content.replace(' ', '-')}.md.
    Limit the email to 1 paragraph.
    End the email with my signature: 'Best Regards, Marketer John.'
    """

    email = chatgpt_generate_email(system_message, fname)
    emails.append(
        {
            "fname": fname, 
            "stage": stage, 
            "next_stage": next_stage["stage"], 
            "email": email
        }
    )


For example, here is an email generated for Melissa:

Plain Text
 
First Name: Melissa

Stage: getting started

Next Stage: generating a tracking code

Subject: Take the Next Step with Awesome Web Analytics!

Hi Melissa,

We're thrilled to see you getting started on your journey with Awesome Web Analytics! The next step is generating your tracking code, which will allow you to start collecting valuable data about your website visitors. With this data, you can gain insights into user behavior, optimize your marketing strategies, and ultimately drive more conversions. To guide you through this process, check out our detailed instructions here: [Generating a Tracking Code](https://github.com/VeryFatBoy/mktg-email-flow/tree/main/docs/generating-a-tracking-code.md). We're here to support you every step of the way!

Best Regards,  
Marketer John.


We can see how the generic template has been refined into highly targeted emails.

Using SingleStore

Instead of managing separate database systems, we'll streamline our operations by using SingleStore. With its support for JSON, text, and vector embeddings, we can efficiently store all necessary data in one place, reducing TCO and simplifying our development processes.

We'll ingest the data from MongoDB using a pipeline similar to the following:

SQL
 
USE mktg_email_demo;

CREATE LINK mktg_email_demo.link AS MONGODB
CONFIG '{"mongodb.hosts": "<primary>:27017, <secondary>:27017, <secondary>:27017",
        "collection.include.list": "mktg_email_demo.*",
        "mongodb.ssl.enabled": "true",
        "mongodb.authsource": "admin",
        "mongodb.members.auto.discover": "false"}'
CREDENTIALS '{"mongodb.user": "admin",
            "mongodb.password": "<password>"}';

CREATE TABLES AS INFER PIPELINE AS LOAD DATA LINK mktg_email_demo.link '*' FORMAT AVRO;

START ALL PIPELINES;


We'll replace <primary>, <secondary>, <secondary> and <password> with the values from MongoDB Atlas.

The customers table will be created by the pipeline. The vector embeddings for the behavior stages can be created as follows:

Python
 
import pandas as pd

df_list = []
id_counter = 1

for stage in stages:
    embedding, tokens, status = get_embeddings(stage)

    parent = id_counter - 1

    stage_df = pd.DataFrame(
        {
            "id": [id_counter],
            "content": [stage],
            "embedding": [embedding],
            "parent": [parent]
        }
    )

    df_list.append(stage_df)
    
    id_counter += 1

df = pd.concat(df_list, ignore_index = True)


We'll need a table to store the data. The :dimensions placeholder is filled in with the embedding size defined earlier:

SQL
 
USE mktg_email_demo;

DROP TABLE IF EXISTS docs_splits;

CREATE TABLE IF NOT EXISTS docs_splits (
    id INT,
    content TEXT,
    embedding VECTOR(:dimensions),
    parent INT
);


Then, using a SQLAlchemy connection to the workspace (db_connection), we can save the data in the table:

Python
 
df.to_sql(
    "docs_splits",
    con = db_connection,
    if_exists = "append",
    index = False,
    chunksize = 1000
)


We'll search SingleStore for matches as follows:

Python
 
from sqlalchemy import text

def search_s2(vector):
    query = """
        SELECT content, parent
        FROM docs_splits
        ORDER BY (embedding <-> :vector) ASC
        LIMIT 1
    """
    with db_connection.connect() as con:
        result = con.execute(text(query), {"vector": str(vector)})
        return result.fetchone()


Using the data, we can ask the LLM to customize the email as follows:

Python
 
limit = 5
emails = []

# Create a connection
with db_connection.connect() as con:
    query = "SELECT _more :> JSON FROM customers LIMIT :limit"
    result = con.execute(text(query), {"limit": limit})

    for customer in result:
        customer_data = customer[0]
        fname, stage = customer_data["first_name"], customer_data["stage"]

        # Retrieve current and next stage embeddings
        this_stage = next((item for item in stages_w_embed if item["stage"] == stage), None)
        next_stage = next((item for item in stages_w_embed if item["stage"] == find_next_stage(stage)), None)

        if not this_stage or not next_stage:
            continue

        # Get content
        cur_content, cur_permalink = search_s2(this_stage["embedding"])
        next_content, next_permalink = search_s2(next_stage["embedding"])

        # Create the system message
        system_message = f"""
        You are a helpful assistant.
        I am Marketer John at Awesome Web Analytics.
        We are similar to the current top web analytics companies.
        We have users that are at various stages in using our product, and we want to send them helpful emails to get them to use our product more.
        Write an email for {fname} who is on stage {stage} of the onboarding process.
        The next stage is {next_stage['stage']}.
        Ensure the email describes the benefits of moving to the next stage, then always share this link: https://github.com/VeryFatBoy/mktg-email-flow/tree/main/docs/{next_content.replace(' ', '-')}.md.
        Limit the email to 1 paragraph.
        End the email with my signature: 'Best Regards, Marketer John.'
        """

        email = chatgpt_generate_email(system_message, fname)
        emails.append(
            {
                "fname": fname,
                "stage": stage,
                "next_stage": next_stage["stage"],
                "email": email,
            }
        )


For example, here is an email generated for Joseph:

Plain Text
 
First Name: Joseph

Stage: generating a tracking code

Next Stage: adding tracking to your website

Subject: Take the Next Step in Your Analytics Journey!

Hi Joseph,

Congratulations on generating your tracking code! The next step is to add tracking to your website, which is crucial for unlocking the full power of our analytics tools. By integrating the tracking code, you will start collecting valuable data about your visitors, enabling you to understand user behavior, optimize your website, and drive better results for your business. Ready to get started? Check out our detailed guide here: [Adding Tracking to Your Website](https://github.com/VeryFatBoy/mktg-email-flow/tree/main/docs/adding-tracking-to-your-website.md).

Best Regards,  
Marketer John.


Summary

Through this practical demonstration, we've seen how SingleStore improves our email campaigns with its multi-model capabilities and AI-driven personalization. Using SingleStore as our single source of truth, we've simplified our workflows and ensured that our email campaigns deliver maximum impact and value to our customers.

Acknowledgements

I thank Wes Kennedy for the original demo code, which was adapted for this article.


Published at DZone with permission of Akmal Chaudhri. See the original article here.

Opinions expressed by DZone contributors are their own.
