MariaDB Vector Edition: Designed for AI

In the era of purpose-built AI databases, how is a traditional database like MariaDB reinventing itself to stay relevant? Find out in this review.

By Vamsi Kavuri · DZone Core · Oct. 10, 24 · Review

As a solutions architect with over two decades of experience in relational database systems, I recently started exploring MariaDB's new Vector Edition to see if it could address some of the AI data challenges we're facing. A quick look was pretty convincing, especially in how it brings vector search right into a regular database setup. However, I wanted to test it with a simple use case to see how it performs in practice.

In this article, I will share my hands-on experience and observations about MariaDB's vector capabilities by running a simple use case. Specifically, I will be loading sample customer reviews into MariaDB and performing fast similarity searches to find related reviews.

Environment Setup

  • Python 3.10 or higher
  • Docker Desktop 

My experiment started with setting up a Docker container using MariaDB's latest release (11.6), which includes vector capabilities.

Shell
 
# Pull the vector preview image
docker pull quay.io/mariadb-foundation/mariadb-devel:11.6-vector-preview

# Start the container: set the root password and publish port 3306 so the host can connect
docker run -d --name mariadb_vector -p 3306:3306 -e MYSQL_ROOT_PASSWORD='<replace_password>' quay.io/mariadb-foundation/mariadb-devel:11.6-vector-preview
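
The server needs a few seconds to initialize after `docker run` returns, so a script that connects immediately may fail. Below is a minimal sketch of a wait loop using only the Python standard library. It assumes the container's port 3306 is published to the host (for example with `-p 3306:3306`), and the helper name `wait_for_port` is made up for illustration. An open port is only a rough readiness signal, not proof that the server accepts logins.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError while nothing is listening yet
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Example call (assumed host and port):
#   wait_for_port("localhost", 3306)
```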


Now, create a table and load it with sample customer reviews that include sentiment scores and embeddings for each review. To generate text embeddings, I am using SentenceTransformer, which lets you use pre-trained models. To be specific, I decided to go with a model called paraphrase-MiniLM-L6-v2 that takes our customer reviews and maps them into a 384-dimensional space.

Python
 
import mysql.connector
from sentence_transformers import SentenceTransformer

# Pre-trained model that maps text to 384-dimensional embeddings
model = SentenceTransformer('paraphrase-MiniLM-L6-v2')

# I already have a database created with the name vectordb
connection = mysql.connector.connect(
    host="localhost",
    user="root",
    password="<password>",  # Replace me
    database="vectordb"
)
cursor = connection.cursor()

# Create a table to store customer reviews with sentiment score and embeddings.
# MariaDB declares the HNSW-based index with VECTOR INDEX, and the indexed column must be NOT NULL.
cursor.execute("""
    CREATE TABLE IF NOT EXISTS customer_reviews (
        id INT PRIMARY KEY AUTO_INCREMENT,
        product_id INT,
        review_text TEXT,
        sentiment_score FLOAT,
        review_embedding BLOB NOT NULL,
        VECTOR INDEX (review_embedding)
    )
    """)

# Sample reviews: (product_id, review_text, sentiment_score)
reviews = [
    (1, "This product exceeded my expectations. Highly recommended!", 0.9),
    (1, "Decent quality, but pricey.", 0.6),
    (2, "Terrible experience. The product does not work.", 0.1),
    (2, "Average product, ok ok", 0.5),
    (3, "Absolutely love it! Best purchase I have made this year.", 1.0)
]

# Load sample reviews into vector DB
for product_id, review_text, sentiment_score in reviews:
    # encode() returns a NumPy array; tobytes() stores its raw float values in the BLOB
    embedding = model.encode(review_text)
    cursor.execute(
        "INSERT INTO customer_reviews (product_id, review_text, sentiment_score, review_embedding) VALUES (%s, %s, %s, %s)",
        (product_id, review_text, sentiment_score, embedding.tobytes()))

connection.commit()
connection.close()
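
The `embedding.tobytes()` call hands MariaDB the raw bytes of the NumPy array. The sketch below reproduces that byte layout with the standard library only, assuming the embedding is an array of 32-bit floats in little-endian order (what NumPy produces on common x86 and ARM machines). The helper names `pack_vector` and `unpack_vector` are hypothetical and are not part of MariaDB or any library used in this article.

```python
import struct


def pack_vector(values):
    """Serialize floats as consecutive little-endian 32-bit floats."""
    return struct.pack(f"<{len(values)}f", *values)


def unpack_vector(blob):
    """Inverse of pack_vector: decode a byte string back into a list of floats."""
    count = len(blob) // 4  # 4 bytes per 32-bit float
    return list(struct.unpack(f"<{count}f", blob))


blob = pack_vector([0.0] * 384)
print(len(blob))  # 384 dimensions * 4 bytes each = 1536
```

A helper like `unpack_vector` can be handy for inspecting a stored embedding after reading the column back.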


Now, let's use MariaDB's vector capabilities to find similar reviews. This is like asking, "What did other customers say that is similar to this review?" In the example below, I find the top 2 reviews most similar to a customer review that says "I am super satisfied!". To do this, I use one of the vector functions (VEC_Distance_Euclidean) available in the latest release. Because the function returns a distance, smaller values mean closer matches.

Python
 
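# Reopen the connection (it was closed after loading the data)
connection = mysql.connector.connect(
    host="localhost",
    user="root",
    password="<password>",  # Replace me
    database="vectordb"
)
cursor = connection.cursor()

# Convert the target customer review into a vector
target_review_embedding = model.encode("I am super satisfied!")

# Find the top 2 similar reviews using MariaDB's VEC_Distance_Euclidean function.
# The result is a distance, so ascending order puts the most similar reviews first.
cursor.execute("""
        SELECT review_text, sentiment_score, VEC_Distance_Euclidean(review_embedding, %s) AS distance
        FROM customer_reviews
        ORDER BY distance
        LIMIT %s
    """, (target_review_embedding.tobytes(), 2))

similar_reviews = cursor.fetchall()
connection.close()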
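
To make the ranking concrete, here is a small pure-Python sketch of what a Euclidean-distance search computes: measure the distance from the target vector to every candidate, sort ascending, and keep the first k. It is an exact brute-force scan over toy 2-dimensional vectors, not MariaDB's implementation, and a vector index typically trades some exactness for speed. The function names are made up for illustration.

```python
import math


def euclidean_distance(a, b):
    """Straight-line (L2) distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k_nearest(target, candidates, k):
    """Return the k (label, distance) pairs closest to target, nearest first."""
    scored = [(label, euclidean_distance(target, vec)) for label, vec in candidates]
    return sorted(scored, key=lambda pair: pair[1])[:k]


# Toy 2-dimensional "embeddings"; the model in this article produces 384 dimensions.
candidates = [("a", [0.0, 0.0]), ("b", [3.0, 4.0]), ("c", [1.0, 1.0])]
print(top_k_nearest([0.0, 0.0], candidates, 2))  # "a" first, then "c"
```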


Observations

  • It is easy to set up, and we can combine structured data (like product IDs and sentiment scores), unstructured data (review text), and their vector representations in a single table.
  • I like its ability to use SQL syntax alongside vector operations, which makes it easy for teams that are already familiar with relational databases. MariaDB's documentation lists the vector functions supported in this release.
  • The HNSW index improved the performance of the similarity search query on the larger datasets I have tried so far.
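
As an illustration of mixing ordinary SQL filters with a vector function, the sketch below builds a parameterized query that restricts the search to one product and a minimum sentiment score before ranking by distance. It reuses the table and column names from this article. The helper `build_filtered_search` is hypothetical, and the query has not been benchmarked here; whether the vector index is still used alongside the filter is worth checking with EXPLAIN on your server version.

```python
def build_filtered_search(query_vector_bytes, product_id, min_sentiment, limit):
    """Return (sql, params) for a similarity search restricted by ordinary columns."""
    sql = (
        "SELECT review_text, sentiment_score, "
        "VEC_Distance_Euclidean(review_embedding, %s) AS distance "
        "FROM customer_reviews "
        "WHERE product_id = %s AND sentiment_score >= %s "
        "ORDER BY distance "
        "LIMIT %s"
    )
    # Parameter order must match the order of the %s placeholders above
    params = (query_vector_bytes, product_id, min_sentiment, limit)
    return sql, params


# Placeholder bytes stand in for a real 384-dimensional embedding (384 * 4 bytes)
sql, params = build_filtered_search(b"\x00" * 1536, product_id=1, min_sentiment=0.5, limit=2)
print(sql)
```

The returned pair can be passed straight to `cursor.execute(sql, params)` with the connector used earlier.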

Conclusion

Overall, I am impressed! MariaDB's Vector Edition is going to simplify certain AI-driven architectures. It bridges the gap between the traditional database world and the evolving demands of AI tools. In the coming months, I look forward to seeing how this technology matures and how the community adopts it in real-world applications.
