Build a Philosophy Quote Generator With Vector Search and Astra DB

In part 1 of this 'Infinite Wisdom Series,' learn the how and the why of vector search and vector embeddings by building a vector store from scratch.

By Stefano Lottini · Oct. 27, 23 · Analysis

The field of generative AI (GenAI), which has ignited a computing revolution this year, encompasses several technologies, key ideas, and paradigms. Although there will surely be more astonishing developments to come, vector search has become one of the most crucial tools in GenAI.

In this three-part series, we will demonstrate the power of vector search by building a vector store from scratch and using it to accomplish two standard tasks: semantic search and text generation based on provided examples.

This is what we'll build: first, a semantic search engine for finding quotes by famous philosophers. Then, this search facility will be the basis to create, in true GenAI style, a tool that can invent new, plausible snippets of philosophical wisdom in the style of your favorite philosopher! Along the way, we’ll illustrate how the various parts work together so that by the end, you'll have acquired not just a taste for Arthur Schopenhauer but also working knowledge to get started with your own GenAI application.

We will work with Python and use DataStax Astra DB for the vector-search back-end. Though most concepts are generally applicable, some aspects of the particular implementation take advantage of the specific architecture of this database.

The reference application this series is about has been developed in two flavors:

  • One version uses the database drivers directly to interface with the database
  • The other leverages the "CassIO" library, which abstracts away the database-specific aspects, offering a more Pythonic interface that "just works."

We will point out the differences between these two approaches and provide comprehensive references at the end to find out more. Both implementations are available as notebooks hosted in the OpenAI Cookbook repo: all you need is to create a free-tier Astra DB account and get an OpenAI API key. If you're eager to do some hands-on work, you can open the driver-based or the CassIO-based version right now as Google Colab interactive notebooks.

This three-part series is structured as follows:

  • In this post, we’ll summarize where the need for vector search arises and give a brief account of how it works.
  • In Part 2, we'll build a search engine based on vector embeddings to find quotes by famous philosophers.
  • In Part 3, the search is used at the heart of a full GenAI application: a generator of new philosophical quotes.

Let's start!

Why Vector Search?

In a typical GenAI application, large language models (LLMs) are employed to produce text: for instance, answers to customers' questions, summaries, or suggestions based on context and past interactions. But in most cases, one cannot just use the LLM out of the box, as it may lack the required domain-specific knowledge. To solve this problem while avoiding an often expensive (or outright unavailable) model fine-tuning step, the approach known as RAG (retrieval-augmented generation) has emerged.

In the RAG paradigm, a search is first performed to obtain pieces of text relevant to the specific task (for example, documentation snippets pertinent to the customer question being asked). Then, in a second step, these pieces of text are put into a suitably designed prompt and passed to the LLM, which is instructed to craft an answer using the supplied information. The RAG approach has proven to be one of the main workhorses for expanding the capabilities of LLMs. While the range of methods to augment the powers of LLMs is evolving rapidly (even fine-tuning approaches are experiencing a sort of comeback right now), RAG is considered one of the key ingredients.
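To make the two steps concrete, here is a minimal sketch of the retrieve-then-prompt flow. Everything in it is an assumption made for illustration: the `retrieve` and `build_prompt` helpers are hypothetical names, the retrieval step is a naive word-overlap ranking standing in for a real search, and the final call to an LLM is left out entirely.

```python
def retrieve(question, snippets, top_k=2):
    """Toy retrieval step: rank snippets by how many words they share with the question."""
    question_words = set(question.lower().split())
    scored = [
        (len(question_words & set(snippet.lower().split())), snippet)
        for snippet in snippets
    ]
    # Highest overlap first; the sort is stable, so ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [snippet for _, snippet in scored[:top_k]]


def build_prompt(question, context_snippets):
    """Second step: place the retrieved text into a prompt for the LLM."""
    context = "\n".join(f"- {snippet}" for snippet in context_snippets)
    return (
        "Answer the question using only the information below.\n"
        f"Information:\n{context}\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Made-up documentation snippets, purely for illustration.
docs = [
    "Refunds are issued within 14 days of the return request.",
    "Our support line is open on weekdays from 9 to 17.",
    "Shipping is free for orders above 50 euros.",
]
question = "When are refunds issued?"
prompt = build_prompt(question, retrieve(question, docs, top_k=1))
print(prompt)  # this string would then be sent to the LLM of your choice
```

The quality of the final answer depends heavily on the first step: if the retrieved snippets are off-topic, the LLM has nothing useful to work with.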

So, the first problem at hand is that of retrieving "relevant" information. This is not a new problem: it has traditionally been solved with keyword-based search (possibly with preprocessing steps such as lemmatization/stemming or other variants). Recently, however, vector search has emerged as a superior approach. Let's see what it does and why it works so well.

There are two key concepts that play together, namely:

  • For a given piece of text, an "embedding vector" can be computed. This is a fixed-length sequence of numbers that encodes the meaning of the text, rather than its exact wording, to a striking degree of accuracy.
  • In the space where these vectors live, the typical mathematical definitions for the "distance between vectors" happen to measure fairly well the degree of (semantic) similarity between the corresponding sentences.

In practice, this means that one can map a set of phrases to the corresponding vectors (i.e., points in a certain space) and then look for phrases with similar contents by actually looking for points close to each other in this space. Vector embeddings allow mapping a task in the domain of natural language processing (NLP) to a simpler, better-understood mathematical task in…geometry!
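As a small illustration of "distance between vectors," the sketch below computes two common measures, cosine similarity and Euclidean distance. Note the assumption: the three-dimensional vectors are hand-written for the sake of the example and are not the output of any embedding model (real embeddings typically have hundreds or thousands of dimensions).

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between two points: 0.0 means identical."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hand-written 3-dimensional vectors, purely illustrative (not real embeddings).
vectors = {
    "Life is suffering.": [0.9, 0.1, 0.2],
    "To live is to suffer.": [0.8, 0.2, 0.3],
    "The train leaves at noon.": [0.1, 0.9, 0.1],
}

query = vectors["Life is suffering."]
for text, vec in vectors.items():
    # Columns: cosine similarity, Euclidean distance, sentence.
    print(f"{cosine_similarity(query, vec):.3f}  {euclidean_distance(query, vec):.3f}  {text}")
```

With these made-up numbers, the two sentences about suffering come out close to each other under both measures, while the sentence about the train lands far away.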

So, suppose you are building a service to find phrases similar to a user-provided input, searching in a possibly large corpus of text. You can pre-compute the vector embedding for all phrases in the corpus and store them in a suitable database along with the texts. Once this is done, queries would work like this: first, calculate the vector V for the query sentence; second, run a database query for the rows whose vector is the closest to V.
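The workflow just described can be sketched in a few lines of plain Python. This is a toy under stated assumptions: `make_toy_embedder` is a hypothetical word-count stand-in for a real embedding model (it captures shared words, not meaning), the "database" is an in-memory list, and the search is a brute-force scan rather than the indexed lookup a vector database would perform.

```python
import math
import re


def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())


def make_toy_embedder(corpus):
    """Return a function mapping text to a word-count vector over the corpus vocabulary.

    This is only a stand-in for a real embedding model: it captures shared
    words, not meaning.
    """
    vocabulary = sorted({word for text in corpus for word in tokenize(text)})
    index = {word: i for i, word in enumerate(vocabulary)}

    def embed(text):
        vector = [0.0] * len(vocabulary)
        for word in tokenize(text):
            if word in index:
                vector[index[word]] += 1.0
        return vector

    return embed


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


corpus = [
    "The unexamined life is not worth living.",
    "Man is condemned to be free.",
    "He who has a why to live can bear almost any how.",
]
embed = make_toy_embedder(corpus)

# Preparation: pre-compute and store one (text, vector) row per phrase.
store = [(text, embed(text)) for text in corpus]


def search(query, top_k=1):
    v = embed(query)  # first: compute the vector V for the query
    ranked = sorted(store, key=lambda row: cosine_similarity(v, row[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]  # second: rows closest to V


print(search("Is a life without examination worth living?"))
```

Scanning every row is fine for three quotes, but it does not scale to a large corpus, which is one reason to delegate this step to a database designed for it.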

A database with this kind of capability is called a vector database. Nowadays, with the extraordinary growth of GenAI, many databases have started to offer vector-oriented features, and new databases have sprung up, explicitly built around this need.

Astra DB, a DBaaS offering built on the planet-scale and ultra-high-availability distributed database Apache Cassandra®, provides solid support for vector search workloads with top-class performance. You can try the vector capabilities of Astra DB right now: create a free-tier account and start experimenting with it, for example, by running the demo application outlined throughout the rest of this series!

An important need that emerges when setting up a vector search-based application is that of filtering based on metadata. For instance, you may want to look for items in your e-commerce offering whose description is "similar to" a provided search query, while restricting the search, within the same query, to entries that satisfy a condition such as "special_offer = True" or "style = casual" on their metadata. You will see a practical application of vector-search filtering and two different ways to accommodate it when creating the vector store in the database.
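The idea of combining a metadata condition with a similarity ranking can be sketched as follows. The catalog rows, their two-dimensional vectors, and the `filtered_search` helper are all hypothetical; the filter-then-rank logic runs in memory here, whereas a vector database would evaluate the condition as part of the query itself.

```python
import math

# Hypothetical catalog rows: hand-written vectors stand in for real embeddings.
items = [
    {"name": "linen summer shirt", "vector": [0.9, 0.1], "special_offer": True, "style": "casual"},
    {"name": "cotton beach shirt", "vector": [0.8, 0.2], "special_offer": False, "style": "casual"},
    {"name": "wool winter coat", "vector": [0.1, 0.9], "special_offer": True, "style": "formal"},
]


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def filtered_search(query_vector, filters, top_k=3):
    """Keep rows matching every metadata filter, then rank them by similarity."""
    candidates = [
        item for item in items
        if all(item.get(key) == value for key, value in filters.items())
    ]
    candidates.sort(
        key=lambda item: cosine_similarity(query_vector, item["vector"]),
        reverse=True,
    )
    return [item["name"] for item in candidates[:top_k]]


query_vector = [1.0, 0.0]  # pretend this encodes "light shirt for hot weather"
print(filtered_search(query_vector, {"special_offer": True}))
print(filtered_search(query_vector, {"style": "casual"}))
```

Notice that the first query still returns the winter coat, just ranked last: the filter decides which rows are eligible, and the vector similarity only orders them.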

Coming up Next

In the next installment of this mini-series, we will put these concepts to use by building a vector store and developing a search engine on top of it.
