Enhancing Productivity With RAG-Based GenAI Solutions
Learn how RAG on Amazon Bedrock simplifies content creation by combining data retrieval with generation, enhances AI capabilities, and boosts efficiency.
So what exactly is RAG? In simple terms, it stands for retrieval-augmented generation. Let us focus on its two key aspects: retrieval and generation. With standard generative AI (GenAI), you provide a prompt, and the application uses a large language model to come up with a suitable response. Now, imagine an application that first retrieves information from various sources and then generates a response based on the retrieved information. That is exactly how a RAG-based GenAI application works. It provides context for the generated response.
Let us explore this further with an example. If we ask a GenAI application something like "What is the best way to back up my customer database?", it would probably respond with generic advice, because it would not know the details of the customer database that I am talking about. Now, suppose I have a design document with all the details. It has a section on data stores and explicitly lists the customer database, which is hosted on Amazon DynamoDB. The design document is uploaded to my organization’s SharePoint. So, the application will first retrieve contextual information from SharePoint, augment the prompt with the retrieved information, and then generate a response based on that. In this case, the application will provide strategies for backing up a DynamoDB database and direct me to the relevant sections in my design document.
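The retrieve, augment, and generate steps described above can be sketched in a few lines of Python. This is a toy illustration, not how Bedrock implements retrieval: the `Document` class, the sample documents, and the word-overlap ranking are all assumptions made up for this example, and the final call to a model is left out.

```python
from dataclasses import dataclass


@dataclass
class Document:
    source: str  # where the text came from, e.g. a section of a design document
    text: str


# Hypothetical stand-in for content that would be retrieved from SharePoint.
DOCUMENTS = [
    Document("design-doc#data-stores", "The customer database is hosted on Amazon DynamoDB."),
    Document("design-doc#networking", "Services communicate over a private VPC."),
]


def retrieve(question: str, documents: list[Document], top_k: int = 1) -> list[Document]:
    """Rank documents by how many words they share with the question (toy retriever)."""
    question_words = set(question.lower().replace("?", "").split())

    def overlap(doc: Document) -> int:
        return len(question_words & set(doc.text.lower().replace(".", "").split()))

    return sorted(documents, key=overlap, reverse=True)[:top_k]


def augment(question: str, context: list[Document]) -> str:
    """Prepend the retrieved passages to the user's question."""
    context_block = "\n".join(f"[{doc.source}] {doc.text}" for doc in context)
    return f"Context:\n{context_block}\n\nQuestion: {question}"


question = "What is the best way to back up my customer database?"
prompt = augment(question, retrieve(question, DOCUMENTS))
print(prompt)  # this augmented prompt is what would be sent to the model
```

A real system would replace the word-overlap ranking with a vector similarity search, but the shape of the flow stays the same: find relevant passages, attach them to the prompt, then generate.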
Organizations can reap significant benefits by implementing RAG. By adding context from their own documents, they can generate more accurate and relevant content. AI-driven content generation helps keep brands at the forefront of innovation. It is a good time to start exploring how it can elevate your strategy.
The Business Case
Let us expand the example above into a problem statement. A company has a corporate SharePoint with loads of information stored in it. However, the team hardly looks at these documents because it takes a long time to dig through pages of information. Also, legacy searches often fail to return relevant information. Application teams raise tickets whenever they encounter an issue. Most of these issues are trivial and already have a known solution. This unnecessarily increases the number of tickets and hurts the productivity of the teams.
The organization is looking to build a RAG GenAI chatbot that would help team members quickly get a solution and resolve their problems. This chatbot will be able to help developers resolve issues like “Why is my application failing with a DB access error?” Developers would be encouraged to use this chatbot to find answers and raise tickets only if they are not able to solve their problems using the chatbot.
Major Building Blocks
To build this solution, we will use two AWS managed services: Amazon Bedrock to provide GenAI functionality and Amazon S3 Vectors to store RAG data.
Let’s start with vector databases. Vector databases are the foundation of GenAI applications. As the name suggests, they store data in the form of vectors. So, how are they different from a traditional relational database? A relational database stores information as rows of data. For example, if we want to store a photo in a relational database, it would be stored as binary data along with a few other attributes such as date, place, and maybe some tags. It can be searched using SQL queries on these attributes.
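To make the relational approach concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table layout, column names, and sample rows are assumptions invented for illustration. Note that the search can only match attributes that someone stored explicitly.

```python
import sqlite3

# In-memory database with a hypothetical photos table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE photos ("
    "id INTEGER PRIMARY KEY, image BLOB, taken_on TEXT, place TEXT, tags TEXT)"
)
conn.executemany(
    "INSERT INTO photos (image, taken_on, place, tags) VALUES (?, ?, ?, ?)",
    [
        (b"\x89PNG...", "2024-06-01", "Serengeti", "lion,cub"),
        (b"\x89PNG...", "2024-06-02", "Nairobi", "city,skyline"),
    ],
)

# The query works only because the "lion" tag was stored as an attribute.
rows = conn.execute(
    "SELECT id, place FROM photos WHERE tags LIKE ?", ("%lion%",)
).fetchall()
print(rows)
```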
Now, suppose I want to search for photos using plain English, like "photos of lion cubs running around in the midst of tall green grass." Notice the number of attributes: "lion cubs," "running," and "tall green grass." Clearly, a relational database is not well equipped to respond to these kinds of searches. Answering such a query is like scanning the actual photo and extracting attributes out of it, and that is essentially what a vector embedding does. An embedding model represents the photo as a vector of numerical values across many dimensions, typically hundreds to thousands. These vectors are then stored in a vector database. At search time, the search string is converted into a vector using the same embedding model and then matched against the stored vectors by similarity.
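The matching step can be illustrated with cosine similarity, one common way to compare vectors. The sketch below uses made-up three-dimensional vectors in place of real embeddings, which would have far more dimensions and come from an embedding model. The file names and values are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up embeddings standing in for vectors stored in a vector database.
STORED = {
    "lion_cubs_in_grass.jpg": [0.9, 0.8, 0.1],
    "city_skyline.jpg": [0.1, 0.0, 0.9],
    "lioness_resting.jpg": [0.8, 0.2, 0.1],
}


def search(query_vector: list[float], store: dict[str, list[float]], top_k: int = 2) -> list[str]:
    """Return the names of the top_k stored vectors most similar to the query."""
    ranked = sorted(
        store.items(),
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:top_k]]


# Made-up embedding of the search string.
query = [0.85, 0.75, 0.05]
print(search(query, STORED))
```

A vector store performs the same kind of comparison, but over millions of vectors and with indexing so that it does not have to score every stored item.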
S3 Vectors is a recent offering from AWS and is a highly capable vector data store. It inherits the benefits of S3, so it can manage large-scale, high-dimensional data efficiently. Combined with low-latency searches, this makes it a good fit as a vector store for complex GenAI applications. Because S3 scales elastically, performance holds up even as data grows rapidly. Additionally, S3 Vectors integrates well with SageMaker and other AWS services, enabling the data to be used for other machine learning models.
Foundation models (FMs) are the backbone of GenAI applications. These models can understand and generate text, image, and multimedia content. They power everything from chatbots to content creation tools. However, deploying and managing these models can be complex, requiring significant infrastructure and expertise. This is where Amazon Bedrock steps in. Bedrock simplifies the entire process of building and deploying GenAI applications.
Amazon Bedrock is a fully managed service for building and scaling generative AI applications. It offers a selection of FMs from leading AI providers, giving developers the flexibility to choose the models that best fit their use cases. With Bedrock, integrating foundation models into applications requires minimal coding and configuration, so developers can focus on application development without worrying about managing infrastructure.
The Solution
The first part of the solution is to ingest the data that is stored in SharePoint. We will use a Bedrock knowledge base (KB) to hold this information. A Bedrock KB can use S3 Vectors as its vector store. Bedrock also provides pre-built connectors for popular data sources, including SharePoint. So, essentially, the first step would be to configure a SharePoint connector and build the knowledge base, backed by S3 Vectors. We would also need to select a foundation model for creating embeddings at this stage.
For this example, let us select Titan Text Embeddings V2, whose default output dimension is 1024. To ingest data, Bedrock will use the connector to read the data, split it into chunks, create vector embeddings of 1024 dimensions using the foundation model, and store them in S3 Vectors as its knowledge base.
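Bedrock performs chunking and embedding internally during ingestion, so none of the following code is needed to build the knowledge base. As a rough sketch of what those two steps involve, the example below splits text into overlapping character windows and prepares an embedding request for Titan Text Embeddings V2. The chunk sizes, function names, and the fixed-size strategy are assumptions for illustration. The `embed` function expects a boto3 `bedrock-runtime` client to be passed in.

```python
import json

# Model ID for Titan Text Embeddings V2 on Amazon Bedrock.
EMBEDDING_MODEL_ID = "amazon.titan-embed-text-v2:0"


def chunk_text(text: str, chunk_size: int = 300, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap slightly."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
        start += chunk_size - overlap
    return chunks


def build_embedding_request(text: str, dimensions: int = 1024) -> dict:
    """Build the keyword arguments for an invoke_model call to the embedding model."""
    return {
        "modelId": EMBEDDING_MODEL_ID,
        "body": json.dumps(
            {"inputText": text, "dimensions": dimensions, "normalize": True}
        ),
    }


def embed(text: str, client, dimensions: int = 1024) -> list[float]:
    """Return the embedding for text; client is a boto3 'bedrock-runtime' client."""
    response = client.invoke_model(**build_embedding_request(text, dimensions))
    return json.loads(response["body"].read())["embedding"]


print(chunk_text("0123456789", chunk_size=4, overlap=1))
```

With AWS credentials configured, `embed(chunk, boto3.client("bedrock-runtime"))` would return a list of 1024 floats per chunk. The overlap between chunks helps keep a sentence that straddles a boundary retrievable from at least one chunk.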

Once that’s done, the knowledge base is ready to provide contextual answers. We can use a Lambda function here, fronted by Amazon API Gateway, to accept prompts from users. The Lambda function invokes Bedrock with the prompt. Bedrock first converts the prompt to a vector using the same Titan foundation model. The knowledge base is then searched with that vector for close matches. The prompt is “augmented” with the content “retrieved” from the knowledge base, and Bedrock uses the enhanced prompt to generate a response.
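A minimal sketch of such a Lambda function is shown below, assuming an API Gateway proxy integration that passes a JSON body with a `prompt` field. It uses the `retrieve_and_generate` operation of the boto3 `bedrock-agent-runtime` client, which performs the retrieval, augmentation, and generation in one call. The environment variable names, the placeholder knowledge base ID, and the placeholder model ARN are assumptions and must be replaced with real values.

```python
import json
import os

# Placeholder identifiers (assumptions): supply real values through the
# Lambda function's environment variables.
KNOWLEDGE_BASE_ID = os.environ.get("KNOWLEDGE_BASE_ID", "EXAMPLE_KB_ID")
MODEL_ARN = os.environ.get(
    "MODEL_ARN",
    "arn:aws:bedrock:us-east-1::foundation-model/EXAMPLE_MODEL_ID",
)


def build_rag_request(prompt: str) -> dict:
    """Build the arguments for the knowledge base retrieve_and_generate call."""
    return {
        "input": {"text": prompt},
        "retrieveAndGenerateConfiguration": {
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": KNOWLEDGE_BASE_ID,
                "modelArn": MODEL_ARN,
            },
        },
    }


def lambda_handler(event, context, client=None):
    """Handle an API Gateway proxy event whose JSON body contains a 'prompt' field."""
    body = json.loads(event.get("body") or "{}")
    prompt = str(body.get("prompt", "")).strip()
    if not prompt:
        return {"statusCode": 400, "body": json.dumps({"error": "prompt is required"})}

    if client is None:
        import boto3  # imported lazily so the module loads without AWS dependencies

        client = boto3.client("bedrock-agent-runtime")

    response = client.retrieve_and_generate(**build_rag_request(prompt))
    return {
        "statusCode": 200,
        "body": json.dumps({"answer": response["output"]["text"]}),
    }
```

The optional `client` parameter is a design choice for this sketch: Lambda only passes `event` and `context`, so production calls create the real client, while tests can pass in a stub.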

Conclusion
The development of RAG-based GenAI applications with cutting-edge technologies like Amazon Bedrock and S3 Vectors demonstrates a significant leap forward in the field of generative AI. By combining the capabilities of data retrieval and advanced content generation, organizations can truly unlock the value of information stored in their repositories. RAG not only enhances the user experience by providing context-aware responses but also streamlines operations. It reduces reliance on traditional support channels and improves productivity significantly.