

Simplify RAG Application With MongoDB Atlas and Amazon Bedrock

In this article, learn how to integrate MongoDB Atlas as the vector store and set up the entire workflow for your RAG application.

By Abhishek Gupta · May. 30, 24 · Tutorial

By fetching data from the organization’s internal or proprietary sources, Retrieval Augmented Generation (RAG) extends the capabilities of foundation models (FMs) to specific domains, without needing to retrain the model. It is a cost-effective approach to improving model output so that it remains relevant, accurate, and useful in various contexts.

Knowledge Bases for Amazon Bedrock is a fully managed capability that helps you implement the entire RAG workflow from ingestion to retrieval and prompt augmentation without having to build custom integrations to data sources and manage data flows. With MongoDB Atlas vector store integration, you can build RAG solutions to securely connect your organization’s private data sources to FMs in Amazon Bedrock.

Let's see how the MongoDB Atlas integration with Knowledge Bases can simplify the process of building RAG applications.


Configure MongoDB Atlas

The process of creating a MongoDB Atlas cluster on AWS is well documented. Here are the high-level steps:

  • This integration requires a dedicated Atlas cluster tier of at least M10, so choose an M10 (or higher) dedicated tier during cluster creation.
  • Create a database and collection.
  • For authentication, create a database user. Select Password as the Authentication Method. Grant the Read and write to any database role to the user.
  • Modify the IP Access List – add IP address 0.0.0.0/0 to allow access from anywhere. For production deployments, AWS PrivateLink is the recommended way to have Amazon Bedrock establish a secure connection to your MongoDB Atlas cluster.
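Once the cluster is up, it's worth sanity-checking the connection details from code before moving on, since the knowledge base configuration later depends on them. A minimal sketch using PyMongo; the database and collection names are placeholders for whatever you created above:

```python
# Hypothetical names -- substitute the values from your own Atlas setup.
DB_NAME = "bedrock_db"
COLLECTION_NAME = "bedrock_collection"

def build_srv_uri(user: str, password: str, host: str) -> str:
    """Assemble the mongodb+srv connection string for the database user
    created above. `host` is the cluster hostname, e.g. cluster0.abc12.mongodb.net."""
    return f"mongodb+srv://{user}:{password}@{host}/?retryWrites=true&w=majority"

def ping_cluster(uri: str):
    # PyMongo is imported here so the URI helper above stays usable without the driver.
    from pymongo import MongoClient
    client = MongoClient(uri)
    client.admin.command("ping")  # raises if the user or IP access list is misconfigured
    return client[DB_NAME][COLLECTION_NAME]
```

If the `ping` fails, recheck the database user's role and the IP access list entry before continuing.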

Create the Vector Search Index in MongoDB Atlas

Use the following definition to create a Vector Search index:

{
  "fields": [
    {
      "numDimensions": 1536,
      "path": "AMAZON_BEDROCK_CHUNK_VECTOR",
      "similarity": "cosine",
      "type": "vector"
    },
    {
      "path": "AMAZON_BEDROCK_METADATA",
      "type": "filter"
    },
    {
      "path": "AMAZON_BEDROCK_TEXT_CHUNK",
      "type": "filter"
    }
  ]
}


  • AMAZON_BEDROCK_CHUNK_VECTOR – Contains the vector embedding for each data chunk. We are using cosine similarity and embeddings of size 1536 (we will choose the embedding model accordingly in the upcoming steps).
  • AMAZON_BEDROCK_TEXT_CHUNK – Contains the raw text for each data chunk.
  • AMAZON_BEDROCK_METADATA – Contains additional data for source attribution and rich query capabilities.
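If you prefer to create the index from code rather than the Atlas UI, recent PyMongo versions expose Atlas Search index management helpers. A sketch, assuming a collection object from the cluster configured earlier (the index name is a placeholder):

```python
VECTOR_INDEX_NAME = "bedrock-vector-index"  # hypothetical name

# The same index definition as above, expressed as a Python dict.
INDEX_DEFINITION = {
    "fields": [
        {
            "numDimensions": 1536,
            "path": "AMAZON_BEDROCK_CHUNK_VECTOR",
            "similarity": "cosine",
            "type": "vector",
        },
        {"path": "AMAZON_BEDROCK_METADATA", "type": "filter"},
        {"path": "AMAZON_BEDROCK_TEXT_CHUNK", "type": "filter"},
    ]
}

def create_vector_index(collection):
    # PyMongo is imported lazily so the definition above stays importable on its own.
    from pymongo.operations import SearchIndexModel
    model = SearchIndexModel(
        definition=INDEX_DEFINITION,
        name=VECTOR_INDEX_NAME,
        type="vectorSearch",
    )
    collection.create_search_index(model)
```

Note that Atlas builds the index asynchronously; it may take a short while to become queryable.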

Configure the Knowledge Base in Amazon Bedrock

Create an AWS Secrets Manager secret to securely store the MongoDB Atlas database user credentials.
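This step can also be scripted with boto3. A minimal sketch; the secret name is hypothetical, and the two-key JSON shape for the credentials is an assumption to verify against the current Bedrock documentation:

```python
import json

SECRET_NAME = "bedrock-mongodb-atlas-credentials"  # hypothetical name

def build_secret_string(username: str, password: str) -> str:
    # The Atlas integration expects a JSON secret holding the database user's
    # credentials under these two keys (assumption -- confirm in the Bedrock docs).
    return json.dumps({"username": username, "password": password})

def create_secret(username: str, password: str) -> str:
    import boto3  # deferred so build_secret_string is usable without the AWS SDK
    client = boto3.client("secretsmanager")
    response = client.create_secret(
        Name=SECRET_NAME,
        SecretString=build_secret_string(username, password),
    )
    return response["ARN"]  # pass this ARN to the knowledge base configuration later
```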


Create an Amazon Simple Storage Service (Amazon S3) storage bucket and upload any document(s) of your choice — Knowledge Base supports multiple file formats (including text, HTML, and CSV). Later, you will use the knowledge base to ask questions about the contents of these documents.

Navigate to the Amazon Bedrock console and start configuring the knowledge base. In step 2, choose the S3 bucket you created earlier:


Select Titan Embeddings G1 – Text as the embedding model and MongoDB Atlas as the vector database.

Enter the basic information for the MongoDB Atlas cluster along with the ARN of the AWS Secrets Manager secret you created earlier. In the Metadata field mapping attributes, enter the vector store-specific details; they should match the vector search index definition you used earlier.
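The same configuration can be supplied programmatically through the bedrock-agent API's create_knowledge_base operation. A sketch of the MongoDB Atlas storage configuration; the endpoint, names, and ARNs are placeholders, and the field mapping mirrors the index definition above:

```python
# Placeholder values -- substitute your own cluster, index, and secret details.
STORAGE_CONFIGURATION = {
    "type": "MONGO_DB_ATLAS",
    "mongoDbAtlasConfiguration": {
        "endpoint": "cluster0.abc12.mongodb.net",
        "databaseName": "bedrock_db",
        "collectionName": "bedrock_collection",
        "vectorIndexName": "bedrock-vector-index",
        "credentialsSecretArn": "arn:aws:secretsmanager:us-east-1:123456789012:secret:bedrock-mongodb-atlas-credentials",
        "fieldMapping": {
            # Must match the vector search index definition created earlier.
            "vectorField": "AMAZON_BEDROCK_CHUNK_VECTOR",
            "textField": "AMAZON_BEDROCK_TEXT_CHUNK",
            "metadataField": "AMAZON_BEDROCK_METADATA",
        },
    },
}

def create_knowledge_base(name: str, role_arn: str, embedding_model_arn: str) -> str:
    import boto3  # deferred so the configuration dict above is usable standalone
    client = boto3.client("bedrock-agent")
    response = client.create_knowledge_base(
        name=name,
        roleArn=role_arn,
        knowledgeBaseConfiguration={
            "type": "VECTOR",
            "vectorKnowledgeBaseConfiguration": {
                "embeddingModelArn": embedding_model_arn,
            },
        },
        storageConfiguration=STORAGE_CONFIGURATION,
    )
    return response["knowledgeBase"]["knowledgeBaseId"]
```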


Once the knowledge base is created, you need to synchronize the data source (S3 bucket data) with the MongoDB Atlas vector search index.
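Synchronization can also be triggered from code with the bedrock-agent API's start_ingestion_job, polling until the job reaches a terminal state. A sketch, with the knowledge base and data source IDs as placeholders:

```python
import time

# Terminal ingestion job states (anything else means the job is still running).
TERMINAL_STATES = {"COMPLETE", "FAILED", "STOPPED"}

def is_done(status: str) -> bool:
    return status in TERMINAL_STATES

def sync_data_source(knowledge_base_id: str, data_source_id: str) -> str:
    import boto3  # deferred so the helpers above run without the AWS SDK
    client = boto3.client("bedrock-agent")
    job = client.start_ingestion_job(
        knowledgeBaseId=knowledge_base_id,
        dataSourceId=data_source_id,
    )["ingestionJob"]
    while not is_done(job["status"]):
        time.sleep(10)  # poll until the S3 data has been chunked, embedded, and stored
        job = client.get_ingestion_job(
            knowledgeBaseId=knowledge_base_id,
            dataSourceId=data_source_id,
            ingestionJobId=job["ingestionJobId"],
        )["ingestionJob"]
    return job["status"]
```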


Once that's done, you can check the MongoDB Atlas collection to verify the data. As per the index definition, the vector embeddings have been stored in AMAZON_BEDROCK_CHUNK_VECTOR along with the text chunk and metadata in AMAZON_BEDROCK_TEXT_CHUNK and AMAZON_BEDROCK_METADATA, respectively.


Query the Knowledge Base

You can now ask questions about your documents by querying the knowledge base — select Show source details to see the chunks cited for each footnote.


You can also change the foundation model. For example, I switched to Claude 3 Sonnet.


Use Retrieval APIs To Integrate Knowledge Base With Applications

To build RAG applications on top of Knowledge Bases for Amazon Bedrock, you can use the RetrieveAndGenerate API, which allows you to query the knowledge base and get a generated response.
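A minimal sketch of calling RetrieveAndGenerate with boto3; the knowledge base ID is a placeholder, and the model ARN shown uses Claude 3 Sonnet as in the console example above:

```python
# Placeholder identifiers -- substitute your own.
KNOWLEDGE_BASE_ID = "ABCDEFGHIJ"
MODEL_ARN = "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0"

def build_request(question: str) -> dict:
    """Request body for RetrieveAndGenerate: the user query, the knowledge base
    to search, and the foundation model that generates the final answer."""
    return {
        "input": {"text": question},
        "retrieveAndGenerateConfiguration": {
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": KNOWLEDGE_BASE_ID,
                "modelArn": MODEL_ARN,
            },
        },
    }

def ask(question: str) -> str:
    import boto3  # deferred so build_request is usable without the AWS SDK
    client = boto3.client("bedrock-agent-runtime")
    response = client.retrieve_and_generate(**build_request(question))
    return response["output"]["text"]
```

The response also carries citations, so source attribution (the footnotes you saw in the console) is available programmatically as well.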

If you want to further customize your RAG solutions, consider using the Retrieve API, which returns the semantic search responses that you can use for the remaining part of the RAG workflow.
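A sketch of the Retrieve API, which returns only the matched chunks so you can run your own prompt augmentation and generation step (the knowledge base ID is a placeholder):

```python
def build_retrieve_params(query: str, knowledge_base_id: str, top_k: int = 4) -> dict:
    """Parameters for Retrieve: the query text plus how many results to return."""
    return {
        "knowledgeBaseId": knowledge_base_id,
        "retrievalQuery": {"text": query},
        "retrievalConfiguration": {
            "vectorSearchConfiguration": {"numberOfResults": top_k}
        },
    }

def semantic_search(query: str, knowledge_base_id: str) -> list:
    import boto3  # deferred so build_retrieve_params is usable without the AWS SDK
    client = boto3.client("bedrock-agent-runtime")
    response = client.retrieve(**build_retrieve_params(query, knowledge_base_id))
    # Each result carries the matched text chunk, its source location, and a score.
    return [r["content"]["text"] for r in response["retrievalResults"]]
```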

More Configurations

You can further customize your knowledge base queries with a different search type, additional filters, a custom prompt, and more.
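For example, the Retrieve API's vector search configuration accepts a search-type override and a metadata filter. A sketch; the metadata key shown is the source-URI attribute Bedrock attaches during ingestion (an assumption worth verifying against your stored metadata), and not every vector store supports every search type:

```python
def build_filtered_config(num_results: int, source_uri: str) -> dict:
    """Retrieval configuration restricting results to chunks from one source document."""
    return {
        "vectorSearchConfiguration": {
            "numberOfResults": num_results,
            "overrideSearchType": "SEMANTIC",
            "filter": {
                # "x-amz-bedrock-kb-source-uri" is assumed to be the metadata key
                # holding each chunk's S3 source -- confirm in your own collection.
                "equals": {"key": "x-amz-bedrock-kb-source-uri", "value": source_uri}
            },
        }
    }
```

Pass the resulting dict as the retrievalConfiguration argument to Retrieve (or inside the knowledge base configuration for RetrieveAndGenerate).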


Conclusion

Thanks to the MongoDB Atlas integration with Knowledge Bases for Amazon Bedrock, most of the heavy lifting is taken care of. Once the vector search index and knowledge base are configured, you can incorporate RAG into your applications. Behind the scenes, Amazon Bedrock will convert your input (prompt) into embeddings, query the knowledge base, augment the FM prompt with the search results as contextual information, and return the generated response.

Happy building!


Published at DZone with permission of Abhishek Gupta, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.
