Reducing Hallucinations Using Prompt Engineering and RAG

Using prompt engineering to reduce hallucinations in LLMs: one of the methodologies I used to get the desired, factually grounded output from an LLM.

By Pranav Kumar Chaudhary · Jul. 02, 25 · Analysis

Overview

Large language models (LLMs) are a powerful tool for generating content, but their generative capabilities come with pros and cons. One of the major issues we often encounter is the factual correctness of the generated content: these models have a high tendency to hallucinate, sometimes generating incorrect or entirely non-existent information. The output is often so fluent and impressive that it looks factually correct and plausible. As developers, it is our responsibility to ensure the system works reliably and generates accurate, concise content.

In this article, I will delve into the two major methodologies I employed to reduce hallucinations in applications developed using AWS Bedrock and other AWS tools and technologies.

Prompt Engineering

System Prompt

  • Set up a role: Using the system prompt, you can set up a role for the LLM. This instructs the model to assume the provided role and generate content within that confined space.
  • Set up boundaries: Boundaries instruct the LLM to generate content only within the given space. They add clarity and precision, helping the model break down the instructions and act accordingly.
  • Enhance security: Security is one of the most important aspects of any software application. The system prompt strengthens the security of an LLM application by adding an extra layer of protection between user input and the LLM.

A clear system prompt helps the LLM break down the instructions into steps and make decisions accordingly. This makes the system clearer, more concise, and more predictable. To design the system prompt, we first need to:

  • Identify the use case: A generic system is prone to error because it can assume any role it wants. To minimize the risk of hallucinations, first identify the use case and assign a role to the LLM; the role confines the model to the given space. For example: “You are working as a research assistant. Break down the input user queries, use the input data, validate it, and generate content.” Or: “As a marketing assistant, generate the required output from the inputs without assuming any information. If you require more information, ask the user.”
  • Identify the constraints and boundaries: It is essential for such systems to understand the constraints and boundaries beyond which they should not generate content. These can be supplied as explicit instructions. For example: “If you don’t know the answer, reply with ‘I can not help with this’ instead of making up an answer,” or “Return the response in strict JSON format. Before returning, validate the JSON and fix any JSON errors.”
  • Identify presentation requirements: Formatting is another requirement to settle before designing the system prompt. Formatting instructions such as delimiters and output formats help presentation layers render the generated content as required. For example: “Create bullet points for the list of items,” or “Generate the output in JSON format.”
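Putting the three elements together, a system prompt can be sent alongside the user message via Bedrock's Converse API. The sketch below is illustrative, not my application's exact prompt: the role text, JSON output requirement, and model ID are all example assumptions.

```python
# Hypothetical system prompt combining role, boundaries, and format requirements.
SYSTEM_PROMPT = (
    "You are a research assistant. Answer ONLY from the provided context. "
    'If the answer is not in the context, reply "I can not help with this". '
    "Return the response as strict JSON with keys 'answer' and 'citations'."
)

def build_converse_request(
    user_query: str,
    model_id: str = "anthropic.claude-3-haiku-20240307-v1:0",  # example model
) -> dict:
    """Build the keyword arguments for the bedrock-runtime Converse API."""
    return {
        "modelId": model_id,
        "system": [{"text": SYSTEM_PROMPT}],
        "messages": [{"role": "user", "content": [{"text": user_query}]}],
        "inferenceConfig": {"temperature": 0.0, "maxTokens": 512},
    }

# With AWS credentials configured, the actual call would be:
# import boto3
# client = boto3.client("bedrock-runtime")
# response = client.converse(**build_converse_request("What is RAG?"))
```

Keeping the request construction in a pure function makes the prompt easy to unit-test and version independently of the network call.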

Retrieval-Augmented Generation (RAG)

1. Knowledge Base (KB) Data Sync

For this, I have leveraged Amazon OpenSearch to store the generated embeddings. The source data is stored in an S3 bucket and synced regularly to ensure the latest information is available to the KB. This S3 bucket is the data source for the KB; its contents are chunked according to the chunking strategy and stored in the OpenSearch vector store.
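The sync step can be sketched with boto3: upload the refreshed document to the source bucket, then start a KB ingestion job so the new content gets re-chunked and re-embedded. The bucket name and the KB/data-source IDs below are placeholders you would replace with your own.

```python
def s3_uri(bucket: str, key: str) -> str:
    """Build the s3:// URI that the KB data source points at."""
    return f"s3://{bucket}/{key}"

def sync_knowledge_base(
    local_path: str,
    s3_key: str,
    bucket: str = "my-kb-source-bucket",   # placeholder
    kb_id: str = "EXAMPLEKBID",            # placeholder
    data_source_id: str = "EXAMPLEDSID",   # placeholder
) -> str:
    """Upload a refreshed source document and re-ingest it into the KB.

    Requires configured AWS credentials. Returns the ingestion job ID so the
    caller can poll get_ingestion_job for completion.
    """
    import boto3  # imported lazily so the pure helper above works offline
    boto3.client("s3").upload_file(local_path, bucket, s3_key)
    job = boto3.client("bedrock-agent").start_ingestion_job(
        knowledgeBaseId=kb_id, dataSourceId=data_source_id
    )
    return job["ingestionJob"]["ingestionJobId"]
```

Running the sync on a schedule (e.g., an EventBridge rule) keeps the vector store consistent with the bucket.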

2. Embedding Model

The embedding model is used to create vector embeddings of the provided source data. For this, I leveraged the Amazon Titan Embeddings model. It is a text-to-vector model; a vector is a mathematical representation of the given information (text).

Vectors represent a multidimensional view of the data, which allows for efficient search, indexing, and other operations, and can be used to calculate the similarity or distance between different data points. This is useful for clustering, finding nearest neighbors, and other tasks that require identifying similar objects. 
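As a sketch of how these pieces fit together: `embed` calls the Titan Text Embeddings V2 model through `bedrock-runtime` (the model ID reflects the API at the time of writing, and the call needs AWS credentials), while `cosine_similarity` shows the kind of distance computation the vector store performs internally when finding nearest neighbors.

```python
import json
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Similarity of two vectors: 1.0 means same direction, 0.0 means orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def embed(text: str) -> list[float]:
    """Create a vector embedding with Amazon Titan (requires AWS credentials)."""
    import boto3
    client = boto3.client("bedrock-runtime")
    resp = client.invoke_model(
        modelId="amazon.titan-embed-text-v2:0",
        body=json.dumps({"inputText": text}),
    )
    return json.loads(resp["body"].read())["embedding"]
```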

3. Knowledge Bases Creation

The next step is to create the KB using the Amazon Titan Embeddings model and a chunking strategy that ensures efficient data chunking and retrieval. The data from S3 is used as the source, chunked, and stored in an OpenSearch vector database. OpenSearch provides various out-of-the-box serverless capabilities for scaling, efficient retrieval, querying, filtering, etc.
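To make the chunking step concrete, here is a naive fixed-size chunker with overlap. It splits on whitespace rather than real tokens, so it only approximates Bedrock's built-in fixed-size chunking strategy; the default sizes are illustrative.

```python
def chunk_text(text: str, max_tokens: int = 300, overlap: int = 30) -> list[str]:
    """Split text into overlapping fixed-size chunks (whitespace 'tokens').

    Overlap carries a little trailing context into the next chunk so that
    sentences cut at a boundary remain retrievable from either side.
    """
    words = text.split()
    chunks: list[str] = []
    start = 0
    while start < len(words):
        end = min(start + max_tokens, len(words))
        chunks.append(" ".join(words[start:end]))
        if end == len(words):
            break
        start = end - overlap  # step back to create the overlap window
    return chunks
```

Each chunk is then embedded and indexed; at query time only the top-matching chunks are pulled back, keeping the prompt small.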

4. RAG Library

A RAG library is required to efficiently perform all RAG operations across various data sources. Upon receiving a user query, this library queries the KB to retrieve the relevant chunks using similarity search. Once these chunks are retrieved, they are used to enrich the prompt with the retrieved data. This provides the LLM with the context and details required for the given input query.
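A minimal version of that flow, assuming a Bedrock KB: `retrieve_chunks` runs the similarity search via the `bedrock-agent-runtime` Retrieve API (needs AWS credentials and a real KB ID), and `build_enriched_prompt` stitches the retrieved chunks into the prompt. The prompt wording is my own illustration, not a fixed template.

```python
def build_enriched_prompt(query: str, chunks: list[str]) -> str:
    """Enrich the user query with numbered KB chunks as grounding context."""
    context = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(chunks))
    return (
        "Answer using ONLY the context below. Cite chunk numbers.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )

def retrieve_chunks(query: str, kb_id: str, top_k: int = 5) -> list[str]:
    """Similarity search against the Bedrock KB (requires AWS credentials)."""
    import boto3
    client = boto3.client("bedrock-agent-runtime")
    resp = client.retrieve(
        knowledgeBaseId=kb_id,
        retrievalQuery={"text": query},
        retrievalConfiguration={
            "vectorSearchConfiguration": {"numberOfResults": top_k}
        },
    )
    return [r["content"]["text"] for r in resp["retrievalResults"]]
```

Numbering the chunks in the context lets the model cite its sources inline, which feeds the citation output described below.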

5. Output Generation

Upon receiving the enriched prompt with the relevant information, the LLM generates the output within its confined role and grounded in that information. This ensures it does not inject non-existent or made-up data into the output.
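Sketching the final step: the enriched prompt goes to the model with temperature 0 for deterministic, grounded output, and a small helper pulls the `[n]` chunk citations back out for display. The model ID is an example, and the citation format assumes the chunks were numbered `[1]`, `[2]`, ... in the context.

```python
import re

def generate_grounded_answer(
    enriched_prompt: str,
    model_id: str = "anthropic.claude-3-haiku-20240307-v1:0",  # example model
) -> str:
    """Send the context-enriched prompt to the model (requires AWS credentials)."""
    import boto3
    client = boto3.client("bedrock-runtime")
    resp = client.converse(
        modelId=model_id,
        messages=[{"role": "user", "content": [{"text": enriched_prompt}]}],
        inferenceConfig={"temperature": 0.0},  # deterministic, grounded output
    )
    return resp["output"]["message"]["content"][0]["text"]

def extract_citations(answer: str) -> list[int]:
    """Collect the [n] chunk references from the answer for the citations list."""
    return sorted({int(n) for n in re.findall(r"\[(\d+)\]", answer)})
```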

Conclusion

This process has enabled me to curb hallucinations and generate factually correct information with citations (links to the documents referenced). Apart from the above approach, I have also experimented with using an LLM as a judge to evaluate the generated content against a gold dataset, as a measure to ensure the fairness of the generated content.


Opinions expressed by DZone contributors are their own.
