How to Implement Semantic Search Using OpenAI GPT-3

Semantic search is a mostly overlooked feature of OpenAI GPT-3. In this blog, we discuss how you can implement a semantic search for groups of documents using GPT-3.

By Mittal Patel · Updated Sep. 29, 21 · Tutorial

Generative Pre-trained Transformer 3 (GPT-3) is an autoregressive language model created by OpenAI for text generation. GPT-3 showed how much a large language model can do with text: question answering, summarization, semantic search, chatbots, and writing poetry or essays. Among these, we have already experimented with question answering, ad generation, sentence paraphrasing, and intent classification using GPT-3. Now let’s experiment with semantic search using the GPT-3 API endpoint provided by OpenAI.

OpenAI’s search API lets you run a semantic search over a group of documents. It scores each document by how semantically related it is to the query text and ranks the documents accordingly.

Because access is API-based, it is easy to use. We provide the text in the form of documents and then send a query. The API responds with the results matching the query, sorted by relevance score.

Below are the steps to use the OpenAI API for semantic search.

Installing OpenAI for Semantic Search

Here we are using Python for API calls. However, you can also make a cURL request.

Let’s create a virtual environment with the following commands:

Shell
 
virtualenv env_gpt --python=python3
source env_gpt/bin/activate


Next, install the OpenAI Python package to use its API and engines.

Shell
 
pip install openai


Semantic Search Using GPT-3

To perform a semantic search, we first need to upload our documents in JSONL format, where each line is a single JSON object. The following is a sample line:

JSON
 
{"text": "Hello OpenAI", "metadata": "sample data"}


Next, we will create a JSONL file for semantic search. Name it sample_search.jsonl and copy the following content into it:

JSON
 
{"text": "The rebuilding of economies after the COVID-19 crisis offers a unique opportunity to transform the global food system and make it resilient to future shocks, ensuring environmentally sustainable and healthy nutrition for all. To make this happen, United Nations agencies like the Food and Agriculture Organization, the United Nations Environment Program, the Intergovernmental Panel on Climate Change, the International Fund for Agricultural Development, and the World Food Program, collectively, suggest four broad shifts in the food system.", "metadata": "Economic reset"}
{"text": "In the past few weeks healthcare professionals have been fully focused on caring for enormous numbers of people infected with COVID-19. They did an amazing job. Not in the least because healthcare professionals and leaders have been using continuous improvement as part of their accreditation program for many years. It has become part of their DNA. This has enabled them to change many processes as needed during COVID-19, using a cross-functional problem solving approach in (very) rapid improvement cycles.", "metadata": "Supporting adaptive healthcare"}
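
If you would rather generate the file than type it by hand, a minimal sketch along these lines may help. It relies only on the standard json module; the write_jsonl helper and the shortened document texts are illustrative assumptions, not part of the OpenAI package.

```python
import json

def write_jsonl(records, path):
    """Write one JSON object per line, which is the layout a JSONL file uses."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            # json.dumps emits straight double quotes, so every line is valid JSON
            f.write(json.dumps(record, ensure_ascii=False) + "\n")

# Shortened placeholder documents, for illustration only
documents = [
    {"text": "Rebuilding economies after the COVID-19 crisis.", "metadata": "Economic reset"},
    {"text": "Healthcare professionals cared for many patients.", "metadata": "Supporting adaptive healthcare"},
]

write_jsonl(documents, "sample_search.jsonl")
```

Generating the file this way avoids the curly quotes that word processors and web pages tend to introduce, which would make the lines invalid JSON.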


Now it’s time to upload this JSONL file with your API key, setting purpose to search so the file can be used for semantic search. Create a file named upload_file.py, copy the code below into it, and fill in your OpenAI API key.

Python
 
import openai

openai.api_key = "YOUR-API-KEY"
# purpose="search" marks the uploaded file for use with semantic search
response = openai.File.create(file=open("sample_search.jsonl"), purpose="search")
print(response)
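
Before uploading, it can be worth checking that every line of the file parses as JSON and carries a text field. The sketch below is one way to do that with the standard library; validate_jsonl is a hypothetical helper written for this article, and the rule that text must be a string is an assumption based on the sample format shown earlier.

```python
import json

def validate_jsonl(lines):
    """Return (line_number, problem) tuples; an empty list means no problems were found."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            problems.append((number, "blank line"))
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((number, f"invalid JSON: {exc.msg}"))
            continue
        # Assumption: each document is an object with a string "text" field
        if not isinstance(record, dict) or not isinstance(record.get("text"), str):
            problems.append((number, 'missing string field "text"'))
    return problems

good = '{"text": "Hello OpenAI", "metadata": "sample data"}'
bad = '{“text”: “Hello OpenAI”}'  # curly quotes are not valid JSON
print(validate_jsonl([good, bad]))
```

To check a real file, pass its lines in, for example validate_jsonl(open("sample_search.jsonl", encoding="utf-8").read().splitlines()).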


When you run the upload_file.py file, you will get the response below:

[Image: response from upload_file.py]

Copy the id value from the response above; the search call needs it.

Now let’s test it. To try out GPT-3 semantic search, pass your query text in the query parameter and the file ID from the upload response in the file parameter.

Python
 
import openai
openai.api_key = "YOUR-API-KEY"
 
search_response = openai.Engine("davinci").search(
    search_model="davinci",
    query="healthcare",
    max_rerank=5,
    file="file-8ejPA5eM13J4J0dWy3bBbvTf",
    return_metadata=True
)
print(search_response)


Let’s go through the parameters passed to the search call.

  • search_model:
    • OpenAI’s API lets us use different engines, such as Davinci, Curie, Babbage, and Ada
    • Davinci is the most capable engine and also the most expensive
  • query:
    • The text to search for
  • max_rerank: 
    • The maximum number of documents that the search re-ranks and returns in the response
  • file:
    • The file ID we got when uploading the documents
  • return_metadata: 
    • Set to True to include each document’s metadata in the response

And the response will look like the below image:

[Image: JSON response]

The JSON response contains the text of each document matched against the query, and the score indicates how relevant that result is. In our test, we provided only one document. If we provide multiple documents, we get multiple results with different scores.
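
With several documents, it is convenient to order the results by score. The sketch below shows one way to do this on a hand-written stand-in for the response. The layout (a data list whose items carry document, score, and metadata fields) is an assumption about the response shape, the scores are invented for illustration, and rank_results is a hypothetical helper.

```python
def rank_results(search_response):
    """Return the search results ordered from most to least relevant."""
    results = search_response.get("data", [])
    return sorted(results, key=lambda item: item["score"], reverse=True)

# Hand-written stand-in for an API response; layout and scores are assumptions
sample_response = {
    "object": "list",
    "data": [
        {"document": 0, "score": 18.2, "metadata": "Economic reset"},
        {"document": 1, "score": 241.7, "metadata": "Supporting adaptive healthcare"},
    ],
}

for result in rank_results(sample_response):
    print(f'{result["score"]:8.1f}  {result["metadata"]}')
```

The same function should work on a real response as long as it behaves like a dictionary with the fields assumed here.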

As we can see, it is simple to perform a semantic search using GPT-3 for a given query. GPT-3’s results are quite amazing.

Limitation

The size of what we can upload is limited: each document must contain no more than 2,048 tokens, and we can upload a maximum of 200 documents.
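
A quick pre-check against these limits can be sketched as follows. The two limits come from the paragraph above; the token count is only a rough estimate based on the common rule of thumb of about four characters per token for English text, not the model's actual tokenizer, and check_limits is a hypothetical helper.

```python
MAX_TOKENS_PER_DOCUMENT = 2048
MAX_DOCUMENTS = 200
CHARS_PER_TOKEN = 4  # rough rule of thumb for English text, not an exact tokenizer

def estimate_tokens(text):
    """Estimate the token count by rounding up len(text) / CHARS_PER_TOKEN."""
    return -(-len(text) // CHARS_PER_TOKEN)

def check_limits(texts):
    """Return a list of human-readable issues; an empty list means nothing was flagged."""
    issues = []
    if len(texts) > MAX_DOCUMENTS:
        issues.append(f"{len(texts)} documents exceeds the limit of {MAX_DOCUMENTS}")
    for index, text in enumerate(texts):
        tokens = estimate_tokens(text)
        if tokens > MAX_TOKENS_PER_DOCUMENT:
            issues.append(f"document {index} is roughly {tokens} tokens")
    return issues

print(check_limits(["Hello OpenAI", "a" * 10000]))
```

Because the estimate is approximate, a document close to the limit may still be rejected or accepted by the API regardless of what this check reports.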

Do let us know in the comments if you have any queries regarding OpenAI semantic search.


Published at DZone with permission of Mittal Patel. See the original article here.
