How to Implement Semantic Search Using OpenAI GPT-3

Semantic search is an often overlooked feature of OpenAI GPT-3. In this post, we show how to implement semantic search over a group of documents using GPT-3.

By Mittal Patel · Updated Sep. 29, 21 · Tutorial

Generative Pre-trained Transformer 3 (GPT-3) is an autoregressive language model created by OpenAI for text generation. It showed how much a large language model can do with text: question answering, summarization, semantic search, chatbots, and writing poetry or essays. We have already experimented with question answering, ad generation, sentence paraphrasing, and intent classification using GPT-3. Now let’s experiment with semantic search using the GPT-3 API endpoint provided by OpenAI.

OpenAI’s search API lets you run a semantic search over a group of documents. Given a query, it assigns each document a score based on how semantically related it is to the query text and ranks the documents accordingly.

Because access is API-based, it is easy to use. We provide the text as documents and then send a query. The API responds with the results that match the query, sorted by relevance score.

Below are the steps to use the OpenAI API for semantic search.

Installing OpenAI for Semantic Search

Here we use Python for the API calls. However, you can also make the same requests with cURL.

Let’s create a virtual environment and activate it:

Shell
 
virtualenv env_gpt --python=python3
source env_gpt/bin/activate


Next, install the OpenAI Python package to use the API and its engines.

Shell
 
pip install openai


Semantic Search Using GPT-3

To perform a semantic search, we first need to upload our documents in the JSONL file format, where each line is a standalone JSON object. The following is a sample line.

JSON
 
{"text": "Hello OpenAI", "metadata": "sample data"}


Next, we will create the JSONL file for semantic search. Name it sample_search.jsonl and copy the following lines into it:

JSON
 
{"text": "The rebuilding of economies after the COVID-19 crisis offers a unique opportunity to transform the global food system and make it resilient to future shocks, ensuring environmentally sustainable and healthy nutrition for all. To make this happen, United Nations agencies like the Food and Agriculture Organization, the United Nations Environment Program, the Intergovernmental Panel on Climate Change, the International Fund for Agricultural Development, and the World Food Program, collectively, suggest four broad shifts in the food system.", "metadata": "Economic reset"}
{"text": "In the past few weeks healthcare professionals have been fully focussed caring for enormous numbers of people infected with COVID-19. They did an amazing job. Not in the least because healthcare professionals and leaders have been using continues improvement as part of their accreditation program for many years. It has become part of their DNA. This has enabled them to change many processes as needed during COVID-19, using a cross-functional problem solving approach in (very) rapid improvement cycles.", "metadata": "Supporting adaptive healthcare"}
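
Typing JSONL by hand makes it easy to end up with curly quotes or a stray line break, both of which make a line invalid JSON. As a minimal sketch, the file can instead be generated from Python dictionaries with the standard json module. The helper names write_jsonl and read_jsonl and the shortened sample texts are hypothetical, chosen only for illustration.

```python
import json
import os
import tempfile


def write_jsonl(path, documents):
    """Write one JSON object per line, the layout JSONL expects."""
    with open(path, "w", encoding="utf-8") as handle:
        for doc in documents:
            # json.dumps emits straight double quotes and escapes embedded
            # newlines, so each document stays on a single line.
            handle.write(json.dumps(doc, ensure_ascii=False) + "\n")


def read_jsonl(path):
    """Parse the file back; json.loads raises an error on a malformed line."""
    with open(path, encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


# Hypothetical shortened documents, for illustration only.
sample_documents = [
    {"text": "Rebuilding economies after COVID-19.", "metadata": "Economic reset"},
    {"text": "Healthcare teams adapted quickly.", "metadata": "Supporting adaptive healthcare"},
]

# Written to a temporary directory so the sketch does not touch real files.
demo_path = os.path.join(tempfile.mkdtemp(), "sample_search.jsonl")
write_jsonl(demo_path, sample_documents)
round_trip = read_jsonl(demo_path)
print(len(round_trip), "documents written")
```

Reading the file back before uploading is a cheap way to confirm that every line parses on its own.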


Now upload this JSONL file using your API key, setting purpose to search so the file can be used for semantic search. Create a file named upload_file.py, copy the code below into it, and provide your OpenAI API key.

Python
 
import openai

openai.api_key = "YOUR-API-KEY"

# Upload the JSONL file created above; purpose="search" marks it for semantic search
response = openai.File.create(file=open("sample_search.jsonl"), purpose="search")
print(response)


When you run upload_file.py, you will get a response like the one below:

[Image: upload_file.py response]

Copy the id from this response. It is the file ID needed for the search call in the next step.
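
Instead of copying the id by hand, it can be read from the response in code. The sketch below assumes the upload response behaves like a dictionary with an "id" key. The helper extract_file_id and the sample response body are hypothetical; only the id value is the one used later in this article.

```python
def extract_file_id(upload_response):
    """Return the file ID from an upload response, or raise if it is missing."""
    file_id = upload_response.get("id")
    if not file_id:
        raise ValueError("upload response has no 'id' field")
    return file_id


# Hypothetical response body; a real response may carry more fields.
sample_upload_response = {
    "id": "file-8ejPA5eM13J4J0dWy3bBbvTf",
    "object": "file",
    "filename": "sample_search.jsonl",
    "purpose": "search",
}

file_id = extract_file_id(sample_upload_response)
print(file_id)
```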

Now let’s test it. To try out GPT-3 semantic search, pass your query text in the query parameter and the file ID from the upload step in the file parameter.

Python
 
import openai
openai.api_key = "YOUR-API-KEY"
 
# Search the uploaded file; replace the file ID with the id returned by upload_file.py
search_response = openai.Engine("davinci").search(
    search_model="davinci",
    query="healthcare",
    max_rerank=5,
    file="file-8ejPA5eM13J4J0dWy3bBbvTf",
    return_metadata=True
)
print(search_response)


Let’s look at the parameters of openai.Engine.search.

  • search_model:
    • OpenAI’s API lets us use different engines, such as Davinci, Curie, Babbage, and Ada
    • Davinci is the most powerful engine and also the most expensive
  • query:
    • The text to search for
  • max_rerank: 
    • The maximum number of documents that semantic search re-ranks and returns in the response
  • file:
    • The file ID we received when uploading the documents
  • return_metadata: 
    • Set to True to include each document's metadata in the response
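
Since the same call can be made over plain HTTP (for example with cURL), here is a rough sketch of how these parameters might map onto a request, using only the standard library. The URL layout, the header names, and the helper build_search_request are assumptions made for illustration and are not taken from this article, so check them against the API reference before relying on them. The request is built but never sent.

```python
import json
import urllib.request


def build_search_request(api_key, engine, query, file_id,
                         max_rerank=5, return_metadata=True):
    """Build (but do not send) a POST request carrying the search parameters."""
    body = {
        "query": query,
        "file": file_id,
        "max_rerank": max_rerank,
        "return_metadata": return_metadata,
    }
    return urllib.request.Request(
        # Assumed endpoint layout: the engine name is part of the URL path.
        url=f"https://api.openai.com/v1/engines/{engine}/search",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_search_request(
    "YOUR-API-KEY", "davinci", "healthcare", "file-8ejPA5eM13J4J0dWy3bBbvTf"
)
print(request.full_url)
print(json.loads(request.data))
```

Keeping the parameters in one dictionary makes it easy to see which values change between queries (query) and which stay fixed for a given upload (file).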

The response will look like the image below:

[Image: JSON response]

The JSON response contains the text of the document that was matched against the query, and the score shows how relevant the result is. In our test, we provided only one document. If we provide multiple documents, we get multiple results with different scores.
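
To work with more than one result, the scores can be sorted in code. The following is a minimal sketch that assumes the response is dictionary-like with a "data" list whose items carry "score", "text", and "metadata" keys. The helper top_results and every value in the sample response are made up for illustration and are not real API output.

```python
def top_results(search_response, limit=3):
    """Return (score, text, metadata) tuples, highest score first."""
    ranked = sorted(
        search_response["data"],
        key=lambda item: item["score"],
        reverse=True,
    )
    return [
        (item["score"], item.get("text"), item.get("metadata"))
        for item in ranked[:limit]
    ]


# Hypothetical response; scores and texts are invented for this example.
sample_search_response = {
    "object": "list",
    "data": [
        {"document": 0, "score": 12.5, "text": "Economy text",
         "metadata": "Economic reset"},
        {"document": 1, "score": 140.2, "text": "Healthcare text",
         "metadata": "Supporting adaptive healthcare"},
    ],
}

for score, text, metadata in top_results(sample_search_response):
    print(f"{score:8.1f}  {metadata}: {text}")
```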

As we can see, performing a semantic search with GPT-3 for a given query is simple: one call uploads the documents and a second call queries them. In our test, the results were impressive.

Limitation

The size of what we can upload is limited: each document must contain no more than 2,048 tokens, and we can upload a maximum of 200 documents.
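
These limits can be checked before uploading. The sketch below uses the two numbers quoted above and a deliberately crude token estimate (whitespace-separated words), which is an assumption: the real tokenizer splits text differently and usually produces more tokens than words, so treat a pass here as a rough signal only. The helper names are hypothetical.

```python
MAX_DOCUMENTS = 200        # limit quoted in this article
MAX_TOKENS_PER_DOC = 2048  # limit quoted in this article


def rough_token_count(text):
    """Crude estimate: count whitespace-separated words, not real tokens."""
    return len(text.split())


def find_limit_problems(documents):
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    if len(documents) > MAX_DOCUMENTS:
        problems.append(
            f"{len(documents)} documents exceeds the limit of {MAX_DOCUMENTS}"
        )
    for index, doc in enumerate(documents):
        tokens = rough_token_count(doc["text"])
        if tokens > MAX_TOKENS_PER_DOC:
            problems.append(
                f"document {index} has about {tokens} tokens "
                f"(limit {MAX_TOKENS_PER_DOC})"
            )
    return problems


short_docs = [{"text": "Hello OpenAI", "metadata": "sample data"}]
long_docs = [{"text": "word " * 3000, "metadata": "too long"}]
print(find_limit_problems(short_docs))
print(find_limit_problems(long_docs))
```

A document that fails the check can be split into smaller passages, each uploaded as its own JSONL line.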

Do let us know in the comments if you have any queries regarding OpenAI semantic search.

Published at DZone with permission of Mittal Patel. See the original article here.

Opinions expressed by DZone contributors are their own.
