An In-Depth Guide to Threads in OpenAI Assistants API

Learn how to use OpenAI's Assistants API to manage threads and messages — create, list, retrieve, modify, delete — plus handle files, metadata, and more.

By Mohammed Talib · Feb. 10, 25 · Tutorial

In this blog, we will explore what chat completion models can and cannot do, and then see how the Assistants API addresses those limitations.

We will also focus on threads and messages: how to create, list, retrieve, modify, and delete them. Additionally, we will include Python code snippets and describe the outputs you can expect.

Limitations of Chat Completion Models

No Memory

Chat completion models do not have a memory concept. For example, if you ask: “What’s the capital of Japan?”

The model might say: “The capital of Japan is Tokyo.”

But when you ask again: “Tell me something about that city.”

It often responds with: “I’m sorry but you didn’t specify which city you are referring to.”

It does not understand what was discussed previously. That’s the main issue: there is no memory concept in chat completions.

Poor at Computational Tasks

Chat completion models are unreliable at direct computational tasks. For instance, if you ask one to reverse the string “openaichatgpt,” it may generate wrong output, such as inserting extra characters or dropping letters.
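A quick way to see why this matters: the same operation is trivial and deterministic in plain Python, which is exactly the kind of code the Assistants API's code interpreter can run instead of guessing token by token.

```python
# Reversing a string is a one-liner in Python, but error-prone for a
# model that predicts tokens rather than executing code.
text = "openaichatgpt"
reversed_text = text[::-1]
print(reversed_text)  # tpgtahcianepo
```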

No Direct File Handling

In chat completions, there is no way to process text files or Word documents directly. You have to convert those files to text, do chunking (divide documents into smaller chunks), create embeddings, and do vector searches yourself. Only then do you pass some relevant text chunks to the model as context.
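As a rough sketch of that manual pipeline (the splitter below is naive and hypothetical; real systems usually split on tokens or sentences and store embeddings in a vector database), the chunking step might look like:

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping character chunks (a naive sketch)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap  # step forward, keeping some overlap
    return chunks

doc = "some extracted document text " * 200  # stand-in for a converted file
pieces = chunk_text(doc)
print(len(pieces), len(pieces[0]))  # 13 500
```

Each chunk would then be embedded and indexed so that, at question time, only the most relevant chunks are passed to the model as context.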

Synchronous Only

Chat completion models are not asynchronous. You must ask a question and wait for it to finish. You cannot do something else while it’s processing without extra workarounds.

Capabilities of the Assistants API

Context Support With Threads

In Assistants API, you can create a thread for each user. A thread is like a chat container where you can add many messages. It persists the conversation, so when the user logs in again, you can pass the same thread ID to retrieve what was discussed previously. This is very helpful.

Code Interpreter

There is also a code interpreter. Whenever you ask for some computational task, it runs Python code. It then uses that answer to expand or explain. This makes it very helpful for reversing strings, finding dates, or any Python-based operations.

Retrieval With Files

The Assistants API has retrieval support, letting you upload files and ask questions based on those files. The system handles the vector search process and then uses relevant chunks as context. You can upload up to 20 files in Assistants as context. This is very helpful for referencing company documents, reports, or data sets.

Function Calling

Function calling allows the model to tell you what function to call and what arguments to pass, so that you can get external data (like weather or sales from your own database). It does not call the function automatically; it indicates which function to call and with what parameters, and then you handle that externally.
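That division of labor can be sketched with local data. Here, tool_call is a hypothetical payload shaped like a model's function-call request (the model returns arguments as a JSON string), and get_weather is your own stubbed function:

```python
import json

def get_weather(city: str) -> dict:
    """Your own function; the model never calls this directly."""
    return {"city": city, "forecast": "sunny"}  # stubbed external data

AVAILABLE = {"get_weather": get_weather}

# Hypothetical payload shaped like a model's function-call request.
tool_call = {"name": "get_weather", "arguments": json.dumps({"city": "Tokyo"})}

# Your code dispatches the call and gets the real data.
fn = AVAILABLE[tool_call["name"]]
result = fn(**json.loads(tool_call["arguments"]))
print(result)  # {'city': 'Tokyo', 'forecast': 'sunny'}
```

You would then send the result back to the model so it can compose its final answer.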

Asynchronous Workflows

The Assistants API is asynchronous. You can run a request, and you don’t have to wait for it immediately. You can keep checking if it’s done after a few seconds. This is very helpful if you have multiple tasks or want to do other things in parallel.
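The polling pattern can be sketched without the API itself. Below, check_status is a hypothetical stand-in for whatever retrieves a run's status (a real integration would call the runs retrieval endpoint instead):

```python
import time

def wait_for_completion(check_status, interval=0.01, timeout=5.0):
    """Poll check_status() until it returns a terminal state."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = check_status()
        if status in ("completed", "failed", "cancelled", "expired"):
            return status
        time.sleep(interval)  # you could do other work here instead
    raise TimeoutError("run did not finish in time")

# Simulated run: queued, then in progress, then done.
statuses = iter(["queued", "in_progress", "completed"])
print(wait_for_completion(lambda: next(statuses)))  # completed
```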

Focusing on Threads and Messages

A thread is essentially a container that holds all messages in a conversation. OpenAI recommends creating one thread per user as soon as they start using your product. This thread can store any number of messages, so you do not have to manually manage the context window.

  • Unlimited messages. You can add as many user queries and assistant responses as you want.
  • Automatic context handling. The system uses truncation if the conversation grows beyond token limits.
  • Metadata storage. You can store additional data in the thread’s metadata (for example, user feedback or premium status).

Below are code snippets to demonstrate how to create, retrieve, modify, and delete threads.

1. Creating an Assistant

First, upload any file the assistant should reference, then create the assistant with instructions and tools. For example:

Python
 
from openai import OpenAI
client = OpenAI()

file_input = client.files.create(file=open("Location/to/the/path", "rb"), purpose="assistants")

file_input.model_dump()


Python
 
assistant = client.beta.assistants.create(
    name="data_science_tutor",
    instructions="This assistant is a data science tutor.",
    tools=[{"type": "code_interpreter"}, {"type": "retrieval"}],
    model="gpt-4-1106-preview",
    file_ids=[file_input.id]
)
print(assistant.model_dump())


2. Creating Threads

A thread is like a container that holds the conversation. We can create one thread per user.

Python
 
thread = client.beta.threads.create()
print(thread.model_dump())

The returned thread object contains:

  • id – a unique identifier that starts with thread_
  • object – always "thread"
  • metadata – an empty dictionary by default

Why Create Separate Threads?

OpenAI recommends creating one thread per user as soon as they start using your product. This structure ensures that the conversation context remains isolated for each user.

3. Retrieving a Thread

Python
 
retrieved_thread = client.beta.threads.retrieve(thread_id=thread.id)
print(retrieved_thread.model_dump())


This returns a JSON object similar to what you get when you create a thread, including the id, object, and metadata fields.

4. Modifying a Thread

You can update the thread’s metadata to store important flags or notes relevant to your application. For instance, you might track if the user is premium or if the conversation has been reviewed by a manager.

Python
 
updated_thread = client.beta.threads.update(
    thread_id=thread.id,
    metadata={"modified_today": True, "user_is_premium": True}
)
print(updated_thread.model_dump())

The updated metadata now contains:

  • modified_today – a custom Boolean noting whether you changed the thread today
  • user_is_premium – a Boolean flag for the user’s account tier

Further Metadata Examples

  • {"conversation_topic": "data science"} – a string labeling the thread’s main subject
  • {"language_preference": "English"} – if the user prefers answers in English or another language
  • {"escalated": true} – if the thread needs special attention from a support team
  • {"feedback_rating": 4.5} – if you collect a rating for the conversation

5. Deleting a Thread

When you no longer need a thread, or if a user deletes their account, you can remove the entire conversation container:

Python
 
delete_response = client.beta.threads.delete(thread_id=thread.id)
print(delete_response.model_dump())


Once deleted, you can no longer retrieve this thread or any messages it contained.

Working With Messages

Previously, we focused on threads — the containers that hold conversations in the Assistants API. Now, let’s explore messages, which are the individual pieces of content (questions, responses, or system notes) you add to a thread. We’ll walk through creating messages, attaching files, listing and retrieving messages, and updating message metadata. We’ll also show Python code snippets illustrating these steps.

Messages and Their Role in Threads

What Are Messages?

Messages are mostly text (like user queries or assistant answers), but they can also include file references. Each thread can have many messages, and every message is stored with an ID, a role (for example, "user" or "assistant"), optional file attachments, and other metadata.

Opposite Index Order

Unlike chat completions, where the first message in the list is the earliest, here the first message in the array is the most recent: index 0 corresponds to the newest message in the thread.
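Assuming a response shaped like the API's (a hypothetical list, newest first), restoring chronological order for display is a simple reversal:

```python
# Hypothetical message list as returned by the API: newest first.
data = [
    {"id": "msg_3", "content": "Third"},
    {"id": "msg_2", "content": "Second"},
    {"id": "msg_1", "content": "First"},
]

newest = data[0]                       # index 0 is the most recent message
chronological = list(reversed(data))   # oldest -> newest, for display
print(newest["id"], chronological[0]["id"])  # msg_3 msg_1
```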

Annotations and File Attachments

Messages can include annotations, for instance, if a retrieval step references certain files. When using a code interpreter, any new files generated may also appear as part of the message annotations.

Create a Message in a Thread

Messages are added to a thread. Each message can be a user message or an assistant message. Messages can also contain file references.

Before adding messages, we need a thread. If you do not already have one:

Python
 
# Create a new thread
new_thread = client.beta.threads.create()
print(new_thread.model_dump())  # Shows the thread's details


Python
 
# Create a new message in the thread
message = client.beta.threads.messages.create(
    thread_id=thread.id, 
    role="user",
    content="ELI5: What is a neural network?",
    file_ids=[file_input.id]  # Passing one or more file IDs
)
print(message.model_dump())


Here, you can see:

  • Message ID – a unique identifier starting with msg_
  • Role – user, indicating this is a user input
  • File attachments – the file_ids list includes any referenced files
  • Annotations – empty at creation, but can include details like file citations if retrieval is involved
  • Metadata – a placeholder for storing additional key-value pairs

List Messages in a Thread

To list messages in a thread, use the list method. The limit parameter determines how many recent messages to retrieve.

Now, let’s list the messages. The most recent messages appear first; if we have added just one message, the output will contain only that message:

Python
 
messages_list = client.beta.threads.messages.list(
    thread_id=thread.id, 
    limit=5
)
for msg in messages_list.data:
    print(msg.id, msg.content)


If there are multiple messages, the response works like a linked list:

  • first_id points to the newest message.
  • last_id points to the earliest message.
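When a thread holds more messages than one call returns, the list endpoint pages through them with a cursor. The idea can be sketched locally; fetch_page below is a hypothetical stand-in for the list method with its limit and after parameters:

```python
# Simulated message IDs, newest to oldest, standing in for API data.
ALL_IDS = [f"msg_{i}" for i in range(12)]

def fetch_page(after=None, limit=5):
    """Return one page of IDs plus a cursor, mimicking `limit`/`after`."""
    start = ALL_IDS.index(after) + 1 if after else 0
    page = ALL_IDS[start:start + limit]
    next_cursor = page[-1] if len(page) == limit else None
    return page, next_cursor

collected, cursor = [], None
while True:
    page, cursor = fetch_page(after=cursor)
    collected.extend(page)
    if cursor is None:
        break
print(len(collected))  # 12
```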

Retrieve a Specific Message

Python
 
retrieved_msg = client.beta.threads.messages.retrieve(
    thread_id=new_thread.id,
    message_id=message.id
)
print(retrieved_msg.model_dump())


Retrieve a Message File

Now, let’s retrieve the files attached to a message. The response provides each file’s metadata, including its creation timestamp:

Python
 
files_in_msg = client.beta.threads.messages.files.list(
    thread_id=new_thread.id,
    message_id=message.id
)
print(files_in_msg.model_dump())


Modify a Message

Python
 
updated_msg = client.beta.threads.messages.update(
    thread_id=new_thread.id,
    message_id=message.id,
    metadata={"added_note": "Revised content"}
)
print(updated_msg.model_dump())


Delete a Message

Python
 
deleted_msg = client.beta.threads.messages.delete(
    thread_id=new_thread.id,
    message_id=message.id
)
print(deleted_msg.model_dump())


We have seen that chat completion models have no memory, are poor at direct computational tasks, cannot process files directly, and are synchronous only. The Assistants API, by contrast, offers context support through threads, a code interpreter for computational tasks, retrieval over uploaded files, function calling for external data, and asynchronous execution.

In this blog, we focused on how to create, list, retrieve, modify, and delete threads and messages. We also saw how to handle file references within messages. In the next session, we will learn more about runs, which connect threads and assistants to get actual outputs from the model.

I hope this is helpful. Thank you for reading!

Let’s connect on LinkedIn!

Further Reading

  • Where did multi-agent systems come from?
  • Summarising Large Documents with GPT-4o
  • How does LlamaIndex compare to LangChain in terms of ease of use for beginners
  • Pre-training vs. Fine-tuning [With code implementation]
  • Costs of Hosting Open Source LLMs vs Closed Sourced (OpenAI)
  • Embeddings: The Back Bone of LLMs
  • How to Use a Fine-Tuned Language Model for Summarization

Published at DZone with permission of Mohammed Talib. See the original article here.

Opinions expressed by DZone contributors are their own.
