Persistent Memory for AI Agents Using LangChain's Deep Agents

Most AI agents forget you when the session ends. deepagents fixes that — persistent per-user memory across sessions, no vector database.

Ninaad Rao

Jun. 04, 26 · Tutorial

Likes (4)

Comment

Save

3.4K Views

AI agents have a memory problem. Not the kind that we all hear daily — hallucination, wrong answers, but a much quieter and fundamental problem. When you start a new conversation with the agent, it forgets who you are. It doesn't know what you have already worked on, what you have clarified multiple times across sessions, or what is common across all the sessions. You start from scratch every single time.

While this does sound good in a way, in case you weren't getting what you wanted out of the agent, it does pose some challenges. LLMs are capable of maintaining a rich context of a conversation. The problem is more architectural: most of the agents designed the scope to include all state files, memory, and history into a single thread. When that thread ends, so does the state. This results in an intelligent agent but amnesiac across sessions.

LangChain's deepagents have a solution with three components that work together:

StoreBackend – stores files outside the conversation thread in LangGraph's BaseStore
CompositeBackend – routes specific file paths to persistent storage while also keeping everything else ephemeral
MemoryMiddleware – loads memory into the agent's context automatically before any run.

By the end of this article, you will learn how to create a working personal assistant remembering your preferences, provides feedback across sessions, has per-user isolation, and a clear path from local SQLite to production Postgres.

Why Persistent Memory Matters for AI Agents

Consider a case of a customer support agent. A customer chats with the agent, and just like how they converse with a normal human, they try to bring up something that they brought up during the last conversation, but the agent has no idea about this. This creates friction and a poor user experience. There are other such scenarios, like a coding assistant that does not remember your team's conventions and coding patterns and gives generic answers, or a personal assistant that asks for your timezone every time the agent is asked to schedule a meeting.

LangChain's deepagents approach is notable because it doesn't require a vector database, an embeddings pipeline, or any kind of retrieval step at query time. Memory is a pure file. Loading memory means reading a file. Agent updates it the same way it edits any file, just like a human. The complexity comes in the routing and persistence layer, which CompositeBackend and Storebackend handle independently of the agentic loop.

The Problem

Conversations are stateless by default. In deep agents, every file that the agent reads or writes goes through a backend. The default is StateBackend. This stores files inside the LangGraph conversation state, which is scoped to a thread_id.

Starting a new conversation? new thread_id. New state. Files gone. The fix requires separating two distinct storage concerns:

Working files, scratch notes -> scope is usually session -> this shouldn't be in the memory.
User profile, preferences -> this is scoped at the user level -> this should survive in the memory.

Deepagents handles this with three cooperating primitives: StoreBackend, CompositeBackend, and MemoryMiddleware, and there are two storage primitives -> conversation thread, which is scoped to thread_id, and BaseStore, which is a key-value store that exists independently of threads.

StateBackend reads and writes from the conversation state. StoreBackend reads and writes from BaseStore. The key difference is where the agent reads from.

Setting Up the Persistent Memory Assistant

Installation

uv add deepagents langchain-anthropic langgraph

Backend Wiring

    Python
   
 

   from deepagents.backends.composite import CompositeBackend                                                                                                                                                                                
from deepagents.backends.state import StateBackend
from deepagents.backends.store import StoreBackend
from langgraph.store.memory import InMemoryStore                                                                                                                                                                                          

store = InMemoryStore()                                                                                                                                                                                                                   

store_backend = StoreBackend(                                                                                                                                                                                                             
  store=store,
  namespace=lambda rt: (f"user:{user_id}", "memories"),
)                                                                                                                                                                                                                                         

backend = CompositeBackend(                                                                                                                                                                                                               
  default=StateBackend(),
  routes={"/memories/": store_backend},                                                                                                                                                                                                 
)           
  

The namespace lambda is what can isolate users. Consider a case where there are two users: Alice and Bob. Alice's memory lives at ("user:alice", "memories") and Bob's at ("user:bob", "memories").

Agent Creation

    Python
   
 

   from deepagents import create_deep_agent
from langchain_anthropic import ChatAnthropic                                                                                                                                                                                             
from langgraph.checkpoint.memory import InMemorySaver

agent = create_deep_agent(
  model=ChatAnthropic(model="claude-sonnet-4-6"),                                                                                                                                                                                       
  system_prompt=SYSTEM_PROMPT,                                                                                                                                                                                                          
  memory=["/memories/profile.md"],
  backend=backend,                                                                                                                                                                                                                      
  checkpointer=InMemorySaver(),
)                  
  

The memory parameter is all MemoryMiddleware needs. It reads that path along with the configured backend. At the start of the session, the content is cached in state and is then injected into the system prompt before model calls within the session. If the file does not exist, then it injects "(no memory loaded)" so the agent knows to create a new one.

Architecture

The System Prompt Contract

The agent needs to know when to update the memory and how to update this memory. The system prompt decides this contract:

    Python
   
 

   SYSTEM_PROMPT = """You are a personal assistant with persistent memory.
                                                                                                                                                                                                                                            
  Your persistent memory file lives at /memories/profile.md and survives
  across all conversations.                                                                                                                                                                                                                 
                  
  When to update memory:                                                                                                                                                                                                                    
  - User shares name, role, or background
  - User mentions ongoing projects or goals                                                                                                                                                                                                 
  - User states a preference (language, tools, response format)
  - User corrects you or gives explicit feedback                                                                                                                                                                                            
                                                                                                                                                                                                                                            
  How to update:                                                                                                                                                                                                                            
  - First conversation: write_file to create /memories/profile.md                                                                                                                                                                           
  - Later conversations: edit_file to update it                                                                                                                                                                                             
                                                                                                                                                                                                                                            
  Keep the file concise — bullet points, not prose.                                                                                                                                                                                         
  Never store credentials.                                                                                                                                                                                                                  
  """      
  

MemoryMiddleware also appends its own guidelines, which include heuristics for what not to save.

Multi-User Isolation

Now you might be wondering. Having this agent sounds amazing, but how to scale it for multiple people? Do we need to create separate instances for each user? The answer is no!. The namespace lambda is the only thing that separates users:

namespace=lambda rt: (f"user:{user_id}", "memories")

In the CLI, user_id is a flag. In LangGraph deployment, this can be derived from the request context.

namespace=lambda rt: (rt.server_info.user.identity, "memories")

Different Storing Backends

In this example, I experimented with in-memory store, SQLite, and PostgreSQL.

    Python
   
 

   #In-memory (demos): 
from langgraph.store.memory import InMemoryStore 
store = InMemoryStore() 
#Resets when the process exits. Good for demo runs.

#SQLite (local development, survives restarts): 
import sqlite3
from langgraph.store.sqlite import SqliteStore 

conn = sqlite3.connect("assistant_memory.db", isolation_level=None) 
store = SqliteStore(conn) 
store.setup() 
#Note: isolation_level=None (autocommit) is required by SqliteStore. 

#PostgreSQL (production, multi-instance): 
import os
from langgraph.store.postgres import PostgresStore 

with PostgresStore.from_conn_string(os.environ["DATABASE_URL"]) as store:
	store.setup() 
#Set DATABASE_URL to a standard Postgres connection string.

  

Advantages

LangChain's deepagents framework provides several advantages, such as:

Cross-session continuity – memory injected into the system prompt directly - no search, no embedding lookup, no extra latency.
Per-user isolation – easier namespacing using StoreBackend.
Explicit, inspectable memory – it's a plain markdown file. You can read it, edit it, and audit it without any special tooling.
Adaptable with existing middleware – MemoryMiddleware is part of the middleware stack along with permission checks and logging. Adding persistent memory is additive and not a total rewrite.

Disadvantages

While there are several advantages to using LangChain's deepagents, it does come with some limitations:

Context window consumption – Since the memory files are injected into the system prompt every time, it could become really large, and it could exceed the context budget. The system prompt needs to be clear and concise on what to save and what not to save.
Agent manages its own memory – A poorly prompted agent may over-save, under-save, or save the wrong things. The system prompt contract is very important.
Not suitable for large-scale memory – For a compact user-profile, this sounds perfect — a few hundred words. But applications that need to remember several past interactions, a RAG-based approach with a vector store makes much more sense. It doesn't scale to large memory corpora.

Extending the Pattern

Multiple memory files — separate concerns:

    Python
   
 

   memory=[
      "/memories/profile.md",      # identity and background                                                                                                                                                                                
      "/memories/projects.md",     # active work                                                                                                                                                                                            
      "/memories/preferences.md",  # style and tool preferences                                                                                                                                                                             
]   
  

Write-scoped permissions — prevent the agent from writing outside /memories/:

    Python
   
   from deepagents import FilesystemPermission                                                                                                                                                                                               

permissions=[
	FilesystemPermission(operations=["write"], paths=["/memories/**"]),                                                                                                                                                                   
    FilesystemPermission(operations=["write"], paths=["/**"], mode="deny"),                                                                                                                                                               
]

Shared team context alongside per-user memory:

    Python
   
 

    backend = CompositeBackend(                                                                                                                                                                                                               
   default=StateBackend(),
   routes={
     "/memories/": StoreBackend(store=store, namespace=lambda rt: (f"user:{user_id}", "memories")),
     "/shared/":   StoreBackend(store=store, namespace=("team:engineering", "shared")),                                                                                                                                                
   },                                                                                                                                                                                                                                    
 )     
  

Running the Example

    Shell
   
 

   git clone -b feat/permissions-execute-task https://github.com/NinaadRao/deepagents 
cd examples/persistent-memory-assistant 
uv venv && source .venv/bin/activate 
uv pip install -e . 
export ANTHROPIC_API_KEY=your_key 

# Built-in two-session demo 
python assistant.py --demo 

# Interactive with SQLite persistence 
python assistant.py --store sqlite --user alice "I prefer Python and FastAPI"
python assistant.py --store sqlite --user alice "What do you know about me?" 

# Different user — isolated memory 
python assistant.py --store sqlite --user bob "I build data pipelines in Spark" 
python assistant.py --store sqlite --user alice "What do you know about me?" # Alice only 
  

Conclusion

Most agent memory problems trace back to two things: conversation and the user's context. Keeping them separate in the storage layer and not the application code is what makes the solution clean. The three-component design in deep agents, i.e., StoreBackend + CompositeBackend + MemoryMiddleware, handles this without coupling any layer to the others. You can change the model, store, or routing rules independently of each other, which makes it a good use case for abstraction.

AI Memory (storage engine)

Opinions expressed by DZone contributors are their own.

Related

Trending