How to Build a Self-Evolving AI Agent That Learns From Failure

This guide demonstrates how to transform brittle AI agents into resilient systems that reflect on failures and retain learnings to avoid repeating errors.

By Apratim Mukherjee · Rudrendu Paul · Sourav Nandy · Dec. 30, 25 · Analysis

For developers building autonomous systems, today's generative AI agents present a fundamental challenge: they are amnesiacs. An agent can execute a complex task, fail, and then repeat the same mistake five minutes later. Their capabilities are "test-time static," meaning they are frozen at the moment their training ends. They cannot learn from their interactions, so they discard valuable insights and cannot correct their own errors.

For teams that need reliable autonomous systems, this is the primary barrier to adoption. An unreliable agent is not autonomous. It is a brittle system that creates technical debt.

This is why the true frontier of AI is not just about building larger models but about creating agents that can learn. This article will show you how to build a simple self-evolving agent in Python. We will ditch the academic theory and write code for an agent that learns from its failures using a persistent, structured memory we call a "ReasoningBank." 

Think of ReasoningBank as retrieval-augmented generation (RAG) for strategy rather than data. Instead of fetching static documents, the agent retrieves high-level reasoning patterns distilled from its past successes and failures. This allows the agent to consult a dynamic playbook of proven strategies at inference time, which it continuously updates after every interaction.

The Problem: Static, Brittle Agents

Today's LLM agents combine a core model with planning modules and tools, making them vulnerable to error propagation. A single root-cause error, such as misusing a tool or calling an unreliable API, can cascade through all subsequent steps and ultimately lead to task failure.

This makes them unsuitable for the "long-horizon" technical challenges that define real-world value, such as managing a software project, conducting complex data analysis, or automating multi-step DevOps workflows.

A Practical Solution: Building a "ReasoningBank" in Python

Let's build an agent that stops making the same mistake twice. The core mechanism is a persistent, structured memory that replaces amnesia with experience. Instead of treating every task as if it were its first, the agent runs a continuous "Plan-Execute-Reflect-Memorize" loop.
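
Before the concrete steps, the loop itself can be sketched as one small function. This is only an illustrative skeleton, not the agent built below: run_with_memory, its parameters, and the toy callables are hypothetical names chosen for this sketch.

```python
from typing import Any, Callable, Dict, List, Optional

Task = Dict[str, Any]
Lesson = Dict[str, Any]


def run_with_memory(
    task: Task,
    memory: List[Lesson],
    plan: Callable[[Task, List[Lesson]], Optional[Task]],
    execute: Callable[[Task], Any],
    reflect: Callable[[Task, Exception], Lesson],
) -> Optional[Any]:
    """One pass of a Plan-Execute-Reflect-Memorize loop (illustrative only)."""
    # PLAN: past lessons may adjust the task, or veto it by returning None.
    planned = plan(task, memory)
    if planned is None:
        return None
    try:
        # EXECUTE: run the (possibly adjusted) task.
        return execute(planned)
    except Exception as error:
        # REFLECT + MEMORIZE: turn the failure into a lesson and keep it.
        memory.append(reflect(planned, error))
        return None


# Toy callables, invented for this sketch.
def toy_plan(task: Task, memory: List[Lesson]) -> Optional[Task]:
    # Skip any task that has already failed once.
    if any(lesson["task"] == task for lesson in memory):
        return None
    return task


def toy_execute(task: Task) -> str:
    if task.get("fail"):
        raise RuntimeError("simulated failure")
    return "ok"


def toy_reflect(task: Task, error: Exception) -> Lesson:
    return {"task": task, "error": str(error)}


memory: List[Lesson] = []
first = run_with_memory({"fail": True}, memory, toy_plan, toy_execute, toy_reflect)
second = run_with_memory({"fail": True}, memory, toy_plan, toy_execute, toy_reflect)
third = run_with_memory({"fail": False}, memory, toy_plan, toy_execute, toy_reflect)
print(first, second, third, len(memory))
```

The first call fails and records a lesson, the second call is vetoed by the planner, and the third runs normally. The rest of the article fills in each callable with something more realistic.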

Step 1: Define a Relatable Task

Before we build the brain, let's define the work. We need a task that simulates a real-world scenario, like fetching data from a project management API.

In a perfect world, this function always works. In our simulation, it is "flaky." It fails under specific conditions (an expired key or a bad host), which forces our agent to adapt.

Python
 
import json
import os
from typing import List, Dict, Any, Optional

# --- Our Simulated "Flaky Tool" ---
def call_external_api(project_id: str, api_key: str):
    """
    Simulates a tool call that usually works but fails under specific conditions.
    Target: Retrieve project data.
    """
    # Specific failure 1: Expired Key
    if project_id == "project-123" and api_key == "old-key-xyz":
        raise ValueError("APIError: Invalid API Key. Key 'old-key-xyz' is expired.")
    
    # Specific failure 2: Bad Host
    if project_id == "project-789":
        raise ConnectionError("NetworkError: Cannot connect to host for project-789.")
    
    # Success path
    print(f"Successfully called API for {project_id} with {api_key}")
    return f"Success: Data for {project_id}"


Step 2: The Memory (ReasoningBank)

Now we need a place to store lessons. This simple class saves insights to a local JSON file so the agent remembers them even after the script restarts.

Python
 
REASONING_BANK_FILE = "reasoning_bank.json"

class ReasoningBank:
    """A simple JSON-based persistent memory for our agent."""
    
    def __init__(self):
        self.memory: List[Dict[str, Any]] = []
        self.load_memory()

    def load_memory(self):
        """Loads lessons from the JSON file."""
        if os.path.exists(REASONING_BANK_FILE):
            try:
                with open(REASONING_BANK_FILE, 'r') as f:
                    self.memory = json.load(f)
                print(f"Loaded {len(self.memory)} lessons from {REASONING_BANK_FILE}")
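            except json.JSONDecodeError:
                # A corrupt file should not crash the agent; start with an empty memory
                print(f"Could not parse {REASONING_BANK_FILE}, starting fresh.")
                self.memory = []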
        else:
            print("No reasoning bank found, starting fresh.")

    def save_memory(self):
        """Saves all lessons back to the JSON file."""
        with open(REASONING_BANK_FILE, 'w') as f:
            json.dump(self.memory, f, indent=2)

    def add_lesson(self, lesson: Dict[str, Any]):
        """Adds a new, structured lesson to the memory."""
        self.memory.append(lesson)
        self.save_memory()

    def find_relevant_lessons(self, task_description: Dict[str, Any]) -> List[Dict[str, Any]]:
        """
        Finds lessons that match the current task.
        A real implementation might use vector search here.
        """
        relevant_lessons = []
        current_project = task_description.get('project_id')
        if not current_project:
            return []
            
        for lesson in self.memory:
            # We match lessons specifically to the project ID for this demo
            if lesson.get("context", {}).get("project_id") == current_project:
                relevant_lessons.append(lesson)
        return relevant_lessons
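
The docstring above notes that a real implementation might use vector search. As a rough stand-in that needs no external libraries, the sketch below ranks lessons by bag-of-words cosine similarity between a free-text query and each lesson's "error" and "root_cause" fields. It is not embedding-based search, and find_similar_lessons, top_k, and min_score are hypothetical names that do not appear in the ReasoningBank class.

```python
import math
import re
from collections import Counter
from typing import Any, Dict, List


def _bag_of_words(text: str) -> Counter:
    """Lowercase the text and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_similar_lessons(
    query: str,
    lessons: List[Dict[str, Any]],
    top_k: int = 3,
    min_score: float = 0.1,
) -> List[Dict[str, Any]]:
    """Return up to top_k lessons whose text resembles the query, best first."""
    query_vec = _bag_of_words(query)
    scored = []
    for lesson in lessons:
        text = f"{lesson.get('error', '')} {lesson.get('root_cause', '')}"
        score = _cosine(query_vec, _bag_of_words(text))
        if score >= min_score:
            scored.append((score, lesson))
    # Sort on the score only, so the lesson dicts are never compared.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [lesson for _, lesson in scored[:top_k]]


lessons = [
    {
        "error": "APIError: Invalid API Key. Key 'old-key-xyz' is expired.",
        "root_cause": "The API key used for this project is expired.",
    },
    {
        "error": "NetworkError: Cannot connect to host for project-789.",
        "root_cause": "The project's host is unreachable.",
    },
]
matches = find_similar_lessons("invalid api key expired", lessons)
print(len(matches), matches[0]["root_cause"])
```

Matching on error text rather than on an exact project ID would let a lesson learned on one project apply to another, at the cost of occasional false matches. The min_score threshold is one simple guard against those.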


Step 3: The Brain (Reflection)

This function is the most critical component. It translates raw error text into actionable strategies. It is the difference between logging an error and understanding it.

Python
 
def reflect_on_failure(task: Dict[str, Any], error: Exception) -> Dict[str, Any]:
    """
    Analyzes the error to create a generalizable, actionable lesson.
    """
    print("\n--- REFLECTION ---")
    error_str = str(error)
    lesson = {
        "type": "FAILURE_AVOIDANCE",
        "context": task,
        "error": error_str,
    }

    # Root Cause Analysis Logic
    if "Invalid API Key" in error_str and "expired" in error_str:
        lesson["root_cause"] = "The API key used for this project is expired."
        lesson["strategy"] = {
            "action": "replace_param",
            "param": "api_key",
            "old_value": task.get("api_key"),
            "new_value": "new-key-abc" # In prod, this would be fetched securely
        }
        print("Lesson: Expired key detected. Strategy: Update key.")
        
    elif "NetworkError" in error_str:
        lesson["root_cause"] = "The project's host is unreachable."
        lesson["strategy"] = {
            "action": "skip_task",
            "reason": "Project host is down. Do not retry."
        }
        print("Lesson: Host unreachable. Strategy: Skip future attempts.")
        
    else:
        lesson["root_cause"] = "Unknown error."
        lesson["strategy"] = {"action": "log_and_skip", "reason": "Unhandled error."}
    
    print("--- END REFLECTION ---")
    return lesson
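
Because the whole lesson is written to reasoning_bank.json, the hard-coded replacement key above ends up stored on disk in plain text. One possible alternative, sketched here under assumptions, is to store only the name of an environment variable in the strategy and resolve it when the lesson is applied. The "new_value_env" field, the resolve_new_value helper, and the PROJECT_API_KEY variable are all hypothetical and not part of the code above.

```python
import os
from typing import Any, Dict, Optional


def resolve_new_value(strategy: Dict[str, Any]) -> Optional[str]:
    """Return the replacement value for a 'replace_param' strategy.

    If the hypothetical 'new_value_env' field is present, the value is read
    from that environment variable (None when it is unset). Otherwise the
    literal 'new_value' is returned, as in the original demo.
    """
    env_var = strategy.get("new_value_env")
    if env_var:
        return os.environ.get(env_var)
    return strategy.get("new_value")


# A strategy that references a secret by name instead of embedding it.
strategy = {
    "action": "replace_param",
    "param": "api_key",
    "new_value_env": "PROJECT_API_KEY",  # hypothetical variable name
}

os.environ["PROJECT_API_KEY"] = "new-key-abc"  # set here only for the demo
print(resolve_new_value(strategy))
```

With this approach the JSON file records what to do (replace api_key) without recording the secret itself. A secrets manager could fill the same role as the environment variable.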


Step 4: The Agent Structure

Finally, we assemble the agent. Its execute_task method is wrapped in logic that checks the ReasoningBank before acting.

Python
 
class SelfEvolvingAgent:
    
    def __init__(self):
        self.memory_bank = ReasoningBank()

    def create_plan(self, task: Dict[str, Any]) -> Optional[Dict[str, Any]]:
        """
        Consults memory before execution.
        """
        print(f"\nPlanning task for project: {task.get('project_id')}")
        relevant_lessons = self.memory_bank.find_relevant_lessons(task)
        
        if not relevant_lessons:
            return task # No lessons, proceed as planned

        # Apply the most recent lesson
        latest_lesson = relevant_lessons[-1]
        strategy = latest_lesson.get("strategy", {})
        action = strategy.get("action")
        
        print(f"Found relevant lesson: {action}")
        
        if action == "replace_param":
            new_task = task.copy()
            param = strategy.get("param")
            new_val = strategy.get("new_value")
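            # Only swap the value the lesson flagged as bad; leave other values untouched
            if param in new_task and new_task[param] == strategy.get("old_value"):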
                print(f"Applying lesson: Replacing '{param}' with '{new_val}'")
                new_task[param] = new_val
            return new_task
            
        elif action == "skip_task":
            print(f"Applying lesson: Skipping task. Reason: {strategy.get('reason')}")
            return None # None signals "do not execute"
            
        return task

    def execute_task(self, task: Dict[str, Any]):
        # 1. PLAN (Consult memory)
        plan = self.create_plan(task)
        
        if plan is None:
            print("--- Task execution skipped based on past failures. ---")
            return

        # 2. EXECUTE
        try:
            print(f"Executing task with params: {plan}")
            result = call_external_api(
                project_id=plan.get("project_id"), 
                api_key=plan.get("api_key")
            )
            print(f"--- Task Succeeded: {result} ---")

        # 3. REFLECT (On failure)
        except (ValueError, ConnectionError) as e:
            print(f"--- Task Failed: {e} ---")
            failure_lesson = reflect_on_failure(plan, e)
            self.memory_bank.add_lesson(failure_lesson)


Putting It All Together: The Execution Flow

Now we can watch the agent learn in real time. We will interleave the execution code with the agent's reasoning process.

First Attempt

The agent tries task_1 with old-key-xyz. The call_external_api function raises a ValueError, which the except block in execute_task catches.

Python
 
# Clean setup for demo
if os.path.exists(REASONING_BANK_FILE):
    os.remove(REASONING_BANK_FILE)

agent = SelfEvolvingAgent()

task_1 = {
    "project_id": "project-123",
    "api_key": "old-key-xyz"
}

print("\n========= ATTEMPT 1 (Expect Failure) =========")
agent.execute_task(task_1)


Reflect and memorize: The reflect_on_failure() function is triggered. It analyzes the error message, identifies the root cause (an expired API key), and creates a structured lesson with a "replace_param" strategy. This lesson is saved to reasoning_bank.json.
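
For reference, the snippet below rebuilds by hand the lesson that reflect_on_failure() returns for this failure and prints it in the same layout save_memory() uses (a JSON list with indent=2). It is a reconstruction from the code in Steps 2 and 3, not captured output from a run.

```python
import json

# The task and error text from the first attempt.
task = {"project_id": "project-123", "api_key": "old-key-xyz"}
error_text = "APIError: Invalid API Key. Key 'old-key-xyz' is expired."

# Same fields, in the same order, as reflect_on_failure() builds them.
lesson = {
    "type": "FAILURE_AVOIDANCE",
    "context": task,
    "error": error_text,
    "root_cause": "The API key used for this project is expired.",
    "strategy": {
        "action": "replace_param",
        "param": "api_key",
        "old_value": task["api_key"],
        "new_value": "new-key-abc",
    },
}

# The bank file holds a list of lessons.
bank_contents = json.dumps([lesson], indent=2)
print(bank_contents)
```

The "context" field is what find_relevant_lessons() matches on, and the "strategy" field is what create_plan() acts on.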

Second Attempt

The agent is asked to redo task_1. This time, create_plan() queries the ReasoningBank and retrieves the lesson. It applies the strategy by modifying the task to use new-key-abc, and the execution succeeds.

Python
 
print("\n========= ATTEMPT 2 (Expect Success via Adaptation) =========")
agent.execute_task(task_1)


Third Attempt

The agent tries a new task, task_2. This project triggers a different error (ConnectionError). The agent has no prior lessons for this project, so it attempts execution and fails.

Python
 
task_2 = {
    "project_id": "project-789",
    "api_key": "any-key"
}

print("\n========= ATTEMPT 3 (Expect Network Failure) =========")
agent.execute_task(task_2)


Reflect and memorize: The agent reflects and creates a new lesson with a "skip_task" strategy because the host is down.

Fourth Attempt

On the fourth attempt, the agent is asked to run task_2 again. The planner finds the "skip_task" lesson and does not execute the task, which saves time and computational resources.

Python
 
print("\n========= ATTEMPT 4 (Expect Skip) =========")
agent.execute_task(task_2)
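
All four attempts above run in a single process, so they do not directly exercise the claim from Step 2 that lessons survive a restart. The sketch below imitates two separate runs with plain functions and a temporary directory. load_lessons and save_lessons are hypothetical stand-ins that mirror load_memory() and save_memory(), and the temporary path is used only so the demo does not touch a real reasoning_bank.json.

```python
import json
import os
import tempfile
from typing import Any, Dict, List


def load_lessons(path: str) -> List[Dict[str, Any]]:
    """Return saved lessons, or an empty list if the file is missing or corrupt."""
    if not os.path.exists(path):
        return []
    try:
        with open(path, "r") as f:
            return json.load(f)
    except json.JSONDecodeError:
        return []


def save_lessons(path: str, lessons: List[Dict[str, Any]]) -> None:
    """Write all lessons back to disk as a JSON list."""
    with open(path, "w") as f:
        json.dump(lessons, f, indent=2)


with tempfile.TemporaryDirectory() as tmp_dir:
    bank_path = os.path.join(tmp_dir, "reasoning_bank.json")

    # "Run 1": nothing on disk yet; the agent fails and records a skip lesson.
    run_1_lessons = load_lessons(bank_path)
    run_1_lessons.append({
        "type": "FAILURE_AVOIDANCE",
        "context": {"project_id": "project-789", "api_key": "any-key"},
        "strategy": {"action": "skip_task", "reason": "Project host is down. Do not retry."},
    })
    save_lessons(bank_path, run_1_lessons)

    # "Run 2": a fresh load, as would happen after the script restarts.
    run_2_lessons = load_lessons(bank_path)
    should_skip = any(
        lesson.get("context", {}).get("project_id") == "project-789"
        and lesson.get("strategy", {}).get("action") == "skip_task"
        for lesson in run_2_lessons
    )
    print(f"Lessons after restart: {len(run_2_lessons)}, skip project-789: {should_skip}")
```

To try the same thing with the real agent, remove the "Clean setup for demo" lines from the first attempt and run the script twice. The second run should then start from the lessons saved by the first.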


Conclusion

By moving from static, amnesiac agents to dynamic systems that learn, we unlock the door to true autonomy. The difference is a simple, persistent JSON file and a single reflect_on_failure function.

This is the shift from a brittle, black-box tool to an adaptive, resilient system. It is one that you can trust to manage a CI/CD pipeline, not just run a single script, precisely because it has memory and the capacity to improve.


Published at DZone with permission of Apratim Mukherjee. See the original article here.

Opinions expressed by DZone contributors are their own.
