Build an agent that monitors, evaluates, and rewrites its own tools at runtime.
This repository contains the full working code from the blog post *The Self-Extending AI Agent Part 2: Build a Self-Improving Agent That Rewrites Its Own Tools*.
This is the continuation of Part 1: The Self-Extending Agent (GitHub repo).
In Part 1, the agent learned to generate new tools when it encounters a capability gap. In Part 2, the agent goes further — it monitors tool performance, detects degradation, rewrites underperforming tools using an LLM, and validates rewrites against a regression test suite before promoting them.
The improvement loop:
- Execute — Every tool call is wrapped with metrics capture (latency, success/failure, output).
- Record — Metrics are stored in a SQLite-backed Performance Memory.
- Evaluate — A weighted scoring function computes tool quality and flags underperformers.
- Rewrite — An LLM receives the old code + failure context and produces an improved version.
- Test — A Regression Runner validates the candidate against stored test cases.
- Promote — Only if all tests pass, the new version replaces the old one.
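The Execute and Record steps above can be sketched in miniature. This is an illustrative decorator, not the actual code from `orchestrator.py` or `performance_memory.py`; the tool name `flaky_search` and the in-memory `metrics` list are stand-ins for the real registry and SQLite store.

```python
import time

def with_metrics(tool_fn, record):
    """Wrap a tool so every call records its name, latency, and success."""
    def wrapped(*args, **kwargs):
        start = time.perf_counter()
        try:
            result = tool_fn(*args, **kwargs)
            record(tool_fn.__name__, time.perf_counter() - start, True)
            return result
        except Exception:
            record(tool_fn.__name__, time.perf_counter() - start, False)
            raise  # failures are recorded, then propagated to the caller
    return wrapped

metrics = []  # stand-in for the SQLite-backed Performance Memory

def record(name, latency, ok):
    metrics.append((name, latency, ok))

def flaky_search(query):
    """Hypothetical tool: fails on empty input."""
    if not query:
        raise ValueError("empty query")
    return f"results for {query}"

search = with_metrics(flaky_search, record)
search("llm agents")       # success row recorded
try:
    search("")             # failure row recorded, exception re-raised
except ValueError:
    pass
# metrics now holds one success and one failure record
```

The same wrapper applies uniformly to generated tools, so rewritten versions are measured with no extra instrumentation.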
| File | Description |
|---|---|
| `main.py` | Entry point — seeds test cases and runs the agent |
| `orchestrator.py` | Extended orchestrator with metrics capture and improvement loop |
| `performance_memory.py` | SQLite-backed storage for per-tool invocation metrics |
| `evaluator.py` | Weighted quality scoring with configurable thresholds |
| `rewriter.py` | LLM-powered tool rewriter with performance context |
| `regression_runner.py` | Test suite runner with type checking and latency gates |
| `registry.py` | Tool registry with persistence (from Part 1) |
| `generator.py` | LLM tool generator (from Part 1) |
| `validator.py` | AST + sandbox validator (from Part 1) |
```bash
python -m venv venv
source venv/bin/activate   # Linux/macOS
venv\Scripts\activate      # Windows
```

```bash
pip install -r requirements.txt
```

```bash
export OPENAI_API_KEY="your-key-here"
```

Or edit `main.py` directly (not recommended for production).

```bash
python main.py
```

The agent will:
- Generate tools it needs (if not already in the registry).
- Execute tasks while recording performance metrics.
- Automatically evaluate and rewrite tools that score below the threshold (0.6).
- Print a final performance report showing tool versions and metrics.
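The 0.6 threshold can be pictured with a toy scoring function. The weights and latency cap below are illustrative assumptions, not the values configured in `evaluator.py`:

```python
def quality_score(success_rate, avg_latency_ms,
                  max_latency_ms=2000.0, w_success=0.7, w_latency=0.3):
    """Illustrative weighted score in [0, 1].
    Weights and the latency cap are assumptions for this sketch,
    not the configured values in evaluator.py."""
    latency_factor = max(0.0, 1.0 - avg_latency_ms / max_latency_ms)
    return w_success * success_rate + w_latency * latency_factor

THRESHOLD = 0.6  # the rewrite trigger described above

# A tool failing 40% of calls at 1.5 s average latency falls below the bar:
score = quality_score(success_rate=0.6, avg_latency_ms=1500)
print(score < THRESHOLD)  # True — this tool would be queued for rewriting
```

A weighted sum like this makes the trigger tunable: raising `w_latency` (hypothetical name) would flag slow-but-reliable tools sooner.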
- Python 3.11+
- OpenAI API key (GPT-4o recommended)
Read the full walkthrough with architecture diagrams, live demo output, and production deployment guidance:
- Part 2 (this repo): The Self-Improving Agent
- Part 1: The Self-Extending Agent