
I Made Claude Code Learn From Its Own Debugging Mistakes

Build a self-improving RAG system where Claude learns from your debugging sessions, captures insights automatically, and reflects to fix issues faster.
Why this matters
I got tired of Claude Code forgetting lessons learned. Every new session started from scratch—same mistakes, same debugging loops. So I built a self-improving RAG system: hooks capture failures automatically, graph memory tracks relationships between errors and fixes, and self-reflection extracts meta-learnings. Now my Claude Code actually gets smarter over time.
I was debugging the same authentication error for the third time this month.
Same error. Same root cause. Same fix.
Claude Code had solved this exact problem two weeks ago—but it didn’t remember. Each session starts fresh. No memory of what worked, what failed, or what patterns emerged.
That’s a massive waste of debugging time.
So I built a system to fix it.
The Problem With Stateless AI
Claude Code is powerful, but it has a fundamental limitation: every session starts from zero.
This means:
- Same mistakes repeated across sessions
- No accumulation of project-specific knowledge
- Debugging loops that should take minutes take hours
- Learnings trapped in conversation history, never extracted
The irony? Claude Code can solve complex problems. It just can’t remember that it already solved them. This is where building a self-improving RAG system becomes transformative.
Introducing the Self-Improving RAG System
I built a configuration that makes Claude Code learn from every debugging session.
The core idea: automatic capture, structured storage, intelligent retrieval.
When something breaks, the system captures it. When something works, the system remembers it. When a session ends, the system reflects on what happened.
No manual logging. No /learn commands for every insight. The system watches, learns, and improves. This is built on the principles of Retrieval-Augmented Generation (RAG)—using external knowledge to enhance AI responses—combined with Claude Code’s development capabilities.
Architecture: Three Memory Layers
The system uses three complementary memory approaches:
```
┌───────────────────────────────────────────────────────────────┐
│                        Knowledge Layer                        │
│                                                               │
│   ┌──────────────┐    ┌──────────────┐    ┌──────────────┐    │
│   │   ChromaDB   │    │ Graph Memory │    │  CLAUDE.md   │    │
│   │  (Vectors)   │    │ (Relations)  │    │    (File)    │    │
│   └──────────────┘    └──────────────┘    └──────────────┘    │
│                                                               │
│   Collections:           Relationships:      Sections:        │
│   • error_patterns       • error→file        • Known Pitfalls │
│   • successful_patterns  • error→fix         • Successful     │
│   • project_learnings    • fix→file            Patterns       │
│   • meta_learnings       • decision→outcome  • Error History  │
└───────────────────────────────────────────────────────────────┘
```
Layer 1: ChromaDB (Semantic Search)
Vector embeddings enable semantic search across all captured knowledge.
Collections:
- error_patterns: build failures, type errors, runtime exceptions
- successful_patterns: what worked, such as deployment patterns and architecture decisions
- project_learnings: project-specific insights
- meta_learnings: process improvements from self-reflection
When I search “authentication errors,” I get relevant results even if the exact phrase wasn’t used.
Layer 2: Graph Memory (Relationships)
ChromaDB stores flat documents. But knowledge has structure.
Graph memory tracks relationships:
```
Error ──occurred_in──→ File
  │
  └──fixed_by──→ Fix ──applied_to──→ File

Decision ──led_to──→ Outcome
    │
Learning ←──learned_from──┘
```
This enables queries like:
- “What errors have occurred in auth.ts?”
- “What fixes have been applied to the API module?”
- “What decisions led to successful deployments?”
Relationships reveal patterns that flat search misses.
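The relationship queries above can be sketched with a minimal in-memory triple store. This is an illustrative sketch, not the system's actual implementation; the class and method names are my own:

```python
class GraphMemory:
    """Minimal triple store: each fact is a (subject, relation, object) tuple."""

    def __init__(self):
        self.triples = []

    def add(self, subj, rel, obj):
        self.triples.append((subj, rel, obj))

    def query(self, subj=None, rel=None, obj=None):
        """Return all triples matching the given fields; None acts as a wildcard."""
        return [
            t for t in self.triples
            if (subj is None or t[0] == subj)
            and (rel is None or t[1] == rel)
            and (obj is None or t[2] == obj)
        ]


g = GraphMemory()
g.add("TypeError: x is undefined", "occurred_in", "auth.ts")
g.add("TypeError: x is undefined", "fixed_by", "add null check")
g.add("add null check", "applied_to", "auth.ts")

# "What errors have occurred in auth.ts?"
errors = [s for s, r, o in g.query(rel="occurred_in", obj="auth.ts")]
```

A real graph memory adds persistence and traversal, but even this skeleton answers the relationship questions that flat vector search cannot.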
Layer 3: CLAUDE.md (Project Memory)
Each project maintains a CLAUDE.md file—a living document that Claude reads at session start.
Sections:
- Known Pitfalls: Project-specific gotchas (auto-populated by hooks)
- Successful Patterns: Code patterns that have worked
- Error History: Recent errors with resolutions
This provides immediate context without database queries.
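A minimal CLAUDE.md skeleton with these sections might look like the following; the section names come from the article, while the example entries are placeholders I made up:

```markdown
# Project Memory

## Known Pitfalls
<!-- auto-populated by hooks -->
- Dev server caches stale env vars; restart after editing .env

## Successful Patterns
- Schema validation at API boundaries before touching the database

## Error History
- 2026-01-10: type error in auth.ts (fixed by narrowing the session type)
```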
Automatic Learning Capture
The magic is in the hooks—scripts that intercept Claude Code events.
Hook: Capture Failures
When a build or test fails:
```python
# capture_failure.py (PostToolUse hook)
def capture_failure(tool_result):
    if tool_result.exit_code != 0:
        error = extract_error(tool_result.output)
        store_to_chromadb({
            "type": "error_pattern",
            "error": error,
            "file": tool_result.file,
            "timestamp": now(),
        })
        update_graph("error", error, "occurred_in", tool_result.file)
```
No manual intervention. Failures get captured automatically.
Hook: Track File Edits
When files are modified:
```python
# log_edit.py (PostToolUse hook)
def log_edit(tool_result):
    if tool_result.tool == "Edit":
        update_graph("fix", tool_result.diff, "applied_to", tool_result.file)
```
This builds the error→fix→file relationship over time.
Hook: Session Summary
When a session ends:
```python
# session_summary.py (Stop hook)
def session_summary():
    learnings = extract_learnings(session_history)
    update_claude_md(learnings)
    store_to_chromadb(learnings)
```
The system extracts what was learned and persists it.
Self-Reflection: Meta-Learning
Beyond capturing individual learnings, the system reflects on patterns.
At session end, a self-reflection agent analyzes:
- What approaches were effective
- What inefficiencies occurred
- What patterns emerged
These meta-learnings go into a separate collection—insights about the development process itself, not just specific bugs.
Example meta-learning:
“When debugging TypeScript type errors, checking the imported types first resolves 70% of issues faster than tracing through the codebase.”
This is knowledge about how to debug, not just what broke.
Memory Decay: Keeping Knowledge Fresh
Old knowledge becomes stale. A fix that worked six months ago might not apply to the current architecture.
Memory decay solves this:
- Half-life: 90 days (configurable)
- Minimum weight: 0.1 (never fully forgotten)
- Access boost: Recently used memories get +20% weight
The result: Claude prioritizes recent, actively-used knowledge while maintaining a long-term memory of rare but important patterns. This is similar to the token optimization strategies I’ve outlined for reducing AI token usage, where selective information display keeps context efficient.
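The decay rule described above (90-day half-life, 0.1 floor, +20% boost for recently used memories) can be sketched as a single weighting function. The function name and the 7-day window for "recently used" are my assumptions, not the system's actual code:

```python
from datetime import datetime, timedelta

HALF_LIFE_DAYS = 90
MIN_WEIGHT = 0.1
ACCESS_BOOST = 1.2                  # +20% for recently used memories
RECENT_WINDOW = timedelta(days=7)   # assumption: what counts as "recently used"


def memory_weight(created, last_accessed=None, now=None):
    """Exponential half-life decay with a floor, boosted for recent access."""
    now = now or datetime.now()
    age_days = (now - created).total_seconds() / 86400
    weight = 0.5 ** (age_days / HALF_LIFE_DAYS)   # half the weight every 90 days
    if last_accessed is not None and (now - last_accessed) <= RECENT_WINDOW:
        weight *= ACCESS_BOOST
    return max(weight, MIN_WEIGHT)                # never fully forgotten
```

Under this rule a 90-day-old memory scores 0.5, and a year-old one sits at the 0.1 floor unless it keeps getting accessed.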
Custom Commands
The system adds slash commands for manual interaction:
| Command | What It Does |
|---|---|
| `/learn` | Manually capture a learning from the current session |
| `/search-knowledge "query"` | Search all memory layers |
| `/review-plan` | Validate a plan against past learnings |
| `/reflect` | Trigger self-reflection analysis |
| `/memory-stats` | Show knowledge base statistics |
| `/memory-maintain` | Run decay, merge duplicates, archive old memories |
Most learning happens automatically. These commands are for when you want manual control.
Effort-Based Routing
Not every task needs maximum AI capability. The system classifies prompts:
| Level | Example | Token Impact |
|---|---|---|
| low | “What’s the syntax for…” | Fastest, cheapest |
| medium | “Implement this feature” | Balanced |
| high | “Architect the auth system” | Maximum capability |
Simple questions get fast answers. Complex problems get deep thinking.
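One way to sketch this routing is a keyword heuristic. The marker lists below are illustrative (a production router might use a small classifier model instead), and the function name is my own:

```python
# Hypothetical effort classifier: route prompts to cheap vs. capable handling.
LOW_MARKERS = ("what's the syntax", "what is", "how do i")
HIGH_MARKERS = ("architect", "design the", "migrate", "refactor the")


def classify_effort(prompt: str) -> str:
    """Return 'low', 'medium', or 'high' based on simple keyword markers."""
    p = prompt.lower()
    if any(marker in p for marker in HIGH_MARKERS):
        return "high"
    if any(marker in p for marker in LOW_MARKERS):
        return "low"
    return "medium"   # default: balanced capability
```

The high-effort check runs first so that a prompt matching both marker sets still gets maximum capability.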
Real Results
After two months of use:
Before (vanilla Claude Code):
- Same auth bug: 45 minutes to debug (third time)
- Build failures: Manual pattern recognition
- Session knowledge: Lost after conversation ends
After (self-improving RAG):
- Same auth bug: /search-knowledge "auth middleware" → fix in 2 minutes
- Build failures: Automatic capture, searchable history
- Session knowledge: Persisted, searchable, improving
The system has captured 150+ error patterns, 45 successful patterns, and 80 meta-learnings across my projects. For more on Claude Code workflows and best practices, see my comprehensive Claude Code guide.
Getting Started
The system is available as a configuration you can install:
```shell
cd ~/Projects/active/claude-rag-config
./setup.sh
# Then in any project:
claude
```
Setup installs:
- Hooks for automatic capture
- Custom commands for manual control
- ChromaDB for vector storage
- Graph memory database
- CLAUDE.md template
Getting Started: The Minimum Viable RAG Setup
You don’t need the full system on day one. Here’s the smallest version that actually makes a difference.
Step 1: Install ChromaDB
```shell
pip install chromadb
```
That’s your vector store. One command.
Step 2: Create a capture hook
Create a file at ~/.claude/hooks/post_tool_use.py:
```python
import hashlib
import json
import os
import sys
from datetime import datetime

import chromadb

data = json.loads(sys.stdin.read())
if data.get("tool") == "Bash" and data.get("exit_code", 0) != 0:
    # ChromaDB does not expand "~" itself, so expand the path explicitly
    client = chromadb.PersistentClient(path=os.path.expanduser("~/.claude/memory"))
    collection = client.get_or_create_collection("error_patterns")
    error_text = data.get("output", "")[:500]
    # Hash the error text so repeated identical failures update one document
    doc_id = hashlib.md5(error_text.encode()).hexdigest()
    collection.upsert(
        documents=[error_text],
        ids=[doc_id],
        metadatas=[{"timestamp": datetime.now().isoformat()}],
    )
# Always tell Claude Code to continue, whether or not we captured anything
print(json.dumps({"continue": True}))
```
This captures every failed Bash command into ChromaDB. No manual intervention.
Step 3: Add a search command
Add this to your CLAUDE.md:
```markdown
## /search-errors command
When user types /search-errors "query":
1. Query ChromaDB error_patterns collection
2. Return top 3 similar past errors and their context
3. Suggest fixes based on patterns
```
Step 4: Add a project-specific CLAUDE.md section
```markdown
## Known Pitfalls (auto-updated)
<!-- This section gets updated by the session summary hook -->
```
That’s the minimum viable setup. Four steps, maybe 20 minutes. You won’t have graph memory or self-reflection yet—but you’ll have semantic search over your past errors, which is where most of the day-to-day value comes from.
The auth bug I mentioned at the top of this post? The minimum viable version would have caught it. The error was in the database. The fix was two queries away.
Build the full system when the minimum version proves itself. For me, that took about 3 weeks.
The minimum version will change how you think about debugging. Instead of starting from scratch each session, you’ll start with a search. That shift alone is worth the 20-minute setup cost.
What’s Next
The system keeps improving. Planned additions:
- Cross-project learning: Patterns that work in one project suggested in others
- Confidence scoring: How reliable is this learning based on how often it’s worked?
- Team memory: Shared knowledge base across collaborators
The Bigger Picture
This isn’t just about remembering bugs.
It’s about accumulating developer judgment in a searchable, queryable format.
Every debugging session teaches something. Without capture, those lessons evaporate. With this system, they compound.
Claude Code doesn’t just help you code. It becomes a repository of everything you’ve learned about your codebase—and it gets smarter every session.
Questions about implementing this in your workflow? Reach out on LinkedIn.
FAQ
What's the difference between this and regular Claude Code?
Regular Claude Code treats each session as independent—it doesn't remember what worked last week. This system captures learnings, stores them in a searchable database, and automatically loads relevant context. It's the difference between a contractor who starts fresh each day vs. one who keeps notes.
How does automatic learning capture work?
PostToolUse hooks intercept build failures, test results, and file edits. When something fails, the hook extracts the error, root cause, and eventual fix—then stores it in ChromaDB. No manual /learn commands needed for most cases.
What's graph memory and why does it matter?
Vector databases store flat documents. Graph memory tracks relationships: which files have errors, what fixes work, how decisions connect to outcomes. This lets Claude answer questions like 'What fixes have I applied to auth.ts?' or 'What patterns led to successful deployments?'
Does the knowledge base get bloated over time?
Memory decay solves this. Old memories gradually lose weight (90-day half-life). Recently accessed memories get boosted. This keeps the knowledge base relevant—Claude remembers what it uses, forgets what it doesn't.
Can I use this with my own projects?
Yes—the system is project-agnostic. Run setup.sh to install hooks and commands. Each project gets its own CLAUDE.md for project-specific memory, while ChromaDB stores cross-project learnings.
Sources & Further Reading
Sources
- Claude Code Best Practices: official guidance on Claude Code workflows that informed the hook design.
- Claude Code Hooks Documentation: technical documentation for the hooks system used for automatic capture.
- ChromaDB Documentation: vector database used for semantic search of learnings.
Further Reading
- I Built a Bot That Builds SaaS Products. It Shipped One in 24 Hours: MicroSaaSBot automates SaaS building from idea to deployed MVP, and built StatementSync in 7 days with minimal code.
- Claude Code Best Practices the Official Docs Don't Cover (2026): what I learned building 36K lines of production code with Claude Code, covering quality gates, multi-agent orchestration, and the workflow patterns that ship.
- My Two-Gate System for Claude Code Cut Errors 84%: build safer Claude Code projects with a two-gate quality system and the mandatory checks that catch bugs before deployment.
What do you think?
I post about this stuff on LinkedIn every day and the conversations there are great. If this post sparked a thought, I'd love to hear it.
Discuss on LinkedIn