Thursday, October 16, 2025

Self-Learning CLI Agents: A Practical Guide


Building Systems That Get Smarter Every Day

Introduction

After months of building and refining self-learning systems with CLI agents, I've discovered something that is quite obvious in retrospect: the secret to continuously improving AI assistance isn't in the model itself—it's in the knowledge capture infrastructure surrounding it.

Recent research quantifies what practitioners have been discovering: systems that accumulate and refine their own contextual knowledge can outperform traditional approaches by 10.6% on agent tasks and 8.6% on domain-specific reasoning, while reducing adaptation costs by up to 87%.

Since I've been using this for a while, I thought a practical guide to building your own self-learning development system was in order.

I've experimented with this approach in multiple domains, including AMP for email development and writing TradingView PineScript code; the latter is published at https://github.com/NakliTechie/PineScriptCoder.

Both are relatively obscure domains in the overall scheme of things.

That the same system adapts to such different development contexts demonstrates its value regardless of the specific technology or domain you're working with.


Why You Need This

The Problem You're Living With

Without systematic knowledge capture, you're experiencing:

  • Repeated Mistakes: Solving the same problems across projects because insights weren't preserved
  • Knowledge Loss: Valuable lessons disappear when team members move on or memories fade
  • No Measurable Progress: Can't tell if your processes are actually improving
  • Inconsistent Quality: Good patterns discovered but never systematically applied

The tragedy isn't making mistakes—it's making the same mistakes repeatedly because you have no institutional memory.

The Solution: Memory as Infrastructure

A self-learning system addresses these through three mechanisms:

  1. Systematic Capture: Every significant learning is documented in a structured format
  2. Organized Storage: Insights filed logically for easy retrieval
  3. Active Utilization: Learnings automatically feed back into development workflows

This creates a compounding effect: better context → better performance → better learnings → better context.

The key insight from research: LLMs don't need brevity—they need comprehensive, detailed contexts. Unlike humans who benefit from concise summaries, language models perform better with long, detailed contexts and can autonomously extract relevant information.


The Research Foundation (In Plain English)

ACE Framework: Contexts as Evolving Playbooks

Stanford and SambaNova research introduced Agentic Context Engineering (ACE), which treats contexts not as static prompts but as "evolving playbooks" that accumulate and organize strategies over time.

Two problems ACE solves:

  1. Brevity Bias: Traditional optimizers create concise, universal instructions while sacrificing domain-specific knowledge—the details that make systems work in production.

  2. Context Collapse: Iterative rewriting degrades contexts into shorter, more ambiguous summaries over time, causing sharp performance declines.

ACE's solution: Structured, incremental updates that preserve detailed knowledge. Think of it like version control for context—you add and refine, you don't rewrite from scratch.
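
To make "structured, incremental updates" concrete, here is a minimal Python sketch. The file path, entry format, and helper name are my own illustration rather than the ACE authors' implementation; the point is that the playbook grows by discrete, dated entries instead of being rewritten from scratch.

import datetime
from pathlib import Path

# Hypothetical location of the evolving playbook.
PLAYBOOK = Path("learning/common_learnings.md")

def add_playbook_entry(category: str, insight: str) -> None:
    """Append a new strategy as a discrete, dated entry.

    Entries are added or refined individually; the file is never
    regenerated wholesale, which is what causes context collapse.
    """
    PLAYBOOK.parent.mkdir(parents=True, exist_ok=True)
    stamp = datetime.date.today().isoformat()
    with PLAYBOOK.open("a", encoding="utf-8") as f:
        f.write(f"\n- [{category} | {stamp}] {insight}")

add_playbook_entry(
    "technical",
    "Batch DOM reads before writes to avoid layout thrashing.",
)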

Training-Free GRPO: Learning Without Training

Tencent's research showed you can improve agent performance without any parameter updates—just by accumulating experiential knowledge as context.

With just a few dozen training samples, their approach outperformed fine-tuned models while avoiding overfitting and data scarcity issues.

The Implication

You don't need to fine-tune models. You don't need massive datasets. You need better memory management.
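
A minimal sketch of what that means in practice, assuming a plain-text knowledge file (the path and function name are illustrative): the "training" is nothing more than prepending accumulated experience to each task prompt.

from pathlib import Path

def build_prompt(task: str,
                 knowledge_file: str = "learning/common_learnings.md") -> str:
    """Prepend accumulated experiential knowledge to a task prompt.

    No parameter updates anywhere: the learning lives entirely in
    the text handed to the model.
    """
    path = Path(knowledge_file)
    knowledge = path.read_text(encoding="utf-8") if path.exists() else ""
    return (
        "Apply the following accumulated learnings where relevant:\n"
        f"{knowledge}\n\n"
        f"Task: {task}"
    )

print(build_prompt("Refactor the email template renderer."))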


Implementation: Let Your Agent Build It

Here's the liberating truth: you don't need to manually create this infrastructure. Your CLI coding agent can build the entire learning system from broad guidelines.

This is meta-prompting in action—instead of manually writing templates and structures, you collaborate with your agent to design and implement the system itself.

Step 1: Bootstrap the System

Give your coding agent these requirements:

Create a comprehensive learning capture system for our development projects:

1. A directory structure separating different types of learnings
2. Templates that capture problems, solutions, and prevention strategies  
3. A consolidated quick-reference file for frequently-used knowledge
4. Integration with our development workflow
5. Clear documentation on usage

Make it practical, not bureaucratic. The system should make capturing 
knowledge easier, not harder.

Let the agent propose and implement. It will often produce something more comprehensive than you'd write manually.

Step 2: What the Agent Creates

Your agent will typically build:

/project/learning/
├── README.md                          # System overview
├── LESSONS_LEARNED_TEMPLATE.md        # Capture template
├── common_learnings.md                # Quick reference (KEY FILE)
└── categorized_learnings/
    ├── technical/                     # Technical solutions
    ├── process/                       # Workflow improvements
    ├── project_specific/              # Project insights
    └── framework_updates/             # System-wide changes

Critical component: The common_learnings.md file is your system's working memory. This is what the agent references during active development—not the detailed categorized files.
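
If you'd like to see the skeleton without prompting for it, here is a small Python sketch that scaffolds the same layout. The paths simply mirror the tree above; your agent's version will differ.

from pathlib import Path

ROOT = Path("learning")
CATEGORIES = ["technical", "process", "project_specific", "framework_updates"]
TOP_LEVEL_FILES = ["README.md", "LESSONS_LEARNED_TEMPLATE.md", "common_learnings.md"]

def scaffold() -> None:
    """Create the learning-system skeleton shown above (idempotent)."""
    for category in CATEGORIES:
        (ROOT / "categorized_learnings" / category).mkdir(parents=True, exist_ok=True)
    for name in TOP_LEVEL_FILES:
        (ROOT / name).touch(exist_ok=True)

if __name__ == "__main__":
    scaffold()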

Step 3: The Capture Template

Your agent will design templates that capture:

For Technical Learnings:

  • Problem description and context
  • Root cause analysis
  • Solution (with before/after code)
  • Prevention strategies
  • Integration recommendations

For Process Improvements:

  • Current process description
  • Issues identified
  • Improved approach
  • Implementation guide
  • Measurable benefits

The template ensures completeness—when you capture knowledge, you capture it actionably.
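
For reference, here is a minimal version of the technical-learning template, assembled from the fields above. Treat it as a starting point: your agent will likely propose something richer, and the process variant swaps in its own sections.

# Lesson Learned: <short title>

**Category:** technical | process | project_specific | framework_updates

## Problem
What happened, and in what context?

## Root Cause
Why it actually happened.

## Solution
What fixed it, with before/after code where relevant.

## Prevention
How to avoid this class of problem next time.

## Integration
Which guides, checklists, or prompts should be updated.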

Step 4: Agent Integration

Add this to your agent's system prompt:

## Learning Capture Protocol

After every significant development session:

1. IDENTIFY learnings worthy of capture
   - Non-trivial technical solutions
   - Process improvements discovered
   - Patterns benefiting future projects
   - Mistakes with clear prevention strategies

2. CATEGORIZE appropriately
   - Technical / Process / Project-specific / Framework

3. UPDATE common_learnings.md
   - Add broadly applicable insights
   - Ensure immediate accessibility
   - Maintain organized structure

4. FILE detailed learning
   - Use complete template
   - Store in appropriate category
   - Cross-reference with common learnings

5. IDENTIFY documentation updates
   - Mark guides/checklists needing revision
   - Create action items
   - Prioritize by impact

Learning capture is NOT optional—it's core workflow.

Step 5: Active Utilization

Ensure your agent uses what it learns:

## Using the Learning System

During development:
1. CONSULT common_learnings.md at task start
2. APPLY relevant patterns from past experience
3. DOCUMENT new patterns discovered during work
4. UPDATE common learnings with applicable insights
5. FILE detailed learnings in categories

When solving problems:
- Reference past solutions before implementing from scratch
- Apply prevention strategies from similar issues
- Build on established patterns
- Note when patterns don't apply (document edge cases)

The self-reinforcing cycle: Execution → Learning → Documentation → Context → Better Execution


Real-World Results: What Actually Happens

After a few iterations of living with this system:

Quantifiable Improvements

  • Reduced problem-solving time: Similar issues reference existing solutions
  • Fewer repeated mistakes: Prevention strategies actually prevent
  • Faster onboarding: New team members leverage accumulated knowledge immediately
  • Measurable consistency: Best practices applied consistently

Qualitative Changes

  • Compound learning: Each project makes future projects easier
  • Knowledge democratization: Expertise becomes accessible to entire team
  • Confidence in complexity: Willing to tackle harder problems
  • Reduced cognitive load: System remembers, you don't have to

The Surprise Benefit: Cross-Project Intelligence

Insights from one project automatically benefit unrelated projects. A performance optimization from web work improves database tasks. Validation techniques from forms apply to APIs.

This happens because common_learnings.md creates a unified context. The system sees patterns across domains that individuals might miss.


The Evolution: From Basic to Optimized

Initial State

Start simple:

  • Basic directory structure
  • Simple template (problem/solution/lesson)
  • Manual capture after sessions
  • Agent references common learnings

Intermediate State

Refinements emerge:

  • Template complexity adjusted based on actual use
  • Common learnings reorganized by access frequency
  • Agent begins suggesting capture automatically
  • Cross-references between learnings appear

Mature State

The system optimizes itself:

  • Agent maintains optimal organization
  • Redundant learnings merged automatically
  • Common learnings updated in real-time
  • Prevention strategies actively applied

Key insight: My initial categorization worked, but lacked optimization for access. The common_learnings.md innovation came from observing usage patterns—I needed instant access during development, not perfect categorization.

This evolution happens naturally as you use the system. You don't need to plan for it—just start with the basics and let actual usage drive improvements.


Common Pitfalls and Solutions

Pitfall 1: Template Overload

Problem: Templates too complex, leading to inconsistent usage.

Solution: Start with problem/solution/lesson. Add complexity gradually based on actual needs.

Pitfall 2: Scattered Information

Problem: Perfect categorization that makes nothing findable during work.

Solution: The common_learnings.md file. Accept redundancy—information in both quick reference and detailed files is a feature, not a bug.

Pitfall 3: Inconsistent Application

Problem: Great system used occasionally.

Solution: Make it automatic via system prompt. Have agent remind you. Make skipping require conscious choice.

Pitfall 4: Write-Only Documentation

Problem: Learnings captured but never referenced.

Solution: System prompt must actively consult learnings during work. If the agent isn't referencing past knowledge, you're just documenting, not learning.

Pitfall 5: Brevity Bias

Problem: Keeping learnings concise, losing useful detail.

Solution: Remember the research insight: comprehensiveness beats conciseness for AI systems. Preserve details. Let the AI extract relevance.


Getting Started: Your Implementation Path

Phase 1: Co-Design with Your Agent

Describe your vision

  • Tell your coding agent what you want
  • Discuss your workflow and constraints
  • Review agent's architectural proposal

Let agent implement

  • Agent builds the infrastructure
  • Review generated templates and structure
  • Suggest refinements if needed

Manual testing

  • Capture 2-3 learnings yourself
  • Validate template completeness
  • Ensure organization makes sense

Time investment: ~1 hour for initial setup

Phase 2: Agent-Driven Capture

Add capture instructions

  • Update agent's system prompt
  • Define capture criteria clearly
  • Set expectations for when/what to document

Observe behavior

  • Does agent identify learnings appropriately?
  • Are captured learnings specific and actionable?
  • Is categorization logical?

Refinement

  • Adjust capture criteria based on observations
  • Simplify or expand template as needed
  • Begin building initial knowledge base

Time investment: ~30 minutes for setup, then observe over several sessions

Phase 3: Active Utilization

Add consultation instructions

  • Update system prompt for active reference
  • Ensure common_learnings.md in context
  • Define when agent should consult learnings

Monitor reference behavior

  • Does agent check common learnings?
  • Are past solutions being reused?
  • Is the feedback loop working?

Measure improvement

  • Track time to solve similar problems
  • Note reduction in repeated mistakes
  • Document instances of knowledge reuse

Time investment: ~20 minutes for setup, benefits accrue automatically

Phase 4: Optimization

Analysis

  • Ask agent to review what worked
  • Identify template complexity issues
  • Find access pattern inefficiencies

Reorganization

  • Adjust common learnings organization
  • Merge redundant entries
  • Improve categorization

Metrics and sharing

  • Establish baseline metrics
  • Document the system's value
  • Share approach with team

Time investment: ~1 hour for optimization, then ongoing maintenance is minimal

Phase 5: Autonomous Evolution

Let the agent maintain and optimize:

  • Agent adjusts organization based on usage
  • Automatic quality checks for captured knowledge
  • Cross-project knowledge synthesis
  • Team-level scaling if needed

The system improves itself from this point forward.

Key principle: Start with agent collaboration, not manual work. The agent that uses the system should also build and maintain it. Total setup time is 2-3 hours; the benefits compound indefinitely.


Measuring Success

Track these metrics:

Efficiency Metrics

  • Time to resolve similar issues: Should decrease as solutions accumulate
  • Repeated mistake frequency: Should approach zero for captured lessons
  • Onboarding time: Should decrease as knowledge base grows
  • Reference rate: Should increase as quality improves

Quality Metrics

  • Best practice consistency: Measured through reviews
  • Knowledge application rate: % of captured learnings actually used
  • Cross-project benefit: Insights from one project applied elsewhere
  • Prevention effectiveness: Issues avoided due to strategies

System Health Metrics

  • Learnings per project: Should be consistent
  • Common learnings reference frequency: Should be high during work
  • Categorization accuracy: Learnings filed where findable
  • Documentation freshness: Regular updates indicate active use

Critical metric: Time to first reference. If you can find relevant past knowledge in under 30 seconds, your system works. If it takes minutes, your organization scheme needs work.
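
One low-tech way to check is to time a brute-force scan of your learning files, as in the sketch below. It assumes the learning/ layout from earlier; the query is whatever you'd actually search for.

import time
from pathlib import Path

def seconds_to_first_hit(query: str, root: str = "learning") -> float:
    """Time a naive keyword scan across the knowledge base.

    If even a brute-force scan surfaces relevant knowledge quickly,
    the organization works; if not, restructure.
    """
    start = time.perf_counter()
    for path in Path(root).rglob("*.md"):
        if query.lower() in path.read_text(encoding="utf-8").lower():
            print(f"first hit: {path}")
            break
    return time.perf_counter() - start

print(f"{seconds_to_first_hit('validation'):.2f}s to first reference")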


The Future: Where This Goes

Three converging trends:

  1. Context windows are exploding: 4K → 32K → 128K → 1M+ tokens
  2. Models handle long contexts better: Can actually use those million tokens effectively
  3. Context engineering is formalizing: Frameworks like ACE codify best practices

Implication: In 12 months, self-learning systems won't be advanced—they'll be standard. The competitive advantage belongs to teams mastering knowledge capture now.

What's Coming

  • Automated reflection: AI analyzing traces and suggesting learnings automatically
  • Cross-team synthesis: Learning systems sharing insights across organizational boundaries
  • Semantic knowledge graphs: Moving beyond flat files to relationship-based storage
  • Real-time adaptation: Systems updating based on immediate execution feedback
  • Failure prediction: Warning about likely problems based on accumulated experience

Conclusion: Memory as Competitive Advantage

The research validates what practice confirms: comprehensive, evolving contexts outperform traditional approaches while reducing costs.

But the deeper insight: This isn't about AI—it's about building institutional memory that compounds over time.

Your development environment should get smarter every time you use it. Your tools should accumulate wisdom. Your AI assistants should improve not because models update, but because context evolves.

Each problem solved becomes future capability.

This is the future of development: not smarter AI, but AI with better memory.

The question isn't whether you'll build a self-learning system. The question is whether you'll build it before your competitors do.



Most important: Start imperfectly today. A simple system used consistently beats a sophisticated system used sporadically. The entire setup takes 2-3 hours, and the system improves itself from there.


