Thursday, October 16, 2025

Self-Learning CLI Agents: A Practical Guide


Building Systems That Get Smarter Every Day

Introduction

After months of building and refining self-learning systems with CLI agents, I've discovered something that is quite obvious in retrospect: the secret to continuously improving AI assistance isn't in the model itself—it's in the knowledge capture infrastructure surrounding it.

Recent research quantifies what practitioners have been discovering: systems that accumulate and refine their own contextual knowledge can outperform traditional approaches by 10.6% on agent tasks and 8.6% on domain-specific reasoning, while reducing adaptation costs by up to 87%.

Since I've been using this for a while, I thought a practical guide to building your own self-learning development system was in order.

I've experimented with this approach in multiple domains, including AMP for email development and writing TradingView PineScript code; the latter is published at https://github.com/NakliTechie/PineScriptCoder.

Both are relatively obscure domains in the overall scheme of things.

That the same system adapts to such different development contexts demonstrates its value regardless of the specific technology or domain you're working with.


Why You Need This

The Problem You're Living With

Without systematic knowledge capture, you're experiencing:

  • Repeated Mistakes: Solving the same problems across projects because insights weren't preserved
  • Knowledge Loss: Valuable lessons disappear when team members move on or memories fade
  • No Measurable Progress: Can't tell if your processes are actually improving
  • Inconsistent Quality: Good patterns discovered but never systematically applied

The tragedy isn't making mistakes—it's making the same mistakes repeatedly because you have no institutional memory.

The Solution: Memory as Infrastructure

A self-learning system addresses these through three mechanisms:

  1. Systematic Capture: Every significant learning is documented in a structured format
  2. Organized Storage: Insights filed logically for easy retrieval
  3. Active Utilization: Learnings automatically feed back into development workflows

This creates a compounding effect: better context → better performance → better learnings → better context.

The key insight from research: LLMs don't need brevity—they need comprehensive, detailed contexts. Unlike humans who benefit from concise summaries, language models perform better with long, detailed contexts and can autonomously extract relevant information.


The Research Foundation (In Plain English)

ACE Framework: Contexts as Evolving Playbooks

Stanford and SambaNova research introduced Agentic Context Engineering (ACE), which treats contexts not as static prompts but as "evolving playbooks" that accumulate and organize strategies over time.

Two problems ACE solves:

  1. Brevity Bias: Traditional optimizers create concise, universal instructions while sacrificing domain-specific knowledge—the details that make systems work in production.

  2. Context Collapse: Iterative rewriting degrades contexts into shorter, more ambiguous summaries over time, causing sharp performance declines.

ACE's solution: Structured, incremental updates that preserve detailed knowledge. Think of it like version control for context—you add and refine, you don't rewrite from scratch.
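
To make "structured, incremental updates" concrete, here is a minimal Python sketch. The file path, entry format, and helper name are my own illustration rather than the ACE authors' implementation; the point is that the playbook grows by discrete, dated entries instead of being rewritten from scratch.

import datetime
from pathlib import Path

# Hypothetical location of the evolving playbook.
PLAYBOOK = Path("learning/common_learnings.md")

def add_playbook_entry(category: str, insight: str) -> None:
    """Append a new strategy as a discrete, dated entry.

    Entries are added or refined individually; the file is never
    regenerated wholesale, which is what causes context collapse.
    """
    PLAYBOOK.parent.mkdir(parents=True, exist_ok=True)
    stamp = datetime.date.today().isoformat()
    with PLAYBOOK.open("a", encoding="utf-8") as f:
        f.write(f"\n- [{category} | {stamp}] {insight}")

add_playbook_entry(
    "technical",
    "Batch DOM reads before writes to avoid layout thrashing.",
)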

Training-Free GRPO: Learning Without Training

Tencent's research showed you can improve agent performance without any parameter updates—just by accumulating experiential knowledge as context.

With just a few dozen training samples, their approach outperformed fine-tuned models while avoiding overfitting and data scarcity issues.

The Implication

You don't need to fine-tune models. You don't need massive datasets. You need better memory management.
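
A minimal sketch of what that means in practice, assuming a plain-text knowledge file (the path and function name are illustrative): the "training" is nothing more than prepending accumulated experience to each task prompt.

from pathlib import Path

def build_prompt(task: str,
                 knowledge_file: str = "learning/common_learnings.md") -> str:
    """Prepend accumulated experiential knowledge to a task prompt.

    No parameter updates anywhere: the learning lives entirely in
    the text handed to the model.
    """
    path = Path(knowledge_file)
    knowledge = path.read_text(encoding="utf-8") if path.exists() else ""
    return (
        "Apply the following accumulated learnings where relevant:\n"
        f"{knowledge}\n\n"
        f"Task: {task}"
    )

print(build_prompt("Refactor the email template renderer."))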


Implementation: Let Your Agent Build It

Here's the liberating truth: you don't need to manually create this infrastructure. Your CLI coding agent can build the entire learning system from broad guidelines.

This is meta-prompting in action—instead of manually writing templates and structures, you collaborate with your agent to design and implement the system itself.

Step 1: Bootstrap the System

Give your coding agent these requirements:

Create a comprehensive learning capture system for our development projects:

1. A directory structure separating different types of learnings
2. Templates that capture problems, solutions, and prevention strategies  
3. A consolidated quick-reference file for frequently-used knowledge
4. Integration with our development workflow
5. Clear documentation on usage

Make it practical, not bureaucratic. The system should make capturing 
knowledge easier, not harder.

Let the agent propose and implement. It will often produce something more comprehensive than you'd write manually.

Step 2: What the Agent Creates

Your agent will typically build:

/project/learning/
├── README.md                          # System overview
├── LESSONS_LEARNED_TEMPLATE.md        # Capture template
├── common_learnings.md                # Quick reference (KEY FILE)
└── categorized_learnings/
    ├── technical/                     # Technical solutions
    ├── process/                       # Workflow improvements
    ├── project_specific/              # Project insights
    └── framework_updates/             # System-wide changes

Critical component: The common_learnings.md file is your system's working memory. This is what the agent references during active development—not the detailed categorized files.
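
If you'd like to see the skeleton without prompting for it, here is a small Python sketch that scaffolds the same layout. The paths simply mirror the tree above; your agent's version will differ.

from pathlib import Path

ROOT = Path("learning")
CATEGORIES = ["technical", "process", "project_specific", "framework_updates"]
TOP_LEVEL_FILES = ["README.md", "LESSONS_LEARNED_TEMPLATE.md", "common_learnings.md"]

def scaffold() -> None:
    """Create the learning-system skeleton shown above (idempotent)."""
    for category in CATEGORIES:
        (ROOT / "categorized_learnings" / category).mkdir(parents=True, exist_ok=True)
    for name in TOP_LEVEL_FILES:
        (ROOT / name).touch(exist_ok=True)

if __name__ == "__main__":
    scaffold()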

Step 3: The Capture Template

Your agent will design templates that capture:

For Technical Learnings:

  • Problem description and context
  • Root cause analysis
  • Solution (with before/after code)
  • Prevention strategies
  • Integration recommendations

For Process Improvements:

  • Current process description
  • Issues identified
  • Improved approach
  • Implementation guide
  • Measurable benefits

The template ensures completeness—when you capture knowledge, you capture it actionably.
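
For reference, here is a minimal version of the technical-learning template, assembled from the fields above. Treat it as a starting point: your agent will likely propose something richer, and the process variant swaps in its own sections.

# Lesson Learned: <short title>

**Category:** technical | process | project_specific | framework_updates

## Problem
What happened, and in what context?

## Root Cause
Why it actually happened.

## Solution
What fixed it, with before/after code where relevant.

## Prevention
How to avoid this class of problem next time.

## Integration
Which guides, checklists, or prompts should be updated.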

Step 4: Agent Integration

Add this to your agent's system prompt:

## Learning Capture Protocol

After every significant development session:

1. IDENTIFY learnings worthy of capture
   - Non-trivial technical solutions
   - Process improvements discovered
   - Patterns benefiting future projects
   - Mistakes with clear prevention strategies

2. CATEGORIZE appropriately
   - Technical / Process / Project-specific / Framework

3. UPDATE common_learnings.md
   - Add broadly applicable insights
   - Ensure immediate accessibility
   - Maintain organized structure

4. FILE detailed learning
   - Use complete template
   - Store in appropriate category
   - Cross-reference with common learnings

5. IDENTIFY documentation updates
   - Mark guides/checklists needing revision
   - Create action items
   - Prioritize by impact

Learning capture is NOT optional—it's core workflow.

Step 5: Active Utilization

Ensure your agent uses what it learns:

## Using the Learning System

During development:
1. CONSULT common_learnings.md at task start
2. APPLY relevant patterns from past experience
3. DOCUMENT new patterns discovered during work
4. UPDATE common learnings with applicable insights
5. FILE detailed learnings in categories

When solving problems:
- Reference past solutions before implementing from scratch
- Apply prevention strategies from similar issues
- Build on established patterns
- Note when patterns don't apply (document edge cases)

The self-reinforcing cycle: Execution → Learning → Documentation → Context → Better Execution


Real-World Results: What Actually Happens

After a few iterations of living with this system:

Quantifiable Improvements

  • Reduced problem-solving time: Similar issues reference existing solutions
  • Fewer repeated mistakes: Prevention strategies actually prevent
  • Faster onboarding: New team members leverage accumulated knowledge immediately
  • Measurable consistency: Best practices applied consistently

Qualitative Changes

  • Compound learning: Each project makes future projects easier
  • Knowledge democratization: Expertise becomes accessible to entire team
  • Confidence in complexity: Willing to tackle harder problems
  • Reduced cognitive load: System remembers, you don't have to

The Surprise Benefit: Cross-Project Intelligence

Insights from one project automatically benefit unrelated projects. A performance optimization from web work improves database tasks. Validation techniques from forms apply to APIs.

This happens because common_learnings.md creates a unified context. The system sees patterns across domains that individuals might miss.


The Evolution: From Basic to Optimized

Initial State

Start simple:

  • Basic directory structure
  • Simple template (problem/solution/lesson)
  • Manual capture after sessions
  • Agent references common learnings

Intermediate State

Refinements emerge:

  • Template complexity adjusted based on actual use
  • Common learnings reorganized by access frequency
  • Agent begins suggesting capture automatically
  • Cross-references between learnings appear

Mature State

The system optimizes itself:

  • Agent maintains optimal organization
  • Redundant learnings merged automatically
  • Common learnings updated in real-time
  • Prevention strategies actively applied

Key insight: My initial categorization worked, but lacked optimization for access. The common_learnings.md innovation came from observing usage patterns—I needed instant access during development, not perfect categorization.

This evolution happens naturally as you use the system. You don't need to plan for it—just start with the basics and let actual usage drive improvements.


Common Pitfalls and Solutions

Pitfall 1: Template Overload

Problem: Templates too complex, leading to inconsistent usage.

Solution: Start with problem/solution/lesson. Add complexity gradually based on actual needs.

Pitfall 2: Scattered Information

Problem: Perfect categorization that makes nothing findable during work.

Solution: The common_learnings.md file. Accept redundancy—information in both quick reference and detailed files is a feature, not a bug.

Pitfall 3: Inconsistent Application

Problem: Great system used occasionally.

Solution: Make it automatic via system prompt. Have agent remind you. Make skipping require conscious choice.

Pitfall 4: Write-Only Documentation

Problem: Learnings captured but never referenced.

Solution: System prompt must actively consult learnings during work. If the agent isn't referencing past knowledge, you're just documenting, not learning.

Pitfall 5: Brevity Bias

Problem: Keeping learnings concise, losing useful detail.

Solution: Remember the research insight: comprehensiveness beats conciseness for AI systems. Preserve details. Let the AI extract relevance.


Getting Started: Your Implementation Path

Phase 1: Co-Design with Your Agent

Describe your vision

  • Tell your coding agent what you want
  • Discuss your workflow and constraints
  • Review agent's architectural proposal

Let agent implement

  • Agent builds the infrastructure
  • Review generated templates and structure
  • Suggest refinements if needed

Manual testing

  • Capture 2-3 learnings yourself
  • Validate template completeness
  • Ensure organization makes sense

Time investment: ~1 hour for initial setup

Phase 2: Agent-Driven Capture

Add capture instructions

  • Update agent's system prompt
  • Define capture criteria clearly
  • Set expectations for when/what to document

Observe behavior

  • Does agent identify learnings appropriately?
  • Are captured learnings specific and actionable?
  • Is categorization logical?

Refinement

  • Adjust capture criteria based on observations
  • Simplify or expand template as needed
  • Begin building initial knowledge base

Time investment: ~30 minutes for setup, then observe over several sessions

Phase 3: Active Utilization

Add consultation instructions

  • Update system prompt for active reference
  • Ensure common_learnings.md in context
  • Define when agent should consult learnings

Monitor reference behavior

  • Does agent check common learnings?
  • Are past solutions being reused?
  • Is the feedback loop working?

Measure improvement

  • Track time to solve similar problems
  • Note reduction in repeated mistakes
  • Document instances of knowledge reuse

Time investment: ~20 minutes for setup, benefits accrue automatically

Phase 4: Optimization

Analysis

  • Ask agent to review what worked
  • Identify template complexity issues
  • Find access pattern inefficiencies

Reorganization

  • Adjust common learnings organization
  • Merge redundant entries
  • Improve categorization

Metrics and sharing

  • Establish baseline metrics
  • Document the system's value
  • Share approach with team

Time investment: ~1 hour for optimization, then ongoing maintenance is minimal

Phase 5: Autonomous Evolution

Let the agent maintain and optimize:

  • Agent adjusts organization based on usage
  • Automatic quality checks for captured knowledge
  • Cross-project knowledge synthesis
  • Team-level scaling if needed

The system improves itself from this point forward.

Key principle: Start with agent collaboration, not manual work. The agent that uses the system should also build and maintain it. Total setup time is 2-3 hours; the benefits compound indefinitely.


Measuring Success

Track these metrics:

Efficiency Metrics

  • Time to resolve similar issues: Should decrease as solutions accumulate
  • Repeated mistake frequency: Should approach zero for captured lessons
  • Onboarding time: Should decrease as knowledge base grows
  • Reference rate: Should increase as quality improves

Quality Metrics

  • Best practice consistency: Measured through reviews
  • Knowledge application rate: % of captured learnings actually used
  • Cross-project benefit: Insights from one project applied elsewhere
  • Prevention effectiveness: Issues avoided due to strategies

System Health Metrics

  • Learnings per project: Should be consistent
  • Common learnings reference frequency: Should be high during work
  • Categorization accuracy: Learnings filed where findable
  • Documentation freshness: Regular updates indicate active use

Critical metric: Time to first reference. If you can find relevant past knowledge in under 30 seconds, your system works. If it takes minutes, your organization scheme needs work.
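
One low-tech way to check is to time a brute-force scan of your learning files, as in the sketch below. It assumes the learning/ layout from earlier; the query is whatever you'd actually search for.

import time
from pathlib import Path

def seconds_to_first_hit(query: str, root: str = "learning") -> float:
    """Time a naive keyword scan across the knowledge base.

    If even a brute-force scan surfaces relevant knowledge quickly,
    the organization works; if not, restructure.
    """
    start = time.perf_counter()
    for path in Path(root).rglob("*.md"):
        if query.lower() in path.read_text(encoding="utf-8").lower():
            print(f"first hit: {path}")
            break
    return time.perf_counter() - start

print(f"{seconds_to_first_hit('validation'):.2f}s to first reference")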


The Future: Where This Goes

Three converging trends:

  1. Context windows are exploding: 4K → 32K → 128K → 1M+ tokens
  2. Models handle long contexts better: Can actually use those million tokens effectively
  3. Context engineering is formalizing: Frameworks like ACE codify best practices

Implication: In 12 months, self-learning systems won't be advanced—they'll be standard. The competitive advantage belongs to teams mastering knowledge capture now.

What's Coming

  • Automated reflection: AI analyzing traces and suggesting learnings automatically
  • Cross-team synthesis: Learning systems sharing insights across organizational boundaries
  • Semantic knowledge graphs: Moving beyond flat files to relationship-based storage
  • Real-time adaptation: Systems updating based on immediate execution feedback
  • Failure prediction: Warning about likely problems based on accumulated experience

Conclusion: Memory as Competitive Advantage

The research validates what practice confirms: comprehensive, evolving contexts outperform traditional approaches while reducing costs.

But the deeper insight: This isn't about AI—it's about building institutional memory that compounds over time.

Your development environment should get smarter every time you use it. Your tools should accumulate wisdom. Your AI assistants should improve not because models update, but because context evolves.

Each problem solved becomes future capability.

This is the future of development: not smarter AI, but AI with better memory.

The question isn't whether you'll build a self-learning system. The question is whether you'll build it before your competitors do.



Most important: Start imperfectly today. A simple system used consistently beats a sophisticated system used sporadically. The entire setup takes 2-3 hours, and the system improves itself from there.


