Auto-Dev User Guide
**Version:** 1.0
**Last Updated:** 2026-04-10
Complete guide to using Auto-Dev self-evolving agent capabilities in ATOM.
---
Table of Contents
- Overview
- Getting Started
- Core Concepts
- Using Auto-Dev Agents
- Monitoring Evolution
- Capability Gates
- Best Practices
- Troubleshooting
---
Overview
Auto-Dev is ATOM's self-evolving agent system that enables AI agents to automatically improve their capabilities through experience. Instead of manually updating agent skills, Auto-Dev agents learn, adapt, and evolve based on their interactions and outcomes.
Key Benefits
- **Continuous Improvement**: Agents get better over time without manual intervention
- **Skill Discovery**: Automatically identifies and learns new skills from successful patterns
- **Performance Optimization**: Refines existing skills based on real-world feedback
- **Adaptive Behavior**: Adjusts to changing requirements and environments
- **Safe Evolution**: Capability gates ensure only proven improvements are deployed
Auto-Dev Components
| Component | Purpose |
|---|---|
| **Memento Engine** | Generates new skill candidates from successful experiences |
| **AlphaEvolver Engine** | Optimizes and refines existing skills |
| **Reflection Engine** | Detects patterns and suggests improvements |
| **Fitness Service** | Evaluates skill performance and quality |
| **Capability Gate** | Validates improvements before deployment |
| **Event Hooks** | Subscribes to agent events for learning triggers |
---
Getting Started
Prerequisites
- **ATOM Account**: Free, Solo, Team, or Enterprise plan
- **Agent Maturity**: Agent must be at **INTERN** level or higher
- **Episodes**: Minimum of 10 episodes recorded for initial learning
- **Storage**: Sufficient storage for skill versions and evolution history
Enabling Auto-Dev
- **Navigate to Agent Settings**
- **Enable Auto-Dev**
- **Configure Learning Parameters**
- **Save and Start Learning**
---
Core Concepts
1. Episodes as Learning Data
Auto-Dev learns from episodes - records of agent executions with outcomes.
**Episode Structure:**
- **Input**: Task description, context, parameters
- **Execution**: Agent actions, decisions, intermediate states
- **Output**: Final result, success/failure, performance metrics
- **Feedback**: User ratings, corrections, quality scores
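As a concrete illustration, an episode record along these lines might look like the following (a sketch only; field names are hypothetical, not the exact ATOM schema):

```python
# Hypothetical episode record; structure mirrors the four parts described above.
episode = {
    "input": {
        "task": "Summarize Q3 sales figures",
        "context": {"tenant_id": "acme"},
        "parameters": {"format": "report"},
    },
    "execution": {
        "actions": ["fetch_data", "aggregate", "draft_report"],
        "intermediate_states": 3,
    },
    "output": {
        "success": True,
        "duration_ms": 4200,
    },
    "feedback": {
        "user_rating": 4,        # 1-5 scale
        "quality_score": 0.87,
    },
}

# Typically only successful, well-rated episodes feed skill generation
is_learning_candidate = (episode["output"]["success"]
                         and episode["feedback"]["user_rating"] >= 4)
print(is_learning_candidate)
```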
**Learning from Episodes:**
Episodes → Pattern Detection → Skill Generation → Performance Testing → Deployment
2. Skill Candidates
Auto-Dev generates "skill candidates" - potential new skills or improvements.
**Candidate Types:**
- **New Skills**: Novel capabilities discovered from patterns
- **Skill Refinements**: Improvements to existing skills
- **Parameter Optimizations**: Better configurations for existing skills
- **Composite Skills**: Combined skills for complex tasks
**Candidate Lifecycle:**
Generated → Tested → Validated → Deployed → Monitored
3. Fitness Evaluation
Every skill candidate is evaluated for "fitness" - how well it performs.
**Fitness Metrics:**
- **Success Rate**: Percentage of successful executions
- **Efficiency**: Resource usage (time, compute, tokens)
- **Quality**: Output quality scores
- **Consistency**: Performance variance across episodes
- **Safety**: Compliance with governance rules
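Combining these metrics, the weighted formula that follows can be written as a small function (a sketch; inputs are assumed to be normalized to the 0–1 range, and the weights are the defaults from this guide):

```python
def fitness_score(success_rate, efficiency, quality, consistency):
    """Weighted fitness score; all inputs assumed normalized to the 0-1 range."""
    return (success_rate * 0.4 +
            efficiency * 0.3 +
            quality * 0.2 +
            consistency * 0.1)

# Example: a strong candidate clears the Standard gate threshold (0.80)
score = fitness_score(success_rate=0.9, efficiency=0.8, quality=0.85, consistency=0.7)
print(round(score, 2))
```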
**Fitness Score Formula:**
```
fitness = (success_rate * 0.4) +
          (efficiency * 0.3) +
          (quality * 0.2) +
          (consistency * 0.1)
```
4. Capability Gates
Capability gates ensure only safe, effective improvements are deployed.
**Gate Levels:**
| Gate | Threshold | Description |
|---|---|---|
| **Conservative** | 90% fitness | Only proven improvements |
| **Standard** | 80% fitness | Balanced safety and innovation |
| **Aggressive** | 70% fitness | Faster evolution, more risk |
| **Disabled** | N/A | No automatic deployment |
**Gate Validation:**
- Performance testing on historical episodes
- Safety checks (governance compliance)
- A/B testing against current skills
- Gradual rollout (10% → 50% → 100%)
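The gradual rollout step can be pictured as a loop over traffic stages that rolls back on regression (illustrative only; the stage percentages and 10% rollback trigger mirror the examples in this guide):

```python
def gradual_rollout(candidate_success_rate, baseline_success_rate,
                    stages=(0.10, 0.50, 1.00), max_drop=0.10):
    """Advance a candidate through traffic stages; roll back on regression.

    In a real system the success rates would be measured live at each stage;
    here they are passed in for illustration.
    """
    for stage in stages:
        if baseline_success_rate - candidate_success_rate > max_drop:
            return ("rolled_back", stage)
    return ("deployed", 1.00)

print(gradual_rollout(0.85, 0.80))   # candidate outperforms baseline → deployed
print(gradual_rollout(0.60, 0.80))   # drop exceeds trigger → rolled back at 10%
```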
---
Using Auto-Dev Agents
Agent Behavior Differences
**Auto-Dev agents differ from standard agents:**
| Aspect | Standard Agent | Auto-Dev Agent |
|---|---|---|
| **Skills** | Fixed set | Evolves over time |
| **Performance** | Static | Improves with use |
| **Updates** | Manual | Automatic |
| **Adaptability** | Limited | High |
| **Learning** | None | Continuous |
Invoking Auto-Dev Agents
**No special usage required** - Auto-Dev agents work like normal agents:
```javascript
// Use an Auto-Dev agent just like any other agent
const response = await fetch('/api/chat', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'X-Tenant-Id': tenantId
  },
  body: JSON.stringify({
    agent_id: 'auto-dev-agent-id',
    message: 'Analyze sales data and create report'
  })
});
const result = await response.json();
// The agent uses its evolved skills automatically
```
Monitoring Agent Evolution
**View Evolution History:**
Agents → [Select Agent] → Evolution
This view shows:
- Evolution timeline
- Skill changes deployed
- Performance improvements
- Fitness scores over time
- Rollback options
---
Monitoring Evolution
Evolution Dashboard
**Access:** Agents → [Agent] → Auto-Dev → Dashboard
**Metrics Displayed:**
- **Evolution Timeline**
- Skill additions and removals
- Performance improvements over time
- Deployment history
- **Performance Trends**
- Success rate by week
- Average execution time
- Resource efficiency
- User satisfaction
- **Skill Inventory**
- Current skills and versions
- Skill fitness scores
- Usage frequency
- Last updated
- **Learning Progress**
- Episodes analyzed
- Patterns discovered
- Candidates generated
- Candidates deployed
Evolution Notifications
Auto-Dev sends notifications for important events:
**Email Notifications:**
- New skill deployed
- Performance milestone reached
- Capability gate violation
- Rollback initiated
**In-App Notifications:**
- Real-time evolution updates
- Skill performance alerts
- A/B test results
- Deployment status
---
Capability Gates
Understanding Capability Gates
Capability gates are safety checkpoints that validate skill improvements before deployment.
**Gate Criteria:**
- **Performance Threshold**
- Minimum fitness score (e.g., 80%)
- Success rate improvement
- No performance regression
- **Safety Validation**
- Governance compliance
- No security violations
- Passes safety checks
- **Testing Requirements**
- Validated on historical episodes
- A/B test completed
- Consistent performance
- **Gradual Rollout**
- Deployed to small percentage first
- Monitor for issues
- Full deployment after validation
Configuring Capability Gates
**Settings:** Agents → [Agent] → Auto-Dev → Capability Gates
**Configuration Options:**
```
Gate Level: Standard
Fitness Threshold: 80%
Success Rate Improvement: 5%
A/B Test Duration: 100 episodes
Rollout Strategy: Gradual (10% → 50% → 100%)
Rollback Trigger: Success rate drop > 10%
```
Gate Violations
**When a gate violation occurs:**
- **Deployment Blocked**
- Skill not deployed
- Reason logged
- Notification sent
- **Investigation Required**
- Review violation details
- Analyze root cause
- Decide on action
- **Resolution Options**
- Adjust gate parameters
- Fix skill candidate
- Force deploy (with confirmation)
- Discard candidate
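Putting the gate criteria together, a simplified gate check might look like this (a sketch; the function and field names are hypothetical, with Standard-gate defaults from this guide):

```python
def passes_capability_gate(candidate, baseline, fitness_threshold=0.80,
                           min_improvement=0.05):
    """Return (ok, reason) for a skill candidate against a Standard-style gate.

    `candidate` and `baseline` are dicts with 'fitness' and 'success_rate'
    keys; names and defaults are illustrative, not the exact ATOM API.
    """
    if candidate["fitness"] < fitness_threshold:
        return False, "fitness below threshold"
    if candidate["success_rate"] < baseline["success_rate"] + min_improvement:
        return False, "insufficient success-rate improvement"
    return True, "gate passed"

ok, reason = passes_capability_gate(
    candidate={"fitness": 0.86, "success_rate": 0.90},
    baseline={"success_rate": 0.82},
)
print(ok, reason)
```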
---
Best Practices
For Optimal Evolution
- **Provide High-Quality Episodes**
- Diverse task types
- Clear success/failure outcomes
- Detailed feedback
- Consistent context
- **Set Appropriate Gate Levels**
- Conservative for critical agents
- Standard for most use cases
- Aggressive for experimentation
- **Monitor Evolution Actively**
- Review evolution dashboard weekly
- Investigate performance changes
- Adjust parameters as needed
- **Provide Regular Feedback**
- Rate agent performance
- Correct agent mistakes
- Provide context and guidance
For Safe Evolution
- **Enable Capability Gates**
- Never disable gates for production agents
- Use conservative gates for critical tasks
- Monitor gate violations closely
- **Test in Development First**
- Enable Auto-Dev in dev environment
- Monitor evolution patterns
- Adjust parameters before production
- **Maintain Rollback Strategy**
- Keep evolution history
- Document skill versions
- Test rollback procedures
- **Set Realistic Expectations**
- Evolution takes time (weeks to months)
- Not all episodes lead to improvements
- Some evolution cycles may fail
For Team Collaboration
- **Share Evolution Insights**
- Document successful patterns
- Share learning across team
- Collaborate on improvements
- **Establish Evolution Policies**
- Standard gate levels
- Approval workflows
- Monitoring schedules
- **Train Team Members**
- Auto-Dev concepts
- Monitoring procedures
- Troubleshooting techniques
---
Troubleshooting
Common Issues
**Issue: Agent Not Evolving**
**Symptoms:**
- No skill changes over time
- Evolution dashboard shows no activity
- Fitness scores static
**Solutions:**
- Check if Auto-Dev is enabled
- Verify minimum episode count (10+ required)
- Check evolution schedule (is it running?)
- Review learning parameters (window size, thresholds)
**Issue: Performance Regression**
**Symptoms:**
- Success rate decreased after evolution
- Agent making more mistakes
- User complaints increased
**Solutions:**
- Check evolution timeline for recent changes
- Identify problematic skill deployment
- Rollback to previous skill version
- Adjust capability gate parameters
**Issue: Capability Gate Violations**
**Symptoms:**
- Frequent deployment blocks
- "Gate violation" notifications
- No skills being deployed
**Solutions:**
- Review gate threshold (may be too strict)
- Check if fitness targets are realistic
- Analyze why candidates are failing
- Consider adjusting gate level
**Issue: Excessive Resource Usage**
**Symptoms:**
- High compute costs
- Slow evolution cycles
- Storage limits reached
**Solutions:**
- Reduce episode batch size
- Increase evolution interval
- Clean up old skill versions
- Adjust learning window
Getting Help
**Support Resources:**
- **Documentation:** docs.atomagentos.com/auto-dev
- **Community:** community.atomagentos.com
- **Support:** support@atomagentos.com
- **Status:** status.atomagentos.com
**Debug Information:**
When reporting issues, include:
- Agent ID and maturity level
- Evolution configuration
- Episode count and timeframe
- Error messages or logs
- Performance metrics
---
Advanced Topics
Custom Fitness Functions
For specialized use cases, you can define custom fitness functions:
```python
def custom_fitness_function(episode_batch, skill_candidate):
    """Calculate a custom fitness score for a skill candidate."""
    # Base metrics
    success_rate = calculate_success_rate(episode_batch)
    efficiency = calculate_efficiency(episode_batch, skill_candidate)

    # Custom metrics
    domain_specific_score = calculate_domain_score(episode_batch)
    user_satisfaction = calculate_satisfaction(episode_batch)

    # Custom weights (should sum to 1.0)
    fitness = (
        success_rate * 0.3 +
        efficiency * 0.2 +
        domain_specific_score * 0.3 +
        user_satisfaction * 0.2
    )
    return fitness
```
Evolution Policies
Define policies for automated evolution decisions:
```yaml
# Evolution policy example
policies:
  - name: "Financial Services Policy"
    conditions:
      domain: "financial"
      maturity: "autonomous"
    gates:
      level: "conservative"
      fitness_threshold: 0.95
      require_human_approval: true
    rollout:
      strategy: "manual"
      testing_duration: "30 days"
  - name: "Research Policy"
    conditions:
      domain: "research"
      maturity: "supervised"
    gates:
      level: "aggressive"
      fitness_threshold: 0.75
      require_human_approval: false
    rollout:
      strategy: "gradual"
      testing_duration: "7 days"
```
Multi-Agent Evolution
Coordinate evolution across multiple agents:
```python
# Evolve an agent team together
from atom_auto_dev import MultiAgentEvolution

team_evolution = MultiAgentEvolution(
    agents=["agent-1", "agent-2", "agent-3"],
    shared_context=True,
    coordinated_evolution=True
)

# Agents learn from each other's episodes
team_evolution.evolve_team()

# Share successful skills across the team
team_evolution.share_skills()
```
---
Next Steps
- **Enable Auto-Dev:** Turn on Auto-Dev for your agents
- **Monitor Evolution:** Track improvements over time
- **Provide Feedback:** Rate agent performance regularly
- **Join Community:** Share evolution insights with other users
---