Governance Architecture - Unified Frontend-Backend System
**Version:** 2.0
**Last Updated:** 2026-04-12
**Status:** Production-Ready
Overview
The governance system enforces agent maturity-based access control, ensuring AI agents only perform actions appropriate to their experience level. The architecture follows a **unified frontend-backend model** where the Python backend is the single source of truth for all governance decisions.
Key Principles
- **Backend Authority**: All governance decisions made by Python backend
- **Fail-Closed**: Deny actions if backend unavailable (never allow by default)
- **Client-Side Caching**: 30-second TTL to reduce API load
- **Audit Trail**: All decisions logged with latency tracking
- **Tenant Isolation**: Multi-tenant safety via tenant_id filtering
Architecture Diagram
┌──────────────────────────────────────────────────────────────────┐
│                        Frontend (Next.js)                        │
│                                                                  │
│  ┌────────────────────────────────────────────────────────────┐  │
│  │ AgentGovernanceService                                     │  │
│  │ - canPerformAction()                                       │  │
│  │ - Client-side cache (30s TTL)                              │  │
│  │ - Fail-closed on errors                                    │  │
│  │ - Telemetry (hits/misses/errors)                           │  │
│  └───────────────┬────────────────────────────────────────────┘  │
│                  │                                               │
│                  │ fetch('/api/v1/agent-governance/evaluate')    │
│                  ▼                                               │
└──────────────────┼───────────────────────────────────────────────┘
                   │
                   │ HTTP POST
                   │
┌──────────────────┼───────────────────────────────────────────────┐
│                  ▼                                               │
│  ┌────────────────────────────────────────────────────────────┐  │
│  │ Next.js API Route                                          │  │
│  │ /api/v1/agent-governance/evaluate                          │  │
│  │ - Validates request (tenant_id, agent_id, action_type)     │  │
│  │ - Proxies to Python backend                                │  │
│  │ - Returns fail-closed response on errors                   │  │
│  └───────────────┬────────────────────────────────────────────┘  │
│                  │                                               │
│                  │ HTTP POST (internal)                          │
│                  ▼                                               │
└──────────────────┼───────────────────────────────────────────────┘
                   │
                   │
┌──────────────────┼───────────────────────────────────────────────┐
│                  ▼                                               │
│  ┌────────────────────────────────────────────────────────────┐  │
│  │ Python Backend (FastAPI)                                   │  │
│  │ /api/v1/agent-governance/enforce-action                    │  │
│  │ - AgentGovernanceService                                   │  │
│  │ - ACTION_COMPLEXITY mapping                                │  │
│  │ - MATURITY_REQUIREMENTS                                    │  │
│  │ - Database queries (agent_registry)                        │  │
│  │ - Governance decision logic                                │  │
│  └───────────────┬────────────────────────────────────────────┘  │
│                  │                                               │
│                  │                                               │
│  ┌───────────────┴────────────────────────────────────────────┐  │
│  │ PostgreSQL Database                                        │  │
│  │ - agent_registry (status, confidence_score)                │  │
│  │ - tenant_id filtering (security)                           │  │
│  └────────────────────────────────────────────────────────────┘  │
│                                                                  │
└──────────────────────────────────────────────────────────────────┘
API Contract
Frontend → Next.js API
**Endpoint:** POST /api/v1/agent-governance/evaluate
**Request:**
{
  tenant_id: string    // Required: Tenant ID for multi-tenancy
  agent_id: string     // Required: Agent to evaluate
  action_type: string  // Required: Action to perform
  context?: object     // Optional: Additional context
}
**Response (200 OK):**
{
  allowed: boolean            // Whether action is permitted
  reason: string              // Human-readable explanation
  requires_approval: boolean  // Whether human approval is needed
  maturity_level: 'student' | 'intern' | 'supervised' | 'autonomous'
  complexity: number          // Action complexity (1-4; 0 in fail-closed error responses)
  confidence_score?: number   // Agent confidence (0.0-1.0)
  budget_remaining?: number   // Remaining budget (optional)
}
**Error Responses:**
All errors return **200 OK with allowed: false** (fail-closed):
// Backend unavailable
{
  allowed: false,
  reason: 'Backend governance service unavailable',
  requires_approval: true,
  maturity_level: 'student',
  complexity: 0
}
// Invalid agent ID
{
  allowed: false,
  reason: 'Reserved or invalid agent ID',
  requires_approval: true,
  maturity_level: 'student',
  complexity: 0
}
// Agent not found
{
  allowed: false,
  reason: 'Agent not found',
  requires_approval: true,
  maturity_level: 'student',
  complexity: 0
}
Next.js API → Python Backend
**Endpoint:** POST /api/v1/agent-governance/enforce-action
**Request:**
{
  agent_id: string
  action_type: string
  action_details?: object
}
**Headers:**
X-Tenant-ID: <tenant_id>
X-User-ID: <user_id>
Authorization: Bearer <jwt_token>
**Response:** Same as frontend response format
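The two contracts above imply a small translation step inside the Next.js route: validate the three required fields, then move tenant_id into a header and forward the rest as the body. The sketch below shows one way that step could look. The helper names (missingFields, toBackendRequest) are hypothetical, and forwarding context as action_details is an assumption, since the source does not state how the two optional fields relate.

```typescript
// Hypothetical types mirroring the request shapes documented above.
interface EvaluateRequest {
  tenant_id: string
  agent_id: string
  action_type: string
  context?: Record<string, unknown>
}

interface BackendRequest {
  headers: Record<string, string>
  body: {
    agent_id: string
    action_type: string
    action_details?: Record<string, unknown>
  }
}

// Returns the required fields that are missing or are not non-empty strings.
function missingFields(input: Record<string, unknown>): string[] {
  const required = ['tenant_id', 'agent_id', 'action_type']
  return required.filter((key) => {
    const value = input[key]
    return typeof value !== 'string' || value.length === 0
  })
}

// Assumption: `context` is forwarded as `action_details`, and userId/jwt
// come from the caller's session (not shown here).
function toBackendRequest(req: EvaluateRequest, userId: string, jwt: string): BackendRequest {
  return {
    headers: {
      'Content-Type': 'application/json',
      'X-Tenant-ID': req.tenant_id,
      'X-User-ID': userId,
      'Authorization': `Bearer ${jwt}`,
    },
    body: {
      agent_id: req.agent_id,
      action_type: req.action_type,
      ...(req.context ? { action_details: req.context } : {}),
    },
  }
}
```

A route handler would presumably return the fail-closed response whenever missingFields() is non-empty, rather than calling the backend.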
Client-Side Caching
Cache Implementation
// Cache key format
const cacheKey = `${tenantId}:${agentId}:${actionType}`
// Cache entry structure
interface CacheEntry {
  data: GovernanceEvaluationResult
  expiry: number // Unix timestamp in milliseconds
}
// TTL configuration
private static CACHE_TTL_MS = 30 * 1000 // 30 seconds
Cache Behavior
| Scenario | Behavior | TTL |
|---|---|---|
| **Cache Hit** | Return cached data immediately | N/A |
| **Cache Miss** | Call backend API, cache response | 30s |
| **Backend Error** | Cache denied response | 10s (shorter) |
| **Agent Maturity Change** | Invalidate all agent cache entries | Immediate |
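The four rows above can be sketched as a small in-memory cache. This is a minimal illustration, not the service's actual code: the class name GovernanceCache, the ERROR_TTL_MS constant, the invalidateAgent method, and the injectable clock are all hypothetical, and the result type is trimmed to two fields for brevity.

```typescript
// Hypothetical, trimmed result type; the real GovernanceEvaluationResult has more fields.
interface GovernanceResult {
  allowed: boolean
  reason: string
}

interface CacheEntry {
  data: GovernanceResult
  expiry: number // Unix timestamp in milliseconds
}

const CACHE_TTL_MS = 30 * 1000 // normal responses
const ERROR_TTL_MS = 10 * 1000 // fail-closed denials after a backend error

class GovernanceCache {
  private entries = new Map<string, CacheEntry>()
  private now: () => number

  // The clock is injectable so expiry can be exercised without real timers.
  constructor(now: () => number = () => Date.now()) {
    this.now = now
  }

  private key(tenantId: string, agentId: string, actionType: string): string {
    return `${tenantId}:${agentId}:${actionType}`
  }

  // Returns the cached result, or undefined on a miss or an expired entry.
  get(tenantId: string, agentId: string, actionType: string): GovernanceResult | undefined {
    const k = this.key(tenantId, agentId, actionType)
    const entry = this.entries.get(k)
    if (!entry) return undefined
    if (entry.expiry <= this.now()) {
      this.entries.delete(k)
      return undefined
    }
    return entry.data
  }

  // Stores a result; backend-error denials get the shorter TTL.
  set(tenantId: string, agentId: string, actionType: string,
      data: GovernanceResult, isBackendError = false): void {
    const ttl = isBackendError ? ERROR_TTL_MS : CACHE_TTL_MS
    this.entries.set(this.key(tenantId, agentId, actionType), { data, expiry: this.now() + ttl })
  }

  // Drops every entry for one agent, e.g. after a maturity change.
  invalidateAgent(tenantId: string, agentId: string): void {
    const prefix = `${tenantId}:${agentId}:`
    for (const k of Array.from(this.entries.keys())) {
      if (k.startsWith(prefix)) this.entries.delete(k)
    }
  }
}
```

Caching denials for a shorter period keeps the system fail-closed during an outage while letting it recover quickly once the backend is reachable again.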
Cache Telemetry
The service tracks cache performance:
interface CacheTelemetry {
  hits: number      // Cache hits
  misses: number    // Cache misses
  errors: number    // Backend errors
  lastReset: number // Last telemetry reset timestamp
}
Telemetry is logged every 5 minutes:
[Governance] Cache telemetry (last 300s): hits=450, misses=50, errors=5, hit_rate=0.90
Fail-Closed Behavior
Security Principle
**Never allow an action without explicit backend approval.**
If the backend is unavailable, the frontend MUST deny the action rather than allow it.
Implementation
try {
  const response = await fetch('/api/v1/agent-governance/evaluate', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ tenant_id, agent_id, action_type })
  })
  if (!response.ok) {
    // Backend error -> DENY
    return {
      allowed: false,
      reason: 'Backend governance service unavailable',
      requires_approval: true,
      maturity_level: 'student',
      complexity: 0
    }
  }
  const result = await response.json()
  return result
} catch (error) {
  // Network error -> DENY
  return {
    allowed: false,
    reason: 'Backend governance service unavailable',
    requires_approval: true,
    maturity_level: 'student',
    complexity: 0
  }
}
Error Scenarios
| Error Type | Frontend Response | Cache TTL |
|---|---|---|
| **Backend 500** | allowed: false | 10s |
| **Network Timeout** | allowed: false | 10s |
| **Invalid Response** | allowed: false | 10s |
| **Tenant Not Found** | allowed: false | N/A (404) |
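The "Invalid Response" row implies that a parsed body should be checked before it is trusted. The sketch below shows one possible shape check that falls back to the documented fail-closed response. The function names (failClosed, parseGovernanceResult) are hypothetical, and reusing the "unavailable" reason for malformed bodies is an assumption.

```typescript
type MaturityLevel = 'student' | 'intern' | 'supervised' | 'autonomous'

interface GovernanceResult {
  allowed: boolean
  reason: string
  requires_approval: boolean
  maturity_level: MaturityLevel
  complexity: number
}

const MATURITY_LEVELS: string[] = ['student', 'intern', 'supervised', 'autonomous']

// The documented fail-closed response.
function failClosed(): GovernanceResult {
  return {
    allowed: false,
    reason: 'Backend governance service unavailable',
    requires_approval: true,
    maturity_level: 'student',
    complexity: 0,
  }
}

// Accepts an already-parsed JSON value. Returns it only if every required
// field has the documented type; otherwise denies.
function parseGovernanceResult(payload: unknown): GovernanceResult {
  if (typeof payload !== 'object' || payload === null) return failClosed()
  const p = payload as Record<string, unknown>
  const level = p['maturity_level']
  const valid =
    typeof p['allowed'] === 'boolean' &&
    typeof p['reason'] === 'string' &&
    typeof p['requires_approval'] === 'boolean' &&
    typeof level === 'string' &&
    MATURITY_LEVELS.indexOf(level) !== -1 &&
    typeof p['complexity'] === 'number'
  return valid ? (p as unknown as GovernanceResult) : failClosed()
}
```

Checking that allowed is strictly a boolean matters here: a truthy string such as "yes" would otherwise pass an if (result.allowed) test.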
Usage Examples
Frontend Usage
import { AgentGovernanceService } from '@/lib/ai/agent-governance'
// Initialize service
const db = new DatabaseService()
const governance = new AgentGovernanceService(db)
// Check if agent can perform action
const result = await governance.canPerformAction(
  'tenant_123',
  'agent_sales_bot',
  'delete'
)
if (result.allowed) {
  // Execute action
  await executeDeleteOperation()
} else {
  // Show error or request approval
  showMessage(result.reason)
  if (result.requires_approval) {
    await requestApproval(result)
  }
}
With Context
const result = await governance.canPerformAction(
  'tenant_123',
  'agent_sales_bot',
  'send_email',
  { recipient: 'user@example.com', subject: 'Important' }
)
Backend Usage (Python)
from core.agent_governance_service import AgentGovernanceService
# Initialize service
governance = AgentGovernanceService(db)
# Check if agent can perform action
result = await governance.enforce_action(
    agent_id='agent_sales_bot',
    action_type='delete',
    action_details={'resource': 'document_789'}
)
if result['proceed']:
    # Execute action
    await execute_delete()
else:
    # Log denial
    logger.info(f"Action denied: {result['reason']}")
Maturity Levels
| Level | Confidence | Max Complexity | Auto-Approve | Description |
|---|---|---|---|---|
| **student** | 0.0-0.5 | 1 (Read-only) | No | Learning from examples |
| **intern** | 0.5-0.7 | 2 (Analysis) | No | Can suggest, needs approval |
| **supervised** | 0.7-0.9 | 3 (Mutation) | Partial | Live monitoring required |
| **autonomous** | 0.9-1.0 | 4 (Critical) | Yes | Full autonomy |
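The confidence bands in the table share their boundary values (0.5, 0.7, 0.9), so any implementation has to pick a side. The sketch below assumes each band includes its lower bound, and that out-of-range or non-numeric scores fall back to student in keeping with the fail-closed principle. Both are assumptions, as is the idea that the level is derived from the score alone: the backend may instead read it from the status column. TypeScript is used only for consistency with the other examples; the authoritative logic lives in the Python backend.

```typescript
type MaturityLevel = 'student' | 'intern' | 'supervised' | 'autonomous'

// Assumption: lower bounds are inclusive (0.5 -> intern, 0.7 -> supervised,
// 0.9 -> autonomous). Invalid scores map to the most restricted level.
function maturityFromConfidence(score: number): MaturityLevel {
  if (!Number.isFinite(score) || score < 0 || score > 1) return 'student'
  if (score >= 0.9) return 'autonomous'
  if (score >= 0.7) return 'supervised'
  if (score >= 0.5) return 'intern'
  return 'student'
}
```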
Action Complexity
| Complexity | Actions | Maturity Required |
|---|---|---|
| **1 (Low)** | search, read, list, get, fetch, summarize | Student+ |
| **2 (Medium-Low)** | analyze, suggest, draft, generate, recommend | Intern+ |
| **3 (Medium)** | create, update, send_email, post_message, schedule | Supervised+ |
| **4 (High)** | delete, execute, deploy, transfer, payment, approve | Autonomous only |
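Taken together, the two tables describe a lookup followed by a comparison: find the action's complexity, find the maturity that complexity requires, and compare it with the agent's level. The sketch below illustrates that rule only. It ignores approval flows and budgets, the helper names are hypothetical, and treating unknown actions as complexity 4 is an assumption chosen to stay fail-closed. It is written in TypeScript for consistency with the other examples, although the real mapping lives in the Python backend.

```typescript
type MaturityLevel = 'student' | 'intern' | 'supervised' | 'autonomous'

// Ordered from least to most trusted.
const MATURITY_ORDER: MaturityLevel[] = ['student', 'intern', 'supervised', 'autonomous']

// Action names and complexities copied from the table above.
const ACTION_COMPLEXITY = new Map<string, number>([
  ['search', 1], ['read', 1], ['list', 1], ['get', 1], ['fetch', 1], ['summarize', 1],
  ['analyze', 2], ['suggest', 2], ['draft', 2], ['generate', 2], ['recommend', 2],
  ['create', 3], ['update', 3], ['send_email', 3], ['post_message', 3], ['schedule', 3],
  ['delete', 4], ['execute', 4], ['deploy', 4], ['transfer', 4], ['payment', 4], ['approve', 4],
])

// Assumption: actions missing from the table get the highest complexity.
function complexityOf(actionType: string): number {
  return ACTION_COMPLEXITY.get(actionType) ?? 4
}

function requiredMaturity(complexity: number): MaturityLevel {
  if (complexity <= 1) return 'student'
  if (complexity === 2) return 'intern'
  if (complexity === 3) return 'supervised'
  return 'autonomous'
}

interface PermissionCheck {
  allowed: boolean
  complexity: number
  required: MaturityLevel
}

// True when the agent's level is at or above the level the action requires.
function checkPermission(level: MaturityLevel, actionType: string): PermissionCheck {
  const complexity = complexityOf(actionType)
  const required = requiredMaturity(complexity)
  const allowed = MATURITY_ORDER.indexOf(level) >= MATURITY_ORDER.indexOf(required)
  return { allowed, complexity, required }
}
```

For instance, checkPermission('intern', 'delete') yields allowed: false with complexity 4 and required 'autonomous'.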
Troubleshooting
Backend Unavailable
**Symptom:** All actions denied with "Backend governance service unavailable"
**Debug Steps:**
- Check Python backend health:
  curl http://localhost:8000/health/live
- Check Next.js API logs for proxy errors
- Verify network connectivity between Next.js and Python
**Solution:**
- Start Python backend:
  cd backend-saas && uvicorn main:app --reload
- Check backend logs for errors
- Verify internal networking configuration
Cache Issues
**Symptom:** Stale governance decisions (old maturity level used)
**Debug Steps:**
- Check cache telemetry logs
- Verify cache TTL (should be 30s)
- Check if agent maturity changed recently
**Solution:**
- Wait 30s for cache expiry
- Manually invalidate cache by updating agent score
- Restart frontend to clear cache
Permission Denied
**Symptom:** Action denied despite correct maturity level
**Debug Steps:**
- Check agent status in database:
  SELECT status FROM agent_registry WHERE id = 'agent_abc'
- Check action complexity mapping
- Review backend governance logs
**Solution:**
- Promote agent to higher maturity level
- Reduce action complexity (use alternative approach)
- Request manual approval
Performance
Metrics
| Metric | Target | Current |
|---|---|---|
| **Cache Hit Rate** | >50% | ~80% (measured in production) |
| **API Latency (p50)** | <100ms | ~50ms |
| **API Latency (p99)** | <500ms | ~200ms |
| **Cache Lookup** | <1ms | <0.1ms |
Optimization Tips
- **Batch Governance Checks:** Check multiple actions in parallel
- **Prefetch:** Check governance before user interaction
- **Cache Warming:** Populate cache with common actions on startup
- **Monitoring:** Track cache hit rate to optimize TTL
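The batching tip can be sketched with Promise.all. The helper below is a hypothetical wrapper, not part of the documented service: it takes any function with the same shape as canPerformAction, de-duplicates the action list, runs the checks concurrently, and converts a rejected check into a denial so that one failure cannot leave an action implicitly allowed.

```typescript
// Hypothetical, trimmed result type for illustration.
interface GovernanceResult {
  allowed: boolean
  reason: string
}

type CheckFn = (tenantId: string, agentId: string, actionType: string) => Promise<GovernanceResult>

// Runs one governance check per distinct action in parallel and returns
// the results keyed by action type.
async function checkActions(
  check: CheckFn,
  tenantId: string,
  agentId: string,
  actionTypes: string[]
): Promise<Map<string, GovernanceResult>> {
  const unique = Array.from(new Set(actionTypes))
  const pairs = await Promise.all(
    unique.map(async (action): Promise<[string, GovernanceResult]> => {
      try {
        return [action, await check(tenantId, agentId, action)]
      } catch {
        // A failed check is treated as a denial (fail-closed).
        return [action, { allowed: false, reason: 'Backend governance service unavailable' }]
      }
    })
  )
  return new Map(pairs)
}
```

With the real service, the first argument might be (t, a, action) => governance.canPerformAction(t, a, action), which also lets each check benefit from the client-side cache.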
Security
Threat Model
| Threat | Mitigation |
|---|---|
| **Client Bypass** | Backend is authoritative, frontend cannot override |
| **Cache Poisoning** | Cache is read-only, short TTL (30s) |
| **Tenant Leakage** | All queries filtered by tenant_id |
| **Agent Spoofing** | Reserved names blocked (admin, root, system) |
| **Backend Down** | Fail-closed denies all actions |
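The agent-spoofing mitigation amounts to a reserved-name check. A minimal sketch follows, assuming the three names listed above are the full reserved set and that matching ignores case and surrounding whitespace; neither detail is confirmed by the source. In the v2.0 architecture this check belongs on the backend, so the TypeScript here is purely illustrative.

```typescript
// Assumption: these three names are the complete reserved set.
const RESERVED_AGENT_NAMES = new Set<string>(['admin', 'root', 'system'])

// Assumption: comparison is case-insensitive and ignores surrounding whitespace.
function isReservedAgentId(agentId: string): boolean {
  return RESERVED_AGENT_NAMES.has(agentId.trim().toLowerCase())
}

// An ID is usable only if it is non-empty after trimming and not reserved.
function isValidAgentId(agentId: string): boolean {
  return agentId.trim().length > 0 && !isReservedAgentId(agentId)
}
```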
Audit Trail
All governance decisions logged:
{
  "event": "governance_check",
  "timestamp": "2026-04-12T20:00:00.000Z",
  "tenant_id": "tenant_123",
  "agent_id": "agent_sales_bot",
  "action_type": "delete",
  "decision": "DENIED",
  "reason": "Agent intern lacks maturity for delete (Req: autonomous)",
  "maturity": "intern",
  "complexity": 4,
  "latency_ms": 45
}
Migration Notes
From Client-Side Governance (v1.0)
**Changes:**
- ✅ Removed ACTION_COMPLEXITY constant from frontend
- ✅ Removed MATURITY_REQUIREMENTS constant from frontend
- ✅ Removed RESERVED_AGENT_NAMES validation from frontend
- ✅ Refactored canPerformAction() to call backend API
- ✅ Added client-side caching (30s TTL)
- ✅ Implemented fail-closed behavior
**Benefits:**
- Single source of truth (backend)
- Real-time governance updates
- No code duplication
- Enhanced security (server-side validation)
Rollback Plan
If issues arise, revert to client-side governance:
- Restore ACTION_COMPLEXITY and MATURITY_REQUIREMENTS constants
- Revert canPerformAction() to client-side logic
- Remove backend API calls
- Clear client-side cache
Related Documentation
- **Backend API Contracts:** backend-saas/docs/GOVERNANCE_API_CONTRACTS.md
- **Agent Governance Service:** backend-saas/core/agent_governance_service.py
- **Frontend Service:** src/lib/ai/agent-governance.ts
- **API Route:** src/app/api/v1/agent-governance/evaluate/route.ts
Changelog
v2.0 (2026-04-12)
**Breaking Changes:**
- Frontend now requires backend API for governance decisions
- Client-side constants removed, use backend API instead
**New Features:**
- Unified frontend-backend governance architecture
- Client-side caching with 30s TTL
- Fail-closed behavior on backend errors
- Cache telemetry tracking
- Audit logging with latency
**Bug Fixes:**
- Fixed code duplication between frontend and backend
- Fixed stale governance decisions in frontend cache
- Fixed security risk of client-side bypass
v1.0 (Legacy)
- Client-side governance with hardcoded constants
- No backend API integration
- Vulnerable to client manipulation