BYOK Key Fix Deployment Summary
**Date:** 2026-05-01
**Deployment:** atom-saas version with pre-flight API key validation
**Image:** registry.fly.io/atom-saas:deployment-01KQHXM3XJ9JY595H6NG0GXADJ
Problem Solved
Backfill jobs were failing during LLM extraction because:
- **Wrong key being used**: The system was falling back to the global DeepSeek key ending in `6100` instead of the tenant's key ending in `4207`
- **tenant_id not resolved**: `LLMService` was initialized with `workspace_id` as `tenant_id`, so the BYOK lookup failed
- **Silent failures**: Auth errors were caught and silently skipped, making debugging difficult
Commits Deployed
1. `6303c13f57` - Resolve tenant_id from workspace_id
**File:** backend-saas/core/graphrag_engine.py
# Before: Used workspace_id as tenant_id (WRONG)
llm = LLMService(db=self.db, workspace_id=workspace_id, tenant_id=workspace_id)
# After: Resolve tenant_id from Workspace table (CORRECT)
w = session.query(Workspace).filter(Workspace.id == workspace_id).first()
resolved_tenant_id = str(w.tenant_id) if w and w.tenant_id else workspace_id
llm = LLMService(db=self.db, workspace_id=workspace_id, tenant_id=resolved_tenant_id)
**Impact:** BYOK key lookup now uses the correct tenant_id
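The resolution step can be sketched in isolation as follows. This is a minimal illustration, not the code from `graphrag_engine.py`: `WorkspaceRow`, `resolve_tenant_id`, and the in-memory `workspaces` dict are hypothetical stand-ins for the ORM model and session query, and the IDs are made up.

```python
import logging
from dataclasses import dataclass
from typing import Dict, Optional

logger = logging.getLogger(__name__)


@dataclass
class WorkspaceRow:
    """Hypothetical stand-in for a row of the Workspace table."""
    id: str
    tenant_id: Optional[str]


def resolve_tenant_id(workspaces: Dict[str, WorkspaceRow], workspace_id: str) -> str:
    """Return the owning tenant's id, or workspace_id if none can be resolved."""
    row = workspaces.get(workspace_id)
    if row is not None and row.tenant_id:
        return str(row.tenant_id)
    # Same fallback as the committed snippet; logged here so a miss is visible.
    logger.warning("No tenant_id for workspace %s; using workspace_id", workspace_id)
    return workspace_id


# Illustrative data only (hypothetical IDs).
workspaces = {"ws-1": WorkspaceRow(id="ws-1", tenant_id="tenant-1")}
resolved = resolve_tenant_id(workspaces, "ws-1")
unresolved = resolve_tenant_id(workspaces, "ws-missing")
```

When no workspace row or `tenant_id` is found, the fallback still passes `workspace_id` as the tenant, so the BYOK lookup would miss in that case; logging the fallback is one way to keep that visible.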
2. `077bbdc115` - Remove dangerous global key fallback
**File:** backend-saas/core/byok_endpoints.py
# REMOVED: Dangerous fallback to global keys
fallback_key_id = f"{provider_id}_{key_name}_{environment}" # e.g., "deepseek_default_production"
if fallback_key_id in self.api_keys:
    return decrypt(self.api_keys[fallback_key_id])  # ← REMOVED THIS LINE
**Impact:** No longer returns global keys when tenant key lookup fails
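One way to picture the behavior after the fallback is removed is a lookup that returns a key only on an exact tenant match. The sketch below is an assumption-laden illustration: `TenantKeyStore` and `get_api_key` are hypothetical names, not the API in `byok_endpoints.py`, and encryption is left out.

```python
from typing import Dict, Optional, Tuple


class TenantKeyStore:
    """Hypothetical in-memory store keyed by (tenant_id, provider_id)."""

    def __init__(self, tenant_keys: Dict[Tuple[str, str], str]) -> None:
        self._tenant_keys = dict(tenant_keys)

    def get_api_key(self, tenant_id: str, provider_id: str) -> Optional[str]:
        # Only an exact tenant match is returned; there is no global fallback,
        # so a miss surfaces as None instead of another party's key.
        return self._tenant_keys.get((tenant_id, provider_id))


# Illustrative data only.
store = TenantKeyStore({("tenant-a", "deepseek"): "sk-tenant-a-example"})
found = store.get_api_key("tenant-a", "deepseek")
missing = store.get_api_key("tenant-b", "deepseek")
```

Returning `None` on a miss forces the caller to decide what to do, rather than silently billing or authenticating against a shared key.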
3. `a51e194459` - Raise AuthenticationError on auth failure
**File:** Multiple files in BYOKHandler
# Before: Silently skip on auth failure
except Exception as e:
    logger.warning(f"Provider {provider_id} failed: {e}")
    continue  # Try next provider
# After: Raise clear error
raise AuthenticationError(
    f"Failed to authenticate with {provider_id}: {auth_error}"
)
**Impact:** Clear error messages when API keys are invalid
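The difference between the two behaviors can be sketched as a provider loop that fails fast on auth errors but still moves on after other failures. This is a hedged illustration, not the BYOKHandler code: `generate_with_failover` is a hypothetical name, `AuthenticationError` is redefined locally, and the use of `PermissionError` to represent a 401 from a provider client is an assumption made for the example.

```python
import logging
from typing import Callable, List, Tuple

logger = logging.getLogger(__name__)


class AuthenticationError(Exception):
    """Local stand-in for the AuthenticationError named in the commit."""


def generate_with_failover(
    providers: List[Tuple[str, Callable[[str], str]]], prompt: str
) -> str:
    last_error = None
    for provider_id, call in providers:
        try:
            return call(prompt)
        except PermissionError as auth_error:
            # Assumption: a 401 from the provider surfaces as PermissionError.
            raise AuthenticationError(
                f"Failed to authenticate with {provider_id}: {auth_error}"
            ) from auth_error
        except Exception as e:
            # Non-auth failures (timeouts etc.) still fall through to the next provider.
            logger.warning("Provider %s failed: %s", provider_id, e)
            last_error = e
    raise RuntimeError("All providers failed") from last_error


def bad_key_provider(prompt: str) -> str:
    raise PermissionError("401 Unauthorized")


def flaky_provider(prompt: str) -> str:
    raise TimeoutError("timed out")


def healthy_provider(prompt: str) -> str:
    return "ok:" + prompt


result = generate_with_failover(
    [("flaky", flaky_provider), ("healthy", healthy_provider)], "hi"
)
try:
    generate_with_failover(
        [("deepseek", bad_key_provider), ("healthy", healthy_provider)], "hi"
    )
    auth_message = ""
except AuthenticationError as exc:
    auth_message = str(exc)
```

Here a timeout still triggers failover, while an invalid key stops the loop with a message naming the provider.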
4. `ab01dc1cb5` - Pre-flight API key check
**File:** backend-saas/core/historical_sync_service.py
# Check BEFORE fetching records (lines 624-648)
openai_key = db.query(TenantSetting).filter(
    TenantSetting.tenant_id == tenant_id,
    TenantSetting.setting_key == "OPENAI_API_KEY"
).first()
has_openai_key = (
    openai_key
    and openai_key.setting_value
    and not openai_key.setting_value.startswith("mock")
)
can_use_graphrag = has_graphrag_access and has_openai_key
if not can_use_graphrag:
    logger.warning(
        f"Skipping backfill: Tenant {tenant_id} has no valid API key. "
        f"Add key in Settings or skip GraphRAG extraction."
    )
    return  # Stop immediately, don't fetch records
**Impact:**
- **Faster feedback:** Job stops immediately if no API key (vs fetching all records then failing)
- **Clearer logs:** Explicit warning about missing API key
- **No wasted resources:** Doesn't fetch emails that can't be processed
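The ordering that gives the faster feedback (validate first, fetch second) can be sketched without a database. The names `has_usable_key`, `run_backfill`, and `fake_fetch` are hypothetical, and a plain dict stands in for the `tenant_settings` table.

```python
from typing import Callable, Dict, List, Optional


def has_usable_key(setting_value: Optional[str]) -> bool:
    """Mirror the check above: present, non-empty, and not a mock placeholder."""
    return bool(setting_value) and not setting_value.startswith("mock")


def run_backfill(
    tenant_settings: Dict[str, str],
    has_graphrag_access: bool,
    fetch_records: Callable[[], List[str]],
) -> str:
    key = tenant_settings.get("OPENAI_API_KEY")
    if not (has_graphrag_access and has_usable_key(key)):
        # Pre-flight failure: return before any records are fetched.
        return "skipped"
    records = fetch_records()
    return f"processed {len(records)} records"


fetch_calls: List[int] = []


def fake_fetch() -> List[str]:
    # Records every call so the example can show when fetching was avoided.
    fetch_calls.append(1)
    return ["email-1", "email-2"]


skipped = run_backfill({"OPENAI_API_KEY": "mock-key"}, True, fake_fetch)
processed = run_backfill({"OPENAI_API_KEY": "sk-example"}, True, fake_fetch)
```

After both calls, `fetch_calls` contains a single entry, since the run with the mock key returned before fetching anything.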
Expected Behavior
Scenario 1: Tenant with Valid API Key (Brennan)
- ✅ Checks tenant_settings for `OPENAI_API_KEY`
- ✅ Finds the tenant's DeepSeek key ending in `4207`
- ✅ Resolves tenant_id from workspace_id
- ✅ Uses tenant's key (not global fallback)
- ✅ LLM extraction succeeds
- ✅ Entities and relationships extracted
Scenario 2: Tenant Without API Key
- ✅ Checks tenant_settings for `OPENAI_API_KEY`
- ✅ Key not found, empty, or a mock placeholder
- ✅ Logs clear warning: "Skipping backfill: Tenant has no valid API key"
- ✅ Stops immediately (doesn't fetch records)
- ✅ Job marked as failed with clear reason
Scenario 3: Tenant with Invalid API Key
- ✅ Checks tenant_settings for `OPENAI_API_KEY`
- ✅ Key found but invalid (401 error)
- ✅ Raises `AuthenticationError` with clear message
- ✅ Job marked as failed with auth error details
Database State
**Brennan's tenant (verified):**
-- Tenant
SELECT id, subdomain, plan_type FROM tenants WHERE subdomain = 'brennan';
-- Result: 31c06fc4-db22-4740-83ea-48ac14f25810 | brennan | team
-- Workspace
SELECT id, tenant_id FROM workspaces WHERE tenant_id = '31c06fc4-db22-4740-83ea-48ac14f25810';
-- Result: 795c2ec9-b794-47ea-9aae-12c1c3d48589 | 31c06fc4-db22-4740-83ea-48ac14f25810
-- API Keys
SELECT setting_key, LENGTH(setting_value), RIGHT(setting_value, 8)
FROM tenant_settings
WHERE tenant_id = '31c06fc4-db22-4740-83ea-48ac14f25810'
AND setting_key LIKE '%API_KEY%';
-- Results:
-- DEEPSEEK_API_KEY    | 35  | ...4f474207 ✅ CORRECT KEY
-- OPENAI_API_KEY      | 164 | ...CQnbyPMA ✅ CORRECT KEY
-- GOOGLE_API_KEY      | 39  | ...LLfokymk
-- MINIMAX_2_7_API_KEY | 126 | ...7onMexq8
Testing
To verify the fix:
- **Trigger a backfill** for the brennan tenant
- **Check logs** for: "Resolved tenant_id from workspace_id"
- **Check logs** for: "Using tenant's DEEPSEEK_API_KEY"
- **Verify** LLM extraction succeeds
- **Verify** entities and relationships are created
To test the pre-flight check:
- **Remove** the OPENAI_API_KEY from tenant_settings temporarily
- **Trigger** a backfill
- **Verify** job fails immediately with: "Skipping backfill: Tenant has no valid API key"
- **Verify** NO records are fetched (faster feedback)
- **Restore** the API key
Future Work
For **managed AI tenants** (platform provides keys):
- `LLMService` will handle key management transparently
- No code change needed
- System will use platform keys instead of BYOK keys
The current fix ensures BYOK tenants' keys are found correctly without falling back to global keys.