My AI Agents Create Their Own Bug Fixes — But None of Them Have Credentials

AI agents that fix their own bugs sound like science fiction. But the real breakthrough isn’t the self-improvement — it’s running powerful autonomous systems without storing a single credential.

Most teams give AI agents standing permissions: API keys in environment variables, database passwords in config files, tokens that live forever. This creates a massive attack surface. A compromised agent becomes a compromised system.

Here’s how to build AI agents that create pull requests, analyze telemetry, and improve themselves — while holding zero credentials.

The Problem with Standing Credentials

Traditional AI agent security follows the “just in case” model:

  1. Create broad API credentials
  2. Store them in the agent’s environment
  3. Hope the agent uses them responsibly
  4. Forget about rotation

This approach treats AI agents like traditional software. But agents are fundamentally different — they take orders from text. Prompt injection attacks on agents with database credentials aren’t theoretical risks. They’re Tuesday.

You wouldn’t give an intern the admin password on day one. Don’t give it to a system that confidently makes mistakes.

Just-in-Time JWT Tokens

The solution: agents have zero standing permissions. No stored credentials anywhere in the container.

When a workflow needs an agent to access a service, the orchestrator creates a short-lived JWT token scoped to exactly what that agent needs:

// Orchestrator creates workflow-specific token
const token = createJWT({
  agent: "telemetry-analyzer",
  scope: "read:telemetry",
  workflow: "wf-7829",
  exp: "5min"
});

// Token goes to proxy, not agent
configureProxy(token);
startAgent(); // Agent never sees the token

The lifecycle works like this:

  1. Orchestrator receives task
  2. Creates JIT JWT for specific services and duration
  3. Configures container proxy with token
  4. Agent makes requests through proxy
  5. Proxy injects JWT, forwards request, strips auth headers
  6. Workflow completes, token expires

The agent never sees credentials. It calls proxy/telemetry/query; the proxy handles authentication. Prompt injection can’t steal what doesn’t exist.
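Step 5 of the lifecycle can be sketched as a small header-rewriting function (names are illustrative, not a specific proxy's API): the proxy drops any credential-like headers the agent may have set, then injects the JIT token.

```javascript
// Headers the proxy never forwards from the agent side
const STRIPPED = new Set(["authorization", "cookie", "x-api-key"]);

// Hypothetical helper: rebuild the header set for the upstream request
function buildUpstreamHeaders(agentHeaders, jitToken) {
  const upstream = {};
  for (const [name, value] of Object.entries(agentHeaders)) {
    if (STRIPPED.has(name.toLowerCase())) continue; // agent-supplied auth never passes through
    upstream[name] = value;
  }
  upstream["Authorization"] = `Bearer ${jitToken}`; // injected by the proxy
  return upstream;
}
```

Even if a prompt-injected agent fabricates its own Authorization header, it is stripped before the request leaves the container.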

Role-Based Access Control for Bots

Every agent type gets a role definition that the proxy enforces:

roles:
  crash-tracker:
    services: [crash-reporting]
    actions: [read]
    data: [crash-reports, stack-traces]
    limits: { max_requests_per_min: 30 }

  pr-creator:
    services: [code-repository]
    actions: [read, create-pr, create-branch]
    forbidden_paths: [auth/*, .ci/*, security/*]
    limits: { max_diff_lines: 500, max_files_changed: 10 }

Roles are defined in config, not prompts. The security model is structural — the proxy only exposes endpoints the role allows. Prompt injection can’t circumvent walls with no doors.
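An enforcement check for that role table might look like the sketch below (in practice the roles would be parsed from the YAML config at proxy startup; the isAllowed function is an assumption, not a real proxy API):

```javascript
// Role table mirroring the YAML config above
const roles = {
  "crash-tracker": { services: ["crash-reporting"], actions: ["read"] },
  "pr-creator": {
    services: ["code-repository"],
    actions: ["read", "create-pr", "create-branch"],
    forbiddenPaths: ["auth/", ".ci/", "security/"],
  },
};

// Deny-by-default check the proxy runs before forwarding any request
function isAllowed(agent, service, action, path = "") {
  const role = roles[agent];
  if (!role) return false; // unknown agents get nothing
  if (!role.services.includes(service)) return false;
  if (!role.actions.includes(action)) return false;
  if ((role.forbiddenPaths || []).some((p) => path.startsWith(p))) return false;
  return true;
}
```

The key property is deny-by-default: anything not explicitly listed in the role simply has no route through the proxy.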

Multi-Layer Validation

Every agent output goes through validation before execution:

Schema validation for routine outputs. The response either matches expected structure or fails.

Cross-evaluation with multiple LLMs for consequential decisions. The same question goes to two or three models, and consensus is measured through structured voting and adversarial debate.

Deterministic checks alongside LLM evaluation. Static analysis and regression tests catch obvious errors while models handle subtle issues.

async function validateAgentOutput(output, type) {
  // Always validate schema first
  const schemaValid = validateSchema(output, type);
  if (!schemaValid) return false;

  // Cross-evaluate important decisions
  if (type === 'create-pr' || type === 'alert-team') {
    const consensus = await crossEvaluate(output, ['gpt-4', 'claude-3', 'gemini-pro']);
    return consensus.confidence > 0.8;
  }

  return true;
}
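One plausible shape for the crossEvaluate call used above: each model votes approve or reject on the same output, and confidence is the approval fraction. The callModel parameter stands in for a real LLM client and is an assumption of this sketch.

```javascript
// Hypothetical consensus helper: fan the output to each model, tally votes
async function crossEvaluate(output, models, callModel) {
  const votes = await Promise.all(models.map((m) => callModel(m, output)));
  const approvals = votes.filter((v) => v === "approve").length;
  return { confidence: approvals / votes.length, votes };
}
```

With three models and a 0.8 threshold, a single dissenting vote (confidence 2/3) is enough to block the action — deliberately conservative for consequential decisions.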

The Meta-Workflow: Self-Improvement Without Self-Modification

The most interesting part: a special workflow analyzes logs from all other agents and creates improvement PRs.

The meta-workflow runs under its own restricted role:

  • Read-only access to log store
  • Write access to code repository (staging PRs only)
  • No access to production systems

It identifies patterns in the append-only logs:

Performance degradation: “The crash tracker took 3x longer than average on 12 reports today”

Security anomalies: “The telemetry agent made unusual request sequences”

Quality drift: “Classification accuracy dropped 15% this week”
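A detector for the first of those patterns could be as simple as the sketch below (the log-entry shape and the 3x threshold are assumptions for illustration):

```javascript
// Flag runs that took more than `factor` times the average duration,
// e.g. "the crash tracker took 3x longer than average"
function flagSlowRuns(entries, factor = 3) {
  const avg = entries.reduce((sum, e) => sum + e.durationMs, 0) / entries.length;
  return entries.filter((e) => e.durationMs > factor * avg);
}
```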

When it finds concrete problems, the meta-workflow drafts pull requests with proposed fixes, test evidence, and the specific log entries that triggered the analysis.

# Example meta-workflow output
title: "Fix telemetry agent prompt causing classification drift"
evidence:
  - log_entries: [wf-7829, wf-7831, wf-7834]
  - accuracy_drop: 15%
  - response_time_increase: 200ms
proposed_fix:
  - updated_prompt: "Focus on error patterns, not just error counts"
  - test_results: "98% accuracy on synthetic data"

The system improves itself with strict constraints:

  • PRs can’t modify tests, auth, or CI configuration
  • Full test suite must pass before review
  • Human approval required for all merges
  • No irreversible actions without human oversight
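A pre-review gate enforcing those constraints might look like this sketch — the protected paths and size limits mirror the pr-creator role config earlier, while the PR shape and gatePR function are assumptions:

```javascript
// Paths the meta-workflow may never touch, per the constraints above
const PROTECTED_PATHS = ["auth/", ".ci/", "security/", "tests/"];

// Reject oversized or out-of-bounds PRs before they reach a human
function gatePR(pr) {
  if (pr.files.length > 10) return { ok: false, reason: "too many files changed" };
  const totalDiff = pr.files.reduce((sum, f) => sum + f.diffLines, 0);
  if (totalDiff > 500) return { ok: false, reason: "diff exceeds 500 lines" };
  const hit = pr.files.find((f) => PROTECTED_PATHS.some((p) => f.path.startsWith(p)));
  if (hit) return { ok: false, reason: `touches protected path: ${hit.path}` };
  return { ok: true }; // still requires passing tests and human approval
}
```

Passing the gate only earns the PR a human review — it never grants an automatic merge.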

What Can Still Go Wrong

No system is bulletproof. Here’s the honest threat model:

| Threat | Mitigation | Residual Risk |
| --- | --- | --- |
| Prompt injection | Agents can’t expand permissions; proxy enforces boundaries | Agent could waste compute within allowed scope |
| Data exfiltration | No network egress; output monitoring catches anomalies | Agent could encode data in normal outputs |
| Log poisoning | Append-only store; agents can’t read/modify logs | Compromised logging pipeline |
| Self-reinforcing bugs | PRs can’t modify guardrails; human reviews required | Subtle regressions that pass tests |

The goal isn’t eliminating risk — it’s reducing blast radius. “Agent wastes 5 minutes of compute” is very different from “agent has the database password.”

The Boring Parts Enable the Interesting Parts

Most of this article covers tokens, proxies, and logging. Not AI models or clever prompts.

That’s intentional.

The ambitious features — self-healing workflows, autonomous PR creation, cross-model evaluation — only work because the infrastructure is solid. JIT tokens prevent credential leaks. Container proxies contain prompt injection. RBAC prevents cascade failures.

The boring infrastructure is the product. The AI agents are just tenants.

Start with the proxy. Start with token lifecycle. Start with logging. Get the padded room right, then worry about what the agent inside it is saying.

Once the foundation is secure, the autonomous parts take care of themselves.