Agent Security Model
BaselineOS provides defense-in-depth for AI agent operations. Every layer is designed to contain the blast radius if any single component is compromised.
The Threat Model
AI agents operate with real credentials and real infrastructure access. The threats:
| Threat | Example | Traditional defense | BaselineOS defense |
|---|---|---|---|
| Credential exfiltration | Prompt injection causes agent to leak API keys | Don’t put keys in prompts | Opaque grants — agent never sees raw values |
| Credential persistence | Compromised agent retains access indefinitely | Rotate keys manually | Time-boxed leases — auto-expire in seconds |
| Lateral movement | Agent with deploy access uses it for data exfiltration | Network segmentation | Per-agent scoping — each agent sees only its credentials |
| Privilege escalation | Low-trust agent performs high-risk operations | RBAC | Trust scoring — higher-risk steps require higher trust |
| No accountability | Unknown agent accessed production credentials | Manual log review | Persistent audit trail — every access, every accessor, every timestamp |
| Uncontrolled execution | Agent invents its own workflow with arbitrary commands | Hope for the best | Execution protocols — agent follows defined steps with defined access |
Defense Layers
```
┌─────────────────────────────────────────────────────────────────┐
│  Layer 5: Execution Protocols                                   │
│  Agent follows defined steps, not freeform commands             │
├─────────────────────────────────────────────────────────────────┤
│  Layer 4: Trust Scoring                                         │
│  Per-protocol, per-step, per-credential trust gates             │
├─────────────────────────────────────────────────────────────────┤
│  Layer 3: Opaque Credential Grants                              │
│  Agent gets handle, not value; can use but can't exfiltrate     │
├─────────────────────────────────────────────────────────────────┤
│  Layer 2: Time-Boxed Leases                                     │
│  Credentials auto-expire after step timeout                     │
├─────────────────────────────────────────────────────────────────┤
│  Layer 1: Encrypted Vault                                       │
│  AES-256-CBC at rest, per-agent scoping, persistent audit trail │
└─────────────────────────────────────────────────────────────────┘
```

Each layer contains failures in the layers above it:
- Protocol defines the steps → but what if the step is compromised?
- Trust scoring gates access → but what if trust is too high?
- Opaque grants hide values → but what if the runner is compromised?
- Leases auto-expire → but what if the vault is compromised?
- Encryption protects storage → bottom of the stack
What Happens When Something Is Compromised
Agent is compromised (prompt injection, jailbreak)
Traditional: Agent has `VERCEL_TOKEN=vcel_...` in its environment. Attacker has the token forever.
BaselineOS:
- Agent has a lease handle (`lease-xxx`), not the token value
- The lease expires in seconds (step timeout)
- Even if the attacker extracts the handle, it resolves to nothing after the step ends
- The agent never had the raw value in its memory or context
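
The handle-versus-value behaviour in the list above can be sketched as a small in-memory store. This is a minimal illustration under stated assumptions, not the BaselineOS implementation: `LeaseStore`, `issue`, `resolve`, and `revoke` are hypothetical names, and the handle generator is a placeholder for a cryptographically secure one.

```typescript
// Hypothetical sketch: LeaseStore and its methods are illustrative names,
// not the BaselineOS API.
type Lease = { secret: string; expiresAt: number };

class LeaseStore {
  private leases = new Map<string, Lease>();
  private counter = 0;
  private readonly now: () => number;

  // The clock is injectable so expiry can be tested without waiting.
  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  // Issue an opaque handle. Only the handle is handed to the agent.
  issue(secret: string, ttlMs: number): string {
    // Assumption: a real system would derive this from a CSPRNG.
    this.counter += 1;
    const handle = `lease-${this.counter}-${Math.random().toString(36).slice(2)}`;
    this.leases.set(handle, { secret, expiresAt: this.now() + ttlMs });
    return handle;
  }

  // Intended for the trusted runner only; expired handles resolve to nothing.
  resolve(handle: string): string | undefined {
    const lease = this.leases.get(handle);
    if (lease === undefined) return undefined;
    if (this.now() >= lease.expiresAt) {
      this.leases.delete(handle);
      return undefined;
    }
    return lease.secret;
  }

  // Explicit revocation, for example when a step finishes early.
  revoke(handle: string): void {
    this.leases.delete(handle);
  }
}
```

In this sketch an attacker who extracts the handle after the step timeout gets `undefined` back from `resolve`, and the handle string itself never contains the secret.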
Credential is leaked
Traditional: Leaked `ANTHROPIC_API_KEY` is valid until manually rotated. Could be months.
BaselineOS:
- Leased credentials auto-expire after TTL (seconds to minutes)
- Vault audit trail shows exactly when and how the credential was accessed
- `vault.rotate('anthropic', newKey)` immediately invalidates all active leases
- `vault.getAccessLog()` shows the full forensic trail
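
One way rotation could invalidate outstanding leases is a per-credential version counter: each lease remembers the version it was issued against, and a rotation bumps the version. The sketch below is a toy model of that idea. `rotate` and `getAccessLog` are named after the calls mentioned above, but their signatures and everything else here (`SketchVault`, `lease`, `resolve`, the log shape) are assumptions for illustration.

```typescript
// Toy model only: not the BaselineOS CredentialVault.
type AccessEvent = {
  at: number;
  accessor: string;
  credential: string;
  action: "lease" | "resolve" | "rotate" | "rejected";
};
type StoredCredential = { value: string; version: number };
type ActiveLease = { credential: string; version: number; expiresAt: number };

class SketchVault {
  private credentials = new Map<string, StoredCredential>();
  private leases = new Map<string, ActiveLease>();
  private log: AccessEvent[] = [];
  private nextId = 0;
  private readonly now: () => number;

  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  private record(accessor: string, credential: string, action: AccessEvent["action"]): void {
    this.log.push({ at: this.now(), accessor, credential, action });
  }

  // Store a new value; bumping the version orphans every earlier lease.
  rotate(credential: string, newValue: string, accessor: string = "operator"): void {
    const previous = this.credentials.get(credential);
    const version = (previous === undefined ? 0 : previous.version) + 1;
    this.credentials.set(credential, { value: newValue, version });
    this.record(accessor, credential, "rotate");
  }

  lease(credential: string, accessor: string, ttlMs: number): string {
    const stored = this.credentials.get(credential);
    if (stored === undefined) throw new Error(`unknown credential: ${credential}`);
    this.nextId += 1;
    const handle = `lease-${this.nextId}`;
    this.leases.set(handle, { credential, version: stored.version, expiresAt: this.now() + ttlMs });
    this.record(accessor, credential, "lease");
    return handle;
  }

  // A lease resolves only if it is unexpired and no rotation happened since issue.
  resolve(handle: string, accessor: string): string | undefined {
    const lease = this.leases.get(handle);
    if (lease === undefined) return undefined;
    const stored = this.credentials.get(lease.credential);
    if (stored === undefined || stored.version !== lease.version || this.now() >= lease.expiresAt) {
      this.leases.delete(handle);
      this.record(accessor, lease.credential, "rejected");
      return undefined;
    }
    this.record(accessor, lease.credential, "resolve");
    return stored.value;
  }

  // Return a copy so callers cannot rewrite history.
  getAccessLog(): AccessEvent[] {
    return this.log.map((event) => ({ ...event }));
  }
}
```

Every call, including rejected ones, appends an entry with a timestamp and accessor, so the log doubles as the forensic trail described above.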
Agent attempts unauthorized access
Traditional: Agent reads `process.env.DATABASE_URL`, and nothing stops it.
BaselineOS:
- Agent requests capability through protocol → trust score checked
- Agent trust 60 < required 85 for production database → denied
- Denial logged to audit trail with agent ID, trust score, and reason
- Agent cannot bypass — credential is not in env, only in vault
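
The gate described above amounts to a comparison followed by an audit write. A minimal sketch, assuming a policy shape with `minTrustScore` and `allowedAccessors` (field names borrowed from the `defineBaseline` example on this page); `checkAccess`, `denialReason`, and `auditTrail` are hypothetical names, and the real check may consider more inputs.

```typescript
// Hypothetical sketch of a trust gate with denial logging.
type CredentialPolicy = { minTrustScore: number; allowedAccessors?: string[] };
type Agent = { id: string; trustScore: number };
type Decision = { allowed: true } | { allowed: false; reason: string };
type AuditEntry = {
  agentId: string;
  trustScore: number;
  credential: string;
  allowed: boolean;
  reason: string;
};

const auditTrail: AuditEntry[] = [];

// Returns undefined when access is permitted, otherwise a human-readable reason.
function denialReason(agent: Agent, policy: CredentialPolicy): string | undefined {
  if (policy.allowedAccessors !== undefined && !policy.allowedAccessors.includes(agent.id)) {
    return `agent ${agent.id} is not an allowed accessor`;
  }
  if (agent.trustScore < policy.minTrustScore) {
    return `trust ${agent.trustScore} < required ${policy.minTrustScore}`;
  }
  return undefined;
}

// Every decision, allowed or denied, is written to the audit trail.
function checkAccess(agent: Agent, credential: string, policy: CredentialPolicy): Decision {
  const reason = denialReason(agent, policy);
  auditTrail.push({
    agentId: agent.id,
    trustScore: agent.trustScore,
    credential,
    allowed: reason === undefined,
    reason: reason === undefined ? "ok" : reason,
  });
  return reason === undefined ? { allowed: true } : { allowed: false, reason };
}
```

Calling `checkAccess({ id: 'data-agent', trustScore: 60 }, 'prod-db', { minTrustScore: 85 })` in this sketch returns a denial with the reason `trust 60 < required 85` and leaves one matching entry in `auditTrail`.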
Credential Flow Diagram
```
Organization defines:
  defineBaseline({
    credentials: {
      vercel: { envVar: 'VERCEL_TOKEN', minTrustScore: 60 },
      'prod-db': { envVar: 'DATABASE_URL', minTrustScore: 85,
                   scope: 'agent', allowedAccessors: ['data-agent'] },
    }
  })
        │
        ▼
CredentialVault auto-resolves from env
Encrypts with AES-256-CBC
Persists to SQLite
        │
        ▼
Protocol defines steps:
  S1: "Run tests"  (no credentials needed)
  S2: "Deploy"     (needs vercel-token, trust 60)
  S3: "Migrate DB" (needs prod-db, trust 85)
        │
        ▼
Agent starts execution:
  trust score: 75
        │
        ▼
S1: ✓ passes (no credentials)
S2: ✓ vercel-token → lease issued (TTL 60s) → handle to agent
      → runner injects → lease revoked
S3: ✗ prod-db → DENIED (trust 75 < required 85) → logged → step fails
```

Secret Scanning
The Context Integrity Engine scans all context entries for embedded secrets:
```ts
// These patterns are rejected on context registration:
const secretPatterns = [
  /(?:sk-|pk-|api[_-]?key)[a-zA-Z0-9\-_]{20,}/,  // API keys
  /(?:ghp|gho|ghu|ghs|ghr)_[A-Za-z0-9_]{36,}/,   // GitHub tokens
  /xox[bpors]-[A-Za-z0-9\-]+/,                   // Slack tokens
  /-----BEGIN (?:RSA |EC )?PRIVATE KEY-----/,    // Private keys
  /AKIA[0-9A-Z]{16}/,                            // AWS access keys
];
```

If an agent tries to register context containing secrets, it’s blocked before storage.
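
To make the "blocked before storage" behaviour concrete, here is a minimal sketch of how the patterns above might be applied. The regular expressions are copied from the list; `containsSecret` and `registerContext` are hypothetical helper names, and the array standing in for storage is an assumption.

```typescript
// Patterns copied from the list above. The helpers below are illustrative only.
const secretPatterns: RegExp[] = [
  /(?:sk-|pk-|api[_-]?key)[a-zA-Z0-9\-_]{20,}/, // API keys
  /(?:ghp|gho|ghu|ghs|ghr)_[A-Za-z0-9_]{36,}/, // GitHub tokens
  /xox[bpors]-[A-Za-z0-9\-]+/, // Slack tokens
  /-----BEGIN (?:RSA |EC )?PRIVATE KEY-----/, // Private keys
  /AKIA[0-9A-Z]{16}/, // AWS access keys
];

// True if any pattern matches anywhere in the content.
// None of the patterns use the `g` flag, so `test` carries no state between calls.
function containsSecret(content: string): boolean {
  return secretPatterns.some((pattern) => pattern.test(content));
}

// Reject the entry before it reaches storage; return whether it was stored.
function registerContext(store: string[], content: string): boolean {
  if (containsSecret(content)) {
    return false;
  }
  store.push(content);
  return true;
}
```

Pattern matching of this kind catches well-known token formats only; a secret with no recognisable prefix would pass through this sketch.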
API Server Security
The `/api/context/*` endpoints include:
| Control | Implementation |
|---|---|
| Authentication | API key required (Bearer token or X-API-Key header) |
| Authorization | authorizedBy derived from auth context, not user input |
| Rate limiting | 20 destructive operations per hour |
| Input validation | Content size limits (100KB), find/replace limits (10K chars) |
| Response sanitization | Snapshots stripped — no sensitive content in responses |
| Scope validation | Enum-validated scope parameter |
| Body size limit | 1MB max request body |
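
As an illustration of the rate-limiting row, the sketch below implements a sliding-window limiter with the table's figure of 20 operations per hour. The table does not say how the limit is keyed or which algorithm is used, so keying by API key and the sliding-window approach are both assumptions, and `SlidingWindowLimiter` is a hypothetical name.

```typescript
// Hypothetical sliding-window limiter; keying and algorithm are assumptions.
class SlidingWindowLimiter {
  private hits = new Map<string, number[]>();
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly now: () => number;

  constructor(limit: number, windowMs: number, now: () => number = () => Date.now()) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.now = now;
  }

  // Returns true and records the hit if the key is under its limit.
  tryAcquire(key: string): boolean {
    const current = this.now();
    const cutoff = current - this.windowMs;
    // Keep only timestamps that still fall inside the window.
    const recent = (this.hits.get(key) ?? []).filter((t) => t > cutoff);
    const allowed = recent.length < this.limit;
    if (allowed) {
      recent.push(current);
    }
    this.hits.set(key, recent);
    return allowed;
  }
}

// 20 destructive operations per hour, as in the table above.
const destructiveOps = new SlidingWindowLimiter(20, 60 * 60 * 1000);
```

A request handler would call `destructiveOps.tryAcquire(apiKey)` before performing a destructive operation and reject the request when it returns `false`.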
Compliance Mapping
| Standard | Control | BaselineOS Implementation |
|---|---|---|
| OWASP Agentic Top 10 | Tool Misuse | Execution protocols define allowed tools per step |
| OWASP Agentic Top 10 | Prompt Injection | Input sanitizer (11 patterns), opaque credentials |
| OWASP Agentic Top 10 | Excessive Agency | Trust scoring gates autonomous actions |
| SOC 2 | Access Control | Per-agent credential scoping, trust scoring |
| SOC 2 | Audit Trail | SQLite-persisted access log, lease lifecycle |
| SOC 2 | Encryption | AES-256-CBC at rest, TLS in transit |
| GDPR Art. 32 | Security of Processing | Encryption, access control, audit trail |
| ISO 27001 | A.9 Access Control | Trust-scored, scoped, time-boxed access |