Execution Protocols — How Agents Carry Out Work

BaselineOS execution protocols define HOW agents carry out work — the steps, the order, the credentials needed, the trust required, and how to verify it was done right. Any agent (Claude, GPT, Gemini, a bash script) follows the same protocol.


import {
  ProtocolRegistry, ProtocolExecutor, ProtocolRunner,
  CredentialVault,
} from 'baselineos';

// 1. Set up credentials
const vault = new CredentialVault({
  persistPath: '.baseline/vault',
  masterKey: process.env.BASELINE_VAULT_KEY!,
});
await vault.initialize();
vault.store('vercel-token', process.env.VERCEL_TOKEN!, {
  type: 'bearer-token',
  provider: 'vercel',
});

// 2. Define a protocol
const registry = new ProtocolRegistry();
registry.register({
  id: 'deploy-staging',
  name: 'Deploy to Staging',
  description: 'Build, test, and deploy to staging environment',
  minTrustScore: 60,
  author: 'devops-team',
  version: '1.0',
  tags: ['deployment', 'staging'],
  steps: [
    {
      id: 'S1',
      name: 'Run tests',
      instructions: 'Execute the full test suite',
      command: 'pnpm test',
      requiredCapabilities: [],
      dependsOn: [],
      validation: {
        type: 'output-contains',
        expected: 'passed',
        failureMessage: 'Test suite did not pass',
      },
      critical: true,
    },
    {
      id: 'S2',
      name: 'Build',
      instructions: 'Create production build',
      command: 'pnpm build',
      requiredCapabilities: [],
      dependsOn: ['S1'],
      critical: true,
    },
    {
      id: 'S3',
      name: 'Deploy to staging',
      instructions: 'Push to Vercel staging',
      command: 'vercel deploy --target preview',
      requiredCapabilities: [
        {
          credentialName: 'vercel-token',
          envVar: 'VERCEL_TOKEN',
          reason: 'Vercel deployment access',
        },
      ],
      dependsOn: ['S2'],
      validation: {
        type: 'output-contains',
        expected: 'https://',
        failureMessage: 'No deployment URL in output',
      },
      timeoutMs: 120_000,
      critical: true,
    },
  ],
});

// 3. Run it
const executor = new ProtocolExecutor(vault);
const runner = new ProtocolRunner(executor, registry, {
  opaqueCredentials: true, // Agent never sees raw token values
});
const execution = await runner.run('deploy-staging', 'deploy-agent', 75);
// → S1: tests pass ✓
// → S2: build succeeds ✓
// → S3: vault issues time-boxed lease for VERCEL_TOKEN,
//       injects into child process, deploys, lease revoked ✓

A protocol is a contract between the organization and the agent:

"Here's what needs to happen, in what order, with what access,
and how we verify it was done right."

It contains ordered steps — each step has:

  • A command to execute (optional — steps without commands are agent-driven)
  • Required capabilities — credentials from the vault
  • Dependencies — which steps must complete first
  • Validation — how to verify the step succeeded
  • Trust requirement — minimum trust score for this step
  • A critical flag — if a critical step fails, the protocol aborts

An execution is a specific run of a protocol by a specific agent. It tracks:

  • Which steps completed, failed, or are blocked
  • Which credentials were granted or denied
  • Duration of each step
  • Overall progress and status

When a step needs credentials, the executor requests them from the vault. Depending on the security mode:

| Mode | What agent gets | Security level |
| --- | --- | --- |
| Opaque (default) | A lease handle (can't read the value) | Highest (Anthropic Managed Agents pattern) |
| Transparent | The raw credential value | Lower (traditional pattern) |

A lease is time-boxed access to a credential, a concept borrowed from HashiCorp Vault:

  • The lease is valid only for the step’s timeout duration
  • After the step completes (success or failure), the lease is revoked
  • If the agent is compromised later, any lease it held has already been revoked and grants no access
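
The lease lifecycle described above can be sketched in a few lines. This is a minimal illustration, not the BaselineOS vault: `LeaseTable`, `issue`, `resolve`, and `revoke` are hypothetical names, and the clock is injected so the expiry behaviour is deterministic.

```typescript
// Hypothetical sketch: none of these names come from the BaselineOS API.
interface Lease {
  credentialName: string;
  expiresAt: number; // ms timestamp from the injected clock
  revoked: boolean;
}

class LeaseTable {
  private leases = new Map<string, Lease>();
  private secrets = new Map<string, string>();
  private counter = 0;
  private now: () => number;

  constructor(now: () => number) {
    this.now = now;
  }

  store(name: string, value: string): void {
    this.secrets.set(name, value);
  }

  // Issue an opaque handle valid for ttlMs. The caller never receives the secret.
  issue(credentialName: string, ttlMs: number): string {
    if (!this.secrets.has(credentialName)) {
      throw new Error(`unknown credential: ${credentialName}`);
    }
    this.counter += 1;
    const handle = `lease-${this.counter}`;
    this.leases.set(handle, {
      credentialName,
      expiresAt: this.now() + ttlMs,
      revoked: false,
    });
    return handle;
  }

  // Intended for the executor only, right before it starts the step's process.
  resolve(handle: string): string | undefined {
    const lease = this.leases.get(handle);
    if (!lease || lease.revoked || this.now() >= lease.expiresAt) return undefined;
    return this.secrets.get(lease.credentialName);
  }

  revoke(handle: string): void {
    const lease = this.leases.get(handle);
    if (lease) lease.revoked = true;
  }
}

// Usage with a fake clock.
let clock = 0;
const table = new LeaseTable(() => clock);
table.store('vercel-token', 'secret-value');

const firstLease = table.issue('vercel-token', 120_000);
const duringStep = table.resolve(firstLease); // value is available while the step runs
table.revoke(firstLease);                     // step finished (success or failure)
const afterStep = table.resolve(firstLease);  // handle no longer resolves

const secondLease = table.issue('vercel-token', 1_000);
clock = 1_000;                                // the step's timeout elapses
const afterTimeout = table.resolve(secondLease);
```

In opaque mode the agent would hold only the handle string; in transparent mode it would be given what `resolve` returns.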

interface ExecutionProtocol {
  id: string;             // Unique identifier
  name: string;           // Human-readable name
  description: string;    // What this protocol accomplishes
  steps: ProtocolStep[];  // Ordered steps
  minTrustScore: number;  // Minimum trust to start (0-100)
  author: string;         // Who created this
  version: string;        // Protocol version
  tags: string[];         // For discovery
}

interface ProtocolStep {
  id: string;                               // Step ID (e.g., 'S1', 'S4.2')
  name: string;                             // What this step does
  instructions: string;                     // Detailed instructions
  command?: string;                         // Shell command to execute
  cwd?: string;                             // Working directory
  requiredCapabilities: CapabilityGrant[];  // Credentials needed
  minTrustScore?: number;                   // Per-step trust override
  dependsOn: string[];                      // Step IDs that must complete first
  validation?: StepValidation;              // How to verify success
  parallel?: boolean;                       // Can run concurrently with peers
  timeoutMs?: number;                       // Max execution time (ms)
  critical: boolean;                        // Failure aborts protocol
}

interface CapabilityGrant {
  credentialName: string;  // Name in the vault (e.g., 'vercel-token')
  envVar?: string;         // Env var name to inject (e.g., 'VERCEL_TOKEN')
  reason: string;          // Why this step needs it
  minTrustScore?: number;  // Trust required for THIS credential
}

interface StepValidation {
  type: 'output-contains' | 'output-matches' | 'exit-code' | 'custom';
  expected: string;        // What to check against
  failureMessage: string;  // Error message if validation fails
}
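
To make the `StepValidation` types concrete, here is a rough sketch of how a validation could be evaluated against a step's result. The semantics are assumptions: `output-matches` is treated as a regular expression, `exit-code` compares against `Number(expected)`, and `custom` is reported as unhandled. `StepResult`, `Verdict`, and `validateStep` are hypothetical names.

```typescript
// Mirrors the StepValidation shape above; the evaluation rules are assumptions.
interface StepValidation {
  type: 'output-contains' | 'output-matches' | 'exit-code' | 'custom';
  expected: string;
  failureMessage: string;
}

interface StepResult {
  output: string;
  exitCode: number;
}

type Verdict = { ok: true } | { ok: false; message: string };

function validateStep(v: StepValidation, r: StepResult): Verdict {
  const verdict = (ok: boolean): Verdict =>
    ok ? { ok: true } : { ok: false, message: v.failureMessage };

  switch (v.type) {
    case 'output-contains':
      return verdict(r.output.includes(v.expected));
    case 'output-matches':
      // Assumption: `expected` holds a regular expression source string.
      return verdict(new RegExp(v.expected).test(r.output));
    case 'exit-code':
      return verdict(r.exitCode === Number(v.expected));
    default:
      // 'custom' needs logic this sketch does not model, so it fails closed.
      return { ok: false, message: `custom validation not handled: ${v.failureMessage}` };
  }
}

const deployCheck = validateStep(
  { type: 'output-contains', expected: 'https://', failureMessage: 'No deployment URL in output' },
  { output: 'Preview: https://example.invalid/abc', exitCode: 0 },
);

const buildCheck = validateStep(
  { type: 'exit-code', expected: '0', failureMessage: 'Build failed' },
  { output: '', exitCode: 1 },
);
```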

Steps execute in dependency order. A step is “ready” when all steps in its dependsOn list are completed.

S1 (no deps) ─┬→ S2 (depends on S1) ─┬→ S4 (depends on S2, S3)
              └→ S3 (depends on S1) ─┘
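
The readiness rule can be expressed as a small function. This is a sketch of the rule as stated, not the executor's actual scheduler; `readySteps`, `StepNode`, and the status names are hypothetical.

```typescript
// Hypothetical helper illustrating the "ready" rule.
type Status = 'pending' | 'completed' | 'failed' | 'blocked';

interface StepNode {
  id: string;
  dependsOn: string[];
}

// A pending step is ready when every step it depends on is completed.
function readySteps(steps: StepNode[], status: Map<string, Status>): string[] {
  return steps
    .filter((s) => status.get(s.id) === 'pending')
    .filter((s) => s.dependsOn.every((dep) => status.get(dep) === 'completed'))
    .map((s) => s.id);
}

const graph: StepNode[] = [
  { id: 'S1', dependsOn: [] },
  { id: 'S2', dependsOn: ['S1'] },
  { id: 'S3', dependsOn: ['S1'] },
  { id: 'S4', dependsOn: ['S2', 'S3'] },
];

const status = new Map<string, Status>();
for (const s of graph) status.set(s.id, 'pending');

const atStart = readySteps(graph, status);    // only S1 has no dependencies
status.set('S1', 'completed');
const afterS1 = readySteps(graph, status);    // S2 and S3 are unblocked
status.set('S2', 'completed');
const afterS2 = readySteps(graph, status);    // S4 still waits for S3
```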

Steps with parallel: true can run concurrently if their dependencies allow:

steps: [
  { id: 'S1', dependsOn: [], critical: true },
  { id: 'S2', dependsOn: ['S1'], parallel: true, critical: false },
  { id: 'S3', dependsOn: ['S1'], parallel: true, critical: false },
  { id: 'S4', dependsOn: ['S2', 'S3'], critical: true },
]
// S1 runs → S2 and S3 run in parallel → S4 runs
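
One way to picture the scheduling is to group steps into "waves" that run together. The policy below is an assumption made for illustration (ready steps flagged `parallel` share a wave; any other ready step runs alone), and `planWaves` is a hypothetical helper, not part of BaselineOS.

```typescript
// Hypothetical planner: groups steps into waves under an assumed policy.
interface PlanStep {
  id: string;
  dependsOn: string[];
  parallel?: boolean;
}

function planWaves(steps: PlanStep[]): string[][] {
  const done = new Set<string>();
  const remaining = [...steps];
  const waves: string[][] = [];

  while (remaining.length > 0) {
    const ready = remaining.filter((s) => s.dependsOn.every((dep) => done.has(dep)));
    if (ready.length === 0) {
      throw new Error('dependency cycle or unknown step id');
    }
    // Ready steps flagged parallel run together; otherwise take one step at a time.
    const parallelReady = ready.filter((s) => s.parallel === true);
    const wave = parallelReady.length > 0 ? parallelReady : ready.slice(0, 1);

    waves.push(wave.map((s) => s.id));
    for (const s of wave) {
      done.add(s.id);
      remaining.splice(remaining.indexOf(s), 1);
    }
  }
  return waves;
}

const waves = planWaves([
  { id: 'S1', dependsOn: [] },
  { id: 'S2', dependsOn: ['S1'], parallel: true },
  { id: 'S3', dependsOn: ['S1'], parallel: true },
  { id: 'S4', dependsOn: ['S2', 'S3'] },
]);
// waves: [['S1'], ['S2', 'S3'], ['S4']]
```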

Trust is checked at three levels:

  1. Protocol level — agent must meet protocol.minTrustScore to start
  2. Step level — if step.minTrustScore is set, overrides protocol minimum
  3. Capability level — if capability.minTrustScore is set, gates individual credentials

{
  minTrustScore: 60, // Must have 60+ to start
  steps: [
    {
      id: 'S1',
      name: 'Run tests',
      minTrustScore: undefined, // Uses protocol's 60
      requiredCapabilities: [],
      critical: true,
    },
    {
      id: 'S2',
      name: 'Deploy to production',
      minTrustScore: 85, // Only highly trusted agents
      requiredCapabilities: [
        {
          credentialName: 'prod-deploy-key',
          minTrustScore: 90, // Even higher for this credential
          reason: 'Production deployment',
        },
      ],
      critical: true,
    },
  ],
}
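
The three trust levels can be sketched as a single check. This is an illustrative reading of the rules, not the library's implementation: `evaluateTrust` and the `Trust*` types are hypothetical, and a capability without its own `minTrustScore` is assumed to add no extra gate.

```typescript
// Hypothetical sketch of the three-level trust check.
interface TrustCapability {
  credentialName: string;
  minTrustScore?: number;
}

interface TrustStep {
  id: string;
  minTrustScore?: number;
  requiredCapabilities: TrustCapability[];
}

interface TrustProtocol {
  minTrustScore: number;
  steps: TrustStep[];
}

interface TrustDecision {
  canStart: boolean;
  runnableSteps: string[];
  deniedCredentials: string[];
}

function evaluateTrust(p: TrustProtocol, agentTrust: number): TrustDecision {
  const decision: TrustDecision = {
    canStart: agentTrust >= p.minTrustScore, // 1. protocol level
    runnableSteps: [],
    deniedCredentials: [],
  };
  if (!decision.canStart) return decision;

  for (const step of p.steps) {
    // 2. step level: the step's own minimum, falling back to the protocol's
    const stepMin = step.minTrustScore ?? p.minTrustScore;
    if (agentTrust < stepMin) continue;

    // 3. capability level: each credential may demand more trust
    const denied = step.requiredCapabilities
      .filter((c) => c.minTrustScore !== undefined && agentTrust < c.minTrustScore)
      .map((c) => c.credentialName);

    if (denied.length === 0) decision.runnableSteps.push(step.id);
    decision.deniedCredentials.push(...denied);
  }
  return decision;
}

const trustExample: TrustProtocol = {
  minTrustScore: 60,
  steps: [
    { id: 'S1', requiredCapabilities: [] },
    {
      id: 'S2',
      minTrustScore: 85,
      requiredCapabilities: [{ credentialName: 'prod-deploy-key', minTrustScore: 90 }],
    },
  ],
};

const at50 = evaluateTrust(trustExample, 50); // below protocol minimum
const at75 = evaluateTrust(trustExample, 75); // S1 only
const at87 = evaluateTrust(trustExample, 87); // passes S2's step gate, credential denied
const at95 = evaluateTrust(trustExample, 95); // everything allowed
```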

When a critical step fails:

  1. The step is marked as failed
  2. All pending steps are marked as blocked
  3. The protocol execution is marked as failed
  4. No further steps execute

Non-critical steps can fail without aborting the protocol.
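
The abort behaviour can be modelled as a small state update. This is a sketch under assumed names (`recordFailure`, `TrackedStep`, and the state strings are hypothetical); it only mirrors the four rules listed above.

```typescript
// Hypothetical model of failure handling.
type StepState = 'pending' | 'in-progress' | 'completed' | 'failed' | 'blocked';

interface TrackedStep {
  id: string;
  critical: boolean;
  state: StepState;
}

type ExecutionStatus = 'running' | 'failed';

function recordFailure(steps: TrackedStep[], failedId: string): ExecutionStatus {
  const failed = steps.find((s) => s.id === failedId);
  if (!failed) throw new Error(`unknown step: ${failedId}`);

  failed.state = 'failed';
  if (!failed.critical) return 'running'; // non-critical: the protocol continues

  // Critical failure: block everything that has not started yet.
  for (const s of steps) {
    if (s.state === 'pending') s.state = 'blocked';
  }
  return 'failed';
}

const criticalRun: TrackedStep[] = [
  { id: 'S1', critical: true, state: 'completed' },
  { id: 'S2', critical: true, state: 'in-progress' },
  { id: 'S3', critical: false, state: 'pending' },
  { id: 'S4', critical: true, state: 'pending' },
];
const criticalOutcome = recordFailure(criticalRun, 'S2');

const lenientRun: TrackedStep[] = [
  { id: 'S1', critical: false, state: 'in-progress' },
  { id: 'S2', critical: true, state: 'pending' },
];
const lenientOutcome = recordFailure(lenientRun, 'S1');
```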


const runner = new ProtocolRunner(executor, registry, {
  cwd: '/path/to/project',               // Default working directory
  defaultTimeoutMs: 120_000,             // 2 minutes per step
  verbose: false,                        // Print output to stdout
  dryRun: false,                         // Simulate without executing
  opaqueCredentials: true,               // Use lease-based opaque grants
  extraEnv: { NODE_ENV: 'production' },  // Extra env vars for all steps
  onEvent: (event) => {                  // Real-time progress
    console.log(`[${event.type}] ${event.message}`);
  },
});

The runner emits events for real-time progress tracking:

| Event | When |
| --- | --- |
| step:start | Step begins execution |
| step:complete | Step finished successfully |
| step:failed | Step failed (command error or validation failure) |
| capability:granted | Credential access approved by vault |
| capability:denied | Credential access denied (trust, scope, or missing) |
| protocol:complete | All steps finished |
| protocol:failed | Protocol aborted due to critical step failure |
| protocol:aborted | Protocol manually aborted |
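
A handler passed as `onEvent` might tally events and detect when a run has ended. The sketch below uses the event names from the table and the `type`/`message` fields from the configuration example; `createEventLog` and the sample messages are made up for illustration, and real events may carry more fields.

```typescript
// Event names come from the table above; the helper itself is hypothetical.
type ProtocolEventType =
  | 'step:start' | 'step:complete' | 'step:failed'
  | 'capability:granted' | 'capability:denied'
  | 'protocol:complete' | 'protocol:failed' | 'protocol:aborted';

interface ProtocolEvent {
  type: ProtocolEventType;
  message: string;
}

const TERMINAL: ProtocolEventType[] = ['protocol:complete', 'protocol:failed', 'protocol:aborted'];

function createEventLog() {
  const counts: Partial<Record<ProtocolEventType, number>> = {};
  const lines: string[] = [];
  let finished = false;

  return {
    // Pass this function as the runner's onEvent option.
    onEvent(event: ProtocolEvent): void {
      counts[event.type] = (counts[event.type] ?? 0) + 1;
      lines.push(`[${event.type}] ${event.message}`);
      if (TERMINAL.includes(event.type)) finished = true;
    },
    counts,
    lines,
    isFinished: (): boolean => finished,
  };
}

// Simulated event stream (messages are illustrative only).
const eventLog = createEventLog();
const sampleEvents: ProtocolEvent[] = [
  { type: 'step:start', message: 'S1' },
  { type: 'step:complete', message: 'S1' },
  { type: 'step:start', message: 'S2' },
  { type: 'capability:denied', message: 'prod-deploy-key' },
  { type: 'step:failed', message: 'S2' },
  { type: 'protocol:failed', message: 'critical step S2 failed' },
];
for (const e of sampleEvents) eventLog.onEvent(e);
```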

Test a protocol without executing any commands:

const runner = new ProtocolRunner(executor, registry, { dryRun: true });
const execution = await runner.run('deploy-staging', 'agent', 80);
// All steps "complete" with output like:
// "[dry-run] Would execute: pnpm test"
// Validation is skipped in dry run mode

Progress for an execution can be queried from the executor:

const progress = executor.getProgress(execution.id);
// {
//   total: 5,
//   completed: 3,
//   failed: 0,
//   pending: 1,
//   inProgress: 1,
//   blocked: 0,
//   percentComplete: 60,
// }
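
The progress fields can be derived from per-step states. The sketch below assumes `percentComplete` is the rounded share of completed steps, which matches the numbers shown (3 of 5 is 60) but is not confirmed by the source; `summarize` is a hypothetical helper.

```typescript
// Hypothetical derivation of the progress summary from step states.
type ProgressState = 'pending' | 'in-progress' | 'completed' | 'failed' | 'blocked';

interface Progress {
  total: number;
  completed: number;
  failed: number;
  pending: number;
  inProgress: number;
  blocked: number;
  percentComplete: number;
}

function summarize(states: ProgressState[]): Progress {
  const count = (wanted: ProgressState): number =>
    states.filter((s) => s === wanted).length;

  const total = states.length;
  const completed = count('completed');
  return {
    total,
    completed,
    failed: count('failed'),
    pending: count('pending'),
    inProgress: count('in-progress'),
    blocked: count('blocked'),
    // Assumption: completed / total, rounded to a whole percent.
    percentComplete: total === 0 ? 0 : Math.round((completed / total) * 100),
  };
}

const summary = summarize(['completed', 'completed', 'completed', 'in-progress', 'pending']);
```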

BaselineOS includes two templates you can use or extend:

5-step production deployment:

  1. Run test suite (critical)
  2. Security self-check (critical)
  3. Build production artifacts (critical)
  4. Deploy to staging (needs vercel-token, critical)
  5. Promote to production (needs vercel-token, trust 85, critical)

4-step environment configuration:

  1. Validate service URLs (critical)
  2. Remove stale env vars (needs vercel-token)
  3. Set new env vars (needs vercel-token, critical)
  4. Verify configuration (needs vercel-token, critical)

Real-World Example: Vercel Environment Setup

This example follows a real workflow: validate the service URLs, then set them in the Vercel production environment.

registry.register({
  id: 'configure-service-urls',
  name: 'Configure Service URLs',
  description: 'Set API service URLs in Vercel production environment',
  minTrustScore: 60,
  author: 'platform-team',
  version: '1.0',
  tags: ['configuration', 'vercel', 'production'],
  steps: [
    {
      id: 'S1',
      name: 'Validate API endpoints',
      instructions: 'Verify each service URL returns 200',
      // curl -sf exits non-zero on HTTP errors, so && stops at the first bad URL
      command: [
        'curl -sf https://griot-api.run.app/api/v1/index54/snapshot > /dev/null',
        'curl -sf https://griot-api.run.app/api/v1/ledger54/dossier > /dev/null',
        'curl -sf https://griot-api.run.app/api/v1/observatory54/coverage > /dev/null',
      ].join(' && '),
      requiredCapabilities: [],
      dependsOn: [],
      critical: true,
    },
    {
      id: 'S2',
      name: 'Set INDEX54_SERVICE_URL',
      instructions: 'Configure index54 snapshot endpoint',
      command: 'vercel env rm INDEX54_SERVICE_URL production --yes 2>/dev/null; echo "https://griot-api.run.app/api/v1/index54/snapshot" | vercel env add INDEX54_SERVICE_URL production',
      requiredCapabilities: [
        { credentialName: 'vercel-token', envVar: 'VERCEL_TOKEN', reason: 'Vercel env management' },
      ],
      dependsOn: ['S1'],
      parallel: true, // runs alongside S3 and S4, as in the execution flow
      critical: true,
    },
    {
      id: 'S3',
      name: 'Set LEDGER_DOSSIER_SERVICE_URL',
      instructions: 'Configure ledger54 dossier endpoint',
      command: 'vercel env rm LEDGER_DOSSIER_SERVICE_URL production --yes 2>/dev/null; echo "https://griot-api.run.app/api/v1/ledger54/dossier" | vercel env add LEDGER_DOSSIER_SERVICE_URL production',
      requiredCapabilities: [
        { credentialName: 'vercel-token', envVar: 'VERCEL_TOKEN', reason: 'Vercel env management' },
      ],
      dependsOn: ['S1'],
      parallel: true,
      critical: true,
    },
    {
      id: 'S4',
      name: 'Set OBSERVATORY_COVERAGE_URL',
      instructions: 'Configure observatory54 coverage endpoint',
      command: 'vercel env rm OBSERVATORY_COVERAGE_URL production --yes 2>/dev/null; echo "https://griot-api.run.app/api/v1/observatory54/coverage" | vercel env add OBSERVATORY_COVERAGE_URL production',
      requiredCapabilities: [
        { credentialName: 'vercel-token', envVar: 'VERCEL_TOKEN', reason: 'Vercel env management' },
      ],
      dependsOn: ['S1'],
      parallel: true,
      critical: true,
    },
    {
      id: 'S5',
      name: 'Verify configuration',
      instructions: 'List env vars and confirm all URLs are set',
      command: 'vercel env ls production',
      requiredCapabilities: [
        { credentialName: 'vercel-token', envVar: 'VERCEL_TOKEN', reason: 'Read env vars' },
      ],
      dependsOn: ['S2', 'S3', 'S4'],
      validation: {
        type: 'output-contains',
        expected: 'INDEX54_SERVICE_URL',
        failureMessage: 'Service URLs not configured',
      },
      critical: true,
    },
  ],
});

Execution flow:

S1: Validate API endpoints (curl health checks) ──────────────→ ✓
S2: Set INDEX54_SERVICE_URL         ──┐
S3: Set LEDGER_DOSSIER_SERVICE_URL  ──┼── (parallel, all need vercel-token lease) → ✓
S4: Set OBSERVATORY_COVERAGE_URL    ──┘
S5: Verify configuration (vercel env ls) ─────────────────────→ ✓

Each step gets a time-boxed lease on VERCEL_TOKEN. The agent never sees the token value. After each step, the lease is revoked. The full execution is audited.
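
Steps S2 to S4 in this example differ only in the variable name and URL, so they could be generated by a small factory. The sketch below is one way to do that; `setEnvStep` is a hypothetical helper, the interfaces are trimmed copies of the types documented earlier, and the command string is the one used in the example. It marks every generated step `parallel: true`, matching the execution flow.

```typescript
// Trimmed copies of the documented types; setEnvStep itself is hypothetical.
interface CapabilityGrant {
  credentialName: string;
  envVar?: string;
  reason: string;
}

interface ProtocolStep {
  id: string;
  name: string;
  instructions: string;
  command?: string;
  requiredCapabilities: CapabilityGrant[];
  dependsOn: string[];
  parallel?: boolean;
  critical: boolean;
}

// Builds a "remove, then re-add" step for one production env var.
// The values are interpolated into a shell command, so pass trusted input only.
function setEnvStep(id: string, envVar: string, url: string, dependsOn: string[]): ProtocolStep {
  return {
    id,
    name: `Set ${envVar}`,
    instructions: `Set ${envVar} to ${url}`,
    command:
      `vercel env rm ${envVar} production --yes 2>/dev/null; ` +
      `echo "${url}" | vercel env add ${envVar} production`,
    requiredCapabilities: [
      { credentialName: 'vercel-token', envVar: 'VERCEL_TOKEN', reason: 'Vercel env management' },
    ],
    dependsOn,
    parallel: true,
    critical: true,
  };
}

const apiBase = 'https://griot-api.run.app/api/v1';
const setSteps: ProtocolStep[] = [
  setEnvStep('S2', 'INDEX54_SERVICE_URL', `${apiBase}/index54/snapshot`, ['S1']),
  setEnvStep('S3', 'LEDGER_DOSSIER_SERVICE_URL', `${apiBase}/ledger54/dossier`, ['S1']),
  setEnvStep('S4', 'OBSERVATORY_COVERAGE_URL', `${apiBase}/observatory54/coverage`, ['S1']),
];
```

The resulting array could be spread into a protocol's `steps` list between the validation step and the verification step.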