Execution Protocols — How Agents Carry Out Work

BaselineOS execution protocols define HOW agents carry out work — the steps, the order, the credentials needed, the trust required, and how to verify it was done right. Any agent (Claude, GPT, Gemini, a bash script) follows the same protocol.

Quick Start

import {
  ProtocolRegistry, ProtocolExecutor, ProtocolRunner,
  CredentialVault,
} from 'baselineos';

// 1. Set up credentials
const vault = new CredentialVault({
  persistPath: '.baseline/vault',
  masterKey: process.env.BASELINE_VAULT_KEY!,
});
await vault.initialize();
vault.store('vercel-token', process.env.VERCEL_TOKEN!, {
  type: 'bearer-token',
  provider: 'vercel',
});

// 2. Define a protocol
const registry = new ProtocolRegistry();
registry.register({
  id: 'deploy-staging',
  name: 'Deploy to Staging',
  description: 'Build, test, and deploy to staging environment',
  minTrustScore: 60,
  author: 'devops-team',
  version: '1.0',
  tags: ['deployment', 'staging'],
  steps: [
    {
      id: 'S1',
      name: 'Run tests',
      instructions: 'Execute the full test suite',
      command: 'pnpm test',
      requiredCapabilities: [],
      dependsOn: [],
      validation: {
        type: 'output-contains',
        expected: 'passed',
        failureMessage: 'Test suite did not pass',
      },
      critical: true,
    },
    {
      id: 'S2',
      name: 'Build',
      instructions: 'Create production build',
      command: 'pnpm build',
      requiredCapabilities: [],
      dependsOn: ['S1'],
      critical: true,
    },
    {
      id: 'S3',
      name: 'Deploy to staging',
      instructions: 'Push to Vercel staging',
      command: 'vercel deploy --target preview',
      requiredCapabilities: [
        {
          credentialName: 'vercel-token',
          envVar: 'VERCEL_TOKEN',
          reason: 'Vercel deployment access',
        },
      ],
      dependsOn: ['S2'],
      validation: {
        type: 'output-contains',
        expected: 'https://',
        failureMessage: 'No deployment URL in output',
      },
      timeoutMs: 120_000,
      critical: true,
    },
  ],
});

// 3. Run it
const executor = new ProtocolExecutor(vault);
const runner = new ProtocolRunner(executor, registry, {
  opaqueCredentials: true, // Agent never sees raw token values
});

const execution = await runner.run('deploy-staging', 'deploy-agent', 75);
// → S1: tests pass ✓
// → S2: build succeeds ✓
// → S3: vault issues time-boxed lease for VERCEL_TOKEN,
//        injects into child process, deploys, lease revoked ✓

Concepts

Protocol

A protocol is a contract between the organization and the agent:

"Here's what needs to happen, in what order, with what access,
 and how we verify it was done right."

It contains ordered steps — each step has:

A command to execute (optional — steps without commands are agent-driven)
Required capabilities — credentials from the vault
Dependencies — which steps must complete first
Validation — how to verify the step succeeded
Trust requirement — minimum trust score for this step
A critical flag — if a critical step fails, the protocol aborts

Execution

An execution is a specific run of a protocol by a specific agent. It tracks:

Which steps completed, failed, or are blocked
Which credentials were granted or denied
Duration of each step
Overall progress and status

Capability Grant

When a step needs credentials, the executor requests them from the vault. What the agent receives depends on the security mode:

Mode	What agent gets	Security level
Opaque (default)	A lease handle — can’t read the value	Highest — Anthropic Managed Agents pattern
Transparent	The raw credential value	Lower — traditional pattern
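
The difference between the two modes can be sketched as follows. This is only an illustration under stated assumptions, not the BaselineOS executor: `grantCapability`, `AgentGrant`, and `GrantResult` are hypothetical names, the lease handle format is invented, and the sketch assumes the child process receives the real value through its environment in both modes.

```typescript
// Hypothetical sketch of the two grant modes; not the BaselineOS executor itself.
type AgentGrant =
  | { mode: 'opaque'; credentialName: string; leaseHandle: string }
  | { mode: 'transparent'; credentialName: string; value: string };

interface GrantResult {
  agentView: AgentGrant;             // What the agent is allowed to inspect
  childEnv: Record<string, string>;  // Env vars injected into the step's child process
}

function grantCapability(
  secrets: Map<string, string>,
  credentialName: string,
  envVar: string,
  opaque: boolean,
): GrantResult {
  const value = secrets.get(credentialName);
  if (value === undefined) {
    throw new Error(`Credential not found: ${credentialName}`);
  }
  // Assumption: the command receives the real value via its environment.
  const childEnv: Record<string, string> = { [envVar]: value };
  // In opaque mode the agent sees only a handle (format invented here).
  const agentView: AgentGrant = opaque
    ? { mode: 'opaque', credentialName, leaseHandle: `lease:${credentialName}` }
    : { mode: 'transparent', credentialName, value };
  return { agentView, childEnv };
}

const secrets = new Map<string, string>([['vercel-token', 'example-secret']]);
const opaqueGrant = grantCapability(secrets, 'vercel-token', 'VERCEL_TOKEN', true);
const transparentGrant = grantCapability(secrets, 'vercel-token', 'VERCEL_TOKEN', false);
```

In the opaque case the secret appears only in `childEnv`, so nothing the agent can inspect contains the raw value.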

Lease

A lease is time-boxed access to a credential, a concept borrowed from HashiCorp Vault:

The lease is valid only for the step’s timeout duration
After the step completes (success or failure), the lease is revoked
If the agent is compromised after the step ends, the lease it held is already revoked and cannot be reused
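
The lease lifecycle described above can be sketched with a small in-memory table. This is a minimal illustration, not the CredentialVault implementation: `LeaseTable` and its methods are hypothetical, and the current time is passed in explicitly so the behavior is easy to follow.

```typescript
// Hypothetical sketch of time-boxed leases; not the BaselineOS CredentialVault API.
interface Lease {
  handle: string;          // Opaque ID handed to the agent
  credentialName: string;
  expiresAt: number;       // Epoch ms after which the lease is invalid
  revoked: boolean;
}

class LeaseTable {
  private leases = new Map<string, Lease>();
  private counter = 0;

  // Issue a lease valid for ttlMs (for example, the step's timeout).
  issue(credentialName: string, ttlMs: number, now: number): string {
    this.counter += 1;
    const handle = `lease-${this.counter}`;
    this.leases.set(handle, {
      handle,
      credentialName,
      expiresAt: now + ttlMs,
      revoked: false,
    });
    return handle;
  }

  // A lease is usable only if it exists, is not revoked, and has not expired.
  isValid(handle: string, now: number): boolean {
    const lease = this.leases.get(handle);
    return lease !== undefined && !lease.revoked && now < lease.expiresAt;
  }

  // Called when the step finishes, whether it succeeded or failed.
  revoke(handle: string): void {
    const lease = this.leases.get(handle);
    if (lease) lease.revoked = true;
  }
}

const table = new LeaseTable();
const handle = table.issue('vercel-token', 120_000, 0);
const validDuringStep = table.isValid(handle, 60_000);     // true
const validAfterTimeout = table.isValid(handle, 120_000);  // false: expired
table.revoke(handle);
const validAfterRevoke = table.isValid(handle, 60_000);    // false: revoked
```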

Protocol Definition Reference

interface ExecutionProtocol {
  id: string;              // Unique identifier
  name: string;            // Human-readable name
  description: string;     // What this protocol accomplishes
  steps: ProtocolStep[];   // Ordered steps
  minTrustScore: number;   // Minimum trust to start (0-100)
  author: string;          // Who created this
  version: string;         // Protocol version
  tags: string[];          // For discovery
}

Step Definition

interface ProtocolStep {
  id: string;                          // Step ID (e.g., 'S1', 'S4.2')
  name: string;                        // What this step does
  instructions: string;                // Detailed instructions
  command?: string;                    // Shell command to execute
  cwd?: string;                        // Working directory
  requiredCapabilities: CapabilityGrant[];  // Credentials needed
  minTrustScore?: number;              // Per-step trust override
  dependsOn: string[];                 // Step IDs that must complete first
  validation?: StepValidation;         // How to verify success
  parallel?: boolean;                  // Can run concurrently with peers
  timeoutMs?: number;                  // Max execution time (ms)
  critical: boolean;                   // Failure aborts protocol
}

Capability Grant

interface CapabilityGrant {
  credentialName: string;  // Name in the vault (e.g., 'vercel-token')
  envVar?: string;         // Env var name to inject (e.g., 'VERCEL_TOKEN')
  reason: string;          // Why this step needs it
  minTrustScore?: number;  // Trust required for THIS credential
}

Validation

interface StepValidation {
  type: 'output-contains' | 'output-matches' | 'exit-code' | 'custom';
  expected: string;         // What to check against
  failureMessage: string;   // Error message if validation fails
}
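
One way the four validation types might be evaluated is sketched below. The interpretation of `expected` is an assumption for each type: a substring for output-contains, a regular-expression source for output-matches, and a decimal exit code for exit-code. The `validateStep` function, the `StepResult` shape, and the caller-supplied `CustomCheck` are hypothetical; the real executor may behave differently.

```typescript
// Hypothetical evaluator for StepValidation; the real executor may differ.
interface StepValidation {
  type: 'output-contains' | 'output-matches' | 'exit-code' | 'custom';
  expected: string;
  failureMessage: string;
}

interface StepResult {
  output: string;
  exitCode: number;
}

type CustomCheck = (result: StepResult, expected: string) => boolean;

function passes(v: StepValidation, r: StepResult, customCheck?: CustomCheck): boolean {
  switch (v.type) {
    case 'output-contains':
      return r.output.includes(v.expected);
    case 'output-matches':
      // Assumption: `expected` holds a regular-expression source string.
      return new RegExp(v.expected).test(r.output);
    case 'exit-code':
      // Assumption: `expected` holds the exit code as a decimal string.
      return r.exitCode === Number.parseInt(v.expected, 10);
    case 'custom':
      // Assumption: custom checks come from the caller; a missing check fails.
      return customCheck ? customCheck(r, v.expected) : false;
    default:
      return false;
  }
}

// Returns null on success, or the configured failure message on failure.
function validateStep(
  v: StepValidation,
  r: StepResult,
  customCheck?: CustomCheck,
): string | null {
  return passes(v, r, customCheck) ? null : v.failureMessage;
}

const testsOk = validateStep(
  { type: 'output-contains', expected: 'passed', failureMessage: 'Test suite did not pass' },
  { output: '12 tests passed', exitCode: 0 },
);
const deployBad = validateStep(
  { type: 'output-matches', expected: '^https://', failureMessage: 'No deployment URL in output' },
  { output: 'Error: not authorized', exitCode: 1 },
);
```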

Step Dependencies

Steps execute in dependency order. A step is “ready” when all steps in its dependsOn list are completed.

S1 (no deps)  →  S2 (depends on S1)  →  S4 (depends on S2, S3)
                  S3 (depends on S1)  ↗

Steps with parallel: true can run concurrently if their dependencies allow:

steps: [
  { id: 'S1', dependsOn: [], critical: true },
  { id: 'S2', dependsOn: ['S1'], parallel: true, critical: false },
  { id: 'S3', dependsOn: ['S1'], parallel: true, critical: false },
  { id: 'S4', dependsOn: ['S2', 'S3'], critical: true },
]
// S1 runs → S2 and S3 run in parallel → S4 runs
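
The readiness rule above can be sketched as a function that groups steps into waves, where every step in a wave has all of its dependencies satisfied by earlier waves. `planWaves` and `StepRef` are hypothetical names, and the real runner's scheduling may differ; in particular, this sketch ignores the parallel flag and only shows which steps become ready together.

```typescript
// Hypothetical scheduler sketch; the real runner's ordering logic may differ.
interface StepRef {
  id: string;
  dependsOn: string[];
}

// Groups steps into waves: each step in a wave has every dependency
// completed by an earlier wave. Throws on cycles or unknown step IDs.
function planWaves(steps: StepRef[]): string[][] {
  const done = new Set<string>();
  const waves: string[][] = [];
  let remaining = steps.slice();

  while (remaining.length > 0) {
    const ready = remaining.filter((s) => s.dependsOn.every((d) => done.has(d)));
    if (ready.length === 0) {
      throw new Error('Unresolvable dependencies: cycle or unknown step ID');
    }
    waves.push(ready.map((s) => s.id));
    for (const s of ready) done.add(s.id);
    remaining = remaining.filter((s) => !done.has(s.id));
  }
  return waves;
}

const waves = planWaves([
  { id: 'S1', dependsOn: [] },
  { id: 'S2', dependsOn: ['S1'] },
  { id: 'S3', dependsOn: ['S1'] },
  { id: 'S4', dependsOn: ['S2', 'S3'] },
]);
// waves: [['S1'], ['S2', 'S3'], ['S4']]
```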

Trust Gating

Trust is checked at three levels:

Protocol level — agent must meet protocol.minTrustScore to start
Step level — if step.minTrustScore is set, overrides protocol minimum
Capability level — if capability.minTrustScore is set, gates individual credentials

{
  minTrustScore: 60,  // Must have 60+ to start
  steps: [
    {
      id: 'S1',
      name: 'Run tests',
      minTrustScore: undefined,  // Uses protocol's 60
      requiredCapabilities: [],
      critical: true,
    },
    {
      id: 'S2',
      name: 'Deploy to production',
      minTrustScore: 85,  // Only highly trusted agents
      requiredCapabilities: [
        {
          credentialName: 'prod-deploy-key',
          minTrustScore: 90,  // Even higher for this credential
          reason: 'Production deployment',
        },
      ],
      critical: true,
    },
  ],
}
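
The three checks can be sketched as three small functions. The names `canStart`, `canRunStep`, and `deniedCapabilities` are hypothetical, and the sketch assumes that thresholds are inclusive (a score of 60 meets a minimum of 60) and that a capability without its own minTrustScore is not gated separately.

```typescript
// Hypothetical trust-gate sketch based on the three levels described above.
interface GatedCapability {
  credentialName: string;
  minTrustScore?: number;
}
interface GatedStep {
  id: string;
  minTrustScore?: number;
  requiredCapabilities: GatedCapability[];
}
interface GatedProtocol {
  minTrustScore: number;
  steps: GatedStep[];
}

// Protocol level: the agent must meet the protocol minimum to start.
function canStart(protocol: GatedProtocol, trust: number): boolean {
  return trust >= protocol.minTrustScore;
}

// Step level: a step's own minimum, when set, replaces the protocol minimum.
function canRunStep(protocol: GatedProtocol, step: GatedStep, trust: number): boolean {
  return trust >= (step.minTrustScore ?? protocol.minTrustScore);
}

// Capability level: returns the credentials this agent would be denied.
function deniedCapabilities(step: GatedStep, trust: number): string[] {
  return step.requiredCapabilities
    .filter((c) => c.minTrustScore !== undefined && trust < c.minTrustScore)
    .map((c) => c.credentialName);
}

const testStep: GatedStep = { id: 'S1', requiredCapabilities: [] };
const deployStep: GatedStep = {
  id: 'S2',
  minTrustScore: 85,
  requiredCapabilities: [{ credentialName: 'prod-deploy-key', minTrustScore: 90 }],
};
const protocol: GatedProtocol = { minTrustScore: 60, steps: [testStep, deployStep] };

const agentTrust = 87;
const mayStart = canStart(protocol, agentTrust);                 // true: 87 >= 60
const mayDeploy = canRunStep(protocol, deployStep, agentTrust);  // true: 87 >= 85
const denied = deniedCapabilities(deployStep, agentTrust);       // ['prod-deploy-key']: 87 < 90
```

An agent with a score of 87 would therefore reach the production step but still be refused the production key.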

Critical Steps

When a critical step fails:

The step is marked as failed
All pending steps are marked as blocked
The protocol execution is marked as failed
No further steps execute

Non-critical steps can fail without aborting the protocol.
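
The abort behavior can be sketched as a single state update. The status names and the `recordFailure` function are hypothetical and chosen only to mirror the description above.

```typescript
// Hypothetical sketch of critical-step handling; status names are assumptions.
type StepStatus = 'pending' | 'in-progress' | 'completed' | 'failed' | 'blocked';
type ExecutionStatus = 'running' | 'failed';

interface TrackedStep {
  id: string;
  critical: boolean;
  status: StepStatus;
}

// Marks the step as failed. If it was critical, blocks every pending step
// and reports the whole execution as failed; otherwise execution continues.
function recordFailure(steps: TrackedStep[], failedId: string): ExecutionStatus {
  const failed = steps.find((s) => s.id === failedId);
  if (!failed) {
    throw new Error(`Unknown step: ${failedId}`);
  }
  failed.status = 'failed';
  if (!failed.critical) {
    return 'running';
  }
  for (const s of steps) {
    if (s.status === 'pending') s.status = 'blocked';
  }
  return 'failed';
}

const tracked: TrackedStep[] = [
  { id: 'S1', critical: true, status: 'completed' },
  { id: 'S2', critical: true, status: 'in-progress' },
  { id: 'S3', critical: true, status: 'pending' },
];
const outcome = recordFailure(tracked, 'S2');
// outcome is 'failed'; S3 moves from 'pending' to 'blocked'
```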

Runner Options

const runner = new ProtocolRunner(executor, registry, {
  cwd: '/path/to/project',           // Default working directory
  defaultTimeoutMs: 120_000,          // 2 minutes per step
  verbose: false,                     // Print output to stdout
  dryRun: false,                      // Simulate without executing
  opaqueCredentials: true,            // Use lease-based opaque grants
  extraEnv: { NODE_ENV: 'production' }, // Extra env vars for all steps
  onEvent: (event) => {               // Real-time progress
    console.log(`[${event.type}] ${event.message}`);
  },
});

Events

The runner emits events for real-time progress tracking:

Event	When
`step:start`	Step begins execution
`step:complete`	Step finished successfully
`step:failed`	Step failed (command error or validation failure)
`capability:granted`	Credential access approved by vault
`capability:denied`	Credential access denied (trust, scope, or missing)
`protocol:complete`	All steps finished
`protocol:failed`	Protocol aborted due to critical step failure
`protocol:aborted`	Protocol manually aborted
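
A handler passed as onEvent might collect events rather than print them. The sketch below assumes only the two fields used in the Runner Options example (type and message); `createEventLog` and `RunnerEvent` are hypothetical names, and the sample messages are illustrative, not real runner output.

```typescript
// Hypothetical event collector; assumes events carry `type` and `message`.
interface RunnerEvent {
  type: string;
  message: string;
}

function createEventLog() {
  const counts = new Map<string, number>();
  const failures: string[] = [];

  const onEvent = (event: RunnerEvent): void => {
    // Tally every event by type.
    counts.set(event.type, (counts.get(event.type) ?? 0) + 1);
    // Keep a readable record of anything that signals a problem.
    if (
      event.type === 'step:failed' ||
      event.type === 'protocol:failed' ||
      event.type === 'capability:denied'
    ) {
      failures.push(`[${event.type}] ${event.message}`);
    }
  };

  return { onEvent, counts, failures };
}

const log = createEventLog();
log.onEvent({ type: 'step:start', message: 'S1 started' });
log.onEvent({ type: 'step:failed', message: 'S1: Test suite did not pass' });
log.onEvent({ type: 'protocol:failed', message: 'Aborted after S1' });
```

The `log.onEvent` function could then be supplied as the onEvent option, with `log.failures` inspected after the run.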

Dry Run Mode

Test a protocol without executing any commands:

const runner = new ProtocolRunner(executor, registry, { dryRun: true });
const execution = await runner.run('deploy-staging', 'agent', 80);

// All steps "complete" with output like:
// "[dry-run] Would execute: pnpm test"
// Validation is skipped in dry run mode
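
Conceptually, dry run mode replaces command execution with a description of what would have run. The sketch below is an assumption about how that could look: `dryRunStep` is a hypothetical function, and the message for steps without a command is invented for illustration.

```typescript
// Hypothetical dry-run sketch; only the "Would execute" format comes from the docs.
interface RunnableStep {
  id: string;
  command?: string;
}

interface StepOutcome {
  id: string;
  status: 'completed';
  output: string;
}

// Produces a simulated outcome without running anything or validating output.
function dryRunStep(step: RunnableStep): StepOutcome {
  const output =
    step.command !== undefined
      ? `[dry-run] Would execute: ${step.command}`
      : `[dry-run] No command; step ${step.id} would be agent-driven`;
  return { id: step.id, status: 'completed', output };
}

const plan: RunnableStep[] = [
  { id: 'S1', command: 'pnpm test' },
  { id: 'S2' },
];
const outcomes = plan.map((step) => dryRunStep(step));
```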

Progress Tracking

const progress = executor.getProgress(execution.id);
// {
//   total: 5,
//   completed: 3,
//   failed: 0,
//   pending: 1,
//   inProgress: 1,
//   blocked: 0,
//   percentComplete: 60,
// }
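
The numbers in that summary can be derived from per-step statuses. The sketch below assumes percentComplete counts only completed steps and rounds to the nearest integer; `summarize` and the status names are hypothetical.

```typescript
// Hypothetical progress summary; assumes percentComplete = completed / total.
type StepStatus = 'pending' | 'in-progress' | 'completed' | 'failed' | 'blocked';

interface Progress {
  total: number;
  completed: number;
  failed: number;
  pending: number;
  inProgress: number;
  blocked: number;
  percentComplete: number;
}

function summarize(statuses: StepStatus[]): Progress {
  const count = (wanted: StepStatus): number =>
    statuses.filter((s) => s === wanted).length;
  const total = statuses.length;
  const completed = count('completed');
  return {
    total,
    completed,
    failed: count('failed'),
    pending: count('pending'),
    inProgress: count('in-progress'),
    blocked: count('blocked'),
    // Avoid dividing by zero for an empty protocol.
    percentComplete: total === 0 ? 0 : Math.round((completed / total) * 100),
  };
}

const summary = summarize(['completed', 'completed', 'completed', 'in-progress', 'pending']);
// summary.percentComplete is 60 (3 of 5 steps completed)
```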

Built-in Protocol Templates

BaselineOS includes two templates you can use or extend:

`DEPLOYMENT_PROTOCOL`

5-step production deployment:

Run test suite (critical)
Security self-check (critical)
Build production artifacts (critical)
Deploy to staging (needs vercel-token, critical)
Promote to production (needs vercel-token, trust 85, critical)

`ENV_CONFIGURATION_PROTOCOL`

4-step environment configuration:

Validate service URLs (critical)
Remove stale env vars (needs vercel-token)
Set new env vars (needs vercel-token, critical)
Verify configuration (needs vercel-token, critical)

Real-World Example: Vercel Environment Setup

A pattern taken from a real workflow: setting service URLs in the Vercel production environment.

registry.register({
  id: 'configure-service-urls',
  name: 'Configure Service URLs',
  description: 'Set API service URLs in Vercel production environment',
  minTrustScore: 60,
  author: 'platform-team',
  version: '1.0',
  tags: ['configuration', 'vercel', 'production'],
  steps: [
    {
      id: 'S1',
      name: 'Validate API endpoints',
      instructions: 'Verify each service URL returns 200',
      command: [
        'curl -sf https://griot-api.run.app/api/v1/index54/snapshot > /dev/null',
        'curl -sf https://griot-api.run.app/api/v1/ledger54/dossier > /dev/null',
        'curl -sf https://griot-api.run.app/api/v1/observatory54/coverage > /dev/null',
      ].join(' && '),
      requiredCapabilities: [],
      dependsOn: [],
      critical: true,
    },
    {
      id: 'S2',
      name: 'Set INDEX54_SERVICE_URL',
      instructions: 'Configure index54 snapshot endpoint',
      command: 'vercel env rm INDEX54_SERVICE_URL production --yes 2>/dev/null; echo "https://griot-api.run.app/api/v1/index54/snapshot" | vercel env add INDEX54_SERVICE_URL production',
      requiredCapabilities: [
        { credentialName: 'vercel-token', envVar: 'VERCEL_TOKEN', reason: 'Vercel env management' },
      ],
      dependsOn: ['S1'],
      parallel: true,
      critical: true,
    },
    {
      id: 'S3',
      name: 'Set LEDGER_DOSSIER_SERVICE_URL',
      instructions: 'Configure ledger54 dossier endpoint',
      command: 'vercel env rm LEDGER_DOSSIER_SERVICE_URL production --yes 2>/dev/null; echo "https://griot-api.run.app/api/v1/ledger54/dossier" | vercel env add LEDGER_DOSSIER_SERVICE_URL production',
      requiredCapabilities: [
        { credentialName: 'vercel-token', envVar: 'VERCEL_TOKEN', reason: 'Vercel env management' },
      ],
      dependsOn: ['S1'],
      parallel: true,
      critical: true,
    },
    {
      id: 'S4',
      name: 'Set OBSERVATORY_COVERAGE_URL',
      instructions: 'Configure observatory54 coverage endpoint',
      command: 'vercel env rm OBSERVATORY_COVERAGE_URL production --yes 2>/dev/null; echo "https://griot-api.run.app/api/v1/observatory54/coverage" | vercel env add OBSERVATORY_COVERAGE_URL production',
      requiredCapabilities: [
        { credentialName: 'vercel-token', envVar: 'VERCEL_TOKEN', reason: 'Vercel env management' },
      ],
      dependsOn: ['S1'],
      parallel: true,
      critical: true,
    },
    {
      id: 'S5',
      name: 'Verify configuration',
      instructions: 'List env vars and confirm all URLs are set',
      command: 'vercel env ls production',
      requiredCapabilities: [
        { credentialName: 'vercel-token', envVar: 'VERCEL_TOKEN', reason: 'Read env vars' },
      ],
      dependsOn: ['S2', 'S3', 'S4'],
      validation: {
        type: 'output-contains',
        expected: 'INDEX54_SERVICE_URL',
        failureMessage: 'Service URLs not configured',
      },
      critical: true,
    },
  ],
});

Execution flow:

S1: Validate URLs (curl health checks) ──────────────────────→ ✓
S2: Set INDEX54_SERVICE_URL         ──┐
S3: Set LEDGER_DOSSIER_SERVICE_URL  ──┼── (parallel, all need vercel-token lease) → ✓
S4: Set OBSERVATORY_COVERAGE_URL    ──┘
S5: Verify all URLs set (vercel env ls) ──────────────────────→ ✓

Each step that requires VERCEL_TOKEN (S2 through S5) gets its own time-boxed lease. The agent never sees the token value. After each step, the lease is revoked. The full execution is audited.