Intent-Based Operations: Design Specification

Status: Phase 5 Complete
Version: 0.6.0
Created: 2025-12-18
Updated: 2025-12-23

Foundational Specifications (P0)

The following specifications define the "physics" of the system and must be settled before any code is implemented:

| Spec | Description | Status |
| --- | --- | --- |
| spec-templating-v1.md | Variable interpolation, render modes, escaping | Normative |
| spec-locking-v1.md | Lock format, acquisition, heartbeat, stale recovery | Normative |
| spec-audit-v1.md | Canonical JSON, hash chains, event schemas | Normative |

These specs are authoritative. Implementation MUST conform to them.

Executive Summary

This document specifies a migration from our current natural language + CLAUDE.md model to a deterministic, auditable intent-based operations layer with:

- Typed intents with explicit parameters and execution context
- Policy gates (not just prereqs) that enforce controls based on risk
- Transaction-style audit logging (JSONL) with full reproducibility
- Concurrency control via explicit locking
- LLM as proposer, not decider: schema validation is the authority

Execution Strategy: Hybrid Approach

We evaluated build vs. integrate options and recommend a hybrid approach:

┌─────────────────────────────────────────────────────────────────┐
│                    INTENT LAYER (build)                         │
│  Schema validation, policy gate, context capture, audit log     │
└─────────────────────────────────────────────────────────────────┘
                              │
              ┌───────────────┴───────────────┐
              ▼                               ▼
┌─────────────────────────┐     ┌─────────────────────────┐
│   mode: reconcile       │     │   mode: mutate/observe  │
│   (high-risk deploys)   │     │   (scaffolds, one-offs) │
└─────────────────────────┘     └─────────────────────────┘
              │                               │
              ▼                               ▼
┌─────────────────────────┐     ┌─────────────────────────┐
│   Git commit → Argo CD  │     │   Local executor        │
│   (manifest is artifact)│     │   (run-intent.sh)       │
└─────────────────────────┘     └─────────────────────────┘

Rationale

| Path | Use For | Why |
| --- | --- | --- |
| GitOps (Argo CD) | Production deploys | Immutable artifacts, rollback via revert, audit via git |
| Local executor | Scaffolding, migrations, observe | Lightweight, no new infra, full control |

What We Build vs. Borrow

| Component | Approach |
| --- | --- |
| Intent schema + validation | Build (this is our unique contract) |
| Policy evaluator | Build (simple rules.yaml, not full OPA) |
| Audit logging (JSONL) | Build (hash-chained, tamper-evident) |
| Locking | Build (file-based with TTL) |
| Executor for observe/mutate | Build (bash state machine) |
| Deploys to K8s | Borrow (GitOps via Argo CD) |
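
To make "file-based with TTL" concrete, here is a minimal Python sketch of one way such a lock could behave. It is an illustration only: the function names, the JSON field names (`owner`, `acquired_at`, `ttl_seconds`), and the reclaim strategy are assumptions, not the format defined in spec-locking-v1, and the heartbeat that spec requires is omitted. Reclaiming a stale lock as written here is racy when two processes do it at once.

```python
import json
import os
import tempfile
import time


def acquire_lock(path, owner, ttl_seconds, now=None):
    """Try to take the lock at `path`; return True on success, False if held."""
    now = time.time() if now is None else now
    for _ in range(2):  # the second pass only happens after a retry condition
        try:
            # O_EXCL makes creation atomic: exactly one caller can create the file.
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
        except FileExistsError:
            try:
                with open(path) as f:
                    held = json.load(f)
            except (FileNotFoundError, json.JSONDecodeError):
                continue  # holder just released, or is mid-write; retry once
            if now - held["acquired_at"] <= held["ttl_seconds"]:
                return False  # live lock held by someone else
            try:
                os.remove(path)  # stale: TTL expired, reclaim it
            except FileNotFoundError:
                pass  # another process already removed it
            continue
        with os.fdopen(fd, "w") as f:
            json.dump({"owner": owner, "acquired_at": now, "ttl_seconds": ttl_seconds}, f)
        return True
    return False


def release_lock(path, owner):
    """Remove the lock only if `owner` still holds it."""
    try:
        with open(path) as f:
            if json.load(f)["owner"] != owner:
                return False
    except FileNotFoundError:
        return False
    os.remove(path)
    return True


# Demo with injected clock values so the TTL behavior is visible.
lock_path = os.path.join(tempfile.mkdtemp(), "money-tracker-production.lock")
first = acquire_lock(lock_path, "req_abc123", ttl_seconds=900, now=1000.0)
second = acquire_lock(lock_path, "req_def456", ttl_seconds=900, now=1500.0)
after_ttl = acquire_lock(lock_path, "req_def456", ttl_seconds=900, now=2000.0)
```

In the demo, the second request is refused while the first lock is within its TTL, and succeeds once the TTL has elapsed and the lock is treated as stale.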

Architecture

Layer Model

┌────────────────────────────────────────────────────────────────────┐
│                    NATURAL LANGUAGE LAYER                          │
│  User speaks freely. LLM proposes (intent + params). Never decides.│
└────────────────────────────────────────────────────────────────────┘
                                │
                    ┌───────────┴───────────┐
                    ▼                       ▼
         ┌──────────────────┐    ┌──────────────────┐
         │  Slash Command   │    │  LLM Proposal    │
         │  (deterministic) │    │  (best-effort)   │
         └──────────────────┘    └──────────────────┘
                    │                       │
                    └───────────┬───────────┘
                                ▼
┌────────────────────────────────────────────────────────────────────┐
│                       INTENT LAYER                                 │
│  Schema validation: intent name + typed params + execution context │
│  If invalid: REJECT. No fallback to ad-hoc for formal intents.     │
└────────────────────────────────────────────────────────────────────┘
                                │
                                ▼
┌────────────────────────────────────────────────────────────────────┐
│                       POLICY LAYER                                 │
│  Pure function: (intent, params, env_facts) → decision             │
│  Decisions: allow | deny | require_confirmation | require_approval │
│  NOT vibes. Explicit risk → control mappings.                      │
└────────────────────────────────────────────────────────────────────┘
                                │
                                ▼
┌────────────────────────────────────────────────────────────────────┐
│                     EXECUTION LAYER                                │
│  Transaction-style: each step is an audited event                  │
│  Locking prevents concurrent mutations to same resource            │
│  Rollback hooks for reversible steps                               │
└────────────────────────────────────────────────────────────────────┘
                                │
                                ▼
┌────────────────────────────────────────────────────────────────────┐
│                    VERIFICATION LAYER                              │
│  Contract-based: explicit success criteria, not "looks okay"       │
│  Distinguish "deploy succeeded, verify failed" from "deploy failed"│
└────────────────────────────────────────────────────────────────────┘
                                │
                                ▼
┌────────────────────────────────────────────────────────────────────┐
│                       AUDIT LAYER                                  │
│  Append-only JSONL. Hash-chained for tamper evidence.              │
│  Contains: intent hash, execution context, all decisions, outputs  │
└────────────────────────────────────────────────────────────────────┘

Intent Schema

Full Schema (YAML)

apiVersion: tower-fleet/v1
kind: Intent
metadata:
  name: deploy-app
  version: "1.2.0"                    # Semantic version for evolution
  description: Deploy application to Kubernetes cluster
  owners: ["@jake"]                   # Who gets paged on failure

spec:
  # How the LLM should recognize this intent
  triggers:
    - "deploy {app} to {environment}"
    - "ship {app} to {environment}"
    - "release {app}"
    - "push {app} to k8s"

  # Explicit semantics
  mode: mutate                         # observe | reconcile | mutate
  capabilities:                        # What this intent can touch
    - k8s.apply
    - registry.push
    - k8s.rollout

  # Risk classification → controls
  risk: high
  controls:
    confirmation: required             # Must confirm before execution
    explicit_env: true                 # "production" must be explicit, not defaulted
    lock: "${params.app}-${params.environment}"  # Prevent concurrent runs

  # What this intent touches (for auditing + policy)
  resources:
    - kind: namespace
      name: "${params.app}"
    - kind: deployment
      name: "${params.app}"
      namespace: "${params.app}"
    - kind: registry
      path: "registry.internal/${params.app}"

  # Typed parameters
  # Note: Validation runs AFTER all params are resolved, so use ${params.*}
  parameters:
    - name: app
      type: string
      required: true
      validate:
        - rule: dir_exists
          path: "/root/projects/${params.app}"
          error: "Project directory not found"

    - name: environment
      type: enum
      values: [sandbox, production]
      required: true                   # No default for high-risk

    - name: skip_build_check
      type: boolean
      default: false
      allowed_in: [sandbox]            # Cannot skip in production

    - name: skip_health_check
      type: boolean
      default: false
      allowed_in: []                   # Never allowed

    - name: allow_dirty
      type: boolean
      default: false
      description: "Allow deploy from dirty git state"

  # Secrets this intent needs (declared, never values)
  secrets:
    - name: REGISTRY_PASSWORD
      source: env
    - name: KUBECONFIG
      source: file
      path: "~/.kube/config"

  # Prerequisites: "can this run?"
  prerequisites:
    - name: auth_configured
      check:
        type: file_exists
        path: "/root/projects/${params.app}/lib/auth/config.ts"
      error: "Auth not configured. Run `/migrate:auth ${params.app}` first."
      blocking: true

    - name: dockerfile_exists
      check:
        type: file_exists
        path: "/root/projects/${params.app}/Dockerfile"
      error: "No Dockerfile. See docs/workflows/production-deployment.md"
      blocking: true

    - name: manifests_exist
      check:
        type: dir_exists
        path: "/root/tower-fleet/manifests/apps/${params.app}"
      error: "K8s manifests not found."
      blocking: true

    - name: git_clean
      check:
        type: exec
        command: "git -C /root/projects/${params.app} diff --quiet HEAD"
      error: "Working directory has uncommitted changes."
      blocking: true
      skip_if: allow_dirty

    - name: build_passes
      check:
        type: exec
        command: "npm run build"
        cwd: "/root/projects/${params.app}"
        timeout: 300s
      error: "Build failed. Fix errors first."
      blocking: true
      skip_if: skip_build_check

  # Execution context to capture (for reproducibility)
  context:
    capture:
      - name: git_sha
        command: "git -C /root/projects/${params.app} rev-parse HEAD"
      - name: git_dirty
        command: "git -C /root/projects/${params.app} status --porcelain"
      - name: node_version
        command: "node --version"
      - name: docker_version
        command: "docker --version"
      - name: kubectl_version
        command: "kubectl version --client -o json"
      - name: kube_context
        command: "kubectl config current-context"
      - name: intent_hash
        command: "sha256sum /root/tower-fleet/intents/deploy-app.yaml | cut -d' ' -f1"

  # Timeouts
  timeouts:
    global: 600s                       # Max total execution time
    per_step: 300s                     # Default per-step timeout
    lock_ttl: 900s                     # Lock TTL before considered stale

  # Runbook steps
  runbook:
    - name: build_image
      description: "Build Docker image"
      exec:
        command: |
          docker build \
            -t registry.internal/${params.app}:${context.git_sha} \
            --build-arg AUTH_MODE=authentik \
            /root/projects/${params.app}
        timeout: 300s
      rollback:
        command: "docker rmi registry.internal/${params.app}:${context.git_sha} || true"
      outputs:
        - name: image_tag
          value: "registry.internal/${params.app}:${context.git_sha}"

    - name: push_image
      description: "Push image to registry"
      exec:
        command: "docker push registry.internal/${params.app}:${context.git_sha}"
        timeout: 120s
      # No rollback - registry push is append-only

    - name: deploy_manifests
      description: "Apply Kubernetes manifests"
      exec:
        command: "kubectl apply -k /root/tower-fleet/manifests/apps/${params.app}"
        timeout: 60s
      rollback:
        command: "kubectl rollout undo deployment/${params.app} -n ${params.app}"

    - name: wait_rollout
      description: "Wait for rollout to complete"
      exec:
        command: "kubectl rollout status deployment/${params.app} -n ${params.app} --timeout=300s"
        timeout: 330s
      # Rollback handled by previous step

  # Verification: explicit success criteria
  # v1: exec checks only (typed checks like k8s_condition/http deferred to v2)
  verify:
    - name: pods_ready
      description: "All pods in Ready state"
      check:
        type: exec
        command: |
          kubectl wait deployment/${params.app} -n ${params.app} \
            --for=condition=Available --timeout=120s
      retries: 10
      delay: 10s

    - name: health_endpoint
      description: "Health endpoint returns 200"
      check:
        type: exec
        command: "curl -sf https://${params.app}.bogocat.com/api/health"
      retries: 5
      delay: 15s

  # Output to user
  # Runtime vars (failed_step, rollback_status) use facts.* scope
  outputs:
    success:
      - "Deployed ${params.app} to ${params.environment}"
      - "Image: registry.internal/${params.app}:${context.git_sha}"
      - "URL: https://${params.app}.bogocat.com"
    failure:
      - "Deployment failed at step: ${facts.failed_step}"
      - "Logs: kubectl logs -n ${params.app} -l app=${params.app}"
      - "Rollback initiated: ${facts.rollback_status}"
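
The `${params.*}`, `${context.*}`, and `${facts.*}` placeholders used throughout the schema can be illustrated with a small Python sketch. This is a simplified stand-in, not the resolver defined in spec-templating-v1: the `render` function and its strict "fail on anything unresolved" behavior are assumptions, and it performs no shell escaping, which a real resolver must handle before values reach an `exec` command.

```python
import re

# The four scopes named in this document's template resolver.
SCOPES = ("params", "context", "steps", "facts")
PLACEHOLDER = re.compile(r"\$\{([a-z]+)\.([A-Za-z0-9_.]+)\}")


def render(template, scopes):
    """Replace each ${scope.name} with its value; fail on anything unresolved."""
    def substitute(match):
        scope, key = match.group(1), match.group(2)
        if scope not in SCOPES:
            raise KeyError(f"unknown scope: {scope}")
        if key not in scopes.get(scope, {}):
            raise KeyError(f"unresolved variable: {scope}.{key}")
        return str(scopes[scope][key])
    return PLACEHOLDER.sub(substitute, template)


scopes = {
    "params": {"app": "money-tracker", "environment": "production"},
    "context": {"git_sha": "abc123def456"},
}
image = render("registry.internal/${params.app}:${context.git_sha}", scopes)
lock = render("${params.app}-${params.environment}", scopes)
print(image)  # registry.internal/money-tracker:abc123def456
print(lock)   # money-tracker-production
```

Failing loudly on an unresolved variable, rather than substituting an empty string, keeps a typo in a template from silently producing a command such as `kubectl apply -k /root/tower-fleet/manifests/apps/`.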

JSON Schema for Validation

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "$id": "https://tower-fleet/intent/v1",
  "type": "object",
  "required": ["apiVersion", "kind", "metadata", "spec"],
  "properties": {
    "apiVersion": {
      "type": "string",
      "pattern": "^tower-fleet/v[0-9]+$"
    },
    "kind": {
      "const": "Intent"
    },
    "metadata": {
      "type": "object",
      "required": ["name", "version"],
      "properties": {
        "name": { "type": "string", "pattern": "^[a-z][a-z0-9-]*$" },
        "version": { "type": "string", "pattern": "^[0-9]+\\.[0-9]+\\.[0-9]+$" },
        "description": { "type": "string" },
        "owners": { "type": "array", "items": { "type": "string" } }
      }
    },
    "spec": {
      "type": "object",
      "required": ["mode", "risk", "parameters", "runbook"],
      "properties": {
        "mode": { "enum": ["observe", "reconcile", "mutate"] },
        "capabilities": {
          "type": "array",
          "items": { "type": "string" }
        },
        "risk": { "enum": ["low", "medium", "high", "destructive"] },
        "controls": {
          "type": "object",
          "properties": {
            "confirmation": { "enum": ["none", "optional", "required"] },
            "explicit_env": { "type": "boolean" },
            "lock": { "type": "string" },
            "require_approval": { "type": "boolean" }
          }
        },
        "parameters": {
          "type": "array",
          "items": {
            "type": "object",
            "required": ["name", "type"],
            "properties": {
              "name": { "type": "string" },
              "type": { "enum": ["string", "boolean", "enum", "integer"] },
              "required": { "type": "boolean" },
              "default": {},
              "values": { "type": "array" },
              "allowed_in": { "type": "array", "items": { "type": "string" } },
              "validate": { "type": "array" }
            }
          }
        },
        "prerequisites": { "type": "array" },
        "context": { "type": "object" },
        "timeouts": {
          "type": "object",
          "properties": {
            "global": { "type": "string" },
            "per_step": { "type": "string" },
            "lock_ttl": { "type": "string" }
          }
        },
        "runbook": { "type": "array" },
        "verify": { "type": "array" },
        "outputs": { "type": "object" }
      }
    }
  }
}
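
The JSON Schema above validates the shape of an intent definition. The per-invocation rules declared under `parameters` (required, enum values, defaults, `allowed_in`) still have to be applied to the values a caller supplies. The Python sketch below shows one possible reading of those rules. The function name `resolve_params` is hypothetical, and the interpretation of `allowed_in` (a non-default value is legal only in the listed environments) is an assumption drawn from the comments in the YAML example.

```python
def resolve_params(definitions, supplied):
    """Apply defaults and enforce required / enum / allowed_in.

    Returns (params, errors); an empty error list means the call is valid.
    """
    params, errors = {}, []
    known = {d["name"] for d in definitions}
    for name in supplied:
        if name not in known:
            errors.append(f"unknown parameter: {name}")
    for d in definitions:
        name = d["name"]
        if name in supplied:
            value = supplied[name]
        elif d.get("required"):
            # Required parameters never fall back to a default.
            errors.append(f"missing required parameter: {name}")
            continue
        else:
            value = d.get("default")
        if d["type"] == "enum" and value not in d["values"]:
            errors.append(f"{name} must be one of {d['values']}")
        if d["type"] == "boolean" and not isinstance(value, bool):
            errors.append(f"{name} must be a boolean")
        params[name] = value
    # Assumed meaning of allowed_in: a non-default value is only legal
    # in the environments listed.
    env = params.get("environment")
    for d in definitions:
        if "allowed_in" in d and params.get(d["name"]) != d.get("default"):
            if env not in d["allowed_in"]:
                errors.append(f"{d['name']} is not allowed in {env}")
    return params, errors


DEFS = [
    {"name": "app", "type": "string", "required": True},
    {"name": "environment", "type": "enum",
     "values": ["sandbox", "production"], "required": True},
    {"name": "skip_build_check", "type": "boolean",
     "default": False, "allowed_in": ["sandbox"]},
    {"name": "skip_health_check", "type": "boolean",
     "default": False, "allowed_in": []},
]

ok_params, ok_errors = resolve_params(
    DEFS, {"app": "money-tracker", "environment": "sandbox", "skip_build_check": True})
_, missing_env = resolve_params(DEFS, {"app": "money-tracker"})
_, prod_skip = resolve_params(
    DEFS, {"app": "money-tracker", "environment": "production", "skip_build_check": True})
```

With these definitions, omitting `environment` is an error rather than a silent default, and `skip_build_check=true` is accepted in sandbox but rejected in production.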

Audit Log Schema (JSONL)

Each line is a JSON object representing an event. Events are hash-chained for tamper evidence.

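Note: Per spec-audit-v1, every event MUST include audit_version, event, request_id, timestamp, prev_hash, and event_hash. The examples below are illustrative: several omit audit_version and event_hash for brevity, and their hash values are placeholders that do not chain consistently from one example to the next. See the spec for complete schemas.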
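
As an illustration of what hash chaining buys, the following Python sketch seals each event with a hash over its own content plus the previous event's hash, then verifies the chain. It is a sketch under assumptions: the serialization in `canonical` (sorted keys, compact separators) and the choice of which fields feed the hash stand in for the canonical JSON rules in spec-audit-v1, and the helper names are hypothetical.

```python
import hashlib
import json


def canonical(event):
    """Serialize deterministically: sorted keys, no insignificant whitespace."""
    return json.dumps(event, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


def seal(event, prev_hash):
    """Return a copy of `event` with prev_hash and event_hash filled in."""
    body = {k: v for k, v in event.items() if k != "event_hash"}
    body["prev_hash"] = prev_hash
    digest = hashlib.sha256(canonical(body).encode("utf-8")).hexdigest()
    return {**body, "event_hash": f"sha256:{digest}"}


def verify_chain(events):
    """Check every event's own hash and its link to the previous event."""
    prev = None
    for e in events:
        recomputed = seal(e, e["prev_hash"])["event_hash"]
        if e["prev_hash"] != prev or recomputed != e["event_hash"]:
            return False
        prev = e["event_hash"]
    return True


# Build a three-event chain; the first event has prev_hash = None.
chain, prev = [], None
for name in ("intent_received", "params_resolved", "policy_evaluated"):
    sealed = seal({"audit_version": "v1", "event": name, "request_id": "req_abc123"}, prev)
    chain.append(sealed)
    prev = sealed["event_hash"]

intact = verify_chain(chain)

# Editing any field of an earlier event breaks verification.
tampered = [dict(e) for e in chain]
tampered[1]["request_id"] = "req_evil"
tamper_detected = not verify_chain(tampered)
```

Because each `event_hash` covers `prev_hash`, editing or deleting an earlier line invalidates every later link, which is what makes the log tamper-evident rather than merely append-only.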

Event Types

// Request received (first event in chain)
{
  "audit_version": "v1",
  "event": "intent_received",
  "request_id": "req_abc123",
  "timestamp": "2025-12-18T10:30:00Z",
  "prev_hash": null,
  "event_hash": "sha256:a1b2c3...",
  "intent": "deploy-app",
  "intent_version": "1.2.0",
  "intent_hash": "sha256:def456...",
  "policy_hash": "sha256:789abc...",
  "executor_version": "0.1.0",
  "host_id": "tower-01",
  "actor": "claude-code",
  "source": "slash_command",
  "raw_input": "/deploy:app money-tracker --env=production"
}

// Parameters resolved
{
  "audit_version": "v1",
  "event": "params_resolved",
  "request_id": "req_abc123",
  "timestamp": "2025-12-18T10:30:01Z",
  "prev_hash": "sha256:a1b2c3...",
  "event_hash": "sha256:d4e5f6...",
  "params": {
    "app": "money-tracker",
    "environment": "production",
    "skip_build_check": false,
    "allow_dirty": false
  },
  "params_hash": "sha256:fedcba..."
}

// Execution context captured
{
  "event": "context_captured",
  "request_id": "req_abc123",
  "timestamp": "2025-12-18T10:30:02Z",
  "context": {
    "git_sha": "abc123def456",
    "git_dirty": "",
    "node_version": "v22.0.0",
    "docker_version": "Docker version 24.0.0",
    "kubectl_version": "v1.28.0",
    "kube_context": "k3s-cluster",
    "intent_hash": "sha256:abc..."
  },
  "prev_hash": "sha256:ghi..."
}

// Policy evaluated
{
  "event": "policy_evaluated",
  "request_id": "req_abc123",
  "timestamp": "2025-12-18T10:30:03Z",
  "decision": "require_confirmation",
  "reasons": [
    "risk=high requires confirmation",
    "environment=production requires explicit_env"
  ],
  "prev_hash": "sha256:jkl..."
}

// Prerequisite checked
{
  "event": "prereq_checked",
  "request_id": "req_abc123",
  "timestamp": "2025-12-18T10:30:04Z",
  "prereq": "auth_configured",
  "result": "pass",
  "check_output_hash": "sha256:mno...",
  "prev_hash": "sha256:pqr..."
}

// Lock acquired
{
  "event": "lock_acquired",
  "request_id": "req_abc123",
  "timestamp": "2025-12-18T10:30:10Z",
  "lock_name": "money-tracker-production",
  "lock_path": "/root/tower-fleet/logs/locks/money-tracker-production.lock",
  "prev_hash": "sha256:stu..."
}

// Plan rendered (shown to user) - confirmation is tied to this plan_hash
{
  "audit_version": "v1",
  "event": "plan_rendered",
  "request_id": "req_abc123",
  "timestamp": "2025-12-18T10:30:11Z",
  "prev_hash": "sha256:vwx...",
  "event_hash": "sha256:xyz...",
  "steps": [
    { "name": "build_image", "description": "Build Docker image" },
    { "name": "push_image", "description": "Push image to registry" },
    { "name": "deploy_manifests", "description": "Apply Kubernetes manifests" },
    { "name": "wait_rollout", "description": "Wait for rollout to complete" }
  ],
  "plan_hash": "sha256:plan123..."
}

// Confirmation received
{
  "event": "confirmation_received",
  "request_id": "req_abc123",
  "timestamp": "2025-12-18T10:30:30Z",
  "confirmed": true,
  "confirmation_method": "interactive",  // or "flag", "approval_file"
  "prev_hash": "sha256:yza..."
}

// Step started
{
  "event": "step_started",
  "request_id": "req_abc123",
  "timestamp": "2025-12-18T10:30:31Z",
  "step": "build_image",
  "command_hash": "sha256:bcd...",
  "prev_hash": "sha256:efg..."
}

// Step finished
{
  "event": "step_finished",
  "request_id": "req_abc123",
  "timestamp": "2025-12-18T10:32:15Z",
  "step": "build_image",
  "result": "success",
  "exit_code": 0,
  "duration_ms": 104000,
  "stdout_hash": "sha256:hij...",
  "stderr_hash": "sha256:klm...",
  "outputs": {
    "image_tag": "registry.internal/money-tracker:abc123def456"
  },
  "prev_hash": "sha256:nop..."
}

// Verification
{
  "event": "verify_finished",
  "request_id": "req_abc123",
  "timestamp": "2025-12-18T10:35:00Z",
  "verification": "pods_ready",
  "result": "pass",
  "attempts": 3,
  "prev_hash": "sha256:qrs..."
}

// Final result
{
  "event": "intent_completed",
  "request_id": "req_abc123",
  "timestamp": "2025-12-18T10:35:30Z",
  "result": "success",
  "duration_ms": 330000,
  "steps_completed": 4,
  "steps_failed": 0,
  "verifications_passed": 2,
  "outputs": {
    "url": "https://money-tracker.bogocat.com",
    "image": "registry.internal/money-tracker:abc123def456"
  },
  "prev_hash": "sha256:tuv..."
}

// Rollback (if needed)
{
  "event": "rollback_started",
  "request_id": "req_abc123",
  "timestamp": "2025-12-18T10:33:00Z",
  "reason": "step deploy_manifests failed",
  "rollback_to_step": "push_image",
  "prev_hash": "sha256:wxy..."
}

Policy Rules

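Policy is a pure function: (intent, params, context) → decision. The layer diagram calls the third argument env_facts. Because the function is pure, the same inputs always produce the same decision, so a recorded policy_evaluated event can be re-derived from the recorded inputs.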

Risk → Controls Mapping

# tower-fleet/policy/rules.yaml
policies:
  - name: high_risk_confirmation
    match:
      risk: high
    require:
      confirmation: true
      explicit_env: true
    log:
      actor: true

  - name: destructive_requires_typing
    match:
      risk: destructive
    require:
      confirmation: true
      confirmation_kind: type_to_confirm  # Must type resource name
      confirm_value: "${params.app}"      # What they must type
      log_retention: 365d

  - name: production_explicit
    match:
      environment: production
    require:
      explicit_env: true               # Cannot default to prod

  - name: no_skip_health_in_prod
    match:
      environment: production
      params.skip_health_check: true
    decision: deny
    reason: "Health checks cannot be skipped in production"

  - name: observe_always_allowed
    match:
      mode: observe
    decision: allow                    # Read-only is safe

Decision Types

| Decision | Meaning |
| --- | --- |
| `allow` | Proceed immediately |
| `deny` | Reject, explain why |
| `require_confirmation` | Show plan, wait for explicit yes |
| `require_approval` | Need approval file/second human |

Confirmation kinds (when decision is require_confirmation):

| Kind | Meaning |
| --- | --- |
| `interactive` | Simple yes/no confirmation |
| `type_to_confirm` | Must type specific value (e.g., resource name) |
| `flag` | Must pass `--confirm` flag |
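
The rules and decision types above can be read as a small pure function. The Python sketch below is one possible interpretation, not the behavior of the actual policy evaluator: how several matching rules combine is not stated in this document, so "the strictest matching decision wins" and "allow when nothing matches" are both assumptions, as are the `evaluate` name and the `params.<name>` spelling of match keys. Controls such as `explicit_env` and `confirmation_kind` are left out.

```python
# Ordered from least to most restrictive (assumed ordering).
SEVERITY = ["allow", "require_confirmation", "require_approval", "deny"]


def evaluate(intent, params, rules):
    """Return (decision, reasons). Pure: no I/O, same inputs give same output."""
    facts = {
        "risk": intent["risk"],
        "mode": intent["mode"],
        "environment": params.get("environment"),
    }
    facts.update({f"params.{k}": v for k, v in params.items()})
    decision, reasons = "allow", []
    for rule in rules:
        # A rule applies only if every key in its match block agrees.
        if any(facts.get(k) != v for k, v in rule["match"].items()):
            continue
        if "decision" in rule:
            outcome = rule["decision"]
        elif rule.get("require", {}).get("confirmation"):
            outcome = "require_confirmation"
        else:
            outcome = "allow"
        reasons.append(rule.get("reason", rule["name"]))
        # Assumption: the strictest matching rule wins.
        if SEVERITY.index(outcome) > SEVERITY.index(decision):
            decision = outcome
    return decision, reasons


RULES = [
    {"name": "high_risk_confirmation", "match": {"risk": "high"},
     "require": {"confirmation": True, "explicit_env": True}},
    {"name": "no_skip_health_in_prod",
     "match": {"environment": "production", "params.skip_health_check": True},
     "decision": "deny",
     "reason": "Health checks cannot be skipped in production"},
    {"name": "observe_always_allowed", "match": {"mode": "observe"},
     "decision": "allow"},
]

deploy = {"risk": "high", "mode": "mutate"}
observe = {"risk": "low", "mode": "observe"}

d1 = evaluate(deploy, {"app": "money-tracker", "environment": "production"}, RULES)
d2 = evaluate(deploy, {"app": "money-tracker", "environment": "production",
                       "skip_health_check": True}, RULES)
d3 = evaluate(observe, {"app": "money-tracker"}, RULES)
```

Keeping the evaluator free of I/O means the environment facts must be gathered before it is called, which matches the CAPTURE_CONTEXT step preceding EVALUATE_POLICY in the executor.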

Executor State Machine

                    ┌─────────────┐
                    │   START     │
                    └──────┬──────┘
                           │
                    ┌──────▼──────┐
                    │ PARSE_INTENT│──── invalid ────► REJECT
                    └──────┬──────┘
                           │ valid
                    ┌──────▼──────┐
                    │RESOLVE_PARAMS│──── invalid ───► REJECT
                    └──────┬──────┘
                           │
                    ┌──────▼──────┐
                    │CAPTURE_CONTEXT│
                    └──────┬──────┘
                           │
                    ┌──────▼──────┐
                    │EVALUATE_POLICY│──── deny ─────► REJECT
                    └──────┬──────┘
                           │ allow/confirm
                    ┌──────▼──────┐
                    │CHECK_PREREQS │──── fail ─────► REJECT
                    └──────┬──────┘
                           │ pass
                    ┌──────▼──────┐
                    │ACQUIRE_LOCK  │──── blocked ──► WAIT/REJECT
                    └──────┬──────┘
                           │
                    ┌──────▼──────┐
                    │ RENDER_PLAN  │
                    └──────┬──────┘
                           │
               ┌───────────┴───────────┐
               │                       │
        ┌──────▼──────┐         ┌──────▼──────┐
        │  DRY_RUN    │         │AWAIT_CONFIRM│──── rejected ─► ABORT
        └──────┬──────┘         └──────┬──────┘
               │                       │ confirmed
               ▼                       │
            OUTPUT              ┌──────▼──────┐
                               │ EXECUTE_STEPS│──── fail ────► ROLLBACK
                               └──────┬──────┘
                                      │ success
                               ┌──────▼──────┐
                               │   VERIFY    │──── fail ────► ROLLBACK
                               └──────┬──────┘
                                      │ pass
                               ┌──────▼──────┐
                               │RELEASE_LOCK │
                               └──────┬──────┘
                                      │
                               ┌──────▼──────┐
                               │  SUCCESS    │
                               └─────────────┘
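
The diagram above can be written down as a transition table, which makes every legal move explicit and turns anything else into an error. The Python sketch below mirrors the diagram under a few assumptions: the event labels (`valid`, `done`, `acquired`, `dry_run`, `execute`) are invented where the diagram leaves an edge unlabeled, the `blocked` edge is simplified to REJECT rather than WAIT/REJECT, and ROLLBACK is treated as terminal. The real executor is a bash script; this is not its implementation.

```python
# (state, event) -> next state, following the diagram edge by edge.
TRANSITIONS = {
    ("PARSE_INTENT", "valid"): "RESOLVE_PARAMS",
    ("PARSE_INTENT", "invalid"): "REJECT",
    ("RESOLVE_PARAMS", "valid"): "CAPTURE_CONTEXT",
    ("RESOLVE_PARAMS", "invalid"): "REJECT",
    ("CAPTURE_CONTEXT", "done"): "EVALUATE_POLICY",
    ("EVALUATE_POLICY", "allow"): "CHECK_PREREQS",
    ("EVALUATE_POLICY", "confirm"): "CHECK_PREREQS",
    ("EVALUATE_POLICY", "deny"): "REJECT",
    ("CHECK_PREREQS", "pass"): "ACQUIRE_LOCK",
    ("CHECK_PREREQS", "fail"): "REJECT",
    ("ACQUIRE_LOCK", "acquired"): "RENDER_PLAN",
    ("ACQUIRE_LOCK", "blocked"): "REJECT",
    ("RENDER_PLAN", "dry_run"): "DRY_RUN",
    ("RENDER_PLAN", "execute"): "AWAIT_CONFIRM",
    ("DRY_RUN", "done"): "OUTPUT",
    ("AWAIT_CONFIRM", "confirmed"): "EXECUTE_STEPS",
    ("AWAIT_CONFIRM", "rejected"): "ABORT",
    ("EXECUTE_STEPS", "success"): "VERIFY",
    ("EXECUTE_STEPS", "fail"): "ROLLBACK",
    ("VERIFY", "pass"): "RELEASE_LOCK",
    ("VERIFY", "fail"): "ROLLBACK",
    ("RELEASE_LOCK", "done"): "SUCCESS",
}
TERMINAL = {"REJECT", "ABORT", "OUTPUT", "ROLLBACK", "SUCCESS"}


def run(events):
    """Walk the table from PARSE_INTENT; return the list of states visited."""
    state, path = "PARSE_INTENT", ["PARSE_INTENT"]
    for event in events:
        if state in TERMINAL:
            break
        if (state, event) not in TRANSITIONS:
            raise ValueError(f"no transition from {state} on {event!r}")
        state = TRANSITIONS[(state, event)]
        path.append(state)
    return path


happy = run(["valid", "valid", "done", "confirm", "pass", "acquired",
             "execute", "confirmed", "success", "pass", "done"])
denied = run(["valid", "valid", "done", "deny"])
verify_failed = run(["valid", "valid", "done", "confirm", "pass", "acquired",
                     "execute", "confirmed", "success", "fail"])
```

Writing the table out also surfaces a question the diagram leaves open: the lock is acquired before AWAIT_CONFIRM, but no release is drawn on the ABORT or ROLLBACK paths. The sketch mirrors the diagram and does not resolve that.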

LLM Integration Rules

Principle: LLM Proposes, Schema Validates, Policy Decides

User: "ship money-tracker"

┌─────────────────────────────────────────────────────────────────┐
│ 1. Check for slash command match first (deterministic)          │
│    Result: No exact match                                       │
└─────────────────────────────────────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────────────┐
│ 2. LLM proposes intent + params                                 │
│    Proposal: deploy-app(app="money-tracker", environment=???)   │
│    Confidence: 0.9                                              │
│    Reasoning: "ship" matches trigger pattern                    │
└─────────────────────────────────────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────────────┐
│ 3. Schema validation                                            │
│    FAIL: "environment" is required, no default for high-risk    │
└─────────────────────────────────────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────────────┐
│ 4. Response to user                                             │
│    "I recognized this as a deploy intent, but I need to know:   │
│     - Which environment? (sandbox | production)                 │
│    Please specify, e.g.: 'ship money-tracker to production'"    │
└─────────────────────────────────────────────────────────────────┘

Rules

1. Slash commands > LLM proposals: Deterministic matcher runs first
2. No defaulting to production: High-risk intents require explicit environment
3. Log LLM reasoning: Record why it matched, for debugging
4. Schema is authority: LLM proposal must pass schema validation
5. Fallback is ask, not guess: If ambiguous, ask user to clarify
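
The rules above can be sketched as a single routing function. Everything in this Python example is illustrative: `handle`, `fake_llm`, the `key=value` argument syntax for slash commands, and the shape of the returned record are hypothetical and are not the interface of validate-proposal.sh or log-intent-proposal.sh. The stand-in `fake_llm` only recognizes "ship <app>", which is enough to reproduce the walkthrough above.

```python
def handle(user_input, slash_commands, propose, required):
    """Route one user message.

    slash_commands: command string -> intent name (deterministic path)
    propose:        stand-in for the LLM; returns (intent, params, reasoning)
    required:       intent name -> list of required parameter names
    """
    tokens = user_input.split()
    if tokens and tokens[0] in slash_commands:
        # Rule 1: the deterministic matcher runs first.
        intent = slash_commands[tokens[0]]
        params = dict(t.split("=", 1) for t in tokens[1:] if "=" in t)
        source, reasoning = "slash_command", "exact command match"
    else:
        intent, params, reasoning = propose(user_input)
        source = "llm_proposal"
    # Rule 3: keep the reasoning so a mismatch can be debugged later.
    record = {"source": source, "intent": intent,
              "params": params, "reasoning": reasoning}
    if intent not in required:
        return {**record, "action": "ask", "ask": ["intent"]}
    missing = [p for p in required[intent] if params.get(p) is None]
    if missing:
        # Rules 2 and 5: never fill in a default; ask for the missing values.
        return {**record, "action": "ask", "ask": missing}
    # Rule 4: full schema validation would run at this point.
    return {**record, "action": "proceed"}


def fake_llm(text):
    """Stand-in for the model: recognizes 'ship <app>' and nothing else."""
    words = text.split()
    if len(words) >= 2 and words[0] == "ship":
        proposal = {"app": words[1], "environment": None}
        return "deploy-app", proposal, '"ship" matches trigger pattern'
    return None, {}, "no trigger matched"


REQUIRED = {"deploy-app": ["app", "environment"]}
COMMANDS = {"/intents:deploy-app": "deploy-app"}

vague = handle("ship money-tracker", COMMANDS, fake_llm, REQUIRED)
exact = handle("/intents:deploy-app app=money-tracker environment=production",
               COMMANDS, fake_llm, REQUIRED)
unknown = handle("make it faster", COMMANDS, fake_llm, REQUIRED)
```

For "ship money-tracker" the function returns an `ask` action listing `environment`, matching step 4 of the walkthrough: the proposal is recognized but never executed with a guessed environment.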

Migration Phases (Revised)

Phase 1: Foundation (Audit + Executor First) ✓ COMPLETE

Goal: Build the executor and audit system BEFORE LLM integration.

[x] Create directory structure

tower-fleet/
├── intents/
│   └── examples/
│       ├── observe-app.yaml
│       └── deploy-app.yaml
├── scripts/
│   ├── run-intent.sh      # State machine executor
│   ├── audit-log.sh       # Hash-chained logging
│   ├── audit-viewer.sh    # View/verify audit trails
│   ├── lock-manager.sh    # TTL-based locking
│   └── template-resolve.sh # Variable interpolation
└── logs/
    ├── intents/           # JSONL audit logs by date
    └── locks/             # Lock files

[x] Implement JSONL audit logger (hash-chained, tamper-evident)
[x] Implement locking mechanism (TTL, heartbeat, stale detection)
[x] Implement template resolver (4 scopes: params, context, steps, facts)
[x] Create run-intent.sh state machine
[x] Create --dry-run mode
[x] Create audit log viewer (show, list, verify, search, tail)

Deliverables:

- ✓ Working executor (manually invoked via run-intent.sh)
- ✓ Audit logging with chain verification
- ✓ Locking with TTL and stale recovery

Usage:

# Dry-run (no changes)
./scripts/run-intent.sh intents/examples/observe-app.yaml --params app=home-portal --dry-run

# Real execution
./scripts/run-intent.sh intents/examples/observe-app.yaml --params app=home-portal

# View audit trail
./scripts/audit-viewer.sh list
./scripts/audit-viewer.sh show <request_id>
./scripts/audit-viewer.sh verify <request_id>

Phase 2: Core Intents (Complete)

Goal: Define 3-5 most common operations as intents.

[x] observe-app.yaml - Check app status (mode: observe) ✓ Tested
[x] deploy-app.yaml - Deploy to K8s (with locking, confirmation) ✓ Dry-run tested
[x] check-logs.yaml - View application logs with filters ✓ Tested
[x] restart-app.yaml - Rolling restart of deployment ✓ Tested (real)
[x] scale-app.yaml - Scale replicas with confirmation ✓ Dry-run tested
[x] migrate-schema.yaml - Run database migrations ✓ Dry-run tested
[x] create-nextjs-app.yaml - Scaffold new project ✓ Dry-run tested

Deliverables:

- 7 production-ready intent definitions ✓
- Validation tests for each (dry-run) ✓

Phase 3: Policy Engine (Complete)

Goal: Implement policy evaluation as pure function.

[x] Create policy evaluator (scripts/policy-evaluator.sh) ✓
[x] Implement risk → controls mapping (intents/policy/rules.yaml) ✓
[x] Add confirmation flow (yes/type-to-confirm) ✓
[x] Add "type to confirm" for destructive operations ✓
[x] Add risk levels to all intents ✓
[x] Integrate policy evaluator with run-intent.sh ✓
[x] Test policy edge cases ✓

Deliverables:

- Policy specification: docs/architecture/spec-policy-v1.md
- Policy rules: intents/policy/rules.yaml
- Policy evaluator: scripts/policy-evaluator.sh
- Risk-based controls enforcement in audit trail

Phase 4: Slash Commands ✓

Goal: Deterministic command interface.

[x] /intents:deploy-app <app> - Deploy application to K8s
[x] /intents:observe-app <app> - View pods, logs, events
[x] /intents:check-logs <app> - View logs with filters
[x] /intents:scale-app <app> <replicas> - Scale replicas
[x] /intents:restart-app <app> - Rolling restart
[x] /intents:migrate-schema <app> - Database migrations
[x] /intents:scaffold-nextjs <name> - Create new Next.js app

Deliverables:

- Slash commands in /root/.claude/commands/intents/
- Each command maps to corresponding intent YAML
- Commands provide context and execute via run-intent.sh

Phase 5: LLM Integration ✓

Goal: Connect natural language to intent system safely.

[x] Update CLAUDE.md with intent matching rules
[x] Implement proposal → validation flow (scripts/validate-proposal.sh)
[x] Add "ask, don't guess" for ambiguous cases
[x] Log all LLM proposals + reasoning (scripts/log-intent-proposal.sh)
[x] Add toggle system (intents/config.yaml, scripts/intent-config.sh)
[x] Refactor CLAUDE.md for clarity (690 → 224 lines)

Deliverables:

- LLM matching rules: intents/llm-rules.md
- Proposal logging: scripts/log-intent-proposal.sh
- Proposal validation: scripts/validate-proposal.sh
- Config toggle: intents/config.yaml
- Streamlined CLAUDE.md with intent system at top

Phase 6: Hardening

Goal: Production-ready reliability.

[ ] Replay capability (re-run from audit log)
[ ] Metrics dashboard (success rates, common failures)
[ ] Alert on failed intents
[ ] Periodic audit log integrity check
[ ] Documentation complete

Deliverables:

- Replay tooling
- Observability
- Documentation

Directory Structure (Final)

tower-fleet/
├── intents/
│   ├── schema.json              # JSON Schema for validation
│   ├── deploy-app.yaml
│   ├── observe-app.yaml
│   ├── migrate-schema.yaml
│   ├── rollback-deployment.yaml
│   ├── create-nextjs-app.yaml
│   └── commands.yaml            # Slash command → intent mapping
├── policy/
│   ├── rules.yaml               # Risk → controls mapping
│   └── evaluator.sh             # Policy evaluation script
├── scripts/
│   ├── run-intent.sh            # Main executor
│   ├── validate-intent.sh       # Schema validation
│   ├── template-resolve.sh      # Template resolution (per spec-templating-v1)
│   ├── lock-manager.sh          # Lock management (per spec-locking-v1)
│   ├── audit-log.sh             # Append to audit log (per spec-audit-v1)
│   ├── audit-viewer.sh          # View/search/verify audit logs
│   └── replay-intent.sh         # Replay from audit log
├── logs/
│   ├── intents/                 # JSONL audit logs
│   │   └── 2025-12-18/
│   │       └── req_abc123.jsonl
│   └── locks/                   # Lock files (JSON with metadata)
│       └── money-tracker-production.lock
└── docs/
    └── architecture/
        ├── intent-based-operations.md  # This document
        ├── spec-templating-v1.md       # P0: Templating semantics
        ├── spec-locking-v1.md          # P0: Locking specification
        └── spec-audit-v1.md            # P0: Audit/hashing specification

The 3 Non-Negotiable Upgrades

If nothing else, implement these:

1. Structured audit log (JSONL) + request_id + intent-definition hash
   - Without this, you can't debug or explain what happened
2. Locking + explicit production confirmation
   - Without this, concurrent deploys will corrupt state
3. Execution context capture (git sha, tool versions, kube context)
   - Without this, you can't reproduce or explain differences

Next Steps

1. Review this specification
2. Approve directory structure
3. Begin Phase 6: Hardening (Phases 1-5 are complete)