
Phase 7: Enhanced Operations - Execution Plan

Version: v0.8.0
Status: Complete
Created: 2025-12-29
Completed: 2025-12-29

Overview

Phase 7 expands intent coverage with high-value operational intents. Tasks are prioritized by practical utility and dependency order.


Implementation Order

Task 1: rollback-app Intent (P0)

Priority: Highest - Direct complement to deploy-app
Risk Level: High
Effort: ~1 hour

Why First: Every deploy needs a rollback capability. This is the most critical missing piece.

Features:

  • Rollback to previous revision (default)
  • Rollback to specific revision number
  • Rollback to specific version tag
  • Automatic verification after rollback

Interface:

# Rollback to previous revision
./scripts/run-intent.sh rollback-app.yaml --params app=home-portal

# Rollback to specific revision
./scripts/run-intent.sh rollback-app.yaml --params app=home-portal revision=3

# Rollback to specific version
./scripts/run-intent.sh rollback-app.yaml --params app=home-portal version=v1.0.2

Steps:

  1. Get current deployment state
  2. Identify target revision/version
  3. Execute kubectl rollout undo or kubectl set image
  4. Wait for rollout
  5. Verify deployment health
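As a rough illustration of step 3, the choice between kubectl rollout undo and kubectl set image could be made from the parameters supplied. This is a minimal sketch, not the actual intent definition: the helper name build_rollback_cmd, the registry path, and the assumption that the Deployment and its container share the app's name are all hypothetical. It only prints the command it would run.

```shell
#!/bin/sh
# Sketch only: prints the kubectl command instead of executing it.
# Assumptions (not from the plan): the Deployment and its container are both
# named after the app, and images live under a hypothetical registry path.
REGISTRY="registry.example.com"

build_rollback_cmd() {
  app="$1"
  revision="${2:-}"
  version="${3:-}"
  if [ -n "$revision" ] && [ -n "$version" ]; then
    echo "error: pass either revision or version, not both" >&2
    return 1
  fi
  if [ -n "$version" ]; then
    # Version tag: pin the container image explicitly.
    echo "kubectl set image deployment/$app $app=$REGISTRY/$app:$version"
  elif [ -n "$revision" ]; then
    # Specific revision number from the rollout history.
    echo "kubectl rollout undo deployment/$app --to-revision=$revision"
  else
    # Default: previous revision.
    echo "kubectl rollout undo deployment/$app"
  fi
}

build_rollback_cmd home-portal
build_rollback_cmd home-portal 3
build_rollback_cmd home-portal "" v1.0.2
```

Steps 4 and 5 would then typically follow with kubectl rollout status deployment/home-portal, which blocks until the rollout finishes or times out.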


Task 2: health-check Intent (P1)

Priority: High - Essential for observability
Risk Level: Low (read-only)
Effort: ~30 min

Why Second: Quick win, low risk, immediately useful.

Features:

  • Check pod readiness
  • Check service endpoints
  • Check ingress connectivity
  • Check recent restarts/crashes
  • Optional: HTTP probe to app endpoint

Interface:

# Basic health check
./scripts/run-intent.sh health-check.yaml --params app=home-portal

# With HTTP probe
./scripts/run-intent.sh health-check.yaml --params app=home-portal probe_url=/api/health

Steps:

  1. Check pod status and readiness
  2. Check service endpoints
  3. Check ingress (if exists)
  4. Check for recent restarts
  5. (Optional) HTTP probe
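Steps 1 and 4 can be sketched as a filter over pod listings. The example below assumes input in the column layout of kubectl get pods --no-headers (name, ready count, status, restarts, age); the function name summarize_pods and the restart threshold of 3 are hypothetical choices, and the sample lines are made up so the sketch runs without a cluster.

```shell
#!/bin/sh
# Sketch only: reads pod lines on stdin and flags unready or restart-heavy pods.
# Expected columns: NAME READY STATUS RESTARTS AGE
summarize_pods() {
  max_restarts="${1:-3}"   # hypothetical threshold, not from the plan
  awk -v max="$max_restarts" '
    {
      split($2, ready, "/")
      if ($3 != "Running" || ready[1] != ready[2]) {
        printf "NOT_READY %s (%s, %s)\n", $1, $2, $3
        bad++
      } else if ($4 + 0 > max) {
        printf "RESTARTS %s (%d)\n", $1, $4
        bad++
      } else {
        printf "OK %s\n", $1
      }
    }
    END { exit (bad > 0) }   # non-zero exit when any pod is flagged
  '
}

# Made-up sample input standing in for real kubectl output.
printf '%s\n' \
  'home-portal-7d9f 1/1 Running 0 2d' \
  'home-portal-x2k4 0/1 CrashLoopBackOff 12 2d' \
  | summarize_pods 3 || echo "health check failed"
```

Against a live cluster the input would come from something like kubectl get pods -l app=home-portal --no-headers, assuming the pods carry an app label. Pods from completed Jobs would be flagged by this simple status test, so a real check may need to exclude them.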


Task 3: backup-app Intent (P1)

Priority: High - Critical for disaster recovery
Risk Level: Medium
Effort: ~1 hour

Why Third: Builds on existing Velero infrastructure.

Features:

  • Create Velero backup for specific app/namespace
  • Include PVCs and secrets
  • Backup verification (wait for completion, check status)
  • TTL configuration

Interface:

# Backup an app
./scripts/run-intent.sh backup-app.yaml --params app=home-portal

# With custom TTL
./scripts/run-intent.sh backup-app.yaml --params app=home-portal ttl=720h

Steps:

  1. Verify Velero is running
  2. Create backup with label selector
  3. Wait for backup completion
  4. Verify backup status
  5. Report backup size/details
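Steps 2 and 3 roughly correspond to a single velero backup create call with a label selector and --wait. The sketch below only prints that command. The helper name build_backup_cmd and the app=<name> label are assumptions; the <app>-<YYYYMMDD> backup name follows the pattern used in the restore examples in this plan.

```shell
#!/bin/sh
# Sketch only: prints the velero command instead of executing it.
# Assumption: the app's resources carry an app=<name> label.
build_backup_cmd() {
  app="$1"
  ttl="${2:-}"
  stamp="${3:-$(date +%Y%m%d)}"
  cmd="velero backup create $app-$stamp --selector app=$app --wait"
  if [ -n "$ttl" ]; then
    # Without --ttl, Velero applies its own default retention.
    cmd="$cmd --ttl $ttl"
  fi
  echo "$cmd"
}

build_backup_cmd home-portal
build_backup_cmd home-portal 720h 20251229
```

For steps 4 and 5, velero backup describe <name> --details reports the phase and the backed-up items. Resources that lack the assumed label, which may include PVCs, would not be captured by the selector alone.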


Task 4: restore-app Intent (P2)

Priority: Medium - Depends on backup-app
Risk Level: High (destructive)
Effort: ~1 hour

Why Fourth: Completes the backup/restore cycle.

Features:

  • Restore from specific backup
  • Restore to same or different namespace
  • Pre-restore validation
  • Post-restore verification

Interface:

# Restore from backup
./scripts/run-intent.sh restore-app.yaml --params app=home-portal backup=home-portal-20251229

# Restore to different namespace
./scripts/run-intent.sh restore-app.yaml --params backup=home-portal-20251229 target_namespace=home-portal-staging

Steps:

  1. Verify backup exists and is valid
  2. Confirm destructive operation
  3. Create Velero restore
  4. Wait for restore completion
  5. Verify restored resources
  6. Run health check
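Steps 2 through 4 might look like the following sketch, which asks the operator to retype the backup name and then prints a velero restore create command. The function names confirm_restore and build_restore_cmd are hypothetical, and so is the retype-to-confirm convention; the plan does not say how confirmation is collected.

```shell
#!/bin/sh
# Sketch only: prints the velero command; nothing is executed against a cluster.

# Step 2: require the operator to retype the backup name before proceeding.
confirm_restore() {
  backup="$1"
  printf 'Type "%s" to confirm the restore: ' "$backup" >&2
  read -r answer || answer=""
  [ "$answer" = "$backup" ]
}

# Steps 3-4: build the restore command, remapping the namespace if a target is given.
build_restore_cmd() {
  backup="$1"
  source_ns="${2:-}"
  target_ns="${3:-}"
  cmd="velero restore create --from-backup $backup --wait"
  if [ -n "$target_ns" ]; then
    if [ -z "$source_ns" ]; then
      echo "error: source namespace is required for a namespace mapping" >&2
      return 1
    fi
    cmd="$cmd --namespace-mappings $source_ns:$target_ns"
  fi
  echo "$cmd"
}

# Example: simulate the operator typing the backup name, then print the command.
if echo "home-portal-20251229" | confirm_restore "home-portal-20251229" 2>/dev/null; then
  build_restore_cmd "home-portal-20251229" "home-portal" "home-portal-staging"
fi
```

Step 1 could be covered beforehand with velero backup get <name>, which fails if the backup does not exist.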


Task 5: Bulk Operations (P2)

Priority: Medium - Quality of life improvement
Risk Level: Variable
Effort: ~1 hour

New Intents:

  • deploy-all: Deploy multiple apps in sequence
  • restart-all: Restart multiple deployments
  • health-check-all: Check health of all apps

Interface:

# Deploy all apps
./scripts/run-intent.sh deploy-all.yaml --params apps="home-portal,money-tracker,trip-planner"

# Restart all apps in namespace
./scripts/run-intent.sh restart-all.yaml --params namespace=default
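One way a bulk intent could process the comma-separated apps parameter is to split it and invoke the single-app intent once per entry, stopping at the first failure. The sketch below is an assumption about the approach, not the implemented behavior: run_for_each is a hypothetical helper, the deploy-app.yaml file name is inferred from the naming pattern of the other intents, and echo stands in for the real command so the example is a dry run.

```shell
#!/bin/sh
# Sketch only: run a command once per app from a comma-separated list.
# App names are assumed to contain no spaces or glob characters.
run_for_each() {
  apps="$1"
  shift
  for app in $(printf '%s' "$apps" | tr ',' ' '); do
    # Append app=<name> to the given command; stop at the first failure.
    "$@" "app=$app" || { echo "stopped at $app" >&2; return 1; }
  done
}

# Dry run: echo prints each command line instead of executing it.
run_for_each "home-portal,money-tracker,trip-planner" \
  echo ./scripts/run-intent.sh deploy-app.yaml --params
```

Stopping at the first failure suits sequential deploys; a bulk health check would more likely continue through every app and report all failures at the end.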


Task 6: cert-renew Intent (P3)

Priority: Low - cert-manager handles this automatically
Risk Level: Medium
Effort: ~30 min

Note: May skip if cert-manager is handling renewals automatically.

Features:

  • Force certificate renewal
  • Check certificate expiry
  • Verify new certificate
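If this intent is ever built, the three features could plausibly map onto cert-manager's Certificate resource and the cmctl CLI. The sketch below only prints candidate commands. The helper names, the certificate name home-portal-tls, and the default namespace are hypothetical, and the availability of cmctl on the operator's machine is an assumption.

```shell
#!/bin/sh
# Sketch only: prints candidate commands; nothing is executed against a cluster.

# Check expiry: read the notAfter timestamp from the Certificate status.
cert_expiry_cmd() {
  echo "kubectl get certificate $1 -n $2 -o jsonpath='{.status.notAfter}'"
}

# Force renewal: cmctl marks the Certificate for manual renewal.
cert_renew_cmd() {
  echo "cmctl renew $1 -n $2"
}

# Verify: wait until the Certificate reports the Ready condition again.
cert_verify_cmd() {
  echo "kubectl wait --for=condition=Ready certificate/$1 -n $2 --timeout=120s"
}

cert_expiry_cmd home-portal-tls default
cert_renew_cmd home-portal-tls default
cert_verify_cmd home-portal-tls default
```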


Acceptance Criteria

  • [x] rollback-app can revert to previous or specific version
  • [x] health-check provides comprehensive app health status
  • [x] backup-app creates verified Velero backups
  • [x] restore-app can restore from any valid backup
  • [x] All new intents have complete audit trails
  • [x] All intents pass dry-run testing (9/9 passed)
  • [x] Documentation updated

Completion Summary

Implemented Intents:

  1. rollback-app.yaml: Rollback to previous/specific revision or version
  2. health-check.yaml: Comprehensive app health status (7 steps)
  3. backup-app.yaml: Create Velero backups with TTL and PVC options
  4. restore-app.yaml: Restore from backups with namespace mapping
  5. health-check-all.yaml: Bulk health check for multiple apps

Not Implemented (Deferred):

  • deploy-all: Can be added later if needed
  • restart-all: Can be added later if needed
  • cert-renew: cert-manager handles renewals automatically

Test Results: All 9 intents pass dry-run testing


Implementation Notes

Common Patterns

All new intents should follow established patterns:

  • Use observe mode for read-only operations
  • Use mutate mode for changes
  • Include proper risk levels
  • Add prereq checks for required tools
  • Include verification steps
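A prereq check for required tools can be as small as the sketch below, which uses the POSIX command -v builtin to test whether each tool is on the PATH. The function name require_tools is hypothetical; how the intent files actually declare prereqs is not shown in this plan.

```shell
#!/bin/sh
# Sketch only: fail early when a required CLI tool is not on the PATH.
require_tools() {
  missing=""
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
  done
  if [ -n "$missing" ]; then
    echo "missing required tools:$missing" >&2
    return 1
  fi
}

# A backup intent would check for kubectl and velero; sh is used here so the
# example succeeds on any machine.
require_tools sh && echo "prereqs ok"
```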

Testing Strategy

  1. Dry-run each new intent
  2. Test with non-production app first
  3. Verify audit logs are complete
  4. Test rollback/restore in staging scenario

Schedule

Task          Estimated  Dependencies
rollback-app  1h         None
health-check  30m        None
backup-app    1h         Velero running
restore-app   1h         backup-app
bulk-ops      1h         Core intents
cert-renew    30m        cert-manager

Total Estimated: ~5 hours