Infrastructure Audit: Current State Analysis
Overview
This document captures the current infrastructure capabilities relevant to building the Vault Platform, identifying what exists and what needs to be built.
Audit Date: 2025-12-16
Current Infrastructure Summary
What We Have
| Capability | Status | Notes |
|---|---|---|
| K8s Cluster | Production | 3-node k3s, namespace isolation |
| Authentik SSO | Production | OAuth/OIDC, MFA, group-based RBAC |
| Supabase | Production | PostgreSQL, Storage, schema isolation |
| Sealed Secrets | Production | Encrypted K8s secrets |
| Background Workers | Pattern exists | Celery + Redis (subtitleai) |
| Monitoring | Production | Prometheus, Grafana, Loki |
| Deploy Scripts | Automated | Per-app deployment patterns |
What We Don't Have
| Capability | Status | Priority |
|---|---|---|
| End-to-End Encryption | Not built | CRITICAL |
| Field-Level Encryption | Not built | CRITICAL |
| Audit Logging | Not built | CRITICAL |
| Dead Man's Switch | Not built | CRITICAL |
| Notification Service | Not built | HIGH |
| Document Generation | Not built | HIGH |
| Multi-Party Access | Partial | HIGH |
| Triggered Actions | Not built | HIGH |
| Key Management | Basic | MEDIUM |
| Data Export | Not built | MEDIUM |
Detailed Assessment
Kubernetes Cluster
Strengths:
- Namespace isolation per app
- Longhorn distributed storage
- Private container registry
- Ingress with TLS termination

Gaps for Vault Platform:
- No network policies (pod-to-pod communication is not restricted)
- No mTLS between services
- No dedicated namespace for sensitive workloads

Recommendation: Create a vault-core namespace with stricter network policies.
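
As a rough illustration of what "stricter network policies" could mean for the vault-core namespace, the sketch below builds a default-deny NetworkPolicy manifest as a plain Python dictionary. The function name and the policy name `default-deny-all` are assumptions made for this example; only the `networking.k8s.io/v1` fields are standard Kubernetes.

```python
import json


def default_deny_policy(namespace: str) -> dict:
    """Build a NetworkPolicy that blocks all ingress and egress in `namespace`.

    Hypothetical helper for illustration; the policy name is an assumption.
    """
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        "spec": {
            # An empty podSelector matches every pod in the namespace.
            "podSelector": {},
            # Listing both types with no allow rules denies traffic in both directions.
            "policyTypes": ["Ingress", "Egress"],
        },
    }


def to_manifest(policy: dict) -> str:
    """Serialize the policy as JSON, which kubectl accepts alongside YAML."""
    return json.dumps(policy, indent=2)
```

Calling `to_manifest(default_deny_policy("vault-core"))` yields a manifest that could be applied with `kubectl apply -f`. A default-deny egress rule also blocks DNS lookups, so explicit allow policies (for DNS, the database, and the ingress controller) would need to be layered on top.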
Authentication (Authentik)
Strengths:
- MFA required on first login
- Group-based RBAC
- OAuth/OIDC standard compliance
- WebAuthn support (phishing-resistant)

Gaps for Vault Platform:
- No "executor" role type
- No delegation/power-of-attorney patterns
- No death verification flow
- No secondary contact verification

Recommendation: Extend Authentik groups for vault-specific roles; build custom flows for death verification.
Database (Supabase)
Strengths:
- Schema-based multi-app isolation
- Row-Level Security (RLS)
- Storage API for files
- Real-time subscriptions

Gaps for Vault Platform:
- No encryption at rest beyond standard PostgreSQL
- No audit table patterns
- No trigger-based change tracking
- RLS policies need extension for multi-party access

Recommendation:
- Create vault_core schema for shared infrastructure
- Implement append-only audit tables with triggers
- Extend RLS for conditional access patterns
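
To make the "append-only audit tables with triggers" idea concrete, here is a minimal sketch that rejects UPDATE and DELETE at the database level. It uses SQLite from the Python standard library purely so the example runs standalone; on Supabase the equivalent would be a PostgreSQL trigger function that raises an exception. The table layout and column names are hypothetical and are not the planned vault_core.audit_log schema.

```python
import sqlite3


def create_audit_log(conn: sqlite3.Connection) -> None:
    """Create a hypothetical audit table whose rows cannot be changed or removed."""
    conn.executescript(
        """
        CREATE TABLE audit_log (
            id         INTEGER PRIMARY KEY AUTOINCREMENT,
            actor      TEXT NOT NULL,
            action     TEXT NOT NULL,
            created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
        );

        -- Any UPDATE is aborted before it touches a row.
        CREATE TRIGGER audit_log_no_update BEFORE UPDATE ON audit_log
        BEGIN
            SELECT RAISE(ABORT, 'audit_log is append-only');
        END;

        -- Any DELETE is aborted as well.
        CREATE TRIGGER audit_log_no_delete BEFORE DELETE ON audit_log
        BEGIN
            SELECT RAISE(ABORT, 'audit_log is append-only');
        END;
        """
    )


def is_rejected(conn: sqlite3.Connection, statement: str) -> bool:
    """Return True if the database refuses to execute `statement`."""
    try:
        conn.execute(statement)
    except sqlite3.DatabaseError:
        return True
    return False
```

With this in place, inserts succeed while `is_rejected(conn, "UPDATE audit_log SET action = 'x' WHERE id = 1")` returns True. Triggers alone do not stop a superuser from dropping the trigger, so this would likely be paired with restricted database roles.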
Secret Management
Strengths:
- Sealed Secrets for K8s secrets
- Secrets not in git (gitignored)

Gaps for Vault Platform:
- No user key management
- No key rotation automation
- No escrow/recovery system
- No HSM or external KMS

Recommendation:
- Build key management service in application layer
- Consider external KMS for production (AWS KMS, HashiCorp Vault)
- Implement M-of-N recovery in application code
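
One common way to implement M-of-N recovery is Shamir's secret sharing, sketched below. This is an assumption about the eventual design, not a decision recorded in this audit, and the function names are hypothetical. The key is treated as an integer, hidden as the constant term of a random polynomial over a prime field, and any `threshold` shares reconstruct it by Lagrange interpolation.

```python
import secrets

# 2**521 - 1 is a Mersenne prime, large enough to hold a 256-bit key.
PRIME = 2**521 - 1


def split_secret(secret: int, threshold: int, shares: int) -> list[tuple[int, int]]:
    """Split `secret` into `shares` points; any `threshold` of them recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret must be in the range [0, PRIME)")
    if not 1 <= threshold <= shares:
        raise ValueError("threshold must satisfy 1 <= threshold <= shares")

    # Random polynomial of degree threshold - 1 whose constant term is the secret.
    coefficients = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def evaluate(x: int) -> int:
        # Horner's rule, reduced modulo PRIME at every step.
        result = 0
        for coefficient in reversed(coefficients):
            result = (result * x + coefficient) % PRIME
        return result

    return [(x, evaluate(x)) for x in range(1, shares + 1)]


def recover_secret(points: list[tuple[int, int]]) -> int:
    """Recover the secret by Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(points):
        numerator, denominator = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                numerator = (numerator * -xj) % PRIME
                denominator = (denominator * (xi - xj)) % PRIME
        # pow(d, -1, p) is the modular inverse (Python 3.8+).
        secret = (secret + yi * numerator * pow(denominator, -1, PRIME)) % PRIME
    return secret
```

For example, `split_secret(key, threshold=3, shares=5)` would give five trustees one share each, and any three could jointly restore the key. A production version should use a reviewed library rather than hand-rolled field arithmetic.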
Background Processing
Strengths:
- Celery + Redis pattern exists (subtitleai)
- Worker deployment patterns established

Gaps for Vault Platform:
- No scheduled job management
- No trigger evaluation loop
- No notification queue
- No retry/failure handling patterns

Recommendation:
- Extend Celery for trigger evaluation (runs every 5 minutes)
- Add dedicated notification worker
- Implement dead letter queue for failed jobs
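
The core of the trigger evaluation loop can be a pure function that a periodic Celery task calls every 5 minutes. The sketch below shows one possible shape for a dead man's switch check. The state names, the grace period, and the `DeadMansSwitch` fields are assumptions for illustration; none of them are defined elsewhere in this audit.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum


class SwitchState(Enum):
    OK = "ok"        # the user checked in within the interval
    GRACE = "grace"  # check-in overdue, reminders should go out
    FIRED = "fired"  # grace period expired, trigger actions should run


@dataclass(frozen=True)
class DeadMansSwitch:
    last_check_in: datetime
    check_in_interval: timedelta
    grace_period: timedelta


def evaluate_switch(switch: DeadMansSwitch, now: datetime) -> SwitchState:
    """Classify a switch based only on its last check-in and the current time."""
    deadline = switch.last_check_in + switch.check_in_interval
    if now <= deadline:
        return SwitchState.OK
    if now <= deadline + switch.grace_period:
        return SwitchState.GRACE
    return SwitchState.FIRED
```

Keeping the decision free of I/O makes it easy to unit test and safe to re-run: the worker would load each switch, call `evaluate_switch`, and enqueue notifications or actions only when the state changes.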
Monitoring & Observability
Strengths:
- Prometheus metrics collection
- Grafana dashboards
- Loki log aggregation
- AlertManager for alerts

Gaps for Vault Platform:
- No audit-specific dashboards
- No trigger execution monitoring
- No security event alerting
- No compliance reporting

Recommendation:
- Create vault-specific Grafana dashboards
- Add alerts for security-relevant events
- Implement audit trail integrity monitoring
Existing Patterns to Leverage
From Home Portal
- Authentik OAuth integration
- Group-based RBAC
- Supabase RLS patterns
- JWT bridge for auth
Reusable: Auth middleware, session management, RLS helpers
From Money Tracker
- Financial data handling
- User data isolation
- Export patterns (if any)
Reusable: Data export approach, privacy patterns
From Subtitleai
- Background worker architecture
- Job queue patterns
- Long-running task handling
Reusable: Celery worker setup, polling patterns
From RMS
- AI integration patterns
- Complex form handling
- Multi-step workflows
Reusable: Form patterns, workflow state management
New Infrastructure Required
Tier 1: Foundation (Must Build)
| Component | Description | Effort |
|---|---|---|
| Encryption Service | Client/server encryption, all three classes | 1 week |
| Key Management | User keys, escrow, recovery | 1 week |
| Audit Logger | Append-only, hash-chained, signed | 1 week |
| Trigger Engine | State machine, evaluation loop | 1-2 weeks |
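
The "append-only, hash-chained, signed" requirement for the Audit Logger can be sketched as follows. Each entry stores the hash of the previous entry, so editing or removing any record breaks every later link. The sketch uses an HMAC as a stand-in for the signature, which is an assumption: a real deployment might prefer an asymmetric signature so that verifiers do not need the signing key. All names here are hypothetical.

```python
import hashlib
import hmac
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def _entry_hash(prev_hash: str, event: dict) -> str:
    """Hash the previous link together with a canonical encoding of the event."""
    payload = json.dumps(
        {"prev": prev_hash, "event": event}, sort_keys=True, separators=(",", ":")
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def _sign(entry_hash: str, signing_key: bytes) -> str:
    return hmac.new(signing_key, entry_hash.encode("utf-8"), hashlib.sha256).hexdigest()


def append_entry(log: list, event: dict, signing_key: bytes) -> dict:
    """Append an event, linking it to the hash of the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    entry_hash = _entry_hash(prev_hash, event)
    entry = {
        "prev_hash": prev_hash,
        "event": event,
        "hash": entry_hash,
        "signature": _sign(entry_hash, signing_key),
    }
    log.append(entry)
    return entry


def verify_chain(log: list, signing_key: bytes) -> bool:
    """Recompute every link and signature; any mismatch means tampering."""
    prev_hash = GENESIS_HASH
    for entry in log:
        expected_hash = _entry_hash(prev_hash, entry["event"])
        if entry["prev_hash"] != prev_hash or entry["hash"] != expected_hash:
            return False
        if not hmac.compare_digest(entry["signature"], _sign(expected_hash, signing_key)):
            return False
        prev_hash = entry["hash"]
    return True
```

A scheduled job that calls `verify_chain` over recent entries would be one way to back the audit log integrity checks listed under production monitoring.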
Tier 2: Services (Should Build)
| Component | Description | Effort |
|---|---|---|
| Notification Service | Email, SMS, push, templates | 1 week |
| Document Generation | PDF generation, templates | 3 days |
| Export Service | Data export in open format | 3 days |
Tier 3: Enhancement (Nice to Have)
| Component | Description | Effort |
|---|---|---|
| External KMS Integration | AWS KMS or HashiCorp Vault | 1 week |
| Blockchain Anchoring | Periodic hash anchoring | 3 days |
| Multi-region | Disaster recovery | 2+ weeks |
Database Schema Requirements
New Schema: vault_core
```sql
-- Core vault infrastructure shared by all products
-- Documents with encryption metadata
CREATE TABLE vault_core.documents (...);
-- User encryption keys
CREATE TABLE vault_core.user_keys (...);
-- Append-only audit log
CREATE TABLE vault_core.audit_log (...);
-- Triggers (dead man's switch, scheduled, etc.)
CREATE TABLE vault_core.triggers (...);
-- Check-ins for dead man's switch
CREATE TABLE vault_core.check_ins (...);
-- Access grants (multi-party sharing)
CREATE TABLE vault_core.access_grants (...);
-- Notifications sent
CREATE TABLE vault_core.notifications (...);
```
Product-Specific Schemas
Each product gets its own schema for product-specific data:
- witness_vault.*
- final_file.*
- last_mile.*
- parent_trap.*
- exit_map.*
- ghost_protocol.*
External Services Required
Tier 1: Essential
| Service | Purpose | Options |
|---|---|---|
| Email | Notifications, alerts | Resend, SendGrid, SES |
| SMS | Check-in reminders, alerts | Twilio, SNS |
Tier 2: Recommended
| Service | Purpose | Options |
|---|---|---|
| Push Notifications | Mobile alerts | Firebase, OneSignal |
| Error Tracking | Production debugging | Sentry |
Tier 3: Future
| Service | Purpose | Options |
|---|---|---|
| External KMS | Key management | AWS KMS, HashiCorp Vault |
| E-Signature | Legal documents | DocuSign, HelloSign |
| Video Processing | Legacy recordings | Cloudflare Stream, Mux |
Security Hardening Required
Before Production
- [ ] Network policies for vault namespace
- [ ] Pod Security Standards enforced via Pod Security Admission (PodSecurityPolicy was removed in Kubernetes 1.25)
- [ ] Secret rotation procedures
- [ ] Penetration testing
- [ ] Security audit of encryption implementation
- [ ] Incident response runbook
Production Monitoring
- [ ] Failed authentication alerting
- [ ] Trigger execution monitoring
- [ ] Audit log integrity checks
- [ ] Anomaly detection for access patterns
Estimated Infrastructure Investment
Phase 0.5 (Thin Slice)
- Minimal new infrastructure
- Use existing Supabase, add encryption library
- Basic worker for trigger evaluation
- Email via Resend
Effort: 1-2 developer-weeks
Phase 1 (Foundation)
- Full encryption service
- Key management service
- Audit logger with integrity
- Trigger engine
- Notification service
Effort: 4-5 developer-weeks
Production Hardening
- Security audit
- Network policies
- Monitoring dashboards
- Runbooks
Effort: 2-3 developer-weeks
Recommendations
Immediate Actions
- Set up vault-core namespace in K8s with network policies
- Create Resend account for email notifications
- Create Twilio account for SMS
- Select encryption library (recommendation: a NaCl-family library such as libsodium or TweetNaCl; these are separate implementations, so choose one)
Short-Term
- Build encryption service as first foundation component
- Extend Supabase schema with vault_core tables
- Set up trigger worker using existing Celery patterns
Medium-Term
- Evaluate external KMS for production key management
- Implement blockchain anchoring for audit trail
- Consider multi-region for disaster recovery
This audit identifies the gap between current infrastructure and Vault Platform requirements. Most gaps are addressable with application-layer development on existing infrastructure.