
Supabase Sandbox Deployment

Purpose: Deploy a complete Supabase sandbox environment for staging, backup testing, and safe experimentation.

Last Updated: 2025-12-03
Validated: Full deployment tested end-to-end


Overview

The Supabase sandbox is a complete replica of production running in the same Kubernetes cluster but in an isolated namespace. It enables:

  • Staging environment: Test risky schema changes before production
  • Backup validation: Verify backups can be restored successfully
  • Safe experimentation: Break things without affecting production
  • Disaster recovery practice: Rehearse complete recovery procedures

Quick Reference

| Task | Command | Time |
|------|---------|------|
| Deploy sandbox | /root/scripts/deploy-sandbox-complete.sh | 2-3 min |
| Refresh with latest prod data | /root/scripts/deploy-sandbox-complete.sh --destroy-first | 3-4 min |
| Destroy sandbox | kubectl delete namespace supabase-sandbox | 1 min |
| Access Kong API | kubectl port-forward -n supabase-sandbox svc/kong 8000:8000 | - |
| Access Studio | kubectl port-forward -n supabase-sandbox svc/studio 3000:3000 | - |
| Check status | kubectl get pods -n supabase-sandbox | - |

Jump to:

  • Initial Deployment - First time setup
  • Sandbox Refresh - Sync with production data
  • Using the Sandbox - Connect apps and test changes
  • Troubleshooting - Fix common problems


Initial Deployment

One-command deployment from production backup:

/root/scripts/deploy-sandbox-complete.sh

This script:

  1. Deploys all Supabase components to supabase-sandbox namespace
  2. Waits for PostgreSQL to be ready
  3. Restores latest production backup
  4. Fixes migration tracking tables
  5. Restarts services
  6. Validates deployment

Expected time: 2-3 minutes (optimized restore using kubectl cp + psql inside pod)
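
For reference, the "kubectl cp + psql inside pod" optimization mentioned above boils down to the following pattern. This is a sketch only - the file names and paths are illustrative, not the script's exact internals:

# 1. Copy the decrypted SQL dump into the pod's filesystem
kubectl cp /tmp/restore.sql supabase-sandbox/postgres-0:/tmp/restore.sql

# 2. Execute it with psql inside the pod (avoids streaming 50M+ of SQL through the API server)
kubectl exec -n supabase-sandbox postgres-0 -- \
    psql -U postgres -d postgres -f /tmp/restore.sql

# 3. Clean up the copy
kubectl exec -n supabase-sandbox postgres-0 -- rm /tmp/restore.sql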


Architecture

Components Deployed

| Component | Purpose | Resources |
|-----------|---------|-----------|
| PostgreSQL | Database (StatefulSet) | 100m CPU, 256Mi RAM, 10Gi storage |
| GoTrue | Authentication | 50m CPU, 64Mi RAM |
| PostgREST | REST API | 50m CPU, 64Mi RAM |
| Storage API | File storage | 50m CPU, 64Mi RAM |
| postgres-meta | DB introspection | 50m CPU, 128Mi RAM |
| Kong | API Gateway | 100m CPU, 256Mi RAM ⚠️ |
| Studio | Dashboard | 50m CPU, 128Mi RAM |

Total: ~500m CPU, ~1.5Gi RAM

⚠️ Critical: Kong requires minimum 512Mi memory limit or it will be OOMKilled.

Network Configuration

Services: All ClusterIP (internal-only)

Access via port-forward:

# Kong API
kubectl port-forward -n supabase-sandbox svc/kong 8000:8000

# Studio Dashboard
kubectl port-forward -n supabase-sandbox svc/studio 3000:3000

Why ClusterIP?

  • MetalLB pool typically exhausted (production uses 10.89.97.210-220)
  • Sandbox doesn't need external access
  • Port-forwarding works fine for testing


Manual Deployment (Step-by-Step)

Prerequisites

Check cluster capacity:

kubectl top nodes
# Need: ~500m CPU, ~1.5Gi RAM available
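
kubectl top nodes reports current usage; to see how much of the cluster is already committed via requests and limits (which is what actually blocks scheduling), this also helps:

# Per-node view of requested vs. allocatable resources
kubectl describe nodes | grep -A 8 "Allocated resources"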

Check if sandbox exists:

kubectl get namespace supabase-sandbox
# If exists: kubectl delete namespace supabase-sandbox

Step 1: Generate JWT Secrets

/root/scripts/generate-sandbox-secrets.sh

Creates:

  • /root/tower-fleet/manifests/supabase-sandbox/secrets.yaml
  • /root/.supabase-sandbox-secrets.txt (plaintext reference)
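
The anon and service-role keys needed later (e.g. for .env.local) can be pulled from the plaintext reference file. The variable names matched below are an assumption - inspect the file directly if the grep comes back empty:

grep -iE 'anon|service' /root/.supabase-sandbox-secrets.txt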

Step 2: Deploy Sandbox

/root/scripts/deploy-supabase-sandbox.sh --deploy

This deploys all manifests in correct order:

  1. Namespace, ConfigMap, Secrets
  2. PostgreSQL init scripts (creates schemas and roles automatically)
  3. PostgreSQL StatefulSet
  4. Dependent services (GoTrue, PostgREST, Storage, Kong, Studio)

Wait for deployment:

kubectl get pods -n supabase-sandbox -w
# All pods should reach Running (except Storage/Kong may restart once)

Step 3: Restore Production Data

/root/scripts/sync-production-to-sandbox.sh --latest

This script:

  • Decrypts latest production backup
  • Restores to sandbox PostgreSQL
  • Verifies schemas and table counts

⚠️ Known Issue: Storage migrations will fail after restore.

Step 4: Fix Migration Tracking

Problem: Production backup includes migrated data but not all migration tracking records.

Solution:

# Copy auth migrations
kubectl exec -n supabase postgres-0 -- \
  psql -U postgres -d postgres -c "COPY (SELECT * FROM auth.schema_migrations) TO STDOUT;" | \
kubectl exec -i -n supabase-sandbox postgres-0 -- \
  psql -U postgres -d postgres -c "COPY auth.schema_migrations FROM STDIN;"

# Copy storage migrations
kubectl exec -n supabase postgres-0 -- \
  psql -U postgres -d postgres -c "COPY (SELECT * FROM storage.migrations WHERE id >= 2) TO STDOUT;" | \
kubectl exec -i -n supabase-sandbox postgres-0 -- \
  psql -U postgres -d postgres -c "COPY storage.migrations FROM STDIN;"
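
To confirm the copy landed, a quick spot-check of the tracking tables on both sides (not part of the scripts, just a sanity check):

# Row counts of the migration tracking tables in production vs. sandbox
for ns in supabase supabase-sandbox; do
  echo "=== $ns ==="
  kubectl exec -n $ns postgres-0 -- psql -U postgres -d postgres -tAc \
    "SELECT 'auth.schema_migrations: ' || count(*) FROM auth.schema_migrations;"
  kubectl exec -n $ns postgres-0 -- psql -U postgres -d postgres -tAc \
    "SELECT 'storage.migrations: ' || count(*) FROM storage.migrations;"
done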

Step 5: Restart Services

kubectl delete pod -n supabase-sandbox -l app=storage
kubectl delete pod -n supabase-sandbox -l app=kong

Wait for pods to become Running (20-30 seconds).
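
Instead of polling, kubectl wait can block until the replacement pods are Ready:

kubectl wait --for=condition=Ready pod -l app=storage -n supabase-sandbox --timeout=120s
kubectl wait --for=condition=Ready pod -l app=kong -n supabase-sandbox --timeout=120s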

Step 6: Validate Deployment

# Check all pods running
kubectl get pods -n supabase-sandbox
# All should be Running (7 pods total)

# Test Kong API
kubectl port-forward -n supabase-sandbox svc/kong 8000:8000 &
curl http://localhost:8000/
# Should return Kong response

# Access Studio
kubectl port-forward -n supabase-sandbox svc/studio 3000:3000
# Open http://localhost:3000 in browser
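
A 404 from Kong at / still proves the gateway is up; a slightly deeper check is to hit the auth service through Kong. The path below assumes the stock Supabase route layout (/auth/v1/ proxied to GoTrue):

# Health check routed through Kong to GoTrue
curl http://localhost:8000/auth/v1/health
# Expect a small JSON payload with the GoTrue name/version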

Common Issues & Solutions

Issue 1: Storage Pod CrashLoopBackOff

Symptom:

Error: relation "storage.objects" does not exist

Cause: Storage migrations trying to run before tables exist.

Solution: Ensure Step 3 (restore production data) completed before storage starts.
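
A quick way to confirm the storage tables exist before restarting the pod:

# If this lists objects, buckets, and migrations, the restore created the storage schema
kubectl exec -n supabase-sandbox postgres-0 -- \
    psql -U postgres -d postgres -c "\dt storage.*"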


Issue 2: Storage Migration "Column Already Exists"

Symptom:

Error: column "path_tokens" of relation "objects" already exists

Cause: Production backup restored data but not migration tracking.

Solution: Run Step 4 (fix migration tracking).


Issue 3: Kong Pod OOMKilled

Symptom:

kubectl get pods -n supabase-sandbox
# kong pod shows CrashLoopBackOff

kubectl describe pod -n supabase-sandbox -l app=kong | grep OOMKilled
# Shows: reason: OOMKilled

Cause: Kong memory limit too low (256Mi insufficient).

Solution: Kong manifest should have:

resources:
  limits:
    memory: 512Mi  # Not 256Mi!
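
If the sandbox is already running with the lower limit, the limit can be raised in place and the pod allowed to restart. This assumes Kong runs as a Deployment named kong - adjust if yours differs:

# Raise the memory limit on the live Deployment (workload name assumed)
kubectl set resources deployment kong -n supabase-sandbox --limits=memory=512Mi
kubectl rollout status deployment kong -n supabase-sandbox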


Issue 4: LoadBalancer IPs Pending

Symptom:

kubectl get svc -n supabase-sandbox kong studio
# EXTERNAL-IP shows <pending>

Cause: MetalLB IP pool exhausted.

Check pool status:

kubectl get ipaddresspool -n metallb-system -o yaml | grep -A 5 "addresses:"
kubectl get svc -A | grep LoadBalancer | wc -l

Solution: Change to ClusterIP (already done in manifests):

spec:
  type: ClusterIP  # Not LoadBalancer


Using the Sandbox

Connect Applications

Update .env.local to point to sandbox:

# Option 1: Via port-forward
kubectl port-forward -n supabase-sandbox svc/kong 8000:8000
NEXT_PUBLIC_SUPABASE_URL=http://localhost:8000
NEXT_PUBLIC_SUPABASE_ANON_KEY=<see /root/.supabase-sandbox-secrets.txt>

# Option 2: Via cluster DNS (from within cluster)
NEXT_PUBLIC_SUPABASE_URL=http://kong.supabase-sandbox.svc.cluster.local:8000

Test Schema Changes

# 1. Port-forward to sandbox
kubectl port-forward -n supabase-sandbox svc/kong 8000:8000

# 2. Create migration
cd /root/projects/your-app
npx supabase migration new test_change

# 3. Apply to sandbox via Studio
kubectl port-forward -n supabase-sandbox svc/studio 3000:3000
# Open http://localhost:3000 → SQL Editor → Run migration

# 4. Test in app
npm run dev
# App connects to localhost:8000 (sandbox)

# 5. If works, apply to production
/root/scripts/migrate-app.sh your-app

Sandbox Refresh (Sync with Production)

Use case: Sandbox is running but you want fresh production data (e.g., after making schema changes in production).

Option 1: Full Refresh (Destroy and Redeploy)

Destroys and recreates sandbox with latest production data:

/root/scripts/deploy-sandbox-complete.sh --destroy-first

When to use:

  • ✅ First refresh of the day/week
  • ✅ After major production schema changes
  • ✅ Want guaranteed clean slate
  • ✅ Sandbox has issues (CrashLoopBackOff, etc.)

Time: 60-120 minutes (full deployment + restore)

Option 2: In-Place Refresh (Faster)

Keeps sandbox running, only replaces database:

# WARNING: This uses pipe approach and may hang on large backups
# Consider using Option 1 instead for reliability

/root/scripts/sync-production-to-sandbox.sh --latest

When to use:

  • ⚠️ Small backups only (< 10MB compressed)
  • ⚠️ Quick data refresh without infrastructure changes
  • ⚠️ Known to work in your environment

Time: 30-90 minutes (restore only)

⚠️ Known Issue: This script uses the pipe approach (gpg | gunzip | kubectl exec psql) which can hang on large backups. If it hangs, use Option 1 instead.

Option 3: Manual Refresh (For Testing)

Step-by-step manual process:

# 1. Find latest backup
ls -lth /vault/backups/databases/supabase-backup-*.sql.gz.gpg | head -5

# 2. Decrypt to temp file (avoids pipe hang)
BACKUP="/vault/backups/databases/supabase-backup-20251202-221700.sql.gz.gpg"
gpg --batch --yes --passphrase-file /root/.database-backup-passphrase \
    --decrypt "$BACKUP" 2>/dev/null | gunzip > /tmp/manual-restore.sql

# 3. Restore to sandbox
kubectl exec -i -n supabase-sandbox postgres-0 -- \
    psql -U postgres -d postgres < /tmp/manual-restore.sql

# 4. Verify schemas
kubectl exec -n supabase-sandbox postgres-0 -- \
    psql -U postgres -d postgres -c "\dn"

# 5. Fix migration tracking (if needed)
kubectl exec -n supabase postgres-0 -- \
    psql -U postgres -d postgres -c "COPY (SELECT * FROM storage.migrations WHERE id >= 2) TO STDOUT;" | \
kubectl exec -i -n supabase-sandbox postgres-0 -- \
    psql -U postgres -d postgres -c "COPY storage.migrations FROM STDIN;"

# 6. Restart services
kubectl delete pod -n supabase-sandbox -l app=storage
kubectl delete pod -n supabase-sandbox -l app=kong

# 7. Cleanup
rm /tmp/manual-restore.sql

Refresh Frequency Recommendations

Daily development:

  • No refresh needed - sandbox persists changes
  • Use sandbox for testing migrations before production

Weekly testing:

  • Refresh Monday morning to sync with production
  • Ensures sandbox has latest production data

After production changes:

  • Refresh immediately after schema migrations
  • Validates migration works on production-like data

Ad-hoc:

  • Before testing RBAC Phase 3 migration
  • Before major feature development
  • When sandbox data is stale (> 1 week)
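
If the Monday refresh should happen without anyone remembering it, a cron entry along these lines works (the schedule and log path are suggestions, not something the scripts configure for you):

# root's crontab: full sandbox refresh every Monday at 06:00
0 6 * * 1 /root/scripts/deploy-sandbox-complete.sh --destroy-first >> /var/log/sandbox-refresh.log 2>&1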


Performance Notes

Backup Restore Time

Observed: 60-90 minutes for 52M SQL dump (8.2M compressed)

Why so slow?

  1. kubectl exec overhead: The 52M SQL file is streamed through the Kubernetes API to the pod
  2. Sequential processing: PostgreSQL processes each DDL statement (CREATE TABLE, CREATE FUNCTION, etc.) one at a time
  3. Complex schema: Production backup includes 100+ tables, functions, triggers, RLS policies across multiple schemas

Breakdown:

  • Steps 1-3: ~2 minutes (fast - deployment and PostgreSQL ready)
  • Step 4 (restore): 60-90 minutes (slow - kubectl exec bottleneck)
  • Steps 5-6: ~5 minutes (restarts and validation)

Why not faster?

  • Considered: Direct PostgreSQL connection (requires exposing port, security risk)
  • Considered: Copy file into pod first (adds complexity, still slow to execute SQL)
  • Current approach: Simple, secure, works reliably - just requires patience

Recommendation:

  • Run deploy-sandbox-complete.sh in background or tmux session
  • Let it run overnight if needed
  • The process is reliable - it won't hang (temp file approach prevents pipe issues)
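
While a long restore is running, rough progress can be watched from another terminal by counting how many user tables have appeared so far (an indicator only, not a percentage):

# Re-checks every 30s; the count climbs as the restore works through the dump
watch -n 30 "kubectl exec -n supabase-sandbox postgres-0 -- psql -U postgres -d postgres -tAc 'SELECT count(*) FROM pg_stat_user_tables;'"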


Maintenance

Destroy Sandbox

/root/scripts/deploy-supabase-sandbox.sh --destroy

Deletes namespace, pods, PVCs, and all data.

Redeploy Sandbox

# Clean slate
/root/scripts/deploy-supabase-sandbox.sh --destroy
/root/scripts/deploy-sandbox-complete.sh

# Or update existing
kubectl apply -f /root/tower-fleet/manifests/supabase-sandbox/

Check Sandbox Status

/root/scripts/deploy-supabase-sandbox.sh --status

Shows:

  • Namespace status
  • Pod health
  • Service endpoints
  • PVC usage


Resource Requirements (Validated)

Minimum (tested and working):

  • CPU: 500m total
  • Memory: 1.5Gi total
  • Storage: 10Gi PVC

Per-service breakdown:

| Service | CPU Request | Memory Limit | Notes |
|---------|-------------|--------------|-------|
| PostgreSQL | 100m | 1Gi | Can go lower but not recommended |
| Kong | 100m | 512Mi | ⚠️ Less = OOMKilled |
| Studio | 50m | 256Mi | |
| GoTrue | 50m | 128Mi | |
| PostgREST | 50m | 256Mi | |
| Storage | 50m | 256Mi | |
| postgres-meta | 50m | 256Mi | |

What happens if you go too low:

  • Kong < 512Mi → OOMKilled, CrashLoopBackOff
  • PostgreSQL < 256Mi → Slow queries, potential crashes
  • Others < 64Mi → May work but unstable
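
To compare the limits above against what the pods actually consume (requires metrics-server, the same dependency as kubectl top nodes):

kubectl top pods -n supabase-sandbox --containers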


Comparison: Sandbox vs Production

| Aspect | Production | Sandbox |
|--------|------------|---------|
| Namespace | supabase | supabase-sandbox |
| Services | LoadBalancer (external IPs) | ClusterIP (internal only) |
| Storage | 20Gi PVC | 10Gi PVC |
| Resources | Full (750m CPU, 2Gi RAM) | Reduced (500m CPU, 1.5Gi RAM) |
| Data | Live production | Restored from backup |
| Access | http://10.89.97.214:8000 | Port-forward only |

Troubleshooting Checklist

If deployment fails:

  1. ✅ Check cluster capacity: kubectl top nodes
  2. ✅ Check namespace clean: kubectl get namespace supabase-sandbox
  3. ✅ Check secrets generated: ls /root/.supabase-sandbox-secrets.txt
  4. ✅ Check init scripts applied: kubectl get configmap -n supabase-sandbox postgres-init-scripts
  5. ✅ Check PostgreSQL running: kubectl get pods -n supabase-sandbox -l app=postgres
  6. ✅ Check schemas created: kubectl exec -n supabase-sandbox postgres-0 -- psql -U postgres -c "\dn"

If services crash:

  1. ✅ Check logs: kubectl logs -n supabase-sandbox -l app=<service>
  2. ✅ Check resource limits: kubectl describe pod -n supabase-sandbox -l app=<service>
  3. ✅ Check for OOMKilled: kubectl get events -n supabase-sandbox --sort-by='.lastTimestamp'

If migrations fail:

  1. ✅ Check migration tables exist: kubectl exec -n supabase-sandbox postgres-0 -- psql -U postgres -c "SELECT * FROM storage.migrations;"
  2. ✅ Copy from production if needed (Step 4)
  3. ✅ Restart services: kubectl delete pod -n supabase-sandbox -l app=storage


Success Criteria

Sandbox deployment is successful when:

  • ✅ All 7 pods in Running state
  • ✅ Kong API responds: curl http://localhost:8000/ (via port-forward)
  • ✅ Studio accessible: http://localhost:3000 (via port-forward)
  • ✅ Production data restored with all schemas present
  • ✅ No CrashLoopBackOff or OOMKilled errors
  • ✅ Can connect app by changing .env.local
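
The checks above can be rolled into one small script for a quick pass/fail signal after each deployment. A sketch (it assumes nothing else is already listening on local port 8000):

#!/usr/bin/env bash
# Quick sandbox validation sketch - adjust the expectations if components change
set -euo pipefail

NS=supabase-sandbox

# 1. Any pods not Running?
NOT_RUNNING=$(kubectl get pods -n "$NS" --no-headers | awk '$3 != "Running"' | wc -l)
echo "Pods not Running: $NOT_RUNNING (expect 0)"

# 2. Kong answers through a temporary port-forward
kubectl port-forward -n "$NS" svc/kong 8000:8000 >/dev/null 2>&1 &
PF_PID=$!
sleep 3
curl -s -o /dev/null -w "Kong HTTP status: %{http_code}\n" http://localhost:8000/ || echo "Kong unreachable"
kill "$PF_PID" 2>/dev/null || true

# 3. Expected schemas present after restore (auth, storage, public, etc.)
kubectl exec -n "$NS" postgres-0 -- psql -U postgres -d postgres -c "\dn"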