Supabase Sandbox Deployment¶
Purpose: Deploy a complete Supabase sandbox environment for staging, backup testing, and safe experimentation.
Last Updated: 2025-12-03
Validated: Full deployment tested end-to-end
Overview¶
The Supabase sandbox is a complete replica of production running in the same Kubernetes cluster but in an isolated namespace. It enables:
- Staging environment: Test risky schema changes before production
- Backup validation: Verify backups can be restored successfully
- Safe experimentation: Break things without affecting production
- Disaster recovery practice: Rehearse complete recovery procedures
Quick Reference¶
| Task | Command | Time |
|---|---|---|
| Deploy sandbox | `/root/scripts/deploy-sandbox-complete.sh` | 2-3 min |
| Refresh with latest prod data | `/root/scripts/deploy-sandbox-complete.sh --destroy-first` | 3-4 min |
| Destroy sandbox | `kubectl delete namespace supabase-sandbox` | 1 min |
| Access Kong API | `kubectl port-forward -n supabase-sandbox svc/kong 8000:8000` | - |
| Access Studio | `kubectl port-forward -n supabase-sandbox svc/studio 3000:3000` | - |
| Check status | `kubectl get pods -n supabase-sandbox` | - |
Jump to:
- Initial Deployment - first-time setup
- Sandbox Refresh - sync with production data
- Using the Sandbox - connect apps and test changes
- Troubleshooting - fix common problems
Quick Start (Recommended)¶
One-command deployment from production backup:
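The command, as listed in the Quick Reference:

```shell
/root/scripts/deploy-sandbox-complete.sh
```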
This script:
1. Deploys all Supabase components to supabase-sandbox namespace
2. Waits for PostgreSQL to be ready
3. Restores latest production backup
4. Fixes migration tracking tables
5. Restarts services
6. Validates deployment
Expected time: 2-3 minutes (optimized restore using kubectl cp + psql inside pod)
Architecture¶
Components Deployed¶
| Component | Purpose | Resources |
|---|---|---|
| PostgreSQL | Database (StatefulSet) | 100m CPU, 256Mi RAM, 10Gi storage |
| GoTrue | Authentication | 50m CPU, 64Mi RAM |
| PostgREST | REST API | 50m CPU, 64Mi RAM |
| Storage API | File storage | 50m CPU, 64Mi RAM |
| postgres-meta | DB introspection | 50m CPU, 128Mi RAM |
| Kong | API Gateway | 100m CPU, 256Mi RAM ⚠️ |
| Studio | Dashboard | 50m CPU, 128Mi RAM |
Total: ~500m CPU, ~1.5Gi RAM
⚠️ Critical: Kong requires minimum 512Mi memory limit or it will be OOMKilled.
Network Configuration¶
Services: All ClusterIP (internal-only)
Access via port-forward:
# Kong API
kubectl port-forward -n supabase-sandbox svc/kong 8000:8000
# Studio Dashboard
kubectl port-forward -n supabase-sandbox svc/studio 3000:3000
Why ClusterIP?
- MetalLB pool typically exhausted (production uses 10.89.97.210-220)
- Sandbox doesn't need external access
- Port-forwarding works fine for testing
Manual Deployment (Step-by-Step)¶
Prerequisites¶
Check cluster capacity:
Check if sandbox exists:
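Both checks are stock kubectl (the same commands appear in the Troubleshooting Checklist below):

```shell
# Capacity: confirm headroom for ~500m CPU / ~1.5Gi RAM
kubectl top nodes

# Existing sandbox: an existing namespace means a redeploy, not a fresh deploy
kubectl get namespace supabase-sandbox
```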
Step 1: Generate JWT Secrets¶
Creates:
- /root/tower-fleet/manifests/supabase-sandbox/secrets.yaml
- /root/.supabase-sandbox-secrets.txt (plaintext reference)
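The generator script's path is not recorded in this section. As a hypothetical sketch only (not the actual script), the core step is producing a random signing secret; the anon and service API keys must then be JWTs signed with that secret:

```shell
# Hypothetical sketch only - the real script also mints signed anon/service JWTs
JWT_SECRET=$(openssl rand -hex 32)   # 256-bit secret, 64 hex chars
echo "JWT_SECRET=${JWT_SECRET}"
```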
Step 2: Deploy Sandbox¶
This deploys all manifests in correct order:
1. Namespace, ConfigMap, Secrets
2. PostgreSQL init scripts (creates schemas and roles automatically)
3. PostgreSQL StatefulSet
4. Dependent services (GoTrue, PostgREST, Storage, Kong, Studio)
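If you need to apply the manifests by hand instead of through the deployment script, the whole directory can be applied at once (the same command the Redeploy Sandbox section uses); note that `kubectl apply` over a directory processes files in filename order:

```shell
kubectl apply -f /root/tower-fleet/manifests/supabase-sandbox/
```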
Wait for deployment:
kubectl get pods -n supabase-sandbox -w
# All pods should reach Running (except Storage/Kong may restart once)
Step 3: Restore Production Data¶
This:
- Decrypts latest production backup
- Restores to sandbox PostgreSQL
- Verifies schemas and table counts
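The same restore can be done by hand with the commands from Option 3 (Manual Refresh) later in this document; a condensed sketch:

```shell
# Decrypt the newest backup to a temp file (avoids the pipe-hang issue)
BACKUP=$(ls -t /vault/backups/databases/supabase-backup-*.sql.gz.gpg | head -1)
gpg --batch --yes --passphrase-file /root/.database-backup-passphrase \
    --decrypt "$BACKUP" 2>/dev/null | gunzip > /tmp/sandbox-restore.sql

# Feed the dump to psql inside the sandbox pod, then clean up
kubectl exec -i -n supabase-sandbox postgres-0 -- \
    psql -U postgres -d postgres < /tmp/sandbox-restore.sql
rm /tmp/sandbox-restore.sql
```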
⚠️ Known Issue: Storage migrations will fail after restore.
Step 4: Fix Migration Tracking¶
Problem: Production backup includes migrated data but not all migration tracking records.
Solution:
# Copy auth migrations
kubectl exec -n supabase postgres-0 -- \
psql -U postgres -d postgres -c "COPY (SELECT * FROM auth.schema_migrations) TO STDOUT;" | \
kubectl exec -i -n supabase-sandbox postgres-0 -- \
psql -U postgres -d postgres -c "COPY auth.schema_migrations FROM STDIN;"
# Copy storage migrations
kubectl exec -n supabase postgres-0 -- \
psql -U postgres -d postgres -c "COPY (SELECT * FROM storage.migrations WHERE id >= 2) TO STDOUT;" | \
kubectl exec -i -n supabase-sandbox postgres-0 -- \
psql -U postgres -d postgres -c "COPY storage.migrations FROM STDIN;"
Step 5: Restart Services¶
kubectl delete pod -n supabase-sandbox -l app=storage
kubectl delete pod -n supabase-sandbox -l app=kong
Wait for pods to become Running (20-30 seconds).
Step 6: Validate Deployment¶
# Check all pods running
kubectl get pods -n supabase-sandbox
# All should be Running (7 pods total)
# Test Kong API
kubectl port-forward -n supabase-sandbox svc/kong 8000:8000 &
curl http://localhost:8000/
# Should return Kong response
# Access Studio
kubectl port-forward -n supabase-sandbox svc/studio 3000:3000
# Open http://localhost:3000 in browser
Common Issues & Solutions¶
Issue 1: Storage Pod CrashLoopBackOff¶
Symptom: The storage pod restarts repeatedly and sits in CrashLoopBackOff shortly after deployment.
Cause: Storage migrations trying to run before tables exist.
Solution: Ensure Step 3 (restore production data) completed before storage starts.
Issue 2: Storage Migration "Column Already Exists"¶
Symptom: The storage pod logs a migration failure such as "column already exists" and restarts.
Cause: Production backup restored data but not migration tracking.
Solution: Run Step 4 (fix migration tracking).
Issue 3: Kong Pod OOMKilled¶
Symptom:
kubectl get pods -n supabase-sandbox
# kong pod shows CrashLoopBackOff
kubectl describe pod -n supabase-sandbox -l app=kong | grep OOMKilled
# Shows: reason: OOMKilled
Cause: Kong memory limit too low (256Mi insufficient).
Solution: Set Kong's memory limit to at least 512Mi in the manifest.
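The relevant stanza (request values taken from the resource tables in this document; the 512Mi limit is the critical part):

```yaml
resources:
  requests:
    cpu: 100m
    memory: 256Mi
  limits:
    memory: 512Mi   # below this, Kong is OOMKilled
```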
Issue 4: LoadBalancer IPs Pending¶
Symptom: LoadBalancer services show EXTERNAL-IP stuck in pending.
Cause: MetalLB IP pool exhausted.
Check pool status:
kubectl get ipaddresspool -n metallb-system -o yaml | grep -A 5 "addresses:"
kubectl get svc -A | grep LoadBalancer | wc -l
Solution: Change the service type to ClusterIP (already done in the shipped manifests).
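In each sandbox Service manifest this is simply:

```yaml
spec:
  type: ClusterIP   # instead of LoadBalancer
```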
Using the Sandbox¶
Connect Applications¶
Update .env.local to point to sandbox:
# Option 1: Via port-forward
kubectl port-forward -n supabase-sandbox svc/kong 8000:8000
NEXT_PUBLIC_SUPABASE_URL=http://localhost:8000
NEXT_PUBLIC_SUPABASE_ANON_KEY=<see /root/.supabase-sandbox-secrets.txt>
# Option 2: Via cluster DNS (from within cluster)
NEXT_PUBLIC_SUPABASE_URL=http://kong.supabase-sandbox.svc.cluster.local:8000
Test Schema Changes¶
# 1. Port-forward to sandbox
kubectl port-forward -n supabase-sandbox svc/kong 8000:8000
# 2. Create migration
cd /root/projects/your-app
npx supabase migration new test_change
# 3. Apply to sandbox via Studio
kubectl port-forward -n supabase-sandbox svc/studio 3000:3000
# Open http://localhost:3000 → SQL Editor → Run migration
# 4. Test in app
npm run dev
# App connects to localhost:8000 (sandbox)
# 5. If works, apply to production
/root/scripts/migrate-app.sh your-app
Sandbox Refresh (Sync with Production)¶
Use case: Sandbox is running but you want fresh production data (e.g., after making schema changes in production).
Option 1: Complete Refresh (Recommended)¶
Destroys and recreates sandbox with latest production data:
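The command (the Quick Start script with the destroy flag, per the Quick Reference):

```shell
/root/scripts/deploy-sandbox-complete.sh --destroy-first
```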
When to use:
- ✅ First refresh of the day/week
- ✅ After major production schema changes
- ✅ Want guaranteed clean slate
- ✅ Sandbox has issues (CrashLoopBackOff, etc.)
Time: 3-4 minutes with the optimized restore (per the Quick Reference); up to 60-120 minutes (full deployment + restore) with the legacy streaming approach.
Option 2: In-Place Refresh (Faster)¶
Keeps sandbox running, only replaces database:
# WARNING: This uses pipe approach and may hang on large backups
# Consider using Option 1 instead for reliability
/root/scripts/sync-production-to-sandbox.sh --latest
When to use:
- ⚠️ Small backups only (< 10MB compressed)
- ⚠️ Quick data refresh without infrastructure changes
- ⚠️ Known to work in your environment
Time: 30-90 minutes (restore only)
⚠️ Known Issue: This script uses the pipe approach (gpg | gunzip | kubectl exec psql) which can hang on large backups. If it hangs, use Option 1 instead.
Option 3: Manual Refresh (For Testing)¶
Step-by-step manual process:
# 1. Find latest backup
ls -lth /vault/backups/databases/supabase-backup-*.sql.gz.gpg | head -5
# 2. Decrypt to temp file (avoids pipe hang)
BACKUP="/vault/backups/databases/supabase-backup-20251202-221700.sql.gz.gpg"
gpg --batch --yes --passphrase-file /root/.database-backup-passphrase \
--decrypt "$BACKUP" 2>/dev/null | gunzip > /tmp/manual-restore.sql
# 3. Restore to sandbox
kubectl exec -i -n supabase-sandbox postgres-0 -- \
psql -U postgres -d postgres < /tmp/manual-restore.sql
# 4. Verify schemas
kubectl exec -n supabase-sandbox postgres-0 -- \
psql -U postgres -d postgres -c "\dn"
# 5. Fix migration tracking (if needed)
kubectl exec -n supabase postgres-0 -- \
psql -U postgres -d postgres -c "COPY (SELECT * FROM storage.migrations WHERE id >= 2) TO STDOUT;" | \
kubectl exec -i -n supabase-sandbox postgres-0 -- \
psql -U postgres -d postgres -c "COPY storage.migrations FROM STDIN;"
# 6. Restart services
kubectl delete pod -n supabase-sandbox -l app=storage
kubectl delete pod -n supabase-sandbox -l app=kong
# 7. Cleanup
rm /tmp/manual-restore.sql
Refresh Frequency Recommendations¶
Daily development:
- No refresh needed - sandbox persists changes
- Use sandbox for testing migrations before production

Weekly testing:
- Refresh Monday morning to sync with production
- Ensures sandbox has latest production data

After production changes:
- Refresh immediately after schema migrations
- Validates migration works on production-like data

Ad-hoc:
- Before testing RBAC Phase 3 migration
- Before major feature development
- When sandbox data is stale (> 1 week)
Performance Notes¶
Backup Restore Time¶
Observed: 60-90 minutes for a 52M SQL dump (8.2M compressed) when streaming the restore through kubectl exec. The current script instead copies the dump into the pod (kubectl cp) and runs psql there, completing in 2-3 minutes (see Quick Start).
Why is streaming so slow?
1. kubectl exec overhead: the 52M SQL file is streamed through the Kubernetes API to the pod
2. Sequential processing: PostgreSQL processes each DDL statement (CREATE TABLE, CREATE FUNCTION, etc.) one at a time
3. Complex schema: the production backup includes 100+ tables, functions, triggers, and RLS policies across multiple schemas
Breakdown (streaming restore):
- Steps 1-3: ~2 minutes (fast - deployment and PostgreSQL ready)
- Step 4 (restore): 60-90 minutes (slow - kubectl exec bottleneck)
- Steps 5-6: ~5 minutes (restarts and validation)
Alternatives considered:
- Direct PostgreSQL connection: rejected - requires exposing a port, a security risk
- Copying the file into the pod first: adopted - executing the SQL inside the pod avoids the API streaming bottleneck
Recommendation:
- Run deploy-sandbox-complete.sh in a tmux session so long restores survive a dropped shell
- The temp-file approach prevents pipe hangs - the process is reliable
Maintenance¶
Destroy Sandbox¶
Deletes namespace, pods, PVCs, and all data.
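As in the Quick Reference:

```shell
kubectl delete namespace supabase-sandbox
```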
Redeploy Sandbox¶
# Clean slate
/root/scripts/deploy-supabase-sandbox.sh --destroy
/root/scripts/deploy-sandbox-complete.sh
# Or update existing
kubectl apply -f /root/tower-fleet/manifests/supabase-sandbox/
Check Sandbox Status¶
Shows:
- Namespace status
- Pod health
- Service endpoints
- PVC usage
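No dedicated status script path is recorded here; the same information is available from stock kubectl:

```shell
kubectl get namespace supabase-sandbox   # namespace status
kubectl get pods -n supabase-sandbox     # pod health
kubectl get svc -n supabase-sandbox      # service endpoints
kubectl get pvc -n supabase-sandbox      # PVC usage
```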
Resource Requirements (Validated)¶
Minimum (tested and working):
- CPU: 500m total
- Memory: 1.5Gi total
- Storage: 10Gi PVC
Per-service breakdown:
| Service | CPU Request | Memory Limit | Notes |
|---|---|---|---|
| PostgreSQL | 100m | 1Gi | Can go lower but not recommended |
| Kong | 100m | 512Mi | ⚠️ Less = OOMKilled |
| Studio | 50m | 256Mi | |
| GoTrue | 50m | 128Mi | |
| PostgREST | 50m | 256Mi | |
| Storage | 50m | 256Mi | |
| postgres-meta | 50m | 256Mi | |
What happens if you go too low:
- Kong < 512Mi → OOMKilled, CrashLoopBackOff
- PostgreSQL < 256Mi → Slow queries, potential crashes
- Others < 64Mi → May work but unstable
Comparison: Sandbox vs Production¶
| Aspect | Production | Sandbox |
|---|---|---|
| Namespace | `supabase` | `supabase-sandbox` |
| Services | LoadBalancer (external IPs) | ClusterIP (internal only) |
| Storage | 20Gi PVC | 10Gi PVC |
| Resources | Full (750m CPU, 2Gi RAM) | Reduced (500m CPU, 1.5Gi RAM) |
| Data | Live production | Restored from backup |
| Access | http://10.89.97.214:8000 | Port-forward only |
Troubleshooting Checklist¶
If deployment fails:
- ✅ Check cluster capacity: `kubectl top nodes`
- ✅ Check namespace clean: `kubectl get namespace supabase-sandbox`
- ✅ Check secrets generated: `ls /root/.supabase-sandbox-secrets.txt`
- ✅ Check init scripts applied: `kubectl get configmap -n supabase-sandbox postgres-init-scripts`
- ✅ Check PostgreSQL running: `kubectl get pods -n supabase-sandbox -l app=postgres`
- ✅ Check schemas created: `kubectl exec -n supabase-sandbox postgres-0 -- psql -U postgres -c "\dn"`
If services crash:
- ✅ Check logs: `kubectl logs -n supabase-sandbox -l app=<service>`
- ✅ Check resource limits: `kubectl describe pod -n supabase-sandbox -l app=<service>`
- ✅ Check for OOMKilled: `kubectl get events -n supabase-sandbox --sort-by='.lastTimestamp'`
If migrations fail:
- ✅ Check migration tables exist: `kubectl exec -n supabase-sandbox postgres-0 -- psql -U postgres -c "SELECT * FROM storage.migrations;"`
- ✅ Copy from production if needed (Step 4)
- ✅ Restart services: `kubectl delete pod -n supabase-sandbox -l app=storage`
References¶
- Supabase Deployment Lessons
- Disaster Recovery Guide
- Production Supabase Deployment
- MetalLB Management
Success Criteria¶
Sandbox deployment is successful when:
✅ All 7 pods in Running state
✅ Kong API responds: curl http://localhost:8000/ (via port-forward)
✅ Studio accessible: http://localhost:3000 (via port-forward)
✅ Production data restored with all schemas present
✅ No CrashLoopBackOff or OOMKilled errors
✅ Can connect app by changing .env.local