
Supabase Kubernetes Deployment: Lessons Learned

This document captures the challenges encountered and solutions implemented when deploying Supabase to Kubernetes.


Overview

Deploying Supabase to Kubernetes involved solving several complex issues:

1. PostgreSQL volume permissions
2. Environment variable substitution limitations
3. Health probe configuration
4. Database schema initialization

Final Result: Fully functional Supabase deployment with all services running.


Challenge 1: PostgreSQL Volume Permissions

The Problem

PostgreSQL container failed to start with permission errors:

FATAL: data directory "/var/lib/postgresql/data" has wrong ownership
HINT: The server must be started by the user that owns the data directory.

Root Cause Analysis

UID Mismatch:

- Longhorn volumes are created with root ownership (UID 0, GID 0)
- The supabase/postgres image runs as user postgres (UID 101, GID 102)
- PostgreSQL refuses to start if the data directory has the wrong ownership (a security feature)

Additional Issues:

- The lost+found directory created by the ext4 filesystem interfered with initialization
- The PGDATA environment variable pointed to the wrong location
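
One quick way to confirm the UID/GID mismatch, once a pod is scheduled, is to ask the image which IDs the postgres user maps to (the pod name postgres-0 is assumed here, matching the debugging commands later in this document):

# Show the UID/GID the postgres user maps to inside the container
kubectl exec -n supabase postgres-0 -- id postgres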

Solutions Attempted

❌ Option A: fsGroup only

securityContext:
  fsGroup: 102
Result: Only sets group ownership, not user ownership; PostgreSQL requires the data directory to be owned by the user it runs as.

❌ Option B: runAsUser in container

containers:
  - securityContext:
      runAsUser: 101
Result: The container image expects to start as root so its initialization scripts can run.

✅ Solution: initContainer + Correct Mount Point

spec:
  securityContext:
    fsGroup: 102  # Group ownership for volume
    fsGroupChangePolicy: "OnRootMismatch"

  initContainers:
    - name: fix-permissions
      image: busybox:latest
      command:
        - sh
        - -c
        - |
          # Remove lost+found (interferes with PostgreSQL init)
          rm -rf /var/lib/postgresql/lost+found
          # Set correct ownership for postgres user
          chown -R 101:102 /var/lib/postgresql
          chmod 755 /var/lib/postgresql
          # Create socket directory
          mkdir -p /run-postgresql
          chown 101:102 /run-postgresql
          chmod 775 /run-postgresql
      volumeMounts:
        - name: postgres-data
          mountPath: /var/lib/postgresql
        - name: postgres-run
          mountPath: /run-postgresql

  containers:
    - name: postgres
      volumeMounts:
        - name: postgres-data
          mountPath: /var/lib/postgresql  # Mount to parent dir
        - name: postgres-run
          mountPath: /var/run/postgresql

  volumes:
    - name: postgres-run
      emptyDir: {}  # Separate volume for sockets

Key Decisions:

- Mount to /var/lib/postgresql instead of /var/lib/postgresql/data
- Let PostgreSQL create the data/ subdirectory with correct permissions
- Remove the PGDATA env var to use the default location
- Use an initContainer for one-time setup, not ongoing overhead (a quick verification is shown below)
- Create a separate emptyDir for /var/run/postgresql (sockets)
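
A quick way to verify the initContainer's work once the pod starts (pod name assumed to be postgres-0):

# Ownership should be 101:102 on the volume and on the data directory PostgreSQL creates
kubectl exec -n supabase postgres-0 -c postgres -- ls -ldn /var/lib/postgresql /var/lib/postgresql/data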


Challenge 2: Environment Variable Substitution

The Problem

Services like GoTrue and PostgREST failed with database connection errors:

cannot parse `postgres://$%28POSTGRES_USER%29:xxxxx@$(POSTGRES_HOST):...`
ERROR: schema "auth" does not exist

Root Cause Analysis

Kubernetes Variable Substitution Limitation:

# This DOES NOT WORK:
env:
  - name: POSTGRES_HOST
    valueFrom:
      configMapKeyRef:  # Source from ConfigMap
        name: config
        key: POSTGRES_HOST

  - name: DATABASE_URL
    value: "postgres://user:pass@$(POSTGRES_HOST):5432/db"
    # ❌ $(POSTGRES_HOST) was NOT substituted; the literal string was passed through

Why: Kubernetes expands $(VAR_NAME) references using environment variables defined earlier in the same container spec (plus service environment variables), and a reference it cannot resolve is passed through unchanged rather than raising an error; that pass-through is documented behavior, not a bug. In this deployment the referenced variables came from a ConfigMap via valueFrom and the references were never resolved, so the literal $(POSTGRES_USER) and $(POSTGRES_HOST) strings reached the service (URL-encoded in the error above).

Relying on inline substitution proved fragile, so the fix below avoids it entirely.
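
For contrast, a minimal sketch of the case where substitution does work: the referenced variable is defined earlier in the same env list with an inline value.

# This DOES work: POSTGRES_HOST is defined inline, before the reference
env:
  - name: POSTGRES_HOST
    value: "postgres.supabase.svc.cluster.local"

  - name: DATABASE_URL
    value: "postgres://user:pass@$(POSTGRES_HOST):5432/db"
    # ✅ $(POSTGRES_HOST) is expanded by the kubelet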

Solutions Considered

❌ Option A: Complex shell scripting

command:
  - sh
  - -c
  - |
    export DATABASE_URL="postgres://$POSTGRES_USER:$POSTGRES_PASSWORD@$POSTGRES_HOST:5432/$POSTGRES_DB"
    exec /app/start
Issues: Fragile, harder to maintain, and it bypasses the container's entrypoint.

✅ Solution: Store Full Connection Strings in Secrets

Standard Kubernetes Pattern:

# secrets.yaml
apiVersion: v1
kind: Secret
metadata:
  name: supabase-secrets
stringData:
  # Store COMPLETE connection strings
  DATABASE_URL: "postgres://postgres:password@postgres.supabase.svc.cluster.local:5432/postgres"
  DB_URL: "postgres://postgres:password@postgres.supabase.svc.cluster.local:5432/postgres"

# gotrue.yaml
env:
  - name: DATABASE_URL
    valueFrom:
      secretKeyRef:
        name: supabase-secrets
        key: DATABASE_URL  # Direct reference, no substitution needed

Benefits:

- Standard Kubernetes pattern used in production
- No variable substitution complexity
- Secrets can be encrypted at rest (when etcd encryption is enabled)
- Easy to manage and rotate credentials (see the rotation sketch below)
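
Rotation then becomes a Secret update plus a rollout. A sketch, with the Deployment names gotrue and rest assumed from the service names used here:

# Regenerate the Secret with the new password, then restart the consumers
# (the password must also be changed in PostgreSQL itself)
kubectl create secret generic supabase-secrets -n supabase \
  --from-literal=DATABASE_URL="postgres://postgres:NEW_PASSWORD@postgres.supabase.svc.cluster.local:5432/postgres" \
  --from-literal=DB_URL="postgres://postgres:NEW_PASSWORD@postgres.supabase.svc.cluster.local:5432/postgres" \
  --dry-run=client -o yaml | kubectl apply -f -

kubectl rollout restart deployment/gotrue deployment/rest -n supabase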

Research Finding: This is the approach used by:

- Supabase community Helm charts
- Official Kubernetes documentation
- Production database deployments (CloudNativePG, Zalando Postgres Operator)


Challenge 3: Health Probe Configuration

The Problem

Kong:

Readiness probe failed: HTTP probe failed with statuscode: 404
Path /status not found in Kong configuration.

PostgREST:

Readiness probe failed: HTTP probe failed with statuscode: 400
Root path / returns 400 by design (PostgREST requires schema context).

Root Cause

Default HTTP health probes assumed standard REST API health endpoints. However:

- Kong: Declarative config mode doesn't expose /status on the proxy port (8000)
- PostgREST: Returns 400 on / when no schema is specified (expected behavior)
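
Both responses can be reproduced outside the probes (the Service names kong and rest are assumptions here):

# Kong proxy port
kubectl port-forward -n supabase svc/kong 8000:8000 &
curl -i http://localhost:8000/status   # 404 in this setup

# PostgREST
kubectl port-forward -n supabase svc/rest 3000:3000 &
curl -i http://localhost:3000/         # 400 in this setup

kill %1 %2   # stop the port-forwards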

Solution: TCP Probes

For services without HTTP health endpoints, use TCP probes:

# Kong
livenessProbe:
  tcpSocket:
    port: 8000
  initialDelaySeconds: 30
  periodSeconds: 10

# PostgREST
livenessProbe:
  tcpSocket:
    port: 3000
  initialDelaySeconds: 30
  periodSeconds: 10

Why TCP Probes:

- Verifies the service is listening on its port
- Doesn't require specific HTTP endpoints
- Appropriate for API gateways and database proxies
- Fewer false-positive failures
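
For PostgreSQL itself an exec probe is a better fit, since pg_isready ships with the image. A minimal sketch, assuming the default postgres superuser:

# PostgreSQL
readinessProbe:
  exec:
    command: ["pg_isready", "-U", "postgres"]
  initialDelaySeconds: 10
  periodSeconds: 10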


Challenge 4: Database Schema Initialization

The Problem

GoTrue failed during migration:

ERROR: schema "auth" does not exist (SQLSTATE 3F000)

Root Cause

The fresh PostgreSQL deployment didn't have the Supabase schemas created, and GoTrue assumes the auth schema exists before running its migrations.
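
Listing the schemas on the fresh instance confirms this:

# \dn lists schemas; auth, storage, and realtime were missing
kubectl exec -n supabase postgres-0 -- psql -U postgres -c "\dn"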

Solution

Manual schema creation before first GoTrue start:

kubectl exec -n supabase postgres-0 -- psql -U postgres -c "CREATE SCHEMA IF NOT EXISTS auth;"

Better Approach (for production):

Create an init script ConfigMap:

apiVersion: v1
kind: ConfigMap
metadata:
  name: postgres-init-scripts
  namespace: supabase
data:
  01-init-schemas.sql: |
    -- Create Supabase schemas
    CREATE SCHEMA IF NOT EXISTS auth;
    CREATE SCHEMA IF NOT EXISTS storage;
    CREATE SCHEMA IF NOT EXISTS realtime;

    -- Grant permissions
    GRANT USAGE ON SCHEMA auth TO postgres;
    GRANT ALL ON SCHEMA auth TO postgres;

Then mount as initdb script:

volumeMounts:
  - name: init-scripts
    mountPath: /docker-entrypoint-initdb.d
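
The mount needs a matching volumes entry referencing the ConfigMap; note that the image only runs /docker-entrypoint-initdb.d scripts when the data directory is empty, i.e. on first initialization. A minimal sketch:

volumes:
  - name: init-scripts
    configMap:
      name: postgres-init-scripts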

Final Architecture

Services Deployed

Service              Type          Port        LoadBalancer IP  Status
postgres             ClusterIP     5432        -                ✅ Running
gotrue (Auth)        ClusterIP     9999        -                ✅ Running
rest (PostgREST)     ClusterIP     3000        -                ✅ Running
kong (API Gateway)   LoadBalancer  8000, 8443  10.89.97.214     ✅ Running
studio (Dashboard)   LoadBalancer  3000        10.89.97.215     ✅ Running
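
The Service types and LoadBalancer IPs can be confirmed with:

kubectl get svc -n supabase -o wide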

Access Points

For Applications:

NEXT_PUBLIC_SUPABASE_URL=http://10.89.97.214:8000
NEXT_PUBLIC_SUPABASE_ANON_KEY=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...

For Admins:

- Supabase Studio: http://10.89.97.215:3000
- Direct PostgreSQL: postgres.supabase.svc.cluster.local:5432

Storage

PostgreSQL Data:

- PVC: 20GB Longhorn volume with 2 replicas
- Mount: /var/lib/postgresql
- Actual data: /var/lib/postgresql/data (created by PostgreSQL)


Key Learnings Summary

1. Volume Permissions

Problem: Kubernetes volumes are root-owned by default
Solution: Use an initContainer to fix permissions before the main container starts
Best Practice: Mount to the parent directory, let the app create subdirectories

2. Environment Variables

Problem: $(VAR_NAME) substitution failed for variables sourced from ConfigMap/Secret refs
Solution: Store complete connection strings in Secrets
Best Practice: Follow standard Kubernetes patterns, not custom scripting

3. Health Probes

Problem: Not all services have HTTP health endpoints
Solution: Use TCP probes when HTTP endpoints don't exist
Best Practice: Match probe type to service capabilities

4. Database Initialization

Problem: Services expect schemas to exist
Solution: Create schemas before deploying dependent services
Best Practice: Use initdb scripts or init Jobs for schema setup (a Job sketch follows below)
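
A minimal sketch of the init-Job alternative, assuming the supabase-secrets Secret from Challenge 2 and any image that ships psql (postgres:15 here is illustrative):

apiVersion: batch/v1
kind: Job
metadata:
  name: create-schemas
  namespace: supabase
spec:
  template:
    spec:
      restartPolicy: OnFailure
      containers:
        - name: psql
          image: postgres:15
          command:
            - sh
            - -c
            - 'psql "$DATABASE_URL" -c "CREATE SCHEMA IF NOT EXISTS auth; CREATE SCHEMA IF NOT EXISTS storage; CREATE SCHEMA IF NOT EXISTS realtime;"'
          env:
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: supabase-secrets
                  key: DATABASE_URL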


Debugging Tips

Check Pod Status

kubectl get pods -n supabase
kubectl describe pod <pod-name> -n supabase
kubectl logs <pod-name> -n supabase

Test Database Connection

kubectl exec -n supabase postgres-0 -- psql -U postgres -c "SELECT version();"

Check Volume Permissions

kubectl run debug --rm -it --image=busybox --restart=Never -n supabase \
  --overrides='{"spec":{"volumes":[{"name":"data","persistentVolumeClaim":{"claimName":"postgres-data-postgres-0"}}],"containers":[{"name":"debug","image":"busybox","command":["ls","-la","/data"],"volumeMounts":[{"name":"data","mountPath":"/data"}]}]}}'

Test API Gateway

# Should return 401 (auth required) - means Kong is working
curl -I http://10.89.97.214:8000/auth/v1/health

Time Investment

Total Time: ~3 hours of debugging and implementation

Breakdown:

- PostgreSQL permissions: 2 hours
- Environment variable issue: 30 minutes
- Health probe fixes: 20 minutes
- Schema initialization: 10 minutes

Value: A deep understanding of Kubernetes storage, Secrets, and service deployment patterns; the knowledge transfers to any stateful application deployment.

