Kubernetes Cluster Overview

The Tower Fleet Kubernetes cluster provides container orchestration for homelab applications using k3s, a lightweight Kubernetes distribution.

Cluster Architecture

Platform: k3s (lightweight Kubernetes)
Topology: 3-node cluster
Nodes:
- VM 201 (10.89.97.221) - k3s-master (control plane + worker)
- VM 202 (10.89.97.222) - k3s-worker-1
- VM 203 (10.89.97.223) - k3s-worker-2

Hypervisor: Proxmox VE (VMs on Proxmox host)

Core Infrastructure Components

Installed Components

| Component | Purpose | Access |
| --- | --- | --- |
| k3s | Kubernetes distribution | kubectl |
| Helm | Package manager | CLI |
| MetalLB | LoadBalancer (10.89.97.230-249) | N/A |
| Longhorn | Distributed storage | UI via port-forward |
| cert-manager | TLS certificate management | N/A |
| Traefik | Ingress controller (built-in) | N/A |

Observability Stack

| Component | Purpose | Access |
| --- | --- | --- |
| Grafana | Dashboards and visualization | http://10.89.97.231:3000 |
| Loki | Log aggregation | Via Grafana |
| Promtail | Log collection agent | N/A |
| Prometheus | Metrics collection | Via Grafana |
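
Loki and Prometheus are surfaced through Grafana as data sources. As a rough sketch (assuming in-cluster service names loki and prometheus in the monitoring namespace, which may differ from the actual install), a Grafana datasource provisioning file would look like:

apiVersion: 1                   # Grafana datasource provisioning format
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus.monitoring.svc.cluster.local:9090   # assumed service name
    isDefault: true
  - name: Loki
    type: loki
    access: proxy
    url: http://loki.monitoring.svc.cluster.local:3100         # assumed service name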

Application Platform

| Service | Purpose | Access |
| --- | --- | --- |
| Supabase | PostgreSQL, Auth, Storage, Realtime | http://10.89.97.230:8000 |
| PostgreSQL | Shared database with schema isolation | Via Supabase |
| Kong | API Gateway for Supabase | Via Supabase |

Networking

Pod Network: 10.42.0.0/16 (internal)
Service Network: 10.43.0.0/16 (internal)
MetalLB Pool: 10.89.97.230-10.89.97.249 (LoadBalancer IPs)
Ingress: Traefik (built-in with k3s)

Service Exposure

Services are exposed externally as type LoadBalancer; MetalLB assigns each one an IP from its pool:

kubectl get svc -A | grep LoadBalancer
# supabase    kong         LoadBalancer   10.89.97.230
# monitoring  grafana      LoadBalancer   10.89.97.231
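
Under the hood, the address pool corresponds to two MetalLB resources. A minimal sketch, assuming MetalLB v0.13+ with CRD-based configuration (resource names are illustrative):

apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: tower-fleet-pool        # illustrative name
  namespace: metallb-system
spec:
  addresses:
    - 10.89.97.230-10.89.97.249
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: tower-fleet-l2          # illustrative name
  namespace: metallb-system
spec:
  ipAddressPools:
    - tower-fleet-pool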

Storage

StorageClass: longhorn (default)
Replicas: 3 (across all nodes)
Access Modes: ReadWriteOnce, ReadWriteMany

Persistent volumes are automatically provisioned by Longhorn when PVCs are created.
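
For example, a claim like this (a minimal sketch; name, namespace, and size are illustrative) is bound to a dynamically created Longhorn volume:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-data            # illustrative name
  namespace: default
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn    # the cluster's default StorageClass
  resources:
    requests:
      storage: 5Gi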

Namespaces

| Namespace | Purpose |
| --- | --- |
| default | Default namespace (rarely used) |
| kube-system | Kubernetes system components |
| supabase | Supabase platform services |
| monitoring | Grafana, Loki, Prometheus |
| longhorn-system | Distributed storage |
| metallb-system | LoadBalancer provider |
| cert-manager | Certificate automation |
| home-portal, money-tracker, rms, etc. | Per-application namespaces |

Deployment Model

GitOps (planned): Flux for declarative infrastructure
Current: Manual kubectl apply and Helm installs
Configuration: YAML manifests in the tower-fleet repository
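
Concretely, a manual deploy is just the usual commands (the path and release names below are illustrative, not the actual repo layout):

# Apply raw manifests from the tower-fleet repo
kubectl apply -f apps/home-portal/        # illustrative path
# Or manage a component as a Helm release
helm upgrade --install grafana grafana/grafana -n monitoring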

Application Deployment Pattern

Each Next.js app gets:
- Dedicated namespace
- Deployment with 1-2 replicas
- Service (ClusterIP)
- LoadBalancer Service (for external access)
- PersistentVolumeClaim (if needed for uploads)
- ConfigMap/Secrets for environment variables
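
A minimal sketch of that pattern (image, port, and names are illustrative, not the real manifests):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: home-portal
  namespace: home-portal
spec:
  replicas: 2
  selector:
    matchLabels:
      app: home-portal
  template:
    metadata:
      labels:
        app: home-portal
    spec:
      containers:
        - name: web
          image: registry.example/home-portal:latest  # illustrative image
          ports:
            - containerPort: 3000                     # Next.js default port
          envFrom:
            - configMapRef:
                name: home-portal-config              # illustrative ConfigMap
---
apiVersion: v1
kind: Service
metadata:
  name: home-portal
  namespace: home-portal
spec:
  type: LoadBalancer            # MetalLB assigns an IP from the pool
  selector:
    app: home-portal
  ports:
    - port: 80
      targetPort: 3000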

High Availability

Control Plane: Single master (VM 201)
Workloads: Can run on any node (affinity rules optional)
Storage: 3 replicas via Longhorn (survives one node failure)
LoadBalancer: MetalLB Layer 2 mode (the speaker fails over to another node, so no single point of failure)

Note: A single master is acceptable for a homelab. For production, consider an HA control plane (k3s supports multiple servers with embedded etcd).
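
To spread a 2-replica app across nodes, a preferred pod anti-affinity rule can go under the Deployment's spec.template.spec (a sketch reusing the illustrative app label from the example above):

affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app: home-portal                 # illustrative label
          topologyKey: kubernetes.io/hostname  # prefer one pod per node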

Resource Allocation

Per Node:
- 4 CPU cores
- 8 GB RAM
- 50 GB disk (OS + Longhorn storage)

Total Cluster:
- 12 CPU cores
- 24 GB RAM
- ~150 GB raw disk across the three nodes; after OS overhead and Longhorn's 3x replication, effective volume capacity is roughly a third of that

Access and Authentication

kubectl Configuration:

From Proxmox host:

export KUBECONFIG=/etc/rancher/k3s/k3s.yaml
kubectl get nodes

From remote machine:

# Copy the kubeconfig from the master
scp root@10.89.97.221:/etc/rancher/k3s/k3s.yaml ~/.kube/tower-fleet-config
# Point the server field at the master instead of 127.0.0.1
sed -i 's/127.0.0.1/10.89.97.221/' ~/.kube/tower-fleet-config
export KUBECONFIG=~/.kube/tower-fleet-config
kubectl get nodes

See kubectl & kubeconfig Guide for detailed setup.

Management Operations

Common Commands

# Cluster status
kubectl get nodes
kubectl cluster-info

# View all resources
kubectl get all -A

# Check storage
kubectl get pv
kubectl get pvc -A

# View logs
kubectl logs -n supabase postgres-0
kubectl logs -n monitoring -l app=grafana

# Port forwarding
kubectl port-forward -n monitoring svc/grafana 3000:80

Upgrading k3s

# Check current version
kubectl version

# Upgrade the master (re-running the installer upgrades k3s in place;
# pin a release with INSTALL_K3S_VERSION=vX.Y.Z+k3s1 if needed)
ssh root@10.89.97.221
curl -sfL https://get.k3s.io | sh -

# Upgrade each worker (the join token lives in
# /var/lib/rancher/k3s/server/node-token on the master)
ssh root@10.89.97.222
curl -sfL https://get.k3s.io | K3S_URL=https://10.89.97.221:6443 K3S_TOKEN=<token> sh -

Monitoring and Observability

Grafana: http://10.89.97.231:3000
- Dashboards for cluster health and resource usage
- Loki for log exploration
- Prometheus for metrics

Key Metrics:
- Node CPU/memory usage
- Pod resource consumption
- Storage capacity and performance
- Network traffic

See Observability Standards for dashboard and alert configuration.

Troubleshooting

Node Issues

# Check node status
kubectl get nodes
kubectl describe node k3s-master

# Check node resources
kubectl top nodes

# SSH to node for system-level debugging
ssh root@10.89.97.221
systemctl status k3s
journalctl -u k3s -f

Pod Issues

# Check pod status
kubectl get pods -A

# Describe pod for events
kubectl describe pod <pod-name> -n <namespace>

# View logs
kubectl logs <pod-name> -n <namespace>
kubectl logs <pod-name> -n <namespace> --previous  # Previous container instance

Storage Issues

# Check Longhorn health
kubectl get pods -n longhorn-system

# Check volumes
kubectl get volumes.longhorn.io -n longhorn-system

# Access Longhorn UI
kubectl port-forward -n longhorn-system svc/longhorn-frontend 8080:80
# Visit: http://localhost:8080

Network Issues

# Check MetalLB
kubectl get pods -n metallb-system

# Check services
kubectl get svc -A

# Test internal DNS
kubectl run -it --rm debug --image=busybox --restart=Never -- nslookup kubernetes.default

Documentation

Setup Guides:
- Cluster Setup - k3s installation
- kubectl Guide - kubectl configuration
- Core Infrastructure - Helm, MetalLB, Longhorn
- Supabase Platform - Shared database setup

Operations:
- Loki Logging - Log aggregation
- Alert Investigation - Troubleshooting alerts

Reference:
- Architecture Decision: Multi-Node k3s
- Storage Strategy


Related Documentation:
- Infrastructure Overview
- Applications Overview
- Production Deployment Workflow