Kubernetes Cluster Overview

The Tower Fleet Kubernetes cluster provides container orchestration for homelab applications using k3s, a lightweight Kubernetes distribution.

Cluster Architecture

Platform: k3s (lightweight Kubernetes)
Topology: 3-node cluster
Nodes:
- VM 201 (10.89.97.221) - k3s-master (control plane + worker)
- VM 202 (10.89.97.222) - k3s-worker-1
- VM 203 (10.89.97.223) - k3s-worker-2

Hypervisor: Proxmox VE (VMs on Proxmox host)

Core Infrastructure Components

Installed Components

| Component | Purpose | Access |
| --- | --- | --- |
| k3s | Kubernetes distribution | kubectl |
| Helm | Package manager | CLI |
| MetalLB | LoadBalancer (10.89.97.230-249) | N/A |
| Longhorn | Distributed storage | UI via port-forward |
| cert-manager | TLS certificate management | N/A |
| Traefik | Ingress controller (built-in) | N/A |

Observability Stack

| Component | Purpose | Access |
| --- | --- | --- |
| Grafana | Dashboards and visualization | http://10.89.97.231:3000 |
| Loki | Log aggregation | Via Grafana |
| Promtail | Log collection agent | N/A |
| Prometheus | Metrics collection | Via Grafana |
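
Loki and Prometheus are surfaced through Grafana as data sources. As a rough sketch (assuming in-cluster service names loki and prometheus in the monitoring namespace, which may differ from the actual install), a Grafana datasource provisioning file would look like:

apiVersion: 1                   # Grafana datasource provisioning format
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus.monitoring.svc.cluster.local:9090   # assumed service name
    isDefault: true
  - name: Loki
    type: loki
    access: proxy
    url: http://loki.monitoring.svc.cluster.local:3100         # assumed service name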

Application Platform

| Service | Purpose | Access |
| --- | --- | --- |
| Supabase | PostgreSQL, Auth, Storage, Realtime | http://10.89.97.230:8000 |
| PostgreSQL | Shared database with schema isolation | Via Supabase |
| Kong | API Gateway for Supabase | Via Supabase |

Networking

Pod Network: 10.42.0.0/16 (internal)
Service Network: 10.43.0.0/16 (internal)
MetalLB Pool: 10.89.97.230-10.89.97.249 (LoadBalancer IPs)
Ingress: Traefik (built-in with k3s)

Service Exposure

Services are exposed externally as type LoadBalancer; MetalLB assigns each one an IP from its pool:

kubectl get svc -A | grep LoadBalancer
# supabase    kong         LoadBalancer   10.89.97.230
# monitoring  grafana      LoadBalancer   10.89.97.231
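
Under the hood, the address pool corresponds to two MetalLB resources. A minimal sketch, assuming MetalLB v0.13+ with CRD-based configuration (resource names are illustrative):

apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: tower-fleet-pool        # illustrative name
  namespace: metallb-system
spec:
  addresses:
    - 10.89.97.230-10.89.97.249
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: tower-fleet-l2          # illustrative name
  namespace: metallb-system
spec:
  ipAddressPools:
    - tower-fleet-pool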

Storage

StorageClass: longhorn (default)
Replicas: 3 (across all nodes)
Access Modes: ReadWriteOnce, ReadWriteMany

Persistent volumes are automatically provisioned by Longhorn when PVCs are created.
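
For example, a claim like this (a minimal sketch; name, namespace, and size are illustrative) is bound to a dynamically created Longhorn volume:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-data            # illustrative name
  namespace: default
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn    # the cluster's default StorageClass
  resources:
    requests:
      storage: 5Gi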

Namespaces

| Namespace | Purpose |
| --- | --- |
| default | Default namespace (rarely used) |
| kube-system | Kubernetes system components |
| supabase | Supabase platform services |
| monitoring | Grafana, Loki, Prometheus |
| longhorn-system | Distributed storage |
| metallb-system | LoadBalancer provider |
| cert-manager | Certificate automation |
| home-portal, money-tracker, rms, etc. | Per-application namespaces |

Deployment Model

GitOps (planned): Flux for declarative infrastructure
Current: Manual kubectl apply and Helm installs
Configuration: YAML manifests in the tower-fleet repository
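
Concretely, a manual deploy is just the usual commands (the path and release names below are illustrative, not the actual repo layout):

# Apply raw manifests from the tower-fleet repo
kubectl apply -f apps/home-portal/        # illustrative path
# Or manage a component as a Helm release
helm upgrade --install grafana grafana/grafana -n monitoring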

Application Deployment Pattern

Each Next.js app gets:
- Dedicated namespace
- Deployment with 1-2 replicas
- Service (ClusterIP)
- LoadBalancer Service (for external access)
- PersistentVolumeClaim (if needed for uploads)
- ConfigMap/Secrets for environment variables
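
A minimal sketch of that pattern (image, port, and names are illustrative, not the real manifests):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: home-portal
  namespace: home-portal
spec:
  replicas: 2
  selector:
    matchLabels:
      app: home-portal
  template:
    metadata:
      labels:
        app: home-portal
    spec:
      containers:
        - name: web
          image: registry.example/home-portal:latest  # illustrative image
          ports:
            - containerPort: 3000                     # Next.js default port
          envFrom:
            - configMapRef:
                name: home-portal-config              # illustrative ConfigMap
---
apiVersion: v1
kind: Service
metadata:
  name: home-portal
  namespace: home-portal
spec:
  type: LoadBalancer            # MetalLB assigns an IP from the pool
  selector:
    app: home-portal
  ports:
    - port: 80
      targetPort: 3000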

High Availability

Control Plane: Single master (VM 201)
Workloads: Can run on any node (affinity rules optional)
Storage: 3 replicas via Longhorn (survives one node failure)
LoadBalancer: MetalLB Layer 2 mode (the speaker fails over to another node, so no single point of failure)

Note: A single master is acceptable for a homelab. For production, consider an HA control plane (k3s supports multiple servers with embedded etcd).
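
To spread a 2-replica app across nodes, a preferred pod anti-affinity rule can go under the Deployment's spec.template.spec (a sketch reusing the illustrative app label from the example above):

affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app: home-portal                 # illustrative label
          topologyKey: kubernetes.io/hostname  # prefer one pod per node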

Resource Allocation

Per Node:
- 4 CPU cores
- 8 GB RAM
- 50 GB disk (OS + Longhorn storage)

Total Cluster:
- 12 CPU cores
- 24 GB RAM
- ~150 GB raw disk across the three nodes; after OS overhead and Longhorn's 3x replication, effective volume capacity is roughly a third of that

Access and Authentication

kubectl Configuration:

From Proxmox host:

export KUBECONFIG=/etc/rancher/k3s/k3s.yaml
kubectl get nodes

From remote machine:

# Copy the kubeconfig from the master
scp root@10.89.97.221:/etc/rancher/k3s/k3s.yaml ~/.kube/tower-fleet-config
# Point the server field at the master instead of 127.0.0.1
sed -i 's/127.0.0.1/10.89.97.221/' ~/.kube/tower-fleet-config
export KUBECONFIG=~/.kube/tower-fleet-config
kubectl get nodes

See kubectl & kubeconfig Guide for detailed setup.

Management Operations

Common Commands

# Cluster status
kubectl get nodes
kubectl cluster-info

# View all resources
kubectl get all -A

# Check storage
kubectl get pv
kubectl get pvc -A

# View logs
kubectl logs -n supabase postgres-0
kubectl logs -n monitoring -l app=grafana

# Port forwarding
kubectl port-forward -n monitoring svc/grafana 3000:80

Upgrading k3s

# Check current version
kubectl version

# Upgrade the master (re-running the installer upgrades k3s in place;
# pin a release with INSTALL_K3S_VERSION=vX.Y.Z+k3s1 if needed)
ssh root@10.89.97.221
curl -sfL https://get.k3s.io | sh -

# Upgrade each worker (the join token lives in
# /var/lib/rancher/k3s/server/node-token on the master)
ssh root@10.89.97.222
curl -sfL https://get.k3s.io | K3S_URL=https://10.89.97.221:6443 K3S_TOKEN=<token> sh -

Monitoring and Observability

Grafana: http://10.89.97.231:3000
- Dashboards for cluster health and resource usage
- Loki for log exploration
- Prometheus for metrics

Key Metrics:
- Node CPU/memory usage
- Pod resource consumption
- Storage capacity and performance
- Network traffic

See Observability Standards for dashboard and alert configuration.

Troubleshooting

Node Issues

# Check node status
kubectl get nodes
kubectl describe node k3s-master

# Check node resources
kubectl top nodes

# SSH to node for system-level debugging
ssh root@10.89.97.221
systemctl status k3s
journalctl -u k3s -f

Pod Issues

# Check pod status
kubectl get pods -A

# Describe pod for events
kubectl describe pod <pod-name> -n <namespace>

# View logs
kubectl logs <pod-name> -n <namespace>
kubectl logs <pod-name> -n <namespace> --previous  # Previous container instance

Storage Issues

# Check Longhorn health
kubectl get pods -n longhorn-system

# Check volumes
kubectl get volumes.longhorn.io -n longhorn-system

# Access Longhorn UI
kubectl port-forward -n longhorn-system svc/longhorn-frontend 8080:80
# Visit: http://localhost:8080

Network Issues

# Check MetalLB
kubectl get pods -n metallb-system

# Check services
kubectl get svc -A

# Test internal DNS
kubectl run -it --rm debug --image=busybox --restart=Never -- nslookup kubernetes.default

Documentation

Setup Guides:
- Cluster Setup - k3s installation
- kubectl Guide - kubectl configuration
- Core Infrastructure - Helm, MetalLB, Longhorn
- Supabase Platform - Shared database setup

Operations:
- Loki Logging - Log aggregation
- Alert Investigation - Troubleshooting alerts

Reference:
- Architecture Decision: Multi-Node k3s
- Storage Strategy


Related Documentation:
- Infrastructure Overview
- Applications Overview
- Production Deployment Workflow