Expanding Cluster Storage¶
Guide for expanding k3s cluster storage capacity by resizing VM disks.
When to Expand¶
Expand storage when:

- Current allocation approaches usable capacity
- Planning to add new stateful applications
- Longhorn shows low disk space warnings
- PVCs fail to provision due to insufficient storage
Check current usage:
# Via Longhorn UI
open http://10.89.97.210
# Navigate to Node tab, check "Allocatable" vs "Reserved"
# Via kubectl
kubectl get nodes.longhorn.io -n longhorn-system -o yaml | \
grep -E "storageAvailable|storageMaximum"
# Check PVC allocations
kubectl get pvc -A -o custom-columns=\
NAMESPACE:.metadata.namespace,\
NAME:.metadata.name,\
SIZE:.spec.resources.requests.storage
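To see how much headroom a new application would have, it helps to total the existing PVC requests. A rough sketch, assuming every PVC declares its size in Gi (remember each requested gigabyte costs two gigabytes of raw capacity at 2 replicas):
# Sum requested PVC storage across all namespaces (sizes assumed to be in Gi)
kubectl get pvc -A -o jsonpath='{range .items[*]}{.spec.resources.requests.storage}{"\n"}{end}' \
  | grep -o '^[0-9]*' | awk '{total+=$1} END {print total " Gi requested before replica overhead"}'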
Tower Fleet Current Configuration¶
After expansion (November 2025):

- VM disk size: 150GB per node
- Raw capacity: 450GB (3 nodes × 150GB)
- Usable capacity: 225GB (2-replica = 50% overhead)
- Reserved: 56GB (25% minimum free space)
- Available for allocation: 169GB
Current allocations:
| Service         | Size × Replicas | Total |
|-----------------|-----------------|-------|
| Prometheus      | 30GB × 2        | 60GB  |
| Grafana         | 5GB × 2         | 10GB  |
| Alertmanager    | 2GB × 2         | 4GB   |
| Docker Registry | 10GB × 2        | 20GB  |
| Total Used      |                 | 94GB  |
| Remaining       |                 | 75GB  |
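The headline numbers above follow from simple arithmetic; a quick sanity check in shell using the document's own figures:
# Capacity math (nominal 150GB disks, 2 replicas, 25% minimum free space)
echo $(( 150 * 3 ))          # 450 GB raw
echo $(( 450 / 2 ))          # 225 GB usable with 2 replicas
echo $(( 225 * 25 / 100 ))   # 56 GB reserved
echo $(( 225 - 56 ))         # 169 GB available for allocation
echo $(( 169 - 94 ))         # 75 GB remaining after current allocations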
Expansion Process¶
Prerequisites¶
- Proxmox VE access (root on host)
- SSH access to all k3s nodes
- Sufficient space in ZFS pool
Check ZFS availability:
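A minimal sketch for the Proxmox host; the pool name rpool is an assumption, substitute whatever backs local-zfs on your setup:
# Show overall pool usage (SIZE / ALLOC / FREE)
zpool list
# Per-dataset usage under the pool ("rpool" is an assumed name)
zfs list -o name,used,available -r rpool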
Step 1: Resize VMs at Proxmox Level¶
This operation is instant (just metadata change) and safe (no downtime):
# Resize all three k3s VMs (example: add 70GB to current 80GB = 150GB)
qm resize 201 scsi0 +70G
qm resize 202 scsi0 +70G
qm resize 203 scsi0 +70G
# Verify new sizes
qm config 201 | grep scsi0
qm config 202 | grep scsi0
qm config 203 | grep scsi0
# Expected output:
# scsi0: local-zfs:vm-201-disk-0,size=150G
# scsi0: local-zfs:vm-202-disk-0,size=150G
# scsi0: local-zfs:vm-203-disk-0,size=150G
Notes:
- Use +XG notation to add space (e.g., +70G)
- VMs remain running during this operation
- Pods and services continue running without interruption
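The same resize can be expressed as a loop over the node VM IDs, which keeps the three invocations identical by construction:
# Equivalent loop form (VM IDs 201-203 are this cluster's k3s nodes)
for vmid in 201 202 203; do
  qm resize "$vmid" scsi0 +70G
  qm config "$vmid" | grep scsi0
done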
Step 2: Extend Partitions Inside VMs¶
Now extend the filesystem to use the new space:
# Extend all three nodes
ssh root@10.89.97.201 'growpart /dev/sda 1 && resize2fs /dev/sda1'
ssh root@10.89.97.202 'growpart /dev/sda 1 && resize2fs /dev/sda1'
ssh root@10.89.97.203 'growpart /dev/sda 1 && resize2fs /dev/sda1'
# Verify expansion
ssh root@10.89.97.201 'df -h / | tail -1'
ssh root@10.89.97.202 'df -h / | tail -1'
ssh root@10.89.97.203 'df -h / | tail -1'
# Expected output (for 150GB disks):
# /dev/sda1 148G 8.0G 134G 6% /
Commands explained:
- growpart /dev/sda 1 - Extends partition 1 to fill available space
- resize2fs /dev/sda1 - Extends ext4 filesystem to fill partition
- Both commands are online (no unmount required)
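resize2fs only applies to ext filesystems. If a node's root were XFS instead (not the case on this cluster, shown purely as a reference), the grow step would use xfs_growfs on the mount point:
# Hypothetical XFS variant: grow the partition, then the mounted filesystem
ssh root@<node-ip> 'growpart /dev/sda 1 && xfs_growfs /'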
Step 3: Verify Longhorn Detected New Space¶
Longhorn automatically detects increased disk space within 1-2 minutes:
# Wait for Longhorn to detect (automatic)
sleep 30
# Check new capacity
kubectl get nodes.longhorn.io -n longhorn-system -o yaml | \
grep -E "name:|storageAvailable:|storageMaximum:"
# Example output (after expanding to 150GB per node):
# name: k3s-master
# storageAvailable: 149736652800 # ~139GB available
# storageMaximum: 158312947712 # ~147GB total
Calculate usable capacity:
# Total raw: 147GB × 3 nodes = 441GB
# Usable (2-replica): 441GB ÷ 2 = 220GB
# Reserved (25%): 220GB × 0.25 = 55GB
# Available: 220GB - 55GB = 165GB
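The same calculation can be run against the live values Longhorn reports; a rough sketch that assumes one Longhorn disk per node and the default 25% reservation:
# Sum storageMaximum across nodes and derive usable/available capacity
kubectl get nodes.longhorn.io -n longhorn-system -o yaml \
  | grep storageMaximum: \
  | awk '{raw+=$2} END {printf "raw=%.0f GiB  usable=%.0f GiB  available=%.0f GiB\n", raw/2^30, raw/2/2^30, raw/2*0.75/2^30}'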
Check in Longhorn UI: open http://10.89.97.210, go to the Node tab, and confirm each node's "Allocatable" value reflects the new size.
Common Scenarios¶
Scenario 1: Double Current Capacity¶
From 80GB to 160GB per node:
# Add 80GB to each VM
qm resize 201 scsi0 +80G
qm resize 202 scsi0 +80G
qm resize 203 scsi0 +80G
# Extend filesystems
ssh root@10.89.97.201 'growpart /dev/sda 1 && resize2fs /dev/sda1'
ssh root@10.89.97.202 'growpart /dev/sda 1 && resize2fs /dev/sda1'
ssh root@10.89.97.203 'growpart /dev/sda 1 && resize2fs /dev/sda1'
Result: 480GB raw → 240GB usable
Scenario 2: Add 50GB to Each Node¶
qm resize 201 scsi0 +50G
qm resize 202 scsi0 +50G
qm resize 203 scsi0 +50G
ssh root@10.89.97.201 'growpart /dev/sda 1 && resize2fs /dev/sda1'
ssh root@10.89.97.202 'growpart /dev/sda 1 && resize2fs /dev/sda1'
ssh root@10.89.97.203 'growpart /dev/sda 1 && resize2fs /dev/sda1'
Scenario 3: Expand Single Node (Not Recommended)¶
While technically possible, uneven node sizes complicate capacity planning. Longhorn places each replica on a different node, so replicated volumes are effectively limited by the capacity of the smaller nodes, and the extra space on a single larger node largely goes unused.
Better approach: Expand all nodes equally.
Troubleshooting¶
Issue: growpart Says "NOCHANGE"¶
Symptoms: growpart reports NOCHANGE instead of CHANGED, and the partition size stays the same.
Cause: Proxmox resize didn't complete or partition already expanded.
Fix:
# Check if Proxmox resize was applied
qm config <vmid> | grep scsi0
# Check current partition size
ssh root@<node-ip> 'parted /dev/sda print'
# If sizes match, filesystem might need extending
ssh root@<node-ip> 'resize2fs /dev/sda1'
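Comparing the disk size against the partition size makes it obvious which layer still needs growing; lsblk shows both at a glance:
# Disk vs partition size (the disk should show 150G after the Proxmox resize)
ssh root@<node-ip> 'lsblk -o NAME,SIZE,TYPE /dev/sda'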
Issue: Longhorn Not Detecting New Space¶
Symptoms:

- Filesystem shows 150GB
- Longhorn still reports old capacity
Fix:
# Check node-disk status
kubectl get nodes.longhorn.io -n longhorn-system
# Restart Longhorn manager (it will re-scan)
kubectl rollout restart daemonset longhorn-manager -n longhorn-system
# Wait 2 minutes for discovery
sleep 120
# Re-check capacity
kubectl get nodes.longhorn.io -n longhorn-system -o yaml | \
grep storageMaximum
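If the capacity still does not update, the longhorn-manager logs usually say why; the app=longhorn-manager label matches a default Longhorn install:
# Inspect recent manager logs for disk detection errors
kubectl logs -n longhorn-system -l app=longhorn-manager --tail=50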
Issue: PVCs Still Fail After Expansion¶
Cause: Existing PVC requests might still exceed new capacity with replica overhead.
Fix:

1. Calculate total needed: PVC size × replica count (worked example below)
2. Ensure available capacity > total needed
3. Check Longhorn over-provisioning setting:
kubectl get setting storage-over-provisioning-percentage \
-n longhorn-system -o jsonpath='{.value}'
# Default: 100 (scheduled volume size may equal available capacity; 200 would allow 2× over-subscription)
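As a worked example of step 1, a hypothetical new 20GB PVC at the cluster's 2-replica default consumes twice its size in raw capacity:
# 20GB PVC × 2 replicas = 40GB of raw Longhorn capacity,
# which fits within the 75GB currently unallocated (see table above)
echo $(( 20 * 2 ))   # 40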
Capacity Planning Guidelines¶
Homelab Scale (3-10 apps)¶
Recommended: 150-200GB per node

- Raw: 450-600GB
- Usable: 225-300GB
- Comfortable for databases, registries, monitoring
Medium Scale (10-20 apps)¶
Recommended: 250-300GB per node

- Raw: 750-900GB
- Usable: 375-450GB
- Multiple databases, large registries, extensive logging
Large Scale (20+ apps)¶
Recommended: 400GB+ per node, or add more nodes

- Consider adding a 4th node for better redundancy
- Or migrate large storage needs to NFS/external storage
Expansion History¶
November 10, 2025¶
Change: Expanded from 80GB to 150GB per node
Reason:

- PVC allocations totaled 94GB, exceeding the ~90GB available with the original 80GB disks
- Insufficient room for upcoming applications
- Supabase PostgreSQL needs 10-20GB
- Future growth planning
Result:

- Available allocation capacity increased from 90GB to 169GB
- Comfortable headroom for 10+ applications
- No downtime during expansion
Commands used:
qm resize 201 scsi0 +70G
qm resize 202 scsi0 +70G
qm resize 203 scsi0 +70G
ssh root@10.89.97.201 'growpart /dev/sda 1 && resize2fs /dev/sda1'
ssh root@10.89.97.202 'growpart /dev/sda 1 && resize2fs /dev/sda1'
ssh root@10.89.97.203 'growpart /dev/sda 1 && resize2fs /dev/sda1'
Related Documentation¶
- Storage Verification - Verify PVCs are persistent
- Troubleshooting - Storage-related issues
- Core Infrastructure - Initial storage setup