
Software Engineering Approach Assessment

Date: 2025-11-12
Status: Active Reference
Purpose: Define software engineering practices for homelab projects


Overview

This document compares enterprise software engineering practices to our homelab context and defines a refined approach that balances:

  • Learning & Career Development - Building skills for professional growth
  • Speed & Flexibility - Moving fast on personal projects
  • Quality & Maintainability - Avoiding technical debt


Current State Analysis

Strengths

1. Production-Grade Infrastructure
  • ✅ Multi-node Kubernetes cluster with HA
  • ✅ Full observability stack (Prometheus, Grafana, Loki)
  • ✅ Proper deployment patterns (sealed secrets, LoadBalancers, health checks)
  • ✅ IaC mindset (Helm charts, kubectl manifests)

2. Excellent Documentation
  • ✅ Architecture Decision Records (ADRs)
  • ✅ Comprehensive operational guides
  • ✅ Reference implementations (home-portal)
  • ✅ OtterWiki site with searchable docs

3. Clear Standards
  • ✅ Defined tech stack (Next.js 16, Supabase, Tailwind v4)
  • ✅ Git commit conventions
  • ✅ Deployment procedures
  • ✅ LXC templates for consistency

4. Iterative Learning Focus
  • ✅ Learning by doing (hands-on k8s experience)
  • ✅ Resume-building projects
  • ✅ Real-world problem solving

Gaps

1. No Testing Strategy
  • ❌ No test coverage
  • ❌ Manual verification only
  • ❌ No regression testing
  • ❌ No CI/CD pipeline

2. Ad-Hoc Design Process
  • ⚠️ Design decisions documented after implementation (ADRs)
  • ⚠️ No upfront design review
  • ⚠️ Jump straight to coding

3. Limited Pre-Implementation Planning
  • ⚠️ No design docs before major features
  • ⚠️ No task breakdown or estimation
  • ⚠️ No explicit consideration of alternatives


Enterprise vs. Homelab Context

Enterprise Process (from industry post)

```
1. Technical Design Document
   ├─ Proposal phase (stakeholder approval)
   └─ Design phase (architecture, integrations)

2. Design Review
   └─ Senior engineers critique ("front-load the pain")

3. Subsystem Documentation
   └─ Detailed specs for each component

4. Backlog & Sprint Planning
   └─ Discrete tasks with PMs/TPMs

5. Software Development (with AI)
   ├─ AI writes tests first (TDD)
   └─ AI builds features

6. Code Review
   └─ 2-dev approval before merge

7. Staging → Production
   └─ Phased deployment
```

What to Adopt (High Value)

1. Lightweight Design Documents (30-60 min upfront)

Create simple design docs for non-trivial features (>2 hours):

```markdown
# Feature: [Name]
**Date:** YYYY-MM-DD
**Status:** Proposal | Approved | Implemented

## Problem
What are we solving? Why now?

## Proposed Solution
- High-level architecture
- Tech choices (with rationale)
- Key components

## Alternatives Considered
| Option | Pros | Cons | Verdict |
|--------|------|------|---------|
| A      | ... | ...  | Rejected because... |
| B      | ... | ...  | **Chosen** |

## Implementation Plan
- [ ] Component A (est: 2h)
- [ ] Component B (est: 3h)
- [ ] Tests (est: 1h)
- [ ] Documentation (est: 30min)

## Risks & Mitigations
- Risk 1: [mitigation]
- Risk 2: [mitigation]

## Validation Criteria
How do we know it works?
- [ ] Test case 1
- [ ] Test case 2
```

Where: /root/tower-fleet/docs/design/<feature-name>.md

When: Before implementing features >2 hours

2. Test-Driven Development (TDD)

Current: No testing
Target: 80% coverage for new features

Standard Stack:
  • Unit tests: Vitest
  • Component tests: React Testing Library
  • E2E tests: Playwright (critical paths only)
  • API tests: Supertest or Vitest with MSW (see the sketch below)
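To make the API-test layer concrete, here is a minimal sketch using Vitest with MSW (v2-style handlers) to stub an upstream HTTP call. The endpoint URL, payload shape, and file path are illustrative assumptions, not part of an existing project:

```typescript
// tests/api/rates.test.ts (hypothetical path and endpoint, for illustration only)
import { describe, it, expect, beforeAll, afterAll, afterEach } from 'vitest'
import { setupServer } from 'msw/node'
import { http, HttpResponse } from 'msw'

// Intercept the external API our code would otherwise hit over the network
const server = setupServer(
  http.get('https://api.example.com/rates/USD', () =>
    HttpResponse.json({ base: 'USD', rates: { EUR: 0.92 } })
  )
)

beforeAll(() => server.listen())
afterEach(() => server.resetHandlers())
afterAll(() => server.close())

describe('rates API (stubbed with MSW)', () => {
  it('returns the mocked payload instead of calling the real service', async () => {
    // Relies on the global fetch available in Node 18+
    const res = await fetch('https://api.example.com/rates/USD')
    const body = await res.json()

    expect(res.status).toBe(200)
    expect(body.rates.EUR).toBeCloseTo(0.92)
  })
})
```

Because MSW intercepts at the request level, the same handlers can be reused by component tests that trigger data fetching.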

Workflow:
  1. Write test (with Claude Code)
  2. Run test (should fail)
  3. Implement feature
  4. Run test (should pass)
  5. Refactor

Benefits:
  • Prevents regressions
  • Documents expected behavior
  • Enables confident refactoring
  • Resume value (expected in senior roles)

3. Pre-Implementation Checklist

Before writing code, answer:

```markdown
## Pre-Implementation Checklist

- [ ] **Problem:** What are we solving? (1-2 sentences)
- [ ] **Simplest solution:** What's the minimal viable approach?
- [ ] **Consistency:** Do we have a similar pattern elsewhere?
- [ ] **Failure modes:** What could go wrong?
- [ ] **Test strategy:** How will we verify it works?
- [ ] **Observability:** What metrics will we track?
- [ ] **Design doc needed:** Is this >2 hours of work?
```

If uncertain on any point → create design doc first

Time investment: 5-10 minutes
Time saved: Hours (by catching issues early)

What to Skip or Adapt (Low Value As-Is)

1. Formal Stakeholder Approval
  • Enterprise: Proposal reviewed by product, engineering, design teams
  • Homelab: You're the sole stakeholder
  • Verdict: ❌ SKIP

2. Multi-Engineer Design Review
  • Enterprise: Design doc "shredded" by senior engineers
  • Homelab: Solo developer
  • Verdict: ✅ ADAPT - Use Claude Code as design critic

3. Sprint Planning with PMs/TPMs
  • Enterprise: Formal sprints, velocity tracking, burndown charts
  • Homelab: Flexible timeline
  • Verdict: ❌ SKIP formal sprints. Use a prioritized backlog in PROJECTS.md

4. Two-Developer Code Review
  • Enterprise: 2-dev approval before merge
  • Homelab: Solo developer
  • Verdict: ✅ ADAPT - Claude Code reviews diffs before push


Refined Development Workflow

For Small Changes (<2 hours)

Examples: Bug fixes, UI tweaks, config changes

1. Update PROJECTS.md with task
2. Implement
3. Manual testing
4. Commit with descriptive message
5. Push

No design doc needed.

For Medium Features (2-8 hours)

Examples: New API endpoint, new page, database schema change

1. Pre-implementation checklist (5 min)
2. Create design doc if needed (30 min)
3. Ask Claude Code to critique design
4. Break into subtasks (TodoWrite)
5. Write tests first (TDD)
6. Implement → commit frequently
7. Update documentation
8. Deploy to dev (LXC) → test
9. Deploy to prod (k8s)

Design doc optional but recommended.

For Large Features (>8 hours)

Examples: New application, major architecture change, multi-service feature

1. Create detailed design doc with diagrams (1-2 hours)
2. Claude Code "design review" (act as senior engineer)
3. Break into milestones (each <8h)
4. For each milestone:
   ├─ Write tests (TDD)
   ├─ Implement
   ├─ Document
   └─ Deploy to staging
5. Integration testing
6. Deploy to production
7. Create ADR documenting decision

Design doc required.


Testing Standards

Philosophy

"Tests are documentation that never goes out of date."

Goals:
  • Confidence: Know when something breaks
  • Speed: Catch bugs before production
  • Refactoring: Change code without fear

NOT goals:
  • 100% coverage (diminishing returns)
  • Testing for testing's sake

Coverage Targets

| Type | Coverage | Priority |
|------|----------|----------|
| Critical paths (auth, payments) | 90-100% | P0 |
| Core features | 80%+ | P1 |
| UI components | 60-80% | P2 |
| Utils/helpers | 80%+ | P1 |
| E2E critical flows | 3-5 tests | P0 |
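If you want these targets enforced rather than aspirational, Vitest can fail the run when coverage drops below a threshold. A hedged sketch, assuming a recent Vitest release with the @vitest/coverage-v8 provider installed:

```typescript
// vitest.config.ts (coverage excerpt only)
import { defineConfig } from 'vitest/config'

export default defineConfig({
  test: {
    coverage: {
      provider: 'v8',              // requires the @vitest/coverage-v8 package
      reporter: ['text', 'html'],
      // npm run test:coverage fails if new code pushes these below target
      thresholds: {
        lines: 80,
        functions: 80,
        branches: 70,
        statements: 80,
      },
    },
  },
})
```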

What to Test

DO test:
  • ✅ Business logic
  • ✅ API endpoints
  • ✅ Database operations
  • ✅ Authentication/authorization
  • ✅ Error handling
  • ✅ Edge cases

DON'T test:
  • ❌ External libraries (trust them)
  • ❌ Trivial getters/setters
  • ❌ UI styling (visual regression is expensive)

Test Structure

```typescript
// Good: Arrange-Act-Assert (AAA)
describe('createTransaction', () => {
  it('should create transaction with valid data', async () => {
    // Arrange
    const userId = 'user-123'
    const transactionData = { amount: 100, category: 'food' }

    // Act
    const result = await createTransaction(userId, transactionData)

    // Assert
    expect(result.amount).toBe(100)
    expect(result.category).toBe('food')
    expect(result.userId).toBe(userId)
  })

  it('should throw error for negative amount', async () => {
    // Arrange
    const invalidData = { amount: -50, category: 'food' }

    // Act & Assert
    await expect(
      createTransaction('user-123', invalidData)
    ).rejects.toThrow('Amount must be positive')
  })
})
```
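The same AAA shape applies to component tests with React Testing Library. A self-contained sketch; the Counter component is defined inline purely for illustration and is not from an existing app:

```tsx
// Counter.test.tsx (illustrative component and test)
import { describe, it, expect } from 'vitest'
import { render, screen } from '@testing-library/react'
import userEvent from '@testing-library/user-event'
import { useState } from 'react'

// Tiny throwaway component so the example runs on its own
function Counter() {
  const [count, setCount] = useState(0)
  return <button onClick={() => setCount((c) => c + 1)}>Clicked {count} times</button>
}

describe('Counter', () => {
  it('should increment the label on click', async () => {
    // Arrange
    render(<Counter />)
    const button = screen.getByRole('button')

    // Act
    await userEvent.click(button)

    // Assert (toHaveTextContent comes from @testing-library/jest-dom in tests/setup.ts)
    expect(button).toHaveTextContent('Clicked 1 times')
  })
})
```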

Test Setup (Next.js Project)

1. Install dependencies:

```bash
npm install -D vitest @vitest/ui @testing-library/react \
  @testing-library/jest-dom @testing-library/user-event jsdom
```

2. Create vitest.config.ts:

```typescript
import { defineConfig } from 'vitest/config'
import react from '@vitejs/plugin-react'
import path from 'path'

export default defineConfig({
  plugins: [react()],
  test: {
    environment: 'jsdom',
    globals: true,
    setupFiles: './tests/setup.ts',
  },
  resolve: {
    alias: {
      '@': path.resolve(__dirname, './src'),
    },
  },
})
```

3. Create tests/setup.ts:

```typescript
import '@testing-library/jest-dom'
import { expect, afterEach } from 'vitest'
import { cleanup } from '@testing-library/react'

afterEach(() => {
  cleanup()
})
```

4. Add scripts to package.json:

```json
{
  "scripts": {
    "test": "vitest",
    "test:ui": "vitest --ui",
    "test:coverage": "vitest --coverage"
  }
}
```

5. Create first test:

```typescript
// src/lib/utils.test.ts
import { describe, it, expect } from 'vitest'
import { formatCurrency } from './utils'

describe('formatCurrency', () => {
  it('should format USD correctly', () => {
    expect(formatCurrency(1234.56)).toBe('$1,234.56')
  })

  it('should handle zero', () => {
    expect(formatCurrency(0)).toBe('$0.00')
  })

  it('should handle negative values', () => {
    expect(formatCurrency(-50)).toBe('-$50.00')
  })
})
```
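For completeness, an implementation that would satisfy these tests could look like the sketch below; the real utils module may differ:

```typescript
// src/lib/utils.ts (sketch)
export function formatCurrency(value: number): string {
  // Intl.NumberFormat handles grouping, two decimal places, and the leading minus sign
  return new Intl.NumberFormat('en-US', {
    style: 'currency',
    currency: 'USD',
  }).format(value)
}
```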

Documentation Standards

1. ADRs (Architecture Decision Records)

When: After implementing significant architectural decisions

Location: /root/tower-fleet/docs/decisions/ADR-XXX-topic.md

Template: Already defined (see ADR-001, ADR-002)

Examples: - Storage strategy (Longhorn vs NFS) - Multi-node vs single-node k3s - Supabase multi-app architecture

2. Design Documents

When: Before implementing features >2 hours

Location: /root/tower-fleet/docs/design/<feature-name>.md

Template: See "Lightweight Design Documents" above

Examples (to create): - AI subtitle generator architecture - RMS Next.js rewrite design - money-tracker CSV import v2

3. Operational Docs

When: After deploying new services/infrastructure

Location: /root/tower-fleet/docs/operations/<topic>.md

Examples (existing): - Alert investigation guide - Loki operations - Documentation maintenance

4. Reference Implementations

When: After establishing a pattern worth reusing

Location: /root/tower-fleet/docs/reference/<topic>.md

Examples (existing): - Observability standards (home-portal) - Supabase multi-app architecture - Production app deployment guide


Code Review Process

Pre-Push Review (with Claude Code)

Before pushing significant changes:

```bash
# 1. View diff
git diff

# 2. Ask Claude Code to review
"Review this diff for:
- Security issues (SQL injection, XSS, secrets in code)
- Performance problems (N+1 queries, unnecessary loops)
- Maintainability (readability, naming, duplication)
- Missing tests
"

# 3. Address issues

# 4. Push
git push
```

What Claude Code Should Check

Security:
  • ❌ Secrets hardcoded
  • ❌ SQL injection vulnerabilities
  • ❌ XSS vulnerabilities
  • ❌ Unvalidated user input (see the sketch below)
  • ❌ Missing authentication checks
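For the unvalidated-input item, one common mitigation is schema validation at the boundary. A hedged sketch using zod (an assumed dependency, not currently part of the documented stack); the schema fields are hypothetical:

```typescript
// Hypothetical input guard for a server action or API route
import { z } from 'zod'

const transactionSchema = z.object({
  amount: z.number().positive('Amount must be positive'),
  category: z.string().min(1),
})

export function parseTransactionInput(input: unknown) {
  // safeParse never throws; callers branch on `ok` and return a 400 on failure
  const result = transactionSchema.safeParse(input)
  if (!result.success) {
    return { ok: false as const, errors: result.error.flatten().fieldErrors }
  }
  return { ok: true as const, data: result.data }
}
```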

Performance:
  • ❌ N+1 database queries (see the sketch below)
  • ❌ Unbounded loops
  • ❌ Missing indexes
  • ❌ Inefficient algorithms
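For the N+1 item, the usual offender is resolving related rows inside a loop. With supabase-js you can embed the relationship in one query instead; this sketch assumes a hypothetical transactions table with a foreign key to categories, and conventional environment variable names:

```typescript
// Assumes a transactions.category_id -> categories.id foreign key (hypothetical schema)
import { createClient } from '@supabase/supabase-js'

const supabase = createClient(process.env.SUPABASE_URL!, process.env.SUPABASE_ANON_KEY!)

// N+1 pattern: one extra query per transaction to fetch its category name
async function getTransactionsSlow(userId: string) {
  const { data: transactions } = await supabase
    .from('transactions')
    .select('id, amount, category_id')
    .eq('user_id', userId)

  return Promise.all(
    (transactions ?? []).map(async (t) => {
      const { data: category } = await supabase
        .from('categories')
        .select('name')
        .eq('id', t.category_id)
        .single()
      return { ...t, categoryName: category?.name }
    })
  )
}

// Single round trip: embed the related row through the foreign key
async function getTransactions(userId: string) {
  const { data, error } = await supabase
    .from('transactions')
    .select('id, amount, category:categories(name)')
    .eq('user_id', userId)
  if (error) throw error
  return data
}
```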

Maintainability:
  • ❌ Complex functions (>50 lines)
  • ❌ Poor naming
  • ❌ Code duplication
  • ❌ Missing error handling
  • ❌ No comments on complex logic

Testing:
  • ❌ No tests for new features
  • ❌ No tests for bug fixes


Deployment Strategy

Environments

1. Development (Host)
  • Purpose: Active development, fast iteration
  • Location: /root/projects/ on Proxmox host
  • Access: tmux sessions
  • Database: K8s Supabase sandbox (supabase-sandbox namespace)
  • Deployment: npm run dev

2. Production (Kubernetes)
  • Purpose: Production workloads
  • Access: LoadBalancer IPs
  • Database: Shared k8s Supabase (supabase namespace, schema-isolated)
  • Deployment: kubectl apply

No staging environment - homelab context doesn't warrant it.

Deployment Checklist

Before deploying to production:

  • [ ] Feature tested locally (host dev environment)
  • [ ] Tests pass (npm test)
  • [ ] Build succeeds (npm run build)
  • [ ] Observability configured (metrics, alerts, dashboard; see the sketch after this checklist)
  • [ ] Documentation updated
  • [ ] Secrets sealed (kubeseal)
  • [ ] Commit pushed to GitHub
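To make the observability item concrete, here is a hedged sketch of app-level metrics using prom-client in a Next.js app. The metric name, file paths, and registration guard are assumptions; home-portal's actual instrumentation may differ:

```typescript
// lib/metrics.ts (sketch; assumes the prom-client package is installed)
import { Counter, collectDefaultMetrics, register } from 'prom-client'

// Default Node.js process metrics; guard against re-registration on dev hot reloads
if (!register.getSingleMetric('process_cpu_user_seconds_total')) {
  collectDefaultMetrics()
}

// Incremented wherever the feature is used, e.g. featureUsage.inc({ action: 'csv_import' })
export const featureUsage = new Counter({
  name: 'feature_usage_total',
  help: 'Count of feature invocations by action',
  labelNames: ['action'],
})

// app/api/metrics/route.ts (sketch): expose the registry for Prometheus to scrape
export async function GET() {
  return new Response(await register.metrics(), {
    headers: { 'Content-Type': register.contentType },
  })
}
```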

Tools & Automation

Current

  • Git - Version control
  • kubectl - Kubernetes deployment
  • Helm - Package management
  • kubeseal - Secrets encryption
  • Prometheus/Grafana - Observability
  • OtterWiki - Documentation hosting (auto-synced from Git)

To Add

  • [ ] Vitest - Testing framework
  • [ ] Playwright - E2E testing (optional)
  • [ ] GitHub Actions - CI/CD (future)
  • [ ] pre-commit hooks - Automated checks (future)

Career Development Alignment

Skills You're Building (Resume Value)

Infrastructure & DevOps:
  • ✅ Kubernetes (k3s multi-node cluster)
  • ✅ Infrastructure as Code (Helm, kubectl manifests)
  • ✅ Observability (Prometheus, Grafana, Loki)
  • ✅ GitOps practices
  • ✅ Container orchestration
  • ✅ High availability patterns

Software Engineering:
  • ✅ Full-stack development (Next.js, React, TypeScript)
  • ✅ API design (RESTful, server actions)
  • ✅ Database design (PostgreSQL, Supabase)
  • ⚠️ Testing (need to add)
  • ⚠️ Code review (need to formalize)

Gaps (High Value to Add):
  • ❌ Test-Driven Development (TDD)
  • ❌ CI/CD pipelines
  • ❌ Design documentation
  • ❌ Load testing / performance optimization

Interview Talking Points

With current approach:
  • "Built production k8s cluster with monitoring, HA, and GitOps"
  • "Deployed multiple Next.js apps with shared Supabase SSO"
  • "Implemented full observability stack (metrics, logs, alerts)"

With refined approach:
  • "Built production k8s cluster with 80%+ test coverage"
  • "TDD workflow with AI assistance for rapid development"
  • "Design documentation and architectural decision records"
  • "Code review process ensuring quality and security"

These additions significantly strengthen your story.


Action Plan

Phase 1: Testing Foundation (Week 1)

Goal: Add testing to one project (home-portal)

  • [ ] Install Vitest + Testing Library
  • [ ] Create vitest.config.ts
  • [ ] Write tests for 3 utility functions
  • [ ] Write tests for 1 API route
  • [ ] Add test scripts to package.json
  • [ ] Document testing approach in README

Success criteria: Tests run and pass reliably from a single command (ready to wire into CI later)

Phase 2: TDD Workflow (Week 2-3)

Goal: Use TDD for new features

  • [ ] Add pre-implementation checklist to CLAUDE.md
  • [ ] Practice TDD on 2-3 small features
  • [ ] Achieve 80% coverage on new code
  • [ ] Document learnings

Success criteria: Comfortable writing tests before implementation

Phase 3: Design Documentation (Week 4)

Goal: Establish design doc practice

  • [ ] Create design doc template
  • [ ] Write design doc for next medium-sized feature
  • [ ] Review design with Claude Code
  • [ ] Implement according to design
  • [ ] Reflect on whether upfront design saved time

Success criteria: Design doc prevents implementation rework

Phase 4: Expand Testing (Ongoing)

Goal: Add tests to existing projects

  • [ ] Add testing to money-tracker
  • [ ] Write E2E tests for critical paths
  • [ ] Add test coverage reporting
  • [ ] Set up GitHub Actions for automated testing (optional)

Templates

Design Document Template

# Design: [Feature Name]

**Date:** YYYY-MM-DD
**Author:** Your name
**Status:** Proposal | Approved | Implemented
**Project:** home-portal | money-tracker | etc.

---

## Problem Statement

[What problem are we solving? Why is this important?]

## Goals

- [ ] Goal 1
- [ ] Goal 2
- [ ] Goal 3

## Non-Goals

- What we're explicitly NOT doing in this iteration

## Proposed Solution

### High-Level Architecture

[Diagram if helpful]

### Components

**Component A:**
- Responsibility: ...
- Tech: ...
- Dependencies: ...

**Component B:**
- Responsibility: ...
- Tech: ...
- Dependencies: ...

### Data Model

```sql
-- New tables/changes
```

API Design

```typescript
// New endpoints/functions
```

Alternatives Considered

Option 1: [Name]

Pros: - ...

Cons: - ...

Verdict: Rejected because...

Option 2: [Name] ← Chosen

Pros: - ...

Cons: - ...

Verdict: Accepted because...

Implementation Plan

  1. Phase 1: Foundation (2h)
     • [ ] Set up database schema
     • [ ] Write migrations
  2. Phase 2: API Layer (3h)
     • [ ] Implement endpoints
     • [ ] Write tests
  3. Phase 3: UI (2h)
     • [ ] Build components
     • [ ] Wire up data

Total estimate: 7 hours

Risks & Mitigations

| Risk | Impact | Probability | Mitigation |
|------|--------|-------------|------------|
| Risk 1 | High | Low | ... |

Testing Strategy

Unit tests:
  • Test case 1
  • Test case 2

Integration tests:
  • Test case 3

Manual testing:
  • Scenario 1
  • Scenario 2

Observability

Metrics to add:
  • feature_usage_total{action}
  • feature_errors_total{type}

Alerts:
  • FeatureErrorRate > 5%

Dashboard:
  • Panel showing usage trends

Rollout Plan

  1. Test locally (host dev environment)
  2. Test manually
  3. Deploy to prod (k8s)
  4. Monitor for 24h
  5. Consider it stable if no issues

Success Metrics

How do we know this worked?

  • [ ] Feature is used X times in first week
  • [ ] No errors in first 48 hours
  • [ ] Performance is <100ms p95

Open Questions

  • [ ] Question 1?
  • [ ] Question 2?

Approval: [Date] - Approved by [you]

Test File Template

```typescript
// src/features/example/example.test.ts
import { describe, it, expect, beforeEach, vi } from 'vitest'
import { exampleFunction } from './example'

describe('exampleFunction', () => {
  // Setup
  beforeEach(() => {
    // Reset state before each test
  })

  describe('happy path', () => {
    it('should do X when given Y', () => {
      // Arrange
      const input = { /* ... */ }

      // Act
      const result = exampleFunction(input)

      // Assert
      expect(result).toBe(expected)
    })
  })

  describe('edge cases', () => {
    it('should handle empty input', () => {
      expect(() => exampleFunction({})).toThrow()
    })

    it('should handle null', () => {
      expect(exampleFunction(null)).toBe(null)
    })
  })

  describe('error cases', () => {
    it('should throw error for invalid input', () => {
      expect(() => exampleFunction(invalid)).toThrow('Error message')
    })
  })
})
```

FAQs

Q: Isn't this overkill for a homelab?

A: Some parts yes, some no. We're selectively adopting practices that provide:
  • Career development value (TDD, design docs)
  • Quality improvements (testing, code review)
  • Time savings (design docs prevent rework)

We're skipping things that don't apply (stakeholder approval, formal sprints).

Q: Will this slow me down?

A: Short term, slightly (there is a learning curve). Long term, it speeds you up: less debugging, confident refactoring, clearer thinking.

Q: Do I need 80% test coverage for everything?

A: No. Target 80% for new features and critical paths. Existing code can be tested opportunistically.

Q: Should I write design docs for every feature?

A: No. Only for:
  • Features >2 hours
  • Unclear requirements
  • Multiple possible approaches
  • High complexity/risk

Q: What if I want to move fast and skip this?

A: That's fine for experiments. But for projects you'll maintain (home-portal, money-tracker), these practices pay dividends.


Conclusion

Your current approach is strong for infrastructure and operations. The main gaps are in software engineering practices (testing, design documentation).

Key Takeaways:

  1. Add testing - Biggest bang for buck. Resume value + quality improvement.
  2. Write design docs for medium/large features - Think before coding.
  3. Use Claude Code for design review - Get feedback before implementation.
  4. Keep what works - Your deployment process, documentation, and infrastructure practices are excellent.

Next Steps:

  1. Read this document
  2. Review the action plan (Phase 1-4)
  3. Start with Phase 1 (testing foundation) on home-portal
  4. Iterate and refine based on what works

Last Updated: 2025-11-12
Maintained By: Infrastructure Team
Status: Active - living document