SubtitleAI v1.2.3 - Production Deployment Summary

Status: ✅ LIVE IN PRODUCTION
URL: http://subtitles.internal (http://10.89.97.213)
Current Version: v1.2.3 (worker) / v1.1.5 (web)
Last Updated: 2025-11-25
Cluster: k3s (namespace: subtitleai)


What's Live

Core Features ✅

  • Video Upload: MP4, MKV, AVI support up to 2GB
  • Speech-to-Text: OpenAI Whisper (base model)
  • Multi-Language: 12 languages supported (EN, ES, FR, DE, IT, PT, RU, JA, KO, ZH, AR, HI)
  • Output Formats: SRT + ASS (Advanced SubStation Alpha)
  • Dual-Language Subtitles: Source + translation with perfect alignment (EN→ES tested ✅)
  • AI Translation: Qwen QwQ-32B via OpenRouter (~$0.0002 per video)
  • Style Profiles: 4 profiles implemented (default, learning, enhanced, accessibility)
  • Authentication: Supabase Auth with RLS policies
  • Job Management: Track progress, view history, download multiple formats
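
As an illustration of the Speech-to-Text and SRT output features above, here is a minimal sketch of how Whisper-style segments could be rendered as SRT. It is not the worker's actual code. The function names `srt_timestamp` and `segments_to_srt` are hypothetical; the only assumption about Whisper is that each segment is a dict with `start`, `end` (seconds) and `text`, which is the shape returned by the `openai-whisper` package's `transcribe()` under `result["segments"]`.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: list) -> str:
    """Render segments (dicts with 'start', 'end', 'text') as SRT text.

    Hypothetical helper: cues are numbered from 1 and separated by a
    blank line, as the SRT format expects.
    """
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = srt_timestamp(seg["start"])
        end = srt_timestamp(seg["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)


# Typical use with the openai-whisper package (not executed here):
#   import whisper
#   result = whisper.load_model("base").transcribe("video.mp4")
#   srt_text = segments_to_srt(result["segments"])
```

Rounding to whole milliseconds before splitting into fields avoids timestamps such as `00:00:00,1000` that can appear when seconds and milliseconds are rounded separately.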

Infrastructure ✅

  • Web App: Next.js 16 (1 replica)
      • Image: 10.89.97.201:30500/subtitleai-web:v1.1.5
      • Enhanced UI with format badges (dual-language, style profile)
  • Worker: Python + Whisper + Celery (1 replica)
      • Image: 10.89.97.201:30500/subtitleai-worker:v1.2.3
      • Translation batch size: 10 segments
      • Individual fallback for perfect alignment
  • Poller: Database job monitoring (5s interval)
  • Redis: Task queue (StatefulSet)
  • LoadBalancer: 10.89.97.213
  • Uptime: Stable (production ready)
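
The poller's behaviour (check the database for new jobs every 5 seconds and hand them to the task queue) can be sketched roughly as follows. This is an assumption-laden illustration, not the deployed poller: `fetch_pending` and `dispatch` are hypothetical callables standing in for the Supabase query and the Celery enqueue call, and `max_cycles` exists only so the loop can be exercised in a test.

```python
import time
from typing import Callable, Iterable, Optional


def poll_pending_jobs(
    fetch_pending: Callable[[], Iterable[dict]],
    dispatch: Callable[[dict], None],
    interval_seconds: float = 5.0,
    max_cycles: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll for pending jobs and pass each one to `dispatch`.

    fetch_pending: hypothetical function returning jobs not yet picked up.
    dispatch:      hypothetical function that enqueues one job for the worker.
    max_cycles:    stop after this many polls (None means run indefinitely).
    Returns the number of jobs dispatched.
    """
    dispatched = 0
    cycle = 0
    while max_cycles is None or cycle < max_cycles:
        for job in fetch_pending():
            dispatch(job)
            dispatched += 1
        cycle += 1
        # Wait before the next poll; 5s matches the interval listed above.
        sleep(interval_seconds)
    return dispatched
```

With a fixed 5-second interval, a newly created job waits at most one interval before it is seen, which is consistent with a job pickup time under 5 seconds.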

Performance ✅

  • Job pickup: <5 seconds
  • Transcription: ~4x the video's duration (5s video ≈ 20s processing)
  • Translation: ~3.5 minutes for typical video (68 segments)
  • E2E workflow: ~4-5 minutes for dual-language subtitles

Phase 2 Completion Status

✅ COMPLETED & TESTED

  1. Dual-Language Subtitles
       • Source + translation with perfect 1:1 alignment
       • Tested: English → Spanish ✅
       • Uses Qwen QwQ-32B open source model
       • Cost: ~$0.0002 per video (5x cheaper than Claude)
       • Stricter prompts prevent line skipping/merging
       • Individual fallback on batch mismatch ensures accuracy

  2. ASS Format Support
       • Generates Advanced SubStation Alpha files
       • Dual-language ASS with separated Source/Translation tracks
       • Source: White, 44pt, MarginV=50 (higher positioning)
       • Translation: Cyan, 40pt, MarginV=10 (lower positioning)
       • Tested: Default style with dual-language ✅

  3. Multi-Format Output
       • SRT + ASS generated for every job
       • Dual-language creates 2 files (_dual suffix)
       • Single-language creates 2 files (SRT + ASS)
       • UI displays all formats with badges

  4. Translation Quality
       • Perfect alignment verified by user
       • No offset errors (fixed cascading alignment bug)
       • Handles short phrases correctly ("No", "Ah", "Yeah")
       • Graceful fallback when translation fails

  5. UI Enhancements
       • Purple "Dual-Language" badge
       • Blue "Style Profile" badge (for ASS files)
       • Individual download buttons for each format
       • Shows file size and download counts
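
To make the dual-language ASS layout concrete, the sketch below builds a small ASS document with separate Source and Translation styles. The font sizes, colours and MarginV values come from the list above; everything else is an assumption made for illustration (the Arial font, outline and shadow widths, the style names, and the helper names `ass_timestamp`, `style_line` and `build_dual_ass`). It is not the worker's actual generator.

```python
STYLE_FORMAT = (
    "Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, "
    "OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, "
    "ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, "
    "MarginL, MarginR, MarginV, Encoding"
)
EVENT_FORMAT = (
    "Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text"
)


def ass_timestamp(seconds: float) -> str:
    """Format seconds as an ASS timestamp (H:MM:SS.cc, centiseconds)."""
    total_cs = int(round(seconds * 100))
    hours, rem = divmod(total_cs, 360_000)
    minutes, rem = divmod(rem, 6_000)
    secs, cs = divmod(rem, 100)
    return f"{hours}:{minutes:02d}:{secs:02d}.{cs:02d}"


def style_line(name: str, size: int, colour: str, margin_v: int) -> str:
    """Build one Style line. Colours use ASS &HAABBGGRR order.

    Assumed values: Arial font, outline 2, shadow 1, Alignment 2 (bottom center).
    """
    return (
        f"Style: {name},Arial,{size},{colour},&H000000FF,&H00000000,&H80000000,"
        f"0,0,0,0,100,100,0,0,1,2,1,2,10,10,{margin_v},1"
    )


def build_dual_ass(cues) -> str:
    """cues: iterable of (start_s, end_s, source_text, translated_text)."""
    lines = [
        "[Script Info]",
        "ScriptType: v4.00+",
        "",
        "[V4+ Styles]",
        STYLE_FORMAT,
        style_line("Source", 44, "&H00FFFFFF", 50),       # white, sits higher
        style_line("Translation", 40, "&H00FFFF00", 10),  # cyan, sits lower
        "",
        "[Events]",
        EVENT_FORMAT,
    ]
    for start, end, source, translation in cues:
        for style, text in (("Source", source), ("Translation", translation)):
            body = text.replace("\n", "\\N")  # ASS hard line break
            lines.append(
                f"Dialogue: 0,{ass_timestamp(start)},{ass_timestamp(end)},"
                f"{style},,0,0,0,,{body}"
            )
    return "\n".join(lines) + "\n"
```

Both lines share the same bottom-center alignment; the larger MarginV on the Source style is what lifts it above the translation. A margin of 0 on each Dialogue line means the style's own margins apply.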

🔧 IMPLEMENTED BUT NOT TESTED

  1. Single-Language ASS - Code exists, not user-tested
  2. Style Profiles - All 4 implemented, only default tested:
       • default - Clean, readable (White, 48pt, bottom center)
       • learning - Larger text (52pt, thicker outline)
       • enhanced - Same as default, ready for emotion/entity colors
       • accessibility - High contrast yellow (56pt, top center)

Technical Implementation

Translation Model: Qwen QwQ-32B

Why Qwen?

  • Open source model via OpenRouter
  • Trained on 36 trillion tokens across 119 languages
  • Matches Claude/GPT quality in multilingual benchmarks
  • 5x cheaper than Claude 3.5 Haiku
  • No rate limits (paid tier)

Implementation Details:

  • Batch size: 10 segments per API call
  • Strict prompt: "You MUST return EXACTLY N translations"
  • Validation: Detects count mismatches
  • Fallback: Translates individually if batch fails
  • Cost tracking: ~$0.0002 per video (~$0.10/month for 500 videos)
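
The batch, validate, fall back flow described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the worker's code: `translate_batch` is a hypothetical callable wrapping the OpenRouter request, and keeping the source text when a single-segment retry also fails is an assumed policy for "graceful fallback".

```python
from typing import Callable, List


def translate_aligned(
    segments: List[str],
    translate_batch: Callable[[List[str]], List[str]],
    batch_size: int = 10,
) -> List[str]:
    """Translate segments in batches while keeping 1:1 alignment.

    translate_batch: hypothetical function that sends a list of texts to the
    model and returns the translations it got back (possibly too few or
    too many).
    """
    results: List[str] = []
    for i in range(0, len(segments), batch_size):
        batch = segments[i:i + batch_size]
        translated = translate_batch(batch)
        if len(translated) != len(batch):
            # Count mismatch: discard the batch result so no line shifts
            # position, then retry each segment on its own.
            translated = []
            for text in batch:
                single = translate_batch([text])
                # Assumed policy: fall back to the source text on failure.
                translated.append(single[0] if len(single) == 1 else text)
        results.extend(translated)
    return results
```

Validating per batch confines a mismatch to that batch. Without the check, a response with 17 of 20 translations shifts every later line by three positions, which is the offset bug listed under Bug Fixes & Iterations.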

Bug Fixes & Iterations

v1.1.0 - v1.1.4 (2025-11-24):

  • Fixed missing module exports
  • Fixed MIME type rejection (text/plain for ASS)
  • Fixed retry duplicate errors (upsert + cleanup)
  • Tried Gemini models (rate limits + 404 errors)

v1.2.0 - v1.2.3 (2025-11-25):

  • Switched to Qwen QwQ-32B
  • Fixed model ID (qwq-32b not qwq-32b-preview)
  • Critical: Fixed translation alignment offset bug
      • Problem: LLM returned 17/20 translations → 3-position offset
      • Solution: Stricter prompt + batch size 5 → individual fallback
  • Optimized: Batch size 5 → 10 (2x faster, still perfect alignment)


QA Test Results

| Test Category | Status | Details |
|---|---|---|
| Authentication | ✅ PASS | Routes protected, RLS policies working |
| Infrastructure | ✅ PASS | All pods healthy |
| Database | ✅ PASS | 16 RLS policies, schema isolation |
| Upload | ✅ PASS | Video storage pipeline verified |
| Job Creation | ✅ PASS | Database records & foreign keys |
| Worker Processing | ✅ PASS | Whisper transcription working |
| Dual-Language Translation | ✅ USER TESTED | Perfect alignment (EN→ES) |
| Translation Fallback | ✅ TESTED | Individual retry on mismatch |
| Multi-Format Download | ✅ TESTED | SRT + ASS with badges |
| Job Retry | ✅ TESTED | Upsert + cleanup working |
| Performance | ✅ TESTED | ~3.5min translation confirmed |
| Single-Language ASS | ⚠️ NOT TESTED | Code complete, needs testing |
| Style Profiles | ⚠️ PARTIAL | Only default tested |
| Error Handling | ✅ PASS | Graceful degradation working |

Overall: 12/14 categories passed or tested (86%) - Core dual-language feature fully validated


Known Limitations

  1. Style Profiles Not Fully Tested
       • All 4 profiles implemented but only default tested
       • Need to test: learning, enhanced, accessibility
       • Need to verify rendering in VLC/mpv

  2. Single-Language Mode Not Tested
       • Code exists but not user-validated
       • Should work (same code path as dual-language)

  3. Translation Languages
       • Only EN→ES tested
       • Other language pairs untested but should work

  4. No Translation Preview
       • Cannot review translations before finalizing
       • No manual correction interface

  5. No Cost Tracking
       • API costs not logged in database
       • No per-job cost visibility

  6. Whisper Model Fixed
       • Using "base" model only
       • Cannot select small/medium for better quality

Deployment Commands

Check Status

kubectl get pods -n subtitleai
kubectl logs -n subtitleai -l app=subtitleai-worker -f

Update Worker

cd /root/projects/subtitleai
docker build -t 10.89.97.201:30500/subtitleai-worker:vX.X.X -f worker/Dockerfile worker/
docker push 10.89.97.201:30500/subtitleai-worker:vX.X.X
kubectl set image deployment/subtitleai-worker subtitleai-worker=10.89.97.201:30500/subtitleai-worker:vX.X.X -n subtitleai
kubectl set image deployment/subtitleai-poller subtitleai-poller=10.89.97.201:30500/subtitleai-worker:vX.X.X -n subtitleai
kubectl rollout status deployment/subtitleai-worker -n subtitleai

Update Web

cd /root/projects/subtitleai
docker build -t 10.89.97.201:30500/subtitleai-web:vX.X.X .
docker push 10.89.97.201:30500/subtitleai-web:vX.X.X
kubectl set image deployment/subtitleai-web subtitleai-web=10.89.97.201:30500/subtitleai-web:vX.X.X -n subtitleai
kubectl rollout status deployment/subtitleai-web -n subtitleai

Environment Configuration

Worker Environment Variables

SUPABASE_URL=http://10.89.97.214:8000
SUPABASE_SERVICE_KEY=<service_role_key>
OPENROUTER_API_KEY=<sk-or-v1-...>

K8s Secret

# OpenRouter API key stored in secret
kubectl get secret subtitleai-secrets -n subtitleai

Monitoring & Observability

Key Metrics to Watch

  1. Translation Success Rate - Watch for "ERROR: Expected N translations, got M"
  2. Translation Time - Should be ~3.5min for typical video
  3. API Costs - Monitor OpenRouter usage
  4. Worker Memory - Whisper model requires ~200Mi
  5. Job Completion Rate - Track failed jobs
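
One way to watch the first metric is to scan worker log lines for the mismatch message. The sketch below assumes the message appears in the logs exactly as quoted above, with N and M as integers; the function name `summarize_mismatches` and the shape of its result are hypothetical.

```python
import re
from typing import Dict, Iterable

# Pattern taken from the message quoted above; N and M assumed to be integers.
MISMATCH = re.compile(r"ERROR: Expected (\d+) translations, got (\d+)")


def summarize_mismatches(lines: Iterable[str]) -> Dict[str, int]:
    """Count batch mismatches and the total number of missing translations."""
    mismatched_batches = 0
    missing_translations = 0
    for line in lines:
        match = MISMATCH.search(line)
        if match is None:
            continue
        expected, got = int(match.group(1)), int(match.group(2))
        mismatched_batches += 1
        # Only count shortfalls; an over-long response is still a mismatch
        # but adds nothing to the "missing" total.
        missing_translations += max(expected - got, 0)
    return {
        "mismatched_batches": mismatched_batches,
        "missing_translations": missing_translations,
    }
```

The function accepts any iterable of strings, so it can be fed the saved output of `kubectl logs -n subtitleai -l app=subtitleai-worker` or an open file handle. A rising mismatch count means more batches are falling back to individual translation, which also lengthens translation time.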

Logs

# Worker logs (transcription + translation)
kubectl logs -n subtitleai -l app=subtitleai-worker -f

# Poller logs (job pickup)
kubectl logs -n subtitleai -l app=subtitleai-poller -f

# Web logs (API requests)
kubectl logs -n subtitleai -l app=subtitleai-web -f

Future Enhancements

Immediate Testing Needed

  1. Test single-language ASS mode
  2. Test all 4 style profiles in video player
  3. Test other language pairs (EN→FR, EN→DE, etc.)

Phase 3 Potential Features

  1. VTT Format Support - WebVTT for web players
  2. Whisper Model Selection - Let users choose base/small/medium
  3. Translation Review Interface - Edit before finalizing
  4. Cost Tracking - Log API costs per job in database
  5. Batch Processing - Process multiple videos at once
  6. Bazarr Integration - Auto-process media library
  7. LLM Enhancement - Grammar cleanup, emotion detection
  8. Custom Style Editor - Create/save custom ASS styles

Documentation

  • Deployment Guide: /root/projects/subtitleai/DEPLOYMENT_COMPLETE.md
  • Implementation Summary: /root/projects/subtitleai/PHASE2_IMPLEMENTATION_SUMMARY.md
  • Git Repository: https://github.com/jakecelentano/subtitlesv2

Production Status

Current State: ✅ PRODUCTION READY

  • Core dual-language feature fully tested and working
  • Translation alignment perfect (user confirmed)
  • Performance optimized (~3.5min)
  • UI enhanced with format badges
  • All critical bugs fixed
  • Code committed and deployed

Confidence Level: HIGH for dual-language mode, MEDIUM for untested features (single-language, style profiles)


Maintained By: Claude Code
Last Updated: 2025-11-25