SubtitleAI: Subtitle Import & Style Visualizer Feature Plan¶
Overview¶
Add ability to import existing subtitle files (SRT, ASS, VTT) instead of requiring video transcription, and build a visual style comparison tool to preview "before → after" styling changes.
Key Insight: Subtitle file is the foundation - video transcription is just one path to get there.
Core Flow (Simplified)¶
Input: Video OR Subtitle File
├─ Video → Whisper → Subtitle segments
└─ Subtitle File → Parse → Subtitle segments
↓
Apply Styling & Translation
↓
Output Files
Use Cases¶
Primary Use Cases¶
- Restyle Existing Subtitles: User has SRT file, wants to apply custom ASS styling
- Translate Existing Subtitles: User has English SRT, wants dual-language output
- Preview Styles Before Generation: User uploads subtitle to test style profiles
- Convert Between Formats: User has ASS, wants clean SRT or vice versa
Future Use Cases (Phase 2+)¶
- Bazarr Integration: Import subtitles directly from Bazarr API
- Media Library Scanning: Auto-discover subtitles in /vault media folders
- Batch Processing: Apply same style to multiple subtitle files
Architecture Components¶
1. File Upload System¶
Current State: VideoUploader accepts video files only
Proposed State: FileUploader accepts video OR subtitle files
Changes Required:
- Modify VideoUploader.tsx → rename to FileUploader.tsx
- Accept both video and subtitle MIME types
- Validate file extensions: .mp4, .mkv, .srt, .ass, .vtt
- Show different UI hints based on file type
- Max size: 2GB for video, 10MB for subtitles
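The extension and size rules above can be sketched as a single check. This is only an illustration of the rule set, written in Python for brevity; the function name `classify_upload` and the constants are hypothetical, and treating "2GB" and "10MB" as binary units is an assumption.

```python
from pathlib import Path

# Extensions and limits come from the list above; all names are illustrative.
VIDEO_EXTENSIONS = {".mp4", ".mkv"}
SUBTITLE_EXTENSIONS = {".srt", ".ass", ".vtt"}
# Assumption: 2GB and 10MB are interpreted as binary units.
MAX_BYTES = {"video": 2 * 1024**3, "subtitle": 10 * 1024**2}


def classify_upload(filename: str, size_bytes: int) -> str:
    """Return 'video' or 'subtitle', or raise ValueError if the file is rejected."""
    ext = Path(filename).suffix.lower()
    if ext in VIDEO_EXTENSIONS:
        file_type = "video"
    elif ext in SUBTITLE_EXTENSIONS:
        file_type = "subtitle"
    else:
        raise ValueError(f"Unsupported file extension: {ext or '(none)'}")
    if size_bytes > MAX_BYTES[file_type]:
        raise ValueError(f"{file_type} file exceeds the size limit")
    return file_type
```

The returned type is also what would drive the per-type UI hint.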
UI Components:
<FileUploader
onFileSelect={setUploadedFile}
selectedFile={uploadedFile}
acceptedTypes={['video', 'subtitle']} // both
/>
2. Job Creation Flow¶
Current State: All jobs require video_id and run Whisper
Proposed State: Jobs can be video-based OR subtitle-based
Database Schema Changes: Add a nullable subtitle_path column to the videos table (see Database Schema Changes); no other schema changes are required
API Changes: /api/jobs (POST)
// Current: Requires videoPath
{ videoTitle, videoPath, sourceLanguage, config }
// Proposed: Accepts videoPath OR subtitlePath
{
title: string,
videoPath?: string, // For video transcription
subtitlePath?: string, // For subtitle import
sourceLanguage: string,
config: { ... }
}
Worker Changes: worker/tasks.py
- Detect job type: Check if subtitle_path exists in video record
- Branch logic:
- Video job → Download video → Whisper → Parse SRT
- Subtitle job → Download subtitle → Parse directly → Skip Whisper
3. Subtitle Parser Enhancement¶
Current State: Worker parses Whisper-generated SRT only
Proposed State: Worker parses SRT, ASS, and VTT formats
Implementation:
- Extract existing parse_srt() from formats/srt_parser.py
- Add parse_ass() for ASS format
- Add parse_vtt() for WebVTT format
- Create unified parse_subtitle_file() dispatcher
Format Support:
| Format | Extension | Parser Function | Status |
|--------|-----------|-----------------|--------|
| SubRip | .srt | parse_srt() | ✅ Exists |
| ASS/SSA | .ass, .ssa | parse_ass() | ⚠️ New |
| WebVTT | .vtt | parse_vtt() | ⚠️ New |
Output: All parsers return same structure:
[
{
'index': 1,
'start': '00:00:01,000',
'end': '00:00:03,500',
'text': 'Subtitle text here'
},
...
]
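The unified parse_subtitle_file() dispatcher listed above could be as small as an extension lookup. A minimal sketch, assuming the individual parsers are passed in as a dict; the `parsers` parameter and the `Segment` alias are illustrative and not part of the plan.

```python
from pathlib import Path
from typing import Callable, Dict, List

Segment = Dict[str, object]
Parser = Callable[[str], List[Segment]]


def parse_subtitle_file(filename: str, content: str, parsers: Dict[str, Parser]) -> List[Segment]:
    """Pick a parser by file extension and return the common segment list."""
    ext = Path(filename).suffix.lower().lstrip(".")
    if ext == "ssa":
        ext = "ass"  # the format table maps both .ass and .ssa to parse_ass()
    try:
        parser = parsers[ext]
    except KeyError:
        raise ValueError(f"Unsupported subtitle format: {ext or '(none)'}") from None
    return parser(content)
```

In the worker this might be called as `parse_subtitle_file(path, text, {"srt": parse_srt, "ass": parse_ass, "vtt": parse_vtt})`. Passing the registry in keeps the dispatcher testable before the new parsers exist.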
4. Subtitle Preview Component¶
Purpose: Display parsed subtitle segments before job submission
Component: SubtitlePreview.tsx
interface SubtitlePreviewProps {
segments: SubtitleSegment[]
totalSegments: number
previewLimit?: number // Default 10
}
Features:
- Show first 10 segments by default
- Display timing, text content
- Scrollable list with virtualization (for large files)
- Segment count summary: "Showing 10 of 847 segments"
UI Design:
┌─ Subtitle Preview ─────────────────────┐
│ 847 segments detected │
├────────────────────────────────────────┤
│ [1] 00:00:01.000 → 00:00:03.500 │
│ The world is fogly people doing... │
│ │
│ [2] 00:00:06.000 → 00:00:08.000 │
│ It's just nature of the beast. │
│ │
│ ... (showing 10 of 847) │
└────────────────────────────────────────┘
5. Style Comparison Visualizer¶
Purpose: Show "before → after" preview of style changes
Component: StyleComparison.tsx
interface StyleComparisonProps {
segments: SubtitleSegment[]
selectedStyleProfile: string | StyleConfig
previewText?: string // Optional override
displayMode: 'side-by-side' | 'toggle' | 'overlay'
}
Display Modes:
Mode 1: Side-by-Side¶
┌─ Original ────────┬─ Styled ──────────┐
│ Sample text here │ Sample text here │
│ (Default style) │ (Custom style) │
└───────────────────┴───────────────────┘
Mode 2: Toggle (Switcher)¶
┌─ Preview ─────────────────────────────┐
│ [○ Original] [● Styled] │
│ │
│ Sample text here │
│ (Shows selected version) │
└───────────────────────────────────────┘
Mode 3: Overlay (Fade transition)¶
┌─ Preview ─────────────────────────────┐
│ [────────●─────────] Opacity slider │
│ │
│ Sample text here │
│ (Crossfade between styles) │
└───────────────────────────────────────┘
Implementation:
- Reuse StylePreview.tsx component for rendering
- Extract actual subtitle text from uploaded file (not generic sample)
- Allow user to pick which segment to preview (dropdown or slider)
Advanced Features (Optional):
- Video background upload for realistic preview
- Playback simulation (text appears at correct timing)
- Export preview as MP4 clip
Implementation Phases¶
Phase 1: Subtitle Upload (MVP)¶
Goal: Accept subtitle files, parse, apply styling
Tasks:
1. [ ] Rename VideoUploader → FileUploader, accept both types
2. [ ] Update generate page UI to show "Upload Video OR Subtitle"
3. [ ] Modify /api/upload to handle subtitle files
4. [ ] Add subtitle_path column to videos table (nullable)
5. [ ] Update Supabase Storage bucket to allow subtitle MIME types ⚠️ CRITICAL
6. [ ] Update /api/jobs to accept subtitlePath
7. [ ] Modify worker tasks.py to detect subtitle vs video jobs
8. [ ] Implement parse_ass() and parse_vtt() parsers
9. [ ] Test end-to-end: Upload SRT → Apply style → Download ASS
Success Criteria:
- User can upload .srt file
- Job processes without Whisper transcription
- Styled ASS file downloads successfully
Estimated Effort: 4-6 hours
Phase 2: Subtitle Preview¶
Goal: Show parsed segments before job submission
Tasks:
1. [ ] Create SubtitlePreview.tsx component
2. [ ] Add client-side subtitle parser (for instant preview)
3. [ ] Display component on generate page after file upload
4. [ ] Show segment count, first 10 segments, timestamps
5. [ ] Add "Show All" expansion (virtualized list)
Success Criteria:
- User uploads subtitle, sees preview immediately
- Can scroll through all segments
- Preview updates when file changes
Estimated Effort: 2-3 hours
Phase 3: Style Comparison Visualizer¶
Goal: Preview "before → after" style changes
Tasks:
1. [ ] Create StyleComparison.tsx component
2. [ ] Implement side-by-side display mode
3. [ ] Implement toggle display mode
4. [ ] Extract real subtitle text for preview (not generic sample)
5. [ ] Add segment selector (which line to preview)
6. [ ] Integrate with StyleConfigurator (live updates)
Success Criteria:
- User sees original vs styled subtitles side-by-side
- Can toggle between views
- Preview updates when style changes
- Uses actual text from uploaded subtitle
Estimated Effort: 3-4 hours
Phase 4: Advanced Parsers¶
Goal: Support ASS and VTT input formats fully
Tasks:
1. [ ] Implement full ASS parser (handle style definitions, effects)
2. [ ] Implement VTT parser (handle cues, notes, styles)
3. [ ] Handle edge cases (malformed files, encoding issues)
4. [ ] Add format detection (auto-detect based on content)
5. [ ] Write unit tests for all parsers
Success Criteria:
- All 3 formats parse correctly
- Preserves timing, text, basic formatting
- Handles malformed files gracefully
Estimated Effort: 4-5 hours
Phase 5: UI Polish¶
Goal: Refine user experience
Tasks:
1. [ ] Add file type icons (video vs subtitle)
2. [ ] Show file metadata (duration estimate, segment count)
3. [ ] Add drag-and-drop zones for each file type
4. [ ] Implement overlay display mode for style comparison
5. [ ] Add export preview as image/video (optional)
Success Criteria:
- UI clearly distinguishes video vs subtitle uploads
- Professional, polished appearance
- Smooth transitions and interactions
Estimated Effort: 2-3 hours
Storage Configuration Changes¶
⚠️ CRITICAL: Update Supabase Storage Bucket MIME Types¶
Problem: The subtitleai-uploads bucket was originally configured to only accept video MIME types. When subtitle upload was added, the bucket rejected subtitle files with "mime type not supported" (HTTP 422).
Required Migration:
-- Migration: Update subtitleai-uploads bucket to allow subtitle MIME types
UPDATE storage.buckets
SET allowed_mime_types = ARRAY[
-- Video formats (existing)
'video/mp4',
'video/webm',
'video/ogg',
'video/quicktime',
'video/x-matroska',
'video/x-msvideo',
-- Subtitle formats (new)
'application/x-subrip', -- .srt
'text/x-ssa', -- .ass, .ssa
'text/vtt' -- .vtt
]::text[]
WHERE id = 'subtitleai-uploads';
Migration File: supabase/migrations/20251130203347_update_uploads_bucket_mime_types.sql
Application:
- Local Supabase: npx supabase db reset
- Production (k8s): kubectl exec -i -n supabase postgres-0 -- psql -U postgres < migration.sql (the -i flag is required so that the redirected file reaches psql on stdin)
Verification:
Why This Matters:
- Without this update, ALL subtitle uploads will fail with HTTP 422
- Error occurs at storage layer before reaching application code
- Not mentioned in original feature plan → documentation gap identified and fixed
Database Schema Changes¶
Option 1: Add column to videos table (Recommended)¶
-- Migration: Add subtitle_path to videos table
ALTER TABLE subtitleai.videos
ADD COLUMN subtitle_path TEXT;
COMMENT ON COLUMN subtitleai.videos.subtitle_path IS
'Path to uploaded subtitle file in subtitleai-uploads bucket (mutually exclusive with file_path for video)';
Validation Logic:
- Either file_path OR subtitle_path must be set (not both)
- Job type determined by which field is populated
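The either/or rule above is easy to enforce in one place before a job is processed. A minimal sketch, assuming the video record is available as a dict with the two column names from the migration; the helper name `resolve_job_type` is hypothetical.

```python
from typing import Mapping, Optional


def resolve_job_type(video: Mapping[str, Optional[str]]) -> str:
    """Return 'subtitle' or 'video' depending on which path column is populated."""
    has_video = bool(video.get("file_path"))
    has_subtitle = bool(video.get("subtitle_path"))
    if has_video == has_subtitle:
        # Covers both "neither set" and "both set".
        raise ValueError("Exactly one of file_path or subtitle_path must be set")
    return "subtitle" if has_subtitle else "video"
```

Raising on the ambiguous cases keeps a malformed record from silently falling through to the Whisper path.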
Option 2: Create separate subtitle_imports table (Over-engineered)¶
-- Not recommended - adds complexity without clear benefit
CREATE TABLE subtitleai.subtitle_imports (
id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
user_id UUID NOT NULL REFERENCES auth.users(id),
file_path TEXT NOT NULL,
format TEXT NOT NULL, -- 'srt', 'ass', 'vtt'
segment_count INT,
duration FLOAT,
created_at TIMESTAMPTZ DEFAULT now()
);
Decision: Use Option 1 (simpler, reuses existing infrastructure)
API Endpoints¶
Modified Endpoints¶
POST /api/upload¶
Current: Accepts video files only
New: Accepts video OR subtitle files
Changes:
- Detect file type from MIME type or extension
- Upload to subtitleai-uploads bucket (same bucket, different prefixes)
- Return { fullPath, fileType: 'video' | 'subtitle' }
Request:
Response:
{
"fullPath": "subtitleai-uploads/users/<uuid>/subtitles/<filename>.srt",
"fileType": "subtitle",
"size": 45678
}
POST /api/jobs¶
Current: Requires videoPath
New: Accepts videoPath OR subtitlePath
Changes:
- Validate: exactly one of videoPath or subtitlePath must be provided
- Store in videos table using appropriate column
- Set duration to null for subtitle imports (calculate later if needed)
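If a duration is wanted later for subtitle imports, it can be approximated from the parsed segments instead of the media file. A minimal sketch, assuming segments use the 'HH:MM:SS,mmm' timestamps from the common parser output; both helper names are hypothetical.

```python
from typing import Dict, List, Optional


def timestamp_to_seconds(ts: str) -> float:
    """Convert 'HH:MM:SS,mmm' (the common segment format) to seconds."""
    hms, millis = ts.split(",")
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds + int(millis) / 1000


def estimate_duration(segments: List[Dict]) -> Optional[float]:
    """Return the latest segment end time in seconds, or None for an empty file."""
    if not segments:
        return None
    return max(timestamp_to_seconds(seg["end"]) for seg in segments)
```

This is a lower bound on the real media length, since a video usually continues after its last subtitle.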
Request:
{
"title": "My Subtitles",
"subtitlePath": "subtitleai-uploads/...",
"sourceLanguage": "en",
"config": {
"targetLanguage": "es",
"styleProfile": "custom_abc123",
"dualLanguage": true
}
}
Response: Same as current (returns job ID)
New Endpoints (Optional)¶
POST /api/subtitles/parse (Client-side preview)¶
Purpose: Parse subtitle file server-side and return segments for instant preview in the client
Alternative: Use client-side JavaScript parser (no backend needed)
Recommendation: Client-side parser for MVP (faster, no upload required)
Worker Changes¶
File: worker/tasks.py¶
Current Flow:
def generate_subtitles(job_id):
    # 1. Download video from storage
    # 2. Run Whisper transcription
    # 3. Parse generated SRT
    # 4. Apply styling
    # 5. Upload output files
    ...  # outline only
New Flow:
def generate_subtitles(job_id):
    # Fetch job details
    job = fetch_job(job_id)
    video = job['videos']
    config = job['config']
    # Determine job type
    if video.get('subtitle_path'):
        # SUBTITLE IMPORT PATH
        # style_info carries the style-preservation decision (see D3)
        subtitle_path = video['subtitle_path']
        segments, style_info = process_subtitle_import(subtitle_path, config)
    else:
        # VIDEO TRANSCRIPTION PATH (existing)
        video_path = video['file_path']
        segments = process_video_transcription(video_path)
    # Common path: Apply styling & translation
    styled_segments = apply_styling_and_translation(segments, config)
    upload_outputs(job_id, styled_segments)
New Function:
from typing import Dict, List

def process_subtitle_import(subtitle_path: str, config: dict) -> tuple[List[Dict], dict]:
    """
    Download subtitle file, detect format, parse into segments
    Handles ASS style preservation based on user intent:
    - Translation only (no style profile) → Preserve original ASS styles
    - Style profile selected → Ignore original styles (will be replaced)
    Args:
        subtitle_path: Path in subtitleai-uploads bucket
        config: Job configuration dict (contains styleProfile, dualLanguage)
    Returns:
        tuple: (segments, style_info)
        - segments: List of subtitle segments (same format for all types)
        - style_info: {
            'preserve_original': bool,
            'original_styles': dict | None  # ASS style definitions if preserve=True
          }
    """
    # Download subtitle file
    subtitle_data = supabase.storage.from_('subtitleai-uploads').download(subtitle_path)
    # Detect format from extension
    ext = subtitle_path.split('.')[-1].lower()
    # 'utf-8-sig' also strips a leading byte-order mark if one is present
    content = subtitle_data.decode('utf-8-sig')
    # Determine if we should preserve original styles (ASS only)
    preserve_original = False
    original_styles = None
    if ext in ['ass', 'ssa']:
        # Check if translation-only mode (D3 logic)
        style_profile = config.get('styleProfile')
        dual_language = config.get('dualLanguage', False)
        if style_profile in [None, 'none'] and dual_language:
            # Translation only - preserve original styles
            preserve_original = True
            segments, original_styles = parse_ass_with_styles(content)
        else:
            # Style profile selected - ignore original styles
            segments = parse_ass(content)
    elif ext == 'srt':
        segments = parse_srt(content)
    elif ext == 'vtt':
        segments = parse_vtt(content)
    else:
        raise ValueError(f"Unsupported subtitle format: {ext}")
    style_info = {
        'preserve_original': preserve_original,
        'original_styles': original_styles
    }
    return segments, style_info
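Subtitle files found in the wild are not always clean UTF-8, which is one of the "encoding issues" Phase 4 calls out. A minimal sketch of a tolerant decode step that could sit between the download and the parser; the helper name `decode_subtitle_bytes` is hypothetical, and the Windows-1252 fallback is an assumption about which legacy encoding is most likely.

```python
import codecs


def decode_subtitle_bytes(data: bytes) -> str:
    """Decode subtitle bytes, tolerating BOMs and legacy single-byte encodings."""
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return data.decode("utf-16")  # the utf-16 codec consumes the BOM
    try:
        return data.decode("utf-8-sig")  # strips a UTF-8 BOM if present
    except UnicodeDecodeError:
        # Assumption: fall back to Windows-1252, common for older SRT files.
        return data.decode("cp1252", errors="replace")
```

With `errors="replace"` the fallback never raises, so a badly encoded file degrades to a few replacement characters instead of a failed job.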
File: worker/formats/subtitle_parsers.py (New)¶
Structure:
"""
Unified subtitle format parsers
"""
from typing import Dict, List

def parse_srt(content: str) -> List[Dict]:
    """Parse SRT format (existing function, moved here)"""
    pass
def parse_ass(content: str) -> List[Dict]:
    """
    Parse ASS/SSA format
    Extracts dialogue lines, ignores style definitions for now
    Returns same structure as parse_srt
    """
    pass
def parse_vtt(content: str) -> List[Dict]:
    """
    Parse WebVTT format
    Similar to SRT but with 'WEBVTT' header and different timestamp format
    """
    pass
def detect_format(content: str) -> str:
    """
    Auto-detect subtitle format from content
    Returns: 'srt', 'ass', or 'vtt'
    """
    # Ignore a leading byte-order mark or whitespace before the header check
    head = content.lstrip('\ufeff \t\r\n')
    if head.startswith('WEBVTT'):
        return 'vtt'
    elif '[Script Info]' in content or '[V4+ Styles]' in content:
        return 'ass'
    else:
        return 'srt'  # Default assumption
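To make the parse_vtt() stub more concrete, here is a minimal sketch that maps WebVTT cues onto the common segment structure. It is not the planned implementation: the name `parse_vtt_sketch` and its helpers are hypothetical, cue settings and inline tags are ignored, and NOTE/STYLE blocks are simply skipped because they contain no timing line.

```python
import re
from typing import Dict, List

# Matches 'MM:SS.mmm' or 'HH:MM:SS.mmm' on both sides of the cue arrow.
_CUE_TIMING = re.compile(
    r"^((?:\d+:)?\d{2}:\d{2}\.\d{3})\s+-->\s+((?:\d+:)?\d{2}:\d{2}\.\d{3})"
)


def _to_srt_timestamp(vtt_ts: str) -> str:
    """Normalize a WebVTT timestamp to the 'HH:MM:SS,mmm' form used by segments."""
    clock, millis = vtt_ts.split(".")
    parts = clock.split(":")
    if len(parts) == 2:  # hours are optional in WebVTT
        parts.insert(0, "00")
    hours, minutes, seconds = parts
    return f"{int(hours):02d}:{minutes}:{seconds},{millis}"


def parse_vtt_sketch(content: str) -> List[Dict]:
    """Parse WebVTT text into segments shaped like the parse_srt() output."""
    segments: List[Dict] = []
    # Blocks are separated by blank lines; normalize line endings first.
    normalized = content.replace("\r\n", "\n").replace("\r", "\n")
    for block in re.split(r"\n{2,}", normalized.strip()):
        lines = block.split("\n")
        for i, line in enumerate(lines):
            match = _CUE_TIMING.match(line.strip())
            if match:
                # Everything after the timing line is the cue text.
                segments.append({
                    "index": len(segments) + 1,
                    "start": _to_srt_timestamp(match.group(1)),
                    "end": _to_srt_timestamp(match.group(2)),
                    "text": "\n".join(lines[i + 1:]).strip(),
                })
                break
    return segments
```

Converting timestamps to the SRT form at parse time means the styling and translation steps never need to know which format the segments came from.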
Frontend Changes¶
File: src/components/FileUploader.tsx (Renamed)¶
Changes:
interface FileUploaderProps {
onFileSelect: (file: File | null) => void
selectedFile: File | null
acceptedTypes: ('video' | 'subtitle')[] // NEW: Configure accepted types
}
export default function FileUploader({ acceptedTypes, ... }) {
const handleFile = (file: File) => {
// Build dynamic validation based on acceptedTypes
const validExtensions = []
const validMimes = []
if (acceptedTypes.includes('video')) {
validExtensions.push('.mp4', '.mkv', '.webm', '.mov', '.avi')
validMimes.push('video/mp4', 'video/webm', ...)
}
if (acceptedTypes.includes('subtitle')) {
validExtensions.push('.srt', '.ass', '.ssa', '.vtt')
validMimes.push('text/plain', 'application/x-subrip', ...)
}
// Validate file
const isValid = validMimes.includes(file.type) ||
validExtensions.some(ext => file.name.toLowerCase().endsWith(ext))
if (!isValid) {
toast.error('Invalid file type')
return
}
onFileSelect(file)
}
return (
<div className="...">
{/* Show dynamic hint based on acceptedTypes */}
<p>
{acceptedTypes.includes('video') && acceptedTypes.includes('subtitle')
? 'Upload Video OR Subtitle File'
: acceptedTypes.includes('video')
? 'Upload Video File'
: 'Upload Subtitle File'
}
</p>
</div>
)
}
File: src/app/generate/page.tsx¶
Changes:
export default function GeneratePage() {
const [uploadedFile, setUploadedFile] = useState<File | null>(null)
const [fileType, setFileType] = useState<'video' | 'subtitle' | null>(null)
const [parsedSegments, setParsedSegments] = useState<SubtitleSegment[] | null>(null)
const handleFileSelect = async (file: File | null) => {
setUploadedFile(file)
if (!file) {
setFileType(null)
setParsedSegments(null)
return
}
// Detect file type
const ext = file.name.split('.').pop()?.toLowerCase()
if (['srt', 'ass', 'ssa', 'vtt'].includes(ext!)) { // 'ssa' included because FileUploader accepts it
setFileType('subtitle')
// Parse subtitle file client-side for preview
const text = await file.text()
const segments = parseSubtitleClientSide(text, ext!)
setParsedSegments(segments)
} else {
setFileType('video')
setParsedSegments(null)
}
}
return (
<div>
{/* File Upload */}
<FileUploader
onFileSelect={handleFileSelect}
selectedFile={uploadedFile}
acceptedTypes={['video', 'subtitle']}
/>
{/* Subtitle Preview (only for subtitle uploads) */}
{fileType === 'subtitle' && parsedSegments && (
<SubtitlePreview
segments={parsedSegments}
totalSegments={parsedSegments.length}
/>
)}
{/* Style Comparison Visualizer */}
{parsedSegments && (
<StyleComparison
segments={parsedSegments}
selectedStyleProfile={config.styleProfile}
displayMode="side-by-side"
/>
)}
{/* ... rest of generate page */}
</div>
)
}
Testing Plan¶
Unit Tests¶
Parser Tests¶
# worker/tests/test_subtitle_parsers.py
# Import path assumes pytest is run from the worker/ directory
from formats.subtitle_parsers import detect_format, parse_ass, parse_srt

def test_parse_srt_basic():
    """Test basic SRT parsing"""
    srt_content = """
1
00:00:01,000 --> 00:00:03,500
First subtitle

2
00:00:06,000 --> 00:00:08,000
Second subtitle
"""
    segments = parse_srt(srt_content)
    assert len(segments) == 2
    assert segments[0]['text'] == 'First subtitle'
    assert segments[0]['start'] == '00:00:01,000'

def test_parse_ass_basic():
    """Test ASS dialogue parsing"""
    ass_content = """
[Events]
Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
Dialogue: 0,0:00:01.00,0:00:03.50,Default,,0,0,0,,First subtitle
"""
    segments = parse_ass(ass_content)
    assert len(segments) == 1
    assert segments[0]['text'] == 'First subtitle'

def test_detect_format():
    """Test format auto-detection"""
    assert detect_format('WEBVTT\n\n...') == 'vtt'
    assert detect_format('[Script Info]\n...') == 'ass'
    assert detect_format('1\n00:00:01...') == 'srt'
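For reference, a parser that would satisfy the ASS test above can stay quite small if style definitions are ignored. This is a minimal sketch, not the planned implementation: `parse_ass_sketch` and its helper are hypothetical names, override tags such as `{\i1}` are stripped, `\N` is treated as a line break, and malformed Dialogue lines are skipped.

```python
import re
from typing import Dict, List

_OVERRIDE_TAGS = re.compile(r"\{[^}]*\}")  # e.g. {\i1} or {\pos(10,20)}


def _ass_to_srt_timestamp(ass_ts: str) -> str:
    """Convert 'H:MM:SS.cc' (centiseconds) to 'HH:MM:SS,mmm'."""
    clock, centis = ass_ts.strip().split(".")
    hours, minutes, seconds = clock.split(":")
    millis = int(centis.ljust(3, "0")[:3])
    return f"{int(hours):02d}:{minutes}:{seconds},{millis:03d}"


def parse_ass_sketch(content: str) -> List[Dict]:
    """Extract Dialogue lines from the [Events] section as common segments."""
    segments: List[Dict] = []
    fields: List[str] = []
    in_events = False
    for raw in content.splitlines():
        line = raw.strip()
        if line.startswith("["):
            in_events = line.lower() == "[events]"
        elif in_events and line.startswith("Format:"):
            fields = [f.strip() for f in line[len("Format:"):].split(",")]
        elif in_events and line.startswith("Dialogue:") and fields:
            # Text is the last field and may itself contain commas.
            values = line[len("Dialogue:"):].split(",", len(fields) - 1)
            if len(values) != len(fields):
                continue  # malformed line: skip it
            row = dict(zip(fields, (v.strip() for v in values)))
            text = _OVERRIDE_TAGS.sub("", row["Text"]).replace("\\N", "\n")
            segments.append({
                "index": len(segments) + 1,
                "start": _ass_to_srt_timestamp(row["Start"]),
                "end": _ass_to_srt_timestamp(row["End"]),
                "text": text,
            })
    return segments
```

Reading the column order from the Format line, instead of hard-coding positions, is what lets the same loop handle files whose Events section lists fields in a different order.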
Integration Tests¶
Test 1: Upload Subtitle → Apply Style → Download¶
// Test flow
1. Upload test.srt file
2. Select custom style profile
3. Generate subtitles
4. Download ASS file
5. Verify ASS contains custom style definition
Test 2: Upload Video → Upload Subtitle (Either/Or)¶
// Test validation
1. Upload video file
2. Attempt to also upload subtitle → Should show error
3. Clear video
4. Upload subtitle → Should succeed
Test 3: Subtitle Preview Rendering¶
// Test UI
1. Upload subtitle with 100 segments
2. Verify preview shows first 10
3. Click "Show All" → Verify all 100 render
4. Check timestamps format correctly
End-to-End Tests¶
Test Suite: subtitleai-import.spec.ts
import { test, expect } from '@playwright/test'
import * as fs from 'fs'

test('complete subtitle import flow', async ({ page }) => {
  // Navigate to generate page
  await page.goto('/generate')
  // Upload subtitle file (the fixture is assumed to contain 847 segments)
  await page.setInputFiles('input[type=file]', 'test-fixtures/sample.srt')
  // Verify preview appears
  await expect(page.locator('[data-testid=subtitle-preview]')).toBeVisible()
  await expect(page.locator('[data-testid=segment-count]')).toContainText('847 segments')
  // Select style profile
  await page.click('[data-testid=style-profile-custom]')
  // Verify style comparison shows
  await expect(page.locator('[data-testid=style-comparison]')).toBeVisible()
  // Generate subtitles
  await page.click('button:has-text("Generate Subtitles")')
  // Wait for job completion
  await page.waitForURL('/jobs')
  await page.waitForSelector('[data-testid=job-status-complete]', { timeout: 60000 })
  // Download and verify: start listening before triggering the download
  const downloadPromise = page.waitForEvent('download')
  await page.click('[data-testid=download-ass]') // hypothetical selector for the download button
  const download = await downloadPromise
  const path = await download.path()
  const content = fs.readFileSync(path, 'utf-8')
  // Verify custom style applied
  expect(content).toContain('Style: My Custom Style')
})
Migration Strategy¶
Backwards Compatibility¶
- Existing jobs (video-only) continue to work without changes
- videos.subtitle_path is nullable, so existing rows need no data migration
- Worker detects job type automatically
Rollout Plan¶
- Phase 1: Deploy backend changes (API, worker) - No UI changes yet
- Phase 2: Deploy UI changes (file uploader, preview)
- Phase 3: Deploy visualizer (style comparison)
Rollback Plan¶
- If issues arise, revert UI to video-only mode
- Backend remains backwards compatible
- No data loss risk (new column is optional)
Performance Considerations¶
Client-Side Parsing¶
Concern: Large subtitle files (10,000+ segments) could freeze browser
Solution:
- Parse in Web Worker (non-blocking)
- Show progress indicator for files >1MB
- Limit preview to first 10 segments by default (the previewLimit default)
- Use virtualized list for "Show All" mode
Implementation:
// utils/subtitle-parser-worker.ts
const worker = new Worker('/workers/subtitle-parser.js')
worker.postMessage({ content: fileContent, format: 'srt' })
worker.onmessage = (e) => {
setParsedSegments(e.data.segments)
}
Worker Processing¶
Concern: Subtitle parsing adds latency to job processing
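Analysis:
- SRT parsing: ~50ms for 1000 segments (negligible)
- ASS parsing: ~100ms for 1000 segments (complex format)
- Compared to Whisper: 2-5 minutes for typical video
Conclusion: The parse step (about 0.05-0.1 s per 1000 segments) is negligible next to Whisper transcription (120-300 s for a typical video), so subtitle-import jobs skip nearly all of the transcription latency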
Documentation Updates¶
Files to Update¶
- /root/tower-fleet/docs/applications/subtitleai.md - Add feature description
- /root/PROJECTS.md - Update SubtitleAI status
- User-facing docs (if any) - Add "Import Subtitle" tutorial
New Documentation¶
- Create /root/tower-fleet/docs/reference/subtitle-formats.md
  - Supported formats
  - Format-specific quirks
  - Conversion best practices
Future Enhancements (Out of Scope)¶
Bazarr Integration¶
- Fetch subtitles directly from Bazarr API
- Show movie/show library in SubtitleAI UI
- One-click import from Bazarr → Apply style → Re-upload
Media Library Scanning¶
- Mount /vault in worker pods
- Scan for .srt/.ass files alongside videos
- Build searchable subtitle inventory
- Batch restyle operations
Advanced Visualizer Features¶
- Upload video background for preview
- Playback simulation (animated timing)
- Export preview as MP4 clip
- A/B comparison (3+ styles side-by-side)
Subtitle Editing¶
- In-browser subtitle editor
- Adjust timing, fix typos
- Save edited version before applying styles
Design Decisions (Finalized)¶
D1: Video + Subtitle Upload Strategy¶
Decision: Mutually exclusive (Either/Or)
Implementation:
- User uploads EITHER video OR subtitle file, not both in same job
- UI shows single file upload area that accepts both types
- Clearer user intent, simpler UX
- Future enhancement: Add "both" option if use case emerges (e.g., compare Whisper transcription vs existing subtitle)
Rationale: Starting simple reduces complexity. Most users have one or the other, not both.
D2: Timing Preservation During Translation¶
Decision: Always preserve original timing
Implementation:
- When translating subtitles, keep original start/end timestamps
- No regeneration based on translated text length
- Maintains perfect sync with video
Rationale: Translation text length rarely differs enough to affect readability. Preserving timing prevents drift and maintains audio sync.
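In code, D2 amounts to replacing only the text field of each segment. A minimal sketch, assuming the translation backend can be wrapped as a plain string-to-string callable; `translate_segments` and the `translate` parameter are hypothetical names.

```python
from typing import Callable, Dict, List


def translate_segments(segments: List[Dict], translate: Callable[[str], str]) -> List[Dict]:
    """Return new segments with translated text and untouched index/start/end."""
    return [{**seg, "text": translate(seg["text"])} for seg in segments]
```

Because each output segment is a copy of its input with one field replaced, the originals stay available for dual-language output.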
D3: Style Handling for ASS Import¶
Decision: Context-aware (smart logic)
Scenario A - Translation Only (no style profile selected):
- Preserve original ASS style definitions from uploaded file
- Only translate dialogue text
- Output maintains original fonts, colors, positioning
Scenario B - Style Profile Selected:
- Strip original ASS styles completely
- Apply selected style profile (preset or custom)
- User explicitly choosing new style implies replacement intent
Implementation:
# worker/tasks.py
if uploaded_format == 'ass':
    if config.get('styleProfile') in [None, 'none'] and config.get('dualLanguage'):
        # Translation only - preserve original styles
        preserve_original_styles = True
    else:
        # Apply selected profile - replace styles
        preserve_original_styles = False
Rationale: Smart defaults that match user intent. Translation-only users want to keep their existing styling. Users selecting a profile want that profile applied.
Success Metrics¶
Phase 1 (Subtitle Upload)¶
- [ ] 50%+ of jobs use subtitle import (vs video transcription)
- [ ] Average job completion time <30 seconds (vs 2-5 minutes for video)
- [ ] Zero failed subtitle parse jobs
Phase 2 (Preview)¶
- [ ] 80%+ of subtitle uploads show preview successfully
- [ ] <200ms preview render time for files with <1000 segments
Phase 3 (Visualizer)¶
- [ ] Users spend 30+ seconds interacting with style comparison
- [ ] 20%+ of users change style selection after viewing comparison
Checklist¶
Pre-Development¶
- [x] Review and finalize feature plan
- [ ] Get user approval on UI mockups
- [ ] Decide on open questions (Q1-Q3)
- [ ] Set up feature branch: feature/subtitle-import
Phase 1: Subtitle Upload¶
- [ ] Rename VideoUploader → FileUploader
- [ ] Add subtitle_path column to videos table
- [ ] Modify /api/upload to handle subtitle files
- [ ] Update /api/jobs to accept subtitlePath
- [ ] Implement parse_ass() function
- [ ] Implement parse_vtt() function
- [ ] Update worker to detect and handle subtitle jobs
- [ ] Write unit tests for parsers
- [ ] Test end-to-end with sample SRT file
Phase 2: Subtitle Preview¶
- [ ] Create SubtitlePreview.tsx component
- [ ] Implement client-side SRT parser
- [ ] Add preview to generate page
- [ ] Add "Show All" expansion with virtualization
- [ ] Test with large subtitle files (5000+ segments)
Phase 3: Style Visualizer¶
- [ ] Create StyleComparison.tsx component
- [ ] Implement side-by-side mode
- [ ] Implement toggle mode
- [ ] Extract real subtitle text for preview
- [ ] Add segment selector
- [ ] Integrate with StyleConfigurator
Testing & Deployment¶
- [ ] Run full test suite
- [ ] Manual QA testing
- [ ] Update documentation
- [ ] Create deployment PR
- [ ] Deploy to production
- [ ] Monitor for errors (first 24h)
Timeline Estimate¶
Total Implementation Time: 11-16 hours (across 3 phases, plus testing and polish)
Breakdown:
- Phase 1 (Subtitle Upload): 4-6 hours
- Phase 2 (Preview): 2-3 hours
- Phase 3 (Visualizer): 3-4 hours
- Testing & Polish: 2-3 hours
Calendar Time: 2-3 days (assuming focused development)
Notes¶
- This feature significantly expands SubtitleAI's utility beyond just transcription
- Opens door for Bazarr integration and media library workflows
- Style visualizer could become a standout feature (unique value prop)
- Consider making visualizer accessible without file upload (style sandbox mode)