SubtitleAI: Subtitle Import & Style Visualizer Feature Plan

Overview

Add ability to import existing subtitle files (SRT, ASS, VTT) instead of requiring video transcription, and build a visual style comparison tool to preview "before → after" styling changes.

Key Insight: Subtitle file is the foundation - video transcription is just one path to get there.


Core Flow (Simplified)

Input: Video OR Subtitle File
├─ Video → Whisper → Subtitle segments
└─ Subtitle File → Parse → Subtitle segments
                     ↓
         Apply Styling & Translation
                     ↓
               Output Files

Use Cases

Primary Use Cases

  1. Restyle Existing Subtitles: User has SRT file, wants to apply custom ASS styling
  2. Translate Existing Subtitles: User has English SRT, wants dual-language output
  3. Preview Styles Before Generation: User uploads subtitle to test style profiles
  4. Convert Between Formats: User has ASS, wants clean SRT or vice versa

Future Use Cases (Phase 2+)

  1. Bazarr Integration: Import subtitles directly from Bazarr API
  2. Media Library Scanning: Auto-discover subtitles in /vault media folders
  3. Batch Processing: Apply same style to multiple subtitle files

Architecture Components

1. File Upload System

Current State: VideoUploader accepts video files only
Proposed State: FileUploader accepts video OR subtitle files

Changes Required:

  • Modify VideoUploader.tsx → rename to FileUploader.tsx
  • Accept both video and subtitle MIME types
  • Validate file extensions: .mp4, .mkv, .srt, .ass, .vtt
  • Show different UI hints based on file type
  • Max size: 2GB for video, 10MB for subtitles

UI Components:

<FileUploader
  onFileSelect={setUploadedFile}
  selectedFile={uploadedFile}
  acceptedTypes={['video', 'subtitle']} // both
/>

2. Job Creation Flow

Current State: All jobs require video_id and run Whisper
Proposed State: Jobs can be video-based OR subtitle-based

Database Schema Changes: One nullable column, videos.subtitle_path (see Database Schema Changes below); no new tables

API Changes: /api/jobs (POST)

// Current: Requires videoPath
{ videoTitle, videoPath, sourceLanguage, config }

// Proposed: Accepts videoPath OR subtitlePath
{
  title: string,
  videoPath?: string,      // For video transcription
  subtitlePath?: string,   // For subtitle import
  sourceLanguage: string,
  config: { ... }
}

Worker Changes: worker/tasks.py

  • Detect job type: check whether subtitle_path is set on the video record
  • Branch logic:
      • Video job → Download video → Whisper → Parse SRT
      • Subtitle job → Download subtitle → Parse directly → Skip Whisper

3. Subtitle Parser Enhancement

Current State: Worker parses Whisper-generated SRT only
Proposed State: Worker parses SRT, ASS, and VTT formats

Implementation:

  • Extract existing parse_srt() from formats/srt_parser.py
  • Add parse_ass() for ASS format
  • Add parse_vtt() for WebVTT format
  • Create unified parse_subtitle_file() dispatcher

Format Support:

| Format  | Extension  | Parser Function | Status    |
|---------|------------|-----------------|-----------|
| SubRip  | .srt       | parse_srt()     | ✅ Exists |
| ASS/SSA | .ass, .ssa | parse_ass()     | ⚠️ New    |
| WebVTT  | .vtt       | parse_vtt()     | ⚠️ New    |
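
The unified dispatcher is only named in the plan, so the following is a minimal sketch of one way it could look, not the final implementation. The parser table is passed in as an argument, and `_fake_parser` is a hypothetical stand-in so the example runs without the real parsers.

```python
from pathlib import PurePosixPath
from typing import Callable, Dict, List

Segment = Dict[str, object]
Parser = Callable[[str], List[Segment]]


def parse_subtitle_file(path: str, content: str, parsers: Dict[str, Parser]) -> List[Segment]:
    """Pick a parser by file extension and return its segments."""
    ext = PurePosixPath(path).suffix.lower().lstrip(".")
    try:
        parser = parsers[ext]
    except KeyError:
        raise ValueError(f"Unsupported subtitle format: {ext or '(none)'}") from None
    return parser(content)


# Hypothetical stand-in parser. In the worker the table would map
# 'srt' -> parse_srt, 'ass' and 'ssa' -> parse_ass, 'vtt' -> parse_vtt.
def _fake_parser(content: str) -> List[Segment]:
    return [{"index": 1, "start": "00:00:01,000", "end": "00:00:03,500", "text": content}]


PARSERS: Dict[str, Parser] = {
    "srt": _fake_parser,
    "ass": _fake_parser,
    "ssa": _fake_parser,
    "vtt": _fake_parser,
}
```

Keeping the extension-to-parser mapping in one table means a new format only needs a new entry, and unsupported extensions fail in a single place.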

Output: All parsers return same structure:

[
  {
    'index': 1,
    'start': '00:00:01,000',
    'end': '00:00:03,500',
    'text': 'Subtitle text here'
  },
  ...
]

4. Subtitle Preview Component

Purpose: Display parsed subtitle segments before job submission

Component: SubtitlePreview.tsx

interface SubtitlePreviewProps {
  segments: SubtitleSegment[]
  totalSegments: number
  previewLimit?: number  // Default 10
}

Features:

  • Show first 10 segments by default
  • Display timing, text content
  • Scrollable list with virtualization (for large files)
  • Segment count summary: "Showing 10 of 847 segments"

UI Design:

┌─ Subtitle Preview ─────────────────────┐
│ 847 segments detected                  │
├────────────────────────────────────────┤
│ [1] 00:00:01.000 → 00:00:03.500        │
│     The world is fogly people doing... │
│                                        │
│ [2] 00:00:06.000 → 00:00:08.000        │
│     It's just nature of the beast.     │
│                                        │
│ ... (showing 10 of 847)                │
└────────────────────────────────────────┘

5. Style Comparison Visualizer

Purpose: Show "before → after" preview of style changes

Component: StyleComparison.tsx

interface StyleComparisonProps {
  segments: SubtitleSegment[]
  selectedStyleProfile: string | StyleConfig
  previewText?: string  // Optional override
  displayMode: 'side-by-side' | 'toggle' | 'overlay'
}

Display Modes:

Mode 1: Side-by-Side

┌─ Original ────────┬─ Styled ──────────┐
│ Sample text here  │ Sample text here  │
│ (Default style)   │ (Custom style)    │
└───────────────────┴───────────────────┘

Mode 2: Toggle (Switcher)

┌─ Preview ─────────────────────────────┐
│ [○ Original] [● Styled]               │
│                                       │
│     Sample text here                  │
│     (Shows selected version)          │
└───────────────────────────────────────┘

Mode 3: Overlay (Fade transition)

┌─ Preview ─────────────────────────────┐
│ [────────●─────────] Opacity slider   │
│                                       │
│     Sample text here                  │
│     (Crossfade between styles)        │
└───────────────────────────────────────┘

Implementation:

  • Reuse StylePreview.tsx component for rendering
  • Extract actual subtitle text from uploaded file (not generic sample)
  • Allow user to pick which segment to preview (dropdown or slider)

Advanced Features (Optional):

  • Video background upload for realistic preview
  • Playback simulation (text appears at correct timing)
  • Export preview as MP4 clip


Implementation Phases

Phase 1: Subtitle Upload (MVP)

Goal: Accept subtitle files, parse, apply styling

Tasks:

  1. [ ] Rename VideoUploader → FileUploader, accept both types
  2. [ ] Update generate page UI to show "Upload Video OR Subtitle"
  3. [ ] Modify /api/upload to handle subtitle files
  4. [ ] Add subtitle_path column to videos table (nullable)
  5. [ ] Update Supabase Storage bucket to allow subtitle MIME types ⚠️ CRITICAL
  6. [ ] Update /api/jobs to accept subtitlePath
  7. [ ] Modify worker tasks.py to detect subtitle vs video jobs
  8. [ ] Implement parse_ass() and parse_vtt() parsers
  9. [ ] Test end-to-end: Upload SRT → Apply style → Download ASS

Success Criteria:

  • User can upload .srt file
  • Job processes without Whisper transcription
  • Styled ASS file downloads successfully

Estimated Effort: 4-6 hours


Phase 2: Subtitle Preview

Goal: Show parsed segments before job submission

Tasks:

  1. [ ] Create SubtitlePreview.tsx component
  2. [ ] Add client-side subtitle parser (for instant preview)
  3. [ ] Display component on generate page after file upload
  4. [ ] Show segment count, first 10 segments, timestamps
  5. [ ] Add "Show All" expansion (virtualized list)

Success Criteria:

  • User uploads subtitle, sees preview immediately
  • Can scroll through all segments
  • Preview updates when file changes

Estimated Effort: 2-3 hours


Phase 3: Style Comparison Visualizer

Goal: Preview "before → after" style changes

Tasks:

  1. [ ] Create StyleComparison.tsx component
  2. [ ] Implement side-by-side display mode
  3. [ ] Implement toggle display mode
  4. [ ] Extract real subtitle text for preview (not generic sample)
  5. [ ] Add segment selector (which line to preview)
  6. [ ] Integrate with StyleConfigurator (live updates)

Success Criteria:

  • User sees original vs styled subtitles side-by-side
  • Can toggle between views
  • Preview updates when style changes
  • Uses actual text from uploaded subtitle

Estimated Effort: 3-4 hours


Phase 4: Advanced Parsers

Goal: Support ASS and VTT input formats fully

Tasks:

  1. [ ] Implement full ASS parser (handle style definitions, effects)
  2. [ ] Implement VTT parser (handle cues, notes, styles)
  3. [ ] Handle edge cases (malformed files, encoding issues)
  4. [ ] Add format detection (auto-detect based on content)
  5. [ ] Write unit tests for all parsers

Success Criteria:

  • All 3 formats parse correctly
  • Preserves timing, text, basic formatting
  • Handles malformed files gracefully

Estimated Effort: 4-5 hours
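
As a rough illustration of the VTT task, here is a minimal sketch of a cue parser that skips the header, NOTE, STYLE, and REGION blocks and emits the same segment structure as the SRT parser. The name `parse_vtt_sketch` is hypothetical; cue settings and inline tags are ignored rather than interpreted, and malformed blocks are silently skipped.

```python
import re
from typing import Dict, List

# WebVTT timestamps are "hh:mm:ss.ttt" or the short form "mm:ss.ttt".
_CUE_TIMING = re.compile(
    r"(?P<start>(?:\d+:)?\d{2}:\d{2}\.\d{3})\s+-->\s+(?P<end>(?:\d+:)?\d{2}:\d{2}\.\d{3})"
)
_NON_CUE_BLOCK = re.compile(r"^(WEBVTT|NOTE|STYLE|REGION)(\s|$)")


def _vtt_time_to_srt(value: str) -> str:
    """Convert a WebVTT timestamp to the SRT form 'HH:MM:SS,mmm'."""
    clock, millis = value.split(".")
    parts = clock.split(":")
    if len(parts) == 2:  # short form without hours
        parts.insert(0, "00")
    hours, minutes, seconds = (int(p) for p in parts)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis}"


def parse_vtt_sketch(content: str) -> List[Dict]:
    segments: List[Dict] = []
    text = content.lstrip("\ufeff").replace("\r\n", "\n").replace("\r", "\n")
    for block in re.split(r"\n{2,}", text):
        lines = [ln for ln in block.split("\n") if ln.strip()]
        if not lines or _NON_CUE_BLOCK.match(lines[0].strip()):
            continue
        # An optional cue identifier line may precede the timing line.
        timing_idx = 0 if "-->" in lines[0] else 1
        if timing_idx >= len(lines):
            continue
        match = _CUE_TIMING.match(lines[timing_idx].strip())
        if not match:
            continue  # malformed block: skip instead of failing the whole file
        segments.append({
            "index": len(segments) + 1,
            "start": _vtt_time_to_srt(match.group("start")),
            "end": _vtt_time_to_srt(match.group("end")),
            "text": "\n".join(lines[timing_idx + 1:]).strip(),
        })
    return segments
```

Skipping malformed blocks is one reading of "handles malformed files gracefully"; collecting the skipped blocks and reporting them to the user would be an equally valid choice.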


Phase 5: UI Polish

Goal: Refine user experience

Tasks:

  1. [ ] Add file type icons (video vs subtitle)
  2. [ ] Show file metadata (duration estimate, segment count)
  3. [ ] Add drag-and-drop zones for each file type
  4. [ ] Implement overlay display mode for style comparison
  5. [ ] Add export preview as image/video (optional)

Success Criteria:

  • UI clearly distinguishes video vs subtitle uploads
  • Professional, polished appearance
  • Smooth transitions and interactions

Estimated Effort: 2-3 hours


Storage Configuration Changes

⚠️ CRITICAL: Update Supabase Storage Bucket MIME Types

Problem: The subtitleai-uploads bucket was originally configured to only accept video MIME types. When subtitle upload was added, the bucket rejected subtitle files with "mime type not supported" (HTTP 422).

Required Migration:

-- Migration: Update subtitleai-uploads bucket to allow subtitle MIME types
UPDATE storage.buckets
SET allowed_mime_types = ARRAY[
  -- Video formats (existing)
  'video/mp4',
  'video/webm',
  'video/ogg',
  'video/quicktime',
  'video/x-matroska',
  'video/x-msvideo',
  -- Subtitle formats (new)
  'application/x-subrip',  -- .srt
  'text/x-ssa',            -- .ass, .ssa
  'text/vtt'               -- .vtt
]::text[]
WHERE id = 'subtitleai-uploads';

Migration File: supabase/migrations/20251130203347_update_uploads_bucket_mime_types.sql

Application:

  • Local Supabase: npx supabase db reset
  • Production (k8s): kubectl exec -i -n supabase postgres-0 -- psql -U postgres < migration.sql (the -i flag is needed so the redirected file reaches psql on stdin)

Verification:

SELECT id, allowed_mime_types
FROM storage.buckets
WHERE id = 'subtitleai-uploads';

Why This Matters:

  • Without this update, ALL subtitle uploads will fail with HTTP 422
  • Error occurs at storage layer before reaching application code
  • Not mentioned in original feature plan → documentation gap identified and fixed


Database Schema Changes

Option 1: Add subtitle_path column to existing videos table (Recommended)

-- Migration: Add subtitle_path to videos table
ALTER TABLE subtitleai.videos
ADD COLUMN subtitle_path TEXT;

COMMENT ON COLUMN subtitleai.videos.subtitle_path IS
'Path to uploaded subtitle file in subtitleai-uploads bucket (mutually exclusive with file_path for video)';

Validation Logic:

  • Either file_path OR subtitle_path must be set (not both)
  • Job type determined by which field is populated
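
The either/or rule above can be expressed as a small helper. This is a sketch under the assumption that the video record arrives as a dict-like row; `resolve_job_type` is a hypothetical name, not an existing function in the worker.

```python
from typing import Literal, Mapping, Optional

JobType = Literal["video", "subtitle"]


def resolve_job_type(video: Mapping[str, Optional[str]]) -> JobType:
    """Return the job type, enforcing that exactly one path is populated."""
    has_video = bool(video.get("file_path"))
    has_subtitle = bool(video.get("subtitle_path"))
    if has_video == has_subtitle:
        # Both set or both missing: the record violates the either/or rule.
        raise ValueError("Exactly one of file_path or subtitle_path must be set")
    return "subtitle" if has_subtitle else "video"
```

The same check could also live in the database as a constraint; doing it in code first keeps the migration limited to the single nullable column.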

Option 2: Create separate subtitle_imports table (Over-engineered)

-- Not recommended - adds complexity without clear benefit
CREATE TABLE subtitleai.subtitle_imports (
  id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
  user_id UUID NOT NULL REFERENCES auth.users(id),
  file_path TEXT NOT NULL,
  format TEXT NOT NULL, -- 'srt', 'ass', 'vtt'
  segment_count INT,
  duration FLOAT,
  created_at TIMESTAMPTZ DEFAULT now()
);

Decision: Use Option 1 (simpler, reuses existing infrastructure)


API Endpoints

Modified Endpoints

POST /api/upload

Current: Accepts video files only
New: Accepts video OR subtitle files

Changes:

  • Detect file type from MIME type or extension
  • Upload to subtitleai-uploads bucket (same bucket, different prefixes)
  • Return { fullPath, fileType: 'video' | 'subtitle' }

Request:

FormData {
  file: File  // .mp4, .mkv, .srt, .ass, .vtt
}

Response:

{
  "fullPath": "subtitleai-uploads/users/<uuid>/subtitles/<filename>.srt",
  "fileType": "subtitle",
  "size": 45678
}
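
The type detection and size limits for this endpoint can be sketched independently of the web framework. The example below is written in Python to match the worker code and is only an illustration of the logic; `classify_upload` is a hypothetical name, the extension lists come from the FileUploader section, and treating "2GB" and "10MB" as binary units is an assumption.

```python
from pathlib import PurePosixPath

VIDEO_EXTENSIONS = {".mp4", ".mkv", ".webm", ".mov", ".avi"}
SUBTITLE_EXTENSIONS = {".srt", ".ass", ".ssa", ".vtt"}
# Assumption: limits interpreted as GiB / MiB.
MAX_BYTES = {"video": 2 * 1024**3, "subtitle": 10 * 1024**2}


def classify_upload(filename: str, mime_type: str, size: int) -> str:
    """Return 'video' or 'subtitle', or raise ValueError for rejected uploads."""
    ext = PurePosixPath(filename).suffix.lower()
    # Check the extension first: browsers often send an empty or generic
    # MIME type for subtitle files, so the MIME type alone is not reliable.
    if ext in SUBTITLE_EXTENSIONS:
        file_type = "subtitle"
    elif ext in VIDEO_EXTENSIONS or mime_type.startswith("video/"):
        file_type = "video"
    else:
        raise ValueError(f"Unsupported file type: {filename}")
    if size > MAX_BYTES[file_type]:
        raise ValueError(f"{file_type} upload exceeds {MAX_BYTES[file_type]} bytes")
    return file_type
```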


POST /api/jobs

Current: Requires videoPath
New: Accepts videoPath OR subtitlePath

Changes:

  • Validate: exactly one of videoPath or subtitlePath must be provided
  • Store in videos table using appropriate column
  • Set duration to null for subtitle imports (calculate later if needed)

Request:

{
  "title": "My Subtitles",
  "subtitlePath": "subtitleai-uploads/...",
  "sourceLanguage": "en",
  "config": {
    "targetLanguage": "es",
    "styleProfile": "custom_abc123",
    "dualLanguage": true
  }
}

Response: Same as current (returns job ID)


New Endpoints (Optional)

POST /api/subtitles/parse (Server-side parsing for preview)

Purpose: Parse an uploaded subtitle file on the server and return its segments for preview

Alternative: Use client-side JavaScript parser (no backend needed)

Recommendation: Client-side parser for MVP (faster, no upload required)


Worker Changes

File: worker/tasks.py

Current Flow:

def generate_subtitles(job_id):
    # 1. Download video from storage
    # 2. Run Whisper transcription
    # 3. Parse generated SRT
    # 4. Apply styling
    # 5. Upload output files
    ...

New Flow:

def generate_subtitles(job_id):
    # Fetch job details
    job = fetch_job(job_id)
    video = job['videos']
    config = job['config']

    # Determine job type
    if video.get('subtitle_path'):
        # SUBTITLE IMPORT PATH
        subtitle_path = video['subtitle_path']
        segments, style_info = process_subtitle_import(subtitle_path, config)
    else:
        # VIDEO TRANSCRIPTION PATH (existing)
        video_path = video['file_path']
        segments = process_video_transcription(video_path)
        style_info = {'preserve_original': False, 'original_styles': None}

    # Common path: Apply styling & translation
    # (style_info carries the D3 decision; how it is consumed is left open here)
    styled_segments = apply_styling_and_translation(segments, config)
    upload_outputs(job_id, styled_segments)

New Function:

from typing import Dict, List

def process_subtitle_import(subtitle_path: str, config: dict) -> tuple[List[Dict], dict]:
    """
    Download subtitle file, detect format, parse into segments

    Handles ASS style preservation based on user intent:
    - Translation only (no style profile) → Preserve original ASS styles
    - Style profile selected → Ignore original styles (will be replaced)

    Args:
        subtitle_path: Path in subtitleai-uploads bucket
        config: Job configuration dict (contains styleProfile, dualLanguage)

    Returns:
        tuple: (segments, style_info)
        - segments: List of subtitle segments (same format for all types)
        - style_info: {
            'preserve_original': bool,
            'original_styles': dict | None  # ASS style definitions if preserve=True
          }
    """
    # Download subtitle file
    subtitle_data = supabase.storage.from_('subtitleai-uploads').download(subtitle_path)

    # Detect format from extension
    ext = subtitle_path.split('.')[-1].lower()
    content = subtitle_data.decode('utf-8')

    # Determine if we should preserve original styles (ASS only)
    preserve_original = False
    original_styles = None

    if ext in ['ass', 'ssa']:
        # Check if translation-only mode (D3 logic)
        style_profile = config.get('styleProfile')
        dual_language = config.get('dualLanguage', False)

        if style_profile in [None, 'none'] and dual_language:
            # Translation only - preserve original styles
            preserve_original = True
            segments, original_styles = parse_ass_with_styles(content)
        else:
            # Style profile selected - ignore original styles
            segments = parse_ass(content)
    elif ext == 'srt':
        segments = parse_srt(content)
    elif ext == 'vtt':
        segments = parse_vtt(content)
    else:
        raise ValueError(f"Unsupported subtitle format: {ext}")

    style_info = {
        'preserve_original': preserve_original,
        'original_styles': original_styles
    }

    return segments, style_info
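
The function above decodes with plain UTF-8, while Phase 4 lists encoding issues as an edge case to handle. A minimal sketch of a more tolerant decode step follows; `decode_subtitle_bytes` is a hypothetical helper, and the cp1252 fallback is an assumption about which legacy encoding is most likely, not something the plan specifies.

```python
import codecs


def decode_subtitle_bytes(data: bytes, fallback: str = "cp1252") -> str:
    """Decode subtitle bytes, tolerating BOMs and legacy single-byte encodings."""
    # UTF-16 files are recognised by their byte-order mark.
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return data.decode("utf-16")
    try:
        # 'utf-8-sig' decodes UTF-8 and drops a leading BOM if one is present.
        return data.decode("utf-8-sig")
    except UnicodeDecodeError:
        # Assumed fallback; undecodable bytes become U+FFFD instead of raising.
        return data.decode(fallback, errors="replace")
```

Stripping the BOM here also keeps header checks such as the 'WEBVTT' prefix test from failing on files saved by Windows editors.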


File: worker/formats/subtitle_parsers.py (New)

Structure:

"""
Unified subtitle format parsers
"""
from typing import Dict, List

def parse_srt(content: str) -> List[Dict]:
    """Parse SRT format (existing function, moved here)"""
    pass

def parse_ass(content: str) -> List[Dict]:
    """
    Parse ASS/SSA format

    Extracts dialogue lines, ignores style definitions for now
    Returns same structure as parse_srt
    """
    pass

def parse_vtt(content: str) -> List[Dict]:
    """
    Parse WebVTT format

    Similar to SRT but with 'WEBVTT' header and different timestamp format
    """
    pass

def detect_format(content: str) -> str:
    """
    Auto-detect subtitle format from content

    Returns: 'srt', 'ass', or 'vtt'
    """
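    # Ignore a UTF-8 BOM or leading blank lines before checking the header
    head = content.lstrip('\ufeff \t\r\n')
    if head.startswith('WEBVTT'):
        return 'vtt'
    elif '[Script Info]' in content or '[V4+ Styles]' in content:
        return 'ass'
    else:
        return 'srt'  # Default assumption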
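
To make the parse_ass() stub more concrete, here is a minimal sketch of dialogue extraction that follows the [Events] Format line and returns the same structure as parse_srt. The name `parse_ass_sketch` is hypothetical, and the sketch leaves override tags such as {\an8} in the text rather than stripping them.

```python
from typing import Dict, List


def _ass_time_to_srt(value: str) -> str:
    """Convert ASS 'H:MM:SS.cc' (centiseconds) to SRT 'HH:MM:SS,mmm'."""
    clock, _, centis = value.strip().partition(".")
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    millis = int(centis.ljust(2, "0")[:2]) * 10
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def parse_ass_sketch(content: str) -> List[Dict]:
    segments: List[Dict] = []
    fields: List[str] = []
    in_events = False
    for raw in content.splitlines():
        line = raw.strip()
        if line.startswith("["):
            in_events = line.lower() == "[events]"
        elif in_events and line.startswith("Format:"):
            fields = [name.strip().lower() for name in line[len("Format:"):].split(",")]
        elif in_events and line.startswith("Dialogue:") and fields:
            # Text is the last field and may contain commas, so cap the split.
            values = line[len("Dialogue:"):].split(",", len(fields) - 1)
            if len(values) != len(fields):
                continue  # malformed line: skip it
            row = dict(zip(fields, (value.strip() for value in values)))
            segments.append({
                "index": len(segments) + 1,
                "start": _ass_time_to_srt(row["start"]),
                "end": _ass_time_to_srt(row["end"]),
                # ASS marks a hard line break inside dialogue with \N.
                "text": row["text"].replace("\\N", "\n"),
            })
    return segments
```

Reading the column order from the Format line, instead of assuming the ten default columns, keeps the sketch working for files that reorder or omit fields.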


Frontend Changes

File: src/components/FileUploader.tsx (Renamed)

Changes:

interface FileUploaderProps {
  onFileSelect: (file: File | null) => void
  selectedFile: File | null
  acceptedTypes: ('video' | 'subtitle')[]  // NEW: Configure accepted types
}

export default function FileUploader({ onFileSelect, selectedFile, acceptedTypes }: FileUploaderProps) {
  const handleFile = (file: File) => {
    // Build dynamic validation based on acceptedTypes
    const validExtensions: string[] = []
    const validMimes: string[] = []

    if (acceptedTypes.includes('video')) {
      validExtensions.push('.mp4', '.mkv', '.webm', '.mov', '.avi')
      validMimes.push('video/mp4', 'video/webm', ...)
    }

    if (acceptedTypes.includes('subtitle')) {
      validExtensions.push('.srt', '.ass', '.ssa', '.vtt')
      validMimes.push('text/plain', 'application/x-subrip', ...)
    }

    // Validate file
    const isValid = validMimes.includes(file.type) ||
      validExtensions.some(ext => file.name.toLowerCase().endsWith(ext))

    if (!isValid) {
      toast.error('Invalid file type')
      return
    }

    onFileSelect(file)
  }

  return (
    <div className="...">
      {/* Show dynamic hint based on acceptedTypes */}
      <p>
        {acceptedTypes.includes('video') && acceptedTypes.includes('subtitle')
          ? 'Upload Video OR Subtitle File'
          : acceptedTypes.includes('video')
          ? 'Upload Video File'
          : 'Upload Subtitle File'
        }
      </p>
    </div>
  )
}


File: src/app/generate/page.tsx

Changes:

export default function GeneratePage() {
  const [uploadedFile, setUploadedFile] = useState<File | null>(null)
  const [fileType, setFileType] = useState<'video' | 'subtitle' | null>(null)
  const [parsedSegments, setParsedSegments] = useState<SubtitleSegment[] | null>(null)

  const handleFileSelect = async (file: File | null) => {
    setUploadedFile(file)

    if (!file) {
      setFileType(null)
      setParsedSegments(null)
      return
    }

    // Detect file type
    const ext = file.name.split('.').pop()?.toLowerCase()
    if (ext && ['srt', 'ass', 'ssa', 'vtt'].includes(ext)) {
      setFileType('subtitle')

      // Parse subtitle file client-side for preview
      const text = await file.text()
      const segments = parseSubtitleClientSide(text, ext!)
      setParsedSegments(segments)
    } else {
      setFileType('video')
      setParsedSegments(null)
    }
  }

  return (
    <div>
      {/* File Upload */}
      <FileUploader
        onFileSelect={handleFileSelect}
        selectedFile={uploadedFile}
        acceptedTypes={['video', 'subtitle']}
      />

      {/* Subtitle Preview (only for subtitle uploads) */}
      {fileType === 'subtitle' && parsedSegments && (
        <SubtitlePreview
          segments={parsedSegments}
          totalSegments={parsedSegments.length}
        />
      )}

      {/* Style Comparison Visualizer */}
      {parsedSegments && (
        <StyleComparison
          segments={parsedSegments}
          selectedStyleProfile={config.styleProfile}
          displayMode="side-by-side"
        />
      )}

      {/* ... rest of generate page */}
    </div>
  )
}


Testing Plan

Unit Tests

Parser Tests

# worker/tests/test_subtitle_parsers.py
from textwrap import dedent

# Import path is an assumption; adjust to how the worker package is laid out
from formats.subtitle_parsers import detect_format, parse_ass, parse_srt

def test_parse_srt_basic():
    """Test basic SRT parsing"""
    # dedent() strips the indentation so the parser sees a well-formed file
    srt_content = dedent("""\
        1
        00:00:01,000 --> 00:00:03,500
        First subtitle

        2
        00:00:06,000 --> 00:00:08,000
        Second subtitle
    """)
    segments = parse_srt(srt_content)
    assert len(segments) == 2
    assert segments[0]['text'] == 'First subtitle'
    assert segments[0]['start'] == '00:00:01,000'

def test_parse_ass_basic():
    """Test ASS dialogue parsing"""
    ass_content = dedent("""\
        [Events]
        Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
        Dialogue: 0,0:00:01.00,0:00:03.50,Default,,0,0,0,,First subtitle
    """)
    segments = parse_ass(ass_content)
    assert len(segments) == 1
    assert segments[0]['text'] == 'First subtitle'

def test_detect_format():
    """Test format auto-detection"""
    assert detect_format('WEBVTT\n\n...') == 'vtt'
    assert detect_format('[Script Info]\n...') == 'ass'
    assert detect_format('1\n00:00:01...') == 'srt'

Integration Tests

Test 1: Upload Subtitle → Apply Style → Download

// Test flow
1. Upload test.srt file
2. Select custom style profile
3. Generate subtitles
4. Download ASS file
5. Verify ASS contains custom style definition

Test 2: Upload Video → Upload Subtitle (Either/Or)

// Test validation
1. Upload video file
2. Attempt to also upload subtitle → Should show error
3. Clear video
4. Upload subtitle → Should succeed

Test 3: Subtitle Preview Rendering

// Test UI
1. Upload subtitle with 100 segments
2. Verify preview shows first 10
3. Click "Show All" → Verify all 100 render
4. Check timestamps format correctly

End-to-End Tests

Test Suite: subtitleai-import.spec.ts

import fs from 'fs'
import { test, expect } from '@playwright/test'

test('complete subtitle import flow', async ({ page }) => {
  // Navigate to generate page
  await page.goto('/generate')

  // Upload subtitle file
  await page.setInputFiles('input[type=file]', 'test-fixtures/sample.srt')

  // Verify preview appears
  await expect(page.locator('[data-testid=subtitle-preview]')).toBeVisible()
  await expect(page.locator('[data-testid=segment-count]')).toContainText('847 segments')

  // Select style profile
  await page.click('[data-testid=style-profile-custom]')

  // Verify style comparison shows
  await expect(page.locator('[data-testid=style-comparison]')).toBeVisible()

  // Generate subtitles
  await page.click('button:has-text("Generate Subtitles")')

  // Wait for job completion
  await page.waitForURL('/jobs')
  await page.waitForSelector('[data-testid=job-status-complete]', { timeout: 60000 })

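  // Download and verify. Start listening before triggering the download;
  // the download button selector below is a placeholder, not a confirmed test id.
  const downloadPromise = page.waitForEvent('download')
  await page.click('[data-testid=download-ass]')
  const download = await downloadPromise
  const path = await download.path()
  const content = fs.readFileSync(path, 'utf-8')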

  // Verify custom style applied
  expect(content).toContain('Style: My Custom Style')
})

Migration Strategy

Backwards Compatibility

  • Existing jobs (video-only) continue to work without changes
  • videos.subtitle_path is nullable, so existing rows need no backfill after the column is added
  • Worker detects job type automatically

Rollout Plan

  1. Phase 1: Deploy backend changes (API, worker) - No UI changes yet
  2. Phase 2: Deploy UI changes (file uploader, preview)
  3. Phase 3: Deploy visualizer (style comparison)

Rollback Plan

  • If issues arise, revert UI to video-only mode
  • Backend remains backwards compatible
  • No data loss risk (new column is optional)

Performance Considerations

Client-Side Parsing

Concern: Large subtitle files (10,000+ segments) could freeze browser

Solution:

  • Parse in Web Worker (non-blocking)
  • Show progress indicator for files >1MB
  • Limit preview to first 10 segments by default (the previewLimit default)
  • Use virtualized list for "Show All" mode

Implementation:

// utils/subtitle-parser-worker.ts
const worker = new Worker('/workers/subtitle-parser.js')

worker.postMessage({ content: fileContent, format: 'srt' })
worker.onmessage = (e) => {
  setParsedSegments(e.data.segments)
}


Worker Processing

Concern: Subtitle parsing adds latency to job processing

Analysis:

  • SRT parsing: ~50ms for 1000 segments (negligible)
  • ASS parsing: ~100ms for 1000 segments (complex format)
  • Compared to Whisper: 2-5 minutes for typical video

Conclusion: Subtitle import will be 60-100x faster than video transcription


Documentation Updates

Files to Update

  1. /root/tower-fleet/docs/applications/subtitleai.md - Add feature description
  2. /root/PROJECTS.md - Update SubtitleAI status
  3. User-facing docs (if any) - Add "Import Subtitle" tutorial

New Documentation

  • Create /root/tower-fleet/docs/reference/subtitle-formats.md
  • Supported formats
  • Format-specific quirks
  • Conversion best practices

Future Enhancements (Out of Scope)

Bazarr Integration

  • Fetch subtitles directly from Bazarr API
  • Show movie/show library in SubtitleAI UI
  • One-click import from Bazarr → Apply style → Re-upload

Media Library Scanning

  • Mount /vault in worker pods
  • Scan for .srt/.ass files alongside videos
  • Build searchable subtitle inventory
  • Batch restyle operations

Advanced Visualizer Features

  • Upload video background for preview
  • Playback simulation (animated timing)
  • Export preview as MP4 clip
  • A/B comparison (3+ styles side-by-side)

Subtitle Editing

  • In-browser subtitle editor
  • Adjust timing, fix typos
  • Save edited version before applying styles

Design Decisions (Finalized)

D1: Video + Subtitle Upload Strategy

Decision: Mutually exclusive (Either/Or)

Implementation:

  • User uploads EITHER video OR subtitle file, not both in same job
  • UI shows single file upload area that accepts both types
  • Clearer user intent, simpler UX
  • Future enhancement: Add "both" option if use case emerges (e.g., compare Whisper transcription vs existing subtitle)

Rationale: Starting simple reduces complexity. Most users have one or the other, not both.


D2: Timing Preservation During Translation

Decision: Always preserve original timing

Implementation:

  • When translating subtitles, keep original start/end timestamps
  • No regeneration based on translated text length
  • Maintains perfect sync with video
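
A minimal sketch of this rule: translation replaces only the text of each segment and copies every other field through unchanged. The `translate` callable is a placeholder for whatever translation backend the worker uses, and `translate_segments` is a hypothetical name.

```python
from typing import Callable, Dict, List


def translate_segments(segments: List[Dict], translate: Callable[[str], str]) -> List[Dict]:
    """Return new segments with translated text and untouched index/start/end."""
    return [{**segment, "text": translate(segment["text"])} for segment in segments]
```

Building new dicts instead of mutating the input keeps the original segments available, which would matter if dual-language output needs both versions.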

Rationale: Translation text length rarely differs enough to affect readability. Preserving timing prevents drift and maintains audio sync.


D3: Style Handling for ASS Import

Decision: Context-aware (smart logic)

Scenario A - Translation Only (no style profile selected):

  • Preserve original ASS style definitions from uploaded file
  • Only translate dialogue text
  • Output maintains original fonts, colors, positioning

Scenario B - Style Profile Selected:

  • Strip original ASS styles completely
  • Apply selected style profile (preset or custom)
  • User explicitly choosing new style implies replacement intent

Implementation:

# worker/tasks.py
if uploaded_format == 'ass':
    if config.get('styleProfile') in [None, 'none'] and config.get('dualLanguage'):
        # Translation only - preserve original styles
        preserve_original_styles = True
    else:
        # Apply selected profile - replace styles
        preserve_original_styles = False

Rationale: Smart defaults that match user intent. Translation-only users want to keep their existing styling. Users selecting a profile want that profile applied.


Success Metrics

Phase 1 (Subtitle Upload)

  • [ ] 50%+ of jobs use subtitle import (vs video transcription)
  • [ ] Average job completion time <30 seconds (vs 2-5 minutes for video)
  • [ ] Zero failed subtitle parse jobs

Phase 2 (Preview)

  • [ ] 80%+ of subtitle uploads show preview successfully
  • [ ] <200ms preview render time for files with <1000 segments

Phase 3 (Visualizer)

  • [ ] Users spend 30+ seconds interacting with style comparison
  • [ ] 20%+ of users change style selection after viewing comparison

Checklist

Pre-Development

  • [x] Review and finalize feature plan
  • [ ] Get user approval on UI mockups
  • [x] Resolve open questions (finalized as design decisions D1-D3)
  • [ ] Set up feature branch: feature/subtitle-import

Phase 1: Subtitle Upload

  • [ ] Rename VideoUploader → FileUploader
  • [ ] Add subtitle_path column to videos table
  • [ ] Modify /api/upload to handle subtitle files
  • [ ] Update /api/jobs to accept subtitlePath
  • [ ] Implement parse_ass() function
  • [ ] Implement parse_vtt() function
  • [ ] Update worker to detect and handle subtitle jobs
  • [ ] Write unit tests for parsers
  • [ ] Test end-to-end with sample SRT file

Phase 2: Subtitle Preview

  • [ ] Create SubtitlePreview.tsx component
  • [ ] Implement client-side SRT parser
  • [ ] Add preview to generate page
  • [ ] Add "Show All" expansion with virtualization
  • [ ] Test with large subtitle files (5000+ segments)

Phase 3: Style Visualizer

  • [ ] Create StyleComparison.tsx component
  • [ ] Implement side-by-side mode
  • [ ] Implement toggle mode
  • [ ] Extract real subtitle text for preview
  • [ ] Add segment selector
  • [ ] Integrate with StyleConfigurator

Testing & Deployment

  • [ ] Run full test suite
  • [ ] Manual QA testing
  • [ ] Update documentation
  • [ ] Create deployment PR
  • [ ] Deploy to production
  • [ ] Monitor for errors (first 24h)

Timeline Estimate

Total Implementation Time: 11-16 hours (Phases 1-3 plus testing and polish)

Breakdown:

  • Phase 1 (Subtitle Upload): 4-6 hours
  • Phase 2 (Preview): 2-3 hours
  • Phase 3 (Visualizer): 3-4 hours
  • Testing & Polish: 2-3 hours

Calendar Time: 2-3 days (assuming focused development)


Notes

  • This feature significantly expands SubtitleAI's utility beyond just transcription
  • Opens door for Bazarr integration and media library workflows
  • Style visualizer could become a standout feature (unique value prop)
  • Consider making visualizer accessible without file upload (style sandbox mode)