SubtitleAI: Subtitle Import Feature - Testing Plan

Feature: Subtitle file import with multi-format parsing (SRT, ASS, VTT)
Status: Phase 1 implementation complete, ready for testing
Related: Feature Plan

Testing Overview

This testing plan covers Phase 1 of the subtitle import feature, which enables users to upload subtitle files directly instead of requiring video transcription.

Scope:
- Frontend: FileUploader component accepting both video and subtitle files
- API: /api/upload and /api/jobs handling subtitle uploads
- Worker: Multi-format parsing (SRT, ASS, VTT) with format auto-detection
- Database: XOR constraint on videos.file_path and videos.subtitle_path
- End-to-end: Complete workflow from upload to downloadable styled subtitles

1. Unit Tests

1.1 Parser Tests (worker/formats/subtitle_parsers.py)

Test: parse_srt() - Valid SRT

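# Import path inferred from worker/formats/subtitle_parsers.py; adjust to your test layout.
from worker.formats.subtitle_parsers import (
    ass_time_to_srt_time,
    detect_format,
    parse_ass,
    parse_srt,
    parse_vtt,
)

def test_parse_srt_valid():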
    srt_content = """1
00:00:00,000 --> 00:00:05,000
Hello world!

2
00:00:05,500 --> 00:00:10,000
This is a test."""

    segments = parse_srt(srt_content)

    assert len(segments) == 2
    assert segments[0]['index'] == 1
    assert segments[0]['start'] == '00:00:00,000'
    assert segments[0]['end'] == '00:00:05,000'
    assert segments[0]['text'] == 'Hello world!'
    assert segments[1]['index'] == 2

Test: parse_srt() - Multiline Text

def test_parse_srt_multiline():
    srt_content = """1
00:00:00,000 --> 00:00:05,000
First line
Second line
Third line"""

    segments = parse_srt(srt_content)

    assert len(segments) == 1
    assert segments[0]['text'] == 'First line\nSecond line\nThird line'

Test: parse_srt() - Invalid Format

def test_parse_srt_invalid():
    # Missing timestamp
    srt_content = """1
Hello world!"""

    segments = parse_srt(srt_content)
    assert len(segments) == 0  # Should skip malformed segments
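
The three SRT cases above pin down a small contract: cues are split on blank lines, multiline text is joined with `\n`, and a cue without a valid timing line is skipped. The following is a minimal sketch of a parser that would satisfy those assertions. `parse_srt_sketch` is a hypothetical stand-in written for illustration; the real `parse_srt()` in worker/formats/subtitle_parsers.py may be implemented differently.

```python
import re

# Hypothetical stand-in for parse_srt(); not the project's implementation.
_SRT_TIMING = re.compile(
    r"^(\d{2}:\d{2}:\d{2},\d{3})\s*-->\s*(\d{2}:\d{2}:\d{2},\d{3})"
)


def parse_srt_sketch(content: str) -> list[dict]:
    segments = []
    normalized = content.replace("\r\n", "\n").strip()
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", normalized):
        lines = block.split("\n")
        if len(lines) < 3 or not lines[0].strip().isdigit():
            continue  # too short, or no numeric index: skip the cue
        match = _SRT_TIMING.match(lines[1].strip())
        if match is None:
            continue  # missing or malformed timing line: skip the cue
        segments.append({
            "index": int(lines[0].strip()),
            "start": match.group(1),
            "end": match.group(2),
            "text": "\n".join(lines[2:]),
        })
    return segments
```

Skipping malformed cues rather than raising keeps the parser tolerant of partially damaged files; whether a file that yields zero segments should fail the job is decided by the caller.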

Test: parse_ass() - Valid ASS

def test_parse_ass_valid():
    ass_content = """[Script Info]
Title: Test

[V4+ Styles]
Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding
Style: Default,Arial,48,&H00FFFFFF,&H000000FF,&H00000000,&H00000000,0,0,0,0,100,100,0,0,1,2,2,2,10,10,10,1

[Events]
Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
Dialogue: 0,0:00:01.00,0:00:05.50,Default,,0,0,0,,Hello world!"""

    segments, styles = parse_ass(ass_content, preserve_styles=True)

    assert len(segments) == 1
    assert segments[0]['start'] == '00:00:01,000'  # Converted to SRT format
    assert segments[0]['end'] == '00:00:05,500'
    assert segments[0]['text'] == 'Hello world!'
    assert 'Default' in styles

Test: parse_ass() - ASS Formatting Tags

def test_parse_ass_formatting_cleanup():
    ass_content = """[Events]
Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
Dialogue: 0,0:00:01.00,0:00:05.50,Default,,0,0,0,,{\\i1}Italic text{\\i0}"""

    segments, _ = parse_ass(ass_content)

    assert segments[0]['text'] == 'Italic text'  # Formatting tags removed

Test: parse_ass() - Line Break Conversion

def test_parse_ass_line_breaks():
    ass_content = """[Events]
Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
Dialogue: 0,0:00:01.00,0:00:05.50,Default,,0,0,0,,First line\\NSecond line"""

    segments, _ = parse_ass(ass_content)

    assert segments[0]['text'] == 'First line\nSecond line'  # \\N → \n
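
The two cleanup rules tested above (strip `{...}` override blocks, turn `\N` into a newline) can be sketched as below. The function names are hypothetical and chosen for illustration. The sketch also shows why a Dialogue line should be split at most nine times: with the ten-field Events format used in these tests, Text is the last field and may itself contain commas.

```python
import re

# Override blocks such as {\i1} or {\b1}; hypothetical helper, not the project's code.
_ASS_OVERRIDE = re.compile(r"\{[^}]*\}")


def clean_ass_text_sketch(text: str) -> str:
    text = _ASS_OVERRIDE.sub("", text)
    # \N is the ASS hard line break; convert it to a real newline.
    return text.replace("\\N", "\n")


def dialogue_text_sketch(line: str) -> str:
    # Assumes the 10-field Events format shown in the tests above.
    payload = line.split(":", 1)[1].lstrip()
    fields = payload.split(",", 9)  # at most 9 splits keeps commas in Text
    return clean_ass_text_sketch(fields[9])
```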

Test: parse_vtt() - Valid VTT

def test_parse_vtt_valid():
    vtt_content = """WEBVTT

1
00:00:00.000 --> 00:00:05.000
Hello world!

2
00:00:05.500 --> 00:00:10.000
This is a test."""

    segments = parse_vtt(vtt_content)

    assert len(segments) == 2
    assert segments[0]['start'] == '00:00:00,000'  # Converted . to ,
    assert segments[0]['end'] == '00:00:05,000'
    assert segments[0]['text'] == 'Hello world!'

Test: parse_vtt() - VTT Tags

def test_parse_vtt_tag_removal():
    vtt_content = """WEBVTT

1
00:00:00.000 --> 00:00:05.000
<v Speaker>Hello world!</v>"""

    segments = parse_vtt(vtt_content)

    assert segments[0]['text'] == 'Hello world!'  # <v> tags removed

Test: parse_vtt() - Without Identifiers

def test_parse_vtt_no_identifiers():
    vtt_content = """WEBVTT

00:00:00.000 --> 00:00:05.000
Hello world!"""

    segments = parse_vtt(vtt_content)

    assert len(segments) == 1
    assert segments[0]['text'] == 'Hello world!'
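
The VTT assertions above rely on two conversions: the millisecond separator changes from `.` to `,`, and inline tags such as `<v Speaker>` are removed. A minimal sketch, using hypothetical helper names, might look like this. It also pads the hours field, since WebVTT allows timestamps without hours (`mm:ss.ttt`), a case the tests above do not yet cover.

```python
import re

# Matches opening and closing inline tags like <v Speaker>, </v>, <i>.
_VTT_TAG = re.compile(r"</?[^>]+>")


def vtt_time_to_srt_time_sketch(timestamp: str) -> str:
    # WebVTT permits omitting the hours field; SRT does not.
    if timestamp.count(":") == 1:
        timestamp = "00:" + timestamp
    return timestamp.replace(".", ",")


def strip_vtt_tags_sketch(text: str) -> str:
    return _VTT_TAG.sub("", text)
```

The tag pattern is deliberately simple and would also remove literal text that happens to sit between `<` and `>`, which is worth a dedicated test case if such content is expected.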

Test: detect_format()

def test_detect_format_srt():
    content = """1\n00:00:00,000 --> 00:00:05,000\nTest"""
    assert detect_format(content) == 'srt'

def test_detect_format_vtt():
    content = """WEBVTT\n\n1\n00:00:00.000 --> 00:00:05.000\nTest"""
    assert detect_format(content) == 'vtt'

def test_detect_format_ass():
    content = """[Script Info]\nTitle: Test\n[Events]\nDialogue: ..."""
    assert detect_format(content) == 'ass'
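
One way to read the three detection tests is as a content sniffing order: look for the `WEBVTT` header first, then for ASS section markers, and fall back to SRT. The sketch below follows that order under those assumptions; `detect_format_sketch` is a hypothetical name and the real `detect_format()` may use different heuristics.

```python
def detect_format_sketch(content: str) -> str:
    # Ignore a UTF-8 BOM and leading whitespace before sniffing.
    head = content.lstrip("\ufeff \t\r\n")
    if head.startswith("WEBVTT"):
        return "vtt"
    if "[Script Info]" in head or "[Events]" in head:
        return "ass"
    # Fallback, consistent with the "(default)" note in section 7.1.
    return "srt"
```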

Test: ass_time_to_srt_time()

def test_ass_time_conversion():
    assert ass_time_to_srt_time('0:00:05.00') == '00:00:05,000'
    assert ass_time_to_srt_time('1:23:45.67') == '01:23:45,670'
    assert ass_time_to_srt_time('0:00:00.10') == '00:00:00,100'
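
The conversion being tested pads the single-digit ASS hours field to two digits and widens centiseconds to milliseconds. A minimal sketch that reproduces the three expected values, with a hypothetical name so it is not mistaken for the project's function:

```python
def ass_time_to_srt_time_sketch(ass_time: str) -> str:
    # ASS timestamps look like H:MM:SS.cc (centiseconds).
    hours, minutes, rest = ass_time.split(":")
    seconds, fraction = rest.split(".")
    # Pad or truncate the fraction to exactly three digits (milliseconds).
    millis = int(fraction.ljust(3, "0")[:3])
    return f"{int(hours):02d}:{int(minutes):02d}:{int(seconds):02d},{millis:03d}"
```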

1.2 File Upload Tests (API)

Test: /api/upload - Video File

# Test video upload returns correct fileType
curl -X POST http://localhost:3000/api/upload \
  -H "Authorization: Bearer $TOKEN" \
  -F "file=@test.mp4"

# Expected response:
{
  "success": true,
  "path": "user-id/videos/timestamp_test.mp4",
  "fullPath": "subtitleai-uploads/user-id/videos/timestamp_test.mp4",
  "fileType": "video",
  "size": 1024000
}

Test: /api/upload - Subtitle File (SRT)

curl -X POST http://localhost:3000/api/upload \
  -H "Authorization: Bearer $TOKEN" \
  -F "file=@test.srt"

# Expected response:
{
  "success": true,
  "path": "user-id/subtitles/timestamp_test.srt",
  "fullPath": "subtitleai-uploads/user-id/subtitles/timestamp_test.srt",
  "fileType": "subtitle",
  "size": 5000
}

Test: /api/upload - Subtitle File Size Limit

# Create 11MB file (over 10MB limit)
dd if=/dev/zero of=large.srt bs=1M count=11

curl -X POST http://localhost:3000/api/upload \
  -H "Authorization: Bearer $TOKEN" \
  -F "file=@large.srt"

# Expected: 400 error with "exceeds maximum size"

Test: /api/upload - Unsupported File Type

curl -X POST http://localhost:3000/api/upload \
  -H "Authorization: Bearer $TOKEN" \
  -F "file=@test.txt"

# Expected: 400 error with "Invalid file type"
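
The four upload cases above can be summarized as a small decision table. The sketch below encodes it in Python purely to make the expected outcomes explicit; it is not the API's code. The set of video extensions, the exact byte value of the 10MB limit, the success status code, and the `error` key are all assumptions made for illustration.

```python
from pathlib import Path

SUBTITLE_EXTENSIONS = {".srt", ".ass", ".vtt"}
VIDEO_EXTENSIONS = {".mp4"}  # assumption: only .mp4 appears in this plan
MAX_SUBTITLE_BYTES = 10 * 1024 * 1024  # assumption: "10MB" means 10 MiB


def classify_upload_sketch(filename: str, size: int) -> tuple[int, dict]:
    """Return (status, body) for an upload, mirroring the expectations above."""
    ext = Path(filename).suffix.lower()
    if ext in SUBTITLE_EXTENSIONS:
        if size > MAX_SUBTITLE_BYTES:
            return 400, {"error": "File exceeds maximum size"}
        return 200, {"success": True, "fileType": "subtitle"}
    if ext in VIDEO_EXTENSIONS:
        return 200, {"success": True, "fileType": "video"}
    return 400, {"error": "Invalid file type"}
```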

1.3 Job Creation Tests (API)

Test: /api/jobs - Create with videoPath

curl -X POST http://localhost:3000/api/jobs \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "title": "Test Video",
    "videoPath": "subtitleai-uploads/user-id/videos/test.mp4",
    "sourceLanguage": "en",
    "config": {"styleProfile": "default"}
  }'

# Expected: Job created, videos.file_path set, videos.subtitle_path NULL

Test: /api/jobs - Create with subtitlePath

curl -X POST http://localhost:3000/api/jobs \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "title": "Test Subtitle",
    "subtitlePath": "subtitleai-uploads/user-id/subtitles/test.srt",
    "sourceLanguage": "en",
    "config": {"styleProfile": "default"}
  }'

# Expected: Job created, videos.subtitle_path set, videos.file_path NULL

Test: /api/jobs - Missing both paths

curl -X POST http://localhost:3000/api/jobs \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "title": "Invalid Job",
    "sourceLanguage": "en",
    "config": {}
  }'

# Expected: 400 error "Either videoPath or subtitlePath must be provided"

Test: /api/jobs - Both paths provided

curl -X POST http://localhost:3000/api/jobs \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "title": "Invalid Job",
    "videoPath": "subtitleai-uploads/user-id/videos/test.mp4",
    "subtitlePath": "subtitleai-uploads/user-id/subtitles/test.srt",
    "sourceLanguage": "en",
    "config": {}
  }'

# Expected: 400 error "Cannot provide both videoPath and subtitlePath"
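
The four job-creation cases reduce to an exclusive-or check on `videoPath` and `subtitlePath`. The sketch below expresses that rule with the two error messages quoted above; the function name and the convention of returning `None` on success are assumptions for illustration, not the API's actual code.

```python
from typing import Optional


def validate_job_payload_sketch(payload: dict) -> Optional[str]:
    """Return the expected 400 error message, or None if the payload is acceptable."""
    has_video = bool(payload.get("videoPath"))
    has_subtitle = bool(payload.get("subtitlePath"))
    if has_video and has_subtitle:
        return "Cannot provide both videoPath and subtitlePath"
    if not has_video and not has_subtitle:
        return "Either videoPath or subtitlePath must be provided"
    return None
```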

1.4 Database Constraint Tests

Test: XOR Constraint - Both NULL

-- Should FAIL
INSERT INTO subtitleai.videos (user_id, title, source_language, file_path, subtitle_path)
VALUES ('user-id', 'Test', 'en', NULL, NULL);

-- Expected: Check constraint violation

Test: XOR Constraint - Both SET

-- Should FAIL
INSERT INTO subtitleai.videos (user_id, title, source_language, file_path, subtitle_path)
VALUES ('user-id', 'Test', 'en', 'video.mp4', 'subtitle.srt');

-- Expected: Check constraint violation

Test: XOR Constraint - Only file_path

-- Should SUCCEED
INSERT INTO subtitleai.videos (user_id, title, source_language, file_path, subtitle_path)
VALUES ('user-id', 'Test', 'en', 'video.mp4', NULL);

-- Expected: Insert successful

Test: XOR Constraint - Only subtitle_path

-- Should SUCCEED
INSERT INTO subtitleai.videos (user_id, title, source_language, file_path, subtitle_path)
VALUES ('user-id', 'Test', 'en', NULL, 'subtitle.srt');

-- Expected: Insert successful
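
To try the four cases without touching the real database, the XOR rule can be reproduced in an in-memory SQLite table. This is a sketch under assumptions: the production database is Postgres (Supabase), the table here is reduced to the relevant columns, and the CHECK expression is one plausible definition of `videos_file_xor_subtitle`, not necessarily the one in the migration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE videos (
        id INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        file_path TEXT,
        subtitle_path TEXT,
        -- exactly one of the two paths must be non-NULL
        CONSTRAINT videos_file_xor_subtitle
            CHECK ((file_path IS NULL) <> (subtitle_path IS NULL))
    )
    """
)


def try_insert(file_path, subtitle_path) -> bool:
    """Return True if the insert is accepted, False on a constraint violation."""
    try:
        conn.execute(
            "INSERT INTO videos (title, file_path, subtitle_path) VALUES (?, ?, ?)",
            ("Test", file_path, subtitle_path),
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

Comparing the two `IS NULL` results with `<>` is true only when exactly one column is NULL, which is the XOR behaviour the four inserts above are meant to confirm.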

2. Integration Tests

2.1 Worker Job Detection

Test: Worker detects subtitle import job

# Create job with subtitle_path
job_id = create_test_job(subtitle_path='test.srt')

# Trigger worker
generate_subtitles.delay(job_id)

# Verify logs show:
# "[Job {id}] Subtitle import mode detected"
# "[Job {id}] Downloading subtitle from subtitleai-uploads/..."
# "[Job {id}] Detected subtitle format: srt"
# "[Job {id}] Parsed N subtitle segments from srt file"

Test: Worker detects video transcription job

# Create job with file_path
job_id = create_test_job(file_path='test.mp4')

# Trigger worker
generate_subtitles.delay(job_id)

# Verify logs show:
# "[Job {id}] Video transcription mode detected"
# "[Job {id}] Downloading video from subtitleai-uploads/..."
# "[Job {id}] Running Whisper transcription"
# "[Job {id}] Parsed N subtitle segments"
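
Checking the logs by eye is error-prone, so the "Verify logs show" steps could be automated with a small helper that confirms the expected messages appear in order. The helper below is a hypothetical sketch; how the worker log text is captured (for example from `kubectl logs` or a test log handler) is left to the test harness.

```python
def logs_contain_in_order(log_text: str, expected: list[str]) -> bool:
    """Return True if every expected message occurs in log_text, in the given order."""
    position = 0
    for message in expected:
        found = log_text.find(message, position)
        if found == -1:
            return False
        position = found + len(message)
    return True
```

For a subtitle import job, `expected` would hold the fragments listed above, such as "Subtitle import mode detected" followed by "Detected subtitle format: srt".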

2.2 Worker Format Detection

Test: Worker auto-detects SRT

# Upload SRT file, create job
job_id = create_test_job_with_srt_file()

# Trigger worker
generate_subtitles.delay(job_id)

# Verify:
# - Format detected as 'srt'
# - parse_srt() called
# - Segments parsed correctly

Test: Worker auto-detects ASS

# Upload ASS file, create job
job_id = create_test_job_with_ass_file()

# Trigger worker
generate_subtitles.delay(job_id)

# Verify:
# - Format detected as 'ass'
# - parse_ass() called with preserve_styles=False
# - Segments parsed correctly
# - ASS formatting tags removed

Test: Worker auto-detects VTT

# Upload VTT file, create job
job_id = create_test_job_with_vtt_file()

# Trigger worker
generate_subtitles.delay(job_id)

# Verify:
# - Format detected as 'vtt'
# - parse_vtt() called
# - Timestamps converted from . to ,
# - VTT tags removed

2.3 Worker Progress Updates

Test: Subtitle import progress

job_id = create_test_job_with_srt_file()
generate_subtitles.delay(job_id)

# Monitor progress field in database:
# 0% → 30% (download) → 60% (parse) → 85% (styling) → 100% (complete)

Test: Video transcription progress

job_id = create_test_job_with_video()
generate_subtitles.delay(job_id)

# Monitor progress field:
# 0% → 20% (download) → 80% (whisper) → 85% (parse) → 100% (complete)
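
The progress expectations above amount to two properties: the value never decreases, and each milestone is observed in order. A polling test could collect the progress values and check them with a helper like the following sketch. The function name is hypothetical, and a coarse polling interval may miss a milestone entirely, so the milestone list may need to be relaxed in practice.

```python
def check_progress_sketch(observed: list[int], milestones: list[int]) -> bool:
    """Return True if observed never decreases and contains milestones in order."""
    # Progress must never go backwards.
    if any(later < earlier for earlier, later in zip(observed, observed[1:])):
        return False
    # Each milestone must appear, in order; the shared iterator enforces ordering.
    remaining = iter(observed)
    return all(any(value == m for value in remaining) for m in milestones)
```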

3. End-to-End Test Scenarios

3.1 Happy Path: SRT → Default Style → ASS Download

Steps:
1. Navigate to /generate
2. Upload test SRT file (e.g., test_simple.srt)
3. Select source language: English
4. Select style profile: Default
5. Click "Generate Subtitles"
6. Navigate to /jobs
7. Wait for job to complete (status: complete, progress: 100%)
8. Click job to view details
9. Download ASS file
10. Verify ASS file contains Default style definition
11. Verify all SRT segments present with correct timing

Expected Result:
- Upload succeeds, shows file size
- Job created and appears in jobs list
- Job completes in <30 seconds (no video transcription)
- Download links appear for both SRT and ASS
- ASS file has Default style applied (yellow text, black outline)
- All segments match original SRT timing

3.2 ASS → Custom Style → ASS Download

Steps:
1. Create custom style profile via /styles page
   - Name: "Test Blue"
   - Primary color: Blue (#0000FF)
   - Outline: 3px, black
   - Font size: 52
2. Navigate to /generate
3. Upload ASS file with existing style
4. Select style profile: "Test Blue" (custom)
5. Generate subtitles
6. Download ASS output
7. Verify style replaced with "Test Blue"

Expected Result:
- Custom style appears in style selector
- Job completes successfully
- Downloaded ASS file has "Test Blue" style definition
- All dialogue lines use "Test Blue" style
- Original ASS styles not preserved (replaced)
- Job detail page shows "Test Blue Style" tag (not ID)
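
When inspecting the downloaded file, note that ASS stores colours as `&HAABBGGRR` (alpha, then blue, green, red), so the web colour `#0000FF` should appear as `&H00FF0000` in the "Test Blue" style line. The helper below is a hypothetical sketch for computing the value to look for; it assumes a fully opaque colour (alpha `00`) unless told otherwise.

```python
def hex_to_ass_colour_sketch(hex_rgb: str, alpha: int = 0) -> str:
    """Convert '#RRGGBB' to the ASS '&HAABBGGRR' notation."""
    value = hex_rgb.lstrip("#")
    if len(value) != 6:
        raise ValueError(f"expected #RRGGBB, got {hex_rgb!r}")
    red, green, blue = value[0:2], value[2:4], value[4:6]
    # ASS reverses the channel order relative to web hex colours.
    return f"&H{alpha:02X}{blue}{green}{red}".upper()
```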

3.3 VTT → Dual Language → ASS Download

Steps:
1. Upload VTT file
2. Source language: English
3. Target language: Spanish
4. Enable dual-language mode
5. Select style: Anime Fansub
6. Generate subtitles
7. Download ASS output
8. Verify dual-language formatting

Expected Result:
- VTT parsed correctly (WEBVTT header handled)
- Timestamps converted (. → ,)
- Translation API called
- ASS file shows dual-language format:

Original English text
Texto traducido al español

- Anime Fansub style applied

3.4 SRT → Translation Only (No Style)

Steps:
1. Upload SRT file
2. Source: English, Target: French
3. Enable dual-language
4. Style profile: None/Default
5. Generate subtitles
6. Download both SRT and ASS

Expected Result:
- Translation preserves original timing
- SRT output shows dual-language (original + translation)
- ASS output uses Default style (no custom style applied)
- All timestamps match original SRT exactly
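
"All timestamps match original SRT exactly" can be checked mechanically by extracting the timing lines from the uploaded and downloaded SRT files and comparing the two lists. The sketch below assumes both files use standard `HH:MM:SS,mmm --> HH:MM:SS,mmm` timing lines; `srt_timings` is a hypothetical helper name.

```python
import re

_TIMING_LINE = re.compile(
    r"^(\d{2}:\d{2}:\d{2},\d{3}) --> (\d{2}:\d{2}:\d{2},\d{3})\s*$",
    re.MULTILINE,
)


def srt_timings(content: str) -> list[tuple[str, str]]:
    """Return the (start, end) pair of every cue, in file order."""
    return _TIMING_LINE.findall(content)
```

Timing is preserved when `srt_timings(original) == srt_timings(translated)`; the extra translated text lines in dual-language output do not affect the comparison.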

3.5 Large Subtitle File (Edge Case)

Steps:
1. Upload 9MB SRT file (~15,000 segments)
2. Apply default style
3. Monitor job progress

Expected Result:
- Upload succeeds (under 10MB limit)
- Job processes all segments
- Progress updates smoothly
- Completes in <60 seconds
- All segments present in output
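
A large test file is easiest to produce with a generator script. The sketch below builds a syntactically valid SRT with any number of segments; the cue spacing, the cue duration, and the placeholder text are arbitrary assumptions. The resulting file size depends on the text length, so lengthen `line` until the file reaches the size the scenario calls for.

```python
def format_srt_time(total_ms: int) -> str:
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def generate_srt(segment_count: int, line: str = "Synthetic subtitle line") -> str:
    blocks = []
    for i in range(segment_count):
        start = i * 2500   # one cue every 2.5 seconds
        end = start + 2000  # each cue lasts 2 seconds
        blocks.append(
            f"{i + 1}\n{format_srt_time(start)} --> {format_srt_time(end)}\n{line} {i + 1}"
        )
    return "\n\n".join(blocks) + "\n"
```

For example, `open("large.srt", "w", encoding="utf-8").write(generate_srt(15000))` writes a 15,000-segment file to the current directory.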

3.6 Malformed Subtitle File (Error Handling)

Steps:
1. Upload SRT with invalid timestamps:

1
INVALID TIMESTAMP
Text here

2. Submit job
3. Monitor job status

Expected Result:
- Job fails gracefully with error message
- Error logged: "Invalid SRT timestamp"
- Job status: failed
- Error message visible on job detail page

4. Manual Testing Checklist

Pre-Deployment Checklist

[ ] Frontend build succeeds (npm run build)
[ ] No TypeScript errors
[ ] No console errors on /generate page
[ ] FileUploader shows correct accepted file types (video + subtitle)
[ ] File validation works (reject .txt, accept .srt/.ass/.vtt)
[ ] Size limit enforced (10MB for subtitles)

Post-Deployment Checklist (K8s Worker)

[ ] Deploy updated worker to K8s:

# Build and push new worker image
cd /root/projects/subtitleai
docker build -f Dockerfile.worker -t 10.89.97.201:30500/subtitleai-worker:latest .
docker push 10.89.97.201:30500/subtitleai-worker:latest

# Restart worker deployment
kubectl rollout restart deployment/subtitleai-worker -n subtitleai

[ ] Verify worker logs show no errors:

kubectl logs -n subtitleai -l app=subtitleai-worker -f

[ ] Test subtitle upload end-to-end
[ ] Verify job completes successfully
[ ] Download and inspect ASS file

Database Migration Checklist

[ ] Migration file exists: supabase/migrations/20251130163047_add_subtitle_path_to_videos.sql
[ ] Apply to production (if using k8s Supabase):
```
/root/scripts/migrate-app.sh subtitleai
```

[ ] Verify column added:

SELECT column_name, data_type, is_nullable
FROM information_schema.columns
WHERE table_schema = 'subtitleai' AND table_name = 'videos' AND column_name = 'subtitle_path';

[ ] Verify XOR constraint exists:

SELECT constraint_name, check_clause
FROM information_schema.check_constraints
WHERE constraint_schema = 'subtitleai' AND constraint_name = 'videos_file_xor_subtitle';

Format-Specific Tests

SRT:
- [ ] Simple SRT (single line per segment)
- [ ] Multiline SRT (2-3 lines per segment)
- [ ] SRT with special characters (é, ñ, 中文)
- [ ] SRT with formatting tags (should preserve in text)

ASS:
- [ ] ASS with Default style
- [ ] ASS with multiple custom styles
- [ ] ASS with formatting tags (\i1, \b1, etc.) - should strip
- [ ] ASS with \N line breaks - should convert to \n
- [ ] ASS exported from Aegisub
- [ ] ASS exported from SubtitleEdit

VTT:
- [ ] VTT with WEBVTT header
- [ ] VTT with identifiers
- [ ] VTT without identifiers (timestamps only)
- [ ] VTT with <v Speaker> tags - should strip
- [ ] VTT with NOTE blocks - should ignore
- [ ] VTT with STYLE blocks - should ignore
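
The NOTE and STYLE checks have no unit test elsewhere in this plan, so a concrete picture of "should ignore" may help when writing one. The sketch below splits a VTT file into blank-line-separated blocks and drops the header, NOTE, and STYLE blocks, leaving only cue blocks. The function name is hypothetical and the real `parse_vtt()` may handle these blocks differently.

```python
import re


def drop_vtt_metadata_blocks_sketch(content: str) -> list[str]:
    """Return only the cue blocks of a VTT file, as raw text."""
    normalized = content.replace("\r\n", "\n").strip()
    kept = []
    for block in re.split(r"\n\s*\n", normalized):
        first_line = block.split("\n", 1)[0].strip()
        # Header, comment, and stylesheet blocks carry no subtitle text.
        if first_line.startswith(("WEBVTT", "NOTE", "STYLE")):
            continue
        kept.append(block)
    return kept
```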

Style Application Tests

[ ] Default style (yellow text, black outline)
[ ] Anime Fansub style (verify colors match preview)
[ ] Custom style (created via style editor)
[ ] Job detail page shows style name (not ID)
[ ] Downloaded ASS has correct style definition

Translation Tests

[ ] English → Spanish (dual-language enabled)
[ ] English → French (translation only, no style)
[ ] Verify timing preserved after translation
[ ] Verify dual-language format (original above translation)

5. Performance Tests

5.1 Subtitle Import Performance

Test: Small SRT (100 segments)
- Expected time: <10 seconds
- Progress updates: Smooth (30% → 60% → 85% → 100%)

Test: Medium SRT (1000 segments)
- Expected time: <20 seconds
- Memory usage: <500MB

Test: Large SRT (10,000 segments)
- Expected time: <60 seconds
- Memory usage: <1GB
- No timeouts or crashes

5.2 Format Parsing Performance

Test: ASS with Complex Styles
- File: 1000 segments, 5 custom styles
- Expected: Parse in <15 seconds
- Verify: All styles extracted correctly

Test: VTT with Speaker Tags
- File: 1000 segments with <v Speaker> tags
- Expected: Parse in <15 seconds
- Verify: All tags removed
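
Parse time can be measured in isolation with a small timing harness, which separates parser cost from download and queue latency. The sketch below uses `time.perf_counter` and reports the best of several runs; `measure_seconds` is a hypothetical helper, and the parser to measure is passed in as a function.

```python
import time
from typing import Callable


def measure_seconds(func: Callable[[str], object], content: str, repeats: int = 3) -> float:
    """Return the fastest wall-clock time, in seconds, of func(content)."""
    best = float("inf")
    for _ in range(repeats):
        started = time.perf_counter()
        func(content)
        best = min(best, time.perf_counter() - started)
    return best
```

A check such as `assert measure_seconds(parse_vtt, vtt_content) < 15` would then express the parsing budget directly, assuming `vtt_content` holds the 1000-segment test file.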

6. Regression Tests

6.1 Video Transcription Still Works

Test: Upload video file
- Verify video upload still works
- Verify Whisper transcription runs
- Verify SRT and ASS generated
- Verify progress: 0% → 20% → 80% → 85% → 100%

6.2 Existing Features Unaffected

[ ] Custom style editor works
[ ] Style preview matches output
[ ] Job list shows all jobs
[ ] Download links work for old jobs
[ ] Dual-language translation works

7. Error Handling Tests

7.1 Worker Error Cases

Test: Missing subtitle file in storage

# Create job with valid subtitle_path
# Delete file from storage before worker runs
# Expected: Job fails with error "File not found"

Test: Corrupted subtitle file

# Upload binary file as .srt
# Expected: Job fails with error "Invalid format"

Test: Unsupported format

# Upload .txt file renamed to .srt
# Create job
# Expected: Format detected as 'srt' (default), parsing fails gracefully
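
For the corrupted-file case, one plausible way a worker could reject binary content before parsing is to check for NUL bytes and attempt a UTF-8 decode. The sketch below illustrates that idea only; it is an assumption about how "Invalid format" might be detected, not a description of the worker. It would also reject UTF-16 subtitle files, which contain NUL bytes, so treat it as a starting point.

```python
def looks_like_text_sketch(raw: bytes) -> bool:
    """Heuristic: True if raw decodes as UTF-8 text and contains no NUL bytes."""
    if b"\x00" in raw:
        return False
    try:
        raw.decode("utf-8-sig")  # also accepts a leading UTF-8 BOM
    except UnicodeDecodeError:
        return False
    return True
```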

7.2 Frontend Error Cases

Test: Network error during upload
- Disconnect network mid-upload
- Expected: Error toast shown, upload progress resets

Test: Unauthorized user
- Clear auth token
- Try to upload file
- Expected: 401 error, redirect to login

8. Accessibility Tests

[ ] FileUploader drag-and-drop works with keyboard
[ ] File type validation errors announced to screen readers
[ ] Upload progress updates announced
[ ] Job status changes announced

9. Browser Compatibility

[ ] Chrome/Edge (latest)
[ ] Firefox (latest)
[ ] Safari (latest)
[ ] Mobile Safari (iOS)
[ ] Mobile Chrome (Android)

10. Production Verification

Post-Deployment Checklist

After deploying to production, verify:

Upload Test:
[ ] Upload small SRT file (test.srt)
[ ] Verify file appears in Supabase Storage under subtitles/ folder
[ ] Verify job created with subtitle_path set

Job Processing:
[ ] Job status changes: pending → processing → complete
[ ] Progress updates: 0% → 30% → 60% → 85% → 100%
[ ] No errors in worker logs

Output Verification:
[ ] Download SRT output
[ ] Download ASS output
[ ] Verify correct style applied
[ ] Verify all segments present

Database Verification:

-- Check recent subtitle import jobs
SELECT j.id, j.status, v.title, v.file_path, v.subtitle_path
FROM subtitleai.jobs j
JOIN subtitleai.videos v ON j.video_id = v.id
WHERE v.subtitle_path IS NOT NULL
ORDER BY j.created_at DESC
LIMIT 10;

Monitoring:
[ ] No errors in application logs
[ ] Worker queue processing normally
[ ] No storage quota issues

Test Data

Sample Files

Create test files in /root/projects/subtitleai/test-data/:

test_simple.srt:

1
00:00:00,000 --> 00:00:05,000
Hello world!

2
00:00:05,500 --> 00:00:10,000
This is a test subtitle.

3
00:00:10,500 --> 00:00:15,000
Third segment here.

test_multiline.srt:

1
00:00:00,000 --> 00:00:05,000
Line one
Line two
Line three

test_sample.ass:

[Script Info]
Title: Test ASS File
ScriptType: v4.00+

[V4+ Styles]
Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding
Style: Default,Arial,48,&H00FFFFFF,&H000000FF,&H00000000,&H00000000,0,0,0,0,100,100,0,0,1,2,2,2,10,10,10,1

[Events]
Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
Dialogue: 0,0:00:00.00,0:00:05.00,Default,,0,0,0,,Hello from ASS!
Dialogue: 0,0:00:05.50,0:00:10.00,Default,,0,0,0,,{\i1}Italic text{\i0}
Dialogue: 0,0:00:10.50,0:00:15.00,Default,,0,0,0,,Line one\NLine two

test_sample.vtt:

WEBVTT

1
00:00:00.000 --> 00:00:05.000
Hello from VTT!

2
00:00:05.500 --> 00:00:10.000
<v Speaker>Speaker tagged text</v>

00:00:10.500 --> 00:00:15.000
No identifier here

Success Criteria

Phase 1 implementation is considered successful when:

✅ All unit tests pass
✅ All integration tests pass
✅ All end-to-end scenarios complete successfully
✅ No regressions in existing video transcription workflow
✅ Production deployment completes without errors
✅ All three formats (SRT, ASS, VTT) parse correctly
✅ Style application works for subtitle imports
✅ Translation works for subtitle imports
✅ Database XOR constraint enforced
✅ Worker logs show correct branching logic

Next Steps

After Phase 1 testing completes:

Phase 2: Subtitle Preview Component
- Display parsed segments before job submission
- Show segment count, duration, language detection

Phase 3: Style Comparison Visualizer
- Before/after preview with side-by-side or toggle view
- Real-time style preview without submitting job

Future Enhancements:
- Bazarr API integration (Option 2)
- Media library scanning (Option 3)
- Batch subtitle processing
- Advanced ASS style preservation for translation-only mode