mirror of
https://github.com/ksyasuda/SubMiner.git
synced 2026-02-28 06:22:45 -08:00
| id | title | status | assignee | created_date | labels | dependencies | priority |
|---|---|---|---|---|---|---|---|
| TASK-41 | Add real-time sentence difficulty scoring with auto-pause on hard lines | To Do | | 2026-02-14 02:09 | | | medium |
Description
Compute a real-time difficulty score for each subtitle line using JLPT level data (TASK-23) and frequency dictionary data (TASK-25), and use this score to drive smart playback features.
Motivation
Learners at different levels have different needs. An N4 learner may want to pause on lines at N2 or harder, while an advanced learner may want to skip easy content. A per-line difficulty score enables intelligent playback that adapts to the learner's level.
Features
- Per-line difficulty score: Combine JLPT levels and frequency ranks of tokens to produce a composite difficulty score (e.g., 1-5 scale or JLPT-equivalent label)
- Visual difficulty indicator: Subtle color/icon on each subtitle line indicating difficulty
- Auto-pause on difficult lines: Pause playback when a line's score exceeds the user's configured difficulty threshold
- Per-episode difficulty rating: Average difficulty across all lines, shown in the episode browser (TASK-34)
- Difficulty trend within a video: Show whether difficulty increases/decreases over the episode (useful for detecting climax scenes with complex dialogue)
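The per-episode rating and the difficulty trend listed above could both be derived from the array of per-line scores. The following is a minimal sketch, not SubMiner code: `summarizeEpisode`, `EpisodeDifficulty`, and the `flatTolerance` parameter are hypothetical names, and using a least-squares slope as the trend measure is an assumption rather than something this task specifies.

```typescript
// Hypothetical summary shape for the episode browser.
interface EpisodeDifficulty {
  average: number;                       // mean of all per-line scores
  slope: number;                         // estimated score change from first to last line
  trend: "rising" | "falling" | "flat";
}

// Summarizes per-line scores (given in playback order).
// Returns null for an episode with no scored lines.
function summarizeEpisode(
  scores: number[],
  flatTolerance = 0.1,
): EpisodeDifficulty | null {
  const n = scores.length;
  if (n === 0) return null;

  const average = scores.reduce((sum, s) => sum + s, 0) / n;

  // Least-squares slope of score against line position normalized to [0, 1],
  // so the slope reads as "score change across the whole episode".
  let slope = 0;
  if (n > 1) {
    const meanX = 0.5;
    let numerator = 0;
    let denominator = 0;
    for (let i = 0; i < n; i++) {
      const dx = i / (n - 1) - meanX;
      numerator += dx * (scores[i] - average);
      denominator += dx * dx;
    }
    slope = numerator / denominator;
  }

  const trend =
    slope > flatTolerance ? "rising" : slope < -flatTolerance ? "falling" : "flat";
  return { average, slope, trend };
}
```

A single slope only captures a steady drift. Spotting a localized spike such as a climax scene would likely need a windowed average instead.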
Scoring algorithm (suggested)
- For each token in a line, look up its JLPT level (mapped to a 1-5 scale, N5 = 1 through N1 = 5) and its frequency rank
- Weight unknown words (not in Anki known-word cache from TASK-24) more heavily
- Composite score = weighted average of token difficulties, with a bonus for line length and grammar complexity
- Configurable weights so users can tune sensitivity
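The suggested algorithm above can be sketched roughly as follows. Everything here is illustrative: the `Token` and `ScoringWeights` shapes, the default weight values, and the log-scale mapping from frequency rank to a 1-5 difficulty are assumptions, not existing SubMiner types or tuned values. Grammar complexity is left out because the task does not say how it would be measured.

```typescript
type JlptLevel = "N5" | "N4" | "N3" | "N2" | "N1";

// Hypothetical token shape; the real tokenizer output may differ.
interface Token {
  surface: string;
  jlpt?: JlptLevel;       // undefined when the token is absent from the JLPT data
  frequencyRank?: number; // 1 = most frequent; undefined when not in the dictionary
  known: boolean;         // true when found in the known-word cache
}

// Hypothetical user-tunable weights.
interface ScoringWeights {
  jlpt: number;
  frequency: number;
  unknownWord: number;         // multiplier applied to tokens the user does not know
  lengthBonusPerToken: number;
  maxLengthBonus: number;
}

const DEFAULT_WEIGHTS: ScoringWeights = {
  jlpt: 0.6,
  frequency: 0.4,
  unknownWord: 2,
  lengthBonusPerToken: 0.05,
  maxLengthBonus: 0.5,
};

const JLPT_DIFFICULTY: Record<JlptLevel, number> = { N5: 1, N4: 2, N3: 3, N2: 4, N1: 5 };

const clamp = (x: number, lo: number, hi: number): number =>
  Math.min(hi, Math.max(lo, x));

function tokenDifficulty(token: Token, w: ScoringWeights): number {
  // Assumption: a token missing from a data source counts as maximally hard there.
  const jlpt = token.jlpt ? JLPT_DIFFICULTY[token.jlpt] : 5;
  // Assumption: rank 1 maps to difficulty 1 and rank 100000 or rarer maps to 5.
  const freq =
    token.frequencyRank !== undefined
      ? clamp(1 + 0.8 * Math.log10(token.frequencyRank), 1, 5)
      : 5;
  const sum = w.jlpt + w.frequency;
  if (sum <= 0) return (jlpt + freq) / 2; // fall back to a plain mean
  return (w.jlpt * jlpt + w.frequency * freq) / sum;
}

// Returns a composite difficulty on a 1-5 scale for one subtitle line.
function scoreLine(tokens: Token[], w: ScoringWeights = DEFAULT_WEIGHTS): number {
  if (tokens.length === 0) return 1;
  let weighted = 0;
  let totalWeight = 0;
  for (const token of tokens) {
    const weight = token.known ? 1 : w.unknownWord;
    weighted += weight * tokenDifficulty(token, w);
    totalWeight += weight;
  }
  const lengthBonus = Math.min(w.maxLengthBonus, w.lengthBonusPerToken * tokens.length);
  return clamp(weighted / totalWeight + lengthBonus, 1, 5);
}
```

Passing the weights as a parameter keeps the function pure, which makes it straightforward to cache results and to re-score when the user changes settings.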
Design constraints
- Scoring must run synchronously during subtitle rendering without perceptible latency
- Score computation should be cached per subtitle line (lines repeat on seeks/replays)
- Auto-pause should be debounced to avoid rapid pause/unpause on sequential hard lines
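The caching and debouncing constraints above might look something like this in practice. This is a sketch under stated assumptions: `createCachedScorer` and `AutoPauseGate` are hypothetical names, the cache is keyed by the raw subtitle text, and "debounced" is interpreted as a cooldown window after each pause rather than a trailing-edge debounce.

```typescript
// Hypothetical helper: memoizes a scoring function by subtitle text so that
// repeated lines (seeks, replays) are scored only once.
function createCachedScorer(
  score: (text: string) => number,
  maxEntries = 2000,
): (text: string) => number {
  const cache = new Map<string, number>();
  return (text: string): number => {
    const hit = cache.get(text);
    if (hit !== undefined) return hit;
    const value = score(text);
    if (cache.size >= maxEntries) {
      // Map iterates in insertion order, so the first key is the oldest entry.
      const oldest = cache.keys().next().value;
      if (oldest !== undefined) cache.delete(oldest);
    }
    cache.set(text, value);
    return value;
  };
}

// Hypothetical gate: allows a pause only when the score exceeds the threshold
// and no pause was triggered within the last `cooldownMs` milliseconds.
class AutoPauseGate {
  private lastPauseAt = -Infinity;
  private readonly threshold: number;
  private readonly cooldownMs: number;
  private readonly now: () => number;

  // `now` is injectable so the gate can be tested without real timers.
  constructor(threshold: number, cooldownMs: number, now: () => number = Date.now) {
    this.threshold = threshold;
    this.cooldownMs = cooldownMs;
    this.now = now;
  }

  shouldPause(score: number): boolean {
    if (score <= this.threshold) return false;
    const t = this.now();
    if (t - this.lastPauseAt < this.cooldownMs) return false;
    this.lastPauseAt = t;
    return true;
  }
}
```

With a cooldown, the first hard line in a run pauses playback and the lines immediately after it play through. Whether that is the desired behavior, or whether every hard line should pause once the user resumes, is a product decision this task leaves open.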
Acceptance Criteria
- #1 Each subtitle line receives a difficulty score based on JLPT and frequency data.
- #2 A visual indicator shows per-line difficulty in the overlay.
- #3 Auto-pause triggers when a line exceeds the user's configured difficulty threshold.
- #4 Difficulty scoring does not add perceptible latency to subtitle rendering.
- #5 Per-episode average difficulty is available for the episode browser.
- #6 Scoring weights are configurable in settings.