mirror of
https://github.com/ksyasuda/SubMiner.git
synced 2026-02-27 18:22:41 -08:00
| id | title | status | assignee | created_date | labels | dependencies | priority |
|---|---|---|---|---|---|---|---|
| TASK-25 | Add frequency-dictionary-based token highlighting with configurable top-X and color ramp | To Do | | 2026-02-13 16:47 | | | high |
Description
Leverage user-installed frequency dictionaries to color subtitle tokens by word frequency rank. Behavior is configurable: either one shared color for all words within a rank threshold, or a multi-color mapping based on frequency bands. The feature should support a configurable top-X cutoff (only the X most frequent words are eligible) and integrate with the existing subtitle rendering flow.
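As a rough illustration of the two modes described above, the sketch below resolves a highlight color from a frequency rank. Every name here (`FrequencyHighlightConfig`, `topX`, `bands`, `resolveColor`) and every color value is hypothetical and is not taken from SubMiner's actual config schema. It assumes rank 1 is the most frequent word and that each band covers ranks up to and including its `maxRank`.

```typescript
// Hypothetical config shape; field names are illustrative only.
type FrequencyHighlightConfig = {
  enabled: boolean; // feature flag, expected to default to false
  topX: number; // only ranks 1..topX are eligible for coloring
  mode: "single" | "banded";
  singleColor: string;
  // Each band covers ranks up to and including maxRank.
  bands: { maxRank: number; color: string }[];
};

// Returns a CSS color for the given rank, or null for "no highlight".
function resolveColor(
  rank: number | undefined,
  cfg: FrequencyHighlightConfig,
): string | null {
  if (!cfg.enabled || rank === undefined) return null;
  if (!Number.isInteger(rank) || rank < 1 || rank > cfg.topX) return null;
  if (cfg.mode === "single") return cfg.singleColor;
  // Sort a copy so the result does not depend on the order bands were written in.
  const sorted = [...cfg.bands].sort((a, b) => a.maxRank - b.maxRank);
  const band = sorted.find((b) => rank <= b.maxRank);
  return band ? band.color : null;
}

// Example values only; not defaults from the project.
const exampleConfig: FrequencyHighlightConfig = {
  enabled: true,
  topX: 5000,
  mode: "banded",
  singleColor: "#f5a97f",
  bands: [
    { maxRank: 1000, color: "#a6da95" },
    { maxRank: 5000, color: "#eed49f" },
  ],
};
```

Returning `null` for every non-matching case keeps the "no highlight" path in one place, so the renderer only needs to check a single value.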
Acceptance Criteria
- #1 Add a feature flag and configuration for frequency-based highlighting with default disabled state.
- #2 Support selecting a user-installed frequency dictionary source and reading word frequency data from it.
- #3 Introduce a configurable top-X threshold in config for which words are eligible for frequency-based coloring.
- #4 When single-color mode is enabled, all matched words within the top-X threshold use the single configured color.
- #5 When multi-color mode is enabled, map frequency bands to colors and color tokens by their actual rank bucket.
- #6 Ensure matching is token-aware (handles normalization and lowercasing) and preserves existing subtitle tokenization behavior.
- #7 Handle missing/unsupported dictionary formats and unknown words with deterministic no-highlight fallback.
- #8 Render underline/token highlights without breaking subtitle layout or interactions.
- #9 Add tests/verification for: single-color mode, color-band mode, threshold boundary, and disabled mode.
- #10 Document dictionary source format expectations, configuration example, and performance impact of ranking lookups.
- #11 If full automatic discovery of user-installed frequency dictionaries is not possible, provide a clear configuration workflow or fallback path.
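One possible way to approach criteria #6 and #7 is sketched below: dictionary data is loaded into a rank index, tokens are normalized before lookup, and anything malformed or unknown falls through to "no rank". The input format (an array of `[term, rank]` pairs) and the helper names `normalizeToken`, `buildRankIndex`, and `lookupRank` are assumptions made for illustration. Real frequency dictionary formats differ and would need their own parsing.

```typescript
// Normalize a token so lookups ignore case and Unicode width/compatibility forms.
function normalizeToken(token: string): string {
  return token.normalize("NFKC").trim().toLowerCase();
}

// Build a rank index from raw dictionary data.
// Assumed input: an array of [term, rank] pairs (hypothetical format).
// Unsupported input yields an empty index, so nothing gets highlighted.
function buildRankIndex(raw: unknown): Map<string, number> {
  const index = new Map<string, number>();
  if (!Array.isArray(raw)) return index;
  for (const entry of raw) {
    // Skip malformed entries instead of throwing.
    if (!Array.isArray(entry) || entry.length < 2) continue;
    const [term, rank] = entry;
    if (typeof term !== "string" || typeof rank !== "number") continue;
    if (!Number.isInteger(rank) || rank < 1) continue;
    const key = normalizeToken(term);
    if (key === "") continue;
    // If two terms normalize to the same key, keep the better (lower) rank.
    const existing = index.get(key);
    if (existing === undefined || rank < existing) index.set(key, rank);
  }
  return index;
}

// Returns the rank for a token, or undefined for unknown words.
function lookupRank(
  index: Map<string, number>,
  token: string,
): number | undefined {
  return index.get(normalizeToken(token));
}
```

Building the `Map` once at load time means each per-token lookup is an average constant-time operation, which is relevant to the performance note requested in #10.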
Definition of Done
- #1 Frequency-based highlighting renders using either single-color or banded colors for valid matches, with a configurable top-X threshold and documented setup.