SubMiner/backlog/tasks/task-25 - Add-frequency-dictionary-based-token-highlighting-with-configurable-top-X-and-color-ramp.md
2026-02-15 22:48:56 -08:00


id: TASK-25
title: Add frequency-dictionary-based token highlighting with configurable top-X and color ramp
status: Done
assignee:
created_date: 2026-02-13 16:47
updated_date: 2026-02-16 06:48
labels:
dependencies:
documentation:
  • /Users/sudacode/.codex/worktrees/2089/SubMiner/docs/configuration.md
  • /Users/sudacode/.codex/worktrees/2089/SubMiner/docs/jlpt-vocab-bundle.md
priority: high

Description

Use user-installed frequency dictionaries to color subtitle tokens by word frequency rank. Two behaviors are configurable: one shared color for every word ranked within the threshold, or a multi-color ramp that maps frequency bands to colors. The feature should support a configurable top-X cutoff (only the X most frequent words are eligible for coloring) and integrate with the existing subtitle rendering flow.
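
A minimal sketch of how the configuration described above might be shaped. Only `sourcePath` and `topX` appear in this task record; every other name, the default values, and the colors are hypothetical and are not taken from the SubMiner codebase.

```typescript
// Hypothetical config shape for frequency-based highlighting.
// Only `sourcePath` and `topX` are named in the task record; the rest is assumed.
interface FrequencyHighlightConfig {
  enabled: boolean;          // feature flag, disabled by default
  sourcePath: string | null; // user-installed frequency dictionary, if any
  topX: number;              // only ranks 1..topX are eligible for coloring
  mode: "single" | "banded";
  singleColor: string;       // used when mode is "single"
  bandedColors: string[];    // ordered from most frequent band to least frequent
}

// Assumed defaults, for illustration only.
const DEFAULT_FREQUENCY_CONFIG: FrequencyHighlightConfig = {
  enabled: false,
  sourcePath: null,
  topX: 1000,
  mode: "single",
  singleColor: "#f5a97f",
  bandedColors: ["#ed8796", "#f5a97f", "#eed49f", "#a6da95", "#8aadf4"],
};

// Merge user overrides onto the defaults and repair unusable values.
function mergeFrequencyConfig(
  user: Partial<FrequencyHighlightConfig>,
): FrequencyHighlightConfig {
  const merged: FrequencyHighlightConfig = { ...DEFAULT_FREQUENCY_CONFIG, ...user };
  // A threshold must be a positive integer; otherwise fall back to the default.
  if (!Number.isInteger(merged.topX) || merged.topX < 1) {
    merged.topX = DEFAULT_FREQUENCY_CONFIG.topX;
  }
  // Banded mode needs at least one color.
  if (!Array.isArray(merged.bandedColors) || merged.bandedColors.length === 0) {
    merged.bandedColors = [...DEFAULT_FREQUENCY_CONFIG.bandedColors];
  }
  return merged;
}
```

Keeping `enabled: false` in the defaults means an empty user config leaves the feature off.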

Acceptance Criteria

  • #1 Add a feature flag and configuration for frequency-based highlighting, disabled by default.
  • #2 Support selecting a user-installed frequency dictionary source and reading word frequency data from it.
  • #3 Introduce a configurable top-X threshold in config for which words are eligible for frequency-based coloring.
  • #4 When single-color mode is enabled, all matched words within the top-X threshold use the configured color.
  • #5 When multi-color mode is enabled, map frequency bands to colors and color tokens by their actual rank bucket.
  • #6 Ensure matching is token-aware (normalization/lowercasing handling) and preserves existing subtitle tokenization behavior.
  • #7 Handle missing/unsupported dictionary formats and unknown words with deterministic no-highlight fallback.
  • #8 Render underline/token highlights without breaking subtitle layout or interactions.
  • #9 Add tests/verification for: single-color mode, color-band mode, threshold boundary, and disabled mode.
  • #10 Document dictionary source format expectations, configuration example, and performance impact of ranking lookups.
  • #11 If full automatic discovery of user-installed frequency dictionaries is not possible, provide a clear configuration workflow or fallback path.
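
Criteria #3 through #7 can be illustrated with one small color-resolution function. This is a sketch under stated assumptions, not the shipped implementation. The names `HighlightOptions`, `normalizeToken`, and `resolveFrequencyColor` are hypothetical. It assumes the dictionary has already been loaded into a map from normalized word to 1-based rank, and that bands are equal-width slices of the range 1..topX.

```typescript
// Hypothetical options; field names are assumptions for illustration.
interface HighlightOptions {
  enabled: boolean;
  topX: number;
  mode: "single" | "banded";
  singleColor: string;
  bandedColors: string[]; // most frequent band first
}

// Assumed normalization for lookup keys only; the token text itself is untouched.
// The rank map's keys must be normalized the same way when it is built.
function normalizeToken(token: string): string {
  return token.normalize("NFKC").trim().toLowerCase();
}

// Returns a color for the token, or null for the no-highlight fallback.
function resolveFrequencyColor(
  token: string,
  ranks: ReadonlyMap<string, number>,
  opts: HighlightOptions,
): string | null {
  if (!opts.enabled) return null; // disabled mode
  const rank = ranks.get(normalizeToken(token));
  // Unknown words and malformed ranks fall back deterministically to no highlight.
  if (rank === undefined || !Number.isInteger(rank) || rank < 1) return null;
  if (rank > opts.topX) return null; // outside the top-X threshold
  if (opts.mode === "single") return opts.singleColor;
  const bands = opts.bandedColors.length;
  if (bands === 0) return null;
  // Equal-width bands over 1..topX; rank 1 maps to band 0, rank topX to the last band.
  const index = Math.min(bands - 1, Math.floor(((rank - 1) * bands) / opts.topX));
  return opts.bandedColors[index] ?? null;
}
```

With `topX` set to 100 and four band colors, ranks 1 to 25 fall in the first band, rank 26 starts the second band, rank 100 lands in the last band, and rank 101 is not highlighted. Because lookup is a single map access per token, the cost stays constant per token once the dictionary is loaded.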

Implementation Notes

2026-02-16: Updated docs for frequency dictionary behavior. Clarified built-in fallback, precedence, and shared format expectations in the docs.

Added docs references for frequency dictionary defaults and fallback behavior.

As of 2026-02-16, docs and implementation are considered complete for TASK-25. The following are present: frequency highlighting fallback, custom sourcePath precedence, topX, single and banded modes, token pipeline integration, and fallback behavior. Documentation and tests exist in src/core/services and src/renderer.
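
The precedence between a custom `sourcePath` and the built-in fallback could look roughly like the sketch below. The function name, the `builtInPath` parameter, and the rule that an unusable custom path falls through to the built-in dictionary are assumptions; the record only states that custom `sourcePath` takes precedence and that a built-in fallback exists.

```typescript
// Hypothetical helper: choose which frequency dictionary to load.
// `exists` is injected so the precedence rule can be tested without touching the disk.
function pickDictionaryPath(
  sourcePath: string | null | undefined,
  builtInPath: string | null,
  exists: (path: string) => boolean,
): string | null {
  const custom = sourcePath?.trim();
  // A configured, present custom dictionary wins.
  if (custom && exists(custom)) return custom;
  // Assumed behavior: otherwise fall back to the built-in dictionary, if present.
  if (builtInPath && exists(builtInPath)) return builtInPath;
  // No usable dictionary: the caller should render without highlights.
  return null;
}
```

In a Node or Electron main process, `exists` could be `fs.existsSync`.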

2026-02-16: Frequency-dictionary highlighting feature fully complete and shipped. Task acceptance criteria, DoD, and docs alignment are all marked complete in this task record.

Definition of Done

  • #1 Frequency-based highlighting renders valid matches using either a single color or banded colors, with a configurable top-X threshold and documented setup.