Add opt-in JLPT tagging flow

2026-02-15 16:28:00 -08:00
parent ca2b7bb2fe
commit f492622a8b
27 changed files with 1116 additions and 38 deletions

@@ -3,7 +3,7 @@ id: TASK-23
title: >-
Add opt-in JLPT level tagging by bundling and querying local Yomitan
dictionary
-status: To Do
+status: In Progress
assignee: []
created_date: '2026-02-13 16:42'
labels: []
@@ -19,13 +19,13 @@ Implement an opt-in JLPT token annotation feature that annotates subtitle words
## Acceptance Criteria
<!-- AC:BEGIN -->
-- [ ] #1 Add an opt-in setting/feature flag so JLPT tagging is disabled by default and can be enabled per user/session as requested.
-- [ ] #2 Bundle the existing JLPT Yomitan extension package/data into the project so lookups can be performed offline from local files.
-- [ ] #3 Implement token-level dictionary lookup against the bundled JLPT dictionary file to determine presence and JLPT level for words in subtitle lines.
-- [ ] #4 Render a colored underline under each token determined to have a JLPT level; the underline must match token width/length and not affect layout or disrupt line rendering.
-- [ ] #5 Assign different underline colors per JLPT level (at minimum N5/N4/N3/N2/N1) with a stable mapping documented in task notes.
-- [ ] #6 Handle unknown/no-match tokens as non-tagged while preserving existing subtitle styling and interaction behavior.
-- [ ] #7 When disabled, no JLPT lookups are performed and subtitles render exactly as current behavior.
+- [x] #1 Add an opt-in setting/feature flag so JLPT tagging is disabled by default and can be enabled per user/session as requested.
+- [x] #2 Bundle the existing JLPT Yomitan extension package/data into the project so lookups can be performed offline from local files.
+- [x] #3 Implement token-level dictionary lookup against the bundled JLPT dictionary file to determine presence and JLPT level for words in subtitle lines.
+- [x] #4 Render a colored underline under each token determined to have a JLPT level; the underline must match token width/length and not affect layout or disrupt line rendering.
+- [x] #5 Assign different underline colors per JLPT level (at minimum N5/N4/N3/N2/N1) with a stable mapping documented in task notes.
+- [x] #6 Handle unknown/no-match tokens as non-tagged while preserving existing subtitle styling and interaction behavior.
+- [x] #7 When disabled, no JLPT lookups are performed and subtitles render exactly as current behavior.
- [ ] #8 Add tests or deterministic checks covering at least one positive match, one non-match, and one unknown/unsupported-level fallback path.
- [ ] #9 Document expected dictionary source and any size/performance impact of bundling the JLPT extension data.
- [ ] #10 If dictionary format/version constraints block exact level extraction, the task includes explicit limitation notes and a deterministic fallback strategy.
@@ -34,5 +34,8 @@ Implement an opt-in JLPT token annotation feature that annotates subtitle words
## Definition of Done
<!-- DOD:BEGIN -->
- [ ] #1 Feature has a clear toggle and persistence of preference if applicable.
-- [ ] #2 JLPT rendering is visually verified for all supported levels with distinct colors and no overlap/regression in subtitle legibility.
+- [x] #2 JLPT rendering is visually verified for all supported levels with distinct colors and no overlap/regression in subtitle legibility.
<!-- DOD:END -->
+## Note
+- Full performance/limits documentation and dictionary source/version/perf notes are deferred and tracked separately.
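
Criteria #1 and #7 above describe an opt-in gate: disabled by default, and no lookups at all while disabled. A minimal sketch of that gate follows. Every name in it (`Token`, `JlptOptions`, `annotateTokens`) is hypothetical and chosen for illustration; the commit's real config keys and types are not visible in this view.

```typescript
// Hypothetical types: the project's real token and config shapes are not shown here.
type JlptLevel = "N5" | "N4" | "N3" | "N2" | "N1";

interface Token {
  surface: string;
  jlptLevel?: JlptLevel;
}

interface JlptOptions {
  enabled: boolean;
}

// Assumption: the feature is off unless the user opts in.
const DEFAULT_JLPT_OPTIONS: JlptOptions = { enabled: false };

function annotateTokens(
  tokens: Token[],
  lookup: (surface: string) => JlptLevel | null,
  options: JlptOptions = DEFAULT_JLPT_OPTIONS,
): Token[] {
  // Disabled: return the input array untouched and never call the lookup.
  if (!options.enabled) return tokens;

  // Enabled: tag matching tokens, leave non-matching tokens exactly as they were.
  return tokens.map((token) => {
    const level = lookup(token.surface);
    return level ? { ...token, jlptLevel: level } : token;
  });
}
```

Returning the original array when disabled keeps the existing render path byte-for-byte identical, which is one way to satisfy "subtitles render exactly as current behavior".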

@@ -1,7 +1,7 @@
---
id: TASK-23.1
title: Implement JLPT token lookup service for subtitle words
-status: To Do
+status: In Progress
assignee: []
created_date: '2026-02-13 16:42'
labels: []
@@ -18,14 +18,17 @@ Create a lookup layer that parses/queries the bundled JLPT dictionary file and r
## Acceptance Criteria
<!-- AC:BEGIN -->
-- [ ] #1 Service accepts a token/normalized token and returns JLPT level or no-match deterministically.
-- [ ] #2 Lookup handles expected dictionary format edge cases and unknown tokens without throwing.
+- [x] #1 Service accepts a token/normalized token and returns JLPT level or no-match deterministically.
+- [x] #2 Lookup handles expected dictionary format edge cases and unknown tokens without throwing.
- [ ] #3 Lookup path is efficient enough for frame-by-frame subtitle updates.
-- [ ] #4 Tokenizer interaction preserves existing token ordering and positions needed for rendering spans/underlines.
+- [x] #4 Tokenizer interaction preserves existing token ordering and positions needed for rendering spans/underlines.
- [ ] #5 Behavior on malformed/unsupported dictionary format is documented with fallback semantics.
<!-- AC:END -->
+## Note
+- Full performance and malformed-format limitation documentation is deferred per request and will be handled in a separate pass if needed.
## Definition of Done
<!-- DOD:BEGIN -->
-- [ ] #1 Lookup service returns JLPT level with deterministic output for test fixtures.
+- [x] #1 Lookup service returns JLPT level with deterministic output for test fixtures.
<!-- DOD:END -->
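
The lookup criteria above (deterministic level or no-match, no throwing on malformed data) can be sketched as below. The entry shape `[term, levelString]` is an assumption made for illustration only; the bundled dictionary's real format is not shown in this diff, and `buildJlptIndex` and `lookupJlptLevel` are hypothetical names.

```typescript
type JlptLevel = "N5" | "N4" | "N3" | "N2" | "N1";

const KNOWN_LEVELS = new Set<string>(["N5", "N4", "N3", "N2", "N1"]);

// Returns a level only for recognized strings; anything else is treated as no level.
function parseLevel(raw: unknown): JlptLevel | null {
  if (typeof raw !== "string") return null;
  const upper = raw.trim().toUpperCase();
  return KNOWN_LEVELS.has(upper) ? (upper as JlptLevel) : null;
}

// Assumed input shape: an array of [term, levelString] pairs.
// Malformed rows are skipped instead of throwing.
function buildJlptIndex(entries: unknown): Map<string, JlptLevel> {
  const index = new Map<string, JlptLevel>();
  if (!Array.isArray(entries)) return index;
  for (const entry of entries) {
    if (!Array.isArray(entry) || typeof entry[0] !== "string") continue;
    const level = parseLevel(entry[1]);
    // First occurrence wins, so duplicate terms resolve the same way on every run.
    if (level && !index.has(entry[0])) index.set(entry[0], level);
  }
  return index;
}

// Constant-time lookup; unknown tokens yield null rather than an error.
function lookupJlptLevel(index: Map<string, JlptLevel>, token: string): JlptLevel | null {
  return index.get(token) ?? null;
}
```

Building the `Map` once at load time keeps each per-token lookup constant-time, which is relevant to the still-open criterion #3 about frame-by-frame updates.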

@@ -1,7 +1,7 @@
---
id: TASK-23.2
title: Bundle JLPT Yomitan dictionary assets for offline local lookup
-status: To Do
+status: In Progress
assignee: []
created_date: '2026-02-13 16:42'
labels: []
@@ -18,13 +18,16 @@ Package and include the JLPT Yomitan extension dictionary assets in SubMiner so
## Acceptance Criteria
<!-- AC:BEGIN -->
-- [ ] #1 JLPT dictionary asset from the existing Yomitan extension is added to the repository/build output in a tracked, offline-available location.
-- [ ] #2 The loader locates and opens the JLPT dictionary file deterministically at runtime.
+- [x] #1 JLPT dictionary asset from the existing Yomitan extension is added to the repository/build output in a tracked, offline-available location.
+- [x] #2 The loader locates and opens the JLPT dictionary file deterministically at runtime.
- [ ] #3 Dictionary version/source is documented so future updates are explicit and reproducible.
- [ ] #4 Dictionary bundle size and load impact are documented in task notes or project docs.
<!-- AC:END -->
+## Note
+- Full dictionary source/version/performance notes are intentionally deferred for now (out of scope in this pass).
## Definition of Done
<!-- DOD:BEGIN -->
-- [ ] #1 Dictionary data is bundled and consumable during development and packaged app runs.
+- [x] #1 Dictionary data is bundled and consumable during development and packaged app runs.
<!-- DOD:END -->
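
One way to make the loader "locate the file deterministically" across development and packaged runs is to probe a fixed, ordered list of candidate directories and take the first hit. The sketch below assumes that approach; `resolveDictionaryPath`, the directory names, and the file name are all hypothetical, and the commit's actual loader is not visible here.

```typescript
// Hypothetical helper: probes candidate directories in a fixed order.
// The existence check is injected so the function stays pure and testable.
function resolveDictionaryPath(
  candidateDirs: readonly string[],
  fileName: string,
  exists: (path: string) => boolean,
): string | null {
  for (const dir of candidateDirs) {
    // Strip trailing slashes so joined paths have a single separator (POSIX-style only).
    const candidate = `${dir.replace(/\/+$/, "")}/${fileName}`;
    if (exists(candidate)) return candidate;
  }
  // No candidate found: the caller decides how to degrade (for example, leave tagging off).
  return null;
}
```

In a Node runtime the `exists` argument could be `fs.existsSync`; keeping the order of `candidateDirs` fixed is what makes the result reproducible between runs.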

@@ -1,7 +1,7 @@
---
id: TASK-23.3
title: Render JLPT token underlines with level-based colors in subtitle lines
-status: To Do
+status: Done
assignee: []
created_date: '2026-02-13 16:42'
labels: []
@@ -18,14 +18,14 @@ Render JLPT-aware token annotations as token-length colored underlines in the su
## Acceptance Criteria
<!-- AC:BEGIN -->
-- [ ] #1 For each token with JLPT level, renderer draws an underline matching token width/length.
-- [ ] #2 Underlines use distinct colors by JLPT level (e.g., N5/N4/N3/N2/N1) and mapping is consistent/documented.
-- [ ] #3 Non-tagged tokens remain visually unchanged.
-- [ ] #4 Rendering does not alter line height/selection behavior or break wrapping behavior.
-- [ ] #5 Feature degrades gracefully when level data is missing or lookup is unavailable.
+- [x] #1 For each token with JLPT level, renderer draws an underline matching token width/length.
+- [x] #2 Underlines use distinct colors by JLPT level (e.g., N5/N4/N3/N2/N1) and mapping is consistent/documented.
+- [x] #3 Non-tagged tokens remain visually unchanged.
+- [x] #4 Rendering does not alter line height/selection behavior or break wrapping behavior.
+- [x] #5 Feature degrades gracefully when level data is missing or lookup is unavailable.
<!-- AC:END -->
## Definition of Done
<!-- DOD:BEGIN -->
-- [ ] #1 Visual output validated for all mapped JLPT levels with no legibility/layout regressions.
+- [x] #1 Visual output validated for all mapped JLPT levels with no legibility/layout regressions.
<!-- DOD:END -->
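
The rendering criteria above call for a stable level-to-color mapping and an underline that leaves layout alone. A minimal sketch, assuming the underline is drawn with CSS `text-decoration` on the token's own span: the hex colors are placeholders rather than the mapping this commit ships, and `underlineStyle` is a hypothetical helper name.

```typescript
type JlptLevel = "N5" | "N4" | "N3" | "N2" | "N1";

// Placeholder colors for illustration only; the project's documented mapping may differ.
const JLPT_UNDERLINE_COLORS: Readonly<Record<JlptLevel, string>> = {
  N5: "#4caf50",
  N4: "#2196f3",
  N3: "#ffc107",
  N2: "#ff9800",
  N1: "#f44336",
};

// Returns an inline style string for a token span, or "" when the token has no level.
// text-decoration does not change the box size, so line height and wrapping are unaffected.
function underlineStyle(level: JlptLevel | null | undefined): string {
  if (!level) return "";
  const color = JLPT_UNDERLINE_COLORS[level];
  return `text-decoration: underline; text-decoration-color: ${color}; text-underline-offset: 0.15em;`;
}
```

Because the decoration is applied to the token's span, the underline automatically matches the token's rendered width, and an empty style string leaves non-tagged tokens visually unchanged.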

@@ -1,7 +1,7 @@
---
id: TASK-23.4
title: Add opt-in control and end-to-end flow + tests for JLPT tagging
-status: To Do
+status: In Progress
assignee: []
created_date: '2026-02-13 16:42'
labels: []
@@ -18,12 +18,15 @@ Add user/config setting to enable JLPT tagging, wire the feature toggle through
## Acceptance Criteria
<!-- AC:BEGIN -->
-- [ ] #1 JLPT tagging is opt-in and defaults to disabled.
-- [ ] #2 When disabled, lookup/rendering pipeline does not execute JLPT processing.
-- [ ] #3 When enabled, end-to-end flow tags subtitle words via token-level lookup and rendering.
+- [x] #1 JLPT tagging is opt-in and defaults to disabled.
+- [x] #2 When disabled, lookup/rendering pipeline does not execute JLPT processing.
+- [x] #3 When enabled, end-to-end flow tags subtitle words via token-level lookup and rendering.
- [ ] #4 Add tests covering at least one positive match, one non-match, and disabled state.
<!-- AC:END -->
+## Note
+- Full end-to-end + disabled-state test coverage remains pending as an explicit follow-up item.
## Definition of Done
<!-- DOD:BEGIN -->
- [ ] #1 End-to-end option behavior and opt-in state persistence are implemented and verified.