---
id: TASK-332
title: Fix subtitle frequency annotation missing ranks shown in Yomitan popup
status: Done
assignee:
  - Codex
created_date: '2026-05-04 03:29'
updated_date: '2026-05-04 03:41'
labels:
  - bug
  - tokenizer
dependencies: []
priority: medium
---

## Description

Subtitle frequency highlighting can miss a token even when the Yomitan popup shows a rank within the configured threshold. Reproduced with `第二走者とアンカーは\n中継地点に速やかに移動!`: Yomitan popup shows `第二` JPDB rank 1820, but SubMiner tokenizer output has no `frequencyRank` for `第二`, so renderer cannot annotate it.

## Acceptance Criteria

- [x] #1 `第二` in `第二走者とアンカーは\n中継地点に速やかに移動!` receives the Yomitan rank shown by the popup when frequency highlighting is enabled.
- [x] #2 Regression test covers the Yomitan scan/frequency ingestion path for exact popup-derived ranks.
- [x] #3 Existing tokenizer frequency tests continue to pass.

## Implementation Plan

1. Reproduce and inspect the missing `第二` rank path with tokenizer probes and focused tests.
2. Preserve exact Yomitan scan frequency ranks when the matching frequency entry omits reading metadata but has the same exact term.
3. Allow ranked ordinal prefix-noun tokens (`第` + numeric noun, e.g. `第二`) through annotation POS filtering while keeping standalone prefixes excluded.
4. Verify with focused tokenizer/runtime/annotation tests, typecheck, changelog lint, and a live-style Yomitan profile probe.

## Implementation Notes

Root-cause probe against temp copy of Yomitan profile: tokenizer returns no `frequencyRank` for `第二`; renderer config `topX` is 10000, so render threshold is not the blocker.

User approved implementation plan on 2026-05-04.

Verification: `bun test src/core/services/tokenizer.test.ts src/core/services/tokenizer/yomitan-parser-runtime.test.ts src/core/services/tokenizer/annotation-stage.test.ts` passed (192 tests).

Verification: `bun run typecheck` passed.

Verification: `bun run changelog:lint` passed.

Verification: `bun run get-frequency:electron -- --yomitan-user-data /tmp/subminer-yomitan-probe-909423 "第二走者とアンカーは\\n中継地点に速やかに移動!"` produced `第二` with `frequencyRank: 1820`.

Finalization check: implementation plan updated to reflect the discovered POS-filter root cause and completed solution.

## Final Summary

Fixed subtitle frequency annotation for `第二` by allowing ranked ordinal prefix-noun compounds through annotation POS filtering. Also made scan rank matching tolerate exact frequency entries where one side omits reading metadata. Verified with tokenizer/runtime/annotation tests, typecheck, changelog lint, and a live-style Yomitan profile probe showing `第二` now receives `frequencyRank` 1820.
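
For reference, the two behaviors described in this task can be sketched as below. This is only an illustration under stated assumptions, not the SubMiner code: the `Token` and `FrequencyEntry` shapes, the POS tag strings, the accepted numeric characters, and the function names `isOrdinalPrefixNoun`, `passesPosFilter`, and `matchesScanResult` are all hypothetical.

```typescript
// Hypothetical shapes; the real SubMiner token and frequency types may differ.
interface Token {
  surface: string;
  pos: string[]; // component POS tags, e.g. ["prefix", "noun"] (assumed encoding)
  frequencyRank?: number;
}

interface FrequencyEntry {
  term: string;
  reading?: string; // some entries omit reading metadata
  rank: number;
}

// Characters accepted after the ordinal prefix 第 (an assumption for this sketch).
const NUMERIC_TAIL = /^[一二三四五六七八九十百千万0-9０-９]+$/;

// POS tags that normally exclude a token from annotation (assumed set).
const EXCLUDED_POS = new Set(["prefix", "particle"]);

// True for 第 followed by at least one numeric character, e.g. 第二.
function isOrdinalPrefixNoun(token: Token): boolean {
  return (
    token.surface.startsWith("第") &&
    NUMERIC_TAIL.test(token.surface.slice(1))
  );
}

// Ranked ordinal prefix-noun compounds pass; standalone prefixes stay excluded.
function passesPosFilter(token: Token): boolean {
  const hasExcludedTag = token.pos.some((tag) => EXCLUDED_POS.has(tag));
  if (!hasExcludedTag) return true;
  return token.frequencyRank !== undefined && isOrdinalPrefixNoun(token);
}

// Exact term match is required; readings are compared only when both exist.
function matchesScanResult(
  entry: FrequencyEntry,
  term: string,
  reading?: string,
): boolean {
  if (entry.term !== term) return false;
  if (entry.reading === undefined || reading === undefined) return true;
  return entry.reading === reading;
}

// Usage with the token from this task.
const daiNi: Token = { surface: "第二", pos: ["prefix", "noun"], frequencyRank: 1820 };
const barePrefix: Token = { surface: "第", pos: ["prefix"], frequencyRank: 500 };
const keepDaiNi = passesPosFilter(daiNi);
const keepBarePrefix = passesPosFilter(barePrefix);
```

Requiring a rank before relaxing the filter keeps the exception narrow: an unranked `第` compound is still filtered, and a bare `第` is never treated as an ordinal because the numeric tail must be non-empty.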