refactor(tokenizer): remove MeCab fallback tokenization path

2026-02-22 18:03:38 -08:00
parent f1dc418e2d
commit badb82280a
9 changed files with 212 additions and 480 deletions

@@ -86,3 +86,4 @@ Read first. Keep concise.
| `codex-task109-discord-presence-20260222T220537Z-lkfv` | `codex-task109-discord-presence` | `Execute TASK-109 Discord Rich Presence integration end-to-end with plan-first workflow (no commit)` | `handoff` | `docs/subagents/agents/codex-task109-discord-presence-20260222T220537Z-lkfv.md` | `2026-02-22T22:36:40Z` |
| `opencode-task103-jellyfin-main-composer-20260222T221152Z-n3p7` | `opencode-task103-jellyfin-main-composer` | `Implement TASK-103 Jellyfin runtime wiring extraction from main.ts into composer module(s), tests, docs, and required validations (no commit).` | `in_progress` | `docs/subagents/agents/opencode-task103-jellyfin-main-composer-20260222T221152Z-n3p7.md` | `2026-02-22T22:11:52Z` |
| `opencode-task109-discord-presence-20260223T011027Z-j9r4` | `opencode-task109-discord-presence` | `Finalize TASK-109 Discord Rich Presence with plan-first workflow and backlog closure.` | `in_progress` | `docs/subagents/agents/opencode-task109-discord-presence-20260223T011027Z-j9r4.md` | `2026-02-23T01:15:39Z` |
| `codex-task88-yomitan-flow-20260223T012755Z-x4m2` | `codex-task88-yomitan-flow` | `Execute TASK-88 remove MeCab fallback tokenizer and simplify Yomitan token flow via plan-first workflow (no commit).` | `handoff` | `docs/subagents/agents/codex-task88-yomitan-flow-20260223T012755Z-x4m2.md` | `2026-02-23T01:44:16Z` |

@@ -0,0 +1,40 @@
# Agent: `codex-task88-yomitan-flow-20260223T012755Z-x4m2`
- alias: `codex-task88-yomitan-flow`
- mission: `Execute TASK-88 remove MeCab fallback tokenizer and simplify Yomitan token flow via plan-first workflow (no commit).`
- status: `handoff`
- branch: `main`
- started_at: `2026-02-23T01:27:55Z`
- heartbeat_minutes: `5`
## Current Work (newest first)
- [2026-02-23T01:44:16Z] handoff: implementation + docs updates complete for TASK-88 scope; tokenizer fallback removed, parser-selection simplified to scanning-parser-only, focused tokenizer/subtitle tests + build + docs build green.
- [2026-02-23T01:44:16Z] test: `bun test src/core/services/tokenizer/parser-selection-stage.test.ts src/core/services/tokenizer.test.ts` pass (47); `bun test src/core/services/subtitle-processing-controller.test.ts` pass (6); `bun run build` pass; `bun run docs:build` pass.
- [2026-02-23T01:30:00Z] progress: wrote plan at `docs/plans/2026-02-23-task-88-yomitan-only-token-flow.md` via writing-plans skill and executed via executing-plans skill.
- [2026-02-23T01:27:55Z] intent: load backlog context for TASK-88, write plan with writing-plans skill, execute with executing-plans skill, validate via focused/full tests, no commit.
## Files Touched
- `docs/subagents/agents/codex-task88-yomitan-flow-20260223T012755Z-x4m2.md`
- `docs/subagents/INDEX.md`
- `docs/subagents/collaboration.md`
- `docs/plans/2026-02-23-task-88-yomitan-only-token-flow.md`
- `src/core/services/tokenizer.ts`
- `src/core/services/tokenizer/parser-selection-stage.ts`
- `src/core/services/tokenizer/parser-selection-stage.test.ts`
- `src/core/services/tokenizer.test.ts`
- `docs/usage.md`
- `docs/troubleshooting.md`
## Assumptions
- Backlog is initialized and TASK-88 title/context from MCP search is authoritative despite stale `task_view` collision on legacy TASK-88.
## Open Questions / Blockers
- Backlog MCP `task_view TASK-88` resolves to a legacy completed TASK-88 entry; current TASK-88 content had to be read from `backlog/tasks/task-88 - Remove-MeCab-fallback-tokenizer-and-simplify-Yomitan-token-flow.md`.
## Next Step
- If needed, repair duplicate TASK-88 ID collision in Backlog MCP so `task_view`/`task_edit` target the active To Do ticket.

@@ -148,3 +148,5 @@ Shared notes. Append-only.
- [2026-02-23T01:10:27Z] [opencode-task109-discord-presence-20260223T011027Z-j9r4|opencode-task109-discord-presence] starting TASK-109 closure pass via Backlog MCP + writing-plans/executing-plans; scope validate existing Discord config/runtime/docs changes, close remaining DoD evidence, and finalize task status if gates pass.
- [2026-02-23T01:15:39Z] [opencode-task109-discord-presence-20260223T011027Z-j9r4|opencode-task109-discord-presence] user feedback from real Discord session: status resumed to Playing with noticeable delay; tuned default `discordPresence.updateIntervalMs` from 15000 to 3000 in defaults/docs/examples and updated focused config expectations; reran focused config + discord presence tests green.
- [2026-02-23T01:27:55Z] [codex-task88-yomitan-flow-20260223T012755Z-x4m2|codex-task88-yomitan-flow] starting TASK-88 via Backlog MCP + writing-plans/executing-plans; expected overlap in tokenizer modules (`src/core/services/tokenizer*`, Yomitan flow wiring/tests); will keep scope to MeCab fallback removal and token flow simplification.
- [2026-02-23T01:44:16Z] [codex-task88-yomitan-flow-20260223T012755Z-x4m2|codex-task88-yomitan-flow] completed TASK-88 implementation pass: removed MeCab fallback branch from `tokenizeSubtitle`, restricted parser-selection to `scanning-parser` candidates, refreshed tokenizer regressions for Yomitan-only flow, updated usage/troubleshooting docs, and verified tokenizer+subtitle suites/build/docs-build green.
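
The parser-selection change described in the notes above (only `scanning-parser` candidates are considered, with no MeCab fallback) could look roughly like the sketch below. This is an illustration only: the `ParseCandidate` shape, the `selectScanningParserCandidate` name, and the "first non-empty candidate wins" rule are all assumptions, since the actual contents of `parser-selection-stage.ts` are not shown in this view.

```typescript
// Hypothetical candidate shape; the real types are not visible in this commit view.
interface ParseCandidate {
  source: string; // assumed label, e.g. "scanning-parser" or "mecab"
  tokens: string[];
}

const SCANNING_PARSER = "scanning-parser";

// Return the first scanning-parser candidate that produced at least one token.
// Returns null when none qualifies: candidates from any other source are
// ignored rather than used as a fallback.
function selectScanningParserCandidate(
  candidates: ParseCandidate[],
): ParseCandidate | null {
  const picked = candidates.find(
    (c) => c.source === SCANNING_PARSER && c.tokens.length > 0,
  );
  return picked ?? null;
}
```

Under these assumptions, a caller treats a `null` result as "no tokens for this subtitle line" instead of handing the text to a second tokenizer.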