docs: add stats dashboard design docs, plans, and knowledge base

- Stats dashboard redesign design and implementation plans
- Episode detail and Anki card link design
- Internal knowledge base restructure
- Backlog tasks for testing, verification, and occurrence tracking
2026-03-14 22:13:24 -07:00
parent 42abdd1268
commit cc5d270b8e
35 changed files with 5139 additions and 0 deletions


@@ -0,0 +1,81 @@
# Stats Subcommand Design
**Problem:** Add a launcher command and matching app command that run only the stats dashboard stack: start the local stats server, initialize the data source it needs, and open the browser to the stats page.
**Constraints:**
- Public entrypoint is a launcher subcommand: `subminer stats`
- Reuse the existing app instance when one is already running
- Explicit `stats` launch overrides `stats.autoStartServer`
- If `immersionTracking.enabled` is `false`, fail with an error instead of opening an empty dashboard
- Scope limited to stats server + browser page; no overlay/mpv startup requirements
## Recommended Approach
Add a dedicated app CLI flag, `--stats`, and let the launcher subcommand forward into that path. Use the existing Electron single-instance flow so a second `subminer stats` invocation can be handled by the primary app instance. Add a small response-file handshake so the launcher can still return success or failure when work is delegated to an already-running primary instance.
## Runtime Flow
1. `subminer stats` runs in the launcher.
2. Launcher resolves the app binary and forwards:
- `--stats`
- `--log-level <level>` when provided
- internal `--stats-response-path <tmpfile>`
3. Electron startup parses `--stats` as an app-starting command.
4. If this process becomes the primary instance, it runs a stats-only startup path:
- load config
- fail if `immersionTracking.enabled === false`
- initialize immersion tracker
- start stats server, forcing startup regardless of `stats.autoStartServer`
- open `http://127.0.0.1:<stats.serverPort>`
- write success/error to the response path
5. If the process is a secondary instance, Electron forwards argv to the primary instance through the existing single-instance event. The primary instance runs the same stats command handler and writes the response result to the temp file. The secondary process waits for that file and exits with the same status.
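The response-file handshake in steps 4–5 can be sketched as below. This is a minimal illustration, not the project's actual code: the `StatsResponse` shape and both function names are assumptions introduced here.

```typescript
import { mkdtempSync, writeFileSync, readFileSync } from "node:fs";
import { join } from "node:path";
import { tmpdir } from "node:os";

// Illustrative payload shape for the --stats-response-path handshake.
interface StatsResponse {
  ok: boolean;
  error?: string;
}

// Primary instance: report the outcome of the stats command to the temp file.
function writeStatsResponse(path: string, result: StatsResponse): void {
  writeFileSync(path, JSON.stringify(result), "utf8");
}

// Launcher/secondary side: read the result and derive a shell exit code.
function readStatsExitCode(path: string): number {
  const result = JSON.parse(readFileSync(path, "utf8")) as StatsResponse;
  return result.ok ? 0 : 1;
}

const responsePath = join(mkdtempSync(join(tmpdir(), "subminer-")), "stats.json");
writeStatsResponse(responsePath, { ok: false, error: "immersion tracking disabled" });
console.log(readStatsExitCode(responsePath)); // 1
```

In the real flow the secondary process would poll or watch for the file rather than read it immediately, since the primary instance writes it asynchronously.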
## Code Shape
### Launcher
- Add `stats` top-level subcommand in `launcher/config/cli-parser-builder.ts`
- Normalize that invocation in launcher arg parsing
- Add `launcher/commands/stats-command.ts`
- Dispatch it from `launcher/main.ts`
- Reuse existing app passthrough spawn helpers
- Add launcher tests for routing and forwarded argv
### Electron app
- Extend `src/cli/args.ts` with:
- `stats: boolean`
- `statsResponsePath?: string`
- Update app start gating so `--stats` starts the app
- Add a focused stats CLI runtime service instead of burying stats launch logic inside `main.ts`
- Reuse existing immersion tracker startup and stats server helpers where possible
- Add a single function that:
- validates immersion tracking enabled
- ensures tracker exists
- ensures stats server exists
- opens browser
- reports completion/failure to optional response file
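The single orchestration function could look roughly like this sketch. The `StatsDeps` interface and every dependency name are hypothetical stand-ins for the existing tracker/server helpers, chosen to keep the function testable via injection.

```typescript
// Hypothetical dependency surface; names do not match any existing module.
interface StatsDeps {
  ensureTracker(): Promise<void>;
  ensureStatsServer(opts: { force: boolean }): Promise<void>;
  openBrowser(url: string): Promise<void>;
  reportResult(result: { ok: boolean; error?: string }): void;
}

async function runStatsCommand(
  config: { immersionTracking: { enabled: boolean }; stats: { serverPort: number } },
  deps: StatsDeps,
): Promise<boolean> {
  if (!config.immersionTracking.enabled) {
    deps.reportResult({ ok: false, error: "immersionTracking.enabled is false" });
    return false;
  }
  try {
    await deps.ensureTracker();
    // Explicit `stats` launch overrides stats.autoStartServer.
    await deps.ensureStatsServer({ force: true });
    await deps.openBrowser(`http://127.0.0.1:${config.stats.serverPort}`);
    deps.reportResult({ ok: true });
    return true;
  } catch (err) {
    deps.reportResult({ ok: false, error: String(err) });
    return false;
  }
}
```

Keeping the browser/server/tracker calls behind an injected interface is what makes the runtime tests listed below cheap to write.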
## Error Handling
- `immersionTracking.enabled === false`: hard failure, clear message
- tracker init failure: hard failure, clear message
- server start failure: hard failure, clear message
- browser open failure: hard failure, clear message
- response-path write failure: log warning; primary runtime behavior still follows command result
## Testing
- Launcher parser/routing tests for `subminer stats`
- Launcher forwarding test verifies `--stats` and `--stats-response-path`
- App CLI arg tests for `--stats`
- App runtime tests for:
- stats command starts tracker/server/browser
- stats command forces server start even when auto-start is off
- stats command fails when immersion tracking is disabled
- second-instance command path can surface failure via response file plumbing
## Docs
- Update CLI help text
- Update user docs where launcher/browser stats access is described


@@ -0,0 +1,173 @@
# Stats Subcommand Implementation Plan
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
**Goal:** Add `subminer stats` plus app-side `--stats` support so SubMiner can launch only the stats dashboard stack, reuse an existing app instance, and fail when immersion tracking is disabled.
**Architecture:** The launcher gets a new `stats` subcommand that forwards into a dedicated Electron CLI flag. The app handles that flag through a focused stats command service that validates immersion tracking, ensures tracker/server startup, opens the browser, and optionally reports success/failure through a response file so second-instance reuse preserves shell exit status.
**Tech Stack:** TypeScript, Bun test runner, Electron single-instance lifecycle, Commander-based launcher CLI.
---
### Task 1: Record launcher command coverage
**Files:**
- Modify: `launcher/main.test.ts`
- Modify: `launcher/commands/command-modules.test.ts`
**Step 1: Write the failing test**
Add coverage that `subminer stats` routes through a dedicated launcher command and forwards `--stats` into the app process.
**Step 2: Run test to verify it fails**
Run: `bun test launcher/main.test.ts launcher/commands/command-modules.test.ts`
Expected: FAIL because the `stats` command does not exist yet.
**Step 3: Write minimal implementation**
Add the launcher parser, command module, and dispatch wiring needed for the tests to pass.
**Step 4: Run test to verify it passes**
Run: `bun test launcher/main.test.ts launcher/commands/command-modules.test.ts`
Expected: PASS
**Step 5: Commit**
```bash
git add launcher/main.test.ts launcher/commands/command-modules.test.ts launcher/config/cli-parser-builder.ts launcher/main.ts launcher/commands/stats-command.ts
git commit -m "feat: add launcher stats command"
```
### Task 2: Add failing app CLI parsing/runtime tests
**Files:**
- Modify: `src/cli/args.test.ts`
- Modify: `src/main/runtime/cli-command-runtime-handler.test.ts`
- Modify: `src/main/runtime/cli-command-context.test.ts`
**Step 1: Write the failing test**
Add tests for:
- parsing `--stats`
- starting the app for `--stats`
- stats command behavior when immersion tracking is disabled
- stats command behavior when startup succeeds
**Step 2: Run test to verify it fails**
Run: `bun test src/cli/args.test.ts src/main/runtime/cli-command-runtime-handler.test.ts src/main/runtime/cli-command-context.test.ts`
Expected: FAIL because the new flag/service is not implemented.
**Step 3: Write minimal implementation**
Extend CLI args and add the smallest stats command runtime surface required by the tests.
**Step 4: Run test to verify it passes**
Run: `bun test src/cli/args.test.ts src/main/runtime/cli-command-runtime-handler.test.ts src/main/runtime/cli-command-context.test.ts`
Expected: PASS
**Step 5: Commit**
```bash
git add src/cli/args.test.ts src/main/runtime/cli-command-runtime-handler.test.ts src/main/runtime/cli-command-context.test.ts src/cli/args.ts src/main/cli-runtime.ts src/main/runtime/cli-command-context.ts src/main/runtime/cli-command-context-deps.ts
git commit -m "feat: add app stats cli command"
```
### Task 3: Add stats-only startup plumbing
**Files:**
- Modify: `src/main.ts`
- Modify: `src/core/services/startup.ts`
- Modify: `src/core/services/cli-command.ts`
- Modify: `src/main/runtime/immersion-startup.ts`
- Test: existing runtime tests above plus any new focused stats tests
**Step 1: Write the failing test**
Add focused tests around the stats startup service and response-path reporting before touching production logic.
**Step 2: Run test to verify it fails**
Run: `bun test src/core/services/cli-command.test.ts`
Expected: FAIL because stats startup/reporting is missing.
**Step 3: Write minimal implementation**
Implement:
- stats-only startup gating
- tracker/server/browser startup orchestration
- response-file success/failure reporting for reused primary-instance handling
**Step 4: Run test to verify it passes**
Run: `bun test src/core/services/cli-command.test.ts`
Expected: PASS
**Step 5: Commit**
```bash
git add src/main.ts src/core/services/startup.ts src/core/services/cli-command.ts src/main/runtime/immersion-startup.ts src/core/services/cli-command.test.ts
git commit -m "feat: add stats-only startup flow"
```
### Task 4: Update docs/help
**Files:**
- Modify: `src/cli/help.ts`
- Modify: `docs-site/immersion-tracking.md`
- Modify: `docs-site/mining-workflow.md`
**Step 1: Write the failing test**
Add/update any help-text assertions first.
**Step 2: Run test to verify it fails**
Run: `bun test src/cli/help.test.ts`
Expected: FAIL until help text includes stats usage.
**Step 3: Write minimal implementation**
Document `subminer stats` and clarify that explicit invocation forces the local dashboard server to start.
**Step 4: Run test to verify it passes**
Run: `bun test src/cli/help.test.ts`
Expected: PASS
**Step 5: Commit**
```bash
git add src/cli/help.ts src/cli/help.test.ts docs-site/immersion-tracking.md docs-site/mining-workflow.md
git commit -m "docs: document stats subcommand"
```
### Task 5: Verify integrated behavior
**Files:**
- No new files required
**Step 1: Run targeted unit/integration lanes**
Run:
- `bun run typecheck`
- `bun run test:launcher`
- `bun test src/cli/args.test.ts src/core/services/cli-command.test.ts src/main/runtime/cli-command-runtime-handler.test.ts src/main/runtime/cli-command-context.test.ts src/cli/help.test.ts`
Expected: PASS
**Step 2: Run broader maintained lane if the targeted slice is clean**
Run: `bun run test:fast`
Expected: PASS or surface unrelated pre-existing failures.
**Step 3: Commit verification fixes if needed**
```bash
git add -A
git commit -m "test: stabilize stats command coverage"
```

File diff suppressed because it is too large


@@ -0,0 +1,152 @@
# Stats Dashboard v2 Redesign
## Summary
Redesign the stats dashboard to focus on session/media history as the primary experience: an activity feed as the default view, a dedicated Library tab with anime cover art (via the Anilist API), per-anime drill-down pages, and bug fixes for watch time inflation and relative date formatting.
## Bug Fixes (Pre-requisite)
### Watch Time Inflation
Telemetry values (`active_watched_ms`, `total_watched_ms`, `lines_seen`, `words_seen`, etc.) are cumulative snapshots — each sample stores the running total for that session. Both `getSessionSummaries` (query.ts) and `upsertDailyRollupsForGroups` / `upsertMonthlyRollupsForGroups` (maintenance.ts) incorrectly use `SUM()` across all telemetry rows instead of `MAX()` per session.
**Fix:**
- `getSessionSummaries`: change `SUM(t.active_watched_ms)` → `MAX(t.active_watched_ms)` (already grouped by `s.session_id`)
- `upsertDailyRollupsForGroups` / `upsertMonthlyRollupsForGroups`: use a subquery that gets `MAX()` per session_id, then `SUM()` across sessions
- Run `forceRebuild` rollup after migration to recompute all rollups
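The aggregation rule can be demonstrated in plain code, independent of the SQL. This is only an illustration of MAX-per-session-then-SUM on cumulative snapshots; the `Sample` shape is a simplification of the real telemetry rows.

```typescript
// Telemetry samples store cumulative running totals per session, so summing
// every row double-counts. Correct: MAX per session, then SUM across sessions.
interface Sample { sessionId: number; activeWatchedMs: number }

function totalActiveWatchedMs(samples: Sample[]): number {
  const perSession = new Map<number, number>();
  for (const s of samples) {
    perSession.set(
      s.sessionId,
      Math.max(perSession.get(s.sessionId) ?? 0, s.activeWatchedMs),
    );
  }
  let total = 0;
  for (const max of perSession.values()) total += max;
  return total;
}

// Session 1 sampled at 10s/20s/30s; session 2 at 5s/15s.
const samples: Sample[] = [
  { sessionId: 1, activeWatchedMs: 10_000 },
  { sessionId: 1, activeWatchedMs: 20_000 },
  { sessionId: 1, activeWatchedMs: 30_000 },
  { sessionId: 2, activeWatchedMs: 5_000 },
  { sessionId: 2, activeWatchedMs: 15_000 },
];
console.log(totalActiveWatchedMs(samples)); // 45000 — naive SUM would report 80000
```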
### Relative Date Formatting
`formatRelativeDate` only has day-level granularity ("Today", "Yesterday"). Add minute and hour levels:
- < 1 min → "just now"
- < 60 min → "Xm ago"
- < 24 hours → "Xh ago"
- < 2 days → "Yesterday"
- < 7 days → "Xd ago"
- otherwise → locale date string
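The thresholds above translate directly into a small function. This is a naive sketch matching the list as written; production code might compute "Yesterday" from calendar days rather than a 24–48h window.

```typescript
// Relative date formatting with minute/hour granularity, per the thresholds above.
function formatRelativeDate(thenMs: number, nowMs: number = Date.now()): string {
  const diffMin = (nowMs - thenMs) / 60_000;
  if (diffMin < 1) return "just now";
  if (diffMin < 60) return `${Math.floor(diffMin)}m ago`;
  const diffHours = diffMin / 60;
  if (diffHours < 24) return `${Math.floor(diffHours)}h ago`;
  const diffDays = diffHours / 24;
  if (diffDays < 2) return "Yesterday";
  if (diffDays < 7) return `${Math.floor(diffDays)}d ago`;
  return new Date(thenMs).toLocaleDateString();
}

const now = Date.UTC(2026, 2, 14, 12, 0, 0);
console.log(formatRelativeDate(now - 32 * 60_000, now)); // "32m ago"
console.log(formatRelativeDate(now - 5 * 3_600_000, now)); // "5h ago"
```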
## Tab Structure
**Overview** (default) | **Library** | **Trends** | **Vocabulary**
### Overview Tab — Activity Feed
Top section: hero stats (watch time today, cards mined today, streak, all-time total hours).
Below: recent sessions listed chronologically, grouped by day headers ("Today", "Yesterday", "March 10"). Each session row shows:
- Small cover art thumbnail (from Anilist cache)
- Clean title with episode info ("The Eminence in Shadow — Episode 5")
- Relative timestamp ("32m ago") and active duration ("24m active")
- Per-session stats: cards mined, words seen
### Library Tab — Cover Art Grid
Grid of anime cover art cards fetched from Anilist API. Each card shows:
- Cover art image (3:4 aspect ratio)
- Episode count badge
- Title, total watch time, cards mined
Controls: search bar, filter chips (All / Watching / Completed), total count + time summary.
Clicking a card navigates to the per-anime detail view.
### Per-Anime Detail View
Navigated from Library card click. Sections:
1. **Header** — cover art, title, total watch time, total episodes, total cards mined, avg session length
2. **Watch time chart** — bar chart scoped to this anime over time (14/30/90d range selector)
3. **Session history** — all sessions for this anime with timestamps, durations, per-session stats
4. **Vocabulary** — words and kanji learned from this show (joined via session events → video_id)
### Trends & Vocabulary Tabs
Keep existing implementation, mostly unchanged for v2.
## Anilist Integration & Cover Art Cache
### Title Parsing
Parse show name from `canonical_title` to search Anilist:
- Jellyfin titles are already clean (use as-is)
- Local file titles: use existing `guessit` + `parseMediaInfo` fallback from `anilist-updater.ts`
- Strip episode info, codec tags, resolution markers via regex
### Anilist GraphQL Query
Search query (no auth needed for public anime search):
```graphql
query ($search: String!) {
Page(perPage: 5) {
media(search: $search, type: ANIME) {
id
coverImage { large medium }
title { romaji english native }
episodes
}
}
}
```
### Cover Art Cache
New SQLite table `imm_media_art`:
- `video_id` INTEGER PRIMARY KEY (FK to imm_videos)
- `anilist_id` INTEGER
- `cover_url` TEXT
- `cover_blob` BLOB (cached image binary)
- `title_romaji` TEXT
- `title_english` TEXT
- `episodes_total` INTEGER
- `fetched_at_ms` INTEGER
- `CREATED_DATE` INTEGER
- `LAST_UPDATE_DATE` INTEGER
Serve cached images via: `GET /api/stats/media/:videoId/cover`
Fallback: gray placeholder with first character of title if no Anilist match.
### Rate Limiting Strategy
**Current state:** Anilist limit is 30 req/min (temporarily reduced from 90). Existing post-watch updater uses up to 3 requests per episode (search + entry lookup + save mutation). Retry queue can also fire requests. No centralized rate limiter.
**Centralized rate limiter:**
- Shared sliding-window tracker (array of timestamps) for all Anilist calls
- App-wide cap: 20 req/min (leaving 10 req/min headroom)
- All callers go through the limiter: existing `anilistGraphQl` helper and new cover art fetcher
- Read `X-RateLimit-Remaining` from response headers; if < 5, pause until window resets
- On 429 response, honor `Retry-After` header
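A minimal sliding-window limiter for the 20 req/min cap might look like this. The class name and clock injection are illustrative; the real limiter would also wire in the `X-RateLimit-Remaining` and `Retry-After` handling described above.

```typescript
// Sliding-window rate limiter: an array of request timestamps, pruned per check.
class SlidingWindowLimiter {
  private timestamps: number[] = [];
  constructor(
    private readonly maxPerWindow: number,
    private readonly windowMs: number,
    private readonly now: () => number = Date.now, // injectable for tests
  ) {}

  /** Returns true if a request may fire now, recording it if allowed. */
  tryAcquire(): boolean {
    const cutoff = this.now() - this.windowMs;
    this.timestamps = this.timestamps.filter((t) => t > cutoff);
    if (this.timestamps.length >= this.maxPerWindow) return false;
    this.timestamps.push(this.now());
    return true;
  }
}

let fakeNow = 0;
const limiter = new SlidingWindowLimiter(20, 60_000, () => fakeNow);
for (let i = 0; i < 20; i++) limiter.tryAcquire(); // fill the window
console.log(limiter.tryAcquire()); // false — cap reached
fakeNow += 61_000; // advance past the window
console.log(limiter.tryAcquire()); // true — old timestamps fell out
```

Callers that get `false` would sleep and retry rather than drop the request, since cover art fetching is lazy and not latency-sensitive.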
**Cover art fetching behavior:**
- Lazy & one-shot: only fetch when a video appears in the stats UI with no cached art
- Once cached in SQLite, never re-fetch (cover art doesn't change)
- On first Library load with N uncached titles, fetch sequentially with ~3s gap between requests
- Show placeholder for unfetched titles, fill in as fetches complete
## New API Endpoints
- `GET /api/stats/media` — all media with aggregated stats (total time, episodes watched, cards, last watched, cover art status)
- `GET /api/stats/media/:videoId` — single media detail: session history, rollups, vocab for that video
- `GET /api/stats/media/:videoId/cover` — cached cover art image (binary response)
## New Database Queries
- `getMediaLibrary(db)` — group sessions by video_id, aggregate stats, join with imm_media_art
- `getMediaDetail(db, videoId)` — sessions + daily rollups + vocab scoped to one video_id
- `getMediaVocabulary(db, videoId)` — words/kanji from sessions belonging to a specific video_id (join imm_session_events with imm_sessions on video_id)
## Data Flow
```
Library tab loads
→ GET /api/stats/media
→ Returns list of videos with aggregated stats + cover art status
→ For videos without cached art:
→ Background: parse title → search Anilist → download cover → cache in SQLite
→ Rate-limited via centralized sliding window (20 req/min cap)
→ UI shows placeholders, fills in as covers arrive
User clicks anime card
→ GET /api/stats/media/:videoId
→ Returns sessions, rollups, vocab for that video
→ Renders detail view with all four sections
```


@@ -0,0 +1,88 @@
# Internal Knowledge Base Restructure Design
**Problem:** `AGENTS.md` currently carries too much project detail while deeper internal guidance is either missing from `docs/` or mixed into `docs-site/`, which should stay user-facing. Agents and contributors need a stable entrypoint plus an internal system of record with progressive disclosure and mechanical enforcement.
**Goals:**
- Make `AGENTS.md` a short table of contents, not an encyclopedia.
- Establish `docs/` as the internal system of record for architecture, workflow, and knowledge-base conventions.
- Keep `docs-site/` user-facing only.
- Add lightweight enforcement so the split is maintained mechanically.
- Preserve existing build/test/release guidance while moving canonical internal pointers into `docs/`.
**Non-Goals:**
- Rework product/user docs information architecture beyond boundary cleanup.
- Build a custom documentation generator.
- Solve plan lifecycle cleanup in this change.
- Reorganize unrelated runtime code or existing feature docs.
## Recommended Approach
Create a small internal knowledge-base structure under `docs/`, rewrite `AGENTS.md` into a compact map that points into that structure, and add a repo-level test that enforces required internal docs and boundary rules. Keep `docs-site/` focused on public/product documentation and strip out any claims that it is the canonical source for internal architecture or workflow.
## Information Architecture
Proposed internal layout:
```text
docs/
README.md
architecture/
README.md
domains.md
layering.md
knowledge-base/
README.md
core-beliefs.md
catalog.md
quality.md
workflow/
README.md
planning.md
verification.md
plans/
...
```
Key rules:
- `AGENTS.md` links to `docs/README.md` plus a small set of core entrypoints.
- `docs/README.md` acts as the internal KB home page.
- Internal docs include lightweight metadata via explicit section fields:
- `Status`
- `Last verified`
- `Owner`
- `Read when`
- `docs-site/` remains public/user-facing and may link to internal docs only as contributor references, not as canonical internal source-of-truth pages.
## Enforcement
Add a repo-level knowledge-base test that validates:
- `AGENTS.md` links to required internal KB docs.
- `AGENTS.md` stays below a capped line count.
- Required internal docs exist.
- Internal KB docs include the expected metadata fields.
- `docs-site/development.md` and `docs-site/architecture.md` point internal readers to `docs/`.
- `AGENTS.md` does not treat `docs-site/` pages as the canonical internal source of truth.
Keep the new test outside `docs-site/` so internal/public boundaries stay clear.
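The checks can be kept trivially testable by writing them as pure functions over file contents, with the repo test supplying the real `AGENTS.md` text. Both function names here are illustrative, not an existing API.

```typescript
// Pure enforcement helpers: the repo-level test feeds in file contents.
function checkAgentsLinks(agentsMd: string, requiredDocs: string[]): string[] {
  // Returns the required doc paths that AGENTS.md fails to mention.
  return requiredDocs.filter((doc) => !agentsMd.includes(doc));
}

function checkLineCap(agentsMd: string, maxLines: number): boolean {
  return agentsMd.split("\n").length <= maxLines;
}

const agentsMd = [
  "# AGENTS",
  "See [docs/README.md](docs/README.md) for the internal KB.",
  "Verification: [docs/workflow/verification.md](docs/workflow/verification.md)",
].join("\n");

console.log(checkAgentsLinks(agentsMd, ["docs/README.md", "docs/architecture/README.md"]));
// ["docs/architecture/README.md"] — a missing link fails the test
console.log(checkLineCap(agentsMd, 100)); // true
```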
## Migration Plan
1. Write the design and implementation plan docs.
2. Rewrite `AGENTS.md` as a compact map.
3. Create the internal KB entrypoints under `docs/`.
4. Update `docs-site/` contributor docs to reference `docs/` for internal guidance.
5. Add a repo-level test plus package/CI wiring.
6. Run docs and targeted repo verification.
## Verification Strategy
Primary commands:
- `bun run docs:test`
- `bun run test:fast`
- `bun run docs:build`
Focused review:
- read `AGENTS.md` start-to-finish and confirm it behaves like a table of contents
- inspect the new `docs/README.md` navigation and cross-links
- confirm `docs-site/` still reads as user-facing documentation


@@ -0,0 +1,138 @@
# Internal Knowledge Base Restructure Implementation Plan
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
**Goal:** Turn `AGENTS.md` into a compact table of contents, establish `docs/` as the internal system of record, keep `docs-site/` user-facing, and enforce the split with tests/CI.
**Architecture:** Create a small internal knowledge-base hierarchy under `docs/`, migrate canonical contributor/agent guidance into it, and wire a repo-level verification test that checks required docs, metadata, and boundary rules. Keep public docs in `docs-site/` and update them to reference internal docs rather than acting as the source of truth themselves.
**Tech Stack:** Markdown docs, Bun test runner, existing GitHub Actions CI
---
### Task 1: Add the internal KB entrypoints
**Files:**
- Create: `docs/README.md`
- Create: `docs/architecture/README.md`
- Create: `docs/architecture/domains.md`
- Create: `docs/architecture/layering.md`
- Create: `docs/knowledge-base/README.md`
- Create: `docs/knowledge-base/core-beliefs.md`
- Create: `docs/knowledge-base/catalog.md`
- Create: `docs/knowledge-base/quality.md`
- Create: `docs/workflow/README.md`
- Create: `docs/workflow/planning.md`
- Create: `docs/workflow/verification.md`
**Step 1: Write the KB home page**
Add `docs/README.md` with navigation to architecture, workflow, knowledge-base maintenance, release docs, and active plans.
**Step 2: Add architecture pages**
Create the architecture index plus focused `domains.md` and `layering.md` pages that summarize runtime ownership and dependency boundaries from the existing architecture doc.
**Step 3: Add knowledge-base pages**
Create the knowledge-base index, core beliefs, catalog, and quality pages with explicit metadata fields and short maintenance guidance.
**Step 4: Add workflow pages**
Create workflow index, planning guide, and verification guide with the current maintained Bun commands and lane-selection guidance.
**Step 5: Review cross-links**
Read the new docs and confirm every page links back to at least one parent/index page.
### Task 2: Rewrite `AGENTS.md` as a compact map
**Files:**
- Modify: `AGENTS.md`
**Step 1: Replace encyclopedia-style content with a compact map**
Keep only the minimum operational guidance needed in injected context:
- quick start
- internal source-of-truth pointers
- build/test gate
- generated/sensitive file notes
- release pointer
- backlog note
**Step 2: Add direct links to the new KB entrypoints**
Point `AGENTS.md` at:
- `docs/README.md`
- `docs/architecture/README.md`
- `docs/workflow/README.md`
- `docs/workflow/verification.md`
- `docs/knowledge-base/README.md`
- `docs/RELEASING.md`
**Step 3: Keep the file intentionally short**
Target roughly 100 lines and avoid moving deep details back into `AGENTS.md`.
### Task 3: Redraw the `docs-site/` boundary
**Files:**
- Modify: `docs-site/development.md`
- Modify: `docs-site/architecture.md`
- Modify: `docs-site/README.md`
**Step 1: Update contributor-facing docs**
Keep build/run/testing instructions, but stop presenting `docs-site/*` pages as canonical internal architecture/workflow references.
**Step 2: Add explicit internal-doc pointers**
Link readers to `docs/README.md` and the new internal architecture/workflow pages for deep contributor guidance.
**Step 3: Preserve public-doc tone**
Ensure the `docs-site/` pages remain user/contributor-facing and do not become the internal KB themselves.
### Task 4: Add mechanical enforcement
**Files:**
- Create: `scripts/docs-knowledge-base.test.ts`
- Modify: `package.json`
- Modify: `.github/workflows/ci.yml`
**Step 1: Write a repo-level docs KB test**
Assert:
- required docs exist
- metadata fields exist on internal docs
- `AGENTS.md` links to internal KB entrypoints
- `AGENTS.md` stays under the line cap
- `docs-site/development.md` and `docs-site/architecture.md` point to `docs/`
**Step 2: Wire the test into package scripts**
Add a script for the KB test and include it in an existing maintained verification lane.
**Step 3: Ensure CI exercises the check**
Make sure the CI path that runs the maintained test lane catches KB regressions.
### Task 5: Verify and hand off
**Files:**
- Modify: any files above if verification reveals drift
**Step 1: Run targeted docs verification**
Run:
- `bun run docs:test`
- `bun run test:fast`
- `bun run docs:build`
**Step 2: Fix drift found by tests**
If any assertions fail, update the docs or test expectations so the enforced model matches the intended structure.
**Step 3: Summarize outcome**
Report the new internal KB entrypoints, the `AGENTS.md` table-of-contents rewrite, enforcement coverage, verification results, and any skipped items.


@@ -0,0 +1,69 @@
# Imm Words Cleanup Implementation Plan
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
**Goal:** Fix `imm_words` so only allowed vocabulary tokens are persisted with POS metadata, and add an on-demand stats cleanup command that removes existing bad vocabulary rows.
**Architecture:** Thread processed `SubtitleData.tokens` into immersion tracking instead of extracting vocabulary from raw subtitle text. Store POS metadata on `imm_words`, reuse the existing POS exclusion logic for persistence and cleanup, and expose cleanup through the existing launcher/app stats command surface.
**Tech Stack:** TypeScript, Bun, SQLite/libsql wrapper, Commander-based launcher CLI
---
### Task 1: Lock the broken behavior down with failing tests
**Files:**
- Modify: `src/core/services/immersion-tracker-service.test.ts`
- Modify: `src/core/services/immersion-tracker/__tests__/query.test.ts`
- Modify: `launcher/parse-args.test.ts`
- Modify: `src/main/runtime/stats-cli-command.test.ts`
**Steps:**
1. Add a tracker regression test that records subtitle tokens with mixed POS and asserts excluded tokens are not written while allowed tokens retain POS metadata.
2. Add a cleanup/query test that seeds valid and invalid `imm_words` rows and asserts vocab cleanup deletes only invalid rows.
3. Add launcher parse tests for `subminer stats cleanup`, `subminer stats cleanup -v`, and default vocab cleanup mode.
4. Add app-side stats CLI tests for dispatching cleanup vs dashboard launch.
### Task 2: Fix live vocabulary persistence
**Files:**
- Modify: `src/core/services/immersion-tracker-service.ts`
- Modify: `src/core/services/immersion-tracker/types.ts`
- Modify: `src/core/services/immersion-tracker/storage.ts`
- Modify: `src/main/runtime/mpv-main-event-main-deps.ts`
- Modify: `src/main/runtime/mpv-main-event-bindings.ts`
- Modify: `src/main/runtime/mpv-main-event-actions.ts`
- Modify: `src/main/state.ts`
**Steps:**
1. Extend immersion subtitle recording to accept processed subtitle payloads or token arrays.
2. Add `imm_words` POS columns and prepared-statement support.
3. Convert tracker inserts to use processed tokens rather than raw regex extraction.
4. Reuse existing POS/frequency exclusion rules to decide whether a token belongs in `imm_words`.
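The shared persistence filter in step 4 reduces to a predicate over processed tokens, roughly like the sketch below. The `Token` shape, POS tag names, and exclusion set are all hypothetical stand-ins for the project's actual exclusion rules.

```typescript
// Hypothetical token shape and POS exclusion set, for illustration only.
interface Token { surface: string; pos: string }

const EXCLUDED_POS = new Set(["particle", "auxiliary", "symbol", "punctuation"]);

// One predicate shared by live persistence and the cleanup routine, so both
// agree on which rows belong in imm_words.
function isPersistableVocab(token: Token): boolean {
  return !EXCLUDED_POS.has(token.pos) && token.surface.trim().length > 0;
}

const tokens: Token[] = [
  { surface: "食べる", pos: "verb" },
  { surface: "を", pos: "particle" },
  { surface: "。", pos: "punctuation" },
];
console.log(tokens.filter(isPersistableVocab).map((t) => t.surface)); // ["食べる"]
```

Using the same predicate for insertion and cleanup is what keeps Task 3's delete pass consistent with the fixed live path.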
### Task 3: Add vocab cleanup service and CLI wiring
**Files:**
- Modify: `src/core/services/immersion-tracker/query.ts`
- Modify: `src/main/runtime/stats-cli-command.ts`
- Modify: `src/cli/args.ts`
- Modify: `launcher/config/cli-parser-builder.ts`
- Modify: `launcher/config/args-normalizer.ts`
- Modify: `launcher/types.ts`
- Modify: `launcher/commands/stats-command.ts`
**Steps:**
1. Add a cleanup routine that removes invalid `imm_words` rows using the same persistence filter.
2. Extend app CLI args to represent stats cleanup actions and vocab mode selection.
3. Extend launcher stats command forwarding to pass cleanup flags through the attached app flow.
4. Print a compact cleanup summary and fail cleanly on errors.
### Task 4: Verify and document the final behavior
**Files:**
- Modify: `docs-site/immersion-tracking.md`
**Steps:**
1. Update user-facing stats docs to mention `subminer stats cleanup` vocab maintenance.
2. Run the cheapest sufficient verification lanes for touched files.
3. Record exact commands/results in the task final summary before handoff.


@@ -0,0 +1,110 @@
# Immersion Anime Metadata Design
**Problem:** The immersion database is keyed around videos and sessions, which makes it awkward to present anime-centric stats such as per-anime totals, episode progress, and season breakdowns. We need first-class anime metadata without requiring migration or backfill support for existing databases.
**Goals:**
- Add anime-level identity that can be shared across multiple video files and rewatches.
- Persist parsed episode/season metadata so stats can group by anime, season, and episode.
- Use existing filename parsing conventions: `guessit` first, built-in parser fallback.
- Create provisional anime rows even when AniList lookup fails.
- Keep the change additive and forward-looking; do not spend time on migrations/backfill.
**Non-Goals:**
- Backfilling or migrating existing user databases.
- Perfect anime identity resolution across every edge case.
- Building the entire new stats UI in this design doc.
- Replacing existing `canonical_title` or current video/session APIs immediately.
## Recommended Approach
Add a new `imm_anime` table for anime-level metadata and link each `imm_videos` row to one anime row through `anime_id`. Keep season/episode and filename-derived fields on `imm_videos`, because those belong to a concrete file, not the anime as a whole.
Anime rows should exist even when AniList lookup fails. In that case, use a normalized parsed-title key as provisional identity. If the same anime is resolved to AniList later, upgrade the existing anime row in place instead of creating a duplicate.
## Data Model
### `imm_anime`
One row per anime identity.
Suggested fields:
- `anime_id INTEGER PRIMARY KEY AUTOINCREMENT`
- `identity_key TEXT NOT NULL UNIQUE`
- `parsed_title TEXT NOT NULL`
- `normalized_title TEXT NOT NULL`
- `anilist_id INTEGER`
- `title_romaji TEXT`
- `title_english TEXT`
- `title_native TEXT`
- `episodes_total INTEGER`
- `parser_source TEXT`
- `parser_confidence TEXT`
- `metadata_json TEXT`
- `CREATED_DATE INTEGER`
- `LAST_UPDATE_DATE INTEGER`
Identity rules:
- Resolved anime: `identity_key = anilist:<id>`
- Provisional anime: `identity_key = title:<normalized parsed title>`
- When a provisional row later gets an AniList match, update that row's `identity_key` to `anilist:<id>` and fill AniList metadata.
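The identity rules can be sketched as a small helper. `normalizeTitle` here is a stand-in for whatever normalization the tracker actually applies, and the AniList id in the example is arbitrary.

```typescript
// Stand-in normalization; the real tracker may normalize differently.
function normalizeTitle(title: string): string {
  return title.toLowerCase().replace(/\s+/g, " ").trim();
}

// Resolved anime use `anilist:<id>`; unresolved ones fall back to the
// normalized parsed title, which gets upgraded in place on a later match.
function animeIdentityKey(parsedTitle: string, anilistId?: number): string {
  return anilistId != null
    ? `anilist:${anilistId}`
    : `title:${normalizeTitle(parsedTitle)}`;
}

console.log(animeIdentityKey("The Eminence in Shadow")); // "title:the eminence in shadow"
console.log(animeIdentityKey("The Eminence in Shadow", 12345)); // "anilist:12345"
```

Because `identity_key` is `UNIQUE`, the upgrade must be an `UPDATE` of the provisional row rather than an insert, exactly as the last rule states.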
### `imm_videos`
Keep existing video metadata. Add:
- `anime_id INTEGER`
- `parsed_filename TEXT`
- `parsed_title TEXT`
- `parsed_title_normalized TEXT`
- `parsed_season INTEGER`
- `parsed_episode INTEGER`
- `parsed_episode_title TEXT`
- `parser_source TEXT`
- `parser_confidence TEXT`
- `parse_metadata_json TEXT`
`canonical_title` remains for compatibility. New fields are additive.
## Parsing and Lookup Flow
During `handleMediaChange(...)`:
1. Normalize path/title with the existing tracker flow.
2. Build/create the video row as today.
3. Parse anime metadata:
- use `guessit` against the basename/title when available
- fallback to existing `parseMediaInfo`
4. Use the parsed title to create/find a provisional anime row if needed.
5. Attempt AniList lookup using the same guessit-first, fallback-parser approach already used elsewhere.
6. If AniList lookup succeeds:
- upgrade or fill the anime row with AniList id/title metadata
- keep per-video season/episode fields on the video row
7. Link the video row to `anime_id` and store parsed per-video metadata.
## Query Shape
Add anime-aware query functions without deleting current video/session queries:
- anime library list
- anime detail summary
- anime episode list / season breakdown
- anime sessions list
Aggregation should group by `anime_id`, not `canonical_title`, so rewatches and multiple files collapse correctly.
## Edge Cases
- Multiple files for one anime: many videos may point to one anime row.
- Rewatches: same video/session history still aggregates under one anime row.
- No AniList match: keep provisional anime row keyed by normalized parsed title.
- Later AniList match: upgrade provisional row in place.
- Parser disagreement between files: season/episode remain per-video; anime identity uses AniList id or normalized parsed title.
- Remote/Jellyfin playback: use the effective title/path available to the current tracker flow and run the same parser pipeline.
## Testing Strategy
Start red/green with focused DB-backed tests:
- schema test for `imm_anime` and new video columns
- storage test for provisional anime creation, reuse, and AniList upgrade
- service test for media-change ingest wiring
- query test for anime-level aggregation and episode breakdown
Primary verification lane for implementation: `bun run test:immersion:sqlite:src`, then broader repo verification as needed.


@@ -0,0 +1,370 @@
# Immersion Anime Metadata Implementation Plan
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
**Goal:** Add anime-level immersion metadata, link videos to anime rows, and expose anime/season/episode query surfaces so future stats can aggregate by anime instead of only by video title.
**Architecture:** Introduce a new `imm_anime` table plus additive `imm_videos` metadata columns. Wire media ingest through a guessit-first, fallback-parser flow that always creates or reuses an anime row, stores per-video episode metadata, and upgrades provisional anime rows when AniList data becomes available. Keep existing video/session behavior compatible while adding new query surfaces in parallel.
**Tech Stack:** TypeScript, Bun, libsql SQLite, existing immersion tracker storage/query/service modules, existing AniList parser helpers (`guessit`, `parseMediaInfo`)
---
### Task 1: Add Red Tests for Schema Shape
**Files:**
- Modify: `src/core/services/immersion-tracker/storage-session.test.ts`
- Inspect: `src/core/services/immersion-tracker/storage.ts`
- Inspect: `src/core/services/immersion-tracker/types.ts`
**Step 1: Write the failing schema test**
Add assertions that `ensureSchema()` creates:
- `imm_anime`
- new `imm_videos` columns for `anime_id`, parsed filename/title, season, episode, parser source/confidence, and parse metadata
Use `PRAGMA table_info(imm_videos)` and `sqlite_master` queries instead of indirect assertions.
**Step 2: Run the targeted test to verify it fails**
Run:
```bash
bun test src/core/services/immersion-tracker/storage-session.test.ts
```
Expected: FAIL because the new table/columns do not exist yet.
**Step 3: Implement minimal schema changes**
Modify `src/core/services/immersion-tracker/storage.ts` and `src/core/services/immersion-tracker/types.ts`:
- add `imm_anime`
- add new `imm_videos` columns
- add indexes/FKs needed for anime lookup
- bump schema version for the fresh-schema path
- do not add migration/backfill logic for older DB contents
**Step 4: Re-run the targeted test**
Run:
```bash
bun test src/core/services/immersion-tracker/storage-session.test.ts
```
Expected: PASS.
**Step 5: Commit**
```bash
git add src/core/services/immersion-tracker/types.ts src/core/services/immersion-tracker/storage.ts src/core/services/immersion-tracker/storage-session.test.ts
git commit -m "feat(immersion): add anime schema and video metadata fields"
```
### Task 2: Add Red Tests for Anime Storage Identity and Upgrade Rules
**Files:**
- Modify: `src/core/services/immersion-tracker/storage-session.test.ts`
- Modify: `src/core/services/immersion-tracker/storage.ts`
- Inspect: `src/core/services/immersion-tracker/query.ts`
**Step 1: Write failing storage tests**
Add DB-backed tests for:
- creating a provisional anime row from normalized parsed title
- reusing that row for another video from the same anime
- upgrading the same row when AniList id/title metadata becomes available later
- preserving per-video season/episode values while sharing one anime row
Prefer explicit row assertions over service-level mocks.
**Step 2: Run the targeted test file to verify it fails**
Run:
```bash
bun test src/core/services/immersion-tracker/storage-session.test.ts
```
Expected: FAIL because storage helpers do not exist yet.
**Step 3: Implement minimal storage helpers**
In `src/core/services/immersion-tracker/storage.ts`, add focused helpers such as:
- normalize anime identity key from parsed title
- get/create provisional anime row
- upgrade anime row with AniList data
- update/link per-video anime metadata
Keep responsibilities narrow and composable; do not bury query logic in the service class.
**Step 4: Re-run the targeted test file**
Run:
```bash
bun test src/core/services/immersion-tracker/storage-session.test.ts
```
Expected: PASS.
**Step 5: Commit**
```bash
git add src/core/services/immersion-tracker/storage.ts src/core/services/immersion-tracker/storage-session.test.ts
git commit -m "feat(immersion): store provisional anime rows and upgrade with AniList data"
```
### Task 3: Add Red Tests for Parser Metadata Extraction
**Files:**
- Modify: `src/core/services/immersion-tracker/metadata.test.ts`
- Modify: `src/core/services/immersion-tracker/metadata.ts`
- Inspect: `src/jimaku/utils.ts`
- Inspect: `src/core/services/anilist/anilist-updater.ts`
**Step 1: Write failing parser tests**
Add tests for a helper that returns parsed anime/video metadata from a media path/title:
- uses `guessit` output first when available
- falls back to built-in parser when `guessit` throws or returns incomplete data
- preserves season/episode/title/source/confidence
- records filename/basename for per-video metadata
Use representative filenames like:
- `Little Witch Academia S02E05.mkv`
- `[SubsPlease] Frieren - 03 (1080p).mkv`
**Step 2: Run the targeted parser test file to verify it fails**
Run:
```bash
bun test src/core/services/immersion-tracker/metadata.test.ts
```
Expected: FAIL because the helper does not exist yet.
**Step 3: Implement the minimal parser helper**
In `src/core/services/immersion-tracker/metadata.ts`:
- add a focused helper that wraps guessit-first parsing
- reuse existing parser conventions instead of inventing a new format
- keep ffprobe/local media metadata behavior intact
If shared types are needed, add them in `src/core/services/immersion-tracker/types.ts`.
**Step 4: Re-run the targeted parser test**
Run:
```bash
bun test src/core/services/immersion-tracker/metadata.test.ts
```
Expected: PASS.
**Step 5: Commit**
```bash
git add src/core/services/immersion-tracker/metadata.ts src/core/services/immersion-tracker/metadata.test.ts src/core/services/immersion-tracker/types.ts
git commit -m "feat(immersion): add guessit-first anime metadata parsing helper"
```
### Task 4: Add Red Tests for Media-Change Ingest Wiring
**Files:**
- Modify: `src/core/services/immersion-tracker-service.test.ts`
- Modify: `src/core/services/immersion-tracker-service.ts`
- Inspect: `src/core/services/immersion-tracker/storage.ts`
- Inspect: `src/core/services/immersion-tracker/metadata.ts`
**Step 1: Write failing service tests**
Add focused tests showing that `handleMediaChange(...)`:
- creates/links an anime row
- stores parsed season/episode/file metadata on the active video row
- reuses the same anime row across multiple video files for the same parsed anime
- keeps working when AniList lookup is missing
Prefer DB-backed assertions after service calls rather than deep mocking.
**Step 2: Run the targeted service test to verify it fails**
Run:
```bash
bun test src/core/services/immersion-tracker-service.test.ts
```
Expected: FAIL because ingest does not yet populate anime metadata.
**Step 3: Implement the minimal service wiring**
Modify `src/core/services/immersion-tracker-service.ts` to:
- call the new parser helper during media change
- create/reuse provisional anime rows
- persist per-video metadata
- trigger AniList enrichment/upgrade only as far as current dependencies already allow
Do not refactor unrelated tracker behavior while making this pass.
**Step 4: Re-run the targeted service test**
Run:
```bash
bun test src/core/services/immersion-tracker-service.test.ts
```
Expected: PASS.
**Step 5: Commit**
```bash
git add src/core/services/immersion-tracker-service.ts src/core/services/immersion-tracker-service.test.ts
git commit -m "feat(immersion): link videos to anime metadata during media ingest"
```
### Task 5: Add Red Tests for Anime Query Surfaces
**Files:**
- Modify: `src/core/services/immersion-tracker/__tests__/query.test.ts`
- Modify: `src/core/services/immersion-tracker/query.ts`
- Modify: `src/core/services/immersion-tracker/types.ts`
**Step 1: Write failing query tests**
Add tests for new query functions such as:
- anime library summary list
- anime detail summary
- per-anime episode list or season breakdown
Seed the DB with:
- one anime with multiple episode files
- repeated sessions on one episode
- another anime for contrast
Assert grouping by `anime_id`, not by `canonical_title`.
**Step 2: Run the targeted query test to verify it fails**
Run:
```bash
bun test src/core/services/immersion-tracker/__tests__/query.test.ts
```
Expected: FAIL because the anime query functions/types do not exist yet.
**Step 3: Implement minimal query functions**
Modify `src/core/services/immersion-tracker/query.ts` and related exported types to add anime-level queries in parallel with existing video-level queries.
Keep SQL explicit and aggregation stable:
- anime totals from linked sessions/videos
- episode/season data from video-level parsed fields
**Step 4: Re-run the targeted query test**
Run:
```bash
bun test src/core/services/immersion-tracker/__tests__/query.test.ts
```
Expected: PASS.
**Step 5: Commit**
```bash
git add src/core/services/immersion-tracker/query.ts src/core/services/immersion-tracker/types.ts src/core/services/immersion-tracker/__tests__/query.test.ts
git commit -m "feat(immersion): add anime-level stats queries"
```
### Task 6: Integrate Export Surfaces and Compatibility Checks
**Files:**
- Modify: `src/core/services/immersion-tracker-service.ts`
- Modify: any stats-server or API files only if needed after query integration
- Inspect: `src/core/services/__tests__/stats-server.test.ts`
- Inspect: `stats/src/lib/dashboard-data.ts`
**Step 1: Write the smallest failing integration test if API surface changes**
Only if the service/API export surface changes, add one failing test proving the new query path is exposed correctly. If no export change is needed yet, skip straight to implementation and note the skip in the task notes.
**Step 2: Run the targeted test to verify red state**
Run only the affected test file, for example:
```bash
bun test src/core/services/__tests__/stats-server.test.ts
```
Expected: FAIL if a new API contract is required; otherwise explicitly skip.
**Step 3: Implement minimal integration**
Export new query methods through the service only where needed for the next stats consumer. Avoid prematurely reshaping the public API if current UI work is out of scope.
**Step 4: Run the targeted integration test**
Run:
```bash
bun test src/core/services/__tests__/stats-server.test.ts
```
Expected: PASS, or documented skip if no API change was needed.
**Step 5: Commit**
```bash
git add src/core/services/immersion-tracker-service.ts src/core/services/__tests__/stats-server.test.ts stats/src/lib/dashboard-data.ts
git commit -m "feat(stats): expose anime-level immersion data where needed"
```
### Task 7: Run Focused Verification and Update Docs/Task
**Files:**
- Modify: `backlog/tasks/task-169 - Add-anime-level-immersion-metadata-and-link-videos.md`
- Modify: docs only if implementation changes user-visible behavior or API expectations
**Step 1: Run the focused SQLite immersion lane**
Run:
```bash
bun run test:immersion:sqlite:src
```
Expected: PASS.
**Step 2: Run any additional required verification**
Use the repo verifier/classifier to choose broader lanes if the diff touches runtime or stats-server surfaces:
```bash
bash .agents/skills/subminer-change-verification/scripts/classify_subminer_diff.sh
bash .agents/skills/subminer-change-verification/scripts/verify_subminer_change.sh --lane core
```
Escalate only if the touched files require it.
**Step 3: Update task notes and final summary**
Record:
- commands run
- pass/fail
- skipped lanes
- remaining risks
Update the task plan section if actual execution deviated.
**Step 4: Commit**
```bash
git add backlog/tasks/task-169\ -\ Add-anime-level-immersion-metadata-and-link-videos.md
git commit -m "docs(backlog): record immersion anime metadata verification"
```


@@ -0,0 +1,56 @@
# Episode Detail & Anki Card Link — Design
**Date**: 2026-03-14
**Status**: Approved
## Motivation
The anime detail page shows episodes and cards mined but lacks drill-down into individual episodes. Users want to see per-episode stats (sessions, words, cards) and link directly to mined Anki cards.
## Design
### 1. Episode Expandable Detail
Click an episode row in `EpisodeList` or `AnimeCardsList` → expands inline:
- Sessions list for this episode (sessions linked to video_id)
- Cards mined list — timestamps + "Open in Anki" button per card (when note ID available)
- Top words from this episode (word occurrences scoped to video_id)
### 2. Anki Note ID Storage
- Extend `recordCardsMined` callback to accept note IDs: `recordCardsMined(count, noteIds)`
- Store in CARD_MINED event payload: `{ cardsMined: 1, noteIds: [12345] }`
- Proxy already has note IDs in `pendingNoteIds` — pass through callback chain
- Polling has note IDs from `newNoteIds` — same treatment
- No schema change — note IDs stored in existing `payload_json` column on `imm_session_events`
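Because note IDs ride along in `payload_json`, readers must tolerate older events that lack the field. A defensive parse sketch, assuming only the payload shape shown above:

```typescript
// Extract note IDs from a CARD_MINED payload; older events may omit them.
function noteIdsFromPayload(payloadJson: string | null): number[] {
  if (!payloadJson) return [];
  try {
    const parsed = JSON.parse(payloadJson);
    return Array.isArray(parsed.noteIds)
      ? parsed.noteIds.filter((id: unknown) => typeof id === "number")
      : [];
  } catch {
    return []; // malformed payloads degrade to "no note IDs"
  }
}
```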
### 3. "Open in Anki" Flow
- New endpoint: `POST /api/stats/anki/browse?noteId=12345`
- Calls AnkiConnect `guiBrowse` with query `nid:12345`
- Opens Anki's card browser filtered to that note
- Frontend button hits this endpoint
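The AnkiConnect call behind the endpoint is a standard `guiBrowse` action (API version 6). A sketch of the request body the server would send:

```typescript
// Build the AnkiConnect guiBrowse request body for one note ID.
function buildGuiBrowseRequest(noteId: number): string {
  return JSON.stringify({
    action: "guiBrowse",
    version: 6,
    params: { query: `nid:${noteId}` },
  });
}
```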
### 4. Episode Words
- New query: `getEpisodeWords(videoId)` — like `getAnimeWords` but filtered by video_id
- Reuse AnimeWordList component pattern
### 5. Backend Changes
**Modified files:**
- `src/anki-integration/anki-connect-proxy.ts` — pass note IDs through recordCardsAdded callback
- `src/anki-integration/polling.ts` — pass note IDs through recordCardsAdded callback
- `src/anki-integration.ts` — update callback signature
- `src/core/services/immersion-tracker-service.ts` — accept and store note IDs in recordCardsMined
- `src/core/services/immersion-tracker/query.ts` — add getEpisodeWords, getEpisodeSessions, getEpisodeCardEvents
- `src/core/services/stats-server.ts` — add episode detail and anki browse endpoints
### 6. Frontend Changes
**Modified files:**
- `stats/src/components/anime/EpisodeList.tsx` — make rows expandable
- `stats/src/components/anime/AnimeCardsList.tsx` — make rows expandable
**New files:**
- `stats/src/components/anime/EpisodeDetail.tsx` — inline expandable content


@@ -0,0 +1,402 @@
# Episode Detail & Anki Card Link Implementation Plan
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
**Goal:** Add expandable episode detail rows with per-episode sessions/words/cards, store Anki note IDs in card mined events, and add "Open in Anki" button that opens the card browser.
**Architecture:** Extend the recordCardsMined callback chain to pass note IDs alongside count. Store them in the existing payload_json column. Add a stats server endpoint that proxies AnkiConnect's guiBrowse. Frontend makes episode rows expandable with inline detail content.
**Tech Stack:** Hono (backend API), AnkiConnect (guiBrowse), React + Recharts + Tailwind/Catppuccin (frontend), bun test (backend), Vitest (frontend)
---
## Task 1: Extend recordCardsAdded callback to pass note IDs
**Files:**
- Modify: `src/anki-integration/anki-connect-proxy.ts`
- Modify: `src/anki-integration/polling.ts`
- Modify: `src/anki-integration.ts`
**Step 1: Update the callback type**
In `src/anki-integration/anki-connect-proxy.ts` line 18, change:
```typescript
recordCardsAdded?: (count: number) => void;
```
to:
```typescript
recordCardsAdded?: (count: number, noteIds: number[]) => void;
```
In `src/anki-integration/polling.ts` line 12, same change.
**Step 2: Pass note IDs through the proxy callback**
In `src/anki-integration/anki-connect-proxy.ts`, the current call at line 349 reports only the count:
```typescript
this.deps.recordCardsAdded?.(enqueuedCount);
```
Dedup has already happened by this point: `enqueueNotes` (lines 334-348) iterates `noteIds`, skips duplicates, and pushes accepted ones onto `this.pendingNoteIds`, with `enqueuedCount` tracking how many were accepted. The callback needs the IDs that were actually enqueued, not the raw `noteIds` parameter, so collect them as they are accepted:
```typescript
enqueueNotes(noteIds: number[]): void {
const accepted: number[] = [];
for (const noteId of noteIds) {
if (this.pendingNoteIdSet.has(noteId) || this.inFlightNoteIds.has(noteId)) {
continue;
}
this.pendingNoteIds.push(noteId);
this.pendingNoteIdSet.add(noteId);
accepted.push(noteId);
}
if (accepted.length > 0) {
this.deps.recordCardsAdded?.(accepted.length, accepted);
}
// ... rest of method
}
```
**Step 3: Pass note IDs through the polling callback**
In `src/anki-integration/polling.ts` line 84, change:
```typescript
this.deps.recordCardsAdded?.(newNoteIds.length);
```
to:
```typescript
this.deps.recordCardsAdded?.(newNoteIds.length, newNoteIds);
```
**Step 4: Update AnkiIntegration callback chain**
In `src/anki-integration.ts`:
Line 140, change field type:
```typescript
private recordCardsMinedCallback: ((count: number, noteIds?: number[]) => void) | null = null;
```
Line 154, update constructor param:
```typescript
recordCardsMined?: (count: number, noteIds?: number[]) => void
```
Lines 214-216 (polling deps), change to:
```typescript
recordCardsAdded: (count, noteIds) => {
this.recordCardsMinedCallback?.(count, noteIds);
}
```
Lines 238-240 (proxy deps), same change.
Lines 1125-1127 (setter), update signature:
```typescript
setRecordCardsMinedCallback(callback: ((count: number, noteIds?: number[]) => void) | null): void
```
**Step 5: Commit**
```bash
git commit -m "feat(anki): pass note IDs through recordCardsAdded callback chain"
```
---
## Task 2: Store note IDs in card mined event payload
**Files:**
- Modify: `src/core/services/immersion-tracker-service.ts`
**Step 1: Update recordCardsMined to accept and store note IDs**
Find the `recordCardsMined` method (line 759). Change signature and payload:
```typescript
recordCardsMined(count = 1, noteIds?: number[]): void {
if (!this.sessionState) return;
this.sessionState.cardsMined += count;
this.sessionState.pendingTelemetry = true;
this.recordWrite({
kind: 'event',
sessionId: this.sessionState.sessionId,
sampleMs: Date.now(),
eventType: EVENT_CARD_MINED,
wordsDelta: 0,
cardsDelta: count,
payloadJson: sanitizePayload(
{ cardsMined: count, ...(noteIds?.length ? { noteIds } : {}) },
this.maxPayloadBytes,
),
});
}
```
**Step 2: Update the caller in main.ts**
Find where `recordCardsMined` is called (around lines 2506-2508 and 3409-3411). Pass `noteIds` through:
```typescript
recordCardsMined: (count, noteIds) => {
ensureImmersionTrackerStarted();
appState.immersionTracker?.recordCardsMined(count, noteIds);
}
```
**Step 3: Commit**
```bash
git commit -m "feat(immersion): store anki note IDs in card mined event payload"
```
---
## Task 3: Add episode-level query functions
**Files:**
- Modify: `src/core/services/immersion-tracker/query.ts`
- Modify: `src/core/services/immersion-tracker/types.ts`
- Modify: `src/core/services/immersion-tracker-service.ts`
**Step 1: Add types**
In `types.ts`, add:
```typescript
export interface EpisodeCardEventRow {
eventId: number;
sessionId: number;
tsMs: number;
cardsDelta: number;
noteIds: number[];
}
```
**Step 2: Add query functions**
In `query.ts`:
```typescript
export function getEpisodeWords(db: DatabaseSync, videoId: number, limit = 50): AnimeWordRow[] {
return db.prepare(`
SELECT w.id AS wordId, w.headword, w.word, w.reading, w.part_of_speech AS partOfSpeech,
SUM(o.occurrence_count) AS frequency
FROM imm_word_line_occurrences o
JOIN imm_subtitle_lines sl ON sl.line_id = o.line_id
JOIN imm_words w ON w.id = o.word_id
WHERE sl.video_id = ?
GROUP BY w.id
ORDER BY frequency DESC
LIMIT ?
`).all(videoId, limit) as unknown as AnimeWordRow[];
}
export function getEpisodeSessions(db: DatabaseSync, videoId: number): SessionSummaryQueryRow[] {
return db.prepare(`
SELECT
s.session_id AS sessionId, s.video_id AS videoId,
v.canonical_title AS canonicalTitle,
s.started_at_ms AS startedAtMs, s.ended_at_ms AS endedAtMs,
COALESCE(MAX(t.total_watched_ms), 0) AS totalWatchedMs,
COALESCE(MAX(t.active_watched_ms), 0) AS activeWatchedMs,
COALESCE(MAX(t.lines_seen), 0) AS linesSeen,
COALESCE(MAX(t.words_seen), 0) AS wordsSeen,
COALESCE(MAX(t.tokens_seen), 0) AS tokensSeen,
COALESCE(MAX(t.cards_mined), 0) AS cardsMined,
COALESCE(MAX(t.lookup_count), 0) AS lookupCount,
COALESCE(MAX(t.lookup_hits), 0) AS lookupHits
FROM imm_sessions s
JOIN imm_videos v ON v.video_id = s.video_id
LEFT JOIN imm_session_telemetry t ON t.session_id = s.session_id
WHERE s.video_id = ?
GROUP BY s.session_id
ORDER BY s.started_at_ms DESC
`).all(videoId) as SessionSummaryQueryRow[];
}
export function getEpisodeCardEvents(db: DatabaseSync, videoId: number): EpisodeCardEventRow[] {
const rows = db.prepare(`
SELECT e.event_id AS eventId, e.session_id AS sessionId,
e.ts_ms AS tsMs, e.cards_delta AS cardsDelta,
e.payload_json AS payloadJson
FROM imm_session_events e
JOIN imm_sessions s ON s.session_id = e.session_id
WHERE s.video_id = ? AND e.event_type = 4
ORDER BY e.ts_ms DESC
`).all(videoId) as Array<{ eventId: number; sessionId: number; tsMs: number; cardsDelta: number; payloadJson: string | null }>;
return rows.map(row => {
let noteIds: number[] = [];
if (row.payloadJson) {
try {
const parsed = JSON.parse(row.payloadJson);
if (Array.isArray(parsed.noteIds)) noteIds = parsed.noteIds;
} catch {}
}
return { eventId: row.eventId, sessionId: row.sessionId, tsMs: row.tsMs, cardsDelta: row.cardsDelta, noteIds };
});
}
```
**Step 3: Add wrapper methods to immersion-tracker-service.ts**
**Step 4: Commit**
```bash
git commit -m "feat(stats): add episode-level query functions for sessions, words, cards"
```
---
## Task 4: Add episode detail and Anki browse API endpoints
**Files:**
- Modify: `src/core/services/stats-server.ts`
- Modify: `src/core/services/__tests__/stats-server.test.ts`
**Step 1: Add episode detail endpoint**
```typescript
app.get('/api/stats/episode/:videoId/detail', async (c) => {
const videoId = parseIntQuery(c.req.param('videoId'), 0);
if (videoId <= 0) return c.body(null, 400);
const sessions = await tracker.getEpisodeSessions(videoId);
const words = await tracker.getEpisodeWords(videoId);
const cardEvents = await tracker.getEpisodeCardEvents(videoId);
return c.json({ sessions, words, cardEvents });
});
```
**Step 2: Add Anki browse endpoint**
```typescript
app.post('/api/stats/anki/browse', async (c) => {
const noteId = parseIntQuery(c.req.query('noteId'), 0);
if (noteId <= 0) return c.body(null, 400);
try {
const response = await fetch('http://127.0.0.1:8765', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ action: 'guiBrowse', version: 6, params: { query: `nid:${noteId}` } }),
});
const result = await response.json();
return c.json(result);
} catch (err) {
return c.json({ error: 'Failed to reach AnkiConnect' }, 502);
}
});
```
**Step 3: Add tests and verify**
Run: `bun test ./src/core/services/__tests__/stats-server.test.ts`
**Step 4: Commit**
```bash
git commit -m "feat(stats): add episode detail and anki browse endpoints"
```
---
## Task 5: Add frontend types and API client methods
**Files:**
- Modify: `stats/src/types/stats.ts`
- Modify: `stats/src/lib/api-client.ts`
- Modify: `stats/src/lib/ipc-client.ts`
**Step 1: Add types**
```typescript
export interface EpisodeCardEvent {
eventId: number;
sessionId: number;
tsMs: number;
cardsDelta: number;
noteIds: number[];
}
export interface EpisodeDetailData {
sessions: SessionSummary[];
words: AnimeWord[];
cardEvents: EpisodeCardEvent[];
}
```
**Step 2: Add API client methods**
```typescript
getEpisodeDetail: (videoId: number) => fetchJson<EpisodeDetailData>(`/api/stats/episode/${videoId}/detail`),
ankiBrowse: (noteId: number) => fetchJson<unknown>(`/api/stats/anki/browse?noteId=${noteId}`, { method: 'POST' }),
```
Mirror in ipc-client.
**Step 3: Commit**
```bash
git commit -m "feat(stats): add episode detail types and API client methods"
```
---
## Task 6: Build EpisodeDetail component
**Files:**
- Create: `stats/src/components/anime/EpisodeDetail.tsx`
- Modify: `stats/src/components/anime/EpisodeList.tsx`
- Modify: `stats/src/components/anime/AnimeCardsList.tsx`
**Step 1: Create EpisodeDetail component**
Inline expandable content showing:
- Sessions list (compact: time, duration, cards, words)
- Cards mined list with "Open in Anki" button per note ID
- Top words grid (reuse AnimeWordList pattern)
Fetches data from `getEpisodeDetail(videoId)` on mount.
"Open in Anki" button calls `apiClient.ankiBrowse(noteId)`.
**Step 2: Make EpisodeList rows expandable**
Add `expandedVideoId` state. Clicking a row toggles expansion. Render `EpisodeDetail` below the expanded row.
**Step 3: Make AnimeCardsList rows expandable**
Same pattern — clicking an episode row expands to show `EpisodeDetail`.
**Step 4: Commit**
```bash
git commit -m "feat(stats): add expandable episode detail with anki card links"
```
---
## Task 7: Build and verify
**Step 1: Type check**
Run: `npx tsc --noEmit`
**Step 2: Run backend tests**
Run: `bun test ./src/core/services/__tests__/stats-server.test.ts`
**Step 3: Run frontend tests**
Run: `npx vitest run`
**Step 4: Build**
Run: `npx vite build`
**Step 5: Commit any fixes**
```bash
git commit -m "feat(stats): episode detail and anki link complete"
```


@@ -0,0 +1,115 @@
# Immersion Occurrence Tracking Design
**Problem:** `imm_words` and `imm_kanji` only store global aggregates. They cannot answer "where did this word/kanji appear?" at the anime, episode, timestamp, or subtitle-line level.
**Goals:**
- Map normalized words and kanji back to exact subtitle lines.
- Preserve repeated tokens inside one subtitle line.
- Avoid storing token text repeatedly for each repeated token in the same line.
- Keep the change additive and compatible with current top-word/top-kanji stats.
**Non-Goals:**
- Exact token character offsets inside a subtitle line.
- Full stats UI redesign in the same change.
- Replacing existing aggregate tables or existing vocabulary queries.
## Recommended Approach
Add a normalized subtitle-line table plus counted bridge tables from lines to canonical word and kanji rows. Keep `imm_words` and `imm_kanji` as canonical lexeme aggregates, then link them to `imm_subtitle_lines` through one row per unique lexeme per line with `occurrence_count`.
This preserves total frequency within a line without duplicating token text or needing one row per repeated token. Reverse mapping becomes a simple join from canonical lexeme to line row to video/anime context.
## Data Model
### `imm_subtitle_lines`
One row per recorded subtitle line.
Suggested fields:
- `line_id INTEGER PRIMARY KEY AUTOINCREMENT`
- `session_id INTEGER NOT NULL`
- `event_id INTEGER`
- `video_id INTEGER NOT NULL`
- `anime_id INTEGER`
- `line_index INTEGER NOT NULL`
- `segment_start_ms INTEGER`
- `segment_end_ms INTEGER`
- `text TEXT NOT NULL`
- `created_date INTEGER`
- `last_update_date INTEGER`
Notes:
- `event_id` links back to `imm_session_events` when the subtitle-line event is written.
- `anime_id` is nullable because some rows may predate anime linkage or come from unresolved media.
### `imm_word_line_occurrences`
One row per normalized word per subtitle line.
Suggested fields:
- `line_id INTEGER NOT NULL`
- `word_id INTEGER NOT NULL`
- `occurrence_count INTEGER NOT NULL`
- `PRIMARY KEY(line_id, word_id)`
`word_id` points at the canonical row in `imm_words`.
### `imm_kanji_line_occurrences`
One row per kanji per subtitle line.
Suggested fields:
- `line_id INTEGER NOT NULL`
- `kanji_id INTEGER NOT NULL`
- `occurrence_count INTEGER NOT NULL`
- `PRIMARY KEY(line_id, kanji_id)`
`kanji_id` points at the canonical row in `imm_kanji`.
## Write Path
During `recordSubtitleLine(...)`:
1. Normalize and validate the line as today.
2. Compute counted word and kanji occurrences for the line.
3. Upsert canonical `imm_words` / `imm_kanji` rows as today.
4. Insert one `imm_subtitle_lines` row for the line.
5. Insert counted bridge rows for each normalized word and kanji found in that line.
Counting rules:
- Words: count repeated allowed tokens in the token list; skip tokens excluded by the existing POS/noise filter.
- Kanji: count repeated kanji characters from the visible subtitle line text.
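The counting rules can be sketched as two pure helpers. The token filter here is a stand-in for the existing POS/noise filter, and the kanji check is a simplified Han-script test, both assumptions for illustration:

```typescript
// Count allowed tokens per lexeme; isAllowed stands in for the real POS/noise filter.
function countWordOccurrences(
  tokens: string[],
  isAllowed: (token: string) => boolean,
): Map<string, number> {
  const counts = new Map<string, number>();
  for (const token of tokens) {
    if (!isAllowed(token)) continue;
    counts.set(token, (counts.get(token) ?? 0) + 1);
  }
  return counts;
}

// Count repeated kanji characters in the visible subtitle line text.
function countKanjiOccurrences(text: string): Map<string, number> {
  const counts = new Map<string, number>();
  for (const ch of text) {
    if (/\p{Script=Han}/u.test(ch)) {
      counts.set(ch, (counts.get(ch) ?? 0) + 1);
    }
  }
  return counts;
}
```

Each resulting map entry becomes one bridge row with `occurrence_count` set to the count, so repeated tokens never duplicate stored text.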
## Query Shape
Add reverse-mapping query functions for:
- word -> recent occurrence rows
- kanji -> recent occurrence rows
Each row should include enough context for drilldown:
- anime id/title
- video id/title
- session id
- line index
- segment start/end
- subtitle text
- occurrence count within that line
Existing top-word/top-kanji aggregate queries stay in place.
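A sketch of the drilldown row such a reverse-mapping query could return; the names are illustrative, not an existing type:

```typescript
// Illustrative context row for "where did this word/kanji appear?" drilldowns.
interface OccurrenceContextRow {
  animeId: number | null;   // null when no anime link exists yet
  animeTitle: string | null;
  videoId: number;
  videoTitle: string;
  sessionId: number;
  lineIndex: number;
  segmentStartMs: number | null;
  segmentEndMs: number | null;
  text: string;             // the subtitle line itself
  occurrenceCount: number;  // count within this one line
}

const row: OccurrenceContextRow = {
  animeId: 1,
  animeTitle: "Frieren",
  videoId: 7,
  videoTitle: "Frieren - 03",
  sessionId: 42,
  lineIndex: 118,
  segmentStartMs: 734_000,
  segmentEndMs: 737_500,
  text: "魔法は探し続けるものですから",
  occurrenceCount: 1,
};
```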
## Edge Cases
- Repeated tokens in one line: store once per lexeme per line with `occurrence_count > 1`.
- Duplicate identical lines in one session: each subtitle event gets its own `imm_subtitle_lines` row.
- No anime link yet: keep `anime_id` null and still preserve the line/video/session mapping.
- Legacy DBs: additive migration only; no destructive rebuild of existing word/kanji data.
## Testing Strategy
Start with focused DB-backed tests:
- schema test for new line/bridge tables and indexes
- service test for counted word/kanji line persistence
- query tests for reverse mapping from word/kanji to line/anime/video context
- migration test for existing DBs gaining the new tables cleanly
Primary verification lane: `bun run test:immersion:sqlite:src`, then broader lanes only if API/runtime surfaces widen.


@@ -0,0 +1,71 @@
# Immersion Occurrence Tracking Implementation Plan
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
**Goal:** Add normalized counted occurrence tracking for immersion words and kanji so stats can map each item back to anime, episode, timestamp, and subtitle line context.
**Architecture:** Introduce `imm_subtitle_lines` plus counted bridge tables from lines to canonical `imm_words` and `imm_kanji` rows. Extend the subtitle write path to persist one line row per subtitle event, retain aggregate lexeme tables, and expose reverse-mapping queries without duplicating repeated token text in storage.
**Tech Stack:** TypeScript, Bun, libsql SQLite, existing immersion tracker storage/query/service modules
---
### Task 1: Lock schema and migration shape down with failing tests
**Files:**
- Modify: `src/core/services/immersion-tracker/storage-session.test.ts`
- Modify: `src/core/services/immersion-tracker/storage.ts`
- Modify: `src/core/services/immersion-tracker/types.ts`
**Steps:**
1. Add a red test asserting `ensureSchema()` creates `imm_subtitle_lines`, `imm_word_line_occurrences`, and `imm_kanji_line_occurrences`, plus additive migration support from the previous schema version.
2. Run `bun test src/core/services/immersion-tracker/storage-session.test.ts` and confirm failure.
3. Implement the minimal schema/version/index changes.
4. Re-run the targeted test and confirm green.
### Task 2: Lock counted subtitle-line persistence down with failing tests
**Files:**
- Modify: `src/core/services/immersion-tracker-service.test.ts`
- Modify: `src/core/services/immersion-tracker-service.ts`
- Modify: `src/core/services/immersion-tracker/storage.ts`
**Steps:**
1. Add a red service test that records a subtitle line with repeated allowed words and repeated kanji, then asserts one line row plus counted bridge rows are written.
2. Run `bun test src/core/services/immersion-tracker-service.test.ts` and confirm failure.
3. Implement the minimal subtitle-line insert and counted occurrence write path.
4. Re-run the targeted test and confirm green.
### Task 3: Add reverse-mapping query tests first
**Files:**
- Modify: `src/core/services/immersion-tracker/__tests__/query.test.ts`
- Modify: `src/core/services/immersion-tracker/query.ts`
- Modify: `src/core/services/immersion-tracker/types.ts`
**Steps:**
1. Add red query tests for `word -> lines` and `kanji -> lines` mappings, including anime/video/session/timestamp/text context and per-line `occurrence_count`.
2. Run `bun test src/core/services/immersion-tracker/__tests__/query.test.ts` and confirm failure.
3. Implement the minimal query functions/types.
4. Re-run the targeted test and confirm green.
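The `word -> lines` mapping the red tests describe reduces to one join through the bridge table. A sketch of the query shape, assuming column names (`timestamp_ms`, `text`) that are not fixed by the task; the real implementation would also join out to session/video/anime context:

```typescript
// Sketch of the word -> lines reverse mapping. occurrence_count comes from
// the bridge table; line context comes from imm_subtitle_lines. Column
// names beyond those in the task are assumptions.
const WORD_TO_LINES_SQL = `
  SELECT l.anime_id, l.video_id, l.session_id, l.timestamp_ms, l.text,
         o.occurrence_count
  FROM imm_word_line_occurrences AS o
  JOIN imm_subtitle_lines AS l ON l.id = o.line_id
  WHERE o.word_id = ?
  ORDER BY l.timestamp_ms`;
```

The kanji variant is identical with `imm_kanji_line_occurrences` and `kanji_id` swapped in, which is why both tests belong in the same red batch.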
### Task 4: Expose the new query surface through the tracker service
**Files:**
- Modify: `src/core/services/immersion-tracker-service.ts`
- Modify: any narrow API/service consumer files only if needed
**Steps:**
1. Add the service methods needed to consume the new reverse-mapping queries.
2. Keep the change narrow; do not widen unrelated UI/API contracts unless a current consumer needs them.
3. Re-run the focused affected tests.
### Task 5: Verify with the maintained immersion lane
**Files:**
- Modify: `backlog/tasks/task-171 - Add-normalized-immersion-word-and-kanji-occurrence-tracking.md`
**Steps:**
1. Run the focused SQLite immersion tests first.
2. Escalate to broader verification only if touched files cross into API/runtime boundaries.
3. Record exact commands and results in the backlog task notes/final summary.

# Stats Dashboard Redesign — Anime-Centric Approach
**Date**: 2026-03-14
**Status**: Approved
## Motivation
The current stats dashboard tracks metrics that aren't particularly useful (words seen as a hero stat, word clouds). The data model now supports anime-level tracking (`imm_anime`, `imm_videos` with `parsed_episode`), subtitle line storage (`imm_subtitle_lines`), and word/kanji occurrence mapping (`imm_word_line_occurrences`, `imm_kanji_line_occurrences`). The dashboard should be restructured around anime as the primary unit, with sessions, episodes, and rollups as the core metrics.
## Data Model (already in place)
- `imm_anime` — anime-level: title, AniList ID, romaji/english/native titles, metadata
- `imm_videos` — episode-level: `anime_id`, `parsed_episode`, `parsed_season`
- `imm_sessions` — session-level: linked to video
- `imm_subtitle_lines` — line-level: linked to session, video, anime
- `imm_word_line_occurrences` / `imm_kanji_line_occurrences` — word/kanji → line mapping
- `imm_media_art` — cover art + `episodes_total`
- `imm_daily_rollups` / `imm_monthly_rollups` — aggregated metrics
- `imm_words` — POS data: `part_of_speech`, `pos1`, `pos2`, `pos3`
## Tab Structure (5 tabs)
### 1. Overview
**Hero Stats** (6 cards):
- Watch time today
- Cards mined today
- Sessions today
- Episodes watched today
- Current streak (days)
- Active anime (titles with sessions in last 30 days)
**14-day Watch Time Chart**: Bar chart (keep existing).
**Streak Calendar**: GitHub-contributions-style heatmap, last 90 days, colored by watch time intensity.
**Tracking Snapshot** (secondary stats): Total sessions, total episodes, all-time hours, active days, total cards.
**Recent Activity Feed**: Last 10 sessions grouped by day — anime title + cover art thumbnail, episode number, duration, cards mined.
Removed from Overview: 14-day words chart, "words today", "words this week" hero stats.
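The "current streak" hero stat and the streak calendar can share one daily activity map. A minimal sketch of the streak computation, assuming the rollup keys are `YYYY-MM-DD` strings and walking backward from today in UTC:

```typescript
// Counts consecutive active days ending today, given a set of YYYY-MM-DD
// keys derived from imm_daily_rollups. Date handling is simplified (UTC);
// the key format is an assumption.
function currentStreak(activeDays: Set<string>, today: Date): number {
  let streak = 0;
  const cursor = new Date(today);
  while (activeDays.has(cursor.toISOString().slice(0, 10))) {
    streak++;
    cursor.setUTCDate(cursor.getUTCDate() - 1);
  }
  return streak;
}
```

The same set, binned by watch-time intensity instead of membership, feeds the 90-day heatmap.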
### 2. Anime (replaces Library)
**Grid View**:
- Responsive card grid with cover art
- Each card: title, progress bar (episodes watched / `episodes_total`), watch time, cards mined
- Search/filter by title
- Sort: last watched, watch time, cards mined, progress %
**Anime Detail View** (click into card):
- Header: cover art, titles (romaji/english/native), AniList link if available
- Progress: episode progress bar + "X / Y episodes"
- Stats row: total watch time, cards mined, words seen, lookup hit rate, avg session length
- Episode list: table of episodes (from `imm_videos`), each showing episode number, session count, watch time, cards, last watched date
- Watch time chart: bar chart over time (14d/30d/90d toggle)
- Words from this anime: top words learned from this show (via `imm_word_line_occurrences``imm_subtitle_lines``anime_id`), clickable to vocab detail
- Mining efficiency: cards per hour / cards per episode trend
### 3. Trends
**Existing charts (keep all 9)**:
1. Watch Time (min) — bar
2. Tracked Cards — bar
3. Words Seen — bar
4. Sessions — line
5. Avg Session (min) — line
6. Cards per Hour — line
7. Lookup Hit Rate (%) — line
8. Rolling 7d Watch Time — line
9. Rolling 7d Cards — line
**New charts (6)**:
10. Episodes watched per day/week
11. Anime completion progress over time (cumulative episodes / total across all anime)
12. New anime started over time (first session per anime by date)
13. Watch time per anime (stacked bar — top 5 anime + "other")
14. Streak history (visual streak timeline — active vs gap periods)
15. Cards per episode trend
**Controls**: Time range selector (7d/30d/90d/all), group by (day/month).
### 4. Vocabulary
**Hero Stats** (4 cards):
- Unique words (excluding particles/noise via POS filter)
- Unique kanji
- New this week
- Avg frequency
**Filters/Controls**:
- POS filter toggle: hide particles, single-char tokens by default (toggleable)
- Sort: by frequency / last seen / first seen
- Search by word/reading
**Word List**: Grid/table of words — headword, reading, POS tag, frequency. Each word is clickable.
**Word Detail Panel** (slide-out or modal):
- Headword, reading, POS (`part_of_speech`, `pos1`, `pos2`, `pos3`)
- Frequency + first/last seen dates
- Anime appearances: which anime this word appeared in, frequency per anime
- Example lines: actual subtitle lines where the word was used
- Similar words: words sharing same kanji or reading
**Kanji Section**: Same pattern — clickable kanji grid, detail panel with frequency, anime appearances, example lines, words using this kanji.
**Charts**: Top repeated words bar chart, new words by day timeline.
### 5. Sessions
**Session List**: Chronological, grouped by day.
- Each row: anime title + episode, cover art thumbnail, duration (active/total), cards mined, lines seen, lookup rate
- Expandable detail: session timeline chart (words/cards over time), event log (pauses, seeks, lookups, cards mined)
- Filters: by anime title, date range
Based on existing hidden `SessionsTab` component with anime/episode context added.
## Backend Changes Needed
### New API Endpoints
- `GET /api/stats/anime` — list all anime with episode counts, watch time, progress
- `GET /api/stats/anime/:animeId` — anime detail: episodes, stats, recent sessions
- `GET /api/stats/anime/:animeId/words` — top words from this anime
- `GET /api/stats/vocabulary/:wordId` — word detail: POS, frequency, anime appearances, example lines, similar words
- `GET /api/stats/kanji/:kanjiId` — kanji detail: frequency, anime appearances, example lines, words using this kanji
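A hypothetical shape for the `GET /api/stats/anime` response items, mirroring what the grid-view cards need; the field names are assumptions, not the final contract, and the derived progress percentage guards against anime whose `episodes_total` is unknown:

```typescript
// Assumed item shape for the anime list endpoint; fields mirror the
// grid-view card requirements (title, progress, watch time, cards).
interface AnimeListItem {
  animeId: number;
  title: string;
  episodesWatched: number;
  episodesTotal: number | null; // from imm_media_art; may be unknown
  watchTimeMinutes: number;
  cardsMined: number;
  lastWatchedAt: string | null;
}

// Progress for the card's bar; 0 when episodes_total is unknown or zero.
function progressPercent(item: AnimeListItem): number {
  if (!item.episodesTotal) return 0;
  return Math.round((item.episodesWatched / item.episodesTotal) * 100);
}
```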
### Modified API Endpoints
- `GET /api/stats/vocabulary` — add POS fields to response, support POS filtering query param
- `GET /api/stats/overview` — add episodes today, active anime count
- `GET /api/stats/daily-rollups` — add episode count data for new trend charts
### New Query Functions
- Anime-level aggregation: episodes per anime, watch time per anime, cards per anime
- Word/kanji occurrence lookups: join through `imm_word_line_occurrences``imm_subtitle_lines``imm_anime`
- Streak calendar data: daily activity map for last 90 days
- Episode-level trend data: episodes per day for trend charts
- Stacked watch time: per-anime daily breakdown
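The stacked watch-time chart (top 5 anime + "other") needs a bucketing step after the per-anime aggregation. A sketch under the assumption that the query layer returns a per-anime total map:

```typescript
// Keeps the n largest per-anime totals and folds the rest into an "other"
// bucket, as the stacked bar chart expects. Input shape is an assumption.
function topNWithOther(
  totals: Map<string, number>,
  n: number,
): Map<string, number> {
  const sorted = [...totals.entries()].sort((a, b) => b[1] - a[1]);
  const out = new Map(sorted.slice(0, n));
  const other = sorted.slice(n).reduce((sum, [, v]) => sum + v, 0);
  if (other > 0) out.set("other", other);
  return out;
}
```

Running this per day over the per-anime daily breakdown yields the series for the stacked chart directly.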
