Files
SubMiner/backlog/tasks/task-177.2 - Count-homepage-new-words-by-headword.md
sudacode 5feed360ca feat: add app-owned YouTube subtitle flow with absPlayer-style parsing (#31)
* fix: harden preload argv parsing for popup windows

* fix: align youtube playback with shared overlay startup

* fix: unwrap mpv youtube streams for anki media mining

* docs: update docs for youtube subtitle and mining flow

* refactor: unify cli and runtime wiring for startup and youtube flow

* feat: update subtitle sidebar overlay behavior

* chore: add shared log-file source for diagnostics

* fix(ci): add changelog fragment for immersion changes

* fix: address CodeRabbit review feedback

* fix: persist canonical title from youtube metadata

* style: format stats library tab

* fix: address latest review feedback

* style: format stats library files

* test: stub launcher youtube deps in CI

* test: isolate launcher youtube flow deps

* test: stub launcher youtube deps in failing case

* test: force x11 backend in launcher ci harness

* test: address latest review feedback

* fix(launcher): preserve user YouTube ytdl raw options

* docs(backlog): update task tracking notes

* fix(immersion): special-case youtube media paths in runtime and tracking

* feat(stats): improve YouTube media metadata and picker key handling

* fix(ci): format stats media library hook

* fix: address latest CodeRabbit review items

* docs: update youtube release notes and docs

* feat: auto-load youtube subtitles before manual picker

* fix: restore app-owned youtube subtitle flow

* docs: update youtube playback docs and config copy

* refactor: remove legacy youtube launcher mode plumbing

* fix: refine youtube subtitle startup binding

* docs: clarify youtube subtitle startup behavior

* fix: address PR #31 latest review follow-ups

* fix: address PR #31 follow-up review comments

* test: harden youtube picker test harness

* udpate backlog

* fix: add timeout to youtube metadata probe

* docs: refresh youtube and stats docs

* update backlog

* update backlog

* chore: release v0.9.0
2026-03-24 00:01:24 -07:00

3.2 KiB

id, title, status, assignee, created_date, updated_date, labels, dependencies, references, parent_task_id, priority, ordinal
id title status assignee created_date updated_date labels dependencies references parent_task_id priority ordinal
TASK-177.2 Count homepage new words by headword Done
@codex
2026-03-19 19:38 2026-03-23 03:22
stats
immersion-tracking
vocabulary
src/core/services/immersion-tracker/query.ts
stats/src/components/overview/TrackingSnapshot.tsx
stats/src/lib/dashboard-data.ts
TASK-177 medium 130500

Description

Align the homepage New Words metric with the Known Words semantics by counting distinct headwords first seen in the selected window, so inflected or alternate forms of the same word do not inflate the summary.

Acceptance Criteria

  • #1 Homepage new-word counts use distinct headwords by earliest first-seen timestamp instead of counting separate word-form rows
  • #2 Homepage tooltip/copy reflects the headword-based semantics
  • #3 Automated tests cover the headword de-duplication behavior and affected overview copy

Implementation Plan

  1. Change the new-word aggregate query to group imm_words by headword, compute each headword's earliest first_seen, and count headwords whose first sighting falls within today/week windows.
  2. Add failing tests first for the aggregate path so multiple rows sharing a headword only contribute once.
  3. Update homepage tooltip/copy to say unique headwords first seen today/week.
  4. Run focused query and stats overview tests, then record verification and any blockers.

Implementation Notes

Updated the new-word aggregate to count distinct headwords by each headword's earliest first_seen timestamp, so multiple inflected/form rows for the same headword contribute only once.

Adjusted homepage tooltip copy to say unique headwords first seen today/week, keeping the visible card labels unchanged.

Focused verification passed for the query aggregate and homepage snapshot tests.

SubMiner verifier core lane artifact: .tmp/skill-verification/subminer-verify-20260319-123942-4intgW. bun run typecheck passed there; bun run test:fast still fails for the unrelated environment issue in scripts/update-aur-package.test.ts (mapfile: command not found).

Final Summary

Homepage New Words now uses headword-level semantics instead of counting separate (headword, word, reading) rows. The aggregate query groups imm_words by headword, uses each headword's earliest first_seen, and counts headwords first seen today or this week so alternate forms do not inflate the summary. The homepage tooltip copy now explicitly says the metric is based on unique headwords.

Added focused regression coverage for the de-duplication rule in getQueryHints and for the updated homepage tooltip text. Targeted bun test runs passed for the touched query and stats UI files. Repo verifier --lane core again passed bun run typecheck; bun run test:fast remains blocked by the unrelated existing scripts/update-aur-package.sh: line 71: mapfile: command not found failure.