mirror of
https://github.com/ksyasuda/SubMiner.git
synced 2026-05-13 08:12:54 -07:00
feat(tokenizer): use Yomitan word classes for subtitle POS filtering (#57)
* feat(tokenizer): use Yomitan word classes for subtitle POS filtering - Carry matched headword wordClasses from termsFind into YomitanScanToken - Map recognized Yomitan wordClasses to SubMiner coarse POS before annotation - MeCab enrichment now fills only missing POS fields, preserving existing coarse pos1 - Exclude standalone grammar particles, して helper fragments, and single-kana surfaces from annotations - Respect source-text punctuation gaps when counting N+1 sentence words - Preserve known-word highlight on excluded kanji-containing tokens - Add backlog tasks 304 (N+1 boundary bug) and 305 (wordClasses POS, done) * fix(tokenizer): preserve annotation and enrichment behavior * fix: restore jlpt subtitle underlines * fix: exclude kana-only n+1 targets * fix: refresh overlay on Hyprland fullscreen * fix: address fullscreen and n-plus-one review notes * fix: address CodeRabbit review comments * fix: accept modified digits for multi-line sentence mining * Cancel pending Linux MPV fullscreen overlay refresh bursts - return a cancel handle from the Linux refresh burst scheduler - clear pending refresh bursts when overlays hide or windows close - tighten the burst test polling to wait for the async refresh * fix: suppress N+1 for kana-only candidates and fix minSentenceWords coun - Treat kana-only tokens with surrounding subtitle punctuation (…, ―, etc.) as kana-only so they are not promoted to N+1 targets - Exclude unknown tokens filtered from N+1 targeting from the minSentenceWords count so filtered kana-only unknowns cannot satisfy sentence length threshold - Add regression tests for kana-only candidate suppression and filtered-unknown padding cases * Suppress subtitle annotations for grammar fragments - Hide annotation metadata for auxiliary inflection and ja-nai endings - Preserve lexical `くれる` forms and add regression coverage * Fix kana-only N+1 tokenizer regression test - Use a pure-kana fixture for the subtitle token N+1 case - Update task notes for the latest CodeRabbit follow-up * Fix managed playback exit and tokenizer grammar splits - Ignore background stats daemons during regular app startup - Split standalone grammar endings before applying annotations - Clear helper-span annotations for auxiliary-only tokens * fix: refresh current subtitle after known-word mining * fix: suppress sigh interjection annotations * fix: preserve jlpt underline color after lookup * Replace grammar-ending permutations with shared matcher; preserve word a - Extract `grammar-ending.ts` with `isStandaloneGrammarEndingText` / `isSubtitleGrammarEndingText` pattern matchers - Replace `STANDALONE_GRAMMAR_ENDINGS` set in parser-selection-stage with shared matcher - Replace generated phrase sets in subtitle-annotation-filter with shared matcher - Remove stale duplicate subtitle-exclusion constants and helpers from annotation-stage - Manual clipboard card updates now write only to the sentence audio field, leaving word/expression audio untouched * fix: CI changelog, annotation options threading, and Jellyfin quit - Add `type: fixed` / `area:` frontmatter to `changes/319` to pass `changelog:lint` - Thread `TokenizerAnnotationOptions` through `stripSubtitleAnnotationMetadata` so `sourceText` is honored - Include `jellyfinPlay` in `shouldQuitOnDisconnectWhenOverlayRuntimeInitialized` predicate - Make mouse test `elementFromPoint` stubs coordinate-sensitive - Make Lua test `.tmp` mkdir portable on Windows * Preserve overlay across macOS flaps and mpv playlist changes - keep visible overlays alive during transient macOS tracker loss - reuse the running mpv overlay path on playlist navigation - update regression coverage and changelog fragments * fix: restore stats daemon deferral * fix: keep subtitle prefetch alive after cache hits * Fix JLPT underline color drift and AniList skipped-threshold sync - Replace JLPT `text-decoration` underlines with `border-bottom` so Chromium selection/hover cannot repaint them to another annotation's color - Lock JLPT underline color for combined annotation selectors (known, n+1, frequency) and character hover/selection states - Trigger AniList post-watch check on every mpv time-position update to catch skipped completion thresholds - Fall back to filename-parser season/episode when guessit omits them * fix: address coderabbit feedback * fix: sync AniList after seeked completion * fix: preserve ordinal frequency annotations * fix: preserve known highlighting for filtered tokens * fix: address PR #57 CodeRabbit feedback - Acquire AniList post-watch in-flight lock before async gating to prevent duplicate writes - Isolate manual watched mark result from AniList post-watch callback failures - Report known-word cache clears as mutations during immediate append when state existed - Add regression tests for each fix * fix: stop AniList setup reopening on Linux when keyring token exists - Gate setup success on token persistence: `saveToken` now returns `boolean`; on failure, keeps the setup window open instead of reporting success - Config reload passes `allowSetupPrompt: false` so playback reloads don't re-open the setup window - Add regression test for persistence-failure path * fix: suppress known highlights for subtitle particles * fix: retry transient AniList safeStorage failures * fix: hide overlay focus ring * fix: align Hyprland fullscreen overlays * fix: restore subtitle playback keybindings * fix: align Hyprland overlay windows to mpv and stop pinning them - Force-apply exact Hyprland move/resize/setprop dispatches when bounds are provided - Stop pinning overlay windows; toggle pin off when Hyprland reports pinned=true - Compensate stats overlay outer placement for Electron/Wayland content insets - Make stats overlay window and page opaque so mpv cannot show through transparent insets - Constrain stats app to h-screen with internal scroll so content covers mpv from y=0 - Lock overlay/stats window titles against page-title-updated events - Add regression coverage for placement dispatches, inset compensation, and CSS overlay mode * fix: retain frequency rank for honorific prefix-noun tokens - Add `shouldAllowHonorificPrefixNounFrequency` to exempt お/ご/御 + noun merged tokens from frequency exclusion - Add regression test for `ご機嫌` asserting rank 5484 is preserved after MeCab enrichment and annotation - Close TASK-341 * fix: map openCharacterDictionary session action to --open-character-dict - Add missing Lua CLI dispatch entry for openCharacterDictionary - Add regression test for Alt+Meta+A binding and CLI flag forwarding * fix: keep macOS overlay interactive while mpv remains active - Overlay no longer hides or becomes click-through during tracker refreshes when mpv is the focused window - Preserve already-visible overlay when tracker is temporarily not ready but mpv target signal is active - Add regression tests for active-mpv tracker refresh and transient tracker-not-ready paths * fix: address coderabbit subtitle follow-ups * fix: resolve media detail from sessions when lifetime summary is absent - Change `getMediaDetail` JOIN to LEFT JOIN on `imm_lifetime_media` and fall back to aggregated session metrics when no lifetime row exists - Add filter `AND (lm.video_id IS NOT NULL OR s.session_id IS NOT NULL)` to keep results valid - Add regression test covering the session-visible / media-detail-missing mismatch * fix: address PR-57 CodeRabbit findings and CI failures - use filtered word counts in media detail session token aggregation - cancel fullscreen refresh burst on exit via updateLinuxMpvFullscreenOverlayRefreshBurst - guard Hyprland JSON.parse in try/catch; exclude windowtitle from geometry events - narrow focus suppression from :focus to :focus-visible - apply JLPT lock selectors to word-name-match tokens (N1–N5) * fix: macOS overlay z-order and Yomitan compound token known highlighting - Release always-on-top when tracked mpv loses foreground on macOS - Skip visible overlay blur restacking on macOS to avoid covering unrelated windows - Prefer Yomitan internal parse tokens over fragmented scanner output for known-word decisions - Add regression tests for both behaviors * fix: macOS visible-overlay blur no longer invokes Windows-only blur call - Split win32/darwin branches in handleOverlayWindowBlurred so darwin visible blur returns early without calling onWindowsVisibleOverlayBlur - Add regression test asserting Windows callback stays inactive on macOS visible overlay blur - Close TASK-347
This commit is contained in:
@@ -0,0 +1,6 @@
|
||||
type: fixed
|
||||
area: tokenizer
|
||||
|
||||
- Use Yomitan `wordClasses` metadata for subtitle POS filtering.
|
||||
- Backfill blank MeCab POS detail fields during parser enrichment.
|
||||
- Keep subtitle annotation metadata stripped from token results.
|
||||
@@ -0,0 +1,4 @@
|
||||
type: fixed
|
||||
area: overlay
|
||||
|
||||
- Fixed Hyprland fullscreen transitions so mpv fullscreen changes refresh visible overlay geometry, reassert topmost stacking, and keep primary subtitle hover pause working after resize/toggle cycles.
|
||||
@@ -0,0 +1,4 @@
|
||||
type: fixed
|
||||
area: tokenizer
|
||||
|
||||
- Stopped kana-only subtitle tokens from being selected as N+1 targets.
|
||||
@@ -0,0 +1,4 @@
|
||||
type: fixed
|
||||
area: overlay
|
||||
|
||||
- Overlay: Restored persistent JLPT subtitle underlines while keeping hover JLPT labels and annotation color priority intact.
|
||||
@@ -0,0 +1,4 @@
|
||||
type: fixed
|
||||
area: shortcuts
|
||||
|
||||
- Accept follow-up number-row digits for multi-line subtitle mining even when the original shortcut modifiers are still held.
|
||||
@@ -0,0 +1,4 @@
|
||||
type: fixed
|
||||
area: overlay
|
||||
|
||||
- Suppressed subtitle annotation styling for standalone auxiliary inflection fragments such as `れる` and `れた` while keeping lexical `くれる` forms eligible for lookup metadata.
|
||||
@@ -0,0 +1,4 @@
|
||||
type: fixed
|
||||
area: overlay
|
||||
|
||||
- Suppressed subtitle annotation styling for grammar-only endings such as `じゃないですか` and standalone polite copula tails like `です` / `ですよ`.
|
||||
@@ -0,0 +1,4 @@
|
||||
type: fixed
|
||||
area: stats
|
||||
|
||||
- Kept regular app stats routing isolated from a separately running background stats daemon during playback startup.
|
||||
@@ -0,0 +1,4 @@
|
||||
type: fixed
|
||||
area: overlay
|
||||
|
||||
- Overlay: Kept JLPT subtitle underlines at their JLPT color after dictionary lookups, even when the token also has another annotation color.
|
||||
@@ -0,0 +1,4 @@
|
||||
type: fixed
|
||||
area: tokenizer
|
||||
|
||||
- Tokenizer: Suppress annotations for ハァ-style interjection subtitles.
|
||||
@@ -0,0 +1,4 @@
|
||||
type: fixed
|
||||
area: overlay
|
||||
|
||||
- Overlay: Refresh the current subtitle after successful card mining so newly known words recolor immediately.
|
||||
@@ -0,0 +1,5 @@
|
||||
type: fixed
|
||||
area: tokenizer
|
||||
|
||||
- Tokenizer: Replaced hard-coded standalone grammar-ending permutations with shared pattern matching for polite copula, negative copula, and explanatory subtitle endings.
|
||||
- Tokenizer: Kept grammar annotation exclusion logic in the shared subtitle filter and removed stale duplicate exclusion helpers from the annotation stage.
|
||||
@@ -0,0 +1,4 @@
|
||||
type: fixed
|
||||
area: anki
|
||||
|
||||
- Anki: Manual clipboard subtitle updates now preserve existing word audio while replacing sentence audio and animated-image media.
|
||||
@@ -0,0 +1,4 @@
|
||||
type: fixed
|
||||
area: overlay
|
||||
|
||||
- Kept the macOS visible overlay alive during transient mpv tracker flaps when the last tracked video geometry is still available.
|
||||
@@ -0,0 +1,4 @@
|
||||
type: fixed
|
||||
area: mpv
|
||||
|
||||
- mpv: Playlist navigation now reuses the running SubMiner overlay without repeating the pause-until-ready warmup gate.
|
||||
@@ -0,0 +1,4 @@
|
||||
type: fixed
|
||||
area: overlay
|
||||
|
||||
- Overlay: Kept JLPT subtitle underlines on their JLPT color when lookup selection overlaps known-word or frequency annotation colors.
|
||||
@@ -0,0 +1,4 @@
|
||||
type: fixed
|
||||
area: anilist
|
||||
|
||||
- AniList: Run post-watch progress checks on mpv time-position updates, read the fresh mpv position before threshold checks, wire manual mark-watched to force a progress sync, and fill missing `guessit` episode metadata from the filename parser.
|
||||
@@ -0,0 +1,4 @@
|
||||
type: fixed
|
||||
area: stats
|
||||
|
||||
- Restored stats startup deferral to a running background stats daemon so video launches no longer fail when the stats port is already in use.
|
||||
@@ -0,0 +1,4 @@
|
||||
type: fixed
|
||||
area: overlay
|
||||
|
||||
- Kept subtitle annotation prefetch running after immediate cache-hit renders so upcoming subtitle colors stay ready.
|
||||
@@ -0,0 +1,4 @@
|
||||
type: fixed
|
||||
area: overlay
|
||||
|
||||
- Overlay: Fixed frequency highlighting for ordinal prefix-noun tokens like `第二` so popup ranks such as JPDB 1820 are preserved in subtitle annotations.
|
||||
@@ -0,0 +1,4 @@
|
||||
type: fixed
|
||||
area: tokenizer
|
||||
|
||||
- Suppressed N+1, JLPT, frequency, and name styling for `ある` / `有る` existence verbs while still allowing known-word highlighting.
|
||||
@@ -0,0 +1,4 @@
|
||||
type: fixed
|
||||
area: anilist
|
||||
|
||||
- AniList: Prevented duplicate post-watch writes during concurrent checks, preserved manual watched marks when post-watch sync fails, and kept known-word cache refresh notifications accurate after cache resets.
|
||||
@@ -0,0 +1,4 @@
|
||||
type: fixed
|
||||
area: anilist
|
||||
|
||||
- AniList: Kept config reload from opening the setup window during playback when token storage cannot be resolved, and stopped setup login from reporting success when encrypted token persistence fails.
|
||||
@@ -0,0 +1,4 @@
|
||||
type: fixed
|
||||
area: overlay
|
||||
|
||||
- Overlay: Aligned the Hyprland fullscreen overlay with mpv when mpv reports client-requested fullscreen, force-applied exact Hyprland overlay window bounds after floating, disabled Hyprland floating-window decoration on exact overlay placement, compensated stats overlay placement for Electron/Wayland content insets, and made the stats overlay page/window opaque so mpv cannot show through transparent top insets.
|
||||
@@ -0,0 +1,4 @@
|
||||
type: fixed
|
||||
area: overlay
|
||||
|
||||
- Hid the browser focus outline on the top-level overlay surface so focused overlays no longer show a yellow/orange viewport border.
|
||||
@@ -0,0 +1,4 @@
|
||||
type: fixed
|
||||
area: anilist
|
||||
|
||||
- AniList: Retried Linux safeStorage availability after transient keyring startup failures so stored tokens can load and setup tokens can save once GNOME libsecret becomes available.
|
||||
@@ -0,0 +1,4 @@
|
||||
type: fixed
|
||||
area: tokenizer
|
||||
|
||||
- Prevented standalone grammar and helper tokens such as `に` from being colored as known words when readings from known-word decks match them.
|
||||
@@ -0,0 +1,4 @@
|
||||
type: fixed
|
||||
area: overlay
|
||||
|
||||
- Overlay: Stopped Hyprland from pinning SubMiner overlay windows across workspaces while keeping floating placement for fullscreen alignment.
|
||||
@@ -0,0 +1,4 @@
|
||||
type: fixed
|
||||
area: overlay
|
||||
|
||||
- Overlay: Fixed the default replay/next subtitle keybindings by moving the session-help shortcut to `Ctrl/Cmd+/`, leaving `Ctrl+Shift+H` and `Ctrl+Shift+L` free for subtitle playback controls. The mpv plugin now registers shifted letter chords with mpv's uppercase key form so `Ctrl+Shift+L` reaches the play-next-subtitle action instead of falling through as `Ctrl+L`, and play-next now starts playback from a paused state before pausing again at the subtitle end.
|
||||
@@ -0,0 +1,4 @@
|
||||
type: fixed
|
||||
area: overlay
|
||||
|
||||
- Overlay: Kept the macOS overlay visible and interactive while mpv remains the active tracked window, including transient tracker refreshes.
|
||||
@@ -0,0 +1,4 @@
|
||||
type: fixed
|
||||
area: stats
|
||||
|
||||
- Fixed recent session detail pages showing "Media not found" before lifetime media summaries are available.
|
||||
@@ -0,0 +1,4 @@
|
||||
type: fixed
|
||||
area: overlay
|
||||
|
||||
- Overlay: Kept the macOS overlay behind unrelated foreground windows while preserving its position above mpv.
|
||||
@@ -0,0 +1,4 @@
|
||||
type: fixed
|
||||
area: tokenizer
|
||||
|
||||
- Tokenizer: Preserve Yomitan compound tokens for known-word highlighting so known component words no longer color a larger unknown word green.
|
||||
Reference in New Issue
Block a user