Anki Integration
SubMiner uses the AnkiConnect add-on to create and update Anki cards with sentence context, audio, and screenshots. It is built primarily around the Kiku and Lapis note types, including their sentence-card and field-grouping behavior.
Prerequisites
- Install Anki.
- Install the AnkiConnect add-on (code: 2055492159).
- Keep Anki running while using SubMiner.
AnkiConnect listens on http://127.0.0.1:8765 by default. If you changed the port in AnkiConnect's settings, update ankiConnect.url in your SubMiner config.
Auto-Enrichment Transport
When you add a word via Yomitan, SubMiner detects the new card and fills in the sentence, audio, image, and translation fields automatically. Two detection methods are available:
Proxy mode — SubMiner runs a local AnkiConnect-compatible proxy and intercepts card creation instantly. Recommended when possible.
Polling mode (default) — SubMiner polls AnkiConnect every few seconds for newly added cards. Simpler setup, but with a short delay (~3 seconds).
Use proxy mode if you want immediate enrichment. Use polling mode if your Yomitan instance is external (browser-based) or you prefer minimal configuration.
In both modes, the enrichment workflow is the same:
- Checks if a duplicate expression already exists (for field grouping).
- Updates the sentence field with the current subtitle.
- Generates and uploads audio and image media.
- Fills the translation field from the secondary subtitle or AI.
- Writes metadata to the miscInfo field.
Polling mode uses the query "deck:<ankiConnect.deck>" added:1 to find recently added cards. If no deck is configured, it searches all decks.
Known-word sync scope is controlled by ankiConnect.knownWords.decks (object map), with ankiConnect.deck used as legacy fallback.
Proxy Mode Setup (Yomitan / Texthooker)
"ankiConnect": {
"url": "http://127.0.0.1:8765", // real AnkiConnect
"proxy": {
"enabled": true,
"host": "127.0.0.1",
"port": 8766,
"upstreamUrl": "http://127.0.0.1:8765"
}
}
Then point Yomitan/clients to http://127.0.0.1:8766 instead of 8765.
When SubMiner loads the bundled Yomitan extension, it also attempts to update the default Yomitan profile (profiles[0].options.anki.server) to the active SubMiner endpoint:
- the proxy URL when ankiConnect.proxy.enabled is true
- the direct ankiConnect.url when proxy mode is disabled
To avoid clobbering custom setups, this auto-update only changes the default profile when its current server is blank or the stock Yomitan default (http://127.0.0.1:8765).
For browser-based Yomitan or other external clients (for example Texthooker in a normal browser profile), set their Anki server to the same proxy URL separately: http://127.0.0.1:8766 (or your configured proxy.host + proxy.port).
Browser/Yomitan external setup (separate profile)
If you want SubMiner to use proxy mode without touching your main/default Yomitan profile, create or select a separate Yomitan profile just for SubMiner and set its Anki server to the proxy URL.
That profile isolation gives you both benefits:
- SubMiner can auto-enrich immediately via proxy.
- Your default Yomitan profile keeps its existing Anki server setting.
In Yomitan, go to Settings → Profile and:
- Create a profile for SubMiner (or choose one dedicated profile).
- Open Anki settings for that profile.
- Set server to http://127.0.0.1:8766 (or your configured proxy URL).
- Save and make that profile active when using SubMiner.
This applies only to external (browser-based) Yomitan or other non-bundled clients. The bundled-profile auto-update logic only targets profiles[0] when its server is blank or still the stock default.
Proxy Troubleshooting (quick checks)
If auto-enrichment appears to do nothing:
- Confirm proxy listener is running while SubMiner is active:
ss -ltnp | rg 8766
- Confirm requests can pass through the proxy:
curl -sS http://127.0.0.1:8766 \
-H 'content-type: application/json' \
-d '{"action":"version","version":2}'
- Check both log sinks:
- Launcher/mpv-integrated log: ~/.cache/SubMiner/mp.log
- App runtime log: ~/.config/SubMiner/logs/SubMiner-YYYY-MM-DD.log
- Ensure config JSONC is valid and logging shape is correct:
"logging": {
"level": "debug"
}
"logging": "debug" is invalid for current schema and can break reload/start behavior.
Field Mapping
SubMiner maps its data to your Anki note fields. Configure these under ankiConnect.fields:
"ankiConnect": {
"fields": {
"word": "Expression", // mined word / expression text
"audio": "ExpressionAudio", // audio clip from the video
"image": "Picture", // screenshot or animated clip
"sentence": "Sentence", // subtitle text
"miscInfo": "MiscInfo", // metadata (filename, timestamp)
"translation": "SelectionText" // secondary sub or AI translation
}
}
Field names must match your Anki note type exactly (case-sensitive). If a configured field does not exist on the note type, SubMiner skips it without error.
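A minimal sketch of that skip-without-error behavior (the function and its arguments are hypothetical, for illustration only):

```python
def resolve_fields(configured: dict[str, str], note_type_fields: set[str]) -> dict[str, str]:
    """Keep only the configured mappings whose target field actually
    exists on the Anki note type (case-sensitive); unknown fields are
    silently dropped, matching the documented behavior."""
    return {role: field for role, field in configured.items()
            if field in note_type_fields}

mapping = {"sentence": "Sentence", "audio": "ExpressionAudio", "image": "Pic"}
note_fields = {"Sentence", "ExpressionAudio", "Picture"}
# "Pic" is not on the note type, so the image mapping is skipped.
print(resolve_fields(mapping, note_fields))
```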
Minimal Config
If you only want sentence and audio on your cards:
"ankiConnect": {
"enabled": true,
"fields": {
"sentence": "Sentence",
"audio": "ExpressionAudio"
}
}
Media Generation
SubMiner uses FFmpeg to generate audio and image media from the video. FFmpeg must be installed and on PATH.
Audio
Audio is extracted from the video file using the subtitle's start and end timestamps, with configurable padding added before and after.
"ankiConnect": {
"media": {
"generateAudio": true,
"audioPadding": 0.5, // seconds before and after subtitle timing
"maxMediaDuration": 30 // cap total duration in seconds
}
}
Output format: MP3 at 44100 Hz. If the video has multiple audio streams, SubMiner uses the active stream.
The audio is uploaded to Anki's media folder and inserted as [sound:audio_<timestamp>.mp3].
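The padding and duration-cap interaction can be sketched as follows (a hypothetical helper, assuming the cap applies to the padded window and the start is clamped at 0):

```python
def audio_clip_window(sub_start: float, sub_end: float,
                      padding: float = 0.5,
                      max_duration: float = 30.0) -> tuple[float, float]:
    """Compute the (start, duration) FFmpeg extraction window:
    pad both sides of the subtitle timing, clamp the start at 0,
    and cap the total length at maxMediaDuration."""
    start = max(0.0, sub_start - padding)
    duration = min((sub_end + padding) - start, max_duration)
    return start, duration

print(audio_clip_window(12.0, 15.0))  # (11.5, 4.0)
print(audio_clip_window(0.2, 45.0))   # start clamped to 0.0, duration capped at 30.0
```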
Screenshots (Static)
A single frame is captured at the current playback position.
"ankiConnect": {
"media": {
"generateImage": true,
"imageType": "static",
"imageFormat": "jpg", // "jpg", "png", or "webp"
"imageQuality": 92, // 1–100
"imageMaxWidth": null, // optional, preserves aspect ratio
"imageMaxHeight": null
}
}
Animated Clips (AVIF)
Instead of a static screenshot, SubMiner can generate an animated AVIF covering the subtitle duration.
"ankiConnect": {
"media": {
"generateImage": true,
"imageType": "avif",
"animatedFps": 10,
"animatedMaxWidth": 640,
"animatedMaxHeight": null,
"animatedCrf": 35 // 0–63, lower = better quality
}
}
Animated AVIF requires an AV1 encoder (libaom-av1, libsvtav1, or librav1e) in your FFmpeg build. Generation timeout is 60 seconds.
Behavior Options
"ankiConnect": {
"behavior": {
"overwriteAudio": true, // replace existing audio, or append
"overwriteImage": true, // replace existing image, or append
"mediaInsertMode": "append", // "append" or "prepend" to field content
"autoUpdateNewCards": true, // auto-update when new card detected
"notificationType": "osd" // "osd", "system", "both", or "none"
}
}
overwriteAudio applies to automatic card updates and duplicate-card enrichment. Manual clipboard subtitle updates (Ctrl/Cmd+C, then Ctrl/Cmd+V) always replace generated sentence audio, while leaving the word audio field unchanged.
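How the overwrite flags combine with mediaInsertMode can be sketched like this (hypothetical helper; the real field formatting may differ):

```python
def apply_media(field_value: str, tag: str,
                overwrite: bool, mode: str = "append") -> str:
    """Combine a new media tag (e.g. [sound:...]) with an existing
    field value: overwrite replaces the field outright; otherwise the
    tag is appended or prepended per mediaInsertMode."""
    if overwrite or not field_value:
        return tag
    return field_value + tag if mode == "append" else tag + field_value

print(apply_media("[sound:old.mp3]", "[sound:new.mp3]", overwrite=True))
# [sound:new.mp3]
print(apply_media("[sound:old.mp3]", "[sound:new.mp3]", overwrite=False, mode="append"))
# [sound:old.mp3][sound:new.mp3]
```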
AI Translation
SubMiner can auto-translate the mined sentence and fill the translation field.
Secondary subtitle text still wins when present. AI translation is only attempted when ankiConnect.ai.enabled is true and no secondary subtitle exists.
"ai": {
"enabled": true,
"apiKey": "sk-...",
"apiKeyCommand": "",
"baseUrl": "https://openrouter.ai/api",
"requestTimeoutMs": 15000
},
"ankiConnect": {
"ai": {
"enabled": true,
"model": "openai/gpt-4o-mini",
"systemPrompt": "Translate mined sentence text only."
}
}
ankiConnect.ai controls feature-local enablement plus optional model / systemPrompt overrides.
Provider credentials and request transport settings live in top-level ai.
Translation priority:
- If a secondary subtitle is available, use it as the translation.
- If ankiConnect.ai.enabled is true and top-level ai.enabled is true, call the shared AI provider.
- If AI translation fails and no secondary subtitle exists, fall back to the original sentence text.
The built-in translation request asks for English output by default. Customize that behavior through ankiConnect.ai.systemPrompt.
Sentence Cards (Lapis)
SubMiner can create standalone sentence cards (without a word/expression) using a separate note type. This is designed for use with Lapis and similar sentence-focused note types.
::: warning Required config
Sentence card creation and audio card marking both require ankiConnect.isLapis.enabled: true and a valid sentenceCardModel pointing to your Lapis/Kiku note type. Without this, the Ctrl/Cmd+S and Ctrl/Cmd+Shift+A shortcuts will not create cards.
:::
"ankiConnect": {
"isLapis": {
"enabled": true,
"sentenceCardModel": "Japanese sentences"
}
}
Trigger with the mine sentence shortcut (Ctrl/Cmd+S by default). The card is created directly via AnkiConnect with the sentence, audio, and image filled in.
To mine multiple subtitle lines as one sentence card, use Ctrl/Cmd+Shift+S followed by a digit (1–9) to select how many recent lines to combine.
Field Grouping (Kiku)
When you mine the same word multiple times, SubMiner can merge the cards instead of creating duplicates. This is designed for note types like Kiku that support grouped sentence/audio/image fields.
"ankiConnect": {
"isKiku": {
"enabled": true,
"fieldGrouping": "manual", // "auto", "manual", or "disabled"
"deleteDuplicateInAuto": true // delete new card after auto-merge
}
}
Modes
Disabled ("disabled"): No duplicate detection. Each card is independent.
Auto ("auto"): When a duplicate expression is found, SubMiner merges the new card into the existing one automatically. Both sentences, audio clips, and images are preserved, and exact duplicate values are collapsed to one entry. If deleteDuplicateInAuto is true, the new card is deleted after merging.
Manual ("manual"): A modal appears in the overlay showing both cards. You choose which card to keep, preview the merge result, then confirm. The modal has a 90-second timeout, after which it cancels automatically.
What Gets Merged
| Field | Merge behavior |
|---|---|
| Sentence | Both sentences preserved (exact duplicate text is deduplicated) |
| Audio | Both [sound:...] entries kept (exact duplicates deduplicated) |
| Image | Both images kept (exact duplicates deduplicated) |
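The keep-both-but-deduplicate rule above can be sketched as follows (hypothetical helper; the separator used between merged entries is an assumption):

```python
def merge_field(a: str, b: str, sep: str = "<br>") -> str:
    """Merge two card field values: keep both entries in order, but
    collapse exact duplicates to a single entry, per the merge table."""
    parts: list[str] = []
    for value in (a, b):
        if value and value not in parts:
            parts.append(value)
    return sep.join(parts)

print(merge_field("[sound:a.mp3]", "[sound:a.mp3]"))   # [sound:a.mp3]
print(merge_field("sentence one", "sentence two"))     # sentence one<br>sentence two
```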
Keyboard Shortcuts in the Modal
| Key | Action |
|---|---|
| 1 / 2 | Select card 1 or card 2 to keep |
| Enter | Confirm selection |
| Esc | Cancel (keep both cards unchanged) |
Full Config Example
{
"ankiConnect": {
"enabled": true,
"url": "http://127.0.0.1:8765",
"pollingRate": 3000,
"proxy": {
"enabled": false,
"host": "127.0.0.1",
"port": 8766,
"upstreamUrl": "http://127.0.0.1:8765",
},
"fields": {
"audio": "ExpressionAudio",
"image": "Picture",
"sentence": "Sentence",
"miscInfo": "MiscInfo",
"translation": "SelectionText",
},
"media": {
"generateAudio": true,
"generateImage": true,
"imageType": "static",
"imageFormat": "jpg",
"imageQuality": 92,
"audioPadding": 0.5,
"maxMediaDuration": 30,
},
"behavior": {
"overwriteAudio": true,
"overwriteImage": true,
"mediaInsertMode": "append",
"autoUpdateNewCards": true,
"notificationType": "osd",
},
"ai": {
"enabled": false,
"model": "openai/gpt-4o-mini",
"systemPrompt": "Translate mined sentence text only.",
},
"isKiku": {
"enabled": false,
"fieldGrouping": "disabled",
"deleteDuplicateInAuto": true,
},
"isLapis": {
"enabled": false,
"sentenceCardModel": "Japanese sentences",
},
},
"ai": {
"enabled": false,
"apiKey": "",
"apiKeyCommand": "",
"baseUrl": "https://openrouter.ai/api",
"requestTimeoutMs": 15000,
},
}