mirror of
https://github.com/ksyasuda/SubMiner.git
synced 2026-03-24 12:11:29 -07:00
feat: add app-owned YouTube subtitle flow with absPlayer-style parsing (#31)
* fix: harden preload argv parsing for popup windows * fix: align youtube playback with shared overlay startup * fix: unwrap mpv youtube streams for anki media mining * docs: update docs for youtube subtitle and mining flow * refactor: unify cli and runtime wiring for startup and youtube flow * feat: update subtitle sidebar overlay behavior * chore: add shared log-file source for diagnostics * fix(ci): add changelog fragment for immersion changes * fix: address CodeRabbit review feedback * fix: persist canonical title from youtube metadata * style: format stats library tab * fix: address latest review feedback * style: format stats library files * test: stub launcher youtube deps in CI * test: isolate launcher youtube flow deps * test: stub launcher youtube deps in failing case * test: force x11 backend in launcher ci harness * test: address latest review feedback * fix(launcher): preserve user YouTube ytdl raw options * docs(backlog): update task tracking notes * fix(immersion): special-case youtube media paths in runtime and tracking * feat(stats): improve YouTube media metadata and picker key handling * fix(ci): format stats media library hook * fix: address latest CodeRabbit review items * docs: update youtube release notes and docs * feat: auto-load youtube subtitles before manual picker * fix: restore app-owned youtube subtitle flow * docs: update youtube playback docs and config copy * refactor: remove legacy youtube launcher mode plumbing * fix: refine youtube subtitle startup binding * docs: clarify youtube subtitle startup behavior * fix: address PR #31 latest review follow-ups * fix: address PR #31 follow-up review comments * test: harden youtube picker test harness * udpate backlog * fix: add timeout to youtube metadata probe * docs: refresh youtube and stats docs * update backlog * update backlog * chore: release v0.9.0
This commit is contained in:
@@ -17,6 +17,11 @@ For most users, start with this minimal configuration:
|
||||
"ankiConnect": {
|
||||
"enabled": true,
|
||||
"deck": "YourDeckName",
|
||||
"knownWords": {
|
||||
"decks": {
|
||||
"YourDeckName": ["Word", "Word Reading", "Expression"]
|
||||
}
|
||||
},
|
||||
"fields": {
|
||||
"sentence": "Sentence",
|
||||
"audio": "Audio",
|
||||
@@ -26,6 +31,8 @@ For most users, start with this minimal configuration:
|
||||
}
|
||||
```
|
||||
|
||||
`ankiConnect.deck` is still accepted for backward-compatible polling scope and legacy known-word fallback behavior. For known-word cache scope, prefer `ankiConnect.knownWords.decks` with deck-to-fields mapping.
|
||||
|
||||
Then customize as needed using the sections below.
|
||||
|
||||
## Configuration File
|
||||
@@ -120,7 +127,7 @@ The configuration file includes several main sections:
|
||||
- [**Discord Rich Presence**](#discord-rich-presence) - Optional Discord activity card updates
|
||||
- [**Immersion Tracking**](#immersion-tracking) - Track subtitle sessions and mining activity in SQLite
|
||||
- [**Stats Dashboard**](#stats-dashboard) - Local dashboard and overlay for immersion progress
|
||||
- [**YouTube Subtitle Generation**](#youtube-subtitle-generation) - Launcher defaults for yt-dlp + local whisper fallback
|
||||
- [**YouTube Playback Settings**](#youtube-playback-settings) - Defaults for YouTube subtitle loading
|
||||
|
||||
## Core Settings
|
||||
|
||||
@@ -348,7 +355,8 @@ Configure the parsed-subtitle sidebar modal.
|
||||
```json
|
||||
{
|
||||
"subtitleSidebar": {
|
||||
"enabled": true,
|
||||
"enabled": false,
|
||||
"autoOpen": false,
|
||||
"layout": "overlay",
|
||||
"toggleKey": "Backslash",
|
||||
"pauseVideoOnHover": false,
|
||||
@@ -362,12 +370,13 @@ Configure the parsed-subtitle sidebar modal.
|
||||
| Option | Values | Description |
|
||||
| --------------------------- | ---------------- | -------------------------------------------------------------------------------- |
|
||||
| `enabled` | boolean | Enable subtitle sidebar support (`false` by default) |
|
||||
| `autoOpen` | boolean | Open sidebar automatically on overlay startup (`false` by default) |
|
||||
| `layout` | string | `"overlay"` floats over mpv; `"embedded"` reserves right-side player space to mimic browser-like layout |
|
||||
| `toggleKey` | string | `KeyboardEvent.code` used to open/close the sidebar (default: `"Backslash"`) |
|
||||
| `pauseVideoOnHover` | boolean | Pause playback while hovering the sidebar cue list |
|
||||
| `autoScroll` | boolean | Keep the active cue in view while playback advances |
|
||||
| `maxWidth` | number | Maximum sidebar width in CSS pixels (default: `420`) |
|
||||
| `opacity` | number | Sidebar opacity between `0` and `1` (default: `0.78`) |
|
||||
| `opacity` | number | Sidebar opacity between `0` and `1` (default: `0.95`) |
|
||||
| `backgroundColor` | string | Sidebar shell background color |
|
||||
| `textColor` | hex color | Default cue text color |
|
||||
| `fontFamily` | string | CSS `font-family` value applied to sidebar cue text |
|
||||
@@ -460,6 +469,7 @@ See `config.example.jsonc` for detailed configuration options and more examples.
|
||||
| `Space` | `["cycle", "pause"]` | Toggle pause |
|
||||
| `KeyJ` | `["cycle", "sid"]` | Cycle primary subtitle track |
|
||||
| `Shift+KeyJ` | `["cycle", "secondary-sid"]` | Cycle secondary subtitle track |
|
||||
| `Ctrl+Alt+KeyC` | `["__youtube-picker-open"]` | Open the manual YouTube subtitle picker |
|
||||
| `ArrowRight` | `["seek", 5]` | Seek forward 5 seconds |
|
||||
| `ArrowLeft` | `["seek", -5]` | Seek backward 5 seconds |
|
||||
| `ArrowUp` | `["seek", 60]` | Seek forward 60 seconds |
|
||||
@@ -731,7 +741,7 @@ Palette controls:
|
||||
### Shared AI Provider
|
||||
|
||||
Shared OpenAI-compatible transport settings live at the top level under `ai`.
|
||||
Anki and YouTube subtitle cleanup both read this provider, then apply feature-local overrides where supported.
|
||||
Anki reads this provider directly. Legacy subtitle fallback keeps the same provider shape for compatibility, then applies feature-local overrides where supported.
|
||||
|
||||
```json
|
||||
{
|
||||
@@ -751,12 +761,14 @@ Anki and YouTube subtitle cleanup both read this provider, then apply feature-lo
|
||||
| `apiKey` | string | Static API key for the shared provider |
|
||||
| `apiKeyCommand` | string | Shell command used to resolve the API key |
|
||||
| `baseUrl` | string (URL) | OpenAI-compatible base URL |
|
||||
| `model` | string | Optional model override for shared provider workflows |
|
||||
| `systemPrompt` | string | Optional system prompt override for shared provider workflows |
|
||||
| `requestTimeoutMs` | integer milliseconds | Shared request timeout (default: `15000`) |
|
||||
|
||||
SubMiner uses the shared provider in two places:
|
||||
SubMiner uses the shared provider for:
|
||||
|
||||
- Anki translation/enrichment when `ankiConnect.ai.enabled` is `true`
|
||||
- YouTube whisper subtitle post-processing when `youtubeSubgen.fixWithAi` is `true`
|
||||
- Legacy subtitle fallback compatibility when `youtubeSubgen.fixWithAi` is `true`
|
||||
|
||||
### AnkiConnect
|
||||
|
||||
@@ -840,8 +852,8 @@ This example is intentionally compact. The option table below documents availabl
|
||||
| `proxy.port` | number | Bind port for local AnkiConnect proxy (default: `8766`) |
|
||||
| `proxy.upstreamUrl` | string (URL) | Upstream AnkiConnect URL that proxy forwards to (default: `http://127.0.0.1:8765`) |
|
||||
| `tags` | array of strings | Tags automatically added to cards mined/updated by SubMiner (default: `['SubMiner']`; set `[]` to disable automatic tagging). |
|
||||
| `deck` | string | Anki deck to monitor for new cards |
|
||||
| `ankiConnect.knownWords.decks` | array of strings | Decks used for known-word cache lookups. When omitted/empty, falls back to `ankiConnect.deck`. |
|
||||
| `ankiConnect.deck` | string | Legacy Anki polling/compatibility scope. Newer known-word cache scoping should use `ankiConnect.knownWords.decks`. |
|
||||
| `ankiConnect.knownWords.decks` | object | Deck→fields mapping for known-word cache queries (for example `{ "Kaishi 1.5k": ["Word", "Word Reading"] }`). |
|
||||
| `fields.word` | string | Card field for mined word / expression text (default: `Expression`) |
|
||||
| `fields.audio` | string | Card field for audio files (default: `ExpressionAudio`) |
|
||||
| `fields.image` | string | Card field for images (default: `Picture`) |
|
||||
@@ -862,6 +874,7 @@ This example is intentionally compact. The option table below documents availabl
|
||||
| `media.animatedMaxWidth` | number (px) | Max width for animated AVIF (default: `640`) |
|
||||
| `media.animatedMaxHeight` | number (px) | Optional max height for animated AVIF. Unset keeps source aspect-constrained height. |
|
||||
| `media.animatedCrf` | number (0-63) | CRF quality for AVIF; lower = higher quality (default: `35`) |
|
||||
| `media.syncAnimatedImageToWordAudio` | `true`, `false` | Whether animated AVIF includes an opening frame synced to sentence word-audio timing (default: `true`). |
|
||||
| `media.audioPadding` | number (seconds) | Padding around audio clip timing (default: `0.5`) |
|
||||
| `media.fallbackDuration` | number (seconds) | Default duration if timing unavailable (default: `3.0`) |
|
||||
| `media.maxMediaDuration` | number (seconds) | Max duration for generated media from multi-line copy (default: `30`, `0` to disable) |
|
||||
@@ -870,10 +883,11 @@ This example is intentionally compact. The option table below documents availabl
|
||||
| `behavior.mediaInsertMode` | `"append"`, `"prepend"` | Where to insert new media when overwrite is off (default: `"append"`) |
|
||||
| `behavior.highlightWord` | `true`, `false` | Highlight the word in sentence context (default: `true`) |
|
||||
| `ankiConnect.knownWords.highlightEnabled` | `true`, `false` | Enable fast local highlighting for words already known in Anki (default: `false`) |
|
||||
| `ankiConnect.knownWords.addMinedWordsImmediately` | `true`, `false` | Add words from successful mines into the local known-word cache immediately (default: `true`) |
|
||||
| `ankiConnect.knownWords.color` | hex color string | Text color for tokens already found in the local known-word cache (default: `"#a6da95"`). |
|
||||
| `ankiConnect.knownWords.matchMode` | `"headword"`, `"surface"` | Matching strategy for known-word highlighting (default: `"headword"`). `headword` uses token headwords; `surface` uses visible subtitle text. |
|
||||
| `ankiConnect.knownWords.refreshMinutes` | number | Minutes between known-word cache refreshes (default: `1440`) |
|
||||
| `ankiConnect.knownWords.decks` | array of strings | Decks used by known-word cache refresh. Leave empty for compatibility with legacy `deck` scope. |
|
||||
| `ankiConnect.knownWords.decks` | object | Deck→fields mapping used for known-word cache query scope (e.g. `{ "Kaishi 1.5k": ["Word", "Word Reading"] }`). |
|
||||
| `ankiConnect.nPlusOne.nPlusOne` | hex color string | Text color for the single target token to study when exactly one unknown candidate exists in a sentence (default: `"#c6a0f6"`). |
|
||||
| `ankiConnect.nPlusOne.minSentenceWords` | number | Minimum number of words required in a sentence before single unknown-word N+1 highlighting can trigger (default: `3`). |
|
||||
| `behavior.notificationType` | `"osd"`, `"system"`, `"both"`, `"none"` | Notification type on card update (default: `"osd"`) |
|
||||
@@ -919,7 +933,7 @@ Known-word cache policy:
|
||||
- `ankiConnect.nPlusOne.nPlusOne` sets the color for the single target token when exactly one eligible unknown word exists.
|
||||
- `ankiConnect.nPlusOne.minSentenceWords` sets the minimum token count required in a sentence for N+1 highlighting (default: `3`).
|
||||
- `ankiConnect.knownWords.color` sets the known-word highlight color for tokens already in Anki.
|
||||
- `ankiConnect.knownWords.decks` accepts one or more decks. If empty, it uses the legacy single `ankiConnect.deck` value as scope.
|
||||
- `ankiConnect.knownWords.decks` accepts an object keyed by deck name. If omitted or empty, it falls back to the legacy `ankiConnect.deck` single-deck scope.
|
||||
- Cache state is persisted to `known-words-cache.json` under the app `userData` directory.
|
||||
- The cache is automatically invalidated when the configured scope changes (for example, when deck changes).
|
||||
- Cache lookups are in-memory. By default, token headwords are matched against cached `Expression` / `Word` values; set `ankiConnect.knownWords.matchMode` to `"surface"` for raw subtitle text matching.
|
||||
@@ -1263,6 +1277,14 @@ Enable or disable local immersion analytics stored in SQLite for mined subtitles
|
||||
| `retention.monthlyRollupsDays` | integer (`0`-`36500`) | Monthly rollup retention window. Default `0` (keep all). |
|
||||
| `retention.vacuumIntervalDays` | integer (`0`-`3650`) | Minimum spacing between `VACUUM` passes. `0` disables vacuum. Default `0` (disabled). |
|
||||
|
||||
You can also disable immersion tracking for a single session using:
|
||||
|
||||
```bash
|
||||
SUBMINER_DISABLE_IMMERSION_TRACKING=1 subminer
|
||||
```
|
||||
|
||||
When this is set, SubMiner skips immersion-tracker startup and does not initialize or read the immersion SQLite database for that session.
|
||||
|
||||
Default behavior keeps raw events, telemetry, sessions, and rollups forever while still maintaining lifetime summary tables and daily/monthly rollups for faster reads. If you later want bounded retention, switch `retentionMode` or set explicit `retention.*` values.
|
||||
|
||||
When `dbPath` is blank or omitted, SubMiner writes telemetry and session summaries to the default app-data location:
|
||||
@@ -1283,7 +1305,7 @@ Configure the local stats UI served from SubMiner and the in-app stats overlay t
|
||||
{
|
||||
"stats": {
|
||||
"toggleKey": "Backquote",
|
||||
"serverPort": 5175,
|
||||
"serverPort": 6969,
|
||||
"autoStartServer": true,
|
||||
"autoOpenBrowser": true
|
||||
}
|
||||
@@ -1293,7 +1315,7 @@ Configure the local stats UI served from SubMiner and the in-app stats overlay t
|
||||
| Option | Values | Description |
|
||||
| ----------------- | ----------------- | --------------------------------------------------------------------------- |
|
||||
| `toggleKey` | Electron key code | Overlay-local key code used to toggle the stats overlay. Default `Backquote`. |
|
||||
| `serverPort` | integer | Localhost port for the browser stats UI. Default `5175`. |
|
||||
| `serverPort` | integer | Localhost port for the browser stats UI. Default `6969`. |
|
||||
| `autoStartServer` | `true`, `false` | Start the local stats HTTP server automatically once immersion tracking is active. Default `true`. |
|
||||
| `autoOpenBrowser` | `true`, `false` | When `subminer stats` starts the server on demand, also open the dashboard in your default browser. Default `true`. |
|
||||
|
||||
@@ -1304,22 +1326,13 @@ Usage notes:
|
||||
- The dashboard reads from the same immersion-tracking database, so keep `immersionTracking.enabled` on if you want data to appear.
|
||||
- The UI includes Overview, Library, Trends, Vocabulary, and Sessions tabs.
|
||||
|
||||
### YouTube Subtitle Generation
|
||||
### YouTube Playback Settings
|
||||
|
||||
Set defaults used by the `subminer` launcher for YouTube subtitle generation:
|
||||
Set defaults used by the `subminer` launcher for YouTube subtitle loading:
|
||||
|
||||
```json
|
||||
{
|
||||
"youtubeSubgen": {
|
||||
"whisperBin": "/path/to/whisper-cli",
|
||||
"whisperModel": "/path/to/ggml-model.bin",
|
||||
"whisperVadModel": "/path/to/ggml-vad.bin",
|
||||
"whisperThreads": 4,
|
||||
"fixWithAi": false,
|
||||
"ai": {
|
||||
"model": "openai/gpt-4o-mini",
|
||||
"systemPrompt": "Fix subtitle mistakes only."
|
||||
},
|
||||
"primarySubLanguages": ["ja", "jpn"]
|
||||
}
|
||||
}
|
||||
@@ -1327,27 +1340,22 @@ Set defaults used by the `subminer` launcher for YouTube subtitle generation:
|
||||
|
||||
| Option | Values | Description |
|
||||
| --------------------- | -------------------- | ---------------------------------------------------------------------------------------------- |
|
||||
| `whisperBin` | string path | Path to `whisper.cpp` CLI binary used as fallback transcription engine |
|
||||
| `whisperModel` | string path | Path to whisper model used by fallback transcription |
|
||||
| `whisperVadModel` | string path | Optional whisper VAD model path |
|
||||
| `whisperThreads` | integer | Thread count passed to whisper runs |
|
||||
| `fixWithAi` | `true`, `false` | Run shared AI post-processing on whisper-generated subtitles |
|
||||
| `ai.model` | string | Optional model override for YouTube AI subtitle cleanup |
|
||||
| `ai.systemPrompt` | string | Optional system prompt override for YouTube AI subtitle cleanup |
|
||||
| `primarySubLanguages` | string[] | Primary subtitle language priority for YouTube subtitle generation (default `["ja", "jpn"]`) |
|
||||
| `primarySubLanguages` | string[] | Primary subtitle language priority for YouTube auto-loading (default `["ja", "jpn"]`) |
|
||||
|
||||
Launcher behavior:
|
||||
Current launcher behavior:
|
||||
|
||||
- For YouTube URLs, subtitle generation now runs before mpv launch.
|
||||
- SubMiner probes manual/native YouTube subtitle tracks first.
|
||||
- Missing tracks fall back to local `whisper.cpp`.
|
||||
- English secondary subtitles can use whisper translate fallback when no manual track exists.
|
||||
- If `fixWithAi` is enabled, only whisper-generated `.srt` output is post-processed with the shared top-level `ai` provider.
|
||||
- For YouTube URLs, SubMiner probes subtitle tracks with yt-dlp after mpv bootstrap and binds auto-selected tracks before normal playback resumes.
|
||||
- If YouTube/mpv already exposes an authoritative matching subtitle track, SubMiner reuses it; otherwise it downloads and injects only the missing side.
|
||||
- SubMiner loads the primary subtitle plus a best-effort secondary subtitle.
|
||||
- Playback waits only for primary subtitle readiness; secondary failures do not block playback.
|
||||
- English secondary subtitles are selected from `secondarySub.secondarySubLanguages` when primary language matches are unavailable.
|
||||
- Native mpv secondary subtitle rendering stays hidden during this flow so the SubMiner overlay remains the visible secondary subtitle surface.
|
||||
- If primary subtitle loading fails, use `Ctrl+Alt+C` to open the subtitle modal and pick a track.
|
||||
|
||||
Language targets are derived from subtitle config:
|
||||
|
||||
- primary track: `youtubeSubgen.primarySubLanguages` (falls back to `["ja","jpn"]`)
|
||||
- secondary track: `secondarySub.secondarySubLanguages` (falls back to English when empty)
|
||||
- Subtitle files are generated or downloaded before mpv starts; the older launcher mode switch has been removed.
|
||||
- Tracks are resolved and loaded before mpv starts; the older launcher mode switch has been removed.
|
||||
|
||||
Precedence for launcher defaults is: CLI flag > environment variable > `config.jsonc` > built-in default.
|
||||
|
||||
Reference in New Issue
Block a user