Add inline character portraits and dictionary search workflow (#83)

This commit is contained in:
2026-05-25 03:16:25 -07:00
committed by GitHub
parent 7e6f9672cf
commit 807c0ff3db
54 changed files with 2306 additions and 178 deletions
+19 -9
View File
@@ -91,21 +91,26 @@ Name matching runs inside Yomitan's scanning pipeline during subtitle tokenizati
2. Entries from "SubMiner Character Dictionary" are checked with exact primary-source matching — the token must match the entry's `originalText` with `isPrimary: true` and `matchType: 'exact'`.
3. Matched tokens are flagged `isNameMatch: true` and forwarded to the renderer.
4. If `subtitleStyle.nameMatchEnabled` is enabled, the renderer applies the name-match highlight color (default: `#f5bde6`).
5. If `subtitleStyle.nameMatchImagesEnabled` is enabled, the renderer also injects a small circular AniList portrait from the cached snapshot image data.
Older snapshot schema versions are regenerated automatically. Current-version snapshots are normally reused, but when `subtitleStyle.nameMatchImagesEnabled` is enabled SubMiner also checks whether the cached snapshot contains usable character portrait data. If it does not, the snapshot is refreshed so the merged dictionary can include images.
Name matches are visually distinct from [N+1 targeting, frequency highlighting, and JLPT tags](/subtitle-annotations) so you can tell at a glance whether a highlighted word is a character name or a vocabulary target.
**Key settings:**
| Option | Default | Description |
| -------------------------------- | --------- | ---------------------------------- |
| `subtitleStyle.nameMatchEnabled` | `false` | Toggle character-name highlighting |
| `subtitleStyle.nameMatchColor` | `#f5bde6` | Highlight color for matched names |
| Option | Default | Description |
| -------------------------------------- | --------- | ----------------------------------------- |
| `subtitleStyle.nameMatchEnabled` | `false` | Toggle character-name highlighting |
| `subtitleStyle.nameMatchImagesEnabled` | `false` | Show small AniList portraits beside names |
| `subtitleStyle.nameMatchColor` | `#f5bde6` | Highlight color for matched names |
## Dictionary Entries
Each character entry in the Yomitan dictionary includes structured content:
- **Name** — native (Japanese) and romanized forms
- **Name** — the matched Japanese name form
- **Known names** — generated non-honorific Japanese aliases for that character, excluding raw romanized/English aliases from lookup results
- **Role badge** — color-coded by role: main (score 100), supporting (90), side (80), background (70)
- **Portrait** — character image from AniList, embedded in the ZIP
- **Description** — biography text from AniList (collapsible)
@@ -169,10 +174,13 @@ This creates a standalone dictionary ZIP for the target media and saves it along
## Correcting AniList Matches
SubMiner uses `guessit` to infer the anime title from the active filename, then searches AniList. Some filenames can still resolve to the wrong title. For example, `Re - ZERO, Starting Life in Another World (2016)` can be misread as a different `Re...` series.
SubMiner uses `guessit` to infer the anime title from the active filename before searching AniList. Some filenames can still resolve to the wrong title. For example, `Re - ZERO, Starting Life in Another World (2016)` can be misread as a different `Re...` series.
Use the in-app selector or CLI to pin the correct AniList media for the whole series:
- In-app: open the selector with `Ctrl/Cmd+Alt+A` or `--open-character-dictionary`, edit the prefilled title if needed, then search and choose the correct result.
- CLI: `--dictionary-candidates` still lists matches for the current filename guess.
```bash
# List candidate AniList matches for a file
subminer dictionary --candidates "/path/to/episode.mkv"
@@ -188,7 +196,7 @@ SubMiner.AppImage --dictionary-select --dictionary-anilist-id 21355 --dictionary
subminer app --open-character-dictionary
```
Manual selections are stored in `character-dictionaries/anilist-overrides.json` using a series key derived from the filename guess. Later episodes with the same series key use the selected AniList ID automatically. When the override replaces a previous wrong match, SubMiner removes that stale media ID from the merged dictionary's active set and rebuilds/imports the merged character dictionary.
Manual selections are stored in `character-dictionaries/anilist-overrides.json` using a series key derived from the episode's parent directory plus the filename guess. Later episodes in the same directory use the selected AniList ID automatically, while separate season directories can keep separate overrides and character dictionaries. When the override replaces a previous wrong match, SubMiner removes that stale media ID from the merged dictionary's active set and rebuilds/imports the merged character dictionary.
## File Structure
@@ -207,7 +215,7 @@ character-dictionaries/
m170942-va67890.jpg # Voice actor portrait
```
**Snapshot format** (v15): each snapshot contains the media ID, title, entry count, timestamp, an array of Yomitan term entries, and base64-encoded images.
**Snapshot format** (v16): each snapshot contains the media ID, title, entry count, timestamp, an array of Yomitan term entries, and base64-encoded images.
**ZIP structure** follows the Yomitan dictionary format:
@@ -231,6 +239,7 @@ merged.zip
| `anilist.characterDictionary.collapsibleSections.characterInformation` | `false` | Start Character Information section expanded |
| `anilist.characterDictionary.collapsibleSections.voicedBy` | `false` | Start Voiced By section expanded |
| `subtitleStyle.nameMatchEnabled` | `false` | Toggle character-name highlighting in subtitles |
| `subtitleStyle.nameMatchImagesEnabled` | `false` | Show small AniList portraits beside matched names |
| `subtitleStyle.nameMatchColor` | `#f5bde6` | Highlight color for character-name matches |
## Reference Implementation
@@ -253,8 +262,9 @@ If you work with visual novels or want a standalone dictionary generator indepen
## Troubleshooting
- **Names not highlighting:** Confirm `anilist.characterDictionary.enabled` is `true` and `subtitleStyle.nameMatchEnabled` is `true`. Check that the current media has an AniList entry — SubMiner needs a media ID to fetch characters.
- **Inline portraits missing:** Confirm `subtitleStyle.nameMatchImagesEnabled` is `true`. On the next character dictionary sync, SubMiner refreshes current-version snapshots that do not contain usable cached character portrait data. Portraits still require AniList to return an image and the image download to succeed.
- **Sync seems stuck:** The auto-sync debounces for 800ms after media changes and throttles image downloads at 250ms per image. Large casts (50+ characters) take longer. Check the status bar for the current sync phase.
- **Wrong characters showing:** Open the in-app character dictionary selector (`--open-character-dictionary`) or run `--dictionary-candidates`, then save the correct media with `--dictionary-select --dictionary-anilist-id <id>`. This replaces stale wrong-title entries for that series. If names are only from an older unrelated show, they'll rotate out once you watch enough new titles to push it past `maxLoaded`.
- **Wrong characters showing:** Open the in-app character dictionary selector (`--open-character-dictionary`), edit the search title, and select the right AniList entry. You can also run `--dictionary-candidates`, then save the correct media with `--dictionary-select --dictionary-anilist-id <id>`. This replaces stale wrong-title entries for that series. If names are only from an older unrelated show, they'll rotate out once you watch enough new titles to push it past `maxLoaded`.
- **Yomitan import fails:** SubMiner waits up to 7 seconds for Yomitan to be ready for mutations. If Yomitan is still loading dictionaries or performing another import, the operation may time out. Restarting the overlay typically resolves this.
- **Portraits missing:** Images are downloaded from AniList CDN during snapshot generation. If the network was unavailable during the initial sync, delete the snapshot file from `character-dictionaries/snapshots/` and let it regenerate.
+43 -41
View File
@@ -371,34 +371,35 @@ See `config.example.jsonc` for detailed configuration options.
}
```
| Option | Values | Description |
| ---------------------------------- | ----------- | -------------------------------------------------------------------------------------------------------------------------- |
| `fontFamily` | string | CSS font-family value (default: `"Hiragino Sans, M PLUS 1, Source Han Sans JP, Noto Sans CJK JP"`) |
| `fontSize` | number (px) | Font size in pixels (default: `35`) |
| `fontColor` | string | Any CSS color value (default: `"#cad3f5"`) |
| `css` | object | CSS declarations applied to subtitles after normal style defaults; the settings window writes textbox edits here |
| `fontWeight` | string | CSS font-weight, e.g. `"bold"`, `"normal"`, `"600"` (default: `"600"`) |
| `fontStyle` | string | `"normal"` or `"italic"` (default: `"normal"`) |
| `backgroundColor` | string | Any CSS color, including `"transparent"` (default: `"transparent"`) |
| `enableJlpt` | boolean | Enable JLPT level underline styling (`false` by default) |
| `preserveLineBreaks` | boolean | Preserve line breaks in visible overlay subtitle rendering (`false` by default). Enable to mirror mpv line layout. |
| `autoPauseVideoOnHover` | boolean | Pause playback while mouse hovers subtitle text, then resume on leave (`true` by default). |
| `autoPauseVideoOnYomitanPopup` | boolean | Pause playback while the Yomitan popup is open, then resume when the popup closes (`true` by default). |
| `hoverTokenColor` | string | Hex color used for hovered subtitle token highlight in mpv (default: catppuccin mauve) |
| Option | Values | Description |
| ---------------------------------- | ----------- | ------------------------------------------------------------------------------------------------------------------------------------ |
| `fontFamily` | string | CSS font-family value (default: `"Hiragino Sans, M PLUS 1, Source Han Sans JP, Noto Sans CJK JP"`) |
| `fontSize` | number (px) | Font size in pixels (default: `35`) |
| `fontColor` | string | Any CSS color value (default: `"#cad3f5"`) |
| `css` | object | CSS declarations applied to subtitles after normal style defaults; the settings window writes textbox edits here |
| `fontWeight` | string | CSS font-weight, e.g. `"bold"`, `"normal"`, `"600"` (default: `"600"`) |
| `fontStyle` | string | `"normal"` or `"italic"` (default: `"normal"`) |
| `backgroundColor` | string | Any CSS color, including `"transparent"` (default: `"transparent"`) |
| `enableJlpt` | boolean | Enable JLPT level underline styling (`false` by default) |
| `preserveLineBreaks` | boolean | Preserve line breaks in visible overlay subtitle rendering (`false` by default). Enable to mirror mpv line layout. |
| `autoPauseVideoOnHover` | boolean | Pause playback while mouse hovers subtitle text, then resume on leave (`true` by default). |
| `autoPauseVideoOnYomitanPopup` | boolean | Pause playback while the Yomitan popup is open, then resume when the popup closes (`true` by default). |
| `hoverTokenColor` | string | Hex color used for hovered subtitle token highlight in mpv (default: catppuccin mauve) |
| `hoverTokenBackgroundColor` | string | CSS color used for hovered subtitle token background highlight (default: `"transparent"`); `hoverBackground` is accepted as an alias |
| `nameMatchEnabled` | boolean | Enable subtitle token coloring for matches from the SubMiner character dictionary (`false` by default) |
| `nameMatchColor` | string | Hex color used for subtitle tokens matched from the SubMiner character dictionary (default: `#f5bde6`) |
| `knownWordColor` | string | Hex color used for known-word subtitle highlights (default: `#a6da95`) |
| `nPlusOneColor` | string | Hex color used for the single N+1 target subtitle highlight (default: `#c6a0f6`) |
| `frequencyDictionary.enabled` | boolean | Enable frequency highlighting from dictionary lookups (`false` by default) |
| `frequencyDictionary.sourcePath` | string | Path to a local frequency dictionary root. Leave empty or omit to use installed/default frequency-dictionary search paths. |
| `frequencyDictionary.topX` | number | Only color tokens whose frequency rank is `<= topX` (`1000` by default) |
| `frequencyDictionary.mode` | string | `"single"` or `"banded"` (`"single"` by default) |
| `frequencyDictionary.matchMode` | string | `"headword"` or `"surface"` (`"headword"` by default) |
| `frequencyDictionary.singleColor` | string | Color used for all highlighted tokens in single mode |
| `frequencyDictionary.bandedColors` | string[] | Array of five hex colors used for ranked bands in banded mode |
| `jlptColors` | object | JLPT level underline colors object (`N1`..`N5`) |
| `secondary` | object | Override any of the above for secondary subtitles (optional), including `secondary.css` declarations |
| `nameMatchEnabled` | boolean | Enable subtitle token coloring for matches from the SubMiner character dictionary (`false` by default) |
| `nameMatchImagesEnabled` | boolean | Show small cached AniList character portraits beside matched character-name tokens (`false` by default) |
| `nameMatchColor` | string | Hex color used for subtitle tokens matched from the SubMiner character dictionary (default: `#f5bde6`) |
| `knownWordColor` | string | Hex color used for known-word subtitle highlights (default: `#a6da95`) |
| `nPlusOneColor` | string | Hex color used for the single N+1 target subtitle highlight (default: `#c6a0f6`) |
| `frequencyDictionary.enabled` | boolean | Enable frequency highlighting from dictionary lookups (`false` by default) |
| `frequencyDictionary.sourcePath` | string | Path to a local frequency dictionary root. Leave empty or omit to use installed/default frequency-dictionary search paths. |
| `frequencyDictionary.topX` | number | Only color tokens whose frequency rank is `<= topX` (`1000` by default) |
| `frequencyDictionary.mode` | string | `"single"` or `"banded"` (`"single"` by default) |
| `frequencyDictionary.matchMode` | string | `"headword"` or `"surface"` (`"headword"` by default) |
| `frequencyDictionary.singleColor` | string | Color used for all highlighted tokens in single mode |
| `frequencyDictionary.bandedColors` | string[] | Array of five hex colors used for ranked bands in banded mode |
| `jlptColors` | object | JLPT level underline colors object (`N1`..`N5`) |
| `secondary` | object | Override any of the above for secondary subtitles (optional), including `secondary.css` declarations |
The Settings window keeps subtitle color controls separate, then saves CSS textboxes to
`subtitleStyle.css`, `subtitleStyle.secondary.css`, and `subtitleSidebar.css`. The generated example
@@ -420,6 +421,7 @@ In `single` mode all highlights use `singleColor`; in `banded` mode tokens map t
Character-name highlighting is separate from N+1 and frequency highlighting:
- `nameMatchEnabled` controls whether SubMiner includes character-dictionary name matches in subtitle token metadata and renderer styling.
- `nameMatchImagesEnabled` adds small circular portraits beside matched names using the AniList images already cached with character dictionary snapshots.
- `nameMatchColor` sets the highlight color for those matched character names.
- Matches come from the bundled SubMiner character dictionary, including AniList-synced merged dictionaries when enabled.
@@ -865,15 +867,15 @@ This is the single, shared connection to an OpenAI-compatible LLM endpoint. Conf
}
```
| Option | Values | Description |
| ------------------ | -------------------- | ---------------------------------------------------------------------------------- |
| `enabled` | `true`, `false` | Enable shared AI provider features (default: `false`) |
| `apiKey` | string | Static API key for the shared provider |
| `apiKeyCommand` | string | Shell command used to resolve the API key (preferred over a plaintext `apiKey`) |
| Option | Values | Description |
| ------------------ | -------------------- | ------------------------------------------------------------------------------------ |
| `enabled` | `true`, `false` | Enable shared AI provider features (default: `false`) |
| `apiKey` | string | Static API key for the shared provider |
| `apiKeyCommand` | string | Shell command used to resolve the API key (preferred over a plaintext `apiKey`) |
| `model` | string | Default model identifier requested from the provider (default: `openai/gpt-4o-mini`) |
| `baseUrl` | string (URL) | OpenAI-compatible base URL (default: `https://openrouter.ai/api`) |
| `systemPrompt` | string | Default system prompt sent with requests (default: a translation-engine prompt) |
| `requestTimeoutMs` | integer milliseconds | Shared request timeout (default: `15000`) |
| `baseUrl` | string (URL) | OpenAI-compatible base URL (default: `https://openrouter.ai/api`) |
| `systemPrompt` | string | Default system prompt sent with requests (default: a translation-engine prompt) |
| `requestTimeoutMs` | integer milliseconds | Shared request timeout (default: `15000`) |
SubMiner uses the shared provider for:
@@ -1125,12 +1127,12 @@ Sync the active subtitle track from the overlay picker using `alass` or `ffsubsy
}
```
| Option | Values | Description |
| ---------------- | -------------------- | ------------------------------------------------------------------------------------------------------------------------- |
| `alass_path` | string path | Path to `alass` executable. Empty or `null` resolves from `PATH`. `alass` must be installed separately. |
| `ffsubsync_path` | string path | Path to `ffsubsync` executable. Empty or `null` resolves from `PATH`. `ffsubsync` must be installed separately. |
| `ffmpeg_path` | string path | Path to `ffmpeg` (used for internal subtitle extraction). Empty or `null` falls back to `/usr/bin/ffmpeg`. |
| `replace` | `true`, `false` | When `true` (default), overwrite the active subtitle file on successful sync. When `false`, write `<name>_retimed.<ext>`. |
| Option | Values | Description |
| ---------------- | --------------- | ------------------------------------------------------------------------------------------------------------------------- |
| `alass_path` | string path | Path to `alass` executable. Empty or `null` resolves from `PATH`. `alass` must be installed separately. |
| `ffsubsync_path` | string path | Path to `ffsubsync` executable. Empty or `null` resolves from `PATH`. `ffsubsync` must be installed separately. |
| `ffmpeg_path` | string path | Path to `ffmpeg` (used for internal subtitle extraction). Empty or `null` falls back to `/usr/bin/ffmpeg`. |
| `replace` | `true`, `false` | When `true` (default), overwrite the active subtitle file on successful sync. When `false`, write `<name>_retimed.<ext>`. |
Default trigger is `Ctrl+Alt+S` via `shortcuts.triggerSubsync`.
Customize it there, or set it to `null` to disable.
+1
View File
@@ -384,6 +384,7 @@
"autoPauseVideoOnHover": true, // Automatically pause mpv playback while hovering subtitle text, then resume on leave. Values: true | false
"autoPauseVideoOnYomitanPopup": true, // Automatically pause mpv playback while Yomitan popup is open, then resume when popup closes. Values: true | false
"nameMatchEnabled": false, // Enable subtitle token coloring for matches from the SubMiner character dictionary. Values: true | false
"nameMatchImagesEnabled": false, // Show small character portraits beside subtitle tokens matched from the SubMiner character dictionary. Values: true | false
"nameMatchColor": "#f5bde6", // Hex color used when a subtitle token matches an entry from the SubMiner character dictionary.
"nPlusOneColor": "#c6a0f6", // Color used for the single N+1 target token subtitle highlight.
"knownWordColor": "#a6da95", // Color used for known-word subtitle highlights.
+15 -12
View File
@@ -44,13 +44,15 @@ Character-name matches are built from the active merged SubMiner character dicti
1. Subtitles are tokenized, then candidate name tokens are matched against the character dictionary via Yomitan's scanning pipeline.
2. Matching tokens receive a dedicated style distinct from N+1 and frequency layers.
3. This layer can be independently toggled with `subtitleStyle.nameMatchEnabled`.
4. When `subtitleStyle.nameMatchImagesEnabled` is also enabled, SubMiner shows the cached AniList portrait beside matched names.
**Key settings:**
| Option | Default | Description |
| -------------------------------- | --------- | ---------------------------------------- |
| `subtitleStyle.nameMatchEnabled` | `false` | Enable character-name token highlighting |
| `subtitleStyle.nameMatchColor` | `#f5bde6` | Color used for character-name matches |
| Option | Default | Description |
| -------------------------------------- | --------- | ------------------------------------------------ |
| `subtitleStyle.nameMatchEnabled` | `false` | Enable character-name token highlighting |
| `subtitleStyle.nameMatchImagesEnabled` | `false` | Show small AniList portraits next to name tokens |
| `subtitleStyle.nameMatchColor` | `#f5bde6` | Color used for character-name matches |
For full details on dictionary generation, name variant expansion, auto-sync lifecycle, and configuration, see the dedicated [Character Dictionary](/character-dictionary) page.
@@ -67,14 +69,14 @@ SubMiner looks up each token's `frequencyRank` from `term_meta_bank_*.json` file
**Key settings:**
| Option | Default | Description |
| ------------------------------------------------ | ------------ | ---------------------------------------- |
| `subtitleStyle.frequencyDictionary.enabled` | `false` | Enable frequency highlighting |
| `subtitleStyle.frequencyDictionary.topX` | `1000` | Max frequency rank to highlight |
| `subtitleStyle.frequencyDictionary.mode` | `"single"` | `"single"` or `"banded"` |
| `subtitleStyle.frequencyDictionary.matchMode` | `"headword"` | `"headword"` or `"surface"` |
| `subtitleStyle.frequencyDictionary.singleColor` | `#f5a97f` | Color for single mode |
| `subtitleStyle.frequencyDictionary.bandedColors` | 5 colors[^1] | Array of five hex colors for banded mode |
| Option | Default | Description |
| ------------------------------------------------ | ------------ | ---------------------------------------------------------------- |
| `subtitleStyle.frequencyDictionary.enabled` | `false` | Enable frequency highlighting |
| `subtitleStyle.frequencyDictionary.topX` | `1000` | Max frequency rank to highlight |
| `subtitleStyle.frequencyDictionary.mode` | `"single"` | `"single"` or `"banded"` |
| `subtitleStyle.frequencyDictionary.matchMode` | `"headword"` | `"headword"` or `"surface"` |
| `subtitleStyle.frequencyDictionary.singleColor` | `#f5a97f` | Color for single mode |
| `subtitleStyle.frequencyDictionary.bandedColors` | 5 colors[^1] | Array of five hex colors for banded mode |
| `subtitleStyle.frequencyDictionary.sourcePath` | `""` | Custom path to frequency dictionary root (empty = auto-discover) |
[^1]: Default banded palette (most common → least common): `#ed8796`, `#f5a97f`, `#f9e2af`, `#8bd5ca`, `#8aadf4`.
@@ -122,6 +124,7 @@ All annotation layers can be toggled at runtime via the mpv command menu without
- `ankiConnect.knownWords.highlightEnabled` (`On` / `Off`)
- `subtitleStyle.nameMatchEnabled` (`On` / `Off`)
- `subtitleStyle.nameMatchImagesEnabled` (`On` / `Off`)
- `subtitleStyle.enableJlpt` (`On` / `Off`)
- `subtitleStyle.frequencyDictionary.enabled` (`On` / `Off`)