mirror of
https://github.com/ksyasuda/SubMiner.git
synced 2026-02-27 18:22:41 -08:00
docs: add setup guides, architecture docs, and config examples
docs/anki-integration.md
# Anki Integration

SubMiner uses the [AnkiConnect](https://ankiweb.net/shared/info/2055492159) add-on to create and update Anki cards with sentence context, audio, and screenshots.

## Prerequisites

1. Install [Anki](https://apps.ankiweb.net/).
2. Install the [AnkiConnect](https://ankiweb.net/shared/info/2055492159) add-on (code: `2055492159`).
3. Keep Anki running while using SubMiner.

AnkiConnect listens on `http://127.0.0.1:8765` by default. If you changed the port in AnkiConnect's settings, update `ankiConnect.url` in your SubMiner config.
## How Polling Works

SubMiner polls AnkiConnect at a regular interval (default: 3 seconds, configurable via `ankiConnect.pollingRate`) to detect new cards. When it finds a card added since the last poll, it:

1. Checks whether a duplicate expression already exists (for field grouping).
2. Updates the sentence field with the current subtitle.
3. Generates and uploads audio and image media.
4. Fills the translation field from the secondary subtitle or AI.
5. Writes metadata to the miscInfo field.

Polling uses the query `"deck:<your-deck>" added:1` to find recently added cards. If no deck is configured, it searches all decks.
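The polling step boils down to a `findNotes` request against AnkiConnect. A minimal sketch using only the standard library (the `findNotes` action and request shape are part of the AnkiConnect API; the function names and deck name are illustrative):

```python
import json
import urllib.request
from typing import Optional

def build_poll_query(deck: Optional[str]) -> str:
    """Poll query: scope to the configured deck, else search all decks."""
    return f'"deck:{deck}" added:1' if deck else "added:1"

def find_recent_notes(url: str, deck: Optional[str]) -> list:
    """Ask AnkiConnect for note IDs matching the poll query (Anki must be running)."""
    payload = json.dumps({
        "action": "findNotes",  # AnkiConnect action, API version 6
        "version": 6,
        "params": {"query": build_poll_query(deck)},
    }).encode()
    req = urllib.request.Request(url, data=payload)
    with urllib.request.urlopen(req, timeout=5) as resp:
        return json.loads(resp.read())["result"]
```

SubMiner compares the returned note IDs against the previous poll to spot newly added cards.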
## Field Mapping

SubMiner maps its data to your Anki note fields. Configure these under `ankiConnect.fields`:

```jsonc
"ankiConnect": {
  "fields": {
    "audio": "ExpressionAudio", // audio clip from the video
    "image": "Picture", // screenshot or animated clip
    "sentence": "Sentence", // subtitle text
    "miscInfo": "MiscInfo", // metadata (filename, timestamp)
    "translation": "SelectionText" // secondary sub or AI translation
  }
}
```

Field names must match your Anki note type exactly (case-sensitive). If a configured field does not exist on the note type, SubMiner skips it without error.
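Field updates go through AnkiConnect's `updateNoteFields` action. A sketch of the request body, assuming a hypothetical note ID and the field names from the example above (the action and payload shape are AnkiConnect's; the helper name is mine):

```python
import json

def update_fields_payload(note_id: int, fields: dict) -> dict:
    """Build an AnkiConnect updateNoteFields request body (API version 6)."""
    return {
        "action": "updateNoteFields",
        "version": 6,
        "params": {"note": {"id": note_id, "fields": fields}},
    }

# e.g. fill the field mapped to "sentence" on a (hypothetical) note
payload = update_fields_payload(1234567890, {"Sentence": "今日はいい天気ですね。"})
body = json.dumps(payload).encode()  # POST this to ankiConnect.url
```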
### Minimal Config

If you only want sentence and audio on your cards:

```jsonc
"ankiConnect": {
  "enabled": true,
  "fields": {
    "sentence": "Sentence",
    "audio": "ExpressionAudio"
  }
}
```
## Media Generation

SubMiner uses FFmpeg to generate audio and image media from the video. FFmpeg must be installed and on `PATH`.
### Audio

Audio is extracted from the video file using the subtitle's start and end timestamps, with configurable padding added before and after.

```jsonc
"ankiConnect": {
  "media": {
    "generateAudio": true,
    "audioPadding": 0.5, // seconds before and after subtitle timing
    "maxMediaDuration": 30 // cap total duration in seconds
  }
}
```

Output format: MP3 at 44100 Hz. If the video has multiple audio streams, SubMiner uses the active stream.

The audio is uploaded to Anki's media folder and inserted as `[sound:audio_<timestamp>.mp3]`.
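The timing math is straightforward: pad the subtitle window on both sides, clamp the start to zero, and cap the result at `maxMediaDuration`. A sketch (function names are mine; the FFmpeg flags shown are standard options for MP3 extraction, though SubMiner's exact invocation may differ):

```python
def clip_window(sub_start: float, sub_end: float,
                padding: float = 0.5, max_duration: float = 30.0):
    """Padded clip window in seconds, clamped to zero and capped at max_duration."""
    start = max(0.0, sub_start - padding)
    end = sub_end + padding
    if end - start > max_duration:
        end = start + max_duration
    return start, end

def audio_args(video: str, out: str, start: float, end: float) -> list:
    """FFmpeg argv to extract an MP3 clip at 44100 Hz (standard flags)."""
    return ["ffmpeg", "-ss", f"{start:.3f}", "-to", f"{end:.3f}",
            "-i", video, "-vn", "-ar", "44100", "-codec:a", "libmp3lame", out]
```

For example, a subtitle spanning 10.0–12.0 s with the default 0.5 s padding yields a 9.5–12.5 s clip.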
### Screenshots (Static)

A single frame is captured at the current playback position.

```jsonc
"ankiConnect": {
  "media": {
    "generateImage": true,
    "imageType": "static",
    "imageFormat": "jpg", // "jpg", "png", or "webp"
    "imageQuality": 92, // 1–100
    "imageMaxWidth": null, // optional, preserves aspect ratio
    "imageMaxHeight": null
  }
}
```
### Animated Clips (AVIF)

Instead of a static screenshot, SubMiner can generate an animated AVIF covering the subtitle duration.

```jsonc
"ankiConnect": {
  "media": {
    "generateImage": true,
    "imageType": "avif",
    "animatedFps": 10,
    "animatedMaxWidth": 640,
    "animatedMaxHeight": null,
    "animatedCrf": 35 // 0–63, lower = better quality
  }
}
```

Animated AVIF requires an AV1 encoder (`libaom-av1`, `libsvtav1`, or `librav1e`) in your FFmpeg build. Generation times out after 60 seconds.
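A sketch of how these settings map onto an FFmpeg command line, assuming the `libaom-av1` encoder (the flags are standard FFmpeg options; SubMiner's exact invocation may differ, and the function name is mine):

```python
def avif_args(video: str, out: str, start: float, end: float,
              fps: int = 10, max_width: int = 640, crf: int = 35) -> list:
    """FFmpeg argv for an animated AVIF clip via libaom-av1.

    scale=<w>:-2 preserves aspect ratio while keeping the height even,
    -crf with -b:v 0 selects constant-quality mode, and -an drops audio.
    """
    vf = f"fps={fps},scale={max_width}:-2"
    return ["ffmpeg", "-ss", f"{start:.3f}", "-to", f"{end:.3f}", "-i", video,
            "-vf", vf, "-c:v", "libaom-av1", "-crf", str(crf), "-b:v", "0",
            "-an", out]
```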
### Behavior Options

```jsonc
"ankiConnect": {
  "behavior": {
    "overwriteAudio": true, // replace existing audio, or append
    "overwriteImage": true, // replace existing image, or append
    "mediaInsertMode": "append", // "append" or "prepend" to field content
    "autoUpdateNewCards": true, // auto-update when new card detected
    "notificationType": "osd" // "osd", "system", "both", or "none"
  }
}
```
## AI Translation

SubMiner can auto-translate the mined sentence and fill the translation field. By default, if a secondary subtitle track is available, its text is used. When AI is enabled, SubMiner calls an LLM API instead.

```jsonc
"ankiConnect": {
  "ai": {
    "enabled": true,
    "alwaysUseAiTranslation": false, // true = ignore secondary sub
    "apiKey": "sk-...",
    "model": "openai/gpt-4o-mini",
    "baseUrl": "https://openrouter.ai/api",
    "targetLanguage": "English",
    "systemPrompt": "You are a translation engine. Return only the translation."
  }
}
```

Translation priority:

1. If `alwaysUseAiTranslation` is `true`, always call the AI API.
2. If a secondary subtitle is available, use it as the translation.
3. If AI is enabled and no secondary subtitle exists, call the AI API.
4. Otherwise, leave the field empty.
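The priority order above can be sketched as a small decision function (names are mine; `call_ai` stands in for the actual API call, and I assume `alwaysUseAiTranslation` only applies when AI is enabled):

```python
def pick_translation(sentence, secondary_sub, ai_enabled,
                     always_use_ai, call_ai):
    """Resolve the translation field per the priority order above."""
    if ai_enabled and always_use_ai:   # 1. forced AI translation
        return call_ai(sentence)
    if secondary_sub:                  # 2. secondary subtitle wins
        return secondary_sub
    if ai_enabled:                     # 3. AI fallback
        return call_ai(sentence)
    return ""                          # 4. leave the field empty

fake_ai = lambda s: f"AI({s})"
result = pick_translation("猫が好き", "I like cats", True, False, fake_ai)
```

With AI enabled but `alwaysUseAiTranslation` off, the secondary subtitle ("I like cats") is used and the API is never called.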
## Sentence Cards (Lapis)

SubMiner can create standalone sentence cards (without a word/expression) using a separate note type. This is designed for use with [Lapis](https://github.com/donkuri/Lapis) and similar sentence-focused note types.

```jsonc
"ankiConnect": {
  "isLapis": {
    "enabled": true,
    "sentenceCardModel": "Japanese sentences"
  }
}
```

Trigger with the mine sentence shortcut (`Ctrl/Cmd+S` by default). The card is created directly via AnkiConnect with the sentence, audio, and image filled in.

To mine multiple subtitle lines as one sentence card, use `Ctrl/Cmd+Shift+S` followed by a digit (1–9) to select how many recent lines to combine.
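Combining recent lines is just a slice over the subtitle history. A sketch (the function name is mine, and the joining convention is an assumption; Japanese subtitle lines are typically concatenated without a separator):

```python
def combine_recent_lines(history: list, n: int) -> str:
    """Join the n most recent subtitle lines into one sentence-card sentence.

    history is ordered oldest to newest; n corresponds to the digit (1-9)
    pressed after the multi-line mining shortcut.
    """
    return "".join(history[-n:])

history = ["おはよう。", "今日は早いね。", "うん、用事があるんだ。"]
sentence = combine_recent_lines(history, 2)  # last two lines as one sentence
```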
## Field Grouping (Kiku)

When you mine the same word multiple times, SubMiner can merge the cards instead of creating duplicates. This is designed for note types like [Kiku](https://github.com/youyoumu/kiku) that support grouped sentence/audio/image fields.

```jsonc
"ankiConnect": {
  "isKiku": {
    "enabled": true,
    "fieldGrouping": "manual", // "auto", "manual", or "disabled"
    "deleteDuplicateInAuto": true // delete new card after auto-merge
  }
}
```
### Modes

**Disabled** (`"disabled"`): No duplicate detection. Each card is independent.

**Auto** (`"auto"`): When a duplicate expression is found, SubMiner merges the new card into the existing one automatically. Both sentences, audio clips, and images are preserved. If `deleteDuplicateInAuto` is true, the new card is deleted after merging.

**Manual** (`"manual"`): A modal appears in the overlay showing both cards. You choose which card to keep, preview the merge result, then confirm. The modal has a 90-second timeout, after which it cancels automatically.
### What Gets Merged

| Field    | Merge behavior                                                 |
| -------- | -------------------------------------------------------------- |
| Sentence | Both sentences preserved, labeled `[Original]` / `[Duplicate]` |
| Audio    | Both `[sound:...]` entries kept                                |
| Image    | Both images kept                                               |
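The merge behavior in the table can be sketched as follows. The labels match the table; the function name, separator, and field names (taken from the field-mapping example earlier) are assumptions about the exact output format:

```python
def merge_fields(original: dict, duplicate: dict,
                 sentence_field: str = "Sentence") -> dict:
    """Merge a duplicate card's fields into the original, per the table above."""
    merged = dict(original)
    # Sentence: keep both, labeled [Original] / [Duplicate]
    merged[sentence_field] = (
        f"[Original] {original[sentence_field]}<br>"
        f"[Duplicate] {duplicate[sentence_field]}"
    )
    # Audio / image: keep both entries by concatenating field content
    for field in ("ExpressionAudio", "Picture"):
        if field in original and field in duplicate:
            merged[field] = original[field] + duplicate[field]
    return merged
```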
### Keyboard Shortcuts in the Modal

| Key       | Action                             |
| --------- | ---------------------------------- |
| `1` / `2` | Select card 1 or card 2 to keep    |
| `Enter`   | Confirm selection                  |
| `Esc`     | Cancel (keep both cards unchanged) |
## Full Config Example

```jsonc
{
  "ankiConnect": {
    "enabled": true,
    "url": "http://127.0.0.1:8765",
    "pollingRate": 3000,
    "fields": {
      "audio": "ExpressionAudio",
      "image": "Picture",
      "sentence": "Sentence",
      "miscInfo": "MiscInfo",
      "translation": "SelectionText"
    },
    "media": {
      "generateAudio": true,
      "generateImage": true,
      "imageType": "static",
      "imageFormat": "jpg",
      "imageQuality": 92,
      "audioPadding": 0.5,
      "maxMediaDuration": 30
    },
    "behavior": {
      "overwriteAudio": true,
      "overwriteImage": true,
      "mediaInsertMode": "append",
      "autoUpdateNewCards": true,
      "notificationType": "osd"
    },
    "ai": {
      "enabled": false,
      "apiKey": "",
      "model": "openai/gpt-4o-mini",
      "baseUrl": "https://openrouter.ai/api",
      "targetLanguage": "English"
    },
    "isKiku": {
      "enabled": false,
      "fieldGrouping": "disabled",
      "deleteDuplicateInAuto": true
    },
    "isLapis": {
      "enabled": false,
      "sentenceCardModel": "Japanese sentences"
    }
  }
}
```