mirror of
https://github.com/ksyasuda/SubMiner.git
synced 2026-02-27 18:22:41 -08:00
Merge pull request #5 from ksyasuda/feature/frequency-based-highlighting
Add vendor frequency defaults with override support
@@ -3,11 +3,15 @@ id: TASK-25
 title: >-
   Add frequency-dictionary-based token highlighting with configurable top-X and
   color ramp
-status: To Do
+status: Done
 assignee: []
 created_date: '2026-02-13 16:47'
+updated_date: '2026-02-16 06:48'
 labels: []
 dependencies: []
+documentation:
+  - /Users/sudacode/.codex/worktrees/2089/SubMiner/docs/configuration.md
+  - /Users/sudacode/.codex/worktrees/2089/SubMiner/docs/jlpt-vocab-bundle.md
 priority: high
 ---
 
@@ -19,20 +23,32 @@ Leverage user-installed frequency dictionaries to color subtitle tokens based on
 
 ## Acceptance Criteria
 <!-- AC:BEGIN -->
-- [ ] #1 Add a feature flag and configuration for frequency-based highlighting with default disabled state.
-- [ ] #2 Support selecting a user-installed frequency dictionary source and reading word frequency data from it.
-- [ ] #3 Introduce a configurable top-X threshold in config for which words are eligible for frequency-based coloring.
-- [ ] #4 When single-color mode is enabled, all matched words within the rank rule use the configured color.
-- [ ] #5 When multi-color mode is enabled, map frequency bands to colors and color tokens by their actual rank bucket.
-- [ ] #6 Ensure matching is token-aware (normalization/lowercasing handling) and preserves existing subtitle tokenization behavior.
-- [ ] #7 Handle missing/unsupported dictionary formats and unknown words with deterministic no-highlight fallback.
-- [ ] #8 Render underline/token highlights without breaking subtitle layout or interactions.
-- [ ] #9 Add tests/verification for: single-color mode, color-band mode, threshold boundary, and disabled mode.
-- [ ] #10 Document dictionary source format expectations, configuration example, and performance impact of ranking lookups.
-- [ ] #11 If full automatic discovery of user-installed frequency dictionaries is not possible, provide clear configuration workflow/fallback path.
+- [x] #1 Add a feature flag and configuration for frequency-based highlighting with default disabled state.
+- [x] #2 Support selecting a user-installed frequency dictionary source and reading word frequency data from it.
+- [x] #3 Introduce a configurable top-X threshold in config for which words are eligible for frequency-based coloring.
+- [x] #4 When single-color mode is enabled, all matched words within the rank rule use the configured color.
+- [x] #5 When multi-color mode is enabled, map frequency bands to colors and color tokens by their actual rank bucket.
+- [x] #6 Ensure matching is token-aware (normalization/lowercasing handling) and preserves existing subtitle tokenization behavior.
+- [x] #7 Handle missing/unsupported dictionary formats and unknown words with deterministic no-highlight fallback.
+- [x] #8 Render underline/token highlights without breaking subtitle layout or interactions.
+- [x] #9 Add tests/verification for: single-color mode, color-band mode, threshold boundary, and disabled mode.
+- [x] #10 Document dictionary source format expectations, configuration example, and performance impact of ranking lookups.
+- [x] #11 If full automatic discovery of user-installed frequency dictionaries is not possible, provide clear configuration workflow/fallback path.
 <!-- AC:END -->
 
+## Implementation Notes
+
+<!-- SECTION:NOTES:BEGIN -->
+2026-02-16: Updated docs for frequency dictionary behavior. Clarified built-in fallback, precedence, and shared format expectations in and .
+
+Added docs references for frequency dictionary defaults and fallback behavior.
+
+As of 2026-02-16, docs and implementation are considered complete for TASK-25; frequency highlighting fallback, custom sourcePath precedence, topX, single/banded modes, token pipeline integration, and fallback behavior are present; documentation and tests exist in src/core/services and src/renderer.
+
+2026-02-16: Frequency-dictionary highlighting feature fully complete and shipped. Task acceptance criteria, DoD, and docs alignment are all marked complete in this task record.
+<!-- SECTION:NOTES:END -->
+
 ## Definition of Done
 <!-- DOD:BEGIN -->
-- [ ] #1 Frequency-based highlighting renders using either single-color or banded-colors for valid matches, with configurable top-X threshold and documented setup.
+- [x] #1 Frequency-based highlighting renders using either single-color or banded-colors for valid matches, with configurable top-X threshold and documented setup.
 <!-- DOD:END -->
@@ -555,6 +555,12 @@ See `config.example.jsonc` for detailed configuration options.
 | `fontStyle` | string | `"normal"` or `"italic"` (default: `"normal"`) |
 | `backgroundColor` | string | Any CSS color, including `"transparent"` (default: `"rgba(54, 58, 79, 0.5)"`) |
 | `enableJlpt` | boolean | Enable JLPT level underline styling (`false` by default) |
+| `frequencyDictionary.enabled` | boolean | Enable frequency highlighting from dictionary lookups (`false` by default) |
+| `frequencyDictionary.sourcePath` | string | Path to a local frequency dictionary root. Leave empty or omit to use the built-in bundled dictionary search paths. |
+| `frequencyDictionary.topX` | number | Only color tokens whose frequency rank is `<= topX` (`1000` by default) |
+| `frequencyDictionary.mode` | string | `"single"` or `"banded"` (`"single"` by default) |
+| `frequencyDictionary.singleColor` | string | Color used for all highlighted tokens in single mode |
+| `frequencyDictionary.bandedColors` | string[] | Array of five hex colors used for ranked bands in banded mode |
 | `nPlusOneColor` | string | Existing n+1 highlight color (default: `#c6a0f6`) |
 | `knownWordColor` | string | Existing known-word highlight color (default: `#a6da95`) |
 | `jlptColors` | object | JLPT level underline colors object (`N1`..`N5`) |
@@ -562,6 +568,16 @@ See `config.example.jsonc` for detailed configuration options.
 
 JLPT underlining is powered by offline term-meta bank files at runtime. See [`docs/jlpt-vocab-bundle.md`](jlpt-vocab-bundle.md) for required files, source/version refresh steps, and deterministic fallback behavior.
 
+Frequency dictionary highlighting uses the same dictionary file format as JLPT bundle lookups (`term_meta_bank_*.json` under discovered dictionary directories). A token is highlighted when it has a positive integer `frequencyRank` (lower is more common) and the rank is within `topX`.
+
+Lookup behavior:
+
+- Set `frequencyDictionary.sourcePath` to a directory containing `term_meta_bank_*.json` for a fully custom source.
+- If `sourcePath` is missing or empty, SubMiner uses bundled defaults from `vendor/jiten_freq_global` (packaged under `<resources>/jiten_freq_global` in distribution builds).
+- In both cases, only terms with a valid `frequencyRank` are used; everything else falls back to no highlighting.
+
+In `single` mode all highlights use `singleColor`; in `banded` mode tokens map to five ascending color bands from most common to least common inside the topX window.
+
 Secondary subtitle defaults: `fontSize: 24`, `fontColor: "#ffffff"`, `backgroundColor: "transparent"`. Any property not set in `secondary` falls back to the CSS defaults.
 
 **See `config.example.jsonc`** for the complete list of subtitle style configuration options.
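Reviewer note: the `topX` and `single`/`banded` rules documented in the configuration hunk can be sketched as below. This is a minimal sketch, not code from this commit. The helper name `pickFrequencyColor`, the `FrequencyHighlightConfig` type, and the equal-width split of the `topX` window into five bands are all assumptions made for illustration; the docs only state that tokens map to five bands from most common to least common inside `topX`.

```typescript
type FrequencyMode = "single" | "banded";

// Hypothetical shape mirroring the documented `frequencyDictionary` options.
interface FrequencyHighlightConfig {
  enabled: boolean;
  topX: number;
  mode: FrequencyMode;
  singleColor: string;
  bandedColors: [string, string, string, string, string];
}

// Hypothetical helper: returns the highlight color for a rank, or null for
// "no highlight" (feature disabled, unknown word, or rank outside topX).
function pickFrequencyColor(
  rank: number | null,
  config: FrequencyHighlightConfig,
): string | null {
  if (!config.enabled) return null;
  if (rank === null || !Number.isInteger(rank) || rank <= 0) return null;
  if (rank > config.topX) return null;
  if (config.mode === "single") return config.singleColor;

  // Assumption: the topX window is split into equal-width bands, with
  // band 0 holding the most common ranks.
  const bandCount = config.bandedColors.length;
  const bandIndex = Math.min(
    bandCount - 1,
    Math.floor(((rank - 1) * bandCount) / config.topX),
  );
  return config.bandedColors[bandIndex] ?? null;
}

// Values copied from the defaults in this diff, with the feature switched on
// and banded mode selected.
const exampleConfig: FrequencyHighlightConfig = {
  enabled: true,
  topX: 1000,
  mode: "banded",
  singleColor: "#f5a97f",
  bandedColors: ["#ed8796", "#f5a97f", "#f9e2af", "#a6e3a1", "#8aadf4"],
};

const mostCommonColor = pickFrequencyColor(1, exampleConfig);
const outsideWindowColor = pickFrequencyColor(1001, exampleConfig);
```

Under these assumptions, rank `1` lands in the first band and anything above `topX` gets no color, which matches the threshold-boundary case listed in the acceptance criteria. The renderer in this repository may split bands differently.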
@@ -26,6 +26,8 @@ The expected files are:
 
 Each bank maps terms to frequency metadata; only entries with a `frequency.displayValue` are considered for JLPT tagging.
 
+SubMiner also reuses the same `term_meta_bank_*.json` format for frequency-based subtitle highlighting. The default frequency source is now bundled as `vendor/jiten_freq_global`, so users can enable `subtitleStyle.frequencyDictionary` without extra setup.
+
 ## Source and update process
 
 For reproducible updates:
@@ -151,6 +151,20 @@
   // ==========================================
   "subtitleStyle": {
     "enableJlpt": false,
+    "frequencyDictionary": {
+      "enabled": false,
+      "sourcePath": "",
+      "topX": 1000,
+      "mode": "single",
+      "singleColor": "#f5a97f",
+      "bandedColors": [
+        "#ed8796",
+        "#f5a97f",
+        "#f9e2af",
+        "#a6e3a1",
+        "#8aadf4"
+      ]
+    },
     "fontFamily": "Noto Sans CJK JP Regular, Noto Sans CJK JP, Arial Unicode MS, Arial, sans-serif",
     "fontSize": 35,
     "fontColor": "#cad3f5",
@@ -103,6 +103,10 @@
         "from": "vendor/yomitan-jlpt-vocab",
         "to": "yomitan-jlpt-vocab"
       },
+      {
+        "from": "vendor/jiten_freq_global",
+        "to": "jiten_freq_global"
+      },
       {
         "from": "assets",
         "to": "assets"
@@ -28,6 +28,31 @@ local function is_linux()
     return not is_windows() and not is_macos()
 end
 
+local function normalize_binary_path_candidate(candidate)
+    if type(candidate) ~= "string" then
+        return nil
+    end
+    local trimmed = candidate:match("^%s*(.-)%s*$") or ""
+    if trimmed == "" then
+        return nil
+    end
+    if #trimmed >= 2 then
+        local first = trimmed:sub(1, 1)
+        local last = trimmed:sub(-1)
+        if (first == '"' and last == '"') or (first == "'" and last == "'") then
+            trimmed = trimmed:sub(2, -2)
+        end
+    end
+    return trimmed ~= "" and trimmed or nil
+end
+
+local function binary_candidates_from_app_path(app_path)
+    return {
+        utils.join_path(app_path, "Contents", "MacOS", "SubMiner"),
+        utils.join_path(app_path, "Contents", "MacOS", "subminer"),
+    }
+end
+
 local opts = {
     binary_path = "",
     socket_path = default_socket_path(),
@@ -131,12 +156,68 @@ end
 
 local function file_exists(path)
     local info = utils.file_info(path)
-    return info ~= nil
+    if not info then return false end
+    if info.is_dir ~= nil then
+        return not info.is_dir
+    end
+    return true
+end
+
+local function resolve_binary_candidate(candidate)
+    local normalized = normalize_binary_path_candidate(candidate)
+    if not normalized then
+        return nil
+    end
+
+    if file_exists(normalized) then
+        return normalized
+    end
+
+    if not normalized:lower():find("%.app") then
+        return nil
+    end
+
+    local app_root = normalized
+    if not app_root:lower():match("%.app$") then
+        app_root = normalized:match("(.+%.app)")
+    end
+    if not app_root then
+        return nil
+    end
+
+    for _, path in ipairs(binary_candidates_from_app_path(app_root)) do
+        if file_exists(path) then
+            return path
+        end
+    end
+
+    return nil
+end
+
+local function find_binary_override()
+    local candidates = {
+        resolve_binary_candidate(os.getenv("SUBMINER_APPIMAGE_PATH")),
+        resolve_binary_candidate(os.getenv("SUBMINER_BINARY_PATH")),
+    }
+
+    for _, path in ipairs(candidates) do
+        if path and path ~= "" then
+            return path
+        end
+    end
+
+    return nil
 end
 
 local function find_binary()
-    if opts.binary_path ~= "" and file_exists(opts.binary_path) then
-        return opts.binary_path
+    local override = find_binary_override()
+    if override then
+        return override
+    end
+
+    local configured = resolve_binary_candidate(opts.binary_path)
+    if configured then
+        return configured
     end
 
     local search_paths = {
@@ -195,6 +195,20 @@ export const DEFAULT_CONFIG: ResolvedConfig = {
       N4: "#a6e3a1",
       N5: "#8aadf4",
     },
+    frequencyDictionary: {
+      enabled: false,
+      sourcePath: "",
+      topX: 1000,
+      mode: "single",
+      singleColor: "#f5a97f",
+      bandedColors: [
+        "#ed8796",
+        "#f5a97f",
+        "#f9e2af",
+        "#a6e3a1",
+        "#8aadf4",
+      ],
+    },
     secondary: {
       fontSize: 24,
       fontColor: "#ffffff",
@@ -306,6 +320,48 @@ export const CONFIG_OPTION_REGISTRY: ConfigOptionRegistryEntry[] = [
     description: "Enable JLPT vocabulary level underlines. "
       + "When disabled, JLPT tagging lookup and underlines are skipped.",
   },
+  {
+    path: "subtitleStyle.frequencyDictionary.enabled",
+    kind: "boolean",
+    defaultValue: DEFAULT_CONFIG.subtitleStyle.frequencyDictionary.enabled,
+    description:
+      "Enable frequency-dictionary-based highlighting based on token rank.",
+  },
+  {
+    path: "subtitleStyle.frequencyDictionary.sourcePath",
+    kind: "string",
+    defaultValue: DEFAULT_CONFIG.subtitleStyle.frequencyDictionary.sourcePath,
+    description:
+      "Optional absolute path to a frequency dictionary directory."
+      + " If empty, built-in discovery search paths are used.",
+  },
+  {
+    path: "subtitleStyle.frequencyDictionary.topX",
+    kind: "number",
+    defaultValue: DEFAULT_CONFIG.subtitleStyle.frequencyDictionary.topX,
+    description: "Only color tokens with frequency rank <= topX (default: 1000).",
+  },
+  {
+    path: "subtitleStyle.frequencyDictionary.mode",
+    kind: "enum",
+    enumValues: ["single", "banded"],
+    defaultValue: DEFAULT_CONFIG.subtitleStyle.frequencyDictionary.mode,
+    description:
+      "single: use one color for all matching tokens. banded: use color ramp by frequency band.",
+  },
+  {
+    path: "subtitleStyle.frequencyDictionary.singleColor",
+    kind: "string",
+    defaultValue: DEFAULT_CONFIG.subtitleStyle.frequencyDictionary.singleColor,
+    description: "Color used when frequencyDictionary.mode is `single`.",
+  },
+  {
+    path: "subtitleStyle.frequencyDictionary.bandedColors",
+    kind: "array",
+    defaultValue: DEFAULT_CONFIG.subtitleStyle.frequencyDictionary.bandedColors,
+    description:
+      "Five colors used for rank bands when mode is `banded` (from most common to least within topX).",
+  },
   {
     path: "ankiConnect.enabled",
     kind: "boolean",
@@ -45,6 +45,21 @@ function asColor(value: unknown): string | undefined {
   return hexColorPattern.test(text) ? text : undefined;
 }
 
+function asFrequencyBandedColors(
+  value: unknown,
+): [string, string, string, string, string] | undefined {
+  if (!Array.isArray(value) || value.length !== 5) {
+    return undefined;
+  }
+
+  const colors = value.map((item) => asColor(item));
+  if (colors.some((color) => color === undefined)) {
+    return undefined;
+  }
+
+  return colors as [string, string, string, string, string];
+}
+
 export class ConfigService {
   private readonly configDir: string;
   private readonly configFileJsonc: string;
@@ -468,6 +483,108 @@ export class ConfigService {
           "Expected boolean.",
         );
       }
+
+      const frequencyDictionary = isObject(
+        (src.subtitleStyle as { frequencyDictionary?: unknown })
+          .frequencyDictionary,
+      )
+        ? ((src.subtitleStyle as { frequencyDictionary?: unknown })
+            .frequencyDictionary as Record<string, unknown>)
+        : {};
+      const frequencyEnabled = asBoolean(
+        (frequencyDictionary as { enabled?: unknown }).enabled,
+      );
+      if (frequencyEnabled !== undefined) {
+        resolved.subtitleStyle.frequencyDictionary.enabled = frequencyEnabled;
+      } else if (
+        (frequencyDictionary as { enabled?: unknown }).enabled !== undefined
+      ) {
+        warn(
+          "subtitleStyle.frequencyDictionary.enabled",
+          (frequencyDictionary as { enabled?: unknown }).enabled,
+          resolved.subtitleStyle.frequencyDictionary.enabled,
+          "Expected boolean.",
+        );
+      }
+
+      const sourcePath = asString(
+        (frequencyDictionary as { sourcePath?: unknown }).sourcePath,
+      );
+      if (sourcePath !== undefined) {
+        resolved.subtitleStyle.frequencyDictionary.sourcePath = sourcePath;
+      } else if (
+        (frequencyDictionary as { sourcePath?: unknown }).sourcePath !== undefined
+      ) {
+        warn(
+          "subtitleStyle.frequencyDictionary.sourcePath",
+          (frequencyDictionary as { sourcePath?: unknown }).sourcePath,
+          resolved.subtitleStyle.frequencyDictionary.sourcePath,
+          "Expected string.",
+        );
+      }
+
+      const topX = asNumber((frequencyDictionary as { topX?: unknown }).topX);
+      if (
+        topX !== undefined &&
+        Number.isInteger(topX) &&
+        topX > 0
+      ) {
+        resolved.subtitleStyle.frequencyDictionary.topX = Math.floor(topX);
+      } else if ((frequencyDictionary as { topX?: unknown }).topX !== undefined) {
+        warn(
+          "subtitleStyle.frequencyDictionary.topX",
+          (frequencyDictionary as { topX?: unknown }).topX,
+          resolved.subtitleStyle.frequencyDictionary.topX,
+          "Expected a positive integer.",
+        );
+      }
+
+      const frequencyMode = frequencyDictionary.mode;
+      if (
+        frequencyMode === "single" ||
+        frequencyMode === "banded"
+      ) {
+        resolved.subtitleStyle.frequencyDictionary.mode = frequencyMode;
+      } else if (frequencyMode !== undefined) {
+        warn(
+          "subtitleStyle.frequencyDictionary.mode",
+          frequencyDictionary.mode,
+          resolved.subtitleStyle.frequencyDictionary.mode,
+          "Expected 'single' or 'banded'.",
+        );
+      }
+
+      const singleColor = asColor(
+        (frequencyDictionary as { singleColor?: unknown }).singleColor,
+      );
+      if (singleColor !== undefined) {
+        resolved.subtitleStyle.frequencyDictionary.singleColor = singleColor;
+      } else if (
+        (frequencyDictionary as { singleColor?: unknown }).singleColor !== undefined
+      ) {
+        warn(
+          "subtitleStyle.frequencyDictionary.singleColor",
+          (frequencyDictionary as { singleColor?: unknown }).singleColor,
+          resolved.subtitleStyle.frequencyDictionary.singleColor,
+          "Expected hex color.",
+        );
+      }
+
+      const bandedColors = asFrequencyBandedColors(
+        (frequencyDictionary as { bandedColors?: unknown }).bandedColors,
+      );
+      if (bandedColors !== undefined) {
+        resolved.subtitleStyle.frequencyDictionary.bandedColors = bandedColors;
+      } else if (
+        (frequencyDictionary as { bandedColors?: unknown }).bandedColors !== undefined
+      ) {
+        warn(
+          "subtitleStyle.frequencyDictionary.bandedColors",
+          (frequencyDictionary as { bandedColors?: unknown }).bandedColors,
+          resolved.subtitleStyle.frequencyDictionary.bandedColors,
+          "Expected an array of five hex colors.",
+        );
+      }
     }
 
     if (isObject(src.ankiConnect)) {
49
src/core/services/frequency-dictionary-service.test.ts
Normal file
@@ -0,0 +1,49 @@
+import test from "node:test";
+import assert from "node:assert/strict";
+import fs from "node:fs";
+import os from "node:os";
+import path from "node:path";
+
+import { createFrequencyDictionaryLookupService } from "./frequency-dictionary-service";
+
+test("createFrequencyDictionaryLookupService logs parse errors and returns no-op for invalid dictionaries", async () => {
+  const logs: string[] = [];
+  const tempDir = fs.mkdtempSync(path.join(os.tmpdir(), "subminer-frequency-dict-"));
+  const bankPath = path.join(tempDir, "term_meta_bank_1.json");
+  fs.writeFileSync(bankPath, "{ invalid json");
+
+  const lookup = await createFrequencyDictionaryLookupService({
+    searchPaths: [tempDir],
+    log: (message) => {
+      logs.push(message);
+    },
+  });
+
+  const rank = lookup("猫");
+
+  assert.equal(rank, null);
+  assert.equal(
+    logs.some((entry) =>
+      entry.includes("Failed to parse frequency dictionary file as JSON") &&
+      entry.includes("term_meta_bank_1.json")
+    ),
+    true,
+  );
+});
+
+test("createFrequencyDictionaryLookupService continues with no-op lookup when search path is missing", async () => {
+  const logs: string[] = [];
+  const missingPath = path.join(os.tmpdir(), "subminer-frequency-dict-missing-dir");
+  const lookup = await createFrequencyDictionaryLookupService({
+    searchPaths: [missingPath],
+    log: (message) => {
+      logs.push(message);
+    },
+  });
+
+  assert.equal(lookup("猫"), null);
+  assert.equal(
+    logs.some((entry) => entry.includes(`Frequency dictionary not found.`)),
+    true,
+  );
+});
202
src/core/services/frequency-dictionary-service.ts
Normal file
@@ -0,0 +1,202 @@
+import * as fs from "node:fs";
+import * as path from "node:path";
+
+export interface FrequencyDictionaryLookupOptions {
+  searchPaths: string[];
+  log: (message: string) => void;
+}
+
+interface FrequencyDictionaryEntry {
+  rank: number;
+  term: string;
+}
+
+const FREQUENCY_BANK_FILE_GLOB = /^term_meta_bank_.*\.json$/;
+const NOOP_LOOKUP = (): null => null;
+
+function normalizeFrequencyTerm(value: string): string {
+  return value.trim().toLowerCase();
+}
+
+function extractFrequencyDisplayValue(meta: unknown): number | null {
+  if (!meta || typeof meta !== "object") return null;
+  const frequency = (meta as { frequency?: unknown }).frequency;
+  if (!frequency || typeof frequency !== "object") return null;
+  const displayValue = (frequency as { displayValue?: unknown }).displayValue;
+  if (typeof displayValue === "number") {
+    if (!Number.isFinite(displayValue) || displayValue <= 0) return null;
+    return Math.floor(displayValue);
+  }
+  if (typeof displayValue === "string") {
+    const normalized = displayValue.trim().replace(/,/g, "");
+    const parsed = Number.parseInt(normalized, 10);
+    if (!Number.isFinite(parsed) || parsed <= 0) return null;
+    return parsed;
+  }
+
+  return null;
+}
+
+function asFrequencyDictionaryEntry(
+  entry: unknown,
+): FrequencyDictionaryEntry | null {
+  if (!Array.isArray(entry) || entry.length < 3) {
+    return null;
+  }
+
+  const [term, _id, meta] = entry as [
+    unknown,
+    unknown,
+    unknown,
+  ];
+  if (typeof term !== "string") {
+    return null;
+  }
+
+  const frequency = extractFrequencyDisplayValue(meta);
+  if (frequency === null) return null;
+
+  const normalizedTerm = normalizeFrequencyTerm(term);
+  if (!normalizedTerm) return null;
+
+  return {
+    term: normalizedTerm,
+    rank: frequency,
+  };
+}
+
+function addEntriesToMap(
+  rawEntries: unknown,
+  terms: Map<string, number>,
+  log: (message: string) => void,
+): void {
+  if (!Array.isArray(rawEntries)) {
+    return;
+  }
+
+  for (const rawEntry of rawEntries) {
+    const entry = asFrequencyDictionaryEntry(rawEntry);
+    if (!entry) {
+      continue;
+    }
+    const currentRank = terms.get(entry.term);
+    if (currentRank === undefined || entry.rank < currentRank) {
+      terms.set(entry.term, entry.rank);
+      continue;
+    }
+
+    log(
+      `Frequency dictionary duplicate term ${entry.term} with weaker rank ${entry.rank}; keeping ${currentRank}.`,
+    );
+  }
+}
+
+function collectDictionaryFromPath(
+  dictionaryPath: string,
+  log: (message: string) => void,
+): Map<string, number> {
+  const terms = new Map<string, number>();
+
+  let fileNames: string[];
+  try {
+    fileNames = fs.readdirSync(dictionaryPath);
+  } catch (error) {
+    log(
+      `Failed to read frequency dictionary directory ${dictionaryPath}: ${String(error)}`,
+    );
+    return terms;
+  }
+
+  const bankFiles = fileNames
+    .filter((name) => FREQUENCY_BANK_FILE_GLOB.test(name))
+    .sort();
+
+  if (bankFiles.length === 0) {
+    return terms;
+  }
+
+  for (const bankFile of bankFiles) {
+    const bankPath = path.join(dictionaryPath, bankFile);
+    let rawText: string;
+    try {
+      rawText = fs.readFileSync(bankPath, "utf-8");
+    } catch {
+      log(`Failed to read frequency dictionary file ${bankPath}`);
+      continue;
+    }
+
+    let rawEntries: unknown;
+    try {
+      rawEntries = JSON.parse(rawText) as unknown;
+    } catch {
+      log(`Failed to parse frequency dictionary file as JSON: ${bankPath}`);
+      continue;
+    }
+
+    const beforeSize = terms.size;
+    addEntriesToMap(rawEntries, terms, log);
+    if (terms.size === beforeSize) {
+      log(
+        `Frequency dictionary file contained no extractable entries: ${bankPath}`,
+      );
+    }
+  }
+
+  return terms;
+}
+
+export async function createFrequencyDictionaryLookupService(
+  options: FrequencyDictionaryLookupOptions,
+): Promise<(term: string) => number | null> {
+  const attemptedPaths: string[] = [];
+  let foundDictionaryPathCount = 0;
+
+  for (const dictionaryPath of options.searchPaths) {
+    attemptedPaths.push(dictionaryPath);
+    let isDirectory = false;
+
+    try {
+      if (!fs.existsSync(dictionaryPath)) {
+        continue;
+      }
+      isDirectory = fs.statSync(dictionaryPath).isDirectory();
+    } catch (error) {
+      options.log(
+        `Failed to inspect frequency dictionary path ${dictionaryPath}: ${String(error)}`,
+      );
+      continue;
+    }
+
+    if (!isDirectory) {
+      continue;
+    }
+
+    foundDictionaryPathCount += 1;
+    const terms = collectDictionaryFromPath(dictionaryPath, options.log);
+    if (terms.size > 0) {
+      options.log(
+        `Frequency dictionary loaded from ${dictionaryPath} (${terms.size} entries)`,
+      );
+      return (term: string): number | null => {
+        const normalized = normalizeFrequencyTerm(term);
+        if (!normalized) return null;
+        return terms.get(normalized) ?? null;
+      };
+    }
+
+    options.log(
+      `Frequency dictionary directory exists but contains no readable term_meta_bank_*.json files: ${dictionaryPath}`,
+    );
+  }
+
+  options.log(
+    `Frequency dictionary not found. Searched ${attemptedPaths.length} candidate path(s): ${attemptedPaths.join(", ")}`,
+  );
+  if (foundDictionaryPathCount > 0) {
+    options.log(
+      "Frequency dictionary directories found, but no usable term_meta_bank_*.json files were loaded.",
+    );
+  }
+
+  return NOOP_LOOKUP;
+}
@@ -32,6 +32,7 @@ export {
|
|||||||
} from "./startup-service";
|
} from "./startup-service";
|
||||||
export { openYomitanSettingsWindow } from "./yomitan-settings-service";
|
export { openYomitanSettingsWindow } from "./yomitan-settings-service";
|
||||||
export { createTokenizerDepsRuntimeService, tokenizeSubtitleService } from "./tokenizer-service";
|
export { createTokenizerDepsRuntimeService, tokenizeSubtitleService } from "./tokenizer-service";
|
||||||
|
export { createFrequencyDictionaryLookupService } from "./frequency-dictionary-service";
|
||||||
export { createJlptVocabularyLookupService } from "./jlpt-vocab-service";
|
export { createJlptVocabularyLookupService } from "./jlpt-vocab-service";
|
||||||
export {
|
export {
|
||||||
getIgnoredPos1Entries,
|
getIgnoredPos1Entries,
|
||||||
|
|||||||
@@ -190,6 +190,144 @@ test("tokenizeSubtitleService skips JLPT lookups when disabled", async () => {
|
|||||||
assert.equal(lookupCalls, 0);
|
assert.equal(lookupCalls, 0);
|
||||||
});
|
});
|
||||||
|
|
||||||
|
test("tokenizeSubtitleService applies frequency dictionary ranks", async () => {
|
||||||
|
const result = await tokenizeSubtitleService(
|
||||||
|
"猫です",
|
||||||
|
makeDeps({
|
||||||
|
getFrequencyDictionaryEnabled: () => true,
|
||||||
|
tokenizeWithMecab: async () => [
|
||||||
|
{
|
||||||
|
headword: "猫",
|
||||||
|
surface: "猫",
|
||||||
|
reading: "ネコ",
|
||||||
|
startPos: 0,
|
||||||
|
endPos: 1,
|
||||||
|
partOfSpeech: PartOfSpeech.noun,
|
||||||
|
isMerged: false,
|
||||||
|
isKnown: false,
|
||||||
|
isNPlusOneTarget: false,
|
||||||
|
},
|
||||||
|
{
|
||||||
|
headword: "です",
|
||||||
|
surface: "です",
|
||||||
|
reading: "デス",
|
||||||
|
startPos: 1,
|
||||||
|
endPos: 2,
|
||||||
|
partOfSpeech: PartOfSpeech.bound_auxiliary,
|
||||||
|
isMerged: false,
|
||||||
|
isKnown: false,
|
||||||
|
isNPlusOneTarget: false,
|
||||||
|
},
|
||||||
|
],
|
||||||
|
getFrequencyRank: (text) => (text === "猫" ? 23 : 1200),
|
||||||
|
}),
|
||||||
|
);
|
||||||
|
|
||||||
|
assert.equal(result.tokens?.length, 2);
|
||||||
|
assert.equal(result.tokens?.[0]?.frequencyRank, 23);
|
||||||
|
assert.equal(result.tokens?.[1]?.frequencyRank, 1200);
|
||||||
|
});
|
||||||
|
|
||||||
|
test("tokenizeSubtitleService ignores frequency lookup failures", async () => {
|
||||||
|
const result = await tokenizeSubtitleService(
|
||||||
|
"猫",
|
||||||
|
makeDeps({
|
||||||
|
getFrequencyDictionaryEnabled: () => true,
|
||||||
|
tokenizeWithMecab: async () => [
|
||||||
|
{
|
||||||
|
headword: "猫",
|
||||||
|
surface: "猫",
|
||||||
|
reading: "ネコ",
|
||||||
|
startPos: 0,
|
||||||
|
endPos: 1,
|
||||||
|
partOfSpeech: PartOfSpeech.noun,
|
||||||
|
isMerged: false,
|
||||||
|
isKnown: false,
|
||||||
|
isNPlusOneTarget: false,
|
||||||
|
},
|
||||||
|
],
|
||||||
|
getFrequencyRank: () => {
|
||||||
|
throw new Error("frequency lookup unavailable");
|
||||||
|
},
|
||||||
|
}),
|
||||||
|
);
|
||||||
|
|
||||||
|
assert.equal(result.tokens?.[0]?.frequencyRank, undefined);
|
||||||
|
});
|
||||||
|
|
||||||
|
test("tokenizeSubtitleService ignores invalid frequency ranks", async () => {
|
||||||
|
const result = await tokenizeSubtitleService(
|
||||||
|
"猫",
|
||||||
|
makeDeps({
|
||||||
|
getFrequencyDictionaryEnabled: () => true,
|
||||||
|
tokenizeWithMecab: async () => [
|
||||||
|
{
|
||||||
|
headword: "猫",
|
||||||
|
surface: "猫",
|
||||||
|
reading: "ネコ",
|
||||||
|
startPos: 0,
|
||||||
|
endPos: 1,
|
||||||
|
partOfSpeech: PartOfSpeech.noun,
|
||||||
|
isMerged: false,
|
||||||
|
isKnown: false,
|
||||||
|
isNPlusOneTarget: false,
|
||||||
|
},
|
||||||
|
{
|
||||||
|
headword: "です",
|
||||||
|
surface: "です",
|
||||||
|
reading: "デス",
|
||||||
|
startPos: 1,
|
||||||
|
endPos: 2,
|
||||||
|
partOfSpeech: PartOfSpeech.bound_auxiliary,
|
||||||
|
isMerged: false,
|
||||||
|
isKnown: false,
|
||||||
|
isNPlusOneTarget: false,
|
||||||
|
},
|
||||||
|
],
|
||||||
|
getFrequencyRank: (text) => {
|
||||||
|
if (text === "猫") return Number.NaN;
|
||||||
|
if (text === "です") return -1;
|
||||||
|
return 100;
|
||||||
|
},
|
||||||
|
}),
|
||||||
|
);
|
||||||
|
|
||||||
|
assert.equal(result.tokens?.length, 2);
|
||||||
|
assert.equal(result.tokens?.[0]?.frequencyRank, undefined);
|
||||||
|
assert.equal(result.tokens?.[1]?.frequencyRank, undefined);
|
||||||
|
});
|
||||||
|
|
||||||
|
test("tokenizeSubtitleService skips frequency lookups when disabled", async () => {
|
||||||
|
let frequencyCalls = 0;
|
||||||
|
const result = await tokenizeSubtitleService(
|
||||||
|
"猫",
|
||||||
|
makeDeps({
|
||||||
|
getFrequencyDictionaryEnabled: () => false,
|
||||||
|
tokenizeWithMecab: async () => [
|
||||||
|
{
|
||||||
|
headword: "猫",
|
||||||
|
surface: "猫",
|
||||||
|
reading: "ネコ",
|
||||||
|
startPos: 0,
|
||||||
|
endPos: 1,
|
||||||
|
partOfSpeech: PartOfSpeech.noun,
|
||||||
|
isMerged: false,
|
||||||
|
isKnown: false,
|
||||||
|
isNPlusOneTarget: false,
|
||||||
|
},
|
||||||
|
],
|
||||||
|
getFrequencyRank: () => {
|
||||||
|
frequencyCalls += 1;
|
||||||
|
return 10;
|
||||||
|
},
|
||||||
|
}),
|
||||||
|
);
|
||||||
|
|
||||||
|
assert.equal(result.tokens?.length, 1);
|
||||||
|
assert.equal(result.tokens?.[0]?.frequencyRank, undefined);
|
||||||
|
assert.equal(frequencyCalls, 0);
|
||||||
|
});
|
||||||
|
|
||||||
test("tokenizeSubtitleService skips JLPT level for excluded demonstratives", async () => {
|
test("tokenizeSubtitleService skips JLPT level for excluded demonstratives", async () => {
|
||||||
const result = await tokenizeSubtitleService(
|
const result = await tokenizeSubtitleService(
|
||||||
"この",
|
"この",
|
||||||
|
|||||||
@@ -7,6 +7,7 @@ import {
|
|||||||
PartOfSpeech,
|
PartOfSpeech,
|
||||||
SubtitleData,
|
SubtitleData,
|
||||||
Token,
|
Token,
|
||||||
|
FrequencyDictionaryLookup,
|
||||||
} from "../../types";
|
} from "../../types";
|
||||||
import {
|
import {
|
||||||
shouldIgnoreJlptForMecabPos1,
|
shouldIgnoreJlptForMecabPos1,
|
||||||
@@ -35,11 +36,16 @@ const KATAKANA_TO_HIRAGANA_OFFSET = 0x60;
|
|||||||
const KATAKANA_CODEPOINT_START = 0x30a1;
|
const KATAKANA_CODEPOINT_START = 0x30a1;
|
||||||
const KATAKANA_CODEPOINT_END = 0x30f6;
|
const KATAKANA_CODEPOINT_END = 0x30f6;
|
||||||
const JLPT_LEVEL_LOOKUP_CACHE_LIMIT = 2048;
|
const JLPT_LEVEL_LOOKUP_CACHE_LIMIT = 2048;
|
||||||
|
const FREQUENCY_RANK_LOOKUP_CACHE_LIMIT = 2048;
|
||||||
|
|
||||||
const jlptLevelLookupCaches = new WeakMap<
|
const jlptLevelLookupCaches = new WeakMap<
|
||||||
(text: string) => JlptLevel | null,
|
(text: string) => JlptLevel | null,
|
||||||
Map<string, JlptLevel | null>
|
Map<string, JlptLevel | null>
|
||||||
>();
|
>();
|
||||||
|
const frequencyRankLookupCaches = new WeakMap<
|
||||||
|
FrequencyDictionaryLookup,
|
||||||
|
Map<string, number | null>
|
||||||
|
>();
|
||||||
|
|
||||||
function isObject(value: unknown): value is Record<string, unknown> {
|
function isObject(value: unknown): value is Record<string, unknown> {
|
||||||
return Boolean(value && typeof value === "object");
|
return Boolean(value && typeof value === "object");
|
||||||
@@ -61,6 +67,8 @@ export interface TokenizerServiceDeps {
|
|||||||
getKnownWordMatchMode: () => NPlusOneMatchMode;
|
getKnownWordMatchMode: () => NPlusOneMatchMode;
|
||||||
getJlptLevel: (text: string) => JlptLevel | null;
|
getJlptLevel: (text: string) => JlptLevel | null;
|
||||||
getJlptEnabled?: () => boolean;
|
getJlptEnabled?: () => boolean;
|
||||||
|
getFrequencyDictionaryEnabled?: () => boolean;
|
||||||
|
getFrequencyRank?: FrequencyDictionaryLookup;
|
||||||
getMinSentenceWordsForNPlusOne?: () => number;
|
getMinSentenceWordsForNPlusOne?: () => number;
|
||||||
tokenizeWithMecab: (text: string) => Promise<MergedToken[] | null>;
|
tokenizeWithMecab: (text: string) => Promise<MergedToken[] | null>;
|
||||||
}
|
}
|
||||||
@@ -81,6 +89,8 @@ export interface TokenizerDepsRuntimeOptions {
|
|||||||
getKnownWordMatchMode: () => NPlusOneMatchMode;
|
getKnownWordMatchMode: () => NPlusOneMatchMode;
|
||||||
getJlptLevel: (text: string) => JlptLevel | null;
|
getJlptLevel: (text: string) => JlptLevel | null;
|
||||||
getJlptEnabled?: () => boolean;
|
getJlptEnabled?: () => boolean;
|
||||||
|
getFrequencyDictionaryEnabled?: () => boolean;
|
||||||
|
getFrequencyRank?: FrequencyDictionaryLookup;
|
||||||
getMinSentenceWordsForNPlusOne?: () => number;
|
getMinSentenceWordsForNPlusOne?: () => number;
|
||||||
getMecabTokenizer: () => MecabTokenizerLike | null;
|
getMecabTokenizer: () => MecabTokenizerLike | null;
|
||||||
}
|
}
|
||||||
@@ -122,6 +132,52 @@ function getCachedJlptLevel(
|
|||||||
return level;
|
return level;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
function normalizeFrequencyLookupText(rawText: string): string {
|
||||||
|
return rawText.trim().toLowerCase();
|
||||||
|
}
|
||||||
|
|
||||||
|
function getCachedFrequencyRank(
|
||||||
|
lookupText: string,
|
||||||
|
getFrequencyRank: FrequencyDictionaryLookup,
|
||||||
|
): number | null {
|
||||||
|
const normalizedText = normalizeFrequencyLookupText(lookupText);
|
||||||
|
if (!normalizedText) {
|
||||||
|
return null;
|
||||||
|
}
|
||||||
|
|
||||||
|
let cache = frequencyRankLookupCaches.get(getFrequencyRank);
|
||||||
|
if (!cache) {
|
||||||
|
cache = new Map<string, number | null>();
|
||||||
|
frequencyRankLookupCaches.set(getFrequencyRank, cache);
|
||||||
|
}
|
||||||
|
|
||||||
|
if (cache.has(normalizedText)) {
|
||||||
|
return cache.get(normalizedText) ?? null;
|
||||||
|
}
|
||||||
|
|
||||||
|
let rank: number | null;
|
||||||
|
try {
|
||||||
|
rank = getFrequencyRank(normalizedText);
|
||||||
|
} catch {
|
||||||
|
rank = null;
|
||||||
|
}
|
||||||
|
if (rank !== null) {
|
||||||
|
if (!Number.isFinite(rank) || rank <= 0) {
|
||||||
|
rank = null;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
cache.set(normalizedText, rank);
|
||||||
|
while (cache.size > FREQUENCY_RANK_LOOKUP_CACHE_LIMIT) {
|
||||||
|
const firstKey = cache.keys().next().value;
|
||||||
|
if (firstKey !== undefined) {
|
||||||
|
cache.delete(firstKey);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
return rank;
|
||||||
|
}
|
||||||
|
|
||||||
export function createTokenizerDepsRuntimeService(
|
export function createTokenizerDepsRuntimeService(
|
||||||
options: TokenizerDepsRuntimeOptions,
|
options: TokenizerDepsRuntimeOptions,
|
||||||
): TokenizerServiceDeps {
|
): TokenizerServiceDeps {
|
||||||
@@ -137,6 +193,8 @@ export function createTokenizerDepsRuntimeService(
|
|||||||
getKnownWordMatchMode: options.getKnownWordMatchMode,
|
getKnownWordMatchMode: options.getKnownWordMatchMode,
|
||||||
getJlptLevel: options.getJlptLevel,
|
getJlptLevel: options.getJlptLevel,
|
||||||
getJlptEnabled: options.getJlptEnabled,
|
getJlptEnabled: options.getJlptEnabled,
|
||||||
|
getFrequencyDictionaryEnabled: options.getFrequencyDictionaryEnabled,
|
||||||
|
getFrequencyRank: options.getFrequencyRank,
|
||||||
getMinSentenceWordsForNPlusOne:
|
getMinSentenceWordsForNPlusOne:
|
||||||
options.getMinSentenceWordsForNPlusOne ?? (() => 3),
|
options.getMinSentenceWordsForNPlusOne ?? (() => 3),
|
||||||
tokenizeWithMecab: async (text) => {
|
tokenizeWithMecab: async (text) => {
|
||||||
@@ -184,6 +242,34 @@ function applyKnownWordMarking(
|
|||||||
});
|
});
|
||||||
}
|
}
|
||||||
|
|
||||||
|
function resolveFrequencyLookupText(token: MergedToken): string {
|
||||||
|
if (token.headword && token.headword.length > 0) {
|
||||||
|
return token.headword;
|
||||||
|
}
|
||||||
|
if (token.reading && token.reading.length > 0) {
|
||||||
|
return token.reading;
|
||||||
|
}
|
||||||
|
return token.surface;
|
||||||
|
}
|
||||||
|
|
||||||
|
function applyFrequencyMarking(
|
||||||
|
tokens: MergedToken[],
|
||||||
|
getFrequencyRank: FrequencyDictionaryLookup,
|
||||||
|
): MergedToken[] {
|
||||||
|
return tokens.map((token) => {
|
||||||
|
const lookupText = resolveFrequencyLookupText(token);
|
||||||
|
if (!lookupText) {
|
||||||
|
return { ...token, frequencyRank: undefined };
|
||||||
|
}
|
||||||
|
|
||||||
|
const rank = getCachedFrequencyRank(lookupText, getFrequencyRank);
|
||||||
|
return {
|
||||||
|
...token,
|
||||||
|
frequencyRank: rank ?? undefined,
|
||||||
|
};
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
function resolveJlptLookupText(token: MergedToken): string {
|
function resolveJlptLookupText(token: MergedToken): string {
|
||||||
if (token.headword && token.headword.length > 0) {
|
if (token.headword && token.headword.length > 0) {
|
||||||
return token.headword;
|
return token.headword;
|
||||||
@@ -753,6 +839,8 @@ export async function tokenizeSubtitleService(
|
|||||||
.replace(/\s+/g, " ")
|
.replace(/\s+/g, " ")
|
||||||
.trim();
|
.trim();
|
||||||
const jlptEnabled = deps.getJlptEnabled?.() !== false;
|
const jlptEnabled = deps.getJlptEnabled?.() !== false;
|
||||||
|
const frequencyEnabled = deps.getFrequencyDictionaryEnabled?.() !== false;
|
||||||
|
const frequencyLookup = deps.getFrequencyRank;
|
||||||
|
|
||||||
const yomitanTokens = await parseWithYomitanInternalParser(tokenizeText, deps);
|
const yomitanTokens = await parseWithYomitanInternalParser(tokenizeText, deps);
|
||||||
if (yomitanTokens && yomitanTokens.length > 0) {
|
if (yomitanTokens && yomitanTokens.length > 0) {
|
||||||
@@ -761,9 +849,16 @@ export async function tokenizeSubtitleService(
|
|||||||
deps.isKnownWord,
|
deps.isKnownWord,
|
||||||
deps.getKnownWordMatchMode(),
|
deps.getKnownWordMatchMode(),
|
||||||
);
|
);
|
||||||
const jlptMarkedTokens = jlptEnabled
|
const frequencyMarkedTokens =
|
||||||
? applyJlptMarking(knownMarkedTokens, deps.getJlptLevel)
|
frequencyEnabled && frequencyLookup
|
||||||
: knownMarkedTokens.map((token) => ({ ...token, jlptLevel: undefined }));
|
? applyFrequencyMarking(knownMarkedTokens, frequencyLookup)
|
||||||
|
: knownMarkedTokens.map((token) => ({
|
||||||
|
...token,
|
||||||
|
frequencyRank: undefined,
|
||||||
|
}));
|
||||||
|
const jlptMarkedTokens = jlptEnabled
|
||||||
|
? applyJlptMarking(frequencyMarkedTokens, deps.getJlptLevel)
|
||||||
|
: frequencyMarkedTokens.map((token) => ({ ...token, jlptLevel: undefined }));
|
||||||
return {
|
return {
|
||||||
text: displayText,
|
text: displayText,
|
||||||
tokens: markNPlusOneTargets(
|
tokens: markNPlusOneTargets(
|
||||||
@@ -781,9 +876,16 @@ export async function tokenizeSubtitleService(
|
|||||||
deps.isKnownWord,
|
deps.isKnownWord,
|
||||||
deps.getKnownWordMatchMode(),
|
deps.getKnownWordMatchMode(),
|
||||||
);
|
);
|
||||||
|
const frequencyMarkedTokens =
|
||||||
|
frequencyEnabled && frequencyLookup
|
||||||
|
? applyFrequencyMarking(knownMarkedTokens, frequencyLookup)
|
||||||
|
: knownMarkedTokens.map((token) => ({
|
||||||
|
...token,
|
||||||
|
frequencyRank: undefined,
|
||||||
|
}));
|
||||||
const jlptMarkedTokens = jlptEnabled
|
const jlptMarkedTokens = jlptEnabled
|
||||||
? applyJlptMarking(knownMarkedTokens, deps.getJlptLevel)
|
? applyJlptMarking(frequencyMarkedTokens, deps.getJlptLevel)
|
||||||
: knownMarkedTokens.map((token) => ({ ...token, jlptLevel: undefined }));
|
: frequencyMarkedTokens.map((token) => ({ ...token, jlptLevel: undefined }));
|
||||||
return {
|
return {
|
||||||
text: displayText,
|
text: displayText,
|
||||||
tokens: markNPlusOneTargets(
|
tokens: markNPlusOneTargets(
|
||||||
|
|||||||
43
src/main.ts
@@ -162,6 +162,10 @@ import {
|
|||||||
createJlptDictionaryRuntimeService,
|
createJlptDictionaryRuntimeService,
|
||||||
getJlptDictionarySearchPaths,
|
getJlptDictionarySearchPaths,
|
||||||
} from "./main/jlpt-runtime";
|
} from "./main/jlpt-runtime";
|
||||||
|
import {
|
||||||
|
createFrequencyDictionaryRuntimeService,
|
||||||
|
getFrequencyDictionarySearchPaths,
|
||||||
|
} from "./main/frequency-dictionary-runtime";
|
||||||
import { createMediaRuntimeService } from "./main/media-runtime";
|
import { createMediaRuntimeService } from "./main/media-runtime";
|
||||||
import { createOverlayVisibilityRuntimeService } from "./main/overlay-visibility-runtime";
|
import { createOverlayVisibilityRuntimeService } from "./main/overlay-visibility-runtime";
|
||||||
import {
|
import {
|
||||||
@@ -353,6 +357,39 @@ const jlptDictionaryRuntime = createJlptDictionaryRuntimeService({
|
|||||||
},
|
},
|
||||||
});
|
});
|
||||||
|
|
||||||
|
const frequencyDictionaryRuntime = createFrequencyDictionaryRuntimeService({
|
||||||
|
isFrequencyDictionaryEnabled: () =>
|
||||||
|
getResolvedConfig().subtitleStyle.frequencyDictionary.enabled,
|
||||||
|
getSearchPaths: () =>
|
||||||
|
getFrequencyDictionarySearchPaths({
|
||||||
|
getDictionaryRoots: () => [
|
||||||
|
path.join(__dirname, "..", "..", "vendor", "jiten_freq_global"),
|
||||||
|
path.join(__dirname, "..", "..", "vendor", "frequency-dictionary"),
|
||||||
|
path.join(app.getAppPath(), "vendor", "jiten_freq_global"),
|
||||||
|
path.join(app.getAppPath(), "vendor", "frequency-dictionary"),
|
||||||
|
path.join(process.resourcesPath, "jiten_freq_global"),
|
||||||
|
path.join(process.resourcesPath, "frequency-dictionary"),
|
||||||
|
path.join(process.resourcesPath, "app.asar", "vendor", "jiten_freq_global"),
|
||||||
|
path.join(process.resourcesPath, "app.asar", "vendor", "frequency-dictionary"),
|
||||||
|
USER_DATA_PATH,
|
||||||
|
app.getPath("userData"),
|
||||||
|
path.join(os.homedir(), ".config", "SubMiner"),
|
||||||
|
path.join(os.homedir(), ".config", "subminer"),
|
||||||
|
path.join(os.homedir(), "Library", "Application Support", "SubMiner"),
|
||||||
|
path.join(os.homedir(), "Library", "Application Support", "subminer"),
|
||||||
|
process.cwd(),
|
||||||
|
].filter((dictionaryRoot) => dictionaryRoot),
|
||||||
|
getSourcePath: () =>
|
||||||
|
getResolvedConfig().subtitleStyle.frequencyDictionary.sourcePath,
|
||||||
|
}),
|
||||||
|
setFrequencyRankLookup: (lookup) => {
|
||||||
|
appState.frequencyRankLookup = lookup;
|
||||||
|
},
|
||||||
|
log: (message) => {
|
||||||
|
logger.info(`[Frequency] ${message}`);
|
||||||
|
},
|
||||||
|
});
|
||||||
|
|
||||||
function getFieldGroupingResolver(): ((choice: KikuFieldGroupingChoice) => void) | null {
|
function getFieldGroupingResolver(): ((choice: KikuFieldGroupingChoice) => void) | null {
|
||||||
return appState.fieldGroupingResolver;
|
return appState.fieldGroupingResolver;
|
||||||
}
|
}
|
||||||
@@ -844,6 +881,7 @@ function updateMpvSubtitleRenderMetrics(
|
|||||||
|
|
||||||
async function tokenizeSubtitle(text: string): Promise<SubtitleData> {
|
async function tokenizeSubtitle(text: string): Promise<SubtitleData> {
|
||||||
await jlptDictionaryRuntime.ensureJlptDictionaryLookup();
|
await jlptDictionaryRuntime.ensureJlptDictionaryLookup();
|
||||||
|
await frequencyDictionaryRuntime.ensureFrequencyDictionaryLookup();
|
||||||
return tokenizeSubtitleService(
|
return tokenizeSubtitleService(
|
||||||
text,
|
text,
|
||||||
createTokenizerDepsRuntimeService({
|
createTokenizerDepsRuntimeService({
|
||||||
@@ -870,6 +908,9 @@ async function tokenizeSubtitle(text: string): Promise<SubtitleData> {
|
|||||||
getJlptLevel: (text) => appState.jlptLevelLookup(text),
|
getJlptLevel: (text) => appState.jlptLevelLookup(text),
|
||||||
getJlptEnabled: () =>
|
getJlptEnabled: () =>
|
||||||
getResolvedConfig().subtitleStyle.enableJlpt,
|
getResolvedConfig().subtitleStyle.enableJlpt,
|
||||||
|
getFrequencyDictionaryEnabled: () =>
|
||||||
|
getResolvedConfig().subtitleStyle.frequencyDictionary.enabled,
|
||||||
|
getFrequencyRank: (text) => appState.frequencyRankLookup(text),
|
||||||
getMecabTokenizer: () => appState.mecabTokenizer,
|
getMecabTokenizer: () => appState.mecabTokenizer,
|
||||||
}),
|
}),
|
||||||
);
|
);
|
||||||
@@ -1345,6 +1386,8 @@ registerIpcRuntimeServices({
|
|||||||
nPlusOneColor: resolvedConfig.ankiConnect.nPlusOne.nPlusOne,
|
nPlusOneColor: resolvedConfig.ankiConnect.nPlusOne.nPlusOne,
|
||||||
knownWordColor: resolvedConfig.ankiConnect.nPlusOne.knownWord,
|
knownWordColor: resolvedConfig.ankiConnect.nPlusOne.knownWord,
|
||||||
enableJlpt: resolvedConfig.subtitleStyle.enableJlpt,
|
enableJlpt: resolvedConfig.subtitleStyle.enableJlpt,
|
||||||
|
frequencyDictionary:
|
||||||
|
resolvedConfig.subtitleStyle.frequencyDictionary,
|
||||||
};
|
};
|
||||||
},
|
},
|
||||||
saveSubtitlePosition: (position: unknown) =>
|
saveSubtitlePosition: (position: unknown) =>
|
||||||
|
|||||||
86
src/main/frequency-dictionary-runtime.ts
Normal file
@@ -0,0 +1,86 @@
|
|||||||
|
import * as path from "path";
|
||||||
|
import type { FrequencyDictionaryLookup } from "../types";
|
||||||
|
import { createFrequencyDictionaryLookupService } from "../core/services";
|
||||||
|
|
||||||
|
export interface FrequencyDictionarySearchPathDeps {
|
||||||
|
getDictionaryRoots: () => string[];
|
||||||
|
getSourcePath?: () => string | undefined;
|
||||||
|
}
|
||||||
|
|
||||||
|
export interface FrequencyDictionaryRuntimeDeps {
|
||||||
|
isFrequencyDictionaryEnabled: () => boolean;
|
||||||
|
getSearchPaths: () => string[];
|
||||||
|
setFrequencyRankLookup: (lookup: FrequencyDictionaryLookup) => void;
|
||||||
|
log: (message: string) => void;
|
||||||
|
}
|
||||||
|
|
||||||
|
let frequencyDictionaryLookupInitialized = false;
|
||||||
|
let frequencyDictionaryLookupInitialization: Promise<void> | null = null;
|
||||||
|
|
||||||
|
// Frequency dictionary services are initialized lazily as a process-wide singleton.
|
||||||
|
// Initialization is idempotent and intentionally shared across callers.
|
||||||
|
|
||||||
|
export function getFrequencyDictionarySearchPaths(
|
||||||
|
deps: FrequencyDictionarySearchPathDeps,
|
||||||
|
): string[] {
|
||||||
|
const dictionaryRoots = deps.getDictionaryRoots();
|
||||||
|
const sourcePath = deps.getSourcePath?.();
|
||||||
|
|
||||||
|
const rawSearchPaths: string[] = [];
|
||||||
|
// User-provided path takes precedence over bundled/default roots.
|
||||||
|
// Root list should include `vendor/jiten_freq_global` in callers.
|
||||||
|
if (sourcePath && sourcePath.trim()) {
|
||||||
|
rawSearchPaths.push(sourcePath.trim());
|
||||||
|
rawSearchPaths.push(path.join(sourcePath.trim(), "frequency-dictionary"));
|
||||||
|
rawSearchPaths.push(path.join(sourcePath.trim(), "vendor", "frequency-dictionary"));
|
||||||
|
}
|
||||||
|
|
||||||
|
for (const dictionaryRoot of dictionaryRoots) {
|
||||||
|
rawSearchPaths.push(dictionaryRoot);
|
||||||
|
rawSearchPaths.push(path.join(dictionaryRoot, "frequency-dictionary"));
|
||||||
|
rawSearchPaths.push(path.join(dictionaryRoot, "vendor", "frequency-dictionary"));
|
||||||
|
}
|
||||||
|
|
||||||
|
return [...new Set(rawSearchPaths)];
|
||||||
|
}
|
||||||
|
|
||||||
|
export async function initializeFrequencyDictionaryLookup(
|
||||||
|
deps: FrequencyDictionaryRuntimeDeps,
|
||||||
|
): Promise<void> {
|
||||||
|
const lookup = await createFrequencyDictionaryLookupService({
|
||||||
|
searchPaths: deps.getSearchPaths(),
|
||||||
|
log: deps.log,
|
||||||
|
});
|
||||||
|
deps.setFrequencyRankLookup(lookup);
|
||||||
|
}
|
||||||
|
|
||||||
|
export async function ensureFrequencyDictionaryLookup(
|
||||||
|
deps: FrequencyDictionaryRuntimeDeps,
|
||||||
|
): Promise<void> {
|
||||||
|
if (!deps.isFrequencyDictionaryEnabled()) {
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
if (frequencyDictionaryLookupInitialized) {
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
if (!frequencyDictionaryLookupInitialization) {
|
||||||
|
frequencyDictionaryLookupInitialization = initializeFrequencyDictionaryLookup(deps)
|
||||||
|
.then(() => {
|
||||||
|
frequencyDictionaryLookupInitialized = true;
|
||||||
|
})
|
||||||
|
.catch((error) => {
|
||||||
|
frequencyDictionaryLookupInitialized = true;
|
||||||
|
deps.log(`Failed to initialize frequency dictionary: ${String(error)}`);
|
||||||
|
deps.setFrequencyRankLookup(() => null);
|
||||||
|
});
|
||||||
|
}
|
||||||
|
await frequencyDictionaryLookupInitialization;
|
||||||
|
}
|
||||||
|
|
||||||
|
export function createFrequencyDictionaryRuntimeService(
|
||||||
|
deps: FrequencyDictionaryRuntimeDeps,
|
||||||
|
): { ensureFrequencyDictionaryLookup: () => Promise<void> } {
|
||||||
|
return {
|
||||||
|
ensureFrequencyDictionaryLookup: () => ensureFrequencyDictionaryLookup(deps),
|
||||||
|
};
|
||||||
|
}
|
||||||
@@ -7,6 +7,7 @@ import type {
|
|||||||
SubtitlePosition,
|
SubtitlePosition,
|
||||||
KikuFieldGroupingChoice,
|
KikuFieldGroupingChoice,
|
||||||
JlptLevel,
|
JlptLevel,
|
||||||
|
FrequencyDictionaryLookup,
|
||||||
} from "../types";
|
} from "../types";
|
||||||
import type { CliArgs } from "../cli/args";
|
import type { CliArgs } from "../cli/args";
|
||||||
import type { SubtitleTimingTracker } from "../subtitle-timing-tracker";
|
import type { SubtitleTimingTracker } from "../subtitle-timing-tracker";
|
||||||
@@ -55,6 +56,7 @@ export interface AppState {
|
|||||||
autoStartOverlay: boolean;
|
autoStartOverlay: boolean;
|
||||||
texthookerOnlyMode: boolean;
|
texthookerOnlyMode: boolean;
|
||||||
jlptLevelLookup: (term: string) => JlptLevel | null;
|
jlptLevelLookup: (term: string) => JlptLevel | null;
|
||||||
|
frequencyRankLookup: FrequencyDictionaryLookup;
|
||||||
}
|
}
|
||||||
|
|
||||||
export interface AppStateInitialValues {
|
export interface AppStateInitialValues {
|
||||||
@@ -115,6 +117,7 @@ export function createAppState(values: AppStateInitialValues): AppState {
|
|||||||
autoStartOverlay: values.autoStartOverlay ?? false,
|
autoStartOverlay: values.autoStartOverlay ?? false,
|
||||||
texthookerOnlyMode: values.texthookerOnlyMode ?? false,
|
texthookerOnlyMode: values.texthookerOnlyMode ?? false,
|
||||||
jlptLevelLookup: () => null,
|
jlptLevelLookup: () => null,
|
||||||
|
frequencyRankLookup: () => null,
|
||||||
};
|
};
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|||||||
@@ -76,6 +76,15 @@ export type RendererState = {
|
|||||||
jlptN3Color: string;
|
jlptN3Color: string;
|
||||||
jlptN4Color: string;
|
jlptN4Color: string;
|
||||||
jlptN5Color: string;
|
jlptN5Color: string;
|
||||||
|
frequencyDictionaryEnabled: boolean;
|
||||||
|
frequencyDictionaryTopX: number;
|
||||||
|
frequencyDictionaryMode: "single" | "banded";
|
||||||
|
frequencyDictionarySingleColor: string;
|
||||||
|
frequencyDictionaryBand1Color: string;
|
||||||
|
frequencyDictionaryBand2Color: string;
|
||||||
|
frequencyDictionaryBand3Color: string;
|
||||||
|
frequencyDictionaryBand4Color: string;
|
||||||
|
frequencyDictionaryBand5Color: string;
|
||||||
|
|
||||||
keybindingsMap: Map<string, (string | number)[]>;
|
keybindingsMap: Map<string, (string | number)[]>;
|
||||||
chordPending: boolean;
|
chordPending: boolean;
|
||||||
@@ -140,6 +149,15 @@ export function createRendererState(): RendererState {
|
|||||||
jlptN3Color: "#f9e2af",
|
jlptN3Color: "#f9e2af",
|
||||||
jlptN4Color: "#a6e3a1",
|
jlptN4Color: "#a6e3a1",
|
||||||
jlptN5Color: "#8aadf4",
|
jlptN5Color: "#8aadf4",
|
||||||
|
frequencyDictionaryEnabled: false,
|
||||||
|
frequencyDictionaryTopX: 1000,
|
||||||
|
frequencyDictionaryMode: "single",
|
||||||
|
frequencyDictionarySingleColor: "#f5a97f",
|
||||||
|
frequencyDictionaryBand1Color: "#ed8796",
|
||||||
|
frequencyDictionaryBand2Color: "#f5a97f",
|
||||||
|
frequencyDictionaryBand3Color: "#f9e2af",
|
||||||
|
frequencyDictionaryBand4Color: "#a6e3a1",
|
||||||
|
frequencyDictionaryBand5Color: "#8aadf4",
|
||||||
|
|
||||||
keybindingsMap: new Map(),
|
keybindingsMap: new Map(),
|
||||||
chordPending: false,
|
chordPending: false,
|
||||||
|
|||||||
@@ -255,6 +255,12 @@ body {
|
|||||||
--subtitle-jlpt-n3-color: #f9e2af;
|
--subtitle-jlpt-n3-color: #f9e2af;
|
||||||
--subtitle-jlpt-n4-color: #a6e3a1;
|
--subtitle-jlpt-n4-color: #a6e3a1;
|
||||||
--subtitle-jlpt-n5-color: #8aadf4;
|
--subtitle-jlpt-n5-color: #8aadf4;
|
||||||
|
--subtitle-frequency-single-color: #f5a97f;
|
||||||
|
--subtitle-frequency-band-1-color: #ed8796;
|
||||||
|
--subtitle-frequency-band-2-color: #f5a97f;
|
||||||
|
--subtitle-frequency-band-3-color: #f9e2af;
|
||||||
|
--subtitle-frequency-band-4-color: #a6e3a1;
|
||||||
|
--subtitle-frequency-band-5-color: #8aadf4;
|
||||||
text-shadow:
|
text-shadow:
|
||||||
2px 2px 4px rgba(0, 0, 0, 0.8),
|
2px 2px 4px rgba(0, 0, 0, 0.8),
|
||||||
-1px -1px 2px rgba(0, 0, 0, 0.5);
|
-1px -1px 2px rgba(0, 0, 0, 0.5);
|
||||||
@@ -346,6 +352,39 @@ body.settings-modal-open #subtitleContainer {
|
|||||||
text-decoration-style: solid;
|
text-decoration-style: solid;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
#subtitleRoot .word.word-frequency-single,
|
||||||
|
#subtitleRoot .word.word-frequency-band-1,
|
||||||
|
#subtitleRoot .word.word-frequency-band-2,
|
||||||
|
#subtitleRoot .word.word-frequency-band-3,
|
||||||
|
#subtitleRoot .word.word-frequency-band-4,
|
||||||
|
#subtitleRoot .word.word-frequency-band-5 {
|
||||||
|
text-shadow: 0 0 6px rgba(255, 255, 255, 0.3);
|
||||||
|
}
|
||||||
|
|
||||||
|
#subtitleRoot .word.word-frequency-single {
|
||||||
|
color: var(--subtitle-frequency-single-color, #f5a97f);
|
||||||
|
}
|
||||||
|
|
||||||
|
#subtitleRoot .word.word-frequency-band-1 {
|
||||||
|
color: var(--subtitle-frequency-band-1-color, #ed8796);
|
||||||
|
}
|
||||||
|
|
||||||
|
#subtitleRoot .word.word-frequency-band-2 {
|
||||||
|
color: var(--subtitle-frequency-band-2-color, #f5a97f);
|
||||||
|
}
|
||||||
|
|
||||||
|
#subtitleRoot .word.word-frequency-band-3 {
|
||||||
|
color: var(--subtitle-frequency-band-3-color, #f9e2af);
|
||||||
|
}
|
||||||
|
|
||||||
|
#subtitleRoot .word.word-frequency-band-4 {
|
||||||
|
color: var(--subtitle-frequency-band-4-color, #a6e3a1);
|
||||||
|
}
|
||||||
|
|
||||||
|
#subtitleRoot .word.word-frequency-band-5 {
|
||||||
|
color: var(--subtitle-frequency-band-5-color, #8aadf4);
|
||||||
|
}
|
||||||
|
|
||||||
#subtitleRoot .word:hover {
|
#subtitleRoot .word:hover {
|
||||||
background: rgba(255, 255, 255, 0.2);
|
background: rgba(255, 255, 255, 0.2);
|
||||||
border-radius: 3px;
|
border-radius: 3px;
|
||||||
|
|||||||
@@ -22,8 +22,7 @@ function createToken(overrides: Partial<MergedToken>): MergedToken {
|
|||||||
};
|
};
|
||||||
}
|
}
|
||||||
|
|
||||||
function extractClassBlock(cssText: string, level: number): string {
|
function extractClassBlock(cssText: string, selector: string): string {
|
||||||
const selector = `#subtitleRoot .word.word-jlpt-n${level}`;
|
|
||||||
const start = cssText.indexOf(selector);
|
const start = cssText.indexOf(selector);
|
||||||
if (start < 0) return "";
|
if (start < 0) return "";
|
||||||
|
|
||||||
@@ -54,6 +53,152 @@ test("computeWordClass preserves known and n+1 classes while adding JLPT classes
|
|||||||
);
|
);
|
||||||
});
|
});
|
||||||
|
|
||||||
|
test("computeWordClass does not add frequency class to known or N+1 terms", () => {
|
||||||
|
const known = createToken({
|
||||||
|
isKnown: true,
|
||||||
|
frequencyRank: 10,
|
||||||
|
surface: "既知",
|
||||||
|
});
|
||||||
|
const nPlusOne = createToken({
|
||||||
|
isNPlusOneTarget: true,
|
||||||
|
frequencyRank: 10,
|
||||||
|
surface: "目標",
|
||||||
|
});
|
||||||
|
const frequency = createToken({
|
||||||
|
frequencyRank: 10,
|
||||||
|
surface: "頻度",
|
||||||
|
});
|
||||||
|
|
||||||
|
assert.equal(
|
||||||
|
computeWordClass(known, {
|
||||||
|
enabled: true,
|
||||||
|
topX: 100,
|
||||||
|
mode: "single",
|
||||||
|
singleColor: "#000000",
|
||||||
|
bandedColors: ["#000000", "#000000", "#000000", "#000000", "#000000"] as const,
|
||||||
|
}),
|
||||||
|
"word word-known",
|
||||||
|
);
|
||||||
|
assert.equal(
|
||||||
|
computeWordClass(nPlusOne, {
|
||||||
|
enabled: true,
|
||||||
|
topX: 100,
|
||||||
|
mode: "single",
|
||||||
|
singleColor: "#000000",
|
||||||
|
bandedColors: ["#000000", "#000000", "#000000", "#000000", "#000000"] as const,
|
||||||
|
}),
|
||||||
|
"word word-n-plus-one",
|
||||||
|
);
|
||||||
|
assert.equal(
|
||||||
|
computeWordClass(frequency, {
|
||||||
|
enabled: true,
|
||||||
|
topX: 100,
|
||||||
|
mode: "single",
|
||||||
|
singleColor: "#000000",
|
||||||
|
bandedColors: ["#000000", "#000000", "#000000", "#000000", "#000000"] as const,
|
||||||
|
}),
|
||||||
|
"word word-frequency-single",
|
||||||
|
);
|
||||||
|
});
|
||||||
|
|
||||||
|
test("computeWordClass adds frequency class for single mode when rank is within topX", () => {
|
||||||
|
const token = createToken({
|
||||||
|
surface: "猫",
|
||||||
|
frequencyRank: 50,
|
||||||
|
});
|
||||||
|
|
||||||
|
const actual = computeWordClass(
|
||||||
|
token,
|
||||||
|
{
|
||||||
|
enabled: true,
|
||||||
|
topX: 100,
|
||||||
|
mode: "single",
|
||||||
|
singleColor: "#000000",
|
||||||
|
bandedColors: ["#000000", "#000000", "#000000", "#000000", "#000000"] as const,
|
||||||
|
},
|
||||||
|
);
|
||||||
|
|
||||||
|
assert.equal(actual, "word word-frequency-single");
|
||||||
|
});
|
||||||
|
|
||||||
|
test("computeWordClass adds frequency class when rank equals topX", () => {
|
||||||
|
const token = createToken({
|
||||||
|
surface: "水",
|
||||||
|
frequencyRank: 100,
|
||||||
|
});
|
||||||
|
|
||||||
|
const actual = computeWordClass(
|
||||||
|
token,
|
||||||
|
{
|
||||||
|
enabled: true,
|
||||||
|
topX: 100,
|
||||||
|
mode: "single",
|
||||||
|
singleColor: "#000000",
|
||||||
|
bandedColors: ["#000000", "#000000", "#000000", "#000000", "#000000"] as const,
|
||||||
|
},
|
||||||
|
);
|
||||||
|
|
||||||
|
assert.equal(actual, "word word-frequency-single");
|
||||||
|
});
|
||||||
|
|
||||||
|
test("computeWordClass adds frequency class for banded mode", () => {
|
||||||
|
const token = createToken({
|
||||||
|
surface: "犬",
|
||||||
|
frequencyRank: 250,
|
||||||
|
});
|
||||||
|
|
||||||
|
const actual = computeWordClass(
|
||||||
|
token,
|
||||||
|
{
|
||||||
|
enabled: true,
|
||||||
|
topX: 1000,
|
||||||
|
mode: "banded",
|
||||||
|
singleColor: "#000000",
|
||||||
|
bandedColors:
|
||||||
|
["#111111", "#222222", "#333333", "#444444", "#555555"] as const,
|
||||||
|
},
|
||||||
|
);
|
||||||
|
|
||||||
|
assert.equal(actual, "word word-frequency-band-2");
|
||||||
|
});
|
||||||
|
|
||||||
|
test("computeWordClass uses configured band count for banded mode", () => {
|
||||||
|
const token = createToken({
|
||||||
|
surface: "犬",
|
||||||
|
frequencyRank: 2,
|
||||||
|
});
|
||||||
|
|
||||||
|
const actual = computeWordClass(token, {
|
||||||
|
enabled: true,
|
||||||
|
topX: 4,
|
||||||
|
mode: "banded",
|
||||||
|
singleColor: "#000000",
|
||||||
|
bandedColors: ["#111111", "#222222", "#333333"] as any,
|
||||||
|
} as any);
|
||||||
|
|
||||||
|
assert.equal(actual, "word word-frequency-band-1");
|
||||||
|
});
|
||||||
|
|
||||||
|
test("computeWordClass skips frequency class when rank is out of topX", () => {
|
||||||
|
const token = createToken({
|
||||||
|
surface: "犬",
|
||||||
|
frequencyRank: 1200,
|
||||||
|
});
|
||||||
|
|
||||||
|
const actual = computeWordClass(
|
||||||
|
token,
|
||||||
|
{
|
||||||
|
enabled: true,
|
||||||
|
topX: 1000,
|
||||||
|
mode: "single",
|
||||||
|
singleColor: "#000000",
|
||||||
|
bandedColors: ["#000000", "#000000", "#000000", "#000000", "#000000"] as const,
|
||||||
|
},
|
||||||
|
);
|
||||||
|
|
||||||
|
assert.equal(actual, "word");
|
||||||
|
});
|
||||||
|
|
||||||
test("JLPT CSS rules use underline-only styling in renderer stylesheet", () => {
|
test("JLPT CSS rules use underline-only styling in renderer stylesheet", () => {
|
||||||
const distCssPath = path.join(process.cwd(), "dist", "renderer", "style.css");
|
const distCssPath = path.join(process.cwd(), "dist", "renderer", "style.css");
|
||||||
const srcCssPath = path.join(process.cwd(), "src", "renderer", "style.css");
|
const srcCssPath = path.join(process.cwd(), "src", "renderer", "style.css");
|
||||||
@@ -70,11 +215,25 @@ test("JLPT CSS rules use underline-only styling in renderer stylesheet", () => {
|
|||||||
const cssText = fs.readFileSync(cssPath, "utf-8");
|
const cssText = fs.readFileSync(cssPath, "utf-8");
|
||||||
|
|
||||||
for (let level = 1; level <= 5; level += 1) {
|
for (let level = 1; level <= 5; level += 1) {
|
||||||
const block = extractClassBlock(cssText, level);
|
const block = extractClassBlock(
|
||||||
|
cssText,
|
||||||
|
`#subtitleRoot .word.word-jlpt-n${level}`,
|
||||||
|
);
|
||||||
assert.ok(block.length > 0, `word-jlpt-n${level} class should exist`);
|
assert.ok(block.length > 0, `word-jlpt-n${level} class should exist`);
|
||||||
assert.match(block, /text-decoration-line:\s*underline;/);
|
assert.match(block, /text-decoration-line:\s*underline;/);
|
||||||
assert.match(block, /text-decoration-thickness:\s*2px;/);
|
assert.match(block, /text-decoration-thickness:\s*2px;/);
|
||||||
assert.match(block, /text-underline-offset:\s*4px;/);
|
assert.match(block, /text-underline-offset:\s*4px;/);
|
||||||
assert.match(block, /color:\s*inherit;/);
|
assert.match(block, /color:\s*inherit;/);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
for (let band = 1; band <= 5; band += 1) {
|
||||||
|
const block = extractClassBlock(
|
||||||
|
cssText,
|
||||||
|
band === 1
|
||||||
|
? "#subtitleRoot .word.word-frequency-single"
|
||||||
|
: `#subtitleRoot .word.word-frequency-band-${band}`,
|
||||||
|
);
|
||||||
|
assert.ok(block.length > 0, `frequency class word-frequency-${band === 1 ? "single" : `band-${band}`} should exist`);
|
||||||
|
assert.match(block, /color:\s*var\(/);
|
||||||
|
}
|
||||||
});
|
});
|
||||||
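The banded-mode expectation in the tests above (rank 250 with `topX: 1000` lands in `word-frequency-band-2`) is consistent with splitting the top-X range into equal-width bands and rounding up. The following is a minimal sketch of that arithmetic, not the project's code: `bandForRank` is a hypothetical helper name, and the default of five bands is an assumption taken from the five-color tuples used in the tests.

```typescript
// Hypothetical helper: maps a frequency rank to a 1-based band index,
// or null when the rank falls outside the top-X window.
function bandForRank(rank: number, topX: number, bandCount = 5): number | null {
  if (!Number.isFinite(rank) || !Number.isFinite(topX) || topX < 1) return null;

  // Ranks are treated as positive integers.
  const normalizedRank = Math.max(1, Math.floor(rank));
  if (normalizedRank > topX) return null;

  // Equal-width bands: the first topX / bandCount ranks fall in band 1, and so on.
  const band = Math.ceil((normalizedRank / topX) * bandCount);

  // Clamp so rounding can never leave the valid band range.
  return Math.min(bandCount, Math.max(1, band));
}

console.log(bandForRank(250, 1000)); // 2
console.log(bandForRank(1000, 1000)); // 5
console.log(bandForRank(1200, 1000)); // null
```

Under these assumptions a rank equal to `topX` still receives a band (the last one), which matches the "rank equals topX" test for single mode.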
|
|||||||
@@ -6,6 +6,14 @@ import type {
|
|||||||
} from "../types";
|
} from "../types";
|
||||||
import type { RendererContext } from "./context";
|
import type { RendererContext } from "./context";
|
||||||
|
|
||||||
|
type FrequencyRenderSettings = {
|
||||||
|
enabled: boolean;
|
||||||
|
topX: number;
|
||||||
|
mode: "single" | "banded";
|
||||||
|
singleColor: string;
|
||||||
|
bandedColors: [string, string, string, string, string];
|
||||||
|
};
|
||||||
|
|
||||||
function normalizeSubtitle(text: string, trim = true): string {
|
function normalizeSubtitle(text: string, trim = true): string {
|
||||||
if (!text) return "";
|
if (!text) return "";
|
||||||
|
|
||||||
@@ -24,7 +32,88 @@ function sanitizeHexColor(value: unknown, fallback: string): string {
|
|||||||
: fallback;
|
: fallback;
|
||||||
}
|
}
|
||||||
|
|
||||||
function renderWithTokens(root: HTMLElement, tokens: MergedToken[]): void {
|
const DEFAULT_FREQUENCY_RENDER_SETTINGS: FrequencyRenderSettings = {
|
||||||
|
enabled: false,
|
||||||
|
topX: 1000,
|
||||||
|
mode: "single",
|
||||||
|
singleColor: "#f5a97f",
|
||||||
|
bandedColors: ["#ed8796", "#f5a97f", "#f9e2af", "#a6e3a1", "#8aadf4"],
|
||||||
|
};
|
||||||
|
|
||||||
|
function sanitizeFrequencyTopX(value: unknown, fallback: number): number {
|
||||||
|
if (typeof value !== "number" || !Number.isFinite(value) || value <= 0) {
|
||||||
|
return fallback;
|
||||||
|
}
|
||||||
|
return Math.max(1, Math.floor(value));
|
||||||
|
}
|
||||||
|
|
||||||
|
function sanitizeFrequencyBandedColors(
|
||||||
|
value: unknown,
|
||||||
|
fallback: FrequencyRenderSettings["bandedColors"],
|
||||||
|
): FrequencyRenderSettings["bandedColors"] {
|
||||||
|
if (!Array.isArray(value) || value.length !== 5) {
|
||||||
|
return fallback;
|
||||||
|
}
|
||||||
|
|
||||||
|
return [
|
||||||
|
sanitizeHexColor(value[0], fallback[0]),
|
||||||
|
sanitizeHexColor(value[1], fallback[1]),
|
||||||
|
sanitizeHexColor(value[2], fallback[2]),
|
||||||
|
sanitizeHexColor(value[3], fallback[3]),
|
||||||
|
sanitizeHexColor(value[4], fallback[4]),
|
||||||
|
];
|
||||||
|
}
|
||||||
|
|
||||||
|
function getFrequencyDictionaryClass(
|
||||||
|
token: MergedToken,
|
||||||
|
settings: FrequencyRenderSettings,
|
||||||
|
): string {
|
||||||
|
if (!settings.enabled) {
|
||||||
|
return "";
|
||||||
|
}
|
||||||
|
|
||||||
|
if (typeof token.frequencyRank !== "number" || !Number.isFinite(token.frequencyRank)) {
|
||||||
|
return "";
|
||||||
|
}
|
||||||
|
|
||||||
|
const rank = Math.max(1, Math.floor(token.frequencyRank));
|
||||||
|
const topX = sanitizeFrequencyTopX(settings.topX, DEFAULT_FREQUENCY_RENDER_SETTINGS.topX);
|
||||||
|
if (rank > topX) {
|
||||||
|
return "";
|
||||||
|
}
|
||||||
|
|
||||||
|
if (settings.mode === "banded") {
|
||||||
|
const bandCount = settings.bandedColors.length;
|
||||||
|
const normalizedBand = Math.ceil((rank / topX) * bandCount);
|
||||||
|
const band = Math.min(bandCount, Math.max(1, normalizedBand));
|
||||||
|
return `word-frequency-band-${band}`;
|
||||||
|
}
|
||||||
|
|
||||||
|
return "word-frequency-single";
|
||||||
|
}
|
||||||
|
|
||||||
|
function renderWithTokens(
|
||||||
|
root: HTMLElement,
|
||||||
|
tokens: MergedToken[],
|
||||||
|
frequencyRenderSettings?: Partial<FrequencyRenderSettings>,
|
||||||
|
): void {
|
||||||
|
const resolvedFrequencyRenderSettings = {
|
||||||
|
...DEFAULT_FREQUENCY_RENDER_SETTINGS,
|
||||||
|
...frequencyRenderSettings,
|
||||||
|
bandedColors: sanitizeFrequencyBandedColors(
|
||||||
|
frequencyRenderSettings?.bandedColors,
|
||||||
|
DEFAULT_FREQUENCY_RENDER_SETTINGS.bandedColors,
|
||||||
|
),
|
||||||
|
topX: sanitizeFrequencyTopX(
|
||||||
|
frequencyRenderSettings?.topX,
|
||||||
|
DEFAULT_FREQUENCY_RENDER_SETTINGS.topX,
|
||||||
|
),
|
||||||
|
singleColor: sanitizeHexColor(
|
||||||
|
frequencyRenderSettings?.singleColor,
|
||||||
|
DEFAULT_FREQUENCY_RENDER_SETTINGS.singleColor,
|
||||||
|
),
|
||||||
|
};
|
||||||
|
|
||||||
const fragment = document.createDocumentFragment();
|
const fragment = document.createDocumentFragment();
|
||||||
|
|
||||||
for (const token of tokens) {
|
for (const token of tokens) {
|
||||||
@@ -35,7 +124,10 @@ function renderWithTokens(root: HTMLElement, tokens: MergedToken[]): void {
|
|||||||
for (let i = 0; i < parts.length; i += 1) {
|
for (let i = 0; i < parts.length; i += 1) {
|
||||||
if (parts[i]) {
|
if (parts[i]) {
|
||||||
const span = document.createElement("span");
|
const span = document.createElement("span");
|
||||||
span.className = computeWordClass(token);
|
span.className = computeWordClass(
|
||||||
|
token,
|
||||||
|
resolvedFrequencyRenderSettings,
|
||||||
|
);
|
||||||
span.textContent = parts[i];
|
span.textContent = parts[i];
|
||||||
if (token.reading) span.dataset.reading = token.reading;
|
if (token.reading) span.dataset.reading = token.reading;
|
||||||
if (token.headword) span.dataset.headword = token.headword;
|
if (token.headword) span.dataset.headword = token.headword;
|
||||||
@@ -49,7 +141,7 @@ function renderWithTokens(root: HTMLElement, tokens: MergedToken[]): void {
|
|||||||
}
|
}
|
||||||
|
|
||||||
const span = document.createElement("span");
|
const span = document.createElement("span");
|
||||||
span.className = computeWordClass(token);
|
span.className = computeWordClass(token, resolvedFrequencyRenderSettings);
|
||||||
span.textContent = surface;
|
span.textContent = surface;
|
||||||
if (token.reading) span.dataset.reading = token.reading;
|
if (token.reading) span.dataset.reading = token.reading;
|
||||||
if (token.headword) span.dataset.headword = token.headword;
|
if (token.headword) span.dataset.headword = token.headword;
|
||||||
@@ -59,7 +151,27 @@ function renderWithTokens(root: HTMLElement, tokens: MergedToken[]): void {
|
|||||||
root.appendChild(fragment);
|
root.appendChild(fragment);
|
||||||
}
|
}
|
||||||
|
|
||||||
export function computeWordClass(token: MergedToken): string {
|
export function computeWordClass(
|
||||||
|
token: MergedToken,
|
||||||
|
frequencySettings?: Partial<FrequencyRenderSettings>,
|
||||||
|
): string {
|
||||||
|
const resolvedFrequencySettings = {
|
||||||
|
...DEFAULT_FREQUENCY_RENDER_SETTINGS,
|
||||||
|
...frequencySettings,
|
||||||
|
bandedColors: sanitizeFrequencyBandedColors(
|
||||||
|
frequencySettings?.bandedColors,
|
||||||
|
DEFAULT_FREQUENCY_RENDER_SETTINGS.bandedColors,
|
||||||
|
),
|
||||||
|
topX: sanitizeFrequencyTopX(
|
||||||
|
frequencySettings?.topX,
|
||||||
|
DEFAULT_FREQUENCY_RENDER_SETTINGS.topX,
|
||||||
|
),
|
||||||
|
singleColor: sanitizeHexColor(
|
||||||
|
frequencySettings?.singleColor,
|
||||||
|
DEFAULT_FREQUENCY_RENDER_SETTINGS.singleColor,
|
||||||
|
),
|
||||||
|
};
|
||||||
|
|
||||||
const classes = ["word"];
|
const classes = ["word"];
|
||||||
|
|
||||||
if (token.isNPlusOneTarget) {
|
if (token.isNPlusOneTarget) {
|
||||||
@@ -72,6 +184,16 @@ export function computeWordClass(token: MergedToken): string {
|
|||||||
classes.push(`word-jlpt-${token.jlptLevel.toLowerCase()}`);
|
classes.push(`word-jlpt-${token.jlptLevel.toLowerCase()}`);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
if (!token.isKnown && !token.isNPlusOneTarget) {
|
||||||
|
const frequencyClass = getFrequencyDictionaryClass(
|
||||||
|
token,
|
||||||
|
resolvedFrequencySettings,
|
||||||
|
);
|
||||||
|
if (frequencyClass) {
|
||||||
|
classes.push(frequencyClass);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
return classes.join(" ");
|
return classes.join(" ");
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -139,12 +261,32 @@ export function createSubtitleRenderer(ctx: RendererContext) {
|
|||||||
|
|
||||||
const normalized = normalizeSubtitle(text);
|
const normalized = normalizeSubtitle(text);
|
||||||
if (tokens && tokens.length > 0) {
|
if (tokens && tokens.length > 0) {
|
||||||
renderWithTokens(ctx.dom.subtitleRoot, tokens);
|
renderWithTokens(
|
||||||
|
ctx.dom.subtitleRoot,
|
||||||
|
tokens,
|
||||||
|
getFrequencyRenderSettings(),
|
||||||
|
);
|
||||||
return;
|
return;
|
||||||
}
|
}
|
||||||
renderCharacterLevel(ctx.dom.subtitleRoot, normalized);
|
renderCharacterLevel(ctx.dom.subtitleRoot, normalized);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
function getFrequencyRenderSettings(): Partial<FrequencyRenderSettings> {
|
||||||
|
return {
|
||||||
|
enabled: ctx.state.frequencyDictionaryEnabled,
|
||||||
|
topX: ctx.state.frequencyDictionaryTopX,
|
||||||
|
mode: ctx.state.frequencyDictionaryMode,
|
||||||
|
singleColor: ctx.state.frequencyDictionarySingleColor,
|
||||||
|
bandedColors: [
|
||||||
|
ctx.state.frequencyDictionaryBand1Color,
|
||||||
|
ctx.state.frequencyDictionaryBand2Color,
|
||||||
|
ctx.state.frequencyDictionaryBand3Color,
|
||||||
|
ctx.state.frequencyDictionaryBand4Color,
|
||||||
|
ctx.state.frequencyDictionaryBand5Color,
|
||||||
|
] as [string, string, string, string, string],
|
||||||
|
};
|
||||||
|
}
|
||||||
|
|
||||||
function renderSecondarySub(text: string): void {
|
function renderSecondarySub(text: string): void {
|
||||||
ctx.dom.secondarySubRoot.innerHTML = "";
|
ctx.dom.secondarySubRoot.innerHTML = "";
|
||||||
if (!text) return;
|
if (!text) return;
|
||||||
@@ -236,6 +378,66 @@ export function createSubtitleRenderer(ctx: RendererContext) {
|
|||||||
ctx.dom.subtitleRoot.style.setProperty("--subtitle-jlpt-n3-color", jlptColors.N3);
|
ctx.dom.subtitleRoot.style.setProperty("--subtitle-jlpt-n3-color", jlptColors.N3);
|
||||||
ctx.dom.subtitleRoot.style.setProperty("--subtitle-jlpt-n4-color", jlptColors.N4);
|
ctx.dom.subtitleRoot.style.setProperty("--subtitle-jlpt-n4-color", jlptColors.N4);
|
||||||
ctx.dom.subtitleRoot.style.setProperty("--subtitle-jlpt-n5-color", jlptColors.N5);
|
ctx.dom.subtitleRoot.style.setProperty("--subtitle-jlpt-n5-color", jlptColors.N5);
|
||||||
|
const frequencyDictionarySettings = style.frequencyDictionary ?? {};
|
||||||
|
const frequencyEnabled =
|
||||||
|
frequencyDictionarySettings.enabled ?? ctx.state.frequencyDictionaryEnabled;
|
||||||
|
const frequencyTopX = sanitizeFrequencyTopX(
|
||||||
|
frequencyDictionarySettings.topX,
|
||||||
|
ctx.state.frequencyDictionaryTopX,
|
||||||
|
);
|
||||||
|
const frequencyMode = frequencyDictionarySettings.mode
|
||||||
|
? frequencyDictionarySettings.mode
|
||||||
|
: ctx.state.frequencyDictionaryMode;
|
||||||
|
const frequencySingleColor = sanitizeHexColor(
|
||||||
|
frequencyDictionarySettings.singleColor,
|
||||||
|
ctx.state.frequencyDictionarySingleColor,
|
||||||
|
);
|
||||||
|
const frequencyBandedColors = sanitizeFrequencyBandedColors(
|
||||||
|
frequencyDictionarySettings.bandedColors,
|
||||||
|
[
|
||||||
|
ctx.state.frequencyDictionaryBand1Color,
|
||||||
|
ctx.state.frequencyDictionaryBand2Color,
|
||||||
|
ctx.state.frequencyDictionaryBand3Color,
|
||||||
|
ctx.state.frequencyDictionaryBand4Color,
|
||||||
|
ctx.state.frequencyDictionaryBand5Color,
|
||||||
|
] as [string, string, string, string, string],
|
||||||
|
);
|
||||||
|
|
||||||
|
ctx.state.frequencyDictionaryEnabled = frequencyEnabled;
|
||||||
|
ctx.state.frequencyDictionaryTopX = frequencyTopX;
|
||||||
|
ctx.state.frequencyDictionaryMode = frequencyMode;
|
||||||
|
ctx.state.frequencyDictionarySingleColor = frequencySingleColor;
|
||||||
|
[
|
||||||
|
ctx.state.frequencyDictionaryBand1Color,
|
||||||
|
ctx.state.frequencyDictionaryBand2Color,
|
||||||
|
ctx.state.frequencyDictionaryBand3Color,
|
||||||
|
ctx.state.frequencyDictionaryBand4Color,
|
||||||
|
ctx.state.frequencyDictionaryBand5Color,
|
||||||
|
] = frequencyBandedColors;
|
||||||
|
ctx.dom.subtitleRoot.style.setProperty(
|
||||||
|
"--subtitle-frequency-single-color",
|
||||||
|
frequencySingleColor,
|
||||||
|
);
|
||||||
|
ctx.dom.subtitleRoot.style.setProperty(
|
||||||
|
"--subtitle-frequency-band-1-color",
|
||||||
|
frequencyBandedColors[0],
|
||||||
|
);
|
||||||
|
ctx.dom.subtitleRoot.style.setProperty(
|
||||||
|
"--subtitle-frequency-band-2-color",
|
||||||
|
frequencyBandedColors[1],
|
||||||
|
);
|
||||||
|
ctx.dom.subtitleRoot.style.setProperty(
|
||||||
|
"--subtitle-frequency-band-3-color",
|
||||||
|
frequencyBandedColors[2],
|
||||||
|
);
|
||||||
|
ctx.dom.subtitleRoot.style.setProperty(
|
||||||
|
"--subtitle-frequency-band-4-color",
|
||||||
|
frequencyBandedColors[3],
|
||||||
|
);
|
||||||
|
ctx.dom.subtitleRoot.style.setProperty(
|
||||||
|
"--subtitle-frequency-band-5-color",
|
||||||
|
frequencyBandedColors[4],
|
||||||
|
);
|
||||||
|
|
||||||
const secondaryStyle = style.secondary;
|
const secondaryStyle = style.secondary;
|
||||||
if (!secondaryStyle) return;
|
if (!secondaryStyle) return;
|
||||||
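The renderer changes above resolve a partial settings object against defaults and sanitize `topX` before it is used. A reduced sketch of that pattern follows. The `sanitizeTopX` body mirrors `sanitizeFrequencyTopX` from the diff; `FrequencySettingsSketch`, `DEFAULTS`, and `resolveSettings` are hypothetical names, and the color fields are left out to keep the example short.

```typescript
// Reduced, hypothetical settings shape (color fields omitted).
type FrequencySettingsSketch = {
  enabled: boolean;
  topX: number;
  mode: "single" | "banded";
};

const DEFAULTS: FrequencySettingsSketch = {
  enabled: false,
  topX: 1000,
  mode: "single",
};

// Rejects non-numbers, non-finite values, and non-positive values;
// otherwise floors to an integer of at least 1.
function sanitizeTopX(value: unknown, fallback: number): number {
  if (typeof value !== "number" || !Number.isFinite(value) || value <= 0) {
    return fallback;
  }
  return Math.max(1, Math.floor(value));
}

// Defaults first, caller overrides second, sanitized fields last so they win.
function resolveSettings(
  partial?: Partial<FrequencySettingsSketch>,
): FrequencySettingsSketch {
  return {
    ...DEFAULTS,
    ...partial,
    topX: sanitizeTopX(partial?.topX, DEFAULTS.topX),
  };
}

console.log(resolveSettings({ topX: 12.9 }).topX); // 12
console.log(resolveSettings({ topX: -5 }).topX); // 1000
```

One caveat with spread-based merging: a key that is present but explicitly `undefined` in the partial object still overwrites the default, so fields that are not re-sanitized after the spread can end up `undefined`.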
|
|||||||
25
src/types.ts
@@ -55,8 +55,11 @@ export interface MergedToken {
|
|||||||
isKnown: boolean;
|
isKnown: boolean;
|
||||||
isNPlusOneTarget: boolean;
|
isNPlusOneTarget: boolean;
|
||||||
jlptLevel?: JlptLevel;
|
jlptLevel?: JlptLevel;
|
||||||
|
frequencyRank?: number;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
export type FrequencyDictionaryLookup = (term: string) => number | null;
|
||||||
|
|
||||||
export type JlptLevel = "N1" | "N2" | "N3" | "N4" | "N5";
|
export type JlptLevel = "N1" | "N2" | "N3" | "N4" | "N5";
|
||||||
|
|
||||||
export interface WindowGeometry {
|
export interface WindowGeometry {
|
||||||
@@ -283,6 +286,14 @@ export interface SubtitleStyleConfig {
|
|||||||
N4: string;
|
N4: string;
|
||||||
N5: string;
|
N5: string;
|
||||||
};
|
};
|
||||||
|
frequencyDictionary?: {
|
||||||
|
enabled?: boolean;
|
||||||
|
sourcePath?: string;
|
||||||
|
topX?: number;
|
||||||
|
mode?: FrequencyDictionaryMode;
|
||||||
|
singleColor?: string;
|
||||||
|
bandedColors?: [string, string, string, string, string];
|
||||||
|
};
|
||||||
secondary?: {
|
secondary?: {
|
||||||
fontFamily?: string;
|
fontFamily?: string;
|
||||||
fontSize?: number;
|
fontSize?: number;
|
||||||
@@ -293,6 +304,8 @@ export interface SubtitleStyleConfig {
|
|||||||
};
|
};
|
||||||
}
|
}
|
||||||
|
|
||||||
|
export type FrequencyDictionaryMode = "single" | "banded";
|
||||||
|
|
||||||
export interface ShortcutsConfig {
|
export interface ShortcutsConfig {
|
||||||
toggleVisibleOverlayGlobal?: string | null;
|
toggleVisibleOverlayGlobal?: string | null;
|
||||||
toggleInvisibleOverlayGlobal?: string | null;
|
toggleInvisibleOverlayGlobal?: string | null;
|
||||||
@@ -431,8 +444,18 @@ export interface ResolvedConfig {
|
|||||||
shortcuts: Required<ShortcutsConfig>;
|
shortcuts: Required<ShortcutsConfig>;
|
||||||
secondarySub: Required<SecondarySubConfig>;
|
secondarySub: Required<SecondarySubConfig>;
|
||||||
subsync: Required<SubsyncConfig>;
|
subsync: Required<SubsyncConfig>;
|
||||||
subtitleStyle: Required<Omit<SubtitleStyleConfig, "secondary">> & {
|
subtitleStyle: Required<
|
||||||
|
Omit<SubtitleStyleConfig, "secondary" | "frequencyDictionary">
|
||||||
|
> & {
|
||||||
secondary: Required<NonNullable<SubtitleStyleConfig["secondary"]>>;
|
secondary: Required<NonNullable<SubtitleStyleConfig["secondary"]>>;
|
||||||
|
frequencyDictionary: {
|
||||||
|
enabled: boolean;
|
||||||
|
sourcePath: string;
|
||||||
|
topX: number;
|
||||||
|
mode: FrequencyDictionaryMode;
|
||||||
|
singleColor: string;
|
||||||
|
bandedColors: [string, string, string, string, string];
|
||||||
|
};
|
||||||
};
|
};
|
||||||
auto_start_overlay: boolean;
|
auto_start_overlay: boolean;
|
||||||
bind_visible_overlay_to_mpv_sub_visibility: boolean;
|
bind_visible_overlay_to_mpv_sub_visibility: boolean;
|
||||||
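The `FrequencyDictionaryLookup` type added above is a plain function from a term to a rank or `null`. As an illustration only, the sketch below backs that signature with an in-memory `Map`; `createMapLookup` and the sample ranks are hypothetical, and how the project actually loads dictionary data is not shown in this diff.

```typescript
// Same signature as the type added in src/types.ts.
type FrequencyDictionaryLookup = (term: string) => number | null;

// Hypothetical factory: wraps a term-to-rank map and returns null for
// unknown terms or unusable ranks.
function createMapLookup(
  ranks: ReadonlyMap<string, number>,
): FrequencyDictionaryLookup {
  return (term) => {
    const rank = ranks.get(term);
    return typeof rank === "number" && Number.isFinite(rank) && rank > 0
      ? rank
      : null;
  };
}

// Sample data invented for the example.
const lookup = createMapLookup(new Map([["猫", 50], ["犬", 250]]));

// MergedToken.frequencyRank is optional, so a null result maps to undefined.
const frequencyRank: number | undefined = lookup("鳥") ?? undefined;

console.log(lookup("猫")); // 50
console.log(frequencyRank); // undefined
```

Returning `null` rather than throwing keeps the lookup cheap to call per token, and the caller decides how a miss is represented on the token.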
|
|||||||
42
subminer
@@ -705,6 +705,39 @@ function isExecutable(filePath: string): boolean {
|
|||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
function resolveMacAppBinaryCandidate(candidate: string): string {
|
||||||
|
if (process.platform !== "darwin") return "";
|
||||||
|
|
||||||
|
const direct = resolveBinaryPathCandidate(candidate);
|
||||||
|
if (!direct) return "";
|
||||||
|
|
||||||
|
if (isExecutable(direct)) {
|
||||||
|
return direct;
|
||||||
|
}
|
||||||
|
|
||||||
|
const appIndex = direct.indexOf(".app/");
|
||||||
|
const appPath =
|
||||||
|
direct.endsWith(".app") && direct.includes(".app")
|
||||||
|
? direct
|
||||||
|
: appIndex >= 0
|
||||||
|
? direct.slice(0, appIndex + ".app".length)
|
||||||
|
: "";
|
||||||
|
if (!appPath) return "";
|
||||||
|
|
||||||
|
const candidates = [
|
||||||
|
path.join(appPath, "Contents", "MacOS", "SubMiner"),
|
||||||
|
path.join(appPath, "Contents", "MacOS", "subminer"),
|
||||||
|
];
|
||||||
|
|
||||||
|
for (const candidateBinary of candidates) {
|
||||||
|
if (isExecutable(candidateBinary)) {
|
||||||
|
return candidateBinary;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
return "";
|
||||||
|
}
|
||||||
|
|
||||||
function commandExists(command: string): boolean {
|
function commandExists(command: string): boolean {
|
||||||
const pathEnv = process.env.PATH ?? "";
|
const pathEnv = process.env.PATH ?? "";
|
||||||
for (const dir of pathEnv.split(path.delimiter)) {
|
for (const dir of pathEnv.split(path.delimiter)) {
|
||||||
@@ -1666,8 +1699,8 @@ function findAppBinary(selfPath: string): string | null {
|
|||||||
].filter((candidate): candidate is string => Boolean(candidate));
|
].filter((candidate): candidate is string => Boolean(candidate));
|
||||||
|
|
||||||
for (const envPath of envPaths) {
|
for (const envPath of envPaths) {
|
||||||
const resolved = resolveBinaryPathCandidate(envPath);
|
const resolved = resolveMacAppBinaryCandidate(envPath);
|
||||||
if (resolved && isExecutable(resolved)) {
|
if (resolved) {
|
||||||
return resolved;
|
return resolved;
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
@@ -2636,6 +2669,7 @@ function startMpv(
|
|||||||
targetKind: "file" | "url",
|
targetKind: "file" | "url",
|
||||||
args: Args,
|
args: Args,
|
||||||
socketPath: string,
|
socketPath: string,
|
||||||
|
appPath: string,
|
||||||
preloadedSubtitles?: { primaryPath?: string; secondaryPath?: string },
|
preloadedSubtitles?: { primaryPath?: string; secondaryPath?: string },
|
||||||
): void {
|
): void {
|
||||||
if (
|
if (
|
||||||
@@ -2692,6 +2726,9 @@ function startMpv(
|
|||||||
if (preloadedSubtitles?.secondaryPath) {
|
if (preloadedSubtitles?.secondaryPath) {
|
||||||
mpvArgs.push(`--sub-file=${preloadedSubtitles.secondaryPath}`);
|
mpvArgs.push(`--sub-file=${preloadedSubtitles.secondaryPath}`);
|
||||||
}
|
}
|
||||||
|
mpvArgs.push(
|
||||||
|
`--script-opts=subminer-binary_path=${appPath},subminer-socket_path=${socketPath}`,
|
||||||
|
);
|
||||||
mpvArgs.push(`--log-file=${getMpvLogPath()}`);
|
mpvArgs.push(`--log-file=${getMpvLogPath()}`);
|
||||||
|
|
||||||
try {
|
try {
|
||||||
@@ -2833,6 +2870,7 @@ async function main(): Promise<void> {
|
|||||||
selectedTarget.kind,
|
selectedTarget.kind,
|
||||||
args,
|
args,
|
||||||
mpvSocketPath,
|
mpvSocketPath,
|
||||||
|
appPath,
|
||||||
preloadedSubtitles,
|
preloadedSubtitles,
|
||||||
);
|
);
|
||||||
|
|
||||||
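The `resolveMacAppBinaryCandidate` function above derives a `.app` bundle root from a path and then probes `Contents/MacOS` for the binary. The string handling can be sketched in isolation as below. `appBundleRoot` and `bundleBinaryCandidates` are hypothetical names, the example paths are invented, and the platform check and `isExecutable` probing from the diff are deliberately left out so the sketch runs without touching the filesystem.

```typescript
// Hypothetical pure helper: returns the ".app" bundle root contained in a
// path, or "" when the path does not point at or into a bundle.
function appBundleRoot(resolvedPath: string): string {
  if (resolvedPath.endsWith(".app")) return resolvedPath;
  const appIndex = resolvedPath.indexOf(".app/");
  return appIndex >= 0
    ? resolvedPath.slice(0, appIndex + ".app".length)
    : "";
}

// Candidate binary locations inside the bundle, in the order tried in the diff.
function bundleBinaryCandidates(appPath: string): string[] {
  return ["SubMiner", "subminer"].map(
    (name) => `${appPath}/Contents/MacOS/${name}`,
  );
}

console.log(appBundleRoot("/Applications/SubMiner.app/Contents/Resources/app"));
// "/Applications/SubMiner.app"
console.log(bundleBinaryCandidates("/Applications/SubMiner.app")[0]);
// "/Applications/SubMiner.app/Contents/MacOS/SubMiner"
```

Keeping the path derivation separate from the executable check makes the former easy to unit-test; the real function additionally returns the input path directly when it is already executable.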
|
|||||||
1
vendor/jiten_freq_global/index.json
vendored
Normal file
@@ -0,0 +1 @@
|
|||||||
|
{"title":"Jiten","format":3,"revision":"Jiten 26-02-16","isUpdatable":true,"indexUrl":"https://api.jiten.moe/api/frequency-list/index","downloadUrl":"https://api.jiten.moe/api/frequency-list/download","sequenced":false,"frequencyMode":"rank-based","author":"Jiten","url":"https://jiten.moe","description":"Dictionary based on frequency data of all media from jiten.moe"}
|
||||||
1
vendor/jiten_freq_global/term_meta_bank_1.json
vendored
Normal file
File diff suppressed because one or more lines are too long