feat(tokenizer): use Yomitan word classes for subtitle POS filtering (#57)

* feat(tokenizer): use Yomitan word classes for subtitle POS filtering

- Carry matched headword wordClasses from termsFind into YomitanScanToken
- Map recognized Yomitan wordClasses to SubMiner coarse POS before annotation
- MeCab enrichment now fills only missing POS fields, preserving existing coarse pos1
- Exclude standalone grammar particles, して helper fragments, and single-kana surfaces from annotations
- Respect source-text punctuation gaps when counting N+1 sentence words
- Preserve known-word highlight on excluded kanji-containing tokens
- Add backlog tasks 304 (N+1 boundary bug) and 305 (wordClasses POS, done)

* fix(tokenizer): preserve annotation and enrichment behavior

* fix: restore jlpt subtitle underlines

* fix: exclude kana-only n+1 targets

* fix: refresh overlay on Hyprland fullscreen

* fix: address fullscreen and n-plus-one review notes

* fix: address CodeRabbit review comments

* fix: accept modified digits for multi-line sentence mining

* Cancel pending Linux MPV fullscreen overlay refresh bursts

- return a cancel handle from the Linux refresh burst scheduler
- clear pending refresh bursts when overlays hide or windows close
- tighten the burst test polling to wait for the async refresh
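
A minimal sketch of the cancel-handle pattern described in the bullets above. `scheduleRefreshBurst`, the `TimerApi` shape, and the injectable `timers` parameter are hypothetical names chosen for illustration; they are not taken from the SubMiner source.

```typescript
// Hypothetical timer surface so the scheduler can be driven by fake timers in tests.
interface TimerApi {
  setTimeout: (fn: () => void, ms: number) => unknown;
  clearTimeout: (handle: unknown) => void;
}

const realTimers: TimerApi = {
  setTimeout: (fn, ms) => setTimeout(fn, ms),
  clearTimeout: (handle) => clearTimeout(handle as ReturnType<typeof setTimeout>),
};

// Schedules one refresh per delay and returns a handle that cancels
// every refresh that has not fired yet.
function scheduleRefreshBurst(
  refresh: () => void,
  delaysMs: number[],
  timers: TimerApi = realTimers,
): () => void {
  const pending = new Set<unknown>();
  for (const delay of delaysMs) {
    const handle: unknown = timers.setTimeout(() => {
      pending.delete(handle);
      refresh();
    }, delay);
    pending.add(handle);
  }
  return () => {
    pending.forEach((handle) => timers.clearTimeout(handle));
    pending.clear();
  };
}
```

A caller would keep the returned function and invoke it when the overlay hides or the window closes, so a stale burst cannot refresh a window that is already gone.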

* fix: suppress N+1 for kana-only candidates and fix minSentenceWords coun

- Treat kana-only tokens with surrounding subtitle punctuation (…, ―, etc.) as kana-only so they are not promoted to N+1 targets
- Exclude unknown tokens that were filtered out of N+1 targeting from the minSentenceWords count, so filtered kana-only unknowns cannot satisfy the sentence-length threshold
- Add regression tests for kana-only candidate suppression and filtered-unknown padding cases
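
The counting rule above can be sketched as follows. This is an illustration under stated assumptions, not the tokenizer's actual code: `SketchToken`, `isKanaOnlySurface`, `pickNPlusOneTarget`, and the punctuation set are all hypothetical, and the sketch assumes an N+1 sentence is one with exactly one remaining unknown candidate.

```typescript
// Hypothetical, simplified token shape; the real token type carries more fields.
interface SketchToken {
  surface: string;
  known: boolean;
}

// Subtitle punctuation that may surround a token (assumed, non-exhaustive set).
const EDGE_PUNCTUATION = /^[…―‥、。！？!?\s]+|[…―‥、。！？!?\s]+$/g;
// Hiragana, katakana, and the prolonged sound mark.
const KANA_ONLY = /^[\u3041-\u3096\u30a1-\u30fa\u30fc]+$/;

// Treats a surface as kana-only even when punctuation surrounds it.
function isKanaOnlySurface(surface: string): boolean {
  return KANA_ONLY.test(surface.replace(EDGE_PUNCTUATION, ''));
}

function pickNPlusOneTarget(
  tokens: SketchToken[],
  minSentenceWords: number,
): SketchToken | null {
  const unknown = tokens.filter((token) => !token.known);
  // Kana-only unknowns are never promoted to N+1 targets.
  const candidates = unknown.filter((token) => !isKanaOnlySurface(token.surface));
  // Filtered unknowns do not pad the sentence length.
  const knownCount = tokens.length - unknown.length;
  const countedWords = knownCount + candidates.length;
  if (candidates.length !== 1 || countedWords < minSentenceWords) {
    return null;
  }
  return candidates[0];
}
```

With one known word, one punctuation-wrapped kana-only unknown, and one kanji unknown, the sentence counts as two words, so a threshold of three rejects it while a threshold of two selects the kanji unknown.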

* Suppress subtitle annotations for grammar fragments

- Hide annotation metadata for auxiliary inflection and ja-nai endings
- Preserve lexical `くれる` forms and add regression coverage

* Fix kana-only N+1 tokenizer regression test

- Use a pure-kana fixture for the subtitle token N+1 case
- Update task notes for the latest CodeRabbit follow-up

* Fix managed playback exit and tokenizer grammar splits

- Ignore background stats daemons during regular app startup
- Split standalone grammar endings before applying annotations
- Clear helper-span annotations for auxiliary-only tokens

* fix: refresh current subtitle after known-word mining

* fix: suppress sigh interjection annotations

* fix: preserve jlpt underline color after lookup

* Replace grammar-ending permutations with shared matcher; preserve word a

- Extract `grammar-ending.ts` with `isStandaloneGrammarEndingText` / `isSubtitleGrammarEndingText` pattern matchers
- Replace `STANDALONE_GRAMMAR_ENDINGS` set in parser-selection-stage with shared matcher
- Replace generated phrase sets in subtitle-annotation-filter with shared matcher
- Remove stale duplicate subtitle-exclusion constants and helpers from annotation-stage
- Manual clipboard card updates now write only to the sentence audio field, leaving word/expression audio untouched

* fix: CI changelog, annotation options threading, and Jellyfin quit

- Add `type: fixed` / `area:` frontmatter to `changes/319` to pass `changelog:lint`
- Thread `TokenizerAnnotationOptions` through `stripSubtitleAnnotationMetadata` so `sourceText` is honored
- Include `jellyfinPlay` in `shouldQuitOnDisconnectWhenOverlayRuntimeInitialized` predicate
- Make mouse test `elementFromPoint` stubs coordinate-sensitive
- Make Lua test `.tmp` mkdir portable on Windows

* Preserve overlay across macOS flaps and mpv playlist changes

- keep visible overlays alive during transient macOS tracker loss
- reuse the running mpv overlay path on playlist navigation
- update regression coverage and changelog fragments

* fix: restore stats daemon deferral

* fix: keep subtitle prefetch alive after cache hits

* Fix JLPT underline color drift and AniList skipped-threshold sync

- Replace JLPT `text-decoration` underlines with `border-bottom` so Chromium selection/hover cannot repaint them to another annotation's color
- Lock JLPT underline color for combined annotation selectors (known, n+1, frequency) and character hover/selection states
- Trigger AniList post-watch check on every mpv time-position update to catch skipped completion thresholds
- Fall back to filename-parser season/episode when guessit omits them

* fix: address coderabbit feedback

* fix: sync AniList after seeked completion

* fix: preserve ordinal frequency annotations

* fix: preserve known highlighting for filtered tokens

* fix: address PR #57 CodeRabbit feedback

- Acquire AniList post-watch in-flight lock before async gating to prevent duplicate writes
- Isolate manual watched mark result from AniList post-watch callback failures
- Report known-word cache clears as mutations during immediate append when state existed
- Add regression tests for each fix
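
A sketch of why the in-flight lock has to be taken before any `await`. `createPostWatchGuard`, `shouldUpdate`, and `write` are hypothetical stand-ins for the async gating checks and the AniList write; the real implementation may be structured differently.

```typescript
// Hypothetical guard factory; `shouldUpdate` stands in for the async gating
// checks and `write` for the AniList progress mutation.
function createPostWatchGuard(
  shouldUpdate: () => Promise<boolean>,
  write: () => Promise<void>,
): () => Promise<boolean> {
  let inFlight = false;
  return async () => {
    if (inFlight) {
      return false;
    }
    // Take the lock before the first await so a second caller arriving
    // while the gate is still pending cannot also reach write().
    inFlight = true;
    try {
      if (!(await shouldUpdate())) {
        return false;
      }
      await write();
      return true;
    } finally {
      inFlight = false;
    }
  };
}
```

If the flag were set only after `await shouldUpdate()` resolved, two time-position updates arriving back to back could both pass the gate and produce duplicate writes.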

* fix: stop AniList setup reopening on Linux when keyring token exists

- Gate setup success on token persistence: `saveToken` now returns `boolean`; on failure, keeps the setup window open instead of reporting success
- Config reload passes `allowSetupPrompt: false` so playback reloads don't re-open the setup window
- Add regression test for persistence-failure path
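
A minimal sketch of gating setup success on persistence. Only the boolean return of `saveToken` comes from the change description; `TokenStoreSketch`, `SetupOutcome`, and `completeAnilistSetup` are hypothetical names used for illustration.

```typescript
// Hypothetical minimal store surface; only the boolean result of saveToken matters here.
interface TokenStoreSketch {
  saveToken: (token: string) => boolean;
}

type SetupOutcome =
  | { status: 'success' }
  | { status: 'persist-failed'; keepWindowOpen: true };

// Reports success only when the token was actually persisted.
function completeAnilistSetup(store: TokenStoreSketch, token: string): SetupOutcome {
  if (!store.saveToken(token)) {
    // Persistence failed: keep the setup window open instead of claiming success.
    return { status: 'persist-failed', keepWindowOpen: true };
  }
  return { status: 'success' };
}
```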

* fix: suppress known highlights for subtitle particles

* fix: retry transient AniList safeStorage failures

* fix: hide overlay focus ring

* fix: align Hyprland fullscreen overlays

* fix: restore subtitle playback keybindings

* fix: align Hyprland overlay windows to mpv and stop pinning them

- Force-apply exact Hyprland move/resize/setprop dispatches when bounds are provided
- Stop pinning overlay windows; toggle pin off when Hyprland reports pinned=true
- Compensate stats overlay outer placement for Electron/Wayland content insets
- Make stats overlay window and page opaque so mpv cannot show through transparent insets
- Constrain stats app to h-screen with internal scroll so content covers mpv from y=0
- Lock overlay/stats window titles against page-title-updated events
- Add regression coverage for placement dispatches, inset compensation, and CSS overlay mode
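
One way to picture the inset compensation, as a sketch only. It assumes an inset is the gap between a window's outer frame and its content area on each side; `Rect`, `Insets`, and `compensateOuterBounds` are hypothetical and do not come from the SubMiner source.

```typescript
interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Assumed meaning: how far the content area sits inside the outer frame on each side.
interface Insets {
  top: number;
  left: number;
  right: number;
  bottom: number;
}

// Grows and shifts the outer bounds so the content area lands exactly on `target`.
function compensateOuterBounds(target: Rect, insets: Insets): Rect {
  return {
    x: target.x - insets.left,
    y: target.y - insets.top,
    width: target.width + insets.left + insets.right,
    height: target.height + insets.top + insets.bottom,
  };
}
```

Under that assumption, a 30 px top inset on a 1920x1080 target moves the outer window to y = -30 and makes it 1110 px tall, so the content still covers mpv from y = 0.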

* fix: retain frequency rank for honorific prefix-noun tokens

- Add `shouldAllowHonorificPrefixNounFrequency` to exempt お/ご/御 + noun merged tokens from frequency exclusion
- Add regression test for `ご機嫌` asserting rank 5484 is preserved after MeCab enrichment and annotation
- Close TASK-341

* fix: map openCharacterDictionary session action to --open-character-dict

- Add missing Lua CLI dispatch entry for openCharacterDictionary
- Add regression test for Alt+Meta+A binding and CLI flag forwarding

* fix: keep macOS overlay interactive while mpv remains active

- Overlay no longer hides or becomes click-through during tracker refreshes when mpv is the focused window
- Preserve already-visible overlay when tracker is temporarily not ready but mpv target signal is active
- Add regression tests for active-mpv tracker refresh and transient tracker-not-ready paths

* fix: address coderabbit subtitle follow-ups

* fix: resolve media detail from sessions when lifetime summary is absent

- Change `getMediaDetail` JOIN to LEFT JOIN on `imm_lifetime_media` and fall back to aggregated session metrics when no lifetime row exists
- Add filter `AND (lm.video_id IS NOT NULL OR s.session_id IS NOT NULL)` so a video with neither a lifetime row nor any session still returns no detail row
- Add regression test covering the session-visible / media-detail-missing mismatch
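
The fallback order can be sketched in memory, without SQL. `Totals`, `SessionRow`, and `resolveMediaTotals` are hypothetical and cover only three of the aggregated columns; the actual query does this with `CASE` expressions over the joined rows.

```typescript
interface Totals {
  totalSessions: number;
  totalActiveMs: number;
  totalCards: number;
}

// Hypothetical per-session metrics row.
interface SessionRow {
  activeWatchedMs: number;
  cardsMined: number;
}

// Prefers the lifetime summary, falls back to aggregated sessions,
// and returns null when neither exists.
function resolveMediaTotals(lifetime: Totals | null, sessions: SessionRow[]): Totals | null {
  if (lifetime) {
    return lifetime;
  }
  if (sessions.length === 0) {
    return null;
  }
  return {
    totalSessions: sessions.length,
    totalActiveMs: sessions.reduce((sum, session) => sum + session.activeWatchedMs, 0),
    totalCards: sessions.reduce((sum, session) => sum + session.cardsMined, 0),
  };
}
```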

* fix: address PR-57 CodeRabbit findings and CI failures

- use filtered word counts in media detail session token aggregation
- cancel fullscreen refresh burst on exit via updateLinuxMpvFullscreenOverlayRefreshBurst
- guard Hyprland JSON.parse in try/catch; exclude windowtitle from geometry events
- narrow focus suppression from :focus to :focus-visible
- apply JLPT lock selectors to word-name-match tokens (N1–N5)

* fix: macOS overlay z-order and Yomitan compound token known highlighting

- Release always-on-top when tracked mpv loses foreground on macOS
- Skip visible overlay blur restacking on macOS to avoid covering unrelated windows
- Prefer Yomitan internal parse tokens over fragmented scanner output for known-word decisions
- Add regression tests for both behaviors

* fix: macOS visible-overlay blur no longer invokes Windows-only blur call

- Split win32/darwin branches in handleOverlayWindowBlurred so darwin visible blur returns early without calling onWindowsVisibleOverlayBlur
- Add regression test asserting Windows callback stays inactive on macOS visible overlay blur
- Close TASK-347
2026-05-12 12:08:09 -07:00
committed by GitHub
parent b68d17614d
commit 430373f010
176 changed files with 8174 additions and 569 deletions
@@ -38,6 +38,24 @@ function createPassthroughStorage(): SafeStorageLike {
};
}
function createTransientUnavailableStorage(): SafeStorageLike & {
setAvailable: (next: boolean) => void;
} {
let available = false;
return {
isEncryptionAvailable: () => available,
encryptString: (value: string) => Buffer.from(`enc:${value}`, 'utf-8'),
decryptString: (value: Buffer) => {
const raw = value.toString('utf-8');
return raw.startsWith('enc:') ? raw.slice(4) : raw;
},
getSelectedStorageBackend: () => (available ? 'gnome_libsecret' : 'unknown'),
setAvailable(next: boolean) {
available = next;
},
} as SafeStorageLike & { setAvailable: (next: boolean) => void };
}
test('anilist token store saves and loads encrypted token', () => {
const filePath = createTempTokenFile();
const store = createAnilistTokenStore(filePath, createLogger(), createStorage(true));
@@ -61,6 +79,27 @@ test('anilist token store refuses to persist token when encryption unavailable',
assert.equal(store.loadToken(), null);
});
test('anilist token store retries safeStorage after transient encryption unavailability', () => {
const filePath = createTempTokenFile();
fs.writeFileSync(
filePath,
JSON.stringify({
encryptedToken: Buffer.from('enc:stored-token', 'utf-8').toString('base64'),
updatedAt: Date.now(),
}),
'utf-8',
);
const storage = createTransientUnavailableStorage();
const store = createAnilistTokenStore(filePath, createLogger(), storage);
assert.equal(store.loadToken(), null);
storage.setAvailable(true);
assert.equal(store.loadToken(), 'stored-token');
assert.equal(store.saveToken('new-token'), true);
assert.equal(store.loadToken(), 'new-token');
});
test('anilist token store migrates legacy plaintext to encrypted', () => {
const filePath = createTempTokenFile();
fs.writeFileSync(
@@ -69,7 +69,6 @@ export function createAnilistTokenStore(
`AniList token encryption unavailable: safeStorage.isEncryptionAvailable() is false. ` +
`Context: ${getSafeStorageDebugContext()}`,
);
-safeStorageUsable = false;
return false;
}
const probe = storage.encryptString('__subminer_anilist_probe__');
@@ -77,7 +76,6 @@ export function createAnilistTokenStore(
notifyUser(
'AniList token encryption probe failed: safeStorage.encryptString() returned plaintext bytes.',
);
-safeStorageUsable = false;
return false;
}
const roundTrip = storage.decryptString(probe);
@@ -85,7 +83,6 @@ export function createAnilistTokenStore(
notifyUser(
'AniList token encryption probe failed: encrypt/decrypt round trip returned unexpected content.',
);
-safeStorageUsable = false;
return false;
}
safeStorageUsable = true;
@@ -96,7 +93,6 @@ export function createAnilistTokenStore(
`AniList token encryption unavailable: safeStorage probe threw an error. ` +
`Context: ${getSafeStorageDebugContext()}`,
);
-safeStorageUsable = false;
return false;
}
};
@@ -22,6 +22,44 @@ test('guessAnilistMediaInfo uses guessit output when available', async () => {
});
});
test('guessAnilistMediaInfo fills missing guessit episode from filename parser', async () => {
const result = await guessAnilistMediaInfo('/tmp/Guessit Title S01E09.mkv', null, {
runGuessit: async () => JSON.stringify({ title: 'Guessit Title' }),
});
assert.deepEqual(result, {
title: 'Guessit Title',
season: 1,
episode: 9,
source: 'guessit',
});
});
test('guessAnilistMediaInfo ignores low-confidence parser details when guessit omits them', async () => {
const result = await guessAnilistMediaInfo('/tmp/Season 2/Guessit Title.mkv', null, {
runGuessit: async () => JSON.stringify({ title: 'Guessit Title' }),
});
assert.deepEqual(result, {
title: 'Guessit Title',
season: null,
episode: null,
source: 'guessit',
});
});
test('guessAnilistMediaInfo parses Little Witch Academia release filename', async () => {
const filename =
'/tmp/Little Witch Academia (2017) - S01E02 - 002 - Papiliodia [Bluray-1080p][10bit][h265][AC3 2.0][JA].mkv';
const result = await guessAnilistMediaInfo(filename, null, {
runGuessit: async () => JSON.stringify({ title: 'Little Witch Academia' }),
});
assert.deepEqual(result, {
title: 'Little Witch Academia',
season: 1,
episode: 2,
source: 'guessit',
});
});
test('guessAnilistMediaInfo falls back to parser when guessit fails', async () => {
const result = await guessAnilistMediaInfo('/tmp/My Anime S01E03.mkv', null, {
runGuessit: async () => {
@@ -54,7 +92,7 @@ test('guessAnilistMediaInfo uses basename for guessit input', async () => {
]);
assert.deepEqual(result, {
title: 'Rascal Does Not Dream of Bunny Girl Senpai',
-season: null,
+season: 1,
episode: 1,
source: 'guessit',
});
+4 -2
@@ -236,12 +236,14 @@ export async function guessAnilistMediaInfo(
const season = firstPositiveInteger(parsed.season);
const year = firstYear(parsed.year);
if (title) {
+const fallback = parseMediaInfo(target);
+const canUseFallbackDetails = fallback.confidence !== 'low';
return {
title: buildGuessitTitle(title, alternativeTitle),
...(alternativeTitle ? { alternativeTitle } : {}),
...(year ? { year } : {}),
-season,
-episode,
+season: season ?? (canUseFallbackDetails ? fallback.season : null),
+episode: episode ?? (canUseFallbackDetails ? fallback.episode : null),
source: 'guessit',
};
}
@@ -0,0 +1,200 @@
import assert from 'node:assert/strict';
import test from 'node:test';
import {
buildHyprlandPlacementDispatches,
ensureHyprlandWindowFloatingByTitle,
findHyprlandWindowForPlacement,
shouldAttemptHyprlandWindowPlacement,
} from './hyprland-window-placement';
test('shouldAttemptHyprlandWindowPlacement only enables on Hyprland Linux sessions', () => {
assert.equal(
shouldAttemptHyprlandWindowPlacement('linux', {
HYPRLAND_INSTANCE_SIGNATURE: 'abc',
}),
true,
);
assert.equal(
shouldAttemptHyprlandWindowPlacement('linux', {
WAYLAND_DISPLAY: 'wayland-1',
}),
false,
);
assert.equal(
shouldAttemptHyprlandWindowPlacement('darwin', {
HYPRLAND_INSTANCE_SIGNATURE: 'abc',
}),
false,
);
});
test('findHyprlandWindowForPlacement matches current process by title', () => {
const client = findHyprlandWindowForPlacement(
[
{
address: '0xother',
pid: 123,
title: 'SubMiner Stats',
mapped: true,
},
{
address: '0xmatch',
pid: 456,
title: 'SubMiner Stats',
mapped: true,
},
],
{
pid: 456,
title: 'SubMiner Stats',
},
);
assert.equal(client?.address, '0xmatch');
});
test('buildHyprlandPlacementDispatches floats tiled overlay windows without pinning them', () => {
assert.deepEqual(
buildHyprlandPlacementDispatches({
address: '0xabc',
floating: false,
pinned: false,
}),
[['dispatch', 'setfloating', 'address:0xabc']],
);
});
test('buildHyprlandPlacementDispatches force-aligns floating overlay windows to target bounds', () => {
assert.deepEqual(
buildHyprlandPlacementDispatches(
{
address: '0xabc',
floating: true,
pinned: false,
},
{
x: 0,
y: 0,
width: 1920,
height: 1080,
},
),
[
['dispatch', 'movewindowpixel', 'exact 0 0,address:0xabc'],
['dispatch', 'resizewindowpixel', 'exact 1920 1080,address:0xabc'],
['dispatch', 'setprop', 'address:0xabc rounding 0'],
['dispatch', 'setprop', 'address:0xabc border_size 0'],
['dispatch', 'setprop', 'address:0xabc no_shadow 1'],
['dispatch', 'setprop', 'address:0xabc no_blur 1'],
['dispatch', 'setprop', 'address:0xabc decorate 0'],
],
);
});
test('buildHyprlandPlacementDispatches does not pin already floating overlay windows', () => {
assert.deepEqual(
buildHyprlandPlacementDispatches({
address: '0xabc',
floating: true,
pinned: false,
}),
[],
);
});
test('buildHyprlandPlacementDispatches unpins previously pinned overlay windows', () => {
assert.deepEqual(
buildHyprlandPlacementDispatches({
address: '0xabc',
floating: true,
pinned: true,
}),
[['dispatch', 'pin', 'address:0xabc']],
);
});
test('ensureHyprlandWindowFloatingByTitle dispatches float-only placement for matching tiled window', () => {
const calls: unknown[][] = [];
const placed = ensureHyprlandWindowFloatingByTitle({
title: 'SubMiner Stats',
platform: 'linux',
env: {
HYPRLAND_INSTANCE_SIGNATURE: 'abc',
},
pid: 456,
execFileSync: ((command: string, args: string[], options: unknown) => {
calls.push([command, args, options]);
if (args.join(' ') === '-j clients') {
return JSON.stringify([
{
address: '0xmatch',
pid: 456,
title: 'SubMiner Stats',
mapped: true,
floating: false,
pinned: false,
},
]);
}
return '';
}) as never,
});
assert.equal(placed, true);
assert.deepEqual(
calls.map(([, args]) => args),
[
['-j', 'clients'],
['dispatch', 'setfloating', 'address:0xmatch'],
],
);
});
test('ensureHyprlandWindowFloatingByTitle dispatches exact Hyprland geometry when bounds are provided', () => {
const calls: unknown[][] = [];
const placed = ensureHyprlandWindowFloatingByTitle({
title: 'SubMiner Stats',
platform: 'linux',
env: {
HYPRLAND_INSTANCE_SIGNATURE: 'abc',
},
pid: 456,
bounds: {
x: 0,
y: 0,
width: 1920,
height: 1080,
},
execFileSync: ((command: string, args: string[], options: unknown) => {
calls.push([command, args, options]);
if (args.join(' ') === '-j clients') {
return JSON.stringify([
{
address: '0xmatch',
pid: 456,
title: 'SubMiner Stats',
mapped: true,
floating: true,
pinned: false,
},
]);
}
return '';
}) as never,
});
assert.equal(placed, true);
assert.deepEqual(
calls.map(([, args]) => args),
[
['-j', 'clients'],
['dispatch', 'movewindowpixel', 'exact 0 0,address:0xmatch'],
['dispatch', 'resizewindowpixel', 'exact 1920 1080,address:0xmatch'],
['dispatch', 'setprop', 'address:0xmatch rounding 0'],
['dispatch', 'setprop', 'address:0xmatch border_size 0'],
['dispatch', 'setprop', 'address:0xmatch no_shadow 1'],
['dispatch', 'setprop', 'address:0xmatch no_blur 1'],
['dispatch', 'setprop', 'address:0xmatch decorate 0'],
],
);
});
@@ -0,0 +1,156 @@
import { execFileSync } from 'node:child_process';
export interface HyprlandPlacementClient {
address?: string;
floating?: boolean;
hidden?: boolean;
initialTitle?: string;
mapped?: boolean;
pid?: number;
pinned?: boolean;
title?: string;
}
export interface HyprlandPlacementBounds {
x: number;
y: number;
width: number;
height: number;
}
type ExecFileSync = typeof execFileSync;
export function shouldAttemptHyprlandWindowPlacement(
platform: NodeJS.Platform = process.platform,
env: NodeJS.ProcessEnv = process.env,
): boolean {
return platform === 'linux' && Boolean(env.HYPRLAND_INSTANCE_SIGNATURE);
}
function parseHyprlandClients(output: string): HyprlandPlacementClient[] {
const payloadStart = output.indexOf('[');
if (payloadStart < 0) {
return [];
}
const parsed = JSON.parse(output.slice(payloadStart)) as unknown;
return Array.isArray(parsed) ? (parsed as HyprlandPlacementClient[]) : [];
}
export function findHyprlandWindowForPlacement(
clients: HyprlandPlacementClient[],
options: {
pid: number;
title: string;
},
): HyprlandPlacementClient | null {
const title = options.title.trim();
if (!title) {
return null;
}
return (
clients.find(
(client) =>
client.pid === options.pid &&
client.address &&
client.mapped !== false &&
client.hidden !== true &&
(client.title === title || client.initialTitle === title),
) ?? null
);
}
export function buildHyprlandPlacementDispatches(
client: HyprlandPlacementClient,
bounds?: HyprlandPlacementBounds | null,
): string[][] {
if (!client.address) {
return [];
}
const windowAddress = `address:${client.address}`;
const dispatches: string[][] = [];
if (client.floating !== true) {
dispatches.push(['dispatch', 'setfloating', windowAddress]);
}
if (client.pinned === true) {
dispatches.push(['dispatch', 'pin', windowAddress]);
}
const roundedBounds = roundPlacementBounds(bounds);
if (roundedBounds) {
dispatches.push([
'dispatch',
'movewindowpixel',
`exact ${roundedBounds.x} ${roundedBounds.y},${windowAddress}`,
]);
dispatches.push([
'dispatch',
'resizewindowpixel',
`exact ${roundedBounds.width} ${roundedBounds.height},${windowAddress}`,
]);
dispatches.push(['dispatch', 'setprop', `${windowAddress} rounding 0`]);
dispatches.push(['dispatch', 'setprop', `${windowAddress} border_size 0`]);
dispatches.push(['dispatch', 'setprop', `${windowAddress} no_shadow 1`]);
dispatches.push(['dispatch', 'setprop', `${windowAddress} no_blur 1`]);
dispatches.push(['dispatch', 'setprop', `${windowAddress} decorate 0`]);
}
return dispatches;
}
function roundPlacementBounds(
bounds?: HyprlandPlacementBounds | null,
): HyprlandPlacementBounds | null {
if (!bounds) {
return null;
}
const rounded = {
x: Math.round(bounds.x),
y: Math.round(bounds.y),
width: Math.round(bounds.width),
height: Math.round(bounds.height),
};
return Number.isFinite(rounded.x) &&
Number.isFinite(rounded.y) &&
Number.isFinite(rounded.width) &&
Number.isFinite(rounded.height) &&
rounded.width > 0 &&
rounded.height > 0
? rounded
: null;
}
export function ensureHyprlandWindowFloatingByTitle(options: {
title: string;
bounds?: HyprlandPlacementBounds | null;
platform?: NodeJS.Platform;
env?: NodeJS.ProcessEnv;
pid?: number;
execFileSync?: ExecFileSync;
}): boolean {
if (!shouldAttemptHyprlandWindowPlacement(options.platform, options.env)) {
return false;
}
const run = options.execFileSync ?? execFileSync;
try {
const clients = parseHyprlandClients(
String(run('hyprctl', ['-j', 'clients'], { encoding: 'utf-8' })),
);
const client = findHyprlandWindowForPlacement(clients, {
pid: options.pid ?? process.pid,
title: options.title,
});
if (!client) {
return false;
}
const dispatches = buildHyprlandPlacementDispatches(client, options.bounds);
for (const args of dispatches) {
run('hyprctl', args, { stdio: 'ignore' });
}
return dispatches.length > 0;
} catch {
return false;
}
}
@@ -3050,6 +3050,59 @@ test('anime and media detail prefer lifetime totals over partial retained sessio
}
});
test('media detail resolves retained sessions before lifetime summary exists', () => {
const dbPath = makeDbPath();
const db = new Database(dbPath);
try {
ensureSchema(db);
const videoId = getOrCreateVideoRecord(db, 'local:/tmp/recent-session.mkv', {
canonicalTitle: 'Recent Session Episode',
sourcePath: '/tmp/recent-session.mkv',
sourceUrl: null,
sourceType: SOURCE_TYPE_LOCAL,
});
const startedAtMs = 1_700_000_000_000;
const { sessionId } = startSessionRecord(db, videoId, startedAtMs);
db.prepare(
`
UPDATE imm_sessions
SET ended_at_ms = ?, status = 2, active_watched_ms = ?, lines_seen = ?, tokens_seen = ?, cards_mined = ?
WHERE session_id = ?
`,
).run(startedAtMs + 600_000, 600_000, 100, 990, 1, sessionId);
insertFilteredWordOccurrence(db, {
sessionId,
videoId,
occurrenceCount: 4,
startedAtMs,
});
assert.equal(getSessionSummaries(db, 1)[0]?.videoId, videoId);
assert.equal(
(
db
.prepare('SELECT COUNT(*) AS total FROM imm_lifetime_media WHERE video_id = ?')
.get(videoId) as { total: number }
).total,
0,
);
const detail = getMediaDetail(db, videoId);
assert.ok(detail);
assert.equal(detail.canonicalTitle, 'Recent Session Episode');
assert.equal(detail.totalSessions, 1);
assert.equal(detail.totalActiveMs, 600_000);
assert.equal(detail.totalLinesSeen, 100);
assert.equal(detail.totalTokensSeen, 4);
assert.equal(detail.totalCards, 1);
} finally {
db.close();
cleanupDbPath(dbPath);
}
});
test('media library and detail queries read lifetime totals', () => {
const dbPath = makeDbPath();
const db = new Database(dbPath);
@@ -243,6 +243,7 @@ export function getMediaLibrary(db: DatabaseSync): MediaLibraryRow[] {
}
export function getMediaDetail(db: DatabaseSync, videoId: number): MediaDetailRow | null {
const wordsExpr = sessionDisplayWordsExpr('s', 'swc', 'COALESCE(asm.tokensSeen, s.tokens_seen)');
return db
.prepare(
`
@@ -251,11 +252,26 @@ export function getMediaDetail(db: DatabaseSync, videoId: number): MediaDetailRo
v.video_id AS videoId,
v.canonical_title AS canonicalTitle,
v.anime_id AS animeId,
-COALESCE(lm.total_sessions, 0) AS totalSessions,
-COALESCE(lm.total_active_ms, 0) AS totalActiveMs,
-COALESCE(lm.total_cards, 0) AS totalCards,
-COALESCE(lm.total_tokens_seen, 0) AS totalTokensSeen,
-COALESCE(lm.total_lines_seen, 0) AS totalLinesSeen,
+CASE
+WHEN lm.video_id IS NOT NULL THEN COALESCE(lm.total_sessions, 0)
+ELSE COUNT(DISTINCT s.session_id)
+END AS totalSessions,
+CASE
+WHEN lm.video_id IS NOT NULL THEN COALESCE(lm.total_active_ms, 0)
+ELSE COALESCE(SUM(COALESCE(asm.activeWatchedMs, s.active_watched_ms, 0)), 0)
+END AS totalActiveMs,
+CASE
+WHEN lm.video_id IS NOT NULL THEN COALESCE(lm.total_cards, 0)
+ELSE COALESCE(SUM(COALESCE(asm.cardsMined, s.cards_mined, 0)), 0)
+END AS totalCards,
+CASE
+WHEN lm.video_id IS NOT NULL THEN COALESCE(lm.total_tokens_seen, 0)
+ELSE COALESCE(SUM(${wordsExpr}), 0)
+END AS totalTokensSeen,
+CASE
+WHEN lm.video_id IS NOT NULL THEN COALESCE(lm.total_lines_seen, 0)
+ELSE COALESCE(SUM(COALESCE(asm.linesSeen, s.lines_seen, 0)), 0)
+END AS totalLinesSeen,
COALESCE(SUM(COALESCE(asm.lookupCount, s.lookup_count, 0)), 0) AS totalLookupCount,
COALESCE(SUM(COALESCE(asm.lookupHits, s.lookup_hits, 0)), 0) AS totalLookupHits,
COALESCE(SUM(COALESCE(asm.yomitanLookupCount, s.yomitan_lookup_count, 0)), 0) AS totalYomitanLookupCount,
@@ -271,11 +287,13 @@ export function getMediaDetail(db: DatabaseSync, videoId: number): MediaDetailRo
yv.uploader_url AS uploaderUrl,
yv.description AS description
FROM imm_videos v
-JOIN imm_lifetime_media lm ON lm.video_id = v.video_id
+LEFT JOIN imm_lifetime_media lm ON lm.video_id = v.video_id
LEFT JOIN imm_youtube_videos yv ON yv.video_id = v.video_id
LEFT JOIN imm_sessions s ON s.video_id = v.video_id
LEFT JOIN active_session_metrics asm ON asm.sessionId = s.session_id
LEFT JOIN session_word_counts swc ON swc.sessionId = s.session_id
WHERE v.video_id = ?
AND (lm.video_id IS NOT NULL OR s.session_id IS NOT NULL)
GROUP BY v.video_id
`,
)
+80
@@ -302,6 +302,86 @@ test('createIpcDepsRuntime wires AniList handlers', async () => {
assert.equal(deps.getPlaybackPaused(), true);
});
test('registerIpcHandlers runs AniList update after manual mark watched succeeds', async () => {
const { registrar, handlers } = createFakeIpcRegistrar();
const calls: string[] = [];
registerIpcHandlers(
createRegisterIpcDeps({
immersionTracker: createFakeImmersionTracker({
markActiveVideoWatched: async () => {
calls.push('mark');
return true;
},
}),
runAnilistPostWatchUpdateOnManualMark: async () => {
calls.push('anilist');
},
}),
registrar,
);
const result = await handlers.handle.get(IPC_CHANNELS.command.markActiveVideoWatched)?.({});
assert.equal(result, true);
assert.deepEqual(calls, ['mark', 'anilist']);
});
test('registerIpcHandlers isolates AniList update failures after manual mark watched succeeds', async () => {
const { registrar, handlers } = createFakeIpcRegistrar();
const calls: string[] = [];
const originalWarn = console.warn;
console.warn = () => undefined;
try {
registerIpcHandlers(
createRegisterIpcDeps({
immersionTracker: createFakeImmersionTracker({
markActiveVideoWatched: async () => {
calls.push('mark');
return true;
},
}),
runAnilistPostWatchUpdateOnManualMark: async () => {
calls.push('anilist');
throw new Error('post-watch failed');
},
}),
registrar,
);
const result = await handlers.handle.get(IPC_CHANNELS.command.markActiveVideoWatched)?.({});
assert.equal(result, true);
assert.deepEqual(calls, ['mark', 'anilist']);
} finally {
console.warn = originalWarn;
}
});
test('registerIpcHandlers skips AniList update when manual mark watched has no active session', async () => {
const { registrar, handlers } = createFakeIpcRegistrar();
const calls: string[] = [];
registerIpcHandlers(
createRegisterIpcDeps({
immersionTracker: createFakeImmersionTracker({
markActiveVideoWatched: async () => {
calls.push('mark');
return false;
},
}),
runAnilistPostWatchUpdateOnManualMark: async () => {
calls.push('anilist');
},
}),
registrar,
);
const result = await handlers.handle.get(IPC_CHANNELS.command.markActiveVideoWatched)?.({});
assert.equal(result, false);
assert.deepEqual(calls, ['mark']);
});
test('registerIpcHandlers exposes playlist browser snapshot and mutations', async () => {
const { registrar, handlers } = createFakeIpcRegistrar();
const calls: Array<[string, unknown[]]> = [];
+15 -1
@@ -90,6 +90,7 @@ export interface IpcServiceDeps {
openAnilistSetup: () => void;
getAnilistQueueStatus: () => unknown;
retryAnilistQueueNow: () => Promise<{ ok: boolean; message: string }>;
runAnilistPostWatchUpdateOnManualMark?: () => Promise<void>;
getCharacterDictionarySelection?: () => Promise<unknown>;
setCharacterDictionarySelection?: (mediaId: number) => Promise<unknown>;
appendClipboardVideoToQueue: () => { ok: boolean; message: string };
@@ -213,6 +214,7 @@ export interface IpcDepsRuntimeOptions {
openAnilistSetup: () => void;
getAnilistQueueStatus: () => unknown;
retryAnilistQueueNow: () => Promise<{ ok: boolean; message: string }>;
runAnilistPostWatchUpdateOnManualMark?: () => Promise<void>;
getCharacterDictionarySelection?: () => Promise<unknown>;
setCharacterDictionarySelection?: (mediaId: number) => Promise<unknown>;
appendClipboardVideoToQueue: () => { ok: boolean; message: string };
@@ -288,6 +290,7 @@ export function createIpcDepsRuntime(options: IpcDepsRuntimeOptions): IpcService
openAnilistSetup: options.openAnilistSetup,
getAnilistQueueStatus: options.getAnilistQueueStatus,
retryAnilistQueueNow: options.retryAnilistQueueNow,
runAnilistPostWatchUpdateOnManualMark: options.runAnilistPostWatchUpdateOnManualMark,
getCharacterDictionarySelection:
options.getCharacterDictionarySelection ??
(async () => ({
@@ -385,7 +388,18 @@ export function registerIpcHandlers(deps: IpcServiceDeps, ipc: IpcMainRegistrar
});
ipc.handle(IPC_CHANNELS.command.markActiveVideoWatched, async () => {
-return (await deps.immersionTracker?.markActiveVideoWatched()) ?? false;
+const marked = (await deps.immersionTracker?.markActiveVideoWatched()) ?? false;
+if (marked) {
+try {
+await deps.runAnilistPostWatchUpdateOnManualMark?.();
+} catch (error) {
+console.warn(
+'Failed to run AniList post-watch update after manual watched mark:',
+(error as Error).message,
+);
+}
+}
+return marked;
});
ipc.on(IPC_CHANNELS.command.quitApp, () => {
+1
@@ -59,6 +59,7 @@ const MPV_SUBTITLE_PROPERTY_OBSERVATIONS: string[] = [
'sub-ass-override',
'sub-use-margins',
'pause',
'fullscreen',
'duration',
'media-title',
'secondary-sub-visibility',
+31
@@ -93,6 +93,7 @@ function createDeps(overrides: Partial<MpvProtocolHandleMessageDeps> = {}): {
emitTimePosChange: () => {},
emitDurationChange: () => {},
emitPauseChange: () => {},
emitFullscreenChange: (payload) => state.events.push(payload),
autoLoadSecondarySubTrack: () => {},
setCurrentVideoPath: () => {},
emitSecondarySubtitleVisibility: (payload) => state.events.push(payload),
@@ -160,6 +161,17 @@ test('dispatchMpvProtocolMessage enforces sub-visibility hidden when overlay sup
]);
});
test('dispatchMpvProtocolMessage emits fullscreen changes', async () => {
const { deps, state } = createDeps();
await dispatchMpvProtocolMessage(
{ event: 'property-change', name: 'fullscreen', data: true },
deps,
);
assert.deepEqual(state.events, [{ fullscreen: true }]);
});
test('dispatchMpvProtocolMessage skips sub-visibility suppression when overlay is hidden', async () => {
const { deps, state } = createDeps({
isVisibleOverlayVisible: () => false,
@@ -269,6 +281,25 @@ test('dispatchMpvProtocolMessage pauses on sub-end when pendingPauseAtSubEnd is
});
});
test('dispatchMpvProtocolMessage updates current time before emitting time-pos change', async () => {
const calls: string[] = [];
let currentTimePos = 0;
const { deps } = createDeps({
setCurrentTimePos: (time) => {
currentTimePos = time;
calls.push(`set:${time}`);
},
getCurrentTimePos: () => currentTimePos,
emitTimePosChange: ({ time }) => {
calls.push(`emit:${time}:current=${currentTimePos}`);
},
});
await dispatchMpvProtocolMessage({ event: 'property-change', name: 'time-pos', data: 90 }, deps);
assert.deepEqual(calls, ['set:90', 'emit:90:current=90']);
});
test('splitMpvMessagesFromBuffer parses complete lines and preserves partial buffer', () => {
const parsed = splitMpvMessagesFromBuffer(
'{"event":"shutdown"}\n{"event":"property-change","name":"media-title","data":"x"}\n{"partial"',
+6 -2
@@ -65,6 +65,7 @@ export interface MpvProtocolHandleMessageDeps {
emitTimePosChange: (payload: { time: number }) => void;
emitDurationChange: (payload: { duration: number }) => void;
emitPauseChange: (payload: { paused: boolean }) => void;
emitFullscreenChange: (payload: { fullscreen: boolean }) => void;
emitSubtitleMetricsChange: (payload: Partial<MpvSubtitleRenderMetrics>) => void;
setCurrentSecondarySubText: (text: string) => void;
resolvePendingRequest: (requestId: number, message: MpvMessage) => boolean;
@@ -275,8 +276,9 @@ export async function dispatchMpvProtocolMessage(
deps.setCurrentAudioTrackId(typeof msg.data === 'number' ? (msg.data as number) : null);
deps.syncCurrentAudioStreamIndex();
} else if (msg.name === 'time-pos') {
- deps.emitTimePosChange({ time: (msg.data as number) || 0 });
- deps.setCurrentTimePos((msg.data as number) || 0);
+ const timePos = (msg.data as number) || 0;
+ deps.setCurrentTimePos(timePos);
+ deps.emitTimePosChange({ time: timePos });
if (
deps.getPauseAtTime() !== null &&
deps.getCurrentTimePos() >= (deps.getPauseAtTime() as number)
@@ -291,6 +293,8 @@ export async function dispatchMpvProtocolMessage(
}
} else if (msg.name === 'pause') {
deps.emitPauseChange({ paused: asBoolean(msg.data, false) });
} else if (msg.name === 'fullscreen') {
deps.emitFullscreenChange({ fullscreen: asBoolean(msg.data, false) });
} else if (msg.name === 'media-title') {
deps.emitMediaTitleChange({
title: typeof msg.data === 'string' ? msg.data.trim() : null,
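Reviewer note (illustrative sketch, not part of the diff): the `time-pos` change above swaps the order so that state is stored before listeners are notified, which is what the new `set:90` / `emit:90:current=90` test asserts. A trimmed-down model of that ordering follows. `TimePosDeps` and `handleTimePos` are hypothetical names, and the number check here is stricter than the cast used in the hunk.

```typescript
// Hypothetical, reduced version of the handler deps: only what the ordering needs.
type TimePosDeps = {
  setCurrentTimePos: (time: number) => void;
  emitTimePosChange: (payload: { time: number }) => void;
};

function handleTimePos(data: unknown, deps: TimePosDeps): void {
  // Assumption of this sketch: non-numbers and NaN fall back to 0.
  const timePos = typeof data === 'number' && !Number.isNaN(data) ? data : 0;
  deps.setCurrentTimePos(timePos); // update state first
  deps.emitTimePosChange({ time: timePos }); // then notify listeners
}
```

With this order, a listener that reads the current position while handling the event sees the new value rather than the previous one.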
+39 -3
@@ -57,6 +57,22 @@ test('MpvIpcClient handles sub-text property change and broadcasts tokenized sub
assert.equal(events[0]!.isOverlayVisible, false);
});
test('MpvIpcClient emits fullscreen property changes', async () => {
const events: Array<{ fullscreen: boolean }> = [];
const client = new MpvIpcClient('/tmp/mpv.sock', makeDeps());
client.on('fullscreen-change', (payload) => {
events.push(payload);
});
await invokeHandleMessage(client, {
event: 'property-change',
name: 'fullscreen',
data: true,
});
assert.deepEqual(events, [{ fullscreen: true }]);
});
test('MpvIpcClient clears cached media title when media path changes', async () => {
const client = new MpvIpcClient('/tmp/mpv.sock', makeDeps());
@@ -473,7 +489,7 @@ test('MpvIpcClient updates current audio stream index from track list', async ()
assert.equal(client.currentAudioStreamIndex, 11);
});
- test('MpvIpcClient playNextSubtitle preserves a manual paused state', async () => {
+ test('MpvIpcClient playNextSubtitle starts playback from paused state and auto-pauses at end', async () => {
const commands: unknown[] = [];
const client = new MpvIpcClient('/tmp/mpv.sock', makeDeps());
(client as any).send = (payload: unknown) => {
@@ -491,9 +507,29 @@ test('MpvIpcClient playNextSubtitle preserves a manual paused state', async () =
client.playNextSubtitle();
- assert.equal((client as any).pendingPauseAtSubEnd, false);
+ assert.equal((client as any).pendingPauseAtSubEnd, true);
assert.equal((client as any).pauseAtTime, null);
- assert.deepEqual(commands, [{ command: ['sub-seek', 1] }]);
+ assert.deepEqual(commands, [
+ { command: ['sub-seek', 1] },
+ { command: ['set_property', 'pause', false] },
+ ]);
});
test('MpvIpcClient playNextSubtitle starts playback when pause state is unknown', () => {
const commands: unknown[] = [];
const client = new MpvIpcClient('/tmp/mpv.sock', makeDeps());
(client as any).send = (payload: unknown) => {
commands.push(payload);
return true;
};
client.playNextSubtitle();
assert.equal((client as any).pendingPauseAtSubEnd, true);
assert.deepEqual(commands, [
{ command: ['sub-seek', 1] },
{ command: ['set_property', 'pause', false] },
]);
});
test('MpvIpcClient playNextSubtitle still auto-pauses at end while already playing', async () => {
+8 -6
@@ -119,6 +119,7 @@ export interface MpvIpcClientEventMap {
'time-pos-change': { time: number };
'duration-change': { duration: number };
'pause-change': { paused: boolean };
'fullscreen-change': { fullscreen: boolean };
'secondary-subtitle-change': { text: string };
'subtitle-track-change': { sid: number | null };
'subtitle-track-list-change': { trackList: unknown[] | null };
@@ -330,6 +331,9 @@ export class MpvIpcClient implements MpvClient {
this.playbackPaused = payload.paused;
this.emit('pause-change', payload);
},
emitFullscreenChange: (payload) => {
this.emit('fullscreen-change', payload);
},
emitSecondarySubtitleChange: (payload) => {
this.emit('secondary-subtitle-change', payload);
},
@@ -518,14 +522,12 @@ export class MpvIpcClient implements MpvClient {
}
playNextSubtitle(): void {
- if (this.playbackPaused === true) {
- this.pendingPauseAtSubEnd = false;
- this.pauseAtTime = null;
- this.send({ command: ['sub-seek', 1] });
- return;
- }
this.pendingPauseAtSubEnd = true;
this.pauseAtTime = null;
this.send({ command: ['sub-seek', 1] });
+ if (this.playbackPaused !== false) {
+ this.send({ command: ['set_property', 'pause', false] });
+ }
}
restorePreviousSecondarySubVisibility(): void {
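Reviewer note (illustrative sketch, not part of the diff): after this change `playNextSubtitle` always arms the pause-at-subtitle-end flag and sends an explicit unpause unless mpv is already known to be playing. The model below records commands instead of writing to the mpv socket. `SubtitleStepper` is a hypothetical stand-in for the relevant slice of `MpvIpcClient`, and representing the unknown pause state as `null` is an assumption of this example.

```typescript
type MpvCommand = { command: unknown[] };

// Hypothetical stand-in for the relevant slice of MpvIpcClient.
class SubtitleStepper {
  pendingPauseAtSubEnd = false;
  pauseAtTime: number | null = null;
  // Assumption: null models "pause state not observed yet".
  playbackPaused: boolean | null;
  readonly sent: MpvCommand[] = [];

  constructor(playbackPaused: boolean | null = null) {
    this.playbackPaused = playbackPaused;
  }

  send(payload: MpvCommand): void {
    this.sent.push(payload); // record instead of talking to mpv
  }

  playNextSubtitle(): void {
    this.pendingPauseAtSubEnd = true;
    this.pauseAtTime = null;
    this.send({ command: ['sub-seek', 1] });
    // Skip the unpause only when mpv is known to be playing already.
    if (this.playbackPaused !== false) {
      this.send({ command: ['set_property', 'pause', false] });
    }
  }
}
```

Comparing against `false` rather than `true` is what makes the unknown state behave like the paused state, which matches the two new tests in the client test file.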
@@ -77,6 +77,7 @@ test('overlay manager applies bounds for main and modal windows', () => {
const visibleCalls: Electron.Rectangle[] = [];
const visibleWindow = {
isDestroyed: () => false,
getTitle: () => 'SubMiner Overlay',
setBounds: (bounds: Electron.Rectangle) => {
visibleCalls.push(bounds);
},
@@ -84,6 +85,7 @@ test('overlay manager applies bounds for main and modal windows', () => {
const modalCalls: Electron.Rectangle[] = [];
const modalWindow = {
isDestroyed: () => false,
getTitle: () => 'SubMiner Overlay Modal',
setBounds: (bounds: Electron.Rectangle) => {
modalCalls.push(bounds);
},
+208 -2
@@ -883,7 +883,7 @@ test('visible overlay stays hidden while a modal window is active', () => {
assert.ok(!calls.includes('update-bounds'));
});
- test('macOS tracked visible overlay stays click-through without passively stealing focus', () => {
+ test('macOS tracked visible overlay stays interactive without passively stealing focus', () => {
const { window, calls } = createMainWindowRecorder();
const tracker: WindowTrackerStub = {
isTracking: () => true,
@@ -915,11 +915,158 @@ test('macOS tracked visible overlay stays click-through without passively steali
isWindowsPlatform: false,
} as never);
- assert.ok(calls.includes('mouse-ignore:true:forward'));
+ assert.ok(calls.includes('mouse-ignore:false:plain'));
assert.ok(calls.includes('show'));
assert.ok(!calls.includes('focus'));
});
test('macOS keeps active mpv overlay visible and interactive during tracker refresh', () => {
const { window, calls } = createMainWindowRecorder();
const osdMessages: string[] = [];
const tracker: WindowTrackerStub = {
isTracking: () => true,
getGeometry: () => ({ x: 0, y: 0, width: 1280, height: 720 }),
isTargetWindowFocused: () => true,
};
updateVisibleOverlayVisibility({
visibleOverlayVisible: true,
mainWindow: window as never,
windowTracker: tracker as never,
trackerNotReadyWarningShown: false,
setTrackerNotReadyWarningShown: () => {
calls.push('tracker-warning');
},
updateVisibleOverlayBounds: () => {
calls.push('update-bounds');
},
ensureOverlayWindowLevel: () => {
calls.push('ensure-level');
},
syncPrimaryOverlayWindowLayer: () => {
calls.push('sync-layer');
},
enforceOverlayLayerOrder: () => {
calls.push('enforce-order');
},
syncOverlayShortcuts: () => {
calls.push('sync-shortcuts');
},
isMacOSPlatform: true,
isWindowsPlatform: false,
showOverlayLoadingOsd: (message: string) => {
osdMessages.push(message);
},
} as never);
assert.ok(calls.includes('update-bounds'));
assert.ok(calls.includes('sync-layer'));
assert.ok(calls.includes('mouse-ignore:false:plain'));
assert.ok(calls.includes('ensure-level'));
assert.ok(calls.includes('enforce-order'));
assert.ok(calls.includes('sync-shortcuts'));
assert.ok(!calls.includes('hide'));
assert.deepEqual(osdMessages, []);
});
test('macOS tracked overlay releases topmost level when mpv loses foreground', () => {
const { window, calls } = createMainWindowRecorder();
const tracker: WindowTrackerStub = {
isTracking: () => true,
getGeometry: () => ({ x: 0, y: 0, width: 1280, height: 720 }),
isTargetWindowFocused: () => false,
};
updateVisibleOverlayVisibility({
visibleOverlayVisible: true,
mainWindow: window as never,
windowTracker: tracker as never,
trackerNotReadyWarningShown: false,
setTrackerNotReadyWarningShown: () => {},
updateVisibleOverlayBounds: () => {
calls.push('update-bounds');
},
ensureOverlayWindowLevel: () => {
calls.push('ensure-level');
},
syncPrimaryOverlayWindowLayer: () => {
calls.push('sync-layer');
},
enforceOverlayLayerOrder: () => {
calls.push('enforce-order');
},
syncOverlayShortcuts: () => {
calls.push('sync-shortcuts');
},
isMacOSPlatform: true,
isWindowsPlatform: false,
} as never);
assert.ok(calls.includes('update-bounds'));
assert.ok(calls.includes('sync-layer'));
assert.ok(calls.includes('mouse-ignore:true:forward'));
assert.ok(calls.includes('always-on-top:false'));
assert.ok(calls.includes('show'));
assert.ok(calls.includes('sync-shortcuts'));
assert.ok(!calls.includes('ensure-level'));
assert.ok(!calls.includes('enforce-order'));
assert.ok(!calls.includes('focus'));
assert.ok(!calls.includes('hide'));
});
test('macOS preserves an already visible active mpv overlay while tracker is temporarily not ready', () => {
const { window, calls } = createMainWindowRecorder();
const osdMessages: string[] = [];
let trackerWarning = false;
const tracker: WindowTrackerStub = {
isTracking: () => false,
getGeometry: () => null,
isTargetWindowFocused: () => true,
};
window.show();
calls.length = 0;
updateVisibleOverlayVisibility({
visibleOverlayVisible: true,
mainWindow: window as never,
windowTracker: tracker as never,
trackerNotReadyWarningShown: trackerWarning,
setTrackerNotReadyWarningShown: (shown: boolean) => {
trackerWarning = shown;
calls.push(`tracker-warning:${shown}`);
},
updateVisibleOverlayBounds: () => {
calls.push('update-bounds');
},
ensureOverlayWindowLevel: () => {
calls.push('ensure-level');
},
syncPrimaryOverlayWindowLayer: () => {
calls.push('sync-layer');
},
enforceOverlayLayerOrder: () => {
calls.push('enforce-order');
},
syncOverlayShortcuts: () => {
calls.push('sync-shortcuts');
},
isMacOSPlatform: true,
isWindowsPlatform: false,
showOverlayLoadingOsd: (message: string) => {
osdMessages.push(message);
},
} as never);
assert.equal(trackerWarning, false);
assert.ok(calls.includes('sync-layer'));
assert.ok(calls.includes('mouse-ignore:false:plain'));
assert.ok(calls.includes('ensure-level'));
assert.ok(calls.includes('sync-shortcuts'));
assert.ok(!calls.includes('hide'));
assert.deepEqual(osdMessages, []);
});
test('forced mouse passthrough keeps macOS tracked overlay passive while visible', () => {
const { window, calls } = createMainWindowRecorder();
const tracker: WindowTrackerStub = {
@@ -1192,6 +1339,65 @@ test('macOS keeps visible overlay hidden while tracker is not initialized yet',
assert.ok(!calls.includes('update-bounds'));
});
test('macOS preserves visible overlay during transient tracker loss with retained geometry', () => {
const { window, calls } = createMainWindowRecorder();
const osdMessages: string[] = [];
let trackerWarning = false;
let tracking = true;
const tracker: WindowTrackerStub = {
isTracking: () => tracking,
getGeometry: () => ({ x: 0, y: 0, width: 1280, height: 720 }),
isTargetWindowFocused: () => true,
};
const run = () =>
updateVisibleOverlayVisibility({
visibleOverlayVisible: true,
mainWindow: window as never,
windowTracker: tracker as never,
trackerNotReadyWarningShown: trackerWarning,
setTrackerNotReadyWarningShown: (shown: boolean) => {
trackerWarning = shown;
},
updateVisibleOverlayBounds: () => {
calls.push('update-bounds');
},
ensureOverlayWindowLevel: () => {
calls.push('ensure-level');
},
syncPrimaryOverlayWindowLayer: () => {
calls.push('sync-layer');
},
enforceOverlayLayerOrder: () => {
calls.push('enforce-order');
},
syncOverlayShortcuts: () => {
calls.push('sync-shortcuts');
},
isMacOSPlatform: true,
showOverlayLoadingOsd: (message: string) => {
osdMessages.push(message);
},
} as never);
run();
calls.length = 0;
tracking = false;
run();
assert.equal(trackerWarning, false);
assert.deepEqual(osdMessages, []);
assert.ok(calls.includes('update-bounds'));
assert.ok(calls.includes('sync-layer'));
assert.ok(calls.includes('mouse-ignore:false:plain'));
assert.ok(calls.includes('ensure-level'));
assert.ok(calls.includes('enforce-order'));
assert.ok(calls.includes('sync-shortcuts'));
assert.ok(!calls.includes('hide'));
assert.ok(!calls.includes('show'));
});
test('macOS suppresses immediate repeat loading OSD after tracker recovery until cooldown expires', () => {
const { window } = createMainWindowRecorder();
const osdMessages: string[] = [];
+33 -11
@@ -89,13 +89,22 @@ export function updateVisibleOverlayVisibility(args: {
return;
}
- const showPassiveVisibleOverlay = (): void => {
+ const showPassiveVisibleOverlay = (): boolean => {
const forceMousePassthrough = args.forceMousePassthrough === true;
const wasVisible = mainWindow.isVisible();
- const shouldDefaultToPassthrough =
- args.isMacOSPlatform || args.isWindowsPlatform || forceMousePassthrough;
const isVisibleOverlayFocused =
typeof mainWindow.isFocused === 'function' && mainWindow.isFocused();
+ const isTrackedMacOSTargetFocused =
+ !args.isMacOSPlatform || !args.windowTracker
+ ? true
+ : (args.windowTracker.isTargetWindowFocused?.() ?? true);
+ const shouldReleaseMacOSOverlayLevel =
+ args.isMacOSPlatform &&
+ !!args.windowTracker &&
+ !isVisibleOverlayFocused &&
+ !isTrackedMacOSTargetFocused;
+ const shouldDefaultToPassthrough =
+ args.isWindowsPlatform || forceMousePassthrough || shouldReleaseMacOSOverlayLevel;
const windowsForegroundProcessName =
args.lastKnownWindowsForegroundProcessName?.trim().toLowerCase() ?? null;
const windowsOverlayProcessName = args.windowsOverlayProcessName?.trim().toLowerCase() ?? null;
@@ -138,7 +147,7 @@ export function updateVisibleOverlayVisibility(args: {
// On Windows, z-order is enforced by the OS via the owner window mechanism
// (SetWindowLongPtr GWLP_HWNDPARENT). The overlay is always above mpv
// without any manual z-order management.
- } else if (!forceMousePassthrough) {
+ } else if (!forceMousePassthrough && !shouldReleaseMacOSOverlayLevel) {
args.ensureOverlayWindowLevel(mainWindow);
} else {
mainWindow.setAlwaysOnTop(false);
@@ -187,6 +196,8 @@ export function updateVisibleOverlayVisibility(args: {
if (!args.isWindowsPlatform && !args.isMacOSPlatform && !forceMousePassthrough) {
mainWindow.focus();
}
return !shouldReleaseMacOSOverlayLevel;
};
const maybeShowOverlayLoadingOsd = (): void => {
@@ -230,8 +241,8 @@ export function updateVisibleOverlayVisibility(args: {
args.updateVisibleOverlayBounds(geometry);
}
args.syncPrimaryOverlayWindowLayer('visible');
- showPassiveVisibleOverlay();
- if (!args.forceMousePassthrough && !args.isWindowsPlatform) {
+ const shouldEnforceLayerOrder = showPassiveVisibleOverlay();
+ if (shouldEnforceLayerOrder && !args.forceMousePassthrough && !args.isWindowsPlatform) {
args.enforceOverlayLayerOrder();
}
args.syncOverlayShortcuts();
@@ -260,11 +271,19 @@ export function updateVisibleOverlayVisibility(args: {
return;
}
+ const hasRetainedTrackedGeometry = args.windowTracker.getGeometry() !== null;
+ const hasActiveMacOSTargetSignal =
+ args.isMacOSPlatform && (args.windowTracker.isTargetWindowFocused?.() ?? false);
+ const shouldPreserveTransientTrackedOverlay =
+ (args.isMacOSPlatform &&
+ (hasRetainedTrackedGeometry || (mainWindow.isVisible() && hasActiveMacOSTargetSignal))) ||
+ (args.isWindowsPlatform &&
+ typeof args.windowTracker.isTargetWindowMinimized === 'function' &&
+ !args.windowTracker.isTargetWindowMinimized());
if (
- args.isWindowsPlatform &&
- typeof args.windowTracker.isTargetWindowMinimized === 'function' &&
- !args.windowTracker.isTargetWindowMinimized() &&
- (mainWindow.isVisible() || args.windowTracker.getGeometry() !== null)
+ shouldPreserveTransientTrackedOverlay &&
+ (mainWindow.isVisible() || hasRetainedTrackedGeometry)
) {
args.setTrackerNotReadyWarningShown(false);
const geometry = args.windowTracker.getGeometry();
@@ -272,7 +291,10 @@ export function updateVisibleOverlayVisibility(args: {
args.updateVisibleOverlayBounds(geometry);
}
args.syncPrimaryOverlayWindowLayer('visible');
- showPassiveVisibleOverlay();
+ const shouldEnforceLayerOrder = showPassiveVisibleOverlay();
+ if (shouldEnforceLayerOrder && !args.forceMousePassthrough && !args.isWindowsPlatform) {
+ args.enforceOverlayLayerOrder();
+ }
args.syncOverlayShortcuts();
return;
}
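Reviewer note (illustrative sketch, not part of the diff): the macOS branch above drops the topmost level only when a tracker exists and neither the overlay nor the tracked mpv window holds focus. The same predicate can be written as a pure function, which makes the truth table easy to check. `OverlayLevelInputs` and its field names are hypothetical flattenings of `args`, `mainWindow.isFocused()` and `windowTracker.isTargetWindowFocused?.()`.

```typescript
// Hypothetical flattened inputs for the decision made inside showPassiveVisibleOverlay.
type OverlayLevelInputs = {
  isMacOS: boolean;
  hasWindowTracker: boolean;
  overlayFocused: boolean;
  // undefined models a tracker that does not implement the focus probe.
  targetFocused?: boolean;
};

function shouldReleaseMacOSOverlayLevel(input: OverlayLevelInputs): boolean {
  // Missing focus information is treated as "focused", mirroring the `?? true` in the hunk.
  const targetFocused =
    !input.isMacOS || !input.hasWindowTracker ? true : (input.targetFocused ?? true);
  return input.isMacOS && input.hasWindowTracker && !input.overlayFocused && !targetFocused;
}
```

Because unknown focus defaults to "focused", a tracker without the probe never triggers the release path, so the overlay keeps its level in that case.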
@@ -8,6 +8,7 @@ test('overlay window config explicitly disables renderer sandbox for preload com
yomitanSession: null,
});
assert.equal(options.title, 'SubMiner Overlay');
assert.equal(options.backgroundColor, '#00000000');
assert.equal(options.webPreferences?.sandbox, false);
assert.equal(options.webPreferences?.backgroundThrottling, false);
+5 -1
@@ -69,10 +69,14 @@ export function handleOverlayWindowBlurred(options: {
onWindowsVisibleOverlayBlur?: () => void;
platform?: NodeJS.Platform;
}): boolean {
- if ((options.platform ?? process.platform) === 'win32' && options.kind === 'visible') {
+ const platform = options.platform ?? process.platform;
+ if (platform === 'win32' && options.kind === 'visible') {
options.onWindowsVisibleOverlayBlur?.();
return false;
}
+ if (platform === 'darwin' && options.kind === 'visible') {
+ return false;
+ }
if (options.kind === 'visible' && !options.isOverlayVisible(options.kind)) {
return false;
@@ -2,6 +2,11 @@ import type { BrowserWindowConstructorOptions, Session } from 'electron';
import * as path from 'path';
import type { OverlayWindowKind } from './overlay-window-input';
export const OVERLAY_WINDOW_TITLES: Record<OverlayWindowKind, string> = {
visible: 'SubMiner Overlay',
modal: 'SubMiner Overlay Modal',
};
export function buildOverlayWindowOptions(
kind: OverlayWindowKind,
options: {
@@ -14,6 +19,7 @@ export function buildOverlayWindowOptions(
return {
show: false,
title: OVERLAY_WINDOW_TITLES[kind],
width: 800,
height: 600,
x: 0,
+43
@@ -146,6 +146,49 @@ test('handleOverlayWindowBlurred notifies Windows visible overlay blur callback
assert.deepEqual(calls, ['windows-visible-blur']);
});
test('handleOverlayWindowBlurred skips macOS visible overlay restacking after focus loss', () => {
const calls: string[] = [];
const handled = handleOverlayWindowBlurred({
kind: 'visible',
windowVisible: true,
isOverlayVisible: () => true,
ensureOverlayWindowLevel: () => {
calls.push('ensure-level');
},
moveWindowTop: () => {
calls.push('move-top');
},
platform: 'darwin',
});
assert.equal(handled, false);
assert.deepEqual(calls, []);
});
test('handleOverlayWindowBlurred leaves Windows callback inactive on macOS visible overlay blur', () => {
const calls: string[] = [];
const handled = handleOverlayWindowBlurred({
kind: 'visible',
windowVisible: true,
isOverlayVisible: () => true,
ensureOverlayWindowLevel: () => {
calls.push('ensure-level');
},
moveWindowTop: () => {
calls.push('move-top');
},
onWindowsVisibleOverlayBlur: () => {
calls.push('windows-visible-blur');
},
platform: 'darwin',
});
assert.equal(handled, false);
assert.deepEqual(calls, []);
});
test('handleOverlayWindowBlurred preserves active visible/modal window stacking', () => {
const calls: string[] = [];
+18 -4
@@ -1,4 +1,5 @@
- import { BrowserWindow, screen, type Session } from 'electron';
+ import electron from 'electron';
+ import type { BrowserWindow, Session } from 'electron';
import * as path from 'path';
import { WindowGeometry } from '../../types';
import { createLogger } from '../../logger';
@@ -8,12 +9,14 @@ import {
handleOverlayWindowBlurred,
type OverlayWindowKind,
} from './overlay-window-input';
- import { buildOverlayWindowOptions } from './overlay-window-options';
+ import { ensureHyprlandWindowFloatingByTitle } from './hyprland-window-placement';
+ import { buildOverlayWindowOptions, OVERLAY_WINDOW_TITLES } from './overlay-window-options';
import { normalizeOverlayWindowBoundsForPlatform } from './overlay-window-bounds';
import { OVERLAY_WINDOW_CONTENT_READY_FLAG } from './overlay-window-flags';
export { OVERLAY_WINDOW_CONTENT_READY_FLAG } from './overlay-window-flags';
const logger = createLogger('main:overlay-window');
const { BrowserWindow: ElectronBrowserWindow, screen } = electron;
const overlayWindowLayerByInstance = new WeakMap<BrowserWindow, OverlayWindowKind>();
const overlayWindowContentReady = new WeakSet<BrowserWindow>();
@@ -50,7 +53,9 @@ export function updateOverlayWindowBounds(
window: BrowserWindow | null,
): void {
if (!geometry || !window || window.isDestroyed()) return;
- window.setBounds(normalizeOverlayWindowBoundsForPlatform(geometry, process.platform, screen));
+ const bounds = normalizeOverlayWindowBoundsForPlatform(geometry, process.platform, screen);
+ window.setBounds(bounds);
+ ensureHyprlandWindowFloatingByTitle({ title: window.getTitle(), bounds });
}
export function ensureOverlayWindowLevel(window: BrowserWindow): void {
@@ -67,6 +72,9 @@ export function ensureOverlayWindowLevel(window: BrowserWindow): void {
return;
}
window.setAlwaysOnTop(true);
window.setVisibleOnAllWorkspaces(true, { visibleOnFullScreen: true });
ensureHyprlandWindowFloatingByTitle({ title: window.getTitle() });
window.moveTop();
}
export function enforceOverlayLayerOrder(options: {
@@ -97,7 +105,7 @@ export function createOverlayWindow(
yomitanSession?: Session | null;
},
): BrowserWindow {
- const window = new BrowserWindow(buildOverlayWindowOptions(kind, options));
+ const window = new ElectronBrowserWindow(buildOverlayWindowOptions(kind, options));
(window as BrowserWindow & { [OVERLAY_WINDOW_CONTENT_READY_FLAG]?: boolean })[
OVERLAY_WINDOW_CONTENT_READY_FLAG
] = false;
@@ -112,9 +120,15 @@ export function createOverlayWindow(
});
window.webContents.on('did-finish-load', () => {
window.setTitle(OVERLAY_WINDOW_TITLES[kind]);
options.onRuntimeOptionsChanged();
});
window.webContents.on('page-title-updated', (event) => {
event.preventDefault();
window.setTitle(OVERLAY_WINDOW_TITLES[kind]);
});
window.once('ready-to-show', () => {
overlayWindowContentReady.add(window);
(window as BrowserWindow & { [OVERLAY_WINDOW_CONTENT_READY_FLAG]?: boolean })[
+31 -1
@@ -2,7 +2,8 @@ import assert from 'node:assert/strict';
import test from 'node:test';
import type { Keybinding } from '../../types';
import type { ConfiguredShortcuts } from '../utils/shortcut-config';
- import { SPECIAL_COMMANDS } from '../../config/definitions';
+ import { DEFAULT_CONFIG, DEFAULT_KEYBINDINGS, SPECIAL_COMMANDS } from '../../config/definitions';
+ import { resolveConfiguredShortcuts } from '../utils/shortcut-config';
import { compileSessionBindings } from './session-bindings';
function createShortcuts(overrides: Partial<ConfiguredShortcuts> = {}): ConfiguredShortcuts {
@@ -179,6 +180,35 @@ test('compileSessionBindings drops conflicting bindings that canonicalize to the
]);
});
test('compileSessionBindings keeps default replay and next subtitle session actions on Linux', () => {
const result = compileSessionBindings({
shortcuts: resolveConfiguredShortcuts(DEFAULT_CONFIG, DEFAULT_CONFIG),
keybindings: DEFAULT_KEYBINDINGS,
statsToggleKey: DEFAULT_CONFIG.stats.toggleKey,
platform: 'linux',
rawConfig: DEFAULT_CONFIG,
});
assert.deepEqual(
result.warnings.filter((warning) => warning.kind === 'conflict'),
[],
);
const bySignature = new Map(
result.bindings.map((binding) => [
`${binding.key.modifiers.join('+')}+${binding.key.code}`,
binding,
]),
);
const replay = bySignature.get('ctrl+shift+KeyH');
assert.equal(replay?.actionType, 'session-action');
assert.equal(replay?.actionId, 'replayCurrentSubtitle');
const next = bySignature.get('ctrl+shift+KeyL');
assert.equal(next?.actionType, 'session-action');
assert.equal(next?.actionId, 'playNextSubtitle');
});
test('compileSessionBindings omits disabled bindings', () => {
const result = compileSessionBindings({
shortcuts: createShortcuts({
+30 -2
@@ -3,10 +3,13 @@ import type { WindowGeometry } from '../../types';
const DEFAULT_STATS_WINDOW_WIDTH = 900;
const DEFAULT_STATS_WINDOW_HEIGHT = 700;
export const STATS_WINDOW_TITLE = 'SubMiner Stats';
type StatsWindowLevelController = Pick<BrowserWindow, 'setAlwaysOnTop' | 'moveTop'> &
Partial<Pick<BrowserWindow, 'setVisibleOnAllWorkspaces' | 'setFullScreenable'>>;
type StatsWindowBoundsController = Pick<BrowserWindow, 'getBounds' | 'getContentBounds'>;
function isBareToggleKeyInput(input: Electron.Input, toggleKey: string): boolean {
return (
input.type === 'keyDown' &&
@@ -30,12 +33,13 @@ export function buildStatsWindowOptions(options: {
bounds?: WindowGeometry | null;
}): BrowserWindowConstructorOptions {
return {
title: STATS_WINDOW_TITLE,
x: options.bounds?.x,
y: options.bounds?.y,
width: options.bounds?.width ?? DEFAULT_STATS_WINDOW_WIDTH,
height: options.bounds?.height ?? DEFAULT_STATS_WINDOW_HEIGHT,
frame: false,
- transparent: true,
+ transparent: false,
alwaysOnTop: true,
resizable: false,
skipTaskbar: true,
@@ -43,7 +47,7 @@ export function buildStatsWindowOptions(options: {
focusable: true,
acceptFirstMouse: true,
fullscreenable: false,
- backgroundColor: '#1e1e2e',
+ backgroundColor: '#24273a',
show: false,
webPreferences: {
nodeIntegration: false,
@@ -54,6 +58,30 @@ export function buildStatsWindowOptions(options: {
};
}
export function resolveStatsWindowOuterBoundsForContent(
window: StatsWindowBoundsController,
target: WindowGeometry,
): WindowGeometry {
const outer = window.getBounds();
const content = window.getContentBounds();
const leftInset = content.x - outer.x;
const topInset = content.y - outer.y;
const rightInset = outer.x + outer.width - (content.x + content.width);
const bottomInset = outer.y + outer.height - (content.y + content.height);
const insets = [leftInset, topInset, rightInset, bottomInset];
if (insets.some((inset) => !Number.isFinite(inset) || inset < 0)) {
return target;
}
return {
x: target.x - leftInset,
y: target.y - topInset,
width: target.width + leftInset + rightInset,
height: target.height + topInset + bottomInset,
};
}
export function promoteStatsWindowLevel(
window: StatsWindowLevelController,
platform: NodeJS.Platform = process.platform,
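Reviewer note (illustrative sketch, not part of the diff): `resolveStatsWindowOuterBoundsForContent` grows the requested rectangle by the gap between the window's outer bounds and its content bounds, so that the content area ends up on the target. The standalone restatement below takes plain rectangles instead of a `BrowserWindow` so it runs without Electron. `Rect` and `outerBoundsForContent` are names invented for this example, and the sample numbers are the ones used in the Wayland inset test in this PR.

```typescript
type Rect = { x: number; y: number; width: number; height: number };

// Standalone restatement of the inset arithmetic from the hunk above.
function outerBoundsForContent(outer: Rect, content: Rect, target: Rect): Rect {
  const left = content.x - outer.x;
  const top = content.y - outer.y;
  const right = outer.x + outer.width - (content.x + content.width);
  const bottom = outer.y + outer.height - (content.y + content.height);
  if ([left, top, right, bottom].some((inset) => !Number.isFinite(inset) || inset < 0)) {
    return target; // implausible geometry: leave the requested bounds untouched
  }
  return {
    x: target.x - left,
    y: target.y - top,
    width: target.width + left + right,
    height: target.height + top + bottom,
  };
}

// A 14 px inset at the top and none on the other sides.
const grown = outerBoundsForContent(
  { x: 0, y: 0, width: 3440, height: 1440 },
  { x: 0, y: 14, width: 3440, height: 1426 },
  { x: 0, y: 0, width: 3440, height: 1440 },
);
console.log(grown); // { x: 0, y: -14, width: 3440, height: 1454 }
```

The outer window is moved up by the top inset and made that much taller, so the content rectangle lands exactly on the 3440x1440 target.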
+31 -1
@@ -4,6 +4,7 @@ import {
buildStatsWindowLoadFileOptions,
buildStatsWindowOptions,
promoteStatsWindowLevel,
resolveStatsWindowOuterBoundsForContent,
shouldHideStatsWindowForInput,
} from './stats-window-runtime';
@@ -18,12 +19,14 @@ test('buildStatsWindowOptions uses tracked overlay bounds and preload-friendly w
},
});
assert.equal(options.title, 'SubMiner Stats');
assert.equal(options.x, 120);
assert.equal(options.y, 80);
assert.equal(options.width, 1440);
assert.equal(options.height, 900);
assert.equal(options.frame, false);
- assert.equal(options.transparent, true);
+ assert.equal(options.transparent, false);
+ assert.equal(options.backgroundColor, '#24273a');
assert.equal(options.resizable, false);
assert.equal(options.webPreferences?.preload, '/tmp/preload-stats.js');
assert.equal(options.webPreferences?.contextIsolation, true);
@@ -151,6 +154,33 @@ test('buildStatsWindowLoadFileOptions includes provided stats API base URL', ()
});
});
test('resolveStatsWindowOuterBoundsForContent compensates for Wayland content insets', () => {
assert.deepEqual(
resolveStatsWindowOuterBoundsForContent(
{
getBounds: () => ({ x: 0, y: 0, width: 3440, height: 1440 }),
getContentBounds: () => ({ x: 0, y: 14, width: 3440, height: 1426 }),
},
{ x: 0, y: 0, width: 3440, height: 1440 },
),
{ x: 0, y: -14, width: 3440, height: 1454 },
);
});
test('resolveStatsWindowOuterBoundsForContent ignores invalid inset geometry', () => {
const target = { x: 0, y: 0, width: 3440, height: 1440 };
assert.deepEqual(
resolveStatsWindowOuterBoundsForContent(
{
getBounds: () => ({ x: 0, y: 0, width: 3440, height: 1440 }),
getContentBounds: () => ({ x: -1, y: 0, width: 3440, height: 1440 }),
},
target,
),
target,
);
});
test('promoteStatsWindowLevel raises stats above overlay level on macOS', () => {
const calls: string[] = [];
promoteStatsWindowLevel(
+28 -8
@@ -6,8 +6,11 @@ import {
buildStatsWindowLoadFileOptions,
buildStatsWindowOptions,
promoteStatsWindowLevel,
resolveStatsWindowOuterBoundsForContent,
shouldHideStatsWindowForInput,
STATS_WINDOW_TITLE,
} from './stats-window-runtime.js';
import { ensureHyprlandWindowFloatingByTitle } from './hyprland-window-placement.js';
let statsWindow: BrowserWindow | null = null;
let toggleRegistered = false;
@@ -27,20 +30,32 @@ export interface StatsWindowOptions {
onVisibilityChanged?: (visible: boolean) => void;
}
- function syncStatsWindowBounds(window: BrowserWindow, bounds: WindowGeometry | null): void {
- if (!bounds || window.isDestroyed()) return;
+ function syncStatsWindowBounds(
+ window: BrowserWindow,
+ bounds: WindowGeometry | null,
+ ): WindowGeometry | null {
+ if (!bounds || window.isDestroyed()) return null;
+ const outerBounds = resolveStatsWindowOuterBoundsForContent(window, bounds);
window.setBounds({
- x: bounds.x,
- y: bounds.y,
- width: bounds.width,
- height: bounds.height,
+ x: outerBounds.x,
+ y: outerBounds.y,
+ width: outerBounds.width,
+ height: outerBounds.height,
});
+ return outerBounds;
}
function showStatsWindow(window: BrowserWindow, options: StatsWindowOptions): void {
- syncStatsWindowBounds(window, options.resolveBounds());
+ const bounds = options.resolveBounds();
+ let placementBounds = syncStatsWindowBounds(window, bounds);
promoteStatsWindowLevel(window);
window.show();
+ placementBounds = syncStatsWindowBounds(window, bounds) ?? placementBounds;
+ if (
+ !ensureHyprlandWindowFloatingByTitle({ title: STATS_WINDOW_TITLE, bounds: placementBounds })
+ ) {
+ placementBounds = syncStatsWindowBounds(window, bounds) ?? placementBounds;
+ }
window.focus();
options.onVisibilityChanged?.(true);
promoteStatsWindowLevel(window);
@@ -59,6 +74,12 @@ export function toggleStatsOverlay(options: StatsWindowOptions): void {
}),
);
statsWindow.setTitle(STATS_WINDOW_TITLE);
statsWindow.webContents.on('page-title-updated', (event) => {
event.preventDefault();
statsWindow?.setTitle(STATS_WINDOW_TITLE);
});
const indexPath = path.join(options.staticDir, 'index.html');
statsWindow.loadFile(indexPath, buildStatsWindowLoadFileOptions(options.getApiBaseUrl?.()));
@@ -74,7 +95,6 @@ export function toggleStatsOverlay(options: StatsWindowOptions): void {
options.onVisibilityChanged?.(false);
}
});
statsWindow.once('ready-to-show', () => {
if (!statsWindow) return;
showStatsWindow(statsWindow, options);
File diff suppressed because it is too large
+76 -11
@@ -96,6 +96,7 @@ interface TokenizerAnnotationOptions {
minSentenceWordsForNPlusOne: number | undefined;
pos1Exclusions: ReadonlySet<string>;
pos2Exclusions: ReadonlySet<string>;
sourceText?: string;
}
let parserEnrichmentWorkerRuntimeModulePromise: Promise<
@@ -159,7 +160,7 @@ async function applyAnnotationStage(
options: TokenizerAnnotationOptions,
): Promise<MergedToken[]> {
if (!hasAnyAnnotationEnabled(options)) {
- return tokens;
+ return stripSubtitleAnnotationMetadata(tokens, options);
}
if (!annotationStageModulePromise) {
@@ -178,7 +179,10 @@ async function applyAnnotationStage(
);
}
- async function stripSubtitleAnnotationMetadata(tokens: MergedToken[]): Promise<MergedToken[]> {
+ async function stripSubtitleAnnotationMetadata(
+ tokens: MergedToken[],
+ options: TokenizerAnnotationOptions,
+ ): Promise<MergedToken[]> {
if (tokens.length === 0) {
return tokens;
}
@@ -188,7 +192,7 @@ async function stripSubtitleAnnotationMetadata(tokens: MergedToken[]): Promise<M
}
const annotationStage = await annotationStageModulePromise;
- return tokens.map((token) => annotationStage.stripSubtitleAnnotationMetadata(token));
+ return tokens.map((token) => annotationStage.stripSubtitleAnnotationMetadata(token, options));
}
export function createTokenizerDepsRuntime(
@@ -333,6 +337,66 @@ function normalizeSelectedYomitanTokens(tokens: MergedToken[]): MergedToken[] {
}));
}
function normalizeYomitanWordClasses(wordClasses: unknown): string[] {
if (!Array.isArray(wordClasses)) {
return [];
}
const normalized: string[] = [];
for (const wordClass of wordClasses) {
if (typeof wordClass !== 'string') {
continue;
}
const trimmed = wordClass.trim();
if (trimmed && !normalized.includes(trimmed)) {
normalized.push(trimmed);
}
}
return normalized;
}
function resolvePartOfSpeechFromYomitanWordClasses(wordClasses: string[]): {
partOfSpeech: PartOfSpeech;
pos1?: string;
} {
if (wordClasses.includes('prt')) {
return { partOfSpeech: PartOfSpeech.particle, pos1: '助詞' };
}
if (wordClasses.some((wordClass) => wordClass === 'aux' || wordClass.startsWith('aux-'))) {
return { partOfSpeech: PartOfSpeech.bound_auxiliary, pos1: '助動詞' };
}
if (wordClasses.some((wordClass) => wordClass.startsWith('v'))) {
return { partOfSpeech: PartOfSpeech.verb, pos1: '動詞' };
}
if (wordClasses.includes('adj-i') || wordClasses.includes('adj-ix')) {
return { partOfSpeech: PartOfSpeech.i_adjective, pos1: '形容詞' };
}
if (wordClasses.includes('adj-na')) {
return { partOfSpeech: PartOfSpeech.na_adjective, pos1: '名詞' };
}
if (
wordClasses.some(
(wordClass) =>
wordClass === 'n' ||
wordClass === 'num' ||
wordClass === 'ctr' ||
wordClass === 'pn' ||
wordClass.startsWith('n-'),
)
) {
return { partOfSpeech: PartOfSpeech.noun, pos1: '名詞' };
}
return { partOfSpeech: PartOfSpeech.other };
}
function getYomitanWordClassPosMetadata(wordClasses: unknown): {
partOfSpeech: PartOfSpeech;
pos1?: string;
} {
return resolvePartOfSpeechFromYomitanWordClasses(normalizeYomitanWordClasses(wordClasses));
}
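The precedence used by the word-class mapping above can be illustrated with a reduced, standalone sketch. The `CoarsePos` union and the `resolveCoarsePos` helper are hypothetical stand-ins for the project's `PartOfSpeech` enum and `resolvePartOfSpeechFromYomitanWordClasses`; only the ordering (particle, auxiliary, verb, i-adjective, na-adjective, noun, other) is taken from the diff.

```typescript
// Hypothetical stand-in for the project's PartOfSpeech enum.
type CoarsePos =
  | 'particle'
  | 'bound_auxiliary'
  | 'verb'
  | 'i_adjective'
  | 'na_adjective'
  | 'noun'
  | 'other';

interface PosMetadata {
  partOfSpeech: CoarsePos;
  pos1?: string;
}

// Keep only trimmed, non-empty, de-duplicated string entries.
function normalizeWordClasses(wordClasses: unknown): string[] {
  if (!Array.isArray(wordClasses)) return [];
  const normalized: string[] = [];
  for (const wordClass of wordClasses) {
    if (typeof wordClass !== 'string') continue;
    const trimmed = wordClass.trim();
    if (trimmed && !normalized.includes(trimmed)) normalized.push(trimmed);
  }
  return normalized;
}

// First matching rule wins, so a headword tagged as both verb and noun maps to verb.
function resolveCoarsePos(wordClasses: unknown): PosMetadata {
  const classes = normalizeWordClasses(wordClasses);
  if (classes.includes('prt')) return { partOfSpeech: 'particle', pos1: '助詞' };
  if (classes.some((c) => c === 'aux' || c.startsWith('aux-'))) {
    return { partOfSpeech: 'bound_auxiliary', pos1: '助動詞' };
  }
  if (classes.some((c) => c.startsWith('v'))) return { partOfSpeech: 'verb', pos1: '動詞' };
  if (classes.includes('adj-i') || classes.includes('adj-ix')) {
    return { partOfSpeech: 'i_adjective', pos1: '形容詞' };
  }
  if (classes.includes('adj-na')) return { partOfSpeech: 'na_adjective', pos1: '名詞' };
  if (classes.some((c) => ['n', 'num', 'ctr', 'pn'].includes(c) || c.startsWith('n-'))) {
    return { partOfSpeech: 'noun', pos1: '名詞' };
  }
  // Unrecognized classes leave pos1 unset so a later enrichment step can fill it.
  return { partOfSpeech: 'other' };
}

console.log(resolveCoarsePos(['v5', 'n']).partOfSpeech);
console.log(resolveCoarsePos('not-an-array').partOfSpeech);
```

Leaving `pos1` undefined in the fallback case matters for the rest of the change: the MeCab enrichment only fills fields that are still blank.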
function resolveFrequencyLookupText(
token: MergedToken,
matchMode: FrequencyDictionaryMatchMode,
@@ -622,21 +686,23 @@ async function parseWithYomitanInternalParser(
return null;
}
const normalizedSelectedTokens = normalizeSelectedYomitanTokens(
selectedTokens.map(
(token): MergedToken => ({
selectedTokens.map((token): MergedToken => {
const posMetadata = getYomitanWordClassPosMetadata(token.wordClasses);
return {
surface: token.surface,
reading: token.reading,
headword: token.headword,
startPos: token.startPos,
endPos: token.endPos,
partOfSpeech: PartOfSpeech.other,
partOfSpeech: posMetadata.partOfSpeech,
pos1: posMetadata.pos1,
isMerged: true,
isKnown: false,
isNPlusOneTarget: false,
isNameMatch: token.isNameMatch ?? false,
frequencyRank: token.frequencyRank,
}),
),
};
}),
);
if (deps.getYomitanGroupDebugEnabled?.() === true) {
@@ -716,12 +782,11 @@ export async function tokenizeSubtitle(
.replace(/\s+/g, ' ')
.trim();
const annotationOptions = getAnnotationOptions(deps);
annotationOptions.sourceText = tokenizeText;
const yomitanTokens = await parseWithYomitanInternalParser(tokenizeText, deps, annotationOptions);
if (yomitanTokens && yomitanTokens.length > 0) {
const annotatedTokens = await stripSubtitleAnnotationMetadata(
await applyAnnotationStage(yomitanTokens, deps, annotationOptions),
);
const annotatedTokens = await applyAnnotationStage(yomitanTokens, deps, annotationOptions);
return {
text: displayText,
tokens: annotatedTokens.length > 0 ? annotatedTokens : null,
File diff suppressed because it is too large
+65 -149
@@ -18,57 +18,6 @@ const KATAKANA_TO_HIRAGANA_OFFSET = 0x60;
const KATAKANA_CODEPOINT_START = 0x30a1;
const KATAKANA_CODEPOINT_END = 0x30f6;
const JLPT_LEVEL_LOOKUP_CACHE_LIMIT = 2048;
const SUBTITLE_ANNOTATION_EXCLUDED_TERMS = new Set([
'ああ',
'ええ',
'うう',
'おお',
'はあ',
'はは',
'へえ',
'ふう',
'ほう',
]);
const SUBTITLE_ANNOTATION_EXCLUDED_EXPLANATORY_ENDING_PREFIXES = ['ん', 'の', 'なん', 'なの'];
const SUBTITLE_ANNOTATION_EXCLUDED_EXPLANATORY_ENDING_CORES = [
'だ',
'です',
'でした',
'だった',
'では',
'じゃ',
'でしょう',
'だろう',
] as const;
const SUBTITLE_ANNOTATION_EXCLUDED_EXPLANATORY_ENDING_TRAILING_PARTICLES = [
'',
'か',
'ね',
'よ',
'な',
'けど',
'よね',
'かな',
'かね',
] as const;
const SUBTITLE_ANNOTATION_EXCLUDED_EXPLANATORY_ENDINGS = new Set(
SUBTITLE_ANNOTATION_EXCLUDED_EXPLANATORY_ENDING_PREFIXES.flatMap((prefix) =>
SUBTITLE_ANNOTATION_EXCLUDED_EXPLANATORY_ENDING_CORES.flatMap((core) =>
SUBTITLE_ANNOTATION_EXCLUDED_EXPLANATORY_ENDING_TRAILING_PARTICLES.map(
(particle) => `${prefix}${core}${particle}`,
),
),
),
);
const SUBTITLE_ANNOTATION_EXCLUDED_TRAILING_PARTICLE_SUFFIXES = new Set([
'って',
'ってよ',
'ってね',
'ってな',
'ってさ',
'ってか',
'ってば',
]);
const jlptLevelLookupCaches = new WeakMap<
(text: string) => JlptLevel | null,
@@ -89,6 +38,7 @@ export interface AnnotationStageOptions {
minSentenceWordsForNPlusOne?: number;
pos1Exclusions?: ReadonlySet<string>;
pos2Exclusions?: ReadonlySet<string>;
sourceText?: string;
}
function resolveKnownWordText(
@@ -103,10 +53,6 @@ function normalizePos1Tag(pos1: string | undefined): string {
return typeof pos1 === 'string' ? pos1.trim() : '';
}
const SUBTITLE_ANNOTATION_EXCLUDED_POS1 = new Set(['感動詞']);
const SUBTITLE_ANNOTATION_GRAMMAR_ONLY_POS1 = new Set(['助詞', '助動詞', '連体詞']);
const AUXILIARY_STEM_GRAMMAR_TAIL_POS1 = new Set(['名詞', '助動詞', '助詞']);
function splitNormalizedTagParts(normalizedTag: string): string[] {
if (!normalizedTag) {
return [];
@@ -128,57 +74,6 @@ function isExcludedByTagSet(normalizedTag: string, exclusions: ReadonlySet<strin
return parts.some((part) => exclusions.has(part));
}
function isExcludedFromSubtitleAnnotationsByPos1(normalizedPos1: string): boolean {
const parts = splitNormalizedTagParts(normalizedPos1);
if (parts.some((part) => SUBTITLE_ANNOTATION_EXCLUDED_POS1.has(part))) {
return true;
}
return parts.length > 0 && parts.every((part) => SUBTITLE_ANNOTATION_GRAMMAR_ONLY_POS1.has(part));
}
function isExcludedTrailingParticleMergedToken(token: MergedToken): boolean {
const normalizedSurface = normalizeJlptTextForExclusion(token.surface);
const normalizedHeadword = normalizeJlptTextForExclusion(token.headword);
if (
!normalizedSurface ||
!normalizedHeadword ||
!normalizedSurface.startsWith(normalizedHeadword)
) {
return false;
}
const suffix = normalizedSurface.slice(normalizedHeadword.length);
if (!SUBTITLE_ANNOTATION_EXCLUDED_TRAILING_PARTICLE_SUFFIXES.has(suffix)) {
return false;
}
const pos1Parts = splitNormalizedTagParts(normalizePos1Tag(token.pos1));
if (pos1Parts.length < 2) {
return false;
}
const [leadingPos1, ...trailingPos1] = pos1Parts;
if (!leadingPos1 || SUBTITLE_ANNOTATION_GRAMMAR_ONLY_POS1.has(leadingPos1)) {
return false;
}
return trailingPos1.length > 0 && trailingPos1.every((part) => part === '助詞');
}
function isAuxiliaryStemGrammarTailToken(token: MergedToken): boolean {
const pos1Parts = splitNormalizedTagParts(normalizePos1Tag(token.pos1));
if (
pos1Parts.length === 0 ||
!pos1Parts.every((part) => AUXILIARY_STEM_GRAMMAR_TAIL_POS1.has(part))
) {
return false;
}
const pos3Parts = splitNormalizedTagParts(normalizePos2Tag(token.pos3));
return pos3Parts.includes('助動詞語幹');
}
function resolvePos1Exclusions(options: AnnotationStageOptions): ReadonlySet<string> {
if (options.pos1Exclusions) {
return options.pos1Exclusions;
@@ -254,6 +149,45 @@ function shouldAllowContentLedMergedTokenFrequency(
return true;
}
function shouldAllowOrdinalPrefixNounFrequency(token: MergedToken): boolean {
const normalizedSurface = token.surface.trim();
const normalizedHeadword = token.headword.trim();
if (!normalizedSurface.startsWith('第') && !normalizedHeadword.startsWith('第')) {
return false;
}
const pos1Parts = splitNormalizedTagParts(normalizePos1Tag(token.pos1));
const pos2Parts = splitNormalizedTagParts(normalizePos2Tag(token.pos2));
return (
pos1Parts.length >= 2 &&
pos1Parts[0] === '接頭詞' &&
pos1Parts.slice(1).some((part) => part === '名詞') &&
pos2Parts[0] === '数接続' &&
pos2Parts.slice(1).some((part) => part === '数')
);
}
function shouldAllowHonorificPrefixNounFrequency(token: MergedToken): boolean {
const normalizedSurface = token.surface.trim();
const normalizedHeadword = token.headword.trim();
if (
!['お', 'ご', '御'].some(
(prefix) => normalizedSurface.startsWith(prefix) || normalizedHeadword.startsWith(prefix),
)
) {
return false;
}
const pos1Parts = splitNormalizedTagParts(normalizePos1Tag(token.pos1));
const pos2Parts = splitNormalizedTagParts(normalizePos2Tag(token.pos2));
return (
pos1Parts.length >= 2 &&
pos1Parts[0] === '接頭詞' &&
pos1Parts.slice(1).some((part) => part === '名詞') &&
pos2Parts[0] === '名詞接続'
);
}
function isFrequencyExcludedByPos(
token: MergedToken,
pos1Exclusions: ReadonlySet<string>,
@@ -273,12 +207,24 @@ function isFrequencyExcludedByPos(
pos1Exclusions,
pos2Exclusions,
);
const allowOrdinalPrefixNounToken = shouldAllowOrdinalPrefixNounFrequency(token);
const allowHonorificPrefixNounToken = shouldAllowHonorificPrefixNounFrequency(token);
if (isExcludedByTagSet(normalizedPos1, pos1Exclusions) && !allowContentLedMergedToken) {
if (
isExcludedByTagSet(normalizedPos1, pos1Exclusions) &&
!allowContentLedMergedToken &&
!allowOrdinalPrefixNounToken &&
!allowHonorificPrefixNounToken
) {
return true;
}
if (isExcludedByTagSet(normalizedPos2, pos2Exclusions) && !allowContentLedMergedToken) {
if (
isExcludedByTagSet(normalizedPos2, pos2Exclusions) &&
!allowContentLedMergedToken &&
!allowOrdinalPrefixNounToken &&
!allowHonorificPrefixNounToken
) {
return true;
}
@@ -608,50 +554,15 @@ function isJlptEligibleToken(token: MergedToken): boolean {
return true;
}
function isExcludedFromSubtitleAnnotationsByTerm(token: MergedToken): boolean {
const candidates = [token.surface, token.reading, resolveJlptLookupText(token)].filter(
(candidate): candidate is string => typeof candidate === 'string' && candidate.length > 0,
);
for (const candidate of candidates) {
const trimmedCandidate = candidate.trim();
if (!trimmedCandidate) {
continue;
}
const normalizedCandidate = normalizeJlptTextForExclusion(trimmedCandidate);
if (!normalizedCandidate) {
continue;
}
if (
SUBTITLE_ANNOTATION_EXCLUDED_TERMS.has(trimmedCandidate) ||
SUBTITLE_ANNOTATION_EXCLUDED_TERMS.has(normalizedCandidate) ||
SUBTITLE_ANNOTATION_EXCLUDED_EXPLANATORY_ENDINGS.has(trimmedCandidate) ||
SUBTITLE_ANNOTATION_EXCLUDED_EXPLANATORY_ENDINGS.has(normalizedCandidate)
) {
return true;
}
if (
isTrailingSmallTsuKanaSfx(trimmedCandidate) ||
isTrailingSmallTsuKanaSfx(normalizedCandidate) ||
isReduplicatedKanaSfxWithOptionalTrailingTo(trimmedCandidate) ||
isReduplicatedKanaSfxWithOptionalTrailingTo(normalizedCandidate)
) {
return true;
}
}
return false;
}
export function shouldExcludeTokenFromSubtitleAnnotations(token: MergedToken): boolean {
return sharedShouldExcludeTokenFromSubtitleAnnotations(token);
}
export function stripSubtitleAnnotationMetadata(token: MergedToken): MergedToken {
return sharedStripSubtitleAnnotationMetadata(token);
export function stripSubtitleAnnotationMetadata(
token: MergedToken,
options: AnnotationStageOptions = {},
): MergedToken {
return sharedStripSubtitleAnnotationMetadata(token, options);
}
function computeTokenKnownStatus(
@@ -734,10 +645,14 @@ export function annotateTokens(
pos2Exclusions,
})
) {
return sharedStripSubtitleAnnotationMetadata(token, {
const strippedToken = sharedStripSubtitleAnnotationMetadata(token, {
pos1Exclusions,
pos2Exclusions,
});
return {
...strippedToken,
isKnown: false,
};
}
const prioritizedNameMatch = nameMatchEnabled && token.isNameMatch === true;
@@ -781,6 +696,7 @@ export function annotateTokens(
sanitizedMinSentenceWordsForNPlusOne,
pos1Exclusions,
pos2Exclusions,
options.sourceText,
);
if (!nameMatchEnabled) {
@@ -0,0 +1,124 @@
const KATAKANA_TO_HIRAGANA_OFFSET = 0x60;
const KATAKANA_CODEPOINT_START = 0x30a1;
const KATAKANA_CODEPOINT_END = 0x30f6;
const SENTENCE_FINAL_PARTICLE_SUFFIXES = ['', 'か', 'ね', 'よ', 'な', 'わ'] as const;
const EXPLANATORY_ENDING_PREFIXES = ['ん', 'の', 'なん', 'なの'] as const;
const EXPLANATORY_ENDING_CORES = [
'だ',
'です',
'でした',
'だった',
'では',
'じゃ',
'でしょう',
'だろう',
] as const;
const EXPLANATORY_ENDING_TRAILING_PARTICLES = [
'',
'か',
'ね',
'よ',
'な',
'けど',
'よね',
'かな',
'かね',
] as const;
const EXPLANATORY_ENDING_THOUGHT_SUFFIXES = ['か', 'かな', 'かね'] as const;
const NEGATIVE_COPULA_PREFIXES = ['じゃ', 'では'] as const;
export function normalizeGrammarEndingText(text: string): string {
const raw = text.trim();
if (!raw) {
return '';
}
let normalized = '';
for (const char of raw) {
const code = char.codePointAt(0);
if (code === undefined) {
continue;
}
if (code >= KATAKANA_CODEPOINT_START && code <= KATAKANA_CODEPOINT_END) {
normalized += String.fromCodePoint(code - KATAKANA_TO_HIRAGANA_OFFSET);
continue;
}
normalized += char;
}
return normalized;
}
function matchesSuffix(text: string, suffixes: readonly string[]): boolean {
return suffixes.some((suffix) => text === suffix);
}
function matchesPoliteCopulaEnding(text: string): boolean {
if (!text.startsWith('です')) {
return false;
}
return matchesSuffix(text.slice('です'.length), SENTENCE_FINAL_PARTICLE_SUFFIXES);
}
function matchesNegativeCopulaEnding(text: string): boolean {
for (const prefix of NEGATIVE_COPULA_PREFIXES) {
const negativeStem = `${prefix}ない`;
if (!text.startsWith(negativeStem)) {
continue;
}
const suffix = text.slice(negativeStem.length);
return (
matchesSuffix(suffix, SENTENCE_FINAL_PARTICLE_SUFFIXES) || matchesPoliteCopulaEnding(suffix)
);
}
return false;
}
function matchesExplanatoryEnding(text: string): boolean {
for (const prefix of EXPLANATORY_ENDING_PREFIXES) {
if (EXPLANATORY_ENDING_THOUGHT_SUFFIXES.some((suffix) => text === `${prefix}${suffix}`)) {
return true;
}
if (!text.startsWith(prefix)) {
continue;
}
const suffix = text.slice(prefix.length);
for (const core of EXPLANATORY_ENDING_CORES) {
if (!suffix.startsWith(core)) {
continue;
}
if (matchesSuffix(suffix.slice(core.length), EXPLANATORY_ENDING_TRAILING_PARTICLES)) {
return true;
}
}
}
return false;
}
export function isStandaloneGrammarEndingText(text: string): boolean {
const normalized = normalizeGrammarEndingText(text);
if (!normalized) {
return false;
}
return matchesPoliteCopulaEnding(normalized) || matchesNegativeCopulaEnding(normalized);
}
export function isSubtitleGrammarEndingText(text: string): boolean {
const normalized = normalizeGrammarEndingText(text);
if (!normalized) {
return false;
}
return isStandaloneGrammarEndingText(normalized) || matchesExplanatoryEnding(normalized);
}
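A condensed sketch of how the grammar-ending helpers above behave, assuming only the katakana range and the sentence-final particle list shown in the new file. The names `toHiragana` and `isPoliteCopulaEnding` are illustrative, not the module's exports.

```typescript
// Same particle list as SENTENCE_FINAL_PARTICLE_SUFFIXES in the diff.
const SENTENCE_FINAL: readonly string[] = ['', 'か', 'ね', 'よ', 'な', 'わ'];

// Shift katakana (U+30A1..U+30F6) down by 0x60 to the matching hiragana.
function toHiragana(text: string): string {
  let out = '';
  for (const char of text.trim()) {
    const code = char.codePointAt(0);
    if (code === undefined) continue;
    out += code >= 0x30a1 && code <= 0x30f6 ? String.fromCodePoint(code - 0x60) : char;
  }
  return out;
}

// True only when the whole text is です plus one allowed sentence-final particle.
function isPoliteCopulaEnding(text: string): boolean {
  const normalized = toHiragana(text);
  if (!normalized.startsWith('です')) return false;
  return SENTENCE_FINAL.includes(normalized.slice('です'.length));
}

console.log(isPoliteCopulaEnding('デスネ'));
console.log(isPoliteCopulaEnding('猫です'));
```

Because the match is anchored at the start of the text and the remainder must equal a listed suffix exactly, content words that merely end in です are not treated as standalone grammar endings.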
@@ -39,6 +39,33 @@ test('enrichTokensWithMecabPos1 fills missing pos1 using surface-sequence fallba
assert.equal(enriched[0]?.pos1, '助詞');
});
test('enrichTokensWithMecabPos1 backfills blank pos2 and pos3 fields', () => {
const tokens = [
makeToken({
surface: 'は',
startPos: 0,
endPos: 1,
pos1: '助詞',
pos2: '',
pos3: ' ',
}),
];
const mecabTokens = [
makeToken({
surface: 'は',
startPos: 0,
endPos: 1,
pos1: '助詞',
pos2: '係助詞',
pos3: '一般',
}),
];
const enriched = enrichTokensWithMecabPos1(tokens, mecabTokens);
assert.equal(enriched[0]?.pos2, '係助詞');
assert.equal(enriched[0]?.pos3, '一般');
});
test('enrichTokensWithMecabPos1 keeps partOfSpeech unchanged and only enriches POS tags', () => {
const tokens = [makeToken({ surface: 'これは', startPos: 0, endPos: 3 })];
const mecabTokens = [
@@ -120,6 +120,13 @@ function lowerBoundByIndex(candidates: IndexedMecabToken[], targetIndex: number)
return low;
}
function coalesceMissingPosField(
current: string | undefined,
fallback: string | undefined,
): string | undefined {
return typeof current === 'string' && current.trim().length > 0 ? current : fallback;
}
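A small sketch of the fill-only-when-blank behavior this helper introduces. `fillIfBlank` and the `PosFields` shape are hypothetical names used for illustration; the logic mirrors `coalesceMissingPosField` as shown in the diff.

```typescript
interface PosFields {
  pos1?: string;
  pos2?: string;
  pos3?: string;
}

// Keep the current value unless it is undefined, empty, or whitespace-only.
function fillIfBlank(current: string | undefined, fallback: string | undefined): string | undefined {
  return typeof current === 'string' && current.trim().length > 0 ? current : fallback;
}

// Merge MeCab-style metadata into a token without overwriting existing tags.
function enrich(token: PosFields, mecab: PosFields): PosFields {
  return {
    pos1: fillIfBlank(token.pos1, mecab.pos1),
    pos2: fillIfBlank(token.pos2, mecab.pos2),
    pos3: fillIfBlank(token.pos3, mecab.pos3),
  };
}

const enriched = enrich(
  { pos1: '助詞', pos2: '', pos3: ' ' },
  { pos1: '名詞', pos2: '係助詞', pos3: '一般' },
);
console.log(enriched);
```

Here the existing `pos1` survives even though the fallback source disagrees, while the blank `pos2` and `pos3` are backfilled, which matches the regression test added for `enrichTokensWithMecabPos1`.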
function joinUniqueTags(values: Array<string | undefined>): string | undefined {
const unique: string[] = [];
for (const value of values) {
@@ -303,7 +310,8 @@ function fillMissingPos1BySurfaceSequence(
let cursor = 0;
return tokens.map((token) => {
if (token.pos1 && token.pos1.trim().length > 0) {
const hasCompletePosMetadata = token.pos1?.trim() && token.pos2?.trim() && token.pos3?.trim();
if (hasCompletePosMetadata) {
return token;
}
@@ -327,9 +335,9 @@ function fillMissingPos1BySurfaceSequence(
cursor = best.index + 1;
return {
...token,
pos1: best.pos1,
pos2: best.pos2,
pos3: best.pos3,
pos1: coalesceMissingPosField(token.pos1, best.pos1),
pos2: coalesceMissingPosField(token.pos2, best.pos2),
pos3: coalesceMissingPosField(token.pos3, best.pos3),
};
});
}
@@ -382,7 +390,7 @@ export function enrichTokensWithMecabPos1(
const metadataByTokenIndex = new Map<number, MecabPosMetadata>();
for (const [index, token] of tokens.entries()) {
if (token.pos1) {
if (token.pos1?.trim() && token.pos2?.trim() && token.pos3?.trim()) {
continue;
}
@@ -410,9 +418,9 @@ export function enrichTokensWithMecabPos1(
return {
...token,
pos1: metadata.pos1,
pos2: metadata.pos2,
pos3: metadata.pos3,
pos1: coalesceMissingPosField(token.pos1, metadata.pos1),
pos2: coalesceMissingPosField(token.pos2, metadata.pos2),
pos3: coalesceMissingPosField(token.pos3, metadata.pos3),
};
});
@@ -155,7 +155,7 @@ test('prefers the longest dictionary headword across merged segments', () => {
);
});
test('keeps the first headword when later segments are standalone words', () => {
test('splits trailing grammar endings when later segments are standalone words', () => {
const parseResults = [
makeParseItem('scanning-parser', [
[
@@ -174,10 +174,111 @@ test('keeps the first headword when later segments are standalone words', () =>
})),
[
{
surface: '猫です',
reading: 'ねこです',
surface: '猫',
reading: 'ねこ',
headword: '猫',
},
{
surface: 'です',
reading: 'です',
headword: 'です',
},
],
);
});
test('keeps preceding reading when standalone grammar ending has empty reading', () => {
const parseResults = [
makeParseItem('scanning-parser', [
[
{ text: '猫', reading: 'ねこ', headword: '猫' },
{ text: 'です', reading: '', headword: 'です' },
],
]),
];
const tokens = selectYomitanParseTokens(parseResults, () => false, 'headword');
assert.deepEqual(
tokens?.map((token) => ({
surface: token.surface,
reading: token.reading,
headword: token.headword,
})),
[
{
surface: '猫',
reading: 'ねこ',
headword: '猫',
},
{
surface: 'です',
reading: '',
headword: 'です',
},
],
);
});
test('splits trailing ja-nai grammar endings from preceding content', () => {
const parseResults = [
makeParseItem('scanning-parser', [
[
{ text: 'いる', reading: 'いる', headword: 'いる' },
{ text: 'じゃない', reading: 'じゃない', headword: 'じゃない' },
],
]),
];
const tokens = selectYomitanParseTokens(parseResults, () => false, 'headword');
assert.deepEqual(
tokens?.map((token) => ({
surface: token.surface,
reading: token.reading,
headword: token.headword,
})),
[
{
surface: 'いる',
reading: 'いる',
headword: 'いる',
},
{
surface: 'じゃない',
reading: 'じゃない',
headword: 'じゃない',
},
],
);
});
test('splits trailing negative-copula grammar endings by pattern', () => {
const parseResults = [
makeParseItem('scanning-parser', [
[
{ text: '問題', reading: 'もんだい', headword: '問題' },
{ text: 'ではないですか', reading: 'ではないですか', headword: 'ない' },
],
]),
];
const tokens = selectYomitanParseTokens(parseResults, () => false, 'headword');
assert.deepEqual(
tokens?.map((token) => ({
surface: token.surface,
reading: token.reading,
headword: token.headword,
})),
[
{
surface: '問題',
reading: 'もんだい',
headword: '問題',
},
{
surface: 'ではないですか',
reading: 'ではないですか',
headword: 'ない',
},
],
);
});
@@ -1,4 +1,5 @@
import { MergedToken, NPlusOneMatchMode, PartOfSpeech } from '../../../types';
import { isStandaloneGrammarEndingText } from './grammar-ending';
interface YomitanParseHeadword {
term?: unknown;
@@ -141,6 +142,15 @@ function isKanaOnlyText(text: string): boolean {
return text.length > 0 && Array.from(text).every((char) => isKanaChar(char));
}
function isStandaloneGrammarEndingSegment(segment: YomitanParseSegment): boolean {
const surface = segment.text?.trim() ?? '';
const headword = extractYomitanHeadword(segment).trim();
return (
headword.length > 0 &&
(isStandaloneGrammarEndingText(surface) || isStandaloneGrammarEndingText(headword))
);
}
function shouldMergeKanaContinuation(
previousToken: MergedToken | undefined,
continuationSurface: string,
@@ -186,20 +196,97 @@ export function mapYomitanParseResultItemToMergedTokens(
let combinedSurface = '';
let combinedReading = '';
let combinedStart = charOffset;
let firstHeadword = '';
const expandedHeadwords: string[] = [];
const pushToken = (
surface: string,
reading: string,
headword: string,
start: number,
end: number,
): void => {
tokens.push({
surface,
reading,
headword,
startPos: start,
endPos: end,
partOfSpeech: PartOfSpeech.other,
pos1: '',
isMerged: true,
isNPlusOneTarget: false,
isKnown: (() => {
const matchText = resolveKnownWordText(surface, headword, knownWordMatchMode);
return matchText ? isKnownWord(matchText) : false;
})(),
});
};
const flushCombinedToken = (end: number): void => {
if (!combinedSurface) {
combinedStart = end;
return;
}
const combinedHeadword = selectMergedHeadword(
firstHeadword,
expandedHeadwords,
combinedSurface,
);
if (!combinedHeadword) {
const previousToken = tokens[tokens.length - 1];
if (shouldMergeKanaContinuation(previousToken, combinedSurface)) {
previousToken.surface += combinedSurface;
previousToken.reading += combinedReading;
previousToken.endPos = end;
}
} else {
hasDictionaryMatch = true;
pushToken(combinedSurface, combinedReading, combinedHeadword, combinedStart, end);
}
combinedSurface = '';
combinedReading = '';
firstHeadword = '';
expandedHeadwords.length = 0;
combinedStart = end;
};
for (const segment of line) {
const segmentText = segment.text;
if (!segmentText || segmentText.length === 0) {
continue;
}
const segmentStart = charOffset;
const segmentEnd = segmentStart + segmentText.length;
charOffset = segmentEnd;
combinedSurface += segmentText;
if (typeof segment.reading === 'string') {
combinedReading += segment.reading;
}
const segmentHeadword = extractYomitanHeadword(segment);
if (isStandaloneGrammarEndingSegment(segment)) {
combinedSurface = combinedSurface.slice(0, -segmentText.length);
if (typeof segment.reading === 'string' && segment.reading.length > 0) {
combinedReading = combinedReading.slice(0, -segment.reading.length);
}
flushCombinedToken(segmentStart);
const grammarHeadword = segmentHeadword || segmentText;
hasDictionaryMatch = true;
pushToken(
segmentText,
typeof segment.reading === 'string' ? segment.reading : '',
grammarHeadword,
segmentStart,
segmentEnd,
);
combinedStart = segmentEnd;
continue;
}
if (segmentHeadword) {
if (!firstHeadword) {
firstHeadword = segmentHeadword;
@@ -210,49 +297,7 @@ export function mapYomitanParseResultItemToMergedTokens(
}
}
if (!combinedSurface) {
continue;
}
const start = charOffset;
const end = start + combinedSurface.length;
charOffset = end;
const combinedHeadword = selectMergedHeadword(
firstHeadword,
expandedHeadwords,
combinedSurface,
);
if (!combinedHeadword) {
const previousToken = tokens[tokens.length - 1];
if (shouldMergeKanaContinuation(previousToken, combinedSurface)) {
previousToken.surface += combinedSurface;
previousToken.reading += combinedReading;
previousToken.endPos = end;
continue;
}
// No dictionary-backed headword for this merged unit; skip it entirely so
// downstream keyboard/frequency/JLPT flows only operate on lookup-backed tokens.
continue;
}
hasDictionaryMatch = true;
const headword = combinedHeadword;
tokens.push({
surface: combinedSurface,
reading: combinedReading,
headword,
startPos: start,
endPos: end,
partOfSpeech: PartOfSpeech.other,
pos1: '',
isMerged: true,
isNPlusOneTarget: false,
isKnown: (() => {
const matchText = resolveKnownWordText(combinedSurface, headword, knownWordMatchMode);
return matchText ? isKnownWord(matchText) : false;
})(),
});
flushCombinedToken(charOffset);
}
if (validLineCount === 0 || tokens.length === 0 || !hasDictionaryMatch) {
@@ -8,14 +8,21 @@ import {
} from '../../../token-pos2-exclusions';
import { MergedToken, PartOfSpeech } from '../../../types';
import { shouldIgnoreJlptByTerm } from '../jlpt-token-filter';
import { isSubtitleGrammarEndingText } from './grammar-ending';
const KATAKANA_TO_HIRAGANA_OFFSET = 0x60;
const KATAKANA_CODEPOINT_START = 0x30a1;
const KATAKANA_CODEPOINT_END = 0x30f6;
const STANDALONE_GRAMMAR_PARTICLE_PHRASES = ['たって', 'だって'] as const;
const STANDALONE_GRAMMAR_PARTICLE_PHRASES_SET: ReadonlySet<string> = new Set(
STANDALONE_GRAMMAR_PARTICLE_PHRASES,
);
export const SUBTITLE_ANNOTATION_EXCLUDED_TERMS = new Set([
'あ',
'ああ',
'ある',
'あなた',
'あんた',
'ええ',
@@ -25,6 +32,7 @@ export const SUBTITLE_ANNOTATION_EXCLUDED_TERMS = new Set([
'お前',
'こいつ',
'こっち',
'くれ',
'じゃない',
'そうだ',
'たち',
@@ -32,58 +40,27 @@ export const SUBTITLE_ANNOTATION_EXCLUDED_TERMS = new Set([
'どこか',
'なんか',
'べき',
'って',
'はあ',
'はぁ',
'はは',
'へえ',
'ふう',
'ほう',
'やはり',
'って',
'何か',
'何だ',
'何も',
'如何した',
'有る',
'在る',
'様',
'確かに',
'誰も',
'貴方',
'もんか',
'ものか',
]);
const SUBTITLE_ANNOTATION_EXCLUDED_EXPLANATORY_ENDING_PREFIXES = ['ん', 'の', 'なん', 'なの'];
const SUBTITLE_ANNOTATION_EXCLUDED_EXPLANATORY_ENDING_CORES = [
'だ',
'です',
'でした',
'だった',
'では',
'じゃ',
'でしょう',
'だろう',
] as const;
const SUBTITLE_ANNOTATION_EXCLUDED_EXPLANATORY_ENDING_TRAILING_PARTICLES = [
'',
'か',
'ね',
'よ',
'な',
'けど',
'よね',
'かな',
'かね',
] as const;
const SUBTITLE_ANNOTATION_EXCLUDED_EXPLANATORY_ENDING_THOUGHT_SUFFIXES = [
'か',
'かな',
'かね',
] as const;
const SUBTITLE_ANNOTATION_EXCLUDED_EXPLANATORY_ENDINGS = new Set(
SUBTITLE_ANNOTATION_EXCLUDED_EXPLANATORY_ENDING_PREFIXES.flatMap((prefix) =>
SUBTITLE_ANNOTATION_EXCLUDED_EXPLANATORY_ENDING_CORES.flatMap((core) =>
SUBTITLE_ANNOTATION_EXCLUDED_EXPLANATORY_ENDING_TRAILING_PARTICLES.map(
(particle) => `${prefix}${core}${particle}`,
),
),
),
);
const SUBTITLE_ANNOTATION_EXCLUDED_TRAILING_PARTICLE_SUFFIXES = new Set([
'って',
'ってよ',
@@ -95,7 +72,28 @@ const SUBTITLE_ANNOTATION_EXCLUDED_TRAILING_PARTICLE_SUFFIXES = new Set([
]);
const AUXILIARY_STEM_GRAMMAR_TAIL_POS1 = new Set(['名詞', '助動詞', '助詞']);
const NON_INDEPENDENT_NOUN_HELPER_TAIL_POS1 = new Set(['助詞', '助動詞']);
const AUXILIARY_INFLECTION_TRAILING_POS1 = new Set(['助動詞']);
const AUXILIARY_HELPER_SPAN_POS1 = new Set(['助詞', '助動詞', '動詞']);
const LEXICAL_VERB_POS2 = new Set(['自立']);
const STANDALONE_GRAMMAR_PARTICLE_SURFACES = new Set([
'か',
'が',
'さ',
'し',
'ぞ',
'ぜ',
'と',
'な',
'に',
'ね',
'の',
'は',
'へ',
'も',
'や',
'よ',
'を',
]);
export interface SubtitleAnnotationFilterOptions {
pos1Exclusions?: ReadonlySet<string>;
pos2Exclusions?: ReadonlySet<string>;
@@ -301,6 +299,99 @@ function isKanaOnlyNonIndependentNounHelperMerge(token: MergedToken): boolean {
return pos1Parts.slice(1).every((part) => NON_INDEPENDENT_NOUN_HELPER_TAIL_POS1.has(part));
}
function isKanaOnlyText(text: string): boolean {
const normalized = normalizeKana(text);
return normalized.length > 0 && [...normalized].every(isKanaChar);
}
function isLexicalKureruVerb(token: MergedToken): boolean {
const normalizedSurface = normalizeKana(token.surface);
const normalizedHeadword = normalizeKana(token.headword);
const pos1Parts = splitNormalizedTagParts(normalizePosTag(token.pos1));
const pos2Parts = splitNormalizedTagParts(normalizePosTag(token.pos2));
return (
normalizedSurface === 'くれ' &&
normalizedHeadword === 'くれる' &&
pos1Parts.length === 1 &&
pos1Parts[0] === '動詞' &&
pos2Parts.length === 1 &&
pos2Parts[0] === '自立'
);
}
function isStandaloneAuxiliaryInflectionFragment(token: MergedToken): boolean {
const normalizedSurface = normalizeKana(token.surface);
if (!isKanaOnlyText(normalizedSurface)) {
return false;
}
const pos1Parts = splitNormalizedTagParts(normalizePosTag(token.pos1));
if (pos1Parts.length === 0) {
return false;
}
if (pos1Parts.every((part) => part === '助動詞')) {
return true;
}
const pos2Parts = splitNormalizedTagParts(normalizePosTag(token.pos2));
return (
pos1Parts[0] === '動詞' &&
pos2Parts[0] === '接尾' &&
pos1Parts.slice(1).every((part) => AUXILIARY_INFLECTION_TRAILING_POS1.has(part))
);
}
function isAuxiliaryOnlyHelperSpan(token: MergedToken): boolean {
const normalizedSurface = normalizeKana(token.surface);
const normalizedHeadword = normalizeKana(token.headword);
if (!isKanaOnlyText(normalizedSurface) || !isKanaOnlyText(normalizedHeadword)) {
return false;
}
const pos1Parts = splitNormalizedTagParts(normalizePosTag(token.pos1));
if (
pos1Parts.length === 0 ||
!pos1Parts.every((part) => AUXILIARY_HELPER_SPAN_POS1.has(part)) ||
!pos1Parts.includes('助詞') ||
!pos1Parts.includes('動詞')
) {
return false;
}
const pos2Parts = splitNormalizedTagParts(normalizePosTag(token.pos2));
return !pos2Parts.some((part) => LEXICAL_VERB_POS2.has(part));
}
function isStandaloneSuruTeGrammarHelper(token: MergedToken): boolean {
const normalizedSurface = normalizeKana(token.surface);
const normalizedHeadword = normalizeKana(token.headword);
if (!normalizedSurface.startsWith('して') || normalizedHeadword !== 'する') {
return false;
}
const pos1Parts = splitNormalizedTagParts(normalizePosTag(token.pos1));
return (
isKanaOnlyText(normalizedSurface) && (pos1Parts.length === 0 || pos1Parts.includes('動詞'))
);
}
function isStandaloneGrammarParticle(token: MergedToken): boolean {
const normalizedSurface = normalizeKana(token.surface);
const normalizedHeadword = normalizeKana(token.headword);
return (
normalizedSurface === normalizedHeadword &&
(STANDALONE_GRAMMAR_PARTICLE_SURFACES.has(normalizedSurface) ||
STANDALONE_GRAMMAR_PARTICLE_PHRASES_SET.has(normalizedSurface))
);
}
function isSingleKanaSurfaceFragment(token: MergedToken): boolean {
const normalizedSurface = normalizeKana(token.surface);
const chars = [...normalizedSurface];
return chars.length === 1 && chars.every(isKanaChar);
}
function isExcludedByTerm(token: MergedToken): boolean {
const candidates = [token.surface, token.reading, token.headword].filter(
(candidate): candidate is string => typeof candidate === 'string' && candidate.length > 0,
@@ -317,21 +408,11 @@ function isExcludedByTerm(token: MergedToken): boolean {
continue;
}
if (
SUBTITLE_ANNOTATION_EXCLUDED_EXPLANATORY_ENDING_PREFIXES.some((prefix) =>
SUBTITLE_ANNOTATION_EXCLUDED_EXPLANATORY_ENDING_THOUGHT_SUFFIXES.some(
(suffix) => normalized === `${prefix}${suffix}`,
),
)
) {
return true;
}
if (
SUBTITLE_ANNOTATION_EXCLUDED_TERMS.has(trimmed) ||
SUBTITLE_ANNOTATION_EXCLUDED_TERMS.has(normalized) ||
SUBTITLE_ANNOTATION_EXCLUDED_EXPLANATORY_ENDINGS.has(trimmed) ||
SUBTITLE_ANNOTATION_EXCLUDED_EXPLANATORY_ENDINGS.has(normalized) ||
isSubtitleGrammarEndingText(trimmed) ||
isSubtitleGrammarEndingText(normalized) ||
shouldIgnoreJlptByTerm(trimmed) ||
shouldIgnoreJlptByTerm(normalized)
) {
@@ -388,10 +469,34 @@ export function shouldExcludeTokenFromSubtitleAnnotations(
return true;
}
if (isStandaloneAuxiliaryInflectionFragment(token)) {
return true;
}
if (isAuxiliaryOnlyHelperSpan(token)) {
return true;
}
if (isStandaloneSuruTeGrammarHelper(token)) {
return true;
}
if (isStandaloneGrammarParticle(token)) {
return true;
}
if (isSingleKanaSurfaceFragment(token)) {
return true;
}
if (isExcludedTrailingParticleMergedToken(token)) {
return true;
}
if (isLexicalKureruVerb(token)) {
return false;
}
return isExcludedByTerm(token);
}
@@ -405,7 +510,6 @@ export function stripSubtitleAnnotationMetadata(
return {
...token,
isKnown: false,
isNPlusOneTarget: false,
isNameMatch: false,
jlptLevel: undefined,
@@ -533,7 +533,7 @@ test('requestYomitanTermFrequencies caches repeated term+reading lookups', async
assert.equal(frequencyCalls, 1);
});
test('requestYomitanScanTokens uses left-to-right termsFind scanning instead of parseText', async () => {
test('requestYomitanScanTokens prefers parseText tokenization over termsFind fragments', async () => {
const scripts: string[] = [];
const deps = createDeps(async (script) => {
scripts.push(script);
@@ -549,6 +549,138 @@ test('requestYomitanScanTokens uses left-to-right termsFind scanning instead of
],
};
}
if (script.includes('parseText')) {
return [
{
source: 'scanning-parser',
index: 0,
content: [
[
{
text: '取り組んで',
reading: 'とりくんで',
headwords: [[{ term: '取り組む' }]],
},
],
],
},
];
}
return [
{
surface: '取り',
reading: 'とり',
headword: '取る',
startPos: 0,
endPos: 2,
},
{
surface: '組んで',
reading: 'くんで',
headword: '組む',
startPos: 2,
endPos: 5,
},
];
});
const result = await requestYomitanScanTokens('取り組んで', deps, {
error: () => undefined,
});
assert.deepEqual(result, [
{
surface: '取り組んで',
reading: 'とりくんで',
headword: '取り組む',
startPos: 0,
endPos: 5,
},
]);
assert.ok(scripts.some((script) => script.includes('parseText')));
assert.ok(scripts.some((script) => script.includes('termsFind')));
});
test('requestYomitanScanTokens keeps scanner metadata when parse spans agree', async () => {
const deps = createDeps(async (script) => {
if (script.includes('optionsGetFull')) {
return {
profileCurrent: 0,
profiles: [
{
options: {
scanning: { length: 40 },
},
},
],
};
}
if (script.includes('parseText')) {
return [
{
source: 'scanning-parser',
index: 0,
content: [
[
{
text: 'アクア',
reading: 'あくあ',
headwords: [[{ term: 'アクア' }]],
},
],
],
},
];
}
return [
{
surface: 'アクア',
reading: 'あくあ',
headword: 'アクア',
startPos: 0,
endPos: 3,
isNameMatch: true,
wordClasses: ['n'],
},
];
});
const result = await requestYomitanScanTokens('アクア', deps, {
error: () => undefined,
});
assert.deepEqual(result, [
{
surface: 'アクア',
reading: 'あくあ',
headword: 'アクア',
startPos: 0,
endPos: 3,
isNameMatch: true,
wordClasses: ['n'],
},
]);
});
test('requestYomitanScanTokens falls back to left-to-right termsFind scanning', async () => {
const scripts: string[] = [];
const deps = createDeps(async (script) => {
scripts.push(script);
if (script.includes('optionsGetFull')) {
return {
profileCurrent: 0,
profiles: [
{
options: {
scanning: { length: 40 },
},
},
],
};
}
if (script.includes('parseText')) {
return [];
}
return [
{
surface: 'カズマ',
@@ -573,6 +705,7 @@ test('requestYomitanScanTokens uses left-to-right termsFind scanning instead of
endPos: 3,
},
]);
assert.ok(scripts.some((script) => script.includes('parseText')));
const scannerScript = scripts.find((script) => script.includes('termsFind'));
assert.ok(scannerScript, 'expected termsFind scanning request script');
assert.doesNotMatch(scannerScript ?? '', /parseText/);
@@ -891,6 +1024,105 @@ test('requestYomitanScanTokens can use frequency from later exact secondary-matc
]);
});
test('requestYomitanScanTokens uses exact frequency entry when selected reading differs', async () => {
let scannerScript = '';
const deps = createDeps(async (script) => {
if (script.includes('termsFind')) {
scannerScript = script;
return [];
}
if (script.includes('optionsGetFull')) {
return {
profileCurrent: 0,
profileIndex: 0,
scanLength: 40,
dictionaries: ['JPDBv2㋕', 'Jiten', 'CC100'],
dictionaryPriorityByName: {
'JPDBv2㋕': 0,
Jiten: 1,
CC100: 2,
},
dictionaryFrequencyModeByName: {
'JPDBv2㋕': 'rank-based',
Jiten: 'rank-based',
CC100: 'rank-based',
},
profiles: [
{
options: {
scanning: { length: 40 },
dictionaries: [
{ name: 'JPDBv2㋕', enabled: true, id: 0 },
{ name: 'Jiten', enabled: true, id: 1 },
{ name: 'CC100', enabled: true, id: 2 },
],
},
},
],
};
}
return null;
});
await requestYomitanScanTokens('第二走者', deps, {
error: () => undefined,
});
const result = (await runInjectedYomitanScript(scannerScript, (action, params) => {
if (action !== 'termsFind') {
throw new Error(`unexpected action: ${action}`);
}
const text = (params as { text?: string } | undefined)?.text ?? '';
if (!text.startsWith('第二')) {
return { originalTextLength: 0, dictionaryEntries: [] };
}
return {
originalTextLength: 2,
dictionaryEntries: [
{
headwords: [
{
term: '第二',
reading: 'だいに',
sources: [{ originalText: '第二', isPrimary: true, matchType: 'exact' }],
},
],
frequencies: [],
},
{
headwords: [
{
term: '第二',
reading: '',
sources: [{ originalText: '第二', isPrimary: false, matchType: 'exact' }],
},
],
frequencies: [
{
headwordIndex: 0,
dictionary: 'JPDBv2㋕',
frequency: 189513,
displayValue: '1820,189513句',
},
],
},
],
};
})) as Array<Record<string, unknown>>;
assert.deepEqual(result?.[0], {
surface: '第二',
reading: 'だいに',
headword: '第二',
startPos: 0,
endPos: 2,
isNameMatch: false,
frequencyRank: 1820,
});
});
test('requestYomitanScanTokens marks tokens backed by SubMiner character dictionary entries', async () => {
const deps = createDeps(async (script) => {
if (script.includes('optionsGetFull')) {
@@ -1049,6 +1281,60 @@ test('requestYomitanScanTokens marks grouped entries when SubMiner dictionary al
assert.equal((result as Array<{ isNameMatch?: boolean }>)[0]?.isNameMatch, true);
});
test('requestYomitanScanTokens preserves matched headword word classes', async () => {
let scannerScript = '';
const deps = createDeps(async (script) => {
if (script.includes('termsFind')) {
scannerScript = script;
return [];
}
if (script.includes('optionsGetFull')) {
return {
profileCurrent: 0,
profiles: [
{
options: {
scanning: { length: 40 },
},
},
],
};
}
return null;
});
await requestYomitanScanTokens('は', deps, { error: () => undefined });
const result = await runInjectedYomitanScript(scannerScript, (action, params) => {
if (action !== 'termsFind') {
throw new Error(`unexpected action: ${action}`);
}
const text = (params as { text?: string } | undefined)?.text;
if (text !== 'は') {
return { originalTextLength: 0, dictionaryEntries: [] };
}
return {
originalTextLength: 1,
dictionaryEntries: [
{
headwords: [
{
term: 'は',
reading: 'は',
wordClasses: ['prt'],
sources: [{ originalText: 'は', isPrimary: true, matchType: 'exact' }],
},
],
},
],
};
});
assert.deepEqual((result as Array<{ wordClasses?: string[] }>)[0]?.wordClasses, ['prt']);
});
test('requestYomitanScanTokens skips fallback fragments without exact primary source matches', async () => {
const deps = createDeps(async (script) => {
if (script.includes('optionsGetFull')) {
@@ -53,6 +53,7 @@ export interface YomitanScanToken {
endPos: number;
isNameMatch?: boolean;
frequencyRank?: number;
wordClasses?: string[];
}
interface YomitanProfileMetadata {
@@ -91,11 +92,30 @@ function isScanTokenArray(value: unknown): value is YomitanScanToken[] {
typeof entry.startPos === 'number' &&
typeof entry.endPos === 'number' &&
(entry.isNameMatch === undefined || typeof entry.isNameMatch === 'boolean') &&
(entry.frequencyRank === undefined || typeof entry.frequencyRank === 'number'),
(entry.frequencyRank === undefined || typeof entry.frequencyRank === 'number') &&
(entry.wordClasses === undefined ||
(Array.isArray(entry.wordClasses) &&
entry.wordClasses.every((wordClass) => typeof wordClass === 'string'))),
)
);
}
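// True when both token lists segment the text identically: same token count,
// and matching surface, startPos, and endPos at every index.
function hasSameTokenSpans(left: YomitanScanToken[], right: YomitanScanToken[]): boolean {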
if (left.length !== right.length) {
return false;
}
return left.every((token, index) => {
const other = right[index];
return (
other !== undefined &&
token.surface === other.surface &&
token.startPos === other.startPos &&
token.endPos === other.endPos
);
});
}
function makeTermReadingCacheKey(term: string, reading: string | null): string {
return `${term}\u0000${reading ?? ''}`;
}
@@ -956,6 +976,9 @@ const YOMITAN_SCANNING_HELPERS = String.raw`
const matchReading = typeof match.headword?.reading === 'string' ? match.headword.reading : '';
const preferredReading =
typeof preferredMatch.headword?.reading === 'string' ? preferredMatch.headword.reading : '';
if (!matchReading || !preferredReading) {
return true;
}
return matchReading === preferredReading;
}
function getBestFrequencyRankForMatches(matches, dictionaryPriorityByName, dictionaryFrequencyModeByName) {
@@ -975,6 +998,11 @@ const YOMITAN_SCANNING_HELPERS = String.raw`
return best;
}
function getPreferredHeadword(dictionaryEntries, token, dictionaryPriorityByName, dictionaryFrequencyModeByName) {
/* Keep only non-empty string word classes; return undefined when none remain. */
function normalizeWordClasses(headword) {
if (!Array.isArray(headword?.wordClasses)) { return undefined; }
const classes = headword.wordClasses.filter((wordClass) => typeof wordClass === "string" && wordClass.trim().length > 0);
return classes.length > 0 ? classes : undefined;
}
function appendDictionaryNames(target, value) {
if (!value || typeof value !== 'object') {
return;
@@ -1033,6 +1061,7 @@ const YOMITAN_SCANNING_HELPERS = String.raw`
return {
term: preferredMatch.headword.term,
reading: preferredMatch.headword.reading,
wordClasses: normalizeWordClasses(preferredMatch.headword),
isNameMatch: matchedNameDictionary || isNameDictionaryEntry(preferredMatch.dictionaryEntry),
frequencyRank: getBestFrequencyRankForMatches(
exactFrequencyMatches.length > 0 ? exactFrequencyMatches : exactPrimaryMatches,
@@ -1099,7 +1128,7 @@ ${YOMITAN_SCANNING_HELPERS}
if (preferredHeadword && typeof preferredHeadword.term === "string") {
const reading = typeof preferredHeadword.reading === "string" ? preferredHeadword.reading : "";
const segments = distributeFuriganaInflected(preferredHeadword.term, reading, source);
tokens.push({
const tokenPayload = {
surface: segments.map((segment) => segment.text).join("") || source,
reading: segments.map((segment) => typeof segment.reading === "string" ? segment.reading : "").join(""),
headword: preferredHeadword.term,
@@ -1110,7 +1139,11 @@ ${YOMITAN_SCANNING_HELPERS}
typeof preferredHeadword.frequencyRank === "number" && Number.isFinite(preferredHeadword.frequencyRank)
? Math.max(1, Math.floor(preferredHeadword.frequencyRank))
: undefined,
});
};
if (Array.isArray(preferredHeadword.wordClasses) && preferredHeadword.wordClasses.length > 0) {
tokenPayload.wordClasses = preferredHeadword.wordClasses;
}
tokens.push(tokenPayload);
i += originalTextLength;
continue;
}
@@ -1235,6 +1268,17 @@ export async function requestYomitanScanTokens(
return null;
}
const parseResults = await requestYomitanParseResults(text, deps, logger);
const selectedParseTokens = selectYomitanParseTokens(parseResults, () => false, 'headword');
const parseScanTokens =
selectedParseTokens?.map((token) => ({
surface: token.surface,
reading: token.reading,
headword: token.headword,
startPos: token.startPos,
endPos: token.endPos,
})) ?? null;
const metadata = await requestYomitanProfileMetadata(parserWindow, logger);
const profileIndex = metadata?.profileIndex ?? 0;
const scanLength = metadata?.scanLength ?? DEFAULT_YOMITAN_SCAN_LENGTH;
@@ -1252,6 +1296,9 @@ export async function requestYomitanScanTokens(
true,
);
if (isScanTokenArray(rawResult)) {
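// Prefer parseText segmentation. Keep the scanner result only when its spans
// agree, because it alone carries isNameMatch, frequencyRank, and wordClasses.
if (parseScanTokens && parseScanTokens.length > 0) {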
return hasSameTokenSpans(parseScanTokens, rawResult) ? rawResult : parseScanTokens;
}
return rawResult;
}
if (Array.isArray(rawResult)) {
@@ -1266,8 +1313,14 @@ export async function requestYomitanScanTokens(
})) ?? null
);
}
if (parseScanTokens && parseScanTokens.length > 0) {
return parseScanTokens;
}
return null;
} catch (err) {
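// Scanner request failed: fall back to parseText tokens (no scanner metadata)
// when available. Note that the error is not logged on this path.
if (parseScanTokens && parseScanTokens.length > 0) {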
return parseScanTokens;
}
logger.error('Yomitan scanner request failed:', (err as Error).message);
return null;
}