mirror of
https://github.com/ksyasuda/SubMiner.git
synced 2026-02-27 18:22:41 -08:00
Decompose the immersion tracker facade into focused storage/session/metadata collaborators with dedicated tests and updated ownership docs while preserving runtime behavior.
4.4 KiB
4.4 KiB
Immersion Tracking Storage
SubMiner stores immersion analytics in local SQLite (immersion.sqlite) by default.
Runtime Model
- Write path is asynchronous and queue-backed.
- Hot paths (subtitle parsing/render/token flows) enqueue telemetry/events and never await SQLite writes.
- Queue overflow policy is deterministic: drop oldest queued writes, keep newest.
- Flush policy defaults to
25writes or500msmax delay. - SQLite pragmas:
journal_mode=WAL,synchronous=NORMAL,foreign_keys=ON,busy_timeout=2500.
Schema (v1)
Schema versioning table:
imm_schema_version(schema_version PK, applied_at_ms)
Core entities:
imm_videos: video key/title/source metadata + optional media metadata fieldsimm_sessions: session UUID, video reference, timing/status fieldsimm_session_telemetry: high-frequency session aggregates over timeimm_session_events: event stream with compact numeric event types
Rollups:
imm_daily_rollupsimm_monthly_rollups
Primary index coverage:
- session-by-video/time:
idx_sessions_video_started - session-by-status/time:
idx_sessions_status_started - timeline reads:
idx_telemetry_session_sample - event timeline/type reads:
idx_events_session_ts,idx_events_type_ts - rollup reads:
idx_rollups_day_video,idx_rollups_month_video
Ownership boundaries:
src/core/services/immersion-tracker-service.ts: orchestration facade (queueing, flush cadence, runtime event/session coordination).src/core/services/immersion-tracker/storage.ts: schema bootstrap, prepared statement construction, and DB record writes/updates.src/core/services/immersion-tracker/session.ts: session row lifecycle transitions (start/finalize).src/core/services/immersion-tracker/metadata.ts: local media metadata probing (ffprobe, sha256, parsed metadata shaping).
Retention and Maintenance Defaults
- Raw events:
7d - Telemetry:
30d - Daily rollups:
365d - Monthly rollups:
5y - Maintenance cadence: startup + every
24h - Vacuum cadence: idle weekly (
7dminimum spacing)
Retention cleanup and rollup refresh stay in service maintenance orchestration + src/core/services/immersion-tracker/maintenance.ts.
Configurable Policy Knobs
All knobs are under immersionTracking in config:
batchSizeflushIntervalMsqueueCappayloadCapBytesmaintenanceIntervalMsretention.eventsDaysretention.telemetryDaysretention.dailyRollupsDaysretention.monthlyRollupsDaysretention.vacuumIntervalDays
These map directly to runtime tracker policy and allow tuning without code changes.
Query Templates
Timeline for one session:
SELECT
sample_ms,
total_watched_ms,
active_watched_ms,
lines_seen,
words_seen,
tokens_seen,
cards_mined
FROM imm_session_telemetry
WHERE session_id = ?
ORDER BY sample_ms DESC
LIMIT ?;
Session throughput summary:
SELECT
s.session_id,
s.video_id,
s.started_at_ms,
s.ended_at_ms,
COALESCE(SUM(t.active_watched_ms), 0) AS active_watched_ms,
COALESCE(SUM(t.words_seen), 0) AS words_seen,
COALESCE(SUM(t.cards_mined), 0) AS cards_mined,
CASE
WHEN COALESCE(SUM(t.active_watched_ms), 0) > 0
THEN COALESCE(SUM(t.words_seen), 0) / (COALESCE(SUM(t.active_watched_ms), 0) / 60000.0)
ELSE NULL
END AS words_per_min,
CASE
WHEN COALESCE(SUM(t.active_watched_ms), 0) > 0
THEN (COALESCE(SUM(t.cards_mined), 0) * 60.0) / (COALESCE(SUM(t.active_watched_ms), 0) / 60000.0)
ELSE NULL
END AS cards_per_hour
FROM imm_sessions s
LEFT JOIN imm_session_telemetry t ON t.session_id = s.session_id
GROUP BY s.session_id
ORDER BY s.started_at_ms DESC
LIMIT ?;
Daily rollups:
SELECT
rollup_day,
video_id,
total_sessions,
total_active_min,
total_lines_seen,
total_words_seen,
total_tokens_seen,
total_cards,
cards_per_hour,
words_per_min,
lookup_hit_rate
FROM imm_daily_rollups
ORDER BY rollup_day DESC, video_id DESC
LIMIT ?;
Monthly rollups:
SELECT
rollup_month,
video_id,
total_sessions,
total_active_min,
total_lines_seen,
total_words_seen,
total_tokens_seen,
total_cards
FROM imm_monthly_rollups
ORDER BY rollup_month DESC, video_id DESC
LIMIT ?;
Extension Points
- Adapter boundary for non-SQLite backends is tracked in
TASK-32. - Keep analytics/query callers bound to tracker service methods (not raw table assumptions) so persistence adapters can swap in later.