Overlay 2.0 (#12)

This commit is contained in:
2026-03-01 02:36:51 -08:00
committed by GitHub
parent 45df3c466b
commit 44c7761c7c
397 changed files with 15139 additions and 7127 deletions

View File

@@ -6,11 +6,14 @@ SubMiner stores immersion analytics in local SQLite (`immersion.sqlite`) by defa
- Write path is asynchronous and queue-backed.
- Hot paths (subtitle parsing/render/token flows) enqueue telemetry/events and never await SQLite writes.
- Background line processing also upserts to `imm_words` and `imm_kanji`.
- Queue overflow policy is deterministic: drop oldest queued writes, keep newest.
- Flush policy defaults to `25` writes or `500ms` max delay.
- SQLite pragmas: `journal_mode=WAL`, `synchronous=NORMAL`, `foreign_keys=ON`, `busy_timeout=2500`.
- Rollups now run incrementally from the last processed telemetry sample; startup performs a one-time bootstrap rebuild-equivalent pass.
- If retention pruning removes telemetry/session rows, maintenance triggers a full rollup rebuild to resync historical aggregates.
## Schema (v1)
## Schema (v3)
Schema versioning table:
@@ -18,15 +21,21 @@ Schema versioning table:
Core entities:
- `imm_videos`: video key/title/source metadata + optional media metadata fields
- `imm_sessions`: session UUID, video reference, timing/status fields
- `imm_session_telemetry`: high-frequency session aggregates over time
- `imm_session_events`: event stream with compact numeric event types
- `imm_videos`: video key/title/source metadata + optional media metadata fields, `CREATED_DATE`/`LAST_UPDATE_DATE`
- `imm_sessions`: session UUID, video reference, timing/status fields, `CREATED_DATE`/`LAST_UPDATE_DATE`
- `imm_session_telemetry`: high-frequency session aggregates over time, `CREATED_DATE`/`LAST_UPDATE_DATE`
- `imm_session_events`: event stream with compact numeric event types, `CREATED_DATE`/`LAST_UPDATE_DATE`
Rollups:
- `imm_daily_rollups`
- `imm_monthly_rollups`
- `imm_daily_rollups`: includes `CREATED_DATE`/`LAST_UPDATE_DATE`
- `imm_monthly_rollups`: includes `CREATED_DATE`/`LAST_UPDATE_DATE`
Vocabulary:
- `imm_words(id, headword, word, reading, first_seen, last_seen, frequency)`
- `imm_kanji(id, kanji, first_seen, last_seen, frequency)`
- `first_seen`/`last_seen` store Unix timestamps and are upserted with line ingestion
Primary index coverage:
@@ -147,4 +156,3 @@ FROM imm_monthly_rollups
ORDER BY rollup_month DESC, video_id DESC
LIMIT ?;
```