mirror of
https://github.com/ksyasuda/SubMiner.git
synced 2026-03-20 12:11:28 -07:00
docs: refresh immersion and stats documentation
@@ -1180,12 +1180,20 @@ Enable or disable local immersion analytics stored in SQLite for mined subtitles
"queueCap": 1000,
"payloadCapBytes": 256,
"maintenanceIntervalMs": 86400000,
"retentionMode": "preset",
"retentionPreset": "balanced",
"retention": {
"eventsDays": 7,
"telemetryDays": 30,
"dailyRollupsDays": 365,
"monthlyRollupsDays": 1825,
"vacuumIntervalDays": 7
"eventsDays": 0,
"telemetryDays": 0,
"sessionsDays": 0,
"dailyRollupsDays": 0,
"monthlyRollupsDays": 0,
"vacuumIntervalDays": 0
},
"lifetimeSummaries": {
"global": true,
"anime": true,
"media": true
}
}
}
@@ -1200,11 +1208,16 @@ Enable or disable local immersion analytics stored in SQLite for mined subtitles
| `queueCap` | integer (`100`-`100000`) | In-memory queue cap. Overflow drops oldest writes. Default `1000`. |
| `payloadCapBytes` | integer (`64`-`8192`) | Event payload byte cap before truncation marker. Default `256`. |
| `maintenanceIntervalMs` | integer (`60000`-`604800000`) | Prune + rollup maintenance cadence. Default `86400000` (24h). |
| `retention.eventsDays` | integer (`1`-`3650`) | Raw event retention window. Default `7` days. |
| `retention.telemetryDays` | integer (`1`-`3650`) | Telemetry retention window. Default `30` days. |
| `retention.dailyRollupsDays` | integer (`1`-`36500`) | Daily rollup retention window. Default `365` days. |
| `retention.monthlyRollupsDays` | integer (`1`-`36500`) | Monthly rollup retention window. Default `1825` days (~5 years). |
| `retention.vacuumIntervalDays` | integer (`1`-`3650`) | Minimum spacing between `VACUUM` passes. Default `7` days. |
| `retentionMode` | `preset`,`advanced` | Retention mode. `preset` applies `retentionPreset`, `advanced` uses explicit values only. Default `preset`. |
| `retentionPreset` | `minimal`,`balanced`,`deep-history` | Retention preset used when `retentionMode = "preset"`. Default `balanced`. |
| `retention.eventsDays` | integer (`0`-`3650`) | Raw event retention window in days. Default `0` (keep all). |
| `retention.telemetryDays` | integer (`0`-`3650`) | Telemetry retention window in days. Default `0` (keep all). |
| `retention.sessionsDays` | integer (`0`-`3650`) | Session retention window in days. Default `0` (keep all). |
| `retention.dailyRollupsDays` | integer (`0`-`36500`) | Daily rollup retention window. Default `0` (keep all). |
| `retention.monthlyRollupsDays` | integer (`0`-`36500`) | Monthly rollup retention window. Default `0` (keep all). |
| `retention.vacuumIntervalDays` | integer (`0`-`3650`) | Minimum spacing between `VACUUM` passes. `0` disables vacuum. Default `0` (disabled). |

Default behavior keeps raw events, telemetry, sessions, and rollups forever while still maintaining lifetime summary tables and daily/monthly rollups for faster reads. If you later want bounded retention, switch `retentionMode` or set explicit `retention.*` values.
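
The `0` convention for the `retention.*` values can be read as "skip pruning for this table". The following is a minimal Python sketch of that rule, not SubMiner's actual pruning code (which this diff does not show). The helper name `retention_cutoff_ms` and the constant `MS_PER_DAY` are hypothetical.

```python
from typing import Optional

# One day in milliseconds (the same number as the default maintenanceIntervalMs).
MS_PER_DAY = 86_400_000


def retention_cutoff_ms(now_ms: int, retention_days: int) -> Optional[int]:
    """Return the oldest timestamp (ms) that should be kept.

    Hypothetical helper: returns None when retention_days is 0, meaning
    "keep all" and therefore nothing is eligible for pruning.
    """
    if retention_days < 0:
        raise ValueError("retention_days must be >= 0")
    if retention_days == 0:
        return None  # 0 means keep everything
    return now_ms - retention_days * MS_PER_DAY
```

Under this reading, a maintenance pass would delete rows older than the returned cutoff and skip the table entirely when the result is `None`.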

When `dbPath` is blank or omitted, SubMiner writes telemetry and session summaries to the default app-data location:


@@ -2,7 +2,7 @@

SubMiner can log your watching and mining activity to a local SQLite database, then surface it in the built-in stats dashboard. Tracking is enabled by default and can be turned off if you do not want local analytics.

When enabled, SubMiner records per-session statistics (watch time, subtitle lines seen, words encountered, cards mined) and maintains daily and monthly rollups. You can view that data in SubMiner's stats UI or query the database directly with any SQLite tool.
When enabled, SubMiner records per-session statistics (watch time, subtitle lines seen, words encountered, cards mined) and maintains exact lifetime summary tables plus daily/monthly rollups. You can view that data in SubMiner's stats UI or query the database directly with any SQLite tool.

## Enabling

@@ -74,16 +74,35 @@ The Vocabulary tab toolbar includes an **Exclusions** button for hiding words fr

## Retention Defaults

Data is kept for the following durations before automatic cleanup:
By default, SubMiner keeps all retention tables and raw data (`0` means keep all) while continuing daily/monthly rollup maintenance:

| Data type | Retention |
| -------------- | --------- |
| Raw events | 7 days |
| Telemetry | 30 days |
| Daily rollups | 1 year |
| Monthly rollups | 5 years |
| Raw events | 0 (keep all) |
| Telemetry | 0 (keep all) |
| Sessions | 0 (keep all) |
| Daily rollups | 0 (keep all) |
| Monthly rollups | 0 (keep all) |

Maintenance runs on startup and every 24 hours. Vacuum runs weekly.
Maintenance runs on startup and every 24 hours. Vacuum runs only when `retention.vacuumIntervalDays` is non-zero.

In practice:

- Overview totals read from lifetime summary tables, so all-time watch time/cards/words stay exact even if raw query paths evolve.
- Anime and episode pages keep lifetime totals from summary tables while session drill-down still reads retained sessions directly. With the current defaults, both are kept forever.
- Trends can read the full available history because daily/monthly rollups are also kept forever by default.
- Vocabulary and kanji totals are cumulative and not bounded by the raw session retention knobs.

## Storage / Performance Model

The tracker is optimized for "keep everything" defaults:

- Exact all-time totals live in dedicated lifetime summary tables (`imm_lifetime_global`, `imm_lifetime_anime`, `imm_lifetime_media`).
- Ended-session totals are persisted onto `imm_sessions`, so most dashboard reads do not need to rescan raw telemetry.
- Daily and monthly rollups remain available for chart queries and coarse trend views.
- Subtitle text is stored once in `imm_subtitle_lines`; subtitle-line event payloads keep compact metadata only.
- Cover-art binaries are deduplicated through a shared blob store so episodes in the same series do not each carry duplicate image bytes.
- Hot tables have dedicated indexes for session time ranges, telemetry sample windows, frequency-ranked vocabulary, and cover-art lookup keys.
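
To illustrate the cover-art point in the list above, here is a minimal Python sketch of content-addressed blob deduplication. It is an illustration under stated assumptions, not SubMiner's implementation. The table names match the ones this commit documents, but the column names (`blob_hash`, `image`, `video_id`), the choice of SHA-256, and the helpers `create_schema` and `store_cover` are hypothetical.

```python
import hashlib
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    # Hypothetical minimal columns; the real schema is not shown in this diff.
    conn.executescript(
        """
        CREATE TABLE imm_cover_art_blobs (
            blob_hash TEXT PRIMARY KEY,
            image     BLOB NOT NULL
        );
        CREATE TABLE imm_media_art (
            video_id  INTEGER PRIMARY KEY,
            blob_hash TEXT NOT NULL REFERENCES imm_cover_art_blobs(blob_hash)
        );
        """
    )


def store_cover(conn: sqlite3.Connection, video_id: int, image_bytes: bytes) -> str:
    """Store image bytes once per distinct content and link the video to them."""
    # Assumption: blobs are keyed by a SHA-256 digest of their bytes.
    blob_hash = hashlib.sha256(image_bytes).hexdigest()
    # INSERT OR IGNORE leaves an existing blob with the same hash untouched.
    conn.execute(
        "INSERT OR IGNORE INTO imm_cover_art_blobs (blob_hash, image) VALUES (?, ?)",
        (blob_hash, image_bytes),
    )
    # Each video keeps only a small reference row pointing at the shared blob.
    conn.execute(
        "INSERT OR REPLACE INTO imm_media_art (video_id, blob_hash) VALUES (?, ?)",
        (video_id, blob_hash),
    )
    return blob_hash
```

With this layout, two episodes that share the same cover produce two rows in `imm_media_art` but only one row of image bytes.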

## Configurable Knobs

@@ -98,9 +117,15 @@ All policy options live under `immersionTracking` in your config:
| `maintenanceIntervalMs` | How often maintenance runs |
| `retention.eventsDays` | Raw event retention |
| `retention.telemetryDays` | Telemetry retention |
| `retention.sessionsDays` | Session retention |
| `retention.dailyRollupsDays` | Daily rollup retention |
| `retention.monthlyRollupsDays` | Monthly rollup retention |
| `retention.vacuumIntervalDays` | Minimum spacing between vacuums |
| `retentionMode` | `preset` or `advanced` |
| `retentionPreset` | `minimal`, `balanced`, or `deep-history` (used by `retentionMode`) |
| `lifetimeSummaries.global` | Maintain global lifetime totals |
| `lifetimeSummaries.anime` | Maintain per-anime lifetime totals |
| `lifetimeSummaries.media` | Maintain per-media lifetime totals |

## Query Templates

@@ -129,26 +154,43 @@ SELECT
s.video_id,
s.started_at_ms,
s.ended_at_ms,
COALESCE(SUM(t.active_watched_ms), 0) AS active_watched_ms,
COALESCE(SUM(t.words_seen), 0) AS words_seen,
COALESCE(SUM(t.cards_mined), 0) AS cards_mined,
COALESCE(s.active_watched_ms, 0) AS active_watched_ms,
COALESCE(s.words_seen, 0) AS words_seen,
COALESCE(s.cards_mined, 0) AS cards_mined,
CASE
WHEN COALESCE(SUM(t.active_watched_ms), 0) > 0
THEN COALESCE(SUM(t.words_seen), 0) / (COALESCE(SUM(t.active_watched_ms), 0) / 60000.0)
WHEN COALESCE(s.active_watched_ms, 0) > 0
THEN COALESCE(s.words_seen, 0) / (COALESCE(s.active_watched_ms, 0) / 60000.0)
ELSE NULL
END AS words_per_min,
CASE
WHEN COALESCE(SUM(t.active_watched_ms), 0) > 0
THEN (COALESCE(SUM(t.cards_mined), 0) * 60.0) / (COALESCE(SUM(t.active_watched_ms), 0) / 60000.0)
WHEN COALESCE(s.active_watched_ms, 0) > 0
THEN (COALESCE(s.cards_mined, 0) * 60.0) / (COALESCE(s.active_watched_ms, 0) / 60000.0)
ELSE NULL
END AS cards_per_hour
FROM imm_sessions s
LEFT JOIN imm_session_telemetry t ON t.session_id = s.session_id
GROUP BY s.session_id
ORDER BY s.started_at_ms DESC
LIMIT ?;
```
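
As a worked check of the rate arithmetic in the `s.`-prefixed form of the query, the sketch below runs the two `CASE` expressions against an in-memory SQLite database from Python. The table definition is a hypothetical minimal stand-in that contains only the columns the query names; the real `imm_sessions` table has more. The sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical minimal stand-in for imm_sessions (only the queried columns).
conn.execute(
    """
    CREATE TABLE imm_sessions (
        session_id        INTEGER PRIMARY KEY,
        video_id          INTEGER,
        started_at_ms     INTEGER,
        ended_at_ms       INTEGER,
        active_watched_ms INTEGER,
        words_seen        INTEGER,
        cards_mined       INTEGER
    )
    """
)
conn.executemany(
    "INSERT INTO imm_sessions VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, 10, 0, 1_800_000, 1_200_000, 600, 5),  # 20 active minutes
        (2, 10, 2_000_000, 2_100_000, 0, 0, 0),    # no active time recorded
    ],
)

SESSION_RATES_SQL = """
SELECT
  s.session_id,
  CASE
    WHEN COALESCE(s.active_watched_ms, 0) > 0
    THEN COALESCE(s.words_seen, 0) / (COALESCE(s.active_watched_ms, 0) / 60000.0)
    ELSE NULL
  END AS words_per_min,
  CASE
    WHEN COALESCE(s.active_watched_ms, 0) > 0
    THEN (COALESCE(s.cards_mined, 0) * 60.0) / (COALESCE(s.active_watched_ms, 0) / 60000.0)
    ELSE NULL
  END AS cards_per_hour
FROM imm_sessions s
ORDER BY s.started_at_ms DESC
LIMIT ?
"""

rows = conn.execute(SESSION_RATES_SQL, (10,)).fetchall()
# Session 1: 600 words / 20 min = 30.0 words per minute,
#            5 cards * 60 / 20 min = 15.0 cards per hour.
# Session 2: zero active time, so both rates are NULL (None in Python).
```

Dividing by `60000.0` converts milliseconds to minutes and forces floating-point division. The `CASE` guard returns `NULL` instead of dividing by zero for sessions with no active time.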

### Lifetime anime totals

```sql
SELECT
a.anime_id,
a.canonical_title,
la.total_sessions,
la.total_active_ms,
la.total_cards,
la.total_words_seen,
la.total_lines_seen,
la.first_watched_ms,
la.last_watched_ms
FROM imm_lifetime_anime la
JOIN imm_anime a ON a.anime_id = la.anime_id
ORDER BY la.last_watched_ms DESC
LIMIT ?;
```
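
Since the docs invite querying the database directly, one cautious way to run the lifetime query from a script is over a read-only connection, so an ad hoc query cannot modify tracker data. The sketch below uses Python's `sqlite3` URI support (`mode=ro`). The helper names `open_readonly` and `lifetime_anime_hours` are hypothetical, and the query selects only a subset of the columns listed above.

```python
import sqlite3
from pathlib import Path

LIFETIME_ANIME_SQL = """
SELECT
  a.anime_id,
  a.canonical_title,
  la.total_sessions,
  la.total_active_ms,
  la.total_cards
FROM imm_lifetime_anime la
JOIN imm_anime a ON a.anime_id = la.anime_id
ORDER BY la.last_watched_ms DESC
LIMIT ?
"""


def open_readonly(db_path: str) -> sqlite3.Connection:
    """Open an existing SQLite file read-only via a file: URI."""
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)


def lifetime_anime_hours(conn: sqlite3.Connection, limit: int = 10):
    """Return (title, sessions, active_hours, cards), most recently watched first."""
    rows = conn.execute(LIFETIME_ANIME_SQL, (limit,)).fetchall()
    return [
        # Convert milliseconds to hours; treat a NULL total as zero.
        (title, sessions, (active_ms or 0) / 3_600_000, cards)
        for _anime_id, title, sessions, active_ms, cards in rows
    ]
```

A write attempted through a connection opened this way raises `sqlite3.OperationalError`.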

### Daily rollups

```sql
@@ -192,16 +234,25 @@ LIMIT ?;
- Queue overflow policy: drop oldest queued writes, keep newest.
- SQLite pragmas: `journal_mode=WAL`, `synchronous=NORMAL`, `foreign_keys=ON`, `busy_timeout=2500`.
- Rollups run incrementally from the last processed telemetry sample; startup performs a one-time bootstrap pass.
- If retention pruning removes telemetry/session rows, maintenance triggers a full rollup rebuild to resync historical aggregates.
- Cover-art blobs are deduplicated into `imm_cover_art_blobs` and referenced from `imm_media_art`.
- Large-table reads are index-backed for `sample_ms`, session time windows, frequency-ranked words/kanji, and cover-art identity lookups.
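
The queue overflow policy in the list above (drop oldest, keep newest) can be sketched with a bounded deque. This is an illustration of the policy only, not SubMiner's implementation. The class name `BoundedWriteQueue` and its `dropped` counter are hypothetical, and the default cap mirrors the documented `queueCap` default of `1000`.

```python
from collections import deque


class BoundedWriteQueue:
    """In-memory write queue that evicts the oldest entry when full."""

    def __init__(self, cap: int = 1000):
        # A deque with maxlen discards items from the opposite end on append.
        self._items = deque(maxlen=cap)
        self.dropped = 0  # hypothetical counter for observability

    def push(self, item) -> None:
        if len(self._items) == self._items.maxlen:
            self.dropped += 1  # the oldest queued write is about to be evicted
        self._items.append(item)

    def drain(self) -> list:
        """Return all queued writes in arrival order and empty the queue."""
        items = list(self._items)
        self._items.clear()
        return items
```

With a cap of 3, pushing five writes leaves the three newest in the queue and counts two as dropped.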

### Schema (v3)
### Schema (v12)

Core tables:

- `imm_videos` — video key/title/source metadata
- `imm_sessions` — session UUID, video reference, timing/status
- `imm_sessions` — session UUID, video reference, timing/status, final denormalized totals
- `imm_session_telemetry` — high-frequency session aggregates over time
- `imm_session_events` — event stream with compact numeric event types
- `imm_subtitle_lines` — persisted subtitle text and timing per session/video

Lifetime summary tables:

- `imm_lifetime_global`
- `imm_lifetime_anime`
- `imm_lifetime_media`
- `imm_lifetime_applied_sessions`

Rollup tables:

@@ -212,3 +263,8 @@ Vocabulary tables:

- `imm_words(id, headword, word, reading, first_seen, last_seen, frequency)`
- `imm_kanji(id, kanji, first_seen, last_seen, frequency)`

Media-art tables:

- `imm_media_art` — per-video cover metadata plus shared blob reference
- `imm_cover_art_blobs` — deduplicated image bytes keyed by blob hash

@@ -498,13 +498,21 @@
"queueCap": 1000, // In-memory write queue cap before overflow policy applies.
"payloadCapBytes": 256, // Max JSON payload size per event before truncation.
"maintenanceIntervalMs": 86400000, // Maintenance cadence (prune + rollup + vacuum checks).
"retentionMode": "preset", // Retention mode to use for defaults. Values: preset | advanced
"retentionPreset": "balanced", // Named preset when retentionMode is preset.
"retention": {
"eventsDays": 7, // Raw event retention window in days.
"telemetryDays": 30, // Telemetry retention window in days.
"dailyRollupsDays": 365, // Daily rollup retention window in days.
"monthlyRollupsDays": 1825, // Monthly rollup retention window in days.
"vacuumIntervalDays": 7 // Minimum days between VACUUM runs.
} // Retention setting.
"eventsDays": 0, // Raw event retention window in days. Use 0 to keep all.
"telemetryDays": 0, // Telemetry retention window in days. Use 0 to keep all.
"sessionsDays": 0, // Session retention window in days. Use 0 to keep all.
"dailyRollupsDays": 0, // Daily rollup retention window in days. Use 0 to keep all.
"monthlyRollupsDays": 0, // Monthly rollup retention window in days. Use 0 to keep all.
"vacuumIntervalDays": 0 // Minimum days between VACUUM runs. Use 0 to disable.
}, // Retention setting.
"lifetimeSummaries": {
"global": true, // Keep lifetime global totals.
"anime": true, // Keep lifetime per-anime totals.
"media": true // Keep lifetime per-media totals.
} // Lifetime summary setting.
}, // Enable/disable immersion tracking.

// ==========================================