Skip to content

In-memory projection: implementation plan

Status: Planned 2026-04-25; updated 2026-04-26 to use narrow ports per target-architecture.md. Supersedes the implementation-order section of event-sourced-state.md, which described the what and why. This doc is the how — concrete shape, files, sequencing, and the unified-log decision that came out of the planning round.

  • Unified event log decision: yes. Collapse the four parallel JSONL streams (entries, interactions, firings, classifications) into one events.jsonl with a t discriminator. Reducers project four views from the one stream. Rationale below.
  • Concrete reducer + lock layout. One RwLock<ProjectionData>, one Projection::apply(event) switch, fanned out to view-specific upserts.
  • Narrow ports, not &dyn AppendLog. Per target-architecture.md, services depend on &dyn EventStore (write side) and &Projection (read side), not on the umbrella Device trait. The pure sublayer (ProjectionData::apply, query fns) takes no traits at all.
  • Migration is a one-time replay tool, not an in-app fork.
  • Stage gating. Five stages, each independently shippable, each reversible until stage 4.

For the architectural framing this plan implements — what’s “pure” vs. “hexagonal” in this codebase, why we didn’t go strict FC/IS, and the no-&dyn Device-in-core rule — see target-architecture.md. This doc assumes that target and shows how the projection refactor lands us there.

QuestionDecisionWhy
Unified log?Yes — one events.jsonl per user with t discriminatorOne merge surface for sync; one apply switch; cross-stream invariants enforceable
Lock granularitySingle RwLock<ProjectionData>5:1 read:write at this scale; no torn reads on cross-view joins
Hydration timingSync, at Tauri startup, before app.manageSub-10k records replays in <100ms; solo-prototype, blocking is fine
Write path ownerProjection::append_* does both phasesTwo-phase invariant lives in one place; reducer = hydrator
Reducer state shapeMaterialized per-view (HashMap<id, Entry>, Vec<Interaction>, Vec<Firing>) + materialized SRS-per-entry inputsReads are O(view-size); SRS offset applied at read because utc_offset_seconds is read-time
Edit semanticsFull snapshots, never diffsOne event type per concept; reducer doesn’t need prior state to apply
Recovery on phase-2 panicLog is truth → next boot re-hydratesNo app-level rollback; solo-prototype simplicity
Out-of-order on syncTolerate; latest-wins by (id, timestamp)No vector clocks; clock skew tolerated within seconds

One file per user: events.jsonl. Each line is one event, internally tagged on t, PascalCase EntityVerbed past-tense:

// Entries (entity events; latest-wins by id)
{ "t": "EntryCaptured", "id": "e-...", "capturedAt": 0, "source": "...", "content": "...", "shape": { "t": "Unsure" } }
{ "t": "EntryEdited", "id": "e-...", "editedAt": 0, "source": "...", "content": "...", "shape": { ... } }
{ "t": "EntryDeleted", "id": "e-...", "deletedAt": 0 }
// Classifier provenance (append-only; latest per entryId is current pick)
{ "t": "EntryClassified", "id": "cls-...", "entryId": "e-...", "classifierShape": "train", "confidence": "high", "reasoning": "...", "classifiedAt": 0 }
// Practice rituals (append-only; idempotent by id)
{ "t": "Practiced", "id": "ix-...", "entryId": "e-...", "occurredAt": 0, "response": null }
{ "t": "Integrated", "id": "ix-...", "entryId": "e-...", "occurredAt": 0, "response": null }
// Real-world firings (append-only; idempotent by id)
{ "t": "FiringLogged", "id": "fir-...", "entryId": "e-...", "occurredAt": 0, "note": null }
  • One merge surface for sync. Dropbox conflict files fan out per file; one log = one conflict.
  • Cross-stream invariants stay enforceable. EntryView joins entries × interactions × SRS today; with a totally-ordered stream the join becomes a deterministic left-fold rather than a multi-source dance.
  • One apply switch, N projections. The reducer is one match; fan-out to view-specific upserts is structural, not behavioral.
  • Schema evolution lives in one place. Adding EntryArchived is one new arm.

Why we still keep EntryEdited distinct from EntryCaptured

Section titled “Why we still keep EntryEdited distinct from EntryCaptured”

Structurally identical (full snapshot), semantically distinct. Analytics and the audit trail read better. Cost is one extra arm. Defensible to collapse into EntryWritten later — keep both for now.

  • Never rename a field. Use #[serde(rename = "diskName")] if the Rust name needs to drift.
  • Never change a field’s meaning. Add a new field with #[serde(default)], or mint a new event variant (EntryCapturedV2).
  • Unknown t values: log + skip. Mirrors today’s silent-skip on malformed lines (if let Ok(...) in list_entries). Forward-compat for free at solo-prototype scale.
  • No _ wildcard on the apply match. A new event variant is a compile error at every reducer site (project rule).

What goes in the log: facts, never commands

Section titled “What goes in the log: facts, never commands”

EntryClassified (with the result baked in), never ClassifyEntry (which would re-fire the LLM during replay). The side-effect orchestration rule (side-effect-orchestration.md) stays intact: Rust runs the side effect, then appends the resulting fact. Replay never re-runs the side effect. No now() or random() inside a reducer.

// crates/core/src/projection.rs (new module)
pub struct Projection {
inner: RwLock<ProjectionData>,
}
struct ProjectionData {
// Latest-wins by id; tombstones already removed.
entries_by_id: HashMap<String, Entry>,
// File-order, no dedup.
interactions: Vec<Interaction>,
firings: Vec<Firing>,
classifications: Vec<Classification>,
// SRS inputs per entry: last-action timestamp + practiced count
// + last action. Offset is read-time (next_review_millis depends
// on user's *current* local offset, not write-time offset).
srs_inputs_by_entry: HashMap<String, SrsInputs>,
}
struct SrsInputs {
latest_action: Action,
latest_timestamp: i64,
practiced_count: u32,
}

Wrapped in Arc<Projection> and stored in tauri::State. The classify spawn task and the dispatch handler each clone the Arc; inner RwLock serializes writes against reads.

impl ProjectionData {
fn apply(&mut self, event: Event) {
match event {
Event::EntryCaptured { id, captured_at, source, content, shape }
| Event::EntryEdited { id, edited_at: captured_at, source, content, shape } => {
self.entries_by_id.insert(id.clone(), Entry {
id, created_at: captured_at, source, content, shape,
is_deleted: false,
});
}
Event::EntryDeleted { id, .. } => {
self.entries_by_id.remove(&id);
// SRS inputs deliberately retained — if a tombstone
// is later reversed by a fresh EntryCaptured/Edited
// (resurrection), the practice history rejoins.
}
Event::EntryClassified { id, entry_id, classifier_shape, confidence, reasoning, classified_at } => {
self.classifications.push(Classification {
id, entry_id, classifier_shape, confidence, reasoning, classified_at,
});
}
Event::Practiced { id, entry_id, occurred_at, response } => {
self.interactions.push(Interaction {
id, entry_id: entry_id.clone(), timestamp: occurred_at,
action: Action::Practiced, response,
});
self.bump_srs(entry_id, Action::Practiced, occurred_at);
}
Event::Integrated { id, entry_id, occurred_at, response } => {
self.interactions.push(Interaction {
id, entry_id: entry_id.clone(), timestamp: occurred_at,
action: Action::Integrated, response,
});
self.bump_srs(entry_id, Action::Integrated, occurred_at);
}
Event::FiringLogged { id, entry_id, occurred_at, note } => {
self.firings.push(Firing { id, entry_id, timestamp: occurred_at, note });
}
}
}
}

Notes:

  • No _ wildcard — new event = compile error here.
  • bump_srs is internal to the reducer; recomputes the per-entry SRS inputs from current state.
  • Reads compute the offset-dependent next_review_millis at query time, not in the reducer. Otherwise DST/timezone changes silently invalidate the cache.
crates/api/src/lib.rs
Command::EntriesList => {
let utc_offset_seconds = chrono::Local::now().offset().local_minus_utc();
let views = ctx.projection.list_entry_views(utc_offset_seconds);
(Response::ok_json("Entries.List", serde_json::to_value(&views).unwrap()),
DispatchSideEffect::None)
}

list_entry_views holds the read lock once, walks entries_by_id, joins each entry against srs_inputs_by_entry to compute next_review_millis, returns the Vec<EntryView>. Microseconds at <10k records.

dispatch_with_device is renamed dispatch and takes a DispatchCtx<'a> carrying the narrow ports it needs:

pub struct DispatchCtx<'a> {
pub projection: &'a Projection,
pub store: &'a dyn EventStore,
pub classifier: &'a dyn Classifier,
pub session: &'a dyn SessionWriter,
pub clock: &'a dyn Clock,
pub notifications: &'a dyn Notifications,
}

Device retreats to a shell-side aggregate that bundles concrete adapters; it is no longer imported by crates/core or crates/api.

The write API takes a narrow EventStore port (not &dyn AppendLog, not &dyn Device) per the application-core target. EventStore exposes exactly two methods: append(&Event) and read_all() -> Vec<Event>. The shell-side FsEventStore / CloudEventStore implement it on top of today’s filesystem and R2 backings.

// crates/core/src/services/capture.rs (use-case service)
pub fn record_entry_captured(
projection: &Projection,
store: &dyn EventStore,
event: EntryCaptured,
) -> Result<(), EventStoreError> {
let event = Event::EntryCaptured(event);
// Phase 1: durability anchor. Append the fact to the log.
store.append(&event)?;
// Phase 2: in-memory apply. Same reducer the hydrator uses.
let mut data = projection.write_data();
data.apply(event);
Ok(())
}

Projection::write_data is a thin shell over RwLock::write. ProjectionData::apply is pure — no traits, no I/O — and is the same code path used at hydration. If phase 2 panics, the log is correct and next boot re-hydrates.

crates/core/src/projection/mod.rs
impl Projection {
pub fn hydrate(store: &dyn EventStore) -> Result<Self, EventStoreError> {
let mut data = ProjectionData::default();
for event in store.read_all()? {
data.apply(event);
}
Ok(Projection { inner: RwLock::new(data) })
}
}

EventStore::read_all returns Vec<Event> (already deserialized); malformed-line handling moves into the store impl, where it can log + skip with the right context. The pure apply never has to think about JSON.

In tauri::Builder::setup, before app.manage:

let store: Arc<dyn EventStore> = pick_event_store(&session);
let projection = Projection::hydrate(&*store).expect("hydrate");
app.manage(AppState { store, projection: Arc::new(projection), /* + other ports */ });

Sub-10k records replays in <100ms locally. With cloud (R2) backing, hydration is one network round-trip per log file — measure once, accept the cold-boot latency, document it. A future “rehydrate on sign-in” path can warm the cache without blocking startup.

Stage 1 — Scaffold the projection + EventStore port (no behavior change)

Section titled “Stage 1 — Scaffold the projection + EventStore port (no behavior change)”
  • New module: crates/core/src/projection/{mod,data,apply,query}.rs with Projection, ProjectionData (pure), apply (pure), query fns (pure), and hydrate.
  • New module: crates/core/src/ports.rs defining EventStore (and stubs for Classifier, SessionReader, Clock, etc., even if unused at this stage).
  • Shell-side adapter: FsEventStore (and later CloudEventStore) in src-tauri/src/. For Stage 1 it adapts on top of the existing per-stream JSONL files — read_all() reads each file, decodes, and yields synthetic Event::EntryCaptured etc. so the reducer has something to fold.
  • Wire into AppState in src-tauri/src/lib.rs. Don’t read from the projection yet.
  • Test: Projection::hydrate(&store) matches list_entries / list_interactions / list_firings / list_classifications for a scenario seeded with all four streams.

Reversible: deleting the new modules restores prior behavior.

Stage 2 — Switch reads to projection (one command at a time)

Section titled “Stage 2 — Switch reads to projection (one command at a time)”
  • EntriesList first. Switch the dispatch arm to call projection.list_entry_views(offset). Keep the legacy list_entry_views(device, offset) as a fallback path for one release, gated behind a debug flag if the projection diverges.
  • Then InteractionsList, FiringsList. Trivial — just a clone() of the Vec.
  • Test: new dispatch tests against the projection-backed handler; existing wire-format tests stay green.

Reversible: revert each dispatch arm.

Stage 3 — Two-phase writes via use-case services

Section titled “Stage 3 — Two-phase writes via use-case services”
  • Add use-case services in crates/core/src/services/ (e.g. services::capture::record_entry_captured(projection, store, event)). Each service does phase 1 (store.append) then phase 2 (projection.apply) atomically.
  • Switch every EntriesAppend, EntriesSoftDelete, InteractionsAppend, FiringsAppend dispatch arm to call the matching service.
  • Switch the classify_and_supersede task to a services::classify::classify_entry service that takes &dyn Classifier, &dyn EventStore, &dyn SessionReader, &dyn Clock, &Projectionnot &dyn Device. Per target-architecture.md. Also fix the stale-classification race here (see “Known correctness debt” in the target-architecture doc — pin the classification to the captured event id/version; reducer refuses if a later edit landed).
  • Drop the post-capture 2.5s EntriesList refetch in Elm (src/Main.elm — the timer-driven refetch added 2026-04-25 to make the classifier supersede visible). Reads are now cheap and fresh.
  • Drop the on-Detail-nav EntriesList refetch for the same reason.

Partially reversible: you can revert the Elm-side cleanup independently from the Rust-side wiring.

Stage 4 — Unified event log (shipped as a single move, no dual-write)

Section titled “Stage 4 — Unified event log (shipped as a single move, no dual-write)”

The plan originally called for a dual-write window for safety. Solo prototype, single user (Justin) — we skipped the dual-write and went straight to unified-only. What actually shipped:

  • Event enum + From<Entry/Interaction/Firing/Classification> conversions: in core since Stage 1.
  • Write side: FsEventStore::append writes only to events.jsonl. The per-stream JSONL files are no longer touched.
  • Read side: read_all prefers events.jsonl; falls back to synthesizing events from the per-stream files only when the unified file is empty (the pre-backfill window).
  • One-shot backfill: FsEventStore::backfill_unified_if_needed runs at startup (Tauri host + dev-web core-server) and after every /dev/seed. Reads legacy per-stream files, writes synthesized events to events.jsonl. Idempotent; no-op once the unified file has data.
  • Per-stream files become read-only fossils after backfill — never written to, ignored on read. We left them on disk rather than deleting; if anything ever goes wrong the data is recoverable.

If multi-user shows up later and we need a true safety window before a destructive migration, revisit this — re-introduce dual-write + delayed cleanup. For one user it’s not worth the code.

Out of scope for the initial refactor, but the unified log makes it easy:

  • Re-hydrate on file-mtime change (Dropbox watcher).
  • Per-event R2 objects for cheap remote sync (the planned follow-up).
  • tauri::ipc::Channel<Event> push from Rust to Elm so the UI can reactively update without polling.
ModeWhat happensRecovery
Phase 2 panic (poisoned RwLock)Dispatch returns error envelope to ElmNext boot re-hydrates from disk
Malformed event lineapply skips with eprintln!Log + carry on; investigate via dev-web /dev/state
events.jsonl partially written (power loss mid-line)Last line fails to deserializeSkipped; data after the partial write is gone but earlier lines are intact
Migration tool produces wrong projectionStage 4 dual-write means legacy files still authoritativeRe-run tool, fix bugs, repeat
R2 unreachable at bootHydration failsFall back to local cache (or empty projection); user sees an error banner; retry on next dispatch
FileScope
crates/core/src/projection/{mod,data,apply,query}.rsNewProjection (RwLock holder), ProjectionData (pure), apply (pure), query fns (pure)
crates/core/src/events.rsNew (Stage 4) — Event enum
crates/core/src/ports.rsNew — narrow port traits: EventStore, Classifier, SessionReader, SessionWriter, Clock, Notifications
crates/core/src/services/{capture,classify,notifications,auth}.rsNew — use-case services taking narrow ports
crates/core/src/lib.rsRe-export Projection, Event, ports + services. Stop re-exporting Device.
crates/core/src/domain.rsKeep types; drop list_* / append_* once services replace them
crates/core/src/append_log.rsDeleteAppendLog trait is shell-internal now (replaced by EventStore)
crates/api/src/lib.rsdispatch takes DispatchCtx<'a>; every arm calls a service; DispatchSideEffect stays as the host-spawn intent
crates/api/src/bin/core_server.rsBuilds a DispatchCtx from concrete shell adapters
src-tauri/src/lib.rsAppState carries Arc<Projection> + concrete port impls; core_dispatch builds a DispatchCtx per call; Device (the aggregate) lives here
src/Main.elmDrop the 2.5s post-capture refetch + on-Detail-nav EntriesList calls (Stage 3)
  • Lamport clocks for sync. Deferred. Single-device for now; clock skew tolerated to seconds. Revisit when the second device lands.
  • Snapshot-and-tail for log size. Deferred. Today’s lifetime record count fits in memory. Revisit when boot replay exceeds 1s.
  • Push-to-Elm via tauri::ipc::Channel. Deferred. Elm refetches on user actions; the projection is fresh on every read so the current pattern works without a push channel.
  • Encrypting the unified log on R2. Tracked separately (project_e2e_encryption memory).

SQLite would give us indexes + transactions. We don’t need them at this scale (HashMap is faster than B-tree for sub-10k entries; phase 1

  • 2 ordering gives us write atomicity-enough). Adding SQLite is reversible later if we outgrow this; today it’s overhead.

The agent round 2026-04-25 surfaced these references — read these before re-litigating any of the above: