In-memory projection: implementation plan
Status: Planned 2026-04-25; updated 2026-04-26 to use narrow
ports per target-architecture.md.
Supersedes the implementation-order section of
event-sourced-state.md, which described
the what and why. This doc is the how — concrete shape, files,
sequencing, and the unified-log decision that came out of the
planning round.
What changed since event-sourced-state.md
Section titled “What changed since event-sourced-state.md”- Unified event log decision: yes. Collapse the four parallel
JSONL streams (
entries,interactions,firings,classifications) into oneevents.jsonlwith atdiscriminator. Reducers project four views from the one stream. Rationale below. - Concrete reducer + lock layout. One
RwLock<ProjectionData>, oneProjection::apply(event)switch, fanned out to view-specific upserts. - Narrow ports, not
&dyn AppendLog. Pertarget-architecture.md, services depend on&dyn EventStore(write side) and&Projection(read side), not on the umbrellaDevicetrait. The pure sublayer (ProjectionData::apply, query fns) takes no traits at all. - Migration is a one-time replay tool, not an in-app fork.
- Stage gating. Five stages, each independently shippable, each reversible until stage 4.
For the architectural framing this plan implements — what’s “pure” vs. “hexagonal” in this codebase, why we didn’t go strict FC/IS, and the no-
&dyn Device-in-core rule — seetarget-architecture.md. This doc assumes that target and shows how the projection refactor lands us there.
Decisions
Section titled “Decisions”| Question | Decision | Why |
|---|---|---|
| Unified log? | Yes — one events.jsonl per user with t discriminator | One merge surface for sync; one apply switch; cross-stream invariants enforceable |
| Lock granularity | Single RwLock<ProjectionData> | 5:1 read:write at this scale; no torn reads on cross-view joins |
| Hydration timing | Sync, at Tauri startup, before app.manage | Sub-10k records replays in <100ms; solo-prototype, blocking is fine |
| Write path owner | Projection::append_* does both phases | Two-phase invariant lives in one place; reducer = hydrator |
| Reducer state shape | Materialized per-view (HashMap<id, Entry>, Vec<Interaction>, Vec<Firing>) + materialized SRS-per-entry inputs | Reads are O(view-size); SRS offset applied at read because utc_offset_seconds is read-time |
| Edit semantics | Full snapshots, never diffs | One event type per concept; reducer doesn’t need prior state to apply |
| Recovery on phase-2 panic | Log is truth → next boot re-hydrates | No app-level rollback; solo-prototype simplicity |
| Out-of-order on sync | Tolerate; latest-wins by (id, timestamp) | No vector clocks; clock skew tolerated within seconds |
The unified event log
Section titled “The unified event log”One file per user: events.jsonl. Each line is one event, internally
tagged on t, PascalCase EntityVerbed past-tense:
// Entries (entity events; latest-wins by id){ "t": "EntryCaptured", "id": "e-...", "capturedAt": 0, "source": "...", "content": "...", "shape": { "t": "Unsure" } }{ "t": "EntryEdited", "id": "e-...", "editedAt": 0, "source": "...", "content": "...", "shape": { ... } }{ "t": "EntryDeleted", "id": "e-...", "deletedAt": 0 }
// Classifier provenance (append-only; latest per entryId is current pick){ "t": "EntryClassified", "id": "cls-...", "entryId": "e-...", "classifierShape": "train", "confidence": "high", "reasoning": "...", "classifiedAt": 0 }
// Practice rituals (append-only; idempotent by id){ "t": "Practiced", "id": "ix-...", "entryId": "e-...", "occurredAt": 0, "response": null }{ "t": "Integrated", "id": "ix-...", "entryId": "e-...", "occurredAt": 0, "response": null }
// Real-world firings (append-only; idempotent by id){ "t": "FiringLogged", "id": "fir-...", "entryId": "e-...", "occurredAt": 0, "note": null }Why unified beats per-stream here
Section titled “Why unified beats per-stream here”- One merge surface for sync. Dropbox conflict files fan out per file; one log = one conflict.
- Cross-stream invariants stay enforceable.
EntryViewjoins entries × interactions × SRS today; with a totally-ordered stream the join becomes a deterministic left-fold rather than a multi-source dance. - One apply switch, N projections. The reducer is one match; fan-out to view-specific upserts is structural, not behavioral.
- Schema evolution lives in one place. Adding
EntryArchivedis one new arm.
Why we still keep EntryEdited distinct from EntryCaptured
Section titled “Why we still keep EntryEdited distinct from EntryCaptured”Structurally identical (full snapshot), semantically distinct.
Analytics and the audit trail read better. Cost is one extra arm.
Defensible to collapse into EntryWritten later — keep both for now.
Schema evolution rules
Section titled “Schema evolution rules”- Never rename a field. Use
#[serde(rename = "diskName")]if the Rust name needs to drift. - Never change a field’s meaning. Add a new field with
#[serde(default)], or mint a new event variant (EntryCapturedV2). - Unknown
tvalues: log + skip. Mirrors today’s silent-skip on malformed lines (if let Ok(...)inlist_entries). Forward-compat for free at solo-prototype scale. - No
_wildcard on the apply match. A new event variant is a compile error at every reducer site (project rule).
What goes in the log: facts, never commands
Section titled “What goes in the log: facts, never commands”EntryClassified (with the result baked in), never ClassifyEntry
(which would re-fire the LLM during replay). The side-effect
orchestration rule
(side-effect-orchestration.md) stays
intact: Rust runs the side effect, then appends the resulting fact.
Replay never re-runs the side effect. No now() or random()
inside a reducer.
The runtime layer
Section titled “The runtime layer”// crates/core/src/projection.rs (new module)
pub struct Projection { inner: RwLock<ProjectionData>,}
struct ProjectionData { // Latest-wins by id; tombstones already removed. entries_by_id: HashMap<String, Entry>, // File-order, no dedup. interactions: Vec<Interaction>, firings: Vec<Firing>, classifications: Vec<Classification>, // SRS inputs per entry: last-action timestamp + practiced count // + last action. Offset is read-time (next_review_millis depends // on user's *current* local offset, not write-time offset). srs_inputs_by_entry: HashMap<String, SrsInputs>,}
struct SrsInputs { latest_action: Action, latest_timestamp: i64, practiced_count: u32,}Wrapped in Arc<Projection> and stored in tauri::State. The
classify spawn task and the dispatch handler each clone the Arc;
inner RwLock serializes writes against reads.
Reducer sketch
Section titled “Reducer sketch”impl ProjectionData { fn apply(&mut self, event: Event) { match event { Event::EntryCaptured { id, captured_at, source, content, shape } | Event::EntryEdited { id, edited_at: captured_at, source, content, shape } => { self.entries_by_id.insert(id.clone(), Entry { id, created_at: captured_at, source, content, shape, is_deleted: false, }); } Event::EntryDeleted { id, .. } => { self.entries_by_id.remove(&id); // SRS inputs deliberately retained — if a tombstone // is later reversed by a fresh EntryCaptured/Edited // (resurrection), the practice history rejoins. } Event::EntryClassified { id, entry_id, classifier_shape, confidence, reasoning, classified_at } => { self.classifications.push(Classification { id, entry_id, classifier_shape, confidence, reasoning, classified_at, }); } Event::Practiced { id, entry_id, occurred_at, response } => { self.interactions.push(Interaction { id, entry_id: entry_id.clone(), timestamp: occurred_at, action: Action::Practiced, response, }); self.bump_srs(entry_id, Action::Practiced, occurred_at); } Event::Integrated { id, entry_id, occurred_at, response } => { self.interactions.push(Interaction { id, entry_id: entry_id.clone(), timestamp: occurred_at, action: Action::Integrated, response, }); self.bump_srs(entry_id, Action::Integrated, occurred_at); } Event::FiringLogged { id, entry_id, occurred_at, note } => { self.firings.push(Firing { id, entry_id, timestamp: occurred_at, note }); } } }}Notes:
- No
_wildcard — new event = compile error here. bump_srsis internal to the reducer; recomputes the per-entry SRS inputs from current state.- Reads compute the offset-dependent
next_review_millisat query time, not in the reducer. Otherwise DST/timezone changes silently invalidate the cache.
Read paths after the refactor
Section titled “Read paths after the refactor”Command::EntriesList => { let utc_offset_seconds = chrono::Local::now().offset().local_minus_utc(); let views = ctx.projection.list_entry_views(utc_offset_seconds); (Response::ok_json("Entries.List", serde_json::to_value(&views).unwrap()), DispatchSideEffect::None)}list_entry_views holds the read lock once, walks entries_by_id,
joins each entry against srs_inputs_by_entry to compute
next_review_millis, returns the Vec<EntryView>. Microseconds at
<10k records.
dispatch_with_device is renamed dispatch and takes a
DispatchCtx<'a> carrying the narrow ports it needs:
pub struct DispatchCtx<'a> { pub projection: &'a Projection, pub store: &'a dyn EventStore, pub classifier: &'a dyn Classifier, pub session: &'a dyn SessionWriter, pub clock: &'a dyn Clock, pub notifications: &'a dyn Notifications,}Device retreats to a shell-side aggregate that bundles concrete
adapters; it is no longer imported by crates/core or crates/api.
Write paths after the refactor
Section titled “Write paths after the refactor”The write API takes a narrow EventStore port (not &dyn AppendLog,
not &dyn Device) per the application-core target. EventStore
exposes exactly two methods: append(&Event) and read_all() -> Vec<Event>. The shell-side FsEventStore / CloudEventStore
implement it on top of today’s filesystem and R2 backings.
// crates/core/src/services/capture.rs (use-case service)pub fn record_entry_captured( projection: &Projection, store: &dyn EventStore, event: EntryCaptured,) -> Result<(), EventStoreError> { let event = Event::EntryCaptured(event); // Phase 1: durability anchor. Append the fact to the log. store.append(&event)?; // Phase 2: in-memory apply. Same reducer the hydrator uses. let mut data = projection.write_data(); data.apply(event); Ok(())}Projection::write_data is a thin shell over RwLock::write.
ProjectionData::apply is pure — no traits, no I/O — and is the
same code path used at hydration. If phase 2 panics, the log is
correct and next boot re-hydrates.
Hydration
Section titled “Hydration”impl Projection { pub fn hydrate(store: &dyn EventStore) -> Result<Self, EventStoreError> { let mut data = ProjectionData::default(); for event in store.read_all()? { data.apply(event); } Ok(Projection { inner: RwLock::new(data) }) }}EventStore::read_all returns Vec<Event> (already deserialized);
malformed-line handling moves into the store impl, where it can log +
skip with the right context. The pure apply never has to think
about JSON.
In tauri::Builder::setup, before app.manage:
let store: Arc<dyn EventStore> = pick_event_store(&session);let projection = Projection::hydrate(&*store).expect("hydrate");app.manage(AppState { store, projection: Arc::new(projection), /* + other ports */ });Sub-10k records replays in <100ms locally. With cloud (R2) backing, hydration is one network round-trip per log file — measure once, accept the cold-boot latency, document it. A future “rehydrate on sign-in” path can warm the cache without blocking startup.
Stages (each shippable independently)
Section titled “Stages (each shippable independently)”Stage 1 — Scaffold the projection + EventStore port (no behavior change)
Section titled “Stage 1 — Scaffold the projection + EventStore port (no behavior change)”- New module:
crates/core/src/projection/{mod,data,apply,query}.rswithProjection,ProjectionData(pure),apply(pure), query fns (pure), andhydrate. - New module:
crates/core/src/ports.rsdefiningEventStore(and stubs forClassifier,SessionReader,Clock, etc., even if unused at this stage). - Shell-side adapter:
FsEventStore(and laterCloudEventStore) insrc-tauri/src/. For Stage 1 it adapts on top of the existing per-stream JSONL files —read_all()reads each file, decodes, and yields syntheticEvent::EntryCapturedetc. so the reducer has something to fold. - Wire into
AppStateinsrc-tauri/src/lib.rs. Don’t read from the projection yet. - Test:
Projection::hydrate(&store)matcheslist_entries/list_interactions/list_firings/list_classificationsfor a scenario seeded with all four streams.
Reversible: deleting the new modules restores prior behavior.
Stage 2 — Switch reads to projection (one command at a time)
Section titled “Stage 2 — Switch reads to projection (one command at a time)”EntriesListfirst. Switch the dispatch arm to callprojection.list_entry_views(offset). Keep the legacylist_entry_views(device, offset)as a fallback path for one release, gated behind a debug flag if the projection diverges.- Then
InteractionsList,FiringsList. Trivial — just aclone()of the Vec. - Test: new dispatch tests against the projection-backed handler; existing wire-format tests stay green.
Reversible: revert each dispatch arm.
Stage 3 — Two-phase writes via use-case services
Section titled “Stage 3 — Two-phase writes via use-case services”- Add use-case services in
crates/core/src/services/(e.g.services::capture::record_entry_captured(projection, store, event)). Each service does phase 1 (store.append) then phase 2 (projection.apply) atomically. - Switch every
EntriesAppend,EntriesSoftDelete,InteractionsAppend,FiringsAppenddispatch arm to call the matching service. - Switch the
classify_and_supersedetask to aservices::classify::classify_entryservice that takes&dyn Classifier,&dyn EventStore,&dyn SessionReader,&dyn Clock,&Projection— not&dyn Device. Pertarget-architecture.md. Also fix the stale-classification race here (see “Known correctness debt” in the target-architecture doc — pin the classification to the captured event id/version; reducer refuses if a later edit landed). - Drop the post-capture 2.5s
EntriesListrefetch in Elm (src/Main.elm— the timer-driven refetch added 2026-04-25 to make the classifier supersede visible). Reads are now cheap and fresh. - Drop the on-Detail-nav
EntriesListrefetch for the same reason.
Partially reversible: you can revert the Elm-side cleanup independently from the Rust-side wiring.
Stage 4 — Unified event log (shipped as a single move, no dual-write)
Section titled “Stage 4 — Unified event log (shipped as a single move, no dual-write)”The plan originally called for a dual-write window for safety. Solo prototype, single user (Justin) — we skipped the dual-write and went straight to unified-only. What actually shipped:
Eventenum +From<Entry/Interaction/Firing/Classification>conversions: in core since Stage 1.- Write side:
FsEventStore::appendwrites only toevents.jsonl. The per-stream JSONL files are no longer touched. - Read side:
read_allprefersevents.jsonl; falls back to synthesizing events from the per-stream files only when the unified file is empty (the pre-backfill window). - One-shot backfill:
FsEventStore::backfill_unified_if_neededruns at startup (Tauri host + dev-webcore-server) and after every/dev/seed. Reads legacy per-stream files, writes synthesized events toevents.jsonl. Idempotent; no-op once the unified file has data. - Per-stream files become read-only fossils after backfill — never written to, ignored on read. We left them on disk rather than deleting; if anything ever goes wrong the data is recoverable.
If multi-user shows up later and we need a true safety window before a destructive migration, revisit this — re-introduce dual-write + delayed cleanup. For one user it’s not worth the code.
Stage 5 — Schema-version sync hooks
Section titled “Stage 5 — Schema-version sync hooks”Out of scope for the initial refactor, but the unified log makes it easy:
- Re-hydrate on file-mtime change (Dropbox watcher).
- Per-event R2 objects for cheap remote sync (the planned follow-up).
tauri::ipc::Channel<Event>push from Rust to Elm so the UI can reactively update without polling.
Failure modes
Section titled “Failure modes”| Mode | What happens | Recovery |
|---|---|---|
Phase 2 panic (poisoned RwLock) | Dispatch returns error envelope to Elm | Next boot re-hydrates from disk |
| Malformed event line | apply skips with eprintln! | Log + carry on; investigate via dev-web /dev/state |
events.jsonl partially written (power loss mid-line) | Last line fails to deserialize | Skipped; data after the partial write is gone but earlier lines are intact |
| Migration tool produces wrong projection | Stage 4 dual-write means legacy files still authoritative | Re-run tool, fix bugs, repeat |
| R2 unreachable at boot | Hydration fails | Fall back to local cache (or empty projection); user sees an error banner; retry on next dispatch |
Files that change
Section titled “Files that change”| File | Scope |
|---|---|
crates/core/src/projection/{mod,data,apply,query}.rs | New — Projection (RwLock holder), ProjectionData (pure), apply (pure), query fns (pure) |
crates/core/src/events.rs | New (Stage 4) — Event enum |
crates/core/src/ports.rs | New — narrow port traits: EventStore, Classifier, SessionReader, SessionWriter, Clock, Notifications |
crates/core/src/services/{capture,classify,notifications,auth}.rs | New — use-case services taking narrow ports |
crates/core/src/lib.rs | Re-export Projection, Event, ports + services. Stop re-exporting Device. |
crates/core/src/domain.rs | Keep types; drop list_* / append_* once services replace them |
crates/core/src/append_log.rs | Delete — AppendLog trait is shell-internal now (replaced by EventStore) |
crates/api/src/lib.rs | dispatch takes DispatchCtx<'a>; every arm calls a service; DispatchSideEffect stays as the host-spawn intent |
crates/api/src/bin/core_server.rs | Builds a DispatchCtx from concrete shell adapters |
src-tauri/src/lib.rs | AppState carries Arc<Projection> + concrete port impls; core_dispatch builds a DispatchCtx per call; Device (the aggregate) lives here |
src/Main.elm | Drop the 2.5s post-capture refetch + on-Detail-nav EntriesList calls (Stage 3) |
Open questions / non-decisions
Section titled “Open questions / non-decisions”- Lamport clocks for sync. Deferred. Single-device for now; clock skew tolerated to seconds. Revisit when the second device lands.
- Snapshot-and-tail for log size. Deferred. Today’s lifetime record count fits in memory. Revisit when boot replay exceeds 1s.
- Push-to-Elm via
tauri::ipc::Channel. Deferred. Elm refetches on user actions; the projection is fresh on every read so the current pattern works without a push channel. - Encrypting the unified log on R2. Tracked separately
(
project_e2e_encryptionmemory).
Why not a database
Section titled “Why not a database”SQLite would give us indexes + transactions. We don’t need them at this scale (HashMap is faster than B-tree for sub-10k entries; phase 1
- 2 ordering gives us write atomicity-enough). Adding SQLite is reversible later if we outgrow this; today it’s overhead.
Prior art
Section titled “Prior art”The agent round 2026-04-25 surfaced these references — read these before re-litigating any of the above:
- Greg Young — Versioning in an Event Sourced System — weak-schema rule, never rename, mint new variants. https://leanpub.com/esversioning/read
- Oskar Dudycz — Simple events versioning patterns — concrete upcasting/default-values recipes. https://event-driven.io/en/simple_events_versioning_patterns/
- Linear’s sync engine reverse-engineered (wzhudev) — SyncActions = changesets applied to a materialized object graph; LWW for ordering, no CRDTs. https://github.com/wzhudev/reverse-linear-sync-engine
- Replicache “How it works” — Causal+ ordering, mutator replay against newer base versions; the reference for “what to do when events arrive late.” https://doc.replicache.dev/concepts/how-it-works
- Riffle (UIST ‘23, Litt et al.) — single local relational store as truth, reactive queries on top; the academic version of what we’re building. https://riffle.systems/essays/prelude/
- Tauri v2 State Management — canonical
tauri::State<Mutex<T>>/RwLock<T>patterns. https://v2.tauri.app/develop/state-management/ - Confluent — Single vs. Multiple Event Streams + EventSourcingDB Designing Aggregates — the steelman against unification (read before re-arguing). https://developer.confluent.io/courses/event-design/single-vs-multiple-event-streams/
- Kleppmann et al. — Local-first software (Ink & Switch, 2019) — the seven ideals; addresses filesystem/Dropbox-style sync as a viable substrate. https://www.inkandswitch.com/essay/local-first/
- Kleppmann — Designing Data-Intensive Applications, 2nd ed., Ch. 11 “Stream Processing” — sections “Event Sourcing” and “Databases and Streams” are the definitive framing of “log is truth, state is a derived view.”