Decider snapshots — avoid full replay for long-lived streams #57

Closed
opened 2026-02-21 22:42:50 +00:00 by ash · 0 comments
Owner

Problem

Every decider.Load() call replays all events from the start of the stream. At 1000 events that takes 4.6ms on SQLite. At 10K+ events, replay becomes a real bottleneck for command processing latency.

Proposal

Save periodic state snapshots, so that Load = latest snapshot + events recorded since that snapshot.

type SnapshotStore[S any] interface {
    SaveSnapshot(ctx context.Context, streamID string, version int, state S) error
    LoadSnapshot(ctx context.Context, streamID string) (state S, version int, err error)
}

Decider integration:

  1. Load() checks for snapshot first
  2. Loads only events after snapshot version
  3. After successful command, optionally save snapshot every N events
  4. Configurable: WithSnapshotEvery(100)
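
The four integration steps above could be sketched roughly as follows. This is a minimal illustration, not the eskit implementation: the Decider struct, the EventReader interface, MaybeSnapshot, the in-memory fakes, and the convention that version 0 means "no snapshot" are all assumptions made for the example. Only SnapshotStore comes from the proposal.

```go
package main

import "context"

// SnapshotStore is the interface from the proposal.
type SnapshotStore[S any] interface {
	SaveSnapshot(ctx context.Context, streamID string, version int, state S) error
	LoadSnapshot(ctx context.Context, streamID string) (state S, version int, err error)
}

// EventReader is a hypothetical event-store view returning events with version > after.
type EventReader[E any] interface {
	ReadAfter(ctx context.Context, streamID string, after int) ([]E, error)
}

// Decider is a simplified stand-in for the real decider type.
type Decider[S, E any] struct {
	Initial       S
	Evolve        func(S, E) S
	Events        EventReader[E]
	Snapshots     SnapshotStore[S] // nil disables snapshots
	SnapshotEvery int              // what WithSnapshotEvery(n) might set; 0 disables saving
}

// Load starts from the latest snapshot, if any, and replays only newer events.
func (d *Decider[S, E]) Load(ctx context.Context, streamID string) (S, int, error) {
	state, version := d.Initial, 0
	if d.Snapshots != nil {
		s, v, err := d.Snapshots.LoadSnapshot(ctx, streamID)
		if err != nil {
			return state, 0, err
		}
		if v > 0 { // assumption: version 0 means "no snapshot"
			state, version = s, v
		}
	}
	events, err := d.Events.ReadAfter(ctx, streamID, version)
	if err != nil {
		return state, version, err
	}
	for _, e := range events {
		state = d.Evolve(state, e)
		version++
	}
	return state, version, nil
}

// MaybeSnapshot saves when an append moved the stream across a multiple of SnapshotEvery.
func (d *Decider[S, E]) MaybeSnapshot(ctx context.Context, streamID string, before, after int, state S) error {
	if d.Snapshots == nil || d.SnapshotEvery <= 0 {
		return nil
	}
	if after/d.SnapshotEvery == before/d.SnapshotEvery {
		return nil
	}
	return d.Snapshots.SaveSnapshot(ctx, streamID, after, state)
}

// sliceEvents and mapSnapshots are throwaway in-memory fakes for the demo.
type sliceEvents struct {
	events   []int
	lastRead int // number of events returned by the most recent ReadAfter
}

func (s *sliceEvents) ReadAfter(_ context.Context, _ string, after int) ([]int, error) {
	if after > len(s.events) {
		after = len(s.events)
	}
	s.lastRead = len(s.events) - after
	return s.events[after:], nil
}

type snap struct{ state, version int }
type mapSnapshots map[string]snap

func (m mapSnapshots) SaveSnapshot(_ context.Context, id string, version int, state int) error {
	m[id] = snap{state, version}
	return nil
}

func (m mapSnapshots) LoadSnapshot(_ context.Context, id string) (int, int, error) {
	return m[id].state, m[id].version, nil
}

// demo loads a 150-event stream, snapshots it, appends 2 events, and reloads.
func demo() (state, version, replayed int) {
	ctx := context.Background()
	ev := &sliceEvents{events: make([]int, 150)}
	d := &Decider[int, int]{
		Evolve:        func(s, _ int) int { return s + 1 }, // state = number of events seen
		Events:        ev,
		Snapshots:     mapSnapshots{},
		SnapshotEvery: 100,
	}
	state, version, _ = d.Load(ctx, "s1")              // full replay of 150 events
	_ = d.MaybeSnapshot(ctx, "s1", 99, version, state) // crossed 100, so a snapshot is saved
	ev.events = append(ev.events, 0, 0)
	state, version, _ = d.Load(ctx, "s1") // snapshot + 2 new events
	return state, version, ev.lastRead
}
```

In this sketch the second Load reads only the two events appended after the snapshot. Saving on "crossed a multiple of N" rather than "version is exactly a multiple of N" is one possible choice; it still triggers when a single command appends several events at once.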

Benchmark Target

  • Decider.Load at 100, 1000, 10000 events (baseline)
  • Snapshot at event 1000 + 50 new events vs full replay of 1050 events
  • Find the crossover point where snapshots pay for themselves
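
One possible shape for these benchmarks is sketched below. It is only an assumption about how they might be laid out: the counter state, evolve, replay, and makeEvents are hypothetical stand-ins, and the fold runs over an in-memory slice, so it measures replay cost only and not SQLite reads. A real benchmark would go through the event store and the snapshot store, and the function would need to live in a _test.go file to be picked up by go test -bench.

```go
package main

import (
	"fmt"
	"testing"
)

// counter is a stand-in aggregate state; evolve folds one event into it.
type counter struct{ Total int }

func evolve(s counter, e int) counter {
	s.Total += e
	return s
}

// replay folds events onto a starting state, as Load would after reading them.
func replay(start counter, events []int) counter {
	for _, e := range events {
		start = evolve(start, e)
	}
	return start
}

// makeEvents builds n events that each add 1 to the counter.
func makeEvents(n int) []int {
	ev := make([]int, n)
	for i := range ev {
		ev[i] = 1
	}
	return ev
}

// sink keeps the compiler from optimizing the replay away.
var sink counter

func BenchmarkLoad(b *testing.B) {
	// Baseline: full replay at each stream length.
	for _, n := range []int{100, 1000, 10000} {
		events := makeEvents(n)
		b.Run(fmt.Sprintf("full_replay_%d", n), func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				sink = replay(counter{}, events)
			}
		})
	}

	// Snapshot taken at event 1000, then 50 newer events, vs replaying all 1050.
	events := makeEvents(1050)
	snapshot := replay(counter{}, events[:1000])
	b.Run("snapshot_plus_50", func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sink = replay(snapshot, events[1000:])
		}
	})
	b.Run("full_replay_1050", func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sink = replay(counter{}, events)
		}
	})
}
```

To locate the crossover, the same pair of sub-benchmarks could be repeated over a range of stream lengths, with the snapshot load and decode cost included on the snapshot side.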

Store Implementations

  • SQLite: separate snapshots table
  • Postgres: separate snapshots table, same as SQLite
  • Memory: map
  • NATS KV: distributed snapshot cache
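
As an illustration of the simplest of these, the Memory store could look something like the sketch below. The type and constructor names are hypothetical, and two behaviors are assumptions rather than anything the proposal specifies: a missing snapshot is reported as version 0 with a nil error, and a save with an older version than the stored one is ignored.

```go
package main

import (
	"context"
	"sync"
)

// entry is one stored snapshot: the state and the stream version it reflects.
type entry[S any] struct {
	state   S
	version int
}

// MemorySnapshotStore keeps the latest snapshot per stream in a map.
type MemorySnapshotStore[S any] struct {
	mu    sync.RWMutex
	snaps map[string]entry[S]
}

// NewMemorySnapshotStore returns an empty store ready for use.
func NewMemorySnapshotStore[S any]() *MemorySnapshotStore[S] {
	return &MemorySnapshotStore[S]{snaps: make(map[string]entry[S])}
}

// SaveSnapshot stores the snapshot unless a newer or equal one already exists.
func (m *MemorySnapshotStore[S]) SaveSnapshot(_ context.Context, streamID string, version int, state S) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	if cur, ok := m.snaps[streamID]; ok && cur.version >= version {
		return nil // assumption: out-of-order saves are silently ignored
	}
	m.snaps[streamID] = entry[S]{state: state, version: version}
	return nil
}

// LoadSnapshot returns the stored snapshot, or the zero state and version 0 if none exists.
func (m *MemorySnapshotStore[S]) LoadSnapshot(_ context.Context, streamID string) (S, int, error) {
	m.mu.RLock()
	defer m.mu.RUnlock()
	e := m.snaps[streamID]
	return e.state, e.version, nil
}

// roundTrip shows the store keeping only the newest snapshot for a stream.
func roundTrip() (state string, version int) {
	ctx := context.Background()
	store := NewMemorySnapshotStore[string]()
	_ = store.SaveSnapshot(ctx, "order-1", 200, "shipped")
	_ = store.SaveSnapshot(ctx, "order-1", 100, "paid") // older version, ignored
	state, version, _ = store.LoadSnapshot(ctx, "order-1")
	return state, version
}
```

The state is stored by value, so if S contains maps, slices, or pointers, callers share that underlying data with the store; a real implementation might copy or serialize the state instead.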

Acceptance

  • SnapshotStore interface
  • SQLite + Memory implementations
  • Decider integration with configurable interval
  • Benchmarks showing crossover point
  • Snapshot invalidation on schema change
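
The issue does not say how invalidation on schema change should work. One possible approach, sketched here purely as an assumption, is to tag each stored snapshot with a schema number and treat any mismatch as "no snapshot", which falls back to a full replay. The envelope type, currentSchema constant, cart state, and both helper functions are hypothetical names for the example.

```go
package main

import "encoding/json"

// currentSchema is bumped by hand whenever the state type changes shape.
const currentSchema = 2

// envelope is the stored form of a snapshot, tagged with the schema it was written under.
type envelope struct {
	Schema  int             `json:"schema"`
	Version int             `json:"version"`
	State   json.RawMessage `json:"state"`
}

// cart is an example state type.
type cart struct {
	Items int
}

// encodeSnapshot wraps the state in an envelope tagged with currentSchema.
func encodeSnapshot[S any](version int, state S) ([]byte, error) {
	raw, err := json.Marshal(state)
	if err != nil {
		return nil, err
	}
	return json.Marshal(envelope{Schema: currentSchema, Version: version, State: raw})
}

// decodeSnapshot returns ok=false (and no error) when the snapshot was written
// under a different schema, so the caller can fall back to a full replay.
func decodeSnapshot[S any](data []byte) (state S, version int, ok bool, err error) {
	var env envelope
	if err = json.Unmarshal(data, &env); err != nil {
		return state, 0, false, err
	}
	if env.Schema != currentSchema {
		return state, 0, false, nil
	}
	if err = json.Unmarshal(env.State, &state); err != nil {
		return state, 0, false, err
	}
	return state, env.Version, true, nil
}
```

With this layout a stale snapshot never has to be deleted eagerly: it is skipped on load and overwritten by the next save.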
ash closed this issue 2026-02-22 14:31:43 +00:00
Reference
ash/eskit#57