perf: reduce Append allocations — 76 allocs for 1 event is unacceptable #150

Closed
opened 2026-03-01 12:11:52 +00:00 by ash · 0 comments
Owner

Problem

Benchmarked on real hardware:

SQLite AppendSingle: 247µs, 76 allocs
SQLite AppendBatch100: 3.8ms, 2154 allocs (22/event)
Postgres AppendSingle: 3.1ms, 61 allocs
Postgres AppendBatch100: 26ms, 2142 allocs (21/event)

76 allocations to append one event. For the #1 ES framework in Go, target is <10.

Root Causes (sqlitestore)

Per event in the loop:

  1. writeCodec.Marshal(data) — JSON serialize (unavoidable, but could pool buffer)
  2. json.Marshal(meta.Extra) — EVERY event, even when nil. Same metadata for all events in batch.
  3. now.Format(time.RFC3339Nano) — EVERY event, same timestamp. Format once outside loop.
  4. string(jsonData) — byte→string copy for SQL
  5. string(extraJSON) — byte→string copy for SQL
  6. stmt.ExecContext per row — N separate INSERTs instead of batch
  7. result.LastInsertId() — extra call per row
  8. strconv.FormatInt — int→string for Event.ID
  9. eskit.ResolveEventType — reflect per event (could cache)
  10. eskit.CurrentSchemaVersion — lookup per event

Outside loop:
11. Tombstone check query — EVERY append, even new streams
12. Version check query — separate from tombstone
13. PrepareContext — for INSERT

Fixes

Quick wins (no API change)

  • Format timestamp ONCE before loop, reuse string
  • Marshal metadata.Extra ONCE before loop (same for all events)
  • Cache ResolveEventType result per Go type (it's the same type in a typed store)
  • Cache CurrentSchemaVersion (same event type = same version)
  • Combine tombstone + version check into ONE query: SELECT COALESCE(MAX(version),0), (SELECT reason FROM tombstones WHERE stream_id=?) FROM events WHERE stream_id=?

Medium effort

  • Batch INSERT: build one SQL statement for all events instead of N ExecContext calls
    • SQLite: INSERT INTO events VALUES (?,?,?),(?,?,?),(?,?,?)
    • Postgres: INSERT INTO events VALUES ($1,$2,$3),($4,$5,$6) or pgx CopyFrom
  • Use pgx CopyFrom for Postgres batch appends (binary protocol, zero parsing overhead)
  • Pool marshal buffers via sync.Pool

Stretch

  • Store timestamps as INTEGER (unix nanos) — eliminates Format entirely
  • Avoid string([]byte) copies — use driver that accepts []byte directly

Target

  • AppendSingle: <10 allocs
  • AppendBatch100: <8 allocs/event

Requirements

  • Benchmark BEFORE and AFTER on both SQLite and Postgres (VPS: 143.14.50.71)
  • Must pass all conformance tests
  • Fix pgstore benchmark factory to Truncate on setup (currently broken)
  • Both stores
ash closed this issue 2026-03-01 12:16:55 +00:00