feat(agents): pluggable CheckpointStore + cursor run-event persistence #987
No reviewers
Labels
No labels
area:agents
area:dashboard
area:database
area:design
area:design-review
area:flows
area:infra
area:meta
area:security
area:sessions
area:webhook
area:workdir
security
type:bug
type:chore
type:meta
type:user-story
No milestone
No project
No assignees
4 participants
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
charles/claude-hooks!987
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "code-lead/958"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Lifts session-map helpers behind a
CheckpointStoreinterface, addsagent_checkpoint+agent_run_eventtables for cursor's richer surface, and exposes a/agents/sessionsadmin surface so the operator can force-clear a stuck resume.Test plan
just qa(typecheck + biome + tests, all 3355 server tests green)providercolumn + satellite tables existappendRunEvent(PK contention)dropSessionWithArtifactscascades cleanup across all three tablesCloses #958
4a5c3b75ef129edc2e73CI still pending at review time (run #1724, sha
4a5c3b75). Stepping off the review request — will be re-dispatched automatically when CI completes.behavior (
apps/server/src/infrastructure/agent/cursor-sdk-adapter.ts):agent_checkpointtable is created and thesetCheckpoint/getCheckpointAPI works, but the cursor adapter never callssetCheckpointandAgent.createis not wired with{ stores: { checkpointStore, runStore } }. AC explicitly requires: "Pass ourCheckpointStoreadapter intoAgent.create({ stores: { checkpointStore, runStore } })… OnAgent.resume(...), the checkpoint store satisfies cursor's reads from disk." Without the wiring, theagent_checkpointtable is dead storage and cursor won't resume from our on-disk blob. Either wireAgent.createwith the adapter (verifying the SDK's actual option names against@cursor/sdktypes as the AC directs), or document why the SDK doesn't support it and close that AC item explicitly.doc-gap (
apps/server/src/infrastructure/database/drizzle/0011_agent_checkpoint_store.sql, lines 31–33): Header comment says "cleanup is automatic — when the session row is dropped, so are its checkpoint blob and run-event log" via the unique index FK target. NoFOREIGN KEY … ON DELETE CASCADEconstraints are actually defined. The code is correct (no-cascade is the right design, andsessions-migration.test.tsexplicitly tests it), but the comment is wrong and will mislead future readers. Fix the comment to say cleanup is handled explicitly bydropSessionWithArtifacts/ the daily sweep.Everything else is solid: migration is idempotent + partial-DB-safe, concurrency test covers the PK retry path,
guardMutatingon the DELETE route, provider backfill correct, sweeper integration clean.behavior
agent_checkpointtable +setCheckpoint/getCheckpointare dead code in production — the cursor adapter never calls them, andAgent.create/Agent.resume(cursor-sdk-adapter.ts:1161–1172) don't receivestores: { checkpointStore }. The AC explicitly requires passing the store toAgent.create. Either wire it up, or (if cursor SDK 1.0.12 doesn't expose astoresoption) removeagent_checkpoint+ the blob methods as out-of-scope and add a comment explaining why.doc-gap
0011_agent_checkpoint_store.sqllines 16–20 state "Both new tables carry a foreign-key … so cleanup is automatic — when the session row is dropped … so are its checkpoint blob and run-event log." The actual DDL has no FK constraints and no ON DELETE CASCADE.agent-checkpoints.test.ts:134–144explicitly asserts the absence of cascade. The comment needs to accurately describe the application-level cleanup path (sweeper +dropSessionWithArtifacts), not non-existent DB-level cascades.doc-gap
checkpoint-store.ts:132doc comment saysdropSessionWithArtifactsis "Used by both the operator force-clear path and thedropAllForIssuecleanup."dropAllForIssue(sessions.ts:210) does a directDELETEfromclaude_sdk_sessionsonly — it never callsdropSessionWithArtifacts. Satellite orphans for closed issues wait until the next sweeper cycle. Fix the comment.schema
schema/agent-run-event.tsdoesn't declare the compositePRIMARY KEY (session_id, seq)in the Drizzle table definition. The SQL migration is correct, but the Drizzle schema file omitsprimaryKey({ columns: [agentRunEvent.session_id, agentRunEvent.seq] }). Adrizzle-kit pushwould silently drop the PK.- Remove `setCheckpoint`/`getCheckpoint` blob methods from `CheckpointStore` interface and implementation; Cursor SDK 1.0.12 does not expose a `stores` injection point on `Agent.create` — the SDK's `AgentCheckpointStore` has a different interface (`loadLatest`/`getBlobStore`) with no supported custom-store override. Add explanatory NOTE comments to `checkpoint-store.ts` and at the `Agent.create` call site. - Fix false FK-cascade claim in `0011_agent_checkpoint_store.sql` header comment; cleanup is application-level (`dropSessionWithArtifacts` + daily sweeper), not via `ON DELETE CASCADE`. - Fix `dropSessionWithArtifacts` doc comment to remove the incorrect claim that `dropAllForIssue` calls it; satellite orphans from that path wait for the sweeper. - Add `primaryKey({ columns: [t.session_id, t.seq] })` to the `agentRunEvent` Drizzle table definition so a `drizzle-kit push` would not silently drop the composite PK. - Simplify `sweepOrphanCheckpointArtifacts` to only sweep `agent_run_event` (nothing writes to `agent_checkpoint` now). - Update unit tests accordingly. Co-authored-by: Cursor <cursoragent@cursor.com>All reviewer feedback addressed in the latest commit:
behavior —
setCheckpoint/getCheckpointremoved fromCheckpointStoreinterface and implementation. Cursor SDK 1.0.12 does not expose astoresoption onAgent.create/Agent.resume—AgentOptionshas no such field. The SDK's internalAgentCheckpointStoreusesloadLatest(agentId)/getBlobStore(agentId), a completely different interface from ours, with no supported injection point. Theagent_checkpointtable is retained in the DB (migration already applied) with a NOTE comment explaining why it is unpopulated and what would be needed to wire it in a future SDK version.appendRunEventremains wired and active.doc-gap (migration comment) —
0011_agent_checkpoint_store.sqlheader corrected: removed the false FK-cascade claim, replaced with accurate description of application-level cleanup viadropSessionWithArtifacts+sweepOrphanCheckpointArtifacts.doc-gap (checkpoint-store.ts:132) —
dropSessionWithArtifactsdoc comment corrected: removed the incorrectdropAllForIssuereference; the note now accurately describes thatdropAllForIssuedoes a direct DELETE and satellite orphans from that path wait for the daily sweeper.schema —
agent-run-event.tsDrizzle table definition updated withprimaryKey({ columns: [t.session_id, t.seq] })sodrizzle-kit pushwould not silently drop the composite PK.322d04c339f0f33ec83df0f33ec83d5a2fb01765LGTM. Interface, schema, sweep, HTTP surface, and frontend all correct. SDK-limitation deviations (
getCheckpoint/storesinjection) are thoroughly documented inline — the right call given Cursor SDK 1.0.12.Nit (non-blocking):
handleSessionsDropdoc comment claims it dropsagent_checkpointblobs; the table is always empty in this SDK version, so that line is misleading but harmless.Merge call returned false — please merge manually.