feat(sessions): disk-backed session store and resume flow #25
No reviewers
Labels
No labels
area:agents
area:dashboard
area:database
area:design
area:design-review
area:flows
area:infra
area:meta
area:security
area:sessions
area:webhook
area:workdir
security
type:bug
type:chore
type:meta
type:user-story
No milestone
No project
No assignees
2 participants
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
charles/claude-hooks!25
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "boss/6"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Summary
Replaces the
sessionKey/dropSessionstubs inmain.tswith a real disk-backed store (src/sessions.ts) and wires session resumption intorunAgent.src/sessions.ts):sessionKey,getSession,setSession,dropSession,dropAllForIssue, backed by~/.local/state/claude-hooks/sessions.json(override withCLAUDE_HOOKS_STATE_DIR). Atomic writes viatmp + rename. Concurrent writes serialised by an in-process promise-chain mutex — documented as last-writer-wins for the (unexpected) cross-process case.resume: idtoquery()when a session exists; captures the first SDK message'ssession_idand persists it; on a thrown error during a resumed run, drops the bad session and retries once fresh with a warning.implement-delta.md,review-delta.md,address-review-delta.md. The webhook picks<base>-delta.mdwhen a session already exists for the (agent, repo, issue/pr) key, else falls back to the full template.cleanupIssuecallsdropAllForIssue(repo, issue);cleanupBranchaccepts an optionalprNumberand does the same. The PR-close handler passes it./reset: unchanged shape — now awaits the real asyncdropSession.Test plan
bun test— 69 tests pass (newsrc/sessions.test.tscovers round-trip, atomic write + corrupt-tmp survival, concurrent sets,dropAllForIssuescoping)bunx tsc --noEmit— cleanbunx biome check src/— clean/resetcase assertssession=trueafter a realsetSessionresuming session <id>and uses the delta skillCloses #6
Review — PR #25: disk-backed session store and resume flow
All acceptance criteria from issue #6 are met. Code is correct, tests are thorough, and the edge cases are handled well. Two observations below — neither blocks merging.
AC checklist
Sessions store (
src/sessions.ts)sessionKey,getSession,setSession,dropSession,dropAllForIssue— all implemented and exported~/.local/state/claude-hooks/sessions.json(env override works)tmp + renamewith a sufficiently unique tmp name (pid.ms.rand)runAgentintegrationsessionKeycomputed at task start;resume: idpassed toquery()when a session is storedcapturedflag correctly gates the firstsession_idmessage;setSessioncalled with.catch()so a disk error does not kill the taskdropAllForIssue;cleanupBranchnow acceptsprNumber/resetendpoint now awaits the real asyncdropSessionSkill prompt delta vs. full
skillForEventhelper is clean; try/catch fallback to full template when delta file is missingimplement-delta.md,review-delta.md,address-review-delta.mdaddedTests
.tmp.brokensibling does not corrupt readssetSessioncalls all land without data lossdropAllForIssuescoping: only matching (repo, issue) keys removed/resettest inmain.test.tsnow exercises the realdropSessionand assertssession=trueObservation 1 —
mergemissing its delta file (non-blocking)handleApprovednow callsskillForEvent(MERGE_AGENT, repo, pr.number, "merge")butskills/merge-delta.mdis absent. The try/catch fallback inskillForEventsilently returns the fullmerge.md, so the merge agent never benefits from session resumption. Either addmerge-delta.md, or revert this call back toloadSkill("merge")to make the intent explicit. If merge is intentionally a one-shot action the latter is cleaner and avoids the unnecessarygetSessionlookup.Observation 2 — retry scope is broader than "session not found" (non-blocking)
The catch block retries fresh whenever
runOnce(storedSession)throws andstoredSessionis non-null — regardless of error type. This covers the documented case (expired/unknown session ID) but also catches mid-run errors (Forgejo API blip, Bash failure, etc.), dropping a valid session. The behaviour is still acceptable (fresh retry beats permanent failure), but if the agent had already pushed partial work, the fresh run could open a duplicate PR. Not a new risk (it predates sessions), but the new retry path makes it slightly more likely to surface.