fix(agents): re-sync worker config.type after rename — closes #711 #723
Summary
Closes #711.

`POST /agents/types/{old}/rename` updates the DB `agents.type` column, rewrites `agents.json`, and reloads `config.types`, but each in-memory `Worker` was constructed at boot with `config.type = oldName`. Without a re-sync, `pool.getWorkersByType(workers, newName)` returns an empty list (every worker still reports the old type), so `dispatchByType(newName, …)` silently drops every webhook that targets the freshly renamed type, and the agent sits idle on the dashboard until the next service restart.

Two-line fix in the existing post-commit side-effects loop: walk `listAgentsByType(newName)` and update each registered `Worker`'s `config.type` in place, alongside the existing `enqueueRender` call.

Why not unregister + re-register
- `agents.name` is NOT touched by this rename; only the `type` column moves. The worker registry is keyed by `agents.name`, so the lookup key stays valid; only the cached `config.type` field on the existing `Worker` is stale.
- Updating in place keeps `currentTask`, the queue, the abort signal, and the lifecycle hooks. An in-flight task continues to drain through the same `Worker` instance under the new type.
- `claude-hooks-<name>` and `CLAUDE_CONFIG_DIR` (`~/.config/claude-hooks/agent-env/<name>`) are both keyed by `name`, so no reconcile is needed for the `type`-only rename.

The bug we observed in the wild (`boss-default` workers running while the DB carries `code-lead-default` rows) comes from a separate path: the boot-time `migrateForLegacyTypeRenames` rewriting DB names + types in one pass. That path's worker-registry parity relies on the next service restart re-reading the DB, which is the documented operator workflow per `docs/shutdown.md`. This PR scopes the runtime endpoint only.

Tests
Two new cases in `agent-type-rename.test.ts`:

- `re-syncs every live worker's config.type to the new name (#711)`: registers three `senior-*` workers, asserts `getWorkersByType("senior")` returns them pre-rename, calls the rename endpoint, then asserts `getWorkersByType("tech-lead")` returns them post-rename and `getWorkersByType("senior")` is empty.
- `preserves currentTask + queue depth across the rename (#711)`: stashes a fake `currentTask` + a queued entry on `senior-default`, runs the rename, and asserts both survive and `config.type` flipped.

The existing 13 tests on the rename endpoint continue to pass: 15/15 in `bun test src/http/handlers/agent-type-rename.test.ts`. Server suite: 3067 / 3070 (the 3 remaining failures are the pre-existing bun 1.3.11 `fs/promises.utimes` int32 overflow in `sweeper.test.ts`, unrelated and tracked separately).

Typecheck + biome clean.
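
For readers without the repository at hand, the two invariants those tests pin can be sketched roughly as follows. This is an illustration under assumptions, not the project's code: the `Worker` shape, the `registry` map, and `resyncWorkerTypes` are hypothetical stand-ins, and `getWorkersByType` here is a simplified single-argument version of the pool helper named above.

```typescript
// Hypothetical stand-ins: the real Worker and pool types are not shown in this PR.
interface WorkerConfig { name: string; type: string }
interface Worker { config: WorkerConfig; currentTask: string | null; queue: string[] }

// Assumed registry shape: keyed by agents.name, which the rename never changes.
const registry = new Map<string, Worker>();

// Simplified lookup: filters on the cached config.type of each live worker.
function getWorkersByType(type: string): Worker[] {
  return Array.from(registry.values()).filter((w) => w.config.type === type);
}

// Post-commit re-sync; `rows` stands in for the result of listAgentsByType(newName).
function resyncWorkerTypes(rows: { name: string }[], newName: string): void {
  for (const row of rows) {
    const w = registry.get(row.name);
    if (w) w.config.type = newName; // mutate in place; the Worker object is reused
  }
}

// Three workers of type "senior", one of them mid-task with a queued entry.
for (const name of ["senior-default", "senior-a", "senior-b"]) {
  registry.set(name, { config: { name, type: "senior" }, currentTask: null, queue: [] });
}
const busy = registry.get("senior-default")!;
busy.currentTask = "task-1";
busy.queue.push("task-2");

const beforeCount = getWorkersByType("senior").length;

// Names are untouched by the rename, so the rows carry the same names as before.
const rows = Array.from(registry.keys()).map((name) => ({ name }));
resyncWorkerTypes(rows, "tech-lead");

const afterNew = getWorkersByType("tech-lead").length;
const afterOld = getWorkersByType("senior").length;
```

Skipping the `resyncWorkerTypes` call in this sketch leaves `getWorkersByType("tech-lead")` empty, which is the dispatch gap described in the summary.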
Out of scope
- `agents.name` changes: that path lives in the boot-time `migrateForLegacyTypeRenames` migration and is already handled by the next-restart re-read.
- `sweeper.test.ts` flakes (bun upstream).

🤖 Generated with Claude Code
`POST /agents/types/{old}/rename` updates the DB `agents.type` column and reloads `config.types`, but each in-memory `Worker` was constructed at boot with `config.type = oldName`. Without a re-sync, `pool.getWorkersByType(workers, newName)` returns an empty list (every worker still reports the old type), so `dispatchByType(newName, …)` silently drops every webhook that targets the freshly renamed type; the agent goes idle on the dashboard until the next service restart.

Walk `listAgentsByType(newName)` post-commit and update each registered worker's `config.type` in place, alongside the existing `enqueueRender` loop. Worker names are not touched (the rename only changes the `type` column on `agents`); `currentTask` + queue depth are preserved verbatim, so an in-flight task continues draining through the freshly re-synced worker.

Two new tests pin the behaviour:
- post-rename `getWorker(name).config.type` flips to the new name and `getWorkersByType(newName)` finds every renamed instance,
- a worker carrying a fake `currentTask` + queue entry through the rename retains both intact.

Closes #711

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Minimal, correct fix. `getWorker(row.name)` + `w.config.type = newName` piggybacks on the existing `listAgentsByType(newName)` loop at the right point (post-DB-commit), so no inconsistency window. Two tests cover the re-sync invariant and non-destructive behaviour (`currentTask` + queue depth preserved). CI green.
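
The "non-destructive" point can be made concrete with a small comparison of in-place mutation against a replace-the-worker approach. This is only a sketch: the `Worker` shape, `makeWorker`, `renameInPlace`, and `renameByReRegister` are hypothetical names, and it assumes that a re-registered worker would start with empty state, which this PR does not confirm.

```typescript
// Hypothetical Worker shape; the real one also carries an abort signal and lifecycle hooks.
interface Worker {
  config: { name: string; type: string };
  currentTask: string | null;
  queue: string[];
}

function makeWorker(name: string, type: string): Worker {
  return { config: { name, type }, currentTask: null, queue: [] };
}

// In-place approach: the same object stays registered, one field changes.
function renameInPlace(reg: Map<string, Worker>, name: string, newType: string): void {
  const w = reg.get(name);
  if (w) w.config.type = newType;
}

// Alternative for comparison, ASSUMING a re-registered worker starts empty.
function renameByReRegister(reg: Map<string, Worker>, name: string, newType: string): void {
  reg.delete(name);
  reg.set(name, makeWorker(name, newType));
}

// In place: identity and in-flight state are kept.
const regA = new Map<string, Worker>([["senior-default", makeWorker("senior-default", "senior")]]);
const kept = regA.get("senior-default")!;
kept.currentTask = "task-1";
renameInPlace(regA, "senior-default", "tech-lead");
const sameInstance = regA.get("senior-default") === kept;

// Re-register: a different object takes the slot and the in-flight task is gone.
const regB = new Map<string, Worker>([["senior-default", makeWorker("senior-default", "senior")]]);
const original = regB.get("senior-default")!;
original.currentTask = "task-1";
renameByReRegister(regB, "senior-default", "tech-lead");
const replaced = regB.get("senior-default")!;
const lostTask = replaced !== original && replaced.currentTask === null;
```

Under that assumption, anything still holding a reference to the original worker would be detached from the registry after a re-register, whereas the in-place update leaves every existing reference valid.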