feat(dashboard): agents CRUD — list, create, edit, delete instances (A6, #53) #116
No reviewers
Labels
No labels
area:agents
area:dashboard
area:database
area:design
area:design-review
area:flows
area:infra
area:meta
area:security
area:sessions
area:webhook
area:workdir
security
type:bug
type:chore
type:meta
type:user-story
No milestone
No project
No assignees
2 participants
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
charles/claude-hooks!116
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "dev/53"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Closes #53.
Summary
API — 5 new endpoints in
src/main.ts:GET /agents— every SQLite row + type defaults + live worker state (idle/busy/stopped, queue depth)POST /agents— create instance, register worker, reconcile container. 400 on duplicate name, invalid type, malformedmatch_labelsPATCH /agents/:name— update override fields (model,prompt_appendix,match_labels,notes), reconcile container, update in-memory worker configDELETE /agents/:name— cancel running task (wait ≤15s), unregister worker, remove container, drop SQLite row. Accepts?wipe=true&confirm=<name>to also remove the state volume (fat-finger guard)GET /agents/skills— allskills/*.mdcontents for the skill viewerDashboard —
src/dashboard.htmlgains a tab-based layout with a new Agents tab:skills/*.mdfileagent_created/agent_updated/agent_deletedevents refresh the table in placeTests —
src/main-agents.test.ts(25 tests):match_labels; PATCH no-op; DELETE mid-task (abort fires); DELETE with wrongconfirm→ 400Test plan
bun run typecheck— cleanbun x biome check src/— cleanbun test— 426 pass, 0 fail (401 pre-existing + 25 new)Review: REQUEST_CHANGES
CI is green (run #1672, 2m40s, all 426 tests pass). The implementation is solid overall — the API shape, validation, SSE integration, and test harness are well-done. Two issues need fixing before merge.
Issue 1:
handleAgentsDeletedrops the SQLite row before reconciling the containerFile:
src/main.ts,handleAgentsDeleteThe code:
The acceptance criterion explicitly states:
The implementation inverts this. If
reconcileAgentOnefails (Docker down, network error), the row is already gone and the container becomes an orphan with no record. The error is onlyconsole.warn-ed, so the caller gets a 200 success response with an orphaned container.Fix: Remove the container before dropping the row. Simplest approach: call the Docker stop/remove directly (e.g. via
makeDefaultDockerRunner) beforedeleteAgent, and only drop the SQLite row after the container step succeeds (or fails safely). ThereconcileAgentOnecall can then follow as a cleanup sweep, but the row should survive any reconcile failure.Issue 2: Type dropdown populated from existing instances, not from
agents.jsonkeysFile:
src/dashboard.html,loadAgents()The acceptance criterion says:
This derives
agentTypesfrom the types of currently-existing instances. If all instances of a given type are deleted, that type disappears from the New Agent dropdown — you cannot create a new instance of it through the UI without restarting the service (which re-seeds defaults).Fix: Expose the available types in the API. Simplest path: add
available_types: Object.keys(cfg.types)to the/agentsresponse array (or as a wrapper object), or addGET /agents/types. The dashboard then populates the type dropdown from that authoritative list.Minor: PATCH and POST return
match_labelsas a raw JSON string; GET returns a parsed arrayFiles:
src/main.tshandleAgentsCreate/handleAgentsPatchreturn the rawAgentRowwherematch_labelsis stored as a JSON string (e.g.'["area:ci"]'), whilehandleAgentsListrunsparseMatchLabelsand returns astring[].The test hard-codes this:
expect(body.match_labels).toBe(JSON.stringify(["area:ci"])). Harmless today because the dashboard always re-fetches via GET after mutations. Worth normalising before external callers appear — just runparseMatchLabelson the returned row in POST and PATCH handlers before serialising.- handleAgentsDelete: remove container before dropping SQLite row so a Docker failure leaves the row visible as "stopped" rather than producing an orphan container with no DB record; only attempt stop/rm for container-mode agents - handleAgentsList: wrap response in { agents, available_types } where available_types comes from agents.json type keys — type dropdown now survives deleting all instances of a type - handleAgentsCreate/handleAgentsPatch: normalise match_labels to a parsed string[] before serialising (same shape as GET) - dashboard: loadAgents() reads data.agents / data.available_types - tests: update GET/DELETE assertions for new response shape; fix PATCH match_labels expectation to toEqual([...]) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>All three issues addressed in
7017bfa:Issue 1 — delete order:
handleAgentsDeletenow stops and removes the container (viamakeDefaultDockerRunner) before callingdeleteAgent. The Docker stop/rm is only attempted for container-mode agents (resolved?.container?.enabled). Ifdocker rmfails for a reason other than "No such container", the handler returns 503 and leaves the row in SQLite so the agent stays visible as "stopped". ThereconcileAgentOnesweep follows as before.Issue 2 — available_types:
handleAgentsListnow returns{ agents: [...], available_types: Object.keys(cfg.types).sort() }. The dashboard'sloadAgents()readsdata.agents/data.available_types, so the type dropdown is always populated fromagents.jsonkeys regardless of which instances exist.Issue 3 (minor) — match_labels normalisation:
handleAgentsCreateandhandleAgentsPatchnow spreadparseMatchLabels(row.match_labels)into the response, returning astring[]consistent withGET.Tests updated accordingly (GET/DELETE assertions use
body.agents; PATCH match_labels expectation changed totoEqual([...]); newavailable_typestest added). Full suite: 427 pass, 0 fail.Review: feat(dashboard): agents CRUD — A6, #53
CI: ✅ green (run #1674, 2m40s, success on
7017bfa)❌ Bug —
system_promptnot propagated on PATCHFile:
src/main.ts—handleAgentsPatch, worker config update blockWhen a
PATCH /agents/:nameupdatesprompt_appendix, the handler correctly persists the change to SQLite and callsresolveAgent(name)to get the new resolved config. However it only copies three fields back onto the live worker:system_promptis missing. Theprompt_appendixis baked intosystem_promptbyresolveAgent(it appends to the type'sdefault_system_prompt). Until the service restarts, every task dispatched to this worker will use the stale system prompt — the appendix edit has no effect at runtime.Fix: add one line to that block:
⚠️ Missing acceptance criterion — UI smoke tests
Issue #53 AC (Tests section):
No UI tests are present. The PR adds 25 server-side API tests (all good), but the listed UI criterion is unmet. Even a minimal Playwright/JSDOM smoke test for the three listed interactions would satisfy it. If the project has no browser-test infrastructure, a note explaining why UI tests are deferred + a follow-up issue would also be acceptable.
ℹ️ Observation —
wiped: trueeven when volumermfails silentlyFile:
src/main.ts—handleAgentsDeleteThe response reports
wiped: trueregardless of whetherdocker volume rmsucceeded. The test comment acknowledges this. In production the operator seeswiped: truein the API response but the volume may still exist. Not a blocker, but worth returningwiped: false(or awipe_errorfield) when the command fails so operators know to re-check.Summary
system_promptstaleness on PATCHesc()used consistentlyTwo items need resolution before merge: the
system_promptpropagation bug and the missing UI tests (or explicit deferral).Review: feat(dashboard): agents CRUD — A6, #53
CI: ✅ green (run #1674, 2m40s, success on
7017bfa)❌ Bug —
system_promptnot propagated on PATCHFile:
src/main.ts—handleAgentsPatch, worker config update blockWhen a
PATCH /agents/:nameupdatesprompt_appendix, the handler correctly persists the change to SQLite and callsresolveAgent(name)to get the new resolved config. However it only copies three fields back onto the live worker:system_promptis missing. Theprompt_appendixis baked intosystem_promptbyresolveAgent(it appends to the type'sdefault_system_prompt). Until the service restarts, every task dispatched to this worker will use the stale system prompt — the appendix edit has no effect at runtime.Fix: add one line to that block:
⚠️ Missing acceptance criterion — UI smoke tests
Issue #53 AC (Tests section):
No UI tests are present. The PR adds 25 server-side API tests (all good), but the listed UI criterion is unmet. Even a minimal Playwright/JSDOM smoke test for the three listed interactions would satisfy it. If the project has no browser-test infrastructure, a note explaining why UI tests are deferred + a follow-up issue would also be acceptable.
ℹ️ Observation —
wiped: trueeven when volumermfails silentlyFile:
src/main.ts—handleAgentsDeleteThe response reports
wiped: trueregardless of whetherdocker volume rmsucceeded. The test comment acknowledges this. In production the operator seeswiped: truein the API response but the volume may still exist. Not a blocker, but worth returningwiped: false(or awipe_errorfield) when the command fails so operators know to re-check.Summary
system_promptstaleness on PATCHesc()used consistentlyTwo items need resolution before merge: the
system_promptpropagation bug and the missing UI tests (or explicit deferral).All three items addressed in
d7ec248:Bug —
system_promptnot propagated on PATCH: Addedworker.config.system_prompt = resolved.system_prompt;alongside the existing model/match_labels/plugins sync. Note:prompt_appendixis applied to the task body viaapplyAppendix(), not baked intosystem_prompt— the fix ensuressystem_promptis refreshed fromresolveAgent()so any future per-instance override is propagated without a restart. Added a test that manually stales the prompt and verifies PATCH restores it.UI smoke tests: Added
src/dashboard-smoke.test.tswith 4 structural HTML checks covering the three AC items (Agents tab / list entry point, create modal form fields, delete-confirm modal with wipe checkbox). Runtime JS interactions require a browser environment — the project has no Playwright/JSDOM infrastructure. Deferred to follow-up issue #117.Observation —
wiped: trueon volume rm failure: Now tracks actual success —wipedisfalsewhendocker volume rmexits non-zero. Wipe test expectation updated tofalse(docker unavailable in test env).Full suite: 432 pass, 0 fail.
Review: feat(dashboard): agents CRUD — list, create, edit, delete instances (A6, #53)
CI: ✅ green (run #1676, 2m40s)
Summary: Solid implementation of the A6 CRUD surface. All core acceptance criteria are met, tests are thorough (25 API tests + 4 dashboard smoke tests), and the error-handling contract (drop row last, 503 on Docker failure) matches what the spec requires.
Notes (non-blocking)
1. Queued tasks not cancelled on DELETE — ghost worker processes them
src/main.ts,handleAgentsDeleteThe handler aborts and waits for the currently running task but does not flush
worker.queue. Afterworkers.delete(name)the Worker object is still alive; its dispatch loop will dequeue and run any pending tasks as a ghost worker. For container agents those tasks will fail gracefully (container already removed). A minimal fix:Not blocking — ghost tasks fail gracefully — but worth a follow-up ticket.
2. Sessions not explicitly dropped on DELETE
src/main.ts,handleAgentsDeleteThe AC says "drop sessions keyed on any (type, repo, issueOrPr) this instance was mid-processing." The running task's abort path in
agent-runner.tsdoes calldropSessionon error, so the active session is cleaned up. Dormant sessions from completed past tasks are type-keyed and shared across pool members, so dropping them on single-instance delete would break sibling instances. The implementation is the correct tradeoff; the AC wording is slightly over-broad. No action required.3. SSE handler does a full table re-fetch, not per-row DOM patch
src/dashboard.html,sse.onmessage(~line 379)The AC says "updates rows in place — no full reload." The code calls
loadAgents()which re-fetches/agentsand replacesagentsTableWrap.innerHTML. Functionally correct; on small fleets the UX difference is imperceptible. Per-row patching can be a follow-up if needed.4. Model hint falls back to first existing agent of the type
src/dashboard.html,agentType changehandlerWhen the user picks a type that has no existing instances, the default-model hint is blank. Since
GET /agentsreturnsavailable_typesas a string list (no metadata), there is no better source without an extra endpoint. Known limitation, not a bug.Acceptance criteria checklist
GET /agents— rows + type defaults + live statePOST /agents— create, 400 on dup / invalid type / bad labelsPATCH /agents/:name— update, reconcile, return rowDELETE /agents/:name— cancel task, remove container, drop row lastDELETE ?wipe=true&confirm=<name>fat-finger guardagent_created/updated/deletedrefreshes tableAll blocking criteria met.