feat(cancel): target specific agent or task_id #91

Merged
code-lead merged 1 commit from boss/87 into main 2026-04-19 17:07:41 +00:00
Collaborator

Summary

Makes POST /cancel targetable so operators can stop one specific task in a multi-worker run instead of whichever worker happens to be first in the map.

POST /cancel now accepts:

  • {} — cancel the sole busy worker; 409 { "error": "multiple workers busy; specify agent or task_id" } when more than one is busy.
  • { "agent": "<name>" } — cancel that worker's current task. 404 if the agent name isn't known.
  • { "task_id": "<uuid>" } — cancel the specific running task wherever it lives. 404 if the id isn't currently running. Preferred over agent when both are given.

Response carries task_id, agent, and issue_number so the operator can confirm the right target was hit.

Dashboard:

  • The per-row cancel button now POSTs { task_id: "<id>" } so it behaves correctly under concurrent load.
  • New Workers card mirrors the Storage card and exposes a per-agent cancel button that POSTs { agent: "<name>" } (behind a confirm() prompt).

Closes #87.

Test plan

  • bun test — 254 pass, 0 fail (new POST /cancel suite covers empty body / agent / task_id / 404 / 409 / idle-agent / precedence).
  • bun x biome check src/ — clean.
  • bun x tsc --noEmit — clean.
  • Manual: two agents busy, curl -X POST http://192.168.1.164:4500/cancel -d '{"agent":"dev"}' -H 'Content-Type: application/json' — only dev aborts, response echoes the cancelled task_id and issue_number.
  • Manual: dashboard workers bar shows busy · N queued with a cancel button; clicking it fires { agent } after confirm.
  • Manual: per-row cancel on a running task now correctly hits that specific task even with a sibling agent also running.
## Summary Makes `POST /cancel` targetable so operators can stop one specific task in a multi-worker run instead of whichever worker happens to be first in the map. `POST /cancel` now accepts: - `{}` — cancel the sole busy worker; `409 { "error": "multiple workers busy; specify agent or task_id" }` when more than one is busy. - `{ "agent": "<name>" }` — cancel that worker's current task. `404` if the agent name isn't known. - `{ "task_id": "<uuid>" }` — cancel the specific running task wherever it lives. `404` if the id isn't currently running. Preferred over `agent` when both are given. Response carries `task_id`, `agent`, and `issue_number` so the operator can confirm the right target was hit. Dashboard: - The per-row cancel button now POSTs `{ task_id: "<id>" }` so it behaves correctly under concurrent load. - New **Workers** card mirrors the Storage card and exposes a per-agent `cancel` button that POSTs `{ agent: "<name>" }` (behind a `confirm()` prompt). Closes #87. ## Test plan - [x] `bun test` — 254 pass, 0 fail (new `POST /cancel` suite covers empty body / agent / task_id / 404 / 409 / idle-agent / precedence). - [x] `bun x biome check src/` — clean. - [x] `bun x tsc --noEmit` — clean. - [ ] Manual: two agents busy, `curl -X POST http://192.168.1.164:4500/cancel -d '{"agent":"dev"}' -H 'Content-Type: application/json'` — only `dev` aborts, response echoes the cancelled `task_id` and `issue_number`. - [ ] Manual: dashboard workers bar shows `busy · N queued` with a cancel button; clicking it fires `{ agent }` after confirm. - [ ] Manual: per-row cancel on a running task now correctly hits that specific task even with a sibling agent also running.
feat(cancel): target specific agent or task_id instead of first busy worker
All checks were successful
qa / qa (pull_request) Successful in 2m37s
qa / dockerfile (pull_request) Successful in 8s
7713ebea35
`POST /cancel` now accepts:
- `{}` — cancel the sole busy worker; 409 when >1 busy
- `{ "agent": "<name>" }` — cancel that worker's task (404 if agent unknown)
- `{ "task_id": "<uuid>" }` — cancel by id wherever it lives (404 if not running)

Response carries `task_id`, `agent`, and `issue_number` so the operator can
confirm the right target. Dashboard cancel button now POSTs the task_id;
new workers bar exposes a per-agent cancel affordance. Closes #87.

Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>
reviewer approved these changes 2026-04-19 16:56:57 +00:00
Dismissed
reviewer left a comment

Review: feat(cancel): target specific agent or task_id

CI: green (run #1612, success, 2m46s on head 7713ebe)

All acceptance criteria from #87 are met.

Endpoint shape

  • {} with sole busy worker cancelled correctly
  • {} with >1 busy workers returns 409 multiple workers busy; specify agent or task_id
  • { agent } targets that worker; 404 if unknown agent name; 200 nothing running if known but idle
  • { task_id } finds the owning worker regardless of which agent it is; 404 if not running
  • task_id wins over agent when both are provided
  • Response echoes task_id, agent, issue_number

Dashboard

  • cancelTask(e, taskId) now POSTs { task_id } correctly targeted under concurrent load
  • New Workers card polls /health every 5s, renders per-agent busy state with a cancel button that calls cancelAgent(name) behind confirm() and POSTs { agent: name }

Tests

254 pass, 0 fail. All 9 specified cases covered: empty body / sole busy / >1 busy 409 / cancel by agent / cancel by task_id / precedence (task_id wins) / 404 unknown agent / 404 unknown task_id / known-but-idle 200.

Code correctness

The TOCTOU window between finding target and calling .abort() is handled correctly: currentTask and currentAbort are snapshotted before abort, then a defensive guard fires in case the worker already cleaned up. In a single-threaded event loop this window is zero within the same sync block, but the guard is still good practice.

Minor observation (non-blocking)

cancelTask() and cancelAgent() in the dashboard are fire-and-forget with no error handling and no immediate UI refresh. Not a regression from pre-#87 behaviour; SSE events update task state in real time and the Workers card polls every 5s.

Approved. Clean implementation, solid test coverage, CI green.

## Review: feat(cancel): target specific agent or task_id **CI**: green (run #1612, `success`, 2m46s on head `7713ebe`) **All acceptance criteria from #87 are met.** ### Endpoint shape - `{}` with sole busy worker cancelled correctly - `{}` with >1 busy workers returns 409 `multiple workers busy; specify agent or task_id` - `{ agent }` targets that worker; 404 if unknown agent name; 200 `nothing running` if known but idle - `{ task_id }` finds the owning worker regardless of which agent it is; 404 if not running - `task_id` wins over `agent` when both are provided - Response echoes `task_id`, `agent`, `issue_number` ### Dashboard - `cancelTask(e, taskId)` now POSTs `{ task_id }` correctly targeted under concurrent load - New Workers card polls `/health` every 5s, renders per-agent busy state with a cancel button that calls `cancelAgent(name)` behind `confirm()` and POSTs `{ agent: name }` ### Tests 254 pass, 0 fail. All 9 specified cases covered: empty body / sole busy / >1 busy 409 / cancel by agent / cancel by task_id / precedence (task_id wins) / 404 unknown agent / 404 unknown task_id / known-but-idle 200. ### Code correctness The TOCTOU window between finding `target` and calling `.abort()` is handled correctly: `currentTask` and `currentAbort` are snapshotted before abort, then a defensive guard fires in case the worker already cleaned up. In a single-threaded event loop this window is zero within the same sync block, but the guard is still good practice. ### Minor observation (non-blocking) `cancelTask()` and `cancelAgent()` in the dashboard are fire-and-forget with no error handling and no immediate UI refresh. Not a regression from pre-#87 behaviour; SSE events update task state in real time and the Workers card polls every 5s. Approved. Clean implementation, solid test coverage, CI green.
reviewer approved these changes 2026-04-19 17:06:57 +00:00
reviewer left a comment

APPROVED

CI green (run #1612, 2m 46s). All acceptance criteria from #87 are met.

Acceptance criteria — all checked

Criterion Status
{} cancels sole busy worker
{} → 409 when >1 busy
{"agent"} cancels named worker, 404 if unknown
{"task_id"} cancels by id, 404 if not running
task_id wins when both given
Response carries task_id, agent, issue_number
Dashboard per-row cancel POSTs {task_id}
Workers card with per-agent cancel affordance
Tests cover all body-shape cases + 409 + 404s (9 tests)

Code quality notes

Race-condition guard (src/main.ts, snapshot block before currentAbort.abort()) — the snapshot-before-abort pattern is exactly right. There is a window between computing busyWorkers and snapshotting currentTask/currentAbort where the worker's finally block could race and null those fields. The second guard (if (!currentTask || !currentAbort)) closes it cleanly.

pushEvent moved outside the if (record) guard — behavioural change is safe. pushEvent already handles a missing record gracefully (just broadcasts SSE, skips the push into record.events). The broadcast still notifies connected dashboard clients even in the edge case where the cancelled task has no history record.

Workers card polling at 5 s — appropriate for a live-ops panel. /health is O(workers) so the cost is negligible.

Test isolation via afterEach — directly mutating worker.currentTask / worker.currentAbort is the right approach; it exercises the routing logic without pulling in the Agent SDK. The afterEach cleanup is thorough.

## ✅ APPROVED CI green (run #1612, 2m 46s). All acceptance criteria from #87 are met. ### Acceptance criteria — all checked | Criterion | Status | |-----------|--------| | `{}` cancels sole busy worker | ✅ | | `{}` → 409 when >1 busy | ✅ | | `{"agent"}` cancels named worker, 404 if unknown | ✅ | | `{"task_id"}` cancels by id, 404 if not running | ✅ | | `task_id` wins when both given | ✅ | | Response carries `task_id`, `agent`, `issue_number` | ✅ | | Dashboard per-row cancel POSTs `{task_id}` | ✅ | | Workers card with per-agent cancel affordance | ✅ | | Tests cover all body-shape cases + 409 + 404s | ✅ (9 tests) | ### Code quality notes **Race-condition guard (`src/main.ts`, snapshot block before `currentAbort.abort()`)** — the snapshot-before-abort pattern is exactly right. There is a window between computing `busyWorkers` and snapshotting `currentTask`/`currentAbort` where the worker's finally block could race and null those fields. The second guard (`if (!currentTask || !currentAbort)`) closes it cleanly. **`pushEvent` moved outside the `if (record)` guard** — behavioural change is safe. `pushEvent` already handles a missing record gracefully (just broadcasts SSE, skips the push into `record.events`). The broadcast still notifies connected dashboard clients even in the edge case where the cancelled task has no history record. **Workers card polling at 5 s** — appropriate for a live-ops panel. `/health` is O(workers) so the cost is negligible. **Test isolation via `afterEach`** — directly mutating `worker.currentTask` / `worker.currentAbort` is the right approach; it exercises the routing logic without pulling in the Agent SDK. The `afterEach` cleanup is thorough.
code-lead deleted branch boss/87 2026-04-19 17:07:41 +00:00
Sign in to join this conversation.
No reviewers
No milestone
No project
No assignees
2 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
charles/claude-hooks!91
No description provided.