/cancel endpoint: accept an agent param instead of cancelling the first busy worker #87

Closed
opened 2026-04-19 09:38:37 +00:00 by claude-desktop · 0 comments
Collaborator

User story

As the operator, I want /cancel to take an explicit agent
name so that cancelling one agent's dupe / stuck task doesn't
collateral-damage another agent's critical work.

What happened (concrete)

During a 3-ticket parallel dispatch (#81 / #82 / #83) today, two
workers were simultaneously busy:

  • boss running f98fe733 — the address-review task for PR #85
    (ticket #81) after reviewer's REQUEST_CHANGES.
  • dev running 8cdcd807 — a leftover dupe from an earlier
    assignee-pivot trick; harmless but burning cycles on an
    already-solved issue.

Intent: cancel dev's dupe. Actual behaviour: /cancel walks
workers.values() in map order, aborts the first worker with
currentAbort + currentTask, then breaks — so it killed
boss's critical address-review. Dev kept running its dupe
untouched. PR #85 got stuck in "changes requested" with no
one addressing it; had to manually re-request review to
resurrect the flow.

See src/main.ts:332 (/cancel handler).

Why this matters

Today the default 5-agent pool runs independent flows in
parallel (designer + dev + reviewer + boss merge step, plus
design-reviewer). The multi-agent pool config in #47 / #49
makes this the norm, not the exception. A non-targetable
/cancel is a foot-gun: operators reach for it expecting to
stop one specific task and get a random cancellation instead.

Acceptance criteria

Endpoint shape

  • POST /cancel accepts a JSON body:
    - {} — keep today's "cancel the one running task" behaviour
    only if exactly one worker is busy; otherwise return
    409 { "error": "multiple workers busy; specify agent or task_id" }.
    - { "agent": "<name>" } — cancel that worker's current
    task. 404 if the agent name isn't known.
    - { "task_id": "<uuid>" } — cancel the specific running
    task wherever it lives. 404 if the id isn't currently
    running. Preferred over agent when both are given.
  • Response carries the cancelled task's id, agent, and
    issue_number so the operator can confirm the right target
    was hit.

Dashboard

  • src/dashboard.html per-row cancel button already exists
    (line 245). Update cancelTask(event) (line 351) to POST
    {"task_id": "<taskId>"} instead of an empty body — makes
    the button behave correctly under concurrent load.
  • Workers-summary bar gets a per-agent "cancel" affordance
    that POSTs {"agent": "<name>"}.

Tests

  • main.test.ts POST /cancel suite:
    - current: a single test that doesn't distinguish the "no
    body" / "stale body" cases. Split into body-shape cases.
    - add: cancels only the named agent's task — two workers
    busy, body targets one, the other keeps running.
    - add: rejects empty body when >1 worker busy — 409.
    - add: returns 404 on unknown agent / unknown task_id.

Out of scope

  • Per-task pause / resume.
  • Cancelling queued tasks (today's handler only touches
    currentTask). The queue manipulation is a separate ticket
    when someone needs it.
  • CLI or TUI wrapper — curl -X POST -d '{"agent":"boss"}' is
    fine for the operator for now.

References

  • src/main.ts:332 — the handler to split.
  • src/dashboard.html:245,351 — the button/fetch pair to update.
  • #76 — already validated the multi-agent container mode that
    makes this class of bug more likely.

Dependencies

  • Blocked by: nothing.
  • Blocks: #47 / #49 pool scheduler once it can run >1 task
    per agent type concurrently — a pool makes the foot-gun
    strictly worse.
  • Branch off: main.
## User story As the **operator**, I want `/cancel` to take an explicit `agent` name so that cancelling one agent's dupe / stuck task doesn't collateral-damage another agent's critical work. ## What happened (concrete) During a 3-ticket parallel dispatch (#81 / #82 / #83) today, two workers were simultaneously busy: - `boss` running `f98fe733` — the `address-review` task for PR #85 (ticket #81) after reviewer's REQUEST_CHANGES. - `dev` running `8cdcd807` — a leftover dupe from an earlier assignee-pivot trick; harmless but burning cycles on an already-solved issue. Intent: cancel dev's dupe. Actual behaviour: `/cancel` walks `workers.values()` in map order, aborts the **first** worker with `currentAbort + currentTask`, then `break`s — so it killed boss's critical address-review. Dev kept running its dupe untouched. PR #85 got stuck in "changes requested" with no one addressing it; had to manually re-request review to resurrect the flow. See `src/main.ts:332` (`/cancel` handler). ## Why this matters Today the default 5-agent pool runs independent flows in parallel (designer + dev + reviewer + boss merge step, plus design-reviewer). The multi-agent pool config in #47 / #49 makes this the norm, not the exception. A non-targetable `/cancel` is a foot-gun: operators reach for it expecting to stop *one specific task* and get a random cancellation instead. ## Acceptance criteria ### Endpoint shape - [ ] `POST /cancel` accepts a JSON body: - `{}` — keep today's "cancel the one running task" behaviour **only if exactly one worker is busy**; otherwise return `409 { "error": "multiple workers busy; specify agent or task_id" }`. - `{ "agent": "<name>" }` — cancel that worker's current task. `404` if the agent name isn't known. - `{ "task_id": "<uuid>" }` — cancel the specific running task wherever it lives. `404` if the id isn't currently running. Preferred over `agent` when both are given. - [ ] Response carries the cancelled task's `id`, `agent`, and `issue_number` so the operator can confirm the right target was hit. ### Dashboard - [ ] `src/dashboard.html` per-row cancel button already exists (line 245). Update `cancelTask(event)` (line 351) to POST `{"task_id": "<taskId>"}` instead of an empty body — makes the button behave correctly under concurrent load. - [ ] Workers-summary bar gets a per-agent "cancel" affordance that POSTs `{"agent": "<name>"}`. ### Tests - [ ] `main.test.ts` `POST /cancel` suite: - current: a single test that doesn't distinguish the "no body" / "stale body" cases. Split into body-shape cases. - add: `cancels only the named agent's task` — two workers busy, body targets one, the other keeps running. - add: `rejects empty body when >1 worker busy` — 409. - add: `returns 404 on unknown agent / unknown task_id`. ## Out of scope - Per-task pause / resume. - Cancelling queued tasks (today's handler only touches `currentTask`). The queue manipulation is a separate ticket when someone needs it. - CLI or TUI wrapper — `curl -X POST -d '{"agent":"boss"}'` is fine for the operator for now. ## References - `src/main.ts:332` — the handler to split. - `src/dashboard.html:245,351` — the button/fetch pair to update. - #76 — already validated the multi-agent container mode that makes this class of bug more likely. ## Dependencies - **Blocked by:** nothing. - **Blocks:** #47 / #49 pool scheduler once it can run >1 task per agent type concurrently — a pool makes the foot-gun strictly worse. - **Branch off:** `main`.
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
charles/claude-hooks#87
No description provided.