feat(monitor): mid-flight steering — inject user messages into a running agent #224

Closed
opened 2026-04-21 12:03:27 +00:00 by claude-desktop · 0 comments
Collaborator

User story

As an operator, I want to interrupt a running agent mid-turn and send it a redirect message ("no, don't touch sessions.ts, focus on db.ts") without killing the task and losing its context. Today the only recourse is /cancel + re-dispatch, which wastes the accumulated reasoning and costs a full re-prompt.

Acceptance criteria

SDK integration

  • Switch agent-runner.ts from one-shot query() to the streaming input variant of @anthropic-ai/claude-agent-sdk so new user messages can be pushed into a live conversation. Existing session-chain + resume semantics remain intact.
  • On every dispatch, keep a bounded in-memory queue of pending operator messages keyed by task_id. The SDK iterator drains from this queue each turn boundary.

Steer endpoint

  • POST /task/:id/steer with body { message: string }. Auth-gated (M18-8). Returns 202 when the message is queued; 409 Conflict if the task has settled; 413 Payload Too Large if the body exceeds a sane cap (suggest 8 KB).
  • The queued message is delivered at the next turn boundary of the in-flight SDK conversation — ideally as a regular user role message so the agent treats it as operator input, not as tool output.
  • An SSE steer_queued / steer_delivered envelope emits on /events so the Monitor UI can show the message appearing in the transcript with a distinct visual.

Interrupt-and-steer UI wiring

  • The task detail page (per the Penpot handoff) implements the "Interrupt + Steer" composer. Pressing Enter fires POST /task/:id/steer; the panel flips to a pending state until the service echoes steer_delivered.
  • The transcript renders operator-injected messages with a dedicated role color (suggest --color-role-operator, add to tokens if missing) so agent vs. operator turns are distinguishable.
  • Cancel remains a separate button with a confirm prompt — steer is additive, cancel is terminal.

Safety rails

  • Rate-limit steer to N messages per minute per task (suggest 6 — one every ~10 s). Further submissions return 429 Too Many Requests. Prevents a runaway loop of chained steers from overloading the SDK.
  • Queue is capped at 10 messages. Overflow drops the oldest and logs a warning — operator should see a toast.
  • Steer is refused on the foreman agent — the foreman already has a full chat surface via /foreman/chat; steering would be a confusing second channel. Return 400 with a hint pointing at the Planner page.

Verification

  • Unit test in apps/server/src/agent-runner.test.ts — mocked SDK with streaming input, verify a posted steer message lands in the next iterator yield.
  • Integration test: dispatch a long-running task (mocked SDK), POST /steer, assert the transcript receives the injected user message and the agent's next assistant turn references it.
  • Manual: dispatch a dev task on a dummy issue, steer it mid-run with "also update CLAUDE.md", confirm the resulting PR includes the CLAUDE.md edit.

Dependencies

  • Blocked by the Penpot mockup ticket (task-detail redesign). The implementer should ship against the agreed-upon visual and copy.

Out of scope

  • Multi-party chat (agents responding to each other). This is strictly operator-to-single-agent.
  • Persisted steering history — the queue is in-memory; the transcript captures what was delivered.
  • Retroactively steering an already-settled task — for that, use re-dispatch (companion ticket) and re-prompt fresh.

References

  • @anthropic-ai/claude-agent-sdk — streaming input mode.
  • apps/server/src/agent-runner.ts — current one-shot query invocation.
  • apps/server/src/worker.ts — task lifecycle + abort signal wiring.
  • packages/shared/src/sse.ts — SSE envelope shapes (add steer_queued / steer_delivered).
  • Companion tickets: Penpot mockup (blocker), re-dispatch button (sibling — different feature but same surface area).
## User story As an operator, I want to **interrupt a running agent mid-turn and send it a redirect message** ("no, don't touch sessions.ts, focus on db.ts") without killing the task and losing its context. Today the only recourse is `/cancel` + re-dispatch, which wastes the accumulated reasoning and costs a full re-prompt. ## Acceptance criteria ### SDK integration - [ ] Switch `agent-runner.ts` from one-shot `query()` to the **streaming input** variant of `@anthropic-ai/claude-agent-sdk` so new user messages can be pushed into a live conversation. Existing session-chain + resume semantics remain intact. - [ ] On every dispatch, keep a bounded in-memory queue of pending operator messages keyed by `task_id`. The SDK iterator drains from this queue each turn boundary. ### Steer endpoint - [ ] `POST /task/:id/steer` with body `{ message: string }`. Auth-gated (M18-8). Returns `202` when the message is queued; `409 Conflict` if the task has settled; `413 Payload Too Large` if the body exceeds a sane cap (suggest 8 KB). - [ ] The queued message is delivered at the next turn boundary of the in-flight SDK conversation — ideally as a regular `user` role message so the agent treats it as operator input, not as tool output. - [ ] An SSE `steer_queued` / `steer_delivered` envelope emits on `/events` so the Monitor UI can show the message appearing in the transcript with a distinct visual. ### Interrupt-and-steer UI wiring - [ ] The task detail page (per the Penpot handoff) implements the "Interrupt + Steer" composer. Pressing Enter fires `POST /task/:id/steer`; the panel flips to a pending state until the service echoes `steer_delivered`. - [ ] The transcript renders operator-injected messages with a dedicated role color (suggest `--color-role-operator`, add to tokens if missing) so agent vs. operator turns are distinguishable. - [ ] Cancel remains a separate button with a confirm prompt — steer is additive, cancel is terminal. ### Safety rails - [ ] Rate-limit steer to N messages per minute per task (suggest 6 — one every ~10 s). Further submissions return `429 Too Many Requests`. Prevents a runaway loop of chained steers from overloading the SDK. - [ ] Queue is capped at 10 messages. Overflow drops the oldest and logs a warning — operator should see a toast. - [ ] Steer is refused on the **foreman** agent — the foreman already has a full chat surface via `/foreman/chat`; steering would be a confusing second channel. Return `400` with a hint pointing at the Planner page. ### Verification - [ ] Unit test in `apps/server/src/agent-runner.test.ts` — mocked SDK with streaming input, verify a posted steer message lands in the next iterator yield. - [ ] Integration test: dispatch a long-running task (mocked SDK), POST /steer, assert the transcript receives the injected `user` message and the agent's next `assistant` turn references it. - [ ] Manual: dispatch a dev task on a dummy issue, steer it mid-run with "also update CLAUDE.md", confirm the resulting PR includes the CLAUDE.md edit. ## Dependencies - **Blocked by the Penpot mockup ticket** (task-detail redesign). The implementer should ship against the agreed-upon visual and copy. ## Out of scope - Multi-party chat (agents responding to each other). This is strictly operator-to-single-agent. - Persisted steering history — the queue is in-memory; the transcript captures what was delivered. - Retroactively steering an already-settled task — for that, use re-dispatch (companion ticket) and re-prompt fresh. ## References - `@anthropic-ai/claude-agent-sdk` — streaming input mode. - `apps/server/src/agent-runner.ts` — current one-shot query invocation. - `apps/server/src/worker.ts` — task lifecycle + abort signal wiring. - `packages/shared/src/sse.ts` — SSE envelope shapes (add `steer_queued` / `steer_delivered`). - Companion tickets: Penpot mockup (blocker), re-dispatch button (sibling — different feature but same surface area).
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
charles/claude-hooks#224
No description provided.