charles/claude-hooks

cursor-sdk-adapter: visibility parity + cancel-race fix + stall watchdog #950

Closed

opened 2026-05-08 11:57:12 +00:00 by claude-desktop · 0 comments

claude-desktop commented

2026-05-08 11:57:12 +00:00

Collaborator

User story

As an operator running cursor-backed agents, I want visibility parity with claude-code (tool calls, status transitions, subagent dispatches surfaced in the dashboard) and a worker that actually unblocks when I cancel a task, so cursor runs are neither black boxes between dispatch and result nor zombie sessions that pin a worker forever.

Context

Two distinct bugs in apps/server/src/infrastructure/agent/cursor-sdk-adapter.ts came up the same day; same file, same review surface, single PR:

Visibility — cursorSdkMessageToTaskEvent only maps assistant / user / thinking / system from the cursor SDK's SDKMessage union. The remaining message types are silently dropped:

| Cursor `SDKMessage.type` | Carries | Adapter today |
|---|---|---|
| `tool_call` | `name`, `status (running/completed/error)`, `args`, `result` (typed `ToolCall` discriminator: `EditToolCall`, `ShellToolCall`, `ReadToolCall`, `GrepToolCall`, `WriteToolCall`, `GlobToolCall`, `SemSearchToolCall`, `LsToolCall`, `McpToolCall`, `TaskToolCall`, `CreatePlanToolCall`, `UpdateTodosToolCall`, `DeleteToolCall`, `ReadLintsToolCall`) | dropped (returns null) |
| `status` | `RUNNING` / `FINISHED` / `ERROR` / `CANCELLED` / `EXPIRED` + optional message | dropped |
| `task` | subagent dispatch text + status | dropped |
| `request` | request_id | dropped |

Plus secondary gaps in runResultToResultEvent:

RunResult.git.branches[*].prUrl ignored — PR URL extraction goes through a regex on result.result summary string.
usage, total_cost_usd, num_turns hardcoded to emptyUsage() / 0 / 0 — every observability tile reads zero for cursor runs.

event-log.ts logTaskEvent system switch only handles subtype: "api_retry", so the existing cursor_init / cursor_system events the adapter does emit are dropped on the way to the SSE stream too.

Real-world reproduction (2026-05-08, task 6dbb2c28-5d17-4cb2-a44f-496b10bfceae on issue #949): journal showed acquiring worktree → running agent (provider: cursor, model: composer-2) → resuming session cursor:agent-… then silence for 40+ minutes while the cursor SDK was streaming tool_call events for the work being done.

Cancel deadlock — for await (run.stream()) does not propagate AbortSignal into the underlying HTTP/2 fetch. The adapter's onAbort calls r.cancel() only when r.supports("cancel") === true; against an unreachable cloud session this no-ops. The for-await body's if (req.abort.signal.aborted) break; only fires after a stream event arrives. Result: currentAbort.abort() flips the bit, the loop never re-checks because cursor's stream is silent, and the worker is wedged until process restart.

Reproduction same day, same task: operator clicked Cancel + dragged tile to triage + unassigned issue. SQLite task_history.status flipped to cancelled (because cancelRunningTaskInWorker ran and persistAndBroadcastCancellation updated the row), but worker.currentTask stayed pinned and the dashboard kept rendering RUNNING. Only running `just restart` freed the slot.

Acceptance criteria

Part A — Visibility

`cursor-sdk-adapter.ts` — message mapping

case "tool_call" added to cursorSdkMessageToTaskEvent:
- status === "running" → tool_progress { toolName: msg.name, text: summarizeArgs(msg) }
- status === "completed" → tool_summary { summary: summarizeResult(msg) }
- status === "error" → tool_summary { summary: `error: ${msg.name}` } plus a system { subtype: "cursor_tool_error" }
summarizeArgs(msg) and summarizeResult(msg) switch on the typed ToolCall discriminator from @cursor/sdk types/tool-call-types:
- Edit: path:Lstart-Lend (or path (N replacements))
- Write: path (N bytes)
- Shell: first line of args.command, truncated to 120 chars
- Read: path + Lstart-Lend if present
- Grep/Glob/SemSearch: pattern + cwd
- Ls: path
- Mcp: mcpProviderId/mcpToolId
- Task: subagent type + first 80 chars of prompt
- default: JSON.stringify(args).slice(0, 120)
- results: row count for list-shaped results, exit code + stderr-tail for shell, summary for plan/todos
case "status" added → system { subtype: `cursor_status_${msg.status.toLowerCase()}`, details: { message } }.
case "task" added → tool_progress { toolName: "subagent", text: msg.text ?? msg.status ?? "" }.
case "request" either mapped or explicitly silenced with a comment (decide: probably silence, it's noise).
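The mapping rules above can be sketched as follows. This is a minimal illustration, not the adapter's code: the message and event shapes are simplified stand-ins for the real `@cursor/sdk` and `claude-port.ts` types, `mapCursorMessage` is a hypothetical name, switching on `msg.name` values such as `"Shell"` and `"Ls"` is an assumption about the discriminator, and returning an array (so the error case can emit two events) is a design choice of the sketch.

```typescript
// Hypothetical, simplified shapes; the real ones live in @cursor/sdk and claude-port.ts.
type ToolCallMsg = {
  type: "tool_call";
  name: string;
  status: "running" | "completed" | "error";
  args?: Record<string, unknown>;
  result?: unknown;
};
type StatusMsg = { type: "status"; status: string; message?: string };
type TaskMsg = { type: "task"; text?: string; status?: string };
type RequestMsg = { type: "request"; request_id: string };
type SdkMsg = ToolCallMsg | StatusMsg | TaskMsg | RequestMsg;

type TaskEvent =
  | { type: "tool_progress"; toolName: string; text: string }
  | { type: "tool_summary"; summary: string }
  | { type: "system"; subtype: string; details?: Record<string, unknown> };

// Only two tool kinds are shown; everything else uses the JSON fallback.
function summarizeArgs(msg: ToolCallMsg): string {
  const args = msg.args ?? {};
  switch (msg.name) {
    case "Shell": {
      const command = typeof args.command === "string" ? args.command : "";
      return (command.split("\n")[0] ?? "").slice(0, 120);
    }
    case "Ls":
      return typeof args.path === "string" ? args.path : "";
    default:
      return JSON.stringify(args).slice(0, 120);
  }
}

// Row count for list-shaped results, truncated JSON otherwise.
function summarizeResult(msg: ToolCallMsg): string {
  if (Array.isArray(msg.result)) return `${msg.result.length} rows`;
  return JSON.stringify(msg.result ?? null).slice(0, 120);
}

function mapCursorMessage(msg: SdkMsg): TaskEvent[] {
  switch (msg.type) {
    case "tool_call":
      if (msg.status === "running") {
        return [{ type: "tool_progress", toolName: msg.name, text: summarizeArgs(msg) }];
      }
      if (msg.status === "completed") {
        return [{ type: "tool_summary", summary: summarizeResult(msg) }];
      }
      return [
        { type: "tool_summary", summary: `error: ${msg.name}` },
        { type: "system", subtype: "cursor_tool_error" },
      ];
    case "status":
      return [{
        type: "system",
        subtype: `cursor_status_${msg.status.toLowerCase()}`,
        details: { message: msg.message },
      }];
    case "task":
      return [{ type: "tool_progress", toolName: "subagent", text: msg.text ?? msg.status ?? "" }];
    case "request":
      // Explicitly silenced: request ids are noise in the timeline.
      return [];
  }
}
```

If the real mapper must keep returning a single event or null, the error case would need a different way to emit its second `system` event.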

`cursor-sdk-adapter.ts` — result event

runResultToResultEvent reads RunResult.git.branches.find(b => b.prUrl)?.prUrl and prefers it over r.result for resultText. Alternative if cleaner: extend ResultEvent with optional prUrl?: string and drop the regex path in extractProgress for cursor.
If a numTurns / token accumulator is plumbed (see Phase 2 follow-ups #951–#953), surface it; otherwise leave 0 and document that cursor doesn't expose it on RunResult directly.
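A minimal sketch of the first option above, preferring a structured PR URL over the summary string. `RunResultLike` is a simplified stand-in for the SDK's `RunResult` (only the `result` and `git.branches[*].prUrl` fields named in this issue are modelled), and `pickResultText` is a hypothetical helper name.

```typescript
// Hypothetical, reduced shape of RunResult; only fields mentioned in this issue.
type BranchLike = { prUrl?: string };
type RunResultLike = { result: string; git?: { branches: BranchLike[] } };

// Prefer the first branch that carries a PR URL; fall back to the summary text.
function pickResultText(r: RunResultLike): string {
  const prUrl = r.git?.branches.find((b) => b.prUrl)?.prUrl;
  return prUrl ?? r.result;
}
```

With this in place the regex path only runs when no branch reports a `prUrl`, which keeps existing behaviour for runs that did not open a PR.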

`event-log.ts` — `logTaskEvent`

system switch extended to render the new subtypes as visible rows:
- cursor_init → tool_progress { tool_name: "session", summary: `cursor session ${agent_id}` } (one-line "session started")
- cursor_status_* → tool_progress { tool_name: "status", summary: <status name> }
- cursor_tool_error → error row
- cursor_stalled (see Part C) → error row with summary: "no stream events in N min — assuming cloud-side hang"
user case added so cursor's user-echo (and any future claude-code user echoes) actually surface — currently the entire case is missing from the switch, silently dropping every UserTurn.
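The extended `system` switch might look roughly like this. The row and event shapes are assumptions made for illustration (the real ones are in `event-log.ts` and `claude-port.ts`), `renderSystemEvent` is a hypothetical name, and `cursor_stalled` is left out here; it would follow the same pattern as `cursor_tool_error`.

```typescript
// Hypothetical shapes for illustration only.
type SystemEvent = {
  type: "system";
  subtype: string;
  details?: { agent_id?: string; message?: string };
};
type LogRow =
  | { kind: "tool_progress"; tool_name: string; summary: string }
  | { kind: "error"; summary: string };

const STATUS_PREFIX = "cursor_status_";

// Returns a visible row for known cursor subtypes, or null to keep dropping the event.
function renderSystemEvent(ev: SystemEvent): LogRow | null {
  if (ev.subtype === "cursor_init") {
    return {
      kind: "tool_progress",
      tool_name: "session",
      summary: `cursor session ${ev.details?.agent_id ?? "unknown"}`,
    };
  }
  if (ev.subtype.startsWith(STATUS_PREFIX)) {
    return {
      kind: "tool_progress",
      tool_name: "status",
      summary: ev.subtype.slice(STATUS_PREFIX.length),
    };
  }
  if (ev.subtype === "cursor_tool_error") {
    return { kind: "error", summary: ev.details?.message ?? "cursor tool error" };
  }
  return null;
}
```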

Part B — Cancel-race fix

`cursor-sdk-adapter.ts` — `runTask` body

Replace the bare for await (const ev of run.stream()) with an abort-aware iteration:
- Wrap the stream's next() in Promise.race([gen.next(), abortPromise]), where abortPromise rejects when req.abort.signal fires.
- On abort, throw AbortError, exit the loop, run the existing finally (which disposes the agent + drains the prompt generator).
In the existing onAbort handler, attempt r.cancel() regardless of r.supports("cancel") (catch the unsupported-op error). Bound the attempt with a hard timeout: if the cancel promise hasn't settled in 10 s, log a warning and proceed with disposal anyway. The worker MUST exit even when cursor cloud is unreachable.
The runner currently breaks on terminal result; on abort it should yield a synthetic result { type: "result", ok: false, subtype: "cancelled" } so downstream applyOutcome / task_history persistence does the right thing.
agent[Symbol.asyncDispose]() in the outer finally already covers cleanup but verify it's called on the abort path too (currently await inside finally — make sure abort doesn't bypass it via an exception bubbling out of the async iterator unexpectedly).
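The abort-aware iteration described above can be sketched like this. It is an illustration under assumptions, not the adapter's code: `abortableStream` and `runWithCancel` are hypothetical names, `Ev` is a simplified event shape, and the abort error is a plain `Error` named `AbortError`.

```typescript
type Ev = { type: string; ok?: boolean; subtype?: string };

function makeAbortError(): Error {
  const err = new Error("aborted");
  err.name = "AbortError";
  return err;
}

// Yields items from `source`, but rejects as soon as `signal` aborts,
// even if the source never produces another event.
async function* abortableStream<T>(
  source: AsyncIterable<T>,
  signal: AbortSignal,
): AsyncGenerator<T> {
  const it = source[Symbol.asyncIterator]();
  let onAbort: (() => void) | undefined;
  const aborted = new Promise<never>((_, reject) => {
    if (signal.aborted) {
      reject(makeAbortError());
      return;
    }
    onAbort = () => reject(makeAbortError());
    signal.addEventListener("abort", onAbort, { once: true });
  });
  // Avoid an unhandled rejection if the abort lands while nothing is racing.
  aborted.catch(() => {});
  try {
    while (true) {
      const step = await Promise.race([it.next(), aborted]);
      if (step.done) return;
      yield step.value;
    }
  } finally {
    if (onAbort) signal.removeEventListener("abort", onAbort);
    // Best-effort close; not awaited because a hung source may never settle it.
    void it.return?.()?.catch(() => {});
  }
}

// On abort, surface a synthetic terminal result instead of an exception.
async function* runWithCancel(
  source: AsyncIterable<Ev>,
  signal: AbortSignal,
): AsyncGenerator<Ev> {
  try {
    yield* abortableStream(source, signal);
  } catch (err) {
    if (!signal.aborted) throw err;
    yield { type: "result", ok: false, subtype: "cancelled" };
  }
}
```

Not awaiting `it.return()` is deliberate in this sketch: awaiting it would reintroduce the hang the race is meant to avoid. Disposal of the agent would still happen in the caller's outer `finally`.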

Tests

Cancel-while-streaming: feed a fake Run whose stream() yields one event then hangs forever. Fire abort.abort(). Assert the for-await exits in <100 ms, a synthetic result is yielded, the agent is disposed, and the test does not leak timers or open handles.
Cancel-with-cancel-supported: r.supports("cancel") === true, r.cancel() resolves. Assert the adapter calls cancel() and exits.
Cancel-with-cancel-unsupported: r.supports("cancel") === false. Assert the abort-race path still exits the loop in <100 ms (cancel-attempt is best-effort).
Cancel-with-cancel-hangs: r.cancel() returns a never-resolving promise. Assert the 10 s hard-timeout fires, the warning is logged, disposal still runs.
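The last three cases all exercise a bounded, best-effort cancel. A minimal sketch of such a helper follows; `cancelWithTimeout` and `CancelOutcome` are hypothetical names, and the `cancel` callback stands in for whatever `r.cancel()` does in the SDK.

```typescript
type CancelOutcome = "cancelled" | "unsupported-or-failed" | "timed-out";

// Attempts cancel() but never waits longer than timeoutMs; never throws.
async function cancelWithTimeout(
  cancel: () => Promise<void>,
  timeoutMs = 10_000,
  warn: (msg: string) => void = console.warn,
): Promise<CancelOutcome> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<CancelOutcome>((resolve) => {
    timer = setTimeout(() => resolve("timed-out"), timeoutMs);
  });
  // Promise.resolve().then(...) also captures synchronous throws from cancel().
  const attempt: Promise<CancelOutcome> = Promise.resolve()
    .then(cancel)
    .then(
      () => "cancelled" as const,
      () => "unsupported-or-failed" as const,
    );
  try {
    const outcome = await Promise.race([attempt, timeout]);
    if (outcome === "timed-out") {
      warn(`cancel did not settle within ${timeoutMs} ms; proceeding with disposal`);
    }
    return outcome;
  } finally {
    clearTimeout(timer);
  }
}
```

Taking `timeoutMs` and `warn` as parameters lets the cancel-hangs test use a short timeout and capture the warning without waiting a real 10 s.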

Part C — Stall watchdog

The adapter starts a 5-min idle timer on entering the stream loop. Each yielded event resets it. When the timer fires, the adapter emits a system { subtype: "cursor_stalled", details: { last_event_at, agent_id, run_id } } event (visible in the timeline per Part A, and informing the operator's decision to cancel).
Timer auto-clears in the finally.
Threshold (5 min default) configurable via service-config so we can tune later.
Test: feed a stream that goes silent. Assert one cursor_stalled event after ~5 min (use fake timers).
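One possible shape for the watchdog, as a sketch only. `createIdleWatchdog` is a hypothetical helper; how the threshold is read from service-config and how the `cursor_stalled` event reaches the stream are left to the caller. The timer is not re-armed after firing, so a single silence produces a single callback.

```typescript
// Fires onStall once if no touch() arrives within idleMs of the last start()/touch().
function createIdleWatchdog(idleMs: number, onStall: (lastEventAt: number) => void) {
  let timer: ReturnType<typeof setTimeout> | undefined;
  let lastEventAt = Date.now();

  const arm = () => {
    clearTimeout(timer);
    timer = setTimeout(() => onStall(lastEventAt), idleMs);
  };

  return {
    // Call when entering the stream loop.
    start: arm,
    // Call for every yielded event.
    touch() {
      lastEventAt = Date.now();
      arm();
    },
    // Call from the finally block.
    stop() {
      clearTimeout(timer);
      timer = undefined;
    },
  };
}
```

In the adapter, `onStall` would build the `system { subtype: "cursor_stalled", ... }` event with `last_event_at` taken from the callback argument, and the default `idleMs` would be `5 * 60_000`.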

Out of scope (filed as follow-ups, do NOT bundle into this PR)

InteractionListener / delta streams — #951.
Live token meter — #952.
Per-model rate table + cost — #953.
Canonical ToolKind taxonomy + widgets — #954.
Provider badge — #955.
Run.conversation() replay on crash — #956.
Synthetic shell_output_delta for claude-code — #957.
Pluggable checkpoint store — #958.
The 19 dashboard widget tickets (#959–#977) downstream of these.

References

Adapter: apps/server/src/infrastructure/agent/cursor-sdk-adapter.ts
Port event shapes: apps/server/src/infrastructure/agent/claude-port.ts
Event log translator: apps/server/src/infrastructure/event-log.ts
Cursor SDK message shapes: node_modules/.bun/@cursor+sdk@1.0.12/node_modules/@cursor/sdk/dist/cjs/messages.d.ts
Cursor SDK typed tool calls: node_modules/.bun/@cursor+sdk@1.0.12/node_modules/@cursor/sdk/dist/cjs/types/tool-call-types.d.ts
Cursor SDK delta updates (Phase 2): node_modules/.bun/@cursor+sdk@1.0.12/node_modules/@cursor/sdk/dist/cjs/types/delta-types.d.ts
Cancel helpers: apps/server/src/domain/dispatch/cancel.ts (cancelRunningTaskInWorker).
Triggering observation 1 (visibility): dev/949 silent-running 40+ min while cursor stream was active (2026-05-08).
Triggering observation 2 (cancel deadlock): same task; DB row flipped to `cancelled` but worker stayed wedged until `just restart` was run (2026-05-08).


claude-desktop added the

area:agents

type:user-story

labels

2026-05-08 11:57:16 +00:00

claude-desktop referenced this issue

2026-05-08 12:04:33 +00:00

agents: stream cursor InteractionUpdate deltas (text, thinking, tool, shell, token, summary) #951

claude-desktop referenced this issue

2026-05-08 12:04:54 +00:00

agents: live token meter — accumulate per-task usage across both providers #952

claude-desktop referenced this issue

2026-05-08 12:05:13 +00:00

agents: per-model rate table + runner cost computation (cursor parity with claude-code) #953

claude-desktop referenced this issue

2026-05-08 12:05:37 +00:00

shared: canonical ToolKind taxonomy mapped from both Anthropic + Cursor tool names #954

claude-desktop referenced this issue

2026-05-08 12:05:52 +00:00

dashboard: provider badge on task cards + timeline rows (cursor:composer-2 / claude-code:sonnet-4-6) #955

claude-desktop referenced this issue

2026-05-08 12:06:07 +00:00

agents: replay event log from Run.conversation() on worker crash (cursor only) #956

claude-desktop referenced this issue

2026-05-08 12:06:30 +00:00

agents: synthesize shell_output_delta for claude-code via container log stream #957

claude-desktop referenced this issue

2026-05-08 12:06:52 +00:00

agents: pluggable checkpoint store — wire cursor AgentCheckpointStore + claude session-map equivalent #958

claude-desktop referenced this issue

2026-05-08 12:14:04 +00:00

dashboard: adopt ToolUIPart state enum on SSE wire (input-streaming → output-{available,error,denied}) #959

claude-desktop referenced this issue

2026-05-08 12:14:54 +00:00

dashboard: status pill vocabulary — Planning · Executing · Waiting · Reviewing · Done · Error · Abandoned (with reason hover) #963

claude-desktop added a new dependency

2026-05-08 12:19:17 +00:00

#963 dashboard: status pill vocabulary — Planning · Executing · Waiting · Reviewing · Done · Error · Abandoned (with reason hover)

claude-desktop changed title from ~~cursor-sdk-adapter: surface tool_call / status / task events for visibility parity with claude-code~~ to cursor-sdk-adapter: visibility parity + cancel-race fix + stall watchdog

2026-05-08 12:33:27 +00:00