B11 — Escalate dev→boss after N silent failures (close F2 cont.) #427

Closed
opened 2026-04-27 07:25:24 +00:00 by claude-desktop · 0 comments
Collaborator

As an orchestrator,
I want to switch the agent type from dev (Sonnet) to boss (Opus) after a PR has triggered N silent failures,
so that complex multi-file conflicts get the bigger model on the second try.

Last night the same dev task failed silently 6× on PR #420. When the operator finally delegated to a general-purpose Opus subagent, the rebase succeeded first try. Pattern: multi-file board conflicts exceed Sonnet's diff-merging reliability; Opus handles them.

Acceptance criteria

Routing

  • When B10 re-dispatches with silent_failure_count >= 1: route to type boss instead of dev (preserve task.branch_override).
  • Audit-comment on the PR: Escalated to @boss after N silent completion(s) on @dev (rebase-task duration < 30 s, no sha change).
  • Configurable per agent type in config/agents.json → escalation_target: "boss" (default per type: dev → boss, reviewer → boss, others → none).
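
The routing rule above could be sketched roughly as follows. This is a minimal illustration, not the actual dispatcher: only silent_failure_count, branch_override, escalation_target and the audit-comment wording come from this issue; the Task and AgentConfig shapes, the routeRedispatch function, and the in-code agents table are hypothetical stand-ins for whatever config/agents.json and the agent registry really provide.

```typescript
// Hypothetical shapes: only the field names silent_failure_count,
// branch_override and escalation_target are taken from the issue text.
type AgentType = string;

interface AgentConfig {
  escalation_target?: AgentType; // absent means "never escalate this type"
}

interface Task {
  type: AgentType;
  silent_failure_count: number;
  branch_override?: string;
}

interface Routed {
  task: Task;
  auditComment?: string; // set only when an escalation happened
}

// Defaults as listed in the issue: dev -> boss, reviewer -> boss, others -> none.
// In the real system this would presumably be loaded from config/agents.json.
const agents: Record<AgentType, AgentConfig> = {
  dev: { escalation_target: "boss" },
  reviewer: { escalation_target: "boss" },
  boss: {},
};

function routeRedispatch(
  task: Task,
  config: Record<AgentType, AgentConfig>,
): Routed {
  const target = config[task.type]?.escalation_target;
  // No silent failures yet, or no target configured: keep the original type.
  if (task.silent_failure_count < 1 || target === undefined) {
    return { task };
  }
  const n = task.silent_failure_count;
  return {
    // The spread copies branch_override and the counter; only the type changes.
    task: { ...task, type: target },
    auditComment:
      `Escalated to @${target} after ${n} silent completion(s) on @${task.type} ` +
      `(rebase-task duration < 30 s, no sha change).`,
  };
}
```

Keeping the function pure (task in, routed task plus optional comment out) means posting the comment and dispatching stay with the caller, which is what makes the first two unit tests listed under Tests straightforward to write.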

Budget

  • Daily cap on escalations: max_escalations_per_day (default 10) in config/agents.json.
  • When cap hit: emit flow:dead-letter instead of escalating, log [escalation] daily cap reached, dead-lettering #N.
  • Cap counter resets at local midnight.
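
One way to picture the budget rules above is an in-memory counter keyed by the local calendar day. This is a sketch under assumptions: EscalationBudget, tryConsume and escalateOrDeadLetter are hypothetical names, the clock is injected purely so the midnight rollover can be tested, and a real implementation would probably need to persist the counter across restarts. Only max_escalations_per_day, the flow:dead-letter event name and the log line come from this issue.

```typescript
// Hypothetical helper; the default of 10 mirrors max_escalations_per_day.
class EscalationBudget {
  private day = "";
  private used = 0;
  private readonly maxPerDay: number;
  private readonly now: () => Date;

  constructor(maxPerDay: number = 10, now: () => Date = () => new Date()) {
    this.maxPerDay = maxPerDay;
    this.now = now;
  }

  // Key built from local-time getters, so it changes at local midnight.
  private today(): string {
    const d = this.now();
    return `${d.getFullYear()}-${d.getMonth() + 1}-${d.getDate()}`;
  }

  // Consumes one slot and returns true if an escalation is still allowed today.
  tryConsume(): boolean {
    const today = this.today();
    if (today !== this.day) {
      // First call on a new local day: reset the counter.
      this.day = today;
      this.used = 0;
    }
    if (this.used >= this.maxPerDay) return false;
    this.used += 1;
    return true;
  }
}

// Decides between escalating and dead-lettering for a given PR number.
function escalateOrDeadLetter(
  budget: EscalationBudget,
  prNumber: number,
): "escalate" | "flow:dead-letter" {
  if (budget.tryConsume()) return "escalate";
  console.log(`[escalation] daily cap reached, dead-lettering #${prNumber}`);
  return "flow:dead-letter";
}
```

Resetting lazily on the first call of a new day avoids a timer, and the injected clock lets a test jump past midnight without waiting.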

Tests

  • Unit test: silent_failure_count=1 → next dispatch routes to boss, audit comment posted.
  • Unit test: silent_failure_count=0 → routes to dev (no escalation).
  • Unit test: 10 escalations today, 11th request → dead-letter, no boss dispatch.
  • Unit test: midnight rollover resets counter.

Out of scope

  • Counter logic — owned by B10.
  • Dead-letter UI — owned by B15.
  • Per-PR (not per-day) caps — defer.

References

  • Spec: docs/specs/automation-hardening.md §4 B11.
  • Agent registry / dispatch: apps/server/src/domain/agents/, apps/server/src/http/webhook.ts.
  • Cost data for the cap-default of 10: see token-economy doc.

Dependencies

  • Land after B10 — needs the silent-failure counter.
Reference
charles/claude-hooks#427