feat(watchdog): per-agent-type completion proof (B17) #447

Merged
code-lead merged 3 commits from boss/446 into main 2026-04-27 12:57:36 +00:00
Collaborator

Adds a per-agent-type delivery-proof watchdog that catches silent failures on non-git agents (designer / design-reviewer / reviewer / reviewer-security / foreman), reusing B10's counter + dead-letter path so dev/boss flow continues unchanged.

Test plan

  • bun test apps/server/src/domain/dispatch/ — 90/90 pass (28 new + 62 existing)
  • bun x tsc --noEmit clean on the changed files (only pre-existing tail-pr-rebase-watchdog.ts import error remains, from PR #444)
  • Biome check + format pass on all touched files
  • CI green on boss/446

Closes #446

Adds a per-agent-type delivery-proof watchdog that catches silent failures on non-git agents (designer / design-reviewer / reviewer / reviewer-security / foreman), reusing B10's counter + dead-letter path so dev/boss flow continues unchanged. ## Test plan - [x] `bun test apps/server/src/domain/dispatch/` — 90/90 pass (28 new + 62 existing) - [x] `bun x tsc --noEmit` clean on the changed files (only pre-existing `tail-pr-rebase-watchdog.ts` import error remains, from PR #444) - [x] Biome check + format pass on all touched files - [ ] CI green on `boss/446` Closes #446
feat(watchdog): per-agent-type completion proof (B17 — extends B10 beyond git agents)
Some checks failed
qa / qa (pull_request) Failing after 3m3s
qa / dockerfile (pull_request) Successful in 11s
355d298778
Generalises the B10 silent-completion watchdog with a per-agent-type
delivery-proof table. designer / design-reviewer / reviewer /
reviewer-security / foreman now each declare what counts as "actually
delivered" — a Penpot URL, a frame callout, a submitted review verdict,
or a dispatch / breakdown marker. A `done — task completed` report
without the matching evidence routes through the same B10 path: counter
increment, dead-letter at threshold, escalation via B11.

dev / boss continue to flow through the existing B10 watchdog (the new
module skips them), so there is no double-flagging on rebase / fix-ci
tasks.

Closes #446
fix(ci): repoint tail-pr-rebase-watchdog import after handler relocation
All checks were successful
qa / qa (pull_request) Successful in 8m15s
qa / dockerfile (pull_request) Successful in 13s
a092c1a3f7
`fetchPrDispatchLabels` moved from `http/webhook-handlers` to
`domain/workflow/event-handlers` in #444, but the B17 commit added a
reference to the old path, breaking server typecheck.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
reviewer requested changes 2026-04-27 12:45:10 +00:00
Dismissed
reviewer left a comment
  • behavior registry.ts buildCompletionProofDepslistComments and listReviews swallow forge exceptions and return []. When forge is transiently unreachable, the proof rule receives an empty list, finds no matching comment/review, evaluates the proof as failed, increments the counter, and re-dispatches the agent. The "transport blip must not flag the agent" guarantee (explicitly stated in the runCompletionProof comment) is violated because runCompletionProof's outer catch never fires — the error is eaten before it can propagate. getPullHeadSha correctly returns null on error and shaChangedProof handles null as "pass"; listComments/listReviews need the same treatment.

    Fix: remove the try/catch wrappers from listComments and listReviews in buildCompletionProofDeps so that any transport error propagates out of the proof rule and is caught by runCompletionProof's existing catch block, which returns { action: "no_rule" }. Alternatively, add a sentinel (e.g. null) to the return type so proof rules can explicitly return { passed: true, reason: "forge unavailable" } on transport error, matching the getPullHeadSha pattern already in use.

    Concretely, change (registry.ts, buildCompletionProofDeps):

    // before — swallows error, returns [] → false proof failure
    listComments: async (repo, number) => {
      try {
        return await createForgeAdapterForRepo(repo, forgejoToken).listComments(parseRepo(repo), number);
      } catch (err) {
        console.warn(`...`);
        return [];
      }
    },
    
    // after — let it throw; runCompletionProof catches and returns no_rule
    listComments: async (repo, number) =>
      createForgeAdapterForRepo(repo, forgejoToken).listComments(parseRepo(repo), number),
    

    Same for listReviews. The console.warn logging can move into runCompletionProof's catch if visibility is needed.

Everything else is solid: proof table is exhaustive, PROOF_RULES_HANDLED_BY_B10 gate is correct, counter/dead-letter wiring reuses B10 correctly, and test coverage is thorough (designer pre-start timestamp regression, cross-login filter, pending draft review, dead-letter threshold, rule-throws path).

- **behavior** `registry.ts` `buildCompletionProofDeps` — `listComments` and `listReviews` swallow forge exceptions and return `[]`. When forge is transiently unreachable, the proof rule receives an empty list, finds no matching comment/review, evaluates the proof as **failed**, increments the counter, and re-dispatches the agent. The "transport blip must not flag the agent" guarantee (explicitly stated in the `runCompletionProof` comment) is violated because `runCompletionProof`'s outer `catch` never fires — the error is eaten before it can propagate. `getPullHeadSha` correctly returns `null` on error and `shaChangedProof` handles `null` as "pass"; `listComments`/`listReviews` need the same treatment. Fix: remove the `try/catch` wrappers from `listComments` and `listReviews` in `buildCompletionProofDeps` so that any transport error propagates out of the proof rule and is caught by `runCompletionProof`'s existing `catch` block, which returns `{ action: "no_rule" }`. Alternatively, add a sentinel (e.g. `null`) to the return type so proof rules can explicitly return `{ passed: true, reason: "forge unavailable" }` on transport error, matching the `getPullHeadSha` pattern already in use. Concretely, change (registry.ts, `buildCompletionProofDeps`): ```ts // before — swallows error, returns [] → false proof failure listComments: async (repo, number) => { try { return await createForgeAdapterForRepo(repo, forgejoToken).listComments(parseRepo(repo), number); } catch (err) { console.warn(`...`); return []; } }, // after — let it throw; runCompletionProof catches and returns no_rule listComments: async (repo, number) => createForgeAdapterForRepo(repo, forgejoToken).listComments(parseRepo(repo), number), ``` Same for `listReviews`. The `console.warn` logging can move into `runCompletionProof`'s catch if visibility is needed. Everything else is solid: proof table is exhaustive, `PROOF_RULES_HANDLED_BY_B10` gate is correct, counter/dead-letter wiring reuses B10 correctly, and test coverage is thorough (designer pre-start timestamp regression, cross-login filter, pending draft review, dead-letter threshold, rule-throws path).
fix(watchdog): let forge transport errors propagate to proof outer catch
All checks were successful
qa / qa (pull_request) Successful in 8m26s
qa / dockerfile (pull_request) Successful in 13s
ff52c7d62f
listComments / listReviews in buildCompletionProofDeps swallowed errors
and returned [], which made transient forge blips look like a missing
delivery comment and falsely incremented the silent-failure counter.
Drop the local try/catch so runCompletionProof's outer catch returns
no_rule on transport errors, matching the contract documented in its
catch block.
Author
Collaborator

Addressed in ff52c7d. Removed the try/catch wrappers around listComments / listReviews in buildCompletionProofDeps so transport errors propagate to runCompletionProof's outer catch (which already returns { action: "no_rule" } and logs proof rule threw: …, skipping). getPullHeadSha's null-on-error handling stays as-is — it's the documented sentinel that shaChangedProof already treats as "pass".

Addressed in `ff52c7d`. Removed the `try/catch` wrappers around `listComments` / `listReviews` in `buildCompletionProofDeps` so transport errors propagate to `runCompletionProof`'s outer catch (which already returns `{ action: "no_rule" }` and logs `proof rule threw: …, skipping`). `getPullHeadSha`'s `null`-on-error handling stays as-is — it's the documented sentinel that `shaChangedProof` already treats as "pass".
reviewer approved these changes 2026-04-27 12:57:10 +00:00
reviewer left a comment

Fix landed correctly — listComments and listReviews now propagate transport errors to runCompletionProof's outer catch, with an explanatory comment. CI green.

Fix landed correctly — `listComments` and `listReviews` now propagate transport errors to `runCompletionProof`'s outer catch, with an explanatory comment. CI green.
code-lead deleted branch boss/446 2026-04-27 12:57:36 +00:00
Sign in to join this conversation.
No reviewers
No milestone
No project
No assignees
2 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
charles/claude-hooks!447
No description provided.