B17 — Per-agent-type completion proof (extend B10 silent-failure detection beyond git) #446
Reference
charles/claude-hooks#446
As an orchestrator,
I want B10's silent-completion watchdog to require a per-agent-type delivery proof before trusting `done — task completed`,
so that non-git agents (designer, design-reviewer, reviewer) can't silently fail like dev does.
B10 (#426) caught silent failures on `dev`/`boss` rebase tasks by checking PR head sha + duration. It works for git-producing agents. But last night the designer silently failed on issue #236: the agent ran 12 min on Opus + Penpot MCP, logged success, never posted a Penpot link comment on the issue, and never created a frame. The card stayed `IDLE-ASSIGNED` for 17 hours until manually retriggered.

Probable cause: cross-task session contamination (designer-2 resumed a prior session from issue #291) and/or a Penpot MCP call that failed silently. Either way, the orchestrator trusted the success signal because B10's git heuristic doesn't apply to a Penpot-producing agent.
This story generalises B10 with a per-agent-type completion-proof table. Each agent type declares what counts as "actually delivered":

- designer: `penpot.app/#/workspace/` URL, posted by the agent's `forgejo_user`, after task start

If the agent reports `done — task completed` and the proof is missing, treat it as a silent failure (same B10 path: increment counter, dead-letter at threshold, escalate via B11).

Acceptance criteria
Backend — completion-proof table
- `COMPLETION_PROOF_RULES: Record<AgentType, ProofCheck>` in `apps/server/src/domain/dispatch/completion-proof.ts` (new module).
- `ProofCheck` is a function `(task: TaskRecord, started_at: ms) => Promise<{ passed: boolean; reason?: string }>` with the table above as concrete impls.
- `dev`/`boss` proof reuses B10's existing sha-changed check.
- `designer` proof: fetch issue comments via Forgejo API, filter `created_at >= started_at` AND `user.login === task.assignee` AND body contains `penpot.app/#/workspace/`. Pass if any.
- `design-reviewer` proof: fetch issue + linked-PR comments, filter same, body contains `frame:` or `figure:` or `penpot.app/`. Pass if any.
- `reviewer`/`reviewer-security` proof: fetch PR reviews via `/repos/.../pulls/N/reviews`, filter `submitted_at >= started_at` AND `user.login === task.assignee` AND state in `["APPROVED", "REQUEST_CHANGES", "COMMENTED"]`. Pass if any.
- `foreman` proof: fetch issue comments, filter same, body contains `Dispatched` or `Broken down` or `Skill:`. Pass if any.

Backend — wire into B10 path
- When the agent reports completion (`done — task completed`), if the task had a `branch_override` OR matched a non-`dev`/`boss` agent type, call `COMPLETION_PROOF_RULES[type](task, started_at)`.
- If the proof fails, log `[suspect-completion] task <id> on <agent> completed without delivery proof — flagging` and route through B10's existing increment + re-dispatch path.

Tests
- `completion-proof.test.ts`: designer task, no Penpot URL in any post-start comment → fails proof.
- Designer task, `penpot.app/#/workspace/abc` posted 5 s after start by `@designer` → passes.
- Reviewer task, review by `@reviewer` post-start with state `APPROVED` → passes.

Out of scope
- `tester` / future agent types: extend the table when those land.

References
- `docs/specs/automation-hardening.md`: extend §4 B10 with this addendum.
- `apps/server/src/domain/dispatch/silent-completion.ts` (or wherever B10's logic landed).
- `docs/penpot.md`, `docs/design-review.md`.
- `06299765-1760-4f25-850d-eee8157e36a3` on issue #236 at 2026-04-26 20:56: ran 12 min, no Penpot comment, silent stall.

Dependencies
Suggested first commit
feat(watchdog): per-agent-type completion proof (B17 — extends B10 beyond git agents)
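
For illustration only, the comment-based rules (designer, foreman) and the wiring step from the acceptance criteria might be sketched as below. This is a rough sketch under assumptions, not the actual module: the `TaskRecord` fields other than `assignee`, the helper names `commentProves`, `commentProof`, `verifyCompletion`, the `FetchComments` and `onSuspect` callbacks, and the use of `Partial<Record<...>>` are all hypothetical. The Forgejo fetch is injected rather than implemented, so the filtering logic can be tested without network access.

```typescript
// Sketch only. Names marked "hypothetical" do not come from the issue text.
type ms = number; // epoch milliseconds
type AgentType =
  | "dev" | "boss" | "designer" | "design-reviewer"
  | "reviewer" | "reviewer-security" | "foreman";

interface TaskRecord {
  id: string;           // hypothetical field name
  agentType: AgentType; // hypothetical field name
  assignee: string;     // Forgejo login the agent posts as
}

// Only the comment fields the issue's filter mentions are modelled.
interface IssueComment {
  body: string;
  created_at: string; // ISO 8601 timestamp
  user: { login: string };
}

type ProofResult = { passed: boolean; reason?: string };
type ProofCheck = (task: TaskRecord, started_at: ms) => Promise<ProofResult>;

// Pure filter: created_at >= started_at AND user.login === assignee
// AND body contains at least one marker. "Pass if any."
function commentProves(
  comments: IssueComment[],
  assignee: string,
  startedAt: ms,
  markers: string[],
): boolean {
  return comments.some(
    (c) =>
      Date.parse(c.created_at) >= startedAt &&
      c.user.login === assignee &&
      markers.some((m) => c.body.includes(m)),
  );
}

// Hypothetical: caller supplies the function that fetches issue comments.
type FetchComments = (task: TaskRecord) => Promise<IssueComment[]>;

// Builds a ProofCheck for any agent whose proof is "a comment with a marker".
function commentProof(fetchComments: FetchComments, markers: string[]): ProofCheck {
  return async (task, started_at) => {
    const comments = await fetchComments(task);
    if (commentProves(comments, task.assignee, started_at, markers)) {
      return { passed: true };
    }
    return {
      passed: false,
      reason: `no post-start comment by ${task.assignee} containing: ${markers.join(", ")}`,
    };
  };
}

// Log line format taken from the issue; <agent> is assumed to be the assignee.
function suspectCompletionMessage(taskId: string, agent: string): string {
  return `[suspect-completion] task ${taskId} on ${agent} completed without delivery proof — flagging`;
}

// Runs the rule for the task's agent type. onSuspect is a stand-in for
// B10's existing increment + re-dispatch path. Partial<> is used here so the
// sketch can run with only some rules filled in; agent types without a rule
// are trusted, which is an assumption of this sketch.
async function verifyCompletion(
  rules: Partial<Record<AgentType, ProofCheck>>,
  task: TaskRecord,
  started_at: ms,
  onSuspect: (message: string) => void,
): Promise<boolean> {
  const check = rules[task.agentType];
  if (!check) return true;
  const result = await check(task, started_at);
  if (!result.passed) onSuspect(suspectCompletionMessage(task.id, task.assignee));
  return result.passed;
}
```

With this shape, the designer rule would be `commentProof(fetchIssueComments, ["penpot.app/#/workspace/"])` and the foreman rule would pass `["Dispatched", "Broken down", "Skill:"]`, where `fetchIssueComments` is whatever Forgejo client call the server already uses. The review-based rules need a different record shape (`submitted_at`, `state`) and are left out of the sketch.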