charles/claude-hooks

Expand webhook test depth — handler logic, CI dedup, session resume #37

Closed

opened 2026-04-18 10:34:23 +00:00 by claude-desktop (Collaborator) · 0 comments

User story

As a maintainer, I want the webhook tests to actually assert what the handlers do — not just "returns 200 without crashing" — so that a future regression in CI dedup, no-workflows routing, or session resume fails a test instead of silently shipping.

Context

Audit 2026-04-18 flagged shallow tests: webhook.test.ts checks that a handler returns HTTP 200 and bails when no token is configured, but no spy on fetch confirms whether any API call was (or was not) made. The pass conditions are "didn't crash", not "did the right thing".

Complexity without corresponding tests:

prMergeable, dispatchRebaseIfNotMergeable — the synchronize rebase dispatch.
requestReviewIfFresh dedup — already-requested vs. verdict-at-head skip paths.
handleStatusEvent vs. handleActionRunEvent — aggregate green → request review routing.
Fallback timer arm/cancel — new in this session.
Worker.runAgent session resume — zero tests on the three-state transition (fresh, resume-success, resume-fail-and-retry).

Acceptance criteria

Fetch spy harness

webhook.test.ts (or the split test files if #A landed) gains a thin fetch mock harness: tests can register URL-pattern → response handlers, and assert which URLs were called in what order. Prefer a tiny in-repo helper over a library.
Existing "returns 200" tests gain explicit expect(fetchSpy.calls).toHaveLength(0) (or .toEqual([...])) assertions so a future regression that adds a stray API call fails.
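One possible shape for the thin fetch harness described above, as a minimal sketch rather than the repo's actual helper. `createFetchSpy`, `jsonResponse`, `on`, and `calls` are hypothetical names, and the sketch uses a reduced response shape (`status` plus `json()`) instead of the global `Response` so that it stays self-contained.

```typescript
// Hypothetical in-repo helper; every name here is illustrative.
interface SpyResponse {
  status: number;
  json(): Promise<unknown>;
}

interface RecordedCall {
  method: string;
  url: string;
  body: unknown;
}

type RouteHandler = (call: RecordedCall) => SpyResponse;

function jsonResponse(status: number, payload: unknown): SpyResponse {
  return { status, json: async () => payload };
}

function createFetchSpy() {
  const routes: Array<{ pattern: RegExp; handler: RouteHandler }> = [];
  const calls: RecordedCall[] = [];

  // Stand-in for the subset of the fetch signature the handlers are assumed to use.
  async function fetchSpy(
    url: string,
    init: { method?: string; body?: string } = {},
  ): Promise<SpyResponse> {
    const call: RecordedCall = {
      method: init.method ?? "GET",
      url,
      body: init.body === undefined ? undefined : JSON.parse(init.body),
    };
    calls.push(call); // recorded in call order, before any routing
    const route = routes.find((r) => r.pattern.test(url));
    // Unregistered URLs fail loudly so a stray API call cannot pass silently.
    if (!route) throw new Error(`unexpected fetch: ${call.method} ${url}`);
    return route.handler(call);
  }

  return {
    fetch: fetchSpy,
    calls,
    on(pattern: RegExp, handler: RouteHandler): void {
      routes.push({ pattern, handler });
    },
  };
}
```

A "returns 200" test would pass `spy.fetch` into the handler under test (or assign it to `globalThis.fetch`, depending on how the handlers obtain fetch) and then assert on `spy.calls`, for example that it is empty when no token is configured.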

CI gate tests

requestReviewIfFresh tests: (a) no request made if reviewer already in requested_reviewers; (b) no request if an APPROVED/REQUEST_CHANGES review exists at the head SHA; (c) POST /requested_reviewers called exactly once with {reviewers: ["reviewer"]} when neither condition holds; (d) PR author == reviewer → no-op.
handleStatusEvent: aggregate success routes to requestReviewIfFresh; aggregate failure routes to dispatchFixCi; pending is a no-op.
handleActionRunEvent: action_run_failure dispatches fix-ci with the failing workflow_id in the prompt; action_run_success checks aggregate before requesting review.
repoHasWorkflows returns true when .forgejo/workflows/ has entries, true when .github/workflows/ has entries, false when neither does, false on 404/network error.
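The four `requestReviewIfFresh` cases (a) to (d) can be read as a small decision table. The sketch below isolates that table as a pure function to show what the tests would pin down. It is an assumption about the logic, not the handler's real code: `PullSnapshot`, `decideReviewRequest`, the decision labels, and the order in which the skip conditions are checked are all hypothetical.

```typescript
// Hypothetical snapshot of the PR fields the dedup check is assumed to read.
interface PullSnapshot {
  author: string;
  headSha: string;
  requestedReviewers: string[];
  reviews: Array<{ state: string; commitId: string }>;
}

type ReviewDecision =
  | "request"
  | "skip:author-is-reviewer"
  | "skip:already-requested"
  | "skip:verdict-at-head";

// Review states that count as a verdict, as named in the acceptance criteria.
const VERDICT_STATES = ["APPROVED", "REQUEST_CHANGES"];

function decideReviewRequest(pr: PullSnapshot, reviewer: string): ReviewDecision {
  // (d) the PR author cannot review their own PR
  if (pr.author === reviewer) return "skip:author-is-reviewer";
  // (a) the reviewer is already in requested_reviewers
  if (pr.requestedReviewers.includes(reviewer)) return "skip:already-requested";
  // (b) a verdict already exists at the current head SHA; older verdicts do not count
  const verdictAtHead = pr.reviews.some(
    (r) => r.commitId === pr.headSha && VERDICT_STATES.includes(r.state),
  );
  // (c) otherwise exactly one review request should go out
  return verdictAtHead ? "skip:verdict-at-head" : "request";
}
```

In the real tests the same four inputs would be fed through the fetch mock, with the assertion being that POST `/requested_reviewers` is recorded once for the `"request"` outcome and never for the three skip outcomes.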

Fallback timer tests

armReviewFallback schedules a call to requestReviewIfFresh after REVIEW_FALLBACK_MS; cancelReviewFallback prevents it. Use fake timers (Bun's vi.useFakeTimers / setSystemTime equivalent) or manually mock setTimeout.
Any action_run_* or status event for the head SHA cancels a previously-armed fallback.
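For the "manually mock setTimeout" option, one approach is to inject the scheduler so a test can advance a fake clock deterministically. The following is a sketch under that assumption. `Scheduler`, `createFakeScheduler`, and `createReviewFallback` are hypothetical names, and keying the armed timers by head SHA is inferred from the second criterion rather than taken from the code.

```typescript
// Hypothetical seam: production code would pass a wrapper around setTimeout/clearTimeout.
interface Scheduler {
  set(fn: () => void, ms: number): number;
  clear(id: number): void;
}

function createFakeScheduler() {
  let now = 0;
  let nextId = 1;
  const pending = new Map<number, { at: number; fn: () => void }>();
  return {
    set(fn: () => void, ms: number): number {
      const id = nextId++;
      pending.set(id, { at: now + ms, fn });
      return id;
    },
    clear(id: number): void {
      pending.delete(id);
    },
    // Move the fake clock forward and run every timer that has come due, earliest first.
    advance(ms: number): void {
      now += ms;
      const due = Array.from(pending.entries())
        .filter(([, t]) => t.at <= now)
        .sort((a, b) => a[1].at - b[1].at);
      for (const [id, t] of due) {
        // Skip timers that an earlier callback in this batch already cleared.
        if (pending.delete(id)) t.fn();
      }
    },
  };
}

function createReviewFallback(
  scheduler: Scheduler,
  delayMs: number,
  onFire: (headSha: string) => void,
) {
  const armed = new Map<string, number>(); // head SHA -> timer id

  function cancel(headSha: string): void {
    const id = armed.get(headSha);
    if (id === undefined) return;
    scheduler.clear(id);
    armed.delete(headSha);
  }

  function arm(headSha: string): void {
    cancel(headSha); // re-arming replaces any earlier timer for the same SHA
    const id = scheduler.set(() => {
      armed.delete(headSha);
      onFire(headSha);
    }, delayMs);
    armed.set(headSha, id);
  }

  return { arm, cancel };
}
```

A test would arm the fallback, advance the fake clock to just before `REVIEW_FALLBACK_MS` and assert nothing fired, then advance past it and assert a single call. The cancel path is the same sequence with `cancel` (or a simulated `status` event) in between.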

Session resume tests

runAgent (or its extracted runAgentTask if #C landed) is tested across: (a) fresh dispatch (no stored session) → SDK called without resume; (b) stored session present → SDK called with resume: <id>; (c) SDK rejects resumed session → session dropped, SDK called a second time without resume, result captured.
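The three-state transition can be exercised against a stubbed SDK function. The sketch below shows one way the fresh, resume-success, and resume-fail-and-retry paths might fit together. `runWithResume`, `RunSdk`, and the `Map`-based session store are hypothetical stand-ins for `runAgent` and its collaborators, and treating any SDK rejection as "resumed session rejected" is a simplifying assumption.

```typescript
// Hypothetical SDK call shape: only `resume` is taken from the issue text.
type RunSdk = (opts: { prompt: string; resume?: string }) => Promise<{
  sessionId: string;
  result: string;
}>;

async function runWithResume(
  store: Map<string, string>, // assumed session-id store, keyed per task
  key: string,
  prompt: string,
  sdk: RunSdk,
): Promise<{ sessionId: string; result: string }> {
  const stored = store.get(key);
  if (stored !== undefined) {
    try {
      // (b) stored session present: resume it
      const resumed = await sdk({ prompt, resume: stored });
      store.set(key, resumed.sessionId);
      return resumed;
    } catch {
      // (c) resume rejected: drop the session and fall through to a fresh run
      store.delete(key);
    }
  }
  // (a) fresh dispatch, or the retry after a failed resume: no `resume` option
  const fresh = await sdk({ prompt });
  store.set(key, fresh.sessionId);
  return fresh;
}
```

With a stub that records its arguments, the three tests reduce to assertions on the recorded call list: one call without `resume`, one call with `resume: <id>`, and for the failure path two calls where only the first carries `resume`.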

Coverage bookkeeping

just qa passes including new tests.
No flaky tests introduced (each test run deterministic; fake timers / fetch mocks fully controlled).
Coverage on the touched files is visibly higher (spot-check via bun test --coverage if available, otherwise count assertions).

Out of scope

Full E2E rework (the existing live-service E2E stays manual).
Rewriting untouched tests for unrelated modules.

References

Codebase audit 2026-04-18.

Dependencies

Blocked by: #A (cleaner seams make mocking easier).
Optionally helped by: #C (extracts runAgent — simpler to test in isolation).
Branch off: whichever branch #A merged to (likely main).


claude-desktop added the area:webhook and type:user-story labels

2026-04-18 10:34:50 +00:00

dev was assigned by claude-desktop

2026-04-18 11:30:13 +00:00

claude-desktop referenced this issue

2026-04-18 11:35:22 +00:00

Post-CI routing: merge when already approved, skip re-review on rebase #42

dev referenced this issue from a commit

2026-04-18 11:53:12 +00:00

test(webhook): expand test depth — fetch spy, CI gate, fallback timer, session resume

dev referenced this issue from a pull request that will close it,

2026-04-18 11:53:23 +00:00

test(webhook): expand test depth — fetch spy, CI gate, fallback timer, session resume #43

dev referenced this issue from a commit

2026-04-18 12:16:43 +00:00

test(webhook): expand test depth — fetch spy, CI gate, fallback timer, session resume

code-lead closed this issue

2026-04-18 13:25:05 +00:00
