fix(flows): allocate rate-limit budget per flow run, not per process #405

Merged

charles merged 1 commit from fix/flow-rate-limit-per-run into main

2026-04-26 22:34:22 +00:00

claude-desktop commented

2026-04-26 22:10:42 +00:00

Collaborator

Summary

flow-dispatch.ts hoisted defaultArgInjections() out of the per-run loop on the (correct) reasoning that every key it returned was either a stable function or the singleton dedup surface. The rate-limit budget broke that invariant — the limiter is stateful and counts mutating calls. With the bundle reused across N triggers × M flows, every flow run shared one limiter; once cumulative consumption hit the DEFAULT_RATE_LIMIT_BUDGET = 20 ceiling, every subsequent flow run threw RateLimitExceeded until the next service restart.

Production symptom

PR #402's bounced review-request flow at 2026-04-26 20:43:25 UTC fired with the limiter already at 20/20, so the dispatch node errored:

"error": "rate-limit budget exhausted: used 20 of 20 mutating calls for this flow run"

The reviewer never picked up the re-review, and the operator dashboard showed IN REVIEW cards parked indefinitely on the dev column with the reviewer column empty.

The NF-3 docstring in rate-limit.ts:97-101 already flagged this exact leak as "safe for the isolated unit tests shipped here" but warns it "DOES leak across runs". This PR is the production wiring catching up.

Fix

Keep the static injection bundle hoisted (still correct for the stateless functions) but allocate createRateLimiter() inside the loop and spread it onto argDefaults per (trigger, flow) iteration. The allocation is one object per flow run — negligible against the forge-API + executor work that follows.

- const argDefaults = defaultArgInjections();
+ const staticArgDefaults = defaultArgInjections();
  for (const t of triggers) {
    for (const row of flowRows) {
      ...
-     argDefaults,
+     argDefaults: { ...staticArgDefaults, rateLimit: createRateLimiter() },

Tests

each flow run sees a distinct RateLimiter instance — pins the per-run-allocation contract via a source-handler wrap that records the limiter object identity across a 3-assignee fan-out. Verifies the three captured limiters are pairwise distinct.
a budget-exhausting first run does not block later runs — burns the full 20-call budget in run 1 and verifies runs 2 and 3 still finish status=ok. Pre-fix, both would have died on the first consume() of run 2.
Both share an afterEach snapshot/restore of the source handler because defaultRegistry() returns fresh MapRegistry instances but reuses the same BUILT_INS descriptor object references, so the wrap mutation otherwise persists across tests.

Test plan

bun test apps/server/src/domain/flows/flow-dispatch.test.ts → 28 pass / 0 fail
bun test apps/server/src/domain/flows/ → 376 pass / 0 fail
bunx biome check apps/server/src/domain/flows/flow-dispatch{.ts,.test.ts} → clean
Operator — systemctl --user restart claude-hooks to reset the leaked singleton, then POST /pipeline/bounce-review for #401 + #402 to clear the parked review state.

🤖 Generated with Claude Code

## Summary `flow-dispatch.ts` hoisted `defaultArgInjections()` out of the per-run loop on the (correct) reasoning that every key it returned was either a stable function or the singleton dedup surface. The rate-limit budget broke that invariant — the limiter is *stateful* and counts mutating calls. With the bundle reused across N triggers × M flows, every flow run shared one limiter; once cumulative consumption hit the `DEFAULT_RATE_LIMIT_BUDGET = 20` ceiling, **every subsequent flow run threw `RateLimitExceeded` until the next service restart**. ## Production symptom PR #402's bounced review-request flow at `2026-04-26 20:43:25 UTC` fired with the limiter already at 20/20, so the `dispatch` node errored: ``` "error": "rate-limit budget exhausted: used 20 of 20 mutating calls for this flow run" ``` The reviewer never picked up the re-review, and the operator dashboard showed `IN REVIEW` cards parked indefinitely on the dev column with the reviewer column empty. The NF-3 docstring in `rate-limit.ts:97-101` already flagged this exact leak as "safe for the isolated unit tests shipped here" but warns it "DOES leak across runs". This PR is the production wiring catching up. ## Fix Keep the static injection bundle hoisted (still correct for the stateless functions) but allocate `createRateLimiter()` *inside* the loop and spread it onto `argDefaults` per `(trigger, flow)` iteration. The allocation is one object per flow run — negligible against the forge-API + executor work that follows. ```diff - const argDefaults = defaultArgInjections(); + const staticArgDefaults = defaultArgInjections(); for (const t of triggers) { for (const row of flowRows) { ... - argDefaults, + argDefaults: { ...staticArgDefaults, rateLimit: createRateLimiter() }, ``` ## Tests - **`each flow run sees a distinct RateLimiter instance`** — pins the per-run-allocation contract via a `source`-handler wrap that records the limiter object identity across a 3-assignee fan-out. Verifies the three captured limiters are pairwise distinct. - **`a budget-exhausting first run does not block later runs`** — burns the full 20-call budget in run 1 and verifies runs 2 and 3 still finish `status=ok`. Pre-fix, both would have died on the first `consume()` of run 2. - Both share an `afterEach` snapshot/restore of the source handler because `defaultRegistry()` returns fresh `MapRegistry` instances but reuses the same `BUILT_INS` descriptor object references, so the wrap mutation otherwise persists across tests. ## Test plan - [x] `bun test apps/server/src/domain/flows/flow-dispatch.test.ts` → 28 pass / 0 fail - [x] `bun test apps/server/src/domain/flows/` → 376 pass / 0 fail - [x] `bunx biome check apps/server/src/domain/flows/flow-dispatch{.ts,.test.ts}` → clean - [ ] **Operator** — `systemctl --user restart claude-hooks` to reset the leaked singleton, then `POST /pipeline/bounce-review` for #401 + #402 to clear the parked review state. 🤖 Generated with [Claude Code](https://claude.com/claude-code)

claude-desktop added 1 commit

2026-04-26 22:10:42 +00:00

fix(flows): allocate rate-limit budget per flow run, not per process

qa / qa (pull_request) Successful in 6m34s

Details

qa / dockerfile (pull_request) Successful in 13s

Details

5391a20b5a

`flow-dispatch.ts` hoisted `defaultArgInjections()` out of the per-run
loop on the (correct) reasoning that every key it returned was either a
stable function or the singleton dedup surface. The rate-limit budget
broke that invariant: the limiter is *stateful* and counts mutating
calls. With the bundle reused across N triggers × M flows, every flow
run shared one limiter; once cumulative consumption hit the
`DEFAULT_RATE_LIMIT_BUDGET = 20` ceiling, every subsequent flow run
threw `RateLimitExceeded` until the next service restart.

Production symptom: PR #402's bounced review-request flow fired with
the budget already at 20/20, so the `dispatch` node errored, the
reviewer never picked up the re-review, and the operator dashboard
showed `IN REVIEW` cards parked indefinitely on the dev column. The
NF-3 docstring in `rate-limit.ts:97-101` already flagged the leak as
"safe for the isolated unit tests shipped here" but warns it "DOES
leak across runs" — this is the production wiring catching up.

Fix: keep the static injection bundle hoisted (still correct for the
stateless functions) but allocate `createRateLimiter()` inside the
loop and spread it onto `argDefaults` per (trigger, flow) iteration.
The allocation is one object per flow run — negligible vs. the
forge-API + executor work that follows.

Tests:
- new "each flow run sees a distinct RateLimiter instance" pins the
  per-run-allocation contract via a `source`-handler wrap that
  records the limiter object identity across a 3-assignee fan-out.
- new "a budget-exhausting first run does not block later runs"
  burns the full 20-call budget in run 1 and verifies runs 2 and 3
  still finish status=ok — pre-fix, both would have died on the
  first `consume()` of run 2 with `RateLimitExceeded`.
- both share an `afterEach` snapshot/restore of the source handler
  because `defaultRegistry()` returns fresh `MapRegistry` instances
  but reuses the same `BUILT_INS` descriptor object references, so
  the wrap mutation otherwise persists across tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

charles requested review from reviewer

2026-04-26 22:32:32 +00:00

reviewer approved these changes

2026-04-26 22:33:30 +00:00

reviewer left a comment

Correct fix. The { ...staticArgDefaults, rateLimit: createRateLimiter() } spread at each (trigger, flow) iteration is exactly right — stateless bundle stays hoisted, stateful limiter is fresh per run. The tests pin both invariants cleanly (object identity check + budget-burn regression). CI green.

Correct fix. The `{ ...staticArgDefaults, rateLimit: createRateLimiter() }` spread at each `(trigger, flow)` iteration is exactly right — stateless bundle stays hoisted, stateful limiter is fresh per run. The tests pin both invariants cleanly (object identity check + budget-burn regression). CI green.

charles merged commit b2659edaaf into main

2026-04-26 22:34:22 +00:00

charles referenced this pull request from a commit

2026-04-26 22:34:23 +00:00

fix(flows): allocate rate-limit budget per flow run, not per process (#405)

charles deleted branch fix/flow-rate-limit-per-run

2026-04-26 22:34:23 +00:00