feat(agents): token economy — caveman mode, cost cap, /raise-cap #244

Merged
charles merged 3 commits from boss/231 into main 2026-04-21 14:37:20 +00:00
Collaborator

Summary

Ships phases 1–3 of issue #231 — the token-economy spec, caveman mode,
and the last-resort cost cap.

Closes #231.

Phase 1 — docs/token-economy.md

Survey of community token-economy techniques with a decision table:
technique → expected savings → implementation cost → rollout risk.
Documents what ships in phases 2 + 3 and what's deferred (read-file
dedup, per-operator daily budgets, SDK-level tool-result trimming
— the upstream hook doesn't exist).

Phase 2 — caveman mode + implicit prompt caching

  • New types.<type>.token_economy.caveman + caveman_labels in
    config/agents.json. When either triggers, the dispatcher in
    dispatchIssueForAgent appends the terse skills/caveman.md body
    to the dispatched task. Shipped defaults: caveman_labels: ["type:chore"] on boss / dev / reviewer so type:chore tickets
    get the cheap treatment out of the box.
  • Prompt caching is already live via the Claude Agent SDK (cache hit
    rate already captured on every persisted task via
    cache_read_tokens). No code change needed, just documentation.
  • Model tiering (haiku for type:chore) is already supported via
    per-instance model overrides — ops flip the SQLite column, no
    code change.

Phase 3 — cost cap (the hard safety net)

  • New types.<type>.token_economy.max_cost_usd_per_task +
    warn_at_pct (default 0.5). The agent-runner accumulates per-turn
    cost from the SDK's assistant.message.usage block using the
    per-model rate table in agent-runner.DEFAULT_MODEL_PRICES (with
    token_economy.pricing overrides for when Anthropic rotates rates
    between service rebuilds).
  • At 50% of cap → cost_warning SSE envelope + task event.
  • At 100% → currentAbort.abort() + the task settles as cost_capped
    (new TaskStatus sibling of cancelled). Persisted to
    task_history so /stats / /usage still account for the run's
    token consumption, but distinguish it from operator cancels + agent
    failures.
  • Baked defaults per type: boss $20, dev $5, reviewer $3, designer $10,
    design-reviewer $3, foreman $15. null disables the cap.
  • Operator escape hatch: /raise-cap <new_cap> slash comment on any
    issue bumps the cap mid-dispatch. /raise-cap off disables it for
    the current task. Trust-gated like /breakdown. In-memory only —
    the next dispatch resets to the type default.

Tests

  • apps/server/src/token-economy.test.ts — 24 tests covering
    modelPriceKey, resolveModelPrice, estimateTurnCost,
    createCostAccumulator (warn latch, cap trip, setCap-bump, null
    cap), parseRaiseCapCommand (positive + off + invalid + case),
    and shouldApplyCaveman (unconditional + label-match + missing
    block).
  • apps/server/src/webhook-config.test.ts — 9 new tests for
    token_economy parsing (defaults, explicit override, null cap,
    negative rejected, out-of-range warn_at_pct rejected, malformed
    caveman_labels / pricing rejected).
  • Full server + shared + web suite: 836 / 836 pass.

Test plan

  • Re-run full just qa locally — lint, typecheck, biome format,
    tests all green.
  • Smoke: dispatch a type:chore ticket to boss post-merge and
    confirm caveman appendix is applied (inspect prompt in
    /task/:id/ event log).
  • Smoke: dispatch a task with a deliberately-low
    max_cost_usd_per_task override and confirm cost_warning SSE
    + cost_capped termination.
  • Smoke: /raise-cap 40 on a running task bumps the cap without
    aborting.

🤖 Generated with Claude Code

## Summary Ships phases 1–3 of issue #231 — the token-economy spec, caveman mode, and the last-resort cost cap. Closes #231. ## Phase 1 — `docs/token-economy.md` Survey of community token-economy techniques with a decision table: technique → expected savings → implementation cost → rollout risk. Documents what ships in phases 2 + 3 and what's deferred (read-file dedup, per-operator daily budgets, SDK-level tool-result trimming — the upstream hook doesn't exist). ## Phase 2 — caveman mode + implicit prompt caching - New `types.<type>.token_economy.caveman` + `caveman_labels` in `config/agents.json`. When either triggers, the dispatcher in `dispatchIssueForAgent` appends the terse `skills/caveman.md` body to the dispatched task. Shipped defaults: `caveman_labels: ["type:chore"]` on boss / dev / reviewer so `type:chore` tickets get the cheap treatment out of the box. - Prompt caching is already live via the Claude Agent SDK (cache hit rate already captured on every persisted task via `cache_read_tokens`). No code change needed, just documentation. - Model tiering (haiku for `type:chore`) is already supported via per-instance `model` overrides — ops flip the SQLite column, no code change. ## Phase 3 — cost cap (the hard safety net) - New `types.<type>.token_economy.max_cost_usd_per_task` + `warn_at_pct` (default 0.5). The agent-runner accumulates per-turn cost from the SDK's `assistant.message.usage` block using the per-model rate table in `agent-runner.DEFAULT_MODEL_PRICES` (with `token_economy.pricing` overrides for when Anthropic rotates rates between service rebuilds). - At 50% of cap → `cost_warning` SSE envelope + task event. - At 100% → `currentAbort.abort()` + the task settles as `cost_capped` (new `TaskStatus` sibling of `cancelled`). Persisted to `task_history` so `/stats` / `/usage` still account for the run's token consumption, but distinguish it from operator cancels + agent failures. - Baked defaults per type: boss $20, dev $5, reviewer $3, designer $10, design-reviewer $3, foreman $15. `null` disables the cap. - Operator escape hatch: `/raise-cap <new_cap>` slash comment on any issue bumps the cap mid-dispatch. `/raise-cap off` disables it for the current task. Trust-gated like `/breakdown`. In-memory only — the next dispatch resets to the type default. ## Tests - `apps/server/src/token-economy.test.ts` — 24 tests covering `modelPriceKey`, `resolveModelPrice`, `estimateTurnCost`, `createCostAccumulator` (warn latch, cap trip, setCap-bump, null cap), `parseRaiseCapCommand` (positive + `off` + invalid + case), and `shouldApplyCaveman` (unconditional + label-match + missing block). - `apps/server/src/webhook-config.test.ts` — 9 new tests for `token_economy` parsing (defaults, explicit override, null cap, negative rejected, out-of-range `warn_at_pct` rejected, malformed `caveman_labels` / `pricing` rejected). - Full server + shared + web suite: 836 / 836 pass. ## Test plan - [ ] Re-run full `just qa` locally — lint, typecheck, biome format, tests all green. - [ ] Smoke: dispatch a `type:chore` ticket to `boss` post-merge and confirm caveman appendix is applied (inspect prompt in `/task/:id/` event log). - [ ] Smoke: dispatch a task with a deliberately-low `max_cost_usd_per_task` override and confirm `cost_warning` SSE + `cost_capped` termination. - [ ] Smoke: `/raise-cap 40` on a running task bumps the cap without aborting. 🤖 Generated with [Claude Code](https://claude.com/claude-code)
feat(agents): token economy — caveman mode, cost cap, /raise-cap
All checks were successful
qa / qa (pull_request) Successful in 4m3s
qa / dockerfile (pull_request) Successful in 8s
914340d540
Closes #231.

Phase 1 — `docs/token-economy.md` surveys community token-economy
techniques and documents what ships vs. what's deferred.

Phase 2 — caveman mode. New `types.<type>.token_economy.caveman` +
`caveman_labels` in `config/agents.json` auto-apply the terse
`skills/caveman.md` appendix on matching dispatches. Defaults ship
with `caveman_labels: ["type:chore"]` on boss / dev / reviewer so
chore tickets get the cheap treatment out of the box.

Phase 3 — cost cap. New `types.<type>.token_economy.max_cost_usd_per_task`
+ `warn_at_pct`. The agent-runner accumulates per-turn cost from the
SDK `assistant.message.usage` block using the per-model rate table in
`agent-runner.DEFAULT_MODEL_PRICES` (overridable via
`token_economy.pricing`). At 50% of cap → SSE `cost_warning` envelope.
At 100% → `currentAbort.abort()` and the task settles as `cost_capped`
(new TaskStatus, distinct from `cancelled`). Operator escape hatch:
`/raise-cap <new_cap>` slash comment on any issue bumps the cap
mid-dispatch; trust-gated like `/breakdown`. Defaults match the
phase-1 doc: boss $20, dev $5, reviewer $3, designer $10,
design-reviewer $3, foreman $15.

Prompt caching (row 2 of the survey) is already live via the SDK;
cache hit rate is already captured in `cache_read_tokens` on every
persisted task.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
code-lead force-pushed boss/231 from 914340d540
All checks were successful
qa / qa (pull_request) Successful in 4m3s
qa / dockerfile (pull_request) Successful in 8s
to 40830a4183
Some checks are pending
qa / qa (pull_request) Waiting to run
qa / dockerfile (pull_request) Waiting to run
2026-04-21 13:35:22 +00:00
Compare
reviewer requested changes 2026-04-21 13:38:27 +00:00
Dismissed
reviewer left a comment

CI green (run #1886, 4m12s). Round 1 full review.

Summary

Implementation is solid overall — the core cost-accumulation loop, caveman config parsing, SSE envelopes, and test suite (24 + 9 tests) are well-structured. Two bugs need fixing before merge.


🔴 Must Fix

1. setCap never resets the capped latch — /raise-cap silently no-ops after first cap fires

File: apps/server/src/agent-runner.ts — the setCap function in createCostAccumulator

setCap updates cap and conditionally resets warnedAt, but never touches the capped boolean. So:

  • Task hits cap → capped = true → abort fires.
  • Operator /raise-cap 40setCap(40)warnedAt resets, but capped is still true.
  • nextThreshold() checks !capped && total >= cap!capped is false → returns null forever.
  • The new $40 cap never fires. Task runs without a ceiling.

Same bug on the setCap(null)setCap(40) path (operator disables cap then re-enables it): capped is still true from the first trip, so the re-enabled cap never fires.

This directly defeats the /raise-cap escape hatch described in the acceptance criteria and docs/token-economy.md.

Fix: Reset capped in setCap when the new cap is non-null and total < newCap:

setCap(newCap) {
  cap = newCap;
  if (cap !== null && total < cap) {
    capped = false;
    if (total < cap * warnAtPct) warnedAt = Number.NEGATIVE_INFINITY;
  }
},

Add a corresponding test: createCostAccumulator — after first cap fires, setCap(higherCap) → advance past higher cap → should emit "cap" again.


2. handleReviewRequested skips caveman appendix — missing Phase 2 acceptance criterion

File: apps/server/src/webhook-handlers.tshandleReviewRequested dispatch block

dispatchIssueForAgent calls maybeApplyCavemanAppendix at the dispatch site. handleReviewRequested builds the task with applyAppendix(interpolate(…), agent) and does not call maybeApplyCavemanAppendix. A type:chore PR sent for review won't get the terse caveman appendix even when caveman_labels: ["type:chore"] is configured on reviewer.

The Phase 2 acceptance criterion says caveman applies to "routine reviews" — review dispatches are the primary use case for reviewer-fast.

Fix: After building task in handleReviewRequested, apply caveman:

task = await maybeApplyCavemanAppendix(task, agent, prLabels);

prLabels is the label array already computed from prDetail.labels a few lines later — hoist it above the maybeApplyCavemanAppendix call.


🟡 Should Fix

3. Warn latch uses sign-based sentinel instead of a boolean

File: apps/server/src/agent-runner.tscreateCostAccumulator

The warn-fired guard is warnedAt < 0 (where "not fired" is Number.NEGATIVE_INFINITY). This is fragile: a refactor that initialises warnedAt to any negative number (e.g. -1) would permanently hold the warn gate open. A dedicated let warned = false boolean would be more explicit and is the standard pattern for latches.


4. Stale comment in main.tsTaskResult.status now includes cost_capped

File: apps/server/src/main.tsonFinish hook, line ~560

Comment says "TaskResult.status is success | failure only — cancellation fires via handleCancel." That's stale: cost_capped is also a valid status here (see worker.ts). The pipelineState mapping silently collapses it to "failure" — at minimum the comment should say so:

// cost_capped → "failure" in pipeline SSE (distinguish in history via TaskRecord.status)

5. /raise-cap ack comment posts under probe token, not the running agent's identity

File: apps/server/src/webhook-handlers.tsapplyRaiseCapCommand

The ack " cost cap updated …" comment is posted via probeToken() — whichever agent token happens to be first in the registry. On a reviewer task, the comment appears from boss. Cosmetically confusing. Fix: pass the token from the matched worker's ResolvedAgent to applyRaiseCapCommand and fall back to probeToken() only when no running task is found.


🔵 Low / Informational

6. task-store.ts module comment omits cost_capped

File: apps/server/src/task-store.ts — module-level JSDoc

States terminal states are success | failure | cancelled. Add cost_capped. The SQL is already correct.

7. Malformed /raise-cap is silently ignored — no operator feedback

File: apps/server/src/webhook-handlers.ts — comment block above handleIssueComment

Comment implies malformed /raise-cap posts a usage hint; the code silently returns null. Either implement the hint or update the comment to say "silently ignored."

8. dispatchBreakdown doesn't apply caveman — no explanation

File: apps/server/src/webhook-handlers.tsdispatchBreakdown

Intentional? Add a comment so future readers don't wonder. If it's not intentional, apply maybeApplyCavemanAppendix with the tracking issue's labels.


What's good

  • Cost accumulation logic (estimateTurnCost, resolveModelPrice, rate table) is clean and well-tested.
  • warn_at_pct validation (must be 0 < x < 1) is correct and tested.
  • parseRaiseCapCommand handles "off" and case-insensitivity; unit tests cover it.
  • cost_capped is correctly added to TaskStatus, persisted, and included in all SQL IN clauses.
  • SSE envelope shapes (cost_warning, cost_capped) follow existing patterns.
  • The caveman_labels label-intersection logic in shouldApplyCaveman is correct.
  • max_cost_usd_per_task: null correctly disables the cap end-to-end.
CI ✅ green (run #1886, 4m12s). Round 1 full review. ## Summary Implementation is solid overall — the core cost-accumulation loop, caveman config parsing, SSE envelopes, and test suite (24 + 9 tests) are well-structured. Two bugs need fixing before merge. --- ## 🔴 Must Fix ### 1. `setCap` never resets the `capped` latch — `/raise-cap` silently no-ops after first cap fires **File:** `apps/server/src/agent-runner.ts` — the `setCap` function in `createCostAccumulator` `setCap` updates `cap` and conditionally resets `warnedAt`, but never touches the `capped` boolean. So: - Task hits cap → `capped = true` → abort fires. - Operator `/raise-cap 40` → `setCap(40)` → `warnedAt` resets, but `capped` is still `true`. - `nextThreshold()` checks `!capped && total >= cap` → `!capped` is `false` → returns `null` forever. - The new $40 cap never fires. Task runs without a ceiling. Same bug on the `setCap(null)` → `setCap(40)` path (operator disables cap then re-enables it): `capped` is still `true` from the first trip, so the re-enabled cap never fires. This directly defeats the `/raise-cap` escape hatch described in the acceptance criteria and `docs/token-economy.md`. **Fix:** Reset `capped` in `setCap` when the new cap is non-null and `total < newCap`: ```ts setCap(newCap) { cap = newCap; if (cap !== null && total < cap) { capped = false; if (total < cap * warnAtPct) warnedAt = Number.NEGATIVE_INFINITY; } }, ``` Add a corresponding test: `createCostAccumulator` — after first cap fires, `setCap(higherCap)` → advance past higher cap → should emit `"cap"` again. --- ### 2. `handleReviewRequested` skips caveman appendix — missing Phase 2 acceptance criterion **File:** `apps/server/src/webhook-handlers.ts` — `handleReviewRequested` dispatch block `dispatchIssueForAgent` calls `maybeApplyCavemanAppendix` at the dispatch site. `handleReviewRequested` builds the task with `applyAppendix(interpolate(…), agent)` and does not call `maybeApplyCavemanAppendix`. A `type:chore` PR sent for review won't get the terse caveman appendix even when `caveman_labels: ["type:chore"]` is configured on `reviewer`. The Phase 2 acceptance criterion says caveman applies to "routine reviews" — review dispatches are the primary use case for `reviewer-fast`. **Fix:** After building `task` in `handleReviewRequested`, apply caveman: ```ts task = await maybeApplyCavemanAppendix(task, agent, prLabels); ``` `prLabels` is the label array already computed from `prDetail.labels` a few lines later — hoist it above the `maybeApplyCavemanAppendix` call. --- ## 🟡 Should Fix ### 3. Warn latch uses sign-based sentinel instead of a boolean **File:** `apps/server/src/agent-runner.ts` — `createCostAccumulator` The warn-fired guard is `warnedAt < 0` (where "not fired" is `Number.NEGATIVE_INFINITY`). This is fragile: a refactor that initialises `warnedAt` to any negative number (e.g. `-1`) would permanently hold the warn gate open. A dedicated `let warned = false` boolean would be more explicit and is the standard pattern for latches. --- ### 4. Stale comment in `main.ts` — `TaskResult.status` now includes `cost_capped` **File:** `apps/server/src/main.ts` — `onFinish` hook, line ~560 Comment says "`TaskResult.status` is `success | failure` only — cancellation fires via `handleCancel`." That's stale: `cost_capped` is also a valid status here (see `worker.ts`). The `pipelineState` mapping silently collapses it to `"failure"` — at minimum the comment should say so: ```ts // cost_capped → "failure" in pipeline SSE (distinguish in history via TaskRecord.status) ``` --- ### 5. `/raise-cap` ack comment posts under probe token, not the running agent's identity **File:** `apps/server/src/webhook-handlers.ts` — `applyRaiseCapCommand` The ack "✅ cost cap updated …" comment is posted via `probeToken()` — whichever agent token happens to be first in the registry. On a `reviewer` task, the comment appears from `boss`. Cosmetically confusing. Fix: pass the token from the matched worker's `ResolvedAgent` to `applyRaiseCapCommand` and fall back to `probeToken()` only when no running task is found. --- ## 🔵 Low / Informational ### 6. `task-store.ts` module comment omits `cost_capped` **File:** `apps/server/src/task-store.ts` — module-level JSDoc States terminal states are `success | failure | cancelled`. Add `cost_capped`. The SQL is already correct. ### 7. Malformed `/raise-cap` is silently ignored — no operator feedback **File:** `apps/server/src/webhook-handlers.ts` — comment block above `handleIssueComment` Comment implies malformed `/raise-cap` posts a usage hint; the code silently returns `null`. Either implement the hint or update the comment to say "silently ignored." ### 8. `dispatchBreakdown` doesn't apply caveman — no explanation **File:** `apps/server/src/webhook-handlers.ts` — `dispatchBreakdown` Intentional? Add a comment so future readers don't wonder. If it's not intentional, apply `maybeApplyCavemanAppendix` with the tracking issue's labels. --- ## What's good - Cost accumulation logic (`estimateTurnCost`, `resolveModelPrice`, rate table) is clean and well-tested. - `warn_at_pct` validation (must be 0 < x < 1) is correct and tested. - `parseRaiseCapCommand` handles `"off"` and case-insensitivity; unit tests cover it. - `cost_capped` is correctly added to `TaskStatus`, persisted, and included in all SQL `IN` clauses. - SSE envelope shapes (`cost_warning`, `cost_capped`) follow existing patterns. - The `caveman_labels` label-intersection logic in `shouldApplyCaveman` is correct. - `max_cost_usd_per_task: null` correctly disables the cap end-to-end.
fix(agents): address PR #244 review — cap latch, caveman review, ack token
All checks were successful
qa / qa (pull_request) Successful in 3m59s
qa / dockerfile (pull_request) Successful in 5s
a39b2c79ff
Reviewer feedback (round 1) on the token-economy PR:

Must fix:
- `setCap` never cleared the `capped` latch, so `/raise-cap <higher>` and
  `setCap(null) → setCap(number)` silently no-opped after the first trip.
  Reset both `capped` and `warned` in `setCap` when the running total is
  under the new ceiling. Added two regression tests covering re-fire and
  disable-then-re-enable.
- `handleReviewRequested` skipped the caveman appendix — `reviewer-fast`
  on `type:chore` PRs is the primary caveman use case. Hoisted the label
  fetch above the task-build and route through `maybeApplyCavemanAppendix`.

Should fix:
- Replaced the `warnedAt < 0` sentinel with an explicit `warned: boolean`
  latch so a future refactor can't accidentally hold the warn gate open.
- Clarified the stale `main.ts` comment — `TaskResult.status` reaches the
  finish hook as `success | failure | cost_capped`; `cost_capped` is
  collapsed to `"failure"` for the pipeline SSE envelope while the
  persisted `TaskRecord.status` keeps the distinct terminal state.
- `/raise-cap` ack comment now posts under the matched worker's own
  Forgejo identity (falling back to the probe token only when no running
  task is found) — a reviewer-task cap bump no longer appears from boss.

Low / informational:
- Added `cost_capped` to the task-store module-level JSDoc terminal-state
  list.
- Updated the `/raise-cap` comment block to document that malformed
  commands are silently ignored (matches the actual code path).
- Added a note next to `dispatchBreakdown` explaining the deliberate
  caveman skip — breakdowns are long-form authoring that clashes with
  terse shorthand; Phase 3 cost cap still bounds runaway breakdowns.

Full QA green: 870 server tests + 187 web tests, biome check + format
clean.
Author
Collaborator

Review addressed in a39b2c7.

Must fix

  1. setCap now resets both the capped and warned latches when the new cap is non-null and the running total is under the new ceiling. Covers /raise-cap <higher> after first trip and the setCap(null) → setCap(number) re-enable path. Two regression tests added in token-economy.test.ts.
  2. handleReviewRequested now routes through maybeApplyCavemanAppendixreviewer-fast on type:chore PRs picks up the terse appendix. Hoisted the label fetch above the task-build as suggested.

Should fix
3. Replaced the warnedAt < 0 sign-based sentinel with an explicit let warned = false boolean latch.
4. Updated the stale onFinish comment in main.ts to note cost_capped is collapsed to "failure" in the pipeline SSE envelope while the persisted TaskRecord.status keeps the distinct terminal state.
5. applyRaiseCapCommand now picks up the matched worker's own Forgejo token for the ack comment (so a reviewer-task cap bump appears from reviewer, not whichever agent was first in the registry). Falls back to probeToken() only on the no-running-task miss branch.

Low / informational
6. Added cost_capped to the task-store.ts module-level JSDoc terminal-state list.
7. Updated the /raise-cap comment block to document that malformed commands are silently ignored (matches the actual code path — deliberate so a trusted-user typo can't trip the safety net off).
8. Added a note next to dispatchBreakdown explaining the deliberate caveman skip: breakdowns are long-form authoring that clashes with terse shorthand; Phase 3 cost cap still bounds runaway breakdowns.

Full QA green: 870 server tests + 187 web tests, biome check + format clean.

Review addressed in `a39b2c7`. **Must fix** 1. `setCap` now resets both the `capped` and `warned` latches when the new cap is non-null and the running total is under the new ceiling. Covers `/raise-cap <higher>` after first trip **and** the `setCap(null) → setCap(number)` re-enable path. Two regression tests added in `token-economy.test.ts`. 2. `handleReviewRequested` now routes through `maybeApplyCavemanAppendix` — `reviewer-fast` on `type:chore` PRs picks up the terse appendix. Hoisted the label fetch above the task-build as suggested. **Should fix** 3. Replaced the `warnedAt < 0` sign-based sentinel with an explicit `let warned = false` boolean latch. 4. Updated the stale `onFinish` comment in `main.ts` to note `cost_capped` is collapsed to `"failure"` in the pipeline SSE envelope while the persisted `TaskRecord.status` keeps the distinct terminal state. 5. `applyRaiseCapCommand` now picks up the matched worker's own Forgejo token for the ack comment (so a reviewer-task cap bump appears from `reviewer`, not whichever agent was first in the registry). Falls back to `probeToken()` only on the no-running-task miss branch. **Low / informational** 6. Added `cost_capped` to the `task-store.ts` module-level JSDoc terminal-state list. 7. Updated the `/raise-cap` comment block to document that malformed commands are silently ignored (matches the actual code path — deliberate so a trusted-user typo can't trip the safety net off). 8. Added a note next to `dispatchBreakdown` explaining the deliberate caveman skip: breakdowns are long-form authoring that clashes with terse shorthand; Phase 3 cost cap still bounds runaway breakdowns. Full QA green: 870 server tests + 187 web tests, biome check + format clean.
reviewer approved these changes 2026-04-21 13:55:03 +00:00
reviewer left a comment

CI green (run #1894, 4m5s). Round 2 delta review.

All round-1 blocking issues resolved

🔴 Issue 1 — setCap latch re-armed correctly

createCostAccumulator now uses explicit warned/capped booleans (fixing 🟡 issue 3 as a bonus). setCap resets capped = false when total < cap and warned = false when total < cap * warnAtPct. Two regression tests confirm both paths: setCapHigherCap re-arms the cap latch and setCap(null) → setCap(number) re-enables a working cap. Comment in the source credits the reviewer — good traceability.

🔴 Issue 2 — handleReviewRequested caveman appendix applied

maybeApplyCavemanAppendix(interpolatedTask, agent, labels) is now called after fetchPrDispatchLabels in handleReviewRequested. type:chore PR reviews will correctly get the terse appendix.

🟡 Issue 5 — /raise-cap ack posts under the right agent identity

applyRaiseCapCommand scans the worker registry, matches on repo + issue_number, and uses agent.token from the matched worker. Falls back to the probe token only on the no-running-task branch. Clean implementation.

🟡/🔵 Items 4, 6, 7, 8

Not addressed, but all were "should fix" / informational — none block merge. They can be picked up in a follow-on chore if desired.


Implementation is correct and well-tested. Approving.

CI ✅ green (run #1894, 4m5s). Round 2 delta review. ## All round-1 blocking issues resolved ### ✅ 🔴 Issue 1 — `setCap` latch re-armed correctly `createCostAccumulator` now uses explicit `warned`/`capped` booleans (fixing 🟡 issue 3 as a bonus). `setCap` resets `capped = false` when `total < cap` and `warned = false` when `total < cap * warnAtPct`. Two regression tests confirm both paths: `setCapHigherCap re-arms the cap latch` and `setCap(null) → setCap(number) re-enables a working cap`. Comment in the source credits the reviewer — good traceability. ### ✅ 🔴 Issue 2 — `handleReviewRequested` caveman appendix applied `maybeApplyCavemanAppendix(interpolatedTask, agent, labels)` is now called after `fetchPrDispatchLabels` in `handleReviewRequested`. `type:chore` PR reviews will correctly get the terse appendix. ### ✅ 🟡 Issue 5 — `/raise-cap` ack posts under the right agent identity `applyRaiseCapCommand` scans the worker registry, matches on `repo + issue_number`, and uses `agent.token` from the matched worker. Falls back to the probe token only on the no-running-task branch. Clean implementation. ### 🟡/🔵 Items 4, 6, 7, 8 Not addressed, but all were "should fix" / informational — none block merge. They can be picked up in a follow-on chore if desired. --- Implementation is correct and well-tested. Approving.
charles force-pushed boss/231 from a39b2c79ff
All checks were successful
qa / qa (pull_request) Successful in 3m59s
qa / dockerfile (pull_request) Successful in 5s
to 00d8476f94
Some checks failed
qa / qa (pull_request) Failing after 3m0s
qa / dockerfile (pull_request) Successful in 7s
2026-04-21 14:18:54 +00:00
Compare
code-lead force-pushed boss/231 from 00d8476f94
Some checks failed
qa / qa (pull_request) Failing after 3m0s
qa / dockerfile (pull_request) Successful in 7s
to 02e09c0bf2
Some checks failed
qa / qa (pull_request) Failing after 3m2s
qa / dockerfile (pull_request) Successful in 10s
2026-04-21 14:22:54 +00:00
Compare
fix(ci): collapse TaskStatus union onto one line for Biome
All checks were successful
qa / qa (pull_request) Successful in 4m12s
qa / dockerfile (pull_request) Successful in 8s
a4bf4db345
The 7-member union fits in 115 characters (under biome.json's
lineWidth: 120), so Biome's formatter rejects the multi-line shape
it was written in. `bun x biome format --write` picks the single-line
layout; applying that so `just qa` → `bun x biome format .` passes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
charles deleted branch boss/231 2026-04-21 14:37:20 +00:00
Sign in to join this conversation.
No reviewers
No milestone
No project
No assignees
3 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
charles/claude-hooks!244
No description provided.