charles/claude-hooks

Fork 0

feat(usage): Anthropic-console-style token dashboard (M17-2) #159

Merged

code-lead merged 1 commit from boss/153 into main

2026-04-20 15:57:03 +00:00

code-lead commented

2026-04-20 15:37:19 +00:00

Collaborator

Summary

Capture per-task token counts from the Claude Agent SDK's result.usage
block and surface them as an Anthropic-console-style Usage tab on the
dashboard, backed by a new GET /usage endpoint. Closes #153.

Data capture

TaskRecord / persistTask grow four new columns —
input_tokens, output_tokens, cache_creation_tokens,
cache_read_tokens. Migration in task-store::ensureSchema runs
idempotent ALTER TABLE statements so existing rows stay readable
with NULL columns (treated as zero in aggregates).
logSDKMessage lifts the usage block out of the terminal result
message; registerWorker's runTask correlates those numbers into the
in-memory record so onFinish forwards them to persistTask.

`/usage` endpoint

GET /usage[?window=week|day|all], default week (or the
usage_reset config override — new top-level key in
config/agents.json).
Weekly window anchors on Monday 00:00 UTC (Pro Max reset cadence);
reset_at is the following Monday. Day window = 24h from today
00:00 UTC; all window = unbounded, reset_at: null.
Response carries totals (input / output / cache read / cache creation
/ tasks / turns), by_agent ranked by tokens consumed, a by_day
sparkline-compatible array, and threshold_tokens echoed from the
optional usage_threshold_tokens config key.

Dashboard Usage tab

Large hero number ("4.2M tokens this week"), stacked
input/output/cache bar, circular SVG threshold ring.
Threshold colour states: green <50 %, yellow 50-80 %, orange 80-95 %,
red >95 % of usage_threshold_tokens. Ring stays blank when the
threshold is unconfigured (Anthropic publishes no hard Pro-Max cap).
By-agent table ranked by tokens, by-day sparkline reusing the
/stats .stats-day-strip recipe. Window chip-strip +
5-minute auto-refresh while the tab is active.

Tests

src/usage.test.ts — 16 tests covering endpoint shape, window
boundaries (Monday 00:00 UTC anchor, 24h day span, unbounded all),
ranking order, NULL-token backward compatibility.
src/dashboard-smoke.test.ts — 7 structural checks for the new tab
elements and threshold-bucket JS helper.
src/dashboard-browser.test.ts — 6 happy-dom tests for hero
rendering, by-agent ranking, window refetch, blank-ring behaviour,
green→yellow→orange→red colour transitions, and empty state.

Docs

README /usage section with the full response shape and config
reference.
CLAUDE.md Modules table for task-store.ts updated to mention the
new computeUsage helper.
openapi.yml — /usage path + UsageResult schema.

Test plan

bun test — all 587 existing tests plus the 29 new ones pass
bun x tsc --noEmit green
bun x biome check src/ green
Visual smoke: open /dashboard → Usage tab → verify hero, ring,
stacked bar, agent table, day sparkline
Set usage_threshold_tokens in config/agents.json, restart,
watch the ring fill and hero colour change with task volume

## Summary Capture per-task token counts from the Claude Agent SDK's `result.usage` block and surface them as an Anthropic-console-style Usage tab on the dashboard, backed by a new `GET /usage` endpoint. Closes #153. ## Data capture - `TaskRecord` / `persistTask` grow four new columns — `input_tokens`, `output_tokens`, `cache_creation_tokens`, `cache_read_tokens`. Migration in `task-store::ensureSchema` runs idempotent `ALTER TABLE` statements so existing rows stay readable with `NULL` columns (treated as zero in aggregates). - `logSDKMessage` lifts the usage block out of the terminal `result` message; `registerWorker`'s runTask correlates those numbers into the in-memory record so `onFinish` forwards them to `persistTask`. ## `/usage` endpoint - `GET /usage[?window=week|day|all]`, default `week` (or the `usage_reset` config override — new top-level key in `config/agents.json`). - Weekly window anchors on **Monday 00:00 UTC** (Pro Max reset cadence); `reset_at` is the following Monday. Day window = 24h from today 00:00 UTC; all window = unbounded, `reset_at: null`. - Response carries totals (input / output / cache read / cache creation / tasks / turns), `by_agent` ranked by tokens consumed, a `by_day` sparkline-compatible array, and `threshold_tokens` echoed from the optional `usage_threshold_tokens` config key. ## Dashboard Usage tab - Large hero number ("4.2M tokens this week"), stacked input/output/cache bar, circular SVG threshold ring. - Threshold colour states: green <50 %, yellow 50-80 %, orange 80-95 %, red >95 % of `usage_threshold_tokens`. Ring stays blank when the threshold is unconfigured (Anthropic publishes no hard Pro-Max cap). - By-agent table ranked by tokens, by-day sparkline reusing the `/stats` `.stats-day-strip` recipe. Window chip-strip + 5-minute auto-refresh while the tab is active. ## Tests - `src/usage.test.ts` — 16 tests covering endpoint shape, window boundaries (Monday 00:00 UTC anchor, 24h day span, unbounded all), ranking order, NULL-token backward compatibility. - `src/dashboard-smoke.test.ts` — 7 structural checks for the new tab elements and threshold-bucket JS helper. - `src/dashboard-browser.test.ts` — 6 happy-dom tests for hero rendering, by-agent ranking, window refetch, blank-ring behaviour, green→yellow→orange→red colour transitions, and empty state. ## Docs - README `/usage` section with the full response shape and config reference. - CLAUDE.md Modules table for `task-store.ts` updated to mention the new `computeUsage` helper. - `openapi.yml` — `/usage` path + `UsageResult` schema. ## Test plan - [ ] `bun test` — all 587 existing tests plus the 29 new ones pass - [ ] `bun x tsc --noEmit` green - [ ] `bun x biome check src/` green - [ ] Visual smoke: open `/dashboard` → Usage tab → verify hero, ring, stacked bar, agent table, day sparkline - [ ] Set `usage_threshold_tokens` in `config/agents.json`, restart, watch the ring fill and hero colour change with task volume

code-lead added 1 commit

2026-04-20 15:37:19 +00:00

feat(usage): Anthropic-console-style token dashboard (M17-2, #153 )

qa / qa (pull_request) Has been cancelled

Details

qa / dockerfile (pull_request) Has been cancelled

Details

d41c3abfb5

Capture input/output/cache tokens on every finalized task, persist them
in the task_history SQLite table, and surface a rolling weekly rollup
via GET /usage + a new Dashboard > Usage tab. Matches the Anthropic
console "Weekly token usage" layout: large hero number, stacked
input/output/cache bar, threshold rings at 50/80/95 %, by-agent ranking
and a by-day sparkline. Weekly window anchors on Monday 00:00 UTC to
match Pro Max's server-side reset; configurable via
`usage_reset` / `usage_threshold_tokens` in config/agents.json.

Closes #153

code-lead self-assigned this

2026-04-20 15:37:28 +00:00

code-lead requested review from reviewer

2026-04-20 15:40:21 +00:00

code-lead force-pushed boss/153 from d41c3abfb5

qa / qa (pull_request) Has been cancelled

Details

qa / dockerfile (pull_request) Has been cancelled

Details

to 3c30f4a217

qa / qa (pull_request) Successful in 3m10s

Details

qa / dockerfile (pull_request) Successful in 8s

Details

2026-04-20 15:40:27 +00:00

Compare

reviewer commented

2026-04-20 15:40:59 +00:00

Collaborator

CI still pending at review time (run #1749, sha 3c30f4a). Stepping off the review request — will be re-dispatched automatically when CI completes.

CI still pending at review time (run #1749, sha `3c30f4a`). Stepping off the review request — will be re-dispatched automatically when CI completes.

reviewer approved these changes

2026-04-20 15:55:52 +00:00

reviewer left a comment

Review — PR #159: feat(usage): Anthropic-console-style token dashboard (M17-2)

CI: green ✅ (run #1749, 3m20s, sha 3c30f4a)
Round: 1 (no prior reviews)

Acceptance criteria check

All criteria from issue #153 are met:

✅ TaskRecord extended with input_tokens, output_tokens, cache_creation_tokens, cache_read_tokens
✅ persistTask correctly writes them; onFinish and the cancel path both forward the fields
✅ Idempotent ALTER TABLE migrations — existing rows keep NULL, aggregation uses COALESCE(SUM(col), 0)
✅ GET /usage[?window=week|day|all] wired in main.ts, default week
✅ Response shape matches spec: window.{reset_at, since, kind}, totals.{tasks, turns, input_tokens, output_tokens, cache_read, cache_creation}, by_agent, by_day, threshold_tokens
✅ Weekly anchor formula (dayOfWeek + 6) % 7 is correct, tested for Mon/Wed/Sun boundary cases
✅ Dashboard Usage tab: hero number, stacked bar, SVG threshold ring, by-agent table ranked by tokens, by-day sparkline, 5-min auto-refresh, window chip-strip
✅ Threshold colour states: green < 50%, yellow 50-80%, orange 80-95%, red ≥ 95%
✅ Ring blank when usage_threshold_tokens is absent
✅ usage_reset and usage_threshold_tokens config keys parsed with graceful fallback + warning on invalid values
✅ Tests: 16 endpoint + 7 smoke + 6 browser tests
✅ README /usage section, CLAUDE.md Modules table, openapi.yml

One minor finding (non-blocking)

src/main.ts ~line 239 — ?? 0 stores zeros instead of NULL when the SDK emits no usage block

When msg.usage is null/undefined (e.g. a very early abort before the SDK makes any API call), all four token fields coalesce to 0. In runTask, typeof 0 === "number" is truthy, so record.input_tokens gets set to 0. Then persistTask stores 0 rather than NULL.

The PersistTaskInput docstring says "null when the SDK reported no usage block", but the code stores 0. Aggregate correctness is unaffected — COALESCE(SUM(col), 0) treats NULL and 0 identically. In practice the SDK's NonNullableUsage type guarantees msg.usage is always present on result messages, so this path is theoretical. Not blocking.

Suggested fix:

const tokenDetail = u ? {
    input_tokens: u.input_tokens ?? 0,
    output_tokens: u.output_tokens ?? 0,
    cache_creation_tokens: u.cache_creation_input_tokens ?? 0,
    cache_read_tokens: u.cache_read_input_tokens ?? 0,
} : {};

Overall

Clean implementation. SQL, boundary math, config parsing, migration, and tests are all correct. Ready to merge.

## Review — PR #159: feat(usage): Anthropic-console-style token dashboard (M17-2) **CI**: green ✅ (run #1749, 3m20s, sha `3c30f4a`) **Round**: 1 (no prior reviews) --- ### Acceptance criteria check All criteria from issue #153 are met: - ✅ `TaskRecord` extended with `input_tokens`, `output_tokens`, `cache_creation_tokens`, `cache_read_tokens` - ✅ `persistTask` correctly writes them; `onFinish` and the cancel path both forward the fields - ✅ Idempotent `ALTER TABLE` migrations — existing rows keep `NULL`, aggregation uses `COALESCE(SUM(col), 0)` - ✅ `GET /usage[?window=week|day|all]` wired in `main.ts`, default `week` - ✅ Response shape matches spec: `window.{reset_at, since, kind}`, `totals.{tasks, turns, input_tokens, output_tokens, cache_read, cache_creation}`, `by_agent`, `by_day`, `threshold_tokens` - ✅ Weekly anchor formula `(dayOfWeek + 6) % 7` is correct, tested for Mon/Wed/Sun boundary cases - ✅ Dashboard Usage tab: hero number, stacked bar, SVG threshold ring, by-agent table ranked by tokens, by-day sparkline, 5-min auto-refresh, window chip-strip - ✅ Threshold colour states: green < 50%, yellow 50-80%, orange 80-95%, red ≥ 95% - ✅ Ring blank when `usage_threshold_tokens` is absent - ✅ `usage_reset` and `usage_threshold_tokens` config keys parsed with graceful fallback + warning on invalid values - ✅ Tests: 16 endpoint + 7 smoke + 6 browser tests - ✅ README `/usage` section, CLAUDE.md Modules table, `openapi.yml` --- ### One minor finding (non-blocking) **`src/main.ts` ~line 239 — `?? 0` stores zeros instead of NULL when the SDK emits no usage block** When `msg.usage` is `null`/`undefined` (e.g. a very early abort before the SDK makes any API call), all four token fields coalesce to `0`. In `runTask`, `typeof 0 === "number"` is truthy, so `record.input_tokens` gets set to `0`. Then `persistTask` stores `0` rather than `NULL`. The `PersistTaskInput` docstring says "null when the SDK reported no usage block", but the code stores `0`. Aggregate correctness is unaffected — `COALESCE(SUM(col), 0)` treats NULL and 0 identically. In practice the SDK's `NonNullableUsage` type guarantees `msg.usage` is always present on result messages, so this path is theoretical. Not blocking. Suggested fix: ```typescript const tokenDetail = u ? { input_tokens: u.input_tokens ?? 0, output_tokens: u.output_tokens ?? 0, cache_creation_tokens: u.cache_creation_input_tokens ?? 0, cache_read_tokens: u.cache_read_input_tokens ?? 0, } : {}; ``` --- ### Overall Clean implementation. SQL, boundary math, config parsing, migration, and tests are all correct. Ready to merge.