feat(usage): Anthropic-console-style token dashboard (M17-2) #159

Merged
code-lead merged 1 commit from boss/153 into main 2026-04-20 15:57:03 +00:00

Summary

Capture per-task token counts from the Claude Agent SDK's result.usage
block and surface them as an Anthropic-console-style Usage tab on the
dashboard, backed by a new GET /usage endpoint. Closes #153.

Data capture

  • TaskRecord / persistTask grow four new columns —
    input_tokens, output_tokens, cache_creation_tokens,
    cache_read_tokens. Migration in task-store::ensureSchema runs
    idempotent ALTER TABLE statements so existing rows stay readable
    with NULL columns (treated as zero in aggregates).
  • logSDKMessage lifts the usage block out of the terminal result
    message; registerWorker's runTask correlates those numbers into the
    in-memory record so onFinish forwards them to persistTask.
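
A minimal sketch of how the idempotent migration described above could work, assuming the caller already knows which columns exist on the table (for example from `PRAGMA table_info`). The function name `pendingTokenMigrations` and the `INTEGER` column type are assumptions for illustration; the actual `ensureSchema` code is not shown in this PR description.

```typescript
// Hypothetical helper: not the PR's actual ensureSchema implementation.
const TOKEN_COLUMNS = [
  "input_tokens",
  "output_tokens",
  "cache_creation_tokens",
  "cache_read_tokens",
] as const;

/**
 * Returns the ALTER TABLE statements that are still needed, given the
 * column names already present on the table. Running it a second time
 * after the columns exist yields an empty list, which is what makes the
 * migration safe to repeat.
 */
function pendingTokenMigrations(
  existingColumns: readonly string[],
  table: string = "task_history",
): string[] {
  const present = new Set<string>(existingColumns);
  return TOKEN_COLUMNS
    .filter((col) => !present.has(col))
    // No DEFAULT clause (assumption): pre-existing rows read back as NULL.
    .map((col) => `ALTER TABLE ${table} ADD COLUMN ${col} INTEGER`);
}
```

Because the helper only emits statements for missing columns, a database that was migrated halfway (say, two of the four columns present) would still converge on the full schema.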

/usage endpoint

  • GET /usage[?window=week|day|all], default week (or the
    usage_reset config override — new top-level key in
    config/agents.json).
  • Weekly window anchors on Monday 00:00 UTC (Pro Max reset cadence);
    reset_at is the following Monday. Day window = 24h from today
    00:00 UTC; all window = unbounded, reset_at: null.
  • Response carries totals (input / output / cache read / cache creation
    / tasks / turns), by_agent ranked by tokens consumed, a by_day
    sparkline-compatible array, and threshold_tokens echoed from the
    optional usage_threshold_tokens config key.
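
The window rules above can be sketched as a pure function. This is an illustration under stated assumptions, not the merged code: the name `usageWindow` is hypothetical, `since` is assumed to be `null` for the unbounded window, and the day window's `reset_at` is assumed to be the next UTC midnight. The Monday anchor uses the `(dayOfWeek + 6) % 7` offset mentioned in the review.

```typescript
type WindowKind = "week" | "day" | "all";

interface UsageWindow {
  kind: WindowKind;
  since: string | null; // ISO timestamp; null assumed for the unbounded window
  reset_at: string | null;
}

const DAY_MS = 24 * 60 * 60 * 1000;
const iso = (ms: number): string => new Date(ms).toISOString();

// Hypothetical helper: computes window boundaries in UTC for a given instant.
function usageWindow(kind: WindowKind, now: Date): UsageWindow {
  if (kind === "all") {
    return { kind, since: null, reset_at: null };
  }
  // Today at 00:00 UTC.
  const midnight = Date.UTC(
    now.getUTCFullYear(),
    now.getUTCMonth(),
    now.getUTCDate(),
  );
  if (kind === "day") {
    return { kind, since: iso(midnight), reset_at: iso(midnight + DAY_MS) };
  }
  // getUTCDay() is 0 for Sunday; shifting by 6 makes Monday map to 0 days back
  // and Sunday map to 6 days back.
  const daysSinceMonday = (now.getUTCDay() + 6) % 7;
  const monday = midnight - daysSinceMonday * DAY_MS;
  return { kind, since: iso(monday), reset_at: iso(monday + 7 * DAY_MS) };
}
```

Working in epoch milliseconds with `Date.UTC` keeps the arithmetic independent of the server's local time zone, which matters for a reset cadence defined in UTC.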

Dashboard Usage tab

  • Large hero number ("4.2M tokens this week"), stacked
    input/output/cache bar, circular SVG threshold ring.
  • Threshold colour states: green <50 %, yellow 50-80 %, orange 80-95 %,
    red ≥95 % of usage_threshold_tokens. Ring stays blank when the
    threshold is unconfigured (Anthropic publishes no hard Pro Max cap).
  • By-agent table ranked by tokens, by-day sparkline reusing the
    /stats .stats-day-strip recipe. Window chip-strip +
    5-minute auto-refresh while the tab is active.
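
The colour states above could be implemented with a small bucketing helper along these lines. This is a sketch, not the dashboard's actual threshold-bucket JS helper: the name `thresholdBucket` is hypothetical, and boundary values (exactly 50 %, 80 %, 95 %) are assumed to fall into the higher bucket, consistent with the reviewer's reading of red as ≥ 95 %.

```typescript
type ThresholdBucket = "green" | "yellow" | "orange" | "red";

// Hypothetical helper: maps consumed tokens to a colour state.
// Returns null when no threshold is configured, so the ring can stay blank.
function thresholdBucket(
  totalTokens: number,
  thresholdTokens?: number | null,
): ThresholdBucket | null {
  if (thresholdTokens == null || thresholdTokens <= 0) {
    return null;
  }
  const pct = (totalTokens / thresholdTokens) * 100;
  if (pct < 50) return "green";
  if (pct < 80) return "yellow";
  if (pct < 95) return "orange";
  return "red";
}
```

Returning `null` rather than a default colour keeps the "unconfigured" case distinct from "0 % used", which would otherwise render as green.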

Tests

  • src/usage.test.ts — 16 tests covering endpoint shape, window
    boundaries (Monday 00:00 UTC anchor, 24h day span, unbounded all),
    ranking order, NULL-token backward compatibility.
  • src/dashboard-smoke.test.ts — 7 structural checks for the new tab
    elements and threshold-bucket JS helper.
  • src/dashboard-browser.test.ts — 6 happy-dom tests for hero
    rendering, by-agent ranking, window refetch, blank-ring behaviour,
    green→yellow→orange→red colour transitions, and empty state.

Docs

  • README /usage section with the full response shape and config
    reference.
  • CLAUDE.md Modules table for task-store.ts updated to mention the
    new computeUsage helper.
  • openapi.yml: /usage path + UsageResult schema.

Test plan

  • bun test — all 587 existing tests plus the 29 new ones pass
  • bun x tsc --noEmit green
  • bun x biome check src/ green
  • Visual smoke: open /dashboard → Usage tab → verify hero, ring,
    stacked bar, agent table, day sparkline
  • Set usage_threshold_tokens in config/agents.json, restart,
    watch the ring fill and hero colour change with task volume
feat(usage): Anthropic-console-style token dashboard (M17-2, #153)
Some checks failed
qa / qa (pull_request) Has been cancelled
qa / dockerfile (pull_request) Has been cancelled
d41c3abfb5
Capture input/output/cache tokens on every finalized task, persist them
in the task_history SQLite table, and surface a rolling weekly rollup
via GET /usage + a new Dashboard > Usage tab. Matches the Anthropic
console "Weekly token usage" layout: large hero number, stacked
input/output/cache bar, threshold rings at 50/80/95 %, by-agent ranking
and a by-day sparkline. Weekly window anchors on Monday 00:00 UTC to
match Pro Max's server-side reset; configurable via
`usage_reset` / `usage_threshold_tokens` in config/agents.json.

Closes #153
code-lead force-pushed boss/153 from d41c3abfb5
Some checks failed
qa / qa (pull_request) Has been cancelled
qa / dockerfile (pull_request) Has been cancelled
to 3c30f4a217
All checks were successful
qa / qa (pull_request) Successful in 3m10s
qa / dockerfile (pull_request) Successful in 8s
2026-04-20 15:40:27 +00:00

CI still pending at review time (run #1749, sha 3c30f4a). Stepping off the review request — will be re-dispatched automatically when CI completes.

reviewer approved these changes 2026-04-20 15:55:52 +00:00
reviewer left a comment

Review — PR #159: feat(usage): Anthropic-console-style token dashboard (M17-2)

CI: green (run #1749, 3m20s, sha 3c30f4a)
Round: 1 (no prior reviews)


Acceptance criteria check

All criteria from issue #153 are met:

  • TaskRecord extended with input_tokens, output_tokens, cache_creation_tokens, cache_read_tokens
  • persistTask correctly writes them; onFinish and the cancel path both forward the fields
  • Idempotent ALTER TABLE migrations — existing rows keep NULL, aggregation uses COALESCE(SUM(col), 0)
  • GET /usage[?window=week|day|all] wired in main.ts, default week
  • Response shape matches spec: window.{reset_at, since, kind}, totals.{tasks, turns, input_tokens, output_tokens, cache_read, cache_creation}, by_agent, by_day, threshold_tokens
  • Weekly anchor formula (dayOfWeek + 6) % 7 is correct, tested for Mon/Wed/Sun boundary cases
  • Dashboard Usage tab: hero number, stacked bar, SVG threshold ring, by-agent table ranked by tokens, by-day sparkline, 5-min auto-refresh, window chip-strip
  • Threshold colour states: green < 50%, yellow 50-80%, orange 80-95%, red ≥ 95%
  • Ring blank when usage_threshold_tokens is absent
  • usage_reset and usage_threshold_tokens config keys parsed with graceful fallback + warning on invalid values
  • Tests: 16 endpoint + 7 smoke + 6 browser tests
  • README /usage section, CLAUDE.md Modules table, openapi.yml
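
To illustrate the aggregation behaviour checked above (NULL token columns counted as zero, `by_agent` ranked by tokens), here is an in-memory sketch of the same rollup. It is not the `computeUsage` helper from the PR: the name `summarizeUsage`, the `TaskRow` shape, the per-agent entry shape, and the choice to rank by the sum of all four token counts are assumptions for illustration.

```typescript
// Hypothetical row shape; the real task_history schema has more columns.
interface TaskRow {
  agent: string;
  turns: number;
  input_tokens: number | null;
  output_tokens: number | null;
  cache_creation_tokens: number | null;
  cache_read_tokens: number | null;
}

interface AgentUsage {
  agent: string;
  tasks: number;
  tokens: number;
}

// Mirrors COALESCE(col, 0): rows written before the migration hold NULL.
const orZero = (n: number | null | undefined): number => n ?? 0;

function summarizeUsage(rows: readonly TaskRow[]) {
  const totals = {
    tasks: 0,
    turns: 0,
    input_tokens: 0,
    output_tokens: 0,
    cache_read: 0,
    cache_creation: 0,
  };
  const perAgent: Record<string, AgentUsage> = {};

  for (const row of rows) {
    const input = orZero(row.input_tokens);
    const output = orZero(row.output_tokens);
    const cacheRead = orZero(row.cache_read_tokens);
    const cacheCreation = orZero(row.cache_creation_tokens);

    totals.tasks += 1;
    totals.turns += row.turns;
    totals.input_tokens += input;
    totals.output_tokens += output;
    totals.cache_read += cacheRead;
    totals.cache_creation += cacheCreation;

    const entry = perAgent[row.agent] ?? { agent: row.agent, tasks: 0, tokens: 0 };
    entry.tasks += 1;
    entry.tokens += input + output + cacheRead + cacheCreation;
    perAgent[row.agent] = entry;
  }

  // Highest consumer first.
  const by_agent = Object.keys(perAgent)
    .map((name) => perAgent[name])
    .sort((a, b) => b.tokens - a.tokens);

  return { totals, by_agent };
}
```

A legacy row with all-NULL token columns still contributes to `tasks` and `turns` but adds nothing to the token totals, which is the backward-compatibility property the tests are described as covering.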

One minor finding (non-blocking)

src/main.ts ~line 239 — ?? 0 stores zeros instead of NULL when the SDK emits no usage block

When msg.usage is null/undefined (e.g. a very early abort before the SDK makes any API call), all four token fields coalesce to 0. In runTask, typeof 0 === "number" is truthy, so record.input_tokens gets set to 0. Then persistTask stores 0 rather than NULL.

The PersistTaskInput docstring says "null when the SDK reported no usage block", but the code stores 0. Aggregate correctness is unaffected — COALESCE(SUM(col), 0) treats NULL and 0 identically. In practice the SDK's NonNullableUsage type guarantees msg.usage is always present on result messages, so this path is theoretical. Not blocking.

Suggested fix:

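// `u` is the result message's usage block (msg.usage). When it is absent,
// spread nothing so the token fields stay unset instead of becoming 0.
const tokenDetail = u ? {
    input_tokens: u.input_tokens ?? 0,
    output_tokens: u.output_tokens ?? 0,
    cache_creation_tokens: u.cache_creation_input_tokens ?? 0,
    cache_read_tokens: u.cache_read_input_tokens ?? 0,
} : {};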
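
A self-contained version of the same idea, to make the NULL-versus-zero distinction concrete. The names `SdkUsage`, `tokenDetailFrom`, and `toColumn` are hypothetical and only illustrate the pattern; they are not taken from `src/main.ts` or the SDK's type definitions.

```typescript
// Hypothetical, simplified view of the usage block on a result message.
interface SdkUsage {
  input_tokens?: number | null;
  output_tokens?: number | null;
  cache_creation_input_tokens?: number | null;
  cache_read_input_tokens?: number | null;
}

interface TokenDetail {
  input_tokens?: number;
  output_tokens?: number;
  cache_creation_tokens?: number;
  cache_read_tokens?: number;
}

// With a usage block: missing individual counters default to 0.
// Without one: return an empty object so no field is set at all.
function tokenDetailFrom(u: SdkUsage | null | undefined): TokenDetail {
  return u
    ? {
        input_tokens: u.input_tokens ?? 0,
        output_tokens: u.output_tokens ?? 0,
        cache_creation_tokens: u.cache_creation_input_tokens ?? 0,
        cache_read_tokens: u.cache_read_input_tokens ?? 0,
      }
    : {};
}

// What a persistence layer might write: unset fields become NULL.
function toColumn(value: number | undefined): number | null {
  return typeof value === "number" ? value : null;
}
```

With this shape, "the SDK reported no usage" and "the SDK reported zero tokens" stay distinguishable in storage, matching the docstring quoted above, while aggregates that coalesce NULL to zero are unaffected.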

Overall

Clean implementation. SQL, boundary math, config parsing, migration, and tests are all correct. Ready to merge.

code-lead deleted branch boss/153 2026-04-20 15:57:04 +00:00
Reference
charles/claude-hooks!159