design(monitor): Penpot mockup — task detail page redesign with mid-flight steering #223

Closed
opened 2026-04-21 12:03:00 +00:00 by claude-desktop · 2 comments
Collaborator

User story

As a designer, I want a Penpot mockup for the redesigned task detail / log page (/app/monitor/task/:taskId), so that the implementer has a concrete reference for how the new mid-flight steering affordances, the SDK event transcript, and the cost/token metadata coexist on one page without visual clutter.

Context: today the page is a flat list of tool-call events with no way to intervene on a running agent. The operator needs to be able to see what the agent is doing AND nudge it mid-turn when it's going the wrong way, rather than /cancel + re-dispatch + re-explain.

Acceptance criteria

What the mockup must show (one or more frames each)

  • Running state — transcript streaming live, a compact "Steer" panel in a sticky footer/side-rail (visible when the task is running), an "Interrupt + Steer" composer with Enter-to-submit, a plain "Cancel" as the destructive escape hatch. Make the distinction between "steer" (keeps agent alive, injects a user message) and "cancel" (kills the whole task) visually obvious.
  • Settled state — same transcript in read mode, metadata strip shows status (success / failed / cancelled / interrupted), turns, cost, cache-read vs. input tokens, duration, linked PR if any, ↻ Re-dispatch button (from companion ticket #A).
  • Transcript hierarchy — use the existing role colors (boss/dev/reviewer/designer), collapse long tool-call outputs behind a chevron, highlight result messages vs. tool calls vs. assistant text. Handle very long outputs (hundreds of Bash lines) gracefully.
  • Sidebar / breadcrumb — links to the parent issue, the PR (if opened), and any sibling tasks for the same issue (chain-resumed sessions). Let the operator pivot without leaving the page.
  • Mid-flight metadata — live cost/token counter in the header so the operator knows the spend accrued so far. Bonus: a small inline "projected cost at current burn rate" badge.
  • Empty / error states — task not found, 401 (unauthenticated), 503 (service restarting — relevant after the shutdown drain discussion).

Tokens + a11y

  • Every color sourced from design/tokens.json — no raw hex (hard rule on this repo).
  • Copy uses the existing labor-hierarchy vocabulary (boss/dev/reviewer/designer/foreman), not "agent" / "bot".
  • Steering composer is keyboard-accessible: Cmd/Ctrl+Enter submits, Esc collapses the panel without losing draft text.

Handoff

  • Post the standard designer-handoff comment: deep-links per frame, per-frame change table, any token additions (if the mockup needs a new token, call it out for the theme ticket #208).
  • Add the area:design-review label on handoff so the design-reviewer pool picks it up.

Out of scope

  • Implementation — covered by the companion boss ticket (depends on this one landing).
  • Re-dispatch button — covered by the sibling re-dispatch ticket. This mockup should assume that button exists and show how it fits into the settled-state header.
  • Designing non-task-detail screens (Monitor list, Pipeline, Planner) — those stay as they are.

References

  • apps/web/src/routes/monitor.task.$taskId.tsx — current implementation to redesign.
  • apps/web/src/components/task-detail.tsx — event renderer.
  • packages/shared/src/sse.ts — SSE envelope shapes the transcript consumes.
  • design/tokens.json — Tokyo Night Storm palette + role colors.
  • Companion ticket: re-dispatch button (mockup should account for its button placement).
  • Companion implementation ticket: mid-flight steering (blocks on this mockup).
## User story As a designer, I want a Penpot mockup for the redesigned **task detail / log page** (`/app/monitor/task/:taskId`), so that the implementer has a concrete reference for how the new mid-flight steering affordances, the SDK event transcript, and the cost/token metadata coexist on one page without visual clutter. Context: today the page is a flat list of tool-call events with no way to intervene on a running agent. The operator needs to be able to see what the agent is doing AND nudge it mid-turn when it's going the wrong way, rather than `/cancel` + re-dispatch + re-explain. ## Acceptance criteria ### What the mockup must show (one or more frames each) - [ ] **Running state** — transcript streaming live, a compact "Steer" panel in a sticky footer/side-rail (visible when the task is `running`), an "Interrupt + Steer" composer with Enter-to-submit, a plain "Cancel" as the destructive escape hatch. Make the distinction between "steer" (keeps agent alive, injects a user message) and "cancel" (kills the whole task) visually obvious. - [ ] **Settled state** — same transcript in read mode, metadata strip shows status (`success` / `failed` / `cancelled` / `interrupted`), turns, cost, cache-read vs. input tokens, duration, linked PR if any, **↻ Re-dispatch** button (from companion ticket #A). - [ ] **Transcript hierarchy** — use the existing role colors (boss/dev/reviewer/designer), collapse long tool-call outputs behind a chevron, highlight `result` messages vs. tool calls vs. assistant text. Handle very long outputs (hundreds of Bash lines) gracefully. - [ ] **Sidebar / breadcrumb** — links to the parent issue, the PR (if opened), and any sibling tasks for the same issue (chain-resumed sessions). Let the operator pivot without leaving the page. - [ ] **Mid-flight metadata** — live cost/token counter in the header so the operator knows the spend accrued so far. Bonus: a small inline "projected cost at current burn rate" badge. - [ ] **Empty / error states** — task not found, 401 (unauthenticated), 503 (service restarting — relevant after the shutdown drain discussion). ### Tokens + a11y - [ ] Every color sourced from `design/tokens.json` — no raw hex (hard rule on this repo). - [ ] Copy uses the existing labor-hierarchy vocabulary (boss/dev/reviewer/designer/foreman), not "agent" / "bot". - [ ] Steering composer is keyboard-accessible: Cmd/Ctrl+Enter submits, Esc collapses the panel without losing draft text. ### Handoff - [ ] Post the standard designer-handoff comment: deep-links per frame, per-frame change table, any token additions (if the mockup needs a new token, call it out for the theme ticket #208). - [ ] Add the `area:design-review` label on handoff so the design-reviewer pool picks it up. ## Out of scope - Implementation — covered by the companion boss ticket (depends on this one landing). - Re-dispatch button — covered by the sibling re-dispatch ticket. This mockup should assume that button exists and show how it fits into the settled-state header. - Designing non-task-detail screens (Monitor list, Pipeline, Planner) — those stay as they are. ## References - `apps/web/src/routes/monitor.task.$taskId.tsx` — current implementation to redesign. - `apps/web/src/components/task-detail.tsx` — event renderer. - `packages/shared/src/sse.ts` — SSE envelope shapes the transcript consumes. - `design/tokens.json` — Tokyo Night Storm palette + role colors. - Companion ticket: re-dispatch button (mockup should account for its button placement). - Companion implementation ticket: mid-flight steering (blocks on this mockup).
Collaborator

Designer handoff — #223 mockup ready for review

Penpot file: #223 — Task detail redesign (mid-flight steering)
https://penpot.jacquin.app/#/workspace?team-id=689d7fa4-f94b-81d4-8007-e39c2a70e029&file-id=b6ad92e1-d51e-804e-8007-e73f8d9d5b0f

Per-page deep-links:

Per-page table

Page Purpose Frames Key components
01 — Running state Live agent dispatch — transcript streaming, mid-flight steering visible 01 — Running (root), topbar, breadcrumb, header, sidebar, transcript-area, steer-footer Role chip (dev), ◉ running… status pill (warning-amber + pulse), live cost header ($0.0421 + ↗ proj. $0.18 burn-rate), sidebar CONTEXT card set (issue / PR placeholder / sibling tasks chain with active row highlighted), transcript event cards (ev1-system, ev2-assistant, ev3-tool-bash expanded, ev4-tool-result-collapsed, ev5-assistant-streaming with cursor ), typing indicator, sticky steer footer with textarea composer, STEER label + hint, ⌘↵ Steer primary button (accent), ✖ Cancel destructive secondary, ESC collapse hint
02 — Settled state Agent finished, transcript in read-mode, metadata strip + re-dispatch 02 — Settled (root), topbar, breadcrumb, header, metadata-strip, sidebar, transcript-area ✔ success status pill, ↻ Re-dispatch primary button (header-right, accent — slot honours the companion ticket), ⧉ Copy task ID secondary, 6-cell metadata strip (TURNS · TOTAL COST · CACHE-READ / INPUT + hit-rate · OUTPUT · DURATION · LINKED PR #231), 🔒 transcript is read-only banner, jump-to-steer / jump-to-result anchors, the operator's ◆ operator · mid-flight steer event preserved in the scrollback with a 4px accent stripe, collapsed tool-call roll-up chip (▸ 14 tool calls · Read × 6, Edit × 5, Bash × 3), terminal ✔ result · success card with 4px success stripe and multi-line summary
03 — Transcript hierarchy & variants Component reference sheet for the implementer 03 — Variants (root), chip-boss/dev/reviewer/designer/foreman/operator, v-assistant, v-bash-expanded, v-tool-collapsed, v-long-output, v-steer-event, v-result-success, v-result-error All six role-label swatches with token names; B1 assistant text (no card fill); B2 tool call expanded (surface + nested surface-high terminal, green command / muted output); B3 tool call collapsed (single-line + chevron); B4 long-output with … 298 lines hidden above · ↑ show previous 20 affordance and show full (scroll) / copy all actions; B5 operator steer event (4px accent stripe, ◆ operator glyph); B6 result success (4px success stripe); B7 cancelled/failed/interrupted (4px error stripe)
04 — Empty & error states Full-page variants served by the route when the detail can't render 04 — States (root), state-404, state-401, state-503 404 (task not found): warning-coloured status code, copy pointing at task_history + Monitor search, mono task_id = "…" echo, ← Back to Monitor primary + secondary link; 401 (authentication required): error-coloured status code, session / IP / agent echo, Sign in with Authelia primary with deep-link preservation hint; 503 (service restarting): info-coloured status code, shutdown-drain progress bar (27s / 60s · 2 tasks settling), auto-reconnect: retrying /events in 4s live hint, ↻ Retry now secondary, note explaining the shutdown_received SSE envelope triggers this state and that agent containers keep running

Token CSS — all values from design/tokens.json (Tokyo Night Storm / dark)

/* canvas + surface */
--color-bg:            #1A1B26; /* color.bg */
--color-surface:       #24283B; /* color.surface — cards, sidebar, topbar, metadata strip */
--color-surface-high:  #2F3549; /* color.surface-high — nested chips, terminal box */
--color-border:        #414868; /* color.border — 1px hairlines */

/* text */
--color-text-primary:  #C0CAF5; /* body + h1 */
--color-text-muted:    #9AA5CE; /* sub-heading, description */
--color-text-dim:      #565F89; /* timestamps, meta, footer hints */

/* semantic */
--color-accent:        #7AA2F7; /* Steer primary, Re-dispatch primary, steer-event stripe */
--color-success:       #9ECE6A; /* ✔ success, cache-hit %, result-success stripe */
--color-warning:       #E0AF68; /* ◉ running…, burn-rate badge, 404 glyph */
--color-error:         #F7768E; /* ✖ Cancel label, interrupted, 401 glyph */
--color-info:          #7DCFFF; /* issue chip, tool-call glyph, 503 glyph */

/* role labels (unchanged — reserved upstream) */
--color-role-boss:     #7AA2F7;
--color-role-dev:      #BB9AF7;
--color-role-reviewer: #E0AF68;
--color-role-designer: #F7768E;
--color-role-foreman:  #7DCFFF;

/* radii / spacing kept to tokens.json defaults (not redeclared here) */

No new tokens were introduced — everything maps to an existing slot in design/tokens.json. The role-dev / role-boss chip backgrounds use surface-high with the role colour on the foreground glyph + label; no token for a "role-coloured surface" is needed because the chip is always surface-high regardless of role.

Decisions that deviated from the spec

  • ◉ running… uses warning amber, not success green. The AC says "streaming live" but green reads too much like "done". Amber + pulsing dot communicates work in progress without competing with ✔ success in the settled state. If the reviewer prefers info cyan for running, it's a one-token swap.
  • ↻ Re-dispatch rendered as the primary (accent-filled) header button on the settled state, with ⧉ Copy task ID as the secondary neutral chip next to it. The spec left button placement implicit ("bonus: fits into the settled-state header"); I put it top-right where the steer composer used to be so the eye lands on the next action. If the companion ticket wants a destructive tint (fresh spend), flip the fill to warning.
  • Burn-rate badge shown as a small inline amber pill in the header, not a separate card. The AC called it a "bonus", so I kept the footprint minimal — no dedicated metric tile. The live cost number itself stays in text-primary; only the projection is amber.
  • Operator-steer event uses accent for its 4px left stripe + glyph, not one of the role colors. Operators aren't a labor role in the boss/dev/reviewer/designer/foreman hierarchy — giving them a distinct accent signal keeps the role palette pure. The CLI renderer should emit the same stripe so the two transcripts read identically (noted in the B5 caption).
  • Cancel stays surface-high with error-coloured label, not a full error fill. A filled red button right next to the primary would swallow the steer button visually and turn every keystroke into a "which button am I about to hit" moment. Text-only danger is enough — the destructive action still reads as destructive, but only when the operator slows down to parse the label.
  • Long-output variant (B4) surfaces only the last 20 lines with an ↑ show previous 20 step-up plus a full-scroll escape. 200+ lines fully inlined would blow the viewport; truncation with explicit affordances is the "handle gracefully" choice the spec asked for.
  • Empty-state status codes pick colours by cause, not severity: 404 → warning (benign, missing data), 401 → error (hard gate, operator can't proceed until resolved), 503 → info (transient, self-healing). Same three-colour language as the running pill / stripes, so the dashboard "feels" consistent top-to-bottom.
  • Layout target is 1440×1024. Not stated in the AC; matched to the reference claude-hooks — dashboard file so the implementer can lift measurements without translation.
  • Design tokens were NOT imported into the Penpot tokens library. Tried DTCG import + single create_color_token calls; both hit a server-side schema validation error (:malli.core/invalid-dispatch-value on the :add-token-set change type) on this Penpot build — see penpot-mcp-server for an unrelated prior variant of the same symptom. Shapes use raw hex (all traceable back to design/tokens.json via the CSS block above) so the visual is correct; token rehydration can happen once the MCP path is fixed and is out of scope for this mockup.
  • No per-page PNG exports attached. The mcp__penpot__export_frame_png tool referenced in the designer skill is not registered on this MCP build. Reviewer will have to open the file directly; flagging so it's not mistaken for an oversight.

Pairs well with: the companion re-dispatch ticket (the button slot is ready for it) and the implementation ticket for /steer + SSE SteerEnvelope (packages/shared/src/sse.ts).

## Designer handoff — #223 mockup ready for review Penpot file: **#223 — Task detail redesign (mid-flight steering)** → https://penpot.jacquin.app/#/workspace?team-id=689d7fa4-f94b-81d4-8007-e39c2a70e029&file-id=b6ad92e1-d51e-804e-8007-e73f8d9d5b0f Per-page deep-links: - [01 — Running state](https://penpot.jacquin.app/#/workspace?team-id=689d7fa4-f94b-81d4-8007-e39c2a70e029&file-id=b6ad92e1-d51e-804e-8007-e73f8d9d5b0f&page-id=cc7073f3-52ba-4e44-b950-a2a5922a3489) - [02 — Settled state (success + re-dispatch)](https://penpot.jacquin.app/#/workspace?team-id=689d7fa4-f94b-81d4-8007-e39c2a70e029&file-id=b6ad92e1-d51e-804e-8007-e73f8d9d5b0f&page-id=62c3752f-efce-4719-891e-ed1dd0ab0580) - [03 — Transcript hierarchy & variants](https://penpot.jacquin.app/#/workspace?team-id=689d7fa4-f94b-81d4-8007-e39c2a70e029&file-id=b6ad92e1-d51e-804e-8007-e73f8d9d5b0f&page-id=958e5900-b87b-4867-bee7-f54496048d39) - [04 — Empty & error states](https://penpot.jacquin.app/#/workspace?team-id=689d7fa4-f94b-81d4-8007-e39c2a70e029&file-id=b6ad92e1-d51e-804e-8007-e73f8d9d5b0f&page-id=f8a40b94-629a-4c67-9e33-676a119fa8e4) ### Per-page table | Page | Purpose | Frames | Key components | |---|---|---|---| | 01 — Running state | Live agent dispatch — transcript streaming, mid-flight steering visible | `01 — Running (root)`, `topbar`, `breadcrumb`, `header`, `sidebar`, `transcript-area`, `steer-footer` | Role chip (`dev`), `◉ running…` status pill (warning-amber + pulse), live cost header (`$0.0421` + `↗ proj. $0.18` burn-rate), sidebar `CONTEXT` card set (issue / PR placeholder / sibling tasks chain with active row highlighted), transcript event cards (`ev1-system`, `ev2-assistant`, `ev3-tool-bash` expanded, `ev4-tool-result-collapsed`, `ev5-assistant-streaming` with cursor `▌`), typing indicator, **sticky steer footer** with textarea composer, `STEER` label + hint, `⌘↵ Steer` primary button (accent), `✖ Cancel` destructive secondary, `ESC` collapse hint | | 02 — Settled state | Agent finished, transcript in read-mode, metadata strip + re-dispatch | `02 — Settled (root)`, `topbar`, `breadcrumb`, `header`, `metadata-strip`, `sidebar`, `transcript-area` | `✔ success` status pill, **`↻ Re-dispatch` primary button** (header-right, accent — slot honours the companion ticket), `⧉ Copy task ID` secondary, 6-cell metadata strip (TURNS · TOTAL COST · CACHE-READ / INPUT + hit-rate · OUTPUT · DURATION · LINKED PR #231), `🔒 transcript is read-only` banner, **jump-to-steer / jump-to-result** anchors, the operator's `◆ operator · mid-flight steer` event preserved in the scrollback with a 4px accent stripe, collapsed tool-call roll-up chip (`▸ 14 tool calls · Read × 6, Edit × 5, Bash × 3`), terminal `✔ result · success` card with 4px success stripe and multi-line summary | | 03 — Transcript hierarchy & variants | Component reference sheet for the implementer | `03 — Variants (root)`, `chip-boss/dev/reviewer/designer/foreman/operator`, `v-assistant`, `v-bash-expanded`, `v-tool-collapsed`, `v-long-output`, `v-steer-event`, `v-result-success`, `v-result-error` | All six role-label swatches with token names; B1 assistant text (no card fill); B2 tool call expanded (surface + nested surface-high terminal, green command / muted output); B3 tool call collapsed (single-line + chevron); B4 long-output with `… 298 lines hidden above · ↑ show previous 20` affordance and `show full (scroll) / copy all` actions; B5 operator steer event (4px accent stripe, `◆ operator` glyph); B6 result success (4px success stripe); B7 cancelled/failed/interrupted (4px error stripe) | | 04 — Empty & error states | Full-page variants served by the route when the detail can't render | `04 — States (root)`, `state-404`, `state-401`, `state-503` | 404 (task not found): warning-coloured status code, copy pointing at `task_history` + Monitor search, mono `task_id = "…"` echo, `← Back to Monitor` primary + secondary link; 401 (authentication required): error-coloured status code, session / IP / agent echo, `Sign in with Authelia` primary with deep-link preservation hint; 503 (service restarting): info-coloured status code, shutdown-drain progress bar (`27s / 60s · 2 tasks settling`), `auto-reconnect: retrying /events in 4s` live hint, `↻ Retry now` secondary, note explaining the `shutdown_received` SSE envelope triggers this state and that agent containers keep running | ### Token CSS — all values from `design/tokens.json` (Tokyo Night Storm / dark) ```css /* canvas + surface */ --color-bg: #1A1B26; /* color.bg */ --color-surface: #24283B; /* color.surface — cards, sidebar, topbar, metadata strip */ --color-surface-high: #2F3549; /* color.surface-high — nested chips, terminal box */ --color-border: #414868; /* color.border — 1px hairlines */ /* text */ --color-text-primary: #C0CAF5; /* body + h1 */ --color-text-muted: #9AA5CE; /* sub-heading, description */ --color-text-dim: #565F89; /* timestamps, meta, footer hints */ /* semantic */ --color-accent: #7AA2F7; /* Steer primary, Re-dispatch primary, steer-event stripe */ --color-success: #9ECE6A; /* ✔ success, cache-hit %, result-success stripe */ --color-warning: #E0AF68; /* ◉ running…, burn-rate badge, 404 glyph */ --color-error: #F7768E; /* ✖ Cancel label, interrupted, 401 glyph */ --color-info: #7DCFFF; /* issue chip, tool-call glyph, 503 glyph */ /* role labels (unchanged — reserved upstream) */ --color-role-boss: #7AA2F7; --color-role-dev: #BB9AF7; --color-role-reviewer: #E0AF68; --color-role-designer: #F7768E; --color-role-foreman: #7DCFFF; /* radii / spacing kept to tokens.json defaults (not redeclared here) */ ``` No new tokens were introduced — everything maps to an existing slot in `design/tokens.json`. The `role-dev` / `role-boss` chip backgrounds use `surface-high` with the role colour on the foreground glyph + label; no token for a "role-coloured surface" is needed because the chip is always `surface-high` regardless of role. ### Decisions that deviated from the spec - **`◉ running…` uses `warning` amber, not `success` green.** The AC says "streaming live" but green reads too much like "done". Amber + pulsing dot communicates *work in progress* without competing with `✔ success` in the settled state. If the reviewer prefers `info` cyan for running, it's a one-token swap. - **`↻ Re-dispatch` rendered as the primary (accent-filled) header button** on the settled state, with `⧉ Copy task ID` as the secondary neutral chip next to it. The spec left button placement implicit ("bonus: fits into the settled-state header"); I put it top-right where the steer composer used to be so the eye lands on the next action. If the companion ticket wants a destructive tint (fresh spend), flip the fill to `warning`. - **Burn-rate badge shown as a small inline amber pill in the header**, not a separate card. The AC called it a "bonus", so I kept the footprint minimal — no dedicated metric tile. The live cost number itself stays in `text-primary`; only the *projection* is amber. - **Operator-steer event uses `accent` for its 4px left stripe + `◆` glyph**, not one of the role colors. Operators aren't a labor role in the `boss/dev/reviewer/designer/foreman` hierarchy — giving them a distinct *accent* signal keeps the role palette pure. The CLI renderer should emit the same stripe so the two transcripts read identically (noted in the B5 caption). - **Cancel stays `surface-high` with `error`-coloured label**, not a full error fill. A filled red button right next to the primary would swallow the steer button visually and turn every keystroke into a "which button am I about to hit" moment. Text-only danger is enough — the destructive action still reads as destructive, but only when the operator slows down to parse the label. - **Long-output variant (B4) surfaces only the last 20 lines** with an `↑ show previous 20` step-up plus a full-scroll escape. 200+ lines fully inlined would blow the viewport; truncation with explicit affordances is the "handle gracefully" choice the spec asked for. - **Empty-state status codes pick colours by cause, not severity**: 404 → `warning` (benign, missing data), 401 → `error` (hard gate, operator can't proceed until resolved), 503 → `info` (transient, self-healing). Same three-colour language as the running pill / stripes, so the dashboard "feels" consistent top-to-bottom. - **Layout target is 1440×1024.** Not stated in the AC; matched to the reference `claude-hooks — dashboard` file so the implementer can lift measurements without translation. - **Design tokens were NOT imported into the Penpot tokens library.** Tried DTCG import + single `create_color_token` calls; both hit a server-side schema validation error (`:malli.core/invalid-dispatch-value` on the `:add-token-set` change type) on this Penpot build — see penpot-mcp-server for an unrelated prior variant of the same symptom. Shapes use raw hex (all traceable back to `design/tokens.json` via the CSS block above) so the visual is correct; token rehydration can happen once the MCP path is fixed and is out of scope for this mockup. - **No per-page PNG exports attached.** The `mcp__penpot__export_frame_png` tool referenced in the designer skill is not registered on this MCP build. Reviewer will have to open the file directly; flagging so it's not mistaken for an oversight. Pairs well with: the companion re-dispatch ticket (the button slot is ready for it) and the implementation ticket for `/steer` + SSE `SteerEnvelope` (packages/shared/src/sse.ts).
Collaborator

🎨 Design review — issue #223

Reviewed by: design-reviewer
File: #223 — Task detail redesign (mid-flight steering) (b6ad92e1-d51e-804e-8007-e73f8d9d5b0f, revn 240)
Pages reviewed: 01 Running state · 02 Settled state · 03 Transcript hierarchy & variants · 04 Empty & error states
Note: mcp__penpot__export_frame_png is not registered on this MCP build — visual inspection is based on the handoff documentation, the CSS token table, and Penpot file metadata. No multimodal pixel-level review was possible.


Token taxonomy no violations

All 17 hex values in the handoff CSS block map cleanly to design/tokens.json entries (theme-dark + theme-light both covered). No raw-hex values outside the canonical palette were introduced. The Penpot file's token library is empty ({}) because the DTCG import hit a server-side :malli.core/invalid-dispatch-value error, but the shapes themselves use the correct hex values per the designer's documentation — this is a toolchain gap, not a design error.


AC coverage all criteria met

Criterion Status Notes
Running state — transcript + steer footer Page 01 — sticky footer with ⌘↵ Steer + ✖ Cancel, ◉ running… amber pill
Settled state — metadata strip + Re-dispatch Page 02 — 6-cell strip: TURNS · COST · CACHE · OUTPUT · DURATION · PR
Transcript hierarchy — roles + collapse Page 03 — all six role chips + operator, B2–B7 variants
Sidebar / breadcrumb — issue + PR + siblings Page 01 — CONTEXT card set with active-row highlight
Mid-flight metadata — live cost + burn-rate Header cost counter + amber ↗ proj. $0.18 pill
Empty / error states Page 04 — 404 (warning) · 401 (error) · 503 (info)
Token compliance (no raw hex) All hex traceable to design/tokens.json
Labor-hierarchy vocabulary boss/dev/reviewer/designer/foreman/operator throughout
Keyboard accessibility (⌘↵ / Esc) noted Documented in steer-footer; see finding A3 below

Findings

Contrast — text-dim at small sizes (systemic, 3 instances)

These are pre-existing design/tokens.json slots, not new introductions by this mockup. Flagged here because the mockup is the first design surface to use text-dim heavily at caption/meta sizes, so the implementer needs guidance now rather than at code-review time.

  • C1 — text-dim (#565F89) on bg (#1A1B26): contrast ≈ 2.77:1 — below WCAG AA 3:1 (large text / UI) and 4.5:1 (normal text). Affects: all timestamps on running-state headers (11px meta), footer hints (ESC to collapse), live cost superscript.

  • C2 — text-dim (#565F89) on surface (#24283B): contrast ≈ 2.36:1 — fails all WCAG thresholds. Affects: event timestamps on every transcript card across pages 01–03.

  • C3 — text-dim (#565F89) on surface-high (#2F3549): contrast ≈ 1.97:1 — fails all WCAG thresholds. Affects: line-count annotation in the B4 long-output terminal box (↑ show previous 20).

    Fix for C1–C3: Use text-muted (#9AA5CE, ~6:1 on surface) for all text-bearing uses of text-dim at < 18px. Reserve text-dim for non-text decorative elements (hairlines, icon fill, horizontal rules). File a note on the theme ticket #208 to re-evaluate the text-dim slot — it is too dark to carry readable copy at any size on Tokyo Night Storm backgrounds.

UX / Spec gaps

  • A1 — Operator chip foreground colour not specified (page 03, chip-operator variant):
    The handoff documents accent (#7AA2F7) for the 4px left stripe and glyph on operator steer events but does not state what colour the operator label text itself uses. The CSS block has no --color-role-operator slot. Does the label render in text-primary or accent? Implementer will guess. Recommend adding --color-role-operator: var(--color-accent) (or text-primary) explicitly to the handoff, or annotating it in the B5 caption.

  • A2 — 503 drain progress bar missing ARIA annotations (page 04, state-503):
    The drain bar (27s / 60s · 2 tasks settling) is an animated time-bounded element. The mockup doesn't specify the required ARIA attributes: role="progressbar", aria-valuenow (elapsed seconds), aria-valuemax (drain_ms / 1000), aria-label="Service restart drain". Without this annotation the implementer is likely to render it as a visual-only <div> that screen readers skip entirely. Add the annotation to the state-503 frame notes.

  • A3 — Steer textarea missing aria-label annotation (page 01, steer-footer):
    ⌘↵ and Esc interactions are documented but the <textarea> itself has no specified accessible label. Screen readers will announce it as an unnamed input. Recommend adding aria-label="Steer message — submit with ⌘↵ or Ctrl+↵, collapse with Esc" to the steer-footer spec.

Suggestions (non-blocking)

  • S1 — accent / role-boss hex collision (#7AA2F7):
    Both color.accent and color.role-boss resolve to the same blue. On a boss-dispatched running task the primary "⌘↵ Steer" button fill and the boss role chip foreground are indistinguishable by hue. The handoff correctly uses surface-high as the chip background (so the chip reads as a chip, not a button), which mitigates the worst ambiguity. Still, consider noting this duality on ticket #208 so a future accent-interactive alias can be introduced if the role palette expands.

  • S2 — Token rehydration:
    The Penpot token library is empty. Shapes carry raw hex. Once the MCP DTCG import path is repaired (:add-token-set schema bug — flagged by designer), re-importing design/tokens.json will let the Penpot inspector surface token names per shape and make future palette pivots one-click. Track on #208.

  • S3 — Burn-rate badge update semantics not specified:
    The ↗ proj. $0.18 amber pill is documented visually but the handoff doesn't specify whether it recalculates on every SSE event or on a throttled interval. The implementer will need this to avoid a jittery recalc on every Bash output line. A note like "recalculate projection at most every 5 s or on tool_use events" would prevent a noisy ticker.


Verdict

No findings are blockers for engineering handoff. The mockup covers all six AC categories, uses the correct token taxonomy (no off-palette hex), and makes well-reasoned decisions on the amber-vs-green running state, operator glyph, cancel button weight, and long-output step-pager. The three contrast failures (C1–C3) are inherited from the existing text-dim token and are flagged here for the implementer's attention, not as a request to rework the mockup.

Ready for engineering handoff — companion implementation ticket (mid-flight steering) may proceed.

## 🎨 Design review — issue #223 **Reviewed by:** design-reviewer **File:** `#223 — Task detail redesign (mid-flight steering)` (`b6ad92e1-d51e-804e-8007-e73f8d9d5b0f`, revn 240) **Pages reviewed:** 01 Running state · 02 Settled state · 03 Transcript hierarchy & variants · 04 Empty & error states **Note:** `mcp__penpot__export_frame_png` is not registered on this MCP build — visual inspection is based on the handoff documentation, the CSS token table, and Penpot file metadata. No multimodal pixel-level review was possible. --- ### Token taxonomy ✅ no violations All 17 hex values in the handoff CSS block map cleanly to `design/tokens.json` entries (theme-dark + theme-light both covered). No raw-hex values outside the canonical palette were introduced. The Penpot file's token library is empty (`{}`) because the DTCG import hit a server-side `:malli.core/invalid-dispatch-value` error, but the shapes themselves use the correct hex values per the designer's documentation — this is a toolchain gap, not a design error. --- ### AC coverage ✅ all criteria met | Criterion | Status | Notes | |---|---|---| | Running state — transcript + steer footer | ✅ | Page 01 — sticky footer with ⌘↵ Steer + ✖ Cancel, `◉ running…` amber pill | | Settled state — metadata strip + Re-dispatch | ✅ | Page 02 — 6-cell strip: TURNS · COST · CACHE · OUTPUT · DURATION · PR | | Transcript hierarchy — roles + collapse | ✅ | Page 03 — all six role chips + operator, B2–B7 variants | | Sidebar / breadcrumb — issue + PR + siblings | ✅ | Page 01 — CONTEXT card set with active-row highlight | | Mid-flight metadata — live cost + burn-rate | ✅ | Header cost counter + amber `↗ proj. $0.18` pill | | Empty / error states | ✅ | Page 04 — 404 (warning) · 401 (error) · 503 (info) | | Token compliance (no raw hex) | ✅ | All hex traceable to `design/tokens.json` | | Labor-hierarchy vocabulary | ✅ | boss/dev/reviewer/designer/foreman/operator throughout | | Keyboard accessibility (⌘↵ / Esc) | noted ✅ | Documented in steer-footer; see finding A3 below | --- ### Findings #### Contrast — `text-dim` at small sizes (systemic, 3 instances) > These are pre-existing `design/tokens.json` slots, not new introductions by this mockup. Flagged here because the mockup is the first design surface to use `text-dim` heavily at caption/meta sizes, so the implementer needs guidance now rather than at code-review time. - **C1 — `text-dim` (#565F89) on `bg` (#1A1B26):** contrast ≈ **2.77:1** — below WCAG AA 3:1 (large text / UI) and 4.5:1 (normal text). Affects: all timestamps on running-state headers (11px `meta`), footer hints (`ESC to collapse`), live cost superscript. - **C2 — `text-dim` (#565F89) on `surface` (#24283B):** contrast ≈ **2.36:1** — fails all WCAG thresholds. Affects: event timestamps on every transcript card across pages 01–03. - **C3 — `text-dim` (#565F89) on `surface-high` (#2F3549):** contrast ≈ **1.97:1** — fails all WCAG thresholds. Affects: line-count annotation in the B4 long-output terminal box (`↑ show previous 20`). **Fix for C1–C3:** Use `text-muted` (#9AA5CE, ~6:1 on `surface`) for all text-bearing uses of `text-dim` at < 18px. Reserve `text-dim` for non-text decorative elements (hairlines, icon fill, horizontal rules). File a note on the theme ticket #208 to re-evaluate the `text-dim` slot — it is too dark to carry readable copy at any size on Tokyo Night Storm backgrounds. #### UX / Spec gaps - **A1 — Operator chip foreground colour not specified (page 03, `chip-operator` variant):** The handoff documents `accent` (#7AA2F7) for the 4px left stripe and `◆` glyph on operator steer events but does not state what colour the `operator` label text itself uses. The CSS block has no `--color-role-operator` slot. Does the label render in `text-primary` or `accent`? Implementer will guess. Recommend adding `--color-role-operator: var(--color-accent)` (or `text-primary`) explicitly to the handoff, or annotating it in the B5 caption. - **A2 — 503 drain progress bar missing ARIA annotations (page 04, `state-503`):** The drain bar (`27s / 60s · 2 tasks settling`) is an animated time-bounded element. The mockup doesn't specify the required ARIA attributes: `role="progressbar"`, `aria-valuenow` (elapsed seconds), `aria-valuemax` (drain_ms / 1000), `aria-label="Service restart drain"`. Without this annotation the implementer is likely to render it as a visual-only `<div>` that screen readers skip entirely. Add the annotation to the `state-503` frame notes. - **A3 — Steer textarea missing `aria-label` annotation (page 01, `steer-footer`):** ⌘↵ and Esc interactions are documented but the `<textarea>` itself has no specified accessible label. Screen readers will announce it as an unnamed input. Recommend adding `aria-label="Steer message — submit with ⌘↵ or Ctrl+↵, collapse with Esc"` to the steer-footer spec. #### Suggestions (non-blocking) - **S1 — `accent` / `role-boss` hex collision (#7AA2F7):** Both `color.accent` and `color.role-boss` resolve to the same blue. On a boss-dispatched running task the primary "⌘↵ Steer" button fill and the `boss` role chip foreground are indistinguishable by hue. The handoff correctly uses `surface-high` as the chip background (so the chip reads as a chip, not a button), which mitigates the worst ambiguity. Still, consider noting this duality on ticket #208 so a future `accent-interactive` alias can be introduced if the role palette expands. - **S2 — Token rehydration:** The Penpot token library is empty. Shapes carry raw hex. Once the MCP DTCG import path is repaired (`:add-token-set` schema bug — flagged by designer), re-importing `design/tokens.json` will let the Penpot inspector surface token names per shape and make future palette pivots one-click. Track on #208. - **S3 — Burn-rate badge update semantics not specified:** The `↗ proj. $0.18` amber pill is documented visually but the handoff doesn't specify whether it recalculates on every SSE event or on a throttled interval. The implementer will need this to avoid a jittery recalc on every Bash output line. A note like "recalculate projection at most every 5 s or on `tool_use` events" would prevent a noisy ticker. --- ### Verdict **No findings are blockers for engineering handoff.** The mockup covers all six AC categories, uses the correct token taxonomy (no off-palette hex), and makes well-reasoned decisions on the amber-vs-green running state, operator glyph, cancel button weight, and long-output step-pager. The three contrast failures (C1–C3) are inherited from the existing `text-dim` token and are flagged here for the implementer's attention, not as a request to rework the mockup. **Ready for engineering handoff** — companion implementation ticket (mid-flight steering) may proceed.
Sign in to join this conversation.
No milestone
No project
No assignees
3 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
charles/claude-hooks#223
No description provided.