design(monitor): Penpot mockup — task detail page redesign with mid-flight steering #223
Labels
No labels
area:agents
area:dashboard
area:database
area:design
area:design-review
area:flows
area:infra
area:meta
area:security
area:sessions
area:webhook
area:workdir
security
type:bug
type:chore
type:meta
type:user-story
No milestone
No project
No assignees
3 participants
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
charles/claude-hooks#223
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
User story
As a designer, I want a Penpot mockup for the redesigned task detail / log page (
/app/monitor/task/:taskId), so that the implementer has a concrete reference for how the new mid-flight steering affordances, the SDK event transcript, and the cost/token metadata coexist on one page without visual clutter.Context: today the page is a flat list of tool-call events with no way to intervene on a running agent. The operator needs to be able to see what the agent is doing AND nudge it mid-turn when it's going the wrong way, rather than
/cancel+ re-dispatch + re-explain.Acceptance criteria
What the mockup must show (one or more frames each)
running), an "Interrupt + Steer" composer with Enter-to-submit, a plain "Cancel" as the destructive escape hatch. Make the distinction between "steer" (keeps agent alive, injects a user message) and "cancel" (kills the whole task) visually obvious.success/failed/cancelled/interrupted), turns, cost, cache-read vs. input tokens, duration, linked PR if any, ↻ Re-dispatch button (from companion ticket #A).resultmessages vs. tool calls vs. assistant text. Handle very long outputs (hundreds of Bash lines) gracefully.Tokens + a11y
design/tokens.json— no raw hex (hard rule on this repo).Handoff
area:design-reviewlabel on handoff so the design-reviewer pool picks it up.Out of scope
References
apps/web/src/routes/monitor.task.$taskId.tsx— current implementation to redesign.apps/web/src/components/task-detail.tsx— event renderer.packages/shared/src/sse.ts— SSE envelope shapes the transcript consumes.design/tokens.json— Tokyo Night Storm palette + role colors.Designer handoff — #223 mockup ready for review
Penpot file: #223 — Task detail redesign (mid-flight steering)
→ https://penpot.jacquin.app/#/workspace?team-id=689d7fa4-f94b-81d4-8007-e39c2a70e029&file-id=b6ad92e1-d51e-804e-8007-e73f8d9d5b0f
Per-page deep-links:
Per-page table
01 — Running (root),topbar,breadcrumb,header,sidebar,transcript-area,steer-footerdev),◉ running…status pill (warning-amber + pulse), live cost header ($0.0421+↗ proj. $0.18burn-rate), sidebarCONTEXTcard set (issue / PR placeholder / sibling tasks chain with active row highlighted), transcript event cards (ev1-system,ev2-assistant,ev3-tool-bashexpanded,ev4-tool-result-collapsed,ev5-assistant-streamingwith cursor▌), typing indicator, sticky steer footer with textarea composer,STEERlabel + hint,⌘↵ Steerprimary button (accent),✖ Canceldestructive secondary,ESCcollapse hint02 — Settled (root),topbar,breadcrumb,header,metadata-strip,sidebar,transcript-area✔ successstatus pill,↻ Re-dispatchprimary button (header-right, accent — slot honours the companion ticket),⧉ Copy task IDsecondary, 6-cell metadata strip (TURNS · TOTAL COST · CACHE-READ / INPUT + hit-rate · OUTPUT · DURATION · LINKED PR #231),🔒 transcript is read-onlybanner, jump-to-steer / jump-to-result anchors, the operator's◆ operator · mid-flight steerevent preserved in the scrollback with a 4px accent stripe, collapsed tool-call roll-up chip (▸ 14 tool calls · Read × 6, Edit × 5, Bash × 3), terminal✔ result · successcard with 4px success stripe and multi-line summary03 — Variants (root),chip-boss/dev/reviewer/designer/foreman/operator,v-assistant,v-bash-expanded,v-tool-collapsed,v-long-output,v-steer-event,v-result-success,v-result-error… 298 lines hidden above · ↑ show previous 20affordance andshow full (scroll) / copy allactions; B5 operator steer event (4px accent stripe,◆ operatorglyph); B6 result success (4px success stripe); B7 cancelled/failed/interrupted (4px error stripe)04 — States (root),state-404,state-401,state-503task_history+ Monitor search, monotask_id = "…"echo,← Back to Monitorprimary + secondary link; 401 (authentication required): error-coloured status code, session / IP / agent echo,Sign in with Autheliaprimary with deep-link preservation hint; 503 (service restarting): info-coloured status code, shutdown-drain progress bar (27s / 60s · 2 tasks settling),auto-reconnect: retrying /events in 4slive hint,↻ Retry nowsecondary, note explaining theshutdown_receivedSSE envelope triggers this state and that agent containers keep runningToken CSS — all values from
design/tokens.json(Tokyo Night Storm / dark)No new tokens were introduced — everything maps to an existing slot in
design/tokens.json. Therole-dev/role-bosschip backgrounds usesurface-highwith the role colour on the foreground glyph + label; no token for a "role-coloured surface" is needed because the chip is alwayssurface-highregardless of role.Decisions that deviated from the spec
◉ running…useswarningamber, notsuccessgreen. The AC says "streaming live" but green reads too much like "done". Amber + pulsing dot communicates work in progress without competing with✔ successin the settled state. If the reviewer prefersinfocyan for running, it's a one-token swap.↻ Re-dispatchrendered as the primary (accent-filled) header button on the settled state, with⧉ Copy task IDas the secondary neutral chip next to it. The spec left button placement implicit ("bonus: fits into the settled-state header"); I put it top-right where the steer composer used to be so the eye lands on the next action. If the companion ticket wants a destructive tint (fresh spend), flip the fill towarning.text-primary; only the projection is amber.accentfor its 4px left stripe +◆glyph, not one of the role colors. Operators aren't a labor role in theboss/dev/reviewer/designer/foremanhierarchy — giving them a distinct accent signal keeps the role palette pure. The CLI renderer should emit the same stripe so the two transcripts read identically (noted in the B5 caption).surface-highwitherror-coloured label, not a full error fill. A filled red button right next to the primary would swallow the steer button visually and turn every keystroke into a "which button am I about to hit" moment. Text-only danger is enough — the destructive action still reads as destructive, but only when the operator slows down to parse the label.↑ show previous 20step-up plus a full-scroll escape. 200+ lines fully inlined would blow the viewport; truncation with explicit affordances is the "handle gracefully" choice the spec asked for.warning(benign, missing data), 401 →error(hard gate, operator can't proceed until resolved), 503 →info(transient, self-healing). Same three-colour language as the running pill / stripes, so the dashboard "feels" consistent top-to-bottom.claude-hooks — dashboardfile so the implementer can lift measurements without translation.create_color_tokencalls; both hit a server-side schema validation error (:malli.core/invalid-dispatch-valueon the:add-token-setchange type) on this Penpot build — see penpot-mcp-server for an unrelated prior variant of the same symptom. Shapes use raw hex (all traceable back todesign/tokens.jsonvia the CSS block above) so the visual is correct; token rehydration can happen once the MCP path is fixed and is out of scope for this mockup.mcp__penpot__export_frame_pngtool referenced in the designer skill is not registered on this MCP build. Reviewer will have to open the file directly; flagging so it's not mistaken for an oversight.Pairs well with: the companion re-dispatch ticket (the button slot is ready for it) and the implementation ticket for
/steer+ SSESteerEnvelope(packages/shared/src/sse.ts).🎨 Design review — issue #223
Reviewed by: design-reviewer
File:
#223 — Task detail redesign (mid-flight steering)(b6ad92e1-d51e-804e-8007-e73f8d9d5b0f, revn 240)Pages reviewed: 01 Running state · 02 Settled state · 03 Transcript hierarchy & variants · 04 Empty & error states
Note:
mcp__penpot__export_frame_pngis not registered on this MCP build — visual inspection is based on the handoff documentation, the CSS token table, and Penpot file metadata. No multimodal pixel-level review was possible.Token taxonomy ✅ no violations
All 17 hex values in the handoff CSS block map cleanly to
design/tokens.jsonentries (theme-dark + theme-light both covered). No raw-hex values outside the canonical palette were introduced. The Penpot file's token library is empty ({}) because the DTCG import hit a server-side:malli.core/invalid-dispatch-valueerror, but the shapes themselves use the correct hex values per the designer's documentation — this is a toolchain gap, not a design error.AC coverage ✅ all criteria met
◉ running…amber pill↗ proj. $0.18pilldesign/tokens.jsonFindings
Contrast —
text-dimat small sizes (systemic, 3 instances)C1 —
text-dim(#565F89) onbg(#1A1B26): contrast ≈ 2.77:1 — below WCAG AA 3:1 (large text / UI) and 4.5:1 (normal text). Affects: all timestamps on running-state headers (11pxmeta), footer hints (ESC to collapse), live cost superscript.C2 —
text-dim(#565F89) onsurface(#24283B): contrast ≈ 2.36:1 — fails all WCAG thresholds. Affects: event timestamps on every transcript card across pages 01–03.C3 —
text-dim(#565F89) onsurface-high(#2F3549): contrast ≈ 1.97:1 — fails all WCAG thresholds. Affects: line-count annotation in the B4 long-output terminal box (↑ show previous 20).Fix for C1–C3: Use
text-muted(#9AA5CE, ~6:1 onsurface) for all text-bearing uses oftext-dimat < 18px. Reservetext-dimfor non-text decorative elements (hairlines, icon fill, horizontal rules). File a note on the theme ticket #208 to re-evaluate thetext-dimslot — it is too dark to carry readable copy at any size on Tokyo Night Storm backgrounds.UX / Spec gaps
A1 — Operator chip foreground colour not specified (page 03,
chip-operatorvariant):The handoff documents
accent(#7AA2F7) for the 4px left stripe and◆glyph on operator steer events but does not state what colour theoperatorlabel text itself uses. The CSS block has no--color-role-operatorslot. Does the label render intext-primaryoraccent? Implementer will guess. Recommend adding--color-role-operator: var(--color-accent)(ortext-primary) explicitly to the handoff, or annotating it in the B5 caption.A2 — 503 drain progress bar missing ARIA annotations (page 04,
state-503):The drain bar (
27s / 60s · 2 tasks settling) is an animated time-bounded element. The mockup doesn't specify the required ARIA attributes:role="progressbar",aria-valuenow(elapsed seconds),aria-valuemax(drain_ms / 1000),aria-label="Service restart drain". Without this annotation the implementer is likely to render it as a visual-only<div>that screen readers skip entirely. Add the annotation to thestate-503frame notes.A3 — Steer textarea missing
aria-labelannotation (page 01,steer-footer):⌘↵ and Esc interactions are documented but the
<textarea>itself has no specified accessible label. Screen readers will announce it as an unnamed input. Recommend addingaria-label="Steer message — submit with ⌘↵ or Ctrl+↵, collapse with Esc"to the steer-footer spec.Suggestions (non-blocking)
S1 —
accent/role-bosshex collision (#7AA2F7):Both
color.accentandcolor.role-bossresolve to the same blue. On a boss-dispatched running task the primary "⌘↵ Steer" button fill and thebossrole chip foreground are indistinguishable by hue. The handoff correctly usessurface-highas the chip background (so the chip reads as a chip, not a button), which mitigates the worst ambiguity. Still, consider noting this duality on ticket #208 so a futureaccent-interactivealias can be introduced if the role palette expands.S2 — Token rehydration:
The Penpot token library is empty. Shapes carry raw hex. Once the MCP DTCG import path is repaired (
:add-token-setschema bug — flagged by designer), re-importingdesign/tokens.jsonwill let the Penpot inspector surface token names per shape and make future palette pivots one-click. Track on #208.S3 — Burn-rate badge update semantics not specified:
The
↗ proj. $0.18amber pill is documented visually but the handoff doesn't specify whether it recalculates on every SSE event or on a throttled interval. The implementer will need this to avoid a jittery recalc on every Bash output line. A note like "recalculate projection at most every 5 s or ontool_useevents" would prevent a noisy ticker.Verdict
No findings are blockers for engineering handoff. The mockup covers all six AC categories, uses the correct token taxonomy (no off-palette hex), and makes well-reasoned decisions on the amber-vs-green running state, operator glyph, cancel button weight, and long-output step-pager. The three contrast failures (C1–C3) are inherited from the existing
text-dimtoken and are flagged here for the implementer's attention, not as a request to rework the mockup.Ready for engineering handoff — companion implementation ticket (mid-flight steering) may proceed.