Add designer + design-reviewer agent types (UI/UX mockups) #56

Closed
opened 2026-04-18 17:17:41 +00:00 by claude-desktop · 0 comments
Collaborator

User story

As a maintainer, I want two new agent types — designer and design-reviewer — routed by area:design, so that UI/UX mockup tickets (like #55) land through a dedicated agent pair that can drive Penpot end-to-end and catch obvious visual defects, without a human driving the session.

Context

#55 was delivered interactively from Claude Desktop with Penpot MCP + Forge MCP + a chunk of Authelia auth plumbing. That's not a repeatable pipeline — the design workflow needs to become an agent type with a containerised MCP loadout, the same way dev/boss/reviewer have pre-baked skills.

Model decisions (resolves the review chat on #55):

  • designer default model: claude-opus-4-7. Layout math, MCP quirks (frame-fill bug, parallel-write coordination), and design-system reasoning are beyond what Sonnet handles reliably. Downgrade to Sonnet only per-instance, via the dashboard, for simple one-frame tweaks.
  • design-reviewer default model: claude-sonnet-4-6. Multimodal pattern-matching on exported frame PNGs (overflow / contrast / alignment / poor positioning) is high-recall review work, not synthesis. Sonnet is the cost/quality sweet spot.

Acceptance criteria

Config — config/agents.json

  • New designer entry: forgejo_user: designer, model: claude-opus-4-7, branch_prefix: designer, system prompt explaining it produces Penpot mockups and handoff comments, container enabled with the claude-hooks:dev image.
  • New design-reviewer entry: forgejo_user: design-reviewer, model: claude-sonnet-4-6, system prompt for visual defect review, container enabled.
  • Forgejo users designer and design-reviewer created on the instance; tokens issued to ~/.config/claude-hooks/tokens/{designer,design-reviewer}; added to the repos they should access.
  • Signing keys set up the same way boss/dev/reviewer have them (per the forgejo_gpg_home memory — keyring must live at /var/lib/forgejo/data/home/.gnupg/).
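
As a rough illustration of the two config entries above, here is a typed sketch of what they might look like. Only `forgejo_user`, `model` and `branch_prefix` are named in this ticket; the `system_prompt` and `container` field names, the prompt wording, and the reviewer using the same `claude-hooks:dev` image are assumptions, so the real `config/agents.json` schema may differ.

```typescript
// Hypothetical shape of one entry in config/agents.json.
// system_prompt and container are ASSUMED field names, not confirmed by the ticket.
interface AgentEntry {
  forgejo_user: string;
  model: string;
  branch_prefix?: string;
  system_prompt: string;
  container: { enabled: boolean; image: string };
}

const designer: AgentEntry = {
  forgejo_user: "designer",
  model: "claude-opus-4-7",
  branch_prefix: "designer",
  // Placeholder wording; the real prompt is up to the implementer.
  system_prompt: "You produce Penpot mockups and post handoff comments.",
  container: { enabled: true, image: "claude-hooks:dev" },
};

const designReviewer: AgentEntry = {
  forgejo_user: "design-reviewer",
  model: "claude-sonnet-4-6",
  system_prompt: "You review exported Penpot frames for visual defects.",
  // Assumption: the reviewer runs on the same image as the designer.
  container: { enabled: true, image: "claude-hooks:dev" },
};

// Token location follows the convention stated in the ticket:
// ~/.config/claude-hooks/tokens/<forgejo_user>
function tokenPath(entry: AgentEntry): string {
  return `~/.config/claude-hooks/tokens/${entry.forgejo_user}`;
}
```

Deriving the token path from `forgejo_user` keeps the config entry and the on-disk token name from drifting apart.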

MCP bindings — container image

  • designer container embeds Forge MCP (already baked) + Penpot MCP (new binding).
  • design-reviewer container embeds the same two MCPs (reviewer needs Forge for posting comments + Penpot for export_frame_png and for reading file structure).
  • Penpot MCP credentials passed in as env from host secrets: ~/.config/claude-hooks/penpot-creds, ~/.config/claude-hooks/authelia-creds, ~/.config/claude-hooks/penpot-cookie.
  • Critical gotcha (paid for on #55): this Penpot instance has login-with-password disabled (OIDC-only) and access tokens off. The stock penpot-mcp-server crashes on startup because it tries login-with-password and gets code: login-disabled. The patched build at ~/Workspace/penpot-mcp-server/ adds AUTHELIA_BASIC_AUTH (Authelia proxy-auth) + PENPOT_AUTH_TOKEN_COOKIE (pre-seeded OIDC session cookie, bypasses login) + DB→RPC fallback in services/changes.py::get_file_info. These patches must be upstreamed or baked into claude-hooks:dev before this ticket closes, otherwise the designer/reviewer containers will crash-loop.
  • dev/boss/reviewer containers do not embed Penpot MCP — cost and attack-surface aren't worth it for code-only agents.
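
The binding rules above can be sketched as a small lookup plus a preflight check that fails fast instead of letting a container crash-loop. The function names, the `McpServer` labels, and the idea that both `AUTHELIA_BASIC_AUTH` and `PENPOT_AUTH_TOKEN_COOKIE` must be set are assumptions for illustration; the ticket only says the patched build adds those variables. It also assumes code-only agents keep Forge MCP, which the ticket implies but does not state.

```typescript
type AgentType = "dev" | "boss" | "reviewer" | "designer" | "design-reviewer";
type McpServer = "forge" | "penpot";

// Only the design pair gets Penpot MCP; code-only agents stay Forge-only (assumed).
function mcpServersFor(agent: AgentType): McpServer[] {
  const isDesignAgent = agent === "designer" || agent === "design-reviewer";
  return isDesignAgent ? ["forge", "penpot"] : ["forge"];
}

// Env vars introduced by the patched Penpot MCP build.
// Treating both as mandatory is an assumption of this sketch.
const REQUIRED_PENPOT_ENV = ["AUTHELIA_BASIC_AUTH", "PENPOT_AUTH_TOKEN_COOKIE"] as const;

// Returns the names of required variables that are unset or empty.
function missingPenpotEnv(env: Record<string, string | undefined>): string[] {
  return REQUIRED_PENPOT_ENV.filter((name) => !env[name]);
}
```

A dispatcher could call `missingPenpotEnv` before starting a design container and refuse to launch when the returned list is non-empty.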

Skills

New skills, living under src/skills/:

  • design-breakdown (designer) — reads a spec ticket; proposes per-screen scope; optionally creates the Penpot file skeleton and sub-stories. Mirrors the existing breakdown skill, but outputs are Penpot artifacts + design user stories, not code tickets.
  • design-implement (designer) — reads the ticket's AC, creates the Penpot pages/frames/shapes, posts a handoff comment matching the shape of #55 (comment) (deep-link, per-page table, token CSS, decisions-that-deviated list).
  • design-address-review (designer) — reads review comments from design-reviewer, applies fixes to the Penpot file, re-posts.
  • design-review (design-reviewer) — exports each frame via Penpot export_frame_png, inspects visually, posts a comment with findings grouped by category: overflow, contrast, alignment, typography, UX, suggestion. Each finding cites the frame name + approximate coordinates.
  • Existing implement/review/address-review/rebase/merge/fix-ci/breakdown skills stay unchanged; dev/boss/reviewer do not gain design skills.
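
To make the design-review output concrete, here is a minimal sketch of a findings structure and a formatter that groups findings by the categories listed above. The `Finding` fields, the `renderReview` name, and the markdown layout of the comment are assumptions; the ticket only requires grouping by category and citing the frame name plus approximate coordinates.

```typescript
// Category order follows the ticket's list.
const CATEGORIES = ["overflow", "contrast", "alignment", "typography", "UX", "suggestion"] as const;
type Category = (typeof CATEGORIES)[number];

// Hypothetical finding shape: frame name plus approximate coordinates.
interface Finding {
  category: Category;
  frame: string;
  x: number;
  y: number;
  note: string;
}

// Renders a review comment body with one section per non-empty category.
function renderReview(findings: Finding[]): string {
  const sections: string[] = [];
  for (const category of CATEGORIES) {
    const items = findings.filter((f) => f.category === category);
    if (items.length === 0) continue;
    const lines = items.map((f) => `- ${f.frame} (~${f.x}, ${f.y}): ${f.note}`);
    sections.push(`### ${category}\n${lines.join("\n")}`);
  }
  return sections.length > 0 ? sections.join("\n\n") : "No findings.";
}
```

Iterating over the fixed category list, instead of the findings themselves, keeps the section order stable across reviews, which should make successive review comments easier to compare.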

Label routing

  • Create label area:design (suggested color #ec4899 magenta to differentiate from area:dashboard purple; description: "UI/UX mockup work — routes to designer agent").
  • Webhook: issue labelled area:design → dispatch to designer (not dev/boss). PR opened by designer → dispatch to design-reviewer (not reviewer).
  • Tickets without area:design keep routing as today.
  • Update the routing logic in src/webhook.ts (or wherever label-based dispatch lives) to look up the type by label before falling through to the default type mapping.
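
The routing rules above could look roughly like the following. The event shapes, the `routeEvent` name, and the `fallback` parameter are hypothetical stand-ins; the actual payload types and default mapping in `src/webhook.ts` (or wherever dispatch lives) are not shown in this ticket.

```typescript
type AgentType = "dev" | "boss" | "reviewer" | "designer" | "design-reviewer";

// Simplified, hypothetical event shapes; real Forgejo payloads carry more fields.
interface IssueLabeled {
  kind: "issue_labeled";
  labels: string[];
}
interface PrOpened {
  kind: "pr_opened";
  author: string;
}
type ForgeEvent = IssueLabeled | PrOpened;

// Checks the design rules first, then falls through to the existing mapping.
function routeEvent(
  event: ForgeEvent,
  fallback: (event: ForgeEvent) => AgentType,
): AgentType {
  if (event.kind === "issue_labeled" && event.labels.some((l) => l === "area:design")) {
    return "designer";
  }
  if (event.kind === "pr_opened" && event.author === "designer") {
    return "design-reviewer";
  }
  return fallback(event);
}
```

Passing the existing mapping in as `fallback` leaves today's routing untouched for tickets without `area:design`, which is what the negative test in this ticket checks.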

Tests

  • Unit: webhook routes area:design issues to designer; routes designer-authored PRs to design-reviewer.
  • Smoke: dispatch a throwaway area:design issue → designer creates a Penpot file + posts handoff comment → design-reviewer posts review comment → designer applies fixes. End-to-end happy path on a disposable ticket.
  • Negative: dispatch a regular area:webhook issue → still goes to dev/boss per existing rules (no regression).

Documentation

  • CLAUDE.md Roles table: add designer and design-reviewer rows.
  • CLAUDE.md: one paragraph on Penpot MCP auth gotcha (OIDC-only → cookie + Authelia-basic).
  • README.md: one-liner on the new agent types + area:design routing.

Out of scope

  • Dashboard CRUD for the new types — #53 (A6) will surface them automatically once they exist in agents.json.
  • Other design tools (Figma, Excalidraw, etc.) — Penpot-only for now.
  • Automatic review→fix→review loops — one review round per dispatch, then operator decides.
  • Light-theme rendering for reviews — reviewer inspects whichever theme is in the file (dark-only until A6 adds a toggle).
  • Upstreaming the Penpot MCP patches to the public penpot-mcp-server repo — we bake into our own image for now; upstreaming is a separate effort.

References

  • Tracking issue: #47.
  • Triggering example: #55 — manual delivery of the /agents mockups. The skills here should reproduce that workflow agentically.
  • Handoff comment shape the design-implement skill should emit: https://forge.jacquin.app/charles/claude-hooks/issues/55#issuecomment-5241
  • Milestone: Agent pool + customization (#16).
  • Memory notes (~/.claude/projects/.../memory/): agents.md (roles), sdk_config_isolation.md (CLAUDE_CONFIG_DIR trap), mcp_merge_bug.md (forgejo-mcp patch pattern), forgejo_gpg_home.md (signing keyring location).
  • Local Penpot MCP build: ~/Workspace/penpot-mcp-server/ (three patches landed during #55, see src/penpot_mcp/{config.py,services/api.py,services/changes.py}).

Dependencies

  • Blocked by: nothing on main. Can run in parallel with A1–A6 on the milestone.
  • Blocks: implicit — dashboard CRUD (#53) shows the new types only once they exist.
  • Branch off: main.

Suggested breakdown (for boss)

Substories roughly sized to one PR each:

  1. Config + forgejo users + tokens + signing — foundation, no behavior change yet.
  2. Upstream / bake the Penpot MCP patches into claude-hooks:dev — includes AUTHELIA_BASIC_AUTH, PENPOT_AUTH_TOKEN_COOKIE, get_file_info RPC fallback.
  3. Label routing — webhook dispatches area:design → designer; create the label; update tests.
  4. Skills design-implement + design-review — core pipeline, enough for the happy path smoke test.
  5. Skills design-breakdown + design-address-review — completes the loop.
  6. Docs pass: CLAUDE.md, README.md.
Reference
charles/claude-hooks#56