designer: pin to claude-opus-4-7[1m] for Penpot-heavy runs #94

Closed
opened 2026-04-19 17:41:35 +00:00 by claude-desktop · 0 comments
Collaborator

User story

As the operator, I want the designer agent configured with the 1M-context variant so that multi-page Penpot mockup tasks don't hit Prompt is too long mid-run.

What happened (concrete)

Designer task d3883982-7b23-4083-ac18-38b42b98e633 on #70 ran uncapped (after the maxTurns: 100 cap was removed today) and died at turn ~N with Prompt is too long. The failure cause is genuine context accumulation:

  • Each create_frame / create_text / get_file_info MCP call + its result stays in conversation history.
  • #70 asks for 6 Penpot pages plus 4 components, each with multiple frames and text shapes.
  • Typical call count: 80–150 MCP exchanges across the full AC.
  • At ~2.5 KB average per exchange (tool args + result JSON), conversation alone is 200–400 KB before model inference — crosses Opus-200k even with the forgejo-mcp trim.

Rolling back maxTurns made it possible to reach the AC; but reaching it needs the larger context window.

Why designer specifically

  • Designer's MCP surface is write-heavy (every shape creation is a apply_changes RPC).
  • Designer returns big JSON structures from get_file_info / list_projects that round-trip through the model.
  • Boss and dev do mostly file-local edits (Edit / Write / Bash) — their MCP usage is sparse by comparison.
  • design-reviewer exports PNGs per frame — also big; worth considering for 1M too once it lands on a real review.

Acceptance criteria

Config

  • config/agents.json: agents.designer.modelclaude-opus-4-7[1m].
  • Leave design-reviewer on claude-sonnet-4-6 for now — review passes are shorter; revisit if one fails.
  • Leave boss on plain claude-opus-4-7 — the forgejo-mcp trim (#81) already proved 200k is enough for boss's workload.

Smoke

  • Re-dispatch #70 after the model bump; confirm all 6 pages + components + handoff comment land without Prompt is too long.

Out of scope

  • Mid-session compaction (e.g. summarising early turns into a single assistant message). Anthropic SDK doesn't expose this directly; could be a future claude-hooks-side wrapper but bigger scope.
  • Trimming Penpot MCP results (e.g. don't return the full file tree from get_file_info — return a summary). Server-side change in the Penpot MCP fork; worth exploring if 1M isn't enough either.
  • Switching designer to Sonnet. Would drop context even further (200k) and lose Opus's design judgment.

References

  • Failure: designer task d3883982-7b23-4083-ac18-38b42b98e633 on #70 (2026-04-19 19:35).
  • Related: maxTurns: 100 cap removed in commit 4facca8 today — unblocked the path to hit this real context limit.
  • Skill: skills/design-implement.md (the prompt that designer runs).

Dependencies

  • Blocked by: nothing.
  • Blocks: #70 completing on the designer agent.
  • Branch off: main.
## User story As the **operator**, I want the `designer` agent configured with the 1M-context variant so that multi-page Penpot mockup tasks don't hit `Prompt is too long` mid-run. ## What happened (concrete) Designer task `d3883982-7b23-4083-ac18-38b42b98e633` on #70 ran uncapped (after the `maxTurns: 100` cap was removed today) and died at turn ~N with `Prompt is too long`. The failure cause is genuine context accumulation: - Each `create_frame` / `create_text` / `get_file_info` MCP call + its result stays in conversation history. - #70 asks for 6 Penpot pages plus 4 components, each with multiple frames and text shapes. - Typical call count: 80–150 MCP exchanges across the full AC. - At ~2.5 KB average per exchange (tool args + result JSON), conversation alone is 200–400 KB before model inference — crosses Opus-200k even with the `forgejo-mcp` trim. Rolling back `maxTurns` made it possible to reach the AC; but reaching it needs the larger context window. ## Why designer specifically - Designer's MCP surface is write-heavy (every shape creation is a `apply_changes` RPC). - Designer returns big JSON structures from `get_file_info` / `list_projects` that round-trip through the model. - Boss and dev do mostly file-local edits (`Edit` / `Write` / `Bash`) — their MCP usage is sparse by comparison. - `design-reviewer` exports PNGs per frame — also big; worth considering for 1M too once it lands on a real review. ## Acceptance criteria ### Config - [ ] `config/agents.json`: `agents.designer.model` → `claude-opus-4-7[1m]`. - [ ] Leave `design-reviewer` on `claude-sonnet-4-6` for now — review passes are shorter; revisit if one fails. - [ ] Leave `boss` on plain `claude-opus-4-7` — the `forgejo-mcp` trim (#81) already proved 200k is enough for boss's workload. ### Smoke - [ ] Re-dispatch #70 after the model bump; confirm all 6 pages + components + handoff comment land without `Prompt is too long`. ## Out of scope - **Mid-session compaction** (e.g. summarising early turns into a single assistant message). Anthropic SDK doesn't expose this directly; could be a future claude-hooks-side wrapper but bigger scope. - **Trimming Penpot MCP results** (e.g. don't return the full file tree from `get_file_info` — return a summary). Server-side change in the Penpot MCP fork; worth exploring if 1M isn't enough either. - **Switching designer to Sonnet.** Would drop context even further (200k) and lose Opus's design judgment. ## References - Failure: designer task `d3883982-7b23-4083-ac18-38b42b98e633` on #70 (2026-04-19 19:35). - Related: `maxTurns: 100` cap removed in commit `4facca8` today — unblocked the path to hit this real context limit. - Skill: `skills/design-implement.md` (the prompt that designer runs). ## Dependencies - **Blocked by:** nothing. - **Blocks:** #70 completing on the designer agent. - **Branch off:** `main`.
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
charles/claude-hooks#94
No description provided.