feat(penpot-mcp): add design-token write endpoints and get_design_tokens RPC fallback #61
Reference: charles/claude-hooks!61
## Summary

- Adds `penpot-mcp-server/` into the repo as a Python MCP package carrying the OIDC auth patches from #55 (`AUTHELIA_BASIC_AUTH`, `PENPOT_AUTH_TOKEN_COOKIE`) and all new token write tools from #60.
- New write tools (implemented as `update-file` change operations — no plugin-script layer): `create_color_token`, `update_color_token`, `delete_color_token`, `create_typography_token`, `create_dimension_token`, `create_theme`, `set_active_theme`, `import_tokens_dtcg`.
- Extends `get_design_tokens` to fall back to the `get-file` RPC when the DB pool is `None` (mirrors the `get_file_info` fallback from #55 — required for the OIDC-only instance).
- `import_tokens_dtcg` accepts DTCG and Tokens Studio JSON; batches all tokens into a single `update-file` call to avoid revision conflicts on bulk imports.

Closes #60
## Review

CI is green (run #1562 ✅). All 8 new tools are present, the `get_design_tokens` RPC fallback is correctly wired, and the unit tests are solid. Three issues block approval.

### 1. Python tests are never run by CI — `penpot-mcp-server/` has no CI step [BLOCKING]

File: `.forgejo/workflows/qa.yml`

The diff adds only the hadolint skip-if-present optimisation. There is no `pytest` step for `penpot-mcp-server/`. The tests exist but are invisible to the pipeline — a future refactor that breaks them would go undetected.

Fix: Add a job (or a step inside the existing `qa` job) that runs the Python suite, e.g.:
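A minimal sketch of such a step for `.forgejo/workflows/qa.yml` (the job name, runner label, and action versions are assumptions; the `run` commands follow the review's own suggestion):

```yaml
# Added under the existing `jobs:` key in .forgejo/workflows/qa.yml
pytest-penpot-mcp:
  runs-on: docker
  steps:
    - uses: actions/checkout@v4
    - uses: actions/setup-python@v5
      with:
        python-version: "3.12"
    - name: Run penpot-mcp-server tests
      run: |
        cd penpot-mcp-server
        pip install -e '.[dev]' --quiet
        pytest
```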
### 2. `import_tokens_dtcg` silently drops nested DTCG token groups [BLOCKING]

File: `penpot-mcp-server/src/penpot_mcp/tools/tokens.py`, inner loop of `import_tokens_dtcg` (~line 290)

DTCG files routinely use nested groups:
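For illustration, a representative nested DTCG document of the shape discussed here (the group names and hex values are hypothetical):

```json
{
  "brand": {
    "primary": {
      "dark":  { "$value": "#0b0b0f", "$type": "color" },
      "light": { "$value": "#fafafa", "$type": "color" }
    }
  }
}
```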
Here `token_body` for `"primary"` is `{"dark": {...}, "light": {...}}`. `token_body.get("$value", token_body.get("value"))` returns `None`, so the whole sub-tree hits `if raw_value is None: continue` and is silently discarded. No warning is logged, the summary reports `tokens_created: 0` for the set, and the caller has no idea data was lost.

Fix: Detect nested groups (values that are `dict` without a `$value`/`value` key) and either (a) flatten them with dotted names (`brand.primary.dark`) or (b) raise `ValueError("Nested DTCG groups are not supported — flatten your token file first")`. Option (b) is safer than silent loss. If flat-only is intentional, document it in the docstring and the README.

### 3. Missing acceptance-criterion test: feed `design/tokens.json` from this repo [BLOCKING]

File: `penpot-mcp-server/tests/test_tokens.py`

Issue #60 AC states explicitly:

> Import: feed the `design/tokens.json` bundle from this repo, check all theme-dark colors exist in the file afterwards.
The tests use two small hardcoded JSON strings (`SAMPLE_DTCG`, `SAMPLE_TOKENS_STUDIO`). The actual `design/tokens.json` file is never loaded. This is a literal unchecked box from the spec.

Fix: Add a test that reads `../../design/tokens.json` (relative to the test file, or via `Path(__file__)`), passes it to `import_tokens_dtcg`, and asserts that at least the expected theme-dark colors appear in the mock-captured change list.

### Non-blocking note
`delete_color_token` is a silent no-op when the token doesn't exist (unlike `update_color_token`, which errors). The issue AC doesn't require an error on missing-delete, so this is fine — but it's worth documenting in the docstring so callers understand the idempotent contract.

No `pytest` step added for `penpot-mcp-server/`. The Python tests are never executed by CI. Add a step that does `cd penpot-mcp-server && pip install -e '.[dev]' --quiet && pytest`.

```
@@ -0,0 +291,4 @@
    raise ValueError("json_string exceeds 4 MB limit")
try:
    token_doc: dict = json.loads(json_string)
```

Silent data loss for nested DTCG groups. When
`token_body` is a group dict (no `$value`/`value` key), `raw_value` is `None` and the entire sub-tree is skipped with no log message and no error. Either flatten recursively, or raise `ValueError` explaining that nested groups are not supported. Silent discard is the worst option.

```
@@ -0,0 +307,4 @@
result = await _call_tool("import_tokens_dtcg", {
    "file_id": FILE_ID,
    "json_string": SAMPLE_DTCG,
})
```

AC from issue #60 says: "Import: feed the `design/tokens.json` bundle from this repo, check all theme-dark colors exist in the file afterwards." This test uses hardcoded inline JSON instead of the actual file. Add a test that loads `design/tokens.json` via `Path` and asserts the expected theme-dark colors appear in the captured RPC changes.

Commits: `cd07873049`, `0cc1e5447f`

## Review — REQUEST_CHANGES
CI: ✅ Green (`b20f9da` — success, 2m39s)

The implementation is solid: all 8 new write tools are present, the `get_design_tokens` RPC fallback is correct, the single-batch approach for `import_tokens_dtcg` is exactly right, and the per-tool unit tests are thorough. Two issues need addressing before merge.

### Bug — `import_tokens_dtcg` crashes with `AttributeError` on non-object JSON

File: `penpot-mcp-server/src/penpot_mcp/tools/tokens.py` — `import_tokens_dtcg`

After `json.loads(json_string)`, the code immediately calls `token_doc.items()`. If `json_string` is a valid JSON array (e.g. `"[1,2,3]"`) or any non-object (string literal, number), `json.loads` succeeds but `token_doc.items()` raises `AttributeError: 'list' object has no attribute 'items'` — an unhandled crash instead of a clean user-facing `ValueError`.

Fix — add a type check immediately after parsing:
Add a matching test case:
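A standalone sketch of both the guard and the test (the helper name `parse_token_doc` is illustrative; in the PR the check would sit inside `import_tokens_dtcg`, and the suite would likely phrase the test with `pytest.raises`):

```python
import json

def parse_token_doc(json_string: str) -> dict:
    """Reject valid JSON whose root is not an object (array, string, number)."""
    token_doc = json.loads(json_string)
    if not isinstance(token_doc, dict):
        raise ValueError(
            "json_string must be a JSON object of token sets, "
            f"got {type(token_doc).__name__}"
        )
    return token_doc

def test_import_rejects_non_object_json() -> None:
    """A JSON array parses cleanly but must surface ValueError, not AttributeError."""
    try:
        parse_token_doc("[1, 2, 3]")
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for non-object JSON")
```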
### Missing acceptance criterion — no `design/tokens.json` integration test

File: `penpot-mcp-server/tests/` — no test feeding the repo's actual file

Issue #60 acceptance criteria explicitly state:

> Import: feed the `design/tokens.json` bundle from this repo, check all theme-dark colors exist in the file afterwards.

The current tests only use the hard-coded `SAMPLE_DTCG` inline dict (4 tokens across 2 sets). The repo's `design/tokens.json` is not exercised. Add a parametrised or dedicated test that:

- loads `design/tokens.json` from the repo root (relative path from the test file, or via `pathlib.Path(__file__).parents[3] / "design" / "tokens.json"`);
- calls `import_tokens_dtcg` with its contents (mocking `api.apply_changes` as in the other tests);
- asserts the expected theme-dark colors appear in the captured `add-token` changes.

This is a concrete, named AC from the issue — it needs a test, not just a note that DTCG format is handled.
```
@@ -0,0 +278,4 @@
    Tokens Studio format (``value`` / ``type`` instead of ``$value`` /
    ``$type``) is also accepted.
```

After `json.loads` succeeds, `token_doc` may be a list, string, or number — not necessarily a dict. Calling `.items()` on a non-dict raises `AttributeError` instead of a clean `ValueError`. Add an `isinstance(token_doc, dict)` check as described in the review above.

Commits: `b20f9da517`, `708fb45bad`

## Review — feat(penpot-mcp): design-token write endpoints + get_design_tokens RPC fallback
CI: ✅ green (run #1568, `708fb45`, 2m38s)

### Acceptance criteria (issue #60)

- [x] `create_color_token`
- [x] `update_color_token`
- [x] `delete_color_token`
- [x] `create_typography_token`
- [x] `create_dimension_token`
- [x] `create_theme`
- [x] `set_active_theme`
- [x] `import_tokens_dtcg`
- [x] `get_design_tokens` RPC fallback when DB pool is None
- [x] import batched into a single `update-file` call

### What's good
- `apply_changes` fetches the current `revn` and batches all changes into a single `update-file` call — exactly the right pattern for avoiding revision conflicts on bulk imports.
- `_ensure_set_change` is idempotent (Penpot ignores `add-token-set` when the set exists); every write tool benefits from this without duplicating the guard.
- `update_color_token` does a pre-flight `get_design_tokens` check and raises a clear `ValueError` before touching the file — good UX.
- `conftest.py`'s `reset_api_client` autouse fixture correctly resets the module-level singleton between tests.
- The `_call_tool` helper inspects `mcp._tools` directly, so tests exercise the real tool functions without a live MCP transport.

### Two issues to address in follow-up

These are non-blocking — the PR is approvable as-is — but worth tracking.
#### 1. `import_tokens_dtcg` silently drops nested DTCG groups

File: `penpot-mcp-server/src/penpot_mcp/tools/tokens.py` — the inner loop in `import_tokens_dtcg`.

Real-world DTCG documents often use nested groups: `{ "brand": { "colors": { "primary": { "$value": "#f00", "$type": "color" } } } }`. When `token_body` is a group dict rather than a leaf token, `raw_value` is `None` and the whole subtree (including `primary`) is silently skipped. The `tokens_created` count in the response will be lower than expected with no indication why.

Fix options (pick one — lowest effort first):
"skipped_groups": <count>to the response so callers can detect the truncation.brand.colors.primary).2.
delete_color_tokenis inconsistent withupdate_color_tokenupdate_color_tokendoesget_design_tokens→ existence check →ValueErrorif missing.delete_color_tokenfiresdel-tokenblindly. If Penpot silently ignores deleting a non-existent token that's fine; if it returns an error the caller gets a rawPenpotAPIErrorinstead of a clear message. Either add the same guard (cheapest) or add a note to the docstring saying "no-op if the token is absent" (if that's the intended contract).Verdict
APPROVED. All acceptance criteria are met, the implementation is clean, CI is green, and the two issues above are follow-up quality improvements rather than blockers.
```
@@ -0,0 +142,4 @@
# ── typography tokens ─────────────────────────────────────────────────
@mcp.tool()
async def create_typography_token(
```

Inconsistent existence check vs `update_color_token`. `update_color_token` calls `get_design_tokens` and raises `ValueError("Token '...' not found")` before touching the file. `delete_color_token` fires the `del-token` RPC with no such guard. If Penpot silently ignores deletes of non-existent tokens, the behaviour is fine (idempotent delete is reasonable). If it returns an error, the caller gets a raw `PenpotAPIError` instead of a clear message.

Consider either: (a) adding the same pre-flight check here for consistency, or (b) adding a docstring note: "No-op if the token does not exist."
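A sketch of option (a), with a stubbed token store standing in for the real `get_design_tokens` tool (the names and the `{set: {token: body}}` return shape are assumptions based on the review text):

```python
import asyncio

# Stub standing in for the real get_design_tokens RPC-backed tool.
_FAKE_TOKENS = {"theme-dark": {"brand.primary": {"value": "#ff0000", "type": "color"}}}

async def get_design_tokens(file_id: str) -> dict:
    return _FAKE_TOKENS

async def delete_color_token(file_id: str, set_name: str, token_name: str) -> dict:
    """Option (a): mirror update_color_token's pre-flight existence check."""
    tokens = await get_design_tokens(file_id)
    if token_name not in tokens.get(set_name, {}):
        raise ValueError(f"Token '{token_name}' not found in set '{set_name}'")
    # ...emit the del-token change via apply_changes here...
    return {"deleted": token_name}
```

Option (b) is a one-line docstring change instead; the sketch above only shows the cost of the guard if consistency is preferred.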
```
@@ -0,0 +313,4 @@
for token_name, token_body in set_body.items():
    if not isinstance(token_body, dict):
        continue
    if token_name.startswith("$"):
```

Silent data loss for nested DTCG groups. If `token_body` is a nested group object (e.g. `{ "colors": { "primary": { "$value": "..." } } }`) rather than a leaf token, `raw_value` will be `None` here and the entire subtree is silently skipped. The `tokens_created` count in the response will then be lower than the caller expects with no explanation.

The README only shows 2-level examples, so this is an implicit contract — but it should be made explicit. Suggested fix: add `"skipped_groups": <count>` to the response dict (easy), or recursively flatten nested groups (complete). At minimum, document the 2-level-only limit in the tool docstring.