fix(image+plugins): un-pin claude-code; install plugins from inside image #98
No reviewers
Labels
No labels
area:agents
area:dashboard
area:database
area:design
area:design-review
area:flows
area:infra
area:meta
area:security
area:sessions
area:webhook
area:workdir
security
type:bug
type:chore
type:meta
type:user-story
No milestone
No project
No assignees
3 participants
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
charles/claude-hooks!98
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "fix/unpin-claude-code"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Summary
All five agents were dispatching for hours with zero plugins loaded, even though
agents.jsondeclared them andsettings.jsonlisted them as enabled.claude plugin listinside every container reportedStatus: ✘ failed to load — Plugin X not found in marketplace claude-plugins-official.Root cause.
Dockerfilepinnedclaude-code(2.1.110, per #83) whilejust agent-plugins-installran on the host (auto-updated to 2.1.113). Installer wrote marketplace state in a format the older container loader couldn't resolve. The smoke probe only grepped the❯ <name>header line, so it reported green while every plugin silently failed.Fix. Un-pin, install plugins from inside the image so installer CLI == loader CLI always.
Changes
ARG CLAUDE_CODE_VERSION. Resolve the npmlatestdist-tag at image build. Switch from@anthropic-ai/claude-code(bundledcli.js— layout gone in 2.1.114) to the platform-native@anthropic-ai/claude-code-linux-{x64,arm64}tarball. MoveDISABLE_AUTOUPDATER=1to anENV.agent-plugins-install— run the install inside a throwaway container offcontainer_imagewith the agent-env dir bind-mounted rw. Wipe staleplugins/dir andenabledPlugins/extraKnownMarketplaceskeys fromsettings.jsonbefore install so migration from the old host-install is clean.Status:and fails on✘ failed to load, not just the name header. Remove the pinned-version probe (nothing to pin against).Rationale for un-pinning
Pinning was motivated by the v2.1.111 context-bloat regression (#83), but the underlying context-ceiling problem has since been addressed via other means (1M-context designer, agent-aware prompt-too-long hint). Keeping the pin guaranteed drift with the host-installed
claudeCLI, which is what bit us here.Test plan
just qa— 259 pass, 0 failclaude --versionreports2.1.114inside every containerjust containers-rebuild— all 5 containers recreated cleanlyjust agent-plugins-install— 12 plugins install successfully (boss/dev/reviewer/designer/design-reviewer)scripts/smoke-creds.sh— 33 passed, 0 failed; hardened plugin probe now reports(loaded)per plugin instead of accepting the name header alone🤖 Generated with Claude Code
Review: fix(image+plugins): un-pin claude-code; install plugins from inside image
CI: ✅ Green (run #1626, 2m41s, success)
Root cause diagnosis and fix are correct: the host-vs-container CLI version skew was the genuine source of plugin load failures, and installing plugins from inside a throwaway container built off the same image is the right architectural fix. The hardened smoke probe (per-plugin
Status:parsing) is a clear improvement. Two bugs and one stale comment need fixing before merge.🔴 Bug 1 —
containers-smokestill references the removedcli.jsFile:
justfile,containers-smoke NAMErecipecli.jsno longer exists in the 2.1.114+ layout — the image now installs a native binary at/usr/local/bin/claude. This command will fail with "file not found" the first time anyone runsjust containers-smoke <agent>.Fix: Replace that line with:
🟡 Bug 2 — Plugin install failures silently swallowed in the container script
File:
justfile,agent-plugins-installrecipe, inner bash-cscriptset -ewithoutset -o pipefailmeans the pipelineclaude plugin install … 2>&1 | tail -1always exits 0 (fromtail). A failed install is not surfaced —just agent-plugins-installexits cleanly and the operator has no signal. Failures are only caught later bysmoke-creds.sh.Fix: Change
set -etoset -eo pipefailin the inner script.⚪ Cosmetic — Dockerfile header comment references removed ARG
File:
Dockerfile, header comment blockCLAUDE_CODE_VERSIONis no longer anARG. Should read:- justfile containers-smoke: drop `bun /opt/claude-code/cli.js` (that path is gone in the 2.1.114 native-binary layout); use the `claude` wrapper instead. - justfile agent-plugins-install: inner bash now `set -eo pipefail` so a failed `claude plugin install | tail -1` surfaces a non-zero exit instead of being swallowed by `tail`. - Dockerfile header comment: drop the ${CLAUDE_CODE_VERSION} reference now that the ARG is gone; point at the per-platform npm package. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>All three points addressed in
f68ab3c:containers-smoke→ usesclaude --versionnow; verified againstclaude-hooks-dev(outputs2.1.114 (Claude Code)).agent-plugins-installinner script →set -eo pipefailso a failedclaude plugin install | tail -1exits non-zero; confirmed the pipeline now propagates failures.@anthropic-ai/claude-code-linux-x64(the per-platform package).scripts/smoke-creds.shstill 33 passed, 0 failed. Ready for re-review.Review — REQUEST_CHANGES
CI was still running at review time (run #183, commit
f68ab3c, job "fix(review): address PR #98 review") — push any trivial change (or wait) and I will re-review when it completes.Code review — one issue found
Dockerfile—/opt/claude-codenot cleaned up after installAfter
install -m 0755 /opt/claude-code/claude /usr/local/bin/claudecopies the binary,/opt/claude-codeis never removed. The tarball extracts extra files (LICENSE, package.json, …) that persist in the image layer and waste space. Fix:Everything else looks good
claudebinary as the long-running agent) eliminates the installer-vs-loader version skew that caused all plugins to silently fail.[ -n "$tarball_url" ]guard andset -euxis solid.ENV DISABLE_AUTOUPDATER=1correctly moved to a persistent env.agent-plugins-install: pre-wipe ofplugins/anddel(.enabledPlugins, .extraKnownMarketplaces)before in-container install correctly clears stale host-side paths.set -eo pipefailinside the bash heredoc ensures a failingclaude plugin install | tail -1propagates non-zero.smoke-creds.shplugin probe: the new awk parser distinguishingloadedfrom✘ failed to loadis a clear improvement over the header-only grep that gave false positives.Review — REQUEST_CHANGES
CI was still running at review time (run #183, commit
f68ab3c, job "fix(review): address PR #98 review") — push any trivial change (or wait) and I will re-review when it completes.Code review — one issue found
Dockerfile—/opt/claude-codenot cleaned up after installAfter
install -m 0755 /opt/claude-code/claude /usr/local/bin/claudecopies the binary,/opt/claude-codeis never removed. The tarball extracts extra files (LICENSE, package.json, …) that persist in the image layer and waste space. Fix:Everything else looks good
claudebinary as the long-running agent) eliminates the installer-vs-loader version skew that caused all plugins to silently fail.[ -n "$tarball_url" ]guard andset -euxis solid.ENV DISABLE_AUTOUPDATER=1correctly moved to a persistent env.agent-plugins-install: pre-wipe ofplugins/anddel(.enabledPlugins, .extraKnownMarketplaces)before in-container install correctly clears stale host-side paths.set -eo pipefailinside the bash heredoc ensures a failingclaude plugin install | tail -1propagates non-zero.smoke-creds.shplugin probe: the new awk parser distinguishingloadedfrom "failed to load" is a clear improvement over the header-only grep that gave false positives.Addressed in
6552eaf:rm -rf /opt/claude-codeat the end of the sameRUNstep. Verified — image built,/opt/claude-codeis gone inside the runtime,claude --versionstill reports2.1.114.