Dockerfile: pin claude-code version explicitly, don't inherit "latest at build time" #83

Closed
opened 2026-04-19 08:43:23 +00:00 by claude-desktop · 0 comments
Collaborator

User story

As the operator, I want the agent container's claude-code
binary pinned to a specific version number in the Dockerfile, so
that an upstream regression never silently lands in our agents the
next time we rebuild the image.

What happened (concrete)

Claude Code v2.1.111 introduced a ~14% context-window bloat
regression (anthropics/claude-code#49593).
Our containers currently ship v2.1.112 because the Dockerfile
does npm install -g @anthropic-ai/claude-code with no version
pin — it picks up whatever's latest at image build time. We
inherited the regression without a decision or a PR.

The host operator's claude is on v2.1.113, so interactive
sessions also carry the bloat. But the cost concentrates on the
SDK caller: claude-hooks agents' first turn already hits the
200k budget thanks to MCP tool schemas (see #81 for details), so
an extra ~14% from the regression tips us over the limit.
Observed on task 695b934e-6db9-4b35-990c-924b8b39e717 on #79:
boss died turn 1 with Prompt is too long.

Why "always latest" is the wrong default here

For interactive operator use, auto-updating is fine — the
operator can roll back, pin, or route around issues. For
webhook-dispatched agents the failure mode is:

  1. Image rebuild silently upgrades claude-code.
  2. Next dispatch hits a regression.
  3. We learn about it when tickets start failing in prod.

A pinned version trades convenience for reproducibility. We
already do this for forgejo-mcp (baked in at a specific upstream
tag plus patches); claude-code should follow the same pattern.

Acceptance criteria

Pin & bump mechanics

  • Dockerfile pins claude-code to a known-good version via
    npm install -g @anthropic-ai/claude-code@X.Y.Z — initial
    pin = the latest version before 2.1.111's bloat, i.e.
    2.1.110.
  • A single ARG CLAUDE_CODE_VERSION=2.1.110 at the top of the
    Dockerfile, referenced in the npm install step — so
    bumping is one line.
  • A short comment above the ARG pointing at this ticket
    explaining why we pin (+ the v2.1.111 regression that
    motivated it).

Visibility

  • scripts/smoke-creds.sh asserts the container's
    claude --version matches the pinned value. Fails loud if
    a local rebuild drifted.
  • Health endpoint (GET /health) includes
    claude_code_version per container (run claude --version
    on startup and cache).

Bump cadence

  • New policy in CLAUDE.md: claude-code version bumps are
    intentional PRs that (a) bump the ARG, (b) rebuild the
    image, (c) run the full smoke suite (scripts/smoke-creds.sh
    + re-dispatch #62 / #77 / #79 as a post-merge gate) before
    merge.

Out of scope

  • Pinning SDK version (@anthropic-ai/claude-agent-sdk) —
    already pinned in package.json. Only the CLI bundled into the
    agent container is the unpinned surface.
  • Auto-bumping via Renovate / Dependabot. Cadence is low enough
    that human-gated bumps are fine.

References

  • anthropics/claude-code#49593
    2.1.111 context bloat regression.
  • #81 — the forgejo-mcp trim ticket that bounds the other
    contributor to the prompt-size problem.
  • Dockerfile — where the pin lands.

Dependencies

  • Blocked by: nothing.
  • Blocks: reproducible agent runs; companion to #81.
  • Branch off: main.
## User story As the **operator**, I want the agent container's `claude-code` binary pinned to a specific version number in the Dockerfile, so that an upstream regression never silently lands in our agents the next time we rebuild the image. ## What happened (concrete) Claude Code `v2.1.111` introduced a ~14% context-window bloat regression ([anthropics/claude-code#49593](https://github.com/anthropics/claude-code/issues/49593)). Our containers currently ship `v2.1.112` because the Dockerfile does `npm install -g @anthropic-ai/claude-code` with no version pin — it picks up whatever's latest at image build time. We inherited the regression without a decision or a PR. The host operator's `claude` is on `v2.1.113`, so interactive sessions also carry the bloat. But the cost concentrates on the SDK caller: `claude-hooks` agents' first turn already hits the 200k budget thanks to MCP tool schemas (see #81 for details), so an extra ~14% from the regression *tips us over* the limit. Observed on task `695b934e-6db9-4b35-990c-924b8b39e717` on #79: boss died turn 1 with `Prompt is too long`. ## Why "always latest" is the wrong default here For interactive operator use, auto-updating is fine — the operator can roll back, pin, or route around issues. For webhook-dispatched agents the failure mode is: 1. Image rebuild silently upgrades `claude-code`. 2. Next dispatch hits a regression. 3. We learn about it when tickets start failing in prod. A pinned version trades convenience for reproducibility. We already do this for `forgejo-mcp` (baked in at a specific upstream tag plus patches); `claude-code` should follow the same pattern. ## Acceptance criteria ### Pin & bump mechanics - [ ] `Dockerfile` pins `claude-code` to a known-good version via `npm install -g @anthropic-ai/claude-code@X.Y.Z` — initial pin = the latest version *before* 2.1.111's bloat, i.e. 2.1.110. - [ ] A single `ARG CLAUDE_CODE_VERSION=2.1.110` at the top of the Dockerfile, referenced in the `npm install` step — so bumping is one line. - [ ] A short comment above the ARG pointing at this ticket explaining why we pin (+ the v2.1.111 regression that motivated it). ### Visibility - [ ] `scripts/smoke-creds.sh` asserts the container's `claude --version` matches the pinned value. Fails loud if a local rebuild drifted. - [ ] Health endpoint (`GET /health`) includes `claude_code_version` per container (run `claude --version` on startup and cache). ### Bump cadence - [ ] New policy in `CLAUDE.md`: `claude-code` version bumps are intentional PRs that (a) bump the `ARG`, (b) rebuild the image, (c) run the full smoke suite (`scripts/smoke-creds.sh` + re-dispatch #62 / #77 / #79 as a post-merge gate) before merge. ## Out of scope - Pinning **SDK** version (`@anthropic-ai/claude-agent-sdk`) — already pinned in `package.json`. Only the CLI bundled into the agent container is the unpinned surface. - Auto-bumping via Renovate / Dependabot. Cadence is low enough that human-gated bumps are fine. ## References - [anthropics/claude-code#49593](https://github.com/anthropics/claude-code/issues/49593) — 2.1.111 context bloat regression. - #81 — the forgejo-mcp trim ticket that bounds the *other* contributor to the prompt-size problem. - `Dockerfile` — where the pin lands. ## Dependencies - **Blocked by:** nothing. - **Blocks:** reproducible agent runs; companion to #81. - **Branch off:** `main`.
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
charles/claude-hooks#83
No description provided.