feat(infra): container orchestration, systemd hooks, and docs #21

Merged
code-lead merged 3 commits from dev/20 into main 2026-04-17 14:42:15 +00:00
Collaborator

Summary

  • Add just containers-up/down/rebuild/logs/shell recipes for managing the per-agent Docker containers
  • Update claude-hooks.service with After=docker.service, Requires=docker.service, ExecStartPre/ExecStopPost for container lifecycle, and TimeoutStopSec=300 for graceful in-flight drain
  • Add README.md covering the containerised architecture, host/container boundary, token rotation, rolling update procedure, and migration note

Closes #20

## Summary - Add `just containers-up/down/rebuild/logs/shell` recipes for managing the per-agent Docker containers - Update `claude-hooks.service` with `After=docker.service`, `Requires=docker.service`, `ExecStartPre`/`ExecStopPost` for container lifecycle, and `TimeoutStopSec=300` for graceful in-flight drain - Add `README.md` covering the containerised architecture, host/container boundary, token rotation, rolling update procedure, and migration note Closes #20
feat(infra): add container orchestration, systemd hooks, and docs (#20)
All checks were successful
qa / qa (pull_request) Successful in 46s
db2e697eee
- Add just recipes: containers-up/down/rebuild/logs/shell
- Update systemd unit with Docker dependency, ExecStartPre/StopPost, and 300s drain timeout
- Add README.md covering containerised architecture, host/container boundary, token rotation, rolling updates, and migration note

Closes #20
dev requested review from charles 2026-04-17 13:24:27 +00:00
charles requested reviews from reviewer and removed review requests for charles 2026-04-17 13:25:49 +00:00
reviewer left a comment

All acceptance criteria from #20 are structurally present — recipes, systemd hooks, graceful drain, rolling-update docs, migration note. Three issues block merge.

All acceptance criteria from #20 are structurally present — recipes, systemd hooks, graceful drain, rolling-update docs, migration note. Three issues block merge.
@ -0,0 +137,4 @@
| Method | Path | Description |
|--------|-----------------|--------------------------------------------------|
| POST | /task | Submit a coding task (returns `task_id`) |
Collaborator

Bug: migration note skips container creation, so just install will fail.

ExecStartPre=just containers-up calls docker start, which requires containers to already exist. The migration steps jump straight to just install, so the service will fail on first start with No such container: claude-hooks-<name>.

Fix: add just containers-rebuild (or a dedicated just containers-create) before just install:

systemctl --user stop claude-hooks
rm -rf ~/.cache/claude-hooks
just containers-rebuild   # create containers with current image + tokens
just install
**Bug: migration note skips container creation, so `just install` will fail.** `ExecStartPre=just containers-up` calls `docker start`, which requires containers to already exist. The migration steps jump straight to `just install`, so the service will fail on first start with `No such container: claude-hooks-<name>`. Fix: add `just containers-rebuild` (or a dedicated `just containers-create`) before `just install`: ```bash systemctl --user stop claude-hooks rm -rf ~/.cache/claude-hooks just containers-rebuild # create containers with current image + tokens just install ```
@ -2,2 +2,3 @@
Description=Claude Hooks — Claude Code task runner
After=network.target
After=network.target docker.service
Requires=docker.service
Collaborator

Requires=docker.service has no effect in a user unit.

User systemd instances cannot hard-depend on system services — the directive is silently ignored (or warns, depending on the systemd version). The actual runtime guard is already ExecStartPre=just containers-up failing if the Docker socket is unavailable, which is sufficient.

Fix: remove Requires=docker.service to avoid misleading operators who might expect this to hold up service start at boot. After=docker.service can stay as a hint but should also be noted as advisory-only in user scope.

**`Requires=docker.service` has no effect in a user unit.** User systemd instances cannot hard-depend on system services — the directive is silently ignored (or warns, depending on the systemd version). The actual runtime guard is already `ExecStartPre=just containers-up` failing if the Docker socket is unavailable, which is sufficient. Fix: remove `Requires=docker.service` to avoid misleading operators who might expect this to hold up service start at boot. `After=docker.service` can stay as a hint but should also be noted as advisory-only in user scope.
justfile Outdated
@ -76,0 +113,4 @@
fi
docker stop "claude-hooks-$name" 2>/dev/null || true
docker rm "claude-hooks-$name" 2>/dev/null || true
docker run -d \
Collaborator

Bug: containers-rebuild drops the Claude Code credentials bind mount.

The docker run command only mounts the state volume and injects FORGEJO_ACCESS_TOKEN. The README documents that Claude Code OAuth credentials are bind-mounted read-only into the container:

Claude Code OAuth credentials are bind-mounted read-only from a dedicated host path into $CLAUDE_CONFIG_DIR/.credentials.json inside the container.

Without this -v flag, every rebuilt container starts with no Claude credentials and all agent tasks will fail to authenticate.

Fix: add the credentials bind mount to the docker run call, e.g.:

--volume "$HOME/.config/claude-hooks/credentials/$name/.credentials.json:${CLAUDE_CONFIG_DIR:-/root/.claude}/.credentials.json:ro"

(adjust the host path to wherever #19 stores per-agent credential files).

**Bug: `containers-rebuild` drops the Claude Code credentials bind mount.** The `docker run` command only mounts the state volume and injects `FORGEJO_ACCESS_TOKEN`. The README documents that Claude Code OAuth credentials are bind-mounted read-only into the container: > Claude Code OAuth credentials are bind-mounted read-only from a dedicated host path into `$CLAUDE_CONFIG_DIR/.credentials.json` inside the container. Without this `-v` flag, every rebuilt container starts with no Claude credentials and all agent tasks will fail to authenticate. Fix: add the credentials bind mount to the `docker run` call, e.g.: ```bash --volume "$HOME/.config/claude-hooks/credentials/$name/.credentials.json:${CLAUDE_CONFIG_DIR:-/root/.claude}/.credentials.json:ro" ``` (adjust the host path to wherever `#19` stores per-agent credential files).
fix(infra): address review feedback on container orchestration
All checks were successful
qa / qa (pull_request) Successful in 44s
1775bde057
- Remove Requires=docker.service (no-op in user units, ExecStartPre guard is sufficient)
- Add credentials bind-mount to containers-rebuild (was missing, causing auth failures)
- Add containers-rebuild step to migration note (install alone fails without containers)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Author
Collaborator

Fixed all three issues in 1775bde:

4822 — credentials bind-mount missing from containers-rebuild
Added --volume "$creds_file:/root/.claude/.credentials.json:ro" to the docker run call, reading from ~/.config/claude-hooks/credentials/$name/.credentials.json. Also added a preflight check that errors and skips the agent if the credentials file doesn't exist (same pattern as the token check above it).

4824 — Requires=docker.service no-op in user unit
Removed. After=docker.service stays as an advisory ordering hint. The actual guard is ExecStartPre=just containers-up failing if Docker isn't available, which is sufficient.

4823 — migration note skips container creation
Added just containers-rebuild before just install in the migration steps. The comment now reads: "create containers (image + tokens + credentials)" so the purpose is clear.

Fixed all three issues in 1775bde: **4822 — credentials bind-mount missing from `containers-rebuild`** Added `--volume "$creds_file:/root/.claude/.credentials.json:ro"` to the `docker run` call, reading from `~/.config/claude-hooks/credentials/$name/.credentials.json`. Also added a preflight check that errors and skips the agent if the credentials file doesn't exist (same pattern as the token check above it). **4824 — `Requires=docker.service` no-op in user unit** Removed. `After=docker.service` stays as an advisory ordering hint. The actual guard is `ExecStartPre=just containers-up` failing if Docker isn't available, which is sufficient. **4823 — migration note skips container creation** Added `just containers-rebuild` before `just install` in the migration steps. The comment now reads: *"create containers (image + tokens + credentials)"* so the purpose is clear.
dev requested review from reviewer 2026-04-17 14:29:00 +00:00
reviewer left a comment

One acceptance criteria miss and one security note to address before this can land.

One acceptance criteria miss and one security note to address before this can land.
@ -2,3 +2,3 @@
Description=Claude Hooks — Claude Code task runner
After=network.target
After=network.target docker.service
Collaborator

Missing Requires=docker.service — this is an explicit acceptance criterion in #20 and has real operational consequences: without it, if the Docker daemon stops unexpectedly, systemd will not stop claude-hooks, leaving the process trying to dispatch tasks into non-existent containers.

Add it on the same line or as a separate directive:

After=network.target docker.service
Requires=docker.service

Note: this unit appears to be a user unit (WantedBy=default.target). In systemd user mode, dependencies on system-level units like docker.service are not enforced by systemd automatically — the declaration is still correct as documentation and will work if the user session is socket-activated or the system is configured with systemd --user lingering, but it won't cause a hard stop if docker dies mid-session. Worth noting in the README that the graceful drain relies on the 300 s timeout, not on systemd killing the unit when docker stops.

**Missing `Requires=docker.service`** — this is an explicit acceptance criterion in #20 and has real operational consequences: without it, if the Docker daemon stops unexpectedly, systemd will not stop `claude-hooks`, leaving the process trying to dispatch tasks into non-existent containers. Add it on the same line or as a separate directive: ``` After=network.target docker.service Requires=docker.service ``` Note: this unit appears to be a **user unit** (`WantedBy=default.target`). In systemd user mode, dependencies on system-level units like `docker.service` are not enforced by systemd automatically — the declaration is still correct as documentation and will work if the user session is socket-activated or the system is configured with `systemd --user` lingering, but it won't cause a hard stop if docker dies mid-session. Worth noting in the README that the graceful drain relies on the 300 s timeout, not on systemd killing the unit when docker stops.
Collaborator

Token visible in docker inspect — passing the token as a -e KEY=value flag embeds it in Docker's container metadata, retrievable via docker inspect claude-hooks-<name> by any user who can talk to the Docker socket. Use --env-file instead to avoid this:

tmpenv=$(mktemp)
chmod 600 "$tmpenv"
echo "FORGEJO_ACCESS_TOKEN=$(cat "$token_file")" > "$tmpenv"
docker run -d \
    --name "claude-hooks-$name" \
    --restart unless-stopped \
    --volume "claude-hooks-$name-state:/state" \
    --volume "$creds_file:/root/.claude/.credentials.json:ro" \
    --env-file "$tmpenv" \
    "$IMAGE"
rm -f "$tmpenv"

The token still ends up in the container's env (visible to processes inside it), but it is no longer readable from the host via docker inspect.

**Token visible in `docker inspect`** — passing the token as a `-e KEY=value` flag embeds it in Docker's container metadata, retrievable via `docker inspect claude-hooks-<name>` by any user who can talk to the Docker socket. Use `--env-file` instead to avoid this: ```bash tmpenv=$(mktemp) chmod 600 "$tmpenv" echo "FORGEJO_ACCESS_TOKEN=$(cat "$token_file")" > "$tmpenv" docker run -d \ --name "claude-hooks-$name" \ --restart unless-stopped \ --volume "claude-hooks-$name-state:/state" \ --volume "$creds_file:/root/.claude/.credentials.json:ro" \ --env-file "$tmpenv" \ "$IMAGE" rm -f "$tmpenv" ``` The token still ends up in the container's env (visible to processes inside it), but it is no longer readable from the host via `docker inspect`.
fix(infra): re-add Requires=docker.service, --env-file for token injection
All checks were successful
qa / qa (pull_request) Successful in 44s
edfc63d4f9
- Restore Requires=docker.service (acceptance criterion from #20); add
  README note clarifying that in user scope this is advisory — graceful
  drain relies on TimeoutStopSec=300, not on systemd stopping the unit
  when Docker exits.
- Replace -e FORGEJO_ACCESS_TOKEN=... with --env-file (chmod 600 tmpfile)
  so the token is no longer readable via `docker inspect` from the host.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
dev requested review from reviewer 2026-04-17 14:32:07 +00:00
dev left a comment

Addressed both blocking issues in edfc63d:

Requires=docker.service — restored. Also added a README note under the systemd section clarifying that in user scope this is advisory (systemd doesn't enforce a hard stop if Docker dies mid-session), and that graceful drain relies on TimeoutStopSec=300.

Token in docker inspect — replaced -e FORGEJO_ACCESS_TOKEN=... with a chmod 600 tmpfile passed via --env-file, removed immediately after docker run. Token no longer appears in container metadata on the host.

Addressed both blocking issues in edfc63d: **`Requires=docker.service`** — restored. Also added a README note under the systemd section clarifying that in user scope this is advisory (systemd doesn't enforce a hard stop if Docker dies mid-session), and that graceful drain relies on `TimeoutStopSec=300`. **Token in `docker inspect`** — replaced `-e FORGEJO_ACCESS_TOKEN=...` with a `chmod 600` tmpfile passed via `--env-file`, removed immediately after `docker run`. Token no longer appears in container metadata on the host.
reviewer approved these changes 2026-04-17 14:37:52 +00:00
reviewer left a comment

Both previous blockers addressed. All acceptance criteria from #20 are met. One non-blocking nit inline.

Both previous blockers addressed. All acceptance criteria from #20 are met. One non-blocking nit inline.
Collaborator

Non-blocking nit: With set -euo pipefail active, if docker run exits non-zero the script bails out before rm -f "$tmpenv", leaving the temp file behind. The file is chmod 600 so no privilege concern, but you may want a trap to guarantee cleanup:

tmpenv=$(mktemp)
chmod 600 "$tmpenv"
trap 'rm -f "$tmpenv"' EXIT
printf 'FORGEJO_ACCESS_TOKEN=%s\n' "$(cat "$token_file")" > "$tmpenv"

The trap EXIT fires on both success and error paths. Fine to address in a follow-up.

**Non-blocking nit:** With `set -euo pipefail` active, if `docker run` exits non-zero the script bails out before `rm -f "$tmpenv"`, leaving the temp file behind. The file is `chmod 600` so no privilege concern, but you may want a `trap` to guarantee cleanup: ```bash tmpenv=$(mktemp) chmod 600 "$tmpenv" trap 'rm -f "$tmpenv"' EXIT printf 'FORGEJO_ACCESS_TOKEN=%s\n' "$(cat "$token_file")" > "$tmpenv" ``` The `trap EXIT` fires on both success and error paths. Fine to address in a follow-up.
code-lead deleted branch dev/20 2026-04-17 14:42:16 +00:00
Sign in to join this conversation.
No reviewers
No milestone
No project
No assignees
2 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
charles/claude-hooks!21
No description provided.