feat(infra): container orchestration, systemd hooks, and docs #21
No reviewers
Labels
No labels
area:agents
area:dashboard
area:database
area:design
area:design-review
area:flows
area:infra
area:meta
area:security
area:sessions
area:webhook
area:workdir
security
type:bug
type:chore
type:meta
type:user-story
No milestone
No project
No assignees
2 participants
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
charles/claude-hooks!21
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "dev/20"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Summary
just containers-up/down/rebuild/logs/shellrecipes for managing the per-agent Docker containersclaude-hooks.servicewithAfter=docker.service,Requires=docker.service,ExecStartPre/ExecStopPostfor container lifecycle, andTimeoutStopSec=300for graceful in-flight drainREADME.mdcovering the containerised architecture, host/container boundary, token rotation, rolling update procedure, and migration noteCloses #20
All acceptance criteria from #20 are structurally present — recipes, systemd hooks, graceful drain, rolling-update docs, migration note. Three issues block merge.
@ -0,0 +137,4 @@| Method | Path | Description ||--------|-----------------|--------------------------------------------------|| POST | /task | Submit a coding task (returns `task_id`) |Bug: migration note skips container creation, so
just installwill fail.ExecStartPre=just containers-upcallsdocker start, which requires containers to already exist. The migration steps jump straight tojust install, so the service will fail on first start withNo such container: claude-hooks-<name>.Fix: add
just containers-rebuild(or a dedicatedjust containers-create) beforejust install:@ -2,2 +2,3 @@Description=Claude Hooks — Claude Code task runnerAfter=network.targetAfter=network.target docker.serviceRequires=docker.serviceRequires=docker.servicehas no effect in a user unit.User systemd instances cannot hard-depend on system services — the directive is silently ignored (or warns, depending on the systemd version). The actual runtime guard is already
ExecStartPre=just containers-upfailing if the Docker socket is unavailable, which is sufficient.Fix: remove
Requires=docker.serviceto avoid misleading operators who might expect this to hold up service start at boot.After=docker.servicecan stay as a hint but should also be noted as advisory-only in user scope.@ -76,0 +113,4 @@fidocker stop "claude-hooks-$name" 2>/dev/null || truedocker rm "claude-hooks-$name" 2>/dev/null || truedocker run -d \Bug:
containers-rebuilddrops the Claude Code credentials bind mount.The
docker runcommand only mounts the state volume and injectsFORGEJO_ACCESS_TOKEN. The README documents that Claude Code OAuth credentials are bind-mounted read-only into the container:Without this
-vflag, every rebuilt container starts with no Claude credentials and all agent tasks will fail to authenticate.Fix: add the credentials bind mount to the
docker runcall, e.g.:(adjust the host path to wherever
#19stores per-agent credential files).Fixed all three issues in
1775bde:4822 — credentials bind-mount missing from
containers-rebuildAdded
--volume "$creds_file:/root/.claude/.credentials.json:ro"to thedocker runcall, reading from~/.config/claude-hooks/credentials/$name/.credentials.json. Also added a preflight check that errors and skips the agent if the credentials file doesn't exist (same pattern as the token check above it).4824 —
Requires=docker.serviceno-op in user unitRemoved.
After=docker.servicestays as an advisory ordering hint. The actual guard isExecStartPre=just containers-upfailing if Docker isn't available, which is sufficient.4823 — migration note skips container creation
Added
just containers-rebuildbeforejust installin the migration steps. The comment now reads: "create containers (image + tokens + credentials)" so the purpose is clear.One acceptance criteria miss and one security note to address before this can land.
@ -2,3 +2,3 @@Description=Claude Hooks — Claude Code task runnerAfter=network.targetAfter=network.target docker.serviceMissing
Requires=docker.service— this is an explicit acceptance criterion in #20 and has real operational consequences: without it, if the Docker daemon stops unexpectedly, systemd will not stopclaude-hooks, leaving the process trying to dispatch tasks into non-existent containers.Add it on the same line or as a separate directive:
Note: this unit appears to be a user unit (
WantedBy=default.target). In systemd user mode, dependencies on system-level units likedocker.serviceare not enforced by systemd automatically — the declaration is still correct as documentation and will work if the user session is socket-activated or the system is configured withsystemd --userlingering, but it won't cause a hard stop if docker dies mid-session. Worth noting in the README that the graceful drain relies on the 300 s timeout, not on systemd killing the unit when docker stops.Token visible in
docker inspect— passing the token as a-e KEY=valueflag embeds it in Docker's container metadata, retrievable viadocker inspect claude-hooks-<name>by any user who can talk to the Docker socket. Use--env-fileinstead to avoid this:The token still ends up in the container's env (visible to processes inside it), but it is no longer readable from the host via
docker inspect.Addressed both blocking issues in
edfc63d:Requires=docker.service— restored. Also added a README note under the systemd section clarifying that in user scope this is advisory (systemd doesn't enforce a hard stop if Docker dies mid-session), and that graceful drain relies onTimeoutStopSec=300.Token in
docker inspect— replaced-e FORGEJO_ACCESS_TOKEN=...with achmod 600tmpfile passed via--env-file, removed immediately afterdocker run. Token no longer appears in container metadata on the host.Both previous blockers addressed. All acceptance criteria from #20 are met. One non-blocking nit inline.
Non-blocking nit: With
set -euo pipefailactive, ifdocker runexits non-zero the script bails out beforerm -f "$tmpenv", leaving the temp file behind. The file ischmod 600so no privilege concern, but you may want atrapto guarantee cleanup:The
trap EXITfires on both success and error paths. Fine to address in a follow-up.