feat(agents): M26-2 reconcile + watchdog lazy lifecycle #617

Merged
code-lead merged 2 commits from dev/589 into main 2026-04-30 22:38:59 +00:00

Summary

  • Lazy containers use docker create --restart no instead of docker run -d --restart unless-stopped; they are created but not started
  • Hot containers continue to use docker run -d --restart unless-stopped (always-on)
  • Lifecycle drift detected via HostConfig.RestartPolicy.Name from docker inspect — mismatch triggers a recreate
  • Watchdog suppresses container_stopped events for lazy instances (stopped is intentional, not a flap)
  • Boot state seeded via getLifecycle(name).markRunning() so lazy containers already running across a service restart are tracked correctly
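For readers skimming the summary, the hot/lazy split and the drift check can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `Lifecycle`, `buildDockerArgs`, `hasLifecycleDrift`, and `InspectSlice` are hypothetical names, and only the docker flags and the `HostConfig.RestartPolicy.Name` field come from the description above.

```typescript
// Hypothetical names throughout; only the docker flags and the inspected
// field are taken from the PR description.
type Lifecycle = "hot" | "lazy";

// Restart policy each lifecycle is expected to carry.
const EXPECTED_RESTART_POLICY: Record<Lifecycle, string> = {
  hot: "unless-stopped",
  lazy: "no",
};

// Build the docker argv that brings a container into existence.
// Lazy: `docker create` (created, not started). Hot: `docker run -d` (always-on).
function buildDockerArgs(lifecycle: Lifecycle, name: string, image: string): string[] {
  const policy = EXPECTED_RESTART_POLICY[lifecycle];
  return lifecycle === "lazy"
    ? ["create", "--restart", policy, "--name", name, image]
    : ["run", "-d", "--restart", policy, "--name", name, image];
}

// The slice of `docker inspect` output the drift check needs.
interface InspectSlice {
  HostConfig: { RestartPolicy: { Name: string } };
}

// Drift means the live restart policy does not match the desired lifecycle;
// per the summary, a mismatch triggers a recreate.
function hasLifecycleDrift(inspect: InspectSlice, desired: Lifecycle): boolean {
  return inspect.HostConfig.RestartPolicy.Name !== EXPECTED_RESTART_POLICY[desired];
}
```

Under these assumptions, a container inspected with restart policy `unless-stopped` while the desired lifecycle is `lazy` reports drift, and one inspected with `no` does not.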

Test plan

  • bun test apps/server/src/infrastructure/container/ — 126 tests pass (0 fail)
  • just qa — typecheck + biome clean
  • Lazy container: docker create call verified via reconcile runner assertions
  • Hot container: docker run -d call verified via reconcile runner assertions
  • Lifecycle drift hot→lazy and lazy→hot both trigger recreate
  • Watchdog: lazy+stopped → no container_stopped event
  • Watchdog: lazy+missing → container_missing + container_recreated events
  • Watchdog: hot+stopped → container_stopped event emitted
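The three watchdog cases in the test plan amount to a small decision table. The sketch below is one way to express it, assuming a pure function over lifecycle and observed state. `watchdogEvents` and the `Observed` type are hypothetical, the hot+missing case is not covered by the test plan and is assumed here to behave like lazy+missing, and emitting `container_recreated` unconditionally simplifies away the actual recreate step.

```typescript
// Hypothetical helper; the event names are the ones listed in the test plan.
type Lifecycle = "hot" | "lazy";
type Observed = "running" | "stopped" | "missing";
type WatchdogEvent = "container_stopped" | "container_missing" | "container_recreated";

function watchdogEvents(lifecycle: Lifecycle, observed: Observed): WatchdogEvent[] {
  if (observed === "missing") {
    // Test plan covers lazy+missing; treating hot+missing the same is an assumption.
    return ["container_missing", "container_recreated"];
  }
  if (observed === "stopped" && lifecycle === "hot") {
    // A stopped hot container is unexpected, so the event is emitted.
    return ["container_stopped"];
  }
  // lazy+stopped is intentional (not a flap); running containers need no event.
  return [];
}
```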

Closes #589

🤖 Generated with Claude Code

dev self-assigned this 2026-04-30 22:26:12 +00:00
feat(agents): M26-2 reconcile + watchdog lazy lifecycle
All checks were successful
qa / dockerfile (pull_request) Successful in 5s
qa / qa (pull_request) Successful in 1m17s
105ead4513
Lazy containers are created (docker create --restart no) but not started;
hot containers use docker run -d --restart unless-stopped. Lifecycle drift
via restart-policy mismatch triggers a recreate. The watchdog suppresses
container_stopped events for lazy instances (stopped is intentional).
Boot state is seeded via markRunning() so lazy containers already running
across a service restart are tracked correctly.

Closes #589

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
dev requested review from reviewer 2026-04-30 22:27:33 +00:00
reviewer left a comment
  • doc-gap container-reconcile.ts:474: the decision matrix row yes + match + no → start (→ "started") is stale — for lazy containers (the default for all rows) this branch now returns "unchanged", not "started". Only hot containers still issue docker start. A reader following the matrix would conclude stopped containers always get started. Suggested fix:
| yes    | yes             | yes + match    | no      | hot→start ("started") / lazy→unchanged |

Also update the prose paragraph that follows (The "present + matches + stopped" branch fires docker start...) to note the hot/lazy split. Everything else — drift detection, watchdog suppression, boot-state seeding, and the full test suite — is correct.
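To make the requested matrix change concrete, the "present + matches + stopped" branch could be sketched like this. It is an illustration only: `reconcileStopped` and its `start` callback are hypothetical stand-ins, not the code at `container-reconcile.ts:474`; only the `"started"` and `"unchanged"` results come from the review comment.

```typescript
// Hypothetical sketch of the "present + matches + stopped" branch only.
type Lifecycle = "hot" | "lazy";
type ReconcileResult = "started" | "unchanged";

function reconcileStopped(lifecycle: Lifecycle, start: () => void): ReconcileResult {
  if (lifecycle === "hot") {
    // Hot containers are always-on, so a stopped one is started again.
    start(); // stands in for issuing `docker start`
    return "started";
  }
  // Lazy containers are stopped on purpose and left alone.
  return "unchanged";
}
```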

docs(container): fix stale decision matrix row for stopped lazy containers
All checks were successful
qa / dockerfile (pull_request) Successful in 4s
qa / qa (pull_request) Successful in 1m31s
5379f9fb75
The `yes + match + no` row incorrectly showed `start (→ "started")` for
all cases; since M26-2 only hot containers trigger `docker start` — lazy
containers return `"unchanged"`. Update the matrix and the explanatory
prose to reflect the hot/lazy split.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Author
Collaborator

Fixed in 5379f9f. Updated the decision matrix row to hot→start ("started") / lazy→unchanged and extended the prose paragraph to explicitly call out that only hot containers trigger docker start — lazy ones are stopped intentionally and reconcile returns "unchanged", leaving them for dispatch.

dev requested review from reviewer 2026-04-30 22:34:07 +00:00
reviewer approved these changes 2026-04-30 22:36:45 +00:00
reviewer left a comment

Decision matrix row and prose both updated correctly: stopped lazy → unchanged, stopped hot → start. CI green.

reviewer approved these changes 2026-04-30 22:37:00 +00:00
reviewer left a comment

Decision matrix row and prose both updated correctly: stopped lazy → unchanged, stopped hot → start. CI green.

code-lead deleted branch dev/589 2026-04-30 22:39:00 +00:00
Reference
charles/claude-hooks!617