Observability: auditd rule to catch external docker stop / docker rm calls on claude-hooks-* containers #149

Closed
opened 2026-04-20 14:18:43 +00:00 by claude-desktop · 0 comments
Collaborator

User story

As the operator, I want an auditd rule that logs the calling PID + command line + cwd every time docker stop or docker rm is invoked against a claude-hooks-* container, so that the next time dev-default silently vanishes (#132) we have the caller identity instead of another round of guessing.

Context

#132's investigation (see comment https://forge.jacquin.app/charles/claude-hooks/issues/132#issuecomment-6640) ruled out every obvious candidate: no OOM, no restart-policy exhaustion, no service-side reconcileOne call, no shell history, no cron. Yet dockerd logs stopping restart-manager 6× today for dev-default exclusively. Something external is calling docker stop/rm and we can't see who.

auditd with a rule on the docker binary's execve would record the PID, parent PID, executable path, cwd, and command line for every invocation. Audit rules cannot match on argv contents, so the filter for stop or rm plus a claude-hooks- name has to be applied when the log is read rather than in the rule itself. That's the one piece of information that would turn this from "mystery" into "we know which process is doing it and can fix or kill it."

Acceptance criteria

Rule file

  • New ops/audit/claude-hooks-docker.rules in the repo — auditd format, one rule per line.
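  • Fires on execve of /usr/bin/docker (or wherever which docker points on the deployment host; detect at install time). auditd rules cannot filter on argv contents, so the rule logs every docker invocation, and the match on stop or rm AND an argv[n] matching claude-hooks-* happens at read time against the EXECVE record. Use a -k claude-hooks-docker audit key so the operator can grep results cleanly.
  • Scope to uid 1000 (the operator) and uid 0 (root), one rule per uid because -F fields within a single rule are ANDed. We want to see both sudo/direct and any system-invoked removals.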
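
A minimal sketch of how the rules file could be generated once the docker path is known. The helper name render_rules is hypothetical, and the choice of -F exe= as the match field is an assumption (a path/perm watch would be another option). Because rule fields cannot match argv strings, the sketch matches the binary only and leaves the stop/rm and claude-hooks- filtering to whoever reads the log.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print one auditd rule per uid for the given docker binary.
# Assumptions: a 64-bit host (arch=b64 only) and the exe= field as the matcher.
render_rules() {
  local docker_path="$1"
  local uid
  # One rule per uid, since -F fields inside a single rule are ANDed together.
  for uid in 1000 0; do
    printf -- '-a always,exit -F arch=b64 -S execve -F exe=%s -F uid=%s -k claude-hooks-docker\n' \
      "$docker_path" "$uid"
  done
}
```

At install time this might be called as render_rules "$(readlink -f "$(command -v docker)")" > ops/audit/claude-hooks-docker.rules, so that a symlinked docker resolves to the real binary before the rule is written.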

Install path

  • just audit-install recipe that:
    1. Copies the rules file into /etc/audit/rules.d/ (requires sudo — recipe runs sudo install, prompts).
    2. Runs sudo augenrules --load to apply.
    3. Prints a one-line confirmation + the ausearch -k claude-hooks-docker -ts today command the operator would run to read the logs.
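  • just audit-tail recipe: follows new claude-hooks-docker events and prints them as human-readable lines. ausearch has no follow mode of its own (--line-buffered only flushes its output per line), so the recipe has to poll ausearch -k claude-hooks-docker or feed it from a followed log. Reduce the output (aureport -k, or parsed ausearch output) to the fields operators actually need: timestamp, PID, exe, argv, cwd.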
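
The two recipes above could be backed by shell functions along these lines. This is a sketch under stated assumptions: the function names audit_install and filter_events are hypothetical, the 0640 file mode is a guess at a sensible default, and the filter assumes ausearch's usual layout of events separated by ---- lines with argv carried in the EXECVE record as a0=, a1=, and so on.

```bash
#!/usr/bin/env bash
# Hypothetical install step: copy the rules file, reload, print the read command.
audit_install() {
  sudo install -m 0640 ops/audit/claude-hooks-docker.rules \
    /etc/audit/rules.d/claude-hooks-docker.rules
  sudo augenrules --load
  echo "Installed. Read events with: ausearch -k claude-hooks-docker -ts today"
}

# Hypothetical read-time filter: keep only events whose EXECVE record has
# "stop" or "rm" as an argument AND an argument starting with "claude-hooks-".
# Reads ausearch output on stdin; prints each matching event in full.
filter_events() {
  awk '
    function emit() {
      if (keep) printf "----\n%s", buf
      buf = ""; keep = 0
    }
    /^----/ { emit(); next }          # event separator: decide on the previous event
    {
      buf = buf $0 "\n"
      if ($0 ~ /type=EXECVE/) {
        verb = 0; target = 0
        for (i = 1; i <= NF; i++) {
          if ($i ~ /^a[0-9]+=/) {     # a0=, a1=, ... hold the argv entries
            val = $i
            sub(/^a[0-9]+=/, "", val)
            gsub(/"/, "", val)        # raw output quotes the values
            if (val == "stop" || val == "rm") verb = 1
            if (val ~ /^claude-hooks-/) target = 1
          }
        }
        if (verb && target) keep = 1
      }
    }
    END { emit() }                    # flush the last event
  '
}
```

A read might then look like ausearch -k claude-hooks-docker -ts today | filter_events. Arguments containing spaces or unusual characters are hex-encoded in raw audit output and would not match here; container names of the claude-hooks-* form should not hit that case.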

Docs

  • README — new "Debugging with auditd" section: how to install the rules, how to read them, how to uninstall.
  • CLAUDE.md Commands section: just audit-install / just audit-tail.

Validation

  • Trigger a docker stop claude-hooks-<test-instance> manually; confirm the entry appears via ausearch -k claude-hooks-docker with the operator's shell PID as the parent (ppid).
  • Trigger via a subprocess (bash -c 'docker stop claude-hooks-…'); confirm the record shows both the subprocess's PID and its parent's PID (ppid).
  • Wait for (or force) another dev-default disappearance; inspect the audit log and identify the caller. Paste the finding into a comment on #132 and close that issue with the root cause, linked fix if applicable.
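
For the validation steps above, the caller identity can be pulled out of the SYSCALL record of each event. The sketch below assumes the standard pid=, ppid=, and exe= fields of that record; the function name caller_summary is hypothetical. Note that exe and argv in the event describe docker itself, so the calling process is identified through ppid.

```bash
#!/usr/bin/env bash
# Hypothetical helper: reduce each SYSCALL record on stdin to pid, ppid and exe.
caller_summary() {
  awk '
    /type=SYSCALL/ {
      pid = ppid = exe = ""
      for (i = 1; i <= NF; i++) {
        if ($i ~ /^pid=/)  { pid  = substr($i, 5) }   # anchored, so ppid= is not matched
        if ($i ~ /^ppid=/) { ppid = substr($i, 6) }
        if ($i ~ /^exe=/)  { exe  = substr($i, 5); gsub(/"/, "", exe) }
      }
      print "pid=" pid, "ppid=" ppid, "exe=" exe
    }
  '
}
```

If the parent is still alive when the log is read, ps -o args= -p <ppid> would show what it is. For a short-lived caller that lookup can come up empty, in which case the cwd recorded in the same event is the remaining clue.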

Out of scope

  • Automated alerting on audit events (operator-driven inspection is enough for now).
  • A dashboard surface for the audit log — terminal ausearch is fine.
  • Rules for operations other than stop/rm (no interest in auditing docker run or exec).

References

  • Investigation on #132: https://forge.jacquin.app/charles/claude-hooks/issues/132#issuecomment-6640 (the "who's calling docker stop?" gap this ticket closes).
  • Watchdog that currently recovers (but can't diagnose) the vanishings: src/container-watchdog.ts (from #134).
  • Host: charles-desktop, Arch Linux, systemd-based, auditd should install via pacman -S audit.

Dependencies

  • Blocked by: nothing.
  • Blocks: closing #132.
  • Branch off: main.