bug: agent-type rename migration (#670) misses secret table + token file paths #741

Closed
opened 2026-05-02 11:45:43 +00:00 by claude-desktop · 2 comments
Collaborator

The agent-type rename migration #670 (boss → code-lead, foreman → architect) migrates DB rows in the agents table but leaves two other artifacts pointing at old names:

  1. secret table rows named FORGEJO_TOKEN_BOSS and FORGEJO_TOKEN_FOREMAN, which are never renamed to FORGEJO_TOKEN_CODE_LEAD and FORGEJO_TOKEN_ARCHITECT. Result: render-to-disk fails at boot with the error: secret "FORGEJO_TOKEN_CODE_LEAD" referenced by ${SECRET:FORGEJO_TOKEN_CODE_LEAD} is not in the secret store. The render queue retries every few seconds and never succeeds. The container env is never materialized, so dispatched tasks silently no-op (there is no error path: the flow's agent.dispatch enqueues onto a worker whose container has no creds, and the container start fails or blocks).
  2. Token files at ~/.config/claude-hooks/tokens/<type>: the migration does not rename tokens/boss → tokens/code-lead or tokens/foreman → tokens/architect. Result: at boot, webhook-config.parseTypeTokenFiles logs "no forgejo token for type code-lead at .../tokens/code-lead" because agents.json already points at the new path but the filesystem still has the old name. Operator must do the mv manually.

Both artifacts carry agent-type names baked into their identifiers and must move together with the DB row migration to be useful.

Reproduction (already observed)

  1. Apply rename PR #670 to a service that previously ran boss / foreman.
  2. Boot service. Observe:
    • [webhook] no forgejo token for type code-lead at /home/charles/.config/claude-hooks/tokens/code-lead
    • [startup] render-to-disk for code-lead-2 failed: secret "FORGEJO_TOKEN_CODE_LEAD" ... is not in the secret store
  3. Assign an issue to code-lead. Webhook fires, flow runs, but no [code-lead-*] enqueued log line appears. Container stays in Created state. Issue sits IDLE indefinitely.

Acceptance criteria

Migration: rename secret rows alongside agent rows

  • The same on-boot migration that does boss → code-lead / foreman → architect for the agents table extends to the secret table.
  • For each rename pair (old, new):
    • If secret has FORGEJO_TOKEN_<OLD_UPPER> and not FORGEJO_TOKEN_<NEW_UPPER>: UPDATE secret SET name='FORGEJO_TOKEN_<NEW_UPPER>' WHERE name='FORGEJO_TOKEN_<OLD_UPPER>'.
    • If both exist: leave the new one, log a warning: [agents] secret rename skipped: FORGEJO_TOKEN_<NEW_UPPER> already present, dropping orphan FORGEJO_TOKEN_<OLD_UPPER>. Drop the orphan to avoid leaving stale rows.
    • If neither exists: no-op.
  • Boot log: [agents] migrated <N> secret row(s) → code-lead / architect (#670).
  • Idempotent: re-running the migration on already-migrated DB is a no-op.
  • Generic enough that a future per-type rename (foo → bar) covers both agents rows and any secret named <UPPER>_TOKEN_FOO / FOO_TOKEN. Consider a regex sweep ^(.*)_(?:TOKEN_)?<OLD_UPPER>$ → re-target with <NEW_UPPER>.
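
The three cases above can be sketched as follows. This is a minimal illustration, not the project's migration code: it assumes the secret store is a SQLite table `secret(name, value)`, and `secret_name` and `migrate_secrets` are hypothetical helper names. Emitting the boot log line only when at least one row was renamed is also an assumption, chosen so that a re-run stays silent.

```python
import sqlite3

# Rename pairs taken from the issue; the table layout is an assumption.
RENAME_PAIRS = [("boss", "code-lead"), ("foreman", "architect")]


def secret_name(agent_type: str) -> str:
    """Map an agent type to its token secret name (code-lead -> FORGEJO_TOKEN_CODE_LEAD)."""
    return "FORGEJO_TOKEN_" + agent_type.upper().replace("-", "_")


def migrate_secrets(conn: sqlite3.Connection, log=print) -> int:
    """Rename old token secrets in place; return the number of rows renamed."""
    renamed = 0
    with conn:  # single transaction: commit on success, roll back on error
        for old, new in RENAME_PAIRS:
            old_name, new_name = secret_name(old), secret_name(new)
            has_old = conn.execute(
                "SELECT 1 FROM secret WHERE name = ?", (old_name,)
            ).fetchone()
            has_new = conn.execute(
                "SELECT 1 FROM secret WHERE name = ?", (new_name,)
            ).fetchone()
            if has_old and not has_new:
                conn.execute(
                    "UPDATE secret SET name = ? WHERE name = ?", (new_name, old_name)
                )
                renamed += 1
            elif has_old and has_new:
                # Keep the new row, warn, and drop the stale one.
                log(
                    f"[agents] secret rename skipped: {new_name} already present, "
                    f"dropping orphan {old_name}"
                )
                conn.execute("DELETE FROM secret WHERE name = ?", (old_name,))
            # Old row absent (with or without the new one): nothing to do.
    if renamed:
        log(f"[agents] migrated {renamed} secret row(s) → code-lead / architect (#670)")
    return renamed
```

Because every branch checks the current table state first, a second call on an already-migrated database finds no old rows and returns 0 without logging.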

Migration: rename token files alongside agent rows

  • Same migration scans ~/.config/claude-hooks/tokens/ (path from env or config) for files named after old types. For each (old, new) pair: if tokens/<old> exists and tokens/<new> does not, rename(2) it. If both exist, leave the new one and log warning. If neither exists, no-op.
  • Optional: also rename the agent-env/<old> → agent-env/<new> directory (today this dir is per-instance not per-type, but check the actual layout). If the migration does not own this rename, the boot log MUST direct the operator to do it manually with the exact mv commands.
  • Boot log: [agents] renamed <N> token file(s) → tokens/code-lead, tokens/architect.
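
A sketch of the token-file half, under stated assumptions: `migrate_token_files` is a hypothetical name, the tokens directory is passed in by the caller rather than resolved from env or config, and the wording of the both-exist warning is invented here because the issue does not specify it. The issue does not say whether the old file should be removed when both exist, so this sketch leaves it in place.

```python
import os
from pathlib import Path

# Rename pairs taken from the issue.
RENAME_PAIRS = [("boss", "code-lead"), ("foreman", "architect")]


def migrate_token_files(tokens_dir: Path, log=print) -> int:
    """Rename tokens/<old> to tokens/<new>; return the number of files renamed."""
    renamed = 0
    for old, new in RENAME_PAIRS:
        old_path, new_path = tokens_dir / old, tokens_dir / new
        if not old_path.exists():
            continue  # nothing to migrate for this pair
        if new_path.exists():
            # Both exist: keep the new file and only warn (assumed wording).
            log(
                f"[agents] token file rename skipped: {new_path} already present, "
                f"leaving {old_path} in place"
            )
            continue
        os.rename(old_path, new_path)  # rename(2) under the hood on POSIX
        renamed += 1
    if renamed:
        log(f"[agents] renamed {renamed} token file(s) → tokens/code-lead, tokens/architect")
    return renamed
```

A typical call would pass the expanded directory, for example `migrate_token_files(Path("~/.config/claude-hooks/tokens").expanduser())`.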

Operator escape hatch

  • If migration cannot reach the filesystem (read-only, EACCES), log a clear instruction with the exact mv commands the operator must run, instead of silent failure.
  • Same for the secret table — if the secret-table is locked / encrypted with a key the runtime cannot resolve at boot, surface the issue clearly.
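
One way the filesystem escape hatch could look, as a rough sketch: `rename_or_instruct` is a hypothetical helper and the log wording is an assumption. It catches `OSError`, which covers EACCES and read-only filesystems, and prints a shell-quoted `mv` command the operator can paste.

```python
import os
import shlex
from pathlib import Path


def rename_or_instruct(old_path: Path, new_path: Path, log=print) -> bool:
    """Try the rename; on failure, log the exact command for the operator."""
    try:
        os.rename(old_path, new_path)
        return True
    except OSError as exc:  # EACCES, EROFS, ENOENT and similar
        # Quote both paths so the command is safe to paste into a shell.
        cmd = f"mv {shlex.quote(str(old_path))} {shlex.quote(str(new_path))}"
        log(
            f"[agents] could not rename {old_path}: {exc.strerror}; "
            f"run manually: {cmd}"
        )
        return False
```

Returning a boolean lets the caller count successful renames for the boot log while still continuing with the remaining pairs after a failure.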

Tests

  • Migration unit test: seed agents with boss row + secret with FORGEJO_TOKEN_BOSS + create a temp tokens/boss file. Run migration. Assert:
    • agents row renamed to code-lead
    • secret row renamed to FORGEJO_TOKEN_CODE_LEAD
    • tokens/boss gone, tokens/code-lead present with same content
  • Migration test: seed BOTH FORGEJO_TOKEN_BOSS and FORGEJO_TOKEN_CODE_LEAD. Assert: orphan BOSS dropped, warning logged, CODE_LEAD value preserved.
  • Migration test: re-run on already-migrated state. Assert: no-op, no errors, no warnings.
  • Boot test: start service post-rename with mismatched agents.json token paths. Assert: filesystem and DB both align after first boot, no manual intervention required.

Out of scope

  • The PENPOT_TOKEN missing-from-secret-store gap discovered in the same investigation. That is a separate dashboard onboarding bug — designer / design-reviewer agents fail render-to-disk because nobody ever entered the Penpot token. Track separately.
  • Multi-forge token files (tokens/<type>.github, tokens/<type>.gitlab) — out of scope until FM-3 ships.
  • Cleaning up the empty agent-env/.code-lead.empty.bak left behind during the manual mv (operator removes after confirming sound).

References

  • PR / issue #670 — original rename migration that this completes.
  • Memory agent_role_rename_2026_05_02.md — operator-side migration record.
  • Discovered 2026-05-02 — issues #730 / #731 sat IDLE for ~30 min on code-lead because of the missed secret + token-file rename.
Collaborator

🦵 @charles kicked the queue — re-running implement on @dev.

Collaborator

🦵 @charles kicked the queue — re-running implement on @dev.

Reference
charles/claude-hooks#741