Tracking: agent pool + customization

claude-desktop commented

2026-04-18 14:58:51 +00:00

Collaborator

Purpose

Replace the hardcoded boss/dev/reviewer trio with a pool-based architecture so the operator can:

Run multiple agents of the same type in parallel (unblock serial dev/boss/reviewer queues).
Specialize instances by model, prompt appendix, and target-label (e.g. reviewer-security on opus 4.7 for PRs labeled security, reviewer-default on sonnet 4.6 for everything else).
CRUD instances from the dashboard (SQLite-backed) without a service restart.
Auto-generate issues from specs/*.md via a new breakdown skill that can apply the same routing labels the reviewer pool consumes.
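
The label-based specialisation described above could be sketched roughly as follows. This is a minimal illustration, not the shipped scheduler: the `AgentInstance` dataclass and `select_instance` function are hypothetical names, and the policy (any label overlap wins, otherwise fall back to the first instance without `match_labels`) is an assumption.

```python
import json
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class AgentInstance:
    """Hypothetical in-memory view of one pool member."""
    name: str
    type: str
    match_labels: Optional[str] = None  # JSON array stored as text, or None


def select_instance(
    instances: List[AgentInstance], agent_type: str, labels: Iterable[str]
) -> Optional[AgentInstance]:
    """Pick a pool member of `agent_type` for an issue/PR carrying `labels`.

    Assumed policy: the first instance whose match_labels overlaps the labels
    wins; otherwise the first instance with no match_labels is the fallback.
    """
    label_set = set(labels)
    fallback = None
    for inst in instances:
        if inst.type != agent_type:
            continue
        wanted = json.loads(inst.match_labels) if inst.match_labels else []
        if wanted and label_set & set(wanted):
            return inst
        if not wanted and fallback is None:
            fallback = inst
    return fallback


pool = [
    AgentInstance("reviewer-default", "reviewer"),
    AgentInstance("reviewer-security", "reviewer", '["security","audit"]'),
    AgentInstance("dev-default", "dev"),
]
print(select_instance(pool, "reviewer", ["security"]).name)
print(select_instance(pool, "reviewer", ["docs"]).name)
```

With this pool, a PR labeled security routes to reviewer-security and anything else falls back to reviewer-default. A type with no instances returns None, which a caller would need to handle.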

Non-goals for this milestone

Dynamic Forgejo user / token / GPG provisioning. Agents of the same type share one Forgejo account, one token file, and one GPG identity. No new security surface.
Editing skills through the UI. Skill files stay in skills/*.md, reviewable via PR. The UI displays them read-only; per-agent customisation is an append-only prompt_appendix stored in SQLite.
Cross-repo breakdown orchestration. The breakdown skill operates on a single repo at a time.
Auto-classification beyond label matching. The reviewer-routing signal is the issue's Forgejo labels; we don't re-implement a diff classifier.
UI-managed routing rules. Label matching is configured via the CRUD form's match_labels field on the agent instance itself. No separate rule table.

Dependency graph

Layer 0 — foundation
  #A1  SQLite + type-defaults config refactor

Layer 1 — routing
  #A2  Pool scheduler (webhook dispatches by type)    <-- A1
  #A3  Label-aware instance selection                  <-- A1, A2

Layer 2 — per-instance state
  #A4  Prompt appendix per agent                       <-- A1
  #A5  Per-instance container reconciliation           <-- A1

Layer 3 — UX
  #A6  Dashboard agents CRUD                           <-- A1, A2, A4

Layer 4 — capability
  #A7  Breakdown skill — generate issues from specs/   <-- A3

Critical path: A1 → A2 → A3 → A7, with A4 / A5 / A6 branching off A1 and A2.
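
As a cross-check of the graph above, the dependency edges can be fed to a topological sorter to see which stories become unblocked together. This is only a sketch using Python's standard `graphlib`; the `DEPENDS_ON` mapping transcribes the arrows from the graph, and `build_waves` is a hypothetical helper name.

```python
from graphlib import TopologicalSorter

# Story -> stories it depends on, transcribed from the dependency graph.
DEPENDS_ON = {
    "A1": set(),
    "A2": {"A1"},
    "A3": {"A1", "A2"},
    "A4": {"A1"},
    "A5": {"A1"},
    "A6": {"A1", "A2", "A4"},
    "A7": {"A3"},
}


def build_waves(deps):
    """Group stories into waves; each wave only needs the earlier waves done."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = build_waves(DEPENDS_ON)
for number, wave in enumerate(waves):
    print(number, wave)
```

The waves show the earliest point at which each story could start. The execution order in this issue deliberately schedules some stories (A5, A6) later than their earliest possible wave.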

Execution order

A1 — foundation; unblocks everything. Seed SQLite with one default instance per type so first boot is behaviour-identical to today.
A2 + A4 — in parallel, both depend only on A1.
A3 — sits on top of A2; enables label-based reviewer specialisation.
A5 — container lifecycle from SQLite diff; v1 can be just agents-sync manually invoked, service-managed later.
A6 — dashboard CRUD over the foundation + scheduler state.
A7 — stretch story for the milestone close; applies labels consumed by A3.

Data model preview

config/agents.json becomes type defaults (Forgejo user, token path, git identity, GPG key, default model, default container image) — one row per type.

SQLite agents table — per-instance overrides:

name           TEXT PRIMARY KEY     -- e.g. "reviewer-security"
type           TEXT                 -- "boss" | "dev" | "reviewer"
model          TEXT NULL            -- overrides type default
prompt_appendix TEXT NULL           -- concat'd to base skill at dispatch time
match_labels   TEXT NULL            -- JSON array, e.g. '["security","audit"]'
notes          TEXT NULL
created_at     INTEGER
updated_at     INTEGER
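
A minimal sketch of how this table might be created and seeded with one default instance per type, using Python's standard `sqlite3` module. Several details are assumptions rather than part of the plan above: the `NOT NULL` and `CHECK` constraints, Unix-second timestamps, the `<type>-default` naming for seeded rows, and the `init_db` helper itself.

```python
import sqlite3
import time

SCHEMA = """
CREATE TABLE IF NOT EXISTS agents (
    name            TEXT PRIMARY KEY,
    type            TEXT NOT NULL CHECK (type IN ('boss', 'dev', 'reviewer')),
    model           TEXT,      -- NULL means "use the type default"
    prompt_appendix TEXT,
    match_labels    TEXT,      -- JSON array stored as text
    notes           TEXT,
    created_at      INTEGER NOT NULL,
    updated_at      INTEGER NOT NULL
);
"""


def init_db(conn: sqlite3.Connection) -> None:
    """Create the table and seed one default instance per type (idempotent)."""
    conn.execute(SCHEMA)
    now = int(time.time())  # assumption: timestamps are Unix seconds
    for agent_type in ("boss", "dev", "reviewer"):
        conn.execute(
            "INSERT OR IGNORE INTO agents (name, type, created_at, updated_at) "
            "VALUES (?, ?, ?, ?)",
            (f"{agent_type}-default", agent_type, now, now),
        )
    conn.commit()


conn = sqlite3.connect(":memory:")
init_db(conn)
init_db(conn)  # a second call must not duplicate or overwrite the seed rows
rows = conn.execute("SELECT name, type, model FROM agents ORDER BY name").fetchall()
print(rows)
```

`INSERT OR IGNORE` keeps the seeding safe to run on every boot: existing rows, including any the operator has edited, are left untouched.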

Session key becomes <type>:<repo>:<issueOrPr> so any pool member can resume a prior session.
Container naming becomes claude-hooks-<instance-name> (e.g. claude-hooks-dev-default, claude-hooks-dev-frontend).
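
The naming rules and per-instance overrides above can be expressed as a few small helpers. This is an illustrative sketch only: all four function names are hypothetical, and the blank-line separator between the base skill and the appendix is an assumption, since the plan only says the appendix is concatenated at dispatch time.

```python
from typing import Optional, Union


def session_key(agent_type: str, repo: str, issue_or_pr: Union[int, str]) -> str:
    """Key a session by type rather than instance so any pool member can resume it."""
    return f"{agent_type}:{repo}:{issue_or_pr}"


def container_name(instance_name: str) -> str:
    """Derive the container name from the agent instance name."""
    return f"claude-hooks-{instance_name}"


def effective_model(instance_model: Optional[str], type_default: str) -> str:
    """A NULL model on the instance falls back to the type default."""
    return instance_model if instance_model else type_default


def compose_prompt(base_skill: str, prompt_appendix: Optional[str]) -> str:
    """Append the per-instance appendix to the base skill text, if there is one."""
    if not prompt_appendix:
        return base_skill
    # Assumption: one blank line separates the skill from the appendix.
    return base_skill.rstrip("\n") + "\n\n" + prompt_appendix


print(session_key("reviewer", "owner/repo", 42))
print(container_name("dev-frontend"))
```

Because the key omits the instance name, two pool members of the same type working on the same issue or PR would share a session, so a scheduler built this way needs to avoid dispatching both at once.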

Out of scope for this milestone

Per-instance disk-usage breakdown on the dashboard.
Auth / RBAC on the admin CRUD endpoints (inherits whatever the dashboard has today).
Container hot-reload — at worst we recreate a container on CRUD mutation.
Migrating historical tasks/sessions under the new <type>:<repo>:<issueOrPr> key (new key takes effect for new dispatches; old sessions stay keyed on the old scheme and expire via sweeper).

References

Ideated 2026-04-18 after the post-CI routing and auto-rebase work.
Existing hardcoded config: config/agents.json.
Existing skills: skills/{implement,address-review,review,rebase,merge,fix-ci}.md — all stay, used as-is.

Milestone

Agent pool + customization (#16).


claude-desktop referenced this issue

2026-04-18 14:59:24 +00:00

Agents: SQLite store + type-defaults config refactor #48

claude-desktop referenced this issue

2026-04-18 14:59:48 +00:00

Agents: pool scheduler — dispatch by type across multiple instances #49

claude-desktop referenced this issue

2026-04-18 15:00:09 +00:00

Agents: label-aware instance selection (match_labels on dispatch) #50

claude-desktop referenced this issue

2026-04-18 15:00:25 +00:00

Agents: per-agent prompt appendix concatenated to skill at dispatch #51

claude-desktop referenced this issue

2026-04-18 15:00:54 +00:00

Agents: per-instance container reconciliation (create/remove on CRUD) #52

claude-desktop referenced this issue

2026-04-18 15:01:25 +00:00

Dashboard: agents CRUD — list, create, edit, delete instances #53

claude-desktop referenced this issue

2026-04-18 15:02:00 +00:00

Skills: breakdown — generate issues from specs/*.md with routing labels #54

claude-desktop added the area:meta and type:meta labels

2026-04-18 15:02:11 +00:00

claude-desktop added this to the Agent pool + customization milestone

2026-04-18 15:02:20 +00:00

claude-desktop referenced this issue

2026-04-18 15:04:43 +00:00

Design: UX mockups for the /agents dashboard page (Penpot) #55

claude-desktop referenced this issue

2026-04-18 17:17:41 +00:00

Add designer + design-reviewer agent types (UI/UX mockups) #56

claude-desktop referenced this issue

2026-04-18 17:42:10 +00:00

Patch Penpot MCP: add design-token write endpoints (create/update color + typography + dimension tokens, themes) #60

claude-desktop referenced this issue

2026-04-19 09:38:37 +00:00

/cancel endpoint: accept an agent param instead of cancelling the first busy worker #87

claude-desktop referenced this issue

2026-04-20 13:02:26 +00:00

A7 — Breakdown skill: generate issues from specs/*.md #142

claude-desktop commented

2026-04-20 14:24:09 +00:00

Author

Collaborator

Milestone 16 complete — closing tracker

All seven stories landed on 2026-04-20. Summary:

Story	Title	PR
A1	SQLite + type-defaults config refactor	#48
A2	Pool scheduler — webhook dispatches by type	#49
A3	Label-aware instance selection	#50
A4	Per-instance prompt appendix	merged
A5	Per-instance container reconciliation	#52
A6	Dashboard agents CRUD	#53 / #116
A7	Breakdown skill — generate issues from `specs/*.md`	#147

Beyond the planned scope, we also shipped during the same push:

Pool scaling in practice: 2× dev, 2× boss, 2× reviewer pool members exercised against real traffic.
Force-merge terminator on MAX_ROUNDS (#137): the review-loop deadline story.
Force-merge dashboard badge + surfacing (#141/#143).
Session-history persistence fix (#125) + sweeper for old JSONLs (#131).
SSE heartbeat (#128) + dashboard /stats panel (#127/#133).
Container watchdog for silent-disappearance recovery (#132 → #134).
Reviewer pending-CI carve-out (#148) — the fix after watching #147's force-merge fire on a CI-waiting-not-code-issue loop.

The only non-shipped item in the milestone was the upstream Penpot mockup story #55. A6 was implemented ad hoc without it, so I'm closing that one too as retroactively obsolete.

Closing the milestone.


claude-desktop closed this issue

2026-04-20 14:24:20 +00:00