B12 — Worktree hard-release on mismatch (close F1 — silent stall) #428

Closed
opened 2026-04-27 07:25:38 +00:00 by claude-desktop · 0 comments

As an orchestrator,
I want the workdir helper to forcibly release a worktree on branch mismatch and re-acquire,
so that the silent-stall pattern of "release first, then nothing" goes away.

Today the log line "[workdir] worktree at .../dev%2FN is on branch dev/M, expected dev/N — release first" is a notice, not a recovery: the handler does not actually release the mismatched worktree before the calling task moves on. Three back-to-back rebase dispatches all hit this last night, when the previous task left the worktree on a sibling branch.

Acceptance criteria

Hard release

  • When acquireWorktree(target_branch) finds the path checked out to a different branch, run git worktree remove --force <path> then git worktree add <path> <target_branch>.
  • If the directory has uncommitted changes (e.g. previous task crashed mid-edit): stash to worktree-recovery/<sha> branch first, log a warning, then force-remove. The stash branch is kept for 24 h then GC'd.
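
The hard-release steps above can be sketched as a pure planning function that returns the git invocations in order. This is only an illustration under assumptions: the names WorktreeState and planHardRelease are hypothetical, <sha> is taken to be the worktree's HEAD commit, and "stash to a branch" is interpreted as committing the dirty state onto a new worktree-recovery/<sha> branch. The spec does not fix any of these details.

```typescript
// Hypothetical sketch: none of these names come from the workdir helper itself.
type GitArgs = string[];

interface WorktreeState {
  path: string;          // worktree directory
  currentBranch: string; // branch currently checked out at that path
  headSha: string;       // assumed: HEAD commit, used to name the recovery branch
  dirty: boolean;        // true when `git status --porcelain` prints anything
}

// Returns the argument lists to pass to `git`, in execution order.
function planHardRelease(state: WorktreeState, targetBranch: string): GitArgs[] {
  // Already on the expected branch: nothing to release.
  if (state.currentBranch === targetBranch) return [];

  const plan: GitArgs[] = [];
  if (state.dirty) {
    // Assumed recovery mechanism: commit everything onto a recovery branch
    // so the work survives the forced removal of the worktree.
    const recovery = `worktree-recovery/${state.headSha}`;
    plan.push(
      ["-C", state.path, "switch", "-c", recovery],
      ["-C", state.path, "add", "-A"],
      ["-C", state.path, "commit", "--no-verify", "-m", `recovered from ${state.currentBranch}`]
    );
  }
  plan.push(
    ["worktree", "remove", "--force", state.path],
    ["worktree", "add", state.path, targetBranch]
  );
  return plan;
}

// Usage with a made-up path and SHA:
const examplePlan = planHardRelease(
  { path: "/tmp/wt", currentBranch: "dev/M", headSha: "abc1234", dirty: true },
  "dev/N"
);
console.log(examplePlan.map((args) => "git " + args.join(" ")).join("\n"));
```

Keeping the plan separate from execution makes the clean and dirty cases easy to unit-test without touching a real repository. A real implementation would also need to handle the case where the recovery branch name already exists.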

Timeout

  • Worktree-acquire has a 60 s overall timeout; on timeout the task fails with WorktreeAcquireTimeout (caught by the existing retry path).
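
One way to express the 60 s budget is to race the acquire work against a timer. The sketch below assumes WorktreeAcquireTimeout is an Error subclass and uses a hypothetical wrapper named withAcquireTimeout. How the existing retry path recognises the error is not specified in this issue, so matching on the name field is an assumption.

```typescript
// Hypothetical sketch: the shape of WorktreeAcquireTimeout is assumed.
class WorktreeAcquireTimeout extends Error {
  constructor(readonly path: string, readonly timeoutMs: number) {
    super(`worktree acquire for ${path} exceeded ${timeoutMs} ms`);
    this.name = "WorktreeAcquireTimeout";
  }
}

// Rejects with WorktreeAcquireTimeout if `work` has not settled within timeoutMs.
async function withAcquireTimeout<T>(
  path: string,
  work: Promise<T>,
  timeoutMs: number = 60_000
): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const deadline = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => reject(new WorktreeAcquireTimeout(path, timeoutMs)), timeoutMs);
  });
  try {
    return await Promise.race([work, deadline]);
  } finally {
    // Always clear the timer so a fast acquire does not leave it pending.
    clearTimeout(timer);
  }
}
```

Note that racing only stops waiting; it does not cancel a git process that is still running. A fuller version would also need to kill the underlying process when the deadline fires.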

Logging + GC

  • Log line on every release: [workdir] hard-release at <path> from <branch> to <target> (recovered N uncommitted lines).
  • GC sweep on startup + every 6 h: drop worktree-recovery/* branches older than 24 h.
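
The log format and the 24 h retention rule can be sketched as two small pure functions. The names formatHardReleaseLog, expiredRecoveryBranches and RecoveryBranch are hypothetical, and the sketch assumes that a clean release logs "recovered 0 uncommitted lines" and that a branch's age is measured from its tip commit's committer date.

```typescript
// Hypothetical sketch: names and the age definition are assumptions.
interface RecoveryBranch {
  name: string;          // e.g. "worktree-recovery/abc1234"
  committedAtMs: number; // committer date of the branch tip, in epoch milliseconds
}

const RECOVERY_PREFIX = "worktree-recovery/";
const RETENTION_MS = 24 * 60 * 60 * 1000; // keep recovery branches for 24 h
const SWEEP_INTERVAL_MS = 6 * 60 * 60 * 1000; // sweep on startup, then every 6 h

function formatHardReleaseLog(
  path: string,
  fromBranch: string,
  target: string,
  recoveredLines: number
): string {
  return `[workdir] hard-release at ${path} from ${fromBranch} to ${target} (recovered ${recoveredLines} uncommitted lines)`;
}

// Returns the names of recovery branches older than the retention window.
// The input could be built from, for example:
//   git for-each-ref --format='%(refname:short) %(committerdate:unix)' refs/heads/worktree-recovery/
// and each returned name deleted with `git branch -D <name>`.
function expiredRecoveryBranches(branches: RecoveryBranch[], nowMs: number): string[] {
  return branches
    .filter((b) => b.name.indexOf(RECOVERY_PREFIX) === 0)
    .filter((b) => nowMs - b.committedAtMs > RETENTION_MS)
    .map((b) => b.name);
}
```

Passing the current time in as a parameter keeps the age check deterministic, which is what the GC unit test needs.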

Tests

  • Unit test: clean worktree on wrong branch → force-remove + re-add, no warning.
  • Unit test: dirty worktree → stashed to worktree-recovery/<sha>, then force-remove + re-add, warning logged.
  • Unit test: timeout path triggers WorktreeAcquireTimeout.
  • Unit test: GC drops stash branches > 24 h.

Out of scope

  • Reducing worktree contention by smarter scheduling (different feature).
  • Removing per-instance worktrees (M18-7 deferral stands).

References

  • Spec: docs/specs/automation-hardening.md §4 B12.
  • Workdir helper: apps/server/src/infrastructure/workdir/.
  • Night-1 incident: 3 simultaneous rebase tasks at 05:17 all silent-stalled on this.
Reference
charles/claude-hooks#428