Implementation Overlay State Rehome

Use this runbook when owner-specific ARC releases were accidentally managed by the GloriousFlywheel core stack and need to move into an implementation overlay. The six-release state move for the Jess personal boundary was completed on 2026-04-29; keep this page as the audit trail and as the repeatable procedure for similar future repairs.

The goal is not to preserve repo-shaped runner taxonomy. The goal is to avoid surprise infrastructure destruction while the owner boundary is repaired.

Current Known Residue

PR #420 exposed five live ARC Helm releases that still existed in the core arc-runners state. A live read-only audit on 2026-04-28 found all five still deployed. The same audit also found massageithaca-dind as an additional core compatibility bridge. The 2026-04-29 maintenance-window state move treated all six Jess personal-boundary releases as one migration set and moved their OpenTofu state ownership into the Jess implementation overlay.

| Helm release | Runner scale set | Current disposition |
| --- | --- | --- |
| massageithaca-browser | massageithaca | Rehomed to Jess overlay |
| massageithaca-dind | massageithaca-dind | Rehomed to Jess overlay |
| personal-package-nix-a | personal-package-nix-a | Rehomed to Jess overlay |
| personal-package-nix-b | personal-package-nix-b | Rehomed to Jess overlay |
| personal-docker | personal-docker | Rehomed to Jess overlay |
| personal-nix | personal-nix | Rehomed to Jess overlay |

These names are compatibility residue. They must not become product runner classes or onboarding examples.

After the state move, Jess overlay apply reconciled the six adopted releases to github-app-secret-jesssullivan, and GloriousFlywheel core applied the remaining output-only cleanup with 0 added, 0 changed, 0 destroyed. A follow-up core plan reported no changes. That completes the owner-authority repair; it does not retire the compatibility lanes themselves.

On 2026-05-16, live massageithaca-dind runner pressure showed the temporary DinD compatibility bridge still matters for MassageIthaca CI/CD. The live release now runs as a bounded Sting bridge: maxRunners=3, listener and runner payloads select sting, they tolerate dedicated.tinyland.dev/compute-expansion=true:NoSchedule, and runner work plus Docker graph scratch use local-path-sting-fast-ephemeral generic ephemeral PVCs. That placement is compute-expansion relief only. It does not make MassageIthaca a new product-specific runner class, and it does not convert Sting fast-local scratch into durable storage HA.

Do not confuse Helm release names with ARC runner scale-set names. massageithaca-browser is the Helm release and state address key, while the live ARC AutoscalingRunnerSet is still named massageithaca because the chart value runnerScaleSetName preserves that legacy registration identity.
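The distinction can be captured as a small lookup. This is a minimal sketch: the function name `scale_set_for_release` is hypothetical, and the pairs are copied from the residue table above rather than read from the live cluster.

```shell
#!/bin/sh
# Map a Helm release name (also the state address key) to the ARC runner
# scale set name it registers. Pairs come from the residue table in this
# runbook; the function name is illustrative only.
scale_set_for_release() {
  case "$1" in
    massageithaca-browser)
      # The chart value runnerScaleSetName keeps the legacy identity.
      echo "massageithaca" ;;
    massageithaca-dind|personal-package-nix-a|personal-package-nix-b|personal-docker|personal-nix)
      # Release name and scale set name are identical for these five.
      echo "$1" ;;
    *)
      echo "unknown release: $1" >&2
      return 1 ;;
  esac
}

scale_set_for_release massageithaca-browser   # prints "massageithaca"
```

Use the release name when building state addresses and the scale set name when looking for the live AutoscalingRunnerSet.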

Live Classification Audit

Use the read-only audit command before any state move, apply, or teardown:

just arc-runner-residue-audit

The command reads live Helm releases and ARC AutoscalingRunnerSet objects and classifies them as shared capability lanes, implementation-overlay-owned lanes, Jess-rehomed compatibility lanes, standalone compatibility lanes, or provenance gaps. It does not run tofu apply, tofu state mv, helm uninstall, or any other mutating operation.

The audit also tracks these important non-table entries:

| Live release / scale set | Classification | Current disposition |
| --- | --- | --- |
| selected six releases | Jess-rehomed compatibility | State moved to Jesssullivan/jesssullivan-infra; still temporary compatibility debt until callers use normal owner-overlay lanes |
| jesssullivan-* | implementation-overlay owned | Present in Jesssullivan/jesssullivan-infra, not core residue |
| jesssullivan-blog-dind | implementation-overlay owned | Present in Jesssullivan/jesssullivan-infra, not core residue |
| tubebrain-nix | standalone compatibility lane | Manual ARC lane for Jesssullivan/yt-text; rehome into an owner overlay or retire after yt-text consumes normal overlay authority |

The tubebrain-nix lane is intentionally not a new product runner class. The live AutoscalingRunnerSet points at https://github.com/Jesssullivan/yt-text, uses github-personal-secret, runs on sting, and exposes the shared tinyland-nix workflow label for that canary. Keep it as bounded compatibility debt until it is adopted by an implementation overlay or removed.

Decision Gate

Before applying a core-stack cleanup, choose exactly one path:

  1. Rehome: move state ownership into an implementation overlay and keep the live releases only as temporary compatibility debt.
  2. Teardown: explicitly approve destruction, then run the guarded manual apply with allow_destroy=true.
  3. Hold: leave the destructive cleanup/apply held until the owner boundary is ready.

Do not merge a destructive plan under the assumption that the releases are unimportant.
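The gate can be expressed as a guard that an operator script might call before any apply. This is a sketch only: `decision_gate` and its arguments are hypothetical, and the real guarded manual apply is whatever the core stack already provides.

```shell
#!/bin/sh
# usage: decision_gate <rehome|teardown|hold> [allow_destroy]
# Accepts exactly one path and refuses teardown without explicit approval.
decision_gate() {
  case "$1" in
    rehome)
      echo "rehome: move state ownership into an implementation overlay" ;;
    hold)
      echo "hold: leave the destructive cleanup/apply held" ;;
    teardown)
      # Destruction must be approved explicitly; the default is to refuse.
      if [ "${2:-false}" != "true" ]; then
        echo "teardown requires explicit approval (allow_destroy=true)" >&2
        return 1
      fi
      echo "teardown: guarded manual apply with allow_destroy=true" ;;
    *)
      echo "choose exactly one of: rehome, teardown, hold" >&2
      return 2 ;;
  esac
}
```

Calling `decision_gate teardown` without the second argument fails, which mirrors the rule that a destructive plan is never merged by default.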

The selected path was Rehome into Jesssullivan/jesssullivan-infra. That selection is captured in config/arc-runner-residue-rehome.json and can be checked without touching cluster state:

just arc-runner-rehome-manifest-check
just arc-runner-residue-rehome-plan

The manifest intentionally preserves the old Helm release names and live runnerScaleSetName values only so OpenTofu can adopt the existing releases in the Jess overlay. It also changes the destination secret name to github-app-secret-jesssullivan; do not rehome these releases into the overlay while still pointing at the legacy github-personal-secret.

ARC Controller Two-Phase Migration

Existing core-stack state has historically addressed the shared controller as module.arc_controller. PR #420 makes controller ownership optional with deploy_arc_controller, so the core stack first moves that state to module.arc_controller[0].

For existing deployments, apply the core stack once with deploy_arc_controller = true before any overlay or downstream state sets deploy_arc_controller = false. If the controller is disabled before the moved block lands, OpenTofu can fail planning because the moved block target is not declared in that configuration.
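One way to see which phase a state is in is to inspect its address list before changing deploy_arc_controller. The sketch below assumes addresses arrive on stdin, one per line, as `tofu state list` prints them; the function name `controller_migration_phase` is hypothetical.

```shell
#!/bin/sh
# Reads resource addresses on stdin and reports where the shared controller
# currently lives in state.
controller_migration_phase() (
  addresses=$(cat)
  if printf '%s\n' "$addresses" | grep -q '^module\.arc_controller\[0\]\.'; then
    # The moved block has landed; the indexed address is present.
    echo "moved"
  elif printf '%s\n' "$addresses" | grep -q '^module\.arc_controller\.'; then
    # Still the historical address: apply core once with
    # deploy_arc_controller = true before disabling it anywhere.
    echo "legacy"
  else
    echo "absent"
  fi
)

# Example, from an initialized core stack checkout:
#   tofu state list | controller_migration_phase
```

A result of `legacy` means the first phase has not been applied yet for that state.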

Overlay Preparation

Current authorities:

  • tinyland-inc/tinyland-infra for Tinyland Honey owner configuration; PR #2 merged the overlay repair at 4b31f6ddf12033cbe52e2e7192649295cf2bc473.
  • Jesssullivan/jesssullivan-infra for Jess personal-boundary owner configuration; PR #2 merged the initial overlay repair, and PR #4 refreshed the selected core pin at 0c6490048be94806f612f4cbf5903a7a3b44d91a.

Before moving state, confirm each overlay still pins the selected stable GloriousFlywheel core commit or release. As of the 2026-04-28 check-in, the Jess overlay had merged the temporary quarantine config in PR #5 and pinned fcfc388231e5036d91c2361f75e1f748a8c71f54; strict preflight reported 0 blockers and only the expected warning for the private personal-boundary repository registration anchor. The 2026-04-29 state move used that prepared overlay and was followed by the non-destructive overlay apply that reconciled the adopted releases onto the Jess GitHub App secret.

  1. Confirm the owner implementation-overlay repo and default branch.
  2. Copy examples/implementation-overlay/ only if the owner overlay does not already have the repaired repo shape.
  3. Initialize a separate backend key for overlay-owned ARC state.
  4. Add normal shared capability-class values first.
  5. Add temporary quarantine config only for the live residue being preserved. For the selected Jess rehome path, mirror the six entries in config/arc-runner-residue-rehome.json into tofu/stacks/arc-runners/jesssullivan.tfvars before moving state.
  6. Run the read-only enrollment preflight and resolve blockers before planning state movement.

A temporary quarantine may need to keep the old runnerScaleSetName so the existing Helm release can be adopted. That is allowed only inside the private implementation overlay and only until workflows move to shared capability labels or the release is retired.

State Move Shape

For a local two-state migration, use OpenTofu state files as the transfer medium. Keep backups. Hold a maintenance window if other applies can run. OpenTofu v1.8.7 documents state mv as able to move an address to a different state file, but -state and -state-out are local-backend state-file options. Use them only against pulled local state copies, never as a shortcut against initialized remote backend configuration.
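Keeping backups can be as simple as a timestamped copy of each pulled file before any edit. This is a minimal sketch: `backup_state_file` is a hypothetical helper and is in addition to, not a replacement for, the backup files OpenTofu writes during state mv.

```shell
#!/bin/sh
# Copy a pulled state file to a timestamped sibling and print the new path.
backup_state_file() (
  src=$1
  stamp=$(date -u +%Y%m%dT%H%M%SZ)
  dest="${src}.${stamp}.bak"
  # -p preserves mode and timestamps so the copy is easy to audit later.
  cp -p "$src" "$dest" && echo "$dest"
)

# Example, after pulling both states:
#   backup_state_file /tmp/gf-core-arc-runners.tfstate
#   backup_state_file /tmp/gf-overlay-arc-runners.tfstate
```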

Do not commit one-off import blocks to the reusable ARC stack. Import blocks bind the stack to one overlay’s internal runner keys and will break another owner overlay that uses distinct Helm release names with the same shared workflow labels. Treat imports and state moves as operator-run migration steps from an initialized checkout.

Before making any real state edit:

  1. Confirm no one else is applying the core or overlay ARC stacks.
  2. Pull fresh core and overlay state files from initialized checkouts.
  3. Run the pre-move state checker and require summary: 0 blockers.
  4. Run every generated tofu state mv ... -dry-run command.
  5. Repeat the state moves without -dry-run only against local state files.
  6. Keep the mandatory OpenTofu backup files until post-move plans are green.

Pull both states, run the pre-move check, and dry-run each move:

# From the initialized core stack checkout.
tofu state pull > /tmp/gf-core-arc-runners.tfstate

# From the initialized implementation overlay checkout.
tofu state pull > /tmp/gf-overlay-arc-runners.tfstate

# From the GloriousFlywheel checkout. This prints addresses only, not state
# values, and confirms whether the pulled files match the selected manifest.
just arc-runner-residue-state-check \
  /tmp/gf-core-arc-runners.tfstate \
  /tmp/gf-overlay-arc-runners.tfstate

# Example dry run for one resource.
tofu state mv \
  -state=/tmp/gf-core-arc-runners.tfstate \
  -state-out=/tmp/gf-overlay-arc-runners.tfstate \
  -dry-run \
  'module.extra_runners["personal-package-nix-a"].helm_release.arc_runner' \
  'module.extra_runners["personal-package-nix-a"].helm_release.arc_runner'

After every dry run matches the intended source and destination, repeat without -dry-run, then push both resulting state files back to their respective backends from the correct initialized working directories.
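For review, the six dry runs can be printed in one pass. The sketch below assumes every release uses the same module.extra_runners address on both sides, which may not hold for personal-docker and personal-nix; treat the output of `just arc-runner-residue-rehome-plan --format commands` as authoritative. The function name `print_dry_run_moves` is hypothetical.

```shell
#!/bin/sh
CORE_STATE=/tmp/gf-core-arc-runners.tfstate
OVERLAY_STATE=/tmp/gf-overlay-arc-runners.tfstate

# Print one dry-run command per release. Nothing is executed here, so the
# list can be read and compared with the generated plan first.
print_dry_run_moves() {
  for release in massageithaca-browser massageithaca-dind \
                 personal-package-nix-a personal-package-nix-b \
                 personal-docker personal-nix; do
    addr="module.extra_runners[\"${release}\"].helm_release.arc_runner"
    printf "tofu state mv -state=%s -state-out=%s -dry-run '%s' '%s'\n" \
      "$CORE_STATE" "$OVERLAY_STATE" "$addr" "$addr"
  done
}

print_dry_run_moves
```

Printing rather than executing keeps the step read-only until an operator has checked each source and destination.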

After local state-file movement and before pushing, rerun the state checker in post-move mode:

just arc-runner-residue-state-check \
  /tmp/gf-core-arc-runners.tfstate \
  /tmp/gf-overlay-arc-runners.tfstate \
  --phase post-move

When the post-move check passes, push each file from its own initialized checkout:

# From the initialized core stack checkout.
tofu state push /tmp/gf-core-arc-runners.tfstate

# From the initialized implementation overlay checkout.
tofu state push /tmp/gf-overlay-arc-runners.tfstate

Do not use -force unless the operator has checked serial and lineage differences and recorded why the normal guarded push is insufficient.
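Serial and lineage can be compared by hand before a push. The sketch below assumes pretty-printed state JSON as produced by `tofu state pull`, with the top-level serial and lineage fields appearing before any resources; `state_field` and `compare_state_identity` are hypothetical helpers, and the second file is assumed to be a fresh pull of what the backend currently holds.

```shell
#!/bin/sh
# Print the first "<field>": value found in a pretty-printed state file.
state_field() {
  sed -n "s/^ *\"$2\": *\"\{0,1\}\([^\",]*\)\"\{0,1\},\{0,1\} *\$/\1/p" "$1" | head -n 1
}

# usage: compare_state_identity <candidate.tfstate> <backend-copy.tfstate>
# Mirrors the two conditions a normal push guards against: a different
# lineage, or a candidate serial lower than what the backend holds.
compare_state_identity() (
  new_lineage=$(state_field "$1" lineage)
  cur_lineage=$(state_field "$2" lineage)
  new_serial=$(state_field "$1" serial)
  cur_serial=$(state_field "$2" serial)
  if [ -z "$new_lineage" ] || [ -z "$cur_lineage" ] || \
     [ -z "$new_serial" ] || [ -z "$cur_serial" ]; then
    echo "could not read serial or lineage" >&2
    exit 1
  fi
  if [ "$new_lineage" != "$cur_lineage" ]; then
    echo "lineage differs: $new_lineage vs $cur_lineage"
    exit 1
  fi
  if [ "$new_serial" -lt "$cur_serial" ]; then
    echo "candidate serial $new_serial is older than backend serial $cur_serial"
    exit 1
  fi
  echo "ok: lineage matches, serial $new_serial >= $cur_serial"
)
```

If this check fails, record the difference and the reason before considering -force.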

Address Map

If the overlay uses the current extra_runner_sets quarantine shape, the expected state moves are the entries rendered by:

just arc-runner-residue-rehome-plan --format commands

For personal-docker and personal-nix, the core stack contains moved blocks from the old dedicated module addresses to the current extra_runner_sets addresses. Inspect the pulled core state and use the source address that is actually present; do not assume the legacy address still exists.

| Helm release | Core source address candidate | Overlay destination |
| --- | --- | --- |
| massageithaca-browser | module.extra_runners["massageithaca-browser"].helm_release.arc_runner | module.extra_runners["massageithaca-browser"].helm_release.arc_runner |
| massageithaca-dind | module.extra_runners["massageithaca-dind"].helm_release.arc_runner | module.extra_runners["massageithaca-dind"].helm_release.arc_runner |
| personal-package-nix-a | module.extra_runners["personal-package-nix-a"].helm_release.arc_runner | module.extra_runners["personal-package-nix-a"].helm_release.arc_runner |
| personal-package-nix-b | module.extra_runners["personal-package-nix-b"].helm_release.arc_runner | module.extra_runners["personal-package-nix-b"].helm_release.arc_runner |
| personal-docker | module.extra_runners["personal-docker"].helm_release.arc_runner | module.extra_runners["personal-docker"].helm_release.arc_runner |
| personal-nix | module.extra_runners["personal-nix"].helm_release.arc_runner | module.extra_runners["personal-nix"].helm_release.arc_runner |
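To choose the source address that is actually present in the pulled core state, the address list can be checked against both candidates. This is a sketch: `pick_source_address` is a hypothetical helper, and the legacy address is passed in by the operator because this page does not list it.

```shell
#!/bin/sh
# usage: <address list on stdin> | pick_source_address CURRENT [LEGACY]
# Prints whichever candidate exists in the list, preferring CURRENT.
pick_source_address() (
  current=$1
  legacy=${2:-}
  addresses=$(cat)
  # -x matches whole lines and -F treats brackets and quotes literally.
  if printf '%s\n' "$addresses" | grep -qxF "$current"; then
    echo "$current"
  elif [ -n "$legacy" ] && printf '%s\n' "$addresses" | grep -qxF "$legacy"; then
    echo "$legacy"
  else
    echo "no matching source address in state" >&2
    exit 1
  fi
)

# Example against the pulled local copy:
#   tofu state list -state=/tmp/gf-core-arc-runners.tfstate | \
#     pick_source_address 'module.extra_runners["personal-docker"].helm_release.arc_runner'
```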

Verification

After state movement:

  1. Plan the core stack. It must not propose creating, destroying, or changing rehomed releases.
  2. Plan the implementation overlay. It should either be no-op for quarantined releases or show only intentional changes, such as switching adopted releases from github-personal-secret to github-app-secret-jesssullivan.
  3. Confirm workflows still target shared capability labels wherever they can.
  4. Record remaining owner-scope gaps as enrollment debt, not runner taxonomy.
  5. Remove quarantine config once the owner can consume shared labels directly or the release is explicitly retired.
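The first two checks can be scripted with the plan exit code. The sketch below relies on `tofu plan -detailed-exitcode`, which exits 0 for an empty diff, 1 on error, and 2 when changes are proposed; the helper name `interpret_plan_exit` is hypothetical.

```shell
#!/bin/sh
# Translate a `tofu plan -detailed-exitcode` exit status into a verdict.
interpret_plan_exit() {
  case "$1" in
    0) echo "no changes" ;;
    2) echo "changes proposed: review before applying" ;;
    *) echo "plan failed" ;;
  esac
}

# Example, from an initialized checkout:
#   tofu plan -detailed-exitcode
#   interpret_plan_exit $?
```

For the core stack, any proposed change to a rehomed release is a blocker. For the overlay, proposed changes are acceptable only when they match the intentional ones listed above.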
