Implementation Overlays

Implementation overlays are the intended place for owner-specific deployment configuration. They let one GloriousFlywheel substrate serve multiple owners, organizations, and forge installations without turning those owners or repos into runner taxonomy.

Current status: this boundary is the required architecture. The owner overlay repositories exist, their stable core pins are maintained through overlay validation, and strict enrollment preflight is green. The April 29, 2026 Jess personal-boundary ARC state rehome is complete; the remaining debt is retiring or redesigning compatibility lanes such as the package personal lanes tracked by #412, not repairing core-state ownership.

  • tinyland-inc/tinyland-infra is the Tinyland Honey implementation-overlay authority. PR #2, Overhaul Tinyland implementation overlay, merged at 4b31f6ddf12033cbe52e2e7192649295cf2bc473.
  • Jesssullivan/jesssullivan-infra is the Jess personal-boundary implementation overlay. PR #13 pinned its ARC deploy workflow to GloriousFlywheel authority commit defff7fb7d1f3457c5270ce2e57ac6077e797b1c for the rehome closeout.
  • refresh overlay pins whenever the reusable stack contract changes, before live state movement, and before applying adopted compatibility releases.
  • Tinyland and Jess both reach shared workflow-facing labels through their owner-scoped ARC registrations.
  • both overlays use least-privilege core-read/deploy-key wiring and live cache attachment for validation. Strict preflight reports 0 blockers for both overlays; Jess retains one expected warning for its private personal-boundary repository registration anchor.

The known Jess personal-boundary residue has been moved out of core ARC state and adopted by the Jess implementation overlay. That ownership repair does not make the compatibility lanes product taxonomy. Retirement still requires a separate owner-boundary decision, org/enterprise shared-runner scope, or explicit removal after downstream proof.

The active coordination surface is the Implementation Overlay Workstream.

Decision Record

Decision: owner-specific GitHub App bindings, backend coordinates, tfvars, and private topology belong in implementation overlay repos. The reusable GloriousFlywheel repo owns modules, cache/runner contracts, shared capability-class labels, examples, and public/operator docs.

Consequence: a tinyland-inc overlay and a Jesssullivan overlay may both point at the same Honey cluster, Attic cache, Bazel cache, and RustFS state substrate. That shared backend does not make owner identity a runner class.

Consequence: Bzlmod is the distribution and override mechanism for consuming the reusable core from overlays. It is not an auth bypass. GitHub App installation scope, ARC registration, and core-read credentials still have to be solved per owner boundary.

Consequence: personal-account ARC registration may need a private repository URL as control-plane plumbing because GitHub does not provide org-style runner groups for personal accounts. That anchor must not become a workflow-facing label or a reusable product lane.

The core repo owns the product contract:

  • OpenTofu modules and stack shape
  • runner images and shared capability-class labels
  • cache attachment contracts
  • composite actions and examples
  • public and internal docs

Implementation overlays own deployment facts:

  • GitHub App installation IDs and private keys
  • owner or organization-specific githubConfigUrl values
  • tfvars for one operator tenant
  • cache view and namespace choices
  • Attic public key values for Nix substituter reads; this is public material, but it is still part of the deployment contract and must travel with the cache endpoint.
  • backend state configuration
  • private topology, auth, and operator access details
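
The Attic public key item above can be illustrated with a short sketch. It is a hypothetical helper, not part of GloriousFlywheel: it assumes the substituter URL is the server URL followed by the cache name, and it borrows the attic_server, attic_cache, and attic_public_key names from the tfvars on this page. It renders the endpoint and the key together and refuses to emit an endpoint without a key.

```python
def render_nix_cache_conf(attic_server: str, attic_cache: str, attic_public_key: str) -> str:
    """Render nix.conf lines that keep the cache endpoint and its public key together."""
    # The key is public material, but an endpoint without it is an incomplete contract.
    if not attic_public_key or ":" not in attic_public_key:
        raise ValueError("attic_public_key must look like '<cache-name>:<public-key>'")
    # Assumption: the substituter URL is '<server>/<cache>'.
    substituter = f"{attic_server.rstrip('/')}/{attic_cache}"
    return (
        f"extra-substituters = {substituter}\n"
        f"extra-trusted-public-keys = {attic_public_key}\n"
    )
```

A bootstrap step in the overlay could write this output into the runner's Nix configuration; where it is written is a deployment choice that this page does not specify.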

Canonical Split

The overlays can point to the same backend cluster and cache planes when the tenant, namespace, backend state key, and secret boundaries are explicit. They must not invent repo-shaped runner labels to work around GitHub scope limitations.
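
One way to read "explicit" here is as a check that could run before two overlays attach to the same backend. The sketch below is illustrative only: the OverlayBoundary type, its field names, and the sample values are hypothetical, and the rule that state keys and secret names must not be shared between owners is an assumption drawn from the collision concern on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OverlayBoundary:
    owner: str
    tenant: str
    namespace: str
    state_key: str
    secret_name: str


def boundary_problems(overlays):
    """Return human-readable problems; an empty list means the split is explicit."""
    problems = []
    seen = {}  # (field, value) -> first owner that claimed it
    for o in overlays:
        for field in ("tenant", "namespace", "state_key", "secret_name"):
            value = getattr(o, field)
            if not value:
                problems.append(f"{o.owner}: {field} is not set")
                continue
            # Assumption: state keys and secrets must never be shared across owners.
            if field in ("state_key", "secret_name"):
                other = seen.setdefault((field, value), o.owner)
                if other != o.owner:
                    problems.append(f"{o.owner}: {field} {value!r} is also used by {other}")
    return problems
```

Tenants and namespaces are only required to be set, not to be unique, since the page allows overlays to share cluster and cache planes.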

What Belongs In Core

Core GloriousFlywheel can define reusable capability classes:

  • tinyland-nix
  • tinyland-docker
  • tinyland-dind
  • bounded additive classes such as tinyland-nix-heavy, tinyland-nix-kvm, and tinyland-nix-gpu

The label contract is explicit. runnerScaleSetName is registration identity; the workflow-facing runs-on string must also be present in scaleSetLabels. Do not rely on ARC to infer a workflow label from the scale-set name.
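
The label contract can be expressed as a small check. This is a minimal sketch, assuming the scale-set values are available as a plain dictionary keyed by runnerScaleSetName and scaleSetLabels; the function name is hypothetical and is not the project's own guard.

```python
def check_label_contract(values: dict, runs_on: str) -> list[str]:
    """Verify that the workflow-facing label is declared explicitly in scaleSetLabels."""
    errors = []
    name = values.get("runnerScaleSetName", "")
    labels = values.get("scaleSetLabels") or []
    if not name:
        errors.append("runnerScaleSetName is not set")
    # The name is registration identity only; it never stands in for a label.
    if runs_on not in labels:
        errors.append(
            f"runs-on label {runs_on!r} is missing from scaleSetLabels; "
            f"the scale-set name {name!r} does not count"
        )
    return errors
```

A scale set named tinyland-nix-heavy with an empty scaleSetLabels list fails this check even though its name matches the label, which is the case the rule above guards against.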

Core can also define generic variables and modules that an implementation overlay consumes. It should not contain jesssullivan, customer, or repo-specific runner lanes as active product structure.

What Belongs In An Implementation Overlay

An implementation overlay can provide owner-specific values such as:

github_config_url    = "https://github.com/<owner-or-org>"
github_config_secret = "<owner-specific-github-app-secret>"
attic_server         = "http://attic.nix-cache.svc.cluster.local"
attic_cache          = "main"
attic_public_key     = "<cache-name:public-key>"

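extra_runner_sets = {
  tinyland-nix-heavy-compute-expansion = {
    github_config_url     = "https://github.com/<owner-or-org>"
    github_config_secret  = "<owner-specific-github-app-secret>"
    # Workflow-facing capability label; stays capability-shaped.
    runner_label          = "tinyland-nix-heavy"
    # Registration identity; may be owner-distinct to avoid collisions.
    runner_scale_set_name = "tinyland-nix-heavy-compute-expansion"
    runner_labels         = ["self-hosted", "nix", "linux"]
    runner_type           = "nix"
    # Local cap for this overlay's scale set, not a global cap.
    max_runners           = 1
  }
}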

max_runners is a local cap for that overlay’s ARC scale set. It is not a global cap for the shared label across every owner overlay attached to Honey. Global capacity is currently enforced by Kubernetes scheduling; if two owner overlays request tinyland-nix-heavy at once and only one heavy pod fits on a given node, the second job queues.
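
Because each cap is local, the total demand on a shared label is only visible when the overlays are summed. The sketch below is a hypothetical planning aid, not an enforcement mechanism: it assumes each overlay's extra_runner_sets has been loaded into a dictionary, and that the number of schedulable slots per label is supplied by the operator.

```python
from collections import defaultdict


def requested_by_label(overlays: dict) -> dict:
    """Sum max_runners per workflow-facing label across every owner overlay."""
    totals = defaultdict(int)
    for runner_sets in overlays.values():
        for runner_set in runner_sets.values():
            totals[runner_set["runner_label"]] += runner_set["max_runners"]
    return dict(totals)


def oversubscribed(overlays: dict, schedulable_slots: dict) -> dict:
    """Return, per label, how many requested runners exceed the schedulable slots."""
    return {
        label: requested - schedulable_slots.get(label, 0)
        for label, requested in requested_by_label(overlays).items()
        if requested > schedulable_slots.get(label, 0)
    }
```

A non-empty result does not mean anything is broken; it means some jobs on that label will queue when every overlay is busy at once.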

For the source-owned tinyland-inc stack, do not solve that queue by widening the Sting scale set past what the node can schedule. The current managed answer keeps the workflow-facing tinyland-nix-heavy capability label stable, moves that scale set to Honey, and raises it to two 64Gi slots. Sting can re-enter the heavy-Nix placement story after the fast-local Nix scratch/store model is proved for that node. The first source-owned proof target is narrower: tinyland-nix-compute-expansion keeps the shared tinyland-nix label and uses per-pod Sting fast-local PVCs for /nix and /home/runner/_work. It is not a blanket approval for every Nix lane to move to Sting.

The exact owner, secret names, backend, and private endpoints stay in the implementation overlay. The product-level lane name remains capability-shaped. When multiple owner overlays attach to the same physical cluster, internal Helm release names, ARC runnerScaleSetName values, and state keys may be owner-distinct to avoid collisions. That does not change the workflow-facing runner_label.

GitHub Personal Account Boundary

GitHub exposes self-hosted runners at repository, organization, and enterprise scope. It does not provide an organization-style runner group for a personal account.

That means a personal-account implementation overlay may still need a repository URL as an ARC registration anchor. If that happens, treat the repo URL as private control-plane plumbing only.

Rules:

  • do not put personal-account anchors in the core public product stack
  • do not mint labels such as personal-nix, <repo>-nix, or <owner>-docker
  • keep workflows targeting shared capability labels
  • if shared labels cannot be reached truthfully, mark the repo blocked or move it under an organization or enterprise runner scope
  • record the owner-boundary gap as enrollment debt, not as product taxonomy
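
The label rules above amount to an allowlist. The following sketch is not the project's ARC taxonomy guard, whose implementation is not shown on this page; it is a hypothetical stand-in that treats the capability classes listed under What Belongs In Core as the only acceptable workflow-facing labels.

```python
# Assumption: this set mirrors the capability classes named on this page.
SHARED_CAPABILITY_LABELS = frozenset({
    "tinyland-nix",
    "tinyland-docker",
    "tinyland-dind",
    "tinyland-nix-heavy",
    "tinyland-nix-kvm",
    "tinyland-nix-gpu",
})


def rejected_labels(labels):
    """Return workflow-facing labels that are not shared capability classes."""
    return sorted(set(labels) - SHARED_CAPABILITY_LABELS)
```

An owner-shaped or repo-shaped label such as personal-nix comes back as rejected, which maps to the "mark the repo blocked" rule rather than to minting a new lane.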

Package Compatibility Retirement

Jesssullivan/scheduling-kit and Jesssullivan/scheduling-bridge now use the workflow-facing shared tinyland-nix label, and their compatibility releases are owned by the Jess overlay. That is still not retirement.

The lanes can be retired only when one of these exits is selected and proved:

  • move the package repos under an organization or enterprise runner scope that can consume the shared capability lane without repo-level ARC anchors
  • define a broader personal-account enrollment mechanism that is not shaped around one downstream repo per scale set
  • remove the compatibility scale sets after downstream CI is either migrated or explicitly marked blocked

Do not close package-lane retirement merely because state ownership moved. State rehome prevents accidental destruction and fixes authority; retirement removes the compatibility runners from live use.

Bzlmod Role

Bzlmod is useful because it gives an implementation overlay a clean way to consume the core module while keeping deployment-specific files out of the core repo. It does not change GitHub’s runner scope model.

Correct interpretation:

  • Bzlmod can separate reusable module code from implementation configuration
  • an overlay repo can pin or override the GloriousFlywheel module
  • the overlay can carry private tfvars and secrets references
  • GitHub App install scope still has to be solved by the forge adapter and owner boundary

Legacy Overlay Mechanics

The older symlink-merge overlay system is still documented for compatibility. Do not confuse that mechanism with the current architecture boundary.

Current usage:

  • implementation overlay is the ownership boundary for deployment facts
  • legacy symlink-merge overlay is a compatibility mechanism for older consumers
  • neither one justifies repo-specific runner labels

Implementation Checklist

  1. Keep the core repo free of owner-specific active runner config.
  2. Create or designate the owner-owned implementation-overlay authority repo. Current designated repos are tinyland-inc/tinyland-infra and Jesssullivan/jesssullivan-infra; keep them private.
  3. Move GitHub App install media and tfvars into that overlay.
  4. Use shared capability-class labels in workflows.
  5. Pass cache endpoints and credentials through the overlay/bootstrap contract, not by hard-coding endpoints in consumer repos.
  6. Run the ARC taxonomy guard before proposing committed runner changes.
  7. Run the read-only implementation-overlay enrollment preflight before plan or apply.
  8. Check the ARC plan for destroys before applying a core-stack cleanup.
  9. Treat unreachable shared lanes as enrollment debt until proved by a real default-branch workflow.
  10. Use the state rehome runbook before removing live owner/repo-shaped residue from core state.
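
Step 8 can be partly automated. The sketch below assumes the plan has been rendered to JSON with `tofu show -json <planfile>` and relies on the resource_changes entries in that output, each of which carries an address and a list of change actions. The function name is hypothetical.

```python
import json


def planned_destroys(plan_json: str) -> list[str]:
    """List resource addresses that the plan would delete or replace."""
    plan = json.loads(plan_json)
    destroys = []
    for resource_change in plan.get("resource_changes", []):
        actions = resource_change.get("change", {}).get("actions", [])
        # A replacement shows up as a delete paired with a create, so it is counted too.
        if "delete" in actions:
            destroys.append(resource_change["address"])
    return destroys
```

An empty list is a reasonable gate for a core-stack cleanup; any address in the list deserves a manual look before apply, especially if it names live owner-shaped residue covered by step 10.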
