Runner Topology Model
Post-Liqo ARC topology for the GloriousFlywheel runner platform. Supersedes the prior multi-cluster burst model that used Liqo virtual nodes.
Current State: Single-Cluster ARC
All runners deploy to the on-prem honey RKE2 cluster
managed by a single ARC controller in arc-systems.
The current placement contract is split on purpose:
- ARC controller pods, baseline Nix payloads, DinD payloads, and Jess implementation-overlay listeners are pinned to honey
- stateless Docker payloads and their listener, plus bounded heavy Nix compute lanes, are admitted to sting with the compute-expansion toleration
- bumble remains the storage-biased OpenEBS/ZFS node, not the ARC scheduling authority
That split exists because storage capacity and kubelet eviction capacity are
different things. bumble has the durable ZFS-backed storage plane, but its
node root/image filesystem can still hit DiskPressure from RKE2/containerd and
host Nix store churn. Baseline ARC control and shared runner scheduling must
not depend on that storage node being free of rootfs pressure.
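The placement split above can be sketched as a small admission check. This is an illustrative Python model, not the platform's code: the `Node` and `Lane` types, the `admits` and `placement` helpers, and the `compute-expansion` taint key string are assumptions made for the example. The source only states which lanes land on honey and sting, and that sting is protected by a compute-expansion taint.

```python
from dataclasses import dataclass

# Hypothetical taint key; the document only names a "compute-expansion" taint/toleration.
COMPUTE_EXPANSION = "compute-expansion"


@dataclass(frozen=True)
class Node:
    name: str
    taints: frozenset = frozenset()  # taint keys that repel pods without a matching toleration


@dataclass(frozen=True)
class Lane:
    name: str
    node_selector: str               # node the lane is pinned or admitted to
    tolerations: frozenset = frozenset()


NODES = [
    Node("honey"),
    Node("sting", frozenset({COMPUTE_EXPANSION})),
    Node("bumble"),  # storage node; no lane in this sketch selects it
]


def admits(node: Node, lane: Lane) -> bool:
    """A lane lands on a node only if it selects that node and tolerates every taint on it."""
    return lane.node_selector == node.name and node.taints <= lane.tolerations


def placement(lane: Lane) -> list:
    """Names of the nodes that would admit the lane."""
    return [n.name for n in NODES if admits(n, lane)]


baseline_nix = Lane("tinyland-nix", "honey")
docker = Lane("tinyland-docker", "sting", frozenset({COMPUTE_EXPANSION}))
untolerated = Lane("example-lane", "sting")  # selects sting but lacks the toleration
```

In this model `placement(untolerated)` is empty, which mirrors the intent of the contract: a lane reaches sting only through an explicit toleration, and nothing reaches bumble by default.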
honey RKE2 cluster
+-- arc-systems namespace
| +-- ARC Controller (v0.14.0)
| - manages all AutoScalingRunnerSets
| - Prometheus metrics on :8080
| - 5 alert rules (PrometheusRule)
|
+-- arc-runners namespace
| +-- tinyland-nix (ARC ScaleSet label)
| | - Nix builds, Attic cache integration
| | - Runner image ships Determinate Nix
| | - No shared /nix/store PVC requirement for baseline scheduling
| | - Warm pool: CronJob scales up during business hours
| | - CPU: 4 limit, Memory: 8Gi limit
| |
| +-- tinyland-docker (ARC ScaleSet label)
| | - General CI workloads
| | - Runtime placement: sting compute-expansion lane
| | - CPU: 2 limit, Memory: 4Gi limit
| |
| +-- tinyland-dind (ARC ScaleSet label)
| | - Container builds (Docker-in-Docker)
| | - 40-50Gi ephemeral storage
| |
| +-- tinyland-nix-gpu (ARC ScaleSet, extra)
| | - Shared bounded GPU lane on honey
| | - Host /dev/dri pass-through + Vulkan and Dawn/WebGPU userspace canaries
| | - Max 1 runner
| |
| +-- any live repo-shaped residue found by audit
| - migration debt to retire, not committed product taxonomy
|
+-- gitlab-runners namespace
| +-- GitLab HPA runners (docker, dind, nix)
| - HPA-managed deployments
| - Separate from ARC, managed by gitlab-runner Helm chart
|
+-- nix-cache namespace
  +-- Attic binary cache server
  +-- Bazel remote cache
  +-- CNPG PostgreSQL cluster
  +-- RustFS object storage
Runner Classes
Workflow-facing labels are shared capability classes. Implementation object names may differ inside the ARC stack, but those names are not the consumer contract and must not encode project identity.
| Label | Forge | Type | Scale Model | Storage | Use Case |
|---|---|---|---|---|---|
| tinyland-nix | GitHub | ARC ScaleSet | min 0 + scheduled warm pool, core max 16 + sting overflow max 8 | Ephemeral rootfs + Attic/Bazel acceleration | Nix builds, flake checks |
| tinyland-docker | GitHub | ARC ScaleSet | Scale-to-zero, core max 20 | Ephemeral | General CI, tests, linting |
| tinyland-dind | GitHub | ARC ScaleSet | Scale-to-zero, core max 20 + sting overflow max 16 | Honey large ephemeral; sting fast-local PVC scratch | Container image builds |
| tinyland-nix-operator | GitHub | ARC ScaleSet (extra) | Scale-to-zero, max 1 | Ephemeral rootfs + Attic/Bazel acceleration | ARC deploy/operator maintenance |
| tinyland-nix-heavy | GitHub | ARC ScaleSet (extra) | Scale-to-zero | Ephemeral rootfs + Attic/Bazel acceleration | Memory-heavy Nix/Rust builds |
| tinyland-nix-kvm | GitHub | ARC ScaleSet (extra) | Max 1 | Ephemeral + host /dev/kvm | Shared VM execution |
| tinyland-nix-gpu | GitHub | ARC ScaleSet (extra) | Max 1 | Ephemeral + host /dev/dri | Bounded GPU/Vulkan/WebGPU smoke |
| docker | GitLab | HPA Deployment | Min 1 replica | Ephemeral | Compatibility GitLab CI |
| dind | GitLab | HPA Deployment | Min 1 replica | Ephemeral | GitLab container builds |
| nix | GitLab | HPA Deployment | Min 1 replica | Ephemeral | GitLab Nix builds |
Repo-shaped add-ons found in live cluster audits or historical stack values are migration debt. They must either be removed or replaced by shared capability labels with explicit runtime, architecture, privilege, or resource semantics. Owner-specific GitHub App installs belong in implementation overlays, not in this runner topology as product classes.
When multiple implementation overlays attach to one physical backend, ARC
runner caps remain local to each owner-specific scale set. They do not create a
global cap for the shared workflow-facing label. Current Honey behavior relies
on Kubernetes scheduling as the final global backpressure, so two owner
overlays can each request one tinyland-nix-heavy runner and the second pod
will queue if the backend cannot admit another heavy runner.
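The difference between per-scale-set caps and a global cap can be illustrated with a short sketch. The function name, the overlay keys, and the single-slot backend are assumptions for the example; only the behavior (each overlay enforces its own cap, Kubernetes scheduling queues the rest) comes from the text above.

```python
def request_and_schedule(overlay_caps, queued_jobs, backend_slots):
    """Model one shared label served by several owner-specific scale sets.

    overlay_caps:  {overlay_name: maxRunners for that overlay's scale set}
    queued_jobs:   {overlay_name: jobs currently waiting for the shared label}
    backend_slots: heavy runners the physical backend can actually admit
    """
    # Each scale set only knows its own cap, so pods are requested per overlay.
    requested = {
        overlay: min(queued_jobs.get(overlay, 0), cap)
        for overlay, cap in overlay_caps.items()
    }
    total_requested = sum(requested.values())
    # The scheduler, not ARC, provides the final global backpressure.
    running = min(total_requested, backend_slots)
    pending = total_requested - running
    return requested, running, pending


requested, running, pending = request_and_schedule(
    overlay_caps={"tinyland": 1, "jess": 1},
    queued_jobs={"tinyland": 1, "jess": 1},
    backend_slots=1,
)
```

With two overlays each capped at one heavy runner and a backend that can admit one, both pods are requested, one runs, and one stays pending.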
Scale-Up and Scale-Down Lifecycle
ARC Runners (GitHub)
Job queued in GitHub Actions
-> GitHub webhook -> ARC Controller
-> Controller creates runner pod in scale set
-> Pod starts (pulls image, exposes runtime hints)
-> Runner registers with GitHub
-> Job executes
-> Job completes -> pod terminates (ephemeral)
-> Scale set returns to minRunners (default: 0)
Warm pool override (tinyland-nix):
13:00 UTC weekdays (business hours start)
-> CronJob patches AutoScalingRunnerSet: minRunners=1
-> ARC pre-creates 1 idle runner pod
-> Pods already carry the runner image + Nix toolchain
-> Jobs get instant assignment (no registration cold start)
01:00 UTC daily (business hours end)
-> CronJob patches AutoScalingRunnerSet: minRunners=0
-> Idle pods drain and terminate
-> Attic remains the reusable acceleration layer between jobs
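As a rough illustration of the warm-pool window, the sketch below computes which minRunners value the most recent CronJob patch would have left in place at a given time. It assumes the two schedules fire exactly on the hour and that nothing else patches the scale set; the function name and constants are hypothetical.

```python
from datetime import datetime, timedelta, timezone

WARM_UP_HOUR = 13    # 13:00 UTC, weekdays only
WARM_DOWN_HOUR = 1   # 01:00 UTC, every day


def scheduled_min_runners(now: datetime) -> int:
    """Return the minRunners value left by the latest scheduled patch.

    `now` must be timezone-aware; it is normalized to UTC first.
    """
    now = now.astimezone(timezone.utc)
    if now.hour >= WARM_UP_HOUR:
        # Today's scale-up has fired only if today is Monday-Friday.
        return 1 if now.weekday() < 5 else 0
    if now.hour >= WARM_DOWN_HOUR:
        # Today's 01:00 scale-down is the latest event.
        return 0
    # Between 00:00 and 01:00 the pool is still warm if yesterday was a weekday.
    yesterday = now - timedelta(days=1)
    return 1 if yesterday.weekday() < 5 else 0
```

One consequence of the two schedules as written: the pool stays warm across midnight UTC, including the first hour of Saturday after a Friday scale-up.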
GitLab Runners (HPA)
Always-on: minimum 1 replica per runner type
-> HPA monitors CPU/memory utilization
-> Scales up when CPU > target (default 70%)
-> Scale-down stabilization: 300s window
-> Never scales below min replicas
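The HPA behavior above follows the standard Kubernetes scaling rule, desired = ceil(current x currentMetric / targetMetric), clamped to the replica bounds. The sketch below is a simplified model of that rule and of the scale-down stabilization window; it ignores the HPA tolerance band, and `max_replicas=10` is a placeholder because the source does not state a maximum.

```python
import math


def desired_replicas(current, cpu_utilization_pct, target_pct=70.0,
                     min_replicas=1, max_replicas=10):
    """Simplified HPA rule: scale proportionally to metric/target, then clamp."""
    raw = math.ceil(current * cpu_utilization_pct / target_pct)
    return max(min_replicas, min(max_replicas, raw))


def stabilized_scale_down(history, now_s, window_s=300):
    """Pick the highest recommendation seen inside the stabilization window.

    history: list of (timestamp_seconds, recommended_replicas), at least one
    of which must fall inside the window.
    """
    recent = [replicas for ts, replicas in history if now_s - ts <= window_s]
    return max(recent)
```

For example, one replica at 140% CPU against a 70% target yields two replicas, while an idle deployment is held at the minimum of one rather than scaling to zero.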
Cold Start Times
| Runner | Cold Start | Warm Start | Notes |
|---|---|---|---|
| tinyland-docker | ~20-30s | N/A (ephemeral) | Image pull + registration |
| tinyland-dind | ~25-35s | N/A (ephemeral) | Larger image, DinD setup |
| tinyland-nix (cold) | ~90-120s | ~5-10s | Runner image already contains Nix; cache misses dominate |
| tinyland-nix (warm pool) | ~5-10s | ~5-10s | Warm pool avoids registration cold start; Attic still carries reuse |
| GitLab compatibility runner | ~5s | ~5s | Always running (HPA min replicas) |
What Changed from Liqo Era
The prior topology used Liqo to extend capacity across multiple clusters (blahaj, honey) via virtual nodes. This created several problems:
- Scheduling complexity: Liqo virtual nodes required affinity rules and tolerations that differed per cluster, making runner configs non-portable.
- Network partitions: Cross-cluster pod communication was fragile when Liqo peering connections dropped.
- State management: Shared volumes (like /nix/store) couldn’t span Liqo-peered clusters without additional infrastructure.
- Debugging difficulty: Pod failures on virtual nodes were hard to diagnose — logs and events were split across clusters.
Resolution: All runners now deploy to a single cluster. Multi-cluster burst is deferred until a clear need emerges. When it does, the approach will be federated ARC (multiple independent ARC controllers with a shared GitHub App) rather than virtual-node scheduling.
Advanced Runner Classes
| Runner class | Current state | Next proof surface | Anchor |
|---|---|---|---|
| KVM | Shared lane with bounded proof floor | Broader rockies and graphical VM execution if a fresh slice is reopened beyond the current terminal-first floor | #312 |
| GPU / WebGPU / Dawn | Shared tinyland-nix-gpu host-device lane on honey, one bounded Dawn/WebGPU compute-plus-render userspace proof floor, and one downstream default-branch proof on the shared lane | Wider downstream adoption only when an authoritative workload needs it; keep local NVIDIA-fabric ideas as future design context unless a real product requirement revives them | #342, #347, tinyland-inc/lab#163 |
| macOS | Tightened bounded self-hosted Darwin proof floor | Decide whether to promote beyond the current proof floor into a platform-owned shared macOS lane | #320, #335 |
| riscv | Not started | Name the first repo, workload, and operator boundary instead of ambient demand | #333 |
| Cross-forge follow-on | Compatibility-only, not product | Repeatable GitHub-first runtime pattern before any parity claim | #333 |
Capacity Planning
Current cluster: 3 nodes (on-prem RKE2) — honey (control plane), bumble (storage/ZFS), sting (stateless compute).
Current ARC headroom baseline after the bumble rootfs follow-up, listener-placement apply on 2026-04-25, and Docker placement apply on 2026-04-27:
- honey: primary ARC controller, baseline Nix payloads, DinD payloads, honey-backed tinyland-nix-heavy, Jess overlay listeners, and storage-sensitive runner payloads
- bumble: storage-biased OpenEBS/ZFS services plus live owner/repo-shaped ARC residue and older in-flight runner pods; currently observed as schedulable only when explicitly uncordoned/admitted
- sting: bounded compute-expansion surface for stateless tinyland-docker and explicit compute-expansion lanes; not the default baseline Nix, heavy Nix, or DinD runtime surface
The 2026-04-25 audit found bumble DiskPressure=False but still tight on
rootfs headroom after supported CRI and Nix cleanup. The durable fix is
placement and filesystem architecture, not treating raw storage capacity as
kubelet imagefs capacity.
The 2026-04-29 just kubelet-imagefs-capacity-audit --node bumble checkpoint
kept that boundary active: bumble remained Ready=True and
DiskPressure=False, but kubelet rootfs, imagefs, and containerfs each
had only 11.4 GiB available (16.3%) on a 69.9 GiB filesystem. The node can be a
30 TiB-class storage node and still be unsafe as default bursty ARC imagefs
capacity. The May 1 offline fixture guard now keeps healthy, warning, and
critical rootfs/imagefs/containerfs boundaries covered in CI; it is not a live
capacity remediation by itself.
The 2026-05-02 read-only audit still showed bumble below the warning
threshold: 12.0 GiB available (17.1%) for kubelet rootfs, imagefs, and
containerfs, while Ready=True and DiskPressure=False. The operating
decision is therefore a hybrid one: default ARC and GitLab runner placement
continues to avoid bumble, and host-level RKE2/containerd or /nix
reshaping is a later maintenance action. just runner-scale-contract-check
guards the committed ARC and GitLab selectors so bumble cannot silently
return as default runner burst capacity before that remediation is explicit.
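The headroom figures in these audits are plain ratios of available to total kubelet filesystem capacity. A minimal sketch of that classification follows; the 20% warning and 10% critical thresholds are assumptions chosen only to be consistent with the text (which says the observed values sit below the warning threshold), not the values used by the real audit recipe.

```python
def imagefs_headroom(available_gib, capacity_gib,
                     warning_pct=20.0, critical_pct=10.0):
    """Return (percent available rounded to 0.1, status) for a kubelet filesystem.

    Thresholds are hypothetical placeholders for this example.
    """
    pct = 100.0 * available_gib / capacity_gib
    if pct < critical_pct:
        status = "critical"
    elif pct < warning_pct:
        status = "warning"
    else:
        status = "healthy"
    return round(pct, 1), status


# 2026-04-29 checkpoint: 11.4 GiB available on a 69.9 GiB filesystem.
checkpoint = imagefs_headroom(11.4, 69.9)
```

The point of the check is that it looks only at the kubelet-visible filesystem; the size of the ZFS storage plane never enters the calculation.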
The 2026-04-25 overlay validation queue also exposed a separate scheduler
limit: honey can fill its pod slots while bumble is cordoned and sting
is protected by the compute-expansion taint. That is shared pool
capacity/placement debt, not a reason to mint repo-specific runner labels.
The first bounded relief paths admitted tinyland-nix-heavy and stateless
tinyland-docker to sting with the compute-expansion toleration. Baseline
Nix and DinD lanes stay on honey until their runtime and storage envelopes are
proven separately.
The 2026-05-11 outage confirmed the same limit in a more direct way: broad
cluster resources can be available while a honey-pinned lane is still blocked
by the honey node pod-count ceiling. Kubernetes pod capacity, selectors,
taints, tolerations, and per-lane storage envelopes are the real admission
contract. Do not read spare CPU or memory on sting as runner availability
unless that runner class has a reviewed sting placement and scratch-storage
contract. Do not read bumble OpenEBS/ZFS capacity as runner availability; it is
the durable PVC plane.
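To make the admission contract concrete, the sketch below models why spare CPU elsewhere does not help a honey-pinned lane. The `NodeState` type, the check order, and the numbers are illustrative assumptions; 110 is the upstream kubelet default for max pods and may not match the live nodes.

```python
from dataclasses import dataclass


@dataclass
class NodeState:
    name: str
    max_pods: int        # kubelet pod-count ceiling
    running_pods: int
    free_cpu: float      # unallocated CPU cores
    schedulable: bool = True


def can_admit(node: NodeState, pinned_to: str, cpu_request: float):
    """Return (admitted, reason) for a lane pinned to one node by selector."""
    if node.name != pinned_to:
        return False, "selector mismatch"
    if not node.schedulable:
        return False, "cordoned"
    if node.running_pods >= node.max_pods:
        return False, "pod-count ceiling"
    if node.free_cpu < cpu_request:
        return False, "insufficient cpu"
    return True, "ok"


honey = NodeState("honey", max_pods=110, running_pods=110, free_cpu=8.0)
sting = NodeState("sting", max_pods=110, running_pods=20, free_cpu=32.0)
```

Here a honey-pinned runner is refused on honey for pod count and on sting for selector mismatch, even though both nodes report free CPU.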
The 2026-05-12 post-merge burst showed that completed runner utility Jobs can
also consume honey’s finite pod slots long after their useful work is done. The
ARC runner stack now enables the runner-cleanup CronJob in arc-runners so
Succeeded and Failed runner-namespace pods age out through the repo-managed
control plane instead of relying on ad hoc live deletion.
The May 15 managed-apply recurrence exposed a second listener-continuity edge:
ARC can hold listener recreation after a scale-set spec change until existing
runners drain, and freezing only minRunners still allows queued work to refill
the shared lane through an existing listener. The managed ARC apply workflow
now:
- keeps plan/apply/verify off labels it quiesces
- max-freezes the shared Nix/Docker/DinD scale sets before mutation
- records a cap snapshot
- generates and guards a fresh post-quiesce apply plan
- restores caps from source tfvars targets on success before listener proof
- keeps the cap snapshot only as the failure rollback
- keeps best-effort failure restore in the workflow trap
- gives active shared jobs a bounded 20-minute drain window
- treats a missing listener with active runners as a failed post-apply proof
tinyland-nix-operator is the dedicated control-plane lane; the workflow
falls back to tinyland-nix-heavy only until that lane is bootstrapped live.
The 2026-06-09 managed apply exposed two further edges in that lane. First,
the post-apply listener-cap proof step treated mid-recreation listener churn as
drift and went red on a successful apply; that step is now the
settle-aware scripts/arc-prove-listener-caps.sh, which classifies drift as
transient while a set’s AutoscalingListener CR or listener-config secret is
missing or younger than a grace window, hard-fails fast only on a stable drift
signature with a settled listener, and still hard-fails anything unresolved at
its overall deadline. Second, idle leaked EphemeralRunner CRs (zero job
fields with their owning EphemeralRunnerSet at replicas=0, or excess
beyond desired) kept Running pods alive and stalled the freeze drain for its
full 20-minute window; scripts/reap-idle-leaked-ephemeral-runners.sh now
runs between quiesce scoping and the cap freeze, deleting only provably
leaked no-job CRs (graceful CR deletion, just-in-time per-CR job re-check,
warm minRunners pools at current==desired untouchable) and waiting bounded
for the runner sets and listeners to settle before the freeze proceeds.
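The leak criteria described above can be sketched as a pure selection function. This is not the contents of scripts/reap-idle-leaked-ephemeral-runners.sh, which is not shown here; the `Runner` type and `leaked_candidates` helper are hypothetical, and the sketch omits the just-in-time per-CR job re-check and graceful deletion that the real script performs.

```python
from dataclasses import dataclass


@dataclass
class Runner:
    name: str
    has_job: bool  # False when the EphemeralRunner CR carries no job fields


def leaked_candidates(runners, desired_replicas):
    """Select idle runners that look leaked under the two stated criteria."""
    idle = [r for r in runners if not r.has_job]
    if desired_replicas == 0:
        # Owning set wants zero runners: every no-job runner is a candidate.
        return idle
    excess = len(runners) - desired_replicas
    if excess <= 0:
        # current == desired (for example a warm minRunners pool): leave untouched.
        return []
    # Only the surplus beyond desired is eligible, and only idle runners.
    return idle[:excess]
```

Runners with a job assigned are never selected in this model, whatever the desired count.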
The 2026-04-29, 2026-05-10, 2026-05-12, and 2026-05-15 cap expansions keep
that placement model but raise the source-owned ceilings for the primary shared
lanes:
tinyland-docker to 20 on sting, tinyland-nix to 16 on honey plus 8
sting overflow slots, and tinyland-dind to 20 on honey plus 16 sting
overflow slots.
The additive tinyland-nix-compute-expansion scale set contributes those
shared tinyland-nix overflow slots on sting; its TIN-1400 storage model uses
per-pod generic ephemeral PVCs on local-path-sting-fast-ephemeral for /nix
and /home/runner/_work, then copies the baked image /nix into the PVC before
runner startup so the image’s Nix installation survives the mount.
The additive tinyland-dind-compute-expansion scale set contributes those
shared tinyland-dind overflow slots on sting; those pods use generic
ephemeral PVCs on local-path-sting-fast-ephemeral for /home/runner/_work
and /var/lib/docker because sting’s fast disks are not the kubelet root
ephemeral-storage filesystem. Those pods therefore keep root ephemeral-storage
requests small and reserve the large scratch budget through the fast-local PVCs.
The Docker lane also carries an explicit 1Gi/8Gi
ephemeral-storage request/limit so large stateless CI bursts are no longer
invisible to scheduler disk accounting. The honey DinD lane splits
ephemeral-storage admission between a 4Gi/8Gi runner workspace container and a
24Gi/40Gi Docker daemon sidecar so
neither side of the build pod inherits the namespace default. Heavy, KVM, GPU,
and compatibility-residue lanes remain separately bounded.
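Because the scheduler accounts for a pod by summing its containers' requests, the honey DinD split above reserves 28Gi and is bounded at 48Gi per pod. The short sketch below works that arithmetic; the container names and the 200Gi allocatable figure are placeholders, not measured values.

```python
# (request_gi, limit_gi) per container, from the honey DinD lane described above.
# Container names are hypothetical.
HONEY_DIND = {"runner": (4, 8), "dockerd-sidecar": (24, 40)}


def pod_ephemeral_storage(containers):
    """Sum per-container ephemeral-storage requests and limits for one pod."""
    request = sum(req for req, _ in containers.values())
    limit = sum(lim for _, lim in containers.values())
    return request, limit


def pods_that_fit(allocatable_gi, containers):
    """How many such pods the scheduler would admit on ephemeral-storage alone."""
    request, _ = pod_ephemeral_storage(containers)
    return allocatable_gi // request


request_gi, limit_gi = pod_ephemeral_storage(HONEY_DIND)
```

On a node advertising 200Gi of allocatable ephemeral storage, seven such pods would fit by request, regardless of how many the scale-set cap allows.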
The 2026-05-24 first-party dogfood surge made the remaining Sting storage
truth visible: tinyland-nix-compute-expansion can have the fast-local PVC
model present and still hit scheduler Insufficient ephemeral-storage if the
kubelet root/nodefs surface on sting only advertises roughly 71GB. That is
not a reason to route GloriousFlywheel validation to GitHub-hosted runners, and
it is not proof that the physical SSD/NVMe pool is exhausted. It is a node
storage integration problem: kubelet/local-path/root ephemeral accounting must
match the intended fast-local runner substrate before a larger overflow cap is
treated as fully usable live capacity.
The tinyland-nix-heavy maxRunners = 1 cap is still per ARC scale set. With both Tinyland and Jess
owner overlays installed, simultaneous heavy jobs can request two heavy pods
for the same shared label. Until GloriousFlywheel has a higher-level
cross-overlay capacity controller or more sting-class capacity, this is
honest queueing and should be described as global capacity debt rather than
runner label debt.
Use just arc-shared-label-capacity-audit to make that boundary visible from
live state. It groups Helm release values by workflow-facing tinyland-*
labels and reports which owner overlay scale sets publish each label, their
per-scale-set caps, current runner counts, resource envelopes, and placement.
| Scenario | Concurrent Runners | Estimated Resource Usage |
|---|---|---|
| Quiet (off-hours) | 3 GitLab + 0 ARC | ~3 vCPU / 6Gi |
| Normal (business hours) | 3 GitLab + 2 warm Nix | ~11 vCPU / 22Gi |
| Burst (multiple PRs) | 3 GitLab + 4-6 ARC | Exceeds cluster, queue builds |
Scaling strategy: Vertical (larger nodes) before horizontal (more nodes).
ARC minRunners = 0 means burst capacity is only consumed when needed, except
for explicitly scheduled warm-pool windows that pre-scale selected lanes.
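The first two rows of the scenario table can be reproduced from per-runner figures. The Nix envelope (4 CPU, 8Gi) is the tinyland-nix limit listed in the topology tree; the GitLab envelope of 1 vCPU and 2Gi per runner is inferred from the quiet row and is an assumption, as is the helper name.

```python
NIX_RUNNER = (4, 8)      # (vCPU, Gi) limits for tinyland-nix
GITLAB_RUNNER = (1, 2)   # inferred from the quiet row: 3 runners ~ 3 vCPU / 6Gi


def estimated_usage(gitlab_runners, nix_runners):
    """Estimate (vCPU, Gi) for a mix of GitLab and warm Nix runners."""
    cpu = gitlab_runners * GITLAB_RUNNER[0] + nix_runners * NIX_RUNNER[0]
    mem = gitlab_runners * GITLAB_RUNNER[1] + nix_runners * NIX_RUNNER[1]
    return cpu, mem


quiet = estimated_usage(3, 0)
normal = estimated_usage(3, 2)
```

These are limit-based upper bounds, so real utilization for the same runner mix will usually be lower.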
Ownership
| Component | Owner | Change Process |
|---|---|---|
| ARC Controller config | Platform Engineer | PR to tofu/stacks/arc-runners/ |
| Runner scale set params | Platform Engineer | PR to tofu/stacks/arc-runners/ |
| Warm pool schedules | Operator | PR to tofu/stacks/arc-runners/ |
| HPA policies | Platform Engineer | PR to tofu/modules/gitlab-runner/ |
| Cluster nodes | Org Admin | On-prem provisioning (honey, bumble, sting) |
| Prometheus alerts | Platform Engineer | PR to tofu/modules/arc-controller/monitoring.tf |