Deployment Contract
How GloriousFlywheel stacks land in the target cluster, who owns what, and where state authority lives.
Stack Topology
GloriousFlywheel deploys 4 OpenTofu stacks into the honey RKE2
cluster. Each stack has its own state file and can be planned/applied
independently.
| Stack | Namespaces | State Backend | Purpose |
|---|---|---|---|
| `arc-runners` | arc-systems, arc-runners | GitLab HTTP (project 79706605) | ARC controller + GitHub runner scale sets |
| `attic` | cnpg-system, attic-cache-dev | GitLab HTTP (project 79706605) | Nix binary cache + PostgreSQL + RustFS |
| `gitlab-runners` | gitlab-runners | GitLab HTTP (project 79706605) | GitLab Runner Helm deployments |
| `runner-dashboard` | runner-dashboard | GitLab HTTP (project 79706605) | SvelteKit monitoring dashboard |
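The namespace ownership in the table can be expressed as a small lookup, which is handy when deciding which stack to plan after spotting a problem in a namespace. This is a minimal sketch; `namespace_owner` is a hypothetical helper name and is not part of the repository.

```shell
# Map a namespace to the stack that owns it, per the Stack Topology table.
# namespace_owner is a hypothetical helper, not something the repo ships.
namespace_owner() {
  case "$1" in
    arc-systems|arc-runners)     echo "arc-runners" ;;
    cnpg-system|attic-cache-dev) echo "attic" ;;
    gitlab-runners)              echo "gitlab-runners" ;;
    runner-dashboard)            echo "runner-dashboard" ;;
    *)
      # Anything else is not managed by a GloriousFlywheel stack.
      echo "unmanaged namespace: $1" >&2
      return 1
      ;;
  esac
}
```

For example, `just tofu-plan "$(namespace_owner cnpg-system)"` would plan the `attic` stack.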
State Authority
Source of truth hierarchy:
1. OpenTofu state (GitLab Managed Terraform State)
- Authoritative for all managed resources
- HTTP backend, locked per-stack
2. Repo code (tofu/stacks/*, tofu/modules/*)
- Desired state definition
- Drift = difference between (2) and (1)
3. Cluster live state (kubectl get)
- Observed state
- Drift = difference between (1) and (3)
- Dashboard /api/gitops/drift compares (2) vs (3)
Rule: Never modify cluster resources directly. All changes flow through
OpenTofu. Manual kubectl edits will be overwritten on next apply.
Exception: Emergency operator actions (pause runner, scale to zero) via the dashboard API are intentional drift. They modify live state only, and the next apply reconciles the cluster back to what the code defines.
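One way to check whether a stack has pending changes is `tofu plan -detailed-exitcode`, which exits 0 when nothing would change, 2 when changes are pending, and 1 on error. The sketch below turns that exit code into a label. `plan_status` and `check_stack_drift` are hypothetical names, and running `tofu` directly in a stack directory assumes the backend is already initialized and any required variables are supplied; the repository's `just` recipes may handle this differently.

```shell
# Translate the exit code of `tofu plan -detailed-exitcode` into a label.
plan_status() {
  case "$1" in
    0) echo "in-sync" ;;          # plan succeeded, no changes
    2) echo "changes-pending" ;;  # plan succeeded, changes would be applied
    *) echo "error" ;;            # plan failed (exit code 1 or anything else)
  esac
}

# Hypothetical wrapper: plan one stack directory and report its status.
# Assumes `tofu init` has already been run for that stack.
check_stack_drift() {
  stack_dir="$1"
  rc=0
  ( cd "$stack_dir" && tofu plan -detailed-exitcode -input=false >/dev/null 2>&1 ) || rc=$?
  plan_status "$rc"
}
```

A `changes-pending` result right after an emergency dashboard action is expected: it is the intentional drift that the next apply will remove.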
Deployment Flow
Local Development
# Prerequisites
export TF_HTTP_PASSWORD="<gitlab-pat-with-api-scope>"
export HONEY_KUBECONFIG="<path-to-honey-kubeconfig>"
# Plan a stack
just tofu-plan arc-runners
# Apply (requires review of plan output)
just tofu-apply arc-runners
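A small guard can catch a missing prerequisite before a plan fails partway through. This is a sketch only: `check_prereqs` is a hypothetical function, and the readability check assumes `HONEY_KUBECONFIG` holds a path to a single file.

```shell
# Verify the two prerequisites above before running a plan or apply.
# check_prereqs is a hypothetical helper, not part of the Justfile.
check_prereqs() {
  missing=0
  if [ -z "${TF_HTTP_PASSWORD:-}" ]; then
    echo "TF_HTTP_PASSWORD is not set" >&2
    missing=1
  fi
  if [ -z "${HONEY_KUBECONFIG:-}" ]; then
    echo "HONEY_KUBECONFIG is not set" >&2
    missing=1
  elif [ ! -r "$HONEY_KUBECONFIG" ]; then
    # Assumes the variable is a single file path, not a colon-separated list.
    echo "HONEY_KUBECONFIG is not readable: $HONEY_KUBECONFIG" >&2
    missing=1
  fi
  return "$missing"
}
```

It could be chained in front of a plan, for example `check_prereqs && just tofu-plan arc-runners`.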
CI/CD (GitHub Actions)
PR opened
-> validate.yml: tofu init -backend=false && tofu validate (all modules + stacks)
-> No plan/apply on PR (read-only validation)
Push to main (after merge)
-> deploy-arc-runners.yml: tofu plan -> tofu apply (arc-runners stack only)
-> Other stacks: manual deployment via `just tofu-apply <stack>`
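The PR validation step can be reproduced locally with a loop over every module and stack directory. The following is a sketch of that idea rather than the contents of `validate.yml`; `validate_all` is a hypothetical name, and the default root `tofu` assumes the `tofu/modules/*` and `tofu/stacks/*` layout named in State Authority.

```shell
# Run `tofu init -backend=false && tofu validate` in each module and stack.
# No remote state is touched because the backend is skipped.
validate_all() {
  root="${1:-tofu}"
  failed=0
  for dir in "$root"/modules/* "$root"/stacks/*; do
    [ -d "$dir" ] || continue   # skip unmatched globs and plain files
    if ( cd "$dir" && tofu init -backend=false -input=false >/dev/null \
         && tofu validate >/dev/null ); then
      echo "ok   $dir"
    else
      echo "FAIL $dir"
      failed=1
    fi
  done
  return "$failed"
}
```

The function keeps going after a failure so that one broken module does not hide problems in the others, and returns non-zero if any directory failed.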
Deployment Dependencies
OpenEBS ZFS (storage)
-> must be deployed before any PVC-using workload
-> provisioned on bumble node (ZFS pool)
CNPG Operator (database)
-> must be deployed before attic stack
-> deployed as part of attic stack (self-bootstraps)
ARC Controller (runner orchestration)
-> must be deployed before any ARC scale set
-> deployed as part of arc-runners stack, before scale sets
Dependency order for fresh cluster:
1. arc-runners (OpenEBS ZFS + ARC controller + scale sets)
2. attic (CNPG + Attic + RustFS)
3. gitlab-runners (GitLab Runner deployments)
4. runner-dashboard (SvelteKit app)
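The fresh-cluster order above can be scripted as a loop that stops at the first failure, so a later stack never runs against missing prerequisites. This is a sketch under assumptions: `deploy_fresh_cluster` and `DEPLOY_ORDER` are hypothetical names, and `just tofu-apply` may still prompt for plan review on each stack.

```shell
# Stack order copied from the dependency list above.
DEPLOY_ORDER="arc-runners attic gitlab-runners runner-dashboard"

# Apply each stack in order and stop as soon as one fails.
deploy_fresh_cluster() {
  for stack in $DEPLOY_ORDER; do
    echo "applying $stack"
    just tofu-apply "$stack" || {
      echo "stopped at $stack" >&2
      return 1
    }
  done
}
```

Stopping early matters here because, for example, the attic stack needs storage that the arc-runners stack provisions first.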
Cluster Requirements
Target Cluster: honey
| Property | Value |
|---|---|
| Provider | On-prem |
| Distribution | RKE2 |
| Nodes | 3 (honey: control plane, bumble: storage/ZFS, sting: stateless compute) |
| Kubernetes version | 1.29+ |
| Storage | OpenEBS ZFS (on bumble node) |
| Ingress | nginx-ingress (RKE2 bundled) |
| CNI | Canal (RKE2 default) |
| Context name | honey |
Required Cluster Features
- OpenEBS ZFS StorageClass: durable storage for stateful services on `honey` (the baseline Nix runner lanes should not depend on it for scheduling)
- Metrics Server: Required for HPA CPU/memory scaling
- Prometheus Operator CRDs: ServiceMonitor and PrometheusRule for metrics
- Cert-Manager (optional): TLS for dashboard ingress
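The required features can be probed with `kubectl` before the first apply. The sketch below is hedged on a few assumptions: `check_cluster_features` is a hypothetical name, the StorageClass is detected by the OpenEBS ZFS provisioner `zfs.csi.openebs.io` because this document does not name the StorageClass itself, and the optional Cert-Manager check is left out.

```shell
# Report which required cluster features appear to be missing.
check_cluster_features() {
  ctx="${1:-honey}"
  rc=0

  # Metrics Server registers the metrics.k8s.io aggregated API.
  kubectl --context "$ctx" get apiservice v1beta1.metrics.k8s.io >/dev/null 2>&1 \
    || { echo "missing: Metrics Server API"; rc=1; }

  # Prometheus Operator CRDs for ServiceMonitor and PrometheusRule.
  for crd in servicemonitors.monitoring.coreos.com prometheusrules.monitoring.coreos.com; do
    kubectl --context "$ctx" get crd "$crd" >/dev/null 2>&1 \
      || { echo "missing: CRD $crd"; rc=1; }
  done

  # Assumption: any StorageClass using the OpenEBS ZFS provisioner counts.
  kubectl --context "$ctx" get storageclass \
    -o jsonpath='{.items[*].provisioner}' 2>/dev/null \
    | grep -q 'zfs.csi.openebs.io' \
    || { echo "missing: OpenEBS ZFS StorageClass"; rc=1; }

  return "$rc"
}
```

Running `check_cluster_features honey` prints one line per missing feature and returns non-zero if any check failed.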
Required Secrets
| Secret | Scope | Source |
|---|---|---|
| `TF_HTTP_PASSWORD` | All stacks | GitLab PAT with api scope |
| `HONEY_KUBECONFIG` | CI deployment | Kubeconfig for honey RKE2 cluster |
| GitHub App credentials | arc-runners | GitHub App installation for ARC |
| GitLab runner tokens | gitlab-runners | GitLab group runner tokens |
| Attic signing key | attic | Generated during attic setup |
| PostgreSQL credentials | attic | Generated by CNPG operator |
Residual Assumptions
Current (honey cluster)
- `cluster_context = "honey"` in tfvars and environment config
- 10.43.0.0/16 service CIDR (shared by k3s and RKE2 defaults)
- OpenEBS ZFS on bumble node for persistent storage
Transitional (may change)
- GitLab state backend: Works but couples GloriousFlywheel to GitLab infrastructure. Migration to S3 backend or OpenTofu Cloud is a future option.
- Single cluster: Current model is single-cluster. Multi-cluster would require per-cluster state files and a federation layer (see runner-topology.md).
Removed (no longer valid)
- Liqo virtual nodes: Removed. No cross-cluster scheduling.
- Civo Object Storage for backups: Disabled in attic stack. Using RustFS/external S3. Civo decommissioned April 2026.
- Civo as provider: Migrated to on-prem honey RKE2 cluster (Civo decommissioned April 2026). Civo-specific config (`civo.tfvars`, Civo CLI in CI) has been removed.
- Multiple cluster contexts: All stacks target one cluster.
Adding a New Stack
- Create `tofu/stacks/<name>/` with `main.tf`, `variables.tf`, `outputs.tf`
- Add `<name>.tfvars` with production values (gitignore sensitive values)
- Configure GitLab HTTP state backend in `main.tf`: `terraform { backend "http" {} }`
- Add validation job in `.github/workflows/validate.yml`
- Add `just tofu-plan <name>` / `just tofu-apply <name>` support (automatic via Justfile)
- Document namespace ownership in this file
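The first and third steps can be scaffolded with a short script. This is a sketch: `scaffold_stack` is a hypothetical helper, it creates only the three files named above with an empty `http` backend block, and it leaves the tfvars file, the validation job, and the namespace documentation to be done by hand.

```shell
# Create tofu/stacks/<name>/ with main.tf, variables.tf, and outputs.tf.
# scaffold_stack is a hypothetical helper, not part of the repository.
scaffold_stack() {
  name="$1"
  root="${2:-tofu/stacks}"
  [ -n "$name" ] || { echo "usage: scaffold_stack <name> [stacks-dir]" >&2; return 2; }
  dir="$root/$name"
  # Never clobber an existing stack directory.
  [ ! -e "$dir" ] || { echo "refusing to overwrite $dir" >&2; return 1; }
  mkdir -p "$dir" || return 1
  cat > "$dir/main.tf" <<'EOF'
# Backend settings are supplied at init time, not hard-coded here.
terraform {
  backend "http" {}
}
EOF
  : > "$dir/variables.tf"
  : > "$dir/outputs.tf"
  echo "created $dir"
}
```

Leaving the backend block empty keeps state addresses and credentials out of the repository, which matches the use of `TF_HTTP_PASSWORD` as an environment variable.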