Tailnet-First Operator Plane
Defines the tailnet-first access model, dashboard exposure boundaries, and multi-org runner enrollment for the GloriousFlywheel platform.
Current boundary: tailnet access is an access-auth envelope, not a complete mutation authority. See Auth and Mutation Authority for the current read/write split.
Design Principle
The operator plane is private and tailnet-first. Dashboard auth is mixed-mode today: trusted Tailscale or mTLS proxy identity is preferred, while GitLab OAuth and WebAuthn remain interactive compatibility paths.
Operator (tailnet device)
-> Tailscale tunnel
-> Caddy reverse proxy (Tailscale or mTLS mode)
-> Dashboard (SvelteKit) / API / MCP server
Access Model
Who Can Access What
| Role | Access Method | Scope |
|---|---|---|
| Operator | Dashboard UI over tailnet | View fleet, use compatibility pause/resume or config submission when configured |
| Org Admin | Dashboard UI + forge review | Review compatibility GitOps MRs and enrollment requests |
| Platform Engineer | CLI (kubectl, tofu, MCP) over tailnet | Full cluster access, infrastructure changes |
| Downstream Consumer | GitHub/GitLab CI only | Submit jobs, no platform access |
| External User | None (tailnet-only) | No access to operator plane |
Authentication Flow
User opens dashboard URL (e.g. https://dashboard.tail12345.ts.net)
-> Caddy checks Tailscale or mTLS identity
-> Emits trusted identity headers
-> Maps to platform role:
- Org owner email -> admin
- Org member email -> operator
- Unknown -> default role or explicit deny at proxy/policy layer
-> SvelteKit reads trusted headers when TRUST_PROXY_HEADERS=true
Proxy identity outranks stored interactive sessions when trusted headers are enabled. Interactive GitLab OAuth and WebAuthn sessions still exist as compatibility paths.
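The precedence rule above can be sketched as a small resolver. This is a minimal illustration rather than the dashboard's actual code: the header name `x-operator-email`, the example email sets, and the function names are all hypothetical, and `None` stands in for an explicit deny.

```python
from typing import Mapping, Optional

# Hypothetical header name and membership sets; the real proxy
# configuration and org directory may look different.
IDENTITY_HEADER = "x-operator-email"
ORG_OWNERS = {"owner@example.com"}
ORG_MEMBERS = {"member@example.com"}


def role_for_email(email: str, default_role: Optional[str] = None) -> Optional[str]:
    """Map an email to a platform role; None means explicit deny."""
    if email in ORG_OWNERS:
        return "admin"
    if email in ORG_MEMBERS:
        return "operator"
    return default_role


def resolve_role(
    headers: Mapping[str, str],
    session_email: Optional[str],
    trust_proxy_headers: bool,
    default_role: Optional[str] = None,
) -> Optional[str]:
    """Prefer trusted proxy identity over a stored interactive session."""
    # Header names are case-insensitive, so normalize before lookup.
    normalized = {key.lower(): value for key, value in headers.items()}
    proxy_email = normalized.get(IDENTITY_HEADER) if trust_proxy_headers else None
    email = proxy_email or session_email
    if email is None:
        return None
    return role_for_email(email, default_role)
```

With trusted headers enabled, a request carrying an owner's proxy identity resolves to `admin` even if a stored session belongs to an org member. With trusted headers disabled, the same request falls back to the session identity.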
Exposure Boundaries
What’s on the Tailnet
| Service | URL Pattern | Port | Auth |
|---|---|---|---|
| Dashboard UI | dashboard.tail*.ts.net | 443 (Caddy TLS) | tailscale_auth |
| Dashboard API | dashboard.tail*.ts.net/api/* | 443 | tailscale_auth + role check |
| MCP Server | local/tooling process | N/A | Current connector/tool auth |
| Prometheus | prometheus.tail*.ts.net | 443 | tailscale_auth |
| Grafana | grafana.tail*.ts.net | 443 | tailscale_auth |
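The URL patterns in the table can be checked mechanically. The sketch below assumes shell-style wildcard matching is an acceptable reading of `tail*`; the function name and the dictionary are illustrative and not part of the platform.

```python
from fnmatch import fnmatchcase
from typing import Optional

# Patterns copied from the table above. The MCP server is omitted
# because it runs as a local process with no tailnet hostname.
TAILNET_HOST_PATTERNS = {
    "dashboard": "dashboard.tail*.ts.net",
    "prometheus": "prometheus.tail*.ts.net",
    "grafana": "grafana.tail*.ts.net",
}


def service_for_host(host: str) -> Optional[str]:
    """Return the tailnet service a hostname belongs to, or None."""
    host = host.lower()
    for service, pattern in TAILNET_HOST_PATTERNS.items():
        if fnmatchcase(host, pattern):
            return service
    return None
```

Note that `*` in `fnmatchcase` also matches dots, so this check is suitable for documentation tooling or tests, not as a security boundary.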
What’s NOT on the Tailnet
| Service | Access | Reason |
|---|---|---|
| ARC Controller | Cluster-internal only | No external API needed |
| Runner pods | Cluster-internal only | Ephemeral, no direct access |
| Attic cache | Cluster-internal + tailnet | Runners use cluster DNS, devs use tailnet |
| PostgreSQL | Cluster-internal only | CNPG manages access via mTLS |
| RustFS | Cluster-internal only | S3 API for Attic only |
MCP Server Access
The MCP server runs as a local stdio process on the operator’s machine. It calls the dashboard API over the tailnet:
Claude Code -> MCP Server (stdio, local)
-> HTTP to dashboard.tail*.ts.net/api/*
-> Dashboard request path
-> Dashboard resolves trusted proxy/session identity and checks role
-> Returns envelope response
-> MCP Server formats for Claude
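A rough sketch of the two MCP-side steps in that flow, building the API request and formatting the envelope, might look like the following. The envelope shape (`ok`, `data`, `error`) is an assumption made for illustration, since this document does not define it. The base URL reuses the example hostname from the authentication flow.

```python
import json
from urllib.request import Request

# Example hostname from this document; substitute the real tailnet name.
DASHBOARD_BASE = "https://dashboard.tail12345.ts.net"


def build_api_request(path: str) -> Request:
    """Build (but do not send) a GET request for a dashboard API path."""
    if not path.startswith("/api/"):
        raise ValueError("only /api/* paths belong to the dashboard API")
    return Request(
        DASHBOARD_BASE + path,
        headers={"Accept": "application/json"},
        method="GET",
    )


def format_envelope(raw: str) -> str:
    """Render an assumed {"ok", "data", "error"} envelope as plain text."""
    envelope = json.loads(raw)
    if envelope.get("ok"):
        return json.dumps(envelope.get("data"), indent=2, sort_keys=True)
    return "error: " + str(envelope.get("error", "unknown"))
```

The request carries no credentials of its own in this sketch; identity is resolved on the dashboard side from the trusted proxy or session, as described above.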
Multi-Org Runner Enrollment
Model
GloriousFlywheel supports runner sharing across multiple GitHub organizations and GitLab groups through a single platform instance.
Platform Instance (single cluster)
+-- Org A (GitHub)
| +-- tinyland-nix (shared)
| +-- tinyland-docker (shared)
| +-- tinyland-dind (shared)
|
+-- Org B (GitHub)
| +-- tinyland-docker (shared)
| +-- tinyland-nix-gpu (shared additive capability when enabled)
|
+-- Group C (GitLab)
+-- gl-docker (shared)
+-- gl-nix (shared)
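One way to reason about the tree above is to invert it and ask which orgs consume each shared label. The mapping below transcribes the diagram; the data structure and function name are illustrative only.

```python
from collections import defaultdict
from typing import Dict, List

# Transcribed from the diagram above.
ENROLLMENTS: Dict[str, List[str]] = {
    "Org A (GitHub)": ["tinyland-nix", "tinyland-docker", "tinyland-dind"],
    "Org B (GitHub)": ["tinyland-docker", "tinyland-nix-gpu"],
    "Group C (GitLab)": ["gl-docker", "gl-nix"],
}


def consumers_by_label(enrollments: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Invert org -> labels into label -> sorted list of consuming orgs."""
    index = defaultdict(list)
    for org, labels in enrollments.items():
        for label in labels:
            index[label].append(org)
    return {label: sorted(orgs) for label, orgs in index.items()}
```

For the example data, `tinyland-docker` is consumed by both Org A and Org B, while `tinyland-nix-gpu` is consumed only by Org B.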
Enrollment Types
| Type | Scope | Registration | Lifecycle |
|---|---|---|---|
| Shared | All enrolled orgs | One GitHub App installation per org; all orgs route to the same scale set | Platform manages, orgs consume |
| Capability add-on | Enrolled orgs with approved need | Shared label with explicit runtime, architecture, privilege, or resource reason | Platform provisions |
| Org-plus-user | Org + specific repos | GitHub App with restricted repo access | Org admin configures repo list |
Registration Flow
GitHub (ARC):
1. Org admin installs GloriousFlywheel GitHub App
2. App installation generates credentials
3. Platform engineer adds org to arc-runners stack:
- New GitHub App secret in cluster
- ARC scale set configured with org's app credentials
4. Runners appear in org's GitHub Actions runner list
5. Org's workflows use shared labels such as `tinyland-nix`
GitLab:
1. Group admin creates group runner token
2. Platform engineer adds token to gitlab-runners stack
3. Runner registers with GitLab group
4. Group's pipelines pick up the runner via tags
Runner Isolation
| Isolation Level | Mechanism | Use Case |
|---|---|---|
| None (shared pool) | All orgs share same scale set pods | Trusted orgs, cost efficiency |
| Namespace | Separate Kubernetes namespace per org | Untrusted orgs, resource isolation |
| Node | Dedicated node pool per org | Strict isolation, compliance |
Default: shared pool with ephemeral pods. Each job gets a fresh pod that is destroyed after completion, so no pod-local state carries over from one job to the next.
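The isolation table can be read as a simple decision rule. The sketch below encodes one plausible reading; the `trusted` and `compliance_required` inputs are hypothetical flags, not fields the platform is known to expose.

```python
from enum import Enum


class Isolation(Enum):
    SHARED_POOL = "none"
    NAMESPACE = "namespace"
    NODE = "node"


def choose_isolation(trusted: bool, compliance_required: bool) -> Isolation:
    """Pick the weakest isolation level that satisfies the org's needs."""
    if compliance_required:
        # Strict isolation: dedicated node pool per org.
        return Isolation.NODE
    if not trusted:
        # Untrusted orgs get their own Kubernetes namespace.
        return Isolation.NAMESPACE
    # Trusted orgs share the default pool of ephemeral pods.
    return Isolation.SHARED_POOL
```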
Enrollment Lifecycle
Enroll:
Org admin requests enrollment
-> Platform engineer adds org config to tofu stack
-> PR + review + merge + apply
-> Runner appears in org's forge
Monitor:
Dashboard shows enrollment status per org
-> /api/runners groups by forge
-> Metrics tracked per org via runner labels
Offboard:
Org admin requests removal
-> Platform engineer removes org config
-> PR + review + merge + apply
-> Runner deregisters from org's forge
-> Pods drain, secrets deleted
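The enroll and offboard paths above can be modeled as a small state machine. The state and event names here are invented for illustration and do not correspond to any documented API.

```python
# (state, event) -> next state. All names are hypothetical.
TRANSITIONS = {
    ("requested", "open_pr"): "in_review",
    ("in_review", "merge_and_apply"): "active",
    ("active", "request_removal"): "removal_requested",
    ("removal_requested", "open_pr"): "removal_in_review",
    ("removal_in_review", "merge_and_apply"): "offboarded",
}


def advance(state: str, event: str) -> str:
    """Return the next lifecycle state, or raise if the event is not allowed."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(
            f"event {event!r} is not allowed in state {state!r}"
        ) from None
```

Modeling the lifecycle this way makes it explicit that both enrollment and removal pass through a reviewed PR before anything changes in the org's forge.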
Ownership Matrix
| Decision | Owner | Process |
|---|---|---|
| Grant org enrollment | Org Admin | Request via issue, approved by platform team |
| Provision shared runner | Platform Engineer | PR to tofu/stacks/arc-runners/ |
| Provision additive capability lane | Platform Engineer | PR with explicit runtime, privilege, architecture, or bounded resource reason |
| Set resource quotas per org | Org Admin | PR to stack variables |
| Manage GitHub App installation | Org Admin (per org) | GitHub org settings |
| Rotate runner credentials | Platform Engineer | Scheduled or on-demand via runbook |
| Emergency pause (compatibility runners) | Operator | Dashboard compatibility flow when GitLab backend is configured |