Runner Dashboard Overview
The runner dashboard is a SvelteKit 5 application whose primary operator role today is real-time monitoring of the cross-forge runner pool.
Architecture
- Framework: SvelteKit 5 + Skeleton v4 + adapter-node
- Source: app/
- Infrastructure: tofu/modules/runner-dashboard/ + tofu/stacks/runner-dashboard/
Build And Consumption
Current repo contract:
- the repo-native Nix derivation is packages.runner-dashboard
- that derivation is the canonical Nix image-input surface for packages.runner-dashboard-image
- current operator-facing deployment and release consumption is image-based, not direct flake-output consumption
- the active published dashboard path is GHCR-backed container images from the Dockerfile/Buildx workflows
This means the dashboard flake derivation is real and important, but it is not yet documented as a standalone public consumption surface.
Features
- Cross-forge monitoring: View GitLab CI and GitHub Actions runners in a unified interface with forge badges (GL/GH)
- HPA status: Live replica counts, CPU/memory utilization, scaling events for GitLab runners
- ARC autoscaler views: Scale set status, active/idle/pending runner counts for GitHub Actions runners with scale-to-zero labels
- Drift detection: Alerts when deployed state diverges from tofu state
- Forge filter: Toggle between All, GitLab, and GitHub runner views
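As an illustration of the drift detection idea, the sketch below compares a desired state with a deployed state and reports the keys that differ. It is a minimal example under stated assumptions: the flat key/value shape, the `FlatState` and `DriftEntry` types, and the `detectDrift` function are hypothetical and are not taken from the dashboard source.

```typescript
// Hypothetical flat representation of state; the real dashboard may compare
// richer structures read from tofu state and the cluster.
type Scalar = string | number | boolean;
type FlatState = Record<string, Scalar>;

interface DriftEntry {
  key: string;
  desired: Scalar | undefined;
  deployed: Scalar | undefined;
}

// Return one entry per key whose desired and deployed values differ,
// including keys that exist on only one side. Output is sorted by key.
function detectDrift(desired: FlatState, deployed: FlatState): DriftEntry[] {
  const keys = Object.keys(desired)
    .concat(Object.keys(deployed))
    .filter((key, index, all) => all.indexOf(key) === index)
    .sort();
  return keys
    .filter((key) => desired[key] !== deployed[key])
    .map((key) => ({ key, desired: desired[key], deployed: deployed[key] }));
}
```

An empty result means no drift was found for the compared keys; any non-empty result is what an alert would be raised from.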
Current Control Plane
Current repo truth:
- monitoring is genuinely cross-forge and multi-namespace
- Kubernetes, ARC, and Prometheus are the primary read-side operator surfaces
- configuration detail currently reads local tfvars from the repo checkout
- access auth is mixed-mode at the edge:
  - GitLab OAuth
  - WebAuthn
  - Tailscale proxy-header auth
  - mTLS proxy-header auth
- runner pause/resume and submitted configuration changes still go through GitLab runner APIs and GitLab merge requests as compatibility flows today
- there is no GitHub-native or cluster-native mutation authority yet
That means the dashboard is already more tailnet-ready at the access layer than it is forge-neutral at the control layer. The current operator story is:
- monitoring first
- mutation compatibility-only
- future replacement mutation authority still undecided
Multi-Namespace Queries
The dashboard queries multiple Kubernetes namespaces:
- gitlab-runners: GitLab runner pods, HPAs, deployments
- arc-runners: ARC runner scale sets, runner pods
Namespaces are configured via the K8S_RUNNER_NAMESPACES environment
variable. ARC namespaces additionally query AutoScalingRunnerSet CRDs
from actions.github.com/v1alpha1.
RBAC is configured as a ClusterRole with dynamic RoleBindings for each namespace.
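The namespace handling described above can be sketched roughly as follows. This is a hedged illustration, not the dashboard's actual code: it assumes K8S_RUNNER_NAMESPACES is a comma-separated list, and the `parseNamespaces` and `planQueries` helpers, as well as the explicit list of ARC namespaces, are hypothetical. The API paths follow the standard Kubernetes REST layout, with `autoscalingrunnersets` as the ARC resource under actions.github.com/v1alpha1.

```typescript
// Sketch only. The comma-separated format and the explicit ARC namespace
// list are assumptions, not the dashboard's confirmed behavior.
const ARC_GROUP_VERSION = "actions.github.com/v1alpha1";

interface NamespaceQueryPlan {
  namespace: string;
  paths: string[];
}

// Split a raw K8S_RUNNER_NAMESPACES value into trimmed, de-duplicated names.
function parseNamespaces(raw: string | undefined): string[] {
  return (raw ?? "")
    .split(",")
    .map((name) => name.trim())
    .filter((name, index, all) => name.length > 0 && all.indexOf(name) === index);
}

// Build the Kubernetes API paths to read for each namespace. Every namespace
// lists pods; ARC namespaces add AutoscalingRunnerSet resources, and the
// others add HPAs and deployments.
function planQueries(namespaces: string[], arcNamespaces: string[]): NamespaceQueryPlan[] {
  return namespaces.map((namespace) => {
    const base = `namespaces/${encodeURIComponent(namespace)}`;
    const paths = [`/api/v1/${base}/pods`];
    if (arcNamespaces.indexOf(namespace) >= 0) {
      paths.push(`/apis/${ARC_GROUP_VERSION}/${base}/autoscalingrunnersets`);
    } else {
      paths.push(`/apis/autoscaling/v2/${base}/horizontalpodautoscalers`);
      paths.push(`/apis/apps/v1/${base}/deployments`);
    }
    return { namespace, paths };
  });
}
```

In a Node process the raw value would typically come from `process.env.K8S_RUNNER_NAMESPACES`. Each path in a plan also needs a matching read permission in the namespace's RoleBinding.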
Prometheus Integration
PromQL queries pull metrics from the cluster Prometheus instance for historical utilization data, scaling event timelines, and cache hit rates.
Prometheus URL is configured via PROMETHEUS_URL in the runner-dashboard
module.
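For orientation, a historical utilization read could be built against the Prometheus HTTP API as in the sketch below. The `/api/v1/query_range` endpoint and its `query`, `start`, `end`, and `step` parameters are part of the Prometheus HTTP API; the `buildRangeQueryUrl` helper, the example base URL, and the example metric are assumptions for illustration, and the dashboard's real queries may differ.

```typescript
// Parameters for a Prometheus range query. Times are Unix seconds.
interface RangeQuery {
  query: string;
  startSec: number;
  endSec: number;
  stepSec: number;
}

// Build a query_range URL from a base URL (for example the PROMETHEUS_URL
// value) and a PromQL expression. Trailing slashes on the base are dropped.
function buildRangeQueryUrl(baseUrl: string, q: RangeQuery): string {
  const params: Array<[string, string]> = [
    ["query", q.query],
    ["start", String(q.startSec)],
    ["end", String(q.endSec)],
    ["step", String(q.stepSec)],
  ];
  const queryString = params
    .map(([key, value]) => `${key}=${encodeURIComponent(value)}`)
    .join("&");
  return `${baseUrl.replace(/\/+$/, "")}/api/v1/query_range?${queryString}`;
}

// Example: HPA replica history for one namespace over one hour. The metric
// name assumes kube-state-metrics is scraped by the cluster Prometheus.
const exampleUrl = buildRangeQueryUrl("http://prometheus.example:9090/", {
  query: 'kube_horizontalpodautoscaler_status_current_replicas{namespace="gitlab-runners"}',
  startSec: 1700000000,
  endSec: 1700003600,
  stepSec: 60,
});
```

The resulting URL can be passed to `fetch`; a successful response carries `status: "success"` and a `data.result` array of time series.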
Authentication
- GitLab OAuth: Current interactive compatibility login flow
- WebAuthn / FIDO2: Optional passwordless authentication backed by a PostgreSQL credential store (webauthn-db module)
- Tailscale / mTLS: Optional proxy-header auth through the Caddy sidecar when that ingress path is enabled; preferred direction for the tailnet-first operator plane
- Request precedence: Trusted proxy identity headers now outrank stored interactive sessions, so tailnet and mTLS request identity is authoritative when the dashboard is deployed behind the trusted proxy path
- Current mutation policy: Read-side visibility is broader, but runner pause/resume and submitted config changes now require operator or admin role resolution
- Current runtime authority summary: the Settings page now exposes the dashboard's active read plane, mutation authority, GitOps submission path, and access envelope so operators do not have to infer those from code or research notes
- Current API baseline: all non-public dashboard API routes now require an authenticated identity; only the health endpoint remains public
- Current data policy: monitoring and runner-status reads are authenticated viewer-plus, while config and drift detail reads are now operator-plus
- Current admin auth workflow: admins can inspect auth-policy shape and review or revoke registered dashboard passkeys across users, plus inspect recent auth security events for interactive login, logout, and passkey lifecycle actions
- Current admin control workflow: admins can inspect recent GitLab-backed compatibility mutation events for runner pause/resume and submitted config changes
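The request precedence and role policy listed above can be pictured with the following sketch. It is an illustration under assumptions rather than the dashboard's implementation: the header name `x-proxy-user`, the `RequestContext` shape, the `lookupRole` callback, and the viewer < operator < admin ordering are all hypothetical.

```typescript
type Role = "viewer" | "operator" | "admin";
type Access = "public" | Role;

interface Identity {
  user: string;
  role: Role;
  source: "proxy" | "session";
}

interface RequestContext {
  headers: Record<string, string | undefined>;
  fromTrustedProxy: boolean; // true only when the request arrived via the sidecar
  session?: { user: string; role: Role };
}

// Hypothetical header name; the real proxy header is not stated in this document.
const PROXY_USER_HEADER = "x-proxy-user";
const ROLE_RANK: Record<Role, number> = { viewer: 0, operator: 1, admin: 2 };

// Trusted proxy identity outranks a stored interactive session. Identity
// headers on requests that did not come through the trusted proxy are ignored.
function resolveIdentity(ctx: RequestContext, lookupRole: (user: string) => Role): Identity | null {
  const proxyUser = ctx.fromTrustedProxy ? ctx.headers[PROXY_USER_HEADER] : undefined;
  if (proxyUser) {
    return { user: proxyUser, role: lookupRole(proxyUser), source: "proxy" };
  }
  if (ctx.session) {
    return { user: ctx.session.user, role: ctx.session.role, source: "session" };
  }
  return null;
}

// Only "public" routes (the health endpoint) skip authentication; everything
// else needs an identity whose role meets the required level.
function isAllowed(identity: Identity | null, required: Access): boolean {
  if (required === "public") return true;
  if (!identity) return false;
  return ROLE_RANK[identity.role] >= ROLE_RANK[required];
}
```

Under this model, monitoring reads would require `viewer`, config and drift detail reads as well as pause/resume would require `operator`, and passkey review or revocation would require `admin`.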
Caddy Sidecar Proxy
The dashboard module supports an optional Caddy reverse proxy sidecar with two modes:
- mTLS mode: Client certificate authentication with a custom CA
- Tailscale mode: Automatic TLS via Tailscale MagicDNS
Configure via enable_caddy_proxy, caddy_mode, and related variables
in the runner-dashboard module.
Development
cd app
pnpm install
pnpm dev # Start dev server
pnpm check # Type check
pnpm test # Run tests
pnpm build # Production build
See Also
- OpenTofu Modules — deployment configuration
- Runners Overview — the runner pool this dashboard monitors