GloriousFlywheel

Runbook

Operational procedures for managing the runner infrastructure.

Scaling Up

To increase the maximum number of replicas for a runner type:

Edit the HPA max value for the target runner in organization.yaml.
Apply the change:
```
tofu apply
```
Verify the new HPA configuration:
```
kubectl get hpa -n {org}-runners
```
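To confirm that the new maximum actually reached the cluster, the replica bounds can be read straight from the HPA spec. This is a minimal sketch: hpa_bounds is a hypothetical helper name, and it assumes the HPA is named runner-TYPE, as in the status commands elsewhere in this runbook.

```shell
# Print "<minReplicas> <maxReplicas>" for one runner HPA.
# Arguments: $1 = namespace, $2 = runner type.
hpa_bounds() {
  kubectl get hpa "runner-$2" -n "$1" \
    -o jsonpath='{.spec.minReplicas} {.spec.maxReplicas}{"\n"}'
}

# Usage (myorg is a placeholder organization name):
#   hpa_bounds myorg-runners docker
```

The second number should match the max value set in organization.yaml.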

See HPA Tuning for details on stabilization windows and scaling behavior.

Rotating Runner Tokens

To rotate the GitLab runner registration token:

Delete the Kubernetes Secret containing the current token:

```
kubectl delete secret runner-token-TYPE -n {org}-runners
```

Re-apply to recreate the secret with a new token:
```
tofu apply
```
Runner pods will pick up the new token on their next restart.
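If waiting for the next natural restart is not acceptable, the pods can be restarted explicitly. The sketch below assumes the deployment follows the runner-TYPE naming used in this runbook; restart_runner is a hypothetical helper name and the 120-second timeout is an arbitrary choice.

```shell
# Restart one runner deployment and wait for the rollout to finish.
# Arguments: $1 = namespace, $2 = runner type.
restart_runner() {
  kubectl rollout restart "deployment/runner-$2" -n "$1" &&
    kubectl rollout status "deployment/runner-$2" -n "$1" --timeout=120s
}

# Usage (myorg is a placeholder organization name):
#   restart_runner myorg-runners docker
```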

Adding a New Runner Type

Add the new runner definition to organization.yaml with its configuration (base image, tags, resource limits, HPA settings).
Create corresponding tfvars entries in the overlay for the new runner type.
Apply:
```
tofu apply
```
Verify the new runner appears in the GitLab group runner list.
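Before checking GitLab, it can be useful to confirm that the cluster-side resources exist. This sketch assumes the new runner type gets a Deployment and an HPA named runner-TYPE, like the existing types; runner_resources is a hypothetical helper name.

```shell
# Show the Deployment and HPA for one runner type.
# kubectl exits non-zero if either resource is missing.
# Arguments: $1 = namespace, $2 = runner type.
runner_resources() {
  kubectl get "deployment/runner-$2" "hpa/runner-$2" -n "$1"
}

# Usage (myorg is a placeholder organization name):
#   runner_resources myorg-runners docker
```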

Emergency Stop

To immediately stop all runners of a specific type:

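Option A – Scale the deployment to zero:
```
kubectl scale deployment runner-TYPE --replicas=0 -n {org}-runners
```

Option B – Delete the runner deployment:
```
kubectl delete deployment runner-TYPE -n {org}-runners
```

Note: Option A targets the deployment because kubectl scale cannot scale an HPA object. While the deployment is at zero replicas, the HPA stops adjusting it, provided the HPA minimum is greater than zero. Option A can be reversed by scaling the deployment back to the desired minimum, after which autoscaling resumes. Option B requires a tofu apply to recreate the deployment when service is restored.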
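After either option, it is worth confirming that no pods of that runner type remain. A minimal sketch, assuming the pods carry the app=runner-TYPE label used in the log commands in this runbook; runner_pod_count is a hypothetical helper name.

```shell
# Print the number of pods that still exist for one runner type,
# including pods that are still terminating.
# Arguments: $1 = namespace, $2 = runner type.
runner_pod_count() {
  kubectl get pods -n "$1" -l "app=runner-$2" --no-headers | wc -l | tr -d ' '
}

# Usage (myorg is a placeholder organization name):
#   runner_pod_count myorg-runners docker
```

A result of 0 means the stop has taken full effect. Any kubectl error is still printed to stderr, so an authentication or connection failure is not silently reported as zero pods.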

Log Collection

View logs for all pods of a specific runner type:

```
kubectl logs -n {org}-runners -l app=runner-TYPE
```

Follow logs in real time:

```
kubectl logs -n {org}-runners -l app=runner-TYPE --follow
```
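When a label selector is used, kubectl logs prints only the last 10 lines per pod by default, and --follow streams from at most 5 pods unless --max-log-requests is raised. The sketch below collects fuller logs; runner_logs is a hypothetical helper name and the one-hour window is an arbitrary choice.

```shell
# Collect the last hour of logs from every container of every pod
# of one runner type, with each line prefixed by its pod and container.
# Arguments: $1 = namespace, $2 = runner type.
runner_logs() {
  kubectl logs -n "$1" -l "app=runner-$2" \
    --all-containers --prefix --tail=-1 --since=1h
}

# Usage (myorg is a placeholder organization name):
#   runner_logs myorg-runners docker > runner-docker.log
```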

Health Check

From the overlay repository, run the health check target:

```
just runners-health
```

This verifies that all runner types have at least one healthy pod and that the runners are registered with GitLab.

Manual Status Check

To inspect the full state of the runner namespace:

```
kubectl get pods,hpa,deployments -n {org}-runners
```

For a specific runner type:

```
kubectl get pods -n {org}-runners -l app=runner-docker
kubectl describe hpa runner-docker -n {org}-runners
```
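To spot runner types with no ready pods at a glance, the deployment status can be filtered. This is a rough sketch and not the implementation of the runners-health target; unready_runners is a hypothetical helper name, and the sketch does not check GitLab registration.

```shell
# List deployments in the namespace that report no ready replicas.
# Kubernetes omits status.readyReplicas when it is zero, so an empty
# second column is treated the same as 0.
# Arguments: $1 = namespace.
unready_runners() {
  kubectl get deployments -n "$1" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.readyReplicas}{"\n"}{end}' |
    awk -F '\t' '$2 == "" || $2 == 0 { print $1 }'
}

# Usage (myorg is a placeholder organization name):
#   unready_runners myorg-runners
```

Empty output means every deployment in the namespace has at least one ready pod.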

HPA Tuning – autoscaler configuration details
Troubleshooting – diagnosing common issues
Security Model – access controls and secrets
