Each runner type has its own HorizontalPodAutoscaler (HPA) that manages replica count based on CPU utilization.

| Parameter | Value |
|---|---|
| Minimum replicas | 1 |
| Maximum replicas | 5 |
| Scale-up stabilization window | 15 seconds |
| Scale-down stabilization window | 5 minutes |
| Target metric | CPU utilization at 70% |

The asymmetric stabilization windows are intentional. The short scale-up window lets the autoscaler respond quickly to load spikes. The longer scale-down window prevents thrashing when CPU utilization fluctuates around the 70% target: replicas are removed only after load has stayed low for the full 5 minutes.
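
The effect of the two windows can be illustrated with a small simulation. This is a simplified sketch of the HPA algorithm as described in the Kubernetes documentation, not the configuration used by these runners. It assumes a 15-second sync interval and the default 10% tolerance, ignores scaling policies (rate limits), and feeds in fixed CPU samples rather than modelling how utilization changes as replicas are added. The function names are illustrative only.

```python
import math

MIN_REPLICAS = 1
MAX_REPLICAS = 5
TARGET_CPU = 70          # percent, from the table above
SCALE_UP_WINDOW = 15     # seconds
SCALE_DOWN_WINDOW = 300  # seconds (5 minutes)
TOLERANCE = 0.1          # assumption: Kubernetes default, not overridden


def raw_recommendation(current_replicas, cpu_percent):
    """desired = ceil(current * currentMetric / targetMetric), clamped to the range."""
    ratio = cpu_percent / TARGET_CPU
    if abs(ratio - 1.0) <= TOLERANCE:
        return current_replicas  # close enough to target: no change
    desired = math.ceil(current_replicas * ratio)
    return max(MIN_REPLICAS, min(MAX_REPLICAS, desired))


def stabilized(current_replicas, history, now):
    """Apply the stabilization windows to a list of (timestamp, recommendation)."""
    # Scaling up is capped by the lowest recommendation in the scale-up window.
    up = min(r for t, r in history if now - t < SCALE_UP_WINDOW)
    # Scaling down is floored by the highest recommendation in the scale-down window.
    down = max(r for t, r in history if now - t < SCALE_DOWN_WINDOW)
    result = current_replicas
    if result < up:
        result = up
    if result > down:
        result = down
    return result


def simulate(cpu_samples, start_replicas=1, interval=15):
    """Return the replica count after each sync for a series of CPU readings."""
    replicas = start_replicas
    history = []
    counts = []
    for i, cpu in enumerate(cpu_samples):
        now = i * interval
        history.append((now, raw_recommendation(replicas, cpu)))
        replicas = stabilized(replicas, history, now)
        counts.append(replicas)
    return counts


if __name__ == "__main__":
    # Two syncs at 140% CPU, then load drops to 30% for five minutes.
    print(simulate([140, 140] + [30] * 20))
```

In this run the replica count climbs from 1 to 4 within two syncs, holds at 4 for the next 19 syncs while the 5-minute window still contains the earlier recommendation of 4, and only then drops to 2.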

HPA parameters are configured per runner type in `organization.yaml`. To change the replica range or stabilization windows for a specific runner, edit the corresponding entry and run `tofu apply` from the overlay.

Check HPA status for all runners in the namespace:

    kubectl get hpa -n {org}-runners

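This shows, for each HPA, the scale target, current versus target CPU utilization, the minimum and maximum replica limits, and the current replica count.

For more detail on a specific runner, including the desired replica count, scaling conditions, and recent scaling events:

    kubectl describe hpa runner-docker -n {org}-runners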
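
For scripted checks, the same information is available as JSON. The sketch below is one possible approach, assuming the cluster serves the HPA as `autoscaling/v2` (so utilization appears under `status.currentMetrics`) and that `kubectl` is already configured for the cluster. `fetch_hpa` and `summarize_hpa` are hypothetical helper names, not part of any existing tooling.

```python
import json
import subprocess


def fetch_hpa(name, namespace):
    """Return one HPA object as a dict by calling kubectl (needs cluster access)."""
    result = subprocess.run(
        ["kubectl", "get", "hpa", name, "-n", namespace, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


def summarize_hpa(hpa):
    """Extract replica counts, CPU utilization, and conditions from an HPA object."""
    spec = hpa.get("spec", {})
    status = hpa.get("status", {})

    # Find the CPU resource metric, if the autoscaler has reported one yet.
    cpu_percent = None
    for metric in status.get("currentMetrics") or []:
        resource = metric.get("resource") or {}
        if metric.get("type") == "Resource" and resource.get("name") == "cpu":
            cpu_percent = resource.get("current", {}).get("averageUtilization")

    current = status.get("currentReplicas")
    desired = status.get("desiredReplicas")
    return {
        "current": current,
        "desired": desired,
        "min": spec.get("minReplicas"),
        "max": spec.get("maxReplicas"),
        "cpu_percent": cpu_percent,
        # A scale operation is pending when current and desired differ.
        "scaling": current != desired,
        # e.g. AbleToScale, ScalingActive, ScalingLimited -> "True"/"False"
        "conditions": {
            c["type"]: c["status"] for c in status.get("conditions") or []
        },
    }
```

A call such as `summarize_hpa(fetch_hpa("runner-docker", namespace))`, with `namespace` set to the organization's runner namespace, returns a small dict that can be logged or asserted on in a health check.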