GloriousFlywheel

HPA Tuning

Each runner type has an independent HorizontalPodAutoscaler (HPA) that manages replica count based on CPU utilization.

Default Configuration

Parameter Value
Minimum replicas 1
Maximum replicas 5
Scale-up stabilization window 15 seconds
Scale-down stabilization window 5 minutes
Target metric CPU utilization at 70%

The asymmetric stabilization windows are intentional. The short scale-up window allows the cluster to respond quickly to load spikes. The longer scale-down window prevents thrashing when load fluctuates around the threshold.

Overriding Defaults

HPA parameters are configured per runner type in organization.yaml. To change the replica range or stabilization windows for a specific runner, edit the corresponding entry and run tofu apply from the overlay.

Monitoring

Check HPA status for all runners in the namespace:

kubectl get hpa -n {org}-runners

This shows current and desired replica counts, current CPU utilization, and whether the autoscaler is actively scaling.

For more detail on a specific runner:

kubectl describe hpa runner-docker -n {org}-runners

Scaling Considerations