Rio deploys a simple autoscaler to watch metrics from workloads and scale applications based on current in-flight requests.
Note: Metrics are scraped from the linkerd-proxy sidecar, so your application must be injected with the Linkerd sidecar. This happens by default when running new workloads.
To enable autoscaling:
$ rio run --scale 1-10 -p 8080 strongmonkey1992/autoscale:v0

# to allow higher concurrency
$ rio run --scale 1-10 --concurrency 20 -p 8080 strongmonkey1992/autoscale:v0

# to scale to zero
$ rio run --scale 0-10 -p 8080 strongmonkey1992/autoscale:v0
To test autoscaling, put load on the service with hey:
$ hey -z 3m -c 60 http://xxx-xx.xxxxxx.on-rio-io

# watch the scale of your service
$ watch rio ps
Concurrency is the maximum number of in-flight requests each pod can handle. For example, if the total number of in-flight requests is 60 and concurrency is 10, Rio will scale the workload to 6 replicas.
Note: When an application is scaled to zero, the first request will take longer, since a pod must be started before the request can be served.
Check the autoscaler logs for details on how metrics are collected and how scaling decisions are made:
$ rio -s logs autoscaler