Application Scaling: FAQ

Why do I need a saved image to set up Application Scaling?

Application Scaling automatically launches and terminates compute nodes in response to a configured scaling policy. These nodes are typically responsible for serving your application workload. Working from a saved image ensures that no manual intervention is required when scaling up or down.

This is also about separation of concerns. The scaling scheduler (which we call the scaler) is responsible for booting up a node, while application developers are responsible for launching the target application on it.

Will I be charged for using the Application Scaling feature?

Application Scaling is free to use. However, the compute nodes launched by the scaler are charged based on usage. You can choose a compute plan for the group nodes when creating a scale group.

Why does the load balancer show backend connection failure?

The load balancer polls the backend nodes for health checks. If you see the error "backend connection failure" on a load balancer that is tied to a scale group, check whether your application is launched automatically when the node comes up.

We recommend using a systemd unit or a tool like supervisor to launch applications automatically at node startup. To debug, launch a node from your saved image and check whether your application is listening on the target port.
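The port check described above can be scripted. The following is a minimal sketch, assuming the application accepts plain TCP connections; the function name is_listening is hypothetical and is not part of any Application Scaling tooling.

```python
import socket


def is_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    This only shows that something accepts connections on the port; it does
    not verify that the application answers requests correctly.
    """
    try:
        # create_connection raises OSError (refused, unreachable, timed out)
        # when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

On a node launched from your saved image, calling is_listening("127.0.0.1", 8080) (8080 is a placeholder for your target port) shortly after boot should return True if the application started on its own.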

Is CPU Utilization a node-level metric?

Yes. The CPU Utilization metric that you see in a scaling policy rule is a node-level utilization metric; it is not measured at the scale group level. Using a load balancer (LB) with scale groups ensures that this metric doesn’t get skewed, so we recommend an LB for all use cases.

We will be adding more metrics in the near future. If you have a request, please write to us at

What is the cooldown period in a scale group?

The cooldown period is the time that your application node needs to become productive after launch. It can be anywhere between 120 and 300 seconds.

A simple way to estimate it is to add up:

  • Boot up time

  • Your application launch time

  • Time taken by LB to start redirecting requests to your node
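The estimate above can be sketched as a small calculation. The function name and the sample timings below are hypothetical; measure the three phases on a node launched from your own saved image.

```python
def estimate_cooldown(boot_s: float, app_launch_s: float, lb_ready_s: float) -> float:
    """Estimate the cooldown period in seconds as the sum of three phases.

    boot_s:       time for the node to boot
    app_launch_s: time for the application to launch
    lb_ready_s:   time for the LB to start redirecting requests to the node
    """
    if min(boot_s, app_launch_s, lb_ready_s) < 0:
        raise ValueError("durations must be non-negative")
    return boot_s + app_launch_s + lb_ready_s


# Hypothetical measurements, in seconds.
cooldown = estimate_cooldown(boot_s=60, app_launch_s=45, lb_ready_s=30)
print(cooldown)  # 135
```

With these sample values the estimate is 135 seconds. Rounding up a little leaves headroom for slower boots.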