Auto Scaling Concepts
E2E Auto Scaling lets you grow and shrink a pool of compute nodes automatically, based on a policy you define. It keeps your application responsive during periods of high demand and reduces your spend when demand drops — without manual intervention. Most workflows start from Compute > Auto Scaling in the MyAccount left navigation.
This page explains the concepts and terminology used throughout the Auto Scaling documentation. Read it once before you create your first scale group.
Why Use Auto Scaling
| Benefit | What it gives you |
|---|---|
| Cost efficiency | Nodes are added only when demand rises and removed when it falls, so you avoid over-provisioning. |
| Consistent performance | Capacity tracks real-time load, so users see steady response times during traffic spikes. |
| Less manual work | Node addition, removal, and load-balancer registration happen automatically. |
| Control | Custom policies let you scale on CPU or any metric you define (memory, network, disk I/O, request count). |
| Higher availability | Combined with a load balancer, traffic is spread across healthy nodes as the pool changes. |
| Predictable schedules | Scheduled policies pre-scale capacity for known peak and off-peak windows. |
Scaler
The Scaler is the E2E service that manages all Auto Scaling functionality. It watches your metrics, applies your policy, and launches or terminates nodes on your behalf.
Scale Group
A scale group is the core unit of Auto Scaling: a pool of compute nodes that the Scaler manages according to the policies you attach (for example, "add a node when CPU stays above 60% for 30 seconds"). Each scale group has a name, a saved image, a compute plan, minimum, maximum, and desired node counts, one or more policies, and an attached network and security group.
Group Nodes
The nodes inside a scale group are added and removed dynamically by the Scaler. These are called group nodes (or just nodes) throughout this documentation.
The lifecycle of a group node begins when the Scaler launches it and ends when the Scaler terminates it. You are billed for the time between a node's start and its termination, on an hourly basis.
Saved Image
Because group nodes are created on demand, you need a way to bring your application up automatically at boot. That is what a saved image provides.
A saved image is an image captured from a compute node — ideally one configured to launch your application at startup. Every scale group is built from a saved image, and only images in the Ready state can be selected.
Configure the node so your application starts on boot (for example, with a systemd service or a startup script), then save the image. New group nodes will then come up serving traffic with no manual steps.
Compute Plan
The compute plan (or just plan) defines the hardware — vCPU, RAM, and storage — for your group nodes. It does not have to match the plan used to create your saved image.
Example plan sequence
- Create a node on a small, economical plan and configure your application's launch sequence.
- Save an image from that node.
- Create a scale group from that image but choose the plan you actually need in production.
Scaling Policy
A scaling policy drives the lifecycle of group nodes. The portal supports three policy types, and you can use one or combine them:
| Policy type | When the Scaler acts | Documented in |
|---|---|---|
| Elastic Policy | When a monitored metric (CPU or a custom attribute) crosses your threshold for a sustained watch period. | Elastic Policy |
| Scheduled Policy | At specific times you define with cron expressions (for example, scale up at 9 AM on weekdays). | Scheduled Policy |
| Elastic and Scheduled Policy | Both of the above together. This is the portal's default selection. | Elastic and Scheduled Policy |
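To make the cron-based trigger concrete, the following is a minimal sketch of how an expression such as "9 AM on weekdays" can be checked against a timestamp. It assumes the standard five-field cron layout (minute, hour, day of month, month, day of week, with Sunday as 0); the exact syntax the portal accepts is not specified on this page. The function names `cron_matches` and `_field_matches` are hypothetical, and the sketch handles only `*`, single values, ranges, and comma lists (no step values), and simply requires every field to match.

```python
from datetime import datetime


def _field_matches(field: str, value: int) -> bool:
    """Return True if one cron field ('*', '9', '1-5', '0,30') matches value."""
    for part in field.split(","):
        if part == "*":
            return True
        if "-" in part:
            low, high = part.split("-")
            if int(low) <= value <= int(high):
                return True
        elif int(part) == value:
            return True
    return False


def cron_matches(expression: str, when: datetime) -> bool:
    """Check a five-field cron expression against a timestamp (simplified)."""
    minute, hour, day, month, weekday = expression.split()
    # cron counts Sunday as 0; isoweekday() counts Monday=1 .. Sunday=7.
    cron_weekday = when.isoweekday() % 7
    return (
        _field_matches(minute, when.minute)
        and _field_matches(hour, when.hour)
        and _field_matches(day, when.day)
        and _field_matches(month, when.month)
        and _field_matches(weekday, cron_weekday)
    )


# "Scale up at 9 AM on weekdays" written as a cron expression.
WEEKDAY_MORNING = "0 9 * * 1-5"
monday_9am = datetime(2024, 1, 1, 9, 0)    # a Monday
saturday_9am = datetime(2024, 1, 6, 9, 0)  # a Saturday
print(cron_matches(WEEKDAY_MORNING, monday_9am))
print(cron_matches(WEEKDAY_MORNING, saturday_9am))
```

The Monday timestamp matches and the Saturday one does not, which is the behavior you would expect from a weekday-only schedule.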
A policy is built from these settings:
- Minimum nodes and Maximum nodes
- Desired nodes (cardinality)
- Policy parameter (the metric being watched)
- Scale-up and scale-down thresholds
- Watch period and period duration
- Cooldown
When you set an elastic upscale rule, the Scaler automatically creates the matching downscale (negative) rule. For example, an upscale expression of CPU > 60 produces a downscale expression of CPU < 30.
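The paired rules leave a band between the two thresholds in which the Scaler does nothing. The sketch below illustrates that band using the example values from the text (60 and 30). It is an illustration only: this page does not state how the Scaler derives the downscale threshold, so both thresholds are passed in explicitly, and the function name `scaling_decision` is hypothetical. Watch period and cooldown are ignored here.

```python
def scaling_decision(
    cpu_percent: float,
    upscale_above: float = 60.0,    # example upscale expression: CPU > 60
    downscale_below: float = 30.0,  # example downscale expression: CPU < 30
) -> str:
    """Classify a CPU reading against a paired upscale/downscale rule."""
    if cpu_percent > upscale_above:
        return "scale_up"
    if cpu_percent < downscale_below:
        return "scale_down"
    # Between the two thresholds neither rule fires.
    return "hold"


for reading in (75.0, 45.0, 20.0):
    print(reading, scaling_decision(reading))
```

A reading of 45% falls between the two thresholds, so neither rule applies and the node count stays where it is.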
Minimum and Maximum Nodes
These set the floor and ceiling for the scale group. The Scaler never drops below the minimum or rises above the maximum. The portal default is a minimum of 2 and a maximum of 5, and the maximum cannot exceed 50.
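The floor-and-ceiling behavior amounts to clamping a requested node count into the configured range. The following sketch shows that idea with the portal defaults (2 and 5) and the stated limit of 50. The names `validate_bounds` and `clamp_desired` are hypothetical, and the check that the minimum is not negative is an assumption rather than a documented rule.

```python
PORTAL_MAX_NODES = 50  # the maximum cannot exceed 50, per the text above


def validate_bounds(minimum: int, maximum: int) -> None:
    """Raise ValueError if the min/max pair is not a usable range."""
    if minimum < 0:  # assumption: a negative node count is never valid
        raise ValueError("minimum must not be negative")
    if minimum > maximum:
        raise ValueError("minimum must not exceed maximum")
    if maximum > PORTAL_MAX_NODES:
        raise ValueError(f"maximum must not exceed {PORTAL_MAX_NODES}")


def clamp_desired(desired: int, minimum: int = 2, maximum: int = 5) -> int:
    """Clamp a requested node count into [minimum, maximum]."""
    validate_bounds(minimum, maximum)
    return max(minimum, min(desired, maximum))


print(clamp_desired(1))  # below the floor
print(clamp_desired(3))  # inside the range
print(clamp_desired(8))  # above the ceiling
```

With the defaults, a request for 1 node is raised to 2 and a request for 8 nodes is capped at 5.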
Desired Nodes (Cardinality)
The desired node count is normally determined by the policy, but you can adjust it manually — for example, to pre-warm extra capacity before a code or image update. See Resize a scale group.
Start with 2 nodes and let the scale group take over from there.
Policy Parameter (Target Metric)
The default elastic policy parameter is CPU utilization. To scale on any other signal — memory, network traffic, disk I/O, request count, and so on — use a custom policy, which lets you publish your own metric to the node. See Custom Scaling Policies.
Watch Period and Cooldown
Watch Period
The watch window is defined by two settings: Watch Period (the number of consecutive periods) and Period Duration (the length of each period in seconds). The monitored metric must stay over the threshold for every one of those periods, that is, for Watch Period × Period Duration seconds in total, before the Scaler acts. This prevents brief spikes from triggering a scale operation.
Example
- Expression: CPU > 75
- Watch Period: 2
- Period Duration: 10 seconds
The Scaler watches two consecutive 10-second periods. If CPU stays above 75% for both, the scale-up begins.
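The watch-period check can be sketched as "the most recent N period values all exceed the threshold". The example below assumes each period is summarized as a single value (for instance an average); how the Scaler aggregates readings within a period is not described on this page. The function name `threshold_breached` is hypothetical.

```python
from typing import Sequence


def threshold_breached(
    period_values: Sequence[float],
    threshold: float,
    watch_period: int,
) -> bool:
    """Return True if the last `watch_period` values all exceed `threshold`.

    `period_values` holds one summarized metric value per period, oldest first.
    """
    if watch_period < 1:
        raise ValueError("watch_period must be at least 1")
    if len(period_values) < watch_period:
        return False  # not enough history yet
    recent = period_values[-watch_period:]
    return all(value > threshold for value in recent)


# Expression CPU > 75, Watch Period 2 (each value covers one 10-second period).
print(threshold_breached([80.0, 90.0], threshold=75.0, watch_period=2))
print(threshold_breached([80.0, 70.0], threshold=75.0, watch_period=2))
```

The first series stays above 75% for both periods and would trigger a scale-up; the second dips to 70% in the latest period, so the spike is ignored.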
Cooldown
The cooldown is a pause after a scaling action during which no further scaling occurs, giving the system time to absorb the effect of the previous action. The portal default is 150 seconds.
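A cooldown can be modeled as a gate that remembers when the last scaling action happened. The sketch below is an illustration, not the Scaler's implementation: the class name `CooldownGate` is hypothetical, timestamps are plain seconds supplied by the caller, and treating the exact 150-second mark as "cooldown over" is an assumption.

```python
from typing import Optional


class CooldownGate:
    """Blocks further scaling until the cooldown has elapsed."""

    def __init__(self, cooldown_seconds: float = 150.0) -> None:
        # 150 seconds is the portal default mentioned above.
        self.cooldown_seconds = cooldown_seconds
        self._last_action_at: Optional[float] = None

    def can_scale(self, now: float) -> bool:
        """Return True if no action has run yet or the cooldown has passed."""
        if self._last_action_at is None:
            return True
        return now - self._last_action_at >= self.cooldown_seconds

    def record_action(self, now: float) -> None:
        """Remember when the most recent scaling action happened."""
        self._last_action_at = now


gate = CooldownGate()
gate.record_action(now=0.0)      # a scale-up happens at t = 0 s
print(gate.can_scale(now=100.0))  # still inside the cooldown
print(gate.can_scale(now=150.0))  # cooldown has elapsed
```

A longer cooldown makes the group react more slowly but reduces the chance of repeated scale operations while new nodes are still warming up.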
Load Balancer
A load balancer is the stable entry point for a scale group. As group nodes (and their IP addresses) come and go, the load balancer keeps a single, consistent address for your users and automatically adds or removes backend nodes as the Scaler changes the pool.
Always pair a scale group with a load balancer. Attach the scale group as an Auto Scale Group backend — see Backend Mapping.
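Conceptually, keeping the backend list in step with the pool is a set comparison between the addresses the load balancer currently knows and the addresses of the current group nodes. The sketch below shows only that comparison; the function name `plan_backend_changes` and the sample IP addresses are hypothetical, and the load balancer performs this work for you.

```python
from typing import Set, Tuple


def plan_backend_changes(
    registered: Set[str],
    group_nodes: Set[str],
) -> Tuple[Set[str], Set[str]]:
    """Return (addresses to add, addresses to remove) for the backend list."""
    to_add = group_nodes - registered      # new nodes not yet behind the LB
    to_remove = registered - group_nodes   # terminated nodes still listed
    return to_add, to_remove


registered = {"10.0.0.1", "10.0.0.2"}
group_nodes = {"10.0.0.2", "10.0.0.3"}  # .1 was terminated, .3 was launched
to_add, to_remove = plan_backend_changes(registered, group_nodes)
print(sorted(to_add), sorted(to_remove))
```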
Related Resources
| Resource | Use it for |
|---|---|
| Create a Scale Group | Step-by-step creation flow. |
| Custom Scaling Policies | Scale on memory, network, disk, or any custom metric. |
| Auto Scale Encryption | Encrypt group nodes at rest. |
| Autoscale for E1 Series | Storage, IOPS, and billing for E1-series images. |
| Manage Scale Groups | Operate, edit, and monitor an existing scale group. |
| Auto Scaling API | Automate scale groups. |