Auto Scaling Concepts

E2E Auto Scaling lets you grow and shrink a pool of compute nodes automatically, based on a policy you define. It keeps your application responsive during periods of high demand and reduces your spend when demand drops - without manual intervention. Most workflows start from Compute > Auto Scaling in the MyAccount left navigation.

This page explains the concepts and terminology used throughout the Auto Scaling documentation. Read it once before you create your first scale group.


Why Use Auto Scaling

Benefit	What it gives you
Cost efficiency	Nodes are added only when demand rises and removed when it falls, so you avoid over-provisioning.
Consistent performance	Capacity tracks real-time load, so users see steady response times during traffic spikes.
Less manual work	Node addition, removal, and load-balancer registration happen automatically.
Control	Scale on CPU by default, or use a custom policy to scale on any metric you define (memory, network, disk I/O, request count).
Higher availability	Combined with a load balancer, traffic is spread across healthy nodes as the pool changes.
Predictable schedules	Scheduled policies pre-scale capacity for known peak and off-peak windows.

Scaler

The Scaler is the E2E service that manages all Auto Scaling functionality. It watches your metrics, applies your policy, and launches or terminates nodes on your behalf.

Scale Group

A scale group is the core unit of Auto Scaling - a pool of compute nodes that the Scaler manages according to the policies you attach (for example, "add a node when CPU stays above 60% for 30 seconds"). A scale group has a name, a saved image, a compute plan, minimum, maximum, and desired node counts, one or more policies, and an attached network and security group.

Group Nodes

The nodes inside a scale group are added and removed dynamically by the Scaler. These are called group nodes (or just nodes) throughout this documentation.

The lifecycle of a group node begins when the Scaler launches it and ends when the Scaler terminates it. You are billed for the time between a node's start and its termination, on an hourly basis.

Saved Image

Because group nodes are created on demand, you need a way to bring your application up automatically at boot. That is what a saved image provides.

A saved image is an image captured from a compute node - ideally one configured to launch your application at startup. Every scale group is built from a saved image, and only images in the Ready state can be selected.

tip

Configure the node so your application starts on boot (for example, with a systemd service or a startup script), then save the image. New group nodes will then come up serving traffic with no manual steps.

Compute Plan

The compute plan (or just plan) defines the hardware - vCPU, RAM, and storage - for your group nodes. It does not have to match the plan used to create your saved image.

Example plan sequence

Create a node on a small, economical plan and configure your application's launch sequence.
Save an image from that node.
Create a scale group from that image but choose the plan you actually need in production.

Scaling Policy

A scaling policy drives the lifecycle of group nodes. The portal supports three policy types, and you can use one or combine them:

Policy type	When the Scaler acts	Documented in
Elastic Policy	When a monitored metric (CPU or a custom attribute) crosses your threshold for a sustained watch period.	Elastic Policy
Scheduled Policy	At specific times you define with cron expressions (for example, scale up at 9 AM on weekdays).	Scheduled Policy
Elastic and Scheduled Policy	Both of the above together. This is the portal's default selection.	Elastic and Scheduled Policy
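
The scheduled example in the table (scale up at 9 AM on weekdays) corresponds to the standard five-field cron expression `0 9 * * 1-5`. The sketch below is a minimal matcher for illustration only. It assumes standard five-field cron syntax (minute, hour, day of month, month, day of week, with Sunday as 0) and supports only `*`, single numbers, ranges, and comma lists. The function names are hypothetical, and the cron dialect the portal accepts may differ.

```python
from datetime import datetime


def field_matches(field: str, value: int) -> bool:
    """Return True if one cron field (e.g. '*', '9', '1-5', '1,3,5') matches value."""
    for part in field.split(","):
        if part == "*":
            return True
        if "-" in part:
            low, high = part.split("-")
            if int(low) <= value <= int(high):
                return True
        elif int(part) == value:
            return True
    return False


def cron_matches(expression: str, moment: datetime) -> bool:
    """Check a five-field cron expression against a datetime (minute resolution).

    Simplification: all five fields must match. Real cron treats day of month
    and day of week as alternatives when both are restricted.
    """
    minute, hour, day, month, weekday = expression.split()
    # Python counts Monday as 0; cron counts Sunday as 0, so convert.
    cron_weekday = (moment.weekday() + 1) % 7
    return (
        field_matches(minute, moment.minute)
        and field_matches(hour, moment.hour)
        and field_matches(day, moment.day)
        and field_matches(month, moment.month)
        and field_matches(weekday, cron_weekday)
    )


WEEKDAY_9AM = "0 9 * * 1-5"  # 09:00, Monday through Friday
```

For instance, `cron_matches(WEEKDAY_9AM, datetime(2024, 1, 1, 9, 0))` checks a Monday at 09:00, while the same time on a Saturday would not match.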

A policy is built from these settings:

Minimum nodes and Maximum nodes
Desired nodes (cardinality)
Policy parameter (the metric being watched)
Scale-up and scale-down thresholds
Watch period and period duration
Cooldown
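
To see how these settings fit together, the sketch below groups them into one record. It is illustrative only: the class and field names are hypothetical and do not reflect the portal's or the API's actual schema. The minimum, maximum, and cooldown defaults are the portal defaults stated on this page; the other values are placeholders borrowed from this page's examples.

```python
from dataclasses import dataclass


@dataclass
class ElasticPolicy:
    """Hypothetical container for the policy settings (not the real API schema)."""

    min_nodes: int = 2                  # portal default minimum
    max_nodes: int = 5                  # portal default maximum
    desired_nodes: int = 2              # cardinality; placeholder starting value
    parameter: str = "CPU"              # the metric being watched
    scale_up_threshold: float = 60.0    # placeholder, from the CPU > 60 example
    scale_down_threshold: float = 30.0  # placeholder, from the CPU < 30 example
    watch_period: int = 2               # consecutive periods; placeholder
    period_duration_s: int = 10         # seconds per period; placeholder
    cooldown_s: int = 150               # portal default cooldown

    def observation_window_s(self) -> int:
        """Total time the metric must breach the threshold before scaling."""
        return self.watch_period * self.period_duration_s
```

With the placeholder values above, the observation window is 2 periods of 10 seconds each, or 20 seconds.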

When you set an elastic upscale rule, the Scaler automatically creates the matching downscale (negative) rule. For example, an upscale expression of CPU > 60 produces a downscale expression of CPU < 30.

Minimum and Maximum Nodes

These set the floor and ceiling for the scale group. The Scaler never drops below the minimum or rises above the maximum. The portal default is a minimum of 2 and a maximum of 5, and the maximum cannot exceed 50.
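
The floor-and-ceiling behaviour can be pictured as a simple clamp. This is a sketch of the rule as described here, not the Scaler's actual implementation; `clamp_node_count` and `MAX_NODES_LIMIT` are hypothetical names.

```python
MAX_NODES_LIMIT = 50  # the maximum cannot exceed 50, per this page


def clamp_node_count(requested: int, min_nodes: int = 2, max_nodes: int = 5) -> int:
    """Keep a requested node count within the group's minimum and maximum."""
    if max_nodes > MAX_NODES_LIMIT:
        raise ValueError(f"max_nodes must not exceed {MAX_NODES_LIMIT}")
    if min_nodes > max_nodes:
        raise ValueError("min_nodes must not exceed max_nodes")
    # Raise the request to the floor, or lower it to the ceiling, as needed.
    return max(min_nodes, min(requested, max_nodes))
```

With the portal defaults, a request for 1 node is raised to 2, and a request for 8 nodes is capped at 5.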

Desired Nodes (Cardinality)

The desired node count is normally determined by the policy, but you can adjust it manually - for example, to pre-warm extra capacity before a code or image update. See Resize a scale group.

tip

Start with 2 nodes and let the scale group take over from there.

Policy Parameter (Target Metric)

The default elastic policy parameter is CPU utilization. To scale on any other signal - memory, network traffic, disk I/O, request count, and so on - use a custom policy, which lets you publish your own metric to the node. See Custom Scaling Policies.

Watch Period and Cooldown

Watch Period

The watch period is defined by two settings: Watch Period (the number of consecutive periods to observe) and Period Duration (the length of each period in seconds). The monitored metric must stay beyond the threshold (above it for a scale-up, below it for a scale-down) for every period before the Scaler acts. This prevents brief spikes from triggering a scale operation.

Example

Expression: CPU > 75
Watch Period: 2
Period Duration: 10 seconds

The Scaler watches two consecutive 10-second periods. If CPU stays above 75% for both, the scale-up begins.
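
The check in this example can be sketched as follows. The sketch assumes the Scaler reduces each period to a single reading (for instance, an average), which this page does not specify; `should_scale_up` is a hypothetical name, not part of any E2E API.

```python
from typing import Sequence


def should_scale_up(period_readings: Sequence[float], threshold: float, watch_period: int) -> bool:
    """Return True if the most recent `watch_period` readings all exceed the threshold."""
    if watch_period <= 0 or len(period_readings) < watch_period:
        return False  # not enough periods observed yet
    recent = period_readings[-watch_period:]
    return all(value > threshold for value in recent)
```

With `threshold=75` and `watch_period=2`, readings of `[80, 90]` qualify for a scale-up, while `[80, 70]` do not because the second period dipped below the threshold.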

Cooldown

The cooldown is a pause after a scaling action during which no further scaling occurs, giving the system time to absorb the effect of the previous action. The portal default is 150 seconds.
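
A minimal sketch of the cooldown rule, assuming the pause is measured from the moment of the previous scaling action. `CooldownGate` is a hypothetical helper, not the Scaler's implementation; timestamps are passed in explicitly (for example, from `time.monotonic()`) so the logic is easy to follow.

```python
from typing import Optional


class CooldownGate:
    """Hypothetical helper that blocks scaling actions during the cooldown window."""

    def __init__(self, cooldown_s: float = 150.0) -> None:  # 150 s is the portal default
        self.cooldown_s = cooldown_s
        self.last_action_at: Optional[float] = None

    def can_scale(self, now: float) -> bool:
        """Return True if no action has happened yet or the cooldown has elapsed."""
        if self.last_action_at is None:
            return True
        return now - self.last_action_at >= self.cooldown_s

    def record_action(self, now: float) -> None:
        """Remember when the latest scaling action took place."""
        self.last_action_at = now
```

After `record_action(0.0)`, a call to `can_scale(100.0)` is refused because only 100 of the 150 seconds have passed; `can_scale(150.0)` is allowed.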

Load Balancer

A load balancer is the stable entry point for a scale group. As group nodes (and their IP addresses) come and go, the load balancer keeps a single, consistent address for your users and automatically registers or deregisters backend nodes as the Scaler changes the pool.

tip

Always pair a scale group with a load balancer. Attach the scale group as an Auto Scale Group backend - see Backend Mapping.

Related Resources

Resource	Use it for
Create a Scale Group	Step-by-step creation flow.
Custom Scaling Policies	Scale on memory, network, disk, or any custom metric.
Auto Scale Encryption	Encrypt group nodes at rest.
Autoscale for E1 Series	Storage, IOPS, and billing for E1-series images.
Manage Scale Groups	Operate, edit, and monitor an existing scale group.
Auto Scaling API	Automate scale groups.


Last updated on June 26, 2026.
