
Auto Scaling Concepts

E2E Auto Scaling lets you grow and shrink a pool of compute nodes automatically, based on a policy you define. It keeps your application responsive during periods of high demand and reduces your spend when demand drops — without manual intervention. Most workflows start from Compute > Auto Scaling in the MyAccount left navigation.

This page explains the concepts and terminology used throughout the Auto Scaling documentation. Read it once before you create your first scale group.


Why Use Auto Scaling

Benefit | What it gives you
Cost efficiency | Nodes are added only when demand rises and removed when it falls, so you avoid over-provisioning.
Consistent performance | Capacity tracks real-time load, so users see steady response times during traffic spikes.
Less manual work | Node addition, removal, and load-balancer registration happen automatically.
Control | Custom policies let you scale on CPU or any metric you define (memory, network, disk I/O, request count).
Higher availability | Combined with a load balancer, traffic is spread across healthy nodes as the pool changes.
Predictable schedules | Scheduled policies pre-scale capacity for known peak and off-peak windows.

Scaler

The Scaler is the E2E service that manages all Auto Scaling functionality. It watches your metrics, applies your policy, and launches or terminates nodes on your behalf.

Scale Group

A scale group is the core unit of Auto Scaling: a pool of compute nodes that the Scaler manages according to the policies you attach (for example, "add a node when CPU stays above 60% for 30 seconds"). A scale group has a name, a saved image, a compute plan, minimum, maximum, and desired node counts, one or more policies, and an attached network and security group.

Group Nodes

The nodes inside a scale group are added and removed dynamically by the Scaler. These are called group nodes (or just nodes) throughout this documentation.

The lifecycle of a group node begins when the Scaler launches it and ends when the Scaler terminates it. You are billed for the time between a node's start and its termination, on an hourly basis.

Saved Image

Because group nodes are created on demand, you need a way to bring your application up automatically at boot. That is what a saved image provides.

A saved image is an image captured from a compute node — ideally one configured to launch your application at startup. Every scale group is built from a saved image, and only images in the Ready state can be selected.

Tip: Configure the node so your application starts on boot (for example, with a systemd service or a startup script), then save the image. New group nodes will then come up serving traffic with no manual steps.

Compute Plan

The compute plan (or just plan) defines the hardware — vCPU, RAM, and storage — for your group nodes. It does not have to match the plan used to create your saved image.

Example plan sequence

  1. Create a node on a small, economical plan and configure your application's launch sequence.
  2. Save an image from that node.
  3. Create a scale group from that image but choose the plan you actually need in production.

Scaling Policy

A scaling policy drives the lifecycle of group nodes. The portal supports three policy types, and you can use one or combine them:

Policy type | When the Scaler acts | Documented in
Elastic Policy | When a monitored metric (CPU or a custom attribute) crosses your threshold for a sustained watch period. | Elastic Policy
Scheduled Policy | At specific times you define with cron expressions (for example, scale up at 9 AM on weekdays). | Scheduled Policy
Elastic and Scheduled Policy | Both of the above together. This is the portal's default selection. | Elastic and Scheduled Policy
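
To make the "9 AM on weekdays" schedule concrete, the sketch below assumes the standard five-field cron layout (minute, hour, day of month, month, day of week), in which that schedule is written as 0 9 * * 1-5. The exact cron dialect the portal accepts is not stated on this page, so treat the expression and the helper names (field_matches, cron_matches) as illustrative assumptions rather than the Scaler's own parser.

```python
from datetime import datetime


def field_matches(field: str, value: int) -> bool:
    """Match one cron field; supports '*', single numbers, ranges (a-b), and comma lists."""
    if field == "*":
        return True
    for part in field.split(","):
        if "-" in part:
            low, high = part.split("-")
            if int(low) <= value <= int(high):
                return True
        elif int(part) == value:
            return True
    return False


def cron_matches(expression: str, moment: datetime) -> bool:
    """Return True if `moment` falls on the schedule described by `expression`.

    Simplified on purpose: step values (*/5) and names (MON) are not handled,
    and day-of-month and day-of-week are both required to match.
    """
    minute, hour, day, month, weekday = expression.split()
    # Python counts Monday as 0; standard cron counts Sunday as 0.
    cron_weekday = (moment.weekday() + 1) % 7
    return (
        field_matches(minute, moment.minute)
        and field_matches(hour, moment.hour)
        and field_matches(day, moment.day)
        and field_matches(month, moment.month)
        and field_matches(weekday, cron_weekday)
    )


# Assumed expression for "scale up at 9 AM on weekdays".
WEEKDAY_9AM = "0 9 * * 1-5"
```

Checking an expression against a handful of known dates like this is a cheap way to confirm a schedule before saving it in a policy.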

A policy is built from these settings:

  • Minimum nodes and Maximum nodes
  • Desired nodes (cardinality)
  • Policy parameter (the metric being watched)
  • Scale-up and scale-down thresholds
  • Watch period and period duration
  • Cooldown

When you set an elastic upscale rule, the Scaler automatically creates the matching downscale (negative) rule. For example, an upscale expression of CPU > 60 produces a downscale expression of CPU < 30.
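
The paired rules leave a band between the two thresholds in which nothing happens. The sketch below uses the example values above (CPU > 60 and CPU < 30) to show that band. The function name and return labels are hypothetical, and how the Scaler derives the downscale threshold is not described here, so both thresholds are passed in explicitly.

```python
def scaling_decision(cpu_percent: float,
                     up_threshold: float = 60.0,
                     down_threshold: float = 30.0) -> str:
    """Classify a CPU reading against an upscale/downscale rule pair.

    Thresholds default to the example values CPU > 60 and CPU < 30.
    """
    if down_threshold >= up_threshold:
        raise ValueError("down_threshold must be lower than up_threshold")
    if cpu_percent > up_threshold:
        return "scale_up"
    if cpu_percent < down_threshold:
        return "scale_down"
    # Readings between the two thresholds trigger neither rule.
    return "hold"
```

Keeping a gap between the two thresholds stops the group from adding a node and then immediately removing it when load hovers around a single value.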

Minimum and Maximum Nodes

These set the floor and ceiling for the scale group. The Scaler never drops below the minimum or rises above the maximum. The portal default is a minimum of 2 and a maximum of 5, and the maximum cannot exceed 50.
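
As a rough illustration of the floor and ceiling, the sketch below clamps a requested node count to the group's limits, using the portal defaults (2 and 5) and the stated cap of 50. The function and constant names are hypothetical and are not part of any E2E API.

```python
PORTAL_MAX_NODES = 50  # upper limit for the maximum, as stated above


def clamp_node_count(requested: int, minimum: int = 2, maximum: int = 5) -> int:
    """Return the node count the group would settle on for a requested size."""
    if minimum > maximum:
        raise ValueError("minimum cannot exceed maximum")
    if maximum > PORTAL_MAX_NODES:
        raise ValueError(f"maximum cannot exceed {PORTAL_MAX_NODES}")
    # Never drop below the floor or rise above the ceiling.
    return max(minimum, min(requested, maximum))
```

With the defaults, a policy asking for 7 nodes is held at 5, and one asking for 1 node is held at 2.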

Desired Nodes (Cardinality)

The desired node count is normally determined by the policy, but you can adjust it manually — for example, to pre-warm extra capacity before a code or image update. See Resize a scale group.

Tip: Start with 2 nodes and let the scale group take over from there.

Policy Parameter (Target Metric)

The default elastic policy parameter is CPU utilization. To scale on any other signal — memory, network traffic, disk I/O, request count, and so on — use a custom policy, which lets you publish your own metric to the node. See Custom Scaling Policies.

Watch Period and Cooldown

Watch Period

The watch window is defined by two settings: Watch Period (the number of consecutive periods) and Period Duration (the length of each period in seconds). The monitored metric must stay beyond the threshold (above it for a scale-up, below it for a scale-down) for every period in the window before the Scaler acts. This prevents brief spikes from triggering a scale operation.

Example

  • Expression: CPU > 75
  • Watch Period: 2
  • Period Duration: 10 seconds

The Scaler watches two consecutive 10-second periods. If CPU stays above 75% for both, the scale-up begins.
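
The example above can be sketched as a check over the most recent per-period readings. This assumes one CPU value per period (how the Scaler samples or aggregates within a period is not specified here), and the function name is hypothetical.

```python
from typing import Sequence


def watch_period_satisfied(readings: Sequence[float],
                           threshold: float,
                           watch_period: int) -> bool:
    """Return True if the last `watch_period` readings all exceed `threshold`.

    `readings` holds one value per period, oldest first.
    """
    if watch_period <= 0:
        raise ValueError("watch_period must be a positive integer")
    if len(readings) < watch_period:
        # Not enough history yet to cover the full watch window.
        return False
    return all(value > threshold for value in readings[-watch_period:])
```

With Expression CPU > 75 and Watch Period 2, readings of 80 and 90 satisfy the check, while 80 followed by 70 does not, because the second period dipped below the threshold.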

Cooldown

The cooldown is a pause after a scaling action during which no further scaling occurs, giving the system time to absorb the effect of the previous action. The portal default is 150 seconds.
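
A minimal sketch of the cooldown check, assuming timestamps in seconds and the portal default of 150 seconds. Whether the Scaler treats the exact boundary as inside or outside the cooldown is not documented here, so the comparison below is an assumption, as is the function name.

```python
def in_cooldown(last_action_at: float,
                now: float,
                cooldown_seconds: float = 150.0) -> bool:
    """Return True while a new scaling action should still be suppressed.

    Both timestamps are seconds on the same clock (for example, time.time()).
    """
    elapsed = now - last_action_at
    # Assumption: scaling is allowed again once the full cooldown has elapsed.
    return elapsed < cooldown_seconds
```

For instance, a threshold breach 100 seconds after the last scale-up is ignored, while one at 150 seconds or later can trigger a new action.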

Load Balancer

A load balancer is the stable entry point for a scale group. As group nodes (and their IP addresses) come and go, the load balancer keeps a single, consistent address for your users and automatically lists or de-lists backend nodes as the Scaler changes the pool.

Tip: Always pair a scale group with a load balancer. Attach the scale group as an Auto Scale Group backend. See Backend Mapping.


Resource | Use it for
Create a Scale Group | Step-by-step creation flow.
Custom Scaling Policies | Scale on memory, network, disk, or any custom metric.
Auto Scale Encryption | Encrypt group nodes at rest.
Autoscale for E1 Series | Storage, IOPS, and billing for E1-series images.
Manage Scale Groups | Operate, edit, and monitor an existing scale group.
Auto Scaling API | Automate scale groups.

Last updated on June 9, 2026.