Introduction
E2E Auto Scaling enables you to dynamically scale compute nodes based on varying workloads and a defined policy. Using this feature, you can meet seasonal or varying demands on your infrastructure while optimizing cost.
The core unit of E2E Auto Scaling (EAS) is a scale group. The following list covers the features and capabilities of scale groups:
- Rule-based Policy setup for adding nodes based on workload
- Integration with Load Balancer to automatically list or de-list backend servers
- Automatic removal of nodes when utilization falls below a set threshold
- SSH access to each node enables activities like log viewing, debugging, etc.
Before you define your first scale group, we recommend familiarizing yourself with concepts and terminologies.
Concepts
Application Scaling helps you deliver consistent performance to your end users during high demand and reduce your spend during periods of low demand. The following section covers the key terminology used throughout this document:
Scaler
Scaler is the E2E service that manages Application Scaling functionality.
Scale Group
Scale Groups represent the nodes launched by Scaler. Each group adheres to a scaling policy (e.g., add a compute node when CPU utilization on an existing node hovers at 70% for 20 seconds).
Group nodes
The nodes in a scale group are dynamically added or removed. These nodes will be referred to as Group nodes or just nodes in the rest of the document.
The lifecycle of group nodes starts with the creation of the scale group and ends with the termination of the group. You are charged for the time between a node's start and its termination.
Saved Image
Due to the dynamic nature of nodes, you would want to automate the launch sequence of the application too. This is where the saved image comes into play.
A saved image is an image of a compute node that you saved, with the capability to launch your application at startup.
Compute Plan
The compute plan or plan is where you select infrastructure or hardware requirements for your group nodes. It need not be the same as the plan used to create your saved image.
Example Plan Sequence
- Create a node with a conservative plan for the application (e.g., C Series 1 CPU 1GB)
- Add a launch sequence to auto-install and start your application during startup (see the sketch after this list)
- Create a scale group with the actual plan required for your production servers (e.g., C Series 16 CPU 64 GB)
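For the second step above, the launch sequence can be automated in whatever way suits your image. The following is a minimal sketch only, assuming a systemd-based Linux image and a hypothetical application package named my-app; substitute your own install and start commands:
#!/bin/bash
# Hypothetical first-boot provisioning script baked into the saved image.
# It installs the application and registers it to start on every boot,
# so nodes launched from this image come up serving traffic unattended.
set -e
apt-get update -y
apt-get install -y my-app        # placeholder: replace with your application's install step
systemctl enable my-app          # start the service automatically on every boot
systemctl start my-app
Whether you use cloud-init, a systemd unit, or a startup script like this, the goal is the same: a node launched from the saved image should reach a serving state without manual intervention.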
Scaling Policy
A scaling policy determines the lifecycle of group nodes. It consists of a scaling expression together with the following parameters:
- Min. nodes
- Max. nodes
- Desired nodes (Cardinality)
- Watch Period and Period Duration
- Cooldown
A negative policy is automatically created by the Scaler to handle the termination of nodes. For example, if a user sets an expression of CPU > 80 for upscaling, the Scaler will create a downscaling policy of CPU < 80.
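Putting these pieces together, an illustrative policy (using values that appear elsewhere in this document; the maximum of 4 nodes is an arbitrary example) might look like this:
Min. nodes:       2
Max. nodes:       4
Desired nodes:    2
Expression:       CPU > 75   (the Scaler adds the downscaling counterpart CPU < 75 automatically)
Watch Period:     2
Period Duration:  10 seconds
Cooldown:         150 seconds (default)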
Min and Max nodes
These define the minimum and maximum guarantees for your scale group.
Cardinality or Desired Nodes
While the number of nodes is usually determined by the policy configuration, you can manually adjust it when necessary, such as during code or image updates.
Tip: Start with 2 nodes and let the scale group take over.
Performance or Target Metric
The target metric for a Default policy is CPU or Memory utilization; additional metrics are available through a Custom Policy (described later in this document).
Watch Periods
A watch period has two parts: Periods and Period Duration. Spikes in CPU usage must last for the watch period to trigger a scaling decision.
Example:
- Expression: CPU > 75
- Watch Period: 2
- Period Duration: 10 seconds
The scaler will monitor two consecutive periods of 10 seconds. If CPU utilization stays above 75% during this time, the scaling operation will initiate.
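The following shell sketch illustrates this evaluation conceptually; it is not the Scaler's actual implementation, and get_cpu_utilization is a hypothetical helper standing in for the group's average CPU metric:
#!/bin/bash
# Conceptual illustration of watch-period evaluation: the metric must stay above
# the threshold for N consecutive periods before a scaling operation is triggered.
THRESHOLD=75     # Expression: CPU > 75
PERIODS=2        # Watch Period
DURATION=10      # Period Duration in seconds
consecutive=0
while [ $consecutive -lt $PERIODS ]; do
    cpu=$(get_cpu_utilization)            # hypothetical helper returning average CPU %
    if [ "${cpu%.*}" -gt "$THRESHOLD" ]; then
        consecutive=$((consecutive + 1))  # the spike has persisted for another period
    else
        consecutive=0                     # any dip resets the watch period
    fi
    sleep $DURATION
done
echo "CPU stayed above ${THRESHOLD}% for ${PERIODS} consecutive ${DURATION}s periods: scale up"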
Cooldown
A cooldown period prevents scaling operations immediately after a previous operation. This ensures stability by waiting to assess the impact of the previous scaling action. Default: 150 seconds.
Load Balancer
Load balancers serve as entry points for scaling applications. As group nodes (and their IPs) change, the load balancer ensures consistent access for external users.
Always bundle your scale groups with a load balancer.
Define Scale Groups
Application scaling helps you optimize infrastructure by automatically adjusting the number of compute nodes based on a predefined policy. You can define a scale group (a pool of compute nodes) for any web frontend or backend application to maintain consistent performance.
Before you Begin
- To save an image of a Virtual Node, it must be powered down. To do this, click Power Off under the Actions section.
- After powering off the node, click the Actions button again, and then click Save Image.
Steps to Define Scale Groups
- Go to My Account
- Navigate to Compute > Auto Scaling
- Click Create a new Group
- Select a Saved Image that can launch your application at startup
- Select the compute plan according to your needs
Elastic Policy
Elastic policies allow the infrastructure to scale automatically based on conditions such as CPU utilization, network traffic, or request latency.
After selecting the plan, provide the scale group name, parameters, and policy.
Elastic Policy offers Default and Custom options:
- Default Policy: Scaling is based on CPU or Memory utilization
- Custom Policy: Specify custom metrics like memory usage or request count
Default Policy
- CPU: Resources scale based on CPU utilization
- Memory: Resources scale based on memory utilization
Custom Policy
A Custom Policy allows scaling based on a custom metric (e.g., disk I/O, network traffic). You'll receive a cURL command for updating the custom parameter from within scripts, cron jobs, or hooks.
Note: The default value of the custom parameter is set to 0.
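The exact command is generated for your scale group, so the snippet below is only an illustration of its general shape; it mirrors the OneGate update used later in this section, with CUSTOM_PARAM and the value 42 as placeholders:
# Illustrative shape only; use the command provided in the portal for your scale group.
curl -X "PUT" "$ONEGATE_ENDPOINT/vm" \
    --header "X-ONEGATE-TOKEN: $ONEGATE_TOKEN" \
    --header "X-ONEGATE-VMID: $VMID" \
    --data-binary "CUSTOM_PARAM = 42"    # placeholder attribute name and example value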
Manage Custom Policy
The custom policy feature in auto scaling enables you to define your own custom attribute, which the auto scaling service uses to make scaling decisions. You are responsible for setting this value on the service VM: first configure the attribute on an existing VM, then create a saved image from that VM, and use that image to create the Scaler service.
The custom policy attribute must be configured on the image used to create the Scaler service.
Custom Policy Name
The Custom Policy Name field is where you enter the name of the custom attribute that you want to use to monitor your service. The attribute can have any name you choose, but it helps to use names that describe the aspect of your service being monitored. For example, you could use "MEMORY" for memory usage, "NETTX" for network traffic, "DISKWRIOPS" for disk write operations, and so on.
Node Utilization Section
Specify the values that will trigger a scale-up (increase in cardinality) or scale-down (decrease in cardinality) operation based on your preferences.
Scaling Period Policy
You need to define the watch period, duration of each period, and cooldown period.
If your custom policy names are MEMORY, NETTX, NETRX, DISKWRIOPS, DISKRDIOPS, DISKWRBYTES, or DISKRDBYTES, you do not need any additional configuration. However, if you wish to adjust the cardinality of your auto-scaling service based on different attributes, you must configure them through your node. Scaling (incrementing and decrementing) occurs based on the average value of the custom policy attribute.
To Set Custom Attributes on Service Nodes, Follow These Steps:
- Create a new node.
- Establish an SSH connection to that node.
- Add the script provided below to your node and set up a cron job for it if desired.
Let us assume that you have set the custom policy name as "CUSTOM_ATT," with a max utilization of 60 units and a minimum utilization of 30 units. The cardinality will increase when the value of CUSTOM_ATT exceeds 60 units and decrease if it falls below 30 units.
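For example, to observe the effect of crossing the scale-up threshold, you could set the attribute manually from inside a group node using the OneGate CLI approach shown in Option 2 below (75 is simply an example value above the 60-unit threshold):
# Push a value above the scale-up threshold for the custom attribute CUSTOM_ATT
VMID=$(source /var/run/one-context/one_env; echo $VMID)
onegate vm update $VMID --data CUSTOM_ATT=75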
If your goal is to adjust the cardinality based on the percentage of memory utilization, you need to assign the CUSTOM_ATT attribute to the node. This attribute will monitor memory utilization through the script. To achieve this, create a cron job that monitors memory utilization and updates the attribute periodically.
When Writing the Script, You'll Need to Obtain the Following Information:
- ONEGATE_ENDPOINT
- TOKENTXT
- VMID
You can find these details at the following location:
/var/run/one-context/one_env
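Since the scripts below source this file directly, you can also inspect the three values yourself, for example:
# Print the OneGate connection details from the contextualization file
source /var/run/one-context/one_env
echo "Endpoint: $ONEGATE_ENDPOINT"
echo "Token:    $TOKENTXT"
echo "VM ID:    $VMID"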
To Create the Script in a .sh File, Follow These Steps:
- Create or update the file with the desired file name, such as file_name.sh.
- Inside the script file file_name.sh, you can begin writing your script.
You now have two options for writing the script inside file_name.sh: use either Option 1 or Option 2 below.
Option 1
#!/bin/bash
# Report memory utilization (%) to OneGate as the custom attribute CUSTOM_ATTR.

# Temporary file to hold the metric payload
TMP_DIR=`mktemp -d`
echo "" > $TMP_DIR/metrics

# Calculate used memory as a percentage of total memory
MEM_TOTAL=`grep MemTotal: /proc/meminfo | awk '{print $2}'`
MEM_FREE=`grep MemFree: /proc/meminfo | awk '{print $2}'`
MEM_USED=$(($MEM_TOTAL-$MEM_FREE))
MEM_USED_PERC="0"
if ! [ -z $MEM_TOTAL ] && [ $MEM_TOTAL -gt 0 ]; then
    MEM_USED_PERC=`echo "$MEM_USED $MEM_TOTAL" | \
        awk '{ printf "%.2f", 100 * $1 / $2 }'`
fi

# Write the custom attribute to the payload file
CUSTOM_ATTR=$MEM_USED_PERC
echo "CUSTOM_ATTR = $CUSTOM_ATTR" >> $TMP_DIR/metrics

# Read the OneGate endpoint, token, and VM ID from the contextualization file
VMID=$(source /var/run/one-context/one_env; echo $VMID)
ONEGATE_ENDPOINT=$(source /var/run/one-context/one_env; echo $ONEGATE_ENDPOINT)
ONEGATE_TOKEN=$(source /var/run/one-context/one_env; echo $TOKENTXT)

# Push the updated attribute to OneGate
curl -X "PUT" $ONEGATE_ENDPOINT/vm \
    --header "X-ONEGATE-TOKEN: $ONEGATE_TOKEN" \
    --header "X-ONEGATE-VMID: $VMID" \
    --data-binary @$TMP_DIR/metrics
Option 2
#!/bin/bash
# Report memory utilization (%) as CUSTOM_ATTR using the OneGate CLI instead of curl.

# Calculate used memory as a percentage of total memory
MEM_TOTAL=`grep MemTotal: /proc/meminfo | awk '{print $2}'`
MEM_FREE=`grep MemFree: /proc/meminfo | awk '{print $2}'`
MEM_USED=$(($MEM_TOTAL-$MEM_FREE))
MEM_USED_PERC="0"
if ! [ -z $MEM_TOTAL ] && [ $MEM_TOTAL -gt 0 ]; then
    MEM_USED_PERC=`echo "$MEM_USED $MEM_TOTAL" | \
        awk '{ printf "%.2f", 100 * $1 / $2 }'`
fi

# Read the VM ID from the contextualization file and update the attribute
VMID=$(source /var/run/one-context/one_env; echo $VMID)
onegate vm update $VMID --data CUSTOM_ATTR=$MEM_USED_PERC
To make the file_name.sh file executable, run the following command:
chmod +x file_name.sh
To run the file_name.sh script, use the following command:
./file_name.sh
This will execute the script and perform its intended actions.
For continuous monitoring of your attribute, however, you need to run the script as a cron job. To set this up, execute the following in the terminal:
crontab -e
This opens the crontab file, where you specify the schedule for the cron job and the location of the script.
Example:
* * * * * /path/to/your/file_name.sh (/root/file_name.sh)
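The five asterisks schedule the script every minute. If you prefer a less frequent update, adjust the schedule; for example:
*/5 * * * * /root/file_name.sh    # every 5 minutes
0 * * * * /root/file_name.sh      # at the start of every hour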
Afterward, create an image of that node.
Launch your auto scale group using a custom policy name (make sure to use the same name during configuration).
This setup will monitor the percentage of memory utilization and store it in the specified custom attribute (CUSTOM_ATTR). Based on the utilization values you provided for increasing and decreasing cardinality, scaling actions will be performed.
To see the attributes that have been set, you can use the command given below:
$ onegate vm show VMID --json
After running the above command, the details will be shown like this:
{
  "VM": {
    "NAME": "machine_name",
    "ID": "machine_id",
    "STATE": "machine_state",
    "LCM_STATE": "machine_lcm_state",
    "USER_TEMPLATE": {
      "CUSTOM_ATTR": "set_attribute",
      "DISTRO": "distro",
      "HOT_RESIZE": {
        "CPU_HOT_ADD_ENABLED": "NO",
        "MEMORY_HOT_ADD_ENABLED": "NO"
      },
      "HYPERVISOR": "kvm",
      "INPUTS_ORDER": "",
      "LOGO": "images/logos/centos.png",
      "LXD_SECURITY_PRIVILEGED": "true",
      "MEMORY_UNIT_COST": "MB",
      "MY_ACCOUNT_DISPLAY_CATEGORY": "Linux Virtual Node",
      "OS_TYPE": "CentOS-7.5",
      "SAVED_TEMPLATE_ID": "0",
      "SCHED_DS_REQUIREMENTS": "ID=\"0\"",
      "SCHED_REQUIREMENTS": "ID=\"10\" | ID=\"11\"",
      "SKU_TYPE": "sku_type",
      "TYPE": "Distro"
    },
    "TEMPLATE": {
      "NIC": [
        {
          "IP": "ip_add",
          "MAC": "mac_add",
          "NAME": "nic_name",
          "NETWORK": "your_network"
        }
      ],
      "NIC_ALIAS": []
    }
  }
}
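If you only want to verify the custom attribute rather than read the full JSON document, you can filter the output, for example:
$ onegate vm show VMID --json | grep CUSTOM_ATTR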
To see the VMID, you can use the command given below:
$ onegate vm show
The output will look like this:
VM 8
NAME : web_0_(service_1)
STATE : RUNNING
IP : 192.168.122.23
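With the output above, the VMID is 8, so the detailed JSON view for this node would be retrieved with:
$ onegate vm show 8 --json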
Schedule Policy
An autoscaling schedule policy allows you to define a predetermined schedule for automatically adjusting the capacity of your resources. For example, you may use a scheduled autoscaling policy to increase the number of instances in your service during peak traffic hours and then decrease the number of instances during off-peak hours.
Recurrence - In autoscaling, recurrence refers to scheduling scaling actions on a recurring basis. This is useful for applications with predictable traffic patterns, such as a website that receives more traffic on weekends or during peak business hours.
Cron - To configure recurrence in autoscaling, specify a cron expression. For example, the cron expression 0 0 * * * specifies that the scaling action should be run at midnight every day.
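The five fields of a cron expression are, from left to right: minute, hour, day of month, month, and day of week. A few illustrative expressions:
0 0 * * *       # every day at midnight (the example above)
*/30 * * * *    # every 30 minutes
0 6 1 * *       # at 06:00 on the first day of every month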
Upscale Recurrence
Specify the cardinality of nodes at a specific time by adjusting the field in the cron settings. Ensure the value is lower than the maximum number of nodes you set.
Downscale Recurrence
Specify the cardinality of nodes at a specific time by adjusting the field in the cron settings. Ensure the value is greater than the minimum number of nodes you set.
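As an illustration, assume a scale group with a minimum of 2 and a maximum of 8 nodes; the times and cardinalities below are example values chosen within that range:
Upscale recurrence:    0 9 * * 1-5     cardinality 6    # scale out at 09:00 on weekdays
Downscale recurrence:  0 18 * * 1-5    cardinality 3    # scale in at 18:00 on weekdays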
To choose a scheduled policy, select Schedule Policy instead of Elastic Policy, set the upscale and downscale recurrence, and click Create Scale.
Elastic and Scheduled Policy
If you want to create a scaler service using both options, choose the "Both" policy option, configure the parameters, and proceed.
- To see the scale group details, click Scale Group Details.
- To view the details of the active nodes, click the Active Node Details tab.
- To view the details of terminated nodes, click the Terminated Node Details tab.
- To view associated network details, click the Network tab.
- To view the details of the attached load balancer (LB), click the Attached LB tab.
- To view monitoring details, click the Monitoring tab.
- To view security group details, click the Security Group tab.
- To view log details, click the Logs tab.
Actions
Resize Service
To resize the service, click on the 3 dots menu and select Resize Services.
Start/Stop Action
The start/stop actions in the autoscaling service allow you to manage the service state.
Stop Action
The stop action halts the service within the autoscaling infrastructure.
- Process: When initiated, the service transitions to a stopped state after a brief period.
- State: The service is marked as 'stopped,' with desired nodes set to zero.
- Billing: Billing is paused during the stopped state.
Start Action
The start action begins the service within the autoscaling environment.
- Process: The service transitions to a running state after initiation.
- State: The service is in a running state, with desired nodes set to the minimum node count.
- Billing: Billing resumes upon starting the service.
Conclusion
Utilizing start and stop actions helps you manage autoscaling efficiently, controlling costs and resources.
Delete
To delete the service, click on the Delete button.
Edit & Update Auto Scale
To edit scale group details, such as parameters and policies, click the Edit icon. After making changes, click the Update button.
To switch the policy type, click the Edit icon and choose Elastic Policy, Scheduled Policy, or both from the dropdown.
Deleting a saved image associated with a Scale Group is not allowed. First, terminate the associated scale group to delete the saved image.