Training Cluster
A Training Cluster lets you create a dedicated environment with a predefined allocation of RAM, CPU, and GPU resources. Pricing for the Training Cluster is fixed and does not depend on how much of the allocated resources you actually use. Creating deployments within the Training Cluster incurs no extra charges.
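To illustrate the fixed-pricing model, here is a minimal sketch in Python. The `ClusterPlan` class, the `cluster_cost` function, the plan name, and the rate are all hypothetical and are not part of any TIR API. The sketch only shows that the cost depends on the plan and the duration, not on utilization.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterPlan:
    """Hypothetical plan record; the name and rate are illustrative only."""
    name: str
    hourly_rate: float  # price per hour for the whole allocation (made-up value)


def cluster_cost(plan: ClusterPlan, hours: float, utilization: float) -> float:
    """Return the cost of holding the cluster for `hours`.

    `utilization` (0.0 to 1.0) is accepted only to make the point explicit:
    under fixed pricing it has no effect on the result.
    """
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0.0 and 1.0")
    if hours < 0:
        raise ValueError("hours must be non-negative")
    return plan.hourly_rate * hours


# Assumed example values, not real prices.
plan = ClusterPlan(name="example-plan", hourly_rate=100.0)
idle_cost = cluster_cost(plan, hours=24, utilization=0.05)
busy_cost = cluster_cost(plan, hours=24, utilization=0.95)
```

In this sketch, `idle_cost` and `busy_cost` are equal because the utilization argument never enters the calculation.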
Create Training Cluster
- To create a new Training Cluster, click the CREATE CLUSTER button under Training Cluster.
To Launch TIR Cluster
- Enter the desired Cluster Name, then select the Cluster Configuration by choosing a machine type with the required CPU and GPU and an available plan for the TIR Cluster.
- On this page, you can view the details of the selected plan. Depending on whether you choose an Hourly-Based Plan or a Committed Plan, the summary section displays the corresponding details and associated costs. After reviewing them, click CREATE.
To Launch Training Cluster In Private Cluster
- Enter the desired Cluster Name, then select the Cluster Configuration by choosing the Private Cluster.
- If you don't have a Private Cluster, click the Click here link, then click CREATE to launch a Private Cluster.
- On this page, you can view the details of the selected plan. Depending on whether you choose an Hourly-Based Plan or a Committed Plan, the summary section displays the corresponding details and associated costs. After reviewing them, click NEXT.
- Enter the desired Cluster Name, select the Node Count, the Node Configuration, and the plan, then click the LAUNCH button.
- Now select the Private Cluster and then the Plan. If the resources are not available, click Request; otherwise, click CREATE.
Manage Training Cluster
Overview
- You can view the details of the selected Training Cluster, including the Cluster Name, Number of Nodes, Plan Name, and the Cluster Node Configuration, which shows the number of GPUs and CPUs and the amount of RAM allocated within the cluster.
Monitoring
You can view the Disk Usage and Memory Usage for the selected node within the Training Cluster. The following metrics are also available: GPU Utilization, GPU Temperature, CPU Utilization, Memory Utilization, Disk Total Read Bytes, and Disk Total Write Bytes.
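If you want to cross-check some of these values from inside a node, the sketch below shows one possible approach in Python. It is not how the Monitoring tab collects its data. It assumes the node has the NVIDIA driver's `nvidia-smi` tool on its PATH. The function names (`disk_usage_percent`, `parse_gpu_csv`, `query_gpus`) are hypothetical helpers written for this example.

```python
import shutil
import subprocess
from typing import Dict, List


def disk_usage_percent(path: str = "/") -> float:
    """Return the percentage of disk space used on the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def parse_gpu_csv(text: str) -> List[Dict[str, int]]:
    """Parse lines of 'utilization, temperature' as printed by nvidia-smi
    with --format=csv,noheader,nounits. One dict is returned per GPU.

    Assumes both fields are integers; a GPU that reports a field as
    unsupported would need extra handling.
    """
    rows = []
    for line in text.strip().splitlines():
        util, temp = (field.strip() for field in line.split(","))
        rows.append({"utilization_pct": int(util), "temperature_c": int(temp)})
    return rows


def query_gpus() -> List[Dict[str, int]]:
    """Run nvidia-smi and return utilization and temperature for each GPU.

    Raises FileNotFoundError if nvidia-smi is not installed and
    subprocess.CalledProcessError if the command fails.
    """
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=utilization.gpu,temperature.gpu",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_gpu_csv(result.stdout)
```

Calling `query_gpus()` on a GPU node returns one entry per GPU, and `disk_usage_percent()` covers the disk usage figure. CPU and memory utilization would need an additional tool or library and are left out of this sketch.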