
Private Cluster

Overview

Private Cluster enables you to create a dedicated pool of GPU-backed compute resources that can be shared across multiple teams and projects. The cluster is billed at a fixed price, independent of actual GPU utilization, providing predictable costs and guaranteed capacity.

Once a Private Cluster is created, you can deploy Nodes, Inference Endpoints, Training Clusters, and Vector Databases within it without incurring additional charges.


Who Should Use a Private Cluster?

Private Clusters are best suited for teams that:

  • Run long-running or predictable AI/ML workloads
  • Require guaranteed GPU availability
  • Want to share GPUs across multiple projects without repeated provisioning
  • Prefer fixed and predictable billing
  • Need controlled access through role-based permissions

When to Choose a Private Cluster

Choose a Private Cluster if you want to:

  • Avoid GPU shortages during peak demand
  • Eliminate per-resource billing for nodes, inference, and databases
  • Centrally manage GPU capacity across teams
  • Control GPU allocation without deleting workloads
  • Reduce operational overhead for large or growing AI teams

Key Benefits

  • Fixed Pricing – Billing is independent of actual GPU usage
  • No Hidden Costs – No extra charges for deploying services inside the cluster
  • Guaranteed Capacity – GPUs are reserved exclusively for your organization
  • Flexible Allocation – Dynamically allocate and deallocate GPUs across projects
  • Secure Access Control – Governed by IAM roles and permissions

Creating a Private Cluster

Step-by-Step Guide

  1. Navigate to Private Cluster from the sidebar under Products.
  2. Click Create Private Cluster.
  3. Select the required compute configuration and choose a pricing option:
    • Hourly – Flexible, pay-as-you-go pricing
    • Committed – Lower cost for predictable workloads with a fixed commitment period
  4. Enter the Cluster Name and specify the Node Count.
  5. Click Create.

Commitment Options (Committed Plans Only)

If you select a committed plan, choose what happens after the commitment period ends:

  • Auto-Renew – Continue the commitment automatically
  • Switch to Hourly Billing – Retain the cluster with hourly pricing
  • Auto-Terminate – Automatically delete the cluster

Note: Committed plans provide cost benefits for steady workloads. Any additional nodes added later are billed at hourly rates.
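
The billing rules above can be sketched as a small cost estimate. This is an illustration only: the `ClusterPlan` type, the per-node-hour rates, and the function names are hypothetical and do not come from the platform's API or price list. It assumes committed nodes are priced per node-hour at a committed rate, and that nodes added later use the hourly rate, as the note states.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class EndOfCommitment(Enum):
    """The three choices offered when a commitment period ends."""
    AUTO_RENEW = "auto_renew"
    SWITCH_TO_HOURLY = "switch_to_hourly"
    AUTO_TERMINATE = "auto_terminate"


@dataclass(frozen=True)
class ClusterPlan:
    committed_nodes: int   # nodes covered by the commitment
    extra_nodes: int       # nodes added later, billed at hourly rates
    committed_rate: float  # hypothetical price per node-hour under commitment
    hourly_rate: float     # hypothetical hourly price per node-hour


def estimate_cost(plan: ClusterPlan, hours: float) -> float:
    """Estimate cost from node count and time only.

    GPU utilization is deliberately not an input: the cluster is billed
    at a fixed price regardless of how busy the GPUs are.
    """
    committed = plan.committed_nodes * plan.committed_rate * hours
    extra = plan.extra_nodes * plan.hourly_rate * hours
    return committed + extra


def billing_after_commitment(choice: EndOfCommitment) -> Optional[str]:
    """Return the billing mode after the commitment ends, or None if the cluster is deleted."""
    outcomes = {
        EndOfCommitment.AUTO_RENEW: "committed",
        EndOfCommitment.SWITCH_TO_HOURLY: "hourly",
        EndOfCommitment.AUTO_TERMINATE: None,
    }
    return outcomes[choice]


# Example with made-up rates: 4 committed nodes plus 2 nodes added later, over 10 hours.
plan = ClusterPlan(committed_nodes=4, extra_nodes=2, committed_rate=2.0, hourly_rate=3.0)
total = estimate_cost(plan, hours=10)
```

Because utilization never appears in the calculation, an idle cluster and a fully loaded one produce the same estimate for the same node count and duration.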


Understanding Roles and Permissions

Private Cluster access is controlled using IAM roles, ensuring that only authorized users can manage cluster capacity.

How Access Works

  • Admins and Owners manage cluster creation and capacity.
  • Team Leads and Project Leads manage GPU usage within their scope.
  • Team Members typically have read-only visibility unless explicitly granted permissions.

Role-Based Access Matrix

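Role            View Cluster      Create / Update Cluster   Allocate Nodes   Deallocate Nodes     Scope
--------------  ----------------  ------------------------  ---------------  -------------------  --------------------
Admin / Owner   Yes               Yes                       Yes              Yes                  Cluster (CRN)
Team Lead       Yes               No                        No               Yes                  All projects in team
Project Lead    Yes               No                        No               Yes                  Assigned project only
Team Member     Yes (Read-only)   No                        No               Yes (if permitted)   As per IAM policy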

Common Scenarios

  • I am a Team Lead and want to free GPUs from a project → Allowed
  • I am a Project Lead and want to resize the cluster → Not allowed
  • I am a Member and want to view cluster usage → Allowed
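
The access matrix and scenarios above can be expressed as a simple lookup. This is a minimal sketch, not the platform's IAM implementation: the role keys, the `Action` enum, and `is_allowed` are hypothetical names chosen for illustration. Scope (team, project, or cluster) is not modeled here, and the `iam_grants` argument stands in for whatever the IAM policy explicitly permits.

```python
from enum import Enum
from typing import FrozenSet


class Action(Enum):
    VIEW = "view"
    CREATE_UPDATE = "create_update"
    ALLOCATE = "allocate"
    DEALLOCATE = "deallocate"


# Actions each role holds by default, transcribed from the access matrix.
ACCESS_MATRIX = {
    "admin_owner": {Action.VIEW, Action.CREATE_UPDATE, Action.ALLOCATE, Action.DEALLOCATE},
    "team_lead": {Action.VIEW, Action.DEALLOCATE},
    "project_lead": {Action.VIEW, Action.DEALLOCATE},
    "team_member": {Action.VIEW},
}


def is_allowed(role: str, action: Action, iam_grants: FrozenSet[Action] = frozenset()) -> bool:
    """Check a role against the matrix; scope checks are out of this sketch."""
    if action in ACCESS_MATRIX[role]:
        return True
    # Team Members may deallocate only when the IAM policy explicitly permits it.
    return (
        role == "team_member"
        and action is Action.DEALLOCATE
        and action in iam_grants
    )


# The three common scenarios:
team_lead_frees_gpus = is_allowed("team_lead", Action.DEALLOCATE)          # allowed
project_lead_resizes = is_allowed("project_lead", Action.CREATE_UPDATE)    # not allowed
member_views_usage = is_allowed("team_member", Action.VIEW)                # allowed
```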

How Node Allocation Works

Node Lifecycle States

Each node in a Private Cluster exists in one of the following states:

  • Free – Available for allocation
  • Allocated – Assigned to a project
  • Occupied – Actively running workloads (Nodes, Inference, Training, Databases)

Understanding these states helps you manage capacity without disrupting running services.
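
As a rough illustration of how these states relate to capacity planning, the sketch below counts nodes per state and checks whether a request can be served from free nodes. The `NodeState` enum and both helper functions are hypothetical names, not part of the platform's API, and the sketch assumes that only nodes in the Free state can be newly allocated.

```python
from collections import Counter
from enum import Enum
from typing import Dict, List


class NodeState(Enum):
    FREE = "free"            # available for allocation
    ALLOCATED = "allocated"  # assigned to a project
    OCCUPIED = "occupied"    # actively running workloads


def capacity_summary(states: List[NodeState]) -> Dict[NodeState, int]:
    """Count nodes in each state, including states with zero nodes."""
    counts = Counter(states)
    return {state: counts.get(state, 0) for state in NodeState}


def can_allocate(states: List[NodeState], requested: int) -> bool:
    """Assumption: a request is satisfiable only from nodes that are Free."""
    return capacity_summary(states)[NodeState.FREE] >= requested


# Example: a four-node cluster with two free nodes.
nodes = [NodeState.FREE, NodeState.FREE, NodeState.ALLOCATED, NodeState.OCCUPIED]
summary = capacity_summary(nodes)
```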