Private Cluster
Overview
Private Cluster enables you to create a dedicated pool of GPU-backed compute resources that can be shared across multiple teams and projects. The cluster is billed at a fixed price, independent of actual GPU utilization, providing predictable costs and guaranteed capacity.
Once a Private Cluster is created, you can deploy Nodes, Inference Endpoints, Training Clusters, and Vector Databases within it without incurring additional per-resource charges.
Who Should Use a Private Cluster?
Private Clusters are best suited for teams that:
- Run long-running or predictable AI/ML workloads
- Require guaranteed GPU availability
- Want to share GPUs across multiple projects without repeated provisioning
- Prefer fixed and predictable billing
- Need controlled access through role-based permissions
When to Choose a Private Cluster
Choose a Private Cluster if you want to:
- Avoid GPU shortages during peak demand
- Eliminate per-resource billing for nodes, inference, and databases
- Centrally manage GPU capacity across teams
- Control GPU allocation without deleting workloads
- Reduce operational overhead for large or growing AI teams
Key Benefits
- Fixed Pricing – Billing is independent of actual GPU usage
- No Hidden Costs – No extra charges for deploying services inside the cluster
- Guaranteed Capacity – GPUs are reserved exclusively for your organization
- Flexible Allocation – Dynamically allocate and deallocate GPUs across projects
- Secure Access Control – Governed by IAM roles and permissions
Creating a Private Cluster
Step-by-Step Guide
1. Navigate to Private Cluster from the sidebar under Products.
2. Click Create Private Cluster.
3. Select the required compute configuration and choose a pricing option:
   - Hourly – Flexible, pay-as-you-go pricing
   - Committed – Lower cost for predictable workloads with a fixed commitment period
4. Enter the Cluster Name and specify the Node Count.
5. Click Create.
Commitment Options (Committed Plans Only)
If you select a committed plan, choose what happens after the commitment period ends:
- Auto-Renew – Continue the commitment automatically
- Switch to Hourly Billing – Retain the cluster with hourly pricing
- Auto-Terminate – Automatically delete the cluster
Committed plans provide cost benefits for steady workloads. Nodes added to a committed cluster after it is created are billed at hourly rates.
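The choices made in the creation form can be summarized as a small data model. The following is a minimal sketch, not the platform's API: the class names, field names, and enum values are hypothetical and exist only to illustrate the rule that an end-of-commitment option applies to committed plans only. The requirement that Node Count be at least 1 is also an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Pricing(Enum):
    HOURLY = "hourly"
    COMMITTED = "committed"


class EndOfCommitment(Enum):
    AUTO_RENEW = "auto_renew"
    SWITCH_TO_HOURLY = "switch_to_hourly"
    AUTO_TERMINATE = "auto_terminate"


@dataclass(frozen=True)
class ClusterRequest:
    """Hypothetical container for the values entered in the creation form."""

    name: str
    node_count: int
    pricing: Pricing
    end_of_commitment: Optional[EndOfCommitment] = None

    def __post_init__(self) -> None:
        if not self.name.strip():
            raise ValueError("Cluster Name must not be empty")
        # Assumption: a cluster needs at least one node.
        if self.node_count < 1:
            raise ValueError("Node Count must be at least 1")
        committed = self.pricing is Pricing.COMMITTED
        if committed and self.end_of_commitment is None:
            raise ValueError("Committed plans need an end-of-commitment option")
        if not committed and self.end_of_commitment is not None:
            raise ValueError("End-of-commitment options apply to committed plans only")


# A committed cluster that falls back to hourly billing when the term ends.
request = ClusterRequest(
    name="research-pool",
    node_count=4,
    pricing=Pricing.COMMITTED,
    end_of_commitment=EndOfCommitment.SWITCH_TO_HOURLY,
)
```

Validating in one place keeps an invalid combination, such as an hourly cluster with Auto-Renew, from being constructed at all.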
Understanding Roles and Permissions
Private Cluster access is controlled using IAM roles, ensuring that only authorized users can manage cluster capacity.
How Access Works
- Admins and Owners manage cluster creation and capacity.
- Team Leads and Project Leads can deallocate nodes within their scope: all projects in the team for a Team Lead, the assigned project for a Project Lead.
- Team Members typically have read-only visibility unless explicitly granted additional permissions.
Role-Based Access Matrix
| Role | View Cluster | Create / Update Cluster | Allocate Nodes | Deallocate Nodes | Scope |
|---|---|---|---|---|---|
| Admin / Owner | Yes | Yes | Yes | Yes | Cluster (CRN) |
| Team Lead | Yes | No | No | Yes | All projects in team |
| Project Lead | Yes | No | No | Yes | Assigned project only |
| Team Member | Yes (Read-only) | No | No | Yes (if permitted) | As per IAM policy |
Common Scenarios
- I am a Team Lead and want to free GPUs from a project → Allowed
- I am a Project Lead and want to resize the cluster → Not allowed
- I am a Member and want to view cluster usage → Allowed
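The access matrix and the scenarios above can be expressed as a simple lookup. This is an illustrative sketch only: the role keys, the `Action` enum, and the `is_allowed` function are hypothetical names, not part of the platform's IAM API. Scope (team, project, or cluster) is not modeled here, and the Team Member "Yes (if permitted)" cell is treated as requiring an explicit grant.

```python
from enum import Enum


class Action(Enum):
    VIEW = "view"
    CREATE_UPDATE = "create_update"
    ALLOCATE = "allocate"
    DEALLOCATE = "deallocate"


# Actions each role can always perform, transcribed from the access matrix.
ALWAYS_ALLOWED = {
    "admin_owner": {Action.VIEW, Action.CREATE_UPDATE, Action.ALLOCATE, Action.DEALLOCATE},
    "team_lead": {Action.VIEW, Action.DEALLOCATE},
    "project_lead": {Action.VIEW, Action.DEALLOCATE},
    "team_member": {Action.VIEW},
}

# Actions allowed only with an explicit IAM grant ("Yes (if permitted)").
ALLOWED_WITH_GRANT = {
    "team_member": {Action.DEALLOCATE},
}


def is_allowed(role: str, action: Action, has_explicit_grant: bool = False) -> bool:
    """Return True if the role may perform the action per the matrix."""
    if action in ALWAYS_ALLOWED[role]:
        return True
    return has_explicit_grant and action in ALLOWED_WITH_GRANT.get(role, set())


# The three common scenarios:
print(is_allowed("team_lead", Action.DEALLOCATE))       # Team Lead frees GPUs
print(is_allowed("project_lead", Action.CREATE_UPDATE))  # Project Lead resizes
print(is_allowed("team_member", Action.VIEW))            # Member views usage
```

A real check would also compare the requested project or team against the role's scope column before returning a decision.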
How Node Allocation Works
Node Lifecycle States
Each node in a Private Cluster exists in one of the following states:
- Free – Available for allocation
- Allocated – Assigned to a project
- Occupied – Actively running workloads (Nodes, Inference, Training, Databases)
Understanding these states helps you manage capacity without disrupting running services.
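As a rough illustration of how the three states translate into a capacity view, the sketch below tallies a list of node states. The `NodeState` enum and `summarize` function are hypothetical names introduced for this example; the platform may expose this information differently.

```python
from collections import Counter
from enum import Enum
from typing import Dict, Iterable


class NodeState(Enum):
    FREE = "Free"            # available for allocation
    ALLOCATED = "Allocated"  # assigned to a project
    OCCUPIED = "Occupied"    # actively running workloads


def summarize(states: Iterable[NodeState]) -> Dict[str, int]:
    """Count nodes per state, including states with zero nodes."""
    counts = Counter(states)
    return {state.value: counts[state] for state in NodeState}


# Example: a five-node cluster.
cluster = [
    NodeState.FREE,
    NodeState.ALLOCATED,
    NodeState.OCCUPIED,
    NodeState.OCCUPIED,
    NodeState.FREE,
]
summary = summarize(cluster)
print(summary)
```

The Free count is the capacity that can be allocated to a project right away, while the Occupied count shows how many nodes are running workloads.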