Overview
What is a Private Cluster?
A Private Cluster gives you a dedicated pool of GPU-backed compute resources that can be shared across multiple projects. The cluster is billed at a fixed price, independent of actual GPU utilization, providing predictable costs and guaranteed capacity.
Once a Private Cluster is created, you can deploy Nodes, Inference Endpoints, Training Clusters, and Vector Databases within it without incurring additional per-service charges.
What this means in practice
| Capability | Benefit |
|---|---|
| Dedicated GPU pool | Guaranteed hardware exclusively for your organization |
| Fixed billing | Predictable costs independent of GPU utilization |
| Multi-service deployment | Nodes, inference, training, and databases at no extra charge |
| Flexible allocation | Distribute and reclaim GPUs across projects dynamically |
| Role-based access | IAM-governed permissions for admins, leads, and members |
Who Should Use a Private Cluster?
Private Clusters are best suited for teams that:
- Run long-running or predictable AI/ML workloads
- Require guaranteed GPU availability
- Want to share GPUs across multiple projects without repeated provisioning
- Prefer fixed and predictable billing
- Need controlled access through role-based permissions
When to Choose a Private Cluster?
Choose a Private Cluster if you want to:
- Avoid GPU shortages during peak demand
- Eliminate per-resource billing for nodes, inference, and databases
- Centrally manage GPU capacity across projects
- Control GPU allocation without deleting workloads
- Reduce operational overhead for large or growing AI organizations
Key Benefits
- Fixed Pricing – Billing is independent of actual GPU usage
- No Hidden Costs – No extra charges for deploying services inside the cluster
- Guaranteed Capacity – GPUs are reserved exclusively for your organization
- Flexible Allocation – Dynamically allocate and deallocate GPUs across projects
- Secure Access Control – Governed by IAM roles and permissions
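The "flexible allocation" benefit above can be pictured as a fixed pool from which projects borrow and return GPUs. The sketch below is an illustrative model only; the class and method names (`GpuPool`, `allocate`, `reclaim`) are hypothetical and do not represent the platform's actual API.

```python
# Illustrative model of flexible GPU allocation within a Private Cluster.
# All names here (GpuPool, allocate, reclaim) are hypothetical -- they
# sketch the concept, not the platform's real API.

class GpuPool:
    """A fixed pool of GPUs shared across multiple projects."""

    def __init__(self, total_gpus: int):
        self.total_gpus = total_gpus
        self.allocations: dict[str, int] = {}  # project name -> GPUs held

    @property
    def available(self) -> int:
        return self.total_gpus - sum(self.allocations.values())

    def allocate(self, project: str, count: int) -> None:
        """Assign GPUs from the shared pool to a project."""
        if count > self.available:
            raise ValueError(f"only {self.available} GPUs free")
        self.allocations[project] = self.allocations.get(project, 0) + count

    def reclaim(self, project: str, count: int) -> None:
        """Return GPUs to the pool without deleting the project's workloads."""
        held = self.allocations.get(project, 0)
        if count > held:
            raise ValueError(f"{project} holds only {held} GPUs")
        self.allocations[project] = held - count


pool = GpuPool(total_gpus=16)
pool.allocate("training", 12)
pool.allocate("inference", 4)
pool.reclaim("training", 8)     # free capacity without tearing anything down
pool.allocate("vector-db", 6)
print(pool.available)           # 2 GPUs still unassigned
```

The key point the model captures: because billing is fixed for the whole pool, moving GPUs between projects changes nothing on the invoice.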
Private Cluster vs On-Demand Instances
| Feature | Private Cluster | On-Demand Instance |
|---|---|---|
| Billing model | Fixed per node | Per-hour per instance |
| GPU availability | Guaranteed | Subject to inventory |
| Multi-project sharing | Yes | No |
| Services included | Nodes, Inference, Training, VectorDB | Per-instance only |
| Best for | Enterprise, long-running workloads | Short-term, variable workloads |
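The billing trade-off in the table above comes down to utilization: a fixed-price node wins once it is busy enough. A rough back-of-the-envelope comparison, using entirely hypothetical prices (the platform's actual rates will differ):

```python
# Hedged sketch: break-even utilization between a fixed-price Private
# Cluster node and an equivalent per-hour on-demand instance.
# Both prices below are placeholder assumptions, not published rates.

FIXED_MONTHLY_PRICE = 2000.0   # hypothetical fixed price per node per month
ON_DEMAND_HOURLY = 4.0         # hypothetical on-demand rate per instance-hour
HOURS_PER_MONTH = 730

def on_demand_cost(utilization: float) -> float:
    """Monthly on-demand cost at a given utilization fraction (0.0-1.0)."""
    return ON_DEMAND_HOURLY * HOURS_PER_MONTH * utilization

# Utilization at which both billing models cost the same.
break_even = FIXED_MONTHLY_PRICE / (ON_DEMAND_HOURLY * HOURS_PER_MONTH)
print(f"break-even utilization: {break_even:.0%}")
```

With these placeholder numbers, a node busy more than roughly two-thirds of the month is cheaper under fixed pricing; short-lived or bursty workloads favor on-demand, matching the "Best for" row above.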