Overview

What is a Private Cluster?

A Private Cluster is a dedicated pool of GPU-backed compute resources that you can share across multiple projects. Because the pool is reserved for you, capacity is guaranteed; because the cluster is billed at a fixed price, independent of actual GPU utilization, costs are predictable.

Once a Private Cluster is created, you can deploy Nodes, Inference Endpoints, Training Clusters, and Vector Databases within it without incurring additional per-service charges.


What this means in practice

Capability | Benefit
Dedicated GPU pool | Guaranteed hardware exclusively for your organization
Fixed billing | Predictable costs independent of GPU utilization
Multi-service deployment | Nodes, inference, training, and databases at no extra charge
Flexible allocation | Distribute and reclaim GPUs across projects dynamically
Role-based access | IAM-governed permissions for admins, leads, and members

Who Should Use a Private Cluster?

Private Clusters are best suited for teams that:

  • Run long-running or predictable AI/ML workloads
  • Require guaranteed GPU availability
  • Want to share GPUs across multiple projects without repeated provisioning
  • Prefer fixed and predictable billing
  • Need controlled access through role-based permissions

When to Choose a Private Cluster

Choose a Private Cluster if you want to:

  • Avoid GPU shortages during peak demand
  • Eliminate per-resource billing for nodes, inference, training, and databases
  • Centrally manage GPU capacity across projects
  • Control GPU allocation without deleting workloads
  • Reduce operational overhead for large or growing AI organizations

Key Benefits

  • Fixed Pricing – Billing is independent of actual GPU usage
  • No Hidden Costs – No extra charges for deploying services inside the cluster
  • Guaranteed Capacity – GPUs are reserved exclusively for your organization
  • Flexible Allocation – Dynamically allocate and deallocate GPUs across projects
  • Secure Access Control – Governed by IAM roles and permissions
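
To make the idea of flexible allocation concrete, here is a minimal conceptual sketch in Python. It is not the platform's API: the `GpuPool` class, its methods, and the project names are all hypothetical and exist only to illustrate how a fixed pool of GPUs might be distributed across projects and reclaimed later.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class GpuPool:
    """Hypothetical model of a fixed-size GPU pool shared by several projects."""

    total_gpus: int
    allocations: Dict[str, int] = field(default_factory=dict)

    @property
    def free_gpus(self) -> int:
        # GPUs in the pool that are not currently assigned to any project.
        return self.total_gpus - sum(self.allocations.values())

    def allocate(self, project: str, count: int) -> None:
        # Assign GPUs to a project; the pool can never be oversubscribed.
        if count <= 0:
            raise ValueError("count must be positive")
        if count > self.free_gpus:
            raise ValueError("not enough free GPUs in the pool")
        self.allocations[project] = self.allocations.get(project, 0) + count

    def reclaim(self, project: str, count: int) -> None:
        # Return GPUs from a project to the pool so other projects can use them.
        held = self.allocations.get(project, 0)
        if count <= 0 or count > held:
            raise ValueError("cannot reclaim more GPUs than the project holds")
        if count == held:
            del self.allocations[project]
        else:
            self.allocations[project] = held - count


# Hypothetical usage: an 8-GPU pool shared by two projects.
pool = GpuPool(total_gpus=8)
pool.allocate("project-a", 5)
pool.allocate("project-b", 3)
pool.reclaim("project-a", 2)  # project-a keeps 3 GPUs, 2 return to the pool
print(pool.free_gpus)         # 2
```

The pool size stays constant throughout, which mirrors the fixed-capacity, fixed-price model: allocation decides only which project uses the GPUs, not how many exist.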

Private Cluster vs On-Demand Instances

Feature | Private Cluster | On-Demand Instance
Billing model | Fixed per node | Per-hour per instance
GPU availability | Guaranteed | Subject to inventory
Multi-project sharing | Yes | No
Services included | Nodes, Inference, Training, VectorDB | Per-instance only
Best for | Enterprise, long-running workloads | Short-term, variable workloads
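
The difference between the two billing models can be checked with a simple break-even calculation. The sketch below is illustrative only: the prices are made-up placeholders rather than actual rates, the billing period is assumed to be the same for both models, and the function names are hypothetical.

```python
def break_even_hours(fixed_price: float, hourly_rate: float) -> float:
    """Hours of on-demand use at which both billing models cost the same."""
    if hourly_rate <= 0:
        raise ValueError("hourly_rate must be positive")
    return fixed_price / hourly_rate


def cheaper_model(fixed_price: float, hourly_rate: float, expected_hours: float) -> str:
    """Compare the fixed price with the on-demand cost for the expected usage."""
    on_demand_cost = hourly_rate * expected_hours
    if on_demand_cost > fixed_price:
        return "private_cluster"
    if on_demand_cost < fixed_price:
        return "on_demand"
    return "equal"


# Placeholder prices for illustration only (not real rates).
FIXED_PRICE_PER_NODE = 1500.0   # assumed fixed price per node per billing period
ON_DEMAND_HOURLY_RATE = 3.0     # assumed on-demand price per instance-hour

print(break_even_hours(FIXED_PRICE_PER_NODE, ON_DEMAND_HOURLY_RATE))    # 500.0
print(cheaper_model(FIXED_PRICE_PER_NODE, ON_DEMAND_HOURLY_RATE, 720))  # private_cluster
print(cheaper_model(FIXED_PRICE_PER_NODE, ON_DEMAND_HOURLY_RATE, 100))  # on_demand
```

With these placeholder numbers, a workload that runs for more than 500 hours in the billing period costs less on the fixed model, which matches the table's guidance that long-running workloads suit a Private Cluster and short-term, variable workloads suit on-demand instances. The comparison covers price only; it does not account for guaranteed availability or the services included in the cluster.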