Multi-Instance GPU (MIG)
Introduction
At E2E Cloud, we offer Multi-Instance GPU (MIG) support as part of our Private Cloud / Private Cluster offerings. MIG is a feature of NVIDIA GPUs based on the Ampere architecture and later, enabling us to securely partition a single physical GPU into as many as seven independent GPU instances.
Each GPU instance functions like a separate GPU, allowing multiple workloads or users to run in parallel on the same physical GPU. This enables better GPU utilization, especially for workloads that do not require the full compute capacity of an entire GPU.
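Partitioning is performed with the `nvidia-smi` tool. The sketch below builds the typical command sequence as argument lists so the workflow is explicit; on a real node these commands require administrator privileges and a MIG-capable GPU, so they are shown rather than executed. The helper name and the example profiles are illustrative, not part of any E2E Cloud tooling.

```python
# Hedged sketch of the nvidia-smi MIG workflow: enable MIG mode, inspect the
# available GPU instance profiles, then create instances. The commands are
# built (not run) here; running them requires root on a MIG-capable GPU.
def mig_setup_commands(gpu_index: int, profiles: list[str]) -> list[list[str]]:
    return [
        # 1. Enable MIG mode on the target GPU (may require a GPU reset).
        ["nvidia-smi", "-i", str(gpu_index), "-mig", "1"],
        # 2. List the GPU instance profiles the driver supports.
        ["nvidia-smi", "mig", "-lgip"],
        # 3. Create GPU instances; -C also creates default compute instances.
        ["nvidia-smi", "mig", "-cgi", ",".join(profiles), "-C"],
    ]

for cmd in mig_setup_commands(0, ["3g.40gb", "3g.40gb"]):
    print(" ".join(cmd))
```

`nvidia-smi mig -cgi` accepts either numeric profile IDs or profile names such as `3g.40gb`; the example above splits GPU 0 into two half-GPU instances.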

MIG in AI Cloud Private Clusters
We provide Multi-Instance GPU (MIG) as part of our Private Cluster offering to help customers efficiently utilize dedicated GPU nodes. Each private cluster node may contain multiple GPUs (for example, 8 GPUs per node), and enabling MIG allows these GPUs to be partitioned into smaller, isolated instances.
This model is well suited for workloads such as notebooks, inference services, and lightweight training jobs that do not need exclusive access to a full GPU.
By offering MIG in private clusters, we enable:
- Parallel execution of multiple GPU workloads on the same physical GPU
- Improved utilization of dedicated GPU resources
- Efficient GPU sharing across projects, jobs, and services within a private environment
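To make the capacity math concrete, the sketch below computes how many isolated instances an 8-GPU node (the example above) yields for each H100 profile. The instance counts come from Table 1 later in this document; the node size and helper name are illustrative assumptions.

```python
# Illustrative capacity math for a private-cluster node. Per-GPU instance
# counts are taken from the H100 profile table in this document; the 8-GPU
# node is the example size used in the text, not a fixed configuration.
GPUS_PER_NODE = 8

# profile name -> maximum instances per physical H100 80GB GPU
H100_INSTANCES_PER_GPU = {
    "1g.10gb": 7,
    "1g.20gb": 4,
    "2g.20gb": 3,
    "3g.40gb": 2,
    "4g.40gb": 1,
    "7g.80gb": 1,
}

def instances_per_node(profile: str, gpus: int = GPUS_PER_NODE) -> int:
    """Total isolated GPU instances a node provides for one profile."""
    return H100_INSTANCES_PER_GPU[profile] * gpus

print(instances_per_node("1g.10gb"))  # 56 isolated instances on an 8-GPU node
```

A single node can therefore serve dozens of notebook or inference workloads, each with its own hardware-isolated slice.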
Performance Isolation and Resource Guarantees
MIG provides hardware-level isolation for each GPU instance: every instance receives dedicated, isolated paths through the GPU's memory and compute system, including:
- Streaming Multiprocessors (SMs)
- L2 cache banks
- Memory controllers
- DRAM bandwidth
This design ensures that workloads running in one MIG instance do not interfere with workloads running in another. Even if one workload heavily consumes compute or memory resources, other instances continue to operate with predictable throughput and latency.
These guarantees allow us to deliver a defined quality of service (QoS) for workloads running in:
- Virtual machines
- Containers
- Individual processes
MIG can also be used alongside GPU pass-through and virtual GPU (vGPU) configurations, while maintaining strong isolation guarantees between workloads.
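At the process level, a workload is pinned to one MIG instance by listing that instance's UUID in `CUDA_VISIBLE_DEVICES`; CUDA then enumerates only that device, so the process cannot touch the rest of the GPU. The sketch below shows this mechanism under stated assumptions: the UUID is a placeholder (real ones come from `nvidia-smi -L`), and the helper name is illustrative.

```python
# Minimal sketch of per-process MIG targeting. A real MIG UUID (from
# `nvidia-smi -L`) looks like "MIG-<uuid>"; the one in the usage comment
# below is a placeholder, not a real device.
import os
import subprocess

def run_on_mig_instance(cmd: list[str], mig_uuid: str) -> subprocess.CompletedProcess:
    """Launch a process that sees only the given MIG instance.

    CUDA enumerates only the devices named in CUDA_VISIBLE_DEVICES, so the
    child process gets exactly one hardware-isolated GPU instance.
    """
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=mig_uuid)
    return subprocess.run(cmd, env=env, capture_output=True, text=True)

# Example (placeholder UUID):
# run_on_mig_instance(["python", "train.py"], "MIG-xxxxxxxx-xxxx-xxxx")
```

Container runtimes apply the same idea: the NVIDIA container toolkit maps a chosen MIG instance into the container, so each container receives its own isolated slice.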
Supported MIG Profiles
This section provides an overview of the supported MIG profiles and their possible placements on supported GPUs.
H100 MIG Profiles
The following diagram illustrates the MIG profiles supported on the NVIDIA H100 GPU:

The table below lists the MIG profiles supported on the NVIDIA H100 80GB GPU in both PCIe and SXM5 variants.
Table 1: GPU Instance Profiles on H100
| Profile Name | Fraction of Memory | Fraction of SMs | Hardware Units | Fraction of L2 Cache | Copy Engines | Instances Available |
|---|---|---|---|---|---|---|
| MIG 1g.10gb | 1/8 | 1/7 | 1 NVDEC / 1 JPEG / 0 OFA | 1/8 | 1 | 7 |
| MIG 1g.20gb | 1/4 | 1/7 | 1 NVDEC / 1 JPEG / 0 OFA | 1/8 | 1 | 4 |
| MIG 2g.20gb | 2/8 | 2/7 | 2 NVDECs / 2 JPEG / 0 OFA | 2/8 | 2 | 3 |
| MIG 3g.40gb | 4/8 | 3/7 | 3 NVDECs / 3 JPEG / 0 OFA | 4/8 | 3 | 2 |
| MIG 4g.40gb | 4/8 | 4/7 | 4 NVDECs / 4 JPEG / 0 OFA | 4/8 | 4 | 1 |
| MIG 7g.80gb | Full | 7/7 | 7 NVDECs / 7 JPEG / 1 OFA | Full | 8 | 1 |
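Profiles can be mixed on one GPU as long as their memory slices (eighths) and SM slices (sevenths) from Table 1 fit together. The hedged sketch below checks exactly that; note that real placements have additional alignment constraints that `nvidia-smi` enforces, so this is a necessary-but-not-sufficient check, and the helper name is illustrative.

```python
# Simplified fit check for mixing H100 MIG profiles on one GPU, using the
# memory fractions (eighths) and SM fractions (sevenths) from Table 1.
# Real placement has extra alignment rules enforced by the driver, so a
# True result here means "not obviously too large", not "guaranteed valid".

# profile -> (memory slices out of 8, SM slices out of 7)
H100_SLICES = {
    "1g.10gb": (1, 1),
    "1g.20gb": (2, 1),
    "2g.20gb": (2, 2),
    "3g.40gb": (4, 3),
    "4g.40gb": (4, 4),
    "7g.80gb": (8, 7),
}

def fits_on_one_gpu(profiles: list[str]) -> bool:
    mem = sum(H100_SLICES[p][0] for p in profiles)
    sms = sum(H100_SLICES[p][1] for p in profiles)
    return mem <= 8 and sms <= 7

print(fits_on_one_gpu(["3g.40gb", "3g.40gb"]))  # True: 8/8 memory, 6/7 SMs
print(fits_on_one_gpu(["7g.80gb", "1g.10gb"]))  # False: exceeds both budgets
```

For example, two `3g.40gb` instances together consume all eight memory slices but only six of the seven SM slices, matching the "2 instances available" entry in the table.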
H200 MIG Profiles
The following diagram illustrates the MIG profiles supported on the NVIDIA H200 GPU:

The table below lists the MIG profiles supported on the NVIDIA H200 141GB GPU.
Table 2: GPU Instance Profiles on H200
| Profile Name | Fraction of Memory | Fraction of SMs | Hardware Units | Fraction of L2 Cache | Copy Engines | Instances Available |
|---|---|---|---|---|---|---|
| MIG 1g.18gb | 1/8 | 1/7 | 1 NVDEC / 1 JPEG / 0 OFA | 1/8 | 1 | 7 |
| MIG 1g.35gb | 1/4 | 1/7 | 1 NVDEC / 1 JPEG / 0 OFA | 1/8 | 1 | 4 |
| MIG 2g.35gb | 2/8 | 2/7 | 2 NVDECs / 2 JPEG / 0 OFA | 2/8 | 2 | 3 |
| MIG 3g.71gb | 4/8 | 3/7 | 3 NVDECs / 3 JPEG / 0 OFA | 4/8 | 3 | 2 |
| MIG 4g.71gb | 4/8 | 4/7 | 4 NVDECs / 4 JPEG / 0 OFA | 4/8 | 4 | 1 |
| MIG 7g.141gb | Full | 7/7 | 7 NVDECs / 7 JPEG / 1 OFA | Full | 8 | 1 |
Summary
Multi-Instance GPU (MIG) enables efficient, secure, and predictable sharing of high-performance NVIDIA GPUs within E2E Cloud Private Clusters. By partitioning a single physical GPU into multiple hardware-isolated instances, MIG allows multiple workloads to run concurrently without performance interference.
Through predefined MIG profiles on GPUs such as NVIDIA H100 and H200, users can choose the right balance of GPU memory, compute (SMs), and hardware units based on workload requirements. This ensures optimal GPU utilization while maintaining strict performance isolation and quality of service (QoS).
MIG is particularly well suited for notebooks, inference services, and lightweight training jobs that do not require a full GPU. By offering MIG in private clusters, E2E Cloud helps customers reduce resource waste, scale efficiently, and maximize the value of their dedicated GPU infrastructure.