
Multi-Instance GPU (MIG)

Introduction

At E2E Cloud, we offer Multi-Instance GPU (MIG) support as part of our Private Cloud / Private Cluster offerings. MIG is a feature available on NVIDIA GPUs based on the Ampere architecture and above, enabling us to securely partition a single physical GPU into up to seven independent GPU instances.

Each GPU instance functions like a separate GPU, allowing multiple workloads or users to run in parallel on the same physical GPU. This enables better GPU utilization, especially for workloads that do not require the full compute capacity of an entire GPU.

(Figure: MIG overview)

MIG in AI Cloud Private Clusters

We provide Multi-Instance GPU (MIG) as part of our Private Cluster offering to help customers efficiently utilize dedicated GPU nodes. Each private cluster node may contain multiple GPUs (for example, 8 GPUs per node), and enabling MIG allows these GPUs to be partitioned into smaller, isolated instances.

This model is well suited for workloads such as notebooks, inference services, and lightweight training jobs that do not need exclusive access to a full GPU.

By offering MIG in private clusters, we enable:

  • Parallel execution of multiple GPU workloads on the same physical GPU
  • Improved utilization of dedicated GPU resources
  • Efficient GPU sharing across projects, jobs, and services within a private environment
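
The partitioning arithmetic above can be sketched in a few lines. The snippet below is a minimal illustration, not an E2E Cloud API: it assumes an 8-GPU node (as in the example above), the seven-slice-per-GPU limit from the introduction, and hypothetical per-GPU layouts expressed as MIG profile names.

```python
# Illustrative sketch: how many isolated MIG instances a private-cluster
# node could expose. The node size (8 GPUs) and the seven-instances-per-GPU
# limit come from the text; the layouts below are hypothetical examples.

MAX_SLICES_PER_GPU = 7  # MIG supports up to seven instances per physical GPU

def slices(profile: str) -> int:
    """Compute slices used by a profile name, e.g. '1g.10gb' -> 1."""
    return int(profile.split("g.")[0])

def node_instances(gpus: int, per_gpu_layout: list[str]) -> int:
    """Total MIG instances on a node if every GPU uses the same layout."""
    used = sum(slices(p) for p in per_gpu_layout)
    if used > MAX_SLICES_PER_GPU:
        raise ValueError(f"layout needs {used} slices; only {MAX_SLICES_PER_GPU} available")
    return gpus * len(per_gpu_layout)

# Eight GPUs, each split into seven 1g.10gb instances -> 56 isolated instances.
print(node_instances(8, ["1g.10gb"] * 7))                                # 56
# A mixed per-GPU layout: one 3g, one 2g, and two 1g instances.
print(node_instances(8, ["3g.40gb", "2g.20gb", "1g.10gb", "1g.10gb"]))   # 32
```

In practice, instance creation is done through NVIDIA tooling on the node; this sketch only shows why a single 8-GPU node can serve dozens of independent workloads.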

Performance Isolation and Resource Guarantees

MIG provides hardware-level isolation for each GPU instance. Each instance receives dedicated, isolated paths through the GPU's memory and compute system, including:

  • Streaming Multiprocessors (SMs)
  • L2 cache banks
  • Memory controllers
  • DRAM bandwidth

This design ensures that workloads running in one MIG instance do not interfere with workloads running in another. Even if one workload heavily consumes compute or memory resources, other instances continue to operate with predictable throughput and latency.

These guarantees allow us to deliver a defined quality of service (QoS) for workloads running in:

  • Virtual machines
  • Containers
  • Individual processes

MIG can also be used alongside GPU pass-through and virtual GPU (vGPU) configurations, while maintaining strong isolation guarantees between workloads.


Supported MIG Profiles

This section provides an overview of the supported MIG profiles and their possible placements on supported GPUs.


H100 MIG Profiles

The following diagram illustrates the MIG profiles supported on the NVIDIA H100 GPU:

(Figure: H100 MIG profiles)

The table below lists the supported MIG profiles available on the NVIDIA H100 80GB GPU for both the PCIe and SXM5 variants.

Table 1: GPU Instance Profiles on H100

| Profile Name | Fraction of Memory | Fraction of SMs | Hardware Units | L2 Cache Size | Copy Engines | Instances Available |
|---|---|---|---|---|---|---|
| MIG 1g.10gb | 1/8 | 1/7 | 1 NVDEC / 1 JPEG / 0 OFA | 1/8 | 1 | 7 |
| MIG 1g.20gb | 1/4 | 1/7 | 1 NVDEC / 1 JPEG / 0 OFA | 1/8 | 1 | 4 |
| MIG 2g.20gb | 2/8 | 2/7 | 2 NVDECs / 2 JPEG / 0 OFA | 2/8 | 2 | 3 |
| MIG 3g.40gb | 4/8 | 3/7 | 3 NVDECs / 3 JPEG / 0 OFA | 4/8 | 3 | 2 |
| MIG 4g.40gb | 4/8 | 4/7 | 4 NVDECs / 4 JPEG / 0 OFA | 4/8 | 4 | 1 |
| MIG 7g.80gb | Full | 7/7 | 7 NVDECs / 7 JPEG / 1 OFA | Full | 8 | 1 |
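
As a worked example of reading Table 1, the sketch below picks the smallest H100 profile that satisfies a given memory requirement. The helper name and the selection policy (smallest memory first, in table order) are illustrative assumptions; only the profile names and sizes come from the table.

```python
# Illustrative sketch: choosing an H100 MIG profile by memory requirement.
# Profile -> (memory in GB, max instances per GPU), taken from Table 1.
H100_PROFILES = {
    "MIG 1g.10gb": (10, 7),
    "MIG 1g.20gb": (20, 4),
    "MIG 2g.20gb": (20, 3),
    "MIG 3g.40gb": (40, 2),
    "MIG 4g.40gb": (40, 1),
    "MIG 7g.80gb": (80, 1),
}

def smallest_profile(required_gb: float) -> str:
    """Return the first profile (by memory size, then table order) with enough memory."""
    for name, (mem_gb, _count) in sorted(H100_PROFILES.items(), key=lambda kv: kv[1][0]):
        if mem_gb >= required_gb:
            return name
    raise ValueError(f"no single MIG profile offers {required_gb} GB")

print(smallest_profile(8))    # MIG 1g.10gb
print(smallest_profile(30))   # MIG 3g.40gb
```

A real placement decision would also weigh SM fraction and copy engines, but memory is usually the first constraint for inference workloads.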

H200 MIG Profiles

The following diagram illustrates the MIG profiles supported on the NVIDIA H200 GPU:

(Figure: H200 MIG profiles)

The table below lists the supported MIG profiles available on the NVIDIA H200 141GB GPU.

Table 2: GPU Instance Profiles on H200

| Profile Name | Fraction of Memory | Fraction of SMs | Hardware Units | L2 Cache Size | Copy Engines | Instances Available |
|---|---|---|---|---|---|---|
| MIG 1g.18gb | 1/8 | 1/7 | 1 NVDEC / 1 JPEG / 0 OFA | 1/8 | 1 | 7 |
| MIG 1g.35gb | 1/4 | 1/7 | 1 NVDEC / 1 JPEG / 0 OFA | 1/8 | 1 | 4 |
| MIG 2g.35gb | 2/8 | 2/7 | 2 NVDECs / 2 JPEG / 0 OFA | 2/8 | 2 | 3 |
| MIG 3g.71gb | 4/8 | 3/7 | 3 NVDECs / 3 JPEG / 0 OFA | 4/8 | 3 | 2 |
| MIG 4g.71gb | 4/8 | 4/7 | 4 NVDECs / 4 JPEG / 0 OFA | 4/8 | 4 | 1 |
| MIG 7g.141gb | Full | 7/7 | 7 NVDECs / 7 JPEG / 1 OFA | Full | 8 | 1 |
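
To make the memory fractions in Table 2 concrete, the sketch below totals the memory a hypothetical mix of H200 instances reserves. The helper is illustrative; only the profile names and sizes come from the table, and real MIG layouts are further constrained by valid slice placements on the GPU.

```python
# Illustrative sketch: memory accounting for a mix of H200 MIG instances.
# Profile -> memory in GB, taken from Table 2.
H200_MEMORY_GB = {
    "MIG 1g.18gb": 18,
    "MIG 1g.35gb": 35,
    "MIG 2g.35gb": 35,
    "MIG 3g.71gb": 71,
    "MIG 4g.71gb": 71,
    "MIG 7g.141gb": 141,
}

def placement_memory(profiles: list[str]) -> int:
    """Total GPU memory (GB) reserved by a list of MIG instances."""
    return sum(H200_MEMORY_GB[p] for p in profiles)

# Seven 1g.18gb instances reserve 126 GB of the 141 GB card.
print(placement_memory(["MIG 1g.18gb"] * 7))             # 126
print(placement_memory(["MIG 3g.71gb", "MIG 2g.35gb"]))  # 106
```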

Summary

Multi-Instance GPU (MIG) enables efficient, secure, and predictable sharing of high-performance NVIDIA GPUs within E2E Cloud Private Clusters. By partitioning a single physical GPU into multiple hardware-isolated instances, MIG allows multiple workloads to run concurrently without performance interference.

Through predefined MIG profiles on GPUs such as NVIDIA H100 and H200, users can choose the right balance of GPU memory, compute (SMs), and hardware units based on workload requirements. This ensures optimal GPU utilization while maintaining strict performance isolation and quality of service (QoS).

MIG is particularly well suited for notebooks, inference services, and lightweight training jobs that do not require a full GPU. By offering MIG in private clusters, E2E Cloud helps customers reduce resource waste, scale efficiently, and maximize the value of their dedicated GPU infrastructure.