
Choose a GPU Card

Use this guide when the create-node flow asks you to pick a GPU card and plan. The portal is always the final source for currently available GPUs, plans, locations, and prices. Use this page to understand what each card is for before you select one.

A GPU card is not the same thing as a plan. The card describes the NVIDIA accelerator (architecture, memory, fabric), while the plan describes the exact vCPU, RAM, GPU memory, CUDA version, disk space, IOPS, and number of GPU cards bundled into the node. The portal shows plan details for the selected card in a format similar to:

<vCPU> – <RAM> – <GPU Memory> – <CUDA (Version)> – <Disk Space> – <IOPS (R/W)>
Note

GPU availability is region-gated. The GPU tab is hidden for some regions and accounts. If you do not see the card you want, switch regions in the location selector at the top of MyAccount, or contact cloud-platform@e2enetworks.com.
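If you copy plan summaries out of the portal to compare them, the six-field format shown earlier can be split into named fields. The following is a minimal sketch, assuming the fields are separated by an en dash exactly as in the format line above. The `PlanSummary` class, the `parse_plan_summary` helper, and the example values are hypothetical and do not describe a real plan.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class PlanSummary:
    """One field per column of the plan summary line, kept as raw text."""
    vcpu: str
    ram: str
    gpu_memory: str
    cuda: str
    disk_space: str
    iops: str


def parse_plan_summary(line: str) -> PlanSummary:
    """Split a plan summary line on the en-dash separator into named fields."""
    parts = [part.strip() for part in line.split("–")]
    expected = len(fields(PlanSummary))
    if len(parts) != expected:
        raise ValueError(f"expected {expected} fields, got {len(parts)}")
    return PlanSummary(*parts)


# Made-up values for illustration only; check the portal for real plans.
example = "16 vCPU – 64 GB – 24 GB – CUDA 12 – 250 GB – 10000/5000"
plan = parse_plan_summary(example)
print(plan.gpu_memory)
```

The fields stay as strings on purpose, because the exact units and wording the portal uses may differ from this example.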


Quick Comparison

| Card | Memory | Architecture | Best fit | Avoid when |
| --- | --- | --- | --- | --- |
| NVIDIA H100 | 80 GB HBM3 | Hopper | LLM training and inference, FP8, Transformer Engine. | You do not need FP8 or Hopper Transformer Engine. |
| NVIDIA A100 80GB | 80 GB HBM2e | Ampere | Multi-tenant LLM inference, training, HPC. | You can fit your model in 40 GB and want a cheaper plan. |
| NVIDIA A100 40GB | 40 GB HBM2 | Ampere | Mainstream LLM training and inference. | The model state exceeds 40 GB without sharding. |
| NVIDIA L40S | 48 GB GDDR6 | Ada Lovelace | Cost-efficient inference, generative AI, video, fine-tuning. | Largest-model training across many cards. |
| NVIDIA A40 | 48 GB GDDR6 | Ampere (RTX) | Visualization, rendering, mid-tier inference and training. | Pure datacenter Tensor Core workloads where L40S/A100 fit the budget. |
| NVIDIA A30 | 24 GB HBM2 | Ampere | Mid-range AI inference, classical ML. | Large models that do not fit in 24 GB. |
| NVIDIA L4 | 24 GB GDDR6 | Ada Lovelace | Low-power inference, video transcoding, generative AI at scale. | Training workloads or anything memory-bound above 24 GB. |
| NVIDIA V100 | 16 / 32 GB HBM2 | Volta | Legacy compatibility with Volta-tuned pipelines. | You are starting a new project. Prefer Ampere, Ada, or Hopper. |
| NVIDIA T4 | 16 GB GDDR6 | Turing | Cost-sensitive inference and lab work. | You need TensorFloat-32 (TF32), FP8, or modern Tensor Core throughput. |
Note

Do not rely on documentation alone. Always check the live MyAccount portal for the exact set of GPU cards available in your region.


How to Choose

Start with the workload bottleneck:

| If your main concern is | Start with |
| --- | --- |
| Mixed-precision training and FP8 inference | H100 |
| Large-memory LLM inference | A100 80GB |
| Mid-tier training and inference | A100 40GB or L40S |
| Cost-efficient generative AI inference and fine-tuning | L40S |
| Visualization, rendering, OpenGL/Vulkan | A40 |
| Classical ML, smaller AI inference | A30 |
| Low-power inference and video transcoding at scale | L4 |
| Legacy CUDA pipelines that target Volta or Turing | V100 or T4 |

Then check that the GPU memory column is large enough for your model state, batch size, and KV cache. Most "out of memory" issues at run time start with a card chosen on price rather than memory.
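The memory check can be sketched as a filter over the memory column of the Quick Comparison table. This is an illustrative helper, not a portal feature: the `cards_that_fit` function and the `headroom` factor (an assumed 90% of card memory usable by the workload after the CUDA context and framework overhead) are assumptions.

```python
# Memory per card in GB, copied from the Quick Comparison table.
# V100 is listed twice because it comes in 16 GB and 32 GB variants.
CARD_MEMORY_GB = {
    "H100": 80,
    "A100 80GB": 80,
    "A100 40GB": 40,
    "L40S": 48,
    "A40": 48,
    "A30": 24,
    "L4": 24,
    "V100 32GB": 32,
    "V100 16GB": 16,
    "T4": 16,
}


def cards_that_fit(required_gb: float, headroom: float = 0.9) -> list[str]:
    """Return cards whose usable memory covers required_gb.

    headroom is an assumed usable fraction of the card's memory;
    tune it for your framework and driver stack.
    """
    return [
        card
        for card, memory_gb in CARD_MEMORY_GB.items()
        if memory_gb * headroom >= required_gb
    ]


print(cards_that_fit(40))  # ['H100', 'A100 80GB', 'L40S', 'A40']
```

Filter on memory first, then compare the remaining cards on price and availability in the portal.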

Best Practice

For LLM inference, target a card whose memory comfortably fits the model weights plus the KV cache for your maximum context length and concurrency. Sharding a model across multiple cards adds inter-GPU communication, which increases latency.
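A rough way to size this is to add the weight footprint to the KV cache footprint. The sketch below assumes a standard decoder-only transformer, FP16 (2-byte) weights and cache values, and decimal gigabytes. It ignores activations and framework overhead, and the model shape in the example is hypothetical.

```python
def weights_gb(num_params: float, bytes_per_param: float = 2.0) -> float:
    """Memory for model weights in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


def kv_cache_gb(
    num_layers: int,
    num_kv_heads: int,
    head_dim: int,
    context_len: int,
    concurrency: int,
    bytes_per_value: float = 2.0,
) -> float:
    """KV cache in GB when every sequence reaches the full context length."""
    # Each token stores one key vector and one value vector per layer,
    # which is where the leading factor of 2 comes from.
    values = 2 * num_layers * num_kv_heads * head_dim * context_len * concurrency
    return values * bytes_per_value / 1e9


# Hypothetical 7B-parameter model shape, for illustration only.
weights = weights_gb(7e9)
cache = kv_cache_gb(
    num_layers=32, num_kv_heads=32, head_dim=128,
    context_len=4096, concurrency=8,
)
print(f"weights ~{weights:.1f} GB, KV cache ~{cache:.1f} GB, "
      f"total ~{weights + cache:.1f} GB")
```

With these made-up numbers the total comes to roughly 31 GB, so a 24 GB card would not fit without sharding or quantization. Note that the KV cache is larger than the weights here, which is why context length and concurrency matter as much as parameter count.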


GPU Card Profiles

NVIDIA H100

Hopper, 80 GB HBM3, FP8 Transformer Engine, fourth-generation NVLink. The current mainstream training and inference card.

  • Use it for: training 7B–70B LLMs, FP8 inference, fine-tuning, scientific computing.
  • Reference: NVIDIA H100.

NVIDIA A100 80GB and A100 40GB

Ampere, third-generation Tensor Cores, TF32 / BF16 / FP16 / INT8 / FP64 precisions. The 80GB variant uses HBM2e with up to ~2.0 TB/s of memory bandwidth; the 40GB variant uses HBM2 at ~1.55 TB/s.

NVIDIA L40S

Ada Lovelace, 48 GB GDDR6, fourth-generation Tensor Cores, third-generation RT Cores. A strong cost-per-inference card with generative-AI features.

  • Use it for: Stable Diffusion, fine-tuning, mid-tier LLM inference, video AI, virtual workstations.
  • Reference: NVIDIA L40S.

NVIDIA A40

Ampere RTX, 48 GB GDDR6. Designed for professional visualization and mixed AI workloads.

  • Use it for: rendering, 3D, VDI with NVIDIA RTX vWS, mid-tier inference.
  • Reference: NVIDIA A40.

NVIDIA A30

Ampere, 24 GB HBM2. Lower-cost datacenter card.

  • Use it for: classical ML, recommendation, smaller-model inference.
  • Reference: NVIDIA A30.

NVIDIA L4

Ada Lovelace, 24 GB GDDR6, low-profile, low-power (72 W TDP class). Designed for inference and video AI at scale.

  • Use it for: high-volume small-model inference, video transcoding, generative AI on a budget.
  • Reference: NVIDIA L4.

NVIDIA V100

Volta, 16 GB or 32 GB HBM2. Legacy datacenter card. Prefer Ampere, Ada, or Hopper for new projects.

  • Use it for: legacy compatibility with Volta-tuned pipelines.
  • Reference: NVIDIA V100.

NVIDIA T4

Turing, 16 GB GDDR6, 70 W. Legacy inference card. Prefer L4 or L40S for any new project.

  • Use it for: cost-sensitive inference and lab work.
  • Reference: NVIDIA T4.

Pricing and Billing

GPU plans are billed per minute. The portal Summary panel shows the live price for the selected plan, region, and billing mode at launch. Committed plans are explicitly confirmed; on-demand plans show the hourly rate and a monthly estimate.
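The relationship between an hourly rate and per-minute billing can be sketched as simple arithmetic. This is an illustration under stated assumptions, not the portal's billing logic: it assumes the per-minute charge is the hourly rate divided by 60, no minimum charge, no taxes, and a 730-hour month for the monthly estimate. The rate used is a placeholder, not a real price.

```python
def usage_cost(hourly_rate: float, minutes_used: int) -> float:
    """Cost of a usage window, assuming per-minute charge = hourly rate / 60."""
    if minutes_used < 0:
        raise ValueError("minutes_used must be non-negative")
    return round(hourly_rate * minutes_used / 60, 2)


def monthly_estimate(hourly_rate: float, hours_per_month: float = 730.0) -> float:
    """Monthly estimate for a node that runs continuously.

    730 hours per month is an assumption; the portal may use a different figure.
    """
    return round(hourly_rate * hours_per_month, 2)


# Placeholder rate in arbitrary currency units, not a real price.
rate = 100.0
print(usage_cost(rate, 90))    # 150.0
print(monthly_estimate(rate))  # 73000.0
```

Always take the actual rate from the portal Summary panel or the API response before estimating.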

Caution

Do not rely on static documentation for prices or savings percentages. Use the live portal Summary, API response, or approved pricing page for the current amount. Pricing also depends on the region.

For a side-by-side billing view, see the E2E GPU pricing page and the committed billing notes.


| Category | Resource | Use it for |
| --- | --- | --- |
| Launch | Create a GPU node | Full step-by-step GPU launch flow. |
| Connect | Connect to a Linux GPU node | SSH access and nvidia-smi verification. |
| Managed AI | TIR AI/ML Platform | Launch notebooks, endpoints, and training jobs without managing a GPU node. |
Last updated on May 26, 2026.