Bake and Reuse a GPU Image

A fresh GPU node spends 15–25 minutes on first boot installing the driver, downloading the framework wheels, and pulling container images. Saving an image once and launching new nodes from it cuts that to 3–6 minutes. This guide shows what to bake in and how to use Save Image for that workflow.

For the underlying portal action, see Save Image in the canonical node docs. This page covers the GPU-specific guidance.

Note: Snapshots are not available for GPU nodes. Save Image is the only way to capture a re-launchable copy of a GPU node's root disk.


What to Bake In

A well-prepared GPU image should include everything that takes time to install on a fresh node:

| Component | Why bake it |
| --- | --- |
| NVIDIA datacenter driver | Avoids the kernel-module install and reboot path on first boot. |
| CUDA toolkit (if you compile) | Skips the multi-GB toolkit download. |
| cuDNN / NCCL | Required by most frameworks. Both ship separately from the CUDA toolkit (framework wheels and containers often bundle them), so verify that they are present. |
| Docker + NVIDIA Container Toolkit | Skips the nvidia-ctk runtime configure step on every new node. |
| Pulled container images | Skips multi-GB image pulls for every replica. |
| Conda or pip virtual environments | Skips the PyTorch / JAX / vLLM / TGI install path. |
| System config (firewall, sysctl, persistence mode) | Avoids per-node tuning drift. |
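
One way to keep track of what actually went into an image is to record component versions just before saving. The sketch below is an illustration, not part of the portal tooling: `record_version` and `write_manifest` are hypothetical helper names, and the set of components probed is an assumption based on the table above.

```shell
# Print "<name>: <first line of output>" for a command, or "<name>: not found"
# if the command is missing, fails, or prints nothing.
record_version() {
  local name="$1"; shift
  local out
  if out="$("$@" 2>/dev/null)" && [ -n "$out" ]; then
    printf '%s: %s\n' "$name" "$(printf '%s\n' "$out" | head -n 1)"
  else
    printf '%s: not found\n' "$name"
  fi
}

# Probe the main components from the bake list (hypothetical selection).
write_manifest() {
  record_version driver     nvidia-smi --query-gpu=driver_version --format=csv,noheader
  record_version cuda       sh -c 'nvcc --version | grep release'
  record_version docker     docker --version
  record_version nvidia-ctk nvidia-ctk --version
}
```

Running `write_manifest` on the node before Save Image and storing the output somewhere on the root disk (the location is up to you) gives each image a record of the versions it carries.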

Do not bake in:

  • Model weights or datasets that are large enough to inflate the image into the hundreds of GB. Mount these from block storage or Parallel File Storage instead.
  • Secrets, API keys, or per-node identifiers. Inject them at runtime.
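
Before saving, it can help to scan the disk for the two kinds of content listed above. The following is a minimal sketch: the function names are hypothetical, and the filename patterns are only a small illustrative set, not a complete secret scanner.

```shell
# List regular files under a directory that are at least <min_mib> MiB in size.
# -xdev keeps the search on one filesystem, so mounted data volumes are skipped.
find_large_files() {
  local root="$1" min_mib="$2"
  find "$root" -xdev -type f -size "+$(( min_mib * 1024 * 1024 - 1 ))c" 2>/dev/null
}

# List files whose names suggest credentials (illustrative patterns only).
find_secret_candidates() {
  local root="$1"
  find "$root" -xdev -type f \( \
    -name '.env' -o -name '*.pem' -o -name 'id_rsa' \
    -o -name 'id_ed25519' -o -name 'credentials' \) 2>/dev/null
}
```

For example, `find_large_files /home 10240` would list files of 10 GiB or more under `/home`, and `find_secret_candidates /home` would list files worth a manual look. Both the threshold and the directory are assumptions to adjust.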

Save Image Workflow

  1. Launch a GPU node from a base image and install everything from the bake list above.

  2. Verify that the stack works end-to-end on this node: run nvidia-smi, run a container with --gpus all, and run a single inference or training step.

  3. Enable persistence mode and make it survive reboots:

    sudo nvidia-smi -pm 1

    nvidia-smi -pm 1 applies only to the current boot; the nvidia-persistenced service is what keeps persistence mode on after a reboot. On Ubuntu 24.04 with NVIDIA driver 580.x, nvidia-persistenced is a static systemd unit that starts automatically, so systemctl enable nvidia-persistenced returns an error and is not needed. On older driver branches (570.x and earlier), run sudo systemctl enable nvidia-persistenced after the command above.

  4. Stop the node from the portal Actions menu.

  5. Run Save Image on the stopped node.

  6. Wait for the image to appear under Compute > Images. GPU images can be 50–300 GB; expect 10–20 minutes for the save to complete.

  7. To launch a new node from the saved image, open the create-node flow, switch to My Images when selecting the OS, and pick your saved image.
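
The verification in steps 2 and 3 can be scripted so that it runs the same way before every save. This is a sketch under assumptions: `check`, `persistence_enabled`, and `presave_checks` are hypothetical helper names, the container image is whatever you baked in (passed as an argument), and the inference or training step is left out because it is workload-specific.

```shell
# Run one named check quietly; print PASS/FAIL and return the command's status.
check() {
  local name="$1"; shift
  if "$@" >/dev/null 2>&1; then
    echo "PASS $name"
  else
    echo "FAIL $name"
    return 1
  fi
}

# Succeeds only if nvidia-smi reports persistence mode "Enabled" for every GPU.
persistence_enabled() {
  local modes
  modes="$(nvidia-smi --query-gpu=persistence_mode --format=csv,noheader)" || return 1
  [ -n "$modes" ] && ! printf '%s\n' "$modes" | grep -qv '^Enabled$'
}

# Run all pre-save checks; returns non-zero if any of them failed.
presave_checks() {
  local image="$1" failed=0
  check "driver responds"     nvidia-smi || failed=1
  check "persistence mode on" persistence_enabled || failed=1
  check "container sees GPU"  docker run --rm --gpus all "$image" nvidia-smi || failed=1
  return "$failed"
}
```

Calling `presave_checks <your-image>` on the node before stopping it prints one line per check; stop and save only when every line reads PASS.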

Best Practice

Bake one image per (card family, framework version) combination. Re-bake when you upgrade the framework or driver. Do not try to make one image cover every workload: a catch-all image boots more slowly and takes more storage.
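
A consistent naming scheme makes the one-image-per-combination rule easier to follow. The sketch below builds a name from the card family, framework, and framework version, plus a bake date. The `gpu-...` pattern and the `image_name` helper are assumptions for illustration; check the portal's naming rules before adopting them.

```shell
# Build a lowercase image name: gpu-<card>-<framework>-<version>-<date>.
# The date defaults to today (YYYYMMDD) when the fourth argument is omitted.
image_name() {
  local card="$1" framework="$2" version="$3"
  local stamp="${4:-$(date +%Y%m%d)}"
  printf 'gpu-%s-%s-%s-%s\n' "$card" "$framework" "$version" "$stamp" \
    | tr '[:upper:]' '[:lower:]'
}
```

For example, `image_name H100 PyTorch 2.5.1 20260526` prints `gpu-h100-pytorch-2.5.1-20260526`. The card and version values here are placeholders.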

Warning: Saved GPU images can reach hundreds of GB, and they incur storage charges for the full image size for as long as they are retained. Delete saved images you no longer need from the Images page.
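
To get a feel for what a retained image costs, multiply its size by your storage rate. The helper below is a sketch: `monthly_image_cost` is a hypothetical name, and the rate is an argument you supply from your own pricing. This page does not state a rate, and billing per GB per month is itself an assumption.

```shell
# Estimate storage cost: size in GB times a caller-supplied rate per GB.
# The rate is NOT taken from this guide; pass in your plan's actual figure.
monthly_image_cost() {
  local size_gb="$1" rate_per_gb="$2"
  awk -v s="$size_gb" -v r="$rate_per_gb" 'BEGIN { printf "%.2f\n", s * r }'
}
```

With a placeholder rate of 0.05 per GB, `monthly_image_cost 300 0.05` prints `15.00`, which also shows how quickly several retained 300 GB images add up.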


First-Boot Time, Roughly

| Starting point | Time to "ready to serve" |
| --- | --- |
| Fresh OS image → install driver → install framework → pull container | 15–25 min |
| Pre-baked NVIDIA image → install framework → pull container | 8–12 min |
| Saved image with driver, framework, and container baked in | 3–6 min |

The third row is the goal for any production GPU fleet.
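
The gap between the first and third rows compounds across a fleet. As a rough illustration using only the ranges in the table above (15–25 min fresh, 3–6 min from a saved image), the hypothetical helper below prints the range of node-minutes saved when launching a given number of nodes.

```shell
# Print "<low>-<high>" node-minutes saved for N launches, using the table's ranges.
# Low end: fastest fresh boot vs slowest baked boot; high end: the reverse.
fleet_minutes_saved() {
  local nodes="$1"
  local fresh_min=15 fresh_max=25 baked_min=3 baked_max=6
  echo "$(( nodes * (fresh_min - baked_max) ))-$(( nodes * (fresh_max - baked_min) ))"
}
```

For 10 nodes, `fleet_minutes_saved 10` prints `90-220`, meaning between 90 and 220 node-minutes of boot time avoided in total. These are summed per-node figures, not wall-clock time, since nodes usually boot in parallel.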


| Resource | Use it for |
| --- | --- |
| Save Image | Portal action reference. |
| Node Images | Manage saved images. |
| Run GPU Workloads in Docker | What to bake in for the container layer. |
| Manage GPU Nodes | GPU-specific differences for Save Image and Plan Upgrade. |

Last updated on May 26, 2026.