Bake and Reuse a GPU Image
A fresh GPU node spends 15–25 minutes on first boot installing the driver, downloading the framework wheels, and pulling container images. Saving an image once and launching new nodes from it cuts that to 3–6 minutes. This guide shows what to bake in and how to use Save Image for that workflow.
For the underlying portal action, see Save Image in the canonical node docs. This page covers the GPU-specific guidance.
Snapshots are not available for GPU nodes. Save Image is the only way to capture a re-launchable copy of a GPU node's root disk.
What to Bake In
A well-prepared GPU image should include everything that takes time to install on a fresh node:
| Component | Why bake it |
|---|---|
| NVIDIA datacenter driver | Avoids the kernel-module install and reboot path on first boot. |
| CUDA toolkit (if you compile) | Skips the multi-GB toolkit download. |
| cuDNN / NCCL | Required by most frameworks; typically shipped inside framework wheels or installed as separate packages rather than with the CUDA toolkit, so verify they are present. |
| Docker + NVIDIA Container Toolkit | Skips the nvidia-ctk runtime configure step on every new node. |
| Pulled container images | Skips multi-GB image pulls for every replica. |
| Conda or pip virtual environments | Skips the PyTorch / JAX / vLLM / TGI install path. |
| System config (firewall, sysctl, persistence mode) | Avoids per-node tuning drift. |
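Before saving, it can help to confirm that the tools from the table are actually on the node. The following is a minimal sketch, not part of the platform: `missing_commands` is a hypothetical helper, and the command names passed to it are assumptions that should match your own bake list.

```shell
#!/usr/bin/env bash
# Sketch: report which expected tools are missing from PATH.
# missing_commands is a hypothetical helper, not a platform command.

# Print each missing command name on its own line; return 1 if any is missing.
missing_commands() {
  local cmd status=0
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "$cmd"
      status=1
    fi
  done
  return "$status"
}
```

For example, `missing_commands nvidia-smi nvcc docker nvidia-ctk` prints nothing and returns 0 when all four binaries are installed. Drop `nvcc` from the list if you do not compile CUDA code on the node.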
Do not bake in:
- Model weights or datasets that are large enough to inflate the image into the hundreds of GB. Mount these from block storage or Parallel File Storage instead.
- Secrets, API keys, or per-node identifiers. Inject them at runtime.
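A quick audit before saving can catch both kinds of unwanted content. This is a rough sketch under stated assumptions: the function names are hypothetical, the filename patterns are only common conventions for credential files, and the size threshold is yours to choose. It will not find secrets embedded inside other files.

```shell
#!/usr/bin/env bash
# Sketch of a pre-save audit. Function names, filename patterns, and
# thresholds are illustrative assumptions, not platform features.

# List regular files under DIR larger than MIN_MB MiB (same filesystem only).
find_large_files() {
  local dir="$1" min_mb="$2"
  find "$dir" -xdev -type f -size +"${min_mb}M" 2>/dev/null
}

# List files whose names commonly indicate credentials.
find_secret_like_files() {
  local dir="$1"
  find "$dir" -xdev -type f \
    \( -name '.env' -o -name '*.pem' -o -name 'id_rsa' \
       -o -name 'id_ed25519' -o -name 'credentials' \) 2>/dev/null
}
```

A possible use is `find_large_files /home 1024` to list files over 1 GiB that may be weights or datasets, followed by `find_secret_like_files /home` to review candidate credential files before stopping the node.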
Save Image Workflow
- Launch a GPU node from a base image and install everything from the bake list above.
- Verify the stack works end-to-end on this node: run `nvidia-smi`, run a container with `--gpus all`, and run a single inference or training step.
- Enable persistence mode and make it survive reboots with `sudo nvidia-smi -pm 1`. On Ubuntu 24.04 with NVIDIA driver 580.x, `nvidia-persistenced` is a static systemd unit that starts automatically; `systemctl enable nvidia-persistenced` will return an error and is not needed. On older driver branches (570.x and earlier), run `sudo systemctl enable nvidia-persistenced` after the command above.
- Stop the node from the portal Actions menu.
- Run Save Image on the stopped node.
- Wait for the image to appear under Compute > Images. GPU images can be 50–300 GB; expect 10–20 minutes for the save to complete.
- To launch a new node from the saved image, open the create-node flow, switch to My Images when selecting the OS, and pick your saved image.
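The verification and persistence checks in the workflow above can be scripted so they run the same way before every save. The sketch below is one possible shape, not a platform tool: `run_check` and `verify_gpu_stack` are hypothetical names, and the container image is passed in as an argument because this guide does not prescribe one. It does not cover the inference or training step, which depends on your workload.

```shell
#!/usr/bin/env bash
# Sketch of a pre-save smoke test. Function names are hypothetical;
# the container image must be supplied by the caller.

# Run a command quietly, print PASS/FAIL with a label, return 1 on failure.
run_check() {
  local label="$1"; shift
  if "$@" >/dev/null 2>&1; then
    echo "PASS: $label"
  else
    echo "FAIL: $label"
    return 1
  fi
}

# Usage: verify_gpu_stack IMAGE
# IMAGE is any CUDA-enabled container image that ships nvidia-smi.
verify_gpu_stack() {
  local image="${1:-}"
  if [ -z "$image" ]; then
    echo "usage: verify_gpu_stack IMAGE" >&2
    return 2
  fi
  run_check "driver responds" nvidia-smi || return 1
  run_check "container sees GPUs" \
    docker run --rm --gpus all "$image" nvidia-smi || return 1
  # Fails if any GPU reports a persistence mode other than "Enabled".
  run_check "persistence mode enabled on all GPUs" sh -c \
    '! nvidia-smi --query-gpu=persistence_mode --format=csv,noheader | grep -qvx Enabled'
}
```

Call it with the image you baked in, for example `verify_gpu_stack "$MY_CUDA_IMAGE"`, where `MY_CUDA_IMAGE` is a placeholder for your own image reference. Stop the node and save only when every line reads `PASS`.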
Bake one image per (card family, framework version) combination. Re-bake when you upgrade the framework or driver. Do not try to make one image cover every workload — it slows boot and bloats storage.
GPU saved images can reach hundreds of GB. They incur storage charges for the full image size while retained. Delete saved images you no longer need from the Images page.
First-Boot Time, Roughly
| Starting point | Time to "ready to serve" |
|---|---|
| Fresh OS image → install driver → install framework → pull container | 15–25 min |
| Pre-baked NVIDIA image → install framework → pull container | 8–12 min |
| Saved image with driver, framework, and container baked in | 3–6 min |
The third row is the goal for any production GPU fleet.
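To put the table in fleet terms, the difference between rows can be multiplied by the number of nodes launched. The helper below is a small illustrative sketch: `node_minutes_saved` is a hypothetical name, and the result is in node-minutes, not wall-clock time, since nodes usually boot in parallel.

```shell
#!/usr/bin/env bash
# Sketch: node-minutes saved when NODES nodes each become ready in
# BAKED_MIN minutes instead of FRESH_MIN minutes. Hypothetical helper.
node_minutes_saved() {
  local nodes="$1" fresh_min="$2" baked_min="$3"
  echo $(( nodes * (fresh_min - baked_min) ))
}
```

For a hypothetical 20-node launch, `node_minutes_saved 20 15 6` prints `180` (fastest fresh boot against slowest baked boot) and `node_minutes_saved 20 25 3` prints `440` (slowest fresh boot against fastest baked boot).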
Related Resources
| Resource | Use it for |
|---|---|
| Save Image | Portal action reference. |
| Node Images | Manage saved images. |
| Run GPU Workloads in Docker | What to bake in for the container layer. |
| Manage GPU Nodes | GPU-specific differences for Save Image and Plan Upgrade. |