Features
Foundation Studio Fine-Tune Models gives you full control over every aspect of the training process — from dataset preparation and model selection to hyperparameter tuning, monitoring, and job lifecycle management.
1. Supported Models
Fine-tune a variety of open-source foundation models:
| Model | HuggingFace ID | Type | Notes |
|---|---|---|---|
| Llama 2 7B | meta-llama/Llama-2-7b-hf | LLM | Gated — requires Hugging Face token and license acceptance |
| Meta Llama 3 8B | meta-llama/Meta-Llama-3-8B | LLM | Gated — requires Hugging Face token and license acceptance |
| Meta Llama 3 8B Instruct | meta-llama/Meta-Llama-3-8B-Instruct | LLM | Gated — requires Hugging Face token and license acceptance |
| Meta Llama 3.1 8B | meta-llama/Llama-3.1-8B | LLM | Gated — requires Hugging Face token and license acceptance |
| Meta Llama 3.1 8B Instruct | meta-llama/Llama-3.1-8B-Instruct | LLM | Gated — requires Hugging Face token and license acceptance |
| Meta Llama 3.2 11B Vision | meta-llama/Llama-3.2-11B-Vision | Multimodal LLM | Vision-language fine-tuning; gated model |
| Meta Llama 3.2 11B Vision Instruct | meta-llama/Llama-3.2-11B-Vision-Instruct | Multimodal LLM | Vision-language fine-tuning; gated model |
| Gemma 7B | google/gemma-7b | LLM | Gated — requires Hugging Face token and license acceptance |
| Gemma 7B Instruct | google/gemma-7b-it | LLM | Instruction-tuned variant; gated model |
| Stable Diffusion 2.1 | stabilityai/stable-diffusion-2-1 | Image generation | Text-to-image fine-tuning |
| Stable Diffusion XL | stabilityai/stable-diffusion-xl-base-1.0 | Image generation | Higher-resolution text-to-image fine-tuning |
2. Hyperparameter Configuration
The dedicated Hyperparameter Configuration step lets you tune core training settings such as learning rate, number of epochs, and maximum context length, as well as LoRA/PEFT-specific parameters for parameter-efficient fine-tuning. Advanced options such as quantization, batch size, gradient accumulation, and debug limits are also available to help balance training speed, accuracy, and resource usage.
The right combination of hyperparameters directly impacts model quality — experimenting with these settings helps avoid overfitting or underfitting and ensures the model generalizes well to your task.
For a full list of available parameters, see the Quick Start Guide — Step 5.
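To see why the LoRA/PEFT parameters matter, the sketch below estimates how many trainable parameters a rank-16 LoRA adapter adds for a Llama-2-7B-sized model (hidden size 4096, 32 layers, adapters on the four attention projections). The rank and target modules are hypothetical choices for illustration, not platform defaults.

```python
# Rough estimate of LoRA trainable parameters for a Llama-2-7B-sized model.
# A rank-r LoRA adapter on a weight matrix of shape (d_out, d_in) adds
# r * (d_in + d_out) trainable parameters (the two low-rank factors A and B).

hidden_size = 4096       # Llama 2 7B hidden dimension
num_layers = 32          # transformer blocks
adapted_matrices = 4     # q_proj, k_proj, v_proj, o_proj per block (assumption)
rank = 16                # hypothetical LoRA rank

per_matrix = rank * (hidden_size + hidden_size)
trainable = per_matrix * adapted_matrices * num_layers
base_params = 7_000_000_000

print(f"LoRA trainable params: {trainable:,}")   # a tiny fraction of 7B
print(f"Fraction of base model: {trainable / base_params:.4%}")
```

The point of the arithmetic: LoRA trains well under 1% of the base model's weights, which is what makes fine-tuning feasible on a single GPU.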
3. Quantization
Quantization reduces model size and lowers GPU memory requirements during training, which is useful when fine-tuning large models on GPUs with limited VRAM.
| Option | Description |
|---|---|
| Load in 4Bit | Load model weights in 4-bit precision to reduce memory footprint |
| Compute Datatype | Data type for computations (e.g. float16, bfloat16) |
| QuantType | Quantization algorithm (e.g. NF4) |
| Use DoubleQuant | Apply a second quantization pass for further memory savings |
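To see what 4-bit loading buys, the back-of-envelope sketch below compares the approximate memory needed just to hold the weights of a 7B-parameter model at different precisions. This counts weights only; activations, gradients, optimizer state, and quantization metadata all add overhead on top.

```python
# Approximate weight-storage footprint of a 7B-parameter model at
# different precisions. Real GPU usage during training is higher:
# activations, gradients, and optimizer state are not included here.

params = 7_000_000_000
bytes_per_param = {"float32": 4, "float16/bfloat16": 2, "4-bit (NF4)": 0.5}

for dtype, nbytes in bytes_per_param.items():
    gib = params * nbytes / 1024**3
    print(f"{dtype:>18}: ~{gib:.1f} GiB")
```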
4. Advanced Training Settings
Additional options for fine-grained performance tuning:
| Parameter | Description |
|---|---|
| Batch size | Number of samples processed in each training step |
| Gradient accumulation steps | Accumulate gradients over multiple steps before updating weights — simulates larger batch sizes when GPU memory is limited |
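The interaction between these two settings comes down to one line of arithmetic: the batch size the optimizer effectively sees is the per-device batch size times the number of accumulation steps (times the number of GPUs, if training on more than one). The values below are hypothetical.

```python
# Effective batch size with gradient accumulation.
# Weights are updated once every `grad_accum_steps` forward/backward
# passes, so the optimizer sees a larger "virtual" batch than fits in
# GPU memory at once.

per_device_batch_size = 4   # limited by GPU memory (hypothetical)
grad_accum_steps = 8        # hypothetical accumulation setting
num_gpus = 1

effective_batch_size = per_device_batch_size * grad_accum_steps * num_gpus
print(effective_batch_size)  # 32
```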
5. Model Checkpoints
- Start from scratch — Begin training from the base model weights (default behavior).
- Resume from checkpoint — Continue training from a previously saved checkpoint. Useful for iterating on a partially trained model or recovering from an interrupted job.
All training checkpoints are automatically saved and accessible from the model repository after training completes.
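For readers using a Hugging Face Trainer-style loop outside the platform, resuming from a checkpoint typically looks like the fragment below. This is a generic sketch, not Foundation Studio's internal code; `trainer` is assumed to be an already-configured `transformers.Trainer`, and the checkpoint path is hypothetical.

```python
# Sketch: resuming a Hugging Face Trainer run from a saved checkpoint.
# `trainer` is an already-configured transformers.Trainer; the
# checkpoint directory path below is hypothetical.

trainer.train(resume_from_checkpoint="output/checkpoint-500")

# Or let Trainer pick the most recent checkpoint in its output_dir:
trainer.train(resume_from_checkpoint=True)
```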
6. Experiment Tracking with WandB
Integrate with Weights & Biases (WandB) to track training runs in real time. WandB is a platform for experiment tracking, model visualization, and team collaboration that lets you monitor and compare your fine-tuning runs.
- Monitor loss curves, learning rate schedules, and custom metrics in the WandB dashboard.
- Compare multiple fine-tuning runs side by side.
- Access full training history and model version records.
To enable, add your WandB API key via External Integrations and select it during the Hyperparameter Configuration step when creating a job.
Debug options are also available for additional runtime visibility into the training process.
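For context, in a Hugging Face Trainer-based stack WandB reporting is usually just a configuration flag. The fragment below is a generic sketch of that pattern (the run name and output directory are hypothetical), not Foundation Studio's internal wiring; it assumes the WandB API key is available to the training job as an environment variable.

```python
# Sketch: enabling WandB logging in a Hugging Face Trainer setup.
# Assumes WANDB_API_KEY is set in the job environment (e.g. via the
# External Integrations key). Names below are hypothetical.

from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="output",
    report_to="wandb",                  # stream metrics to WandB
    run_name="llama3-8b-finetune-v1",   # hypothetical run name
    logging_steps=10,
)
```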
7. Job Monitoring
Once a job is running, view the following tabs on the job detail page:
| Tab | What it shows |
|---|---|
| Overview | Job configuration, assigned GPU plan, current status, and resource summary |
| Events | Pod lifecycle events — scheduling, container start, and termination. Useful for diagnosing startup failures |
| Logs | Real-time streaming logs from the training process — monitor convergence, diagnose errors, and audit training steps |
| Training Metrics | Visual charts for training loss, validation loss, and other model-specific metrics |
| Metrics | Hardware resource utilization — GPU utilization (%), GPU memory usage, and CPU utilization |
8. Job Actions
Manage active and historical fine-tuning jobs with the following actions:
| Action | When to use | Notes |
|---|---|---|
| Clone | Create a new job with the same configuration, with the option to modify parameters | Useful for hyperparameter sweeps and dataset iterations |
| Retry | Restart a job that ended in a failed state | Retains original configuration; no re-configuration needed |
| Terminate | Stop a job that is currently running | Use when you want to cancel training early |
| Delete | Remove a job and its metadata permanently | Does not automatically delete the model repository |
9. Model Repository
After a successful training run, the fine-tuned model is stored in a model repository containing:
- All checkpoints saved during training
- LoRA adapters (if applicable)
- Model configuration and tokenizer files
Navigate to the Models tab on the job detail page to access the repository.
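Assuming the repository contains LoRA adapters trained on top of a gated base model, loading them for local inference typically looks like this sketch with the `peft` library. The model ID matches the table in section 1; the adapter path is hypothetical.

```python
# Sketch: loading a fine-tuned LoRA adapter from a model repository
# with Hugging Face transformers + peft. The adapter path is hypothetical.

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B")

# Attach the LoRA adapter weights saved during fine-tuning
model = PeftModel.from_pretrained(base, "path/to/repo/lora-adapter")
```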
10. One-Click Deployment to Inference
Once your fine-tuned model is ready, you can deploy it as a live API endpoint directly from Foundation Studio — no additional setup required. The fine-tuned model repository is automatically linked to TIR Inference, so you can go from a completed training job to a running endpoint in just a few clicks.
- Navigate to the Models tab on your fine-tuning job page.
- Click Deploy to create an Inference endpoint using your fine-tuned model.
- Select a serving framework, GPU plan, and scaling configuration.
- Once deployed, you receive a live endpoint URL that your applications can call immediately.
The endpoint is OpenAI-compatible, so it works with any tool or SDK that supports the OpenAI API format.
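Because the endpoint speaks the OpenAI API format, calling it from the official OpenAI Python SDK only requires overriding the base URL. The URL, API key, and model name below are placeholders for the values shown on your endpoint page.

```python
# Sketch: calling an OpenAI-compatible inference endpoint.
# base_url, api_key, and model are placeholders; substitute the values
# shown for your deployed endpoint.

from openai import OpenAI

client = OpenAI(
    base_url="https://your-endpoint.example.com/v1",
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="your-finetuned-model",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```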
For full deployment instructions, see the Inference — Model Endpoints documentation.