
Features

Foundation Studio Fine-Tune Models gives you full control over every aspect of the training process — from dataset preparation and model selection to hyperparameter tuning, monitoring, and job lifecycle management.


1. Supported Models

Fine-tune a variety of open-source foundation models:

| Model | Hugging Face ID | Type | Notes |
| --- | --- | --- | --- |
| Llama 2 7B | meta-llama/Llama-2-7b-hf | LLM | Gated — requires Hugging Face token and license acceptance |
| Meta Llama 3 8B | meta-llama/Meta-Llama-3-8B | LLM | Gated — requires Hugging Face token and license acceptance |
| Meta Llama 3 8B Instruct | meta-llama/Meta-Llama-3-8B-Instruct | LLM | Gated — requires Hugging Face token and license acceptance |
| Meta Llama 3.1 8B | meta-llama/Llama-3.1-8B | LLM | Gated — requires Hugging Face token and license acceptance |
| Meta Llama 3.1 8B Instruct | meta-llama/Llama-3.1-8B-Instruct | LLM | Gated — requires Hugging Face token and license acceptance |
| Meta Llama 3.2 11B Vision | meta-llama/Llama-3.2-11B-Vision | Multimodal LLM | Vision-language fine-tuning; gated model |
| Meta Llama 3.2 11B Vision Instruct | meta-llama/Llama-3.2-11B-Vision-Instruct | Multimodal LLM | Vision-language fine-tuning; gated model |
| Gemma 7B | google/gemma-7b | LLM | Gated — requires Hugging Face token and license acceptance |
| Gemma 7B Instruct | google/gemma-7b-it | LLM | Instruction-tuned variant; gated model |
| Stable Diffusion 2.1 | stabilityai/stable-diffusion-2-1 | Image generation | Text-to-image fine-tuning |
| Stable Diffusion XL | stabilityai/stable-diffusion-xl-base-1.0 | Image generation | Higher-resolution text-to-image fine-tuning |

2. Hyperparameter Configuration

Foundation Studio gives you full control over the training process through a dedicated Hyperparameter Configuration step. You can tune core training settings such as learning rate, epochs, and max context length, as well as LoRA/PEFT-specific parameters for parameter-efficient fine-tuning. Advanced options like quantization, batch size, gradient accumulation, and debug limits are also available to help balance training speed, accuracy, and resource usage.

The right combination of hyperparameters directly impacts model quality — experimenting with these settings helps avoid overfitting or underfitting and ensures the model generalizes well to your task.

For a full list of available parameters, see the Quick Start Guide — Step 5.
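As an illustration, the settings from this step can be pictured as a plain configuration dictionary. The field names and values below are examples only, not Foundation Studio's exact parameter names:

```python
# Illustrative fine-tuning configuration; names and defaults are examples,
# not Foundation Studio's exact UI labels.
config = {
    # Core training settings
    "learning_rate": 2e-4,
    "epochs": 3,
    "max_context_length": 2048,
    # LoRA/PEFT settings for parameter-efficient fine-tuning
    "lora_rank": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    # Advanced options
    "batch_size": 4,
    "gradient_accumulation_steps": 8,
}

# The effective batch size seen by the optimizer is the per-step batch
# size multiplied by the number of gradient-accumulation steps.
effective_batch_size = config["batch_size"] * config["gradient_accumulation_steps"]
print(effective_batch_size)  # 32
```

Treat values like these as starting points for experimentation rather than universal defaults.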


3. Quantization

Quantization reduces model size and lowers GPU memory requirements during training. It is especially useful when fine-tuning large models on GPUs with limited VRAM.

| Option | Description |
| --- | --- |
| Load in 4Bit | Load model weights in 4-bit precision to reduce memory footprint |
| Compute Datatype | Data type for computations (e.g. float16, bfloat16) |
| QuantType | Quantization algorithm (e.g. NF4) |
| Use DoubleQuant | Apply a second quantization pass for further memory savings |
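For readers familiar with the Hugging Face ecosystem, the four options above correspond closely to the fields of `transformers`' `BitsAndBytesConfig`. This is a sketch under that assumption; in Foundation Studio you set these options through the UI, not code:

```python
import torch
from transformers import BitsAndBytesConfig

# Sketch only: how the four quantization options above would look if
# expressed as a Hugging Face BitsAndBytesConfig. Values are examples.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # Load in 4Bit
    bnb_4bit_compute_dtype=torch.bfloat16,  # Compute Datatype
    bnb_4bit_quant_type="nf4",              # QuantType
    bnb_4bit_use_double_quant=True,         # Use DoubleQuant
)
```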

4. Advanced Training Settings

Additional options for fine-grained performance tuning:

| Parameter | Description |
| --- | --- |
| Batch size | Number of samples processed in each training step |
| Gradient accumulation steps | Accumulate gradients over multiple steps before updating weights — simulates larger batch sizes when GPU memory is limited |
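Gradient accumulation can be sketched in a framework-agnostic way: gradients from several micro-batches are summed, and the weights are updated once per group, so each update reflects a larger effective batch. A minimal toy example (not Foundation Studio code):

```python
# Toy sketch of gradient accumulation on a 1-parameter model y = w * x,
# trained with mean squared error. Data and values are illustrative.
def grad(w, batch):
    # Gradient of MSE for one micro-batch of (x, y) pairs
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

w = 0.0
lr = 0.01
accum_steps = 4
micro_batches = [[(1.0, 2.0)], [(2.0, 4.0)], [(1.5, 3.0)], [(0.5, 1.0)]]

accumulated = 0.0
for step, batch in enumerate(micro_batches, start=1):
    accumulated += grad(w, batch)  # accumulate instead of updating
    if step % accum_steps == 0:
        # One weight update for 4 micro-batches = effective batch of 4
        w -= lr * (accumulated / accum_steps)
        accumulated = 0.0
```

Because memory is only needed for one micro-batch at a time, this trades extra steps for a lower peak VRAM footprint.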

5. Model Checkpoints

  • Start from scratch — Begin training from the base model weights (default behavior).
  • Resume from checkpoint — Continue training from a previously saved checkpoint. Useful for iterating on a partially trained model or recovering from an interrupted job.

All training checkpoints are automatically saved and accessible from the model repository after training completes.
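Checkpoints are conventionally stored as step-numbered directories (e.g. `checkpoint-500`), and resuming usually means picking the highest-numbered one. A small sketch of that convention; the directory names here are illustrative, not guaranteed to match the repository layout exactly:

```python
import re

# Illustrative checkpoint directory names as they might appear in a
# model repository after training.
checkpoints = ["checkpoint-500", "checkpoint-1000", "checkpoint-1500"]

def latest_checkpoint(names):
    """Return the checkpoint directory with the highest step number."""
    def step(name):
        match = re.search(r"checkpoint-(\d+)$", name)
        return int(match.group(1)) if match else -1
    return max(names, key=step)

print(latest_checkpoint(checkpoints))  # checkpoint-1500
```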


6. Experiment Tracking with WandB

Integrate with Weights & Biases (WandB) to track training runs in real time. WandB is a platform for experiment tracking, model visualization, and team collaboration that lets you monitor and compare your fine-tuning runs.

  • Monitor loss curves, learning rate schedules, and custom metrics in the WandB dashboard.
  • Compare multiple fine-tuning runs side by side.
  • Access full training history and model version records.

To enable, add your WandB API key via External Integrations and select it during the Hyperparameter Configuration step when creating a job.

Debug options are also available for additional runtime visibility into the training process.
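Once the integration is enabled, the training job reports to WandB the way any instrumented run does. A sketch using the standard `wandb` client API (project and metric names are examples; running it requires a WandB account):

```python
import wandb

# Sketch of WandB experiment tracking once the API key is configured.
# Project name, config fields, and metric names are illustrative.
run = wandb.init(
    project="foundation-studio-finetune",
    config={"learning_rate": 2e-4, "epochs": 3},
)

for step, loss in enumerate([2.1, 1.7, 1.4], start=1):  # dummy loss values
    wandb.log({"train/loss": loss, "step": step})

run.finish()
```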


7. Job Monitoring

Once a job is running, view the following tabs on the job detail page:

| Tab | What it shows |
| --- | --- |
| Overview | Job configuration, assigned GPU plan, current status, and resource summary |
| Events | Pod lifecycle events — scheduling, container start, and termination. Useful for diagnosing startup failures |
| Logs | Real-time streaming logs from the training process — monitor convergence, diagnose errors, and audit training steps |
| Training Metrics | Visual charts for training loss, validation loss, and other model-specific metrics |
| Metrics | Hardware resource utilization — GPU utilization (%), GPU memory usage, and CPU utilization |

8. Job Actions

Manage active and historical fine-tuning jobs with the following actions:

| Action | When to use | Notes |
| --- | --- | --- |
| Clone | Create a new job with the same configuration, with the option to modify parameters | Useful for hyperparameter sweeps and dataset iterations |
| Retry | Restart a job that ended in a failed state | Retains original configuration; no re-configuration needed |
| Terminate | Stop a job that is currently running | Use when you want to cancel training early |
| Delete | Remove a job and its metadata permanently | Does not automatically delete the model repository |

9. Model Repository

After a successful training run, the fine-tuned model is stored in a model repository containing:

  • All checkpoints saved during training
  • LoRA adapters (if applicable)
  • Model configuration and tokenizer files

Navigate to the Models tab on the job detail page to access the repository.


10. One-Click Deployment to Inference

Once your fine-tuned model is ready, you can deploy it as a live API endpoint directly from Foundation Studio — no additional setup required. The fine-tuned model repository is automatically linked to TIR Inference, so you can go from a completed training job to a running endpoint in just a few clicks.

  • Navigate to the Models tab on your fine-tuning job page.
  • Click Deploy to create an Inference endpoint using your fine-tuned model.
  • Select a serving framework, GPU plan, and scaling configuration.
  • Once deployed, you receive a live endpoint URL that your applications can call immediately.

The endpoint is OpenAI-compatible, so it works with any tool or SDK that supports the OpenAI API format.
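Because the endpoint speaks the OpenAI API format, calling it is an HTTP POST to the `/v1/chat/completions` path with the usual JSON body. A sketch using only the standard library; the endpoint URL, token, and model name are placeholders you must replace with the values shown on your endpoint page:

```python
import json
import urllib.request

# Placeholder values -- substitute the endpoint URL and API token from
# your Inference endpoint page.
ENDPOINT_URL = "https://<your-endpoint>/v1/chat/completions"
API_TOKEN = "<your-api-token>"

payload = {
    "model": "my-finetuned-model",  # hypothetical model name
    "messages": [{"role": "user", "content": "Summarize LoRA in one line."}],
    "max_tokens": 64,
}

request = urllib.request.Request(
    ENDPOINT_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {API_TOKEN}",
    },
)
# response = urllib.request.urlopen(request)  # uncomment with real values
```

Any OpenAI-compatible SDK can be pointed at the same URL by overriding its base URL.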

For full deployment instructions, see the Inference — Model Endpoints documentation.