Fine-Tune Models
Fine-tune pre-trained LLMs and diffusion models on your own data using E2E AI Cloud's high-performance GPU infrastructure. Foundation Studio manages compute so you can focus on data, configuration, and results.
Quick Start
Quick Start Guide
Create your first fine-tuning job and train a model on your own dataset.
Features
Explore dataset types, hyperparameters, quantization, checkpoints, and monitoring.
Plans & Pricing
Understand GPU billing, plan options, and cost examples for fine-tuning jobs.
FAQs
Get answers to common questions about datasets, training, jobs, and model output.
What can you do with Fine-Tune Models?
Fine-tune LLMs (Llama, Mistral, BLOOM, Gemma) and diffusion models on your own data
Use custom .jsonl datasets from EOS or pull directly from Hugging Face
Configure hyperparameters, quantization, and gradient accumulation
Resume training from any previous checkpoint
Track experiments in real time with Weights & Biases (WandB)
Deploy fine-tuned models directly to Inference endpoints
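A custom `.jsonl` dataset is simply one JSON object per line. As a rough sketch, an instruction-style record might look like the following; the `prompt`/`completion` field names are illustrative only, so match the schema your chosen model and training template actually expect:

```python
import json

# Each line of a .jsonl file is an independent JSON object.
# The "prompt"/"completion" field names below are an assumption for
# illustration; check the schema your model template expects before uploading.
records = [
    {"prompt": "Translate to French: Hello", "completion": "Bonjour"},
    {"prompt": "Translate to French: Goodbye", "completion": "Au revoir"},
]

# Serialize one object per line, with no trailing comma or wrapping array.
jsonl_text = "\n".join(json.dumps(r) for r in records)
print(jsonl_text)
```

Note that a `.jsonl` file is not a JSON array: there are no surrounding brackets or commas between records, which is what lets training pipelines stream it line by line.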
Key Characteristics
Models
Wide Model Support
Fine-tune Llama 3.x, Gemma 7B, Stable Diffusion, and SDXL. Gated models require a Hugging Face token.
Data
Flexible Dataset Input
Bring custom .jsonl datasets via EOS buckets or link directly to any public or private Hugging Face dataset.
Compute
High-Performance GPUs
Choose from H100 and A100 GPU plans. H100 for large models and fast iteration; A100 for most 7B–13B workloads.
Training
Full Hyperparameter Control
Set training type, epochs, learning rate, batch size, gradient accumulation, and quantization (4-bit, DoubleQuant).
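Gradient accumulation lets you reach a large effective batch size without the memory cost of holding that batch on the GPU: the optimizer steps once per accumulation cycle rather than per micro-batch. A minimal sketch of the arithmetic (the parameter names here are illustrative, not TIR's exact field names):

```python
# Effective batch size is the product of the per-device micro-batch size,
# the number of accumulation steps, and the number of GPUs. Gradient
# accumulation trades extra wall-clock time for lower peak memory.
def effective_batch_size(per_device_batch: int,
                         grad_accum_steps: int,
                         num_gpus: int = 1) -> int:
    return per_device_batch * grad_accum_steps * num_gpus

# A micro-batch of 4 with 8 accumulation steps trains like a batch of 32.
print(effective_batch_size(4, 8))
```

If a configuration runs out of GPU memory, halving the batch size while doubling the accumulation steps keeps the effective batch size, and therefore the training dynamics, roughly unchanged.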
Monitoring
Built-in Observability
View training logs, loss curves, GPU utilization, and memory metrics. Optionally integrate WandB for full experiment tracking.
Output
Checkpoint Management
All checkpoints and LoRA adapters are stored in a model repository. Resume from any checkpoint or deploy directly to inference.
Best Practices
Best Practices for Fine-Tuning
Ensure your .jsonl file is correctly formatted before uploading. A single malformed line will cause the job to fail.
Run 1–2 epochs on a small dataset slice to validate configuration before committing to a full training run.
Enable Load in 4Bit when fine-tuning models larger than 7B to reduce GPU memory usage and cost.
Enable WandB integration to compare runs, catch overfitting early, and maintain a complete training history.
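Since a single malformed line is enough to fail a job, it is worth parsing every line locally before uploading. A minimal validation sketch (it only checks that each non-blank line is valid JSON; it does not check your dataset's field schema):

```python
import json

def validate_jsonl(path: str) -> list[int]:
    """Return the 1-indexed line numbers that fail to parse as JSON."""
    bad = []
    with open(path, encoding="utf-8") as f:
        for lineno, line in enumerate(f, start=1):
            if not line.strip():
                continue  # skip blank lines; flag them instead if your pipeline forbids them
            try:
                json.loads(line)
            except json.JSONDecodeError:
                bad.append(lineno)
    return bad
```

Running this over the file and fixing any reported line numbers before upload is much cheaper than discovering the problem mid-job on billed GPU time.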
API Reference
Fine-Tune Models API Reference
Programmatically create, list, manage, and delete fine-tuning jobs in TIR.
List fine-tuning jobs: /teams/{Team_Id}/projects/{Project_Id}/finetune/jobs/
Create a fine-tuning job: /teams/{Team_Id}/projects/{Project_Id}/finetune/jobs/
Get fine-tuning job details: /teams/{Team_Id}/projects/{Project_Id}/finetune/jobs/{job_id}/
Retry or terminate a job: /teams/{Team_Id}/projects/{Project_Id}/finetune/jobs/{job_id}/
Delete a fine-tuning job: /teams/{Team_Id}/projects/{Project_Id}/finetune/jobs/{job_id}/
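As a rough sketch of calling these endpoints from Python using only the standard library: the path shape follows the reference above, but the base URL, bearer-token auth scheme, and response shape are assumptions here, so consult your account's API documentation for the real host and authentication details.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder host, not the real TIR endpoint
API_TOKEN = "YOUR_API_TOKEN"          # placeholder; use your real API token

def jobs_url(team_id: str, project_id: str, job_id: str = "") -> str:
    # Path shape taken from the API reference above; IDs are caller-supplied.
    url = f"{BASE_URL}/teams/{team_id}/projects/{project_id}/finetune/jobs/"
    return f"{url}{job_id}/" if job_id else url

def list_finetune_jobs(team_id: str, project_id: str) -> dict:
    req = urllib.request.Request(
        jobs_url(team_id, project_id),
        headers={"Authorization": f"Bearer {API_TOKEN}"},  # auth scheme is an assumption
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

The same URL builder covers the per-job endpoints (details, retry/terminate, delete) by passing a `job_id`; only the HTTP method and request body differ per operation.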