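# Fine-tuning LLaMA-3 with LLaMA Factory on TIR

**LLaMA Factory** is an easy-to-use platform for fine-tuning large language models. On an E2E GPU Node on TIR, you can use it to fine-tune models such as LLaMA-3 through either its CLI or its WebUI.

## Step 1: Set Up the Environment

Open JupyterLab (Python 3 notebook) on your GPU Node and run the following in a notebook cell. Lines that start with `%` or `!` are notebook magics and shell escapes; in a plain terminal, drop the prefix.

```bash
%cd ~/
!rm -rf LLaMA-Factory
!git clone https://github.com/hiyouga/LLaMA-Factory.git
%cd LLaMA-Factory
!pip install -e ".[torch,bitsandbytes]"
!pip install bitsandbytes
```

Verify that the GPU is visible:

```python
import torch

# Fail early if PyTorch cannot see a CUDA device
assert torch.cuda.is_available(), "GPU not detected"
```

## Step 2: Select GPUs

Set `CUDA_VISIBLE_DEVICES` before launching training. Commands started from the notebook with `!` inherit this variable.

```python
import os

# Expose only GPU 0; use "0,1" for multi-GPU
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
```

## Step 3: Create the Training Config

The LLaMA Factory CLI takes a YAML or JSON config file, not just a dataset path. The example below sets `stage`, `do_train`, and `template` explicitly so that the run performs supervised fine-tuning with the LLaMA-3 chat format.

Example: `train_llama3.yaml`

```yaml
# Model: pre-quantized 4-bit LLaMA-3 8B Instruct
model_name_or_path: unsloth/llama-3-8b-Instruct-bnb-4bit

# Method: supervised fine-tuning with LoRA adapters
stage: sft
do_train: true
finetuning_type: lora

# Data: the name must match an entry in data/dataset_info.json
# (for example, the bundled alpaca_en_demo)
dataset: alpaca
template: llama3

# Output and training hyperparameters
output_dir: ./output/llama3-lora
per_device_train_batch_size: 2
num_train_epochs: 1
learning_rate: 2e-5
fp16: true
```

## Step 4: Run Training

Run the command below in a terminal, or prefix it with `!` in a notebook cell:

```bash
llamafactory-cli train train_llama3.yaml
```

:::info Note
For Meta's official LLaMA-3 models, you must request access on Hugging Face and then log in:

```bash
huggingface-cli login
```
:::

## Step 5: Inference / Chat

The `chat` command expects an inference config rather than the training config, because training-only keys such as `output_dir` and `learning_rate` are not inference arguments. Create a small file (called `infer_llama3.yaml` here) containing `model_name_or_path`, `adapter_name_or_path` (the `output_dir` from Step 3), `template`, and `finetuning_type`, then run:

```bash
llamafactory-cli chat infer_llama3.yaml
```

## Step 6: Merge LoRA and Export (Optional)

`merge_llama3.yaml` is a separate export config that names the base model, the adapter directory, the template, and an `export_dir` for the merged weights. LLaMA Factory does not merge adapters into a quantized checkpoint, so the base model in this file should be an unquantized one.

```bash
llamafactory-cli export merge_llama3.yaml
```

:::info Note
Merging an 8B model requires roughly 18 GB of RAM.
:::

## Step 7: WebUI Option

You can also fine-tune through LlamaBoard, the Gradio-based WebUI. Setting `GRADIO_SHARE=0` keeps the UI local instead of creating a public share link:

```bash
!GRADIO_SHARE=0 llamafactory-cli webui
```

## References

* [LLaMA-Factory GitHub](https://github.com/hiyouga/LLaMA-Factory)
* [LLaMA-Factory CLI examples](https://github.com/hiyouga/LLaMA-Factory/blob/main/examples/README.md)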
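
## Appendix: Sketch of the Chat and Merge Configs

Steps 5 and 6 rely on config files whose contents are not shown in this guide. The snippet below is a minimal sketch, not an official template, of how they could be generated from a notebook cell. It assumes the adapter was saved to `./output/llama3-lora` as in Step 3. The file name `infer_llama3.yaml`, the export directory `./output/llama3-merged`, and the unquantized merge base `meta-llama/Meta-Llama-3-8B-Instruct` are assumptions made for illustration. The helper `to_yaml` is hypothetical and only handles flat configs with plain scalar values.

```python
from pathlib import Path

# Assumed values; adjust them to match your own run.
BASE_MODEL = "unsloth/llama-3-8b-Instruct-bnb-4bit"  # model used in Step 3
ADAPTER_DIR = "./output/llama3-lora"                 # output_dir from Step 3
MERGE_BASE = "meta-llama/Meta-Llama-3-8B-Instruct"   # assumed unquantized base
EXPORT_DIR = "./output/llama3-merged"                # assumed export location


def to_yaml(cfg: dict) -> str:
    """Render a flat dict of plain scalars as `key: value` lines."""
    return "".join(f"{key}: {value}\n" for key, value in cfg.items())


def chat_config() -> dict:
    """Settings for chatting with the base model plus the LoRA adapter."""
    return {
        "model_name_or_path": BASE_MODEL,
        "adapter_name_or_path": ADAPTER_DIR,
        "template": "llama3",
        "finetuning_type": "lora",
    }


def merge_config() -> dict:
    """Settings for merging the adapter into an unquantized base model."""
    cfg = chat_config()
    cfg["model_name_or_path"] = MERGE_BASE
    cfg["export_dir"] = EXPORT_DIR
    return cfg


def write_configs(directory: str = ".") -> list:
    """Write both config files into `directory` and return their paths."""
    targets = {
        "infer_llama3.yaml": chat_config(),
        "merge_llama3.yaml": merge_config(),
    }
    written = []
    for name, cfg in targets.items():
        path = Path(directory) / name
        path.write_text(to_yaml(cfg))
        written.append(path)
    return written


if __name__ == "__main__":
    for path in write_configs():
        print(f"wrote {path}")
```

After running the cell, `llamafactory-cli chat infer_llama3.yaml` and `llamafactory-cli export merge_llama3.yaml` can be pointed at the generated files. Check the key names against the inference and merge examples in the LLaMA-Factory repository for the version you installed, since supported arguments can change between releases.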