Skip to main content

Guide to Fine Tune Google Gemma-7b

Introduction

The Gemma-7b Large Language Model (LLM) is a pretrained generative text model with 7 billion parameters. Gemma is a family of lightweight, state-of-the-art open models built from the same research and technology used to create the Gemini models. Developed by Google DeepMind and other teams across Google, Gemma is inspired by Gemini, and the name reflects the Latin gemma, meaning “precious stone.”

Fine-tuning the Gemma-7b model involves refining its parameters and optimizing its performance for a particular task or dataset. This method leverages the pre-existing knowledge encoded within the model through prior training on a diverse dataset, then tailors it to better suit the nuances and intricacies of a specific domain or task. In essence, fine-tuning serves as a form of transfer learning, where the model's existing understanding is adapted and enhanced to excel in a targeted area of application.

What is Fine-Tuning?

Fine-tuning refers to the process of modifying a pre-existing, pre-trained model to cater to a new, specific task by training it on a smaller dataset related to the new task. This approach leverages the existing knowledge gained from the pre-training phase, thereby reducing the need for extensive data and resources.

In the Context of Neural Networks and Deep Learning:

In the specific context of neural networks and deep learning, fine-tuning is typically executed by adjusting the parameters of a pre-trained model. This adjustment is made using a smaller, task-specific dataset. The pre-trained model, having already learned a set of features from a large dataset, is further trained on the new dataset to adapt these features to the new task.

How to Create a Fine-Tuning Job?

To initiate the Fine-Tuning Job process, first, the user should navigate to the sidebar section and select Foundation Studio. Upon selecting Foundation Studio, a dropdown menu will appear, featuring an option labeled Fine-Tune Models.

  1. Upon clicking the Fine-Tune Models option, the user will be directed to the "Manage Fine-Tuning Jobs" page.

    Manage Fine-Tuning Jobs

  2. After redirecting to the Manage Fine-Tuning Jobs, users can locate and click on the Create Fine-Tuning Job button or Click-here button to create Fine-Tune models.

    Create Fine-Tuning Job

  3. After clicking the Create Fine-Tuning Job button, the Create Fine-Tuning Job page will open. On this page, there are several options such as Job Name, Model, and Hugging Face Token.

    Create Fine-Tuning Job Page

    Hugging Face Token

  4. If the user already has an integration with a Hugging Face token, they can select it from the dropdown options. If the user does not have any integration setup with Hugging Face, they can click on the Create button to add a new one, and the Create Integration page will open. After adding a token, the user can move to the next stage by clicking on the Create button.

    If the user doesn't have a Hugging Face token, they will not be able to access certain services. In this case, you need to sign up for an account on Hugging Face and obtain an HF token to use their services.

    To obtain a Hugging Face token, you can follow these steps:

    1. Go to the Hugging Face website and create an account if you haven't already.
    2. Once you have created an account, log in and go to your account settings.
    3. Click on the Tokens tab.
    4. Click on the "New Access Token" button.
    5. Give your token a name and select the permissions you want to grant to the token.
    6. Click on the "Create" button.
    7. Your new token will be displayed. Make sure to copy it and store it in a safe place, as you will not be able to see it again after you close the window.

    Hugging Face Token Creation

Note

Some model are available for commercial use but requires access granted by their Custodian/Administrator (creator/maintainer of this model). You can visit the model card on huggingface to initiate the process.

How to Define Dataset Preparation?

After defining the Job Model configuration, users can move on to the next section for Dataset Preparation. The Dataset page will open, providing several options such as Select Task, Dataset Type, Choose a Dataset, Validation Split Ratio, and Prompt Configuration. Once these options are filled, the dataset preparation configuration will be set, and the user can move to the next section.

Dataset Preparation Options

Dataset Type

In the Dataset Type, you can select either CUSTOM or HUGGING FACE as the dataset type. The CUSTOM Dataset Type allows training models with user-provided data, offering flexibility for unique tasks. Alternatively, the Hugging Face option provides a variety of pre-existing datasets, enabling convenient selection and utilization for model training.

Dataset Type Selection

Choosing a Custom Dataset

If you select the dataset type as CUSTOM, you will need to choose a user-defined dataset by clicking the CHOOSE button.

Choose Custom Dataset

After clicking the CHOOSE button, you will see the screen below if you have already objects in the selected dataset.

To ensure dataset compatibility, it is recommended to maintain your data in the .json, .jsonl, or .parquet file format. Please verify and convert your dataset to these extensions prior to model training.

You can use this dataset containing Python code for testing purposes.

You can create a new dataset by clicking the click here link.

Note

The listed datasets here use in EOS Bucket for data storage.

Dataset Format for Text Models

Dataset Format for Text Models like Gemma-7b

The dataset provided should be in the following format: One metadata file mapping inputs with their corresponding outputs.

{"input": "What color is the sky?", "output": "The sky is blue."}
{"input": "Where is the best place to get cloud GPUs?", "output": "E2E Networks"}
Note

Uploading incorrect dataset format will result into finetuning run failure.

Note

Here, labels "input" and "output" will be provided in prompt configuration.

You can provide the prompt configuration by specifying in the dialog box in the following format:

    eg: 
Below is an instruction that describes a task. Write a response that appropriately completes the request.

###Instruction:[Name_of_the_input_field_in_your_dataset]
###Response: [Name_of_the_output_field_in_your_dataset]

We have taken a custom data which has two fields: instruction and output. So the prompt configuration will look like this-

Upload Dataset Process

PROCESS TO UPLOAD DATASET

  1. After selecting the dataset, you can upload objects in a particular dataset by selecting the dataset and clicking on the UPLOAD DATASET button.

Upload Dataset Button

  1. Click on the UPLOAD DATASET button, upload objects, and then click on the UPLOAD button.

Upload Button

  1. After uploading objects to a specific dataset, choose a particular file to continue and then click on the SUBMIT button.

Submit Button

How to define a Hyperparameter Configuration?

  1. Upon providing the dataset preparation details, users are directed to the Hyperparameter Configuration page. This interface allows users to customize the training process by specifying desired hyperparameters, thereby facilitating effective hyperparameter tuning. The form provided enables the selection of various hyperparameters, including but not limited to training type, epoch, learning rate, and max steps. Please fill out the form meticulously to optimize the model training process.

Hyperparameter Configuration

  1. In addition to the standard hyperparameters, the configuration page offers advanced options such as batch size and gradient accumulation steps. These settings can be utilized to further refine the training process. Users are encouraged to explore and employ these advanced options as needed to achieve optimal model performance.

Advanced Hyperparameters

  1. Upon specifying the advanced settings, users are advised to leverage the WandB Integration feature for comprehensive job tracking. This involves filling in the necessary details in the provided interface. By doing so, users can effectively monitor and manage the model training process, ensuring transparency and control throughout the lifecycle of the job.

WandB Integration

  1. Users can also describe the debug option as desired for debugging purposes or quicker training.

Debug Option

  1. Once the debug option has been thoroughly addressed, users are required to select their preferred machine configuration for the fine-tuning job. Subsequently, clicking on the LAUNCH button will initiate or schedule the job, depending on the chosen settings. To ensure fast and precise training, a variety of high-performance GPUs, such as Nvidia H100 and A100, are available for selection. This allows users to optimize their resources and accelerate the model training process.

Machine Configuration

Viewing your Job Parameters and Fine-Tuned Models

  1. Upon completion of the job, a Fine-Tuned model will be created and shown in the models section in the lower part of the page. This fine-tuned model repo will contain all checkpoints of model training as well as adapters built during training. Users can directly go to the model repo page under inference to view it.

Fine-Tuned Model

  1. If desired, the user can view job parameter details in the overview section of the job as shown below.

Job Parameter Overview

  1. Users can effortlessly access the logs of the model training by clicking on the run of the fine-tuning job they've created. These logs serve as a valuable resource, aiding users in debugging the training process and identifying any potential errors that could impede the model's progress.

Training Logs

  1. Upon successful completion of training, the model will be seamlessly pushed, ensuring smooth integration with the existing model infrastructure.

Model Integration

Running Inference on Fine-Tuned Gemma-7b Model

Upon successful completion of model training, the fine-tuned Gemma-7b model can be conveniently accessed and viewed under the Models section.

Access Fine-Tuned Model

Deployment of Fine-Tuned Gemma-7b

Selecting the Deploy icon reveals a sidebar featuring different menus, including Resource details, Endpoint details, and Environment Variables.

  1. Within the Resource details menu, users can specify their desired CPU/GPU configuration and set the number of replicas. Once configurations are finalized, users can proceed by clicking on the Next button.

Resource Details

  1. In the Endpoint details section, users can either enter a custom name for the endpoint or opt to use the default name provided. We have used the default name.

Endpoint Details

  1. In the Environment Variables section, input your Hugging Face token for authentication purposes.

Environment Variables

  1. Following the completion of the aforementioned steps, proceed by selecting the Create option. This action will initiate the creation process for your model endpoint.

Create Endpoint

Running Inference on the Model Endpoint

Navigate to the Inference section and select the Model Endpoint option to access the details of your created model endpoint. Here, you can monitor the status of your model endpoint. Once the status changes to Running, you can proceed by clicking on the API Request icon to make requests to your endpoint.

Model Endpoint Status

API Request

Sample API Request

A sidebar menu titled Sample API Request will appear. Within this menu, you have two options for running inference:

  1. Curl Script: Execute the provided curl script on your terminal of your local machine.

Curl Script

  1. Python Script: Run the provided Python script on your terminal or any code editor of your choice.

Python Script

Note

Ensure to replace the placeholder $AUTH_TOKEN in the script with your actual API authentication token. You can locate your authentication token in the "API Tokens" section. If you don't have an authentication token, you can generate a new one from the same section.

For Authentication refer this Documentation Click here.