TIR Models are storage buckets for model weights and config files. A model may be backed by E2E Object Storage (EOS) or by PVC (Persistent Volume Claim) storage in Kubernetes.
The concept of a model is loosely defined. There is no hard structure, framework or format that you must adhere to. Rather, you can think of a model as a simple directory hosted on either EOS or a disk. This also opens up the possibility of versioning through directory structure: you may define sub-folders like v1, v2, etc. to track different model versions.
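Since a model is just a directory tree, versioning can be as simple as creating one sub-folder per version before uploading. A minimal sketch (the folder names `my-model`, `v1`, `v2` are illustrative, not required by TIR):

```python
# Sketch: lay out model versions as sub-folders before uploading to the bucket.
# "my-model", "v1" and "v2" are example names, not a TIR convention.
from pathlib import Path

root = Path("my-model")
for version in ("v1", "v2"):
    (root / version).mkdir(parents=True, exist_ok=True)

print(sorted(p.name for p in root.iterdir()))  # → ['v1', 'v2']
```

Each version folder can then be uploaded as a separate prefix in the model bucket.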
When you define a model in a project, every team member has access to it, which enables re-use and collaboration. We therefore recommend defining a TIR model to store, use and share your model weights.
Uploading weights to TIR Models¶
When you create a new model, TIR automatically creates an EOS bucket to store your model weights and configuration files (e.g. a TorchServe config file). You can find the connection details for the EOS bucket in the model's Setup section.
If you have not used EOS (E2E Object Storage) before, read the EOS documentation first.
EOS offers an S3-compatible API to upload or download content, so you can use any S3-compatible CLI such as mc (from MinIO) or s3cmd.
We recommend using the MinIO client (mc). In the TIR model's Setup section, you will find ready-to-use commands to configure the client and upload content.
A typical command to set up the CLI looks like this. MinIO uses the concept of an alias, which represents a connection profile; we use the model name as the alias:
mc config host add <tir-model-name> https://objectstore.e2enetworks.net <access-key> <secret-key>
Once you set up the alias (or connection profile), you can start uploading content using commands like these:
# upload contents of saved-model directory to llma-7b-23233 (EOS bucket)
mc cp -r /user/jovyan/saved-model/* <tir-model-name>/llma-7b-23233

# upload contents of saved-model to llma-7b-23233 (bucket)/v1 (folder)
mc cp -r /user/jovyan/saved-model/* <tir-model-name>/llma-7b-23233/v1
We recommend uploading model weights and config files such that they can be easily downloaded and used by TIR notebooks or inference service pods.
For Hugging Face models, the entire snapshot folder (under .cache/huggingface/hub/<model-name>/) needs to be uploaded to the model bucket.
When this is done correctly, you will be able to download the weights (and configs) on any inference service pod or TIR notebook and load the model with the AutoModelForCausalLM.from_pretrained() call.
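To locate the snapshot folder to upload, it helps to know the Hugging Face hub cache layout: `<cache-root>/models--<org>--<name>/snapshots/<revision>`. The sketch below builds that path; the model name, revision and cache root are example values, not TIR requirements:

```python
# Sketch of the Hugging Face hub cache layout, to locate the snapshot
# folder that should be uploaded to the model bucket.
# repo_id, revision and cache_root below are illustrative examples.
from pathlib import Path

def snapshot_dir(cache_root: str, repo_id: str, revision: str) -> Path:
    """Path of a cached snapshot: <root>/models--<org>--<name>/snapshots/<revision>."""
    return Path(cache_root) / f"models--{repo_id.replace('/', '--')}" / "snapshots" / revision

# Example: the folder you would upload for a given model and revision
print(snapshot_dir("/user/jovyan/.cache/huggingface/hub", "meta-llama/Llama-2-7b-hf", "main"))
```

You would then upload the printed directory with `mc cp -r` as shown above.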
Downloading weights from TIR Models¶
The model weights will be needed on the device whether you are fine-tuning or serving inference requests through an API.
To download the contents of a TIR model manually:
# download the contents of llma-7b-23233 (EOS bucket) to a local directory
mc cp -r <tir-model-name>/llma-7b-23233 /user/jovyan/download-model/

# download the contents of llma-7b-23233 (bucket)/v1 (folder)
mc cp -r <tir-model-name>/llma-7b-23233/v1 /user/jovyan/download-model/
Typical use cases for downloading content from TIR models:

* Downloads to a local device for fine-tuning. You can install and use the mc command to download the model files.
* Downloads to a TIR Notebook for fine-tuning or running inference tests. You can use the mc command provided in the TIR notebook to download the model files.
* Downloads to an Inference Service (Model Endpoints). Once you attach a model to an endpoint, the model files are automatically downloaded to the container.
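After downloading to a notebook or local device, a quick sanity check can save a failed `from_pretrained()` call. The helper below is a hypothetical sketch; the file names it checks are common Hugging Face conventions, not a TIR requirement:

```python
# Hedged sketch: check that a downloaded directory looks like a loadable
# Hugging Face model before calling from_pretrained().
# The expected file names are common conventions; adjust for your model.
from pathlib import Path

def looks_like_hf_model(path: str) -> bool:
    p = Path(path)
    has_config = (p / "config.json").is_file()
    # weights may be sharded .bin or .safetensors files
    has_weights = any(p.glob("*.bin")) or any(p.glob("*.safetensors"))
    return has_config and has_weights
```

If the check fails, re-run the mc download and confirm you copied the full snapshot folder rather than a partial listing.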
How to create a Model in the TIR dashboard?¶
To create a model, click on the ‘Create Model’ button. Select a Model Type (such as PyTorch or Triton) and a Bucket Type (New EOS Bucket or Existing EOS Bucket). After filling in all the fields (Model Name, Model Type, Bucket Type and Bucket Name), click on the ‘Create’ button.
After clicking ‘Create’, a pop-up will appear on the screen showing the model credentials.
After clicking ‘OK’, the model appears in the list, where you can view its overview and setup details. Click on Setup to see the setup methods. Click on the Actions button to see the Deploy Model and Delete options.