Generating Vector Embeddings using TIR GenAI API
Model Playground is a fully managed, low-code environment for leveraging well-established open-source and NVIDIA models. Customers get direct access to a ready-to-use API (provided by TIR) and don't have to plan or manage infrastructure. One of the services it offers is the vector embeddings service. You can call the Vector Embeddings API to generate vector embeddings for your input text, then use the generated embeddings in your model inference or insert them into your vector database for similarity search.
Currently we provide the following models for generating vector embeddings:
e5-mistral-7b-instruct - Generates 4096-dimensional vector embeddings for any given text.
Steps to use the API:
1. Install our python client package
We provide our own Python client to make it easy for you to use our APIs. You can install it using the following command:
$ pip install e2enetworks
2. Set the required variables
You need to set the following variables in your Python script to use the GenAI API:
E2E_TIR_ACCESS_TOKEN = <paste_your_token_here>
E2E_TIR_API_KEY = <paste_your_apikey_here>
E2E_TIR_PROJECT_ID = <paste_your_project_id_here>
E2E_TIR_TEAM_ID = <paste_your_team_id_here>
You will find all the necessary values by generating a config file from the API Tokens section on the TIR website. You can check out the API Token Documentation to see how to generate the config file.
3. Generate the vector embeddings
You can use the following code to generate vector embeddings using our API:
from e2enetworks.cloud import tir
tir.init(api_key=E2E_TIR_API_KEY, access_token=E2E_TIR_ACCESS_TOKEN)
tir_client = tir.ModelAPIClient(project=E2E_TIR_PROJECT_ID, team=E2E_TIR_TEAM_ID)
text = <enter_your_text_here>
data = {"prompt": text}
model_name = "vector-embedding-e5-mistral-7b-instruct"  # model name for generating vector embeddings
response = tir_client.infer(model_name=model_name, data=data)
generated_vector_embeddings = response.outputs[0].data
print(generated_vector_embeddings)
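As a quick sanity check, you can confirm that the returned embedding has the expected dimensionality for e5-mistral-7b-instruct (this assumes the response structure shown in the example above):

# The embedding is returned as a flat list of floats.
# e5-mistral-7b-instruct produces 4096-dimensional vectors.
print(len(generated_vector_embeddings))  # expected: 4096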
How to insert the generated vector embeddings into a Vector database
Currently TIR provides the following vector databases:
Qdrant
Qdrant
This guide will show you how to use our Qdrant service and insert the generated vector embeddings into a collection in the Qdrant service.
How to:
Installing the Python Client
We are going to install the client using pip.
pip install qdrant-client
Using the Qdrant Client
Now that we have installed the Python Qdrant client, let's see how we can use it.
Importing the Client
from qdrant_client import QdrantClient
from qdrant_client.http.models import Distance, VectorParams, PointStruct
These imports will be used to interact with the client in the following examples.
Connecting to the Qdrant Instance
To connect to the Qdrant instance, you need to create a client object and pass the host and the API key to the client.
To get the Host and the API key:
Go to TIR and head to the Vector Database section.
There, you will find all your Vector Databases.
Click on the database you want to connect to.
You will find the endpoint host and the API-Key in the Overview tab.
Note
For HTTP (REST) requests, use port no. 6333
For gRPC requests, use port no. 6334
Creating a http client object:
host = <your-qdrant-instance-url>
port = 6333
api_key = <your-api-key>
client = QdrantClient(host=host, port=port, api_key=api_key)
Creating a grpc client object:
host = <your-qdrant-instance-url>
port = 6334
api_key = <your-api-key>
client = QdrantClient(host=host, port=port, api_key=api_key, prefer_grpc=True)
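Once the client object is created, you can verify that the connection and API key work by listing the collections in your Qdrant instance (a minimal sketch; the list will be empty if you have not created any collections yet):

# Connectivity check: list the collections in the Qdrant instance.
print(client.get_collections())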
Creating a Collection
We will create a collection with the name “test_collection”.
We are going to set the vector size to 4096. This means that each vector will have 4096 dimensions.
We are going to set the distance metric to DOT. This means that the distance between vectors will be calculated using the dot product.
We are going to set the shard_number to 3. This means that your data will be sharded into 3 individual shards.
We are also going to set the replication_factor to 3. This means 3 shard replicas will be stored, i.e., two additional copies of your data will be maintained automatically.
For more detailed explanation on sharding and replication factor you can refer to the Basic Terminology section.
collection_name = "test_collection"
vectors_config=VectorParams(size=4096, distance=Distance.DOT)
shard_number = 3
replication_factor = 3
client.create_collection(collection_name=collection_name,vectors_config=vectors_config,
shard_number=shard_number,replication_factor=replication_factor)
Adding Vectors
Now that we have created a collection, let's add some vectors to it. Qdrant allows you to add vectors to a collection using the upsert method. The upsert method takes a list of PointStruct objects as input.
The PointStruct object has the following fields:
id : The id of the vector. This is a unique identifier for the vector.
vector : The vector embeddings generated from the GenAI API.
payload : This is a dictionary that can contain the text data from which the vector was generated and any additional data you want to store with the vector.
points = [
    PointStruct(id=<your_desired_id>,
                vector=generated_vector_embeddings,
                payload={"data": text}),
]
client.upsert(collection_name=collection_name, points=points, wait=True)
If the vectors are added successfully, it will return:
UpdateResult(operation_id=<some_id>, status=<UpdateStatus.COMPLETED: 'completed'>)
If there is an error, it will raise an exception.
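If you want to handle a failure explicitly, you can wrap the upsert in a try/except block. The sketch below is only illustrative; catching the generic Exception is an assumption, and you may want to catch more specific client exceptions in production:

try:
    result = client.upsert(collection_name=collection_name, points=points, wait=True)
    print(result.status)  # expected: UpdateStatus.COMPLETED
except Exception as e:
    # For example: an invalid API key, an unreachable instance, or a wrong vector size
    print(f"Upsert failed: {e}")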
Getting detailed info on a Collection
Now that we have inserted a vector into the collection, let's look at the detailed information about the collection.
collection_info = client.get_collection(collection_name)
print(collection_info)
In the response, you will see vectors_count=1, which means there is 1 vector in the collection.
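Since the collection now contains a point, you can also run a quick similarity search against it. The sketch below reuses the same embedding as the query vector purely for illustration; in practice, the query vector would come from a new piece of text embedded with the same model:

# Search for the points closest to the query vector. The collection was created
# with the DOT distance metric, so higher scores mean higher similarity.
search_results = client.search(
    collection_name=collection_name,
    query_vector=generated_vector_embeddings,
    limit=3,
)
for hit in search_results:
    print(hit.id, hit.score, hit.payload)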
What's Next?
If you want to know more about the Qdrant service, you can check out the Qdrant Documentation.