Knowledge Base

A knowledge base (KB) is a structured repository of information that a retrieval model searches to find relevant context for a large language model (LLM), so that the LLM can generate accurate responses. In a retrieval-augmented generation (RAG) system, the KB acts as the "memory" from which relevant knowledge is retrieved and used to improve response quality.

Key Components and Functionality of a Knowledge Base in RAG

Data Sources and Parsing

The KB typically consists of various types of data, including documents (FAQs, guides, reports), structured datasets, and other relevant content. Data is preprocessed and parsed into smaller chunks (paragraphs, sentences, or user-defined sections) to enable efficient retrieval.

Embedding and Vectorization

Each chunk of text is converted into an embedding (a numerical vector that captures the semantic meaning of the text) using models such as BERT, RoBERTa, or specialized embedding models. These embeddings support similarity-based search, so the retriever can locate contextually relevant information efficiently.
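To make similarity-based search concrete, the following is a minimal sketch that compares a query against two chunks using cosine similarity. The `toy_embed` function is a hypothetical stand-in (a bag-of-words count vector), not the embedding model the platform uses; a real embedding model would also capture meaning beyond shared words.

```python
import math
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: word-count vector."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse vectors (0.0 if either is empty)."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


query = toy_embed("reset my password")
chunk_a = toy_embed("how to reset a forgotten password")
chunk_b = toy_embed("quarterly revenue report")

# The chunk that shares vocabulary with the query scores higher.
score_a = cosine_similarity(query, chunk_a)
score_b = cosine_similarity(query, chunk_b)
```

Here `score_a` is greater than `score_b`, so a retriever ranking by this score would return the password chunk first.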

Indexing for Efficient Search

The KB is indexed to support rapid retrieval based on vector similarity, so that user queries can be matched with semantically similar chunks. Similarity-search libraries such as FAISS (Facebook AI Similarity Search) are commonly used to build and query these indexes.
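As an illustration of what a vector index does, here is a minimal brute-force sketch in plain Python. The `FlatIndex` class, the chunk IDs, and the three-dimensional vectors are all hypothetical; this is not the FAISS API, and libraries such as FAISS offer far faster structures for large collections.

```python
import heapq
import math


def _normalize(vector: list[float]) -> list[float]:
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]


class FlatIndex:
    """Exhaustive vector index: every query is compared with every stored vector."""

    def __init__(self) -> None:
        self._items: list[tuple[str, list[float]]] = []

    def add(self, chunk_id: str, vector: list[float]) -> None:
        self._items.append((chunk_id, _normalize(vector)))

    def search(self, query: list[float], k: int = 3) -> list[tuple[str, float]]:
        q = _normalize(query)
        scored = (
            (chunk_id, sum(a * b for a, b in zip(q, vec)))
            for chunk_id, vec in self._items
        )
        # Keep only the k highest-scoring chunks, best first.
        return heapq.nlargest(k, scored, key=lambda pair: pair[1])


index = FlatIndex()
index.add("refund-policy", [0.9, 0.1, 0.0])
index.add("shipping-times", [0.1, 0.9, 0.1])
index.add("privacy-notice", [0.0, 0.2, 0.9])

top = index.search([1.0, 0.2, 0.0], k=2)
```

The exhaustive scan costs one comparison per stored chunk, which is why larger knowledge bases typically rely on a dedicated index instead.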

Retrieval of Contextual Data

When a user query is received, the RAG system’s retriever searches the KB for chunks that closely match the meaning of the query. The retrieved chunks are then passed to the LLM, which uses them as context to generate a more accurate and informed response.
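The hand-off from retriever to LLM can be sketched as simple prompt assembly. The `build_prompt` function and the prompt wording below are assumptions made for illustration; the template a given RAG system actually uses may differ.

```python
def build_prompt(query: str, retrieved_chunks: list[str], max_chunks: int = 3) -> str:
    """Combine the top retrieved chunks and the user query into one prompt."""
    context = "\n\n".join(
        f"[{i}] {chunk.strip()}"
        for i, chunk in enumerate(retrieved_chunks[:max_chunks], start=1)
    )
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\n"
        "Answer:"
    )


prompt = build_prompt(
    "How long do refunds take?",
    [
        "Refunds are processed within 5 business days.",
        "Shipping takes 2 to 4 days.",
    ],
)
```

Numbering the chunks makes it easier to trace which piece of context the generated answer relied on.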

Knowledge Base Updating and Maintenance

The KB in a RAG system can be updated with new information as it becomes available, ensuring that the system remains accurate and up to date. Changes in company policies, new product information, and other relevant updates are incorporated into the KB, and embeddings are recalculated as required.
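One way to decide which embeddings need recalculating is to compare a hash of each chunk's text with the hash stored at the last run. This is a sketch under that assumption; the function names and chunk IDs are hypothetical, and the platform may track changes differently.

```python
import hashlib


def content_hash(text: str) -> str:
    """Stable fingerprint of a chunk's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def chunks_to_reembed(
    stored_hashes: dict[str, str], current_chunks: dict[str, str]
) -> list[str]:
    """Return IDs of chunks that are new or whose text changed since the last run."""
    return sorted(
        chunk_id
        for chunk_id, text in current_chunks.items()
        if stored_hashes.get(chunk_id) != content_hash(text)
    )


stored = {
    "policy-1": content_hash("Returns accepted within 30 days."),
    "policy-2": content_hash("Free shipping over $50."),
}
current = {
    "policy-1": "Returns accepted within 45 days.",  # edited
    "policy-2": "Free shipping over $50.",  # unchanged
    "policy-3": "Gift cards never expire.",  # new
}

stale = chunks_to_reembed(stored, current)
```

Only the edited and the new chunk are flagged, so unchanged content is not embedded again.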

Getting Started

Create a New Knowledge Base

Step 1: Log in to the TIR AI Platform

Ensure that you are logged in and working within the correct project. If required, create a new project before proceeding.

Step 2: Navigate to the Knowledge Base Section

From the TIR dashboard:

Locate the primary navigation panel (typically on the left side of the interface)
Select the RAG or Knowledge Base section
This will open the page where all existing knowledge bases are listed

Step 3: Create a Knowledge Base

On the Knowledge Base listing page:

Locate the Create Knowledge Base button (typically positioned in the top-right corner)
Click this button to begin the creation process

Step 4: Configure Knowledge Base Details

You will be redirected to the Create Knowledge Base page.

On this page:

Fill in all required fields presented in the form
Ensure all mandatory configurations are completed before proceeding

Model Source

Users can choose the model source as either GenAI or Endpoint.

If GenAI is selected:
- Open the embedding model dropdown
- Select a predefined embedding model from the available options
If Endpoint is selected:
- Select the embedding model from the model endpoint list
- Select the corresponding model name
- Perform model validation to ensure that the selected model exists and is available for use
- Only proceed after successful validation

Embedding Model

Embedding models in a knowledge base are used to convert textual data (or other data types) into numerical representations, known as embeddings, that capture the semantic meaning of the content. These embeddings enable the knowledge base to perform efficient search, comparison, and retrieval of relevant information based on meaning, rather than relying solely on exact keyword matching.

Model Source (Repeated Configuration)

Users can again verify or modify the model source configuration as either GenAI or Endpoint.

If GenAI is selected, ensure a predefined embedding model is selected.
If Endpoint is selected, ensure both embedding model and model name are selected and validated before proceeding.

Chunking Method

Chunking in a knowledge base is a method of dividing larger documents or datasets into smaller, manageable units of information, referred to as "chunks." This approach is particularly important in systems that rely on retrieval-augmented generation (RAG) models or semantic search techniques, as it improves the system’s ability to retrieve relevant context and generate accurate responses by operating on smaller, more focused pieces of content.

Chunk Size

Chunk size in a knowledge base refers to the length of each individual chunk, which may be defined in terms of words, tokens, sentences, or characters. Selecting an optimal chunk size is critical, as it directly impacts context preservation, retrieval accuracy, and computational efficiency. Proper tuning of chunk size ensures that each chunk contains sufficient contextual information while maintaining efficient processing and retrieval performance.
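The effect of chunk size can be seen in a small sketch that measures size in words. The `chunk_by_words` helper is hypothetical and only illustrates the trade-off; the platform's own chunker may count tokens or characters instead.

```python
def chunk_by_words(text: str, chunk_size: int) -> list[str]:
    """Split text into consecutive chunks of at most chunk_size words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    words = text.split()
    return [
        " ".join(words[start:start + chunk_size])
        for start in range(0, len(words), chunk_size)
    ]


text = "one two three four five six seven eight nine ten"

# Larger chunks keep more context together but produce fewer retrieval units.
large = chunk_by_words(text, chunk_size=5)
small = chunk_by_words(text, chunk_size=3)
```

With ten words, a size of 5 yields two chunks and a size of 3 yields four, the last of which holds the single leftover word.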

Delimiter

A delimiter in a knowledge base is a character or sequence of characters used to separate sections of text or data. Delimiters play a key role in chunking, parsing, and structuring content within the knowledge base, enabling the system to accurately process, segment, and retrieve information in a consistent and reliable manner.
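A minimal sketch of delimiter-based splitting follows. The default delimiter here (a blank line) and the `split_on_delimiter` name are assumptions chosen for illustration, not the platform's defaults.

```python
def split_on_delimiter(text: str, delimiter: str = "\n\n") -> list[str]:
    """Split text on a delimiter, trimming whitespace and dropping empty pieces."""
    if not delimiter:
        raise ValueError("delimiter must be a non-empty string")
    return [part.strip() for part in text.split(delimiter) if part.strip()]


doc = "Intro paragraph.\n\nSecond paragraph.\n\n\n\nThird paragraph."
paragraphs = split_on_delimiter(doc)

# Any character sequence can act as the delimiter.
fields = split_on_delimiter("alpha;beta;;gamma", delimiter=";")
```

Dropping empty pieces keeps repeated delimiters from producing blank chunks.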

Questions

The system can automatically extract N questions from each chunk. Storing these questions with the chunk raises its ranking for queries that contain them.

After chunk creation, navigate to the chunk list within the knowledge base
Review the automatically generated questions for each chunk
Modify or update the questions if required

Additional considerations:

If an error occurs during question extraction, the chunking process will continue
In such cases, empty results may be added to the original chunk
This feature consumes additional tokens from the LLM configured in the system model settings

Step 5: Launch Knowledge Base

After completing all configurations:

Locate the Launch button on the Create Knowledge Base page
Click Launch to finalize and create the Knowledge Base

Knowledge Base List

After creation, all knowledge bases are displayed in a list view.

Each row represents an individual knowledge base
To manage a knowledge base:
- Click on the corresponding row or entry
- This will open the detailed view for that knowledge base

Documents

After selecting a knowledge base, you will be redirected to the Documents section.

Dashboard

The dashboard displays:

A table of all uploaded documents
Document status
Available actions for each document

Import Data

To upload data into the knowledge base:

Locate and click the Import Data button within the Documents section
This opens the data import interface

You can upload data using two methods:

Browse Files

In the import interface, locate the upload area labeled: “Drop a file or click to upload”
Click the upload area or drag and drop files
Select one or more files from your local system
After selection, click the Import button to upload the files

Import from TIR Dataset

In the import interface, locate the TIR Dataset dropdown
Click the dropdown to view available datasets associated with your account
Select the desired dataset
Choose the specific file within the dataset
Click the Import button to add the file to the knowledge base

Bulk Actions

Bulk actions allow operations to be performed on multiple documents simultaneously.

In the document table, use the selection checkboxes to select multiple documents
Once selected, use the bulk action controls located above the table
Apply the desired action (such as parsing or deletion) to all selected documents

Status Filter

The status filter allows you to quickly locate documents based on their processing status.

Locate the filter control above the document table
Select a status (e.g., processed, pending, failed)
The table will update to display only documents matching the selected status

Actions

Each document entry in the table includes the following actions:

Parsing

Locate the parsing option within the document row
Click the parsing action to initiate processing for the selected document
This is required for documents that have been newly uploaded or modified

Edit

Locate and click the Edit option for a document
A configuration panel will open
Update parameters such as:
- Chunking method
- Chunk size
- Delimiter
Save changes to apply the updated configuration

Delete

Locate the Delete option within the document row
Click Delete to remove the document
Confirm the action when prompted

Download

Locate the Download option within the document row
Click Download to save the document to your local system

Retrieval Testing

Retrieval testing evaluates how effectively the system retrieves relevant information based on user queries.

Key Components of Retrieval Testing

Test Queries

Develop a set of diverse test queries representing real user inputs
Include:
- Keyword-based queries
- Natural language questions
- Entity-specific queries

Expected Results

Define expected results for each test query
Use these results to evaluate the relevance and correctness of retrieved data

Retrieval Mechanism

Identify the retrieval approach used by the system:
- Keyword matching
- Semantic search
- Embedding-based retrieval
Ensure that the retrieval mechanism aligns with the intended system architecture

Performance Metrics

Measure retrieval effectiveness using:

Precision: Ratio of relevant results retrieved to total results retrieved
Recall: Ratio of relevant results retrieved to total relevant results available
F1 Score: Harmonic mean of precision and recall
Response Time: Time taken to retrieve and present results
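The precision, recall, and F1 definitions above can be computed for a single test query as in the sketch below. The function name and the chunk IDs are hypothetical; `relevant` stands for the expected results defined for that query.

```python
def retrieval_metrics(retrieved: list[str], relevant: set[str]) -> dict[str, float]:
    """Precision, recall, and F1 for one query, based on chunk IDs."""
    retrieved_set = set(retrieved)
    hits = len(retrieved_set & relevant)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


# Four chunks retrieved, two of which are among the three expected ones.
metrics = retrieval_metrics(["c1", "c2", "c3", "c4"], {"c1", "c3", "c7"})
```

In this example precision is 2/4, recall is 2/3, and F1 is their harmonic mean, 4/7. Averaging these values over the whole set of test queries gives a summary of retrieval quality.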

Configuration

Within a specific knowledge base, the Configuration section allows updates to:

Chunk size
Delimiter
Questions

To update configuration:

Navigate to the Configuration section of the selected knowledge base
Modify the required parameters
Save changes to apply updates

Data Imports

The Data Imports section allows you to manage data synchronization and view logs.

To access:

Navigate to the Data Imports tab within the knowledge base

Within this section:

Initiate data synchronization processes
View logs, including timestamps and status of import operations

Last updated on May 15, 2026.
