RAG (Retrieval-Augmented Generation)
Enhance LLM responses by connecting them to your own knowledge base, delivering accurate, context-specific answers without retraining.
Quick Start
Create a Knowledge Base
Set up a knowledge base in a few steps: choose an embedding model, configure chunking, and import your documents.
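To make the chunking and embedding steps concrete, here is a minimal Python sketch of the general idea. It is illustrative only: the chunk size, overlap, and the `embed()` helper are assumptions, not the platform's defaults or API.

```python
# Illustrative sketch only: chunk size, overlap, and the embedding call are
# assumptions, not the platform's defaults or API.
from typing import List

def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> List[str]:
    """Split text into fixed-size character chunks with overlap so context
    is preserved across chunk boundaries."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

def embed(chunk: str) -> List[float]:
    """Stand-in for the embedding model chosen for the knowledge base.
    In practice this would call that model's API and return its vector."""
    return [float(ord(c)) for c in chunk[:8]]  # dummy vector for illustration

document = "Your imported document text goes here... " * 50
chunks = chunk_text(document)
vectors = [embed(c) for c in chunks]  # the knowledge base stores these vectors for retrieval
print(f"{len(chunks)} chunks, {len(vectors)} embeddings")
```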
Set Up a Chat Assistant
Build a conversational assistant: attach knowledge bases, configure the Prompt Engine, and select your LLM.
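The sketch below outlines the retrieve-then-generate flow a chat assistant follows at query time. The `retrieve()` and `call_llm()` helpers are hypothetical placeholders standing in for the knowledge base search and the selected LLM, not the platform's API.

```python
# Minimal sketch of the retrieve-then-generate flow behind a chat assistant.
# retrieve() and call_llm() are hypothetical placeholders, not the platform's API.
from typing import List

def retrieve(question: str, top_k: int = 3) -> List[str]:
    """Stand-in for vector search over the attached knowledge base(s):
    embed the question and return the top_k most similar chunks."""
    return ["<chunk 1>", "<chunk 2>", "<chunk 3>"][:top_k]

def call_llm(prompt: str) -> str:
    """Stand-in for the selected LLM; in practice this calls the GenAI API
    or a Model Endpoint."""
    return "<model answer grounded in the retrieved context>"

def answer(question: str) -> str:
    context = "\n\n".join(retrieve(question))
    # The Prompt Engine assembles something along these lines: retrieved
    # context plus the user's question, framed by an instruction.
    prompt = (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return call_llm(prompt)

print(answer("What is our refund policy?"))
```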
What is RAG?
Explore the core concepts, architecture, and supported use cases of Retrieval-Augmented Generation.
Understand Billing
RAG billing covers storage, retrieval tokens, and GenAI API calls, with detailed worked examples for each billing scenario.
Explore RAG
Knowledge Base
Create knowledge bases, manage documents, and evaluate retrieval quality
Chat Assistant
Build and configure chat assistants with custom prompts and LLM models
Advanced Features
Configure embedding models, chunking strategies, and retrieval optimization
Billing
Understand storage, retrieval, and GenAI API cost components
Billing & Credits
RAG billing covers three components: storage for knowledge base documents, retrieval (search & indexing), and GenAI API calls for embeddings and LLM inference.
Storage
Knowledge base documents and vector embeddings billed at Rs 8/GB/month.
Retrieval
Vector search and chunk retrieval charged at Rs 10 per 1 million tokens.
GenAI API
Embedding model and LLM token usage billed per GenAI model pricing. No additional cost when using Model Endpoints.
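As a purely illustrative estimate using the rates above, the short calculation below works out a month's storage and retrieval cost. The workload figures (2 GB stored, 5 million retrieval tokens) are made-up assumptions, and GenAI API usage would be billed separately per model pricing.

```python
# Illustrative monthly cost estimate using the rates above.
# The workload numbers (2 GB stored, 5 million retrieval tokens) are assumptions;
# GenAI API usage is billed separately per model pricing.
STORAGE_RATE_PER_GB = 8.0            # Rs per GB per month
RETRIEVAL_RATE_PER_M_TOKENS = 10.0   # Rs per 1 million retrieval tokens

stored_gb = 2.0
retrieval_tokens = 5_000_000

storage_cost = stored_gb * STORAGE_RATE_PER_GB                                  # 2 * 8  = Rs 16
retrieval_cost = (retrieval_tokens / 1_000_000) * RETRIEVAL_RATE_PER_M_TOKENS   # 5 * 10 = Rs 50

print(f"Storage:   Rs {storage_cost:.2f}")
print(f"Retrieval: Rs {retrieval_cost:.2f}")
print(f"Total (excluding GenAI API calls): Rs {storage_cost + retrieval_cost:.2f}")
```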