# Natural Language Queries This guide explains how to use **Code-Llama** to convert natural language questions into SQL queries using a notebook on the E2E Network AI Platform. --- ## Overview * Convert natural language queries into valid SQL * No SQL knowledge required * Uses Code-Llama Instruct model from HuggingFace --- ## Step 1: Launch a Instance 1. Sign in at **E2E Cloud** 2. Switch to **AI Platform** 3. Create and launch a Instance(Node) --- ## Step 2: Install Dependencies Run inside notebook: ```bash !pip install transformers accelerate ``` --- ## Step 3: Run NL → SQL Conversion Replace the token with your HuggingFace API Key: ```python import requests from IPython.display import Markdown, display input_text = "Find the names of all employees who are from California." API_URL = "https://api-inference.huggingface.co/models/codellama/CodeLlama-34b-Instruct-hf" headers = {"Authorization": "Bearer hf_XXXXXXXXXXXXXXXXXXXXXXXX"} def query(payload): response = requests.post(API_URL, headers=headers, json=payload) return response.json() output = query({"inputs": input_text}) while output[0]['generated_text'] != input_text: input_text = output[0]['generated_text'] output = query({'inputs': input_text}) display(Markdown(output[0]['generated_text'])) ``` ### HuggingFace Token Create your token here: > [https://huggingface.co/settings/tokens](https://huggingface.co/settings/tokens) --- ## Example Inputs & Outputs 🧠 **Example Query 1:** Retrieve customers who spent more than $100 in the last month. ```sql SELECT name, email FROM customers WHERE id IN ( SELECT customer_id FROM orders WHERE total_amount > 100 AND created_at > DATE_SUB(NOW(), INTERVAL 1 MONTH) ); ``` 🧠 **Example Query 2:** Average salary per department with >10 employees. ```sql SELECT d.name AS department_name, AVG(e.salary) AS average_salary, COUNT(e.id) AS total_employees, MAX(e.salary) AS highest_salary FROM departments d JOIN employees e ON d.id = e.department_id GROUP BY d.name HAVING COUNT(e.id) > 10; ``` --- ## Best Practices for Better Results ✅ Provide full context (table + filters + timeframe) ✅ Use explicit SQL-style references (e.g., `hire_date > '2023-01-01'`) ✅ Avoid vague terms (“recent”, “large sales”, “active users”) ✅ Include entity names (departments, customers, etc.) --- ## Challenges & Solutions | Challenge | Recommended Solution | | ----------------------- | ---------------------------------------------- | | SQL dialect differences | Match your DB engine (MySQL / Postgres / etc.) | | Complex multi-joins | Break query into smaller steps, refine output | | Missing schema context | Mention table + field names explicitly | | Incorrect SQL syntax | Validate + autorepair in application layer | | Data privacy risks | Use authentication and restricted roles | --- ## Conclusion Code-Llama helps: * Non-SQL users query databases easily * Speed up data-driven decisions * Avoid manual SQL writing It performs best when inputs are precise and unambiguous — treat it as a **power tool** that enhances human capability. --- ---