Skip to main content

Natural Language Queries

This guide explains how to use Code-Llama to convert natural language questions into SQL queries using a notebook on the E2E Network AI Platform.


Overview

  • Convert natural language queries into valid SQL
  • No SQL knowledge required
  • Uses Code-Llama Instruct model from HuggingFace

Step 1: Launch a Instance

  1. Sign in at E2E Cloud
  2. Switch to AI Platform
  3. Create and launch a Instance(Node)

Step 2: Install Dependencies

Run inside notebook:

!pip install transformers accelerate

Step 3: Run NL → SQL Conversion

Replace the token with your HuggingFace API Key:

import requests
from IPython.display import Markdown, display

input_text = "Find the names of all employees who are from California."
API_URL = "https://api-inference.huggingface.co/models/codellama/CodeLlama-34b-Instruct-hf"
headers = {"Authorization": "Bearer hf_XXXXXXXXXXXXXXXXXXXXXXXX"}

def query(payload):
response = requests.post(API_URL, headers=headers, json=payload)
return response.json()

output = query({"inputs": input_text})

while output[0]['generated_text'] != input_text:
input_text = output[0]['generated_text']
output = query({'inputs': input_text})

display(Markdown(output[0]['generated_text']))

HuggingFace Token

Create your token here:

https://huggingface.co/settings/tokens


Example Inputs & Outputs

🧠 Example Query 1: Retrieve customers who spent more than $100 in the last month.

SELECT name, email
FROM customers
WHERE id IN (
SELECT customer_id
FROM orders
WHERE total_amount > 100
AND created_at > DATE_SUB(NOW(), INTERVAL 1 MONTH)
);

🧠 Example Query 2: Average salary per department with >10 employees.

SELECT d.name AS department_name,
AVG(e.salary) AS average_salary,
COUNT(e.id) AS total_employees,
MAX(e.salary) AS highest_salary
FROM departments d
JOIN employees e ON d.id = e.department_id
GROUP BY d.name
HAVING COUNT(e.id) > 10;

Best Practices for Better Results

✅ Provide full context (table + filters + timeframe)

✅ Use explicit SQL-style references (e.g., hire_date > '2023-01-01')

✅ Avoid vague terms (“recent”, “large sales”, “active users”)

✅ Include entity names (departments, customers, etc.)


Challenges & Solutions

ChallengeRecommended Solution
SQL dialect differencesMatch your DB engine (MySQL / Postgres / etc.)
Complex multi-joinsBreak query into smaller steps, refine output
Missing schema contextMention table + field names explicitly
Incorrect SQL syntaxValidate + autorepair in application layer
Data privacy risksUse authentication and restricted roles

Conclusion

Code-Llama helps:

  • Non-SQL users query databases easily
  • Speed up data-driven decisions
  • Avoid manual SQL writing

It performs best when inputs are precise and unambiguous — treat it as a power tool that enhances human capability.