Synopsis: Text-to-SQL with LlamaIndex¶
Text-to-SQL: Talk to your data using human language and contemporary large language models, optionally offline.
Install¶
Project dependencies. For example, use them in a requirements.txt file.
langchain-openai<0.4
llama-index-embeddings-langchain<0.5
llama-index-embeddings-openai<0.6
llama-index-llms-azure-openai<0.5
llama-index-llms-ollama<0.8
llama-index-llms-openai<0.6
sqlalchemy-cratedb
Walkthrough¶
Import Python modules.
import os
from llama_index.llms.ollama import Ollama
from llama_index.llms.openai import OpenAI
from llama_index.core import SQLDatabase
from llama_index.core.query_engine import NLSQLTableQueryEngine
import sqlalchemy as sa
Provision an LLM using an OpenAI model.
llm = OpenAI(
model=os.getenv("OPENAI_MODEL", "gpt-4.1"),
temperature=0.0,
api_key=os.getenv("OPENAI_API_KEY"),
)
Alternatively, provision an LLM using a self-hosted model.
llm = Ollama(
base_url=os.getenv("OLLAMA_BASE_URL", "http://localhost:11434"),
model=os.getenv("OLLAMA_MODEL", "gemma3:1b"),
temperature=0.0,
request_timeout=120.0,
keep_alive=-1,
)
Connect to CrateDB.
database = sa.create_engine(os.getenv("CRATEDB_SQLALCHEMY_URL", "crate://crate@localhost:4200"))
Invoke Text-to-SQL query.
sql_database = SQLDatabase(engine=database)
nlsql = NLSQLTableQueryEngine(sql_database=sql_database, llm=llm)
answer = nlsql.query("What is the average value for sensor 1?")
Also try other languages.
answer = nlsql.query("Яке середнє значення для датчика 1?")
answer = nlsql.query("¿Cuál es el valor promedio del sensor 1?")
answer = nlsql.query("Was ist der Durchschnittswert für Sensor 1?")
answer = nlsql.query("Quelle est la valeur moyenne pour le capteur 1 ?")
Full code example¶
import os
import sqlalchemy as sa
from llama_index.core import SQLDatabase
from llama_index.core.query_engine import NLSQLTableQueryEngine
from llama_index.core import Settings
engine = sa.create_engine("crate://localhost:4200/")
engine.connect()
sql_database = SQLDatabase(
engine,
include_tables=["testdrive"]
)
query_engine = NLSQLTableQueryEngine(
sql_database=sql_database,
tables=[os.getenv("CRATEDB_TABLE_NAME")],
llm=Settings.llm
)
query_str = "What is the average value for sensor 1?"
answer = query_engine.query(query_str)
print(answer.get_formatted_sources())
print("Query was:", query_str)
print("Answer was:", answer)
# query was: What is the average value for sensor 1?
# answer was: The average value for sensor 1 is 17.03.
Note
Please find the executable example at CrateDB Examples » llama-index.