In this guide we show you how to set up a text-to-SQL pipeline over your data with our [query pipeline](https://docs.llamaindex.ai/en/stable/module_guides/querying/pipeline/root.html) syntax.
This gives you the flexibility to enhance text-to-SQL with additional techniques, which we cover in the sections below:
1. **Query-Time Table Retrieval**: Dynamically retrieve relevant tables for the text-to-SQL prompt.
2. **Query-Time Sample Row Retrieval**: Embed/index each row, and dynamically retrieve example rows for each table in the text-to-SQL prompt.
Our out-of-the-box pipelines include `NLSQLTableQueryEngine` and `SQLTableRetrieverQueryEngine`. (If you want to check out our text-to-SQL guide using these modules, take a look [here](https://docs.llamaindex.ai/en/stable/examples/index_structs/struct_indices/SQLIndexDemo.html).) This guide implements an advanced version of those modules, giving you the flexibility to apply this to your own setting.
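To make the idea of query-time table retrieval concrete, here is a minimal, standard-library-only sketch. It is not the LlamaIndex implementation: it swaps embeddings for a toy bag-of-words cosine similarity, and `bag_of_words`, `cosine`, and `retrieve_tables` are hypothetical helper names. The table names and descriptions are the ones that appear in this guide's traces.

```python
import math
import re
from collections import Counter

# Table names and descriptions as they appear in this guide's traces.
TABLE_SUMMARIES = {
    "Bad_Boy_Artists": "List of artists signed to Bad Boy Records and their album releases",
    "Academy_Awards_1972": "List of award categories and nominees for the 1972 Academy Awards",
    "Italian_Presidents": "List of Italian Presidents and their terms in office",
}


def bag_of_words(text):
    """Lowercase the text and count its alphanumeric tokens (toy stand-in for an embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve_tables(query_str, summaries, top_k=1):
    """Return the names of the top_k tables whose name + description best match the query."""
    query_vec = bag_of_words(query_str)

    def score(name):
        table_text = name.replace("_", " ") + " " + summaries[name]
        return cosine(query_vec, bag_of_words(table_text))

    return sorted(summaries, key=score, reverse=True)[:top_k]


print(retrieve_tables(
    "What was the year that The Notorious B.I.G was signed to Bad Boy?",
    TABLE_SUMMARIES,
))
```

Only the schemas of the retrieved tables would then be placed in the text-to-SQL prompt, which keeps the prompt small when the database has many tables.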
Given an input question, first create a syntactically correct {dialect} query to run, then look at the results of the query and return the answer. You can order the results by a relevant column to return the most interesting examples in the database.
Never query for all the columns from a specific table, only ask for a few relevant columns given the question.
Pay attention to use only the column names that you can see in the schema description. Be careful to not query for columns that do not exist. Pay attention to which column is in which table. Also, qualify column names with the table name when needed. You are required to use the following format, each taking one line:
```python
response = qp.run(
    query="What was the year that The Notorious B.I.G was signed to Bad Boy?"
)
print(str(response))
```
%% Output
> Running module input with input:
query: What was the year that The Notorious B.I.G was signed to Bad Boy?
> Running module table_retriever with input:
input: What was the year that The Notorious B.I.G was signed to Bad Boy?
> Running module table_output_parser with input:
table_schema_objs: [SQLTableSchema(table_name='Bad_Boy_Artists', context_str='List of artists signed to Bad Boy Records and their album releases'), SQLTableSchema(table_name='Bad_Boy_Artists', context_str='List of artis...
> Running module text2sql_prompt with input:
query_str: What was the year that The Notorious B.I.G was signed to Bad Boy?
schema: Table 'Bad_Boy_Artists' has columns: Act (VARCHAR), Year_signed (INTEGER), _Albums_released_under_Bad_Boy (VARCHAR), and foreign keys: . The table description is: List of artists signed to Bad Boy Rec...
> Running module text2sql_llm with input:
messages: Given an input question, first create a syntactically correct sqlite query to run, then look at the results of the query and return the answer. You can order the results by a relevant column to return...
> Running module sql_output_parser with input:
response: assistant: SELECT Year_signed
FROM Bad_Boy_Artists
WHERE Act = 'The Notorious B.I.G'
SQLResult: 1993
Answer: The Notorious B.I.G was signed to Bad Boy in 1993.
RAW RESPONSE SELECT Year_signed
FROM Bad_Boy_Artists
WHERE Act = 'The Notorious B.I.G'
SQLResult: 1993
Answer: The Notorious B.I.G was signed to Bad Boy in 1993.
> Running module sql_retriever with input:
input: SELECT Year_signed
FROM Bad_Boy_Artists
WHERE Act = 'The Notorious B.I.G'
> Running module response_synthesis_prompt with input:
query_str: What was the year that The Notorious B.I.G was signed to Bad Boy?
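In the trace above, the LLM appends its own guessed `SQLResult:` and `Answer:` lines after the SQL, yet `sql_retriever` receives only the `SELECT` statement. A minimal sketch of how an output parser might do that trimming is shown below. It works on plain strings rather than the pipeline's chat response object, the function name `parse_response_to_sql` is illustrative, and the `SQLQuery:` label is an assumption about the prompt format, which is truncated in this guide.

```python
def parse_response_to_sql(response_text: str) -> str:
    """Keep only the SQL statement from a raw text-to-SQL completion."""
    # Assumed label: drop everything up to and including "SQLQuery:" if present.
    start = response_text.find("SQLQuery:")
    if start != -1:
        response_text = response_text[start + len("SQLQuery:"):]
    # Drop the model's guessed result and answer, which follow "SQLResult:".
    end = response_text.find("SQLResult:")
    if end != -1:
        response_text = response_text[:end]
    return response_text.strip()


raw = (
    "SELECT Year_signed\n"
    "FROM Bad_Boy_Artists\n"
    "WHERE Act = 'The Notorious B.I.G'\n"
    "SQLResult: 1993\n"
    "Answer: The Notorious B.I.G was signed to Bad Boy in 1993."
)
print(parse_response_to_sql(raw))
```

Discarding the guessed result matters because the real answer should come from executing the query against the database, not from the LLM's completion.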
```python
response = qp.run(query="Who won best director in the 1972 academy awards")
print(str(response))
```
%% Output
> Running module input with input:
query: Who won best directory in the 1972 academy awards
> Running module table_retriever with input:
input: Who won best directory in the 1972 academy awards
> Running module table_output_parser with input:
table_schema_objs: [SQLTableSchema(table_name='Academy_Awards_1972', context_str='List of award categories and nominees for the 1972 Academy Awards'), SQLTableSchema(table_name='Academy_Awards_1972', context_str='List o...
> Running module text2sql_prompt with input:
query_str: Who won best directory in the 1972 academy awards
schema: Table 'Academy_Awards_1972' has columns: Award (VARCHAR), Category (VARCHAR), Nominee (VARCHAR), Result (VARCHAR), and foreign keys: . The table description is: List of award categories and nominees f...
> Running module text2sql_llm with input:
messages: Given an input question, first create a syntactically correct sqlite query to run, then look at the results of the query and return the answer. You can order the results by a relevant column to return...
> Running module sql_output_parser with input:
response: assistant: SELECT Nominee
FROM Academy_Awards_1972
WHERE Category = 'Best Director' AND Result = 'Won'
SQLResult: The result of the SQLQuery will be the name of the director who won the Best Director ...
RAW RESPONSE SELECT Nominee
FROM Academy_Awards_1972
WHERE Category = 'Best Director' AND Result = 'Won'
SQLResult: The result of the SQLQuery will be the name of the director who won the Best Director award in the 1972 Academy Awards.
Answer: The winner of the Best Director award in the 1972 Academy Awards was [Director's Name].
> Running module sql_retriever with input:
input: SELECT Nominee
FROM Academy_Awards_1972
WHERE Category = 'Best Director' AND Result = 'Won'
> Running module response_synthesis_prompt with input:
query_str: Who won best directory in the 1972 academy awards
sql_query: SELECT Nominee
FROM Academy_Awards_1972
WHERE Category = 'Best Director' AND Result = 'Won'
```python
response = qp.run(query="What was the term of Pasquale Preziosa?")
print(str(response))
```
%% Output
> Running module input with input:
query: What was the term of Pasquale Preziosa?
> Running module table_retriever with input:
input: What was the term of Pasquale Preziosa?
> Running module table_output_parser with input:
table_schema_objs: [SQLTableSchema(table_name='Italian_Presidents', context_str='List of Italian Presidents and their terms in office'), SQLTableSchema(table_name='Italian_Presidents', context_str='List of Italian Presi...
> Running module text2sql_prompt with input:
query_str: What was the term of Pasquale Preziosa?
schema: Table 'Italian_Presidents' has columns: Name (VARCHAR), Term_start (VARCHAR), Term_end (VARCHAR), and foreign keys: . The table description is: List of Italian Presidents and their terms in office
Ta...
> Running module text2sql_llm with input:
messages: Given an input question, first create a syntactically correct sqlite query to run, then look at the results of the query and return the answer. You can order the results by a relevant column to return...
> Running module sql_output_parser with input:
## 2. Advanced Capability 2: Text-to-SQL with Query-Time Row Retrieval (along with Table Retrieval)
One problem in the previous example is that if the user asks about "The Notorious BIG" but the artist is stored as "The Notorious B.I.G", the generated SELECT statement will likely not return any matches.
We can alleviate this problem by fetching a small number of example rows per table. A naive option would be to just take the first k rows. Instead, we embed, index, and retrieve the k most relevant rows given the user query, which gives the text-to-SQL LLM the most contextually relevant information for SQL generation.
We extend our `table_parser_component` to return not only the relevant table schemas but also relevant rows for each table schema.
It now takes in both `table_schema_objs` (the output of the table retriever) and the original `query_str`, which is then used for vector retrieval of relevant rows.
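The row-retrieval step described above can be sketched with the standard library alone. This is a toy illustration under stated assumptions, not the guide's actual component: a bag-of-words cosine similarity stands in for vector embeddings, `retrieve_rows` and `build_table_context` are hypothetical names, the schema string is simplified to two columns, and the two "Placeholder" rows are invented purely for illustration.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count its alphanumeric tokens (toy stand-in for an embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve_rows(query_str, rows, top_k=2):
    """Return the top_k rows most similar to the query."""
    query_vec = bag_of_words(query_str)

    def score(row):
        return cosine(query_vec, bag_of_words(" ".join(str(v) for v in row)))

    return sorted(rows, key=score, reverse=True)[:top_k]


def build_table_context(query_str, schema_str, rows, top_k=2):
    """Append the most relevant example rows to a table's schema description."""
    lines = [schema_str, "Relevant example rows (values in column order):"]
    lines.extend(str(row) for row in retrieve_rows(query_str, rows, top_k))
    return "\n".join(lines)


# Simplified schema; the placeholder rows are hypothetical.
SCHEMA = "Table 'Bad_Boy_Artists' has columns: Act (VARCHAR), Year_signed (INTEGER)"
ROWS = [
    ("Placeholder Act A", 2001),
    ("The Notorious B.I.G", 1993),
    ("Placeholder Act B", 2005),
]

QUERY = "What was the year that The Notorious BIG was signed to Bad Boy?"
context = build_table_context(QUERY, SCHEMA, ROWS, top_k=1)
print(context)
```

Because the retrieved row shows the stored spelling "The Notorious B.I.G", the LLM has a chance to use that exact value in the `WHERE` clause even though the user typed "BIG".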
```python
response = qp.run(
    query="What was the year that The Notorious BIG was signed to Bad Boy?"
)
print(str(response))
```
%% Output
> Running module input with input:
query: What was the year that The Notorious BIG was signed to Bad Boy?
> Running module table_retriever with input:
input: What was the year that The Notorious BIG was signed to Bad Boy?
> Running module table_output_parser with input:
query_str: What was the year that The Notorious BIG was signed to Bad Boy?
table_schema_objs: [SQLTableSchema(table_name='Bad_Boy_Artists', context_str='List of artists signed to Bad Boy Records and their album releases'), SQLTableSchema(table_name='Bad_Boy_Artists', context_str='List of artis...
> Running module text2sql_prompt with input:
query_str: What was the year that The Notorious BIG was signed to Bad Boy?
schema: Table 'Bad_Boy_Artists' has columns: Act (VARCHAR), Year_signed (INTEGER), _Albums_released_under_Bad_Boy (VARCHAR), and foreign keys: . The table description is: List of artists signed to Bad Boy Rec...
> Running module text2sql_llm with input:
messages: Given an input question, first create a syntactically correct sqlite query to run, then look at the results of the query and return the answer. You can order the results by a relevant column to return...
> Running module sql_output_parser with input:
response: assistant: SELECT Year_signed
FROM Bad_Boy_Artists
WHERE Act = 'The Notorious B.I.G'
SQLResult: 1993
Answer: The Notorious BIG was signed to Bad Boy in 1993.
RAW RESPONSE SELECT Year_signed
FROM Bad_Boy_Artists
WHERE Act = 'The Notorious B.I.G'
SQLResult: 1993
Answer: The Notorious BIG was signed to Bad Boy in 1993.
> Running module sql_retriever with input:
input: SELECT Year_signed
FROM Bad_Boy_Artists
WHERE Act = 'The Notorious B.I.G'
> Running module response_synthesis_prompt with input:
query_str: What was the year that The Notorious BIG was signed to Bad Boy?