LLM-based Text2SQL

Gao, D., Wang, H., Li, Y., Sun, X., Qian, Y., Ding, B., & Zhou, J. (2023). Text-to-sql empowered by large language models: A benchmark evaluation. arXiv preprint arXiv:2308.15363.

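Personal summary: an experimental report on prompt engineering for LLMs on Text2SQL datasets. On the two datasets evaluated in the paper, its results are the best among open-source approaches. The proposed prompting scheme, DAIL-SQL, combines several existing RAG methods.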

Datasets

Spider is a large-scale complex and cross-domain semantic parsing and text-to-SQL dataset annotated by 11 Yale students. It consists of 10,181 questions and 5,693 unique complex SQL queries on 200 databases with multiple tables covering 138 different domains.

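In practice, judging from the published Data Examples, even the EXTRA HARD samples involve databases and SQL that are quite simple compared with real-world workloads.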

[Extra Hard] What is the average life expectancy in the countries where English is not the official language?

SELECT AVG(life_expectancy)
FROM country
WHERE name NOT IN 
   (SELECT T1.name
    FROM country AS T1 JOIN
    country_language AS T2
    ON T1.code = T2.country_code
    WHERE T2.language = "English"
      AND T2.is_official = "T")
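
To make the point about simplicity concrete, here is a minimal sketch that runs the query above with Python's built-in sqlite3 module. The table and column names come from the query itself; the column types and the three sample rows are assumptions made up for illustration, not Spider data. String literals use single quotes here, since double-quoted strings are a SQLite leniency that some builds disable.

```python
import sqlite3

# Toy in-memory database. Schema types and rows are hypothetical,
# chosen only so that the query has something to run against.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE country (code TEXT PRIMARY KEY, name TEXT, life_expectancy REAL);
CREATE TABLE country_language (country_code TEXT, language TEXT, is_official TEXT);
INSERT INTO country VALUES
    ('USA', 'United States', 78.0),
    ('FRA', 'France', 82.0),
    ('IND', 'India', 70.0);
INSERT INTO country_language VALUES
    ('USA', 'English', 'T'),
    ('FRA', 'French', 'T'),
    ('IND', 'English', 'F'),
    ('IND', 'Hindi', 'T');
""")

# The Extra Hard query: one join inside one NOT IN subquery.
query = """
SELECT AVG(life_expectancy)
FROM country
WHERE name NOT IN
   (SELECT T1.name
    FROM country AS T1 JOIN
    country_language AS T2
    ON T1.code = T2.country_code
    WHERE T2.language = 'English'
      AND T2.is_official = 'T')
"""
avg = conn.execute(query).fetchone()[0]
print(avg)
```

With these toy rows only the United States has English as an official language, so the average is taken over France and India. The whole task touches two tables and a single level of nesting, which is far less than a typical production analytics query.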

BIRD (BIg Bench for LaRge-scale Database Grounded Text-to-SQL Evaluation) contains over 12,751 unique question-SQL pairs, 95 big databases with a total size of 33.4 GB. It also covers more than 37 professional domains, such as blockchain,...
