- GPT-3, Language Models are Few-Shot Learners. NeurIPS 20. [Paper]
- T5, Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. [Paper]
- FLAN, Finetuned Language Models Are Zero-Shot Learners. ICLR 22. [Paper] [Code]
- DPO, Direct Preference Optimization: Your Language Model is Secretly a Reward Model. NeurIPS 23. [Paper]
- PEFT, The Power of Scale for Parameter-Efficient Prompt Tuning. EMNLP 21. [Paper]
- LoRA, LoRA: Low-Rank Adaptation of Large Language Models. ICLR 22. [Paper]
- Chain-of-thought Prompting, Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. NeurIPS 22. [Paper]
- Least-to-most Prompting, Least-to-Most Prompting Enables Complex Reasoning in Large Language Models. ICLR 23. [Paper]
- Self-consistency Prompting, Self-Consistency Improves Chain of Thought Reasoning in Language Models. ICLR 23. [Paper]
- ReAct, ReAct: Synergizing Reasoning and Acting in Language Models. ICLR 23. [Paper] [Code]
- TaBERT, TaBERT: Pretraining for Joint Understanding of Textual and Tabular Data. ACL 20 Main. [Paper] [Code]
- TaPEx, TAPEX: Table Pre-training via Learning a Neural SQL Executor. ICLR 22. [Paper] [Code] [Models]
- TABBIE, TABBIE: Pretrained Representations of Tabular Data. NAACL 21 Main. [Paper] [Code]
- TURL, TURL: Table Understanding through Representation Learning. VLDB 21. [Paper] [Code]
- RESDSQL, RESDSQL: Decoupling Schema Linking and Skeleton Parsing for Text-to-SQL. AAAI 23. [Paper] [Code]
- UnifiedSKG, UnifiedSKG: Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models. EMNLP 22 Main. [Paper] [Code]
- SpreadsheetCoder, SpreadsheetCoder: Formula Prediction from Semi-structured Context. ICML 21. [Paper] [Code]
- Table-GPT, Table-GPT: Table-tuned GPT for Diverse Table Tasks. arXiv 23. [Paper]
- TableLlama, TableLlama: Towards Open Large Generalist Models for Tables. NAACL 24. [Paper] [Code] [Model: TableLlama 7B] [Dataset: TableInstruct]
- Codex, Evaluating Large Language Models Trained on Code. arXiv 21. [Paper]
- StarCoder, StarCoder: may the source be with you!. TMLR 23. [Paper] [Code] [Models]
- Code Llama, Code Llama: Open Foundation Models for Code. arXiv 23. [Paper] [Code]
- WizardLM, WizardLM: Empowering Large Pre-Trained Language Models to Follow Complex Instructions. ICLR 24. [Paper] [Model: WizardLM 13B] [Model: WizardLM 70B]
- WizardCoder, WizardCoder: Empowering Code Large Language Models with Evol-Instruct. ICLR 24. [Paper] [Code] [Models: WizardCoder 15B]
- Magicoder, Magicoder: Source Code Is All You Need. ICML 24. [Paper] [Code] [Models 6.7B/7B]
- Lemur, Lemur: Harmonizing Natural Language and Code for Language Agents. ICLR 24. [Paper] [Code] [Model: Lemur 70B] [Model: Lemur 70B Chat]
- InfiAgent-DABench, InfiAgent-DABench: Evaluating Agents on Data Analysis Tasks. ICML 24. [Paper] [Code]
- TableLLM, TableLLM: Enabling Tabular Data Manipulation by LLMs in Real Office Usage Scenarios. [Paper] [Model TableLLM 7B] [Model TableLLM 13B]
- StructLM, StructLM: Towards Building Generalist Models for Structured Knowledge Grounding. arXiv 24. [Paper] [Model: StructLM 7B] [Model: StructLM 13B] [Model: StructLM 34B] [Dataset: SKGInstruct]
- FinSQL, FinSQL: Model-Agnostic LLMs-based Text-to-SQL Framework for Financial Analysis. SIGMOD Companion 24. [[Paper](https://arxiv.org/pdf/2401.10506)]
- SENSE, Synthesizing Text-to-SQL Data from Weak and Strong LLMs. ACL 24. [Paper]
- ZeroNL2SQL, Combining Small Language Models and Large Language Models for Zero-Shot NL2SQL. VLDB 24. [Paper]
- LayoutLM, LayoutLM: Pre-training of Text and Layout for Document Image Understanding. KDD 20. [Paper]
- PubTabNet, Image-Based Table Recognition: Data, Model, and Evaluation. ECCV 20. [Paper] [Code & Data]
- Table-LLaVA, Multimodal Table Understanding. ACL 24. [Paper] [Code] [Model]
- TableVLM, TableVLM: Multi-modal Pre-training for Table Structure Recognition. ACL 23. [Paper]
- PixT3, PixT3: Pixel-based Table-To-Text Generation. ACL 24. [Paper]
- Tabular representation, noisy operators, and impacts on table structure understanding tasks in LLMs. NeurIPS 23 Second Table Representation Learning Workshop. [Paper]
- SpreadsheetLLM, SpreadsheetLLM: Encoding Spreadsheets for Large Language Models. arXiv 24. [Paper]
- Enhancing Text-to-SQL Capabilities of Large Language Models: A Study on Prompt Design Strategies. EMNLP 23. [Paper] [Code]
- Tables as Texts or Images: Evaluating the Table Reasoning Ability of LLMs and MLLMs. arXiv 24. [Paper]
- The Dawn of Natural Language to SQL: Are We Fully Ready? VLDB 24. [Paper] [Code]
- MCS-SQL, MCS-SQL: Leveraging Multiple Prompts and Multiple-Choice Selection For Text-to-SQL Generation. [Paper]
- DIN-SQL, DIN-SQL: Decomposed In-Context Learning of Text-to-SQL with Self-Correction. NeurIPS 23. [Paper] [Code]
- DAIL-SQL, Text-to-SQL Empowered by Large Language Models: A Benchmark Evaluation. VLDB 24. [Paper] [Code]
- C3, C3: Zero-shot Text-to-SQL with ChatGPT. arXiv 23. [Paper] [Code]
- Dater, Large Language Models are Versatile Decomposers: Decompose Evidence and Questions for Table-based Reasoning. SIGIR 23. [Paper] [Code]
- Binder, Binding language models in symbolic languages. ICLR 23. [Paper] [Code]
- ReAcTable, ReAcTable: Enhancing ReAct for Table Question Answering. VLDB 24. [Paper] [Code]
- E5, E5: Zero-shot Hierarchical Table Analysis using Augmented LLMs via Explain, Extract, Execute, Exhibit and Extrapolate. NAACL 24. [Paper] [Code]
- Chain-of-Table, Chain-of-Table: Evolving Tables in the Reasoning Chain for Table Understanding. ICLR 24. [Paper]
- ITR, An Inner Table Retriever for Robust Table Question Answering. ACL 23. [Paper]
- LI-RAGE, LI-RAGE: Late Interaction Retrieval Augmented Generation with Explicit Signals for Open-Domain Table Question Answering. ACL 23. [Paper]
- SheetCopilot, SheetCopilot: Bringing Software Productivity to the Next Level through Large Language Models Agent. NeurIPS 23. [Paper] [Code]
- SheetAgent, SheetAgent: A Generalist Agent for Spreadsheet Reasoning and Manipulation via Large Language Models. arXiv 24. [Paper]
- Vision Language Models for Spreadsheet Understanding: Challenges and Opportunities. arXiv 24. [Paper]
- StructGPT, StructGPT: A General Framework for Large Language Model to Reason over Structured Data. EMNLP 23 Main. [Paper] [Code]
- TAP4LLM, TAP4LLM: Table Provider on Sampling, Augmenting, and Packing Semi-structured Data for Large Language Model Reasoning. arXiv 23. [Paper]
- UniDM, UniDM: A Unified Framework for Data Manipulation with Large Language Models. MLSys 24. [Paper]
- Data-Copilot, Data-Copilot: Bridging Billions of Data and Humans with Autonomous Workflow. arXiv 23. [Paper] [Code]
- LlamaIndex
- PandasAI
- Vanna
- DB-GPT, DB-GPT: Empowering Database Interactions with Private Large Language Models. [Paper] [Code]
- RetClean, RetClean: Retrieval-Based Data Cleaning Using Foundation Models and Data Lakes. [Paper] [Code]
- A Survey of Large Language Models. [Paper]
- A Survey on Large Language Model Based Autonomous Agents. [Paper]
- Table Pre-training: A Survey on Model Architectures, Pre-training Objectives, and Downstream Tasks. [Paper]
- Transformers for tabular data representation: A survey of models and applications. [Paper]
- A Survey of Table Reasoning with Large Language Models. [Paper]
- A survey on table question answering: Recent advances. [Paper]
- Large Language Models(LLMs) on Tabular Data - A Survey. [Paper]
- A Survey on Text-to-SQL Parsing: Concepts, Methods, and Future Directions. [Paper]

Name | Keywords | Artifact | Paper |
---|---|---|---|
MBPP | Code | link | arXiv 21 |
HumanEval | Code | link | arXiv 21 |
Dr.Spider | NL2SQL, Robustness | link | ICLR 23 |
WikiTableQuestions | Table QA | link | ACL 15 |
WikiSQL | Table QA, NL2SQL | link | arXiv 17 |
TabFact | Table Fact Verification | link | ICLR 20 |
HybridQA | Table QA | link | EMNLP 20 |
FeTaQA | Table QA | link | TACL 22 |
RobuT | Table QA | link | ACL 23 |
AnaMeta | Table Metadata | link | ACL 23 |
GPT4Table | Table QA, Table-to-text | link | WSDM 24 |
ToTTo | Table-to-text | link | EMNLP 20 |
SpreadsheetBench | Spreadsheet Manipulation | link | NeurIPS 24 |
BIRD | NL2SQL | link | NeurIPS 23 |
Spider | NL2SQL | link | EMNLP 18 |
ScienceBenchmark | NL2SQL | link | VLDB 24 |
DS-1000 | Data Analysis | link | ICML 23 |
InfiAgent-DABench | Data Analysis | link | ICML 24 |
TableBank | Table Detection | link | LREC 20 |
PubTabNet | Table Extraction | link | ECCV 20 |
ComTQA | Visual Table QA, Table Detection, Table Extraction | link | arXiv 24 |

Name | Keywords | Artifact | Paper |
---|---|---|---|
TableInstruct | Table Instruction Tuning | link | arXiv 23 |
WDC | Web Table | link | WWW 16 |
GitTables | GitHub CSVs | link | SIGMOD 23 |
DART | Table-to-text | link | NAACL 21 |
MMTab | Multimodal Table Understanding | link | ACL 24 |
SchemaPile | Database Schemas | link | SIGMOD 24 |