This repository contains code for the 2025 NAACL paper:
LLM-Supported Natural Language to Bash Translation
Note: Our code has only been tested on Ubuntu 20.04 with Python 3.10 and PyTorch 2.6.0+cu124.
- Install Docker Engine (Instructions)
- Configure Docker for non-sudo users (Instructions)
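If you want to sanity-check the non-sudo configuration before continuing, you can run Docker's standard test image (optional):

```bash
# Should print "Hello from Docker!" without requiring sudo
docker run --rm hello-world
```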
- Install Ollama
curl -fsSL https://ollama.com/install.sh | sh
- Install embedding model
ollama pull mxbai-embed-large
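To confirm the embedding model was pulled correctly, you can query Ollama's local REST API (this assumes the Ollama service is running on its default port, 11434):

```bash
# Request an embedding for a sample prompt; the response is JSON containing an "embedding" array
curl -s http://localhost:11434/api/embeddings \
  -d '{"model": "mxbai-embed-large", "prompt": "list all files in the current directory"}'
```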
- Setup virtual environment
python3 -m venv nl2sh_venv
source nl2sh_venv/bin/activate
pip install -r requirements.txt
python3 -m ipykernel install --user --name=nl2sh_venv --display-name="nl2sh_venv"
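Since the code was tested with PyTorch 2.6.0+cu124, it can be worth checking the installed version and GPU visibility from inside the activated environment (this assumes requirements.txt installs PyTorch):

```bash
# Print the installed PyTorch version and whether CUDA is usable
python3 -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```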
- Run `example.ipynb` (an example launch command follows below)
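One way to open the notebook is through Jupyter (if Jupyter is not pulled in by requirements.txt, install it first with `pip install notebook`):

```bash
# Launch Jupyter with example.ipynb, then select the "nl2sh_venv" kernel in the notebook UI
jupyter notebook example.ipynb
```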
- `paper/` - LaTeX source for our paper
- `example.ipynb` - Starter code
- `model_comparison.ipynb` - Reproduce our best model (+ parser) results
- `finetuned_model_comparison.ipynb` - Reproduce our fine-tuned model results
- `feh_comparison.ipynb` - Reproduce our FEH comparison results
Our datasets, benchmark code and fine-tuned models are available at these links:
- Datasets
- Benchmark
- Models
If you find our work helpful, please cite:
@misc{westenfelder2025llmsupportednaturallanguagebash,
  title={LLM-Supported Natural Language to Bash Translation},
  author={Finnian Westenfelder and Erik Hemberg and Miguel Tulla and Stephen Moskal and Una-May O'Reilly and Silviu Chiricescu},
  year={2025},
  eprint={2502.06858},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2502.06858},
}