Help with querying the Land Matrix database via Python

Land Matrix

This project aims to facilitate the retrieval of Land Matrix data through natural language queries.

This repository provides several resources:

  • An end-to-end Streamlit application with optimal configuration. Explanations.
  • A pipeline to reproduce our benchmark of models and methods. See section 2 below.
  • Educational notebooks that describe all the tasks needed for the entire pipeline. Explanations.

1. Installation

git clone https://github.com/tetis-nlp/landmatrix-graphql-python.git
  • Installation of the Python environment

    conda create -n landmatrix python=3.9 pandas scikit-learn spacy streamlit
    conda activate landmatrix
    conda install -c conda-forge sentence-transformers
    pip install transformers faiss-cpu
    pip install ollama
    pip install langchain-openai
    pip install langchain-community
    pip install openpyxl
  • Downloading the spaCy model

    python -m spacy download en_core_web_sm
  • Installation and launch of Ollama

    curl -fsSL https://ollama.com/install.sh | sh
    ollama serve
    ollama pull llama3:8b
  • Configure API keys (only compatible with chat ISDM): add your own ISDM API keys, without quotation marks

    cp credentials.ini.default credentials.ini
    vim credentials.ini
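
The installation steps above can be sanity-checked with a short script. The following is a minimal sketch, not part of the repository: the function names are hypothetical, the module list is simply the import names of the packages installed above, and it assumes that Ollama listens on its default address (`http://localhost:11434`) and that `credentials.ini` is a standard INI file with section headers that `configparser` can read.

```python
import configparser
import importlib.util
import json
import urllib.error
import urllib.request

# Import names (not pip names) of the packages installed in the steps above.
REQUIRED_MODULES = [
    "pandas", "sklearn", "spacy", "streamlit", "sentence_transformers",
    "transformers", "faiss", "ollama", "langchain_openai",
    "langchain_community", "openpyxl", "en_core_web_sm",
]


def missing_modules(names=REQUIRED_MODULES):
    """Return the modules from `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def ollama_models(base_url="http://localhost:11434", timeout=2.0):
    """Return the names of locally pulled Ollama models, or None if unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            payload = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        return None
    return [model.get("name") for model in payload.get("models", [])]


def quoted_values(path="credentials.ini"):
    """Return 'section.key' entries whose value is still wrapped in quotes."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(path)  # yields no sections if the file does not exist
    return [
        f"{section}.{key}"
        for section in parser.sections()
        for key, value in parser[section].items()
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'"
    ]


if __name__ == "__main__":
    print("Missing modules:", missing_modules() or "none")
    models = ollama_models()
    if models is None:
        print("Ollama server not reachable (is `ollama serve` running?)")
    else:
        print("llama3:8b pulled:", "llama3:8b" in models)
    print("Quoted values in credentials.ini:", quoted_values() or "none")
```

Run it from the repository root with the `landmatrix` environment activated. Any module reported as missing, or any key reported as quoted, points to the installation step that needs to be repeated.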

2. Reproduce our benchmark

python src/experiments.py 
  • Monitor the pipeline: tail -f logs/pipeline.log
  • Stop the pipeline by killing all its subprocesses: pkill -f src/
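
On systems without `tail`, or from inside a notebook, the end of the log can also be read from Python. This is a minimal sketch with a hypothetical helper name, assuming only the `logs/pipeline.log` path mentioned above and a text log encoded as UTF-8.

```python
from collections import deque
from pathlib import Path


def last_lines(path="logs/pipeline.log", n=20):
    """Return the last `n` lines of the log file, or [] if it does not exist yet."""
    log = Path(path)
    if not log.exists():
        return []
    with log.open(encoding="utf-8", errors="replace") as handle:
        # deque with maxlen keeps only the final n lines while streaming the file
        return [line.rstrip("\n") for line in deque(handle, maxlen=n)]


if __name__ == "__main__":
    for line in last_lines():
        print(line)
```

Unlike `tail -f`, this prints a snapshot and exits, so it needs to be run again to see new entries.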