This project benchmarks different approaches to question detection and extraction using various models and techniques:
- HuggingFace model for question detection
- Ollama model for question detection
- Groq model for question detection
- spaCy model for question extraction
```bash
# Create a virtual environment
python -m venv venv

# Activate the virtual environment
# On Windows:
venv\Scripts\activate
# On macOS and Linux:
source venv/bin/activate

# Install the dependencies
pip install -r requirements.txt

# Download the spaCy English models
python -m spacy download en_core_web_sm
python -m spacy download en_core_web_lg
```
- Download and install Ollama from https://ollama.ai/
- Set the `GROQ_API_KEY` environment variable for the Groq API (e.g. `export GROQ_API_KEY=<your-key>` on macOS/Linux)
- Start the HuggingFace model server: `python is_question/hf_model_host.py`
- Run the benchmark: `python is_question/bench_hf.py`
- Pull the Gemma model: `ollama pull gemma2:2b`
- Run the benchmark: `python is_question/bench_llm_ollama.py`
- Run the benchmark: `python is_question/bench_llm_groq.py`
- Start the spaCy model server: `python extract_question/spacy_model_host.py`
- Run the benchmark: `python extract_question/bench_spacy_eq.py`
- `is_question/hf_model_host.py`:
  - Loads a pre-trained HuggingFace model for question detection
  - Sets up a FastAPI server to host the model
  - Defines an endpoint that accepts text input and returns a prediction
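A minimal sketch of what such a host might look like (the model name, endpoint path, and port here are illustrative assumptions, not the script's actual choices):

```python
# Hypothetical sketch of an HF question-detection host. The model name,
# endpoint path, and port are assumptions, not the project's actual values.
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import pipeline

app = FastAPI()

# Any text-classification model fine-tuned for question detection would work here.
classifier = pipeline(
    "text-classification",
    model="shahrukhx01/question-vs-statement-classifier",
)

class TextInput(BaseModel):
    text: str

@app.post("/predict")
def predict(inp: TextInput):
    # Returns e.g. {"label": "LABEL_1", "score": 0.99} for question-like input.
    result = classifier(inp.text)[0]
    return {"label": result["label"], "score": result["score"]}

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="127.0.0.1", port=8000)
```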
- `is_question/bench_hf.py`:
  - Loads test cases from `test_cases_iq.py`
  - Sends each test case to the HuggingFace model server
  - Measures accuracy and response time
  - Outputs detailed results and overall performance metrics
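The benchmark loop follows roughly this pattern (the server URL, label convention, and test-case format below are assumptions for illustration):

```python
# Hypothetical benchmark loop. The server URL, label convention, and
# test-case structure are illustrative assumptions.
import time
import requests

# Assumed format: (input_text, expected_is_question) pairs.
TEST_CASES = [
    ("What time is it?", True),
    ("The meeting starts at noon.", False),
]

def run_benchmark(url="http://127.0.0.1:8000/predict"):
    correct, total_time = 0, 0.0
    for text, expected in TEST_CASES:
        start = time.perf_counter()
        resp = requests.post(url, json={"text": text}).json()
        total_time += time.perf_counter() - start
        predicted = resp["label"] == "LABEL_1"  # assumed "question" label
        correct += predicted == expected
        print(f"{text!r}: expected={expected} predicted={predicted}")
    print(f"Accuracy: {correct / len(TEST_CASES):.2%}, "
          f"avg response time: {total_time / len(TEST_CASES):.3f} s")

if __name__ == "__main__":
    run_benchmark()
```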
- `is_question/bench_llm_ollama.py`:
  - Uses the Ollama CLI to interact with the Gemma model
  - Processes each test case from `test_cases_iq.py`
  - Measures accuracy and response time
  - Outputs detailed results and overall performance metrics
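A rough sketch of how the Ollama CLI can be driven from Python (the prompt wording and yes/no parsing are assumptions, not necessarily what the script does):

```python
# Hypothetical way of driving the Ollama CLI from Python. The prompt wording
# and yes/no parsing are assumptions, not the script's actual implementation.
import subprocess

def is_question_ollama(text: str, model: str = "gemma2:2b") -> bool:
    prompt = (
        "Answer with exactly 'yes' or 'no'. "
        f"Is the following text a question?\n\n{text}"
    )
    # `ollama run MODEL PROMPT` prints the model's completion to stdout.
    result = subprocess.run(
        ["ollama", "run", model, prompt],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip().lower().startswith("yes")

if __name__ == "__main__":
    print(is_question_ollama("Could you pass the salt?"))
```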
- `is_question/bench_llm_groq.py`:
  - Uses the Groq API to process test cases
  - Loads test cases from `test_cases_iq.py`
  - Measures accuracy and response time
  - Outputs detailed results and overall performance metrics
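A minimal sketch of a Groq-backed check using the `groq` Python client (the model name and prompt are illustrative assumptions):

```python
# Hypothetical Groq-based check. The model name and prompt are illustrative
# assumptions; the script may use different ones.
import os
from groq import Groq

client = Groq(api_key=os.environ["GROQ_API_KEY"])

def is_question_groq(text: str) -> bool:
    completion = client.chat.completions.create(
        model="llama-3.1-8b-instant",  # assumed model; any Groq-hosted model works
        messages=[
            {"role": "system", "content": "Answer with exactly 'yes' or 'no'."},
            {"role": "user", "content": f"Is the following text a question?\n\n{text}"},
        ],
    )
    return completion.choices[0].message.content.strip().lower().startswith("yes")

if __name__ == "__main__":
    print(is_question_groq("Where is the station?"))
```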
- `extract_question/spacy_model_host.py`:
  - Loads the spaCy English language model
  - Sets up a FastAPI server to host the model
  - Defines an endpoint that accepts text input and extracts questions
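A minimal sketch of such a host, assuming a simple heuristic that treats sentences ending in `?` as questions (the heuristic, endpoint path, and port are assumptions, not the script's actual logic):

```python
# Hypothetical spaCy-based extraction host. The '?'-suffix heuristic,
# endpoint path, and port are assumptions.
import spacy
from fastapi import FastAPI
from pydantic import BaseModel

nlp = spacy.load("en_core_web_sm")
app = FastAPI()

class TextInput(BaseModel):
    text: str

@app.post("/extract")
def extract_questions(inp: TextInput):
    doc = nlp(inp.text)
    # Use spaCy's sentence segmentation and keep sentences ending in '?'.
    questions = [s.text.strip() for s in doc.sents if s.text.strip().endswith("?")]
    return {"questions": questions}

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="127.0.0.1", port=8001)
```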
- `extract_question/bench_spacy_eq.py`:
  - Loads test cases from `test_cases_eq.py`
  - Sends each test case to the spaCy model server
  - Measures accuracy and response time for question extraction
  - Outputs detailed results and overall performance metrics
- `is_question/test_cases_iq.py`: Contains test cases for question detection. Each test case includes the input text and the expected output (whether or not it is a question).
- `extract_question/test_cases_eq.py`: Contains test cases for question extraction. Each test case includes the input text and the expected extracted questions.
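Hypothetical examples of what the test-case structures might look like (the actual field names and formats in the project may differ):

```python
# Hypothetical test-case structures; the actual formats in
# test_cases_iq.py / test_cases_eq.py may differ.

# is_question/test_cases_iq.py: input text paired with whether it is a question.
TEST_CASES_IQ = [
    ("Can you help me with this?", True),
    ("I finished the report yesterday.", False),
]

# extract_question/test_cases_eq.py: input text paired with the questions
# expected to be extracted from it.
TEST_CASES_EQ = [
    (
        "Thanks for the update. When is the deadline? Who is reviewing it?",
        ["When is the deadline?", "Who is reviewing it?"],
    ),
]
```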
You can modify these files to add or change test cases.
Each benchmark script will output:
- Overall accuracy
- Average response time
- Detailed results for each test case
Compare these results to evaluate the performance of different approaches to question detection and extraction.
- To use different models or datasets, modify the respective script and test case files.
- Adjust hyperparameters or prompts in the benchmark scripts to optimize performance.