This project is a web-based application that classifies text as either human-generated or LLM (Large Language Model) generated. The application is built using a Flask backend and a deep learning model implemented with TensorFlow and Keras.

## Features

- Text Classification: The model classifies whether a given text was written by a human or generated by an LLM.
- Visualization: The application provides a bar graph displaying the probability distribution of the classification.
- Word Cloud: A word cloud of the input text is generated and displayed on the frontend.

![SihLLMEmbedding](./SihLLMEmbedding.jpg)

## Model Architecture

The model used for classification is a Sequential neural network with the following layers:

- Embedding Layer: Converts input token sequences into dense vectors of fixed size.
- LSTM Layers: Two LSTM (Long Short-Term Memory) layers are used for sequential data processing.
  - LSTM with 128 units (returning sequences).
  - LSTM with 64 units.
- Dropout Layers: Applied with a dropout rate of 50% to reduce overfitting.
- Dense Layers:
  - Dense layer with 32 units and ReLU activation.
  - Output Dense layer with 1 unit and sigmoid activation.

The model is compiled with the Adam optimizer, binary cross-entropy loss, and accuracy as a metric.
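
The layer stack described above could be written roughly as follows. This is a minimal sketch, not the project's training script: `VOCAB_SIZE`, `EMBEDDING_DIM`, and `MAX_LEN` are placeholder values that the README does not state, and the placement of the dropout layers (one after each LSTM) is an assumption, since only the rate is given.

```python
from tensorflow import keras

layers = keras.layers

# Hypothetical hyperparameters: the README does not state these values.
VOCAB_SIZE = 10_000   # assumed tokenizer vocabulary size
EMBEDDING_DIM = 64    # assumed embedding width
MAX_LEN = 200         # assumed padded sequence length


def build_model() -> keras.Model:
    """Build a Sequential model that follows the layer list above."""
    model = keras.Sequential([
        keras.Input(shape=(MAX_LEN,)),
        layers.Embedding(input_dim=VOCAB_SIZE, output_dim=EMBEDDING_DIM),
        layers.LSTM(128, return_sequences=True),  # passes the full sequence on
        layers.Dropout(0.5),                      # placement is an assumption
        layers.LSTM(64),                          # returns only the final state
        layers.Dropout(0.5),
        layers.Dense(32, activation="relu"),
        layers.Dense(1, activation="sigmoid"),    # single probability output
    ])
    model.compile(
        optimizer="adam",
        loss="binary_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

Calling `build_model().summary()` prints the resulting layers and parameter counts, which is a quick way to compare the sketch against the shipped model file.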

## Installation

1. **Clone the repository:**
   ```bash
   git clone https://github.com/your-username/LLM-Generated-Text-Detector.git
   cd LLM-Generated-Text-Detector
   ```
2. **Install dependencies:**
   ```bash
   pip install -r requirements.txt
   ```
3. **Download the model:** Place the pre-trained model (`model2.keras`) and tokenizer (`tokenizer.pkl`) in the root directory of the project.
4. **Run the Flask application:**
   ```bash
   python app.py
   ```
5. **Access the application:** Open your web browser and go to http://127.0.0.1:5000/.
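
Before starting the server, it can help to confirm that the model and tokenizer files are where the application expects them. The helper below is a hypothetical convenience check, not part of the repository; it assumes only the two file names given in the installation steps.

```python
from pathlib import Path

# File names taken from the installation steps above.
REQUIRED_ARTIFACTS = ("model2.keras", "tokenizer.pkl")


def missing_artifacts(root: str = ".") -> list[str]:
    """Return the required files that are not present in `root`."""
    base = Path(root)
    return [name for name in REQUIRED_ARTIFACTS if not (base / name).is_file()]
```

Running `print(missing_artifacts())` from the project root should print an empty list once both files are in place.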

## Project Structure

- `app.py`: The main Flask application file.
- `preprocess.py`: Contains the `preprocess` function for tokenizing and padding input text.
- `plot_prediction.py`: Contains the `plot_prediction` function for generating and saving the prediction plot and word cloud.
- `templates/`: Contains the HTML templates for rendering the web pages.
  - `index.html`: The home page.
  - `predict.html`: The prediction result page.
- `static/`: Contains static files like CSS, JavaScript, and images.
  - `prediction_plot.png`: The prediction plot generated during runtime.
  - `wordcloud.png`: The word cloud image generated from the input text.
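
To illustrate what "tokenizing and padding" means in practice, here is a dependency-free sketch. It is not the code in `preprocess.py`, which presumably relies on the saved tokenizer in `tokenizer.pkl`; the function signatures, the `build_word_index` helper, and `MAX_LEN` are all hypothetical. The padding and truncation behavior follows the defaults of Keras' `pad_sequences` (pad and truncate at the front).

```python
import re

MAX_LEN = 10  # hypothetical sequence length; the real value is not stated

TOKEN_PATTERN = r"[a-z0-9']+"


def build_word_index(texts: list[str]) -> dict[str, int]:
    """Assign each distinct word an integer ID starting at 1 (0 means padding)."""
    index: dict[str, int] = {}
    for text in texts:
        for word in re.findall(TOKEN_PATTERN, text.lower()):
            index.setdefault(word, len(index) + 1)
    return index


def preprocess(text: str, word_index: dict[str, int], max_len: int = MAX_LEN) -> list[int]:
    """Convert text into a fixed-length list of word IDs.

    Words missing from the index are dropped. Long inputs keep their last
    `max_len` tokens; short inputs are left-padded with zeros.
    """
    words = re.findall(TOKEN_PATTERN, text.lower())
    ids = [word_index[w] for w in words if w in word_index]
    ids = ids[-max_len:]                      # truncate from the front
    return [0] * (max_len - len(ids)) + ids   # pad on the left
```

For example, with an index built from `"the cat sat"`, the input `"The cat flew"` maps to two known IDs preceded by padding zeros, because `flew` is not in the vocabulary.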

## Usage

1. Navigate to the home page and enter the text you want to classify.
2. Submit the text to get the prediction result.
3. The result page will display:
   - The classification result (Human-generated or LLM-generated).
   - A bar graph showing the prediction probability.
   - A word cloud generated from the input text.
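
The label and the bar-graph values shown on the result page can both be derived from the model's single sigmoid output. The sketch below shows one way to do that; `interpret_prediction` is a hypothetical helper, and it assumes that the output is the probability of the LLM-generated class and that the decision threshold is 0.5, neither of which the README states.

```python
def interpret_prediction(p_llm: float, threshold: float = 0.5) -> dict:
    """Turn a sigmoid output into a label and a two-class distribution.

    Assumes `p_llm` is P(LLM-generated); the class encoding is an assumption.
    """
    if not 0.0 <= p_llm <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    label = "LLM-generated" if p_llm >= threshold else "Human-generated"
    return {
        "label": label,
        "probabilities": {
            "Human-generated": 1.0 - p_llm,
            "LLM-generated": p_llm,
        },
    }
```

The `probabilities` dictionary holds the two values a bar graph would plot, and they always sum to 1.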

## Dependencies

- Flask
- TensorFlow
- Keras
- Matplotlib
- Seaborn
- WordCloud

## License

This project is licensed under the MIT License. See the LICENSE file for details.