This project is a hybrid text summarization application combining extractive (TF-IDF) and abstractive (Hugging Face Transformers) techniques. It features a Flask backend for API support and a Streamlit frontend for an interactive user experience. Users can upload text files and receive a concise summary using advanced NLP methods.
- Overview
- Features
- Technologies Used
- How It Works
- Setup and Installation
- Usage
- Output
- Future Enhancements
- Acknowledgments
This project provides a hybrid approach to summarization by combining:
- Extractive Summarization: Selects key sentences using TF-IDF scores.
- Abstractive Summarization: Generates summaries using Hugging Face’s facebook/bart-large-cnn model.
- Hybrid Summarization: Refines extractive summaries through abstractive techniques for improved results.
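To make the extractive step concrete, here is a minimal, dependency-free sketch of TF-IDF sentence ranking. It is not the project's code: the project uses SpaCy for sentence splitting and Scikit-learn's TfidfVectorizer for scoring, while this sketch swaps in a regex splitter and a hand-written TF-IDF so it runs with the standard library alone. The function names (`split_sentences`, `tfidf_sentence_scores`, `extractive_summary`) and the `num_sentences` parameter are illustrative assumptions.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Naive regex sentence splitter (a stand-in for SpaCy's sentence tokenizer)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tfidf_sentence_scores(sentences):
    """Score each sentence by the sum of the TF-IDF weights of its words.

    Each sentence is treated as one "document". No length normalization is
    applied, so longer sentences tend to score higher.
    """
    tokenized = [re.findall(r"[a-z0-9']+", s.lower()) for s in sentences]
    n = len(tokenized)
    # Document frequency: the number of sentences each word appears in.
    df = Counter(word for words in tokenized for word in set(words))
    scores = []
    for words in tokenized:
        tf = Counter(words)
        # Smoothed IDF: ln((1 + n) / (1 + df)) + 1
        scores.append(
            sum(count * (math.log((1 + n) / (1 + df[w])) + 1) for w, count in tf.items())
        )
    return scores


def extractive_summary(text, num_sentences=2):
    """Return the top-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    scores = tfidf_sentence_scores(sentences)
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    keep = sorted(ranked[:num_sentences])
    return " ".join(sentences[i] for i in keep)
```

Restoring the original sentence order after ranking keeps the extract readable, which matters when it is later passed to an abstractive model.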
- File Upload Support: Accepts text files for summarization.
- Hybrid Summarization: Combines extractive and abstractive methods.
- Interactive UI: Streamlit frontend for user-friendly interactions.
- API Integration: Flask backend for scalable summarization services.
- Backend: Flask
- Frontend: Streamlit
- NLP Model:
  - SpaCy for sentence tokenization.
  - Hugging Face Transformers (facebook/bart-large-cnn) for abstractive summarization.
  - Scikit-learn's TfidfVectorizer for extractive summarization.
- Libraries:
  - Requests for API communication.
  - Streamlit for frontend development.
- Input:
  - The user uploads a text file via the Streamlit interface.
- Summarization:
  - The text is processed and split into sentences using SpaCy.
  - Extractive summarization ranks sentences by relevance using TF-IDF scores.
  - Abstractive summarization generates refined summaries using Hugging Face's facebook/bart-large-cnn model.
  - The hybrid approach passes the extractive summary through the abstractive model to refine it.
- Output:
  - The summary is displayed in the Streamlit app.
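The flow above can be sketched as a small function that chains an extractive step into an abstractive one. This is an illustrative sketch, not the repository's implementation: `hybrid_summarize`, `make_bart_abstractor`, and their parameters are hypothetical names, and the length limits are arbitrary example values. The abstractive helper assumes an installed Transformers version that provides the `summarization` pipeline task; it is imported lazily because loading the model is slow and downloads weights on first use.

```python
def hybrid_summarize(text, extract, abstract, num_sentences=5):
    """Run the extractive step first, then refine its output abstractively.

    extract:  callable (text, num_sentences) -> str, e.g. a TF-IDF ranker
    abstract: callable (text) -> str, e.g. a BART summarizer
    """
    condensed = extract(text, num_sentences)
    if not condensed:
        # Nothing to refine (for example, an empty upload).
        return ""
    return abstract(condensed)


def make_bart_abstractor(model_name="facebook/bart-large-cnn",
                         max_length=130, min_length=30):
    """Build an abstractive step backed by a Hugging Face pipeline (assumed setup)."""
    from transformers import pipeline  # lazy import: heavy dependency

    summarizer = pipeline("summarization", model=model_name)

    def abstract(text):
        result = summarizer(text, max_length=max_length,
                            min_length=min_length, do_sample=False)
        return result[0]["summary_text"]

    return abstract
```

Passing the two steps in as callables keeps the pipeline easy to test: a stub can replace the BART model, and the extractive ranker can be swapped without touching the hybrid logic.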
- Python 3.8 or higher
- Pipenv or pip for Python dependency management
- Clone the Repository:
git clone https://github.com/siddhinarayan09/Text-Summarizer-NLP
cd Text-Summarizer-NLP
- Set Up the Backend:
Create a virtual environment and install dependencies:
pip install -r requirements.txt
- Start the Flask App:
python app.py
- Run the Streamlit Frontend:
streamlit run streamlit_app.py
- Launch the Streamlit app.
- Upload a text file for summarization.
- Click "Summarize" to generate a hybrid summary.
- View the results dynamically in the app.
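Because the summarizer sits behind a Flask API, it should also be possible to request a summary without the Streamlit UI. The sketch below shows one way that call might look using Requests. The `/summarize` route and the `text` and `summary` JSON fields are assumptions made for illustration, so check `app.py` for the actual route and payload. Port 5000 is Flask's default development port.

```python
def summarize_via_api(text, url="http://127.0.0.1:5000/summarize", post=None):
    """Send text to the backend and return the summary string.

    The URL path and the JSON field names are hypothetical. `post` can be
    overridden (for example, in tests); it defaults to requests.post.
    """
    if post is None:
        import requests  # imported lazily so the function can be tested without it
        post = requests.post

    # Summarization with a large model can be slow, so allow a generous timeout.
    response = post(url, json={"text": text}, timeout=60)
    response.raise_for_status()  # surface HTTP errors instead of parsing an error page
    return response.json()["summary"]
```

With the Flask app running, `summarize_via_api(open("article.txt").read())` would return the summary as a string, assuming the backend matches the route and fields above.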
The application produces a concise hybrid summary of the uploaded text, displayed in the Streamlit app.
- Support for Multiple File Formats: Add support for PDFs and Word documents.
- Advanced Visualizations: Use Plotly for interactive visual outputs.
- Authentication: Allow user accounts to save and access summaries.
- Language Support: Extend the model to handle multilingual text.
- Hugging Face: For providing the facebook/bart-large-cnn model.
- SpaCy: For efficient text preprocessing.
- Streamlit: For building an interactive frontend.
- Flask: For lightweight backend API development.
- Inspiration from NLP projects and article summarizers.
Thank you for exploring the project :)