This project implements a Python-based NLP service using Flask and spaCy, containerized with Docker. It is designed to extract keywords from provided text, supporting multiple languages. The service is deployed using Gunicorn as the WSGI server for production environments.
- Keyword Extraction: Extract keywords from input text using spaCy.
- Multi-language Support: Uses the spaCy 'xx_ent_wiki_sm' model to support multiple languages.
- Dockerized Application: Containerized with Docker for easy deployment and scalability.
- Production-Ready: Includes Gunicorn for production-level deployments.
- Docker
- Any system that supports Docker (Windows/Linux/Mac)
- Clone the Repository:

      git clone https://github.com/circleboom/worker-nlp-service.git
      cd worker-nlp-service

- Build the Docker Image:

      docker build -t nlp-service .

- Run the Docker Container:

      docker run -p 4000:5000 nlp-service

This command runs the Docker container, mapping port 5000 in the container to port 4000 on the host.
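
Since Gunicorn settings files are plain Python, the port mapping above can be mirrored in a small config file. The following is only a sketch: the file name `gunicorn.conf.py` and all values are assumptions, not taken from this repository.

```python
# gunicorn.conf.py (hypothetical file; values are illustrative assumptions)
import multiprocessing

# Listen on all interfaces inside the container, on the container port
# that `docker run -p 4000:5000` maps to host port 4000.
bind = "0.0.0.0:5000"

# A common starting point. Without preloading, each worker process loads
# its own copy of the spaCy model, so lower this if memory is tight.
workers = multiprocessing.cpu_count() * 2 + 1

# Seconds a worker may stay silent before Gunicorn restarts it.
timeout = 30
```

A file like this could be passed to Gunicorn with `gunicorn -c gunicorn.conf.py run:app`, where `run:app` assumes the Flask object in `run.py` is named `app`.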
## Usage
Send a POST request to the service to extract keywords:
curl -X POST http://localhost:4000/extract_keywords \
-H "Content-Type: application/json" \
-d '{"text": "Your text here"}'
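
The same request can be sent from Python using only the standard library. This is a minimal sketch: the helper names `build_request` and `extract_keywords` are illustrative, and because the response schema is not documented here, the decoded JSON is returned as-is.

```python
import json
import urllib.request

# Assumption: the service is reachable on the host port used in the
# `docker run -p 4000:5000` example.
BASE_URL = "http://localhost:4000"


def build_request(text, base_url=BASE_URL):
    """Build the same POST request that the curl example sends."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/extract_keywords",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_keywords(text, base_url=BASE_URL, timeout=10):
    """Send the request and return the decoded JSON response unchanged."""
    request = build_request(text, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (requires the container to be running):
#   result = extract_keywords("Your text here")
```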
### Using the Service with C#
Refer to the C# code snippet provided in the integration section to set up and call this service from a C# application.
## Development
### Local Setup
For local development without Docker:
- Ensure Python 3.8+ is installed.
- Install dependencies:

      pip install -r requirements.txt
      python3 -m spacy download xx_ent_wiki_sm

- Run the Flask app locally:

      python3 run.py

### Adding New Features
- Extend the service by adding new routes or integrating more NLP features from spaCy.
- Update the spaCy model or switch to a different model for enhanced accuracy or different capabilities.
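
As a starting point for such extensions, here is one way keyword extraction on top of spaCy might look. It is a sketch under stated assumptions, not the service's actual implementation: it treats named entities as keywords, and the function names `unique_keywords` and `extract_keywords` are hypothetical.

```python
def unique_keywords(entity_texts):
    """Deduplicate entity strings case-insensitively, keeping first-seen order."""
    seen = set()
    keywords = []
    for raw in entity_texts:
        text = raw.strip()
        key = text.casefold()
        if text and key not in seen:
            seen.add(key)
            keywords.append(text)
    return keywords


def extract_keywords(text, model_name="xx_ent_wiki_sm"):
    """Run a spaCy pipeline and treat its named entities as keywords (an assumption)."""
    import spacy  # imported lazily so the helper above works without spaCy installed

    nlp = spacy.load(model_name)  # in a real service, load the model once at startup
    doc = nlp(text)
    return unique_keywords(ent.text for ent in doc.ents)
```

Switching models would then only mean passing a different `model_name`, provided that model has been downloaded and includes a named entity recognizer.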
To ensure that all features function as intended and to prevent regressions, we have a comprehensive test suite. Follow these steps to run the tests:
Before running the tests, make sure that you have the project's dependencies installed and your virtual environment activated. If you haven't set up the virtual environment or installed the dependencies, refer to the Installation section of this README.
We use `pytest` for running tests due to its simplicity and powerful features. To execute the tests, follow these steps:
- Navigate to the Project Root:
  Ensure you are in the root directory of the project, where the pytest configuration files (`pytest.ini` or `conftest.py`) are located.
- Run Pytest:
  Execute the following command in your terminal:

      python3 -m pytest

  This command will discover and run all test cases in the `tests` directory.
The tests are designed to cover:
- Basic functionality of all routes.
- Integration tests to ensure different parts of the application work together correctly.
- Edge cases and error handling scenarios.
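
To illustrate the style of such tests, here is a small self-contained sketch. The `validate_payload` helper is hypothetical and stands in for whatever input checking the service performs; the real test suite lives in the `tests` directory and may look different.

```python
def validate_payload(payload):
    """Return (text, error); exactly one of the two is None."""
    if not isinstance(payload, dict):
        return None, "request body must be a JSON object"
    text = payload.get("text")
    if not isinstance(text, str) or not text.strip():
        return None, "'text' must be a non-empty string"
    return text, None


# pytest collects functions whose names start with `test_`.
def test_valid_payload_returns_text():
    assert validate_payload({"text": "Hello Berlin"}) == ("Hello Berlin", None)


def test_missing_text_is_rejected():
    text, error = validate_payload({})
    assert text is None and error is not None


def test_blank_text_is_rejected():
    text, error = validate_payload({"text": "   "})
    assert text is None and error is not None
```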
`pytest` will provide a detailed report for each test, indicating whether it passed or failed. Review the output to ensure all tests pass. If a test fails, `pytest` will provide a detailed error that can be used to diagnose and resolve the issue.
This project is configured to run these tests automatically via Continuous Integration (CI) every time changes are pushed to the repository. Make sure all tests pass locally before pushing so that integration and deployment go smoothly.
Contributions are welcome! Please fork the repository and submit pull requests with new features or fixes. For major changes, please open an issue first to discuss what you would like to change.
Please update tests as appropriate.
This project was created with the help of an AI, ChatGPT 4, developed by OpenAI, designed to assist in software development and other intellectual tasks.