diff --git a/README.md b/README.md
index 704fba4..7afe889 100644
--- a/README.md
+++ b/README.md
@@ -54,76 +54,9 @@ There are a few environment variables that need to be set so that the applicatio
 
 # Pipeline architectures
 
-## Celery-based pipeline
-![Celery-based pipeline architecture](/images/celery_pipeline.png "Celery-based pipeline architecture")
-
 ## Thread-based pipeline
 ![Thread-based pipeline architecture](/images/thread_pipeline.png "Thread-based pipeline architecture")
 
-# How to use the celery-based pipeline
-
-## 1. Start RabbitMQ locally (Optional)
-
-Set up a local instance of RabbitMQ using Docker:
-
-```bash
-docker pull rabbitmq:3-management
-docker run --rm -it -p 15672:15672 -p 5672:5672 rabbitmq:3-management
-```
-
-The rabbitMQ management interface can be access on the url http://localhost:15672
-
-
-NOTE: If RabbitMQ is run locally, set the BROKER_URL env variable to amqp://guest:guest@localhost. The default username and password are uest.
-
-
-## 2. Start a celery worker
-
-Start a celery worker:
-
-```bash
-python -m celery -A main worker --loglevel=INFO -n ETLPipeline@%h
-```
-
-Alternatively, run the customisable `start_celery_worker.sh` script:
-
-```bash
-./start_celery_worker.sh
-```
-
-## 4. Run the ETL data pipeline
-
-Run one of the examples in the examples directory, e.g.:
-
-```bash
-python -u examples/example_celery.py
-```
-
-## 4. Monitor a Celery cluster with Flower (Optional)
-
-Install Flower using pip:
-
-```bash
-pip install flower
-```
-
-Launch the Flower server at specified port (default is 5555, so `--port=5555` can be ommited):
-
-```
-python -m celery -A main flower --port=5555
-```
-
-Alternatively, run Flower via docker:
-
-```
-docker run -p 5555:5555 mher/flower
-```
-
-Access Flower on the url http://localhost:5555/
-
-
-
-
 ## Filesystems
 
 ### Credentials to access the object store (.json file)