diff --git a/README.md b/README.md
index 5e78ed4..3e5bf0f 100644
--- a/README.md
+++ b/README.md
@@ -1,61 +1,29 @@
-# prefect-kghub-example-task prefect task
+# Prefect-kghub-example-task prefect task
 
-## Project Documentation
-
-#TODO: add the documentation of your code here
-
-## Prefect flow development
-
-- Create a virtual environment and install the requirements.txt:
-
-    $ python3 -m venv .venv # Only needed once
-    $ .venv\Scripts\activate # Activate on windows
-    $ source .venv/bin/activate # Activate on linux
-    $ pip install -r requirements.txt # Whenever requirements.txt changes
-
-- Write your script in the `src` folder, with your 'main' script in `flows.py`, and tasks in `tasks.py`. Feel free to add new folders or files in the `src` folder.
-
-- Create a deployment in `python src/server.py`.
-
-- To test your flow locally, start a prefect instance within your venv:
-
-    $ prefect server start
-
-- Run `python src/flows.py` to run your script, or `python src/server.py` to test the deployment.
-
-## Deploying your flow to production
-
-- Create the repo: https://github.com/nens/prefect-kghub-example-task
+This is an example task on how to develop operational data flows for the KGhub, the WE-ACT data warehouse in Kyrgyzstan.
+This script uses the [KGhub worker](https://kghubworker.caiag.kg) to upload data to the [KGhub](https://kghub.caiag.kg/api).
+Some other examples of what the KGhub worker can be used for:
+- Processing CSV files from an FTP server to load data into the KGhub.
+- Downloading data from external APIs into the KGhub.
+- Calculating discharges from water level measurements in real time.
 
-- Start a local git project:
-
-    $ git init
-
-- To keep code readable and maintainable, pre-commit is installed. If you have never used it, install globally on your device with:
-
-    $ pip install pre-commit
-
-- Install the pre-commit for this git repo:
-
-    $ pre-commit install
-
-- Commit your code and push it to your new repo. If you have troubles with pre-commit, you can always run it manually with:
-
-    $ pre-commit run --all
-
-> **_NOTE:_** You need to fix all the pre-commit problems if it doesn't fix them itself. If pre-commit fails, the docker image build will also fail, and your flow will not be deployed.
-
-- Add 2 new teams to your github repo: Adviseurs (write) and Nelen & Schuurmans pull only (admin)
-
-- Ask Florian or Reinout to register your new flow.
+## Project Documentation
+This script downloads rainfall data for Bishkek from the Dutch Lizard and posts it to the KGhub as a timeseries.
 
-## Optional (but definitely recommended):
+The task runs every day at 04:10 GMT.
 
-If you use vscode and did the `.venv` thingy above, the python plugin will detect your code and prefect. So you'll have proper code completion! And type hints become more useful. Tip: also install [the vscode editorconfig plugin](https://marketplace.visualstudio.com/items?itemName=EditorConfig.EditorConfig) because that will automatically handle unneeded spaces at the end of lines and other minutia.
+## Developing your own tasks
 
-There are even tests files (see https://docs.pytest.org/ for instructions), you can use them to test calculations. Ask Reinout or Florian for tips. Don't test whether an ftp download can work, but *do* test when you do some real programming work.
+The KGhub worker uses [Prefect](https://prefect.com) to operationally run Python scripts. If you have your own tasks that you want to run operationally in the KGhub worker, you can set them up similarly to this repository.
+At least the following files should be present:
 
-Once the virtualenv is activated, you can run the tests simply with:
+- A **README.md** describing what the task does.
+- A **src/flows.py** which contains the main script. Optionally, the script can be split up into separate tasks, which can be put in **src/tasks.py**.
+- A **src/server.py** to control the task run frequency and other settings. Copy it from the repo and adjust as needed.
+- A **Dockerfile** to build the Docker image; copy it from this repository.
+- A **docker-compose.yml**
+- A **requirements.txt** showing which Python packages are used.
 
-    $ pytest
+For more info see the Prefect [getting started](https://docs-2.prefect.io/latest/) page.
+For any questions and to actually deploy your script, reach out to Martijn Krol (martijn.krol@nelen-schuurmans.nl) or Kizje Marif (kizje.marif@nelen-schuurmans.nl).
 
diff --git a/src/flows.py b/src/flows.py
index 3b6c26f..3fac28e 100644
--- a/src/flows.py
+++ b/src/flows.py
@@ -11,12 +11,11 @@
 
 
 @flow(
-    name="Clear name of your flow",
+    name="Rainfall data Bishkek",
     flow_run_name="kghub_example_task Flow run",
-    description="Short description of what the flow does.",
-    retries=0,  # If wanted, place your retries count here,
+    description="Download rainfall data from GPM raster in Dutch Lizard and upload to KGhub as a timeseries",
+    retries=1,  # If wanted, place your retries count here,
     retry_delay_seconds=10,
-    log_prints=True,
 )
 def kghub_example_task_flow():
     logger = get_run_logger()