README and flow name
mdkrol committed Nov 11, 2024
1 parent e4c43d1 commit 7bfa860
Showing 2 changed files with 24 additions and 57 deletions.
74 changes: 21 additions & 53 deletions README.md
@@ -1,61 +1,29 @@
# prefect-kghub-example-task prefect task
# Prefect-kghub-example-task prefect task

## Project Documentation

#TODO: add the documentation of your code here

## Prefect flow development

- Create a virtual environment and install the requirements.txt:

$ python3 -m venv .venv # Only needed once
$ .venv\Scripts\activate # Activate on windows
$ source .venv/bin/activate # Activate on linux
$ pip install -r requirements.txt # Whenever requirements.txt changes

- Write your script in the `src` folder, with your 'main' script in `flows.py`, and tasks in `tasks.py`. Feel free to add new folders or files in the `src` folder.

- Create a deployment in `python src/server.py`.

- To test your flow locally, start a prefect instance within your venv:

$ prefect server start

- Run `python src/flows.py` to run your script, or `python src/server.py` to test the deployment.

## Deploying your flow to production

- Create the repo: https://github.com/nens/prefect-kghub-example-task
This is an example task showing how to develop operational data flows for the KGhub, the WE-ACT data warehouse in Kyrgyzstan.
This script uses the [KGhub worker](https://kghubworker.caiag.kg) to upload data to the [KGhub](https://kghub.caiag.kg/api).
Other examples of what the KGhub worker can be used for:
- Processing CSV files from an FTP server to load data into the KGhub.
- Downloading data from external APIs into the KGhub.
- Calculating discharges from water level measurements in real time.

- Start a local git project:

$ git init

- To keep code readable and maintainable, pre-commit is installed. If you have never used it, install globally on your device with:

$ pip install pre-commit

- Install the pre-commit for this git repo:

$ pre-commit install

- Commit your code and push it to your new repo. If you have troubles with pre-commit, you can always run it manually with:

$ pre-commit run --all

> **_NOTE:_** You need to fix all the pre-commit problems if it doesn't fix them itself. If pre-commit fails, the docker image build will also fail, and your flow will not be deployed.
- Add 2 new teams to your github repo: Adviseurs (write) and Nelen & Schuurmans pull only (admin)

- Ask Florian or Reinout to register your new flow.
## Project Documentation

This script downloads rainfall data for Bishkek from the Dutch Lizard and posts it to the KGhub as a timeseries.

## Optional (but definitely recommended):
The task runs every day at 04:10 GMT.
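
As a rough illustration of this schedule, the sketch below computes the next run time. It assumes GMT can be treated as UTC and that the schedule corresponds to the five-field cron string `10 4 * * *`; how `src/server.py` actually configures the schedule is not shown here, and `next_run` is a hypothetical helper.

```python
from datetime import datetime, time, timedelta, timezone

# Assumption: "every day at 04:10 GMT" expressed as a five-field cron string
# (minute hour day-of-month month day-of-week).
DAILY_CRON = "10 4 * * *"
RUN_TIME_UTC = time(hour=4, minute=10, tzinfo=timezone.utc)


def next_run(now: datetime) -> datetime:
    """Return the first 04:10 UTC strictly after `now` (must be timezone-aware)."""
    now_utc = now.astimezone(timezone.utc)
    candidate = datetime.combine(now_utc.date(), RUN_TIME_UTC)
    if candidate <= now_utc:
        # Today's run has already passed (or is happening now): use tomorrow's.
        candidate += timedelta(days=1)
    return candidate


if __name__ == "__main__":
    print(next_run(datetime.now(timezone.utc)).isoformat())
```

Because the helper converts its input to UTC first, it can be called with a local timestamp, for example one in Bishkek time (UTC+6).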

If you use vscode and did the `.venv` thingy above, the python plugin will detect your code and prefect. So you'll have proper code completion! And type hints become more useful. Tip: also install [the vscode editorconfig plugin](https://marketplace.visualstudio.com/items?itemName=EditorConfig.EditorConfig) because that will automatically handle unneeded spaces at the end of lines and other minutia.
## Developing your own tasks

There are even tests files (see https://docs.pytest.org/ for instructions), you can use them to test calculations. Ask Reinout or Florian for tips. Don't test whether an ftp download can work, but *do* test when you do some real programming work.
The KGhub worker uses [Prefect](https://prefect.com) to run Python scripts operationally. If you have your own tasks that you want to run operationally in the KGhub worker, you can set them up similarly to this repository.
At least the following files should be present:

Once the virtualenv is activated, you can run the tests simply with:
- A **README.md** describing what the task does.
- A **src/flows.py** which contains the main script. Optionally, the script can be split into separate tasks, which go in **src/tasks.py**.
- A **src/server.py** to control the task run frequency and other settings. Copy it from this repository and adjust as needed.
- A **Dockerfile** to build the Docker image; copy it from this repository.
- A **docker-compose.yml**
- A **requirements.txt** showing which Python packages are used.

$ pytest
For more info see the Prefect [getting started](https://docs-2.prefect.io/latest/) page.
For any questions and to actually deploy your script, reach out to Martijn Krol ([email protected]) or Kizje Marif ([email protected]).
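
The file list above can be checked before asking for a deployment. The sketch below is only an illustration: `missing_files` is a hypothetical helper that is not part of this repository, and it checks file presence only, not file contents.

```python
from pathlib import Path

# File names taken from the list in this README.
# src/tasks.py is optional and therefore not checked.
REQUIRED_FILES = (
    "README.md",
    "src/flows.py",
    "src/server.py",
    "Dockerfile",
    "docker-compose.yml",
    "requirements.txt",
)


def missing_files(repo_root):
    """Return the required files that do not exist under repo_root, in list order."""
    root = Path(repo_root)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_files(".")
    if missing:
        print("Missing: " + ", ".join(missing))
    else:
        print("All required files present.")
```

Run from the repository root, it prints either the missing files or a confirmation that all of them are present.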
7 changes: 3 additions & 4 deletions src/flows.py
@@ -11,12 +11,11 @@


@flow(
name="Clear name of your flow",
name="Rainfall data Bishkek",
flow_run_name="kghub_example_task Flow run",
description="Short description of what the flow does.",
retries=0, # If wanted, place your retries count here,
description="Download rainfall data from GPM raster in Dutch Lizard and upload to KGhub as a timeseries",
retries=1, # If wanted, place your retries count here,
retry_delay_seconds=10,
log_prints=True,
)
def kghub_example_task_flow():
logger = get_run_logger()
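
To illustrate what the new `retries=1` and `retry_delay_seconds=10` settings amount to, here is a plain-Python sketch of fixed-delay retrying. It is not Prefect's implementation; `run_with_retries` and its `sleep` parameter are hypothetical names used only for this example.

```python
import time


def run_with_retries(fn, retries=1, retry_delay_seconds=10, sleep=time.sleep):
    """Call fn(); on failure, retry up to `retries` more times with a fixed delay."""
    for attempt in range(retries + 1):
        try:
            return fn()
        except Exception:
            if attempt == retries:
                raise  # no retries left: propagate the last error
            sleep(retry_delay_seconds)  # wait before the next attempt
```

With `retries=1` the function is attempted at most twice, with one 10-second pause in between. The `sleep` argument exists so the delay can be replaced in tests.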