Merge pull request #32 from IATI/additional_org_fields
Save and make available additional reporting_org fields
simon-20 authored Jan 16, 2025
2 parents b9b9ff4 + a5c5919 commit 6bfd8ab
Showing 43 changed files with 1,005 additions and 356 deletions.
1 change: 1 addition & 0 deletions .python-version
@@ -0,0 +1 @@
3.12.6
3 changes: 2 additions & 1 deletion Dockerfile
@@ -1,10 +1,11 @@
FROM python:3.12.5-slim-bookworm
FROM python:3.12.6-slim-bookworm

RUN apt-get update -y

WORKDIR /bulk-data-service

COPY requirements.txt .
COPY pyproject.toml .

RUN pip install -r requirements.txt

45 changes: 21 additions & 24 deletions README.md
@@ -1,22 +1,22 @@

# IATI Bulk Data Service Tool

## Summary

Product | IATI Bulk Data Service
--- | ---
Description | A Python application which fetches the list of registered IATI datasets and periodically downloads them, making each available individually as an XML file and ZIP file, and also providing a ZIP file containing all the datasets.
Website | None
Related |
Documentation | Rest of README.md
Technical Issues | See https://github.com/IATI/bulk-data-service/issues
Support | https://iatistandard.org/en/guidance/get-support/
| Product | IATI Bulk Data Service |
| ---------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Description | A Python application which fetches the list of registered IATI datasets and periodically downloads them, making each available individually as an XML file and ZIP file, and also providing a ZIP file containing all the datasets. |
| Website | None |
| Related          |                                                                                                                                                                                                                                    |
| Documentation | Rest of README.md |
| Technical Issues | See https://github.com/IATI/bulk-data-service/issues |
| Support | https://iatistandard.org/en/guidance/get-support/ |

## High-level requirements

* Python 3.12
* Postgres DB
* Azure storage account with blob storage enabled
- Python 3.12.6 or above
  - (This is specified in `.python-version`, `Dockerfile`, and `pyproject.toml`)
- Postgres DB
- Azure storage account with blob storage enabled

## Running the app locally

@@ -73,8 +73,7 @@ dotenv run python src/iati_bulk_data_service.py -- --operation zipper --single-r

It will store the ZIP files in the directory defined in the `ZIP_WORKING_DIR` environment variable.
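
As a minimal sketch of how a `ZIP_WORKING_DIR`-style setting might be consumed (the variable name comes from this README; the fallback default and the directory creation are assumptions made for illustration, not the app's actual behaviour):

```python
import os
import tempfile

# ZIP_WORKING_DIR is the variable named in this README; falling back to
# the system temp directory is an assumption made for this sketch only.
zip_working_dir = os.environ.get("ZIP_WORKING_DIR", tempfile.gettempdir())

# Make sure the directory exists before the zipper writes ZIP files into it.
os.makedirs(zip_working_dir, exist_ok=True)
```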


*Note:* not all versions of `dotenv` require a `run` subcommand.
_Note:_ not all versions of `dotenv` require a `run` subcommand.

## Development on the app

@@ -121,7 +120,6 @@ Code formatter `black` is configured via `pyproject.toml` and can be run with:
black .
```


### Adding new dependencies to main project

New dependencies need to be added to `pyproject.toml`.
@@ -158,7 +156,6 @@ dotenv run yoyo -- rollback # rollback, interactively
dotenv run yoyo -- new # create file for a new migration
```


### Automated tests

Requirements: docker compose
@@ -206,21 +203,24 @@ This will create a resource group on Azure called `rg-bulk-data-service-dev`, an

At the end of its run, the `azure-create-resources.sh` script will print out various secrets which need to be added to GitHub Actions.

### Deployment - Versioning

The app version is set in `pyproject.toml`, and the app reads this value for use in its `User-Agent` header. When making a new release, set the version here to the appropriate value. Then, when releasing the app using the normal IATI Python app deployment process, choose a tag name that matches this version.

### Deployment - CI/CD

The application is set up to deploy to the dev instance when a PR is merged to
`develop`, and to production when a release is done on the `main` branch.

Sometimes, when altering the CI/CD setup or otherwise debugging, it can be
useful to do things manually. The Bulk Data Service can be released to an Azure instance (e.g., a test instance) using the following command:

```bash
./azure-deployment/manual-azure-deploy-from-local.sh test
```

For this to work, you need to put the secrets you want to use in `azure-deployment/manual-azure-deploy-secrets.env` and the variables you want to use in `azure-deployment/manual-azure-deploy-variables.env`. There is an example of each of these files that can be used as a starting point.


### Manually building the docker image (to test/develop the deployment setup)

You can build the docker image using the following command, replacing `INSTANCE_NAME` with the relevant instance:
@@ -235,9 +235,6 @@ To run it locally:
docker container run --env-file=.env-docker "criati.azurecr.io/bulk-data-service-dev" --operation checker --single-run --run-for-n-datasets 20
```
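
The flags passed above can be sketched as an argument parser. This is an approximation based only on the options exercised in this README; the real parser in `src/iati_bulk_data_service.py` may accept more operations and flags:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Only the options seen in this README; the actual CLI may differ.
    parser = argparse.ArgumentParser(prog="bulk-data-service")
    parser.add_argument("--operation", choices=["checker", "zipper"],
                        required=True)
    parser.add_argument("--single-run", action="store_true",
                        help="run once and exit instead of looping")
    parser.add_argument("--run-for-n-datasets", type=int,
                        help="limit the run to the first N datasets")
    return parser


args = build_parser().parse_args(
    ["--operation", "checker", "--single-run", "--run-for-n-datasets", "20"]
)
```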


## Resources

[Reference docs for the Azure deployment YAML file](https://learn.microsoft.com/en-us/azure/container-instances/container-instances-reference-yaml#schema) (`azure-deployment/deploy.yml`).


5 changes: 5 additions & 0 deletions db-migrations/20250109_01_87Bsu.rollback.sql
@@ -0,0 +1,5 @@
--
-- depends: 20240827_01_pVOLG
--

drop table iati_organisations;
30 changes: 30 additions & 0 deletions db-migrations/20250109_01_87Bsu.sql
@@ -0,0 +1,30 @@
--
-- depends: 20240827_01_pVOLG
--

-- auto-generated definition
create table iati_organisations
(
id uuid not null,
short_name varchar not null,
iati_identifier varchar,
human_readable_name varchar,
registration_service_reporting_org_metadata varchar
);

comment on column iati_organisations.id is 'the UUID of the reporting organisation';

comment on column iati_organisations.short_name is 'the short id of the reporting organisation';

comment on column iati_organisations.iati_identifier is 'the IATI identifier of the reporting organisation';

comment on column iati_organisations.human_readable_name is 'the canonical human readable name of the reporting organisation';

comment on column iati_organisations.registration_service_reporting_org_metadata is 'the original reporting organisation metadata record from the data registration service';

alter table iati_organisations
owner to bds;

create unique index iati_organisations_pk
on iati_organisations (id);
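
Since uniqueness here is enforced by the `iati_organisations_pk` index rather than a declared primary key, a second row with the same `id` is rejected at the index. A self-contained sketch of that behaviour (using `sqlite3` purely for illustration; the service itself runs this DDL on Postgres via `psycopg`, and column types are simplified):

```python
import sqlite3
import uuid

# Illustrative only: the migration above targets Postgres; sqlite3 is
# used here so the sketch is self-contained and runnable.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    create table iati_organisations (
        id text not null,
        short_name text not null,
        iati_identifier text,
        human_readable_name text,
        registration_service_reporting_org_metadata text
    )
    """
)
conn.execute(
    "create unique index iati_organisations_pk on iati_organisations (id)"
)

org_id = str(uuid.uuid4())
conn.execute(
    "insert into iati_organisations (id, short_name) values (?, ?)",
    (org_id, "example-org"),
)

# The unique index rejects a second row with the same id.
try:
    conn.execute(
        "insert into iati_organisations (id, short_name) values (?, ?)",
        (org_id, "example-org-duplicate"),
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```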

6 changes: 6 additions & 0 deletions db-migrations/20250109_02_CDARS.rollback.sql
@@ -0,0 +1,6 @@
--
-- depends: 20250109_01_87Bsu
--

alter table iati_reporting_orgs
rename to iati_organisations;
6 changes: 6 additions & 0 deletions db-migrations/20250109_02_CDARS.sql
@@ -0,0 +1,6 @@
--
-- depends: 20250109_01_87Bsu
--

alter table iati_organisations
rename to iati_reporting_orgs;
17 changes: 17 additions & 0 deletions db-migrations/20250109_03_5lLgM.rollback.sql
@@ -0,0 +1,17 @@
--
-- depends: 20250109_02_CDARS
--

alter table iati_datasets
add registration_service_publisher_metadata varchar;

alter table iati_datasets
rename column reporting_org_id to publisher_id;

alter table iati_datasets
rename column reporting_org_short_name to publisher_name;

alter table iati_datasets
rename column short_name to name;
15 changes: 15 additions & 0 deletions db-migrations/20250109_03_5lLgM.sql
@@ -0,0 +1,15 @@
--
-- depends: 20250109_02_CDARS
--

alter table iati_datasets
drop column if exists registration_service_publisher_metadata;

alter table iati_datasets
rename column publisher_id to reporting_org_id;

alter table iati_datasets
rename column publisher_name to reporting_org_short_name;

alter table iati_datasets
rename column name to short_name;
8 changes: 5 additions & 3 deletions pyproject.toml
@@ -1,14 +1,15 @@
[project]
name = "bulk-data-service"
version = "0.1.7"
requires-python = ">= 3.12"
version = "0.2.0"
requires-python = ">= 3.12.6"
readme = "README.md"
dependencies = [
"azure-storage-blob==12.20.0",
"psycopg[binary,pool]==3.1.18",
"requests==2.31.0",
"yoyo-migrations==9.0.0",
"prometheus-client==0.20.0",
"toml==0.10.2"
]


@@ -23,7 +24,8 @@ dev = [
"flake8-pyproject",
"types-requests",
"python-dotenv",
"pytest-watcher"
"pytest-watcher",
"types-toml"
]


54 changes: 28 additions & 26 deletions requirements-dev.txt
@@ -4,51 +4,51 @@
#
# pip-compile --extra=dev --output-file=requirements-dev.txt --strip-extras pyproject.toml
#
azure-core==1.30.2
azure-core==1.32.0
# via azure-storage-blob
azure-storage-blob==12.20.0
# via bulk-data-service (pyproject.toml)
black==24.8.0
black==24.10.0
# via bulk-data-service (pyproject.toml)
build==1.2.1
build==1.2.2.post1
# via pip-tools
certifi==2024.8.30
certifi==2024.12.14
# via requests
cffi==1.17.1
# via cryptography
charset-normalizer==3.3.2
charset-normalizer==3.4.1
# via requests
click==8.1.7
click==8.1.8
# via
# black
# pip-tools
cryptography==43.0.1
cryptography==44.0.0
# via azure-storage-blob
flake8==7.1.1
# via
# bulk-data-service (pyproject.toml)
# flake8-pyproject
flake8-pyproject==1.2.3
# via bulk-data-service (pyproject.toml)
idna==3.8
idna==3.10
# via requests
importlib-metadata==8.4.0
importlib-metadata==8.5.0
# via yoyo-migrations
iniconfig==2.0.0
# via pytest
isodate==0.6.1
isodate==0.7.2
# via azure-storage-blob
isort==5.13.2
# via bulk-data-service (pyproject.toml)
mccabe==0.7.0
# via flake8
mypy==1.11.2
mypy==1.14.1
# via bulk-data-service (pyproject.toml)
mypy-extensions==1.0.0
# via
# black
# mypy
packaging==24.1
packaging==24.2
# via
# black
# build
@@ -57,7 +57,7 @@ pathspec==0.12.1
# via black
pip-tools==7.4.1
# via bulk-data-service (pyproject.toml)
platformdirs==4.2.2
platformdirs==4.3.6
# via black
pluggy==1.5.0
# via pytest
@@ -67,19 +67,19 @@ psycopg==3.1.18
# via bulk-data-service (pyproject.toml)
psycopg-binary==3.1.18
# via psycopg
psycopg-pool==3.2.2
psycopg-pool==3.2.4
# via psycopg
pycodestyle==2.12.1
# via flake8
pycparser==2.22
# via cffi
pyflakes==3.2.0
# via flake8
pyproject-hooks==1.1.0
pyproject-hooks==1.2.0
# via
# build
# pip-tools
pytest==8.3.2
pytest==8.3.4
# via bulk-data-service (pyproject.toml)
pytest-watcher==0.4.3
# via bulk-data-service (pyproject.toml)
@@ -89,15 +89,17 @@ requests==2.31.0
# via
# azure-core
# bulk-data-service (pyproject.toml)
six==1.16.0
# via
# azure-core
# isodate
sqlparse==0.5.1
six==1.17.0
# via azure-core
sqlparse==0.5.3
# via yoyo-migrations
tabulate==0.9.0
# via yoyo-migrations
types-requests==2.32.0.20240905
toml==0.10.2
# via bulk-data-service (pyproject.toml)
types-requests==2.32.0.20241016
# via bulk-data-service (pyproject.toml)
types-toml==0.10.8.20240310
# via bulk-data-service (pyproject.toml)
typing-extensions==4.12.2
# via
@@ -106,17 +108,17 @@ typing-extensions==4.12.2
# mypy
# psycopg
# psycopg-pool
urllib3==2.2.2
urllib3==2.3.0
# via
# requests
# types-requests
watchdog==5.0.2
watchdog==6.0.0
# via pytest-watcher
wheel==0.44.0
wheel==0.45.1
# via pip-tools
yoyo-migrations==9.0.0
# via bulk-data-service (pyproject.toml)
zipp==3.20.1
zipp==3.21.0
# via importlib-metadata

# The following packages are considered to be unsafe in a requirements file: