Convert learner+template to clientAPI+jobAPI for nlp example #3200

Merged · 4 commits · Feb 7, 2025
49 changes: 18 additions & 31 deletions examples/advanced/nlp-ner/README.md
@@ -22,11 +22,13 @@ pip install -r ./requirements.txt

The raw data can be accessed from the [official page](https://www.ncbi.nlm.nih.gov/CBBresearch/Dogan/DISEASE/).
In this example, we use the preprocessed csv-files from the reference repo above, which can be downloaded [here](https://drive.google.com/drive/folders/13wROtEAnMgWpLMIGHB5CY1BQ1Xe2XqhG). Please download three files `train.csv`, `dev.csv`, and `test.csv`.
In the following, we assume the downloaded files are placed in a folder `DATASET_ROOT`, which defaults to `/tmp/nvflare/data/nlp_ner`.

We then use the preprocessed data to generate random splits for both 4-client and 2-client experiments.
Please modify `DATASET_ROOT` below to point to the folder containing the three downloaded csv files.
```commandline
DATASET_ROOT=/tmp/nvflare/data/nlp_ner
bash prepare_data.sh $DATASET_ROOT
```
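
Conceptually, the split generation shuffles each csv and writes one shard per site. Below is a minimal sketch of that idea, assuming pandas/numpy and an illustrative `site-N`/`N_split` output layout; `prepare_data.sh` is the authoritative implementation and its file names and shuffling details may differ.
```python
# Illustrative sketch of a uniform random split; prepare_data.sh may differ.
import os

import numpy as np
import pandas as pd


def split_csv(src_csv: str, out_dir: str, num_sites: int, seed: int = 0) -> None:
    """Randomly partition one csv file into `num_sites` roughly equal shards."""
    df = pd.read_csv(src_csv)
    rng = np.random.default_rng(seed)
    shuffled = df.iloc[rng.permutation(len(df))].reset_index(drop=True)
    for site_idx, shard in enumerate(np.array_split(shuffled, num_sites), start=1):
        site_dir = os.path.join(out_dir, f"site-{site_idx}")  # hypothetical layout
        os.makedirs(site_dir, exist_ok=True)
        shard.to_csv(os.path.join(site_dir, os.path.basename(src_csv)), index=False)


if __name__ == "__main__":
    dataset_root = "/tmp/nvflare/data/nlp_ner"
    for num_sites in (4, 2):  # splits for the 4-client and 2-client experiments
        for name in ("train.csv", "dev.csv"):
            split_csv(
                os.path.join(dataset_root, name),
                os.path.join(dataset_root, f"{num_sites}_split"),
                num_sites,
            )
```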
The expected output of `prepare_data.sh` is:
```
@@ -52,31 +54,14 @@ Let's take a closer look at the word-label correspondence:
As shown above, the task is to capture the keywords related to medical findings.

## Run automated experiments
We run the federated training with the NVFlare Simulator via the [JobAPI](https://nvflare.readthedocs.io/en/main/programming_guide/fed_job_api.html), following the pattern:
```
python3 nlp_fl_job.py --model_name Bert
python3 nlp_fl_job.py --model_name GPT
```
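
`nlp_fl_job.py` builds and runs the job programmatically. A minimal sketch of that pattern is shown below; `FedAvgJob` and `ScriptRunner` are standard JobAPI building blocks, while the model choice, the client script name (`src/nlp_fl.py`), its arguments, the round count, and the workspace path are illustrative assumptions rather than the exact values used by this example.
```python
# Sketch of a JobAPI driver script; names and arguments are illustrative.
from nvflare.app_opt.pt.job_config.fed_avg import FedAvgJob
from nvflare.job_config.script_runner import ScriptRunner
from transformers import AutoModelForTokenClassification

n_clients = 4
num_rounds = 5  # assumption; the example may use a different number

# BIO-style disease tagging -> 3 labels (assumption)
model = AutoModelForTokenClassification.from_pretrained("bert-base-uncased", num_labels=3)

job = FedAvgJob(
    name="bert_ncbi",
    n_clients=n_clients,
    num_rounds=num_rounds,
    initial_model=model,
)

for i in range(n_clients):
    runner = ScriptRunner(
        script="src/nlp_fl.py",  # hypothetical Client-API training script
        script_args="--dataset_path /tmp/nvflare/data/nlp_ner/4_split --model_name bert-base-uncased",
    )
    job.to(runner, f"site-{i + 1}")

# Run everything in the FL simulator; the workspace path is an assumption.
job.simulator_run("/tmp/nvflare/workspaces/bert_ncbi", gpu="0")
```
Switching `--model_name` to GPT would plug a GPT-2 token-classification model into the same structure.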

## Results
In this example, we run 4 clients for the BERT model and 2 clients for the GPT-2 model. The minimum GPU memory requirement is 10 GB per GPU for BERT and 8 GB per GPU for GPT-2.

### Validation curve on each site
In this example, each client computes their validation scores using their own validation set.
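
Each site's local training-and-validation loop is expected to follow the NVFlare Client API pattern introduced by this conversion: receive the global model, train and validate locally, then send the update and metrics back. A minimal sketch, assuming a hypothetical `train_and_validate` helper and an illustrative metric name:
```python
# Sketch of the per-site Client-API loop; model handling and metric names are illustrative.
import nvflare.client as flare
from nvflare.app_common.abstract.fl_model import FLModel

flare.init()

while flare.is_running():
    input_model = flare.receive()  # global weights from the server
    # Load input_model.params into the local network, train on this site's csv
    # shard, then validate on the site's own validation split.
    local_params, val_f1 = train_and_validate(input_model.params)  # hypothetical helper
    flare.send(FLModel(params=local_params, metrics={"validation_f1": val_f1}))
```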
@@ -94,27 +79,29 @@ The testing score is computed for the global model over the testing set.
We provide a script for performing validation on the testing data.
Please modify `DATASET_ROOT` below if your data is in a different location:
```
DATASET_ROOT=/tmp/nvflare/data/nlp_ner
export PYTHONPATH=${PWD}
bash test_global_model.sh ${DATASET_ROOT}
```
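
Conceptually, `test_global_model.sh` loads the final global checkpoint and scores it on `test.csv`. A minimal sketch of that idea follows, assuming the global model was persisted as a PyTorch checkpoint under the simulator workspace and that `seqeval` produces the report; the checkpoint path, label count, and the `predict_labels` helper are illustrative assumptions.
```python
# Sketch of global-model testing; paths and helpers are illustrative assumptions.
import torch
from seqeval.metrics import classification_report
from transformers import AutoModelForTokenClassification

# Assumed location of the persisted global model inside the simulator workspace.
ckpt_path = "/tmp/nvflare/workspaces/bert_ncbi/server/simulate_job/app_server/best_FL_global_model.pt"

model = AutoModelForTokenClassification.from_pretrained("bert-base-uncased", num_labels=3)
model.load_state_dict(torch.load(ckpt_path, map_location="cpu")["model"])
model.eval()

# predict_labels() stands in for tokenizing test.csv, running the model, and
# mapping word-piece predictions back to word-level tags (hypothetical helper).
y_true, y_pred = predict_labels(model, "/tmp/nvflare/data/nlp_ner/test.csv")
print(classification_report(y_true, y_pred))
```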
The test results from `test_global_model.sh` are:
```
BERT
              precision    recall  f1-score   support

           _       0.96      0.98      0.97      1255

   micro avg       0.96      0.98      0.97      1255
   macro avg       0.96      0.98      0.97      1255
weighted avg       0.96      0.98      0.97      1255

GPT-2
              precision    recall  f1-score   support

           _       0.87      0.90      0.88      1255

   micro avg       0.87      0.90      0.88      1255
   macro avg       0.87      0.90      0.88      1255
weighted avg       0.87      0.90      0.88      1255

```
Note that training is not deterministic, so the numbers can vary slightly between runs.