Missing icenet CLI training argument in template script #54

Open
bnubald opened this issue Aug 29, 2024 · 3 comments
Labels
bug Something isn't working
Milestone

Comments

@bnubald
Contributor

bnubald commented Aug 29, 2024

IceNet version: v0.3.0_dev
Pipeline version: 0.3.0_dev

Training fails with the error below. The workers flag, -w, seems to have been removed in icenet v0.3.0_dev, which is what causes it.

Pipeline script run:

./run_train_ensemble.sh -b $BATCH_SIZE -e 10 -f $FILTER_FACTOR -p $PREP_SCRIPT -q 4 ${TRAIN_DATA_NAME}_${HEMI} ${TRAIN_DATA_NAME}_${HEMI} ${FORECAST}_${HEMI}

Resulting in this error in the ensemble run:

usage: icenet_train_tensorflow [-h] [-o OUTPUT_PATH] [-v] [-b BATCH_SIZE]
                               [-ca CHECKPOINT_MODE] [-cm CHECKPOINT_MONITOR]
                               [-ds [ADDITIONAL ...]] [-e EPOCHS]
                               [--early-stopping EARLY_STOPPING] [-p PRELOAD]
                               [-r RATIO] [--shuffle-train] [--lr LR]
                               [--lr_10e_decay_fac LR_10E_DECAY_FAC]
                               [--lr_decay_start LR_DECAY_START]
                               [--lr_decay_end LR_DECAY_END] [-f FILTER_SIZE]
                               [-n N_FILTERS_FACTOR]
                               [-s {default,mirrored,central}] [-nw] [-wo]
                               [-wp WANDB_PROJECT] [-wu WANDB_USER]
                               dataset run_name seed
icenet_train_tensorflow: error: ambiguous option: -w could match -wo, -wp, -wu
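
For anyone wondering why this is reported as "ambiguous option" rather than "unrecognized arguments": the usage text looks like argparse output, and argparse treats an unknown option as a possible abbreviation of any option that starts with the same characters. The sketch below reproduces that with a stand-in parser. `demo_train`, `RaisingParser`, `build_demo_parser` and `parse_outcome` are made-up names for illustration, and only the three `-w*` options are copied from the usage text above; this is not icenet's actual parser.

```python
import argparse


class RaisingParser(argparse.ArgumentParser):
    """ArgumentParser that raises instead of printing usage and exiting."""

    def error(self, message):
        raise ValueError(message)


def build_demo_parser():
    # Stand-in parser (assumption): only the -w* options seen in the usage text.
    parser = RaisingParser(prog="demo_train")
    parser.add_argument("-wo", action="store_true")
    parser.add_argument("-wp")
    parser.add_argument("-wu")
    parser.add_argument("dataset")
    return parser


def parse_outcome(argv):
    """Return 'ok' if argv parses, otherwise argparse's error message."""
    try:
        build_demo_parser().parse_args(argv)
        return "ok"
    except ValueError as exc:
        return str(exc)


if __name__ == "__main__":
    # "-w" is not defined, but it is a prefix of -wo, -wp and -wu.
    print(parse_outcome(["-w", "4", "some_dataset"]))
    # A fully spelled option parses without complaint.
    print(parse_outcome(["-wp", "my_project", "some_dataset"]))
```

So once `-w` itself is gone from the CLI, any leftover `-w` in a calling script collides with the remaining `-w*` options instead of being flagged as simply unknown.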

@JimCircadian, looking through icenet v0.3.0_dev, the removal of -w seems intended. Should we remove -w {{ run.ntasks }} from the template here to fix this?

https://github.com/icenet-ai/icenet-pipeline/blob/4f98699d819245fe1ed5d40d9c30b64d1efebfa5/ensemble/template/icenet_train.sh.j2#L40C171-L40C190

@bnubald added the bug label on Aug 29, 2024
@bnubald added this to the v0.3.0 milestone on Aug 29, 2024
@JimCircadian
Member

@bnubald I've marked the comment above and reported it as a phishing attempt.

As I recall, the -w flag was for multiprocessing that wasn't needed or enabled for tf.data usage (it is applied to the model.fit call), so yes, removing it would work. Interesting that the template is still using icenet_train too, unless that's substituted? Apologies, this has gotten a bit stale in my brain now.

@bnubald
Contributor Author

bnubald commented Aug 29, 2024

Yes, I've switched to icenet_train_tensorflow for now, and I'm also picking up more missing args:

❯ t ensemble/torch_unet_south/torch_unet_south-0/train.6233459.node022.42.err 
                               [--early-stopping EARLY_STOPPING] [-p PRELOAD]
                               [-r RATIO] [--shuffle-train] [--lr LR]
                               [--lr_10e_decay_fac LR_10E_DECAY_FAC]
                               [--lr_decay_start LR_DECAY_START]
                               [--lr_decay_end LR_DECAY_END] [-f FILTER_SIZE]
                               [-n N_FILTERS_FACTOR]
                               [-s {default,mirrored,central}] [-nw] [-wo]
                               [-wp WANDB_PROJECT] [-wu WANDB_USER]
                               dataset run_name seed
icenet_train_tensorflow: error: unrecognized arguments: -m -qs 4
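
One way to list everything a templated command passes that the CLI no longer accepts, without waiting for a job to fail, is argparse's `parse_known_args`, which returns the leftovers instead of exiting. A minimal sketch, assuming a stand-in parser: `demo_train`, `build_demo_parser` and `unsupported_args` are hypothetical names, and the option set is a small subset copied from the usage text, not the real icenet_train_tensorflow parser.

```python
import argparse


def build_demo_parser():
    # Stand-in parser (assumption): a few options and the three positionals
    # shown in the usage text; the real CLI defines many more.
    parser = argparse.ArgumentParser(prog="demo_train")
    parser.add_argument("-b", type=int)
    parser.add_argument("-e", type=int)
    parser.add_argument("dataset")
    parser.add_argument("run_name")
    parser.add_argument("seed")
    return parser


def unsupported_args(argv):
    """Return the arguments in argv that the demo parser does not recognise."""
    _, extras = build_demo_parser().parse_known_args(argv)
    return extras


if __name__ == "__main__":
    # Hypothetical command line with two flags the parser no longer defines.
    argv = ["some_dataset", "some_run", "42",
            "-b", "4", "-e", "10", "-m", "-qs", "4"]
    print(unsupported_args(argv))
```

The value that followed `-qs` ends up in the leftovers as well, since the parser has no way to know it belonged to a removed option. That matches the `-m -qs 4` in the error above.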

@JimCircadian
Member

Both are related and can go, @bnubald!

@github-staff deleted a comment from wahyu-n on Oct 22, 2024