This is the implementation of "Towards a Theoretical Framework of Out-of-Distribution Generalization". Our code is built on DomainBed.
Download the VLCS dataset and link it to
The dataset directory structure should be the same as in DomainBed. To run our full experiments, one should also download the PACS and OfficeHome datasets in the same way.
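As a quick sanity check of the layout, the sketch below lists the environment folders and their class folders under a dataset root. It assumes the ImageFolder-style tree that DomainBed uses for image datasets (one folder per environment, one subfolder per class). The function name `summarize_layout` is illustrative and not part of this repository, and the dataset root is passed on the command line rather than assumed.

```python
import os
import sys


def summarize_layout(dataset_root):
    """Return {environment: sorted class folder names} for an ImageFolder-style tree."""
    layout = {}
    for env in sorted(os.listdir(dataset_root)):
        env_path = os.path.join(dataset_root, env)
        if not os.path.isdir(env_path):
            continue  # skip stray files next to the environment folders
        layout[env] = sorted(
            d for d in os.listdir(env_path)
            if os.path.isdir(os.path.join(env_path, d))
        )
    return layout


if __name__ == "__main__":
    # Usage (hypothetical script name): python check_layout.py <path to dataset root>
    if len(sys.argv) == 2 and os.path.isdir(sys.argv[1]):
        for env, classes in summarize_layout(sys.argv[1]).items():
            print(f"{env}: {len(classes)} class folders")
```

If the printed environments or class counts differ from what DomainBed expects for the dataset, the link or the extraction step is the first thing to check.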
To run a single experiment, run the following command:
python -m main --data_dir domainbed/datasets \
--trial_seed 89500 --algorithm ERM --dataset VLCS --test_envs 1 \
--steps 5001 --output_dir logs/ERM_VLCS_test_env1/ \
--extract_feature ERM_exp1 \
--output_result_file result.csv \
--checkpoint_freq 500 \
--save_feature_every_checkpoint \
--hparams "{\"lr\": 0.0001, \"resnet18\": false}"
The final result will be saved in logs/ERM_VLCS_test_env1/ERM_exp1
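The escaped JSON passed to --hparams is easy to get wrong by hand. The following is a minimal sketch, assuming one wants to generate the same command for each test environment from Python: `json.dumps` builds the JSON and `shlex.quote` quotes it for the shell. The helper name `build_command` and the naming pattern for the output directory and feature name are illustrative assumptions; the flags and values are the ones used in the command above.

```python
import json
import shlex


def build_command(algorithm, dataset, test_env, hparams, trial_seed=89500):
    """Assemble the single-experiment command as a shell-safe string."""
    # Assumed naming pattern, mirroring logs/ERM_VLCS_test_env1/ from the example.
    out_dir = f"logs/{algorithm}_{dataset}_test_env{test_env}/"
    args = [
        "python", "-m", "main",
        "--data_dir", "domainbed/datasets",
        "--trial_seed", str(trial_seed),
        "--algorithm", algorithm,
        "--dataset", dataset,
        "--test_envs", str(test_env),
        "--steps", "5001",
        "--output_dir", out_dir,
        "--extract_feature", f"{algorithm}_exp1",
        "--output_result_file", "result.csv",
        "--checkpoint_freq", "500",
        "--save_feature_every_checkpoint",
        "--hparams", json.dumps(hparams),  # Python False becomes JSON false
    ]
    return " ".join(shlex.quote(a) for a in args)


if __name__ == "__main__":
    # VLCS has four environments, indexed 0 to 3.
    for env in range(4):
        print(build_command("ERM", "VLCS", env, {"lr": 0.0001, "resnet18": False}))
```

Each printed line can be pasted into a shell or written to a job script; quoting the JSON with single quotes avoids the backslash escapes entirely.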
To reproduce our experiments, one can run the following command to generate different models for model selection:
python --command_launcher my_multi_gpu launch
After model generation, one should use the following commands to collect features and metrics from the different models. The results will be recorded in renamed/*.csv
cd logs
To use our model selection criterion, one can run
This command uses the results in renamed/*.csv to select models according to different criteria and reports their test environment accuracy. train_acc means the selection criterion is the average training accuracy; train_mix is the selection criterion defined in our paper.
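The selection step can be pictured as follows. This is a minimal sketch, not the repository's script: it assumes each CSV row describes one model and holds a numeric score per criterion plus a test accuracy, that a larger score is better, and that the columns are named `model`, `train_acc`, `train_mix`, and `test_acc`. Those column names, the higher-is-better convention, and the function `select_model` are assumptions for illustration; check them against the actual files in renamed/.

```python
import csv
import io


def select_model(rows, criterion, test_key="test_acc"):
    """Pick the row with the highest `criterion` score; return (row, its test accuracy)."""
    best = max(rows, key=lambda r: float(r[criterion]))
    return best, float(best[test_key])


if __name__ == "__main__":
    # Toy numbers made up for illustration only; they are not results from the paper.
    toy_csv = io.StringIO(
        "model,train_acc,train_mix,test_acc\n"
        "a,0.99,0.60,0.70\n"
        "b,0.97,0.75,0.74\n"
    )
    rows = list(csv.DictReader(toy_csv))
    for criterion in ("train_acc", "train_mix"):
        best, test_acc = select_model(rows, criterion)
        print(criterion, "selects model", best["model"], "with test accuracy", test_acc)
```

The point of the comparison is that two criteria can pick different models from the same pool, so the reported test environment accuracy depends on the criterion as well as on the training runs.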