In kerNET, there are three interfaces, with different levels of flexibility, that provide access to our modular learning method.

  1. Easiest, but Minimum Flexibility
  2. Full Customization Via Command Line Interface
  3. Using Component(s) From the Modular Learning Method

To test a trained model, see Test a Trained Model.

Easiest, but Minimum Flexibility

If you only want to use the default set-up to train some classifiers on the datasets that we support, directly run the helper training scripts that we provide in scripts/. For example, to train a ResNet-18 on CIFAR-10 with our modular method (as two modules), do

cd scripts
./cifar10_modular.sh

To train a ResNet-18 on CIFAR-10 with end-to-end backpropagation as a baseline, do

cd scripts
./cifar10_e2e.sh

You can customize the training settings to some degree by modifying the options in the scripts (a hypothetical sketch of such an edit follows the table below). With the exact settings in those scripts, you should get performance on par with the following table, which is what we got by running those scripts on our machine. Do not expect exactly the same results, since the models will not have exactly the same random initializations on your machine.

| Test accuracy (%) | MNIST | Fashion-MNIST | CIFAR-10 | SVHN  |
| ----------------- | ----- | ------------- | -------- | ----- |
| end-to-end        | 99.28 | 95.23         | 94.34    | 96.40 |
| modular           | 99.32 | 95.31         | 93.96    | 96.73 |

Note that the hyperparameters provided in the scripts are not necessarily the best; the scripts are merely there for you to get a quick start.
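Since each helper script is essentially a thin wrapper around the command-line interface described in the next section, customizing a run usually amounts to editing the flags the script passes along. The snippet below is only a hypothetical sketch of what the core line of scripts/cifar10_modular.sh might look like; the actual script (including the path it uses to reach modular_train.py) may be organized differently, so read the script itself before editing it.

python kernet/examples/modular_train.py --dataset cifar10 --model kresnet18 --n_parts 2 --loss xe --lr1 .1 --lr2 .1 --activation reapen --optimizer sgd --n_epochs1 200 --n_epochs2 50 --hidden_objective srs_upper_tri_alignment --in_channels 3 --batch_size 128 --save_dir my_checkpoint --n_val 5000

Changing a value such as --batch_size, --lr1/--lr2, or --n_epochs1/--n_epochs2 in the script changes the corresponding training setting.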

Full Customization Via Command Line Interface

To gain access to all the hyperparameters and settings that we support tuning, you may use our command line interface (the scripts in scripts/ are in fact wrappers on top of this interface). Specifically, to train a ResNet-18 on CIFAR-10 with our modular method (as two modules), do

python kernet/examples/modular_train.py --dataset cifar10 --model kresnet18 --n_parts 2 --loss xe --lr1 .1 --lr2 .1 --activation reapen --optimizer sgd --n_epochs1 200 --n_epochs2 50 --hidden_objective srs_upper_tri_alignment --in_channels 3 --batch_size 128  --save_dir my_checkpoint --n_val 5000

To see all the things that you can tune, do

python kernet/examples/modular_train.py -h

End-to-end training baselines can be obtained with kernet/examples/train.py.
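It exposes its own set of options; assuming it provides the standard help flag in the same way modular_train.py does, you can list them with

python kernet/examples/train.py -h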

More details on the training pipelines are provided here.

Using Component(s) From the Modular Learning Method

If you want to use certain component(s) from our modular learning method, you can import the desired component(s) from kerNET into your own code (a short sketch of such imports follows the list below).

  • Proxy objectives: These are the objective functions we use to train the hidden modules. Some reference implementations are in kernet.layers.loss.
  • Kernels induced by neural network nonlinearities: Neural network nonlinearities such as tanh or ReLU can be used to induce kernel functions, which are then used to construct the proxy objectives and also enable us to view neural networks as kernel machines. These kernel functions can be accessed via kernet.layers.kcore.Phi.
  • Models
    • The models in kernet/models whose names start with a k are basically the same as their counterparts that do not start with a k. The only real difference is that the penultimate activation vectors of the k models are always normalized so that they can be trained with our modular method. The k models also have some extra helper methods such as split that allow them to be used in our modular learning pipeline.
    • The models whose names end with an N also have their penultimate activation vectors normalized, so they are exactly the same models as the k models. However, the N models do not have the helper methods needed by our modular learning pipeline and therefore cannot be used in it.
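The snippet below is a minimal sketch of importing these components into your own code. The import paths follow the module names above, but the specific class names inside kernet.layers.loss, and the exact way Phi and the k models are constructed, are not spelled out in this document, so treat everything beyond the import statements as an assumption and check the source for the actual interfaces.

# proxy objectives for training the hidden modules (reference implementations)
from kernet.layers import loss as proxy_losses

# kernels induced by neural network nonlinearities such as tanh or ReLU
from kernet.layers.kcore import Phi

# the k models live under kernet/models and expose helpers such as split();
# their exact module and class names are assumptions here, e.g. something like
# from kernet.models import kresnet18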

Test a Trained Model

Suppose your model is saved in my_checkpoint/. To test it, do

python kernet/examples/test.py --load_opt --opt_file my_checkpoint/opt.pkl --checkpoint_dir my_checkpoint

You can modify some settings during testing. To see all the things that you can tune, do

python kernet/examples/test.py -h

Test logs will be saved in test.log and test.json.

You can test on a randomly chosen subset of the test set by specifying --max_testset_size.
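For example, assuming the flag takes the maximum number of test examples to use (the value 1000 below is just an illustrative choice), you could do

python kernet/examples/test.py --load_opt --opt_file my_checkpoint/opt.pkl --checkpoint_dir my_checkpoint --max_testset_size 1000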