In kerNET, there are three interfaces, with different levels of flexibility, that provide access to our modular learning method:
- Easiest, but Minimum Flexibility
- Full Customization Via Command Line Interface
- Using Component(s) From the Modular Learning Method
To test a trained model, see Test a Trained Model.
If you only want to use the default set-up to train some classifiers on the datasets that we support, directly run the helper training scripts that we provide in scripts/. For example, to train a ResNet-18 on CIFAR-10 with our modular method (as two modules), do
```
cd scripts
./cifar10_modular.sh
```
To train a ResNet-18 on CIFAR-10 with end-to-end backpropagation as a baseline, do
```
cd scripts
./cifar10_e2e.sh
```
You can customize the training settings to some degree by modifying the options in the scripts. With the exact settings in those scripts, you should get performance on par with the following table, which we obtained by running them on our machine. Do not expect identical numbers, since the models will not have exactly the same random initializations on your machine.
| | MNIST | Fashion-MNIST | CIFAR-10 | SVHN |
|---|---|---|---|---|
| end-to-end | 99.28 | 95.23 | 94.34 | 96.40 |
| modular | 99.32 | 95.31 | 93.96 | 96.73 |
Note that the hyperparameters provided in the scripts are not necessarily the best; the scripts are merely there to help you get a quick start.
To gain access to all the hyperparameters and settings that we support tuning, you may use our command line interface (the scripts in scripts/ are in fact wrappers on top of this interface). Specifically, to train a ResNet-18 on CIFAR-10 with our modular method (as two modules), do
```
python kernet/examples/modular_train.py --dataset cifar10 --model kresnet18 --n_parts 2 --loss xe --lr1 .1 --lr2 .1 --activation reapen --optimizer sgd --n_epochs1 200 --n_epochs2 50 --hidden_objective srs_upper_tri_alignment --in_channels 3 --batch_size 128 --save_dir my_checkpoint --n_val 5000
```
To see all the things that you can tune, do
```
python kernet/examples/modular_train.py -h
```
End-to-end training baselines can be obtained with kernet/examples/train.py.
More details on the training pipelines are provided here.
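To give a rough sense of what the two-module setting above corresponds to, below is a conceptual sketch of two-stage modular training in plain PyTorch: the hidden module is first trained on a proxy objective, then frozen while the output module is trained with cross-entropy (roughly what `--n_epochs1`/`--lr1`/`--hidden_objective` and `--n_epochs2`/`--lr2`/`--loss xe` control). This is an illustration only, not kerNET's implementation; the model, data, and the toy alignment-style objective here are all placeholders (kerNET's actual proxy objectives live in `kernet.layers.loss`).

```python
import torch
import torch.nn as nn

# Placeholder model pieces and data; a real run would use a k-model from
# kernet/models and a real dataset. This is NOT kerNET's implementation.
hidden_module = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 256), nn.Tanh())
output_module = nn.Linear(256, 10)
loader = [(torch.randn(64, 3, 32, 32), torch.randint(0, 10, (64,))) for _ in range(10)]

def toy_proxy_objective(features, targets):
    # Toy alignment-style stand-in for a hidden-module (proxy) objective:
    # push the cosine-similarity matrix of the features toward the
    # label-agreement matrix of the batch.
    f = nn.functional.normalize(features, dim=1)
    gram = f @ f.t()
    agree = (targets[:, None] == targets[None, :]).float()
    return -(gram * agree).sum() / (gram.norm() * agree.norm() + 1e-8)

# Stage 1: train the hidden module alone on the proxy objective.
opt1 = torch.optim.SGD(hidden_module.parameters(), lr=0.1)
for x, y in loader:
    opt1.zero_grad()
    toy_proxy_objective(hidden_module(x), y).backward()
    opt1.step()

# Stage 2: freeze the hidden module, then train the output module with
# cross-entropy on top of the frozen features.
for p in hidden_module.parameters():
    p.requires_grad_(False)
opt2 = torch.optim.SGD(output_module.parameters(), lr=0.1)
xe = nn.CrossEntropyLoss()
for x, y in loader:
    opt2.zero_grad()
    xe(output_module(hidden_module(x)), y).backward()
    opt2.step()
```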
If you want to use certain component(s) from our modular learning method, you can import the desired component(s) from kerNET into your own code.
- Proxy objectives: These are the objective functions we use to train the hidden modules. Some reference implementations are in `kernet.layers.loss`.
- Kernels induced by neural network nonlinearities: Neural network nonlinearities such as tanh or ReLU can be used to induce kernel functions, which are then used to construct the proxy objectives and also enable us to view neural networks as kernel machines. These kernel functions can be accessed via `kernet.layers.kcore.Phi` (see the sketch after this list).
- Models
  - The models in `kernet/models` whose names start with a `k` are basically the same as their counterparts that do not start with a `k`. The only real difference is that the penultimate activation vectors of the `k` models are always normalized so that they can be trained with our modular method. The `k` models also have some extra helper methods, such as `split`, that allow them to be used in our modular learning pipeline.
  - The models whose names end with an `N` also have their penultimate activation vectors normalized, so they are exactly the same models as the `k` models. However, the `N` models do not have the helper methods needed for our modular learning pipeline and therefore cannot be used with it.
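As an illustration of the kernel idea, one common construction is to treat the nonlinearity as a feature map and take inner products of its (normalized) outputs. The snippet below sketches this in plain PyTorch; it is a conceptual stand-in under that assumption, not kerNET's `Phi` itself, so check `kernet.layers.kcore` for the actual implementation and signature.

```python
import torch

def induced_kernel(sigma, x, z):
    # Treat the nonlinearity `sigma` as a feature map and take inner products
    # of its normalized outputs. Conceptual stand-in only -- the kernel
    # functions kerNET actually uses are implemented in kernet.layers.kcore.Phi.
    fx = torch.nn.functional.normalize(sigma(x), dim=1)
    fz = torch.nn.functional.normalize(sigma(z), dim=1)
    return fx @ fz.t()

x, z = torch.randn(4, 16), torch.randn(5, 16)
K_tanh = induced_kernel(torch.tanh, x, z)   # 4 x 5 matrix of kernel values
K_relu = induced_kernel(torch.relu, x, z)   # same, with ReLU
```

Similarly, the `k` models and their `split` helper can be imported from `kernet/models` and dropped into your own training loop; check the source there for the exact signatures.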
Suppose your model is saved in `my_checkpoint/`. To test it, do
```
python kernet/examples/test.py --load_opt --opt_file my_checkpoint/opt.pkl --checkpoint_dir my_checkpoint
```
You can modify some settings during testing. To see all the options that you can tune, do
```
python kernet/examples/test.py -h
```
Test logs will be saved in `test.log` and `test.json`.
You can test on a randomly chosen subset by specifying `--max_testset_size`.
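Since `test.json` is plain JSON, you can also load the results programmatically, for example to aggregate over several runs. A minimal sketch, assuming the file sits in your working directory (its location and exact fields depend on your run):

```python
import json

# Adjust the path to wherever your run wrote test.json.
with open("test.json") as f:
    results = json.load(f)

# Inspect whatever fields the test run recorded.
print(json.dumps(results, indent=2))
```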