A general PyTorch-based framework for learning tracking representations.
The installation script will automatically generate a local configuration file "admin/local.py". In case the file was not generated, run admin.environment.create_default_local_file() to generate it. Next, set the path to the training workspace, i.e. the directory where the checkpoints will be saved, and set the paths to the datasets you want to use (a short sketch of such a file is shown after the example below). If all the dependencies have been correctly installed, you can train a network using the run_training.py script in the correct conda environment:
conda activate pytracking
python run_training.py train_module train_name
Here, train_module is the sub-module inside train_settings, and train_name is the name of the train setting file to be used.
For example, you can train using the included default ATOM settings by running:
python run_training.py bbreg atom_default
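As noted above, the workspace and dataset paths are set in "admin/local.py". A minimal sketch of such a file is shown below; it assumes the layout produced by create_default_local_file(), the attribute names may differ slightly between versions, and all paths are placeholders.

```python
# admin/local.py -- illustrative sketch, not the generated file itself.
# Regenerate the real file with admin.environment.create_default_local_file()
# and fill in the paths for the datasets you actually use.
class EnvironmentSettings:
    def __init__(self):
        self.workspace_dir = '/path/to/workspace'      # checkpoints are saved here
        self.tensorboard_dir = self.workspace_dir + '/tensorboard/'
        self.pretrained_networks = self.workspace_dir + '/pretrained_networks/'

        # Dataset roots -- only the datasets you train on need to be set.
        self.lasot_dir = '/path/to/lasot'
        self.got10k_dir = '/path/to/got10k/train'
        self.trackingnet_dir = '/path/to/trackingnet'
        self.coco_dir = '/path/to/coco'
```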
The framework consists of the following submodules.
- actors: Contains the actor classes for different trainings. The actor class is responsible for passing the input data through the network and calculating the losses.
- admin: Includes functions for loading networks, tensorboard etc. and also contains environment settings.
- dataset: Contains integration of a number of training datasets, namely TrackingNet, GOT-10k, LaSOT, ImageNet-VID, DAVIS, YouTube-VOS, MS-COCO, SBD, LVIS, ECSSD, MSRA10k, and HKU-IS. Additionally, it includes modules to generate synthetic videos from image datasets.
- data_specs: Information about train/val splits of different datasets.
- data: Contains functions for processing data, e.g. loading images, data augmentations, sampling frames from videos.
- external: External libraries needed for training. Added as submodules.
- models: Contains different layers and network definitions.
- trainers: The main class which runs the training.
- train_settings: Contains settings files, specifying the training of a network.
The framework currently contains the training code for the following trackers.
The following setting files can be used to train the TaMOs tracker. In addition to the typical tracking datasets used for single-object trackers, we further include TAO, YouTube-VOS and ImageNet-VID training data. When training with TAO, we use the BURST annotations since they provide a higher annotation frame rate. We converted those annotations to our own format, TaoBurst.json. An example training command is shown after the list below.
- tamos.tamos_resnet50: The default setting used for training with the ResNet50 backbone.
- tamos.tamos_swin_base: The default setting used for training with the SwinBase backbone. If needed, the weights of the SwinBase backbone can be downloaded here.
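Following the run_training.py convention above, training with one of these settings amounts to, for example:
python run_training.py tamos tamos_resnet50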
Three steps are required to train RTS:
- Download lasot_got10k_pregenerated_masks.zip. Unzip the archive in the pregenerated_masks directory set in ltr/admin/local.py (see the local.py sketch after this list).
- Download the pretrained LWL weights lwl_stage2.pth. Save the weights in the pretrained_networks directory set in ltr/admin/local.py.
- Use this setting for training with ResNet50 backbone: rts.rts50
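For reference, both paths mentioned above are plain attributes of the environment settings in ltr/admin/local.py. A short sketch, with placeholder paths and assuming the class layout of the generated local.py:

```python
# Excerpt of ltr/admin/local.py relevant for RTS training (paths are placeholders).
class EnvironmentSettings:
    def __init__(self):
        self.pregenerated_masks = '/path/to/lasot_got10k_pregenerated_masks'  # unzipped archive from step 1
        self.pretrained_networks = '/path/to/pretrained_networks'             # directory containing lwl_stage2.pth
```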
The following setting files can be used to train the ToMP tracker. We omit training with a separate test encoding, since the training is more stable while leading to comparable performance. Set the flag to false to use the same setup as in the paper.
- tomp.tomp50: The default setting used for training with ResNet50 backbone.
- tomp.tomp101: The default setting used for training with ResNet101 backbone.
In order to train KeepTrack the following three steps are required.
- Prepare the base tracker: Download the weights super_dimp_simple.pth.tar or retrain the tracker using the settings dimp.super_dimp_simple.
- Prepare the training dataset: Download target_candidates_dataset_dimp_simple_super_dimp_simple.json or re-create the dataset by switching to ../pytracking/util_scripts and running create_distractor_dataset using python create_distractor_dataset.py dimp_simple super_dimp_simple lasot_train $DATASET_DIR. Add the path of the dataset file to the local.py file (the full workflow is summarized after this list).
- Train KeepTrack using the settings keep_track.keep_track with super_dimp_simple as the base tracker.
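Putting the dataset-creation step together, one possible sequence is the following (arguments taken verbatim from the step above; $DATASET_DIR is a placeholder):
cd ../pytracking/util_scripts
python create_distractor_dataset.py dimp_simple super_dimp_simple lasot_train $DATASET_DIR
After adding the resulting dataset file to local.py, switch back to the directory containing run_training.py and start training:
python run_training.py keep_track keep_track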
The following setting files can be used to train the LWL networks, or to know the exact training details.
- lwl.lwl_stage1: The default settings used for initial network training with fixed backbone weights. We initialize the backbone ResNet with pre-trained Mask-RCNN weights. These weights can be obtained from here. Download and save these weights in the env_settings().pretrained_networks directory.
- lwl.lwl_stage2: The default settings used for training the final LWL model. This setting fine-tunes all layers in the model trained using lwl_stage1 (see the command sequence after this list).
- lwl.lwl_boxinit: The default settings used for training the bounding box encoder network in order to enable VOS with box initialization.
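Since lwl_stage2 fine-tunes the model produced by lwl_stage1, a full training run consists of the two stages in sequence:
python run_training.py lwl lwl_stage1
python run_training.py lwl lwl_stage2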
The following setting file can be used to train the KYS networks, or to know the exact training details.
- kys.kys: The default settings used for training the KYS model with ResNet-50 backbone.
The following setting files can be used to train the PrDiMP networks, or to know the exact training details.
- dimp.prdimp18: The default settings used for training the PrDiMP model with ResNet-18 backbone.
- dimp.prdimp50: The default settings used for training the PrDiMP model with ResNet-50 backbone.
- dimp.super_dimp: Combines the bounding-box regressor of PrDiMP with the standard DiMP classifier and better training and inference settings.
The following setting files can be used to train the DiMP networks, or to know the exact training details.
- dimp.dimp18: The default settings used for training the DiMP model with ResNet-18 backbone.
- dimp.dimp50: The default settings used for training the DiMP model with ResNet-50 backbone.
The following setting files can be used to train the ATOM network, or to know the exact training details.
- bbreg.atom_paper: The settings used in the paper for training the network in ATOM.
- bbreg.atom: Newer settings used for training the network in ATOM, also utilizing the GOT10k dataset.
- bbreg.atom_prob_ml: Settings for training the network in ATOM with the probabilistic bounding box regression proposed in this paper.
- bbreg.atom_gmm_sampl: The baseline ATOM* setting evaluated in this paper.
To train a custom network using the toolkit, the following components need to be specified in the train settings. For reference, see atom.py; a condensed sketch of such a settings file is shown after the list.
- Datasets: The datasets to be used for training. A number of standard tracking datasets are already available in the dataset module.
- Processing: This function should perform the necessary post-processing of the data, e.g. cropping of the target region, data augmentations etc.
- Sampler: Determines how the frames are sampled from a video sequence to form the batches.
- Network: The network module to be trained.
- Objective: The training objective.
- Actor: The trainer passes the training batch to the actor, which is responsible for passing the data through the network correctly and calculating the training loss.
- Optimizer: Optimizer to be used, e.g. Adam.
- Trainer: The main class which runs the epochs and saves checkpoints.
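The sketch below, loosely modeled on atom.py, shows how these components typically fit together in a train_settings file. Class names and module paths follow the existing ATOM setting, but the argument lists are abbreviated (transforms, proposal parameters and the validation loader are omitted), so treat it as a structural outline rather than a drop-in configuration.

```python
# Sketch of a custom train_settings file, e.g. ltr/train_settings/my_module/my_setting.py.
# Structure follows bbreg/atom.py; arguments are simplified for illustration.
import torch.nn as nn
import torch.optim as optim

from ltr.dataset import Lasot, Got10k                # Datasets
from ltr.data import processing, sampler, LTRLoader  # Processing, Sampler, data loader
import ltr.models.bbreg.atom as atom_models          # Network definition
from ltr import actors                               # Actor
from ltr.trainers import LTRTrainer                  # Trainer


def run(settings):
    # General training settings.
    settings.description = 'Example custom training setting.'
    settings.batch_size = 64
    settings.num_workers = 8

    # Datasets: which sequences the training samples are drawn from.
    lasot_train = Lasot(settings.env.lasot_dir, split='train')
    got10k_train = Got10k(settings.env.got10k_dir, split='vottrain')

    # Processing: cropping of the target region, augmentation and label generation
    # (transforms and proposal parameters omitted here; see atom.py for the full call).
    data_processing_train = processing.ATOMProcessing(
        search_area_factor=5.0, output_sz=288,
        center_jitter_factor={'train': 0, 'test': 4.5},
        scale_jitter_factor={'train': 0, 'test': 0.5},
        mode='sequence')

    # Sampler: how train/test frames are sampled from the videos to form batches.
    dataset_train = sampler.ATOMSampler(
        [lasot_train, got10k_train], [1, 1],
        samples_per_epoch=1000 * settings.batch_size, max_gap=50,
        processing=data_processing_train)
    loader_train = LTRLoader('train', dataset_train, training=True,
                             batch_size=settings.batch_size,
                             num_workers=settings.num_workers)

    # Network, objective and actor.
    net = atom_models.atom_resnet18(backbone_pretrained=True)
    objective = nn.MSELoss()
    actor = actors.AtomActor(net=net, objective=objective)

    # Optimizer and trainer.
    optimizer = optim.Adam(actor.net.bb_regressor.parameters(), lr=1e-3)
    lr_scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=15, gamma=0.2)
    trainer = LTRTrainer(actor, [loader_train], optimizer, settings, lr_scheduler)

    # Run the epochs and save checkpoints to the workspace directory.
    trainer.train(50, load_latest=True, fail_safe=True)
```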