This is the implementation of a series of basic algorithms useful for video understanding, including Single Object Tracking (SOT), Video Object Segmentation (VOS), etc.
Current implementation list:
Please refer to SETUP.md and SOT_SETUP.md.
# demo with web camera
python3 ./demo/main/video/sot_video.py --config 'experiments/siamfcpp/test/vot/siamfcpp_alexnet.yaml' --device cuda --video "webcam"
# demo with video file, and dump result into video file (optional)
python3 ./demo/main/video/sot_video.py --config 'experiments/siamfcpp/test/vot/siamfcpp_alexnet.yaml' --device cuda --video $video_dir/demo.mp4 --output $dump_path/result.mp4
# demo with extracted image files, and dump result into image files (optional)
python3 ./demo/main/video/sot_video.py --config 'experiments/siamfcpp/test/vot/siamfcpp_alexnet.yaml' --device cuda --video $video_dir/*.jpg --output $dump_dir
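The same workflow can also be driven programmatically. The sketch below is a minimal, hypothetical example of a frame-by-frame tracking loop: the `pipeline` object and its `init`/`update` methods stand in for the interface documented in PIPELINE_API.md and are assumptions for illustration, not a verbatim copy of the repository API.

```python
# Minimal sketch of a frame-by-frame SOT loop.
# The pipeline interface (init/update, (x, y, w, h) boxes) is an assumption;
# see PIPELINE_API.md for the actual contract.
import cv2


def track_video(pipeline, video_path, init_box):
    """Run a tracking pipeline over a video file and display the result."""
    cap = cv2.VideoCapture(video_path)
    ok, frame = cap.read()
    if not ok:
        raise IOError("cannot read first frame from %s" % video_path)

    pipeline.init(frame, init_box)       # template initialization on frame 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        box = pipeline.update(frame)     # per-frame target estimation
        x, y, w, h = [int(v) for v in box]
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
        cv2.imshow("SOT demo", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break

    cap.release()
    cv2.destroyAllWindows()
```

With a pipeline built from one of the yaml configs above, calling `track_video(pipeline, "demo.mp4", init_box)` reproduces roughly what the demo script does.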
Please refer to SOT_TEST.md for details.
Please refer to SOT_TRAINING.md for details.
project_root/
├── experiments # experiment configurations, in yaml format
├── main
│ ├── train.py # training entry point
│ └── test.py # test entry point
├── video_analyst
│ ├── data # modules related to data
│ │ ├── dataset # data fetcher of each individual dataset
│ │ ├── sampler # data sampler, covering intra-dataset and inter-dataset sampling procedures
│ │ ├── dataloader.py # data loading procedure
│ │ └── transformer # data augmentation
│ ├── engine # procedure controller, including training control / hyper-parameter & model loading
│ │ ├── monitor # monitor for tasks during training, including visualization / logging / benchmarking
│ │ ├── trainer.py # train for one epoch
│ │ ├── tester.py # test a model on a benchmark
│ ├── model # model builder
│ │ ├── backbone # backbone network builder
│ │ ├── common_opr # shared operator (e.g. cross-correlation)
│ │ ├── task_model # holistic model builder
│ │ ├── task_head # head network builder
│ │ └── loss # loss builder
│ ├── pipeline # pipeline builder (tracking / vos)
│ │ ├── segmenter # segmenter builder for vos
│ │ ├── tracker # tracker builder for tracking
│ │ └── utils # pipeline utils
│ ├── config # configuration manager
│ ├── evaluation # benchmark
│ ├── optim # optimization-related module (learning rate, gradient clipping, etc.)
│ │ ├── optimizer # optimizer
│ │ ├── scheduler # learning rate scheduler
│ │ └── grad_modifier # gradient-related operation (parameter freezing)
│ └── utils # useful tools
└── README.md
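Most of these modules are wired together through the registry / configuration-tree design described in DEVELOP.md. The toy sketch below illustrates the general idea only; `Registry`, `TRACK_BACKBONES`, and `build_backbone` are illustrative names, not the project's actual classes.

```python
# Toy illustration of a registry + configuration-tree builder pattern.
# All names here are illustrative, not the project's actual API.
class Registry(dict):
    """Map a string name from the yaml config to a Python class."""

    def register(self, cls):
        self[cls.__name__] = cls
        return cls


TRACK_BACKBONES = Registry()


@TRACK_BACKBONES.register
class AlexNet:
    def __init__(self, pretrained=False):
        self.pretrained = pretrained


def build_backbone(cfg):
    """Instantiate the backbone named in a config node,
    e.g. {"name": "AlexNet", "AlexNet": {"pretrained": True}}."""
    name = cfg["name"]
    return TRACK_BACKBONES[name](**cfg.get(name, {}))


backbone = build_backbone({"name": "AlexNet", "AlexNet": {"pretrained": True}})
```

Registering classes by name keeps the yaml configuration under experiments decoupled from the Python import graph: swapping a backbone or head becomes a config change rather than a code change.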
For details, please refer to the markdown files under docs.
- SOT_SETUP.md: instructions for setting up
- SOT_MODEL_ZOO.md: description of released SOT models
- SOT_TRAINING.md: details related to training
- SOT_TEST.md: details related to testing
- VOS_SETUP.md: instructions for setting up
- VOS_MODEL_ZOO.md: description of released VOS models
- VOS_TRAINING.md: details related to training
- VOS_TEST.md: details related to testing
- DEVELOP.md: description of project design (registry, configuration tree, etc.)
- PIPELINE_API.md: description for pipeline API
- FORMATTING_INSTRUCTION: instructions for code formatting (yapf/isort/flake8, etc.)
- [ ] refine code style and test cases
- video_analyst/evaluation/vot_benchmark and other related code have been borrowed from PySOT
- video_analyst/evaluation/got_benchmark and other related code have been borrowed from got-toolkit
- detectron2
- fvcore
- pytracking
- DROL
@inproceedings{xu2020siamfc++,
title={SiamFC++: Towards Robust and Accurate Visual Tracking with Target Estimation Guidelines.},
author={Xu, Yinda and Wang, Zeyu and Li, Zuoxin and Yuan, Ye and Yu, Gang},
booktitle={AAAI},
pages={12549--12556},
year={2020}
}
@inproceedings{chen2020state,
title={State-Aware Tracker for Real-Time Video Object Segmentation},
author={Chen, Xi and Li, Zuoxin and Yuan, Ye and Yu, Gang and Shen, Jianxin and Qi, Donglian},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={9384--9393},
year={2020}
}
Maintainers (sorted by family name):
- Xi Chen @XavierCHEN34
- Zuoxin Li @lzx1413
- Zeyu Wang @JWarlock
- Yinda Xu @MARMOTatZJU