Troy-VIS: Towards Real-Time Open-Vocabulary Video Instance Segmentation

Installation: Please refer to INSTALL.md for more details.
Data preparation: Please refer to DATA.md for more details.
Training: Please refer to TRAIN.md for more details.
Testing: Please refer to TEST.md for more details.
Model zoo: Please refer to MODEL_ZOO.md for more details.

This is not an officially supported Google product.

Highlights:

Troy-VIS is the first efficient foundation model family for open-vocabulary object perception. It can detect and segment objects of any class in images and track objects of any class in videos.
Troy-VIS performs open-vocabulary video instance segmentation over more than 1K object categories in real time on A100 GPUs.
Troy-VIS is trained on a large number of images and videos from diverse domains and shows strong zero-shot perception ability.
