Haoyi Jiang¹, Liu Liu², Tianheng Cheng¹, Xinjie Wang²,
Tianwei Lin², Zhizhong Su², Wenyu Liu¹, Xinggang Wang¹
¹Huazhong University of Science & Technology, ²Horizon Robotics
We recommend cloning the repository with the --single-branch option to avoid downloading the large media files that other branches store for the project website.
git clone https://github.com/hustvl/GaussTR.git --single-branch
cd GaussTR
pip install -r requirements.txt
- Prepare the nuScenes dataset following the instructions provided in the mmdetection3d docs.
- Update the dataset .pkl files with scene_idx to match the occupancy ground truths by running:
  python tools/create_data.py nuscenes --root-path ./data/nuscenes --out-dir ./data/nuscenes --extra-tag nuscenes
- Download the occupancy ground truth data from CVPR2023-3D-Occupancy-Prediction and place it under data/nuscenes/gts.
- Generate the required features and rendering targets:
  - Run
    PYTHONPATH=. python tools/generate_depth.py
    to generate metric depth estimations.
  - Navigate to the FeatUp repository and run
    python tools/generate_featup.py
    there.
  - Optionally, navigate to the Grounded SAM 2 repository and run
    python tools/generate_grounded_sam2.py
    to enable training augmentation.
- Download pre-generated CLIP text embeddings from the releases section, or manually generate custom embeddings by referring to open-mmlab/mmpretrain#1737.
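Before training, it can help to confirm that the data directories named in the steps above are in place. The following is a minimal sketch, not part of the repository: the helper name missing_dirs is hypothetical, and it checks only the two directories mentioned above (data/nuscenes and data/nuscenes/gts), making no assumption about the files inside them.

```python
from pathlib import Path

# Directories named in the preparation steps; nothing else about the layout is assumed.
REQUIRED_DIRS = ("data/nuscenes", "data/nuscenes/gts")


def missing_dirs(repo_root=".", required=REQUIRED_DIRS):
    """Return the required directories that do not exist under repo_root."""
    root = Path(repo_root)
    return [rel for rel in required if not (root / rel).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs()
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("All expected data directories are present.")
```

Run it from the repository root; an empty result only means the directories exist, not that their contents are complete.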
PYTHONPATH=. mim train mmdet3d configs/gausstr.py [-l pytorch -G [GPU_NUM]]
PYTHONPATH=. mim test mmdet3d configs/gausstr.py -C [CKPT_PATH] [-l pytorch -G [GPU_NUM]]
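For scripted runs, the two commands above can be assembled programmatically. This is a rough sketch under stated assumptions: build_mim_command and run are hypothetical helpers, not part of GaussTR or MIM, and they use only the arguments shown above (-C for the checkpoint, -l pytorch -G for multi-GPU launches).

```python
import os
import subprocess


def build_mim_command(mode, config="configs/gausstr.py", checkpoint=None, gpus=None):
    """Build the argument list for the train or test command shown above."""
    if mode not in ("train", "test"):
        raise ValueError("mode must be 'train' or 'test'")
    if mode == "test" and checkpoint is None:
        raise ValueError("testing requires a checkpoint path")
    cmd = ["mim", mode, "mmdet3d", config]
    if mode == "test":
        cmd += ["-C", str(checkpoint)]
    if gpus is not None:
        # The optional bracketed part of the commands: launcher plus GPU count.
        cmd += ["-l", "pytorch", "-G", str(gpus)]
    return cmd


def run(cmd):
    """Run the command from the repository root with PYTHONPATH=. set."""
    env = {**os.environ, "PYTHONPATH": "."}
    subprocess.run(cmd, env=env, check=True)
```

For example, build_mim_command("train", gpus=8) yields the training command with the multi-GPU options appended; passing the result to run executes it.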
To enable visualization during testing, include the following in the config:
custom_hooks = [
    dict(type='DumpResultHook'),
]
After testing, visualize the saved .pkl files by executing:
python tools/visualize.py [PKL_PATH] [--save]
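To take a quick look at a saved .pkl file before visualizing it, a generic inspector along these lines may be useful. It is only a sketch: summarize_pickle is a hypothetical helper, and because the structure of the dumped results is not documented here, it reports just the top-level type plus its keys or length. If the file holds NumPy arrays or PyTorch tensors, those libraries must be installed for unpickling to succeed.

```python
import pickle
from pathlib import Path


def summarize_pickle(path):
    """Load a pickle file and return a short description of its top-level object."""
    # Only unpickle files you produced yourself: pickle can execute arbitrary code.
    with Path(path).open("rb") as f:
        obj = pickle.load(f)
    summary = {"type": type(obj).__name__}
    if isinstance(obj, dict):
        summary["keys"] = sorted(str(k) for k in obj)
    elif isinstance(obj, (list, tuple)):
        summary["length"] = len(obj)
    return summary
```

Calling summarize_pickle on a dumped file returns a small dictionary that can be printed directly.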
If you find our paper and code helpful for your research, please consider starring this repository ⭐ and citing our work:
@article{GaussTR,
title = {GaussTR: Foundation Model-Aligned Gaussian Transformer for Self-Supervised 3D Spatial Understanding},
author = {Haoyi Jiang and Liu Liu and Tianheng Cheng and Xinjie Wang and Tianwei Lin and Zhizhong Su and Wenyu Liu and Xinggang Wang},
year = 2024,
journal = {arXiv preprint arXiv:2412.13193}
}
This project builds upon the pioneering work of FeatUp, MaskCLIP and gsplat. We extend our gratitude to these projects for their contributions to the community.
Released under the MIT License.