GaussTR: Foundation Model-Aligned Gaussian Transformer for Self-Supervised 3D Spatial Understanding

Haoyi Jiang1, Liu Liu2, Tianheng Cheng1, Xinjie Wang2, Tianwei Lin2, Zhizhong Su2, Wenyu Liu1, Xinggang Wang1
1Huazhong University of Science & Technology, 2Horizon Robotics

Project page arXiv License: MIT

Setup

Installation

We recommend cloning the repository with the --single-branch option to avoid downloading unnecessary large media files for the project website from other branches.

git clone https://github.com/hustvl/GaussTR.git --single-branch
cd GaussTR
pip install -r requirements.txt

Dataset Preparation

  1. Prepare the nuScenes dataset following the instructions provided in the mmdetection3d docs.

  2. Update the dataset .pkl files with scene_idx so that they match the occupancy ground truths by running:

    python tools/create_data.py nuscenes --root-path ./data/nuscenes --out-dir ./data/nuscenes --extra-tag nuscenes

  3. Download the occupancy ground truth data from CVPR2023-3D-Occupancy-Prediction and place it under data/nuscenes/gts.

  4. Generate the required features and rendering targets:

    • Run PYTHONPATH=. python tools/generate_depth.py to generate metric depth estimations.
    • Navigate to the FeatUp repository and run python tools/generate_featup.py there.
    • Optionally, navigate to the Grounded SAM 2 repository and run python tools/generate_grounded_sam2.py there to enable training augmentation.
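
Before generating features, it can help to confirm that the directories named in the steps above are in place. The following is a minimal sketch, not part of the repository: the helper name `missing_dirs` is hypothetical, and it only checks the two paths this README mentions (`data/nuscenes` and `data/nuscenes/gts`), not the contents of the .pkl files or the generated features.

```python
from pathlib import Path

# Directories named in the dataset preparation steps; nothing else is checked.
EXPECTED_DIRS = ("data/nuscenes", "data/nuscenes/gts")


def missing_dirs(repo_root="."):
    """Return the expected data directories that do not exist under repo_root."""
    root = Path(repo_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs()
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("Data directories found.")
```

Run it from the repository root; an empty result only means the directories exist, not that the dataset itself is complete.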

CLIP Text Embeddings

Download pre-generated CLIP text embeddings from the releases section, or manually generate custom embeddings by referring to open-mmlab/mmpretrain#1737.

Usage

Training

PYTHONPATH=. mim train mmdet3d configs/gausstr.py [-l pytorch -G [GPU_NUM]]

Testing

PYTHONPATH=. mim test mmdet3d configs/gausstr.py -C [CKPT_PATH] [-l pytorch -G [GPU_NUM]]
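
For scripted runs, the two commands above can be assembled programmatically. This is a rough sketch under stated assumptions: `build_mim_command` and `run` are hypothetical helpers, not part of the repository, and they only reproduce the flags shown in this README (`-C` for the checkpoint, `-l pytorch -G` for a multi-GPU launch).

```python
import os
import subprocess


def build_mim_command(mode, config="configs/gausstr.py", ckpt=None, gpus=None):
    """Build the `mim train` / `mim test` argument list shown in this README."""
    if mode not in ("train", "test"):
        raise ValueError("mode must be 'train' or 'test'")
    cmd = ["mim", mode, "mmdet3d", config]
    if mode == "test":
        if ckpt is None:
            raise ValueError("testing requires a checkpoint path")
        cmd += ["-C", str(ckpt)]
    if gpus is not None:
        # Optional multi-GPU launch via the PyTorch launcher.
        cmd += ["-l", "pytorch", "-G", str(gpus)]
    return cmd


def run(cmd):
    """Run the command from the repository root with PYTHONPATH=. set."""
    env = dict(os.environ, PYTHONPATH=".")
    return subprocess.run(cmd, env=env, check=True)
```

For example, `run(build_mim_command("train", gpus=8))` would correspond to the multi-GPU training command, assuming `mim` is installed and the working directory is the repository root.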

Visualization

To enable visualization during testing, include the following in the config:

custom_hooks = [
    dict(type='DumpResultHook'),
]

After testing, visualize the saved .pkl files by executing:

python tools/visualize.py [PKL_PATH] [--save]
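
The layout of the saved .pkl files is not documented here, so it may be useful to inspect one before visualizing it. The sketch below is a generic inspector, not the repository's own tooling: `summarize` and `summarize_pkl` are hypothetical names, and the code makes no assumption about the keys or array types the hook writes. Unpickling requires the libraries used by the stored objects to be importable, and pickle files should only be loaded from trusted sources.

```python
import pickle


def summarize(obj, depth=0, max_depth=2):
    """Return lines describing the nested structure of a loaded object."""
    indent = "  " * depth
    shape = getattr(obj, "shape", None)
    if shape is not None:
        # Array-like objects (e.g. NumPy arrays, tensors) expose a shape.
        return [f"{indent}{type(obj).__name__} shape={tuple(shape)}"]
    if isinstance(obj, dict) and depth < max_depth:
        lines = [f"{indent}dict with {len(obj)} keys"]
        for key, value in obj.items():
            lines.append(f"{indent}  {key!r}:")
            lines.extend(summarize(value, depth + 2, max_depth))
        return lines
    if isinstance(obj, (list, tuple)) and obj and depth < max_depth:
        lines = [f"{indent}{type(obj).__name__} of length {len(obj)}, first item:"]
        lines.extend(summarize(obj[0], depth + 1, max_depth))
        return lines
    # Anything else, or anything past max_depth, is reported by type only.
    return [f"{indent}{type(obj).__name__}"]


def summarize_pkl(path):
    """Load a pickle file and describe its structure."""
    with open(path, "rb") as f:
        return summarize(pickle.load(f))
```

Calling `print("\n".join(summarize_pkl(path)))` on one of the dumped files prints a short outline of its contents.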

Citation

If you find our paper and code helpful for your research, please consider starring this repository ⭐ and citing our work:

@article{GaussTR,
    title = {GaussTR: Foundation Model-Aligned Gaussian Transformer for Self-Supervised 3D Spatial Understanding},
    author = {Haoyi Jiang and Liu Liu and Tianheng Cheng and Xinjie Wang and Tianwei Lin and Zhizhong Su and Wenyu Liu and Xinggang Wang},
    year = 2024,
    journal = {arXiv preprint arXiv:2412.13193}
}

Acknowledgements

This project builds upon the pioneering work of FeatUp, MaskCLIP and gsplat. We extend our gratitude to these projects for their contributions to the community.

License

Released under the MIT License.