Official implementation of 'PointCLIP: Point Cloud Understanding by CLIP'.
The paper has been accepted by CVPR 2022.
PointCLIP V2 with much stronger zero-shot performance will be released at repo.
PointCLIP is the first to apply CLIP to point cloud recognition, transferring 2D pre-trained knowledge into the 3D domain. To achieve zero-shot classification, we encode a point cloud by projecting it onto multi-view depth maps and aggregate the view-wise predictions in an end-to-end manner. On top of that, we design an inter-view adapter to further enhance few-shot performance, and explore the complementary property of PointCLIP for multi-knowledge ensemble.
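To make the zero-shot pipeline concrete, here is a minimal sketch of its two steps: projecting a point cloud onto per-view depth maps and combining per-view class scores with view weights. It is an illustration under assumptions, not the repo's implementation: the projection is a simplified orthographic one, the per-view logits are hand-written stand-ins for CLIP image-text similarities, and the names `depth_map` and `aggregate_views` are hypothetical. It uses plain Python to stay dependency-free, whereas a real pipeline would operate on tensors.

```python
def depth_map(points, axis, resolution=8):
    """Orthographic depth map of points in [-1, 1]^3, viewed along `axis`."""
    u_axis, v_axis = [i for i in range(3) if i != axis]
    image = [[0.0] * resolution for _ in range(resolution)]
    for p in points:
        # Map the two in-plane coordinates to pixel indices.
        col = min(resolution - 1, max(0, round((p[u_axis] + 1.0) / 2.0 * (resolution - 1))))
        row = min(resolution - 1, max(0, round((p[v_axis] + 1.0) / 2.0 * (resolution - 1))))
        # Rescale the coordinate along the view axis to [0, 1];
        # keep the largest value per pixel, empty pixels stay 0.
        depth = (p[axis] + 1.0) / 2.0
        image[row][col] = max(image[row][col], depth)
    return image


def aggregate_views(view_logits, view_weights):
    """Weighted sum of per-view class logits: V lists of K scores -> K scores."""
    num_classes = len(view_logits[0])
    return [
        sum(w * logits[k] for w, logits in zip(view_weights, view_logits))
        for k in range(num_classes)
    ]


# Toy point cloud and one depth map per coordinate axis.
cloud = [(-1.0, -1.0, -1.0), (1.0, 1.0, 1.0), (0.2, -0.4, 0.6)]
maps = [depth_map(cloud, axis) for axis in range(3)]

# Stand-in logits for 3 views and 3 classes (assumed values, not CLIP outputs).
logits = [[2.0, 1.0, 0.0], [0.5, 2.5, 0.0], [1.0, 1.0, 1.0]]
weights = [1.0, 1.0, 0.5]
scores = aggregate_views(logits, weights)
prediction = max(range(len(scores)), key=scores.__getitem__)
print(scores, prediction)
```

In this toy setup the second view dominates the weighted sum, so the second class is predicted even though the first view alone prefers the first class.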
Create a conda environment and install dependencies:
git clone https://github.com/ZrrSkywalker/PointCLIP.git
cd PointCLIP
conda create -n pointclip python=3.7
conda activate pointclip
pip install -r requirements.txt
# Install the matching versions of torch and torchvision
conda install pytorch torchvision cudatoolkit
# Install the modified dassl library (no need to re-build if the source code is changed)
cd Dassl3D/
python setup.py develop
cd ..
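After installation, a quick check that the key packages are importable can save a failed run later. The snippet below is a small sketch using only the standard library; the helper name `missing_packages` is hypothetical, and `dassl` is assumed to be the import name of the library installed from Dassl3D/.

```python
from importlib.util import find_spec


def missing_packages(names=("torch", "torchvision", "dassl")):
    """Return the subset of `names` that cannot be found in this environment."""
    # find_spec returns None for a top-level module that is not installed.
    return [name for name in names if find_spec(name) is None]


missing = missing_packages()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

Run it inside the activated `pointclip` environment so that it inspects the right interpreter.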
Download the official ModelNet40 dataset and put the unzipped folder under data/.
The directory structure should be:
│PointCLIP/
├──...
├──data/
│ ├──modelnet40_ply_hdf5_2048/
├──...
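The expected layout can be verified before launching any script. This is a minimal sketch, assuming it is run from the repository root; the helper name `missing_dataset_dirs` is hypothetical, and only the folder name shown in the tree above is checked, not the files inside it.

```python
from pathlib import Path


def missing_dataset_dirs(repo_root, expected=("data/modelnet40_ply_hdf5_2048",)):
    """Return the expected dataset folders that do not exist under `repo_root`."""
    root = Path(repo_root)
    return [rel for rel in expected if not (root / rel).is_dir()]


missing = missing_dataset_dirs(".")
if missing:
    print("Missing dataset folders:", ", ".join(missing))
else:
    print("Dataset layout looks correct.")
```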
Edit the run settings in scripts/zeroshot.sh, e.g. the config file and output directory. Then run zero-shot PointCLIP:
cd scripts
bash zeroshot.sh
If you need the post-search for the best view weights, add --post-search and adjust the search parameters in the config file. A longer search generally finds better view weights, at the cost of more run time.
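The idea behind a post-search over view weights can be sketched as an exhaustive grid search on cached per-view logits. This is an illustration under assumptions rather than the repo's search routine: the function names, the candidate grid, and the toy logits are all hypothetical, and the actual search parameters live in the config file.

```python
from itertools import product


def accuracy(view_logits, labels, weights):
    """Fraction of samples whose weighted-sum prediction matches the label."""
    correct = 0
    for sample, label in zip(view_logits, labels):
        num_classes = len(sample[0])
        scores = [
            sum(w * view[k] for w, view in zip(weights, sample))
            for k in range(num_classes)
        ]
        correct += max(range(num_classes), key=scores.__getitem__) == label
    return correct / len(labels)


def search_view_weights(view_logits, labels, candidates):
    """Try every weight combination and keep the first most accurate one."""
    num_views = len(view_logits[0])
    best_weights, best_acc = None, -1.0
    for weights in product(candidates, repeat=num_views):
        acc = accuracy(view_logits, labels, weights)
        if acc > best_acc:
            best_weights, best_acc = weights, acc
    return best_weights, best_acc


# Toy data: 2 samples, 2 views, 2 classes. View 0 is reliable, view 1 is not.
view_logits = [
    [[1.0, 0.0], [0.0, 3.0]],  # sample 0, true class 0
    [[0.0, 1.0], [2.0, 0.0]],  # sample 1, true class 1
]
labels = [0, 1]
best_weights, best_acc = search_view_weights(view_logits, labels, [0.0, 0.5, 1.0])
print(best_weights, best_acc)
```

The cost grows as `len(candidates) ** num_views`, which is why a finer grid or more views lengthens the search.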
Set the shot number and other settings in scripts/fewshot.sh. Then run PointCLIP with the inter-view adapter:
cd scripts
bash fewshot.sh
The --post-search option is also available here.
Download the pre-trained checkpoint obtained by 16-shot fine-tuning and put it under ckpt/adapter/. It achieves 86.71% on the ModelNet40 test set, and 87%+ with post-search:
cd scripts
bash eval.sh
You can edit --model-dir and --output-dir to evaluate checkpoints you trained yourself.
This repo benefits from CLIP, SimpleView and the excellent codebase Dassl. Thanks for their wonderful work.
@article{zhang2021pointclip,
title={PointCLIP: Point Cloud Understanding by CLIP},
author={Zhang, Renrui and Guo, Ziyu and Zhang, Wei and Li, Kunchang and Miao, Xupeng and Cui, Bin and Qiao, Yu and Gao, Peng and Li, Hongsheng},
journal={arXiv preprint arXiv:2112.02413},
year={2021}
}
If you have any questions about this project, please feel free to contact [email protected].