poor result on redwood data #6

Open
xtchen96 opened this issue Sep 4, 2024 · 1 comment

xtchen96 commented Sep 4, 2024

Hi,

Thanks for your great work. I tried to test it on the Redwood bedroom dataset (http://redwood-data.org/indoor_lidar_rgbd/index.html) using downsampled RGB-D images (219 of the original 21930 frames, resolution 640x480) and both the original and a downsampled point cloud (~5M and 100k points), but I cannot get reasonable outputs. After filtering, it seems that only the result from the first frame remains. I checked the camera pose by reprojecting the first frame's depth into the scene scan point cloud. The log reports 580 prompts from the 3D proposal stage, 51 remaining after the 2D-guided filter, and 15 after prompt consolidation.

There is one point I am not sure I got right: `transform_pt_depth_scannet_torch()` in `utils/main_utils.py` requires `bx` and `by` from the camera intrinsic matrix. I don't know what they mean, so I set both to 0.

Could you provide any insights on refining the results? For example, lowering the image resolution for SAM, changing the filter parameters, etc.

First frame RGB image:
[image: 000000]

Final segmented point cloud. The floor is segmented well, but the other segments seem to appear only around the first frame's viewpoint:
[image]

mutianxu (Contributor) commented Sep 5, 2024

Hi,

Thanks for your interest in our work!

My suggestions are:

  1. Try inputting more points (not just 100k; maybe 1000k or even more), since providing enough initial prompts is very important for generating enough confident masks for the later filtering and consolidation.
  2. Try using more frames (10% of the original frames).
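
The two suggestions above can be sketched as a small preprocessing step. This is only an illustration, not code from the repository: `select_frames` and `subsample_indices` are hypothetical helper names, and uniform frame striding plus uniform random point sampling are assumptions about how the downsampling might be done.

```python
import random


def select_frames(num_frames, fraction):
    """Return evenly spaced frame indices covering about `fraction` of the sequence."""
    step = max(1, round(1 / fraction))
    return list(range(0, num_frames, step))


def subsample_indices(num_points, target, seed=0):
    """Return sorted indices of a uniform random subset of `target` points."""
    if target >= num_points:
        # Nothing to drop: keep every point.
        return list(range(num_points))
    return sorted(random.Random(seed).sample(range(num_points), target))


# Hypothetical usage with the numbers from this thread:
frame_ids = select_frames(21930, 0.10)             # every 10th frame
point_ids = subsample_indices(5_000_000, 1_000_000)  # ~1000k of the ~5M points
```

With a stride of 10, the 21930-frame sequence yields 2193 frames, compared with the 219 used in the original attempt.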

If these do not help:
Since I assume you have run 3D_prompt_proposal.py successfully, your problem may lie in main.py. I recommend visualizing the 2D results after prompt filtering and consolidation to see what happens there.
You should also check the camera poses to make sure that the masked area of each frame is correctly projected into 3D space.
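
One way to run the pose check described above is a round trip: back-project a frame's depth into world space, then project 3D points (for example the prompts) into each frame and count how many agree with that frame's depth. The NumPy sketch below is a standalone illustration, not the repository's `transform_pt_depth_scannet_torch()`. It assumes a pinhole intrinsic matrix `K`, a 4x4 camera-to-world pose, and depth in metres. The `bx` and `by` arguments are treated as additive offsets taken from the fourth column of a 4x4 ScanNet-style intrinsic matrix; that reading is an assumption not confirmed in this thread, and under it a plain 3x3 intrinsic corresponds to leaving both at 0.

```python
import numpy as np


def backproject_depth(depth, K, cam_to_world, bx=0.0, by=0.0):
    """Lift a float depth map (H, W), in metres, to world-space points (N, 3)."""
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    v, u = np.nonzero(depth > 0)  # rows (v) and columns (u) with valid depth
    z = depth[v, u]
    # bx/by: assumed additive offsets (see lead-in); 0 for a 3x3 pinhole model.
    x = (u - cx) * z / fx + bx
    y = (v - cy) * z / fy + by
    pts_cam = np.stack([x, y, z, np.ones_like(z)], axis=1)  # homogeneous (N, 4)
    return (pts_cam @ cam_to_world.T)[:, :3]


def project_points(pts_world, depth, K, cam_to_world, tol=0.05):
    """Project world points into a frame; return pixel coords and a visibility mask."""
    world_to_cam = np.linalg.inv(cam_to_world)
    pts_h = np.concatenate([pts_world, np.ones((len(pts_world), 1))], axis=1)
    pts_cam = (pts_h @ world_to_cam.T)[:, :3]
    z = pts_cam[:, 2]
    safe_z = np.where(z > 0, z, 1.0)  # avoid dividing by zero behind the camera
    u = np.round(K[0, 0] * pts_cam[:, 0] / safe_z + K[0, 2]).astype(int)
    v = np.round(K[1, 1] * pts_cam[:, 1] / safe_z + K[1, 2]).astype(int)
    h, w = depth.shape
    inside = (z > 0) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    visible = np.zeros(len(pts_world), dtype=bool)
    # A point counts as visible only if its depth matches the depth map within tol.
    visible[inside] = np.abs(depth[v[inside], u[inside]] - z[inside]) < tol
    return u, v, visible
```

If the poses and intrinsics are consistent, points back-projected from a frame should almost all be marked visible when projected into that same frame. If prompts are visible in the first frame but in almost no others, the pose convention (camera-to-world versus world-to-camera), the depth scale, or the intrinsics are worth checking before tuning filter parameters.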

Hope this is helpful!
