NaVILA
Two-level navigation foundation model (a VLA that outputs mid-level actions, plus low-level locomotion skills) that leverages massive offline datasets, including not only sim envs but also human touring videos
Interesting papers
CAT4D: Create Anything in 4D with Multi-View Video Diffusion Models
pixelSplat: 3D Gaussian Splats from Image Pairs for Scalable Generalizable 3D Reconstruction
LiveScene: Language Embedding Interactive Radiance Fields for Physical Scene Rendering and Control
Talking to DINO: Bridging Self-Supervised Vision Backbones with Language for Open-Vocabulary Segmentation
Navigation World Models
MV-DUSt3R+: Single-Stage Scene Reconstruction from Sparse Views In 2 Seconds
MASt3R-SLAM: Real-Time Dense SLAM with 3D Reconstruction Priors
Diorama: Unleashing Zero-shot Single-view 3D Scene Modeling
MegaSaM: Accurate, Fast, and Robust Structure and Motion from Casual Dynamic Videos