Junru Lin*1,2 Chirag Vashist*3 Mikaela Angelina Uy2,4 Colton Stearns2
Xuan Luo5 Leonidas Guibas2 Ke Li3
1University of Toronto 2Stanford University 3Simon Fraser University 4Nvidia 5Google
*equal contribution
Paper (coming soon)
Code (coming soon)
Data (coming soon)
Existing dynamic scene interpolation methods typically assume that the motion between consecutive time steps is small enough that displacements can be locally approximated by linear models. In practice, even slight deviations from this small-motion assumption can cause conventional techniques to fail. In this paper, we introduce Global Motion Corresponder (GMC), a novel approach that robustly handles large motion and achieves smooth transitions. GMC learns unary potential fields that predict SE(3) mappings into a shared canonical space, balancing correspondence, spatial and semantic smoothness, and local rigidity. We demonstrate that our method significantly outperforms existing baselines on 3D scene interpolation when the two states undergo large global motion. Furthermore, our method enables extrapolation, which the baseline methods cannot do.
(1) Left: For small inter-frame motion, determining a point's motion is effectively equivalent to matching it with a corresponding point within a small local neighborhood, so a local neighborhood search yields the correct correspondence and motion prediction. (2) Middle: Under large global motion, a local search matches to the wrong region. (3) Right: An ideal method predicts the correct correspondence and recovers the global motion.
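The failure mode described above can be reproduced numerically: nearest-neighbor search recovers the true correspondence under a small rotation but mismatches most points under a large one. This is an illustrative sketch only; the random point set, the rotation angles, and the `nn_match` helper are hypothetical and are not part of GMC.

```python
import numpy as np

def rotate_z(points, angle):
    """Rotate 3D points about the z-axis by `angle` radians."""
    c, s = np.cos(angle), np.sin(angle)
    R = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return points @ R.T

def nn_match(src, dst):
    """For each source point, the index of its nearest destination point."""
    d = np.linalg.norm(src[:, None, :] - dst[None, :, :], axis=-1)
    return d.argmin(axis=1)

rng = np.random.default_rng(0)
pts = rng.normal(size=(50, 3))
truth = np.arange(len(pts))  # point i corresponds to point i

# Small motion: local (nearest-neighbor) search finds the right matches.
acc_small = np.mean(nn_match(pts, rotate_z(pts, np.deg2rad(2))) == truth)

# Large global motion: local search lands in the wrong region for most points.
acc_large = np.mean(nn_match(pts, rotate_z(pts, np.deg2rad(120))) == truth)

print(f"small-motion accuracy: {acc_small:.2f}, large-motion accuracy: {acc_large:.2f}")
```

Increasing the rotation angle steadily degrades the matching accuracy, which is exactly the regime where a global, learned correspondence is needed.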
First, two 3D Gaussian Splatting (3DGS) models are trained, one from the start state and one from the end state (column 1). The Gaussians from both models are then mapped, via predicted SE(3) transformations, into a shared canonical space where they align with each other (column 2). After alignment, 3D correspondences are established; the Gaussians from both models are colored with PCA-projected DINO features (column 3). Finally, continuous 3D interpolation is derived, as shown in the last column (column 4).
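One way the continuous interpolation of a rigid map can be realized is geodesic interpolation: spherical linear interpolation (slerp) on the rotation and a linear blend of the translation. The sketch below uses SciPy's `Rotation` and `Slerp` with a single global pose rather than GMC's learned per-point fields; the poses and point coordinates are illustrative assumptions.

```python
import numpy as np
from scipy.spatial.transform import Rotation, Slerp

# Hypothetical rigid maps (rotation + translation) for the start and end
# states; GMC predicts such SE(3) maps per point, here we use one global pose.
R0, t0 = Rotation.identity(), np.zeros(3)
R1, t1 = Rotation.from_euler("z", 150, degrees=True), np.array([1.0, 0.5, 0.0])

slerp = Slerp([0.0, 1.0], Rotation.concatenate([R0, R1]))

def interpolate_pose(alpha):
    """Geodesic interpolation on SO(3) plus a linear blend of translation."""
    return slerp(alpha), (1.0 - alpha) * t0 + alpha * t1

# Points moved along the interpolated trajectory stay rigid: the distance
# between two points is preserved at every intermediate time.
p, q = np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])
R_mid, t_mid = interpolate_pose(0.5)
d0 = np.linalg.norm(p - q)
d_mid = np.linalg.norm((R_mid.apply(p) + t_mid) - (R_mid.apply(q) + t_mid))
```

Linearly blending raw positions instead would shrink distances under a 150° rotation; the geodesic blend keeps every intermediate pose a valid rigid motion.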
No DINO Input | No Position Input | No Local Isometry Loss | No Joint Refinement