Dec 1 2023, 11:30 am, EC4-2101A
*** 11:30am Presenter: Yuxiang Huang ***
Identifying and segmenting moving objects from a moving monocular camera is difficult in the presence of unknown camera motion, different types of object motion, and complex scene structure. Deep learning methods achieve impressive results on generic motion segmentation, but they require massive training data and do not generalize well to novel scenes and objects. In contrast, recent geometric methods show promising results by fusing different geometric models together, but they require manually corrected point trajectories and cannot generate coherent segmentation masks. To combine the advantages of both approaches, we propose a zero-shot motion segmentation method that performs motion model fusion on object proposals. We first generate object proposals and motion cues using off-the-shelf deep learning foundation models, and then synergistically fuse the different motion cues to cluster the proposals into distinct motion groups. Experiments show that our method achieves competitive results on multiple datasets compared to state-of-the-art methods trained in a supervised or semi-supervised manner.
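The fusion-and-clustering step can be pictured as combining per-cue pairwise affinities between object proposals and then grouping proposals whose fused affinity is high. The following is a minimal sketch, not the presenter's actual method: the affinity matrices, the geometric-mean fusion rule, the threshold, and the connected-components grouping are all illustrative assumptions.

```python
import numpy as np

# Hypothetical pairwise affinities between 5 object proposals from two
# motion cues (e.g., epipolar consistency and optical-flow similarity).
# Values in [0, 1]; these numbers are made up for illustration.
A_epi = np.array([
    [1.0, 0.9, 0.1, 0.2, 0.1],
    [0.9, 1.0, 0.2, 0.1, 0.1],
    [0.1, 0.2, 1.0, 0.8, 0.1],
    [0.2, 0.1, 0.8, 1.0, 0.2],
    [0.1, 0.1, 0.1, 0.2, 1.0],
])
A_flow = np.array([
    [1.0, 0.85, 0.15, 0.1, 0.2],
    [0.85, 1.0, 0.1, 0.15, 0.1],
    [0.15, 0.1, 1.0, 0.9, 0.15],
    [0.1, 0.15, 0.9, 1.0, 0.1],
    [0.2, 0.1, 0.15, 0.1, 1.0],
])

# One possible "synergistic" fusion: the geometric mean keeps a pair
# strongly connected only when both cues agree it moves coherently.
A_fused = np.sqrt(A_epi * A_flow)

def motion_groups(A, thresh=0.5):
    """Group proposals via connected components of the thresholded
    affinity graph; returns one integer motion-group label per proposal."""
    n = len(A)
    adj = A >= thresh
    labels = -np.ones(n, dtype=int)
    group = 0
    for i in range(n):
        if labels[i] >= 0:
            continue
        stack = [i]
        labels[i] = group
        while stack:
            j = stack.pop()
            for k in np.flatnonzero(adj[j]):
                if labels[k] < 0:
                    labels[k] = group
                    stack.append(k)
        group += 1
    return labels

print(motion_groups(A_fused))  # proposals 0-1, 2-3, and 4 form three groups
```

In practice a real system would use a soft clustering (e.g., spectral clustering on the fused affinity) rather than a hard threshold, but the sketch shows why fusing cues before grouping helps: a pair supported by only one cue is suppressed.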