Dheeraj Khanna
February 7th, 2025 – 11am-12pm, EC4-2101A
Multi-Object Tracking (MOT) is a core computer vision task, with applications in autonomous driving, surveillance, and sports analytics. Despite recent advances, long-term identity association, varying object counts, and occlusions remain open challenges. This research addresses these issues by proposing a learning-based motion model and optimizing data association strategies.
Inspired by state-space models (SSMs), particularly Mamba \cite{mamba}, we introduce a novel motion prediction architecture that combines Mamba and self-attention to capture non-linear motion patterns within the Tracking-By-Detection (TBD) paradigm. Mamba’s sequence modeling captures long-range temporal dependencies, which improves motion prediction. We further refine data association by integrating IoU variants and Re-ID cues and by dynamically updating the feature bank to improve matching. To reduce ID switches and improve trajectory consistency, we introduce virtual detections in overlapping scenarios.
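To make the data association step concrete, the following is a minimal sketch of how an IoU cue and a Re-ID appearance cue might be fused into one matching cost, together with a feature bank update. It is an illustration under stated assumptions, not the method presented in this work: the function names, the weight `w_iou`, the gate `max_cost`, the greedy matcher, and the exponential-moving-average update with `momentum` are all hypothetical choices, and plain IoU stands in for the IoU variants mentioned above.

```python
import math

def iou(a, b):
    """Plain IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def cosine_sim(u, v):
    """Cosine similarity between two appearance (Re-ID) embeddings."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0

def fused_cost(track, det, w_iou=0.5):
    """Weighted sum of IoU distance and appearance distance (w_iou is assumed)."""
    (t_box, t_feat), (d_box, d_feat) = track, det
    return w_iou * (1.0 - iou(t_box, d_box)) + (1.0 - w_iou) * (1.0 - cosine_sim(t_feat, d_feat))

def associate(tracks, detections, max_cost=0.7):
    """Greedy matching on the fused cost; returns (track_idx, det_idx) pairs.

    tracks: list of (predicted_box, bank_feature); detections: list of (box, feature).
    Greedy matching is a simplification chosen to keep the sketch dependency-free.
    """
    pairs = sorted(
        (fused_cost(t, d), ti, di)
        for ti, t in enumerate(tracks)
        for di, d in enumerate(detections)
    )
    used_t, used_d, matches = set(), set(), []
    for cost, ti, di in pairs:
        if cost > max_cost:  # remaining pairs are all too dissimilar
            break
        if ti in used_t or di in used_d:
            continue
        matches.append((ti, di))
        used_t.add(ti)
        used_d.add(di)
    return matches

def update_feature(bank_feat, det_feat, momentum=0.9):
    """Assumed EMA update of a track's stored appearance feature."""
    return [momentum * b + (1.0 - momentum) * d for b, d in zip(bank_feat, det_feat)]
```

In this sketch each track carries the box predicted by the motion model and its current feature bank entry. After `associate` returns the matched pairs, `update_feature` would be applied to each matched track. A globally optimal assignment (for example the Hungarian algorithm) could replace the greedy loop without changing the cost definition.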