Qasim Ali
November 14th, 2025 – 1:00-2:00pm, EC4-2101A
A major challenge in training autonomous agents is the scarcity of high-quality demonstrations. Expert data is costly and limited, while vast amounts of suboptimal trajectories (short, noisy, or exploratory) are much easier to collect but are rarely used effectively. This work explores how self-supervised representation learning can turn such imperfect data into a useful learning signal.
In this talk, I will introduce World Model Contrastive Reinforcement Learning (WM-CRL), a method that augments goal-conditioned reinforcement learning with predictive representations from a world model. The world model is trained to predict future states from past state–action pairs, capturing the underlying environment dynamics. Because this objective requires only observed transitions, the model can be trained on any data, expert or suboptimal, and its learned representations help the agent generalize beyond what is directly demonstrated.
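To make the idea concrete, the following is a minimal toy sketch (not the talk's actual method) of the core ingredient: a world model fit purely to observed (state, action, next-state) transitions, so expert and exploratory data are equally usable. All names and the 1-D linear dynamics are illustrative assumptions.

```python
import random

# Hypothetical 1-D environment with true dynamics s' = s + a.
# Transitions are collected by random exploration -- no expert needed.
def collect_transitions(n=200):
    data = []
    for _ in range(n):
        s = random.uniform(-1.0, 1.0)
        a = random.uniform(-1.0, 1.0)
        data.append((s, a, s + a))
    return data

# Linear one-step predictor s_hat = w_s*s + w_a*a, trained by SGD
# on squared prediction error -- a purely self-supervised objective.
def train_world_model(data, lr=0.1, epochs=50):
    w_s, w_a = 0.0, 0.0
    for _ in range(epochs):
        for s, a, s_next in data:
            err = (w_s * s + w_a * a) - s_next
            w_s -= lr * err * s
            w_a -= lr * err * a
    return w_s, w_a

random.seed(0)
w_s, w_a = train_world_model(collect_transitions())
# The learned weights recover the true dynamics (w_s and w_a near 1),
# even though every training trajectory was random and suboptimal.
```

The point of the sketch is that the supervision signal (the next observed state) comes for free from any trajectory, which is why such models can exploit fragmented or exploratory data that reward-based learning discards.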
Across locomotion and manipulation benchmarks, WM-CRL improves performance in settings with fragmented or exploratory trajectories, showing that predictive world models can bridge the gap between imperfect data and effective control. I will conclude by discussing how such representations might scale toward more general, data-efficient robotic learning.
