vipadmin – VISION AND IMAGE PROCESSING (VIP) RESEARCH GROUP

Scalable Open-Vocabulary Wildlife Detection and Human-in-the-Loop Annotation for Operational Conservation Monitoring

vipadmin — Fri, 28 Nov 2025 17:00:00 +0000

Jayden Hsiao

November 28th, 2025 – 1:00-2:00pm, EC4-2101A

Accurate wildlife monitoring is critical for understanding ecosystem change, yet most conservation agencies still rely on manual interpretation of aerial imagery, a process that is slow, inconsistent, and difficult to scale. This talk presents two complementary advances aimed at making species detection from aerial platforms more adaptable and directly usable in real conservation workflows.

First, I introduce OpenWildlife, a multi-species open-vocabulary detector trained across diverse aerial wildlife datasets. By using language-guided grounding rather than fixed class labels, the model can identify a wide range of species, including those not present in its training set, and can be efficiently fine-tuned with only a small number of annotated examples. This gives conservationists a strong and broadly transferable starting point rather than requiring them to train a new model from scratch for each survey.

Second, I describe a deployed human-in-the-loop annotation system built around this model and used operationally by the Arctic Eider Society for demographic analysis of eider duck populations in Hudson Bay. The workflow integrates automated predictions, regional correction tools, incremental fine-tuning, and interface features designed for the dense small-object scenes typical of eider colonies. In field use, it substantially reduced annotation effort while enabling continuous model refinement as new imagery is collected.

Together, these components demonstrate a path toward wildlife detection systems that are both technically robust and practically deployable, supporting conservation programs that require adaptable and scalable tools rather than one-off research models.

Learning to Reach Goals from Suboptimal Demonstrations via World Models

vipadmin — Fri, 14 Nov 2025 18:04:30 +0000

Qasim Ali

November 14th, 2025 – 1:00-2:00pm, EC4-2101A

A major challenge in training autonomous agents is the scarcity of high-quality demonstrations. While expert data is costly and limited, vast amounts of suboptimal trajectories—short, noisy, or exploratory—are much easier to collect but are rarely used effectively. This work explores how self-supervised representation learning can turn such imperfect data into a useful learning signal.

In this talk, I will introduce World Model Contrastive Reinforcement Learning (WM-CRL), a method that augments goal-conditioned reinforcement learning with predictive representations from a world model. The world model is trained to anticipate future states from past state–action pairs, capturing the underlying environment dynamics. As a result, it can be trained on any data—expert or suboptimal—and its learned representations help the agent generalize beyond what is directly demonstrated.

Across locomotion and manipulation benchmarks, WM-CRL improves performance in settings with fragmented or exploratory trajectories, showing that predictive world models can bridge the gap between imperfect data and effective control. I will conclude by discussing how such representations might scale toward more general, data-efficient robotic learning.
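
As a rough illustration of the contrastive objective that contrastive reinforcement learning methods typically build on, the sketch below computes an InfoNCE-style loss between state-action embeddings and goal embeddings. It is not the WM-CRL implementation: the abstract does not say how the world-model representations enter the objective, and the function name `info_nce_loss`, the dot-product similarity, and the toy two-dimensional embeddings are all assumptions made for illustration.

```python
import math


def info_nce_loss(state_action_embs, goal_embs):
    """Mean InfoNCE loss over a batch (hypothetical helper).

    Row i of state_action_embs and row i of goal_embs form a positive pair;
    every other goal in the batch acts as a negative for anchor i.
    """
    n = len(state_action_embs)
    total = 0.0
    for i in range(n):
        anchor = state_action_embs[i]
        # Dot-product similarity between anchor i and every candidate goal.
        logits = [sum(a * g for a, g in zip(anchor, goal)) for goal in goal_embs]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_norm = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the matching goal as the correct class.
        total += log_norm - logits[i]
    return total / n


# Toy check: correctly paired embeddings score a lower loss than mismatched ones.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned_loss = info_nce_loss(anchors, [[1.0, 0.0], [0.0, 1.0]])
shuffled_loss = info_nce_loss(anchors, [[0.0, 1.0], [1.0, 0.0]])
```

Minimizing this loss pulls each state-action embedding toward the goal it actually led to and away from the other goals in the batch, which is the sense in which the learned similarity can serve as a goal-reaching signal.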

Supporting Indigenous-led research with technology using SIKU: The Indigenous Knowledge App

vipadmin — Fri, 31 Oct 2025 04:00:00 +0000

Joel Heath, Executive Director – Arctic Eider Society (AES)

October 31st, 2025 – 1:00-2:00pm, EC4-2101A

The Arctic Eider Society (AES) is an Inuit-led charity based in Sanikiluaq, Nunavut, that supports Indigenous self-determination in research, education, and environmental stewardship, and provides tools and services for ice safety, language preservation, and environmental monitoring. AES develops SIKU: The Indigenous Knowledge App, a mobile and web platform that provides tools, services, and training to over 100 active Indigenous-led projects focused on research, monitoring, subsistence harvesting, guardians programs, protected areas establishment, and exploratory fisheries. SIKU users systematically take photos on their mobile phones and tag them within Indigenous classification systems, using Indigenous Environmental Terminology (IET) for wildlife, ice, weather, and other indicators. Using SIKU, daily observations are transformed from oral history to the quantitative data that has always formed the basis of Indigenous knowledge (IK). Importantly, SIKU incorporates self-determination and Indigenous data sovereignty as a part of its core principles and terms of use that respect IK, the policies and approaches of Indigenous research, and licensing and governance bodies. This presentation will outline the SIKU app and several case studies of how it is being used to run Indigenous-led research projects, as well as new tools incorporating machine learning and Indigenous knowledge systems.

Towards Automatic Sports Analytics: Team Affiliation, Jersey Number Recognition and Player Tracking

vipadmin — Fri, 24 Oct 2025 16:00:00 +0000

Dr. Maria Koshkina

October 24th, 2025 – 1:00-2:00pm, EC4-2101A

Automatic understanding of sports video can transform how we analyze games, coach athletes, and engage viewers. This talk presents a framework for reliable player identification and tracking in team sports, where athletes often look alike and jersey numbers are only intermittently visible. I will describe three components: a self-supervised method for classifying team affiliation, a robust jersey number recognition pipeline, and a graph-based tracker that integrates these identity cues for long-term player tracking.

Add just 10 more minutes of physical activity to protect your joints – Motivation from biomechanics and imaging

vipadmin — Fri, 03 Oct 2025 16:00:00 +0000

Prof. Monica Maly

October 3rd, 2025 – 1:00-2:00pm, EC4-2101A

This presentation will outline evidence showing that osteoarthritis (OA), the most common form of arthritis, is a “tag team” event. Joint loading becomes a powerful driver of joint disease in the presence of obesity. This presentation will outline data showing how biomechanics and obesity interact to degrade joint tissues, and provide a way forward for managing and potentially preventing OA through physical activity.

Structure-aware Mamba-transformer hybrid model for hyperspectral image classification

vipadmin — Fri, 27 Jun 2025 16:04:36 +0000

Tina Liu

June 27th, 2025 – 12:00-1:00pm, EC4-2101A

Hyperspectral image (HSI) classification underpins applications ranging from environmental monitoring and precision agriculture to urban planning and mineral exploration. Yet conventional convolutional networks capture only local patterns, and transformers, while adept at global context, remain computationally heavy and still struggle with the very long-range spectral–spatial dependencies latent in hundreds of contiguous bands. Leveraging the recent Mamba state-space model, which offers linear-time sequence processing, we introduce a structure-aware state-fusion mechanism that explicitly encodes neighbouring spectral and spatial relationships within the latent state, reducing redundancy and strengthening representations. Building on this foundation, we insert a lightweight self-attention block solely in the final layer of a Mamba backbone, yielding a hybrid Mamba–Transformer architecture that balances efficiency and global context modeling. Tested on three public benchmarks (Indian Pines, Pavia University, and WHU-Hi-HanChuan), the proposed network matches or exceeds state-of-the-art Transformer and Mamba variants, underscoring the promise of combining state-space and attention mechanisms for accurate, efficient HSI classification.
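
To make the "linear-time sequence processing" claim concrete, here is a minimal sketch of a scalar linear state-space recurrence, which visits each sequence element exactly once. It is a simplification under stated assumptions, not Mamba itself and not the structure-aware state-fusion mechanism from the talk: Mamba makes its parameters input-dependent and uses vector-valued states, whereas the fixed scalars `a`, `b`, `c` and the function name `ssm_scan` are hypothetical.

```python
def ssm_scan(xs, a, b, c):
    """Run the recurrence h_t = a*h_{t-1} + b*x_t, y_t = c*h_t over a sequence.

    One constant-cost update per element, so the total cost grows linearly
    with sequence length (self-attention, by contrast, compares all pairs).
    """
    h = 0.0  # latent state carried from one step to the next
    ys = []
    for x in xs:
        h = a * h + b * x  # fold the new input into the running state
        ys.append(c * h)   # read the output off the current state
    return ys


# An impulse at the first position decays geometrically through the state.
outputs = ssm_scan([1.0, 0.0, 0.0], a=0.5, b=1.0, c=2.0)
# outputs == [2.0, 1.0, 0.5]
```

For an HSI pixel, the sequence `xs` could be read as its values across contiguous spectral bands, with the state `h` summarizing the bands seen so far.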

Reality Capture for Smarter Infrastructure Inspections

vipadmin — Thu, 19 Jun 2025 20:21:54 +0000

Professor Chul Min Yeum

June 20th, 2025 – 12:00-1:00pm, EC4-2101A

This presentation explores how emerging technologies (AI, robotics, drones, and extended reality) are redefining infrastructure inspection. Through real-world systems like TowerEye and Holo-Inspector, the Computer Vision for Smart Structure Laboratory (CVISSLab) demonstrates how reality capture enables real-time, automated inspections with enhanced accuracy, safety, and collaboration.

From 3D Gaussian splatting to AR-guided damage detection, the talk highlights practical innovations that integrate physical sites with digital twins for intelligent, scalable asset management. It reflects a transformation in civil engineering, where technology enables more accurate, efficient, and connected inspection processes.

Freshwater Ice Remote Sensing – Upcoming Research and Retrospectives

vipadmin — Thu, 19 Jun 2025 20:20:28 +0000

Professor Grant Gunn

June 13th, 2025 – 10:00-11:00am, EC4-2101A

How do we observe and monitor abrupt environmental changes in lakes in Canada’s Arctic? What types of variables from the cryosphere can we retrieve using remote sensing, and why is it important to observe them? In this VIP session, Dr. Grant Gunn will explore the importance of freshwater ice in the context of the physical and human aspects of a changing climate, and demonstrate current and future capabilities for snow and ice retrievals using a variety of sensors (e.g. microwave, optical), techniques (e.g. polarimetric decomposition, interferometry), and technologies (e.g. synthetic aperture radar, Google search engine).

Projector Calibration via Overlapping Point Cloud Distance Minimization

vipadmin — Fri, 06 Jun 2025 16:33:14 +0000

Pranav Venkatesan

June 6th, 2025 – 12:30-1:00pm, EC4-2101A

Camera and projector calibration is essential for accurate spatial measurement, particularly in applications like projection mapping where multiple projectors must align precisely on complex surfaces. Since projector calibration involves nonlinear relationships between parameters, nonlinear optimization is required, typically using reprojection error as the objective function. However, in the absence of ground truth, reprojection error alone can result in visible gaps between overlapping projector point clouds. To address this, the thesis proposes a multi-step optimization with a novel objective function. Tested on both simulated and real-world setups, the method shows fast and reliable performance. The optimization parameterizes calibration as a function of stereo overlap, which affects both accuracy and system cost. Understanding this relationship allows reduced overlap without compromising accuracy, minimizing camera usage and cost.
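
For readers unfamiliar with the baseline objective mentioned above, the following sketch computes a root-mean-square reprojection error under an ideal pinhole model. It illustrates only the conventional objective, not the thesis's novel point-cloud distance objective, which the abstract does not specify. The function names, the intrinsic parameters `fx`, `fy`, `cx`, `cy`, and the simplifications (points already expressed in the device frame, no lens distortion) are assumptions.

```python
import math


def project(point, fx, fy, cx, cy):
    """Project a 3D point (x, y, z) in the device frame to pixel coordinates."""
    x, y, z = point
    return (fx * x / z + cx, fy * y / z + cy)


def rms_reprojection_error(points_3d, observed_2d, fx, fy, cx, cy):
    """Root-mean-square pixel distance between projected and observed points."""
    squared_sum = 0.0
    for point, (u_obs, v_obs) in zip(points_3d, observed_2d):
        u, v = project(point, fx, fy, cx, cy)
        squared_sum += (u - u_obs) ** 2 + (v - v_obs) ** 2
    return math.sqrt(squared_sum / len(points_3d))


points = [(0.0, 0.0, 1.0), (1.0, 0.0, 2.0)]
exact = [project(p, 100.0, 100.0, 50.0, 40.0) for p in points]
# Shift every observation by (3, 4) pixels, i.e. 5 pixels of error per point.
shifted = [(u + 3.0, v + 4.0) for u, v in exact]
error = rms_reprojection_error(points, shifted, 100.0, 100.0, 50.0, 40.0)
# error == 5.0
```

A nonlinear optimizer would adjust the calibration parameters to drive this value down. As the abstract notes, a low value on its own does not guarantee that overlapping projector point clouds line up.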

Superpixel Salient Object Detection

vipadmin — Fri, 06 Jun 2025 15:59:36 +0000

Jinman Park

June 6th, 2025 – 12:00-12:30pm, EC4-2101A

This talk explores a lightweight, superpixel-based approach to salient object detection—segmenting the most visually prominent regions in an image. While traditional methods rely on dense pixel-level computation, we introduce SuperFormer, a vision transformer tailored for superpixel inputs. Our work addresses challenges of superpixel heterogeneity, positional encoding, and pre-training, achieving state-of-the-art results on multiple benchmarks with significantly reduced computational cost.
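
One way to see why superpixel inputs cut computation is to pool pixels into one token per superpixel, so a transformer attends over a handful of regions instead of every pixel. The sketch below does this with simple mean pooling over a precomputed label map. It is a generic illustration, not SuperFormer's tokenization, which the abstract does not describe; the function name `superpixel_tokens` and the single-channel toy image are assumptions.

```python
def superpixel_tokens(image, labels):
    """Average pixel values within each superpixel.

    image:  2D list of pixel intensities.
    labels: 2D list of the same shape giving each pixel's superpixel id.
    Returns a dict mapping superpixel id -> mean intensity (one token each).
    """
    sums, counts = {}, {}
    for pixel_row, label_row in zip(image, labels):
        for value, sp_id in zip(pixel_row, label_row):
            sums[sp_id] = sums.get(sp_id, 0.0) + value
            counts[sp_id] = counts.get(sp_id, 0) + 1
    return {sp_id: sums[sp_id] / counts[sp_id] for sp_id in sums}


# Four pixels collapse into two tokens, one per superpixel.
tokens = superpixel_tokens(
    image=[[1.0, 3.0], [5.0, 7.0]],
    labels=[[0, 0], [1, 1]],
)
# tokens == {0: 2.0, 1: 6.0}
```

Because superpixels vary in size and shape, tokens built this way are not arranged on a regular grid, which is one reason positional encoding needs special treatment for superpixel inputs.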