
Hsiu-Chin Lin

Associate Academic Member
Assistant Professor, McGill University, Department of Electrical and Computer Engineering
Research Topics
Reinforcement Learning
Deep Learning
Climate Change
Out-of-Distribution (OOD) Detection
Autonomous Robot Navigation
Robotics

Biography

Hsiu-Chin Lin is an Assistant Professor in the School of Computer Science and the Department of Electrical and Computer Engineering at McGill University. Her research covers model-based motion control, optimization, and machine learning for motion planning. She is particularly interested in adapting robot motion to dynamic environments for manipulators and quadruped robots. Before joining McGill, she was a research associate at the University of Edinburgh and the University of Birmingham. She received her PhD from the University of Edinburgh for her work on robot learning.

Current Students

PhD - McGill
Co-supervisor:
Research Master's - McGill
Principal supervisor:
Research Master's - McGill
Principal supervisor:
Research Master's - McGill

Publications

SLowRL: Safe Low-Rank Adaptation for Bridging the Sim-to-Real Gap in Legged Locomotion
Shafeef Omar
Majid Khadiv
A simulator is, at best, a coarse low-fidelity model of the real world the agent eventually has to act in. Closing this residual gap on hardware is a canonical instance of operating in a big world: the real environment exposes contact dynamics, latencies, and disturbances that the agent was never given the capacity (parameters or data) to model during pretraining. Naive on-hardware fine-tuning is risky, because the policy can damage the robot before it improves, and full-parameter updates require prohibitive interaction time. We propose SLowRL, a continual fine-tuning framework that confronts this big-world adaptation problem with two complementary forms of capacity limitation: (i) a rank-1 LoRA adapter applied per layer to both actor and critic, restricting each layer's update to a single direction in its image space (
Drift Q-Learning
Offline reinforcement learning requires improving a policy from fixed data while avoiding out-of-distribution actions with unreliable value estimates. Diffusion and flow policies handle this trade-off by modeling the behavior distribution to regularize the RL objective, but they require iterative denoising, solver integrations, and in more efficient variants, distillation or other approximations at inference. We propose DriftQL, which combines a drift-based behavioral regularizer with critic-driven policy improvement. The value signal biases the policy toward high-value regions of the data support, while attraction and repulsion together keep generated actions near the data and prevent collapse onto a single mode. DriftQL is implemented as a single network with a unified training objective and generates actions in a single forward pass. On D4RL and OGBench, DriftQL consistently outperforms diffusion and flow methods, advancing the state of the art. Under degraded data quality, where the baselines visibly struggle, DriftQL remains close to its clean-data performance, positioning it as a promising alternative to diffusion and flow-based methods while maintaining the simplicity and efficiency of deterministic approaches. Project page: https://driftql.github.io/
Genetic connectivity of Uroteuthis sibogae (Cephalopoda: Loliginidae) in the Sulu Sea with notes on morphology and statolith microchemistry
Jessica M. Legaspi
Lorenzo C. Halasan
Kris Angeli S. Sanchez-Delos Reyes
Tomoyo Okumura
Hsiu-Chin Lin
Toward Hardware-Agnostic Quadrupedal World Models via Morphology Conditioning
World models promise a paradigm shift in robotics, where an agent learns the underlying physics of its environment once to enable efficient planning and behavior learning. However, current world models are often hardware-locked specialists: a model trained on a Boston Dynamics Spot robot fails catastrophically on a Unitree Go1 due to the mismatch in kinematic and dynamic properties, as the model overfits to specific embodiment constraints rather than capturing the universal locomotion dynamics. Consequently, a slight change in actuator dynamics or limb length necessitates training a new model from scratch. In this work, we take a step towards a framework for training a generalizable Quadrupedal World Model (QWM) that disentangles environmental dynamics from robot morphology. We address the limitations of implicit system identification, where treating static physical properties (like mass or limb length) as latent variables to be inferred from motion history creates an adaptation lag that can compromise zero-shot safety and efficiency. Instead, we explicitly condition the generative dynamics on the robot's engineering specifications. By integrating a physical morphology encoder and a reward normalizer, we enable the model to serve as a neural simulator capable of generalizing across morphologies. This capability unlocks zero-shot control across a range of embodiments. We introduce, for the first time, a world model that enables zero-shot generalization to new morphologies for locomotion. While we carefully study the limitations of our method (QWM operates as a distribution-bounded interpolator within the quadrupedal morphology family rather than a universal physics engine), this work represents a significant step toward morphology-conditioned world models for legged locomotion.
SLowRL: Safe Low-Rank Adaptation Reinforcement Learning for Locomotion
Shafeef Omar
Majid Khadiv
Sim-to-real transfer of locomotion policies often leads to performance degradation due to the inevitable sim-to-real gap. Naively fine-tuning these policies directly on hardware is problematic, as it poses risks of mechanical failure and suffers from high sample inefficiency. In this paper, we address the challenge of safely and efficiently fine-tuning reinforcement learning (RL) policies for dynamic locomotion tasks. Specifically, we focus on fine-tuning policies learned in simulation directly on hardware, while explicitly enforcing safety constraints. In doing so, we introduce SLowRL, a framework that combines Low-Rank Adaptation (LoRA) with training-time safety enforcement via a recovery policy. We evaluate our method both in simulation and on a real Unitree Go2 quadruped robot for jump and trot tasks. Experimental results show that our method achieves a
Tactile Modality Fusion for Vision-Language-Action Models
We propose TacFiLM, a lightweight modality-fusion approach that integrates visual-tactile signals into vision-language-action (VLA) models. While recent advances in VLA models have introduced robot policies that are both generalizable and semantically grounded, these models mainly rely on vision-based perception. Vision alone, however, cannot capture the complex interaction dynamics that occur during contact-rich manipulation, including contact forces, surface friction, compliance, and shear. While recent attempts to integrate tactile signals into VLA models often increase complexity through token concatenation or large-scale pretraining, the heavy computational demands of behavioural models necessitate more lightweight fusion strategies. To address these challenges, TacFiLM outlines a post-training finetuning approach that conditions intermediate visual features on pretrained tactile representations using feature-wise linear modulation (FiLM). Experimental results on insertion tasks demonstrate consistent improvements in success rate, direct insertion performance, completion time, and force stability across both in-distribution and out-of-distribution tasks. Together, these results support our method as an effective approach to integrating tactile signals into VLA models, improving contact-rich manipulation behaviours.
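As a rough illustration of the feature-wise linear modulation (FiLM) idea named in the abstract above, the sketch below scales and shifts each visual feature channel with parameters predicted from a tactile embedding. It is a minimal pure-Python sketch, not TacFiLM's implementation: the function names (`film`, `linear`, `tactile_film`), the single linear layers that predict the scale and shift, and the flat feature vectors are all assumptions made for illustration.

```python
from typing import List, Sequence


def film(features: Sequence[float], gamma: Sequence[float], beta: Sequence[float]) -> List[float]:
    """Feature-wise linear modulation: out[i] = gamma[i] * features[i] + beta[i]."""
    if not (len(features) == len(gamma) == len(beta)):
        raise ValueError("features, gamma and beta must have the same length")
    return [g * f + b for f, g, b in zip(features, gamma, beta)]


def linear(weights: Sequence[Sequence[float]], bias: Sequence[float], x: Sequence[float]) -> List[float]:
    """Plain affine map y = W x + b, with one row of `weights` per output channel."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def tactile_film(visual, tactile, w_gamma, b_gamma, w_beta, b_beta) -> List[float]:
    """Hypothetical conditioning step: the tactile embedding predicts the
    per-channel scale (gamma) and shift (beta) applied to the visual features."""
    gamma = linear(w_gamma, b_gamma, tactile)
    beta = linear(w_beta, b_beta, tactile)
    return film(visual, gamma, beta)
```

With zero weights, a gamma bias of one and a beta bias of zero, `tactile_film` leaves the visual features unchanged, which is one common way to initialise such a layer so that fine-tuning starts from the pretrained behaviour (again an assumption here, not a detail taken from the abstract).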
Piezoelectric tuning of thermal conductivity in nano-architected gallium nitride metamaterials
Jun Cai
Alireza Seyedkanani
Benyamin Shahryari
Abdolhamid Akbarzadeh
VOCALoco: Viability-Optimized Cost-aware Adaptive Locomotion
Recent advancements in legged robot locomotion have facilitated traversal over increasingly complex terrains. Despite this progress, many existing approaches rely on end-to-end deep reinforcement learning (DRL), which poses limitations in terms of safety and interpretability, especially when generalizing to novel terrains. To overcome these challenges, we introduce VOCALoco, a modular skill-selection framework that dynamically adapts locomotion strategies based on perceptual input. Given a set of pre-trained locomotion policies, VOCALoco evaluates their viability and energy-consumption by predicting both the safety of execution and the anticipated cost of transport over a fixed planning horizon. This joint assessment enables the selection of policies that are both safe and energy-efficient, given the observed local terrain. We evaluate our approach on staircase locomotion tasks, demonstrating its performance in both simulated and real-world scenarios using a quadrupedal robot. Empirical results show that VOCALoco achieves improved robustness and safety during stair ascent and descent compared to a conventional end-to-end DRL policy.
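The selection step described in the abstract above (pick a policy that is both safe and energy-efficient) could be sketched as a two-stage rule: discard skills whose predicted viability is too low, then take the cheapest remaining one. This is only one plausible reading. The `SkillEstimate` type, the `min_viability` threshold, and the "return `None` when nothing is viable" fallback are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class SkillEstimate:
    name: str
    viability: float          # predicted probability of safe execution over the horizon
    cost_of_transport: float  # predicted energy cost over the same horizon


def select_skill(
    estimates: Sequence[SkillEstimate], min_viability: float = 0.9
) -> Optional[SkillEstimate]:
    """Keep skills predicted to be safe enough, then choose the cheapest one.

    Returns None when no skill clears the (assumed) viability threshold, so
    the caller can fall back to a stop or recovery behaviour.
    """
    viable = [e for e in estimates if e.viability >= min_viability]
    if not viable:
        return None
    return min(viable, key=lambda e: e.cost_of_transport)
```

For example, given a cheap but unsafe trot and two safe skills, the rule returns the safe skill with the lower predicted cost of transport rather than the globally cheapest one.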
Contractive Diffusion Policies
Diffusion policies have emerged as powerful generative models for offline policy learning, whose sampling process can be rigorously characterized by a score function guiding a Stochastic Differential Equation (SDE). However, the same score-based SDE modeling that grants diffusion policies the flexibility to learn diverse behavior also incurs solver and score-matching errors, large data requirements, and inconsistencies in action generation. While less critical in image generation, these inaccuracies compound and lead to failure in continuous control settings. We introduce Contractive Diffusion Policies (CDPs) to induce contractive behavior in the diffusion sampling dynamics. Contraction pulls nearby flows closer to enhance robustness against solver and score-matching errors while reducing unwanted action variance. We develop an in-depth theoretical analysis along with a practical implementation recipe to incorporate CDPs into existing diffusion policy architectures with minimal modification and computational cost. We evaluate CDPs for offline learning by conducting extensive experiments in simulation and real-world settings. Across benchmarks, CDPs often outperform baseline policies, with pronounced benefits under data scarcity. Project page: https://contractive-diffusion.github.io
Contractive Diffusion Policies: Robust Action Diffusion via Contractive Score-Based Sampling with Differential Equations
Charlotte Morissette
Anas El Houssaini
Diffusion policies have emerged as powerful generative models for offline policy learning, whose sampling process can be rigorously characterized by a score function guiding a Stochastic Differential Equation (SDE). However, the same score-based SDE modeling that grants diffusion policies the flexibility to learn diverse behavior also incurs solver and score-matching errors, large data requirements, and inconsistencies in action generation. While less critical in image generation, these inaccuracies compound and lead to failure in continuous control settings. We introduce Contractive Diffusion Policies (CDPs) to induce contractive behavior in the diffusion sampling dynamics. Contraction pulls nearby flows closer to enhance robustness against solver and score-matching errors while reducing unwanted action variance. We develop an in-depth theoretical analysis along with a practical implementation recipe to incorporate CDPs into existing diffusion policy architectures with minimal modification and computational cost. We evaluate CDPs for offline learning by conducting extensive experiments in simulation and real-world settings. Across benchmarks, CDPs often outperform baseline policies, with pronounced benefits under data scarcity.
Safe Domain Randomization via Uncertainty-Aware Out-of-Distribution Detection and Policy Adaptation
Deploying reinforcement learning (RL) policies in the real world involves significant challenges, including distribution shifts, safety concerns, and the impracticality of direct interactions during policy refinement. Existing methods, such as domain randomization (DR) and off-dynamics RL, enhance policy robustness by direct interaction with the target domain, an inherently unsafe practice. We propose Uncertainty-Aware RL (UARL), a novel framework that prioritizes safety during training by addressing Out-Of-Distribution (OOD) detection and policy adaptation without requiring direct interactions in the target domain. UARL employs an ensemble of critics to quantify policy uncertainty and incorporates progressive environmental randomization to prepare the policy for diverse real-world conditions. By iteratively refining over high-uncertainty regions of the state space in simulated environments, UARL enhances robust generalization to the target domain without explicitly training on it. We evaluate UARL on MuJoCo benchmarks and a quadrupedal robot, demonstrating its effectiveness in reliable OOD detection, improved performance, and enhanced sample efficiency compared to baselines.
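One common way to turn an ensemble of critics into an uncertainty signal, as the abstract above describes at a high level, is to measure how much the critics disagree on the same state-action pair. The sketch below uses the population standard deviation of the ensemble's Q-value estimates and a fixed threshold. Both choices, along with the function names, are assumptions for illustration and may differ from what UARL actually does.

```python
from statistics import pstdev
from typing import Sequence


def critic_disagreement(q_values: Sequence[float]) -> float:
    """Spread of the ensemble's Q-value estimates for one state-action pair."""
    if len(q_values) < 2:
        raise ValueError("an ensemble needs at least two critics")
    return pstdev(q_values)


def is_out_of_distribution(q_values: Sequence[float], threshold: float) -> bool:
    """Flag the input as OOD when the critics disagree by more than `threshold`.

    The threshold is a hypothetical tuning parameter, e.g. calibrated on
    disagreement values observed on in-distribution data.
    """
    return critic_disagreement(q_values) > threshold
```

States flagged this way could then be prioritised for further refinement in simulation, in line with the abstract's description of iterating over high-uncertainty regions.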
Speciation of coral-associated barnacles: generalists versus specialists in the Indo-West Pacific
Lorenzo C. Halasan
Yoko Nozawa
Benny Kwok Kan Chan