Hanqing Zhao

Affiliate Member
Assistant Professor, Université Laval, Electrical and Computer Engineering
Research Topics
Multi-Agent Systems
Multitask Learning
Reinforcement Learning
Robotics
Swarm Intelligence

Biography

Hanqing Zhao is an Assistant Professor in the Département de génie électrique et de génie informatique of Université Laval. He is a member of the Laboratoire de Vision et Systèmes Numériques (LVSN).

Hanqing began his academic journey at the École Centrale de Pékin (Université Beihang). He earned an Ingénieur civil en informatique degree from École Polytechnique de Bruxelles (Université libre de Bruxelles), supervised by Marco Dorigo, and later received his Ph.D. in Computer Science (robotics) from McGill University, supervised by Gregory Dudek and Xue (Steve) Liu. He was then a Postdoctoral Researcher at the MIST Lab of École Polytechnique de Montréal, supervised by Giovanni Beltrame.

His research focuses on enabling robots to accomplish complex tasks while remaining resilient to faults and external disturbances. He leverages machine learning, adaptive control, and advanced consensus achievement techniques, including reinforcement learning, supervised learning, and blockchain technologies, to develop robust robot systems, especially multi-robot systems.

Publications

Zero-Shot Fault Detection for Manipulators Through Bayesian Inverse Reinforcement Learning
We consider the detection of faults in robotic manipulators, with particular emphasis on faults that have not been observed or identified in advance, which naturally includes those that occur very infrequently. Recent studies indicate that the reward function obtained through Inverse Reinforcement Learning (IRL) can help detect anomalies caused by faults in a control system (i.e. fault detection). Current IRL methods for fault detection, however, either use a linear reward representation or require extensive sampling from the environment to estimate the policy, rendering them inappropriate for safety-critical situations where sampling of failure observations via fault injection can be expensive and dangerous. To address this issue, this paper proposes a zero-shot and exogenous fault detector based on an approximate variational reward imitation learning (AVRIL) structure. The fault detector recovers a reward signal as a function of externally observable information to describe the normal operation, which can then be used to detect anomalies caused by faults. Our method incorporates expert knowledge through a customizable reward prior distribution, allowing the fault detector to learn the reward solely from normal operation samples, without the need for a simulator or costly interactions with the environment. We evaluate our approach for exogenous partial fault detection in multi-stage robotic manipulator tasks, comparing it with several baseline methods. The results demonstrate that our method more effectively identifies unseen faults even when they occur within just three controller time steps.