Xue (Steve) Liu

Associate Academic Member
Full Professor, McGill University, School of Computer Science
Vice President Research and Development, Chief Scientist and Co-Director, Samsung's Montreal AI Center

Biography

Xue (Steve) Liu is an associate academic member of Mila – Quebec Artificial Intelligence Institute and full professor at McGill University’s School of Computer Science.

He is also a William Dawson Scholar at McGill, as well as a professor (courtesy appointment) in the Department of Mathematics and Statistics, associate member of the Centre for Intelligent Machines (CIM), and associate member of the Centre for Advanced Systems and Technologies in Communications (SYTACom).

Liu is VP of R&D, chief scientist and co-director of Samsung AI Center Montréal. Before that, he was chief scientist in charge of research and innovation at Tinder Inc., the world’s largest dating and social discovery app, then valued at over US$10 billion.

He is a Fellow of the IEEE and the Canadian Academy of Engineering in addition to being the recipient of many awards, including the 2017 Mitacs Award for Exceptional Leadership – Professor; Outstanding Young Canadian Computer Science Researcher Prize from the Canadian Association of Computer Science (2014); and McGill’s Tomlinson Scientist Award for “recognition of excellence and scientific leadership.” He founded McGill’s Cyber-Physical Intelligence Lab in 2007 and still serves as its director.

Liu also briefly served as Samuel R. Thompson Chair Associate Professor in the Department of Computer Science and Engineering at the University of Nebraska-Lincoln, and worked at Hewlett-Packard Labs in Palo Alto (California) and IBM’s Thomas J. Watson Research Center (New York).

Current Students

PhD - McGill University
Co-supervisor :
PhD - McGill University
Master's Research - McGill University
Master's Research - McGill University
Postdoctorate - McGill University
Co-supervisor :
Master's Research - McGill University
PhD - McGill University
PhD - McGill University
PhD - McGill University
Co-supervisor :
Master's Research - McGill University
PhD - McGill University
PhD - McGill University
PhD - McGill University
PhD - McGill University

Publications

Adaptive Dynamic Programming for Energy-Efficient Base Station Cell Switching
Junliang Luo
Yi Tian Xu
Di Wu
M. Jenkin
Energy saving in wireless networks is growing in importance due to increasing demand for evolving new-gen cellular networks, environmental and regulatory concerns, and potential energy crises arising from geopolitical tensions. In this work, we propose an approximate dynamic programming (ADP)-based method coupled with online optimization to switch on/off the cells of base stations to reduce network power consumption while maintaining adequate Quality of Service (QoS) metrics. We use a multilayer perceptron (MLP) given each state-action pair to predict the power consumption to approximate the value function in ADP for selecting the action with optimal expected power saved. To save the largest possible power consumption without deteriorating QoS, we include another MLP to predict QoS and a long short-term memory (LSTM) for predicting handovers, incorporated into an online optimization algorithm producing an adaptive QoS threshold for filtering cell switching actions based on the overall QoS history. The performance of the method is evaluated using a practical network simulator with various real-world scenarios with dynamic traffic patterns.
A Generic Framework for Byzantine-Tolerant Consensus Achievement in Robot Swarms
Hanqing Zhao
Alexandre Pacheco
Volker Strobel
Andreagiovanni Reina
Marco Dorigo
Recent studies show that some security features that blockchains grant to decentralized networks on the internet can be ported to swarm robotics. Although the integration of blockchain technology and swarm robotics shows great promise, thus far, research has been limited to proof-of-concept scenarios where the blockchain-based mechanisms are tailored to a particular swarm task and operating environment. In this study, we propose a generic framework based on a blockchain smart contract that enables robot swarms to achieve secure consensus in an arbitrary observation space. This means that our framework can be customized to fit different swarm robotics missions, while providing methods to identify and neutralize Byzantine robots, that is, robots which exhibit detrimental behaviours stemming from faults or malicious tampering.
Zero-Shot Fault Detection for Manipulators Through Bayesian Inverse Reinforcement Learning
We consider the detection of faults in robotic manipulators, with particular emphasis on faults that have not been observed or identified in advance, which naturally includes those that occur very infrequently. Recent studies indicate that the reward function obtained through Inverse Reinforcement Learning (IRL) can help detect anomalies caused by faults in a control system (i.e. fault detection). Current IRL methods for fault detection, however, either use a linear reward representation or require extensive sampling from the environment to estimate the policy, rendering them inappropriate for safety-critical situations where sampling of failure observations via fault injection can be expensive and dangerous. To address this issue, this paper proposes a zero-shot and exogenous fault detector based on an approximate variational reward imitation learning (AVRIL) structure. The fault detector recovers a reward signal as a function of externally observable information to describe the normal operation, which can then be used to detect anomalies caused by faults. Our method incorporates expert knowledge through a customizable reward prior distribution, allowing the fault detector to learn the reward solely from normal operation samples, without the need for a simulator or costly interactions with the environment. We evaluate our approach for exogenous partial fault detection in multi-stage robotic manipulator tasks, comparing it with several baseline methods. The results demonstrate that our method more effectively identifies unseen faults even when they occur within just three controller time steps.
Importance-aware Co-teaching for Offline Model-based Optimization
Ye Yuan
Can Chen
Zixuan Liu
Willie Neiswanger
Offline model-based optimization aims to find a design that maximizes a property of interest using only an offline dataset, with applications in robot, protein, and molecule design, among others. A prevalent approach is gradient ascent, where a proxy model is trained on the offline dataset and then used to optimize the design. This method suffers from an out-of-distribution issue, where the proxy is not accurate for unseen designs. To mitigate this issue, we explore using a pseudo-labeler to generate valuable data for fine-tuning the proxy. Specifically, we propose
Parallel-mentoring for Offline Model-based Optimization
Can Chen
Christopher Beckham
Zixuan Liu
We study offline model-based optimization to maximize a black-box objective function with a static dataset of designs and scores. These designs encompass a variety of domains, including materials, robots, DNA sequences, and proteins. A common approach trains a proxy on the static dataset and performs gradient ascent to obtain new designs. However, this often results in poor designs due to the proxy inaccuracies for out-of-distribution designs. Recent studies indicate that (a) gradient ascent with a mean ensemble of proxies generally outperforms simple gradient ascent, and (b) a trained proxy provides weak ranking supervision signals for design selection. Motivated by (a) and (b), we propose
Retrieval-Augmented Multiple Instance Learning
Yufei Cui
Ziquan Liu
Yixin Chen
Yuchen Lu
Xinyue Yu
Tei-Wei Kuo
Miguel R. D. Rodrigues
Chun Jason Xue
Antoni B. Chan
Multiple Instance Learning (MIL) is a crucial weakly supervised learning method applied across various domains, e.g., medical diagnosis based on whole slide images (WSIs). Recent advancements in MIL algorithms have yielded exceptional performance when the training and test data originate from the same domain, such as WSIs obtained from the same hospital. However, this paper reveals a performance deterioration of MIL models when tested on an out-of-domain test set, exemplified by WSIs sourced from a novel hospital. To address this challenge, this paper introduces the Retrieval-AugMented MIL (RAM-MIL) framework, which integrates Optimal Transport (OT) as the distance metric for nearest neighbor retrieval. The development of RAM-MIL is driven by two key insights. First, a theoretical discovery indicates that reducing the input's intrinsic dimension can minimize the approximation error in attention-based MIL. Second, previous studies highlight a link between input intrinsic dimension and the feature merging process with the retrieved data. Empirical evaluations conducted on WSI classification demonstrate that the proposed RAM-MIL framework achieves state-of-the-art performance in both in-domain scenarios, where the training and retrieval data are in the same domain, and more crucially, in out-of-domain scenarios, where the (unlabeled) retrieval data originates from a different domain. Furthermore, the use of the transportation matrix derived from OT renders the retrieval results interpretable at the instance level, in contrast to the vanilla
Towards Hybrid-grained Feature Interaction Selection for Deep Sparse Network
Fuyuan Lyu
Xing Tang
Dugang Liu
Chen Ma
Weihong Luo
Liang Chen
Xiuqiang He
Deep sparse networks are widely investigated as a neural network architecture for prediction tasks with high-dimensional sparse features, with which feature interaction selection is a critical component. While previous methods primarily focus on how to search feature interaction in a coarse-grained space, less attention has been given to a finer granularity. In this work, we introduce a hybrid-grained feature interaction selection approach that targets both feature field and feature value for deep sparse networks. To explore such expansive space, we propose a decomposed space which is calculated on the fly. We then develop a selection algorithm called OptFeature, which efficiently selects the feature interaction from both the feature field and the feature value simultaneously. Results from experiments on three large real-world benchmark datasets demonstrate that OptFeature performs well in terms of accuracy and efficiency. Additional studies support the feasibility of our method. All source code is publicly available at https://anonymous.4open.science/r/OptFeature-Anonymous.
Teacher-Student Architecture for Knowledge Distillation: A Survey
Chengming Hu
Xuan Li
Danyang Liu
Haolun Wu
Xi Chen
Ju Wang
Although deep neural networks (DNNs) have shown a strong capacity to solve large-scale problems in many areas, such DNNs are hard to deploy in real-world systems due to their voluminous parameters. To tackle this issue, Teacher-Student architectures were proposed, where simple student networks with a few parameters can achieve comparable performance to deep teacher networks with many parameters. Recently, Teacher-Student architectures have been effectively and widely embraced on various knowledge distillation (KD) objectives, including knowledge compression, knowledge expansion, knowledge adaptation, and knowledge enhancement. With the help of Teacher-Student architectures, current studies are able to achieve multiple distillation objectives through lightweight and generalized student networks. Different from existing KD surveys that primarily focus on knowledge compression, this survey first explores Teacher-Student architectures across multiple distillation objectives. This survey presents an introduction to various knowledge representations and their corresponding optimization objectives. Additionally, we provide a systematic overview of Teacher-Student architectures with representative learning algorithms and effective distillation schemes. This survey also summarizes recent applications of Teacher-Student architectures across multiple purposes, including classification, recognition, generation, ranking, and regression. Lastly, potential research directions in KD are investigated, focusing on architecture design, knowledge quality, and theoretical studies of regression-based learning, respectively. Through this comprehensive survey, industry practitioners and the academic community can gain valuable insights and guidelines for effectively designing, learning, and applying Teacher-Student architectures on various distillation objectives.