
Xue (Steve) Liu

Associate Academic Member
Full Professor, McGill University, School of Computer Science
Vice-President, Research and Development, Chief Scientist and Co-Director, Samsung's Montreal AI Center
Research Topics
Deep Learning

Biography

Xue (Steve) Liu is a Full Professor in the School of Computer Science at McGill University, as well as Vice-President of Research and Development, Chief Scientist, and Co-Director of Samsung's AI Center in Montreal. He also holds a William Dawson Scholar appointment (full professor) at McGill University and a courtesy appointment as Professor of Mathematics and Statistics at the same institution. He was previously Chief Scientist at Tinder Inc., where he led research and innovation for the world's largest dating and social discovery app, valued at more than US$10 billion.

Prof. Liu is a Fellow of the IEEE and an Associate Member of Mila – Quebec Artificial Intelligence Institute. At McGill University, he is also an associate member of the Centre for Intelligent Machines (CIM) and the Centre for Advanced Systems and Technologies in Communications (SYTACom). His honours include the 2017 Mitacs Award for Exceptional Leadership Among Faculty, the 2014 Outstanding Young Canadian Computer Science Researcher Prize from the Canadian Association of Computer Science, and the Tomlinson Scientist Award recognizing scientific excellence and leadership at McGill University. He directs McGill University's Cyber-Physical Intelligence Lab, which he founded in 2007. He has also worked briefly as the Samuel R. Thompson Associate Professor in the Department of Computer Science and Engineering at the University of Nebraska-Lincoln, at Hewlett-Packard Labs in Palo Alto, California, and at IBM's T. J. Watson Research Center in New York.

Current Students

Collaborating Alumni - McGill
Co-supervisor:
PhD - McGill
PhD - McGill
Co-supervisor:
PhD - McGill
PhD - McGill
PhD - McGill
Master's Research - McGill
PhD - McGill
PhD - McGill
PhD - McGill
Master's Research - McGill
PhD - McGill
Co-supervisor:
Master's Research - McGill
PhD - McGill
PhD - McGill

Publications

How to Train Your LLM Web Agent: A Statistical Diagnosis
Santhoshi Ravichandran
Hadi Nekoei
Nicolas Gontier
Miguel Muñoz-Mármol
Stefania Raimondo
Alexandre Piché
Alexandre Lacoste
Massimo Caccia
Large language model (LLM) agents for web interfaces have advanced rapidly, yet open-source systems still lag behind proprietary agents. Bridging this gap is key to enabling customizable, efficient, and privacy-preserving agents. Two challenges hinder progress: the reproducibility issues in RL and LLM agent training, where results often depend on sensitive factors like seeds and decoding parameters, and the focus of prior work on single-step tasks, overlooking the complexities of web-based, multi-step decision-making. We address these gaps by providing a statistically driven study of training LLM agents for web tasks. Our two-stage pipeline combines imitation learning from a Llama 3.3 70B teacher with on-policy fine-tuning via Group Relative Policy Optimization (GRPO) on a Llama 3.1 8B student. Through 240 configuration sweeps and rigorous bootstrapping, we chart the first compute allocation curve for open-source LLM web agents. Our findings show that dedicating one-third of compute to teacher traces and the rest to RL improves MiniWoB++ success by 6 points and closes 60% of the gap to GPT-4o on WorkArena, while cutting GPU costs by 45%. We introduce a principled hyperparameter sensitivity analysis, offering actionable guidelines for robust and cost-effective agent training.
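The "rigorous bootstrapping" mentioned above can be illustrated with a minimal sketch: resample per-episode successes with replacement to put a confidence interval around one configuration's success rate. The function and the simulated episode data below are illustrative assumptions, not the paper's evaluation code.

```python
import numpy as np

rng = np.random.default_rng(0)

def bootstrap_success_ci(successes, n_resamples=10_000, alpha=0.05):
    """Bootstrap a confidence interval for an agent's task success rate.

    `successes` is a 0/1 array, one entry per evaluation episode.
    """
    successes = np.asarray(successes, dtype=float)
    means = np.array([
        rng.choice(successes, size=successes.size, replace=True).mean()
        for _ in range(n_resamples)
    ])
    lo, hi = np.quantile(means, [alpha / 2, 1 - alpha / 2])
    return successes.mean(), (lo, hi)

# Illustrative: 100 episodes from one hyperparameter configuration.
episodes = rng.binomial(1, 0.42, size=100)
mean, (lo, hi) = bootstrap_success_ci(episodes)
print(f"success rate {mean:.2f}, 95% CI [{lo:.2f}, {hi:.2f}]")
```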
How to Train Your LLM Web Agent: A Statistical Diagnosis
Santhoshi Ravichandran
Hadi Nekoei
Nicolas Gontier
Miguel Muñoz-Mármol
Stefania Raimondo
Alexandre Piché
Alexandre Lacoste
Massimo Caccia
LLM-based web agents have recently made significant progress, but much of it has occurred in closed-source systems, widening the gap with open-source alternatives. Progress has been held back by two key challenges: first, a narrow focus on single-step tasks that overlooks the complexity of multi-step web interactions; and second, the high compute costs required to post-train LLM-based web agents. To address this, we present the first statistically grounded study on compute allocation for LLM web-agent post-training. Our approach uses a two-stage pipeline, training a Llama 3.1 8B student to imitate a Llama 3.3 70B teacher via supervised fine-tuning (SFT), followed by on-policy reinforcement learning. We find this process highly sensitive to hyperparameter choices, making exhaustive sweeps impractical. To spare others from expensive trial-and-error, we sample 1,370 configurations and use bootstrapping to estimate effective hyperparameters. Our results show that combining SFT with on-policy RL consistently outperforms either approach alone on both WorkArena and MiniWob++. Further, this strategy requires only 55% of the compute to match the peak performance of pure SFT on MiniWob++, effectively pushing the compute-performance Pareto frontier, and is the only strategy that can close the gap with closed-source models.
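Both abstracts name Group Relative Policy Optimization (GRPO) as the on-policy stage. As a rough sketch of the usual formulation (an assumption, not the authors' code), GRPO scores each rollout relative to the other rollouts sampled for the same prompt:

```python
import numpy as np

def grpo_advantages(group_rewards):
    """Group-relative advantages: each trajectory is scored against its own
    sampling group as (r - group mean) / group std.
    `group_rewards` has shape (n_prompts, group_size)."""
    r = np.asarray(group_rewards, dtype=float)
    mean = r.mean(axis=1, keepdims=True)
    std = r.std(axis=1, keepdims=True) + 1e-8  # avoid division by zero
    return (r - mean) / std

# Illustrative: 2 web tasks, 4 rollouts each, binary task-completion rewards.
rewards = [[1, 0, 0, 1], [0, 0, 1, 0]]
print(grpo_advantages(rewards))
```

Because advantages are normalized within each group, no learned value function is needed, which is part of what makes this stage comparatively cheap.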
AIoT Smart Home via Autonomous LLM Agents
Dmitriy Rivkin
Francois Hogan
Amal Feriani
Abhisek Konar
Adam Sigal
The common-sense reasoning abilities and vast general knowledge of large language models (LLMs) make them a natural fit for interpreting user requests in a smart home assistant context. LLMs, however, lack specific knowledge about the user and their home, which limits their potential impact. Smart home agent with grounded execution (SAGE) overcomes these and other limitations by using a scheme in which a user request triggers an LLM-controlled sequence of discrete actions. These actions can be used to retrieve information, interact with the user, or manipulate device states. SAGE controls this process through a dynamically constructed tree of LLM prompts, which help it decide which action to take next, whether an action was successful, and when to terminate the process. The SAGE action set augments an LLM's capabilities to support some of the most critical requirements for a smart home assistant. These include: flexible and scalable user preference management ("Is my team playing tonight?"), access to any smart device's full functionality without device-specific code via API reading ("Turn down the screen brightness on my dryer"), persistent device state monitoring ("Remind me to throw out the milk when I open the fridge"), natural device references using only a photo of the room ("Turn on the lamp on the dresser"), and more. We introduce a benchmark of 50 new and challenging smart home tasks where SAGE achieves a 76% success rate, significantly outperforming existing LLM-enabled baselines (30% success rate).
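The abstract describes SAGE as an LLM-controlled loop over discrete actions that retrieve information, interact with the user, or manipulate device state, terminating when the request is satisfied. The toy control loop below, with a stubbed LLM and a single hypothetical action, only illustrates that shape; none of these names or interfaces come from the paper.

```python
def fake_llm(prompt: str) -> str:
    """Stand-in for the LLM that picks the next discrete action."""
    return "done" if "fridge state: open" in prompt else "check_fridge"

# One hypothetical action; SAGE's real action set includes API reading,
# preference lookup, and persistent device-state monitoring.
ACTIONS = {
    "check_fridge": lambda state: state.update(fridge="open") or "fridge state: open",
}

def run_agent(request: str, max_steps: int = 5) -> list[str]:
    state: dict = {}
    trace: list[str] = []
    observation = "none yet"
    for _ in range(max_steps):
        action = fake_llm(f"request: {request}\nlast observation: {observation}")
        if action == "done":                  # LLM judges the request satisfied
            break
        observation = ACTIONS[action](state)  # execute the chosen action
        trace.append(action)
    return trace

print(run_agent("Remind me to throw out the milk when I open the fridge"))
```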
ParetoFlow: Guided Flows in Multi-Objective Optimization
In offline multi-objective optimization (MOO), we leverage an offline dataset of designs and their associated labels to simultaneously minimize multiple objectives. This setting more closely mirrors complex real-world problems compared to single-objective optimization. Recent works mainly employ evolutionary algorithms and Bayesian optimization, with limited attention given to the generative modeling capabilities inherent in such data. In this study, we explore generative modeling in offline MOO through flow matching, noted for its effectiveness and efficiency. We introduce ParetoFlow, specifically designed to guide flow sampling to approximate the Pareto front. Traditional predictor (classifier) guidance is inadequate for this purpose because it models only a single objective. In response, we propose a multi-objective predictor guidance module that assigns each sample a weight vector, representing a weighted distribution across multiple objective predictions. A local filtering scheme is introduced to address non-convex Pareto fronts. These weights uniformly cover the entire objective space, effectively directing sample generation towards the Pareto front. Since distributions with similar weights tend to generate similar samples, we introduce a neighboring evolution module to foster knowledge sharing among neighboring distributions. This module generates offspring from these distributions, and selects the most promising one for the next iteration. Our method achieves state-of-the-art performance across various tasks.
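The key device in this abstract is a set of weight vectors that uniformly cover the objective space, each scalarizing the multiple objective predictions so that different samples are steered toward different regions of the Pareto front. Below is a small sketch of such a lattice of weights; a Das–Dennis-style simplex construction is a common choice in multi-objective methods, and treating it as ParetoFlow's exact scheme is an assumption.

```python
import itertools
import numpy as np

def simplex_weights(n_obj: int, divisions: int) -> np.ndarray:
    """Weight vectors on the probability simplex (Das–Dennis-style lattice),
    used here to spread guidance uniformly over the objective space."""
    grid = [c for c in itertools.product(range(divisions + 1), repeat=n_obj)
            if sum(c) == divisions]
    return np.array(grid, dtype=float) / divisions

# Illustrative: each weight vector scalarizes two predicted objectives,
# so each sample can be pushed toward its own part of the front.
weights = simplex_weights(n_obj=2, divisions=4)  # 5 vectors on a 2-simplex
preds = np.array([0.3, 0.9])                     # hypothetical objective predictions
print(weights @ preds)                           # one scalarized score per weight
```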
Warmup Generations: A Task-Agnostic Approach for Guiding Sequence-to-Sequence Learning with Unsupervised Initial State Generation
Senyu Li
Jiayi Wang
Pontus Stenetorp
Health satisfaction outcome from integrated autonomous mobile clinics
Yuzhang Huang
Shaoshan Liu
Zhongying Pan
Carl Wu
Herng-Chia Chiu
Leiyu Shi
Autonomous mobile clinics (AMCs) have the potential to revolutionize healthcare delivery by bringing healthcare services to patients at the order of patient's fingertips. Particularly, AMCs can act as an essential touch point of integrated care, which is a worldwide response to the fragmented delivery of health by focusing on more coordinated and integrated forms of care provision. However, the impact of AMCs on the health satisfaction outcome effectiveness still remains unknown. In this article, in collaboration with United Family Healthcare (UFH), we study the potential effectiveness improvement of integrated care delivery through AMCs. Supplementary information: the online version contains supplementary material available at 10.1038/s41598-024-75611-x.
Robust Guided Diffusion for Offline Black-Box Optimization
Can Chen
Christopher Beckham
Zixuan Liu
Offline black-box optimization aims to maximize a black-box function using an offline dataset of designs and their measured properties. Two main approaches have emerged: the forward approach, which learns a mapping from input to its value, thereby acting as a proxy to guide optimization, and the inverse approach, which learns a mapping from value to input for conditional generation. (a) Although proxy-free (classifier-free) diffusion shows promise in robustly modeling the inverse mapping, it lacks explicit guidance from proxies, essential for generating high-performance samples beyond the training distribution. Therefore, we propose proxy-enhanced sampling, which utilizes the explicit guidance from a trained proxy to bolster proxy-free diffusion with enhanced sampling control. (b) Yet, the trained proxy is susceptible to out-of-distribution issues. To address this, we devise the module diffusion-based proxy refinement, which seamlessly integrates insights from proxy-free diffusion back into the proxy for refinement. To sum up, we propose Robust Guided Diffusion for Offline Black-box Optimization (RGD), combining the advantages of proxy (explicit guidance) and proxy-free diffusion (robustness) for effective conditional generation. RGD achieves state-of-the-art results on various design-bench tasks, underscoring its efficacy. Our code is at https://github.com/GGchen1997/RGD.
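In the abstract's terms, proxy-enhanced sampling adds explicit gradient guidance from a trained proxy on top of a proxy-free (classifier-free) diffusion step. The toy update below, with a linear proxy and a simple contraction standing in for the denoiser, is only meant to show the shape of that combination; the actual update rules and scales are in the linked repository.

```python
import numpy as np

rng = np.random.default_rng(0)

def proxy_grad(x, w):
    """Gradient of a toy linear proxy f(x) = w @ x (stand-in for a trained model)."""
    return w

def guided_step(x, denoise, w, guidance_scale=0.1, noise_scale=0.05):
    x = denoise(x)                             # proxy-free (classifier-free) step
    x = x + guidance_scale * proxy_grad(x, w)  # explicit guidance from the proxy
    return x + noise_scale * rng.standard_normal(x.shape)

x = rng.standard_normal(4)
w = np.array([1.0, -0.5, 0.2, 0.0])
for _ in range(10):                            # short reverse-diffusion-style loop
    x = guided_step(x, denoise=lambda z: 0.9 * z, w=w)
print(x)  # sample nudged toward higher proxy scores
```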