Portrait of Mark Coates

Mark Coates

Associate Academic Member
Associate Professor, McGill University, Department of Electrical and Computer Engineering
Research Topics
Dynamical Systems
Graph Neural Networks
Learning on Graphs
Recommender Systems
Representation Learning

Biography

Mark Coates is a professor in the Department of Electrical and Computer Engineering at McGill University, which he joined in 2002. He received his Bachelor of Engineering degree in computer systems engineering from the University of Adelaide, Australia, in 1995 and his PhD degree in information engineering from the University of Cambridge, U.K., in 1999. Coates was formerly a research associate and lecturer at Rice University, Texas (1999–2001) and a senior scientist at Winton Capital Management, Oxford, U.K. (2012–2013).

He has assumed multiple editorial roles, including senior area editor of IEEE Signal Processing Letters, associate editor of IEEE Transactions on Signal Processing, and associate editor of IEEE Transactions on Signal and Information Processing over Networks. His research interests include machine learning and statistical signal processing, Bayesian and Monte Carlo inference, and learning on graphs and networks. His most influential and widely cited contributions have been on the topics of network tomography and distributed particle filtering.

Current Students

PhD - McGill University
Master's Research - McGill University
PhD - McGill University
PhD - McGill University
Co-supervisor :
PhD - McGill University

Publications

Robust and Interpretable Relational Reasoning with Large Language Models and Symbolic Solvers
Ge Zhang
Mohammad Alomrani
Hongjian Gu
Jiaming Zhou
Yaochen Hu
Bin Wang
Qun Liu
Yingxue Zhang
Jianye Hao
Large language models (LLMs) possess vast semantic knowledge but often struggle with complex reasoning tasks, particularly in relational rea… (see more)soning problems such as kinship or spatial reasoning. In this paper, we present Path-of-Thoughts (PoT), a novel framework designed to tackle relation reasoning by decomposing the task into three key stages: graph extraction, path identification, and reasoning. Unlike previous approaches, PoT efficiently extracts a task-agnostic graph that identifies crucial entities, relations, and attributes within the problem context. Subsequently, PoT identifies relevant reasoning chains within the graph corresponding to the posed question, facilitating inference of potential answers. Experimental evaluations on four benchmark datasets, demanding long reasoning chains, demonstrate that PoT surpasses state-of-the-art baselines by a significant margin (maximum 21.3\%) without necessitating fine-tuning or extensive LLM calls. Furthermore, as opposed to prior neuro-symbolic methods, PoT exhibits improved resilience against LLM errors by leveraging the compositional nature of graphs.
One Demo Is All It Takes: Planning Domain Derivation with LLMs from A Single Demonstration
Jinbang Huang
Yixin Xiao
Zhanguang Zhang
Jianye Hao
Yingxue Zhang
Pre-trained Large Language Models (LLMs) have shown promise in solving planning problems but often struggle to ensure plan correctness, espe… (see more)cially for long-horizon tasks. Meanwhile, traditional robotic task and motion planning (TAMP) frameworks address these challenges more reliably by combining high-level symbolic search with low-level motion planning. However, TAMP relies on the availability of planning domains that typically involve substantial manual effort and domain expertise, limiting its generalizability. We introduce Planning Domain Derivation with LLMs (PDDLLM), a novel approach that combines simulated physical interaction with LLM reasoning to improve planning performance. The method reduces reliance on humans by inferring planning domains from a single annotated task-execution demonstration. Unlike prior domain-inference methods that rely on partially predefined or language descriptions of planning domains, PDDLLM constructs domains entirely from scratch and automatically integrates them with low-level motion planning skills, enabling fully automated long-horizon planning. PDDLLM is evaluated on over 1,200 diverse tasks spanning nine environments and benchmarked against six LLM-based planning baselines, demonstrating superior planning performance, lower token costs, and successful deployment on multiple robot platforms.
Half Search Space is All You Need
Pavel Rumiantsev
Half Search Space is All You Need
Pavel Rumiantsev
SKOLR: Structured Koopman Operator Linear RNN for Time-Series Forecasting
Yitian Zhang
Liheng Ma
Antonios Valkanas
Boris Oreshkin
Koopman operator theory provides a framework for nonlinear dynamical system analysis and time-series forecasting by mapping dynamics to a sp… (see more)ace of real-valued measurement functions, enabling a linear operator representation. Despite the advantage of linearity, the operator is generally infinite-dimensional. Therefore, the objective is to learn measurement functions that yield a tractable finite-dimensional Koopman operator approximation. In this work, we establish a connection between Koopman operator approximation and linear Recurrent Neural Networks (RNNs), which have recently demonstrated remarkable success in sequence modeling. We show that by considering an extended state consisting of lagged observations, we can establish an equivalence between a structured Koopman operator and linear RNN updates. Building on this connection, we present SKOLR, which integrates a learnable spectral decomposition of the input signal with a multilayer perceptron (MLP) as the measurement functions and implements a structured Koopman operator via a highly parallel linear RNN stack. Numerical experiments on various forecasting benchmarks and dynamical systems show that this streamlined, Koopman-theory-based design delivers exceptional performance. Our code is available at: https://github.com/networkslab/SKOLR.
When to retrain a machine learning model
Florence Regol
Leo Schwinn
Kyle Sprague
Thomas Markovich
A significant challenge in maintaining real-world machine learning models is responding to the continuous and unpredictable evolution of dat… (see more)a. Most practitioners are faced with the difficult question: when should I retrain or update my machine learning model? This seemingly straightforward problem is particularly challenging for three reasons: 1) decisions must be made based on very limited information - we usually have access to only a few examples, 2) the nature, extent, and impact of the distribution shift are unknown, and 3) it involves specifying a cost ratio between retraining and poor performance, which can be hard to characterize. Existing works address certain aspects of this problem, but none offer a comprehensive solution. Distribution shift detection falls short as it cannot account for the cost trade-off; the scarcity of the data, paired with its unusual structure, makes it a poor fit for existing offline reinforcement learning methods, and the online learning formulation overlooks key practical considerations. To address this, we present a principled formulation of the retraining problem and propose an uncertainty-based method that makes decisions by continually forecasting the evolution of model performance evaluated with a bounded metric. Our experiments, addressing classification tasks, show that the method consistently outperforms existing baselines on 7 datasets. We thoroughly assess its robustness to varying cost trade-off values and mis-specified cost trade-offs.
Sparse Decomposition of Graph Neural Networks
Yaochen Hu
Mai Zeng
Ge Zhang
Pavel Rumiantsev
Liheng Ma
Yingxue Zhang
Refining Answer Distributions for Improved Large Language Model Reasoning
Soumyasundar Pal
Didier Chételat
Yingxue Zhang
Large Language Models (LLMs) have exhibited an impressive capability to perform reasoning tasks, especially if they are encouraged to genera… (see more)te a sequence of intermediate steps. Reasoning performance can be improved by suitably combining multiple LLM responses, generated either in parallel in a single query, or via sequential interactions with LLMs throughout the reasoning process. Existing strategies for combination, such as self-consistency and progressive-hint-prompting, make inefficient usage of the LLM responses. We present Refined Answer Distributions, a novel and principled algorithmic framework to enhance the reasoning capabilities of LLMs. Our approach can be viewed as an iterative sampling strategy for forming a Monte Carlo approximation of an underlying distribution of answers, with the goal of identifying the mode --- the most likely answer. Empirical evaluation on several reasoning benchmarks demonstrates the superiority of the proposed approach.
InnerThoughts: Disentangling Representations and Predictions in Large Language Models
Didier Chételat
Joseph Cotnareanu
Rylee Thompson
Yingxue Zhang
Large language models (LLMs) contain substantial factual knowledge which is commonly elicited by multiple-choice question-answering prompts.… (see more) Internally, such models process the prompt through multiple transformer layers, building varying representations of the problem within its hidden states. Ultimately, however, only the hidden state corresponding to the final layer and token position is used to predict the answer label. In this work, we propose instead to learn a small separate neural network predictor module on a collection of training questions, that take the hidden states from all the layers at the last temporal position as input and outputs predictions. In effect, such a framework disentangles the representational abilities of LLMs from their predictive abilities. On a collection of hard benchmarks, our method achieves considerable improvements in performance, sometimes comparable to supervised fine-tuning procedures, but at a fraction of the computational cost.
MODL: Multilearner Online Deep Learning
Antonios Valkanas
Boris Oreshkin
Online deep learning solves the problem of learning from streams of data, reconciling two opposing objectives: learn fast and learn deep. Ex… (see more)isting work focuses almost exclusively on exploring pure deep learning solutions, which are much better suited to handle the"deep"than the"fast"part of the online learning equation. In our work, we propose a different paradigm, based on a hybrid multilearner approach. First, we develop a fast online logistic regression learner. This learner does not rely on backpropagation. Instead, it uses closed form recursive updates of model parameters, handling the fast learning part of the online learning problem. We then analyze the existing online deep learning theory and show that the widespread ODL approach, currently operating at complexity
Personalized Negative Reservoir for Incremental Learning in Recommender Systems
Antonios Valkanas
Yuening Wang
Yingxue Zhang
Variation Matters: from Mitigating to Embracing Zero-Shot NAS Ranking Function Variation
Pavel Rumiantsev
Neural Architecture Search (NAS) is a powerful automatic alternative to manual design of a neural network. In the zero-shot version, a fast … (see more)ranking function is used to compare architectures without training them. The outputs of the ranking functions often vary significantly due to different sources of randomness, including the evaluated architecture's weights' initialization or the batch of data used for calculations. A common approach to addressing the variation is to average a ranking function output over several evaluations. We propose taking into account the variation in a different manner, by viewing the ranking function output as a random variable representing a proxy performance metric. During the search process, we strive to construct a stochastic ordering of the performance metrics to determine the best architecture. Our experiments show that the proposed stochastic ordering can effectively boost performance of a search on standard benchmark search spaces.