Portrait de Mark Coates

Mark Coates

Membre académique associé
Professeur agrégé, McGill University, Département de génie électrique et informatique
Sujets de recherche
Apprentissage de représentations
Apprentissage sur graphes
Réseaux de neurones en graphes
Systèmes de recommandation
Systèmes dynamiques

Biographie

Mark Coates est professeur au Département de génie électrique et informatique de l'Université McGill, auquel il s’est joint en 2002. Il a obtenu une licence en génie des systèmes informatiques de l'Université d'Adélaïde (Australie) en 1995 et un doctorat en génie de l'information de l'Université de Cambridge (Royaume-Uni) en 1999. Il a été associé de recherche et conférencier à l'Université Rice, au Texas, de 1999 à 2001. En 2012-2013, il a travaillé en tant que scientifique principal chez Winton Capital Management à Oxford, au Royaume-Uni. Il a assumé de multiples rôles éditoriaux, notamment en tant que rédacteur principal pour IEEE Signal Processing Letters, rédacteur associé pour IEEE Transactions on Signal Processing et rédacteur associé pour IEEE Transactions on Signal and Information Processing over Networks. Les recherches de Mark Coates portent sur l'apprentissage automatique et le traitement statistique des signaux, l'inférence bayésienne et Monte Carlo, et l'apprentissage sur les graphes et les réseaux. Ses contributions les plus influentes et les plus citées concernent la tomographie des réseaux et le filtrage distribué des particules.

Étudiants actuels

Publications

Sparse Decomposition of Graph Neural Networks
Yaochen Hu
Mai Zeng
Ge Zhang
Pavel Rumiantsev
Liheng Ma
Yingxue Zhang
Refining Answer Distributions for Improved Large Language Model Reasoning
Soumyasundar Pal
Didier Chételat
Yingxue Zhang
Large Language Models (LLMs) have exhibited an impressive capability to perform reasoning tasks, especially if they are encouraged to genera… (voir plus)te a sequence of intermediate steps. Reasoning performance can be improved by suitably combining multiple LLM responses, generated either in parallel in a single query, or via sequential interactions with LLMs throughout the reasoning process. Existing strategies for combination, such as self-consistency and progressive-hint-prompting, make inefficient usage of the LLM responses. We present Refined Answer Distributions, a novel and principled algorithmic framework to enhance the reasoning capabilities of LLMs. Our approach can be viewed as an iterative sampling strategy for forming a Monte Carlo approximation of an underlying distribution of answers, with the goal of identifying the mode --- the most likely answer. Empirical evaluation on several reasoning benchmarks demonstrates the superiority of the proposed approach.
Variation Matters: from Mitigating to Embracing Zero-Shot NAS Ranking Function Variation
Pavel Rumiantsev
Neural Architecture Search (NAS) is a powerful automatic alternative to manual design of a neural network. In the zero-shot version, a fast … (voir plus)ranking function is used to compare architectures without training them. The outputs of the ranking functions often vary significantly due to different sources of randomness, including the evaluated architecture's weights' initialization or the batch of data used for calculations. A common approach to addressing the variation is to average a ranking function output over several evaluations. We propose taking into account the variation in a different manner, by viewing the ranking function output as a random variable representing a proxy performance metric. During the search process, we strive to construct a stochastic ordering of the performance metrics to determine the best architecture. Our experiments show that the proposed stochastic ordering can effectively boost performance of a search on standard benchmark search spaces.
Personalized Negative Reservoir for Incremental Learning in Recommender Systems
Antonios Valkanas
Yuening Wang
Yingxue Zhang
InnerThoughts: Disentangling Representations and Predictions in Large Language Models
Didier Chételat
Joseph Cotnareanu
Rylee Thompson
Yingxue Zhang
Large language models (LLMs) contain substantial factual knowledge which is commonly elicited by multiple-choice question-answering prompts.… (voir plus) Internally, such models process the prompt through multiple transformer layers, building varying representations of the problem within its hidden states. Ultimately, however, only the hidden state corresponding to the final layer and token position is used to predict the answer label. In this work, we propose instead to learn a small separate neural network predictor module on a collection of training questions, that take the hidden states from all the layers at the last temporal position as input and outputs predictions. In effect, such a framework disentangles the representational abilities of LLMs from their predictive abilities. On a collection of hard benchmarks, our method achieves considerable improvements in performance, sometimes comparable to supervised fine-tuning procedures, but at a fraction of the computational cost.
MODL: Multilearner Online Deep Learning
Antonios Valkanas
Boris Oreshkin
Online deep learning solves the problem of learning from streams of data, reconciling two opposing objectives: learn fast and learn deep. Ex… (voir plus)isting work focuses almost exclusively on exploring pure deep learning solutions, which are much better suited to handle the"deep"than the"fast"part of the online learning equation. In our work, we propose a different paradigm, based on a hybrid multilearner approach. First, we develop a fast online logistic regression learner. This learner does not rely on backpropagation. Instead, it uses closed form recursive updates of model parameters, handling the fast learning part of the online learning problem. We then analyze the existing online deep learning theory and show that the widespread ODL approach, currently operating at complexity
Path-of-Thoughts: Extracting and Following Paths for Robust Relational Reasoning with Large Language Models
Ge Zhang
Mohammad Alomrani
Hongjian Gu
Jiaming Zhou
Yaochen Hu
Bin Wang
Qun Liu
Yingxue Zhang
Jianye Hao
Large language models (LLMs) possess vast semantic knowledge but often struggle with complex reasoning tasks, particularly in relational rea… (voir plus)soning problems such as kinship or spatial reasoning. In this paper, we present Path-of-Thoughts (PoT), a novel framework designed to tackle relation reasoning by decomposing the task into three key stages: graph extraction, path identification, and reasoning. Unlike previous approaches, PoT efficiently extracts a task-agnostic graph that identifies crucial entities, relations, and attributes within the problem context. Subsequently, PoT identifies relevant reasoning chains within the graph corresponding to the posed question, facilitating inference of potential answers. Experimental evaluations on four benchmark datasets, demanding long reasoning chains, demonstrate that PoT surpasses state-of-the-art baselines by a significant margin (maximum 21.3%) without necessitating fine-tuning or extensive LLM calls. Furthermore, as opposed to prior neuro-symbolic methods, PoT exhibits improved resilience against LLM errors by leveraging the compositional nature of graphs.
Path-of-Thoughts: Extracting and Following Paths for Robust Relational Reasoning with Large Language Models
Ge Zhang
Mohammad Alomrani
Hongjian Gu
Jiaming Zhou
Yaochen Hu
Bin Wang
Qun Liu
Yingxue Zhang
Jianye Hao
Large language models (LLMs) possess vast semantic knowledge but often struggle with complex reasoning tasks, particularly in relational rea… (voir plus)soning problems such as kinship or spatial reasoning. In this paper, we present Path-of-Thoughts (PoT), a novel framework designed to tackle relation reasoning by decomposing the task into three key stages: graph extraction, path identification, and reasoning. Unlike previous approaches, PoT efficiently extracts a task-agnostic graph that identifies crucial entities, relations, and attributes within the problem context. Subsequently, PoT identifies relevant reasoning chains within the graph corresponding to the posed question, facilitating inference of potential answers. Experimental evaluations on four benchmark datasets, demanding long reasoning chains, demonstrate that PoT surpasses state-of-the-art baselines by a significant margin (maximum 21.3%) without necessitating fine-tuning or extensive LLM calls. Furthermore, as opposed to prior neuro-symbolic methods, PoT exhibits improved resilience against LLM errors by leveraging the compositional nature of graphs.
Hint Marginalization for Improved Reasoning in Large Language Models
Soumyasundar Pal
Didier Ch'etelat
Yingxue Zhang
Large Language Models (LLMs) have exhibited an impressive capability to perform reasoning tasks, especially if they are encouraged to genera… (voir plus)te a sequence of intermediate steps. Reasoning performance can be improved by suitably combining multiple LLM responses, generated either in parallel in a single query, or via sequential interactions with LLMs throughout the reasoning process. Existing strategies for combination, such as self-consistency and progressive-hint-prompting, make inefficient usage of the LLM responses. We present Hint Marginalization, a novel and principled algorithmic framework to enhance the reasoning capabilities of LLMs. Our approach can be viewed as an iterative sampling strategy for forming a Monte Carlo approximation of an underlying distribution of answers, with the goal of identifying the mode the most likely answer. Empirical evaluation on several benchmark datasets for arithmetic reasoning demonstrates the superiority of the proposed approach.
Hint Marginalization for Improved Reasoning in Large Language Models
Soumyasundar Pal
Didier Ch'etelat
Yingxue Zhang
Large Language Models (LLMs) have exhibited an impressive capability to perform reasoning tasks, especially if they are encouraged to genera… (voir plus)te a sequence of intermediate steps. Reasoning performance can be improved by suitably combining multiple LLM responses, generated either in parallel in a single query, or via sequential interactions with LLMs throughout the reasoning process. Existing strategies for combination, such as self-consistency and progressive-hint-prompting, make inefficient usage of the LLM responses. We present Hint Marginalization, a novel and principled algorithmic framework to enhance the reasoning capabilities of LLMs. Our approach can be viewed as an iterative sampling strategy for forming a Monte Carlo approximation of an underlying distribution of answers, with the goal of identifying the mode the most likely answer. Empirical evaluation on several benchmark datasets for arithmetic reasoning demonstrates the superiority of the proposed approach.
Graph Knowledge Distillation to Mixture of Experts
Pavel Rumiantsev
HardCore Generation: Generating Hard UNSAT Problems for Data Augmentation
Joseph Cotnareanu
Zhanguang Zhang
Hui-Ling Zhen
Yingxue Zhang
Efficiently determining the satisfiability of a boolean equation --- known as the SAT problem for brevity --- is crucial in various industri… (voir plus)al problems. Recently, the advent of deep learning methods has introduced significant potential for enhancing SAT solving. However, a major barrier to the advancement of this field has been the scarcity of large, realistic datasets. The majority of current public datasets are either randomly generated or extremely limited, containing only a few examples from unrelated problem families. These datasets are inadequate for meaningful training of deep learning methods. In light of this, researchers have started exploring generative techniques to create data that more accurately reflect SAT problems encountered in practical situations. These methods have so far suffered from either the inability to produce challenging SAT problems or time-scalability obstacles. In this paper we address both by identifying and manipulating the key contributors to a problem's ``hardness'', known as cores. Although some previous work has addressed cores, the time costs are unacceptably high due to the expense of traditional heuristic core detection techniques. We introduce a fast core detection procedure that uses a graph neural network. Our empirical results demonstrate that we can efficiently generate problems that remain hard to solve and retain key attributes of the original example problems. We show via experiment that the generated synthetic SAT problems can be used in a data augmentation setting to provide improved prediction of solver runtimes.