
Mark Coates

Associate Academic Member
Associate Professor, McGill University, Department of Electrical and Computer Engineering
Research Topics
Dynamical Systems
Graph Neural Networks
Learning on Graphs
Recommender Systems
Representation Learning

Biography

Mark Coates is a professor in the Department of Electrical and Computer Engineering at McGill University, which he joined in 2002. He received his Bachelor of Engineering degree in computer systems engineering from the University of Adelaide, Australia, in 1995 and his PhD degree in information engineering from the University of Cambridge, U.K., in 1999. Coates was formerly a research associate and lecturer at Rice University, Texas (1999–2001) and a senior scientist at Winton Capital Management, Oxford, U.K. (2012–2013).

He has assumed multiple editorial roles, including senior area editor of IEEE Signal Processing Letters, associate editor of IEEE Transactions on Signal Processing, and associate editor of IEEE Transactions on Signal and Information Processing over Networks. His research interests include machine learning and statistical signal processing, Bayesian and Monte Carlo inference, and learning on graphs and networks. His most influential and widely cited contributions have been on the topics of network tomography and distributed particle filtering.

Current Students

PhD - McGill University
Master's Research - McGill University
PhD - McGill University
PhD - McGill University

Publications

DyG2Vec: Efficient Representation Learning for Dynamic Graphs
Mohammad Alomrani
Mahdi Biparva
Yingxue Zhang
Temporal graph neural networks have shown promising results in learning inductive representations by automatically extracting temporal patterns. However, previous works often rely on complex memory modules or inefficient random walk methods to construct temporal representations. To address these limitations, we present an efficient yet effective attention-based encoder that leverages temporal edge encodings and window-based subgraph sampling to generate task-agnostic embeddings. Moreover, we propose a joint-embedding architecture using non-contrastive SSL to learn rich temporal embeddings without labels. Experimental results on 7 benchmark datasets indicate that on average, our model outperforms SoTA baselines on the future link prediction task by 4.23% for the transductive setting and 3.30% for the inductive setting while only requiring 5-10x less training/inference time. Lastly, different aspects of the proposed framework are investigated through experimental analysis and ablation studies. The code is publicly available at https://github.com/huawei-noah/noah-research/tree/master/graph_atlas.
Enhancing Click-through Rate Prediction in Recommendation Domain with Search Query Representation
Yuening Wang
Man Chen
Yaochen Hu
Wei Guo
Yingxue Zhang
Huifeng Guo
Yong Liu
Population Monte Carlo With Normalizing Flow
Soumyasundar Pal
Antonios Valkanas
Adaptive importance sampling (AIS) methods provide a useful alternative to Markov Chain Monte Carlo (MCMC) algorithms for performing inference of intractable distributions. Population Monte Carlo (PMC) algorithms constitute a family of AIS approaches which adapt the proposal distributions iteratively to improve the approximation of the target distribution. Recent work in this area primarily focuses on ameliorating the proposal adaptation procedure for high-dimensional applications. However, most of the AIS algorithms use simple proposal distributions for sampling, which might be inadequate in exploring target distributions with intricate geometries. In this work, we construct expressive proposal distributions in the AIS framework using normalizing flow, an appealing approach for modeling complex distributions. We use an iterative parameter update rule to enhance the approximation of the target distribution. Numerical experiments show that in high-dimensional settings, the proposed algorithm offers significantly improved performance compared to the existing techniques.
Multi-resolution Time-Series Transformer for Long-term Forecasting
Yitian Zhang
Liheng Ma
Soumyasundar Pal
Yingxue Zhang
Interacting Diffusion Processes for Event Sequence Forecasting
Mai Zeng
Florence Regol
Neural Temporal Point Processes (TPPs) have emerged as the primary framework for predicting sequences of events that occur at irregular time intervals, but their sequential nature can hamper performance for long-horizon forecasts. To address this, we introduce a novel approach that incorporates a diffusion generative model. The model facilitates sequence-to-sequence prediction, allowing multi-step predictions based on historical event sequences. In contrast to previous approaches, our model directly learns the joint probability distribution of types and inter-arrival times for multiple events. This allows us to fully leverage the high dimensional modeling capability of modern generative models. Our model is composed of two diffusion processes, one for the time intervals and one for the event types. These processes interact through their respective denoising functions, which can take as input intermediate representations from both processes, allowing the model to learn complex interactions. We demonstrate that our proposal outperforms state-of-the-art baselines for long-horizon forecasting of TPP.
Jointly-Learned Exit and Inference for a Dynamic Neural Network : JEI-DNN
Florence Regol
Joud Chataoui
Substituting Data Annotation with Balanced Updates and Collective Loss in Multi-label Text Classification
Muberra Ozmen
Joseph Cotnareanu
Multi-label text classification (MLTC) is the task of assigning multiple labels to a given text, and has a wide range of application domains. Most existing approaches require an enormous amount of annotated data to learn a classifier and/or a set of well-defined constraints on the label space structure, such as hierarchical relations which may be complicated to provide as the number of labels increases. In this paper, we study the MLTC problem in annotation-free and scarce-annotation settings in which the magnitude of available supervision signals is linear to the number of labels. Our method follows three steps, (1) mapping input text into a set of preliminary label likelihoods by natural language inference using a pre-trained language model, (2) calculating a signed label dependency graph by label descriptions, and (3) updating the preliminary label likelihoods with message passing along the label dependency graph, driven with a collective loss function that injects the information of expected label frequency and average multi-label cardinality of predictions. The experiments show that the proposed framework achieves effective performance under low supervision settings with almost imperceptible computational and memory overheads added to the usage of pre-trained language model outperforming its initial performance by 70% in terms of example-based F1 score.
Neural Graph Generation from Graph Statistics
Kiarash Zahirnia
Yaochen Hu
Oliver Schulte
Motion In-Betweening via Deep Δ-Interpolator
Boris Oreshkin
Antonios Valkanas
Félix Harvey
Louis-Simon Ménard
Florent Bocquelet
We show that the task of synthesizing human motion conditioned on a set of key frames can be solved more accurately and effectively if a deep learning based interpolator operates in the delta mode using the spherical linear interpolator as a baseline. We empirically demonstrate the strength of our approach on publicly available datasets achieving state-of-the-art performance. We further generalize these results by showing that the
Bidirectional Learning for Offline Model-based Biological Sequence Design
Can Chen
Yingxue Zhang
Evaluation of Categorical Generative Models - Bridging the Gap Between Real and Synthetic Data
Florence Regol
Anja Kroon
The machine learning community has mainly relied on real data to benchmark algorithms as it provides compelling evidence of model applicability. Evaluation on synthetic datasets can be a powerful tool to provide a better understanding of a model’s strengths, weaknesses and overall capabilities. Gaining these insights can be particularly important for generative modeling as the target quantity is completely unknown. Multiple issues related to the evaluation of generative models have been reported in the literature. We argue those problems can be avoided by an evaluation based on ground truth. General criticisms of synthetic experiments are that they are too simplified and not representative of practical scenarios. As such, our experimental setting is tailored to a realistic generative task. We focus on categorical data and introduce an appropriately scalable evaluation method. Our method involves tasking a generative model to learn a distribution in a high-dimensional setting. We then successively bin the large space to obtain smaller probability spaces where meaningful statistical tests can be applied. We consider increasingly large probability spaces, which correspond to increasingly difficult modeling tasks, and compare the generative models based on the highest task difficulty they can reach before being detected as being too far from the ground truth. We validate our evaluation procedure with synthetic experiments on both synthetic generative models and current state-of-the-art categorical generative models.