Portrait of Mark Coates

Mark Coates

Associate Academic Member
Associate Professor, McGill University, Department of Electrical and Computer Engineering
Research Topics
Dynamical Systems
Graph Neural Networks
Learning on Graphs
Recommender Systems
Representation Learning

Biography

Mark Coates is a professor in the Department of Electrical and Computer Engineering at McGill University, which he joined in 2002. He received his Bachelor of Engineering degree in computer systems engineering from the University of Adelaide, Australia, in 1995 and his PhD degree in information engineering from the University of Cambridge, U.K., in 1999. Coates was formerly a research associate and lecturer at Rice University, Texas (1999–2001) and a senior scientist at Winton Capital Management, Oxford, U.K. (2012–2013).

He has assumed multiple editorial roles, including senior area editor of IEEE Signal Processing Letters, associate editor of IEEE Transactions on Signal Processing, and associate editor of IEEE Transactions on Signal and Information Processing over Networks. His research interests include machine learning and statistical signal processing, Bayesian and Monte Carlo inference, and learning on graphs and networks. His most influential and widely cited contributions have been on the topics of network tomography and distributed particle filtering.

Current Students

PhD - McGill University
Master's Research - McGill University
PhD - McGill University
PhD - McGill University

Publications

CKGConv: General Graph Convolution with Continuous Kernels
Liheng Ma
Soumyasundar Pal
Yitian Zhang
Jiaming Zhou
Yingxue Zhang
The existing definitions of graph convolution, either from spatial or spectral perspectives, are inflexible and not unified. Defining a gene… (see more)ral convolution operator in the graph domain is challenging due to the lack of canonical coordinates, the presence of irregular structures, and the properties of graph symmetries. In this work, we propose a novel and general graph convolution framework by parameterizing the kernels as continuous functions of pseudo-coordinates derived via graph positional encoding. We name this Continuous Kernel Graph Convolution (CKGConv). Theoretically, we demonstrate that CKGConv is flexible and expressive. CKGConv encompasses many existing graph convolutions, and exhibits a stronger expressiveness, as powerful as graph transformers in terms of distinguishing non-isomorphic graphs. Empirically, we show that CKGConv-based Networks outperform existing graph convolutional networks and perform comparably to the best graph transformers across a variety of graph datasets. The code and models are publicly available at https://github.com/networkslab/CKGConv.
Interacting Diffusion Processes for Event Sequence Forecasting
Mai Zeng
Florence Regol
Neural Temporal Point Processes (TPPs) have emerged as the primary framework for predicting sequences of events that occur at irregular time… (see more) intervals, but their sequential nature can hamper performance for long-horizon forecasts. To address this, we introduce a novel approach that incorporates a diffusion generative model. The model facilitates sequence-to-sequence prediction, allowing multi-step predictions based on historical event sequences. In contrast to previous approaches, our model directly learns the joint probability distribution of types and inter-arrival times for multiple events. The model is composed of two diffusion processes, one for the time intervals and one for the event types. These processes interact through their respective denoising functions, which can take as input intermediate representations from both processes, allowing the model to learn complex interactions. We demonstrate that our proposal outperforms state-of-the-art baselines for long-horizon forecasting of TPPs.
MODL: Multilearner Online Deep Learning
Antonios Valkanas
Boris Oreshkin
Online deep learning solves the problem of learning from streams of data, reconciling two opposing objectives: learn fast and learn deep. Ex… (see more)isting work focuses almost exclusively on exploring pure deep learning solutions, which are much better suited to handle the"deep"than the"fast"part of the online learning equation. In our work, we propose a different paradigm, based on a hybrid multilearner approach. First, we develop a fast online logistic regression learner. This learner does not rely on backpropagation. Instead, it uses closed form recursive updates of model parameters, handling the fast learning part of the online learning problem. We then analyze the existing online deep learning theory and show that the widespread ODL approach, currently operating at complexity
CKGConv: General Graph Convolution with Continuous Kernels
Liheng Ma
Soumyasundar Pal
Yitian Zhang
Jiaming Zhou
Yingxue Zhang
The existing definitions of graph convolution, either from spatial or spectral perspectives, are inflexible and not unified. Defining a gene… (see more)ral convolution operator in the graph domain is challenging due to the lack of canonical coordinates, the presence of irregular structures, and the properties of graph symmetries. In this work, we propose a novel graph convolution framework by parameterizing the kernels as continuous functions of pseudo-coordinates derived via graph positional encoding. We name this Continuous Kernel Graph Convolution (CKGConv). Theoretically, we demonstrate that CKGConv is flexible and expressive. CKGConv encompasses many existing graph convolutions, and exhibits the same expressiveness as graph transformers in terms of distinguishing non-isomorphic graphs. Empirically, we show that CKGConv-based Networks outperform existing graph convolutional networks and perform comparably to the best graph transformers across a variety of graph datasets.
Categorical Generative Model Evaluation via Synthetic Distribution Coarsening
Florence Regol
As we expect to see a rapid integration of generative models in our day to day lives, the development of rigorous methods of evaluation and … (see more)analysis for generative models has never been more pressing. Multiple works have highlighted the shortcomings of widely used metrics and exposed how they fail to behave as expected in some settings. So far, the response has been to use a variety of metrics that target different desirable and interpretable properties such as fidelity, diversity, and authenticity, to obtain a clearer picture of a generative model’s capabilities. These methods mainly focus on ordinal data and they all suffer from the same unavoidable issues stemming from estimating quantities of high-dimensional data from a limited number of samples. We propose to take an alternative approach and to return to the synthetic data setting where the ground truth is explicit and known. We focus on nominal categorical data and introduce an evaluation method that can scale to the high-dimensional settings often encountered in practice. Our method involves successively binning the large space to obtain smaller probability spaces and coarser distributions where meaningful statistical estimates can be obtained. This allows us to provide probabilistic guarantees and sample complexities and we illustrate how our method can be applied to distinguish between the capabilities of several state-of-the-art categorical models.
Multi-resolution Time-Series Transformer for Long-term Forecasting
Yitian Zhang
Liheng Ma
Soumyasundar Pal
Yingxue Zhang
Personalized Negative Reservoir for Incremental Learning in Recommender Systems
Antonios Valkanas
Yuening Wang
Yingxue Zhang
Jointly-Learned Exit and Inference for a Dynamic Neural Network
Florence Regol
Joud Chataoui
DyG2Vec: Efficient Representation Learning for Dynamic Graphs
Mohammad Alomrani
Mahdi Biparva
Yingxue Zhang
Temporal graph neural networks have shown promising results in learning inductive representations by automatically extracting temporal patte… (see more)rns. However, previous works often rely on complex memory modules or inefficient random walk methods to construct temporal representations. To address these limitations, we present an efficient yet effective attention-based encoder that leverages temporal edge encodings and window-based subgraph sampling to generate task-agnostic embeddings. Moreover, we propose a joint-embedding architecture using non-contrastive SSL to learn rich temporal embeddings without labels. Experimental results on 7 benchmark datasets indicate that on average, our model outperforms SoTA baselines on the future link prediction task by 4.23% for the transductive setting and 3.30% for the inductive setting while only requiring 5-10x less training/inference time. Lastly, different aspects of the proposed framework are investigated through experimental analysis and ablation studies. The code is publicly available at https://github.com/huawei-noah/noah-research/tree/master/graph_atlas.
Population Monte Carlo With Normalizing Flow
Soumyasundar Pal
Antonios Valkanas
Adaptive importance sampling (AIS) methods provide a useful alternative to Markov Chain Monte Carlo (MCMC) algorithms for performing inferen… (see more)ce of intractable distributions. Population Monte Carlo (PMC) algorithms constitute a family of AIS approaches which adapt the proposal distributions iteratively to improve the approximation of the target distribution. Recent work in this area primarily focuses on ameliorating the proposal adaptation procedure for high-dimensional applications. However, most of the AIS algorithms use simple proposal distributions for sampling, which might be inadequate in exploring target distributions with intricate geometries. In this work, we construct expressive proposal distributions in the AIS framework using normalizing flow, an appealing approach for modeling complex distributions. We use an iterative parameter update rule to enhance the approximation of the target distribution. Numerical experiments show that in high-dimensional settings, the proposed algorithm offers significantly improved performance compared to the existing techniques.
Multi-resolution Time-Series Transformer for Long-term Forecasting
Yitian Zhang
Liheng Ma
Soumyasundar Pal
Yingxue Zhang
Interacting Diffusion Processes for Event Sequence Forecasting
Mai Zeng
Florence Regol
Neural Temporal Point Processes (TPPs) have emerged as the primary framework for predicting sequences of events that occur at irregular time… (see more) intervals, but their sequential nature can hamper performance for long-horizon forecasts. To address this, we introduce a novel approach that incorporates a diffusion generative model. The model facilitates sequence-to-sequence prediction, allowing multi-step predictions based on historical event sequences. In contrast to previous approaches, our model directly learns the joint probability distribution of types and inter-arrival times for multiple events. This allows us to fully leverage the high dimensional modeling capability of modern generative models. Our model is composed of two diffusion processes, one for the time intervals and one for the event types. These processes interact through their respective denoising functions, which can take as input intermediate representations from both processes, allowing the model to learn complex interactions. We demonstrate that our proposal outperforms state-of-the-art baselines for long-horizon forecasting of TPP.