Portrait of Guillaume Rabusseau

Guillaume Rabusseau

Core Academic Member
Canada CIFAR AI Chair
Assistant Professor, Université de Montréal, Department of Computer Science and Operations Research
Research Topics
Deep Learning
Graph Neural Networks
Learning on Graphs
Machine Learning Theory
Probabilistic Models
Quantum Information Theory
Recommender Systems
Recurrent Neural Networks
Tensor Factorization

Biography

I have been an assistant professor at Mila – Quebec Artificial Intelligence Institute and in the Department of Computer Science and Operations Research (DIRO) at Université de Montréal (UdeM) since September 2018. I was awarded a Canada CIFAR AI Chair in March 2019. Before joining UdeM, I was a postdoctoral research fellow in the Reasoning and Learning Lab at McGill University, where I worked with Prakash Panangaden, Joelle Pineau and Doina Precup.

I obtained my PhD in 2016 from Aix-Marseille University (AMU) in France, where I worked in the Qarma team (Machine Learning and Multimedia) under the supervision of François Denis and Hachem Kadri. I also obtained my MSc in fundamental computer science and my BSc in computer science from AMU. I am interested in tensor methods for machine learning and in designing learning algorithms for structured data by leveraging linear and multilinear algebra (e.g., spectral methods).

Current Students

Master's Research - Université de Montréal
Postdoctorate - Université de Montréal
PhD - Université de Montréal
PhD - Université de Montréal
Collaborating researcher - University of Mannheim
Co-supervisor :
PhD - Université de Montréal
Co-supervisor :
PhD - McGill University
Principal supervisor :
PhD - Université de Montréal
Master's Research - McGill University
Principal supervisor :
Collaborating researcher
Co-supervisor :
PhD - Université de Montréal
Co-supervisor :
PhD - Université de Montréal
Co-supervisor :
PhD - Université de Montréal

Publications

Optimal Spectral-Norm Approximate Minimization of Weighted Finite Automata
We address the approximate minimization problem for weighted finite automata (WFAs) with weights in …
Assessing the Impact: Does an Improvement to a Revenue Management System Lead to an Improved Revenue?
Estimating the Impact of an Improvement to a Revenue Management System: An Airline Application
Greta Laage
William Hamilton
Airlines have been making use of highly complex Revenue Management Systems to maximize revenue for decades. Estimating the impact of changin… (see more)g one component of those systems on an important outcome such as revenue is crucial, yet very challenging. It is indeed the difference between the generated value and the value that would have been generated keeping business as usual, which is not observable. We provide a comprehensive overview of counterfactual prediction models and use them in an extensive computational study based on data from Air Canada to estimate such impact. We focus on predicting the counterfactual revenue and compare it to the observed revenue subject to the impact. Our microeconomic application and small expected treatment impact stand out from the usual synthetic control applications. We present accurate linear and deep-learning counterfactual prediction models which achieve respectively 1.1% and 1% of error and allow to estimate a simulated effect quite accurately.
Scalable Change Point Detection for Dynamic Graphs
Real world networks often evolve in complex ways over time. Understanding anomalies in dynamic networks is crucial for applications such as … (see more)traffic accident detection, intrusion identification and detection of ecosystem disturbances. In this work, we focus on the problem of change point detection in dynamic graphs. The goal is to identify time steps where the graph structure deviates significantly from the norm. Despite empirical success of recent methods, building a change point detection method for real world dynamic graphs, which often scale to millions of nodes, remains an open question. To fill this gap, we propose LADdos, a scalable method for change point detection in dynamic graphs. LADdos brings together ideas from two recent works: an accurate change point detection method for graphs called LAD [10] which detects the changes in the full Laplacian spectrum of the graph in each timestamp, and the general framework of network density of states (DOS) [5] which models the distribution of the singular values through efficient approximation methods. In experiments with two common graph models –the Stochastic Block Model (SBM) and the Barabási-Albert (BA) model – we show that LADdos has equal performance to LAD, which is the current state-of-the-art, while being orders of magnitude faster. For instance, on a dynamic graph with total 21 million edges over 150 timestamps, LADdos achieves 100x speedup when compared to LAD.
A Theoretical Analysis of Catastrophic Forgetting through the NTK Overlap Matrix
Thang Doan
Mehdi Abbana Bennani
Bogdan Mazoure
Pierre Alquier
Towards a Trace-Preserving Tensor Network Representation of Quantum Channels
Siddarth Srinivasan
Sandesh M. Adhikary
Jacob Miller
Bibek Pokharel
Byron Boots
The problem of characterizing quantum channels arises in a number of contexts such as quantum process tomography and quantum error correctio… (see more)n. However, direct approaches to parameterizing and optimizing the Choi matrix representation of quantum channels face a curse of dimensionality: the number of parameters scales exponentially in the number of qubits. Recently, Torlai et al. [2020] proposed using locally purified density operators (LPDOs), a tensor network representation of Choi matrices, to overcome the unfavourable scaling in parameters. While the LPDO structure allows it to satisfy a ‘complete positivity’ (CP) constraint required of physically valid quantum channels, it makes no guarantees about a similarly required ‘trace preservation’ (TP) constraint. In practice, the TP constraint is violated, and the learned quantum channel may even be trace-increasing, which is non-physical. In this work, we present the problem of optimizing over TP LPDOs, discuss two approaches to characterizing the TP constraints on LPDOs, and outline the next steps for developing an optimization scheme.
Horizontal gene transfer and recombination analysis of SARS-CoV-2 genes helps discover its close relatives and shed light on its origin
Bogdan Mazoure
Pierre Legendre
The SARS-CoV-2 pandemic is one of the greatest global medical and social challenges that have emerged in recent history. Human coronavirus s… (see more)trains discovered during previous SARS outbreaks have been hypothesized to pass from bats to humans using intermediate hosts, e.g. civets for SARS-CoV and camels for MERS-CoV. The discovery of an intermediate host of SARS-CoV-2 and the identification of specific mechanism of its emergence in humans are topics of primary evolutionary importance. In this study we investigate the evolutionary patterns of 11 main genes of SARS-CoV-2. Previous studies suggested that the genome of SARS-CoV-2 is highly similar to the horseshoe bat coronavirus RaTG13 for most of the genes and to some Malayan pangolin coronavirus (CoV) strains for the receptor binding (RB) domain of the spike protein. We provide a detailed list of statistically significant horizontal gene transfer and recombination events (both intergenic and intragenic) inferred for each of 11 main genes of the SARS-CoV-2 genome. Our analysis reveals that two continuous regions of genes S and N of SARS-CoV-2 may result from intragenic recombination between RaTG13 and Guangdong (GD) Pangolin CoVs. Statistically significant gene transfer-recombination events between RaTG13 and GD Pangolin CoV have been identified in region [1215–1425] of gene S and region [534–727] of gene N. Moreover, some statistically significant recombination events between the ancestors of SARS-CoV-2, RaTG13, GD Pangolin CoV and bat CoV ZC45-ZXC21 coronaviruses have been identified in genes ORF1ab, S, ORF3a, ORF7a, ORF8 and N. Furthermore, topology-based clustering of gene trees inferred for 25 CoV organisms revealed a three-way evolution of coronavirus genes, with gene phylogenies of ORF1ab, S and N forming the first cluster, gene phylogenies of ORF3a, E, M, ORF6, ORF7a, ORF7b and ORF8 forming the second cluster, and phylogeny of gene ORF10 forming the third cluster. The results of our horizontal gene transfer and recombination analysis suggest that SARS-CoV-2 could not only be a chimera virus resulting from recombination of the bat RaTG13 and Guangdong pangolin coronaviruses but also a close relative of the bat CoV ZC45 and ZXC21 strains. They also indicate that a GD pangolin may be an intermediate host of this dangerous virus.
Quantum Tensor Networks, Stochastic Processes, and Weighted Automata
Siddarth Srinivasan
Sandesh M. Adhikary
Jacob Miller
Byron Boots
Modeling joint probability distributions over sequences has been studied from many perspectives. The physics community developed matrix prod… (see more)uct states, a tensor-train decomposition for probabilistic modeling, motivated by the need to tractably model many-body systems. But similar models have also been studied in the stochastic processes and weighted automata literature, with little work on how these bodies of work relate to each other. We address this gap by showing how stationary or uniform versions of popular quantum tensor network models have equivalent representations in the stochastic processes and weighted automata literature, in the limit of infinitely long sequences. We demonstrate several equivalence results between models used in these three communities: (i) uniform variants of matrix product states, Born machines and locally purified states from the quantum tensor networks literature, (ii) predictive state representations, hidden Markov models, norm-observable operator models and hidden quantum Markov models from the stochastic process literature,and (iii) stochastic weighted automata, probabilistic automata and quadratic automata from the formal languages literature. Such connections may open the door for results and methods developed in one area to be applied in another.
A Theoretical Analysis of Catastrophic Forgetting through the NTK Overlap Matrix
Thang Doan
Mehdi Abbana Bennani
Bogdan Mazoure
Pierre Alquier
Continual learning (CL) is a setting in which an agent has to learn from an incoming stream of data during its entire lifetime. Although maj… (see more)or advances have been made in the field, one recurring problem which remains unsolved is that of Catastrophic Forgetting (CF). While the issue has been extensively studied empirically, little attention has been paid from a theoretical angle. In this paper, we show that the impact of CF increases as two tasks increasingly align. We introduce a measure of task similarity called the NTK overlap matrix which is at the core of CF. We analyze common projected gradient algorithms and demonstrate how they mitigate forgetting. Then, we propose a variant of Orthogonal Gradient Descent (OGD) which leverages structure of the data through Principal Component Analysis (PCA). Experiments support our theoretical findings and show how our method reduces CF on classical CL datasets.
Laplacian Change Point Detection for Dynamic Graphs
Shenyang Huang
Yasmeen Hitti
Adaptive Learning of Tensor Network Structures
Meraj Hashemizadeh
Michelle Liu
Jacob Miller
Tensor Networks (TN) offer a powerful framework to efficiently represent very high-dimensional objects. TN have recently shown their potenti… (see more)al for machine learning applications and offer a unifying view of common tensor decomposition models such as Tucker, tensor train (TT) and tensor ring (TR). However, identifying the best tensor network structure from data for a given task is challenging. In this work, we leverage the TN formalism to develop a generic and efficient adaptive algorithm to jointly learn the structure and the parameters of a TN from data. Our method is based on a simple greedy approach starting from a rank one tensor and successively identifying the most promising tensor network edges for small rank increments. Our algorithm can adaptively identify TN structures with small number of parameters that effectively optimize any differentiable objective function. Experiments on tensor decomposition, tensor completion and model compression tasks demonstrate the effectiveness of the proposed algorithm. In particular, our method outperforms the state-of-the-art evolutionary topology search [Li and Sun, 2020] for tensor decomposition of images (while being orders of magnitude faster) and finds efficient tensor network structures to compress neural networks outperforming popular TT based approaches [Novikov et al., 2015].
On Overfitting and Asymptotic Bias in Batch Reinforcement Learning with Partial Observability (Extended Abstract)
Vincent Francois-Lavet
Damien Ernst
Raphael Fonteneau
When an agent has limited information on its environment, the suboptimality of an RL algorithm can be decomposed into the sum of two terms: … (see more)a term related to an asymptotic bias (suboptimality with unlimited data) and a term due to overfitting (additional suboptimality due to limited data). In the context of reinforcement learning with partial observability, this paper provides an analysis of the tradeoff between these two error sources. In particular, our theoretical analysis formally characterizes how a smaller state representation increases the asymptotic bias while decreasing the risk of overfitting.