Guillaume Rabusseau

Biography

I have been an assistant professor at Mila – Quebec Artificial Intelligence Institute and in the Department of Computer Science and Operations Research (DIRO) at Université de Montréal (UdeM) since September 2018. I was awarded a Canada CIFAR AI Chair in March 2019. Before joining UdeM, I was a postdoctoral research fellow in the Reasoning and Learning Lab at McGill University, where I worked with Prakash Panangaden, Joelle Pineau and Doina Precup.

I obtained my PhD in 2016 from Aix-Marseille University (AMU) in France, where I worked in the Qarma team (Machine Learning and Multimedia) under the supervision of François Denis and Hachem Kadri. I also obtained my MSc in fundamental computer science and my BSc in computer science from AMU. I am interested in tensor methods for machine learning and in designing learning algorithms for structured data by leveraging linear and multilinear algebra (e.g., spectral methods).

Current Students

Jun Dai

Postdoctorate - Université de Montréal

Alireza Dizaji

Master's Research - Université de Montréal

Marawan Gamal

PhD - Université de Montréal

PhD - Université de Montréal

Co-supervisor :

Collaborating Alumni - McGill University

Principal supervisor :

Independent visiting researcher - Technical University of Hamburg, Germany

Collaborating researcher - Université de Montréal

Maude Lizaire

PhD - Université de Montréal

Sitao Luan

Postdoctorate - McGill University

Co-supervisor :

Master's Research - Université de Montréal

Soroush Omranpour

Collaborating researcher - McGill University

Principal supervisor: Reihaneh Rabbany

Michael Rizvi-Martel

PhD - Université de Montréal

Co-supervisor :

Pascal Tikeng Notsawo

PhD - Université de Montréal

Co-supervisor :

Collaborating researcher - Université de Montréal

Co-supervisor: Reihaneh Rabbany

Beheshteh Toloueirakhshan

PhD - Université de Montréal

Publications

Simulating Weighted Automata over Sequences and Trees with Transformers

Michael Rizvi

Maude Lizaire

Clara Lacroce

2024-03-12

ArXiv (preprint)

Towards Foundational Models for Molecular Learning on Large-Scale Multi-Task Datasets

Dominique Beaini

Shenyang Huang

Joao Alex Cunha

Zhiyi Li

Gabriela Moisescu-Pareja

Oleksandr Dymov

Samuel Maddrell-Mander

Callum McLean

Jama Hussein Mohamud

Michael Craig

Cristian Gabellini

Kerstin Klaser

Josef Dean

Cas Wognum

Maciej Sypetkowski

Ioannis Koutis

Hadrien Mary

Therence Bois

Andrew William Fitzgibbon

Blazej Banaszewski

Chad Martin

Dominic Masters

Recently, pre-trained foundation models have enabled significant advancements in multiple fields. In molecular machine learning, however, where datasets are often hand-curated, and hence typically small, the lack of datasets with labeled features, and codebases to manage those datasets, has hindered the development of foundation models. In this work, we present seven novel datasets categorized by size into three distinct categories: ToyMix, LargeMix and UltraLarge. These datasets push the boundaries in both the scale and the diversity of supervised labels for molecular learning. They cover nearly 100 million molecules and over 3000 sparsely defined tasks, totaling more than 13 billion individual labels of both quantum and biological nature. In comparison, our datasets contain 300 times more data points than the widely used OGB-LSC PCQM4Mv2 dataset, and 13 times more than the quantum-only QM1B dataset. In addition, to support the development of foundational models based on our proposed datasets, we present the Graphium graph machine learning library which simplifies the process of building and training molecular machine learning models for multi-task and multi-level molecular datasets. Finally, we present a range of baseline results as a starting point of multi-task and multi-level training on these datasets. Empirically, we observe that performance on low-resource biological datasets shows improvement by also training on large amounts of quantum data. This indicates that there may be potential in multi-task and multi-level training of a foundation model and fine-tuning it to resource-constrained downstream tasks. The Graphium library is publicly available on GitHub and the dataset links are available in Part 1 and Part 2.

2024-01-16

ICLR.cc/2024/Conference (poster)

Laplacian Change Point Detection for Single and Multi-view Dynamic Graphs

Samy Coulombe

Dynamic graphs are rich data structures that are used to model complex relationships between entities over time. In particular, anomaly detection in temporal graphs is crucial for many real-world applications such as intrusion identification in network systems, detection of ecosystem disturbances, and detection of epidemic outbreaks. In this article, we focus on change point detection in dynamic graphs and address three main challenges associated with this problem: (i) how to compare graph snapshots across time, (ii) how to capture temporal dependencies, and (iii) how to combine different views of a temporal graph. To solve the above challenges, we first propose Laplacian Anomaly Detection (LAD), which uses the spectrum of the graph Laplacian as a low-dimensional embedding of the graph structure at each snapshot. LAD explicitly models short-term and long-term dependencies by applying two sliding windows. Next, we propose MultiLAD, a simple and effective generalization of LAD to multi-view graphs. MultiLAD provides the first change point detection method for multi-view dynamic graphs. It aggregates the singular values of the normalized graph Laplacian from different views through the scalar power mean operation. Through extensive synthetic experiments, we show that (i) LAD and MultiLAD are accurate and outperform state-of-the-art baselines and their multi-view extensions by a large margin, (ii) MultiLAD’s advantage over contenders significantly increases when additional views are available, and (iii) MultiLAD is highly robust to noise from individual views. In five real-world dynamic graphs, we demonstrate that LAD and MultiLAD identify significant events as top anomalies, such as the implementation of government COVID-19 interventions which impacted the population mobility in multi-view traffic networks.

2024-01-12

ACM Transactions on Knowledge Discovery from Data (published)

Connecting Weighted Automata, Tensor Networks and Recurrent Neural Networks through Spectral Learning

Tianyu Li

Doina Precup

2024-01-01

Machine Learning (published)

UTG: Towards a Unified View of Snapshot and Event Based Models for Temporal Graphs

Emanuele Rossi

2024-01-01

LoG (published)

Generative Learning of Continuous Data by Tensor Networks

Alex Meiburg

Jian Hua Chen

Jacob Miller

Raphaelle Tihon

Alejandro Perdomo-Ortiz

2023-10-31

ArXiv (preprint)

Temporal Graph Benchmark for Machine Learning on Temporal Graphs

Shenyang Huang

Farimah Poursafaei

Jacob Danovitch

Matthias Fey

Weihua Hu

Emanuele Rossi

Jure Leskovec

Michael M. Bronstein

Reihaneh Rabbany

We present the Temporal Graph Benchmark (TGB), a collection of challenging and diverse benchmark datasets for realistic, reproducible, and robust evaluation of machine learning models on temporal graphs. TGB datasets are of large scale, spanning years in duration, incorporate both node and edge-level prediction tasks and cover a diverse set of domains including social, trade, transaction, and transportation networks. For both tasks, we design evaluation protocols based on realistic use-cases. We extensively benchmark each dataset and find that the performance of common models can vary drastically across datasets. In addition, on dynamic node property prediction tasks, we show that simple methods often achieve superior performance compared to existing temporal graph models. We believe that these findings open up opportunities for future research on temporal graphs. Finally, TGB provides an automated machine learning pipeline for reproducible and accessible temporal graph research, including data loading, experiment setup and performance evaluation. TGB will be maintained and updated on a regular basis and welcomes community feedback. TGB datasets, data loaders, example codes, evaluation setup, and leaderboards are publicly available at https://tgb.complexdatalab.com/.

ROSA: Random Orthogonal Subspace Adaptation

2023-06-20

ICML.cc/2023/Workshop/ES-FoMO (poster)

Fast and Attributed Change Detection on Dynamic Graphs with Density of States

2023-05-15

ArXiv (preprint)

Recurrent Real-valued Neural Autoregressive Density Estimator for Online Density Estimation and Classification of Streaming Data

Tianyu Li

Bogdan Mazoure

In contrast with traditional offline learning, where complete data accessibility is assumed, many modern applications involve processing data in a streaming fashion. This online learning setting raises various challenges, including concept drift, hardware memory constraints, etc. In this paper, we propose the Recurrent Real-valued Neural Autoregressive Density Estimator (RRNADE), a flexible density-based model for online classification and density estimation. RRNADE combines a neural Gaussian mixture density module with a recurrent module. This combination allows RRNADE to exploit possible sequential correlations in the streaming task, which are often ignored in the classical streaming setting where each input is assumed to be independent from the previous ones. We showcase the ability of RRNADE to adapt to concept drifts on synthetic density estimation tasks. We also apply RRNADE to online classification tasks on both real-world and synthetic datasets and compare it with multiple density-based as well as non-density-based online classification methods. In almost all of these tasks, RRNADE outperforms the other methods. Lastly, we conduct an ablation study demonstrating the complementary benefits of the density and the recurrent modules.

2023-02-01

ICLR.cc/2023/Conference (rejected)

Benchmarking State-Merging Algorithms for Learning Regular Languages

Adil Soubki

Jeffrey Heinz

François Coste

Faissal Ouardi

2023-01-01

International Conference on Grammatical Inference (published)

Explaining Graph Neural Networks Using Interpretable Local Surrogates

Farzaneh Heidari

Perouz Taslakian