Guy Wolf

Core Academic Member

wolfguy@mila.quebec

Canada CIFAR AI Chair

Associate Professor, Université de Montréal, Department of Mathematics and Statistics

Concordia University

CHUM - Montreal University Hospital Center

Website

Google Scholar

Biography

Guy Wolf is an associate professor in the Department of Mathematics and Statistics at Université de Montréal.

His research interests lie at the intersection of machine learning, data science and applied mathematics. He is particularly interested in data mining methods that use manifold learning and deep geometric learning, as well as applications for the exploratory analysis of biomedical data.

Wolf’s research focuses on exploratory data analysis and its applications in bioinformatics. His approaches are multidisciplinary and bring together machine learning, signal processing and applied math tools. His recent work has used a combination of diffusion geometries and deep learning to find emergent patterns, dynamics, and structure in big high dimensional- data (e.g., in single-cell genomics and proteomics).

Current Students

Sabeur Aridhi

Independent visiting researcher - University of Lorraine

sabeur.aridhi@mila.quebec

Ria Arora

Master's Research - Université de Montréal

Co-supervisor :

Liam Paull

ria.arora@mila.quebec

Github

Nader Asadi

Collaborating Alumni

Principal supervisor :

Eugene Belilovsky

nader.asadi@mila.quebec

PhD - Université de Montréal

adrien.aumon@mila.quebec

Semih Cantürk

PhD - Université de Montréal

semih.canturk@mila.quebec

Collaborating Alumni

muawiz.chaudhary@mila.quebec

Github

Google Scholar

Stefan Horoi

PhD - Université de Montréal

stefan.horoi@mila.quebec

Website

Github

Google Scholar

Will Hua

Master's Research - McGill University

Principal supervisor :

Doina Precup

chenqing.hua@mila.quebec

Guillaume Huguet

PhD - Université de Montréal

guillaume.huguet@mila.quebec

Github

Google Scholar

Charles-Etienne Joseph

Master's Research - Université de Montréal

Principal supervisor :

Eugene Belilovsky

charles-etienne.joseph@mila.quebec

Github

Jake Kovalic

Collaborating researcher - Yale

jake.kovalic@mila.quebec

Github

Vincent Létourneau

Postdoctorate - Université de Montréal

letournv@mila.quebec

Myriam Lizotte

PhD - Université de Montréal

myriam.lizotte@mila.quebec

Lydia Mezrag

PhD - Université de Montréal

lydia.mezrag@mila.quebec

Sacha Morin

PhD - Université de Montréal

Co-supervisor :

Liam Paull

sacha.morin@mila.quebec

Hasti Nafisi

Master's Research - Université de Montréal

Co-supervisor :

Yashar Hezaveh

hasti.nafisi@mila.quebec

Geraldin Nanfack

Postdoctorate - Concordia University

Principal supervisor :

Eugene Belilovsky

geraldin.nanfack@mila.quebec

Amine Natik

PhD - Université de Montréal

Principal supervisor :

Guillaume Lajoie

natikami@mila.quebec

Shuang Ni

PhD - Université de Montréal

shuang.ni@mila.quebec

Website

Github

Albert Orozco Camacho

PhD - Concordia University

Principal supervisor :

Independent visiting researcher

edouard.oyallon@mila.quebec

Master's Research - Concordia University

Principal supervisor :

Eugene Belilovsky

paria.mehrbod@mila.quebec

Donald Shenaj

Collaborating researcher - Concordia University

Principal supervisor :

Eugene Belilovsky

donald.shenaj@mila.quebec

Collaborating researcher - Université de Montréal

Co-supervisor :

Eugene Belilovsky

pedro.vianna@mila.quebec

Github

Google Scholar

Siddharth Viswanath

Collaborating researcher - Yale

siddarth.viswanath@mila.quebec

Frederik Wenkel

PhD - Université de Montréal

frederik.wenkel@mila.quebec

Research Intern - Western Washington University

Principal supervisor :

Guillaume Lajoie

vivian.white@mila.quebec

Github

Zhang Yanlei

Postdoctorate - Université de Montréal

yanlei.zhang@mila.quebec

Website

Google Scholar

Publications

Graph Positional and Structural Encoder

Semih Cantürk

Renming Liu

Olivier Lapointe-Gagné

Vincent Létourneau

Guy Wolf

Dominique Beaini

Ladislav Rampášek

Positional and structural encodings (PSE) enable better identifiability of nodes within a graph, as in general graphs lack a canonical node … (see more)ordering. This renders PSEs essential tools for empowering modern GNNs, and in particular graph Transformers. However, designing PSEs that work optimally for a variety of graph prediction tasks is a challenging and unsolved problem. Here, we present the graph positional and structural encoder (GPSE), a first-ever attempt to train a graph encoder that captures rich PSE representations for augmenting any GNN. GPSE can effectively learn a common latent representation for multiple PSEs, and is highly transferable. The encoder trained on a particular graph dataset can be used effectively on datasets drawn from significantly different distributions and even modalities. We show that across a wide range of benchmarks, GPSE-enhanced models can significantly improve the performance in certain tasks, while performing on par with those that employ explicitly computed PSEs in other cases. Our results pave the way for the development of large pre-trained models for extracting graph positional and structural information and highlight their potential as a viable alternative to explicitly computed PSEs as well as to existing self-supervised pre-training approaches.

2024-05-01

ICML.cc/2024/Conference (poster)

doi.org

openreview.net

Harmony in Diversity: Merging Neural Networks with Canonical Correlation Analysis

Stefan Horoi

Albert Manuel Orozco Camacho

Eugene Belilovsky

Guy Wolf

2024-05-01

ICML.cc/2024/Conference (poster)

openreview.net

Harmony in Diversity: Merging Neural Networks with Canonical Correlation Analysis

Stefan Horoi

Albert Manuel Orozco Camacho

Eugene Belilovsky

Guy Wolf

Ensembling multiple models enhances predictive performance by utilizing the varied learned features of the different models but incurs signi… (see more)ficant computational and storage costs. Model fusion, which combines parameters from multiple models into one, aims to mitigate these costs but faces practical challenges due to the complex, non-convex nature of neural network loss landscapes, where learned minima are often separated by high loss barriers. Recent works have explored using permutations to align network features, reducing the loss barrier in parameter space. However, permutations are restrictive since they assume a one-to-one mapping between the different models' neurons exists. We propose a new model merging algorithm, CCA Merge, which is based on Canonical Correlation Analysis and aims to maximize the correlations between linear combinations of the model features. We show that our method of aligning models leads to better performances than past methods when averaging models trained on the same, or differing data splits. We also extend this analysis into the harder many models setting where more than 2 models are merged, and we find that CCA Merge works significantly better in this setting than past methods.

2024-05-01

ICML.cc/2024/Conference (poster)

openreview.net

Improving and Generalizing Flow-Based Generative Models with Minibatch Optimal Transport

Alexander Tong

Nikolay Malkin

Guillaume Huguet

Yanlei Zhang

Jarrid Rector-Brooks

Kilian FATRAS

Guy Wolf

Yoshua Bengio

Continuous normalizing flows (CNFs) are an attractive generative modeling technique, but they have been held back by limitations in their si… (see more)mulation-based maximum likelihood training. We introduce the generalized \textit{conditional flow matching} (CFM) technique, a family of simulation-free training objectives for CNFs. CFM features a stable regression objective like that used to train the stochastic flow in diffusion models but enjoys the efficient inference of deterministic flow models. In contrast to both diffusion models and prior CNF training algorithms, CFM does not require the source distribution to be Gaussian or require evaluation of its density. A variant of our objective is optimal transport CFM (OT-CFM), which creates simpler flows that are more stable to train and lead to faster inference, as evaluated in our experiments. Furthermore, OT-CFM is the first method to compute dynamic OT in a simulation-free way. Training CNFs with CFM improves results on a variety of conditional and unconditional generation tasks, such as inferring single cell dynamics, unsupervised image translation, and Schrödinger bridge inference.

2024-03-11

TMLR (accepted)

openreview.net

Harmony in Diversity: Merging Neural Networks with Canonical Correlation Analysis

Stefan Horoi

Albert Manuel Orozco Camacho

Eugene Belilovsky

Guy Wolf

2024-03-02

ICLR.cc/2024/Workshop/Re-Align (poster)

openreview.net

Learning and Aligning Structured Random Feature Networks

Vivian White

Muawiz Sajjad Chaudhary

Guy Wolf

Guillaume Lajoie

Kameron Decker Harris

Artificial neural networks (ANNs) are considered ``black boxes'' due to the difficulty of interpreting their learned weights. While choosin… (see more)g the best features is not well understood, random feature networks (RFNs) and wavelet scattering ground some ANN learning mechanisms in function space with tractable mathematics. Meanwhile, the genetic code has evolved over millions of years, shaping the brain to devlop variable neural circuits with reliable structure that resemble RFNs. We explore a similar approach, embedding neuro-inspired, wavelet-like weights into multilayer RFNs. These can outperform scattering and have kernels that describe their function space at large width. We build learnable and deeper versions of these models where we can optimize separate spatial and channel covariances of the convolutional weight distributions. We find that these networks can perform comparatively with conventional ANNs while dramatically reducing the number of trainable parameters. Channel covariances are most influential, and both weight and activation alignment are needed for classification performance. Our work outlines how neuro-inspired configurations may lead to better performance in key cases and offers a potentially tractable reduced model for ANN learning.

2024-03-02

ICLR.cc/2024/Workshop/Re-Align (poster)

openreview.net

Learning and Aligning Structured Random Feature Networks

Vivian White

Muawiz Sajjad Chaudhary

Guy Wolf

Guillaume Lajoie

Kameron Decker Harris

Artificial neural networks (ANNs) are considered "black boxes'' due to the difficulty of interpreting their learned weights. While choosing… (see more) the best features is not well understood, random feature networks (RFNs) and wavelet scattering ground some ANN learning mechanisms in function space with tractable mathematics. Meanwhile, the genetic code has evolved over millions of years, shaping the brain to develop variable neural circuits with reliable structure that resemble RFNs. We explore a similar approach, embedding neuro-inspired, wavelet-like weights into multilayer RFNs. These can outperform scattering and have kernels that describe their function space at large width. We build learnable and deeper versions of these models where we can optimize separate spatial and channel covariances of the convolutional weight distributions. We find that these networks can perform comparatively with conventional ANNs while dramatically reducing the number of trainable parameters. Channel covariances are most influential, and both weight and activation alignment are needed for classification performance. Our work outlines how neuro-inspired configurations may lead to better performance in key cases and offers a potentially tractable reduced model for ANN learning.

2024-03-02

ICLR.cc/2024/Workshop/Re-Align (poster)

openreview.net

Channel-Selective Normalization for Label-Shift Robust Test-Time Adaptation

Pedro Vianna

Muawiz Chaudhary

Paria Mehrbod

An Tang

Guy Cloutier

Guy Wolf

Michael Eickenberg

Eugene Belilovsky

Deep neural networks have useful applications in many different tasks, however their performance can be severely affected by changes in the … (see more)data distribution. For example, in the biomedical field, their performance can be affected by changes in the data (different machines, populations) between training and test datasets. To ensure robustness and generalization to real-world scenarios, test-time adaptation has been recently studied as an approach to adjust models to a new data distribution during inference. Test-time batch normalization is a simple and popular method that achieved compelling performance on domain shift benchmarks. It is implemented by recalculating batch normalization statistics on test batches. Prior work has focused on analysis with test data that has the same label distribution as the training data. However, in many practical applications this technique is vulnerable to label distribution shifts, sometimes producing catastrophic failure. This presents a risk in applying test time adaptation methods in deployment. We propose to tackle this challenge by only selectively adapting channels in a deep network, minimizing drastic adaptation that is sensitive to label shifts. Our selection scheme is based on two principles that we empirically motivate: (1) later layers of networks are more sensitive to label shift (2) individual features can be sensitive to specific classes. We apply the proposed technique to three classification tasks, including CIFAR10-C, Imagenet-C, and diagnosis of fatty liver, where we explore both covariate and label distribution shifts. We find that our method allows to bring the benefits of TTA while significantly reducing the risk of failure common in other methods, while being robust to choice in hyperparameters.

2024-02-07

ArXiv (preprint)

doi.org

arxiv.org

Effective Protein-Protein Interaction Exploration with PPIretrieval

Chenqing Hua

Connor Coley

Guy Wolf

Doina Precup

Shuangjia Zheng

2024-02-06

ArXiv (preprint)

doi.org

arxiv.org

Gaining Biological Insights through Supervised Data Visualization

Jake S. Rhodes

Adrien Aumon

Sacha Morin

Marc Girard

Catherine Larochelle

Boaz Lahav

Elsa Brunet-Ratnasingham

Amélie Pagliuzza

Lorie Marchitto

Wei Zhang

Adele Cutler

F. Grand'Maison

Anhong Zhou

Andrés Finzi

Nicolas Chomont

Daniel E. Kaufmann

Stephanie Zandee

Alexandre Prat

Guy Wolf

Kevin R. Moon

Dimensionality reduction-based data visualization is pivotal in comprehending complex biological data. The most common methods, such as PHAT… (see more)E, t-SNE, and UMAP, are unsupervised and therefore reflect the dominant structure in the data, which may be independent of expert-provided labels. Here we introduce a supervised data visualization method called RF-PHATE, which integrates expert knowledge for further exploration of the data. RF-PHATE leverages random forests to capture intricate featurelabel relationships. Extracting information from the forest, RF-PHATE generates low-dimensional visualizations that highlight relevant data relationships while disregarding extraneous features. This approach scales to large datasets and applies to classification and regression. We illustrate RF-PHATE’s prowess through three case studies. In a multiple sclerosis study using longitudinal clinical and imaging data, RF-PHATE unveils a sub-group of patients with non-benign relapsingremitting Multiple Sclerosis, demonstrating its aptitude for time-series data. In the context of Raman spectral data, RF-PHATE effectively showcases the impact of antioxidants on diesel exhaust-exposed lung cells, highlighting its proficiency in noisy environments. Furthermore, RF-PHATE aligns established geometric structures with COVID-19 patient outcomes, enriching interpretability in a hierarchical manner. RF-PHATE bridges expert insights and visualizations, promising knowledge generation. Its adaptability, scalability, and noise tolerance underscore its potential for widespread adoption.

2024-01-21

bioRxiv (preprint)

doi.org

Towards Foundational Models for Molecular Learning on Large-Scale Multi-Task Datasets

Dominique Beaini

Shenyang Huang

Joao Alex Cunha

Zhiyi Li

Gabriela Moisescu-Pareja

Oleksandr Dymov

Samuel Maddrell-Mander

Callum McLean

Frederik Wenkel

Luis Müller

Jama Hussein Mohamud

Ali Parviz

Michael Craig

Michał Koziarski

Jiarui Lu

Zhaocheng Zhu

Cristian Gabellini

Kerstin Klaser

Josef Dean

Cas Wognum … (see 15 more)

Maciej Sypetkowski

Guillaume Rabusseau

Reihaneh Rabbany

Jian Tang

Christopher Morris

Ioannis Koutis

Mirco Ravanelli

Guy Wolf

Prudencio Tossou

Hadrien Mary

Therence Bois

Andrew William Fitzgibbon

Blazej Banaszewski

Chad Martin

Dominic Masters

Recently, pre-trained foundation models have enabled significant advancements in multiple fields. In molecular machine learning, however, wh… (see more)ere datasets are often hand-curated, and hence typically small, the lack of datasets with labeled features, and codebases to manage those datasets, has hindered the development of foundation models. In this work, we present seven novel datasets categorized by size into three distinct categories: ToyMix, LargeMix and UltraLarge. These datasets push the boundaries in both the scale and the diversity of supervised labels for molecular learning. They cover nearly 100 million molecules and over 3000 sparsely defined tasks, totaling more than 13 billion individual labels of both quantum and biological nature. In comparison, our datasets contain 300 times more data points than the widely used OGB-LSC PCQM4Mv2 dataset, and 13 times more than the quantum-only QM1B dataset. In addition, to support the development of foundational models based on our proposed datasets, we present the Graphium graph machine learning library which simplifies the process of building and training molecular machine learning models for multi-task and multi-level molecular datasets. Finally, we present a range of baseline results as a starting point of multi-task and multi-level training on these datasets. Empirically, we observe that performance on low-resource biological datasets show improvement by also training on large amounts of quantum data. This indicates that there may be potential in multi-task and multi-level training of a foundation model and fine-tuning it to resource-constrained downstream tasks. The Graphium library is publicly available on Github and the dataset links are available in Part 1 and Part 2.

2024-01-16

ICLR.cc/2024/Conference (poster)

doi.org

openreview.net

Assessing Neural Network Representations During Training Using Noise-Resilient Diffusion Spectral Entropy

Danqi Liao

Chen Liu

Benjamin W Christensen

Alexander Tong

Guillaume Huguet

Guy Wolf

Maximilian Nickel

Ian Adelstein

Smita Krishnaswamy

Entropy and mutual information in neural networks provide rich information on the learning process, but they have proven difficult to comput… (see more)e reliably in high dimensions. Indeed, in noisy and high-dimensional data, traditional estimates in ambient dimensions approach a fixed entropy and are prohibitively hard to compute. To address these issues, we leverage data geometry to access the underlying manifold and reliably compute these information-theoretic measures. Specifically, we define diffusion spectral entropy (DSE) in neural representations of a dataset as well as diffusion spectral mutual information (DSMI) between different variables representing data. First, we show that they form noise-resistant measures of intrinsic dimensionality and relationship strength in high-dimensional simulated data that outperform classic Shannon entropy, nonparametric estimation, and mutual information neural estimation (MINE). We then study the evolution of representations in classification networks with supervised learning, self-supervision, or overfitting. We observe that (1) DSE of neural representations increases during training; (2) DSMI with the class label increases during generalizable learning but stays stagnant during overfitting; (3) DSMI with the input signal shows differing trends: on MNIST it increases, while on CIFAR-10 and STL-10 it decreases. Finally, we show that DSE can be used to guide better network initialization and that DSMI can be used to predict downstream classification accuracy across 962 models on ImageNet.

2024-01-01

CISS (published)

doi.org

openreview.net

AI Research Driven by Real-World Problems

AI Policy Compass

Student Life and Resources

Guy Wolf

Biography

Current Students

Publications

AI Research Driven by Real-World Problems

AI Policy Compass

Student Life and Resources

Popular keywords:

Guy Wolf

Biography

Current Students

Publications