
Guy Wolf

Core Academic Member
Canada CIFAR AI Chair
Associate Professor, Université de Montréal, Department of Mathematics and Statistics
Concordia University
CHUM - Montreal University Hospital Center
Research Topics
Data Mining
Deep Learning
Dynamical Systems
Graph Neural Networks
Information Retrieval
Learning on Graphs
Machine Learning Theory
Medical Machine Learning
Molecular Modeling
Multimodal Learning
Representation Learning
Spectral Learning

Biography

Guy Wolf is an associate professor in the Department of Mathematics and Statistics at Université de Montréal.

His research interests lie at the intersection of machine learning, data science and applied mathematics. He is particularly interested in data mining methods that use manifold learning and deep geometric learning, as well as applications for the exploratory analysis of biomedical data.

Wolf’s research focuses on exploratory data analysis and its applications in bioinformatics. His approaches are multidisciplinary and bring together machine learning, signal processing and applied math tools. His recent work has used a combination of diffusion geometries and deep learning to find emergent patterns, dynamics, and structure in big high-dimensional data (e.g., in single-cell genomics and proteomics).

Current Students

PhD - Université de Montréal
PhD - Université de Montréal
Collaborating researcher - Yale University
Collaborating Alumni
PhD - Université de Montréal
Master's Research - Concordia University
PhD - Université de Montréal
PhD - Concordia University
PhD - Université de Montréal
PhD - Université de Montréal
Master's Research - Concordia University
PhD - Université de Montréal
Collaborating researcher
PhD - Université de Montréal
Postdoctorate - Concordia University
PhD - Université de Montréal
PhD - Concordia University
Master's Research - Université de Montréal
PhD - Université de Montréal
PhD - Université de Montréal
Master's Research - Université de Montréal
Master's Research - Université de Montréal
Postdoctorate - Université de Montréal
Collaborating researcher - McGill University (assistant professor)

Publications

Geometry-Aware Generative Autoencoders for Metric Learning and Generative Modeling on Data Manifolds
Xingzhi Sun
Danqi Liao
Kincaid MacDonald
Yanlei Zhang
Ian Adelstein
Tim G. J. Rudner
Non-linear dimensionality reduction methods have proven successful at learning low-dimensional representations of high-dimensional point clouds on or near data manifolds. However, existing methods are not easily extensible; for large datasets, it is prohibitively expensive to add new points to these embeddings. As a result, it is very difficult to use existing embeddings generatively, to sample new points on and along these manifolds. In this paper, we propose GAGA (geometry-aware generative autoencoders), a framework which merges the power of generative deep learning with non-linear manifold learning by: 1) learning generalizable geometry-aware neural network embeddings based on non-linear dimensionality reduction methods like PHATE and diffusion maps, 2) deriving a non-Euclidean pullback metric on the embedded space to generate points faithfully along manifold geodesics, and 3) learning a flow on the manifold that allows us to transport populations. We provide illustrations on easily interpretable synthetic datasets and showcase results on simulated and real single-cell datasets. In particular, we show that geodesic-based generation can be especially important for scientific datasets where the manifold represents a state space and geodesics can represent the dynamics of entities over this space.
Simulating federated learning for steatosis detection using ultrasound images
Yue Qi
Alexandre Cadrin-Chênevert
Katleen Blanchet
Emmanuel Montagnon
Guy Cloutier
Michael Chassé
An Tang
We aimed to implement four data partitioning strategies evaluated with four federated learning (FL) algorithms and investigate the impact of data distribution on FL model performance in detecting steatosis using B-mode US images. A private dataset (153 patients; 1530 images) and a public dataset (55 patients; 550 images) were included in this retrospective study. The datasets contained patients with metabolic dysfunction-associated fatty liver disease (MAFLD) with biopsy-proven steatosis grades and control individuals without steatosis. We employed four data partitioning strategies to simulate FL scenarios and assessed four FL algorithms. We investigated the impact of class imbalance and the mismatch between the global and local data distributions on the learning outcome. Classification performance was assessed with the area under the receiver operating characteristic curve (AUC) on a separate test set. AUCs were 0.93 (95% CI 0.92, 0.94) for the source-based partitioning scenario with FedAvg, 0.90 (95% CI 0.89, 0.91) for a centralized model, and 0.83 (95% CI 0.81, 0.85) for a model trained in a single-center scenario. When data were perfectly balanced at the global level and each site had an identical data distribution, the model yielded an AUC of 0.90 (95% CI 0.88, 0.92). When each site contained data exclusively from one single class, irrespective of the global data distribution, the AUC fell in the range of 0.34–0.70. FL applied to B-mode US images provides performance comparable to a centralized model and higher than a single-center scenario. Global data imbalance and local data heterogeneity influenced the learning outcome.
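FedAvg, the aggregation algorithm evaluated above, combines client models by averaging their parameters weighted by local dataset size. A minimal NumPy sketch of that aggregation step (variable names are illustrative and not taken from the study's code):

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """One FedAvg aggregation round: average client model parameters,
    weighting each client by its share of the total training data."""
    total = sum(client_sizes)
    # Each client's model is a list of parameter arrays; average layer by layer.
    return [
        sum(n / total * w[i] for w, n in zip(client_weights, client_sizes))
        for i in range(len(client_weights[0]))
    ]

# Two simulated clients with single-layer "models" and unequal data sizes.
clients = [[np.array([1.0, 1.0])], [np.array([3.0, 3.0])]]
global_model = fedavg(clients, client_sizes=[1, 3])
print(global_model[0])  # weighted toward the larger client
```

The global model is then broadcast back to the sites for the next round of local training.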
Noisy Data Visualization using Functional Data Analysis
Haozhe Chen
Andres Felipe Duque Correa
Kevin R. Moon
Data visualization via dimensionality reduction is an important tool in exploratory data analysis. However, when the data are noisy, many existing methods fail to capture the underlying structure of the data. The method called Empirical Intrinsic Geometry (EIG) was previously proposed for performing dimensionality reduction on high-dimensional dynamical processes while theoretically eliminating all noise. However, implementing EIG in practice requires the construction of high-dimensional histograms, which suffer from the curse of dimensionality. Here we propose a new data visualization method called Functional Information Geometry (FIG) for dynamical processes that adapts the EIG framework while using approaches from functional data analysis to mitigate the curse of dimensionality. We experimentally demonstrate that the resulting method outperforms a variant of EIG designed for visualization in terms of capturing the true structure, hyperparameter robustness, and computational speed. We then use our method to visualize EEG brain measurements of sleep activity.
Supervised latent factor modeling isolates cell-type-specific transcriptomic modules that underlie Alzheimer’s disease progression
Yasser Iturria-Medina
Jo Anne Stratton
David A. Bennett
Late-onset Alzheimer’s disease (AD) is a progressive neurodegenerative disease, with brain changes beginning years before symptoms surface. AD is characterized by neuronal loss, the classic feature of the disease that underlies brain atrophy. However, GWAS reports and recent single-nucleus RNA sequencing (snRNA-seq) efforts have highlighted that glial cells, particularly microglia, claim a central role in AD pathophysiology. Here, we tailor pattern-learning algorithms to explore distinct gene programs by integrating the entire transcriptome, yielding distributed AD-predictive modules within the brain’s major cell types. We show that these learned modules are biologically meaningful through the identification of new and relevant enriched signaling cascades. The predictive nature of our modules, especially in microglia, allows us to infer each subject’s progression along a disease pseudo-trajectory, confirmed by post-mortem pathological brain tissue markers. Additionally, we quantify the interplay between pairs of cell-type modules in the AD brain, and localize known AD risk genes to enriched module gene programs. Our collective findings advocate for a transition from cell-type specificity to gene-module specificity to unlock the potential of unique gene programs, recasting the roles of recently reported genome-wide AD risk loci. Designing a supervised latent factor framework for human brain snRNA-seq data, the authors find distinct Alzheimer’s-predictive gene modules across cell types, suggesting sub-cell-type disease progression trajectories.
Sustained IFN signaling is associated with delayed development of SARS-CoV-2-specific immunity
Elsa Brunet-Ratnasingham
Haley E. Randolph
Marjorie Labrecque
Justin Bélair
Raphaël Lima-Barbosa
Amélie Pagliuzza
Lorie Marchitto
Michael Hultström
Julia Niessl
Rose Cloutier
Alina M. Sreng Flores
Nathalie Brassard
Mehdi Benlarbi
Jérémie Prévost
Shilei Ding
Sai Priya Anand
Gérémy Sannier
Anders Larsson
Dick Wågsäter
Eric Bareke
Hugo Zeberg
Miklos Lipcsey
Robert Frithiof
Anders Larsson
Sirui Zhou
Tomoko Nakanishi
David Morrison
Dani Vezina
Catherine Bourassa
Gabrielle Gendron-Lepage
Halima Medjahed
Floriane Point
Jonathan Richard
Catherine Larochelle
Alexandre Prat
Elsa Brunet-Ratnasingham
Nathalie Arbour
Madeleine Durand
J Brent Richards
Kevin Moon
Nicolas Chomont
Andrés Finzi
Martine Tétreault
Luis Barreiro
Daniel E. Kaufmann
Plasma RNAemia, delayed antibody responses and inflammation predict COVID-19 outcomes, but the mechanisms underlying these immunovirological patterns are poorly understood. We profile 782 longitudinal plasma samples from 318 hospitalized patients with COVID-19. Integrated analysis using k-means reveals four patient clusters in a discovery cohort: mechanically ventilated critically ill cases are subdivided into good-prognosis and high-fatality clusters (reproduced in a validation cohort), while non-critical survivors segregate into high and low early antibody responders. Only the high-fatality cluster is enriched for transcriptomic signatures associated with COVID-19 severity, and each cluster has distinct RBD-specific antibody elicitation kinetics. Both critical and non-critical clusters with delayed antibody responses exhibit sustained IFN signatures, which negatively correlate with contemporaneous RBD-specific IgG levels and absolute SARS-CoV-2-specific B and CD4+ T cell frequencies. These data suggest that the “interferon paradox” previously described in murine LCMV models is operative in COVID-19, with excessive IFN signaling delaying the development of adaptive virus-specific immunity.
Improving and Generalizing Flow-Based Generative Models with Minibatch Optimal Transport
Continuous normalizing flows (CNFs) are an attractive generative modeling technique, but they have been held back by limitations in their simulation-based maximum likelihood training. We introduce the generalized conditional flow matching (CFM) technique, a family of simulation-free training objectives for CNFs. CFM features a stable regression objective like that used to train the stochastic flow in diffusion models but enjoys the efficient inference of deterministic flow models. In contrast to both diffusion models and prior CNF training algorithms, CFM does not require the source distribution to be Gaussian or require evaluation of its density. A variant of our objective is optimal transport CFM (OT-CFM), which creates simpler flows that are more stable to train and lead to faster inference, as evaluated in our experiments. Furthermore, OT-CFM is the first method to compute dynamic OT in a simulation-free way. Training CNFs with CFM improves results on a variety of conditional and unconditional generation tasks, such as inferring single-cell dynamics, unsupervised image translation, and Schrödinger bridge inference.
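The simulation-free regression described above pairs a source sample with a target sample, draws a random time, and asks the network to predict the velocity of the straight-line conditional path between them. A minimal NumPy sketch of that pair construction (the network and optimizer are omitted, and all names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def cfm_training_pairs(x0, x1, rng):
    """Build one CFM regression pair per sample: a point x_t interpolated
    between source x0 and target x1 at a random time t, with regression
    target u_t = x1 - x0 (the velocity of the straight-line path)."""
    t = rng.uniform(size=(x0.shape[0], 1))   # t ~ U(0, 1), one per sample
    x_t = (1 - t) * x0 + t * x1              # point on the conditional path
    u_t = x1 - x0                            # conditional target velocity
    return t, x_t, u_t

x0 = rng.normal(size=(128, 2))               # source samples (need not be Gaussian)
x1 = rng.normal(loc=3.0, size=(128, 2))      # target samples
t, x_t, u_t = cfm_training_pairs(x0, x1, rng)
# A network v(t, x) would then be regressed onto u_t with a squared-error loss;
# OT-CFM additionally couples x0 and x1 via a minibatch optimal transport plan
# before interpolating, which straightens the learned flow.
```

Because no ODE is simulated during training, each step costs a single forward pass, unlike likelihood-based CNF training.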
Learning and Aligning Structured Random Feature Networks
Muawiz Sajjad Chaudhary
Kameron Decker Harris
Artificial neural networks (ANNs) are considered “black boxes” due to the difficulty of interpreting their learned weights. While choosing the best features is not well understood, random feature networks (RFNs) and wavelet scattering ground some ANN learning mechanisms in function space with tractable mathematics. Meanwhile, the genetic code has evolved over millions of years, shaping the brain to develop variable neural circuits with reliable structure that resemble RFNs. We explore a similar approach, embedding neuro-inspired, wavelet-like weights into multilayer RFNs. These can outperform scattering and have kernels that describe their function space at large width. We build learnable and deeper versions of these models where we can optimize separate spatial and channel covariances of the convolutional weight distributions. We find that these networks can perform comparably with conventional ANNs while dramatically reducing the number of trainable parameters. Channel covariances are most influential, and both weight and activation alignment are needed for classification performance. Our work outlines how neuro-inspired configurations may lead to better performance in key cases and offers a potentially tractable reduced model for ANN learning.
Generalization of deep learning models for hepatic steatosis grading using B-mode ultrasound images
Yue Qi
Michael Chassé
An Tang
Guy Cloutier
Grayscale ultrasound remains a key modality for screening of hepatic steatosis due to its non-invasiveness and availability. While neural networks have shown promise in this field, their main drawback lies in their inability to generalize to diverse real-world settings. Variations in equipment, acquisition parameters, or population significantly affect model performance. Test-time adaptation, an unsupervised domain adaptation technique, overcomes these limitations by adjusting trained models during inference. Our retrospective study used two datasets collected in separate populations, with different scanners and protocols. We propose an adaptation method, using test-time batch normalization to selectively adjust BatchNorm layers based on test data for predicting steatosis grades. Comparing the non-adapted and adapted models, the mean absolute error (± standard deviation) in grading four severities of steatosis decreased from 0.92 ± 0.21 to 0.64 ± 0.22. Specifically, for detection of steatosis, the area under the curve increased from 0.76 ± 0.05 to 0.95 ± 0.02 when using the adapted model. Adapted models show promising results in improving performance compared to base models when testing data differ significantly from training data. Results suggest that the proposed method effectively addresses domain shift in diagnosing fatty liver using ultrasound images, reducing risks associated with deploying trained models.
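The core idea of test-time batch normalization is to replace a BatchNorm layer's training-domain running statistics with statistics computed from the test batch itself, re-centering features for the shifted domain. A toy NumPy illustration of the principle (not the paper's implementation; the distributions and numbers below are invented for demonstration):

```python
import numpy as np

def batchnorm(x, mean, var, gamma=1.0, beta=0.0, eps=1e-5):
    """Standard batch normalization with externally supplied statistics."""
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

# Statistics accumulated on the training domain (running mean/variance).
train_mean, train_var = 0.0, 1.0

# Test data from a shifted domain (e.g., a different scanner or protocol).
rng = np.random.default_rng(0)
x_test = rng.normal(loc=5.0, scale=2.0, size=1024)

# Stale normalization: the training statistics no longer match the data,
# so the normalized features stay far from zero mean and unit variance.
stale = batchnorm(x_test, train_mean, train_var)

# Test-time adaptation: recompute the statistics from the test batch,
# so features are re-centered and re-scaled for the new domain.
adapted = batchnorm(x_test, x_test.mean(), x_test.var())

print(stale.mean(), adapted.mean())  # adapted features are re-centered near 0
```

In a real network this swap is applied per BatchNorm layer, and can be done selectively (only for some layers) as in the method described above.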
Towards Foundational Models for Molecular Learning on Large-Scale Multi-Task Datasets
Joao Alex Cunha
Zhiyi Li
Samuel Maddrell-Mander
Callum McLean
Jama Hussein Mohamud
Michael Craig
Cristian Gabellini
Kerstin Klasers
Josef Dean
Maciej Sypetkowski
Ioannis Koutis
Hadrien Mary
Therence Bois
Andrew Fitzgibbon
Błażej Banaszewski
Chad Martin
Dominic Masters
Recently, pre-trained foundation models have shown significant advancements in multiple fields. However, the lack of datasets with labeled features and codebases has hindered the development of a supervised foundation model for molecular tasks. Here, we have carefully curated seven datasets specifically tailored for node- and graph-level prediction tasks to facilitate supervised learning on molecules. Moreover, to support the development of multi-task learning on our proposed datasets, we created the Graphium graph machine learning library. Our dataset collection encompasses two distinct categories. Firstly, the TOYMIX category modifies three small existing datasets with additional data for multi-task learning. Secondly, the LARGEMIX category includes four large-scale datasets with 344M graph-level data points and 409M node-level data points from ∼5M unique molecules. Finally, the ultra-large dataset contains 2,210M graph-level data points and 2,031M node-level data points coming from 86M molecules. Hence, our datasets represent an order of magnitude increase in data volume compared to other 2D-GNN datasets. In addition, recognizing that molecule-related tasks often span multiple levels, we have designed our library to explicitly support multi-tasking, offering a diverse range of multi-level representations, i.e., representations at the graph, node, edge, and node-pair level. We equipped the library with an extensive collection of models and features to cover different levels of molecule analysis. By combining our curated datasets with this versatile library, we aim to accelerate the development of molecule foundation models. Datasets and code are available at https://github.com/datamol-io/graphium.
Assessing Neural Network Representations During Training Using Noise-Resilient Diffusion Spectral Entropy
Danqi Liao
Chen Liu
Benjamin W Christensen
Maximilian Nickel
Ian Adelstein
Entropy and mutual information in neural networks provide rich information on the learning process, but they have proven difficult to compute reliably in high dimensions. Indeed, in noisy and high-dimensional data, traditional estimates in ambient dimensions approach a fixed entropy and are prohibitively hard to compute. To address these issues, we leverage data geometry to access the underlying manifold and reliably compute these information-theoretic measures. Specifically, we define diffusion spectral entropy (DSE) in neural representations of a dataset as well as diffusion spectral mutual information (DSMI) between different variables representing data. First, we show that they form noise-resistant measures of intrinsic dimensionality and relationship strength in high-dimensional simulated data that outperform classic Shannon entropy, nonparametric estimation, and mutual information neural estimation (MINE). We then study the evolution of representations in classification networks with supervised learning, self-supervision, or overfitting. We observe that (1) DSE of neural representations increases during training; (2) DSMI with the class label increases during generalizable learning but stays stagnant during overfitting; (3) DSMI with the input signal shows differing trends: on MNIST it increases, while on CIFAR-10 and STL-10 it decreases. Finally, we show that DSE can be used to guide better network initialization and that DSMI can be used to predict downstream classification accuracy across 962 models on ImageNet.
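A diffusion spectral entropy of the kind described above can be sketched as the Shannon entropy of the (normalized) eigenvalue spectrum of a data-driven diffusion operator. The NumPy sketch below conveys the idea only; the bandwidth heuristic and normalization are our assumptions, not the authors' exact procedure:

```python
import numpy as np

def diffusion_spectral_entropy(X, t=1):
    """Shannon entropy of the powered eigenvalue spectrum of a diffusion
    operator built on X (a simplified sketch of the DSE idea)."""
    # Pairwise squared distances and a Gaussian affinity kernel,
    # with bandwidth set by the median pairwise distance (our heuristic).
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    sigma = np.median(np.sqrt(d2[d2 > 0]))
    K = np.exp(-d2 / (2 * sigma**2))
    P = K / K.sum(axis=1, keepdims=True)      # row-stochastic diffusion matrix
    lam = np.abs(np.linalg.eigvals(P)) ** t   # spectrum after t diffusion steps
    p = lam / lam.sum()                       # treat spectrum as a distribution
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())      # Shannon entropy of the spectrum

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 10))                 # 60 points in 10 dimensions
H = diffusion_spectral_entropy(X)
print(round(H, 3))
```

Data concentrated near a low-dimensional manifold concentrates the diffusion spectrum, which this entropy is designed to detect.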
Enhancing Supervised Visualization Through Autoencoder and Random Forest Proximities for Out-of-Sample Extension
Kevin R. Moon
Jake S. Rhodes
The value of supervised dimensionality reduction lies in its ability to uncover meaningful connections between data features and labels. Common dimensionality reduction methods embed a set of fixed, latent points, but are not capable of generalizing to an unseen test set. In this paper, we provide an out-of-sample extension method for the random forest-based supervised dimensionality reduction method RF-PHATE, combining information learned from the random forest model with the function-learning capabilities of autoencoders. Through quantitative assessment of various autoencoder architectures, we identify that networks that reconstruct random forest proximities are more robust for the embedding extension problem. Furthermore, by leveraging proximity-based prototypes, we achieve a 40% reduction in training time without compromising extension quality. Our method does not require label information for out-of-sample points, thus serving as a semi-supervised method, and can achieve consistent quality using only 10% of the training data.
Learnable Filters for Geometric Scattering Modules
Dhananjay Bhaskar
Kincaid MacDonald
Jackson Grady
Michael Perlmutter