Guy Wolf

Core Academic Member
Canada CIFAR AI Chair
Associate Professor, Université de Montréal, Department of Mathematics and Statistics
Concordia University
CHUM - Montreal University Hospital Center
Research Topics
Medical Machine Learning
Representation Learning
Multimodal Learning
Deep Learning
Spectral Learning
Learning on Graphs
Data Mining
Molecular Modeling
Information Retrieval
Graph Neural Networks
Dynamical Systems
Machine Learning Theory

Biography

Guy Wolf is an associate professor in the Department of Mathematics and Statistics at Université de Montréal. His research interests lie at the intersection of machine learning, data science and applied mathematics. He is particularly interested in data mining methods that use manifold learning and geometric deep learning, as well as their applications to the exploratory analysis of biomedical data.

His research focuses on exploratory data analysis, with applications in bioinformatics. His approaches are multidisciplinary and combine machine learning, signal processing and applied mathematical tools. In particular, his recent work uses a combination of diffusion geometry and deep learning to find emergent patterns, dynamics and structure in high-dimensional big data (for example, in single-cell genomics and proteomics).

Current Students

Research Master's - UdeM
Co-supervisor:
PhD - UdeM
Independent visiting researcher - Helmholtz Munich
Collaborating Alumni
Research Intern - UdeM
Collaborating researcher - Western Washington University (faculty; assistant prof)
Co-supervisor:
Research Master's - McGill
Principal supervisor:
Research Master's - Concordia
Principal supervisor:
PhD - Concordia
Principal supervisor:
Research Master's - UdeM
Principal supervisor:
Collaborating researcher - Yale
Postdoctorate - UdeM
Independent visiting researcher - Helmholtz Munich / TUM
PhD - UdeM
Independent visiting researcher - LMU Munich & Helmholtz Munich
PhD - UdeM
Co-supervisor:
Research Master's - Concordia
Principal supervisor:
PhD - UdeM
PhD - UdeM
Co-supervisor:
Research Master's - UdeM
Co-supervisor:
Postdoctorate - Concordia
Principal supervisor:
PhD - UdeM
Principal supervisor:
PhD - UdeM
PhD - Concordia
Principal supervisor:
Research Master's - UdeM
PhD - UdeM
Principal supervisor:
Collaborating researcher - UdeM
Co-supervisor:
Collaborating researcher - Yale
Research Intern - Western Washington University
Principal supervisor:
Postdoctorate - UdeM
Collaborating researcher - McGill (assistant professor)

Publications

Graph Neural Networks Meet Probabilistic Graphical Models: A Survey
Chenqing Hua
Sitao Luan
Qian Zhang
Jie Fu
Towards Graph Foundation Models: A Study on the Generalization of Positional and Structural Encodings
Billy Joe Franks
Moshe Eliasof
Semih Cantürk
Carola-Bibiane Schönlieb
Sophie Fellenz
Marius Kloft
Recent advances in integrating positional and structural encodings (PSEs) into graph neural networks (GNNs) have significantly enhanced their performance across various graph learning tasks. However, the general applicability of these encodings and their potential to serve as foundational representations for graphs remain uncertain. This paper investigates the fine-tuning efficiency, scalability with sample size, and generalization capability of learnable PSEs across diverse graph datasets. Specifically, we evaluate their potential as universal pre-trained models that can be easily adapted to new tasks with minimal fine-tuning and limited data. Furthermore, we assess the expressivity of the learned representations, particularly, when used to augment downstream GNNs. We demonstrate through extensive benchmarking and empirical analysis that PSEs generally enhance downstream models. However, some datasets may require specific PSE-augmentations to achieve optimal performance. Nevertheless, our findings highlight their significant potential to become integral components of future graph foundation models. We provide new insights into the strengths and limitations of PSEs, contributing to the broader discourse on foundation models in graph learning.
Random Forest Autoencoders for Guided Representation Learning
Adrien Aumon
Shuang Ni
Myriam Lizotte
Kevin R. Moon
Jake S. Rhodes
Decades of research have produced robust methods for unsupervised data visualization, yet supervised visualization…
Channel-Selective Normalization for Label-Shift Robust Test-Time Adaptation
Pedro Vianna
Muawiz Chaudhary
Paria Mehrbod
An Tang
Guy Cloutier
Michael Eickenberg
Deep neural networks have useful applications in many different tasks; however, their performance can be severely affected by changes in the data distribution. For example, in the biomedical field, their performance can be affected by changes in the data (different machines, populations) between training and test datasets. To ensure robustness and generalization to real-world scenarios, test-time adaptation has been recently studied as an approach to adjust models to a new data distribution during inference. Test-time batch normalization is a simple and popular method that achieved compelling performance on domain shift benchmarks. It is implemented by recalculating batch normalization statistics on test batches. Prior work has focused on analysis with test data that has the same label distribution as the training data. However, in many practical applications this technique is vulnerable to label distribution shifts, sometimes producing catastrophic failure. This presents a risk in applying test-time adaptation methods in deployment. We propose to tackle this challenge by only selectively adapting channels in a deep network, minimizing drastic adaptation that is sensitive to label shifts. Our selection scheme is based on two principles that we empirically motivate: (1) later layers of networks are more sensitive to label shift, and (2) individual features can be sensitive to specific classes. We apply the proposed technique to three classification tasks, including CIFAR10-C, Imagenet-C, and diagnosis of fatty liver, where we explore both covariate and label distribution shifts. We find that our method brings the benefits of test-time adaptation (TTA) while significantly reducing the risk of failure common in other methods, and is robust to the choice of hyperparameters.
Principal Curvatures Estimation with Applications to Single Cell Data
Yanlei Zhang
Lydia Mezrag
Xingzhi Sun
Charles Xu
Kincaid MacDonald
Dhananjay Bhaskar
Bastian Rieck
The rapidly growing field of single-cell transcriptomic sequencing (scRNAseq) presents challenges for data analysis due to its massive datasets. A common approach in manifold learning is to hypothesize that datasets lie on a lower-dimensional manifold. This makes it possible to study the geometry of point clouds by extracting meaningful descriptors like curvature. In this work, we present Adaptive Local PCA (AdaL-PCA), a data-driven method for accurately estimating various notions of intrinsic curvature on data manifolds, in particular principal curvatures for surfaces. The model relies on local PCA to estimate the tangent spaces. The evaluation of AdaL-PCA on sampled surfaces shows state-of-the-art results. Combined with a PHATE embedding, the model applied to single-cell RNA sequencing data allows us to identify key variations in the cellular differentiation.
AdaFisher: Adaptive Second Order Optimization via Fisher Information
Damien MARTINS GOMES
Yanlei Zhang
Mahdi S. Hosseini
First-order optimization methods are currently the mainstream in training deep neural networks (DNNs). Optimizers like Adam incorporate limited curvature information by employing the diagonal matrix preconditioning of the stochastic gradient during the training. Despite the widespread use of first-order methods, second-order optimization algorithms exhibit superior convergence properties compared to their first-order counterparts, e.g., Adam and SGD. However, their practicality in training DNNs is still limited due to increased per-iteration computations and suboptimal accuracy compared to the first-order methods. We present AdaFisher, an adaptive second-order optimizer that leverages a block-diagonal approximation to the Fisher information matrix for adaptive gradient preconditioning. AdaFisher aims to bridge the gap between enhanced convergence capabilities and computational efficiency in the second-order optimization framework for training DNNs. Despite the slow pace of second-order optimizers, we showcase that AdaFisher can be reliably adopted for image classification and language modelling, and that it stands out for its stability and robustness in hyperparameter tuning. We demonstrate that AdaFisher outperforms the SOTA optimizers in terms of both accuracy and convergence speed. Code available from https://github.com/AtlasAnalyticsLab/AdaFisher
Geometry-Aware Generative Autoencoders for Warped Riemannian Metric Learning and Generative Modeling on Data Manifolds
Xingzhi Sun
Danqi Liao
Kincaid MacDonald
Yanlei Zhang
Guillaume Huguet
Chen Liu
Ian Adelstein
Tim G. J. Rudner
Geometry-Aware Generative Autoencoder for Warped Riemannian Metric Learning and Generative Modeling on Data Manifolds
Xingzhi Sun
Danqi Liao
Kincaid MacDonald
Yanlei Zhang
Guillaume Huguet
Ian Adelstein
Tim G. J. Rudner
Rapid growth of high-dimensional datasets in fields such as single-cell RNA sequencing and spatial genomics has led to unprecedented opportunities for scientific discovery, but it also presents unique computational and statistical challenges. Traditional methods struggle with geometry-aware data generation, interpolation along meaningful trajectories, and transporting populations via feasible paths. To address these issues, we introduce Geometry-Aware Generative Autoencoder (GAGA), a novel framework that combines extensible manifold learning with generative modeling. GAGA constructs a neural network embedding space that respects the intrinsic geometries discovered by manifold learning and learns a novel warped Riemannian metric on the data space. This warped metric is derived from both the points on the data manifold and negative samples off the manifold, allowing it to characterize a meaningful geometry across the entire latent space. Using this metric, GAGA can uniformly sample points on the manifold, generate points along geodesics, and interpolate between populations across the learned manifold. GAGA shows competitive performance in simulated and real-world datasets, including a 30% improvement over SOTA in single-cell population-level trajectory inference.
Non-Uniform Parameter-Wise Model Merging
Albert Manuel Orozco Camacho
Stefan Horoi
Combining multiple machine learning models has long been a technique for enhancing performance, particularly in distributed settings. Traditional approaches, such as model ensembles, work well, but are expensive in terms of memory and compute. Recently, methods based on averaging model parameters have achieved good results in some settings and have gained popularity. However, merging models initialized differently that do not share a part of their training trajectories can yield worse results than simply using the base models, even after aligning their neurons. In this paper, we introduce a novel approach, Non-uniform Parameter-wise Model Merging, or NP Merge, which merges models by learning the contribution of each parameter to the final model using gradient-based optimization. We empirically demonstrate the effectiveness of our method for merging models of various architectures in multiple settings, outperforming past methods. We also extend NP Merge to handle the merging of multiple models, showcasing its scalability and robustness.