
Guy Wolf

Core Academic Member
Canada CIFAR AI Chair
Associate Professor, Université de Montréal, Department of Mathematics and Statistics
Concordia University
CHUM - Montreal University Hospital Center
Research Topics
Medical Machine Learning
Representation Learning
Multimodal Learning
Deep Learning
Spectral Learning
Learning on Graphs
Data Mining
Molecular Modeling
Information Retrieval
Graph Neural Networks
Dynamical Systems
Machine Learning Theory

Biography

Guy Wolf is an associate professor in the Department of Mathematics and Statistics at Université de Montréal. His research interests lie at the intersection of machine learning, data science, and applied mathematics. He is particularly interested in data mining methods that use manifold learning and deep geometric learning, as well as applications to the exploratory analysis of biomedical data.

His research focuses on exploratory data analysis, with applications in bioinformatics. His approaches are multidisciplinary and combine machine learning, signal processing, and applied mathematical tools. In particular, his recent work uses a combination of diffusion geometries and deep learning to find emergent patterns, dynamics, and structure in high-dimensional big data (for example, in single-cell genomics and proteomics).

Current Students

Research Master's - UdeM
Co-supervisor:
PhD - UdeM
Alumni Collaborator
Alumni Collaborator - UdeM
Research Collaborator - Western Washington University (faculty; assistant professor)
Co-supervisor:
Alumni Collaborator - McGill
Research Master's - Concordia
Principal supervisor:
PhD - Concordia
Principal supervisor:
Research Master's - UdeM
Principal supervisor:
Research Intern - UdeM
Postdoctorate - UdeM
PhD - UdeM
PhD - UdeM
Co-supervisor:
Research Master's - Concordia
Principal supervisor:
PhD - UdeM
PhD - UdeM
Co-supervisor:
Postdoctorate - Concordia
Principal supervisor:
PhD - UdeM
Principal supervisor:
PhD - UdeM
PhD - Concordia
Principal supervisor:
Research Master's - UdeM
PhD - UdeM
Principal supervisor:
Research Collaborator - UdeM
Co-supervisor:
Research Intern - Western Washington University
Principal supervisor:
Postdoctorate - UdeM
Research Collaborator - McGill (assistant professor)

Publications

Conditional Flow Matching: Simulation-Free Dynamic Optimal Transport
Alexander Tong
Nikolay Malkin
Guillaume Huguet
Yanlei Zhang
Jarrid Rector-Brooks
Kilian Fatras
Continuous normalizing flows (CNFs) are an attractive generative modeling technique, but they have thus far been held back by limitations in their simulation-based maximum likelihood training. In this paper, we introduce a new technique called conditional flow matching (CFM), a simulation-free training objective for CNFs. CFM features a stable regression objective like that used to train the stochastic flow in diffusion models but enjoys the efficient inference of deterministic flow models. In contrast to both diffusion models and prior CNF training algorithms, our CFM objective does not require the source distribution to be Gaussian or require evaluation of its density. Based on this new objective, we also introduce optimal transport CFM (OT-CFM), which creates simpler flows that are more stable to train and lead to faster inference, as evaluated in our experiments. Training CNFs with CFM improves results on a variety of conditional and unconditional generation tasks such as inferring single cell dynamics, unsupervised image translation, and Schrödinger bridge inference. Code is available at https://github.com/atong01/conditional-flow-matching.
Geodesic Sinkhorn for Fast and Accurate Optimal Transport on Manifolds
Guillaume Huguet
Alexander Tong
María Ramos Zapatero
Christopher J. Tape
Efficient computation of optimal transport distance between distributions is of growing importance in data science. Sinkhorn-based methods are currently the state-of-the-art for such computations, but require O(n^2) computations. In addition, Sinkhorn-based methods commonly use a Euclidean ground distance between datapoints. However, with the prevalence of manifold structured scientific data, it is often desirable to consider geodesic ground distance. Here, we tackle both issues by proposing Geodesic Sinkhorn, based on diffusing a heat kernel on a manifold graph. Notably, Geodesic Sinkhorn requires only O(n log n) computation, as we approximate the heat kernel with Chebyshev polynomials based on the sparse graph Laplacian. We apply our method to the computation of barycenters of several distributions of high dimensional single cell data from patient samples undergoing chemotherapy. In particular, we define the barycentric distance as the distance between two such barycenters. Using this definition, we identify an optimal transport distance and path associated with the effect of treatment on cellular data.
A Heat Diffusion Perspective on Geodesic Preserving Dimensionality Reduction
Guillaume Huguet
Alexander Tong
Edward De Brouwer
Yanlei Zhang
Ian Adelstein
Inferring Dynamic Regulatory Interaction Graphs From Time Series Data With Perturbations
Dhananjay Bhaskar
Daniel Sumner Magruder
Edward De Brouwer
Matheo Morales
Aarthi Venkat
Frederik Wenkel
Learning Shared Neural Manifolds from Multi-Subject FMRI Data
Jessie Huang
Je-chun Huang
Erica Lindsey Busch
Tom Wallenstein
Michal Gerasimiuk
Andrew Benz
Nicholas Turk-Browne
Functional magnetic resonance imaging (fMRI) data is collected in millions of noisy, redundant dimensions. To understand how different brains process the same stimulus, we aim to denoise the fMRI signal via a meaningful embedding space that captures the data's intrinsic structure as shared across brains. We assume that stimulus-driven responses share latent features common across subjects that are jointly discoverable. Previous approaches to this problem have relied on linear methods like principal component analysis and shared response modeling. We propose a neural network called MRMD-AE (manifold-regularized multiple-decoder autoencoder) that learns a common embedding from multi-subject fMRI data while retaining the ability to decode individual responses. Our latent common space represents an extensible manifold (where untrained data can be mapped) and improves classification accuracy of stimulus features of unseen timepoints, as well as cross-subject translation of fMRI signals.
Parametric Scattering Networks
Shanel Gauthier
Benjamin Thérien
Laurent Alséne-Racicot
Muawiz Chaudhary
Michael Eickenberg
The wavelet scattering transform creates geometric invariants and deformation stability. In multiple signal domains, it has been shown to yield more discriminative representations compared to other non-learned representations and to outperform learned representations in certain tasks, particularly on limited labeled data and highly structured signals. The wavelet filters used in the scattering transform are typically selected to create a tight frame via a parameterized mother wavelet. In this work, we investigate whether this standard wavelet filterbank construction is optimal. Focusing on Morlet wavelets, we propose to learn the scales, orientations, and aspect ratios of the filters to produce problem-specific parameterizations of the scattering transform. We show that our learned versions of the scattering transform yield significant performance gains in small-sample classification settings over the standard scattering transform. Moreover, our empirical results suggest that traditional filterbank constructions may not always be necessary for scattering transforms to extract effective representations.
Multiscale PHATE identifies multimodal signatures of COVID-19
Manik Kuchroo
Je-chun Huang
Patrick Wong
Jean-Christophe Grenier
Dennis Shung
Alexander Tong
Carolina Lucas
Jon Klein
Daniel B. Burkhardt
Scott Gigante
Abhinav Godavarthi
Bastian Rieck
Benjamin Israelow
Michael Simonov
Tianyang Mao
Ji Eun Oh
Julio Silva
Takehiro Takahashi
Camila D. Odio
Arnau Casanovas-Massana … (10 more authors)
John Fournier
Shelli Farhadian
Charles S. Dela Cruz
Albert I. Ko
Matthew Hirn
F. Perry Wilson
Akiko Iwasaki
Multiscale PHATE identifies multimodal signatures of COVID-19
Manik Kuchroo
Je-chun Huang
Patrick W. Wong
Jean-Christophe Grenier
Dennis L. Shung
Alexander Tong
C. Lucas
J. Klein
Daniel B. Burkhardt
Scott Gigante
Abhinav Godavarthi
Bastian Rieck
Benjamin Israelow
Michael Simonov
Tianyang Mao
Ji Eun Oh
Julio Silva
Takehiro Takahashi
C. Odio
Arnau Casanovas-Massana … (10 more authors)
John Byrne Fournier
Shelli F. Farhadian
C. D. Dela Cruz
A. Ko
Matthew Hirn
F. Wilson
Akiko Iwasaki
Population Genomics Approaches for Genetic Characterization of SARS-CoV-2 Lineages
Fatima Mostefai
I. Gamache
Arnaud N’Guessan
Justin Pelletier
Jessie Huang
Carmen Lia Murall
Ahmad Pesaranghader
Vanda Gaonac'h-Lovejoy
David J. Hamelin
Raphael Poujol
Jean-Christophe Grenier
Martin W. Smith
Étienne Caron
Morgan Craig
B. Jesse Shapiro
The genome of the Severe Acute Respiratory Syndrome coronavirus 2 (SARS-CoV-2), the pathogen that causes coronavirus disease 2019 (COVID-19), has been sequenced at an unprecedented scale, leading to a tremendous amount of viral genome sequencing data. To assist in tracing infection pathways and designing preventive strategies, a deep understanding of the viral genetic diversity landscape is needed. We present here a set of genomic surveillance tools from population genetics which can be used to better understand the evolution of this virus in humans. To illustrate the utility of this toolbox, we detail an in-depth analysis of the genetic diversity of SARS-CoV-2 in the first year of the COVID-19 pandemic. We analyzed 329,854 high-quality consensus sequences published in the GISAID database during the pre-vaccination phase. We demonstrate that, compared to standard phylogenetic approaches, haplotype networks can be computed efficiently on much larger datasets. This approach enables real-time lineage identification, a clear description of the relationship between variants of concern, and efficient detection of recurrent mutations. Furthermore, time series change of Tajima's D by haplotype provides a powerful metric of lineage expansion. Finally, principal component analysis (PCA) highlights key steps in variant emergence and facilitates the visualization of genomic variation in the context of SARS-CoV-2 diversity. The computational framework presented here is simple to implement and insightful for real-time genomic surveillance of SARS-CoV-2 and could be applied to any pathogen that threatens the health of populations of humans and other organisms.
Population Genomics Approaches for Genetic Characterization of SARS-CoV-2 Lineages
Fatima Mostefai
Isabel Gamache
Arnaud N’Guessan
Justin Pelletier
Jessie Huang
Carmen Lia Murall
Ahmad Pesaranghader
Vanda Gaonac'h-Lovejoy
David J. Hamelin
Raphael Poujol
Jean-Christophe Grenier
Martin Smith
Etienne Caron
Morgan Craig
B. Jesse Shapiro
Goal-driven optimization of single-neuron properties in artificial networks reveals regularization role of neural diversity and adaptation in the brain
Victor Geadah
Stefan Horoi
Giancarlo Kerg
Neurons in the brain have rich and adaptive input-output properties. Features such as diverse f-I curves and spike frequency adaptation are known to place single neurons in optimal coding regimes when facing changing stimuli. Yet, it is still unclear how brain circuits exploit single neuron flexibility, and how network-level requirements may have shaped such cellular function. To answer this question, a multi-scaled approach is needed where the computations of single neurons and of neural circuits must be considered as a complete system. In this work, we use artificial neural networks to systematically investigate single neuron input-output adaptive mechanisms, optimized in an end-to-end fashion. Throughout the optimization process, each neuron has the liberty to modify its nonlinear activation function, parametrized to mimic f-I curves of biological neurons, and to learn adaptation strategies to modify activation functions in real-time during a task. We find that such networks show much-improved robustness to noise and changes in input statistics. Importantly, we find that this procedure recovers precise coding strategies found in biological neurons, such as gain scaling and fractional order differentiation/integration. Using tools from dynamical systems theory, we analyze the role of these emergent single neuron properties and argue that neural diversity and adaptation play an active regularization role that enables neural circuits to optimally propagate information across time.