Guy Wolf

Joao Felipe Carneiro Barbosa Rocha

Biography

Guy Wolf is a full professor in the Department of Mathematics and Statistics (DMS) at Université de Montréal (UdeM), a Canada CIFAR AI Chair, and a Core Academic Member of Mila (Quebec Artificial Intelligence Institute). He is also an associate researcher at the CRCHUM (Centre de recherche du Centre hospitalier de l'Université de Montréal) and a participating principal investigator at the International Helmholtz Laboratory for Causal Cell Dynamics.

In 2024, he received a Humboldt Research Fellowship for Experienced Researchers, under which he served as a visiting professor at Heidelberg University (2024) and at Helmholtz Munich (2024-2026) in Germany. Before joining UdeM and Mila, he was a Gibbs Assistant Professor (2015-2018) in the Applied Mathematics Program and then an associate research scientist in the Department of Genetics (2018) at Yale University (Connecticut, USA). Prior to that, he was a postdoctoral researcher (2013-2015) in the Department of Computer Science at École normale supérieure in Paris (France). He holds a PhD in computer science from Tel Aviv University (Israel) and has five years of prior experience designing and developing data-analysis software in a military context.

His current research focuses on guided representation learning for data exploration, in particular through methods that leverage manifold learning and geometric deep learning for dimensionality reduction, visualization, denoising, data augmentation, and coarse graining. Although these approaches apply to a wide range of fields, he is particularly interested in the intersection of AI and health, notably in tools that facilitate exploratory analysis of biomedical data in areas such as single-cell multiomics, drug discovery, and neuroscience.

Current Students

PhD - UdeM

PhD - UdeM

Research Collaborator - Yale University

Co-supervisor:

Research Master's - UdeM

Co-supervisor:

PhD - UdeM

Xiao Huang

Research Master's - Concordia

Principal supervisor:

Alumni Collaborator - UdeM

Paul Janson

PhD - Concordia

Principal supervisor:

PhD - UdeM

Independent Visiting Researcher - Helmholtz Munich

Philippe Martin

PhD - UdeM

Co-supervisor:

Paul François

Paria Mehrbod

Research Master's - Concordia

Principal supervisor:

Eugene Belilovsky

Lydia Mezrag

PhD - UdeM

Kevin Moon

Research Collaborator

Sacha Morin

PhD - UdeM

Co-supervisor:

Postdoctorate - Concordia

Principal supervisor:

Shuang Ni

PhD - UdeM

Albert Orozco Camacho

PhD - Concordia

Principal supervisor:

Jake Rhodes

Research Collaborator - BYU

Thomas Sabourin

Research Master's - UdeM

Matthew Scicluna

PhD - UdeM

Principal supervisor:

PhD - UdeM

Research Master's - UdeM

Jesuino Vieira Filho

Research Master's - UdeM

Alumni Collaborator - UdeM

Co-supervisor:

Research Collaborator - McGill (assistant professor)

Analyzing the COVID-19 Interferon Paradox Using Dimensionality Reduction and Clustering

Blog Posts

Graph and representation of working methodology, and graph of data on deaths 60 days after onset of symptoms.

February 19, 2025

by

Sacha Morin

Elsa Brunet-Ratnasingham

Guy Wolf

Publications

Recipe for a General, Powerful, Scalable Graph Transformer

Ladislav Rampasek

Mikhail Galkin

Vijay Prakash Dwivedi

Anh Tuan Luu

Dominique Beaini

We propose a recipe on how to build a general, powerful, scalable (GPS) graph Transformer with linear complexity and state-of-the-art results on a diverse set of benchmarks. Graph Transformers (GTs) have gained popularity in the field of graph representation learning with a variety of recent publications but they lack a common foundation about what constitutes a good positional or structural encoding, and what differentiates them. In this paper, we summarize the different types of encodings with a clearer definition and categorize them as being

2021-12-31

Advances in Neural Information Processing Systems 35 (NeurIPS 2022) (published)

openreview.net

Patient health records and whole viral genomes from an early SARS-CoV-2 outbreak in a Quebec hospital reveal features associated with favorable outcomes

Bastien Paré

Marieke Rozendaal

Sacha Morin

Raphaël Poujol

Fatima Mostefai

Shawn M. Simpson

Jean-Christophe Grenier

Léa Kaufmann

Henry Xing

Miguelle Sanchez

Ariane Yechouron

Ronald Racette

Julie G. Hussin

Ivan Pavlov

Martin A. Smith

The first confirmed case of COVID-19 in Quebec, Canada, occurred at Verdun Hospital on February 25, 2020. A month later, a localized outbreak was observed at this hospital. We performed tiled amplicon whole genome nanopore sequencing on nasopharyngeal swabs from all SARS-CoV-2 positive samples from 31 March to 17 April 2020 in 2 local hospitals to assess the viral diversity of the outbreak. We report 264 viral genomes from 242 individuals (both staff and patients) with associated clinical features and outcomes, as well as longitudinal samples, technical replicates and the first publicly disseminated SARS-CoV-2 genomes in Quebec. Viral lineage assessment identified multiple subclades in both hospitals, with a predominant subclade in the Verdun outbreak, indicative of hospital-acquired transmission. Dimensionality reduction identified two subclades that evaded supervised lineage assignment methods, including Pangolin, and identified certain symptoms (headache, myalgia and sore throat) that are significantly associated with favorable patient outcomes. We also address certain limitations of standard SARS-CoV-2 bioinformatics procedures, notably when presented with multiple viral haplotypes.

2021-12-01

PLOS ONE (published)

Fixing Bias in Reconstruction-Based Anomaly Detection with Lipschitz Discriminators

Alexander Tong

Smita Krishnaswamy

Anomaly detection is of great interest in fields where abnormalities need to be identified and corrected (e.g., medicine and finance). Deep learning methods for this task often rely on autoencoder reconstruction error, sometimes in conjunction with other errors. We show that this approach exhibits intrinsic biases that lead to undesirable results. Reconstruction-based methods are sensitive to training-data outliers and simple-to-reconstruct points. Instead, we introduce a new unsupervised Lipschitz anomaly discriminator that does not suffer from these biases. Our anomaly discriminator is trained, similar to the ones used in GANs, to detect the difference between the training data and corruptions of the training data. We show that this procedure successfully detects unseen anomalies with guarantees on those that have a certain Wasserstein distance from the data or corrupted training set. These additions allow us to show improved performance on MNIST, CIFAR10, and health record data.

2021-11-27

Journal of Signal Processing Systems (unknown)

arxiv.org

Data-driven approaches for genetic characterization of SARS-CoV-2 lineages

Fatima Mostefai

Isabel Gamache

Jessie Huang

Arnaud N’Guessan

Justin Pelletier

Ahmad Pesaranghader

David Hamelin

Carmen Lia Murall

Raphaël Poujol

Jean-Christophe Grenier

Martin Smith

Etienne Caron

Morgan Craig

Jesse Shapiro

Smita Krishnaswamy

Julie G. Hussin

The genome of the Severe Acute Respiratory Syndrome coronavirus 2 (SARS-CoV-2), the pathogen that causes coronavirus disease 2019 (COVID-19), has been sequenced at an unprecedented scale, leading to a tremendous amount of viral genome sequencing data. To understand the evolution of this virus in humans, and to assist in tracing infection pathways and designing preventive strategies, we present a set of computational tools that span phylogenomics, population genetics and machine learning approaches. To illustrate the utility of this toolbox, we detail an in-depth analysis of the genetic diversity of SARS-CoV-2 in the first year of the COVID-19 pandemic, using 329,854 high-quality consensus sequences published in the GISAID database during the pre-vaccination phase. We demonstrate that, compared to standard phylogenetic approaches, haplotype networks can be computed efficiently on much larger datasets, enabling real-time analyses. Furthermore, time series change of Tajima’s D provides a powerful metric of population expansion. Unsupervised learning techniques further highlight key steps in variant detection and facilitate the study of the role of this genomic variation in the context of SARS-CoV-2 infection, with Multiscale PHATE methodology identifying fine-scale structure in the SARS-CoV-2 genetic data that underlies the emergence of key lineages. The computational framework presented here is useful for real-time genomic surveillance of SARS-CoV-2 and could be applied to any pathogen that threatens the health of worldwide populations of humans and other organisms.

2021-09-28

bioRxiv (preprint)

Embedding Signals on Graphs with Unbalanced Diffusion Earth Mover's Distance

Dennis Shung

Manik Kuchroo

In modern relational machine learning it is common to encounter large graphs that arise via interactions or similarities between observations in many domains. Further, in many cases the target entities for analysis are actually signals on such graphs. We propose to compare and organize such datasets of graph signals by using an earth mover's distance (EMD) with a geodesic cost over the underlying graph. Typically, EMD is computed by optimizing over the cost of transporting one probability distribution to another over an underlying metric space. However, this is inefficient when computing the EMD between many signals. Here, we propose an unbalanced graph EMD that efficiently embeds the unbalanced EMD on an underlying graph into an

2021-07-25

arXiv (preprint)

arxiv.org

Extendable and invertible manifold learning with geometry regularized autoencoders

Andrés F. Duque

Sacha Morin

Kevin Moon

A fundamental task in data exploration is to extract simplified low dimensional representations that capture intrinsic geometry in data, especially for faithfully visualizing data in two or three dimensions. Common approaches to this task use kernel methods for manifold learning. However, these methods typically only provide an embedding of fixed input data and cannot extend to new data points. Autoencoders have also recently become popular for representation learning. But while they naturally compute feature extractors that are both extendable to new data and invertible (i.e., reconstructing original features from latent representation), they have limited capabilities to follow global intrinsic geometry compared to kernel-based manifold learning. We present a new method for integrating both approaches by incorporating a geometric regularization term in the bottleneck of the autoencoder. Our regularization, based on the diffusion potential distances from the recently-proposed PHATE visualization method, encourages the learned latent representation to follow intrinsic data geometry, similar to manifold learning algorithms, while still enabling faithful extension to new data and reconstruction of data in the original feature space from latent coordinates. We compare our approach with leading kernel methods and autoencoder models for manifold learning to provide qualitative and quantitative evidence of our advantages in preserving intrinsic structure, out of sample extension, and reconstruction. Our method is easily implemented for big-data applications, whereas other methods are limited in this regard.

2020-12-09

2020 IEEE International Conference on Big Data (Big Data) (published)

arxiv.org

Multiscale PHATE Exploration of SARS-CoV-2 Data Reveals Multimodal Signatures of Disease

Manik Kuchroo

Jessie Huang

Patrick Wong

Jean-Christophe Grenier

Dennis Shung

Alexander Tong

Carolina Lucas

Jon Klein

Daniel B. Burkhardt

Scott Gigante

Abhinav Godavarthi

Benjamin Israelow

Tianyang Mao

Ji Eun Oh

Julio Silva

Takehiro Takahashi

Camila D. Odio

Arnau Casanovas-Massana

John Fournier

Shelli Farhadian … (and 7 more)

Charles S. Dela Cruz

Albert I. Ko

F. Perry Wilson

Julie Hussin

Akiko Iwasaki

Smita Krishnaswamy

The biomedical community is producing increasingly high dimensional datasets, integrated from hundreds of patient samples, which current computational techniques struggle to explore. To uncover biological meaning from these complex datasets, we present an approach called Multiscale PHATE, which learns abstracted biological features from data that can be directly predictive of disease. Built on a coarse graining process called diffusion condensation, Multiscale PHATE learns a data topology that can be analyzed at coarse levels for high level summarizations of data, as well as at fine levels for detailed representations on subsets. We apply Multiscale PHATE to study the immune response to COVID-19 in 54 million cells from 168 hospitalized patients. Through our analysis of patient samples, we identify CD16-hi,CD66b-lo neutrophil and IFNγ+,GranzymeB+ Th17 cell responses enriched in patients who die. Furthermore, we show that population groupings Multiscale PHATE discovers can be directly fed into a classifier to predict disease outcome. We also use Multiscale PHATE-derived features to construct two different manifolds of patients, one from abstracted flow cytometry features and another directly on patient clinical features, both associating immune subsets and clinical markers with outcome.

2020-11-16

bioRxiv (preprint)

Geometric Wavelet Scattering Networks on Compact Riemannian Manifolds

Michael Perlmutter

Feng Gao

Matthew Hirn

The Euclidean scattering transform was introduced nearly a decade ago to improve the mathematical understanding of convolutional neural networks. Inspired by recent interest in geometric deep learning, which aims to generalize convolutional neural networks to manifold and graph-structured domains, we define a geometric scattering transform on manifolds. Similar to the Euclidean scattering transform, the geometric scattering transform is based on a cascade of wavelet filters and pointwise nonlinearities. It is invariant to local isometries and stable to certain types of diffeomorphisms. Empirical results demonstrate its utility on several geometric learning tasks. Our results generalize the deformation stability and local translation invariance of Euclidean scattering, and demonstrate the importance of linking the used filter structures to the underlying geometry of the data.

2020-06-30

PubMed (unknown)