Guy Wolf

Joao Felipe Carneiro Barbosa Rocha

Google Scholar

Biography

Guy Wolf is an associate professor in the Department of Mathematics and Statistics at Université de Montréal. His research interests lie at the intersection of machine learning, data science, and applied mathematics. He is particularly interested in data mining methods that use manifold learning and deep geometric learning, as well as applications for the exploratory analysis of biomedical data.

His research focuses on exploratory data analysis, with applications in bioinformatics. His approaches are multidisciplinary, combining machine learning, signal processing, and applied mathematics tools. In particular, his recent work uses a combination of diffusion geometries and deep learning to find emergent patterns, dynamics, and structure in big high-dimensional data (e.g., in single-cell genomics and proteomics).

Current Students

PhD - UdeM

PhD - UdeM

Research collaborator - Yale University

Co-supervisor:

Alumni collaborator

PhD - UdeM

Xiao Huang

Research Master's - Concordia

Principal supervisor:

PhD - UdeM

Paul Janson

PhD - Concordia

Principal supervisor:

PhD - UdeM

PhD - UdeM

Co-supervisor:

Paul François

Paria Mehrbod

Research Master's - Concordia

Principal supervisor:

Eugene Belilovsky

Lydia Mezrag

PhD - UdeM

Kevin Moon

Research collaborator

Github

Google Scholar

Sacha Morin

PhD - UdeM

Co-supervisor:

Postdoctorate - Concordia

Principal supervisor:

Shuang Ni

PhD - UdeM

Github

Albert Orozco Camacho

PhD - Concordia

Principal supervisor:

Research Master's - UdeM

Matthew Scicluna

PhD - UdeM

Principal supervisor:

PhD - UdeM

Research Master's - UdeM

Pedro Vianna

Research collaborator - UdeM

Co-supervisor:

PhD - UdeM

Postdoctorate - UdeM

Co-supervisor:

Research collaborator - McGill (assistant professor)

Analyzing the interferon paradox inherent to COVID-19 through dimensionality reduction and clustering

Google Scholar

Blog posts

Graph and representation of working methodology, and graph of data on deaths 60 days after onset of symptoms.

February 19, 2025

by

Sacha Morin

Elsa Brunet-Ratnasingham

Guy Wolf

Read the article

Publications

Random Forest Autoencoders for Guided Representation Learning

Kevin R. Moon

Jake S. Rhodes

Decades of research have produced robust methods for unsupervised data visualization, yet supervised visualization…

2025-10-22

logconference.io/LOG/2025/Conference (poster)

Low-dimensional embeddings of high-dimensional data

Cyril de Bodt

Alex Diaz-Papkovich

Michael Bleher

Kerstin Bunte

Corinna Coupette

Sebastian Damrich

Enrique Fita Sanmartin

Fred A. Hamprecht

Emőke-Ágnes Horvát

Dhruv Kohli

John A. Lee

Boudewijn P. F. Lelieveldt

Leland McInnes

Ian T. Nabney

Maximilian Noichl

Pavlin G. Poličar

Bastian Rieck

Gal Mishne

Dmitry Kobak

Large collections of high-dimensional data have become nearly ubiquitous across many academic fields and application domains, ranging from biology to the humanities. Since working directly with high-dimensional data poses challenges, the demand for algorithms that create low-dimensional representations, or embeddings, for data visualization, exploration, and analysis is now greater than ever. In recent years, numerous embedding algorithms have been developed, and their usage has become widespread in research and industry. This surge of interest has resulted in a large and fragmented research field that faces technical challenges alongside fundamental debates, and it has left practitioners without clear guidance on how to effectively employ existing methods. Aiming to increase coherence and facilitate future work, in this review we provide a detailed and critical overview of recent developments, derive a list of best practices for creating and using low-dimensional embeddings, evaluate popular approaches on a variety of datasets, and discuss the remaining challenges and open problems in the field.

2025-08-21

arXiv (preprint)

arxiv.org

Low-dimensional embeddings of high-dimensional data

Cyril de Bodt

Alex Diaz-Papkovich

Michael Bleher

Kerstin Bunte

Corinna Coupette

Sebastian Damrich

Enrique Fita Sanmartin

Fred Hamprecht

Emőke-Ágnes Horvát

Dhruv Kohli

John A. Lee

Boudewijn P. F. Lelieveldt

Leland McInnes

Ian T. Nabney

Maximilian Noichl

Pavlin G. Poličar

Bastian Rieck

Gal Mishne

Dmitry Kobak

2025-08-01

arXiv (published)

Towards a General Recipe for Combinatorial Optimization with Multi-Filter GNNs

Michael Perlmutter

2025-07-30

Proceedings of the Third Learning on Graphs Conference (published)

proceedings.mlr.press

Circuit Discovery Helps To Detect LLM Jailbreaking

Despite extensive safety alignment, large language models (LLMs) remain vulnerable to jailbreak attacks that bypass safeguards to elicit harmful content. While prior work attributes this vulnerability to safety training limitations, the internal mechanisms by which LLMs process adversarial prompts remain poorly understood. We present a mechanistic analysis of the jailbreaking behavior in a large-scale, safety-aligned LLM, focusing on LLaMA-2-7B-chat-hf. Leveraging edge attribution patching and subnetwork probing, we systematically identify computational circuits responsible for generating affirmative responses to jailbreak prompts. Ablating these circuits during the first token prediction can reduce attack success rates by up to 80%, demonstrating their critical role in safety bypass. Our analysis uncovers key attention heads and MLP pathways that mediate adversarial prompt exploitation, revealing how important tokens propagate through these components to override safety constraints. These findings advance the understanding of adversarial vulnerabilities in aligned LLMs and pave the way for targeted, interpretable defense mechanisms based on mechanistic interpretability.

2025-06-30

ICML.cc/2025/Workshop/R2-FM (poster)

Test Time Adaptation Using Adaptive Quantile Recalibration

2025-06-10

ICML.cc/2025/Workshop/PUT (poster)

RETRO SYNFLOW: Discrete Flow Matching for Accurate and Diverse Single-Step Retrosynthesis

Robin Yadav

Qi Yan

Joey Bose

Renjie Liao

A fundamental problem in organic chemistry is identifying and predicting the series of reactions that synthesize a desired target product molecule. Due to the combinatorial nature of the chemical search space, single-step reactant prediction -- i.e. single-step retrosynthesis -- remains challenging even for existing state-of-the-art template-free generative approaches to produce an accurate yet diverse set of feasible reactions. In this paper, we model single-step retrosynthesis planning and introduce RETRO SYNFLOW (RSF), a discrete flow-matching framework that builds a Markov bridge between the prescribed target product molecule and the reactant molecule. In contrast to past approaches, RSF employs a reaction center identification step to produce intermediate structures known as synthons as a more informative source distribution for the discrete flow. To further enhance diversity and feasibility of generated samples, we employ Feynman-Kac steering with Sequential Monte Carlo based resampling to steer promising generations at inference using a new reward oracle that relies on a forward-synthesis model. Empirically, we demonstrate RSF achieves

2025-06-04

arXiv (preprint)

arxiv.org

Geometry aware graph attention networks to explain single-cell chromatin state and gene expression

Gabriele Malagoli

Patrick Hanel

A. Danese

Maria Colomé-Tatché

2025-06-01

bioRxiv (preprint)

Neurospectrum: A Geometric and Topological Deep Learning Framework for Uncovering Spatiotemporal Signatures in Neural Activity

Dhananjay Bhaskar

Yanlei Zhang

Jessica Moore

Feng Gao

Bastian Rieck

Firas Khasawneh

Elizabeth Munch

J. Adam Noah

Helen Pushkarskaya

Christopher Pittenger

Valentina Greco

2025-05-08

bioRxiv (preprint)

Graph Neural Networks Meet Probabilistic Graphical Models: A Survey

Qian Zhang

2025-04-06

IEEE International Conference on Acoustics, Speech, and Signal Processing (published)

Unsupervised Test-Time Adaptation for Hepatic Steatosis Grading Using Ultrasound B-Mode Images.

Michael Eickenberg

An Tang

Guy Cloutier

Ultrasound is considered a key modality for the clinical assessment of hepatic steatosis (i.e., fatty liver) due to its non-invasiveness and availability. Deep learning methods have attracted considerable interest in this field, as they are capable of learning patterns in a collection of images and achieve clinically comparable levels of accuracy in steatosis grading. However, variations in patient populations, acquisition protocols, equipment, and operator expertise across clinical sites can introduce domain shifts that reduce model performance when applied outside the original training setting. In response, unsupervised domain adaptation techniques are being investigated to address these shifts, allowing models to generalize more effectively across diverse clinical environments. In this work, we propose a test-time batch normalization technique designed to handle domain shift, especially for changes in label distribution, by adapting selected features of batch normalization layers in a trained convolutional neural network model. This approach operates in an unsupervised manner, allowing robust adaptation to new distributions without access to label data. The method was evaluated on two abdominal ultrasound datasets collected at different institutions, assessing its capability in mitigating domain shift for hepatic steatosis classification. The proposed method reduced the mean absolute error in steatosis grading by 37% and improved the area under the receiver operating characteristic curve for steatosis detection from 0.78 to 0.97, compared to non-adapted models. These findings demonstrate the potential of the proposed method to address domain shift in ultrasound-based hepatic steatosis diagnosis, minimizing risks associated with deploying trained models in various clinical settings.

2025-03-26

IEEE Transactions on Ultrasonics, Ferroelectrics and Frequency Control (published)