Joey Bose

Metric Flow Matching for Smooth Interpolations on the Data Manifold

Kacper Kapusniak

Peter Potaptchik

Teodora Reu

Leo Zhang

Alexander Tong

Michael M. Bronstein

Francesco Di Giovanni

Matching objectives underpin the success of modern generative models and rely on constructing conditional paths that transform a source dist… (voir plus)ribution into a target distribution. Despite being a fundamental building block, conditional paths have been designed principally under the assumption of Euclidean geometry, resulting in straight interpolations. However, this can be particularly restrictive for tasks such as trajectory inference, where straight paths might lie outside the data manifold, thus failing to capture the underlying dynamics giving rise to the observed marginals. In this paper, we propose Metric Flow Matching (MFM), a novel simulation-free framework for conditional flow matching where interpolants are approximate geodesics learned by minimizing the kinetic energy of a data-induced Riemannian metric. This way, the generative model matches vector fields on the data manifold, which corresponds to lower uncertainty and more meaningful interpolations. We prescribe general metrics to instantiate MFM, independent of the task, and test it on a suite of challenging problems including LiDAR navigation, unpaired image translation, and modeling cellular dynamics. We observe that MFM outperforms the Euclidean baselines, particularly achieving SOTA on single-cell trajectory prediction.

2024-01-01

NeurIPS (publié)

arxiv.org

Predicting Drug Effects from High-Dimensional, Asymmetric Drug Datasets by Using Graph Neural Networks: A Comprehensive Analysis of Multitarget Drug Effect Prediction

Guojing Cong

Graph neural networks (GNNs) have emerged as one of the most effective ML techniques for drug effect prediction from drug molecular graphs. … (voir plus)Despite having immense potential, GNN models lack performance when using datasets that contain high-dimensional, asymmetrically co-occurrent drug effects as targets with complex correlations between them. Training individual learning models for each drug effect and incorporating every prediction result for a wide spectrum of drug effects are impractical. Therefore, an opportunity exists to address this challenge as multitarget prediction problems and predict all drug effects at a time. We developed standard and hybrid GNNs to perform two separate tasks: multiregression for continuous values and multilabel classification for categorical values contained in our datasets. Because multilabel classification makes the target data even more sparse and introduces asymmetric label co-occurrence, learning these models becomes difficult and heavily impacts the GNN's performance. To address these challenges, we propose a new data oversampling technique to improve multilabel classification performances on all the given imbalanced molecular graph datasets. Using the technique, we improve the data imbalance ratio of the drug effects while protecting the datasets' integrity. Finally, we evaluate the multilabel classification performance of the best-performing hybrid GNN model on all the oversampled datasets obtained from the proposed oversampling technique. In all the evaluation metrics (i.e., precision, recall, and F1 score), this model significantly outperforms other ML models, including GNN models when they are trained on the original datasets or oversampled datasets with MLSMOTE, which is a well-known oversampling technique.

2024-01-01

ICMLA (publié)

arxiv.org

Sequence-Augmented SE(3)-Flow Matching For Conditional Protein Generation.

Guillaume Huguet

James Vuckovic

Kilian FATRAS

Eric Laufer

Pablo Lemos

Riashat Islam

Cheng-Hao Liu

Jarrid Rector-Brooks

Tara Akhound-Sadegh

Michael M. Bronstein

Alexander Tong

2024-01-01

Neural Information Processing Systems (publié)

dblp.uni-trier.de

A General Framework For Proving The Equivariant Strong Lottery Ticket Hypothesis

Damien Ferbach

Christos Tsirigotis

Gauthier Gidel

The Strong Lottery Ticket Hypothesis (SLTH) stipulates the existence of a subnetwork within a sufficiently overparameterized (dense) neural … (voir plus)network that -- when initialized randomly and without any training -- achieves the accuracy of a fully trained target network. Recent works by Da Cunha et. al 2022; Burkholz 2022 demonstrate that the SLTH can be extended to translation equivariant networks -- i.e. CNNs -- with the same level of overparametrization as needed for the SLTs in dense networks. However, modern neural networks are capable of incorporating more than just translation symmetry, and developing general equivariant architectures such as rotation and permutation has been a powerful design principle. In this paper, we generalize the SLTH to functions that preserve the action of the group

2023-02-01

ICLR.cc/2023/Conference (poster)

Feature Likelihood Divergence: Evaluating the Generalization of Generative Models Using Samples

Marco Jiralerspong

Ian Gemp

Chongli Qin

Yoram Bachrach

Gauthier Gidel

Feature Likelihood Score: Evaluating Generalization of Generative Models Using Samples

Marco Jiralerspong

Gauthier Gidel

Deep generative models have demonstrated the ability to generate complex, high-dimensional, and photo-realistic data. However, a uniﬁed fr… (voir plus)amework for evaluating different generative modeling families remains a challenge. Indeed, likelihood-based metrics do not apply in many cases while pure sample-based metrics such as FID fail to capture known failure modes such as overﬁtting on training data. In this work, we introduce the Feature Likelihood Score (FLS), a parametric sample-based score that uses density estimation to quantitatively measure the quality/diversity of generated samples while taking into account overﬁtting. We empirically demonstrate the ability of FLS to identify speciﬁc overﬁtting problem cases, even when previously proposed metrics fail. We further perform an extensive experimental evaluation on various image datasets and model classes. Our results indicate that FLS matches intuitions of previous metrics, such as FID, while providing a more holistic evaluation of generative models that highlights models whose generalization abilities are under or overappreciated. Code for computing FLS is provided at https://github.com/marcojira/ﬂs.

2023-01-01

arXiv.org (prépublication)

Riemannian Diffusion Models

Chin-Wei Huang

Milad Aghajohari

Prakash Panangaden

Aaron Courville

Diffusion models are recent state-of-the-art methods for image generation and likelihood estimation. In this work, we generalize continuous-… (voir plus)time diffusion models to arbitrary Riemannian manifolds and derive a variational framework for likelihood estimation. Computationally, we propose new methods for computing the Riemannian divergence which is needed for likelihood estimation. Moreover, in generalizing the Euclidean case, we prove that maximizing this variational lower-bound is equivalent to Riemannian score matching. Empirically, we demonstrate the expressive power of Riemannian diffusion models on a wide spectrum of smooth manifolds, such as spheres, tori, hyperboloids, and orthogonal groups. Our proposed method achieves new state-of-the-art likelihoods on all benchmarks.

A Cross-Domain Transferable Neural Coherence Model

Peng Xu

H. Saghir

Jin Sung Kang

Teng Long

Yanshuai Cao

Jackie Cheung

Coherence is an important aspect of text quality and is crucial for ensuring its readability. One important limitation of existing coherence… (voir plus) models is that training on one domain does not easily generalize to unseen categories of text. Previous work advocates for generative models for cross-domain generalization, because for discriminative models, the space of incoherent sentence orderings to discriminate against during training is prohibitively large. In this work, we propose a local discriminative neural model with a much smaller negative sampling space that can efficiently learn against incorrect orderings. The proposed coherence model is simple in structure, yet it significantly outperforms previous state-of-art methods on a standard benchmark dataset on the Wall Street Journal corpus, as well as in multiple new challenging settings of transfer to unseen categories of discourse on Wikipedia articles.

2019-07-01

Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (publié)