Guillaume Huguet

Martineau Jean-Louis

Charles-Olivier Martin

C.O. Martin

Zohra Saci

Nadine Younis

Petra Tamer

Elise Douard

Anne M. Maillard

Borja Rodriguez-Herreros

Aurélie Pain

Sonia Richetin

Leila Kushan

Ana I. Silva

Marianne B.M. van den Bree

David E.J. Linden

M. J. Owen

Jeremy Hall

Sarah Lippé

Bogdan Draganski

Ida E. Sønderby

Ole A. Andreassen

David C. Glahn

Paul M. Thompson

Carrie E. Bearden

Sébastien Jacquemont

Danilo Bzdok

2023-03-02

Nature human behaviour (published)

Graph Fourier MMD for signals on data graphs

Samuel Leone

Alexander Tong

Guy Wolf

Smita Krishnaswamy

While numerous methods have been proposed for computing distances between probability distributions in Euclidean space, relatively little attention has been given to computing such distances for distributions on graphs. However, there has been a marked increase in data that either lies on a graph (such as protein interaction networks) or can be modeled as a graph (single cell data), particularly in the biomedical sciences. Thus, it becomes important to find ways to compare signals defined on such graphs. Here, we propose Graph Fourier MMD (GFMMD), a novel distance between distributions, or non-negative signals, on graphs. GFMMD is defined via an optimal witness function that is both smooth on the graph and maximizes the difference in expectation between the pair of distributions on the graph. We find an analytical solution to this optimization problem as well as an embedding of distributions that results from this method. We also prove several properties of this method, including scale invariance and applicability to disconnected graphs. We showcase it on graph benchmark datasets as well as on single cell RNA-sequencing data analysis. In the latter, we use the GFMMD-based gene embeddings to find meaningful gene clusters. We also propose a novel type of score for gene selection called the gene localization score, which helps select genes for cellular state space characterization.

2023-02-01

ICLR.cc/2023/Conference (rejected)

openreview.net
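
The witness-function idea in the Graph Fourier MMD abstract above can be illustrated with a small sketch. The abstract does not say how smoothness is measured, so the constraint used here (the Laplacian quadratic form f^T L f <= 1) is an assumption, as are the function names `graph_laplacian` and `witness_distance`; this is not necessarily the exact definition used in the paper. Under that assumption the maximization has a closed form through the pseudoinverse of L.

```python
import numpy as np

def graph_laplacian(adjacency):
    """Combinatorial Laplacian L = D - A of an undirected weighted graph."""
    adjacency = np.asarray(adjacency, dtype=float)
    return np.diag(adjacency.sum(axis=1)) - adjacency

def witness_distance(laplacian, p, q):
    """Maximize f @ (p - q) subject to f @ L @ f <= 1 (assumed constraint).

    The maximum is finite when p - q sums to zero on every connected
    component. Returns the optimal value and one optimal witness f.
    """
    diff = np.asarray(p, dtype=float) - np.asarray(q, dtype=float)
    # Pseudoinverse of L; rcond discards the numerically zero eigenvalue.
    l_pinv = np.linalg.pinv(laplacian, rcond=1e-10)
    smoothed = l_pinv @ diff
    value = float(np.sqrt(max(diff @ smoothed, 0.0)))
    witness = smoothed / value if value > 0 else np.zeros_like(diff)
    return value, witness

# Toy example: unit-weight path graph 0 - 1 - 2 - 3,
# with point masses placed at its two ends.
adjacency = np.diag(np.ones(3), k=1)
adjacency = adjacency + adjacency.T
laplacian = graph_laplacian(adjacency)
p = np.array([1.0, 0.0, 0.0, 0.0])
q = np.array([0.0, 0.0, 0.0, 1.0])
value, witness = witness_distance(laplacian, p, q)
```

For two point masses the squared value is the effective resistance between the two nodes, which is 3 on this path graph, so `value` is sqrt(3).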

Improving and generalizing flow-based generative models with minibatch optimal transport

Alexander Tong

Yanlei Zhang

Kilian FATRAS

Continuous normalizing flows (CNFs) are an attractive generative modeling technique, but they have been held back by limitations in their simulation-based maximum likelihood training. We introduce the generalized conditional flow matching (CFM) technique, a family of simulation-free training objectives for CNFs. CFM features a stable regression objective like that used to train the stochastic flow in diffusion models but enjoys the efficient inference of deterministic flow models. In contrast to both diffusion models and prior CNF training algorithms, CFM does not require the source distribution to be Gaussian or require evaluation of its density. A variant of our objective is optimal transport CFM (OT-CFM), which creates simpler flows that are more stable to train and lead to faster inference, as evaluated in our experiments. Furthermore, we show that when the true OT plan is available, our OT-CFM method approximates dynamic OT. Training CNFs with CFM improves results on a variety of conditional and unconditional generation tasks, such as inferring single cell dynamics, unsupervised image translation, and Schrödinger bridge inference.

2023-02-01

ArXiv (preprint)
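
The simulation-free regression objective described in the abstract above can be sketched as follows. The abstract does not give the conditional path, so the straight-line interpolation between a source sample and a target sample, the constant target velocity `x1 - x0`, and the names `cfm_training_pairs` and `cfm_loss` are all assumptions made for illustration. The minibatch optimal transport pairing of OT-CFM is not shown; samples are paired in the order given.

```python
import numpy as np

def cfm_training_pairs(x0, x1, sigma=0.0, rng=None):
    """Sample (t, x_t, u_t) along an assumed straight-line conditional path."""
    rng = np.random.default_rng() if rng is None else rng
    x0 = np.asarray(x0, dtype=float)
    x1 = np.asarray(x1, dtype=float)
    t = rng.uniform(size=(x0.shape[0], 1))         # one time per sample pair
    noise = rng.standard_normal(x0.shape)
    x_t = (1.0 - t) * x0 + t * x1 + sigma * noise  # point on the path at time t
    u_t = x1 - x0                                  # regression target (velocity)
    return t, x_t, u_t

def cfm_loss(vector_field, t, x_t, u_t):
    """Mean squared error between the model's velocity and the target."""
    prediction = vector_field(t, x_t)
    return float(np.mean(np.sum((prediction - u_t) ** 2, axis=1)))

# Toy check: the source is uniform (not Gaussian) and the target is a
# translated copy, so a constant vector field fits the targets exactly.
rng = np.random.default_rng(0)
x0 = rng.uniform(-1.0, 1.0, size=(128, 2))
shift = np.array([3.0, -1.0])
x1 = x0 + shift
t, x_t, u_t = cfm_training_pairs(x0, x1, sigma=0.0, rng=rng)
constant_field = lambda t, x: np.broadcast_to(shift, x.shape)
loss = cfm_loss(constant_field, t, x_t, u_t)
```

No ODE is integrated at any point: each training step only needs samples from the two distributions, which is what makes the objective simulation-free. In practice `vector_field` would be a neural network trained by gradient descent on this loss.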

Conditional Flow Matching: Simulation-Free Dynamic Optimal Transport

Alexander Tong

Yanlei Zhang

Kilian FATRAS

Continuous normalizing flows (CNFs) are an attractive generative modeling technique, but they have thus far been held back by limitations in their simulation-based maximum likelihood training. In this paper, we introduce a new technique called conditional flow matching (CFM), a simulation-free training objective for CNFs. CFM features a stable regression objective like that used to train the stochastic flow in diffusion models but enjoys the efficient inference of deterministic flow models. In contrast to both diffusion models and prior CNF training algorithms, our CFM objective does not require the source distribution to be Gaussian or require evaluation of its density. Based on this new objective, we also introduce optimal transport CFM (OT-CFM), which creates simpler flows that are more stable to train and lead to faster inference, as evaluated in our experiments. Training CNFs with CFM improves results on a variety of conditional and unconditional generation tasks such as inferring single cell dynamics, unsupervised image translation, and Schrödinger bridge inference. Code is available at https://github.com/atong01/conditional-flow-matching.

2023-01-01

arXiv.org (preprint)

GEODESIC SINKHORN FOR FAST AND ACCURATE OPTIMAL TRANSPORT ON MANIFOLDS

Alexander Tong

María Ramos Zapatero

Christopher J. Tape

Guy Wolf

Smita Krishnaswamy

Efficient computation of optimal transport distance between distributions is of growing importance in data science. Sinkhorn-based methods are currently the state of the art for such computations, but require O(n^2) computations. In addition, Sinkhorn-based methods commonly use a Euclidean ground distance between datapoints. However, with the prevalence of manifold structured scientific data, it is often desirable to consider geodesic ground distance. Here, we tackle both issues by proposing Geodesic Sinkhorn, which is based on diffusing a heat kernel on a manifold graph. Notably, Geodesic Sinkhorn requires only O(n log n) computation, as we approximate the heat kernel with Chebyshev polynomials based on the sparse graph Laplacian. We apply our method to the computation of barycenters of several distributions of high dimensional single cell data from patient samples undergoing chemotherapy. In particular, we define the barycentric distance as the distance between two such barycenters. Using this definition, we identify an optimal transport distance and path associated with the effect of treatment on cellular data.

2023-01-01

2023 IEEE 33rd International Workshop on Machine Learning for Signal Processing (MLSP) (published)
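
The combination of a graph heat kernel with Sinkhorn scaling described in the abstract above can be sketched on a toy graph. This is an illustration under stated assumptions, not the authors' implementation: the heat kernel is computed here by a dense eigendecomposition, which costs O(n^3), in place of the Chebyshev polynomial approximation on a sparse Laplacian that the abstract credits for the O(n log n) cost. The function names `heat_kernel` and `sinkhorn_plan`, the diffusion time, and the iteration count are hypothetical choices.

```python
import numpy as np

def heat_kernel(laplacian, t):
    """Dense heat kernel exp(-t L) via eigendecomposition (small graphs only)."""
    eigvals, eigvecs = np.linalg.eigh(laplacian)
    return (eigvecs * np.exp(-t * eigvals)) @ eigvecs.T

def sinkhorn_plan(kernel, a, b, n_iter=2000):
    """Sinkhorn scaling: find u, v so that diag(u) K diag(v) has marginals a, b."""
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iter):
        u = a / (kernel @ v)      # rescale rows to match a
        v = b / (kernel.T @ u)    # rescale columns to match b
    return u[:, None] * kernel * v[None, :]

# Toy example: unit-weight path graph on 5 nodes, with mass concentrated
# at opposite ends in the two distributions.
adjacency = np.diag(np.ones(4), k=1)
adjacency = adjacency + adjacency.T
laplacian = np.diag(adjacency.sum(axis=1)) - adjacency
kernel = heat_kernel(laplacian, t=2.0)
a = np.array([0.6, 0.3, 0.05, 0.03, 0.02])
b = a[::-1].copy()
plan = sinkhorn_plan(kernel, a, b)
```

The heat kernel takes the place of the Gibbs kernel built from a Euclidean cost in standard Sinkhorn, so the coupling reflects distances along the graph. Only matrix-vector products with the kernel are needed inside the loop, which is why a fast approximate application of the kernel speeds up the whole procedure.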

A Heat Diffusion Perspective on Geodesic Preserving Dimensionality Reduction

Alexander Tong

Edward De Brouwer

Yanlei Zhang

Guy Wolf

Ian Adelstein

Smita Krishnaswamy

openreview.net

P397. Genomic Deletions and Duplications Show Mirror Effects on Cognitive Ability According to Spatial Patterns of Gene Expression in the Human Brain

Kuldeep Kumar

Sayeh Kazem

Elise Douard

Zohra Saci

Laura Almasy

David C. Glahn

Guillaume Dumas

Sébastien Jacquemont

2022-05-01

Biological Psychiatry (published)

Rare CNVs and phenome-wide profiling: a tale of brain-structural divergence and phenotypical convergence

Jakub Kopal

Kuldeep Kumar

Karin Saltoun

Claudia Modenato

Clara A. Moreau

Sandra Martin-Brevet

Martineau Jean-Louis

C.O. Martin

Zohra Saci

Nadine Younis

Petra Tamer

Elise Douard

Anne M. Maillard

Borja Rodriguez-Herreros

Aurélie Pain

Sonia Richetin

Leila Kushan

Ana I. Silva

Marianne B.M. van den Bree

David E.J. Linden

M. J. Owen

Jeremy Hall

Sarah Lippé

Bogdan Draganski

Ida E. Sønderby

Ole A. Andreassen

David C. Glahn

Paul M. Thompson

Carrie E. Bearden

Sébastien Jacquemont

Danilo Bzdok

Copy number variations (CNVs) are rare genomic deletions and duplications that can exert profound effects on brain and behavior. Previous reports of pleiotropy in CNVs imply that they converge on shared mechanisms at some level of pathway cascades, from genes to large-scale neural circuits to the phenome. However, studies to date have primarily examined single CNV loci in small clinical cohorts. It remains unknown how distinct CNVs escalate the risk for the same developmental and psychiatric disorders. Here, we quantitatively dissect the impact on brain organization and behavioral differentiation across eight key CNVs. In 534 clinical CNV carriers from multiple sites, we explored CNV-specific brain morphology patterns. We extensively annotated these CNV-associated patterns with deep phenotyping assays through the UK Biobank resource. Although the eight CNVs cause disparate brain changes, they are tied to similar phenotypic profiles across ∼1000 lifestyle indicators. Our population-level investigation established brain structural divergences and phenotypical convergences of CNVs, with direct relevance to major brain disorders.

2022-04-25

bioRxiv (preprint)