Siamak Ravanbakhsh

Kirill Neklyudov

Alexander Tong

Sampling efficiently from a target unnormalized probability density remains a core challenge, with relevance across countless high-impact scientific applications. A promising approach towards this challenge is the design of amortized samplers that borrow key ideas, such as probability path design, from state-of-the-art generative diffusion models. However, all existing diffusion-based samplers remain unable to draw samples from distributions at the scale of even simple molecular systems. In this paper, we propose Progressive Inference-Time Annealing (PITA), a novel framework to learn diffusion-based samplers that combines two complementary interpolation techniques: (I) annealing of the Boltzmann distribution and (II) diffusion smoothing. PITA trains a sequence of diffusion models from high to low temperatures by sequentially training each model at progressively lower temperatures, leveraging engineered easy access to samples of the temperature-annealed target density. In the subsequent step, PITA enables simulating the trained diffusion model to procure training samples at a lower temperature for the next diffusion model through inference-time annealing using a novel Feynman-Kac PDE combined with Sequential Monte Carlo. Empirically, PITA enables, for the first time, equilibrium sampling of N-body particle systems, Alanine Dipeptide, and tripeptides in Cartesian coordinates with dramatically fewer energy function evaluations. Code available at: https://github.com/taraak/pita

2025-06-19

arXiv (preprint)

Progressive Inference-Time Annealing of Diffusion Models for Sampling from Boltzmann Densities

Tara Akhound-Sadegh

Jungyoon Lee

Joey Bose

Valentin De Bortoli

Arnaud Doucet

Michael M. Bronstein

Dominique Beaini

Kirill Neklyudov

Alexander Tong

Sampling efficiently from a target unnormalized probability density remains a core challenge, with relevance across countless high-impact scientific applications. A promising approach towards this challenge is the design of amortized samplers that borrow key ideas, such as probability path design, from state-of-the-art generative diffusion models. However, all existing diffusion-based samplers remain unable to draw samples from distributions at the scale of even simple molecular systems. In this paper, we propose Progressive Inference-Time Annealing (PITA), a novel framework to learn diffusion-based samplers that combines two complementary interpolation techniques: (I) annealing of the Boltzmann distribution and (II) diffusion smoothing. PITA trains a sequence of diffusion models from high to low temperatures by sequentially training each model at progressively lower temperatures, leveraging engineered easy access to samples of the temperature-annealed target density. In the subsequent step, PITA enables simulating the trained diffusion model to procure training samples at a lower temperature for the next diffusion model through inference-time annealing using a novel Feynman-Kac PDE combined with Sequential Monte Carlo. Empirically, PITA enables, for the first time, equilibrium sampling of N-body particle systems, Alanine Dipeptide, and tripeptides in Cartesian coordinates with dramatically fewer energy function evaluations. Code available at: https://github.com/taraak/pita

2025-06-19

arXiv (preprint)

Beyond Scalar Rewards: An Axiomatic Framework for Lexicographic MDPs

Mehran Shakerinava

Adam M. Oberman

2025-05-17

arXiv (preprint)

Beyond Scalar Rewards: An Axiomatic Framework for Lexicographic MDPs

Mehran Shakerinava

Adam M. Oberman

2025-05-01

arXiv (published)

SymmCD: Symmetry-Preserving Crystal Generation with Diffusion Models

Daniel Levy

Siba Smarak Panigrahi

Sékou-Oumar Kaba

Qiang Zhu

Kin Long Kelvin Lee

Mikhail Galkin

Santiago Miret

2025-01-22

ICLR.cc/2025/Conference (poster)

On the Identifiability of Causal Abstractions

Xiusi Li

Sékou-Oumar Kaba

Causal representation learning methods seek to enhance machine learning models' robustness and generalization capabilities by learning latent representations and causal graphs aligned with the data generating process. In many systems, fully recovering the true causal structure is challenging because we cannot intervene on all latent variables individually. We introduce a theoretical framework that calculates the degree to which we can identify a causal structure in the more realistic setting of interventions on arbitrary subsets of latent variables. We find that in that case, we can only identify a causal model up to a causal abstraction. These causal abstractions are still meaningful in that they describe the system at a higher level of granularity. Conversely, given a causal abstraction, our framework provides sufficient conditions for its identifiability. Our findings extend existing identifiability results in two areas: those that address abstractions of latent variables without considering graphical structures and those that focus on graphical structures without incorporating their abstractions.

2025-01-22

aistats.org/AISTATS/2025/Conference (poster)

On the Identifiability of Causal Abstractions

Xiusi Li

Sékou-Oumar Kaba

Causal representation learning (CRL) enhances machine learning models' robustness and generalizability by learning structural causal models associated with data-generating processes. We focus on a family of CRL methods that uses contrastive data pairs in the observable space, generated before and after a random, unknown intervention, to identify the latent causal model. Brehmer et al. (2022) showed that this is indeed possible, given that all latent variables can be intervened on individually. However, this is a highly restrictive assumption in many systems. In this work, we instead assume interventions on arbitrary subsets of latent variables, which is more realistic. We introduce a theoretical framework that calculates the degree to which we can identify a causal model, given a set of possible interventions, up to an abstraction that describes the system at a higher level of granularity.

2025-01-22

aistats.org/AISTATS/2025/Conference (poster)

Symmetry-Aware Generative Modeling through Learned Canonicalization

Kusha Sareen

Daniel Levy

Arnab Kumar Mondal

Sékou-Oumar Kaba

Tara Akhound-Sadegh

Generative modeling of symmetric densities has a range of applications in AI for science, from drug discovery to physics simulations. The existing generative modeling paradigm for invariant densities combines an invariant prior with an equivariant generative process. However, we observe that this technique is not necessary and has several drawbacks resulting from the limitations of equivariant networks. Instead, we propose to model a learned slice of the density so that only one representative element per orbit is learned. To accomplish this, we learn a group-equivariant canonicalization network that maps training samples to a canonical pose and train a non-equivariant generative model over these canonicalized samples. We implement this idea in the context of diffusion models. Our preliminary experimental results on molecular modeling are promising, demonstrating improved sample quality and faster inference time.

2025-01-14

arXiv (preprint)

Symmetry-Aware Generative Modeling through Learned Canonicalization

Kusha Sareen

Daniel Levy

Arnab Kumar Mondal

Sékou-Oumar Kaba

Tara Akhound-Sadegh

Generative modeling of symmetric densities has a range of applications in AI for science, from drug discovery to physics simulations. The existing generative modeling paradigm for invariant densities combines an invariant prior with an equivariant generative process. However, we observe that this technique is not necessary and has several drawbacks resulting from the limitations of equivariant networks. Instead, we propose to model a learned slice of the density so that only one representative element per orbit is learned. To accomplish this, we learn a group-equivariant canonicalization network that maps training samples to a canonical pose and train a non-equivariant generative model over these canonicalized samples. We implement this idea in the context of diffusion models. Our preliminary experimental results on molecular modeling are promising, demonstrating improved sample quality and faster inference time.

2024-10-23

NeurIPS.cc/2024/Workshop/NeurReps (poster)

Sampling from Energy-based Policies using Diffusion

Vineet Jain

Tara Akhound-Sadegh

2024-10-02

arXiv (preprint)

Iterated Denoising Energy Matching for Sampling from Boltzmann Densities

Tara Akhound-Sadegh

Jarrid Rector-Brooks

Joey Bose

Sarthak Mittal

Pablo Lemos

Cheng-Hao Liu

Marcin Sendera

Gauthier Gidel

Yoshua Bengio

Nikolay Malkin

Alexander Tong

Efficiently generating statistically independent samples from an unnormalized probability distribution, such as equilibrium samples of many-body systems, is a foundational problem in science. In this paper, we propose Iterated Denoising Energy Matching (iDEM), an iterative algorithm that uses a novel stochastic score matching objective leveraging solely the energy function and its gradient, and no data samples, to train a diffusion-based sampler. Specifically, iDEM alternates between (I) sampling regions of high model density from a diffusion-based sampler and (II) using these samples in our stochastic matching objective to further improve the sampler. iDEM is scalable to high dimensions as the inner matching objective is simulation-free and requires no MCMC samples. Moreover, by leveraging the fast mode mixing behavior of diffusion, iDEM smooths out the energy landscape, enabling efficient exploration and learning of an amortized sampler. We evaluate iDEM on a suite of tasks ranging from standard synthetic energy functions to invariant

2024-05-01

ICML.cc/2024/Conference (poster)

Weight-Sharing Regularization

Mehran Shakerinava

Motahareh Sohrabi

Simon Lacoste-Julien

2024-04-18

Proceedings of The 27th International Conference on Artificial Intelligence and Statistics (published)