Publications

ET-Flow: Equivariant Flow-Matching for Molecular Conformer Generation

Nikhil Shenoy

Hannes Stärk

Stephan Thaler

Predicting low-energy molecular conformations given a molecular graph is an important but challenging task in computational drug discovery. Existing state-of-the-art approaches either resort to large-scale transformer-based models that diffuse over conformer fields, or use computationally expensive methods to generate initial structures and diffuse over torsion angles. In this work, we introduce Equivariant Transformer Flow (ET-Flow). We show that a well-designed flow matching approach with equivariance and a harmonic prior alleviates the need for complex internal geometry calculations and large architectures, contrary to the prevailing methods in the field. Our approach results in a straightforward and scalable method that directly operates on all-atom coordinates with minimal assumptions. With the advantages of equivariance and flow matching, ET-Flow significantly increases the precision and physical validity of the generated conformers, while being a lighter model and faster at inference. Code is available at https://github.com/shenoynikhil/ETFlow.

2024-09-25

NeurIPS.cc/2024/Conference (poster)

openreview.net

A Foundation Model for Zero-shot Logical Query Reasoning

Mikhail Galkin

Jincheng Zhou

Bruno Ribeiro

Jian Tang

Zhaocheng Zhu

Complex logical query answering (CLQA) in knowledge graphs (KGs) goes beyond simple KG completion and aims at answering compositional queries composed of multiple projections and logical operations. Existing CLQA methods that learn parameters bound to certain entity or relation vocabularies can only be applied to the graph they are trained on, which requires substantial training time before being deployed on a new graph. Here we present UltraQuery, the first foundation model for inductive reasoning that can zero-shot answer logical queries on any KG. The core idea of UltraQuery is to derive both projections and logical operations as vocabulary-independent functions which generalize to new entities and relations in any KG. With the projection operation initialized from a pre-trained inductive KG completion model, UltraQuery can solve CLQA on any KG after finetuning on a single dataset. Experimenting on 23 datasets, UltraQuery in the zero-shot inference mode shows competitive or better query answering performance than the best available baselines and sets a new state of the art on 15 of them.

2024-09-25

NeurIPS.cc/2024/Conference (poster)

openreview.net

A Generative Model of Symmetry Transformations

James Urquhart Allingham

Bruno Mlodozeniec

Shreyas Padhy

Javier Antoran

David Scott Krueger

Richard E. Turner

Eric Nalisnick

José Miguel Hernández-Lobato

Correctly capturing the symmetry transformations of data can lead to efficient models with strong generalization capabilities, though methods incorporating symmetries often require prior knowledge. While recent advancements have been made in learning those symmetries directly from the dataset, most of this work has focused on the discriminative setting. In this paper, we take inspiration from group theoretic ideas to construct a generative model that explicitly aims to capture the data's approximate symmetries. This results in a model that, given a prespecified broad set of possible symmetries, learns to what extent, if at all, those symmetries are actually present. Our model can be seen as a generative process for data augmentation. We provide a simple algorithm for learning our generative model and empirically demonstrate its ability to capture symmetries under affine and color transformations, in an interpretable way. Combining our symmetry model with standard generative models results in higher marginal test-log-likelihoods and improved data efficiency.

2024-09-25

NeurIPS.cc/2024/Conference (poster)

doi.org

openreview.net

GenRL: Multimodal-foundation world models for generalization in embodied agents

Pietro Mazzaglia

Tim Verbelen

Bart Dhoedt

Aaron Courville

Sai Rajeswar

Learning generalist embodied agents, able to solve multitudes of tasks in different domains, is a long-standing problem. Reinforcement learning (RL) is hard to scale up as it requires a complex reward design for each task. In contrast, language can specify tasks in a more natural way. Current foundation vision-language models (VLMs) generally require fine-tuning or other adaptations to be adopted in embodied contexts, due to the significant domain gap. However, the lack of multimodal data in such domains represents an obstacle to developing foundation models for embodied applications. In this work, we overcome these problems by presenting multimodal-foundation world models, able to connect and align the representation of foundation VLMs with the latent space of generative world models for RL, without any language annotations. The resulting agent learning framework, GenRL, allows one to specify tasks through vision and/or language prompts, ground them in the embodied domain’s dynamics, and learn the corresponding behaviors in imagination. As assessed through large-scale multi-task benchmarking in locomotion and manipulation domains, GenRL enables multi-task generalization from language and visual prompts. Furthermore, by introducing a data-free policy learning strategy, our approach lays the groundwork for foundational policy learning using generative world models. Website, code and data: https://mazpie.github.io/genrl/

2024-09-25

NeurIPS.cc/2024/Conference (poster)

openreview.net

Geometry of naturalistic object representations in recurrent neural network models of working memory

Xiaoxuan Lei

Takuya Ito

Pouya Bashivan

2024-09-25

NeurIPS.cc/2024/Conference (poster)

doi.org

openreview.net

Grounding Multimodal Large Language Models in Actions

Andrew Szot

Bogdan Mazoure

Harsh Agrawal

(Rex) Devon Hjelm

Zsolt Kira

Alexander T Toshev

Multimodal Large Language Models (MLLMs) have demonstrated a wide range of capabilities across many domains including Embodied AI. In this work, we study how to best ground an MLLM into different embodiments and their associated action spaces, including both continuous and discrete actions. For continuous actions, a set of learned tokenizations that capture an action at various resolutions allows for sufficient modeling precision, yielding the best performance on downstream tasks. For discrete actions, semantically aligning these actions with the native output token space of the MLLM leads to the strongest performance. We arrive at these lessons via a thorough study of seven action grounding approaches on five different environments, encompassing over 114 embodied tasks.

2024-09-25

NeurIPS.cc/2024/Conference (poster)

doi.org

openreview.net

HardCore Generation: Generating Hard UNSAT Problems for Data Augmentation

Joseph Cotnareanu

Zhanguang Zhang

Hui-Ling Zhen

Yingxue Zhang

Mark Coates

Efficiently determining the satisfiability of a Boolean equation (known as the SAT problem for brevity) is crucial in various industrial problems. Recently, the advent of deep learning methods has introduced significant potential for enhancing SAT solving. However, a major barrier to the advancement of this field has been the scarcity of large, realistic datasets. The majority of current public datasets are either randomly generated or extremely limited, containing only a few examples from unrelated problem families. These datasets are inadequate for meaningful training of deep learning methods. In light of this, researchers have started exploring generative techniques to create data that more accurately reflect SAT problems encountered in practical situations. These methods have so far suffered from either the inability to produce challenging SAT problems or time-scalability obstacles. In this paper we address both by identifying and manipulating the key contributors to a problem's "hardness", known as cores. Although some previous work has addressed cores, the time costs are unacceptably high due to the expense of traditional heuristic core detection techniques. We introduce a fast core detection procedure that uses a graph neural network. Our empirical results demonstrate that we can efficiently generate problems that remain hard to solve and retain key attributes of the original example problems. We show via experiment that the generated synthetic SAT problems can be used in a data augmentation setting to provide improved prediction of solver runtimes.

2024-09-25

NeurIPS.cc/2024/Conference (poster)

openreview.net

Harnessing small projectors and multiple views for efficient vision pretraining

Kumar Krishna Agrawal

Arna Ghosh

Shagun Sodhani

Adam M. Oberman

Blake Richards

2024-09-25

NeurIPS.cc/2024/Conference (poster)

openreview.net

How Molecules Impact Cells: Unlocking Contrastive PhenoMolecular Retrieval

Philip Fradkin

Puria Azadi Moghadam

Karush Suri

Frederik Wenkel

Ali Bashashati

Maciej Sypetkowski

Dominique Beaini

Predicting molecular impact on cellular function is a core challenge in therapeutic design. Phenomic experiments, designed to capture cellular morphology, utilize microscopy-based techniques and demonstrate a high-throughput solution for uncovering molecular impact on the cell. In this work, we learn a joint latent space between molecular structures and microscopy phenomic experiments, aligning paired samples with contrastive learning. Specifically, we study the problem of Contrastive PhenoMolecular Retrieval, which consists of zero-shot molecular structure identification conditioned on phenomic experiments. We assess challenges in multi-modal learning of phenomics and molecular modalities such as experimental batch effect, inactive molecule perturbations, and encoding perturbation concentration. We demonstrate improved multi-modal learner retrieval through (1) a uni-modal pre-trained phenomics model, (2) a novel inter-sample similarity-aware loss, and (3) models conditioned on a representation of molecular concentration. Following this recipe, we propose MolPhenix, a molecular phenomics model. MolPhenix leverages a pre-trained phenomics model to demonstrate significant performance gains across perturbation concentrations, molecular scaffolds, and activity thresholds. In particular, we demonstrate an 8.1x improvement in zero-shot molecular retrieval of active molecules over the previous state-of-the-art, reaching 77.33% in top-1% accuracy. These results open the door for machine learning to be applied in virtual phenomics screening, which can significantly benefit drug discovery applications.

2024-09-25

NeurIPS.cc/2024/Conference (poster)

doi.org

openreview.net

On improved Conditioning Mechanisms and Pre-training Strategies for Diffusion Models

Tariq Berrada

Pietro Astolfi

Melissa Hall

Reyhane Askari Hemmat

Yohann Benchetrit

Marton Havasi

Matthew J. Muckley

Karteek Alahari

Adriana Romero Soriano

Jakob Verbeek

Michal Drozdzal

Large-scale training of latent diffusion models (LDMs) has enabled unprecedented quality in image generation. However, large-scale end-to-end training of these models is computationally costly, and hence most research focuses either on finetuning pretrained models or experiments at smaller scales. In this work we aim to improve the training efficiency and performance of LDMs with the goal of scaling to larger datasets and higher resolutions. We focus our study on two points that are critical for good performance and efficient training: (i) the mechanisms used for semantic-level (e.g., a text prompt or class name) and low-level (crop size, random flip, etc.) conditioning of the model, and (ii) pre-training strategies to transfer representations learned on smaller and lower-resolution datasets to larger ones. The main contributions of our work are the following: we present a systematic experimental study of these points, we propose a novel conditioning mechanism that disentangles semantic and low-level conditioning, and we obtain state-of-the-art performance on CC12M for text-to-image at 512 resolution.

2024-09-25

NeurIPS.cc/2024/Conference (poster)

doi.org

openreview.net

Improved off-policy training of diffusion samplers

Minsu Kim

We study the problem of training diffusion models to sample from a distribution with a given unnormalized density or energy function. We benchmark several diffusion-structured inference methods, including simulation-based variational approaches and off-policy methods (continuous generative flow networks). Our results shed light on the relative advantages of existing algorithms while bringing into question some claims from past work. We also propose a novel exploration strategy for off-policy methods, based on local search in the target space with the use of a replay buffer, and show that it improves the quality of samples on a variety of target distributions. Our code for the sampling methods and benchmarks studied is made public at https://github.com/GFNOrg/gfn-diffusion as a base for future work on diffusion models for amortized inference.

2024-09-25

NeurIPS.cc/2024/Conference (poster)

openreview.net

Improving Context-Aware Preference Modeling for Language Models

Silviu Pitis

Ziang Xiao

Nicolas Le Roux

Alessandro Sordoni

While finetuning language models from pairwise preferences has proven remarkably effective, the underspecified nature of natural language presents critical challenges. Direct preference feedback is uninterpretable, difficult to provide where multidimensional criteria may apply, and often inconsistent, either because it is based on incomplete instructions or provided by diverse principals. To address these challenges, we consider the two-step preference modeling procedure that first resolves the under-specification by selecting a context, and then evaluates preference with respect to the chosen context. We decompose reward modeling error according to these two steps, which suggests that supervising context in addition to context-specific preference may be a viable approach to aligning models with diverse human preferences. For this to work, the ability of models to evaluate context-specific preference is critical. To this end, we contribute context-conditioned preference datasets and accompanying experiments that investigate the ability of language models to evaluate context-specific preference. Unlike past datasets, where context-specific preference is highly correlated with general preference, our "preference reversal" datasets disentangle context-specific and general preferences to isolate context-specific capabilities. We use our datasets to (1) show that existing preference models benefit from, but fail to fully consider, added context, (2) finetune a context-aware reward model with context-specific performance exceeding that of GPT-4 and Llama 3 70B, and (3) investigate the potential value of context-aware preference modeling.

2024-09-25

NeurIPS.cc/2024/Conference (poster)

doi.org

openreview.net

