Publications

Predicting Future Actions of Reinforcement Learning Agents

Stephen Chung

Scott Niekum

David M. Krueger

2024-09-24

NeurIPS.cc/2024/Conference (poster)

openreview.net

QGFN: Controllable Greediness with Action Values

Stephen Zhewen Lu

Generative Flow Networks (GFlowNets; GFNs) are a family of energy-based generative methods for combinatorial objects, capable of generating diverse and high-utility samples. However, consistently biasing GFNs towards producing high-utility samples is non-trivial. In this work, we leverage connections between GFNs and reinforcement learning (RL) and propose to combine the GFN policy with an action-value estimate,

2024-09-24

NeurIPS.cc/2024/Conference (poster)

doi.org

openreview.net

RGFN: Synthesizable Molecular Generation Using GFlowNets

Michał Koziarski

Andrei Rekesh

Dmytro Shevchuk

Almer Van Der Sloot

Piotr Gainski

Yoshua Bengio

Cheng-Hao Liu

Mike Tyers

Robert A. Batey

Generative models hold great promise for small molecule discovery, significantly increasing the size of search space compared to traditional in silico screening libraries. However, most existing machine learning methods for small molecule generation suffer from poor synthesizability of candidate compounds, making experimental validation difficult. In this paper we propose Reaction-GFlowNet (RGFN), an extension of the GFlowNet framework that operates directly in the space of chemical reactions, thereby allowing out-of-the-box synthesizability while maintaining comparable quality of generated candidates. We demonstrate that with the proposed set of reactions and building blocks, it is possible to obtain a search space of molecules orders of magnitude larger than existing screening libraries coupled with low cost of synthesis. We also show that the approach scales to very large fragment libraries, further increasing the number of potential molecules. We demonstrate the effectiveness of the proposed approach across a range of oracle models, including pretrained proxy models and GPU-accelerated docking.

2024-09-24

NeurIPS.cc/2024/Conference (poster)

doi.org

openreview.net

Self-Consuming Generative Models with Curated Data Provably Optimize Human Preferences

Damien Ferbach

Quentin Bertrand

Avishek Joey Bose

Gauthier Gidel

The rapid progress in generative models has resulted in impressive leaps in generation quality, blurring the lines between synthetic and real data. Web-scale datasets are now prone to the inevitable contamination by synthetic data, directly impacting the training of future generated models. Already, some theoretical results on self-consuming generative models (a.k.a., iterative retraining) have emerged in the literature, showcasing that either model collapse or stability could be possible depending on the fraction of generated data used at each retraining step. However, in practice, synthetic data is often subject to human feedback and curated by users before being used and uploaded online. For instance, many interfaces of popular text-to-image generative models, such as Stable Diffusion or Midjourney, produce several variations of an image for a given query which can eventually be curated by the users. In this paper, we theoretically study the impact of data curation on iterated retraining of generative models and show that it can be seen as an implicit preference optimization mechanism. However, unlike standard preference optimization, the generative model does not have access to the reward function or negative samples needed for pairwise comparisons. Moreover, our study doesn't require access to the density function, only to samples. We prove that, if the data is curated according to a reward model, then the expected reward of the iterative retraining procedure is maximized. We further provide theoretical results on the stability of the retraining loop when using a positive fraction of real data at each step. Finally, we conduct illustrative experiments on both synthetic datasets and on CIFAR10 showing that such a procedure amplifies biases of the reward model.

2024-09-24

NeurIPS.cc/2024/Conference (spotlight)

doi.org

openreview.net

Simplifying Constraint Inference with Inverse Reinforcement Learning

Adriana Hugessen

Harley Wiltzer

Glen Berseth

Learning safe policies has presented a longstanding challenge for the reinforcement learning (RL) community. Various formulations of safe RL have been proposed; however, fundamentally, tabula rasa RL must learn safety constraints through experience, which is problematic for real-world applications. Imitation learning is often preferred in real-world settings because the experts’ safety preferences are embedded in the data the agent imitates. However, imitation learning is limited in its extensibility to new tasks, which can only be learned by providing the agent with expert trajectories. For safety-critical applications with sub-optimal or inexact expert data, it would be preferable to learn only the safety aspects of the policy through imitation, while still allowing for task learning with RL. The field of inverse constrained RL, which seeks to infer constraints from expert data, is a promising step in this direction. However, prior work in this area has relied on complex tri-level optimizations in order to infer safe behavior (constraints). This challenging optimization landscape leads to sub-optimal performance on several benchmark tasks. In this work, we present a simplified version of constraint inference that performs as well as or better than prior work across a collection of continuous-control benchmarks. Moreover, besides improving performance, this simplified framework is easier to implement and tune, and more readily lends itself to various extensions, such as offline constraint inference. Our code is made available at https://github.com/ahugs/simple-icrl.

2024-09-24

NeurIPS.cc/2024/Conference (poster)

doi.org

openreview.net

Soft Prompt Threats: Attacking Safety Alignment and Unlearning in Open-Source LLMs through the Embedding Space

Leo Schwinn

David Dobre

Sophie Xhonneux

Gauthier Gidel

Stephan Günnemann

Current research in adversarial robustness of LLMs focuses on discrete input manipulations in the natural language space, which can be directly transferred to closed-source models. However, this approach neglects the steady progression of open-source models. As open-source models advance in capability, ensuring their safety also becomes increasingly imperative. Yet, attacks tailored to open-source LLMs that exploit full model access remain largely unexplored. We address this research gap and propose the embedding space attack, which directly attacks the continuous embedding representation of input tokens. We find that embedding space attacks circumvent model alignments and trigger harmful behaviors more efficiently than discrete attacks or model fine-tuning. Furthermore, we present a novel threat model in the context of unlearning and show that embedding space attacks can extract supposedly deleted information from unlearned LLMs across multiple datasets and models. Our findings highlight embedding space attacks as an important threat model in open-source LLMs. Trigger Warning: the appendix contains LLM-generated text with violence and harassment.

2024-09-24

NeurIPS.cc/2024/Conference (poster)

doi.org

openreview.net

Stress-Testing Capability Elicitation With Password-Locked Models

Ryan Greenblatt

Fabien Roger

Dmitrii Krasheninnikov

David M. Krueger

2024-09-24

NeurIPS.cc/2024/Conference (poster)

doi.org

openreview.net

Textoshop: Interactions Inspired by Drawing Software to Facilitate Text Editing

Damien Masson

Young-Ho Kim

Fanny Chevalier

2024-09-24

ArXiv (preprint)

doi.org

arxiv.org

The Factorization Curse: Which Tokens You Predict Underlie the Reversal Curse and More

Ouail Kitouni

Niklas Nolte

Adina Williams

Michael G. Rabbat

Diane Bouchacourt

Mark Ibrahim

2024-09-24

NeurIPS.cc/2024/Conference (poster)

openreview.net

The High Line: Exact Risk and Learning Rate Curves of Stochastic Adaptive Learning Rate Algorithms

Elizabeth Collins-Woodfin

Inbar Seroussi

Begoña García Malaxechebarría

Andrew W. Mackenzie

Elliot Paquette

Courtney Paquette

We develop a framework for analyzing the training and learning rate dynamics on a large class of high-dimensional optimization problems, which we call the high line, trained using one-pass stochastic gradient descent (SGD) with adaptive learning rates. We give exact expressions for the risk and learning rate curves in terms of a deterministic solution to a system of ODEs. We then investigate in detail two adaptive learning rates -- an idealized exact line search and AdaGrad-Norm -- on the least squares problem. When the data covariance matrix has strictly positive eigenvalues, this idealized exact line search strategy can exhibit arbitrarily slower convergence when compared to the optimal fixed learning rate with SGD. Moreover, we exactly characterize the limiting learning rate (as time goes to infinity) for line search in the setting where the data covariance has only two distinct eigenvalues. For noiseless targets, we further demonstrate that the AdaGrad-Norm learning rate converges to a deterministic constant inversely proportional to the average eigenvalue of the data covariance matrix, and identify a phase transition when the covariance density of eigenvalues follows a power law distribution. We provide our code for evaluation at https://github.com/amackenzie1/highline2024.

2024-09-24

NeurIPS.cc/2024/Conference (poster)

doi.org

openreview.net

On the Scalability of Certified Adversarial Robustness with Generated Data

Thomas Altstidl

David Dobre

Arthur Kosmala

Björn Eskofier

Gauthier Gidel

Leo Schwinn

Certified defenses against adversarial attacks offer formal guarantees on the robustness of a model, making them more reliable than empirical methods such as adversarial training, whose effectiveness is often later reduced by unseen attacks. Still, the limited certified robustness that is currently achievable has been a bottleneck for their practical adoption. Gowal et al. and Wang et al. have shown that generating additional training data using state-of-the-art diffusion models can considerably improve the robustness of adversarial training. In this work, we demonstrate that a similar approach can substantially improve deterministic certified defenses but also reveal notable differences in the scaling behavior between certified and empirical methods. In addition, we provide a list of recommendations to scale the robustness of certified training approaches. Our approach achieves state-of-the-art deterministic robustness certificates on CIFAR-10 for the ℓ₂ (ϵ = 36/255) and ℓ∞ (ϵ = 8/255) threat models, outperforming the previous results by +3.95 and +1.39 percentage points, respectively. Furthermore, we report similar improvements for CIFAR-100.

2024-09-24

NeurIPS.cc/2024/Conference (poster)

doi.org

openreview.net

On the Scalability of GNNs for Molecular Graphs

Maciej Sypetkowski

Frederik Wenkel

Farimah Poursafaei

Nia Dickson

Karush Suri

Philip Fradkin

Dominique Beaini

Scaling deep learning models has been at the heart of recent revolutions in language modelling and image generation. Practitioners have observed a strong relationship between model size, dataset size, and performance. However, structure-based architectures such as Graph Neural Networks (GNNs) are yet to show the benefits of scale mainly due to the lower efficiency of sparse operations, large data requirements, and lack of clarity about the effectiveness of various architectures. We address this drawback of GNNs by studying their scaling behavior. Specifically, we analyze message-passing networks, graph Transformers, and hybrid architectures on the largest public collection of 2D molecular graphs. For the first time, we observe that GNNs benefit tremendously from the increasing scale of depth, width, number of molecules, number of labels, and the diversity in the pretraining datasets, resulting in a 30.25% improvement when scaling to 1 billion parameters and a 28.98% improvement when increasing the dataset size eightfold. We further demonstrate strong finetuning scaling behavior on 38 tasks, outclassing previous large models. We hope that our work paves the way for an era where foundational GNNs drive pharmaceutical drug discovery.

2024-09-24

Neural Information Processing Systems (poster)

doi.org

openreview.net
