Publications

In-Context Learning Can Re-learn Forbidden Tasks

Sophie Xhonneux

David Dobre

Despite significant investment into safety training, large language models (LLMs) deployed in the real world still suffer from numerous vulnerabilities. One perspective on LLM safety training is that it algorithmically forbids the model from answering toxic or harmful queries. To assess the effectiveness of safety training, in this work, we study forbidden tasks, i.e., tasks the model is designed to refuse to answer. Specifically, we investigate whether in-context learning (ICL) can be used to re-learn forbidden tasks despite the explicit fine-tuning of the model to refuse them. We first examine a toy example of refusing sentiment classification to demonstrate the problem. Then, we use ICL on a model fine-tuned to refuse to summarise made-up news articles. Finally, we investigate whether ICL can undo safety training, which could represent a major security risk. For the safety task, we look at Vicuna-7B, Starling-7B, and Llama2-7B. We show that the attack works out-of-the-box on Starling-7B and Vicuna-7B but fails on Llama2-7B. Finally, we propose an ICL attack that uses the chat template tokens like a prompt injection attack to achieve a better attack success rate on Vicuna-7B and Starling-7B. Trigger Warning: the appendix contains LLM-generated text with violence, suicide, and misinformation.

2024-02-08

ArXiv (preprint)

doi.org

arxiv.org

When is Momentum Extragradient Optimal? A Polynomial-Based Analysis

Junhyung Lyle Kim

Gauthier Gidel

Anastasios Kyrillidis

Fabian Pedregosa

The extragradient method has gained popularity due to its robust convergence properties for differentiable games. Unlike single-objective optimization, game dynamics involve complex interactions reflected by the eigenvalues of the game vector field's Jacobian scattered across the complex plane. This complexity can cause the simple gradient method to diverge, even for bilinear games, while the extragradient method achieves convergence. Building on the recently proven accelerated convergence of the momentum extragradient method for bilinear games (Azizian et al., 2020), we use a polynomial-based analysis to identify three distinct scenarios where this method exhibits further accelerated convergence. These scenarios encompass situations where the eigenvalues reside on the (positive) real line, lie on the real line alongside complex conjugates, or exist solely as complex conjugates. Furthermore, we derive the hyperparameters for each scenario that achieve the fastest convergence rate.

2024-02-08

TMLR (accepted)

openreview.net
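
As a rough illustration of the claim in the abstract above that the simple gradient method can diverge on a bilinear game while extragradient converges, here is a minimal sketch. It uses the plain extragradient update, not the momentum variant the paper analyses. The game min_x max_y x*y, the step size 0.1, the starting point, and all function names are assumptions chosen for illustration only.

```python
import numpy as np

# Vector field of the bilinear game min_x max_y x*y: F(x, y) = (y, -x).
# The Jacobian A has purely imaginary eigenvalues (+i and -i).
A = np.array([[0.0, 1.0], [-1.0, 0.0]])


def field(z):
    """Game vector field F(z) = A z (hypothetical helper)."""
    return A @ z


def gradient_step(z, eta):
    """Simultaneous gradient step: z <- z - eta * F(z)."""
    return z - eta * field(z)


def extragradient_step(z, eta):
    """Extrapolate to a look-ahead point, then update z using the field there."""
    z_half = z - eta * field(z)
    return z - eta * field(z_half)


def run(step, z0, eta, n_steps):
    """Apply `step` n_steps times and return the distance to the equilibrium (0, 0)."""
    z = np.asarray(z0, dtype=float)
    for _ in range(n_steps):
        z = step(z, eta)
    return float(np.linalg.norm(z))


start = [1.0, 1.0]
gd_norm = run(gradient_step, start, eta=0.1, n_steps=200)
eg_norm = run(extragradient_step, start, eta=0.1, n_steps=200)
print(f"gradient method: {gd_norm:.3f}, extragradient: {eg_norm:.3f}")
```

On this game each gradient step multiplies the distance to the equilibrium by sqrt(1 + eta^2) > 1, whereas each extragradient step multiplies it by sqrt(1 - eta^2 + eta^4) < 1 for 0 < eta < 1, so the first iterate spirals outward and the second spirals inward.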

Channel-Selective Normalization for Label-Shift Robust Test-Time Adaptation

Pedro Vianna

Muawiz Chaudhary

Paria Mehrbod

An Tang

Guy Cloutier

Guy Wolf

Michael Eickenberg

Eugene Belilovsky

Deep neural networks have useful applications in many different tasks; however, their performance can be severely affected by changes in the data distribution. For example, in the biomedical field, their performance can be affected by changes in the data (different machines, populations) between training and test datasets. To ensure robustness and generalization to real-world scenarios, test-time adaptation (TTA) has been recently studied as an approach to adjust models to a new data distribution during inference. Test-time batch normalization is a simple and popular method that has achieved compelling performance on domain shift benchmarks. It is implemented by recalculating batch normalization statistics on test batches. Prior work has focused on analysis with test data that has the same label distribution as the training data. However, in many practical applications this technique is vulnerable to label distribution shifts, sometimes producing catastrophic failure. This presents a risk in applying test-time adaptation methods in deployment. We propose to tackle this challenge by only selectively adapting channels in a deep network, minimizing drastic adaptation that is sensitive to label shifts. Our selection scheme is based on two principles that we empirically motivate: (1) later layers of networks are more sensitive to label shift, and (2) individual features can be sensitive to specific classes. We apply the proposed technique to three classification tasks, including CIFAR10-C, ImageNet-C, and diagnosis of fatty liver, where we explore both covariate and label distribution shifts. We find that our method delivers the benefits of TTA while significantly reducing the risk of failure common in other methods, and that it is robust to the choice of hyperparameters.

2024-02-07

ArXiv (preprint)

doi.org

arxiv.org

On diffusion models for amortized inference: Benchmarking and improving stochastic control and sampling

Marcin Sendera

Minsu Kim

Sarthak Mittal

Pablo Lemos

Luca Scimeca

Jarrid Rector-Brooks

Alexandre Adam

Yoshua Bengio

Nikolay Malkin

We study the problem of training diffusion models to sample from a distribution with a given unnormalized density or energy function. We benchmark several diffusion-structured inference methods, including simulation-based variational approaches and off-policy methods (continuous generative flow networks). Our results shed light on the relative advantages of existing algorithms while bringing into question some claims from past work. We also propose a novel exploration strategy for off-policy methods, based on local search in the target space with the use of a replay buffer, and show that it improves the quality of samples on a variety of target distributions. Our code for the sampling methods and benchmarks studied is made public at https://github.com/GFNOrg/gfn-diffusion as a base for future work on diffusion models for amortized inference.

2024-02-07

ArXiv (preprint)

doi.org

arxiv.org

E(3)-Equivariant Mesh Neural Networks

Thuan N.a. Trang

Nhat-Khang Ngô

Daniel Levy

Thieu N. Vo

Siamak Ravanbakhsh

Truong Son Hy

Triangular meshes are widely used to represent three-dimensional objects. As a result, many recent works have addressed the need for geometric deep learning on 3D meshes. However, we observe that the complexity of many of these architectures does not translate to practical performance, and simple deep models for geometric graphs are competitive in practice. Motivated by this observation, we minimally extend the update equations of E(n)-Equivariant Graph Neural Networks (EGNNs) (Satorras et al., 2021) to incorporate mesh face information, and further improve the model to account for long-range interactions through hierarchy. The resulting architecture, Equivariant Mesh Neural Network (EMNN), outperforms other, more complicated equivariant methods on mesh tasks, with a fast run-time and no expensive pre-processing. Our implementation is available at https://github.com/HySonLab/EquiMesh

2024-02-07

ArXiv (preprint)

doi.org

arxiv.org

Gradient descent induces alignment between weights and the empirical NTK for deep non-linear networks

Daniel Beaglehole

Ioannis Mitliagkas

Atish Agarwala

2024-02-07

ArXiv (preprint)

doi.org

arxiv.org

QGFN: Controllable Greediness with Action Values

Elaine Lau

Stephen Zhewen Lu

Ling Pan

Doina Precup

Emmanuel Bengio

Generative Flow Networks (GFlowNets; GFNs) are a family of reward/energy-based generative methods for combinatorial objects, capable of generating diverse and high-utility samples. However, biasing GFNs towards producing high-utility samples is non-trivial. In this work, we leverage connections between GFNs and reinforcement learning (RL) and propose to combine the GFN policy with an action-value estimate,

2024-02-07

ArXiv (preprint)

doi.org

arxiv.org

Randomized Confidence Bounds for Stochastic Partial Monitoring

Maxime Heuillet

Ola Ahmad

Audrey Durand

2024-02-07

ArXiv (preprint)

doi.org

arxiv.org