Publications

In-Context Learning, Can It Break Safety?
Sophie Xhonneux
David Dobre
Michael Noukhovitch
Robust Knowledge Unlearning via Mechanistic Localizations
Phillip Huang Guo
Aaquib Syed
Abhay Sheshadri
Aidan Ewart
Towards Adversarially Robust Vision-Language Models: Insights from Design Choices and Prompt Formatting Techniques
Rishika Bhagwatkar
Shravan Nayak
Reza Bayat
Alexis Roger
Daniel Z Kaplan
Vision-Language Models (VLMs) have witnessed a surge in both research and real-world applications. However, as they becoming increasingly pr… (see more)evalent, ensuring their robustness against adversarial attacks is paramount. This work systematically investigates the impact of model design choices on the adversarial robustness of VLMs against image-based attacks. Additionally, we introduce novel, cost-effective approaches to enhance robustness through prompt formatting. By rephrasing questions and suggesting potential adversarial perturbations, we demonstrate substantial improvements in model robustness against strong image-based attacks such as Auto-PGD. Our findings provide important guidelines for developing more robust VLMs, particularly for deployment in safety-critical environments.
Voices Unheard: NLP Resources and Models for Yor\`ub\'a Regional Dialects
Orevaoghene Ahia
Aremu Anuoluwapo
Diana Abagyan
Hila Gonen
Daud Abolade
Noah A. Smith
Yulia Tsvetkov
A Context-Driven Approach for Co-Auditing Smart Contracts with The Support of GPT-4 code interpreter
Mohamed Salah Bouafif
Chen Zheng
Ilham Qasse
Ed Zulkoski
Mohammad Hamdaqa
The surge in the adoption of smart contracts necessitates rigorous auditing to ensure their security and reliability. Manual auditing, altho… (see more)ugh comprehensive, is time-consuming and heavily reliant on the auditor's expertise. With the rise of Large Language Models (LLMs), there is growing interest in leveraging them to assist auditors in the auditing process (co-auditing). However, the effectiveness of LLMs in smart contract co-auditing is contingent upon the design of the input prompts, especially in terms of context description and code length. This paper introduces a novel context-driven prompting technique for smart contract co-auditing. Our approach employs three techniques for context scoping and augmentation, encompassing code scoping to chunk long code into self-contained code segments based on code inter-dependencies, assessment scoping to enhance context description based on the target assessment goal, thereby limiting the search space, and reporting scoping to force a specific format for the generated response. Through empirical evaluations on publicly available vulnerable contracts, our method demonstrated a detection rate of 96\% for vulnerable functions, outperforming the native prompting approach, which detected only 53\%. To assess the reliability of our prompting approach, manual analysis of the results was conducted by expert auditors from our partner, Quantstamp, a world-leading smart contract auditing company. The experts' analysis indicates that, in unlabeled datasets, our proposed approach enhances the proficiency of the GPT-4 code interpreter in detecting vulnerabilities.
Data harmonization for Advancing research on Personalized Rehabilitation Interventions for Patients with Traumatic Brain Injury and Stroke: A proof of concept
Dorra Rakia Allegue
Despoina Petsani
Nathalie Ponthon
Evdokimos Konstantinidis
Panagiotis Bamidis
Eva Kehayia
Sara Ahmed
Stroke and traumatic brain injury (TBI) are leading causes of morbidity and mortality, affecting survivors’ mobility and social participat… (see more)ion. Although personalized interventions could positively impact survivors' recovery, the effectiveness of such interventions remains unclear. Open-access data repositories can provide access to multiple shared data which could help uncover new evidence of effective interventions; however, harmonizing data between different studies requires many steps to make it possible given the various methods of data collection, intervention characteristics and population sociodemographic profile. This proof-of-concept study aimed to describe the steps and anchors that contributed to the development of guiding frameworks to harmonize data across different studies. Data were extracted from the Federal Interagency Traumatic Brain Injury Research (FITBIR) repository and stored on an online cloud platform. The outcome measures were mapped to mobility determinants using the International Classification of Functioning, Disability, and Health (ICF) and Webber framework. The intervention's effect was categorized according to the Minimal Clinically Important Difference (MCID)s of the measures administered. The study proposed a novel framework for intervention features, which aims to enhance our understanding of the mechanisms of action and potential impact of rehabilitation interventions. The framework classified interventions based on their nature, context, specific body systems, dosage, caregiver assistance, and behaviour change strategies. In conclusion, this study demonstrated the feasibility of harmonizing data extracted from different sources in the FITBIR repository. Leveraging existing open databases offers tremendous opportunities to advance research on personalized interventions for patients with TBI and stroke and inform decision-making during transitions.
Implicit Diffusion: Efficient Optimization through Stochastic Sampling
Pierre Marion
Anna Korba
Peter Bartlett
Mathieu Blondel
Valentin De Bortoli
Arnaud Doucet
Felipe Llinares-López
Quentin Berthet
Mixture of Experts in a Mixture of RL settings
Timon Willi
Johan Samir Obando Ceron
Jakob Nicolaus Foerster
Mixtures of Experts (MoEs) have gained prominence in (self-)supervised learning due to their enhanced inference efficiency, adaptability to … (see more)distributed training, and modularity. Previous research has illustrated that MoEs can significantly boost Deep Reinforcement Learning (DRL) performance by expanding the network's parameter count while reducing dormant neurons, thereby enhancing the model's learning capacity and ability to deal with non-stationarity. In this work, we shed more light on MoEs' ability to deal with non-stationarity and investigate MoEs in DRL settings with"amplified"non-stationarity via multi-task training, providing further evidence that MoEs improve learning capacity. In contrast to previous work, our multi-task results allow us to better understand the underlying causes for the beneficial effect of MoE in DRL training, the impact of the various MoE components, and insights into how best to incorporate them in actor-critic-based DRL networks. Finally, we also confirm results from previous work.
Multimodal foundation world models for generalist embodied agents
Pietro Mazzaglia
Tim Verbelen
Bart Dhoedt
Sai Rajeswar
BayTTA: Uncertainty-aware medical image classification with optimized test-time augmentation using Bayesian model averaging
Zeinab Sherkatghanad
Moloud Abdar
Mohammadreza Bakhtyari
Test-time augmentation (TTA) is a well-known technique employed during the testing phase of computer vision tasks. It involves aggregating m… (see more)ultiple augmented versions of input data. Combining predictions using a simple average formulation is a common and straightforward approach after performing TTA. This paper introduces a novel framework for optimizing TTA, called BayTTA (Bayesian-based TTA), which is based on Bayesian Model Averaging (BMA). First, we generate a model list associated with different variations of the input data created through TTA. Then, we use BMA to combine model predictions weighted by their respective posterior probabilities. Such an approach allows one to take into account model uncertainty, and thus to enhance the predictive performance of the related machine learning or deep learning model. We evaluate the performance of BayTTA on various public data, including three medical image datasets comprising skin cancer, breast cancer, and chest X-ray images and two well-known gene editing datasets, CRISPOR and GUIDE-seq. Our experimental results indicate that BayTTA can be effectively integrated into state-of-the-art deep learning models used in medical image analysis as well as into some popular pre-trained CNN models such as VGG-16, MobileNetV2, DenseNet201, ResNet152V2, and InceptionRes-NetV2, leading to the enhancement in their accuracy and robustness performance.
On the consistency of hyper-parameter selection in value-based deep reinforcement learning
Johan Samir Obando Ceron
J. G. Ara'ujo
Deep reinforcement learning (deep RL) has achieved tremendous success on various domains through a combination of algorithmic design and car… (see more)eful selection of hyper-parameters. Algorithmic improvements are often the result of iterative enhancements built upon prior approaches, while hyper-parameter choices are typically inherited from previous methods or fine-tuned specifically for the proposed technique. Despite their crucial impact on performance, hyper-parameter choices are frequently overshadowed by algorithmic advancements. This paper conducts an extensive empirical study focusing on the reliability of hyper-parameter selection for value-based deep reinforcement learning agents, including the introduction of a new score to quantify the consistency and reliability of various hyper-parameters. Our findings not only help establish which hyper-parameters are most critical to tune, but also help clarify which tunings remain consistent across different training regimes.
Controlling Large Language Model Agents with Entropic Activation Steering
Nathan Rahn
Pierluca D'Oro
The generality of pretrained large language models (LLMs) has prompted increasing interest in their use as in-context learning agents. To be… (see more) successful, such agents must form beliefs about how to achieve their goals based on limited interaction with their environment, resulting in uncertainty about the best action to take at each step. In this paper, we study how LLM agents form and act on these beliefs by conducting experiments in controlled sequential decision-making tasks. To begin, we find that LLM agents are overconfident: They draw strong conclusions about what to do based on insufficient evidence, resulting in inadequately explorative behavior. We dig deeper into this phenomenon and show how it emerges from a collapse in the entropy of the action distribution implied by sampling from the LLM. We then demonstrate that existing token-level sampling techniques are by themselves insufficient to make the agent explore more. Motivated by this fact, we introduce Entropic Activation Steering (EAST), an activation steering method for in-context LLM agents. EAST computes a steering vector as an entropy-weighted combination of representations, and uses it to manipulate an LLM agent's uncertainty over actions by intervening on its activations during the forward pass. We show that EAST can reliably increase the entropy in an LLM agent's actions, causing more explorative behavior to emerge. Finally, EAST modifies the subjective uncertainty an LLM agent expresses, paving the way to interpreting and controlling how LLM agents represent uncertainty about their decisions.